Writing to two file handles from the same source results in two files of very different sizes using an ALFAT

In the course of troubleshooting my design, I found what appears to be a random and somewhat rare event (about 1 in 100 tests). I write the same source data to two different file handles and close both files successfully at the end of the data stream, yet occasionally I end up with one full-sized file and one much shorter file (5,000 KB and 1,350 KB, for example).

I write to an SD card using an I2C interface. A microprocessor stores chunks of data from the stream, and I then use the write command to tell the ALFAT to write them in 2048-byte payloads, switching back and forth between the two files. My ALFAT is at version 2.0.0.

While this is a rare event, I cannot afford to lose half of 1 out of every 100 data sets. Any help would be great. Thanks.

If you expect some help, at least provide enough information: the code you are using, some schematics, etc.

ALFAT/F40 is very solid. Is there a way to narrow this down? There may be an error in your code, or it may be a timing issue. The easiest test would be to add delays in your I2C code to see if this makes a difference.
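
One way to try the delay experiment without touching every call site is to wrap the bus object. This is only a sketch: `DelayedBus`, and the `write`/`read` methods it expects on the wrapped object, are hypothetical names for whatever I2C driver the host actually uses, not part of any ALFAT API.

```python
import time


class DelayedBus:
    """Wraps a bus-like object (hypothetical interface with write() and
    read()) and pauses after every transaction."""

    def __init__(self, bus, delay_s, sleep=time.sleep):
        self._bus = bus
        self._delay_s = delay_s
        self._sleep = sleep  # injectable so the delay can be tested or tuned

    def write(self, data):
        result = self._bus.write(data)
        self._sleep(self._delay_s)  # give the device time before the next transfer
        return result

    def read(self, count):
        result = self._bus.read(count)
        self._sleep(self._delay_s)
        return result
```

If the failure rate changes noticeably as `delay_s` grows, that points toward timing rather than logic.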

Have you tried doing the same over UART, perhaps from a PC? Just to test things out.

Welcome to the community.

After additional testing, I found that the short file is the end of that particular data stream, and that the following stream’s data file begins with the missing data from the previous file. My data storage process is described below, with some additional background information.

During my initial testing I found that, as the number of files on the card increases, the time needed to open a file for write increases by a non-trivial amount. To mitigate this, I open a file that sits at the top of the alphanumeric file table, such as 0000.bin (which opens almost instantly), write my data to it, and rename it with the correct time stamp once I have finished logging. The rename command shows the same slowdown as the open command, only worse (over a minute in some of my tests), but since it runs during a down period I have time to spare. My steps are as follows:

  1. Initialize the SD card.
  2. Open files 0000.bin and 0001.bin for append (I have two data streams that I am logging).
  3. Write a 2048-byte payload to 0000.bin.
  4. Flush file 0000.bin.
  5. Write a 2048-byte payload to 0001.bin.
  6. Flush file 0001.bin.
  7. Repeat steps 3 - 6 until the data stream ends.
  8. Flush files 0000.bin and 0001.bin.
  9. Close files 0000.bin and 0001.bin.
  10. Rename file 0000.bin to time stamp xxxxxxxxxxxx_0.bin.
  11. Rename file 0001.bin to time stamp xxxxxxxxxxxx_1.bin.
  12. Repeat steps 1 - 11 when a new data stream is detected.

I am concerned that I am missing something about how the ALFAT manages the file table of the storage device. I suspect that I am somehow creating a break in the 0000.bin file, which leaves two 0000.bin entries: my rename step successfully renames one piece, while the next open for append finds the other piece and appends to the end of it. I don’t understand how the ALFAT is doing this, and I would welcome any ideas for fixing it, as well as any additional information about what the ALFAT does to the file table of the storage device for each command.
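
For reference, steps 2 - 11 can be sketched on the host side as below. This is not ALFAT code: `FakeCard` is a made-up in-memory stand-in, its method names are hypothetical, and the `stamp` argument is a placeholder for the real time stamp. The point is only to show the order of operations and to record how many bytes each file should contain, so the sizes on the card can be checked afterwards.

```python
CHUNK = 2048  # payload size per write, as in steps 3 and 5


class FakeCard:
    """Hypothetical in-memory stand-in for the storage device."""

    def __init__(self):
        self.files = {}    # name -> flushed bytes
        self.pending = {}  # name -> written but not yet flushed bytes

    def open_append(self, name):
        self.files.setdefault(name, bytearray())
        self.pending[name] = bytearray()

    def write(self, name, payload):
        self.pending[name] += payload

    def flush(self, name):
        self.files[name] += self.pending[name]
        self.pending[name] = bytearray()

    def close(self, name):
        self.flush(name)
        del self.pending[name]

    def rename(self, old, new):
        self.files[new] = self.files.pop(old)


def log_stream(card, streams, stamp):
    """Run steps 2-11 for one data set; return expected bytes per final name."""
    temp_names = [f"{i:04d}.bin" for i in range(len(streams))]
    for name in temp_names:                      # step 2
        card.open_append(name)
    longest = max(len(data) for data in streams)
    for offset in range(0, longest, CHUNK):      # steps 3-7
        for name, data in zip(temp_names, streams):
            payload = data[offset:offset + CHUNK]
            if payload:
                card.write(name, payload)
                card.flush(name)
    for name in temp_names:                      # steps 8-9
        card.flush(name)
        card.close(name)
    expected = {}
    for i, (name, data) in enumerate(zip(temp_names, streams)):
        final = f"{stamp}_{i}.bin"               # steps 10-11
        card.rename(name, final)
        expected[final] = len(data)
    return expected
```

Comparing the returned byte counts against the file sizes actually found on the card after each run gives an automatic way to flag the short files instead of spotting them by eye.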

Additionally, I have run the same test with only a single stream and a single open file, and I get the same results.

Thanks

@ JohnnyQ -

If I repeat steps 1 through 11, does it happen 100% of the time, or still only about 1%?
And at step 7, how large is the data, please?

@ Dat

It is still just about 1% of the time. Total data in a stream varies. I haven’t seen an error in the smallest of sets, approximately 300 KB, but errors do seem to occur in files that contain at least 1000 KB of data.

@ JohnnyQ -

Try reducing the flushing, or even removing the flushes at steps 4, 6, and 8, to see if it makes any difference.

@ Dat -

How large is the internal buffer of the ALFAT?

@ JohnnyQ -
It is 4K.

When I remove the flushes, I lose data. When I reduce the number of times I flush, I get fewer small-file errors, but the rate is still around 1 in 200. I’m trying a variety of combinations for the number of flushes, but I don’t feel confident that this is the solution.

Also, I recently noticed that after switching from “Open file for Append” to “Open file for Write”, the result for 99% of the data sets is a single correct file. When an error does occur, there is one large file that contains the majority of the data and a second file with the exact same name that contains the missing data, which is typically a multiple of 2 KB, between 2 KB and 10 KB.
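
A check like the following could flag these cases automatically after each test run. It is only a sketch: it assumes the directory listing has already been read into `(name, size)` pairs by some other means, and the function name, the `expected` mapping, and the report keys are all made up for illustration.

```python
from collections import defaultdict

PAYLOAD = 2048  # write size used by the logger


def find_split_files(entries, expected=None):
    """entries: iterable of (name, size_in_bytes) directory entries.
    Returns a report for every name that appears more than once."""
    by_name = defaultdict(list)
    for name, size in entries:
        by_name[name].append(size)

    report = {}
    for name, sizes in by_name.items():
        if len(sizes) < 2:
            continue  # a single entry is the normal case
        ordered = sorted(sizes, reverse=True)
        total = sum(ordered)
        matches = None  # unknown unless an expected size was supplied
        if expected is not None and name in expected:
            matches = total == expected[name]
        report[name] = {
            "sizes": ordered,
            "total": total,
            # every piece except the largest is a whole number of payloads
            "small_pieces_aligned": all(s % PAYLOAD == 0 for s in ordered[1:]),
            "total_matches_expected": matches,
        }
    return report
```

If the pieces always add up to the expected size and the small pieces are always payload-aligned, that would support the idea that whole writes are being diverted rather than corrupted.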

I only open a file in one place in my code, and I always use the same file handle. I parse every result code from the ALFAT and record any that are not !00. I never see anything other than !00, and I have even gone as far as making sure that I don’t proceed if a command fails.
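
The result-code gate described here might look roughly like this on the host. It is a sketch under assumptions: the only fact taken from this thread is that success is reported as `!00`; treating every reply as `!` followed by two hex digits is my assumption, and the names `parse_result`, `require_ok`, and `CommandFailed` are invented for the example.

```python
import re


class CommandFailed(RuntimeError):
    """Raised when a reply carries a result code other than 00."""

    def __init__(self, step, code):
        super().__init__(f"{step} failed with result !{code:02X}")
        self.step = step
        self.code = code


# Assumed reply shape: '!' followed by two hex digits, e.g. '!00'.
_REPLY = re.compile(r"^!([0-9A-Fa-f]{2})$")


def parse_result(reply):
    """Return the numeric result code from a reply line."""
    match = _REPLY.match(reply.strip())
    if match is None:
        raise ValueError(f"unrecognized reply: {reply!r}")
    return int(match.group(1), 16)


def require_ok(reply, step, failures):
    """Record and raise on any non-zero result so logging cannot continue."""
    code = parse_result(reply)
    if code != 0:
        failures.append((step, code))
        raise CommandFailed(step, code)
```

Raising on a malformed reply, not just on a non-zero code, matters here: a reply that was garbled on the bus would otherwise be easy to mistake for silence.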

I don’t understand how a second file with the exact same name is being generated.

@ JohnnyQ -

Hi, we will take a look. If you can, try it with UART to see if there is any difference.

Also, data should not be lost without flushing, unless you didn’t close the file when you finished.

@ Dat -

I can’t do UART with my setup and I am definitely closing the files. I have even tried adding an initialize at the end to flush and close all the handles just to see if there was something I was doing wrong with the close. This didn’t seem to change anything.

I ran just over 2000 tests last night and had about 30 instances of duplicate files that contain data from a central portion of the data stream. It is as if the ALFAT writes to the correct memory location, jumps somewhere else for a few writes, and then jumps back to the correct file.

Show your code! Someone may catch a bug in it.