I'm trying to develop an alarm history structure to be stored in non-volatile flash memory. Flash memory has a limited number of write cycles, so I need a way to add records to the structure without rewriting all of the structure's flash pages each time, and without writing out updated pointers to the head/tail of the queue.
Additionally, once the available flash space has been used, I want to begin overwriting the records previously stored in flash, starting with the oldest record (first-in-first-out). This makes me think a circular buffer would work best for adding items. However, when viewing records I want the structure to behave like a stack, i.e. records would be displayed in reverse chronological order (last-in-first-out).
The structure's size, head, tail, and indexes cannot be stored unless they are stored in the records themselves; if they were written each time to a fixed location, that would exceed the maximum write cycles on the page where they were stored.
So should I use a stack, a queue, or some hybrid structure? And how should I store the head, tail, and size information in flash so that it can be re-initialized after power-up?
See the related question "Circular buffer in Flash".
Look up the ring buffer.
Assuming you can work out which entry is the last one (from a timestamp, etc., so that you don't need to write a marker), this also has the best wear-leveling performance.
Edit: the below doesn't apply to the OP's flash controller. You shouldn't have to worry about wear leveling in your code; the flash memory controller should handle this behind the scenes.
However, if you still want to go ahead and do this, just use a regular circular buffer and keep pointers to the head and tail of the stack.
You could also consider using a Least Recently Used cache to manage where on flash to store data.
You definitely want a ring buffer. But you're right, the meta information is a bit...interesting.
Map your entries onto several sections. When the sections are full, overwrite them starting with the first section. Add a sequence number (with the number of possible sequence numbers > 2 × the number of entries), so that on reboot you know which entry is the first one.
You could do a version of the ring buffer where the first element stored in each page is the number of times that page has been written. This lets you determine where to write next by finding the first page whose number is lower than the previous page's; if they're all the same, you start from the beginning again with the next number.
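A minimal sketch of that power-up scan, assuming NUM_PAGES flash pages reserved for the ring and a hypothetical flash_read_counter() driver call that returns the counter word at the start of a page:

#include <stdint.h>

#define NUM_PAGES 16  /* assumption: pages reserved for the ring */

extern uint32_t flash_read_counter(int page);  /* hypothetical: reads word 0 of a page */

/* The next page to write is the first one whose counter is lower than
   its predecessor's; if all counters match, the ring wrapped cleanly
   and writing resumes at page 0 with the next counter value. */
int find_next_write_page(void)
{
    uint32_t prev = flash_read_counter(0);
    for (int page = 1; page < NUM_PAGES; page++) {
        uint32_t cur = flash_read_counter(page);
        if (cur < prev)
            return page;  /* counter drops here: oldest page, overwrite it */
        prev = cur;
    }
    return 0;             /* all equal: start a new pass at page 0 */
}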
First of all, thanks for stopping by. Here is my question.
I'm working on a project where I use an array that contains a lot of information (about 300 variables, each around 25 characters). My question is: what is the best way to store it?
I have two possible ways; please tell me which is better.
First way: make a normal local array where I can store all the needed information; as far as I know, it will be stored in RAM.
Second way: store the data in a file, and whenever I need the array, simply read the data from the file.
Note that the array is used only occasionally, not all the time.
My second question is:
Could any damage occur to the hard drive if the program writes and reads many times in a short period? And if so, what is the minimum period at which I can write and read safely?
Thanks in advance.
Reading and writing files are very slow operations compared to RAM access. 300 strings of 25 characters each won't consume much space with modern RAM. If you need access to this data only rarely (once per 10 minutes or once per hour), you could probably keep it on the HDD, but in my opinion it will be simpler to keep it in RAM the whole time.
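As a quick sanity check on the footprint, a sketch (26 = 25 characters plus a NUL terminator):

#include <stdio.h>

static char info[300][26];  /* 300 variables of 25 characters each */

int main(void)
{
    printf("%zu bytes\n", sizeof info);  /* prints 7800: well under 8 KB */
    return 0;
}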
You can find an answer to your second question here.
The data you have mentioned can easily be stored in memory without any memory problems. If you need to store a much larger amount of data, then NSKeyedArchiver (used to store objects to disk in NSData form) can be used, or the Core Data framework; Core Data also supports caching and is faster: http://nshipster.com/nscoding/. Hope this helps.
I am writing a program which requires me to create an array of a million records. The array indices are unique IDs (0 to a million, each representing a unique product ID). At first all elements are initialized to zero; they are incremented as products are sold.
This approach, however, uses a lot of space (4 × a million bytes). I later noticed that only certain products need frequent updating. So is there any way I can reduce memory usage while still keeping track of all the products?
If you don't need frequent updates, you can store all the results in a file. Whenever you update an entry, create a temp file containing all the other entries plus the updated one, then replace the original by renaming the temp file with rename(temp,new);.
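A minimal sketch of that rewrite-and-rename update, assuming a hypothetical counts.dat file holding one 4-byte count per product ID, in ID order:

#include <stdio.h>
#include <stdint.h>

int update_count(uint32_t id, uint32_t new_count)
{
    FILE *in  = fopen("counts.dat", "rb");
    FILE *out = fopen("counts.tmp", "wb");
    if (!in || !out) {
        if (in)  fclose(in);
        if (out) fclose(out);
        return -1;
    }

    uint32_t value;
    uint32_t pos = 0;
    while (fread(&value, sizeof value, 1, in) == 1) {
        if (pos == id)
            value = new_count;            /* swap in the updated entry */
        fwrite(&value, sizeof value, 1, out);
        pos++;
    }
    fclose(in);
    fclose(out);

    /* replace the old file with the updated copy in one step */
    return rename("counts.tmp", "counts.dat");
}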
That said, an array of a million records doesn't require that much memory (just 4 megabytes), so your approach is the best and easiest one.
The best approach algorithmically would be a hash table to store all the entries. But if you are not an expert in C, writing a hash table could be a problem for you.
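For what it's worth, a chained hash table in C can be fairly small. A minimal sketch, with the bucket count an assumption to be tuned; only products actually sold consume memory:

#include <stdlib.h>
#include <stdint.h>

#define NUM_BUCKETS 4096  /* assumption: tune to the expected number of active products */

typedef struct Node {
    uint32_t id;      /* product ID */
    uint32_t count;   /* units sold */
    struct Node *next;
} Node;

static Node *buckets[NUM_BUCKETS];

/* Increment a product's sale count, inserting it on first sale. */
void record_sale(uint32_t id)
{
    unsigned h = id % NUM_BUCKETS;
    for (Node *n = buckets[h]; n; n = n->next)
        if (n->id == id) { n->count++; return; }

    Node *n = malloc(sizeof *n);
    if (!n)
        return;           /* out of memory: drop the update */
    n->id = id;
    n->count = 1;
    n->next = buckets[h];
    buckets[h] = n;
}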
This sounds more like a situation for a table in a database than an in-memory array to me. If your use case allows for it, I'd use a database instead.
Otherwise, if in your use case:
a significant fraction of the products will eventually be used,
RAM is limited,
external storage (disk, serial memory) is available,
average access performance comparable to RAM speeds is required, and
increased worst case access time is acceptable,
then you could try some sort of caching scheme (LRU, maybe?). This will use more code space, somewhat increase your average access time, and increase your worst-case access time more significantly.
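A minimal sketch of such a cache, assuming the counters live in a hypothetical counts.bin file (one 4-byte count per product ID) opened in "r+b" mode, with the slot count an assumption to be tuned:

#include <stdio.h>
#include <stdint.h>

#define CACHE_SLOTS 64  /* assumption: tune to available RAM */

typedef struct {
    uint32_t id;         /* product ID held in this slot */
    uint32_t count;      /* cached value */
    uint32_t last_used;  /* logical clock for LRU ordering */
    int      valid;
} Slot;

static Slot cache[CACHE_SLOTS];
static uint32_t clock_tick;

/* Fetch a product's count, loading from disk on a miss and evicting
   the least recently used slot when the cache is full. */
uint32_t get_count(FILE *backing, uint32_t id)
{
    /* hit? */
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (cache[i].valid && cache[i].id == id) {
            cache[i].last_used = ++clock_tick;  /* refresh recency */
            return cache[i].count;
        }
    }

    /* miss: pick a victim (an empty slot, else the least recently used) */
    int victim = 0;
    for (int i = 0; i < CACHE_SLOTS; i++) {
        if (!cache[i].valid) { victim = i; break; }
        if (cache[i].last_used < cache[victim].last_used)
            victim = i;
    }

    /* write back the evicted entry, then load the requested one */
    if (cache[victim].valid) {
        fseek(backing, (long)cache[victim].id * sizeof(uint32_t), SEEK_SET);
        fwrite(&cache[victim].count, sizeof(uint32_t), 1, backing);
    }
    fseek(backing, (long)id * sizeof(uint32_t), SEEK_SET);
    if (fread(&cache[victim].count, sizeof(uint32_t), 1, backing) != 1)
        cache[victim].count = 0;  /* unseen product starts at zero */
    cache[victim].id = id;
    cache[victim].valid = 1;
    cache[victim].last_used = ++clock_tick;
    return cache[victim].count;
}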
If a large fraction of the products will not just be used infrequently, but never used at all, then you should look into #fatrock92's suggestion of a hash table.
It's better to use dynamic memory allocation for the array; malloc and realloc give you a way to allocate memory (and grow the allocation) as needed. I assume you already know how to use malloc and realloc.
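A minimal sketch of growing the counter array on demand with realloc instead of reserving a million slots up front (the names are placeholders):

#include <stdlib.h>
#include <string.h>
#include <stdint.h>

static uint32_t *counts;    /* grows as higher product IDs appear */
static size_t capacity;     /* current number of allocated slots */

/* Make sure the array covers `id`, doubling the capacity as needed
   and zero-filling the newly added region. Returns 0 on success. */
int ensure_capacity(uint32_t id)
{
    if (id < capacity)
        return 0;
    size_t newcap = capacity ? capacity : 1024;
    while (newcap <= id)
        newcap *= 2;
    uint32_t *p = realloc(counts, newcap * sizeof *p);
    if (!p)
        return -1;          /* the old block is still valid on failure */
    memset(p + capacity, 0, (newcap - capacity) * sizeof *p);
    counts = p;
    capacity = newcap;
    return 0;
}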
You can use a linked list, so whenever you need to, you can add or update elements in your list.
You can also hold the last access time in each node, so you'd be able to remove the nodes that haven't been used lately.
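A minimal sketch of that idea (the record layout is a placeholder):

#include <stdlib.h>
#include <stdint.h>
#include <time.h>

typedef struct Item {
    uint32_t id;           /* product ID */
    uint32_t count;        /* units sold */
    time_t   last_access;  /* updated on every read or increment */
    struct Item *next;
} Item;

/* Drop every node that hasn't been touched within max_age seconds. */
Item *prune(Item *head, time_t now, time_t max_age)
{
    Item **pp = &head;
    while (*pp) {
        if (now - (*pp)->last_access > max_age) {
            Item *dead = *pp;
            *pp = dead->next;   /* unlink the stale node */
            free(dead);
        } else {
            pp = &(*pp)->next;
        }
    }
    return head;
}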
I was reading the LPC2148 manual, and in the static RAM section I came across:
Write back buffer
The SRAM controller incorporates a write-back buffer in order to prevent CPU stalls
during back-to-back writes. The write-back buffer always holds the last data sent by
software to the SRAM. This data is only written to the SRAM when another write is
requested by software.(the data is only written to the SRAM when software does another
write). If a chip reset occurs, actual SRAM contents will not reflect the most recent write
request (i.e. after a "warm" chip reset, the SRAM does not reflect the last write operation).
Any software that checks SRAM contents after reset must take this into account. Two
identical writes to a location guarantee that the data will be present after a Reset.
What does this mean? And what is meant by CPU stalls and back-to-back writes?
I'm not an EE so this is a layman's analogy. You are the only shopper at a supermarket. Because business is slow, there is only one cashier working this shift. There is no checkout counter - only a cashier and a barcode scanner. You hand items, one at a time, to the cashier. When the cashier is holding an item, they cannot take another item. Only when the cashier is done scanning an item, can they accept another one. If you don't have a bag or a cart and you bring individual items from the shelves to the cashier, there is no problem. But if you bring more than one item to the cashier and try to hand them all at once (back to back) you can't. You hand them one by one and you wait for each to be processed. This is called a stall.
Suddenly, the checkout counter with the conveyor belt is invented. Now you place your shopping at the counter and are free to go shop for more stuff. The cashier scans items at their own (slow) pace, because there is both a place for you to put them and a way for the cashier to reach them. The number of items you can put on the counter is limited, but it does allow you to drop off some stuff and continue shopping, making your shopping much more efficient.
There is a slight problem: before the invention of the checkout counter, when you wanted to know how much money the shopping spree is going to cost you, you could just look at the total displayed by the cash register. But now, you need to look at the cash register and the items on the counter that have not yet been processed.
That's why the read-from-SRAM instruction first surreptitiously checks whether the address you're reading from is one of the addresses to-be-written-to in the write queue/buffer. If so, it takes the value from the latest write-queue-entry with the same address instead of actually reading from SRAM. Reads from addresses that are in the write queue can be faster than reads from SRAM, but reads from addresses that are not currently in the write queue are made a little slower by the overhead (or at least less energy efficient, if SRAM reads and cache searches are done in parallel). Overall, this makes reads worse but the gains from no waiting for writes are worth it.
What they are telling you is that their cashier has an off-by-one bug: it drains the write queue not until it is empty but until there is only one item left on the counter. A snickers bar. And then, the cashier will look at that snickers bar forever and not put it through checkout. If you need to purchase the snickers bar, you need to put another item on the counter. Then, the cashier will happily move the conveyor belt and take the snickers bar. The text suggests you use another snickers bar, but you don't have to. In general, the last item you put on the counter will never be processed by the cashier.
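In code, the workaround the manual describes (writing the location twice) might look like the sketch below; `flag` is a hypothetical variable placed in on-chip SRAM, and volatile ensures the compiler actually emits both stores:

#include <stdint.h>

volatile uint32_t flag;  /* lives in on-chip SRAM (hypothetical) */

void mark_shutdown(void)
{
    flag = 0xDEADBEEF;   /* first write sits in the write-back buffer */
    flag = 0xDEADBEEF;   /* second write pushes the first into SRAM,
                            so the value survives a warm reset */
}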
I've recently established the need to store infrequently-updated configuration variables in the EEPROM of a microcontroller. Adding state to the program immediately forces one to worry about
detection of uninitialized data in EEPROM (i.e. first boot),
converting or invalidating data from old firmware versions, and
addressing of multiple structures, each of which may grow in firmware updates.
Extensive Googling has only turned up one article that addresses keeping your EEPROM data valid through firmware updates. Has anyone used the approach discussed in that article? Is there a better alternative approach?
Personally, I prefer a "tagged table" format.
In this format, your data is split up into a series of "tables". Each table has a header that follows a predictable format and a body that can change as you need it to.
Here's an example of what one of the tables would look like:
Byte 0: Table Length (in 16-bit words)
Byte 1: Table ID (used by firmware to determine what this data is)
Byte 2: Format Version (incremented every time the format of this table changes)
Byte 3: Checksum (simple sum-to-zero checksum)
Byte 4: Start of body
...
Byte N: End of body
I wasn't storing a lot of data, so I used a single byte for each field in the header. You can use whatever size you need, so long as you never change it. The data tables are written one after another into the EEPROM.
When your firmware needs to read the data out of the EEPROM, it starts reading at the first table. If the firmware recognizes the table ID and supports the listed table version, it loads the data out of the body of the table (after validating the checksum, of course). If the ID, version, or checksum don't check out, the table is simply skipped. The length field is used to locate the next table in the chain. When firmware sees a table with a length of zero, it knows that it has reached the end of the data and that there are no more tables to process.
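A minimal sketch of that scan loop, assuming the one-byte header fields shown above, that the length counts the entire table (header included) in 16-bit words, and a hypothetical eeprom_read() driver call; load_table() is a placeholder for your own dispatch:

#include <stdint.h>

extern uint8_t eeprom_read(uint16_t addr);  /* hypothetical driver */
extern void load_table(uint8_t id, uint8_t version,
                       uint16_t body_addr, uint16_t body_bytes);

void scan_tables(void)
{
    uint16_t addr = 0;
    for (;;) {
        uint8_t len = eeprom_read(addr);   /* length in 16-bit words */
        if (len == 0)
            break;                         /* zero length: end of the chain */
        uint8_t id  = eeprom_read(addr + 1);
        uint8_t ver = eeprom_read(addr + 2);

        /* sum-to-zero checksum over the whole table */
        uint8_t sum = 0;
        for (uint16_t i = 0; i < len * 2u; i++)
            sum += eeprom_read(addr + i);

        if (sum == 0)
            load_table(id, ver, addr + 4, len * 2u - 4);
        /* bad checksum, or an unknown ID/version inside load_table:
           the table is simply skipped */

        addr += len * 2u;                  /* next table in the chain */
    }
}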
I find this format flexible (I can add any type of data into the body of a table) and robust (keep the header format constant and the data tables will be both forward- and backward-compatible).
There are a couple of caveats, though they are not too burdensome. First, you need to ensure that your firmware can handle the case where important data either isn't in the table or is using an unsupported format version. You will also need to initialize the first byte of the EEPROM storage area to zero (so that on the first boot, you don't start loading in garbage thinking that it's data). Since each table knows its length it is possible to expand or shrink a table; however, you have to move the rest of the table storage area around in order to ensure that there are no "holes" (if the entire chain of tables can't fit in your device's memory, then this process can be annoying). Personally, I don't find any of these to be that big of a problem, and it is well worth the trouble I save over using some other methods of data storage.
Nigel Jones has covered some of the basics in your reference. There are plenty of alternatives.
One alternative, if you have lots of room, is storing key-value pairs instead of structures. Then you can update one value (by appending it) without erasing everything. This is most useful in devices that have a limited number of erase cycles. Your read routine will need to scan from the beginning, updating values each time a key is encountered. Of course your update routine will need a "garbage collector" that kicks in when the memory is full.
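A minimal sketch of that read routine, with the record layout (a one-byte key followed by a one-byte value) and the eeprom_read() call as assumptions; 0xFF marks never-written cells:

#include <stdint.h>

#define EEPROM_SIZE 1024   /* assumption */
#define NUM_KEYS    32     /* assumption */
#define ERASED      0xFF   /* value of an unwritten EEPROM cell */

extern uint8_t eeprom_read(uint16_t addr);  /* hypothetical driver */

/* Scan the log from the start; a later entry for the same key simply
   overwrites the earlier value, so values[] ends up holding the most
   recent value for each key. */
void load_values(uint8_t values[NUM_KEYS])
{
    for (uint16_t addr = 0; addr + 1 < EEPROM_SIZE; addr += 2) {
        uint8_t key = eeprom_read(addr);
        if (key == ERASED)
            break;                          /* first unwritten slot: log ends */
        if (key < NUM_KEYS)
            values[key] = eeprom_read(addr + 1);
    }
}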
To handle device errors and power-downs in the middle of updates, we usually store multiple copies of the data. The simplest approach is to ping-pong between two halves of the device, using a sequence number to determine which is newer. A CRC on each section is used to validate it. This also addresses the uninitialized-data issue.
For the key-value version you'd need to append the new CRC after each write.
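A minimal sketch of picking the newer half on boot; the section layout, crc16(), and eeprom_read_block() are assumptions standing in for your own CRC routine and driver:

#include <stdint.h>

typedef struct {
    uint32_t sequence;     /* incremented on every save */
    uint16_t crc;          /* CRC over the payload below */
    uint8_t  payload[58];  /* the actual configuration data */
} Section;

extern int eeprom_read_block(uint16_t addr, void *dst, uint16_t len);
extern uint16_t crc16(const void *data, uint16_t len);

/* Returns 0 or 1 for the valid copy with the highest sequence number,
   or -1 if neither copy is intact (first boot / uninitialized part). */
int pick_section(Section out[2])
{
    int valid[2];
    for (int i = 0; i < 2; i++) {
        eeprom_read_block((uint16_t)(i * sizeof(Section)), &out[i], sizeof(Section));
        valid[i] = (crc16(out[i].payload, sizeof out[i].payload) == out[i].crc);
    }
    if (valid[0] && valid[1])
        return out[0].sequence > out[1].sequence ? 0 : 1;
    if (valid[0]) return 0;
    if (valid[1]) return 1;
    return -1;
}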
Suppose you have a really large table, say a few billion unordered rows, and now you want to index it for fast lookups. Or maybe you are going to bulk load it and order it on the disk with a clustered index. Obviously, when you get to a quantity of data this size you have to stop assuming that you can do things like sorting in memory (well, not without going to virtual memory and taking a massive performance hit).
Can anyone give me some clues about how databases handle large quantities of data like this under the hood? I'm guessing there are algorithms that use some form of smart disk caching to handle all the data but I don't know where to start. References would be especially welcome. Maybe an advanced databases textbook?
Multiway merge sort is the keyword for sorting amounts of data too large to fit in memory.
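The idea: sort chunks that fit in RAM, write each out as a sorted run, then merge the runs sequentially. A minimal sketch of the k-way merge step, with the run count and file names as assumptions and a linear scan standing in for the min-heap you'd use with many runs:

#include <stdio.h>

#define K 4  /* assumption: number of pre-sorted run files */

int main(void)
{
    const char *names[K] = {"run0.txt", "run1.txt", "run2.txt", "run3.txt"};
    FILE *runs[K];
    long head[K];   /* current front value of each run */
    int  live[K];   /* whether the run still has data */

    for (int i = 0; i < K; i++) {
        runs[i] = fopen(names[i], "r");
        live[i] = runs[i] && fscanf(runs[i], "%ld", &head[i]) == 1;
    }

    for (;;) {
        /* pick the smallest head among the live runs */
        int min = -1;
        for (int i = 0; i < K; i++)
            if (live[i] && (min < 0 || head[i] < head[min]))
                min = i;
        if (min < 0)
            break;                       /* every run is exhausted */
        printf("%ld\n", head[min]);      /* emit to the merged output */
        live[min] = fscanf(runs[min], "%ld", &head[min]) == 1;
    }

    for (int i = 0; i < K; i++)
        if (runs[i]) fclose(runs[i]);
    return 0;
}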
As far as I know, most indexes use some form of B-tree, which does not need to keep everything in memory. You can simply put nodes of the tree in a file and then jump to various positions in the file. This can also be used for sorting.
Are you building a database engine?
Edit: I built a disc-based database system back in the mid '90s.
Fixed size records are the easiest to work with because your file offset for locating a record can be easily calculated as a multiple of the record size. I also had some with variable record sizes.
My system needed to be optimized for reading. The data was actually stored on CD-ROM, so it was read-only. I created binary search tree files for each column I wanted to search on. I took an open source in-memory binary search tree implementation and converted it to do random access of a disc file. Sorted reads from each index file were easy and then reading each data record from the main data file according to the indexed order was also easy. I didn't need to do any in-memory sorting and the system was way faster than any of the available RDBMS systems that would run on a client machine at the time.
For fixed-record-size data, the index can just keep track of the record number. For variable-length data records, the index just needs to store the offset within the file where the record starts, and each record needs to begin with a structure that specifies its length.
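A minimal sketch of the fixed-size case (the record layout is a placeholder): the byte offset of record N is just N times the record size, so a lookup is a seek plus one read.

#include <stdio.h>
#include <stdint.h>

typedef struct {
    uint32_t id;
    char     name[28];   /* fixed-size field keeps every record 32 bytes */
} Record;

int read_record(FILE *f, long recno, Record *out)
{
    if (fseek(f, recno * (long)sizeof(Record), SEEK_SET) != 0)
        return -1;
    return fread(out, sizeof(Record), 1, f) == 1 ? 0 : -1;
}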
You would have to partition your data set in some way and spread the partitions across separate servers' RAM. If I had a billion 32-bit ints, that's 4 GB of RAM right there, and that's only the index.
For low-cardinality data, such as gender (only two possible values: male, female), you can represent each index entry in less than a byte. Oracle uses a bitmap index in such cases.
Hmm... Interesting question.
I think most widely used database management systems rely on the operating system's memory-management mechanisms, and when physical memory runs out, in-memory tables go to swap.