How to get largest free contiguous block of memory in FatFs - c

Using the FatFs API, I am trying to pre-allocate file system space for the remainder of the drive to write files of unknown sizes. At the end of the file write process, any unused space is then truncated (f_truncate). However, I'm having a problem allocating enough space when the file system becomes fragmented after deletion of some files. To overcome this problem, I would like to allocate only as much space as the largest contiguous block of memory on the file system.
In the FatFs API, there is a function to get the amount of free space left on the device, f_getfree. There is also the f_expand function, which takes a size in bytes and reports whether a free contiguous block of that size exists.
Is there an efficient way to calculate the largest free contiguous block of memory available? I'm trying to avoid any sort of brute force "guess and check" method. Thanks

One way would be to create your own extension to the API to count contiguous FAT entries across all the sectors.
Without modifying the API, you could use f_lseek(): open a file for writing, then use f_lseek() to expand the file until 'disk full' (the end of the contiguous space). This would need to be repeated with new files until all of the disk was allocated. Pick the maximum allocated file from these, and delete the others.

Related

HOW DO I SOLVE THIS SUPPOSEDLY HARD QN?

Consider a file system that uses the contiguous allocation method. For a disk
consisting of 100 data blocks, each block 4 KB, what are the maximum and
minimum numbers of files of size 15 KB that it can save? Find the number of files the
disk is able to support using (1) linked allocation and (2) indexed allocation, assuming
the address is 32 bits.

C - Can you free individual memory addresses of an array allocated dynamically?

I do not seem to find an answer to this question. Why can't you free up an individual address? Is it because the space needs to be contiguous? And if this is the answer, then why does fragmentation occur on hard disks?
Can you free individual memory addresses of an array allocated dynamically?
If the memory is at the end of an array, you can free off the unneeded excess by performing a realloc to a smaller size, with the caveat that you may actually get a new pointer to new memory with the prefix contents copied into it, and the original memory freed in its entirety.
Otherwise, no. The free interface is defined to only accept addresses returned from malloc, calloc or realloc.
Why can't you free up an individual address? Is it because the space needs to be contiguous?
Well, the direct answer is that there is no interface defined to do so. There is no way to tell free how much of the pointer you passed in should be freed. If you want to free all memory to the end of the allocated block, realloc does that.
If contiguity is not important to your program, just use separate allocations for each array element, and free them individually.
And if this is the answer, then why does fragmentation occur on hard disks?
One way to imagine a scenario of fragmentation on a file system is that if three files are created one after another, and then the middle one is deleted, there is now a hole between two files.
|---File 1---|--- Hole ---|---File 3---|
Now suppose a new file is created, so it starts out inside the hole between the two files, but as it grows, it cannot fit in the hole, so now the rest of the file is after File 3. In this case, we would say the new file is fragmented.
|---File 1---|---File 4...|---File 3---|...File 4---|
This happens on "Hard-Drives" because a filesystem is designed that way: it allows a large file to span the available holes in the physical medium.
A RAM disk used for a filesystem would also eventually have fragmented files.
A non-contiguous data structure could be considered to be "fragmented", e.g., a linked-list or a tree, but that is by design. An array is considered contiguous by its very definition. However, files on a filesystem are not arrays.
Broadly, the reason you cannot release individual portions of allocated memory is that it is not useful enough to justify writing the software to support it.
The C standard specifies services provided by malloc, free, realloc, and related routines. The only provisions it makes for releasing space are by using free to release an allocation and by using realloc to replace an allocation with a smaller one.
C implementations are free to extend the C standard by providing services to release portions of allocated space. However, I am not aware of any that have done so. If a program were allowed to free arbitrary pieces of memory, the memory management software would have to keep track of all of them. That requires extra data and time. Additionally, it can interfere with some schemes for managing memory. Memory management software might organize memory so that allocations of particular sizes can be satisfied quickly out of specialized pools, and having to take back an arbitrary sized portion that was part of a specialized pool could be a problem.
If there were a demand for such a service, it could be written. But programs and algorithms have evolved over the years to use the existing services, and there does not seem to be much need to release individual portions of allocations. Generally, if a program is going to work with many objects that it might free individually, it allocates them individually. This is common when building data structures of all sorts, using pointers to construct trees, hashed arrays of lists or other structures, and so on. Data structures are often built out of individual nodes that can be allocated or freed individually. So there is little need to carve individual pieces to be released out of larger allocations.
The organization of memory has very little to do with the organization of data on hard disk or other storage media. Data is generally transferred between arbitrary places on disk and arbitrary places in memory as needed. In a variety of circumstances, files are “memory mapped,” meaning that the contents of a file are made visible in memory so that one can read the file contents by reading memory and one can modify the file by modifying memory. However, even in this situation, there is not generally any relationship between where the blocks of the file are on disk and where the blocks of the file are in memory. The blocks of a file are managed by the file system and are not necessarily contiguous, and the blocks in memory are managed by the memory system and may be rearranged arbitrarily with support from virtual memory hardware.
For the first question: no, as you can only free the whole block of memory allocated by one call to a malloc-family function.
Fragmentation of hard disks does not have anything in common with memory allocation.
Memory allocation is handled as seemingly continuous blocks of memory (it might not be in physical memory though, but that's not relevant).
There is no simple way to "cut a hole" in a single memory allocation, but you could do something like this:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define ARRAY_LEN 11

int main(void)
{
    char *array = malloc(ARRAY_LEN);
    if (array == NULL)
        return 1;
    strcpy(array, "0123456789");
    printf("%s\n", array);

    /* Remove the 5th element: shift the tail (including the terminator)
       down one place, then shrink the allocation. */
    memmove(&array[5], &array[6], ARRAY_LEN - 6);
    char *tmp = realloc(array, ARRAY_LEN - 1);
    if (tmp != NULL)
        array = tmp;
    printf("%s\n", array);

    free(array);
    return 0;
}
Some Linux filesystems allow "punching holes" in files, so with an mmap'ed file you might be able to use the fallocate system call on it while using it as an array in memory.
Can you free individual memory addresses of an array allocated dynamically?
You seem to recognize that the answer is "no", because you follow up with
Why can't you free up an individual address? Is it because the space needs to be contiguous?
Each individual allocation is contiguous, but the union of all dynamically-allocated space is by no means certain to be contiguous. More on this later.
At the most practical level, you cannot free chunks of a larger allocation because C provides no mechanism for doing so. In particular, among the specifications for the free() function is:
if the argument does not match a pointer earlier returned by a memory management function, or if the space has been deallocated by a call to free or realloc, the behavior is undefined.
Thus, free() exhibits UB if its argument is a pointer into the interior of an allocated block.
Note also that free() accepts only one parameter. It makes no provision for the caller to specify the amount of memory to free, so the memory-management subsystem has to figure that out from the argument presented. That's fine for the operating model that one frees only whole, previously-allocated blocks, but it does not easily support freeing an allocation in multiple pieces.
Furthermore, consider that although you cannot free specific chunks of a larger allocation, you can use realloc() to reduce the size of an allocation (which may also involve moving it).
Anything beyond that is in the realm of implementation-specific behavior, but do bear in mind that
it is very common for allocation to be performed and accounted for in terms of multi-byte blocks -- for example, multiples of 16 bytes -- regardless of the specific sizes requested. An implementation that works this way cannot under any circumstances free partial blocks, though one could imagine being able to free individual blocks from a larger allocation.
some implementations store memory management metadata adjacent to the dynamically-allocated space presented to the program. In such an implementation, it is not useful to free pieces of a larger allocation because they cannot, in general, be reused until the whole allocation is freed, for there is no available place for the needed metadata.
and if this is the answer then why fragmentation occurs on Hard-Disks
You don't need to free allocations in pieces to get memory fragmentation. It can suffice to perform multiple allocations and afterward free only some of them. This is a real issue that can degrade performance and even cause programs to fail.
With that said, however, file systems typically use different methods and data structures for tracking their metadata than do C memory-management implementations, and the underlying hardware has different characteristics and behavior, so there's really no justification for forming expectations about the behavior and capabilities of one variety of storage based on the behavior and capabilities of the other.

Prepend to file without temp file by manipulating inode?

Prepending to a large file is difficult, since it requires pushing all
other characters forward. However, could it be done by manipulating
the inode as follows?:
Allocate a new block on disk and fill with your prepend data.
Tweak the inode to tell it your new block is now the first
block, and to bump the former first block to the second block
position, former second block to the third position, and so on.
I realize this still requires bumping blocks forward, but it should be
more efficient than having to use a temp file.
I also realize the new first block will be a "short" block (not all the data in the block is part of the file), since your prepend data is unlikely to be exactly the same size as a block.
Or, if inode blocks are simply linked, it would require very little
work to do the above.
NOTE: my last experience directly manipulating disk data was with a
Commodore 1541, so my knowledge may be a bit out of date...
Modern-day operating systems should not allow a user to do that, as inode data structures are specific to the underlying file system.
If your file system/operating system supports it, you could make your file a sparse file by prepending empty data at the beginning, and then writing to the sparse blocks. In theory, this should give you what you want.
YMMV, I'm just throwing around ideas. ;)
This could work! Yes, userland programs should not be mucking around with inodes. Yes, it necessarily depends on whatever scheme used to track blocks by whatever file systems implement this function. None of this is a reason to reject this proposal out of hand.
Here is how it could work.
For the sake of illustration, suppose we have an inode that tracks blocks by an array of direct pointers to data blocks. Further suppose that the inode carries a starting-offset and an ending-offset that apply to the first and last blocks respectively, so you can have less-than-full blocks both at the beginning and end of a file.
Now, suppose you want to prepend data. It would go something like this.
IF (new data will fit into unused space in first data block)
    write the new data to the beginning of the first data block
    update the starting-offset
    return success indication to caller
try to allocate a new data block
IF (block allocation failed)
    return failure indication to caller
shift all existing data block pointers down by one
write the ID of the newly-allocated data block into the first slot of the array
write as much data as will fit into the second block (the old first block)
write the rest of the data into the newly-allocated data block, shifted to the end
starting-offset := (data block size - length of data in first block)
return success indication to caller

C Memory Management in Embedded Systems

I have to use C/asm to create a memory management system, since malloc/free don't yet exist. I need to have malloc/free!
I was thinking of using the memory stack as the space for the memory, but this would fail because when the stack pointer shrinks, ugly things happen with the allocated space.
1) Where would memory be allocated? If I place it randomly in the middle of the Heap/Stack and the Heap/Stack expands, there will be conflicts with allocated space!
2) What is the simplest/cleanest solution for memory management? These are the only options I've researched:
A memory stack where malloc grows the stack and free(p) shrinks the stack by shifting [p..stack_pointer] (this would invalidate the shifted memory addresses though...).
A linked list (Memory Pool) with a variable-size chunk of memory. However I don't know where to place this in memory... should the linked list be a "global" variable, or "static"?
Thanks!
This article provides a good review of memory management techniques. The resources section at the bottom has links to several open source malloc implementations.
For embedded systems the memory is partitioned at link time into several sections or pools, e.g.:
ro (code + constants)
rw (heap)
zi (zero initialised memory for static variables)
You could add a 4th section in the linker configuration files that would effectively allocate a space in the memory map for dynamic allocations.
However once you have created the raw storage for dynamic memory then you need to understand how many, how large and how frequent the dynamic allocations will occur. From this you can build a picture of how the memory will fragment over time.
Typically an application that is running OS free will not use dynamic memory as you don't want to have to deal with the consequences of malloc failing. If at all possible the better solution is design to avoid it. If this is not at all possible try and simplify the dynamic behaviour using a few large structures that have the data pre-allocated before anything needs to use it.
For example, say that you have an application that processes 10 bytes of data whilst receiving the next 10 bytes to process; you could implement a simple buffering solution. The driver will always be requesting buffers of the same size, and there would be a need for 3 buffers. Add a little metadata in a structure:
typedef struct {
    int inUse;
    char data[10];
} Buffer;
You could take an array of three of these structures (remembering to initialise inUse to 0) and flick between [0] and [1], with [2] reserved for the situation when a few too many interrupts occur and the next buffer is required before the previous one is freed (hence the need for the 3rd buffer). The alloc algorithm would only need to check for the first buffer with !inUse and return a pointer to its data. The free would merely need to change inUse back to 0.
Depending on the amount of available RAM and machine (physical / virtual addressing) that you're using there are lots of possible algorithms, but the more complex the algorithm the longer the allocations could take.
Declare a huge static char buffer and use this memory to write your own malloc & free functions.
Algorithms for writing malloc and free could be as complex (and optimized) or as simple as you want.
One simple way could be following...
Based on the type of memory allocation needs in your application, try to find the most common buffer sizes.
Declare a structure for each size, with a char buffer of that length and a boolean to represent whether the buffer is occupied or not.
Then declare static arrays of the above structures (decide the array sizes based on the total memory available in the system).
Now malloc would simply go to the most suitable array based on the required size, search for a free buffer (use some search algorithm here, or simply use linear search), mark the boolean in the associated structure TRUE, and return the buffer.
free would simply search for the buffer and mark the boolean FALSE.
hope this helps.
Use the GNU C library. You can use just malloc() and free(), or any other subset of the library. Borrowing the design and/or implementation and not reinventing the wheel is a good way to be productive.
Unless, of course, this is homework where the point of the exercise is to implement malloc and free....

temporary files vs malloc (in C)

I have a program that generates a variable amount of data that it has to store to use later.
When should I choose to use malloc+realloc, and when should I choose to use temporary files?
mmap(2,3p) (or file mappings) means never having to choose between the two.
Use temporary files if the size of your data is larger than the virtual address space of your target system (2-3 GB on 32-bit hosts), or if it's at least big enough that it would put serious resource strain on the system.
Otherwise use malloc.
If you go the route of temporary files, use the tmpfile function to create them, since on good systems they will never have names in the filesystem and have no chance of getting left around if your program terminates abnormally. Most people do not like temp file cruft like Microsoft Office products tend to leave all over the place. ;-)
Prefer a temporary file if you need/want it to be visible to other processes, and malloc/realloc if not. Also consider the amount of data compared to your address space and virtual memory: will the data consume too much swap space if left in memory? Also consider how good a fit the respective usage is for your application: file read/write etc. can be a pain compared to memory access... memory mapped files make it easier, but you may need custom library support to do dynamic memory allocation within them.
In a modern OS, all the memory gets paged out to disk if needed anyway, so feel free to malloc() anything up to a couple of gigabytes.
If you know the maximum size, it's not too big and you only need one copy, you should use a static buffer, allocated at program load time:
char buffer[1000];
int buffSizeUsed;
If any of those pre-conditions are false and you only need the information while the program is running, use malloc:
char *buffer = malloc (actualSize);
Just make sure you check that the allocations work and that you free whatever you allocate.
If the information has to survive the termination of your program or be usable from other programs at the same time, it'll need to go into a file (or long-lived shared memory if you have that capability).
And, if it's too big to fit into your address space at once, you'll need to store it in a file and read it in a bit at a time.
That's basically going from the easiest/least-flexible to the hardest/most-flexible possibilities.
Where your requirements lie along that line is a decision you need to make.
On a 32-bit system, you won't be able to malloc() more than 2GB or 3GB or so. The big advantage of files is that they are limited only by disk size. Even with a 64-bit system, it's unusual to be able to allocate more than 8GB or 16GB because there are usually limits on how large the swap file can grow.
Use RAM for data that is private and for the life of a single process. Use a temp file if the data needs to persist beyond a single process.