I cannot seem to find an answer to my question. I am building a memory allocator. Let's say that the user mallocs 20 bytes of memory. I reserve some memory for a header plus the 20-byte payload itself.
So it looks like this
header | payload
My question is: how should I deal with memory alignment? Should I deal with it at all? Because, working on a 64-bit system, I would need 4-word alignment. Should I, in that case, reserve the aligned number of bytes but hand out only the first 20 bytes of the payload, keep the remainder of the block as not allocated, and just decrease its size? Any suggestion would be really helpful.
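For concreteness, the usual approach to the rounding part is something like the following minimal sketch (a 16-byte boundary is used purely as an example; ALIGN_UP, ALIGNMENT and block_size are made-up names for illustration):

#include <stddef.h>

/* Round x up to the next multiple of align (align must be a power of two). */
#define ALIGN_UP(x, align) (((x) + (align) - 1) & ~((size_t)(align) - 1))

#define ALIGNMENT 16u /* example boundary; use whatever your platform requires */

/* Size of the block actually carved out of the heap for a request. */
size_t block_size(size_t header_size, size_t payload)
{
    return ALIGN_UP(header_size, ALIGNMENT) + ALIGN_UP(payload, ALIGNMENT);
}

/* block_size(16, 20) == 16 + 32 == 48: the user still only "owns" 20 bytes;
 * the remaining 12 are padding the allocator keeps track of. */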
Related question:
I'm writing a piece of code for an embedded system (a Cortex-M4 MCU with very limited RAM, < 32 KB) where I need big chunks of memory for a very short, well-defined time, and I thought of using some kind of memory pool so that memory doesn't get wasted on functions and actions that run once or twice in a lifetime.
I have made some attempts, but I think I'd like to learn more about memory pools and rewrite my code properly.
I saw some examples where a pointer is used to point to the next available free chunk, but I don't understand how I can process those kinds of pools.
For example, I can't use strstr in a memory pool where the whole string is spread across more than one chunk. I would need to read it chunk by chunk and copy the whole string into one larger array to carry on with further processing. Please correct me if I'm wrong.
So, if I understand correctly: if I have a memory pool of 1024 bytes with 32-byte chunks, that gives 32 chunks in total. And if I want to store a string of total length, say, 256 chars (bytes), I'd need 8 chunks, but if I then want to read the string I'd need to copy those 8 chunks into a 256-char array.
Am I missing something?
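To illustrate the "pointer to the next available free chunk" idea, here is a minimal sketch of a fixed-size-chunk pool where each free chunk stores, in its own first bytes, the address of the next free chunk, so no separate bookkeeping array is needed (pool_init, pool_alloc, pool_free and CHUNK_SIZE are invented names for illustration, not a library API):

#include <stddef.h>

#define CHUNK_SIZE 32 /* must be >= sizeof(void *) */
#define POOL_SIZE 1024
#define NUM_CHUNKS (POOL_SIZE / CHUNK_SIZE) /* 32 chunks */

static _Alignas(void *) unsigned char pool[POOL_SIZE];
static void *free_list; /* head of the list of free chunks */

void pool_init(void)
{
    /* Thread all chunks together: each free chunk holds the address
     * of the next free chunk in its first bytes. */
    for (size_t i = 0; i < NUM_CHUNKS - 1; i++)
        *(void **)&pool[i * CHUNK_SIZE] = &pool[(i + 1) * CHUNK_SIZE];
    *(void **)&pool[(NUM_CHUNKS - 1) * CHUNK_SIZE] = NULL;
    free_list = &pool[0];
}

void *pool_alloc(void)
{
    void *chunk = free_list;
    if (chunk)
        free_list = *(void **)chunk; /* pop the head of the free list */
    return chunk;                    /* NULL when the pool is exhausted */
}

void pool_free(void *chunk)
{
    *(void **)chunk = free_list; /* push the chunk back onto the free list */
    free_list = chunk;
}

And yes, your reading is correct: a 256-byte string spread over 8 separate 32-byte chunks is not contiguous, so you would either have to copy it into one 256-byte buffer before calling strstr on it, or allocate 8 adjacent chunks in the first place.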
In C programming, you can pass any kind of pointer you like as an argument to free; how does it know the size of the allocated memory to free? Whenever I pass a pointer to some function, I also have to pass the size (i.e., an array of 10 elements needs to receive 10 as a parameter to know the size of the array), but I do not have to pass the size to the free function. Why not, and can I use this same technique in my own functions to save me from needing to cart around the extra variable of the array's length?
When you call malloc(), you specify the amount of memory to allocate. The amount of memory actually used is slightly more than this, and includes extra information that records (at least) how big the block is. You can't (reliably) access that other information - and nor should you :-).
When you call free(), it simply looks at the extra information to find out how big the block is.
Most implementations of C memory allocation functions will store accounting information for each block, either in-line or separately.
One typical way (in-line) is to actually allocate both a header and the memory you asked for, padded out to some minimum size. So for example, if you asked for 20 bytes, the system may allocate a 48-byte block:
16-byte header containing size, special marker, checksum, pointers to next/previous block and so on.
32 bytes data area (your 20 bytes padded out to a multiple of 16).
The address then given to you is the address of the data area. Then, when you free the block, free will simply take the address you give it and, assuming you haven't stuffed up that address or the memory around it, check the accounting information immediately before it. Graphically, that would be along the lines of:
  ____ The allocated block ____
 /                             \
+--------+--------------------+
| Header | Your data area ... |
+--------+--------------------+
          ^
          |
          +-- The address you are given
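In code, the "free" side of that picture is just a little pointer arithmetic. A sketch with a made-up header layout (not any particular libc's):

#include <stddef.h>

struct block_header {
    size_t size; /* size of the data area that follows */
    /* special marker, checksum, next/prev pointers, etc. would also live here */
};

/* Given the pointer handed out to the user, recover the accounting info. */
size_t allocated_size(void *ptr)
{
    struct block_header *hdr = (struct block_header *)ptr - 1;
    return hdr->size;
}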
Keep in mind the size of the header and the padding are totally implementation defined (actually, the entire thing is implementation-defined (a) but the in-line accounting option is a common one).
The checksums and special markers that exist in the accounting information are often the cause of errors like "Memory arena corrupted" or "Double free" if you overwrite them or free them twice.
The padding (to make allocation more efficient) is why you can sometimes write a little bit beyond the end of your requested space without causing problems (still, don't do that, it's undefined behaviour and, just because it works sometimes, doesn't mean it's okay to do it).
(a) I've written implementations of malloc in embedded systems where you got 128 bytes no matter what you asked for (that was the size of the largest structure in the system), assuming you asked for 128 bytes or less (requests for more would be met with a NULL return value). A very simple bit-mask (i.e., not in-line) was used to decide whether a 128-byte chunk was allocated or not.
Others I've developed had different pools for 16-byte chunks, 64-byte chunks, 256-byte chunks and 1K chunks, again using a bit-mask to decide which blocks were used or available.
Both these options managed to reduce the overhead of the accounting information and to increase the speed of malloc and free (no need to coalesce adjacent blocks when freeing), particularly important in the environment we were working in.
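A sketch of that bit-mask scheme (sizes and names are illustrative, not the actual code from those systems):

#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE 128
#define NUM_CHUNKS 64 /* an 8 KB arena, as an example */

static unsigned char arena[NUM_CHUNKS * CHUNK_SIZE];
static uint64_t used_mask; /* bit i set => chunk i is allocated */

void *fixed_malloc(size_t size)
{
    if (size == 0 || size > CHUNK_SIZE)
        return NULL; /* larger requests are refused with NULL */
    for (int i = 0; i < NUM_CHUNKS; i++) {
        if (!(used_mask & ((uint64_t)1 << i))) {
            used_mask |= (uint64_t)1 << i;
            return &arena[i * CHUNK_SIZE];
        }
    }
    return NULL; /* arena exhausted */
}

void fixed_free(void *ptr)
{
    size_t i = ((unsigned char *)ptr - arena) / CHUNK_SIZE;
    used_mask &= ~((uint64_t)1 << i); /* no stored size needed: every chunk is 128 bytes */
}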
From the comp.lang.c FAQ list: How does free know how many bytes to free?
The malloc/free implementation remembers the size of each block as it is allocated, so it is not necessary to remind it of the size when freeing. (Typically, the size is stored adjacent to the allocated block, which is why things usually break badly if the bounds of the allocated block are even slightly overstepped)
This answer is relocated from How does free() know how much memory to deallocate? where I was abruptly prevented from answering by an apparent duplicate question. This answer then should be relevant to this duplicate:
For the case of malloc, the heap allocator stores a mapping of the original returned pointer, to relevant details needed for freeing the memory later. This typically involves storing the size of the memory region in whatever form relevant to the allocator in use, for example raw size, or a node in a binary tree used to track allocations, or a count of memory "units" in use.
free will not fail if you "rename" the pointer, or duplicate it in any way. It is not however reference counted, and only the first free will be correct. Additional frees are "double free" errors.
Attempting to free any pointer whose value differs from those returned by previous mallocs and not yet freed is an error. It is not possible to partially free memory regions returned from malloc.
On a related note, the GLib library has memory allocation functions which do not save an implicit size; instead, you pass the size parameter to the free call yourself. This can eliminate part of the overhead.
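For example, GLib's GSlice functions take this approach: the caller supplies the size on both the allocate and the free call, so nothing has to be stored next to the block (a minimal sketch; consult the GLib documentation for the full API):

#include <glib.h>

void example(void)
{
    char *buf = g_slice_alloc(256); /* caller knows the size... */
    /* ... use buf ... */
    g_slice_free1(256, buf);        /* ...and supplies it again when freeing */
}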
The heap manager stored the amount of memory belonging to the allocated block somewhere when you called malloc.
I never implemented one myself, but I guess the memory right in front of the allocated block might contain the meta information.
The original technique was to allocate a slightly larger block and store the size at the beginning, then give the application the rest of the block. The extra space holds the size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and it also creates dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
malloc() and free() are system/compiler dependent so it's hard to give a specific answer.
More information on this other question.
To answer the second half of your question: yes, you can, and a fairly common pattern in C is the following:
typedef struct {
    size_t numElements;
    int elements[1]; /* but enough space malloced for numElements at runtime */
} IntArray_t;

#define SIZE 10
IntArray_t *myArray = malloc(sizeof(IntArray_t) + SIZE * sizeof(int));
myArray->numElements = SIZE;
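With that layout, a function only needs the one pointer, because the length travels with the data. Continuing the sketch above (the sum function is just an invented example):

int sum(const IntArray_t *a)
{
    int total = 0;
    for (size_t i = 0; i < a->numElements; i++)
        total += a->elements[i]; /* no separate length parameter needed */
    return total;
}
/* ...and a single free(myArray) later releases the header and the elements together. */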
To answer the second question: yes, you could (kind of) use the same technique as malloc(), by simply assigning the first cell inside every array to the size of the array. That lets you send the array without sending an additional size argument.
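A minimal sketch of that idea (the helper names are invented; slot 0 plays the role of malloc's hidden header):

#include <stdlib.h>

/* Allocate an int array with room for n elements, remembering n in slot 0. */
int *sized_array_new(size_t n)
{
    int *a = malloc((n + 1) * sizeof(int));
    if (a)
        a[0] = (int)n; /* the "hidden" length */
    return a;
}

void fill(int *a)
{
    int n = a[0]; /* no extra length argument required */
    for (int i = 1; i <= n; i++)
        a[i] = i; /* the payload lives in a[1] .. a[n] */
}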
When we call malloc, it simply consumes a few more bytes than we requested. Those extra bytes contain information such as a checksum, the size, and other additional bookkeeping.
When we call free, it goes directly to that additional information, where it finds the address of the block and also how much memory is to be freed.
This allocator will be used inside an embedded system with static memory (i.e., no system heap available, so the 'heap' will simply be 'char heap[4096]').
There seem to be lots of "small memory allocators" around, but I'm looking for one that handles REALLY small allocations. I'm talking typical sizes of 16 bytes, with small CPU use and smaller memory use.
Considering that typical allocation sizes are <= 16 bytes, rare allocations are <= 64 bytes, and the "one in a million" allocations are up to 192 bytes, I was thinking of simply chopping those 4096 bytes into 256 chunks of 16 bytes each and having a bitmap and a "next free chunk" pointer. So rather than searching, if the memory is available, the appropriate chunks are marked and the function returns the pointer. Only once the end is reached would it go searching for an appropriate slot of the required size. Due to the nature of the system, earlier blocks 'should' have been released by the time the 'next free chunk' pointer arrives at the end of the 'heap'.
So,
Does anyone know something like this already exists?
If not, can anyone poke holes in my theory?
Or, can they suggest something better?
C only, no C++. (Entire application must fit into <= 64KB, and there's about 40K of graphics so far...)
OP: can anyone poke holes in my theory?
While reading the first half, I thought out a solution using a bit array to record usage and came up with effectively the same thing you outline in the second half.
So here is the hole: avoid hard-coding a 16-byte block. Allow your bitmap to work with, say, 20- or 24-byte blocks at the beginning of your development. During this time, you may want to put tag information and sentinels on the edges of the block. That way you can more readily track down double free(), usage outside the allocation, etc. Of course, the price is a smaller effective pool.
After your debug stage, go with your 16-byte solution with confidence.
Be sure to keep track of 0 <= total allocation <= (4096 - overhead) and allow a check of it against your bitmap.
For debug, consider filling a freed block with "0xDEAD", etc. to help force inadvertent free usage errors.
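A sketch of what that debug instrumentation could look like during development (the chunk size, sentinel value and function names are invented; the single-byte 0xDD fill stands in for the "0xDEAD" pattern mentioned above):

#include <stdint.h>
#include <string.h>

#define DEBUG_CHUNK_SIZE 24 /* 16-byte payload plus a 4-byte tag on each edge */
#define SENTINEL 0xA5A5A5A5u

/* Called when a chunk is handed out: put sentinels on both edges. */
static void tag_chunk(unsigned char *chunk)
{
    uint32_t s = SENTINEL;
    memcpy(chunk, &s, 4);                        /* leading tag */
    memcpy(chunk + DEBUG_CHUNK_SIZE - 4, &s, 4); /* trailing tag */
}

/* Called on free: verify the sentinels, then poison the chunk. */
static int untag_chunk(unsigned char *chunk)
{
    uint32_t lead, trail;
    memcpy(&lead, chunk, 4);
    memcpy(&trail, chunk + DEBUG_CHUNK_SIZE - 4, 4);
    if (lead != SENTINEL || trail != SENTINEL)
        return -1;                         /* overflow or underflow detected */
    memset(chunk, 0xDD, DEBUG_CHUNK_SIZE); /* poison to expose use-after-free */
    return 0;
}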
free() is called to deallocate the memory allocated by a malloc() call. Where does free() find the information about the number of bytes allocated by malloc()? I.e., how is the number of bytes allocated by malloc() determined, and where is this information stored?
-Surya
This is implementation dependent. The heap stores that data in some manner that makes it accessible from the pointer returned by malloc() - for example, the block could store the number of bytes at its beginning, and malloc() would return a pointer offset past that count.
When you allocate a block of memory, more bytes than you requested are allocated. How many depends on the implementation but here is an example:
struct MallocHeader {
    struct MallocHeader *prev, *next;
    size_t length;
    /* ... more data, padding, etc. ... */
    char data[0]; /* start of the caller's data area */
};
When malloc() allocates the memory from the free list, it will allocate size + sizeof(struct MallocHeader) and return the address of data. In free(), the offset of data in the struct MallocHeader is subtracted from the pointer you pass in and then it knows the size.
This is implementation dependent - it depends on the libc implementation and also on the operating system implementation (more on the operating system implementation).
I haven't got a need to know such things but if you really want to you can create your own memory allocator.
By accident I found out that in C++ (on Visual Studio, at least), allocating with the new[] operator stores the number of elements at the beginning of the allocated zone and returns to the user the zone that follows the count.
new[NUMBER] ---> [NUMBER (4 bytes)] + [allocated area]
It returns the pointer to the allocated area, and presumably when the delete[] operator is called it looks 4 bytes before the [allocated area] to see how many elements will be deleted.
After searching around a bit and consulting the Dinosaur Book, I've come to SO seeking wisdom. Note that this is somewhat homework-related, but actually isn't a homework problem. Also, this is using the C programming language.
I'm working with a kernel that currently allocates memory in 4K chunks. In an attempt to cut down on wasted memory, I've written my own malloc-like allocator: it grabs a 4K page and then gives out memory from that page as needed. That part is currently working fine. I plan to have a linked list of pages of memory. Memory is handled as a char*, so my struct has a char* in it. It also currently has some ints describing it, as well as a pointer to the next node.
The question is this: I plan to use a bit vector to keep track of free and used memory. I want to figure out how many integers (4 bytes, 32 bits) I need to keep track of all the 1 byte blocks in the page of memory. So 1 bit in the bit vector will correspond to 1 byte in the page. The catch is that I must fit this all within the 4K I've been allocated, so I need to figure out the number of integers necessary to satisfy the 1-bit-per-byte constraint and fit in the 4K.
Or rather, I need to maximize the "actual" memory I'll have, while minimizing the number of integers required to map one bit per byte, while both parts ("actual" memory and bit vector) are in the same page.
Due to information about the page, and the pointers, I won't actually have 4K to play with, but something closer to 4062 bytes.
I believe this to be a linear programming problem, but the formulations of it that I've tried haven't worked out.
You want to use a bitmap to keep track of allocated bytes in a 4K page, and you're wondering how big the bitmap itself should be (in bytes)? The answer is 456 (after rounding up), found by solving this equation for the number of data bytes X:
X + X/8 = 4096
which reduces to:
9X = 32768
so X ≈ 3641 data bytes, and the bitmap for them is X/8 ≈ 455.1, i.e. 456 bytes after rounding up (leaving 3640 bytes of actual data area).
But ... the whole approach of keeping a bitmap within each allocated page sounds very wrong to me. What if you want to allocate a 12k buffer?
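To make the split concrete, here is a quick sketch that divides whatever usable space a page has (4062 bytes, per the question) into a bitmap and a data area, one bit per data byte:

#include <stdio.h>

int main(void)
{
    unsigned usable = 4062;             /* page size minus page metadata, per the question */
    unsigned data   = (usable * 8) / 9; /* from data + data/8 <= usable */
    unsigned bitmap = (data + 7) / 8;   /* round the bitmap up to whole bytes */

    while (data + bitmap > usable) {    /* shrink until both parts really fit */
        data--;
        bitmap = (data + 7) / 8;
    }
    printf("data = %u bytes, bitmap = %u bytes, slack = %u\n",
           data, bitmap, usable - data - bitmap);
    return 0;
}

/* With 4062 usable bytes this prints: data = 3610 bytes, bitmap = 452 bytes, slack = 0. */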