This question already has answers here:
How does free know how much to free?
(11 answers)
Closed 9 years ago.
We allocate memory dynamically in C using malloc() and receive a pointer to a location in the heap.
We then use free() to deallocate the memory, passing the same pointer value as its argument.
The question is: how does free() know how much to deallocate, considering that we can always resize the memory block allocated by malloc()?
Is there anything related to hash tables here?
A typical implementation will store information just before the address returned by malloc. That information will include the information that realloc or free needs to know to do their work, but the details of what exactly is stored there depends on the implementation.
The original technique was to allocate a slightly larger block and store the size at the beginning, a part the application didn't see. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and also create dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
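The separate-directory idea can be sketched as follows. This is only an illustration layered on top of the standard malloc, assuming a tiny fixed-size table; the names dir_alloc, dir_size and dir_free are hypothetical, and a real allocator would use a far more scalable lookup structure.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DIR_SLOTS 64                /* hypothetical fixed capacity */

struct dir_entry { void *ptr; size_t size; };
static struct dir_entry directory[DIR_SLOTS];

/* Allocate a block and record its size in the side table. */
void *dir_alloc(size_t size) {
    for (size_t i = 0; i < DIR_SLOTS; i++) {
        if (directory[i].ptr == NULL) {
            void *p = malloc(size);
            if (p == NULL) return NULL;
            directory[i].ptr = p;
            directory[i].size = size;
            return p;
        }
    }
    return NULL;                    /* table full */
}

/* Look up the recorded size; returns 0 for unknown pointers. */
size_t dir_size(const void *p) {
    if (p == NULL) return 0;
    for (size_t i = 0; i < DIR_SLOTS; i++)
        if (directory[i].ptr == p) return directory[i].size;
    return 0;
}

/* Release a block: the size comes from the table, not from the block. */
void dir_free(void *p) {
    if (p == NULL) return;
    for (size_t i = 0; i < DIR_SLOTS; i++) {
        if (directory[i].ptr == p) {
            directory[i].ptr = NULL;
            directory[i].size = 0;
            free(p);
            return;
        }
    }
    assert(!"pointer was not allocated by dir_alloc");
}
```

Nothing is stored inside or next to the user's block here, which is the property that avoids touching (and dirtying) the block's own pages.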
A simple implementation is the one in the famous K&R C book, pages 186-188.
The memory block we actually get is larger than what we asked for, by the size of a header (a struct or a union). The header may look like this:
typedef long Align;
union header
{
    struct
    {
        union header* ptr; // next block
        unsigned size;     // size of this block, in multiples of the header size
    } s;
    Align x;
};
typedef union header Header;
A figure to demonstrate it:
When we call the free function, the behaviour may be like this:
void free(void* ptr)
{
    Header *bp, *p;
    bp = (Header *)ptr - 1;
    /* ..... */
    /* return the memory to the linked list */
}
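A minimal sketch of the header-before-the-block layout, in the spirit of the K&R allocator but layered on top of the standard malloc so that it can be compiled as-is. The names kr_alloc, kr_units and kr_free are hypothetical, and the free-list handling is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef long Align;                 /* forces header alignment, as in K&R */

typedef union header {
    struct {
        union header *ptr;          /* next free block (unused in this sketch) */
        size_t size;                /* block size, in units of sizeof(Header) */
    } s;
    Align x;
} Header;

/* Round a byte count up to whole header-sized units, plus one for the header. */
static size_t units_for(size_t nbytes) {
    return (nbytes + sizeof(Header) - 1) / sizeof(Header) + 1;
}

void *kr_alloc(size_t nbytes) {
    size_t nunits = units_for(nbytes);
    Header *hp = malloc(nunits * sizeof(Header));
    if (hp == NULL) return NULL;
    hp->s.ptr = NULL;
    hp->s.size = nunits;
    return hp + 1;                  /* caller sees the memory just past the header */
}

/* Recover the size in units by stepping back one header. */
size_t kr_units(void *ap) {
    assert(ap != NULL);
    Header *bp = (Header *)ap - 1;
    return bp->s.size;
}

void kr_free(void *ap) {
    if (ap == NULL) return;
    Header *bp = (Header *)ap - 1;  /* step back to the header */
    free(bp);
}
```

The caller never passes a size to kr_free; the pointer alone is enough because the header sits at a fixed distance in front of it.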
In Visual Studio there are two build configurations, release and debug. In the debug version the header can also store
debugging information to make debugging easier. The header in the debug version is called _CrtMemBlockHeader, and it is defined as follows:
typedef struct _CrtMemBlockHeader
{
    struct _CrtMemBlockHeader * pBlockHeaderNext;
    struct _CrtMemBlockHeader * pBlockHeaderPrev;
    char * szFileName;
    int nLine;
    size_t nDataSize;
    int nBlockUse;
    long lRequest;
    unsigned char gap[nNoMansLandSize];
} _CrtMemBlockHeader;
Then the memory layout is:
A memory manager keeps additional bookkeeping data for each pointer it hands out, sometimes right before the pointed-to block, sometimes elsewhere. In simple implementations the size is most likely stored a few bytes before the pointer (for example at pointer-2 or pointer-4), as an int or long. The exact details depend on the compiler and its C library.
When we call malloc, a block is reserved whose size is a little more than what we requested, and malloc returns a pointer to the start of the usable part of this block.
As I said, the block is a little larger than what you asked for. The extra space is used to keep the requested size of the block, a pointer to the next free block, and some guard data used to detect accesses beyond the allocated block.
So whenever we call free with the pointer we want to deallocate, free looks up the extra information stored alongside the block and gets the size to deallocate from there.
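The guard data mentioned above can be sketched like this, in the same spirit as the gap field of the debug header. It is an illustration only, built on top of the standard malloc; the names guard_alloc, guard_check and guard_free, the guard length and the fill byte are all assumptions, and the size arithmetic is not checked for overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_LEN 4
#define GUARD_BYTE 0xFD             /* arbitrary fill value for this sketch */

typedef union guard_header {
    size_t size;                    /* requested size in bytes */
    max_align_t align;              /* keeps the user data suitably aligned */
} GuardHeader;

/* Layout: [header][user bytes][guard bytes] */
void *guard_alloc(size_t size) {
    unsigned char *raw = malloc(sizeof(GuardHeader) + size + GUARD_LEN);
    if (raw == NULL) return NULL;
    ((GuardHeader *)raw)->size = size;
    unsigned char *user = raw + sizeof(GuardHeader);
    memset(user + size, GUARD_BYTE, GUARD_LEN);
    return user;
}

/* Returns 1 if the guard bytes after the block are intact, 0 otherwise. */
int guard_check(const void *ptr) {
    assert(ptr != NULL);
    const unsigned char *user = ptr;
    const GuardHeader *h = (const GuardHeader *)(user - sizeof(GuardHeader));
    for (size_t i = 0; i < GUARD_LEN; i++)
        if (user[h->size + i] != GUARD_BYTE) return 0;
    return 1;
}

/* Frees the block and reports whether the guard was still intact. */
int guard_free(void *ptr) {
    if (ptr == NULL) return 1;
    int ok = guard_check(ptr);
    free((unsigned char *)ptr - sizeof(GuardHeader));
    return ok;
}
```

A write one byte past the requested size lands in the guard area, so guard_free can report the overrun instead of silently corrupting a neighbouring block.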
Related
If you want to allocate an array of struct you can do it statically by declaring something like
struct myStruct myStructArray[100];
or dynamically with something like
struct myStruct *myStructArray = calloc(100, sizeof(struct myStruct) );
but in this case you are responsible for freeing the memory.
In many applications and samples I found a mixed approach:
struct wrapperStruct
{
int myInt;
struct myStruct myStructArray[1];
};
Then the allocation is performed like this
int n = 100;
size_t memory_size = sizeof(struct wrapperStruct) + (n - 1) * sizeof(struct myStruct);
struct wrapperStruct *wrapperStruct_p = calloc(1, memory_size);
So (if I understood correctly), since the array is the last member of the struct and the fields of a struct keep the same order in memory, you are "extending" the single-entry array myStructArray with 99 more entries.
This allows you to safely write something like wrapperStruct_p->myStructArray[44] without causing a buffer overflow and without having to create a dynamically allocated array of structs and then take care of freeing it at the end. So the alternative approach would be:
struct wrapperStruct
{
int myInt;
struct myStruct *myStructArray;
};
struct wrapperStruct *wrapperStruct_p = calloc(1, sizeof(struct wrapperStruct) );
wrapperStruct_p->myStructArray = calloc(100, sizeof(struct myStruct) );
The question is what happens when you try to free the wrapperStruct_p variable ?
Are you causing a memory leak ?
Is the C memory management able to understand that the array of struct is made of 100 entries and not 1 ?
What are the benefits of the first approach apart from not having to free the pointer inside the struct ?
The question is what happens when you try to free the wrapperStruct_p
variable ?
Are you causing a memory leak ?
Most likely, but not necessarily. The memory for the inner dynamic array is not freed, but you could still free it later if you saved the pointer in some other variable.
Is the C memory management able to understand that the array of struct is made of 100 entries and not 1 ?
"C memory management" takes care of stack and heap allocations (the latter go through system calls, so maybe it's not really "C memory management"). It doesn't do much else other than provide syntactic sugar on top of assembler, unlike garbage-collected languages such as Java.
C itself doesn't care how many entries are stored somewhere or which part of memory you access (segmentation faults are the OS's response to memory access violations).
What are the benefits of the first approach apart from not having to
free the pointer inside the struct ?
If by "first approach" you mean the stack-allocated array, the main benefit is that you do not need to allocate anything yourself: the stack does it for you. The drawback is that the array stays allocated for the whole declared scope, and you can't free it early or grow it. Further benefits are constant allocation time and the assurance that you get your 100 array items no matter how the OS responds (many real-time applications require bounded response times, so a heap allocation can be a serious slowdown that causes problems).
If by "first approach" you mean using the wrapper struct, then I do not see any benefits other than the one you already stated.
I'd even suggest you not advocate/use this approach, since it is a really confusing technique that doesn't serve noticeable benefits (plus it allocates 1 space even though it may not be even used, but that's a detail)
The main goal is to write code that is easily understandable by other people. Machines and compilers can nowadays do wonders with code, so unless you are a compiler designer, standard library developer or machine level programmer for embedded systems, you should write simple to understand code.
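For comparison, C99 standardised the wrapper-struct pattern discussed above as the flexible array member, which removes the one-element placeholder. A minimal sketch, assuming a placeholder definition of struct myStruct, an added count field, and a hypothetical helper named wrapper_new; the size arithmetic is not checked for overflow.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

struct myStruct { int value; };          /* placeholder definition for this sketch */

struct wrapperStruct {
    int myInt;
    size_t count;                        /* number of elements that follow */
    struct myStruct myStructArray[];     /* C99 flexible array member */
};

/* One zeroed allocation holds the wrapper and its n array elements. */
struct wrapperStruct *wrapper_new(size_t n) {
    struct wrapperStruct *w =
        calloc(1, sizeof *w + n * sizeof w->myStructArray[0]);
    if (w != NULL) w->count = n;
    return w;
}
```

A single free(w) releases the wrapper and all of its elements, because free releases the whole block that calloc returned, however the struct is declared. With the pointer-member variant there are two allocations, so the inner array has to be freed before the wrapper.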
The setup
Let's say I have a struct father which has member variables such as an int, and another struct(so father is a nested struct). This is an example code:
struct mystruct {
int n;
};
struct father {
int test;
struct mystruct M;
struct mystruct N;
};
In the main function, we allocate memory with malloc() to create a new struct of type struct father, then we fill its member variables and those of its children:
struct father* F = (struct father*) malloc(sizeof(struct father));
F->test = 42;
F->M.n = 23;
F->N.n = 11;
We then get pointers to those member variables from outside the structs:
int* p = &F->M.n;
int* q = &F->N.n;
After that, we print the values before and after the execution of free(F), then exit:
printf("test: %d, M.n: %d, N.n: %d\n", F->test, *p, *q);
free(F);
printf("test: %d, M.n: %d, N.n: %d\n", F->test, *p, *q);
return 0;
This is a sample output(*):
test: 42, M.n: 23, N.n: 11
test: 0, M.n: 0, N.n: 1025191952
*: Using gcc (Ubuntu 9.3.0-17ubuntu1~20.04) 9.3.0
Full code on pastebin: https://pastebin.com/khzyNPY1
The question
That was the test program that I used to test how memory is deallocated using free(). My idea(from reading K&R "8.7 Example - A Storage Allocator", in which a version of free() is implemented and explained) is that, when you free() the struct, you're pretty much just telling the operating system or the rest of the program that you won't be using that particular space in memory that was previously allocated with malloc(). So, after freeing those memory blocks, there should be garbage values in the member variables, right? I can see that happening with N.n in the test program, but, as I ran more and more samples, it was clear that in the overwhelming majority of cases, these member variables are "reset" to 0 more than any other "random" value. My question is: why is that? Is it because the stack/heap is filled with zeroes more frequently than any other value?
As a last note, here are a few links to related questions but which do not answer my particular question:
C - freeing structs
What REALLY happens when you don't free after malloc?
After calling free, the pointers F, p and q no longer point to valid memory. Attempting to dereference those pointers invokes undefined behavior. In fact, the values of those pointers become indeterminate after the call to free, so you may also invoke UB just by reading those pointer values.
Because dereferencing those pointers is undefined behavior, the compiler can assume it will never happen and make optimizations based on that assumption.
That being said, there's nothing that states that the malloc/free implementation has to leave values that were stored in freed memory unchanged or set them to specific values. It might write part of its internal bookkeeping state to the memory you just freed, or it might not. You'd have to look at the source for glibc to see exactly what it's doing.
Leaving aside undefined behavior and whatever else the standard might dictate: the dynamic allocator is itself a program, so for a fixed implementation, and assuming it makes no decisions based on external factors (which it does not), its behavior is completely deterministic.
Real answer: what you are seeing here is the effect of the internal workings of glibc's allocator (glibc is the default C library on Ubuntu).
The internal structure of an allocated chunk is the following (source):
struct malloc_chunk {
INTERNAL_SIZE_T mchunk_prev_size; /* Size of previous chunk (if free). */
INTERNAL_SIZE_T mchunk_size; /* Size in bytes, including overhead. */
struct malloc_chunk* fd; /* double links -- used only if free. */
struct malloc_chunk* bk;
/* Only used for large blocks: pointer to next larger size. */
struct malloc_chunk* fd_nextsize; /* double links -- used only if free. */
struct malloc_chunk* bk_nextsize;
};
In memory, when the chunk is in use (not free), it looks like this:
chunk-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of previous chunk, if unallocated (P clear)  |
        +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             Size of chunk, in bytes                     |A|M|P| flags
  mem-> +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
        |             User data starts here...                          |
Every field except mchunk_prev_size and mchunk_size is only populated if the chunk is free. Those two fields are right before the user usable buffer. User data begins right after mchunk_size (i.e. at the offset of fd), and can be arbitrarily large. The mchunk_prev_size field holds the size of the previous chunk if it's free, while the mchunk_size field holds the real size of the chunk (which is at least 16 bytes more than the requested size).
A more thorough explanation is provided as comments in the library itself here (highly suggested read if you want to know more).
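One way to observe that a chunk is larger than the request, without reading the allocator's private fields, is the malloc_usable_size function that glibc declares in malloc.h. It is a non-standard extension, so the sketch below assumes a glibc-based system; the helper name usable_bytes_for is hypothetical, and the exact numbers it returns vary between versions and platforms.

```c
#include <assert.h>
#include <malloc.h>   /* glibc-specific: declares malloc_usable_size */
#include <stddef.h>
#include <stdlib.h>

/* Returns how many bytes the allocator made usable for a request,
   or 0 if the allocation failed. */
size_t usable_bytes_for(size_t request) {
    void *p = malloc(request);
    if (p == NULL) return 0;
    size_t usable = malloc_usable_size(p);
    free(p);
    return usable;
}
```

The only property worth relying on is that the result is at least as large as the request; programs should still treat the requested size as the limit.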
When you free() a chunk, there are a lot of decisions to be made as to where to "store" that chunk for bookkeeping purposes. In general, freed chunks are sorted into double linked lists based on their size, in order to optimize subsequent allocations (that can get already available chunks of the right size from these lists). You can see this as a sort of caching mechanism.
Now, depending on your glibc version, they could be handled slightly differently, and the internal implementation is quite complex, but what is happening in your case is something like this:
struct malloc_chunk *victim = addr; // address passed to free()
// Add chunk at the head of the free list
victim->fd = NULL;
victim->bk = head;
head->fd = victim;
Since your structure is basically equivalent to:
struct x {
    int a;
    int b;
    int c;
};
And since on your machine sizeof(struct malloc_chunk *) == 2 * sizeof(int), the first operation (victim->fd = NULL) is effectively wiping out the contents of the first two fields of your structure (remember, user data begins exactly at fd), while the second one (victim->bk = head) is altering the third value.
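The overlap described above can be reproduced without touching freed memory by performing the same two pointer stores on a live object. This is a simulation, not glibc code: the union, the function name overwrite_like_free and the head argument are hypothetical, and it assumes that a null pointer is represented by all-zero bytes, as it is on mainstream platforms.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct three_ints { int a, b, c; };

/* Storage large enough for either the user's struct or two list pointers,
   mimicking how a freed chunk's user area is reused for fd and bk. */
union chunk_payload {
    struct three_ints user;
    void *links[2];                 /* links[0] plays fd, links[1] plays bk */
};

/* Performs the fd and bk stores on a live, valid object. */
void overwrite_like_free(union chunk_payload *payload, void *head) {
    void *fd = NULL;
    void *bk = head;
    memcpy((unsigned char *)payload, &fd, sizeof fd);
    memcpy((unsigned char *)payload + sizeof fd, &bk, sizeof bk);
}
```

On a machine where a pointer is twice the size of an int, the first store clears a and b, and the second replaces c with part of an address, which matches the pattern of two zeros followed by one large number.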
The Standard specifies nothing about the behavior of a program that uses a pointer to allocated storage after it has been freed. Implementations are free to extend the language by specifying the behavior of more programs than required by the Standard, and the authors of the Standard intended to encourage variety among implementations which would support popular extensions on a quality-of-implementation basis directed by the marketplace. Some operations with pointers to dead objects are widely supported (e.g. given char *x,*y; the Standard would allow conforming implementations to behave in arbitrary fashion if a program executes free(x); y=x; in cases where x had been non-null, without regard for whether anything ever does anything with y after its initialization, but most implementations would extend the language to guarantee that such code would have no effect if y is never used) but dereferencing of such pointers generally isn't.
Note that if one were to pass two copies of the same pointer to a freed object to:
int test(char *p1, char *p2)
{
    char *q;
    if (*p1)
    {
        q = malloc(0);
        free(q);
        return *p1+*p2;
    }
    else
        return 0;
}
it is entirely possible that the act of allocating and freeing q would disturb the bit patterns in the storage that had been allocated to *p1 (and also *p2), but a compiler would not be required to allow for that possibility. A compiler might plausibly return the sum of the value that was read from *p1 before the malloc/free, and a value that was read from *p2 after it; this sum could be an odd number even though if p1 and p2 are equal, *p1+*p2 should always be even.
Two things happen when you call free:
In the C model of computing, any pointer values that point to the freed memory (either its beginning, such as your F, or things within it, such as your p and q) are no longer valid. The C standard does not define what happens when you attempt to use these pointer values, and optimization by the compiler may have unexpected effects on how your program behaves if you attempt to use them.
The freed memory is released for other purposes. One of the most common other purposes for which it is used is tracking memory that is available for allocation. In other words, the software that implements malloc and free needs data structures to record which blocks of memory have been freed and other information. When you free memory, that software often uses some of the memory for this purpose. That can result in the changes you saw.
The freed memory may also be used by other things in your program. In a single-threaded program without signal handlers or similar things, generally no software would run between the free and the preparation of the arguments to the printf you show, so nothing else would reuse the memory so quickly—reuse by the malloc software is the most likely explanation for what you observed. However, in a multithreaded program, the memory might be reused immediately by another thread. (In practice, this may be a bit unlikely, as the malloc software may keep preferentially separate pools of memory for separate threads, to reduce the amount of inter-thread synchronization that is necessary.)
When a dynamically allocated object is freed, it no longer exists. Any subsequent attempt to access it has undefined behavior. The question is therefore nonsense: the members of an allocated struct cease to exist at the end of the host struct's lifetime, so they cannot be set or reset to anything at that point. There is no valid way to attempt to determine any values for such no-longer-existing objects.
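If the values are still needed after the object is released, the well-defined route is to copy them out first and then drop every pointer into the block. A minimal sketch using the struct definitions from the question; the helper name release_father is hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

struct mystruct { int n; };
struct father {
    int test;
    struct mystruct M;
    struct mystruct N;
};

/* Copies the struct out, frees the heap object, and clears the caller's
   pointer so it cannot be dereferenced by accident afterwards. */
struct father release_father(struct father **pp) {
    assert(pp != NULL && *pp != NULL);
    struct father snapshot = **pp;
    free(*pp);
    *pp = NULL;
    return snapshot;
}
```

Pointers to members, such as p and q in the question, are not reset by this helper; they become invalid at the same moment and must not be used either.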
Most examples using structs in C use malloc to assign the required size block of memory to a pointer to that struct. However, variables with basic types (int, char etc.) are allocated to the stack and it is assumed that enough memory will be available.
I understand the idea behind this is that memory may not be available for larger structs so we use malloc to ensure we do indeed have enough memory but in the case of our struct being small is this really necessary? For example if a struct only consists of three ints, surely I am always fine to assume there is enough memory?
So really my question boils down to what are the best practises in C regarding when it is necessary to malloc variables and what is the justification?
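The only time you don't have to allocate memory yourself is when the compiler allocates it for you (called static allocation in this answer; for a local variable the C standard's term is automatic storage), which is what happens when you have a statement like: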
int number = 5;
You can always write it as:
int *pNumber = malloc(sizeof(int));
but you have to make sure to free it or you will be leaking memory.
You can do the same thing with a struct (instead of dynamically allocating memory for it, statically allocate):
struct some_struct_t myStruct;
and access members by:
myStruct.member1 = 0;
etc...
The big difference between dynamic allocation and static is whether that data is available outside of your current scope. With static allocation, it's not. With dynamic it is, but you have to make sure to free it.
Where you run into trouble is when you have to return a structure (or a pointer to it) from a function. You either have to dynamically allocate inside the function which is returning it or you have to pass in a pointer to an externally (dynamically or statically) allocated structure which the function can then work with.
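The two options for handing a structure back to a caller can be sketched as follows. The second member and the function names some_struct_new and some_struct_init are hypothetical additions for illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct some_struct_t { int member1; int member2; };

/* Option 1: allocate inside the function; the caller must free() the result. */
struct some_struct_t *some_struct_new(int m1, int m2) {
    struct some_struct_t *s = malloc(sizeof *s);
    if (s == NULL) return NULL;
    s->member1 = m1;
    s->member2 = m2;
    return s;
}

/* Option 2: the caller owns the storage and passes a pointer to it. */
void some_struct_init(struct some_struct_t *s, int m1, int m2) {
    assert(s != NULL);
    s->member1 = m1;
    s->member2 = m2;
}
```

With the second option the caller decides where the structure lives, so a small struct can stay a plain local variable and no free is needed.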
Good code gets re-used. Good code has few size limitations. Write good code.
Use malloc() whenever there is anything more than trivial buffer sizes.
Buffer size to write an int: The needed buffer size is at most sizeof(int)*CHAR_BIT/3 + 3. Use a fixed buffer.
Buffer size to write a double as in sprintf(buf, "%f",...: The needed buffer size could be thousands of bytes: use malloc(). Or use sprintf(buf, "%e",... and use a fixed buffer.
Forming a file path name could involve thousands of char. Use malloc().
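The two cases above can be sketched like this: a fixed buffer sized with the bound given for int, and a heap buffer for "%f" whose length is first obtained from snprintf with a null destination (standard since C99). The macro and function names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Upper bound from the text: enough for any int in decimal,
   including the sign and the terminating NUL. */
#define INT_TEXT_MAX (sizeof(int) * CHAR_BIT / 3 + 3)

/* Fixed buffer: out must have room for INT_TEXT_MAX chars. */
int int_to_text(int value, char *out) {
    return snprintf(out, INT_TEXT_MAX, "%d", value);
}

/* Dynamic buffer: ask snprintf for the length first, then allocate.
   The caller must free() the result. */
char *double_to_text(double value) {
    int len = snprintf(NULL, 0, "%f", value);
    if (len < 0) return NULL;
    char *buf = malloc((size_t)len + 1);
    if (buf == NULL) return NULL;
    snprintf(buf, (size_t)len + 1, "%f", value);
    return buf;
}
```

The int case never needs the heap because its worst case is known at compile time; the double case is sized at run time because the length of the "%f" output depends on the magnitude of the value.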
I'm trying to create a memory allocation system, and part of this involves storing integers at pointer locations to create a sort of header. I store a couple of integers, and then two pointers (with locations to the next and prev spots in memory).
Right now I'm trying to figure out if I can store the pointer at a location that I could later use as the original pointer.
int * header;
int * prev;
int * next;
...
*(header+3) = prev;
*(header+4) = next;
Then later...
headerfunction(*(header+4));
would perform an operation using the pointer to the 'next' location in memory.
(code for illustration only)
Any help or suggestions greatly appreciated!
Don't do direct pointer manipulation. Structs were made to eliminate the need for you to do that directly.
Instead, do something a bit more like this:
#include <stddef.h>
#include <stdint.h>

typedef struct
{
    size_t cbSize;
} MyAwesomeHeapHeader;

void* MyAwesomeMalloc(size_t cbSize)
{
    MyAwesomeHeapHeader* header;
    void* internalAllocatorPtr;
    size_t cbAlloc;
    // TODO: Maybe I want a heap footer as well?
    // TODO: I should really check the following for an integer overflow:
    cbAlloc = sizeof(MyAwesomeHeapHeader) + cbSize;
    internalAllocatorPtr = MyAwesomeRawAllocator(cbAlloc);
    // TODO: Check for null
    header = (MyAwesomeHeapHeader*)internalAllocatorPtr;
    header->cbSize = cbSize; // record the requested size in the header
    // TODO: other fields here.
    return (uint8_t*)(internalAllocatorPtr) + sizeof(MyAwesomeHeapHeader);
}
What you are doing is not safe: header has not been pointed at memory you own, so *(header+3) writes to a location 12 bytes past wherever header happens to point (assuming 4-byte ints), and if that memory is held by another variable it will cause problems.
Instead, first allocate one big region of memory. Its start address is the base of your pool, and you can use structures placed in the first bytes of each block to manage the rest of the region.
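A minimal sketch of that suggestion: one big region obtained once, carved into blocks that each begin with a header struct holding the size and the prev/next links, so no raw offset arithmetic such as *(header+3) is needed. All names (BlockHeader, pool_alloc, pool_header, POOL_BYTES) are hypothetical, blocks are never reused in this sketch, and it relies on C11 for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_BYTES 4096

typedef struct BlockHeader {
    size_t size;                    /* payload size in bytes */
    struct BlockHeader *prev;       /* previous block in the pool, or NULL */
    struct BlockHeader *next;       /* next block in the pool, or NULL */
} BlockHeader;

static unsigned char *pool;         /* the big region, obtained once */
static size_t pool_used;
static BlockHeader *pool_tail;

/* Round n up to a multiple of the strictest fundamental alignment. */
static size_t round_up(size_t n) {
    size_t a = _Alignof(max_align_t);
    return (n + a - 1) / a * a;
}

void *pool_alloc(size_t size) {
    if (size > POOL_BYTES) return NULL;
    if (pool == NULL && (pool = malloc(POOL_BYTES)) == NULL) return NULL;

    size_t header_bytes = round_up(sizeof(BlockHeader));
    size_t payload_bytes = round_up(size);
    if (header_bytes + payload_bytes > POOL_BYTES - pool_used) return NULL;

    BlockHeader *h = (BlockHeader *)(pool + pool_used);
    h->size = size;
    h->prev = pool_tail;            /* link the new block after the last one */
    h->next = NULL;
    if (pool_tail != NULL) pool_tail->next = h;
    pool_tail = h;
    pool_used += header_bytes + payload_bytes;
    return (unsigned char *)h + header_bytes;
}

/* Recover the header from a pointer previously returned by pool_alloc. */
BlockHeader *pool_header(void *ptr) {
    assert(ptr != NULL);
    return (BlockHeader *)((unsigned char *)ptr - round_up(sizeof(BlockHeader)));
}
```

Because the links are ordinary struct members, the "next" pointer is read as pool_header(p)->next and the compiler takes care of the offsets and types.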
Akp is correct, just looking at what you are trying to accomplish in your code segment, if you are trying to store integer pointers in header, header should be defined as such:
int **header;
and then memory should be allocated for it.
With regards to the actual memory allocation, if on a Unix machine, you should look into the brk() syscall.
You are building a memory allocation system, and thus we assume you have a chunk of memory somewhere that you can use freely to manage allocations and frees.
As per your question, the header pointer is allocated in the heap memory (by the compiler and libraries) - and you may wonder if it is safe to use that memory since you are allocating memory. It depends on your system, and if there is another (system) memory allocation management.
But what you could do is
int main(void) {
    void *header;
    void *prev;
    void *next;
    manage_memory_allocations(&header, &prev, &next); // never returns
}
In this case, the pointers are created on the stack - so the allocation depends on the memory where the processor stack points to.
Note the "never returns" as the memory is "freed" as soon as main ends.
Closed 12 years ago.
Possible Duplicate:
How do free and malloc work in C?
How does free know how many bytes of memory to be free'd when called in a program?
This is implementation specific, but when malloc is called, the size of the allocated memory is kept somewhere (usually offset from the pointer itself). When free is called, it will use that stored size.
This is exactly why you should only ever call free on a pointer that was returned by malloc.
It's done automatically. The corresponding "malloc" has saved the size in a secret place (typically stored at a negative offset from the pointer).
This, of course, means that you can only free memory that corresponds to a block previously allocated by "malloc".
Asking how it knows "how many bytes to free" is a mistake. It's not like each byte individually has a free/not-free status bit attached to it (well, it could, but this would be an awful implementation). In many implementations the number of bytes in an allocation may be completely irrelevant; it's the data structures used to manage it that are relevant.
It's an implementation detail that can and will vary between different platforms. Here's one example, though, of how it could be implemented.
Every free call must be paired with a malloc / realloc call which knows the size request. The implementation of malloc could choose to store this size at an offset of the returned memory. Say by allocating a larger buffer than requested, stuffing the size in the front and then returning an offset into the allocated memory. The free function could then simply use the offset of the provided pointer to discover the size to free.
For example
void* malloc(size_t size) {
    size_t actualSize = size + sizeof(size_t);
    void* buffer = _internal_allocate(actualSize);
    *((size_t*)buffer) = size;
    return ((size_t*)buffer) + 1;
}
void free(void* buffer) {
    size_t* other = buffer;
    other--;
    size_t originalSize = *other;
    // Rest of free
    ...
}
The answer is implementation-specific.
malloc might keep a dictionary mapping addresses to data records
malloc might allocate a slightly larger block than requested and store metadata before or after the block it actually returns.
In some special cases, not intended for general use, free() is completely a no-op and it doesn't actually keep track.
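The last case can be sketched as a simple arena: blocks are handed out by advancing an offset, freeing an individual block does nothing, and the whole region is released in one step. The Arena type and the arena_* function names are hypothetical, and the sketch relies on C11 for max_align_t.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Arena {
    unsigned char *base;            /* start of the region */
    size_t capacity;                /* total bytes in the region */
    size_t used;                    /* bytes handed out so far */
} Arena;

/* Returns 1 on success, 0 if the region could not be obtained. */
int arena_init(Arena *a, size_t capacity) {
    assert(a != NULL);
    a->base = malloc(capacity);
    a->capacity = a->base ? capacity : 0;
    a->used = 0;
    return a->base != NULL;
}

void *arena_alloc(Arena *a, size_t size) {
    size_t align = _Alignof(max_align_t);
    size_t room = a->capacity - a->used;
    if (size > room) return NULL;
    size_t rounded = (size + align - 1) / align * align;
    if (rounded > room) return NULL;
    void *p = a->base + a->used;
    a->used += rounded;
    return p;
}

/* Individual frees are a no-op: no per-block size is tracked at all. */
void arena_free(Arena *a, void *ptr) {
    (void)a;
    (void)ptr;
}

/* Everything is released in one step. */
void arena_destroy(Arena *a) {
    free(a->base);
    a->base = NULL;
    a->capacity = 0;
    a->used = 0;
}
```

Here the question of how many bytes to free never arises for a single block, which is why this design only suits programs that release all of their allocations together.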