Freeing the stack - c

So I know that calling free() on a variable allocated on the stack would cause an invalid pointer error.
In a malloced pointer, malloc() allocates 8 bytes before the actual pointer to leave information about its size. So I was wondering if I had made a long before a struct and then called free on that struct if it would be possible to free the struct (of course this is going off the assumption that the allocation of those 8 bytes is the only thing extra that malloc does).
I guess my final question would be if there is any real difference between stack variable allocation and heap allocation (in terms of the backend calls to the kernel).

Some C implementations might use data before the allocated space to help them manage the space. Some do not. Some do that for allocations of certain sizes and not others. If they do it, it might be eight bytes, or it might be some other amount. You should not rely on any behavior in this regard.
When you declare a long object and a struct of some sort in a block, the compiler might or might not put them next to each other on the stack. It might put the long before the struct or vice-versa, or, because it optimizes your program, it might keep the long in a register and never put it on the stack at all, and it might do other things. In some C implementations, a long is eight bytes. In some, it is not. There is no good way for you to ensure two separate objects are put in adjacent memory. (You can make them not separate by putting them in a larger struct.)
Even if you are able to cobble together a long followed by a struct, how would you know what value to put into the long? Did the C implementation put the length of the allocation in there? Or is it a pointer to another block? Or to some other part of the database the C implementation uses to track allocated memory? If malloc and free are using memory just before the allocated space, that memory is not empty. It needs to have some value in it, and you do not know what that is.
If you get lucky, passing the address of the struct to free might not crash your program right away. But then you have freed a part of the stack, in some sense. When you call malloc again, the pointer you get back might be for that memory, and then your program presumably will write to that space. Then what happens when your program calls other routines, causing the stack to grow into that space? You will have overlapping uses of the same memory. Some of your data will be stomping over other data, and your program will not work.
Yes, there are differences between memory allocated on the stack and memory allocated from the heap. This is outside of the model that C presents to your program. However, in systems where processes have stack and heap, they are generally in different places in the memory of your process. In particular, the stack memory must remain available for use as the stack grows and shrinks. You cannot mix it with the heap without breaking things.
It is good to ask questions about what happens when you try various things. However, modern implementations of malloc and free are quite complicated, and you pretty much have to accept them as a service that you cannot peer into easily. Instead, to help you learn, you might think about this:
How would you write your own malloc and free?
Write some code that allocates a large amount of memory using malloc, say a megabyte, and write two routines called MyMalloc and MyFree that work like malloc and free, except they use the memory you allocated. When MyMalloc is called, it will carve out a chunk of the memory. When MyFree is called, it will return the chunk to make it available again.
Write some experimental code that somewhat randomly calls MyMalloc with various sizes and MyFree, in somewhat random orders.
How can you make all of this work? How do you divide the megabyte into chunks? How do you remember which chunks are allocated and which are free? When somebody calls MyFree, how do you know how much they are giving back? When neighboring chunks are returned with MyFree, how do you put them back together into bigger pieces again?
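One possible starting point for the exercise above is sketched here. It is only a sketch under stated assumptions, not how any real malloc works: every chunk carries a small in-line header, MyMalloc does a first-fit search and splits oversized chunks, and MyFree merges neighboring free chunks by walking the whole pool. The Header layout, POOL_SIZE, and the helper names are made up for illustration, and max_align_t assumes a C11 compiler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define POOL_SIZE (1024 * 1024)   /* the "megabyte" from the exercise */

/* Hypothetical header stored in front of every chunk, used or free. */
typedef union Header {
    struct {
        size_t size;      /* payload bytes that follow this header */
        int    is_free;   /* 1 if the chunk is available */
    } info;
    max_align_t align;    /* pads the header so payloads stay aligned */
} Header;

static unsigned char *pool;       /* the big block obtained from malloc */

static Header *next_header(Header *h) {
    return (Header *)((unsigned char *)(h + 1) + h->info.size);
}

static int pool_init(void) {
    pool = malloc(POOL_SIZE);
    if (pool == NULL) return 0;
    Header *first = (Header *)pool;          /* one huge free chunk */
    first->info.size = POOL_SIZE - sizeof(Header);
    first->info.is_free = 1;
    return 1;
}

void *MyMalloc(size_t n) {
    size_t unit = sizeof(Header);
    if (n == 0 || n > POOL_SIZE) return NULL;
    if (pool == NULL && !pool_init()) return NULL;
    n = (n + unit - 1) / unit * unit;        /* round up to keep alignment */
    unsigned char *end = pool + POOL_SIZE;
    for (Header *h = (Header *)pool; (unsigned char *)h < end; h = next_header(h)) {
        if (!h->info.is_free || h->info.size < n) continue;
        if (h->info.size >= n + 2 * unit) {  /* split off the unused tail */
            Header *rest = (Header *)((unsigned char *)(h + 1) + n);
            rest->info.size = h->info.size - n - unit;
            rest->info.is_free = 1;
            h->info.size = n;
        }
        h->info.is_free = 0;
        return h + 1;                        /* payload starts after header */
    }
    return NULL;                             /* no chunk is large enough */
}

void MyFree(void *p) {
    if (p == NULL) return;
    assert(pool != NULL);
    Header *h = (Header *)p - 1;             /* the header sits just before p */
    h->info.is_free = 1;
    /* Merge every free chunk with the free chunks directly after it. */
    unsigned char *end = pool + POOL_SIZE;
    for (Header *c = (Header *)pool; (unsigned char *)c < end; c = next_header(c)) {
        while (c->info.is_free) {
            Header *nx = next_header(c);
            if ((unsigned char *)nx >= end || !nx->info.is_free) break;
            c->info.size += sizeof(Header) + nx->info.size;
        }
    }
}
```

Walking the whole pool on every MyFree is slow but easy to reason about. Answering the "how do you put them back together" question more efficiently (free lists, boundary tags) is where the exercise gets interesting.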

I think your real question is how does the stack work.
The stack is one big memory block allocated when your program starts. There is a pointer to the top of the stack. The name is suggestive: think of a stack of magazines.
When a function is called, the parameters are placed on top of the stack. The function itself then places its local variables on top of that. When the function exits, the stack pointer is simply moved back to where it was before the function was called. This frees all the local variables and input arguments used by the function.
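The "move the pointer back" idea can be imitated in ordinary C. The following is a toy model only, assuming a fixed 4096-byte array and made-up names (frame_enter, frame_alloc, frame_leave); it is not how the compiler manages the real stack, which on most platforms grows toward lower addresses and is handled with CPU instructions rather than function calls.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_BYTES 4096

/* Toy "stack": one block reserved up front, plus an offset to its top. */
static _Alignas(max_align_t) unsigned char stack_mem[STACK_BYTES];
static size_t stack_top;   /* offset of the first unused byte */

/* Called on function entry: remember where the caller's data ends. */
size_t frame_enter(void) {
    return stack_top;
}

/* Reserve n bytes for a "local variable" by bumping the top offset. */
void *frame_alloc(size_t n) {
    size_t a = _Alignof(max_align_t);
    if (n > STACK_BYTES) return NULL;
    n = (n + a - 1) / a * a;                       /* keep things aligned */
    if (n > STACK_BYTES - stack_top) return NULL;  /* "stack overflow" */
    void *p = stack_mem + stack_top;
    stack_top += n;
    return p;
}

/* Called on function exit: every local is released by one assignment. */
void frame_leave(size_t mark) {
    assert(mark <= stack_top);
    stack_top = mark;
}
```

Note that nothing here records the size of an individual variable, and there is no per-variable free. That is the key difference from a heap manager, which has to track each block separately.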
The heap manager has nothing to do with this block of memory. Tricking free to put some of the stack in the heap manager’s memory is going to wreak havoc on your program. The memory would likely be used again as you call other functions, and used simultaneously if you malloc memory, leading to data corruption at best, and stack corruption (read crash) at worst.

When you speak of memory being allocated on the stack, you have to understand that in most implementations, the stack is allocated in a block -- variables are not allocated individually or separately.
                   +----------------------------------------------------+
                   | Stack frame data section; local variables and      |
                   | function arguments in order determined by the      |
Stack frame for    | calling convention of the target platform          |
function call;     | (size is implementation dependent)                 |
block allocated    +----------------------------------------------------+
                   | Instruction pointer (return address)               |
                   +----------------------------------------------------+
                   | Space for return value (if not in a CPU register)  |
                   +----------------------------------------------------+
                   |                                                    |
                   | (stack frame of previously called function)        |
                   |                                                    |
                   +----------------------------------------------------+
Each function call is allocated its own stack frame, with the required size to hold the return value (if necessary), the instruction pointer of the return address, and all of the local variables and function arguments. So while memory for the stack frame is allocated, it's not allocated with respect to any individual variable -- only in regard to the sum of the individual sizes.

Related

Why did C never implement "stack extension"?

Why did C never implement "stack extension" to allow (dynamically-sized) stack variables of a callee function to be referenced from the caller?
This could work by extending the caller's stack frame to include the "dynamically-returned" variables from the callee's stack frame. (You could, but shouldn't, implement this with alloca from the caller - it may not survive optimisation.)
e.g. If I wanted to return the dynamically-size string "e", the implementation could be:
--+---+-----+
| a | b |
--+---+-----+
callee(d);
--+---+-----+---------+---+
| a | b | junk | d |
--+---+-----+---------+---+
char e[calculated_size];
--+---+-----+---------+---+---------+
| a | b | junk | d | e |
--+---+-----+---------+---+---------+
dynamic_return e;
--+---+-----+-------------+---------+
| a | b | waste | e |
--+---+-----+-------------+---------+
("Junk" contains the return address and other system-specific metadata which is invisible to the program.)
This would waste a little stack space, when used.
The up-side is a simplification of string processing, and any other functions which have to currently malloc ram, return pointers and hope that the caller remembers to free at the right time.
Obviously, there is no point in adding such a feature to C at this stage of its life, I'm just interested in why this wasn't a good idea.
A new object may be returned through many layers of software. So the wasted space may be that from dozens or even hundreds of function calls.
Consider also a routine that performs some iterative task. In each iteration, it gets some newly allocated object from a subroutine, which it inserts into a linked list or other data structure. Such iterative tasks may repeat for hundreds, thousands, or millions of iterations. The stack will overflow with wasted space.
Some objections to your idea. Some have been mentioned already in comments. Some come from the top of my head.
C doesn't have stacks or stack frames. C simply defines scopes and their life times and it is left to implementations as to how to implement the standard. Stacks and stack frames are really just the most popular way to implement some C semantics.
C doesn't have strings. C doesn't really have arrays as such. Well, it does have arrays, but as soon as you mention an array in an expression (e.g. a return expression), the array decays to a pointer to its first element. Returning a "string" or an array on the stack would involve significant impact on well established areas of the language.
C does have structs. However, you can already return a struct. I can't tell you how it's done, because it is an implementation detail.
A problem with your specific implementation is that the caller has to know how big the "waste" is. Don't forget that the waste will include the stack frame of the callee but also the waste from any functions the callee calls either directly or indirectly. The returning convention will have to include information on the size of the waste and a pointer to the return value.
Stacks, as a rule, are quite limited compared to heap memory, particularly in applications that use threading. At some point the caller will need to move the returned array down into its own stack frame. If the array was merely a pointer to storage in the heap, this would be much more efficient, but then you've got the existing model.
You have to realize that the implementation of the stack is strongly dictated by the CPU and the OS kernel. The language does not have much say in this. Limitations are, for instance:
The ret instruction of the X86 architecture expects the return address at the memory location stored in the stack pointer. Thus, there cannot be anything else on top (semantical top - usually this is the lowest address, as stacks tend to grow down). You could work around this, of course, but that would likely incur additional overheads which C programmers are not going to be willing to pay.
The stack pointer defines what part of the allocated stack memory is actually used. When control flow is changed asynchronously (hardware interrupt), the current CPU's registers are generally immediately stored to memory addresses below the stack pointer by the interrupt handler. This can happen at any time, even throughout most of the kernel code. Any data stored below the place where the stack pointer points would be clobbered by this. (Well, technically, that's not fully correct, there is generally a "red zone" below the stack pointer to which the interrupt handlers may not write any data. But here we are getting very firmly into architectural design peculiarities.)
Destroying a stack frame is generally a single addition of a constant to the stack pointer. This is the fastest kind of instruction you can get, it will generally not require a single cycle to execute (it will execute in parallel to some memory access). If the stack frame has a dynamic size, the stack frame must be destroyed by loading the stack pointer from memory, and for that a base pointer must have been retained. That's a memory access with a significant latency, and another register that must be saved to be used. Again, this is overhead that's generally unnecessary.
Your proposal would definitely be implementable, but it would require some workarounds. And these workarounds would generally cost performance. Small bits of performance, but definitely measurable amounts. That's not what compiler/kernel developers want, and for good reason.

How does the free() function know how many bytes to deallocate, and how can we access that information within our program? [duplicate]

In C programming, you can pass any kind of pointer you like as an argument to free, how does it know the size of the allocated memory to free? Whenever I pass a pointer to some function, I have to also pass the size (ie an array of 10 elements needs to receive 10 as a parameter to know the size of the array), but I do not have to pass the size to the free function. Why not, and can I use this same technique in my own functions to save me from needing to cart around the extra variable of the array's length?
When you call malloc(), you specify the amount of memory to allocate. The amount of memory actually used is slightly more than this, and includes extra information that records (at least) how big the block is. You can't (reliably) access that other information - and nor should you :-).
When you call free(), it simply looks at the extra information to find out how big the block is.
Most implementations of C memory allocation functions will store accounting information for each block, either in-line or separately.
One typical way (in-line) is to actually allocate both a header and the memory you asked for, padded out to some minimum size. So for example, if you asked for 20 bytes, the system may allocate a 48-byte block:
16-byte header containing size, special marker, checksum, pointers to next/previous block and so on.
32 bytes data area (your 20 bytes padded out to a multiple of 16).
The address then given to you is the address of the data area. Then, when you free the block, free will simply take the address you give it and, assuming you haven't stuffed up that address or the memory around it, check the accounting information immediately before it. Graphically, that would be along the lines of:
____ The allocated block ____
/ \
+--------+--------------------+
| Header | Your data area ... |
+--------+--------------------+
^
|
+-- The address you are given
Keep in mind the size of the header and the padding are totally implementation defined (actually, the entire thing is implementation-defined (a) but the in-line accounting option is a common one).
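The in-line layout in the diagram can be imitated on top of the real malloc. This is a minimal sketch, assuming a made-up SizeHeader type and made-up function names (sized_malloc, sized_size, sized_free); it illustrates the idea only and says nothing about what your C library actually stores, or where. max_align_t assumes C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical header; the union pads it so the data area stays aligned. */
typedef union {
    size_t      size;    /* number of bytes the caller asked for */
    max_align_t align;
} SizeHeader;

/* Allocate header + data, record the size, hand back the data area. */
void *sized_malloc(size_t n) {
    if (n > SIZE_MAX - sizeof(SizeHeader)) return NULL;   /* overflow guard */
    SizeHeader *h = malloc(sizeof(SizeHeader) + n);
    if (h == NULL) return NULL;
    h->size = n;
    return h + 1;        /* "the address you are given" */
}

/* Step back over the header to recover the recorded size. */
size_t sized_size(const void *p) {
    assert(p != NULL);
    return ((const SizeHeader *)p - 1)->size;
}

/* Free the whole block, starting at the header, not at the data area. */
void sized_free(void *p) {
    if (p != NULL) free((SizeHeader *)p - 1);
}
```

Passing sized_size or sized_free a pointer that did not come from sized_malloc reads whatever bytes happen to sit before it, which is exactly the problem with calling free on a stack variable.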
The checksums and special markers that exist in the accounting information are often the cause of errors like "Memory arena corrupted" or "Double free" if you overwrite them or free them twice.
The padding (to make allocation more efficient) is why you can sometimes write a little bit beyond the end of your requested space without causing problems (still, don't do that, it's undefined behaviour and, just because it works sometimes, doesn't mean it's okay to do it).
(a) I've written implementations of malloc in embedded systems where you got 128 bytes no matter what you asked for (that was the size of the largest structure in the system), assuming you asked for 128 bytes or less (requests for more would be met with a NULL return value). A very simple bit-mask (i.e., not in-line) was used to decide whether a 128-byte chunk was allocated or not.
Others I've developed had different pools for 16-byte chunks, 64-byte chunks, 256-byte chunks and 1K chunks, again using a bit-mask to decide what blocks were used or available.
Both these options managed to reduce the overhead of the accounting information and to increase the speed of malloc and free (no need to coalesce adjacent blocks when freeing), particularly important in the environment we were working in.
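The fixed-size scheme from footnote (a) might look roughly like the sketch below. It is a guess at the general shape, not the author's embedded code: the pool of 32 chunks, the 32-bit mask, and the names pool_alloc and pool_free are all assumptions chosen for illustration (C11 for _Alignas).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CHUNK_SIZE  128   /* every request gets this much, as in the text */
#define CHUNK_COUNT 32    /* assumed pool size: one bit per chunk */

static _Alignas(max_align_t) unsigned char chunks[CHUNK_COUNT][CHUNK_SIZE];
static uint32_t used_mask;            /* bit i set => chunk i is allocated */

/* Any request up to CHUNK_SIZE gets a whole chunk; larger ones fail. */
void *pool_alloc(size_t n) {
    if (n > CHUNK_SIZE) return NULL;
    for (int i = 0; i < CHUNK_COUNT; i++) {
        uint32_t bit = UINT32_C(1) << i;
        if (!(used_mask & bit)) {
            used_mask |= bit;
            return chunks[i];
        }
    }
    return NULL;                      /* pool exhausted */
}

/* The chunk index follows from the address, so no header is needed. */
void pool_free(void *p) {
    if (p == NULL) return;
    ptrdiff_t off = (unsigned char *)p - &chunks[0][0];
    assert(off >= 0 && off < (ptrdiff_t)sizeof chunks && off % CHUNK_SIZE == 0);
    used_mask &= ~(UINT32_C(1) << (off / CHUNK_SIZE));
}
```

Because every chunk has the same size, freeing is a single bit clear and there is nothing to coalesce, which matches the speed benefit described above.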
From the comp.lang.c FAQ list: How does free know how many bytes to free?
The malloc/free implementation remembers the size of each block as it is allocated, so it is not necessary to remind it of the size when freeing. (Typically, the size is stored adjacent to the allocated block, which is why things usually break badly if the bounds of the allocated block are even slightly overstepped)
This answer is relocated from How does free() know how much memory to deallocate? where I was abruptly prevented from answering by an apparent duplicate question. This answer then should be relevant to this duplicate:
For the case of malloc, the heap allocator stores a mapping of the original returned pointer, to relevant details needed for freeing the memory later. This typically involves storing the size of the memory region in whatever form relevant to the allocator in use, for example raw size, or a node in a binary tree used to track allocations, or a count of memory "units" in use.
free will not fail if you "rename" the pointer, or duplicate it in any way. It is not however reference counted, and only the first free will be correct. Additional frees are "double free" errors.
Attempting to free any pointer with a value different to those returned by previous mallocs, and as yet unfreed is an error. It is not possible to partially free memory regions returned from malloc.
On a related note GLib library has memory allocation functions which do not save implicit size - and then you just pass the size parameter to free. This can eliminate part of the overhead.
The heap manager stored the amount of memory belonging to the allocated block somewhere when you called malloc.
I never implemented one myself, but I guess the memory right in front of the allocated block might contain the meta information.
The original technique was to allocate a slightly larger block and store the size at the beginning, then give the application the rest of the block. The extra space holds a size and possibly links to thread the free blocks together for reuse.
There are certain issues with those tricks, however, such as poor cache and memory management behavior. Using memory right in the block tends to page things in unnecessarily and it also creates dirty pages which complicate sharing and copy-on-write.
So a more advanced technique is to keep a separate directory. Exotic approaches have also been developed where areas of memory use the same power-of-two sizes.
In general, the answer is: a separate data structure is allocated to keep state.
malloc() and free() are system/compiler dependent so it's hard to give a specific answer.
More information on this other question.
To answer the second half of your question: yes, you can, and a fairly common pattern in C is the following:
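#include <stdlib.h> /* malloc, size_t */
typedef struct {
    size_t numElements;
    int elements[1]; /* but enough space malloced for numElements at runtime */
} IntArray_t;
#define SIZE 10
/* allocates slightly more than needed, since elements[1] already holds one int */
IntArray_t* myArray = malloc(sizeof(IntArray_t) + SIZE * sizeof(int));
myArray->numElements = SIZE;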
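Since C99 the same pattern can be written with a flexible array member instead of the elements[1] trick. The sketch below is one way to do it; the type name IntArray and the helpers int_array_new and int_array_sum are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

typedef struct {
    size_t numElements;
    int    elements[];   /* C99 flexible array member, sized at runtime */
} IntArray;

/* Allocate the struct and its n elements in one block; elements start at 0. */
IntArray *int_array_new(size_t n) {
    if (n > (SIZE_MAX - sizeof(IntArray)) / sizeof(int)) return NULL;
    IntArray *a = calloc(1, sizeof(IntArray) + n * sizeof(int));
    if (a != NULL) a->numElements = n;
    return a;
}

/* The callee needs no separate length argument: it travels with the array. */
long int_array_sum(const IntArray *a) {
    assert(a != NULL);
    long total = 0;
    for (size_t i = 0; i < a->numElements; i++) total += a->elements[i];
    return total;
}
```

A single free(a) releases both the count and the elements, because they live in one allocation.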
To answer the second question: yes, you could (kind of) use the same technique as malloc()
by simply storing the size of the array in its first cell.
That lets you pass the array around without sending an additional size argument.
When we call malloc, it simply consumes a few more bytes than were requested. These extra bytes hold information such as a checksum, the size, and other bookkeeping data.
When we call free, it goes directly to that additional information, where it finds the address and how large a block is to be freed.

why size is not provided in free statement [duplicate]

Hello All,
How does the OS know how much memory to free when we call free(pointer)?
I mean, we are not providing any size, only a pointer, to free.
How is the size handled internally?
Thanks,
Neel
The OS won't have a clue, as free is not a system call. However, your C library's memory allocation system will have recorded the size in some way when the memory was originally allocated by malloc(), so it knows how much to free.
The size is stored internally in the allocator, and the pointer you pass to free is used to reach that data. A very basic approach is to store the size 4 bytes before the pointer, so subtracting 4 from the pointer gives you a pointer to its size.
Notice that the OS doesn't handle this directly, it's implemented by your C/C++ runtime allocator.
When you call malloc, the C library will automatically carve a space out for you on the heap. Because things created on the heap are created dynamically, what is on the heap at any given point in time is not known as it is for the stack. So the library will keep track of all the memory that you have allocated on the heap.
At some point your heap might look like this:
p---+
V
---------------------------------------
... | used (4) | used (10) | used (8) | ...
---------------------------------------
The library will keep track of how much memory is allocated for each block. In this case, the pointer p points to the start of the middle block.
If we do the following call:
free(p);
then the library will free this space for you on the heap, like so...
p---+
V
----------------------------------------
... | used (4) | unused(10) | used (8) | ...
----------------------------------------
Now, the next time that you are looking for some space, say with a call like:
void* ptr = malloc(10);
The newly unused space may be allocated to your program again, which will allow us to reduce the overall amount of memory our program uses.
ptr---+
V
----------------------------------------
... | used (4) | used(10) | used (8) | ...
----------------------------------------
How your library manages the sizes internally varies. A simple way to implement this would be to add a few extra bytes (we'll say 1 for the example) at the beginning of each allocated block to hold the size of that block. So our previous block of heap memory would look like this:
bytes: 1 4 1 10 1 8
--------------------------------
... |4| used |10| used |8| used | ...
--------------------------------
^
+---ptr
Now, if we say that block sizes will be rounded up to be divisible by 2, then we have an extra bit at the end of the size (because we can always assume it to be 0), which we can conveniently use to check whether the corresponding block is used or unused.
When we pass a pointer in free:
free(ptr);
The library would move the pointer given back one byte, and change the used/unused bit to unused. In this specific case, we don't even have to actually know the size of the block in order to free it. It only becomes an issue when we try to reallocate the same amount of data. Then, the malloc call would go down the line, checking to see if the next block was free. If it is free, then if it is the right size that block will be returned back to the user, otherwise a new block will be cut at the end of the heap, and more space allocated from the OS if necessary.
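The used/unused bit trick can be shown in isolation. This is a small sketch with made-up names (tag_make, tag_size, tag_is_used, tag_mark_free); it packs an even size and a flag into one size_t, which is the idea described above rather than the layout of any particular allocator.

```c
#include <assert.h>
#include <stddef.h>

#define USED_BIT ((size_t)1)   /* lowest bit is spare because sizes are even */

/* Round size up to an even number and fold the used flag into bit 0. */
size_t tag_make(size_t size, int used) {
    assert(size < (size_t)-1);              /* rounding must not wrap */
    size = (size + 1) & ~USED_BIT;
    return size | (used ? USED_BIT : 0);
}

/* Mask the flag off to get the block size back. */
size_t tag_size(size_t tag) {
    return tag & ~USED_BIT;
}

int tag_is_used(size_t tag) {
    return (tag & USED_BIT) != 0;
}

/* What free does to the header: clear the flag, leave the size alone. */
size_t tag_mark_free(size_t tag) {
    return tag & ~USED_BIT;
}
```

As the text says, marking a block free never needs the size itself. The size matters only later, when the allocator looks for a free block that is big enough.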

c malloc questions (mem corruption)

When using malloc, if it produces a core dump with the error:
malloc(): memory corruption: ....... ***
Does this mean that malloc tried to allocate memory that was not free to allocate? IF so what are the causes of this?
It completely depends on your malloc implementation, but usually what this means is that at some point prior to that malloc something wrote more data to a malloced buffer than its size.
A lot of malloc implementations store some of their data inline with their memory, in other words:
+--------------------------------+
|14 bytes -> Padding |
+--------------------------------+
|2 bytes -> Internal malloc info |
+--------------------------------+
|6 bytes -> Your data |
+--------------------------------+
|8 bytes -> Padding |
+--------------------------------+
|2 bytes -> Internal malloc info |
+--------------------------------+
So if some code of yours or a library wrote 16 bytes to that 6 byte buffer it would overwrite the padding and the 2 bytes of internal malloc info. The next time you call malloc it will try to walk through its data to find space, hit the overwritten space, and it will be nonsensical since you overwrote it, corrupting the heap.
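The overwrite can be reproduced safely inside an ordinary byte array instead of the real heap. The sketch below models only the second half of the diagram (2 bytes of info, 6 bytes of data, 8 bytes of padding, 2 bytes of info for the next block); the MAGIC value and all function names are invented for the demonstration.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

enum { INFO = 2, DATA = 6, PAD = 8 };
#define MAGIC 0xA55Au     /* stand-in for "internal malloc info" */

/* Simulated slice of heap: info | data | padding | next block's info. */
static unsigned char heap[INFO + DATA + PAD + INFO];

static void put_info(size_t off, uint16_t v) {
    memcpy(heap + off, &v, sizeof v);
}

static uint16_t get_info(size_t off) {
    uint16_t v;
    memcpy(&v, heap + off, sizeof v);
    return v;
}

/* Lay out both info fields as the allocator would have left them. */
void sim_reset(void) {
    memset(heap, 0, sizeof heap);
    put_info(0, MAGIC);
    put_info(INFO + DATA + PAD, MAGIC);
}

/* Like a careless memcpy into the 6-byte buffer: n is not checked
   against DATA, only against the end of the simulated array. */
void sim_write(const void *src, size_t n) {
    assert(n <= sizeof heap - INFO);
    memcpy(heap + INFO, src, n);
}

/* What the next malloc call would trip over. */
int sim_next_info_intact(void) {
    return get_info(INFO + DATA + PAD) == MAGIC;
}
```

Writing 6 bytes leaves the next info field intact. Writing 16 runs through the padding and replaces it, and nothing complains until something later inspects that field.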
Depending on the implementation such an error could also be caused by making a double free.
Most likely, this is not a problem in malloc itself. Rather, this is a problem with your application modifying parts of the heap that it shouldn't.
If you are running on Linux, try using Valgrind to see which code is trashing your heap.
The usual cause of this is that you wrote over data that malloc() did not give you permission to write over - either buffer overrun (writing beyond the end of the space you were given) or buffer underrun (writing before the start of the buffer).
It can sometimes be caused by freeing a pointer that was not allocated by malloc() et al, or by re-freeing (double freeing) a pointer that was allocated by malloc(). For example, freeing a static buffer is a bad idea; you will get corruption.
You should assume that the problem is in your code - it is extremely unlikely to be a problem in malloc() et al, and rather unlikely to be in any other library you are using.
There are several things that are usual causes of heap corruption:
overrunning the memory allocation (writing past the end of the allocated block)
double freeing a block
using a pointer after it's been freed
and of course something writing erroneously through a pointer that has nothing to do with a previous allocation (a 'ram hit' or rogue pointer) - this is the general case that includes all of the above.
These problems can be difficult to debug because the cause and effect are often separated by time and space (different area of code). So the bug doesn't get noticed until an eternity (in computer time) passes after the bug that caused the problem executes.
Using a debug heap can be very helpful in debugging these issues. Microsoft's compilers have a CrtDebug heap that's enabled in debug builds (but can have additional configuration items set). I'm not sure what GCC has out of the box, but there are tools I'm familiar with in passing such as Valgrind and Electric Fence that might help. Finally, there are a ton of home-grown heap debug libraries that might be helpful (Google around).
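A home-grown debug heap can be as simple as a wrapper that plants guard bytes after each allocation and checks them on free. The following is a minimal sketch of that idea only; DbgHeader, CANARY, and the dbg_ functions are invented names, and real debug heaps (and tools like Valgrind) detect far more than this. It catches overruns past the end of a block, not underruns or use after free. max_align_t assumes C11.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header remembering the requested size. */
typedef union {
    size_t      size;
    max_align_t align;   /* keeps the user area aligned */
} DbgHeader;

static const unsigned char CANARY[4] = { 0xDE, 0xAD, 0xBE, 0xEF };

/* Allocate header + user bytes + guard bytes, then plant the guard. */
void *dbg_malloc(size_t n) {
    if (n > SIZE_MAX - sizeof(DbgHeader) - sizeof CANARY) return NULL;
    DbgHeader *h = malloc(sizeof(DbgHeader) + n + sizeof CANARY);
    if (h == NULL) return NULL;
    h->size = n;
    memcpy((unsigned char *)(h + 1) + n, CANARY, sizeof CANARY);
    return h + 1;
}

/* Returns 1 if the guard bytes after the block are still untouched. */
int dbg_check(const void *p) {
    assert(p != NULL);
    const DbgHeader *h = (const DbgHeader *)p - 1;
    return memcmp((const unsigned char *)p + h->size, CANARY, sizeof CANARY) == 0;
}

/* Frees the block and reports whether it had been overrun (0 = overrun). */
int dbg_free(void *p) {
    if (p == NULL) return 1;
    int ok = dbg_check(p);
    free((DbgHeader *)p - 1);
    return ok;
}
```

Calling dbg_check at suspicious points narrows down when the overrun happened, which helps with the "cause and effect are separated" problem described above.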
Could you please provide your malloc() statement?
Also, I wanted to double check that the return value is not null?
Outside of not having the memory to allocate to begin with, the problems I have encountered when using malloc() or new that were similar in nature to the one you mentioned were actually the result of a corrupted heap. I usually found some "interesting" code elsewhere in the program doing something like memcpy() with a character buffer, causing a buffer overrun and a mangled address space.
-bn

Resources