This is from Beej's guide to C
"The drawback to using calloc() is that it takes time to clear memory, and in most cases, you don't need it clear since you'll just be writing over it anyway. But if you ever find yourself malloc()ing a block and then setting the memory to zero right after, you can use calloc() to do that in one call."
so what is a potential scenario when i will want to clear memory to zero.
When the function you are passing a buffer to states in its documentation that a buffer must be zero-filled. You may also always zero out the memory for safety; it doesn't actually take that much time unless the buffers are really huge. Memory allocation itself is the potentially expensive part of the operation.
One scenario is where you are allocating an array of integers, (say, as accumulators or counter variables) and you want each element in the array to start at 0.
In some case where you are allocating memory for some structure and some member of that structure are may going to evaluation in some expression or in conditional statement without initializing that structure in that case it would be harmful or will give you undefined behavior . So overcome form this better you
1> malloc that structure and memset it with 0 before using that structure
or
2> calloc that structure
Note: some advance memory management program with malloc also reset memory with 0
There are lots of times when you might want memory zeroed!
Some examples:
Allocating memory to contain a structure, where you want all the
members initialised to zero
Allocating memory for an array of chars which you are later going to write some number of chars into, and then treat as a NULL
terminated string
Allocating memory for an array of pointers which you want initialised to NULL
If all allocated memory is zero-filled, the program's behavior is much more reproducible (so the behavior is more likely the same if you re-run your program). This is why I don't use uninitialized malloc zones.
(for similar reasons, when debugging a C or C++ program on Linux, I usually do echo 0 > /proc/sys/kernel/randomize_va_space so that mmap behavior is more reproducible).
And if your program does not allocate huge blocks (i.e. dozens of megabytes), the time spent inside malloc is much bigger than the time to zero it.
Related
I'm trying to figure out how many bytes in a block are taken up by the boundary tags. I have been told that when trying to malloc an adjacent block of memory, a "jump" will appear in assembly code, and I can use that to determine the size of the boundary tag. I've tried this:
int* arr = malloc(8);
arr++;
arr = malloc(8);
But there isn't any jump in assembly code. Am I "trying to malloc an adjacent block of memory"?
EDIT: I think he means a jump will appear between address value. I use the beginning of the second block of memory subtract the payload size of the first block. But I'm still confused, how could I malloc an adjacent block of memory?
Unless you're writing an actual memory allocator, you can't actually allocate two consecutive chunks of memory. If you want to see some pretty gnarly code which does this, have a look at the Illumos malloc https://github.com/illumos/illumos-gate/blob/master/usr/src/lib/libc/port/gen/malloc.c.
If you want to see how Illumos (and Solaris) handle the redzone between allocated blocks, you should trawl through https://github.com/illumos/illumos-gate/tree/master/usr/src/lib/libumem/common.
The memory consumed by malloc(3) requires, for proper management of the actually used memory, of some structures that must be dynamically allocated also. For this reason, many allocators just do allocate the space required for the management data adjacent to the block space dedicated to the user. This makes that normally two consecutive junks of memory allocated by malloc(2) show some gap in their addresses.
There are other reasons to see gaps, one fundamental is that malloc normally gives you aligned memory addresses, so it is warranted that the data you store on that memory will be properly aligned.
And of course, there can be implementations (normally when heap allocation should be more robust in respect to buffer overruns) that the memory dedicated to storage of management data is completely unrelated and apart off the final given memory. In this case you could observe no gaps between memory allocations on some cases.
Anyway, your code has serious bugs, let's see:
int* arr = malloc(8);
You had better here to acquire just the memory you need, using the sizeof operator, as in int *arr = malloc(sizeof *arr); instead.
arr++;
this statement is useless, as you are going to overwrite the value of arr (the pointer) with new assignment statement after it from malloc(), so it is of no use to increment the pointer value. You also are somewhat losing the returned value of the previous malloc() (which is essential in order to return the allocated memory, but read below).
arr = malloc(8);
Until here, you had the chance to --arr decrementing the value of arr in order to be capable of free(3) that block. But this statement overwrites the value stored in arr so the previous pointer value is overwritten by the new pointer. Memory you acquired on the first malloc has no way to be accessed again. This is what is commonly known as a memory leak, and is normally a serious error (very difficult to catch) on long run programs (like servers or system daemons). The program allocates a bunch of memory in the inner part of a loop, that is not returned back with a call to free(3), so the program begins growing and growing until it exhausts all the available memory.
A final note, I don't understand what did you mean with malloc adjacent block of memory. Did you believe that incrementing the pointer would make malloc() to give you a special block of memory?
First, malloc has no idea of what are you going to do with the pointer it gives to you.
But also, it doesn't know anything about the variable contents of the pointer you are assigning to (you can even not store it in a variable, and pass it as another parameter to another functions) So there's no possibility for malloc to know that you have incremented the pointer value, or even the pointer location, from its body.
I cannot guess your interpretation. It would be nice to know what has made you to think that you can control how malloc(3) selects the block of memory to give to you. You have no control on the internals of malloc() You just specify the amount of continous memory you want, and mallocs provides it, giving you a pointer pointing to the start of that block. You cannot assume that the next time you call malloc (with the same or different amount of memory) it will give you an adjacent block. It just can be completely unrelated (above or below in memory) to the previous given block. And you cannot modify that pointer, because you need it to call free(3) once you don't need the block anymore, with exactly the same pointer value that malloc(3) gave to you. If, for some reason you modify the pointer, you need to restore it to the original value to be capable of calling free(3). Lack to do so, you'll probably crash your program at the next call to free(3).
I just see a memory leak. Malloc 2 times into different vars 8 bytes of space and see if the difference is more than 8 bytes or 2 int.
Consider the following:
int* x = calloc(3,sizeof(int));
x[3] = 100;
which is located inside of a function.
I get no error when I compile and run the program, but when I run it with valgrind I get an "Invalid write of size 4".
I understand that I am accessing a memory place outside of what I have allocated with calloc, but I'm trying to understand what actually happens.
Does some address in the stack(?) still have the value 100? Because there must certainly be more available memory than what I have allocated with calloc. Is the valgrind error more of a "Hey, you probably did not mean to do that"?
I understand that I am accessing a memory place outside of what I have allocated with calloc, but I'm trying to understand what actually happens.
"What actually happens" is not well-defined; it depends entirely on what gets overwritten. As long as you don't overwrite anything important, your code will appear to run as expected.
You could wind up corrupting other data that was allocated dynamically. You could wind up corrupting some bit of heap bookkeeping.
The language does not enforce any kind of bounds-checking on array accesses, so if you read or write past the end of the array, there are no guarantees on what will happen.
Does some address in the stack(?) still have the value 100?
First of all, calloc allocates memory on the heap not stack.
Now, regarding the error.
Sure most of the time there is plenty of memory available when your program is running. However when you allocate memory for x bytes, the memory manager looks for some free chunk of memory of that exact size(+ maybe some more if calloc requested larger memory to store some auxiliary info), there are no guaranties on what the bytes after that chunk are used for, and even no guaranties that they are not read-only or can be accessed by your program.
So anything can happen. In the case if the memory was just there waiting for it to be used by your program, nothing horrible happens, but if that memory was used by something else in your program, the values would be mess up, or worst of all the program could crash because of accessing something that wasn't supposed to be accessed.
So the valgrind error should be treated very seriously.
The C language doesn't require bounds checking on array accesses, and most C compilers don't implement it. Besides if you used some variable size instead of constant value 3, the array size could be unknown during compilation, and there would be no way to check if the access isn't out of bound.
There's no guarantees on what was allocated in the space past x[3] or what will be written there in the future. alinsoar mentioned that x[3] itself does not cause undefined behavior, but you should not attempt to fetch or store a value from there. Often you will probably be able to write and access this memory location without problems, but writing code that relies on reaching outside of your allocated arrays is setting yourself up for very hard to find errors in the future.
Does some address in the stack(?) still have the value 100?
When using calloc or malloc, the values of the array are not actually on the stack. These calls are used for dynamic memory allocation, meaning they are allocated in a seperate area of memory known as the "Heap". This allows you to access these arrays from different parts of the stack as long as you have a pointer to them. If the array were on the stack, writing past the bounds would risk overwriting other information contained in your function (like in the worst case the return location).
The act of doing that is what is called undefined behavior.
Literally anything can happen, or nothing at all.
I give you extra points for testing with Valgrind.
In practice, it is likely you will find the value 100 in the memory space after your array.
Beware of nasal demons.
You're allocating memory for 3 integer elements but accessing the 4th element (x[3]). Hence, the warning message from valgrind. Compiler will not complain about it.
I am trying to undestand the C functions malloc and free. I know this has been discussed a lot on StackOverflow. However, I think I kind of know what these functions do by now. I want to know why to use them. Let's take a look at this piece of code:
int n = 10;
char* array;
array = (char*) malloc(n * sizeof(char));
// Check whether memory could be allocated or not...
// Do whatever with array...
free(array);
array = NULL;
I created a pointer of type char which I called array. Then I used malloc to find a chunk of memory that is currently not used and (10 * sizeof(char)) bytes large. That address I casted to type char pointer before assigning it to my previously created char pointer. Now I can work with my char array. When I am done, I'll use free to free that chunk of memory since it's not being used anymore.
I have one question: Why wouldn't I just do char array[10];? Wikipedia has only one small sentence to give to answer that, and that sentence I unfortunately don't understand:
However, the size of the array is fixed at compile time. If one wishes to allocate a similar array dynamically...
The slide from my university is similarily concise:
It is also possible to allocate memory from the heap.
What is the heap? I know a data structure called heap. :)
However, I've someone could explain to me in which case it makes sense to use malloc and free instead of the regular declaration of a variable, that'd be great. :)
C provides three different possible "storage durations" for objects:
Automatic - local storage that's specific to the invocation of the function it's in. There may be more than one instance of objects created with automatic storage, if a function is called recursively or from multiple threads. Or there may be no instances (if/when the function isn't being called).
Static - storage that exists, in exactly one instance, for the entire duration of the running program.
Allocated (dynamic) - created by malloc, and persists until free is called to free it or the program terminates. Allocated storage is the only type of storage with which you can create arbitrarily large or arbitrarily many objects which you can keep even when functions return. This is what malloc is useful for.
First of all there is no need to cast the malloc
array = malloc(n * sizeof(char));
I have one question: Why wouldn't I just do char array[10];?
What will you do if you don't know how many storage space do you want (Say, if you wanted to have an array of arbitrary size like a stack or linked list for example)?
In this case you have to rely on malloc (in C99 you can use Variable Length Arrays but for small memory size).
The function malloc is used to allocate a certain amount of memory during the execution of a program. The malloc function will request a block of memory from the heap. If the request is granted, the operating system will reserve the requested amount of memory.
When the amount of memory is not needed anymore, you must return it to the operating system by calling the function free.
In simple: you use an array when you know the number of elements the array will need to hold at compile time. you use malloc with pointers when you don't know how many elements the array will need to be at compile time.
For more detail read Heap Management With malloc() and free().
Imagine you want to allocate 1,000 arrays.
If you did not have malloc and free... but needed a declaration in your source for each array, then you'd have to make 1,000 declarations. You'd have to give them all names. (array1, array2, ... array1000).
The idea in general of dynamic memory management is to handle items when the quantity of items is not something you can know in advance at the time you are writing your program.
Regarding your question: Why wouldn't I just do char array[10];?. You can, and most of the time, that will be completely sufficient. However, what if you wanted to do something similar, but much much bigger? Or what if the size of your data needs to change during execution? These are a few of the situations that point to using dynamically allocated memory (calloc() or malloc()).
Understanding a little about how/when the stack and heap are used would be good: When you use malloc() or calloc(), it uses memory from the heap, where automatic/static variables are given memory on the stack, and are freed when you leave the scope of that variable, i.e the function or block it was declared in.
Using malloc and calloc become very useful when the size of the data you need is not known until run-time. When the size is determined, you can easily call one of these to allocate memory onto the heap, then when you are finished, free it with free()
Regarding What is the heap? There is a good discussion on that topic here (slightly different topic, but good discussion)
In response to However, I've someone could explain to me in which case it makes sense to use malloc() and free()...?
In short, If you know what your memory requirements are at build time (before run-time) for a particular variable(s), use static / automatic creation of variables (and corresponding memory usage). If you do not know what size is necessary until run-time, use malloc() or calloc() with a corresponding call to free() (for each use) to create memory. This is of course a rule-of-thumb, and a gross generalization. As you gain experience using memory, you will find scenarios where even when size information is known before run-time, you will choose to dynamically allocate due to some other criteria. (size comes to mind)
If you know in advance that you only require an array of 10 chars, you should just say char array[10]. malloc is useful if you don't know in advance how much storage you need. It is also useful if you need storage that is valid after the current function returns. If you declare array as char array[10], it will be allocated on the stack. This data will not be valid after your function returns. Storage that you obtain from malloc is valid until you call free on it.
Also, there is no need to cast the return value of malloc.
Why to use free after malloc can be understood in the way that it is a good style to free memory as soon as you don't need it. However if you dont free the memory then it would not harm much but only your run time cost will increase.
You may also choose to leave memory unfreed when you exit the program. malloc() uses the heap and the complete heap of a process is freed when the process exits. The only reason why people insist on freeing the memory is to avoid memory leaks.
From here:
Allocation Myth 4: Non-garbage-collected programs should always
deallocate all memory they allocate.
The Truth: Omitted deallocations in frequently executed code cause
growing leaks. They are rarely acceptable. but Programs that retain
most allocated memory until program exit often perform better without
any intervening deallocation. Malloc is much easier to implement if
there is no free.
In most cases, deallocating memory just before program exit is
pointless. The OS will reclaim it anyway. Free will touch and page in
the dead objects; the OS won't.
Consequence: Be careful with "leak detectors" that count allocations.
Some "leaks" are good!
Also the wiki has a good point in Heap base memory allocation:-
The heap method suffers from a few inherent flaws, stemming entirely
from fragmentation. Like any method of memory allocation, the heap
will become fragmented; that is, there will be sections of used and
unused memory in the allocated space on the heap. A good allocator
will attempt to find an unused area of already allocated memory to use
before resorting to expanding the heap. The major problem with this
method is that the heap has only two significant attributes: base, or
the beginning of the heap in virtual memory space; and length, or its
size. The heap requires enough system memory to fill its entire
length, and its base can never change. Thus, any large areas of unused
memory are wasted. The heap can get "stuck" in this position if a
small used segment exists at the end of the heap, which could waste
any magnitude of address space, from a few megabytes to a few hundred.
I have a doubt regarding:
void *a = malloc(40);
free(a);
If I consider that malloc(40) allocates 40 bytes of memory and returns the address of this memory and then free(a) deallocates/frees this memory but doesn't do anything with the bit pattern residing in that memory. So, supposedly this same memory is allocated to say void *b, then on printing the value at address pointed to by b gives me the same value that was residing or it gives me a garbage value and why?
I assume that you have this situation in mind:
void * a = malloc(40);
free(a);
void * b = malloc(40);
assert(a == b);
This is of course entirely plausible, since memory is likely to be reused.
However, since a == b, you've answered your own question: The value of b is identical to the value of a!
I believe that you've asked the wrong question, and that you are actually interested in comparing the memory pointed to by b. That's a whole different kettle of fish. Anything could have happened in between the two malloc calls. Nothing is guaranteed. The memory pointed to by the return value of a malloc call is uninitialized, and you cannot make any assumptions about its content. It stands to reason that the memory will not have changed in a typical, optimized C library, but there's no guarantee. A "safe" runtime environment may well choose to overwrite freed or allocated memory with a specific test pattern to allow better detection of invalid accesses.
It can give you any value.
C/C++ standards do not mandate the value to be anything specific. In technical terms the value of any uninitialized variable/memory is Indeterminate.
In short, Your program should not rely on this value to be anything specific and if it does then it is non-portable.
Its not guaranteed anything. Printing block you got from malloc may print previous data, or may not. There is many things that can alter next malloc block(so next address will be different) or alter old memory block itself, too.
free will often modify the memory you're freeing.
It's a common trick, especially in debug mode, for free to overwrite memory with some fixed pattern to make it easier to tell if you're double freeing memory, or just manipulating freed memory.
Likewise malloc might overwrite memory with a different pattern to make it obvious that the memory is uninitialised.
malloc and free are C style memory management rather than C++. C++ alternatives are new and delete operators. As for the bit pattern remaining in the memory after free(), yes, it's the same bit pattern. If you want to manually delete the bit pattern, you can use memset() or ZeroMemory() if you write WinApi code.
It will give you some garbage value as function free() will set the specified bytes to some pattern which will make it known to the memory that the bytes are freed and uninitialized. Having said so, it is very unlikely and highly improbable that you will encounter a case defined in your question. Even if you will get allocated the same bytes again, believe me, you will by no means recognize them :-)
AFAIK, free() generally sets the memory to 0xFE 0xEE with Visual Studio which roughly means that the memory was allocated but now is freed. These value are known as Sentinel Values which means that the heap is still owned by the process but not in use. Memory which are freed from the process will show you "?? ??".
First, the code is not c++ but plain c.
The reason is, that free() / delete exist there to so the system can note that the memory region is again available for allocation.
Why should it do anything beyond ?
This is a security issue, however. I believe some security-oriented modern systems do zero the memory before giving it to an application. Then if you use malloc() for the first time you will get an equivalent to calloc. Never the less, if you happen to free the memory and then allocate it again might be able to read your own data.
The plain reason for such behaviour is simple. Zeroing memory would be time consuming and you can do it by hand. Actually it has O(n) complexity. If you write a number cruncher that reuses its memory, you do not care what you get in there after malloc() because most likely you should be overwriting it, and you definitely do not want your FLOPS to be negatively affected by unnecessary memset() upon calling free().
If you want to be sure that nothing can read the memory after you call free you need to use memset(a, 0, SIZE) before calling free().
Trying to understand answers to my question
what happens when tried to free memory allocated by heap manager, which allocates more than asked for?
I wrote this function and puzzled by its output
int main(int argc,char **argv){
char *p,*q;
p=malloc(1);
strcpy(p,"01234556789abcdefghijklmnopqrstuvwxyz"); //since malloc allocates atleast 1 byte
q=malloc(2);
// free(q);
printf("q=%s\n",q);
printf("p=%s\n",p);
return 0;
}
Output
q=vwxyz
p=01234556789abcdefghijklm!
Can any one explain this behavior? or is this implementation specific?
also if free(q) is uncommented, I am getting SIGABRT.
You are copying more bytes to *p than you have allocated, overwriting whatever might have been at the memory locations after the allocated space.
When you then call malloc again, it takes a part of memory it knows to be unused at the moment (which happens to be a few bytes after *p this time), writes some bookkeeping information there and returns a new pointer to that location.
The bookkeeping information malloc writes happens to start with a '!' in this run, followed by a zero byte, so your first string is truncated. The new pointer happens point to the end of the memory you overwrote before.
All this is implementation specific and might lead to different results each run or depending on the phase of the moon. The second call to malloc() would also absolutely be in its right to just crash the program in horrible ways (especially since you might be overwriting memory that malloc uses internally).
You are just being lucky this time: this is an undefined behavior and don't count on it.
Ususally, but depending on the OS, memory is allocated in "pages" (i.e. multiple bytes). Malloc() on the other hand allocates memory from those "pages" in a more "granular" way: there is "overhead" associated with each allocation being managed through malloc.
The signal you are getting from free is most probably related to the fact that you mess up the memory management by writing past what you were allocated with p i.e. writing on the overhead information used by the memory manager to keep track of memory blocks etc.
This is a classical heap overflow. p has only 1 byte, but the heap manager pads the allocation (32 bytes in your case). q is allocated right after p, so it naturally gets the next available spot. For example if the address of p is 0x1000, the adress that gets assigned to q is 0x1020. This explains why q points to part of the string.
The more interesting question is why p is only "01234556789abcdefghijklm" and not "01234556789abcdefghijklmnopqrstuvwxyz". The reason is that memory manager uses the gaps between allocation for its internal bookkeeping. From a memory manager perspective the memory layout is as following:
p D q
where D is internal data structure of memory manager (0x1010 to 0x1020 in our example). While allocating memory for q, the heap manager writes its stuff to the bookkeeping area (0x1010 to 0x1020). A byte is changed to 0 truncates the string since it is treated as NULL terminator.
THE VALUE OF "p":
you allocated enough space to fit this: ""
[[ strings are null terminated, remember? you don't see it, but it's there -- so that's one byte used up. ]]
but you are trying to store this: "01234556789abcdefghijklmnopqrstuvwxyz"
the result, therefore, is that the "stuff" starting with "123.." is being stored beyond the memory you allocated -- possibly writing over other "stuff" elsewhere. as such your results will be messy, and as "jidupont" said you're lucky that it doesn't just crash.
OUTPUT OF PRINTING [BROKEN] "p"
as said, you've written way past the end of "p"; but malloc doesn't know this. so when you asked for another block of memory for "q", maybe it gave you the memory following what it gave you for "p"; and maybe it aligned the memory (typical) so it's pointer is rounded up to some nice number; and then maybe it uses some of this memory to store bookkeeping information you're not supposed to be concerned with. but you don't know, do you? you're not supposed to know either -- you're just not supposed to write to memory that you haven't allocated yourself!
and the result? you see some of what you expected -- but it's truncated! because ... another block was perhaps allocated IN the memory you used (and used without permission, i might add), or something else owned that block and changed it, and in any case some values were changed -- resulting in: "01234556789abcdefghijklm!". again, lucky that things didn't just explode.
FREEING "q"
if you free "q", then try to access it -- as you are doing by trying to print it -- you will (usually) get a nasty error. this is well deserved. you shouldn't uncomment that "free(q)". but you also shouldn't try to print "q", because you haven't put anything there yet! for all you know, it might contain gibberish, and so print will continue until it encounters a NULL -- which may not happen until the end of the world -- or, more likely, until your program accesses yet more memory that it shouldn't, and crashes because the OS is not happy with you. :)
It shouldn't be that puzzling that intentionally misusing these functions will give nonsensical results.
Two consecutive mallocs are not guaranteed to give you two consecutive areas of memory. malloc may choose to allocate more than the amount of memory you requested, but not less if the allocation succeeds. The behavior of your program when you choose to overwrite unallocated memory is not guaranteed to be predictable.
This is just the way C is. You can easily misuse the returned memory areas from malloc and the language doesn't care. It just assumes that in a correct program you will never do so, and everything else is up for grabs.
Malloc is a function just like yours :)
There is a lot of malloc implementations so i won't go into useless details.
At the first call malloc it asks memory to the system. For the example let's say 4096 which is the standard memory page size which is good. So you call malloc asking for 1 byte. The function malloc will asks 4096 bytes to the system. Next, it will use a small part of this memory to store internal data such the positions of the available blocks. Then it will cut one part of this block and send it back to you.
An internal algorithm will trys to reuse the blocks after a call to free to avoid re-asking memory to the system.
So with this little explanation you can now understand why you code is working.
You are writing in the memory asked my malloc to the system. This comportment doesn't bother the system because you stay in the memory allocated for your processes. The problem is you can't know for sure that you are not writing on critical parts of your software memory. This kind off error are called buffer overflow and are causing most of the "mystical bugs".
The best way to avoid them is to use valgrind on linux. This soft will tell you if you are writing or reading where you are not supposed to.
It that clear enough ?
I suggest reading this introduction.
Pointers And Memory
It helped me understand the difference between stack and heap allocation, very good introduction.