When to use malloc, is it really necessary - c

The malloc example I'm studying is
#include <stdio.h>
#include <stdlib.h>
int main()
{
int *vec;
int i, size;
printf("Give size of vector: ");
scanf("%d",&size);
vec = (int *) malloc(size * sizeof(int));
for(i=0; i<size; i++) vec[i] = i;
for(i=0; i<size; i++)
printf("vec[%d]: %d\n", i, vec[i]);
free(vec);
}
But I can make a program behave at runtime like this program behaves writing it in C wihout malloc, can't I? So what's the use of malloc here?

It is dynamic memory allocation.
The very important point there is that you don't know how much memory you'll need because the amount of memory you must end up with is dependant on the user input.
Thus the use of malloc, which takes size as part of its argument, and size is unknown at compile-time.

This specific example could have been done using variable length arrays which were supported by the standard since c99, now optional as of the 2011 standard. Although if size is very large allocating it on the stack would not work since the stack is much smaller than available heap memory which is what malloc will use. This previous post of mine also has some links on variable length arrays you might find useful.
Outside of this example when you have dynamic data structures using malloc is pretty hard to avoid.

Two issues:
First, you might not know how much memory you need until run time. Using malloc() allows you to allocate exactly the right amount: no more, no less. And your application can "degrade" gracefully if there is not enough memory.
Second, malloc() allocated memory from the heap. This can sometimes be an advantage. Local variables that are allocated on the stack have a very limited amount of total memory. Static variables mean that your app will use all the memory all the time, which could even potentially prevent your app from loading if there isn't enough memory.

Related

Why does malloc need to be used for dynamic memory allocation in C?

I have been reading that malloc is used for dynamic memory allocation. But if the following code works...
int main(void) {
int i, n;
printf("Enter the number of integers: ");
scanf("%d", &n);
// Dynamic allocation of memory?
int int_arr[n];
// Testing
for (int i = 0; i < n; i++) {
int_arr[i] = i * 10;
}
for (int i = 0; i < n; i++) {
printf("%d ", int_arr[i]);
}
printf("\n");
}
... what is the point of malloc? Isn't the code above just a simpler-to-read way to allocate memory dynamically?
I read on another Stack Overflow answer that if some sort of flag is set to "pedantic", then the code above would produce a compile error. But that doesn't really explain why malloc might be a better solution for dynamic memory allocation.
Look up the concepts for stack and heap; there's a lot of subtleties around the different types of memory. Local variables inside a function live in the stack and only exist within the function.
In your example, int_array only exists while execution of the function it is defined in has not ended, you couldn't pass it around between functions. You couldn't return int_array and expect it to work.
malloc() is used when you want to create a chunk of memory which exists on the heap. malloc returns a pointer to this memory. This pointer can be passed around as a variable (eg returned) from functions and can be used anywhere in your program to access your allocated chunk of memory until you free() it.
Example:
'''C
int main(int argc, char **argv){
int length = 10;
int *built_array = make_array(length); //malloc memory and pass heap pointer
int *array = make_array_wrong(length); //will not work. Array in function was in stack and no longer exists when function has returned.
built_array[3] = 5; //ok
array[3] = 5; //bad
free(built_array)
return 0;
}
int *make_array(int length){
int *my_pointer = malloc( length * sizeof int);
//do some error checking for real implementation
return my_pointer;
}
int *make_array_wrong(int length){
int array[length];
return array;
}
'''
Note:
There are plenty of ways to avoid having to use malloc at all, by pre-allocating sufficient memory in the callers, etc. This is recommended for embedded and safety critical programs where you want to be sure you'll never run out of memory.
Just because something looks prettier does not make it a better choice.
VLAs have a long list of problems, not the least of which they are not a sufficient replacement for heap-allocated memory.
The primary -- and most significant -- reason is that VLAs are not persistent dynamic data. That is, once your function terminates, the data is reclaimed (it exists on the stack, of all places!), meaning any other code still hanging on to it are SOL.
Your example code doesn't run into this problem because you aren't using it outside of the local context. Go ahead and try to use a VLA to build a binary tree, then add a node, then create a new tree and try to print them both.
The next issue is that the stack is not an appropriate place to allocate large amounts of dynamic data -- it is for function frames, which have a limited space to begin with. The global memory pool, OTOH, is specifically designed and optimized for this kind of usage.
It is good to ask questions and try to understand things. Just be careful that you don't believe yourself smarter than the many, many people who took what now is nearly 80 years of experience to design and implement systems that quite literally run the known universe. Such an obvious flaw would have been immediately recognized long, long ago and removed before either of us were born.
VLAs have their place, but it is, alas, small.
Declaring local variables takes the memory from the stack. This has two ramifications.
That memory is destroyed once the function returns.
Stack memory is limited, and is used for all local variables, as well as function return addresses. If you allocate large amounts of memory, you'll run into problems. Only use it for small amounts of memory.
When you have the following in your function code:
int int_arr[n];
It means you allocated space on the function stack, once the function will return this stack will cease to exist.
Image a use case where you need to return a data structure to a caller, for example:
Car* create_car(string model, string make)
{
Car* new_car = malloc(sizeof(*car));
...
return new_car;
}
Now, once the function will finish you will still have your car object, because it was allocated on the heap.
The memory allocated by int int_arr[n] is reserved only until execution of the routine ends (when it returns or is otherwise terminated, as by setjmp). That means you cannot allocate things in one order and free them in another. You cannot allocate a temporary work buffer, use it while computing some data, then allocate another buffer for the results, and free the temporary work buffer. To free the work buffer, you have to return from the function, and then the result buffer will be freed to.
With automatic allocations, you cannot read from a file, allocate records for each of the things read from the file, and then delete some of the records out of order. You simply have no dynamic control over the memory allocated; automatic allocations are forced into a strictly last-in first-out (LIFO) order.
You cannot write subroutines that allocate memory, initialize it and/or do other computations, and return the allocated memory to their callers.
(Some people may also point out that the stack memory commonly used for automatic objects is commonly limited to 1-8 mebibytes while the memory used for dynamic allocation is generally much larger. However, this is an artifact of settings selected for common use and can be changed; it is not inherent to the nature of automatic versus dynamic allocation.)
If the allocated memory is small and used only inside the function, malloc is indeed unnecessary.
If the memory amount is extremely large (usually MB or more), the above example may cause stack overflow.
If the memory is still used after the function returned, you need malloc or global variable (static allocation).
Note that the dynamic allocation through local variables as above may not be supported in some compiler.

why use malloc function in c when we can declare arrays using arr[size] ,taking input from the user for size?

why use malloc function when we can write the code in c like this :
int size;
printf("please the size of the array\n");
scanf("%d",&size);
int arr[size];
this eliminates the possibility of assigning garbage value to array size and is also taking the size of the array at run time ...
so why use dynamic memory allocation at all when it can be done like this ?
This notation
int arr[size];
means VLA - Variable-Length Array.
Standard way they are implemented is that they are allocated on stack.
What is wrong with it?
Stack is usually relatively small - on my linux box it is only 8MB.
So if you try to run following code
#include <stdio.h>
const int MAX_BUF=10000000;
int main()
{
char buf[MAX_BUF];
int idx;
for( idx = 0 ; idx < MAX_BUF ; idx++ )
buf[idx]=10;
}
it will end up with seg fault.
TL;DR version
PRO:
VLA are OK for small allocations. You don't have to worry about freeing memory when leaving scope.
AGAINST:
They are unsafe for big allocations. You can't tell what is safe size to allocate (say recursion).
Besides the fact that VLA may encounter problems when their size is too large, there is a much more important thing with these: scope.
A VLA is allocated when the declaration is encountered and deallocated when the scope (the { ... }) is left. This has advantages (no function call needed for both operations) and disadvantages (you can't return it from a function or allocate several objects).
malloc allocates dynamically, so the memory chunk persists after return from the function you happen to be in, you can allocated with malloc several times (e.g in a for loop) and you determine exactly when you deallocate (by a call to free).
Why to not use the following:
int size;
printf("please the size of the array\n");
scanf("%d",&size);
int arr[size];
Insufficient memory. int arr[size]; may exceed resources and this goes undetected. #Weather Vane Code can detect failure with a NULL check using *alloc().
int *arr = malloc(sizeof *arr * size);
if (arr == NULL && size > 0) Handle_OutOfMemory();
int arr[size]; does not allow for an array size of 0. malloc(sizeof *arr * 0); is not a major problem. It may return NULL or a pointer on success, yet that can easily be handled.
Note: For array sizes, type size_t is best which is some unsigned integer type - neither too narrow, nor too wide. int arr[size]; is UB if size < 0. It is also a problem with malloc(sizeof *arr * size). An unqualified size is not a good idea with variable length array (VLA) nor *alloc().
VLAs, required since C99 are only optionally supported in a compliant C11 compiler.
What you write is indeed a possibility nowadays, but if you do that with g++ it will issue warnings (which is generally a bad thing).
Other thing is your arr[size] is stored at stack, while malloc stores data at heap giving you much more space.
With that is connected probably the main issue and that is, you can actually change size of your malloc'd arrays with realloc or free and another malloc. Your array is there for the whole stay and you cannot even free it at some point to save space.

malloc non-deterministic behaviour

#include <stdio.h>
#include <stdlib.h>
int main(void)
{
int *arr = (int*)malloc(10);
int i;
for(i=0;i<100;i++)
{
arr[i]=i;
printf("%d", arr[i]);
}
return 0;
}
I am running above program and a call to malloc will allocate 10 bytes of memory and since each int variable takes up 2 bytes so in a way I can store 5 int variables of 2 bytes each thus making up my total 10 bytes which I dynamically allocated.
But on making a call to for-loop it is allowing me to enter values even till 99th index and storing all these values as well. So in a way if I am storing 100 int values it means 200 bytes of memory whereas I allocated only 10 bytes.
So where is the flaw with this code or how does malloc behave? If the behaviour of malloc is non-deterministic in such a manner then how do we achieve proper dynamic memory handling?
The flaw is in your expectations. You lied to the compiler: "I only need 10 bytes" when you actually wrote 100*sizeof(int) bytes. Writing beyond an allocated area is undefined behavior and anything may happen, ranging from nothing to what you expect to crashes.
If you do silly things expect silly behaviour.
That said malloc is usually implemented to ask the OS for chunks of memory that the OS prefers (like a page) and then manages that memory. This speeds up future mallocs especially if you are using lots of mallocs with small sizes. It reduces the number of context switches that are quite expensive.
First of all, in the most Operating Systems the size of int is 4 bytes. You can check that with:
printf("the size of int is %d\n", sizeof(int));
When you call the malloc function you allocate size at heap memory. The heap is a set aside for dynamic allocation. There's no enforced pattern to the allocation and deallocation of blocks from the heap; you can allocate a block at any time and free it at any time. This makes it much more complex to keep track of which parts of the heap are allocated or free at any given time. Because your program is small and you have no collision in the heap you can run this for with more values that 100 and it runs too.
When you know what are you doing with malloc then you build programs with proper dynamic memory handling. When your code has improper malloc allocation then the behaviour of the program is "unknown". But you can use gdb debugger to find where the segmentation will be revealed and how the things are in heap.
malloc behaves exactly as it states, allocates n number bytes of memory, nothing more. Your code might run on your PC, but operating on non-allocated memory is undefined behavior.
A small note...
Int might not be 2 bytes, it varies on different architectures/SDKs. When you want to allocate memory for n integer elements, you should use malloc( n * sizeof( int ) ).
All in short, you manage dynamic memory with other tools that the language provides ( sizeof, realloc, free, etc. ).
C doesn't do any bounds-checking on array accesses; if you define an array of 10 elements, and attempt to write to a[99], the compiler won't do anything to stop you. The behavior is undefined, meaning the compiler isn't required to do anything in particular about that situation. It may "work" in the sense that it won't crash, but you've just clobbered something that may cause problems later on.
When doing a malloc, don't think in terms of bytes, think in terms of elements. If you want to allocate space for N integers, write
int *arr = malloc( N * sizeof *arr );
and let the compiler figure out the number of bytes.

When do I need dynamic memory? [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
Malloc or normal array definition?
We learn that there is dynamic memory in C and dynamic variables:
#include <stdio.h>
int a = 17;
int main(void)
{
int b = 18; //automatic stack memory
int * c;
c = malloc( sizeof( int ) ); //dynamic heap memory
*c = 19;
printf("a = %d at address %x\n", a, &a);
printf("b = %d at address %x\n", b, &b);
printf("c = %d at address %x\n", *c, c);
free(c);
system("PAUSE");
return 0;
}
How do I know which type of memory to use? When do I ned one or the other?
Use dynamic in the following situations:
When you need a lot of memory. Typical stack size is 1 MB, so anything bigger than 50-100KB should better be dynamically allocated, or you're risking crash. Some platforms can have this limit even lower.
When the memory must live after the function returns. Stack memory gets destroyed when function ends, dynamic memory is freed when you want.
When you're building a structure (like array, or graph) of size that is unknown (i.e. may get big), dynamically changes or is too hard to precalculate. Dynamic allocation allows your code to naturally request memory piece by piece at any moment and only when you need it. It is not possible to repeatedly request more and more stack space in a for loop.
Prefer stack allocation otherwise. It is faster and can not leak.
You use dynamic memory when the size of your allocation is not known in advance only on runtime.
For example you ask a user to input names (lets say up to 10 names) and store them in a string array. Since you do not know how much names the user will provide (only on runtime that is) you will have to allocate the array only after you know how much to allocate so you will use dynamic allocation.
You can of course use an array of fixed sized 10 but for larger amounts this will be wasteful
Use dynamic memory allocation, if you don't know exactly how much memory your program will need to allocate at compile-time.
int a[n] for example will limit your array size to n. Also, it allocated n x 4 bytes of memory whether you use it or not. This is allocated on the stack, and the variable n must be known at compile time.
int *a = (int *)malloc(n * sizeof (int)) on the other hand allocated at runtime, on the heap, and the n needs to be known only at runtime, not necessarily at compile-time.
This also ensures you allocate exactly as much memory as you really need. However, as you allocated it at runtime, the cleaning up has to be done by you using free.
You should use dynamic memory when:
If you want your object to persist beyond the scope in which it was created.
Usually, stack sizes are limited and hence if your object occupies a lot of memory then you might run out of stack space in such cases one would usually go for dynamic memory allocation.
Note that c99 standard introduces Variable Length Arrays(VLA) in C so you need not use dynamic memory allocation just because you do not know the array dimensions before hand(unless ofcourse #2 mentioned above is the case)
It is best to avoid dynamic memory allocations as much as you can because it means explicitly managing the memory instead of the automatic mechanism provided by the language.Explicit memory management means that you are prone to make more errors, which might lead to catastrophic effects.
Having said that dynamic memory allocations cannot be avoided always and must be used when the use is imperative(two cases mentioned above).
If you can program without dynamic allocation don't use it!
But a day you will be blocked, and the only way to unblock you will be to use dynamic allocation then now you can use it
Als made an interesting point that you should allocate memory from the heap if your object needs to persist beyond the scope in which it was created. In the code above, you don't need to allocate memory from heap at all. You can rewrite it like this:
#include <stdio.h>
int a = 17;
int main(void)
{
int b = 18; //automatic stack memory
int c[1]; // allocating stack memory. sizeof(int) * 1
c[0] = 19;
printf("a = %d at address %x\n", a, &a);
printf("b = %d at address %x\n", b, &b);
printf("c = %d at address %x\n", c[0], c);
system("PAUSE");
return 0;
}
In fact, as part of the C99 standard (Variable-length array), you can use the [] operator to allocate dynamic space for an array on the stack just as you would normally do to create an array. You don't even need to know the size of the array at compilation time. The compiler will just adjust the esp register (for x86 machines) based on the requested allocation space and you're good to go.

memory allocation in C

I have a question regarding memory allocation order.
In the following code I allocate in a loop 4 strings.
But when I print the addresses they don't seem to be allocated one after the other... Am I doing something wrong or is it some sort of defense mechanism implemented by the OS to prevent possible buffer overflows? (I use Windows Vista).
Thank you.
char **stringArr;
int size=4, i;
stringArr=(char**)malloc(size*sizeof(char*));
for (i=0; i<size; i++)
stringArr[i]=(char*)malloc(10*sizeof(char));
strcpy(stringArr[0], "abcdefgh");
strcpy(stringArr[1], "good-luck");
strcpy(stringArr[2], "mully");
strcpy(stringArr[3], "stam");
for (i=0; i<size; i++) {
printf("%s\n", stringArr[i]);
printf("%d %u\n\n", &(stringArr[i]), stringArr[i]);
}
Output:
abcdefgh
9650064 9650128
good-luck
9650068 9638624
mully
9650072 9638680
stam
9650076 9638736
Typically when you request memory through malloc(), the C runtime library will round the size of your request up to some minimum allocation size. This makes sure that:
the runtime library has room for its bookkeeping information
it's more efficient for the runtime library to manage allocated blocks that are all multiples of some size (such as 16 bytes)
However, these are implementation details and you can't really rely on any particular behaviour of malloc().
But when I print the addresses they don't seem to be allocated one after the other...
So?
Am I doing something wrong or is it some sort of defense mechanism implemented by the OS to prevent possible buffer overflows?
Probably "neither".
Just out of interest, what addresses do you get?
You shouldn't depend on any particular ordering or spacing of values returned by malloc. It behaves in mysterious and unpredictable ways.
Typically it is reasonable to expect that a series of chronological allocations will result in a memory addresses that are somehow related, but as others have pointed out, it is certainly not a requirement of the heap manager. In this particular case, though, it is possible that you are seeing results of the low fragmentation heap. Windows keeps lists of small chunks of memory that can quickly satisfy a request. These could be in any order.
You can't depend on malloc to give you contiguous addresses. It's entirely up to the implementation and presumably the current state of the heap; some implementations may, many won't.
If you need the addresses to be contiguous, allocate one large block of memory and set up your pointers to point to different areas within it.
As others have mentioned, there is no standard to specify in which order the memory blocks allocated by malloc() should be located in the memory. For example, freed blocks can be scattered all around the heap and they may be re-used in any order.
But even if the blocks happen to be one after each other, they most likely do not form a contiguous block. In order to reduce fragmentation, the heap manager only allocates blocks of specific size, for example power of two (64, 128, 256, 512 etc. bytes). So, if you reserve 10 bytes for a string, there may be perhaps 22 or 54 un-used bytes after that.
The memory overhead is another reason why it is not good idea to use dynamic memory allocation unless really necessary. It is much easier and safer just to use a static array.
Since you are interesting in knowing the addresses returned by malloc(), you should make sure you are printing them properly. By "properly", I mean that you should use the right format specifier for printf() to print addresses. Why are you using "%u" for one and "%d" for another?
You should use "%p" for printing pointers. This is also one of the rare cases where you need a cast in C: because printf() is a variadic function, the compiler can't tell that the pointers you're passing as an argument to it need to be of the type void * or not.
Also, you shouldn't cast the return value of malloc().
Having fixed the above, the program is:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
char **stringArr;
int size=4, i;
stringArr = malloc(size * sizeof *stringArr);
for (i=0; i < size; i++)
stringArr[i] = malloc(10 * sizeof *stringArr[i]);
strcpy(stringArr[0], "abcdefgh");
strcpy(stringArr[1], "good-luck");
strcpy(stringArr[2], "mully");
strcpy(stringArr[3], "stam");
for (i=0; i<size; i++) {
printf("%s\n", stringArr[i]);
printf("%p %p\n", (void *)(&stringArr[i]), (void *)(stringArr[i]));
}
return 0;
}
and I get the following output when I run it:
abcdefgh
0x100100080 0x1001000a0
good-luck
0x100100088 0x1001000b0
mully
0x100100090 0x1001000c0
stam
0x100100098 0x1001000d0
On my computer, char ** pointers are 8 bytes long, so &stringArr[i+1] is at 8 bytes greater than &stringArr[i]. This is guaranteed by the standard: If you malloc() some space, that space is contiguous. You allocated space for 4 pointers, and the addresses of those four pointers are next to each other. You can see that more clearly by doing:
printf("%d\n", (int)(&stringArr[1] - &stringArr[0]));
This should print 1.
About the subsequent malloc()s, since each stringArr[i] is obtained from a separate malloc(), the implementation is free to assign any suitable addresses to them. On my implementation, with that particular run, the addresses are all 0x10 bytes apart.
For your implementation, it seems like char ** pointers are 4 bytes long.
About your individual string's addresses, it does look like malloc() is doing some sort of randomization (which it is allowed to do).

Resources