Why must malloc be used? - c

From what I understand, the malloc function takes a variable and allocates memory as asked. In this case, it will ask the compiler to prepare enough memory to hold the equivalent of twenty double variables. Is my understanding correct, and why must it be used?
double *q;
q=(double *)malloc(20*sizeof(double));
for (i=0;i<20; i++)
{
*(q+i)= (double) rand();
}

You don't have to use malloc() when:
The size is known at compile time, as in your example.
You are using C99 or C2011 with VLA (variable length array) support.
Note that malloc() allocates memory at runtime, not at compile time. The compiler is only involved to the extent that it ensures the correct function is called; it is malloc() that does the allocation.
Your example mentions 'equivalence of ten integers'. It is very seldom that 20 double occupy the same space as 10 int. Usually, 10 double will occupy the same space as 20 int (when sizeof(int) == 4 and sizeof(double) == 8, which is a very commonly found setting).

It's used to allocate memory at run-time rather than compile-time. So if your data arrays are based on some sort of input from the user, database, file, etc. then malloc must be used once the desired size is known.
The variable q is a pointer, meaning it stores an address in memory. malloc is asking the system to create a section of memory and return the address of that section of memory, which is stored in q. So q points to the starting location of the memory you requested.
Care must be taken not to alter q unintentionally. For instance, if you did:
q = (double *)malloc(20*sizeof(double));
q = (double *)malloc(10*sizeof(double));
you will lose access to the first block of 20 doubles and introduce a memory leak.
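To avoid the leak described above, the old block has to be released before its address is overwritten. A minimal sketch, assuming a helper named replace_buffer (a hypothetical name, not a library function):

```c
#include <stdlib.h>

/* Hypothetical helper: swaps the block held in *q for a fresh one that can
   hold new_count doubles. Returns 0 on success, or -1 if the allocation
   fails, in which case *q is left untouched. */
int replace_buffer(double **q, size_t new_count)
{
    double *fresh = malloc(new_count * sizeof *fresh);
    if (fresh == NULL) {
        return -1;      /* the old block is still valid and still owned by the caller */
    }
    free(*q);           /* release the old block before losing its address */
    *q = fresh;
    return 0;
}
```

A caller would write replace_buffer(&q, 10) instead of assigning a second malloc result straight to q.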

When you use malloc you are asking the system "Hey, I want this many bytes of memory" and then it will either say "Sorry, I'm all out" or "Ok! Here is an address to the memory you wanted. Don't lose it".
It's generally a good idea to put big datasets on the heap (where malloc gets your memory from) and keep only a pointer to that memory on the stack (where local variables and function call frames live). This becomes more important on embedded platforms where you have limited memory. You have to decide how you want to divvy up the physical memory between the stack and heap. Too much stack and you can't dynamically allocate much memory. Too little stack and you can function call your way right out of it (also known as a stack overflow :P)

As the others said, malloc is used to allocate memory. It is important to note that malloc will allocate memory from the heap, and thus the memory is persistent until it is free'd. Otherwise, without malloc, declaring something like double vals[20] will allocate memory on the stack. When you exit the function, that memory is popped off of the stack.
So for example, say you are in a function and you don't care about the persistence of values. Then the following would be suitable:
void some_function() {
double vals[20];
for(int i = 0; i < 20; i++) {
vals[i] = (double)rand();
}
}
Now if you have some global structure or something that stores data, that has a lifetime longer than that of just the function, then using malloc to allocate that memory from the heap is required (alternatively, you can declare it as a global variable, and the memory will be preallocated for you).

In your example, you could have declared double q[20]; without the malloc and it would work.
malloc is a standard way to get dynamically allocated memory (malloc is often built above low-level memory acquisition primitives like mmap on Linux).
You want to get dynamically allocated memory resources, notably when the size of the allocated thing (here, your q pointer) depends upon runtime parameters (e.g. depends upon input). The bad alternative would be to allocate all statically, but then the static size of your data is a strong built-in limitation, and you don't like that.
Dynamic resource allocation enables you to run the same program on a cheap tablet (with half a gigabyte of RAM) and an expensive super-computer (with terabytes of RAM). You can allocate different size of data.
Don't forget to test the result of malloc; it can fail by returning NULL. At the very least, code:
int* q = malloc (10*sizeof(int));
if (!q) {
    perror("q allocation failed");
    exit(EXIT_FAILURE);
}
and always initialize malloc-ed memory (you could prefer using calloc which zeroes the allocated memory).
Don't forget to later free the malloc-ed memory. On Linux, learn about using valgrind. Be scared of memory leaks and dangling pointers. Recognize that the liveness of some data is a non-modular property of the entire program. Read about garbage collection, and perhaps consider using Boehm's conservative garbage collector (by calling GC_malloc instead of malloc).
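To illustrate the initialization advice above, here is a small sketch of the two usual options: calloc, which hands back zeroed memory, and malloc followed by an explicit fill. The function names are made up for this example.

```c
#include <stdlib.h>

/* Hypothetical helper: n ints, all zero, or NULL on failure.
   calloc takes the element count and the element size separately. */
int *make_zeroed_ints(size_t n)
{
    return calloc(n, sizeof(int));
}

/* Hypothetical helper: n ints, each set to value, or NULL on failure.
   malloc returns uninitialized memory, so every element is written
   before anyone gets a chance to read it. */
int *make_filled_ints(size_t n, int value)
{
    int *v = malloc(n * sizeof *v);
    if (v == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        v[i] = value;
    }
    return v;
}
```

In both cases the caller owns the returned block and has to free() it.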

You use malloc() to allocate memory dynamically in C. (Allocate the memory at the run time)
You use it because sometimes you don't know how much memory you'll use when you write your program.
You don't have to use it when you know how many elements the array will hold at compile time.
Another important thing to notice is that if you want to return an array from a function, you cannot return an array that was defined inside the function on the stack. Instead, you'll want to dynamically allocate an array (on the heap) and return a pointer to this block:
int *returnArray(int n)
{
int i;
int *arr = (int *)malloc(sizeof(int) * n);
if (arr == NULL)
{
return NULL;
}
//...
//fill the array or manipulate it
//...
return arr; //return the pointer
}

Related

Why does malloc need to be used for dynamic memory allocation in C?

I have been reading that malloc is used for dynamic memory allocation. But if the following code works...
int main(void) {
int i, n;
printf("Enter the number of integers: ");
scanf("%d", &n);
// Dynamic allocation of memory?
int int_arr[n];
// Testing
for (int i = 0; i < n; i++) {
int_arr[i] = i * 10;
}
for (int i = 0; i < n; i++) {
printf("%d ", int_arr[i]);
}
printf("\n");
}
... what is the point of malloc? Isn't the code above just a simpler-to-read way to allocate memory dynamically?
I read on another Stack Overflow answer that if some sort of flag is set to "pedantic", then the code above would produce a compile error. But that doesn't really explain why malloc might be a better solution for dynamic memory allocation.
Look up the concepts for stack and heap; there's a lot of subtleties around the different types of memory. Local variables inside a function live in the stack and only exist within the function.
In your example, int_arr exists only until the function it is defined in returns, so you couldn't pass it back to a caller. You couldn't return int_arr and expect it to work.
malloc() is used when you want to create a chunk of memory which exists on the heap. malloc returns a pointer to this memory. This pointer can be passed around as a variable (eg returned) from functions and can be used anywhere in your program to access your allocated chunk of memory until you free() it.
Example:
```c
#include <stdlib.h>

int *make_array(int length);
int *make_array_wrong(int length);

int main(int argc, char **argv){
    int length = 10;
    int *built_array = make_array(length); //malloc memory and pass heap pointer
    int *array = make_array_wrong(length); //will not work. Array in function was in stack and no longer exists when function has returned.
    built_array[3] = 5; //ok
    array[3] = 5; //bad: undefined behavior, shown only to illustrate the mistake
    free(built_array);
    return 0;
}
int *make_array(int length){
    int *my_pointer = malloc(length * sizeof(int));
    //do some error checking for real implementation
    return my_pointer;
}
int *make_array_wrong(int length){
    int array[length];
    return array; //returns the address of a local array that is about to disappear
}
```
Note:
There are plenty of ways to avoid having to use malloc at all, by pre-allocating sufficient memory in the callers, etc. This is recommended for embedded and safety critical programs where you want to be sure you'll never run out of memory.
Just because something looks prettier does not make it a better choice.
VLAs have a long list of problems, not the least of which they are not a sufficient replacement for heap-allocated memory.
The primary -- and most significant -- reason is that VLAs are not persistent dynamic data. That is, once your function terminates, the data is reclaimed (it exists on the stack, of all places!), meaning any other code still hanging on to it is out of luck.
Your example code doesn't run into this problem because you aren't using it outside of the local context. Go ahead and try to use a VLA to build a binary tree, then add a node, then create a new tree and try to print them both.
The next issue is that the stack is not an appropriate place to allocate large amounts of dynamic data -- it is for function frames, which have a limited space to begin with. The global memory pool, OTOH, is specifically designed and optimized for this kind of usage.
It is good to ask questions and try to understand things. Just be careful that you don't believe yourself smarter than the many, many people who took what now is nearly 80 years of experience to design and implement systems that quite literally run the known universe. Such an obvious flaw would have been immediately recognized long, long ago and removed before either of us were born.
VLAs have their place, but it is, alas, small.
Declaring local variables takes the memory from the stack. This has two ramifications.
That memory is destroyed once the function returns.
Stack memory is limited, and is used for all local variables, as well as function return addresses. If you allocate large amounts of memory, you'll run into problems. Only use it for small amounts of memory.
When you have the following in your function code:
int int_arr[n];
It means you allocated space on the function's stack; once the function returns, that space ceases to exist.
Imagine a use case where you need to return a data structure to a caller, for example:
Car* create_car(string model, string make)
{
Car* new_car = malloc(sizeof(*new_car));
...
return new_car;
}
Now, once the function finishes, you will still have your car object, because it was allocated on the heap.
The memory allocated by int int_arr[n] is reserved only until execution of the routine ends (when it returns or is otherwise terminated, as by longjmp). That means you cannot allocate things in one order and free them in another. You cannot allocate a temporary work buffer, use it while computing some data, then allocate another buffer for the results, and free the temporary work buffer. To free the work buffer, you have to return from the function, and then the result buffer will be freed too.
With automatic allocations, you cannot read from a file, allocate records for each of the things read from the file, and then delete some of the records out of order. You simply have no dynamic control over the memory allocated; automatic allocations are forced into a strictly last-in first-out (LIFO) order.
You cannot write subroutines that allocate memory, initialize it and/or do other computations, and return the allocated memory to their callers.
(Some people may also point out that the stack memory commonly used for automatic objects is commonly limited to 1-8 mebibytes while the memory used for dynamic allocation is generally much larger. However, this is an artifact of settings selected for common use and can be changed; it is not inherent to the nature of automatic versus dynamic allocation.)
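The ordering point above can be made concrete with a short sketch: a work buffer is allocated first and freed first, while the result buffer allocated after it survives the return. The function squares is a hypothetical example, not taken from the question.

```c
#include <stdlib.h>

/* Hypothetical example: returns a newly allocated array holding the squares
   of 0..n-1, or NULL on failure. The caller owns the result and must
   free() it. */
int *squares(size_t n)
{
    int *work = malloc(n * sizeof *work);       /* temporary buffer, allocated first */
    if (work == NULL) {
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        work[i] = (int)i;
    }

    int *result = malloc(n * sizeof *result);   /* result buffer, allocated second */
    if (result == NULL) {
        free(work);
        return NULL;
    }
    for (size_t i = 0; i < n; i++) {
        result[i] = work[i] * work[i];
    }

    free(work);       /* freed before the younger block: not last-in first-out */
    return result;    /* still valid after this function has returned */
}
```

With two automatic arrays this would not be possible, since both would disappear together when the function returns.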
If the allocated memory is small and used only inside the function, malloc is indeed unnecessary.
If the memory amount is extremely large (usually MB or more), the above example may cause stack overflow.
If the memory is still used after the function has returned, you need malloc or a global variable (static allocation).
Note that dynamic allocation through local variables as above may not be supported by some compilers.

Correct use of free() when deallocating a 2d matrix in c

I'm just starting to learn coding in c, and I have a few questions regarding 2d matrices in combination with the free() command.
I know that you first need to create an array of pointers, pointing to the different columns of the matrix:
double **array = (double **)malloc(5*sizeof(double *));
for(int n = 0; n<5; n++){
    array[n] = (double *) malloc(6*sizeof(double));
}
I know that the correct way to then deallocate this matrix is to first deallocate the individual rows and then the array pointer itself. Something along the lines of:
for (int i = 0; i < 5; i++){
    free(array[i]);
}
free(array);
My question is: Why is this necessary? I know that this is incorrect, but why can't you just use: free(array)? This would deallocate the pointer array, to my understanding. Won't the memory that is used by the columns just be overwritten when something else needs access to it? Would free(array) lead to corrupted memory in any way?
Any help is much appreciated!
Your code not only allocates memory for the array of pointers (the blue array); in the for loop, it also allocates memory for the arrays those pointers refer to (the red arrays). So free(array) alone will free only the memory of the blue array, not the red ones. You need to free the red ones just before losing contact with them, that is, before freeing the blue array.
And by the way:
Won't the memory that is used by the columns just be overwritten when something else needs access to it?
No. The operating system keeps track of the memory allocated by your process (program) and will not allow any other process to access it until your process terminates. Under normal circumstances (remember that C has no garbage collector) the OS never learns that you have lost your connection to the allocated memory, and it will never decide on its own that "this memory is not useful for this process anymore, so let's deallocate it and hand it to another process."
It would not lead to corruption, no, but would create a memory leak.
If done once in your program, it probably doesn't matter much (a lot of professional/expensive applications have small, unintentional memory leaks), but repeat this in a loop, and you may run out of memory after a while. The same applies if your code is called from an external program (if your code is in a library).
Aside: Not freeing buffers can be a useful way (temporarily) to check if the crashes you're getting in your programs originate from corrupt memory allocation or deallocation (when you cannot use Valgrind). But in the end you want to free everything, once.
If you want to perform only one malloc, you could also allocate one big chunk, then compute the addresses of the rows. In the end, just deallocate the big chunk (example here: How do we allocate a 2-D array using One malloc statement)
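A rough sketch of the single-allocation idea, assuming a row-major layout and two hypothetical helper names (matrix_alloc and matrix_row): all elements live in one block, and the address of each row is computed from the row index.

```c
#include <stdlib.h>

/* Hypothetical helper: one contiguous, uninitialized block holding
   rows * cols doubles. Returns NULL on failure. Release it with a
   single free(). */
double *matrix_alloc(size_t rows, size_t cols)
{
    return malloc(rows * cols * sizeof(double));
}

/* Address of the first element of row i, so that matrix_row(m, cols, i)[j]
   is element (i, j) of the matrix. */
double *matrix_row(double *m, size_t cols, size_t i)
{
    return m + i * cols;
}
```

For the 5 by 6 matrix in the question this would be double *m = matrix_alloc(5, 6); and, at the end, one free(m); with no loop.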
This is needed because C does not have a garbage collector.
Once you allocate memory with malloc or similar function, it is marked as "in use" for as long as your program is running.
It does not matter if you no longer hold a pointer to this memory in your program.
There is no mechanism in the C language to check this and automatically free the memory.
Also, when you allocate memory with malloc the function does not know what you are using the memory for. For the allocator it is all just bytes.
So when you free a pointer (or array of pointers), there is no logic to "realize" these are pointers that contain memory addresses.
This is simply how the C language is designed: dynamic memory management is almost [1] completely manual and left to the programmer, so you must call free for every call to malloc.
[1] The C library does handle some of the more tedious tasks needed to dynamically allocate memory in a program, such as finding a free contiguous chunk of memory of the size you asked for.
Let's take this simple example:
int **ptr = malloc(2*sizeof *ptr);
int *foo = malloc(sizeof *foo);
int *bar = malloc(sizeof *bar);
ptr[0] = foo;
ptr[1] = bar;
free(ptr);
If your suggestion were implemented, foo and bar would now be dangling pointers. How would you solve the scenario if you just want to free ptr?

C: How do I initialize a global array when size is not known until runtime?

I am writing some code in C (not C99) and I think I have a need for several global arrays. I am taking in data from several text files I don't yet know the size of, and I need to store these values and have them available in several different methods. I already have written code for reading the text files into an array, but if an array isn't the best choice I am sure I could rewrite it.
If you had encountered this situation, what would you do? I don't necessarily need code examples, just ideas.
Use dynamic allocation:
int* pData;
char* pData2;
int main() {
...
pData = malloc(count * sizeof *pData); // uninitialized
pData2 = calloc(count, sizeof *pData2); // zero-initialized
/* work on your arrays */
free(pData);
free(pData2);
...
}
First of all, try to make sense of the requirement. You cannot possibly initialize a memory of "unknown" size, you can only have it initialized once you have a certain amount of memory (in terms of bytes). So, the first thing is to get the memory allocated.
This is the scenario to use memory allocator functions, malloc() and family, which allows you to allocate memory of a given size at run-time. Define a pointer, then, at run-time, get the memory size and use the allocator functions to allocate the memory of required size.
That said,
calloc() initializes the returned memory to 0.
realloc() is used to re-size the memory at run-time.
Also, while using dynamic memory allocation, you should be careful to release the allocated memory with free() when you're done using it, to avoid memory leaks.
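Putting the pieces above together, here is one possible sketch of a global array whose size is not known until the data has been read. It grows with realloc as values arrive. All names (g_data, g_count, g_cap, append_value, clear_values) are invented for this example, and the starting capacity of 16 is arbitrary. The code sticks to C89 features, since the question rules out C99.

```c
#include <stdlib.h>

/* Hypothetical global storage, visible to every function in this file. */
static int *g_data = NULL;   /* start of the array, NULL while empty */
static size_t g_count = 0;   /* number of values stored so far */
static size_t g_cap = 0;     /* number of values the block can hold */

/* Appends v, growing the block when it is full.
   Returns 0 on success, -1 if memory runs out (existing data is kept). */
int append_value(int v)
{
    if (g_count == g_cap) {
        size_t new_cap = (g_cap == 0) ? 16 : g_cap * 2;
        int *tmp = realloc(g_data, new_cap * sizeof *tmp);
        if (tmp == NULL) {
            return -1;       /* g_data is still valid */
        }
        g_data = tmp;
        g_cap = new_cap;
    }
    g_data[g_count++] = v;
    return 0;
}

/* Releases the array and resets the bookkeeping. */
void clear_values(void)
{
    free(g_data);
    g_data = NULL;
    g_count = 0;
    g_cap = 0;
}
```

The file-reading loop would call append_value once per value read, and the rest of the program can then use g_data[0] through g_data[g_count - 1].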

Freeing portions of dynamically allocated blocks?

I was curious whether there exists a dynamic memory allocation system that allows the programmer to free part of an allocated block.
For example:
char* a = malloc (40);
//b points to the split second half of the block, or to NULL if it's beyond the end
//a points to an area of 10 bytes
b = partial_free (a+10, /*size*/ 10);
Thoughts on why this is wise/unwise/difficult? Ways to do this?
Seems to me like it could be useful.
Thanks!
=====edit=====
after some research, it seems that the bootmem allocator for the linux kernel allows something similar to this operation with the bootmem_free call. So, I'm curious -- why is it that the bootmem allocator allows this, but ANSI C does not?
No, there is no such function which allows partial freeing of memory.
You could however use realloc() to resize memory.
From the c standard:
7.22.3.5 The realloc function
#include <stdlib.h>
void *realloc(void *ptr, size_t size);
The realloc function deallocates the old object pointed to by ptr and returns a pointer to a new object that has the size specified by size. The contents of the new object shall be the same as that of the old object prior to deallocation, up to the lesser of the new and old sizes. Any bytes in the new object beyond the size of the old object have indeterminate values.
There is no ready-made function for this, but doing this isn't impossible. Firstly, there is realloc() . realloc takes a pointer to a block of memory and resizes the allocation to the size specified.
Now, if you have allocated some memory:
char * tmp = malloc(2048);
and you intend to deallocate the last 1 K of memory, you may do:
tmp = realloc(tmp, 2048-1024);
However, the problem in this case is that you cannot be certain that tmp will remain unchanged, since the function might deallocate the entire 2 K block and move its contents elsewhere.
Now I'm not sure about the exact implementation of realloc, but from what I understand, the code:
myptr = realloc(myptr, x - y);
may allocate a new memory buffer of size x-y, copy the bytes that fit using memcpy, and finally free the originally allocated memory.
This may create some potential problems. For example, the reallocated memory may be located at a different address, so any older pointers into the block may become invalid, resulting in undefined behavior, segmentation faults and general debugging hell. So I would try to avoid resorting to this.
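If realloc is used anyway, the usual precaution is to store its result in a temporary pointer first, so that the original block is not lost when the call fails. A minimal sketch, assuming a hypothetical helper named resize_buffer and a new_size greater than zero:

```c
#include <stdlib.h>

/* Hypothetical helper: resizes the block held in *buf to new_size bytes
   (new_size is assumed to be greater than zero).
   Returns 0 on success. On failure returns -1 and leaves *buf valid. */
int resize_buffer(char **buf, size_t new_size)
{
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL) {
        return -1;      /* *buf still points to the original, unchanged block */
    }
    *buf = tmp;         /* the block may have moved: use only the new address */
    return 0;
}
```

Any other pointers into the old block still have to be recomputed from the new address after a successful call.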
Firstly, I cannot think of any situation where you would be likely to need such a thing (when there exists realloc to increase/decrease the memory as mentioned in the answers).
I would like to add another thing. In the implementations of the malloc subsystem that I have seen (which I admit is not a lot), malloc and free depend on so-called prefix bytes. For whatever address malloc returns to you, the malloc subsystem internally allocates some additional bytes of memory just before that address to store sanity-check information, which includes the number of allocated bytes and possibly the allocation policy in use (if your OS supports multiple memory allocation policies). When you call free on a pointer, the malloc subsystem peeks at the prefix bytes just before it, and only if it finds a valid prefix there does the free succeed. Therefore, it will not allow you to free a range that starts in the middle of a block.

how is dynamic memory allocation better than array?

int *numbers;
numbers = malloc ( sizeof(int) * 10 );
I want to know how this is dynamic memory allocation if I can store just 10 int items in the memory block. I could just use an array and store elements dynamically using an index. Why is the above approach better?
I am new to C, and this is my 2nd day and I may sound stupid, so please bear with me.
In this case you could replace 10 with a variable that is assigned at run time. That way you can decide how much memory space you need. But with arrays, you have to specify an integer constant in the declaration. So you cannot know whether the user will actually need as many locations as were declared or, even worse, whether they will be enough.
With a dynamic allocation like this, you could assign a larger memory location and copy the contents of the first location to the new one to give the impression that the array has grown as needed.
This helps to ensure optimum memory utilization.
The main reason why malloc() is useful is not because the size of the array can be determined at runtime - modern versions of C allow that with normal arrays too. There are two reasons:
Objects allocated with malloc() have flexible lifetimes;
That is, you get runtime control over when to create the object, and when to destroy it. The array allocated with malloc() exists from the time of the malloc() call until the corresponding free() call; in contrast, declared arrays either exist until the function they're declared in exits, or until the program finishes.
malloc() reports failure, allowing the program to handle it in a graceful way.
On a failure to allocate the requested memory, malloc() can return NULL, which allows your program to detect and handle the condition. There is no such mechanism for declared arrays - on a failure to allocate sufficient space, either the program crashes at runtime, or fails to load altogether.
There is a difference with where the memory is allocated. Using the array syntax, the memory is allocated on the stack (assuming you are in a function), while malloc'ed arrays/bytes are allocated on the heap.
/* Allocates 1000 * sizeof(int) bytes (typically 4*1000) on the stack (which might be a bit much depending on your system) */
int a[1000];
/* Allocates 1000 * sizeof(int) bytes (typically 4*1000) on the heap */
int *b = malloc(1000 * sizeof(int));
Stack allocations are fast - and often preferred when:
"Small" amount of memory is required
Pointer to the array is not to be returned from the function
Heap allocations are slower, but have these advantages:
Available heap memory is (normally) much larger than available stack memory
You can freely pass the pointer to the allocated bytes around, e.g. returning it from a function -- just remember to free it at some point.
A third option is to use statically allocated arrays if you have some common task that always requires an array of some maximum size. Provided you can spare the memory permanently consumed by the array, you avoid the cost of heap allocation, gain the flexibility to pass the pointer around, and avoid having to keep track of ownership of the pointer to ensure the memory is freed.
Edit: If you are using C99 (default with the gnu c compiler i think?), you can do variable-length stack arrays like
int a = 4;
int b[a*a];
In the example you gave
int *numbers;
numbers = malloc ( sizeof(int) * 10 );
there are no explicit benefits. Though, imagine 10 is a value that changes at runtime (e.g. user input), and that you need to return this array from a function. E.g.
int *aFunction(size_t howMany, ...)
{
int *r = malloc(sizeof(int)*howMany);
// do something, fill the array...
return r;
}
The malloc takes room from the heap, while something like
int *aFunction(size_t howMany, ...)
{
int r[howMany];
// do something, fill the array...
// you can't return r unless you make it static, but this is in general
// not good
return somethingElse;
}
would consume stack space, and the stack is usually much smaller than the heap.
More complex examples exist. E.g. if you have to build a binary tree that grows according to some computation done at runtime, you basically have no other choice but to use dynamic memory allocation.
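As a sketch of the binary tree case mentioned above, each node can be allocated with malloc at the moment it is needed, so the tree grows with the data. The struct and function names are invented for this example, and duplicate keys are simply ignored.

```c
#include <stdlib.h>

/* Hypothetical node type for a binary search tree of ints. */
struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Inserts key and returns the (possibly new) root of the subtree.
   If malloc fails, the tree is returned unchanged. */
struct node *tree_insert(struct node *root, int key)
{
    if (root == NULL) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL) {
            return NULL;
        }
        n->key = key;
        n->left = NULL;
        n->right = NULL;
        return n;
    }
    if (key < root->key) {
        root->left = tree_insert(root->left, key);
    } else if (key > root->key) {
        root->right = tree_insert(root->right, key);
    }
    return root;
}

/* Frees every node: children first, then the node itself. */
void tree_free(struct node *root)
{
    if (root == NULL) {
        return;
    }
    tree_free(root->left);
    tree_free(root->right);
    free(root);
}
```

Each node stays alive until tree_free is called, no matter which function created it.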
Array size is defined at compilation time whereas dynamic allocation is done at run time.
Thus, in your case, you can use your pointer as an array : numbers[5] is valid.
If you don't know the size of your array when writing the program, runtime allocation is the only choice. Otherwise, you're free to use an array, which might be simpler (less risk of forgetting to free memory, for example).
Example:
to store a 3-D position, you might want to use an array as it's always 3 coordinates
to create a sieve to calculate prime numbers, you might want to use a parameter to give the max value and thus use dynamic allocation to create the memory area
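The sieve case could look roughly like this: the upper bound arrives as a parameter, so the flag array is sized at run time with calloc. The function name count_primes is an assumption made for the example.

```c
#include <stdlib.h>

/* Hypothetical helper: counts the primes <= max with a sieve of
   Eratosthenes whose size is chosen at run time.
   Returns -1 if the allocation fails. */
long count_primes(size_t max)
{
    unsigned char *composite;
    size_t i, j;
    long count = 0;

    if (max < 2) {
        return 0;
    }
    composite = calloc(max + 1, 1);   /* one zeroed flag per number 0..max */
    if (composite == NULL) {
        return -1;
    }
    for (i = 2; i * i <= max; i++) {
        if (!composite[i]) {
            for (j = i * i; j <= max; j += i) {
                composite[j] = 1;     /* mark every multiple of the prime i */
            }
        }
    }
    for (i = 2; i <= max; i++) {
        if (!composite[i]) {
            count++;
        }
    }
    free(composite);
    return count;
}
```

For instance, count_primes(100) returns 25.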
Array is used to allocate memory statically and in one go.
To allocate memory dynamically malloc is required.
e.g. int numbers[10];
This will allocate memory statically and it will be contiguous memory.
If you are not aware of the count of the numbers then use variable like count.
int count;
int *numbers;
scanf("%d", &count);
numbers = malloc ( sizeof(int) * count );
This is not possible with a fixed-size array declaration.
Dynamic does not refer to the access. What is dynamic is the size passed to malloc. If you just use a constant number, like 10 in your example, it is no better than an array. The advantage comes when you don't know in advance how big it must be, e.g. because the user can enter the size at runtime. Then you can allocate with a variable, e.g. malloc(sizeof(int) * userEnteredNumber). This is not possible with a plain array, as you have to know the (maximum) size at compile time.

Resources