Related
What's the best practice when having a user defined array? By user defined, I mean size as well as the element values.
Option A:
Predefine an array of size (lets say) 100, then ask the user how many elements they would like in the array, knowing it will be less than what I have defined.
int array [100];
printf("Input the number of elements to be stored in the array: ");
scanf("%d", &numElements);
Option B:
Declare the array after I ask the user how many elements.
printf("Input the number of elements to be stored in the array: ");
scanf("%d", &numElements);
int array [numElements];
With option A, it could take up unnecessary memory, but I'm not sure when the cons are with Option B, would it be runtime?
As it is tagged C, I will try to answer in the scope of C language.
The second case I think is more prefered than the first one. First of all, for the first case, your array will have constant size and you cannot do realloc on it if a user gives input let's say more than 100, as you will get an error that int[100] is not assignable. For the second case, it is assumed that the given input is the sufficient size to create a constant size array because for the same reasons you cannot realloc to change the size of the array but at least you know the input is given by the user.
My suggestion would be to use a dynamic array which is a bit harder to manipulate as you may have memory leaks, for example, when the elements in your dynamic array are not primitive types but structs or other types that require memory allocation.
However, using dynamic array, you can realloc the size to make it bigger or smaller to save some memory space.
I am new to Stack Overflow so maybe your question is something deeper that my answer will not be enough. BTW, I hope it will give you some hint.
P.S.
Static arrays are always faster to be used, so if you are sure that the number of elements will not be more than a certain number, then it is better to use constant size array.
if using C++ :
{
The best practice is to use the std::vector from the Standard Template Library(STL).
Use reserve() and resize() methods to allocate fixed required space or push_back() and pop_back() to resize according to each insertion and deletion. More about vectors here.
}
else if using C :
{
Use dynamic memory allocation to change the size during runtime using malloc(), calloc(), free() and realloc() defined in <stdlib.h>.
malloc() is used to dynamically allocate a single large block of memory with the specified size. It returns a pointer of type void which can be cast into a pointer of any form.
Syntax:
ptr = (cast-type*) malloc(nbyte-size);
calloc() is used to dynamically allocate the specified number of blocks of memory of the specified type. It initializes each block with a default value ‘0’.
Syntax:
ptr = (cast-type*)calloc(n, element-size);
where n is the number of elements required.
free() is used to dynamically de-allocate the memory. The memory allocated using functions malloc() and calloc() is not de-allocated on their own. Hence the free() method is used, whenever the dynamic memory allocation takes place.
Syntax:
free(ptr);
realloc() is used to dynamically change the memory allocation of a previously allocated memory. If the memory previously allocated with the help of malloc() or calloc() is insufficient, realloc() can be used to dynamically re-allocate memory.
Syntax:
ptr = realloc(ptr, newSize);
}
In C, we can input the size of an array (at runtime) from the user by the concept of dynamic memory allocation. But we can also use
int n;
scanf("%d",&n);
int a[n];
So what is the need of using pointers for dynamic memory allocation using new?
What you have shown is called variable length array supported from C99.
Yes based on the input you are allocating memory. What if you want to extend the allocated memory.
Don't you need pointers now? In order to do realloc() . This is one scenario I can think of but we need pointers for dynamic memory allocation.
C doesn't have new so my answer is specific to C which has malloc() and family functions
If you have a function to allocate memory dynamically say
int *alloc_memory()
{
int n;
scanf("%d",&n);
int a[n];
// Fill values to array and do
return a;
}
Now this will lead to undefined behavior as the allocated memory just has scope of the function. Pointers are useful for this purpose
int *alloc_memory()
{
int n;
scanf("%d",&n);
int *p = malloc(sizeof(int) * n);
// Fill values
return p;
}
The point is VLA doesn't provide the flexibility which dynamic memory allocation by pointers provide you.
Variable length array came into existence after C99 standard. Before that, there was no concept for VLA. Please note, moving forward, since C11, this has been changed to an optional feature.
OTOH, dynamic memory allocation using malloc()## and family was there from long back.
That said, the VLA is still an array and usually, an array and a pointer are not the same. Array holds type and size information, while a pointer does not have any size information.
Also, FWIW, the array size can be defined at runtime, but that does not change the scope and lifetime as compared to normal arrays. Just using VLA approach does not change the lifetime of an otherwise automatic array to global or something else.
## There is no new keyword in C. GLIBC provides malloc() and family of APIs to handle dynamic memory allocation.
Using a VLA is conceptually similar to calling alloca to allocate automatic mmory.
A few differences between a Variable Length Array (VLA) and dynamically allocated memory using malloc:
1) The VLA is an automatic variable that will cease to exist when your function returns. Whereas dynamically allocated memory with malloc will exist until free is called or your program exits.
2) For data types other than arrays, such as structs, you probably want allocated with malloc.
3) The VLA is typically stored on the stack (although this is not strictly required by the C99 specification), whereas dynamically allocated memory with malloc is stored on the heap.
You are not doing any "dynamic memory allocation", i.e. allocation of memory with dynamic lifetime. You are using "variable-length arrays" in C99. But it's still a local variable, with "automatic storage duration", which means the variable's lifetime is the scope in which it was declared.
Today I was helping a friend of mine with some C code, and I've found some strange behavior that I couldn't explain him why it was happening. We had TSV file with a list of integers, with an int each line. The first line was the number of lines the list had.
We also had a c file with a very simple "readfile". The first line was read to n, the number of lines, then there was an initialization of:
int list[n]
and finally a for loop of n with a fscanf.
For small n's (till ~100.000), everything was fine. However, we've found that when n was big (10^6), a segfault would occur.
Finally, we changed the list initialization to
int *list = malloc(n*sizeof(int))
and everything when well, even with very large n.
Can someone explain why this occurred? what was causing the segfault with int list[n], that was stopped when we start using list = malloc(n*sizeof(int))?
There are several different pieces at play here.
The first is the difference between declaring an array as
int array[n];
and
int* array = malloc(n * sizeof(int));
In the first version, you are declaring an object with automatic storage duration. This means that the array lives only as long as the function that calls it exists. In the second version, you are getting memory with dynamic storage duration, which means that it will exist until it is explicitly deallocated with free.
The reason that the second version works here is an implementation detail of how C is usually compiled. Typically, C memory is split into several regions, including the stack (for function calls and local variables) and the heap (for malloced objects). The stack typically has a much smaller size than the heap; usually it's something like 8MB. As a result, if you try to allocate a huge array with
int array[n];
Then you might exceed the stack's storage space, causing the segfault. On the other hand, the heap usually has a huge size (say, as much space as is free on the system), and so mallocing a large object won't cause an out-of-memory error.
In general, be careful with variable-length arrays in C. They can easily exceed stack size. Prefer malloc unless you know the size is small or that you really only do want the array for a short period of time.
int list[n]
Allocates space for n integers on the stack, which is usually pretty small. Using memory on the stack is much faster than the alternative, but it is quite small and it is easy to overflow the stack (i.e. allocate too much memory) if you do things like allocate huge arrays or do recursion too deeply. You do not have to manually deallocate memory allocated this way, it is done by the compiler when the array goes out of scope.
malloc on the other hand allocates space in the heap, which is usually very large compared to the stack. You will have to allocate a much larger amount of memory on the heap to exhaust it, but it is a lot slower to allocate memory on the heap than it is on the stack, and you must deallocate it manually via free when you are done using it.
int list[n] stores the data in the stack, while malloc stores it in the heap.
The stack is limited, and there is not much space, while the heap is much much bigger.
int list[n] is a VLA, which allocates on the stack instead of on the heap. You don't have to free it (it frees automatically at the end of the function call) and it allocates quickly but the storage space is very limited, as you have discovered. You must allocate larger values on the heap.
This declaration allocates memory on the stack
int list[n]
malloc allocates on the heap.
Stack size is usually smaller than heap, so if you allocate too much memory on the stack you get a stackoverflow.
See also this answer for further information
Assuming you have a typical implementation in your implementation it's most likely that:
int list[n]
allocated list on your stack, where as:
int *list = malloc(n*sizeof(int))
allocated memory on your heap.
In the case of a stack there is typically a limit to how large these can grow (if they can grow at all). In the case of a heap there is still a limit, but that tends to be much largely and (broadly) constrained by your RAM+swap+address space which is typically at least an order of magnitude larger, if not more.
If you are on linux, you can set ulimit -s to a larger value and this might work for stack allocation also.
When you allocate memory on stack, that memory remains till the end of your function's execution. If you allocate memory on heap(using malloc), you can free the memory any time you want(even before the end of your function's execution).
Generally, heap should be used for large memory allocations.
When you allocate using a malloc, memory is allocated from heap and not from stack, which is much more limited in size.
int array[n];
It is an example of statically allocated array and at the compile time the size of the array will be known. And the array will be allocated on the stack.
int *array(malloc(sizeof(int)*n);
It is an example of dynamically allocated array and the size of the array will be known to user at the run time. And the array will be allocated on the heap.
Today I was helping a friend of mine with some C code, and I've found some strange behavior that I couldn't explain him why it was happening. We had TSV file with a list of integers, with an int each line. The first line was the number of lines the list had.
We also had a c file with a very simple "readfile". The first line was read to n, the number of lines, then there was an initialization of:
int list[n]
and finally a for loop of n with a fscanf.
For small n's (till ~100.000), everything was fine. However, we've found that when n was big (10^6), a segfault would occur.
Finally, we changed the list initialization to
int *list = malloc(n*sizeof(int))
and everything when well, even with very large n.
Can someone explain why this occurred? what was causing the segfault with int list[n], that was stopped when we start using list = malloc(n*sizeof(int))?
There are several different pieces at play here.
The first is the difference between declaring an array as
int array[n];
and
int* array = malloc(n * sizeof(int));
In the first version, you are declaring an object with automatic storage duration. This means that the array lives only as long as the function that calls it exists. In the second version, you are getting memory with dynamic storage duration, which means that it will exist until it is explicitly deallocated with free.
The reason that the second version works here is an implementation detail of how C is usually compiled. Typically, C memory is split into several regions, including the stack (for function calls and local variables) and the heap (for malloced objects). The stack typically has a much smaller size than the heap; usually it's something like 8MB. As a result, if you try to allocate a huge array with
int array[n];
Then you might exceed the stack's storage space, causing the segfault. On the other hand, the heap usually has a huge size (say, as much space as is free on the system), and so mallocing a large object won't cause an out-of-memory error.
In general, be careful with variable-length arrays in C. They can easily exceed stack size. Prefer malloc unless you know the size is small or that you really only do want the array for a short period of time.
int list[n]
Allocates space for n integers on the stack, which is usually pretty small. Using memory on the stack is much faster than the alternative, but it is quite small and it is easy to overflow the stack (i.e. allocate too much memory) if you do things like allocate huge arrays or do recursion too deeply. You do not have to manually deallocate memory allocated this way, it is done by the compiler when the array goes out of scope.
malloc on the other hand allocates space in the heap, which is usually very large compared to the stack. You will have to allocate a much larger amount of memory on the heap to exhaust it, but it is a lot slower to allocate memory on the heap than it is on the stack, and you must deallocate it manually via free when you are done using it.
int list[n] stores the data in the stack, while malloc stores it in the heap.
The stack is limited, and there is not much space, while the heap is much much bigger.
int list[n] is a VLA, which allocates on the stack instead of on the heap. You don't have to free it (it frees automatically at the end of the function call) and it allocates quickly but the storage space is very limited, as you have discovered. You must allocate larger values on the heap.
This declaration allocates memory on the stack
int list[n]
malloc allocates on the heap.
Stack size is usually smaller than heap, so if you allocate too much memory on the stack you get a stackoverflow.
See also this answer for further information
Assuming you have a typical implementation in your implementation it's most likely that:
int list[n]
allocated list on your stack, where as:
int *list = malloc(n*sizeof(int))
allocated memory on your heap.
In the case of a stack there is typically a limit to how large these can grow (if they can grow at all). In the case of a heap there is still a limit, but that tends to be much largely and (broadly) constrained by your RAM+swap+address space which is typically at least an order of magnitude larger, if not more.
If you are on linux, you can set ulimit -s to a larger value and this might work for stack allocation also.
When you allocate memory on stack, that memory remains till the end of your function's execution. If you allocate memory on heap(using malloc), you can free the memory any time you want(even before the end of your function's execution).
Generally, heap should be used for large memory allocations.
When you allocate using a malloc, memory is allocated from heap and not from stack, which is much more limited in size.
int array[n];
It is an example of statically allocated array and at the compile time the size of the array will be known. And the array will be allocated on the stack.
int *array(malloc(sizeof(int)*n);
It is an example of dynamically allocated array and the size of the array will be known to user at the run time. And the array will be allocated on the heap.
int numbers*;
numbers = malloc ( sizeof(int) * 10 );
I want to know how is this dynamic memory allocation, if I can store just 10 int items to the memory block ? I could just use the array and store elemets dynamically using index. Why is the above approach better ?
I am new to C, and this is my 2nd day and I may sound stupid, so please bear with me.
In this case you could replace 10 with a variable that is assigned at run time. That way you can decide how much memory space you need. But with arrays, you have to specify an integer constant during declaration. So you cannot decide whether the user would actually need as many locations as was declared, or even worse , it might not be enough.
With a dynamic allocation like this, you could assign a larger memory location and copy the contents of the first location to the new one to give the impression that the array has grown as needed.
This helps to ensure optimum memory utilization.
The main reason why malloc() is useful is not because the size of the array can be determined at runtime - modern versions of C allow that with normal arrays too. There are two reasons:
Objects allocated with malloc() have flexible lifetimes;
That is, you get runtime control over when to create the object, and when to destroy it. The array allocated with malloc() exists from the time of the malloc() call until the corresponding free() call; in contrast, declared arrays either exist until the function they're declared in exits, or until the program finishes.
malloc() reports failure, allowing the program to handle it in a graceful way.
On a failure to allocate the requested memory, malloc() can return NULL, which allows your program to detect and handle the condition. There is no such mechanism for declared arrays - on a failure to allocate sufficient space, either the program crashes at runtime, or fails to load altogether.
There is a difference with where the memory is allocated. Using the array syntax, the memory is allocated on the stack (assuming you are in a function), while malloc'ed arrays/bytes are allocated on the heap.
/* Allocates 4*1000 bytes on the stack (which might be a bit much depending on your system) */
int a[1000];
/* Allocates 4*1000 bytes on the heap */
int *b = malloc(1000 * sizeof(int))
Stack allocations are fast - and often preferred when:
"Small" amount of memory is required
Pointer to the array is not to be returned from the function
Heap allocations are slower, but has the advantages:
Available heap memory is (normally) >> than available stack memory
You can freely pass the pointer to the allocated bytes around, e.g. returning it from a function -- just remember to free it at some point.
A third option is to use statically initialized arrays if you have some common task, that always requires an array of some max size. Given you can spare the memory statically consumed by the array, you avoid the hit for heap memory allocation, gain the flexibility to pass the pointer around, and avoid having to keep track of ownership of the pointer to ensure the memory is freed.
Edit: If you are using C99 (default with the gnu c compiler i think?), you can do variable-length stack arrays like
int a = 4;
int b[a*a];
In the example you gave
int *numbers;
numbers = malloc ( sizeof(int) * 10 );
there are no explicit benefits. Though, imagine 10 is a value that changes at runtime (e.g. user input), and that you need to return this array from a function. E.g.
int *aFunction(size_t howMany, ...)
{
int *r = malloc(sizeof(int)*howMany);
// do something, fill the array...
return r;
}
The malloc takes room from the heap, while something like
int *aFunction(size_t howMany, ...)
{
int r[howMany];
// do something, fill the array...
// you can't return r unless you make it static, but this is in general
// not good
return somethingElse;
}
would consume the stack that is not so big as the whole heap available.
More complex example exists. E.g. if you have to build a binary tree that grows according to some computation done at runtime, you basically have no other choices but to use dynamic memory allocation.
Array size is defined at compilation time whereas dynamic allocation is done at run time.
Thus, in your case, you can use your pointer as an array : numbers[5] is valid.
If you don't know the size of your array when writing the program, using runtime allocation is not a choice. Otherwise, you're free to use an array, it might be simpler (less risk to forget to free memory for example)
Example:
to store a 3-D position, you might want to use an array as it's alwaays 3 coordinates
to create a sieve to calculate prime numbers, you might want to use a parameter to give the max value and thus use dynamic allocation to create the memory area
Array is used to allocate memory statically and in one go.
To allocate memory dynamically malloc is required.
e.g. int numbers[10];
This will allocate memory statically and it will be contiguous memory.
If you are not aware of the count of the numbers then use variable like count.
int count;
int *numbers;
scanf("%d", count);
numbers = malloc ( sizeof(int) * count );
This is not possible in case of arrays.
Dynamic does not refer to the access. Dynamic is the size of malloc. If you just use a constant number, e.g. like 10 in your example, it is nothing better than an array. The advantage is when you dont know in advance how big it must be, e.g. because the user can enter at runtime the size. Then you can allocate with a variable, e.g. like malloc(sizeof(int) * userEnteredNumber). This is not possible with array, as you have to know there at compile time the (maximum) size.