Why would you ever want to have an array on the heap? My professor gave us two reasons:
To pass the array to functions, instead of passing a copy
So that the array outlives the scope
Can't both of these instead be solved by:
Passing a pointer to an array on the stack
Returning the value of the array instead of the array itself (i.e. use the copy constructor)
Could someone give me an example of where an array in the heap has to be used?
Arrays on the heap are used when the data must outlive the function's scope. Passing a pointer to an array on the stack is only valid going down the call chain: a callee may use it, but a caller higher up cannot use it after the function that owns the array returns. You also can't return an array from a function. You can return a pointer to an array, but if the array was allocated on the stack, the pointer refers to invalid memory after the function returns.
The 1st reason is wrong: arrays are never passed by copy. When you call a function, array names always decay into a pointer to its first element, precisely to avoid copying the whole array. If you want to pass an array by copy, you have to embed it inside a struct and pass a struct instead.
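A minimal sketch of the decay-versus-copy point above. The names `int3`, `set_first_via_pointer`, and `set_first_in_copy` are made up for illustration and do not come from the question.

```c
/* Hypothetical wrapper: a struct containing an array is copied as a whole. */
struct int3 {
    int v[3];
};

/* The parameter is really an int *, so this writes to the caller's array. */
void set_first_via_pointer(int a[], int value)
{
    a[0] = value;
}

/* The struct is passed by value, so only the local copy is modified.
   The modified copy's first element is returned so the effect is visible. */
int set_first_in_copy(struct int3 s, int value)
{
    s.v[0] = value;
    return s.v[0];
}
```

After calling `set_first_via_pointer(arr, 42)` the caller's `arr[0]` is 42, whereas after `set_first_in_copy(s, 42)` the caller's `s.v[0]` keeps its old value.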
Dynamic array allocation is also useful if you don't know the size of your array in advance (although this is less of an issue since C99 introduced variable length arrays; still, variable length arrays are allocated on the stack, so you'd have the same lifetime problem).
Another good reason to use heap allocation is that you can easily run out of stack memory with very big arrays. The heap is generally larger.
#include <assert.h>
#include <stdlib.h>
int * f(int* array) {
assert(array[0] == 1); // OK
int static_array[] = {1, 2, 3};
// return static_array; // BAD: the array only lives in this function
int * dynamic_array = malloc(sizeof(int) * 2);
dynamic_array[0] = 1;
dynamic_array[1] = 2;
return dynamic_array; // OK: lives outside also
}
int main()
{
int static_array[] = {1, 2, 3};
int * returned_array;
returned_array = f(static_array);
assert(returned_array[0] == 1);
free(returned_array);
}
An array in C is used through a pointer to its first element: in most expressions, the array name converts to that pointer. For a stack-based array, the name refers directly to the data, so the array and its data are in the same location. For a heap-allocated array, the pointer variable is on the stack and points to the location on the heap where the array data begins.
For point (2), you cannot return the value of the array. What is returned instead is the address of the array, whether that is on the heap or on the stack. Thus, allocating it on the heap ensures that the data is preserved when returning the array from a function.
A std::vector on the other hand works functionally like an array. With this, the array data is allocated on the heap, but the object that manages the array is on the stack. Thus, the lifetime of the array is controlled by the lifetime of the vector object.
The std::vector has the behaviour you describe:
passing a vector by value to a function causes the data to be copied when passing it to the function;
the vector data only lives for the lifetime of the function.
Passing the vector from a function can cause the array data to be copied. However, this can be optimised using things like return value optimisation and R-value references, which avoid the copy.
If this code runs without crashing you may allocate all your arrays on the stack.
#include <string.h>
int main() {
volatile char buf[1024 * 1024 * 64];
memset(buf, 0, sizeof(buf));
}
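For comparison, a sketch of the same 64 MiB buffer requested from the heap instead. The function name `fill_heap_buffer` is hypothetical. The point is only that a failed heap allocation is reported through a NULL return that the program can handle, rather than through a stack overflow.

```c
#include <stdlib.h>
#include <string.h>

/* Same 64 MiB buffer as above, but requested from the heap.
   Returns 1 on success, 0 if the contents are wrong, -1 if malloc failed. */
int fill_heap_buffer(void)
{
    size_t size = (size_t)1024 * 1024 * 64;
    char *buf = malloc(size);
    if (buf == NULL) {
        return -1;            /* failure is reported, not a crash */
    }
    memset(buf, 0, size);
    int ok = (buf[0] == 0 && buf[size - 1] == 0);
    free(buf);
    return ok ? 1 : 0;
}
```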
Unless the array must outlive the scope of the function that declares and initialises it, the compiler can do some optimisations that will most likely end up being more efficient than what a programmer can guess. Unless you have time to benchmark and experiment AND your application is performance critical, leave the optimisation to the compiler.
Reasons you would want to allocate an array on the heap instead of the stack:
The array is very large;
The array's lifetime is outside the scope of any one function;
The array size is not known at compile time, and VLAs are either not available or cannot be used in a particular situation (VLAs cannot be declared static or at file scope, for example);
The array is meant to be resizable.
I want to add something to the end of the array passed to the function.
Which is better, declaring a new larger array or using calloc()?
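1.
void array_append(int *block, size_t size)
{
    int new_block[size + 2];                          /* VLA on the stack */
    memcpy(new_block, block, size * sizeof *block);   /* memcpy counts bytes, not elements */
    /* ...append... */
}
2.
void array_append(int *block, size_t size)
{
    int *new_block = calloc(size + 2, sizeof *new_block);   /* room for size + 2 ints */
    memcpy(new_block, block, size * sizeof *block);
    /* ...append... */
    free(new_block);
}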
I am not returning the newly created array anywhere.
I only use new_block inside functions.
Does not modify the original array in the function.
Declaring new_block as static is omitted.
I know how calloc() / malloc() works, I know that this operation has to be validated.
new_block is only meant to live in a function.
I just wonder which solution is better and why ...
regards
You should dynamically allocate the array instead of using a variable length array, because in the latter case the code can be unsafe: a comparatively large array can overflow the stack.
I want to add something to the end of the array
But you cannot, really, unless you use realloc(). That is how your ...append trick can be done, whatever it means.
If you need a temporary array to work with and then copy into your array (but not at the end!), then all methods for allocation are allowed - it really depends on how often and with which sizes.
If it is called very often with limited sizes, it could be a static array.
There is no easy solution for growing arrays (or for memory management in general). At the extreme you allocate every element individually and link them together: a linked list.
So avoid reaching the end of your arrays: define a higher maximum, or else implement a linked list.
In certain situations realloc() also makes sense (big changes in size, but not often). Problem is sometimes the whole array has to be memcopied to keep the larger array contiguous: "realloc", not "append". So it is "expensive".
I am not returning the newly created array anywhere.
That is part of the problem. You actually seem to be doing half of what realloc() does: allocate the new space, memcpy() the old contents...and then free the old and return the new array(-pointer) to the caller.
First version can not return the array pointer, because end of function is also end of local auto arrays, VLA or not.
If the append can be done to the existing array (which it can if the caller expects this and the memory of the array has room), you can merely append to the existing array.
Otherwise, you need a new array. In this case, the array must be returned to the caller. You can do this by returning a pointer to its first element or by having the caller pass a pointer to a pointer, and you modify the pointed-to pointer to point to the first element of the new array.
When you provide a new array, you must allocate memory for it with malloc or a similar routine. You should not use an array defined inside your function without static, as the memory for such an array is reserved only until execution of the function ends. When your function returns to the caller, that memory is released for other uses. (Generally, you also should not use an array declared with static, but for reasons involving good design, reducing bugs, and multiple serial or parallel calls to the function.)
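A minimal sketch of the "new array returned to the caller" approach described above. The helper name `array_append_copy` and its single-value signature are assumptions for illustration and are not the asker's interface.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a newly allocated copy of block with value
   appended, or NULL on failure. The caller owns the result and must free() it.
   The original block is left untouched. */
int *array_append_copy(const int *block, size_t size, int value)
{
    int *new_block = malloc((size + 1) * sizeof *new_block);
    if (new_block == NULL) {
        return NULL;
    }
    if (size > 0) {
        memcpy(new_block, block, size * sizeof *block);
    }
    new_block[size] = value;
    return new_block;
}
```

Because the memory comes from malloc(), the returned pointer stays valid after the function returns, which would not be the case for a local array.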
So i want to return an array of a size n (variable) which my function has as input. I know that in order to return arrays in C I have to define them static, but the problem is that n is a variable and thus I get an error. I thought of actually using malloc/calloc but then I won't be able to free them after the function returns the array. Please take note that I'm not allowed to change anything on main(). Are there any other alternatives which I could use? Thanks in advance.
float *Arr( int *a , int n ){
    static float b[ n ];   /* error: a variable-length array cannot be static */
    return b;
}
I should point out that the function will only be called once. I saw the solution you posted, but I noticed you aren't freeing the allocated memory. Is that not of much importance when malloc is called inside a function?
The important thing to notice here is that this syntax:
float arr[n];
Allocates an array on the stack of the current function. In other words, that array is a local variable. Any local variable becomes invalid after the function returns, and therefore returning the array directly is undefined behavior. It will most likely cause a crash when trying to access the array from outside the function, if not anything worse.
In addition to that, declaring a variable-length array as static is invalid in any case.
If you want to write a function which creates and returns any kind of array (dynamically sized or not), the only option you have is to use dynamic allocation through malloc() and then return a pointer to the array (technically there's also alloca() to make dynamic stack allocations, but I would avoid it as it can easily break your program if the allocation is too large).
Here's an example of correct code:
float *create_array(size_t n_elements){
float *arr = malloc(sizeof(float) * n_elements);
if (arr == NULL) {
// Memory could not be allocated, handle the error appropriately.
}
return arr;
}
In this case, malloc() is reserving memory outside of the local stack of the function, in the heap. The result is a pointer that can be freely returned and passed around without any problem, since that area of memory keeps being valid after the function returns (until it is released). When you're done working with the data, you can release the allocated memory by calling free():
float *arr = create_array(100);
// ...
free(arr);
If you don't have a way to release the memory through free() after using malloc(), that's a problem in the long run, but in general, it is not a strict requirement: if your array is always needed, from its creation until the exit of the program, then there's no need to explicitly free() it, since memory is automatically released when the program terminates.
If your function needs to be called more than once, or needs to create significantly sized arrays that are only useful in part of the program and should therefore be discarded when no longer in use, then I'm afraid there's no good way around it: you should use free() in that case.
To answer your question precisely:
Please take note that I'm not allowed to change anything on main(). Are there any other alternatives which I could use?
No, there are no other better alternatives. The only correct approach here is to dynamically allocate the array through malloc(). The fact that you cannot free it afterwards is a different kind of problem.
So i want to return an array of a size n(variable) which my function has as input,
You can't, because C functions cannot return arrays at all. They can, and some do, return pointers, however, as your function is declared to do. Such a pointer may point to an element of an array.
i know that in order to return arrays in c i have to define them static,
As long as I am being pedantic, the problem is to do with the lifetime of the object to which the returned pointer points. If it is an element of an automatically-allocated array, then it, along with the rest of the array, ceases to exist when the function returns. The caller must not try to dereference such a pointer.
The two other alternatives are
static allocation, which you get by declaring the variable static or by declaring it at file scope, and
dynamic allocation, which you get by reserving memory via malloc(), calloc(), or a related function.
Statically allocated objects exist for the entire lifetime of the program, and dynamically allocated ones exist until deallocated.
but problem is that n is a variable and thus i get an error.
Yes, because variable-length arrays must be automatically allocated. Static objects exist for the whole run of the program, so the compiler needs to reserve space for them at compile time.
I thought of actually using malloc/calloc but then i won't be able to free them after the function returns the array.
That's correct, but dynamic allocation is still probably the best solution. It is not unreasonable for a called function to return a pointer to an allocated object, thus putting the responsibility on its caller to free that object. Ordinarily, that would be a well-documented characteristic of the function, so that its callers know that such responsibility comes with calling the function.
Moreover, although it's a bit untidy, if your function is to be called only once then it may be acceptable to just allow the program to terminate without freeing the array. The host operating system can generally be relied upon to clean up the mess.
Please take note that im not allowed to change anything on main(),are there any other alternatives which i could use?
If you have or can impose a bound on the maximum value of n then you can declare a static array of that maximum size or longer, and return a pointer to that. The caller is receiving a pointer, remember, not an array, so it can't tell how long the pointed-to array actually is. All it knows is that the function promises n accessible elements.
Note well that there is a crucial difference between the dynamic allocation and static allocation alternatives: in the latter case, the function returns a pointer to the same array on every call. This is not inherently wrong, but it can be problematic. If implemented, it is a characteristic of the function that should be both intentional and well-documented.
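A sketch of the bounded static-array alternative described above. The bound `ARR_MAX_N` and the fill loop that converts the input ints are assumptions made for illustration, since the question does not say what the array should contain.

```c
#include <stddef.h>

#define ARR_MAX_N 1024   /* assumed upper bound on n; not from the question */

/* Returns a pointer to a static buffer with at least n usable elements,
   or NULL if n is out of range. Every call returns the same buffer. */
float *Arr(int *a, int n)
{
    static float b[ARR_MAX_N];

    if (n < 0 || n > ARR_MAX_N) {
        return NULL;
    }
    for (int i = 0; i < n; i++) {
        b[i] = (float)a[i];   /* illustrative fill: convert the input */
    }
    return b;
}
```

A second call overwrites the data that the first call's pointer refers to, which is exactly the characteristic that has to be intentional and documented.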
If you want an array of n floats where n is dynamic, you can either create a variable-length array (VLA):
void some_function(...)
{
//...
float b[ n ]; //allocate b on the stack
//...
}
in which case there would be no function call for the allocation, or you can allocate it dynamically, e.g., with malloc or calloc, and then free it after you're done with it.
float *b = malloc(sizeof(*b)*n);
A dynamic (malloc/calloc) allocation may be wrapped in a function that returns a pointer to the allocated memory (the wrapper may do some initializations on the allocated memory after the memory has been successfully allocated). A VLA allocation may not, because a VLA ends its lifetime at the end of its nearest enclosing block (C11 Standard - 6.2.4 Storage durations of objects(p7)).
If you do end up wrapping a malloc/calloc call in a "constructor" function like your float *Arr(void), then you obviously should not free the to-be-returned allocated memory inside Arr. Arr's caller would be responsible for freeing the result (unless it passed the responsibility over to some other part of the program):
float *Arr( int n, ...
/*some params to maybe initialize the array with ?*/ )
{
float *r; if (!(r=malloc(sizeof(*r)*n))) return NULL;
//...
//do some initializations on r
//...
return r; //the caller should free it
}
you could use malloc to reserve memory for your n sized array
Like this:
#include <stdlib.h>
#include <stdio.h>
float * arr(int * a, int n ) {
float *fp = malloc ( (size_t) sizeof(float)*n);
if (!fp) printf("Oh no! Run out of memory\n");
return fp;
}
int main () {
int i;
float * fpp = arr(&i,200);
printf("the float array is located at %p in memory\n", (void *)fpp);
return(0);
}
It seems like what you want to do is:
have a function that provides (space for) an array with a variable number of elements,
that the caller is not responsible for freeing,
that there only needs to be one instance of at a time.
In this case, instead of attempting to define a static array, you can use a static pointer to manage memory allocated and freed with realloc as needed to adjust the size, as shown in the code below. This will leave one instance in existence at all times after the first call, but so would a static array.
This might not be a good design (it depends on circumstances not stated in the question), but it seems to match what was requested.
#include <stdio.h>
#include <stdlib.h>
float *Arr(int *a , int n)
{
// Keep a static pointer to memory, with no memory allocated initially.
static float *b = NULL;
/* When we want n elements, use realloc to release the old memory, if any,
and allocate new memory.
*/
float *t = realloc(b, n * sizeof *t);
// Fail if the memory allocation failed.
if (!t)
{
fprintf(stderr, "Error, failed to allocate memory in Arr.\n");
exit(EXIT_FAILURE);
}
// Remember the new memory in the static pointer and return it.
return b = t;
}
I was wondering if someone could explain the differences between the memory allocation for ai and *pai
int ai[10];
int *pai = (int * ) calloc (10, sizeof(int));
I understand the second one is dynamically allocated, but I'm struggling to explain why.
Let's see what is being specified in standard (difference wise)
From 7.22.3.1 (Under Memory management functions)
... The lifetime of an allocated object extends from the allocation
until the deallocation.
So yes, this applies to dynamically allocated memory. Its lifetime differs from that of local variables: it is deallocated by calling free, and until then it stays alive. It does not depend on the lifetime of the scope in which it was created.
The first one has automatic storage duration. This is the primary difference: when the scope of the function in which it is declared ends, its lifetime is over.
Some people also talk about a heap and a stack, but the C standard does not mention either. How the storage durations required by the standard are realised is entirely up to the implementation, and the differences described here do not depend on those details.
As a conceptual red pill (as in the movie The Matrix): pai itself has automatic storage duration, but the memory whose address it holds does not. The variable pai is lost when the function in which it is defined returns, but the memory it points to is not.
Well why is it called dynamic allocation?
In programming, when we say dynamic in the context of a language, we mean that something happens at run time. The same applies here: we allocate memory at run time by calling functions such as malloc and calloc. That's why it is called dynamic allocation.
In the first line, you create a variable of an array type. The symbol ai names that array; in most expressions it converts to a pointer to its first element, and it cannot be reassigned.
In the second line, you create a pointer-type variable. Then you allocate an array dynamically with calloc() and put its address in the pointer.
The array ai is allocated on the stack and implicitly goes out of scope when the end of the function is reached. The pointer pai points to a memory location, which can hold an array or a single element of the pointed-to type; the memory is allocated on the heap and must be freed later. The second can be passed back to the caller at the end of the function and can even be resized with realloc (realloc does not clear the new memory like calloc does; malloc is like calloc without zeroing out the new memory). The first is for fast array computation and should be in the cache most of the time. The second is for arrays whose length is unknown when the function is called. When the size is known, many programmers tend to define an array in the caller and pass it to the function, which modifies it. The array is implicitly converted to a pointer when calling the function.
Some library implementations store a pointer to an array in the global section, which can be reallocated. Or they have a fixed-length array in global space. These variables are recommended to be thread_local. The user does not have to care about the memory management of a variable that belongs to the other library.
library.h
const char* getResourceString(int id);
library.c
#include <stdlib.h>
#include <threads.h>   /* thread_local (C11) */
thread_local char* string_buf = NULL;
const char* getResourceString(int id) {
    int i = getResourceSize(id);
    string_buf = realloc(string_buf, i);
    // fill the memory
    return string_buf;
}
These are quite different operations:
int ai[10];
declares an array object of 10 ints. If it is declared inside a block, it will have automatic storage duration, meaning that it will vanish at block end (both identifier and data). If it is declared outside any block (at file level) it will have static storage duration and will exist throughout all program.
int *pai = calloc (10, sizeof(int)); // DON'T CAST MALLOC IN C
declares a pointer to an allocated zone of memory that can contain ten integers. You can use pai as a pointer to the first element of an array and do pointer arithmetic on it. But sizeof(pai) is sizeof(int *). The array will have dynamic storage duration, meaning that its life will end:
if the allocated block of memory is freed
if it is reused to store other objects
double * pd = (double *) pai;
for (int i=0; i<5; i++) { // assuming sizeof(double) == 2 * sizeof(int) here
    pd[i] = i; // the allocated memory now contains 5 double
}
So in both cases you can use the identifier as pointing to an array of 10 integers, but the first one is an integer array object while the second one is just a pointer to a block of dynamic memory (memory with no declared type that can take the type of an object that will be copied/created there).
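The sizeof difference mentioned above can be checked directly. This is a small sketch with hypothetical function names `size_of_array` and `size_of_pointer`.

```c
#include <stdlib.h>

/* Size of the array object itself: 10 * sizeof(int). */
size_t size_of_array(void)
{
    int ai[10];
    return sizeof ai;
}

/* Size of the pointer only; the size of the allocation cannot be
   recovered from the pointer. */
size_t size_of_pointer(void)
{
    int *pai = calloc(10, sizeof *pai);
    size_t n = sizeof pai;   /* sizeof does not evaluate or dereference pai */
    free(pai);               /* free(NULL) is a no-op if calloc failed */
    return n;
}
```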
Generally speaking, automatically allocated objects will be on the stack, while dynamically allocated objects will be on the heap. Although this distinction depends on the implementation (not the standard), a stack and a heap are the most commonly used way to manage memory in C programs. They are basically two distinct regions of memory: the first is dedicated to automatic allocations and the second to dynamic allocations.
So when you call a function (say, the main function), all the objects declared in the scope of this function will be stacked (automatically allocated on the stack). If some dynamic allocation happens in this function, the memory will be allocated on the heap, so that all pointers to this area will be pointing to objects outside the stack. When your function returns, all its objects on the stack are automatically unstacked and effectively don't exist anymore. But all objects on the heap will exist until you deallocate them (or they will be forcefully deallocated by the OS when the program ends). Arrays are structures that can be allocated automatically or dynamically. See this example:
int * automaticFactor() //wrong, see below
{
int x[10];
return &x[0];
}
int * dynamicFactor()
{
int * y = (int *) malloc(sizeof(int) * 10);
return &y[0];
}
int main()
{
//this will not work because &x[0] points to the stack
//and that area will be unstacked after the function return
int * n = automaticFactor();
//this will work because &y[0] points to the heap
//and that area will be in the heap until manual deallocation
int * m = dynamicFactor();
return 0;
}
Note that the pointers themselves are in the stack. What is in the heap is the area they are pointing to. So when you declare a pointer inside a function (such as the y of the example), it will also be unstacked at the end of the function. But since its value (i.e. the address of the allocated area) was returned to a pointer outside the function (i.e. to m), you will not lose track of the area allocated in the heap by the function.
What is the difference between declaring an array "dynamically",
[ie. using realloc() or malloc(), etc... ]
vs
declaring an array within main() with Global scope?,
eg.
int main()
{
int array[10];
return 0;
}
I am learning, and at the moment it feels that there is not much difference between
declaring a variable (array, whatever) with global scope,
when compared to a
dynamically allocated variable (array, whatever), never calling free() on it, and allowing it to be 'destroyed' when the program ends.
What are the consequences of either option?
EDIT
Thank you for your responses.
Global scope should have been 'local scope' -local to main()
When you declare an array like int arr[10] in a function, the space for the array is allocated on the stack. The memory will be freed when your function exits.
When you allocate an array or any other data structure using malloc() or realloc(), the space is allocated on the heap, and the memory is only freed when you call free() or when the program exits. So while the program is running, you are responsible for freeing it with free() once you no longer need it. If you don't free it and make your array pointer point to something else, you create a memory leak. However, the operating system reclaims all of the program's memory after the program ends.
As kaylum said in comment below your question, the array in your second example does not have global scope. Its scope is limited to main(), and it is inaccessible in other scopes unless main() explicitly makes it available (e.g. passes it by argument to another function).
Dynamic memory allocation means that the programmer explicitly allocates memory when needed, and explicitly releases it when no longer needed. Because of that, the amount of memory allocated can be determined at run time (e.g. calculated from user input). Also, if the programmer forgets to release the memory, or reallocates it inappropriately, memory can be leaked (still allocated by the program, but not accessible by the program). For example;
/* within a function */
char *p = malloc(100);
p = malloc(200);
free(p);
leaks 100 bytes, every time this code is executed, because the result of the first malloc() call is never released, and it is then inaccessible to the program because its value is not stored anywhere.
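One way to avoid that leak, sketched under the assumption that the first block is no longer needed once the second is requested. The function name `allocate_twice_without_leak` is hypothetical.

```c
#include <stdlib.h>

/* Returns 0 on success, -1 if either allocation failed. Nothing is leaked. */
int allocate_twice_without_leak(void)
{
    char *p = malloc(100);
    if (p == NULL) {
        return -1;
    }
    free(p);              /* release the first block before reusing p */

    p = malloc(200);
    if (p == NULL) {
        return -1;
    }
    free(p);
    return 0;
}
```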
Your second example is actually an array of automatic storage duration. As far as your program is concerned, it only exists until the end of the scope in which it is created. In your case, as main() returns, the array will cease to exist.
An example of an array with global scope is
int array[10];
void f() {array[0] = 42;}
int main()
{
array[0] = 10;
f();
/* array[0] will be 42 here */
}
The difference is that this array exists and is accessible to every function that has visibility of the declaration, within the same compilation unit.
One other important difference is that global arrays are (usually) zero initialised - a global array of int will have all elements zero. A dynamically allocated array will not have elements initialised (unless created with calloc(), which does initialise to zero). Similarly, an automatic array will not have elements initialised. It is undefined behaviour to access the value of something (including an array element) that is uninitialised.
So
#include <stdio.h>
#include <stdlib.h>
int array[10];
int main()
{
int *array2;
int array3[10];
array2 = malloc(10*sizeof(*array2));
printf("%d\n", array[0]); /* okay - will print 0 */
printf("%d\n", array2[0]); /* undefined behaviour. array2[0] is uninitialised */
printf("%d\n", array3[0]); /* undefined behaviour. array3[0] uninitialised */
return 0;
}
Obviously the way to avoid undefined behaviour is to initialise array elements to something valid before trying to access their value (e.g. printing them out, in the example above).
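A sketch of the three kinds of array with every element initialised before it is read. The names `global_array` and `sum_initialised_arrays` are made up for this example.

```c
#include <stdlib.h>

int global_array[10];              /* static storage: zero-initialised */

/* Returns the sum of all elements, or -1 if calloc failed. Every element is
   initialised before it is read, so the sum is 0. */
int sum_initialised_arrays(void)
{
    int local_array[10] = {0};     /* the initializer zeroes the remaining elements too */
    int *heap_array = calloc(10, sizeof *heap_array);  /* calloc zero-fills */
    if (heap_array == NULL) {
        return -1;
    }
    int sum = 0;
    for (int i = 0; i < 10; i++) {
        sum += global_array[i] + local_array[i] + heap_array[i];
    }
    free(heap_array);
    return sum;
}
```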
I have a char array of fixed size in a C application. I pass that array to some function, and from there to multiple other functions, so that the array gets filled in some of them based on some condition. Since the array has a fixed size, I run into problems when the data copied into it is larger than the array. I know that I have to make the char array dynamic, but as I said, the array gets filled in multiple functions and the size will differ. So do I need to dynamically allocate the array wherever it gets filled? Consider that the array gets filled in 30+ different functions. Or is there a way to do this with minimal modification?
As your question title says C, IMO the best approach is to declare a pointer of that particular variable type in your main() function, pass the address of that pointer [essentially a double pointer] to other functions, and allocate memory dynamically.
You can keep on passing the pointer to all other functions. Inside every called functions, measure the amount of memory required to put the data [from that particular function] and use realloc() to increase the available memory.
As mentioned by UncleO, the required pointer should be the pointer to array [i.e, a double pointer]
EDIT
For the very first time allocating memory to that pointer, you can use malloc() or calloc().
From next time onwards, to extend [resize] the amount of memory, you need to use realloc()
You don't pass an array to a function in C, although it appears that way. What gets passed is a pointer to the first element of the array. The pointer is passed by value. That is, the value of the pointer (the memory location) is copied into a local variable. The contents of the array can be changed using this local variable.
If you use malloc() or realloc() with this local variable, then the pointer that was "passed in" won't be affected. realloc() may resize the memory, but it can also free that memory and allocate some new memory to the local variable.
If you want to change the array pointer, then you should pass in a pointer to the pointer. The thing the pointer points to is what is changed. This is a bit more cumbersome. But this way you can allocate more memory as needed.
#include <stdlib.h>
char* arr;
void changeit(char** arrptr)
{
*arrptr = realloc(*arrptr, 20*sizeof(char));
}
int main (void)
{
    arr = malloc(10*sizeof(char));
    changeit(&arr);
    free(arr);
    return 0;
}
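A slightly more defensive variant of the same pointer-to-pointer idea, as a sketch. The helper name `resize_buffer` is hypothetical, and new_size is assumed to be greater than zero. Assigning the result of realloc() to a temporary first means the caller's pointer is not overwritten with NULL if the reallocation fails.

```c
#include <stdlib.h>

/* Hypothetical helper: resizes *arrptr to new_size bytes (new_size > 0).
   Returns 0 on success; on failure returns -1 and leaves *arrptr valid. */
int resize_buffer(char **arrptr, size_t new_size)
{
    char *tmp = realloc(*arrptr, new_size);
    if (tmp == NULL) {
        return -1;        /* the old block is still allocated and unchanged */
    }
    *arrptr = tmp;        /* the caller's pointer now refers to the resized block */
    return 0;
}
```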
To functions that do not alter the array, pass the array address (as a pointer to const) and size.
int foo1(const char *array, size_t size, ...)
To each function that does not change the array size, pass the array address and array size.
int foo2(char *array, size_t size, ...)
To functions that may alter the array size, pass the address of the address of the array and the address of the size.
int foo3(char **array, size_t *size, ...)
Code could put these two variables together
typedef struct {
size_t size;
char array;
} YetAnotherArrayType;
Chux,
There's a little typo at the end of your post. I think you mean:
typedef struct {
size_t size;
char* array;
} YetAnotherArrayType;
You didn't make array a pointer type.
If the poster is handcrafting a container, the classic solution is to track both a size and a capacity.
In that model you allocate the array with some initial capacity and set size to 0. You then track its population, growing the capacity by some chunk size or factor each time it fills up.
Frequent reallocation can be a massive performance drain and by the sounds of the program in question such behaviour seems likely.
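A minimal sketch of the size-and-capacity model described above. The type name `CharBuffer`, the function `charbuffer_push`, the initial capacity of 16, and the doubling factor are all assumptions chosen for illustration.

```c
#include <stdlib.h>

/* Hypothetical growable char buffer tracking both size and capacity. */
typedef struct {
    size_t size;      /* elements in use */
    size_t capacity;  /* elements allocated */
    char *data;
} CharBuffer;

/* Appends one char, doubling the capacity when the buffer is full.
   Returns 0 on success, -1 on allocation failure (buffer left unchanged). */
int charbuffer_push(CharBuffer *buf, char c)
{
    if (buf->size == buf->capacity) {
        size_t new_capacity = buf->capacity ? buf->capacity * 2 : 16;
        char *tmp = realloc(buf->data, new_capacity);
        if (tmp == NULL) {
            return -1;
        }
        buf->data = tmp;
        buf->capacity = new_capacity;
    }
    buf->data[buf->size++] = c;
    return 0;
}
```

Starting from `CharBuffer buf = {0, 0, NULL};`, each of the filling functions can call `charbuffer_push(&buf, c)`, and realloc() only runs when the capacity is exhausted rather than on every append. The owner releases the storage with `free(buf.data)`.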