Don't I have to malloc() this array? - c

I'm struggling to learn the C rules of malloc() / free(). Consider the below code, which runs just fine. (On Linux, compiled with GCC.) Specifically, I'm wondering about the arr array. I get that you have to malloc for all the struct elements within the array... but why don't you have to malloc for the array itself?
#include<stdio.h>
#include<stdlib.h>
#define ARR_SIZE 100
typedef struct containerStruct{
int data1;
int data2;
int data3;
// ...etc...
} container;
int main(){
container* arr[ARR_SIZE]; // Don't I have to malloc this?
int i;
for(i=0; i<ARR_SIZE; i++){
arr[i] = (container*) malloc (sizeof(container)); // I get why I have to malloc() here
}
...do stuff...
for(i=0; i<ARR_SIZE; i++){
free(arr[i]); // I get why I have to free() here
}
// Don't I have to free(arr) ?
return 0;
}
I'm guessing that the container* arr[ARR_SIZE]; line tells the compiler all it needs to know to carve out the space in memory for the array, which is why this code works.
But it still doesn't "feel" right, somehow. When I malloc() something, I'm reserving memory space on the heap, right? But my container* arr[ARR_SIZE]; call creates the array on the stack. So... the array of container* pointers exists on the stack, and all of those container* pointers point to a container struct on the heap. Is that correct?

Right beneath your declaration of arr, you have the following:
int i;
Which reserves enough space (probably on the "stack") to store an int. Does this "feel" wrong to you as well?
The declaration of arr is no different. The compiler allocates enough space (probably on the "stack") for an ARR_SIZE-element array of container *s.

When I malloc() something, I'm reserving memory space on the heap, right?
Don't say that, the heap is an implementation detail. You're dynamically allocating memory. You're also responsible for dynamically freeing it again.
But my container* arr[ARR_SIZE]; call
It's not a call, it's a declaration.
creates the array on the stack.
Don't say that either, the stack is also an implementation detail.
You're declaring a local (here an array of pointers) with automatic scope, and the compiler is responsible for managing its memory and lifetime.
Its lifetime isn't dynamic, because it becomes unreachable as soon as you reach the } at the end of the enclosing block, and so the compiler can handle it for you deterministically.
All local variables behave the same here:
{
int i;
double d[2];
} /* neither i nor d are reachable after here,
so the compiler takes care of releasing any storage */
So... the array of container* pointers exists on the stack,
Consider the simpler declaration container c;. This is a local with automatic scope, and the compiler takes care of (de)allocation as discussed.
Now consider container *p;. This is also a local variable with automatic scope, but the variable is a pointer. You still need to point it at something manually, and if the thing you point it at was returned from malloc, you'll need to free it yourself.
Further consider a simple array container a[2];. Now you have a local array with automatic scope, containing two instances of your container type. Both instances have the same lifetime, it's still managed by the compiler. You can access a[0] and a[1] and pretty much anything else is illegal.
and all of those container* pointers point to a container struct on the heap. Is that correct?
No. Finally consider container* ap[2]. Again we have a local array with automatic scope, containing two pointers. The type of the pointer is pointer-to-container, and the lifetime of the pointers is managed by the compiler. However, the pointers don't point at anything yet, and it'll be your responsibility to decide what to point them at, and figure out what lifetime that pointed-at thing has.
Once you do
ap[i] = malloc(sizeof(*ap[i]));
(you don't need to cast, and it's generally safer to avoid naming the type explicitly, in case you change it later and allocate the wrong size), you've allocated an object which you're responsible for freeing. You've also pointed one of the pointers in your array at this object, but that doesn't somehow change the lifetime of the pointer itself. The array just goes out of scope like usual wherever its pointers point.
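To make the two lifetimes concrete, here is a minimal sketch. The names demo_pointer_array and DEMO_SIZE are hypothetical and only stand in for the code in the question. The array of pointers has automatic storage and is never freed, while each pointed-at object is allocated and freed by hand.

```c
#include <stdlib.h>

#define DEMO_SIZE 4   /* hypothetical size, stands in for ARR_SIZE */

typedef struct {
    int data1;
    int data2;
    int data3;
} container;

/* Fills an automatic array of pointers with dynamically allocated
 * objects, uses them, then frees them.
 * Returns the sum of the data1 fields, or -1 if an allocation fails. */
int demo_pointer_array(void)
{
    container *ap[DEMO_SIZE];   /* automatic storage: never malloc'd, never freed */
    int i, sum = 0;

    for (i = 0; i < DEMO_SIZE; i++) {
        ap[i] = malloc(sizeof *ap[i]);   /* dynamic storage: ours to free */
        if (ap[i] == NULL) {
            while (i-- > 0)              /* release what was already allocated */
                free(ap[i]);
            return -1;
        }
        ap[i]->data1 = i;
    }

    for (i = 0; i < DEMO_SIZE; i++) {
        sum += ap[i]->data1;
        free(ap[i]);                     /* one free per successful malloc */
    }
    return sum;   /* ap itself ceases to exist when the function returns */
}
```

Calling demo_pointer_array() from main is enough to exercise it. There is deliberately no free(ap), because ap was never returned by malloc.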

You can choose either one of the following options:
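/* Option 1: allocate the array of pointers dynamically as well */
int i;
container** arr = malloc(sizeof(container*)*ARR_SIZE);
for (i=0; i<ARR_SIZE; i++)
arr[i] = malloc(sizeof(container));
// do stuff...
for(i=0; i<ARR_SIZE; i++)
free(arr[i]);
free(arr);
/* Option 2: automatic array of pointers, dynamically allocated elements */
int i;
container* arr[ARR_SIZE];
for (i=0; i<ARR_SIZE; i++)
arr[i] = malloc(sizeof(container));
// do stuff...
for(i=0; i<ARR_SIZE; i++)
free(arr[i]);
/* Option 3: automatic array of structs, no malloc at all */
container arr[ARR_SIZE];
// do stuff...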
Since ARR_SIZE is constant, you may as well choose the last option.

Since ARR_SIZE is the fixed size of the array container *arr[ARR_SIZE], you don't have to allocate memory for the array itself, only for the objects its elements point to, since those are allocated dynamically in this case.
When you malloc anything, it is safest to always check the pointer returned from malloc for NULL. Additionally, when you free each element, it is even safer to set the pointer to NULL afterwards, so that it cannot be used by accident to access the freed memory.
Illustrated through this code:
#include <stdio.h>
#include <stdlib.h>
#define ARR_SIZE 100
typedef struct {
int data1;
int data2;
int data3;
// ...etc...
} container_t;
int
main(void) {
container_t *arr[ARR_SIZE];
int i;
for (i = 0; i < ARR_SIZE; i++) {
arr[i] = malloc(sizeof(container_t));
if (arr[i] == NULL) {
printf("Malloc Problem here\n");
exit(EXIT_FAILURE);
}
}
for (i = 0; i < ARR_SIZE; i++) {
if (arr[i]) {
free(arr[i]);
arr[i] = NULL;
}
}
return 0;
}

container* arr[ARR_SIZE]; tells the compiler to allocate an array of ARR_SIZE elements of type container*, and the compiler allocates the memory accordingly.
In other words, this is similar to saying int x[5] = {0}; where the compiler allocates enough space for an array of 5 ints. In your case, the compiler allocates enough space for ARR_SIZE pointers of type container*, and that is it. Now it's up to you to make those pointers point to valid memory locations. For that, you can either
use memory allocator functions (which allocate memory from the heap, as you mentioned), or
assign the addresses of other variables of the same type (which needs no allocation from the heap).
So, the bottom line: you don't need to allocate any memory for the array itself. For each individual array element, you need to allocate memory using the memory allocator functions if you want that element to point to valid memory of its own.
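The two choices listed above can be sketched side by side. This is only an illustration: demo_two_targets is a hypothetical name, and the struct is cut down to a single field.

```c
#include <stdlib.h>

typedef struct {
    int data1;
} container;

/* Points one array element at an existing variable and another at
 * malloc'd memory. Returns the sum of both data1 fields, or -1 if
 * the allocation fails. */
int demo_two_targets(void)
{
    container local = { 10 };   /* ordinary automatic variable */
    container *arr[2];          /* the array itself needs no malloc */
    int result;

    arr[0] = &local;                   /* address of another variable */
    arr[1] = malloc(sizeof *arr[1]);   /* memory from the allocator */
    if (arr[1] == NULL)
        return -1;
    arr[1]->data1 = 32;

    result = arr[0]->data1 + arr[1]->data1;

    free(arr[1]);   /* only the malloc'd element is freed, never &local */
    return result;
}
```

Passing arr[0] to free() here would be an error, since it does not come from malloc.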

Related

Declaring memory on stack overwrites previously declared memory

How can I allocate memory on the stack and have it point to different memory addresses so I can use it later? For example. this code:
for (int i = 0; i < 5; i++) {
int nums[5];
nums[0] = 1;
printf("%p\n", &nums[0]);
}
Will print out the same address every time. How can I write memory to stack (not the heap, no malloc) and have it not overwrite something else that's on the stack already.
You could use alloca to allocate a different array from the runtime stack for each iteration in the loop. The array contents will remain valid until you exit the function:
#include <stdlib.h>
#include <stdio.h>
void function() {
for (int i = 0; i < 5; i++) {
int *nums = alloca(5 * sizeof(*nums));
nums[0] = 1;
printf("%p\n", (void *)nums);
/* store the value of `nums` so the array can be used elsewhere.
* the arrays must only be used before `function` returns to its caller.
*/
...
}
/* no need to free the arrays */
}
Note however that alloca() is not part of the C Standard and might not be available on all architectures. There are further restrictions on how it can be used, see the documentation for your system.
I believe you are looking for:
a way to control how memory is being allocated on the stack, at least in the context of not overwriting already-used memory
Of course, that's taken care of for you! The compiler lays out each stack frame so that a newly created automatic variable is never placed on top of memory that another live variable is still using.
In your example:
for (int i = 0; i < 5; i++) {
int nums[5];
...
}
this is not the case, since nums goes out of scope when the i-th iteration of the for loop terminates.
As a result, the memory block that nums occupied during the first iteration is free for reuse when the second iteration starts. When nums of the second iteration is allocated on the stack, it knows nothing about the nums of the first iteration, since that one has already gone out of scope: it doesn't exist!
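If the number of arrays is known in advance, one portable way to get distinct storage without malloc or alloca is to declare them all at once as a two-dimensional array outside the loop. A minimal sketch, assuming five arrays of five ints as in the question (count_distinct_rows, ROWS and COLS are hypothetical names):

```c
#define ROWS 5
#define COLS 5

/* Gives each loop iteration its own row of one automatic 2D array,
 * then counts how many distinct row addresses were handed out. */
int count_distinct_rows(void)
{
    int nums[ROWS][COLS];   /* one automatic object holding all the rows */
    int *rows[ROWS];        /* remembers where each row starts */
    int distinct = 0;

    for (int i = 0; i < ROWS; i++) {
        nums[i][0] = i + 1;   /* each iteration writes to its own row */
        rows[i] = nums[i];    /* nums[i] decays to &nums[i][0] */
    }

    for (int i = 0; i < ROWS; i++) {
        int seen = 0;
        for (int j = 0; j < i; j++)
            if (rows[j] == rows[i])
                seen = 1;
        if (!seen)
            distinct++;
    }
    return distinct;   /* every row has a different address */
}
```

As with any automatic variable, the rows are only valid until the enclosing function returns.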

Memory allocation using for loop

My doubt is only about memory allocation, so don't worry about the program's output.
#include<stdio.h>
int main(){
for(int i=0;i<20;i++){
char *str=malloc(sizeof(char)*6); //assuming length of each string is 6
scanf("%s",str);
insertinlinkedlist(str);
}
}
Whenever I allocate memory as shown above, only the base address of the char array is passed to the linked list. The memory block for the char array is allocated inside main only, and I store its base address in str, which is local to main and is passed to insertinlinkedlist.
I want to ask: when memory is allocated inside a loop, why is the number of memory blocks created (the number of char arrays) equal to n, the number of times the loop runs? Since the variable name is the same, shouldn't we be directed to the same memory location?
Note: I have checked this by running the loop, and the value of str is different on every iteration.
Is the above method of allocating memory in a loop through the same variable correct? Does it ensure that there will be no conflicts between allocations, and that every time we get the address of a unique memory block?
This also raises another doubt in my mind.
If we do something like this:
int main(){
for(int i=0;i<n;i++){
int array[50];
}
}
then will it also create n arrays inside the stack frame?
malloc returns a pointer to the first allocated byte. Internally it keeps track of how much memory was allocated so it knows how much to free (you do need to insert calls to free() or you'll leak memory, by the way). Usually, it does this by allocating a little bit of memory before the pointer it gives you and storing the length there, however it isn't required to do it that way.
The memory allocated by malloc is not tied to main in any way. Currently main is the only function whose local variables have a pointer to that memory, but you could pass the pointer to another function, and that function would also be able to access the memory. Additionally, when the function that called malloc returns, that memory will remain allocated unless manually freed.
The variable name doesn't matter. A pointer is (to first approximation) just a number. Much like how running int a = 42; a = 20; is permitted and replaces the previous value of a with a new one, int *p = malloc(n); p = malloc(n); will first assign the pointer returned by the first malloc call to p, then will replace it with the return value of the second call. You can also have multiple pointers that point to the same address:
int *a = malloc(42);
int *b = malloc(42);
int *c = a;
a = malloc(42);
At the end of that code, c will be set to the value returned by the first malloc call, and a will have the value returned by the last malloc call. Just like if you'd done:
//assume here that f() returns a different value each time
//it's called, like malloc does
int a = f();
int b = f();
int c = a;
a = f();
As for the second part of your question:
for(int i=0;i<n;i++){
int array[50];
}
The above code will create an array with enough space for 50 ints inside the current stack frame. It will be local to the block within the for loop, and won't persist between iterations, so it won't create n separate copies of the array. Since arrays declared this way are part of the local stack frame, you don't need to manually free them; they will cease to exist when you exit that block. But you could pass a pointer to that array to another function, and it would be valid as long as you haven't exited the block. So the following code...
int sum(int *arr, size_t n) {
int count = 0;
for (size_t i = 0; i < n; i++) {
count += arr[i];
}
return count;
}
for(int i=0;i<n;i++){
int array[50];
printf("%d\n", sum(array, 50));
}
...would be legal (from a memory-management perspective, anyway; you never initialize the array, so the result of the sum call is not defined).
As a minor side note, sizeof(char) is defined to be 1. You can just say malloc(6) in this case. sizeof is necessary when allocating an array of a larger type.
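The point about the variable name can be checked with a short sketch. The helper below is hypothetical (blocks_are_distinct, COUNT and LEN are not from the question) and stores the pointers in a plain array instead of a linked list. Every malloc call made through the same variable str yields a separate block for as long as the earlier blocks have not been freed.

```c
#include <stdlib.h>
#include <string.h>

#define COUNT 20   /* number of loop iterations, as in the question */
#define LEN   6    /* bytes per string, including the terminator */

/* Allocates COUNT buffers through the same local variable, keeps every
 * pointer, and checks that no two blocks share an address.
 * Returns 1 if all are distinct, 0 if not, -1 on allocation failure. */
int blocks_are_distinct(void)
{
    char *saved[COUNT];
    int i, j, distinct = 1;

    for (i = 0; i < COUNT; i++) {
        char *str = malloc(LEN);      /* sizeof(char) is 1 by definition */
        if (str == NULL) {
            while (i-- > 0)
                free(saved[i]);
            return -1;
        }
        strcpy(str, "hello");         /* 5 characters plus '\0' fit in LEN */
        saved[i] = str;               /* str is reused, the block is not */
    }

    for (i = 0; i < COUNT; i++)
        for (j = i + 1; j < COUNT; j++)
            if (saved[i] == saved[j])
                distinct = 0;

    for (i = 0; i < COUNT; i++)
        free(saved[i]);               /* each block must be freed exactly once */
    return distinct;
}
```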

C Pointer help: Array/pointer equivalence

In this toy code example:
int MAX = 5;
void fillArray(int** someArray, int* blah) {
int i;
for (i=0; i<MAX; i++)
(*someArray)[i] = blah[i]; // segfault happens here
}
int main() {
int someArray[MAX];
int blah[] = {1, 2, 3, 4, 5};
fillArray(&someArray, blah);
return 0;
}
... I want to fill the array someArray, and have the changes persist outside the function.
This is part of a very large homework assignment, and this question addresses the issue without allowing me to copy the solution. I am given a function signature that accepts an int** as a parameter, and I'm supposed to code the logic to fill that array. I was under the impression that dereferencing &someArray within the fillArray() function would give me the required array (a pointer to the first element), and that using bracketed array element access on that array would give me the necessary position that needs to be assigned. However, I cannot figure out why I'm getting a segfault.
Many thanks!
I want to fill the array someArray, and have the changes persist outside the function.
Just pass the array to the function as it decays to a pointer to the first element:
void fillArray(int* someArray, int* blah) {
int i;
for (i=0; i<MAX; i++)
someArray[i] = blah[i];
}
and invoked:
fillArray(someArray, blah);
The changes to the elements will be visible outside of the function.
If the actual code was to allocate an array within fillArray() then an int** would be required:
void fillArray(int** someArray, int* blah) {
int i;
*someArray = malloc(sizeof(int) * MAX);
if (*someArray)
{
for (i=0; i<MAX; i++) /* or memcpy() instead of loop */
(*someArray)[i] = blah[i];
}
}
and invoked:
int* someArray = NULL;
fillArray(&someArray, blah);
free(someArray);
When you create an array, such as int myArray[10][20], a guaranteed contiguous block of memory is allocated from the stack, and normal array arithmetic is used to find any given element in the array.
If you want to allocate that 2D "array" from the heap, you use malloc() and get some memory back. That memory is "dumb". It's just a chunk of memory, which should be thought of as a vector. None of the navigational logic attendant with an array comes with that, which means you must find another way to navigate your desired 2D array.
Since your call to malloc() returns a pointer, the first variable you need is a pointer to hold the vector of int*s you're going to need to hold some actual integer data, i.e.:
int *pArray;
...but this still isn't the storage you want for your integers. What you need is an array of pointers, currently pointing to nothing. To get storage for your data, you need to call malloc() 10 times, with each malloc() allocating space for 20 integers, and store the returned pointers in the pArray vector of pointers. This means that
int *pArray
needs to be changed to
int **pArray
to correctly indicate that it is a pointer to the base of a vector of pointers.
The first dereferencing, pArray[i], lands you somewhere in an array of int pointers, and the second dereferencing, pArray[i][j], lands you somewhere inside an array of ints, pointed to by the int pointer in pArray[i].
I.e., you have a cloud of integer vectors scattered all over the heap, pointed to by an array of pointers keeping track of their locations. Not at all similar to Array[10][20] allocated from the stack, which is all contiguous storage, and doesn't have a single pointer in it anywhere.
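The allocation pattern just described, one vector of pointers plus one vector of ints per row, might look like the following sketch. table_alloc and table_free are hypothetical helper names, and the 10 by 20 dimensions from the text become parameters.

```c
#include <stdlib.h>

/* Allocates a vector of `rows` pointers, each pointing to its own
 * zero-initialized vector of `cols` ints. Returns NULL on failure. */
int **table_alloc(size_t rows, size_t cols)
{
    int **t = calloc(rows, sizeof *t);        /* the vector of int pointers */
    if (t == NULL)
        return NULL;

    for (size_t i = 0; i < rows; i++) {
        t[i] = calloc(cols, sizeof *t[i]);    /* one vector of ints per row */
        if (t[i] == NULL) {
            while (i-- > 0)                   /* undo the rows already allocated */
                free(t[i]);
            free(t);
            return NULL;
        }
    }
    return t;
}

/* Frees every row first, then the vector of pointers itself. */
void table_free(int **t, size_t rows)
{
    if (t == NULL)
        return;
    for (size_t i = 0; i < rows; i++)
        free(t[i]);
    free(t);
}
```

After int **pArray = table_alloc(10, 20); an element is reached as pArray[i][j], the same syntax as for a true 2D array, even though the rows are not contiguous with one another.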
As others have alluded to, the pointer-based heap method doesn't seem to have a lot going for it at first glance, but turns out to be massively superior.
First and foremost, you can free() or realloc() to resize heap memory whenever you want, and it doesn't go out of scope when the function returns. More importantly, experienced C coders arrange their functions to operate on vectors where possible, where one level of indirection is removed in the function call. Finally, for arrays that are large relative to available memory, and especially on large, shared machines, big chunks of contiguous memory are often not available, and are not friendly to other programs that need memory to operate. Code with large arrays allocated on the stack is a maintenance nightmare.
Here you can see that the table is just a shell collecting vector pointers returned from vector operations, where everything interesting happens at the vector level, or element level. In this particular case, the vector code in VecRand() is calloc()ing its own storage and returning calloc()'s return pointer to TblRand(), but TblRand() has the flexibility to allocate VecRand()'s storage as well, just by replacing the NULL argument to VecRand() with a call to calloc().
/*-------------------------------------------------------------------------------------*/
dbl **TblRand(dbl **TblPtr, int rows, int cols)
{
int i=0;
if ( NULL == TblPtr ){
if (NULL == (TblPtr=(dbl **)calloc(rows, sizeof(dbl*)))){
printf("\nCalloc for pointer array in TblRand failed");
return NULL; /* do not fall through and dereference a null pointer */
}
}
for (; i!=rows; i++){
TblPtr[i] = VecRand(NULL, cols);
}
return TblPtr;
}
/*-------------------------------------------------------------------------------------*/
dbl *VecRand(dbl *VecPtr, int cols)
{
if ( NULL == VecPtr ){
if (NULL == (VecPtr=(dbl *)calloc(cols, sizeof(dbl)))){
printf("\nCalloc for random number vector in VecRand failed");
return NULL; /* do not pass a null pointer on to GenRand() */
}
}
Randx = GenRand(VecPtr, cols, Randx);
return VecPtr;
}
/*--------------------------------------------------------------------------------------*/
static long GenRand(dbl *VecPtr, int cols, long RandSeed)
{
dbl r=0, Denom=2147483647.0;
while ( cols-- )
{
RandSeed= (314159269 * RandSeed) & 0x7FFFFFFF;
r = sqrt(-2.0 * log((dbl)(RandSeed/Denom)));
RandSeed= (314159269 * RandSeed) & 0x7FFFFFFF;
*VecPtr = r * sin(TWOPI * (dbl)(RandSeed/Denom));
VecPtr++;
}
return RandSeed;
}
There is no "array/pointer" equivalence, and arrays and pointers are very different. Never confuse them. someArray is an array. &someArray is a pointer to an array, and has type int (*)[MAX]. The function takes a pointer to a pointer, i.e. int **, which needs to point to a pointer variable somewhere in memory. There is no pointer variable anywhere in your code. What could it possibly point to?
An array value can implicitly decay into a pointer rvalue to its first element in certain expressions. Something that requires an lvalue, like taking the address (&), obviously does not work this way. Here are some differences between array types and pointer types:
Array types cannot be assigned or passed. Pointer types can
Pointer to array and pointer to pointer are different types
Array of arrays and array of pointers are different types
The sizeof of an array type is the length times the size of the component type; the sizeof of a pointer is just the size of a pointer
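A few of these differences can be observed directly. The sketch below uses hypothetical names (N_ELEMS and the three functions) and assumes nothing beyond standard C.

```c
#include <stddef.h>

#define N_ELEMS 5   /* hypothetical length, plays the role of MAX */

/* sizeof applied to an array yields length times element size. */
size_t size_via_array(void)
{
    int a[N_ELEMS];
    return sizeof a;
}

/* sizeof applied to a pointer yields the size of the pointer,
 * no matter how many elements it points at. */
size_t size_via_pointer(void)
{
    int a[N_ELEMS];
    int *p = a;        /* a decays to &a[0] here */
    return sizeof p;
}

/* &a has type int (*)[N_ELEMS], a pointer to the whole array.
 * It is not an int **, so it cannot be passed where int ** is expected. */
int first_via_array_pointer(void)
{
    int a[N_ELEMS] = {1, 2, 3, 4, 5};
    int (*pa)[N_ELEMS] = &a;
    return (*pa)[0];   /* dereference to get the array, then index it */
}
```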

C - allocating values in an array of pointers in outside function

Let's say I have the following situation (some rough pseudocode):
struct {
int i;
} x
main(){
x** array = malloc(size of x pointer); // pointer to an array of pointers of type x
int* size = current size of x // (initally 0)
add(array, size);
}
add(x** array, int* size){ // adds one actual element to the array
x** temp = realloc(array, (*size)+1); // increase the array size by one
free(array);
array = temp;
// My question is targeted here
array[*size] = malloc(size of x); // makes a pointer to the value
array[*size]->i = size;
*size++;
}
My question is: once add() is finished, do the values of the pointers stored in array disappear along with the function call stack, since I allocated them inside add()? I fear that they might, in which case would there be a better way for me to do things?
No, they don't. They persist until the pointer returned by malloc() is passed to the corresponding free() function. There would be no point in the existence of the malloc() function if it worked the same way as automatic arrays.
Edit: sidenote. As @Ancurio pointed out, you're incorrectly freeing the memory behind the previous pointer returned by malloc(), which is invalid at that point because realloc() has already been used on it. Don't do that. realloc() does its job properly.
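Putting both points together, a runnable version of the pseudocode might look like the sketch below. It makes a few assumptions that are not in the question: add() takes the address of the caller's pointer so that the reallocated array is visible outside the function, it returns a status code, and destroy() is a hypothetical cleanup helper.

```c
#include <stdlib.h>

typedef struct {
    int i;
} x;

/* Appends one newly allocated element to *array and increments *size.
 * Returns 0 on success, -1 on allocation failure. */
int add(x ***array, int *size)
{
    /* realloc takes a size in bytes, so multiply by the element size */
    x **temp = realloc(*array, (size_t)(*size + 1) * sizeof **array);
    if (temp == NULL)
        return -1;       /* *array is still valid and unchanged */
    *array = temp;       /* no free() of the old pointer: realloc took care of it */

    temp[*size] = malloc(sizeof *temp[*size]);
    if (temp[*size] == NULL)
        return -1;
    temp[*size]->i = *size;
    (*size)++;           /* *size++ would increment the pointer, not the count */
    return 0;
}

/* Frees every element, then the array of pointers. */
void destroy(x **array, int size)
{
    for (int k = 0; k < size; k++)
        free(array[k]);
    free(array);
}
```

A caller would start with x **array = NULL; int size = 0; and call add(&array, &size), since realloc() on a null pointer behaves like malloc(). The elements stay allocated after add() returns, until destroy() is called.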

Initializing array of integer pointer in C

I have some confusion about the usage of pointers in C. I've put example code below to make it easier to follow. Please note the differences between these snippets. If anything is unclear, please let me know.
This doesn't work.
#include <stdio.h>
#include <stdlib.h>
void process() {
int *arr;
arr=(int*)malloc(5*sizeof(int));
arr=(int*){3,1,4,5,2};
for(int z=0;z<5;z++) {
printf("%d ",arr[z]);
}
printf("\n");
}
int main() {
process();
return 0;
}
But this works.
#include <stdio.h>
#include <stdlib.h>
void process() {
int *arr;
arr=(int*)malloc(5*sizeof(int));
arr=(int[]){3,1,4,5,2};
for(int z=0;z<5;z++) {
printf("%d ",arr[z]);
}
printf("\n");
}
int main() {
process();
return 0;
}
This works too. Why? I didn't allocate memory here.
#include <stdio.h>
#include <stdlib.h>
void process() {
int *arr;
arr=(int[]){3,1,4,5,2};
for(int z=0;z<5;z++) {
printf("%d ",arr[z]);
}
printf("\n");
}
int main() {
process();
return 0;
}
Why aren't these the same?
arr=(int*){3,1,4,5,2};
arr=(int[]){3,1,4,5,2};
Is there any other way to initialize an array of integers through a pointer, other than assigning each element individually like this?
arr[0]=3;
arr[1]=1;
arr[2]=4;
arr[3]=5;
arr[4]=2;
How can I get the number of elements allocated behind a pointer, so that I can write something like for(int z=0;z<NUM;z++) { instead of hard-coding for(int z=0;z<5;z++) {?
Any answer is highly appreciated.
Thanks in advance.
The malloc calls in the first few examples allocate a block of memory and assign a pointer to that memory to arr. As soon as you assign to arr again, the pointer value is overwritten, and you've lost track of that allocated memory -- i.e., you've leaked it. That's a bug right there.
In other words, if you allocate a block of memory using malloc(), then you can write data into it using array syntax (for example):
int* arr = (int *) malloc(sizeof(int) * 5);
for (int i=0; i<5; ++i)
arr[i] = i;
But you can't assign anything else directly to arr, or you lose the pointer to that block of memory. And when you allocate a block using malloc(), don't forget to release it using free() when you don't need it anymore.
An array is not a pointer-to-integer; it's an array. An array name is said to "decay to a pointer" when you pass it as an argument to a function accepting a pointer as an argument, but they're not the same thing.
Regarding your last question: that's actually the difference between an array and a pointer-to-type: the compiler knows the size of an array, but it does not know the size of a block pointed to by an arbitrary pointer-to-type. The answer, unfortunately, is no.
If you were writing C++ rather than C, you would avoid raw arrays altogether and use std::vector, which knows its own length and is expandable. In C, you have to keep track of the length yourself.
When you say ptr = (int[]){1,2,3,4,5}, you make ptr point to the unnamed array {1,2,3,4,5} instead of the block returned by malloc, and thus you are leaking memory. If you want to initialize your memory just after allocation, write ptr[0]=1; ptr[1]=2; and so on. If you want to copy data, use memcpy.
The comma-separated list of values is a construct for initializing arrays. It's not suitable for initializing pointers to arrays. That's why you need that (int[]) - it tells gcc to create an unnamed array of integers that is initialized with the provided values.
gcc won't let you lose the information that arr is an array unless you really want to. That's why you need a cast. It's better to declare arr as an array, not as a pointer. You can still pass it to functions that accept a pointer. And if arr is an array, gcc will not let you leak memory by assigning the result of malloc to it:
error: incompatible types when assigning to type ‘int[5]’ from type ‘void *’
If you want a variable that can be either an array or a pointer to allocated memory, create another variable for that. Better yet, create a function that accepts a pointer and pass it either an array or a pointer.
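As a sketch of the memcpy approach mentioned above, the values can live in a real array, whose element count the compiler knows, and then be copied into the malloc'd block. copy_ints and sum_of_copy are hypothetical helper names.

```c
#include <stdlib.h>
#include <string.h>

/* Returns a malloc'd copy of n ints from src, or NULL on failure.
 * The caller is responsible for calling free() on the result. */
int *copy_ints(const int *src, size_t n)
{
    int *dst = malloc(n * sizeof *dst);
    if (dst != NULL)
        memcpy(dst, src, n * sizeof *dst);
    return dst;
}

/* Copies the values from the question into dynamic memory and sums
 * them. Returns -1 if the allocation fails. */
int sum_of_copy(void)
{
    const int init[] = {3, 1, 4, 5, 2};
    /* sizeof gives the element count for a true array only;
     * applied to a pointer it would give the pointer's size instead */
    size_t n = sizeof init / sizeof init[0];
    int *arr = copy_ints(init, n);
    int sum = 0;

    if (arr == NULL)
        return -1;
    for (size_t z = 0; z < n; z++)   /* n replaces the hard-coded 5 */
        sum += arr[z];

    free(arr);   /* arr was never reassigned, so nothing is leaked */
    return sum;
}
```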

Resources