If a variable is declared within a loop, do the previous declarations become garbage? For example, in the following:
loop{
int array[10];
array[i]=......
}
array is declared on each loop iteration. When it is declared again, is the memory location that array occupies the same as the previous one? If it is not, do the older declarations become garbage because the allocated area is never freed? Finally, how can it be freed without exiting the loop if it is a fixed-size array like the one above?
You aren't actually allocating anything. This goes on the stack, and the size of the stack frame is calculated by your compiler at compile time. The array will reuse the same amount of stack space each iteration. The int array[10] does effectively nothing at run time.
There's a big difference between doing this:
for (...) {
int a[10];
a[0] = x;
}
and doing this:
for (...) {
int* a = (int*)malloc(sizeof(int)*10);
a[0] = x;
free(a);
}
The first "allocation" is fixed in size, and will cost you nothing. The second can be of variable size and will be a heap allocated array which you will need to manually free. C has no concept of garbage collection, so nothing really becomes garbage. But you are required to free whatever you allocate using the malloc function. If you never use that function you never need to free anything. The compiler will take care of that for you.
This is an automatic variable that the compiler handles - automatically.
You only have to take care of storage you allocate yourself using new or malloc. The rest is handled for you.
The array comes into scope each time you enter the loop and is destroyed again at the end of each loop. The compiler is very likely to reuse the same space each time, but that is not defined by the language. There will be no garbage either way.
You can assume that, for every iteration of the loop, a new array is created and that it is destroyed at the end of the iteration. This implies that the contents of the newly created array are indeterminate (they may be garbage, and will quite likely be the same data as before, since the array probably occupies the same place on the stack).
However, internally there won't be any allocation or deallocation for the int array[10], as pointed out by Dervall.
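As a rough illustration of the answers above, the following sketch records the address of a loop-local array on each iteration and counts how many distinct addresses show up. The function name count_distinct_addresses and the MAX_ITERATIONS cap are made up for this example. The language does not promise that the addresses match, although mainstream compilers usually reuse the same stack slot.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_ITERATIONS 16

/* Hypothetical helper: runs `iterations` loop passes and returns how many
 * distinct addresses the loop-local array occupied. */
int count_distinct_addresses(int iterations)
{
    uintptr_t seen[MAX_ITERATIONS];
    int distinct = 0;

    assert(iterations >= 0 && iterations <= MAX_ITERATIONS);

    for (int i = 0; i < iterations; i++) {
        int array[10];      /* automatic storage: no malloc, no free */
        array[0] = i;       /* contents are indeterminate until written */

        /* Convert the address to an integer while the array is still alive. */
        uintptr_t addr = (uintptr_t)(void *)array;

        int known = 0;
        for (int j = 0; j < distinct; j++) {
            if (seen[j] == addr) {
                known = 1;
            }
        }
        if (!known) {
            seen[distinct++] = addr;
        }
        printf("iteration %d: array[0] = %d\n", i, array[0]);
    }
    return distinct;
}
```

With common compilers count_distinct_addresses(5) tends to return 1, but any value from 1 to 5 would be conforming, so code should not rely on it.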
Related
I have been writing a program where I have a 2d array that changes size if the user wants, as follows:
#include <stdlib.h>
#include <stdio.h>
int max_length = 1024;
int memory_length = 16;
int block_length = 64;
void process_input(int memory[memory_length][block_length], char* user_input) {
...
}
int main(void) {
printf("Not sure what to do? Enter 'help'\n");
while (0 == 0) {
int memory[memory_length][block_length];
char user_input[max_length];
printf(">> ");
fgets(user_input, max_length, stdin);
printf("\n");
process_input(memory, user_input);
if (user_input[0] == 'e' && user_input[1] == 'n' && user_input[2] == 'd') {
break;
}
printf("\n");
}
return 0;
}
NOTE: The process_input() function that I made allows the user to play around with the values inside the array 'memory' as well as change the value of memory_length or block_length, hence then changing the length of the array. After the user is done the cycle repeats with a fresh array.
I can use the 2d array perfectly fine, passing it to any function. However, one day I discovered that there are functions such as malloc() that allow you to dynamically allocate memory through a pointer. This then made me question:
Should I re-write my whole very complicated program to use malloc and other 'memory functions', or is it okay to keep it this way?
Also as a side question that might be answered by answering the main question:
Every time I declare the 2d array, do the previous contents of the array get freed, or do I keep filling up my memory like an amateur?
Finally if there is anything else that you may notice in the code or in my writing please let me know.
Thanks.
Should I re-write my whole very complicated program to use malloc and other 'memory functions', or is it okay to keep it this way?
Probably rewrite it indeed. int memory[memory_length][block_length]; in main() is a variable-length array (VLA). It is allocated with automatic storage and gets the size of those size variables at the point where its declaration is encountered, then it can't be resized from there.
For some reason a lot of beginners seem to think you can resize the VLA by changing the variables originally used to determine its size, but no such magic relation between the VLA and those variables exists. See: How to declare variable-length arrays correctly?
The only kind of array in C that allows run-time resizing is one which was allocated dynamically. The only alternative to that is to allocate an array "large enough" and then keep track of how much of the array you actively are using - but it will sit there in memory (and that is likely no big deal).
However, it is not recommended to allocate huge arrays with automatic storage, since those usually end up on the stack and can cause stack overflows. Use either static storage duration or allocated storage (with malloc etc).
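A minimal sketch of the dynamically allocated alternative described above, assuming a flat row-major block instead of a true 2D array. The Grid type and the grid_* function names are hypothetical and not part of the original program; overflow of rows * cols is ignored for brevity.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container: a rows x cols grid stored in one flat block. */
typedef struct {
    size_t rows;
    size_t cols;
    int *cells;     /* rows * cols ints, row-major */
} Grid;

/* Returns 0 on success, -1 if allocation fails. New cells start at zero. */
int grid_init(Grid *g, size_t rows, size_t cols)
{
    g->cells = calloc(rows * cols, sizeof *g->cells);
    if (g->cells == NULL && rows * cols > 0) {
        return -1;
    }
    g->rows = rows;
    g->cols = cols;
    return 0;
}

/* Resizes the grid, keeping the overlapping top-left region. */
int grid_resize(Grid *g, size_t rows, size_t cols)
{
    Grid fresh;
    if (grid_init(&fresh, rows, cols) != 0) {
        return -1;                      /* the old grid is left untouched */
    }
    size_t keep_rows = rows < g->rows ? rows : g->rows;
    size_t keep_cols = cols < g->cols ? cols : g->cols;
    if (keep_cols > 0) {
        for (size_t r = 0; r < keep_rows; r++) {
            memcpy(&fresh.cells[r * cols], &g->cells[r * g->cols],
                   keep_cols * sizeof *fresh.cells);
        }
    }
    free(g->cells);
    *g = fresh;
    return 0;
}

/* Returns a pointer to the cell at row r, column c. */
int *grid_at(Grid *g, size_t r, size_t c)
{
    assert(r < g->rows && c < g->cols);
    return &g->cells[r * g->cols + c];
}

/* Releases the storage; the grid may be re-initialised afterwards. */
void grid_free(Grid *g)
{
    free(g->cells);
    g->cells = NULL;
    g->rows = g->cols = 0;
}
```

Passing a Grid pointer around also removes the need for global size variables, since the dimensions travel with the data.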
Every time I declare the 2d array, do the previous contents of the array get freed, or do I keep filling up my memory like an amateur?
You can only declare it once. In case you do so inside a local scope, with automatic storage duration, it does indeed get cleaned up every time you leave the scope in which it was declared. But that also means that it can't be used outside that scope.
Finally if there is anything else that you may notice in the code or in my writing please let me know.
Yes, get rid of the global variables. There is no reason to use them in this example; avoid them like the plague. For example, a function using an already allocated array might pass the sizes along, like in this example:
void process_input (size_t memory_length,
                    size_t block_length,
                    int memory[memory_length][block_length],
                    char* user_input);
In C, local variables, i.e. variables declared within a function, are typically allocated on the stack. Space for them is normally reserved once per call, when the function is entered, rather than once per loop iteration. The fact that you can declare variables within a while loop can lead to some confusion. The loop does not somehow allocate the memory again and again.
The memory allocated for all local variables is released when the function returns.
The main reason that you might want to declare a variable inside a loop (besides convenience) is to limit the scope of the variable. In your code above, you cannot access the "memory" variable outside of the while loop. You can easily check this for yourself. Your compiler should raise an error.
Whether the stack or the heap has more memory available depends on your platform. In an embedded system you can often specify how much memory to give to the heap and to the stack. On a computer with virtual memory, such as a PC, the heap is limited mainly by available memory, swap space and the address space, while the stack is usually capped at a much smaller size set by the operating system or the linker (often a few megabytes).
Allocating arrays on the heap is not as simple as it might seem. Single dimensional arrays work just as you might imagine, but things get more complicated with multidimensional arrays, so it is probably better to stick with either a locally or statically declared array in your case.
I have an infinite while loop, I am not sure if I should use a char array or char pointer. The value keeps getting overwritten and used in other functions. With a char pointer, I understand there could be a memory leak, so is it preferred to use an array?
char *recv_data = NULL;
int main(){
.....
while(1){
.....
recv_data = cJSON_PrintUnformatted(root);
.....
}
}
or
char recv[256] = {0};
int main(){
.....
while(1){
.....
strcpy(recv, cJSON_PrintUnformatted(root));
.....
}
}
The first version should be preferred.
It doesn't have a limit on the size of the returned string.
You can use free(recv_data) to fix the memory leak.
The second version has these misfeatures:
The memory returned from the function can't be freed, because you never assigned it to a variable that you can pass to free().
It's a little less efficient, since it performs an unnecessary copy.
Based on how you used it, the cJSON_PrintUnformatted returns a pointer to a char array. Since there are no input arguments, it probably allocates memory inside the function dynamically. You probably have to free that memory. So you need the returned pointer in order to deallocate the memory yourself.
The second option discards that returned pointer, and so you lose your only way to free the allocated memory. Hence it will remain allocated, which is a memory leak.
But of course this all depends on how the function is implemented. Maybe it just manipulates a global array and returns a pointer to it, so there is no need to free it.
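To make the ownership point concrete, here is a small sketch. render_message is a hypothetical stand-in for a library call such as cJSON_PrintUnformatted(), assumed to return a heap-allocated string that the caller owns; for the real library, check its documentation for the matching deallocation function. The run_iterations wrapper is also made up for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a library call that returns a heap-allocated
 * string which the caller must free. Returns NULL on failure. */
char *render_message(int counter)
{
    char *text = malloc(32);
    if (text != NULL) {
        snprintf(text, 32, "{\"n\":%d}", counter);
    }
    return text;
}

/* Keeps the returned pointer so every iteration can release its string.
 * Returns the total number of characters produced, or -1 on failure. */
long run_iterations(int iterations)
{
    long total = 0;

    assert(iterations >= 0);

    for (int i = 0; i < iterations; i++) {
        char *recv_data = render_message(i);
        if (recv_data == NULL) {
            return -1;
        }
        total += (long)strlen(recv_data);   /* use the string in place, no copy */
        free(recv_data);                    /* released before the next iteration */
    }
    return total;
}
```

Because the pointer is stored in recv_data, nothing is leaked no matter how many times the loop runs.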
Indeed, the second version has a memory leak, as @Barmar points out.
However, even if you were to fix the memory leak, you still can't really use the second version of your code: with that version, you have to decide at compile time what the maximum length of the string returned by cJSON_PrintUnformatted() is. Now,
If you choose a value that's too low, the strcpy() function would exceed the array bounds and corrupt adjacent memory.
If you choose a value that's so high as to be safe, you waste memory, and if the array were a local variable you might exceed the amount of space available for your program's stack, causing a Stack Overflow (yes, like the name of this site). You could avoid the overrun with a bounded copy such as strncpy(), giving the maximum size, but then what you'd have is a truncated string.
So you really don't have much choice than using whatever memory is pointed to by the cJSON_PrintUnformatted()'s return value (it's probably heap-allocated memory). Plus - why make a copy of it when it's already there for you to use? Be lazy :-)
PS - What should really happen is for the cJSON_PrintUnformatted() to take a buffer and a buffer size as parameters, giving its caller more control over memory allocation and resource limits.
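A sketch of the API shape suggested in the PS, using snprintf() from the standard library. The function format_reading and its parameters are hypothetical and only illustrate the "buffer plus size" convention; this is not how cJSON itself is declared.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical API shape: writes into a caller-supplied buffer.
 * Returns the length the full text needs (excluding the terminator),
 * so a result >= size tells the caller the output was truncated.
 * Returns a negative value on an encoding error. */
int format_reading(char *buf, size_t size, const char *name, int value)
{
    assert(buf != NULL && size > 0);
    return snprintf(buf, size, "{\"%s\":%d}", name, value);
}
```

The caller decides where the buffer lives (stack, static or heap) and can detect truncation by comparing the return value with the buffer size.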
What is the difference between declaring an array "dynamically",
[ie. using realloc() or malloc(), etc... ]
vs
declaring an array within main() with Global scope?,
eg.
int main()
{
int array[10];
return 0;
}
I am learning, and at the moment it feels that there is not much difference between
declaring a variable (array, whatever) with global scope,
when compared to a
dynamically allocated variable (array, whatever), never calling free() on it, and allowing it to be 'destroyed' when the program ends.
What are the consequences of either option?
EDIT
Thank you for your responses.
Global scope should have been 'local scope' -local to main()
When you declare an array like int arr[10] in a function, the space for the array is allocated on the stack. The memory will be freed when your function exits.
When you declare an array or any other data structure using malloc() or realloc(), you allocate the space on the heap, and the memory stays allocated until you free it or the program exits. So while the program is running, you are responsible for releasing it with free() once you no longer need it. If you don't free it and make your array pointer point to something else, you will create a memory leak. However, on mainstream operating systems all of the program's memory is reclaimed when the process ends.
As kaylum said in comment below your question, the array in your second example does not have global scope. Its scope is limited to main(), and it is inaccessible in other scopes unless main() explicitly makes it available (e.g. passes it by argument to another function).
Dynamic memory allocation means that the programmer explicitly allocates memory when needed, and explicitly releases it when no longer needed. Because of that, the amount of memory allocated can be determined at run time (e.g. calculated from user input). Also, if the programmer forgets to release the memory, or reallocates it inappropriately, memory can be leaked (still allocated by the program, but not accessible by the program). For example;
/* within a function */
char *p = malloc(100);
p = malloc(200);
free(p);
leaks 100 bytes, every time this code is executed, because the result of the first malloc() call is never released, and it is then inaccessible to the program because its value is not stored anywhere.
Your second example is actually an array of automatic storage duration. As far as your program is concerned, it only exists until the end of the scope in which it is created. In your case, as main() returns, the array will cease to exist.
An example of an array with global scope is
int array[10];
void f() {array[0] = 42;}
int main()
{
array[0] = 10;
f();
/* array[0] will be 42 here */
}
The difference is that this array exists and is accessible to every function that has visibility of the declaration, within the same compilation unit.
One other important difference is that global arrays are (usually) zero initialised - a global array of int will have all elements zero. A dynamically allocated array will not have elements initialised (unless created with calloc(), which does initialise to zero). Similarly, an automatic array will not have elements initialised. It is undefined behaviour to access the value of something (including an array element) that is uninitialised.
So
#include <stdio.h>
#include <stdlib.h> /* for malloc() */
int array[10];
int main()
{
int *array2;
int array3[10];
array2 = malloc(10*sizeof(*array2));
printf("%d\n", array[0]); /* okay - will print 0 */
printf("%d\n", array2[0]); /* undefined behaviour. array2[0] is uninitialised */
printf("%d\n", array3[0]); /* undefined behaviour. array3[0] uninitialised */
return 0;
}
Obviously the way to avoid undefined behaviour is to initialise array elements to something valid before trying to access their value (e.g. printing them out, in the example above).
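For comparison, here is a sketch in which all three kinds of array are initialised before they are read. The names global_values, sum_initialised and COUNT are invented for this example.

```c
#include <assert.h>
#include <stdlib.h>

#define COUNT 10

int global_values[COUNT];   /* static storage duration: zero-initialised */

/* Sums three arrays that are all initialised before being read.
 * Returns the total, or -1 if the allocation fails. */
int sum_initialised(void)
{
    int local_values[COUNT] = {0};                          /* initialiser zeroes every element */
    int *heap_values = calloc(COUNT, sizeof *heap_values);  /* calloc zero-fills the block */
    int total = 0;

    if (heap_values == NULL) {
        return -1;
    }

    local_values[4] = 5;
    heap_values[3] = 7;

    for (int i = 0; i < COUNT; i++) {
        total += global_values[i] + local_values[i] + heap_values[i];
    }

    free(heap_values);
    return total;
}
```

Using an initialiser for the automatic array and calloc() for the dynamic one means every element has a defined value before the loop reads it.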
If I define an array inside an if statement, does the memory get allocated at compile time? E.g.
if(1)
{
int a[1000];
}
else
{
float b[1000];
}
Then is memory of 2 * 1000 for the ints + 4 * 1000 for the floats allocated?
It is reserved on the stack at run-time (assuming a non-trivial condition - in your case, the compiler would just exclude the else part). That means it only exists inside the scope block (between the {}).
In your example, only the memory for the ints gets allocated on the stack (1000 * sizeof(int)).
As you can guess, this is happening at run time. The generated code has instructions to allocate the space on the stack when the corresponding block of code is entered.
Keep in mind that this is happening because of the semantics of the language. The block structure introduces a new scope, and any automatic variables allocated in that scope have a lifetime that lasts as long as the scope does. In C, this is implemented by allocating it on the stack, which collapses as the scope disappears.
Just to drive home the point, note that the allocation would be different had the variables been of different nature.
if(1)
{
static int a[1000];
}
else
{
static float b[1000];
}
In this case, space is allocated for both the ints and the floats. The lifetime of these variables is the program. But the visibility is within the block scope they are allocated in.
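A small sketch of that last point: the static array below keeps its contents between calls even though its name is only visible inside the if block. The function record_hit is hypothetical.

```c
#include <assert.h>

/* Hypothetical example: counts how often the enabled branch has run. */
int record_hit(int enabled)
{
    if (enabled) {
        static int hits[1000];  /* one copy for the whole program, starts zeroed */
        hits[0] += 1;           /* the value survives between calls */
        return hits[0];
    }
    return 0;                   /* hits is not visible here */
}
```

Calling record_hit(1) twice returns 1 and then 2, because the storage is not released when the block is left.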
Scope
Variables declared inside the scope of a pair of { } are on the stack. This applies to variables declared at the beginning of a function or in any pair of { } within the function.
int myfunc()
{
int i = 0; // On the stack, scoped: myfunc
printf("%i\n", i);
if (1)
{
int j = 1; // On the stack, scope: this if statement
printf("%i %i\n",i,j);
}
printf("%i %i\n",i,j); // Won't work, no j
}
These days the scope of the variables is limited to the surrounding { }. I recall that some older Microsoft compilers didn't limit the scope, and that in the example above the final printf() would compile.
So Where is it in memory?
The memory of i and j is merely reserved on the stack. This is not the same as memory allocation done with malloc(). That is important, because calling malloc() is very slow in comparison. Also with memory dynamically allocated using malloc() you have to call free().
In effect the compiler knows ahead of time what space is needed for a function's variables and will generate code that refers to memory relative to whatever the stack pointer is when myfunc() is called. So long as the stack is big enough (commonly somewhere between 1 and 8 MBytes on desktop systems, depending on the OS and its settings), all is good.
Stack overflow occurs in the situation where myfunc() is called with the stack pointer already close to the end of the stack (i.e. myfunc() is called by a function which in turn had been called by another which itself was called by yet another, etc. Each layer of nested calls to functions moves the stack pointer on a bit more, and is only moved back when functions return).
If the space between the stack pointer and the end of the stack isn't big enough to hold all the variables that are declared in myfunc(), the code for myfunc() will simply try to use locations beyond the end of the stack. That is almost always a bad thing, and exactly how bad and how hard it is to notice that something has gone wrong depends on the operating system. On small embedded micro controllers it can be a nightmare as it usually means some other part of the program's data (eg global variables) get silently overwritten, and it can be very hard to debug. On bigger systems (Linux, Windows) the OS will tell you what's happened, or will merely make the stack bigger.
Runtime Efficiency Considerations
In the example above I'm assigning values to i and j. This does actually take up a small amount of runtime. j is assigned 1 only after evaluation of the if statement and subsequent branch into where j is declared.
Say for example the if statement hadn't evaluated as true; in that case j is never assigned 1. If j was declared at the start of myfunc() then it would always get assigned the value of 1 regardless of whether the if statement was true, which is a minor waste of time. But consider a less trivial example where a large array is declared and initialised; that would take more execution time.
int myfunc()
{
int i = 0; // On the stack, scoped: myfunc
int k[10000] = {0}; // On the stack, scoped: myfunc. A complete waste of time
// when the if statement evaluates to false.
printf("%i\n", i);
if (0)
{
int j = 1; // On the stack, scope: this if statement
// It would be better to move the declaration of k to here
// so that it is initialised only when the if evaluates to true.
printf("%i %i %i\n",i,j,k[500]);
}
printf("%i %i\n",i,j); // Won't work, no j
}
Placing the declaration of k at the top of myfunc() means that a loop 10,000 long is executed to initialise k every time myfunc() is called. However it never gets used, so that loop is a complete waste of time.
Of course, in these trivial examples compilers will optimise out the unnecessary code, etc. In real code where the compiler cannot predict ahead of time what the execution flow will be then things are left in place.
Memory for the array in the if block will be allocated on stack at run time. else part will be optimized (removed) by the compiler. For more on where the variables will be allocated memory, see Segmentation Fault when writing to a string
As DCoder & paddy corrected me, the memory will be calculated at compile time but allocated at run-time in stack memory segment, but with the scope & lifetime of the block in which the array is defined. The size of memory allocated depends on size of int & float in your system. Read this for an overview on C memory map
I have been writing C for only a scant few weeks and have not taken the time to worry myself too much about malloc(). Recently, though, a program of mine returned a string of happy faces instead of the true/false values I had expected from it.
If I create a struct like this:
typedef struct Cell {
struct Cell* subcells;
} Cell;
and then later initialize it like this
Cell makeCell(int dim) {
Cell newCell;
for(int i = 0; i < dim; i++) {
newCell.subcells[i] = makeCell(dim -1);
}
return newCell; //ha ha ha, this is here in my program don't worry!
}
Am I going to end up accessing happy faces stored in memory somewhere, or perhaps writing over previously existing cells, or what? My question is, how does C allocate memory when I haven't actually malloc()ed the appropriate amount of memory? What's the default?
Short answer: It isn't allocated for you.
Slightly longer answer: The subcells pointer is uninitialized and may point anywhere. This is a bug, and you should never allow it to happen.
Longer answer still: Automatic variables are allocated on the stack, global variables are allocated by the compiler and often occupy a special segment or may be in the heap. Global variables are initialized to zero by default. Automatic variables do not have a default value (they simply get the value found in memory) and the programmer is responsible for making sure they have good starting values (though many compilers will try to clue you in when you forget).
The newCell variable in your function is automatic, and is not initialized. You should fix that pronto. Either give newCell.subcells a meaningful value promptly, or point it at NULL until you allocate some space for it. That way you'll most likely get a segmentation violation if you try to dereference it before allocating some memory for it.
You also need to sort out the types. You return a Cell by value, so the subcells array has to hold Cell objects and must be allocated before you store into it. Either do that, or store pointers (an array of Cell *) and return a pointer to a heap allocated object.
A usual idiom for this would have the form something like
// assumes the struct member is declared as: struct Cell **subcells;
Cell* makeCell(int dim){
Cell *newCell = malloc(sizeof(Cell));
// error checking here
newCell->subcells = malloc(sizeof(Cell*)*dim); // what if dim=0?
// more error checking
for (int i=0; i<dim; ++i){
newCell->subcells[i] = makeCell(dim-1);
// what error checking do you need here?
// depends on your other error checking...
}
return newCell;
}
though I've left you a few problems to hammer out..
And note that you have to keep track of all the bits of memory that will eventually need to be deallocated...
There is no default value for your pointer. Your pointer will point to whatever it stores currently. As you haven't initialized it, the line
newCell.subcells[i] = ...
Effectively accesses some uncertain part of memory. Remember that subcells[i] is equivalent to
*(newCell.subcells + i)
If the left side contains some garbage, you will end up adding i to a garbage value and access the memory at that uncertain location. As you correctly said, you will have to initialize the pointer to point to some valid memory area:
newCell.subcells = malloc(bytecount)
After which line you can access that many bytes. With regards to other sources of memory, there are different kind of storage that all have their uses. What kind you get depends on what kind of object you have and which storage class you tell the compiler to use.
malloc returns a pointer to an object with no type. You can make a pointer point to that region of memory, and the type of the object will effectively become the type of the pointed to object type. The memory is not initialized to any value and access usually is slower. Objects so obtained are called allocated objects.
You can place objects globally. Their memory will be initialized to zero. For pointers, you will get NULL pointers, for floats you will get a proper zero too. You can rely on a proper initial value.
If you have local variables but use the static storage class specifier, then you will have the same initial value rule as for global objects. The memory usually is allocated the same way like global objects, but that's in no way a necessity.
If you have local variables without any storage class specifier or with auto, then your variable will be allocated on the stack (even though not defined so by C, this is what compilers do practically of course). You can take its address in which case the compiler will have to omit optimizations like putting it into registers of course.
Local variables used with the storage class specifier register, are marked as having a special storage. As a result, you cannot take its address anymore. In recent compilers, there is normally no need to use register anymore, because of their sophisticated optimizers. If you are really expert, then you may get some performance out of it if using it, though.
Objects have associated storage durations that can be used to show the different initialization rules (formally, they only define how long at least the objects live). Objects declared with auto and register have automatic storage duration and are not initialized. You have to explicitly initialize them if you want them to contain some value. If you do not, they will contain whatever happened to be on the stack before their lifetime began. Objects that are allocated by malloc (or another function of that family, like calloc) have allocated storage duration. Their storage is not initialized either. An exception is when using calloc, in which case the memory is initialized to zero ("real" zero, i.e. all bytes 0x00, without regard to any NULL pointer representation). Objects that are declared with static and global variables have static storage duration. Their storage is initialized to zero appropriate for their respective type. Note that an object need not have a type, but the only way to get a type-less object is using allocated storage. (An object in C is a "region of storage").
So what is what? Here is the fixed code. Because once you have allocated a block of memory you can no longer find out how many items you allocated, it is best to always store that count somewhere. I've introduced a variable dim to the struct that stores the count.
Cell makeCell(int dim) {
/* automatic storage duration => need to init manually */
Cell newCell;
/* note that in case dim is zero, we can either get NULL or a
* unique non-null value back from malloc. This depends on the
* implementation. */
newCell.subcells = malloc(dim * sizeof(*newCell.subcells));
newCell.dim = dim;
/* the following can be used as a check for an out-of-memory
* situation:
* if(newCell.subcells == NULL && dim > 0) ... */
for(int i = 0; i < dim; i++) {
newCell.subcells[i] = makeCell(dim - 1);
}
return newCell;
}
Now, things look like this for dim=2:
Cell {
subcells => {
Cell {
subcells => {
Cell { subcells => {}, dim = 0 }
},
dim = 1
},
Cell {
subcells => {
Cell { subcells => {}, dim = 0 }
},
dim = 1
}
},
dim = 2
}
Note that in C, the return value of a function need not be an object. No storage at all is required to exist. Consequently, you are not allowed to change it. For example, the following is not possible:
makeCell(0).dim++
You will need a "free function" that frees the allocated memory again, because storage for allocated objects is not freed automatically. You have to call free to free that memory for every subcells pointer in your tree. It's left as an exercise for you to write that up :)
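One possible shape for that free function, as a sketch rather than a definitive answer. It assumes the struct carries the dim member described above; freeCell and countCells are names invented here, and makeCell is repeated (with failure handling added) so that the block compiles on its own.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Cell {
    struct Cell *subcells;  /* array of `dim` cells, or NULL */
    int dim;                /* number of elements in subcells */
} Cell;

/* Recursively releases everything a cell owns (not the cell itself). */
void freeCell(Cell *cell)
{
    for (int i = 0; i < cell->dim; i++) {
        freeCell(&cell->subcells[i]);
    }
    free(cell->subcells);
    cell->subcells = NULL;
    cell->dim = 0;
}

/* Builds a cell with `dim` subcells; on allocation failure dim stays 0. */
Cell makeCell(int dim)
{
    Cell newCell = { NULL, 0 };

    if (dim <= 0) {
        return newCell;
    }
    newCell.subcells = malloc((size_t)dim * sizeof *newCell.subcells);
    if (newCell.subcells == NULL) {
        return newCell;
    }
    newCell.dim = dim;
    for (int i = 0; i < dim; i++) {
        newCell.subcells[i] = makeCell(dim - 1);
    }
    return newCell;
}

/* Counts the cell and all of its descendants. */
int countCells(const Cell *cell)
{
    int total = 1;
    for (int i = 0; i < cell->dim; i++) {
        total += countCells(&cell->subcells[i]);
    }
    return total;
}
```

Freeing children before the array that holds them matters: once free(cell->subcells) has run, the subcells can no longer be reached.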
Anything not allocated on the heap (via malloc and similar calls) is allocated on the stack, instead. Because of that, anything created in a particular function without being malloc'd will be destroyed when the function ends. That includes objects returned; when the stack is unwound after a function call the returned object is copied to space set aside for it on the stack by the caller function.
Warning: If you want to return an object that has pointers to other objects in it, make sure that the objects pointed to are created on the heap, and better yet, create that object on the heap, too, unless it's not intended to survive the function in which it is created.
My question is, how does C allocate memory when I haven't actually malloc()ed the appropriate amount of memory? What's the default?
To not allocate memory. You have to explicitly create it on the stack or dynamically.
In your example, subcells points to an undefined location, which is a bug. Your function should return a pointer to a Cell struct at some point.
Am I going to end up accessing happy faces stored in memory somewhere, or perhaps writing over previously existing cells, or what?
You are lucky that you got a happy face. On one of those unlucky days, it could've wiped your system clean ;)
My question is, how does C allocate memory when I haven't actually malloc()ed the appropriate amount of memory?
It doesn't. However, what happens is that when you define your Cell newCell, the subcells pointer holds a garbage value. That may be 0 (in which case you'd probably get a crash) or some integer big enough to make it look like an actual memory address. The program, in such cases, would happily fetch whatever value is residing there and bring it back to you.
What's the default?
This is the behavior if you don't initialize your variables. And your makeCell function looks a little under-developed.
There are really three sections where things can be allocated - data, stack & heap.
In the case you mention, it would be allocated on the stack. The problem with allocating something on the stack is that it's only valid for the duration of the function. Once your function returns, that memory is reclaimed. So, if you return a pointer to something allocated on the stack, that pointer will be invalid. If you return the actual object though (not a pointer), a copy of the object will automatically be made for the calling function to use.
If you had declared it as a global variable (e.g. in a header file or outside of a function) it would be allocated in the data section of memory. The memory in this section is allocated automatically when your program starts and deallocated automatically when it finishes.
If you allocate something on the heap using malloc(), that memory is good for as long as you want to use it - until you call free() at which point it is released. This gives you the flexibility to allocate and deallocate memory as you need it (as opposed to using globals where everything is allocated up front and only released when your program terminates).
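The difference between returning a copy, returning heap memory and returning a pointer into the stack can be sketched as follows. The Point type and the make_point_* functions are hypothetical names used only for illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct {
    int x;
    int y;
} Point;

/* Fine: the struct is copied to the caller when the function returns. */
Point make_point_value(int x, int y)
{
    Point p = { x, y };
    return p;
}

/* Fine: the heap object lives until the caller frees it. NULL on failure. */
Point *make_point_heap(int x, int y)
{
    Point *p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;
}

/* NOT fine, so it is left commented out: it would return the address of
 * a local variable whose lifetime ends when the function returns.
 *
 * Point *make_point_dangling(int x, int y)
 * {
 *     Point p = { x, y };
 *     return &p;
 * }
 */
```

The caller of make_point_heap() is responsible for calling free() on the result, whereas the value returned by make_point_value() needs no cleanup.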
Local variables are "allocated" on the stack. The stack is a preallocated amount of memory to hold those local variables. The variables cease to be valid when the function exits and will be overwritten by whatever comes next.
In your case, the code is doing nothing since it doesn't return your result. Also, a pointer to an object on the stack will cease to be valid when the scope exits, so I guess in your precise case (you seem to be building a linked list), you will need to use malloc().
I'm going to pretend I'm the computer here, reading this code...
typedef struct Cell {
struct Cell* subcells;
} Cell;
This tells me:
We have a struct type called Cell
It contains a pointer called subcells
The pointer should be to something of type struct Cell
It doesn't tell me whether the pointer goes to one Cell or an array of Cell. When a new Cell is made, the value of that pointer is undefined until a value is assigned to it. It's Bad News to use pointers before defining them.
Cell makeCell(int dim) {
Cell newCell;
New Cell struct, with an undefined subcells pointer. All this does is reserve a little chunk of memory to be called newCell that is the size of a Cell struct. It doesn't change the values that were in that memory - they could be anything.
for(int i = 0; i < dim; i++) {
newCell.subcells[i] = makeCell(dim -1);
In order to get newCell.subcells[i], a calculation is made to offset from subcells by i, then that is dereferenced. Specifically, this means the value is pulled from that memory address. Take, for instance, i==0... Then we would be dereferencing the subcells pointer itself (no offset). Since subcells is undefined, it could be anything. Literally anything! So, this would ask for a value from somewhere completely random in memory. There's no guarantee of anything with the result. It may print something, it may crash. It definitely should not be done.
}
return newCell;
}
Any time you work with a pointer, it's important to make sure it's set to a value before you dereference it. Encourage your compiler to give you any warnings it can, many modern compilers can catch this sort of thing. You can also give pointers cutesy default values like 0xdeadbeef (yup! that's a number in hexadecimal, it's just also a word, so it looks funny) so that they stand out. (The %p option for printf is helpful for displaying pointers, as a crude form of debugging. Debugger programs also can show them quite well.)