I have a struct defined in this way.
typedef struct COUNTRY {
char Code[3];
char Country[30];
int Population;
float Expectancy;
struct COUNTRY *Pointer;
} COUNTRY;
I have seen an array of structs allocated like this:
COUNTRY *countries = calloc(128, sizeof(COUNTRY));
or maybe like this:
COUNTRY *countries = malloc(128 * sizeof(COUNTRY));
But what does this do:
COUNTRY countries[128] = {};
Because I am still able to write to each entry's fields in all cases. Is the third option just bad form? It seems better to me because you can put that line up with the rest of your variable declarations outside of main(). Otherwise, you can only call calloc() or malloc() inside main() or another function.
Am I doing something wrong?
This:
COUNTRY countries[128];
simply defines an object whose type is "array of 128 COUNTRY elements".
The = {} is an initializer, but empty initializers were not valid in standard C before C23 (gcc supports them as an extension). A portable alternative is:
COUNTRY countries[128] = { 0 };
which initializes all members of all elements to zero (0 for integers, '\0' for characters, 0.0 for floating-point, NULL for pointers, and recursively for sub-elements). But since you specified the number of elements in the array (as 128), the initializer has no effect on how the array object is allocated.
If the declaration occurs inside a function definition, the array object has automatic storage duration, which means that it ceases to exist when execution reaches the end of the enclosing block. Such objects are commonly allocated on "the stack".
If it occurs outside any function definition (at file scope) or if it has the keyword static, then it has static storage duration, which means that it continues to exist for the entire execution of the program.
Objects allocated with malloc or calloc have allocated storage duration, which means that they continue to exist until they're explicitly deallocated by a call to free(). Such objects are commonly allocated on "the heap". (I'm ignoring realloc(), which complicates the description a bit.)
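As a rough illustration of the three storage durations described above, the sketch below puts the same small struct into static, automatic, and allocated storage and checks that every copy starts out zeroed. The Country type is a trimmed-down stand-in for the question's struct, and count_zeroed is a hypothetical helper name chosen for this example.

```c
#include <assert.h>
#include <stdlib.h>

/* Trimmed-down stand-in for the question's struct (assumption). */
typedef struct Country {
    char code[4];
    int population;
} Country;

#define N_COUNTRIES 4

/* Static storage duration: exists for the whole run, zero-initialized. */
static Country at_file_scope[N_COUNTRIES];

/* Hypothetical helper: counts zeroed population fields across all three
 * kinds of storage, or returns -1 if the allocation fails. */
int count_zeroed(void)
{
    /* Automatic storage duration: gone when this block ends. */
    Country on_stack[N_COUNTRIES] = { 0 };

    /* Allocated storage duration: lives until free() is called. */
    Country *on_heap = calloc(N_COUNTRIES, sizeof *on_heap);
    if (on_heap == NULL)
        return -1;

    int zeroed = 0;
    for (int i = 0; i < N_COUNTRIES; i++) {
        zeroed += (at_file_scope[i].population == 0);
        zeroed += (on_stack[i].population == 0);
        zeroed += (on_heap[i].population == 0);
    }

    free(on_heap);
    assert(zeroed == 3 * N_COUNTRIES);
    return zeroed;
}
```

Only the calloc'd array needs an explicit free(); the other two are released by the language rules for their storage duration.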
The first two statements allocate the array of structs on the heap, while the last one defines (and initializes) the array on the stack when it appears inside a function.
It is not bad form; it is just a matter of where you want your data to be stored: on the stack (freed automatically when your variable goes out of scope; the stack is usually significantly smaller than the heap, so you could overflow it if you place big data structures there), or on the heap (the lifetime of the data is not tied to the scope, and you need to free the memory manually).
It seems better to me because you can put that line up with the rest of your variable declarations outside of main().
If you need a statically allocated object with the lifetime of the program, use this approach; there's nothing wrong with it. Please note that in this particular case the variable is not stored on the stack, but in a static data segment of your program (typically .data, or .bss for zero-initialized objects; check this question for more details: How are global variables stored?).
The last form is 'stack allocated' or 'statically allocated'. Like calloc, all of the fields will be zeroed out.
Inside a function it is 'stack allocated' and that memory will go away when the function returns.
Outside any function, at file scope, it is statically allocated and a global piece of memory allocated before main() starts.
malloc/calloc are used when you don't know how many you need at compile time. For example, in a linked list you need to allocate/deallocate nodes on the fly. When you use an array, you know exactly how many you need at compile time.
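A minimal sketch of the "size only known at run time" case, assuming two hypothetical helpers named make_squares and sum_of_squares. The element count arrives as a parameter, so a fixed-size array declaration would not fit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding 0*0, 1*1, ..., or NULL
 * if n is 0 or the allocation fails. The caller must free() the result. */
int *make_squares(size_t n)
{
    if (n == 0)
        return NULL;

    int *squares = NULL;
    assert(n <= SIZE_MAX / sizeof *squares);   /* n * sizeof must not overflow */
    squares = malloc(n * sizeof *squares);
    if (squares == NULL)
        return NULL;

    for (size_t i = 0; i < n; i++)
        squares[i] = (int)(i * i);
    return squares;
}

/* Example caller: allocates, uses, and releases the array. */
int sum_of_squares(size_t n)
{
    int *squares = make_squares(n);
    if (squares == NULL)
        return 0;

    int sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += squares[i];

    free(squares);
    return sum;
}
```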
What also differs is where the memory is taken from. If you declare an array in a function, the memory will be taken from the stack. In the case of malloc/calloc, the memory is set aside in the heap.
= {};
is GNU C extension and is the same as:
= {0};
I was wondering if someone could explain the differences between the memory allocation for ai and *pai
int ai[10];
int *pai = (int * ) calloc (10, sizeof(int));
I understand the second one is dynamically allocated, but I'm struggling to explain why.
Let's see what is being specified in standard (difference wise)
From 7.22.3.1 (Under Memory management functions)
... The lifetime of an allocated object extends from the allocation
until the deallocation.
So yes, this is for dynamically allocated memory. Its lifetime is different from that of local variables: it is deallocated by calling free, and until then it stays alive. It doesn't depend on the lifetime of the scope in which it was created.
The first one has automatic storage duration. This is the primary difference: when the block in which it is declared ends, its lifetime is over.
Also, some people say that there is a heap and a stack, but (un)fortunately the C standard doesn't mention them. They are implementation details for providing the behavior the C standard requires, and an implementation can do this in any way it likes. The differences presented here don't depend on them.
As a conceptual red pill (taken from the movie The Matrix): pai itself has automatic storage duration, but the memory whose address it holds does not. The variable pai is lost when the function in which it is defined returns, but the memory it points to is not.
Well why is it called dynamic allocation?
Know one thing: when we say dynamic in the context of a programming language, it means we are doing something at run time. Same here: we are allocating memory at run time by calling functions like malloc, calloc, etc. That's why it is called dynamic allocation.
In the first line, you create a variable of array type. The symbol ai is not itself a pointer, but in most expressions it is converted to a pointer to the array's first element, and it cannot be reassigned.
In the second line, you create a variable of pointer type. Then you allocate an array dynamically with calloc() and put its address in the pointer.
The array ai is allocated on the stack and implicitly ceases to exist when the end of the function is reached. The pointer pai points to a memory location, which can hold an array or a single element of the pointed-to type; the memory is allocated on the heap and must be freed later. The second can be passed back to the caller at the end of the function and can even be resized with realloc (realloc does not clear the new memory like calloc does; malloc is like calloc without zeroing out the new memory). The first is for fast array computation and should be in the cache most of the time. The second is for arrays whose length is unknown until the function is called. When the size is known, many programmers tend to define an array in the caller and pass it to the function, which modifies it. The array is implicitly converted to a pointer when calling the function.
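Since realloc leaves any newly added memory uninitialized, a caller that wants calloc-like behavior has to zero the new tail itself. The sketch below shows one way to do that; grow_ints is a hypothetical helper name, not a standard function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: grows *buf from old_count to new_count ints and
 * zeroes the new tail, because realloc() leaves it uninitialized.
 * Returns 0 on success, -1 on failure (the original block stays valid). */
int grow_ints(int **buf, size_t old_count, size_t new_count)
{
    assert(buf != NULL && new_count > 0 && new_count >= old_count);

    int *bigger = realloc(*buf, new_count * sizeof **buf);
    if (bigger == NULL)
        return -1;              /* *buf is untouched and still owned by caller */

    memset(bigger + old_count, 0, (new_count - old_count) * sizeof *bigger);
    *buf = bigger;
    return 0;
}
```

Assigning the result of realloc to a temporary first avoids losing the only pointer to the old block when the call fails.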
Some library implementations store a pointer to an array in the global section, which can be reallocated. Or they have a fixed-length array in global space. These variables are recommended to be thread_local. The user does not have to care about the memory management of the other library's variable.
library.h
const char* getResourceString(int id);
library.c
thread_local char* string_buf = NULL;
const char* getResourceString(int id) {
int i = getResourceSize(id);
string_buf = realloc(string_buf, i);
// fill the memory
return string_buf;
}
These are quite different operations:
int ai[10];
declares an array object of 10 ints. If it is declared inside a block, it will have automatic storage duration, meaning that it will vanish at block end (both identifier and data). If it is declared outside any block (at file level), it will have static storage duration and will exist for the entire execution of the program.
int *pai = calloc (10, sizeof(int)); // DON'T CAST MALLOC IN C
declares a pointer to an allocated zone of memory that can contain ten integers. You can use pai as a pointer to the first element of an array and do pointer arithmetic on it. But sizeof(pai) is sizeof(int *). The array has allocated (dynamic) storage duration, meaning that its life will end:
if the allocated block of memory is freed
if it is reused to store other objects
double * pd = (double *) pai; // explicit cast: int * does not convert to double * implicitly
for (int i=0; i<5; i++) { // assuming sizeof(double) == 2 * sizeof(int) here
pd[i] = i; // the allocated memory now contains 5 double
}
So in both cases you can use the identifier as pointing to an array of 10 integers, but the first one is an integer array object while the second one is just a pointer to a block of dynamic memory (memory with no declared type that can take the type of an object that will be copied/created there).
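The difference between an array object and a pointer to allocated memory shows up directly in sizeof. The following is a small sketch; the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* sizeof applied to an array object yields the size of the whole array. */
size_t bytes_in_array(void)
{
    int ai[10];
    return sizeof ai;
}

/* sizeof applied to a pointer yields the size of the pointer itself,
 * no matter how much memory it points to. */
size_t bytes_in_pointer(void)
{
    int *pai = calloc(10, sizeof *pai);
    size_t n = sizeof pai;
    free(pai);      /* free(NULL) is a no-op, so no failure check is needed */
    return n;
}

/* Self-check tying the two results to the types involved. */
void compare_sizes(void)
{
    assert(bytes_in_array() == 10 * sizeof(int));
    assert(bytes_in_pointer() == sizeof(int *));
}
```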
Generally speaking, automatically allocated objects will be on the stack, while dynamically allocated objects will be on the heap. Although this distinction is implementation (not standard) dependent, the stack and the heap are the most commonly used way to manage memory in C programs. They are basically two distinct regions of memory: the first is dedicated to automatic allocations and the second is dedicated to dynamic allocations. So when you call a function (say, the main function), all the objects declared in the scope of this function will be stacked (automatically allocated on the stack). If some dynamic allocation happens in this function, the memory will be allocated on the heap, so that all pointers to this area will be pointing to objects outside the stack. When your function returns, all objects on the stack are also automatically unstacked and virtually don't exist anymore. But all objects on the heap will exist until you deallocate them (or they will be forcefully deallocated by the OS when the program ends). Arrays are structures that can be allocated automatically or dynamically. See this example:
int * automaticFactor() //wrong, see below
{
int x[10];
return &x[0];
}
int * dynamicFactor()
{
int * y = (int *) malloc(sizeof(int) * 10);
return &y[0];
}
int main()
{
//this will not work because &x[0] points to the stack
//and that area will be unstacked after the function return
int * n = automaticFactor();
//this will work because &y[0] points to the heap
//and that area will be in the heap until manual deallocation
int * m = dynamicFactor();
return 0;
}
Note that the pointers themselves are in the stack. What is in the heap is the area they are pointing to. So when you declare a pointer inside a function (such as the y of the example), it will also be unstacked at the end of the function. But since its value (i.e. the address of the allocated area) was returned to a pointer outside the function (i.e. to m), you will not lose track of the area allocated in the heap by the function.
I am new to C. I have these two files set up in this way.
I do not fully understand how I am able to assign values in the Item array without dynamically allocating memory.
The line Collection c; places all fields on the stack, so is that why I can directly set array members?
//collection.c
typedef struct {
uint32 price;
uint32 itemId;
} Item;
typedef struct {
Item item[MAX_SIZE];
uint32 name;
} Collection;
void function(Collection * ptr)
{
int i;
uint32 id = 0;
for(i = 0; i < MAX_SIZE; i++)
{
ptr->item[i].price = 10;
ptr->item[i].itemId = id;
id++;
}
}
//collection_main.c
Collection c; //global struct variable
//calls function in collection.c
function(&c);
I do not fully understand how I am able to assign values in the Item array without dynamically allocating memory.
First, as you are new to C, be aware of a potential issue with passing C functions pointers (which is quite reasonable, BTW). Unless you can guarantee that your calling code will always pass a valid pointer you need to check that pointer value in the function as best you can. That will typically amount to checking for a non-null pointer like this :
if ( ptr == NULL )
return <whatever to signal an error> ;
In this case you did allocate memory, because you created a Collection variable and that contains allocated space for the required fields.
The line Collection c; places all fields on the stack,
If it's in a function it will (typically) allocate space on that function's stack frame, which you should logically view as a separate area that the calling code cannot access. Make no assumptions about the layout of the stack. A very typical bug is to try and return a pointer to an item declared inside a function, and even supposedly experienced programmers have been known to do it.
Another potential bug in passing a pointer to a function is trying to access beyond the limits of the space allocated and pointed to. This can do things like corrupt other variables or even crash code. Your own code is correctly using the declared constant size of the array, so no problem.
If you do this outside of a function (which is possible), you would be using space reserved for this kind of variable when the program is loaded. That may not be on the stack but elsewhere. The OS gets that information from the compiled code file.
so is that why I can directly set array members ?
C code (and the executable binary that's produced by the compiler) does not care or check whether the pointers you pass are valid or not. So it's possible to pass a bad pointer to a C function and cause chaos.
In this case you did allocate all the required valid memory when you declared the variable and you passed a pointer to that variable. So no problem.
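Putting the two points together (the memory already exists once the variable is declared, and the callee should still check the pointer), a hedged sketch might look like this. It assumes MAX_SIZE is 8, uses uint32_t from <stdint.h> in place of the question's uint32 typedef, and the names fill_collection, fill_local, and global_collection are invented for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SIZE 8      /* assumed value; the question does not show it */

typedef struct {
    uint32_t price;
    uint32_t itemId;
} Item;

typedef struct {
    Item item[MAX_SIZE];    /* the array lives inside the struct itself */
    uint32_t name;
} Collection;

/* Static storage: space for all MAX_SIZE items exists before main() runs. */
static Collection global_collection;

/* Returns 0 on success, -1 if ptr is NULL. */
int fill_collection(Collection *ptr)
{
    if (ptr == NULL)
        return -1;

    for (uint32_t i = 0; i < MAX_SIZE; i++) {
        ptr->item[i].price = 10;
        ptr->item[i].itemId = i;
    }
    return 0;
}

/* Works the same way for a local (automatic) Collection. */
int fill_local(void)
{
    Collection local;
    int rc = fill_collection(&local);
    assert(rc == 0);
    return (int)local.item[MAX_SIZE - 1].itemId;
}
```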
Dynamic memory allocation
It is more usual to consider explicit allocation using the malloc() family of functions as dynamic allocation. Allocations for local and global variables may be dynamic in the sense that they can happen at runtime but the allocation and deallocation are not the responsibility of the programmer to explicitly control so you do not generally need to think about these as part of dynamic memory allocation.
A minor point to close :
uint32 name ;
I'd consider this a bad choice of field name. Using "name" implies a string, whereas you probably mean a string id from e.g. an array. So try something like :
uint32 nameid ;
instead.
You'd be surprised how many coding problems crop up in a production environment simply because of a poor choice of variable name. Make them informative if possible and practical.
This is just a good coding habit to get into, IMO.
If a variable is declared within a loop, does the previous declarations become garbage? For example, in the following:
loop{
int array[10];
array[i]=......
}
array is declared for each loop iteration. When it is newly declared, is the new memory location that array allocates same with the older location?. If it is not, does the older declarations become garbage, because the allocated area is not freed? Finally, how can it be freed without exiting the loop if the array is static like the above example?
You aren't actually allocating anything. This goes on the stack, and the size of the stack frame is calculated by your compiler at compile time. The array will reuse the same amount of stack space each iteration. The int array[10] does effectively nothing at run time.
There's a big difference by doing this:
for (...) {
int a[10];
a[0] = x;
}
and doing this:
for (...) {
int* a = (int*)malloc(sizeof(int)*10);
a[0] = x;
free(a);
}
The first "allocation" is fixed in size, and will cost you nothing. The second can be of variable size and will be a heap allocated array which you will need to manually free. C has no concept of garbage collection, so nothing really becomes garbage. But you are required to free whatever you allocate using the malloc function. If you never use that function you never need to free anything. The compiler will take care of that for you.
This is an automatic variable that the compiler handles - automatically.
You only have to take care of storage you allocate yourself using malloc (or new in C++). The rest is handled for you.
The array comes into scope each time you enter the loop and is destroyed again at the end of each loop. The compiler is very likely to reuse the same space each time, but that is not defined by the language. There will be no garbage either way.
You can assume that, for every iteration of the loop, a new array is created and it is destroyed at the end of the iteration. This implies that the content of the newly created array is indeterminate (it may be garbage; more likely it contains the same data, since it might occupy the same place on the stack).
However, internally there won't be any allocation or deallocation for the int array[10], as pointed out by Dervall.
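Because the array is a fresh object on every iteration, a value written in one pass cannot be relied on in the next. If the loop body needs known starting contents, it can give the array an initializer, which is applied each time the declaration is reached. A small sketch, with sum_first_reads as a hypothetical name:

```c
#include <assert.h>

/* Reads array[0] at the top of every iteration, then overwrites it.
 * The read always sees 0 because the initializer runs on each pass. */
int sum_first_reads(int iterations)
{
    assert(iterations >= 0);

    int total = 0;
    for (int i = 0; i < iterations; i++) {
        int array[10] = { 0 };  /* re-initialized on every iteration */
        total += array[0];      /* 0: the previous iteration's write is gone */
        array[0] = i + 1;       /* lost when this iteration's block ends */
    }
    return total;
}
```

Without the initializer, reading array[0] before writing it would read an indeterminate value.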
Why can I return from a function an array setup by malloc:
int *dog = (int*)malloc(n * sizeof(int));
but not an array setup by
int cat[3] = {0,0,0};
The "cat[ ]" array is returned with a Warning.
Thanks all for your help
This is a question of scope.
int cat[3]; // declares a local variable cat
Local variables versus malloc'd memory
Local variables exist on the stack. When this function returns, these local variables will be destroyed. At that point, the addresses used to store your array are recycled, so you cannot guarantee anything about their contents.
If you call malloc, you will be allocating from the heap, so the memory will persist beyond the life of your function.
If the function is supposed to return a pointer (in this case, a pointer-to-int which is the first address of the integer array), that pointer should point to good memory. Malloc is the way to ensure this.
Avoiding Malloc
You do not have to call malloc inside of your function (although it would be normal and appropriate to do so).
Alternatively, you could pass an address into your function which is supposed to hold these values. Your function would do the work of calculating the values and would fill the memory at the given address, and then it would return.
In fact, this is a common pattern. If you do this, however, you will find that you do not need to return the address, since you already know the address outside of the function you are calling. Because of this, it's more common to return a value which indicates the success or failure of the routine, like an int, than it is to return the address of the relevant data.
This way, the caller of the function can know whether or not the data was successfully populated or if an error occurred.
#include <stdio.h> // include stdio for the printf function
int rainCats (int *cats); // pass a pointer-to-int to function rainCats
int main (int argc, char *argv[]) {
int cats[3]; // cats is the address to the first element
int success; // declare an int to store the success value
success = rainCats(cats); // pass the address to the function
if (success == 0) {
int i;
for (i=0; i<3; i++) {
printf("cat[%d] is %d \r", i, cats[i]);
getchar();
}
}
return 0;
}
int rainCats (int *cats) {
int i;
for (i=0; i<3; i++) { // put a number in each element of the cats array
cats[i] = i;
}
return 0; // return a zero to signify success
}
Why this works
Note that you never did have to call malloc here because cats[3] was declared inside of the main function. The local variables in main will only be destroyed when the program exits. Unless the program is very simple, malloc will be used to create and control the lifespan of a data structure.
Also notice that rainCats is hard-coded to return 0. Nothing happens inside of rainCats which would make it fail, such as attempting to access a file, a network request, or other memory allocations. More complex programs have many reasons for failing, so there is often a good reason for returning a success code.
There are two key parts of memory in a running program: the stack, and the heap. The stack is also referred to as the call stack.
When you make a function call, information about the parameters, where to return, and all the variables defined in the scope of the function are pushed onto the stack. (It used to be the case that C variables could only be defined at the beginning of the function. Mostly because it made life easier for the compiler writers.)
When you return from a function, everything on the stack is popped off and is gone (and soon when you make some more function calls you'll overwrite that memory, so you don't want to be pointing at it!)
Anytime you allocate memory you are allocating it from the heap. That's some other part of memory, maintained by the allocation manager. Once you "reserve" part of it, you are responsible for it, and if you want to stop pointing at it, you're supposed to let the manager know. If you drop the pointer and can't ask to have it released any more, that's a leak.
You're also supposed to only look at the part of memory you said you wanted. Overwriting not just the part you said you wanted, but past (or before) that part of memory is a classic technique for exploits: writing information into part of memory that is holding computer instructions instead of data. Knowledge of how the compiler and the runtime manage things helps experts figure out how to do this. Well designed operating systems prevent them from doing that.
heap:
int *dog = (int*)malloc(n*sizeof(int));
stack:
int cat[3] = {0,0,0};
Because int cat[3] = {0,0,0}; is declaring an automatic variable that only exists while the function is being called.
There is a special case in C for string literals: a quoted string has static storage duration, so a pointer to it can be returned. This does not extend to automatic arrays, including arrays of char initialized from a string literal.
cat[] is allocated on the stack of the function you are calling, when that stack is freed that memory is freed (when the function returns the stack should be considered freed).
If what you want to do is populate an array of ints in the calling frame, pass in a pointer to an array that you control from the calling frame:
void findMyCats(int *cats); // declared before its first use
void somefunction() {
int cats[3];
findMyCats(cats);
}
void findMyCats(int *cats) {
cats[0] = 0;
cats[1] = 0;
cats[2] = 0;
}
of course this is contrived and I've hardcoded that the array length is 3 but this is what you have to do to get data from an invoked function.
A single value works because it's copied back to the calling frame;
int findACat() {
int cat = 3;
return cat;
}
In findACat, 3 is copied from findACat to the calling frame; since it's a known quantity, the compiler can do that for you. The data a pointer points to can't be copied because the compiler does not know how much to copy.
When you define a variable like 'cat' the compiler assigns it an address. The association between the name and the address is only valid within the scope of the definition. In the case of auto variables that scope is the function body from the point of definition onwards.
Auto variables are allocated on the stack. The same address on the stack is associated with different variables at different times. When you return an array, what is actually returned is the address of the first element of the array. Unfortunately, after the return, the compiler can and will reuse that storage for completely unrelated purposes. What you'd see at a source code level would be your returned variable mysteriously changing for no apparent reason.
Now, if you really must return an initialized array, you can declare that array as static. A static variable has a permanent rather than a temporary storage allocation. You'll need to keep in mind that the same memory will be used by successive calls to the function, so the results from the previous call may need to be copied somewhere else before making the next call.
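A minimal sketch of the static-array approach and its sharing pitfall. Both function names are invented for this example.

```c
#include <assert.h>

/* Hypothetical helper: fills a function-local static array and returns it. */
const int *next_triple(int start)
{
    static int triple[3];   /* one instance, alive for the whole program */
    for (int i = 0; i < 3; i++)
        triple[i] = start + i;
    return triple;          /* still valid after return: static storage */
}

/* Shows the pitfall: the second call overwrites the first call's result. */
int first_result_after_second_call(void)
{
    const int *first = next_triple(1);
    const int *second = next_triple(10);
    assert(first == second);    /* both point at the same static array */
    return first[0];            /* now holds 10, not 1 */
}
```

A caller that needs to keep the first result has to copy the three values out before calling next_triple again.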
Another approach is to pass the array in as an argument and write into it in your function. The calling function then owns the variable, and the issues with stack variables don't arise.
None of this will make much sense unless you carefully study how the stack works. Good luck.
You cannot return an array. You are returning a pointer. This is not the same thing.
You can return a pointer to the memory allocated by malloc() because malloc() has allocated the memory and reserved it for use by your program until you explicitly use free() to deallocate it.
You may not return a pointer to the memory allocated by a local array because as soon as the function ends, the local array no longer exists.
This is a question of object lifetime, not scope or stack or heap. While those terms are related to the lifetime of an object, they aren't equivalent to lifetime, and it's the lifetime of the object that you're returning that's important. For example, a dynamically allocated object has a lifetime that extends from allocation to deallocation. A local variable's lifetime might end when the scope of the variable ends, but if it's static its lifetime won't end there.
The lifetime of an object that has been allocated with malloc() is until that object has been freed using the free() function. Therefore when you create an object using malloc(), you can legitimately return the pointer to that object as long as you haven't freed it - it will still be alive when the function ends. In fact you should take care to do something with the pointer so it gets remembered somewhere or it will result in a leak.
The lifetime of an automatic variable ends when the scope of the variable ends (so scope is related to lifetime). Therefore, it doesn't make sense to return a pointer to such an object from a function - the pointer will be invalid as soon as the function returns.
Now, if your local variable is static instead of automatic, then its lifetime extends beyond the scope that it's in (therefore scope is not equivalent to lifetime). So if a function has a local static variable, the object will still be alive even when the function has returned, and it would be legitimate to return a pointer to a static array from your function. Though that brings in a whole new set of problems because there's only one instance of that object, so returning it multiple times from the function can cause problems with sharing the data (it basically only works if the data doesn't change after initialization or there are clear rules for when it can and cannot change).
Another example taken from another answer here is regarding string literals - pointers to them can be returned from a function not because of a scoping rule, but because of a rule that says that string literals have a lifetime that extends until the program ends.
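The string-literal rule can be sketched as follows. The names weekday_name and name_length are hypothetical; the point is that the returned pointers refer to literals, which live until the program ends.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returning a pointer to a string literal is fine: literals have static
 * storage duration. Returning a local array such as
 *     char local[] = "Sunday"; return local;
 * would not be, because local is an automatic object. */
const char *weekday_name(int day)
{
    switch (day) {
    case 0:  return "Sunday";
    case 1:  return "Monday";
    default: return "unknown";
    }
}

/* The pointer is still usable long after weekday_name has returned. */
size_t name_length(int day)
{
    const char *name = weekday_name(day);
    assert(name != NULL);
    return strlen(name);
}
```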
I have been writing C for only a scant few weeks and have not taken the time to worry myself too much about malloc(). Recently, though, a program of mine returned a string of happy faces instead of the true/false values I had expected from it.
If I create a struct like this:
typedef struct Cell {
struct Cell* subcells;
} Cell;
and then later initialize it like this
Cell makeCell(int dim) {
Cell newCell;
for(int i = 0; i < dim; i++) {
newCell.subcells[i] = makeCell(dim -1);
}
return newCell; //ha ha ha, this is here in my program don't worry!
}
Am I going to end up accessing happy faces stored in memory somewhere, or perhaps writing over previously existing cells, or what? My question is, how does C allocate memory when I haven't actually malloc()ed the appropriate amount of memory? What's the default?
Short answer: It isn't allocated for you.
Slightly longer answer: The subcells pointer is uninitialized and may point anywhere. This is a bug, and you should never allow it to happen.
Longer answer still: Automatic variables are allocated on the stack, global variables are allocated by the compiler and often occupy a special segment or may be in the heap. Global variables are initialized to zero by default. Automatic variables do not have a default value (they simply get the value found in memory) and the programmer is responsible for making sure they have good starting values (though many compilers will try to clue you in when you forget).
The newCell variable in your function is automatic, and is not initialized. You should fix that pronto. Either give newCell.subcells a meaningful value promptly, or point it at NULL until you allocate some space for it. That way you'll throw a segmentation violation if you try to dereference it before allocating some memory for it.
Worse still, you are returning a Cell by value, but assigning it to a Cell * when you try to fill the subcells array. Either return a pointer to a heap allocated object, or assign the value to a locally allocated object.
A usual idiom for this would have the form something like
Cell* makeCell(int dim){
Cell *newCell = malloc(sizeof(Cell));
// error checking here
// assumes subcells is declared as struct Cell **subcells (an array of pointers)
newCell->subcells = malloc(sizeof(Cell*)*dim); // what if dim=0?
// more error checking
for (int i=0; i<dim; ++i){
newCell->subcells[i] = makeCell(dim-1);
// what error checking do you need here?
// depends on your other error checking...
}
return newCell;
}
though I've left you a few problems to hammer out..
And note that you have to keep track of all the bits of memory that will eventually need to be deallocated...
There is no default value for your pointer. Your pointer will point to whatever it stores currently. As you haven't initialized it, the line
newCell.subcells[i] = ...
Effectively accesses some uncertain part of memory. Remember that subcells[i] is equivalent to
*(newCell.subcells + i)
If the left side contains some garbage, you will end up adding i to a garbage value and access the memory at that uncertain location. As you correctly said, you will have to initialize the pointer to point to some valid memory area:
newCell.subcells = malloc(bytecount)
After which line you can access that many bytes. With regards to other sources of memory, there are different kind of storage that all have their uses. What kind you get depends on what kind of object you have and which storage class you tell the compiler to use.
malloc returns a pointer to an object with no type. You can make a pointer point to that region of memory, and the type of the object will effectively become the type of the pointed to object type. The memory is not initialized to any value and access usually is slower. Objects so obtained are called allocated objects.
You can place objects globally. Their memory will be initialized to zero. For pointers, you will get NULL pointers; for floats you will get a proper zero too. You can rely on a proper initial value.
If you have local variables but use the static storage class specifier, then you will have the same initial value rule as for global objects. The memory usually is allocated the same way like global objects, but that's in no way a necessity.
If you have local variables without any storage class specifier or with auto, then your variable will be allocated on the stack (even though not defined so by C, this is what compilers do practically of course). You can take its address in which case the compiler will have to omit optimizations like putting it into registers of course.
Local variables used with the storage class specifier register, are marked as having a special storage. As a result, you cannot take its address anymore. In recent compilers, there is normally no need to use register anymore, because of their sophisticated optimizers. If you are really expert, then you may get some performance out of it if using it, though.
Objects have associated storage durations that can be used to show the different initialization rules (formally, they only define how long at least the objects live). Objects declared with auto and register have automatic storage duration and are not initialized. You have to explicitly initialize them if you want them to contain some value. If you do not, they will contain whatever was left on the stack before their lifetime began. Objects that are allocated by malloc (or another function of that family, like calloc) have allocated storage duration. Their storage is not initialized either. An exception is when using calloc, in which case the memory is initialized to zero ("real" zero, i.e. all bytes 0x00, without regard to any NULL pointer representation). Objects that are declared with static and global variables have static storage duration. Their storage is initialized to zero appropriate for their respective type. Note that an object need not have a type, but the only way to get a type-less object is using allocated storage. (An object in C is a "region of storage".)
So what is what? Here is the fixed code. Once you have allocated a block of memory, you can no longer find out how many items it was allocated for, so it is best to always store that count somewhere. I've introduced a member dim to the struct to hold the count.
Cell makeCell(int dim) {
/* automatic storage duration => need to init manually */
Cell newCell;
/* note that in case dim is zero, we can either get NULL or a
* unique non-null value back from malloc. This depends on the
* implementation. */
newCell.subcells = malloc(dim * sizeof(*newCell.subcells));
newCell.dim = dim;
/* the following can be used as a check for an out-of-memory
* situation:
* if(newCell.subcells == NULL && dim > 0) ... */
for(int i = 0; i < dim; i++) {
newCell.subcells[i] = makeCell(dim - 1);
}
return newCell;
}
Now, things look like this for dim=2:
Cell {
subcells => {
Cell {
subcells => {
Cell { subcells => {}, dim = 0 }
},
dim = 1
},
Cell {
subcells => {
Cell { subcells => {}, dim = 0 }
},
dim = 1
}
},
dim = 2
}
Note that in C, the return value of a function need not be an object. No storage at all is required to exist for it. Consequently, you are not allowed to change it. For example, the following is not possible:
makeCell(0).dim++;
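A sketch of the usual workaround, assuming a made-up Pair type: copy the returned value into a variable, which is an object, and modify that instead.

```c
#include <assert.h>

/* Hypothetical type and functions, only for this illustration. */
typedef struct Pair {
    int a;
    int b;
} Pair;

Pair make_pair(int a, int b) {
    Pair p = { a, b };
    return p;
}

int bumped_first(void) {
    /* make_pair(1, 2).a++;   would not compile: the call result is not an lvalue */
    Pair copy = make_pair(1, 2);   /* copy is an object with its own storage */
    copy.a++;                      /* modifying the copy is fine */
    assert(copy.b == 2);           /* the other member is untouched */
    return copy.a;
}
```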
You will need a "free function" that frees the allocated memory again, because storage for allocated objects is not freed automatically. You have to call free for every subcells pointer in your tree. It's left as an exercise for you to write that up :)
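One possible shape for such a free function, as a sketch rather than the definitive answer to the exercise. It assumes the Cell struct has exactly the two members used above (subcells and dim), and it repeats a slightly more defensive variant of makeCell so that it compiles on its own; freeCell is a name chosen here.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: the subcells pointer plus the dim count introduced above. */
typedef struct Cell {
    struct Cell *subcells;
    int dim;
} Cell;

/* Defensive variant of makeCell: an empty or failed cell has
   subcells == NULL and dim == 0, so it is always safe to free. */
Cell makeCell(int dim) {
    Cell newCell;
    newCell.subcells = NULL;
    newCell.dim = 0;
    if (dim <= 0) {
        return newCell;
    }
    newCell.subcells = malloc((size_t)dim * sizeof *newCell.subcells);
    if (newCell.subcells == NULL) {
        return newCell;              /* out of memory: report an empty cell */
    }
    newCell.dim = dim;
    for (int i = 0; i < dim; i++) {
        newCell.subcells[i] = makeCell(dim - 1);
    }
    return newCell;
}

/* Frees the children depth-first, then the array that held them. */
void freeCell(Cell *cell) {
    assert(cell != NULL);
    for (int i = 0; i < cell->dim; i++) {
        freeCell(&cell->subcells[i]);
    }
    free(cell->subcells);            /* free(NULL) is a harmless no-op */
    cell->subcells = NULL;           /* leave the cell in a known state */
    cell->dim = 0;
}
```

A caller would pair the two, for example Cell root = makeCell(3); followed later by freeCell(&root);. Resetting the members at the end means an accidental second freeCell on the same cell does nothing.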
Any ordinary local variable that is not allocated on the heap (via malloc and similar calls) is allocated on the stack instead. Because of that, anything created in a particular function without being malloc'd will be destroyed when the function ends. That includes objects returned; when the stack is unwound after a function call, the returned object is copied to space set aside for it on the stack by the calling function.
Warning: If you want to return an object that has pointers to other objects in it, make sure that the objects pointed to are created on the heap, and better yet, create that object on the heap, too, unless it's not intended to survive the function in which it is created.
My question is, how does C allocate memory when I haven't actually malloc()ed the appropriate amount of memory? What's the default?
To not allocate memory. You have to explicitly create it on the stack or dynamically.
In your example, subcells points to an undefined location, which is a bug. Your function should return a pointer to a Cell struct at some point.
Am I going to end up accessing happy faces stored in memory somewhere, or perhaps writing over previously existing cells, or what?
You are lucky that you got a happy face. On one of those unlucky days, it could've wiped your system clean ;)
My question is, how does C allocate memory when I haven't actually malloc()ed the appropriate amount of memory?
It doesn't. What happens is that when you define your Cell newCell, the subcells pointer holds a garbage value, which may be 0 (in which case you'd get a crash) or some integer big enough to look like an actual memory address. In such cases the program will happily fetch whatever value resides there and hand it back to you.
What's the default?
This is the behavior if you don't initialize your variables. And your makeCell function looks a little under-developed.
There are really three sections where things can be allocated - data, stack & heap.
In the case you mention, it would be allocated on the stack. The problem with allocating something on the stack is that it's only valid for the duration of the function. Once your function returns, that memory is reclaimed. So, if you return a pointer to something allocated on the stack, that pointer will be invalid. If you return the actual object though (not a pointer), a copy of the object will automatically be made for the calling function to use.
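The difference between these ways of handing a result back can be sketched like this. Point and the function names are invented for the example, and the broken variant is left commented out on purpose.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical type, only for this illustration. */
typedef struct Point {
    int x;
    int y;
} Point;

/* Broken on purpose, so not compiled:
   Point *bad_point(void) { Point p = { 1, 2 }; return &p; }
   The returned pointer refers to stack memory that is reclaimed on return. */

/* Returning the object itself: the caller receives its own copy. */
Point point_by_value(int x, int y) {
    Point p = { x, y };
    return p;
}

/* Heap allocation: the object lives until the caller passes it to free(). */
Point *point_on_heap(int x, int y) {
    Point *p = malloc(sizeof *p);
    if (p != NULL) {
        p->x = x;
        p->y = y;
    }
    return p;   /* may be NULL if the allocation failed */
}
```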
If you had declared it as a global variable (e.g. in a header file or outside of a function) it would be allocated in the data section of memory. The memory in this section is allocated automatically when your program starts and deallocated automatically when it finishes.
If you allocate something on the heap using malloc(), that memory is good for as long as you want to use it - until you call free() at which point it is released. This gives you the flexibility to allocate and deallocate memory as you need it (as opposed to using globals where everything is allocated up front and only released when your program terminates).
Local variables are "allocated" on the stack. The stack is a preallocated amount of memory to hold those local variables. The variables cease to be valid when the function exits and will be overwritten by whatever comes next.
In your case, the code is doing nothing since it doesn't return your result. A pointer to an object on the stack also ceases to be valid when the scope exits, so I guess in your precise case (you seem to be building a linked list), you will need to use malloc().
I'm going to pretend I'm the computer here, reading this code...
typedef struct Cell {
struct Cell* subcells;
} Cell;
This tells me:
We have a struct type called Cell
It contains a pointer called subcells
The pointer should be to something of type struct Cell
It doesn't tell me whether the pointer goes to one Cell or an array of Cell. When a new Cell is made, the value of that pointer is undefined until a value is assigned to it. It's Bad News to use pointers before defining them.
Cell makeCell(int dim) {
Cell newCell;
New Cell struct, with an undefined subcells pointer. All this does is reserve a little chunk of memory to be called newCell that is the size of a Cell struct. It doesn't change the values that were in that memory - they could be anything.
for(int i = 0; i < dim; i++) {
newCell.subcells[i] = makeCell(dim -1);
In order to get newCell.subcells[i], a calculation is made to offset from subcells by i, then that is dereferenced. Specifically, this means the value is pulled from that memory address. Take, for instance, i==0... Then we would be dereferencing the subcells pointer itself (no offset). Since subcells is undefined, it could be anything. Literally anything! So, this would ask for a value from somewhere completely random in memory. There's no guarantee of anything with the result. It may print something, it may crash. It definitely should not be done.
}
return newCell;
}
Any time you work with a pointer, it's important to make sure it's set to a value before you dereference it. Encourage your compiler to give you any warnings it can; many modern compilers can catch this sort of thing. You can also give pointers cutesy default values like 0xdeadbeef (yup! that's a number in hexadecimal, it's just also a word, so it looks funny) so that they stand out. (The %p option for printf is helpful for displaying pointers, as a crude form of debugging. Debugger programs can also show them quite well.)
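A minimal sketch of that habit, using NULL rather than 0xdeadbeef as the default because NULL can be tested for portably. empty_cell and has_subcells are hypothetical helper names, and the Cell type is reduced to the single member shown above.

```c
#include <stddef.h>

typedef struct Cell {
    struct Cell *subcells;
} Cell;

/* Hypothetical constructor: gives the pointer a known value
   instead of whatever happened to be in memory. */
Cell empty_cell(void) {
    Cell c;
    c.subcells = NULL;
    return c;
}

/* Hypothetical check to run before any cell->subcells[i] access:
   returns 1 if there is something to dereference, 0 otherwise. */
int has_subcells(const Cell *cell) {
    return cell != NULL && cell->subcells != NULL;
}
```

With this in place, code that wants to touch subcells[i] can bail out early when has_subcells() returns 0, instead of reading from a random address.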