why use malloc with structure? - c

Why would I use malloc when the same job can be done without malloc, as below?
#include <stdio.h>
#include <stdlib.h>
struct node {
int data;
struct node *l;
struct node *r;
};
int main(){
//Case 1
struct node n1;
n1.data = 99;
printf("n1 data is %d\n", n1.data);
//Case 2
struct node *n2 = (struct node *) malloc (sizeof(struct node));
n2 -> data = 4444;
printf("n2 data is:%d\n",n2 -> data);
free(n2);
return (0);
}
I am having a hard time understanding how n1, which is not initialized to a memory location, is able to store data (99).
When should I use case 1 and when should I use case 2?

Why would I use malloc when the same job can be done without malloc, as below?
You use malloc to allocate memory on the heap. Without malloc, you are placing your struct in stack memory.
I am having a hard time understanding how n1, which is not initialized to a memory location, is able to store data (99).
The declaration struct node n1; already reserves storage for n1, so whether or not you initialize it, the assignment n1.data = 99; stores the value there.
When to use case 1 and when to use case 2
Case 1 is used when you know that you will be using the structure object within a confined scope and will not be making references to the structure data beyond that scope.
Case 2 is used when you will be using your structure in multiple places and you are willing to manage its memory manually (and carefully!). The advantage of this method is that you create and initialize the structure in one part of the program and then pass a pointer to it around. Passing a pointer (typically 4 or 8 bytes, depending on the platform) is usually cheaper than passing a large structure by value.

int main() {
struct node n1;
n1.data = 99;
This reserves space on the stack (in main's frame) equivalent to the size of a struct node. This is known as a local variable, and it only exists within the context of main.
struct node *n2 = (struct node *) malloc (sizeof(struct node));
This is an allocation on the heap. This memory exists no matter what function context you are in. This is typically called "dynamic allocation".
These node structures are the basis for a linked list, which can have nodes added, removed, re-ordered, etc. at will.
See also:
What and where are the stack and heap?

In the first case, memory is allocated on the stack. When the variable n1 goes out of scope, the memory is released.
In the second case, memory is allocated on the heap. You have to explicitly release memory resources (as you are doing with free).
A rule of thumb is to use stack-allocated memory for local, temporary data structures of limited size (the stack is only a portion of the computer's memory, and its size differs per platform). Use the heap for data structures that must persist or that are large.
Googling for stack and heap will give you much more information.

Your data type looks like a node in a tree. Two primary reasons to use malloc for tree node allocations would be:
To allocate an arbitrary number of nodes. The number of tree nodes will in the general case be a run-time value. For this reason, it is impossible to declare the proper number of local variables for such nodes, since all local variables have to be declared at compile time. Meanwhile, malloc can be called at run time as many times as you want, allocating as many node objects as you need.
To make sure that the node is not destroyed automatically when the lifetime of the local object ends (i.e. at the end of the block). Objects allocated by malloc live forever, i.e. until you destroy them explicitly by calling free. Such objects transcend block boundaries and function boundaries. Nothing like that is possible with local objects, since local objects are automatically destroyed at the end of their block.
Your code sample does not depend on any of the benefits of dynamic allocation, since it does not really create a real tree. It just declares a single node. But if you attempt to build a full tree with a run-time number of nodes, you will immediately realize that it is impossible to do by declaring nodes as local objects. You will unavoidably have to allocate your nodes using malloc.
Your "how n1 which is not initialized to memory location is able to store data" question must be caused by some confusion. struct node n1; is an object definition, which means that it assigns a memory location for n1. That's exactly the purpose of object definition.

Normally you use malloc only if you don't know how much memory you will need before the application is running.
Your code is correct, but with plain declarations (case 1) you can't allocate memory dynamically. What if you want to save a measured value in node.data and you don't know how many measurements you will capture? Then you have to malloc new memory for each measurement you take.
If you define all the nodes before run time (directly in the code), you can't save more measurements than you defined in advance.
Search for linked lists in C and you will find some good examples.

Related

C: Is my understanding about the specifics of heap and stack allocation correct?

I have a sort of linked list implemented (code at bottom) in C (which has obvious issues, but I'm not asking about those or about linked lists; I'm aware for instance that there are no calls to free() the allocated memory) given below which does what I expect (so far as I've checked). My question is about the first couple of lines of the addnodeto() function and what it does to the heap/stack.
My understanding is that calling malloc() sets aside some memory on the heap, and then returns the address of that memory (pointing to the beginning) which is assigned to struct node *newnode which is itself on the stack. When the function is first called, *nodeaddto is a pointer to struct node first, both of which are on the stack. Thus the (*nodeaddto)->next = newnode sets first.next equal to the value of newnode which is the address of the newly allocated memory.
When we leave this function, and continue executing the main() function, is *newnode removed from the stack (not sure if 'deallocated' is the correct word), leaving only struct node first pointing to the 'next' node struct on the heap? If so, does this 'next' struct node have a variable name also on the stack or heap, or is it merely some memory pointed to? Moreover, is it true to say that struct node first is on the stack, whilst all subsequent nodes will be on the heap, and that just before main() returns 0 there are no structs/variables on the stack other than struct node first? Or is/are there 1/more than 1 *newnode still on the stack?
I did try using GDB which showed that struct node *newnode was located at the same memory address both times addnodeto() was called (so was it removed and then happened to be re-defined/allocated in to the same location, or was perhaps the compiler being smart and left it there even once the function was exited the first time, or other?), but I couldn't work anything else out concretely. Thank you.
The code:
#include <stdio.h>
#include <stdlib.h>
#define STR_LEN 5
struct node {
char message[STR_LEN];
struct node *next;
};
void addnodeto(struct node **nodeaddto, char letter, int *num_of_nodes){
struct node *newnode = malloc(sizeof(struct node));
(*nodeaddto)->next = newnode;
newnode->message[0] = letter;
(*nodeaddto) = newnode;
*num_of_nodes += 1;
}
int main(void){
struct node first = {"F", NULL};
struct node *last = &first;
int num_nodes = 1;
addnodeto(&last, 'S', &num_nodes);
addnodeto(&last, 'T', &num_nodes);
addnodeto(&last, 'I', &num_nodes);
printf("Node: %d holds the char: %c\n", num_nodes-3, first.message[0]);
printf("Node: %d holds the char: %c\n", num_nodes-2, (first.next)->message[0]);
printf("Node: %d holds the char: %c\n", num_nodes-1, ((first.next)->next)->message[0]);
printf("Node: %d holds the char: %c\n", num_nodes, (last)->message[0]);
return 0;
}
Which when run outputs:
Node: 1 holds the char: F
Node: 2 holds the char: S
Node: 3 holds the char: T
Node: 4 holds the char: I
As expected.
My understanding is that calling malloc() sets aside some memory on the heap, and then returns the address of that memory (pointing to the beginning)…
Yes, but people who call it “the heap” are being sloppy with terminology. A heap is a kind of data structure, like a linked list, a binary tree, or a hash table. Heaps can be used for things other than tracking available memory, and available memory can be tracked using data structures other than a heap.
I do not actually know of a specific term for the memory that the memory management routines manage. There are actually several different sets of memory we might want terms for:
all the memory they have acquired from the operating system so far and are managing, including both memory that is currently allocated to clients and memory that has been freed (and not yet returned to the operating system) and is available for reuse;
the memory that is currently allocated to clients;
the memory that is currently available for reuse; and
the entire range of memory that is being managed, including portions of the virtual address space reserved for future mapping when necessary to request more memory from the operating system.
I have seen “pool” used to describe such memory but have not seen a specific definition of it.
… which is assigned to struct node *newnode which is itself on the stack.
struct node *newnode is indeed nominally on the stack in common C implementations. However, the C standard only classifies it as automatic storage duration, meaning its memory is automatically managed by the C implementation. The stack is the most common way to implement that, but specialized C implementations may do it in other ways. Also, once the compiler optimizes the program, newnode might not be on the stack; the compiler might generate code that just keeps it in a register, and there are other possibilities too.
A complication here is when we are talking about memory use in a C program, we can talk about the memory use in a model computer the C standard uses to describe the semantics of programs or the memory use in actual practice. For example, as the C standard describes it, every object has some memory reserved for it during its lifetime. However, when a program is compiled, the compiler can produce any code it wants that gets the same results as required by the C standard. (The output of the program has to be the same, and certain other interactions have to behave the same.) So a compiler might not use memory for an object at all. After optimization, an object might be in memory at one time and in registers at another, or it might always be in a register and never in memory, and it might be in different registers at different times, and it might not be any particular place because it might have been incorporated into other things. For example, in int x = 3; printf("%d\n", 4*x+2);, the compiler might eliminate x completely and just print “14”. So, when asking about where things are in memory, you should be clear about whether you want to discuss the semantics in the model computer that the C standard uses or the actual practice in optimized programs.
When the function is first called, *nodeaddto is a pointer to struct node first, both of which are on the stack.
nodeaddto may be on the stack, per above, but it also may be in a register. It is common that function arguments are passed in registers.
It points to a struct node. By itself, struct node is a type, so it is just a concept, not an object to point to. In contrast, “a struct node” is an object of that type. That object might or might not be on the stack; addnodeto would not care; it could link to it regardless of where it is in memory. Your main routine does create its first and last nodes with automatic storage duration, but it could use static just as well, and then the nodes would likely be located in a different part of memory rather than the stack, and addnodeto would not care.
Thus the (*nodeaddto)->next = newnode sets first.next equal to the value of newnode which is the address of the newly allocated memory.
Yes: In main, last is initialized to point to first. Then &last is passed to addnodeto, so nodeaddto is a pointer to last. So *nodeaddto is a pointer to first. So (*nodeaddto)->next is the next member in first.
When we leave this function, and continue executing the main() function, is *newnode removed from the stack (not sure if 'deallocated' is the correct word), leaving only struct node first pointing to the 'next' node struct on the heap?
newnode is an object with automatic storage duration inside addnodeto, so its memory is automatically released when addnodeto ends.
*newnode is a struct node with allocated storage duration, so its memory is not released when a function ends. Its memory is released when free is called, or possibly some other routine that may release memory, like realloc.
If so, does this 'next' struct node have a variable name also on the stack or heap, or it is merely some memory pointed [to]?
There are no variable names in the stack or in the heap. Variable names exist only in source code (and in the compiler while compiling and in debugging information associated with the compiled program, but that debugging information is generally separate from the normal execution of the program). When we work with allocated memory, we generally work with it only by pointers to it.
Moreover, is it true to say that struct node first is on the stack, whilst all subsequent nodes will be on the heap,…
Yes, subject to the caveats about stack and “heap” above.
… and that just before main() returns 0 there are no structs/variables on the stack other than struct node first?
All of the automatic objects in main are on the stack (or otherwise automatically managed): first, last, and num_nodes.
Or is/are there 1/more than 1 *newnode still on the stack?
No.

Dynamic vs. static memory allocation for a struct array

If you want to allocate an array of struct you can do it statically by declaring something like
struct myStruct myStructArray[100];
or dynamically with something like
struct myStruct *myStructArray = calloc(100, sizeof(struct myStruct) );
but in this case you are responsible for freeing the memory.
In many applications and samples I found a mixed approach:
struct wrapperStruct
{
int myInt;
struct myStruct myStructArray[1];
};
Then the allocation is performed like this
int n = 100;
size_t memory_size = sizeof(struct wrapperStruct) + (n - 1) * sizeof(struct myStruct);
struct wrapperStruct *wrapperStruct_p = calloc(1, memory_size);
So (if I understood correctly), since the array is the last member of the struct and the fields of a struct keep their order in memory, you are "extending" the single-entry array myStructArray with 99 more entries.
This allows you to safely write something like wrapperStruct_p->myStructArray[44] without causing a buffer overflow, and without having to create a separate dynamically allocated array of structs and then take care of freeing it at the end. So the alternative approach would be:
struct wrapperStruct
{
int myInt;
struct myStruct *myStructArray;
};
struct wrapperStruct *wrapperStruct_p = calloc(1, sizeof(struct wrapperStruct) );
wrapperStruct_p->myStructArray = calloc(100, sizeof(struct myStruct));
The question is what happens when you try to free the wrapperStruct_p variable ?
Are you causing a memory leak ?
Is the C memory management able to understand that the array of struct is made of 100 entries and not 1 ?
What are the benefits of the first approach apart from not having to free the pointer inside the struct ?
The question is what happens when you try to free the wrapperStruct_p
variable ?
Are you causing a memory leak ?
Most likely, but not necessarily. The memory for the inner dynamic array is not freed, but you could still free it later if you saved the pointer to some other variable.
Is the C memory management able to understand that the array of struct is made of 100 entries and not 1 ?
"C memory management" covers stack and heap allocations (the latter go through library calls that in turn ask the operating system for memory, so maybe it's not really "C memory management"). It doesn't do much else other than provide syntactic sugar on top of assembler, unlike garbage-collected languages such as Java. What free does know is the size of the block that malloc or calloc returned, so it releases that whole block regardless of how the array member is declared.
C itself doesn't care how many entries are somewhere or which part of memory you access (segmentation faults are the OS's response to memory access violations).
What are the benefits of the first approach apart from not having to
free the pointer inside the struct ?
If by "first approach" you mean the stack-allocated array, the main benefit is that you do not need to allocate anything yourself; the stack does it for you. The drawback is that the array stays allocated for the whole declared scope, and you can't free it or grow it. Further benefits are constant allocation speed and the assurance that you get your 100 array items no matter how the OS responds (many real-time applications require bounded response times, so a heap allocation can be a significant slowdown).
If by "first approach" you mean using the wrapper struct, then I do not see any benefits other than the one you already stated.
I'd even suggest you not use or advocate this approach, since it is a confusing technique with no noticeable benefit (plus it allocates one element even when none is used, but that's a detail).
The main goal is to write code that is easily understandable by other people. Machines and compilers can nowadays do wonders with code, so unless you are a compiler designer, standard library developer or machine level programmer for embedded systems, you should write simple to understand code.

Difference in creating a struct using malloc and without malloc

Could someone please explain to me the difference between creating a structure with and without malloc. When should malloc be used and when should the regular initialization be used?
For example:
struct person {
char* name;
};
struct person p = {.name="apple"};
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
What is really the difference between the two? When would one approach be used over others?
Consider a data structure like:
struct myStruct {
int a;
char *b;
};
struct myStruct p; // alternative 1
struct myStruct *q = malloc(sizeof(struct myStruct)); // alternative 2
Alternative 1: Allocates enough memory for a struct myStruct on the stack; &p gives you the address of its first byte. If p is declared in a function, its lifetime ends when the function returns, and you can no longer reach it.
Alternative 2: Allocates enough memory for a struct myStruct on the heap, plus a pointer of type struct myStruct * on the stack. The pointer on the stack is assigned the address of the struct on the heap, and that pointer value is what you work with. The struct's lifetime does not end until you call free(q).
In the latter case, say the struct sits at memory address 0xabcd0000 and q sits at memory address 0xdddd0000; then the pointer stored at 0xdddd0000 holds the value 0xabcd0000.
printf("%p\n", (void *)&p); // prints the address of p itself (a struct on the stack)
printf("%p\n", (void *)q); // prints "0xabcd0000" (the address of the heap struct)
printf("%p\n", (void *)&q); // prints "0xdddd0000" (the address of the pointer)
Addressing the second part of your question, when to use which:
If this struct is in a function and you need to use it after the function exits, you need to malloc it. You can use the value of the struct by returning the pointer, like: return q;.
If this struct is temporary and you do not need its value after, you do not need to malloc memory.
Usage with an example:
#include <stdlib.h>
struct myStruct {
int a;
char *b;
};
struct myStruct *foo() {
struct myStruct p;
p.a = 5;
return &p; // p's lifetime ends when foo returns, so this pointer dangles; compilers usually warn
}
struct myStruct *bar() {
struct myStruct *q = malloc(sizeof(struct myStruct));
q->a = 5;
return q;
}
int main() {
struct myStruct *pMain = foo();
// memory is allocated in foo. p.a was assigned as '5'.
// a memory address is returned.
// but be careful!!!
// memory is susceptible to be overwritten.
// it is out of your control.
struct myStruct *qMain = bar();
// memory is allocated in bar. q->a was assigned as '5'.
// a memory address is returned.
// memory is *not* susceptible to be overwritten
// until you call free(qMain), as done here:
free(qMain);
return 0;
}
If we assume both examples occur inside a function, then in:
struct person p = {.name="apple"};
the C implementation automatically allocates memory for p and releases it when execution of the function ends (or, if the statement is inside a block nested in the function, when execution of that block ends). This is useful when:
You are working with objects of modest size. (For big objects, using many kibibytes of memory, malloc may be better. The thresholds vary depending on circumstances.)
You are working with a small number of objects at one time.
In:
struct person* p_tr = malloc(sizeof(struct person));
p_tr->name = "apple";
the program explicitly requests memory for an object, and the program generally should release that memory with free when it is done with the object. This is useful when:
The object must be returned to the caller of the function. An automatic object, as used above, will cease to exist (in the C model of computation; the actual memory in your computer does not stop existing—rather it is merely no longer reserved for use for the object) when execution of the function ends, but this allocated object will continue to exist until the program frees it (or ends execution).
The object is very large. (Generally, C implementations provide more memory for allocation by malloc than they do for automatic objects.)
The program will create a variable number of such objects, depending on circumstances, such as creating linked lists, trees, or other structures from input whose size is not known before it is read.
Note that struct person p = {.name="apple"}; initializes the name member with "apple" and initializes all other members to zero. However, the code that uses malloc and assigns to p_tr->name does not initialize the other members.
If struct person p = {.name="apple"}; appears outside of a function, then it creates an object with static storage duration. It will exist for the duration of program execution.
Instead of struct person* p_tr = malloc(sizeof(struct person));, it is preferable to use struct person *p_tr = malloc(sizeof *p_tr);. With the former, a change to the type of p_tr requires edits in two places, which gives a human an opportunity to make mistakes. With the latter, changing the type of p_tr in just one place will still result in the correct size being requested.
struct person p = {.name="apple"};
^This is automatic allocation for a variable/instance of type struct person.
struct person* p_tr = malloc(sizeof(struct person));
^This is dynamic allocation for a variable/instance of type struct person.
Static memory allocation occurs at Compile Time.
Dynamic memory allocation means it allocates memory at runtime when the program executes that line of instruction
Judging by your comments, you are interested in when to use one or the other. Note that all types of allocation reserve enough memory to fit the value of the variable. The size depends on the type of the variable. Statically allocated variables are pinned to a place in memory by the compiler. Automatically allocated variables are pinned to a place on the stack by the same compiler. Dynamically allocated variables do not exist before the program starts and do not have any place in memory until they are allocated by malloc or other functions.
All named variables are allocated statically or automatically. Dynamic variables are allocated by the program, but in order to be able to access them, one still needs a named variable, which is a pointer. A pointer is a variable which is big enough to keep an address of another variable. The latter could be allocated dynamically or statically or automatically.
The question is, what to do if your program does not know the number of objects it needs to use during the execution time. For example, what if you read some data from a file and create a dynamic struct, like a list or a tree in your program. You do not know exactly how many members of such a struct you would have. This is the main use for the dynamically allocated variables. You can create as many of them as needed and put all on the list. In the simplest case you only need one named variable which points to the beginning of the list to know about all of the objects on the list.
Another interesting use is when you return a complex struct from a function. If allocated automatically on the stack, it will cease to exist after returning from the function. Dynamically allocated data will be persistent till it is explicitly freed. So, using the dynamic allocation would help here.
There are other uses as well.
In your simple example there is not much difference between the two cases. The second requires additional work: a call to the malloc function to allocate the memory for your struct. In the first case, by contrast, the memory for the struct is allocated in a static program region set up at program start-up (if the definition is at file scope). Note that the pointer in the second case is then also allocated statically; it just holds the address of the memory region for the struct.
Also, as a general rule, the dynamically allocated data should be eventually freed by the 'free' function. You cannot free the static data.

C - Constructor, better to return struct or pointer to struct?

I'm currently making a RedBlackTree in C and I still don't understand which one is better / more ideal when it comes to having a constructor function for your structures.
struct RedBlackTree* RedBlackTree_new()
{
struct RedBlackTree *tree = calloc(1, sizeof(struct RedBlackTree));
if (tree == NULL)
error(DS_MSG_OUT_OF_MEM);
return tree;
}
VS.
struct RedBlackTree RedBlackTree_new()
{
struct RedBlackTree tree;
tree.root = NULL;
tree.size = 0;
return tree;
}
I mean, if I do the second option, then I constantly have to pass it into my functions as a pointer using &, and to my knowledge, I can never destroy it until my program ends (can someone verify if that's true?). For example, if I had a destroy function for my tree, I wouldn't be able to free the memory allocated from structures within the RedBlackTree if they weren't created with malloc or calloc, right?
Also, in a more general case, what are the advantages and disadvantages of either? I can always retrieve the data from the pointer by using * and I can always turn my data into a pointer by using &, so it almost feels like they are completely interchangeable in a sense.
The real difference is the lifetime of the object. An object allocated on heap through dynamic allocation (malloc/calloc and free) survives until it's explicitly freed.
By contrast, an object with automatic storage, like the one in your second example, survives only for the scope in which it is declared and must be copied somewhere else to outlive it.
So this should help you choose whichever better suits a specific circumstance.
From an efficiency perspective, dynamic allocation is more expensive and requires additional indirection, but it lets you pass pointers around, which avoids copying data and so can be more efficient in other situations, e.g. when objects are large and copies would be expensive.
Firstly, it's better to use typedef; it's easier.
If you create an object dynamically, you need to free every dynamically allocated member of the object yourself, or the memory leaks.
If it is a big struct, returning it by value creates a temporary copy, which costs more. So I prefer the pointer.
And disregard what I said before; I was half asleep.

Malloc for character array and Structures

In this question the usage of malloc for character arrays is explained in a detailed way.
When should I use malloc in C and when don't I?
Is this same for structures in C?
For example consider the following definition:
struct node
{
int x;
struct node * link;
};
typedef struct node * NODE;
Consider the following two usages of the above structure:
1)
NODE temp = (NODE) malloc(sizeof(struct node));
temp->x =5;
temp->link = NULL;
2)
struct node node1, *temp;
node1.x = 5;
node1.link = NULL;
temp = &node1;
Can I use the declaration of temp from the second example and modify the node1.link point to another structure struct node node2 by using temp->link = &node2 (pointer to node2 structure)?
Here, this implementation is used for creating a tree data structure.
Will the structures also follow the same rules as like arrays as stated in the above link?
Because many implementations I have seen followed the first usage. Is there any specific reason for using malloc ?
You can do what you describe in #2, but remember that it will only work as long as node1 is in scope. If node1 is a local variable in a function, then when the function returns, the memory location it refers to will be reclaimed and used for something else. Any other pointers in your program which still point to &node1 will no longer be valid. One of the advantages of using malloc to allocate memory dynamically is that it remains valid until you call free to dispose of it explicitly.
The first usage uses heap memory, which can be almost as large as your physical memory, but the second usage uses stack memory, which is quite scarce (normally about 8 MB).
Yes. You can do "temp->link = &node2;", because temp points to node1.
Yes, this rule is very general in C, so it can work for arrays.
When malloc is used, the space is allocated on the heap rather than the stack. This lets you allocate more space for large data structures. The downside is that you also need to free the memory after use; otherwise, a memory leak will occur.
If you allocate memory dynamically, you must deallocate that memory programmatically by calling free().
In the first usage you are allocating memory dynamically, so it comes from the heap. Data stored in heap memory stays there until you call free; otherwise the memory is reclaimed when the program terminates.
The second usage stores the value on the stack, so its scope is local to the function.
Your example suggests you are working on a linked list, so you may need to add or delete nodes at any time; dynamic allocation is the best option for that.

Resources