In this question the usage of malloc for character arrays is explained in a detailed way.
When should I use malloc in C and when don't I?
Is this the same for structures in C?
For example consider the following definition:
struct node
{
int x;
struct node * link;
};
typedef struct node * NODE;
Consider the following two usages of the above structure:
1)
NODE temp = (NODE) malloc(sizeof(struct node));
temp->x =5;
temp->link = NULL;
2)
struct node node1, *temp;
node1.x = 5;
node1.link = NULL;
temp = &node1;
Can I use the declaration of temp from the second example and modify the node1.link point to another structure struct node node2 by using temp->link = &node2 (pointer to node2 structure)?
Here, this implementation is used for creating a tree data structure.
Do structures follow the same rules as arrays, as stated in the linked question?
Many implementations I have seen follow the first usage. Is there a specific reason for using malloc?
You can do what you describe in #2, but remember that it will only work as long as node1 is in scope. If node1 is a local variable in a function, then when the function returns, the memory location it refers to will be reclaimed and used for something else. Any other pointers in your program which still point to &node1 will no longer be valid. One of the advantages of using malloc to allocate memory dynamically is that it remains valid until you call free to dispose of it explicitly.
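A minimal sketch of that difference, assuming the struct node from the question. make_node is a hypothetical helper, not something from the original code. The node it returns is still valid after the function returns because it came from malloc; the commented-out variant shows the mistake of returning the address of a local.

```c
#include <stdlib.h>

struct node {
    int x;
    struct node *link;
};

/* Hypothetical helper: allocates a node that outlives this call.
 * Returns NULL if malloc fails. The caller must free() the result. */
struct node *make_node(int x)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->x = x;
    n->link = NULL;
    return n;   /* still valid after return: allocated storage */
}

/* By contrast, this would be a bug:
 *
 *     struct node *bad_node(int x)
 *     {
 *         struct node local = { x, NULL };
 *         return &local;   // local's lifetime ends when the function returns
 *     }
 */
```

A caller would write something like struct node *n = make_node(5); and later free(n); once the node is no longer needed.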
The first usage uses heap memory, which can be almost as large as your physical memory, but the second usage uses stack memory, which is quite scarce (typically a few megabytes, for example 8 MB by default on many Linux systems).
Yes. You can do "temp->link = &node2;", because temp points to node1.
Yes, this rule is very general in C, so it can work for arrays.
When malloc is used, the space is allocated on the heap rather than the stack. This allows you to allocate more space for large data structures. The downside is that you also need to free the memory after use. Otherwise, a memory leak will occur.
If you allocate memory dynamically, you must deallocate that memory programmatically by calling free().
In the first case you are allocating memory dynamically, so it comes from the heap. Data stored in heap memory stays there until you call free; otherwise it is only reclaimed when the program terminates.
Your second case stores the value on the stack, so its lifetime is local to the function.
Your example looks like a linked list, where you may need to add or delete nodes at any time, so dynamic allocation is the best option.
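To make the "add or delete at any time" point concrete, here is a rough sketch using the struct node from the question. The function names list_push, list_remove and list_clear are made up for illustration, and inserting at the front is just one possible design.

```c
#include <stdlib.h>

struct node {
    int x;
    struct node *link;
};

/* Insert a new node at the front of the list. Returns 0 on success,
 * -1 if malloc fails (the list is left unchanged). */
int list_push(struct node **head, int x)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->x = x;
    n->link = *head;
    *head = n;
    return 0;
}

/* Unlink and free the first node whose value is x.
 * Returns 1 if a node was removed, 0 if none matched. */
int list_remove(struct node **head, int x)
{
    for (struct node **pp = head; *pp != NULL; pp = &(*pp)->link) {
        if ((*pp)->x == x) {
            struct node *dead = *pp;
            *pp = dead->link;   /* bypass the node before freeing it */
            free(dead);
            return 1;
        }
    }
    return 0;
}

/* Free every remaining node and reset the head to NULL. */
void list_clear(struct node **head)
{
    while (*head != NULL) {
        struct node *next = (*head)->link;
        free(*head);
        *head = next;
    }
}
```

None of this would be possible with a fixed set of local struct node variables, because the number of nodes changes while the program runs.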
I have a sort of linked list implemented (code at bottom) in C (which has obvious issues, but I'm not asking about those or about linked lists; I'm aware for instance that there are no calls to free() the allocated memory) given below which does what I expect (so far as I've checked). My question is about the first couple of lines of the addnodeto() function and what it does to the heap/stack.
My understanding is that calling malloc() sets aside some memory on the heap, and then returns the address of that memory (pointing to the beginning) which is assigned to struct node *newnode which is itself on the stack. When the function is first called, *nodeaddto is a pointer to struct node first, both of which are on the stack. Thus the (*nodeaddto)->next = newnode sets first.next equal to the value of newnode which is the address of the newly allocated memory.
When we leave this function, and continue executing the main() function, is *newnode removed from the stack (not sure if 'deallocated' is the correct word), leaving only struct node first pointing to the 'next' node struct on the heap? If so, does this 'next' struct node have a variable name also on the stack or heap, or it is merely some memory pointed to? Moreover, is it true to say that struct node first is on the stack, whilst all subsequent nodes will be on the heap, and that just before main() returns 0 there are no structs/variables on the stack other than struct node first? Or is/are there 1/more than 1 *newnode still on the stack?
I did try using GDB which showed that struct node *newnode was located at the same memory address both times addnodeto() was called (so was it removed and then happened to be re-defined/allocated in to the same location, or was perhaps the compiler being smart and left it there even once the function was exited the first time, or other?), but I couldn't work anything else out concretely. Thank you.
The code:
#include <stdio.h>
#include <stdlib.h>
#define STR_LEN 5
struct node {
char message[STR_LEN];
struct node *next;
};
void addnodeto(struct node **nodeaddto, char letter, int *num_of_nodes){
struct node *newnode = malloc(sizeof(struct node));
(*nodeaddto)->next = newnode;
newnode->message[0] = letter;
(*nodeaddto) = newnode;
*num_of_nodes += 1;
}
int main(void){
struct node first = {"F", NULL};
struct node *last = &first;
int num_nodes = 1;
addnodeto(&last, 'S', &num_nodes);
addnodeto(&last, 'T', &num_nodes);
addnodeto(&last, 'I', &num_nodes);
printf("Node: %d holds the char: %c\n", num_nodes-3, first.message[0]);
printf("Node: %d holds the char: %c\n", num_nodes-2, (first.next)->message[0]);
printf("Node: %d holds the char: %c\n", num_nodes-1, ((first.next)->next)->message[0]);
printf("Node: %d holds the char: %c\n", num_nodes, (last)->message[0]);
return 0;
}
Which when run outputs:
Node: 1 holds the char: F
Node: 2 holds the char: S
Node: 3 holds the char: T
Node: 4 holds the char: I
As expected.
My understanding is that calling malloc() sets aside some memory on the heap, and then returns the address of that memory (pointing to the beginning)…
Yes, but people who call it “the heap” are being sloppy with terminology. A heap is a kind of data structure, like a linked list, a binary tree, or a hash table. Heaps can be used for things other than tracking available memory, and available memory can be tracked using data structures other than a heap.
I do not actually know of a specific term for the memory that the memory management routines manage. There are actually several different sets of memory we might want terms for:
all the memory they have acquired from the operating system so far and are managing, including both memory that is currently allocated to clients and memory that has been freed (and not yet returned to the operating system) and is available for reuse;
the memory that is currently allocated to clients;
the memory that is currently available for reuse; and
the entire range of memory that is being managed, including portions of the virtual address space reserved for future mapping when necessary to request more memory from the operating system.
I have seen “pool” used to describe such memory but have not seen a specific definition of it.
… which is assigned to struct node *newnode which is itself on the stack.
struct node *newnode is indeed nominally on the stack in common C implementations. However, the C standard only classifies it as automatic storage duration, meaning its memory is automatically managed by the C implementation. The stack is the most common way to implement that, but specialized C implementations may do it in other ways. Also, once the compiler optimizes the program, newnode might not be on the stack; the compiler might generate code that just keeps it in a register, and there are other possibilities too.
A complication here is when we are talking about memory use in a C program, we can talk about the memory use in a model computer the C standard uses to describe the semantics of programs or the memory use in actual practice. For example, as the C standard describes it, every object has some memory reserved for it during its lifetime. However, when a program is compiled, the compiler can produce any code it wants that gets the same results as required by the C standard. (The output of the program has to be the same, and certain other interactions have to behave the same.) So a compiler might not use memory for an object at all. After optimization, an object might be in memory at one time and in registers at another, or it might always be in a register and never in memory, and it might be in different registers at different times, and it might not be any particular place because it might have been incorporated into other things. For example, in int x = 3; printf("%d\n", 4*x+2);, the compiler might eliminate x completely and just print “14”. So, when asking about where things are in memory, you should be clear about whether you want to discuss the semantics in the model computer that the C standard uses or the actual practice in optimized programs.
When the function is first called, *nodeaddto is a pointer to struct node first, both of which are on the stack.
nodeaddto may be on the stack, per above, but it also may be in a register. It is common that function arguments are passed in registers.
It points to a struct node. By itself, struct node is a type, so it is just a concept, not an object to point to. In contrast, “a struct node” is an object of that type. That object might or might not be on the stack; addnodeto would not care; it could link to it regardless of where it is in memory. Your main routine does create its first and last nodes with automatic storage duration, but it could use static just as well, and then the nodes would likely be located in a different part of memory rather than the stack, and addnodeto would not care.
Thus the (*nodeaddto)->next = newnode sets first.next equal to the value of newnode which is the address of the newly allocated memory.
Yes: In main, last is initialized to point to first. Then &last is passed to addnodeto, so nodeaddto is a pointer to last. So *nodeaddto is a pointer to first. So (*nodeaddto)->next is the next member of first.
When we leave this function, and continue executing the main() function, is *newnode removed from the stack (not sure if 'deallocated' is the correct word), leaving only struct node first pointing to the 'next' node struct on the heap?
newnode is an object with automatic storage duration inside addnodeto, so its memory is automatically released when addnodeto ends.
*newnode is a struct node with allocated storage duration, so its memory is not released when a function ends. Its memory is released when free is called, or possibly some other routine that may release memory, like realloc.
If so, does this 'next' struct node have a variable name also on the stack or heap, or it is merely some memory pointed [to]?
There are no variable names in the stack or in the heap. Variable names exist only in source code (and in the compiler while compiling and in debugging information associated with the compiled program, but that debugging information is generally separate from the normal execution of the program). When we work with allocated memory, we generally work with it only by pointers to it.
Moreover, is it true to say that struct node first is on the stack, whilst all subsequent nodes will be on the heap,…
Yes, subject to the caveats about stack and “heap” above.
… and that just before main() returns 0 there are no structs/variables on the stack other than struct node first?
All of the automatic objects in main are on the stack (or otherwise automatically managed): first, last, and num_nodes.
Or is/are there 1/more than 1 *newnode still on the stack?
No.
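Because first has automatic storage duration while the later nodes have allocated storage duration, cleanup has to treat them differently. A minimal sketch, assuming the struct node from the question and assuming every next pointer is either NULL or a pointer returned by malloc. free_tail is a hypothetical helper. Note that the question's addnodeto never sets newnode->next, so it would need newnode->next = NULL; for this loop to terminate safely.

```c
#include <stddef.h>
#include <stdlib.h>

#define STR_LEN 5

struct node {
    char message[STR_LEN];
    struct node *next;
};

/* Hypothetical helper: frees every node after *first and returns how many
 * were freed. *first itself is NOT freed, because in the question it is an
 * automatic object in main, not something returned by malloc. */
size_t free_tail(struct node *first)
{
    size_t freed = 0;
    struct node *cur = first->next;
    while (cur != NULL) {
        struct node *next = cur->next;  /* read the link before freeing cur */
        free(cur);
        cur = next;
        freed++;
    }
    first->next = NULL;  /* leave no dangling pointer behind */
    return freed;
}
```

In the question's main, calling free_tail(&first) just before return 0; would release the three allocated nodes, while first, last and num_nodes are released automatically when main returns.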
Assume I have a struct node like such:
struct node {
struct other_struct value;
int some_num;
};
I have seen snippets of code where the struct can be initialized without calling malloc, like this:
struct node my_node;
my_node.value = NULL;
my_node.some_num = 2;
And then value can later be malloced. However, how would I free my_node?
my_node is allocated on the stack, and once it goes out of scope, its memory is automatically deallocated.
Use malloc if you want to allocate something on the heap, and it will persist until you free it. For your example, you would do:
struct node *my_node = malloc(sizeof(struct node));
my_node->some_num = 2;
// Sometime later
free(my_node);
Space will also be allocated for the value field if you malloc (as long as you pass the right size) or if you declare on the stack. sizeof(struct node) includes the size of other_struct.
Currently with those code snippets you can't really free value or my_node, because neither was dynamically allocated. If the other_struct field in your node structure were a pointer, you could dynamically allocate some memory and save the address in that pointer.
Then when done, free the pointer value.
If then, my_node was also a pointer (which I think is what you want), you would need to allocate memory for the node struct, and save the address to the pointer my_node. THEN, allocate some memory for the other_struct struct and save it to the pointer value. And after you're done, you would free value FIRST, then free my_node.
When doing things like this, I generally create a little constructor/destructor function pair, which will do it all for me. It can be too easy to forget to free the inner pointer (value) and just free the outer pointer (my_node). That would cause a memory leak, as value would still be allocated and taking up room in memory.
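A sketch of what such a constructor/destructor pair might look like, assuming the pointer variant of the struct (struct other_struct *value). The contents of struct other_struct and the names node_create and node_destroy are invented for illustration; the question does not show them.

```c
#include <stdlib.h>

/* Hypothetical contents; the question does not show struct other_struct. */
struct other_struct {
    int payload;
};

struct node {
    struct other_struct *value;  /* pointer variant, owned by the node */
    int some_num;
};

/* "Constructor": allocates the node and its inner value together.
 * Returns NULL (and leaks nothing) if either allocation fails. */
struct node *node_create(int some_num, int payload)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = malloc(sizeof *n->value);
    if (n->value == NULL) {
        free(n);
        return NULL;
    }
    n->value->payload = payload;
    n->some_num = some_num;
    return n;
}

/* "Destructor": frees the inner pointer first, then the node.
 * Accepts NULL, like free() does. */
void node_destroy(struct node *n)
{
    if (n == NULL)
        return;
    free(n->value);
    free(n);
}
```

With this in place, callers only ever pair node_create with node_destroy and cannot forget the inner free.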
The code:
struct node {
struct other_struct value;
int some_num;
};
defines value to be a member of struct node. It is a part of struct node. Whenever a struct node is created, it will contain the member named value, and no other memory needs to be allocated for value. Whether a struct node is created automatically or by malloc or other means, its memory will always include memory for value.
If you changed it to:
struct node {
struct other_struct *value;
int some_num;
};
then value would be a pointer to a struct other_struct, and you would need to provide memory for it to point to.
In this code:
struct node my_node;
my_node.value = NULL;
my_node.some_num = 2;
the object my_node is created automatically, and you do not have to do anything to release it or its members; that will happen automatically. You cannot use my_node.value = NULL because value is a structure, not a pointer.
If the structure definition is changed so that value is a pointer, then you can set it to NULL, or you can set it to point at an existing object, or you can allocate memory for it and set it to point to the allocated memory. If you allocate memory for it, then you should ensure that memory is later freed (except it is okay not to free it if you are intentionally keeping it to the end of program execution anyway, and you are executing in user mode on a general-purpose operating system).
You don't free it. It is destroyed when the program leaves the block in which it was declared.
Let's say that this variable is defined at the beginning of a function's body. When the function is called, the variable is created because it is a local. When the function returns, all of its local variables are destroyed, and so is your local structure.
If you declare such a variable outside of any function, it has static storage duration: it is created at program initialization and lives as long as the program itself.
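A small sketch of the two lifetimes described above. The names global_count, bump_global and bump_local are made up for illustration.

```c
/* Defined outside any function: static storage duration. It is created
 * (and zero-initialized) before main starts and lives until the program ends. */
int global_count;

/* Increments the file-scope counter; the new value survives between calls. */
int bump_global(void)
{
    return ++global_count;
}

/* Uses a local counter instead; it is created anew on every call. */
int bump_local(void)
{
    int local_count = 0;   /* automatic: a fresh object each time */
    return ++local_count;  /* destroyed when the function returns */
}
```

Calling bump_local() repeatedly keeps returning 1, while bump_global() keeps counting up, and neither object is ever passed to free.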
I have the following structure that I've declared:
typedef struct binTreeNode
{
void *data;
struct binTreeNode *left;
struct binTreeNode *right;
} myBinaryTreeNode;
In my main function, I'm trying to use a pointer to an instance of that structure called root.
myBinaryTreeNode *root = malloc(sizeof(myBinaryTreeNode));
printf("\nPlease insert root data: ");
int input;
scanf("%d", &input);
root->data = (void*)&input;
printInt(root->data);
free(root);
This code runs perfectly well. But I thought that when you have a struct with members that are pointers, you should free() each one of them (in addition to the pointer to the struct).
So here, I didn't malloc for root->data (because I think that mallocing the struct does that), but it is initialized to the input value and its value gets printed successfully. When I try to free(root->data) my program crashes.
So root->data is not malloced when I malloc root? If not, how can I still use it?
Why does this happen? What am I missing here?
First, get the concept of where and how you need to (or need not) call free().
You don't need to malloc() "for" root->data itself; that pointer variable was already allocated when you allocated memory the size of the structure. Next, you certainly need root->data to point to some valid memory, so that you can dereference the pointer to read from or write to it. You can do that in either of two ways:
store an address which is valid (as in your case, where you supply the address of an already existing variable)
assign a pointer returned by another (successful) call to malloc().
If you're storing a pointer returned by malloc(), then yes, you must free() that memory. In your case, root->data holds a pointer not returned by malloc(), so it does not need free()-ing.
Just to add emphasis on the part related to free()-ing: you must not attempt to free() memory which has not been previously allocated by a call to malloc() and family; otherwise, it invokes undefined behavior. To quote the C11 standard, chapter §7.22.3.3 (emphasis mine):
void free(void *ptr);
The free function causes the space pointed to by ptr to be deallocated, that is, made
available for further allocation. If ptr is a null pointer, no action occurs. Otherwise, if
the argument does not match a pointer earlier returned by a memory management
function, or if the space has been deallocated by a call to free or realloc, the
behavior is undefined.
The data member points to the local variable input, so you cannot free it.
You must free members if you dynamically allocate them, like
typedef struct binTreeNode
{
int *data;
struct binTreeNode *left;
struct binTreeNode *right;
} myBinaryTreeNode;
root->data = malloc(sizeof(int));
scanf("%d", root->data);
free(root->data);
When I try to free(root->data) my program crashes.
You cannot free memory that you have not dynamically allocated. If you want to free the elements, you should first allocate them this way :
typedef struct binTreeNode
{
int *data;
struct binTreeNode *left;
struct binTreeNode *right;
} myBinaryTreeNode;
root->data = malloc(sizeof(int));
if (root->data == NULL)
{
printf("Error allocating memory\n");
return;
}
scanf("%d", root->data);
free(root->data);
Note that you should check the result of malloc before continuing.
So root->data is not malloced when I malloc root? If not, how can I still use it?
Note also that space for the pointer root->data itself is allocated when you allocate memory for the struct, but no memory is allocated for it to point to. If you want root->data to refer to dynamically allocated memory, you need a separate malloc for it, as shown in the example above, so that dereferencing the pointer is valid.
I have added some annotations to your code.
I hope this clears things up.
// I have added an enclosing block for the purpose of demonstration:
{
// Here you allocate memory for a tree node object with dynamic/allocated
// storage duration.
// That is: three pointers, that point to nowhere are created and will
// exist until the memory holding them is free'd.
myBinaryTreeNode *root = malloc(sizeof(myBinaryTreeNode));
printf("\nPlease insert root data: ");
// Here you create an object with automatic storage duration.
// That is, the object is automatically destroyed when the function
// returns or the block is closed: { ... }
int input;
// here you attempt to assign a value to the variable
// (you should check the return value: the input may fail)
scanf("%d", &input);
// Here you assign the address of an object to a pointer.
// Note that you are only allowed to use the pointer as long as the
// object it points to "lives".
root->data = (void*)&input;
// Here you probably print the value, or the pointer address, or something else
printInt(root->data);
free(n); // <-- Where and how did you allocate this?
// Here you free the memory holding the tree node
free(root);
} // <-- At this point the "input" variable is automatically destroyed.
You should really read up on the different storage durations (automatic and dynamic/allocated) and the purpose of dynamic memory allocation:
http://en.cppreference.com/w/c/language/storage_duration
http://en.cppreference.com/w/c/language/lifetime
C Memory Management
To address the questions in detail:
When I try to free(root->data) my program crashes.
That is because you can only free() memory that was obtained from malloc() (or calloc() or realloc()). Memory management is different for the different types of storage duration.
So root->data is not malloced when I malloc root?
Memory for the pointer is allocated, but it doesn't "own" or "point to" any object by default.
If not, how can I still use it?
Just as you did. You assign a memory address of an object to it, that lives long enough for you to use it. You can produce such object using dynamic memory management (then take care that you destroy it manually once you're done: don't leak memory) or using "automatic memory management" as you have done (then take care that it isn't destroyed too early: don't access invalid memory). (To name just two of the possible ways to create an object...)
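As an illustration of the "dynamic" option, here is a sketch in which the node owns a heap copy of the integer, so it no longer depends on the lifetime of input. It assumes the myBinaryTreeNode type from the question; node_with_int and node_free are hypothetical helper names.

```c
#include <stdlib.h>

typedef struct binTreeNode {
    void *data;
    struct binTreeNode *left;
    struct binTreeNode *right;
} myBinaryTreeNode;

/* Hypothetical helper: creates a node that owns a heap copy of value,
 * so the node does not depend on the lifetime of any local variable.
 * Returns NULL if either allocation fails. */
myBinaryTreeNode *node_with_int(int value)
{
    myBinaryTreeNode *node = malloc(sizeof *node);
    int *copy = malloc(sizeof *copy);
    if (node == NULL || copy == NULL) {
        free(node);   /* free(NULL) is a no-op */
        free(copy);
        return NULL;
    }
    *copy = value;
    node->data = copy;
    node->left = NULL;
    node->right = NULL;
    return node;
}

/* Frees the owned data first, then the node itself. */
void node_free(myBinaryTreeNode *node)
{
    if (node == NULL)
        return;
    free(node->data);   /* valid: data came from malloc in node_with_int */
    free(node);
}
```

Here free(node->data) is correct precisely because the pointer was returned by malloc; with root->data = &input it would not be.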
I have one structure that contains a pointer to another structure, Node. This pointer is the front pointer of a linked list. As I build the list, each time I insert a node I create a separate structure and link it to the other list nodes.
Question: do I need to allocate memory for each node in the list? That is, using malloc such as
*pointer_to_struct = (structAlias *)malloc(sizeof(structAlias));
and then to initialize its members.
Or do I simply create a structure and assign values to its members (the members are void * and structAlias *next) without allocating memory for each struct?
In the general case, yes, you must allocate memory for each node in the list, and you will probably want to use either malloc or calloc. If you just declare a struct local variable, that local variable will be invalidated when the function returns, but you probably want the node to outlive the function call.
Why would I use malloc when the same job can be done without malloc, as below?
#include <stdio.h>
#include <stdlib.h>
struct node {
int data;
struct node *l;
struct node *r;
};
int main(){
//Case 1
struct node n1;
n1.data = 99;
printf("n1 data is %d\n", n1.data);
//Case 2
struct node *n2 = (struct node *) malloc (sizeof(struct node));
n2 -> data = 4444;
printf("n2 data is:%d\n",n2 -> data);
free(n2);
return (0);
}
I am having a hard time understanding how n1, which is not initialized to a memory location, is able to store data (99).
When should I use case 1, and when case 2?
Why would I use malloc when the same job can be done without malloc, as below?
You use malloc to allocate memory on the heap; without malloc, you are placing your struct in stack memory.
I am having a hard time understanding how n1, which is not initialized to a memory location, is able to store data (99).
Initialized or not, the definition struct node n1; already reserves memory for n1, so when you assign n1.data = 99; the value is stored there.
When to use case 1 and when to use case 2
Case 1 is used when you know that you will be using the structure object within a confined scope, and will not be making references to the structure data beyond its scope.
Case 2 is used when you will be using your structure in multiple places, and you are willing to manage memory for it manually (and carefully!). The advantage of this method is that you create and initialize the structure in one part of the program, then create a pointer and pass the pointer around, since passing a pointer (typically 4 or 8 bytes) is usually cheaper than passing the structure itself.
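The pointer-passing point can be sketched as follows, using the struct node from the question. set_by_value and set_by_pointer are made-up names. Note that passing a pointer works equally well for a local struct such as n1; malloc only matters when the struct must outlive the scope that created it.

```c
struct node {
    int data;
    struct node *l;
    struct node *r;
};

/* Receives a copy of the whole struct; the change is lost on return. */
void set_by_value(struct node n, int v)
{
    n.data = v;
}

/* Receives only an address; the change is visible to the caller. */
void set_by_pointer(struct node *n, int v)
{
    n->data = v;
}
```

After set_by_value(n1, 5) the caller's n1.data is unchanged, whereas set_by_pointer(&n1, 5) updates it in place without copying the structure.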
int main() {
struct node n1;
n1.data = 99;
This reserves space on the stack (in main's frame) equivalent to the size of a struct node. This is known as a local, and it only exists within the context of main.
struct node *n2 = (struct node *) malloc (sizeof(struct node));
This is an allocation on the heap. This memory exists no matter what function context you are in. This is typically called "dynamic allocation".
These node structures are the basis for a linked list, which can have nodes added, removed, re-ordered, etc. at will.
See also:
What and where are the stack and heap?
In the first case, memory is allocated on the stack. When the variable n1 runs out of scope, the memory is released.
In the second case, memory is allocated on the heap. You have to explicitly release memory resources (as you are doing with free).
A rule of thumb is to use stack-allocated memory for local, temporary data structures of limited size (the stack is only a small portion of the computer's memory, and its size differs per platform). Use the heap for data structures that are large or that you want to persist.
Googling for stack and heap will give you much more information.
Your data type looks like a node in a tree. Two primary reasons to use malloc for tree node allocations would be
To allocate an arbitrary number of nodes. The number of tree nodes will in the general case be a run-time value. For this reason, it is impossible to declare the proper number of local variables for such nodes, since all local variables have to be declared at compile time. Meanwhile, malloc can be called at run time as many times as you want, allocating as many node objects as you need.
To make sure that the node is not destroyed automatically when the lifetime of the local object ends (i.e. at the end of the block). Objects allocated by malloc live forever, i.e. until you destroy them explicitly by calling free. Such objects transcend block boundaries and function boundaries. Nothing like that is possible with local objects, since local objects are automatically destroyed at the end of their block.
Your code sample does not depend on any of the benefits of dynamic allocation, since it does not really create a real tree. It just declares a single node. But if you attempt to build a full tree with a run-time number of nodes, you will immediately realize that it is impossible to do by declaring nodes as local objects. You will unavoidably have to allocate your nodes using malloc.
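For illustration, here is a sketch of a tree whose node count is only known at run time, using the struct node from the question. Treating it as a binary search tree, and the names tree_insert, tree_count and tree_free, are assumptions made for the example.

```c
#include <stddef.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *l;
    struct node *r;
};

/* Inserts value into a binary search tree rooted at *root.
 * Duplicates go to the right. Returns 0 on success, -1 if malloc fails. */
int tree_insert(struct node **root, int value)
{
    /* Walk down until we reach the empty slot where the value belongs. */
    while (*root != NULL)
        root = (value < (*root)->data) ? &(*root)->l : &(*root)->r;

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = value;
    n->l = NULL;
    n->r = NULL;
    *root = n;
    return 0;
}

/* Counts the nodes in the tree. */
size_t tree_count(const struct node *root)
{
    if (root == NULL)
        return 0;
    return 1 + tree_count(root->l) + tree_count(root->r);
}

/* Frees every node, children before parent. */
void tree_free(struct node *root)
{
    if (root == NULL)
        return;
    tree_free(root->l);
    tree_free(root->r);
    free(root);
}
```

Each call to tree_insert creates one more node that stays alive after the call returns, which is exactly what local variables cannot provide.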
Your "how n1 which is not initialized to memory location is able to store data" question must be caused by some confusion. struct node n1; is an object definition, which means that it assigns a memory location for n1. That's exactly the purpose of object definition.
Normally you use malloc only if you don't know how much memory you will need before the application is running.
Your code is correct, but case 1 does not allocate memory dynamically. What if you want to save a measured value in node.data and you don't know how many measurements you will capture? Then you have to malloc some new memory for each measurement you take.
If you define all the nodes before run time (directly in the code), you can't save more measurements than the number of nodes you defined.
Search for linked lists in C and you will find some good examples.