I'm getting started with dynamic lists and i don't understand why it is necessary to use the malloc function even when declaring the first node in the main() program, the piece of code below should just print the data contained in the first node but if i don't initialize the node with the malloc function it just doesn't work:
struct node{
int data;
struct node* next;
};
void insert(int val, struct node*);
int main() {
struct node* head ;
head->data = 2;
printf("%d \n", head->data);
}
You don’t technically, but maintaining all nodes with the same memory pattern is only an advantage to you, with no real disadvantages.
Just assume that all nodes are stored in the dynamic memory.
Your “insert” procedure would be better named something like “add” or (for full functional context) “cons”, and it should return the new node:
struct node* cons(int val, struct node* next)
{
struct node* this = (struct node*)malloc( sizeof struct node );
if (!this) return next; // or some other error condition!
this->data = val;
this->next = next;
return this;
}
Building lists is now very easy:
int main()
{
struct node* xs = cons( 2, cons( 3, cons( 5, cons( 7, NULL ) ) ) );
// You now have a list of the first four prime numbers.
And it is easy to handle them.
// Let’s print them!
{
struct node* p = xs;
while (p)
{
printf( "%d ", p->data );
p = p->next;
}
printf( "\n" );
}
// Let’s get the length!
int length = 0;
{
struct node* p = xs;
while (p)
{
length += 1;
p = p->next;
}
}
printf( "xs is %d elements long.\n", length );
By the way, you should try to be as consistent as possible when naming things. You have named the node data “data” but the constructor’s argument calls it “val”. You should pick one and stick to it.
Also, it is common to:
typedef struct node node;
Now in every place except inside the definition of struct node you can just use the word node.
Oh, and I almost forgot: Don’t forget to clean up with a proper destructor.
node* destroy( node* root )
{
if (!root) return NULL;
destroy( root->next );
free( root );
return NULL;
}
And an addendum to main():
int main()
{
node* xs = ...
...
xs = destroy( xs );
}
When you declare a variable, you define the type of the variable, then it's
name and optionally you declare it's initial value.
Every type needs an specific amount of memory. For example int would be
32 bit long on a 32bit OS, 8 bit long on a 64.
A variable declared in a function is usually stored in the stack associated
with the function. When the function returns, the stack for that function is
no longer available and the variable does not longer exist.
When you need the value/object of the variable to exist even after a function
returns, then you need to allocate memory on a different part of the program,
usually the heap. That's exactly what malloc, realloc and calloc do.
Doing
struct node* head ;
head->data = 2;
is just wrong. You've declaring a pointer named head of type struct node,
but you are not assigning anything to it. So it points to an unspecified
location in memory. head->data = 2 tries to store a value at an unspecified
location and the program will most likely crash with a segfault.
In main you could do this:
int main(void)
{
struct node head;
head.data = 2;
printf("%d \n", head.data);
return 0;
}
head will be saved in the stack and will persist as long as main doesn't
return. But this is only a very small example. In a complex program where you
have many more variables, objects, etc. it's a bad idea to simply declare all
variables you need in main. So it's best that objects get created when they
are needed.
For example you could have a function that creates the object and another one
that calls create_node and uses that object.
struct node *create_node(int data)
{
struct node *head = malloc(sizeof *head);
if(head == NULL)
return NULL; // no more memory left
head->data = data;
head->next = NULL;
return head;
}
struct node *foo(void)
{
struct node *head = create_node(112);
// do somethig with head
return head;
}
Here create_node uses malloc to allocate memory for one struct node
object, initializes the object with some values and returns a pointer to that memory location.
foo calls create_node and does something with it and it returns the
object. If another function calls foo, this function will get the object.
There are also other reasons for malloc. Consider this code:
void foo(void)
{
int numbers[4] = { 1, 3, 5, 7 };
...
}
In this case you know that you will need 4 integers. But sometimes you need an
array where the number of elements is only known during runtime, for example
because it depends on some user input. For this you can also use malloc.
void foo(int size)
{
int *numbers = malloc(size * sizeof *numbers);
// now you have "size" elements
...
free(numbers); // freeing memory
}
When you use malloc, realloc, calloc, you'll need to free the memory. If
your program does not need the memory anymore, you have to use free (like in
the last example. Note that for simplicity I omitted the use of free in the
examples with struct head.
What you have invokes undefined behavior because you don't really have a node,, you have a pointer to a node that doesn't actually point to a node. Using malloc and friends creates a memory region where an actual node object can reside, and where a node pointer can point to.
In your code, struct node* head is a pointer that points to nowhere, and dereferencing it as you have done is undefined behavior (which can commonly cause a segfault). You must point head to a valid struct node before you can safely dereference it. One way is like this:
int main() {
struct node* head;
struct node myNode;
head = &myNode; // assigning the address of myNode to head, now head points somewhere
head->data = 2; // this is legal
printf("%d \n", head->data); // will print 2
}
But in the above example, myNode is a local variable, and will go out of scope as soon as the function exists (in this case main). As you say in your question, for linked lists you generally want to malloc the data so it can be used outside of the current scope.
int main() {
struct node* head = malloc(sizeof struct node);
if (head != NULL)
{
// we received a valid memory block, so we can safely dereference
// you should ALWAYS initialize/assign memory when you allocate it.
// malloc does not do this, but calloc does (initializes it to 0) if you want to use that
// you can use malloc and memset together.. in this case there's just
// two fields, so we can initialize via assignment.
head->data = 2;
head->next = NULL;
printf("%d \n", head->data);
// clean up memory when we're done using it
free(head);
}
else
{
// we were unable to obtain memory
fprintf(stderr, "Unable to allocate memory!\n");
}
return 0;
}
This is a very simple example. Normally for a linked list, you'll have insert function(s) (where the mallocing generally takes place and remove function(s) (where the freeing generally takes place. You'll at least have a head pointer that always points to the first item in the list, and for a double-linked list you'll want a tail pointer as well. There can also be print functions, deleteEntireList functions, etc. But one way or another, you must allocate space for an actual object. malloc is a way to do that so the validity of the memory persists throughout runtime of your program.
edit:
Incorrect. This absolutely applies to int and int*,, it applies to any object and pointer(s) to it. If you were to have the following:
int main() {
int* head;
*head = 2; // head uninitialized and unassigned, this is UB
printf("%d\n", *head); // UB again
return 0;
}
this is every bit of undefined behavior as you have in your OP. A pointer must point to something valid before you can dereference it. In the above code, head is uninitialized, it doesn't point to anything deterministically, and as soon as you do *head (whether to read or write), you're invoking undefined behavior. Just as with your struct node, you must do something like following to be correct:
int main() {
int myInt; // creates space for an actual int in automatic storage (most likely the stack)
int* head = &myInt; // now head points to a valid memory location, namely myInt
*head = 2; // now myInt == 2
printf("%d\n", *head); // prints 2
return 0;
}
or you can do
int main() {
int* head = malloc(sizeof int); // silly to malloc a single int, but this is for illustration purposes
if (head != NULL)
{
// space for an int was returned to us from the heap
*head = 2; // now the unnamed int that head points to is 2
printf("%d\n", *head); // prints out 2
// don't forget to clean up
free(head);
}
else
{
// handle error, print error message, etc
}
return 0;
}
These rules are true for any primitive type or data structure you're dealing with. Pointers must point to something, otherwise dereferencing them is undefined behavior, and you hope you get a segfault when that happens so you can track down the errors before your TA grades it or before the customer demo. Murphy's law dictates UB will always crash your code when it's being presented.
Statement struct node* head; defines a pointer to a node object, but not the node object itself. As you do not initialize the pointer (i.e. by letting it point to a node object created by, for example, a malloc-statement), dereferencing this pointer as you do with head->data yields undefined behaviour.
Two ways to overcome this, (1) either allocate memory dynamically - yielding an object with dynamic storage duration, or (2) define the object itself as an, for example, local variable with automatic storage duration:
(1) dynamic storage duration
int main() {
struct node* head = calloc(1, sizeof(struct node));
if (head) {
head->data = 2;
printf("%d \n", head->data);
free(head);
}
}
(2) automatic storage duration
int main() {
struct node head;
head.data = 2;
printf("%d \n", head.data);
}
Related
I am learning how to make a linked list, but its failing to print out anything at all, and I cant figure out why??? please help. I believe it has something to do with my pointers but I don't know what it is.
#include <stdio.h>
#include <stdlib.h>
// typedef is used to give a data type a new name
typedef struct node * link ;// link is now type struct node pointer
/*
typedef allows us to say "link ptr"
instead of "struct node * ptr"
*/
struct node{
int item ;// this is the data
link next ;//same as struct node * next, next is a pointer
};
void printAll(link head); // print a linked list , starting at link head
void addFirst(link ptr, int val ); // add a node with given value to a list
link removeLast(link ptr); // removes and returns the last element in the link
//prints the link
void printAll(link head){
link ptr = head;
printf("\nPrinting Linked List:\n");
while(ptr != NULL){
printf(" %d ", (*ptr).item);
ptr = (*ptr).next;// same as ptr->next
}
printf("\n");
}
//adds to the head of the link
void addFirst(link ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = ptr;
ptr = tmp;
}
// testing
int main(void) {
link head = NULL;// same as struct node * head, head is a pointer type
//populating list
for(int i = 0; i<3; i++){
addFirst(head, i);
}
printAll(head);
return 0;
}
output:
Printing Linked List:
Process returned 0 (0x0) execution time : 0.059 s
Press any key to continue
It's because you're passing a null pointer to your function and the condition for exiting the loop is for that pointer to be null, so nothing happens.
Your addFirst function takes a pointer's value, but it cannot modify the head that you declared inside of main().
To modify head you need to pass a pointer to link, then you can dereference that pointer to access your head and you can then change it.
void addFirst(link *ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr;
*ptr = tmp;
}
Now you can change the head pointer. Just remember to pass the address to it when calling the function. addFirst(&head,i)
In the for loop
for(int i = 0; i<3; i++){
addFirst(head, i);
}
you create a bunch of pointers which all point to NULL. head is never changing since pointer itself is passed "by value". E.g. head is copied and all modifications to the pointer itself in addFirst are not visible outside.
This is the same as with say int. Imagine void foo(int x);. Whatever this function does to x is not visible outside.
However changes to the memory which link ptr points to are visible of course.
E.g. this line does nothing:
tmp->next = ptr;
ptr = tmp; <=== this line
}
You can fix this in several ways. One is to return new node from addFirst and another one is to make link ptr to be a pointer to pointer: link *ptr. Since in this case you want to change pointer value (not pointee value):
//link *ptr here a pointer to pointer
void addFirst(link * ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr; //<<changed
*ptr = tmp; //<<changed
}
Do not forget to update declaration above also. And the call:
void addFirst(link * ptr, int val ); // add a node with given value to a list
...
for(int i = 0; i<3; i++){
addFirst(&head, i);
}
Then this code produces:
Printing Linked List:
2 1 0
Added:
It's important to understand that working with linked list requires working with two different types of data.
First is struct node and you pass around this type of data using links.
Second is head. This is a pointer to the very first node. When you would like to modify the head you find it is not a "node". It is something else. It's a "name" for the first node in the list. This name by itself is a pointer to node. See how memory layout for head is different from the list itself.
head[8 bytes]->node1[16 bytes]->node2[16 bytes]->...->nodek[16 bytes]->NULL;
by the way - the only thing which have lexical name here is head. All the nodes do not have name and accessible through node->next syntax.
You can also imagine another pointer here, link last which will point to nodek. Again this will have different memory layout from nodes itself. And if you would like to modify that in a function you will need to pass to function pointer to that (e.g.pointer to pointer).
Pointer and data it points to are different things. In your mind you need to separate them. Pointer is like int or float. It is passed "by value" to functions. Yes link ptr is already pointer and that permits you to update the data it points to. However the pointer itself is passed by value and updates to pointer (in your case ptr=tmp) are not visible outside.
(*ptr).next=xxx will be visible of course because data is updated (not pointer). That means you need to do one extra step - make changes to your pointer visible outside of function, e.g. convert the pointer itself (head) into data for another pointer, e.g. use struct node **ptr (first star here says this is pointer to a node, and the second star converts that pointer to data for another pointer.
I've come across what seems to be a strange problem when implementing a singly linked list. I call a list_destroyer and pass the pointer to the head of the list, however when the method returns, the pointer that is passed still points to a full list. I don't believe I've passed a struct anywhere.
Here is my struct list, and typedef
typedef struct list list_t
struct list{
void* datum;
list_t* next;
};
And here is the code that is causing problem
void list_destroy(list_t *head){
list_t *destroy = head;
while(head){
//printf("%d \n", list_size(head));
head = head->next;
free(destroy);
destroy = head;
}
//printf("%d \n", list_size(head));
head = NULL;
//printf("%d \n", list_size(head));
}
The list_size functions have been commented out because they aren't necessary, but I use them to see the output of the code. The printf output shows that the size is decreasing. The two printf's surrounding the "head = NULL;" statement both print a size of zero. This is also confirmed with gdb. However, when I have this code (following) calling list_destroy, the pointer that is passed through is unchanged.
int main(){
list_t *test = NULL;
int a = 1;
int b = 2;
list_append(test,&a);
list_append(test,&b);
printf("%d \n", list_size(test));
list_destroy(test);
printf("%d \n", list_size(test));
}
I still get the printf above and below the list_destroy to both output 2. I haven't initialized a new list_t anywhere, so I don't see how the printf after the list_destroy would still output 2, (especially when the printf within the list_destroy says the list_t* passed in has a size of 0 at the end.
however when the method returns, the pointer that is passed still points to a full list.
That's incorrect: when the function returns, the pointer points to what used to be a full list. Chances are, your system would let you traverse the entire list without a break. However, dereferencing this pointer after the call is undefined behavior, so the same code could crash on other systems.
The problem has a name - head becomes a dangling pointer.
Fixing the problem is easy - pass a pointer to pointer, and set it to NULL upon completion:
void list_destroy(list_t **headPtr){
list_t *head = *headPtr;
list_t *destroy = head;
while(head){
head = head->next;
free(destroy);
destroy = head;
}
*headPtr = NULL;
}
I have a question. Why does the output for the 2 pointers different? I did not assign pointees to either but one does not return NULL while one returns NULL.
typedef struct node
{
bool word;
struct node* children[27];
}
node;
int main(void)
{
node header;
node* header_2 = malloc(sizeof(node));
printf("%p %p\n", header.children[1], header_2->children[1]);
}
OUTPUT: 0xbfba76d4 (nil). Shouldn't both be NULL? Thanks a lot!
Consider the following case:
int i;
int *j = malloc(sizeof(int));
printf("%d, %d", i, (*j)) ;
(You cannot guarantee that i=0 and *j=0 because a memory has been allocated to both but their values may be garbage value which is what that memory location had previously occupied)
In order to have a defined value, always initialize the allocation/initialization with 0.
node a; // Everything default-initialized
void foo()
{
static nodeb; // Everything default-initialized
node c; // Nothing initialized
node d = { 0 }; // Everything default-initialized
node *p = malloc(sizeof(*p)); // Nothing initialized
node *q = calloc(1, sizeof(*q)); // Everything zero-initialized
}
Everything default initialized means they are initialized with the default value which is zero.
Nothing initialized means they will persist the value of the location which may be a garbage value or zero.
Ref link: C struct with pointers initialization
this line: node header;
will contain whatever trash happens to be on the stack at the address of the header variable.
this line: node* header_2 = malloc(sizeof(node));
will contain whatever is returned by the call to malloc
(which, if malloc is successful will be a pointer to somewhere in the 'heap' and if malloc fails will be NULL)
My question is an extension of this: Returning pointer to a local structure
I wrote the following code to create an empty list:
struct node* create_empty_list(void)
{
struct node *head = NULL;
return head;
}
I just read that returning pointers to local variables is useless, since the variable will be destroyed when the function exits. I believe the above code is returning a NULL pointer, so I don't think it's a pointer to a local variable.
Where is the memory allocated to the pointer in this case. I didn't allocate any memory on the heap, and it should be on the stack, as an automatic variable. But what happens when the code exits (to the pointer), if I try to use it in the program, by assigning this pointer some pointees / de-referencing and alike?
struct node* create_empty_list(void)
{
struct node *head = NULL;
return head;
}
is equivalent to:
struct node* create_empty_list(void)
{
return NULL;
}
which is perfectly fine.
The problem would happen if you had something like:
struct node head;
return &head; // BAD, returning a pointer to an automatic object
Here, you are returning the value of a local variable, which is OK:
struct node* create_empty_list()
{
struct node* head = NULL;
return head;
}
The value of head, which happens to be NULL (0), is copied into the stack before function create_empty_list returns. The calling function would typically copy this value into some other variable.
For example:
void some_func()
{
struct node* some_var = create_empty_list();
...
}
In each of the examples below, you would be returning the address of a local variable, which is not OK:
struct node* create_empty_list()
{
struct node head = ...;
return &head;
}
struct node** create_empty_list()
{
struct node* head = ...;
return &head;
}
The address of head, which may be a different address every time function create_empty_list is called (depending on the state of the stack at that point), is returned. This address, which is typically a 4-byte value or an 8-byte value (depending on your system's address space), is copied into the stack before the function returns. You may use this value "in any way you like", but you should not rely on the fact that it represents the memory address of a valid variable.
A few basic facts about variables, that are important for you to understand:
Every variable has an address and a value.
The address of a variable is constant (i.e., it cannot change after you declare the variable).
The value of a variable is not constant (unless you explicitly declare it as a const variable).
With the word pointer being used, it is implied that the value of the variable is by itself the address of some other variable. Nonetheless, the pointer still has its own address (which is unrelated to its value).
Please note that the description above does not apply for arrays.
As others have mentioned, you are returning value, what is perfectly fine.
However, if you had changed functions body to:
struct node head;
return &head;
you would return address (pointer to) local variable and that could be potentially dangerous as it is allocated on the stack and freed immediately after leaving function body.
If you changed your code to:
struct node * head = (struct node *) malloc( sizeof( struct node ) );;
return head;
Then you are returning value of local value, that is pointer to heap-allocated memory which will remain valid until you call free on it.
Answering
Where is the memory allocated to the pointer in this case. I didn't
allocate any memory on the heap, and it should be on the stack, as an
automatic variable. But what happens when the code exits (to the
pointer), if I try to use it in the program, by assigning this pointer
some pointees / de-referencing and alike?
There is no memory allocated to the pointer in your case. There is memory allocated to contain the pointer, which is on the stack, but since it is pointing to NULL it doesn't point to any usable memory. Also, you shouldn't worry about that your pointer is on the stack, because returning it would create a copy of the pointer.
(As others mentioned) memory is allocated on the stack implicitly when you declare objects in a function body. As you probably know (judging by your question), memory is allocated on the heap by explicitly requesting so (using malloc in C).
If you try to dereference your pointer you are going to get a segmentation fault. You can assign to it, as this would just overwrite the NULL value. To make sure you don't get a segmentation fault, you need to check that the list that you are using is not the NULL pointer. For example here is an append function:
struct node
{
int elem;
struct node* next;
};
struct node* append(struct node* list, int el) {
// save the head of the list, as we would be modifying the "list" var
struct node* res = list;
// create a single element (could be a separate function)
struct node* nn = (struct node*)malloc(sizeof(struct node));
nn->elem = el;
nn->next = NULL;
// if the given list is not empty
if (NULL != list) {
// find the end of the list
while (NULL != list->next) list = list->next;
// append the new element
list->next = nn;
} else {
// if the given list is empty, just return the new element
res = nn;
}
return res;
}
The crucial part is the if (NULL != list) check. Without it, you would try to dereference list, and thus get a segmentation fault.
This may be a stupid question, and I see similar questions been asked, but I dont get the answers given. why does the following code produce:
error: incompatible types when assigning to type ‘node_t’ from type ‘struct node_t *’
node_t list_array[10];
typedef struct node
{
int value;
struct node *next;
struct node *prev;
} node_t;
node_t* create_node(void)
{
node_t *np;
np->next = NULL;
np->prev = NULL;
np->value = rand() % 10;
return np;
}
int main(void)
{
int i;
for(i = 0; i < 10; i++)
{
list_array[i] = create_node();
}
return 0;
}
Make the array into an array of pointers to fix the error, since create_node returns a pointer:
node_t *list_array[10];
Note you're not allocating any memory in create_node so using np is illegal. Try:
node_t *np = malloc(sizeof *np);
I want to make an array of node_t structs
In that case you could leave the node_t list_array[10] and:
Pass &list_array[i] as an argument to the function
Have the function return a node_t instead of a node_t *
Because one is a structure and the other is a pointer to the structure.
The create_node() function returns a pointer to a node (which you really should malloc() in that function, by the way) and you try to assign it to an actual structure in the array.
You can solve it by simply changing your declaration to:
node_t *list_array[10];
so that it's an array of pointers rather than an array of structures.
Because create_node() returns a pointer, but list_array[i] is an actual instance. You can't assign a pointer over an instance, they're completely different.
The solution is typically to represent each node as a pointer, which requires list_array to be an array of pointers:
node_t *list_array[10];
Then the assigment makes sense, and the code will compile.
Note, however, that the code will not "work", since it's dereferencing a NULL pointer inside create_node(). It seems you forgot to call malloc():
node_t* create_node(void)
{
node_t *np;
if((np = malloc(sizeof *np)) != NULL)
{
np->next = NULL;
np->prev = NULL;
np->value = rand() % 10;
}
return np;
}
This is the classic "pointer vs. instance" confusion. Even more serious than your warning is:
node_t *np;
np->next = NULL;
which will compile, and then segfault.
The confusion arises because of a misunderstanding of what a pointer is. When compiled, a pointer is just a single number, like 140734799803888. This number is used just to locate the physical chunk of data. It's a memory address.
Pointers vs. instances are confusing, and one of the first conceptual challenges you encounter in programming. So here's an analogy:
If you've ever used a GPS, it will tell you where you are (pointer) but not what you are (data). Pointers work the same way. If someone wants to shake your hand, they wouldn't shake GPS coordinates (pointer)! They'd use the GPS coordinates to locate you, then physically visit you (data) and shake your hand. That's how pointers work.
So in your code above, you declare a pointer np, but don't give it any location to keep track of. Then, you ask "use the number in np to locate my data" (but you haven't set a number for np!) In particular, np->next asks to use the location np + someOffset (which is undefined!) to find your physical data (which is nowhere), and change it.
That's why you get a seg fault.
node_t list_array[10] should be node_t *list_array[10]
You have also, not malloced your node_t *np
node_t *np = malloc(sizeof(node_t));
For this program, I see no point to using dynamic storage duration (malloc). I would dispose of create_node in favour of memcpy, if you wish to keep all of your objects in static storage duration. For example,
#include <string.h>
typedef struct node
{
int value;
struct node *next;
struct node *prev;
} node_t;
int main(void) {
node_t list_array[10];
for (int i = 0; i < sizeof (list_array) / sizeof (*list_array); i++) {
memcpy(list_array + i,
&(node_t){ .value = rand() % 10,
.next = NULL,
.prev = NULL },
sizeof (*list_array));
}
return 0;
}