I've come across what seems to be a strange problem when implementing a singly linked list. I call a list_destroyer and pass the pointer to the head of the list, however when the method returns, the pointer that is passed still points to a full list. I don't believe I've passed a struct anywhere.
Here is my struct list, and typedef
typedef struct list list_t
struct list{
void* datum;
list_t* next;
};
And here is the code that is causing problem
void list_destroy(list_t *head){
list_t *destroy = head;
while(head){
//printf("%d \n", list_size(head));
head = head->next;
free(destroy);
destroy = head;
}
//printf("%d \n", list_size(head));
head = NULL;
//printf("%d \n", list_size(head));
}
The list_size functions have been commented out because they aren't necessary, but I use them to see the output of the code. The printf output shows that the size is decreasing. The two printf's surrounding the "head = NULL;" statement both print a size of zero. This is also confirmed with gdb. However, when I have this code (following) calling list_destroy, the pointer that is passed through is unchanged.
int main(){
list_t *test = NULL;
int a = 1;
int b = 2;
list_append(test,&a);
list_append(test,&b);
printf("%d \n", list_size(test));
list_destroy(test);
printf("%d \n", list_size(test));
}
I still get the printf above and below the list_destroy to both output 2. I haven't initialized a new list_t anywhere, so I don't see how the printf after the list_destroy would still output 2, (especially when the printf within the list_destroy says the list_t* passed in has a size of 0 at the end.
however when the method returns, the pointer that is passed still points to a full list.
That's incorrect: when the function returns, the pointer points to what used to be a full list. Chances are, your system would let you traverse the entire list without a break. However, dereferencing this pointer after the call is undefined behavior, so the same code could crash on other systems.
The problem has a name - head becomes a dangling pointer.
Fixing the problem is easy - pass a pointer to pointer, and set it to NULL upon completion:
void list_destroy(list_t **headPtr){
list_t *head = *headPtr;
list_t *destroy = head;
while(head){
head = head->next;
free(destroy);
destroy = head;
}
*headPtr = NULL;
}
Related
I am trying to understand how the code below for creating a singly linked list works using a double pointer.
#include <stdio.h>
#include <stdlib.h>
struct Node {
int data;
struct Node* next;
};
void push(struct Node** headRef, int data) {
struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
newNode->data = data;
newNode->next = *headRef;
*headRef = newNode;
}
//Function to implement linked list from a given set of keys using local references
struct Node* constructList(int keys[], int n) {
struct Node *head = NULL;
struct Node **lastPtrRef = &head;
int i, j;
for(i = 0; i < n; i++) {
push(lastPtrRef, keys[i]);
lastPtrRef = &((*lastPtrRef)->next); //this line
if((*lastPtrRef) == NULL) {
printf("YES\n");
}
}
return head;
}
int main() {
int keys[] = {1, 2, 3, 4};
int n = sizeof(keys)/sizeof(keys[0]);
//points to the head node of the linked list
struct Node* head = NULL;
head = constructList(keys, n); //construct the linked list
struct Node *temp = head;
while(temp != NULL) { //print the linked list
printf(" %d -> ", temp->data);
temp = temp->next;
}
}
I understand the purpose of using the double pointer in the function push(), it allows you to change what the pointer headRef is pointing to inside the function. However in the function constructList(), I don't understand how the following line works:
lastPtrRef = &((*lastPtrRef)->next);
Initially lastPtrRef would be pointing to head which points to NULL. In the first call to push(), within the for loop in constructList(), the value that head points to is changed (it points to the new node containing the value 1). So after the first call to push(), lastPtrRef will be pointing to head which points to a node with the value of 1. However, afterwards the following line is executed:
lastPtrRef = &((*lastPtrRef)->next);
Whereby lastPtrRef is given the address of whatever is pointed to by the next member of the newly added node. In this case, head->next is NULL.
I am not really sure what the purpose of changing lastPtrRef after the call to push(). If you want to build a linked list, don't you want lastPtrRef to have the address of the pointer which points to the node containing 1, since you want to push the next node (which will containing 2) onto the head of the list (which is 1)?
In the second call to push() in the for loop in constructList, we're passing in lastPtrRef which points to head->next (NULL) and the value 2. In push() the new node is created, containing the value 2, and newNode->next points to head->next which is NULL. headRef in push gets changed so that it points to newNode (which contains 2).
Maybe I'm understanding the code wrong, but it seems that by changing what lastPtrRef points to, the node containing 1 is getting disregarded. I don't see how the linked list is created if we change the address lastPtrRef holds.
I would really appreciate any insights as to how this code works. Thank you.
This uses a technique called forward-chaining, and I believe you already understand that (using a pointer-to-pointer to forward-chain a linked list construction).
This implementation is made confusing by the simple fact that the push function seems like it would be designed to stuff items on the head of a list, but in this example, it's stuffing them on the tail. So how does it do it?
The part that is important to understand is this seemingly trivial little statement in push:
newNode->next = *headRef
That may not seem important, but I assure you it is. The function push, in this case, does grave injustice to what this function really does. In reality it is more of a generic insert. Some fact about that function
It accepts a pointer-to-pointer headRef as an argument, as well as some data to put in to the linked list being managed.
After allocating a new node and saving the data within, it sets the new node's next pointer to whatever value is currently stored in the dereferenced headRef pointer-to-pointer (so.. a pointer) That's what the line I mentioned above accomplishes.
It then stores the new node's address at the same place it just pulled the prior address from; i.e. *headRef
Interestingly, it has no return value (it is void) further making this somewhat confusing. Turns out it doesn't need one.
Upon returning to the caller, at first nothing may seem to have changed. lastPtrRef still points to some pointer (in fact the same pointer as before; it must, since it was passed by value to the function). But now that pointer points to the new node just allocated. Further, that new node's next pointer points to whatever was in *lastPtrRef before the function call (i.e. whatever value was in the pointer pointed to by lastPtrRef before the function call).
That's important. That is what that line of code enforces, That means if you invoke this with lastPtrRef addressing a pointer pointing to NULL (such as head on initial loop entry), that pointer will receive the new node, and the new node's next pointer will be NULL. If you then change the address in lastPtrRef to point to the next pointer of the last-inserted node (which points to NULL; we just covered that), and repeat the process, it will hang another node there, setting that node's next pointer to NULL, etc. With each iteration, lastPtrRef addresses the last-node's next pointer, which is always NULL.
That's how push is being used to construct a forward linked list. One final thought. What would you get for a linked list if you had this:
#include <stdio.h>
#include <stdlib.h>
struct Node
{
int data;
struct Node* next;
};
void push(struct Node** headRef, int data)
{
struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
newNode->data = data;
newNode->next = *headRef;
*headRef = newNode;
}
int main()
{
//points to the head node of the linked list
struct Node* head = NULL;
push(&head, 1);
push(&head->next, 2);
push(&head->next, 3);
for (struct Node const *p = head; p; p = p->next)
printf("%p ==> %d\n", p, p->data);
}
This seemingly innocent example amplifies why I said push is more of a generic insert than anything else. This just populates the initial head node.
push(&head, 1);
Then this appends to that node by using the address of the new node's next pointer as the first argument, similar to what your constructList is doing, but without the lastPtrRef variable (we don't need it here):
push(&head->next, 2);
But then this:
push(&head->next, 3);
Hmmm. Same pointer address as the prior call, so what will it do? Hint: remember what that newNode->next = *headRef line does (I droned on about it forever; I hope something stuck).
The output of the program above is this (obviously the actual address values will be different, dependent to your instance and implementation):
0x100705950 ==> 1
0x10073da90 ==> 3
0x100740b90 ==> 2
Hope that helps.
I am implementing a queue using a generic linked list in C where each node in list is a simple struct which contains a void pointer and a node pointer to the next node. In the dequeue operation I want to remove the head (which is working fine), but I also want to return that node after it is removed.
EDITED TO CLARIFY
What i did in the dequeue function is (example):
//This is my queue struct
typedef struct Queue{
LinkedList* list;
size_t item_size;
} Queue;
//this is the dequeue function
Node* dequeue(Queue* queue){
Node* head = queue->list->head;
Node* returnedValue = (Node*)malloc(sizeof(Node));
memcpy(returnedValue, head, sizeof(Node));
removeBegin(queue->list);
return returnedValue;
}
//this is the remove head function
void removeBegin(LinkedList* list){
Node* tempHead = list->head;
list->head = list->head->next;
tempHead->next = NULL;
free(tempHead->value);
tempHead->value = NULL;
free(tempHead);
tempHead = NULL;
}
the problem is everything before the free function is ok. Everything is being copied correctly. But immediately after the free function call the value that is copied to the newly allocated node becomes garbage (or 0).
The way I call the function is simply initialize the queue using this function:
Queue* init_queue(size_t size){
Queue* queue = (Queue*)malloc(sizeof(Queue));
// int x = 10;
queue->list = createList(NULL, size);
return queue;
}
then call dequeue and pass it the pointer of the queue.
How can I solve this?
thanks a lot.
The memory allocating using
Node* n1 = (Node*)malloc(sizeof(Node));
is uninitialised. This means that accessing the value of n1->value gives undefined behaviour. A consequence is that
memcpy(n1->value, n->value, sizeof(n->value));
also gives undefined behaviour.
When behaviour is undefined, the consequences of executing any further code could be anything. Your observation
newly allocated node becomes garbage (or 0)
is one possible outcome.
There are more problems as well. However, you haven't provided enough information (e.g. how is the function called? how is the pointer passed as n initialised? how is n->value initialised?) so it is not possible to give advice on how to FIX your function.
I am learning how to make a linked list, but its failing to print out anything at all, and I cant figure out why??? please help. I believe it has something to do with my pointers but I don't know what it is.
#include <stdio.h>
#include <stdlib.h>
// typedef is used to give a data type a new name
typedef struct node * link ;// link is now type struct node pointer
/*
typedef allows us to say "link ptr"
instead of "struct node * ptr"
*/
struct node{
int item ;// this is the data
link next ;//same as struct node * next, next is a pointer
};
void printAll(link head); // print a linked list , starting at link head
void addFirst(link ptr, int val ); // add a node with given value to a list
link removeLast(link ptr); // removes and returns the last element in the link
//prints the link
void printAll(link head){
link ptr = head;
printf("\nPrinting Linked List:\n");
while(ptr != NULL){
printf(" %d ", (*ptr).item);
ptr = (*ptr).next;// same as ptr->next
}
printf("\n");
}
//adds to the head of the link
void addFirst(link ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = ptr;
ptr = tmp;
}
// testing
int main(void) {
link head = NULL;// same as struct node * head, head is a pointer type
//populating list
for(int i = 0; i<3; i++){
addFirst(head, i);
}
printAll(head);
return 0;
}
output:
Printing Linked List:
Process returned 0 (0x0) execution time : 0.059 s
Press any key to continue
It's because you're passing a null pointer to your function and the condition for exiting the loop is for that pointer to be null, so nothing happens.
Your addFirst function takes a pointer's value, but it cannot modify the head that you declared inside of main().
To modify head you need to pass a pointer to link, then you can dereference that pointer to access your head and you can then change it.
void addFirst(link *ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr;
*ptr = tmp;
}
Now you can change the head pointer. Just remember to pass the address to it when calling the function. addFirst(&head,i)
In the for loop
for(int i = 0; i<3; i++){
addFirst(head, i);
}
you create a bunch of pointers which all point to NULL. head is never changing since pointer itself is passed "by value". E.g. head is copied and all modifications to the pointer itself in addFirst are not visible outside.
This is the same as with say int. Imagine void foo(int x);. Whatever this function does to x is not visible outside.
However changes to the memory which link ptr points to are visible of course.
E.g. this line does nothing:
tmp->next = ptr;
ptr = tmp; <=== this line
}
You can fix this in several ways. One is to return new node from addFirst and another one is to make link ptr to be a pointer to pointer: link *ptr. Since in this case you want to change pointer value (not pointee value):
//link *ptr here a pointer to pointer
void addFirst(link * ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr; //<<changed
*ptr = tmp; //<<changed
}
Do not forget to update declaration above also. And the call:
void addFirst(link * ptr, int val ); // add a node with given value to a list
...
for(int i = 0; i<3; i++){
addFirst(&head, i);
}
Then this code produces:
Printing Linked List:
2 1 0
Added:
It's important to understand that working with linked list requires working with two different types of data.
First is struct node and you pass around this type of data using links.
Second is head. This is a pointer to the very first node. When you would like to modify the head you find it is not a "node". It is something else. It's a "name" for the first node in the list. This name by itself is a pointer to node. See how memory layout for head is different from the list itself.
head[8 bytes]->node1[16 bytes]->node2[16 bytes]->...->nodek[16 bytes]->NULL;
by the way - the only thing which have lexical name here is head. All the nodes do not have name and accessible through node->next syntax.
You can also imagine another pointer here, link last which will point to nodek. Again this will have different memory layout from nodes itself. And if you would like to modify that in a function you will need to pass to function pointer to that (e.g.pointer to pointer).
Pointer and data it points to are different things. In your mind you need to separate them. Pointer is like int or float. It is passed "by value" to functions. Yes link ptr is already pointer and that permits you to update the data it points to. However the pointer itself is passed by value and updates to pointer (in your case ptr=tmp) are not visible outside.
(*ptr).next=xxx will be visible of course because data is updated (not pointer). That means you need to do one extra step - make changes to your pointer visible outside of function, e.g. convert the pointer itself (head) into data for another pointer, e.g. use struct node **ptr (first star here says this is pointer to a node, and the second star converts that pointer to data for another pointer.
I'm getting started with dynamic lists and i don't understand why it is necessary to use the malloc function even when declaring the first node in the main() program, the piece of code below should just print the data contained in the first node but if i don't initialize the node with the malloc function it just doesn't work:
struct node{
int data;
struct node* next;
};
void insert(int val, struct node*);
int main() {
struct node* head ;
head->data = 2;
printf("%d \n", head->data);
}
You don’t technically, but maintaining all nodes with the same memory pattern is only an advantage to you, with no real disadvantages.
Just assume that all nodes are stored in the dynamic memory.
Your “insert” procedure would be better named something like “add” or (for full functional context) “cons”, and it should return the new node:
struct node* cons(int val, struct node* next)
{
struct node* this = (struct node*)malloc( sizeof struct node );
if (!this) return next; // or some other error condition!
this->data = val;
this->next = next;
return this;
}
Building lists is now very easy:
int main()
{
struct node* xs = cons( 2, cons( 3, cons( 5, cons( 7, NULL ) ) ) );
// You now have a list of the first four prime numbers.
And it is easy to handle them.
// Let’s print them!
{
struct node* p = xs;
while (p)
{
printf( "%d ", p->data );
p = p->next;
}
printf( "\n" );
}
// Let’s get the length!
int length = 0;
{
struct node* p = xs;
while (p)
{
length += 1;
p = p->next;
}
}
printf( "xs is %d elements long.\n", length );
By the way, you should try to be as consistent as possible when naming things. You have named the node data “data” but the constructor’s argument calls it “val”. You should pick one and stick to it.
Also, it is common to:
typedef struct node node;
Now in every place except inside the definition of struct node you can just use the word node.
Oh, and I almost forgot: Don’t forget to clean up with a proper destructor.
node* destroy( node* root )
{
if (!root) return NULL;
destroy( root->next );
free( root );
return NULL;
}
And an addendum to main():
int main()
{
node* xs = ...
...
xs = destroy( xs );
}
When you declare a variable, you define the type of the variable, then it's
name and optionally you declare it's initial value.
Every type needs an specific amount of memory. For example int would be
32 bit long on a 32bit OS, 8 bit long on a 64.
A variable declared in a function is usually stored in the stack associated
with the function. When the function returns, the stack for that function is
no longer available and the variable does not longer exist.
When you need the value/object of the variable to exist even after a function
returns, then you need to allocate memory on a different part of the program,
usually the heap. That's exactly what malloc, realloc and calloc do.
Doing
struct node* head ;
head->data = 2;
is just wrong. You've declaring a pointer named head of type struct node,
but you are not assigning anything to it. So it points to an unspecified
location in memory. head->data = 2 tries to store a value at an unspecified
location and the program will most likely crash with a segfault.
In main you could do this:
int main(void)
{
struct node head;
head.data = 2;
printf("%d \n", head.data);
return 0;
}
head will be saved in the stack and will persist as long as main doesn't
return. But this is only a very small example. In a complex program where you
have many more variables, objects, etc. it's a bad idea to simply declare all
variables you need in main. So it's best that objects get created when they
are needed.
For example you could have a function that creates the object and another one
that calls create_node and uses that object.
struct node *create_node(int data)
{
struct node *head = malloc(sizeof *head);
if(head == NULL)
return NULL; // no more memory left
head->data = data;
head->next = NULL;
return head;
}
struct node *foo(void)
{
struct node *head = create_node(112);
// do somethig with head
return head;
}
Here create_node uses malloc to allocate memory for one struct node
object, initializes the object with some values and returns a pointer to that memory location.
foo calls create_node and does something with it and it returns the
object. If another function calls foo, this function will get the object.
There are also other reasons for malloc. Consider this code:
void foo(void)
{
int numbers[4] = { 1, 3, 5, 7 };
...
}
In this case you know that you will need 4 integers. But sometimes you need an
array where the number of elements is only known during runtime, for example
because it depends on some user input. For this you can also use malloc.
void foo(int size)
{
int *numbers = malloc(size * sizeof *numbers);
// now you have "size" elements
...
free(numbers); // freeing memory
}
When you use malloc, realloc, calloc, you'll need to free the memory. If
your program does not need the memory anymore, you have to use free (like in
the last example. Note that for simplicity I omitted the use of free in the
examples with struct head.
What you have invokes undefined behavior because you don't really have a node,, you have a pointer to a node that doesn't actually point to a node. Using malloc and friends creates a memory region where an actual node object can reside, and where a node pointer can point to.
In your code, struct node* head is a pointer that points to nowhere, and dereferencing it as you have done is undefined behavior (which can commonly cause a segfault). You must point head to a valid struct node before you can safely dereference it. One way is like this:
int main() {
struct node* head;
struct node myNode;
head = &myNode; // assigning the address of myNode to head, now head points somewhere
head->data = 2; // this is legal
printf("%d \n", head->data); // will print 2
}
But in the above example, myNode is a local variable, and will go out of scope as soon as the function exists (in this case main). As you say in your question, for linked lists you generally want to malloc the data so it can be used outside of the current scope.
int main() {
struct node* head = malloc(sizeof struct node);
if (head != NULL)
{
// we received a valid memory block, so we can safely dereference
// you should ALWAYS initialize/assign memory when you allocate it.
// malloc does not do this, but calloc does (initializes it to 0) if you want to use that
// you can use malloc and memset together.. in this case there's just
// two fields, so we can initialize via assignment.
head->data = 2;
head->next = NULL;
printf("%d \n", head->data);
// clean up memory when we're done using it
free(head);
}
else
{
// we were unable to obtain memory
fprintf(stderr, "Unable to allocate memory!\n");
}
return 0;
}
This is a very simple example. Normally for a linked list, you'll have insert function(s) (where the mallocing generally takes place and remove function(s) (where the freeing generally takes place. You'll at least have a head pointer that always points to the first item in the list, and for a double-linked list you'll want a tail pointer as well. There can also be print functions, deleteEntireList functions, etc. But one way or another, you must allocate space for an actual object. malloc is a way to do that so the validity of the memory persists throughout runtime of your program.
edit:
Incorrect. This absolutely applies to int and int*,, it applies to any object and pointer(s) to it. If you were to have the following:
int main() {
int* head;
*head = 2; // head uninitialized and unassigned, this is UB
printf("%d\n", *head); // UB again
return 0;
}
this is every bit of undefined behavior as you have in your OP. A pointer must point to something valid before you can dereference it. In the above code, head is uninitialized, it doesn't point to anything deterministically, and as soon as you do *head (whether to read or write), you're invoking undefined behavior. Just as with your struct node, you must do something like following to be correct:
int main() {
int myInt; // creates space for an actual int in automatic storage (most likely the stack)
int* head = &myInt; // now head points to a valid memory location, namely myInt
*head = 2; // now myInt == 2
printf("%d\n", *head); // prints 2
return 0;
}
or you can do
int main() {
int* head = malloc(sizeof int); // silly to malloc a single int, but this is for illustration purposes
if (head != NULL)
{
// space for an int was returned to us from the heap
*head = 2; // now the unnamed int that head points to is 2
printf("%d\n", *head); // prints out 2
// don't forget to clean up
free(head);
}
else
{
// handle error, print error message, etc
}
return 0;
}
These rules are true for any primitive type or data structure you're dealing with. Pointers must point to something, otherwise dereferencing them is undefined behavior, and you hope you get a segfault when that happens so you can track down the errors before your TA grades it or before the customer demo. Murphy's law dictates UB will always crash your code when it's being presented.
Statement struct node* head; defines a pointer to a node object, but not the node object itself. As you do not initialize the pointer (i.e. by letting it point to a node object created by, for example, a malloc-statement), dereferencing this pointer as you do with head->data yields undefined behaviour.
Two ways to overcome this, (1) either allocate memory dynamically - yielding an object with dynamic storage duration, or (2) define the object itself as an, for example, local variable with automatic storage duration:
(1) dynamic storage duration
int main() {
struct node* head = calloc(1, sizeof(struct node));
if (head) {
head->data = 2;
printf("%d \n", head->data);
free(head);
}
}
(2) automatic storage duration
int main() {
struct node head;
head.data = 2;
printf("%d \n", head.data);
}
Here's my function to delete a linked list:
void deleteList( NODE* head )
{
NODE* temp1;
NODE* tempNext;
temp1 = head;
tempNext = NULL;
while( temp1 != NULL )
{
tempNext = temp1->next;
free(temp1);
temp1 = tempNext;
}
}
So temp1 first points where the head pointer is pointing. If it isn't NULL, tempNext will be set to point to the next element of the list. Then the first element (temp1) is free'd, and temp1 is reassigned to point to where tempNext is pointing and process repeats.
Is this the right approach to deleting an entire list?
I ask this because when I print the list after using this function, it still prints the list. And IIRC freeing something doesn't delete it but only marks it as available so I'm not sure how to tell if this is correct or not.
Your code looks correct.
You're also correct that freeing a list's elements doesn't immediately change the memory they pointed to. It just returns the memory to the heap manager which may reallocate it in future.
If you want to make sure that client code doesn't continue to use a freed list, you could change deleteList to also NULL their NODE pointer:
void deleteList( NODE** head )
{
NODE* temp1 = *head;
/* your code as before */
*head = NULL;
}
It still print the list, because you probably don't set the head pointer to NULL after calling this function.
I ask this because when I print the list after using this function, it still prints the list.
There is a difference between freeing a pointer and invalidating a pointer. If you free your whole linked list and the head, it means that you no longer "own" the memory at the locations that head and all the next pointers point to. Thus you can't garintee what values will be there, or that the memory is valid.
However, the odds are pretty good that if you don't touch anything after freeing your linked list, you'll still be able to traverse it and print the values.
struct node{
int i;
struct node * next;
};
...
struct node * head = NULL;
head = malloc(sizeof(struct node));
head->i = 5;
head->next = NULL;
free(head);
printf("%d\n", head->i); // The odds are pretty good you'll see "5" here
You should always free your pointer, then directly set it to NULL because in the above code, while the comment is true. It's also dangerous to make any assumptions about how head will react/contain after you've called free().
This is a pretty old question, but maybe it'll help someone performing a search on the topic.
This is what I recently wrote to completely delete a singly-linked list. I see a lot of people who have heartburn over recursive algorithms involving large lists, for fear of running out of stack space. So here is an iterative version.
Just pass in the "head" pointer and the function takes care of the rest...
struct Node {
int i;
struct Node *next;
};
void DeleteList(struct Node *Head) {
struct Node *p_ptr;
p_ptr = Head;
while (p_ptr->next != NULL) {
p_ptr = p_ptr->next;
Head->next = p_ptr->next;
free(p_ptr);
p_ptr = Head;
}
free(p_ptr);
}