Using pointers to pointers to update linked lists in C


I am currently working through C Programming A Modern Approach by K.N. King, and one of the exercise questions is:
The following function is supposed to insert a new node into its proper place in an ordered list, returning a pointer to the first node in the modified list. Unfortunately, the function doesn't work correctly in all cases. Explain what's wrong with it and show how to fix it. Assume that the node structure is the one defined in Section 17.5.
struct node
{
int value;
struct node *next;
};
struct node *insert_into_ordered_list(struct node *list, struct node *new_node)
{
struct node *cur = list, *prev = NULL;
while (cur->value <= new_node->value) {
prev = cur;
cur = cur->next;
}
prev->next = new_node;
new_node->next = cur;
return list;
}
After trying to understand this for a while and struggling, I eventually came across another stackoverflow question where someone posted this as their answer.
I currently don't have enough reputation to ask about it on there, so I'm asking here to try to wrap my head around this.
struct node * insert_into_ordered_list( struct node *list, struct node *new_node )
{
struct node **pp = &list;
while ( *pp != NULL && !( new_node->value < ( *pp )->value ) )
{
pp = &( *pp )->next;
}
new_node->next = *pp;
*pp = new_node;
return list;
}
Could someone explain to me how the previous node, the one before the inserted new_node is being updated to point at the inserted new_node? I'm guessing the line *pp = new_node; has something to do with it, but I can't understand how. Thank you.

As you said, pp is a pointer to the pointer to the "current node in the position in the list". A pointer to a pointer like this is sometimes called a "handle" in the literature. The name pp comes from "pointer to pointer".
* is the "dereference operator", specifically it means "the data at", so *pp means "the data at memory location pp," which in this case is the actual pointer to the current node.
When you use an assignment operator with a dereference, you are saying "set the data at memory location pp to new_node" (that data happens to also be a pointer). Remember, new_node is a pointer to your new node's data. So when you run the line:
new_node->next = *pp;
you are setting the "next" entry in the data at new_node equal to the pointer to the "current" node. Then when you run the line:
*pp = new_node;
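you are updating the pointer at location pp to point to the struct node data at new_node. That pointer is either the head pointer list (if the loop never advanced) or the next member of the previous node, because the loop last executed pp = &( *pp )->next. That is how the previous node ends up pointing at new_node.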
Sometimes when unwrapping all these pointers and dereferences in C, it helps me to make sure I understand the type of every expression and subexpression.
For example, here are various expressions from above and their types, written as informal comparisons using typeof (standardized in C23 and long available as a GCC extension). This is notation only, not compilable C, since types cannot be compared with ==. Read each line as "the type of X is Y":
typeof(new_node) == struct node *;
typeof(pp) == struct node **;
typeof(*pp) == struct node *;
typeof(*new_node) == struct node;
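The type claims above can also be checked by the compiler. The following is a rough sketch using C11's _Generic; the macro names IS_NODE, IS_NODE_PTR and IS_NODE_PTR_PTR and the function check_types are made up for this illustration and are not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Hypothetical helper macros: each yields 1 if the expression has the named type. */
#define IS_NODE(e)         _Generic((e), struct node: 1, default: 0)
#define IS_NODE_PTR(e)     _Generic((e), struct node *: 1, default: 0)
#define IS_NODE_PTR_PTR(e) _Generic((e), struct node **: 1, default: 0)

/* Returns 1 when every expression has the type listed in the answer. */
int check_types(void)
{
    struct node n = { 0, NULL };
    struct node *new_node = &n;
    struct node **pp = &new_node;

    assert(IS_NODE_PTR(new_node));   /* new_node  : struct node *  */
    assert(IS_NODE_PTR_PTR(pp));     /* pp        : struct node ** */
    assert(IS_NODE_PTR(*pp));        /* *pp       : struct node *  */
    assert(IS_NODE(*new_node));      /* *new_node : struct node    */
    return 1;
}
```

Each * in an expression removes one * from the type, which is why *pp and new_node end up with the same type and can be assigned to each other.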
Note that in C, the = operator copies the value of the right side into the object designated by the left side (commonly referred to as the rvalue and the lvalue of the assignment). So, in the parlance, after an = the lvalue could evaluate differently than before. The rvalue is used for calculating the new value of the lvalue, but itself remains unmodified after the assignment operation.
It is important to remember that any time you use = you are overwriting whatever data was in the lvalue. Assignment is the most common source of "side effects" in C. The others are the compound assignments (+=, -= and so on), the increment and decrement operators ++ and --, and function calls, which can modify objects through pointer arguments, as in this example.
The remaining operators (for example, *, &, +) simply evaluate inside expressions without changing anything, but when you use = it makes changes. For example, any expression containing the lvalue, or anything that depends on it, might evaluate to a different value before and after an = operation.
You can also, more simply, think of * as an operator that "removes *'s" from types, and & as an operator that "adds *'s" to types. They are called the dereference (or indirection) operator and the address-of operator.
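To make the effect of *pp = new_node concrete, here is a minimal sketch that runs the function from the question on a three-node example and checks which pointer was overwritten. The function demo_insert_middle and the stack-allocated nodes are assumptions made for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* Same algorithm as the answer quoted in the question. */
struct node *insert_into_ordered_list(struct node *list, struct node *new_node)
{
    struct node **pp = &list;        /* pp starts at the local head pointer */

    while (*pp != NULL && !(new_node->value < (*pp)->value)) {
        pp = &(*pp)->next;           /* pp now addresses some node's next member */
    }
    new_node->next = *pp;            /* new node points at the rest of the list */
    *pp = new_node;                  /* stores into list or into a node's next */
    return list;
}

/* Hypothetical check: insert 2 into the list 1 -> 3 and inspect the links. */
int demo_insert_middle(void)
{
    struct node c = { 3, NULL };
    struct node a = { 1, &c };
    struct node b = { 2, NULL };

    struct node *head = insert_into_ordered_list(&a, &b);

    assert(head == &a);      /* head is unchanged */
    assert(a.next == &b);    /* *pp was a.next, so node 1 now points at node 2 */
    assert(b.next == &c);    /* node 2 points at the old successor */
    return 1;
}
```

When the loop stops, pp holds the address of a.next, so the store through pp is the same as writing a.next = new_node.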

pp initially points to the pointer head (the parameter list in the function); after one or more iterations of the while loop it points instead to the data member next of some node
while ( *pp != NULL && !( new_node->value < ( *pp )->value ) )
{
pp = &( *pp )->next;
}
For example, suppose pp points to head. If head is equal to NULL or new_node->value < head->value, then these statements
new_node->next = *pp;
*pp = new_node;
in fact are equivalent to
new_node->next = head;
head = new_node;
If pp points to the data member next of some node then these statements
new_node->next = *pp;
*pp = new_node;
change the value of the current data member next to the address of the new node
*pp = new_node;
and before this the data member next of the new node is set to the value stored in the data member next of the current node.
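Both cases can be exercised with the same two statements. The sketch below wraps them in a helper called link_at (a name invented here) and applies it once through the address of head and once through the address of a node's next member.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

/* The two statements from the answer, applied through an arbitrary pp. */
void link_at(struct node **pp, struct node *new_node)
{
    new_node->next = *pp;
    *pp = new_node;
}

/* Hypothetical check covering both cases; returns 1 on success. */
int demo_link_at(void)
{
    struct node a = { 1, NULL }, b = { 2, NULL }, c = { 3, NULL };
    struct node *head = NULL;

    link_at(&head, &c);          /* pp points to head, head is NULL */
    assert(head == &c && c.next == NULL);

    link_at(&head, &a);          /* pp points to head: a goes in front of c */
    assert(head == &a && a.next == &c);

    link_at(&a.next, &b);        /* pp points to a.next: a -> b -> c */
    assert(a.next == &b && b.next == &c);
    return 1;
}
```

The helper never needs to know whether it is updating the head pointer or an interior link, which is the point of walking the list with a pointer to pointer.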
As for this function
struct node *insert_into_ordered_list(struct node *list, struct node *new_node)
{
struct node *cur = list, *prev = NULL;
while (cur->value <= new_node->value) {
prev = cur;
cur = cur->next;
}
prev->next = new_node;
new_node->next = cur;
return list;
}
then there is no check whether the pointer cur becomes (or initially is) equal to NULL. And secondly the pointer list is not changed when the new node is inserted before the node pointed to by list.
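The two defects can be made visible without triggering undefined behavior. The following sketch replays the loop of the original function with explicit checks and reports which invalid access it would have performed; the enum and the function trace_original are hypothetical names used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    int value;
    struct node *next;
};

enum failure { NO_FAILURE, CUR_IS_NULL, PREV_IS_NULL };

/* Replays the original loop, but stops before any null pointer is dereferenced. */
enum failure trace_original(const struct node *list, const struct node *new_node)
{
    const struct node *cur = list, *prev = NULL;

    for (;;) {
        if (cur == NULL)
            return CUR_IS_NULL;      /* original would evaluate cur->value */
        if (!(cur->value <= new_node->value))
            break;
        prev = cur;
        cur = cur->next;
    }
    if (prev == NULL)
        return PREV_IS_NULL;         /* original would evaluate prev->next */
    return NO_FAILURE;
}

/* Hypothetical check of the three situations on the list 2 -> 4. */
int demo_trace(void)
{
    struct node n4 = { 4, NULL };
    struct node n2 = { 2, &n4 };
    struct node low = { 1, NULL }, mid = { 3, NULL }, high = { 9, NULL };

    assert(trace_original(NULL, &mid) == CUR_IS_NULL);   /* empty list */
    assert(trace_original(&n2, &high) == CUR_IS_NULL);   /* insert at the end */
    assert(trace_original(&n2, &low) == PREV_IS_NULL);   /* insert at the front */
    assert(trace_original(&n2, &mid) == NO_FAILURE);     /* insert in the middle */
    return 1;
}
```

Only insertion strictly inside a non-empty list works with the original code; the empty list, the tail position and the head position all fail.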
The function could be defined the following way
struct node *insert_into_ordered_list(struct node *list, struct node *new_node)
{
if ( list == NULL || new_node->value < list->value )
{
new_node->next = list;
list = new_node;
}
else
{
struct node *cur = list;
while ( cur->next != NULL && cur->next->value <= new_node->value)
{
cur = cur->next;
}
new_node->next = cur->next;
cur->next = new_node;
}
return list;
}

Related

Purpose of Using a double pointer in Linked Lists

I have a few questions about this function which intends to delete all occurrence of a given integer in the list :
void deleteAllOcc (node**headRef,int d)
{
node*temp;
if (*headRef==NULL)
return;
while (*headRef)
{
if ((*headRef)->data==d)
{
temp=*headRef;
*headRef=(*headRef)->next;
free(temp);
}
else
{
headRef=&((*headRef)->next);
}
}
}
First why did we use **headRef instead of *head ?
Secondly, considering this :
if ((*headRef)->data==d)
{
temp=*headRef;
*headRef=(*headRef)->next;
free(temp);
}
Don't we need to update the links ?
And third, considering this :
else
{
headRef=&((*headRef)->next);
}
Why didn't we do *headRef=(*headRef)->next; here?
Thanks in advance

For starters, the function can be written more simply
void deleteAllOcc( node **headRef, int d )
{
while ( *headRef != NULL )
{
if ( ( *headRef )->data == d )
{
node *temp = *headRef;
*headRef = ( *headRef )->next;
free( temp );
}
else
{
headRef = &( *headRef )->next;
}
}
}
That is, the initial check whether the pointer to the head node (passed by reference) is equal to NULL is redundant, because the while loop performs the same test.
First why did we use **headRef instead of *head?
It depends on how the function is defined. In the function definition provided in your question, the pointer to the head node is passed by reference, that is, through a pointer to the pointer head. In this case, by changing the pointed-to pointer head within the function you are changing the original pointer.
For example imagine that the list contains only one node and its data member data is equal to the specified value of the argument d.
in main
node *head = malloc( sizeof( node ) );
head->data = 1;
head->next = NULL;
deleteAllOcc( &head, 1 );
//...
and within the function deleteAllOcc you have
if((*headRef)->data==d)
{
temp=*headRef;
*headRef=(*headRef)->next;
free(temp);
}
As *headRef yields the original pointer head in main then this statement
*headRef=(*headRef)->next;
changes the original pointer head in main, not a copy of the value of the pointer.
You could define the function another way, with the pointer to the head node passed by value rather than by reference. But in this case you need to return the possibly new value of the pointer to the head node to main.
For example
node * deleteAllOcc( node *head, int d )
{
while ( head != NULL && head->data == d )
{
node *temp = head;
head = head->next;
free( temp );
}
if ( head != NULL )
{
for ( node *current = head; current->next != NULL; )
{
if ( current->next->data == d )
{
node *temp = current->next;
current->next = current->next->next;
free( temp );
}
else
{
current = current->next;
}
}
}
return head;
}
and in main the function should be called for example like
node *head = NULL;
//filling the list
head = deleteAllOcc( head, 1 );
As you can see, the function definition in which the pointer to the head node is passed by reference looks simpler.
how are we updating the links between the nodes after deleting a node
You have a pointer to the data member next of the current node. The node pointed to by that data member next contains a value equal to the value of the parameter d, so it must be deleted. It means that the data member next of the current node shall point to the node after the deleted node. That is, you need to write
*headRef = ( *headRef )->next;
where headRef at the current moment points to the data member next of the current node.
|node1| next | -> |node2| next | -> |node3| next |
         |           |
      headRef      node that must be deleted
So in the data member next pointed to by headRef you have to write the value of the data member next of the deleted node node2.
why we didnt do *headRef=(*headRef)->next; here?
In this code snippet
else
{
headRef=&((*headRef)->next);
}
you are reassigning the pointer headRef so that it points to the data member next of the next node. If you instead write
*headRef = ( *headRef )->next;
then you will lose the node currently pointed to by *headRef: the data member next (or the head pointer) that headRef points to is overwritten, so that node is unlinked from the list without being freed.
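As a quick way to see the pointer-to-pointer version handle matches at the head, in the middle and at the tail, here is a small sketch. deleteAllOcc is the function from the answer; push_front and demo_delete are hypothetical helpers added for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct node {
    int data;
    struct node *next;
} node;

/* Same function as in the answer. */
void deleteAllOcc(node **headRef, int d)
{
    while (*headRef != NULL) {
        if ((*headRef)->data == d) {
            node *temp = *headRef;
            *headRef = (*headRef)->next;   /* relink: skip over the match */
            free(temp);
        } else {
            headRef = &(*headRef)->next;   /* advance: nothing is modified */
        }
    }
}

/* Hypothetical helper: prepend a value; returns 0 if allocation fails. */
int push_front(node **headRef, int data)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->data = data;
    n->next = *headRef;
    *headRef = n;
    return 1;
}

/* Builds 1 -> 2 -> 1 -> 3 -> 1, deletes every 1, expects 2 -> 3. */
int demo_delete(void)
{
    const int values[] = { 1, 3, 1, 2, 1 };   /* pushed to the front in this order */
    node *head = NULL;
    int ok = 1;

    for (size_t i = 0; ok && i < sizeof values / sizeof values[0]; i++)
        ok = push_front(&head, values[i]);

    if (ok) {
        deleteAllOcc(&head, 1);
        ok = head != NULL && head->data == 2
          && head->next != NULL && head->next->data == 3
          && head->next->next == NULL;
    }
    while (head != NULL) {                    /* release whatever is left */
        node *t = head;
        head = head->next;
        free(t);
    }
    return ok;
}
```

The caller's head changes from the first 1 to the node holding 2, which is only possible because the function received &head rather than a copy of head.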

First why did we use **headRef instead of *head?
When you pass the address of the first node of a linked list to a function, the function receives a copy of the pointer to that first element. That is enough to iterate over the list. But if the function must be able to change which node is first (for example, when the first node is deleted), it needs a pointer to the head pointer itself, that is, a double pointer obtained by passing the address of the head pointer.
why we didnt do *headRef=(*headRef)->next; here?
Your method can work, but I advise you to create a pointer to a temporary node (node *tmp) in order to assign the value to it and swap data with it. It's much more readable for everyone.

Confusion about 'pointer-to-struct-or-union' type

Both the snippets of code seem like they accomplish the same thing, however the first snippet compiles and executes without errors and yields expected behaviour, the second snippet does not compile.
void insert_at_index(Node** head, int index, int number) {
Node* new_node = (Node*)malloc(sizeof(Node));
new_node->data = number;
int count = 0;
int i;
Node* temp = *head;
if (index == 0) {
new_node->next = temp;
new_node->prev = temp->prev;
*head = new_node;
}
}
void insert_at_index(Node** head, int index, int number) {
Node* new_node = (Node*)malloc(sizeof(Node));
new_node->data = number;
int count = 0;
int i;
//Node* temp = *head;
if (index == 0) {
new_node->next = *head;
new_node->prev = *head->prev; //compiler error: expression must have pointer-to-struct-or-union type
*head = new_node;
}
}
Since head is a pointer to a pointer to a struct, *head should be a pointer to a struct, right? So *head-> should be the correct way to reference things in Node *head, in my opinion. But I am not sure how to fix this, and I am new to C. I am using the MSVC compiler, hence the pointer casting.

so *head-> should be the correct way to reference things in Node *head
No, and that is where your mistake is. Your thinking is on the right track, but your assumption about the syntax is wrong.
The -> member access operator has a higher precedence than the * dereference operator. So *head->prev gets evaluated as *(head->prev), which fails to compile since head is a Node** and so you can't apply -> to it. You need to use (*head)->prev instead to dereference head into a Node* before you can then access its members via ->.

In the first code snippet you have
Node* temp = *head;
//...
new_node->prev = temp->prev;
Now make the reverse substitution for the variable temp in the statement
new_node->prev = temp->prev;
You will get according to the definition of the variable temp the following
new_node->prev = ( *head )->prev;
As you can see it is not the same as
new_node->prev = *head->prev;
in the second code snippet.
So you need at first to get an object of the type Node * and then to apply the operator ->. Postfix operators like -> have higher precedence than unary operators like *. That means that the expression
*head->prev
is parsed like
*( head->prev )
You could use two postfix operators sequentially like for example
new_node->prev = head[0]->prev;
But using such an expression with pointers that do not point to arrays can confuse readers of the code.
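The parenthesized form can be seen in a complete, compilable shape in the sketch below. The Node layout and the helper insert_front are assumptions for the illustration. Unlike the asker's snippet, it also guards against an empty list and updates the old head's prev link.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *prev;
    struct Node *next;
} Node;

/* Hypothetical helper: links new_node in front of *head. */
void insert_front(Node **head, Node *new_node)
{
    new_node->next = *head;
    /* (*head)->prev dereferences head first and then selects the member.
       Writing *head->prev would parse as *(head->prev) and fail to compile,
       because head is a Node ** and has no members. */
    new_node->prev = (*head != NULL) ? (*head)->prev : NULL;
    if (*head != NULL)
        (*head)->prev = new_node;
    *head = new_node;
}

/* Hypothetical check: build a -> b by inserting b and then a at the front. */
int demo_insert_front(void)
{
    Node a = { 1, NULL, NULL }, b = { 2, NULL, NULL };
    Node *head = NULL;

    insert_front(&head, &b);
    insert_front(&head, &a);

    assert(head == &a);
    assert(a.prev == NULL && a.next == &b);
    assert(b.prev == &a && b.next == NULL);
    return 1;
}
```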

Why does my code not insert a new node into this linked list?

I am new to programming. I just want to know why this doesn't work.
My understanding of pointers isn't clear, especially when using pointers across functions.
void append(struct Node** head_ref, float new_data)
{
struct Node* new_node = (struct Node*) malloc(sizeof(struct Node));
struct Node *last = *head_ref; /* used in step 5*/
new_node->data = new_data;
new_node->next = NULL;
while (last != NULL)
last = last->next;
last = new_node;
return;
}
void append(struct Node** head_ref, float new_data)
{
while (last->next != NULL)
last = last->next;
last->next = new_node;
return;
}
In the first function the new data doesn't get included, I get only the original linked list.
But the second function works just fine. How does a double pointer work when inserting a new node in the beginning of the linked list? (I have seen answers regarding this question, but I am still confused)

In the first example, you move the pointer last until it points at a NULL location. Then, you set the pointer to new_node. However, at this point, last has no real association to your linked list. It is just a pointer to some memory. In the second example, the correct one, you iterate until you reach the tail of the linked list, where next of that node is NULL. Then, you set that next to new_node. There is now a new tail to the list, that is new_node.

Changing the local variable last does not change the value of the data member next of the previous (last) node.
To be more clear let's assume that the list is empty. Then you have to change the pointer referenced by this double pointer head_ref.
You declared a new pointer
struct Node *last = *head_ref;
The loop
while (last != NULL)
last = last->next;
is skipped because last is already equal to NULL due to the initialization in the previous declaration. And then you changed this local variable last
last = new_node;
The original value of the pointer pointed to by head_ref was not changed because last and *head_ref occupy different regions of memory. You changed the memory occupied by last but did not change the memory occupied by *head_ref.
Also you should check whether memory was successfully allocated.
The function can look the following way
int append( struct Node **head_ref, float new_data )
{
struct Node *new_node = malloc( sizeof( struct Node ) );
int success = new_node != NULL;
if ( success )
{
new_node->data = new_data;
new_node->next = NULL;
while ( *head_ref != NULL ) head_ref = &( *head_ref )->next;
*head_ref = new_node;
}
return success;
}
As for this loop (I think you wanted just to show the loop not a whole function)
while (last->next != NULL)
last = last->next;
last->next = new_node;
then you are changing the data member next of the previous (last) node.
However, this loop will not work if *head_ref is initially equal to NULL, because last would be a null pointer and last->next would dereference it.
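The difference between the two approaches can be checked directly. In this sketch the nodes are supplied by the caller so that no allocation is involved; append_local_only mirrors the first function from the question and append_linked mirrors the corrected loop. Both names, and demo_append, are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>

struct Node {
    float data;
    struct Node *next;
};

/* Mirrors the first function in the question: only the local variable changes. */
void append_local_only(struct Node **head_ref, struct Node *new_node)
{
    struct Node *last = *head_ref;

    while (last != NULL)
        last = last->next;
    last = new_node;        /* overwrites the local copy; the list is untouched */
    (void)last;
}

/* Walks a pointer to pointer, so the final store lands inside the list. */
void append_linked(struct Node **head_ref, struct Node *new_node)
{
    new_node->next = NULL;
    while (*head_ref != NULL)
        head_ref = &(*head_ref)->next;
    *head_ref = new_node;   /* writes into the head pointer or into a next member */
}

/* Hypothetical check; returns 1 on success. */
int demo_append(void)
{
    struct Node a = { 1.0f, NULL }, b = { 2.0f, NULL };
    struct Node *head = NULL;

    append_local_only(&head, &a);
    assert(head == NULL);                 /* nothing was linked */

    append_linked(&head, &a);             /* works on an empty list */
    append_linked(&head, &b);             /* and on a non-empty one */
    assert(head == &a && a.next == &b && b.next == NULL);
    return 1;
}
```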

Why use double pointer when inserting node in sorted order in linkedlist?

typedef struct node node;
struct node {
int data;
node *next;
};
int insert_asc(node **phead, int data) {
node **traser;
node *newnode = malloc(sizeof(node));
if (newnode == 0)
return 0;
newnode->data = data;
for (traser = phead; *traser != 0; traser = &(*traser)->next)
if (data <= (*traser)->data)
break;
newnode->next = *traser;
*traser = newnode;
return 1;
}
The confusing part for me is when you dereference a double pointer traser.
how come (*traser)->next holds the next node's address?
and what exactly is *traser here?

Double pointers are used in the posted code for two separate purposes:
node **phead: the head of the list is passed by reference so it can be updated by insert_asc if the new node must be inserted at the head of the list. C has no true pass by reference; the idiomatic way to achieve it is to pass a pointer to the value to be updated by the function, hence a double pointer phead.
node **traser: To avoid making a special case of the empty list and of insertion at the head of the list, the programmer uses a pointer to keep track of the place to insert the new node. traser first points to the head of the list, which in this case is the value of phead, and is updated to point to the link between nodes, the next member of the current node, when it is determined that the new node must be inserted after the current one. This is an elegant way to implement insertion without a special case. C allows it thanks to pointers; it is not possible in Java or JavaScript because these languages do not have generalised pointers.
Note however that the code could be made more readable by using NULL instead of 0 when comparing pointers:
typedef struct node node;
struct node {
int data;
node *next;
};
int insert_asc(node **phead, int data) {
node **traser;
node *newnode = malloc(sizeof(node));
if (newnode == NULL)
return 0;
newnode->data = data;
for (traser = phead; *traser != NULL; traser = &(*traser)->next) {
if (data <= (*traser)->data)
break;
}
newnode->next = *traser;
*traser = newnode;
return 1;
}
Note also that new nodes with a given value of data are inserted before nodes with the same data value. It does not make a difference in this case and may be a little faster for lists with many duplicates, but if the payload was more elaborate, this insertion method would implement a non-stable sort, whereas using < instead of <= would make the sort stable.
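The stability remark can be demonstrated with a payload that distinguishes equal keys. The sketch below is an assumption-laden illustration: struct item, its tag member, the before_equal flag and the function names are all invented here, and the nodes are supplied by the caller instead of being allocated.

```c
#include <assert.h>
#include <stddef.h>

struct item {
    int key;
    char tag;               /* payload used to tell equal keys apart */
    struct item *next;
};

/* before_equal != 0 mimics the test data <= (*traser)->data (new item goes
   before equal keys); before_equal == 0 mimics data < (*traser)->data
   (new item goes after equal keys). */
void insert_sorted(struct item **phead, struct item *new_item, int before_equal)
{
    struct item **tracer;

    for (tracer = phead; *tracer != NULL; tracer = &(*tracer)->next) {
        int stop = before_equal ? new_item->key <= (*tracer)->key
                                : new_item->key <  (*tracer)->key;
        if (stop)
            break;
    }
    new_item->next = *tracer;
    *tracer = new_item;
}

/* Inserts two items with the same key under both comparisons. */
int demo_stability(void)
{
    struct item a = { 1, 'a', NULL }, b = { 1, 'b', NULL };
    struct item *head = NULL;

    insert_sorted(&head, &a, 0);
    insert_sorted(&head, &b, 0);
    assert(head == &a && a.next == &b);   /* < keeps arrival order: a, b */

    head = NULL;
    insert_sorted(&head, &a, 1);
    insert_sorted(&head, &b, 1);
    assert(head == &b && b.next == &a);   /* <= reverses it: b, a */
    return 1;
}
```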
For illustration, here is an alternative implementation that does not use a double pointer for the insertion point and needs extra tests for the special cases:
int insert_asc(node **phead, int data) {
node *cur;
node *newnode = malloc(sizeof(node));
if (newnode == NULL)
return 0;
newnode->data = data;
cur = *phead;
if (cur == NULL || data <= cur->data) {
newnode->next = cur;
*phead = newnode;
} else {
while (cur->next != NULL && data > cur->next->data)
cur = cur->next;
newnode->next = cur->next;
cur->next = newnode;
}
return 1;
}

You are using a double pointer here in order to keep track of the head of your list.
If you were using a simple pointer here and the head changed, you would risk losing the address of some nodes of your list.
This is because, if you passed a simple pointer to the head of your list, the function would manipulate a copy of your head address. When the function puts another node in front of the old head, the caller's head would still hold the old address, so every node placed before the old head would be unreachable after the function returns.
Edit: pythontutor.com is a tool that helped me understanding the behavior of linked list quite easily thanks to its excellent visualization tool, I would highly recommend you to use it.

How to clear a linked list using double pointer?

Hi I'm trying to make a function that clears the linked list that *first points to: every node should be freed and the pointer *first set to NULL.
I'm having trouble grasping double pointers and can't get this to work correctly.

You have to move to the next list element before you delete the node. Otherwise you are accessing memory that has been freed.
while( *first != NULL )
{
temp = *first;
*first = temp->next;
free(temp);
}
Be aware that because you're trashing the whole list, *first is going to eventually be NULL. So you can use a single-pointer (just like temp) to traverse your list, and then set *first = NULL at the end. That saves an extra pointer indirection, which arguably is wasteful of CPU in this case.
[edit] What I mean is:
struct node *curr = *first;
struct node *prev;
while( curr != NULL )
{
prev = curr;
curr = curr->next;
free(prev);
}
*first = NULL;
Generally, I find that the less pointer dereferencing you have going on, the easier the code is to understand at a glance.
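Wrapped into a complete function, the single-pointer traversal could look like the sketch below. The asker's node type and function are not shown, so the struct layout, the name clear_list and the returned count are assumptions added for the illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Frees every node and leaves *first == NULL. Returns the number of nodes freed. */
size_t clear_list(struct node **first)
{
    size_t freed = 0;
    struct node *curr = *first;

    while (curr != NULL) {
        struct node *next = curr->next;   /* read the link before freeing the node */
        free(curr);
        curr = next;
        freed++;
    }
    *first = NULL;                        /* the caller's pointer no longer dangles */
    return freed;
}

/* Hypothetical check: build up to three nodes, clear them, verify the result. */
int demo_clear(void)
{
    struct node *head = NULL;
    size_t built = 0;

    for (int i = 0; i < 3; i++) {
        struct node *n = malloc(sizeof *n);
        if (n == NULL)
            break;
        n->value = i;
        n->next = head;
        head = n;
        built++;
    }
    return clear_list(&head) == built && head == NULL;
}
```

The double pointer is needed only for the final store to *first; the traversal itself uses plain single pointers.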

node *remove_node(node **double_pointer, int search_value)
// pass the address of your head pointer as double_pointer
{
// advance until *double_pointer is the matching node or NULL
while (*double_pointer != NULL && (*double_pointer)->value != search_value)
{
double_pointer = &(*double_pointer)->next;
}
if (*double_pointer != NULL)
{
// the lines below unlink the node which contains our search value
node *deleted_node = *double_pointer;
*double_pointer = (*double_pointer)->next;
return deleted_node; // the caller is responsible for freeing it
}
return NULL; // no node holds search_value
}