C Linked List pointer understanding - c

I'm trying to understand how pointers work in a C linked list.
I understand that a pointer to a variable is a "link" to a memory address, and that a pointer to a pointer refers to the pointer itself.
What puzzles me is how, for example, a node reference can modify a value in the original list but not the list head itself.
Let me explain with an example:
void insertNode(struct node** head, int value) {
    struct node* new = malloc(sizeof(struct node)); // size of a node, not of a pointer to one
    struct node* ref = (*head); //this is a reference. -> same address.
    //base case
    if((*head) == NULL) {
        //do things
    } else { // not null
        while(ref->next != NULL) {
            ref = ref->next; //THIS: how can this not modify the head itself?
        }
        //null spot found, set up node
        new->value = 10; //some int here
        new->next = NULL;
        ref->next = new; //Instead, how can this modify the head? and why?
    }
}
Here's a little snippet of code, and my question is this.
Yes, I'm holding a reference to head through ref.
But why does
ref = ref->next;
modify only ref itself, while
ref->next = new
also modifies the head?
Through GDB I saw that both hold the same memory address at the beginning, but ref only modifies the referenced list on the new insert.
Can someone explain it?

ref is just a pointer; modifying ref will not modify what ref points to.
The while loop is just looking for the last element of the list. After the while loop, ref simply points to the last element of the list.
First "mystery" line:
ref = ref->next; //THIS: how can this not modify the head itself?
Here we just read ref->next, so the head cannot be modified.
Second "mystery" line:
ref->next = new; //Instead, how can this modify the head? and why?
Here we modify what ref points to. At this line, ref points either to the last element of the list or to the head (which is also the last element if the list has only one element, or the newly created head, to be set up in //do things, if the list was empty).

Maybe some pictures will help.
Before calling insertNode, you have a sequence of nodes linked like so:
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
You have a pointer (call it h) that points to the first element of the list:
+---+
| h |
+---+
|
V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
When you call insertNode, you pass a pointer to h in as a parameter, which we call head:
+------+
| head |
+------+
|
V
+---+
| h |
+---+
|
V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
You create a pointer variable named ref that takes the value of *head (h); IOW, ref winds up pointing to the first element of the list:
+------+ +-----+
| head | | ref |
+------+ +-----+
| |
V |
+---+ |
| h | |
+---+ |
| +----+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
Then you create another node on the heap, and assign that pointer to a local variable named new:
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next |
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +----+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
So, the thing to notice is that while ref and *head (h) have the same value (the address of the first node in the list), they are different objects. Thus, anything that changes the value of ref does not affect either head or h.
So, if we execute this loop
while(ref->next != NULL) {
    ref = ref->next;
}
the result of the first iteration is
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next |
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +------------+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
After another iteration we get
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next |
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +----------------------------------+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
At this point, ref->next is NULL, so the loop exits.
We then assign values to new->value and new->next, such that new->next is NULL:
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next | ---|||
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +----------------------------------+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
Finally, we set ref->next to the value of new, thus adding the node new points to to the end of the list:
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next | ---|||
+------+ +-----+ +-----+ +-------+------+
| | ^
V | |
+---+ | +-------------------------------+
| h | | |
+---+ | |
| +----------------------------------+ |
V V |
+-------+------+ +-------+------+ +-------+------+ |
| value | next | ---> | value | next | ---> | value | next | ---+
+-------+------+ +-------+------+ +-------+------+
Note that ref->next isn't pointing to the variable new, it's pointing to the thing that new points to.
So, that's why updating ref does not affect head (or *head (h)). The base case, where the list is empty, will end up writing to *head (h), setting it to point to a new node allocated from the heap.

Related

removing the tail of a linked list with a tail pointer

I'm trying to create a function that removes the tail of a linked list. I have both a head and a tail pointer, so my question is: when removing from the tail, do I still need to traverse the list to remove the node at the tail?
Assuming that your list is a singly linked list, you still need to traverse it, because the second-to-last element becomes the new tail and the tail pointer must be updated to point to it.
So storing the list's tail does not help with this operation.
If you have a doubly linked list instead, you can free the tail and update the tail pointer by following the "back link".
As part of removing the node, you need the node preceding the node to remove for two reasons:
tail needs to be updated to point to it.
Its next field needs to be set to NULL.
Before:
head Node Node Node tail
+------+ +-------+ +-------+ +-------+ +------+
| ------->| -----...-->| ------->| NULL |<------- |
+------+ | | | | | | +------+
| | | | | |
+-------+ +-------+ +-------+
After:
head Node Node tail
+------+ +-------+ +-------+ +------+
| ------->| -----...-->| NULL |<--------------------- |
+------+ | | | | +------+
| | | |
+-------+ +-------+
Finding this node in a singly-linked list requires traversing the list.
Finding this node in a doubly-linked list doesn't require traversing the list. We can simply use the pointers stylized as <=== in the following diagram:
head Node Node Node tail
+------+ +-------+ +-------+ +-------+ +------+
| ------->| -----...-->| ------->| NULL |<======= |
+------+ | NULL |<-...------ |<======= | +------+
| | | | | |
+-------+ +-------+ +-------+

why dynamically allocated variables will not change(when we use their addresses) like static variables

head is a pointer to memory which is equal to current
void display(node_t **head) {
    struct node *current = (*head);
    if((*head) == NULL) {
        printf("List is empty \n");
        return;
    }
    while(current != NULL) {
        printf("current->n=%d\n", current->next); //0 1 2
        printf("head->n=%d\n", head->next); // 0 0 0 if we assign current to its next
        current = current->next; // should it affect to the head because we are
                                 //because we are assigning memory address to another
    }
}
sample code using static vars:
int *head,*current,c=1;
head=&c;
current=&c;
(*current)++;
head and current will change in this case
while(current != NULL) {
printf("current->n=%d\n", current->n); //0 1 2
printf("head->n=%d\n", head->n); // 0 0 0 if we assign current to its next
current = current->next; // should it affect to the head because we are
//because we are assigning memory address to another
}
Ignoring the inconsistent naming (is it n or next?), head and current are separate objects - nothing you do to current will affect head.
It might help if we draw a picture. Let's assume you have a list with three nodes (I don't know what your node structure looks like, so this is pretty generic):
+---+---+ +---+---+ +---+---+
| | |----->| | |------>| | |---|||
+---+---+ +---+---+ +---+---+
You have a pointer to the first element (we'll call it h):
+---+ +---+---+ +---+---+ +---+---+
h: | |----->| | |----->| | |------>| | |---|||
+---+ +---+---+ +---+---+ +---+---+
Your head argument in the display function points to h:
+---+
head: | |
+---+
|
V
+---+ +---+---+ +---+---+ +---+---+
h: | |----->| | |----->| | |------>| | |---|||
+---+ +---+---+ +---+---+ +---+---+
You initialize current with the value of *head, which is the value of h, so current points to the first node of the list:
+---+
head: | |
+---+
|
V
+---+ +---+---+ +---+---+ +---+---+
h: | |----->| | |----->| | |------>| | |---|||
+---+ +---+---+ +---+---+ +---+---+
^
|
+---+
current: | |
+---+
Each time you execute
current = current->next
that will set current to point to the next element in the list:
+---+
head: | |
+---+
|
V
+---+ +---+---+ +---+---+ +---+---+
h: | |----->| | |----->| | |------>| | |---|||
+---+ +---+---+ +---+---+ +---+---+
^
|
+---+
current: | |
+---+
+---+
head: | |
+---+
|
V
+---+ +---+---+ +---+---+ +---+---+
h: | |----->| | |----->| | |------>| | |---|||
+---+ +---+---+ +---+---+ +---+---+
^
|
+---+
current: | |
+---+
Nothing you do to current affects head - they are separate objects.
int *head,*current,c=1;
head=&c;
current=&c;
while(True):
current++;
In this case, head and current will change.
Yes, because you are explicitly assigning a new value to both head and current - however, current++ will have no effect on head.
And this isn't valid C, by the way - looks like you started in C and finished in Python.
Because at this point here: struct node *current = (*head); you dereference head. If head is an array of node pointers, now current points to the first item of that array.
Also, I would advise checking head for NULL before dereferencing it; otherwise you have a segfault waiting to happen.

I made a program a while ago and I don't understand some parts of it. It's about lists

I may be missing something about pointers, but I don't get why my code works.
I made a list and need to insert new nodes (values) at the end of it.
Here is the code of the function that I don't understand properly.
Just imagine we add 3 elements, such as 1 2 3.
Tlist InsertAtEnd (Tlist head,int x) {
    Tnode *newNode, *tmp;
    newNode = makenode(x);
    tmp = head;
    if (head == NULL) {
        head = newNode;
    }
    else {
        while(tmp->next!=NULL) {
            tmp=tmp->next;
        }
        tmp->next = newNode;
    }
    return head;
}
So if I insert the first node, it's all fine; I understand the process.
The problem comes at the second insertion.
It goes into the else branch and then executes tmp->next = newNode;
I changed the value of tmp->next to the value of newNode (the second node I'm adding), but as its name suggests, tmp should be temporary.
I return head, right? I changed nothing in head, did I? That's what confuses me.
When it prints the values of the list, it prints them correctly. How is that possible? How can the linked list be updated when I only change values through tmp?
Pictures should help. Not knowing how Tlist and Tnode are defined, I have to make some guesses, but based on the code it looks like they're defined something like this:
struct node { int value; struct node *next; };
typedef struct node Tnode;
typedef Tnode *Tlist;
Let's imagine you have an object in your code named list. This object is a pointer to a Tnode type, and it initially is set to NULL (points "nowhere"):
+---+
list: | +-+---|||
+---+
We call InsertAtEnd, passing list as the head argument and the value 1 as x. The first thing your function does is create a new node with that value (and I'm assuming it nulls out the next member):
+---+ +---+---+
newNode: | +-+---> | 1 | +-+---|||
+---+ +---+---+
Since your list is initially empty, we set head to point to this new node:
+---+
list: | +-+---||| // list is a distinct object from head
+---+
+---+ +---+---+
head: | +-+---> | 1 | +-+---|||
+---+ +---+---+
^
+---+ |
newNode: | +-+----+
+---+
Remember, the value of head is the address of the newly created node.
The code then returns head, which we'll assume is assigned back to list, giving us this:
+---+ +---+---+
list: | +-+---> | 1 | +-+---|||
+---+ +---+---+
Now we call InsertAtEnd again with list and the value 2. The first thing your code does is create the new node:
+---+ +---+---+
newNode: | +-+---> | 2 | +-+---|||
+---+ +---+---+
Now, since the list isn't empty, we use the temporary variable tmp that initially points to the same object as head:
+---+
list: | +-+-------+
+---+ |
V
+---+ +---+---+
head: | +-+---> | 1 | +-+---|||
+---+ +---+---+
^
+---+ |
tmp: | +-+-------+
+---+
We then iterate through the list until the next member of the thing tmp points to is NULL. In this case, it's the first object of the list, so we set tmp->next to point to our new node:
+---+
list: | +-+-------+
+---+ |
V
+---+ +---+---+ +---+---+
head: | +-+---> | 1 | +-+---> | 2 | +-+---|||
+---+ +---+---+ +---+---+
^ ^
+---+ | |
tmp: | +-+-------+ |
+---+ |
|
+---+ |
newNode: | +-+------------------+
+---+
When we return from InsertAtEnd, our list looks like this:
+---+ +---+---+ +---+---+
list: | +-+---> | 1 | +-+---> | 2 | +-+---|||
+---+ +---+---+ +---+---+
Now we call InsertAtEnd one more time, with list and the value 3. Again, we start by creating the new node:
+---+ +---+---+
newNode: | +-+---> | 3 | +-+---|||
+---+ +---+---+
Then we point tmp to the first element in the list, and "walk" down the list until tmp->next is NULL:
+---+
list: | +-+-------+
+---+ |
V
+---+ +---+---+ +---+---+
head: | +-+---> | 1 | +-+---> | 2 | +-+---|||
+---+ +---+---+ +---+---+
^
+---+ |
tmp: | +-+-------+ // tmp->next != NULL
+---+
Second iteration:
+---+
list: | +-+-------+
+---+ |
V
+---+ +---+---+ +---+---+
head: | +-+---> | 1 | +-+---> | 2 | +-+---|||
+---+ +---+---+ +---+---+
^
+---+ |
tmp: | +-+---------------------+ // tmp->next == NULL
+---+
Again, we set tmp->next to point to the new node:
+---+
list: | +-+-------+
+---+ |
V
+---+ +---+---+ +---+---+ +---+---+
head: | +-+---> | 1 | +-+---> | 2 | +-+---> | 3 | +-+---|||
+---+ +---+---+ +---+---+ +---+---+
^ ^
+---+ | |
tmp: | +-+---------------------+ |
+---+ |
|
+---+ |
newNode: | +-+--------------------------------+
+---+
and our list is now
+---+ +---+---+ +---+---+ +---+---+
list: | +-+---> | 1 | +-+---> | 2 | +-+---> | 3 | +-+---|||
+---+ +---+---+ +---+---+ +---+---+
Hopefully that helps.
When I took Data Structures way, way back in the mid-Cretaceous, it took me about a week longer than my classmates to grok the concept of linked lists, so I feel you here.
You are changing tmp->next. On the second call (adding the second node), tmp points to the same node as head, so tmp->next = newNode has the same effect as head->next = newNode. You always return head, which doesn't change after the first insertion. Because tmp points into the same list that head points to, the changes persist.
The tail of a singly linked list is the last node, whose next pointer is NULL. The while loop iterates to the end of the linked list (i.e. until it finds a node whose next is NULL), and once the loop exits, tmp points to the last node, to which the newly created node is appended.
I return head, right? I changed nothing in head, did I?
Yes, you do return head. The first insert creates head, and each subsequent insert attaches the newly created node after the tail (head itself is the tail for the insert immediately following the first).

Correct way to join two double linked list

In the Linux kernel source, the list_splice is implemented with __list_splice:
static inline void __list_splice(const struct list_head *list,
                                 struct list_head *prev,
                                 struct list_head *next)
{
    struct list_head *first = list->next; // Why?
    struct list_head *last = list->prev;
    first->prev = prev;
    prev->next = first;
    last->next = next;
    next->prev = last;
}
Isn't the list already pointing to the head of a linked list?
Why do we need to fetch list->next instead?
The doubly linked list API in the Linux kernel is implemented as an abstraction of a circular list. In that simple scheme the HEAD node does not contain any payload (data) and is used only to mark the starting point of the list, so the first real element is list->next rather than list itself. This design makes it really simple to a) check whether the list is empty, and b) debug lists, because the pointers of unused nodes are set to the so-called POISON, a magic number specific to list pointers in the entire kernel.
1) non-initialized list
+-------------+
| HEAD |
| prev | next |
|POISON POISON|
+-------------+
2) empty list
+----------+-----------+
| | |
| | |
| +------v------+ |
| | HEAD | |
+---+ prev | next +----+
| HEAD HEAD |
+-------------+
3) list with one element
+--------------+--------------+
| | |
| | |
| +------v------+ |
| | HEAD | |
| +---+ prev | next +--+ |
| | |ITEM1 ITEM1| | |
| | +-------------+ | |
| +--------------------+ |
| | |
| +------v------+ |
| | ITEM1 | |
+-------+ prev | next +-------+
| DATA1 |
+-------------+
4) two items in the list
+----------+
| |
| |
| +------v------+
| | HEAD |
+------+ prev | next +----+
| | |ITEM2 ITEM1| |
| | +-------------+ |
+----------------------------+
| | | |
| | | +------v------+
| | | | ITEM1 |
| | +---+ prev | next +----+
| | | | DATA1 | |
| | | +-------------+ |
| +-------------------------+
| | |
| | +------v------+
| | | ITEM2 |
+---------+ prev | next +----+
| | DATA2 | |
| +-------------+ |
| |
+----------------------+
In the lockless algorithm, only the next pointer is guaranteed to be consistent. That guarantee did not always exist; commit 2f073848c3cc introduced it.

Binary Search Trees Switching subtrees

So I'm trying to write a function that when given two pointers to nodes in the BST, will 'switch' the subtree locations.
typedef struct NODE {
    struct NODE* parent;
    struct NODE* left;
    struct NODE* right;
}node_t;
This is the node struct I have for the BST.
My function goes along these lines:
void switch_subtree(node_t* a, node_t* b)
{
    if (a==NULL || b==NULL)
    {
        return;
    }
    if (a->parent->left == a)
    {
        a->parent->left = b;
    }
    else
    {
        a->parent->right = b;
    }
    if (b->parent->left == b)
    {
        b->parent->left = a;
    }
    else
    {
        b->parent->right = a;
    }
    node_t * temp = a;
    a->parent = b->parent;
    b->parent = temp->parent;
}
However, when I run it, it does not properly switch the subtrees.
Can anyone point out any errors I'm making and point me in the right direction?
Thanks!!!
Your problem is here:
node_t * temp = a;
a->parent = b->parent;
b->parent = temp->parent;
It should read:
node_t * temp = a->parent;
a->parent = b->parent;
b->parent = temp;
Otherwise the original value of a->parent is lost for good after the second line.
rationale
wrong approach
The line temp = a will make both pointers temp and a point to the same NODE structure:
+- > +--------+ +- > +--------+
| | | | | |
| | ... | | | ... |
| +--------+ | +--------+
| |
+--------------+ +----------------+
| |
+--------+- > +--------+ | +- > +---------+ |
| | | parent |-+ | | parent |-+
| | | ... | | | ... |
| | +--------+ | +---------+
| | |
+------+ | +---+ | +---+ |
| temp |-+ | a |-+ | b |-+
+------+ +---+ +---+
Changing a->parent in line 2 (a->parent = b->parent) will also change temp->parent as both are just different names for the same component (parent) of the same NODE structure:
+--------+ +---+- > +--------+
| | | | | |
| ... | | | | ... |
+--------+ | | +--------+
| |
| +--------------+
| |
+--------+- > +--------+ | +- > +---------+
| | | parent |---+ | | parent |
| | | ... | | | ... |
| | +--------+ | +---------+
| | |
+------+ | +---+ | +---+ |
| temp |-+ | a |-+ | b |-+
+------+ +---+ +---+
The assignment b->parent = temp->parent doesn't change anything at all, as both b->parent and temp->parent are already pointing at the same node.
That is the mistake: b keeps its old parent.
alternative
Taking a look at the proposed alternative, temp = a->parent will leave you with the situation sketched below:
+---------+- > +--------+ +- > +--------+
| | | | | | |
| | | ... | | | ... |
| | +--------+ | +--------+
| | |
| +--------------+ +----------------+
| | |
| +- > +--------+ | +- > +---------+ |
| | | parent |-+ | | parent |-+
| | | ... | | | ... |
| | +--------+ | +---------+
| | |
+------+ | +---+ | +---+ |
| temp |-+ | a |-+ | b |-+
+------+ +---+ +---+
After a->parent = b->parent temp is still pointing to the original parent node of the node pointed to by a:
+----------- > +--------+ +- > +--------+
| | | | | |
| | ... | | | ... |
| +--------+ | +--------+
| |
| +-----+----------------+
| | |
| +- > +--------+ | +- > +---------+ |
| | | parent |-+ | | parent |-+
| | | ... | | | ... |
| | +--------+ | +---------+
| | |
+------+ | +---+ | +---+ |
| temp |-+ | a |-+ | b |-+
+------+ +---+ +---+
Finally assigning b->parent = temp will give the node pointed to by b the right parent:
+--------+-- > +--------+ +----- > +--------+
| | | | | | |
| | | ... | | | ... |
| | +--------+ | +--------+
| | |
| +-----------------|--------------------+
| | |
| +- > +--------+ | +- > +---------+ |
| | | parent |---+ | | parent |-+
| | | ... | | | ... |
| | +--------+ | +---------+
| | |
+------+ | +---+ | +---+ |
| temp |-+ | a |-+ | b |-+
+------+ +---+ +---+

Resources