What is a node in C?

I'm working on CS50 pset5 speller, and in the lecture they introduce a new thing called nodes. What is a node? I didn't really understand what they said in the video. When I tried googling it, I found some sites that explained what a node is, but I didn't really understand them. I'm new to C, so I'm not accustomed to what I call 'coding words'. For instance, I found this on a site about nodes: "A dynamic array can be extended by doubling the size but there is overhead associated with the operation of copying old data and freeing the memory associated with the old data structure." What is that supposed to mean? Please help me figure out what a node is, because they seem important and useful, especially for pset5.
My node is defined like this:
typedef struct node
{
char word[LENGTH + 1];
struct node *next;
}
node;
Here is the link to the walk-through of speller pset5: https://cs50.harvard.edu/x/2020/psets/5/speller/

Node is the common term for a single element (block) of a linked list, a tree, or a related data structure.
Naming it node is only a convention; you can give it any name you like.
Standard
C++
struct node{
int data;
struct node *next; // pointer to the next node, not to an int
};
or in Python
class Node:
def __init__(self, data, next= None):
self.data = data
self.next = next
But you can give it any name
Not Standard
C++
struct my_own_name{
int data;
struct my_own_name *nextptr; // pointer to the next element, not to an int
};
or in python
class my_own_name:
def __init__(self, data, next=None):
self.data = data
self.next = next

A "node" is a concept from graph theory. A graph consists of nodes (vertices) and edges that connect the nodes.
A node in C can be represented as a structure (a struct) that has all the necessary data elements "on board" to implement a graph. Optionally a structure may be required that represents the edges.
Example:
typedef struct NODE {
int node_id;
struct EDGE *edgelist;
} tNode;
typedef struct EDGE {
tNode *from, *to;
struct EDGE *next;
} tEdge;
Note: the term "node" may also be used in other contexts, for example the nodes of a binary tree, the nodes of a list, etc.
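As a rough sketch of how these two structures could be used together, the code below attaches edges to a node's edgelist. The helper names add_edge, out_degree and free_edges are made up for this illustration; they are not part of any library, and the choice to store only outgoing edges in edgelist is an assumption.

```c
#include <stdlib.h>

typedef struct NODE {
    int node_id;
    struct EDGE *edgelist;   /* head of this node's list of outgoing edges */
} tNode;

typedef struct EDGE {
    tNode *from, *to;
    struct EDGE *next;       /* next edge in the same node's edge list */
} tEdge;

/* Hypothetical helper: add a directed edge from -> to at the front of
   from's edge list. Returns 1 on success, 0 if malloc fails. */
int add_edge(tNode *from, tNode *to)
{
    tEdge *e = malloc(sizeof *e);
    if (e == NULL)
        return 0;
    e->from = from;
    e->to = to;
    e->next = from->edgelist;
    from->edgelist = e;
    return 1;
}

/* Hypothetical helper: count the edges that leave node n. */
int out_degree(const tNode *n)
{
    int count = 0;
    for (const tEdge *e = n->edgelist; e != NULL; e = e->next)
        count++;
    return count;
}

/* Hypothetical helper: free all edges that leave node n. */
void free_edges(tNode *n)
{
    while (n->edgelist != NULL) {
        tEdge *e = n->edgelist;
        n->edgelist = e->next;
        free(e);
    }
}
```

The nodes themselves can live anywhere (on the stack, in an array, or on the heap); only the edges are allocated here.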

Expanding on Ahmad's answer, there are a number of data structures that are built of elements commonly called "nodes" - each node contains some data and some kind of reference (typically a pointer in C and C++) to one or more other nodes. For a singly-linked list, the node definition typically looks like
struct node {
data_t data; // for some arbitrary data_t type
struct node *next;
};
Each node contains the address of the following node. A graphical representation typically looks like
+------+------+        +------+------+      +------+------+
| data | next |------->| data | next |----->| data | next |------|||
+------+------+        +------+------+      +------+------+
You can also have a doubly linked list, where each node points to both the preceding and following nodes:
struct node {
data_t data;
struct node *prev;
struct node *next;
};
And there are binary trees, where each node points to left and right child nodes:
struct node {
data_t data;
struct node *left;
struct node *right;
};
The use of the term "node" is just a common naming convention.
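To make the singly-linked case concrete, here is a minimal sketch built around the node type from the question. LENGTH is set to 45 as an assumption (in the pset the real value comes from the provided header), and the helper names push_front, list_length and free_list are invented for this example.

```c
#include <stdlib.h>
#include <string.h>

#define LENGTH 45   /* assumed maximum word length */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
}
node;

/* Allocate a node holding a copy of word and link it in front of head.
   Returns the new first node, or NULL if the word is too long or
   malloc fails (head itself is left untouched in that case). */
node *push_front(node *head, const char *word)
{
    if (strlen(word) > LENGTH)
    {
        return NULL;
    }
    node *n = malloc(sizeof *n);
    if (n == NULL)
    {
        return NULL;
    }
    strcpy(n->word, word);
    n->next = head;
    return n;
}

/* Count the nodes by following next pointers until NULL. */
size_t list_length(const node *head)
{
    size_t count = 0;
    for (const node *cur = head; cur != NULL; cur = cur->next)
    {
        count++;
    }
    return count;
}

/* Free every node in the list. */
void free_list(node *head)
{
    while (head != NULL)
    {
        node *next = head->next;   /* read next before freeing the node */
        free(head);
        head = next;
    }
}
```

A caller would keep a single node *head variable, initially NULL, and check the result of push_front before overwriting head with it.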
A dynamic array can be extended by doubling the size but there is overhead associated with the operation of copying old data and freeing the memory associated with the old data structure. What is that supposed to mean?
You can resize a dynamically allocated buffer using the realloc library function. For example, suppose we want to dynamically allocate a buffer to
store the string "foo". We'd write something like:
size_t bufsize = 4;
char *buffer = malloc( bufsize );
if ( buffer )
strcpy( buffer, "foo" );
We'll imagine the address returned from malloc is 0x1000:
        +---+---+---+---+
0x1000: |'f'|'o'|'o'| 0 |
        +---+---+---+---+
0x1004: | ? | ? | ? | ? |
        +---+---+---+---+
         ... ... ... ...
Now, suppose we want to append the string "bar" to "foo". We didn't allocate a large enough buffer to do that, so we need to resize it using the realloc library function:
char *tmp = realloc( buffer, bufsize * 2 ); // double the buffer size
if ( tmp )
{
buffer = tmp;
bufsize *= 2;
strcat( buffer, "bar" );
}
else
{
// could not extend buffer, handle as appropriate
}
Now, if possible, realloc will just grab the space following the current buffer, so the result of that code would be:
        +---+---+---+---+
0x1000: |'f'|'o'|'o'|'b'|
        +---+---+---+---+
0x1004: |'a'|'r'| 0 | ? |
        +---+---+---+---+
         ... ... ... ...
However, if the memory at 0x1004 had already been allocated for something else, then we can't do that. realloc will have to allocate a new buffer at a different address and copy the contents of the current buffer into it, then deallocate the original buffer. We'll imagine that the first region of free space large enough starts at 0x100c:
        +---+---+---+---+
0x1000: |'f'|'o'|'o'| 0 |
        +---+---+---+---+
0x1004: | ? | ? | ? | ? |
        +---+---+---+---+
         ... ... ... ...
        +---+---+---+---+
0x100c: | ? | ? | ? | ? |
        +---+---+---+---+
0x1010: | ? | ? | ? | ? |
        +---+---+---+---+
So realloc must first allocate the 8 bytes starting at 0x100c, then it must copy the contents of the current buffer to that new space:
        +---+---+---+---+
0x1000: |'f'|'o'|'o'| 0 |
        +---+---+---+---+
0x1004: | ? | ? | ? | ? |
        +---+---+---+---+
         ... ... ... ...
        +---+---+---+---+
0x100c: |'f'|'o'|'o'| 0 |
        +---+---+---+---+
0x1010: | ? | ? | ? | ? |
        +---+---+---+---+
and then finally release the space at 0x1000. We append "bar" to this new buffer, giving us:
        +---+---+---+---+
0x1000: |'f'|'o'|'o'| 0 | // freed memory is not necessarily overwritten
        +---+---+---+---+
0x1004: | ? | ? | ? | ? |
        +---+---+---+---+
         ... ... ... ...
        +---+---+---+---+
0x100c: |'f'|'o'|'o'|'b'|
        +---+---+---+---+
0x1010: |'a'|'r'| 0 | ? |
        +---+---+---+---+
If realloc cannot find a large enough region to satisfy the request, it will return NULL and leave the current buffer in place. This is why we assign the return value of realloc to a different pointer variable - if we assigned that NULL back to buffer, then we'd lose our access to the original buffer.
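The steps above can be wrapped in a small helper. This is only a sketch: the name append_str is made up, and it assumes that *buf already holds a NUL-terminated string in a malloc'd buffer of *bufsize bytes, with *bufsize greater than zero.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: append src to the string stored in *buf, doubling
   *bufsize until the result fits. Returns 1 on success. Returns 0 if the
   precondition is violated or realloc fails; in both cases the original
   buffer is left untouched and still valid. */
int append_str(char **buf, size_t *bufsize, const char *src)
{
    if (*buf == NULL || *bufsize == 0)
        return 0;

    size_t needed = strlen(*buf) + strlen(src) + 1;   /* +1 for the terminator */
    size_t newsize = *bufsize;
    while (newsize < needed)
        newsize *= 2;

    if (newsize != *bufsize) {
        char *tmp = realloc(*buf, newsize);   /* never assign straight to *buf */
        if (tmp == NULL)
            return 0;
        *buf = tmp;
        *bufsize = newsize;
    }
    strcat(*buf, src);
    return 1;
}
```

Called with a 4-byte buffer holding "foo" and the string "bar", this grows the buffer to 8 bytes, just like the hand-written version above. Whether the data is copied to a new address or extended in place is up to realloc and is invisible to the caller.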

A 'node' is not a C keyword.
The meaning of this:
A dynamic array can be extended by doubling the size but there is overhead associated with the operation of copying old data and freeing the memory associated with the old data structure
Dynamic allocation means that memory is allocated on the heap. The size of the memory space allocated does not have to be a compile time constant as in static memory allocation, and thus can be modified by reallocating more memory later on in the program's execution.
Overhead means the additional cost of doing an operation in comparison to some other way of doing the same operation. In this case increasing a dynamic array's size is an overhead in comparison with directly allocating the total required space.
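To see where that overhead comes from, here is a small sketch of a growable int array that counts how often it has to reallocate. The type int_vec and the function vec_push are invented for this example; the grows counter exists only to make the cost visible.

```c
#include <stdlib.h>

/* Hypothetical growable array of int. */
typedef struct
{
    int *items;
    size_t len;     /* elements in use */
    size_t cap;     /* elements allocated */
    size_t grows;   /* number of times the buffer had to be reallocated */
} int_vec;

/* Append value, doubling the capacity when the array is full.
   Returns 1 on success, 0 if realloc fails (the array stays valid). */
int vec_push(int_vec *v, int value)
{
    if (v->len == v->cap) {
        size_t newcap = v->cap ? v->cap * 2 : 1;
        /* realloc may have to copy every existing element to a new block
           and release the old one: this is the overhead in question */
        int *tmp = realloc(v->items, newcap * sizeof *tmp);
        if (tmp == NULL)
            return 0;
        v->items = tmp;
        v->cap = newcap;
        v->grows++;
    }
    v->items[v->len++] = value;
    return 1;
}
```

Starting from a zero-initialized int_vec, pushing 100 values triggers 8 reallocations (capacities 1, 2, 4, ..., 128), whereas allocating room for 100 elements up front would need just one allocation. A linked list avoids the copying entirely, at the price of one allocation and one pointer per node.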

Related

what is the difference between *root and **root?

I was iterating a tree data structure which has a pointer to its root as follows-
struct node *root;
when I have to pass reference of this root as a parameter to a function.... I have to pass it like-
calcHeight(&root);
-
-
-
//somewhere
int calcHeight(struct node **root) // function definition is this
My question is- why do we need to pass "root" pointer as &root? can't we just pass root like--
struct node *root;
calcHeight(root);
int calcHeight(struct node *root);
// EDIT
void someFunct(int *arr){
printf("arr2 inside someFunct is %d\n",arr[2]);
arr[2]=30;
}
int main()
{
int *arr=(int*)calloc(10,sizeof(int));
printf("arr[2] is %d\n",arr[2]);
someFunct(arr);
printf("arr[2] finally is %d\n",arr[2]);
return 0;
}
In this case arr in main function is modified even when I'm not passing the address of arr.
I'm getting the fact that for structures and single value vars we HAVE to pass the address like someFunct(&var) but this is not necessary for arrays? for arrays we write someFunct(arr)
But I'm not getting the reason behind this?
struct node * is a pointer to a struct node.
struct node ** is a pointer to a pointer to a struct node.
The reason for passing in a struct node ** could be that the function needs to modify what the struct node * is actually pointing at - which seems odd for a function named calcHeight. Had it been freeNode it could have made sense. Example:
void freeNode(struct node **headp) {
free(*headp);
*headp = NULL; // make the struct node * passed in point at NULL
}
Another reason could be to keep the interface consistent, so that one always supplies a struct node ** to every function that works on struct nodes, not only to those that actually need to change what the struct node * is pointing at.
Regarding the added // EDIT part:
In this scenario there is no reason to send in a pointer-to-pointer. If you do not need to change the actual pointer, you only need to send in the value of the pointer.
Example memory layout:
Address  What's stored there
+-----+
| +0  | uint64_t ui1 = 1   <--+
+-----+                       |
| +8  | uint64_t ui2 = 2      |
+-----+                       |
| +16 | uint64_t* p = &ui1 ---+
+-----+
Now, if a function only needs a uint64_t value, you can send in ui1, ui2 or *p to that function.
void someFunc(uint64_t val) { ++val; ... }
The changes this function makes to val are not visible to the caller of the function.
If a function is supposed to be able to make changes that are visible to the caller of the function, send in a pointer:
void someFunc(uint64_t *valp) { *valp = 10; }
Calling it with someFunc(&ui1); or someFunc(p); will change ui1 and assign 10 to it.
If you have a pointer and want to change what it's actually pointing at, which is what your original question was asking, you would need to send in a pointer to that pointer:
void someFunc(uint64_t **valpp) { *valpp = &ui2; }
If you call that with someFunc(&p) (where p is currently pointing at ui1) you will find that after the function call, p will point at ui2:
+-----+
| +0  | uint64_t ui1 = 1
+-----+
| +8  | uint64_t ui2 = 2   <--+
+-----+                       |
| +16 | uint64_t* p = &ui2 ---+
+-----+
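The three cases can be put side by side in one compilable sketch. The function names by_value, by_pointer and retarget are made up here to keep the three variants of someFunc apart, and retarget takes the new target as a parameter instead of referring to ui2 directly.

```c
#include <stdint.h>

/* Receives a copy of the value: the caller's variable is not affected. */
void by_value(uint64_t val)
{
    ++val;        /* changes only the local copy */
    (void)val;    /* silence "set but not used" warnings */
}

/* Receives the address of a uint64_t: can change the pointed-to value. */
void by_pointer(uint64_t *valp)
{
    *valp = 10;
}

/* Receives the address of a pointer: can change where that pointer points. */
void retarget(uint64_t **valpp, uint64_t *target)
{
    *valpp = target;
}
```

With uint64_t ui1 = 1, ui2 = 2, *p = &ui1; the call by_value(ui1) leaves ui1 at 1, by_pointer(p) sets ui1 to 10, and retarget(&p, &ui2) makes p point at ui2 without touching either integer.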
Because in calcHeight you're passing your argument by value. If you want the function to modify root itself (that is, which node it points to), you need to pass the address of the pointer.
First one is a pointer to node which is a structure.
struct node *root;
defines root as a variable which can store the address of a node.
Second one is a pointer to a pointer to node which is a structure.
struct node **root;
defines root as variable which can store address of another variable which has the address of a node.
why do we need to pass "root" pointer as &root?
calcHeight(&root);
C passes arguments by value, not by reference. So you have to pass the address of root if the function needs to modify the value of root.
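Applied to a tree, the distinction might look like the sketch below. The layout of struct node (data, left, right) is an assumption, since the question does not show it, and insert and freeTree are hypothetical helpers. calcHeight only reads the tree, so a plain struct node * is enough; insert may have to change the caller's root pointer when the tree is empty, so it takes a struct node **.

```c
#include <stdlib.h>

/* Assumed node layout for a binary search tree. */
struct node
{
    int data;
    struct node *left;
    struct node *right;
};

/* Read-only: a copy of the root pointer is all it needs. */
int calcHeight(const struct node *root)
{
    if (root == NULL)
        return 0;
    int l = calcHeight(root->left);
    int r = calcHeight(root->right);
    return 1 + (l > r ? l : r);
}

/* Hypothetical helper: insert data into the search tree. It must be able
   to overwrite the caller's pointer, so it receives that pointer's address.
   Returns 1 on success, 0 if malloc fails. */
int insert(struct node **root, int data)
{
    /* walk down until *root is the empty slot where the new node belongs */
    while (*root != NULL)
        root = (data < (*root)->data) ? &(*root)->left : &(*root)->right;

    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    *root = n;
    return 1;
}

/* Hypothetical helper: free the whole tree. */
void freeTree(struct node *root)
{
    if (root == NULL)
        return;
    freeTree(root->left);
    freeTree(root->right);
    free(root);
}
```

With struct node *root = NULL; the calls are insert(&root, 5) but calcHeight(root).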

Pointers in Structure pointing to another structure

I am trying to understand how pointers in linked lists work. So far I am having lots of trouble trying to figure out where the pointer is pointing to and how a pointer of a type struct works (I know we need to allocate memory from the heap but can't quite understand it, but maybe that's a different question altogether).
Lets take this structure:
typedef struct Node {
int data;
struct Node *link;
} Node;
What I think will happen now is:
Say you have a pointer of type Node in the main function, Node* p and this is allocated memory (using malloc).
Now if we store some data with p->data=5;, p points to the beginning of this data (at least this is what I think is happening).
Where exactly does link point to?
So now, I come across this particular piece of code:
typedef struct Node {
int data;
struct Node *link;
} Node;
typedef struct List {
Node* head;
int number_of_nodes;
} List;
So this is complete chaos in my brain! .
Now in the structure List, what is head doing? What is it pointing to? And how would you create a linked list at all with these two lists??
I am really trying my level best to understand how linked lists work, but all the pointers make it too hard to keep track of. You might suggest I start with something simple, and I did, and I have already mentioned how much I understand. But the head pointer in the second structure has completely thrown me off track!
It would make my life so much easier if someone could help explain it while keeping track of the pointers.
Where exactly does link point to?
link points to another object of the same type:
+------+------+     +------+------+     +------+------+
| data | link |---->| data | link |---->| data | link | ----> ...
+------+------+     +------+------+     +------+------+
Now in the structure List, what is head doing? What is it pointing to?
head points to the first node in a list:
+-----------------+     +------+------+     +------+------+
| head            |---->| data | link |---->| data | link |----> ...
+-----------------+     +------+------+     +------+------+
| number_of_nodes |
+-----------------+
I am really trying my level best to understand how linked lists work,
Don't feel bad - linked lists threw me for a loop in my Data Structures class (my first "hard" CS class). It took me a solid week longer than my classmates to grok the concept. Hopefully the pictures help.
Edit
what happens if you have a pointer to the structure List, memory allocated and all? Where does it point to then (according to the diagrams, which did help by the way)
So, let's assume you have the following code:
/**
* Create a new list object. head is initially NULL,
* number_of_nodes initially 0.
*/
List *newList( void )
{
List *l = malloc( sizeof *l );
if ( l )
{
l->head = NULL;
l->number_of_nodes = 0;
}
return l;
}
int main( void )
{
List *l = newList();
...
}
Then your picture looks like this:
+---------+       +--------------------+
| l: addr | ----> | head: NULL         |
+---------+       +--------------------+
                  | number_of_nodes: 0 |
                  +--------------------+
(addr represents some arbitrary memory address)
Now let's say you add a node to your list:
/**
* Create a new node object, using the input data
* link is initially NULL
*/
Node *newNode( int data )
{
Node *n = malloc( sizeof *n );
if ( n )
{
n->data = data;
n->link = NULL;
}
return n;
}
void insertNode( List *l, int data )
{
Node *n = newNode( data );
if ( n )
{
/**
* If list is initially empty, make this new node the head
* of the list. Otherwise, add the new node to the end of the
* list.
*/
if ( !l->head ) // or l->head == NULL
{
l->head = n;
}
else
{
/**
* cur initially points to the first element in the list.
* While the current element has a non-NULL link, follow
* that link.
*/
Node *cur = l->head; // declared before the loop so it is still in scope after it
while ( cur->link != NULL )
cur = cur->link;
cur->link = n;
}
l->number_of_nodes++;
}
}
int main( void )
{
List *l = newList();
insertNode( l, 5 );
...
}
Now your picture looks like this:
+---------+       +--------------------+      +------------+
| l: addr | ----> | head: addr         | ---> | data: 5    |
+---------+       +--------------------+      +------------+
                  | number_of_nodes: 1 |      | link: NULL |
                  +--------------------+      +------------+
You could add another node:
int main( void )
{
List *l = newList();
insertNode( l, 5 );
insertNode( l, 3 );
...
}
then your picture becomes
+---------+       +--------------------+      +------------+        +------------+
| l: addr | ----> | head: addr         | ---> | data: 5    |   +--> | data: 3    |
+---------+       +--------------------+      +------------+   |    +------------+
                  | number_of_nodes: 2 |      | link: addr | --+    | link: NULL |
                  +--------------------+      +------------+        +------------+
Naturally, you'd want to add some error checking and messages in case a node couldn't be allocated (it happens). And you'd probably want an ordered list, where elements are inserted in order (ascending, descending, whatever). But this should give you a flavor of how to build lists.
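As a sketch of the ordered variant mentioned above, the function below keeps the list in ascending order. The name insertSorted and the return-value convention are assumptions made for this example; the Node and List types are repeated so the block compiles on its own.

```c
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *link;
} Node;

typedef struct List {
    Node *head;
    int number_of_nodes;
} List;

/* Hypothetical helper: insert data so that the list stays in ascending
   order. Returns 1 on success, 0 if the node could not be allocated. */
int insertSorted( List *l, int data )
{
    Node *n = malloc( sizeof *n );
    if ( !n )
        return 0;
    n->data = data;

    /* Find the first node whose data is not smaller than the new value;
       prev trails one node behind cur. */
    Node *prev = NULL;
    Node *cur = l->head;
    while ( cur != NULL && cur->data < data )
    {
        prev = cur;
        cur = cur->link;
    }

    n->link = cur;              /* new node goes in front of cur */
    if ( prev == NULL )
        l->head = n;            /* inserted at the front of the list */
    else
        prev->link = n;         /* inserted after prev */

    l->number_of_nodes++;
    return 1;
}
```

Inserting 5, 3, 9 and 4 in that order produces the list 3, 4, 5, 9.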
You'd also need functions to remove items and free that memory. Here's how I'd free an entire list:
void freeList( List *l )
{
Node *prev, *cur = l->head;
while( cur && cur->link )
{
prev = cur;
cur = cur->link;
free( prev );
}
free( cur );
}
int main( void )
{
List *l = newList();
...
freeList( l );
free( l );
...
}
… a pointer of a type struct…
A pointer cannot be of a type struct. A pointer can point to a structure.
C has objects. Objects include char, int, double, structures, and other things. A structure is a collection of objects grouped together.
In main, if you define p with Node *p;, you then have a pointer p. It has no value because you have not given it a value. When you execute p = malloc(sizeof *p);, you request enough memory for the size of the thing p points to (*p). If malloc returns a non-null pointer, then p points to a Node structure.
Then p->data refers to the data member of that structure. p->data is shorthand for (*p).data, in which *p means “the object p points to” and .data means “the data member in that object.”
After p = malloc(sizeof *p); and p->data = 5;, p->link does not point to anything because you have not assigned it a value. In a linked list, you would use malloc to get memory for another Node, and then you would set the p->link in one Node to point to the new Node. In each Node, its link member points to the next Node in the list. Except, in the last Node, p->link is set to a null pointer to indicate it is the last Node.
In List, you would set head to point to the first Node in a list of Node objects.
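Written out by hand, without any insert function, those steps could look like the sketch below. The function name buildTwo and the values 5 and 7 are arbitrary choices for illustration.

```c
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *link;
} Node;

typedef struct List {
    Node *head;
    int number_of_nodes;
} List;

/* Hypothetical example: build the two-node list 5 -> 7 by hand.
   Returns 1 on success, 0 if either allocation fails. */
int buildTwo(List *list)
{
    Node *p = malloc(sizeof *p);   /* p now points to a Node structure */
    Node *q = malloc(sizeof *q);   /* q points to a second, separate Node */
    if (p == NULL || q == NULL) {
        free(p);                   /* free(NULL) is harmless */
        free(q);
        return 0;
    }

    p->data = 5;                   /* same as (*p).data = 5 */
    p->link = q;                   /* first node's link points to the second node */

    q->data = 7;
    q->link = NULL;                /* null pointer marks the last node */

    list->head = p;                /* head points to the first node */
    list->number_of_nodes = 2;
    return 1;
}
```

After the call, list->head->link->data is 7 and list->head->link->link is a null pointer.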

Having trouble understanding the definition of a linked list

I have some questions regarding the definition of a linked list as it was defined in my class.
This is what was used:
typedef struct node_t {
int x;
struct node_t *next;
} *Node;
Now, I understand that this way we created a shorter way to use pointers to the struct node_t. Node will be used as struct node_t*.
Now, say we want to create a linked list. For example:
Node node1 = malloc(sizeof(*node1));
Node node2 = malloc(sizeof(*node2));
Node node3 = malloc(sizeof(*node3));
node1->x = 1;
node1->next = node2;
node2->x = 4;
node2->next = node3;
node3->x = 9;
node3->next = NULL;
This is roughly how I imagine this (The circles represent the structures):
Now I know it's wrong, but I can't understand why. We have a pointer, node1, that points to our structure. Then, we point at node2, which points at another structure, and so on.
Another thing is, I can't understand how it is possible to have the longer arrows in the picture. Shouldn't we only be able to point to a structure from each lower part of the circle, and not to a pointer to a structure? How is this possible?
If anyone here could make things a little clearer it would be hugely appreciated. Thanks a lot.
You have three linked nodes, and additional local pointers pointing to them.
The nodes don't know anything about those local pointers though, even if it is often convenient to use their names to refer to the nodes.
Instead, they know the next node in the sequence, respectively the last node knows none.
Put another way, your image is flat-out wrong.
          +---+------+
node1 --> | 1 | next |
          +---+-|----+
                |
                v
          +---+------+
node2 --> | 4 | next |
          +---+-|----+
                |
                v
          +---+------+
node3 --> | 9 | NULL |
          +---+------+
Assignment copies the value of its right-hand side. So,
node1->next = node2;
would mean that node1->next points to whatever node2 was pointing to. And, in particular, node1->next does not point to node2 itself.
Each of node1, node2, and node3 name a variable that is a pointer.
  node1       node2       node3
  +---+       +---+       +---+
  | * |       | * |       | * |
  + | +       + | +       + | +
    v           v           v
+---+---+   +---+---+   +---+---+
| 1 | * --> | 4 | * --> | 9 | * --> NULL
+---+---+   +---+---+   +---+---+
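A short check makes the point about copied pointer values concrete. The function next_is_a_copy is invented for this illustration and uses the pointer typedef from the question as-is.

```c
#include <stdlib.h>

typedef struct node_t {
    int x;
    struct node_t *next;
} *Node;

/* Hypothetical check: returns 1 if node1->next keeps pointing at the
   second structure even after the variable node2 itself is changed,
   and -1 if memory could not be allocated. */
int next_is_a_copy(void)
{
    Node node1 = malloc(sizeof(*node1));
    Node node2 = malloc(sizeof(*node2));
    if (node1 == NULL || node2 == NULL) {
        free(node1);
        free(node2);
        return -1;
    }
    node1->x = 1;
    node2->x = 4;
    node2->next = NULL;

    node1->next = node2;     /* copies the address stored in node2 */

    Node second = node2;     /* remember where the second structure lives */
    node2 = NULL;            /* change the variable node2 ... */

    /* ... and node1->next is unaffected: it never pointed at the
       variable node2, only at the structure node2 pointed to */
    int ok = (node1->next == second) && (node1->next->x == 4);

    free(second);
    free(node1);
    return ok;
}
```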
typedef struct node_t {
int x;
struct node_t *next;
} *Node; /* <-- don't typedef pointers */
Simply make Node a typedef for the structure itself (not for a pointer to it), write Node * wherever you need a pointer, and then allocate with:
Node *node1 = malloc(sizeof(*node1));
Why? Somebody looking at your code 100 lines below the declaration of your typedef will not inherently know whether Node is a type, or whether it is a pointer-to-type. This type of confusion will only grow as your code grows in size. Review: Is it a good idea to typedef pointers?
(note: good job using the dereferenced pointer to set the typesize in sizeof)
A Linked List
A linked list is simply a clever data structure that allows you to iterate over a number of independently allocated nodes. Each node contains some data and then a pointer to the next node in the list, or NULL if that node is the final node in the list.
(for a doubly-linked list, you simply add a prev pointer that also points to the node before the current node in the list. You also have circular lists where the last node points back to the first allowing iteration from any node to any other node in the list regardless of which node you begin iterating with. For a doubly-linked circular list, you can iterate the entire list in both directions from any node)
In your case, your list is simply:
node1      +-> node2      +-> node3
+------+   |   +------+   |   +------+
| data |   |   | data |   |   | data |
|------|   |   |------|   |   |------|
| next |---+   | next |---+   | next |---> NULL
+------+       +------+       +------+
Where your data is a single integer value and your next pointer simply holds the address of the next node in your list, or NULL if it is the final node in the list. Adding your data, your list would be:
node1      +-> node2      +-> node3
+------+   |   +------+   |   +------+
|  1   |   |   |  4   |   |   |  9   |
|------|   |   |------|   |   |------|
| next |---+   | next |---+   | next |---> NULL
+------+       +------+       +------+
When creating a list, the first node is usually referred to as the head of the list and the last node the tail of the list. You must always preserve a pointer to the head of your list as that pointer holds the beginning list-address. For efficient insertions into the list, it is also a good idea to keep a pointer to the tail node so you can simply insert the new node without iterating over the list to find the last node each time, e.g.:
Node *newnode = malloc(sizeof(*newnode)); /* allocate */
newnode->next = NULL; /* initialize next NULL */
tail->next = newnode; /* assign to tail */
tail = newnode; /* set new tail at newnode */
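The four lines above assume the list already has at least one node. A fuller sketch that also handles the empty list and a failed allocation might look like this; the List wrapper holding head and tail, and the function name append, are assumptions made for this example.

```c
#include <stdlib.h>

typedef struct node_t {
    int x;
    struct node_t *next;
} Node;

/* Hypothetical wrapper that keeps both ends of the list. */
typedef struct {
    Node *head;
    Node *tail;
} List;

/* Append x at the end of the list in constant time.
   Returns 1 on success, 0 if malloc fails. */
int append (List *list, int x)
{
    Node *newnode = malloc (sizeof *newnode);   /* allocate */
    if (newnode == NULL)                        /* validate */
        return 0;
    newnode->x = x;
    newnode->next = NULL;                       /* new node will be the last one */

    if (list->tail == NULL)                     /* empty list: node is also head */
        list->head = newnode;
    else
        list->tail->next = newnode;             /* link after the current tail */

    list->tail = newnode;                       /* new tail is newnode */
    return 1;
}
```

Starting from List list = { NULL, NULL }; and appending 1, 4 and 9 gives the same three-node list as in the diagram, with list.tail pointing at the node that holds 9.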
Lists are fundamental to C; many are used in the Linux kernel itself. Take the time to understand them and how to write them in the differing variants. You'll be glad you did. Lastly, don't forget to write a simple function to free your list when you are done (and free the data as well if it is allocated). A simple free_list function would be:
void free_list (Node *list)
{
while (list) {
Node *victim = list; /* separate pointer to node to free */
list = list->next; /* can you see why you iterate next... */
free (victim); /* before you free the victim node? */
}
}
Let me know if you have further questions.

delete an entry from a singly-linked list

So today I was watching The mind behind Linux | Linus Torvalds, Linus posted two pieces of code in the video, both of them are used for removing a certain element in a singly-linked list.
The first one (which is the normal one):
void remove_list_entry(linked_list* entry) {
linked_list* prev = NULL;
linked_list* walk = head;
while (walk != entry) {
prev = walk;
walk = walk->next;
}
if (!prev) {
head = entry->next;
} else {
prev->next = entry->next;
}
}
And the better one:
void remove_list_entry(linked_list* entry) {
// The "indirect" pointer points to the
// *address* of the thing we'll update
linked_list** indirect = &head;
// Walk the list, looking for the thing that
// points to the entry we want to remove
while ((*indirect) != entry)
indirect = &(*indirect)->next;
// .. and just remove it
*indirect = entry->next;
}
So I cannot understand the second piece of code, what happens when *indirect = entry->next; evaluates? I cannot see why it leads to the remove of the certain entry. Someone explains it please, thanks!
what happens when *indirect = entry->next; evaluates? I cannot see why it leads to the remove of the certain entry.
I hope you have a clear understanding of double pointers (see note 1 below).
Assume following:
Node structure is
typedef struct Node {
int data;
struct Node *next;
} linked_list;
and the linked list has 5 nodes, with the entry pointer pointing to the second node in the list. The in-memory view would be something like this:
                     entry -+
head                        |
+---+     +-------+     +-------+     +-------+     +-------+     +--------+
|   |---->| 1 |   |---->| 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
+---+     +-------+     +-------+     +-------+     +-------+     +--------+
This statement:
linked_list** indirect = &head;
will make the indirect pointer point to head.
                     entry -+
head                        |
+---+     +-------+     +-------+     +-------+     +-------+     +--------+
|   |---->| 1 |   |---->| 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
+---+     +-------+     +-------+     +-------+     +-------+     +--------+
  ^
  |
+---+
|   |
+---+
indirect
The while loop
while ((*indirect) != entry)
*indirect gives the address of the first node, because head points to the first node. Since entry points to the second node, the loop condition evaluates to true and the following code executes:
indirect = &(*indirect)->next;
This makes the indirect pointer point to the next pointer of the first node. The in-memory view:
                     entry -+
head                        |
+---+     +-------+     +-------+     +-------+     +-------+     +--------+
|   |---->| 1 |   |---->| 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
+---+     +-------+     +-------+     +-------+     +-------+     +--------+
                ^
                |
              +---+
              |   |
              +---+
             indirect
Now the while loop condition is evaluated again. Because the indirect pointer now points to the next pointer of the first node, *indirect gives the address of the second node, and since entry also points to the second node, the loop condition evaluates to false and the loop exits.
The following code will execute now:
*indirect = entry->next;
*indirect dereferences to the next pointer of the first node, which is now assigned the next of the node that entry points to. The in-memory view:
                     entry -+
head                        |
+---+     +-------+     +-------+     +-------+     +-------+     +--------+
|   |---->| 1 |   |--   | 2 |   |---->| 3 |   |---->| 4 |   |---->| 5 |NULL|
+---+     +-------+  \  +-------+     +-------+     +-------+     +--------+
          *indirect   \              /
                       +------------+
Now the next of the first node points to the third node in the list, and that way the second node is removed from the list.
I hope this clears up all of your doubts.
EDIT:
David suggested, in a comment, adding some detail on why the parentheses are required in &(*indirect)->next.
The type of indirect is linked_list **, which means it can hold the address of a pointer of type linked_list *.
*indirect gives the pointer of type linked_list *, and ->next gives that node's next pointer.
But we cannot write *indirect->next, because the -> operator has higher precedence than the unary * operator. So *indirect->next is interpreted as *(indirect->next), which is invalid because indirect is a pointer to a pointer, not a pointer to a structure, and so has no next member.
Hence we need () around *indirect.
Also, &(*indirect)->next is interpreted as &((*indirect)->next), which is the address of the next pointer.
1) If you don't know how a double pointer works, check below:
Let's take an example:
#include <stdio.h>
int main() {
int a=1, b=2;
int *p = &a;
int **pp = &p;
printf ("1. p : %p\n", (void*)p);
printf ("1. pp : %p\n", (void*)pp);
printf ("1. *p : %d\n", *p);
printf ("1. **pp : %d\n", **pp);
*pp = &b; // this changes the address that pointer p points to
printf ("2. p : %p\n", (void*)p);
printf ("2. pp : %p\n", (void*)pp);
printf ("2. *p : %d\n", *p);
printf ("2. **pp : %d\n", **pp);
return 0;
}
In the above code, in the statement *pp = &b;, you can see that without accessing pointer p directly we can change the address it points to by using a double pointer pp, which points to pointer p, because dereferencing the double pointer pp gives pointer p.
Its output:
1. p : 0x7ffeedf75a38
1. pp : 0x7ffeedf75a28
1. *p : 1
1. **pp : 1
2. p : 0x7ffeedf75a34 <=========== changed
2. pp : 0x7ffeedf75a28
2. *p : 2
2. **pp : 2
In-memory view would be something like this:
//Below in the picture
//100 represents 0x7ffeedf75a38 address
//200 represents 0x7ffeedf75a34 address
//300 represents 0x7ffeedf75a28 address
int *p = &a
  p         a
+---+     +---+
|100|---->| 1 |
+---+     +---+
int **pp = &p;
 pp         p         a
+---+     +---+     +---+
|300|---->|100|---->| 1 |
+---+     +---+     +---+
*pp = &b;
 pp         p         b
+---+     +---+     +---+
|300|---->|200|---->| 2 |
+---+     +---+     +---+
          ^^^^^     ^^^^^
The entry isn't really "deleted", it's just no longer in the list.
If this is your chain:
A --> B --> C --> D --> E --> ■
And you want to delete C, you're really just linking over it. It's still there in memory, but no longer accessible from your data structure.
            C
A --> B --------> D --> E --> ■
That last line sets the next pointer of B to D instead of C.
Instead of looping through the entries in the list, as the first example does, the second example loops through the pointers to the entries in the list. That allows the second example to have the simple conclusion with the statement you've asked about, which in English is "set the pointer that used to point to the entry I want to remove from the list so that it now points to whatever that entry was pointing to". In other words, it makes the pointer that was pointing to the entry you're removing point past the entry you're removing.
The first example has to have a special way to handle the unique case of the entry you want to remove being the first entry in the list. Because the second example loops through the pointers (starting with &head), it doesn't have a special case.
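Here is a self-contained sketch of the second approach that can be compiled and tried out. It differs from the code in the video in two ways, both of which are additions made for this example: the list head is passed in by address instead of being a global variable, and the loop also stops at the end of the list so that an entry that is not in the list is reported rather than walked past.

```c
#include <stddef.h>

typedef struct Node {
    int data;
    struct Node *next;
} linked_list;

/* Unlink entry from the list whose head pointer is *head.
   Returns 1 if entry was found and unlinked, 0 if it is not in the list.
   The node itself is not freed; that is left to the caller. */
int remove_list_entry(linked_list **head, linked_list *entry)
{
    /* indirect points at the pointer we may need to update:
       first the head pointer, later some node's next member */
    linked_list **indirect = head;

    while (*indirect != NULL && *indirect != entry)
        indirect = &(*indirect)->next;

    if (*indirect == NULL)
        return 0;                 /* reached the end without finding entry */

    *indirect = entry->next;      /* same statement for head and non-head */
    return 1;
}
```

Removing the first node changes the caller's head pointer, and removing any other node changes the previous node's next member, yet the code contains no branch that distinguishes the two cases.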
*indirect = entry->next;
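That assignment makes the pointer that indirect refers to skip over the entry and point to the node after it.
To remove the entry, the pointer that comes before the entry node has to end up pointing to the node after the entry node.
Another way to write this is to stop the loop at the node before the entry and update that node's next member.
Note that this form only works when the entry is not the first node in the list:
while ((*indirect)->next != entry)
indirect = &(*indirect)->next;
(*indirect)->next = entry->next;
I hope that helps.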
This example is both a great way of manipulating linked list structures in particular, but also a really excellent way of demonstrating the power of pointers in general.
When you delete an element from a singly-linked list, you have to make the previous node point to the next node, bypassing the node you're deleting. For example, if you're deleting node E, then whatever list pointer it is that used to point to E, you have to make it point to whatever E.next points to.
Now, the problem is that there are two possibilities for "whatever list pointer it is that used to point to E". Much of the time, it's some previous node's next pointer that points to E. But if E happens to be the first node in the list, that means there's no previous node in the list, and it's the top-level list pointer that points to E — in Linus's example, that's the variable head.
So in Linus's first, "normal" example, there's an if statement. If there's a previous node, the code sets prev->next to point to the next node. But if there's no previous node, that means it's deleting the node at the head of the list, so it sets head to point to the next node.
And although that's not the end of the world, it's two separate assignments and an if condition to take care of what we thought of in English as "whatever list pointer it is that used to point to E". And one of the crucial hallmarks of a good programmer is an unerring sense for sniffing out needless redundancy like this and replacing it with something cleaner.
In this case, the key insight is that the two things we might want to update, namely head or prev->next, are both pointers to a list node, or linked_list *. And one of the things that pointers are great at is pointing at a thing we care about, even if that thing might be, depending on circumstances, one of a couple of different things.
Since the thing we care about is a pointer to a linked_list, a pointer to the thing we care about will be a pointer to a pointer to a linked_list, or linked_list **.
And that's exactly what the variable indirect is in Linus's "better" example. It is, literally, a pointer to "whatever list pointer it is that used to point to E" (or, in the actual code, not E, but the passed-in entry being deleted). At first, the indirect pointer points to head, but later, after we've begun walking through the list to find the node to delete, it points at the next pointer of the node (the previous node) that points at the one we're looking at. So, in any case, *indirect (that is, the pointer pointed to by indirect) is the pointer we want to update. And that's precisely what the magic line
*indirect = entry->next;
does in the "better" example.
The other thing to notice (although this probably makes the code even more cryptic at first) is that the indirect variable also takes the place of the walk variable used in the first example. That is, everywhere the first example used walk, the "better" example uses *indirect. But that makes sense: we need to walk over all the nodes in the list, looking for entry. So we need a pointer to step over those nodes — that's what the walk variable did in the first example. But when we find the entry we want to delete, the pointer to that entry will be "whatever list pointer it is that used to point to E" — and it will be the pointer to update. In the first example, we couldn't set walk to prev->next — that would just update the local walk variable, not head or one of the next pointers in the list. But by using the pointer indirect to (indirectly) walk the list, it's always the case that *indirect — that is, the pointer pointed to by indirect — is the original pointer to the node we're looking at (not a copy sitting in walk), meaning it's something we can usefully update by saying *indirect = entry->next.
This will be much easier to understand if you rewrite
indirect = &(*indirect)->next;
as
indirect = &((*indirect)->next);
The while loop leaves indirect holding the address of the next pointer that points to entry, that is, the next field of the node just before entry. So the last statement changes the value of that next pointer so that it no longer points to entry.
And in the special case where entry is the first node, the body of the while loop never runs, indirect still holds the address of head, and the last line changes head to point to the node after entry.
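For comparison, the two approaches discussed above can be sketched side by side. This is a reconstruction under assumptions, not Linus's exact code: the function names remove_with_if and remove_indirect are hypothetical, the list head is passed in as a parameter instead of being a global, and both functions assume entry really is in the list.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed node layout for this sketch. */
typedef struct linked_list {
    int val;
    struct linked_list *next;
} linked_list;

/* "Normal" style: remember the previous node and branch on whether
 * entry is the first node. */
void remove_with_if(linked_list **headp, linked_list *entry)
{
    assert(headp != NULL && entry != NULL);

    linked_list *prev = NULL;
    linked_list *walk = *headp;

    while (walk != entry) {
        prev = walk;
        walk = walk->next;
    }
    if (prev == NULL)
        *headp = entry->next;       /* entry was the first node */
    else
        prev->next = entry->next;   /* entry had a predecessor */
}

/* "Better" style: indirect always holds the address of the pointer
 * that currently points at the node under inspection. */
void remove_indirect(linked_list **headp, linked_list *entry)
{
    assert(headp != NULL && entry != NULL);

    linked_list **indirect = headp;

    while (*indirect != entry)
        indirect = &((*indirect)->next);
    *indirect = entry->next;        /* one assignment covers both cases */
}
```

Both functions leave the list in the same state; the second one simply has no special case for the head.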

C Confusion about pointer to pointers memory allocation?

I apologize if this might be viewed as a duplicate, but I cannot seem to find a conclusive answer that satisfies my question.
So I have a struct with a self referential pointer to pointers.
struct Node {
int id;
int edge_count;
struct Node **edges;
};
static struct Node s_graph[MAX_ID+1];
I then have a function that allocates some memory.
int add_edge(int tail, int head)
{
struct Node *ptail, *phead;
ptail = &s_graph[tail];
phead = &s_graph[head];
ptail->edges = realloc(ptail->edges, ++ptail->edge_count * sizeof(struct Node *));
if (ptail->edges) {
*(ptail->edges + ptail->edge_count - 1) = phead;
return 0;
}
return -1;
}
The above seems to work just fine. However, I keep seeing posts about pointer to pointers that lead me to wonder if I need to do something like the following in add_edge:
struct Node *phead = malloc(sizeof(struct Node *));
However, this does not seem logical. There should be enough memory for ptail->edges to store this pointer after the realloc call. I am fairly confident that I did the allocation correctly (albeit, inefficiently), but it is kind of sending me on a mind trip ... So when people declare pointer to pointers (e.g., **ptr) and then allocate memory for both ptr and *ptr, wouldn't that technically make ptr a pointer to pointers to pointers (and maybe clearer to declare as ***ptr)? Or maybe I am wrong and missing something conceptually?
Thank you in advance!
It depends on the situation; there is no general answer. If you have a pointer to pointer, e.g. struct Node **, and you want it to hold newly created nodes, then you need two levels of allocation: one for the array of pointers and one for each node. Otherwise one level is enough.
struct Node** nodes = calloc(AMOUNT, sizeof(struct Node*));
Now you have an array of struct Node* elements, so each element is a pointer to a struct Node.
Now how do you fill this array? You might want to insert new nodes into it. Then you would need to allocate them, e.g.
nodes[0] = calloc(1, sizeof(struct Node)); // <- mind Node, not Node*
But in your situation you just want to store the address of an element of the static array s_graph, so you don't need to allocate a second level; you set the value directly.
So:
struct Node** nodes = calloc(AMOUNT, sizeof(struct Node*));
nodes -> | 0 | 1 | 2 | 3 |
nodes[0] = calloc(1, sizeof(struct Node))
nodes -> | 0 | 1 | 2 | 3 |
           |
           |
           v
         | NODE |
But if you have s_graph you already have them allocated, so it's something like:
static struct Node s_graph[MAX_ID+1];
struct Node** nodes = calloc(AMOUNT, sizeof(struct Node*));
nodes -> | 0 | 1 | 2 | 3 |
s_graph ->    | N1 | N2 | N3 |
nodes[0] = &s_graph[0];
nodes -> | 0 | 1 | 2 | 3 |
           |
           |----|
                v
s_graph ->    | N1 | N2 | N3 |
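To make the second case concrete, here is a minimal sketch of add_edge that stores borrowed pointers into s_graph with a single level of allocation. It differs from the question's version in one respect, which is an addition of this sketch: the result of realloc goes into a temporary (named grown here) so that the old array and edge_count stay intact if the call fails. The value of MAX_ID is also an assumption, since the question does not give one.

```c
#include <assert.h>
#include <stdlib.h>

#define MAX_ID 7    /* assumed value; not given in the question */

struct Node {
    int id;
    int edge_count;
    struct Node **edges;    /* array of pointers to nodes that live elsewhere */
};

/* Static storage is zero-initialized, so every edges pointer starts as NULL. */
static struct Node s_graph[MAX_ID + 1];

/* Grow tail's edge array by one slot and store the address of an
 * already existing node in it. Returns 0 on success, -1 on failure. */
int add_edge(int tail, int head)
{
    assert(tail >= 0 && tail <= MAX_ID);
    assert(head >= 0 && head <= MAX_ID);

    struct Node *ptail = &s_graph[tail];
    struct Node *phead = &s_graph[head];

    /* Only the array of pointers is allocated; the nodes themselves
     * already exist inside s_graph. realloc(NULL, n) acts like malloc(n). */
    struct Node **grown = realloc(ptail->edges,
            (size_t)(ptail->edge_count + 1) * sizeof(struct Node *));
    if (grown == NULL)
        return -1;          /* old array and edge_count are unchanged */

    grown[ptail->edge_count++] = phead;
    ptail->edges = grown;
    return 0;
}
```

After add_edge(0, 1), the expression s_graph[0].edges[0] is simply the address of s_graph[1]; nothing was allocated for the node itself, so only the edges arrays ever need to be freed.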

Resources