Confusion regarding nodes in a Linked List - c

I have a question about Linked Lists. I spoke to a friend and am now confused.
Suppose that there are two variables of type struct node *. One is ptr and the other is header, which points to the header node.
struct node
{
int data;
struct node *link;
};
struct node *ptr,*header;
What is the difference between
ptr=header
and
ptr->link=header
and
ptr->link=header->link
?
Edit: I mean semantically.

Assuming header is pointing to an allocated node initially, it will look like
                     +----------------+-----------+
                     |                |           |
header +-----------> |      data      |       link+-----------> other node/NULL
                     |                |           |
                     +----------------+-----------+
After ptr=header, both ptr and header point to the same node:
                     +----------------+-----------+
ptr    +-----------> |                |           |
header +-----------> |      data      |       link+-----------> other node/NULL
                     |                |           |
                     +----------------+-----------+
After ptr->link=header (with ptr and header still pointing to the same node), the node's link points back to the node itself:
                     +----------------+-----------+
ptr    +-----------> |                |           |
header +-----------> |      data      |       link+----------+
               +---> |                |           |          |
               |     +----------------+-----------+          |
               +---------------------------------------------+
After ptr->link=header->link, the result depends on where header and ptr point.
If they point to the same node, the statement has no effect.
If they point to different nodes, the link members of both nodes end up pointing to the same node (or both are NULL).
                     +----------------+-----------+
                     |                |           |
header +-----------> |      data      |       link+--------------+
                     |                |           |              |
                     +----------------+-----------+              +------>
                                                                          other node/NULL
                     +----------------+-----------+              +------>
                     |                |           |              |
ptr    +-----------> |      data      |       link+--------------+
                     |                |           |
                     +----------------+-----------+

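Assuming you execute only one of these statements:
ptr=header makes ptr point to the header node. ptr->data is then the same object as header->data, and ptr->link is the same object as header->link.
ptr->link=header makes the header node the successor of the node ptr points to, so ptr's node sits in front of header in the list.
ptr->link=header->link gives both nodes the same successor, so ptr's node sits parallel to header's node, e.g.:
a = ptr->link;
b = header->link;
a == b // true: both point to the same node (or both are NULL)
a->data == b->data // true as a consequence, provided a is not NULL
// ptr->data and header->data are still separate objects and may differ
where a and b are of type struct node *.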
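The three assignments can be tried out with a small sketch like the one below. The function names (alias_header, link_before, share_successor) are made up for illustration and are not part of the question; only struct node comes from it.

```c
#include <stddef.h>

struct node {
    int data;
    struct node *link;
};

/* Case 1: ptr = header. Only a pointer is copied; no node is modified. */
struct node *alias_header(struct node *header)
{
    struct node *ptr = header;   /* ptr and header now hold the same address */
    return ptr;
}

/* Case 2: ptr->link = header. Modifies the node that ptr points to. */
void link_before(struct node *ptr, struct node *header)
{
    ptr->link = header;          /* header's node becomes the successor */
}

/* Case 3: ptr->link = header->link. Both nodes get the same successor. */
void share_successor(struct node *ptr, struct node *header)
{
    ptr->link = header->link;    /* header's node itself is not modified */
}
```

In all three cases the node that header points to keeps its own data and link; only ptr, or the node ptr points to, changes.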

The best way to understand is to draw it out.

Related

Valgrind warns of overlap when trying to copy a string into a struct member variable

This is what the struct looks like, for reference:
struct thread_data {
struct ringbuf_t *rb;
char *file_name;
};
I need to take the command-line arguments and store them in the struct member variables of each thread_data element in the threads array, like so:
for (int index = optind; index < argc; index++) {
threads[length].rb = rb;
memmove(&threads[length].file_name, &argv[index], strlen(argv[index]));
strcpy(threads[length].file_name, argv[index]);
++length;
}
Previously I used memcpy, and it appeared to work when I printed the variable. However, Valgrind is giving me this:
==465645== Source and destination overlap in strcpy(0x1fff000b54, 0x1fff000b54)
==465645== at 0x4C3C180: strcpy (vg_replace_strmem.c:523)
==465645== by 0x400F85: main (bytemincer.c:55)
So I used memmove and I still got the same Valgrind result. Any solution for this?
This is what you want to end up with:
(I'm using "fn" instead of "file_name" in the post.)
*(argv[0]) # 0x2000
+---+---+- -+---+
+--------------->| | | … | 0 |
argv # 0x1000 | +---+---+- -+---+
+---------------+ |
| 0x2000 -------+ *(argv[1]) # 0x2100
+---------------+ +---+---+- -+---+
| 0x2100 -----------+----------->| | | … | 0 |
+---------------+ | +---+---+- -+---+
| 0x2200 -----------)----+
+---------------+ | | *(argv[2]) # 0x2200
| ⋮ | | | +---+---+- -+---+
| +------->| | | … | 0 |
rb # 0x3000 | | +---+---+- -+---+
+---------------+ | |
| 0x4000 -------+ | | *rb # 0x4000
+---------------+ | | | +---------------+
+---)---)------->| |
threads # 0x5000 | | | +---------------+
+---------------+ | | |
| +-----------+| | | |
|rb| 0x4000 --------+ | |
| +-----------+| | | |
|fn| 0x2100 --------)---+ |
| +-----------+| | |
+---------------+ | |
| +-----------+| | |
|rb| 0x4000 --------+ |
| +-----------+| |
|fn| 0x2200 ----------------+
| +-----------+|
+---------------+
| ⋮ |
(This assumes threads is an array rather than a pointer to an array. This doesn't affect the rest of the post.)
All addresses are made up, of course. But you can see how more than one variable holds the same address as its value, because it's perfectly fine to have multiple pointers point to the same memory block. All we need to do is copy the pointer (the address).
To copy a pointer, all you need to do is
dst = src;
So all you need is
threads[length].rb = rb;
threads[length].fn = argv[index];
While
memmove(&threads[length].rb, &rb, sizeof(threads[length].rb));
memmove(&threads[length].fn, &argv[index], sizeof(threads[length].fn));
and
memmove(&threads[length].rb, &rb, sizeof(rb));
memmove(&threads[length].fn, &argv[index], sizeof(argv[index]));
are equivalent to the assignments, it doesn't make sense to do something that complicated.
(Note the use of sizeof(argv[index]) rather than strlen(argv[index]). It's the pointer we're copying, so we need the size of the pointer. Passing strlen(argv[index]) copies the wrong number of bytes and can write past the end of the file_name member.)
The warning came from trying to copy the string that's in the buffer at 0x2100 into the buffer at 0x2100. Remember that threads[length].fn and argv[index] both have the same value (address) after the memmove.
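As a rough sketch of the two sensible options: either share the string that argv already owns, or allocate a private copy. The helper names set_name_shared and set_name_copy are hypothetical, and struct ringbuf_t is left opaque because its definition is not shown in the question.

```c
#include <stdlib.h>
#include <string.h>

struct ringbuf_t;   /* opaque here; only pointers to it are stored */

struct thread_data {
    struct ringbuf_t *rb;
    char *file_name;
};

/* Option 1: share the string owned by argv (it lives as long as the program). */
void set_name_shared(struct thread_data *td, char *name)
{
    td->file_name = name;            /* copies the address, not the characters */
}

/* Option 2: keep a private copy; the caller must later free(td->file_name). */
int set_name_copy(struct thread_data *td, const char *name)
{
    size_t len = strlen(name) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, name, len);         /* distinct buffers, so no overlap */
    td->file_name = copy;
    return 0;
}
```

The copy is only needed if the thread might modify the name or outlive the original string; for command-line arguments, sharing the pointer is usually enough.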

C Linked List pointer understanding

I'm trying to understand how pointers work in a C linked list.
I understand that a pointer to a variable is a "link" to a memory address, and that a pointer to a pointer is, sometimes, a reference to the pointer itself.
What puzzles me is how, for example, a reference to a node can modify the original list in one statement but not in another.
I'll explain myself better:
void insertNode(struct node** head, int value) {
    struct node* new = malloc(sizeof(struct node)); // size of the node, not of a pointer
    struct node* ref = (*head); //this is a reference. -> same address.
    //base case
    if((*head) == NULL) {
        //do things
    } else { // not null
        while(ref->next != NULL) {
            ref = ref->next; //THIS: how can this not modify the head itself?
        }
        //null spot found, set up node
        new->value = value;
        new->next = NULL;
        ref->next = new; //Instead, how can this modify the head? and why?
    }
}
Here's a small snippet of code, and my question is this:
Yes, I'm holding a reference to head through ref.
But why does
ref = ref->next;
only modify ref itself, while
ref->next = new
also modifies the head?
Through GDB I saw that both hold the same memory address at the beginning, but ref only modifies the referenced list on the new insert.
Can someone explain it?
ref is just a pointer; modifying ref does not modify what ref points to.
The while loop is actually just looking for the last element of the list. After the while loop, ref simply points to the last element of the list.
First "mystery" line:
ref = ref->next; //THIS: how can this not modify the head itself?
Here we just read ref->next, so the head cannot be modified.
Second "mystery" line:
ref->next = new; //Instead, how can this modify the head? and why?
Here we modify what ref points to. At this line ref points either to the last element of the list or to the head. The head is also the last element if the list has only one element, and it is the newly created head (to be done in //do things) if the list was empty.
Maybe some pictures will help.
Before calling insertNode, you have a sequence of nodes linked like so:
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
You have a pointer (call it h) that points to the first element of the list:
+---+
| h |
+---+
|
V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
When you call insertNode, you pass a pointer to h in as a parameter, which we call head:
+------+
| head |
+------+
|
V
+---+
| h |
+---+
|
V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
You create a pointer variable named ref that takes the value of *head (h); IOW, ref winds up pointing to the first element of the list:
+------+ +-----+
| head | | ref |
+------+ +-----+
| |
V |
+---+ |
| h | |
+---+ |
| +----+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
Then you create another node on the heap, and assign that pointer to a local variable named new:
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next |
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +----+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
So, the thing to notice is that while ref and *head (h) have the same value (the address of the first node in the list), they are different objects. Thus, anything that changes the value of ref does not affect either head or h.
So, if we execute this loop
while(ref->next != NULL) {
ref = ref->next;
}
the result of the first iteration is
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next |
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +------------+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
After another iteration we get
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next |
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +----------------------------------+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
At this point, ref->next is NULL, so the loop exits.
We then assign values to new->value and new->next, such that new->next is NULL:
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next | ---|||
+------+ +-----+ +-----+ +-------+------+
| |
V |
+---+ |
| h | |
+---+ |
| +----------------------------------+
V V
+-------+------+ +-------+------+ +-------+------+
| value | next | ---> | value | next | ---> | value | next | ---|||
+-------+------+ +-------+------+ +-------+------+
Finally, we set ref->next to the value of new, thus adding the node new points to to the end of the list:
+------+ +-----+ +-----+ +-------+------+
| head | | ref | | new | ---> | value | next | ---|||
+------+ +-----+ +-----+ +-------+------+
| | ^
V | |
+---+ | +-------------------------------+
| h | | |
+---+ | |
| +----------------------------------+ |
V V |
+-------+------+ +-------+------+ +-------+------+ |
| value | next | ---> | value | next | ---> | value | next | ---+
+-------+------+ +-------+------+ +-------+------+
Note that ref->next isn't pointing to the variable new, it's pointing to the thing that new points to.
So, that's why updating ref does not affect head (or *head (h)). The base case, where the list is empty, will end up writing to *head (h), setting it to point to a new node allocated from the heap.
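Putting the pieces together, a complete version of the function might look like the sketch below. It fills in the base case that the question leaves as //do things, and it deviates from the original in two ways that are my own choices: it returns an int to report allocation failure, and the local variable is called node rather than new.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Appends value to the list; returns 0 on success, -1 if malloc fails. */
int insertNode(struct node **head, int value)
{
    struct node *node = malloc(sizeof *node);  /* size of a node, not of a pointer */
    if (node == NULL)
        return -1;
    node->value = value;
    node->next = NULL;

    if (*head == NULL) {
        *head = node;            /* writes through head into the caller's h */
        return 0;
    }

    struct node *ref = *head;    /* a copy of the caller's pointer */
    while (ref->next != NULL)
        ref = ref->next;         /* changes only the local copy */
    ref->next = node;            /* changes the last node itself */
    return 0;
}
```

Only the empty-list branch assigns to *head, which is why the function needs a struct node ** parameter at all; every other call leaves the caller's pointer untouched.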

container_of sample code in lwn.net

When seeing:
void my_object_release(struct kobject *kobj)
{
struct my_object *mine = container_of(kobj, struct my_object, kobj);
/* Perform any additional cleanup on this object, then... */
kfree (mine);
}
in LWN’s The zen of kobjects, the third parameter, kobj, seems incorrect to me. I think it should be kobject.
The given code is correct: the third argument is the name of the container structure member to which the pointer points, not its type, so kobj is right. The example is somewhat confusing because the two kobj arguments do not refer to the same thing: the first is the pointer passed to my_object_release, the second is the name of the struct kobject member inside struct my_object.
Here’s a diagram to hopefully clarify the parameters of container_of:
container_of(kobj, struct my_object, kobj)
| | |
| | |
\------------+----------+--------------------------------\
| | |
| | |
/-----------------/ | |
| | |
V /-------------/ |
+------------------+ | |
| struct my_object | { | |
+------------------+ V V
+------+ +------+
struct kobject | kobj |; <-- You have a pointer to this, called | kobj |
+------+ +------+
...
};
container_of allows you to pass around a kobject pointer and find the containing object (as long as you know what the containing object is) — it allows you to use your knowledge of “what” to answer “where”.
It’s worth pointing out that container_of is a macro, which is how it can do seemingly impossible things (for developers not used to meta-programming).
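To see the mechanics outside the kernel, here is a simplified userspace sketch. The macro below is a reduced version built on offsetof and omits the type checking the kernel's macro performs; struct kobject is a stand-in with a made-up field, and to_my_object is a hypothetical helper.

```c
#include <stddef.h>

/* Simplified: subtract the member's offset from the member's address. */
#define container_of(ptr, type, member) \
    ((type *)((char *)(ptr) - offsetof(type, member)))

struct kobject { int refcount; };   /* stand-in, not the kernel's definition */

struct my_object {
    int id;
    struct kobject kobj;             /* the member named "kobj" */
};

/* Recovers the enclosing my_object from a pointer to its kobj member. */
struct my_object *to_my_object(struct kobject *k)
{
    /* third argument is the member NAME (kobj), not its type (kobject) */
    return container_of(k, struct my_object, kobj);
}
```

Here the pointer is deliberately named k so that it cannot be confused with the member name kobj in the third argument.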

Correct way to join two double linked list

In the Linux kernel source, the list_splice is implemented with __list_splice:
static inline void __list_splice(const struct list_head *list,
struct list_head *prev,
struct list_head *next)
{
struct list_head *first = list->next; // Why?
struct list_head *last = list->prev;
first->prev = prev;
prev->next = first;
last->next = next;
next->prev = last;
}
Isn't the list already pointing to the head of a linked list?
Why do we need to fetch list->next instead?
The doubly linked list API in the Linux kernel is implemented as an abstraction of a circular list. In that simple scheme the HEAD node does not contain any payload (data) and is used only to mark the starting point of the list. That is why __list_splice fetches list->next: list is the payload-free HEAD, so the first real element is list->next and the last is list->prev. This design makes it simple to a) check whether the list is empty, and b) debug lists, because the pointers of unused nodes are set to the so-called POISON, a magic number specific to list pointers in the entire kernel.
1) non-initialized list
+-------------+
| HEAD |
| prev | next |
|POISON POISON|
+-------------+
2) empty list
+----------+-----------+
| | |
| | |
| +------v------+ |
| | HEAD | |
+---+ prev | next +----+
| HEAD HEAD |
+-------------+
3) list with one element
+--------------+--------------+
| | |
| | |
| +------v------+ |
| | HEAD | |
| +---+ prev | next +--+ |
| | |ITEM1 ITEM1| | |
| | +-------------+ | |
| +--------------------+ |
| | |
| +------v------+ |
| | ITEM1 | |
+-------+ prev | next +-------+
| DATA1 |
+-------------+
4) two items in the list
+----------+
| |
| |
| +------v------+
| | HEAD |
+------+ prev | next +----+
| | |ITEM2 ITEM1| |
| | +-------------+ |
+----------------------------+
| | | |
| | | +------v------+
| | | | ITEM1 |
| | +---+ prev | next +----+
| | | | DATA1 | |
| | | +-------------+ |
| +-------------------------+
| | |
| | +------v------+
| | | ITEM2 |
+---------+ prev | next +----+
| | DATA2 | |
| +-------------+ |
| |
+----------------------+
In the lockless algorithm, only the next pointer is guaranteed to be consistent. That guarantee did not always exist; commit 2f073848c3cc introduced it.
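The role of the payload-free HEAD can be illustrated with a userspace sketch that mirrors the kernel's naming but is not the kernel source. It assumes single-threaded use and leaves out the poisoning and memory-ordering details.

```c
#include <stddef.h>

struct list_head {
    struct list_head *prev, *next;
};

/* An empty list is a HEAD whose prev and next point back at itself. */
void init_list_head(struct list_head *head)
{
    head->prev = head;
    head->next = head;
}

int list_empty(const struct list_head *head)
{
    return head->next == head;
}

/* Inserts item just before head, i.e. at the end of the list. */
void list_add_tail(struct list_head *item, struct list_head *head)
{
    item->prev = head->prev;
    item->next = head;
    head->prev->next = item;
    head->prev = item;
}

/* Moves the elements of list (but not its HEAD) to the front of dst. */
void list_splice(struct list_head *list, struct list_head *dst)
{
    if (list_empty(list))
        return;
    struct list_head *first = list->next;  /* first real element, skipping HEAD */
    struct list_head *last = list->prev;   /* last real element */
    struct list_head *prev = dst;
    struct list_head *next = dst->next;

    first->prev = prev;
    prev->next = first;
    last->next = next;
    next->prev = last;
}
```

After the splice, the HEAD of list still holds its old pointers, so it should be reinitialized with init_list_head before being reused.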

How does this code actually work?

I can't understand this three-line code used to implement a static linked list. It is actually the answer to this question.
I am posting the code here again (the main action is basically in the 2nd line):
struct node {int x; struct node *next;};
#define cons(x,next) (struct node[]){{x,next}}
struct node *head = cons(1, cons(2, cons(3, cons(4, NULL))));
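My question is: what does this expression do?
(struct node[]){{x,next}}. Is this an initialization statement, and what does it return such that it can be assigned to a struct node*?
(struct node[]){{x,next}} is a compound literal. It creates an unnamed one-element array of struct node whose single element is initialized with x and next. Used as an expression, that array decays to a pointer to its first element, which is why it can be assigned to a struct node * or passed as the next argument of an enclosing cons.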
+------+------+     +------+------+     +------+------+     +------+------+
|      |      |     |      |      |     |      |      |     |      |      |
|  1   | next +---->|  2   | next +---->|  3   | next +---->|  4   | NULL |
|      |      |     |      |      |     |      |      |     |      |      |
+------+------+     +------+------+     +------+------+     +------+------+
   ^
   |
  head
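A short sketch of how such a list can be used follows. The struct and the cons macro are taken from the question; sum_list and demo_sum are hypothetical helpers. The list is built inside a function here, so the compound literals have automatic storage and exist only until the enclosing block ends.

```c
#include <stddef.h>

struct node { int x; struct node *next; };

/* Each use creates an unnamed one-element array; it decays to struct node *. */
#define cons(x, next) (struct node[]){{x, next}}

/* Walks the list and adds up the x members. */
int sum_list(const struct node *head)
{
    int sum = 0;
    for (const struct node *p = head; p != NULL; p = p->next)
        sum += p->x;
    return sum;
}

/* Builds 1 -> 2 -> 3 -> 4 without malloc and sums it. */
int demo_sum(void)
{
    struct node *head = cons(1, cons(2, cons(3, cons(4, NULL))));
    return sum_list(head);   /* the nodes are still alive at this point */
}
```

Returning head itself from demo_sum would be a bug, because the nodes stop existing when the function returns.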
