Accessing void * in a struct in C

List.h
#define LIST_SIZE 100 /* an array size inside a struct must be a constant expression in C */
typedef struct node {
void *item;
} Node;
typedef struct list {
Node *currentItem;
Node *items[LIST_SIZE];
} LIST;
main.c
#include <stdlib.h>
#include <stdio.h>
#include "List.h"
LIST *ListCreate();
int main() {
LIST *newList = ListCreate();
Node *newNode = malloc(sizeof(Node));
newList->currentItem = newNode;
newNode->item = (int *)200;
printf("%d", *((int *)newNode));
}
LIST *ListCreate() {
LIST *newList = malloc(sizeof(LIST));
return newList;
}
My question is:
In main.c, I use the printf statement to access the item in the newNode. According to my understanding the proper call should be:
printf("%d", *((int *)newNode->item));
However, I get a segmentation fault when using this. Could anyone please explain to me why this doesn't work and the other one does?
Thanks.

(int*)200 tells the compiler to take the number 200 and pretend it's the address of an int. However, it's not actually a valid address of an int, so you can't dereference it (i.e. you can't use *(int*)newNode->item to get the int at that address).
All you want to do is tell the compiler to take the "address" and treat it as a number again, which you can do using:
printf("%d", (int)newNode->item);
The (int) cast undoes the (int*) cast from before.
Side note: using (int*) here is slightly unusual; there's no reason to use it instead of (void*).

You're storing 200 (which isn't a pointer value) in the pointer variable, and then you dereference the variable as a pointer even though it doesn't actually contain a pointer value.
General pattern for some type type:
Node* node = malloc(sizeof(Node));
node->item = malloc(sizeof(type));
*((type*)node->item) = ...;
type val = *((type*)node->item);
free(node->item);
free(node);
So, to store an int, it would be
Node* node = malloc(sizeof(Node));
node->item = malloc(sizeof(int));
*((int*)node->item) = 200;
printf("%d\n", *((int*)node->item));
free(node->item);
free(node);
Now, pointers can be used to store integers, so you could save an allocation in this particular case by using some trickery. This is a more advanced solution.
Node* node = malloc(sizeof(Node));
node->item = (void*)200; // Store the number in a pointer variable.
printf("%d\n", (int)node->item); // Treat the pointer variable as an `int`.
free(node);

Making a slight change to the main() lets take a look at this application run using Visual Studio 2013 in debug mode.
The modified main() is as follows:
int main() {
LIST *newList = ListCreate();
Node *newNode = malloc(sizeof(Node));
newList->currentItem = newNode;
newNode->item = (int *)200; // assign the value 200 to the pointer item
printf("*((int *)newNode) %d\n", *((int *)newNode)); // works
printf("((int)newNode->item) %d\n", ((int)newNode->item)); // works
printf("*((int *)newNode->item) %d\n", *((int *)newNode->item)); // access violation exception
}
When this console application is run, the window displayed contains two lines of output:
*((int *)newNode) 200
((int)newNode->item) 200
In the debugger, when single stepping through the function main() an access violation exception is thrown on the indicated line. The actual text of the access violation is:
First-chance exception at 0x010E1BAF in ConsoleApplication1.exe: 0xC0000005: Access violation reading location 0x000000C8.
Unhandled exception at 0x010E1BAF in ConsoleApplication1.exe: 0xC0000005: Access violation reading location 0x000000C8.
The location being read, 0x000000C8, is the decimal number 200 and when an attempt is made to dereference the pointer item as an int pointer containing the value of 200, the memory location of 200 is not valid for this application running under Windows 10 in user mode.
*((int *)newNode) means to treat the value of the variable newNode as a pointer to an int and then fetch the value at that address. This works because newNode contains an address this application can access, and because item is the first member of Node, the bytes at that address are the start of the stored pointer value, which reads back as 200 on this little-endian machine.
(int)newNode->item means to treat the value of the variable newNode->item as an int and to fetch the value of that variable. The variable is not treated as a pointer but as a variable containing an int.
*((int *)newNode->item) means to treat the value of the variable newNode->item as a pointer to an int and then fetch the value at that address. This doesn't work because the value of the variable newNode->item, 200, is not an address this application can access, so an access violation is raised when the attempt is made.

According to my understanding the proper call should be:
printf("%d", *((int *)newNode->item));
That is true only from a syntactic point of view. But it is wrong because you are not using pointers correctly.
newNode->item = (int *)200;
Not sure what you were expecting that line to do. Casting 200 to an int* does not give you a valid pointer.
*((int *)newNode->item)
only dereferences the value 200 pretending to be a valid pointer. This causes undefined behavior.
You can use:
// Allocate memory for newNode->item
newNode->item = malloc(sizeof(int));
// Set the value
*((int *)newNode->item) = 200;
// Print the value
printf("%d", *((int *)newNode->item));
Unless you wish to store other kinds of object in a node, I suggest using:
typedef struct node {
int item;
} Node;
and then your code can be a little bit simpler.
// Set the value
newNode->item = 200;
// Print the value
printf("%d", newNode->item);

Related

Updating a pointer value not behaving as expected [duplicate]

The two code examples below both add a node at the top of a linked list.
But whereas the first code example uses a double pointer the second code example uses a single pointer
code example 1:
struct node* push(struct node **head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data = data;
newnode->next = *head;
return newnode;
}
push(&head,1);
code example 2:
struct node* push(struct node *head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data = data;
newnode->next = head;
return newnode;
}
push(head,1)
Both strategies work. However, a lot of programs that use a linked list use a double pointer to add a new node. I know what a double pointer is. But if a single pointer would be sufficient to add a new node why do a lot of implementations rely on double pointers?
Is there any case in which a single pointer does not work so we need to go for a double pointer?
Some implementations pass a pointer to pointer parameter to allow changing the head pointer directly instead of returning the new one. Thus you could write:
// note that there's no return value: it's not needed
void push(struct node** head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data=data;
newnode->next=*head;
*head = newnode; // *head stores the newnode in the head
}
// and call like this:
push(&head,1);
The implementation that doesn't take a pointer to the head pointer must return the new head, and the caller is responsible for updating it itself:
struct node* push(struct node* head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data=data;
newnode->next=head;
return newnode;
}
// note the assignment of the result to the head pointer
head = push(head,1);
If you don't do this assignment when calling this function, you will be leaking the nodes you allocate with malloc, and the head pointer will always point to the same node.
The advantage should be clear now: with the second, if the caller forgets to assign the returned node to the head pointer, bad things will happen.
Edit:
A pointer to pointer (double pointer) also lets the same functions serve several lists within one program (for example, two separate linked lists), because each call names the head it should update.
To avoid the complexity of double pointers, we can always wrap the head pointer in a structure and pass a pointer to that structure instead.
You can define a list in the following way:
typedef struct list {
struct node* root;
} List;
List* create() {
List* templ = malloc(sizeof(List));
templ->root = NULL;
return templ;
}
In the linked-list functions, use the List above as follows (example: a Push function that appends to the end of the list):
void Push(List* l, int x) {
struct node* n = malloc(sizeof(struct node));
n->data = x;
n->link = NULL;
printf("Node created with value %d\n", n->data);
if (l->root == NULL) {
l->root = n;
} else {
struct node* i = l->root;
while (i->link != NULL){
i = i->link;
}
i->link = n;
}
}
In your main() function, declare the list as follows:
List* list1 = create();
Push(list1, 10);
Although the previous answers are good enough, I think it's much easier to think in terms of "copy by value".
When you pass in a pointer to a function, the address value is being copied over to the function parameter. Due to the function's scope, that copy will vanish once it returns.
By using a double pointer, you will be able to update the original pointer's value. The double pointer will still be copied by value, but that doesn't matter. All you really care about is modifying the original pointer, which the copy of its address lets you reach from inside the function.
Hope this answers not just your question, but other pointer related questions as well.
As @R. Martinho Fernandes pointed out in his answer, using a pointer to pointer as an argument in void push(struct node** head, int data) allows you to change the head pointer directly from within the push function instead of returning the new pointer.
There is yet another good example that shows why using a pointer to pointer instead of a single pointer may shorten, simplify and speed up your code. You asked about adding a new node to the list, which typically doesn't need a pointer to pointer, in contrast to removing a node from a singly-linked list. You can implement node removal without a pointer to pointer, but it is suboptimal. I described the details here. I also recommend watching this YouTube video, which addresses the problem.
BTW: If you value Linus Torvalds' opinion, you had better learn how to use pointers to pointers. ;-)
Linus Torvalds: (...) At the opposite end of the spectrum, I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I've seen too many people who delete a singly-linked list entry by keeping track of the "prev" entry, and then to delete the entry, doing something like
if (prev)
prev->next = entry->next;
else
list_head = entry->next;
and whenever I see code like that, I just go "This person doesn't understand pointers". And it's sadly quite common.
People who understand pointers just use a "pointer to the entry pointer", and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a "*pp = entry->next". (...)
Other resources that may be helpful:
C double pointers
Pointers to Pointers
Why use double pointer? or Why use pointers to pointers?
In your particular example there is no need for the double pointer. However it can be needed, if, for example, you were to do something like this:
struct node* push(struct node** head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data=data;
newnode->next=*head;
//vvvvvvvvvvvvvvvv
*head = newnode; //you say that now the new node is the head.
//^^^^^^^^^^^^^^^^
return newnode;
}
Observation and Finding, WHY...
I decided to run some experiments and draw some conclusions.
OBSERVATION 1- If the linked list is not empty then we can add the nodes in it (obviously at the end) by using a single pointer only.
int insert(struct LinkedList *root, int item){
struct LinkedList *temp = (struct LinkedList*)malloc(sizeof(struct LinkedList));
temp->data=item;
temp->next=NULL;
struct LinkedList *p = root;
while(p->next!=NULL){
p=p->next;
}
p->next=temp;
return 0;
}
int main(){
int m;
struct LinkedList *A=(struct LinkedList*)malloc(sizeof(struct LinkedList));
//now we want to add one element to the list so that the list becomes non-empty
A->data=5;
A->next=NULL;
cout<<"enter the element to be inserted\n"; cin>>m;
insert(A,m);
return 0;
}
It's simple to explain (basic). We have a pointer in our main function which points to the first node (root) of the list. In the insert() function we pass the address of the root node, and using this address we reach the end of the list and add a node to it. So we can conclude that if a function other than main has the address of a variable, it can make permanent changes to the value of that variable, and those changes are visible in the main function.
OBSERVATION 2- The above method of adding node failed when the list was empty.
int insert(struct LinkedList *root, int item){
struct LinkedList *temp = (struct LinkedList*)malloc(sizeof(struct LinkedList));
temp->data=item;
temp->next=NULL;
struct LinkedList *p=root;
if(p==NULL){
p=temp;
}
else{
while(p->next!=NULL){
p=p->next;
}
p->next=temp;
}
return 0;
}
int main(){
int m;
struct LinkedList *A=NULL; //initialise the list to be empty
cout<<"enter the element to be inserted\n";
cin>>m;
insert(A,m);
return 0;
}
If you keep adding elements and finally display the list, you will find that the list has undergone no changes and is still empty.
The question that struck me was: in this case we are also passing the address of the root node, so why are the modifications not permanent, and why does the list in the main function undergo no changes? WHY? WHY? WHY?
Then I observed one thing: when I write A=NULL, the address stored in A becomes 0 (the address of A itself does not change). This means A is not pointing to any location in memory. So I removed the line A=NULL; and made some modifications to the insert function.
Some modifications (the insert() function below can add only one element to an empty list; I wrote it just for testing purposes):
int insert(struct LinkedList *root, int item){
root= (struct LinkedList *)malloc(sizeof(struct LinkedList));
root->data=item;
root->next=NULL;
return 0;
}
int main(){
int m;
struct LinkedList *A;
cout<<"enter the element to be inserted\n";
cin>>m;
insert(A,m);
return 0;
}
The above method also fails. In the insert() function, root starts out as a copy of whatever A holds in main() (here an uninitialized value), but after the line root= (struct LinkedList *)malloc(sizeof(struct LinkedList)); the address stored in root changes. Thus root (in the insert() function) and A (in the main() function) now store different addresses, and main() never sees the new node.
So the correct final program would be,
int insert(struct LinkedList *root, int item){
root->data=item;
root->next=NULL;
return 0;
}
int main(){
int m;
struct LinkedList *A = (struct LinkedList *)malloc(sizeof(struct LinkedList));
cout<<"enter the element to be inserted\n";
cin>>m;
insert(A,m);
return 0;
}
But we don't want two different functions for insertion, one for when the list is empty and another for when it is not. This is where the double pointer comes in and makes things easy.
One important thing I noticed is that pointers store an address,
and when used with '*' they give the value at that address, but pointers
themselves also have their own address.
Here is the complete program; the concepts are explained afterwards.
int insert(struct LinkedList **root,int item){
if(*root==NULL){
(*root)=(struct LinkedList *)malloc(sizeof(struct LinkedList));
(*root)->data=item;
(*root)->next=NULL;
}
else{
struct LinkedList *temp=(struct LinkedList *)malloc(sizeof(struct LinkedList));
temp->data=item;
temp->next=NULL;
struct LinkedList *p;
p=*root;
while(p->next!=NULL){
p=p->next;
}
p->next=temp;
}
return 0;
}
int main(){
int n,m;
struct LinkedList *A=NULL;
cout<<"enter the no of elements to be inserted\n";
cin>>n;
while(n--){
cin>>m;
insert(&A,m);
}
display(A);
return 0;
}
The following are the observations:
1. root stores the address of pointer A (&A), *root stores the address stored by pointer A, and **root stores the value at the address stored by A. In simple language: root=&A, *root=A and **root=*A.
2. If we write *root=1528, the value at the address stored in root becomes 1528, and since the address stored in root is the address of pointer A (&A), now A=1528 (i.e. the address stored in A is 1528) and this change is permanent.
Whenever we change the value of *root, we are really changing the value at the address stored in root, and since root=&A (the address of pointer A), we are indirectly changing the value of A, that is, the address stored in A.
So if A=NULL (the list is empty), then *root=NULL, and we create the first node and store its address in *root, i.e. we indirectly store the address of the first node in A. If the list is not empty, everything is the same as in the previous single-pointer functions, except that root has become *root, since what was stored in root is now stored in *root.
Let's take this simple example:
#include <stdio.h>
#include <stdlib.h>
void my_func(int *p) {
// allocate space for an int
int *z = (int *) malloc(sizeof(int));
// assign a value
*z = 99;
printf("my_func - value of z: %d\n", *z);
printf("my_func - value of p: %p\n", p);
// change the value of the local pointer p. Now it no longer points to main's z
p = z;
printf("my_func - make p point to z\n");
printf("my_func - addr of z %p\n", &*z);
printf("my_func - value of p %p\n", p);
printf("my_func - value of what p points to: %d\n", *p);
free(z);
}
int main(int argc, char *argv[])
{
// our var
int z = 10;
int *h = &z;
// print value of z
printf("main - value of z: %d\n", z);
// print address of val
printf("main - addr of z: %p\n", &z);
// print value of h.
printf("main - value of h: %p\n", h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// change the value of var z by dereferencing h
*h = 22;
// print value of val
printf("main - value of z: %d\n", z);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
my_func(h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// print value of h
printf("main - value of h: %p\n", h);
return 0;
}
Output:
main - value of z: 10
main - addr of z: 0x7ffccf75ca64
main - value of h: 0x7ffccf75ca64
main - value of what h points to: 10
main - value of z: 22
main - value of what h points to: 22
my_func - value of z: 99
my_func - value of p: 0x7ffccf75ca64
my_func - make p point to z
my_func - addr of z 0x1906420
my_func - value of p 0x1906420
my_func - value of what p points to: 99
main - value of what h points to: 22
main - value of h: 0x7ffccf75ca64
We have this signature for my_func:
void my_func(int *p);
If you look at the output, in the end the value that h points to is still 22 and the value of h is the same, although p was changed inside my_func. How come?
Well, in my_func we are manipulating the value of p, which is just a local pointer.
After calling:
my_func(h);
in main(), p will hold the value that h holds, which is the address of the variable z declared in main().
In my_func(), when we change the value of p to the value of the local pointer z, which points to memory we allocated with malloc, we are not changing the value of h that we passed in, only the value of the local pointer p. Basically, p no longer holds the value of h; it holds the address of the memory location that the local z points to.
Now, if we change our example a little bit:
#include <stdio.h>
#include <stdlib.h>
void my_func(int **p) {
// allocate space for an int
int *z = (int *) malloc(sizeof(int));
// assign a value
*z = 99;
printf("my_func - value of z: %d\n", *z);
printf("my_func - value of p: %p\n", p);
printf("my_func - value of h: %p\n", *p);
// change the value of h through p. Now h no longer points to main's z
*p = z;
printf("my_func - make p point to z\n");
printf("my_func - addr of z %p\n", &*z);
printf("my_func - value of p %p\n", p);
printf("my_func - value of h %p\n", *p);
printf("my_func - value of what p points to: %d\n", **p);
// we are not deallocating, because we want to keep the value in that
// memory location, in order for h to access it.
/* free(z); */
}
int main(int argc, char *argv[])
{
// our var
int z = 10;
int *h = &z;
// print value of z
printf("main - value of z: %d\n", z);
// print address of val
printf("main - addr of z: %p\n", &z);
// print value of h.
printf("main - value of h: %p\n", h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// change the value of var z by dereferencing h
*h = 22;
// print value of val
printf("main - value of z: %d\n", z);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
my_func(&h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// print value of h
printf("main - value of h: %p\n", h);
free(h);
return 0;
}
We have the following output:
main - value of z: 10
main - addr of z: 0x7ffcb94fb1cc
main - value of h: 0x7ffcb94fb1cc
main - value of what h points to: 10
main - value of z: 22
main - value of what h points to: 22
my_func - value of z: 99
my_func - value of p: 0x7ffcb94fb1c0
my_func - value of h: 0x7ffcb94fb1cc
my_func - make p point to z
my_func - addr of z 0xc3b420
my_func - value of p 0x7ffcb94fb1c0
my_func - value of h 0xc3b420
my_func - value of what p points to: 99
main - value of what h points to: 99
main - value of h: 0xc3b420
Now we have actually changed the value that h holds from within my_func, by doing the following:
1. We changed the function signature to take an int **.
2. We call it from main() as my_func(&h); that is, we pass the address of the pointer h to the double pointer p declared as a parameter in the function's signature.
3. In my_func() we do *p = z; which dereferences the double pointer p one level. This has the same effect as writing h = z;
p holds the address of the pointer h, and h now holds the address of the int that z points to.
You can take both examples and diff them.
So, getting back to your question: you need a double pointer in order to modify, from inside the function, the pointer that you passed in.
Think of memory location for head like [HEAD_DATA].
Now in your second scenario, the calling function's main_head is the pointer to this location.
main_head--->[HEAD_DATA]
Your code passes the value of the pointer main_head to the function (i.e. the address of the memory location of HEAD_DATA).
The function receives a copy of that value in local_head.
so now
local_head---> [HEAD_DATA]
and
main_head---> [HEAD_DATA]
Both point to the same location but are essentially independent of each other.
So when you write local_head = newnode;
what you did is
local_head--/-->[HEAD_DATA]
local_head-----> [NEWNODE_DATA]
You simply replaced the memory address of previous memory with new one in local pointer.
The main_head (pointer) still points to the old [HEAD_DATA]
The standard way to handle linked lists in C is to have the push and pop functions automatically update the head pointer.
C is "Call by value" meaning copies of parameters are passed into functions. If you only pass in the head pointer any local update you make to that pointer will not be seen by the caller. The two workarounds are
1) Pass the address of the head pointer. (Pointer to head pointer)
2) Return a new head pointer, and rely on the caller to update the head pointer.
Option 1) is the easiest even though a little confusing at first.
The answer is more obvious if you take the time to write a working node insertion function; yours isn't one.
You need to be able to write over the head to move it forward, so you need a pointer to the pointer to the head so you can dereference it to get the pointer to the head and change it.
Imagine a case where you have to make certain changes and those changes should reflect back in the calling function.
Example:
void swap(int* a,int* b){
int tmp=*a;
*a=*b;
*b=tmp;
}
int main(void){
int a=10,b=20;
// To ascertain that changes made in swap reflect back here we pass the memory address
// instead of the copy of the values
swap(&a,&b);
}
Similarly, we pass the memory address of the head of the list.
This way, if a node is added and the value of the head changes, that change is reflected back in the caller, and we don't have to manually reset the head in the calling function.
This approach therefore reduces the chance of memory leaks: had we forgotten to update the head in the calling function, we would have lost the pointer to the newly allocated node.
Besides this, the pointer-to-pointer version avoids returning and copying a pointer, although any speed difference from that is likely negligible.
When we pass a pointer as a parameter to a function and want the function to update that same pointer, we use a double pointer.
On the other hand, if we pass the pointer by value and receive it in a single-pointer parameter, the function has to return the result to the caller for the caller to use it.
I think the point is that it makes it easier to update nodes within a linked list. Where you would normally have to keep track of a pointer for previous and current you can have a double pointer take care of it all.
#include <iostream>
#include <math.h>
using namespace std;
class LL
{
private:
struct node
{
int value;
node* next;
node(int v_) :value(v_), next(nullptr) {};
};
node* head;
public:
LL()
{
head = nullptr;
}
void print()
{
node* temp = head;
while (temp)
{
cout << temp->value << " ";
temp = temp->next;
}
}
void insert_sorted_order(int v_)
{
if (!head)
head = new node(v_);
else
{
node* insert = new node(v_);
node** temp = &head;
while ((*temp) && insert->value > (*temp)->value)
temp = &(*temp)->next;
insert->next = (*temp);
(*temp) = insert;
}
}
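void remove(int v_)
{
node** temp = &head;
// stop at the end of the list so a missing value never dereferences a null pointer
while ((*temp) && (*temp)->value != v_)
temp = &(*temp)->next;
if (!(*temp))
return; // value not found, nothing to remove
node* d = (*temp);
(*temp) = (*temp)->next;
delete d;
}
void insertRear(int v_)//single pointer; note that, despite its name, this adds the node at the front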
{
if (!head)
head = new node(v_);
else
{
node* temp = new node(v_);
temp->next = head;
head = temp;
}
}
};
Let's say I noted down your house address on card-1. Now if I want to tell somebody else your house address, I can either copy the address from card-1 to card-2 and give them card-2, or I can give them card-1 directly. Either way the person will know the address and can reach you. But when I give card-1 directly, the address on card-1 can be changed, whereas if I give card-2, only the address on card-2 can be changed and not the one on card-1.
Passing a pointer to pointer is similar to giving the access to card-1 directly. Passing a pointer is similar to creating a new copy of the address.
I think your confusion might come from the fact that both functions have a parameter named head. The two heads are actually different things. head in the first code stores the address of the head node pointer (which itself stores the address of the head node structure), whereas the second head stores the address of the head node structure directly. And since both functions return the newly created node (which should be the new head), I think there is no need to go for the first approach. Callers of this function are responsible for updating the head reference they have. I think the second one is good enough and simple to look at. I'd go with the second one.
The naming convention, head, is the cause of the confusion.
The head is the tail and the tail is the head. The tail wags the head.
The head is just a pointer with no data, and the tail is just data whose pointer is NULL.
So you have a pointer to a struct pointer. The struct pointer points to the first node struct in the linked list.
This pointer to the first struct node pointer is called head. It would be better called startptr or headptr.
Once you have hold of the startptr, you have hold of the linked list, and you can then traverse all the struct nodes.

Why does this code function only works with a double pointer and not single one? [duplicate]

The two code examples below both add a node at the top of a linked list.
But whereas the first code example uses a double pointer the second code example uses a single pointer
code example 1:
struct node* push(struct node **head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data = data;
newnode->next = *head;
return newnode;
}
push(&head,1);
code example 2:
struct node* push(struct node *head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data = data;
newnode->next = head;
return newnode;
}
push(head,1)
Both strategies work. However, a lot of programs that use a linked list use a double pointer to add a new node. I know what a double pointer is. But if a single pointer would be sufficient to add a new node why do a lot of implementations rely on double pointers?
Is there any case in which a single pointer does not work so we need to go for a double pointer?
Some implementations pass a pointer to pointer parameter to allow changing the head pointer directly instead of returning the new one. Thus you could write:
// note that there's no return value: it's not needed
void push(struct node** head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data=data;
newnode->next=*head;
*head = newnode; // *head stores the newnode in the head
}
// and call like this:
push(&head,1);
The implementation that doesn't take a pointer to the head pointer must return the new head, and the caller is responsible for updating it itself:
struct node* push(struct node* head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data=data;
newnode->next=head;
return newnode;
}
// note the assignment of the result to the head pointer
head = push(head,1);
If you don't do this assignment when calling this function, you will be leaking the nodes you allocate with malloc, and the head pointer will always point to the same node.
The advantage should be clear now: with the second, if the caller forgets to assign the returned node to the head pointer, bad things will happen.
Edit:
Pointer to pointer(Double pointers) also allows for creation for multiple user defined data types within a same program(Example: Creating 2 linked lists)
To avoid complexity of double pointers we can always utilize structure(which works as an internal pointer).
You can define a list in the following way:
typedef struct list {
struct node* root;
} List;
List* create() {
List* templ = malloc(sizeof(List));
templ->root = NULL;
return templ;
}
In link list functions use the above List in following way: (Example for Push function)
void Push(List* l, int x) {
struct node* n = malloc(sizeof(struct node));
n->data = x;
n->link = NULL;
printf("Node created with value %d\n", n->data);
if (l->root == NULL) {
l->root = n;
} else {
struct node* i = l->root;
while (i->link != NULL){
i = i->link;
}
i->link = n;
}
}
In your main() function declare the list in follow way:
List* list1 = create();
push(list1, 10);
Although the previous answers are good enough, I think it's much easier to think in terms of "copy by value".
When you pass in a pointer to a function, the address value is being copied over to the function parameter. Due to the function's scope, that copy will vanish once it returns.
By using a double pointer, you will be able to update the original pointer's value. The double pointer will still be copied by value, but that doesn't matter. All you really care is modifying the original pointer, thereby bypassing the function's scope or stack.
Hope this answers not just your question, but other pointer related questions as well.
As @R. Martinho Fernandes pointed out in his answer, using a pointer to pointer as an argument in void push(struct node** head, int data) allows you to change the head pointer directly from within the push function instead of returning the new pointer.
There is yet another good example which shows why using a pointer to pointer instead of a single pointer may shorten, simplify and speed up your code. You asked about adding a new node to the list, which typically doesn't need a pointer to pointer, in contrast to removing a node from a singly-linked list. You can implement removing a node from the list without a pointer to pointer, but it is suboptimal. I described the details here. I also recommend watching this YouTube video, which addresses the problem.
BTW: If Linus Torvalds' opinion counts for you, you had better learn how to use pointer-to-pointer. ;-)
Linus Torvalds: (...) At the opposite end of the spectrum, I actually wish more people understood the really core low-level kind of coding. Not big, complex stuff like the lockless name lookup, but simply good use of pointers-to-pointers etc. For example, I've seen too many people who delete a singly-linked list entry by keeping track of the "prev" entry, and then to delete the entry, doing something like
if (prev)
prev->next = entry->next;
else
list_head = entry->next;
and whenever I see code like that, I just go "This person doesn't understand pointers". And it's sadly quite common.
People who understand pointers just use a "pointer to the entry pointer", and initialize that with the address of the list_head. And then as they traverse the list, they can remove the entry without using any conditionals, by just doing a "*pp = entry->next". (...)
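The "pointer to the entry pointer" idea from the quote can be sketched roughly as follows. This is only an illustration: the struct entry type and the names remove_entry and push_front are made up for the example and are not taken from any kernel code.

```c
#include <stdlib.h>

/* Hypothetical node type for illustration only. */
struct entry {
    int value;
    struct entry *next;
};

/* Unlink and free the first entry whose value equals `value`.
 * `pp` always points at the pointer that leads to the current entry:
 * first the list head itself, then each `next` field in turn.
 * Returns 1 if an entry was removed, 0 otherwise. */
int remove_entry(struct entry **list_head, int value)
{
    struct entry **pp = list_head;

    while (*pp != NULL && (*pp)->value != value)
        pp = &(*pp)->next;

    if (*pp == NULL)
        return 0;               /* value not found */

    struct entry *victim = *pp;
    *pp = victim->next;         /* same statement for head and non-head entries */
    free(victim);
    return 1;
}

/* Hypothetical helper so the sketch can be exercised: pushes at the front
 * and returns the new head (or the old head if malloc fails). */
struct entry *push_front(struct entry *head, int value)
{
    struct entry *e = malloc(sizeof *e);
    if (e == NULL)
        return head;
    e->value = value;
    e->next = head;
    return e;
}
```

Because pp starts at &list_head, removing the first entry needs no "if (prev) ... else ..." special case: the assignment *pp = victim->next updates either the head pointer or the previous entry's next field, whichever pp currently points at.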
Other resources that may be helpful:
C double pointers
Pointers to Pointers
Why use double pointer? or Why use pointers to pointers?
In your particular example there is no need for the double pointer. However it can be needed, if, for example, you were to do something like this:
struct node* push(struct node** head, int data)
{
struct node* newnode = malloc(sizeof(struct node));
newnode->data=data;
newnode->next=*head;
//vvvvvvvvvvvvvvvv
*head = newnode; //you say that now the new node is the head.
//^^^^^^^^^^^^^^^^
return newnode;
}
Observation and Finding, WHY...
I decided to do some experiments and make some conclusion,
OBSERVATION 1- If the linked list is not empty then we can add the nodes in it (obviously at the end) by using a single pointer only.
int insert(struct LinkedList *root, int item){
struct LinkedList *temp = (struct LinkedList*)malloc(sizeof(struct LinkedList));
temp->data=item;
temp->next=NULL;
struct LinkedList *p = root;
while(p->next!=NULL){
p=p->next;
}
p->next=temp;
return 0;
}
int main(){
int m;
struct LinkedList *A=(struct LinkedList*)malloc(sizeof(struct LinkedList));
//now we want to add one element to the list so that the list becomes non-empty
A->data=5;
A->next=NULL;
cout<<"enter the element to be inserted\n"; cin>>m;
insert(A,m);
return 0;
}
It's simple to explain (basic). We have a pointer in our main function which points to the first node (root) of the list. In the insert() function we pass the address of the root node, and using this address we reach the end of the list and add a node to it. So we can conclude that if a function has the address of a variable, it can make permanent changes to the value of that variable, and those changes are visible in the main function.
OBSERVATION 2- The above method of adding node failed when the list was empty.
int insert(struct LinkedList *root, int item){
struct LinkedList *temp = (struct LinkedList*)malloc(sizeof(struct LinkedList));
temp->data=item;
temp->next=NULL;
struct LinkedList *p=root;
if(p==NULL){
p=temp;
}
else{
while(p->next!=NULL){
p=p->next;
}
p->next=temp;
}
return 0;
}
int main(){
int m;
struct LinkedList *A=NULL; //initialise the list to be empty
cout<<"enter the element to be inserted\n";
cin>>m;
insert(A,m);
return 0;
}
If you keep adding elements and finally display the list, you will find that the list has undergone no changes and is still empty.
The question that struck me was: in this case we are also passing the address of the root node, so why are the modifications not permanent, and why does the list in the main function undergo no changes? WHY? WHY? WHY?
Then I observed one thing: when I write A=NULL, the address stored in A becomes 0 (the address of A itself does not change). This means A is not pointing to any location in memory. So I removed the line A=NULL; and made some modifications to the insert function.
Some modifications (the insert() function below can add only one element to an empty list; I wrote it just for testing purposes):
int insert(struct LinkedList *root, int item){
root= (struct LinkedList *)malloc(sizeof(struct LinkedList));
root->data=item;
root->next=NULL;
return 0;
}
int main(){
int m;
struct LinkedList *A;
cout<<"enter the element to be inserted\n";
cin>>m;
insert(A,m);
return 0;
}
The above method also fails, because root in the insert() function initially stores the same address as A in the main() function (an indeterminate one here, since A is uninitialized), but after the line root= (struct LinkedList *)malloc(sizeof(struct LinkedList)); the address stored in root changes. Thus root (in insert()) and A (in main()) now store different addresses, and the new node is never reachable from main().
So the correct final program would be,
int insert(struct LinkedList *root, int item){
root->data=item;
root->next=NULL;
return 0;
}
int main(){
int m;
struct LinkedList *A = (struct LinkedList *)malloc(sizeof(struct LinkedList));
cout<<"enter the element to be inserted\n";
cin>>m;
insert(A,m);
return 0;
}
But we don't want two different functions for insertion, one for when the list is empty and another for when it is not. This is where the double pointer makes things easy.
One important thing I noticed is that pointers store an address,
and when used with '*' they give the value at that address, but pointers
themselves also have their own address.
Here is the complete program; the concepts are explained afterwards.
int insert(struct LinkedList **root,int item){
if(*root==NULL){
(*root)=(struct LinkedList *)malloc(sizeof(struct LinkedList));
(*root)->data=item;
(*root)->next=NULL;
}
else{
struct LinkedList *temp=(struct LinkedList *)malloc(sizeof(struct LinkedList));
temp->data=item;
temp->next=NULL;
struct LinkedList *p;
p=*root;
while(p->next!=NULL){
p=p->next;
}
p->next=temp;
}
return 0;
}
int main(){
int n,m;
struct LinkedList *A=NULL;
cout<<"enter the no of elements to be inserted\n";
cin>>n;
while(n--){
cin>>m;
insert(&A,m);
}
display(A);
return 0;
}
The following are the observations:
1. root stores the address of pointer A (&A), *root stores the address stored by pointer A, and **root stores the value at the address stored by A. In simple language: root=&A, *root=A and **root=*A.
2. If we write *root=1528, the value at the address stored in root becomes 1528, and since the address stored in root is the address of pointer A (&A), now A=1528 (i.e. the address stored in A is 1528) and this change is permanent.
Whenever we change the value of *root, we are really changing the value at the address stored in root, and since root=&A (the address of pointer A), we are indirectly changing the value of A, i.e. the address stored in A.
So if A=NULL (the list is empty), then *root=NULL, and we create the first node and store its address at *root, i.e. we indirectly store the address of the first node in A. If the list is not empty, everything is the same as in the previous functions using a single pointer, except that root becomes *root, since what was stored in root is now stored in *root.
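As a possible refinement (not part of the program above), the empty and non-empty branches can be merged by walking the list with a pointer to pointer. The sketch below assumes a struct LinkedList with data and next fields, which the answer uses but never shows, and adds a malloc check that the original omits.

```c
#include <stdlib.h>

/* Assumed definition; the answer uses this type without showing it. */
struct LinkedList {
    int data;
    struct LinkedList *next;
};

/* Append `item` at the end of the list whose head pointer is *root.
 * Returns 0 on success, -1 if malloc fails. */
int insert(struct LinkedList **root, int item)
{
    struct LinkedList *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return -1;
    temp->data = item;
    temp->next = NULL;

    /* Walk to the first NULL link. For an empty list that link is *root
     * itself; otherwise it is the `next` field of the last node. */
    struct LinkedList **link = root;
    while (*link != NULL)
        link = &(*link)->next;

    *link = temp;   /* store the new node's address in that link */
    return 0;
}
```

The call site stays the same as before, insert(&A, m);. The single statement *link = temp covers both cases because link is either root (that is, &A) or the address of some node's next field.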
Let's take this simple example:
void my_func(int *p) {
// allocate space for an int
int *z = (int *) malloc(sizeof(int));
// assign a value
*z = 99;
printf("my_func - value of z: %d\n", *z);
printf("my_func - value of p: %p\n", p);
// change the value of the local pointer p; it no longer holds the address stored in h
p = z;
printf("my_func - make p point to z\n");
printf("my_func - addr of z %p\n", &*z);
printf("my_func - value of p %p\n", p);
printf("my_func - value of what p points to: %d\n", *p);
free(z);
}
int main(int argc, char *argv[])
{
// our var
int z = 10;
int *h = &z;
// print value of z
printf("main - value of z: %d\n", z);
// print address of val
printf("main - addr of z: %p\n", &z);
// print value of h.
printf("main - value of h: %p\n", h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// change the value of var z by dereferencing h
*h = 22;
// print value of val
printf("main - value of z: %d\n", z);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
my_func(h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// print value of h
printf("main - value of h: %p\n", h);
return 0;
}
Output:
main - value of z: 10
main - addr of z: 0x7ffccf75ca64
main - value of h: 0x7ffccf75ca64
main - value of what h points to: 10
main - value of z: 22
main - value of what h points to: 22
my_func - value of z: 99
my_func - value of p: 0x7ffccf75ca64
my_func - make p point to z
my_func - addr of z 0x1906420
my_func - value of p 0x1906420
my_func - value of what p points to: 99
main - value of what h points to: 22
main - value of h: 0x7ffccf75ca64
We have this signature for my_func:
void my_func(int *p);
If you look at the output, in the end the value that h points to is still 22 and the value of h is the same, although p was changed in my_func. How come?
Well, in my_func we are manipulating the value of p, which is just a local pointer.
After calling:
my_func(h);
in main(), p will hold the value that h holds, which is the address of the variable z declared in main().
In my_func(), when we change the value of p to hold the value of the local z, which is a pointer to a memory location that we have allocated, we are not changing the value of h that we passed in, but just the value of the local pointer p. Basically, p no longer holds the value of h; it holds the address of the memory location that the local z points to.
Now, if we change our example a little bit:
#include <stdio.h>
#include <stdlib.h>
void my_func(int **p) {
// allocate space for an int
int *z = (int *) malloc(sizeof(int));
// assign a value
*z = 99;
printf("my_func - value of z: %d\n", *z);
printf("my_func - value of p: %p\n", p);
printf("my_func - value of h: %p\n", *p);
// change the pointer that p points to, i.e. h; h no longer points to main's z
*p = z;
printf("my_func - make p point to z\n");
printf("my_func - addr of z %p\n", &*z);
printf("my_func - value of p %p\n", p);
printf("my_func - value of h %p\n", *p);
printf("my_func - value of what p points to: %d\n", **p);
// we are not deallocating, because we want to keep the value in that
// memory location, in order for h to access it.
/* free(z); */
}
int main(int argc, char *argv[])
{
// our var
int z = 10;
int *h = &z;
// print value of z
printf("main - value of z: %d\n", z);
// print address of val
printf("main - addr of z: %p\n", &z);
// print value of h.
printf("main - value of h: %p\n", h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// change the value of var z by dereferencing h
*h = 22;
// print value of val
printf("main - value of z: %d\n", z);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
my_func(&h);
// print value of what h points to
printf("main - value of what h points to: %d\n", *h);
// print value of h
printf("main - value of h: %p\n", h);
free(h);
return 0;
}
We have the following output:
main - value of z: 10
main - addr of z: 0x7ffcb94fb1cc
main - value of h: 0x7ffcb94fb1cc
main - value of what h points to: 10
main - value of z: 22
main - value of what h points to: 22
my_func - value of z: 99
my_func - value of p: 0x7ffcb94fb1c0
my_func - value of h: 0x7ffcb94fb1cc
my_func - make p point to z
my_func - addr of z 0xc3b420
my_func - value of p 0x7ffcb94fb1c0
my_func - value of h 0xc3b420
my_func - value of what p points to: 99
main - value of what h points to: 99
main - value of h: 0xc3b420
Now we have actually changed the value which h holds from within my_func, by doing this:
changing the function signature to void my_func(int **p);
calling my_func(&h); from main(). Basically, we are passing the address of the pointer h to the double pointer p, declared as a parameter in the function's signature.
doing *p = z; in my_func(). We are dereferencing the double pointer p one level. Basically, this is equivalent to doing h = z;
p holds the address of the pointer h, and h now holds the address of the memory that the local z points to.
You can take both examples and diff them.
So, getting back to your question: you need a double pointer in order to modify, from inside the function, the pointer that you've passed in.
Think of memory location for head like [HEAD_DATA].
Now in your second scenario, the calling function's main_head is the pointer to this location.
main_head--->[HEAD_DATA]
In your code, the value of the pointer main_head (i.e. the address of the memory location of HEAD_DATA) is sent to the function.
You copied that to local_head in the function.
so now
local_head---> [HEAD_DATA]
and
main_head---> [HEAD_DATA]
Both point to the same location but are essentially independent of each other.
So when you write local_head = newnode;
what you did is
local_head--/-->[HEAD_DATA]
local_head-----> [NEWNODE_DATA]
You simply replaced the address stored in the local pointer with a new one.
The main_head (pointer) still points to the old [HEAD_DATA]
The standard way to handle linked lists in C is to have the push and pop functions automatically update the head pointer.
C is "Call by value" meaning copies of parameters are passed into functions. If you only pass in the head pointer any local update you make to that pointer will not be seen by the caller. The two workarounds are
1) Pass the address of the head pointer. (Pointer to head pointer)
2) Return a new head pointer, and rely on the caller to update the head pointer.
Option 1) is the easiest even though a little confusing at first.
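The two workarounds can be sketched side by side. The names push_by_address and push_by_return are invented for this illustration, and the node layout is an assumption.

```c
#include <stdlib.h>

/* Assumed node layout for illustration. */
struct node {
    int data;
    struct node *next;
};

/* Option 1: the caller passes the address of its head pointer,
 * so the function can update the caller's pointer itself. */
void push_by_address(struct node **head, int data)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return;                 /* allocation failed: list unchanged */
    n->data = data;
    n->next = *head;
    *head = n;                  /* writes through to the caller's head */
}

/* Option 2: the function returns the new head and relies on the
 * caller to assign it back. */
struct node *push_by_return(struct node *head, int data)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return head;            /* allocation failed: head unchanged */
    n->data = data;
    n->next = head;
    return n;
}
```

With option 1 the call is push_by_address(&head, 1);. With option 2 it has to be head = push_by_return(head, 2);, and forgetting the assignment silently loses the new node.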
The answer is more obvious if you take the time to write a working node insertion function; yours isn't one.
You need to be able to write over the head to move it forward, so you need a pointer to the pointer to the head so you can dereference it to get the pointer to the head and change it.
Imagine a case where you have to make certain changes and those changes should reflect back in the calling function.
Example:
void swap(int* a,int* b){
int tmp=*a;
*a=*b;
*b=tmp;
}
int main(void){
int a=10,b=20;
// To ascertain that changes made in swap reflect back here we pass the memory address
// instead of the copy of the values
swap(&a,&b);
}
Similarly, we pass the memory address of the head of the list.
This way, if a node is added and the value of the head changes, the change is reflected back in the caller, and we don't have to manually reset the head in the calling function.
This approach thus reduces the chance of memory leaks, since we would lose the pointer to the newly allocated node if we forgot to update the head in the calling function.
Besides this, the second code avoids returning and re-assigning a pointer, although any speed difference is likely to be negligible.
When we pass a pointer as a parameter to a function and want that same pointer updated, we use a double pointer.
On the other hand, if we pass a pointer as a parameter and receive it as a single pointer, we have to return the result to the calling function in order to use it.
I think the point is that it makes it easier to update nodes within a linked list. Where you would normally have to keep track of a pointer for previous and current you can have a double pointer take care of it all.
#include <iostream>
#include <math.h>
using namespace std;
class LL
{
private:
struct node
{
int value;
node* next;
node(int v_) :value(v_), next(nullptr) {};
};
node* head;
public:
LL()
{
head = nullptr;
}
void print()
{
node* temp = head;
while (temp)
{
cout << temp->value << " ";
temp = temp->next;
}
}
void insert_sorted_order(int v_)
{
if (!head)
head = new node(v_);
else
{
node* insert = new node(v_);
node** temp = &head;
while ((*temp) && insert->value > (*temp)->value)
temp = &(*temp)->next;
insert->next = (*temp);
(*temp) = insert;
}
}
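void remove(int v_)
{
node** temp = &head;
// stop at the matching node, or at the end if v_ is not in the list
while ((*temp) && (*temp)->value != v_)
temp = &(*temp)->next;
if (!(*temp))
return; // empty list or value not found: nothing to remove
node* d = (*temp);
(*temp) = (*temp)->next;
delete d;
}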
void insertRear(int v_)//single pointer; note that, despite the name, this inserts at the head
{
if (!head)
head = new node(v_);
else
{
node* temp = new node(v_);
temp->next = head;
head = temp;
}
}
};
Let's say I noted down your house address on card-1. Now if I want to tell your house address to somebody else, I can either copy the address from card-1 to card-2 and give them card-2, or I can give them card-1 directly. Either way the person will know the address and can reach you. But when I give card-1 directly, the address on card-1 can be changed, whereas if I give card-2, only the address on card-2 can be changed, not the one on card-1.
Passing a pointer to pointer is similar to giving access to card-1 directly. Passing a pointer is similar to creating a new copy of the address.
I think your confusion might come from the fact that both functions have a parameter named head. The two heads are actually different things. head in the first code stores the address of the head node pointer (which itself stores the address of the head node structure), whereas the second head stores the address of the head node structure directly. And since both functions return the newly created node (which should be the new head), I think there is no need to go for the first approach. Callers of this function are responsible for updating the head reference they have. I think the second one is good enough and simple to look at. I'd go with the second one.
The naming convention, Head, is the cause of the confusion.
The Head is the Tail and the Tail is the Head. The Tail wags the Head.
The Head is just a pointer whose data is null, and the Tail is just data whose pointer is null.
So you have a pointer to a struct pointer. The struct pointer points to the first node struct in the linked list.
This pointer to the first struct node pointer is called Head. It would be better called startptr or headptr.
When you catch hold of the startptr, you have caught hold of the linked list. Then you can traverse all the struct nodes.

simple linked list failing to print

I am learning how to make a linked list, but it's failing to print anything at all, and I can't figure out why. Please help. I believe it has something to do with my pointers, but I don't know what it is.
#include <stdio.h>
#include <stdlib.h>
// typedef is used to give a data type a new name
typedef struct node * link ;// link is now type struct node pointer
/*
typedef allows us to say "link ptr"
instead of "struct node * ptr"
*/
struct node{
int item ;// this is the data
link next ;//same as struct node * next, next is a pointer
};
void printAll(link head); // print a linked list , starting at link head
void addFirst(link ptr, int val ); // add a node with given value to a list
link removeLast(link ptr); // removes and returns the last element in the link
//prints the link
void printAll(link head){
link ptr = head;
printf("\nPrinting Linked List:\n");
while(ptr != NULL){
printf(" %d ", (*ptr).item);
ptr = (*ptr).next;// same as ptr->next
}
printf("\n");
}
//adds to the head of the link
void addFirst(link ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = ptr;
ptr = tmp;
}
// testing
int main(void) {
link head = NULL;// same as struct node * head, head is a pointer type
//populating list
for(int i = 0; i<3; i++){
addFirst(head, i);
}
printAll(head);
return 0;
}
output:
Printing Linked List:
Process returned 0 (0x0) execution time : 0.059 s
Press any key to continue
It's because you're passing a null pointer to your function and the condition for exiting the loop is for that pointer to be null, so nothing happens.
Your addFirst function takes a pointer's value, but it cannot modify the head that you declared inside of main().
To modify head you need to pass a pointer to link, then you can dereference that pointer to access your head and you can then change it.
void addFirst(link *ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr;
*ptr = tmp;
}
Now you can change the head pointer. Just remember to pass the address to it when calling the function. addFirst(&head,i)
In the for loop
for(int i = 0; i<3; i++){
addFirst(head, i);
}
you create a bunch of nodes whose next pointers are all NULL, and each one is leaked. head never changes, since the pointer itself is passed "by value". I.e. head is copied, and modifications to the pointer itself in addFirst are not visible outside.
This is the same as with, say, int. Imagine void foo(int x);. Whatever this function does to x is not visible outside.
However, changes to the memory which link ptr points to are of course visible.
I.e. this line has no effect outside the function:
tmp->next = ptr;
ptr = tmp; // <=== this line
}
You can fix this in several ways. One is to return the new node from addFirst, and another is to make link ptr a pointer to pointer: link *ptr. The second fits here, since you want to change the pointer value (not the pointee value):
//link *ptr here a pointer to pointer
void addFirst(link * ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr; //<<changed
*ptr = tmp; //<<changed
}
Do not forget to update the declaration above as well, and the call:
void addFirst(link * ptr, int val ); // add a node with given value to a list
...
for(int i = 0; i<3; i++){
addFirst(&head, i);
}
Then this code produces:
Printing Linked List:
2 1 0
Added:
It's important to understand that working with a linked list requires working with two different kinds of data.
The first is struct node, and you pass this kind of data around using links.
The second is head. This is a pointer to the very first node. When you want to modify head, you find it is not a "node". It is something else: a "name" for the first node in the list. This name is itself a pointer to a node. See how the memory layout for head differs from the list itself (sizes assume a typical 64-bit platform):
head[8 bytes]->node1[16 bytes]->node2[16 bytes]->...->nodek[16 bytes]->NULL;
By the way, the only thing which has a lexical name here is head. The nodes have no names and are accessible only through the node->next syntax.
You can also imagine another pointer here, link last, which points to nodek. Again, it has a different memory layout from the nodes themselves. And if you want to modify it in a function, you need to pass the function a pointer to it (i.e. a pointer to pointer).
A pointer and the data it points to are different things, and you need to separate them in your mind. A pointer is like an int or a float: it is passed "by value" to functions. Yes, link ptr is already a pointer, and that permits you to update the data it points to. However, the pointer itself is passed by value, and updates to the pointer (in your case ptr=tmp) are not visible outside.
(*ptr).next=xxx will be visible, of course, because data is updated (not the pointer). That means you need one extra step to make changes to your pointer visible outside the function: convert the pointer itself (head) into data for another pointer, i.e. use struct node **ptr (the first star says this is a pointer to a node, and the second star turns that pointer into data for another pointer).
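To make the link last idea concrete, here is a rough sketch of an append function that receives both head and last as pointers to pointers. The function name addLast is an assumption for this example; it is not part of the question's code.

```c
#include <stdlib.h>

typedef struct node *link;

struct node {
    int item;
    link next;
};

/* Append `val` at the end of the list.
 * `head` and `last` are the addresses of the caller's two pointers, so
 * updates to the pointers themselves are visible outside the function.
 * Returns 0 on success, -1 if malloc fails. */
int addLast(link *head, link *last, int val)
{
    link tmp = malloc(sizeof *tmp);
    if (tmp == NULL)
        return -1;
    tmp->item = val;
    tmp->next = NULL;

    if (*last == NULL)
        *head = tmp;            /* empty list: new node is also the head */
    else
        (*last)->next = tmp;    /* pointee update: link the old last node */

    *last = tmp;                /* pointer update: caller's last now moves */
    return 0;
}
```

A caller would write link head = NULL, last = NULL; and then addLast(&head, &last, 1);. Passing last by value instead would leave the caller's last stuck at its old value, for the same reason ptr = tmp had no effect in addFirst.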

Malloc function in dynamic lists

I'm getting started with dynamic lists and I don't understand why it is necessary to use the malloc function even when declaring the first node in the main() program. The piece of code below should just print the data contained in the first node, but if I don't initialize the node with the malloc function it just doesn't work:
struct node{
int data;
struct node* next;
};
void insert(int val, struct node*);
int main() {
struct node* head ;
head->data = 2;
printf("%d \n", head->data);
}
You don’t technically, but maintaining all nodes with the same memory pattern is only an advantage to you, with no real disadvantages.
Just assume that all nodes are stored in the dynamic memory.
Your “insert” procedure would be better named something like “add” or (for full functional context) “cons”, and it should return the new node:
struct node* cons(int val, struct node* next)
{
struct node* this = (struct node*)malloc( sizeof(struct node) ); // sizeof needs parentheses around a type name
if (!this) return next; // or some other error condition!
this->data = val;
this->next = next;
return this;
}
Building lists is now very easy:
int main()
{
struct node* xs = cons( 2, cons( 3, cons( 5, cons( 7, NULL ) ) ) );
// You now have a list of the first four prime numbers.
And it is easy to handle them.
// Let’s print them!
{
struct node* p = xs;
while (p)
{
printf( "%d ", p->data );
p = p->next;
}
printf( "\n" );
}
// Let’s get the length!
int length = 0;
{
struct node* p = xs;
while (p)
{
length += 1;
p = p->next;
}
}
printf( "xs is %d elements long.\n", length );
By the way, you should try to be as consistent as possible when naming things. You have named the node data “data” but the constructor’s argument calls it “val”. You should pick one and stick to it.
Also, it is common to:
typedef struct node node;
Now in every place except inside the definition of struct node you can just use the word node.
Oh, and I almost forgot: Don’t forget to clean up with a proper destructor.
node* destroy( node* root )
{
if (!root) return NULL;
destroy( root->next );
free( root );
return NULL;
}
And an addendum to main():
int main()
{
node* xs = ...
...
xs = destroy( xs );
}
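One caveat worth noting: the recursive destroy uses one stack frame per node, which could become a problem for very long lists. A possible iterative variant is sketched below. The name destroy_iter is made up for this example, and cons is repeated (with the sizeof parentheses C requires) only so the block is self-contained.

```c
#include <stdlib.h>

typedef struct node node;

struct node {
    int data;
    node *next;
};

/* Same idea as cons above: allocate a node in front of `next`. */
node *cons(int val, node *next)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return next;            /* or some other error condition */
    n->data = val;
    n->next = next;
    return n;
}

/* Iterative alternative to destroy(): frees every node in a loop.
 * Returns NULL so the caller can write xs = destroy_iter(xs); */
node *destroy_iter(node *root)
{
    while (root != NULL) {
        node *next = root->next;    /* save the link before freeing */
        free(root);
        root = next;
    }
    return NULL;
}
```

It is used the same way as the recursive version: xs = destroy_iter(xs);.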
When you declare a variable, you define the type of the variable, then its
name, and optionally its initial value.
Every type needs a specific amount of memory. For example, int is 32 bits
wide on most current platforms, 32-bit and 64-bit alike.
A variable declared in a function is usually stored on the stack associated
with the function. When the function returns, the stack for that function is
no longer available and the variable no longer exists.
When you need the value/object of the variable to exist even after a function
returns, you need to allocate memory in a different part of the program's memory,
usually the heap. That's exactly what malloc, realloc and calloc do.
Doing
struct node* head ;
head->data = 2;
is just wrong. You're declaring a pointer named head of type struct node *,
but you are not assigning anything to it. So it points to an unspecified
location in memory. head->data = 2 tries to store a value at an unspecified
location, and the program will most likely crash with a segfault.
In main you could do this:
int main(void)
{
struct node head;
head.data = 2;
printf("%d \n", head.data);
return 0;
}
head will be saved in the stack and will persist as long as main doesn't
return. But this is only a very small example. In a complex program where you
have many more variables, objects, etc. it's a bad idea to simply declare all
variables you need in main. So it's best that objects get created when they
are needed.
For example, you could have one function, create_node, that creates the object
and another one that calls create_node and uses that object.
struct node *create_node(int data)
{
struct node *head = malloc(sizeof *head);
if(head == NULL)
return NULL; // no more memory left
head->data = data;
head->next = NULL;
return head;
}
struct node *foo(void)
{
struct node *head = create_node(112);
// do something with head
return head;
}
Here create_node uses malloc to allocate memory for one struct node
object, initializes the object with some values and returns a pointer to that memory location.
foo calls create_node and does something with it and it returns the
object. If another function calls foo, this function will get the object.
There are also other reasons for malloc. Consider this code:
void foo(void)
{
int numbers[4] = { 1, 3, 5, 7 };
...
}
In this case you know that you will need 4 integers. But sometimes you need an
array where the number of elements is only known during runtime, for example
because it depends on some user input. For this you can also use malloc.
void foo(int size)
{
int *numbers = malloc(size * sizeof *numbers);
// now you have "size" elements
...
free(numbers); // freeing memory
}
When you use malloc, realloc or calloc, you'll need to free the memory. If
your program does not need the memory anymore, you have to use free (like in
the last example). Note that for simplicity I omitted the use of free in the
examples with struct node.
What you have invokes undefined behavior because you don't really have a node; you have a pointer to a node that doesn't actually point to a node. Using malloc and friends creates a memory region where an actual node object can reside, and where a node pointer can point.
In your code, struct node* head is a pointer that points to nowhere, and dereferencing it as you have done is undefined behavior (which can commonly cause a segfault). You must point head to a valid struct node before you can safely dereference it. One way is like this:
int main() {
struct node* head;
struct node myNode;
head = &myNode; // assigning the address of myNode to head, now head points somewhere
head->data = 2; // this is legal
printf("%d \n", head->data); // will print 2
}
But in the above example, myNode is a local variable, and will go out of scope as soon as the function exits (in this case main). As you say in your question, for linked lists you generally want to malloc the data so it can be used outside of the current scope.
int main() {
struct node* head = malloc(sizeof(struct node));
if (head != NULL)
{
// we received a valid memory block, so we can safely dereference
// you should ALWAYS initialize/assign memory when you allocate it.
// malloc does not do this, but calloc does (initializes it to 0) if you want to use that
// you can use malloc and memset together.. in this case there's just
// two fields, so we can initialize via assignment.
head->data = 2;
head->next = NULL;
printf("%d \n", head->data);
// clean up memory when we're done using it
free(head);
}
else
{
// we were unable to obtain memory
fprintf(stderr, "Unable to allocate memory!\n");
}
return 0;
}
This is a very simple example. Normally for a linked list, you'll have insert function(s) (where the mallocing generally takes place) and remove function(s) (where the freeing generally takes place). You'll at least have a head pointer that always points to the first item in the list, and for a doubly linked list you'll want a tail pointer as well. There can also be print functions, deleteEntireList functions, etc. But one way or another, you must allocate space for an actual object. malloc is a way to do that so the memory remains valid for as long as your program needs it.
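A minimal sketch of such an insert/remove pair might look like the following. The names list_insert and list_remove and the out-parameter convention are assumptions made for this illustration, not an established API.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Insert at the front of the list: this is where malloc happens.
 * Returns 0 on success, -1 if no memory could be obtained. */
int list_insert(struct node **head, int data)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = data;
    n->next = *head;
    *head = n;
    return 0;
}

/* Remove the front node: this is where free happens.
 * Stores the removed value in *out.
 * Returns 0 on success, -1 if the list is empty. */
int list_remove(struct node **head, int *out)
{
    struct node *n = *head;
    if (n == NULL)
        return -1;
    *out = n->data;
    *head = n->next;
    free(n);
    return 0;
}
```

Every node that list_insert allocates is eventually released by a matching list_remove, so the caller never calls malloc or free on nodes directly.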
edit:
Incorrect. This absolutely applies to int and int*; it applies to any object and pointer(s) to it. If you were to have the following:
int main() {
int* head;
*head = 2; // head uninitialized and unassigned, this is UB
printf("%d\n", *head); // UB again
return 0;
}
This is every bit as much undefined behavior as what you have in your OP. A pointer must point to something valid before you can dereference it. In the above code, head is uninitialized, it doesn't point to anything deterministically, and as soon as you do *head (whether to read or write), you're invoking undefined behavior. Just as with your struct node, you must do something like the following to be correct:
int main() {
int myInt; // creates space for an actual int in automatic storage (most likely the stack)
int* head = &myInt; // now head points to a valid memory location, namely myInt
*head = 2; // now myInt == 2
printf("%d\n", *head); // prints 2
return 0;
}
or you can do
int main() {
int* head = malloc(sizeof(int)); // silly to malloc a single int, but this is for illustration purposes
if (head != NULL)
{
// space for an int was returned to us from the heap
*head = 2; // now the unnamed int that head points to is 2
printf("%d\n", *head); // prints out 2
// don't forget to clean up
free(head);
}
else
{
// handle error, print error message, etc
}
return 0;
}
These rules are true for any primitive type or data structure you're dealing with. Pointers must point to something, otherwise dereferencing them is undefined behavior, and you hope you get a segfault when that happens so you can track down the errors before your TA grades it or before the customer demo. Murphy's law dictates UB will always crash your code when it's being presented.
The statement struct node* head; defines a pointer to a node object, but not the node object itself. As you do not initialize the pointer (e.g. by letting it point to a node object created by a malloc call), dereferencing this pointer as you do with head->data yields undefined behaviour.
There are two ways to overcome this: (1) allocate memory dynamically, yielding an object with dynamic storage duration, or (2) define the object itself as, for example, a local variable with automatic storage duration:
(1) dynamic storage duration
int main() {
struct node* head = calloc(1, sizeof(struct node));
if (head) {
head->data = 2;
printf("%d \n", head->data);
free(head);
}
}
(2) automatic storage duration
int main() {
struct node head;
head.data = 2;
printf("%d \n", head.data);
}

Conversion from double to pointer

I'm quite new to C and I'm trying to implement a basic generic linked list with 3 elements; each element will contain a value of a different data type: int, char and double.
Here is my code:
#include <stdio.h>
#include <stdlib.h>
struct node
{
void* data;
struct node* next;
};
struct node* BuildOneTwoThree()
{
struct node* head = NULL;
struct node* second = NULL;
struct node* third = NULL;
head = (struct node*)malloc(sizeof(struct node));
second = (struct node*)malloc(sizeof(struct node));
third = (struct node*)malloc(sizeof(struct node));
head->data = (int*)malloc(sizeof(int));
(int*)(head->data) = 2;
head->next = second;
second->data = (char*)malloc(sizeof(char));
(char*)second->data = 'b';
second->next = third;
third->data = (double*)malloc(sizeof(double));
(double*)third->data = 5.6;
third->next = NULL;
return head;
}
int main(void)
{
struct node* lst = BuildOneTwoThree();
printf("%d\n", lst->data);
printf("%c\n", lst->next->data);
printf("%.2f\n", lst->next->next->data);
return 0;
}
I have no problem with the first two elements, but when I try to assign a value of type double to the third element I get an error:
can not convert from double to double *
My questions:
What is the reason for this error?
Why don't I get the same error in case of int or char?
How to assign a double value to the data field of the third element?
The problem line is (double*)third->data = 5.6;.
In your "working" examples, you're calling malloc to get a pointer to some newly-allocated space, and then immediately throwing that pointer away and replacing the pointer value with your integer or character value. That works more or less by accident because in most C implementations a pointer cell can hold an integer or char value, though you should be getting warnings. If you actually tried to dereference the data pointer after those assignments, you would probably get a crash and core dump.
You want to put the values at the place pointed to by the pointer, not in the pointer itself. Which means you need an extra *:
*((double *)third->data) = 5.6;
The * in the (double *) typecast is part of the type name - "pointer to double". The cast says "take the value of third->data and interpret it as a pointer to double". The result is still a pointer, so when you assign to that, you are changing where the pointer points (and probably making it point somewhere meaningless). Instead, you want to assign a value to the place it already points, which is what the outer * does.
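A minimal sketch of the "write through the pointer" pattern described above, using the question's struct node. The helper names make_double_node, node_as_double and free_node are hypothetical and introduced only for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    void *data;
    struct node *next;
};

/* Hypothetical helper: allocates a node whose data points to a heap copy
   of value. Returns NULL if either allocation fails. */
struct node *make_double_node(double value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = malloc(sizeof(double));
    if (n->data == NULL) {
        free(n);
        return NULL;
    }
    *(double *)n->data = value;  /* outer * stores into the pointed-to double */
    n->next = NULL;
    return n;
}

/* Reads the stored double back; the caller must know the node holds a double. */
double node_as_double(const struct node *n)
{
    return *(const double *)n->data;
}

/* Releases the payload first, then the node itself. */
void free_node(struct node *n)
{
    if (n != NULL) {
        free(n->data);
        free(n);
    }
}
```

With these helpers, struct node *third = make_double_node(5.6); followed by printf("%.2f\n", node_as_double(third)); reads the value through the same cast-and-dereference that was used to store it.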
However, if you're only storing basic types like int, char, and double, you don't need to go through a pointer (and worry about the attendant memory management). You can just use a union:
struct node
{
struct node *next;
union {
char c;
int i;
double d;
} data;
};
Then you would do e.g.
head->data.i = 2;
second->data.c = 'b';
third->data.d = 5.6;
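Put together, the union approach might look like the sketch below. The function name build_one_two_three and the choice to have the caller own the three nodes are assumptions made for illustration; the point is that no separate payload allocation is needed.

```c
#include <stddef.h>

struct node {
    struct node *next;
    union {
        char c;
        int i;
        double d;
    } data;
};

/* Hypothetical helper: links three caller-owned nodes and stores one
   union member in each. The values live inside the nodes themselves. */
void build_one_two_three(struct node *head, struct node *second,
                         struct node *third)
{
    head->data.i = 2;
    head->next = second;
    second->data.c = 'b';
    second->next = third;
    third->data.d = 5.6;
    third->next = NULL;
}
```

A union holds only one member at a time, so each node should be read through the same member that was last written (head->data.i, second->data.c, third->data.d).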
You are casting the pointer, but you need to dereference it for the assignment. The assignment appeared to work for the first two because the int and char values were converted to pointers. It should be:
*((int*)(head->data)) = 2;
*((char*)(second->data)) = 'b';
*((double*)(third->data)) = 5.6;
Anyway, there should be a warning for such a cast in the first place.
You can't assign the value to the pointer; you have to assign it to the object it points to (in the last case, a double, and your code only allocates space for one double).
So:
...
head->data = (int*)malloc(sizeof(int));
((int*)(head->data))[0] = 2;
head->next = second;
second->data = (char*)malloc(sizeof(char));
((char*)second->data)[0] = 'b';
second->next = third;
third->data = (double*)malloc(2 * sizeof(double));
((double*)third->data)[0] = 5.6;
((double*)third->data)[1] = 3.1415;
// We only allocated space for 2 doubles, so this line here would cause a crash
// (or anyway, a data corruption)
// ((double*)third->data)[2] = 666;
third->next = NULL;
return head;
}
int main(void)
{
struct node* lst = BuildOneTwoThree();
printf("%d\n", ((int *)lst->data)[0]);
printf("%c\n", ((char *)lst->next->data)[0]);
printf("%.2f\n", ((double *)lst->next->next->data)[0]);
printf("%.2f\n", ((double *)lst->next->next->data)[1]);
...
returns:
2
b
5.60
3.14
BTW: with full warnings enabled, the compiler should have warned you that the first two assignments were risky (GCC considers them errors) and that the third one is not allowed (a double can't be converted to a pointer).
One more thing: when you use a struct payload this way, you have to consider that the data type that you actually stored in the payload itself is lost. So you can't, by inspecting an instance of your linked list, determine whether it is a char, an integer or a double. Worse, even checking the value might not be allowed and crash the program (suppose you stored a single byte, but try to read four or eight).
So you ought to also store an extra field in your structure holding an indicator (an enum maybe) of the original data type:
typedef enum
{
TYPE_IS_CHAR,
TYPE_IS_INT,
TYPE_IS_FLOAT,
TYPE_IS_DOUBLE,
...
} mytype_t;
struct node
{
mytype_t type;
void *data;
struct node *next;
};
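As a rough sketch of how such a tag could be used, the code below picks the cast from the stored type before reading the payload. The enum is trimmed to three values for brevity, and the helpers make_node and format_node are hypothetical names, not part of the original answer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

typedef enum { TYPE_IS_CHAR, TYPE_IS_INT, TYPE_IS_DOUBLE } mytype_t;

struct node {
    mytype_t type;
    void *data;
    struct node *next;
};

/* Hypothetical helper: copies size bytes from value into a freshly
   allocated payload and records the type tag. Returns NULL on failure. */
struct node *make_node(mytype_t type, const void *value, size_t size)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = malloc(size);
    if (n->data == NULL) {
        free(n);
        return NULL;
    }
    memcpy(n->data, value, size);
    n->type = type;
    n->next = NULL;
    return n;
}

/* Hypothetical helper: writes the payload as text into buf, choosing the
   cast from the tag. Returns snprintf's result, or -1 for an unknown tag. */
int format_node(const struct node *n, char *buf, size_t len)
{
    switch (n->type) {
    case TYPE_IS_CHAR:
        return snprintf(buf, len, "%c", *(const char *)n->data);
    case TYPE_IS_INT:
        return snprintf(buf, len, "%d", *(const int *)n->data);
    case TYPE_IS_DOUBLE:
        return snprintf(buf, len, "%.2f", *(const double *)n->data);
    }
    return -1;
}
```

Because the tag travels with the node, code walking the list never has to guess how many bytes the payload holds or which cast is safe.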

Resources