Pointers to structs in C vs pointers to arrays - c

Do pointers to structures in C work differently than other pointers?
For example in this code:
typedef struct node {
int data;
struct node *next;
} node;
void insert(node **head, int data) {
node *new_node = malloc(sizeof(node));
new_node->data = data;
new_node->next = *head;
*head = new_node;
}
int main() {
node *head = NULL;
insert(&head, 6);
}
Why do I have to use a pointer to a pointer and can't use the variable head in the insert function like in this example with arrays:
void moidify(int *arr) {
*arr = 3;
}
int main() {
int *array = malloc(8);
*array = 1;
*(array + 1) = 2;
moidify(array);
}
Here I don't have to pass &array to the function.

There is no difference. If you want to change the value of a variable you pass to a function in such a way that the change is visible in the caller, you need to pass the variable's address to the function, which is what you do when you take the address of head.
In moidify(array) you pass a pointer to the first element of the array, which is why modifying the array data works. If you wanted to modify the array variable itself (for example by making it point somewhere else), you would have to take its address too. Example:
void moidify(int **arr) {
*arr = realloc(*arr, 128);
if(*arr == NULL) {
perror(__func__);
exit(1);
}
}
int main() {
int *array = malloc(8);
*array = 1;
*(array + 1) = 2;
moidify(&array);
}

You need to understand how pointers work to get this one.
Here, the variable array is not, properly speaking, an array. It is a pointer to a block of memory of 8 bytes (room for two ints if sizeof(int) is 4). It contains only an address. From that address you can reach the values stored in the block, moving from it to the element you want to read or write.
Once that is understood: when you call the moidify function, you are not passing the array, nor the memory block. You are passing the address of the memory block. The function gets a copy of that address in its parameter int *arr.
Hence, you can use it the same way you use it in main.
If you wanted to change the address stored in the array variable, you would need to pass &array to the function, which would then take an int ** parameter.
Your struct example is the same as this last case: you want to change which address head points to, so you need to give &head to the function. That gives it the address of head, and lets it modify the address head contains.
You use an address to reach the variable called head, in order to modify the address stored in head, which points to another block of memory, where your struct actually lives.
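The two cases described above can be put side by side in a small sketch. The helper names set_first and repoint are made up for illustration and do not appear in the question; the element count passed to repoint is likewise arbitrary.

```c
#include <stdlib.h>

/* Receives a copy of the address and writes through it to the caller's memory. */
void set_first(int *arr) {
    *arr = 3;    /* visible to the caller: the pointed-to int changes */
    arr = NULL;  /* not visible: only the local copy of the address changes */
}

/* Receives the address of the caller's pointer variable, so it can make
   that variable point somewhere else. Returns 0 on success, -1 on failure. */
int repoint(int **arr, size_t new_count) {
    int *tmp = realloc(*arr, new_count * sizeof **arr);
    if (tmp == NULL)
        return -1;  /* *arr is left untouched and is still valid */
    *arr = tmp;
    return 0;
}
```

A caller would write set_first(array) but repoint(&array, 32), mirroring insert(&head, 6) in the struct example.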

Related

I can alter a struct member from one location but not from the other

I am trying to implement a linked list in C - starting simple, with one list containing one node. However, I stumble upon some issues when trying to add data to the node. Here's my implementation thus far:
struct mylist_node {
int data;
};
struct mylist {
struct mylist_node *head_pt;
};
void mylist_init(struct mylist* l){
struct mylist_node head_node;
head_node.data = 5; //First try
l->head_pt = &head_node;
l->head_pt->data = 5; //Second try
};
And my main method:
int main()
{
struct mylist ml, *ml_pointer;
ml_pointer = &ml;
mylist_init(ml_pointer);
printf("%d\n", ml_pointer->head_pt->data);
ml_pointer->head_pt->data = 4;
printf("%d\n", ml_pointer->head_pt->data);
return 0;
}
This should print out
5
4
If my knowledge of pointers is correct. However, it prints out
0
4
As you can see I try to set the node data twice within the mylist_init method. Neither appears to be working - meanwhile, writing to and reading from it from my main method works just fine. What am I doing wrong?
In mylist_init, you're storing the address of a local variable in the struct pointed to by l. That variable's lifetime ends when the function returns, so the memory it occupied is no longer valid, and the pointer that pointed to it now points to an invalid location. Keeping the address of a local variable and dereferencing that address after the function has returned invokes undefined behavior.
Your function needs to allocate memory dynamically using malloc so the memory will still be valid when the function returns.
void mylist_init(struct mylist* l){
struct mylist_node *head_node = malloc(sizeof(*head_node));
l->head_pt = head_node; // NULL if the allocation failed
if (head_node != NULL)
head_node->data = 5;
}
Also, don't forget to free the memory when you're done using it.
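As a rough sketch of what freeing could look like here: the struct definitions are copied from the question, mylist_init is a variant that reports failure through its return value, and mylist_free is a hypothetical helper name that is not part of the original code.

```c
#include <stdlib.h>

struct mylist_node {
    int data;
};

struct mylist {
    struct mylist_node *head_pt;
};

/* Allocates the head node on the heap; returns 0 on success, -1 on failure. */
int mylist_init(struct mylist *l) {
    l->head_pt = malloc(sizeof *l->head_pt);
    if (l->head_pt == NULL)
        return -1;
    l->head_pt->data = 5;
    return 0;
}

/* Hypothetical cleanup counterpart: releases the node and clears the pointer. */
void mylist_free(struct mylist *l) {
    free(l->head_pt);   /* free(NULL) is a no-op, so this is safe after a failed init */
    l->head_pt = NULL;  /* avoid leaving a dangling pointer behind */
}
```

In main, mylist_free(ml_pointer) would go after the last use of ml_pointer->head_pt.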
For starters, you have to allocate memory for your node. The way you were doing it, your node is a local variable on the stack, which will likely get overwritten after the function exits.
void mylist_init(struct mylist* l)
{
struct mylist_node *head_node = malloc(sizeof(struct mylist_node));
if (head_node != NULL)
head_node->data = 5; // head_node is a pointer, so use -> rather than .
l->head_pt = head_node;
}

simple linked list failing to print

I am learning how to make a linked list, but it's failing to print out anything at all, and I can't figure out why. Please help. I believe it has something to do with my pointers, but I don't know what it is.
#include <stdio.h>
#include <stdlib.h>
// typedef is used to give a data type a new name
typedef struct node * link ;// link is now type struct node pointer
/*
typedef allows us to say "link ptr"
instead of "struct node * ptr"
*/
struct node{
int item ;// this is the data
link next ;//same as struct node * next, next is a pointer
};
void printAll(link head); // print a linked list , starting at link head
void addFirst(link ptr, int val ); // add a node with given value to a list
link removeLast(link ptr); // removes and returns the last element in the link
//prints the link
void printAll(link head){
link ptr = head;
printf("\nPrinting Linked List:\n");
while(ptr != NULL){
printf(" %d ", (*ptr).item);
ptr = (*ptr).next;// same as ptr->next
}
printf("\n");
}
//adds to the head of the link
void addFirst(link ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = ptr;
ptr = tmp;
}
// testing
int main(void) {
link head = NULL;// same as struct node * head, head is a pointer type
//populating list
for(int i = 0; i<3; i++){
addFirst(head, i);
}
printAll(head);
return 0;
}
output:
Printing Linked List:
Process returned 0 (0x0) execution time : 0.059 s
Press any key to continue
It's because you're passing a null pointer to your function and the condition for exiting the loop is for that pointer to be null, so nothing happens.
Your addFirst function takes a pointer's value, but it cannot modify the head that you declared inside of main().
To modify head you need to pass a pointer to link, then you can dereference that pointer to access your head and you can then change it.
void addFirst(link *ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr;
*ptr = tmp;
}
Now you can change the head pointer. Just remember to pass its address when calling the function: addFirst(&head, i)
In the for loop
for(int i = 0; i<3; i++){
addFirst(head, i);
}
you create a bunch of nodes whose next pointers are all NULL, and then lose track of them. head never changes, since the pointer itself is passed "by value", i.e. head is copied and all modifications to the pointer itself in addFirst are not visible outside.
This is the same as with, say, int. Imagine void foo(int x);. Whatever this function does to x is not visible outside.
However, changes to the memory which link ptr points to are of course visible.
So the marked line has no effect outside the function:
tmp->next = ptr;
ptr = tmp; // this line only changes the local copy
}
You can fix this in several ways. One is to return the new node from addFirst; another is to make link ptr a pointer to pointer, link *ptr, since in this case you want to change the pointer value (not the pointee value):
//link *ptr here a pointer to pointer
void addFirst(link * ptr, int val ){
link tmp = malloc(sizeof(struct node));// allocates memory for the node
tmp->item = val;
tmp->next = *ptr; //<<changed
*ptr = tmp; //<<changed
}
Do not forget to update the declaration above as well, and the call:
void addFirst(link * ptr, int val ); // add a node with given value to a list
...
for(int i = 0; i<3; i++){
addFirst(&head, i);
}
Then this code produces:
Printing Linked List:
2 1 0
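The other fix mentioned above, returning the new node instead of taking a pointer to pointer, might look roughly like this. The name addFirstReturning is invented for this sketch, and struct node * is spelled out instead of using the link typedef.

```c
#include <stdlib.h>

struct node {
    int item;
    struct node *next;
};

/* Variant of addFirst that returns the new head instead of writing
   through a pointer to pointer. If the allocation fails, the list is
   returned unchanged. */
struct node *addFirstReturning(struct node *head, int val) {
    struct node *tmp = malloc(sizeof *tmp);
    if (tmp == NULL)
        return head;
    tmp->item = val;
    tmp->next = head;
    return tmp;
}
```

The call in the loop then becomes head = addFirstReturning(head, i); forgetting the assignment reintroduces the original bug.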
Added:
It's important to understand that working with a linked list requires working with two different kinds of data.
The first is struct node, and you pass this kind of data around using links.
The second is head. This is a pointer to the very first node. When you want to modify head, you find it is not a "node". It is something else: a "name" for the first node in the list. This name is itself a pointer to a node. See how the memory layout for head differs from that of the list itself (sizes shown for a typical 64-bit platform):
head[8 bytes]->node1[16 bytes]->node2[16 bytes]->...->nodek[16 bytes]->NULL;
By the way, the only thing here that has a lexical name is head. The nodes have no names and are accessible only through the node->next syntax.
You can also imagine another pointer here, link last, which points to nodek. Again, this has a different memory layout from the nodes themselves. And if you want to modify it in a function, you need to pass the function a pointer to it (i.e. a pointer to pointer).
A pointer and the data it points to are different things, and you need to keep them separate in your mind. A pointer is like an int or a float: it is passed "by value" to functions. Yes, link ptr is already a pointer, and that permits you to update the data it points to. However, the pointer itself is passed by value, and updates to the pointer (in your case ptr=tmp) are not visible outside.
(*ptr).next=xxx will be visible, of course, because data is updated (not the pointer). That means you need one extra step to make changes to your pointer visible outside the function: turn the pointer itself (head) into data for another pointer, i.e. use struct node **ptr (the first star says this is a pointer to a node, and the second star turns that pointer into data for another pointer).
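To make the head/last point concrete, here is a sketch of an append function that has to update both of the caller's pointers. addLast is a hypothetical name, and the sketch assumes that last points to the final node whenever the list is non-empty.

```c
#include <stdlib.h>

struct node {
    int item;
    struct node *next;
};

/* Appends val at the end of the list. Both head and last live in the
   caller, so both are passed by address. Returns 0 on success, -1 on failure. */
int addLast(struct node **head, struct node **last, int val) {
    struct node *tmp = malloc(sizeof *tmp);
    if (tmp == NULL)
        return -1;
    tmp->item = val;
    tmp->next = NULL;
    if (*head == NULL)
        *head = tmp;          /* empty list: the new node is also the first */
    else
        (*last)->next = tmp;  /* updates data inside the old last node */
    *last = tmp;              /* updates the caller's last pointer itself */
    return 0;
}
```

Note the two kinds of update: (*last)->next = tmp changes node data, while *last = tmp changes a pointer variable that belongs to the caller.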

Malloc function in dynamic lists

I'm getting started with dynamic lists and I don't understand why it is necessary to use the malloc function even when declaring the first node in the main() program. The piece of code below should just print the data contained in the first node, but if I don't initialize the node with the malloc function it just doesn't work:
struct node{
int data;
struct node* next;
};
void insert(int val, struct node*);
int main() {
struct node* head ;
head->data = 2;
printf("%d \n", head->data);
}
You don’t technically, but maintaining all nodes with the same memory pattern is only an advantage to you, with no real disadvantages.
Just assume that all nodes are stored in the dynamic memory.
Your “insert” procedure would be better named something like “add” or (for full functional context) “cons”, and it should return the new node:
struct node* cons(int val, struct node* next)
{
struct node* this = malloc( sizeof (struct node) ); // sizeof needs parentheses around a type name
if (!this) return next; // or some other error condition!
this->data = val;
this->next = next;
return this;
}
Building lists is now very easy:
int main()
{
struct node* xs = cons( 2, cons( 3, cons( 5, cons( 7, NULL ) ) ) );
// You now have a list of the first four prime numbers.
And it is easy to handle them.
// Let’s print them!
{
struct node* p = xs;
while (p)
{
printf( "%d ", p->data );
p = p->next;
}
printf( "\n" );
}
// Let’s get the length!
int length = 0;
{
struct node* p = xs;
while (p)
{
length += 1;
p = p->next;
}
}
printf( "xs is %d elements long.\n", length );
By the way, you should try to be as consistent as possible when naming things. You have named the node data “data” but the constructor’s argument calls it “val”. You should pick one and stick to it.
Also, it is common to:
typedef struct node node;
Now in every place except inside the definition of struct node you can just use the word node.
Oh, and I almost forgot: Don’t forget to clean up with a proper destructor.
node* destroy( node* root )
{
if (!root) return NULL;
destroy( root->next );
free( root );
return NULL;
}
And an addendum to main():
int main()
{
node* xs = ...
...
xs = destroy( xs );
}
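The recursive destroy above uses one call frame per node, which might become a concern for very long lists. A loop-based variant is sketched below; destroy_iter is a name chosen for this example, not something from the answer.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};
typedef struct node node;

/* Iterative counterpart of destroy(): frees every node and returns NULL. */
node *destroy_iter(node *root) {
    while (root) {
        node *next = root->next;  /* save the link before freeing the node */
        free(root);
        root = next;
    }
    return NULL;
}
```

It is used the same way: xs = destroy_iter( xs );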
When you declare a variable, you define the type of the variable, then its
name, and optionally its initial value.
Every type needs a specific amount of memory. For example, int is typically
32 bits (4 bytes) wide on common 32-bit and 64-bit platforms.
A variable declared in a function is usually stored in the stack frame associated
with the function call. When the function returns, that stack frame is
no longer available and the variable no longer exists.
When you need the value/object of the variable to exist even after a function
returns, then you need to allocate memory on a different part of the program,
usually the heap. That's exactly what malloc, realloc and calloc do.
Doing
struct node* head ;
head->data = 2;
is just wrong. You're declaring a pointer named head of type struct node *,
but you are not assigning anything to it. So it points to an indeterminate
location in memory. head->data = 2 tries to store a value at that
location, which is undefined behavior; the program will most likely crash with a segfault.
In main you could do this:
int main(void)
{
struct node head;
head.data = 2;
printf("%d \n", head.data);
return 0;
}
head will be saved in the stack and will persist as long as main doesn't
return. But this is only a very small example. In a complex program where you
have many more variables, objects, etc. it's a bad idea to simply declare all
variables you need in main. So it's best that objects get created when they
are needed.
For example you could have a function that creates the object and another one
that calls create_node and uses that object.
struct node *create_node(int data)
{
struct node *head = malloc(sizeof *head);
if(head == NULL)
return NULL; // no more memory left
head->data = data;
head->next = NULL;
return head;
}
struct node *foo(void)
{
struct node *head = create_node(112);
// do something with head
return head;
}
Here create_node uses malloc to allocate memory for one struct node
object, initializes the object with some values and returns a pointer to that memory location.
foo calls create_node and does something with it and it returns the
object. If another function calls foo, this function will get the object.
There are also other reasons for malloc. Consider this code:
void foo(void)
{
int numbers[4] = { 1, 3, 5, 7 };
...
}
In this case you know that you will need 4 integers. But sometimes you need an
array where the number of elements is only known during runtime, for example
because it depends on some user input. For this you can also use malloc.
void foo(int size)
{
int *numbers = malloc(size * sizeof *numbers);
// now you have "size" elements
...
free(numbers); // freeing memory
}
When you use malloc, realloc or calloc, you'll need to free the memory. When
your program does not need the memory anymore, you have to call free (as in
the last example). Note that for simplicity I omitted the use of free in the
examples with struct node.
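The runtime-sized array example above does not check whether malloc succeeded. A slightly more defensive sketch follows; make_squares and its contents are purely illustrative.

```c
#include <stdlib.h>

/* Hypothetical helper: returns a heap array holding the squares of
   0 .. count-1, or NULL if count is 0 or the allocation fails.
   The caller is responsible for calling free() on the result. */
int *make_squares(size_t count) {
    if (count == 0)
        return NULL;
    int *numbers = malloc(count * sizeof *numbers);
    if (numbers == NULL)
        return NULL;
    for (size_t i = 0; i < count; i++)
        numbers[i] = (int)(i * i);
    return numbers;
}
```

Because the array outlives the function that created it, ownership passes to the caller, who must eventually free it.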
What you have invokes undefined behavior because you don't really have a node; you have a pointer to a node that doesn't actually point to a node. Using malloc and friends creates a memory region where an actual node object can reside, and where a node pointer can point to.
In your code, struct node* head is a pointer that points to nowhere, and dereferencing it as you have done is undefined behavior (which can commonly cause a segfault). You must point head to a valid struct node before you can safely dereference it. One way is like this:
int main() {
struct node* head;
struct node myNode;
head = &myNode; // assigning the address of myNode to head, now head points somewhere
head->data = 2; // this is legal
printf("%d \n", head->data); // will print 2
}
But in the above example, myNode is a local variable, and its lifetime ends as soon as the function exits (in this case main). As you say in your question, for linked lists you generally want to malloc the data so it can be used outside of the current scope.
int main() {
struct node* head = malloc(sizeof(struct node));
if (head != NULL)
{
// we received a valid memory block, so we can safely dereference
// you should ALWAYS initialize/assign memory when you allocate it.
// malloc does not do this, but calloc does (initializes it to 0) if you want to use that
// you can use malloc and memset together.. in this case there's just
// two fields, so we can initialize via assignment.
head->data = 2;
head->next = NULL;
printf("%d \n", head->data);
// clean up memory when we're done using it
free(head);
}
else
{
// we were unable to obtain memory
fprintf(stderr, "Unable to allocate memory!\n");
}
return 0;
}
This is a very simple example. Normally for a linked list, you'll have insert function(s) (where the mallocing generally takes place) and remove function(s) (where the freeing generally takes place). You'll at least have a head pointer that always points to the first item in the list, and for a doubly linked list you'll want a tail pointer as well. There can also be print functions, deleteEntireList functions, etc. But one way or another, you must allocate space for an actual object. malloc is a way to do that so the memory stays valid for as long as your program needs it.
edit:
Incorrect. This absolutely applies to int and int*; it applies to any object and pointer(s) to it. If you were to have the following:
int main() {
int* head;
*head = 2; // head uninitialized and unassigned, this is UB
printf("%d\n", *head); // UB again
return 0;
}
this is every bit as much undefined behavior as what you have in your OP. A pointer must point to something valid before you can dereference it. In the above code, head is uninitialized, it doesn't point to anything deterministically, and as soon as you do *head (whether to read or write), you're invoking undefined behavior. Just as with your struct node, you must do something like the following to be correct:
int main() {
int myInt; // creates space for an actual int in automatic storage (most likely the stack)
int* head = &myInt; // now head points to a valid memory location, namely myInt
*head = 2; // now myInt == 2
printf("%d\n", *head); // prints 2
return 0;
}
or you can do
int main() {
int* head = malloc(sizeof(int)); // silly to malloc a single int, but this is for illustration purposes
if (head != NULL)
{
// space for an int was returned to us from the heap
*head = 2; // now the unnamed int that head points to is 2
printf("%d\n", *head); // prints out 2
// don't forget to clean up
free(head);
}
else
{
// handle error, print error message, etc
}
return 0;
}
These rules are true for any primitive type or data structure you're dealing with. Pointers must point to something, otherwise dereferencing them is undefined behavior, and you hope you get a segfault when that happens so you can track down the errors before your TA grades it or before the customer demo. Murphy's law dictates UB will always crash your code when it's being presented.
The statement struct node* head; defines a pointer to a node object, but not the node object itself. As you do not initialize the pointer (e.g. by letting it point to a node object created by a malloc call), dereferencing this pointer as you do with head->data yields undefined behaviour.
There are two ways to overcome this: (1) allocate memory dynamically, yielding an object with dynamic storage duration, or (2) define the object itself as, for example, a local variable with automatic storage duration:
(1) dynamic storage duration
int main() {
struct node* head = calloc(1, sizeof(struct node));
if (head) {
head->data = 2;
printf("%d \n", head->data);
free(head);
}
}
(2) automatic storage duration
int main() {
struct node head;
head.data = 2;
printf("%d \n", head.data);
}

Declaring a pointer to struct creates a struct?

It seems to me like struct new_element *element = malloc(sizeof(*element)) creates a structure of type element, whereas I thought it would only create a pointer to it. The following code proves to me I'm wrong:
struct new_element
{
int i;
struct new_element *next;
};
int main(void)
{
struct new_element *element = malloc(sizeof(*element));
element->i = 5;
element->next = NULL;
printf("i = %d, next = %p\n", element->i, element->next);
}
Output:
i = 5, next = (nil);
element->i was given the value 5 and element->next was given the value NULL. Doesn't that mean that element points to a structure, which would mean that there is a structure that was created? I thought that malloc would only give a pointer the size needed in memory.
The variable element is a pointer. When you define it, that sets aside space for the pointer.
If you just did this:
struct new_element *element;
You've created a pointer. It just doesn't point anywhere.
When you then call malloc(sizeof(*element)), that sets aside space big enough for what element points to, i.e. an instance of struct new_element. You then point the variable element to this section of memory.
This syntax:
element->i = 5;
Is the same as:
(*element).i = 5;
It dereferences the pointer element, giving you a struct new_element, then you access the member i.
If you did this, as you suggested in the comments:
struct new_element *element = malloc(sizeof(element));
You're not allocating the proper amount of space. You're setting aside enough space for a struct new_element * instead of a struct new_element. If the struct is larger than a pointer to it (likely in this case, since it contains a pointer to its own type), then you end up writing past the end of the allocated memory when modifying one of the members. This invokes undefined behavior.
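The difference between the two sizeof expressions can be checked directly. In this sketch, pointer_size and pointee_size are made-up helper names; the struct is the one from the question.

```c
#include <stddef.h>

struct new_element {
    int i;
    struct new_element *next;
};

/* Size of the pointer variable itself. */
size_t pointer_size(void) {
    struct new_element *element = NULL;
    return sizeof(element);
}

/* Size of the object the pointer refers to. sizeof does not evaluate
   its operand here, so the null pointer is never dereferenced. */
size_t pointee_size(void) {
    struct new_element *element = NULL;
    return sizeof(*element);
}
```

On a typical 64-bit platform these would likely be 8 and 16, but the exact values depend on the implementation; what always holds is that the struct must be at least as large as its int and pointer members combined.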

C: Method that handles pointer

I was wondering why the first method does not work but the second does:
//First method
int create_node(struct node *create_me, int init){
create_me = malloc(sizeof(struct node));
if (create_me == 0){
perror("Out of momory in ''create_node'' ");
return -1;
}
(*create_me).x = init;
(*create_me).next = 0;
return 1;
}
int main( void ){
struct node *root;
create_node(root, 0);
print_all_nodes(root);
}
Ok, here the print_all_nodes function tells me, root has not been initialized. Now second method that works fine:
struct node* create_node(struct node *create_me, int init){ //<-------
create_me = malloc(sizeof(struct node));
if (create_me == 0){
perror("Out of momory in ''create_node'' ");
exit(EXIT_FAILURE);
}
(*create_me).x = init;
(*create_me).next = 0;
return create_me; //<---------
}
int main( void ){
struct node *root;
root = create_node(root, 0); //<---------------
print_all_nodes(root);
}
In my understanding (talking about method 1), when I give the create_node function the pointer to the root node, then it actually changes the x and the next of root.
Like when you do:
void change_i(int* p){
*p = 5;
}
int main( void ){
int i = 2;
printf("%d\n", i);
change_i(&i);
printf("%d", i);
}
It actually changes i.
Get the idea?
Can someone share his/her knowledge with me please !
You need a pointer to pointer, not just a pointer.
If you want to change a variable in another function, you have to send a pointer to that variable. If the variable is an integer variable, send a pointer to that integer variable. If the variable is a pointer variable, send a pointer to that pointer variable.
You are saying in your question that "when I give the create_node function the pointer to the root node, then it actually changes the x and the next of root." Your wording makes me suspect that there is some confusion here. Yes, you are changing the contents of x and next, but not of root. root has no x and next, since root is a pointer that points to a struct that contains an x and a next. Your function does not change the contents of root, since what your function gets is only a copy of that pointer.
Changes to your code:
int create_node(struct node **create_me, int init) {
*create_me = malloc(sizeof(struct node));
if (*create_me == 0){
perror("Out of memory in ''create_node'' ");
return -1;
}
(*create_me)->x = init;
(*create_me)->next = 0;
return 1;
}
int main( void ){
struct node *root;
create_node(&root, 0);
print_all_nodes(root);
}
You need to do something like create_node(&root, 0); and then access it as a ** in the called function. C has no pass-by-reference; you need to pass the variable's address so that another function can modify it.
This is a question of the scope of your variables. In the first example, where you supply a pointer to a node, you could change that node and the changes would persist afterwards. However, assigning the result of malloc changes the pointer itself, which is only a local copy and is discarded when its scope (your function) ends.
In the second example you return this pointer, so its value is copied to the caller before the local copy is discarded.
This would correspond to this in your given example no. 3:
void change_i(int* p){
*p = 5; // you can 'change i'
p = NULL; // but not p (the local copy of the pointer to i): this change is discarded after the following '}'
}
"when I give the create_node function the pointer to the root node, then it actually changes the x and the next of root."
You don't give the create_node() function (in both versions) a pointer to the root node because you don't have the root node, in the first place.
The declaration:
struct node *root;
creates the variable root, of type struct node *, and leaves it uninitialized. root is a variable that can store the address in memory of a struct node value (a pointer to a struct node value). But the code doesn't create any struct node value and the value of root is just garbage.
Next, both versions of function create_node() receive the garbage value of root in parameter create_me as a consequence of the call:
create_node(root, 0);
The first thing both implementations of create_node() do is to ignore the value they receive in create_me parameter (be it valid or not), create a value of type struct node and store its address in create_me.
The lines:
(*create_me).x = init;
(*create_me).next = 0;
put some values into the members of the newly allocated struct node object.
The first version of the function then returns 1 and ignores the value stored in create_me. Being a function parameter (a local variable of the function), its value is discarded and lost forever. The code just created a memory leak: a block of memory that is allocated but inaccessible because there is no pointer to it. Don't do this!
The second version of the function returns the value of create_me (i.e. the address of the newly allocated value of type struct node). The calling code (root = create_node(root, 0);) stores the value returned by the function into the variable root (replacing the garbage value the variable held until then).
Great success! The second version of the create_node() function creates a new struct node object, initializes its properties and returns the address of the new object to be stored and/or further processed. Don't forget to call free(root) when the object is not needed any more.
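Since both versions overwrite create_me before reading it, the parameter carries no information into the function. A sketch of the returning variant without it is shown below; the struct node definition is an assumption, because the question does not show it, and reporting failure by returning NULL is a design choice of this sketch rather than the asker's.

```c
#include <stdlib.h>

/* Assumed definition; the question uses x and next but does not show the struct. */
struct node {
    int x;
    struct node *next;
};

/* Allocates and initializes a node. Returns its address, or NULL if
   the allocation fails. The caller owns the node and must free() it. */
struct node *create_node(int init) {
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->x = init;
    n->next = NULL;
    return n;
}
```

The caller then writes struct node *root = create_node(0); so root never holds a garbage value, and checks root against NULL before using it.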

Resources