Below is my simple linked list in C. My question is in "headRef = &newNode;" which causes segmentation fault. Then I tried instead "*headRef = newNode;" which resolves the seg fault problem. Though the two lines of code seem to me to work in the same way, why is one causing seg fault and the other one not?
Thanks in advance.
struct node{
int data;
struct node* next;
};
void Push(struct node** headRef, int data){
struct node* newNode = malloc(sizeof(struct node));
if(!newNode) return;
newNode->data = data;
newNode->next = *headRef;
headRef = &newNode;
return;
}
You have a fundamental misunderstanding of reference semantics via pointers. Here's the core example:
// Call site:
T x;
modify(&x); // take address-of at the call site...
// Callee:
void modify(T * p) // ... so the callee takes a pointer...
{
*p = make_T(); // ... and dereferences it.
}
So: Caller takes address-of, callee dereferences the pointer and modifies the object.
In your code this means that you need to say *headRef = newNode; (in our fundamental example, you have T = struct node *). You have it the wrong way round!
newNode is already an address, you've declared it as a pointer: struct node *newNode. With *headRef = newNode you're assigning that address to a similar pointer, a struct node * to a struct node *.
The confusion is that headRef = &newNode appears to be similarly valid, since the types agree: you're assigning to a struct node ** another struct node **.
But this is wrong for two reasons:
You want to change the value of the caller's head pointer, a struct node *. You've passed the address of that pointer into the function because C is pass-by-value, so to change a variable you need its address. The variable that you want to change holds an address, and so you pass a pointer to a pointer, a struct node **: that additional level of indirection is necessary so that you can change the address within the function and have that change reflected outside the function. And so within the function you need to dereference the parameter to get at what you want to change: in your function, you want to change *headRef, not headRef.
Taking the address of newNode is creating an unnecessary level of indirection. The value that you want to assign, as mentioned above, is the address held by newNode, not the address of newNode.
headRef = &newNode is a local assignment, so the assignment is only valid within the scope of Push function. If changes to the headRef should be visible outside the Push you need to do *headRef = newNode. Furthermore, these two are not equivalent. headRef = &newNode assigns the address of a node pointer to a pointer to node pointer while the *headRef = newNode assigns the address of a node to a pointer to a node using indirection.
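You're setting headRef to hold the address of newNode, a variable that lives on the stack; as soon as your Push() function returns, that storage is no longer valid and you can count on it getting overwritten. Worse, headRef is itself only a local copy, so the caller's head pointer is never updated and keeps whatever value it had before the call (possibly NULL or garbage). Dereferencing it later is a likely recipe for a segfault.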
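To make the difference concrete, here is a minimal sketch that puts the two assignments side by side. The names PushBroken and PushFixed are made up for this illustration; PushBroken also frees its node, which the code in the question does not do, purely so the sketch does not leak.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Variant from the question: only the local parameter is reassigned. */
void PushBroken(struct node **headRef, int data)
{
    struct node *newNode = malloc(sizeof *newNode);
    if (!newNode) return;
    newNode->data = data;
    newNode->next = *headRef;
    headRef = &newNode;   /* changes the local copy; the caller's head is untouched */
    (void)headRef;
    free(newNode);        /* assumption for this sketch: the original leaks this node */
}

/* Corrected variant: writes through the pointer to the caller's head. */
void PushFixed(struct node **headRef, int data)
{
    struct node *newNode = malloc(sizeof *newNode);
    if (!newNode) return;
    newNode->data = data;
    newNode->next = *headRef;
    *headRef = newNode;   /* the caller's head now points at the new node */
}
```

Calling PushBroken(&head, 1) on an empty list leaves head equal to NULL, while PushFixed(&head, 2) makes head point to a node holding 2.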
Related
I am learning data structure, and here is a thing that I am unable to understand...
int end(struct node** p, int data){
/*
This is another layer of indirection.
Why is the second construct necessary?
Well, if I want to modify something allocated outside of my function scope,
I need a pointer to its memory location.
*/
struct node* new = (struct node*)malloc(sizeof(struct node));
struct node* last = *p;
new->data = data;
new->next = NULL;
while(last->next !=NULL){
last = last ->next ;
}
last->next = new;
}
Why are we using struct node **p?
Can we use struct node *p in place of struct node **p?
The comment in the code is the answer I found, but I am still unclear about it. Here is the full code...
please help me
thank you
Short answer: There is no need for a double-pointer in the posted code.
The normal reason for passing a double-pointer is that you want to be able to change the value of a variable in the caller's scope.
Example:
struct node* head = NULL;
end(&head, 42);
// Here the value of head is not NULL any more
// Its value was changed by the function end
// Now it points to the first (and only) element of the list
and your function should include a line like:
if (*p == NULL) {*p = new; return 0;}
However, your code doesn't! Maybe that's really a bug in your code?
Since your code doesn't update *p there is no reason for passing a double-pointer.
BTW: Your function says it will return int but the code has no return statement. That's a bug for sure.
The shown function (according to its name) should create a new node and append it at the end of the list represented by the pointer to a pointer to a node of that list. (I doubt, however, that it actually does, agreeing with the comments...)
Since the list might be empty, and that pointer to node hence not be pointing to an existing node, it is necessary to be able to change the pointer to the first element of that list away from NULL so that it points to the newly created node.
That is only possible if the parameter is not only a copy of the pointer to the first node but instead is a pointer to the pointer to the first node. Because in the second case you can dereference the pointer to pointer and actually modify the pointer to node.
Otherwise the list (if NULL) would always still point to NULL after the function call.
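As an illustration of the point made in these answers, here is one possible way the function could handle the empty list. This is a sketch, not the asker's full code: the name append_end and the 0 / -1 return convention are assumptions made for the example.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Appends a node to the list whose head pointer is *p.
   Returns 0 on success, -1 if malloc fails (list unchanged). */
int append_end(struct node **p, int data)
{
    struct node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return -1;
    new_node->data = data;
    new_node->next = NULL;

    if (*p == NULL) {        /* empty list: this is why struct node ** is needed */
        *p = new_node;
        return 0;
    }

    struct node *last = *p;  /* non-empty list: walk to the last node */
    while (last->next != NULL)
        last = last->next;
    last->next = new_node;
    return 0;
}
```

With struct node *head = NULL; a call to append_end(&head, 42) changes head itself, which a struct node * parameter could not do.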
I've been reviewing the basics of singly linked lists in C with materials from Stanford CS Library, where I came across the following code:
struct node{
int data;
struct node* next;
};
struct node* BuildWithDummyNode(){
struct node dummy;
struct node* tail = & dummy; // this line got me confused
int i;
dummy.next = NULL;
for (i=1;i<6;i++){
Push(&(tail->next), i);
tail = tail->next;
}
return dummy.next;
}
Probably not relevant, but the code for Push() is:
void Push(struct node** headRef, int data){
struct node* newNode = malloc(sizeof(struct node));
newNode->data = data;
newNode->next = *headRef;
*headRef = newNode;
}
Everything runs smoothly, but I've always been under the impression that whenever you define a pointer, it must point to an already defined variable. But here the variable "dummy" is only declared and not initialized. Shouldn't that generate some kind of warning at least?
I know some variables are initialized to 0 by default, and after printing dummy.data it indeed prints 0. So is this an instance of "doable but bad practice", or am I missing something entirely?
Thank you very much!
Variable dummy has already been declared in the following statement:
struct node dummy;
which means that memory has been allocated to it. In other words, this means that it now has an address associated with it. Hence the pointer tail declared in following line:
struct node* tail = & dummy;
to store its address makes perfect sense.
"But here the variable "dummy" is only declared and not initialized."
The variable declaration introduces it into the scope. You are correct in deducing its value is indeterminate, but taking and using its address is well defined from the moment it comes into scope, right up until it goes out of scope.
To put it more simply: your program is correct because you don't depend on the variable's uninitialized value, but rather on its well-defined address.
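A compilable version of the code from the question makes this visible: dummy.data is never read, and only dummy.next and the address of dummy are used. The sketch below assumes one change that is not in the original, namely that Push returns a status so the loop can stop if malloc fails.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Returns 0 on success, -1 if allocation fails (assumed convention). */
int Push(struct node **headRef, int data)
{
    struct node *newNode = malloc(sizeof *newNode);
    if (newNode == NULL)
        return -1;
    newNode->data = data;
    newNode->next = *headRef;
    *headRef = newNode;
    return 0;
}

struct node *BuildWithDummyNode(void)
{
    struct node dummy;           /* dummy.data stays uninitialized and is never read */
    struct node *tail = &dummy;  /* the address is valid as soon as dummy is declared */
    dummy.next = NULL;           /* the one member that is read gets a value first */
    for (int i = 1; i < 6; i++) {
        if (Push(&tail->next, i) != 0)
            break;
        tail = tail->next;
    }
    return dummy.next;           /* a heap pointer value, not the address of dummy */
}
```

The function returns the value stored in dummy.next, which points to heap memory, so nothing refers to dummy after the function exits.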
I have a structure like this
struct node
{
int data;
struct node* next;
};
Which I use to create singly linked list.
I created other functions like
int push(struct node* head,int element);
which pushes data onto stack created using node structs.
The function then tries to update the struct node* head passed to it using code(it does other things as well)
head=(struct node*)malloc(sizeof(struct node));
The call is made as such
struct node* stack;
push(stack,number);
It looks like this code created copy of the pointer passed to it. So I had to change the function to
int push(struct node** head,int element)
and
*head=(struct node*)malloc(sizeof(struct node));
The call is made as such
struct node* stack;
push(&stack,number);
So my question is, what was the earlier function doing? Is it necessary to pass struct node** to the function if I want to update original value of pointer or is my approach wrong?
Sorry I cannot provide complete code as it is an assignment.
C always passes by value. To change a variable passed to a function, instead of passing the variable itself, you pass a pointer to it (its address).
Let's say you're calling your function with the old signature
int push(struct node* head,int element);
struct node *actual_head = NULL;
push(actual_head, 3);
Now before calling push, your variable actual_head will have value as NULL.
Inside the push function, a new variable head will be pushed to stack. It will have the same value as passed to it, i.e. NULL.
Then when you call head = malloc(...), your variable head will get a new value instead of actual_head which you wanted to.
To mitigate the above, you'll have to change the signature of your function to
int push(struct node** head,int element);
struct node *actual_head = NULL;
push(&actual_head, 3);
Now if you notice carefully, the value of actual_head is NULL, but this pointer is also stored somewhere, that somewhere is its address &actual_head. Let's take this address as 1234.
Now inside the push function, your variable head which can hold the address of a pointer(Notice the two *), will have the value of 1234
Now when you do *head = malloc(...), you're actually changing the value of the object present at location 1234, which is your actual_head object.
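The two signatures can be compared directly in a small sketch. The function names set_by_value and set_by_address are invented for this example, and set_by_value frees its allocation only so that the sketch does not leak; the original one-pointer version would leak it.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Old signature: head is a copy of the caller's pointer. */
void set_by_value(struct node *head)
{
    head = malloc(sizeof *head);  /* assigns to the local copy only */
    free(head);                   /* assumption for the sketch: avoid the leak */
}

/* New signature: head holds the address of the caller's pointer. */
int set_by_address(struct node **head)
{
    *head = malloc(sizeof **head);  /* writes into the caller's variable */
    if (*head == NULL)
        return -1;
    (*head)->data = 0;
    (*head)->next = NULL;
    return 0;
}
```

After set_by_value(actual_head) the caller's actual_head is still NULL; after a successful set_by_address(&actual_head) it points to the new node.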
C always passes parameters by value (i.e., by copying it). This applies even to pointers, but in that case, it is the pointer itself that is copied. Most of the times you use pointers, that is fine, because you are interested in manipulating the data that is pointed to by the pointer. However, in your situation, you want to modify the pointer itself, so you do indeed have to use a pointer to a pointer.
Yes.
The first version of your program was passing the pointer by value. Although it passed an address (held by the pointer to struct) it didn't pass the pointer's address - necessary to update the value.
Whenever you want to update a variable's value you must pass the variable's address. To pass a pointer address, you need a parameter pointer to pointer to type.
In your case, pointer to pointer to struct node.
The code is not doing what you think but not because it creates a copy of the node, it creates a copy of the pointer.
Try printing
fprintf(stdout, "Address of head: %p\n", (void *) &head);
both inside push() and in the caller.
The pointer you pass in and the parameter are two distinct variables at different addresses in memory, although they both hold the same value. Storing the result of malloc() in the parameter therefore doesn't persist after the function has returned.
You need to pass a pointer to the pointer like this
int push(struct node **head, int element)
{
/* Ideally, check if `head' is `NULL' and find the tail otherwise */
*head = malloc(sizeof(**head));
if (*head == NULL)
return SOME_ERROR_VALUE;
/* Do the rest here */
return SOME_SUCCESS_VALUE_LIKE_0;
}
And to call it, just
struct node *head;
head = NULL;
push(&head, value);
/* ^ take the address of head and pass a pointer with it */
Of course, the push() implementation should be very different, but I think you will get the idea.
Everything everybody has said is absolutely correct in terms of your question. However, I think you should also consider the design. Part of your problem is that you are conflating the stack itself with the internal structures needed to store data on it. You should have a stack object and a node object. i.e.
struct Node
{
int data;
struct Node* next;
};
struct Stack
{
struct Node* head;
};
Your push function can then take a pointer to the Stack without any double indirection. Plus there is no danger of pushing something on to a node that is in the middle of the stack.
void push(struct Stack* stack, int value)
{
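struct Node* node = malloc(sizeof *node); /* size of the struct, not of the pointer */
if (node == NULL) return; /* allocation failed; leave the stack unchanged */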
node->data = value;
node->next = stack->head;
stack->head = node;
}
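To round out this design, a matching pop could look roughly like the following. This is only a sketch: the names stack_push and stack_pop and the bool return values are assumptions chosen for the example, not part of the answer above.

```c
#include <stdbool.h>
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

struct Stack {
    struct Node *head;
};

/* Pushes value; returns false if allocation fails. */
bool stack_push(struct Stack *stack, int value)
{
    struct Node *node = malloc(sizeof *node);
    if (node == NULL)
        return false;
    node->data = value;
    node->next = stack->head;
    stack->head = node;
    return true;
}

/* Pops the top value into *out; returns false if the stack is empty. */
bool stack_pop(struct Stack *stack, int *out)
{
    struct Node *top = stack->head;
    if (top == NULL)
        return false;
    *out = top->data;
    stack->head = top->next;
    free(top);
    return true;
}
```

Both functions modify stack->head through a single struct Stack *, so the caller never needs to pass a struct Node **.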
The function
int push(struct node* head,int element) {
head=(struct node*)malloc(sizeof(struct node));
}
allocates some memory and throws it away (causing a memory leak).
Passing a “pointer to structure” to a function does create a local copy of that pointer.
It is necessary to pass struct node** to the function if you want to update original value of pointer. (using global variables is generally considered as a bad idea)
When you pass stack to your function push(struct node* head,int element)
and do
head=(struct node*)malloc(sizeof(struct node));
The pointer head will be updated to point to the memory allocated by malloc(), and stack is unaware of this memory because you passed only its value (which is uninitialized here).
When you pass the address then you have a pointer to pointer which makes the changes inside push() to be reflected on stack
So my question is, what was the earlier function doing?
Your earlier function was defined to receive a pointer to an object. You passed your function an uninitialized struct node pointer. A function can't do anything with a value representing an uninitialized pointer. So your function was passed garbage, but no harm was done because your function immediately ignored it by overwriting with a pointer to allocated memory. Your function is not using the value you passed for anything except temporary local storage now. Upon return from your function, your parameters to the function are thrown away (they are just copies), and the value of your stack variable is as it was before, still uninitialized. The compiler usually warns you about using a variable before it is initialized.
By the way, the pointer value to the allocated memory was also thrown away/lost upon function return. So there would now be a location in memory with no reference and therefore no way to free it up, i.e., you have a memory leak.
Is it necessary to pass struct node** to the function if I want to update original value of pointer or is my approach wrong?
Yes, it is necessary to pass the address of a variable that you want filled in by the function being called. It must be written to accept a pointer to the type of data it will supply. Since you are referencing your object with a pointer, and since your function is generating a pointer to your object, you must pass a pointer to a pointer to your object.
Alternatively, you can return a pointer as a value from a function, for example
struct node * Function() { return (struct node *)malloc(sizeof(struct node)); }
The call would be...
struct node *stack;
stack = Function();
if(stack == NULL) { /* handle failure */ }
So, your approach is not wrong, just your implementation (and understanding) need work.
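Applied to a push operation, the return-a-pointer alternative might look like the sketch below. The name push_front and the convention of returning NULL on failure (leaving the old list intact) are assumptions made for this example.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Returns the new head, or NULL if allocation fails.
   On failure the existing list is left untouched. */
struct node *push_front(struct node *head, int element)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->data = element;
    n->next = head;
    return n;
}
```

The caller then writes something like tmp = push_front(stack, 5); if (tmp != NULL) stack = tmp; which keeps the old head available if the allocation fails.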
My question is an extension of this: Returning pointer to a local structure
I wrote the following code to create an empty list:
struct node* create_empty_list(void)
{
struct node *head = NULL;
return head;
}
I just read that returning pointers to local variables is useless, since the variable will be destroyed when the function exits. I believe the above code is returning a NULL pointer, so I don't think it's a pointer to a local variable.
Where is the memory allocated to the pointer in this case. I didn't allocate any memory on the heap, and it should be on the stack, as an automatic variable. But what happens when the code exits (to the pointer), if I try to use it in the program, by assigning this pointer some pointees / de-referencing and alike?
struct node* create_empty_list(void)
{
struct node *head = NULL;
return head;
}
is equivalent to:
struct node* create_empty_list(void)
{
return NULL;
}
which is perfectly fine.
The problem would happen if you had something like:
struct node head;
return &head; // BAD, returning a pointer to an automatic object
Here, you are returning the value of a local variable, which is OK:
struct node* create_empty_list()
{
struct node* head = NULL;
return head;
}
The value of head, which happens to be NULL (0), is copied out as the return value (in a register or on the stack, depending on the platform) before function create_empty_list returns. The calling function would typically copy this value into some other variable.
For example:
void some_func()
{
struct node* some_var = create_empty_list();
...
}
In each of the examples below, you would be returning the address of a local variable, which is not OK:
struct node* create_empty_list()
{
struct node head = ...;
return &head;
}
struct node** create_empty_list()
{
struct node* head = ...;
return &head;
}
The address of head, which may be a different address every time function create_empty_list is called (depending on the state of the stack at that point), is returned. This address, which is typically a 4-byte value or an 8-byte value (depending on your system's address space), is handed back to the caller as the return value. You may use this value "in any way you like", but you should not rely on the fact that it represents the memory address of a valid variable.
A few basic facts about variables, that are important for you to understand:
Every variable has an address and a value.
The address of a variable is constant (i.e., it cannot change after you declare the variable).
The value of a variable is not constant (unless you explicitly declare it as a const variable).
With the word pointer being used, it is implied that the value of the variable is by itself the address of some other variable. Nonetheless, the pointer still has its own address (which is unrelated to its value).
Please note that the description above does not apply for arrays.
As others have mentioned, you are returning a value, which is perfectly fine.
However, if you had changed functions body to:
struct node head;
return &head;
you would return the address of (a pointer to) a local variable, and that is dangerous because it is allocated on the stack and released immediately after leaving the function body.
If you changed your code to:
struct node * head = (struct node *) malloc( sizeof( struct node ) );
return head;
Then you are returning the value of a local variable, that is, a pointer to heap-allocated memory which will remain valid until you call free on it.
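A small sketch of the heap-allocated case, with the allocation checked and the members initialized. The function name make_node is hypothetical and used only for this illustration.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Returns a pointer to a heap-allocated node, or NULL if malloc fails.
   The local variable n disappears on return; the memory it points to does not. */
struct node *make_node(int data)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->next = NULL;
    }
    return n;
}
```

The caller owns the returned node and is responsible for calling free on it once it is no longer needed.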
Answering
Where is the memory allocated to the pointer in this case. I didn't
allocate any memory on the heap, and it should be on the stack, as an
automatic variable. But what happens when the code exits (to the
pointer), if I try to use it in the program, by assigning this pointer
some pointees / de-referencing and alike?
There is no memory allocated to the pointer in your case. There is memory allocated to contain the pointer, which is on the stack, but since it is pointing to NULL it doesn't point to any usable memory. Also, you shouldn't worry about that your pointer is on the stack, because returning it would create a copy of the pointer.
(As others mentioned) memory is allocated on the stack implicitly when you declare objects in a function body. As you probably know (judging by your question), memory is allocated on the heap by explicitly requesting so (using malloc in C).
If you try to dereference your pointer while it is NULL, the behavior is undefined, and on most systems you are going to get a segmentation fault. You can assign to it, as this would just overwrite the NULL value. To make sure you don't get a segmentation fault, you need to check that the list that you are using is not the NULL pointer. For example, here is an append function:
struct node
{
int elem;
struct node* next;
};
struct node* append(struct node* list, int el) {
// save the head of the list, as we would be modifying the "list" var
struct node* res = list;
// create a single element (could be a separate function)
struct node* nn = (struct node*)malloc(sizeof(struct node));
nn->elem = el;
nn->next = NULL;
// if the given list is not empty
if (NULL != list) {
// find the end of the list
while (NULL != list->next) list = list->next;
// append the new element
list->next = nn;
} else {
// if the given list is empty, just return the new element
res = nn;
}
return res;
}
The crucial part is the if (NULL != list) check. Without it, you would dereference list even when it is NULL, and thus most likely get a segmentation fault.
I have a pointer in one function and i would like to return that pointer so that i can modify what it points to later on. Will returning it return the address of what the pointer is pointing to or the pointer itself? This question is because i want to change what the head of a linked-list points to.
So for example
struct node_{
//variables
}*headPtr=NULL; //assume when we are returning headPtr in foo() it is no longer NULL but points to something
typedef struct node_ node;
node foo(){
//some if conditions
return headPtr;
}
main(){
node *tmpPtr;
tmpPtr=foo();
}
This function takes a pointer to the head of a linked list and some other arguments, creates a new node, appends the list to it, and returns the new head.
struct type *ptr_change_head(struct type *ptr_old_head, other args){
struct type *ptr = ptr_old_head;
struct type *ptr_new_head = <new head>;
ptr_new_head->ptr_next = ptr;
return ptr_new_head;
}
First of all, let's make your types match up. foo()'s type signature says it returns a node, but you are actually trying to return a pointer to node. Let's redefine foo() to be
node *foo() {
return headPtr;
}
Will returning it return the address of what the pointer is pointing to or the pointer itself?
The line
tmpPtr = foo();
will place into the variable tmpPtr the address of what the pointer is pointing to, and tmpPtr will be another variable that exactly matches headPtr. That is to say, two separate pointers that point to the same thing. There are two separate points to make here:
tmpPtr and headPtr are two separate variables, and if you re-assign one it will have no effect on the other.
Since they currently point to the same node, you could potentially mutate the node that they point to by dereferencing either pointer, assuming they point to an actual node and not NULL.
If your intent is to change what headPtr points to you can either:
change headPtr directly with an assignment like headPtr = something.
set up a pointer to headPtr with type node**.
You can set up a pointer to headPtr with something like
node **foo() {
return &headPtr;
}
int main() {
node **headPtrPtr = foo();
*headPtrPtr = something else;
}
This will place into headPtrPtr a pointer to headPtr. By dereferencing headPtrPtr, you can make headPtr point to a new node.
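Both behaviors can be checked with a short sketch. The accessor names get_head and get_head_ref and the value member are assumptions introduced for this example; the question only shows foo() and an unspecified struct body.

```c
#include <stddef.h>

typedef struct node_ {
    int value;              /* placeholder member, assumed for the sketch */
    struct node_ *next;
} node;

static node *headPtr = NULL;

/* Returns a copy of headPtr: the address of the node it points to. */
node *get_head(void)
{
    return headPtr;
}

/* Returns the address of headPtr itself, so the caller can retarget it. */
node **get_head_ref(void)
{
    return &headPtr;
}
```

Reassigning a variable that holds the result of get_head() leaves headPtr alone, whereas assigning through the result of get_head_ref(), as in *get_head_ref() = some_node;, changes what headPtr points to.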