Call by value vs call by reference while dealing with binary tree - c

When we want a function to change the value of an ordinary variable, we pass it using call by reference. But I am not able to understand the intricacies when we have to pass a pointer variable (like a node of a binary tree) using call by reference. I understand that if we want to modify the pointer variable to point to another node we have to use call by reference. But what if we have to modify the data element of the root? I thought that to change it we would also need call by reference. But the following code snippet gives an output of 10, 10, 10 even though I have passed the root node of the tree using call by value to the function modifyTree. Am I missing something here?
#include<stdio.h>
#include<stdlib.h>
struct node
{
int data;
struct node* left;
struct node* right;
};
/* Helper function that allocates a new node with the
given data and NULL left and right pointers. */
struct node* newNode(int data)
{
struct node* node = (struct node*)malloc(sizeof(struct node));
node->data = data;
node->left = NULL;
node->right = NULL;
return(node);
}
/* This function sets the data fields of some of the nodes of tree to 10*/
void modifyTree(struct node* node)
{
node->data = 10;
node->left->data = 10;
node->right->data = 10;
}
int main()
{
struct node *root = newNode(1);
root->left = newNode(2);
root->right = newNode(3);
root->left->left = newNode(4);
root->left->right = newNode(5);
modifyTree(root);
printf("%d\n", root->data);
printf("%d\n", root->left->data);
printf("%d\n", root->right->data);
getchar();
return 0;
}

Passing a pointer by value means the called function receives the exact same pointer value that the caller used, so any accesses through that pointer will refer to the same memory.
You would need a double pointer if you wanted the function to modify the pointer value the caller has (for instance by allocating a new tree, thus "creating" a new pointer value).

You are passing the pointer by value, but the pointer still points at the same thing. I'll use some hypothetical values to demonstrate.
In main you allocate a new struct node. Let's say it gets created at memory location 0x12345. So now your struct node *root contains 0x12345.
You now call modifyTree(root);. root gets passed by value to the node parameter of modifyTree.
That node parameter now contains 0x12345. It's pointing at the same memory location.
So when you access that location with node->data = 10, you are accessing the same memory you created in main.

You pass the pointer by value yes, but what you are changing inside the modifyTree function are elements of the struct that the pointer is pointing to. Passing the pointer to the struct by value will not prevent you from changing the internal contents of the struct being pointed at by your parameter. If it were the pointer itself that you were changing, then you would see the behaviour that you are expecting.
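The difference between writing through a pointer and reassigning the pointer can be sketched as follows. This is only an illustration: new_node, set_data, reseat_by_value and reseat_by_ref are hypothetical names that do not appear in the question; struct node has the same layout as the one in the question.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Hypothetical helper: allocates a leaf node, returns NULL if malloc fails. */
struct node *new_node(int data)
{
    struct node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->data = data;
        n->left = NULL;
        n->right = NULL;
    }
    return n;
}

/* The pointer is copied, but the copy refers to the caller's node,
   so this write is visible to the caller. */
void set_data(struct node *n, int value)
{
    n->data = value;
}

/* Only the local copy of the pointer is changed; the caller's pointer
   still refers to the original node after the call. */
void reseat_by_value(struct node *n, struct node *replacement)
{
    n = replacement;
    (void)n; /* silence "set but not used" warnings */
}

/* The caller passes the address of its pointer, so the caller's
   pointer itself can be changed. */
void reseat_by_ref(struct node **n, struct node *replacement)
{
    *n = replacement;
}
```

With struct node *root pointing at some node, set_data(root, 10) changes that node, reseat_by_value(root, other) leaves root unchanged, and reseat_by_ref(&root, other) makes root point at other.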

Related

Understanding code for creating a singly linked list using double pointer in C

I am trying to understand how the code below for creating a singly linked list works using a double pointer.
#include <stdio.h>
#include <stdlib.h>
struct Node {
int data;
struct Node* next;
};
void push(struct Node** headRef, int data) {
struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
newNode->data = data;
newNode->next = *headRef;
*headRef = newNode;
}
//Function to implement linked list from a given set of keys using local references
struct Node* constructList(int keys[], int n) {
struct Node *head = NULL;
struct Node **lastPtrRef = &head;
int i, j;
for(i = 0; i < n; i++) {
push(lastPtrRef, keys[i]);
lastPtrRef = &((*lastPtrRef)->next); //this line
if((*lastPtrRef) == NULL) {
printf("YES\n");
}
}
return head;
}
int main() {
int keys[] = {1, 2, 3, 4};
int n = sizeof(keys)/sizeof(keys[0]);
//points to the head node of the linked list
struct Node* head = NULL;
head = constructList(keys, n); //construct the linked list
struct Node *temp = head;
while(temp != NULL) { //print the linked list
printf(" %d -> ", temp->data);
temp = temp->next;
}
}
I understand the purpose of using the double pointer in the function push(), it allows you to change what the pointer headRef is pointing to inside the function. However in the function constructList(), I don't understand how the following line works:
lastPtrRef = &((*lastPtrRef)->next);
Initially lastPtrRef would be pointing to head which points to NULL. In the first call to push(), within the for loop in constructList(), the value that head points to is changed (it points to the new node containing the value 1). So after the first call to push(), lastPtrRef will be pointing to head which points to a node with the value of 1. However, afterwards the following line is executed:
lastPtrRef = &((*lastPtrRef)->next);
Whereby lastPtrRef is given the address of whatever is pointed to by the next member of the newly added node. In this case, head->next is NULL.
I am not really sure what the purpose of changing lastPtrRef after the call to push() is. If you want to build a linked list, don't you want lastPtrRef to have the address of the pointer which points to the node containing 1, since you want to push the next node (which will contain 2) onto the head of the list (which is 1)?
In the second call to push() in the for loop in constructList, we're passing in lastPtrRef which points to head->next (NULL) and the value 2. In push() the new node is created, containing the value 2, and newNode->next points to head->next which is NULL. headRef in push gets changed so that it points to newNode (which contains 2).
Maybe I'm understanding the code wrong, but it seems that by changing what lastPtrRef points to, the node containing 1 is getting disregarded. I don't see how the linked list is created if we change the address lastPtrRef holds.
I would really appreciate any insights as to how this code works. Thank you.
This uses a technique called forward-chaining, and I believe you already understand that (using a pointer-to-pointer to forward-chain a linked list construction).
This implementation is made confusing by the simple fact that the push function seems like it would be designed to stuff items on the head of a list, but in this example, it's stuffing them on the tail. So how does it do it?
The part that is important to understand is this seemingly trivial little statement in push:
newNode->next = *headRef
That may not seem important, but I assure you it is. The name push, in this case, does grave injustice to what this function really does. In reality it is more of a generic insert. Some facts about that function:
It accepts a pointer-to-pointer headRef as an argument, as well as some data to put into the linked list being managed.
After allocating a new node and saving the data within, it sets the new node's next pointer to whatever value is currently stored in the dereferenced headRef pointer-to-pointer (so, a pointer). That's what the line I mentioned above accomplishes.
It then stores the new node's address at the same place it just pulled the prior address from, i.e. *headRef.
Interestingly, it has no return value (it is void), further making this somewhat confusing. Turns out it doesn't need one.
Upon returning to the caller, at first nothing may seem to have changed. lastPtrRef still points to some pointer (in fact the same pointer as before; it must, since it was passed by value to the function). But now that pointer points to the new node just allocated. Further, that new node's next pointer points to whatever was in *lastPtrRef before the function call (i.e. whatever value was in the pointer pointed to by lastPtrRef before the function call).
That's important. That is what that line of code enforces. It means that if you invoke push with lastPtrRef addressing a pointer that holds NULL (such as head on initial loop entry), that pointer will receive the new node, and the new node's next pointer will be NULL. If you then change the address in lastPtrRef to point to the next pointer of the last-inserted node (which holds NULL; we just covered that) and repeat the process, it will hang another node there, setting that node's next pointer to NULL, and so on. With each iteration, lastPtrRef addresses the last node's next pointer, which is always NULL.
That's how push is being used to construct a forward linked list. One final thought. What would you get for a linked list if you had this:
#include <stdio.h>
#include <stdlib.h>
struct Node
{
int data;
struct Node* next;
};
void push(struct Node** headRef, int data)
{
struct Node* newNode = (struct Node*)malloc(sizeof(struct Node));
newNode->data = data;
newNode->next = *headRef;
*headRef = newNode;
}
int main()
{
//points to the head node of the linked list
struct Node* head = NULL;
push(&head, 1);
push(&head->next, 2);
push(&head->next, 3);
for (struct Node const *p = head; p; p = p->next)
printf("%p ==> %d\n", (void *)p, p->data); /* %p expects a void pointer */
}
This seemingly innocent example amplifies why I said push is more of a generic insert than anything else. This just populates the initial head node.
push(&head, 1);
Then this appends to that node by using the address of the new node's next pointer as the first argument, similar to what your constructList is doing, but without the lastPtrRef variable (we don't need it here):
push(&head->next, 2);
But then this:
push(&head->next, 3);
Hmmm. Same pointer address as the prior call, so what will it do? Hint: remember what that newNode->next = *headRef line does (I droned on about it forever; I hope something stuck).
The output of the program above is this (obviously the actual address values will be different, depending on your platform and implementation):
0x100705950 ==> 1
0x10073da90 ==> 3
0x100740b90 ==> 2
Hope that helps.
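The tail-tracking idea described above can be sketched with more explicit names. This is a hedged rewrite, not the asker's code: insert_at, build_list and free_list are hypothetical names, insert_at does the same two assignments as push but reports allocation failure, and tail plays the role of lastPtrRef.

```c
#include <stdlib.h>

struct Node {
    int data;
    struct Node *next;
};

/* Same insert as push() in the question: the new node takes over whatever
   *link pointed to, and *link is made to point at the new node.
   Returns 0 on success, -1 if malloc fails (an addition for this sketch). */
int insert_at(struct Node **link, int data)
{
    struct Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = data;
    n->next = *link;
    *link = n;
    return 0;
}

/* Builds a list in key order. tail always holds the address of the
   pointer that terminates the list: first &head, then &last->next. */
struct Node *build_list(const int keys[], int n)
{
    struct Node *head = NULL;
    struct Node **tail = &head;

    for (int i = 0; i < n; i++) {
        if (insert_at(tail, keys[i]) != 0)
            break;                 /* allocation failed: return what we have */
        tail = &(*tail)->next;     /* advance to the new node's next pointer */
    }
    return head;
}

/* Frees every node of the list. */
void free_list(struct Node *head)
{
    while (head != NULL) {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Because tail is advanced after every insertion, *tail is always NULL when insert_at is called, so each new node ends up last. Calling insert_at twice with the same link, as in the 1, 3, 2 example above, instead places the second node in front of the first.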

What is the correct syntax of Delete(node ) for SLL in C?

Assuming the relevant header files, functions for Singly Linked List in C are declared.
Is the following definition of Delete() correct?
/* The Structure for SLL
typedef struct SLL
{
int data;
struct SLL *next;
}node;
Function Delete() deletes a node*/
void Delete( node **head)
{
node *temp, *prev;
int key;
temp = *head;
if(temp == NULL)
{
printf("\nThe list is empty");
return;
}
clrscr();
printf("\nEnter the element you want to delete:");
scanf("%d", &key);
temp = search( *head , key);//search()returns the node which has key
if(temp != NULL)
{
prev = get_prev(*head, key);
if(prev != NULL)
{
prev->next = temp->next;
free(temp);
}
else
{
*head = temp->next;
free(temp);
}
printf("\nThe node is deleted");
getch();
}
}
1) What happens if I replace (node **head) with (node *head)?
2) What happens if I replace void Delete(node **head) with node *Delete(node *head)?
3) Is there an alternate way to delete a node in C?
Thanks in advance
This isn't a tutorial site, but here goes...
You do know that arguments in C are passed by value? Meaning the value is copied.
For example:
void some_function(int a)
{
// ...
}
When calling the function above, like
int x = 5;
some_function(x);
Then the value in x is copied into the argument a in the function. If the code inside the function assigns to a (e.g. a = 12;) then you only modify the local variable a, the copy. It does not modify the original variable.
Now, if we want the function to modify x, then we must emulate pass by reference, which is done using pointers and the address-of operator:
void some_function(int *a)
{
*a = 12; // Modify where a is pointing
}
Now to call that, we don't create a pointer variable and pass that (though it's possible as well), instead we use the address-of operator & to pass a pointer to the variable:
int x = 5;
some_function(&x); // Pass a pointer to the variable x
The pointer &x will be passed by value (since that's the only way to pass arguments in C), but we don't want to modify the pointer, we want to modify the data where it points.
Now back to your specific function: it wants to modify a variable which is itself a pointer, so how do we emulate pass by reference? By passing a pointer to the pointer.
So if you have
node *head;
// Initialize head, make it point somewhere, etc.
Now since the Delete function needs to modify where head points, we pass a pointer to head, i.e. a pointer to the pointer:
Delete(&head);
The Delete function of course must accept that type, a pointer to a pointer to node, i.e. node **. It then uses the dereference operator * to get where the pointer is pointing:
*head = temp->next;
1) If you replace node** head with node* head you won't modify the original head pointer. You probably have a head somewhere that marks the beginning of the linked list. When you delete a node, there's a chance that you want to delete head. In that case you need to modify head to point to the next node in the linked list.
*head = temp->next;
free(temp);
This part of your code does exactly that. Here, temp == head. We want head to point to head->next, but if we pass in node* head to the function, the pointer will get modified but the changes will disappear because you're passing the pointer by value. You need to pass in &head which will be of type node ** head if you want the changes to be reflected outside of the function.
2) The function then returns a node pointer instead of nothing. The problem from (1) remains, although you could return the modified head and have the caller assign the returned value to its own head pointer. That design doesn't fit well with the cases where head doesn't need to be modified, so you could return the new head if it was modified and NULL when it wasn't. It's a slightly messier method of doing things imho, though.
3) Yes, but that depends on the way a linked list is implemented. For the datatype shown here, the basic delete operation is as given.
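As a hedged illustration of points 1) to 3), here is one possible alternative that takes the key as a parameter instead of reading it with scanf. The names delete_key and delete_key_ret are hypothetical, and the search and get_prev helpers from the question are not used. The struct is the one from the question.

```c
#include <stdlib.h>

typedef struct SLL {
    int data;
    struct SLL *next;
} node;

/* Variant A: pointer-to-pointer walk. link is the address of whichever
   pointer currently refers to the node under inspection (the caller's head
   or some node's next member), so deleting the head needs no special case.
   Returns 1 if a node was removed, 0 if key was not found. */
int delete_key(node **head, int key)
{
    node **link = head;
    while (*link != NULL && (*link)->data != key)
        link = &(*link)->next;
    if (*link == NULL)
        return 0;
    node *victim = *link;
    *link = victim->next;   /* bypass the node, whoever pointed at it */
    free(victim);
    return 1;
}

/* Variant B: takes the head by value and always returns the current head.
   The caller must write: head = delete_key_ret(head, key); */
node *delete_key_ret(node *head, int key)
{
    delete_key(&head, key); /* &head is the address of the local copy */
    return head;
}
```

Variant A corresponds to keeping node **head; variant B corresponds to the node *Delete(node *head) signature from question 2, and it only works if the caller assigns the result back.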

In-order traversal deviating to other locations in memory

I'm trying to do an in-order traversal on a BST. In the first call of inOrder() everything works as expected: *node points to the root and in the debugger I can see that the whole tree is represented correctly (i.e., the root's descendants are correctly represented).
However, in the next call on the left child of the root (i.e., *node now represents the root's left child), the tree is not represented correctly anymore. The only things that are correct are the value of *node and the right child being NULL. The left child is not the node with the value 3 that was appended before, but it has some weird values - it appears to point at a random memory location.
(If I run further, Xcode terminates saying: EXC_BAD_ACCESS...)
Can you explain to me why this is so?
#include <stdio.h>
typedef struct Node Node;
struct Node
{
int value;
Node *left;
Node *right;
};
void inOrder(Node *node)
{
if(node!=NULL)
{
inOrder(node->left);
printf("%d,", node->value);
inOrder(node->right);
}
}
void append(Node *root)
{
Node n = {3,NULL};
root->left->left = &n;
}
int main(int argc, const char * argv[])
{
Node a = {10, NULL};
Node b = {5, NULL};
Node root = {8, &b, &a};
// appends a node with value 3 to the node 5 (just a test)
append(&root);
inOrder(&root);
puts("\n");
}
Inside append() function,
void append(Node *root)
{
Node n = {3,NULL};
root->left->left = &n;
}
you're storing the address of the local variable n in the tree. Once append() returns, n no longer exists and its address is invalid. Using that address invokes undefined behaviour.
Solution: Declare n as a pointer to Node, allocate the node dynamically using malloc(), and store that pointer in the tree instead. Dynamically allocated memory remains valid until it is freed.
root->left->left = &n;
The variable n is local to the function append(), and once you exit append(), n is no longer valid.
So accessing the variable through that pointer after its lifetime has ended leads to undefined behavior.
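A minimal sketch of the malloc-based fix suggested above. The name append_left_left and the int return value are assumptions made for this example; the question's append() returns void and hard-codes the value 3.

```c
#include <stdlib.h>

typedef struct Node Node;
struct Node {
    int value;
    Node *left;
    Node *right;
};

/* Hypothetical replacement for append(): the new node lives on the heap,
   so it stays valid after the function returns.
   Returns 0 on success, -1 if there is no left child or malloc fails. */
int append_left_left(Node *root, int value)
{
    if (root == NULL || root->left == NULL)
        return -1;
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->left = NULL;
    n->right = NULL;
    root->left->left = n;   /* like the original, overwrites any existing child */
    return 0;
}
```

In the question's main(), the nodes a, b and root are automatic variables, so only the appended node would have to be released with free(root.left->left) once the tree is no longer needed.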

When to use double pointers?

I saw this working code that converts a tree to its mirror.
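struct node* mir(struct node *root)
{
if(root)
{
struct node * temp;
mir(root->left);
mir(root->right);
/* swap the two child pointers stored inside the node */
temp=root->left;
root->left=root->right;
root->right=temp;
}
return root; /* same node as before; only its children were swapped */
}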
Shouldn't it be mir(struct node **), like we have with linked lists?
All calls in C are call by value, which means the called functions cannot change the value of the argument in the caller's context. The called function receives just a copy of the arguments. However, you can effectively bypass this by passing a pointer to your variable, and then modifying its dereferenced state. What if the variable you want to change is a pointer? You pass a pointer to a pointer.
struct node* mir(struct node *root);
struct node* mir2(struct node **root);
...
/* following cannot change value of root */
x = mir(root);
/* following may change value of root */
x = mir2(&root);
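To make the contrast concrete, here is a sketch under stated assumptions. mirror restates the question's function with a void return type, and bst_insert is a hypothetical example of a tree operation that does need struct node **, because it may have to change the caller's root pointer.

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *left;
    struct node *right;
};

/* Mirroring only swaps fields inside existing nodes; the caller's root
   pointer keeps referring to the same node, so struct node * suffices. */
void mirror(struct node *root)
{
    if (root == NULL)
        return;
    mirror(root->left);
    mirror(root->right);
    struct node *tmp = root->left;
    root->left = root->right;
    root->right = tmp;
}

/* Inserting into a binary search tree may have to change the caller's
   root pointer (when the tree is empty), so it takes struct node **.
   Returns 0 on success, -1 if malloc fails. Duplicates go to the right. */
int bst_insert(struct node **root, int data)
{
    /* Walk down to the NULL pointer where the new node belongs. */
    while (*root != NULL)
        root = (data < (*root)->data) ? &(*root)->left : &(*root)->right;
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->data = data;
    n->left = NULL;
    n->right = NULL;
    *root = n;
    return 0;
}
```

Starting from struct node *root = NULL, the call bst_insert(&root, 2) changes root itself, whereas mirror(root) never needs to: the root node stays the root of the mirrored tree.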

C pointer pointer, and seg fault

Below is my simple linked list in C. My question is in "headRef = &newNode;" which causes segmentation fault. Then I tried instead "*headRef = newNode;" which resolves the seg fault problem. Though the two lines of code seem to me to work in the same way, why is one causing seg fault and the other one not?
Thanks in advance.
struct node{
int data;
struct node* next;
};
void Push(struct node** headRef, int data){
struct node* newNode = malloc(sizeof(struct node));
if(!newNode) return;
newNode->data = data;
newNode->next = *headRef;
headRef = &newNode;
return;
}
You have a fundamental misunderstanding of reference semantics via pointers. Here's the core example:
// Call site:
T x;
modify(&x); // take address-of at the call site...
// Callee:
void modify(T * p) // ... so the callee takes a pointer...
{
*p = make_T(); // ... and dereferences it.
}
So: Caller takes address-of, callee dereferences the pointer and modifies the object.
In your code this means that you need to say *headRef = newNode; (in our fundamental example, you have T = struct node *). You have it the wrong way round!
newNode is already an address, you've declared it as a pointer: struct node *newNode. With *headRef = newNode you're assigning that address to a similar pointer, a struct node * to a struct node *.
The confusion is that headRef = &newNode appears to be similarly valid, since the types agree: you're assigning to a struct node ** another struct node **.
But this is wrong for two reasons:
You want to change the caller's head pointer, a struct node *. Because C is pass-by-value, changing a variable from inside a function requires its address. The variable you want to change is itself an address, so you pass a pointer to a pointer, a struct node **: that additional level of indirection is necessary so that you can change the address within the function and have that change reflected outside the function. Within the function you therefore need to dereference the parameter to get at what you want to change: you want to change *headRef, not headRef.
Taking the address of newNode is creating an unnecessary level of indirection. The value that you want to assign, as mentioned above, is the address held by newNode, not the address of newNode.
headRef = &newNode is a local assignment, so the assignment is only valid within the scope of Push function. If changes to the headRef should be visible outside the Push you need to do *headRef = newNode. Furthermore, these two are not equivalent. headRef = &newNode assigns the address of a node pointer to a pointer to node pointer while the *headRef = newNode assigns the address of a node to a pointer to a node using indirection.
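You're setting headRef, which is only a local copy of the argument, to hold the address of another local variable that lives on the stack. Both cease to exist as soon as your Push() function returns, so the caller's head pointer is never updated and the new node is leaked. If the caller then dereferences a head that is still NULL or uninitialized, that is a sure recipe for a segfault.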
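For reference, a sketch of Push with the one-line fix the answers above describe, keeping the question's signature and its silent return on allocation failure:

```c
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Corrected Push: writes through headRef, so the caller's head changes. */
void Push(struct node **headRef, int data)
{
    struct node *newNode = malloc(sizeof *newNode);
    if (newNode == NULL)
        return;                 /* allocation failed: list is left unchanged */
    newNode->data = data;
    newNode->next = *headRef;   /* old first node becomes the second */
    *headRef = newNode;         /* caller's head now points at newNode */
}
```

Called as Push(&head, 1) with head initialized to NULL, this leaves head pointing at the new node; a second call Push(&head, 2) puts the node holding 2 in front of it.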

Resources