Double pointer to binary search-tree node - c

This might seem like a silly question to some of you and I know that I get things mixed up quite often but I need to understand the code so I can stop obsessing about it and focus on the real matter of why I need to use it.
So, in the code I see several assignments like this:
struct bst_node** node = root;
node = &(*node)->left;
node = &(*node)->right;
is there an invisible parenthesis here?
node = &((*node)->right);
This example is taken from literateprograms.org.
So to me it seems &(*node) is unnecessary and I might as well just write node->left instead, but the code seems to work where I can't make sense of it and I'm wondering if it's because I'm misunderstanding what's happening at those lines. Particularly, at one place in the code where it is deleting a node by constantly moving the "deleted" data to the bottom of the tree to safely remove the node without having to "break things", I'm lost because I don't get how
old_node = *node;
if ((*node)->left == NULL) {
*node = (*node)->right;
free_node(old_node);
} else if ((*node)->right == NULL) {
*node = (*node)->left;
free_node(old_node);
} else {
struct bst_node **pred = &(*node)->left;
while ((*pred)->right != NULL) {
pred = &(*pred)->right;
}
/* pseudo-code: once pred has reached the bottom-right node of
   old_node's left subtree, swap the values of *pred and *node,
   then make a recursive call with pred */
}
can keep the tree structure intact. I don't understand how this makes sure the structure is intact and would appreciate some help from somebody who knows what's going on. I interpret node as a local variable on the stack, created at the function call. Since it is a double pointer, it points to a location on the stack (I assume this, since they did &(*node) before the function call), in either its own stack frame or that of the calling function, which then points to said node on the heap.
In the example code above what I think it is supposed to do is switch either left or right, since one of them is NULL, and then switch the one that isn't (assuming the other one isn't NULL?) As I said, I'm not sure about how this would work. My question mostly relates to the fact that I think &(*node) <=> node but I want to know if that's not the case etc.

node = &(*node)->right;
is there an invisible parenthesis here?
node = &((*node)->right);
Yes. It is taking the address of the right member of *node. The -> takes precedence over &; see C++ Operator Precedence (-> is 2 and & is 3 in that list) (it's the same general precedence as C).
So to me it seems &(*node) is unnecessary and I might as well just write node->left instead,
Your premise is off. There is no expression &(*node) in that code; as explained above, the & applies to the entire (*node)->left, not to (*node).
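The precedence point can be checked directly. This is only a sketch: the two-field struct and the function name precedence_demo are assumptions made for illustration, not part of the literateprograms.org code.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical minimal node type; only the two child links matter here. */
struct bst_node {
    struct bst_node *left;
    struct bst_node *right;
};

/* Returns 1 if all the pointer identities described above hold. */
int precedence_demo(void)
{
    struct bst_node child = { NULL, NULL };
    struct bst_node parent = { &child, NULL };
    struct bst_node *root = &parent;
    struct bst_node **node = &root;

    /* -> binds tighter than unary &, so both spellings yield the address
       of the left member inside the node that *node points to. */
    struct bst_node **a = &(*node)->left;
    struct bst_node **b = &((*node)->left);

    /* &(*node) on its own is just node again, which is a different address. */
    struct bst_node **c = &(*node);

    return a == b && a == &parent.left && c == node && a != c;
}
```

So node = &(*node)->left moves node from "address of the root pointer" to "address of the left field of the current node", which is not something node->left could express (node is a pointer to a pointer, so node->left would not even compile).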
In that code the double pointers are just that, a pointer to a pointer. Just as this works:
int x = 0;
int *xptr = &x;
*xptr = 5;
assert(x == 5);
This works the same way, except that it changes the value of the pointer x:
int someint;
int *x = &someint;
int **xptr = &x;
*xptr = NULL;
assert(x == NULL);
In that code snippet you posted, assigning a pointer to *node changes the value of the pointer that node points to. So, e.g. (pseudo-ish code):
typedef struct bst_node_ {
struct bst_node_ *left;
struct bst_node_ *right;
} bst_node;
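bst_node * construct_node () {
bst_node *n = malloc(sizeof *n); /* requires <stdlib.h> */
if (n != NULL) n->left = n->right = NULL; /* a new node has no children */
return n; /* a pointer to a new bst_node, or NULL on failure */
}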
void create_node (bst_node ** destination_ptr) {
*destination_ptr = construct_node();
}
void somewhere () {
bst_node *n = construct_node();
create_node(&n->left); // after this, n->left points to a new node
create_node(&n->right); // after this, n->right points to a new node
}
Note again that &n->left is the same as &(n->left) because of precedence rules. I hope that helps.
In C++ you can pass arguments to a function by reference, which is essentially the same as passing a pointer except syntactically it leads to code that is a bit easier to read.
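To tie this back to the question's traversal and deletion code, here is a minimal sketch in C. The int key field and the names bst_find_link, bst_insert, bst_remove_at, bst_remove and bst_contains are assumptions for illustration. It also copies the predecessor's key instead of swapping, which is enough here because the predecessor node is removed immediately afterwards.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type with an int key. */
struct bst_node {
    int key;
    struct bst_node *left;
    struct bst_node *right;
};

/* Walk down to the link (the root pointer, some ->left, or some ->right)
   where key lives or would live. The result is never NULL; *result may be. */
static struct bst_node **bst_find_link(struct bst_node **root, int key)
{
    struct bst_node **node = root;
    while (*node != NULL && (*node)->key != key)
        node = key < (*node)->key ? &(*node)->left : &(*node)->right;
    return node;
}

/* Returns 1 on insertion, 0 on duplicate key or allocation failure. */
int bst_insert(struct bst_node **root, int key)
{
    struct bst_node **link = bst_find_link(root, key);
    if (*link != NULL)
        return 0;
    struct bst_node *n = malloc(sizeof *n);
    if (n == NULL)
        return 0;
    n->key = key;
    n->left = n->right = NULL;
    *link = n;  /* rewrites the parent's field, or the caller's root */
    return 1;
}

/* Remove the node that *node points to, keeping the tree linked. */
static void bst_remove_at(struct bst_node **node)
{
    struct bst_node *old_node = *node;
    if (old_node->left == NULL) {
        *node = old_node->right;   /* parent now skips old_node */
        free(old_node);
    } else if (old_node->right == NULL) {
        *node = old_node->left;
        free(old_node);
    } else {
        /* Find the link to the largest node of the left subtree. */
        struct bst_node **pred = &old_node->left;
        while ((*pred)->right != NULL)
            pred = &(*pred)->right;
        old_node->key = (*pred)->key;
        bst_remove_at(pred);       /* pred has no right child, so this ends */
    }
}

/* Returns 1 if key was found and removed, 0 otherwise. */
int bst_remove(struct bst_node **root, int key)
{
    struct bst_node **link = bst_find_link(root, key);
    if (*link == NULL)
        return 0;
    bst_remove_at(link);
    return 1;
}

int bst_contains(struct bst_node **root, int key)
{
    return *bst_find_link(root, key) != NULL;
}
```

The structure stays intact because every write goes through the one pointer that actually referred to the removed node, so no parent is left pointing at freed memory.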

That is useful
&(*node)->left <=>&((*node)->left)
The variable edited by this code is *node. I need the context of this code to give more info.

Related

what does this struct node **p is doing?

I am learning data structure, and here is a thing that I am unable to understand...
int end(struct node** p, int data){
/*
This is another layer of indirection.
Why is the second construct necessary?
Well, if I want to modify something allocated outside of my function scope,
I need a pointer to its memory location.
*/
struct node* new = (struct node*)malloc(sizeof(struct node));
struct node* last = *p;
new->data = data;
new->next = NULL;
while(last->next !=NULL){
last = last ->next ;
}
last->next = new;
}
Why are we using struct node **p?
Can we use struct node *p in place of struct node **p?
The comment I wrote in the code is the answer I found, but I am still unclear about this. Here is the full code...
please help me
thank you
Short answer: There is no need for a double-pointer in the posted code.
The normal reason for passing a double-pointer is that you want to be able to change the value of a variable in the caller's scope.
Example:
struct node* head = NULL;
end(&head, 42);
// Here the value of head is no longer NULL
// Its value was changed by the function end
// Now it points to the first (and only) element of the list
and your function should include a line like:
if (*p == NULL) {*p = new; return 0;}
However, your code doesn't! Maybe that's really a bug in your code?
Since your code doesn't update *p there is no reason for passing a double-pointer.
BTW: Your function says it will return int but the code has no return statement. That's a bug for sure.
The shown function (according to its name) should create a new node and append it at the end of the list represented by the pointer to a pointer to a node of that list. (I doubt, however, that it actually does, agreeing with the comments...)
Since the list might be empty, in which case that pointer to node does not point to an existing node, it is necessary to be able to change the pointer to the first element of that list away from NULL so that it points to the newly created node.
That is only possible if the parameter is not just a copy of the pointer to the first node but a pointer to the pointer to the first node, because in the second case you can dereference the pointer to pointer and actually modify the pointer to node.
Otherwise the list (if NULL) would still be NULL after the function call.
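A sketch of an append function that does use the double pointer, under a few assumptions: list_append is a hypothetical name standing in for end, the local is called new_node rather than new, and the 0 / -1 return values are my own convention.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Append data at the tail. Returns 0 on success, -1 if allocation fails. */
int list_append(struct node **p, int data)
{
    struct node *new_node = malloc(sizeof *new_node);
    if (new_node == NULL)
        return -1;
    new_node->data = data;
    new_node->next = NULL;

    /* Advance p until it addresses the NULL link at the end of the list.
       For an empty list that link is the caller's head variable itself. */
    while (*p != NULL)
        p = &(*p)->next;
    *p = new_node;
    return 0;
}
```

Because the loop walks links rather than nodes, the empty list needs no special case: the first write through *p either updates the caller's head or the last node's next field.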

pointers don't change after i modified it

I'm setting up a struct called Node
typedef struct node{
struct node *left;
struct node *right;
struct node *parent;
}node;
and a function that operate on the nodes:
int test(node *old,node* new){
old->parent->right = new;
new->parent = old->parent;
}
Ok, so i make 3 nodes and set up the relationship between them
node* me =malloc(sizeof(node));
node* me1 = malloc(sizeof(node));
node* me2 = malloc(sizeof(node));
me->right = me1;
me->left = me2;
me1->parent = me;
me2->parent = me;
test(me1,me);
1. However, after test(), me1->parent->right changed while me1 didn't, which is weird because me1 and me1->parent->right point to the same address. Am I making a wrong assumption here?
2. In function test(), if I replace old->parent->right with just old, then after the function call the node me1 remains the same. Isn't a pointer modified when we operate on it inside a function, and if so, why not in this case?
me, me1, and me2 are local variables inside your outer function (let's assume it was main). These pointers are never modified, so after the call to test, me1 still points to the same node as before, while the pointer me1->parent->right now points to me. So the assumption that me1 and me1->parent->right hold the same address isn't true anymore!
If you only modify old inside test, you will only modify the parameter old, which is a copy of me1. After test returns, this copy is forgotten, and the modification has no effect. If you want to modify the me1 variable from within test, you will have to pass a pointer to the pointer, i.e. a double pointer:
int test(node **old,node* new){
*old = new;
...
}
and call it as test(&me1,me);.
Also: Please don't name things "new", because if you ever decide to compile the code as C++, this will conflict with the reserved keyword new.
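The difference between the two calling conventions can be shown side by side. This is a sketch only; retarget_copy, retarget_caller and retarget_demo are hypothetical names, and the nodes live on the stack to keep it short.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    struct node *left, *right, *parent;
} node;

/* Assigning to the parameter changes only this function's local copy. */
void retarget_copy(node *old, node *replacement)
{
    old = replacement;
    (void)old;  /* silence "set but not used" warnings */
}

/* Writing through a pointer-to-pointer changes the caller's variable. */
void retarget_caller(node **old, node *replacement)
{
    *old = replacement;
}

/* Returns 1 if the first call left me1 alone and the second changed it. */
int retarget_demo(void)
{
    node a = { NULL, NULL, NULL };
    node b = { NULL, NULL, NULL };
    node *me1 = &a;

    retarget_copy(me1, &b);
    int unchanged = (me1 == &a);

    retarget_caller(&me1, &b);
    int changed = (me1 == &b);

    return unchanged && changed;
}
```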
Here is what your test() method does:
int test(node *old,node* new){
old->parent->right = new;
new->parent = old->parent;
}
when you call this:
me->right = me1;
me1->parent = me;
test(me1,me);
These steps happen:
me1's parent is me and me's right is me1, so old->parent->right = new makes me's right point to me itself.
new->parent = old->parent then makes me's parent point to me itself.

If a parameter is a pointer type, is the parameter a pointer allocated in local memory

I'm just learning C, and I have a question about pointer parameters. My code is the following:
int Length(node *head)
{
int length = 0;
while (head) {
length++;
head = head->next;
}
return length;
}
The code in the book I'm reading says to do this though:
int Length(struct node* head)
{
struct node* current = head;
int count = 0;
while (current != NULL) {
count++;
current = current->next;
}
return count;
}
Is there really a difference? The way I'm reading my code is that I get a pointer to a node struct as a parameter. The pointer itself however, is a local variable that I am free to mutate as long as I don't dereference it first. Therefore, I can change the value of the pointer to point to a different node (the next node as it may be).
Will this cause a memory leak or is there some other difference I'm not seeing?
This code is for a linked list implementation. The node struct is defined as:
// Define our linked list node type
typedef struct node {
int data;
struct node *next;
} node;
Yes, they both do the same thing. But in the second example it is clearer what the author is trying to do. In your first example, you're using the pointer head to reference nodes other than the head. That can be confusing.
You could write your function like this and your intent would be clear:
int GetLength(node* current)
{
int length = 0;
while (current != NULL)
{
length += 1;
current = current->next;
}
return length;
}
Your solution and reasoning are correct. The node argument is a local variable: a copy of the pointer passed to your function, allocated on the stack. That's why you can modify it from within the function.
There is no difference between the two solutions, at least not in functionality; modern compilers will most likely optimize away the extra variable in the book's solution. The only slight difference is in style: many people treat arguments as unmodifiable values to avoid mistakes.
Your understanding of the argument-passing mechanics is correct. Some people simply prefer not to modify argument values, the reasoning being that modifying an argument tends to be bug-prone. There's a strong expectation that at any point in the function, if you want to get the value the caller passed as head, you can just write head. If you modify the argument and then don't pay attention, or if you're maintaining the code 6 months later, you might write head expecting the original value and get some other thing. This is true regardless of the type of the argument.
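A small check of this, as a sketch: Length is the question's version, while length_demo and the three stack-allocated nodes are assumptions added for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int data;
    struct node *next;
} node;

int Length(node *head)
{
    int length = 0;
    while (head) {          /* head is this function's own copy */
        length++;
        head = head->next;  /* moves the copy, not the caller's pointer */
    }
    return length;
}

/* Returns 1 if the count is right and the caller's pointer is untouched. */
int length_demo(void)
{
    node c = { 3, NULL };
    node b = { 2, &c };
    node a = { 1, &b };
    node *list = &a;

    int n = Length(list);
    return n == 3 && list == &a;
}
```

Nothing is allocated or freed inside Length, so there is no memory leak either way.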

understanding linked list-like structure in c

I'm having trouble understanding a piece of C code that represents a linked list structure. The skeleton of the struct looks like this:
struct r{
r *next;
r **prev;
data *d;
}
struct r *rlist;
rlist can be filled by calling the following function: (skeleton only)
r* rcreate(data *d){
struct r *a = xmalloc(sizeof(*r))
a->d = d;
a->next = rlist;
a->prev = &rlist;
if (rlist)
rlist->prev = &a->next;
rlist = a;
return a;
}
How do I go about using this data structure? e.g. how to traverse rlist ?
Edit: here is the function for deleting a node in the linked list
void rdestroy(struct r *a){
if (a->next){
a->next->prev = a->prev;
}
*a->prev = a->next;
destroy(a->d); /* destroy is defined elsewhere */
}
The double prev pointer seems to allow traversing the list in one direction only, while allowing easy deletion: even though you can't (easily) access the previous element, you can access the next pointer of the previous element and set it to the new correct value when deleting a node.
Without seeing other related functions, it's hard to see why it is done this way. I've not seen this done, and can't immediately think of any really useful benefit.
I think this allows simpler node deletion code, because a node does not need to care whether it is first or not: its prev pointer always holds the non-NULL address of the pointer it needs to modify when deleting itself. The same simplicity applies to insertion before a given node. If these operations dominate the use pattern, then this could be seen as a minor optimization, I suppose, especially on older CPUs where branches might have been much more expensive.
How to traverse list
This was the question, right? You can only traverse it forward, in a very simple manner, here's a for loop to traverse entire list:
struct r *node;
for (node = rlist ; node ; node = node->next) {
// assert that prev points to pointer, which should point to this node
assert(*(node->prev) == node);
// use node
printf("node at %p with data at %p\n", node, node->d);
}
Example insertion function
This example insertion function demonstrates how insertion before a node needs no branches (untested):
struct r *rinsert(struct r *nextnode, data *d) {
// create and initialize new node
struct r *newnode = xmalloc(sizeof(struct r));
newnode->d = d;
newnode->next = nextnode;
newnode->prev = nextnode->prev;
// set next pointer of preceding node (or rlist) to point to newnode
*(newnode->prev) = newnode;
// set prev pointer of nextnode to point to next pointer of newnode
nextnode->prev = &(newnode->next);
return newnode;
}
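Putting creation, deletion and traversal together gives a compilable sketch of the structure. Assumptions: the payload is a plain int instead of data *d, malloc and free replace xmalloc and destroy, rdestroy frees the node itself (the posted version does not show that), and rcount is a hypothetical helper.

```c
#include <assert.h>
#include <stdlib.h>

struct r {
    struct r *next;
    struct r **prev;  /* address of whichever pointer points at this node */
    int d;            /* hypothetical payload; the original uses data *d */
};

static struct r *rlist = NULL;

/* Push a new node at the front of rlist. Returns NULL if malloc fails. */
struct r *rcreate(int d)
{
    struct r *a = malloc(sizeof *a);
    if (a == NULL)
        return NULL;
    a->d = d;
    a->next = rlist;
    a->prev = &rlist;            /* the head pointer now refers to a */
    if (rlist)
        rlist->prev = &a->next;  /* the old first node is reached via a->next */
    rlist = a;
    return a;
}

/* Unlink and free a node; no special case for the first node. */
void rdestroy(struct r *a)
{
    if (a->next)
        a->next->prev = a->prev;
    *a->prev = a->next;          /* updates rlist or the predecessor's next */
    free(a);
}

/* Forward traversal, checking the prev invariant along the way. */
int rcount(void)
{
    int n = 0;
    for (struct r *node = rlist; node; node = node->next) {
        assert(*node->prev == node);
        n++;
    }
    return n;
}
```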
There's no good reason to have r ** next in that structure. It's for a double linked list.
So once this thing is created, you have it assigned:
thisList = rcreate("my data");
Now you could start traversing it:
while (thisList->next)
thisList = thisList->next;
...
Your code has many syntactical errors in it, probably because (as you say) it is a "skeleton," so it is hard to parse what the author (whether it was you or someone else) actually intended this code to do.
A simple (doubly) linked list structure looks like this:
struct node {
struct node *next, *prev; // pointers to the adjacent list entries
int data; // use whatever datatype you want
};
struct node *list = NULL; // the list starts empty
void add_entry(int new_data) {
struct node *new_entry = malloc(sizeof(struct node));
// note that in the above line you need sizeof the whole struct, not a pointer
new_entry->data = new_data;
new_entry->next = list; // will be added to the beginning of the list
new_entry->prev = NULL; // no entries currently in front of this one
// in general a NULL pointer denotes an end (front or back) of the list
if (list != NULL) // an empty list has no old first entry to update
list->prev = new_entry;
list = new_entry; // now list points to this entry
// also, this entry's "next" pointer points to what used to
// be the start of the list
}
Edit: I'll say that if you want us to help you understand some code that is part of a larger program, which you did not write and can't modify, then please post the relevant code in a form that is at least syntactically valid. As others have said, for example, the use of prev in the code you posted is indecipherable, and it isn't clear (because there are other similarly confusing syntactical problems) whether that was in the original code or whether it is an error introduced in transcription.
Yang, I am not sure how comfortable you are with pointers in general. I suggest taking a look at few other linked-list implementations, it might just do the trick.
Take a look at this Generic Linked List Implementation.

How can I edit a pointer to a list node from a function in a recursion?

I have been writing a program that is quite complex compared to what I have dealt with until now. Anyways at some point I am supposed to write a function that will manipulate a struct list. I'm trying to make this question as simple as possible so I wrote down a pretty simple piece of code just for reference.
Here is the thing: at first I call testf from another function providing it with a valid current as well as an i with a value of 0. This means that testf will call itself about 100 times before it starts accessing the rest of the code. This is when all the generated instances of testf will start getting resolved.
void testf(listnode *current, int *i) {
wordwagon *current2;
current2 = current;
if (*i < 100) {
*i = *i + 1;
current2 = current2->next;
testf(current2, i);
}
current = current->next;
return;
}
If, let's say, I have enough connected list nodes at my disposal, is current = current->next; the correct way for the "last" testf function to access and edit the caller's current2 value (which is this function's current), or am I horribly wrong?
If I am, what is the way to make changes to the caller function's variables from inside the called function and be sure they won't go away as soon as the called function returns? I find it kind of hard to get a good grasp on how pointers work.
It is very likely that I have left out important information or that I haven't asked my question clearly enough. Please inform me if that is the case so I can edit in whatever you need.
Thanks in advance.
You can pass a pointer to a pointer to your function and dereference it to get a listnode pointer back. Here is how the code would look after that (not tested for compilation):
void testf(listnode **current, int *i) { // accept pointer to listnode pointer
wordwagon *current2;
current2 = *current; // retrieve the pointer value by dereferencing
if (*i < 100) {
*i = *i + 1;
current2 = current2->next;
testf(&current2, i); // recursively call by reference to the pointer
}
*current = (*current)->next; /* advance the caller's pointer to its next node; CORRECTED as suggested by Azure */
return;
}
Here is a list of really good articles for learning pointers:
a) http://cslibrary.stanford.edu/102/PointersAndMemory.pdf
b) http://cslibrary.stanford.edu/103/LinkedListBasics.pdf
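As a simpler illustration of the same idea, here is a sketch in which every level of the recursion receives the same listnode **, so each write lands in the original caller's variable. The listnode layout and the names advance and advance_demo are assumptions; this is not the question's testf.

```c
#include <assert.h>
#include <stddef.h>

typedef struct listnode {
    int value;
    struct listnode *next;
} listnode;

/* Recursively advance the caller's pointer by up to `steps` nodes,
   stopping early if the end of the list is reached. */
void advance(listnode **current, int steps)
{
    if (steps <= 0 || *current == NULL)
        return;
    *current = (*current)->next;  /* modifies the caller's pointer */
    advance(current, steps - 1);  /* pass the same address further down */
}

/* Returns 1 if the caller's pointer moved as expected. */
int advance_demo(void)
{
    listnode c = { 3, NULL };
    listnode b = { 2, &c };
    listnode a = { 1, &b };
    listnode *cur = &a;

    advance(&cur, 2);
    int at_third = (cur == &c);

    advance(&cur, 5);             /* runs off the end and stops at NULL */
    return at_third && cur == NULL;
}
```

The change survives the return because it was made to the caller's variable through its address, not to a copy that disappears with the callee's stack frame.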

Resources