Need some explanation about trees in C

Leaf *findLeaf(Leaf *R,int data)
{
if(R->data >= data )
{
if(R->left == NULL) return R;
else return findLeaf(R->left,data);
}
else
{
if(R->right == NULL) return R;
else return findLeaf(R->right,data);
}
}
void traverse(Leaf *R)
{
if(R==root){printf("ROOT is %d\n",R->data);}
if(R->left != NULL)
{
printf("Left data %d\n",R->left->data);
traverse(R->left);
}
if(R->right != NULL)
{
printf("Right data %d\n",R->right->data);
traverse(R->right);
}
}
These code snippets work fine, but I wonder how they work.
I need a brief explanation of recursion. Thanks for your help.

A Leaf struct will look something like this:
typedef struct Leaf {
    int data;
    struct Leaf *left;  //These allow structs to be chained together where each node
    struct Leaf *right; //has two pointers to two more nodes, so the tree can
} Leaf;                 //double in size at every level.
The function takes a pointer to a Leaf we call R and some data to search against, it returns a pointer to a Leaf
Leaf *findLeaf(Leaf *R,int data){
This piece of code decides whether we should go left or right. The tree is known to be ordered because the insert function follows this same rule for going left and right.
if(R->data >= data ){
This is the base case of the recursive function: if we have reached the last node on this path (one with no child in the direction we want to go), return that node.
The base case of a recursive function has the task of ending the recursion and returning a result. Without it, the function would never finish.
if(R->left == NULL) return R;
This is how we walk through the tree. Here, we are traversing down the left side because the data was smaller than or equal to the current node's data. (Smaller data is always inserted on the left to stay ordered.)
What is happening is that we now call findLeaf() with R->left, but imagine we get to this point again in that next call.
It will become R->left->left in reference to the first call. If the data is larger than the current node we are operating on, we would go right instead.
else return findLeaf(R->left,data);
Now we are at the case where the data was larger than the current node, so we are going right.
} else {
This is exactly the same as with the left.
if(R->right == NULL) return R;
else return findLeaf(R->right,data);
}
}
In the end, the returned node can be conceptualized as something like R->right->right->left, the last node on the path before the next pointer would be NULL.
Let's take a tree whose root is (8) and operate on it with findLeaf():
findLeaf(Leaf * root, 4) //In this example, root is already pointing to (8)
We start at the root, at the top of the tree, which contains 8.
First we check R->data >= data where we know R->data is (8) and data is (4). Since we know data is smaller than R->data(Current node), we enter the if statement.
Here we operate on the left Leaf, checking if it is NULL. It isn't and so we skip to the else.
Now we return findLeaf(R->left, data);, but to return it, we must solve it first. This causes us to enter a second call where we compare (3) to (4) and try again, this time going right because (3) is smaller than (4).
Going through the entire process again, we will compare (6) to (4) and go left, and then finally find our node when we compare (4) to (4) and see that it has no left child. Now we will backtrack through the calls and return the end of a chain like this:
R(8)->(3)->(6)->(4)
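To make the walk-through concrete, here is a minimal sketch. The tree contents (8, 3, 10, 1, 6, 14, 4, 7, 13) are an assumption, since the picture from the original post is not reproduced here, and insert() and build_example() are hypothetical helpers written around the findLeaf from the question.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Leaf {
    int data;
    struct Leaf *left;
    struct Leaf *right;
} Leaf;

/* Same logic as the question's findLeaf. R must not be NULL. */
Leaf *findLeaf(Leaf *R, int data)
{
    assert(R != NULL);
    if (R->data >= data) {
        if (R->left == NULL) return R;        /* base case: nowhere left to go */
        return findLeaf(R->left, data);       /* same problem, smaller tree */
    }
    if (R->right == NULL) return R;           /* base case on the right side */
    return findLeaf(R->right, data);
}

/* Hypothetical insert: findLeaf tells us which node to hang the new one on. */
Leaf *insert(Leaf *root, int data)
{
    Leaf *n = malloc(sizeof *n);
    if (n == NULL) return root;               /* out of memory: tree unchanged */
    n->data = data;
    n->left = n->right = NULL;
    if (root == NULL) return n;
    Leaf *parent = findLeaf(root, data);
    if (parent->data >= data) parent->left = n;
    else parent->right = n;
    return root;
}

/* Builds the assumed example tree (nodes are never freed in this sketch). */
Leaf *build_example(void)
{
    int values[] = {8, 3, 10, 1, 6, 14, 4, 7, 13};
    Leaf *root = NULL;
    for (size_t i = 0; i < sizeof values / sizeof values[0]; i++)
        root = insert(root, values[i]);
    return root;
}
```

With that tree, findLeaf(build_example(), 4) follows 8, 3, 6 and stops on the node holding 4. Searching for 5 stops on that same node, because findLeaf returns the place where the search runs out of children, not necessarily an exact match.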
Edit: Also, coincidentally, I wrote a blog post about traversing a linked list to explain the nature of a Binary Search Tree here.

Each Leaf contains three values:
data - an integer
left and right, both pointers to another leaf.
left, right or both, might be NULL, meaning there isn't another leaf in that direction.
So that's a tree. There's one Leaf at the root, and you can follow the trail of left or right pointers until you reach a NULL.
The key to recursion is that if you follow the path by one Leaf, the remaining problem is exactly the same (but "one smaller") as the problem you had when you were at the root. So you can call the same function to solve the problem. Eventually the routine will be at a Leaf with NULL as its pointer, and you've solved the problem.
It's probably easiest to understand a list before you understand a tree. So instead of a Leaf with two pointers, left and right, you have a Node with just one pointer, next. To follow the list to its end, recursively:
Node *findEnd(Node *node) {
    if(node->next == NULL) {
        return node; // Solved!!
    } else {
        return findEnd(node->next);
    }
}
What's different about your findLeaf? Well, it uses the data parameter to decide whether to follow the left or right pointer, but otherwise it's exactly the same.
You should be able to make sense of traverse() with this knowledge. It uses the same principle of recursion to visit every Leaf in the structure.

A recursive function breaks a problem down into 2 cases:
one step of solving the problem, and calling itself with the remainder of the problem
the last step of solving the problem
Recursion is simply a different way of looping through code.
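To illustrate the "different way of looping" point, here is a sketch that puts the question's recursive findLeaf next to a loop that does the same job. findLeafLoop is a hypothetical name, and the Leaf layout is assumed from the question.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Leaf {
    int data;
    struct Leaf *left;
    struct Leaf *right;
} Leaf;

/* Recursive version: one step, then call itself on the remainder. */
Leaf *findLeaf(Leaf *R, int data)
{
    assert(R != NULL);
    if (R->data >= data) {
        if (R->left == NULL) return R;
        return findLeaf(R->left, data);
    }
    if (R->right == NULL) return R;
    return findLeaf(R->right, data);
}

/* Loop version: the "remainder of the problem" is just a new value of R. */
Leaf *findLeafLoop(Leaf *R, int data)
{
    assert(R != NULL);
    for (;;) {
        Leaf *next = (R->data >= data) ? R->left : R->right;
        if (next == NULL) return R;   /* last step: nothing further to follow */
        R = next;                     /* one step: move down and repeat */
    }
}
```

Both functions should return the same node for any non-empty tree. The loop works here because each call makes only one recursive call and does nothing afterwards; traverse(), which recurses twice, is harder to turn into a plain loop.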

Recursive algorithms generally work hand in hand with some form of data structure - in your case the tree. You need to imagine the recursion - very high level - as "reapply the same logic on a subset of the problem".
In your case the subset of the problem is either the branch of the tree on the right or the branch of the tree on the left.
So, let's look at the traverse algorithm:
It takes the leaf you pass to the method and, if it's the ROOT leaf, states it
Then, if there is a "left" sub-leaf, it displays the data attached to it and restarts the algorithm (the recursion), which means... on the left node
If the left node is the ROOT, state it (no chance after the first recursion since the ROOT is at the top)
Then, if there is a "left" sub-leaf to our left node, display it and restart the algorithm on this left, left
When reaching the bottom left, i.e. when there is no left leaf left (following? :) ) then it does the same on the first right leaf. If there is neither a left leaf nor a right leaf, which means we are at the real leaf that does not have sub-leafs, the recursive call ends, which means that the algorithm starts again from the place it was before recursing and with all the variables at the state they were in then.
After first recursion termination, you will move from the bottom left leaf up one leaf, and go down on the right leaf if there is one and start again printing and moving on the left.
All in all - the ending result is that you walk through your whole tree in a left first way.
Tell me if it's not crystal clear and try to apply the same pattern on the findLeaf recursive algorithm.
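If you want to check the visiting order without reading printf output, here is a sketch of a variant that stores the values in the order traverse() would print them. traverseInto, out and cap are hypothetical names, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Leaf {
    int data;
    struct Leaf *left;
    struct Leaf *right;
} Leaf;

/* Stores every value in visit order in out[0..cap-1].
   n is the number of values stored so far; the new count is returned. */
size_t traverseInto(const Leaf *R, int *out, size_t n, size_t cap)
{
    if (R == NULL) return n;                     /* nothing here: recursion ends */
    assert(n < cap);                             /* caller must provide enough room */
    out[n++] = R->data;                          /* this node first */
    n = traverseInto(R->left, out, n, cap);      /* then everything on the left */
    n = traverseInto(R->right, out, n, cap);     /* then everything on the right */
    return n;
}
```

For a tree with 8 at the root, 3 and 10 below it, and 1 and 6 below 3, the array ends up as 8, 3, 1, 6, 10: the whole left side is finished before the right side is touched.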

A little comment about recursion and then a little comment about searching on a tree:
let's suppose you want to calculate n!. You can do (pseudocode)
fac 0 = 1
fac (n+1) = (n+1) * fac n
So recursion is solving a problem by manipulating the result of solving the same problem on smaller data. See http://en.wikipedia.org/wiki/Recursion.
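The factorial pseudocode translates to C almost word for word. This is only a sketch; the limit of 12 is an assumption that keeps the result inside a 32-bit unsigned long.

```c
#include <assert.h>

/* fac 0 = 1; fac n = n * fac (n - 1) */
unsigned long fac(unsigned int n)
{
    assert(n <= 12);           /* 13! no longer fits in 32 bits */
    if (n == 0) return 1;      /* base case */
    return n * fac(n - 1);     /* same problem on smaller data */
}
```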
So now, let's suppose we have a data structure tree
T = (L, e, R)
with L the left subtree, e is the root and R is the right subtree... So let's say you want to find the value v in that tree, you would do
find v LEAF = false // you can't find any value in an empty tree, base case
find v (L, e, R) =
    if v == e
        then something(e)
    else
        if v < e
            find v L (here we have recursion, we say 'go and search for v in the left subtree')
        else
            find v R (here we have recursion, we say 'go and search for v in the right subtree')
        end
    end
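In C the same search could look roughly like this. The Tree struct and its field names mirror the (L, e, R) notation and are assumptions, and returning true stands in for the unspecified something(e).

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct Tree {
    int e;              /* value stored at this node */
    struct Tree *L;     /* left subtree, NULL if empty */
    struct Tree *R;     /* right subtree, NULL if empty */
} Tree;

bool find(int v, const Tree *t)
{
    if (t == NULL) return false;          /* empty tree: base case */
    if (v == t->e) return true;           /* found it */
    if (v < t->e) return find(v, t->L);   /* search the left subtree */
    return find(v, t->R);                 /* search the right subtree */
}
```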

Related

How does the algorithm of getting binary tree height work?

int GetHeight(BinTree BT)
{
int HL, HR, MaxH;
if(BT)
{
HL = GetHeight(BT->Left);
HR = GetHeight(BT->Right);
MaxH = HL > HR ? HL : HR;
return (MaxH + 1);
}
else
return 0;
}
I can't follow the details of this algorithm.
How do HL and HR get their heights?
Can anyone explain it?
Thanks a lot.
There are two cases.
The first case is when the tree node is NULL, meaning there isn't a tree node. That height is zero, and is captured in the "else" statement.
The second case is when the tree node is not NULL; then the height of the tree is the larger of the heights of the two tree branches, with 1 added.
So, if you have a single node tree, the branches both report zero, and one is added, making it a height of one. If that single node tree has a parent node, then that branch of the parent node will report one, and the other branch might report something else (let's say zero) and the height is one plus one, or two. And so on, until you finally get the height of the tree.
I'm going to guess the thing you're struggling with here is how recursive functions work. The important thing to note is that GetHeight returns an int.
If you examine a tree node which is non-null, the if(BT) condition is true, and you follow the Left and Right pointers into GetHeight again (adding a stack frame). Now, if Left or Right is null you're going to hit the else branch, which returns 0. Imagine a leaf node (left and right both null): both HL and HR will be 0, so MaxH will be 0, and GetHeight will return 1. If that call to GetHeight was itself made from another GetHeight call, that 1 gets passed back to the calling function, and either HL or HR is now 1, and that call will return MaxH + 1, and so on until you end up back at the initial call. In this way you recursively accumulate the answer.
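Here is a small sketch you can experiment with. The struct TreeNode layout behind BinTree is an assumption, since the question does not show it.

```c
#include <stddef.h>

typedef struct TreeNode *BinTree;
struct TreeNode {
    int Data;
    BinTree Left;
    BinTree Right;
};

int GetHeight(BinTree BT)
{
    if (BT == NULL)
        return 0;                          /* no node: height 0 */
    int HL = GetHeight(BT->Left);          /* height of the left branch */
    int HR = GetHeight(BT->Right);         /* height of the right branch */
    return (HL > HR ? HL : HR) + 1;        /* taller branch plus this node */
}
```

For a root whose left child has one child of its own and whose right child is a leaf, the calls return 1 for the deepest node, 2 for its parent, 1 for the right leaf, and finally max(2, 1) + 1 = 3 for the root.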

Binary search tree Sorting

I want to sort some data with the help of a binary search tree that I have already created.
I have the following example code that works, but I can't understand how it works.
It starts, and if there is no record in the database then b == 0 and it returns. This is clear.
If b exists, then it goes to the left node and calls the function again and again until b->left == NULL. Do I get it correctly?
But when does it print the data? From what I understand, when it calls the function it doesn't print, but starts again from the top of the function.
void display_ordered_email(struct BST_node *b)
{
if (b==0)
return;
display_ordered_email(b->left);
printf("Name : %s\n", b->data->name);
printf("Address : %s\n", b->data->address);
printf("Email : %s\n", b->data->email);
printf("\n");
display_ordered_email(b->right);
}
Is this inorder traversal or other method?
Consider this simple tree.
  b
 / \
a   c
Given that display_ordered_email is supposed to recursively print the nodes in order, you can ask yourself when b should be printed. The answer is that b should be printed after it has visited and printed a (the left side), but before it will visit and print c (the right side).
void display_ordered_email(struct BST_node *b)
{
if (b==0)
return;
display_ordered_email(b->left);
/* ... print the node */
display_ordered_email(b->right);
}
which is exactly how your routine is structured.
This is an in-order traversal using recursion (not pre-order, which would print the node before visiting either subtree). Once it is done with the left subtree, it prints the root of that subtree, followed by the right subtree. You may want to try it out with a tree of about 8 nodes.
It will traverse all the way to the bottom left and hit 0. Then it moves back one node and continues the code for that node after the recursive call returns. This means it will print that node and then try the right node. If there is no right node the call just returns; otherwise it works through the right subtree in the same way. When both sides are done it backs up one level, prints the node there, and then checks that node's right branch.
It is quite confusing at first but if you draw it out it becomes a lot easier to understand.
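If drawing it out is not enough, here is a sketch with the same left, node, right structure that stores keys in an array instead of printing records. The int key field and the names inorder, out and cap are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct BST_node {
    int key;                      /* stands in for the name/address/email record */
    struct BST_node *left;
    struct BST_node *right;
};

/* Appends the keys of the subtree at b to out[] in sorted order.
   n is the number of keys stored so far; the new count is returned. */
size_t inorder(const struct BST_node *b, int *out, size_t n, size_t cap)
{
    if (b == 0)
        return n;
    n = inorder(b->left, out, n, cap);    /* everything smaller first */
    assert(n < cap);                      /* caller must provide enough room */
    out[n++] = b->key;                    /* then this node */
    n = inorder(b->right, out, n, cap);   /* then everything larger */
    return n;
}
```

The "print" step only runs after the call for the left child has returned, which is why nothing is output on the way down.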

C: Finding Huffman-coded path of specific leaf in tree

I'm trying to write a recursive function that locates a specific leaf within a Huffman tree, then prints its shortest path with zeros and ones (zero being a traversal to the left, and one being a traversal to the right). I understand the logic of what I need to do, but I'm not having success at actually implementing it. I believe that I have a good skeleton here, but the part I'm missing is some more complicated logic to tell when I should actually run printf and when I should not (since this currently just prints every zero and one). Also, I know that the rest of the logic outside of this is working properly because if you do a normal traversal where you do not have to plot the shortest paths, each of the elements I am searching for is found.
I've tried looking at quite a few resources online and I cannot find a solution, or at least, I cannot recognize the solution properly. I've probably rewritten this 50 or more times. Let me know what you think!
void traverse(struct tree *curr, struct tree *cmp)
{
if (curr == NULL)
{
return;
}
if (getLeft(curr) == NULL && getRight(curr) == NULL)
{
if (curr == cmp)
{
return;
}
}
if (getLeft(curr) != NULL)
{
printf("0");
traverse(getLeft(curr), cmp);
}
if (getRight(curr) != NULL)
{
printf("1");
traverse(getRight(curr), cmp);
}
}
For context: cmp is the node we want to find, getLeft() and getRight() return the left and right children of a node respectively, and curr starts as the root of the Huffman tree itself. Also, the reason this printf thing works is because I loop through all of the known leaves, print other information about the leaf, and then call this traversal method, followed by a newline.
There are several solutions.
First, you could traverse the entire tree as you are doing and build a table of codes. Then use the table, not the tree. Then you're not wasting your time searching the whole tree for every code. As you traverse the tree you build up a string of 0's and 1's, and when you get to a leaf, you save the built up string and the symbol in the leaf in the table. Then throw away the tree. This is the recommended approach.
Second, your links could be bidirectional. Since you have a pointer to the leaf, you could simply start at the leaf and work your way back to the root, constructing the string of 0's and 1's in reverse.
Third, you could persist in doing your painful tree search for every leaf by having your traverse function return true or false. It would return true if either it got to the desired leaf, or if one of the traverse calls returned true. Then depending on which traverse call returned true, you would print or save a zero or a one. This would print the path in reverse. If you save them in a string in reverse order instead of printing, then you can print the string when the first traverse call returns.
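A rough sketch of the third option follows. It assumes a struct tree with plain left and right fields rather than the getLeft()/getRight() accessors from the question, and findPath and codeFor are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct tree {
    struct tree *left;
    struct tree *right;
};

/* Returns true if cmp is in the subtree at curr. Each caller appends its
   own branch bit on the way back up, so buf is filled in reverse order. */
static bool findPath(const struct tree *curr, const struct tree *cmp,
                     char *buf, size_t *len)
{
    if (curr == NULL) return false;
    if (curr == cmp) return true;
    if (findPath(curr->left, cmp, buf, len)) {
        buf[(*len)++] = '0';
        return true;
    }
    if (findPath(curr->right, cmp, buf, len)) {
        buf[(*len)++] = '1';
        return true;
    }
    return false;
}

/* Writes the code for cmp into buf as a string. buf needs room for one
   char per level of the tree plus the terminating '\0'. */
bool codeFor(const struct tree *root, const struct tree *cmp, char *buf)
{
    assert(buf != NULL);
    size_t len = 0;
    if (!findPath(root, cmp, buf, &len)) return false;
    for (size_t i = 0; i < len / 2; i++) {      /* undo the reverse order */
        char t = buf[i];
        buf[i] = buf[len - 1 - i];
        buf[len - 1 - i] = t;
    }
    buf[len] = '\0';
    return true;
}
```

Nothing is written for branches that do not lead to the leaf, which is the piece missing from the traverse in the question.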
A viable solution is to give each node a parent pointer. This way, once you find the leaf, you can traverse up the tree recursively starting at that leaf, and print the appropriate bits as you return from the recursive calls.
In this function, first check if the node has a parent or not (in other words, if we're at the root or not), and if so, call the function recursively with the node's parent; if not, return.
In the case that we called the function recursively, after the recursive call, check to see if the current node is the right child of its parent. If so, print a 1; if not, print a 0.
No need to worry about reversing a string in this implementation.
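As a sketch of the parent-pointer idea: the parent field and the name pathFromLeaf are assumptions, and the bits go into a buffer here so the result can be checked, but a putchar() in the same spot would print them directly.

```c
#include <assert.h>
#include <stddef.h>

struct tree {
    struct tree *parent;   /* NULL for the root */
    struct tree *left;
    struct tree *right;
};

/* Writes the bits from the root down to node into buf and returns how
   many were written. The caller adds the '\0' if a string is needed. */
size_t pathFromLeaf(const struct tree *node, char *buf)
{
    assert(node != NULL && buf != NULL);
    if (node->parent == NULL)
        return 0;                                  /* at the root: empty path */
    size_t n = pathFromLeaf(node->parent, buf);    /* parent's bits come first */
    buf[n] = (node->parent->right == node) ? '1' : '0';
    return n + 1;
}
```

Because each bit is written after the recursive call returns, the bits come out in root-to-leaf order.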
Another possible solution would be to build up the string and pass it along to the recursive calls. For this solution, you'd need to know the height of the tree, or at least the number of symbols your tree can encode so that you pass in a char array of at least that size, plus one for null termination.
In pseudocode, this would look like:
func traverse (cur, cmp, str)
    if cur == null, return
    if cur == cmp
        print str
    if cur.left != null
        traverse(cur.left, cmp, str + "0")
    if cur.right != null
        traverse(cur.right, cmp, str + "1")
This way, you're building up the string, and only print it once you find the leaf in question. Note that I moved the cur == cmp check outside of that if statement, because it should never be true for an internal node in a Huffman code tree. This method is wildly inefficient for finding the code for one character, though, since it performs a DFS on the entire tree.
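In C there is no str + "0", so one way to write the pseudocode is to share a single buffer and pass the current depth along. This is a sketch under assumptions: struct tree has plain left and right fields, and the extra depth and result parameters are hypothetical. It copies the code into result instead of printing it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct tree {
    struct tree *left;
    struct tree *right;
};

/* str[0..depth-1] holds the bits taken to reach cur. Both str and result
   need room for one char per level of the tree plus the '\0'. */
void traverse(const struct tree *cur, const struct tree *cmp,
              char *str, size_t depth, char *result)
{
    assert(str != NULL && result != NULL);
    if (cur == NULL) return;
    if (cur == cmp) {
        str[depth] = '\0';
        strcpy(result, str);     /* or puts(str) to print it */
        return;
    }
    str[depth] = '0';            /* try the left branch */
    traverse(cur->left, cmp, str, depth + 1, result);
    str[depth] = '1';            /* overwrite with the right branch */
    traverse(cur->right, cmp, str, depth + 1, result);
}
```

Overwriting str[depth] before each call is what plays the role of passing a fresh str + "0" or str + "1" to the next level.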

How to find the height of a binary tree without using recursion and level order traversal?

I came up with preorder traversal for finding the height of a tree.
int preHeight(node * n)
{
int max = 0,c = 0;
while(1)
{
while(n)
{
push(n);
c++;
if(c>max)
max = c;
n= n->left;
}
if (isStackEmpty())
return max;
n = pop();
if(n->right) //Problem point
c--;
n=n->right;
}
}
I get the height correct but I'm not sure if my method is correct. What I do is I increment my counter c till the leftmost node and then, if I move up I reduce it in case I need to move right, and then repeat the entire exercise. Is that correct?
Consider the following tree -
    o
   / \
  o   o
     / \
    o   o
You will reach the end of the leftmost branch with c==2, then pop the nodes on the way up, but c is only decremented for nodes with a right child, so after a few pops it no longer matches the depth of the node you are standing on. In this case you'll reach the top node, decrement once, and then start descending from the root with c==1, eventually reaching 3. That happens to be right for this tree, but it is luck: if the left branch were a chain of several nodes without right children, none of those pops would decrement c and the right subtree would be counted too deep.
Just removing the condition is not enough either: after a pop you step into the popped node's right child, which sits one level below the popped node, so an unconditional c-- makes the right subtree look too shallow. What you need is the depth of the popped node itself. You can keep the value of c per node instead of restoring it with decrements: either push it onto a separate stack alongside the node pointers, or convert the code to be recursive, with c (and the current node) as local variables (basically this is the same thing, except that the compiler maintains the stack for you).
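A sketch of the "keep c per node" variant is below. The fixed-size array stack, MAX_STACK and the name iterHeight are assumptions that replace the push()/pop() helpers from the question, which are not shown.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    struct node *left;
    struct node *right;
} node;

#define MAX_STACK 64   /* assumption: the tree is never taller than this */

int iterHeight(node *n)
{
    struct { node *n; int depth; } stack[MAX_STACK];
    int top = 0;       /* number of entries on the stack */
    int max = 0;
    int c = 0;         /* depth of the parent of n */

    for (;;) {
        while (n) {
            assert(top < MAX_STACK);
            c++;                          /* n is one level below its parent */
            stack[top].n = n;
            stack[top].depth = c;         /* remember the depth with the node */
            top++;
            if (c > max)
                max = c;
            n = n->left;
        }
        if (top == 0)
            return max;
        top--;
        c = stack[top].depth;             /* restore the popped node's depth */
        n = stack[top].n->right;          /* its right child will be at c + 1 */
    }
}
```

On the five-node tree above this returns 3, and it also handles a long left-only chain followed by a deeper right subtree, which the counter-only versions get wrong.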

How does recursion work in a binary search tree?

Binary search tree algorithms usually use recursion, and I'm having a hard time with it.
This is code that converts the tree into its mirror image.
void mirror_image(struct tree* node1)
{
if (node1==NULL)
return;
else
{
struct tree *temp;
mirror_image(node1->left);
mirror_image(node1->right);
temp=node1->left;
node1->left=node1->right;
node1->right=temp;
}
}
How does this work?
Basically you are turning the tree into its mirror image in place by exchanging the left and right pointers of every node; no new tree is created, only the addresses stored in the pointers change. First the left pointer is saved in the temp pointer variable. Then the right pointer is copied into left. Finally the value in temp is stored in right. It's like swapping.
So, it processes the left child's subtree using
mirror_image(node1->left);
and the right child's subtree using
mirror_image(node1->right);
with each chain of calls stopping on reaching the end, when
if (node1==NULL)
return;
and once both calls have returned, it interchanges the two children using the swap procedure:
temp=node1->left;
node1->left=node1->right;
node1->right=temp;
I'd suggest trying it with a small binary tree and working through it yourself on paper.
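If you would rather check it in code, here is a small sketch. The struct tree layout with an int data field is an assumption, since the question does not show it.

```c
#include <stddef.h>

struct tree {
    int data;
    struct tree *left;
    struct tree *right;
};

void mirror_image(struct tree *node1)
{
    if (node1 == NULL)
        return;                        /* empty subtree: nothing to swap */
    mirror_image(node1->left);         /* mirror everything below on the left */
    mirror_image(node1->right);        /* mirror everything below on the right */
    struct tree *temp = node1->left;   /* then swap this node's two children */
    node1->left = node1->right;
    node1->right = temp;
}
```

For a root 1 with children 2 and 3, where 2 has children 4 and 5, the result is a root 1 with children 3 and 2, where 2 now has children 5 and 4. Calling mirror_image a second time restores the original shape.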
