inorder traversal in BST - c

How can I keep track of the previous node in a recursive inorder traversal of a binary search tree?
For example, to find the floor of a number in a BST, I am trying to find the first number in the BST that is larger than the given value and, at that point, print the data of the previous node, which is either equal to or less than the given value because the traversal is inorder.
So my question is: how can we keep track of the previous node in a BST during a recursive inorder traversal?

(Aside: It doesn't sound like you're asking for an in-order traversal, but rather a binary search function that returns the greatest node that is no greater than the query.)
The two most common ways of keeping track of stuff like this in a recursive algorithm are to either pass it down as a parameter or return it back up to the caller. (Either way you're storing information about the past on the stack.)
In your case it's probably cleanest to do the latter, e.g.:
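Node *floor_node(int x, Node *subtree) {
    if (subtree == NULL) {
        return NULL;
    }
    if (subtree->value > x) {
        /* Too large: the floor, if any, is in the left subtree. */
        return floor_node(x, subtree->left);
    }
    /* subtree->value <= x: prefer a closer match from the right subtree.
       (In C, `a || b` yields 0 or 1, not a pointer, so test explicitly.) */
    Node *closer = floor_node(x, subtree->right);
    return closer ? closer : subtree;
}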
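The other option mentioned above, passing the information down as a parameter, could look roughly like the sketch below. The `Node` layout and the name `floor_node_down` are assumptions made for illustration; the original post does not show them. The extra `best` argument carries the closest candidate seen so far.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node layout; the original post does not show its Node type. */
typedef struct Node {
    int value;
    struct Node *left;
    struct Node *right;
} Node;

/* Returns the node holding the greatest value <= x.
 * best is the closest candidate found on the way down (NULL if none yet). */
Node *floor_node_down(int x, Node *subtree, Node *best) {
    if (subtree == NULL) {
        return best;                    /* ran off the tree: report the candidate */
    }
    if (subtree->value > x) {
        /* Too large: keep the old candidate and look left. */
        return floor_node_down(x, subtree->left, best);
    }
    /* subtree->value <= x: this node is the best candidate so far. */
    return floor_node_down(x, subtree->right, subtree);
}
```

The top-level call passes `NULL` as the initial candidate, for example `floor_node_down(6, root, NULL)`. Both recursive calls are tail calls, so this variant is also easy to turn into a loop.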

Binary tree recursion works by going down the left tree and then the right. Inorder/preorder/postorder are a convention that is determined merely by the ordering of some local action in the recursive procedure: the timing of the "visiting" of the current node itself with regard to the two recursive calls.
The way to get the previous node is to have the recursion return it.
When you recurse into a tree, the last node visited in "inorder" is simply the rightmost node! Therefore, your recursion must simply return the rightmost node.
Furthermore, if a tree T as a whole has some previous node P, then the left subtree of T, namely left(T) also has the same previous node P. P is the predecessor of the leftmost node of T.
Moreover, the previous node with respect to right(T) is the node T itself.
So when recursing into left(T) we can simply pass down the same predecessor that was given to us, and when recursing into right(T) we pass ourselves as the predecessor.
Pseudocode:
# a recursive function that is given its previous node,
# and returns the rightmost node
recurse_with_previous (tree previous-in):
    # skip empty link. No leaf to see here!
    # previous-in is the rightmost node still
    if null(tree)
        return previous-in
    # if we are at a leaf, then that leaf is rightmost
    if leaf(tree)
        print "visiting leaf node " tree " with previous node " previous-in
        return tree
    # the previous node (previous-in) of this tree is actually the left
    # subtree's previous node, so we just pass that parameter down
    previous = recurse_with_previous (left(tree) previous-in)
    # inorder visit: visit this node between the subtrees
    print "visiting " tree " with previous node " previous
    # now the right subtree. what is ITS previous? Why, we are!!!
    # we return whatever this returns causing the return value
    # to be the rightmost node.
    return recurse_with_previous (right(tree) tree)

# how to call
recurse_with_previous(some-tree nil)
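In C, the pseudocode above might be sketched as follows. The `Node` layout, the function name `inorder_with_prev`, and the callback-style `visit` parameter are assumptions for illustration. The separate leaf case is folded into the general case, which behaves the same way for a leaf.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Hypothetical node layout. */
typedef struct Node {
    int value;
    struct Node *left;
    struct Node *right;
} Node;

/* Visits the subtree in order. prev_in is the inorder predecessor of the
 * subtree's leftmost node (NULL if there is none). Returns the last node
 * visited, i.e. the rightmost node, or prev_in if the subtree is empty. */
Node *inorder_with_prev(Node *tree, Node *prev_in,
                        void (*visit)(Node *cur, Node *prev)) {
    if (tree == NULL) {
        return prev_in;
    }
    /* The left subtree shares our predecessor. */
    Node *prev = inorder_with_prev(tree->left, prev_in, visit);
    /* Inorder visit: between the two subtrees. */
    visit(tree, prev);
    /* For the right subtree, the predecessor is this node itself. */
    return inorder_with_prev(tree->right, tree, visit);
}

/* Example visitor: prints each node together with its predecessor. */
void print_pair(Node *cur, Node *prev) {
    if (prev != NULL) {
        printf("visiting %d with previous node %d\n", cur->value, prev->value);
    } else {
        printf("visiting %d with no previous node\n", cur->value);
    }
}
```

A call such as `inorder_with_prev(root, NULL, print_pair)` walks the whole tree and returns its rightmost node.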

Related

Inserting elements from a sorted array into a BST in an efficient way

I am struggling with this exercise:
Given a BST T, whose nodes contain only a key field, a left and right field and a sorted array A which contains m keys. Write an efficient algorithm which inserts into T any A's keys which are not already present in T. You must not apply the InsertBST(T,key) algorithm on single A's keys.
For example if the BST contains 1,3,4,5 and A contains 1,2,5,6, I have to insert 2,6 without using InsertBST(T,2) and InsertBST(T,6).
This is what I tried:
Algo(T,A,index)
    if T != null then
        Algo(T->sx,A,index)
        p = NewNode()
        p->key = A[index]
        if T->key > A[index] then
            T->sx = p
            index = index + 1
        else if T->key < A[index] then
            T->dx = p
            index = index + 1
        Algo(T->dx,A,index)
But it inserts nodes at the wrong places. Can you help me?
I see the following issues with your attempt:
In each recursive call you are going to insert a node. So for instance, also when you reach the first, left-most leaf. But this cannot be right. The first node insertion may have to happen completely elsewhere in the tree without affecting this leaf at all
There is no check that index runs out of bounds. The code assumes that there are just as many values to insert as there are nodes in the tree.
There is no provision for when the value is equal to the value of an existing node in the tree. In that case the value should not be added at all, but ignored.
In most languages index will be a local variable of the Algo function, and so updating it (with + 1) will have no effect on the index variable that the caller has. You would need a single index variable that is shared across all executions of Algo. In some languages you may be able to pass index by reference, while in other languages you may be able to access a more global variable, which is not passed as an argument.
Algorithm
The algorithm to use is as follows:
Perform the inorder traversal (as you already did). But keep track of the previously visited node. Only when you detect that the next value to be inserted is between the values of those two nodes, you need to act: in that case create a new node and insert it. Here is how to detect where to insert it. Either:
the current node has no left child, which also means the previous node (in the inorder sequence) is not existing or is higher up the tree. In this case the new node should become the left child of the current node
the current node has a left child, which also means the previous node is in the left subtree and has no right child of its own. In that case, add the new node as right child of the previous node.
The Sentinel idea
There is a possibility that after visiting the last node in inorder sequence, there are still values to insert, because they are greater than the greatest value in the tree. Either you must treat that case separately after the recursive traversal has finished, or you can begin your algorithm by adding a new, temporary root to your tree, which has a dummy, but large value -- greater than the greatest value to insert. Its left child is then the real root. With that trick you are sure that the recursive logic described above will insert nodes for all values, because they will all be less than that temporary node's value.
Implementation in Python
Here is the algorithm in Python syntax, using that "sentinel" node idea:
def insertsorted(root, values):
    if not values:
        return root  # Nothing to do
    # These variables are common to all recursive executions:
    i = 0  # index in values
    prev = None  # the previously visited node in inorder sequence

    def recur(node):
        nonlocal i, prev  # We allow those shared variables to be modified here
        if i < len(values) and node.left:
            recur(node.left)
        while i < len(values) and values[i] <= node.value:
            if values[i] < node.value:
                newnode = Node(values[i])
                if node.left is None:
                    node.left = newnode
                else:
                    prev.right = newnode
                prev = newnode
            i += 1
        prev = node
        if i < len(values) and node.right:
            recur(node.right)

    # Create a temporary node that will become the parent of the root
    # Give it a value greater than the last one in the array
    sentinel = Node(values[-1] + 1)
    sentinel.left = root
    recur(sentinel)
    # Return the real root to the caller
    return sentinel.left
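Since the question's pseudocode is C-like, here is a rough C sketch of the same idea. It deviates from the Python version in one respect: instead of a sentinel, it uses the other option described above and appends the leftover values after the traversal, which avoids computing a "largest key plus one" that could overflow an `int`. The names `InsertState`, `recur`, `insert_sorted`, `new_node` and `collect_inorder` are assumptions for illustration, and the array is assumed to be strictly increasing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct Node {
    int key;
    struct Node *left;
    struct Node *right;
} Node;

/* State shared by all recursive calls (the role of the nonlocal variables). */
typedef struct {
    const int *values;
    size_t count;
    size_t i;      /* index of the next value to insert */
    Node *prev;    /* previously visited node in inorder sequence */
} InsertState;

Node *new_node(int key) {
    Node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->key = key;
    n->left = n->right = NULL;
    return n;
}

void recur(Node *node, InsertState *s) {
    if (s->i < s->count && node->left != NULL) {
        recur(node->left, s);
    }
    while (s->i < s->count && s->values[s->i] <= node->key) {
        if (s->values[s->i] < node->key) {   /* equal keys are skipped */
            Node *n = new_node(s->values[s->i]);
            if (node->left == NULL) {
                node->left = n;              /* no left subtree: hang it here */
            } else {
                s->prev->right = n;          /* else below the predecessor */
            }
            s->prev = n;
        }
        s->i++;
    }
    s->prev = node;
    if (s->i < s->count && node->right != NULL) {
        recur(node->right, s);
    }
}

Node *insert_sorted(Node *root, const int *values, size_t count) {
    InsertState s = { values, count, 0, NULL };
    if (root != NULL) {
        recur(root, &s);
    }
    /* Leftover values exceed every key in the tree. At this point s.prev is
     * the rightmost node (or NULL for an empty tree), so chain them there. */
    while (s.i < s.count) {
        Node *n = new_node(values[s.i++]);
        if (s.prev != NULL) {
            s.prev->right = n;
        } else {
            root = n;
        }
        s.prev = n;
    }
    return root;
}

/* Helper for checking results: writes keys in order, returns the new count. */
size_t collect_inorder(const Node *t, int *out, size_t n) {
    if (t == NULL) {
        return n;
    }
    n = collect_inorder(t->left, out, n);
    out[n++] = t->key;
    return collect_inorder(t->right, out, n);
}
```

With the tree 1, 3, 4, 5 and the array 1, 2, 5, 6 from the question, only 2 and 6 are added. The chained leftovers form a right-leaning list, so the tree is not kept balanced.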

How do you find the lowest key in a maximum sum path of a binary search tree in O(n)?

I am already calculating the maximum path sum, but I want to figure out which is the lowest key on that path. How can I get this information? I am having trouble because if I check for the minimum while computing the maximum path sum, I don't get what I am looking for (of course), because the recursion first descends to the lowest element in the BST.
Below what i tried:
int Max_Path_Sum(struct node* root){
    int res= INT_MIN;
    int min = INT_MAX;
    Max_Path_Sum_Util( root, &res, &min);
    printf("%d\n\n%d", min,res);
    return res;
}

int Max_Path_Sum_Util(struct node* root, int *res, int *min){
    if(root == NULL) return 0;
    if( root->left == NULL && root->right == NULL)return root->key;
    int ls = Max_Path_Sum_Util(root->left ,res , min);
    int rs = Max_Path_Sum_Util(root->right ,res , min);
    if(root->left != NULL && root->right != NULL){
        *res = max(*res , ls + rs + root->key);
        return max(ls,rs)+ root->key;
    }
    int sum = (root->left == NULL) ? rs+root->key : ls+root->key;
    if(root != NULL && *min> root->key)*min = root->key;
    return sum;
}
I am receiving the lowest key in the whole BST, and I understand why that isn't the right result except in some rare cases. My BST isn't balanced (it's just homework), so I am inserting keys without caring about balance.
struct node *root=New_Node(4);
Insert(root, 2);
Insert(root, 1);
Insert(root, 3);
Insert(root, 6);
Insert(root, 5);
Insert(root, 4);
Insert(root, -5);
Insert(root,0);
Insert( root, 3);
Insert(root, 2);
Using this tree, the result for the maximum path sum is 24, which should be correct.
As the minimum I receive 6, which isn't the right answer. I think it should be 2.
"I am having troubles because if i check for the minimum inside the maximum path sum, i dont get what i am looking for (ofcourse) because im recurring firstly to the lowest element inside the BST."
I'd characterize the issue differently: you cannot directly record the minimum node along the path, because you don't know during any particular execution of the recursive function whether it is operating on a node that will turn out to be on the maximum path. But this genuine issue presents an actual problem only for some implementations.
When searching for a path in a tree via an algorithm that works one node at a time, you generally have two cases to consider as you process each node:
the path of interest passes through the current node, or
it doesn't
Specific algorithms generally subdivide those further. In particular, your recursive approach that processes the tree from the chosen root node toward the leaf nodes has these more specific cases to account for:
the path passes through the current node from its parent node (either ending at this node or continuing through exactly one of its children)
the path passes through this node and does not contain its parent node (it may also pass through one or both of its children)
the path does not pass through this node, but it is contained in the subtree rooted at this node
the path does not pass through any node in the subtree rooted at the current node
When processing a given node during your recursive traversal, you need to provide an answer as if that node were the root of the tree (because it might be), and also sufficient information for the answer to be determined correctly if it isn't.
Now note that neither the maximum path sum in a tree T1 nor the minimum element along that maximum path directly informs computation of those properties for a larger tree T2 that contains T1 as a subtree. You can't just add maximum path sums from a node's left and right subtrees -- that gives the right answer only in the case that in each subtree, the maximum path starts at the subtree root, so that you can join them together through their common parent to form a path. If the maximum path in one of the subtrees does not contain the subtree root or the subtree root is somewhere in the middle of the maximum path, then you can't form a path by joining the parent node to it.
Thus you need separate sets of information about each subtree:
information about the general maximum-sum path within that subtree, and
information about the maximum-sum path within it that starts at the subtree root (even if, as will be common, its path sum is less than the maximum in the subtree)
When processing a node, you can combine the latter sets of information about the subtrees rooted at its children to compute both sets of information for the node under consideration. Moreover, you need to maintain data separation so that information applying to one of a node's child trees is not lost when you process the other. So what does that look like?
Let's first introduce another data structure to make it easier to keep track:
struct path_info {
int sum;
int min_value;
};
Now let's consider what your recursive function's signature needs to look like. There are several ways it could be done, but I'm going to suggest this:
struct path_info compute_max_path(struct node *root, struct path_info *max_leg)
The return value conveys the result for the tree rooted at the specified node, and the information needed to build such a result for a larger tree is conveyed via the max_leg output parameter.
I don't intend to write a complete solution for you, but I suspect there is one more idea that you're missing: how to segregate max_leg results for the subtrees. The key here is that when you recurse, you do not forward the max_leg parameter to the recursive calls. Instead, you declare new objects, and pass pointers to those:
struct path_info left_leg;
struct path_info left_result = compute_max_path(root->left, &left_leg);
struct path_info right_leg;
struct path_info right_result = compute_max_path(root->right, &right_leg);
You then have all the information you need to set the max_leg data for the current node and to compute and return the maximum path information for its subtree.
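For readers who want to see how the pieces might fit together, here is one possible sketch, not the only way to do it. It makes several assumptions that the original does not state: a path is any chain of one or more nodes joined by parent-child links (the code in the question appears to consider only leaf-to-leaf paths, which would need different combining rules), keys may be negative, `root` is never NULL, sums do not overflow, and ties between equal-sum paths are broken arbitrarily. The helper `min_int` is hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct node {
    int key;
    struct node *left;
    struct node *right;
};

struct path_info {
    int sum;
    int min_value;
};

static int min_int(int a, int b) { return a < b ? a : b; }

/* root must not be NULL. Returns the maximum-sum path anywhere in the
 * subtree; *max_leg receives the maximum-sum path that starts at root
 * and runs downward through at most one child. */
struct path_info compute_max_path(struct node *root, struct path_info *max_leg) {
    struct path_info leg = { root->key, root->key };  /* root on its own */
    struct path_info through = leg;  /* best path whose topmost node is root */
    struct path_info best = { INT_MIN, 0 };  /* best path found in a child subtree */
    struct node *kids[2] = { root->left, root->right };

    for (int k = 0; k < 2; k++) {
        if (kids[k] == NULL) {
            continue;
        }
        struct path_info child_leg;  /* a fresh object per child, never forwarded */
        struct path_info child_best = compute_max_path(kids[k], &child_leg);
        if (child_best.sum > best.sum) {
            best = child_best;
        }
        if (child_leg.sum > 0) {
            /* A positive leg improves the path that bends at root. */
            through.sum += child_leg.sum;
            through.min_value = min_int(through.min_value, child_leg.min_value);
            /* The downward leg may continue into one child only. */
            if (root->key + child_leg.sum > leg.sum) {
                leg.sum = root->key + child_leg.sum;
                leg.min_value = min_int(root->key, child_leg.min_value);
            }
        }
    }
    if (through.sum > best.sum) {
        best = through;
    }
    *max_leg = leg;
    return best;
}
```

The minimum travels together with the sum it belongs to, so it is only ever taken over from a path that actually wins a comparison. That is what keeps the overall smallest key of the tree from leaking into the result.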

Find maximum subtree in the given BST such that it has no duplicates

Given a BST that allows duplicates as separate vertices, how do I find the highest subtree that has no duplicates?
This is the idea:
(1) Check if the root value appears in its right subtree (I insert this way: left < root <= right). If not, the tree has no duplicates. I always look for it going left from the root's right child.
(2) By traversing and doing (1), I can find all subtrees without duplicates, storing their root pointer and height.
(3) By comparing heights, I can find the highest such subtree.
I don't know how to store these information while traversing. I found programs for finding all duplicate subtrees of BST that use hash maps, but if possible I would prefer to avoid using hash maps, as I haven't had them on my course yet.
typedef struct vertex {
    int data;
    struct vertex *left;
    struct vertex *right;
} vertex, *pvertex;

// Utility functions
int Height(pvertex t){
    if (t == NULL)
        return 0;
    if (Height(t->left) > Height(t->right))
        return Height(t->left) + 1;
    else
        return Height(t->right) + 1;
}

int DoesItOccur(pvertex t, int k){
    if(!t)
        return 0;
    if(t->data==k)
        return 1;
    if(t->data<k){
        return DoesItOccur(t->left,k);
    }
}

// My function
pvertex MaxSeeked(pvertex t){
    if(!t)
        return NULL;
    if(DoesItOccur(t->right,t->data)==0)
        return t;
    else if{
        if(t->left && t->right){
            if(Height(MaxSeeked(t->left))>Height(MaxSeeked(t->right)))
                return t->left;
            else
                return t->right;
        }
    }
    else if{
        ......
    }
}
"I don't know how to store these information while traversing. I found programs for finding all duplicate subtrees of BST that use hash maps, but if possible I would prefer to avoid using hash maps, as I haven't had them on my course yet."
Note in the first place that you only need to track all the subtrees of the maximal height discovered so far. Or maybe you can limit that to just one such, if that's all you need to discover. For efficiency, you should also track what that maximal height actually is.
I'll suppose that you must not add members to your node structure, but if you could do, you could add a member or two wherein to record whether the tree rooted at each node contains any dupes, and how high that tree is. You could populate those data as you go, and remember what the maximum height is, then make a second traversal to collect the nodes.
But without modifying any nodes themselves, you can still track the current candidates by other means, such as a linked list. And you can put whatever metadata you want into the tracking data structure. For example,
struct nondupe_subtree {
struct vertex *root;
int height;
struct nondupe_subtree *next;
};
You can then, say, perform a selective traversal of your tree in breadth first order, carrying along a linked list of struct nondupe_subtree nodes:
Start by visiting the root node.
Test the subtree rooted at each visited node to see whether it contains any dupes, according to the procedure you have described.
If so then enqueue its children for traversal.
If not then measure the subtree height and update your linked list (or not) accordingly. Do not enqueue this node's children.
When no more nodes are enqueued for traversal, your linked list contains the roots of all the maximal height subtrees without dupes.
Note that that algorithm would in many cases be significantly sped up if you could compute and store all the subtree heights in an initial DFS pass, for it is otherwise prone to performing duplicate tree-height computations. Many of them, in some cases.
Note also that although it does simplify this particular algorithm, your rule for always putting dupes to the right works against balanced trees, which may also yield reduced performance. In the worst case, where all vertices are duplicates, your "tree" will perforce be linear.
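As a point of comparison, here is a sketch of a different technique from the breadth-first procedure described above: a single post-order pass that avoids the repeated height computations. It relies on the insertion rule left < root <= right, under which a duplicate of a node's value can only sit in its right subtree and would then be that subtree's minimum. It returns just one subtree of maximal height rather than a list of all of them. The names `subtree_info`, `best_subtree`, `scan` and `max_nondupe_subtree` are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

typedef struct vertex {
    int data;
    struct vertex *left;
    struct vertex *right;
} vertex, *pvertex;

struct subtree_info {
    int height;     /* 0 for an empty subtree, 1 for a single vertex */
    int has_dupes;  /* nonzero if some value occurs more than once */
    int min;        /* smallest value; meaningful only when height > 0 */
};

struct best_subtree {
    pvertex root;
    int height;
};

/* Post-order pass: children first, then the vertex itself. */
struct subtree_info scan(pvertex t, struct best_subtree *best) {
    struct subtree_info info = { 0, 0, 0 };
    if (t == NULL) {
        return info;
    }
    struct subtree_info l = scan(t->left, best);
    struct subtree_info r = scan(t->right, best);

    info.height = 1 + (l.height > r.height ? l.height : r.height);
    /* With left < root <= right, a copy of t->data can only be the
     * minimum of the right subtree. */
    info.has_dupes = l.has_dupes || r.has_dupes ||
                     (r.height > 0 && r.min == t->data);
    info.min = (l.height > 0) ? l.min : t->data;

    if (!info.has_dupes && info.height > best->height) {
        best->root = t;
        best->height = info.height;
    }
    return info;
}

/* Returns the root of one highest duplicate-free subtree, or NULL. */
pvertex max_nondupe_subtree(pvertex t) {
    struct best_subtree best = { NULL, 0 };
    scan(t, &best);
    return best.root;
}
```

Because each vertex is visited once and carries its height upward, the pass is linear in the number of vertices. To collect every subtree of maximal height instead of just one, the `best_subtree` record could be replaced by the linked list of `struct nondupe_subtree` entries suggested above.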

Need some explanation about trees in C

Leaf *findLeaf(Leaf *R,int data)
{
if(R->data >= data )
{
if(R->left == NULL) return R;
else return findLeaf(R->left,data);
}
else
{
if(R->right == NULL) return R;
else return findLeaf(R->right,data);
}
}
void traverse(Leaf *R)
{
if(R==root){printf("ROOT is %d\n",R->data);}
if(R->left != NULL)
{
printf("Left data %d\n",R->left->data);
traverse(R->left);
}
if(R->right != NULL)
{
printf("Right data %d\n",R->right->data);
traverse(R->right);
}
}
These code snippets work fine, but I wonder how they work.
I need a brief explanation of recursion. Thank you for your help.
A Leaf struct will look something like this:
typedef struct Leaf {
    int data;
    struct Leaf *left;   // These allow structs to be chained together, where each
    struct Leaf *right;  // node has two pointers to two more nodes, so the number
} Leaf;                  // of nodes can double with every level of the tree.
The function takes a pointer to a Leaf we call R and some data to search against, it returns a pointer to a Leaf
Leaf *findLeaf(Leaf *R,int data){
This piece of code decides whether we should go left or right, the tree is known to be ordered because the insert function follows this same rule for going left and right.
if(R->data >= data ){
This is the base case of the recursive function: if we have reached the last node on this side of the tree, return that Leaf.
The base case of a recursive function has the task of ending the recursion and returning a result. Without this, the function would not finish.
if(R->left == NULL) return R;
This is how we walk through the tree. Here, we are traversing down the left side because the data was smaller than or equal to the current node's data. (Smaller or equal data is always inserted on the left to stay ordered.)
What is happening is that now we call findLeaf() with R->left, but imagine if we get to this point again in this next call.
It will become R->left->left in reference to the first call. If the data is larger than the current node we are operating on, we would go right instead.
else return findLeaf(R->left,data);
Now we are at the case where the data was larger than the current node, so we are going right.
} else {
This is exactly the same as with the left.
if(R->right == NULL) return R;
else return findLeaf(R->right,data);
}
}
In the end, the return of the function can be conceptualized as something like R->right->right->left->NULL.
Let's take a tree rooted at (8), in which the search below passes through (3) and (6) before reaching (4), and operate on it with findLeaf():
findLeaf(Leaf * root, 4) //In this example, root is already pointing to (8)
We start at the root, at the top of the tree, which contains 8.
First we check R->data >= data where we know R->data is (8) and data is (4). Since we know data is smaller than R->data(Current node), we enter the if statement.
Here we operate on the left Leaf, checking if it is NULL. It isn't and so we skip to the else.
Now we return findLeaf(R->left, data);, but to return it, we must solve it first. This causes us to enter a second iteration where we compare (3) to (4) and try again.
Going through the entire process again, we will compare (6) to (4) and then finally find our node when we compare (4) to (4). Now we will backtrack through the function and return a chain like this:
R(8)->(3)->(6)->(4)
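To see that chain for yourself, a variant of findLeaf that records every node it looks at can help. The sketch below is an illustration under assumptions: `findLeafPath` and `insertLeaf` are hypothetical names, and the example tree mentioned in the usage note is a guess that matches the chain above, since the original picture of the tree is not available.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct Leaf {
    int data;
    struct Leaf *left;
    struct Leaf *right;
} Leaf;

/* Makes the same decisions as findLeaf, but each call appends the node it
 * is looking at to path[] (if path is not NULL), so the chain of recursive
 * calls can be inspected afterwards. R must not be NULL. */
Leaf *findLeafPath(Leaf *R, int data, int path[], int *len) {
    if (path != NULL) {
        path[(*len)++] = R->data;
    }
    if (R->data >= data) {
        if (R->left == NULL) return R;
        return findLeafPath(R->left, data, path, len);
    }
    if (R->right == NULL) return R;
    return findLeafPath(R->right, data, path, len);
}

/* Hypothetical insert helper following the same left/right rule:
 * the new Leaf is attached below the node where the search stops. */
Leaf *insertLeaf(Leaf *root, int data) {
    Leaf *n = malloc(sizeof *n);
    assert(n != NULL);
    n->data = data;
    n->left = n->right = NULL;
    if (root == NULL) {
        return n;
    }
    Leaf *parent = findLeafPath(root, data, NULL, NULL);
    if (parent->data >= data) {
        parent->left = n;
    } else {
        parent->right = n;
    }
    return root;
}
```

Inserting 8, 3, 10, 1, 6, 14, 4, 7, 13 in that order and then calling `findLeafPath(root, 4, path, &len)` with `len` starting at 0 fills `path` with the nodes compared along the way. Each entry corresponds to one level of recursion.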
Edit: Also, coincidentally, I wrote a blog post about traversing a linked list to explain the nature of a Binary Search Tree here.
Each Leaf contains three values:
data - an integer
left and right, both pointers to another leaf.
left, right or both, might be NULL, meaning there isn't another leaf in that direction.
So that's a tree. There's one Leaf at the root, and you can follow the trail of left or right pointers until you reach a NULL.
The key to recursion is that if you follow the path by one Leaf, the remaining problem is exactly the same (but "one smaller") as the problem you had when you were at the root. So you can call the same function to solve the problem. Eventually the routine will be at a Leaf with NULL as its pointer, and you've solved the problem.
It's probably easiest to understand a list before you understand a tree. So instead of a Leaf with two pointers, left and right, you have a Node with just one pointer, next. To follow the list to its end, recursively:
Node *findEnd(Node *node) {
    if (node->next == NULL) {
        return node; // Solved!!
    } else {
        return findEnd(node->next);
    }
}
What's different about your findLeaf? Well, it uses the data parameter to decide whether to follow the left or right pointer, but otherwise it's exactly the same.
You should be able to make sense of traverse() with this knowledge. It uses the same principle of recursion to visit every Leaf in the structure.
Recursion is a function that breaks a problem down into 2 variants:
one step of solving the problem, and calling itself with the remainder of the problem
the last step of solving the problem
Recursion is simply a different way of looping through code.
Recursive algorithms generally work hand in hand with some form of data structure - in your case the tree. You need to imagine the recursion - very high level - as "reapply the same logic on a subset of the problem".
In your case the subset of the problem is either the branch of the tree on the right or the branch of the tree on the left.
So, let's look at the traverse algorithm:
It takes the leaf you pass to the method and - if it's the ROOT leaf states it
Then, if there is a "left" sub-leaf it displays the data attached to it and restarts the algorithm (the recursion) which means... on the left node
If the left node is the ROOT, state it (no chance after the first recursion since the ROOT is at the top)
Then , if there is a "left" sub-leaf to our left node, display it and restart the algorithm on this left, left
When reaching the bottom left, i.e. when there is no left leaf left (following? :) ) then it does the same on the first right leaf. If there is neither a left leaf nor a right leaf, which means we are at the real leaf that does not have sub-leafs, the recursive call ends, which means that the algorithm starts again from the place it was before recursing and with all the variables at the state they were in then.
After first recursion termination, you will move from the bottom left leaf up one leaf, and go down on the right leaf if there is one and start again printing and moving on the left.
All in all - the ending result is that you walk through your whole tree in a left first way.
Tell me if it's not crystal clear and try to apply the same pattern on the findLeaf recursive algorithm.
A little comment about recursion and then a little comment about searching on a tree:
let's suppose you want to calculate n!. You can do (pseudocode)
fac 0 = 1
fac (n+1) = (n+1) * fac n
So recursion is solving a problem by manipulating the result of solving the same problem with a smaller data. See http://en.wikipedia.org/wiki/Recursion.
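In C, the two equations above translate almost line for line. This is a minimal sketch that assumes n is at most 20, so that the result fits in an unsigned 64-bit integer.

```c
#include <assert.h>

/* Assumes 0 <= n <= 20; larger inputs overflow unsigned long long. */
unsigned long long fac(unsigned int n) {
    if (n == 0) {
        return 1;               /* fac 0 = 1: the base case */
    }
    return n * fac(n - 1);      /* fac (n+1) = (n+1) * fac n */
}
```

For example, `fac(5)` expands to `5 * fac(4)`, then `5 * 4 * fac(3)`, and so on until `fac(0)` ends the chain.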
So now, let's suppose we have a data structure tree
T = (L, e, R)
with L the left subtree, e is the root and R is the right subtree... So let's say you want to find the value v in that tree, you would do
find v LEAF = false   // you can't find any value in an empty tree, base case
find v (L, e, R) =
    if v == e
    then something(e)
    else
        if v < e
            find v L   (here we have recursion: we say 'go and search for v in the left subtree')
        else
            find v R   (here we have recursion: we say 'go and search for v in the right subtree')
        end
    end
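The same search could be sketched in C as shown below. The `Tree` type with fields `L`, `e` and `R` mirrors the notation above and is an assumption, as is returning `true` where the pseudocode says `something(e)`. An empty tree is represented by NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical tree type mirroring T = (L, e, R). */
typedef struct Tree {
    int e;
    struct Tree *L;
    struct Tree *R;
} Tree;

bool find(int v, const Tree *t) {
    if (t == NULL) {
        return false;           /* base case: nothing lives in an empty tree */
    }
    if (v == t->e) {
        return true;            /* stands in for something(e) */
    }
    /* Recursion: the same problem on a smaller tree. */
    return (v < t->e) ? find(v, t->L) : find(v, t->R);
}
```

Each call either answers immediately or hands a strictly smaller tree to itself, which is why the recursion must end.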

convert Binary tree to Binary Search Tree inplace using C

Without using any extra space, convert a binary tree to a binary search tree. I came up with the following algorithm, but it doesn't work.
BTtoBST(node *root)
1.if the root is NULL return
2.else current=root
3.if (current->left > current) swap(current->left , current)
4.if (current->right < current) swap(current->right , current)
5.current=current->left
6 go to 3 if current!=NULL else go to 4
7.current=current->right
Thanks in advance
PS:I saw this link but was not of much help!!
Convert Binary Tree -> BST (maintaining original tree shape)
You can swap the nodes including subtrees (not only the node content) like in an AVL Tree http://en.wikipedia.org/wiki/AVL_tree
Just keep swapping as long as BST constraints are violated, restarting the depth-first search from the root after each swap.
Perform a post-order (bottom up) traversal of the tree, taking the nodes that are visited and inserting them into a BST.
Does "without any extra space" preclude recursion?
If not, then something like:
# top level call passes null for bst
bt_to_bst (root, bst)
    # nothing to add to bst; just return it
    if null(root) -> return bst
    # first add all of the left subtree into the bst, and then the right subtree
    bst = bt_to_bst (root->left, bst);
    bst = bt_to_bst (root->right, bst);
    # both subtrees are done, so this node's links are no longer needed:
    # stick the node itself into the BST
    return bst_insert(bst, root)
bt_to_bst is a filtering operation; it takes an existing bst and returns a new one with the given tree's nodes added to it.
Adding a node to the bst after both of its subtrees have been processed (a leaf trivially qualifies) is safe because we will never visit it again, so we can overwrite its left and right pointers.
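A C sketch of this post-order approach might look like the following. The `node` layout and the helpers `bst_insert` and `collect_inorder` are assumptions for illustration. The insertion itself is iterative, so the only extra space is the recursion stack of the traversal, and the result is not balanced.

```c
#include <assert.h>
#include <stddef.h>

typedef struct node {
    int key;
    struct node *left;
    struct node *right;
} node;

/* Clears n's old links and hangs it into the BST; returns the BST root.
 * Equal keys go to the right. */
node *bst_insert(node *bst, node *n) {
    n->left = n->right = NULL;
    if (bst == NULL) {
        return n;
    }
    node *cur = bst;
    for (;;) {
        node **link = (n->key < cur->key) ? &cur->left : &cur->right;
        if (*link == NULL) {
            *link = n;
            return bst;
        }
        cur = *link;
    }
}

/* Post-order: both subtrees are moved first, so by the time root is
 * inserted its old left and right links are no longer needed. */
node *bt_to_bst(node *root, node *bst) {
    if (root == NULL) {
        return bst;
    }
    bst = bt_to_bst(root->left, bst);
    bst = bt_to_bst(root->right, bst);
    return bst_insert(bst, root);
}

/* Helper for checking results: writes keys in order, returns the new count. */
size_t collect_inorder(const node *t, int *out, size_t n) {
    if (t == NULL) {
        return n;
    }
    n = collect_inorder(t->left, out, n);
    out[n++] = t->key;
    return collect_inorder(t->right, out, n);
}
```

The top-level call is `bt_to_bst(root, NULL)`. The first node inserted (the leftmost leaf in post-order) becomes the new root, so the shape of the result depends on the shape of the input tree.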
