I have a question regarding insertion in an AVL tree. I noticed that in some cases, after an element is inserted, both a parent and its child break the AVL condition. For example, in https://www.youtube.com/watch?v=EsgAUiXbOBo at 12:50, after 1 is inserted, both 4 and 3 break the AVL condition. My question is: on which node should we do the rotation? The one closest to the root (in this case the root itself), or the one farthest from the root? We would get two different trees in those cases. Or is it correct either way?
Rebalancing starts from the bottom (at the inserted node) and works up towards the root.
Suppose all nodes up to and including P have been rebalanced, so the subtree rooted at P satisfies the AVL condition. We then move to P's parent, Q. The subtree rooted at Q is checked and rotated if necessary. The resulting subtree (whose root may have changed if a rotation was performed) again satisfies the AVL condition. Then we move up again.
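The bottom-up order falls out naturally from a recursive insert: each node is checked only after the recursive call below it has returned. The following is a minimal sketch, not the code from the video; the names `Node`, `update`, `rotate_left`, `rotate_right` and `insert` are made up for illustration, and the key sequence 4, 3, 5, 2, 1 is an assumed example in which both 3 and 4 violate the AVL condition after 1 is inserted.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height counted in nodes; an empty subtree has height 0


def height(node):
    return node.height if node else 0


def update(node):
    node.height = 1 + max(height(node.left), height(node.right))


def rotate_right(y):
    x = y.left
    y.left = x.right
    x.right = y
    update(y)  # y is now below x, so refresh it first
    update(x)
    return x


def rotate_left(x):
    y = x.right
    x.right = y.left
    y.left = x
    update(x)  # x is now below y, so refresh it first
    update(y)
    return y


def insert(node, key):
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = insert(node.left, key)
    else:
        node.right = insert(node.right, key)
    # The recursive call has returned, so everything below `node`
    # has already been rebalanced: this is the bottom-up order.
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)    # left-right case
        return rotate_right(node)
    if balance < -1:
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)  # right-left case
        return rotate_left(node)
    return node


# Assumed example: after inserting 1, both 3 and 4 are out of balance.
root = None
for k in [4, 3, 5, 2, 1]:
    root = insert(root, k)
```

In this example the rotation is performed at 3, the lower of the two violating nodes. By the time the recursion returns to 4, its left subtree has shrunk by one level and 4 no longer needs a rotation.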
I was recently studying uninformed search from here. For depth-first search, it is stated that the space taken by the fringe is O(b*m), but I am unable to figure out why (I could not find a proof of this anywhere online). Any help or pointers to specific material would be much appreciated.
A depth-first tree search needs to store only a single path from the root to a leaf node, along with the remaining unexpanded sibling nodes for each node on the path.
Once a node has been expanded, it can be removed from memory as soon as all its descendants have been fully explored.
So, for a state space with branching factor b and maximum depth m, depth-first search requires storage of only O(b*m) nodes.
Ref.: Russell & Norvig, Artificial Intelligence, Figure 3.16, p. 87
The Depth First Search (DFS) algorithm has to store only a few nodes in the fringe because it processes the most recently added node first (Last In First Out), which results in a space complexity of O(b*d). That is, at depth d it has to store at most the b children of each of the d nodes above.
The Breadth First Search (BFS) algorithm, however, processes the earliest inserted node first (First In First Out). Because of this, it has to keep track of all the child nodes it encounters, which results in a space complexity of O(b^d).
That is, for a depth of d it has to store the children, the children's children, and so on, resulting in exponential growth.
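The difference between the two fringes can be checked empirically. The sketch below is only an illustration under assumptions that are not in the posts above: a complete tree in which every node above depth m has exactly b children, nodes represented by their depth alone, and a hypothetical helper named `max_fringe`.

```python
from collections import deque


def max_fringe(b, m, lifo):
    """Return the largest fringe size seen while searching a complete
    b-ary tree of depth m (the root is at depth 0)."""
    fringe = deque([0])  # each entry is just the depth of a node
    peak = 1
    while fringe:
        # LIFO (stack) gives depth-first order, FIFO (queue) gives breadth-first.
        depth = fringe.pop() if lifo else fringe.popleft()
        if depth < m:
            fringe.extend([depth + 1] * b)  # expand: add the b children
            peak = max(peak, len(fringe))
    return peak


dfs_peak = max_fringe(3, 4, lifo=True)   # (b - 1) * m + 1 = 9
bfs_peak = max_fringe(3, 4, lifo=False)  # b ** m = 81
```

With b = 3 and m = 4 the depth-first fringe never holds more than (b - 1) * m + 1 = 9 nodes, which is O(b*m), while the breadth-first fringe grows to b^m = 81 nodes.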
I'm trying to solve the following problem: "Given a sorted (increasing order) array with unique integer elements, write an algorithm to create a BST with minimal height."
The given answer takes the root node to be the middle of the array. While doing this makes sense to me intuitively, I'm trying to prove, rigorously, that it's always best to make the root node the middle of the array.
The justification given in the book is: "To create a tree of minimal height, we need to match the number of nodes in the left subtree to the number of nodes in the right subtree as much as possible. This means that we want the root node to be the middle of the array, since this would mean that half the elements would be less than the root and half would be greater."
I'd like to ask:
Why would any tree of minimal height be one where the number of nodes in the left subtree be as equal as possible to the number of nodes in the right subtree? (Or, do you have any other way to prove that it's best to make the root node the middle of the array?)
Is a tree with minimal height the same as a tree that's balanced? From a previous question on SO, that's the impression I got, (Visualizing a balanced tree) but I'm confused because the book specifically states "BST with minimal height" and never "balanced BST".
Thanks.
Source: Cracking the Coding Interview
The way I like to think about it: if you balance a tree using tree rotations (zig-zig and zig-zag rotations), you will eventually reach a state in which the heights of the left and right subtrees differ by at most one. A balanced tree does not always have the same number of descendants on the right and the left; however, if you have that invariant (the same number of descendants on each side), you can reach a balanced tree using tree rotations.
Balance is defined differently by different data structures. AVL trees define it in such a way that no node has children whose subtree heights differ by more than one. Other trees define balance in other ways, so "balanced" and "minimal height" are not the same notion: they are closely related yet not identical. That being said, a minimal-height tree built by repeatedly taking the middle element will be balanced under the usual definitions, since balancing exists to maintain an O(log(n)) lookup time for the BST.
If I missed anything or said anything wrong, feel free to edit/correct me.
Hope this helps
Why would any tree of minimal height be one where the number of nodes
in the left subtree be as equal as possible to the number of nodes in
the right subtree?
There can be scenarios in which a minimal-height tree, which is of course balanced, has a different number of nodes on its left and right sides. The worst-case lookup in a BST is O(n), which happens when the keys were inserted in sorted order, while in a minimal-height tree the worst case is O(log n).
    *
   / \
  *   *
 /
*
Here you can clearly see that the left and right node counts are not equal even though it is a minimal-height tree.
Is a tree with minimal height the same as a tree that's balanced? From a previous question on SO, that's the impression I got, (Visualizing a balanced tree) but I'm confused because the book specifically states "BST with minimal height" and never "balanced BST".
A minimal-height tree built from the middle element is a balanced one; for more details, take a look at AVL trees, which are also known as height-balanced trees. To keep a BST height-balanced, you have to perform rotations (LL, RR, LR, RL).
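The middle-element construction is easy to check numerically. This is a rough sketch rather than the book's solution: trees are plain `(left, value, right)` tuples, and the names `build`, `height` and `size` are made up here. A binary tree of height h (counted in nodes) holds at most 2^h - 1 nodes, so n nodes need a height of at least floor(log2 n) + 1, which is `n.bit_length()` in Python.

```python
def build(values, lo=0, hi=None):
    """Build a BST from the sorted slice values[lo:hi], using the middle
    element as the root. Trees are (left, value, right) tuples."""
    if hi is None:
        hi = len(values)
    if lo >= hi:
        return None
    mid = (lo + hi) // 2
    return (build(values, lo, mid), values[mid], build(values, mid + 1, hi))


def height(tree):
    """Height counted in nodes; the empty tree has height 0."""
    if tree is None:
        return 0
    left, _, right = tree
    return 1 + max(height(left), height(right))


def size(tree):
    if tree is None:
        return 0
    left, _, right = tree
    return 1 + size(left) + size(right)


# The middle-element tree reaches the lower bound for every n tried here.
reaches_bound = all(
    height(build(list(range(n)))) == n.bit_length() for n in range(1, 200)
)

# Four nodes: minimal height, yet the two sides differ in node count.
left, root_value, right = build([10, 20, 30, 40])
```

For the four-element array the root is 30, with two nodes on the left and one on the right. This is the same shape as the tree drawn above: minimal height, but unequal node counts.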
I have a question about splay trees, where the last accessed element comes to the root. Suppose my tree is
      5
    /   \
   3     7
  / \   / \
 2   4 6   8
When I perform an inorder traversal, the output will be
2 3 4 5 6 7 8
Here the last accessed element is 8. Since 8 is the last accessed node, should we move 8 to the root or not?
Your logic is correct, but splaying is only done during insertion and searching, not during traversal. When you insert or search for a node, it is moved to the top (made the root node) so that it can be accessed quickly thereafter.
You can choose either way. There are multiple factors here.
The classic algorithm for inorder traversal of binary trees is recursive, and a splay tree can be as deep as the number of nodes it holds because it is not strictly balanced. So a recursive inorder traversal of a splay tree can easily run out of stack space, and it could take up as much memory as the splay tree itself!
If the splay tree has nodes with parent pointers, the inorder traversal can be done without recursion and without splaying, because you can find the predecessor and successor by following left, right, and parent pointers.
It is also possible to efficiently iterate splay tree nodes in order as follows:
Find the smallest element and splay it to the root.
Find its successor and splay it to the root.
Find its successor and splay it to the root. Etc.
Continue until there is no successor.
In this case, when all is done, the root will be the largest element.
Inorder traversal of splay trees (or https://doi.org/10.1016/S1571-0661(04)80771-0) describes why this approach is efficient. It is nearly O(n) to visit each node of a splay tree in order, splaying each time.
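The non-splaying variant that relies on parent pointers can be sketched as follows. This is an illustration only, not a splay tree implementation: the `Node` class and the helpers `leftmost`, `successor` and `inorder` are hypothetical names, and the test tree is a hand-built left-leaning chain standing in for a badly unbalanced splay tree.

```python
class Node:
    def __init__(self, key, parent=None):
        self.key = key
        self.parent = parent
        self.left = None
        self.right = None


def leftmost(node):
    while node.left is not None:
        node = node.left
    return node


def successor(node):
    """Next node in key order, or None if `node` holds the largest key."""
    if node.right is not None:
        return leftmost(node.right)
    # Climb until we leave a left subtree; that ancestor comes next.
    while node.parent is not None and node is node.parent.right:
        node = node.parent
    return node.parent


def inorder(root):
    """Yield keys in sorted order without recursion and without splaying."""
    node = leftmost(root) if root is not None else None
    while node is not None:
        yield node.key
        node = successor(node)


# A chain of 5000 nodes, each the left child of the previous one.
# This is deeper than Python's default recursion limit.
n = 5000
chain = Node(n - 1)
cur = chain
for key in range(n - 2, -1, -1):
    cur.left = Node(key, parent=cur)
    cur = cur.left

keys = list(inorder(chain))
```

The traversal uses constant extra memory and leaves the tree shape untouched, whereas a naive recursive traversal of the same chain would exceed Python's default recursion limit.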
I'm trying to create a B+ tree with the following sequence,
10 20 30 40 50 60 70 80 90 100
All index nodes should have a minimum of 2 and a maximum of 3 keys. I was able to insert up to 90, but as soon as I insert 100, the height increases from 2 to 3.
The problem is that the second child of the root has only one node, and I cannot fix it. It should have at least 2, right? Can someone guide me?
UPDATE: I'm following this algorithm
If the bucket is not full (at most b - 1 entries after the insertion), add the record.
Otherwise, split the bucket.
Allocate new leaf and move half the bucket's elements to the new bucket.
Insert the new leaf's smallest key and address into the parent.
If the parent is full, split it too.
Add the middle key to the parent node.
Repeat until a parent is found that need not split.
If the root splits, create a new root which has one key and two pointers. (That is, the value that gets pushed to the new root gets removed from the original node)
P.S: I'm doing it manually, by hand, to understand the algorithm. There's no code!
I believe your B+ tree is OK, assuming the order of your B+ tree is 3. If the order is m, each internal node can have ⌈m/2⌉ to m children. In your case, each internal node can have 2 to 3 children. In a B+ tree, a node with just 2 children requires only 1 key, so no constraints are violated by your B+ tree.
If you are still confused, look at this B+ Tree Simulator. Try it.
To get the tree you've drawn after inserting the values 10 to 100, the Order of your tree must be 4 not 3. Otherwise the answer given is correct: order m allows m-1 keys in each leaf and each node. After that the Wikipedia description gets a bit confusing as it concentrates on children not keys, and doesn't mention what to do with rounding. Dealing with just keys, the rules are:
Max keys for all nodes = Order-1
Min keys for leaf nodes = floor(Order/2)
Min keys for internal nodes = floor(maxkeys/2)
So you are correct in having one key in the node (order=4, max=3, minleaf=2, minnode=1). You might find this page useful as it has an online JavaScript version of the processes as well as documentation of both insert and delete:
http://goneill.co.nz/btree.php
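The key-count rules listed in this answer can be written down directly. The helper below is a small sketch of those three rules as stated here; the name `key_limits` is made up, and other textbooks define "order" and the rounding differently.

```python
def key_limits(order):
    """Key-count limits for a B+ tree of the given order, following the
    three rules quoted in the answer above."""
    max_keys = order - 1
    return {
        "max_keys": max_keys,                # all nodes
        "min_leaf_keys": order // 2,         # floor(order / 2)
        "min_internal_keys": max_keys // 2,  # floor(max_keys / 2)
    }


limits = key_limits(4)
```

For order 4 this gives a maximum of 3 keys, at least 2 keys per leaf, and at least 1 key per internal node, matching the figures quoted above (order=4, max=3, minleaf=2, minnode=1).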
Believing the wikipedia article: http://en.wikipedia.org/wiki/AVL_tree
AVL trees are height-balanced, but in general not weight-balanced nor μ-balanced;[4] that is, sibling nodes can have hugely differing numbers of descendants.
But, as an AVL tree is:
a self-balancing binary search tree [...]. In an AVL tree, the heights of the two child subtrees of any node differ by at most one
I don't see how an AVL tree could be weight-unbalanced since, if I understood the definition of an AVL tree correctly, sibling nodes will have approximately the same number of descendants because their heights are the same +/- 1.
So, could you give me an example of an AVL tree that is weight-unbalanced? I did not manage to find one. So either I misunderstood the definition of an AVL/weight-unbalanced tree, or the Wikipedia article is wrong...
Thanks
You are correct in your understanding that an AVL tree is defined by the nearly-uniform height of its edge nodes, but your confusion appears to be about the difference between node position and edge weight.
That is: in an AVL tree, the depths of the edge nodes will be the same +/- one (but not both!). This makes no claims as to the cost associated with an edge between the nodes. For an AVL tree with a root node and two children, the left path may be twice as expensive to traverse as the right path. This would make the tree weight-unbalanced, but it would still satisfy the definition of an AVL tree.
This page has more information: Weight-balanced tree - wikipedia
From Wikipedia:
A Binary Tree is called μ-balanced, with $0 \le \mu \le \tfrac{1}{2}$, if for every node N, the inequality
$\tfrac{1}{2} - \mu \le \frac{|N_l|}{|N| + 1} \le \tfrac{1}{2} + \mu$
holds and μ is minimal with this property. |N| is the number of nodes under the tree with N as root (including the root) and $N_l$ is the left sub-tree of N.
Essentially, this means that the children in an AVL tree are not necessarily evenly distributed across the lowest level of the tree. Taking N as indicating the root node of the tree, one could construct a valid AVL tree that has more children to the left of the root than to the right of it. With a very deep tree, there could be many nodes at this bottom level.
The definition of an AVL tree would require that they all be within one of the deepest point, but makes no guarantee as to what node they are a child of with respect to a node N.
sibling nodes can have hugely differing numbers of descendants.
I was just scratching my head about this and the fact that my AVL implementation produced trees that were not ultimately lopsided, but which had smaller and larger "distant cousin" subtrees inside.
I sketched this out to reassure myself:
The red nodes have a balance of 1, the green ones -1, and the black ones 0. This is a valid AVL tree in that the height difference between two sibling subtrees is never more than one, but there are (almost) twice as many nodes in the right subtree as the left one.
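A more extreme example than the sketch can be generated programmatically. The code below is an illustration under its own assumptions, not the tree from the drawing: the left subtree is the sparsest possible AVL tree of a given height (sometimes called a Fibonacci tree), the right subtree is a perfect tree of the same height, and the names `minimal_avl`, `perfect`, `size` and `avl_height` are made up. Trees are `(left, right)` tuples without keys, since only the shape matters.

```python
def minimal_avl(h):
    """Sparsest AVL tree of height h (in nodes): subtrees of height h-1 and h-2."""
    if h == 0:
        return None
    if h == 1:
        return (None, None)
    return (minimal_avl(h - 1), minimal_avl(h - 2))


def perfect(h):
    """Perfect binary tree of height h: every level is full."""
    if h == 0:
        return None
    sub = perfect(h - 1)
    return (sub, sub)  # sharing is safe because tuples are immutable


def size(tree):
    return 0 if tree is None else 1 + size(tree[0]) + size(tree[1])


def avl_height(tree):
    """Return the height if every node satisfies the AVL condition, else -1."""
    if tree is None:
        return 0
    lh, rh = avl_height(tree[0]), avl_height(tree[1])
    if lh < 0 or rh < 0 or abs(lh - rh) > 1:
        return -1
    return 1 + max(lh, rh)


h = 10
tree = (minimal_avl(h), perfect(h))
left_size, right_size = size(tree[0]), size(tree[1])
```

Both subtrees of the root have height 10, so the AVL condition holds at every node, yet the left side has 143 nodes and the right side 1023. The ratio keeps growing with the height, which is why AVL trees are not weight-balanced in general.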