B+ Tree leaf node size - c

In this picture, the node size n is said to be 4 (n = 4), but I can't tell which part is a node and which part is a pointer.
I think the section containing Brandt, Califieri and Crick is a node, but that section only holds 3 values, and I think the parts between the values are pointers.
So I would really like to know what 'n = 4' means exactly, how the value of n is chosen, and what this picture means.
Thanks.

Going by the image, I think a node is a row of the table, which has 4 columns. What the image calls a leaf node is a different structure, whose pointers can point to a node (row) of the given database.
The leaf node is probably an array of 3 entries, each of which has 2 fields: one for a pointer and one for a string like "Brandt".
Or it may be a node with 7 fields, three of which are data fields and four of which are pointers to a node (row) of the database.
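The second reading (three data fields plus four pointers) can be sketched as a small data structure. This is only an illustration, assuming the common textbook convention that n is the maximum number of pointers per node, so a leaf holds at most n - 1 keys; the picture itself is not reproduced here, and the names `LeafNode`, `record_ptrs`, `next_leaf` and the row ids are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional

N = 4  # assumed meaning of n: maximum number of pointers per node


@dataclass
class LeafNode:
    # At most N - 1 search-key values, kept in sorted order.
    keys: List[str] = field(default_factory=list)
    # One pointer per key; here a pointer is just a hypothetical row id.
    record_ptrs: List[int] = field(default_factory=list)
    # The N-th pointer of a leaf links to the next leaf in key order.
    next_leaf: Optional["LeafNode"] = None

    def is_full(self) -> bool:
        return len(self.keys) == N - 1

    def field_count(self) -> int:
        # keys + record pointers + the single next-leaf pointer
        return len(self.keys) + len(self.record_ptrs) + 1


leaf = LeafNode(keys=["Brandt", "Califieri", "Crick"], record_ptrs=[0, 1, 2])
print(leaf.is_full(), leaf.field_count())
```

Under that assumption a full leaf has 3 keys and 4 pointers, which is why the section in the picture shows only 3 names even though n = 4.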


proof or disproof adding two minimal values to a B tree and then delete them

I came across a question and I'm not sure about the right answer:
We insert two new minimal values w and z, with w > z, into a B-tree:
first we insert w and then z. Right afterwards we delete them in the same order. Does the original B-tree structure stay the same, or do we get a different arrangement of the tree?
It is not guaranteed that the B-tree remains the same. It would be guaranteed if the deletions happened in the opposite order as the insertions, but if the order is:
Insert w
Insert z
Delete w
Delete z
...then it depends on implementation choices, notably how the deletion of a value that occurs in a non-leaf node is dealt with.
Here is a counter example 2-3 tree, i.e. a B-tree of order 3:
      [5 , -]
      /     |
  [4,-]   [6,7]
So we have a root with (separator) value 5 and an empty slot. There are two leaves: the left leaf is half full, with value 4, while the right leaf is completely occupied with values 6 and 7.
Now let w=2 and z=1.
After we insert 2, we get this tree -- nothing special happens:
      [5 , -]
      /     |
  [2,4]   [6,7]
Then, to insert 1, we must split the leftmost leaf and move 2 up to the parent node as a separator value:
         [2 , 5]
        /   |   \
  [1,-]  [4,-]  [6,7]
Now we get to the critical part: the deletion of 2 gives us a choice. Wikipedia describes that choice as follows:
Choose a new separator (either the largest element in the left subtree or the smallest element in the right subtree), remove it from the leaf node it is in, and replace the element to be deleted with the new separator.
If we choose the second option, then that means we choose 4 as new separator value to replace the value 2. This gives us the following intermediate situation:
         [4 , 5]
        /   |   \
  [1,-]  [-,-]  [6,7]
The empty leaf in the middle is underflowing, so we try to rotate. We must perform a rotation with the right-sided neighbor, as the other one does not have enough values, and so we move the 6 up, and the 5 down:
         [4 , 6]
        /   |   \
  [1,-]  [5,-]  [7,-]
...and the tree is valid again. But it is not the original tree.
So this single counterexample is enough to show that the claim is false.
If, however, we were told that the algorithm always takes the first alternative when deleting an internal value, then the claim seems to be true.

How many records can a maximally full B+ Tree with d = 3 and height = 3 hold?

How many records can a maximally full B+ Tree with d = 3 and height = 3 hold?
As I see it, it's either 7^3 * 6 (only one node in the root) or 7^4 + 6 (all nodes in the root are full).
However, the autograder says this is not correct and doesn't give me the correct answer, so I wonder whether I have misunderstood something.
I would also like to know how to calculate this.
In my opinion, this is 7^3 * 6:
(2d + 1)^height * 2d
A B+ tree stores its data in the leaf nodes, so for a B+ tree of height 3 the total number of records (which are stored at the leaf level) should be (2d + 1)^H * 2d with H = height - 1, i.e. (2*3 + 1)^2 * 6 = 7^2 * 6 = 294.
The height H used in the formula should exclude the leaf level.
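The two answers use the same formula and differ only in how many levels of internal nodes they count. A small sketch makes the difference explicit, assuming (as both answers do) that every node holds at most 2d keys and 2d + 1 child pointers; the function name `max_records` and its parameters are made up for illustration.

```python
def max_records(d: int, internal_levels: int) -> int:
    """Records in a maximally full B+ tree with the given number of
    internal (non-leaf) levels above the leaf level."""
    fanout = 2 * d + 1        # child pointers per full internal node
    keys_per_leaf = 2 * d     # records per full leaf
    leaves = fanout ** internal_levels
    return leaves * keys_per_leaf


# Height 3 counted without the leaf level: 3 internal levels.
print(max_records(3, 3))  # 7^3 * 6 = 2058
# Height 3 counted including the leaf level: 2 internal levels.
print(max_records(3, 2))  # 7^2 * 6 = 294
```

Which of the two values the autograder expects depends on the height convention used in the course material.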

How to find the maximum sum of nodes in a tree

I am given two arrays, one defines the relationship of nodes and other gives the values of nodes.
arr1={0,1,1,1,3,3,4}
arr2={22,100,3,3,4,5,9}
arr1 defines the relationship, i.e. the root node is the 1st element, the parent of nodes 2, 3 and 4 is node 1, the parent of nodes 5 and 6 is node 3, and the parent of node 7 is node 4.
arr2 gives the values of the nodes: node 1 has a value of 22 and node 2 has a value of 100.
I have to find the maximum sum of nodes such that no two included nodes have a parent or a grand parent relationship.
sample input:
a[i]=[0,1,1,1,3,3,6,6]
b[i]=[1,2,3,4,5,100,7,8]
output: 111
I am new to DS and ALGO and not able to even think of the solution. Help is needed thanks.
Any type of help will do.
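You can solve it using dynamic programming.
Consider an array dp[] which stores, for each vertex, the answer for the subtree rooted at that vertex.
For each vertex there are two choices. Either the vertex is left out, and its children's subtrees are solved independently, or it is included, which rules out its children and grandchildren, so the next vertices that may contribute are its great-grandchildren. The state of the DP is therefore:
dp[currentVertex] = max(sum of dp[] over all children of currentVertex,
                        b[currentVertex] + sum of dp[] over all vertices
                        whose great-grandparent is currentVertex)
You need to build the DP table bottom-up, so start from the leaves.
The answer is dp[root] after all the calculations.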
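The recurrence can be sketched in Python as follows. This is a minimal illustration rather than a tuned solution: it assumes the 1-based parent array from the question (0 marks the root) and non-negative node values, and the function name `max_independent_sum` is made up.

```python
def max_independent_sum(parent, value):
    """parent[i-1] is the parent of node i (0 for the root);
    value[i-1] is the value of node i."""
    n = len(parent)
    children = [[] for _ in range(n + 1)]
    root = 0
    for node in range(1, n + 1):
        p = parent[node - 1]
        if p == 0:
            root = node
        else:
            children[p].append(node)

    # Breadth-first order from the root; reversed, it visits every
    # vertex after all of its descendants (bottom-up).
    order = [root]
    i = 0
    while i < len(order):
        order.extend(children[order[i]])
        i += 1

    dp = [0] * (n + 1)
    for v in reversed(order):
        # Option 1: skip v, take the best of each child's subtree.
        skip = sum(dp[c] for c in children[v])
        # Option 2: take v, which excludes children and grandchildren.
        take = value[v - 1] + sum(
            dp[gg]
            for c in children[v]
            for g in children[c]
            for gg in children[g]
        )
        dp[v] = max(skip, take)
    return dp[root]


print(max_independent_sum([0, 1, 1, 1, 3, 3, 6, 6],
                          [1, 2, 3, 4, 5, 100, 7, 8]))  # 111
```

For the sample input the best choice is nodes 2, 4, 5 and 6 (2 + 4 + 5 + 100 = 111), which matches the expected output.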

Insert node in Min Heap prioritizing left child

After getting the code for inserting a node in a Min Heap right, I'm confused about what changes I should make if I want to prioritize the left child of a node when rearranging the heap.
The input would be something like:
I 5 //insert number 5 in the Min Heap
I 4
I 3
I 2
I 1
and the output should be:
1 2 3 4 5
instead of the usual:
1 2 4 5 3
Any ideas on how to get to this output? Thanks in advance.
The structure of the heap depends entirely on the order in which you insert items. The reason is that, when inserting, you add the new node to the end of the heap and then sift it up toward the root by comparing it with its parent. The rules are:
1. Add the item to the end of the heap.
2. If the item is greater than or equal to its parent, then done.
3. Swap the item with its parent.
4. Go to 2.
Given those rules, let's walk through what happens when you insert items in the order [5,4,3,2,1].
[5]
[5,4] // the new item is smaller than its parent, so swap
[4,5]
[4,5,3] // the new item is smaller than its parent, so swap
[3,5,4]
[3,5,4,2] // the new item is smaller than its parent, so swap
[3,2,4,5] // still smaller than its parent
[2,3,4,5]
[2,3,4,5,1] // 1 is smaller than 3, so swap
[2,1,4,5,3] // 1 is smaller than 2, so swap
[1,2,4,5,3]
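The same walk-through can be reproduced with a few lines of Python. This is a minimal sketch of the insertion rules for an array-based binary min-heap, assuming the usual layout where the parent of index i is at (i - 1) // 2; the helper name `heap_insert` is made up.

```python
def heap_insert(heap, item):
    """Append item, then sift it up while it is smaller than its parent."""
    heap.append(item)
    i = len(heap) - 1
    while i > 0:
        parent = (i - 1) // 2
        if heap[i] >= heap[parent]:
            break  # heap property holds, done
        heap[i], heap[parent] = heap[parent], heap[i]
        i = parent


heap = []
for x in [5, 4, 3, 2, 1]:
    heap_insert(heap, x)
print(heap)  # [1, 2, 4, 5, 3]
```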
There's no efficient way to "prioritize" a particular subtree, especially in a binary heap. It looks simple enough in a heap with just five items, but every level you add increases the cost of keeping sibling nodes in the proper order. You're better off just sorting the nodes and creating a heap from the resulting array.
Not that a sorted heap helps you much. As soon as you remove the first item, rearranging the heap causes it to no longer be sorted.
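Both points can be checked with Python's standard `heapq` module, used here only as a convenient reference implementation of a binary min-heap: a sorted array is already a valid heap, but one removal is enough to break the sorted order. The helper `is_min_heap` is made up for the check.

```python
import heapq


def is_min_heap(a):
    """True if every element is >= its parent in the array layout."""
    return all(a[(i - 1) // 2] <= a[i] for i in range(1, len(a)))


sorted_heap = sorted([5, 4, 3, 2, 1])
print(sorted_heap, is_min_heap(sorted_heap))  # [1, 2, 3, 4, 5] True

smallest = heapq.heappop(sorted_heap)
print(smallest, sorted_heap)  # 1 [2, 4, 3, 5]
```

After the pop the array is still a valid heap, but it is no longer in sorted order.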

Approximate Order-Preserving Huffman Code

I am working on an assignment for an Algorithms and Data Structures class. I am having trouble understanding the instructions given. I will do my best to explain the problem.
The input I am given is a positive integer n followed by n positive integers which represent the frequency (or weight) for symbols in an ordered character set. The first goal is to construct a tree that gives an approximate order-preserving Huffman code for each character of the ordered character set. We are to accomplish this by "greedily merging the two adjacent trees whose weights have the smallest sum."
In the assignment we are shown that a conventional Huffman code tree is constructed by first inserting the weights into a priority queue. Then, by using a delmin() function to "pop" the root off the priority queue, I can obtain the two nodes with the lowest frequencies and merge them into one node whose left and right children are these two lowest-frequency nodes and whose priority is the sum of the priorities of its children. This merged node is then inserted back into the min-heap. The process is repeated until all input nodes have been merged. I have implemented this using an array of size 2n-1, with the input nodes at indices 0...n-1 and the merged nodes at indices n...2n-2.
I do not understand how I can greedily merge the two adjacent trees whose weights have the smallest sum. My input has basically been organized into a min-heap and from there I must find the two adjacent nodes that have the smallest sum and merge them. By adjacent I assume my professor means that they are next to each other in the input.
Example Input:
9
1
2
3
3
2
1
1
2
3
Then my min-heap would look like so:
        1
      /   \
     2     1
    / \   / \
   2   2 3   1
  / \
 3   3
The two adjacent trees (or nodes) with the smallest sum, then, are the two consecutive 1's that appear near the end of the input. What logic can I apply to start with these nodes? I seem to be missing something but I can't quite grasp it. Please, let me know if you need any more information. I can elaborate myself or provide the entire assignment page if something is unclear.
I think this can be done with a small modification to the conventional algorithm. Instead of storing single trees in your priority queue heap, store pairs of adjacent trees. Then, at each step, you remove the minimum pair (t1, t2) as well as the up to two other pairs that contain those trees, i.e. (u, t1) and (t2, r). Then merge t1 and t2 into a new tree t', re-insert the pairs (u, t') and (t', r) into the heap with updated weights, and repeat.
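To see the greedy rule in action, here is a deliberately simple sketch. Note that it is not the heap-of-pairs version described above: it keeps the trees in a plain list and rescans all adjacent pairs at every step, which is quadratic but easy to follow. The function name `order_preserving_codes` and the tuple representation of trees are made up for illustration.

```python
def order_preserving_codes(weights):
    """Greedily merge the two adjacent trees with the smallest weight sum
    and return one bit string per input symbol, in input order."""
    # A tree is (weight, shape); a leaf's shape is its symbol index,
    # an internal node's shape is a (left, right) pair of shapes.
    trees = [(w, i) for i, w in enumerate(weights)]
    while len(trees) > 1:
        # Index of the adjacent pair with the smallest combined weight
        # (the leftmost such pair on ties).
        k = min(range(len(trees) - 1),
                key=lambda j: trees[j][0] + trees[j + 1][0])
        (wl, left), (wr, right) = trees[k], trees[k + 1]
        trees[k:k + 2] = [(wl + wr, (left, right))]

    codes = {}

    def walk(shape, prefix):
        if isinstance(shape, int):
            codes[shape] = prefix or "0"  # single-symbol edge case
        else:
            walk(shape[0], prefix + "0")
            walk(shape[1], prefix + "1")

    walk(trees[0][1], "")
    return [codes[i] for i in range(len(weights))]


codes = order_preserving_codes([1, 2, 3, 3, 2, 1, 1, 2, 3])
print(codes)
```

Because only neighbours are ever merged, the leaves stay in input order from left to right, so the resulting codes sort in the same order as the symbols. For the example input, the first merge joins the two consecutive 1's, as the question expects.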
You need to pop two trees and make a third tree. Attach the tree with the smaller sum as its left node and the other tree as its right node, then put the new tree on the heap. From your example:
Pop 2 trees from the heap:
1 1
Make a tree:
  ?
 / \
?   ?
Put the smaller tree in the left node:
min(1, 1) = 1
  ?
 / \
1   ?
Put the second tree in the right node:
  ?
 / \
1   1
The tree you made has sum = sum of left node + sum of right node:
  2
 / \
1   1
Put the new tree (sum 2) on the heap.
In the end you will have a single tree, and that is the Huffman tree.
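The pop-two, merge, push-back loop of this answer is the conventional Huffman construction, which ignores adjacency. A minimal sketch with Python's `heapq` module is shown below; the function name `huffman_tree`, the tuple representation of trees and the tie-breaking counter are illustrative choices, not part of the assignment.

```python
import heapq
from itertools import count


def huffman_tree(weights):
    """Return (total_weight, tree) where a leaf is its symbol index and
    an internal node is a (left, right) pair."""
    tie = count()  # unique tie-breaker so tree shapes are never compared
    heap = [(w, next(tie), i) for i, w in enumerate(weights)]
    heapq.heapify(heap)
    while len(heap) > 1:
        w1, _, t1 = heapq.heappop(heap)  # smaller tree goes left
        w2, _, t2 = heapq.heappop(heap)  # second tree goes right
        heapq.heappush(heap, (w1 + w2, next(tie), (t1, t2)))
    total, _, tree = heap[0]
    return total, tree


print(huffman_tree([1, 1]))  # (2, (0, 1))
```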
