How to balance a binary tree in C

Here is a simple binary tree in C, but it does not seem to be balanced. How can I make it balanced?
Code:
/**
 * binary_tree impl
 */
#include <stdio.h>
#include <stdlib.h>
typedef struct _tnode _tnode;
typedef struct _bin_tree _bin_tree;
struct _tnode {
    int data;
    _tnode *parent;
    _tnode *left;
    _tnode *right;
};
/* Allocate a node with no parent and no children. */
_tnode *new_node(int data) {
    _tnode *node = malloc(sizeof(_tnode));
    if (node == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    node->data = data;
    node->parent = NULL;
    node->left = NULL;
    node->right = NULL;
    return node;
}
/*
 * Plain (unbalanced) insert. With cmp_int, values >= top->data go to the
 * left, so the in-order print below comes out in descending order.
 */
_tnode *add(_tnode *top, int new_data, int (*cmpf)(int, int)) {
    if (top == NULL) {
        top = new_node(new_data);
    } else if (cmpf(top->data, new_data) <= 0) {
        if (top->left == NULL) {
            top->left = new_node(new_data);
            top->left->parent = top;
        } else {
            add(top->left, new_data, cmpf);
        }
    } else {
        if (top->right == NULL) {
            top->right = new_node(new_data);
            top->right->parent = top;
        } else {
            add(top->right, new_data, cmpf);
        }
    }
    return top;
}
int cmp_int(int n1, int n2) {
    return n1 - n2;
}
/* In-order traversal; safe to call on an empty tree. */
void print_tree(_tnode *top) {
    if (top == NULL)
        return;
    print_tree(top->left);
    printf("%d\n", top->data);
    print_tree(top->right);
}
int main(int argc, char *argv[]) {
    int i = 0;
    _tnode *top = NULL;
    int arr[] = {6, 1, 9, 3, 5, 0, 2, 7};
    int count = sizeof(arr) / sizeof(arr[0]);
    for (i = 0; i < count; i++) {
        top = add(top, arr[i], cmp_int);
        printf("add: %d\n", arr[i]);
    }
    print_tree(top);
    return 0;
}

The basic idea is as follows.
For insertions, you first insert your new node at a leaf exactly as you would for a non-balanced tree.
Then you work your way up the tree towards the root, making sure that, for each node, the difference in height between the left and right sub-trees is never more than one.
If it is, you "rotate" nodes so that the difference is one or less. For example, consider the following tree, which was balanced before you added 32 but now isn't:
      128
      /
    64
    /
  32
The depth differential at the 32 node is zero as both sub-trees have a depth of zero.
The depth differential at the 64 node is one as the left sub-tree has a depth of one and the right sub-tree has a depth of zero.
The depth differential at the 128 node is two as the left sub-tree has a depth of two and the right sub-tree has a depth of zero. So a rotation through that node needs to occur. This can be done by pushing the 128 down to the right sub-tree and bringing up the 64:
    64
   /  \
  32  128
and you once again have balance.
The direction of rotation depends, of course, on whether the height is too much on the left or the right.
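The steps above can be sketched in C roughly as follows. This is a minimal illustration, not the asker's code: the `avl_node` type, the stored `height` field and all function names are assumptions made for the example. It also handles the "zig-zag" case (a double rotation), which the description above does not go into.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type for this sketch; each node stores its own height. */
typedef struct avl_node {
    int key;
    int height;                      /* height of this subtree; a leaf has 1 */
    struct avl_node *left, *right;
} avl_node;

static int height(const avl_node *n) { return n ? n->height : 0; }

static void update_height(avl_node *n) {
    int hl = height(n->left), hr = height(n->right);
    n->height = 1 + (hl > hr ? hl : hr);
}

/* Rotate right: the left child moves up, n is pushed down to the right. */
static avl_node *rotate_right(avl_node *n) {
    avl_node *l = n->left;
    assert(l != NULL);
    n->left = l->right;
    l->right = n;
    update_height(n);
    update_height(l);
    return l;
}

/* Rotate left: mirror image of rotate_right. */
static avl_node *rotate_left(avl_node *n) {
    avl_node *r = n->right;
    assert(r != NULL);
    n->right = r->left;
    r->left = n;
    update_height(n);
    update_height(r);
    return r;
}

/* Restore "height difference at most one" at n; returns the new subtree root. */
static avl_node *rebalance(avl_node *n) {
    update_height(n);
    int diff = height(n->left) - height(n->right);
    if (diff > 1) {                              /* too tall on the left */
        if (height(n->left->left) < height(n->left->right))
            n->left = rotate_left(n->left);      /* left-right (zig-zag) case */
        return rotate_right(n);
    }
    if (diff < -1) {                             /* too tall on the right */
        if (height(n->right->right) < height(n->right->left))
            n->right = rotate_right(n->right);   /* right-left (zig-zag) case */
        return rotate_left(n);
    }
    return n;
}

/* Insert at a leaf as in a plain BST, then rebalance on the way back up. */
avl_node *avl_insert(avl_node *n, int key) {
    if (n == NULL) {
        n = malloc(sizeof *n);
        if (n == NULL) abort();
        n->key = key;
        n->height = 1;
        n->left = n->right = NULL;
        return n;
    }
    if (key < n->key)
        n->left = avl_insert(n->left, key);
    else
        n->right = avl_insert(n->right, key);
    return rebalance(n);
}
```

Inserting 128, 64 and 32 in that order with `root = avl_insert(root, key)` ends with 64 at the root, matching the rotation shown above. Because the recursion unwinds from the new leaf back to the root, no parent pointers are needed in this variant.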
Deletion is a little more tricky since you're not necessarily working at a leaf node like you are with insertion. It gets a little complex there since it depends on whether the node has no children (is a leaf), one child, or two children.
1/ For a leaf node, you can just delete it, then start re-balancing at the parent of that leaf.
2/ For a one-child node, you can just copy the child information (data and links) to replace the one you want to delete, then delete the child and start re-balancing where the child information now is.
3/ For a two-child node, the idea is to find its immediate successor (by first going to the right child then continuously going to the left child until there are no more left children). You could also find its immediate predecessor (left then continuously right) which works just as well.
Then you swap the data in the node you want to delete with the data in the successor (or predecessor), then re-apply this rule until the node you want to delete is a leaf node. Then just delete that leaf node, using the exact same rules as per (1) above.
This swapping trick is perfectly valid since, even though the swap puts two adjacent items temporarily out of sequence, the fact that you're deleting one of them (2 in this case) auto-magically fixes things up:
    2          3          3
   / \   -->  / \   -->  /
  1   3      1   2      1
=====      =====      =====
1,2,3      1,3,2      1,3

Related

Splitting a binary search tree in half in O(h) time complexity

I am practicing binary search trees and I have to solve this problem:
A tree struct is given as
struct tree{
int key;
int lcnt;
struct tree *lc;
struct tree *rc;
};
where lcnt is an integer holding the number of nodes in the left subtree of each node. The problem is to split the tree in half, updating lcnt with the correct value at every step. The split algorithm must take O(h) time, where h is the tree's height. I found the solution below and it works for most trees. But now consider this tree:
    170
    /
  45
    \
     30
the result will be: tree1: 170, tree2: 45.
I have no idea how to fix it, because if I try something like "don't split if the node is a leaf" then I have problems with other trees. The split function takes the parameter root, which is the root of the primary tree, and an integer, which is the tree's length/2, and it returns the two new trees: one as the return value and the other by reference through a double-pointer parameter. I am also using the updt function and some calculations to update lcnt at every split.
the code is here:
struct tree* split(struct tree *root, struct tree **new_tree, int collect){
struct tree *new_root_1=NULL, *new_root_2=NULL, *link1=NULL, *link2=NULL;
struct tree *current=root, *prev=NULL, *temp=NULL;
if(!root)
return NULL; //empty tree
int collected=0, created_root1=0, created_root2=0;
int decrease;
while(current!=NULL && collected<collect){
if(collected+current->lcnt+1<=collect){
// there is space for the left subtree so take it all and move to the right
collected=collected+current->lcnt+1; //update the number of the collected nodes
if(!created_root1){
//create the root for the one tree
created_root1=1;
new_root_1=current;
link1=current;
}else{
link1->rc=current;
link1=current;
}
if(!created_root2 && collect==collected)
//in case the tree must be splited in half
new_root_2=current->rc;
prev=current;
current=current->rc;
//break the node link
prev->rc=NULL;
}else{
// there is no space for the left subtree so traverse it until it becomes small enough
if(!created_root2){
//create the root for the second tree
created_root2=1;
new_root_2=current;
link2=current;
}else{
link2->lc=current;
// at every link at left the count_total_tasks will help to update the lcnt of the parent node
temp=new_root_2;
while(temp!=NULL){
temp->lcnt=count_total_tasks(temp->lc);
temp=temp->lc;
}
link2=current;
}
prev=current;
current=current->lc;
//break the node link
prev->lc=NULL;
//update the lcnt
decrease=prev->lcnt;
updt(new_root_2, decrease);
}
}
*new_tree=new_root_2;
return new_root_1;
}
And this is the updt function:
void updt(struct tree* root, int decrease){
struct tree *temp;
temp=root;
while(temp!=NULL){
temp->lcnt=temp->lcnt-decrease;
temp=temp->lc;
}
}
Your test case,
    170
    /
  45
    \
     30
is not a valid binary search tree: 30 sits in the right subtree of 45, but a right subtree may only hold keys larger than its parent.
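One way to catch such inputs before running split is to check the ordering property up front. The sketch below reuses the `struct tree` from the question; the function names `is_bst` and `is_bst_in_range` are made up for this example, and it assumes keys are unique (strict inequalities).

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

struct tree {
    int key;
    int lcnt;
    struct tree *lc;
    struct tree *rc;
};

/* Returns 1 if every key in the subtree lies strictly between lo and hi. */
static int is_bst_in_range(const struct tree *t, long long lo, long long hi) {
    assert(lo < hi);
    if (t == NULL)
        return 1;
    if (t->key <= lo || t->key >= hi)
        return 0;
    /* Left keys must stay below t->key, right keys above it. */
    return is_bst_in_range(t->lc, lo, t->key) &&
           is_bst_in_range(t->rc, t->key, hi);
}

/* Returns 1 if the whole tree satisfies the binary search tree ordering. */
int is_bst(const struct tree *root) {
    /* Bounds one step outside the int range so every int key is allowed. */
    return is_bst_in_range(root, (long long)INT_MIN - 1, (long long)INT_MAX + 1);
}
```

For the tree above, the check fails at 30, which is reached through the right link of 45 and therefore has to be greater than 45.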

Huffman Tree Coding

How can I insert data into a Huffman tree in C?
huffman_tree *huffman_node_create(char data, unsigned int frequency)
{
    huffman_tree *node = malloc(sizeof(huffman_tree));
    if (node == NULL)
        return NULL;
    node->c = data;          /* field names follow the struct shown below */
    node->freq = frequency;
    node->left = NULL;
    node->right = NULL;
    return node;
}
I wrote this to create a Huffman tree node, but I do not know how to add the frequencies to the tree. How can I know whether a number should go to the right or to the left?
The struct is:
typedef struct huffman_tree{
char c;
int freq; //unsigned?
struct huffman_tree *left; // 0 (left)
struct huffman_tree *right; // 1 (right)
}huffman_tree;
It doesn't matter if they're right or left, actually.
The way to construct a Huffman tree is to keep selecting the two lowest frequencies and combining them into a tree node so that one becomes the left child, the other the right, and the combined frequency is the sum of the two frequencies. This combined node is put in the table, replacing its two children. The tree gets built gradually as previously combined nodes get paired with others. This process continues until all frequencies have been combined into a single tree.
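A minimal sketch of that construction, using the `huffman_tree` struct from the question. The helper names (`make_node`, `take_lowest`, `huffman_build`) are invented for this example, and the "table" is a plain array scanned linearly, which is simple but O(n^2); a priority queue would be the usual choice for larger alphabets.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct huffman_tree {
    char c;
    int freq;
    struct huffman_tree *left;   /* 0 (left) */
    struct huffman_tree *right;  /* 1 (right) */
} huffman_tree;

static huffman_tree *make_node(char c, int freq, huffman_tree *l, huffman_tree *r) {
    huffman_tree *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->c = c;
    n->freq = freq;
    n->left = l;
    n->right = r;
    return n;
}

/* Remove and return the lowest-frequency node from table[0 .. *count-1]. */
static huffman_tree *take_lowest(huffman_tree **table, int *count) {
    assert(*count > 0);
    int best = 0;
    for (int i = 1; i < *count; i++)
        if (table[i]->freq < table[best]->freq)
            best = i;
    huffman_tree *n = table[best];
    table[best] = table[--*count];   /* fill the hole with the last entry */
    return n;
}

/* Builds the tree for n symbols; returns NULL when n <= 0. */
huffman_tree *huffman_build(const char *symbols, const int *freqs, int n) {
    if (n <= 0)
        return NULL;
    huffman_tree **table = malloc((size_t)n * sizeof *table);
    if (table == NULL) abort();
    int count = n;
    for (int i = 0; i < n; i++)
        table[i] = make_node(symbols[i], freqs[i], NULL, NULL);
    /* Keep combining the two lowest frequencies until one tree remains. */
    while (count > 1) {
        huffman_tree *a = take_lowest(table, &count);
        huffman_tree *b = take_lowest(table, &count);
        table[count++] = make_node('\0', a->freq + b->freq, a, b);
    }
    huffman_tree *root = table[0];
    free(table);
    return root;
}
```

Which of the two lowest nodes becomes the left child is arbitrary here (the first one taken goes left); swapping them changes the bit patterns but not the code lengths.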
Maybe take a look at this? https://www.geeksforgeeks.org/huffman-coding-greedy-algo-3/
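In a binary search tree, numbers go to the left if they are smaller and to the right if they are larger than the parent node. A Huffman tree is not ordered that way; only the frequencies decide which nodes get combined.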

How to find the deepest UNIQUE node of a binary tree in C

I am reading commands from a text file. A sample input is:
Create Key 2
Create Key 1
Create Key 3
Update Key 1
Delete Key 2
I want to reduce the number of operations my program executes. For example, it is useless to create Key 2 only to delete it afterwards.
In order to minimize the number of operations I decided to store these in a binary search tree. In the book "Introduction to algorithms", third edition, by Cormen, Leiserson, Rivest and Stein, a binary search tree (BST) is explicitly defined as allowing duplicates. The letter after the key stands for either Create, Update or Delete. A simple example would be as follows:
         K2(C)
        /     \
       /       \
   K1(C)       K3(C)    <-- The deepest Key3 appears here.
      \         /
     K1(U)   K2(D)      <-- The deepest Key1 and Key2 appear here.
As pointed out I would like to be able to extract all the unique keys in their deepest position, to minimize the number of operations. I could not find any reference to this in CLRS, maybe I was looking for the wrong thing..
A simple search that returns a key does not suffice, since it returns the first node found; a plain breadth-first or depth-first search would therefore not work.
struct node* search(struct node* root, int key)
{
    // Base Cases: root is null or key is present at root
    if (root == NULL || root->key == key)
        return root;
    // Key is greater than root's key
    if (root->key < key)
        return search(root->right, key);
    // Key is smaller than root's key
    return search(root->left, key);
}
The question "How to handle duplicates in Binary Search Tree?" describes how to handle inserting duplicates, not how to extract the duplicates that appear last.
Another idea would be to return the right most unique key as follows:
struct node * maxValueNode(struct node* node)
{
struct node* current = node;
/* loop down to find the rightmost leaf */
while (current->right != NULL)
current = current->right;
return current;
}
Am I missing something here? How can I find the deepest UNIQUE node of a binary tree?
I don't get why you would need a BST for that but anyway, you could make a search that does not stop at first occurrence and keeps track of the deepest node found using pointers. This should do the trick :
void deepest_search(struct node * root, int key, int currentDepth, int * maxDepth, struct node * deepestNode)
{
// Do nothing if root is null
if (root != NULL)
{
// Update deepest node if needed
if(root->key == key && currentDepth > *maxDepth)
{
*maxDepth = currentDepth;
*deepestNode = *root;
}
// Might need to search both sides because of duplicates
// Can make this an if/else if duplicates are always in left/right subtree
if(root->key <= key)
deepest_search(root->right, key, currentDepth + 1, maxDepth, deepestNode);
if(root->key >= key)
deepest_search(root->left, key, currentDepth + 1, maxDepth, deepestNode);
}
}
I tested it on your (small) example and it seems to work fine:
struct node
{
int key;
int val;
struct node *left, *right;
};
int main(void)
{
    int key = 1;
    int currentDepth = 1;
    struct node n5 = {2, 5, NULL, NULL};
    struct node n4 = {1, 4, NULL, NULL};
    struct node n3 = {3, 3, &n5, NULL};
    struct node n2 = {1, 2, NULL, &n4};
    struct node n1 = {2, 1, &n2, &n3};
    /* deepest_search copies the deepest matching node into this buffer */
    struct node * deepestNode = (struct node *) malloc(sizeof(struct node));
    int maxDepth = 0;
    if (deepestNode == NULL)
        return 1;
    deepest_search(&n1, key, currentDepth, &maxDepth, deepestNode);
    printf("%d\n", maxDepth);
    printf("%d\n", deepestNode->val);
    free(deepestNode);
    return 0;
}
If you are sure about handling duplicate values then the article you mentioned gives a great idea how to handle them in BSTs. Assuming you implement one of those two inserting methods let's see how you can implement deleting a node in any of them.
1. Pushing duplicates to the right or left subtree (ugly solution)
If you choose this solution, then when you find a node with the given value (let's call it X) there is no guarantee that it does not appear somewhere else further down the tree. You must keep searching for the value in one of the subtrees. There is more to it: you have to propagate the depth of every node with value X and choose the deepest one. That requires some coding, which is why I think the second solution is far better.
2. Counting values (better)
According to this method, every node holds a counter that tells how many times the given value has occurred. If you want to delete one instance of this value and the counter is > 1, you just decrement the counter. Otherwise, if the counter == 1, you delete the node as you would in a regular BST.
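A rough sketch of the counting approach, for illustration only. The `cnode` type and the function names are assumptions made for this example; they are not taken from the question or from the linked article.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical node type: one node per distinct key plus an occurrence count. */
typedef struct cnode {
    int key;
    int count;
    struct cnode *left, *right;
} cnode;

/* Insert one occurrence of key; duplicates only bump the counter. */
cnode *counted_insert(cnode *root, int key) {
    if (root == NULL) {
        root = malloc(sizeof *root);
        if (root == NULL) abort();
        root->key = key;
        root->count = 1;
        root->left = root->right = NULL;
    } else if (key < root->key) {
        root->left = counted_insert(root->left, key);
    } else if (key > root->key) {
        root->right = counted_insert(root->right, key);
    } else {
        root->count++;
    }
    return root;
}

/* Remove one occurrence of key; the node is unlinked only when its count hits zero. */
cnode *counted_remove(cnode *root, int key) {
    if (root == NULL)
        return NULL;                      /* key not present */
    if (key < root->key) {
        root->left = counted_remove(root->left, key);
    } else if (key > root->key) {
        root->right = counted_remove(root->right, key);
    } else {
        assert(root->count >= 1);
        if (root->count > 1) {
            root->count--;                /* still has other occurrences */
        } else if (root->left == NULL || root->right == NULL) {
            /* Regular BST deletion, zero or one child. */
            cnode *child = root->left ? root->left : root->right;
            free(root);
            return child;
        } else {
            /* Two children: move the in-order successor's data up, then unlink it. */
            cnode *succ = root->right;
            while (succ->left != NULL)
                succ = succ->left;
            root->key = succ->key;
            root->count = succ->count;
            succ->count = 1;              /* so the call below really removes it */
            root->right = counted_remove(root->right, root->key);
        }
    }
    return root;
}
```

Both functions return the (possibly new) subtree root, so callers write `root = counted_remove(root, key)`.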

Iterating over AVL tree in O(1) space without recursion

I have an AVL Tree. Each node looks like this:
typedef struct Node Node;
struct Node {
    Node *parent; // the parent node
    Node *left; // the node left of this node
    Node *right; // the node right of this node
    int height; // the height of this node
    void *value; // payload
};
Is it possible to iterate over an AVL tree with these nodes in O(1) space, without recursion? If yes, how?
If not, a solution with sub-O(n) space or, if really necessary, O(n) space is appreciated as well.
By iterating over the tree I mean that I want to visit each node once, and if possible in order (from the leftmost to the rightmost node).
If you store the last node you have visited, you can derive the next node to visit in an iterator.
If the last node was your parent, go down the left subtree.
If the last node was your left subtree, go down the right subtree.
If the last node was your right subtree, go to your parent.
This algorithm gives you a traversal of the tree in O(1) space. You need to flesh it out a little for the leaves, and decide what kind of iterator (pre-, in- or post-order) you want in order to decide where the iterator should stop and wait for the next increment.
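Fleshed out for an in-order walk, the three rules might look like the sketch below. It uses the `Node` layout from the question; `inorder_visit`, the callback signature and the `NodeList`/`record_node` helper are assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

typedef struct Node Node;
struct Node {
    Node *parent;
    Node *left;
    Node *right;
    int height;
    void *value;
};

/* Visits every node in order, keeping only two pointers of extra state. */
void inorder_visit(Node *root, void (*visit)(Node *, void *), void *ctx) {
    assert(visit != NULL);
    Node *prev = NULL;   /* the node we were at in the previous step */
    Node *cur = root;
    while (cur != NULL) {
        Node *next;
        if (prev == cur->parent) {
            /* Arrived from the parent: go down the left subtree first. */
            if (cur->left != NULL) {
                next = cur->left;
            } else {
                visit(cur, ctx);
                next = cur->right != NULL ? cur->right : cur->parent;
            }
        } else if (prev == cur->left) {
            /* Left subtree is done: visit, then go right or back up. */
            visit(cur, ctx);
            next = cur->right != NULL ? cur->right : cur->parent;
        } else {
            /* Right subtree is done: go back to the parent. */
            next = cur->parent;
        }
        prev = cur;
        cur = next;
    }
}

/* Example callback: appends each visited node to a caller-supplied array. */
typedef struct {
    Node **items;
    size_t len;
} NodeList;

static void record_node(Node *n, void *ctx) {
    NodeList *list = ctx;
    list->items[list->len++] = n;
}
```

Each edge is walked once downwards and once upwards, so the whole traversal takes O(n) time. Moving the `visit` call to the first or last branch would turn this into a pre-order or post-order walk.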
It is possible to get the next in-order node given a pointer to some node, as long as you keep parent pointers. This can be used to iterate the tree, starting with the leftmost node. From my implementation of AVL tree:
BAVLNode * BAVL_GetNext (const BAVL *o, BAVLNode *n)
{
if (n->link[1]) {
n = n->link[1];
while (n->link[0]) {
n = n->link[0];
}
} else {
while (n->parent && n == n->parent->link[1]) {
n = n->parent;
}
n = n->parent;
}
return n;
}
To get the leftmost node:
BAVLNode * BAVL_GetFirst (const BAVL *o)
{
if (!o->root) {
return NULL;
}
BAVLNode *n = o->root;
while (n->link[0]) {
n = n->link[0];
}
return n;
}
Here, node->link[0] and node->link[1] are the left and right child of the node, respectively, and node->parent is the pointer to the parent node (or NULL for root).
A single GetNext() operation has O(log n) worst-case time complexity. However, when used to iterate over the entire tree, the total time is O(n), i.e. amortized O(1) per step.
"Data Structures and Their Algorithms" by Harry Lewis and Larry Denenberg describes link inversion traversal for constant-space traversal of a binary tree. For this you do not need a parent pointer at each node. The traversal uses the existing pointers in the tree to store the path for backtracking. Two or three additional node references are needed, plus a bit on each node to keep track of the traversal direction (up or down) as we move down. In my implementation of this algorithm from the book, profiling shows that this traversal uses far less memory and processor time. An implementation in Java (C would be faster, I guess) is here.

Single Linked List - Delete From middle

I am trying to figure out an algorithm to delete from the middle of a linked list.
My idea is to traverse the list, find the node right before the node I want to delete, call it Nprev, and point Nprev at Nnext, where Nnext is the node after the node to delete, Ndelete.
So Nprev -> Ndelete -> Nnext.
My problem is that I cannot figure out how to traverse this list to find the node before the one I want.
I've been getting seg faults, I assume because I assign pointers out of range.
It's a very messy algorithm that I have, with many if/else statements.
Is there an easier way to do this?
Basically I need to go through the list and apply a function to each node to test whether it is true or false. If false, I delete the node.
Deleting the first and last node is not as hard, but the middle stumped me.
Please let me know if there are some general ways to solve this problem. I've been scouring the internet and found nothing I need.
I used this: http://www.cs.bu.edu/teaching/c/linked-list/delete/ but the algorithm before step 4 only deletes the first node in my list and doesn't do any more.
How can I modify this?
They also give a recursive example, but I don't understand it and am intimidated by it.
First you need to find the middle node.
Take three pointers, fast, slow and prev, with fast moving at twice the speed of slow and prev storing the address of the node before slow, i.e.
slow = head, fast = head, prev = NULL
Traverse the list. When fast reaches the end (fast == NULL or fast->next == NULL), slow points to the middle node if the number of elements is odd, and prev holds the address of the node before it.
Then simply set
prev->next = slow->next
and free the node that slow points to.
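Written out in C, the slow/fast idea could look like this sketch. The `list_node` type and both function names are hypothetical; for an even number of elements this version removes the second of the two middle nodes.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list node for this sketch. */
typedef struct list_node {
    int value;
    struct list_node *next;
} list_node;

/* Helper for building test lists: puts a new node in front of head. */
list_node *list_prepend(list_node *head, int value) {
    list_node *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->value = value;
    n->next = head;
    return n;
}

/* Unlinks and frees the middle node; returns the (possibly new) head. */
list_node *delete_middle(list_node *head) {
    if (head == NULL)
        return NULL;
    list_node *slow = head, *fast = head, *prev = NULL;
    /* fast advances two nodes per step, slow one; prev trails slow. */
    while (fast != NULL && fast->next != NULL) {
        fast = fast->next->next;
        prev = slow;
        slow = slow->next;
    }
    assert(slow != NULL);
    if (prev == NULL)
        head = slow->next;        /* single-node list: the head itself goes */
    else
        prev->next = slow->next;  /* bypass the middle node */
    free(slow);
    return head;
}
```

Checking `fast->next` before the double step is what avoids the out-of-range pointer accesses that typically cause the segfaults mentioned in the question.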
Here is an example of something I use to search and remove by index.
Given this struct (it can also be adapted to other self-referencing structs):
struct node
{
S s;
int num;
char string[10];
struct node *ptr;
};
typedef struct node NODE;
Use this to remove an item from somewhere in the "middle" of the list (by index)
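int remove_by_index(NODE **head, int n)
{
    int retval = -1;
    NODE * current = *head;
    NODE * temp_node = NULL;
    if (current == NULL || n < 0) {
        return -1; /* empty list or invalid index */
    }
    if (n == 0) {
        return pop(head); /* pop() removes the first node; it is not shown here */
    }
    /* Walk to the node just before index n. */
    for (int i = 0; i < n-1; i++) {
        if (current->ptr == NULL) {
            return -1;
        }
        current = current->ptr;
    }
    temp_node = current->ptr;
    if (temp_node == NULL) {
        return -1; /* index n is one past the end */
    }
    retval = temp_node->num;
    current->ptr = temp_node->ptr;
    free(temp_node);
    return retval;
}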
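For the case described in the question (test every node with a function and delete the ones that fail), a pointer-to-pointer walk avoids special cases for the first, middle and last node. This is only a sketch: the `NODE` struct is cut down to the `num` and `ptr` fields, and `remove_failing`, `is_even` and `push_front` are names invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified version of the struct above, reduced to the fields used here. */
typedef struct node {
    int num;
    struct node *ptr;
} NODE;

/* Removes every node for which keep(num) returns 0; returns how many were removed. */
int remove_failing(NODE **head, int (*keep)(int)) {
    assert(head != NULL && keep != NULL);
    int removed = 0;
    NODE **link = head;          /* the pointer that currently leads to the node */
    while (*link != NULL) {
        NODE *cur = *link;
        if (keep(cur->num)) {
            link = &cur->ptr;    /* keep it, move on to its ptr field */
        } else {
            *link = cur->ptr;    /* bypass it, whether it is first, middle or last */
            free(cur);
            removed++;
        }
    }
    return removed;
}

/* Example predicate: keep even numbers only. */
int is_even(int n) { return n % 2 == 0; }

/* Helper for building test lists. */
NODE *push_front(NODE *head, int num) {
    NODE *n = malloc(sizeof *n);
    if (n == NULL) abort();
    n->num = num;
    n->ptr = head;
    return n;
}
```

Because `link` always points at whichever pointer leads to the current node (either `*head` or some node's `ptr`), there is no need to track a separate previous node.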

Resources