Inserting millions of elements into a binary search tree (BST) in C

I have been trying to solve this issue for almost 3 days with no luck. I am trying to insert a large number of unsigned integers (5 million) into a binary tree.
This code works fine when I limit the total number of inserted elements to 10K, but it does not work when I set the total to 5 million.
I am running this code on my PC which has:
Windows 7 (32-bit)
4 GB RAM
Any help would be really appreciated. Thanks in advance :)
Here is my code:
#include<stdlib.h>
#include<stdio.h>
typedef int ElementType;
typedef struct TreeNode {
ElementType element;
struct TreeNode *left, *right;
} TreeNode;
TreeNode *createTree(){
//Create the root of tree
TreeNode *tempNode;
tempNode = malloc(sizeof(TreeNode));
tempNode->element = 0;
tempNode->left = NULL;
tempNode->right = NULL;
return tempNode;
}
TreeNode *createNode(ElementType X){
//Create a new leaf node and return the pointer
TreeNode *tempNode;
tempNode = malloc(sizeof(TreeNode));
tempNode->element = X;
tempNode->left = NULL;
tempNode->right = NULL;
return tempNode;
}
TreeNode *insertElement(TreeNode *node, ElementType X){
//insert element to Tree
if(node==NULL){
return createNode(X);
}
if(X < node->element){
node->left = insertElement(node->left, X);
}
else if(X > node->element){
node->right = insertElement(node->right, X);
}
else{
printf("Oops! the element is already present in the tree.\n");
}
return node; //every path returns the root of this subtree
}
void displayTree(TreeNode *node){
//display the full tree
if(node==NULL){
return;
}
displayTree(node->left);
printf("| %d ", node->element);
displayTree(node->right);
}
int main(void){
//pointer to root of tree #2
TreeNode *TreePtr;
TreeNode *TreeRoot;
TreeNode *TreeChild;
//Create the root of tree
TreePtr = createTree();
TreeRoot = TreePtr;
TreeRoot->element = 32;
for ( int i=0; i < 5000000; i ++)
insertElement(TreeRoot, i);
displayTree(TreeRoot);
}

Assuming there are no other errors in your code, always inserting 8 degenerates the tree into a list, and I think your stack will overflow at a recursion depth far below 5 million.
To avoid a degenerate tree in general, I would advise you to use the insert/delete semantics of an AVL tree.
This has the advantage that your data structures can remain largely as they are (each node only needs an extra height or balance field); you only have to adapt the insert and delete procedures.
Edit: In your comment you now state that you insert i rather than always 8. This means you insert pre-sorted elements into the binary tree, which also degenerates it into a list, so the same problem arises as with always inserting "8".
It looks like:
1
 \
  2
   \
    3
     \
      4
       \
        ...
after inserting the elements in order.
An AVL-Tree will not suffer from that problem:
https://en.wikipedia.org/wiki/AVL_tree
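As a rough illustration of what adapting the insert procedure could look like, here is a minimal sketch of AVL insertion. The names AvlNode, avl_insert, rotate_left and rotate_right are made up for this example and are not taken from the question; duplicates are silently ignored, and deletion is left out.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical node type: the question's TreeNode plus a cached height. */
typedef struct AvlNode {
    int element;
    int height;                 /* height of this subtree; a leaf has height 1 */
    struct AvlNode *left, *right;
} AvlNode;

static int node_height(const AvlNode *n) { return n ? n->height : 0; }

static void update_height(AvlNode *n) {
    int l = node_height(n->left), r = node_height(n->right);
    n->height = 1 + (l > r ? l : r);
}

/* Right rotation: the left child becomes the root of this subtree. */
static AvlNode *rotate_right(AvlNode *y) {
    AvlNode *x = y->left;
    y->left = x->right;
    x->right = y;
    update_height(y);
    update_height(x);
    return x;
}

/* Left rotation: the right child becomes the root of this subtree. */
static AvlNode *rotate_left(AvlNode *x) {
    AvlNode *y = x->right;
    x->right = y->left;
    y->left = x;
    update_height(x);
    update_height(y);
    return y;
}

/* Inserts x and returns the new root of the subtree. */
AvlNode *avl_insert(AvlNode *node, int x) {
    if (node == NULL) {
        AvlNode *n = malloc(sizeof *n);
        if (n == NULL) { perror("malloc"); exit(EXIT_FAILURE); }
        n->element = x;
        n->height = 1;
        n->left = n->right = NULL;
        return n;
    }
    if (x < node->element)      node->left  = avl_insert(node->left, x);
    else if (x > node->element) node->right = avl_insert(node->right, x);
    else return node;           /* duplicate: nothing to do */

    update_height(node);
    int balance = node_height(node->left) - node_height(node->right);
    if (balance > 1) {          /* left-heavy */
        if (x > node->left->element)            /* left-right case */
            node->left = rotate_left(node->left);
        return rotate_right(node);
    }
    if (balance < -1) {         /* right-heavy */
        if (x < node->right->element)           /* right-left case */
            node->right = rotate_right(node->right);
        return rotate_left(node);
    }
    return node;
}
```

The recursion is still there, but because the tree stays balanced its depth grows with log N rather than N, so sorted input no longer threatens the stack. Call it as root = avl_insert(root, i), starting from root = NULL.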

OK Faisal.
If you insert the value of i, then you will always insert a key greater than all the keys currently contained in the tree. Consequently you will get a tree of maximum height, which in performance (and shape too) is equivalent to a list. Since your algorithm is recursive, you will get a stack overflow very quickly.
A possible way of dealing with your problem, although it is not a guarantee against an overflow, is to insert random keys. Theoretically, the average number of nodes visited in an unsuccessful search is then O(log N).
So you could use rand() to get random numbers, or a more sophisticated and reliable random number generator.
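One way to try this with the same keys 0..N-1 is to shuffle them before inserting. The sketch below is only an illustration: shuffle_keys, rand_below, insert_iterative and tree_height are hypothetical names, rand() is combined from two calls because RAND_MAX may be as small as 32767, and the small modulo bias is ignored. The insert uses a loop instead of recursion, so it cannot overflow the stack even if the shuffle turns out badly.

```c
#include <stdlib.h>

typedef struct TreeNode {
    int element;
    struct TreeNode *left, *right;
} TreeNode;

/* Pseudo-random index in [0, n), built from two rand() calls (n > 0). */
static size_t rand_below(size_t n) {
    unsigned long long r =
        (unsigned long long)rand() * ((unsigned long long)RAND_MAX + 1)
        + (unsigned long long)rand();
    return (size_t)(r % n);
}

/* Fisher-Yates shuffle of keys[0..n-1]. */
void shuffle_keys(int *keys, size_t n) {
    for (size_t i = n; i > 1; i--) {
        size_t j = rand_below(i);
        int tmp = keys[i - 1];
        keys[i - 1] = keys[j];
        keys[j] = tmp;
    }
}

/* Loop-based BST insert; returns the (possibly new) root.
   Duplicates are ignored; on allocation failure the tree is unchanged. */
TreeNode *insert_iterative(TreeNode *root, int x) {
    TreeNode **link = &root;          /* pointer to the link we may fill in */
    while (*link != NULL) {
        if (x < (*link)->element)      link = &(*link)->left;
        else if (x > (*link)->element) link = &(*link)->right;
        else return root;
    }
    TreeNode *n = malloc(sizeof *n);
    if (n == NULL) return root;
    n->element = x;
    n->left = n->right = NULL;
    *link = n;
    return root;
}

/* Height in nodes; recursive, so only meant for reasonably shallow trees. */
int tree_height(const TreeNode *n) {
    if (n == NULL) return 0;
    int l = tree_height(n->left), r = tree_height(n->right);
    return 1 + (l > r ? l : r);
}
```

Fill an array with the keys, call shuffle_keys on it, then insert the elements one by one with root = insert_iterative(root, keys[i]). As the answer says, this makes a shallow tree very likely but does not guarantee it.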

Related

How can I delete multiple nodes of a BST in C?

I have to delete some nodes of this BST:
struct node {
char *value;
struct node *p_left;
struct node *p_right;
int usable;
};
The nodes that I have to delete are the ones with usable set to 0.
My question is: is it possible to make a sweep of the tree and delete all the nodes with usable == 0? All the resources I found online are about deleting a node containing a certain key; I tried to apply those but they didn't work.
--edit:
The remove-node function that I implemented was fine:
struct node* deleteNode(struct node* root, char *key) {
if (root == NULL)
{
return root;}
int cmp_result = strcmp(key, root->value);
if (cmp_result < 0)
root->p_left= deleteNode(root->p_left, key);
else if (cmp_result>0)
root->p_right= deleteNode(root->p_right, key);
else{
if (root->p_left==NULL) {
struct node *temp = root->p_right;
free(root);
return temp;
} else if(root->p_right==NULL){
struct node *temp = root->p_left;
free(root);
return temp;
}
struct node* temp = minValueNode(root->p_right);
strcpy(root->value, temp->value);
root->p_right= deleteNode(root->p_right, temp->value);
}
return root;
}
The problems arose because I called this function while traversing the tree, changing the structure of the tree while I was using it:
void pos2(struct node *head, char exactchar, int n)
{
if( head != NULL ) {
pos2(head->p_left, exactchar, n);
if (head->value[n]!=exactchar){
head = deleteNode(head, head->value);}
pos2(head->p_right, exactchar, n);
}
}
like this function, which deletes a node if its word does not have a given char at a given position.
Is it possible to make a sweep of the tree and delete all the nodes with usable == 0?
Of course.
All the resources I found online are about deleting a node containing a certain key; I tried to apply those but they didn't work.
I have no idea what, specifically, you tried. However, algorithms aimed at deleting the node having a specific key clearly do not solve the problem you have posed. They will use the BST-ness of the tree to efficiently find the specific node to delete, if it is present, and delete just that node.
Since your flag does not have a functional relationship with the keys on which the BST is ordered, you need to traverse the whole tree and delete every node you find that satisfies your criterion for doing so. Operationally, I would probably structure that as a depth-first traversal with post-order deletions (that is, consider whether to delete a given node after processing both its subtrees).
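A minimal sketch of such a post-order sweep, under some assumptions: sweep_unusable is a made-up name, each node is assumed to own a malloc'd value string (drop the free(root->value) line if that is not the case), and a node with two children is replaced by the left-most node of its already-swept right subtree.

```c
#include <stdlib.h>

struct node {
    char *value;
    struct node *p_left;
    struct node *p_right;
    int usable;
};

/* Removes every node with usable == 0 and returns the new subtree root. */
struct node *sweep_unusable(struct node *root) {
    if (root == NULL)
        return NULL;

    /* Post-order: clean both subtrees before looking at this node. */
    root->p_left = sweep_unusable(root->p_left);
    root->p_right = sweep_unusable(root->p_right);
    if (root->usable)
        return root;

    struct node *replacement;
    if (root->p_left == NULL) {
        replacement = root->p_right;
    } else if (root->p_right == NULL) {
        replacement = root->p_left;
    } else {
        /* Two children: unlink the in-order successor and move it up.
           It is usable, because the right subtree was swept already. */
        struct node **link = &root->p_right;
        while ((*link)->p_left != NULL)
            link = &(*link)->p_left;
        replacement = *link;
        *link = replacement->p_right;
        replacement->p_left = root->p_left;
        replacement->p_right = root->p_right;
    }
    free(root->value);   /* assumption: the node owns its string */
    free(root);
    return replacement;
}
```

Call it as root = sweep_unusable(root). Because each node is only examined after both of its subtrees are finished, the traversal never walks through a part of the tree that is still going to be restructured.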

Largest leaf node in a tree: how to prevent getting a 0 when leaf nodes are negative

I have to find the largest leaf node in a BST but it only has the shape of a BST not the properties. The right most node is not the largest and the left most is not the smallest.
So far I have created code that I think would work for most cases except negative cases.
typedef struct BST
{
int data;
struct BST *left;
struct BST *right;
}Tree;
The above is just the structure of the node
int largest_leaf(Tree *head)
{
if(head == NULL)
{
printf("Heyall\n");
return 0;
}
if(head -> right == NULL && head -> left == NULL)
{
printf("head -> data: %d\n", head -> data);
return head -> data;
}
int i = largest_leaf(head -> left);
int r = i;
i = largest_leaf(head -> right);
if(i > r)
{
r = i;
}
return r;
}
I know my code may be confusing. To simplify: since this has the shape of a tree, I traverse down to one edge (in this case the left), return the leaf value, and then do the same for every value thereafter.
My question is: how do I fix the problem of getting a zero when all the leaf values are negative?
Edit: the tree is not empty.
I have to find the largest leaf node in a BST but it only has the shape of a BST not the properties. The right most node is not the largest and the left most is not the smallest.
You have to traverse the whole tree and use the same basic algorithm you'd use for an array. Instead of 0 as the smallest number, use the smallest possible integer: INT_MIN.
As mentioned by kaylum, you need to consider what happens when the tree is empty. Does it return INT_MIN? Do you need a separate error flag?
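A minimal sketch of that change, keeping the question's function name and node layout. Returning INT_MIN for an empty tree is an assumption made here; a caller that needs to tell "empty" apart from a leaf that really holds INT_MIN would need a separate flag.

```c
#include <limits.h>
#include <stddef.h>

typedef struct BST {
    int data;
    struct BST *left;
    struct BST *right;
} Tree;

/* Largest value stored in a leaf; INT_MIN if the (sub)tree is empty. */
int largest_leaf(const Tree *head) {
    if (head == NULL)
        return INT_MIN;            /* can never beat a real leaf value */
    if (head->left == NULL && head->right == NULL)
        return head->data;         /* leaf: candidate value */
    int l = largest_leaf(head->left);
    int r = largest_leaf(head->right);
    return l > r ? l : r;          /* inner nodes are not candidates */
}
```

With a missing child contributing INT_MIN instead of 0, a tree whose leaves are all negative returns its largest negative leaf.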

Convert heap implemented in array to tree

I have this homework where I have to convert a min-heap represented in array:
#define SIZE /* some constant */
typedef int Heap[SIZE];
and implement it in a tree like so:
typedef struct node{
int val;
struct node *left, *right;
} Node, *Tree;
and as a reminder the index of min-heaps arrays is as follows:
#define PARENT(i) (((i) - 1) / 2)
#define LEFT(i) (2 * (i) + 1)
#define RIGHT(i) (2 * (i) + 2)
so, how do I do this?
I started on something like this:
Tree heapToTree(int * heap){
Tree *t = malloc(sizeof(struct node));
t->val = heap[0];
Tree *aux = t; //save initial tree position
for(i=0;i<SIZE;i++){
aux->left=malloc(sizeof(struct Node));
aux->left->val=heap[i*2 +1];
aux->right=malloc(sizeof(struct Node));
aux->right->val=heap[i*2 +2];
}
Am I on the right path? I think this should be done recursively, but how?
thanks in advance
One thing you are missing is initializing the links (left and right) of each newly created node to NULL. In any tree implementation this is essential: it is how a traversal, or a search (which is also a traversal), knows where a branch ends.
Also, in the loop you never change the value of aux (or at least you didn't show it); as a result you overwrite the old pointers and leak memory.
Apart from that, you do not check the return value of malloc. You should check it, and if it is NULL, handle that separately (error handling) from the usual code flow.
Since the heap is stored in a 0-indexed array, you can convert it to a tree like this:
struct node *convIntoTree(int pos,int sz, int *heap){
if(pos >= sz ) return NULL;
struct node* root = malloc(sizeof *root);
if( root == NULL ){
perror("Malloc failed");
exit(EXIT_FAILURE);
}
root->val = heap[pos];
root->left = convIntoTree(pos*2+1, sz, heap);
root->right = convIntoTree(pos*2+2, sz, heap);
return root;
}
Call it like this
struct node *root = convIntoTree(0,heapsize,heap);
The solution simply applies a brute-force traversal of every element of the heap, allocating a node for each one and populating its left and right children recursively.

Homework, Recursive BST insert function in C

This is homework for my first class in C. It focuses on dynamic allocation in C, in the form of a BST.
I have to have a dynamically allocated BST, recursively implemented. I know that my traversal works correctly, and am having trouble inserting nodes. I only ever have the root node, and every other node seems to be set to NULL. I think that I can't print the rest of the nodes when traversing, because I am trying to access the data member of a NULL struct. My code so far is as follows:
void insert_node(TreeNode** root, int toInsert){
if(*root==NULL){
TreeNode *newnode = (TreeNode *)malloc(sizeof(TreeNode));
newnode->data = toInsert;
newnode->left = NULL;
newnode->right = NULL;
}
else if(toInsert > (*root)->data){ //if toInsert greater than current
struct TreeNode **temp = (TreeNode **)malloc(sizeof(struct TreeNode*));
*temp = (*root)->right;
insert_node(temp, toInsert);
}
else{ //if toInsert is less than or equal to current
struct TreeNode **temp = (TreeNode **)malloc(sizeof(struct TreeNode*));
*temp = (*root)->left;
insert_node(temp, toInsert);
}
}
void build_tree(TreeNode** root, const int elements[], const int count){
if(count > 0){
TreeNode *newroot = (TreeNode *)malloc(sizeof(TreeNode));
newroot->data = elements[0];
newroot->left = NULL;
newroot->right = NULL;
*root = newroot;
for(int i = 1; i < count; i++){
insert_node(root, elements[i]);
}
}
}
I'm sure it's only one of many problems, but I get segmentation faults on any line that uses "(*root)->data", and I'm not sure why.
As a side note, despite getting segmentation faults for the "(*root)->data" lines, I'm still able to printf "(*root)->data". How is it possible to print the value, but still get a segmentation fault?
It's messy. Some things that might help:
1) You don't need to use TreeNode**, a pointer to a pointer, as the argument. Use just TreeNode*.
2) Not a strict rule, but as a best practice avoid using the first node of a linked list to store actual values. Use it just as the header of your list. The reason is that if you need to delete this node, you don't lose the list. Just a tip.
3) In your first function, if *root==NULL, I'd rather make the function fail than add the value to a temporary list. That list is lost in the current code: the new node is never passed outside the function.
4) You are making it go to the right if the new value is greater than the node and to the left if it's smaller, but it never stops. See this example:
Suppose you have the list 1->3->4 and you want to insert 2. What will the algorithm do? It keeps trying to insert at the 1 node and the 3 node, switching between them, but never actually inserts anything.
Solution: as you build this list bottom-up, it will always be sorted (if you insert nodes correctly). So you just need to check whether the next node is higher and, if it is, insert right where you are.
5) If you're passing a TreeNode *root as an argument (in the second function), you shouldn't have to create a new list and set root=newlist. Just use the root.
All of this would result in the following (I didn't test it, so there might be some errors):
void insert_node(TreeNode* root, int toInsert){
if(root==NULL){
printf("Error");
return;
}
TreeNode* temp = root; //I just don't like to mess with the original list, rather do this
if(temp->right!=NULL && toInsert > temp->right->data){ //if toInsert greater than next
insert_node(temp->right, toInsert);
}
else{ //if toInsert is less or equal than next node
TreeNode* temp2 = temp->right; //grabbing the list after this node
temp->right=(TreeNode*)malloc(sizeof(TreeNode)); //making room for the new node
temp->right->right=temp2; //putting the pointer to the right position
temp->right->left=temp; //setting the left of the next node to the current
temp->right->data=toInsert;
}
}
void build_tree(TreeNode* root, const int elements[], const int count){
if(count > 0){
for(int i = 0; i < count; i++){
insert_node(root, elements[i]);
}
}
}
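For comparison, here is a hedged sketch of a different route from the one above: it keeps the question's TreeNode** signature and the tree shape, and only changes how the new node gets linked in. The TreeNode layout is assumed from the question's code. The two changes are that the freshly allocated node is stored through *root, and that the recursion passes the address of the child pointer itself instead of a malloc'd temporary.

```c
#include <stdlib.h>

/* Assumed layout, matching the fields the question's code uses. */
typedef struct TreeNode {
    int data;
    struct TreeNode *left, *right;
} TreeNode;

void insert_node(TreeNode **root, int toInsert) {
    if (*root == NULL) {
        TreeNode *newnode = malloc(sizeof *newnode);
        if (newnode == NULL)
            return;                 /* allocation failed: tree unchanged */
        newnode->data = toInsert;
        newnode->left = NULL;
        newnode->right = NULL;
        *root = newnode;            /* link the node into the tree */
        return;
    }
    if (toInsert > (*root)->data)
        insert_node(&(*root)->right, toInsert);   /* address of the real link */
    else
        insert_node(&(*root)->left, toInsert);    /* less than or equal */
}

void build_tree(TreeNode **root, const int elements[], const int count) {
    *root = NULL;                   /* start from an empty tree */
    for (int i = 0; i < count; i++)
        insert_node(root, elements[i]);
}
```

Because insert_node already handles the empty tree, build_tree no longer needs to create the root node as a special case.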

Problem with pointers in binary search tree deletion

I am trying to implement binary search tree operations and got stuck at deletion.
   11
  /  \
10    14
Using inorder traversal as representation of tree initially output is 10 11 14.
Deleting node 10, output expected is 11 14 but I am getting 0 11 14.
Deleting node 14, output expected is just 11 but I am getting 0 11 67837.
Please explain why I am getting wrong output. I am not looking for any code :).
typedef struct _node{
int data;
struct _node *left;
struct _node *right;
} Node;
Node* bstree_search(Node *root, int key)
{
if(root == NULL){
return root;
}
// Based on binary search relation, key can be found in either left,
// right, or root.
if(key > root->data)
return bstree_search(root->right, key);
else if(key < root->data)
return bstree_search(root->left, key);
else
return root;
}
void bstree_insert(Node **adroot, int value)
{
// since address of address(root is itself address) is passed we can change root.
if(*adroot == NULL){
*adroot = malloc(sizeof(**adroot));
(*adroot)->data = value;
(*adroot)->right = (*adroot)->left = NULL;
return;
}
if(value > (*adroot)->data)
bstree_insert(&(*adroot)->right, value);
else
bstree_insert(&(*adroot)->left, value);
}
void bstree_inorder_walk(Node *root)
{
if(root != NULL){
bstree_inorder_walk(root->left);
printf("%d ",root->data);
bstree_inorder_walk(root->right);
}
}
void bstree_delete(Node **adnode)
{
//Node with no children or only one child
Node *node, *temp;
node = temp = *adnode;
if((*adnode)->right == NULL || (*adnode)->left == NULL){
if((*adnode)->right == NULL){
*adnode = (*adnode)->left;
}else{
*adnode = (*adnode)->right;
}
}else{ // Node with two children
}
free(temp);
}
int main()
{
Node *root = NULL;
Node *needle = NULL;
int i,elems[] = {11,10,14};
for(i = 0; i < 3; ++i)
bstree_insert(&root,elems[i]);
bstree_inorder_walk(root);
printf("\n");
needle = bstree_search(root, 10);
bstree_delete(&needle);
bstree_inorder_walk(root);
printf("\n");
needle = bstree_search(root, 14);
bstree_delete(&needle);
bstree_inorder_walk(root);
printf("\n");
}
Please explain why I am getting wrong output.
Your delete function must also change the parent of the deleted Node. For example, when you delete the node holding 10, you must set the root Node's left child to NULL. Since you don't do this, when you later traverse the tree, you print out data that has already been freed.
I did not look at any code other than delete, so I can't make any guarantees about it working once this change is made.
You're getting wrong output because your deletion code is buggy (okay, maybe that's stating the obvious).
To delete from a binary search tree, you first find the node to be deleted. If it's a leaf node, you set the pointer to it in its parent node to NULL, and free the node. If it has only one child, you make the parent point to that child instead. If it has two children, you take one of two replacement nodes (either the left-most node in the right sub-tree, or the right-most node in the left sub-tree), put it in place of the node you need to delete, and make its previous parent point to the replacement's only child (or NULL if it has none), so the replacement is "spliced out" of its old position in the tree.
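For readers who do want to see it, the steps above can be sketched as follows. This is only an illustration: bstree_find_link and bstree_delete_link are hypothetical names, and the two-children case copies the successor's data rather than relinking nodes. The key point is that the search returns the address of the pointer that refers to the node (the root variable or a parent's left/right field), so the delete can update the tree itself instead of a local copy such as needle.

```c
#include <stdlib.h>

typedef struct Node {
    int data;
    struct Node *left;
    struct Node *right;
} Node;

/* Returns the address of the link that points at key's node.
   If the key is absent, the returned link holds NULL. */
Node **bstree_find_link(Node **adroot, int key) {
    while (*adroot != NULL && (*adroot)->data != key)
        adroot = key > (*adroot)->data ? &(*adroot)->right
                                       : &(*adroot)->left;
    return adroot;
}

/* Deletes the node that *adnode points at, fixing the link in place. */
void bstree_delete_link(Node **adnode) {
    Node *node = *adnode;
    if (node == NULL)
        return;                          /* key was not found */
    if (node->left != NULL && node->right != NULL) {
        /* Two children: copy the in-order successor's data here,
           then delete the successor, which has no left child. */
        Node **succ = &node->right;
        while ((*succ)->left != NULL)
            succ = &(*succ)->left;
        node->data = (*succ)->data;
        adnode = succ;
        node = *succ;
    }
    /* Zero or one child: the link now skips the removed node. */
    *adnode = node->left != NULL ? node->left : node->right;
    free(node);
}
```

A call then looks like bstree_delete_link(bstree_find_link(&root, 10)), which leaves the root's left pointer set to NULL instead of dangling.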
A couple of things really quickly.
First, when you allocate the node, I would do the malloc on the sizeof the type (i.e. Node); sizeof(**adroot) gives the same size, so this is a matter of readability.
Second, if the node has two children, it looks like you are not really deleting it and rebuilding the search tree by promoting one of the children.
Other people have already pointed out the other obvious errors.
