Binary Search Tree in C: remove node function - c

I'm putting together functions for a binary search tree and ran into a wall. I'm working on each situation that might be encountered when a node holding a specified value needs to be removed from the tree. I'm uncertain how to handle freeing the node if it does not have a left and right child. The function must return a node. Do I back up, examine each left and right child, and remove the value while it's in a child? But then if the value is in the root, wouldn't I have a similar problem with how to remove it? Just by way of explanation, the program uses a void pointer then casts the TYPE val in a separate function compare() which evaluates both values and returns -1 for <, 0 for ==, and 1 for >.
struct Node *_removeNode(struct Node *cur, TYPE val)
{
if (compare(cur->val, val) == 0) { //if val == cur->val
if (cur->right != NULL && cur->left != NULL) { //if cur has right and left
cur = _leftMost(cur->right);
free(_leftMost(cur->right));
}
else if (cur->right == NULL && cur->left != NULL) { //if cur has left
cur = cur->left;
free(cur->left);
}
else if (cur->right != NULL && cur->left == NULL){ //if cur has right
cur = cur->right;
free(cur->right);
}
else if (cur->right == NULL && cur->left == NULL){ //if cur has no child
//free cur if cur = val
}
}
else if (compare(cur->val, val) == -1) {
cur->right = _removeNode(cur->right, val);
}
else if (compare(cur->val, val) == 1) {
cur->left = _removeNode(cur->left, val);
}
return cur;
}

If the node has no children, it can simply be deleted; so that the recursion in the other cases works, return NULL from _removeNode in that case. In every case, cur itself should be freed, since it is no longer needed, and the function should return the subtree that replaces it. The complication is the first case, where the leftmost descendant of the right child is pulled up. After pulling it up, you need to remove it from the right subtree (note that it may itself be the root of the right subtree).
I wrote all of the below off the top of my head so be prepared for a few errors/a bit of debugging. Also, _leftMost and _removeLeftMost can be merged with a bit of work.
The block in question should look something like:
struct Node *replacement;
if (cur->right != NULL && cur->left != NULL) { //if cur has right and left
replacement = _leftMost(cur->right);
replacement->right = _removeLeftMost(cur->right,replacement);
replacement->left = cur->left;
}
else if (cur->right == NULL && cur->left != NULL) { //if cur has left
replacement = cur->left;
}
else if (cur->right != NULL && cur->left == NULL){ //if cur has right
replacement = cur->right;
}
else if (cur->right == NULL && cur->left == NULL){ //if cur has no child
replacement = NULL;
}
free(cur);
cur = replacement;
The function _removeLeftMost walks down the left child pointers until it sees the node to be replaced and then replaces it with the right child of that node. Something like:
struct Node *_removeLeftMost(struct Node *node, struct Node *remove) {
if (node == remove) {
return node->right; // Note that remove->left should be null
}
else {
node->left = _removeLeftMost(node->left,remove);
return node;
}
}
Also, the main call is something like
root = _removeNode(root, val);
So that handles your concern when the node is the root.
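Putting the pieces together, here is a minimal self-contained sketch of the same idea. It assumes plain int values instead of TYPE and compare(), and the names insert_node, detach_leftmost, remove_node and count_nodes are hypothetical helpers, not part of the original program. It also merges _leftMost and _removeLeftMost into one function, as suggested above.

```c
#include <assert.h>
#include <stdlib.h>

struct Node {
    int val;                 /* assumption: int instead of TYPE + compare() */
    struct Node *left;
    struct Node *right;
};

/* Hypothetical helper: insert val and return the (possibly new) subtree root. */
struct Node *insert_node(struct Node *cur, int val)
{
    if (cur == NULL) {
        cur = malloc(sizeof *cur);
        assert(cur != NULL);
        cur->val = val;
        cur->left = cur->right = NULL;
    } else if (val < cur->val) {
        cur->left = insert_node(cur->left, val);
    } else if (val > cur->val) {
        cur->right = insert_node(cur->right, val);
    }
    return cur;              /* duplicates are ignored */
}

/* Detach the leftmost node of a non-empty subtree. The detached node is
 * stored in *out and the remaining subtree is returned. */
struct Node *detach_leftmost(struct Node *node, struct Node **out)
{
    if (node->left == NULL) {
        *out = node;
        return node->right;  /* the right child takes the node's place */
    }
    node->left = detach_leftmost(node->left, out);
    return node;
}

/* Remove val from the subtree rooted at cur and return the new subtree root. */
struct Node *remove_node(struct Node *cur, int val)
{
    struct Node *replacement;

    if (cur == NULL)
        return NULL;         /* value not present */
    if (val < cur->val) {
        cur->left = remove_node(cur->left, val);
        return cur;
    }
    if (val > cur->val) {
        cur->right = remove_node(cur->right, val);
        return cur;
    }
    if (cur->left != NULL && cur->right != NULL) {
        /* two children: pull up the leftmost node of the right subtree */
        struct Node *rest = detach_leftmost(cur->right, &replacement);
        replacement->right = rest;
        replacement->left = cur->left;
    } else {
        /* zero or one child: the surviving child (or NULL) replaces cur */
        replacement = (cur->left != NULL) ? cur->left : cur->right;
    }
    free(cur);
    return replacement;
}

/* Hypothetical helper used only to check the result. */
int count_nodes(const struct Node *n)
{
    return n == NULL ? 0 : 1 + count_nodes(n->left) + count_nodes(n->right);
}
```

As with the original, the caller always reassigns the root: root = remove_node(root, val);. That single assignment covers removal of the root node as well as a value that is not in the tree.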

Related

BST; the sum of BST elements which are greater than the sum of their direct children

So the task is to write a function which returns the sum of BST elements which are greater than the sum of their direct children. Do not count leaf nodes. I did it this way: the base case when the tree is empty returns 0, and a node without children also returns 0. Then I checked whether the node has one or two children and the condition for the sums.
int sum(struct node* root)
{
if (root== NULL) return 0;
if (root->right == NULL && root->left==NULL) return 0;
if (root->right!= NULL && root->left != NULL)
{
if (((root->right)->key + (root->left)->key) < root->key) return root->key;
}
if (root->right != NULL && root->left==NULL)
{
if ((root->right)->key< root->key) return root->key;
}
if (root->left != NULL && root->right==NULL)
{
if ((root->left)->key < root->key) return root->key;
}
else return sum(root->right) + sum(root->left);
}
Main:
struct node* root = NULL;
add(&root,-4);
add(&root,6);
add(&root,8);
add(&root,-11);
add(&root,5);
add(&root,7);
add(&root,-20);
printf("%d",sum(root));
It should return -1 (6+8-11-4), but my function doesn't work and I don't know why.
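In your code, the else belongs only to the last if, so the recursive expression runs only when that last outer test is false. Whenever a node qualifies, you return root->key immediately and never visit its subtrees; and when a node has only a left child whose key is not smaller than its own, the function falls off the end without returning a value.
However, the recursive expression should always be executed. You either need to add + sum(root->left) + sum(root->right) to each of the three non-degenerate return statements, or you need to save root->key in a local variable (defaulting to zero) and fall through to return return_value + sum(root->left) + sum(root->right);.
Hence (using the abbreviation rv for 'return value'):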
int sum(struct node* root)
{
int rv = 0;
if (root == NULL || (root->right == NULL && root->left == NULL))
return rv;
if (root->right != NULL && root->left != NULL)
{
if ((root->right->key + root->left->key) < root->key)
rv = root->key;
}
if (root->right != NULL && root->left == NULL)
{
if (root->right->key < root->key)
rv = root->key;
}
if (root->left != NULL && root->right == NULL)
{
if (root->left->key < root->key)
rv = root->key;
}
return rv + sum(root->right) + sum(root->left);
}
Warning: pristine code, unsullied by any attempt to compile it.
You could use an else if chain for the last three 'outer' if tests (with an else clause instead of the last test) to avoid repeated tests. Whether there'd be a measurable difference is debatable; an optimizer might even make the change for itself.
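For illustration, here is a sketch of that else if variant. It assumes a struct node with int key, left and right members, as the question's code implies, and computes the children's total once before comparing; the local name children is mine.

```c
#include <stddef.h>

struct node {
    int key;
    struct node *left;
    struct node *right;
};

/* Sum of the keys of all non-leaf nodes whose key is greater than
 * the sum of their direct children. */
int sum(const struct node *root)
{
    int rv = 0;
    int children;

    if (root == NULL || (root->left == NULL && root->right == NULL))
        return 0;                       /* empty tree or leaf */

    if (root->left != NULL && root->right != NULL)
        children = root->left->key + root->right->key;
    else if (root->left != NULL)
        children = root->left->key;     /* only a left child */
    else
        children = root->right->key;    /* only a right child */

    if (children < root->key)
        rv = root->key;

    /* always recurse, whether or not this node was counted */
    return rv + sum(root->left) + sum(root->right);
}
```

With a root of 10 whose children are 3 and 15, and 3 having a single child 1, only the node 3 qualifies (3 > 1, while 10 < 3 + 15), so the result is 3.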

Error removing entry and freeing bucket node in C linked list traversal

I am trying to develop a Hash Table with chained linked-list hashing to remedy collisions, but I seem to be having an error with my remove_entry function. I am working almost exclusively with pointers, and as such I am dynamically allocating and freeing memory as necessary.
Here are the structures for my Table and Bucket datatypes:
typedef struct bucket {
char *key;
void *value;
struct bucket *next;
} Bucket;
typedef struct {
int key_count;
int table_size;
void (*free_value)(void *);
Bucket **buckets;
} Table;
And here is my remove function. I tried to include commentary to explain what is going on:
int remove_entry(Table *table, const char *key) {
int hash;
Bucket *cur, *prev;
if (table == NULL || key == NULL) {
return FAILURE;
}
hash = hash_code(key) % (table->table_size);
/* case: key not present at lead bucket position */
if (table->buckets[hash] == NULL) {
return FAILURE;
} else {
cur = table->buckets[hash];
/* traverse thru the chain from table->buckets[hash]
* to see if the key exists somewhere. loop only can run the
* first time if its key does not match the paramteter key, so
* if they do match then we proceed to the logic below this
* loop (2nd if statement), acting on the FIRST (lead) bucket */
while (strcmp(cur->key, key) != 0 && cur != NULL) {
prev = cur;
cur = cur->next;
}
/* case - key not found anywhere (cur == null) */
if (cur == NULL) {
return FAILURE;
}
/* case - key found in chain (strcmp returned 0, keys match) */
if (strcmp(cur->key, key) == 0) {
if (table->free_value != NULL) {
table->free_value(cur->value);
cur->value = NULL;
}
free(cur->key);
cur->key = NULL;
if (cur->next != NULL) {
prev->next = cur->next;
}
free(cur);
cur = NULL;
table->key_count -= 1;
return SUCCESS;
}
}
return FAILURE;
}
valgrind tells me that there is an issue calling free() on cur at the bottom of the function, but I don't understand why that's an issue. The overall issue I found I am having is that the address of the bucket (at the appropriate index "hash") undergoes no change even though cur is changed.
Thanks in advance.
I think your error involves removing the FIRST bucket when more than one bucket exists.
In this code segment:
cur = table->buckets[hash];
while (strcmp(cur->key, key) != 0 && cur != NULL) {
prev = cur;
cur = cur->next;
}
If key does indeed equal the FIRST bucket pointed to by the cur pointer then your WHILE loop's inner code never gets invoked. And the upshot is that pointer prev is undefined ... you never pre-set it to NULL and it did not get set in that bypassed loop.
So this subsequent code snippet:
if (cur->next != NULL) {
prev->next = cur->next;
}
is going to fail since prev has either zero (NULL) in it, or has some undefined memory location stored in it.
Also the logic of the above code snippet is not quite correct; if this is the last bucket being removed (so cur->next indeed equals NULL), you still want to move that NULL back into the prev->next (or the list head) so that prev "knows" it is now the end of the linked list.
You are also failing to account for the case where you need to reset your linked list header (and not prev->next) when the very first bucket is the one being removed. This is a common mistake, and one I've answered several times recently just this week.
So I think you need to change your code to:
int remove_entry(Table *table, const char *key) {
int hash;
Bucket *cur;
Bucket *prev = NULL;
if (table == NULL || key == NULL) {
return FAILURE;
}
hash = hash_code(key) % (table->table_size);
/* case: key not present at lead bucket position */
if (table->buckets[hash] == NULL) {
return FAILURE;
} else {
cur = table->buckets[hash];
/* traverse the chain from table->buckets[hash] to see if the
* key exists somewhere. the loop body only runs if the current
* key does not match the parameter key, so a match on the FIRST
* (lead) bucket leaves prev == NULL. cur is tested before it is
* dereferenced so that a missing key ends the loop cleanly */
while (cur != NULL && strcmp(cur->key, key) != 0) {
prev = cur;
cur = cur->next;
}
/* case - key not found anywhere (cur == null) */
if (cur == NULL) {
return FAILURE;
}
/* case - key found in chain (strcmp returned 0, keys match) */
if (strcmp(cur->key, key) == 0) {
if (table->free_value != NULL) {
table->free_value(cur->value);
cur->value = NULL;
}
free(cur->key);
cur->key = NULL;
if (prev != NULL) {
prev->next = cur->next; // even want to copy-back when cur->next == NULL
}
else
{
table->buckets[hash] = cur->next; // even want to copy-back when cur->next == NULL
}
free(cur);
cur = NULL;
table->key_count -= 1;
return SUCCESS;
}
}
return FAILURE;
}
I'm not sure why free(cur) would be giving you an error, but once you've fixed up the code to handle the linked list properly and don't have a prev pointer that is not properly initialized, you may be able to figure out what remains wrong.
One other trivial nit: you really don't need the following standalone IF statement: if (strcmp(cur->key, key) == 0) {. When you exit the WHILE loop, the only two possibilities are that cur == NULL or that strcmp(cur->key, key) == 0. You do not need the second IF statement. It's not wrong, just slightly inefficient.
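An alternative that avoids the prev bookkeeping altogether is to walk the chain with a pointer to the link being followed. The sketch below is only an illustration under assumptions: the values of SUCCESS and FAILURE, the hash_code body, and the put_entry helper are placeholders I made up so that the example is self-contained.

```c
#include <stdlib.h>
#include <string.h>

#define SUCCESS 0            /* assumed values; the question does not show them */
#define FAILURE (-1)

typedef struct bucket {
    char *key;
    void *value;
    struct bucket *next;
} Bucket;

typedef struct {
    int key_count;
    int table_size;
    void (*free_value)(void *);
    Bucket **buckets;
} Table;

/* Placeholder hash function, not the asker's. */
unsigned int hash_code(const char *key)
{
    unsigned int h = 5381u;
    while (*key != '\0')
        h = h * 33u + (unsigned char)*key++;
    return h;
}

/* Hypothetical helper: prepend a new bucket to its chain. */
int put_entry(Table *table, const char *key, void *value)
{
    unsigned int idx = hash_code(key) % (unsigned int)table->table_size;
    Bucket *b = malloc(sizeof *b);
    if (b == NULL)
        return FAILURE;
    b->key = malloc(strlen(key) + 1);
    if (b->key == NULL) {
        free(b);
        return FAILURE;
    }
    strcpy(b->key, key);
    b->value = value;
    b->next = table->buckets[idx];
    table->buckets[idx] = b;
    table->key_count += 1;
    return SUCCESS;
}

int remove_entry(Table *table, const char *key)
{
    Bucket **link;           /* address of the pointer that leads to cur */
    Bucket *cur;

    if (table == NULL || key == NULL || table->table_size <= 0)
        return FAILURE;

    link = &table->buckets[hash_code(key) % (unsigned int)table->table_size];
    while (*link != NULL && strcmp((*link)->key, key) != 0)
        link = &(*link)->next;

    cur = *link;
    if (cur == NULL)
        return FAILURE;      /* key not found in this chain */

    *link = cur->next;       /* works for head, middle and tail alike */
    if (table->free_value != NULL)
        table->free_value(cur->value);
    free(cur->key);
    free(cur);
    table->key_count -= 1;
    return SUCCESS;
}
```

Because link starts out pointing at the table slot itself, unlinking the first bucket needs no special case: *link = cur->next updates either table->buckets[hash] or the previous bucket's next field.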

Binary Search Tree insertion not working

I've been playing about with this Binary search tree for a while but I can't seem to insert or change any of the tree properties.
My binary tree is defined as:
struct tree{
Node * root;
int size;
};
struct node{
int value;
Node * left;
Node * right;
};
Therefore my binary tree is composed of nodes. Now the bit that doesn't work:
void add(int value, Tree *t){
//1. if root is null create root
if(t->root == NULL){
t->root = nodeCreate(value);
t->size ++;
return;
}
Node * cursor = t->root;
while(cursor != NULL){
if(value == cursor->value){
printf("value already present in BST\n");
return;
}
if(value < cursor->value){
cursor = cursor->left;
}
if(value > cursor->value){
cursor = cursor->right;
}
}
//value not found in BST so create a new node.
cursor = nodeCreate(value);
t->size = t->size + 1;
}
Can someone tell me where I'm going wrong? I expected calls to add() would increase the size member as well as creating new nodes but I can't seem to get it.
I believe the changes below will fix your problem.
void add(int value, Tree *t){
if(t->root == NULL){
t->root = nodeCreate(value);
t->size ++;
return;
}
Node * cursor = t->root;
Node * last = NULL;
while(cursor != NULL){
last = cursor;
if(value == cursor->value){
printf("value already present in BST\n");
return;
}
if(value < cursor->value){
cursor = cursor->left;
}
else if(value > cursor->value){
cursor = cursor->right;
}
}
//value not found in BST so create a new node.
cursor = nodeCreate(value);
if (value > last->value)
{
last->right = cursor;
}
else
{
last->left = cursor;
}
t->size = t->size + 1;
}
You have both a design flaw and an outright bug in your loop.
The design flaw: You're allocating a new node, but assigning to cursor doesn't mean you're assigning to the parent node's left or right child pointer that got you there in the first place. You need a reference to the actual pointer you're going to populate. One way to do this is with a pointer-to-pointer, and as a bonus, this eliminates the is-my-root-null check at the beginning.
The outright bug: Your left-side movement clause (i.e. chasing a left-side pointer) will potentially change cursor to NULL, but the logic for chasing the right side is not excluded with an else if condition. If your search followed a left-side pointer to NULL, it would fault when the right-side test dereferenced that null pointer.
void add(int value, Tree *t)
{
Node **pp = &(t->root);
while (*pp)
{
if(value == (*pp)->value) {
printf("value already present in BST\n");
return;
}
if(value < (*pp)->value)
pp = &(*pp)->left;
else if(value > (*pp)->value)
pp = &(*pp)->right;
}
*pp = nodeCreate(value);
t->size++;
}
I should also note that you can skip the equality check by assuming a strict-weak order. I.e. the following rule can be considered valid:
if (!(a < b) && !(b < a)) then a == b is true.
That makes your insertion simpler as well.
void add(int value, Tree *t)
{
Node **pp = &(t->root);
while (*pp)
{
if (value < (*pp)->value)
pp = &(*pp)->left;
else if ((*pp)->value < value)
pp = &(*pp)->right;
else { // must be equal.
printf("value already present in BST\n");
return;
}
}
*pp = nodeCreate(value);
t->size++;
}
You're not assigning any of your existing nodes to point to the new node. You walk through the tree, create a new node when you get to the end, but you don't set any existing nodes to point to the new node.
You might want to change your structure to something like:
if ( value < cursor->value )
{
if ( cursor->left )
{
cursor = cursor->left;
}
else
{
cursor->left = newNode(value);
break;
}
}
with similar logic for the right-hand cursor.
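Filled in for both sides, that structure might look like the sketch below. The Node and Tree typedefs and the body of nodeCreate are assumptions reconstructed from the question (the snippet above calls it newNode); only the loop shape is the point.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct node {
    int value;
    struct node *left;
    struct node *right;
} Node;

typedef struct tree {
    Node *root;
    int size;
} Tree;

/* Assumed implementation of the asker's nodeCreate(). */
Node *nodeCreate(int value)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    n->value = value;
    n->left = n->right = NULL;
    return n;
}

void add(int value, Tree *t)
{
    Node *cursor;

    if (t->root == NULL) {
        t->root = nodeCreate(value);
        t->size++;
        return;
    }

    cursor = t->root;
    for (;;) {
        if (value < cursor->value) {
            if (cursor->left) {
                cursor = cursor->left;
            } else {
                cursor->left = nodeCreate(value);   /* attach to the parent */
                break;
            }
        } else if (value > cursor->value) {
            if (cursor->right) {
                cursor = cursor->right;
            } else {
                cursor->right = nodeCreate(value);  /* attach to the parent */
                break;
            }
        } else {
            printf("value already present in BST\n");
            return;
        }
    }
    t->size++;
}
```

The cursor never becomes NULL here, because it only moves to a child that exists; the new node is stored directly in the parent's left or right field.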

C Binary Search Tree pre-order traversal with recursion

I am working on a function that searches through a binary search tree in C for a name that is passed in with the function. However, I am stuck on how to format my loop so that the recursion doesn't simply end when the traversal reaches the left-most node with no children. The traversal has to be pre-order (visit myself, then my left child, then my right child).
My find function is as follows:
tnode *bst_find_by_name(tnode *ptr, const char *nom){
if(ptr != NULL){
if(strcmp(ptr->name, nom) == 0){
return ptr;
}
if(ptr->left != NULL){
return bst_find_by_name(ptr->left, nom);
}
if(ptr->right != NULL){
return bst_find_by_name(ptr->right, nom);
}
}
return NULL;
}
As you can see, currently this simply returns NULL once it reaches the left-most node that does not match the string that was passed into the function. I have to have it return NULL if it does not find a match in the tree, but at the same time I do not want it to return NULL too early before it has a chance to search every node in the tree. Any ideas?
Create a temporary variable that holds the return value, and check whether bst_find_by_name returned something other than NULL. If it returned NULL, continue checking the rest of the tree.
Something like the following.
tnode *ret = NULL;
if(ptr->left != NULL){
ret = bst_find_by_name(ptr->left, nom);
}
if(ret == NULL && ptr->right != NULL){
ret = bst_find_by_name(ptr->right, nom);
}
return ret;
I prefer to write it like this:
tnode *bst_find_by_name(tnode *ptr, const char *nom) {
// accept a null node, just exit early before dereferencing it
if (ptr == NULL) {
return NULL;
}
// is it this node?
if(strcmp(ptr->name, nom) == 0){
return ptr;
}
// in C, || yields an int (0 or 1) rather than the pointer, so test the left result explicitly
tnode *found = bst_find_by_name(ptr->left, nom);
return found != NULL ? found : bst_find_by_name(ptr->right, nom);
}
// get the matching pointer for left or right subtree, and return
tnode *bst_find_by_name(tnode *ptr, const char *nom) {
// accept a null node, just exit early before dereferencing it
if (ptr == NULL) {
return NULL;
}
// is it this node?
if(strcmp(ptr->name, nom) == 0){
return ptr;
}
tnode * ptrtemp = bst_find_by_name(ptr->left, nom);
if(!ptrtemp) {
ptrtemp = bst_find_by_name(ptr->right, nom);
}
return ptrtemp;
}

Changing From Assert Function in C To If Statement

I have found some code online for red black trees, and am trying to implement it.
I do not want to use the assert function though which the original code has located here
I am getting a seg fault on the line n->color = child->color; just before the delete fix. After debugging I discovered that the child did not exist in this case, and so the reason for the assert statement in the original code. I decided to add what I thought was appropriate with the additional if clause around everything from where child is dealt with downward.
However, now the program does not actually delete, because if the child does not exist, it never makes it into the loop. After trial and error I still cannot find where to close the if clause in order to take the place of the assert statement properly.
Please let me know your ideas!
Here is my "translated" code without the assert, and using an if statement instead.
void delete_node(int key)
{
node* child;
node* n ;
n = searchTree(key);
if(n == NULL)return;
if(n->left != NULL && n->right != NULL)
{
node* pred = n->left;
while(pred->right != NULL)
pred = pred->right;
n->value = pred->value;
n = pred;
}
if(n->right != NULL || n->left != NULL){
child = n->right == NULL ? n->left : n->right;
if(n->color == 'b')
{
n->color = child->color;
delete_fix1(n);
}
swap_nodes(n, child);
if(n->parent == NULL && child != NULL)
child->color = 'b';
free(n);
}
}
Test data (Seg Fault occurs when attempting to delete 4):
i stand for insert (insert occurs flawlessly as far as I can tell)
d stands for delete
i 7
i 8
i 1
d 8
i 4
i 10
d 4
i 11
This:
assert(n->left == NULL || n->right == NULL)
Is nowhere near this:
if (n->right != NULL || n->left != NULL)
Recheck your translation. The assert states that at least one of them must be NULL. Your if-expr evaluates true if either is not NULL. Your if-expr passes if both are non-NULL, where the assert would fail. Likewise, your if-expr fails if both are NULL, while the assert would pass.
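To see the mismatch concretely, here is a small sketch that evaluates both conditions for all four combinations of children. The function names are mine, and booleans stand in for "pointer is non-NULL".

```c
#include <stdbool.h>
#include <stdio.h>

/* Condition from the original assert: at least one child is missing. */
bool assert_cond(bool has_left, bool has_right)
{
    return !has_left || !has_right;     /* left == NULL || right == NULL */
}

/* Condition from the attempted translation: at least one child exists. */
bool translated_cond(bool has_left, bool has_right)
{
    return has_right || has_left;       /* right != NULL || left != NULL */
}

/* Print one row per combination of children. */
void print_table(void)
{
    int l, r;

    printf("left right assert if\n");
    for (l = 0; l <= 1; l++)
        for (r = 0; r <= 1; r++)
            printf("%4d %5d %6d %2d\n", l, r,
                   assert_cond(l != 0, r != 0),
                   translated_cond(l != 0, r != 0));
}
```

The two conditions agree only when exactly one child exists. With no children, which is the case that shows up when deleting a leaf, the assert holds but the translated if is false, so the deletion is skipped.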
Don't take shortcuts when doing this kind of thing. First, keep the asserts regardless of your added checks. Second, until it is up and working, copy the assert clauses verbatim into your if (expr), or use if (!(expr)) for bailout conditions.
verbatim check:
assert(n->left == NULL || n->right == NULL)
if (n->left == NULL || n->right == NULL)...
bailout check:
assert(n->left == NULL || n->right == NULL)
if (!(n->left == NULL || n->right == NULL))
bailout loud.
EDIT Translation of linked code with integrated if-expr
void rbtree_delete(rbtree t, void* key, compare_func compare)
{
node child;
node n = lookup_node(t, key, compare);
if (n == NULL) return; /* Key not found, do nothing */
if (n->left != NULL && n->right != NULL) {
/* Copy key/value from predecessor and then delete it instead */
node pred = maximum_node(n->left);
n->key = pred->key;
n->value = pred->value;
n = pred;
}
assert(n->left == NULL || n->right == NULL);
if (n->left == NULL || n->right == NULL) // << note: SAME as above
{
child = n->right == NULL ? n->left : n->right;
if (node_color(n) == BLACK) {
n->color = node_color(child);
delete_case1(t, n);
}
replace_node(t, n, child);
if (n->parent == NULL && child != NULL)
child->color = BLACK;
free(n);
}
verify_properties(t);
}

Resources