Deleting a node in a Binary Search Tree - C

I'm currently working on a school project where I have to write a few helper functions for Binary Search Trees. One of the functions removes a node from the tree. I'm trying to run some test cases but I can't seem to get them to work. I know the problem has something to do with how I'm using pointers, but I'm not quite sure where I'm going wrong.
Here is the code:
int removeBST (struct TreeNode **rootRef, int data)
{
struct TreeNode *current = *rootRef;
struct TreeNode *temp = current;
if (current == NULL)
return 0;
if (data < current->data)
{
current->left = removeBST (&current->left, data);
}
if (data > current->data)
{
current->right = removeBST (&current->right, data);
}
if (current->left == NULL || current->right == NULL)
return 0;
else
{
if (current->left == NULL) {
temp = current->right;
current = temp;
free (temp);
return 1;
}
else if (current->right == NULL) {
temp = current->left;
current = temp;
free (temp);
return 1;
}
temp = leftRoot (current->right);
current->data = temp->data;
current->right = removeBST (&current->right, temp->data);
}
return 1;
}
Note: I didn't include the leftRoot() function, but it's fairly simple and I know it does what it's supposed to (return the leftmost root in a subtree)
Here is the part of the code my professor gave us that tests the remove function:
for(i = -4; i < 25; i+=4)
{
n = removeBST(&bst, i);
if(!n) printf("remove did not find %d\n", i);
}
and in case it's necessary, here's the entire test code that creates the tree and inserts the data:
struct TreeNode* bst = NULL;
for(i = 0; i < 23; ++i)
{
n = (i*17+11) % 23;
bst = insertBST(bst, n);
}
printf("filled BST: ");
printTree(bst);
printf("BST leaves: ");
printLeaves(bst);
printf("BST depth = %d\n", maxDepth(bst));
printf("BST minimum value = %d\n", minValueBST(bst));
printf("BST isBST = %d\n", isBST(bst));
for(i = -4; i < 25; i+=4)
{
n = removeBST(&bst, i);
if(!n) printf("remove did not find %d\n", i);
}
the entire output is:
filled BST: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22
BST leaves: 0 6 12 17
BST depth = 8
BST minimum value = 0
BST isBST = 1
remove did not find -4
remove did not find 0
remove did not find 4
(this part repeats all the way up to 24)
BST after removes: 11
Since everything besides the '11' is no longer attached to the tree, I'm fairly certain that something in my program is assigning pointers where they shouldn't be assigned and tree nodes are just getting lost in the void. Any ideas?
EDIT: One piece of information I forgot to provide, the deleted node's left-most child is supposed to replace the deleted node.

I'm not sure that I've found all of the issues in your code but here is one major one:
int removeBST (struct TreeNode **rootRef, int data)
Your function returns an int, corroborated by a number of return 1 or return 0 statements...
And yet you do this:
if (data < current->data)
{
current->left = removeBST (&current->left, data);
}
if (data > current->data)
{
current->right = removeBST (&current->right, data);
}
Since current->left is a struct TreeNode *, &current->left is a struct TreeNode **, so the argument you pass does match the rootRef parameter. The problem is the assignment: removeBST returns an int, not a node pointer.
Which means that you're assigning the values 0 and 1 as addresses to the left and right pointers, so every recursive call overwrites a child link and detaches that subtree. This is very likely what is causing your problems.
Note: this is not a solution but it is too big to fit into a comment.
Since you've opted for recursion let me see if I can help you fix this somewhat...
int removeBST (struct TreeNode **rootRef, int data)
{
struct TreeNode *current = *rootRef;
struct TreeNode *temp = current;
if (current == NULL)
return 0;
if (data < current->data)
{
// We don't want to modify things here, just let the next
// call take care of it and return what it returns.
return removeBST(&current->left, data);
}
else if (data > current->data)
{
// Same here.
return removeBST(&current->right, data);
}
else
{
if (current->left == NULL) {
temp = current->right;
// The rest of the stuff from here moved below.
// Because I added the else, the return isn't needed
// here anymore either, since the one at the bottom
// will return 1 anyway.
}
else if (current->right == NULL) {
temp = current->left;
// I did the same here.
}
else {
// Two children: copy the value of the leftmost node of the
// right subtree into this node, then remove that leftmost
// node from the right subtree instead. This node stays in
// the tree, so return here rather than falling through to
// the replace-and-free code below.
temp = leftRoot (current->right);
current->data = temp->data;
return removeBST (&current->right, temp->data);
}
*rootRef = temp; // current and rootRef are not the same.
// You need to use rootRef here so that we
// move the temp pointer to the current one
// (replace it). Think carefully about where
// the pointers are! Pointers also have addresses
// and it matters what address you write to
// where, use pen and paper and draw where things
// point!
free (current); // this means that we can't delete temp! so
// since, we've just deleted the "current"
// pointer we should discard it too...
}
return 1;
}
Draw a diagram for your pointers. I find diagrams like this or this help me most. It is not embarrassing and will help you understand what you're writing. It is important to visualize these things, particularly when you're just learning.
I've tried to fix the code up a little. I will admit that I didn't spend as much time as I possibly should have proof-reading it but it should be ok enough to give you an idea about the solution. Don't just copy/paste this, I don't guarantee it will work. But it should help you get onto the right path.
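For reference, here is a minimal sketch of the whole function with the three cases spelled out. It is not the asker's final code: the struct TreeNode layout and the body of leftRoot() are assumptions, since neither is shown in the question, and it assumes the tree holds no duplicate values.

```c
#include <stdlib.h>

/* Assumed node layout; the question does not show the struct. */
struct TreeNode {
    int data;
    struct TreeNode *left;
    struct TreeNode *right;
};

/* Stand-in for the leftRoot() helper the question mentions:
   returns the leftmost node of a non-empty subtree. */
static struct TreeNode *leftRoot(struct TreeNode *node)
{
    while (node->left != NULL)
        node = node->left;
    return node;
}

/* Returns 1 if data was found and removed, 0 otherwise. */
int removeBST(struct TreeNode **rootRef, int data)
{
    struct TreeNode *current = *rootRef;

    if (current == NULL)
        return 0;
    if (data < current->data)
        return removeBST(&current->left, data);
    if (data > current->data)
        return removeBST(&current->right, data);

    if (current->left != NULL && current->right != NULL) {
        /* Two children: take the successor's value, then remove the
           successor node from the right subtree. */
        struct TreeNode *succ = leftRoot(current->right);
        current->data = succ->data;
        return removeBST(&current->right, succ->data);
    }

    /* Zero or one child: splice the node out by writing through the
       caller's pointer, then free it. */
    *rootRef = (current->left != NULL) ? current->left : current->right;
    free(current);
    return 1;
}
```

The key point is that every change to the tree shape goes through *rootRef, which is the parent's own left or right field (or the caller's root variable), never through a local copy.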

Related

AVL tree not balancing correctly

I have an assignment to write a self balancing binary search tree. I decided to use an AVL tree as it is what we have discussed in class. Then with the given input of { 3, 5, 61, 9, 32, 7, 1, 45, 26, 6} I'm expecting an output of:
7
6-----|-----32
3----| 9----|----45
1---| |---26 |---61
That is, unless I have grossly misunderstood and thus miscalculated what an AVL tree is supposed to do when it balances itself. The output I'm getting is quite different:
5
3-----|-----61
1----| 9----|
7---|---32
6--| 26--|--45
Again, unless I am completely wrong, that tree is not balanced. The function I'm using to set up the tree is defined as such:
node* insertKeyAVL(node* n, int e)
{
int cmpVal;
if (n == NULL){
n = create_node();
n->data = e;
} else if (e < n->data) {
if (n->left == NULL){
n->left = create_node();
n->left->data = e;
n->left->parent = n;
} else {
n->left = insertKeyAVL(n->left, e);
}
cmpVal = height(n->left) - height(n->right);
} else {
if (n->right == NULL){
n->right = create_node();
n->right->data = e;
n->right->parent = n;
} else {
n->right = insertKeyAVL(n->right, e);
}
cmpVal = height(n->right) - height(n->left);
}
if (cmpVal > 2){
if (n->left){
if (e < n->left->data)
n = rotate_left(n);
else
n = rotate_right_left(n);
} else if (n->right){
if (e > n->right->data)
n = rotate_right(n);
else
n = rotate_left_right(n);
}
}
n->height = max(height(n->left), height(n->right)) + 1;
return n;
}
The structure I'm using to store all the data is defined as such:
typedef struct node
{
struct node *parent;
struct node* left;
struct node* right;
int data;
int height;
} node;
The functions rotate_left_right and rotate_right_left are basic functions that rotate the direction of the first post-fix then the second post-fix, and are both reliant on rotate_left and rotate_right for their respective direction. rotate left is defined as such:
node* rotate_left(node* n)
{
node* tmp = n->left;
n->left = tmp->right;
tmp->right = n;
tmp->parent = n->parent;
n->parent = tmp;
n->height = max(height(n->left), height(n->right)) + 1;
tmp->height = max(height(tmp->left), n->height) + 1;
return tmp;
}
rotate_right is similar but adjusted for a rotation right.
I'm wondering where this code messes up so that it doesn't produce the desired output.
When you add 26 to your expected tree, cmpVal for node 5 becomes 2, which is not a valid balance; that is why the code rebalances and produces the result you are seeing.
I don't have a full answer and I'm not sure your code is salvageable, since it is missing a lot of pieces. The main surprise is that cmpVal is never initialized when n is NULL on entry, yet it is then compared to 2; reading it on that path is undefined behavior.
cmpVal is the balance of AVL but with an absolute value. Unfortunately when rebalancing you check the existence of a left child or of a right child. But this tells you nothing. You need to know the sign of the balance in order to choose the rotation direction. You can have both children and still need to balance.
Your insertion looks strange because after checking that the node is not NULL you check the children for the same thing. But recursion here would have saved the two checks entirely by performing the check for you.
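To make the point about the sign of the balance concrete, here is a minimal sketch of an AVL insert that rebalances on a signed balance factor. It is not a repair of the question's code: the avl_node type and all avl_* names are hypothetical, the parent pointer is dropped for brevity, and the rotations are named by the direction the old root moves.

```c
#include <stdlib.h>

/* Hypothetical minimal node, without the parent pointer from the question. */
typedef struct avl_node {
    struct avl_node *left, *right;
    int data;
    int height;                      /* a leaf has height 1, NULL has height 0 */
} avl_node;

static int avl_height(const avl_node *n) { return n ? n->height : 0; }
static int avl_max(int a, int b) { return a > b ? a : b; }
static void avl_update(avl_node *n)
{
    n->height = avl_max(avl_height(n->left), avl_height(n->right)) + 1;
}

/* Signed balance: positive means left-heavy, negative means right-heavy. */
static int avl_balance(const avl_node *n)
{
    return n ? avl_height(n->left) - avl_height(n->right) : 0;
}

static avl_node *avl_rotate_right(avl_node *n)   /* left child moves up */
{
    avl_node *tmp = n->left;
    n->left = tmp->right;
    tmp->right = n;
    avl_update(n);
    avl_update(tmp);
    return tmp;
}

static avl_node *avl_rotate_left(avl_node *n)    /* right child moves up */
{
    avl_node *tmp = n->right;
    n->right = tmp->left;
    tmp->left = n;
    avl_update(n);
    avl_update(tmp);
    return tmp;
}

avl_node *avl_insert(avl_node *n, int e)
{
    if (n == NULL) {                 /* the recursion handles the empty slot */
        avl_node *fresh = malloc(sizeof *fresh);
        if (fresh == NULL)
            return NULL;
        fresh->left = fresh->right = NULL;
        fresh->data = e;
        fresh->height = 1;
        return fresh;
    }
    if (e < n->data)
        n->left = avl_insert(n->left, e);
    else
        n->right = avl_insert(n->right, e);
    avl_update(n);

    int bal = avl_balance(n);
    if (bal > 1) {                           /* left-heavy */
        if (avl_balance(n->left) < 0)        /* left-right case */
            n->left = avl_rotate_left(n->left);
        return avl_rotate_right(n);
    }
    if (bal < -1) {                          /* right-heavy */
        if (avl_balance(n->right) > 0)       /* right-left case */
            n->right = avl_rotate_right(n->right);
        return avl_rotate_left(n);
    }
    return n;
}
```

Note that the rotation is chosen from the sign of the balance at the node and at its heavy child, not from which children happen to exist, and that a rebalance is triggered as soon as the absolute balance reaches 2.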

C - Segfault when accessing struct member in a HashTable (insert function)

I am new to C and am having issues implementing an insert function for my HashTable.
Here are my structs:
typedef struct HashTableNode {
char *url; // url previously seen
struct HashTableNode *next; // pointer to next node
} HashTableNode;
typedef struct HashTable {
HashTableNode *table[MAX_HASH_SLOT]; // actual hashtable
} HashTable;
Here is how I init the table:
HashTable *initTable(){
HashTable* d = (HashTable*)malloc(sizeof(HashTable));
int i;
for (i = 0; i < MAX_HASH_SLOT; i++) {
d->table[i] = NULL;
}
return d;
}
Here is my insert function:
int HashTableInsert(HashTable *table, char *url){
long int hashindex = JenkinsHash(url, MAX_HASH_SLOT);
int uniqueBool = 2; // 0 for true, 1 for false, 2 for init
HashTableNode* theNode = (HashTableNode*)malloc(sizeof(HashTableNode));
theNode->url = url;
if (table->table[hashindex] != NULL) { // if we have a collision
HashTableNode* currentNode = (HashTableNode*)malloc(sizeof(HashTableNode));
currentNode = table->table[hashindex]->next; // the next node in the list
if (currentNode == NULL) { // only one node currently in list
if (strcmp(table->table[hashindex]->url, theNode->url) != 0) { // unique node
table->table[hashindex]->next = theNode;
return 0;
}
else{
printf("Repeated Node\n");
return 1;
}
}
else { // multiple nodes in this slot
printf("There was more than one element in this slot to start with. \n");
while (currentNode != NULL)
{
// SEGFAULT when accessing currentNode->url HERE
if (strcmp(currentNode->url, table->table[hashindex]->url) == 0 ){ // same URL
uniqueBool = 1;
}
else{
uniqueBool = 0;
}
currentNode = currentNode->next;
}
}
if (uniqueBool == 0) {
printf("Unique URL\n");
theNode->next = table->table[hashindex]->next; // splice current node in
table->table[hashindex]->next = theNode; // needs to be a node for each slot
return 0;
}
}
else{
printf("simple placement into an empty slot\n");
table->table[hashindex] = theNode;
}
return 0;
}
I get SegFault every time I try to access currentNode->url (the next node in the linked list of a given slot), which SHOULD have a string in it if the node itself is not NULL.
I know this code is a little dicey, so thank you in advance to anyone up for the challenge.
Chip
UPDATE:
this is the function that calls all ht functions. Through my testing on regular strings in main() of hash table.c, I have concluded that the segfault is due to something here:
void crawlPage(WebPage * page){
char * new_url = NULL;
int pos= 0;
pos = GetNextURL(page->html, pos, URL_PREFIX, &new_url);
while (pos != -1){
if (HashTableLookup(URLsVisited, new_url) == 1){ // url not in table
printf("url is not in table......\n");
hti(URLsVisited, new_url);
WebPage * newPage = (WebPage*) calloc(1, sizeof(WebPage));
newPage->url = new_url;
printf("Adding to LIST...\n");
add(&URLList, newPage); // added & to it.. no seg fault
}
else{
printf("skipping url cuz it is already in table\n");
}
new_url = NULL;
pos = GetNextURL(page->html, pos, URL_PREFIX, &new_url);
}
printf("freeing\n");
free(new_url); // cleanup
free(page); // free current page
}
Your hash table insertion logic violates some rather fundamental rules.
1. Allocating a new node before determining you actually need one.
2. Blatant memory leak in your currentNode allocation.
3. Suspicious ownership semantics of the url pointer.
Beyond that, this algorithm is being made way too complicated for what it really should be.
1. Compute the hash index via hash-value modulo the table size.
2. Start at the table slot of the hash index, walking node pointers until one of two things happens:
   1. You discover the node is already present.
   2. You reach the end of the collision chain.
Only in #2 above do you actually allocate a collision node and chain it to your existing collision list. Most of this is trivial when employing a pointer-to-pointer approach, which I demonstrate below:
int HashTableInsert(HashTable *table, const char *url)
{
// find collision list starting point
long int hashindex = JenkinsHash(url, MAX_HASH_SLOT);
HashTableNode **pp = table->table+hashindex;
// walk the collision list looking for a match
while (*pp && strcmp(url, (*pp)->url))
pp = &(*pp)->next;
if (!*pp)
{
// no matching node found. insert a new one.
HashTableNode *pNew = malloc(sizeof *pNew);
pNew->url = strdup(url);
pNew->next = NULL;
*pp = pNew;
}
else
{ // url already in the table
printf("url \"%s\" already present\n", url);
return 1;
}
return 0;
}
That really is all there is to it.
The url ownership issue I mentioned earlier is addressed above via string duplication using strdup(). Although it was not a standard C library function before C23, it is POSIX compliant and every non-neanderthal half-baked implementation I've seen in the last two decades provides it. If yours doesn't, (a) I'd like to know what you're using, and (b) it's trivial to implement with strlen and malloc. Regardless, when the nodes are being released during value removal or table wiping, be sure to free a node's url before freeing the node itself.
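As a rough sketch of both points, assuming the HashTableNode layout from the question: my_strdup and freeChain are hypothetical names, not part of any library, and freeChain assumes every url in the chain was allocated with malloc (for example by strdup).

```c
#include <stdlib.h>
#include <string.h>

typedef struct HashTableNode {
    char *url;                       /* owned copy of the url */
    struct HashTableNode *next;
} HashTableNode;

/* Hypothetical fallback for platforms without strdup(). */
char *my_strdup(const char *s)
{
    size_t len = strlen(s) + 1;      /* include the terminating '\0' */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;                     /* NULL if allocation failed */
}

/* Releases one collision chain: the url first, then the node. */
void freeChain(HashTableNode *node)
{
    while (node != NULL) {
        HashTableNode *next = node->next;   /* save before freeing */
        free(node->url);
        free(node);
        node = next;
    }
}
```

Wiping the whole table would then be a loop over the slots that calls freeChain on each one and sets the slot back to NULL.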
Best of luck.

linked list delete function

The following code snippet is not working right.
void deleteNode(list **start, int pos) {
int currentPosition=0;
list *currentNode;
list *nodToDelete;
currentNode = *start;
if (currentNode == NULL) {
printf("Empty List\n");
} else if (pos == 0 ) {
nodToDelete = *start;
*start = nodToDelete->next;
free(nodToDelete);
} else {
while (currentNode->next != NULL) {
if (currentPosition >= pos -1) {
break;
}
currentPosition++;
currentNode = currentNode->next;
}
if (currentPosition < pos -1 || currentNode->next == NULL) {
printf("No node at given position exists\n");
} else {
nodToDelete = currentNode->next;
currentNode = nodToDelete->next;
free(nodToDelete);
nodToDelete = NULL;
}
}
}
void displayList(list *node) {
if (node == NULL) {
printf("Empty List");
}
while (node != NULL) {
printf("%d\t", node->data);
node = node->next;
}
printf("\n");
}
int main()
{
list *start, *node;
start = NULL;
insertNode(&start, 2);
insertNode(&start, 3);
insertNode(&start, 4);
insertNode(&start, 1);
insertNode(&start, 5);
deleteNode(&start, 3);
displayList(start);
}
When executed the output is
Before Deletion 2 3 4 1 5
After Deletion 2 3 4 0 5
It is supposed to delete 1 but it is inserting 0 at its place.
Here is something that might work --
Replace
currentNode = nodToDelete->next;
with
currentNode->next = nodToDelete->next;
You basically need the node before the nodetodelete to have its next to point to the node that nodetodelete used to point to
Once you've found the node you want to take out of the list, you need to actually take it out. =)
...
nodToDelete = currentNode->next;
currentNode->next = nodToDelete->next;
free(nodToDelete);
...
Besides the problem with currentNode->next = nodToDelete->next; and negative positions, you are mixing your UI and your logic. As much as possible you should separate the two.
Sending something to the UI is a way of reporting progress, whether the UI is a command line, a browser or a speaker. Within deleteNode, an empty list or a position that is out of bounds is not progress. Sequentially both are the same as success: you are done. If you want failure to be reported, that should be done where it can lead to a separate sequence, i.e. in the caller. Also, by mixing in UI, you introduce an unnecessary dependency and failure point (what if there's a bug in printf? YOUR function will crash when it doesn't have to). If your function returns a defined result, the caller can decide if and how to report that result, including success (your function currently doesn't do so, and the caller has no way of telling the difference between success and failure).
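One way this could look, as a sketch: the function below prints nothing and returns a status code instead. The list struct layout and the choice of 0 and -1 as return values are assumptions, since the question shows neither.

```c
#include <stdlib.h>

/* Assumed node layout; the question does not show the struct. */
typedef struct list {
    int data;
    struct list *next;
} list;

/* Returns 0 on success, -1 if pos is negative or past the end of the list. */
int deleteNode(list **start, int pos)
{
    list **link = start;             /* the pointer that leads to the current node */

    if (pos < 0)
        return -1;
    while (*link != NULL && pos > 0) {
        link = &(*link)->next;
        pos--;
    }
    if (*link == NULL)
        return -1;                   /* empty list or position out of range */

    list *nodeToDelete = *link;
    *link = nodeToDelete->next;      /* unlink before freeing */
    free(nodeToDelete);
    return 0;
}
```

Walking a pointer to the link rather than a pointer to the node also removes the special case for pos == 0, and the caller can decide whether a -1 is worth a message.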

Help inserting a list of values into a binary tree..?

Well, I've been at it for a while...trying to figure out an algorithm to insert my list of random numbers into a binary tree.
This is what I have gotten so far:
NodePtr and Tree are pointers to a node
NodePtr CreateTree(FILE * fpData)
{
int in;
fscanf(fpData, "%i", &in);
Tree T = (NodePtr)malloc(sizeof(Node));
T->Left = NULL;
T->Right = NULL;
T->value = in;
while((fscanf(fpData, "%i", &in)) != EOF)
{
InsertInTree(in, T);
printf("\n %p", T);
}
return T;
}
void InsertInTree(int value,Tree T)
{
if(T == NULL)
{
T->Left = (NodePtr)malloc(sizeof(Node));
T->Left->Left = NULL;
T->Left->Right = NULL;
T->Left->value = value;
printf("\n %i ", value);
return;
}
if(T->Left == NULL)
{
InsertInNull(value, T->Left);
}
else if(T->Right == NULL)
{
InsertInNull(value, T->Right);
}
else
{
if(T->Left->Left == NULL || T->Left->Right == NULL) InsertInTree(value, T->Left);
else InsertInTree(value, T->Right);
}
}
I'm lost on what to do if the both children of a particular node are not null. What I did here works for a small amount of numbers (1,2,3,5,6) but if the list is larger it becomes unbalanced and wrong.
Is it meant to be a search tree? I don't see any if (value < T->value) conditions.
And you have an InsertInNull function (not shown). That shouldn't be necessary; one function should be enough.
To address your main problem, use a pointer-to-pointer parameter or, more elegant, always return a new Tree:
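// untested, no balancing; needs <stdlib.h> for malloc
Tree InsertValue(Tree t, int value)
{
if (t == NULL)
{
// empty spot: create the new node here
t = malloc(sizeof(Node));
if (t == NULL)
return NULL; // allocation failed
t->Left = NULL;
t->Right = NULL;
t->value = value;
}
else if (value < t->value)
t->Left = InsertValue(t->Left, value);
else
t->Right = InsertValue(t->Right, value);
return t;
}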
And in CreateTree:
Tree t = InsertValue(NULL, in);
Since the assignment is not for a sorted tree, you can populate it in a breadth-first manner. This means the first thing inserted is always the root, the next is the first node at the next level so it looks like this:
0
1 2
3 4 5 6
Here is an article that explains it further:
http://www.cs.bu.edu/teaching/c/tree/breadth-first/
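A minimal sketch of filling a tree in that level order follows. Note that it swaps in a different technique from the queue-based traversal the linked article describes: it numbers the positions 1, 2, 3, ... in level order and uses the binary digits of the position to walk from the root to the slot. InsertLevelOrder is a hypothetical name, and the Node layout is assumed from the field names used in the question.

```c
#include <stdlib.h>

/* Assumed layout, based on the field names used in the question. */
typedef struct Node {
    int value;
    struct Node *Left;
    struct Node *Right;
} Node;

/* Inserts value at level-order position index (1 = root, 2 and 3 = its
   children, and so on). Positions must be filled in increasing order.
   Returns 0 on success, -1 on a bad position or allocation failure. */
int InsertLevelOrder(Node **root, unsigned index, int value)
{
    if (index == 0)
        return -1;

    /* Find the highest set bit of index. */
    unsigned mask = 1;
    while ((mask << 1) != 0 && (mask << 1) <= index)
        mask <<= 1;

    /* The bits below the highest one spell the path: 0 = left, 1 = right. */
    Node **link = root;
    for (mask >>= 1; mask != 0; mask >>= 1) {
        if (*link == NULL)
            return -1;               /* a parent position was skipped */
        link = (index & mask) ? &(*link)->Right : &(*link)->Left;
    }
    if (*link != NULL)
        return -1;                   /* position already occupied */

    Node *fresh = malloc(sizeof *fresh);
    if (fresh == NULL)
        return -1;
    fresh->value = value;
    fresh->Left = fresh->Right = NULL;
    *link = fresh;
    return 0;
}
```

In CreateTree this would mean keeping a counter of how many values have been read and passing the counter as the position for each new value.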
Simple insertion into a binary tree and keeping a binary tree balanced are different problems. I suggest you start with the first problem and just focus on keeping the ordering properties correct within the tree. You are not far from that.
Then you should have a look at classical implementations of red-black trees, a well-studied and efficient way of keeping trees balanced, at the cost of more complexity.

I can't seem to delete the simplest case on a Binary Search Tree in C

I posted about this last year because of a university project, and now I have to do it again (I never finished what I had to do last year). I've already looked at my previous code and at all of your answers to those questions, but I still can't seem to understand this.
I'm not going to put all my questions in one long post, it just makes everything more confusing and I need to understand this once and for all.
I'm working with the simplest BST possible (just an integer for element) and I'm trying to delete a node from the tree in its simplest case: deleting a leaf.
The tree elements I'm testing with are inserted in the following order: 7 3 10 2 5 1 6 9 4 8
And the output from in-order printing is, of course: 1 2 3 4 5 6 7 8 9 10
This is my Tree structure:
typedef int TreeElement;
typedef struct sTree {
TreeElement item;
struct sTree *left;
struct sTree *right;
} Tree;
And this is my delete function:
int delete(Tree **tree, TreeElement item) {
if(!*tree) return 1;
Tree *currPtr = *tree;
Tree *prevPtr = NULL;
while(currPtr) {
if(item < currPtr->item) {
prevPtr = currPtr;
currPtr = currPtr->left;
} else if(item > currPtr->item) {
prevPtr = currPtr;
currPtr = currPtr->right;
} else {
if(!currPtr->left && !currPtr->right) {
currPtr = NULL;
}
free(currPtr);
return 0;
}
}
return 1;
}
I can't understand why but this is not working... As far as I understand it, I'm searching for the element to delete correctly. When I find it, I check if this node is a leaf by checking the left and right child are "empty". Which they are for my test case (trying to delete node 1).
When I try to delete node 1 with the code above, I still get the in-order printing as I posted above. If I remove the currPtr = NULL from the if(!currPtr->left && !currPtr->right) block, I get the following for the in-order printing: 0 2 3 4 5 6 7 8 9 10.
I'm not understanding any of this...
What am I missing in the code above so I can correctly delete a node that is a leaf? This is the simplest case of deleting a node in a BST, yet I'm having so much trouble doing it that it's driving me crazy.
In this case you are changing the value of the local variable that holds the current pointer, rather than the pointer in the parent node that points to it. What you would actually want is something like
int delete(Tree **tree, TreeElement item) {
if(!*tree) return 1;
Tree *currPtr = *tree;
Tree *prevPtr = NULL;
bool fromLeft = false; // needs #include <stdbool.h>
while(currPtr) {
if(item < currPtr->item) {
prevPtr = currPtr;
currPtr = currPtr->left;
fromLeft = true;
} else if(item > currPtr->item) {
prevPtr = currPtr;
currPtr = currPtr->right;
fromLeft = false;
} else {
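if(!currPtr->left && !currPtr->right) {
if(!prevPtr)
*tree = NULL; // the leaf is the root itself
else if( fromLeft )
prevPtr->left = NULL;
else
prevPtr->right = NULL;
free(currPtr);
return 0;
}
return 1; // found, but not a leaf: not handled here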
}
}
return 1;
}
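As an alternative sketch, the prevPtr and fromLeft bookkeeping can be avoided by walking a pointer to the link itself, so that writing through it updates the parent (or the root variable) directly. This version also covers nodes with one or two children. deleteItem is a hypothetical name; the return convention (0 on success, 1 if not found) is taken from the question.

```c
#include <stdlib.h>

typedef int TreeElement;
typedef struct sTree {
    TreeElement item;
    struct sTree *left;
    struct sTree *right;
} Tree;

/* Returns 0 if item was found and removed, 1 otherwise. */
int deleteItem(Tree **tree, TreeElement item)
{
    Tree **link = tree;   /* address of the pointer that leads to the current node */

    while (*link != NULL && (*link)->item != item)
        link = (item < (*link)->item) ? &(*link)->left : &(*link)->right;
    if (*link == NULL)
        return 1;                    /* not found */

    Tree *node = *link;
    if (node->left != NULL && node->right != NULL) {
        /* Two children: copy the smallest item of the right subtree here,
           then remove that node instead (it has no left child). */
        link = &node->right;
        while ((*link)->left != NULL)
            link = &(*link)->left;
        node->item = (*link)->item;
        node = *link;
    }

    /* node now has at most one child: splice it out through the link. */
    *link = (node->left != NULL) ? node->left : node->right;
    free(node);
    return 0;
}
```

For a leaf, the splice simply writes NULL into the parent's left or right field, which is exactly the step the original code was missing.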
A binary tree is a recursively defined data type, and deletion is far easiest with a recursive function. I don't even want to try to debug your tricky iterative solution; the mistake you are making is not a minor code mistake; the mistake is that you are using the wrong ideas for the job.
Here is some completely untested code:
Tree *delete(TreeElement item, Tree *tree) {
if (tree == NULL)
return tree;
else if (item < tree->item)
tree->left = delete(item, tree->left);
else if (item > tree->item)
tree->right = delete(item, tree->right);
else { // here comes the only interesting case
// if one child is NULL, move the other up
// otherwise grab an item from one subtree and recurse
Tree *answer;
if (tree->right == NULL) {
answer = tree->left;
free(tree);
return answer;
} else if (tree->left == NULL) {
answer = tree->right;
free(tree);
return answer;
} else {
// use the largest item of the left subtree (the in-order
// predecessor) so the ordering is preserved; using the
// smallest item of the right subtree would work equally well
Tree *pred = tree->left;
while (pred->right != NULL)
pred = pred->right;
tree->item = pred->item;
tree->left = delete(pred->item, tree->left);
}
}
return tree; // the (possibly updated) subtree root
}
