How to find swapped nodes in a binary search tree? - c

This is an interview question.
A binary search tree is given in which the values of two nodes have been swapped. The question is how to find both nodes and the swapped values in a single traversal of the tree.
I have tried to solve this using the code below, but I am not able to stop the recursion, so I get a segmentation fault. Please help me understand how to stop the recursion.
#include <stdio.h>
#include <stdlib.h>
#include <limits.h>
#include <stdlib.h>
/* A binary tree node has data, pointer to left child
and a pointer to right child */
struct node
{
int data;
struct node* left;
struct node* right;
};
/* Helper function that allocates a new node with the
given data and NULL left and right pointers. */
struct node* newNode(int data)
{
struct node* node = (struct node*)
malloc(sizeof(struct node));
node->data = data;
node->left = NULL;
node->right = NULL;
return(node);
}
void modifiedInorder(struct node *root, struct node **nextNode)
{
static int nextdata=INT_MAX;
if(root)
{
modifiedInorder(root->right, nextNode);
if(root->data > nextdata)
return;
*nextNode = root;
nextdata = root->data;
modifiedInorder(root->left, nextNode);
}
}
void inorder(struct node *root, struct node *copyroot, struct node **prevNode)
{
static int prevdata = INT_MIN;
if(root)
{
inorder(root->left, copyroot, prevNode);
if(root->data < prevdata)
{
struct node *nextNode = NULL;
modifiedInorder(copyroot, &nextNode);
int data = nextNode->data;
nextNode->data = (*prevNode)->data;
(*prevNode)->data = data;
return;
}
*prevNode = root;
prevdata = root->data;
inorder(root->right, copyroot, prevNode);
}
}
/* Given a binary tree, print its nodes in inorder*/
void printInorder(struct node* node)
{
if (node == NULL)
return;
/* first recur on left child */
printInorder(node->left);
/* then print the data of node */
printf("%d ", node->data);
/* now recur on right child */
printInorder(node->right);
}
int main()
{
/* 4
/ \
2 3
/ \
1 5
*/
struct node *root = newNode(1); // newNode will return a node.
root->left = newNode(2);
root->right = newNode(3);
root->left->left = newNode(4);
root->left->right = newNode(5);
printf("Inorder Traversal of the original tree\n ");
printInorder(root);
struct node *prevNode=NULL;
inorder(root, root, &prevNode);
printf("\nInorder Traversal of the fixed tree \n");
printInorder(root);
return 0;
}

Walk the tree using an inorder traversal. This visits the elements in sorted order, so an element that is greater than both of its neighbours has been swapped.
For example, consider the binary tree below:
              _ 20 _
             /      \
          15          30
         /  \        /  \
       10    17    25    33
      / |   / \   / \    | \
     9  16 12 18 22 26  31  34
First, we linearize this into an array and we get
9 10 16 15 12 17 18 20 22 25 26 30 31 33 34
Now you can notice that 16 is greater than its surrounding elements and that 12 is less than them. This immediately tells us that 12 and 16 were swapped.
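The array view above can be sketched in C as follows. This is only an illustration under the assumption that the inorder values have already been copied into an array of distinct ints; find_swapped is a hypothetical helper name, not something from the answer. It remembers the element before the first descent and the element after the last descent, which also covers the case where the two swapped values are neighbours.

```c
#include <assert.h>
#include <stddef.h>

/* Finds the two positions whose values were swapped in an otherwise
   strictly ascending array. Returns 1 and stores the indices in *i and *j
   when a descent is found, or 0 when the array is already sorted.
   (find_swapped is a hypothetical name used only for this sketch.) */
int find_swapped(const int *a, size_t n, size_t *i, size_t *j)
{
    assert(a != NULL && i != NULL && j != NULL);
    int found = 0;
    for (size_t k = 1; k < n; k++) {
        if (a[k] < a[k - 1]) {
            if (!found) {
                *i = k - 1;   /* first descent: the larger value is misplaced */
                found = 1;
            }
            *j = k;           /* latest descent: the smaller value is misplaced */
        }
    }
    return found;
}
```

For the array 9 10 16 15 12 17 ... shown above, the first descent is 16 > 15 and the last descent is 15 > 12, so the function reports the positions of 16 and 12.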

The following function checks whether a tree is a BST by recursing into the left and right subtrees while tightening the bounds.
I believe it can be modified to achieve the above task:
Instead of returning false, return temp, i.e. a pointer to the node that prevents the tree from being a BST.
There would be two such instances, which together give both swapped values.
EDIT: We would need to distinguish between the recursive function returning true and it returning a pointer to a swapped node.
This assumes that there are only two such values, as stated in the problem definition.
bool validate_bst(tnode *temp, int min, int max)
{
if(temp == NULL)
return true;
if(temp->data > min && temp->data < max)
{
if( validate_bst(temp->left, min, temp->data) &&
validate_bst(temp->right, temp->data, max) )
return true;
}
return false;
}
The main would call above api like this
validate_bst(root, -1, 100); // Basically we pass -1 as min and 100 as max in
// this instance

We can solve this in O(n) time with a single traversal of the given BST. Since an inorder traversal of a BST always yields a sorted sequence, the problem reduces to finding two swapped elements in an otherwise sorted array. There are two cases that we need to handle:
1. The swapped nodes are not adjacent in the inorder traversal of the BST.
       6
     /   \
   10      2
  /  \    /  \
 1    3  7    12
For example, nodes 10 and 2 are swapped in the tree above.
The inorder traversal of the given tree is 1 10 3 6 7 2 12.
If we observe carefully, during the inorder traversal we find that node 3 is smaller than the previously visited node 10. Here we save the context of node 10 (the previous node). Later we find that node 2 is smaller than the previous node 7. This time we save the context of node 2 (the current node). Finally, we swap the two nodes' values.
2. The swapped nodes are adjacent in the inorder traversal of the BST.
       6
     /   \
    3     10
  /  \   /  \
 1    2 7    12
For example, nodes 2 and 3 are swapped in the tree above.
The inorder traversal of the given tree is 1 3 2 6 7 10 12.
Unlike case #1, there is only one point where a node's value is smaller than the previous node's value: node 2 is smaller than node 3. The two nodes to swap are then the previous node (3) and the current node (2) at that single point.
/**
* Definition for a binary tree node.
* struct TreeNode {
* int val;
* TreeNode *left;
* TreeNode *right;
* TreeNode(int x) : val(x), left(NULL), right(NULL) {}
* };
*/
class Solution {
public:
void recoverTreeUtil(TreeNode *root, TreeNode **first, TreeNode **middle, TreeNode **last, TreeNode **prev) {
if (root) {
recoverTreeUtil(root->left, first, middle, last, prev);
if (*prev && (*prev)->val > root->val) {
if (!(*first)) {
*first = *prev;
*middle = root;
} else *last = root;
}
*prev = root;
recoverTreeUtil(root->right, first, middle, last, prev);
}
}
void recoverTree(TreeNode* root) {
TreeNode *first, *middle, *last, *prev;
first = middle = last = prev = nullptr;
recoverTreeUtil(root, &first, &middle, &last, &prev);
if (first && last) swap(first->val, last->val);
else if (first && middle) swap(first->val, middle->val);
}
};

I found another solution to this question on GeeksforGeeks. You can look at this thread for more explanation of the code below: http://www.geeksforgeeks.org/archives/23616
// Two nodes in the BST's swapped, correct the BST.
#include <stdio.h>
#include <stdlib.h>
/* A binary tree node has data, pointer to left child
and a pointer to right child */
struct node
{
int data;
struct node *left, *right;
};
// A utility function to swap two integers
void swap( int* a, int* b )
{
int t = *a;
*a = *b;
*b = t;
}
/* Helper function that allocates a new node with the
given data and NULL left and right pointers. */
struct node* newNode(int data)
{
struct node* node = (struct node *)malloc(sizeof(struct node));
node->data = data;
node->left = NULL;
node->right = NULL;
return(node);
}
// This function does inorder traversal to find out the two swapped nodes.
// It sets three pointers, first, middle and last. If the swapped nodes are
// adjacent to each other, then first and middle contain the resultant nodes
// Else, first and last contain the resultant nodes
void correctBSTUtil( struct node* root, struct node** first,
struct node** middle, struct node** last,
struct node** prev )
{
if( root )
{
// Recur for the left subtree
correctBSTUtil( root->left, first, middle, last, prev );
// If this node is smaller than the previous node, it's violating
// the BST rule.
if (*prev && root->data < (*prev)->data)
{
// If this is first violation, mark these two nodes as
// 'first' and 'middle'
if ( !*first )
{
*first = *prev;
*middle = root;
}
// If this is second violation, mark this node as last
else
*last = root;
}
// Mark this node as previous
*prev = root;
// Recur for the right subtree
correctBSTUtil( root->right, first, middle, last, prev );
}
}
// A function to fix a given BST where two nodes are swapped. This
// function uses correctBSTUtil() to find out two nodes and swaps the
// nodes to fix the BST
void correctBST( struct node* root )
{
// Initialize pointers needed for correctBSTUtil()
struct node *first, *middle, *last, *prev;
first = middle = last = prev = NULL;
// Set the pointers to find out two nodes
correctBSTUtil( root, &first, &middle, &last, &prev );
// Fix (or correct) the tree
if( first && last )
swap( &(first->data), &(last->data) );
else if( first && middle ) // Adjacent nodes swapped
swap( &(first->data), &(middle->data) );
// else nodes have not been swapped, passed tree is really BST.
}
/* A utility function to print Inorder traversal */
void printInorder(struct node* node)
{
if (node == NULL)
return;
printInorder(node->left);
printf("%d ", node->data);
printInorder(node->right);
}
/* Driver program to test above functions*/
int main()
{
/* 6
/ \
10 2
/ \ / \
1 3 7 12
10 and 2 are swapped
*/
struct node *root = newNode(6);
root->left = newNode(10);
root->right = newNode(2);
root->left->left = newNode(1);
root->left->right = newNode(3);
root->right->right = newNode(12);
root->right->left = newNode(7);
printf("Inorder Traversal of the original tree \n");
printInorder(root);
correctBST(root);
printf("\nInorder Traversal of the fixed tree \n");
printInorder(root);
return 0;
}
Output:
Inorder Traversal of the original tree
1 10 3 6 7 2 12
Inorder Traversal of the fixed tree
1 2 3 6 7 10 12
Time Complexity: O(n)
For more test cases please refer to this link http://ideone.com/uNlPx

My C++ Solution:
struct node *correctBST( struct node* root )
{
//add code here.
if(!root)return root;
struct node* r = root;
stack<struct node*>st;
// cout<<"1"<<endl;
struct node* first = NULL;
struct node* middle = NULL;
struct node* last = NULL;
struct node* prev = NULL;
while(root || !st.empty()){
while(root){
st.push(root);
root = root->left;
}
root = st.top();
st.pop();
if(prev && prev->data > root->data){
if(!first){
first = prev;
middle = root;
}
else{
last = root;
}
}
prev = root;
root = root->right;
}
if(first && last){
swap(first->data,last->data);
}
else if(first && middle){ // guard: both are NULL when nothing was swapped
swap(first->data,middle->data);
}
return r;
}

Related

Linked List behaving differently

this is my first time asking here.
I was practicing linked list in C and I cannot figure out why my functions behave differently.
I make a linked list of ints 1-10 and delete the even numbers using delEven() function.
delEven() plays with the node P, P=*node, to make the changes, which work properly.
However, when I use the P=*node to delete all elements in the list in delAll2, the list is untouched. When I ran printList(P) within the function delAll2, the list is properly deleted.
When I use the input *node instead of P to delete all elements in delAll, it is working properly.
I would like to understand why I was able to delete even elements in delEven using P=*node, when I was not in delAll2.
Thanks,
code:
#include <stdio.h>
#include <stdlib.h>
struct node
{
int dat;
struct node* next;
};
void printList (struct node* node)
{
if (node == NULL)
{
printf("empty\n");
return;
}
while (node->next!=NULL)
{
printf("%2d",node->dat);
node = node->next;
}
printf("%2d\n",node->dat);
}
void delEven(struct node** node)
{
struct node* P;
P = *node;
while (P->next != NULL)
{
if (P->next->dat%2==0)
{
P->next = P->next->next;
}
if(P->next->dat%2==0)
{
P->next = P->next->next;
}
P=P->next;
}
}
void delAll(struct node** node)
{
struct node* P;
P = *node;
while ((*node)->next != NULL)
{
*node = (*node)->next;
}
*node = (*node)->next;
}
void delAll2(struct node** node)
{
struct node* P;
P = *node;
while (P != NULL)
{
P = P->next;
}
}
void main()
{
int i;
struct node* start;
struct node* Q;
struct node* P;
start = NULL;
for(i=1; i<=10;i++)
{
Q=malloc(sizeof(struct node));
Q->dat = i;
Q->next = start;
start = Q;
}
printList(start);
printList(start);
delEven(&start);
printList(start);
delAll2(&start);
printList(start);
}
output when delAll2:
10 9 8 7 6 5 4 3 2 1
10 9 8 7 6 5 4 3 2 1
10 9 7 5 3 1
10 9 7 5 3 1
output when delAll:
10 9 8 7 6 5 4 3 2 1
10 9 8 7 6 5 4 3 2 1
10 9 7 5 3 1
empty
When managing linked lists, you should avoid having multiple versions of code to do the same thing in multiple locations.
Make a single function to delete a node. I recommend one that looks like this:
void deleteNextNode(struct node **list, struct node *node_before);
// node_before is the node _before_ the node to delete
// node_before == NULL to delete the first node in the list
// *list is updated if necessary
Once you have that, you can rewrite your delEven() as:
void delEven(struct node **list)
{
// Delete all even-valued nodes at the head of the list
while (*list && isEven((*list)->dat))
deleteNextNode(list, NULL);
if (!*list) return; // (it might have been the whole list)
// Find and delete even-valued nodes in the rest of the list
struct node *before = *list; // (list might be a single node at this point)
while (before->next)
// if the next node is even, delete it
if (isEven(before->next->dat))
deleteNextNode(list, before);
// else advance to the next “before” node
else
before = before->next;
}
Notice how we had to handle the (head of list) and (rest of list) cases separately?
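The answer only declares deleteNextNode() and uses isEven() without defining it. A minimal sketch of both could look like the following; it assumes the nodes were allocated with malloc() as in the question, and the bodies are my own guess at what the answer intends rather than its actual implementation.

```c
#include <assert.h>
#include <stdlib.h>

struct node
{
    int dat;
    struct node *next;
};

/* Returns nonzero when n is even (assumed helper, not defined in the answer). */
int isEven(int n)
{
    return n % 2 == 0;
}

/* Unlinks and frees the node that follows node_before.
   When node_before is NULL, the first node is removed and *list is updated.
   Does nothing when there is no node to delete. */
void deleteNextNode(struct node **list, struct node *node_before)
{
    assert(list != NULL);
    /* link is the pointer that currently points at the node to delete */
    struct node **link = node_before ? &node_before->next : list;
    struct node *victim = *link;
    if (victim == NULL)
        return;
    *link = victim->next;   /* bypass the node ... */
    free(victim);           /* ... and release its memory */
}
```

Because the function writes through a pointer into the list itself (either *list or node_before->next), the change is visible to the caller. That is the difference from delAll2() in the question, which only reassigns the local copy P.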

Is there a way to implement this binary search tree function?

I am struggling to implement the following function:
Given a binary search tree, return the smallest node, then move the pointer to the next smallest node in the tree. Upon calling the function another time, it should return the next smallest node and so on.
Any help would be greatly appreciated.
Here is my program so far with some helper functions and their definitions:
#include <stdio.h>
#include <stdlib.h>
/* A binary tree node has data,
the pointer to left child
and a pointer to right child */
struct node {
int data;
struct node *left;
struct node *right;
struct node *parent;
};
struct node *minValue(struct node *node);
struct node *inOrderSuccessor(
struct node *root,
struct node *n)
{
if (n->right != NULL)
return minValue(n->right);
struct node *p = n->parent;
while (p != NULL && n == p->right) {
n = p;
p = p->parent;
}
return p;
}
/* Given a non-empty binary search tree,
return the minimum data
value found in that tree. Note that
the entire tree does not need
to be searched. */
struct node *minValue(struct node *node)
{
struct node *current = node;
/* loop down to find the leftmost leaf */
while (current->left != NULL) {
current = current->left;
}
return current;
}
/* Helper function that allocates a new
node with the given data and
NULL left and right pointers. */
struct node *newNode(int data)
{
struct node *node = (struct node *)malloc(sizeof(struct node));
node->data = data;
node->left = NULL;
node->right = NULL;
node->parent = NULL;
return (node);
}
/* Give a binary search tree and
a number, inserts a new node with
the given number in the correct
place in the tree. Returns the new
root pointer which the caller should
then use (the standard trick to
avoid using reference parameters). */
struct node *insert(struct node *node,
int data)
{
/* 1. If the tree is empty, return a new,
single node */
if (node == NULL)
return (newNode(data));
else {
struct node *temp;
/* 2. Otherwise, recur down the tree */
if (data <= node->data) {
temp = insert(node->left, data);
node->left = temp;
temp->parent = node;
} else {
temp = insert(node->right, data);
node->right = temp;
temp->parent = node;
}
/* return the (unchanged) node pointer */
return node;
}
}
Here are some remarks about your code:
the function minValue is correct, but it should accept a null argument (an empty tree) and return null for it.
the function newNode should check for memory allocation failure to avoid undefined behavior.
the function inOrderSuccessor should stop scanning when it goes back up to the root node from its right child and return NULL. Testing for a null parent node also avoids undefined behavior.
you can check for failure in insert and return a null pointer.
Here is a modified version with a functional test:
#include <stdio.h>
#include <stdlib.h>
/* A binary tree node has data,
the pointer to left child
a pointer to right child
and a pointer to parent node
*/
struct node {
int data;
struct node *left;
struct node *right;
struct node *parent;
};
/* Given a binary search tree,
return the node with the minimum data. */
struct node *minValue(struct node *node) {
if (node) {
/* loop down to find the leftmost leaf */
while (node->left != NULL) {
node = node->left;
}
}
return node;
}
struct node *inOrderSuccessor(struct node *root,
struct node *n)
{
if (n == NULL)
return minValue(root);
if (n->right != NULL)
return minValue(n->right);
for (;;) {
struct node *p = n->parent;
/* sanity test */
if (p == NULL)
return NULL;
/* coming back from the left child, return parent node */
if (n != p->right)
return p;
/* coming back from the right child, stop at the root node */
if (p == root)
return NULL;
n = p;
}
}
/* Helper function that allocates a new
node with the given data and
NULL left and right pointers. */
struct node *newNode(int data) {
struct node *node = malloc(sizeof(*node));
if (node) {
node->data = data;
node->left = NULL;
node->right = NULL;
node->parent = NULL;
}
return node;
}
/* Give a binary search tree and
a number, inserts a new node with
the given number in the correct
place in the tree. Returns the new
root pointer which the caller should
then use (the standard trick to
avoid using reference parameters).
Return a null pointer on memory allocation failure */
struct node *insert(struct node *node,
int data)
{
/* 1. If the tree is empty, return a new,
single node */
if (node == NULL) {
return newNode(data);
} else {
struct node *temp;
/* 2. Otherwise, recurse down the tree */
if (data <= node->data) {
temp = insert(node->left, data);
if (temp == NULL) /* return NULL on failure */
return NULL;
node->left = temp;
temp->parent = node;
} else {
temp = insert(node->right, data);
if (temp == NULL) /* return NULL on failure */
return NULL;
node->right = temp;
temp->parent = node;
}
/* return the (unchanged) node pointer */
return node;
}
}
void freeNode(struct node *node) {
if (node) {
freeNode(node->left);
freeNode(node->right);
free(node);
}
}
int main() {
struct node *tree = NULL;
printf("inserting values:");
for (int i = 0; i < 20; i++) {
int data = rand() % 1000;
tree = insert(tree, data);
printf(" %d", data);
}
printf("\n");
printf("enumerate values:");
for (struct node *cur = NULL;;) {
if ((cur = inOrderSuccessor(tree, cur)) == NULL)
break;
printf(" %d", cur->data);
}
printf("\n");
freeNode(tree);
return 0;
}
Output:
inserting values: 807 249 73 658 930 272 544 878 923 709 440 165 492 42 987 503 327 729 840 612
enumerate values: 42 73 165 249 272 327 440 492 503 544 612 658 709 729 807 840 878 923 930 987
Given a binary search tree, return the smallest node, then move the pointer to the next smallest node in the tree. Upon calling the function another time, it should return the next smallest node and so on.
struct node *next_smallest_node(struct node *root, struct node *min)
{
if (!min)
return min_node(root);
if (min->right)
return min_node(min->right);
for (struct node *p = min->parent; p; p = min->parent) {
// Coming from left: return parent
if (min != p->right)
return p;
// Coming from right: stop at root
if (p == root)
return NULL;
min = p;
}
return NULL;
}
min_node() returns the smallest node in a tree:
struct node *min_node(struct node *root)
{
struct node *min = NULL;
for (struct node *i = root; i; i = i->left)
min = i;
return min;
}
Usage:
int main(void)
{
struct node *tree = NULL;
// Fill tree with data ...
struct node *min = NULL;
while ((min = next_smallest_node(tree, min)) != NULL) {
printf("Next smallest = %d\n", min->data);
}
}
Update:
The code in next_smallest_node() now parses the left sub-tree (thanks to @chqrlie).
There's no need to compute the minimum value prior to calling the function.

Doubly linked list - Update list->tail after a merge sort

In an implementation of a doubly linked list I am using the typical structure:
struct node
{
void *data;
struct node *prev;
struct node *next;
};
I will also insert at the end of the list in O(1) time, so I have another struct storing the head and the tail:
struct linklist
{
struct node *head;
struct node *tail;
size_t size;
};
The program works as expected for all insert and delete operations, but I have a problem with the sort function. I am using the merge sort algorithm because, as I understand it, it is one of the most efficient ways to sort a list. The algorithm itself works well:
static struct node *split(struct node *head)
{
struct node *fast = head;
struct node *slow = head;
while ((fast->next != NULL) && (fast->next->next != NULL))
{
fast = fast->next->next;
slow = slow->next;
}
struct node *temp = slow->next;
slow->next = NULL;
return temp;
}
static struct node *merge(struct node *first, struct node *second, int (*comp)(const void *, const void *))
{
if (first == NULL)
{
return second;
}
if (second == NULL)
{
return first;
}
if (comp(first->data, second->data) < 0)
{
first->next = merge(first->next, second, comp);
first->next->prev = first;
first->prev = NULL;
return first;
}
else
{
second->next = merge(first, second->next, comp);
second->next->prev = second;
second->prev = NULL;
return second;
}
}
static struct node *merge_sort(struct node *head, int (*comp)(const void *, const void *))
{
if ((head == NULL) || (head->next == NULL))
{
return head;
}
struct node *second = split(head);
head = merge_sort(head, comp);
second = merge_sort(second, comp);
return merge(head, second, comp);
}
but I have no idea how to keep the address of list->tail updated:
void linklist_sort(struct linklist *list, int (*comp)(const void *, const void *))
{
list->head = merge_sort(list->head, comp);
// list->tail is no longer valid at this point
}
Sure I can walk the whole list after ordering and update list->tail by brute force, but I'd like to know if there's a fancier way to do it.
I managed to solve the problem using a circular list, but I would like to avoid changing the structure of the program.
Your algorithm uses O(N) stack space because the merge function recurses once for every node. With this method, it would be very cumbersome to keep track of the tail node. You can simply scan the list to find it and update the list structure in linklist_sort. This extra step does not change the complexity of the sorting operation. You can save some time by starting from the current value of list->tail: the loop will stop immediately if the list was already sorted.
Here is a modified version:
void linklist_sort(struct linklist *list, int (*comp)(const void *, const void *)) {
list->head = merge_sort(list->head, comp);
if (list->tail) {
struct node *tail = list->tail;
while (tail->next)
tail = tail->next;
list->tail = tail;
}
}
Sorting linked lists with merge sort should only use O(log(N)) space and O(N log(N)) time.
Here are some ideas to improve this algorithm:
since you know the length of the list, you don't need to scan the full list for splitting. You can just pass the lengths along with the list pointers and use that to determine where to split and only scan half the list.
if you convert merge to a non-recursive version, you can keep track of the last node in the merge phase and update a pointer struct node **tailp passed as an argument to point to this last node. This would save the last scan, and removing the recursion will lower the space complexity. Whether this improves efficiency is not obvious, benchmarking will tell.
from experience, sorting a linked list, singly or, a fortiori, doubly linked, is more efficiently implemented with an auxiliary array of N pointers to the list nodes. You would sort this array and relink the nodes according to the order of the sorted array. The extra space requirement is O(N).
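The auxiliary-array idea from the last remark could be sketched roughly as follows. This is an illustration, not code from the answer: linklist_sort_array, compare_nodes, user_comp and compare_ints are hypothetical names, list->size is assumed to be accurate, and because the standard qsort() takes no context argument the user's comparison function is kept in a file-scope variable, which is not thread safe. Note also that qsort() is not guaranteed to be stable, unlike merge sort.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    void *data;
    struct node *prev;
    struct node *next;
};

struct linklist {
    struct node *head;
    struct node *tail;
    size_t size;
};

/* qsort() has no context parameter, so the data comparator is stored here. */
static int (*user_comp)(const void *, const void *);

/* Adapter: qsort() hands us pointers to array elements (struct node *). */
static int compare_nodes(const void *a, const void *b)
{
    const struct node *na = *(const struct node *const *)a;
    const struct node *nb = *(const struct node *const *)b;
    return user_comp(na->data, nb->data);
}

/* Example comparator for nodes whose data points to an int. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts the list through a temporary array of node pointers.
   Returns 0 on success, -1 if the array cannot be allocated. */
int linklist_sort_array(struct linklist *list,
                        int (*comp)(const void *, const void *))
{
    assert(list != NULL && comp != NULL);
    if (list->size < 2)
        return 0;
    struct node **v = malloc(list->size * sizeof *v);
    if (v == NULL)
        return -1;
    size_t n = 0;
    for (struct node *p = list->head; p != NULL; p = p->next)
        v[n++] = p;                 /* assumes exactly list->size nodes */
    user_comp = comp;
    qsort(v, n, sizeof *v, compare_nodes);
    for (size_t i = 0; i < n; i++) {  /* relink in sorted order */
        v[i]->prev = i > 0 ? v[i - 1] : NULL;
        v[i]->next = i + 1 < n ? v[i + 1] : NULL;
    }
    list->head = v[0];
    list->tail = v[n - 1];          /* the tail comes for free */
    free(v);
    return 0;
}
```

With this approach list->tail is simply the last array element, so no extra scan is needed after sorting.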
Here is a modified version using the list lengths and with a non-recursive merge:
struct node {
void *data;
struct node *prev;
struct node *next;
};
struct linklist {
struct node *head;
struct node *tail;
size_t size;
};
static struct node *split(struct node *head, size_t pos) {
struct node *slow = head;
while (pos-- > 1) {
slow = slow->next;
}
struct node *temp = slow->next;
slow->next = NULL;
return temp;
}
static struct node *merge(struct node *first, struct node *second,
int (*comp)(const void *, const void *))
{
struct node *head = NULL;
struct node *prev = NULL;
struct node **linkp = &head;
for (;;) {
if (first == NULL) {
second->prev = prev;
*linkp = second;
break;
}
if (second == NULL) {
first->prev = prev;
*linkp = first;
break;
}
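if (comp(first->data, second->data) <= 0) {
first->prev = prev;
prev = *linkp = first;
linkp = &first->next;
first = first->next; /* advance past the node just linked */
} else {
second->prev = prev;
prev = *linkp = second;
linkp = &second->next;
second = second->next; /* advance past the node just linked */
}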
}
return head;
}
static struct node *merge_sort(struct node *head, size_t size,
int (*comp)(const void *, const void *))
{
if (size < 2) {
return head;
}
struct node *second = split(head, size / 2);
head = merge_sort(head, size / 2, comp);
second = merge_sort(second, size - size / 2, comp);
return merge(head, second, comp);
}
void linklist_sort(struct linklist *list, int (*comp)(const void *, const void *)) {
list->head = merge_sort(list->head, list->size, comp);
if (list->tail) {
struct node *tail = list->tail;
while (tail->next)
tail = tail->next;
list->tail = tail;
}
}
Note that you could also simplify the merge function and not update the back pointers during the sort as you can relink the whole list during the last scan. This last scan would be longer and less cache friendly but it should still be more efficient and less error prone.
One option is to merge sort the nodes as if they were single list nodes, then make a one time pass when done to set the previous pointers, and update the tail pointer.
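A rough sketch of that first option, assuming the struct node and struct linklist layout from the question: the sort touches only the next pointers, and a single pass afterwards rebuilds every prev pointer and finds the tail. The names merge_next_only, sort_next_only, linklist_sort_fixup and compare_ints are hypothetical and used only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    void *data;
    struct node *prev;
    struct node *next;
};

struct linklist {
    struct node *head;
    struct node *tail;
    size_t size;
};

/* Example comparator for nodes whose data points to an int. */
int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a, y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Merges two sorted chains, ignoring the prev pointers entirely. */
static struct node *merge_next_only(struct node *a, struct node *b,
                                    int (*comp)(const void *, const void *))
{
    struct node *head = NULL;
    struct node **linkp = &head;
    while (a != NULL && b != NULL) {
        if (comp(a->data, b->data) <= 0) {
            *linkp = a;
            linkp = &a->next;
            a = a->next;
        } else {
            *linkp = b;
            linkp = &b->next;
            b = b->next;
        }
    }
    *linkp = a != NULL ? a : b;   /* append whatever is left */
    return head;
}

/* Top-down merge sort over the next pointers; size is the chain length. */
static struct node *sort_next_only(struct node *head, size_t size,
                                   int (*comp)(const void *, const void *))
{
    if (size < 2)
        return head;
    struct node *last = head;     /* last node of the first half */
    for (size_t i = 1; i < size / 2; i++)
        last = last->next;
    struct node *second = last->next;
    last->next = NULL;
    head = sort_next_only(head, size / 2, comp);
    second = sort_next_only(second, size - size / 2, comp);
    return merge_next_only(head, second, comp);
}

void linklist_sort_fixup(struct linklist *list,
                         int (*comp)(const void *, const void *))
{
    assert(list != NULL && comp != NULL);
    list->head = sort_next_only(list->head, list->size, comp);
    /* One pass: restore every prev pointer and remember the last node. */
    struct node *prev = NULL;
    for (struct node *p = list->head; p != NULL; p = p->next) {
        p->prev = prev;
        prev = p;
    }
    list->tail = prev;
}
```

The final pass is O(N), so it does not change the O(N log(N)) cost of the sort, and the merge step no longer has to maintain prev at all.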
Another option would use something similar to C++ std::list and std::list::sort. A circular doubly linked list is used. There is one dummy node that uses "next" as "head" and "prev" as "tail". The parameters to merge sort and merge are iterators or pointers, and only used to keep track of run boundaries, as nodes are merged by moving them within the original list. The merge function merges nodes from a second run into the first run, using std::list::splice. The logic is if first run element is less than or equal to second run element, just advance the iterator or pointer to the first run, else remove the node from the second run and insert it before the current node in the first run. This will automatically update the head and tail pointers in the dummy node if involved in a remove + insert step.
Changing struct node to :
struct node
{
struct node *next; // used as head for dummy node
struct node *prev; // used as tail for dummy node
void *data;
};
would be a bit more generic.
Since the dummy node is allocated when a list is created, begin == dummy->next, last == dummy->prev, and end == dummy.
I'm not the best person to provide a deep Big-O analysis of algorithms. Still, answering a question that already has an accepted, "canonical" answer is great, because it leaves room to explore alternative solutions without too much pressure.
This is interesting even if, as you will see, the solution analysed here is not better than the one already presented in the question.
The strategy starts by asking whether it is possible to keep track of the candidate tail element without turning the code upside down. The main candidate is the function that decides the order of the nodes in the sorted list: the merge() function.
Now, since each comparison decides which node comes first in the sorted list, there is a "loser" that ends up nearer to the tail. So, by comparing that loser with the current tail element at each step, in the end we can update the tail element with the "loser of the losers".
The merge function takes an additional struct node **tail parameter (the double pointer is required because we change the list's tail field in place):
static struct node *merge(struct node *first, struct node *second, struct node **tail, int (*comp)(const void *, const void *))
{
if (first == NULL)
{
return second;
}
if (second == NULL)
{
return first;
}
if (comp(first->data, second->data) < 0)
{
first->next = merge(first->next, second, tail, comp);
/* The 'second' node is the "loser". Let's compare current 'tail'
with it, and in case it loses again, let's update 'tail'. */
if( comp(second->data, (*tail)->data) > 0)
*tail = second;
/******************************************************************/
first->next->prev = first;
first->prev = NULL;
return first;
}
else
{
second->next = merge(first, second->next, tail, comp);
/* The 'first' node is the "loser". Let's compare current 'tail'
with it, and in case it loses again, let's update 'tail'. */
if( comp(first->data, (*tail)->data) > 0)
*tail = first;
/******************************************************************/
second->next->prev = second;
second->prev = NULL;
return second;
}
}
There are no more changes required to the code, except those for the "propagation" of the tail double pointer parameter through merge_sort() and linklist_sort() functions:
static struct node *merge_sort(struct node *head, struct node **tail, int (*comp)(const void *, const void *));
void linklist_sort(List_t *list, int (*comp)(const void *, const void *))
{
list->head = merge_sort(list->head, &(list->tail), comp);
}
The test
In order to test this modification I had to write a basic insert() function, a compare() function designed to obtain a sorted list in descending order, and a printList() utility. Then I wrote a main program to test all the stuff.
I did several tests; here I present just an example, in which I omit the functions presented in the question and above in this answer:
#include <stdio.h>
#include <stdlib.h>
typedef struct node
{
void *data;
struct node *prev;
struct node *next;
} Node_t;
typedef struct linklist
{
struct node *head;
struct node *tail;
size_t size;
} List_t;
void insert(List_t *list, int data)
{
Node_t * newnode = (Node_t *) malloc(sizeof(Node_t) );
int * newdata = (int *) malloc(sizeof(int));
*newdata = data;
newnode->data = newdata;
newnode->prev = list->tail;
newnode->next = NULL;
if(list->tail)
list->tail->next = newnode;
list->tail = newnode;
if( list->size++ == 0 )
list->head = newnode;
}
int compare(const void *left, const void *right)
{
if(!left && !right)
return 0;
if(!left && right)
return 1;
if(left && !right)
return -1;
int lInt = (int)*((int *)left), rInt = (int)*((int *)right);
return (rInt-lInt);
}
void printList( List_t *l)
{
for(Node_t *n = l->head; n != NULL; n = n->next )
{
printf( " %d ->", *((int*)n->data));
}
printf( " NULL (tail=%d)\n", *((int*)l->tail->data));
}
int main(void)
{
List_t l = { 0 };
insert( &l, 5 );
insert( &l, 3 );
insert( &l, 15 );
insert( &l, 11 );
insert( &l, 2 );
insert( &l, 66 );
insert( &l, 77 );
insert( &l, 4 );
insert( &l, 13 );
insert( &l, 9 );
insert( &l, 23 );
printList( &l );
linklist_sort( &l, compare );
printList( &l );
/* Free-list utilities omitted */
return 0;
}
In this specific test I got the following output:
5 -> 3 -> 15 -> 11 -> 2 -> 66 -> 77 -> 4 -> 13 -> 9 -> 23 -> NULL (tail=23)
77 -> 66 -> 23 -> 15 -> 13 -> 11 -> 9 -> 5 -> 4 -> 3 -> 2 -> NULL (tail=2)
Conclusions
The good news is that theoretically speaking we still have an algorithm that, in the worst case, will have O(N log(N)) time complexity.
The bad news is that, to avoid a linear search in a linked list (N "simple steps"), we have to do N*logN comparisons, involving a call to a function. This makes the linear search still a better option.

Correctly Implementing a Linked List in C

I am trying to implement a linked list from scratch in C:
#include <stdio.h>
#include <stdlib.h>
struct node {
int data;
struct node * next;
};
void insert(struct node** root, int data){
// Create a Node
struct node * temp = malloc(sizeof(struct node));
temp->data = data;
temp->next = NULL;
// Either root is NULL or not
if (*root == NULL){
*root = temp; // Directly modify the root
}
else {
struct node * t = *root;
while (t->next!=NULL){
t = t->next;
}
t->next = temp; // Append at the last
}
}
void printList(struct node * root){
while(root!=NULL){
printf("%d\t", root->data);
}
}
struct node * search(struct node* root, int key){
while (root!=NULL) {
if (root->data == key) return root;
}
return NULL;
}
int main(){
struct node * head = NULL;
insert(&head,0);
insert(&head,1);
insert(&head,2);
insert(&head,3);
insert(&head,4);
printList(head);
}
Now, when I run the program, my output is:
0 0 0 0 0 0 0 0 0 0
However, my list doesn't contain all zeroes or 10 elements.
My logic seems correct, but somehow the code has a bug.
On a side note, is there a way to avoid double pointers? Can't I work with only single pointers while inserting into a linked list?
There is a small bug in the printList() function.
In printList(), root is never updated inside the loop, so the loop keeps printing the first element forever. To iterate over the whole list, advance with root = root->next:
void printList(struct node * root){
while(root!=NULL){
printf("%d\t", root->data);
root = root->next; /* advance to the next node; this line was missing */
}
}
The same mistake is repeated in the search() function:
struct node * search(struct node* root, int key){
while (root!=NULL) {
if (root->data == key)
return root;
else
root = root->next; /* if key not found root should be updated to next one */
}
return NULL;
}
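On the side question about double pointers: one common alternative is to let the insert function return the (possibly new) head and have the caller assign it. The following is only a minimal sketch, assuming the same struct node as in the question; the name append is made up here for illustration.

```c
#include <stdio.h>
#include <stdlib.h>

struct node {
    int data;
    struct node *next;
};

/* Appends a new node holding data and returns the head of the list.
   The head only changes when the list was empty.
   If malloc fails, the list is returned unchanged. */
struct node *append(struct node *head, int data)
{
    struct node *temp = malloc(sizeof *temp);
    if (temp == NULL)
        return head;
    temp->data = data;
    temp->next = NULL;

    if (head == NULL)
        return temp;            /* the new node becomes the head */

    struct node *t = head;
    while (t->next != NULL)     /* walk to the last node */
        t = t->next;
    t->next = temp;
    return head;
}
```

The caller then writes head = append(head, 0); instead of insert(&head, 0);. The trade-off is that forgetting to assign the return value silently loses the node when the list was empty, which is why the double-pointer version is also widely used.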

Inserting into a Huffman Tree

I am having problems with my huffman tree; when I try to build it I get the nodes in the wrong place. For example, I want my node of weight 2 (with children i:1 and n:1) to go in between a node of m:2 and space:3 but instead it goes right after the previous node that I put in (2 with children of e:1 and g:1).
My question is: how do I insert a node with two children into a Huffman tree (I am using a linked list) by priority of both its weight (i.e. the sum of both its children) and the symbols of the children (i.e. the right child 'n' comes before the other right child 'g')?
Thanks for your help!
EDIT: Also, how can I print the codes of the tree in alphabetical order? Right now they are printed from the rightmost tree to the leftmost.
Here is my insert function...
struct node* insert(struct node* head, struct node* temp)
{
struct node* previous = NULL;
struct node* current = head;
printf("entering insert function\n");
// finds the previous node, where we want to insert new node
while (temp->freq > current->freq && current->next != NULL)
{
printf("traversing: tempfreq is %lu and currentfreq is %lu\n", temp->freq, current->freq);
previous = current;
current = current->next;
}
if (current->next == NULL)
{
printf("hit end of list\n");
temp = current->next;
}
else
{
printf("inserting into list\n");
temp->next = current;
previous->next = temp;
}
return head;
}
You've got the insertion wrong when you hit the end of the list. This:
temp = current->next;
should be the other way round, i.e. current->next = temp; otherwise you just overwrite the local variable temp with NULL, which does nothing to your list.
But I think that you also got your special cases wrong. The special case is not "insert at the end", but "insert a new head". Your code will fail if head == NULL, and it can dereference a NULL previous when the new node belongs in front of the current head. (The empty-list case might not happen, because you already have a list of nodes without children and you remove nodes until only one node is left, but still.)
A better implementation might therefore be:
struct node *insert(struct node *head, struct node *temp)
{
struct node *previous = NULL;
struct node *current = head;
while (current && temp->freq > current->freq) {
previous = current;
current = current->next;
}
if (previous == NULL) {
temp->next = head;
return temp;
}
temp->next = current;
previous->next = temp;
return head;
}
Note how this code never dereferences current or previous when they are NULL. Your special case "insert at the end" is handled by the regular code when current == NULL.
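To see the three cases (new head, middle, end) in action, the function can be exercised on a handful of nodes. The sketch below repeats the insert logic and assumes a reduced struct node with only the freq and next fields; the helper build_sorted is hypothetical and exists only for this demonstration.

```c
#include <stddef.h>

/* Reduced node for illustration: only the fields insert() touches. */
struct node {
    unsigned long freq;
    struct node *next;
};

/* Walks to the first node whose freq is not smaller than temp's
   and links temp in front of it. */
struct node *insert(struct node *head, struct node *temp)
{
    struct node *previous = NULL;
    struct node *current = head;

    while (current && temp->freq > current->freq) {
        previous = current;
        current = current->next;
    }
    if (previous == NULL) {     /* empty list, or temp is the new head */
        temp->next = head;
        return temp;
    }
    temp->next = current;       /* current may be NULL: insert at the end */
    previous->next = temp;
    return head;
}

/* Hypothetical helper: links the n nodes of an array into a sorted list. */
struct node *build_sorted(struct node nodes[], size_t n)
{
    struct node *head = NULL;
    for (size_t i = 0; i < n; i++)
        head = insert(head, &nodes[i]);
    return head;
}
```

Because the loop condition uses >, a new node lands in front of existing nodes with the same freq; changing it to >= would place it behind them instead. Which of the two is appropriate depends on how ties are supposed to be ordered in your tree.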
Edit: Concerning your request to print the nodes in alphabetical order: There are many possibilities to do that. One is to add a char buffer to your structure that contains the encoding for the letter:
struct node {
int value;
unsigned long freq;
struct node *next;
struct node *left;
struct node *right;
char code[32];
};
Then you create an "alphabet", i.e. an array of 256 pointers to nodes of your Huffman tree, initially all null. (You'll need that alphabet for encoding anyway.)
struct node *alpha[256] = {NULL};
Then traverse your tree, pass a temporary char buffer and assign nodes to your alphabet as appropriate:
void traverse(struct node *n, int level, char buf[], struct node *alpha[])
{
if (n == NULL) return;
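if (n->value) {
/* Leaf: NUL-terminate the path built so far before copying it (strcpy needs <string.h>). */
buf[level] = '\0';
alpha[n->value] = n;
strcpy(n->code, buf);
} else {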
buf[level] = '0';
traverse(n->left, level + 1, buf, alpha);
buf[level] = '1';
traverse(n->right, level + 1, buf, alpha);
}
}
When the node has a value, i.e. is childless, the value (ASCII code) is assigned to the alphabet, so that alpha['a'] points to the node with value 'a'. Note that the alphabet does not create nodes, it points to existing nodes.
Finally, print the alphabet:
char buf[32];
traverse(head, 0, buf, alpha);
for (int i = 0; i < 256; i++) {
if (alpha[i] != NULL) {
printf("%c: %s\n", alpha[i]->value, alpha[i]->code);
}
}
Please note that 32 is an arbitrary value that is chosen to be high enough for the example. In a real tree, memory for the code might be allocated separately.
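For reference, the pieces above can be put together on a tiny hand-built tree. This is only a sketch under assumptions: it detects leaves by the absence of children rather than by a non-zero value, it terminates the buffer before copying, it does not guard against trees deeper than the 32-byte buffer, and the helper names leaf and join are invented for the example.

```c
#include <stddef.h>
#include <string.h>

struct node {
    int value;                  /* symbol; only meaningful for leaves */
    struct node *left;
    struct node *right;
    char code[32];              /* arbitrary size, as noted above */
};

/* Stores the code of every leaf below n; level is the current depth. */
void traverse(struct node *n, int level, char buf[], struct node *alpha[])
{
    if (n == NULL) return;
    if (n->left == NULL && n->right == NULL) {
        buf[level] = '\0';                      /* end of the path */
        alpha[(unsigned char)n->value] = n;     /* index by symbol */
        strcpy(n->code, buf);
    } else {
        buf[level] = '0';
        traverse(n->left, level + 1, buf, alpha);
        buf[level] = '1';
        traverse(n->right, level + 1, buf, alpha);
    }
}

/* Hypothetical helper: a childless node carrying a symbol. */
struct node leaf(int value)
{
    struct node n = { value, NULL, NULL, "" };
    return n;
}

/* Hypothetical helper: an inner node joining two existing subtrees. */
struct node join(struct node *l, struct node *r)
{
    struct node n = { 0, l, r, "" };
    return n;
}
```

With a root whose left child is the leaf 'a' and whose right child joins the leaves 'b' and 'c', traversing from level 0 gives 'a' the code 0, 'b' the code 10 and 'c' the code 11.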

Resources