Binary tree is losing nodes while building the tree - c

I built a binary tree (a Huffman tree) with the code below, which takes a linked list sorted in ascending order. When it finishes, it prints the bit patterns, but a few of the nodes that should be in the tree are missing.
In each iteration the code:
1. sets a parent node to point at the two lowest-frequency nodes
2. assigns the parent node's frequency (the sum of the two)
3. moves the start of the list two nodes along from where it was (to avoid re-using nodes)
4. inserts the new parent node into the correct position in the list
5. gets the length of the list
6. prints all nodes left in the list
It iterates until one node is left (which is the root).
Any ideas as to why it's 'losing' nodes along the way?
void build_tree(pqueue *list)
{
node *temp;
node* parent_node;
int min_1, min_2, ind = 0, counter = 0, length = 2, head;
int characters[CHARACTERS];
temp = new_node();
while (length > 1)
{
min_1 = 0;
min_2 = 0;
temp = list->start;
parent_node = new_node();
parent_node->letter = '#';
min_1 = temp->frequency;
parent_node->left = temp;
temp = temp->next;
min_2 = temp->frequency;
parent_node->right = temp;
parent_node->frequency = min_1 + min_2;
list->start = temp->next;
while (ind == 0) /* inserting a node to the correct place */
{
if (temp != NULL && temp->next != NULL)
{
temp = temp->next;
if (temp->frequency >= parent_node->frequency) /* in the middle */
{
parent_node->next = temp->next;
temp->next = parent_node;
ind = 1;
}
else if (temp->next == NULL) /* at the end */
{
temp->next = parent_node;
parent_node -> next = NULL;
ind = 1;
}
}
}
ind = 0;
temp = list->start;
while (temp->next != NULL) /* get number of nodes left to insert into tree */
{
temp = temp->next;
counter++;
printf("%c : %d\n", temp->letter, temp->frequency);
}
printf("----------------------------------------------\n");
length = counter;
counter = 0;
}
printf("Found root with value of: %d\n", temp->frequency);
head = 0;
BitPatterns(temp, characters, head);
temp = list->start;
deallocate(temp, list);
}
void BitPatterns(node* root, int characters[], int head)
{
if (root->left)
{
characters[head] = 0;
BitPatterns(root->left, characters, head +1);
}
if (root->right)
{
characters[head] = 1;
BitPatterns(root->right, characters, head +1);
}
if (isLeaf(root))
{
printf("'%c' : ", root->letter);
GetChars(characters, head);
}
}
void GetChars(int characters[], int n)
{
int i, counter = 0;
for (i = 0; i < n; ++i)
{
printf("%d", characters[i]);
counter++;
}
printf(" (%d * \n", counter);
}
int isLeaf(node* root)
{
return !(root->left) && !(root->right) ;
}

OK, this was a tough one to debug, but I think I have found the problem. It is in the while loop that measures the length of the list still left to process. The loop condition is temp->next != NULL, so consider a list of size 2, something like this:
3 --> 4 --> NULL (the numbers represent the summed frequencies of some nodes)
with list->start pointing to 3. The loop measures the length of this list as 1, not 2, because it tests temp->next != NULL instead of temp != NULL.
As a result, the outer loop stops while two nodes are still in the list. BitPatterns() then runs on only one of them (the node temp stopped on), so the subtree under the other node is never printed, and those are the nodes you are missing.
A possible fix is to measure the length once, with a loop at the beginning of the function, and then decrement it by 1 on every iteration of the outer while loop. Each iteration removes two nodes and adds one, so the list always shrinks by exactly one node. This also avoids re-walking the whole list on every iteration just to recompute its length.
Something like this:
length = 0; /* the question initialises length to 2, so reset it before counting */
temp = list->start;
while (temp != NULL) {
    length++;
    temp = temp->next;
}
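EDIT:
There is another logical bug in your code. Consider this initial list:
1 --> 2 --> 4 --> 5 --> NULL
You combine the first two nodes into a new node, call it A (with frequency 3), and list->start now points to 4. Your insertion loop places A after the first node whose frequency is greater than or equal to A's, so the list ends up like this:
4 --> A --> 5 --> NULL
whereas it should look like this:
A --> 4 --> 5 --> NULL
This does not stop the code from running, but the list is no longer sorted, so later iterations may not pick the two lowest frequencies and the resulting Huffman codes may not be optimal. Note also that once the length is counted correctly, the last merge leaves list->start == NULL after the two nodes are removed; the insertion loop never sets ind in that case, so it needs a branch that makes the parent the new start of the list.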
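To make both points concrete, here is a minimal sketch of a merge loop that keeps the list sorted and handles the empty-list case. The node and pqueue definitions are assumptions (the question does not show them), and insert_sorted and build_root are hypothetical names, not functions from the original program. The sketch also swaps the length counter for a direct test on the list, which is a different technique from the counting fix described above.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed layout: the question does not show these definitions. */
typedef struct node {
    char letter;
    int frequency;
    struct node *left, *right, *next;
} node;

typedef struct {
    node *start;
} pqueue;

/* Hypothetical helper: link n into the list so that frequencies stay in
   ascending order. Walking a pointer-to-pointer lets the same code insert
   at the head, in the middle, at the end, or into an empty list. */
static void insert_sorted(pqueue *list, node *n)
{
    assert(list != NULL && n != NULL);
    node **link = &list->start;
    while (*link != NULL && (*link)->frequency < n->frequency)
        link = &(*link)->next;
    n->next = *link;
    *link = n;
}

/* Hypothetical helper: merge the two lowest-frequency nodes until a single
   node (the root) remains. Returns NULL for an empty list or on allocation
   failure. */
static node *build_root(pqueue *list)
{
    assert(list != NULL);
    while (list->start != NULL && list->start->next != NULL) {
        node *a = list->start;
        node *b = a->next;
        node *parent = calloc(1, sizeof *parent);
        if (parent == NULL)
            return NULL;
        parent->letter = '#';
        parent->frequency = a->frequency + b->frequency;
        parent->left = a;
        parent->right = b;
        list->start = b->next;       /* drop the two merged nodes */
        insert_sorted(list, parent); /* also works when the list is now empty */
    }
    return list->start;
}
```

Testing the list itself (start and start->next) avoids keeping a separate length variable in sync, at the cost of not knowing the length up front.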

Related

How to solve this Depth first search problem?

So I need to do a depth first search traversal of a given graph, however if a node in the graph has multiple adjacent neighbours, I need to choose the node with the lowest value to go to. So I implemented the following recursive depth first search function:
void DFS(struct Graph *graph, int vertex) {
struct node *adjList = graph->adjLists[vertex];
struct node *temp = adjList;
graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);
int neighbouring_nodes[graph->numVertices];
while (temp != NULL) {
int count = 0;
struct node *temp_cpy = temp;
while (temp_cpy != NULL) {
neighbouring_nodes[count] = temp_cpy->vertex;
count++;
temp_cpy = temp_cpy->next;
}
int smallest_node = neighbouring_nodes[0];
for (int i = 0; i < count; i++) {
if (neighbouring_nodes[i] < smallest_node) {
smallest_node = neighbouring_nodes[i];
}
}
if (graph->visited[smallest_node] == 0) {
DFS(graph, smallest_node);
} else if (graph->visited[smallest_node] == 1 && count == 1) {
//if the node is visited but is it the only neighbour
DFS(graph, smallest_node);
}
temp = temp->next;
}
}
But when I run my program, it results in an infinite loop. I think I know why I am getting an infinite loop, it might be because there is never a return condition, so the recursive function just keeps running?
Is this type of depth first search possible with a recursive function? If yes, where am I going wrong? If no, how would I do it iteratively?
Help would be much appreciated.
Below is my full program without the DFS function:
// DFS algorithm in C
#include <stdio.h>
#include <stdlib.h>
struct node {
int vertex;
struct node *next;
};
struct node *createNode(int v);
struct Graph {
int numVertices;
int *visited;
struct node **adjLists;
};
// Create a node
struct node *createNode(int v) {
struct node *newNode = malloc(sizeof(struct node));
newNode->vertex = v;
newNode->next = NULL;
return newNode;
}
// Create graph
struct Graph *createGraph(int vertices) {
struct Graph *graph = malloc(sizeof(struct Graph));
graph->numVertices = vertices;
graph->adjLists = malloc(vertices * sizeof(struct node*));
graph->visited = malloc(vertices * sizeof(int));
int i;
for (i = 0; i < vertices; i++) {
graph->adjLists[i] = NULL;
graph->visited[i] = 0;
}
return graph;
}
// Add edge
void addEdge(struct Graph *graph, int src, int dest) {
// Add edge from src to dest
struct node *newNode = createNode(dest);
newNode->next = graph->adjLists[src];
graph->adjLists[src] = newNode;
// Add edge from dest to src
newNode = createNode(src);
newNode->next = graph->adjLists[dest];
graph->adjLists[dest] = newNode;
}
// Print the graph
void printGraph(struct Graph *graph) {
int v;
for (v = 0; v < graph->numVertices; v++) {
struct node *temp = graph->adjLists[v];
printf("\n Adjacency list of vertex %d\n ", v);
while (temp) {
printf("%d -> ", temp->vertex);
temp = temp->next;
}
printf("\n");
}
}
int main() {
struct Graph *graph = createGraph(4);
addEdge(graph, 0, 1);
addEdge(graph, 0, 2);
addEdge(graph, 1, 2);
addEdge(graph, 2, 3);
printGraph(graph);
DFS(graph, 2);
return 0;
}
"if a node in the graph has multiple adjacent neighbours, I need to choose the node with the lowest value to go to."
I assume the 'value' of a node is an attribute of the node object?
Most implementations of DFS will first look at the node with the lowest index in the data structure containing the node objects. So, if you first sort the nodes in your data structure into ascending value order, then the DFS will do what you want without needing to change the DFS code.
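As a rough sketch of that idea, assuming the 'value' of a node is simply its vertex number: if every adjacency list is kept in ascending order at insertion time, a plain recursive DFS visits the smallest unvisited neighbour first. The struct definitions mirror the question; insertAscending, addEdgeSorted and dfsOrdered are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stdlib.h>

struct node {
    int vertex;
    struct node *next;
};

struct Graph {
    int numVertices;
    int *visited;
    struct node **adjLists;
};

static struct Graph *createGraph(int vertices)
{
    struct Graph *graph = malloc(sizeof *graph);
    assert(graph != NULL);
    graph->numVertices = vertices;
    graph->adjLists = malloc(vertices * sizeof *graph->adjLists);
    graph->visited = malloc(vertices * sizeof *graph->visited);
    assert(graph->adjLists != NULL && graph->visited != NULL);
    for (int i = 0; i < vertices; i++) {
        graph->adjLists[i] = NULL;
        graph->visited[i] = 0;
    }
    return graph;
}

/* Hypothetical helper: insert v so the list stays in ascending order. */
static void insertAscending(struct node **head, int v)
{
    struct node *n = malloc(sizeof *n);
    assert(n != NULL);
    n->vertex = v;
    while (*head != NULL && (*head)->vertex < v)
        head = &(*head)->next;
    n->next = *head;
    *head = n;
}

/* Undirected edge, with both adjacency lists kept sorted. */
static void addEdgeSorted(struct Graph *graph, int src, int dest)
{
    insertAscending(&graph->adjLists[src], dest);
    insertAscending(&graph->adjLists[dest], src);
}

/* Plain recursive DFS. The visit order is written to order[] so that it
   can be inspected; *n counts the vertices visited so far. */
static void dfsOrdered(struct Graph *graph, int vertex, int *order, int *n)
{
    graph->visited[vertex] = 1;
    order[(*n)++] = vertex;
    for (struct node *p = graph->adjLists[vertex]; p != NULL; p = p->next) {
        if (!graph->visited[p->vertex])
            dfsOrdered(graph, p->vertex, order, n);
    }
}
```

With the edges from the question's main() and a start at vertex 2, the recorded order should be 2, 0, 1, 3. Memory cleanup is left out to keep the sketch short.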
Here is what I came up with:
void DFS(struct Graph* graph, int vertex) {
struct node* temp = graph->adjLists[vertex];
graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);
int neighbouring_nodes[graph->numVertices];
int count = 0;
while(temp != NULL) {
neighbouring_nodes[count] = temp->vertex;
count++;
temp = temp->next;
}
int smallest_node = neighbouring_nodes[0];
// Need to search (at most) in every neighbouring node
for (int i = 0; i < count; i++) {
// Go through all nodes in neighbouring_nodes array in order
// to find the smallest unvisited one, if it exists
for (int j = 0; j < count; j++){
// if current smallest_node has already been visited and
// neighbouring_nodes[j] is unvisited, assign it to smallest_node
if (graph->visited[smallest_node] == 1 && graph->visited[neighbouring_nodes[j]] == 0){
smallest_node = neighbouring_nodes[j];
}
// if neighbouring_nodes[j] is smaller than smallest_node,
// assign it to smallest_node
if (graph->visited[neighbouring_nodes[j]] == 0 && neighbouring_nodes[j] < smallest_node){
smallest_node = neighbouring_nodes[j];
}
}
if (graph->visited[smallest_node] == 0){
// calls DFS on the smallest unvisited neighboring node, if it exists
DFS(graph, smallest_node);
}else{
// otherwise (all neighboring nodes already visited)
// return control to the caller function
return;
}
}
}
I'm not 100% sure what you wanted to achieve with the nested while (temp != NULL) and while (temp_cpy != NULL) loops, and I couldn't find a way to make that approach work, especially since you want to visit the neighboring nodes in ascending order.
Let's assume a simple graph like 6->0->1. Calling DFS(g, 0) makes temp point to 6->1->NULL (or 1->6->NULL, depending on how you construct the graph). smallest_node is then 1, so node 1 is visited, and temp = temp->next "assigns" 1->NULL to temp. Back at the top of the loop, temp_cpy is set equal to temp, hence 1->NULL. Node 6 is no longer on the list even though it was never visited, while the already visited node 1 is still there.
count is now 1, so the condition (graph->visited[smallest_node] == 1 && count == 1) is met and DFS(g, 1) is called, which should not happen because node 1 was already visited. This is where the infinite loop comes from: the condition is always met when temp has one (already visited) element left ([some value]->NULL). Once you reach that point you always call DFS(g, [some value]), and that call never returns control: before it can reach temp = temp->next (which would assign NULL to temp and end the while loop), it calls DFS(g, [some other value]), which at some point calls DFS(g, [some value]) again, and so forth.
As mentioned, one problem in your original code is that you also call DFS for an already visited vertex, which should never happen. When you encounter an already visited neighboring vertex, you want either to check the next one or, if there are no unvisited neighboring vertices left, to return control to the caller. Therefore the last else if branch should not be there. The second problem is that smallest_node is selected in the wrong way. As explained above, temp_cpy does not necessarily contain all unvisited neighboring nodes, and you look for the smallest element in that list regardless of whether it has already been visited (which only works under the assumption that temp_cpy contains exactly the unvisited nodes). You should be looking for the "smallest unvisited node" rather than the "smallest node".
In my code I go through all neighboring nodes with two for loops, find the smallest unvisited one and call DFS(g, [smallest unvisited node]). Once there are no unvisited neighbors left, control returns to the caller.
I hope this is somewhat understandable, and I also hope I'm not missing something about what you had in mind with your implementation, in which case I would be very interested in an explanation!
Here is a simpler version of the DFS in which neighboring nodes are checked and eventually visited in the order they're presented in the adjList. In this case I think the while (temp != NULL)/temp = temp->next approach makes sense:
void DFS(struct Graph *graph, int vertex) {
struct node *temp = graph->adjLists[vertex];
graph->visited[vertex] = 1;
printf("Visited %d \n", vertex);
// for vertex search in every neighboring node
while (temp != NULL) {
// if neighboring vertex temp->vertex not visited, then search there
if (graph->visited[temp->vertex] == 0) {
DFS(graph, temp->vertex);
// if already visited, go to the next vertex on the neighbors list
}else{
temp = temp->next;
}
}
// when searched in all neighboring vertexes return control to caller
return;
}

Why is my linked list while loop not working? (C)

I am unsure why the code below does not get through the while loop.
The desired output of this program is that the largest node of the linked list is taken, multiplied by 0.8, and then printed as an output.
Code:
struct Process{
int burst_time;
struct Process* next;
};
int main()
{
int i;
struct Process* head = NULL, *temp = NULL;
struct Process* current = head; // Reset the pointer
int proc_count, time_quantum, total_time;
// BTmax
int max = 0;
printf("How many processes?: ");
scanf("%d",&proc_count);
for(i = 0; i < proc_count; i++)
{
temp = malloc(sizeof(struct Process));
printf("\nEnter burst time of process %d: ", i + 1);
scanf("%d", &temp -> burst_time);
temp->next=NULL;
if(head==NULL)
{
head=temp;
current=head;
}
else
{
current->next=temp;
current=temp;
}
}
current = head;
// BTmax * 0.8
while(current != NULL)
{
if (head -> burst_time > max)
{
max = head->burst_time;
}
head = head->next;
}
time_quantum = max * 0.8;
printf("\nTime Quantum is: %d", time_quantum);
Also, inside the while loop you advance the head variable, but the loop condition checks current != NULL, and current never changes.
From the way you wrote your while loop (iterating via head=head->next) you are apparently trying to do these two things at the same time:
Scan the list for its largest element
Remove/deallocate each element after it has been considered
Although head=head->next does remove each element from the list, it neglects to deallocate (causing memory to leak).
This loop correctly does both the scanning and the removal/deallocation:
while (head != NULL)
{
if (head->burst_time > max)
{
max = head->burst_time;
}
temp = head;
head = head->next;
free(temp);
}
(Notice that the while condition should be testing head, not testing current. Thus, there is no need to initialize current=head prior to the loop.)
You'll want to change the final while loop. You're checking that current isn't NULL, but you're iterating with head. If you still need access to the data afterwards, changing the final while loop to this should work:
while(current != NULL) {
if (current->burst_time > max) max = current->burst_time;
current = current->next;
}
Finally, maybe you already do this in your actual program, but you need to free() any memory allocated with malloc().
So if you're done with the list at that point, you can change the final while loop to:
while(head != NULL) {
if (head->burst_time > max) max = head->burst_time;
temp = head;
head = head->next;
free(temp);
}
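If the list is still needed afterwards, the scan can also be factored into a small read-only function. The following is only a sketch: max_burst_time and time_quantum are hypothetical names, and the 0.8 factor is computed in integer arithmetic (max * 8 / 10), which is an assumption made here to avoid floating-point rounding and is not what the original code does.

```c
#include <assert.h>
#include <stddef.h>

struct Process {
    int burst_time;
    struct Process *next;
};

/* Hypothetical helper: largest burst_time in the list, or 0 if the list is
   empty. It iterates with its own pointer, so head is left untouched. */
static int max_burst_time(const struct Process *head)
{
    int max = 0;
    for (const struct Process *p = head; p != NULL; p = p->next) {
        if (p->burst_time > max)
            max = p->burst_time;
    }
    return max;
}

/* Hypothetical helper: 80% of the largest burst time, truncated to int. */
static int time_quantum(const struct Process *head)
{
    int max = max_burst_time(head);
    assert(max >= 0); /* burst times are assumed to be non-negative */
    return max * 8 / 10;
}
```

Because the function never modifies head, the caller can still walk or free the list after computing the quantum.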

C Program - Removing duplicates from LinkedList

I am trying to remove duplicates from a linked list. The user inserts one value, and the program checks whether that value is already in the linked list. If it is, the duplicate should be removed so that only one copy remains. For example: linked list = 10 100, user input = 10, expected outcome = 10 100 and not 10 10 100.
int insertSorted(LinkedList *ll, int item)
{
ListNode *cur = ll->head;
int size = ll->size;
int i;
for (i = 0; i <= size; i++)
{
if ((size - i) == 0 || item < cur->item)
{
insertNode(ll, i, item); // function to insert the value into Linkedlist
return i;
}
cur = cur->next;
}
ListNode *current = ll->head;
while (current->next != NULL)
{
if (current->item == current->next->item)
{
ListNode *nextNext = current->next->next;
free(current->next);
current->next = nextNext;
}
else
{
current = current->next; // only advance if no deletion
}
}
return -1;
}
When you insert the new node (the value 10 in your example), you then return i;, so the rest of the code, which I suppose checks for duplicates, never executes.
If you always return right after you insert, you never reach the code that deletes the duplicate. I think you need to break instead.
Also, where do you modify the size of the linked list when you delete an element?
It should be reduced by 1, if I understand correctly.
In addition, since you clean up duplicates after every insertion, there can be at most one duplicate at a time, so you don't need the conditional advancement in the deleting loop.
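An alternative to inserting and then deleting is to check for the value while looking for the insertion point, and simply not insert it if it is already there. This is a different technique from the break-then-clean-up fix suggested above. The sketch below assumes the usual shape for LinkedList and ListNode (the question does not show them), and insertSortedUnique is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Assumed definitions: the question does not show these. */
typedef struct ListNode {
    int item;
    struct ListNode *next;
} ListNode;

typedef struct {
    ListNode *head;
    int size;
} LinkedList;

/* Hypothetical helper: insert item in ascending order unless it is already
   present. Returns the index where it was inserted, or -1 if the value is
   a duplicate or the allocation fails. */
static int insertSortedUnique(LinkedList *ll, int item)
{
    assert(ll != NULL);
    ListNode **link = &ll->head;
    int index = 0;

    /* Skip every node smaller than the new item. */
    while (*link != NULL && (*link)->item < item) {
        link = &(*link)->next;
        index++;
    }

    /* In a sorted list a duplicate can only sit at the insertion point. */
    if (*link != NULL && (*link)->item == item)
        return -1;

    ListNode *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->item = item;
    n->next = *link;
    *link = n;
    ll->size++; /* size only changes when a node is really added */
    return index;
}
```

Since nothing is ever deleted, the size field stays consistent without a separate decrement.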

c bubblesort a link list

I'm trying to get a function to run that will bubble sort a linked list from the smallest to the largest number. I don't want the data to be moved around in the linked list; instead I want the pointers to be re-pointed, in case each link needs to hold a lot of data.
Each link has an int arrivalTime field that holds an integer value. This number determines where the link should be in the list.
My program seems to hang at the moment. I'd appreciate it if anyone could fix it up. Thanks.
Bubble sort function:
void bubbleSort(struct order *start)
{
int swapped, i;
processData *current = start;
processData *temp;
struct order *lptr = NULL;
/* Checking for empty list */
if (current == NULL)
printf("null \n");
do
{
swapped = 0;
current = start;
while (current->next != lptr)
{
if (current->arrivalTime > current->next->arrivalTime)
{
temp = current;
current = current->next;
current->next = temp;
swapped = 1;
}
current = current->next;
}
lptr = current;
}
while (swapped);
}
structure of link list
struct order
{
int name;
int arrivalTime;
int quanta;
struct order * next;
};
typedef struct order processData;
Swapping two adjacent items in a singly-linked-list requires that you change three next pointers.
If the order is A --> B --> C --> D and you swap B with C to get A --> C --> B --> D then
A->next needs to point to C instead of B
B->next needs to point to D instead of C
C->next needs to point to B instead of D
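The three pointer updates can be sketched as follows. The struct matches the question; swapWithNext is a hypothetical helper, and the idea of passing the address of the incoming pointer (either the list head or the previous node's next field) is an assumption of this sketch, not something taken from the original code.

```c
#include <assert.h>
#include <stddef.h>

struct order {
    int name;
    int arrivalTime;
    int quanta;
    struct order *next;
};

/* Hypothetical helper: swap the node *link points at with its successor.
   link is the address of whatever pointer leads to the first node of the
   pair, i.e. &head or &previous->next (A->next in the example above). */
static void swapWithNext(struct order **link)
{
    assert(link != NULL && *link != NULL && (*link)->next != NULL);
    struct order *first = *link;        /* B */
    struct order *second = first->next; /* C */

    first->next = second->next; /* B->next now points to D */
    second->next = first;       /* C->next now points to B */
    *link = second;             /* A->next (or the head) now points to C */
}
```

Because the head pointer and a previous->next field are both just "the pointer that leads to B", the same function covers a swap at the front of the list.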
This is a very slow thing you're trying to do; see this paper comparing ways to sort a linked list.
Currently, your algorithm will never terminate. At the end of each pass you set lptr = current, but the inner loop then tests whether current->next == lptr, which will never be true unless one of your list members points to itself. In addition, your swap sets the second node's next to the first node while the first node's next still points to the second, which creates a two-node cycle.
If you do have a definite use case for bubble sorting the linked list, consider this: a linked list is accessed by working sequentially from the list head to the item you want. If your initial list head is not the smallest element and you're only changing pointers, you need to keep track of where the head of the list is.
I did think of a way to do it loosely based on yours, see below. Sorry it's a bit messy; I'm not a C expert (I hope the memory handling is OK) and I'm sure there are optimisations. In particular, I'm sure there are better ways to pass the head-of-list pointer back to the caller. I've tested it on a number of 100-member lists of structs, including some negative numbers. I've left the code that generates them with random arrival times in main(), so you can just paste this in where your old code was and run it.
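#include <stdio.h>
#include <stdlib.h>
#include <time.h>
/* struct order and the processData typedef from the question go here */
processData *bubbleSort(processData *start)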
{
int swapped;
processData *current = start;
processData *lheadptr = start;
processData *previous = NULL;
processData *oldPrevButBigger;
processData *oldAfterThisPair;
/* Checking for empty list */
if (current == NULL)
{
printf("null \n");
return NULL;
}
do
{
swapped = 0;
current = lheadptr;
previous = NULL;
while (current->next) //Stop when current->next is null (i.e. end of list)
{
if (current->arrivalTime > current->next->arrivalTime)
{
oldPrevButBigger = current;
oldAfterThisPair = current->next->next;
current = current->next;
current->next = oldPrevButBigger;
current->next->next = oldAfterThisPair;
if (!previous)
{
//If no previous, this was the head of the list, so need to update
lheadptr = current;
}
else
{
// If there is a "previous", then we're not at head of list, so need
// to update that pointer too
previous->next = current;
}
swapped = 1;
}
previous = current;
current = current->next;
}
}
while (swapped);
return lheadptr;
}
int main()
{
srand(time(NULL));
int NUM = 100;
processData* pointToLast = NULL;
int i;
processData orders[NUM];
//Generate a list with random arrival times (the last member of the list is generated first)
for (i = 0; i < NUM; i++ )
{
orders[i].name = i;
orders[i].arrivalTime = rand()-1000000000;
orders[i].quanta = 500;
orders[i].next = pointToLast;
pointToLast = &orders[i];
}
printf("List before\n");
printf("===========\n");
for (i=0; i < NUM; i++)
{
printf("%i;%i;%i\n", pointToLast->name, pointToLast->arrivalTime, pointToLast->quanta);
pointToLast = pointToLast->next;
}
processData* newListHead = bubbleSort(&orders[NUM-1]);
printf("List after\n");
printf("===========\n");
for (i=0; i < NUM; i++)
{
printf("%i;%i;%i\n", newListHead->name, newListHead->arrivalTime, newListHead->quanta);
newListHead = newListHead->next;
}
return 0;
}

Splitting a linked list

Why are the split lists always empty in this program? (It is derived from the code on the Wikipedia page on Linked Lists.)
/*
Example program from wikipedia linked list article
Modified to find nth node and to split the list
*/
#include <stdio.h>
#include <stdlib.h>
typedef struct ns
{
int data;
struct ns *next; /* pointer to next element in list */
} node;
node *list_add(node **p, int i)
{
node *n = (node *)malloc(sizeof(node));
if (n == NULL)
return NULL;
n->next = *p; //* the previous element (*p) now becomes the "next" element */
*p = n; //* add new empty element to the front (head) of the list */
n->data = i;
return *p;
}
void list_print(node *n)
{
int i=0;
if (n == NULL)
{
printf("list is empty\n");
}
while (n != NULL)
{
printf("Value at node #%d = %d\n", i, n->data);
n = n->next;
i++;
}
}
node *list_nth(node *head, int index) {
node *current = head;
node *temp=NULL;
int count = 0; // the index of the node we're currently looking at
while (current != NULL) {
if (count == index)
temp = current;
count++;
current = current->next;
}
return temp;
}
/*
This function is to split a linked list:
Return a list with nodes starting from index 'int ind' and
step the index by 'int step' until the end of list.
*/
node *list_split(node *head, int ind, int step) {
node *current = head;
node *temp=NULL;
int count = ind; // the index of the node we're currently looking at
temp = list_nth(current, ind);
while (current != NULL) {
count = count+step;
temp->next = list_nth(head, count);
current = current->next;
}
return temp; /* return the final stepped list */
}
int main(void)
{
node *n = NULL, *list1=NULL, *list2=NULL, *list3=NULL, *list4=NULL;
int i;
/* List with 30 nodes */
for(i=0;i<=30;i++){
list_add(&n, i);
}
list_print(n);
/* Get 1th, 5th, 9th, 13th, 18th ... nodes of n etc */
list1 = list_split(n, 1, 4);
list_print(list1);
list2 = list_split(n, 2, 4); /* 2, 6, 10, 14 etc */
list_print(list2);
list3 = list_split(n, 3, 4); /* 3, 7, 11, 15 etc */
list_print(list3);
list3 = list_split(n, 4, 4); /* 4, 8, 12, 16 etc */
list_print(list4);
getch();
return 0;
}
temp = list_nth(current, ind);
while (current != NULL) {
count = count+step;
temp->next = list_nth(head, count);
current = current->next;
}
You are finding the correct item to begin the split at, but look at what happens to temp from then on ... you only ever assign to temp->next.
You need to keep track of both the head of your split list and the tail where you are inserting new items.
The program, actually, has more than one problem.
Indexes are not a native way to address linked list content. Normally, pointers to nodes or iterators (which are disguised pointers to nodes) are used. With indexes, accessing a node has linear complexity (O(n)) instead of constant O(1).
Note that list_nth returns a pointer to a "live" node within a list, not a copy. By assigning to temp->next in list_split, you are rewiring the original list instead of creating a new one (but maybe it's intentional?)
Within list_split, temp is never advanced, so the loop just keeps attaching nodes to the head instead of to the tail.
Because list_nth finds each node by iterating through the whole list from the beginning, list_split takes quadratic time (O(n^2)) instead of linear time. It's better to rewrite the function to iterate through the list once and copy (or re-attach) the required nodes as it passes them, instead of calling list_nth from the head. Alternatively, you can advance with current = list_nth(current, step).
[EDIT] Forgot to mention: since you are rewiring the original list, writing list_nth(head, count) is incorrect, because it will be travelling the "short-circuited" list, not the unmodified one.
I also notice that it looks like you are skipping the first record in the list when you call list_nth. Remember, in C we normally start counting at zero.
Draw out a Linked List diagram and follow your logic:
[0]->[1]->[2]->[3]->[4]->[5]->[6]->[7]->[8]->[9]->...->[10]->[NULL]
Your description of what list_split is supposed to return is pretty clear, but it's not clear what is supposed to happen, if anything, to the original list. Assuming it's not supposed to change:
node *list_split(node *head, int ind, int step) {
    node *newlist = NULL;
    node **end = &newlist;             /* where the next copied node gets linked */
    node *temp = list_nth(head, ind);
    while (temp != NULL) {
        *end = (node *)malloc(sizeof(node));
        if (*end == NULL) return NULL; /* note: nodes copied so far are leaked here */
        (*end)->data = temp->data;
        (*end)->next = NULL;           /* keep the new list terminated */
        end = &((*end)->next);
        temp = list_nth(temp, step);
    }
    return newlist; /* return the final stepped list */
}
(You probably want to factor a list_insert routine out of that, one that inserts a new
node at a given location. list_add isn't very useful here since it always adds to the
beginning of the list.)
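For comparison, here is a sketch of the single-pass variant mentioned in one of the answers above: it walks the source list once, copies every node whose index matches, and leaves the original list unchanged. list_split_once and list_free are hypothetical names, and copying (rather than re-attaching) the nodes is an assumption about what the split is supposed to do.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct ns {
    int data;
    struct ns *next;
} node;

/* Hypothetical helper: free every node of a heap-allocated list. */
static void list_free(node *n)
{
    while (n != NULL) {
        node *next = n->next;
        free(n);
        n = next;
    }
}

/* Hypothetical helper: copy the nodes at indexes ind, ind + step,
   ind + 2 * step, ... into a new list in one pass over the source.
   Indexes start at zero. Returns NULL if no index matches or if an
   allocation fails (the partial copy is freed in that case). */
static node *list_split_once(const node *head, int ind, int step)
{
    assert(ind >= 0 && step > 0);
    node *newlist = NULL;
    node **end = &newlist; /* where the next copy gets linked */
    int index = 0;
    int wanted = ind;      /* next source index to copy */

    for (const node *cur = head; cur != NULL; cur = cur->next, index++) {
        if (index != wanted)
            continue;
        node *copy = malloc(sizeof *copy);
        if (copy == NULL) {
            list_free(newlist);
            return NULL;
        }
        copy->data = cur->data;
        copy->next = NULL; /* the new list is always terminated */
        *end = copy;
        end = &copy->next;
        wanted += step;
    }
    return newlist;
}
```

Each source node is visited exactly once, so the cost is linear in the length of the list regardless of step.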

Resources