C - Segfault when accessing struct member in a HashTable (insert function)

I am new to C and am having issues implementing an insert function for my HashTable.
Here are my structs:
typedef struct HashTableNode {
char *url; // url previously seen
struct HashTableNode *next; // pointer to next node
} HashTableNode;
typedef struct HashTable {
HashTableNode *table[MAX_HASH_SLOT]; // actual hashtable
} HashTable;
Here is how I init the table:
HashTable *initTable(){
HashTable* d = (HashTable*)malloc(sizeof(HashTable));
int i;
for (i = 0; i < MAX_HASH_SLOT; i++) {
d->table[i] = NULL;
}
return d;
}
Here is my insert function:
int HashTableInsert(HashTable *table, char *url){
long int hashindex = JenkinsHash(url, MAX_HASH_SLOT);
int uniqueBool = 2; // 0 for true, 1 for false, 2 for init
HashTableNode* theNode = (HashTableNode*)malloc(sizeof(HashTableNode));
theNode->url = url;
if (table->table[hashindex] != NULL) { // if we have a collision
HashTableNode* currentNode = (HashTableNode*)malloc(sizeof(HashTableNode));
currentNode = table->table[hashindex]->next; // the next node in the list
if (currentNode == NULL) { // only one node currently in list
if (strcmp(table->table[hashindex]->url, theNode->url) != 0) { // unique node
table->table[hashindex]->next = theNode;
return 0;
}
else{
printf("Repeated Node\n");
return 1;
}
}
else { // multiple nodes in this slot
printf("There was more than one element in this slot to start with. \n");
while (currentNode != NULL)
{
// SEGFAULT when accessing currentNode->url HERE
if (strcmp(currentNode->url, table->table[hashindex]->url) == 0 ){ // same URL
uniqueBool = 1;
}
else{
uniqueBool = 0;
}
currentNode = currentNode->next;
}
}
if (uniqueBool == 0) {
printf("Unique URL\n");
theNode->next = table->table[hashindex]->next; // splice current node in
table->table[hashindex]->next = theNode; // needs to be a node for each slot
return 0;
}
}
else{
printf("simple placement into an empty slot\n");
table->table[hashindex] = theNode;
}
return 0;
}
I get SegFault every time I try to access currentNode->url (the next node in the linked list of a given slot), which SHOULD have a string in it if the node itself is not NULL.
I know this code is a little dicey, so thank you in advance to anyone up for the challenge.
Chip
UPDATE:
This is the function that calls all the hash table functions. Through my testing on regular strings in main() of hash table.c, I have concluded that the segfault is due to something here:
void crawlPage(WebPage * page){
char * new_url = NULL;
int pos= 0;
pos = GetNextURL(page->html, pos, URL_PREFIX, &new_url);
while (pos != -1){
if (HashTableLookup(URLsVisited, new_url) == 1){ // url not in table
printf("url is not in table......\n");
hti(URLsVisited, new_url);
WebPage * newPage = (WebPage*) calloc(1, sizeof(WebPage));
newPage->url = new_url;
printf("Adding to LIST...\n");
add(&URLList, newPage); // added & to it.. no seg fault
}
else{
printf("skipping url cuz it is already in table\n");
}
new_url = NULL;
pos = GetNextURL(page->html, pos, URL_PREFIX, &new_url);
}
printf("freeing\n");
free(new_url); // cleanup
free(page); // free current page
}

Your hash table insertion logic violates some rather fundamental rules.
Allocating a new node before determining you actually need one.
Blatant memory leak in your currentNode allocation
Suspicious ownership semantics of the url pointer.
Beyond that, this algorithm is being made way too complicated for what it really should be.
Compute the hash index via hash-value modulo the table size.
Start at the table slot of the hash index, walking node pointers until one of two things happens:
1. You discover the node is already present.
2. You reach the end of the collision chain.
Only in #2 above do you actually allocate a collision node and chain it to your existing collision list. Most of this is trivial when employing a pointer-to-pointer approach, which I demonstrate below:
int HashTableInsert(HashTable *table, const char *url)
{
// find collision list starting point
long int hashindex = JenkinsHash(url, MAX_HASH_SLOT);
HashTableNode **pp = table->table+hashindex;
// walk the collision list looking for a match
while (*pp && strcmp(url, (*pp)->url))
pp = &(*pp)->next;
if (!*pp)
{
// no matching node found. insert a new one.
HashTableNode *pNew = malloc(sizeof *pNew);
if (!pNew)
return -1; // out of memory
pNew->url = strdup(url);
if (!pNew->url)
{
free(pNew);
return -1; // out of memory
}
pNew->next = NULL;
*pp = pNew;
}
else
{ // url already in the table
printf("url \"%s\" already present\n", url);
return 1;
}
return 0;
}
That really is all there is to it.
The url ownership issue I mentioned earlier is addressed above via string duplication using strdup(). Although it was long not part of the standard C library, it is POSIX compliant and every non-neanderthal half-baked implementation I've seen in the last two decades provides it. If yours doesn't, (a) I'd like to know what you're using, and (b) it's trivial to implement with strlen and malloc. Regardless, when the nodes are being released during value removal or table wiping, be sure to free a node's url before freeing the node itself.
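As a rough sketch of the two points above, assuming your C library really does lack strdup(): the names dup_string and HashTableFree are made up for illustration, and MAX_HASH_SLOT is given an arbitrary value here only so the fragment compiles on its own.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_HASH_SLOT 10000  /* assumed value; use your own */

typedef struct HashTableNode {
    char *url;
    struct HashTableNode *next;
} HashTableNode;

typedef struct HashTable {
    HashTableNode *table[MAX_HASH_SLOT];
} HashTable;

/* Hypothetical stand-in for strdup(): returns a malloc'd copy, or NULL. */
char *dup_string(const char *s)
{
    size_t len = strlen(s) + 1;   /* include the terminating null byte */
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, s, len);
    return copy;
}

/* Hypothetical cleanup: frees each node's url, then the node, then the table. */
void HashTableFree(HashTable *table)
{
    if (table == NULL)
        return;
    for (int i = 0; i < MAX_HASH_SLOT; i++) {
        HashTableNode *node = table->table[i];
        while (node != NULL) {
            HashTableNode *next = node->next;  /* save before freeing node */
            free(node->url);                   /* the string first ... */
            free(node);                        /* ... then its owner */
            node = next;
        }
        table->table[i] = NULL;
    }
    free(table);
}
```

This only works if every url stored in the table was duplicated on insertion, as in the insert function above; freeing a pointer the table does not own would be an error.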
Best of luck.

Related

A pointer points to a NULL pointer

code from cs50 harvard course dealing with linked list:
---The problem I do not understand is that when node *ptr points to numbers, which is a null pointer, how can the for loop: (node *ptr = numbers; ptr != NULL) run at all since *numbers = NULL?---
full version of the codes can be found at: https://cdn.cs50.net/2017/fall/lectures/5/src5/list2.c
#include <cs50.h>
#include <stdio.h>
typedef struct node
{
int number;
struct node *next;
}
node;
int main(void)
{
// Memory for numbers
node *numbers = NULL;
// Prompt for numbers (until EOF)
while (true)
{
// Prompt for number
int number = get_int("number: ");
// Check for EOF
if (number == INT_MAX)
{
break;
}
// Check whether number is already in list
bool found = false;
for (node *ptr = numbers; ptr != NULL; ptr = ptr->next)
{
if (ptr->number == number)
{
found = true;
break;
}
}
The loop is to check for prior existence in the list actively being built. If not there (found was never set true), the remaining inconveniently omitted code adds it to the list.
On the initial run, the numbers linked list head pointer is null, signifying an empty list. That doesn't change the algorithm of search + if-not-found-insert whatsoever. It just means the loop is never entered because the bail-case is immediately true. In other words, with numbers being NULL in
for (node *ptr = numbers; ptr != NULL; ptr = ptr->next)
the condition to continue, ptr != NULL is already false, so the body of the for-loop is simply skipped. That leads to the remainder of the code you didn't post, which does the actual insertion. After that insertion, the list now has something, and the next iteration of the outer-while loop will eventually scan the list again after the next prospect value is read. This continues until the outer-while condition is no longer satisfied.
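To see the skipped-body behaviour in isolation, here is a small sketch. The function count_iterations is a hypothetical helper, not part of the cs50 code; it only counts how often the loop body runs for a given head pointer.

```c
#include <stddef.h>

typedef struct node {
    int number;
    struct node *next;
} node;

/* Counts how many times the loop body runs for a given head pointer. */
int count_iterations(const node *numbers)
{
    int iterations = 0;
    for (const node *ptr = numbers; ptr != NULL; ptr = ptr->next)
        iterations++;   /* never reached when numbers is NULL */
    return iterations;
}
```

Calling it with NULL returns 0 without dereferencing anything, because the condition is tested before the body and before ptr = ptr->next is ever evaluated.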
A Different Approach
I have never been fond of the cs50 development strategy, and Harvard's technique for teaching C to entry-level CS students. The cs50 header and library have caused more confusion in the transition to real-world software engineering than one can fathom. Below is an alternative for reading a linked list of values, keeping only unique entries. It may look like a lot, but half of this is inline comments describing what is going on. Some of it will seem trivial, but the search-and-insert methodology is what you should be focusing on. It uses a pointer-to-pointer strategy that you're likely not familiar with, and this is good exposure.
Enjoy.
#include <stdio.h>
#include <stdlib.h>
struct node
{
int value;
struct node *next;
};
int main()
{
struct node *numbers = NULL;
int value = 0;
// retrieve list input. stop when we hit
// - anything that doesn't parse as an integer
// - a value less than zero
// - EOF
while (scanf("%d", &value) == 1 && value >= 0)
{
// finds the address-of (not the address-in) the first
// pointer whose node has a value matching ours, or the
// last pointer in the list (which points to NULL).
//
// note the "last" pointer will be the head pointer if
// the list is empty.
struct node **pp = &numbers;
while (*pp && (*pp)->value != value)
pp = &(*pp)->next;
// if we didn't find our value, `pp` holds the address of
// the last pointer in the list. Again, not a pointer to the
// last "node" in the list; rather the last actual "pointer"
// in the list. Think of it as the "next" member of last node,
// and in the case of an empty list, it will be the address of
// the head pointer. *That* is where we will be hanging our
// new node, and since we already know where it goes, there is
// no need to rescan the list again.
if (!*pp)
{
*pp = malloc(sizeof **pp);
if (!*pp)
{
perror("Failed to allocate new node");
exit(EXIT_FAILURE);
}
(*pp)->value = value;
(*pp)->next = NULL;
}
}
// display entire list, single line
for (struct node const *p = numbers; p; p = p->next)
printf("%d ", p->value);
fputc('\n', stdout);
// free the list
while (numbers)
{
struct node *tmp = numbers;
numbers = numbers->next;
free(tmp);
}
return EXIT_SUCCESS;
}
This approach is especially handy when building sorted lists, as it can be altered with just a few changes to do so.
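As an illustration of that remark, here is one way the search loop might be altered to keep the list in ascending order while still rejecting duplicates. The function name insert_sorted_unique and its return codes are assumptions made for this sketch.

```c
#include <stdlib.h>

struct node {
    int value;
    struct node *next;
};

/* Inserts value in ascending order, skipping duplicates.
   Returns 1 if inserted, 0 if already present, -1 on allocation failure. */
int insert_sorted_unique(struct node **head, int value)
{
    struct node **pp = head;

    /* stop at the first pointer whose node is not smaller than value,
       or at the last pointer in the list (which holds NULL) */
    while (*pp && (*pp)->value < value)
        pp = &(*pp)->next;

    if (*pp && (*pp)->value == value)
        return 0;

    struct node *n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->value = value;
    n->next = *pp;   /* new node goes in front of the node we stopped at */
    *pp = n;
    return 1;
}
```

The only changes from the unsorted version are the comparison in the loop (< instead of !=) and linking the new node to *pp rather than to NULL.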
If you examine the rest of the code at the shared link, which is also within the while loop, you can see where numbers is modified:
if (!found)
{
// Allocate space for number
node *n = malloc(sizeof(node));
if (!n)
{
return 1;
}
// Add number to list
n->number = number;
n->next = NULL;
if (numbers)
{
for (node *ptr = numbers; ptr != NULL; ptr = ptr->next)
{
if (!ptr->next)
{
ptr->next = n;
break;
}
}
}
else
{
numbers = n;
}
}
And yes, the body of the for loop is not executed on the first pass, so your thinking is correct.

C: Queue and Memory Limit Exceeded

I'm a C beginner and decided to participate in a small online contest in order to practice.
In the current problem I'm asked to write a queue with a struct that responds to the commands PushBack and PopFront.
The input consists of
A number n (n <= 1000000) indicating the number of commands inputs.
n lines. Each line consists of two integer numbers a and b:
a is 2 for executing PopFront, in which case b is the expected popped value.
a is 3 for PushBack, in which case b is the value to be enqueued.
If we try to pop from an empty queue then the value returned is -1.
The task is to print YES or NO after executing the last command, depending on whether every value returned by PopFront during the program's execution coincided with the expected value.
I implemented a version of this, but after submitting my answer the online judge reports Memory-Limit-Exceeded (on the last test out of 27).
I was reading about it and this issue may be related to some of these:
Using an array or data structure too big.
There is an infinite (or too big) recursion in the program.
An incorrect usage of pointers (diagnosed as MLE).
I'm not sure what the problem is. It seems to me that in some of the tests the number of node additions is far greater than the number of deletions (which means that 1. applies to my code), which in turn makes the while loop in EmptyQueue run for too long (so 2. applies as well). I'm not able to spot whether there is an incorrect usage of pointers.
My questions are:
What am I doing wrong here?
What should I do to fix this?
Code:
#include <stdio.h>
#include <stdbool.h>
#include <stdlib.h>
//===================
//Definitions:
typedef int Item;
typedef struct node
{
Item item;
struct node * next;
} Node;
typedef struct queue
{
Node * front;
Node * rear;
long counter;
} Queue;
//===================
//Function Prototypes:
void InitializeQueue(Queue * pq);
bool PushBack(Queue * pq, Item item);
int PopFront(Queue * pq);
void EmptyQueue(Queue * pq);
int main(void)
{
Queue line;
long n, i;
int command, expected, received;
bool check = true;
scanf("%ld", &n);
InitializeQueue(&line);
i = 0;
while (i < n)
{
scanf("%d %d", &command, &expected);
switch (command)
{
case 2:
received = PopFront(&line);
if (received != expected)
check = false;
break;
case 3:
PushBack(&line, expected);
break;
}
i++;
}
if (check == true)
printf("YES\n");
else
printf("NO\n");
// free memory used by all nodes
EmptyQueue(&line);
return 0;
}
void InitializeQueue(Queue * pq)
{
pq->front = NULL;
pq->rear = NULL;
pq->counter = 0;
}
bool PushBack(Queue * pq, Item item)
{
Node * pnode;
//Create node
pnode = (Node *)malloc(sizeof(Node));
if (pnode == NULL)
{
fputs("Impossible to allocate memory", stderr);
return false;
}
else
{
pnode->item = item;
pnode->next = NULL;
}
//Connect to Queue
if (pq->front == NULL)
{
pq->front = pnode;
pq->rear = pnode;
}
else
{
pq->rear->next = pnode;
pq->rear = pnode;
}
pq->counter++;
return true;
}
int PopFront(Queue * pq)
{
int popped;
Node * temp;
temp = pq->front;
if (pq->counter == 0)
return -1;
else
{
popped = pq->front->item;
pq->front = pq->front->next;
free(temp);
pq->counter--;
return popped;
}
}
void EmptyQueue(Queue * pq)
{
int dummy;
while (pq->counter != 0)
dummy = PopFront(pq);
}
Thanks.
I don't think there's actually anything wrong with that code functionally, though it could do with some formatting improvements :-)
I will mention one thing:
The task is to check whether the returned value after executing PopFront coincides with the expected one. If so, then print YES. Print NO, otherwise.
I would read this as a requirement on each PopFront. You appear to be storing the fault condition and only printing YES or NO once at the end.
I'd suggest fixing that as a start and see what the online judge comes back with.
This all ignores the fact that it's actually rather difficult to debug code unless you can reproduce the problem. If you can't get the data set from the online contest, it may be worth generating your own (large) one to see if you can get your code to fail.
Once you have a repeatable failure, debugging becomes massively easier.
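A generator for such a data set could look roughly like the sketch below. Everything about it is an assumption made for illustration: the name write_test_case, the push_percent knob, and the choice to emit correct expected values so that the right answer for the generated file is YES.

```c
#include <stdio.h>
#include <stdlib.h>

/* Writes n commands in the contest's input format to out.
   push_percent (0..100) controls how often a PushBack (3) is emitted.
   Every PopFront (2) carries the value a correct queue would return.
   Returns 0 on success, -1 on allocation failure. */
int write_test_case(FILE *out, long n, int push_percent, unsigned seed)
{
    int *pending = malloc((n > 0 ? (size_t)n : 1) * sizeof *pending);
    long head = 0, tail = 0;   /* pending[head..tail) holds queued values */

    if (pending == NULL)
        return -1;

    srand(seed);
    fprintf(out, "%ld\n", n);
    for (long i = 0; i < n; i++) {
        if (rand() % 100 < push_percent) {
            int value = rand() % 1000;
            pending[tail++] = value;          /* at most n pushes, so no overflow */
            fprintf(out, "3 %d\n", value);
        } else {
            int expected = (head == tail) ? -1 : pending[head++];
            fprintf(out, "2 %d\n", expected);
        }
    }
    free(pending);
    return 0;
}
```

Calling it with n = 1000000 and a high push_percent, writing to a file that is then fed to the queue program, gives a push-heavy case similar to the one the question suspects the judge is using.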
Although it's unlikely, you may (as mch points out in a comment) be running afoul of limited memory. I consider this unlikely as your own comments indicate only 5meg of space is being used at the end, which is not onerous. However, if that is the case, it's probably due to the fact that every single integer has the overhead of a pointer carried along with it.
If you wanted to investigate that avenue, you could slightly adjust the structures as follows (getting rid of the unnecessary counter as well):
#define ITEMS_PER_NODE 1000
typedef struct node {
Item item[ITEMS_PER_NODE]; // array of items.
int startIndex; // start index (one to pop from).
int nextIndex; // next index (one to push at).
struct node *next; // next node.
} Node;
typedef struct queue {
Node *front; // first multi-item node.
Node *rear; // last multi-item node.
} Queue;
The idea is to store many items per node so that the overhead of the next pointer is greatly reduced (one pointer per thousand items rather than one per item).
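To put rough numbers on that overhead, the two layouts can be compared with sizeof. This is only a sketch: the type names SingleNode and ChunkNode and the two helper functions are invented here, and the figures ignore whatever bookkeeping malloc adds to each allocation.

```c
#include <stddef.h>

typedef int Item;
#define ITEMS_PER_NODE 1000

/* one item per node, as in the question */
typedef struct single_node {
    Item item;
    struct single_node *next;
} SingleNode;

/* many items per node, as in the structure above */
typedef struct chunk_node {
    Item item[ITEMS_PER_NODE];
    int startIndex;
    int nextIndex;
    struct chunk_node *next;
} ChunkNode;

/* bytes of node storage per queued item in the one-item-per-node layout */
double bytes_per_item_single(void)
{
    return (double)sizeof(SingleNode);
}

/* bytes of node storage per queued item when a chunk is completely full */
double bytes_per_item_chunked(void)
{
    return (double)sizeof(ChunkNode) / ITEMS_PER_NODE;
}
```

On a typical 64-bit platform with 4-byte int the first figure is likely to be 16 and the second just over 4, but the exact values depend on the platform's pointer size and padding rules.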
The code for queue manipulation would then become slightly more complex but still understandable. First off, a helper function for creating a new node, ready for adding data to:
// Helper to allocate a new node and prep it for appending.
// Returns node or NULL (and prints error) if out of memory.
Node *GetNewNode(void) {
Node *pnode = malloc (sizeof(Node));
if (pnode == NULL)
fputs ("Impossible to allocate memory", stderr);
else {
pnode->startIndex = pnode->nextIndex = 0;
pnode->next = NULL; // EmptyQueue relies on the last node's next being NULL.
}
return pnode;
}
Next, the mostly unchanged queue initialisation:
void InitializeQueue (Queue *pq) {
pq->front = pq->rear = NULL;
}
The pushback is slightly more complex in that it first adds a new multi-item node if the queue is empty or current last node has reached the end. Whether that happens or not, an item is added to the final node:
bool PushBack (Queue *pq, Item item) {
// Default to adding to rear node (assuming space for now).
Node *pnode = pq->rear;
// Make sure queue has space at end for new item.
if (pq->front == NULL) {
// Handle empty queue first, add single node.
if ((pnode = GetNewNode()) == NULL)
return false;
pq->front = pq->rear = pnode;
} else if (pq->rear->nextIndex == ITEMS_PER_NODE) {
// Handle new node needed in non-empty queue, add to rear of queue.
if ((pnode = GetNewNode()) == NULL)
return false;
pq->rear->next = pnode;
pq->rear = pnode;
}
// Guaranteed space in (possibly new) rear node now, just add item.
pq->rear->item[pq->rear->nextIndex++] = item;
return true;
}
Popping is also a bit more complex - it gets the value to return then deletes the first node if it's now exhausted. That may also entail clearing the queue if the node it deletes was the only one:
int PopFront (Queue * pq) {
// Capture empty queue.
if (pq->front == NULL)
return -1;
// Get value to pop.
Node *currFront = pq->front;
int valuePopped = currFront->item[currFront->startIndex++];
// Detect current node now empty, delete it.
if (currFront->startIndex == currFront->nextIndex) {
// Detect last node in queue, just free and empty entire queue.
if (currFront == pq->rear) {
free (currFront);
pq->front = pq->rear = NULL;
} else {
// Otherwise remove front node, leaving others.
pq->front = currFront->next;
free (currFront);
}
}
// Regardless of queue manipulation, return popped value.
return valuePopped;
}
Emptying the queue is largely unchanged other than the fact we clear nodes rather than items:
void EmptyQueue (Queue * pq) {
// Can empty node at a time rather than item at a time.
while (pq->front != NULL) {
Node *currentFront = pq->front;
pq->front = pq->front->next;
free (currentFront);
}
}
I think it is better to use a simpler approach, like the one in the code I post here.
The code below doesn't match the input/output required by the contest, but it shows a simple, functional approach to the problem: a simple stack manager (if I understood correctly).
#include <stdio.h>
#include <stdlib.h>
int * stack;
int * base;
int cnt;
/* To emulate input file */
struct stFile {
int n;
struct stCmd {
int a;
int b;
} cmd[200]; // 200 is an arbitrary value.
} fdata = {
20,
{
{2,0},
{2,0},
{2,0},
{3,35},
{2,0},
{3,4},
{2,0},
{2,0},
{2,0},
{3,12},
{3,15},
{3,8},{3,18},
{2,0},
{2,0},
{3,111},
{2,0},
{2,0},
{2,0},
{2,0},
{3,8},{3,18},{3,8},{3,18},{3,8},{3,18},{3,8},{3,18},{3,8},{3,18},
{3,11},{3,13},{3,11},{3,11},{3,11},{3,11},{3,11},{3,11},
{3,11},{3,13},{3,11},{3,11},{3,11},{3,11},{3,11},{3,11},
{2,0},
{2,0},
{2,0},
{2,0},
{2,0},
{2,0},
{2,0},
{2,0},
{0,0}
}
};
int push(int item)
{
if (cnt) {
*stack = item;
stack++;
cnt--;
return 0;
} else {
return 1;
}
}
int pop(int *empty)
{
if (stack == base) { // empty: there is no element to read
if (empty)
*empty = 1;
return -1;
}
stack--;
cnt++;
if (empty)
*empty = 0;
return *stack;
}
int main(void)
{
int i=0,e=0;
cnt = fdata.n;
base = stack = malloc(cnt*sizeof(int));
if (!base) {
puts("Not enough memory!");
return 1;
}
while(fdata.cmd[i].a!=0) {
switch(fdata.cmd[i].a) {
case 2:
printf("popping ...: %d ",pop(&e));
printf("empty: %d\n",e);
break;
case 3:
e = push(fdata.cmd[i].b);
printf("pushing ...: %d %s\n",fdata.cmd[i].b,(e)?"not pushed":"pushed");
break;
default:
break;
};
i++;
}
if (base)
free(base);
return 0;
}

Segmentation fault while creating a linked list

I am writing a small program which stores data and a key inside a linked list structure, and retrieves data based on a key from the user. The program also checks whether the key is unique and, if so, stores the data by creating a node at the front of the list. But the code below segfaults every time.
#include<stdlib.h>
/* Node having data, unique key, and next */
struct node
{
int data;
int key;
struct node *next;
}*list='\0',*p;
/* Create a node at the front */
void storeData(int data_x,int key_x)
{
int check_key;
position *nn; //nn specifies newnode
nn=(position)malloc(sizeof(struct node));
/* Segmentation Fault occurs here */
if(list->next==NULL)
{
nn->next=list->next;
nn->data = data_x;
nn->key = key_x;
list->next = nn;
}
else
{
check_key=checkUniqueKey(key_x);
if(check_key != FALSE)
{
printf("The entered key is not unique");
}
else
{
nn->data = data_x;
nn->key = key_x;
nn->next=list->next;
list->next=nn;
}
}
}
/* Retreive data based on a key */
int retreiveData(int key_find)
{
int ret_data = NULL;
p=list->next;
while(p->next != NULL)
{
if(p->key == key_find)
{
ret_data = p->data;
break;
}
p=p->next;
}
return(ret_data);
}
/* Checks whether user key is unique */
int checkUniqueKey(int key_x)
{
int key_check = FALSE;
p=list->next;
while(p->next != NULL)
{
if(p->key == key_x)
{
key_check = TRUE;
break;
}
p=p->next;
}
return(key_check);
}
The segmentation fault occurs in the storeData function after the dynamic allocation.
There are some problems in your code:
your list handling is flawed: you always dereference the global pointer list, even before any list items are created. You should instead test if the list is empty by comparing list to NULL.
type position is not defined. Avoid hiding pointers behind typedefs, this is a great cause of confusion, which explains your mishandling of list pointers.
avoid defining a global variable with the name p, which is unneeded anyway. Define p as a local variable in the functions that use it.
NULL is the null pointer, 0 a zero integer value and \0 the null byte at the end of a C string. All 3 evaluate to 0 but are not always interchangeable.
For better portability and readability, use the appropriate one for each case.
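A small sketch of the three spellings, each in the role it is meant for. The helper names text_length and list_is_empty are made up for this illustration.

```c
#include <stddef.h>

struct node {
    int data;
    int key;
    struct node *next;
};

/* Length of a string, written out by hand to show '\0' as the terminator. */
size_t text_length(const char *s)
{
    size_t n = 0;             /* 0: an integer zero */
    while (s[n] != '\0')      /* '\0': the null byte ending a C string */
        n++;
    return n;
}

/* Returns 1 if the list has no nodes, 0 otherwise. */
int list_is_empty(const struct node *list)
{
    return list == NULL;      /* NULL: the null pointer */
}
```

Writing list == NULL instead of list == '\0' compiles to the same comparison, but it tells the reader that list is a pointer, which is exactly the information that was lost in the question's code.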
Here is an improved version:
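#include <stdio.h>
#include <stdlib.h>
#define FALSE 0
#define TRUE 1
/* Node having data, unique key, and next */
struct node {
int data;
int key;
struct node *next;
} *list;
/* Forward declaration: storeData calls this before its definition */
int checkUniqueKey(int key_x);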
/* Create a node at the front */
void storeData(int data_x, int key_x) {
if (checkUniqueKey(key_x)) {
printf("The entered key is not unique\n");
} else {
/* add a new node to the list */
struct node *nn = malloc(sizeof(struct node));
if (nn == NULL) {
printf("Cannot allocate memory for node\n");
return;
}
nn->data = data_x;
nn->key = key_x;
nn->next = list;
list = nn;
}
}
/* Retrieve data based on a key */
int retrieveData(int key_find) {
struct node *p;
int ret_data = 0;
for (p = list; p != NULL; p = p->next) {
if (p->key == key_find) {
ret_data = p->data;
break;
}
}
return ret_data;
}
/* Checks whether user key is unique */
int checkUniqueKey(int key_x) {
struct node *p;
int key_check = FALSE;
for (p = list; p != NULL; p = p->next) {
if (p->key == key_x) {
key_check = TRUE;
break;
}
}
return key_check;
}
You cast the address returned by malloc to position instead of to position *:
nn=(position)malloc(sizeof(struct node));
Compile your code with the gcc flags -Wall and -Wextra to catch this kind of issue.
Moreover, I don't know whether it is a mistake, but you malloc the size of a struct node while your nn variable is a pointer to position.
When you initialized your list pointer, you set it to NULL (written as '\0'). When the program then dereferences it, it accesses address 0x00, which lies outside the memory it is allowed to touch, and the operating system kills the process.
To avoid the segfault, you can make list a non-pointer object, so that the head node always exists, and write &list wherever a pointer is needed. Another solution is to define a separate variable root_node and initialize the list pointer as list = &root_node.
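The second suggestion amounts to a sentinel (dummy) head node. A minimal sketch follows, assuming the question's struct node; the function name store_front and its return codes are invented for illustration, and the uniqueness check is left out to keep the focus on the head node.

```c
#include <stdlib.h>

struct node {
    int data;
    int key;
    struct node *next;
};

/* Sentinel head: it holds no data, it only anchors the list. */
static struct node root_node = { 0, 0, NULL };
static struct node *list = &root_node;

/* Inserts at the front; returns 0 on success, -1 if malloc fails. */
int store_front(int data, int key)
{
    struct node *nn = malloc(sizeof *nn);
    if (nn == NULL)
        return -1;
    nn->data = data;
    nn->key = key;
    nn->next = list->next;   /* safe: list always points at root_node */
    list->next = nn;
    return 0;
}
```

With this layout the first real element is list->next, so traversals should start there and run while the pointer itself is non-NULL, not while its next member is non-NULL.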

How to free memory occupied by a Tree, C?

I'm currently dealing with a generic Tree with this structure:
typedef struct NODE {
//node's keys
unsigned short *transboard;
int depth;
unsigned int i;
unsigned int j;
int player;
int value;
struct NODE *leftchild; //points to the first child from the left
struct NODE *rightbrothers; //linked list of brothers from the current node
}NODE;
static NODE *GameTree = NULL;
The function that allocates the nodes is shown below (don't worry too much about the key values; it basically allocates the child nodes. If there aren't any, the new child goes to leftchild; otherwise it goes at the end of the list "node->leftchild->rightbrothers"):
static int AllocateChildren(NODE **T, int depth, unsigned int i, unsigned int j, int player, unsigned short *transboard) {
NODE *tmp = NULL;
if ((*T)->leftchild == NULL) {
if( (tmp = (NODE*)malloc(sizeof(NODE)) )== NULL) return 0;
else {
tmp->i = i;
tmp->j = j;
tmp->depth = depth;
(player == MAX ) ? (tmp->value = 2 ): (tmp->value = -2);
tmp->player = player;
tmp->transboard = transboard;
tmp->leftchild = NULL;
tmp->rightbrothers = NULL;
(*T)->leftchild = tmp;
}
}
else {
NODE *scorri = (*T)->leftchild;
while (scorri->rightbrothers != NULL)
scorri = scorri->rightbrothers;
if( ( tmp = (NODE*)malloc(sizeof(NODE)) )== NULL) return 0;
else {
tmp->i = i;
tmp->j = j;
tmp->depth = depth;
(player == MAX) ? (tmp->value = 2) : (tmp->value = -2);
tmp->player = player;
tmp->transboard = transboard;
tmp->leftchild = NULL;
tmp->rightbrothers = NULL;
}
scorri->rightbrothers = tmp;
}
return 1;
}
I need to come up with a function, possibly recursive, that deallocates the whole tree, so far I've come up with this:
void DeleteTree(NODE **T) {
if((*T) != NULL) {
NODE *tmp;
for(tmp = (*T)->children; tmp->brother != NULL; tmp = tmp->brother) {
DeleteTree(&tmp);
}
free(*T);
}
}
But it doesn't seem working, it doesn't even deallocate a single node of memory.
Any ideas of where I am being wrong or how can it be implemented?
P.s. I've gotten the idea of the recursive function from this pseudocode from my teacher. However I'm not sure I've translated it correctly in C with my kind of Tree.
Pseudocode:
1: function DeleteTree(T)
2: if T != NULL then
3: for c ∈ Children(T) do
4: DeleteTree(c)
5: end for
6: Delete(T)
7: end if
8: end function
One thing I like doing if I'm allocating lots of tree nodes that are going to go away at the same time is to allocate them in 'batches'. I malloc them as an array of nodes and dole them out from a special nodealloc function after saving a pointer to the array (in a function like the one below). To drop the tree I just make sure I'm not keeping any references and then call the free routine (also like the one below).
This can also reduce the amount of RAM you allocate if you're lucky (or very smart) with your initial malloc or can trust realloc not to move the block when you shrink it.
struct freecell { struct freecell * next; void * memp; } * saved_pointers = 0;
static void
save_ptr_for_free(void * memp)
{
struct freecell * n = malloc(sizeof*n);
if (!n) {perror("malloc"); return; }
n->next = saved_pointers;
n->memp = memp;
saved_pointers = n;
}
static void
free_saved_memory(void)
{
while(saved_pointers) {
struct freecell * n = saved_pointers;
saved_pointers = saved_pointers->next;
free(n->memp);
free(n);
}
}
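The nodealloc function mentioned above is not shown, so here is one way it might look. This is a sketch under several assumptions: the NODE struct is trimmed down to a stand-in, BATCH_SIZE is arbitrary, free_all_nodes is an invented wrapper, and save_ptr_for_free is changed to return a status so that a failed bookkeeping allocation does not leak the batch.

```c
#include <stdio.h>
#include <stdlib.h>

typedef struct NODE {      /* trimmed-down stand-in for the question's node */
    int value;
    struct NODE *leftchild;
    struct NODE *rightbrothers;
} NODE;

struct freecell { struct freecell *next; void *memp; };
static struct freecell *saved_pointers = NULL;

/* Remembers memp for a later bulk free; returns 0 on success, -1 on failure. */
static int save_ptr_for_free(void *memp)
{
    struct freecell *n = malloc(sizeof *n);
    if (!n) { perror("malloc"); return -1; }
    n->next = saved_pointers;
    n->memp = memp;
    saved_pointers = n;
    return 0;
}

static void free_saved_memory(void)
{
    while (saved_pointers) {
        struct freecell *n = saved_pointers;
        saved_pointers = saved_pointers->next;
        free(n->memp);
        free(n);
    }
}

#define BATCH_SIZE 256                   /* assumed batch size */
static NODE *batch = NULL;               /* current array of nodes */
static size_t batch_used = BATCH_SIZE;   /* forces a new batch on first call */

/* Hypothetical nodealloc: hands out zeroed nodes from the current batch. */
static NODE *nodealloc(void)
{
    if (batch_used == BATCH_SIZE) {
        NODE *fresh = calloc(BATCH_SIZE, sizeof *fresh);
        if (!fresh)
            return NULL;
        if (save_ptr_for_free(fresh) != 0) {
            free(fresh);
            return NULL;
        }
        batch = fresh;
        batch_used = 0;
    }
    return &batch[batch_used++];
}

/* Drops every node at once; no per-node free() is needed or allowed. */
static void free_all_nodes(void)
{
    free_saved_memory();
    batch = NULL;
    batch_used = BATCH_SIZE;
}
```

Nodes obtained this way must never be passed to free() individually, since they live inside a larger array; the whole tree goes away in one call to free_all_nodes.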
I've just realized my BIG mistake in the code and I'll just answer myself since no one had found the answer.
The error lies in this piece of code:
for(tmp = (*T)->children; tmp->brother != NULL; tmp = tmp->brother) {
DeleteTree(&tmp);
}
First of all, Ami Tavory was right about the for condition: I need to continue as long as tmp != NULL.
Beyond that, the loop cannot work because after DeleteTree(&tmp) returns, the node tmp pointed to has been freed, so I can no longer read its sibling pointer to move on to the next node to delete.
To fix it, I just needed to save the sibling pointer before the recursive call:
void DeleteTree(NODE **T) {
if((*T) != NULL) {
NODE *tmp, *nextbrother;
for(tmp = (*T)->leftchild; tmp != NULL; tmp = nextbrother) {
nextbrother = tmp->rightbrothers; // save the sibling before tmp is freed
DeleteTree(&tmp);
}
free(*T);
(*T) = NULL;
}
}
Just for the sake of completeness I want to add my version of DeleteTree
void DeleteTree(NODE *T) {
if(T != NULL) {
DeleteTree(T->rightbrothers);
DeleteTree(T->leftchild);
free(T);
}
}
I think it is much less obscure and much easier to read. Basically it solves the issue in DeleteTree but through eliminating the loop.
Since we free the nodes recursively we might as well do the whole process recursively.
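One trade-off of the fully recursive version is that the recursion depth grows with the length of each sibling list as well as with the depth of the tree. A possible middle ground, sketched below, loops over siblings and recurses only into children. The name DeleteTreeCounted and the returned count are additions made for illustration, the struct is reduced to its link fields, and transboard is deliberately not freed because the question does not say who owns it.

```c
#include <stdlib.h>

typedef struct NODE {      /* only the link fields of the question's struct */
    struct NODE *leftchild;
    struct NODE *rightbrothers;
} NODE;

/* Frees T, all of its right siblings, and all of their descendants.
   Returns the number of nodes freed. */
size_t DeleteTreeCounted(NODE *T)
{
    size_t freed = 0;
    while (T != NULL) {
        NODE *next = T->rightbrothers;              /* save before freeing T */
        freed += DeleteTreeCounted(T->leftchild);   /* recurse into children only */
        free(T);
        freed++;
        T = next;
    }
    return freed;
}
```

Comparing the returned count with the number of nodes allocated is a cheap way to confirm that nothing was missed.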

C programming question on the implementation of a hash table

I have a C programming question on the implementation of a hash table. I have implemented the hash table for storing some strings.
I am having a problem dealing with hash collisions. I am following a chained linked-list approach to overcome the problem but, somehow, my code is behaving differently. I am not able to debug it. Can somebody help?
This is what I am facing:
Say that the first time, I insert a string called gaur. My hash map calculates the index as 0 and inserts the string successfully. However, when I insert another string whose hash also turns out to be 0, my previous value gets overwritten, i.e. gaur is replaced by the new string.
This is my code:
struct list
{
char *string;
struct list *next;
};
struct hash_table
{
int size; /* the size of the table */
struct list **table; /* the table elements */
};
struct hash_table *create_hash_table(int size)
{
struct hash_table *new_table;
int i;
if (size<1) return NULL; /* invalid size for table */
/* Attempt to allocate memory for the table structure */
if ((new_table = malloc(sizeof(struct hash_table))) == NULL) {
return NULL;
}
/* Attempt to allocate memory for the table itself */
if ((new_table->table = malloc(sizeof(struct list *) * size)) == NULL) {
return NULL;
}
/* Initialize the elements of the table */
for(i=0; i<size; i++)
new_table->table[i] = '\0';
/* Set the table's size */
new_table->size = size;
return new_table;
}
unsigned int hash(struct hash_table *hashtable, char *str)
{
unsigned int hashval = 0;
int i = 0;
for(; *str != '\0'; str++)
{
hashval += str[i];
i++;
}
return (hashval % hashtable->size);
}
struct list *lookup_string(struct hash_table *hashtable, char *str)
{
printf("\n enters in lookup_string \n");
struct list * new_list;
unsigned int hashval = hash(hashtable, str);
/* Go to the correct list based on the hash value and see if str is
* in the list. If it is, return a pointer to the list element.
* If it isn't, the item isn't in the table, so return NULL.
*/
for(new_list = hashtable->table[hashval]; new_list != NULL;new_list = new_list->next)
{
if (strcmp(str, new_list->string) == 0)
return new_list;
}
printf("\n returns NULL in lookup_string \n");
return NULL;
}
int add_string(struct hash_table *hashtable, char *str)
{
printf("\n enters in add_string \n");
struct list *new_list;
struct list *current_list;
unsigned int hashval = hash(hashtable, str);
printf("\n hashval = %d", hashval);
/* Attempt to allocate memory for list */
if ((new_list = malloc(sizeof(struct list))) == NULL)
{
printf("\n enters here \n");
return 1;
}
/* Does item already exist? */
current_list = lookup_string(hashtable, str);
if (current_list == NULL)
{
printf("\n DEBUG Purpose \n");
printf("\n NULL \n");
}
/* item already exists, don't insert it again. */
if (current_list != NULL)
{
printf("\n Item already present...\n");
return 2;
}
/* Insert into list */
printf("\n Inserting...\n");
new_list->string = strdup(str);
new_list->next = NULL;
//new_list->next = hashtable->table[hashval];
if(hashtable->table[hashval] == NULL)
{
hashtable->table[hashval] = new_list;
}
else
{
struct list * temp_list = hashtable->table[hashval];
while(temp_list->next!=NULL)
temp_list = temp_list->next;
temp_list->next = new_list;
hashtable->table[hashval] = new_list;
}
return 0;
}
I haven't checked to confirm, but this line looks wrong:
hashtable->table[hashval] = new_list;
This is right at the end of the last case of add_string. You have:
correctly created the new struct list to hold the value being added
correctly found the head of the linked list for that hashvalue, and worked your way to the end of it
correctly put the new struct list at the end of the linked list
BUT then, with the line I quote above, you are telling the hash table to put the new struct list at the head of the linked list for this hashvalue! Thus throwing away the whole linked list that was there before.
I think you should omit the line I quote above, and see how you get on. The preceding lines are correctly appending it to the end of the existing list.
The statement hashtable->table[hashval] = new_list; is the culprit. You inserted new_list (I think a better name would have been new_node) at the end of the linked list. But then you overwrite this linked list with new_list, which is just a single node. Just remove this statement.
As others have already pointed out, you are walking to the end of the list with temp_list, appending new_list to it, then throwing away the existing list.
Since the same value NULL is used to indicate an empty bucket and the end of the list, it's quite a bit easier to put the new item at the head of the list.
You also should do any test which would result in the new item not being added before creating it, otherwise you will leak the memory.
I would also have an internal lookup function that takes the hash value; otherwise you have to calculate it twice:
int add_string(struct hash_table *hashtable, char *str)
{
unsigned int hashval = hash(hashtable, str);
/* item already exists, don't insert it again. */
if (lookup_hashed_string(hashtable, hashval, str))
return 2;
/* Attempt to allocate memory for list */
struct list *new_list = malloc(sizeof(struct list));
if (new_list == NULL)
return 1;
/* Insert into list */
new_list->string = strdup(str);
new_list->next = hashtable->table[hashval];
hashtable->table[hashval] = new_list;
return 0;
}
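The helper lookup_hashed_string is not shown in the answer. A minimal sketch of what it might look like, assuming the question's struct definitions and mirroring the loop from lookup_string; the const qualifier on str is a choice made here:

```c
#include <stddef.h>
#include <string.h>

struct list {
    char *string;
    struct list *next;
};

struct hash_table {
    int size;
    struct list **table;
};

/* Assumed helper: searches the bucket at hashval for str.
   Returns the matching element, or NULL if str is not in that bucket. */
struct list *lookup_hashed_string(struct hash_table *hashtable,
                                  unsigned int hashval, const char *str)
{
    struct list *item;
    for (item = hashtable->table[hashval]; item != NULL; item = item->next) {
        if (strcmp(str, item->string) == 0)
            return item;
    }
    return NULL;
}
```

The public lookup_string can then be reduced to computing the hash and delegating to this function, so the bucket walk exists in only one place.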
The hash function must take your data as input and return a bounded id (e.g. an integer between 0 and HASH_MAX - 1).
You then store each element in a list at index Hash(data) of the hash_table array. If another piece of data has the same hash, it is stored in the same list as the previous data.
struct your_type_list {
yourtype data;
struct your_type_list *next_data; // next element with the same hash
};
struct your_type_list hash_table[HASH_MAX];
