Adding twice in hash table in C

Based on print statements in the rehashing section, I can see that the same string is getting added to the table multiple times. I believe the problem is in this function, although I can't see where. The print statement below the return is never executed, so I know that the function actually returns.
Any ideas on where the problem is, or how I should go about finding it? Thanks. The full code on our repository is here:
https://github.com/csking1/buddhism/blob/master/Finding/hash_tables.c
void add_to_table(HashTable *h, char* str, LinkedList* new){
unsigned int hashval = hash(h, str);
LinkedList *list;
// walk through the table and check for the first free spot, start at hash val and go to the top
for (int i = hashval; i<h->size; i++){
list = h->table[i];
if (list == NULL){
new->next = h->table[i];
h->table[i] = new;
new->string = str;
return;
printf("%s\n", "didn't return");
}
}
// start at the bottom of the table, check for the first free spot up to hash val, then return
for (int i = 0; i < hashval; i++){
list = h->table[i];
if (list == NULL){
new->next = h->table[i];
h->table[i] = new;
new->string = str;
return;
}
}
}
For a full view, here's the string lookup function. It is called one or two lines before add_to_table(), and we return early if the string is already there.
LinkedList *lookup_string(HashTable *h, char *str){
LinkedList *list;
unsigned int hashval = hash(h, str);
for (int i = hashval; i < h->size; i++){
list = h->table[i];
if (list == NULL){
return NULL;
}
if (list != NULL){
if (strcmp(str, list->string) == 0){
return list;
}
}
}
for (int i = 0; i < hashval; i++){
list = h->table[i];
if (list == NULL){
return NULL;
}
if (list != NULL){
if (strcmp(str, list->string) == 0){
return list;
}
}
}
return NULL;
}
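One way to narrow this down is to let lookup and insertion share a single probe loop, so the two cannot disagree about where a string lives. The following is only a minimal sketch, assuming open addressing with linear probing over a small fixed-size table. `Table`, `toy_hash`, `probe`, and `add_once` are hypothetical names made up for illustration; they are not taken from the repository.

```c
#include <stddef.h>
#include <string.h>

#define TABLE_SIZE 8  /* hypothetical fixed size, for illustration only */

typedef struct {
    const char *slots[TABLE_SIZE];  /* NULL means the slot is free */
} Table;

/* Hypothetical stand-in for the question's hash(): sums the bytes. */
unsigned int toy_hash(const char *s) {
    unsigned int h = 0;
    while (*s != '\0')
        h += (unsigned char)*s++;
    return h % TABLE_SIZE;
}

/* Probe from the hash slot, wrapping around with modulo.
 * Returns the index that holds str, or the first free slot if str is
 * absent, or -1 if the table is full and str is not in it. */
int probe(const Table *t, const char *str) {
    unsigned int start = toy_hash(str);
    for (unsigned int n = 0; n < TABLE_SIZE; n++) {
        unsigned int i = (start + n) % TABLE_SIZE;
        if (t->slots[i] == NULL || strcmp(t->slots[i], str) == 0)
            return (int)i;
    }
    return -1;
}

/* Returns 1 if str was newly added, 0 if it was already present,
 * -1 if the table is full. */
int add_once(Table *t, const char *str) {
    int i = probe(t, str);
    if (i < 0)
        return -1;
    if (t->slots[i] != NULL)
        return 0;           /* already in the table, nothing to do */
    t->slots[i] = str;      /* stores the pointer only; the caller must keep str alive */
    return 1;
}
```

Like `new->string = str;` in the question, this stores the caller's pointer rather than a copy, so it is worth checking that the buffer behind `str` is not reused between calls.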

Related

Insert data into a trie

So I'm trying to insert data into a trie, and my code works fine. But when I change my insert function a little, it no longer works and also causes a memory leak. To me, both versions of insert do the same thing, but obviously they do not. Can someone please explain to me why? Thanks in advance.
Here is the code that works:
#include <stdio.h>
#include <stdbool.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#define SIZE 26
#define hash(c) (tolower(c) - (int)'a')
typedef struct node{
bool endWord;
struct node* children[SIZE];
} node;
void freeTrie(node* root){
if(root == NULL) return;
for (size_t i = 0; i < SIZE; i++) {
freeTrie(root->children[i]);
}
free(root);
}
node* newNode(){
node* new = NULL;
new = (node*) malloc(sizeof(node));
if(new != NULL){
new->endWord = false;
for(int i = 0; i < SIZE; i++)
new->children[i] = NULL;
}
return new;
}
void insert(node* root, const char* data){
node* temp = root;
for (size_t i = 0, len = strlen(data); i < len; i++) {
int index = hash(data[i]);
if(temp->children[index] == NULL){
temp->children[index] = newNode();
if (temp->children[index] /*still*/ == NULL){
printf("Something went wrong\n");
return;
}
}
temp = temp->children[index];
}
temp->endWord = true;
}
bool search(node* root, const char* data){
node* temp = root;
for (size_t i = 0, len = strlen(data); i < len; i++) {
int index = hash(data[i]);
temp = temp->children[index];
if (temp == NULL){
printf("search end here\n");
return false;
}
}
return (temp != NULL && temp->endWord);
}
int main() {
char data[][8] = {"fox", "foo", "dog", "do"};
node* root = newNode();
if(root == NULL){
printf("Something went wrong\n");
return 1;
}
for (size_t i = 0, dataSize = sizeof(data)/sizeof(data[0]); i < dataSize; i++) {
insert(root, data[i]);
}
printf("Check: \n");
char output[][32] = {"not found", "found"};
// char s[5];
// fscanf(stdin, "%s", s);
printf("%s\n", output[search(root, "fox")]);
freeTrie(root);
printf("Done\n");
return 0;
}
Here is the insert that confuses me:
void insert(node* root, const char* data){
node* temp = root;
for (size_t i = 0, len = strlen(data); i < len; i++) {
int index = hash(data[i]);
temp = temp->children[index];
if(temp == NULL){
temp = newNode();
if (temp /*still*/ == NULL){
printf("Something went wrong\n");
return;
}
}
}
temp->endWord = true;
}
PS: I am doing this for a problem set of the CS50x course, in which I have to load a dictionary of 143091 words (in alphabetical order) into my trie. My program takes about 0.1s to load and 0.06s to unload, whereas the staff's solution does the same job in just 0.02s and 0.01s. I am not allowed to see the staff's source code, but I guess they used a trie to store the data. How can I improve my code for a faster runtime? Would it run faster if I stored the data in an array and then used binary search instead?
When you write
temp = temp->children[index];
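you copy the value contained in temp->children[index] (I'll call it A) into a completely independent variable named temp. When you later modify temp, you modify temp only, not A. That is, none of the new nodes get inserted into the trie. Since nothing else points to them, they are also unreachable afterwards, which is where the memory leak comes from.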
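To make that concrete, here is a sketch of a variant that keeps the "advance first, then allocate" shape but works through a pointer to the child slot instead of a copy of its value, so the assignment lands inside the trie. It assumes the input contains only the letters a-z or A-Z. `new_node`, `insert_fixed`, `free_trie`, and `slot` are names introduced here for illustration, not the asker's code.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>

#define SIZE 26

typedef struct node {
    bool endWord;
    struct node *children[SIZE];
} node;

/* Allocates a node with no children; returns NULL on failure. */
node *new_node(void) {
    node *n = malloc(sizeof *n);
    if (n != NULL) {
        n->endWord = false;
        for (int i = 0; i < SIZE; i++)
            n->children[i] = NULL;
    }
    return n;
}

/* Returns false if an allocation fails. Assumes data holds letters only. */
bool insert_fixed(node *root, const char *data) {
    node *temp = root;
    for (; *data != '\0'; data++) {
        int index = tolower((unsigned char)*data) - 'a';
        /* slot points INTO the parent node, so writing *slot changes the trie */
        node **slot = &temp->children[index];
        if (*slot == NULL) {
            *slot = new_node();
            if (*slot == NULL)
                return false;
        }
        temp = *slot;  /* only now step down to the child */
    }
    temp->endWord = true;
    return true;
}

/* Frees the whole trie, children first. */
void free_trie(node *root) {
    if (root == NULL)
        return;
    for (int i = 0; i < SIZE; i++)
        free_trie(root->children[i]);
    free(root);
}
```

The working version in the question achieves the same effect by assigning to temp->children[index] directly before advancing temp.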

Trouble with strcmp function in C

I am writing a function called check that compares words from a text against an alphabetically ordered dictionary. Both the dictionary and the text are loaded in through the command line. The function is part of a larger program called speller that acts as a spell checker.
I ran several printf debugging tests to inspect the words being compared in the strcmp call, and this is where the problem shows up: the function reports every word in the text as misspelled, even when the printf output shows that the string from the dictionary and the string from the text are the same.
I don't know where to go from this point, so any help would be greatly appreciated.
Below is the code for the function in question. Thanks again.
typedef struct node {
char word[LENGTH + 1];
struct node *next;
} node;
node *hashtable[27];
/* Returns true if word is in dictionary else false. */
int hash_fun (const char key);
bool check (const char *word)
{
//case-desensitizing
char caseless[strlen (word)];
int i, length;
for (int head = 0; head < 26; head++) {
hashtable[head] = NULL;
}
for (i = 0, length = strlen (word); i < length; i++) {
//("%c\n",word[i]);
if (isupper (word[i])) {
caseless[i] = tolower (word[i]);
} else {
caseless[i] = word[i];
}
}
caseless[i] = '\0';
//printf("-%s %s- \n*",word, caseless);
int word_index = hash_fun (caseless);
//printf("%i", word_index);
node *new_node = malloc (sizeof (node));
if (new_node == NULL) {
return 2;
}
if (word_index >= 0) {
if (hashtable[word_index] == NULL) {
hashtable[word_index] = new_node;
new_node->next = NULL;
}
node *cursor = malloc (sizeof (node));
cursor = hashtable[word_index];
while (cursor != NULL) {
//printf("Dictionary:%s and Text:%s \n", cursor->word, caseless);
int found;
found = strcmp (caseless, cursor->word);
if (found == 0) {
return true;
}
cursor = cursor->next;
}
}
return false;
}
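Two things are visible in the posted code. The loop at the top of check sets hashtable[0] through hashtable[25] to NULL on every call, so whatever was stored in those entries is no longer reachable. Also, caseless is declared with strlen(word) elements, yet the terminator is written at index length, one past the end of the array. For the second point, here is a minimal sketch of a lowercase copy that reserves room for the terminator. `lower_copy` is a hypothetical helper, not part of the original program.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Returns a newly allocated lowercase copy of word, or NULL if the
 * allocation fails. The caller is responsible for calling free(). */
char *lower_copy(const char *word) {
    size_t len = strlen(word);
    char *out = malloc(len + 1);  /* +1 for the terminating '\0' */
    if (out == NULL)
        return NULL;
    for (size_t i = 0; i < len; i++)
        out[i] = (char)tolower((unsigned char)word[i]);
    out[len] = '\0';
    return out;
}
```

If a stack buffer is preferred, declaring it as char caseless[LENGTH + 1] would match the size of the word field in the node struct.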

Sorting and Displaying a Linked List

I am writing a library management program where the user can add books to a database, which just means that the program takes the user's input and stores it in a text document. When the program starts up, it reads through the text document where all the books are stored and builds a linked list in which each book is a node. I have gotten to the point where I can read the text file and store the values into nodes. However, when I test the bookList function to view the entire book list by title, my program crashes. Here is the code:
void loadingMenu(){
FILE *fp = fopen(filename, "r");
int lines = 0;
char line[254];
char *ti = malloc(MAX_STR_LEN);
char *au = malloc(MAX_STR_LEN);
char *ca = malloc(MAX_STR_LEN);
char *id = malloc(MAX_STR_LEN);
char *ti_chopped;
char *au_chopped;
char *ca_chopped;
char *id_chopped;
int id_num;
struct node *tempNode;
while(fgets(line, sizeof(line), fp)){
if(line == 'EOF'){
break;
}
if(lines == 7){
lines = 0;
}
if(lines == 0){
line[strcspn(line, "\n")] = 0; // remove '\n' from string
strcpy(ti, line);
}
else if(lines == 1){
line[strcspn(line, "\n")] = 0;
strcpy(au, line);
}
else if(lines == 3){
line[strcspn(line, "\n")] = 0;
strcpy(ca, line);
}
else if(lines == 6){
line[strcspn(line, "\n")] = 0;
strcpy(id, line);
}
lines++;
if(lines == 6){
// removing the identifiers from each string
ti_chopped = ti + 6;
au_chopped = au + 7;
ca_chopped = ca + 9;
id_chopped = id + 3;
id_num = atoi(id_chopped);
// ------create book node------------
tempNode = malloc(sizeof *tempNode);
// ----------------------------------
tempNode->next = NULL;
tempNode->titleptr = malloc(strlen(ti_chopped) + 1);
strcpy(tempNode->titleptr, ti_chopped);
tempNode->authorptr = malloc(strlen(au_chopped) + 1);
strcpy(tempNode->authorptr, au_chopped);
tempNode->categoryptr = malloc(strlen(ca_chopped) + 1);
strcpy(tempNode->categoryptr, ca_chopped);
tempNode->id = id_num;
//printf("%d", tempNode->id);
head = addNode(head, tempNode);
}
}
fclose(fp);
}
int compareNode(struct node *n1, struct node *n2){
int compareValue = strcmp(n1->titleptr, n2->titleptr);
if(compareValue == 0){
return 0;
}
else if(compareValue < 0){
return -1;
}
else {
return 1;
}
}
struct node *addNode(struct node *list, struct node *node1){
struct node* tmp = list;
if(list == NULL){
return node1;
}
if(compareNode(node1,list) == -1){
node1->next = list;
list = node1;
return list;
}
else
{
struct node* prev = list;
while(tmp != NULL && compareNode(node1, tmp) >= 0){
prev = tmp;
tmp = tmp->next;
}
prev->next = node1;
node1->next = tmp;
return list;
}
}
void bookList(){
system("cls");
struct node *tmp;
tmp = head;
printf("List of all Books: ");
while(tmp != NULL)
{
printf("%s\n", tmp->titleptr);
tmp = tmp->next;
}
printf("\n\nEnd of list.");
}
First off I would like to apologize for the bad code, also I trimmed away some of the fat of the program and just left the functions involved behind.
So, please if you could help me out on this or at least point me in the right direction I would be very grateful. Also, if you have any coding tips or comments go ahead and tell me, I am always hungry to learn!
EDIT: The code now runs and the list will print, however, the strings will not be in alphabetical order. Currently trying to figure that out.
In the first branch of addNode(), you need to set the previous pointer. That should fix the problem:
if(compareNode(node1,list) == -1)
{
node1->next = list;
list->prev = node1;
list = node1;
return list;
}
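For comparison, a sorted insert on a singly linked list can also be written by walking a pointer to the link that will be rewritten, which needs no prev field. This is only a sketch: the struct below is a reduced, hypothetical version of the book node with just a title, and `add_sorted` is a name introduced here.

```c
#include <stddef.h>
#include <string.h>

struct node {
    const char *titleptr;
    struct node *next;
};

/* Inserts n so that the list stays sorted by title in strcmp order,
 * and returns the (possibly new) head. Equal titles keep insertion order. */
struct node *add_sorted(struct node *head, struct node *n) {
    struct node **link = &head;  /* address of the pointer we may overwrite */
    while (*link != NULL && strcmp(n->titleptr, (*link)->titleptr) >= 0)
        link = &(*link)->next;
    n->next = *link;
    *link = n;
    return head;
}
```

Note that strcmp compares character codes, so on ASCII systems every uppercase letter sorts before every lowercase one. That is one possible reason a list can look out of alphabetical order when titles mix cases.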

Hashmap with Linked List to find word count

I have been working on this little project for quite some time and I can't figure out why I'm not getting the expected results. I am a beginner at C programming, so my understanding of pointers and memory allocation/deallocation is limited. I built this segment of code by first writing a hash function and then adding a count to it. However, when I test it, sometimes the count works and sometimes it doesn't. I'm not sure whether the fault lies in the hash function or in the way I set up my count. The text file is read one line at a time, and each line is a string containing a hexadecimal value.
struct node {
char *data;
struct node *next;
int count; /* Implement count here for word frequencies */
};
#define H_SIZE 1024
struct node *hashtable[H_SIZE]; /* Declaration of hash table */
void h_lookup(void)
{
int i = 0;
struct node *tmp;
for(i = 0; i < H_SIZE; i++) {
for(tmp = hashtable[i]; tmp != NULL; tmp = tmp->next) {
if(tmp->data != 0) {
printf("Index: %d\nData: %s\nCount: %d\n\n", i,
tmp->data, tmp->count);
}
}
}
}
/* self explanatory */
void h_add(char *data)
{
unsigned int i = h_assign(data);
struct node *tmp;
char *strdup(const char *s);
/* Checks to see if data exists, consider inserting COUNT here */
for(tmp = hashtable[i]; tmp != NULL; tmp = tmp->next) {
if(tmp->data != 0) { /* root node */
int count = tmp->count;
if(!strcmp(data, tmp->data))
count= count+1;
tmp->count = count;
return;
}
}
for(tmp = hashtable[i]; tmp->next != NULL; tmp = tmp->next);
if(tmp->next == NULL) {
tmp->next = h_alloc();
tmp = tmp->next;
tmp->data = strdup(data);
tmp->next = NULL;
tmp->count = 1;
} else
exit(EXIT_FAILURE);
}
/* Hash function, takes value (string) and converts into an index into the array of linked lists) */
unsigned int h_assign(char *string)
{
unsigned int num = 0;
while(*string++ != '\0')
num += *string;
return num % H_SIZE;
}
/* h_initialize(void) initializes the array of linked lists. Allocates one node for each list by calling h_alloc which creates a new node and sets node.next to null */
void h_initialize(void)
{ int i;
for(i = 0; i <H_SIZE; i++) {
hashtable[i] = h_alloc();
}
}
/* h_alloc(void) is a method which creates a new node and sets it's pointer to null */
struct node *h_alloc(void)
{
struct node *tmp = calloc(1, sizeof(struct node));
if (tmp != NULL){
tmp->next = NULL;
return tmp;
}
else{
exit(EXIT_FAILURE);
}
}
/* Clean up hashtable and free up memory */
void h_free(void)
{
struct node *tmp;
struct node *fwd;
int x;
for(x = 0; x < H_SIZE; x++) {
tmp = hashtable[x];
while(tmp != NULL) {
fwd = tmp->next;
free(tmp->data);
free(tmp);
tmp = fwd;
}
}
}
I assume that the count is not being incremented when it does not work. It is possible that strdup is not able to allocate memory for the new string and is returning NULL. You should check the return value and exit gracefully if it fails.
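Independent of strdup, it may be worth comparing against a version in which the early return happens only on a match. In the posted h_add the return sits outside the strcmp check, so the function returns at the first node that has data whether or not the strings are equal; also, h_assign advances the pointer before reading, so the first character never contributes to the hash. The following is a sketch under assumptions, not the asker's design: it uses no pre-allocated head nodes, pushes new nodes at the front of the bucket, and copies the string with malloc and memcpy. `toy_hash` and `bump` are hypothetical names.

```c
#include <stdlib.h>
#include <string.h>

#define H_SIZE 1024

struct node {
    char *data;
    struct node *next;
    int count;
};

static struct node *table[H_SIZE];  /* file scope, so every bucket starts as NULL */

/* Hypothetical hash: sums every byte, including the first one. */
unsigned int toy_hash(const char *s) {
    unsigned int num = 0;
    for (; *s != '\0'; s++)
        num += (unsigned char)*s;
    return num % H_SIZE;
}

/* Records one occurrence of word.
 * Returns the updated count, or 0 if an allocation failed. */
int bump(const char *word) {
    unsigned int i = toy_hash(word);
    for (struct node *tmp = table[i]; tmp != NULL; tmp = tmp->next) {
        if (strcmp(word, tmp->data) == 0)
            return ++tmp->count;  /* return only when the strings match */
    }
    /* Not found in this bucket: copy the string and push a new node. */
    size_t len = strlen(word) + 1;
    struct node *n = malloc(sizeof *n);
    char *copy = malloc(len);
    if (n == NULL || copy == NULL) {
        free(n);
        free(copy);
        return 0;
    }
    memcpy(copy, word, len);
    n->data = copy;
    n->count = 1;
    n->next = table[i];
    table[i] = n;
    return 1;
}
```

With this shape, two different words that land in the same bucket each keep their own count.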

Pointer NULL issues

void addWord(char *word, bucket **bkt, int size)
{
bucket *node, *auxNode;
if(findWord(word, bkt[hash(word, size)]) == 1)
{
return;
}
node = (bucket*) malloc (sizeof(bucket));
node->data = (char*) malloc (strlen(word) * sizeof(char));
memset(node->data, 0, strlen(word));
sprintf(node->data, "%s", word);
if(*bkt == NULL)
{
node->next = NULL;
*bkt = node;
}
else
{
auxNode = (bucket*) malloc (sizeof(bucket));
auxNode = *bkt;
while(auxNode->next != NULL)
{
auxNode = auxNode->next;
}
node->next = NULL;
auxNode->next = node;
}
}
int main(int argc, char **argv)
{
............
bkt = (bucket**) malloc (*sizeHash * sizeof(bucket*));
for(i = 0 ; i < (*sizeHash) ; i++)
{
printf("%d\n", i);
bkt[i] = NULL;
}
.........
if(bkt[hash(pch, *sizeHash)] == NULL)
{
printf("NULL: %s -> %d\n",pch, hash(pch, *sizeHash));
bkt[hash(pch, *sizeHash)] = NULL;
}
addWord(pch, &bkt[hash(pch, *sizeHash)], *sizeHash);
Every time execution enters that if, it means the node being sent is NULL. But after two inserts, on the third one, even though execution enters that if, the pointer arrives in addWord as non-NULL (I put a printf before findWord). I don't understand why this happens. This is a hash table; hash() is Dan Bernstein's djb2. Could somebody tell me why the NULL pointer isn't being sent to addWord()?
Surely this:
if(findWord(word, bkt[hash(word, size)]) == 1)
is supposed to be this:
if(findWord(word, *bkt) == 1)
?
Remember that the bkt inside addWord is the &bkt[hash(pch, *sizeHash)] from main: it already points to the hash-entry for word.
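Put together, that could look like the sketch below, where the function receives the address of one hash entry and never indexes it again. It also allocates strlen(word) + 1 bytes, because the posted version leaves no room for the terminating '\0' that sprintf writes; that is an observation added here, not part of the suggestion above. `find_word`, `add_word`, and `free_chain` are hypothetical stand-ins, and the hash function is left out.

```c
#include <stdlib.h>
#include <string.h>

typedef struct bucket {
    char *data;
    struct bucket *next;
} bucket;

/* Returns 1 if word is in the chain starting at head, else 0. */
int find_word(const char *word, const bucket *head) {
    for (; head != NULL; head = head->next)
        if (strcmp(head->data, word) == 0)
            return 1;
    return 0;
}

/* slot is the address of ONE hash entry, e.g. &bkt[hash(word, size)].
 * Returns 1 if added, 0 if already present, -1 on allocation failure. */
int add_word(const char *word, bucket **slot) {
    if (find_word(word, *slot))
        return 0;
    bucket *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    size_t len = strlen(word) + 1;  /* +1 for the terminating '\0' */
    node->data = malloc(len);
    if (node->data == NULL) {
        free(node);
        return -1;
    }
    memcpy(node->data, word, len);
    node->next = NULL;
    while (*slot != NULL)           /* walk to the end of this chain */
        slot = &(*slot)->next;
    *slot = node;
    return 1;
}

/* Frees every node in a chain together with its string. */
void free_chain(bucket *head) {
    while (head != NULL) {
        bucket *next = head->next;
        free(head->data);
        free(head);
        head = next;
    }
}
```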
