Load function trie segmentation fault - c

I keep getting segfault for my load function.
bool load(const char *dictionary)
{
//create a trie data type
typedef struct node
{
bool is_word;
struct node *children[27]; //this is a pointer too!
}node;
//create a pointer to the root of the trie and never move this (use traversal *)
node *root = malloc(sizeof(node));
for(int i=0; i<27; i++)
{
//NULL point all indexes of root -> children
root -> children[i] = NULL;
}
FILE *dptr = fopen(dictionary, "r");
if(dptr == NULL)
{
printf("Could not open dictionary\n");
return false;
}
char *c = NULL;
//scan the file char by char until end and store it in c
while(fscanf(dptr,"%s",c) != EOF)
{
//in the beginning of every word, make a traversal pointer copy of root so we can always refer back to root
node *trav = root;
//repeat for every word
while ((*c) != '\0')
{
//convert char into array index
int alpha = (tolower(*c) - 97);
//if array element is pointing to NULL, i.e. it hasn't been open yet,
if(trav -> children[alpha] == NULL)
{
//then create a new node and point it with the previous pointer.
node *next_node = malloc(sizeof(node));
trav -> children[alpha] = next_node;
//quit if malloc returns null
if(next_node == NULL)
{
printf("Could not open dictionary");
return false;
}
}
else if (trav -> children[alpha] != NULL)
{
//if an already existing path, just go to it
trav = trav -> children[alpha];
}
}
//a word is loaded.
trav -> is_word = true;
}
//success
free(root);
return true;
}
I checked whether I properly pointed new pointers to NULL during initialization. I have three types of nodes: root, traversal (for moving), and next_node. (i.) Am I allowed to point the nodes to NULL before mallocing them? (ii.) Also, how do I free 'next_node' if that node is declared and malloced inside an if statement? node *next_node = malloc(sizeof(node)); (iii.) If I want to make the nodes global variables, which ones should be global? (iv.) Lastly, where do I declare global variables: inside the main of speller.c, outside its main, or somewhere else? That's a lot of questions, so you don't have to answer all of them, but it would be nice if you could answer the ones you can! Please point out any other peculiarities in my code. There should be plenty. I will accept most answers.

The cause of the segmentation fault is the pointer c, for which you have not allocated any memory: it is still NULL when fscanf tries to write the word through it.
Also, in your program:
//scan the file char by char until end and store it in c
while(fscanf(dptr,"%s",c) != EOF)
Once you allocate memory for c, it will hold the word read from the dictionary file.
Further down in your code, you check for the '\0' character:
while ((*c) != '\0')
{
But you never advance c to the next character of the string that was read, so this while loop will run forever.
Maybe you can try something like this:
char *tmp;
tmp = c;
while ((*tmp) != '\0')
{
......
......
//Below in the loop at appropriate place
tmp++;
}
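Putting both points together, here is a minimal sketch of how the load function could look. It rests on some assumptions that are not in the question: a maximum word length of 45 (the LENGTH macro), helper names new_node, char_to_index and trie_insert that are made up for illustration, and the apostrophe stored in slot 26. It also moves the struct and root to file scope so that other functions can reach the trie, steps trav down after creating a node (the original only moved it in the else branch), and does not free root at the end of load, since the trie is needed afterwards.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>

#define LENGTH 45    /* assumed maximum word length */
#define CHILDREN 27  /* 26 letters plus the apostrophe */

typedef struct node
{
    bool is_word;
    struct node *children[CHILDREN];
} node;

/* File scope, so that functions other than load can reach the trie. */
static node *root = NULL;

/* Allocate a node with is_word false and every child set to NULL. */
static node *new_node(void)
{
    node *n = malloc(sizeof(node));
    if (n != NULL)
    {
        n->is_word = false;
        for (int i = 0; i < CHILDREN; i++)
            n->children[i] = NULL;
    }
    return n;
}

/* Letters map to 0..25, the apostrophe to 26, anything else to -1. */
static int char_to_index(char ch)
{
    int lower = tolower((unsigned char)ch);
    if (lower >= 'a' && lower <= 'z')
        return lower - 'a';
    return (ch == '\'') ? 26 : -1;
}

/* Insert one word: tmp walks the string while trav walks the trie. */
static bool trie_insert(const char *word)
{
    if (root == NULL && (root = new_node()) == NULL)
        return false;
    node *trav = root;
    for (const char *tmp = word; *tmp != '\0'; tmp++)
    {
        int alpha = char_to_index(*tmp);
        if (alpha < 0)
            return false;
        if (trav->children[alpha] == NULL)
        {
            trav->children[alpha] = new_node();
            if (trav->children[alpha] == NULL)
                return false;
        }
        trav = trav->children[alpha];  /* step down whether the node is new or not */
    }
    trav->is_word = true;
    return true;
}

bool load(const char *dictionary)
{
    FILE *dptr = fopen(dictionary, "r");
    if (dptr == NULL)
        return false;
    char word[LENGTH + 1];             /* real storage for fscanf to write into */
    bool ok = true;
    while (ok && fscanf(dptr, "%45s", word) == 1)  /* 45 must match LENGTH */
        ok = trie_insert(word);
    fclose(dptr);
    return ok;                         /* no free(root): the trie is needed after load */
}
```

The nodes created inside the if statement stay reachable through their parent's children array, so a separate unload function can free them later by walking the trie from root.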

Related

cs50 pset5 segmentation fault [loading text in memory using hash table]

I am currently working on pset5 from cs50.
My entire program compiles successfully but stops in the middle of the function called load when program is executed.
Below is my load function, and you can see the comment where it gave me a segmentation fault error.
If you can help me with figuring out how I should approach my error, please do let me know.
I understand that segmentation fault is caused when the program attempts to access a memory that does not belong to it.
However, I have allocated memory and checked whether there was enough memory to continue on the program.
I will provide comments to highlight what my code does.
// In another header file, I have defined 'LENGTH'
// Maximum length for a word
// (e.g., pneumonoultramicroscopicsilicovolcanoconiosis)
#define LENGTH 45
// Represents a node in a hash table
typedef struct node
{
char word[LENGTH + 1];
struct node *next;
}
node;
// Hash table
// I have initialized the array of `node` pointer to point `NULL`
node *table[N] = {NULL};
unsigned int word_counter = 0;
bool load(const char *dictionary)
{
// Open file, and if cannot open, return false
FILE *file = fopen(dictionary, "r");
if (file == NULL)
{
return false;
}
// read string in the file into array of character, `word` until reaching end of the file
char word[LENGTH + 1];
while (fscanf(file, "%s", word) != EOF)
{
// keep track of how many word exists in the file, for later use (not in this function)
word_counter += 1;
// allocated memory for struct type `node`, if not enough memory found, return false
node *n = (node*)malloc(sizeof(node));
if (n == NULL)
{
return false;
}
// assign index by hashing (hash function will not be posted in this question though.)
unsigned int index = hash(&word[0]);
// copy the word from file, into word field of struct type `node`
strncpy(n->word, word, sizeof(word));
// Access the node pointer in this index from array(table), and check is its `next` field points to NULL or not.
// If it is pointing to NULL, that means there is no word stored in this index of the bucket
if (table[index]->next == NULL) // THIS IS WHERE PROGRAM GIVES 'segmentation fault' !!!! :(
{
table[index]->next = n;
}
else
{
n->next = table[index];
table[index]->next = n;
}
}
return true;
}
You define and initialize the hash table as:
node *table[N] = {NULL};
That means you have an array of null-pointers.
When you insert the first value in the table, table[index] (for any valid index) will be a null pointer. That means table[index]->next attempts to dereference this null pointer, and you will have undefined behavior.
You need to check for a null pointer first:
if (table[index] == NULL)
{
n->next = NULL;
}
else
{
n->next = table[index];
}
table[index] = n;
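As a rough sketch of the same idea in a self-contained form: both branches of the check above end up storing the old bucket head in n->next, so the insertion can be written without the if at all. The bucket count N, the first-letter hash and the helper name insert_word are assumptions made for illustration; the question does not show its own N or hash function.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 45
#define N 26   /* assumed bucket count; the question does not show N */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
} node;

/* Static storage duration, so every bucket starts out as NULL. */
static node *table[N];

/* Placeholder hash on the first letter; stands in for the hash not shown in the question. */
static unsigned int hash(const char *word)
{
    return (unsigned int)(toupper((unsigned char)word[0]) - 'A') % N;
}

/* Insert a copy of word at the head of its bucket. */
static bool insert_word(const char *word)
{
    node *n = malloc(sizeof(node));
    if (n == NULL)
        return false;
    strncpy(n->word, word, LENGTH);
    n->word[LENGTH] = '\0';            /* strncpy does not always terminate */
    unsigned int index = hash(n->word);
    n->next = table[index];            /* NULL when the bucket is still empty */
    table[index] = n;                  /* the new node becomes the bucket head */
    return true;
}
```

Inserting at the head never dereferences table[index], which is why the null pointer case needs no special handling here.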

multi-character character constant [-Werror,-Wmultichar]

(lines 43-56) I am attempting to implement the load function for pset 5. I created a nested while loop, the outer one iterating until the end of the file and the inner one until the end of each word. I created char *c to store whatever "string" I scan from the dictionary, but when I compile the code below I get an error.
bool load(const char *dictionary)
{
//create a trie data type
typedef struct node
{
bool is_word;
struct node *children[27]; //this is a pointer too!
}node;
FILE *dptr = fopen(dictionary, "r");
if(dptr == NULL)
{
printf("Could not open dictionary\n");
unload();
return false;
}
//create a pointer to the root of the trie and never move this (use traversal *)
node *root = malloc(sizeof(node));
char *c = NULL;
//scan the file char by char until end and store it in c
while(fscanf(dptr,"%s",c) != EOF)
{
//in the beginning of every word, make a traversal pointer copy of root so we can always refer back to root
node *trav = root;
//repeat for every word
while ((*c) != '/0')
{
//convert char into array index
int alpha = ((*c) - 97);
//if array element is pointing to NULL, i.e. it hasn't been open yet,
if(trav -> children[alpha] == NULL)
{
//then create a new node and point it with the previous pointer.
node *next_node = malloc(sizeof(node));
trav -> children[alpha] = next_node;
//quit if malloc returns null
if(next_node == NULL)
{
printf("Could not open dictionary");
unload();
return false;
}
}
else if (trav -> children[alpha] != NULL)
{
//if an already existing path, just go to it
trav = trav -> children[alpha];
}
}
//a word is loaded.
trav -> is_word = true;
}
}
Error:
dictionary.c:52:23: error: multi-character character constant [-Werror,-Wmultichar]
while ((*c) != '/0')
I think this means '/0' should be a single character, but I don't know how else I would check for end of the word!
I also get another error message saying:
dictionary.c:84:1: error: control may reach end of non-void function [-Werror,-Wreturn-type]
}
I've been playing with it for a while now, and it is frustrating. Please help, and if you find any additional bugs, I'll be glad!
You want '\0' (the null terminating character, written with a backslash) instead of '/0'.
Additionally, don't forget to return a bool at the end of your function! That is what the second error (-Wreturn-type) is complaining about.
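A small illustration of both points, using a hypothetical helper is_all_lowercase that is not part of the pset: the loop stops at the '\0' terminator, and every path out of the function returns a bool.

```c
#include <ctype.h>
#include <stdbool.h>

/* Returns true when every character before the terminating '\0' is a lowercase letter. */
bool is_all_lowercase(const char *word)
{
    for (const char *p = word; *p != '\0'; p++)   /* backslash-zero is one character */
    {
        if (!islower((unsigned char)*p))
            return false;
    }
    return true;   /* reached only after the whole string has been scanned */
}
```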

CS50 pset5 Load Function

I'm having some trouble with the load section of pset5 on CS50, it would be great if someone could help. I'm trying to load a trie that reads from a dictionary (file fp below) and then iterates through the letters to create the trie.
I understand the concept of building a trie but I think I'm missing something with how the struct pointers are set up (hopefully I'm not way off track with the code below). I've tried to set up 'trav' to navigate through each stage of the trie.
I'm currently getting a segmentation fault so not entirely sure where to go next. Any help would be massively appreciated.
/**
* Loads dictionary into memory. Returns true if successful else false.
*/
bool load(const char* dictionary)
{
//create word node and set root
typedef struct node {
bool is_word;
struct node* children[27];
} node;
node* root = calloc(1, sizeof(root));
root -> is_word = false;
node* trav = root;
//open small dictionary
FILE* fp = fopen(dictionary, "r");
if (fp == NULL)
{
printf("Could not open %s.\n", dictionary);
return false;
}
//read characters one by one and write them to the trie
for (int c = fgetc(fp); c != EOF; c = fgetc(fp))
{
//set index using to lower. Use a-1 to set ' to 0 and other letters 1-27
int index = tolower(c)-('a'-1);
//if new line (so end of word) set is_word to true and return trav to root)
if (index == '\n')
{
trav->is_word = true;
trav = root;
}
//if trav-> children is NULL then create a new node assign to next
//and move trav to that position
if (trav->children[index] == NULL)
{
node* next = calloc(1, sizeof(node));
trav->children[index] = next;
trav = next;
}
//else pointer must exist so move trav straight on
else {
trav = trav->children[index];
}
}
fclose(fp);
return false;
}
I'm assuming you sized the children[] array to store the 26 letters of the alphabet plus the apostrophe. If so, when fgetc(fp) returns an apostrophe (ASCII code 39), index will be set to 39 - 96 = -57, which is definitely not a valid index into trav->children. That's probably where you're getting the segfault (or at least one of the places).
Hope this helps.
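One way to avoid the negative index is to map characters explicitly. The sketch below keeps the question's layout (apostrophe in slot 0, letters in slots 1 to 26) and returns -1 for everything else; the name child_index is made up for illustration.

```c
#include <ctype.h>

/* Apostrophe -> 0, 'a'/'A' -> 1 ... 'z'/'Z' -> 26, any other character -> -1. */
int child_index(int c)
{
    if (c == '\'')
        return 0;
    int lower = tolower(c);
    if (lower >= 'a' && lower <= 'z')
        return lower - ('a' - 1);
    return -1;   /* newline, digits, punctuation: not a valid child slot */
}
```

A few other things in the posted code look suspicious as well. The end-of-word test compares index with '\n' rather than the character c, and execution then falls through to the children lookup instead of moving on to the next character. calloc(1, sizeof(root)) allocates only the size of a pointer, not of a node. Finally, load returns false even when it reaches the end of the file.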

SEGMENTATION FAULT in strncpy - load from dictionary

I have this function "load" where I read words from a dictionary and put them in a hashtable of linked lists. When I try to read a line and save it in my new_node->text, the program crashes with SEGMENTATION FAULT and I don't know why. The error appears when I use strncpy.
#define HASHTABLE_SIZE 76801
typedef struct node
{
char text[LENGTH+1];
//char* text;
//link to the next word
struct node* next_word;
}
node;
node* hashtable[HASHTABLE_SIZE];
bool load(const char* dictionary)
{
FILE* file = fopen(dictionary,"r");
unsigned long index = 0;
char str[LENGTH+1];
if(file == NULL)
{
printf("Error opening file!");
return false;
}
while(! feof(file))
{
node * new_node = malloc(sizeof(node)+1000);
while( fscanf(file,"%s",str) > 0)
{
printf("The word is %s",str);
strncpy(new_node->text,str,LENGTH+1);
//strcpy(new_node->text,str);
new_node->next_word = NULL;
index = hash( (unsigned char*)new_node->text);
if(hashtable[index] == NULL)
{
hashtable[index] = new_node;
}
else
{
new_node->next_word = hashtable[index];
hashtable[index] = new_node;
}
n_words++;
}
//free(new_node);
}
fclose(file);
loaded = true;
return true;
}
Let's look at your code line by line, shall we?
while(! feof(file))
{
This is not the right way to use feof - check out the post Why is “while ( !feof (file) )” always wrong? right here on StackOverflow.
node * new_node = malloc(sizeof(node)+1000);
Hmm, ok. We allocate space for one node and 1000 bytes. That's a bit weird, but hey... RAM is cheap.
while( fscanf(file,"%s",str) > 0)
{
Uhm... another loop? OK...
printf("The word is %s",str);
strncpy(new_node->text,str,LENGTH+1);
//strcpy(new_node->text,str);
new_node->next_word = NULL;
index = hash( (unsigned char*)new_node->text);
Hey! Wait a second... in this second loop we keep overwriting new_node repeatedly...
if(hashtable[index] == NULL)
{
hashtable[index] = new_node;
}
else
{
new_node->next_word = hashtable[index];
hashtable[index] = new_node;
}
Assume for a second that both words hash to the same bucket:
OK, so the first time through the loop, hashtable[index] will point to NULL and be set to point to new_node.
The second time through the loop, hashtable[index] isn't NULL, so new_node will be made to point to whatever hashtable[index] points to (hint: new_node) and hashtable[index] will be made to point to new_node.
Do you know what an ouroboros is?
Now assume they don't hash to the same bucket:
One of the buckets now contains the wrong information. If you add "hello" in bucket 1 first and "goodbye" in bucket 2 first, when you try to traverse bucket 1 you may (only because the linking code is broken) find "goodbye" which doesn't belong in bucket 1 at all.
You should allocate a new node for every word you are adding. Don't reuse the same node.
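A minimal sketch of that fix, keeping the question's names where possible. The djb2-style hash is only a placeholder for the hash function the question does not show, LENGTH is assumed to be 45, and add_word and load_words are hypothetical helper names. The read loop tests the return value of fscanf instead of feof.

```c
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 45            /* assumed; defined elsewhere in the question */
#define HASHTABLE_SIZE 76801

typedef struct node
{
    char text[LENGTH + 1];
    struct node *next_word;
} node;

static node *hashtable[HASHTABLE_SIZE];
static unsigned long n_words = 0;

/* Placeholder djb2-style hash; the question's own hash() is not shown. */
static unsigned long hash(const unsigned char *s)
{
    unsigned long h = 5381;
    while (*s != '\0')
        h = h * 33 + *s++;
    return h % HASHTABLE_SIZE;
}

/* Allocate a fresh node for this word and push it onto its bucket. */
static bool add_word(const char *str)
{
    node *new_node = malloc(sizeof(node));
    if (new_node == NULL)
        return false;
    strncpy(new_node->text, str, LENGTH);
    new_node->text[LENGTH] = '\0';
    unsigned long index = hash((const unsigned char *)new_node->text);
    new_node->next_word = hashtable[index];   /* NULL for an empty bucket */
    hashtable[index] = new_node;
    n_words++;
    return true;
}

/* Read whitespace-separated words until fscanf stops matching. */
static bool load_words(FILE *file)
{
    char str[LENGTH + 1];
    while (fscanf(file, "%45s", str) == 1)    /* 45 must match LENGTH */
    {
        if (!add_word(str))
            return false;
    }
    return true;
}
```

Because each call to add_word allocates its own node, two words that land in the same bucket form a proper two-element list instead of a node that points to itself.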

Creating a singly linked list in C

I'm trying to create a singly linked list from an input text file for an assignment. I'm trying to do it a little bit at a time so I know my code is not complete. I tried creating the head pointer and just printing out its value and I can't even get that to work, but I'm not sure why. I included the struct, my create list, and print list functions. I didn't include the open file since that part works.
typedef struct List
{
struct List *next; /* pointer to the next list node */
char *str; /* pointer to the string represented */
int count; /* # of occurrences of this string */
} LIST;
LIST *CreateList(FILE *fp)
{
char input[LINE_LEN];
LIST *root; /* contains root of list */
size_t strSize;
LIST *newList; /* used to allocate new list members */
while (fscanf(fp, BUFFMT"s", input) != EOF) {
strSize = strlen(input) + 1;
/* create root node if no current root node */
if (root == NULL) {
if ((newList = (LIST *)malloc(sizeof(LIST))) == NULL) {
printf("Out of memory...");
exit(EXIT_FAILURE);
}
if ((char *)malloc(sizeof(strSize)) == NULL) {
printf("Not enough memory for %s", input);
exit(EXIT_FAILURE);
}
memcpy(newList->str, input, strSize); /*copy string */
newList->count = START_COUNT;
newList->next = NULL;
root = newList;
}
}
return root;
}
/* Prints singly linked list and returns head pointer */
LIST *PrintList(const LIST *head)
{
int count;
for (count = 1; head != NULL; head = head->next, head++) {
printf("%s %d", head->str, head->count);
}
return head; /* does this actually return the start of head ptr, b/c I want to
return the start of the head ptr. */
}
root is never initialized, so its value is indeterminate and the root == NULL test cannot be relied on. The declaration of root in CreateList should be
LIST *root = NULL;
Also, further down there is an allocation, apparently for the item's string, but a) the code fails to capture the result and save it anywhere, and b) the size of the allocation should be strSize, not sizeof(strSize), which is the size of the variable itself. There are several ways to fix it, but the most straightforward would be:
newList->str = (char *)malloc(strSize);
if (newList->str == NULL)
The second malloc allocates memory but its return value is not assigned to anything, so that allocated memory is lost.
newList is allocated but not initialized, so using memcpy to copy into newList->str will fail since newList->str does not point to valid memory. You probably wanted the result of the second malloc to be assigned to newList->str, but forgot to do so.
You shouldn't be incrementing head after head = head->next in the for loop. PrintList will return NULL every time, since the loop won't stop until head is NULL. Why do you need to return the head of the list you just passed to the function anyway?
Edit:
LIST *current = head;
while (current != NULL) {
printf("%s %d", current->str, current->count);
current = current->next;
}
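Pulling these corrections together, node creation and printing could be sketched as follows. START_COUNT is assumed to be 1 (its value is not shown in the question), NewNode is a hypothetical helper name, and PrintList returns nothing here because the caller already holds the head pointer.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define START_COUNT 1   /* assumed value; not shown in the question */

typedef struct List
{
    struct List *next;  /* pointer to the next list node */
    char *str;          /* pointer to the string represented */
    int count;          /* # of occurrences of this string */
} LIST;

/* Allocate a node plus its own copy of input; returns NULL if out of memory. */
static LIST *NewNode(const char *input)
{
    size_t strSize = strlen(input) + 1;
    LIST *newList = malloc(sizeof(LIST));
    if (newList == NULL)
        return NULL;
    newList->str = malloc(strSize);          /* keep the result of malloc */
    if (newList->str == NULL)
    {
        free(newList);
        return NULL;
    }
    memcpy(newList->str, input, strSize);    /* copies the terminator too */
    newList->count = START_COUNT;
    newList->next = NULL;
    return newList;
}

/* Print every node; head itself is left untouched. */
static void PrintList(const LIST *head)
{
    for (const LIST *current = head; current != NULL; current = current->next)
        printf("%s %d\n", current->str, current->count);
}
```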

Resources