Segmentation fault: 11 occurs when printing the contents of "Dictionary" - c

I've been trying to create a dictionary of words and their definitions(sort of Oxford English Dictionary). So far, I've finished half of the job:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
typedef struct{
char wordInDictionary[32];
int numberOfMeanings;
char *wordDefinitions[10];
}Entry; //struct that describes a single entry in the dictionary
typedef struct{
int entries;
Entry* arrayOfEntries[100];
}Dictionary; //dictionary itself
Dictionary createDictionary(){
Dictionary emptyDictionary;
emptyDictionary.entries = 0;
int i,j;
for(i=0;i<100;i++){
emptyDictionary.arrayOfEntries[i] = malloc(sizeof(Entry));
emptyDictionary.arrayOfEntries[i]->numberOfMeanings = 0;
strcpy(emptyDictionary.arrayOfEntries[i]->wordInDictionary, "");
for(j=0;j<10;j++){
emptyDictionary.arrayOfEntries[i]->wordDefinitions[j] =
malloc(500*sizeof(char));
strcpy(emptyDictionary.arrayOfEntries[i]->wordDefinitions[j], "");
}
}
return emptyDictionary; //created empty dictionary with 0 entries and empty content
}
Entry* searchWord(char word[], Dictionary *dict){
Entry* foundWord;
int i;
for(i=0;i<100;i++){
if(strcmp(dict->arrayOfEntries[i]->wordInDictionary, word) != 0){
foundWord = NULL;
}else{
foundWord = dict->arrayOfEntries[i];
break;
}
}
return foundWord; //searches for a given word and returns NULL, if word is not found, or pointer to entry in the dictionary that corresponds to word, if word is found.
}
int addDefinition(char word[], char meaning[], Dictionary *dict){
Entry* foundEntry = searchWord(word, dict);
int returnNumber;
if((foundEntry == NULL || strcmp(foundEntry->wordInDictionary, word) != 0)
&& dict->entries < sizeof(dict->arrayOfEntries)/sizeof(dict->arrayOfEntries[0])){
Entry* newEntry = malloc(sizeof(Entry)); //allocating space if searched word is not found in dictionary
strcpy(newEntry->wordInDictionary, word); //now there is a new word in the dictionary
newEntry->numberOfMeanings = 0; //but the new word has 0 meanings
dict->arrayOfEntries[dict->entries++] = newEntry;
if(newEntry->numberOfMeanings<10){
newEntry->wordDefinitions[newEntry->numberOfMeanings] = malloc(500*sizeof(char));
strcpy(newEntry->wordDefinitions[newEntry->numberOfMeanings], meaning); //new word now has a new meaning since its meanings' amount is less than 10(which is max)
}
int oldNumberOfMeanings = newEntry->numberOfMeanings;
int newNumberOfMeanings = oldNumberOfMeanings + 1;
if(newNumberOfMeanings>oldNumberOfMeanings){
returnNumber = 2; //returns 2 if both new entry and new definition were added into the dictionary
}
newEntry->numberOfMeanings++;
}else if((foundEntry != NULL || strcmp(foundEntry->wordInDictionary, word) == 0)
&& dict->entries < sizeof(dict->arrayOfEntries)/sizeof(dict->arrayOfEntries[0])){
if(foundEntry->numberOfMeanings<10){
foundEntry->wordDefinitions[foundEntry->numberOfMeanings] = malloc(500*sizeof(char));
strcpy(foundEntry->wordDefinitions[foundEntry->numberOfMeanings], meaning);
}
int oldNumberOfMeanings = foundEntry->numberOfMeanings;
int newNumberOfMeanings = oldNumberOfMeanings + 1;
if(newNumberOfMeanings == oldNumberOfMeanings){
returnNumber = 0; //returns 0 if no new entry and no new meaning are added to already found one
}else if(newNumberOfMeanings > oldNumberOfMeanings){
returnNumber = 1; //returns 1 if no new entry and one new meaning are added to already found one
}
foundEntry->numberOfMeanings++;
}
return returnNumber;
}
So this vague piece of code describes my program. But the problem itself occurs when I test the code in main():
int main(void){
Entry* entryPtr;
Dictionary dict = createDictionary();
entryPtr = searchWord("include", &dict);
if(entryPtr != NULL){
printf("\nWord found: 'include'");
}else{
printf("\nWord 'include' is not found in the dictionary.");
}
int count;
count = addDefinition("include", "(verb) def1", &dict);
count = addDefinition("assist", "(verb) def2", &dict);
count = addDefinition("house", "(noun) def3", &dict);
count = addDefinition("camera", "(noun) def4", &dict);
count = addDefinition("valid", "(adjective) def5", &dict);
printf("\nLast count: %i", count);
int i, j;
for(i=0;i<100;i++){
if(strcmp(dict.arrayOfEntries[i]->wordInDictionary, "")!=0){
printf("\n\t%i. %s", i+1, dict.arrayOfEntries[i]->wordInDictionary);
}
for(j=0;j<10;j++){
if(strcmp(dict.arrayOfEntries[i]->wordDefinitions[j], "")!=0){
printf("\n\t\t%i.%i. %s\n", i+1, j+1, dict.arrayOfEntries[i]->wordDefinitions[j]);
}
}
}
entryPtr = searchWord ("exam", &dict);
if (entryPtr != NULL) {
printf("\nWord found: 'exam'");
} else {
printf("\nWord 'exam' is not found in the dictionary.");
}
entryPtr = searchWord ("include", &dict);
if (entryPtr != NULL) {
printf("\nWord found: 'include'");
} else {
printf("\nWord 'include' is not found in the dictionary.");
}
count = addDefinition ("house", "(adjective) def6", &dict);
count = addDefinition ("house", "(adjective) def7", &dict);
count = addDefinition ("house", "(adjective) def8", &dict);
printf("\n\nThe the value returned by the last addition: is %i.", count);
}
What the output gives is:
Word 'include' is not found in the dictionary.
Last count: 2
1. include
1.1. (verb) def1
Segmentation fault: 11
I changed the condition in nested for loops in main() as follows:
for(i=0;i<5;i++){
if(strcmp(dict.arrayOfEntries[i]->wordInDictionary, "")!=0){
printf("\n\t%i. %s", i+1, dict.arrayOfEntries[i]->wordInDictionary);
}
for(j=0;j<1;j++){
if(strcmp(dict.arrayOfEntries[i]->wordDefinitions[j], "")!=0){
printf("\n\t\t%i.%i. %s\n", i+1, j+1, dict.arrayOfEntries[i]->wordDefinitions[j]);
}
}
}
Then it showed the right content, though it is not what it should be like. Instead, it should work for all the entries in the dictionary, not first 5.
I've tried doing debugging many times and investigate it myself but still could not find the solution. Any hints to solve the possible problem(-s)?
P.S. I run the program on VS Code, macOS, GCC.

The issue is in line
if(strcmp(dict.arrayOfEntries[i]->wordDefinitions[j], "")!=0)
It doesn't check for valid dict.arrayOfEntries[i]->wordDefinitions[j] pointer. Something like this:
if(dict.arrayOfEntries[i]->wordDefinitions[j] && strcmp(dict.arrayOfEntries[i]->wordDefinitions[j], "")!=0)
Beside that, I think you are using malloc badly, you are creating at runtime a whole empty dictionary in createDictionary and then creating new entries in addDefinition and adding them like this:
dict->arrayOfEntries[dict->entries++] = newEntry;
This newEntry replaces the old pointer and it doesn't event delete it, this is a memory leak, and beside that newEntry is not fully initialized, and that's the main reason of the segmentation fault.

Related

How do I debug a Segmentation fault when I nothing is wrong before the fault or after?

The particular problem I have is that in my main function, I have added a print statement before and after I call the "bad" function. It always shows the before statement, but never the after statement. I also added a print statement to the end of the "bad" function, and I can see that it runs properly to the very last line of the "bad" function, so it should return normally. After the functions last print and before the main function print, I get the segfault. Any ideas? Here is the code:
int main(int argc, char* argv[])
{
char myItem[100];
int i = 0;
while (i < 100) {
scanf("%[^\n]", myItem);
i++;
if (myItem == EOF) {
break;
}
int c;
while ((c = getchar()) != '\n' && c != EOF);
//printf("string read in from user typing: %s\n", myItem);
printf("i = %d\n", i);
emailFilter(myItem);
printf("done with email filter in main\n");
//printf("item from this pass is:%s\n\n", myItem);
}
return 0;
}
and the "bad" function:
void emailFilter(char* mySubject)
{
printf(" Just entered the emailFilter() .\n");
char * event_holder[5]; //holds five separate char ptrs
for (int i = 0; i < 5; i++)
{
event_holder[i] = ((char*)malloc(100 * sizeof(char*)));
}
char command_type = parseSubject(mySubject, event_holder); //parses subject line and fills event_holder. returns command type, from parsing
//call proper parsing result
if (command_type == 'C')
{
create(event_holder);
}
else if (command_type == 'X')
{
change(event_holder);
}
else if (command_type == 'D')
{
delete(event_holder);
}
printf("Leaving emailfilter()...\n");
}
and running this code provides me:
$:
i = 1
Just entered the emailFilter() .
C, Meeting ,01/12/2019,15:30,NEB202
Leaving emailfilter()...
done with email filter in main
i = 2
Just entered the emailFilter() .
Leaving emailfilter()...
Segmentation fault
This shows that I always make it through the function, but still don't return properly.
Here is my entire code to reproduce the error.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
struct node {
char * event_data[5];
struct node * next;
};
struct node *head = NULL;
struct node *current = NULL;
char* earliest = NULL;
char* substring (char* orig, char* dest, int offset, int len)
{
int input_len = strlen (orig);
if (offset + len > input_len)
{
return NULL;
}
strncpy (dest, orig + offset, len);
//add null char \0 to end
char * term = "\0";
strncpy (dest + len, term, 1);
return dest;
}
char * firstItem(char* shortenedSubject)
{
int i = 0;
int currentLength = 0;
int currentCharIndex = 0;
int eventIndex = 0;
char * toReturn = (char*)malloc(100);
while ((shortenedSubject[currentLength] != '\0') && (shortenedSubject[currentLength] != ',') )//50 is my safety num to make sure it exits eventually
{
currentLength++;
}
if (shortenedSubject[currentLength] == ',') {
substring(shortenedSubject, toReturn, 0, currentLength);
}
return toReturn;
}
char parseSubject(char* subject,char * eventDataToReturn[5]) //returns "what type of command called, or none"
{
char toReturn;
char * shortenedSubject = (char*)malloc(100);
substring(subject,shortenedSubject,9,strlen(subject)-9);//put substring into tempString
int currentCharIndex = 0;// well feed into index of substring()
int eventIndex = 0; //lets us know which event to fill in
int currentLength = 0;//lets us know length of current event
int i = 0; //which char in temp string were alooking at
char * action = firstItem(shortenedSubject);
if (strlen(action) == 1)
{
if ( action[0] == 'C')
{
toReturn = 'C';
}
else if (action[0] == 'X')
{
toReturn = 'X';
}
else if (action[0] == 'D')
{
toReturn = 'D';
}
else
{
toReturn = 'N'; //not valid
//invalid email command, do nothing
}
}
else
{
toReturn = 'N'; //not valid
//invalid email command, do nothing
}
char* debug2;
while ((shortenedSubject[i] != '\0') && (i <= 50) )//50 is my safety num to make sure it exits eventually
{
char debugvar = shortenedSubject[i];
currentLength++;
if (shortenedSubject[i] == ',')
{
//eventDataToReturn[i] = substring2(shortenedSubject,currentCharIndex,currentLength);
substring(shortenedSubject,eventDataToReturn[eventIndex],currentCharIndex,currentLength-1);
debug2 = eventDataToReturn[eventIndex];
currentCharIndex= i +1;
eventIndex++;
currentLength = 0;
//i++;
}
i++;
}
substring(shortenedSubject,eventDataToReturn[4],currentCharIndex,currentLength);
return toReturn;
}
void printEventData(char* my_event_data[])
{
//printf("\nPrinting event data...\n");
for (int i = 1; i < 4; i++)
{
printf("%s,",my_event_data[i]);
}
//print last entry, no comma
printf("%s",my_event_data[4]);
}
void printEventsInorder()
{
struct node * ptr = head;
while (ptr != NULL)//if not empty, check each one and add when ready
{
printEventData(ptr->event_data);
printf("\n");
ptr = ptr->next;
}
}
void insertFront(char* my_event_data[5])
{
struct node *link = (struct node*) malloc(sizeof(struct node));
link->next = NULL;
for (int i = 0; i < 5; i++)
{
link->event_data[i] = my_event_data[i];
}
head = link;
}
int isEarlier(char* event_data_L[5], char* event_data_R[5])
{// will be given like 12:30 12:45,turn timeL into timeL1 and timeL2, and time R1 and timeR2
//compare dates for earlier
int month_L,day_L,year_L;
int month_R,day_R,year_R;
char* char_holder;
substring(event_data_L[2],char_holder,0,2);//extract first half of time
month_L = atoi(char_holder); //convert first half of time to int
substring(event_data_L[2],char_holder,3,2);//extract first half of time
day_L = atoi(char_holder); //convert first half of time to int
substring(event_data_L[2],char_holder,6,4);//extract first half of time
year_L = atoi(char_holder); //convert first half of time to int
substring(event_data_R[2],char_holder,0,2);//extract first half of time
month_R = atoi(char_holder); //convert first half of time to int
substring(event_data_R[2],char_holder,3,2);//extract first half of time
day_R = atoi(char_holder); //convert first half of time to int
substring(event_data_R[2],char_holder,6,4);//extract first half of time
year_R = atoi(char_holder); //convert first half of time to int
int time_L1,time_L2,time_R1,time_R2;
substring(event_data_L[3],char_holder,0,2);//extract first half of time
time_L1 = atoi(char_holder); //convert first half of time to int
substring(event_data_L[3],char_holder,3,2);//extract second half of time
time_L2 = atoi(char_holder); //convert second half of time to int
substring(event_data_R[3],char_holder,0,2);
time_R1 = atoi(char_holder);
substring(event_data_R[3],char_holder,3,2);
time_R2 = atoi(char_holder);
//convert to 2 ints, first compare left ints, then right ints
if(year_L < year_R)
{
return 1;
}
else if ( year_L == year_R)
{
if (month_L < month_L)
{
return 1;
}
else if (month_L == month_L)
{
if (day_L < day_R)
{
return 1;
}
else if (day_L == day_R)
{
if (time_L1 < time_R1)
{
return 1;
}
else if (time_L1 == time_R1)
{
if (time_L2 < time_R2)
{
return 1;
}
else if (time_L2 == time_R2)
{
return 2;
}
else//else, time is greater
{
return 3;
}
}
else //left time is greater, return 3
{
return 3;
}
}
else
{
return 3;
}
}
else
{
return 3;
}
}
else //its left is greater than right so return 3 to show that
{
return 3;
}
}
void create(char* my_event_data[5]) {
//print required sentence
char * debugvar2 = my_event_data[3];
if (head == NULL)//if empty calendar, just add it
{
insertFront(my_event_data);
//printf("EARLIEST bc empty list, \n");
printf("C, ");
printEventData(my_event_data);
printf("\n");
return;
}
else
{
struct node *link = (struct node*) malloc(sizeof(struct node));
link->next = NULL;
for (int i = 1; i < 5; i++)
{
link->event_data[i] = my_event_data[i];
}
struct node *ptr = head;
struct node *prev = NULL;
if (ptr->next == NULL) //if this is the last node to check against
{
if (isEarlier(my_event_data, ptr->event_data) == 1)
{ //check against it
printf("C, ");
printEventData(my_event_data);
printf("\n");
if (prev != NULL) //if this is first item in linked list...
{
link->next = head; //assign something before head
head = link; //move head to that thing
}
if (prev != NULL)
{
prev->next = link;
}
link->next = ptr;
return;
}
else //else is equal to or later, so tack it on after:
{
ptr->next = link;
}
}
else
{
while (ptr->next != NULL)//if not empty, check each one and add when ready
{
//if next node is later than current, we are done with insertion
if (isEarlier(my_event_data,ptr->event_data) == 1)
{
if (head == ptr) //if earlier than head... insert and print
{
//printf("earlier than head!");
printf("C, ");
printEventData(my_event_data);
printf("\n");
link->next = ptr;
head = link;
}
else //if earlier than non head, insert, but dont print
{
if (prev != NULL)
{
prev->next = link;
}
link->next = ptr;
}
return;
}
else
{
prev = ptr;
ptr = ptr->next;
}
}
if (isEarlier(my_event_data,ptr->event_data) == 1) //while ptr-> is null now
{
printf("C, ");
printEventData(my_event_data);
printf("\n");
if (prev != NULL)
{
prev->next = link;
}
link->next = ptr->next;
return;
}
else
{
prev = link;
link = ptr;
}
}
return;
}
//if it gets here, it is the latest meeting, tack it on the end
//prev->ptr = link;
}
void change(char* my_event_data[5]) {
//create a link
struct node *ptr = head;
while (ptr->next != NULL)//if not empty, check each one and add when ready
{
//if next node is later than current, we are done with insertion
if (*ptr->event_data[1] == *my_event_data[1])
{
for (int i = 1; i < 5; i++)
{
ptr->event_data[i] = my_event_data[i];
}
printf("X, ");
printEventData(my_event_data);
printf("\n");
return;
}
ptr = ptr->next;
}
if (*ptr->event_data[1] == *my_event_data[1]) //check final node
{
for (int i = 0; i < 5; i++)
{
ptr->event_data[i] = my_event_data[i];
}
printf("X, ");
printEventData(my_event_data);
printf("\n");
return;
}
printf("event to change not found");
return;
//if it gets here, nothing matched the title to change
}
void delete(char* my_event_data[5])
{
struct node *ptr = head;
struct node *prev = NULL;
while (ptr != NULL)//if not empty, check each one and add when ready
{
//if next node is later than current, we are done with insertion
if ( strcmp( ptr->event_data[1], my_event_data[1] ) == 0) // if title matches, delete it
{
if (prev != NULL)
{
prev->next = ptr->next;
}
if (ptr == head)
{
head = ptr->next;
}
free(ptr);
printf("D, ");
printEventData(my_event_data);
printf("\n");
return;
}
prev = ptr;
ptr = ptr->next;
}
}
void emailFilter(char* mySubject)
{
if (strlen(mySubject) < 9)
{
return;
}
char * event_holder[5]; //holds five separate char ptrs
for (int i = 0; i < 5; i++)
{
event_holder[i] = ((char*)malloc(100 * sizeof(char*)));
}
char command_type = parseSubject(mySubject, event_holder); //parses subject line and fills event_holder. returns command type, from parsing
//call proper parsing result
if (command_type == 'C')
{
create(event_holder);
}
else if (command_type == 'X')
{
change(event_holder);
}
else if (command_type == 'D')
{
delete(event_holder);
}
}
int main(int argc, char* argv[])
{
char myItem[100];
int i = 0;
while (i < 100)
{
scanf("%[^\n]", myItem);
i++;
if ( myItem == EOF )
{
break;
}
int c;
while ((c = getchar()) != '\n' && c != EOF);
printf("i = %d\n", i);
emailFilter(myItem);
}
return 0;
}
Also please note that this error happens when I use a txt file as STDIN via the ">" symbol on the command line. Here is the file I use:
Subject: C,Meeting ,01/12/2019,15:30,NEB202
Subject: C,Meeting ,01/12/2019,16:30,NEB202
Subject: C,Meeting ,01/12/2019,11:30,NEB202
Having tried to find something to contribute, there's this:
The code is dealing with the date/time. Below is the declaration and use of a "destination buffer" into which is copied fragments of the string:
int isEarlier(char* event_data_L[5], char* event_data_R[5])
{// will be given like 12:30 12:45 // ....
//compare dates for earlier
int month_L,day_L,year_L;
int month_R,day_R,year_R;
char* char_holder;
substring(event_data_L[2],char_holder,0,2);//extract first half of time
month_L = atoi(char_holder); //convert first half of time to int
//...
Notice that char_holder isn't pointing anywhere in particular. UB...
While it represents a beginner's approach, it is actually painful to see code like this. Below is a more concise version of isEarlier() (untested.)
int isEarlier( char *ed_L[5], char *ed_R[5] ) {
char l[16], r[16];
memcpy( l + 0, ed_L[2][6],4 ); // YYYY
memcpy( l + 4, ed_L[2][0],2 ); // MM
memcpy( l + 6, ed_L[2][3],2 ); // DD
memcpy( l + 8, ed_L[3][0],2 ); // hh
memcpy( l + 10, ed_L[3][3],2 ); // mm
memcpy( r + 0, ed_R[2][6],4 ); // YYYY
memcpy( r + 4, ed_R[2][0],2 ); // MM
memcpy( r + 6, ed_R[2][3],2 ); // DD
memcpy( r + 8, ed_R[3][0],2 ); // hh
memcpy( r + 10, ed_R[3][3],2 ); // mm
int res = memcmp( l, r, 12 );
return res < 0 ? 1 : res == 0 ? 2 : 3;
}
Note: The sample data provided indicates 2 digits for both month and day, and is ambiguous as to "mm/dd" or "dd/mm" format. The offset values used here come from the OP code.
One way to reduce the possibility of bugs in code is to both write less but more capable code, if you can, and to perform "unit testing" on code that you write. Focus on one function at a time and do not use global variables. Another is to become as familiar as you can with the proven capabilities of functions in the standard library.
EDIT: Looking at this answer, it occurs to me that this function should, itself, be refactored:
void reformatDateTime( char *d, char *s[5] ) {
memcpy( d + 0, s[2][6],4 ); // YYYY
memcpy( d + 4, s[2][0],2 ); // MM
memcpy( d + 6, s[2][3],2 ); // DD
memcpy( d + 8, s[3][0],2 ); // hh
memcpy( d + 10, s[3][3],2 ); // mm
}
int isEarlier( char *ed_L[5], char *ed_R[5] ) {
char l[16], r[16];
reformatDateTime( l, ed_L );
reformatDateTime( r, ed_R );
int res = memcmp( l, r, 12 );
res = res < 0 ? 1 : res == 0 ? 2 : 3;
printf( "isEarlier() '%.12s' vs '%.12s' result %d\n", l, r, res ); // debug
return res;
}
and I can see that it runs properly
You contradict yourself, you say that it sometimes or always seg faults. It's rather unlikely that some C code would crash at the point of leaving a function, since there's no "RAII" and in this case no multi-threading either. A stack corruption could have destroyed the function return address however.
The best way of debugging is not so much about focusing on the symptom, as it trying to pinpoint where something goes wrong. You've already done as much, so that's most of the debug effort already done.
One way of debugging from there is step #1: stare at the function for one minute. After around 5 seconds: event_holder[i] = ((char*)malloc(100 * sizeof(char*))); Well that's an obvious bug. After some 30 seconds more: wait, who cleans up this memory? The delete function perhaps but why is it then executed conditionally? (Turns out delete doesn't free() memory though.) The function leaks memory, another bug. Then after one full minute we realize that parseSubject does a whole lot of things and we'll need to dig through that one in detail if we want to weed out every possible chance of bugs. And it will take a lot more time to get to the bottom of that. But we already found 2 blatant bugs just by glancing at the code.
Fix the bugs, try again, is the problem gone?
At another glance there's a bug in main(), myItem == EOF is senseless and shouldn't compile. This suggests that you are compiling with way too lax warning levels or ignoring warnings, either is a very bad thing. What compiler options are recommended for beginners learning C?
We might note that extensive use of "magic numbers" make the code hard to read. It is also usually a sure sign of brittle code. Where do these 5 and 9 and so on come from? Use named constants. We will also fairly quickly note the lack of const correctness in something that's only supposed to parse, not change data. And so on.
I didn't read the code in detail, but the overall lack of following best practices and 3 bugs found just by brief glances suggests there's a whole lot more bugs in there.

I completed the week 5 speller cs50x today. The code was not giving the correct output when I used recursion, but normally, it worked

Here is the link to the question (problem set)speller
I find the recursion conditions correct. the problem is in the check function, according to the outputs. So when I changed it, the code worked.
Here are the codes from dictionary.c (first one with recursion and the second one without it)
with recursion
here I defined two functions of my own in order to recursively go through the lists...
// Implements a dictionary's functionality
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include "dictionary.h"
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
// Represents a node in a hash table
typedef struct node
{
char word[LENGTH + 1];
struct node *next;
}
node;
bool search(node *s, const char *wod);
void frr(node *ptr);
int sije = 0;
// TODO: Choose number of buckets in hash table
// ans - i would choose a 2d array which is [26][LENGTH + 1]
const unsigned int N = 26;
// 2d Hash table
node *table[N][LENGTH];
// Returns true if word is in dictionary, else false
bool check(const char *word)
{
// finding the hash for the word
int h = hash(word);
// finding the length for the word
int len = strlen(word) - 1;
// creating another pointer for the sake of iterating
node *tmp = table[h][len];
return search(tmp, word);
}
bool search(node *s, const char *wod)
{
if (s == NULL)
{
return false;
}
if (!strcasecmp(s -> word, wod))
{
return true;
}
else
{
s = s -> next;
bool c = search(s, wod);
}
return false;
}
// Hashes word to a number
unsigned int hash(const char *word)
{
// TODO: Improve this hash function
return toupper(word[0]) - 'A';
}
// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
// make the hash table free of garbage values
for (int i = 0; i < 26; i++)
{
for (int j = 0; j < 10; j++)
{
table[i][j] = NULL;
}
}
// TODO
// open the dictionary file
FILE *dict = fopen(dictionary, "r");
if (dict == NULL)
{
return false;
}
// read the word inside a char array
char n[LENGTH + 1];
while (fscanf(dict, "%s", n) != EOF)
{
// call the hash function and get the hach code
int h = hash(n);
// this will return something from 0 till 25
int len = strlen(n) - 1;
// this will have the length of the word
// let's load it to the hach table
// create a new node
node *no = malloc(sizeof(node));
if (no == NULL)
{
return false;
}
// copy the word from n to no
strcpy(no -> word, n);
// declare the pointer inside the node to null
no -> next = NULL;
// insert the node inside the hach table
if (table[h][len] == NULL)
{
table[h][len] = no;
}
// else if the spot is populated
else
{
no -> next = table[h][len];
table[h][len] = no;
}
sije += 1;
}
return true;
}
// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
// TODO
return sije;
}
// Unloads dictionary from memory, returning true if successful, else false
bool unload(void)
{
for (int i = 0; i < 26; i++)
{
for (int j = 0; j < LENGTH; j++)
{
// if table at that inces in not null,
// that means that there is a linked list at that index of the table
if (table[i][j] != NULL)
{
frr(table[i][j]);
}
}
}
return true;
}
void frr(node *ptr)
{
if (ptr == NULL)
{
return;
}
frr(ptr -> next);
free(ptr);
}
without recursion (the one which worked)
// Implements a dictionary's functionality
#include <ctype.h>
#include <stdbool.h>
#include <string.h>
#include "dictionary.h"
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
// Represents a node in a hash table
typedef struct node
{
char word[LENGTH + 1];
struct node *next;
}
node;
void frr(node *ptr);
int sije = 0;
// TODO: Choose number of buckets in hash table
// ans - i would choose a 2d array which is [26][LENGTH]
const unsigned int N = 26;
// 2d Hash table
node *table[N][LENGTH];
// Returns true if word is in dictionary, else false
bool check(const char *word)
{
// finding the hash of the code
int h = hash(word);
// finding the length of the code
int len = strlen(word) - 1;
// creating another pointer for the sake of iterating
node *tmp = table[h][len];
while (true)
{
if (tmp == NULL)
{
return false;
}
if (!strcasecmp(tmp -> word, word))
{
return true;
}
tmp = tmp -> next;
}
}
// Hashes word to a number
unsigned int hash(const char *word)
{
// TODO: Improve this hash function
return toupper(word[0]) - 'A';
}
// Loads dictionary into memory, returning true if successful, else false
bool load(const char *dictionary)
{
// make the hash table free of garbage values
for (int i = 0; i < 26; i++)
{
for (int j = 0; j < 10; j++)
{
table[i][j] = NULL;
}
}
// TODO
// open the dictionary file
FILE *dict = fopen(dictionary, "r");
if (dict == NULL)
{
return false;
}
// read the word inside a char array
char n[LENGTH + 1];
while (fscanf(dict, "%s", n) != EOF)
{
// call the hash function and get the hach code
int h = hash(n);
// this will return something from 0 till 25
int len = strlen(n) - 1;
// this will have the length of the word
// let's load it to the hach table
// create a new node
node *no = malloc(sizeof(node));
if (no == NULL)
{
return false;
}
// copy the word from n to no
strcpy(no -> word, n);
// declare the pointer inside the node to null
no -> next = NULL;
// insert the node inside the hach table
if (table[h][len] == NULL)
{
table[h][len] = no;
}
// else if the spot is populated
else
{
no -> next = table[h][len];
table[h][len] = no;
}
sije += 1;
}
fclose(dict);
return true;
}
// Returns number of words in dictionary if loaded, else 0 if not yet loaded
unsigned int size(void)
{
// TODO
return sije;
}
// Unloads dictionary from memory, returning true if successful, else false
bool unload(void)
{
for (int i = 0; i < 26; i++)
{
for (int j = 0; j < LENGTH; j++)
{
// if table at that inces in not null,
// that means that there is a linked list at that index of the table
if (table[i][j] != NULL)
{
frr(table[i][j]);
}
}
}
return true;
}
void frr(node *ptr)
{
if (ptr == NULL)
{
return;
}
frr(ptr -> next);
free(ptr);
}
What could be the reason. All others are correct, only the checking part ...
I thought I would investigate this issue further as it seemed to me that adding in a return within the "else" block should address the issue. Possibly, I was not making the answer clear enough via the comments so I went ahead and acquired the lesson code, implemented your code solution with the non-recursive check function performing some test over a select set of text files, and then trying out the code with the recursive search function along with the code tweak I had noted.
Executing the non-recursive solution over the "aca.txt" file, the following was the terminal output to be used as the baseline.
WORDS MISSPELLED: 17062
WORDS IN DICTIONARY: 143091
WORDS IN TEXT: 376904
TIME IN load: 0.08
TIME IN check: 3.61
TIME IN size: 0.00
TIME IN unload: 0.01
TIME IN TOTAL: 3.70
I then revised the program, dictionary.c, to include the check/search function calls with the code in its original state.
// Returns true if word is in dictionary, else false
bool check(const char *word)
{
// finding the hash for the word
int h = hash(word);
// finding the length for the word
int len = strlen(word) - 1;
// creating another pointer for the sake of iterating
node *tmp = table[h][len];
return search(tmp, word);
}
bool search(node *s, const char *wod)
{
if (s == NULL)
{
return false;
}
if (!strcasecmp(s -> word, wod))
{
return true;
}
else
{
s = s -> next;
bool c = search(s, wod);
}
return false;
}
Executing this version of the code did give some superfluous misspelling values when run over the same file.
WORDS MISSPELLED: 359710
WORDS IN DICTIONARY: 143091
WORDS IN TEXT: 376904
TIME IN load: 0.07
TIME IN check: 4.85
TIME IN size: 0.00
TIME IN unload: 0.01
TIME IN TOTAL: 4.93
My guess is that this is the type of behavior you were experiencing.
I then revised the search function to provide a return statement within the "else" block as I had noted in the comments. Following is the refactored code.
// Returns true if word is in dictionary, else false
bool check(const char *word)
{
// finding the hash for the word
int h = hash(word);
// finding the length for the word
int len = strlen(word) - 1;
// creating another pointer for the sake of iterating
node *tmp = table[h][len];
return search(tmp, word);
}
bool search(node *s, const char *wod)
{
if (s == NULL)
{
return false;
}
if (!strcasecmp(s -> word, wod))
{
return true;
}
else
{
s = s -> next;
return search(s, wod); /* Added this line of code */
//bool c = search(s, wod); /* Deactivated this line of code */
}
return false;
}
When the program was recompiled and executed over the same text file, the following statistical output was acquired.
WORDS MISSPELLED: 17062
WORDS IN DICTIONARY: 143091
WORDS IN TEXT: 376904
TIME IN load: 0.08
TIME IN check: 4.14
TIME IN size: 0.00
TIME IN unload: 0.01
TIME IN TOTAL: 4.23
This agrees with the values listed with the non-recursive version of the program.
I checked this refactored version against the other selected text files and all agreed with the values run using the non-recursive version of the program.
FYI, I utilized the gcc compiler. That is the compiler that I have on my system in lieu of Clang. And I will point out one other thing I needed to do to make successfully build this program. When compiling, the compiler complained about having a variable size definition for the "table" array.
const unsigned int N = 26;
To get around this issue with my compiler, I had to move the assignment/definition of "N" to the same spot as the "LENGTH" definition in the header file.
// Maximum length for a word
// (e.g., pneumonoultramicroscopicsilicovolcanoconiosis)
#define LENGTH 45
#define N 26
I don't think these additional tweaks had any bearing on the functionality of using a recursive function, but I wanted to be fully transparent.
Anyway, hopefully with this expanded explanation, you might try out the code tweaks to see if the recursive function would now work.

Problems on computing statistics and parsing a text file

a must be simple question but I couldn't manage to do it.
I have to scan on a struct a text file with entries in this format:
{"data1","data2",number1,number2}
And compute first populating a struct.
Text of the exercise:
Consider the definition of the following structure
typedef struct {
char teamHome [30];
char teamHost [30];
int goalSquadraHome;
int goalSquadraOspite;
} match;
which is used to represent the result of a football match.
Write a function that takes as parameters an array of games and its size e
returns a result structure containing the following information:
the number of games won by the home team,
the number of games won by the visiting team,
the number of ties,
the name of the team that has scored the most goals in a match.
Then write a program that, given the array containing all 380 Serie A 2019/2020 matches,
print the information contained in the result.
The code is the following:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct{
char squadraCasa[30];
char squadraOspite[30];
int golSquadraCasa;
int golSquadraOspite;
} partita;
typedef struct {
int partite_casa;
int partite_ospite;
int pareggi;
char squad_magg_num_goal[30];
} risultato;
int main(){
FILE *fp;
risultato risultati;
int maxgoal = 0;
risultati.partite_casa = 0;
risultati.partite_ospite = 0;
risultati.pareggi = 0;
partita partite[380];
int i=0;
if((fp = fopen("partiteSerieA1920.txt","rt"))==NULL){
printf("Errore nell'apertura del file\n");
exit(1);
}
while(!feof(fp)){
fscanf(fp,"{\"%s\",\"%s\",%d,%d",partite[i].squadraCasa,partite[i].squadraOspite,partite[i].golSquadraCasa,partite[i].golSquadraOspite);
i++;
}
for(i=0;i<380;i++){
if(partite[i].golSquadraCasa>partite[i].golSquadraOspite){
risultati.partite_casa++;
}else if(partite[i].golSquadraCasa<partite[i].golSquadraOspite){
risultati.partite_ospite++;
}else
risultati.pareggi++;
if(partite[i].golSquadraCasa>maxgoal){
strncpy(partite[i].squadraCasa,risultati.squad_magg_num_goal,30);
maxgoal = partite[i].golSquadraCasa;
}
if(partite[i].golSquadraOspite>maxgoal){
strncpy(partite[i].squadraOspite, risultati.squad_magg_num_goal,30);
maxgoal = partite[i].golSquadraOspite;
}
}
fclose(fp);
printf("%d %d %d %s\n",risultati.partite_casa,risultati.partite_ospite,&risultati.pareggi,&risultati.squad_magg_num_goal);
return 0;
}
Please let me know how to arrange it properly.
There may be other issues, but certainly your flow control is wrong. Instead of the incorrect while/feof loop(Why is “while ( !feof (file) )” always wrong?), try something like:
partita *p = partite;
while( 4 == fscanf(fp, "{\"%29s\",\"%29s\",%d,%d",
p->squadraCasa,
p->squadraOspite,
&p->golSquadraCasa,
&p->golSquadraOspite
) ){
p++;
}
Give this a try, its a bit of a different approach:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct
{
char squadraCasa[30];
char squadraOspite[30];
int golSquadraCasa;
int golSquadraOspite;
} partita;
typedef struct
{
int partite_casa;
int partite_ospite;
int pareggi;
char squad_magg_num_goal[30];
} risultato;
// solves given problem and stores results in risultato struct
risultato *getResult(partita **playedGames, int size)
{
risultato *result = malloc(sizeof(risultato));
result->partite_casa = 0;
result->partite_ospite = 0;
result->pareggi = 0;
int currentHighest = 0;
for (int i = 0; i < size; i++)
{
if (playedGames[i]->golSquadraCasa > playedGames[i]->golSquadraOspite){
result->partite_casa++;
if(playedGames[i]->golSquadraCasa > currentHighest ){
currentHighest = playedGames[i]->golSquadraCasa;
strcpy(result->squad_magg_num_goal, playedGames[i]->squadraCasa);
}
}
else if (playedGames[i]->golSquadraCasa < playedGames[i]->golSquadraOspite){
result->partite_ospite++;
if (playedGames[i]->golSquadraOspite > currentHighest){
currentHighest = playedGames[i]->golSquadraOspite;
strcpy(result->squad_magg_num_goal, playedGames[i]->squadraOspite);
}
}
else{
result->pareggi++;
}
}
return result;
}
// This is a custom parser of a line from the file
// data = {"data1","data2",number1,number2}
// return -> partita struct
partita *newGame(char *data){
partita *partite = malloc(sizeof(partita));
char* temp;
// Get Home Team
temp = strchr(data, ',') -1;
temp[0] = '\0';
strcpy(partite->squadraCasa, data + 2);
data = temp+1;
// Get Away Team
temp = strchr(data+1, ',') -1;
temp[0] = '\0';
strcpy(partite->squadraOspite, data + 2);
data = temp + 1;
// Get Home Score
temp = strchr(data + 1, ',');
temp[0] = '\0';
partite->golSquadraCasa = atoi(data + 1);
data = temp + 1;
// Get Away Score
temp = strchr(data, '}');
temp[0] = '\0';
partite->golSquadraOspite = atoi(data);
// Return game
return partite;
}
int main()
{
FILE *fp;
partita **partite = malloc(sizeof(partita *)); // list of size one, currently...
risultato *risultati;
char linea[50];
int indice = 0;
if ((fp = fopen("./partiteSerieA1920.txt", "rt")) == NULL)
{
printf("Errore nell'apertura del file\n");
exit(1);
}
// For each linea in the file, load a game into an array.
while (fgets(linea, 50,fp))
{
//chomp the \n
linea[strlen(linea)-1]='\0';
// increase size of list
partite = realloc(partite, sizeof(partita *) * (indice + 1));
// insert game into array of games
partite[indice] = newGame(linea);
indice++;
}
risultati = getResult(partite, indice);
// Print risultato
printf("\n----RESULT----\nHome Wins: %d\nAway Wins: %d\nTies: %d\nTeam With Most Goals: %s\n\n", risultati->partite_casa, risultati->partite_ospite, risultati->pareggi, risultati->squad_magg_num_goal);
// free all allocated memory then return
for (int i = 0; i < indice;i++){
free(partite[i]);
}
free(partite);
free(risultati);
fclose(fp);
return 0;
}
I was trying to run your code but couldnt get it to parse data from the file properly so i made a quick parser for you (This is in the code above already):
partita *newGame(char *data){
partita *partite = malloc(sizeof(partita));
char* temp;
// Get Home Team
temp = strchr(data, ',') -1;
temp[0] = '\0';
strcpy(partite->squadraCasa, data + 2);
data = temp+1;
// Get Away Team
temp = strchr(data+1, ',') -1;
temp[0] = '\0';
strcpy(partite->squadraOspite, data + 2);
data = temp + 1;
// Get Home Score
temp = strchr(data + 1, ',');
temp[0] = '\0';
partite->golSquadraCasa = atoi(data + 1);
data = temp + 1;
// Get Away Score
temp = strchr(data, '}');
temp[0] = '\0';
partite->golSquadraOspite = atoi(data);
// Return game
return partite;
}
You can always try to use something similar to this to parse strings or lines that you bring in as I find it is more efficient to just code something that you know works to the specification you want.
Let me know if there is some problem with the code or would like to know more about the functionality of this. I tried to keep as much of this in Italian.
Cheers

Modifying data in a Linked List in C?

When I modify the linked list (based on the ID), it modifies the node successfully but deletes the rest of the list. The whole list stays only if I modify the most recent node that I've added to the list.
I know that the problem is at the end where it says:
phead=i;
return phead;
But I don't know how to fix it as I haven't found anything to help me, even though I'm sure it is simple to know why it is wrong.
struct ItemNode *modify1Item(struct ItemNode *phead){
int modID;
int lfound=0;
int lID;
char lDesc[30];
char lName[30];
double lUPrice;
int lOnHand;
struct ItemNode *i=phead;
printf("Enter the ID of the item that you want to modify\n");
scanf("%d", &modID);
while(i != NULL){
if(i->ID == modID){
break;
}
i= i->next;
}
if(i==NULL){
printf("An item with that ID wasn't found.\n");
return 0;
}
else{
printf("Enter new Name\n");
scanf("%s", lName);
strcpy(i->name, lName);
printf("Enter new Description\n");
scanf("%s", lDesc);
strcpy(i->desc, lDesc);
printf("Enter new Unit Price $\n");
scanf("%lf", &lUPrice);
i->uPrice = lUPrice;
printf("Enter new Number of Items On Hand\n");
scanf("%d", &lOnHand);
i->onHand = lOnHand;
}
phead=i;
return phead;
}
When I return it, i say head=modify1Item(phead);
I tested your code everything worked as expected. Without seeing your code, I can't comment much. But I think, for your code, the only time everything would get delete is if you assign the return value incorrectly. So this below is probably something close to your code. For the test code below, unless you modify it, the IDs are 0, 1, and 2. Oh and the reason why I commented only work for 0 to 9 is because I don't want to make up the entire char string so I used i ^ 48. Of which, 0 - 9 ^ 48 would turn into the correspondent ASCII code of 0 - 9. If you go beyond that, you may get weird result for that two string that all.
I just noticed that you use NULL in your search. Thus, I updated the code so the "next" of last index will be NULL otherwise if your code found nothing, it will run forever.
#include <stdio.h>
#include <string.h>
typedef struct ItemNode {
int ID;
int uPrice;
int onHand;
char name[30];
char desc[30];
struct ItemNode * next;
} ItemNode ;
struct ItemNode * modify1Item(struct ItemNode * phead){
int modID;
int lfound=0;
int lID;
char lDesc[30];
char lName[30];
double lUPrice;
int lOnHand;
struct ItemNode *i = phead;
printf("Enter the ID of the item that you want to modify\n");
scanf("%d", &modID);
while(i != NULL){
if(i->ID == modID){
break;
}
i = i->next;
}
if(i==NULL){
printf("An item with that ID wasn't found.\n");
return 0;
} else {
printf("Enter new Name\n");
scanf("%s", lName);
strcpy(i->name, lName);
printf("Enter new Description\n");
scanf("%s", lDesc);
strcpy(i->desc, lDesc);
printf("Enter new Unit Price $\n");
scanf("%lf", &lUPrice);
i->uPrice = lUPrice;
printf("Enter new Number of Items On Hand\n");
scanf("%d", &lOnHand);
i->onHand = lOnHand;
}
phead=i;
return phead;
}
int main(){
// only work for 0 - 9.
int index = 3;
ItemNode iArr[index];
for ( int i = 0; i < index; i++ ){
iArr[i].ID = i;
iArr[i].uPrice = i + i;
iArr[i].onHand = i * i;
iArr[i].name[0] = i ^ 48;
iArr[i].desc[0] = i ^ 48;
// If last index link back to first index.
// Updated: but for you usage case
// because of your search function
// last index should be NULL otherwise your
// search will run forever
if ( i < index - 1 ) iArr[i].next = &iArr[i + 1];
else iArr[i].next = NULL; // if change search method with unique ID then you can use -> &iArr[0];
}
// Mod 0
ItemNode * test = modify1Item(iArr);
printf("0 name: %s\n\n",iArr[0].name );
// Mod 1
ItemNode * test1 = modify1Item(iArr);
printf("1 name: %s\n\n",iArr[1].name );
// Mod 2
ItemNode * test2 = modify1Item(iArr);
printf("2 name: %s\n\n",iArr[2].name );
// Check if 0 is still there.
printf("0 name: %s\n\n",iArr[0].name );
return 0;
}

Seg. Fault in Hash Table ADT - C

Edit:
Hash.c is updated with revisions from the comments, I am still getting a Seg fault. I must be missing something here that you guys are saying
I have created a hash table ADT using C but I am encountering a segmentation fault when I try to call a function (find_hash) in the ADT.
I have posted all 3 files that I created parse.c, hash.c, and hash.h, so you can see all of the variables. We are reading from the file gettysburg.txt which is also attached
The seg fault is occuring in parse.c when I call find_hash. I cannot figure out for the life of me what is going on here. If you need anymore information I can surely provide it.
sorry for the long amount of code I have just been completely stumped for a week now on this. Thanks in advance
The way I run the program is first:
gcc -o parse parse.c hash.c
then: cat gettysburg.txt | parse
Parse.c
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include "hash.h"
#define WORD_SIZE 40
#define DICTIONARY_SIZE 1000
#define TRUE 1
#define FALSE 0
void lower_case_word(char *);
void dump_dictionary(Phash_table );
/*Hash and compare functions*/
int hash_func(char *);
int cmp_func(void *, void *);
typedef struct user_data_ {
char word[WORD_SIZE];
int freq_counter;
} user_data, *Puser_data;
int main(void)
{
char c, word1[WORD_SIZE];
int char_index = 0, dictionary_size = 0, num_words = 0, i;
int total=0, largest=0;
float average = 0.0;
Phash_table t; //Pointer to main hash_table
int (*Phash_func)(char *)=NULL; //Function Pointers
int (*Pcmp_func)(void *, void *)=NULL;
Puser_data data_node; //pointer to hash table above
user_data * find;
printf("Parsing input ...\n");
Phash_func = hash_func; //Assigning Function pointers
Pcmp_func = cmp_func;
t = new_hash(1000,Phash_func,Pcmp_func);
// Read in characters until end is reached
while ((c = getchar()) != EOF) {
if ((c == ' ') || (c == ',') || (c == '.') || (c == '!') || (c == '"') ||
(c == ':') || (c == '\n')) {
// End of a word
if (char_index) {
// Word is not empty
word1[char_index] = '\0';
lower_case_word(word1);
data_node = (Puser_data)malloc(sizeof(user_data));
strcpy(data_node->word,word1);
printf("%s\n", data_node->word);
//!!!!!!SEG FAULT HERE!!!!!!
if (!((user_data *)find_hash(t, data_node->word))){ //SEG FAULT!!!!
insert_hash(t,word1,(void *)data_node);
}
char_index = 0;
num_words++;
}
} else {
// Continue assembling word
word1[char_index++] = c;
}
}
printf("There were %d words; %d unique words.\n", num_words,
dictionary_size);
dump_dictionary(t); //???
}
void lower_case_word(char *w){
int i = 0;
while (w[i] != '\0') {
w[i] = tolower(w[i]);
i++;
}
}
void dump_dictionary(Phash_table t){ //???
int i;
user_data *cur, *cur2;
stat_hash(t, &(t->total), &(t->largest), &(t->average)); //Call to stat hash
printf("Number of unique words: %d\n", t->total);
printf("Largest Bucket: %d\n", t->largest);
printf("Average Bucket: %f\n", t->average);
cur = start_hash_walk(t);
printf("%s: %d\n", cur->word, cur->freq_counter);
for (i = 0; i < t->total; i++)
cur2 = next_hash_walk(t);
printf("%s: %d\n", cur2->word, cur2->freq_counter);
}
int hash_func(char *string){
int i, sum=0, temp, index;
for(i=0; i < strlen(string);i++){
sum += (int)string[i];
}
index = sum % 1000;
return (index);
}
/*array1 and array2 point to the user defined data struct defined above*/
int cmp_func(void *array1, void *array2){
user_data *cur1= array1;
user_data *cur2= array2;//(user_data *)array2;
if(cur1->freq_counter < cur2->freq_counter){
return(-1);}
else{ if(cur1->freq_counter > cur2->freq_counter){
return(1);}
else return(0);}
}
hash.c
#include "hash.h"
Phash_table new_hash (int size, int(*hash_func)(char*), int(*cmp_func)(void*, void*)){
int i;
Phash_table t;
t = (Phash_table)malloc(sizeof(hash_table)); //creates the main hash table
t->buckets = (hash_entry **)malloc(sizeof(hash_entry *)*size); //creates the hash table of "size" buckets
t->size = size; //Holds the number of buckets
t->hash_func = hash_func; //assigning the pointer to the function in the user's program
t->cmp_func = cmp_func; // " "
t->total=0;
t->largest=0;
t->average=0;
t->sorted_array = NULL;
t->index=0;
t->sort_num=0;
for(i=0;i<size;i++){ //Sets all buckets in hash table to NULL
t->buckets[i] = NULL;}
return(t);
}
void free_hash(Phash_table table){
int i;
hash_entry *cur;
for(i = 0; i<(table->size);i++){
if(table->buckets[i] != NULL){
for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next){
free(cur->key); //Freeing memory for key and data
free(cur->data);
}
free(table->buckets[i]); //free the whole bucket
}}
free(table->sorted_array);
free(table);
}
void insert_hash(Phash_table table, char *key, void *data){
Phash_entry new_node; //pointer to a new node of type hash_entry
int index;
new_node = (Phash_entry)malloc(sizeof(hash_entry));
new_node->key = (char *)malloc(sizeof(char)*(strlen(key)+1)); //creates the key array based on the length of the string-based key
new_node->data = data; //stores the user's data into the node
strcpy(new_node->key,key); //copies the key into the node
//calling the hash function in the user's program
index = table->hash_func(key); //index will hold the hash table value for where the new node will be placed
table->buckets[index] = new_node; //Assigns the pointer at the index value to the new node
table->total++; //increment the total (total # of buckets)
}
void *find_hash(Phash_table table, char *key){
int i;
hash_entry *cur;
printf("Inside find_hash\n"); //REMOVE
for(i = 0;i<table->size;i++){
if(table->buckets[i]!=NULL){
for(cur = table->buckets[i]; cur->next != NULL; cur = cur->next){
if(strcmp(table->buckets[i]->key, key) == 0)
return((table->buckets[i]->data));} //returns the data to the user if the key values match
} //otherwise return NULL, if no match was found.
}
return NULL;
}
void stat_hash(Phash_table table, int *total, int *largest, float *average){
int node_num[table->size]; //creates an array, same size as table->size(# of buckets)
int i,j, count = 0;
int largest_buck = 0;
hash_entry *cur;
for(i = 0; i < table->size; i ++){
if(table->buckets[i] != NULL){
for(cur=table->buckets[i]; cur->next!=NULL; cur = cur->next){
count ++;}
node_num[i] = count;
count = 0;}
}
for(j = 0; j < table->size; j ++){
if(node_num[j] > largest_buck)
largest_buck = node_num[j];}
*total = table->total;
*largest = largest_buck;
*average = (table->total) / (table->size);
}
void *start_hash_walk(Phash_table table){
Phash_table temp = table;
int i, j, k;
hash_entry *cur; //CHANGE IF NEEDED to HASH_TABLE *
if(table->sorted_array != NULL) free(table->sorted_array);
table->sorted_array = (void**)malloc(sizeof(void*)*(table->total));
for(i = 0; i < table->total; i++){
if(table->buckets[i]!=NULL){
for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next){
table->sorted_array[i] = table->buckets[i]->data;
}}
}
for(j = (table->total) - 1; j > 0; j --) {
for(k = 1; k <= j; k ++){
if(table->cmp_func(table->sorted_array[k-1], table->sorted_array[k]) == 1){
temp -> buckets[0]-> data = table->sorted_array[k-1];
table->sorted_array[k-1] = table->sorted_array[k];
table->sorted_array[k] = temp->buckets[0] -> data;
}
}
}
return table->sorted_array[table->sort_num];
}
void *next_hash_walk(Phash_table table){
table->sort_num ++;
return table->sorted_array[table->sort_num];
}
hash.h
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct hash_entry_ { //Linked List
void *data; //Generic pointer
char *key; //String-based key value
struct hash_entry_ *next; //Self-Referencing pointer
} hash_entry, *Phash_entry;
typedef struct hash_table_ {
hash_entry **buckets; //Pointer to a pointer to a Linked List of type hash_entry
int (*hash_func)(char *);
int (*cmp_func)(void *, void *);
int size;
void **sorted_array; //Array used to sort each hash entry
int index;
int total;
int largest;
float average;
int sort_num;
} hash_table, *Phash_table;
Phash_table new_hash(int size, int (*hash_func)(char *), int (*cmp_func)(void *, void *));
void free_hash(Phash_table table);
void insert_hash(Phash_table table, char *key, void *data);
void *find_hash(Phash_table table, char *key);
void stat_hash(Phash_table table, int *total, int *largest, float *average);
void *start_hash_walk(Phash_table table);
void *next_hash_walk(Phash_table table);
Gettysburg.txt
Four score and seven years ago, our fathers brought forth upon this continent a new nation: conceived in liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war. . .testing whether that nation, or any nation so conceived and so dedicated. . . can long endure. We are met on a great battlefield of that war.
We have come to dedicate a portion of that field as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.
But, in a larger sense, we cannot dedicate. . .we cannot consecrate. . . we cannot hallow this ground. The brave men, living and dead, who struggled here have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember, what we say here, but it can never forget what they did here.
It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us. . .that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion. . . that we here highly resolve that these dead shall not have died in vain. . . that this nation, under God, shall have a new birth of freedom. . . and that government of the people. . .by the people. . .for the people. . . shall not perish from the earth.
It's possible that one of several problems with this code are loops like:
for(table->buckets[i];
table->buckets[i]->next != NULL;
table->buckets[i] = table->buckets[i]->next)
...
The initializing part of the for loop (table->buckets[i]) has no effect. If i is 0 and table->buckets[0] == NULL, then the condition on this loop (table->buckets[i]->next != NULL) will dereference a null pointer and crash.
That's where your code seemed to be crashing for on my box, at least. When I changed several of your loops to:
if (table->buckets[i] != NULL) {
for(;
table->buckets[i]->next != NULL;
table->buckets[i] = table->buckets[i]->next)
...
}
...it kept crashing, but in a different place. Maybe that will help get you unstuck?
Edit: another potential problem is that those for loops are destructive. When you call find_hash, do you really want all of those buckets to be modified?
I'd suggest using something like:
hash_entry *cur;
// ...
if (table->buckets[i] != NULL) {
for (cur = table->buckets[i]; cur->next != NULL; cur = cur->next) {
// ...
}
}
When I do that and comment out your dump_dictionary function, your code runs without crashing.
Hmm,
here's hash.c
#include "hash.h"
Phash_table new_hash (int size, int(*hash_func)(char*), int(*cmp_func)(void*, void*)){
int i;
Phash_table t;
t = (Phash_table)calloc(1, sizeof(hash_table)); //creates the main hash table
t->buckets = (hash_entry **)calloc(size, sizeof(hash_entry *)); //creates the hash table of "size" buckets
t->size = size; //Holds the number of buckets
t->hash_func = hash_func; //assigning the pointer to the function in the user's program
t->cmp_func = cmp_func; // " "
t->total=0;
t->largest=0;
t->average=0;
for(i=0;t->buckets[i] != NULL;i++){ //Sets all buckets in hash table to NULL
t->buckets[i] = NULL;}
return(t);
}
void free_hash(Phash_table table){
int i;
for(i = 0; i<(table->size);i++){
if(table->buckets[i]!=NULL)
for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
free(table->buckets[i]->key); //Freeing memory for key and data
free(table->buckets[i]->data);
}
free(table->buckets[i]); //free the whole bucket
}
free(table->sorted_array);
free(table);
}
void insert_hash(Phash_table table, char *key, void *data){
Phash_entry new_node; //pointer to a new node of type hash_entry
int index;
new_node = (Phash_entry)calloc(1,sizeof(hash_entry));
new_node->key = (char *)malloc(sizeof(char)*(strlen(key)+1)); //creates the key array based on the length of the string-based key
new_node->data = data; //stores the user's data into the node
strcpy(new_node->key,key); //copies the key into the node
//calling the hash function in the user's program
index = table->hash_func(key); //index will hold the hash table value for where the new node will be placed
table->buckets[index] = new_node; //Assigns the pointer at the index value to the new node
table->total++; //increment the total (total # of buckets)
}
void *find_hash(Phash_table table, char *key){
int i;
hash_entry *cur;
printf("Inside find_hash\n"); //REMOVE
for(i = 0;i<table->size;i++){
if(table->buckets[i]!=NULL){
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
if(strcmp(cur->key, key) == 0)
return((cur->data));} //returns the data to the user if the key values match
} //otherwise return NULL, if no match was found.
}
return NULL;
}
void stat_hash(Phash_table table, int *total, int *largest, float *average){
int node_num[table->size];
int i,j, count = 0;
int largest_buck = 0;
hash_entry *cur;
for(i = 0; i < table->size; i ++)
{
if(table->buckets[i]!=NULL)
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
count ++;}
node_num[i] = count;
count = 0;
}
for(j = 0; j < table->size; j ++){
if(node_num[j] > largest_buck)
largest_buck = node_num[j];}
*total = table->total;
*largest = largest_buck;
*average = (table->total) /(float) (table->size); //oook: i think you want a fp average
}
void *start_hash_walk(Phash_table table){
void* temp = 0; //oook: this was another way of overwriting your input table
int i, j, k;
int l=0; //oook: new counter for elements in your sorted_array
hash_entry *cur;
if(table->sorted_array !=NULL) free(table->sorted_array);
table->sorted_array = (void**)calloc((table->total), sizeof(void*));
for(i = 0; i < table->size; i ++){
//for(i = 0; i < table->total; i++){ //oook: i don't think you meant total ;)
if(table->buckets[i]!=NULL)
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
table->sorted_array[l++] = cur->data;
}
}
//oook: sanity check/assert on expected values
if (l != table->total)
{
printf("oook: l[%d] != table->total[%d]\n",l,table->total);
}
for(j = (l) - 1; j > 0; j --) {
for(k = 1; k <= j; k ++){
if (table->sorted_array[k-1] && table->sorted_array[k])
{
if(table->cmp_func(table->sorted_array[k-1], table->sorted_array[k]) == 1){
temp = table->sorted_array[k-1]; //ook. changed temp to void* see assignment
table->sorted_array[k-1] = table->sorted_array[k];
table->sorted_array[k] = temp;
}
}
else
printf("if (table->sorted_array[k-1] && table->sorted_array[k])\n");
}
}
return table->sorted_array[table->sort_num];
}
void *next_hash_walk(Phash_table table){
/*oook: this was blowing up since you were incrementing past the size of sorted_array..
NB: *you **need** to implement some bounds checking here or you will endup with more seg-faults!!*/
//table->sort_num++
return table->sorted_array[table->sort_num++];
}
here's parse.c
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <assert.h> //oook: added so you can assert ;)
#include "hash.h"
#define WORD_SIZE 40
#define DICTIONARY_SIZE 1000
#define TRUE 1
#define FALSE 0
void lower_case_word(char *);
void dump_dictionary(Phash_table );
/*Hash and compare functions*/
int hash_func(char *);
int cmp_func(void *, void *);
typedef struct user_data_ {
char word[WORD_SIZE];
int freq_counter;
} user_data, *Puser_data;
int main(void)
{
char c, word1[WORD_SIZE];
int char_index = 0, dictionary_size = 0, num_words = 0, i;
int total=0, largest=0;
float average = 0.0;
Phash_table t; //Pointer to main hash_table
int (*Phash_func)(char *)=NULL; //Function Pointers
int (*Pcmp_func)(void *, void *)=NULL;
Puser_data data_node; //pointer to hash table above
user_data * find;
printf("Parsing input ...\n");
Phash_func = hash_func; //Assigning Function pointers
Pcmp_func = cmp_func;
t = new_hash(1000,Phash_func,Pcmp_func);
// Read in characters until end is reached
while ((c = getchar()) != EOF) {
if ((c == ' ') || (c == ',') || (c == '.') || (c == '!') || (c == '"') ||
(c == ':') || (c == '\n')) {
// End of a word
if (char_index) {
// Word is not empty
word1[char_index] = '\0';
lower_case_word(word1);
data_node = (Puser_data)calloc(1,sizeof(user_data));
strcpy(data_node->word,word1);
printf("%s\n", data_node->word);
//!!!!!!SEG FAULT HERE!!!!!!
if (!((user_data *)find_hash(t, data_node->word))){ //SEG FAULT!!!!
dictionary_size++;
insert_hash(t,word1,(void *)data_node);
}
char_index = 0;
num_words++;
}
} else {
// Continue assembling word
word1[char_index++] = c;
}
}
printf("There were %d words; %d unique words.\n", num_words,
dictionary_size);
dump_dictionary(t); //???
}
void lower_case_word(char *w){
int i = 0;
while (w[i] != '\0') {
w[i] = tolower(w[i]);
i++;
}
}
void dump_dictionary(Phash_table t){ //???
int i;
user_data *cur, *cur2;
stat_hash(t, &(t->total), &(t->largest), &(t->average)); //Call to stat hash
printf("Number of unique words: %d\n", t->total);
printf("Largest Bucket: %d\n", t->largest);
printf("Average Bucket: %f\n", t->average);
cur = start_hash_walk(t);
if (!cur) //ook: do test or assert for null values
{
printf("oook: null== (cur = start_hash_walk)\n");
exit(-1);
}
printf("%s: %d\n", cur->word, cur->freq_counter);
for (i = 0; i < t->total; i++)
{//oook: i think you needed these braces
cur2 = next_hash_walk(t);
if (!cur2) //ook: do test or assert for null values
{
printf("oook: null== (cur2 = next_hash_walk(t) at i[%d])\n",i);
}
else
printf("%s: %d\n", cur2->word, cur2->freq_counter);
}//oook: i think you needed these braces
}
int hash_func(char *string){
int i, sum=0, temp, index;
for(i=0; i < strlen(string);i++){
sum += (int)string[i];
}
index = sum % 1000;
return (index);
}
/*array1 and array2 point to the user defined data struct defined above*/
int cmp_func(void *array1, void *array2){
user_data *cur1= array1;
user_data *cur2= array2;//(user_data *)array2;
/* ooook: do assert on programmatic errors.
this function *requires non-null inputs. */
assert(cur1 && cur2);
if(cur1->freq_counter < cur2->freq_counter){
return(-1);}
else{ if(cur1->freq_counter > cur2->freq_counter){
return(1);}
else return(0);}
}
follow the //ooks
Explanation:
There were one or two places this was going to blow up in.
The quick fix and answer to your question was in parse.c, circa L100:
cur = start_hash_walk(t);
printf("%s: %d\n", cur->word, cur->freq_counter);
..checking that cur is not null before calling printf fixes your immediate seg-fault.
But why would cur be null ? ~because of this bad-boy:
void *start_hash_walk(Phash_table table)
Your hash_func(char *string) can (& does) return non-unique values. This is of course ok except that you have not yet implemented your linked list chains. Hence you end up with table->sorted_array containing less than table->total elements ~or you would if you were iterating over all table->size buckets ;)
There are one or two other issues.
For now i hacked Nate Kohl's for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next) further, to be for(cur=table->buckets[i]; cur != NULL; cur=cur->next) since you have no chains. But this is *your TODO so enough said about that.
Finally. note that in next_hash_walk(Phash_table table) you have:
table->sort_num++
return table->sorted_array[table->sort_num];
Ouch! Do check those array bounds!
Notes
1) If you're function isn't designed to change input, then make the input const. That way the compiler may well tell you when you're inadvertently trashing something.
2) Do bound checking on your array indices.
3) Do test/assert for Null pointers before attempting to use them.
4) Do unit test each of your functions; never write too much code before compiling & testing.
5) Use minimal test-data; craft it such that it limit-tests your code & attempts to break it in cunning ways.
6) Do initialise you data structures!
7)Never use egyptian braces ! {
only joking ;)
}
PS Good job so far ~> pointers are tricky little things! & a well asked question with all the necessary details so +1 and gl ;)
(//oook: maybe add a homework tag)

Resources