Creating a variable amount of linked lists for hashmap - c

Here are the important parts of my code with unhelpful portions commented out:
#include <stdio.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>
#include "hmap.h"
struct val_word{
char *final_word;
struct val_word* next;
};
int main (int argc, char **argv){
//Check if dictionary file is given
FILE *fp1;
char key [125];
char val [125];
char temp;
struct val_word *storage;
char c;
int i;
int j;
int l;
HMAP_PTR dictionary = hmap_create(0, 0.75);
fp1 = fopen(argv[1], "r");
do{
c = fscanf(fp1, "%s", key);
// Convert string to lowercase
strcpy(val, key);
//Alphabetically sort string
struct val_word* word_node = malloc(sizeof(struct val_word));
word_node->final_word = val;
word_node->next = NULL;
storage = hmap_get(dictionary, key);
if(storage == NULL){
hmap_set(dictionary, key, word_node);
}
else{
struct val_word *temp2 = storage;
while(temp2->next != NULL){
temp2 = temp2->next;
}
word_node->final_word = val;
word_node->next = NULL;
temp2->next = word_node;
hmap_set(dictionary, key, storage);
}
} while (c != EOF);
fclose(fp1);
while(storage->next != NULL){
printf("The list is %s\n", storage->final_word);
storage = storage->next;
}
return 0;
}
I am given a dictionary file of unknown length, as well as a hash table implementation file that I cannot touch. The hash table stores jumbled versions of words, with the key being the alphabetically sorted version of the word. For example:
Part of the dictionary contains: leloh, hello, elloh, holel
key would be: ehllo
val would be a linked list storing the aforementioned 4 words.
hmap_get gets the value at the given key, and hmap_set sets the value at the given key.
My code processes everything fine, until I try to print the list located at a key.
The list will be of the correct size, but only stores the LAST value that it took as input. So adding onto the example above, my list would be (in chronological order):
leloh
elloh -> elloh
holel -> holel -> holel
ehllo -> ehllo -> ehllo -> ehllo
For some reason it also stores the correctly alphabetized string as the last string, which I did not provide the hmap_set function. Very confused about that.
However, the list makes perfect sense. I only have one node, and it is inside of a for loop. I do not change the variable name and therefore the pointers all point to the same node, and the node changes the string it contains through every iteration of the loop.
So, I am wondering how I would fix this.
I can't dynamically name variables, I can't just create a dynamic array of linked lists because I feel like that would defeat the purpose of having the hash table.
I don't know what sort of data type I would use to store this.
Any help is appreciated, thank you!

Transferring comments into an answer, where the code is easier to read.
The problem is, I think, that you keep reading new values into val (copying from key), but you only have one val buffer. Every node's final_word points at that same buffer, so every node shows whatever was read last.
You need to duplicate the strings before stashing them in your hash map. So, look up the strdup() function and make a copy of the string in key using strdup() instead of strcpy(). Assign the value returned from strdup() to word_node->final_word.
If you're not allowed to use strdup(), write your own variant:
char *dup_str(const char *str)
{
    size_t len = strlen(str) + 1;
    char *dup = malloc(len);
    if (dup != 0)
        memmove(dup, str, len);
    return dup;
}
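As a rough sketch of how the duplicate fits into the list code: each node gets its own copy of the word, so later reads into the shared buffer cannot change earlier nodes. The struct val_word layout is taken from the question; copy_word, make_node and append_node are hypothetical helpers, not part of hmap.h.

```c
#include <stdlib.h>
#include <string.h>

struct val_word {
    char *final_word;
    struct val_word *next;
};

/* Hypothetical helper: heap copy of str, or NULL on allocation failure. */
char *copy_word(const char *str)
{
    size_t len = strlen(str) + 1;
    char *copy = malloc(len);
    if (copy != NULL)
        memcpy(copy, str, len);
    return copy;
}

/* Hypothetical helper: new node that owns its own copy of word. */
struct val_word *make_node(const char *word)
{
    struct val_word *node = malloc(sizeof *node);
    if (node == NULL)
        return NULL;
    node->final_word = copy_word(word);
    if (node->final_word == NULL) {
        free(node);
        return NULL;
    }
    node->next = NULL;
    return node;
}

/* Append node to the end of the list at head; returns the (possibly new) head. */
struct val_word *append_node(struct val_word *head, struct val_word *node)
{
    if (head == NULL)
        return node;
    struct val_word *tail = head;
    while (tail->next != NULL)
        tail = tail->next;
    tail->next = node;
    return head;
}
```

In the question's loop, the value stored with hmap_set() would then be the head returned by append_node(), and the buffer that fscanf() fills can be reused safely on the next iteration.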

Related

correct use of single/double pointers in C

I am doing a university project where I want to read lines from a file and store them in an array of Records (and later sort them). Each record is a struct as follows:
typedef struct Record{
int id;
char* string_field;
int int_field;
double double_field;
} Record;
The size of the array is unknown. Initially I'm allocating a portion of memory as follows and calling a function to load the Records from the file into my Records array, and then reallocating (in the load_array() function) whenever I need more space:
int main(int argc, char const *argv[]){
if(argc < 2){
printf("Usage: ordered_array_main <file_name>\n");
exit(EXIT_FAILURE);
}
int capacity = 2;
Record* array_of_records = NULL;
array_of_records = (Record*)malloc(capacity * sizeof(Record));
int size = load_array(argv[1], array_of_records, capacity);//size_of_array
printf("%d, %d\n", (*array_of_records).id, (*array_of_records).int_field);
//printing here shows me random garbage values.
free(array_of_records);
return 0;
}
The load_array function is as follows. I will skip the part where I do my fopen() and strtok() work, since it doesn't matter to the issue. I am reading each line and saving 4 variables (id_field, string_field, int_field, double_field), which are then stored in the fields of the current Record of my Records array.
The issue is that when I print the values of my record array fields in the load_array() function they are the ones correctly read from the file, but when I print them in main() they are garbage values, typical C style.
I noticed, by printing the array_of_records pointer value at different locations, that it changes where I do the realloc(), but as far as I know I haven't made any mistakes here: realloc is returning the new location of memory, I'm storing it in tmp, and then storing tmp in array_of_records.
int load_array(const char* file_name, Record* array_of_records, int capacity){
int index_of_record = 0;
while(fgets(buffer,buf_size,fp)!=NULL){
int id_field = atoi(field[0]);
char *string_field = malloc((strlen(field[1])+1)* sizeof(char));
strcpy(string_field, field[1]);
int int_field = atoi(field[2]);
double double_field = atof(field[3]);
if(index_of_record>=capacity){
Record* tmp = NULL;
tmp = (Record*)realloc(array_of_records,2*capacity*sizeof(Record));
if(tmp==NULL){
fprintf(stderr, "Memory allocation failure\n");
exit(EXIT_FAILURE);
}
array_of_records = tmp;
capacity = 2*capacity;
}
(array_of_records+index_of_record)->id = id_field;
(array_of_records+index_of_record)->string_field = string_field;
(array_of_records+index_of_record)->int_field = int_field;
(array_of_records+index_of_record)->double_field = double_field;
printf("%d\n", (array_of_records+index_of_record)->int_field);
//printing here shows me the correct values
index_of_record++;
free(string_field);
}
return index_of_record;
}
I can't wrap my head around this issue. It seems that I am correctly passing the pointer to the Record array, and then storing each record moving forwards with index_of_record. I am clueless. Maybe I am supposed to use double pointers?
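The symptoms described are consistent with C passing array_of_records by value: realloc() may move the block, but the assignment array_of_records = tmp inside load_array() only changes the function's local copy, so main() keeps the old, possibly freed address. A minimal sketch of the double-pointer variant the question asks about follows. push_record is a hypothetical helper, not the asker's code, and file parsing is left out. It also keeps the copied string alive instead of calling free(string_field), because the record still points at that memory.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Record {
    int id;
    char *string_field;
    int int_field;
    double double_field;
} Record;

/*
 * Hypothetical helper: append one record, growing the array when full.
 * array, size and capacity are passed by address so the caller sees
 * the block returned by realloc(). Returns 0 on success, -1 on failure.
 */
int push_record(Record **array, int *size, int *capacity,
                int id, const char *text, int int_field, double double_field)
{
    if (*size >= *capacity) {
        int new_capacity = (*capacity > 0) ? 2 * *capacity : 2;
        Record *tmp = realloc(*array, (size_t)new_capacity * sizeof **array);
        if (tmp == NULL)
            return -1;          /* *array is still valid and unchanged */
        *array = tmp;           /* updates the caller's pointer */
        *capacity = new_capacity;
    }

    /* The record owns its own copy of the string; do not free it here. */
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return -1;
    strcpy(copy, text);

    Record *slot = &(*array)[*size];
    slot->id = id;
    slot->string_field = copy;
    slot->int_field = int_field;
    slot->double_field = double_field;
    (*size)++;
    return 0;
}
```

A caller would start with Record *records = NULL and int size = 0, capacity = 0, then call push_record(&records, &size, &capacity, ...) once per parsed line. Each string_field is freed only when the records are no longer needed.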

I'm trying to read a txt file into a linked list, that contains both integers and strings in a line, separated by commas

I'm making a database, where the information about books and readers in a library is contained in two linked lists, and each line of the txt file contains the data of one book or one reader.
Data of the books:
id;year;title;writer;isborrowed;\n
...
The id and the isborrowed(1or0) are integers, the rest are strings.
I know atoi can convert the input tokens to numbers, but when I use it while splitting each line on the separators, it just doesn't work: the printing is wrong, and I also can't return the *start/*begin pointer, where the list starts. By now I'm totally clueless, because this isn't the first method that I've tried.
The part of the code:
#include <stdio.h>
#include <string.h>
#include <conio.h>
#include <stdlib.h>
typedef struct Konyv{
int id;
char *year;
char *title;
char *writer;
int ki;
struct Konyv*next;
}Konyv;
int main(){
Konyv*start=NULL;
FILE*f;
const char s[1] = ";";
int i;
f=fopen("konyv.adat.txt","r+");
Konyv*u;
if(f != NULL){
char line[1000];
while(fgets(line, sizeof line, f) !=NULL){
u= (Konyv*)malloc(sizeof(Konyv));
u->id = strtok(line, s);
u->id=atoi(u->id);
printf("%d ",u->id);
u->year = strtok(NULL,s);
printf("%s ",u->year);
u->title = strtok(NULL,s);
printf("%s ",u->title);
u->writer = strtok(NULL,s);
printf("%s ",u->writer);
u->ki = strtok(NULL,s);
u->ki=atoi(u->ki);
printf("%d",u->ki);
printf("\n");
u->next=NULL;
if(start==NULL){
start=u;
}
else{
Konyv *mozgo = start;
while (mozgo->next!= NULL){
mozgo = mozgo->next;
}
mozgo->next= u;
}
}
else{
printf("Error while opening.\n");
return 0;
}
printf("\n");
//test if start pointer is right by printig the list again(HELP)
Konyv* temp;
temp=start;
while(temp!=NULL) {
printf("%d ",temp->id);
printf("%s ",temp->year);
printf("%s ",temp->title);
printf("%s ",temp->writer);
printf("%d ",temp->ki);
printf("\n");
temp = temp->next;
}
free(u);
fclose(f);
return 0;
}
@WhozCraig is correct, and borrowing from this post you can see that we simply need to copy the data at the pointer to a new chunk of memory: strtok returns pointers into line, and line is overwritten by the next fgets call. For this we can use the strdup function declared in string.h.
For example:
u->year = strtok(NULL,s);
becomes
u->year = strdup(strtok(NULL,s));
You can find documentation on strdup here for further reference.
Also, I don't want to leave you hanging here, the code you handed over didn't compile cleanly -- I had to complete some brackets.
One final thing: storing anything other than an int in your id field is problematic. So,
u->id = strtok(line, s);
is an issue. A simple fix is,
u->id = atoi(strtok(line, s));
(no strdup is needed here, since atoi only reads the token and a copy would just be leaked), but that is kind of dirty and hard to read for another programmer coming in to maintain the code. I would advise taking the time to declare a variable just to temporarily store the token you are eventually going to convert or duplicate into your struct.
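A rough sketch of the temporary-variable advice, assuming the id;year;title;writer;isborrowed; layout from the question: each token is held in a local variable first, numbers are converted directly, and strings are copied so they outlive the line buffer. parse_konyv and copy_token are hypothetical names; copy_token stands in for strdup() to keep the sketch in standard C.

```c
#include <stdlib.h>
#include <string.h>

typedef struct Konyv {
    int id;
    char *year;
    char *title;
    char *writer;
    int ki;
    struct Konyv *next;
} Konyv;

/* Hypothetical stand-in for strdup(): heap copy of token, or NULL. */
char *copy_token(const char *token)
{
    if (token == NULL)
        return NULL;
    char *copy = malloc(strlen(token) + 1);
    if (copy != NULL)
        strcpy(copy, token);
    return copy;
}

/*
 * Hypothetical helper: parse one "id;year;title;writer;isborrowed;" line.
 * strtok() modifies line in place. Returns a new node, or NULL when a
 * field is missing or an allocation fails.
 */
Konyv *parse_konyv(char *line)
{
    const char *sep = ";\n";
    char *id = strtok(line, sep);
    char *year = strtok(NULL, sep);
    char *title = strtok(NULL, sep);
    char *writer = strtok(NULL, sep);
    char *ki = strtok(NULL, sep);
    if (id == NULL || year == NULL || title == NULL || writer == NULL || ki == NULL)
        return NULL;

    Konyv *u = malloc(sizeof *u);
    if (u == NULL)
        return NULL;
    u->id = atoi(id);               /* numbers need no copy */
    u->ki = atoi(ki);
    u->year = copy_token(year);     /* strings must outlive the line buffer */
    u->title = copy_token(title);
    u->writer = copy_token(writer);
    u->next = NULL;
    if (u->year == NULL || u->title == NULL || u->writer == NULL) {
        free(u->year);
        free(u->title);
        free(u->writer);
        free(u);
        return NULL;
    }
    return u;
}
```

One caveat of this sketch: strtok() skips over consecutive separators, so a line with an empty field (for example ";;") would shift the remaining fields.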

Creating a Dictionary in C

I am currently working on creating a dictionary using a binary search tree-like structure we designed in class.
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
struct entry
{
char* word;
unsigned int n; /* n is the number of times the word appears in the source. */
struct entry *left;
struct entry *right;
};
/*input_from_args: if no additional argument is given, return stdin. Else, open the text file and read it.*/
FILE*
input_from_args(int argc, const char *argv[]){
if(argc==1){
return stdin;
}else{
return fopen(argv[1],"r");
}
}
Below is the insert function that we also wrote in my class. Given the new word we are looking at, if it is
struct entry*
insert(struct entry *table, char* str)
{
if(table == NULL){
table = (struct entry *)malloc(sizeof(struct entry));
strcpy(table->word,str);
table -> n = 1;
table -> left = NULL;
table -> right = NULL;
}else if(strcmp(str, table->word) == 0){
table -> n = (table ->n)+1;
}else if(strcmp(str, table->word) <0){
table->left = insert(table->left, str);
}else if(strcmp(str, table->word) >0){
table ->right = insert(table->right, str);
}
return table;
}
Below is a print function which I wrote myself which is to print every word in table and N, the number of times it occurs.
void
print_table(struct entry *table){
if(table!=NULL){
print_table(table->left);
printf("%s ", table->word);
printf("%d \n", table->n);
print_table(table->right);
}
}
And finally, below is the main function.
int
main(int argc, const char *argv[])
{
FILE *src = input_from_args(argc, argv);
if(src == NULL){
fprintf(stderr, "%s: unable to open %s\n", argv[0], argv[1]);
exit(EXIT_FAILURE);
}
char str[1024];
struct entry *table;
int c;
while((fscanf(src, "%s", str))!= EOF){
table = insert(table, str);
}
print_table(table);
return 0;
}
I'm having some very odd behavior when I run this function. It seems to only be happening when I run it with longer input.
When I run it with this input(in a .txt file):
This is a test.
This is a test.
This is a test.
I get the following output:
This 3
a 3
is 3
test 3
This is what I should be getting. However, when I give it slightly longer input, such as:
Apple Apple
Blue Blue
Cat Cat
Dog Dog
Elder Elder
Funions Funions
Gosh Gosh
Hairy Hairy
I get the following output:
Appme 2
Blue 2
Cat 2
Dog 2
Elder 2
Funions 2
Gosi 2
Hairy 2
Which is clearly correct as far as the numbers go, but why is it changing some of the letters in my words? I gave it Apple, it returned Appme. I gave it Gosh, it gave me Gosi. What's going on with my code that I am missing?
This line in the insert function is very problematic:
strcpy(table->word,str);
It's problematic because you don't actually allocate memory for the string. That means that table->word is uninitialized and its value will be indeterminate, so the strcpy call will lead to undefined behavior.
The simple solution? Use strdup to duplicate the string:
table->word = strdup(str);
The strdup function was not part of standard C until C23 (it comes from POSIX), but just about all platforms have it.
In your insert function, you do not allocate/malloc() space for the word pointer you are trying to strcpy() to:
if(table == NULL){
table = (struct entry *)malloc(sizeof(struct entry));
strcpy(table->word,str);
table -> n = 1;
table -> left = NULL;
table -> right = NULL;
}
Usually this code would exit with a segmentation fault, because you are copying data to memory you don't own, but this is easy to fix:
table->word = malloc(strlen(str) + 1);
strcpy(table->word, str);
You'll want to allocate one extra byte above the string length, to allow for the null terminator.
You do not need or want to cast the result of malloc(). In other words, this is fine:
table = malloc(sizeof(struct entry));
Get into the habit of using free() on any pointers you have malloc()-ed, when you are done with them. Otherwise, you end up with a memory leak.
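For the tree in the question, the free() habit could look roughly like the sketch below. free_table is a hypothetical helper, and it assumes every word was allocated with malloc() (or strdup()) as suggested above.

```c
#include <stdlib.h>

struct entry {
    char *word;
    unsigned int n;
    struct entry *left;
    struct entry *right;
};

/* Hypothetical helper: release a whole tree, children before the parent. */
void free_table(struct entry *table)
{
    if (table == NULL)
        return;
    free_table(table->left);
    free_table(table->right);
    free(table->word);   /* the string allocated for this node */
    free(table);
}
```

Calling free_table(table) once after print_table(table) in main() would release every node and every word.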
Also, compile with warnings enabled, for example -Wall -Wextra (or -Weverything, which is Clang-only).
Note: If strdup() is not available on your platform, it is easy to write a custom function that does the same job:
char* my_very_own_strdup(const char* src)
{
    char* dest = NULL;
    if (!src)
        return dest;
    size_t src_len = strlen(src) + 1;
    dest = malloc(src_len);
    if (!dest) {
        /* perror() appends ": <reason>" and a newline itself */
        perror("Error: Could not allocate space for string copy");
        exit(EXIT_FAILURE);
    }
    memcpy(dest, src, src_len);
    return dest;
}
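On the line strcpy(table->word,str);, where is table->word allocated?
It is not: malloc(sizeof(struct entry)) reserves room for the pointer itself (4 bytes on your machine), not for the characters it should point to, so strcpy writes the string to whatever address the uninitialized pointer happens to hold. So be careful, you must allocate table->word first.
I would use this instead: table->word = strdup(str);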

Possible memory issue with a simple C linked list program

I am trying to write a program which will count the occurrence of words in a paragraph.
The logic I am following: I am using a linked list for the purpose, and I am searching it sequentially. If a new word is encountered, I add the word to the list; if the word already exists in the list, I increase its count.
//case insensitive string matching
int strcicmp(char const *a, char const *b)
{
int d;
for(;;a++,b++)
{
d=tolower(*a)-tolower(*b);
if(d!=0 || !*a)
return d;
}
}
//declare the linked list structure to store distinct words and their count
typedef struct node
{
char *word;
int count;
struct node *next;
} node;
node *ptr, *newnode, *first=NULL, *last=NULL;
void insertnewword(char *ch)
{
newnode=(node*)malloc(sizeof(node));
if(newnode == NULL)
{
printf("\nMemory is not allocated\n");
exit(0);
}
else
{
newnode->word=ch;
newnode->count=1;
newnode->next=NULL;
}
if(first==last && last==NULL)
{
first=last=newnode;
first->next=NULL;
last->next=NULL;
}
else
{
last->next=newnode;
last=newnode;
last->next=NULL;
}
}
void processword(char *ch)
{
int found=0;
//if word is already in the list, increase the count
for(ptr=first;ptr!=NULL;ptr=ptr->next)
if(strcicmp(ptr->word, ch) == 0)
{
ptr->count += 1;
found=1;
break;
}
//if it's a new word, add the word to the list
if(!found)
insertnewword(ch);
}
int main()
{
const char *delimiters=" ~`!@#$%^&*()_-+={[}]:;<,>.?/|\\\'\"\t\n\r";
char *ch, *str;
str=(char*)malloc(sizeof(char));
ch=(char*)malloc(sizeof(char));
//get the original string
scanf("%[^\n]%*c", str);
//fgets(str, 500, stdin);
//get the tokenized string
ch=strtok(str,delimiters);
while(ch!=NULL)
{
//a, an, the should not be counted
if(strcicmp("a", ch)!=0 && strcicmp("an", ch)!=0 && strcicmp("the", ch)!=0)
processword(ch);
ch=strtok(NULL,delimiters);
}
//print the word and it's occurrence count
for(ptr=first; ptr!=NULL; ptr=ptr->next)
printf("%s\t\t%d\n",ptr->word,ptr->count);
return 0;
}
This seems to be working fine for a small number of words, but if the word count is more than 6-7, the program runs into a problem.
Say the input is: I am a good boy. I am a bad Boy.
The output should be
I 2
am 2
good 1
bad 1
boy 2
But what I am getting is
I 2
am 2
good 1
bad 1
(some garbage character) 1
I can always implement any other logic for the same problem, but I want to know the issue with this implementation.
Thanks in advance
I think the problem comes from your scanf.
From the scanf man page:
the next pointer must be a pointer to char, and there must be enough room for all the characters in the string, plus a terminating null byte
But at the top of your main, the allocation for your char array is just one byte long:
str=(char*)malloc(sizeof(char));
I think it would be better to use a function like getline
ssize_t getline(char **lineptr, size_t *n, FILE *stream);
with *lineptr set to NULL and *n set to 0, so that getline allocates the buffer itself.
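A small sketch of the getline approach, assuming a POSIX system (getline() is not part of ISO C). read_whole_line is a hypothetical wrapper name; it returns one line of any length and leaves freeing the buffer to the caller.

```c
#define _POSIX_C_SOURCE 200809L  /* getline() is POSIX, not ISO C */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: read one line of any length from stream.
 * Returns a heap buffer without the trailing newline (caller frees it),
 * or NULL at end of file or on error.
 */
char *read_whole_line(FILE *stream)
{
    char *line = NULL;   /* NULL + 0 asks getline() to allocate for us */
    size_t cap = 0;
    ssize_t len = getline(&line, &cap, stream);
    if (len < 0) {
        free(line);      /* getline() may allocate even when it fails */
        return NULL;
    }
    line[strcspn(line, "\n")] = '\0';
    return line;
}
```

In the question's main(), str = read_whole_line(stdin) would replace both the one-byte malloc() and the scanf() call. The tokens returned by strtok() then point into a buffer that is large enough for the whole input.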
I think your linked list implementation is not causing you the problems but your memory allocation is causing you the actual problems.
First memory allocation problem:
str=(char*)malloc(sizeof(char));
ch=(char*)malloc(sizeof(char));
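Here str should have enough memory to hold the complete input line along with the terminating null character, but you have allocated only one byte (i.e. the size of a char). ch needs no allocation at all: it is overwritten by the return value of strtok(), so that one byte is simply leaked.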
Second memory allocation problem:
newnode->word=ch;
This code fragment is present in your insertnewword() function.
Here you have allocated memory for your new node, but you have not allocated any memory for the char *word inside the new node. Instead you make newnode->word point directly at ch, which points into the str buffer owned by main().
The words in your list are therefore only valid as long as that buffer is. Since str is far too small, scanf writes past its end, and that can corrupt the data your list points to.
So please allocate memory to word field present in newnode and copy the contents of ch into it.
I hope this will solve your problem.

Hashed array linked list collision resolution error

This block of code reads a dictionary file and stores it in a hashed array. This hashed array uses linked list collision resolution. But, for some incomprehensible reason, the reading stops in the middle. (I'm assuming some problem occurs when the linked list is made.) Everything works fine when data is being stored in an empty hashed array element.
#define SIZE_OF_ARRAY 350
typedef struct {
char* key;
int status; // (+1) filled, (-1) deleted, 0 empty
LIST* list;
}HASHED_ARRAY;
void insertDictionary (HASHED_ARRAY hashed_array[])
{
//Local Declaration
FILE* data;
char word[30];
char* pWord;
int index;
int length;
int countWord = 0;
//Statement
if (!(data = fopen("dictionaryWords.txt", "r")))
{
printf("Error Opening File");
exit(1);
}
SetStatusToNew (hashed_array); //initialize all status to 'empty'
while(fscanf(data, "%s\n", word) != EOF)
{
length = strlen(word) + 1;
index = hashing_function(word);
if (hashed_array[index].status == 0)//empty
{
hashed_array[index].key = (char*) malloc(length * sizeof(char));//allocate word.
if(!hashed_array[index].key)//check error
{
printf("\nMemory Leak\n");
exit(1);
}
strcpy(hashed_array[index].key, word); //insert the data into hashed array.
hashed_array[index].status = 1;//change hashed array node to filled.
}
else
{
//collision resolution (linked list)
pWord = (char*) malloc(length * sizeof(char));
strcpy (pWord, word);
if (hashed_array[index].list == NULL) // <====== program doesn't enter
//this if statement although the list is NULL.
//So I'm assuming this is where the program stops reading.
{
hashed_array[index].list = createList(compare);
}
addNode(hashed_array[index].list, pWord);
}
countWord++;
//memory allocation for key
}
printStatLinkedList(hashed_array, countWord);
fclose(data);
return;
}
createList and addNode are both ADT functions. The former takes a function pointer (compare is a function that I built inside the main function) as a parameter, and the latter takes the list and void-typed data as parameters. compare sorts the linked list. Please help me spot the problem.
Depending on where you declare the hashed_array you pass to this function, its contents may not be initialized. This means that the contents of all entries are indeterminate. This includes the list pointer.
You need to initialize this array properly first. The easiest way is to simply use memset:
memset(hashed_array, 0, sizeof(HASHED_ARRAY) * whatever_size_it_is);
This will set all members to zero, i.e. NULL for pointers.
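A minimal sketch of that initialization, using the HASHED_ARRAY layout from the question. LIST is declared here as an opaque stand-in for the asker's list ADT, and init_hashed_array is a hypothetical helper name.

```c
#include <string.h>

#define SIZE_OF_ARRAY 350

typedef struct list LIST;   /* assumption: opaque stand-in for the asker's list ADT */

typedef struct {
    char *key;
    int status;   /* (+1) filled, (-1) deleted, 0 empty */
    LIST *list;
} HASHED_ARRAY;

/* Hypothetical helper: put every slot into a known empty state. */
void init_hashed_array(HASHED_ARRAY table[], size_t count)
{
    memset(table, 0, count * sizeof table[0]);
    for (size_t i = 0; i < count; i++) {
        /* explicit NULLs do not rely on all-bits-zero being a null pointer */
        table[i].key = NULL;
        table[i].list = NULL;
    }
}
```

Calling init_hashed_array(hashed_array, SIZE_OF_ARRAY) before insertDictionary() makes the hashed_array[index].list == NULL test reliable. For an array defined in the caller, writing HASHED_ARRAY hashed_array[SIZE_OF_ARRAY] = {0}; at the definition has the same effect.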

Resources