I'm trying to implement a dictionary of words using a hash table, so I need to have it global, and in one of my header files I declare it
extern node** dictionary;
Where node is
typedef struct node
{
char* word;
struct node* next;
} node;
Then in another file in which functions are defined I include the header which has the dictionary declaration, and also I add at the top
node** dictionary;
Then in the function which actually loads the dictionary I first allocate memory for the linked lists which will make the hash table
bool load(const char* dict_file)
{
dictionary = malloc(sizeof(node*) * LISTS);
FILE* dict = fopen(dict_file, "r");
if(dict == NULL)
return false;
char buffer[MAX_LEN + 2];
size_dict = 0;
while(fgets(buffer, MAX_LEN + 2, dict) != NULL)
{
node* new_node = malloc(sizeof(node));
int len = strlen(buffer);
new_node->word = malloc(sizeof(char) * (len));
//avoid \n
for(int i = 0; i < len - 1; i++)
new_node->word[i] = buffer[i];
new_node->word[len - 1] = '\0';
new_node->next = NULL;
int index = hash(buffer);
new_node->next = dictionary[index];
dictionary[index] = new_node;
size_dict++;
}
if (ferror(dict))
{
fclose(dict);
return false;
}
fclose(dict);
return true;
}
The program works fine, and I then free all the memory allocated for the strings and nodes. When I run valgrind (a debugger which detects memory leaks), it says no memory leaks are possible, but it reports the error "Uninitialised value was created by a heap allocation" and points to the first line of the load function above, where I allocate memory for dictionary. What am I doing wrong? I guess the way I use dictionary globally is wrong, so can anybody suggest another way of keeping it global that avoids this error?
In the updated code you use an uninitialized pointer:
dictionary = malloc(sizeof(node*) * LISTS);
// .... code that does not change dictionary[i] for any i
new_node->next = dictionary[index]; // use uninitialized pointer
As others have already written, this only works if you set all the pointers to NULL before entering the loop:
dictionary = malloc(sizeof(node*) * LISTS);
if ( !dictionary ) {
return false;
}
for (size_t i = 0; i < LISTS; ++i) {
dictionary[i] = NULL;
}
The heap allocation you assign to dictionary uses malloc which does not initialize the returned bytes. So dictionary in the code you've posted ends up being an array of uninitialized pointers. Presumably you go on to use those pointers in some way which valgrind knows to be an error.
An easy way to fix this is to use calloc instead of malloc, because it zeros the returned bytes for you. Or, use memset to zero the bytes yourself.
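A minimal sketch of the calloc approach follows. The bucket count of 1024 for LISTS and the helper name alloc_buckets are assumptions made for illustration; the question shows neither.

```c
#include <stdlib.h>

#define LISTS 1024  /* assumed bucket count; the question does not show it */

typedef struct node
{
    char *word;
    struct node *next;
} node;

/* Hypothetical helper: returns LISTS bucket heads, all zero-filled,
   or NULL if the allocation fails. */
node **alloc_buckets(void)
{
    /* calloc zero-fills the block; on mainstream platforms an all-zero
       pointer is a null pointer, so every bucket starts out empty. */
    return calloc(LISTS, sizeof(node *));
}
```

In load, dictionary = alloc_buckets(); followed by a NULL check would replace the malloc call, and dictionary[index] could then be read safely on the first insertion.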
I just finished pset5 of CS50, and one of the functions is meant to load the contents of a dictionary into a hash table. Inside the loop in that function I have to malloc memory for a node that I later link into the hash table.
When I tried freeing node n after each loop iteration, my function stopped working.
When I don't free it, it works and, more confusingly, it also passes the valgrind check and CS50's check50 for memory leaks.
My questions are:
How would I free 'node n' so that my function still works?
Why doesn't valgrind detect any memory leaks when I don't free 'n'? Is this an example of undefined behavior?
How does malloc in a loop work: does it allocate a new chunk of memory each time, or does it overwrite the previous chunk?
Any answers would be greatly appreciated.
Here is the code :
bool load(const char *dictionary)
{
//Setting counter to determine whether node comes second in linked list or not.
int counter = 0;
//declaring string array to store words from dictionary
char word1[LENGTH +1];
FILE *dic = fopen(dictionary, "r");
if(dic == NULL)
{
return false;
}
//Loop loading words from dictionary to hash table
while(fscanf(dic, "%s", word1) != EOF )
{
node *n = malloc(sizeof(node));
if (n == NULL)
{
return false;
free(n);
}
int i = hash(word1);
//Storing word in temporary node
strcpy(n->word, word1);
n->next = NULL;
//Three different conditions(first node of[i], second node of[i], and after second node of[i])
if(table[i] == NULL)
{
table[i] = n;
counter++;
counter2++;
}
else if (counter == 1)
{
table[i]->next = n;
counter = 0;
counter2++;
}
else
{
n->next = table[i];
table[i] = n;
counter2++;
}
}
fclose(dic);
return true;
}
You don't free memory in load. You free it in unload. That was the whole point.
If valgrind doesn't detect memory leaks, then presumably you have a working unload function. Why would it be undefined behaviour?
It will allocate new memory every time. This wouldn't work if it didn't.
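For reference, here is a sketch of what a matching unload could look like. It assumes the usual CS50 layout (a file-scope table of N buckets and a node with an inline word array); the values of LENGTH and N are placeholders, since the question does not show them.

```c
#include <stdbool.h>
#include <stdlib.h>

#define LENGTH 45  /* assumed, as in the CS50 starter code */
#define N 26       /* assumed bucket count */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
} node;

node *table[N];  /* file-scope array, so every bucket starts as NULL */

/* Frees every node that load() linked into the table. */
bool unload(void)
{
    for (int i = 0; i < N; i++)
    {
        node *cursor = table[i];
        while (cursor != NULL)
        {
            node *next = cursor->next;  /* read the link before freeing */
            free(cursor);
            cursor = next;
        }
        table[i] = NULL;
    }
    return true;
}
```

Each malloc call in load returns a distinct block that stays alive until it is passed to free, which is why the nodes can only be released here, after the table is no longer needed.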
I am currently working on pset5 from cs50.
My entire program compiles successfully but stops in the middle of the function called load when program is executed.
Below is my load function, and you can see the comment where it gave me a segmentation fault error.
If you can help me with figuring out how I should approach my error, please do let me know.
I understand that a segmentation fault occurs when the program attempts to access memory that does not belong to it.
However, I have allocated memory and checked whether there was enough memory to continue on the program.
I will provide comments to highlight what my code does.
// In another header file, I have defined 'LENGTH'
// Maximum length for a word
// (e.g., pneumonoultramicroscopicsilicovolcanoconiosis)
#define LENGTH 45
// Represents a node in a hash table
typedef struct node
{
char word[LENGTH + 1];
struct node *next;
}
node;
// Hash table
// I have initialized the array of `node` pointer to point `NULL`
node *table[N] = {NULL};
unsigned int word_counter = 0;
bool load(const char *dictionary)
{
// Open file, and if cannot open, return false
FILE *file = fopen(dictionary, "r");
if (file == NULL)
{
return false;
}
// read string in the file into array of character, `word` until reaching end of the file
char word[LENGTH + 1];
while (fscanf(file, "%s", word) != EOF)
{
// keep track of how many word exists in the file, for later use (not in this function)
word_counter += 1;
// allocated memory for struct type `node`, if not enough memory found, return false
node *n = (node*)malloc(sizeof(node));
if (n == NULL)
{
return false;
}
// assign index by hashing (hash function will not be posted in this question though.)
unsigned int index = hash(&word[0]);
// copy the word from file, into word field of struct type `node`
strncpy(n->word, word, sizeof(word));
// Access the node pointer in this index from array(table), and check is its `next` field points to NULL or not.
// If it is pointing to NULL, that means there is no word stored in this index of the bucket
if (table[index]->next == NULL) // THIS IS WHERE PROGRAM GIVES 'segmentation fault' !!!! :(
{
table[index]->next = n;
}
else
{
n->next = table[index];
table[index]->next = n;
}
}
return true;
}
You define and initialize the hash table as:
node *table[N] = {NULL};
That means you have an array of null-pointers.
When you insert the first value in the table, table[index] (for any valid index) will be a null pointer. That means table[index]->next attempts to dereference this null pointer, and you will have undefined behavior.
You need to check for a null pointer first:
if (table[index] == NULL)
{
n->next = NULL;
}
else
{
n->next = table[index];
}
table[index] = n;
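Since n->next ends up equal to the old bucket head in both branches (NULL when the bucket is empty), the insertion can be written without the branch. The sketch below assumes N is 26 and wraps the logic in a hypothetical helper named insert_word; neither appears in the question.

```c
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 45
#define N 26  /* assumed bucket count */

typedef struct node
{
    char word[LENGTH + 1];
    struct node *next;
} node;

node *table[N] = {NULL};

/* Hypothetical helper: copies word into a new node and links it at the
   head of bucket index. Returns false on bad input or allocation failure. */
bool insert_word(unsigned int index, const char *word)
{
    if (index >= N || strlen(word) > LENGTH)
    {
        return false;
    }
    node *n = malloc(sizeof(node));
    if (n == NULL)
    {
        return false;
    }
    strcpy(n->word, word);   /* length was checked above, so the copy fits */
    n->next = table[index];  /* NULL when the bucket is still empty */
    table[index] = n;
    return true;
}
```

The newest word always becomes the bucket head, so table[index] itself is never dereferenced during insertion.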
I am trying to write a queue (string version) program in C using linked lists.
Here is the structure:
struct strqueue;
typedef struct strqueue *StrQueue;
struct node {
char *item;
struct node *next;
};
struct strqueue {
struct node *front;//first element
struct node *back;//last element in the list
int length;
};
I create a new StrQueue first:
StrQueue create_StrQueue(void) {
StrQueue q = malloc(sizeof (struct strqueue));
q->front = NULL;
q->back = NULL;
q->length = 0;
return q;
}
push makes a copy of str and places it at the end of the queue:
void push(StrQueue sq, const char *str) {
struct node *new = malloc(sizeof(struct node));
new->item = NULL;
strcpy(new->item,str);//invalid write size of 1 ?
new->next = NULL;
if (sq->length == 0) {
sq->front = new;
sq->back = new;
} else {
sq->back->next = new;
sq->back = new;
}
sq->length++;
}
pop frees the node at the front of sq and returns the string that was first in the queue:
char *pop(StrQueue sq) {
if (sq->length == 0) {
return NULL;
}
struct node *i = sq->front;
char *new = sq->front->item;
sq->front = i->next;
sq->length --;
free(sq->front);
return new;
}
I got invalid write size of 1 at strcpy(new->item,str); I don't understand why I got this error.
Can anyone tell me why and tell me how should I fix it? Thanks in advance.
Okay, first things first: in the answer below I am NOT fixing your linked list concepts; I am just showing you how to fix the code above within the scope of your question. You may want to look into how linked-list queues are usually done.
In:
void push(StrQueue sq, const char *str) {
struct node *new = malloc(sizeof(struct node));
new->item = NULL;
The next statement is wrong:
strcpy(new->item,str);
There are two ways you can solve it:
1. Make sure that str is a valid pointer outside of the list management context for as long as the list is being used.
2. Let the list manage the string allocation (and possibly deallocation).
Option 1 is the quick and dirty method. It is easier to debug later, but it becomes cumbersome as the codebase grows.
Option 2 gives cleaner-looking code but requires some discipline up front: you should create object (string) management routines in addition to the list management routines, which can be cumbersome in its own right.
CASE 1: const char *str is guaranteed to be valid for life of StrQueue (this is what you are looking for really)
It should be:
new->item = str;
Here we assume str is a dynamic string allocated elsewhere.
Now, when you pop the string off in pop, you are okay, because the pointer you are returning is still valid (you are guaranteeing that elsewhere).
CASE 2: const char *str is not guaranteed to be valid for life of StrQueue
Then use:
new->item = strdup(str);
Now, in pop, when you pop off the string you can do one of the following:
1. deallocate the strdup'd copy and return nothing (not quite what your original did)
2. pass pop a caller-provided buffer into which the contents of item are copied (clean)
3. return the popped pointer, which the caller must then deallocate separately when done with it (ugly)
Which would make your pop function one of the following:
Case 2.1:
void pop(StrQueue sq) {
    if (sq->length == 0) {
        return; /* nothing to pop; a void function cannot return NULL */
    }
    struct node *node = sq->front;
    sq->front = node->next;
    sq->length--;
    free(node->item);
    free(node);
}
Case 2.2:
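char *pop(StrQueue sq, char *here) {
    if (sq->length == 0) {
        return NULL;
    }
    struct node *node = sq->front;
    sq->front = node->next;
    sq->length--;
    strcpy(here, node->item); /* here must be large enough for the string */
    free(node->item);
    free(node);
    return here; /* the original fell off the end without returning a value */
}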
Case 2.3:
char *pop(StrQueue sq) {
char *dangling_item = NULL;
if (sq->length == 0) {
return NULL;
}
struct node *node = sq->front;
sq->front = node->next;
sq->length--;
dangling_item = node->item;
free(node);
return dangling_item;
}
I got invalid write size of 1 at strcpy(new->item,str); I don't understand why I got this error. Can anyone tell me why and tell me how should I fix it?
Why:
This code:
new->item = NULL;
strcpy(new->item,str);//invalid write size of 1 ?
You're not supposed to pass a null pointer as the first argument; it must point to allocated memory. The reason you're getting this error message, I imagine, is that the implementation of strcpy probably looks something like this:
for (int i = 0; str2[i]; i++) str1[i] = str2[i];
In the first iteration of the for loop it writes to address 0, which on typical systems is not mapped into the process, and this gives you the invalid write of size 1. The size is 1 rather than the length of the string because the copy stores one char at a time and the very first store already faults, so SIGSEGV stops the program before anything else is written.
How to fix:
Allocate space for new->item before calling strcpy, like this:
new->item = malloc (strlen (str) + 1); // + 1 for null-terminating character
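But you should also include some error checking, like this:
size_t len = strlen(str) + 1; // + 1 for the null terminator
new->item = malloc(len);
if (!new->item) {
    free(new); // don't leak the node if the string allocation fails
    return;
}
strcpy(new->item, str);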
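Putting the pieces together, a queue that owns its strings might look like the sketch below. It reuses the struct definitions from the question. The int return value of push and the reset of back in pop are additions made here for illustration, not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char *item;
    struct node *next;
};

struct strqueue {
    struct node *front; /* first element */
    struct node *back;  /* last element */
    int length;
};
typedef struct strqueue *StrQueue;

/* Appends a private copy of str. Returns 0 on success, -1 if out of memory.
   (The int return is an assumption; the original push returns void.) */
int push(StrQueue sq, const char *str)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL) {
        return -1;
    }
    n->item = malloc(strlen(str) + 1); /* + 1 for the null terminator */
    if (n->item == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->item, str);
    n->next = NULL;
    if (sq->length == 0) {
        sq->front = n;
    } else {
        sq->back->next = n;
    }
    sq->back = n;
    sq->length++;
    return 0;
}

/* Removes the front node and hands its string to the caller,
   who must free() it when done. Returns NULL on an empty queue. */
char *pop(StrQueue sq)
{
    if (sq->length == 0) {
        return NULL;
    }
    struct node *n = sq->front;
    char *item = n->item;
    sq->front = n->next;
    if (sq->front == NULL) {
        sq->back = NULL; /* queue is now empty */
    }
    sq->length--;
    free(n); /* free the node that was removed, not the new front */
    return item;
}
```

Note that the pop in the question calls free(sq->front) after advancing front, which frees the next node rather than the one being removed; the sketch frees the removed node instead.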
I wrote a hashtable and it basically consists of these two structures:
typedef struct dictEntry {
void *key;
void *value;
struct dictEntry *next;
} dictEntry;
typedef struct dict {
dictEntry **table;
unsigned long size;
unsigned long items;
} dict;
dict.table is an array of buckets that holds all the stored key/value pairs; each bucket is in turn a linked list.
If half of the hashtable is full, I expand it by doubling the size and rehashing it:
dict *_dictRehash(dict *d) {
int i;
dict *_d;
dictEntry *dit;
_d = dictCreate(d->size * 2);
for (i = 0; i < d->size; i++) {
for (dit = d->table[i]; dit != NULL; dit = dit->next) {
_dictAddRaw(_d, dit);
}
}
/* FIXME memory leak because the old dict can never be freed */
free(d); // seg fault
return _d;
}
The function above uses the pointers from the old hash table and stores it in the newly created one. When freeing the old dict d a Segmentation Fault occurs.
How am I able to free the old hashtable struct without having to allocate the memory for the key/value pairs again?
Edit, for completeness:
dict *dictCreate(unsigned long size) {
dict *d;
d = malloc(sizeof(dict));
d->size = size;
d->items = 0;
d->table = calloc(size, sizeof(dictEntry*));
return d;
}
void dictAdd(dict *d, void *key, void *value) {
dictEntry *entry;
entry = malloc(sizeof *entry);
entry->key = key;
entry->value = value;
entry->next = '\0';
if ((((float)d->items) / d->size) > 0.5) d = _dictRehash(d);
_dictAddRaw(d, entry);
}
void _dictAddRaw(dict *d, dictEntry *entry) {
int index = (hash(entry->key) & (d->size - 1));
if (d->table[index]) {
dictEntry *next, *prev;
for (next = d->table[index]; next != NULL; next = next->next) {
prev = next;
}
prev->next = entry;
} else {
d->table[index] = entry;
}
d->items++;
}
The best way to debug this is to run your code under valgrind.
But to give you some perspective:
When you free(d), you seem to expect something like a destructor call on your struct dict, which would internally free the memory allocated for the table of dictEntry pointers. free does not do that.
Why do you have to delete the entire hash table to expand it? You have a next pointer anyway, so why not just append new hash entries to it?
The solution is not to free d, but to expand d by allocating more struct dictEntry and assigning them to the appropriate next.
When contracting d, you will have to iterate over next to reach the end and then start freeing the memory for the struct dictEntrys inside d.
To clarify Graham's point, you need to pay attention to how memory is being accessed in this library. The user has one pointer to their dictionary. When you rehash, you free the memory referenced by that pointer. Although you allocated a new dictionary for them, the new pointer is never returned to them, so they don't know not to use the old one. When they try to access their dictionary again, it's pointing to freed memory.
One possibility is not to throw away the old dictionary entirely, but only the dictEntry table you allocated within the dictionary. That way your users will never have to update their pointer, but you can rescale the table to accommodate more efficient access. Try something like this:
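void _dictRehash(dict *d) {
    printf("rehashing!\n"); /* needs <stdio.h> */
    unsigned long i;
    unsigned long old_size = d->size;
    dictEntry **old_table = d->table;
    dictEntry **new_table = calloc(old_size * 2, sizeof(dictEntry*));
    if (new_table == NULL) {
        return; /* keep the old table if the allocation fails */
    }
    d->table = new_table;
    d->size = old_size * 2;
    d->items = 0; /* _dictAddRaw counts the entries again */
    for (i = 0; i < old_size; i++) {
        dictEntry *dit = old_table[i];
        while (dit != NULL) {
            dictEntry *next = dit->next; /* save the link before relinking */
            dit->next = NULL;            /* detach from the old chain */
            _dictAddRaw(d, dit);
            dit = next;
        }
    }
    free(old_table);
}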
As a side note, I'm not sure what your hash function does, but it seems to me that the line
int index = (hash(entry->key) & (d->size - 1));
is a little unorthodox. Masking the hash with d->size - 1 always yields an index in [0, d->size), but it matches hash % d->size only when the size is a power of two. For any other size some buckets are never used, so you might mean % for modulus.
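To make the difference concrete, here is a small sketch. The helper names is_power_of_two and bucket_index are hypothetical and not part of the code above.

```c
#include <stdbool.h>

/* True when size is a nonzero power of two. */
bool is_power_of_two(unsigned long size)
{
    return size != 0 && (size & (size - 1)) == 0;
}

/* Hypothetical helper: maps a hash value to a bucket index
   for any nonzero table size. */
unsigned long bucket_index(unsigned long h, unsigned long size)
{
    /* Masking is equivalent to the modulus only for power-of-two sizes,
       so fall back to % for everything else. */
    return is_power_of_two(size) ? (h & (size - 1)) : (h % size);
}
```

With a size of 10, for example, the mask is 9 (binary 1001), so masking alone can only ever produce the indexes 0, 1, 8 and 9. If the table always starts at a power of two and only doubles, the mask on its own is fine.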
You are freeing a pointer which is passed in to your function. This is only safe if you know that whoever's calling your function isn't still trying to use the old value of d. Check all the code which calls _dictRehash() and make sure nothing's hanging on to the old pointer.
What does dictCreate actually do?
I think you're getting confused between the (fixed size) dict object, and the (presumably variable sized) array of pointers to dictEntries in dict.table.
Maybe you could just realloc() the memory pointed to by dict.table, rather than creating a new 'dict' object and freeing the old one (which incidentally, isn't freeing the table of dictentries anyway!)
Just trying to make a kind of hash table with each node being a linked list.
Having trouble just initializing the space, what am I doing wrong?
#include <stdlib.h>
typedef struct entry {
struct entry *next;
void *theData;
} Entry;
typedef struct HashTable {
Entry **table;
int size;
} HashTable;
int main(){
HashTable *ml;
ml = initialize();
return 0;
}
HashTable *initialize(void)
{
HashTable *p;
Entry **b;
int i;
if ((p = (HashTable *)malloc(sizeof(HashTable *))) == NULL)
return NULL;
p->size = 101;
if ((b = (Entry **)malloc(p->size * sizeof(Entry **))) == NULL)
return NULL;
p->table = b;
for(i = 0; i < p->size; i++) {
Entry * b = p->table[i];
b->theData = NULL;
b->next = NULL;
}
return p;
}
You need to change sizeof(HashTable *) to sizeof(HashTable), and similarly sizeof(Entry **) to sizeof(Entry *). Second, for every Entry you need to allocate memory with malloc again inside the loop.
if ((p = malloc(sizeof(HashTable))) == NULL)
return NULL;
p->size = 101;
if ((b = malloc(p->size * sizeof(Entry *))) == NULL)
return NULL;
I believe removing the malloc() result casts is best practice.
Plus, as Naveen was first to point out, you also need to allocate memory for each Entry.
First, your sizeofs are wrong. T *p = malloc(num * sizeof(T)) is the correct pattern. You can also use calloc.
You are reusing b for different purposes, which is quite confusing. Single-character variable names are generally not a good idea.
p->table, which was b, is allocated but not initialised, i.e. its elements don't point to anything useful, and then you try to dereference them.
You need to fill it with Entry* pointers first, and they must point to valid Entry structs if you are going to dereference them.
Your process probably dies on the line b->theData = NULL
Also, you can give the HashTable automatic or static storage instead of allocating it: declare it in main (or in any function that outlives every use of the table) and pass a pointer to it to your initialize function. That avoids one malloc, which is comparatively slow.
So in main, you can do:
HashTable table;
InitializeHashTable(&table);
// use table (no need to free)
// just do not return table
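A sketch of how such an initializer could look is shown below. The name InitializeHashTable comes from the snippet above, but its int return value and the DestroyHashTable counterpart are assumptions added for illustration.

```c
#include <stdlib.h>

typedef struct entry {
    struct entry *next;
    void *theData;
} Entry;

typedef struct HashTable {
    Entry **table;
    int size;
} HashTable;

/* Fills in a caller-provided HashTable. Returns 0 on success, -1 on failure.
   (The int return value is an assumption.) */
int InitializeHashTable(HashTable *p)
{
    p->size = 101;
    /* calloc leaves every bucket as a null pointer, i.e. an empty list.
       Entry structs are allocated later, one per inserted item. */
    p->table = calloc((size_t)p->size, sizeof *p->table);
    if (p->table == NULL) {
        p->size = 0;
        return -1;
    }
    return 0;
}

/* Hypothetical counterpart that releases the bucket array only. */
void DestroyHashTable(HashTable *p)
{
    free(p->table);
    p->table = NULL;
    p->size = 0;
}
```

The HashTable struct itself needs no free when it lives on the stack, but the bucket array it points to is still heap memory and has to be released once the table is no longer used.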