Validity of Hashtable From K&R C Programming Language Book

While searching for a dictionary example in C, I came across an example here on Stack Overflow that references K&R's The C Programming Language. Section 6.6 of that book covers table lookup and implements it as a hash table.
The hash table in the example is an array of size 101 whose entries are self-referential nlist nodes (the struct in the snippet below).
My question is: why did they use a self-referential struct for a lookup table? Lookup tables work as key-value pairs, so we shouldn't need to hold a next node.
struct nlist {              /* table entry: */
    struct nlist *next;     /* next entry in chain */
    char *name;             /* defined name */
    char *defn;             /* replacement text */
};
The second part of my question concerns the loop in the lookup(char *s) function. As far as I can tell the loop runs only once, so the expression np = np->next seems irrelevant, unless there is something I missed.
struct nlist *lookup(char *s)
{
    struct nlist *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;      /* found */
    return NULL;            /* not found */
}
The last part of my question is about the assignment np->next = hashtab[hashval]; in the function install(char *name, char *defn) below. It seems to assign the current node to itself as its next node.
struct nlist *install(char *name, char *defn)
{
    struct nlist *np;
    unsigned hashval;

    if ((np = lookup(name)) == NULL) {  /* not found */
        np = (struct nlist *) malloc(sizeof(*np));
        if (np == NULL || (np->name = strdup(name)) == NULL)
            return NULL;
        hashval = hash(name);
        np->next = hashtab[hashval];
        hashtab[hashval] = np;
    } else                              /* already there */
        free((void *) np->defn);        /* free previous defn */
    if ((np->defn = strdup(defn)) == NULL)
        return NULL;
    return np;
}
Thanks in advance.

The table is not indexed by keys directly, but by hashes of keys. Different keys may have the same hash. This is called a hash collision.
This implementation stores all entries whose keys share a hash in a linked list, hence the self-referential structure. The table stores linked lists of key-value pairs, and all keys in the same list have the same hash. (This is not the only method of handling collisions.)
In case of a collision, the loop does run more than once. If you don't see the loop executing more than once, keep adding entries until you hit a collision.
No, it does not assign a node to itself. It inserts a newly allocated node at the head of the linked list. The former head of the list becomes the second node in the list.
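To make the collision case concrete, here is a minimal sketch of the same table. The hash() function is not shown in the question, so the one below (multiply by 31, modulo 101) is an assumption, as are the dupstr() stand-in for strdup() and the chain_length() helper, which exists only for the demonstration. Assuming ASCII, the names "aa" and "J" both hash to 74 and therefore share a bucket.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define HASHSIZE 101

struct nlist {
    struct nlist *next;   /* next entry in chain */
    char *name;           /* defined name */
    char *defn;           /* replacement text */
};

static struct nlist *hashtab[HASHSIZE];   /* every bucket starts out NULL */

/* hash: assumed hash function (not shown in the question) */
unsigned hash(const char *s)
{
    unsigned hashval;
    for (hashval = 0; *s != '\0'; s++)
        hashval = (unsigned char)*s + 31 * hashval;
    return hashval % HASHSIZE;
}

/* dupstr: hypothetical portable stand-in for strdup() */
static char *dupstr(const char *src)
{
    char *p = malloc(strlen(src) + 1);
    if (p != NULL)
        strcpy(p, src);
    return p;
}

/* lookup: walk the chain of the bucket that s hashes to */
struct nlist *lookup(const char *s)
{
    struct nlist *np;
    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;
    return NULL;
}

/* install: same shape as the question's version, except that the node
 * is freed if its name cannot be copied */
struct nlist *install(const char *name, const char *defn)
{
    struct nlist *np;
    unsigned hashval;
    if ((np = lookup(name)) == NULL) {
        if ((np = malloc(sizeof(*np))) == NULL)
            return NULL;
        if ((np->name = dupstr(name)) == NULL) {
            free(np);
            return NULL;
        }
        hashval = hash(name);
        np->next = hashtab[hashval];   /* old head becomes the second node */
        hashtab[hashval] = np;         /* new node becomes the head */
    } else
        free(np->defn);
    if ((np->defn = dupstr(defn)) == NULL)
        return NULL;
    return np;
}

/* chain_length: hypothetical helper, counts the nodes in one bucket */
size_t chain_length(unsigned bucket)
{
    size_t n = 0;
    struct nlist *np;
    assert(bucket < HASHSIZE);
    for (np = hashtab[bucket]; np != NULL; np = np->next)
        n++;
    return n;
}
```

After install("aa", "1") and install("J", "2"), chain_length(hash("J")) is 2, and lookup("aa") has to take the np = np->next step once to get past the "J" node at the head.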

Related

How does install() behave when NULL is stored as a name definition?

I am trying to understand the code for an implementation of hash search as discussed in the K&R C programming book (pages 143-145).
Consider a #define statement
#define STATE 1
The aim is to store the name and its replacement text in a table. We create an array hashtab of pointers to linked lists. Each pointer refers to a linked list whose nodes each hold a name and its replacement text (along with a link to the next node, of course). If there are no names with a certain hash value, the array element at that index is NULL. All nodes in the linked list that a given hashtab element points to have names with the same hash value.
Following are the function and struct definitions.
Here is the struct nlist. It is used to create a node which would record the name and replacement text. The node will be added in front of the already existing linked list for that hash value.
struct nlist {              /* table entry: */
    struct nlist *next;     /* next entry in chain */
    char *name;             /* defined name */
    char *defn;             /* replacement text */
};
Here is the code for the lookup() function. It searches for the string in the table and returns a pointer to the place where it was found, or NULL if the string is not present.
/* lookup: look for s in hashtab */
struct nlist *lookup(char *s)
{
    struct nlist *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;      /* found */
    return NULL;            /* not found */
}
The strdup() function makes a duplicate of the string src. Error handling, for example when malloc() returns NULL, is left to the caller.
char *strdup(const char *src) {
    char *p = (char *) malloc(strlen(src) + 1);  // Space for length plus '\0'
    if (p != NULL) strcpy(p, src);
    return p;
}
My questions mainly concern the following install() function which records a name and replacement text in the front of the already existing linked list for that hash value.
/* install: put (name, defn) in hashtab */
struct nlist *install(char *name, char *defn)
{
    struct nlist *np;
    unsigned hashval;

    if ((np = lookup(name)) == NULL) {  /* not found */
        np = (struct nlist *) malloc(sizeof(*np));
        if (np == NULL || (np->name = strdup(name)) == NULL)
            return NULL;
        hashval = hash(name);
        np->next = hashtab[hashval];
        hashtab[hashval] = np;
    } else                              /* already there */
        free((void *) np->defn);        /* free previous defn */
    if ((np->defn = strdup(defn)) == NULL)
        return NULL;
    return np;
}
I have not written the hash() function here.
What is the significance of its return value to the main()?
One more related question:
The last if condition may assign NULL to np->defn (and also returns it). How will this assignment of NULL be useful to the user of install() since the first item in the list of names (having such hash values) would contain NULL in the defn? Are we allowing a name to have NULL as its definition and ignoring the reason why NULL was assigned to its defn field?
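As written, a failed strdup(defn) does leave an entry in the table whose defn is NULL, and the caller only learns about it from the NULL return value. One possible way around that, sketched below, is to copy the new text first and touch the node only once the copy has succeeded. This is not K&R's code: replace_defn() is a hypothetical helper, and install() would call it in place of the final strdup().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct nlist {
    struct nlist *next;   /* next entry in chain */
    char *name;           /* defined name */
    char *defn;           /* replacement text */
};

/* replace_defn: hypothetical helper; stores a copy of defn in np.
 * Returns 0 on success, or -1 if memory ran out, in which case np
 * still holds its previous definition. */
int replace_defn(struct nlist *np, const char *defn)
{
    size_t len;
    char *copy;

    assert(np != NULL && defn != NULL);
    len = strlen(defn) + 1;          /* include the terminating '\0' */
    copy = malloc(len);
    if (copy == NULL)
        return -1;                   /* old definition is left intact */
    memcpy(copy, defn, len);
    free(np->defn);                  /* free(NULL) is a no-op, so a new node works too */
    np->defn = copy;
    return 0;
}
```

For a freshly allocated node this assumes np->defn was set to NULL first, so that the free() call is harmless.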

Singly Linked List Implementation in C using 3 different typedefs

So, my task is to write a full implementation of a singly linked list in C.
I have written implementations of a stack and a dynamic vector before, but this time the linked list confuses me a little because of the use of 3 different typedefs.
I'll be glad to get your review and tips on my code.
I would make a test file as I always do, but I am having a hard time writing one because of all the void * casts.
I won't add all 14 functions; I'll add just the ones that I'm least sure of.
We must follow the typedefs and prototypes given below, so none of them can be changed.
I also had to add a "dummy node" as the last node, which means there will always be a "dummy node" indicating that the node before it is the "real" last node in the list. This is part of the instructions.
typedef struct slist slist_ty;
typedef struct slist_node slist_node_ty;
typedef slist_node_ty *slist_iter_ty;
This is my implementation of the structs:
They asked us to allow in theory any type of data in our nodes, that's why I wrote void *.
struct slist
{
slist_iter_ty head;
slist_iter_ty end;
};
struct slist_node
{
void *data;
slist_iter_ty next;
};
And these are the functions:
/* Creates an empty single-linked list and returns pointer to the head */
/* returns NULL on failure*/
/* Complexity: O(1) */
slist_ty *SlistCreate(void)
{
slist_ty *new_list = (slist_ty *)malloc(sizeof(slist_ty));
if (NULL == new_list)
{
fprintf(stderr, "Failed to allocate memory\n");
return(NULL);
}
new_list->head = NULL;
/* create a dummy node that will represent the end of the list */
new_list->end = (slist_node *)malloc(sizeof(slist_node));
if (NULL == new_list->end)
{
fprintf(stderr, "Failed to allocate memory\n");
free(new_list);
return(NULL);
}
new_list->end->data = NULL;
new_list->end->next = NULL;
return(new_list->head);
}
/* Deletes entire List */
/* Complexity: O(n) */
void SlistDestroy(slist_ty *slist)
{
slist_iter_ty temp = NULL;
assert(slist);
while(NULL != slist->head)
{
tmp = slist->head;
slist->head = temp;
free(temp);
}
free(slist->end);
slist->end = NULL;
free(slist);
slist = NULL;
}
/* Inserts the element after the iterator, returns iterator to the new node */
/* TODO Undefined behaviour if iter is slist_END */
/* Complexity: O(1) */
slist_iter_ty SlistInsert(slist_iter_ty iter, void *data)
{
slist_iter_ty new_node = NULL;
assert(iter);
assert(iter->next);
assert(data);
new_node->data = data;
new_node->next = iter->next;
iter->next = new_node;
return(new_node);
}
/* Returns iterator to end of the list */
/* Complexity: O(1) */
slist_iter_ty SlistIteratorEnd(const slist_ty *slist)
{
slist_iter_ty iterator = slist->head;
assert (slist);
if (NULL == slist->head)
{
return(NULL);
}
while (NULL != iterator->next->data)
{
iterator = iterator->next;
}
return(iterator);
}
My question, along with the request for feedback, is:
Should I free any new slist_iter_ty that I make?
For example, I made an iterator of type slist_iter_ty in the last function to help me traverse the list, but I can't free the iterator because I need to return it as the return value.
I also made a new_node in the SlistInsert function.
Will it be freed as part of the SlistDestroy function?
Thanks.

slist is the list. You create it with malloc, so when you want to destroy it you need to free the list.
Also, every insert should allocate a node with malloc, so when you destroy the list you need to empty it of all its nodes, freeing them one by one.
I can see that you don't use malloc in SlistInsert. How can you keep the data without using malloc?
In the destroy function:
while(NULL != slist->head)
{
    tmp = slist->head;
    slist->head = temp;
    free(temp);
}
I think what you meant is:
while(NULL != slist->head)
{
    temp = slist->head;
    slist->head = slist->head->next;
    free(temp);
}
In the insert function, instead of
slist_iter_ty new_node = NULL;
you should write:
slist_iter_ty new_node = (slist_iter_ty) malloc(sizeof(slist_node_ty));
In the SlistIteratorEnd function
slist_iter_ty SlistIteratorEnd(const slist_ty *slist)
you can just return (after you assert something :)):
return (slist->end);
(otherwise it wouldn't be O(1), it would be O(n))
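Putting those fixes together, a minimal sketch of the create/insert/end/destroy functions could look like the following. It keeps the three typedefs from the question, but two things are my own assumptions rather than part of the assignment: an empty list has head pointing at the dummy node (instead of NULL), and SlistPushFront() is a hypothetical helper, needed because inserting after the dummy is not allowed.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct slist slist_ty;
typedef struct slist_node slist_node_ty;
typedef slist_node_ty *slist_iter_ty;

struct slist_node {
    void *data;
    slist_iter_ty next;
};

struct slist {
    slist_iter_ty head;   /* first node; equals end while the list is empty */
    slist_iter_ty end;    /* dummy node that marks the end of the list */
};

slist_ty *SlistCreate(void)
{
    slist_ty *list = malloc(sizeof(*list));
    if (list == NULL)
        return NULL;
    list->end = malloc(sizeof(*list->end));
    if (list->end == NULL) {
        free(list);
        return NULL;
    }
    list->end->data = NULL;
    list->end->next = NULL;
    list->head = list->end;
    return list;                    /* the list itself, not its head */
}

/* SlistPushFront: hypothetical helper, adds a node before the current head */
slist_iter_ty SlistPushFront(slist_ty *list, void *data)
{
    slist_iter_ty new_node;
    assert(list != NULL);
    new_node = malloc(sizeof(*new_node));
    if (new_node == NULL)
        return NULL;
    new_node->data = data;
    new_node->next = list->head;
    list->head = new_node;
    return new_node;
}

/* SlistInsert: adds a node after iter; iter must not be the dummy node */
slist_iter_ty SlistInsert(slist_iter_ty iter, void *data)
{
    slist_iter_ty new_node;
    assert(iter != NULL);
    assert(iter->next != NULL);     /* only the dummy has next == NULL */
    new_node = malloc(sizeof(*new_node));
    if (new_node == NULL)
        return NULL;
    new_node->data = data;
    new_node->next = iter->next;
    iter->next = new_node;
    return new_node;
}

slist_iter_ty SlistIteratorEnd(const slist_ty *slist)
{
    assert(slist != NULL);
    return slist->end;              /* O(1), no traversal */
}

void SlistDestroy(slist_ty *slist)
{
    slist_iter_ty node, next;
    assert(slist != NULL);
    for (node = slist->head; node != NULL; node = next) {
        next = node->next;
        free(node);                 /* the last pass frees the dummy */
    }
    free(slist);
}
```

An iterator is only a pointer to a node that the list owns, so you never free an iterator on its own. SlistDestroy() frees every node, including the ones created by SlistInsert(); the data pointers stay the caller's responsibility.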

I don't understand the hash tables example in K&R

Here is the install function from the hash table example in K&R's book:
struct nlist *install(char *name, char *defn)
{
    struct nlist *np;
    unsigned hashval;

    if ((np = lookup(name)) == NULL) {  /* not found */
        np = (struct nlist *) malloc(sizeof(*np));
        if (np == NULL || (np->name = strdup(name)) == NULL)
            return NULL;
        hashval = hash(name);
        np->next = hashtab[hashval];
        hashtab[hashval] = np;
    } else                              /* already there */
        free((void *) np->defn);        /* free previous defn */
    if ((np->defn = strdup(defn)) == NULL)
        return NULL;
    return np;
}
I don't understand the line np->next = hashtab[hashval].
I thought the reason for having np->next was to allow two strings with the same hash value in the table, but the outcome seems to be only one name for every hash value.
Furthermore, I cannot seem to understand the lookup function, in particular the "afterthought" part of the for loop (because I think there is only one value for every struct in the table):
/* lookup: look for s in hashtab */
struct nlist *lookup(char *s)
{
    struct nlist *np;

    for (np = hashtab[hash(s)]; np != NULL; np = np->next)
        if (strcmp(s, np->name) == 0)
            return np;      /* found */
    return NULL;            /* not found */
}
What am I missing?

Each key (name) can appear only once, but two or more keys can have the same hash. np->next = hashtab[hashval] links the new node in front of the existing list for that hash value. lookup then iterates through the list until the key (name) is matched.

np->next = hashtab[hashval];
hashtab[hashval] = np;
These two lines do not replace the old entry, they add to it.
hashtab[hashval]-> existing_node becomes
hashtab[hashval]-> np -(next)-> existing_node
As @Bo Persson mentions in the comments, this is called "chaining".
Given this structure, the lookup function correctly checks the names of each node in the chain.
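Stripped of the hashing, those two lines are an ordinary push onto the front of a singly linked list. The sketch below isolates them; struct node and push_front() are hypothetical names used only for illustration, with head standing in for one hashtab[hashval] slot.

```c
#include <assert.h>
#include <stddef.h>

struct node {
    struct node *next;   /* next entry in chain */
    const char *name;
};

/* push_front: hypothetical helper mirroring the two lines from install() */
void push_front(struct node **head, struct node *np)
{
    assert(head != NULL && np != NULL);
    np->next = *head;    /* np now points at the old first node (or NULL) */
    *head = np;          /* the bucket now points at np */
}
```

Pushing a node onto an empty bucket leaves its next as NULL; pushing a second node makes it the head, with the first node reachable through next. Nothing is overwritten.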

Adapting generic code to different size of linked list in C

For a programming assignment, we've been asked to read in some data from a text file and populate a linked list with the data. Here is the example code we've been given:
#include <stdio.h>
#include <stdlib.h>
#include <assert.h>
#define MAX_INPUT 20
#define EXTRA_CHARS 2
typedef struct listNode
{
int data;
struct listNode * next;
} ListNode;
typedef ListNode * ListNodePtr;
int main()
{
ListNodePtr head, new, current, previous, next;
unsigned listSize;
int i, anInt;
char str[MAX_INPUT + EXTRA_CHARS]; /* room for the newline and '\0' that fgets stores */
listSize = 0;
head = NULL;
while (fgets(str, MAX_INPUT+EXTRA_CHARS, stdin) != NULL)
{
/* Parsing the string to int */
if(sscanf(str,"%d",&anInt) != 1)
{
fprintf(stderr, "Invalid input entered\n");
exit(EXIT_FAILURE);
}
/* Creating the node using malloc(...) */
if ( (new=malloc(sizeof(ListNode))) == NULL)
{
fprintf(stderr,"\nMemory Allocation for ListInsert failed!\n");
fprintf(stderr,"Aborting data entry!\n");
break;
}
current = head;
previous = NULL;
/* Search to find where in insert new list node */
while (current != NULL && current->data < anInt)
{
previous = current;
current = current->next;
}
new->data = anInt;
new->next = current;
listSize++;
if (previous == NULL)
{
head = new;
}
else
{
previous->next = new;
}
}/*End of input loop */
/* Display integers in linked list */
current = head;
while (current != NULL)
{
printf("%d\n", current->data);
current = current->next;
}
/* Deallocate memory used by list nodes */
current = head;
while (current != NULL)
{
next = current->next;
free(current);
current = next;
}
return EXIT_SUCCESS;
}
Here's the problem I have with it. In every example of linked lists I've seen so far online or in books, the definition of a linked list is given as a struct that contains only one item of data and a pointer to the next node in the list. The problem is that we've been given the following structure definitions to populate with data:
typedef struct price
{
unsigned dollars;
unsigned cents;
} PriceType;
typedef struct item
{
char itemID[ID_LEN + 1];
char itemName[MAX_NAME_LEN + 1];
PriceType prices[NUM_PRICES];
char itemDescription[MAX_DESC_LEN + 1];
ItemTypePtr nextItem;
} ItemType;
typedef struct category
{
char categoryID[ID_LEN + 1];
char categoryName[MAX_NAME_LEN + 1];
char drinkType; /* (H)ot or (C)old. */
char categoryDescription[MAX_DESC_LEN + 1];
CategoryTypePtr nextCategory;
ItemTypePtr headItem;
unsigned numItems;
} CategoryType;
typedef struct bcs
{
CategoryTypePtr headCategory; /* Pointer to the next node */
unsigned numCategories;
} BCSType;
This does not fit in with all the examples I've seen. So in the "generic" code above, do I have to do everything above, but replace the "new->data" part with, for example, "category->categoryID", and "category->categoryName" etc etc for all the members of the struct in order to populate the entire linked list with data?

Everything you need in the way of data structures was given to you. There's a top level item, BCSType, that leads to everything else. It has a linked list of categories with the head link in headCategory. And each CategoryType has a next pointer (nextCategory), and head link (headItem) to a linked list of ItemTypes, each of which has a next pointer, nextItem. You don't need to add anything to these structs and should not (and you'll get graded down if you do). Now you need to write the code that reads the data from the file and creates instances of these data structures, using the example code as a guideline for dealing with linked lists ... but you have to think, and be creative, to apply it to the three-level structure you've been given.
The vital thing here is to learn by doing. Try to write such code and don't give up, but if you truly get stuck you can, as a last resort, ask another question here ... but be very specific when you do, and include your code and error messages and all other relevant information about what you expected to happen and what actually happened instead. And be sure to learn how to use a debugger, and use it, before posting such a question.
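As a starting point only, here is how the example's search-and-link loop might carry over to the top level of that structure. Everything not in the question is an assumption: the four size constants are placeholders, the two pointer typedefs are declared up front because the structs use them before they are defined, and addCategory() is a hypothetical helper that keeps categories ordered by name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Placeholder sizes; the assignment's real values are not shown. */
#define ID_LEN 5
#define MAX_NAME_LEN 25
#define MAX_DESC_LEN 250
#define NUM_PRICES 3

typedef struct item *ItemTypePtr;
typedef struct category *CategoryTypePtr;

typedef struct price { unsigned dollars; unsigned cents; } PriceType;

typedef struct item {
    char itemID[ID_LEN + 1];
    char itemName[MAX_NAME_LEN + 1];
    PriceType prices[NUM_PRICES];
    char itemDescription[MAX_DESC_LEN + 1];
    ItemTypePtr nextItem;
} ItemType;

typedef struct category {
    char categoryID[ID_LEN + 1];
    char categoryName[MAX_NAME_LEN + 1];
    char drinkType;                 /* (H)ot or (C)old. */
    char categoryDescription[MAX_DESC_LEN + 1];
    CategoryTypePtr nextCategory;
    ItemTypePtr headItem;
    unsigned numItems;
} CategoryType;

typedef struct bcs {
    CategoryTypePtr headCategory;
    unsigned numCategories;
} BCSType;

/* addCategory: hypothetical helper; links a new category in name order */
CategoryTypePtr addCategory(BCSType *bcs, const char *id,
                            const char *name, char drinkType)
{
    CategoryTypePtr newCat, current, previous;

    assert(bcs != NULL && id != NULL && name != NULL);
    if (strlen(id) > ID_LEN || strlen(name) > MAX_NAME_LEN)
        return NULL;                        /* would not fit in the arrays */
    newCat = calloc(1, sizeof(*newCat));    /* zero-fills the char arrays */
    if (newCat == NULL)
        return NULL;
    strcpy(newCat->categoryID, id);
    strcpy(newCat->categoryName, name);
    newCat->drinkType = drinkType;
    newCat->headItem = NULL;                /* item list starts out empty */
    newCat->numItems = 0;

    /* Same search as the example code, comparing names instead of ints */
    previous = NULL;
    current = bcs->headCategory;
    while (current != NULL && strcmp(current->categoryName, name) < 0) {
        previous = current;
        current = current->nextCategory;
    }
    newCat->nextCategory = current;
    if (previous == NULL)
        bcs->headCategory = newCat;
    else
        previous->nextCategory = newCat;
    bcs->numCategories++;
    return newCat;
}
```

The node simply has more fields to fill in than the single int in the example; the linking logic is unchanged. The item list inside each category would follow the same pattern, using headItem and nextItem.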

Hashtable insertion/search in C

Hello, I have a problem with my hash table. It is implemented like this:
#define HT_SIZE 10
typedef struct _list_t_ {
char key[20];
char string[20];
char prevValue[20];
struct _list_t_ *next;
} list_t;
typedef struct _hash_table_t_ {
int size; /* the size of the table */
list_t ***table; /* first */
sem_t lock;
} hash_table_t;
The table is a triple pointer because I want a hash table with several partitions (shards). Here is the initialization of my hash table:
hash_table_t *create_hash_table(int NUM_SERVER_THREADS, int num_shards){
hash_table_t *new_table;
int j,i;
if (HT_SIZE<1) return NULL; /* invalid size for table */
/* Attempt to allocate memory for the hashtable structure */
new_table = (hash_table_t*)malloc(sizeof(hash_table_t)*HT_SIZE);
/* Attempt to allocate memory for the table itself */
new_table->table = (list_t ***)calloc(1,sizeof(list_t **));
/* Initialize the elements of the table */
for(j=0; j<num_shards; j++){
new_table->table[j] = (list_t **)calloc(1,sizeof(list_t *));
for(i=0; i<HT_SIZE; i++){
new_table->table[j][i] = (list_t *)calloc(1,sizeof(list_t ));
}
}
/* Set the table's size */
new_table->size = HT_SIZE;
sem_init(&new_table->lock, 0, 1);
return new_table;
}
Here is my search function to search in the hash table
list_t *lookup_string(hash_table_t *hashtable, char *key, int shardId){
list_t *list ;
int hashval = hash(key);
/* Go to the correct list based on the hash value and see if key is
* in the list. If it is, return return a pointer to the list element.
* If it isn't, the item isn't in the table, so return NULL.
*/
sem_wait(&hashtable->lock);
for(list = hashtable->table[shardId][hashval]; list != NULL; list =list->next) {
if (strcmp(key, list->key) == 0){
sem_post(&hashtable->lock);
return list;
}
}
sem_post(&hashtable->lock);
return NULL;
}
And my insert function:
char *add_string(hash_table_t *hashtable, char *str,char *key, int shardId){
list_t *new_list;
list_t *current_list;
unsigned int hashval = hash(key);
/*printf("|%d|%d|%s|\n",hashval,shardId,key);*/
/* Lock for concurrency */
sem_wait(&hashtable->lock);
/* Attempt to allocate memory for list */
new_list = (list_t*)malloc(sizeof(list_t));
/* Does item already exist? */
sem_post(&hashtable->lock);
current_list = lookup_string(hashtable, key,shardId);
sem_wait(&hashtable->lock);
/* item already exists, don't insert it again. */
if (current_list != NULL){
strcpy(new_list->prevValue,current_list->string);
strcpy(new_list->string,str);
strcpy(new_list->key,key);
new_list->next = hashtable->table[shardId][hashval];
hashtable->table[shardId][hashval] = new_list;
sem_post(&hashtable->lock);
return new_list->prevValue;
}
/* Insert into list */
strcpy(new_list->string,str);
strcpy(new_list->key,key);
new_list->next = hashtable->table[shardId][hashval];
hashtable->table[shardId][hashval] = new_list;
/* Unlock */
sem_post(&hashtable->lock);
return new_list->prevValue;
}
My main runs some tests that insert, read, and delete elements of the hash table. The problem is that with more than 4 partitions/shards the tests stop at the first read, reporting that the search function returned the wrong value, NULL. With fewer than 4 it runs perfectly well and passes all the tests.
You can see my main.c here if you want to take a look:
http://hostcode.sourceforge.net/view/1105
My complete Hash table code:
http://hostcode.sourceforge.net/view/1103
And other functions where hash table code is executed:
.c file http://hostcode.sourceforge.net/view/1104
.h file http://hostcode.sourceforge.net/view/1106
Thank you for your time. I appreciate any help you can give me; this is an important college project and I've been stuck here for 2 days.

Hi, I already solved this problem. I was doing a bad allocation in my initialization:
new_table->table = (list_t ***)calloc(1,sizeof(list_t **));
It should be like this:
new_table->table = (list_t ***)calloc(num_shards,sizeof(list_t **));
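For reference, a minimal sketch of the whole two-level allocation is below. Note that the per-shard line in the question, calloc(1,sizeof(list_t *)), looks like it has the same kind of problem: table[j] is indexed up to HT_SIZE - 1, so it needs HT_SIZE slots. alloc_shards() and free_shards() are hypothetical helpers, the semaphore is left out, and unlike the question's code the buckets start out as NULL rather than pointing at an empty placeholder node.

```c
#include <assert.h>
#include <stdlib.h>

#define HT_SIZE 10

typedef struct _list_t_ {
    char key[20];
    char string[20];
    char prevValue[20];
    struct _list_t_ *next;
} list_t;

/* alloc_shards: hypothetical helper; returns num_shards arrays of
 * HT_SIZE empty buckets each, or NULL on failure */
list_t ***alloc_shards(int num_shards)
{
    list_t ***table;
    int i, j;

    if (num_shards < 1)
        return NULL;
    table = malloc((size_t)num_shards * sizeof(*table));   /* one slot per shard */
    if (table == NULL)
        return NULL;
    for (j = 0; j < num_shards; j++) {
        table[j] = malloc(HT_SIZE * sizeof(*table[j]));    /* HT_SIZE buckets */
        if (table[j] == NULL) {
            while (j-- > 0)
                free(table[j]);     /* undo the shards allocated so far */
            free(table);
            return NULL;
        }
        for (i = 0; i < HT_SIZE; i++)
            table[j][i] = NULL;     /* empty chain */
    }
    return table;
}

/* free_shards: hypothetical helper; frees every chain, shard, and the table */
void free_shards(list_t ***table, int num_shards)
{
    int i, j;

    assert(table != NULL);
    for (j = 0; j < num_shards; j++) {
        for (i = 0; i < HT_SIZE; i++) {
            list_t *node = table[j][i];
            while (node != NULL) {
                list_t *next = node->next;
                free(node);
                node = next;
            }
        }
        free(table[j]);
    }
    free(table);
}
```

With NULL buckets, the for loop in lookup_string() works unchanged. This also assumes hash(key) always returns a value below HT_SIZE, which the posted code does not show.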
