Symbol table implementation using hash table in C - c

I am trying to implement a simple symbol table that stores strings in a hash table according to their hash values. The hash table in my program is an array of pointers to linked lists. There are 6 linked lists, one for each hash value.
The problem is that although the program runs, it replaces the old strings with the new string in each iteration.
My code is:
struct node{
char *string;
struct node *next;
};
struct node *hashtable[6];
int calchash(char *arr);
main()
{
char *line, a='n';
int val, i;
do{
printf("Enter string:\n");
scanf("%s", line);
struct node *current;
struct node *q= (struct node*)malloc(sizeof(struct node));
q->string = line;
q->next = NULL;
val= calchash(line);
if(hashtable[val] == NULL)
{
hashtable[val] = q;
current =q;}
else{
current->next = q;
current = q;
}
printf("Node created\n");
for(i=0; i<6; i++)
{ printf("Hash value %d :\n", i);
if(hashtable[i]==NULL)
{printf("No STRINGS!\n\n");}
else
{struct node *t = hashtable[i];
while(t != NULL)
{printf("%s \n", t->string);
t = t->next;}
printf("\n\n");
}
}
printf("CONTINUE(y/n):\n");
scanf(" %c", &a);
}while(a!='n');
}
int calchash(char *arr)
{int i=0, ascii;
int sum=0;
while(arr[i] != '\0')
{ascii = arr[i];
if(ascii>=48 && ascii<=57)
{
sum+= 2*ascii;}
else
{sum=sum+ ascii;}
i++;
}
return ((sum*17+5)%6);
}
And the output is:
Enter string:
az9
Node created
Hash value 0 :
No STRINGS!
Hash value 1 :
No STRINGS!
Hash value 2 :
az9
Hash value 3 :
No STRINGS!
Hash value 4 :
No STRINGS!
Hash value 5 :
No STRINGS!
CONTINUE(y/n):
y
Enter string:
Az9
Node created
Hash value 0 :
No STRINGS!
Hash value 1 :
No STRINGS!
Hash value 2 :
Az9
Hash value 3 :
No STRINGS!
Hash value 4 :
Az9
Hash value 5 :
No STRINGS!
CONTINUE(y/n):
n
Can someone please tell me what changes are needed so as to retain the previous az9 string under hash value 2???

if(hashtable[val] == NULL) {
hashtable[val] = q;
current =q;
} else {
current->next = q;
current = q;
}
should be replaced with:
q->next = hashtable[val];
hashtable[val] = q;
// no need for current
Also, scanf("%s", line) writes through the uninitialised pointer line, which is undefined behaviour, so anything might happen. Please allocate sufficient space first.

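How does this not crash immediately? line is never initialized, so scanf writes through an indeterminate pointer. (hashtable, being a file-scope array, does start out as all null pointers.)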
You will need to make a copy of the string to go into each hash node, probably with strdup. Currently, all of the nodes point to the same string buffer as line, so when you read a new string into line, all of the nodes will see it. This is why you must duplicate the string for each node. I wonder where the buffer ended up though, since you never initialized line...
Also, what is current? It is local to the loop, and appears unnecessary. You should just chain new nodes onto the head of the bucket, so you don't need to check if the bucket is empty.
The insert also does not check if the string is already present, so you will insert duplicates.
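The points above (a private copy of each string, chaining onto the head of the bucket, and a duplicate check) can be sketched roughly as follows. This is only an illustration under assumptions: table_insert and NBUCKETS are names made up here, calchash is the question's hash tidied up, and the copy is done with malloc plus strcpy rather than strdup so that it does not depend on POSIX.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 6              /* assumed: same bucket count as the question */

struct node {
    char *string;
    struct node *next;
};

/* File-scope array, so every bucket starts out as a null pointer. */
static struct node *hashtable[NBUCKETS];

/* The question's hash: digits count twice, everything else once. */
int calchash(const char *arr)
{
    int sum = 0;
    for (size_t i = 0; arr[i] != '\0'; i++) {
        int ascii = (unsigned char)arr[i];
        sum += (ascii >= '0' && ascii <= '9') ? 2 * ascii : ascii;
    }
    return (sum * 17 + 5) % NBUCKETS;
}

/* Hypothetical helper: returns 1 if inserted, 0 if the string was
 * already present, -1 if an allocation failed. */
int table_insert(const char *s)
{
    int val = calchash(s);

    /* Skip duplicates already stored in this bucket. */
    for (struct node *t = hashtable[val]; t != NULL; t = t->next)
        if (strcmp(t->string, s) == 0)
            return 0;

    struct node *q = malloc(sizeof *q);
    if (q == NULL)
        return -1;

    /* Each node owns its own copy, so reusing the input buffer is safe. */
    q->string = malloc(strlen(s) + 1);
    if (q->string == NULL) {
        free(q);
        return -1;
    }
    strcpy(q->string, s);

    /* Chain onto the head of the bucket; no empty-bucket special case. */
    q->next = hashtable[val];
    hashtable[val] = q;
    return 1;
}
```

In the read loop you would then declare something like char line[100], read with scanf("%99s", line), and call table_insert(line); the buffer can be reused on the next iteration because the nodes no longer point into it.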

Related

Same values with different pointers in a linked list?

Why does this code output the same name for all the nodes in the linked list?
Program output
Insert number of users :
4
Mike
John
Bob
Alice
Name : Alice # Pointer :0x874ae0
Name : Alice # Pointer :0x874b00
Name : Alice # Pointer :0x874b20
Name : Alice # Pointer :(nil)
The idea behind this code is to take x number of user names and create a linked list then loop over the linked list and print each name with the pointer for the next name.
typedef struct node
{
char *name;
struct node *next;
} node;
int main(void)
{
int x;
printf("Insert number of users :\n"); // capture int from user
scanf("%i", &x);
char str[LENGTH];
node *n = malloc(sizeof(node));
if (n == NULL)
return 1;
node *start = n; // pointer to the start of the linked list
// loop for n times to capture names
for (int i = 0; i < x; i++)
{
scanf("%s", str); // capture string
n->name = str;
// reached end of loop
if (i == x-1)
n->next = NULL;
else
n->next = malloc(sizeof(node));
n = n->next;
}
for (node *tmp = start; tmp != NULL; tmp = tmp->next)
{
printf("Name : %s # Pointer :%p\n", tmp->name, tmp->next);
}
return 0;
}
A simple script to take the names of people and insert them into a linked list.
In this statement within the for loop
n->name = str;
the data member name of all nodes is set to the address of the first character of the array str declared like
char str[LENGTH];
So all nodes will point to the same array, which after the for loop holds only the last string stored in it.
You need to dynamically create a copy of the string stored in the array for each node. Something like
#include <string.h>
//...
n->name = malloc( strlen( str ) + 1 );
strcpy( n->name, str );
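Put together, the copy-per-node idea might look like the sketch below. The helper names append_copy and free_list are assumptions made for this example and are not part of the original program; the node type is the one from the question.

```c
#include <stdlib.h>
#include <string.h>

typedef struct node
{
    char *name;
    struct node *next;
} node;

/* Hypothetical helper: creates a node holding a private copy of str and
 * stores it in *link (the head pointer or the previous node's next field).
 * Returns the new node, or NULL if an allocation failed. */
node *append_copy(node **link, const char *str)
{
    node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;

    n->name = malloc(strlen(str) + 1);   /* +1 for the terminating '\0' */
    if (n->name == NULL)
    {
        free(n);
        return NULL;
    }
    strcpy(n->name, str);

    n->next = NULL;
    *link = n;
    return n;
}

/* Frees every node together with the string copy it owns. */
void free_list(node *head)
{
    while (head != NULL)
    {
        node *next = head->next;
        free(head->name);
        free(head);
        head = next;
    }
}
```

The reading loop would keep a node **link that starts at &start and moves to &n->next after each successful append_copy(link, str), so the shared str buffer can be overwritten by the next scanf without affecting earlier nodes.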

Attempting to replace a word that consists of nodes in a linked list in C

I am attempting to find a list of nodes which form a word in a linked list. So it goes something like: I->a->n-> ->i->s-> ->a->w->e->s->o->m->e->NULL. The goal is to replace it with something like I->a->n-> ->i->s-> ->c->o->o->l->NULL. We want to do this irrespective of the size of the word being replaced or the word replacing it.
I've attempted to loop through by index, delete the word, and then replace it via index. However, this complicates things and I never truly get the word that I am seeking.
I am now simply trying to delete the word that is to be replaced and then replace its nodes with new nodes that form the new word.
I am now attempting to manipulate the array size to see if this allows me to put a word through.
void indexInsert(char character, int n){
node* temp1 =(node*)malloc(sizeof(struct node));
temp1->character = character;
temp1->nextNode = NULL;
if(n == 1){
temp1->nextNode = headNode;
headNode = temp1;
return;
}
node* temp2 = headNode;
for(int i = 0; i < n-2; i++){
temp2 = temp2->nextNode;
}
temp1->nextNode = temp2->nextNode;
temp2->nextNode = temp1;
}
void replaceWord(char replaceWord[]) {
deleteWord(&headNode, replaceWord);
int Size = 1;
int Size2 = 2;
char entryWord[Size];
char entryWordCopy[Size2];
printf("Please enter the new word you wish to insert: ");
strcpy_s(entryWordCopy, Size2,gets_s(entryWord, Size));
printf("\n");
int length = strlen(entryWordCopy);
indexInsert(entryWordCopy, length);
Print(head);
}
The end result should be the removal of the nodes that form word A, followed by their replacement with the nodes that form word B. However, upon executing the program, I run into problems with my array sizes and my strings are not evaluated. The error reported is: failure was caused by a read of a variable outside of its lifetime.
My tip:
Do not use indices with lists, you can do the same thing with pointers to the nodes. You do not have to iterate to the index, pointers to the nodes are faster.
You should change the algorithms to use pointers to nodes:
Finding the last word:
Iterate through the list. Whenever you find a node that holds a space, store a pointer to that node in a variable. When you reach the end of the list, the remembered pointer is the space before the last word, so you only have to change the nodes that follow it. If you never found a node containing a space, you can simply replace the whole list with the replacement.
Another trick is to use a double pointer: it stores the address of the pointer that points to the first node of the last word. (That pointer might also be the root of the list.)
// node** p is a pointer to the pointer of the first element
// if your root is defined as `node* root`, you use
// `... = last_word(&root);`
node** last_word(node** p) {
node* n = *p;
while(n) {
if(n->data == ' ') p = &n->next; // remember the slot after the latest space
n = n->next;
}
return p;
}
Inserting a node:
You got a pointer from an algorithm like last_word, which is a pointer to a pointer to a node. It points to the variable that stores the pointer to the next node (there is a Computerphile video about this). This fully handles insertion at the beginning, at the end and in the middle:
void insert(node** p, char c) {
node* elem = (node*)malloc(sizeof(node));
if(elem == NULL) return; // allocation failed, leave the list unchanged
elem->data = c;
elem->next = *p; // connect to following node
*p = elem; // connect previous node/root to the node
}
And if you really need to work with indices, you should always split the code up into separate functions.
node** node_by_index(node** p, int index) {
while(index > 0 && *p) { // stop at the end of the list instead of dereferencing NULL
p = &(*p)->next;
--index;
}
return p;
}
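To show how the pieces could fit together for the original problem, here is a rough sketch that replaces the last word of a list. It reuses last_word from above; free_from and replace_last_word are hypothetical helpers written for this example, and the node type with data and next fields is the one assumed by this answer, not the asker's exact struct.

```c
#include <stdlib.h>

typedef struct node {
    char data;
    struct node *next;
} node;

/* Returns the address of the pointer that leads to the last word
 * (the root pointer itself if the list contains no space). */
node **last_word(node **p)
{
    node *n = *p;
    while (n) {
        if (n->data == ' ')
            p = &n->next;
        n = n->next;
    }
    return p;
}

/* Hypothetical helper: frees every node reachable from *p and
 * leaves *p as NULL, cutting the list at that point. */
void free_from(node **p)
{
    node *n = *p;
    while (n) {
        node *next = n->next;
        free(n);
        n = next;
    }
    *p = NULL;
}

/* Hypothetical helper: drops the last word and appends the characters
 * of word in its place. Returns 0 on success, -1 if malloc fails. */
int replace_last_word(node **root, const char *word)
{
    node **p = last_word(root);
    free_from(p);

    for (; *word != '\0'; word++) {
        node *elem = malloc(sizeof *elem);
        if (elem == NULL)
            return -1;
        elem->data = *word;
        elem->next = NULL;
        *p = elem;           /* hook the new node into the current slot */
        p = &elem->next;     /* the next character goes after it */
    }
    return 0;
}
```

Because everything works on the address of a link rather than on an index, the same code covers a one-word list, an empty list, and a replacement that is longer or shorter than the word it replaces.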

How to keep track of all nodes in a linkedlist using only one variable?

The program will read values that are typed in by the user iteratively in a while loop. Every time the value is read, a new node will be created with this integer value. Now I'll make a LinkedList using these created nodes. And then print out every value stored in the LinkedList. I expect the order of reading in and the order of printing out are reversed. The following is my main function where readline is just a function that prints a message and reads an input, make_new_node is a function that takes in a value and makes a new node with this value, printlink is the function that prints out all the values saved in every node in the linked list, and freelink contains the free function. For now, I only get a printed-out result of the last integer typed in. With just one node variable n, how can I possibly print out all the value members of every node in the linked list?
int main()
{
char buf[100];
int num;
Node* n = NULL;
while(1) {
readline("The num to put in?: ", buf, sizeof(buf));
sscanf(buf, "%d", &num);
if(num == -1) {
n -> next = NULL;
printlink(n);
freelink(n);
break;
} else {
n = make_new_node(num);
n -> next = n;
}
}
return 0;
}
This is how you would generally approach this, but I guess it breaks your rule about using only a single node pointer? Is that a limitation specified in how you are allowed to solve the problem, or just a limitation you ran into based on your implementation?
int main()
{
char buf[100];
int num;
Node *head = NULL, *n;
while(1){
readline("The num to put in?: ", buf, sizeof(buf));
sscanf(buf, "%d", &num);
if (num == -1){
printlink(head);
freelink(head);
break;
} else{
n = make_new_node(num);
n -> next = head;
head = n;
}
}
return 0;
}
Store a reference to the first node. Since you never add any nodes before this one you could just have a method that takes the first node and iterates through your list via next.
Alternatively you could make a doubly linked list where you have a next AND a previous variable. Your method could then find the first node by going back via previous.
The most common way is to change the signature of make_new_node to
Node * make_new_node(int new_value, Node *current_head);
The code then becomes:
} else {
n = make_new_node(num, n);
}
A classical stack implementation is for example
Node * make_new_node(int new_value, Node *current_head) {
Node *n = malloc(sizeof *n);
n->value = new_value;
n-> next = current_head;
return n;
}
But in that case, this
if(num == -1) {
n -> next = NULL; // NO!
would immediately truncate the (stack) linked list to a single element, and all the other nodes would be leaked.
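For completeness, a self-contained sketch of the stack-style approach with a single list variable. The Node layout and the bodies of printlink and freelink are assumptions here, since the question does not show them.

```c
#include <stdio.h>
#include <stdlib.h>

/* Assumed layout; the question does not show the real Node type. */
typedef struct Node {
    int value;
    struct Node *next;
} Node;

/* Pushes a new node in front of current_head and returns the new head.
 * Returns NULL on allocation failure, so a careful caller checks the
 * result before overwriting its only pointer to the list. */
Node *make_new_node(int new_value, Node *current_head)
{
    Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = new_value;
    n->next = current_head;
    return n;
}

/* Prints the values from the head on, i.e. in reverse input order. */
void printlink(const Node *n)
{
    for (; n != NULL; n = n->next)
        printf("%d\n", n->value);
}

/* Releases every node in the list. */
void freelink(Node *n)
{
    while (n != NULL) {
        Node *next = n->next;
        free(n);
        n = next;
    }
}
```

With these, the loop body is just n = make_new_node(num, n); and the -1 branch calls printlink(n) and freelink(n) without touching n->next. The list is already terminated, because the very first node was created with current_head equal to NULL.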

Segfaults on commented lines :

I'm trying to solve a problem on Codechef. I've posted about this before but am now doing it completely differently.
http://www.codechef.com/problems/STEPUP#
The idea of the problem is to determine whether or not the desired situation arises for the given testcase.
A desired situation is when every vertex has a higher indirection than the vertices that are connected to it. Ie. if a->b, F(b) should be > F(a). If this isn't possible for the given setup, output is IMPOSSIBLE. If not, output the minimum value of F(x) for the vertex X with maximum indirection such that it holds for all other vertices.
I haven't tried to print the output for the possible cases yet.
INPUT FORMAT:
First line of input contains a number t, the number of test cases.
Each test case starts with two space-separated integers N and M, denoting the number of vertices and the number of edges in the graph respectively.
Each of the following M lines contains two space-separated integers a b denoting an edge from vertex a to vertex b.
There can be multiple edges between two vertices a and b.
For eg.,
2
2 2
1 2
2 1
3 2
1 2
1 3
OUTPUT should be:
IMPOSSIBLE
2
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
typedef struct Node{
int val;
struct Node* next;
};
int indirection[10001];//indirection[a] holds count. What vertex it holds count OF is given by list[a].val;
int main()
{
int testcases, num_vertices, num_edges, a,b,c,d,e;
scanf("%d", &testcases);
while(testcases--)
{
scanf("%d %d",&num_vertices, &num_edges);
struct Node *list[num_vertices];//array of pointers to node
int h;
struct Node * ptr;
for(h=1;h<=num_vertices;h++)
{
list[h]=(struct Node *)malloc(sizeof(struct Node));
list[h]->val=0;
}
memset(indirection,0,10001);
for(e=0;e<10001;e++)
printf("Indirection[e]=%d \n",indirection[e]);
a=1;
while(a<=num_edges)
{
printf("messge printing for the %dth time\n",a);
scanf("%d %d",&b,&c);
printf("Message recd %d \n",indirection[c]);
if(indirection[c]==0)
{
printf("entered case1\n");
list[a]->val=c;
printf("S\n");
//Segfaults here
list[a]->next->val=b;
printf("SS\n");
indirection[a]=1;
ptr=list[a]->next;
printf("SSS \n");
printf("case1\n");
}
else
{ printf("entered case2\n");
indirection[c]++;
//segfaults here if i comment out the previous one
ptr->next->val=b;
printf("case2\n");
ptr=ptr->next;
}
a++;
}
int tra,i;
struct Node *ptr1,*ptrnext;
for(i=1;i<=num_edges;i++)
{
ptr1=list[i];
ptrnext=list[i]->next;
{
if (indirection[ptr1->val]<indirection[ptrnext->val])
{ printf("IMPOSSIBLE");
break;
}
else
{
ptr1=ptrnext;
ptrnext=ptrnext->next;
}
}
}
free(list);
}
}
The 2 statements where I've mentioned a segfault in comments are just before the (I think) questionable statements. If I remove the first, segfault at the second. If I remove both, segfault ANYWAY.
Still trying to solve this problem so I can move forward with the next one. Thanks!
num_vertices is treated as if the array were 1-based rather than 0-based:
struct Node *list[num_vertices];//array of pointers to node
int h;
struct Node * ptr;
// for(h=1;h<=num_vertices;h++)
for(h=0;h<num_vertices;h++)
{
list[h]=(struct Node *)malloc(sizeof(struct Node));
list[h]->val=0;
}
next field is not initialized as answered by Daniel
{
list[h]=(struct Node *)malloc(sizeof(struct Node));
list[h]->val = 0;
list[h]->next = something_maybe_NULL();
}
Suggest simpler malloc() style
list[h] = malloc(sizeof *(list[h]));
Your code segfaults because you create an array of struct Node* and allocate memory for them, but you never set the next pointer of each Node. So each Node's next pointer is just pointing somewhere random in memory and segfaults when you try to access it.
I think your design is just wrong. If you are trying to make a linked list of nodes (as suggested by the presence of a next pointer), you don't need to create an array to hold the nodes at all.
I analyzed all your code and found several problems in it, these problems indicate mainly that you don't understand pointers
Arrays are 0-index based
/* if you declare struct Node *list[size];
* the index goes from 0 to size - 1
*/
for (h = 1 ; h <= num_vertices ; h++)
{
You never initialize node->next pointer
/* You should initialize the next node to null. */
list[h]->next = NULL;
Your memset is wrong, sizeof(int) != 1
/* memset(indirection, 0, 10001); wrong */
memset(indirection, 0, 10001 * sizeof(int));
You don't check for out-of-bounds access when indexing the indirection array
/* this is very unsafe, you don't check c */
printf("Message recd %d \n", indirection[c]);
You dereference node->next without checking for NULL
/* don't dereference list[a]->next without checking .
* list[a]->next->val (wrong)
*/
next = list[a]->next;
if (next != NULL)
next->val = b;
You call free on list, but it is an array, not a pointer returned by malloc, so you can't free it. You should, however, free its elements, since they point to valid malloc'ed memory
for (i = 0 ; i < num_vertices ; i++)
free(list[i]);
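The memset point is easy to check in isolation. A small sketch, assuming the usual case where int is wider than one byte; fill, clear_bytes_only and clear_all are helper names made up for this demonstration:

```c
#include <string.h>

#define COUNT 10001

static int indirection[COUNT];

/* Sets every element to v so the effect of memset is visible. */
void fill(int v)
{
    for (int i = 0; i < COUNT; i++)
        indirection[i] = v;
}

/* The original call: the third argument is a byte count, so this
 * clears only the first COUNT bytes, not COUNT ints. */
void clear_bytes_only(void)
{
    memset(indirection, 0, COUNT);
}

/* sizeof on the array itself gives its full size in bytes. */
void clear_all(void)
{
    memset(indirection, 0, sizeof indirection);
}
```

After fill(7) and clear_bytes_only(), the elements near the end of the array still hold 7; after clear_all() they are 0. Writing sizeof indirection only works while indirection is a real array in scope, not a pointer to one.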
Here is a version of your code with these issues fixed. I don't know if your algorithm works, but the code has at least 6 fewer errors.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
/* no need for typedef, since you declare as struct Node */
struct Node
{
int val;
struct Node* next;
};
int indirection[10001];//indirection[a] holds count. What vertex it holds count OF is given by list[a].val;
int main()
{
int testcases, num_vertices, num_edges, a, b, c;
printf("input testcase: ");
scanf("%d", &testcases);
while (testcases--)
{
printf("input testcase num_vertices and num_edges: ");
scanf("%d %d",&num_vertices, &num_edges);
int h;
struct Node *list[num_vertices]; // array of pointers to node
struct Node *ptr;
/* struct Node *list[size];
* the index goes from 0 to size - 1
*/
for (h = 0 ; h < num_vertices ; h++)
{
/* If this is plain C you don't need the cast (struct Node *) */
list[h] = malloc(sizeof(struct Node));
list[h]->val = 0;
/* You should initialize the next node to null. */
list[h]->next = NULL;
}
/* memset(indirection, 0, 10001); wrong */
memset(indirection, 0, 10001 * sizeof(int));
/* What, you dont believe all values are 0? */
/* for(e = 0 ; e < 10001 ; e++)
printf("Indirection[e] = %d\n",indirection[e]); */
/* arrays go from 0 to size - 1 */
a = 0;
while (a < num_edges)
{
printf("messge printing for the %dth time\n", a);
printf("input b and c: ");
scanf("%d %d", &b, &c);
if (c < 10001)
{
/* this is very unsafe, you don't check c */
printf("Message recd %d \n", indirection[c]);
if (indirection[c]==0)
{
struct Node *next;
printf("entered case1\n");
list[a]->val = c;
printf("S\n");
// Segfaults here
/* don't dereference list[a]->next without checking . */
next = list[a]->next;
if (next != NULL)
next->val = b;
printf("SS\n");
indirection[a] = 1;
ptr = list[a]->next;
printf("SSS \n");
printf("case1\n");
}
else
{
printf("entered case2\n");
indirection[c]++;
//segfaults here if i comment out the previous one
ptr->next->val=b;
printf("case2\n");
ptr=ptr->next;
}
a++;
}
}
int i;
struct Node *ptr1, *ptrnext;
for(i = 0 ; i < num_edges ; i++) /* arrays go from 0 to size - 1 */
{
ptr1 = list[i];
if (ptr1 != NULL)
ptrnext = ptr1->next;
if ((ptr1 != NULL) && (ptrnext != NULL))
{
if (indirection[ptr1->val] < indirection[ptrnext->val])
{
printf("IMPOSSIBLE");
break;
}
else
{
ptr1 = ptrnext;
ptrnext = ptrnext->next;
}
}
}
for (i = 0 ; i < num_vertices ; i++)
free(list[i]);
}
return 0;
}
In your code
list[a]->next->val=b;
list[a]->next may be NULL. IMO, it's better to put a NULL check before dereferencing.
Same goes for ptr->next in
ptr->next->val=b;
Nevertheless, you need to allocate memory for next before using it. Otherwise, it will point to some unknown memory location.
Also, why not start the loop from 0 in
for(h=1;h<=num_vertices;h++)
Sidenote: Please do not cast the return value of malloc().
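As an illustration of allocating next before using it, here is a minimal sketch. new_node, append_after and free_chain are hypothetical helpers, not part of the asker's program, and this says nothing about whether the overall algorithm is right for the problem.

```c
#include <stdlib.h>

struct Node {
    int val;
    struct Node *next;
};

/* Hypothetical helper: allocates a node with next already set to NULL. */
struct Node *new_node(int val)
{
    struct Node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->val = val;
    n->next = NULL;
    return n;
}

/* Hypothetical helper: links a freshly allocated node right after tail.
 * Returns the new node, or NULL if tail is NULL or malloc fails. */
struct Node *append_after(struct Node *tail, int val)
{
    if (tail == NULL)
        return NULL;

    struct Node *n = new_node(val);
    if (n == NULL)
        return NULL;

    n->next = tail->next;   /* keep anything already linked after tail */
    tail->next = n;
    return n;
}

/* Frees a whole chain starting at head. */
void free_chain(struct Node *head)
{
    while (head != NULL) {
        struct Node *next = head->next;
        free(head);
        head = next;
    }
}
```

Instead of list[a]->next->val = b; the code would then do something like ptr = append_after(list[a], b); and check ptr for NULL, and each list[i] would be released with free_chain rather than a single free.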

Inserting into hash table

I am trying to insert an integer into a hash table. To do this, I'm creating an array of node*'s and I'm trying to make assignments like listarray[i]->data=5 possible. However, I'm still very confused with pointers and I'm crashing at the line with the comment '//crashes here' and I don't understand why. Was my initialization in main() invalid?
#include <stdio.h>
#include <stdlib.h>
typedef struct node
{
int data;
struct node * next;
} node;
//------------------------------------------------------------------------------
void insert (node **listarray, int size)
{
node *temp;
int value = 11; //just some random value for now, eventually will be scanned in
int index = value % size; // 11 modulo 8 yields 3
printf ("index is %d\n", index); //prints 3 fine
if (listarray[index] == NULL)
{
printf("listarray[%d] is NULL",index); //prints because of loop in main
listarray[index]->data = value; //crashes here
printf("listarray[%d] is now %d",index,listarray[index]->data); //never prints
listarray[index]->next = NULL;
}
else
{
temp->next = listarray[index];
listarray[index] = temp;
listarray[index]->data = value;
}
}//end insert()
//------------------------------------------------------------------------------
int main()
{
int size = 8,i; //set default to 8
node * head=NULL; //head of the list
node **listarray = malloc (sizeof (node*) * size); //declare an array of Node *
//do i need double pointers here?
for (i = 0; i < size; i++) //malloc each array position
{
listarray[i] = malloc (sizeof (node) * size);
listarray[i] = NULL; //satisfies the first condition in insert();
}
insert(*&listarray,size);
}
output:
index is 3
listarray[3] is NULL
(crash)
desired output:
index is 3
listarray[3] is NULL
listarray[3] is now 11
There are various issues here:
If you have a hash table of a certain size, then the hash code must map to a value between 0 and size - 1. Your default size is 8, but your hash code is x % 13, which means that your index might be out of bounds.
Your insert function should also pass the item to insert (unless that's the parameter called size, in which case it is severely misnamed).
if (listarray[index] == NULL) {
listarray[index]->data = value; //crashes here
listarray[index]->next = NULL;
}
It's no wonder that it crashes: When the node is NULL, you cannot dereference it with either * or ->. You should allocate new memory here.
And you shouldn't allocate memory here:
for (i = 0; i < size; i++) //malloc each array position
{
listarray[i] = malloc (sizeof (node) * size);
listarray[i] = NULL; //satisfies the first condition in insert();
}
Allocating memory and then resetting it to NULL is nonsense. NULL is a special value that means that no memory is at the pointed-to location. Just set all nodes to NULL, which means that the hash table starts out without any nodes. Allocate when you need a node at a certain position.
In the else clause, you write:
else
{
temp->next = listarray[index];
listarray[index] = temp;
listarray[index]->data = value;
}
but temp hasn't been allocated, yet you dereference it. That's just as bad as dereferencing NULL.
Your hash table also needs a means to handle collisions. It looks as if at every index in the hash table, there is a linked list. That's a good way to deal with it, but you haven't implemented it properly.
You seem to have trouble understanding pointers. Perhaps you should start with a simpler data structure like a linked list, just to practice? When you have gotten a firm grasp of that, you can use what you've learned to implement your hash table.
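A rough sketch of what the advice above amounts to: start with every bucket set to NULL, allocate a node only when inserting, and chain it onto the front of the bucket's list. table_create and table_free are hypothetical helpers, and insert takes the value as a parameter here, which differs from the signature in the question.

```c
#include <stdlib.h>

typedef struct node
{
    int data;
    struct node *next;
} node;

/* Hypothetical helper: allocates the bucket array with every bucket empty. */
node **table_create(int size)
{
    node **listarray = malloc(sizeof *listarray * (size_t)size);
    if (listarray == NULL)
        return NULL;
    for (int i = 0; i < size; i++)
        listarray[i] = NULL;        /* no nodes yet */
    return listarray;
}

/* Inserts value at the front of its bucket's list.
 * Returns the bucket index, or -1 if malloc fails. */
int insert(node **listarray, int size, int value)
{
    int index = value % size;
    if (index < 0)                  /* % can be negative for negative values */
        index += size;

    node *n = malloc(sizeof *n);    /* allocate exactly when a node is needed */
    if (n == NULL)
        return -1;

    n->data = value;
    n->next = listarray[index];     /* works for empty and non-empty buckets */
    listarray[index] = n;
    return index;
}

/* Hypothetical helper: frees every chain and then the bucket array. */
void table_free(node **listarray, int size)
{
    for (int i = 0; i < size; i++)
    {
        node *n = listarray[i];
        while (n != NULL)
        {
            node *next = n->next;
            free(n);
            n = next;
        }
    }
    free(listarray);
}
```

Since the new node's next is simply whatever the bucket held before, the separate NULL and non-NULL branches from the question collapse into one path, and collisions are handled by the chain.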
