Adding words to a hash table in C - c

Before proceeding with PSET5 - SPELLER of the CS50 course, I have decided to practice with a made-up program that takes words from a file and sorts them into a Hash Table, but I think I`m doing something wrong with the Hash Function as I keep getting the following error:
array subscript is not an integer
table[hash] = n;
Some of the elements are taken from the task itself to understand how they work. I don`t have any previous knowledge, totally limited to the CS50 course.
Please have a look at my code and maybe give a few pointers to what I am doing wrong.
From what I understand - every new word`s first letter goes through Hash Functions and returns a number for the Bucket in which this word goes.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int hash(const char *buffer);
const unsigned int LENGTH = 9;
typedef struct node
{
char word[LENGTH + 1];
struct node* next;
}
node;
node *table[26] = {NULL};
int hash(const char *buffer)
{
return toupper(buffer[0]) - 'A';
}
int main(void)
{
FILE *file = fopen("words", "r");
if (file != NULL)
{
char buffer[LENGTH];
while (fscanf(file, "%s", buffer) != EOF)
{
node *n = malloc(sizeof(node));
if (n == NULL)
{
return 1;
}
strcpy(n->word, buffer);
n->next = NULL;
table[hash] = n;
}
fclose(file);
}
}

You need to call the function hash(..) , it is not a variable.
Your line should be:
table[ hash(n->word) ] = n;

Related

Error pops up at structs and double pointers in c

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CHARS 10
char **readText(void) ;
typedef struct text_t
{
char * * mt;
int words;
}
Text_t;
int main(void)
{
int i, words;
Text_t.mt = readText();
for (i=0; i<words; i++)
printf("%s\n", Text_t.mt[i]);
return 0;
}
char **readText(void)
{
*Text_t.mt = NULL;
char *word;
*words = 0;
while (scanf("%s", word=malloc(CHARS*sizeof(char))),
strcmp(word,"END"))
{
(*words) ++;
Text_t.mt = realloc(mytext, (*words)*sizeof(char *));
Text_t.mt[*words-1] = word;
}
free(word);
return Text_t.mt;
}
The goal of this programm is to take some words as input from the user untill the word "END" is entered. Then we print the given words with the same order as they were entered. The problem is when I try to run the programm, this error comes up:
main.c|16|error: expected identifier or '(' before '.' token|
You probably mean
Text_t tt;
tt.mt = NULL;
ie
create an instance ot the Text_t struct
set its 'mt' pointer to NULL
whether thats the correct thing for your main aim I cant say, but at least it will get you past your current compile error

How do you assign structs into an array?

I have currently made this much of the code:
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <ctype.h>
#define STRSIZE 21
struct PInven{
int count;
struct PItem{
char name[STRSIZE];
int amount;
}Pitem;
}Pinven;//this needs to be an output file
int ReadInProduce (){
//read in file and check to see if the file exist or not.
FILE * PinFile = fopen("produce.txt","r");
if (PinFile == NULL){
printf("ERROR: WRONG FILE");
}
else{
printf("I did it!!\n");
}
//assigning the value gotten into the struct variable(but need to maybe change this since it needs to be an output)
fscanf(PinFile,"%d",&Pinven.count);
printf("%d\n", Pinven.count);
int i;
for(i =0; i <Pinven.count; i++){
fscanf(PinFile,"%20s %d",Pinven.Pitem.name, &Pinven.Pitem.amount);
printf("%s %d\n",Pinven.Pitem.name, Pinven.Pitem.amount);
}
//making an array to hold the variables
//FILE * PoutFile = fopen("produce_update.txt","w");
fclose(PinFile);
return 0;
}
From there I want to get the file that is read to the structs to be printed out into an array so that later on I can make a function that will be able to compare to the to it.
Basically a store management system. Where the file of the inventory is read in and compared to the file that is store and return a new value for the amount of produce now either left or gained.
10 //number of items that will be stored in the store
apple 19
banana 31
broccoli 9
...
In general, it's a really bad idea to include header information in the file about the number of entries in the file. You want to be able to do stream processing, and that will be more difficult if you need that meta-data. More importantly, it is important to understand how to write the code so that you don't need it. It's not really that difficult, but for some reason people avoid it. One simple approach is just to grow the array for each entry. This is horribly inefficient, but for the sake of simplicity, here's an example that expects the file not not include that first line:
#include <stdio.h>
#include <stdlib.h>
#include <stdbool.h>
#include <ctype.h>
#include <limits.h>
#define STRSIZE 128
struct PItem{
char name[STRSIZE];
int amount;
};
struct PInven{
int count;
struct PItem *PItem;
};
static void
grow(struct PInven *p)
{
p->PItem = realloc(p->PItem, ++p->count * sizeof *p->PItem);
if( p->PItem == NULL ){
perror("out of memory");
exit(1);
}
}
int
ReadInProduce(struct PInven *P, const char *path)
{
FILE * PinFile = fopen(path, "r");
if( PinFile == NULL ){
perror(path);
exit(1);
}
char fmt[64];
int max_len;
max_len = snprintf(fmt, 0, "%d", INT_MAX);
snprintf(fmt, sizeof fmt, "%%%ds %%%dd", STRSIZE - 1, max_len - 1);
grow(P);
struct PItem *i = P->PItem;
while( fscanf(PinFile, fmt, i->name, &i->amount) == 2 ){
i += 1;
grow(P);
}
P->count -= 1;
fclose(PinFile); /* Should check for error here! */
return P->count;
}
int
main(int argc, char **argv)
{
struct PInven P = {0};
char *input = argc > 1 ? argv[1] : "produce.txt";
ReadInProduce(&P, input);
struct PItem *t = P.PItem;
for( int i = 0; i < P.count; i++, t++ ){
printf("%10d: %s\n", t->amount, t->name);
}
}
As an exercise for the reader, you should add some error handling. At the moment, this code simply stops reading the input file if there is bad input. Also, it would be a useful exercise to do fewer reallocations.
you should change Structure of PInven to it can save a dynamic array of Pitem with a Pitem pointer.
tested :
#include <ctype.h>
#include <stdbool.h>
#include <stdio.h>
#include <stdlib.h>
#define STRSIZE 21
typedef struct {
char name[STRSIZE];
int amount;
} Pitem;
struct PInven {
int count;
Pitem *pitem;
} Pinven; // this needs to be an output file
int main() {
// read in file and check to see if the file exist or not.
FILE *PinFile = fopen("produce.txt", "r");
if (PinFile == NULL) {
printf("ERROR: WRONG FILE");
} else {
printf("I did it!!\n");
}
// assigning the value gotten into the struct variable(but need to maybe
// change this since it needs to be an output)
fscanf(PinFile, "%d", &Pinven.count);
Pinven.pitem = (Pitem *)malloc(sizeof(Pitem) * Pinven.count);
printf("%d\n", Pinven.count);
int i;
for (i = 0; i < Pinven.count; i++) {
fscanf(PinFile, "%20s %d", Pinven.pitem[i].name,
&Pinven.pitem[i].amount);
// printf("%s %d\n",Pinven.pitem[i].name, Pinven.pitem[i].amount);
}
for (i = 0; i < Pinven.count; i++) {
printf("%s %d\n", Pinven.pitem[i].name, Pinven.pitem[i].amount);
}
// making an array to hold the variables
// FILE * PoutFile = fopen("produce_update.txt","w");
fclose(PinFile);
// remember free
free(Pinven.pitem);
return 0;
}

Pointer of strings stored in linked list

I want to store words from a pointer of char strings in a double linked list. My function for storing the words in the char strings works perfect, but when it comes to storing in the dll elements it doesn't work anymore. I can't understand if there is a problem in the declarative zone of the list (I am new to lists, we just did some theory on them in the class) or with the node changing pointer.
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <string.h>
int number_of_words (FILE *f) {
char x[1024];
int i=0;
while (fscanf(f, " %1023s", x) == 1) {
i++;
}
return i;
}
void words (FILE *f, char *words[]) {
char x[1024];
int i=0;
while (fscanf(f, " %1023s", x) == 1) {
words[i]=strdup(x);
i++;
}
}
typedef struct node{
int freq;
char *word_string;
struct node *next;
struct node *prev;
}node;
int main(int argc, const char * argv[]) {
FILE *input=fopen(argv[1], "r+");
if(input==NULL) printf("error in reading from file");
else printf("reading works.\n");
int k=number_of_words(input);
char *word[k];
char *word_unique[k];
rewind(input);
words(input, word);
int j=0,l=0,s=0;
for(j=0;j<k;j++) {
for (l=0; l<j; l++){
if (strcmp(word[j],word[l])==0)
break;
}
if (j==l){
word_unique[s]=word[j];
s++;
}
}
int *word_freq[s];
for(j=0;j<s;j++){
word_freq[j]=0;
}
for(j=0;j<s;j++) {
for (l=j; l<k; l++){
if (strcmp(word_unique[j],word[l])==0)
word_freq[j]++;
}
}
char *aux=malloc(30*sizeof(char));
for(j=0;j<s;j++){
for(l=j+1;l<s-1;l++){
if(strcasecmp(word_unique[j], word_unique[l])>0)
{
strcpy(aux,word_unique[j]);
strcpy(word_unique[j],word_unique[l]);
strcpy(word_unique[l],aux);
}
}
}
node *head, *curr=NULL;
int i=0;
head=NULL;
for(i=0;i<k;i++){
curr=(node *)malloc(sizeof(node));
curr->word_string=word_unique[i];
curr->freq=word_freq[i];
curr->next=head;
head=curr;
}
while(curr) {
if(curr->word_string!=NULL) printf("%s:%d\n", curr->word_string, curr->freq);
curr = curr->next;
}
return 0;
}
The input file is a text file and it looks like this:
Everything LaTeX numbers for you has a counter associated with it. The name of the counter
is the same as the name of the environment or command that produces the number, except
with no. Below is a list of some of the counters used in LaTeX’s standard document styles
to control numbering.
When I tried to print the unique elements in alphabetical order with their frequency, it actually prints out in reverse order with 4x frequency they actually have. It also separates "numbering." from the others + a new line at the beginning which I don't know where it comes from. This is what it prints:
reading works.
0- :2098416
numbering.:4
you:4
with:4
used:4
to:4
the:4
The:4
that:4
styles:4
standard:4
some:4
same:4
produces:4
or:4
of:4
numbers:4
number,:4
no:4
name:4
list:4
LaTeX’s:4
LaTeX:4
it.:4
is:4
in:8
has:24
for:16
except:8
Everything:4
environment:4
document:8
counters:4
counter:8
control:8
command:4
Below:4
associated:4
as:4
a:4
\.:4
Program ended with exit code: 0

Hash Table in C (find the frequency of every word)

I want to create a hash table for an exercise I have to send in my University.
The program will open a number of files, break each file's content to <<words>> (tokens) and it will save each <<word>> in a hash table with the frequency of each <<word>>.
In case the word is already in the hash table , the program will increase the word's frequency.
At the end the program will print the words and it's frequencies accordingly.
Also the frequencies should be printed from the highest word frequency to the lowest.
The comparison of the <<words>> will ignore upper and lower case letters.
For example if a file contains : one two three four Two Three Four THREE FOUR FoUr
It should print:
four 4
three 3
two 2
one 1
The professor gave us a template that we should complete but I'm really confused on what to do with the insert_ht() and clear_ht() functions as well as the compare one.
Here is the code :
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#define HTABLE_SIZ 1001
#define MAX_LINE_SIZ 1024
/* Hash Table */
typedef struct node* link;
struct node { char *token; int freq; link next; };
link htable[HTABLE_SIZ] = { NULL }; /* Table of lists (#buckets) */
int size = 0; /* Size (number of elements) of hash table */
unsigned int hash (char *tok );
void insert_ht (char *data);
void clear_ht ( );
void print_ht ( );
void Process(FILE *fp);
int main(int argc, char *argv[])
{
int i;
FILE *fp;
for (i=1; i < argc; i++)
{
fp = fopen(argv[i],"r");
if (NULL == fp)
{
fprintf(stderr,"Problem opening file: %s\n",argv[i]);
continue;
}
Process(fp);
fclose(fp);
}
print_ht();
clear_ht();
return 0;
}
void Process(FILE *fp)
{
const char *seperators = " ?!'\";,.:+-*&%(){}[]<>\\\t\n";
char line[MAX_LINE_SIZ];
char *s;
while((fgets(line,MAX_LINE_SIZ, fp)) != NULL)
{
for (s=strtok(line,seperators); s; s=strtok(NULL,seperators))
insert_ht(s);
}
}
/* Hash Function */
unsigned int hash(char *tok)
{
unsigned int hv = 0;
while (*tok)
hv = (hv << 4) | toupper(*tok++);
return hv % HTABLE_SIZ;
}
void insert_ht(char *token)
{
……………………………………………
}
void clear_ht()
{
……………………………………………
}
int compare(const void *elem1, const void *elem2)
{
……………………………………………
}
void print_ht()
{
int i, j=0;
link l, *vector = (link*) malloc(sizeof(link)*size);
for (i=0; i < HTABLE_SIZ; i++)
for (l=htable[i]; l; l=l->next)
vector[j++] = l;
qsort(vector,size,sizeof(link),compare);
for (i=0; i < size; i++)
printf("%-50s\t%7d\n",vector[i]->token,vector[i]->freq);
free(vector);
}
I'll answer you in a new post because it's hard to be exhaustive in comments.
1. Malloc
Why would I need to use malloc then ? Shouldn't i write directly to the htable? (on the insert_ht() funtion)
You need to use malloc because you declare a char pointer in struct (char *token). The thing is that you never initialize the pointer to anything, and as far you don't know the size of the token, you need to malloc every token. But, as you use strdup(token), you don't need to malloc token because strdup does. So don't forget to free every token in order to avoid memory leaks.
2. Segfault
I can't test you code, but it seems like the following line causes the segmentation fault :
list = htable[hashval]->token
Indeed, you try to access token while htable[hashval] is NULL, and to assign a char * to a link type (list).
You need to loop with this :
for(list = htable[hashval]; list != NULL; list = list->next) { ... }
3. Notes
if (x=1) should be if(x==1).
Don't malloc new_list if you don't need to.
Because new_list if used when htable[hashval] is NULL, new_list->next = htable[hashval]; will set new_list->next to NULL.
You should use the -Wall option in gcc (for warnings) and you may use valgrind to understand your segmentation faults. In this case, use gcc with debug mode (-g).
Double and Final edit : Ι found the solution. Apparently for some reason my compare function was wrong.
I still haven't figured out why but here is the correct one, hopefully someone else will find this post helpful!
int compare(const void *elem1, const void *elem2)
{
return (*(link*)elem2)->freq - (*(link*)elem1)->freq;
}
Edit: deleted old answer . Found the correct way I think but I have another problem right now.
The compare function doesn't work correctly. My printf is fine but it doesnt sort them with the frequiencies. I want them to be sorted from the highest to lowest .
In this example: the file contains -> one two three four Two Three Four THREE FOUR FoUr
And I get:
two 2
one 1
four 4
three 3
While I should be getting :
four 4
three 3
two 2
one 1
Here is the code. Feel free to help!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#define HTABLE_SIZ 1001
#define MAX_LINE_SIZ 1024
/* Hash Table */
typedef struct node* link;
struct node { char *token; int freq; link next; };
link htable[HTABLE_SIZ] = { NULL }; /* Table of lists (#buckets) */
int size = 0; /* Size (number of elements) of hash table */
unsigned int hash (char *tok );
void insert_ht (char *data);
void clear_ht ( );
void print_ht ( );
void Process(FILE *fp);
int main(int argc, char *argv[])
{
int i;
FILE *fp;
printf("prin tin for \n");
for (i=1; i < argc; i++)
{
printf("prin tin fopen \n");
fp = fopen(argv[i],"r");
if (NULL == fp)
{
fprintf(stderr,"Problem opening file: %s\n",argv[i]);
continue;
}
printf("prin tin process \n");
Process(fp);
fclose(fp);
}
print_ht();
//clear_ht();
return 0;
}
void Process(FILE *fp)
{
const char *seperators = " ?!'\";,.:+-*&%(){}[]<>\\\t\n";
char line[MAX_LINE_SIZ];
char *s;
while((fgets(line,MAX_LINE_SIZ, fp)) != NULL)
{
for (s=strtok(line,seperators); s; s=strtok(NULL,seperators)){
printf("prin tin insert %s \n",s);
insert_ht(s);
}
}
}
/* Hash Function */
unsigned int hash(char *tok)
{
printf("bike stin hash \n");
unsigned int hv = 0;
while (*tok)
hv = (hv << 4) | toupper(*tok++);
printf("VGAINEIIIIIIIIIIIIII %d \n",hv);
return hv % HTABLE_SIZ;
}
void insert_ht(char *token)
{
printf("bike stin insert %s \n",token);
unsigned int hashval = hash(token);
if (htable[hashval]==NULL){
printf("mesa stin prwti if %u %s \n",hashval,token);
//token = strdup(token);
htable[hashval] = malloc(sizeof(token));
htable[hashval]->token = token ;
htable[hashval]->freq = 1;
size++;
}else {
htable[hashval]->freq++;
}
printf("ta evale epitixws \n");
}
int compare(const void *elem1, const void *elem2)
{
const struct node *p1 = elem1;
const struct node *p2 = elem2;
if ( p1->freq < p2->freq)
return -1;
else if (p1->freq > p2->freq)
return 1;
else
return 0;
}
void print_ht()
{
int i, j=0;
link l, *vector = (link*) malloc(sizeof(link)*size);
for (i=0; i < HTABLE_SIZ; i++)
for (l=htable[i]; l; l=l->next)
vector[j++] = l;
qsort(vector,size,sizeof(link),compare);
for (i=0; i < size; i++)
printf("%-50s\t%7d\n",vector[i]->token,vector[i]->freq);
free(vector);
}
Sorry for my bad english.
I think that :
insert(char *token) takes a word of the file and puts into the hash table. In brief, if the word exists in the hash table, you just have to increment its frequencie. Otherwise, you need to create another node and put the frequencie to 1, then ad it to the array. At the end, you will have one entry for each unique word.
compare(const void *elem1, const void *elem2) will be used by qsort. It returns 0 if elem1 = elem2, a negative number if elem1 < elem2 and a number > 0 if elem1 > elem2. By passing compare to qsort, you allow qsort to sort you array according to your own criteria.
clear_ht() may set all the values of the array to NULL, in order to restart another count ?

Read file .txt and save it to struct in c

I want to ask about file processing and struct in C language, I get an assignment from my lecture, and am so really confused about string manipulation in C programming. Here is the task.
get data from mhs.txt
store in struct
sort by name ascending
Here is the mhs.txt
1701289436#ANDI#1982
1701317124#WILSON#1972
1701331734#CHRISTOPHER STANLEY#1963
1701331652#SHINVANNI THEODORE#1962
1701331141#MUHAMMAD IMDAAD ZAKARIA#1953
1701331564#MARCELLO GENESIS DRIET J.#1942
1701322282#ANANDA AULIA#1972
1701329175#LORIS TUJIBA SOEJONOPOETRO#1983
1701301422#DEWI JULITA#1993
1701332610#HARRY HUTALIANG#1982
first before # is NIM,
after first # is name
and the last after #, is year
and here is what i've done
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct student{
char nim[11];
char name[50];
int year;
}s[10];
int main(){
FILE *fp;
int c,i,n;
printf("Read mhs.txt...");
getchar();
fp = fopen("mhs.txt", "r");
c = getc(fp);
i = 0;
while(c!=EOF){
printf("%c", c);
c = getc(fp);
i++;
}
fclose(fp);
getchar();
return 0;
}
First thing, I could save data on struct, but in here I very confused to separate a string.
That's all I know about struct and file processing, is there anyone who can help me? I have traveled around the internet and could not find the correct results.
Sorry if there are duplicate questions, and sorry if my english is too bad.
This is pure C code, you should new three import function: strtok & qsort & fsan.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
struct student
{
char nim[11];
char name[50];
int year;
};
#define BUFFER_SIZE 100
struct student saveToStruct (char* str)
{
struct student res;
int flag = 0;
char *token = strtok(str, "#");
while( token != NULL )
{
if (0 == flag)
strcpy(res.nim, token);
else if (1 == flag)
strcpy(res.name, token);
else
res.year = atoi(token);
flag++;
token = strtok( NULL, "#" );
}
return res;
}
void print(struct student* arr, int size)
{
for (int i = 0; i < size; i++)
{
printf("%s, %s, %d\n", arr[i].nim, arr[i].name, arr[i].year);
}
}
int cmp(const void* l, const void* r)
{
return strcmp(((const student*)l)->name, ((const student*)r)->name);
}
int main()
{
struct student arr[10];
FILE* file = fopen("mhs.txt", "r");
if (!file)
return -1;
char buffer[BUFFER_SIZE];
int flag = 0;
while (fgets(buffer, BUFFER_SIZE, file))
{
arr[flag] = saveToStruct(buffer);
flag++;
}
print(arr, 10);
qsort(arr, 10, sizeof(struct student), cmp);
printf("After sort by name!\n");
print(arr, 10);
return 0;
}
Since you've tagged this as C++, I'd use C++:
#include <iostream>
#include <string>
#include <algorithm>
struct student {
std::string nim;
std::string name;
int year;
bool operator<(student const &other) {
return name < other.name;
}
friend std::istream &operator>>(std::istream &is, student &s) {
std::getline(is, s.nim, '#');
std::getline(is, s.name, '#');
return is >> s.year;
}
};
int main() {
std::ifstream in("mhs.txt");
std::vector<student> students{
std::istream_iterator<student>(in),
std::istream_iterator<student>()
};
std::sort(students.begin(), students.end());
}
If you want to accomplish roughly the same thing in C, it's probably easiest to do the reading with fscanf using a scanset conversion, like:
fscanf(infile, "%10[^#]#%49[^#]#%d", student.nim, student.name, &student.year);
The scanset conversion gives you something like a subset of regular expressions, so the %[^#] converts a string of characters up to (but not including) a #. In this case, I've limited the length of each to one less than the length you gave for the arrays in your struct definition to prevent buffer overruns.
Then you can do the sorting with qsort. You'll need to write a comparison function, and doing that correctly isn't always obvious though:
int cmp(void const *aa, void const *bb) {
student const *a = aa;
student const *b = bb;
return strcmp(a->name, b->name);
}
Here are some hints, not the full answer. Hope it could help you.
first you need to read the file line by line, instead of character by character. You need the function of fgets(). you may find the reference from www.cplusplus.com/reference/cstdio/fgets/
second you can use strtok() to seperate strings. here is an example.
char str[] = "now # is the time for all # good men to come to the # aid of their country";
char delims[] = "#";
char *result = NULL;
result = strtok( str, delims );
while( result != NULL ) {
printf( "result is \"%s\"\n", result );
result = strtok( NULL, delims );
}
and you may find the reference to strtok() from http://www.cplusplus.com/reference/cstring/strtok/
third, use qsort() to sort the structure array. you may find the reference of it from http://www.cplusplus.com/reference/cstdlib/qsort/. examples can also be found there.

Resources