I can't seem to find where the problem is in my code. I'm essentially storing character patterns: if a pattern is already present, I store it in the duplicate (doublon) node; otherwise it goes to the regular pattern node. The problem starts when I have a duplicate: it should move to the duplicate node, but instead it loops infinitely.
The binary tree struct:
struct arbre{
char id[10];
int length;
int token;
int count;
struct arbre *pattern;
struct arbre *doublon;
};
typedef struct arbre *Arbre;
The function that creates and adds new nodes:
void ajouter(Arbre *a, char *tablettre, int length, int token){
if(*a==NULL){
*a=(Arbre)malloc(sizeof(struct arbre));
strcpy((*a)->id, tablettre);
//append((*a)->id,tablettre);
//printf("%s",(*a)->id);
(*a)->length = length;
(*a)->token = token;
(*a)->doublon=NULL;
(*a)->pattern=NULL;
}
if (strcmp((*a)->id, tablettre) == 0){ /// The problem is here
printf("%s",(*a)->doublon->id);
printf(" and %s",tablettre);
ajouter(&(*a)->doublon, tablettre, length, token);
}
if (strcmp((*a)->id, tablettre) != 0){
ajouter(&(*a)->pattern, tablettre, length, token);
}
}
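A possible cause, assuming the snippet above is complete: after a new node is created, execution falls through to the first strcmp test, which is trivially true for the node that was just filled in. The printf then dereferences a doublon that is still NULL, and the recursive call repeats the same thing one level down. A minimal sketch of one way out is to return right after creating the node and to use if/else for the two branches. The count initialisation, the bounded copy and the const qualifier are my additions, not part of the original code.

```c
#include <stdlib.h>
#include <string.h>

struct arbre {
    char id[10];
    int length;
    int token;
    int count;
    struct arbre *pattern;
    struct arbre *doublon;
};
typedef struct arbre *Arbre;

void ajouter(Arbre *a, const char *tablettre, int length, int token) {
    if (*a == NULL) {
        *a = malloc(sizeof **a);
        if (*a == NULL) return;  /* allocation failed, nothing inserted */
        /* bounded copy so a long pattern cannot overflow id[10] */
        strncpy((*a)->id, tablettre, sizeof((*a)->id) - 1);
        (*a)->id[sizeof((*a)->id) - 1] = '\0';
        (*a)->length = length;
        (*a)->token = token;
        (*a)->count = 1;         /* assumption: a fresh node counts once */
        (*a)->doublon = NULL;
        (*a)->pattern = NULL;
        return;  /* stop here: the new node must not be compared with itself */
    }
    if (strcmp((*a)->id, tablettre) == 0)
        ajouter(&(*a)->doublon, tablettre, length, token);
    else
        ajouter(&(*a)->pattern, tablettre, length, token);
}
```

With this version, inserting "abc" twice puts the second copy in the doublon child of the first, and a different pattern goes to the pattern child.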
Before proceeding with PSET5 - SPELLER of the CS50 course, I decided to practice with a made-up program that takes words from a file and sorts them into a hash table, but I think I'm doing something wrong with the hash function, as I keep getting the following error:
array subscript is not an integer
table[hash] = n;
Some of the elements are taken from the task itself to understand how they work. I don't have any previous knowledge beyond the CS50 course.
Please have a look at my code and maybe give me a few pointers on what I am doing wrong.
From what I understand, the first letter of every new word goes through the hash function, which returns the number of the bucket the word goes into.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
int hash(const char *buffer);
const unsigned int LENGTH = 9;
typedef struct node
{
char word[LENGTH + 1];
struct node* next;
}
node;
node *table[26] = {NULL};
int hash(const char *buffer)
{
return toupper(buffer[0]) - 'A';
}
int main(void)
{
FILE *file = fopen("words", "r");
if (file != NULL)
{
char buffer[LENGTH];
while (fscanf(file, "%s", buffer) != EOF)
{
node *n = malloc(sizeof(node));
if (n == NULL)
{
return 1;
}
strcpy(n->word, buffer);
n->next = NULL;
table[hash] = n;
}
fclose(file);
}
}
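You need to call the function hash(...); it is not a variable. Written without parentheses, hash is a pointer to the function, which is why the compiler complains that the subscript is not an integer.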
Your line should be:
table[ hash(n->word) ] = n;
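One further point, offered as a suggestion rather than as part of the fix above: a plain assignment to table[...] replaces whatever was already in that bucket, so two words starting with the same letter would overwrite each other. A minimal sketch of inserting at the head of the bucket's list follows. It assumes words start with an ASCII letter; the function name insert_word is hypothetical, and LENGTH is turned into a macro here so that the array size is a constant expression.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define LENGTH 9

typedef struct node {
    char word[LENGTH + 1];
    struct node *next;
} node;

node *table[26] = {NULL};

/* Bucket index from the first letter; assumes an ASCII letter. */
int hash(const char *buffer) {
    return toupper((unsigned char)buffer[0]) - 'A';
}

/* Hypothetical helper: prepend a copy of word to its bucket's list.
   Returns 0 on success, 1 on failure. */
int insert_word(const char *word) {
    int h = hash(word);
    if (h < 0 || h >= 26) return 1;   /* not a letter: no bucket */
    node *n = malloc(sizeof *n);
    if (n == NULL) return 1;
    strncpy(n->word, word, LENGTH);   /* truncate words longer than LENGTH */
    n->word[LENGTH] = '\0';
    n->next = table[h];               /* keep the existing chain */
    table[h] = n;
    return 0;
}
```

In the reading loop, the buffer would also need room for the terminator, for example char buffer[LENGTH + 1] together with a "%9s" conversion, so that a long word cannot overflow it.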
I have the following code:
if (strcmp(method, "print") == 0){
for (i = 0; i < hashtable->M; i++){
fprintf(fout,"[%d] :", i);
if (hashtable->v[i] != NULL)
while(hashtable->v[i]){
fprintf(fout," ");
((Pair*)hashtable->v[i]->info)->fp(hashtable->v[i], fout);
hashtable->v[i] = hashtable->v[i]->urm;
}
fprintf(fout,"\n");
}
}
In my code, when I encounter a "print" command in the file, I have to print out the values stored in my hashtable. M is the size of the hashtable, while v is an array of pointers to linked lists. It all works, but if I add some values to the hashtable and then print them, I lose all the previously stored values after printing. I think this has to do with the fact that I'm doing this: hashtable->v[i] = hashtable->v[i]->urm; but even so, I don't know how to recover the values. I was thinking about using an auxiliary pointer to the hashtable, but that failed for me. Note that the structures are as follows:
typedef struct celulag {
void *info;
struct celulag *urm;
} TCelulaG, *TLG, **ALG;
typedef unsigned int (*TFHash)(const void*, size_t M, size_t range);
/* structura tabela Hash */
typedef struct {
size_t M;
TFHash fh;
TLG *v;
} TH;
typedef void (*TPrint)(TLG l, FILE *g);
typedef struct {
void *key;
void *value;
TPrint fp;
int frequency;
} Pair;
Note that this is done in a while loop, so the sequence would look like this:
Add 3 values, print, add another 4 values, print. After adding the first 3 values, if I print, it's all good; after adding the next 4 values, if I print, only the last 4 values appear.
Can anyone help? Thank you very much.
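The values disappear because the loop advances hashtable->v[i] itself, so after printing, every bucket head is NULL and the lists are unreachable. A minimal sketch of walking each list through a local cursor instead is shown below. The function names print_table and print_int, the int payload, and the trimmed TH struct (without fh) are assumptions made only to keep the example self-contained; the real code would call the fp stored in each Pair.

```c
#include <stdio.h>
#include <stddef.h>

typedef struct celulag {
    void *info;
    struct celulag *urm;
} TCelulaG, *TLG;

/* Trimmed version of the table struct: only the fields used here. */
typedef struct {
    size_t M;
    TLG *v;
} TH;

typedef void (*TPrint)(TLG l, FILE *g);

/* Hypothetical printer for cells whose info points to an int. */
void print_int(TLG l, FILE *g) {
    fprintf(g, "%d", *(int *)l->info);
}

/* Prints every bucket without modifying the table: only the local
   cursor p moves, the bucket heads in h->v stay where they are. */
void print_table(const TH *h, TPrint fp, FILE *fout) {
    for (size_t i = 0; i < h->M; i++) {
        fprintf(fout, "[%zu] :", i);
        for (TLG p = h->v[i]; p != NULL; p = p->urm) {
            fprintf(fout, " ");
            fp(p, fout);
        }
        fprintf(fout, "\n");
    }
}
```

Because nothing in the table is reassigned, calling print_table twice in a row prints the same content both times.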
I want to store words from an array of char strings in a doubly linked list. My function for storing the words in the char strings works perfectly, but storing them in the list elements doesn't work. I can't tell whether the problem is in the declaration of the list (I am new to lists; we have only covered some theory on them in class) or in the way I update the node pointers.
#include <stdio.h>
#include <stdlib.h>
#include <strings.h>
#include <string.h>
int number_of_words (FILE *f) {
char x[1024];
int i=0;
while (fscanf(f, " %1023s", x) == 1) {
i++;
}
return i;
}
void words (FILE *f, char *words[]) {
char x[1024];
int i=0;
while (fscanf(f, " %1023s", x) == 1) {
words[i]=strdup(x);
i++;
}
}
typedef struct node{
int freq;
char *word_string;
struct node *next;
struct node *prev;
}node;
int main(int argc, const char * argv[]) {
FILE *input=fopen(argv[1], "r+");
if(input==NULL) printf("error in reading from file");
else printf("reading works.\n");
int k=number_of_words(input);
char *word[k];
char *word_unique[k];
rewind(input);
words(input, word);
int j=0,l=0,s=0;
for(j=0;j<k;j++) {
for (l=0; l<j; l++){
if (strcmp(word[j],word[l])==0)
break;
}
if (j==l){
word_unique[s]=word[j];
s++;
}
}
int *word_freq[s];
for(j=0;j<s;j++){
word_freq[j]=0;
}
for(j=0;j<s;j++) {
for (l=j; l<k; l++){
if (strcmp(word_unique[j],word[l])==0)
word_freq[j]++;
}
}
char *aux=malloc(30*sizeof(char));
for(j=0;j<s;j++){
for(l=j+1;l<s-1;l++){
if(strcasecmp(word_unique[j], word_unique[l])>0)
{
strcpy(aux,word_unique[j]);
strcpy(word_unique[j],word_unique[l]);
strcpy(word_unique[l],aux);
}
}
}
node *head, *curr=NULL;
int i=0;
head=NULL;
for(i=0;i<k;i++){
curr=(node *)malloc(sizeof(node));
curr->word_string=word_unique[i];
curr->freq=word_freq[i];
curr->next=head;
head=curr;
}
while(curr) {
if(curr->word_string!=NULL) printf("%s:%d\n", curr->word_string, curr->freq);
curr = curr->next;
}
return 0;
}
The input file is a text file and it looks like this:
Everything LaTeX numbers for you has a counter associated with it. The name of the counter
is the same as the name of the environment or command that produces the number, except
with no. Below is a list of some of the counters used in LaTeX’s standard document styles
to control numbering.
When I try to print the unique elements in alphabetical order with their frequencies, it actually prints them in reverse order, with 4 times the frequency they actually have. It also separates "numbering." from the others, and there is an extra line at the beginning that I can't account for. This is what it prints:
reading works.
0- :2098416
numbering.:4
you:4
with:4
used:4
to:4
the:4
The:4
that:4
styles:4
standard:4
some:4
same:4
produces:4
or:4
of:4
numbers:4
number,:4
no:4
name:4
list:4
LaTeX’s:4
LaTeX:4
it.:4
is:4
in:8
has:24
for:16
except:8
Everything:4
environment:4
document:8
counters:4
counter:8
control:8
command:4
Below:4
associated:4
as:4
a:4
\.:4
Program ended with exit code: 0
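Reading the code above (this is a reading of the snippet, not something I ran), three things look likely to explain the output. word_freq is declared as an array of int *, so ++ advances each entry by sizeof(int), which would give the 4x counts. The inner loop of the sort stops at s-1, so the last word is never compared with the others. The list is built with k iterations instead of s, so it reads entries of word_unique and word_freq that were never filled in. Prepending each node to the head also reverses the order. A sketch of the counting and sorting steps under those assumptions, with hypothetical helper names count_unique and sort_words:

```c
#include <string.h>
#include <strings.h>   /* strcasecmp (POSIX) */

/* Collects the distinct words of word[0..k-1] into uniq and their
   counts into freq (plain ints, not pointers). Returns the number
   of distinct words. */
int count_unique(char *word[], int k, char *uniq[], int freq[]) {
    int s = 0;
    for (int j = 0; j < k; j++) {
        int l;
        for (l = 0; l < s; l++) {
            if (strcmp(word[j], uniq[l]) == 0) {
                freq[l]++;
                break;
            }
        }
        if (l == s) {          /* first time this word is seen */
            uniq[s] = word[j];
            freq[s] = 1;
            s++;
        }
    }
    return s;
}

/* Sorts both arrays in step, case-insensitively, by swapping the
   pointers instead of copying characters between buffers. */
void sort_words(char *uniq[], int freq[], int s) {
    for (int j = 0; j < s - 1; j++) {
        for (int l = j + 1; l < s; l++) {
            if (strcasecmp(uniq[j], uniq[l]) > 0) {
                char *tw = uniq[j]; uniq[j] = uniq[l]; uniq[l] = tw;
                int tf = freq[j];   freq[j] = freq[l]; freq[l] = tf;
            }
        }
    }
}
```

Swapping pointers also avoids the strcpy calls between strdup'ed strings of different lengths, which can write past the end of the shorter buffer. When building the list by prepending, iterating from s-1 down to 0 would keep the alphabetical order.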
I want to create a hash table for an exercise I have to hand in at my university.
The program will open a number of files, break each file's content into "words" (tokens), and save each word in a hash table together with its frequency.
If the word is already in the hash table, the program will increase the word's frequency.
At the end, the program will print the words and their frequencies.
The words should be printed from the highest frequency to the lowest.
The comparison of the words will ignore upper and lower case.
For example, if a file contains: one two three four Two Three Four THREE FOUR FoUr
It should print:
four 4
three 3
two 2
one 1
The professor gave us a template that we should complete, but I'm really confused about what to do with the insert_ht() and clear_ht() functions, as well as the compare one.
Here is the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#define HTABLE_SIZ 1001
#define MAX_LINE_SIZ 1024
/* Hash Table */
typedef struct node* link;
struct node { char *token; int freq; link next; };
link htable[HTABLE_SIZ] = { NULL }; /* Table of lists (#buckets) */
int size = 0; /* Size (number of elements) of hash table */
unsigned int hash (char *tok );
void insert_ht (char *data);
void clear_ht ( );
void print_ht ( );
void Process(FILE *fp);
int main(int argc, char *argv[])
{
int i;
FILE *fp;
for (i=1; i < argc; i++)
{
fp = fopen(argv[i],"r");
if (NULL == fp)
{
fprintf(stderr,"Problem opening file: %s\n",argv[i]);
continue;
}
Process(fp);
fclose(fp);
}
print_ht();
clear_ht();
return 0;
}
void Process(FILE *fp)
{
const char *seperators = " ?!'\";,.:+-*&%(){}[]<>\\\t\n";
char line[MAX_LINE_SIZ];
char *s;
while((fgets(line,MAX_LINE_SIZ, fp)) != NULL)
{
for (s=strtok(line,seperators); s; s=strtok(NULL,seperators))
insert_ht(s);
}
}
/* Hash Function */
unsigned int hash(char *tok)
{
unsigned int hv = 0;
while (*tok)
hv = (hv << 4) | toupper(*tok++);
return hv % HTABLE_SIZ;
}
void insert_ht(char *token)
{
……………………………………………
}
void clear_ht()
{
……………………………………………
}
int compare(const void *elem1, const void *elem2)
{
……………………………………………
}
void print_ht()
{
int i, j=0;
link l, *vector = (link*) malloc(sizeof(link)*size);
for (i=0; i < HTABLE_SIZ; i++)
for (l=htable[i]; l; l=l->next)
vector[j++] = l;
qsort(vector,size,sizeof(link),compare);
for (i=0; i < size; i++)
printf("%-50s\t%7d\n",vector[i]->token,vector[i]->freq);
free(vector);
}
I'll answer you in a new post because it's hard to be exhaustive in comments.
1. Malloc
Why would I need to use malloc then? Shouldn't I write directly to the htable (in the insert_ht() function)?
You need to use malloc because you declare a char pointer in the struct (char *token). That pointer is never initialized to anything, and since you don't know the size of the token in advance, you need to allocate memory for every token. But since you use strdup(token), you don't need to malloc the token yourself, because strdup does it for you. Don't forget to free every token in order to avoid memory leaks.
2. Segfault
I can't test your code, but it seems that the following line causes the segmentation fault:
list = htable[hashval]->token
Indeed, you try to access token while htable[hashval] is NULL, and you assign a char * to a variable of type link (list).
You need to loop with this :
for(list = htable[hashval]; list != NULL; list = list->next) { ... }
3. Notes
if (x=1) should be if (x==1).
Don't malloc new_list if you don't need to.
Because new_list is used when htable[hashval] is NULL, new_list->next = htable[hashval]; will set new_list->next to NULL.
You should use the -Wall option in gcc (for warnings), and you may use valgrind to understand your segmentation faults. In that case, compile with debug information (-g).
Double and final edit: I found the solution. Apparently my compare function was wrong.
I still haven't figured out why, but here is the correct one. Hopefully someone else will find this post helpful!
int compare(const void *elem1, const void *elem2)
{
return (*(link*)elem2)->freq - (*(link*)elem1)->freq;
}
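A likely reason the earlier version failed, for anyone wondering: qsort hands the comparison function pointers to the array elements, and the elements of vector are themselves of type link (pointers to nodes), so each argument is really a link *. Casting it straight to struct node * reads the pointer's own bytes as if they were a node. A sketch that spells out the extra dereference and avoids the subtraction, which could overflow for extreme counts; the name compare_desc is hypothetical:

```c
#include <stdlib.h>

typedef struct node *link;
struct node { char *token; int freq; link next; };

/* Orders nodes from highest to lowest frequency. */
int compare_desc(const void *elem1, const void *elem2) {
    /* each argument points at an array element, and each element is a link */
    const struct node *p1 = *(const link *)elem1;
    const struct node *p2 = *(const link *)elem2;
    /* yields 1, 0 or -1 without risking integer overflow */
    return (p2->freq > p1->freq) - (p2->freq < p1->freq);
}
```

It would be passed to qsort exactly like compare in print_ht: qsort(vector, size, sizeof(link), compare_desc).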
Edit: deleted old answer. I think I found the correct way, but I have another problem now.
The compare function doesn't work correctly. My printf is fine, but it doesn't sort the words by frequency. I want them sorted from the highest to the lowest.
In this example, the file contains: one two three four Two Three Four THREE FOUR FoUr
And I get:
two 2
one 1
four 4
three 3
While I should be getting:
four 4
three 3
two 2
one 1
Here is the code. Feel free to help!
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
#define HTABLE_SIZ 1001
#define MAX_LINE_SIZ 1024
/* Hash Table */
typedef struct node* link;
struct node { char *token; int freq; link next; };
link htable[HTABLE_SIZ] = { NULL }; /* Table of lists (#buckets) */
int size = 0; /* Size (number of elements) of hash table */
unsigned int hash (char *tok );
void insert_ht (char *data);
void clear_ht ( );
void print_ht ( );
void Process(FILE *fp);
int main(int argc, char *argv[])
{
int i;
FILE *fp;
printf("before the for loop \n");
for (i=1; i < argc; i++)
{
printf("before fopen \n");
fp = fopen(argv[i],"r");
if (NULL == fp)
{
fprintf(stderr,"Problem opening file: %s\n",argv[i]);
continue;
}
printf("before Process \n");
Process(fp);
fclose(fp);
}
print_ht();
//clear_ht();
return 0;
}
void Process(FILE *fp)
{
const char *seperators = " ?!'\";,.:+-*&%(){}[]<>\\\t\n";
char line[MAX_LINE_SIZ];
char *s;
while((fgets(line,MAX_LINE_SIZ, fp)) != NULL)
{
for (s=strtok(line,seperators); s; s=strtok(NULL,seperators)){
printf("before insert %s \n",s);
insert_ht(s);
}
}
}
/* Hash Function */
unsigned int hash(char *tok)
{
printf("entered hash \n");
unsigned int hv = 0;
while (*tok)
hv = (hv << 4) | toupper(*tok++);
printf("leaving hash, hv = %u \n",hv);
return hv % HTABLE_SIZ;
}
void insert_ht(char *token)
{
printf("entered insert %s \n",token);
unsigned int hashval = hash(token);
if (htable[hashval]==NULL){
printf("inside the first if %u %s \n",hashval,token);
//token = strdup(token);
htable[hashval] = malloc(sizeof(token));
htable[hashval]->token = token ;
htable[hashval]->freq = 1;
size++;
}else {
htable[hashval]->freq++;
}
printf("inserted successfully \n");
}
int compare(const void *elem1, const void *elem2)
{
const struct node *p1 = elem1;
const struct node *p2 = elem2;
if ( p1->freq < p2->freq)
return -1;
else if (p1->freq > p2->freq)
return 1;
else
return 0;
}
void print_ht()
{
int i, j=0;
link l, *vector = (link*) malloc(sizeof(link)*size);
for (i=0; i < HTABLE_SIZ; i++)
for (l=htable[i]; l; l=l->next)
vector[j++] = l;
qsort(vector,size,sizeof(link),compare);
for (i=0; i < size; i++)
printf("%-50s\t%7d\n",vector[i]->token,vector[i]->freq);
free(vector);
}
Sorry for my bad English.
I think that:
insert_ht(char *token) takes a word from the file and puts it into the hash table. In brief, if the word already exists in the hash table, you just have to increment its frequency. Otherwise, you need to create another node, set its frequency to 1, then add it to the array. At the end, you will have one entry for each unique word.
compare(const void *elem1, const void *elem2) will be used by qsort. It returns 0 if elem1 = elem2, a negative number if elem1 < elem2, and a positive number if elem1 > elem2. By passing compare to qsort, you allow qsort to sort your array according to your own criteria.
clear_ht() should probably free every node and set all the entries of the array to NULL, so that another count can start from scratch.
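The description above could be sketched roughly as follows. This is one possible completion of the template, not the professor's reference solution. The helper same_word is hypothetical, the token is copied with malloc and strcpy instead of strdup only to stay within standard C, and the parameters are const-qualified, which differs slightly from the template's prototypes.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define HTABLE_SIZ 1001

typedef struct node *link;
struct node { char *token; int freq; link next; };

link htable[HTABLE_SIZ] = { NULL };
int size = 0;

unsigned int hash(const char *tok) {
    unsigned int hv = 0;
    while (*tok)
        hv = (hv << 4) | (unsigned int)toupper((unsigned char)*tok++);
    return hv % HTABLE_SIZ;
}

/* Hypothetical helper: case-insensitive equality of two words. */
static int same_word(const char *a, const char *b) {
    while (*a && *b) {
        if (toupper((unsigned char)*a) != toupper((unsigned char)*b))
            return 0;
        a++; b++;
    }
    return *a == *b;   /* equal only if both ended together */
}

void insert_ht(const char *token) {
    unsigned int h = hash(token);
    for (link l = htable[h]; l != NULL; l = l->next) {
        if (same_word(l->token, token)) {  /* already stored */
            l->freq++;
            return;
        }
    }
    link n = malloc(sizeof *n);            /* a whole node, not a pointer */
    if (n == NULL) return;
    n->token = malloc(strlen(token) + 1);  /* own copy: the line buffer is reused */
    if (n->token == NULL) { free(n); return; }
    strcpy(n->token, token);
    n->freq = 1;
    n->next = htable[h];                   /* chain colliding words */
    htable[h] = n;
    size++;
}

void clear_ht(void) {
    for (int i = 0; i < HTABLE_SIZ; i++) {
        link l = htable[i];
        while (l != NULL) {
            link next = l->next;           /* save before freeing */
            free(l->token);
            free(l);
            l = next;
        }
        htable[i] = NULL;
    }
    size = 0;
}
```

Walking the bucket's list before inserting is what keeps two different words with the same hash value from being counted as one.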
In order to complete a larger project, I'm trying to get an idea of how to send an array of structures and a token of type char* to a function. The purpose of this code is to do the following:
open the file
tokenize the file
send the token and the array of structures to the search function
the search function goes through the array of structures, using strcmp to find a match with the token
if a match is found, return 1; the main function will check for 1 or 0
if 0, don't add the token to the array of structures; if 1, add the token to the array of structures
I just wrote a small program to see if I could send the array and the token to a function and compare them, but I get so many errors that I'm lost, since I don't understand most of them.
#include <stdio.h>
#include <string.h>
int search(struct id array[],char* tok);
struct id
{
char name[20];
int age;
};
int main(void)
{
struct id person[2] = { {"John Smith", 25},
{"Mary Jones", 32} };
char* token = "Mary Jones"; /*char* for strtok() return type*/
search(person,token);
}
int search(struct id array[],char* tok)
{
int i,value;int size = 2;
for(i=0;i<size;i++)
{
if(strcmp(array[i].name,tok) == 0)
value = 0;
else
value = 1;
}
return value;
}
Place the prototype
int search(struct id array[],char* tok);
after the struct declaration, and assign the return value of search to an int variable:
int found = search(person,token);
if(found == 0)
printf("Name is found\n"); // or whatever you want
Here is the code that you should use.
#include <stdio.h>
#include <string.h>
struct id
{
char name[20];
int age;
};
int search( const struct id array[], size_t n, const char *tok );
int main( void )
{
struct id person[2] = { {"John Smith", 25},
{"Mary Jones", 32} };
char* token = "Mary Jones"; /*char* for strtok() return type*/
printf( "%d\n", search( person, sizeof( person ) / sizeof( *person ), token ) );
return 0;
}
int search( const struct id array[], size_t n, const char *tok )
{
size_t i = 0;
while ( i < n && strcmp( array[i].name, tok ) != 0 ) ++i;
return n != 0 && i != n;
}
EDIT: I removed some typos.
The output is
1
that is, the name has been found.
Take into account that a correct search function has to return 1 if the name is found. :)
Always declare structs / enums / unions before defining pointers to them, or using them in a function declaration, like this:
struct id;
Extra tip: introduce a typedef-name with the same identifier as the tag name at the same time:
typedef struct id id;
Always declare functions before first use, like this:
int search(struct id array[],char* tok);
Always define structs / enums / unions before using them for anything but what the first rule covers, like this:
struct id {
char name[20];
int age;
};
With typedef-name:
typedef struct id { /**/ } id;
Now, where possible, put the definition where you would otherwise need to put a forward-declaration.
Only exception: If you put the declaration in a header-file, that's fine.
That reduces superfluous redundancy.
Some more observations:
Don't use fixed-size fields for names and such. They are always too short.
Never modify string literals. Failing to heed that prohibition results in undefined behavior, and your program becomes meaningless. Work with a copy instead.
/* Most implementations supply this non-standard function */
char* strdup(const char* s) {
size_t n = strlen(s)+1;
char* p = malloc(n);
if(p) memcpy(p, s, n);
return p;
}
When you pass an array, you are actually only passing a pointer to its first element, so pass an element count too.
The type size_t (or the POSIX type ssize_t) is designed for that chore.
If you have an array named a, you get the element count using sizeof a / sizeof *a. Be sure that a is not a pointer, though!
Early returns are good: return success as early as possible.
Then you don't risk forgetting your success in the next loop iteration (as happened to you), and it is faster as well.
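The last few points can be sketched as follows, assuming the struct id from the question; the function name find_name is hypothetical:

```c
#include <stddef.h>
#include <string.h>

struct id {
    char name[20];
    int age;
};

/* Returns 1 as soon as tok matches a name among the n elements,
   0 if the whole array was scanned without a match. */
int find_name(const struct id array[], size_t n, const char *tok) {
    for (size_t i = 0; i < n; i++) {
        if (strcmp(array[i].name, tok) == 0)
            return 1;   /* early return: a later element cannot undo it */
    }
    return 0;
}
```

A caller that owns the array can compute the count on the spot, for example find_name(person, sizeof person / sizeof *person, token).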