I am trying to implement a data structure that holds a list of char* values plus an array storing the number of occurrences of each unique char* value passed on the command line. It's a bit messy, but I thought I had it figured out until I started testing it (the tests do not have to be compiled separately; I just renamed the executable for each run to tell the tests apart).
Test cmd inputs:
test1 a b c d e f g
test2 a a b b c d d e f f g g
test3 a a a a a a a a a a a
test4 a a a a a a a a a a a b c c
test5 a j i q y z n f o p m a
test6 a a b b c d d e f f g
test7 a j k q e s l i h i a
test8 a j i q y z n f o p m
Based on these inputs, the table lengths for each should look like:
test1 table->length = 7 <- works
test2 table->length = 7 <- works
test3 table->length = 1 <- works
test4 table->length = 3 <- works
test5 table->length = 11 <- works
test6 table->length = 7 <- no work
test7 table->length = 9 <- no work
test8 table->length = 11 <- no work
Based on these results, I don't believe the problem is the value of argc (tests 1, 2 and 3 work), nor the number of unique char* values (see tests 2, 3, 4 and 5). When I run the code below, it either works and reaches the end of main(...) or, depending on the input, fails right after displaying:
...
ptr address: 00DF04F4
ptr value: 00DF0510
test.c:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct node {
char *path;
int index;
struct node *next;
} node;
typedef struct lookup {
struct node *head;
struct node *ptr;
int length;
int array[];
} lookup;
void add(lookup *table, char *path);
void search(lookup *table, char *path, int *i);
void add(lookup *table, char *path) {
int i = 0;
search(table, path, &i);
if(table->ptr == NULL) {
node *new_node = malloc(sizeof(node));
if (table->head == NULL) {
table->head = new_node;
table->head->next = NULL;
table->head->path = path;
table->array[0] = 1;
table->head->index = 0;
table->length = 1;
} else {
new_node->path = path;
new_node->next = table->head;
table->head = new_node;
table->array[i] = 1;
table->head->index = i;
table->length += 1;
}
} else {
table->array[table->ptr->index] += 1;
}
}
void search(lookup *table, char *path, int *i) {
if(table->head == NULL) {
table->ptr = NULL;
return;
}
table->ptr = table->head;
(*i) = 0;
while(table->ptr != NULL) {
printf("ptr address: %p\n", &(table->ptr));
printf("ptr value: %p\n", (table->ptr));
/* FAILURE POINT */
printf("%s\n", table->ptr->path);
printf("comparing ptr string: '%s' with given '%s'\n", table->ptr->path, path);
if(strcmp(table->ptr->path, path) == 0) {
printf("found match!\n");
return;
}
table->ptr = table->ptr->next;
(*i) += 1;
}
printf("could not find '%s' in table\n", path);
}
int main(int argc, char **argv) {
lookup *table = malloc(sizeof(lookup) + argc);
table->head = NULL;
int i = 1;
for(i; i < argc; i++) {
printf("\n\ntry adding: %s\n", argv[i]);
add(table, argv[i]);
}
printf("\n\n############\nfinished adding\n");
printf("table length: %d\n", table->length);
return 0;
}
I know the pointer isn't NULL (I print its address and value right before), yet I'm not sure why execution stops where it does with seemingly normal input. The expected output (based on the current code) should show each char* being added, the address and value of the pointer as search(...) walks the table, the char* that ptr currently points to, which two char* values are being compared, and whether a match was found or search(...) could not find the char*.
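One thing worth checking in the code above, offered as a guess rather than a confirmed diagnosis: malloc(sizeof(lookup) + argc) reserves argc bytes for the flexible array member, not argc ints, and table->length is read by += before it is ever set when the table comes from plain malloc. The sketch below shows one common way to allocate such a struct. lookup_new and lookup_bump are hypothetical helper names that do not appear in the original code, and node is only forward-declared here.

```c
#include <assert.h>
#include <stdlib.h>

struct node;                 /* only used through pointers in this sketch */

typedef struct lookup {
    struct node *head;
    struct node *ptr;
    int length;
    int array[];             /* flexible array member: must be the last field */
} lookup;

/* Hypothetical helper: allocate a table with room for `capacity` counters. */
lookup *lookup_new(size_t capacity) {
    /* sizeof(lookup) does not include array[], so its storage is added
       in units of int, not in bytes */
    lookup *table = calloc(1, sizeof *table + capacity * sizeof(int));
    if (table == NULL)
        return NULL;
    /* calloc zeroes the counters; the other fields are set explicitly */
    table->head = NULL;
    table->ptr = NULL;
    table->length = 0;
    return table;
}

/* Hypothetical helper: bump counter `index`, which must lie inside the array. */
void lookup_bump(lookup *table, size_t index, size_t capacity) {
    assert(table != NULL && index < capacity);
    table->array[index] += 1;
}
```

In main this would be called as lookup_new((size_t) argc), which leaves one spare slot because argv[0] is never added.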
I have a contact structure stored in a linked list, which in turn lives in a hash table. I don't know if I defined all my structures correctly.
I basically want to add a contact from input when given the command 'a' (the command looks like this: a name mail phone).
I should not be able to add the contact if it already exists.
I've tried creating the necessary structures for a hash table with linked lists, but I don't understand how to work with them, so this function would help me a lot in understanding the concept.
This is the structure I've tried:
#define NOME_SIZE 1023
#define MAIL_SIZE 511
#define TELEFONE_SIZE 63
#define HASH_SIZE 1000
typedef struct contacts{
char name[NOME_SIZE];
char mail[MAIL_SIZE];
char phone[TELEFONE_SIZE];
struct contacts *next;
}HashList;
typedef struct hash_bucket{
HashList *head, *tail;
int n_elements;
}HashBucket;
HashBucket hashtable[HASH_SIZE];
I do not expect any output if the contact is added successfully.
If it already exists, the function should return an error saying that the contact already exists.
Here is a proposal based on your code, with my remarks. I removed n_elements because it seems useless to me. I kept tail, but I am not sure it is useful, because your list only has a next pointer, without a previous one. I kept arrays for name, mail and phone, but I think it would be better to use char *.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NOME_SIZE 1023
#define MAIL_SIZE 511
#define TELEFONE_SIZE 63
#define HASH_SIZE 1000
typedef struct contacts{
char name[NOME_SIZE];
char mail[MAIL_SIZE];
char phone[TELEFONE_SIZE];
struct contacts *next;
}HashList;
typedef struct hash_bucket{
HashList *head, *tail;
/* int n_elements; * I removed that field, it is useless */
}HashBucket;
HashBucket hashtable[HASH_SIZE];
// from https://stackoverflow.com/a/7666577/2458991
size_t hash(char * str)
{
size_t hash = 5381;
unsigned char c;
while ((c = (unsigned char) *str++) != 0)
hash = ((hash << 5) + hash) + c; /* hash * 33 + c */
return hash;
}
HashList * createElt(char * n, char * m, char * p)
{
HashList * e = malloc(sizeof(HashList));
strncpy(e->name, n, NOME_SIZE - 1);
e->name[NOME_SIZE - 1] = 0;
strncpy(e->mail, m, MAIL_SIZE - 1);
e->mail[MAIL_SIZE - 1] = 0; /* terminate mail, not name */
strncpy(e->phone, p, TELEFONE_SIZE - 1);
e->phone[TELEFONE_SIZE - 1] = 0; /* terminate phone, not name */
e->next = NULL;
return e;
}
// add or replace an element; the key is the name
// returns 0 if the entry is added, or a nonzero value if an existing entry was replaced
int insertElt(char * n, char * m, char * p)
{
// assumes each list is kept sorted by name
HashBucket * hb = &hashtable[hash(n) % HASH_SIZE];
HashList ** hl = &hb->head;
for (;;) {
if (*hl == NULL) {
/* last (and may be first) element */
*hl = createElt(n, m, p);
hb->tail = *hl;
return 0;
}
int cmp = strcmp((*hl)->name, n);
if (cmp == 0) {
/* replace */
strncpy((*hl)->mail, m, MAIL_SIZE - 1);
(*hl)->mail[MAIL_SIZE - 1] = 0; /* terminate mail, not name */
strncpy((*hl)->phone, p, TELEFONE_SIZE - 1);
(*hl)->phone[TELEFONE_SIZE - 1] = 0; /* terminate phone, not name */
return 1;
}
if (cmp > 0) {
/* insert before */
HashList * e = createElt(n, m, p);
e->next = *hl;
*hl = e;
return 0;
}
hl = &(*hl)->next;
}
}
void pr()
{
for (size_t i = 0; i != HASH_SIZE; ++i)
for (HashList * hl = hashtable[i].head; hl != NULL; hl = hl->next)
printf("%s %s %s\n", hl->name, hl->mail, hl->phone);
}
int main()
{
printf("%d\n", insertElt("n1", "m1", "p1"));
printf("%d\n", insertElt("n2", "m2", "p2"));
pr();
printf("%d\n", insertElt("n1", "mm1", "pp1"));
pr();
return 0;
}
Compilation and execution :
pi@raspberrypi:/tmp $ gcc -pedantic -Wextra -Wall hm.c
pi@raspberrypi:/tmp $ ./a.out
0
0
n1 m1 p1
n2 m2 p2
1
n1 mm1 pp1
n2 m2 p2
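The proposal above replaces an existing contact, while the question asks for an error when the contact already exists. A minimal sketch of that variant follows, under a few assumptions: it uses a plain array of list heads instead of HashBucket, it does not keep the lists sorted, and findContact and addContact are hypothetical names that are not part of the answer's code.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NOME_SIZE 1023
#define MAIL_SIZE 511
#define TELEFONE_SIZE 63
#define HASH_SIZE 1000

typedef struct contacts {
    char name[NOME_SIZE];
    char mail[MAIL_SIZE];
    char phone[TELEFONE_SIZE];
    struct contacts *next;
} HashList;

static HashList *buckets[HASH_SIZE];   /* static storage: every chain starts NULL */

/* djb2 string hash, same formula as in the answer above */
static size_t hash(const char *str) {
    size_t h = 5381;
    unsigned char c;
    while ((c = (unsigned char) *str++) != 0)
        h = ((h << 5) + h) + c;        /* h * 33 + c */
    return h;
}

/* Hypothetical helper: return the contact named n, or NULL if there is none. */
HashList *findContact(const char *n) {
    assert(n != NULL);
    for (HashList *e = buckets[hash(n) % HASH_SIZE]; e != NULL; e = e->next)
        if (strcmp(e->name, n) == 0)
            return e;
    return NULL;
}

/* Hypothetical helper: 0 if added, 1 if the name already exists, -1 if malloc fails. */
int addContact(const char *n, const char *m, const char *p) {
    if (findContact(n) != NULL)
        return 1;                      /* duplicate: leave the table untouched */
    HashList *e = malloc(sizeof *e);
    if (e == NULL)
        return -1;
    /* snprintf truncates over-long input and always NUL-terminates */
    snprintf(e->name, sizeof e->name, "%s", n);
    snprintf(e->mail, sizeof e->mail, "%s", m);
    snprintf(e->phone, sizeof e->phone, "%s", p);
    size_t i = hash(n) % HASH_SIZE;
    e->next = buckets[i];              /* push at the head of the chain */
    buckets[i] = e;
    return 0;
}
```

The caller can then print the "contact already exists" message whenever addContact returns 1.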
This is my code :
typedef struct noeud{
int x;
struct noeud* suivant;
} noeud;
typedef noeud* file;
file enfiler(file f, int val){
file nv = (file) malloc(sizeof(noeud));
nv->x = val; nv->suivant = NULL;
if (f == NULL)
f = nv;
else {
file tmp = f;
while(tmp->suivant != NULL) tmp = tmp->suivant;
tmp->suivant = nv;
}
return f;
}
file defiler(file f){//removing an element of the FIFO data structure
if (f == NULL)
return f;
else {
file tmp = f;
f = f->suivant;//receiving address of next node, the last one points to NULL
free(tmp);
return f;
}
}
int tete(file f){
return f->x;//getting the element of the head
}
void init(file * f) {
*f = NULL;
}
void affiche(file f){//print data structure's elements
if (f == NULL)
printf("File vide.\n");
else {//emptying the FIFO data structure into tmp to access elements
file tmp; init(&tmp);
while(f != NULL){
tmp = enfiler(tmp, tete(f));
f = defiler(f);
}
int i = 0;
while(tmp != NULL) {//emptying tmp to original f
printf("F[%d] = %d\n", ++i, tete(tmp));
f = enfiler(f, tete(tmp));
tmp = defiler(tmp);
}
}
}
This is my input :
file f; init(&f);//initializing f to NULL
f = enfiler(f, 6);//adding elements
f = enfiler(f, 45);
f = enfiler(f, 78);
f = enfiler(f, 5);
affiche(f);
affiche(f);
affiche(f);
This is the output :
F[1] = 6
F[2] = 45
F[3] = 78
F[4] = 5
F[1] = 78
F[2] = 5
F[1] = 2036736 //this is a random value
Each call to affiche(file f) loses two heads. I reviewed the function file defiler(file f) but can't find an error in it, and file enfiler(file f, int val) looks fine as well.
Thank you for your time!
In order to print the elements, you dismantle and rebuild your queue. The rebuild does not re-link the same nodes. Instead, you just take the values and create two completely new lists. That means the local variable f in affiche will usually hold a different address when leaving affiche than it did on entry. You can test that by adding the following statements:
void affiche(noeud *f)
{
printf("entree: %p\n", (void *) f);
// body of the affiche function
printf("sortie: %p\n", (void *) f);
}
The problem is that the f in your calling function is not updated. It may point to recently freed memory or to another valid node. In other words, your list will likely be corrupt.
The easiest way to fix that is by returning the new head, as in enfiler and defiler.
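A different option from returning the new head, sketched here as an alternative rather than as the fix described above, is to print the queue by walking the links, so that no node is freed or re-created and the caller's f stays valid. enfiler follows the question's code with an added malloc check. The int return value of affiche and the assert in tete are additions made for illustration.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

typedef struct noeud {
    int x;
    struct noeud *suivant;
} noeud;
typedef noeud *file;

/* Append val at the tail; returns the (possibly new) head, as in the question. */
file enfiler(file f, int val) {
    file nv = malloc(sizeof *nv);
    if (nv == NULL)
        return f;                      /* allocation failed: queue unchanged */
    nv->x = val;
    nv->suivant = NULL;
    if (f == NULL)
        return nv;
    file tmp = f;
    while (tmp->suivant != NULL)
        tmp = tmp->suivant;
    tmp->suivant = nv;
    return f;
}

/* Value at the head; the queue must not be empty. */
int tete(file f) {
    assert(f != NULL);
    return f->x;
}

/* Print the queue by following the links; nothing is removed or rebuilt.
   Returns the number of elements printed (an addition for illustration). */
int affiche(file f) {
    if (f == NULL) {
        printf("File vide.\n");
        return 0;
    }
    int i = 0;
    for (file tmp = f; tmp != NULL; tmp = tmp->suivant)
        printf("F[%d] = %d\n", ++i, tmp->x);
    return i;
}
```

With this version, calling affiche(f) three times in a row prints the same four elements each time, because f is never modified.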
I am trying to implement a breadth-first search algorithm in C. As input I take any graph from a file and store all nodes and edges in a structure.
Then I create an adjacency matrix, run through all columns, push the encountered nodes on a stack, pop them, and so on until I have all the paths.
I still have problems storing those paths in a linked list: in some specific cases I lose the last value of a stored path (one int array per link). Surprisingly, it only happens on paths of length 5 (I cannot test all lengths, but up to 12 it seems OK).
It's weird, because the value is lost at function exit (I debugged with LLDB: inside the function that creates the link the last value exists, but once I leave the function it does not), and not every time (about 1 execution out of 10 is fine).
To me this looks like a malloc issue, so I checked every single malloc in my program, without success. I checked all the variables and everything seems fine, except for this length-5 case (I assume my program has a defect that only shows up in this case, but why?).
I would gladly accept some help, as I have run out of things to check.
Here is the code of the main BFS function :
void bfs(t_lemin *e)
{
t_path *save;
//set needed variables
set_bfs_base_var(e);
save = e->p;
while (paths_remain(e))
{
//Special Start-End case
if (e->map[e->nb_start][e->nb_end] == 1)
{
create_single(e);
break ;
}
e->x = e->nb_start;
reset_tab(e);
while (e->x != e->nb_end)
{
e->y = 0;
while (e->y < e->nb_rooms)
{
if (e->map[e->x][e->y] == 1 && !e->visited[e->y])
//push_on_stack the nodes
push_stack(e);
e->y++;
}
//go_to first elem on stack
e->x = e->stack[0];
if (e->x == e->nb_end || is_stack_empty(e->stack, e->nb_rooms - 1))
break ;
e->visited[e->x] = 1;
//set_it as visited then pop it
pop_stack(e, e->nb_rooms);
}
if (is_stack_empty(e->stack, e->nb_rooms - 1))
break ;
e->find_new[add_path(e)] = 1;
discover_more_paths(e, save);
}
print_paths(e, save);
e->p = save;
}
And here are the two functions that store the paths in a linked list:
void create_path(t_lemin *e, int *pa, int len)
{
int j;
j = 1;
//create_new_node if required
if (e->p->path)
{
if (!(e->p->next = malloc(sizeof(t_path))))
return ;
e->p = e->p->next;
}
//create_the_array_for_path_storing
e->p->path = malloc(sizeof(int) * len + 2);
e->p->next = NULL;
e->p->size_path = len + 2;
//copy_in_it
while (--len >= 0)
{
e->p->path[j++] = pa[len];
}
//copy_end_and_start_at_end_and_start
e->p->path[e->p->size_path - 1] = e->nb_end;
e->p->path[0] = e->nb_start;
e->nb_paths++;
}
int add_path(t_lemin *e)
{
int i;
int save;
int *path;
int next_path;
i = 0;
if (!(path = malloc(sizeof(int) * e->nb_rooms)))
exit(-1);
save = e->nb_end;
//in_order_to_save_the_path_i store the previous value of each node so I can find the path by iterating backward
next_path = -1;
while (e->prev[save] != e->nb_start)
{
path[i] = e->prev[save];
save = e->prev[save];
next_path = next_path == -1 && get_nb_links(e, path[i])
> 2 ? path[i] : -1;
i++;
}
//path_contains all values of the path except for start and end
save = i;
while (i < e->nb_rooms)
{
path[i] = 0;
i++;
}
create_path(e, path, save);
i = next_path == -1 ? path[0] : next_path;
//ft_printf("to_block : %d\n", i);
return (next_path == -1 ? path[0] : next_path);
}
If needed here is a clone of the entire repository, the issue can be seen running the program with maptest in the main directory : https://github.com/Caribou123/bfs_agesp.git
make && ./lem_in < maptest
All paths must end with the last room, whereas in this case the value of the last room becomes 0. So the program outputs "start->room1->room2->....->start", as the index value of start is 0.
Here is a look at my 'e', the main structure. (it's quite huge, don't be scared) :
typedef struct s_lemin
{
int x;
int y;
char *av;
int nb_ants;
int st;
int nd;
int nb_rooms;
int nb_paths;
int max_sizep;
int nb_links;
int nb_start;
int nb_end;
int **map;
int *stack;
int *visited;
int *prev;
int *find_new;
int maxy;
int maxx;
int minx;
int conti;
int miny;
char ***saa;
struct s_rooms *r;
struct s_ants *a;
struct s_rooms **table_r;
struct s_links *l;
struct s_hash **h;
struct s_rooms *start;
struct s_rooms *end;
struct s_info *i;
struct s_path *p;
struct s_path *select_p;
} t_lemin;
Thank you in advance for your help, and sorry if it's some stupid malloc that I somehow missed.
Artiom
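One candidate worth ruling out, offered as a guess and not as a verified diagnosis: in create_path, malloc(sizeof(int) * len + 2) reserves len ints plus 2 bytes, while the function then writes len + 2 ints. Whether the overrun is visible depends on how the allocator rounds sizes, which could explain why only some lengths misbehave. The sketch below is a stand-alone version of the copy with the element count parenthesised. build_path and its parameters are hypothetical names, not part of the repository.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-alone version of the copy done in create_path():
   builds start, then the intermediate rooms in reverse order, then end.
   pa holds the rooms as add_path() collects them, from end back to start. */
int *build_path(const int *pa, int len, int start, int end, int *size_out) {
    assert(len >= 0 && size_out != NULL);
    assert(pa != NULL || len == 0);
    int size = len + 2;
    /* multiply the whole element count by the element size:
       sizeof(int) * len + 2 would add 2 bytes, not 2 ints */
    int *path = malloc(sizeof *path * (size_t) size);
    if (path == NULL)
        return NULL;
    path[0] = start;
    int j = 1;
    for (int k = len - 1; k >= 0; k--)
        path[j++] = pa[k];
    path[size - 1] = end;
    *size_out = size;
    return path;
}
```

Running the original program under valgrind or with -fsanitize=address should confirm or refute this, since either tool reports writes past the end of a heap block.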
I am trying to use the MATLAB API to read a .mat file from C (NOT C++).
This is the MATLAB code that creates the kind of .mat file I want:
A = [[1 2 3]; [5 7 1]; [3 5 9]];
B = [[2 4];[5 7]];
Creator = 'DKumar';
nFilters = 2;
Filters{1} = [[-1.0 -1.0 -1.0]; [-1.0 8 -1.0]; [-1.0 -1.0 -1.0]];
Filters{2} = 2.0*[[-1.0 -1.0 -1.0]; [-1.0 8 -1.0]; [-1.0 -1.0 -1.0]];
cd('/home/dkumar/CPP_ExampleCodes_DKU/Read_mat_File');
save('Test_FILE.mat', 'A', 'B', 'Creator', 'nFilters', 'Filters');
Please note that I also need to read a cell array or something similar.
(1) In the C code, it seems that I can read a matrix stored in the .mat file just fine, but I cannot return it properly (see the output at the end).
(2) I still have no idea how to read the cell array, which in this example stores double matrices that may vary in size.
The full C code follows. First, the function matread, which seemingly reads the data properly.
#include <stdio.h>
#include <stdlib.h>
#include "/usr/local/MATLAB/R2011b/extern/include/mat.h"
struct stDoubleMat{
double* pValueInField;
int nRows, nCols;
};
void matread(const char *file, const char *FieldName2Read, struct stDoubleMat oDoubleMat_LOC)
{
printf("Reading file %s...\n\n", file);
//Open file to get directory
MATFile* pmat = matOpen(file, "r");
if (pmat == NULL) {
printf("Error opening file %s\n", file);
return;
}
// extract the specified variable
mxArray *arr = matGetVariable(pmat, FieldName2Read);
double *pr;
if (arr != NULL && !mxIsEmpty(arr)) {
// copy data
mwSize num = mxGetNumberOfElements(arr);
pr = mxGetPr(arr);
if (pr != NULL) {
oDoubleMat_LOC.pValueInField = pr;
oDoubleMat_LOC.nRows = mxGetM(arr);
oDoubleMat_LOC.nCols = mxGetN(arr);
}
printf("From inside the function \n") ;
printf( "oDoubleMat_LOC.nRows %i ; oDoubleMat_LOC.nCols %i \n", oDoubleMat_LOC.nRows , oDoubleMat_LOC.nCols);
}else{
printf("nothing to read \n") ;
}
// cleanup
mxDestroyArray(arr);
matClose(pmat);
return;
}
In the same file, the main function, which does not seem to receive the data that was read:
int main(int argc, char **argv)
{
const char *FileName = "/home/dkumar/CPP_ExampleCodes_DKU/Read_mat_File/Test_FILE.mat";
const char *FieldName2Read = "A";
struct stDoubleMat oDoubleMat;
matread(FileName, FieldName2Read, oDoubleMat);
double* v = oDoubleMat.pValueInField;
printf("From main \n");
printf( "oDoubleMat.nRows %i ; oDoubleMat.nCols %i \n", oDoubleMat.nRows , oDoubleMat.nCols);
/*
for (int i = 0; i < oDoubleMat.nElements; i++)
{
std::cout <<" copied value : " << *v << "\n";
v = v +1;
}*/
return 0;
}
Here is the output
$ gcc -o Test Read_MatFile_DKU_2.c -I/usr/local/MATLAB/R2011b/extern/include -L/usr/local/MATLAB/R2011b/bin/glnxa64 -lmat -lmx
$ ./Test
Reading file /home/dkumar/CPP_ExampleCodes_DKU/Read_mat_File/Test_FILE.mat...
From inside the function
oDoubleMat_LOC.nRows 3 ; oDoubleMat_LOC.nCols 3
From main
oDoubleMat.nRows 0 ; oDoubleMat.nCols 0
Update:
Here is the updated code, which reads the matrix field just fine. I still have no clue how to read the cell array.
#include <stdio.h>
#include <stdlib.h>
#include "/usr/local/MATLAB/R2011b/extern/include/mat.h"
mxArray *arr;
struct stDoubleMat{
double* pValueInField;
int nRows, nCols;
};
void matread(const char *file, const char *FieldName2Read, struct stDoubleMat* poDoubleMat_LOC)
{
printf("Reading file %s...\n\n", file);
//Open file to get directory
MATFile* pmat = matOpen(file, "r");
if (pmat == NULL) {
printf("Error opening file %s\n", file);
return;
}
// extract the specified variable
arr = matGetVariable(pmat, FieldName2Read);
double *pr;
if (arr != NULL && !mxIsEmpty(arr)) {
// copy data
mwSize num = mxGetNumberOfElements(arr);
pr = mxGetPr(arr);
if (pr != NULL) {
poDoubleMat_LOC->pValueInField = pr;
poDoubleMat_LOC->nRows = mxGetM(arr);
poDoubleMat_LOC->nCols = mxGetN(arr);
}
printf("From inside the function \n") ;
printf( "oDoubleMat_LOC.nRows %i ; oDoubleMat_LOC.nCols %i \n", poDoubleMat_LOC->nRows , poDoubleMat_LOC->nCols);
}else{
printf("nothing to read \n") ;
}
// close the file
matClose(pmat);
return;
}
int main(int argc, char **argv)
{
const char *FileName = "/home/dkumar/CPP_ExampleCodes_DKU/Read_mat_File/Test_FILE.mat";
const char *FieldName2Read = "A";
struct stDoubleMat oDoubleMat;
matread(FileName, FieldName2Read, &oDoubleMat);
double* v = oDoubleMat.pValueInField;
printf("From main \n");
printf( "oDoubleMat.nRows %i ; oDoubleMat.nCols %i \n", oDoubleMat.nRows , oDoubleMat.nCols);
int i;
for (i = 0; i < oDoubleMat.nCols*oDoubleMat.nRows; i++)
{
printf(" copied value : %f \n", *v);
v = v +1;
}
// cleanup the mex-array
mxDestroyArray(arr);
return 0;
}
You pass the "output" argument (oDoubleMat_LOC) by value to matread, so you can never actually get an output because it is copied on input (i.e. only modified locally):
void matread(const char *file, const char *FieldName2Read,
struct stDoubleMat oDoubleMat_LOC) /* oDoubleMat_LOC copied */
Since you are using C, where references are not available, pass a pointer. Redefine matread:
void matread(const char *file, const char *FieldName2Read,
struct stDoubleMat *oDoubleMat_LOC) /* use a pointer */
Then inside matread, you need to dereference it to modify its fields (with -> instead of .):
oDoubleMat_LOC->pValueInField = pr;
oDoubleMat_LOC->nRows = mxGetM(arr);
oDoubleMat_LOC->nCols = mxGetN(arr);
In main, call like this:
struct stDoubleMat oDoubleMat;
matread(FileName, FieldName2Read, &oDoubleMat);
However, note that you have a bigger problem: the mxArray that backs double *pValueInField is both allocated and destroyed inside matread. While you can return the pointer to the data array, it will be a dangling pointer that points to deallocated data. You'll need to either allocate an mxArray outside of matread and pass it in, or allocate a double * and copy the data into it inside matread. Otherwise, as soon as mxDestroyArray is called, the pointer is useless.
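The second option (copying the data out before the mxArray is destroyed) could look roughly like this. The sketch deliberately avoids mat.h so that it compiles on its own: src stands for the pointer returned by mxGetPr(arr), and rows and cols for the values from mxGetM(arr) and mxGetN(arr). copy_double_mat is a hypothetical helper name.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct stDoubleMat {
    double *pValueInField;   /* owned copy: release it with free() */
    int nRows, nCols;
};

/* Hypothetical helper: copy rows*cols doubles from src into a buffer owned
   by dst. Inside matread it would be called before mxDestroyArray(arr).
   Returns 0 on success and -1 on bad dimensions or allocation failure. */
int copy_double_mat(struct stDoubleMat *dst, const double *src, int rows, int cols) {
    assert(dst != NULL && src != NULL);
    if (rows <= 0 || cols <= 0)
        return -1;
    size_t count = (size_t) rows * (size_t) cols;
    double *buf = malloc(count * sizeof *buf);
    if (buf == NULL)
        return -1;
    memcpy(buf, src, count * sizeof *buf);   /* element order is kept as-is */
    dst->pValueInField = buf;
    dst->nRows = rows;
    dst->nCols = cols;
    return 0;
}
```

After such a copy, main owns the buffer and frees it with free(oDoubleMat.pValueInField). MATLAB stores matrices in column-major order, so element (r, c) of the copy is at index r + c * nRows.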
Edit:
Hash.c is updated with the revisions from the comments, but I am still getting a seg fault. I must be missing something in what you are saying.
I have created a hash table ADT using C but I am encountering a segmentation fault when I try to call a function (find_hash) in the ADT.
I have posted all 3 files that I created parse.c, hash.c, and hash.h, so you can see all of the variables. We are reading from the file gettysburg.txt which is also attached
The seg fault occurs in parse.c when I call find_hash. I cannot figure out for the life of me what is going on here. If you need any more information, I can surely provide it.
Sorry for the large amount of code; I have been completely stumped on this for a week now. Thanks in advance.
The way I run the program is first:
gcc -o parse parse.c hash.c
then: cat gettysburg.txt | parse
Parse.c
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include "hash.h"
#define WORD_SIZE 40
#define DICTIONARY_SIZE 1000
#define TRUE 1
#define FALSE 0
void lower_case_word(char *);
void dump_dictionary(Phash_table );
/*Hash and compare functions*/
int hash_func(char *);
int cmp_func(void *, void *);
typedef struct user_data_ {
char word[WORD_SIZE];
int freq_counter;
} user_data, *Puser_data;
int main(void)
{
char c, word1[WORD_SIZE];
int char_index = 0, dictionary_size = 0, num_words = 0, i;
int total=0, largest=0;
float average = 0.0;
Phash_table t; //Pointer to main hash_table
int (*Phash_func)(char *)=NULL; //Function Pointers
int (*Pcmp_func)(void *, void *)=NULL;
Puser_data data_node; //pointer to hash table above
user_data * find;
printf("Parsing input ...\n");
Phash_func = hash_func; //Assigning Function pointers
Pcmp_func = cmp_func;
t = new_hash(1000,Phash_func,Pcmp_func);
// Read in characters until end is reached
while ((c = getchar()) != EOF) {
if ((c == ' ') || (c == ',') || (c == '.') || (c == '!') || (c == '"') ||
(c == ':') || (c == '\n')) {
// End of a word
if (char_index) {
// Word is not empty
word1[char_index] = '\0';
lower_case_word(word1);
data_node = (Puser_data)malloc(sizeof(user_data));
strcpy(data_node->word,word1);
printf("%s\n", data_node->word);
//!!!!!!SEG FAULT HERE!!!!!!
if (!((user_data *)find_hash(t, data_node->word))){ //SEG FAULT!!!!
insert_hash(t,word1,(void *)data_node);
}
char_index = 0;
num_words++;
}
} else {
// Continue assembling word
word1[char_index++] = c;
}
}
printf("There were %d words; %d unique words.\n", num_words,
dictionary_size);
dump_dictionary(t); //???
}
void lower_case_word(char *w){
int i = 0;
while (w[i] != '\0') {
w[i] = tolower(w[i]);
i++;
}
}
void dump_dictionary(Phash_table t){ //???
int i;
user_data *cur, *cur2;
stat_hash(t, &(t->total), &(t->largest), &(t->average)); //Call to stat hash
printf("Number of unique words: %d\n", t->total);
printf("Largest Bucket: %d\n", t->largest);
printf("Average Bucket: %f\n", t->average);
cur = start_hash_walk(t);
printf("%s: %d\n", cur->word, cur->freq_counter);
for (i = 0; i < t->total; i++)
cur2 = next_hash_walk(t);
printf("%s: %d\n", cur2->word, cur2->freq_counter);
}
int hash_func(char *string){
int i, sum=0, temp, index;
for(i=0; i < strlen(string);i++){
sum += (int)string[i];
}
index = sum % 1000;
return (index);
}
/*array1 and array2 point to the user defined data struct defined above*/
int cmp_func(void *array1, void *array2){
user_data *cur1= array1;
user_data *cur2= array2;//(user_data *)array2;
if(cur1->freq_counter < cur2->freq_counter){
return(-1);}
else{ if(cur1->freq_counter > cur2->freq_counter){
return(1);}
else return(0);}
}
hash.c
#include "hash.h"
Phash_table new_hash (int size, int(*hash_func)(char*), int(*cmp_func)(void*, void*)){
int i;
Phash_table t;
t = (Phash_table)malloc(sizeof(hash_table)); //creates the main hash table
t->buckets = (hash_entry **)malloc(sizeof(hash_entry *)*size); //creates the hash table of "size" buckets
t->size = size; //Holds the number of buckets
t->hash_func = hash_func; //assigning the pointer to the function in the user's program
t->cmp_func = cmp_func; // " "
t->total=0;
t->largest=0;
t->average=0;
t->sorted_array = NULL;
t->index=0;
t->sort_num=0;
for(i=0;i<size;i++){ //Sets all buckets in hash table to NULL
t->buckets[i] = NULL;}
return(t);
}
void free_hash(Phash_table table){
int i;
hash_entry *cur;
for(i = 0; i<(table->size);i++){
if(table->buckets[i] != NULL){
for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next){
free(cur->key); //Freeing memory for key and data
free(cur->data);
}
free(table->buckets[i]); //free the whole bucket
}}
free(table->sorted_array);
free(table);
}
void insert_hash(Phash_table table, char *key, void *data){
Phash_entry new_node; //pointer to a new node of type hash_entry
int index;
new_node = (Phash_entry)malloc(sizeof(hash_entry));
new_node->key = (char *)malloc(sizeof(char)*(strlen(key)+1)); //creates the key array based on the length of the string-based key
new_node->data = data; //stores the user's data into the node
strcpy(new_node->key,key); //copies the key into the node
//calling the hash function in the user's program
index = table->hash_func(key); //index will hold the hash table value for where the new node will be placed
table->buckets[index] = new_node; //Assigns the pointer at the index value to the new node
table->total++; //increment the total (total # of buckets)
}
void *find_hash(Phash_table table, char *key){
int i;
hash_entry *cur;
printf("Inside find_hash\n"); //REMOVE
for(i = 0;i<table->size;i++){
if(table->buckets[i]!=NULL){
for(cur = table->buckets[i]; cur->next != NULL; cur = cur->next){
if(strcmp(table->buckets[i]->key, key) == 0)
return((table->buckets[i]->data));} //returns the data to the user if the key values match
} //otherwise return NULL, if no match was found.
}
return NULL;
}
void stat_hash(Phash_table table, int *total, int *largest, float *average){
int node_num[table->size]; //creates an array, same size as table->size(# of buckets)
int i,j, count = 0;
int largest_buck = 0;
hash_entry *cur;
for(i = 0; i < table->size; i ++){
if(table->buckets[i] != NULL){
for(cur=table->buckets[i]; cur->next!=NULL; cur = cur->next){
count ++;}
node_num[i] = count;
count = 0;}
}
for(j = 0; j < table->size; j ++){
if(node_num[j] > largest_buck)
largest_buck = node_num[j];}
*total = table->total;
*largest = largest_buck;
*average = (table->total) / (table->size);
}
void *start_hash_walk(Phash_table table){
Phash_table temp = table;
int i, j, k;
hash_entry *cur; //CHANGE IF NEEDED to HASH_TABLE *
if(table->sorted_array != NULL) free(table->sorted_array);
table->sorted_array = (void**)malloc(sizeof(void*)*(table->total));
for(i = 0; i < table->total; i++){
if(table->buckets[i]!=NULL){
for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next){
table->sorted_array[i] = table->buckets[i]->data;
}}
}
for(j = (table->total) - 1; j > 0; j --) {
for(k = 1; k <= j; k ++){
if(table->cmp_func(table->sorted_array[k-1], table->sorted_array[k]) == 1){
temp -> buckets[0]-> data = table->sorted_array[k-1];
table->sorted_array[k-1] = table->sorted_array[k];
table->sorted_array[k] = temp->buckets[0] -> data;
}
}
}
return table->sorted_array[table->sort_num];
}
void *next_hash_walk(Phash_table table){
table->sort_num ++;
return table->sorted_array[table->sort_num];
}
hash.h
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct hash_entry_ { //Linked List
void *data; //Generic pointer
char *key; //String-based key value
struct hash_entry_ *next; //Self-Referencing pointer
} hash_entry, *Phash_entry;
typedef struct hash_table_ {
hash_entry **buckets; //Pointer to a pointer to a Linked List of type hash_entry
int (*hash_func)(char *);
int (*cmp_func)(void *, void *);
int size;
void **sorted_array; //Array used to sort each hash entry
int index;
int total;
int largest;
float average;
int sort_num;
} hash_table, *Phash_table;
Phash_table new_hash(int size, int (*hash_func)(char *), int (*cmp_func)(void *, void *));
void free_hash(Phash_table table);
void insert_hash(Phash_table table, char *key, void *data);
void *find_hash(Phash_table table, char *key);
void stat_hash(Phash_table table, int *total, int *largest, float *average);
void *start_hash_walk(Phash_table table);
void *next_hash_walk(Phash_table table);
Gettysburg.txt
Four score and seven years ago, our fathers brought forth upon this continent a new nation: conceived in liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war. . .testing whether that nation, or any nation so conceived and so dedicated. . . can long endure. We are met on a great battlefield of that war.
We have come to dedicate a portion of that field as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.
But, in a larger sense, we cannot dedicate. . .we cannot consecrate. . . we cannot hallow this ground. The brave men, living and dead, who struggled here have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember, what we say here, but it can never forget what they did here.
It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us. . .that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion. . . that we here highly resolve that these dead shall not have died in vain. . . that this nation, under God, shall have a new birth of freedom. . . and that government of the people. . .by the people. . .for the people. . . shall not perish from the earth.
One of several possible problems with this code is loops like:
for(table->buckets[i];
table->buckets[i]->next != NULL;
table->buckets[i] = table->buckets[i]->next)
...
The initializing part of the for loop (table->buckets[i]) has no effect. If i is 0 and table->buckets[0] == NULL, then the condition on this loop (table->buckets[i]->next != NULL) will dereference a null pointer and crash.
That's where your code seemed to be crashing on my box, at least. When I changed several of your loops to:
if (table->buckets[i] != NULL) {
for(;
table->buckets[i]->next != NULL;
table->buckets[i] = table->buckets[i]->next)
...
}
...it kept crashing, but in a different place. Maybe that will help get you unstuck?
Edit: another potential problem is that those for loops are destructive. When you call find_hash, do you really want all of those buckets to be modified?
I'd suggest using something like:
hash_entry *cur;
// ...
if (table->buckets[i] != NULL) {
for (cur = table->buckets[i]; cur->next != NULL; cur = cur->next) {
// ...
}
}
When I do that and comment out your dump_dictionary function, your code runs without crashing.
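Two further points in the updated hash.c look suspicious to me; treat them as hedged observations. The loops stop on cur->next != NULL, so the last node of a chain is never examined (and find_hash compares table->buckets[i]->key instead of cur->key), and insert_hash never sets new_node->next, which is uninitialised after malloc, and overwrites whatever the bucket already held. The sketch below shows chained insert and lookup on a deliberately small table. NBUCKETS, bucket_of, find_entry and insert_entry are hypothetical names, not part of the posted ADT.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct hash_entry_ {
    void *data;
    char *key;
    struct hash_entry_ *next;
} hash_entry;

#define NBUCKETS 8                      /* small on purpose, so collisions occur */

static hash_entry *buckets[NBUCKETS];   /* static storage: all chains start NULL */

/* Same idea as hash_func in parse.c: character sum modulo the bucket count. */
static unsigned bucket_of(const char *key) {
    unsigned sum = 0;
    while (*key != '\0')
        sum += (unsigned char) *key++;
    return sum % NBUCKETS;
}

/* Search only the chain the key hashes to. The loop tests cur, not cur->next,
   so the last node is compared too and an empty bucket needs no special case. */
void *find_entry(const char *key) {
    assert(key != NULL);
    for (hash_entry *cur = buckets[bucket_of(key)]; cur != NULL; cur = cur->next)
        if (strcmp(cur->key, key) == 0)
            return cur->data;
    return NULL;
}

/* Link the new node in front of the existing chain instead of replacing it.
   Returns 0 on success, -1 if an allocation fails. */
int insert_entry(const char *key, void *data) {
    assert(key != NULL);
    hash_entry *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->key = malloc(strlen(key) + 1);
    if (node->key == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->key, key);
    node->data = data;
    unsigned i = bucket_of(key);
    node->next = buckets[i];            /* next is always initialised */
    buckets[i] = node;
    return 0;
}
```

Hashing the key once and searching a single chain also avoids scanning all 1000 buckets on every lookup.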
Hmm,
here's hash.c
#include "hash.h"
Phash_table new_hash (int size, int(*hash_func)(char*), int(*cmp_func)(void*, void*)){
int i;
Phash_table t;
t = (Phash_table)calloc(1, sizeof(hash_table)); //creates the main hash table
t->buckets = (hash_entry **)calloc(size, sizeof(hash_entry *)); //creates the hash table of "size" buckets
t->size = size; //Holds the number of buckets
t->hash_func = hash_func; //assigning the pointer to the function in the user's program
t->cmp_func = cmp_func; // " "
t->total=0;
t->largest=0;
t->average=0;
for(i=0;t->buckets[i] != NULL;i++){ //Sets all buckets in hash table to NULL
t->buckets[i] = NULL;}
return(t);
}
void free_hash(Phash_table table){
int i;
hash_entry *cur, *next;
for(i = 0; i<(table->size);i++){
for(cur = table->buckets[i]; cur != NULL; cur = next){
next = cur->next; //save the link before the node is freed
free(cur->key); //Freeing memory for key and data
free(cur->data);
free(cur); //free the node itself
}
}
free(table->buckets); //free the bucket array
free(table->sorted_array);
free(table);
}
void insert_hash(Phash_table table, char *key, void *data){
Phash_entry new_node; //pointer to a new node of type hash_entry
int index;
new_node = (Phash_entry)calloc(1,sizeof(hash_entry));
new_node->key = (char *)malloc(sizeof(char)*(strlen(key)+1)); //creates the key array based on the length of the string-based key
new_node->data = data; //stores the user's data into the node
strcpy(new_node->key,key); //copies the key into the node
//calling the hash function in the user's program
index = table->hash_func(key); //index will hold the hash table value for where the new node will be placed
table->buckets[index] = new_node; //Assigns the pointer at the index value to the new node
table->total++; //increment the total (total # of buckets)
}
void *find_hash(Phash_table table, char *key){
int i;
hash_entry *cur;
printf("Inside find_hash\n"); //REMOVE
for(i = 0;i<table->size;i++){
if(table->buckets[i]!=NULL){
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
if(strcmp(cur->key, key) == 0)
return((cur->data));} //returns the data to the user if the key values match
} //otherwise return NULL, if no match was found.
}
return NULL;
}
void stat_hash(Phash_table table, int *total, int *largest, float *average){
int node_num[table->size];
int i,j, count = 0;
int largest_buck = 0;
hash_entry *cur;
for(i = 0; i < table->size; i ++)
{
if(table->buckets[i]!=NULL)
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
count ++;}
node_num[i] = count;
count = 0;
}
for(j = 0; j < table->size; j ++){
if(node_num[j] > largest_buck)
largest_buck = node_num[j];}
*total = table->total;
*largest = largest_buck;
*average = (table->total) /(float) (table->size); //oook: i think you want a fp average
}
void *start_hash_walk(Phash_table table){
void* temp = 0; //oook: this was another way of overwriting your input table
int i, j, k;
int l=0; //oook: new counter for elements in your sorted_array
hash_entry *cur;
if(table->sorted_array !=NULL) free(table->sorted_array);
table->sorted_array = (void**)calloc((table->total), sizeof(void*));
for(i = 0; i < table->size; i ++){
//for(i = 0; i < table->total; i++){ //oook: i don't think you meant total ;)
if(table->buckets[i]!=NULL)
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
table->sorted_array[l++] = cur->data;
}
}
//oook: sanity check/assert on expected values
if (l != table->total)
{
printf("oook: l[%d] != table->total[%d]\n",l,table->total);
}
for(j = (l) - 1; j > 0; j --) {
for(k = 1; k <= j; k ++){
if (table->sorted_array[k-1] && table->sorted_array[k])
{
if(table->cmp_func(table->sorted_array[k-1], table->sorted_array[k]) == 1){
temp = table->sorted_array[k-1]; //ook. changed temp to void* see assignment
table->sorted_array[k-1] = table->sorted_array[k];
table->sorted_array[k] = temp;
}
}
else
printf("if (table->sorted_array[k-1] && table->sorted_array[k])\n");
}
}
return table->sorted_array[table->sort_num];
}
void *next_hash_walk(Phash_table table){
/*oook: this was blowing up since you were incrementing past the size of sorted_array.
NB: you **need** to implement some bounds checking here or you will end up with more seg-faults!!*/
//table->sort_num++
return table->sorted_array[table->sort_num++];
}
here's parse.c
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <assert.h> //oook: added so you can assert ;)
#include "hash.h"
#define WORD_SIZE 40
#define DICTIONARY_SIZE 1000
#define TRUE 1
#define FALSE 0
void lower_case_word(char *);
void dump_dictionary(Phash_table );
/*Hash and compare functions*/
int hash_func(char *);
int cmp_func(void *, void *);
typedef struct user_data_ {
char word[WORD_SIZE];
int freq_counter;
} user_data, *Puser_data;
int main(void)
{
int c; //int, not char, so the comparison with EOF is reliable
char word1[WORD_SIZE];
int char_index = 0, dictionary_size = 0, num_words = 0, i;
int total=0, largest=0;
float average = 0.0;
Phash_table t; //Pointer to main hash_table
int (*Phash_func)(char *)=NULL; //Function Pointers
int (*Pcmp_func)(void *, void *)=NULL;
Puser_data data_node; //pointer to hash table above
user_data * find;
printf("Parsing input ...\n");
Phash_func = hash_func; //Assigning Function pointers
Pcmp_func = cmp_func;
t = new_hash(1000,Phash_func,Pcmp_func);
// Read in characters until end is reached
while ((c = getchar()) != EOF) {
if ((c == ' ') || (c == ',') || (c == '.') || (c == '!') || (c == '"') ||
(c == ':') || (c == '\n')) {
// End of a word
if (char_index) {
// Word is not empty
word1[char_index] = '\0';
lower_case_word(word1);
data_node = (Puser_data)calloc(1,sizeof(user_data));
strcpy(data_node->word,word1);
printf("%s\n", data_node->word);
//!!!!!!SEG FAULT HERE!!!!!!
if (!((user_data *)find_hash(t, data_node->word))){ //SEG FAULT!!!!
dictionary_size++;
insert_hash(t,word1,(void *)data_node);
}
char_index = 0;
num_words++;
}
} else {
// Continue assembling word
word1[char_index++] = c;
}
}
printf("There were %d words; %d unique words.\n", num_words,
dictionary_size);
dump_dictionary(t); //???
}
void lower_case_word(char *w){
int i = 0;
while (w[i] != '\0') {
w[i] = tolower(w[i]);
i++;
}
}
void dump_dictionary(Phash_table t){ //???
int i;
user_data *cur, *cur2;
stat_hash(t, &(t->total), &(t->largest), &(t->average)); //Call to stat hash
printf("Number of unique words: %d\n", t->total);
printf("Largest Bucket: %d\n", t->largest);
printf("Average Bucket: %f\n", t->average);
cur = start_hash_walk(t);
if (!cur) //ook: do test or assert for null values
{
printf("oook: null== (cur = start_hash_walk)\n");
exit(-1);
}
printf("%s: %d\n", cur->word, cur->freq_counter);
for (i = 0; i < t->total; i++)
{//oook: i think you needed these braces
cur2 = next_hash_walk(t);
if (!cur2) //ook: do test or assert for null values
{
printf("oook: null== (cur2 = next_hash_walk(t) at i[%d])\n",i);
}
else
printf("%s: %d\n", cur2->word, cur2->freq_counter);
}//oook: i think you needed these braces
}
int hash_func(char *string){
int i, sum=0, temp, index;
for(i=0; i < strlen(string);i++){
sum += (int)string[i];
}
index = sum % 1000;
return (index);
}
/*array1 and array2 point to the user defined data struct defined above*/
int cmp_func(void *array1, void *array2){
user_data *cur1= array1;
user_data *cur2= array2;//(user_data *)array2;
/* ooook: do assert on programmatic errors.
this function *requires non-null inputs. */
assert(cur1 && cur2);
if(cur1->freq_counter < cur2->freq_counter){
return(-1);}
else{ if(cur1->freq_counter > cur2->freq_counter){
return(1);}
else return(0);}
}
Follow the //oook comments.
Explanation:
There were one or two places this was going to blow up in.
The quick fix and answer to your question was in parse.c, circa L100:
cur = start_hash_walk(t);
printf("%s: %d\n", cur->word, cur->freq_counter);
..checking that cur is not null before calling printf fixes your immediate seg-fault.
But why would cur be null ? ~because of this bad-boy:
void *start_hash_walk(Phash_table table)
Your hash_func(char *string) can (& does) return non-unique values. This is of course OK, except that you have not yet implemented your linked-list chains, so insert_hash overwrites whatever was already in the bucket. Hence you end up with table->sorted_array containing fewer than table->total elements ~or you would if you were iterating over all table->size buckets ;)
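For what it's worth, a chained insert could look roughly like the sketch below. The chain_entry and chain_table types and all three function names are hypothetical, simplified stand-ins (hash.h isn't shown here), so treat this as one possible shape rather than the fix for your code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical, simplified stand-ins for hash_entry / hash_table. */
typedef struct chain_entry {
    char *key;
    void *data;
    struct chain_entry *next;
} chain_entry;

typedef struct chain_table {
    chain_entry **buckets;
    int size;
    int total;
} chain_table;

/* Additive hash in the spirit of hash_func, reduced modulo the table size. */
int chain_hash(const char *s, int size) {
    unsigned sum = 0;
    for (; *s != '\0'; s++)
        sum += (unsigned char)*s;
    return (int)(sum % (unsigned)size);
}

/* Push the new node on the front of its bucket's list; returns 0 on success. */
int chain_insert(chain_table *t, const char *key, void *data) {
    int index = chain_hash(key, t->size);
    chain_entry *node = calloc(1, sizeof *node);
    if (node == NULL)
        return -1;
    node->key = malloc(strlen(key) + 1);
    if (node->key == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->key, key);
    node->data = data;
    node->next = t->buckets[index];   /* keep whatever was already there */
    t->buckets[index] = node;
    t->total++;
    return 0;
}

/* Look only in the bucket the key hashes to. */
void *chain_find(const chain_table *t, const char *key) {
    const chain_entry *cur;
    for (cur = t->buckets[chain_hash(key, t->size)]; cur != NULL; cur = cur->next)
        if (strcmp(cur->key, key) == 0)
            return cur->data;
    return NULL;
}
```

With this shape, two keys that hash to the same index (say "ab" and "ba" with an additive hash) both stay reachable, and the number of nodes you can walk matches total again.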
There are one or two other issues.
For now I hacked Nate Kohl's for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next) further, to be for(cur=table->buckets[i]; cur != NULL; cur=cur->next), since you have no chains. But this is your TODO, so enough said about that.
Finally, note that in next_hash_walk(Phash_table table) you have:
table->sort_num++
return table->sorted_array[table->sort_num];
Ouch! Do check those array bounds!
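A bounds-checked walk might look something like this. walk_table is a hypothetical cut-down struct with only the fields the walk needs, and I'm assuming total is the number of valid slots in sorted_array:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cut-down table: only the fields the walk needs. */
typedef struct walk_table {
    void **sorted_array;
    int total;      /* number of valid slots in sorted_array */
    int sort_num;   /* index of the next slot to hand out */
} walk_table;

/* Return the next element, or NULL once the walk is exhausted. */
void *next_walk_checked(walk_table *t) {
    if (t->sorted_array == NULL || t->sort_num < 0 || t->sort_num >= t->total)
        return NULL;
    return t->sorted_array[t->sort_num++];
}
```

The caller can then loop until it gets NULL back instead of counting to total itself.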
Notes
1) If your function isn't designed to change its input, then make the input const. That way the compiler may well tell you when you're inadvertently trashing something.
2) Do bound checking on your array indices.
3) Do test/assert for Null pointers before attempting to use them.
4) Do unit test each of your functions; never write too much code before compiling & testing.
5) Use minimal test-data; craft it such that it limit-tests your code & attempts to break it in cunning ways.
6) Do initialise your data structures!
7) Never use egyptian braces ! {
only joking ;)
}
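As a rough illustration of notes 1) and 3), here is a sketch of a read-only pass over the buckets. The item type and longest_chain are invented for the example; the point is only that with the bucket array declared as item *const *, a loop like buckets[i] = buckets[i]->next no longer compiles:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical node type; only the link matters for this example. */
typedef struct item {
    struct item *next;
} item;

/* The bucket pointers are const here, so they cannot be reassigned. */
int longest_chain(item *const *buckets, int size) {
    int i, largest = 0;
    assert(buckets != NULL && size >= 0);   /* programmer error if violated */
    for (i = 0; i < size; i++) {
        int n = 0;
        const item *cur;
        for (cur = buckets[i]; cur != NULL; cur = cur->next)
            n++;
        if (n > largest)
            largest = n;
    }
    return largest;
}
```

The assert documents what the function requires, and the const turns the accidental overwrite into a compile-time error instead of a seg-fault later on.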
PS Good job so far ~> pointers are tricky little things! & a well asked question with all the necessary details so +1 and gl ;)
(//oook: maybe add a homework tag)