Structure pointer operator - c

The code is meant to create an array of pointers to student structures so that the array can be used in other functions. I'm not sure how to use the arrow operator in the binary search function. It doesn't return the index where the ID is found.
typedef struct{
int IDno;
char name[20];
int project;
int exam;
double final;
} student;
student **create_class_list(char*filename, int *sizePtr);
void print_list(student**list,int *sizePtr);
int find_binsrch(int idNo, student **list, int size,int low, int high);
int main(void){
int i, n;
student **listPtr;
listPtr = create_class_list("student.txt", &n);
print_list(listPtr,&n);
index2 = find_binsrch(searchID, listPtr, n, 1200, 4580);
}
student **create_class_list(char *filename, int *sizeptr){
int n,i;
FILE *fptr;
fptr=fopen(filename,"r");
if(fptr==NULL)
printf("The file could not be opened.\n");
else
fscanf(fptr, "%d",sizeptr);
n=*sizeptr;
student **list;
list = (student**)calloc(1, sizeof(student*));
for(i=0;i<n;i++){
list[i]=(student*)calloc(n,sizeof(student));
fscanf(fptr,"%d %[^\n]s", &(list[i]->IDno),(list[i]->name));
}
return list;
}
void print_list(student**list,int *sizePtr){
int i;
for(i=0; i<*sizePtr; i++){
printf("%d %s\n",&(list[i]->IDno),(list[i]->name));
}
}
int find_binsrch(int idNo, student **list, int size, int low, int high){
int middle, i;
while(low<=high){
middle =(low+high)/2;
printf("%d\n", middle);
if(idNo==list[middle]->IDno)
return list[i]->IDno;
if(idNo<list[middle]->IDno)
high = middle -1;
else
low = middle +1;
return -1;
}
}

What you must learn to do is enable warnings every time you compile. This allows the compiler to identify many areas in your code that need attention. You should not accept code that compiles with warnings. There are only very, very rare circumstances where it is acceptable to rely on code that compiles with warnings (none that you will likely encounter in your first year of programming). So always enable -Wall -Wextra as part of your compile string. (You can also enable -pedantic to see additional warnings, as well as some specific warning requests, but for general use -Wall -Wextra will do.)
Had you compiled with warnings you would have seen:
students3.c: In function ‘main’:
students3.c:23:5: error: ‘index2’ undeclared (first use in this function)
index2 = find_binsrch(searchID, listPtr, n, 1200, 4580);
^
students3.c:23:5: note: each undeclared identifier is reported only once for each function it appears in
students3.c:23:27: error: ‘searchID’ undeclared (first use in this function)
index2 = find_binsrch(searchID, listPtr, n, 1200, 4580);
^
students3.c:19:9: warning: unused variable ‘i’ [-Wunused-variable]
int i, n;
^
students3.c: In function ‘print_list’:
students3.c:53:9: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘int *’ [-Wformat=]
printf("%d %s\n",&(list[i]->IDno),(list[i]->name));
^
students3.c: In function ‘find_binsrch’:
students3.c:57:48: warning: unused parameter ‘size’ [-Wunused-parameter]
int find_binsrch(int idNo, student **list, int size, int low, int high){
^
students3.c: In function ‘main’:
students3.c:24:1: warning: control reaches end of non-void function [-Wreturn-type]
}
^
students3.c: In function ‘find_binsrch’:
students3.c:74:1: warning: control reaches end of non-void function [-Wreturn-type]
}
<snip>
Simply addressing the warnings/errors and recompiling (and addressing new warnings/errors disclosed from fixing the first list) will allow you to systematically correct your code. Taking these basic steps will allow you to correct your code to the point it will compile without warnings:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct{
int IDno;
char name[20];
int project;
int exam;
double final;
} student;
student **create_class_list(char*filename, int *sizePtr);
void print_list(student**list,int *sizePtr);
int find_binsrch(int idNo, student **list, int size,int low, int high);
int main(void){
int n, index2, searchID = 2;
student **listPtr = NULL;
listPtr = create_class_list("student.txt", &n);
if (!listPtr) {
fprintf (stderr, "error: create_class_list failed.\n");
return 1;
}
print_list(listPtr,&n);
index2 = find_binsrch(searchID, listPtr, n, 0, n - 1); /* low/high are array indices, not ID values */
if (index2) {} /* stub to eliminate unused warning */
return 0;
}
student **create_class_list(char *filename, int *sizeptr){
int n,i;
FILE *fptr;
fptr=fopen(filename,"r");
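if(fptr==NULL){
printf("The file could not be opened.\n");
return NULL; /* main checks for NULL */
}
if(fscanf(fptr, "%d",sizeptr)!=1){ /* the first value in the file is the record count */
fclose(fptr);
return NULL;
}
n=*sizeptr;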
student **list;
list = (student**)calloc(n, sizeof(student*));
for(i=0;i<n;i++){
list[i]=(student*)calloc(1,sizeof(student)); /* one student per slot, not n */
fscanf(fptr,"%d %[^\n]s", &(list[i]->IDno),(list[i]->name));
}
return list;
}
void print_list(student**list,int *sizePtr){
int i;
for(i=0; i<*sizePtr; i++){
printf("%d %s\n",list[i]->IDno, list[i]->name);
}
}
int find_binsrch(int idNo, student **list, int size, int low, int high)
{
int middle;
if (size) {} /* stub to eliminate unused warning */
while(low<=high){
middle =(low+high)/2;
printf("%d\n", middle);
if(idNo==list[middle]->IDno)
return middle; /* index of the match, not the ID itself */
if(idNo<list[middle]->IDno)
high = middle -1;
else
low = middle +1;
}
return -1;
}
Note: whether it runs correctly is a different question; that depends on your data and on eliminating any remaining logic errors.
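As a minimal sketch of what the question asks for (an index rather than an ID), the search below walks an array of pointers to student and uses -> on each element. It assumes the list is sorted by IDno in ascending order. The name find_index is hypothetical and not part of the code above.

```c
typedef struct {
    int IDno;
    char name[20];
    int project;
    int exam;
    double final;
} student;

/* Hypothetical helper: returns the index of idNo in list[0..size-1],
 * or -1 if it is absent. Assumes list is sorted by IDno, ascending. */
int find_index(int idNo, student **list, int size)
{
    int low = 0;
    int high = size - 1;

    while (low <= high) {
        int middle = low + (high - low) / 2;  /* avoids overflow of low + high */

        /* list[middle] is a student *, so members are reached with -> */
        if (idNo == list[middle]->IDno)
            return middle;
        if (idNo < list[middle]->IDno)
            high = middle - 1;
        else
            low = middle + 1;
    }
    return -1;
}
```

A caller would test the result against -1 before using it, for example to print list[idx]->name.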

In your binary search routine, your ifs need to compare idNo against the IDno member of the pointed-to struct, list[middle]->IDno, rather than against list[middle] itself.
You could simplify a bit by using a 1D array that gets realloc'ed rather than a 2D array of pointers. The entire code will be simpler and you won't lose any functionality.
UPDATE
I've switched your code to use an array of structs rather than an array of pointers to structs. It simplifies things, and the two-level lookup was just adding complexity that was probably tripping you up. I also cleaned up the style a bit. Sorry about that, but it's how I was able to see enough of your logic to make the changes.
Note: I agree completely with David [and many others] about compiler warnings. They are your friends. They usually show bugs that are 10x harder to find with a running program. I've been doing C for many years, and I [still] always use -Wall -Werror.
If you'd like to learn more about pointers to structs and arrays of structs, see my recent answer, "Issue implementing dynamic array of structures". It has a primer on the various ways to switch between arrays, pointers to arrays, indexing of pointers, etc. that may be useful.
Added a full diagnostic suite that proves the binsrch algorithm, including edge cases that might not appear for a given set of data before turning it loose on real/large data. A good technique to remember.
Note that I'm not sure why you passed low/high as arguments as they serve no purpose for binary search in general. They're useful if you wanted a specific subset of the data. If so, comment out my extra code resetting them.
// binsrch -- program to do binary search
#include <stdio.h>
#include <stdlib.h>
typedef struct {
int IDno;
char name[20];
int project;
int exam;
double final;
} student;
student *
create_class_list(char *filename,int *sizeptr)
{
int n;
int i;
FILE *fptr;
student *cur;
student *list;
fptr = fopen(filename,"r");
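if (fptr == NULL) {
printf("The file could not be opened.\n");
return NULL; // avoids reading from and closing a NULL stream below
}
if (fscanf(fptr,"%d",sizeptr) != 1) {
fclose(fptr);
return NULL;
}
n = *sizeptr;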
list = calloc(n,sizeof(student));
for (i = 0; i < n; i++) {
cur = &list[i];
fscanf(fptr,"%d %[^\n]s",&cur->IDno,cur->name);
}
fclose(fptr);
return list;
}
void
print_list(student *list,int count)
{
int i;
student *cur;
for (i = 0; i < count; i++) {
cur = &list[i];
printf("%d %s\n",cur->IDno,cur->name);
}
}
student *
find_binsrch(int idNo,student *list,int count,int low,int high)
{
student *cur;
int middle;
student *match;
match = NULL;
// what is the purpose of the limited range? -- ignore for now
low = 0;
high = count - 1;
while (low <= high) {
middle = (low + high) / 2;
cur = &list[middle];
//printf("find_binsrch: TRACE middle=%d\n",middle);
if (idNo == cur->IDno) {
match = cur;
break;
}
if (idNo < cur->IDno)
high = middle - 1;
else
low = middle + 1;
}
return match;
}
#define RAND0(_lim) \
(rand() % (_lim))
#define RAND1(_lim) \
(RAND0(_lim) + 1)
// diag_binsrch -- run diagnostic on single array size
void
diag_binsrch(int count)
{
student *list;
student *cur;
int searchidx;
student *match;
int err;
list = calloc(count,sizeof(student));
searchidx = 0;
cur = &list[searchidx];
cur->IDno = RAND1(30);
// create interesting data
++searchidx;
for (; searchidx < count; ++searchidx)
list[searchidx].IDno = list[searchidx - 1].IDno + RAND1(137);
err = 0;
// search for something lower than the lowest -- we _want_ it to fail
searchidx = 0;
cur = &list[searchidx];
match = find_binsrch(cur->IDno - 1,list,count,1200,4580);
if (match != NULL) {
printf("DIAG: expected failure -- searchidx=%d cur=%d match=%d\n",
searchidx,cur->IDno - 1,match->IDno);
++err;
}
// search for something higher than the highest -- we _want_ it to fail
searchidx = count - 1;
cur = &list[searchidx];
match = find_binsrch(cur->IDno + 1,list,count,0,count - 1);
if (match != NULL) {
printf("DIAG: expected failure -- searchidx=%d cur=%d match=%d\n",
searchidx,cur->IDno + 1,match->IDno);
++err;
}
// search for all remaining entries -- they should all match
cur = list;
for (searchidx = 0; searchidx < count; ++searchidx, ++cur) {
match = find_binsrch(cur->IDno,list,count,0,count - 1);
if (match == NULL) {
printf("DIAG: null return -- searchidx=%d IDno=%d\n",
searchidx,cur->IDno);
++err;
continue;
}
if (match->IDno != cur->IDno) {
printf("DIAG: mismatch -- searchidx=%d cur=%d match=%d\n",
searchidx,cur->IDno,match->IDno);
++err;
continue;
}
}
free(list);
if (err)
exit(1);
}
// diag_binsrch_full -- run full diagnostic
void
diag_binsrch_full(void)
{
int count;
printf("diag_binsrch_full: start ...\n");
for (count = 1; count < 1000; ++count)
diag_binsrch(count);
for (count = 1000; count <= 10000000; count *= 10)
diag_binsrch(count);
printf("diag_binsrch_full: complete\n");
}
int
main(void)
{
int listCount;
student *listPtr;
//student *cur;
//student *match;
// run diagnostic
diag_binsrch_full();
exit(0);
listPtr = create_class_list("student.txt",&listCount);
print_list(listPtr,listCount);
#if 0
match = find_binsrch(searchID,listPtr,n,1200,4580);
if (match != NULL)
printf("main: MATCH IDno=%d name='%s'\n",match->IDno,match->name);
#endif
return 0;
}
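For comparison, the standard library already provides a binary search, bsearch, in <stdlib.h>. The sketch below is an alternative to the hand-written loop, not what the code above does. It assumes the array of structs is sorted by IDno in ascending order; cmp_id and find_by_id are hypothetical names.

```c
#include <stdlib.h>

typedef struct {
    int IDno;
    char name[20];
    int project;
    int exam;
    double final;
} student;

/* Comparator for bsearch: key points to the int being searched for,
 * elem points to one student in the array. */
static int cmp_id(const void *key, const void *elem)
{
    int id = *(const int *)key;
    const student *s = elem;

    return (id > s->IDno) - (id < s->IDno);
}

/* Hypothetical wrapper: returns a pointer to the matching student,
 * or NULL if no element has that ID. */
student *find_by_id(int idNo, student *list, size_t count)
{
    return bsearch(&idNo, list, count, sizeof *list, cmp_id);
}
```

Like the find_binsrch above, this returns a pointer to the match or NULL, so it could be checked by the same kind of diagnostic loop.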

Related

My function is returning a segmentation fault error for apparently nothing wrong

I'm making a hash table data structure and getting a segmentation fault in my initialization function. Here is the code:
void allocTableSlots(alu **table, int index){
if(index == MAX)
return;
else{
table[index] = calloc(1, sizeof(alu));
table[index]->registration = -1;
table[index]->next = -1;
allocTableSlots(table, index+1);
}
}
void initializateHashTable(hash *hashing){
hashing = calloc(1, sizeof(hash));
allocTableSlots(hashing->table, 0);
hashing->collisionArea = 690;
}
My structs are these:
#define MAX 997
typedef struct alu{
int registration;
char name[80];
char email[80];
int next;
} alu;
typedef struct reg{
alu *table[MAX];
int collisionArea;
}hash;
The error comes in:
if(index == MAX)
on allocTableSlots() function
If I change MAX to MAX-1, or to any other number such as 500, the error still occurs after position 499, so it does not look like I am trying to access an invalid position in my table array.
I have already tried an iterative version (in case my recursion had an error), but the result is the same.
As suggested in the comments, you most likely should just return the pointer to the allocated block from the init function. Furthermore, if the maximum bucket size is known, as is in your code with MAX, the code simplifies to:
...
typedef struct reg {
alu table[MAX];
int collisionArea;
} hash;
hash *initializateHashTable(void) {
hash *t = calloc(1, sizeof *t);
if (!t) return NULL; // check calloc, just in case.
/* Whatever initialization you want to perform. As per your code,
setting registration and next members to -1 */
for (int i = 0; i < MAX; i++) {
t->table[i].registration = t->table[i].next = -1;
}
t->collisionArea = 690; // EDIT: Forgot the collisionArea
return t;
}
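One issue with the original initializateHashTable is that hashing is a local copy of the caller's pointer, so the result of calloc never reaches the caller. Returning the pointer, as above, is one fix. A second option, sketched below under the assumption that the caller wants a status code, is to pass the address of the pointer. The name init_hash_table is hypothetical.

```c
#include <stdlib.h>

#define MAX 997

typedef struct alu {
    int registration;
    char name[80];
    char email[80];
    int next;
} alu;

typedef struct reg {
    alu table[MAX];
    int collisionArea;
} hash;

/* Hypothetical out-parameter variant: stores the new table in *out.
 * Returns 0 on success, -1 if the allocation fails. */
int init_hash_table(hash **out)
{
    hash *t = calloc(1, sizeof *t);

    if (!t)
        return -1;
    for (int i = 0; i < MAX; i++)
        t->table[i].registration = t->table[i].next = -1;
    t->collisionArea = 690;
    *out = t;   /* this write is visible to the caller */
    return 0;
}
```

The caller would declare hash *h = NULL, call init_hash_table(&h), and release the table later with free(h).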

Reading string from array of pointers

How can I read each individual character from a string that is accessed through an array of pointers? In the below code I currently have generated an array of pointers to strings called, symCodes, in my makeCodes function. I want to read the strings 8 characters at a time, I thought about concatenating each string together, then looping through that char by char but the strings in symCodes could be up to 255 characters each, so I feel like that could possibly be too much all to handle at once. Instead, I thought I could read each character from the strings, character by character.
I've tried scanf or just looping through and always end up with seg faults. At the end of headerEncode(), it's near the bottom. I malloc enough memory for each individual string, I try to loop through the array of pointers and print out each individual character but am ending up with a seg fault.
Any suggestions of a different way to read an array of pointers to strings, character by character, up to n amount of characters is appreciated.
EDIT 1: I've updated the program to no longer output warnings when using the -Wall and -W flags. I'm no longer getting a seg fault(yay!) but I'm still unsure of how to go about my question, how can I read an array of pointers to strings, character by character, up to n amount of characters?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "huffman.h"
#define FAIL 0
#define SUCCESS 1
/* global 1D arrays that hold chars and their freqs from file */
unsigned long globalFreqs[256] = {0};
unsigned char globalUsedCh[256] = {0};
char globalCodes[256] = {0};
unsigned char globalUniqueSymbols;
unsigned long totalCount = 0;
typedef struct HuffmanTreeNode* HTNode;
struct HuffmanTreeNode* globalSortedLL;
/*
struct has the input letter, the letter's frequency, and the left and right children
*/
struct HuffmanTreeNode
{
char symbol;
unsigned long freq;
char *code;
struct HuffmanTreeNode *left, *right;
struct HuffmanTreeNode* next;
};
/* does it make sense to have a struct for the entire huffman tree to see its size? */
struct HuffmanTree
{
unsigned size;
};
/*generate new node with given symbol and freq */
struct HuffmanTreeNode* newNode(char symbol, int freq)
{
struct HuffmanTreeNode* newNode = malloc(sizeof(struct HuffmanTreeNode));
newNode->symbol = symbol;
newNode->freq = freq;
newNode->left = newNode->right = NULL;
return newNode;
}
/*current work in progress, i believe this is the way to insert it for a BST
/* will change for HuffmanTreenode once working
/*
*/
struct HuffmanTreeNode* insert(struct HuffmanTreeNode* node, struct HuffmanTreeNode* htnNew)
{
struct HuffmanTreeNode* currentNode = node;
if(currentNode == NULL || compareTwoNodes(htnNew, currentNode))
{
htnNew->next = currentNode;
return htnNew;
}
else
{
while(currentNode->next != NULL && compareTwoNodes(currentNode->next, htnNew))
{
currentNode = currentNode->next;
}
htnNew->next = currentNode->next;
currentNode->next = htnNew;
return node;
}
}
int compareTwoNodes(struct HuffmanTreeNode* a, struct HuffmanTreeNode* b)
{
if(b->freq < a->freq)
{
return 0;
}
if(a->freq == b->freq)
{
if(a->symbol > b->symbol)
return 1;
return 0;
}
if(b->freq > a->freq)
return 1;
}
struct HuffmanTreeNode* popNode(struct HuffmanTreeNode** head)
{
struct HuffmanTreeNode* node = *head;
*head = (*head)->next;
return node;
}
/*convert output to bytes from bits*/
/*use binary fileio to output */
/*put c for individual character byte*/
/*fwrite each individual byte for frequency of symbol(look at fileio slides) */
/*
#function:
#param:
#return:
*/
int listLength(struct HuffmanTreeNode* node)
{
struct HuffmanTreeNode* current = node;
int length = 0;
while(current != NULL)
{
length++;
current = current->next;
}
return length;
}
/*
#function:
#param:
#return:
*/
void printList(struct HuffmanTreeNode* node)
{
struct HuffmanTreeNode* currentNode = node;
while(currentNode != NULL)
{
if(currentNode->symbol <= ' ' || currentNode->symbol > '~')
printf("=%d", currentNode->symbol);
else
printf("%c", currentNode->symbol);
printf("%lu ", currentNode->freq);
currentNode = currentNode->next;
}
printf("\n");
}
/*
#function:
#param:
#return:
*/
void buildSortedList()
{
int i;
for(i = 0; i < 256; i++)
{
if(!globalFreqs[i] == 0)
{
globalSortedLL = insert(globalSortedLL, newNode(i, globalFreqs[i]));
}
}
printf("Sorted freqs: ");
printList(globalSortedLL);
printf("listL: %d\n", listLength(globalSortedLL));
}
/*
#function: isLeaf()
will test to see if the current node is a leaf or not
#param:
#return
*/
int isLeaf(struct HuffmanTreeNode* node)
{
if((node->left == NULL) && (node->right == NULL))
return SUCCESS;
else
return FAIL;
}
/*where I plan to build the actual huffmantree */
/*
#function:
#param:
#return:
*/
struct HuffmanTreeNode* buildHuffmanTree(struct HuffmanTreeNode* node)
{
int top = 0;
struct HuffmanTreeNode *left, *right, *topNode, *huffmanTree;
struct HuffmanTreeNode* head = node;
struct HuffmanTreeNode *newChildNode, *firstNode, *secondNode;
while(head->next != NULL)
{
/*grab first two items from linkedL, and remove two items*/
firstNode = popNode(&head);
secondNode = popNode(&head);
/*combine sums, use higher symbol, create new node*/
newChildNode = newNode(secondNode->symbol, (firstNode->freq + secondNode->freq));
newChildNode->left = firstNode;
newChildNode->right = secondNode;
/*insert new node, decrement total symbols in use */
head = insert(head, newChildNode);
}
return head;
}
void printTable(char *codesArray[])
{
int i;
printf("Symbol\tFreq\tCode\n");
for(i = 0; i < 256; i++)
{
if(globalFreqs[i] != 0)
{
if(i <= ' ' || i > '~')
{
printf("=%d\t%lu\t%s\n", i, globalFreqs[i], codesArray[i]);
}
else
{
printf("%c\t%lu\t%s\n", i, globalFreqs[i], codesArray[i]);
}
}
}
printf("Total chars = %lu\n", totalCount);
}
void makeCodes(
struct HuffmanTreeNode *node, /* Pointer to some tree node */
char *code, /* The *current* code in progress */
char *symCodes[256], /* The array to hold the codes for all the symbols */
int depth) /* How deep in the tree we are (code length) */
{
char *copiedCode;
int i = 0;
if(isLeaf(node))
{
code[depth] = '\0';
symCodes[node->symbol] = code;
return;
}
copiedCode = malloc(255*sizeof(char));
memcpy(copiedCode, code, 255*sizeof(char));
code[depth] = '0';
copiedCode[depth] = '1';
makeCodes(node->left, code, symCodes, depth+1);
makeCodes(node->right, copiedCode, symCodes, depth+1);
}
/*
#function: getFileFreq()
gets the frequencies of each character in the given
file from the command line, this function will also
create two global 1d arrays, one for the currently
used characters in the file, and then one with those
characters frequencies, the two arrays will line up
parallel
#param: FILE* in, FILE* out,
the current file being processed
#return: void
*/
void getFileFreq(FILE* in, FILE* out)
{
unsigned long freqs[256] = {0};
int i, t, fileCh;
while((fileCh = fgetc(in)) != EOF)
{
freqs[fileCh]++;
totalCount++;
}
for(i = 0; i < 256; i++)
{
if(freqs[i] != 0)
{
globalUsedCh[i] = i;
globalFreqs[i] = freqs[i];
if(i <= ' ' || i > '~')
{
globalUniqueSymbols++;
}
else
{
globalUniqueSymbols++;
}
}
}
/* below code until total count is for debugging purposes */
printf("Used Ch: ");
for(t = 0; t < 256; t++)
{
if(globalUsedCh[t] != 0)
{
if(t <= ' ' || t > '~')
{
printf("%d ", globalUsedCh[t]);
}
else
printf("%c ", globalUsedCh[t]);
}
}
printf("\n");
printf("Freq Ch: ");
for(t = 0; t < 256; t++)
{
if(globalFreqs[t] != 0)
{
printf("%lu ", globalFreqs[t]);
}
}
printf("\n");
/* end of code for debugging/vizualazation of arrays*/
printf("Total Count %lu\n", totalCount);
printf("globalArrayLength: %d\n", globalUniqueSymbols);
}
void headerEncode(FILE* in, FILE* out, char *symCodes[256])
{
char c;
int i, ch, t, q, b, z;
char *a;
char *fileIn;
unsigned char *uniqueSymbols;
unsigned char *byteStream;
unsigned char *tooManySym = 0;
unsigned long totalEncodedSym;
*uniqueSymbols = globalUniqueSymbols;
totalEncodedSym = ftell(in);
rewind(in);
fileIn = malloc((totalEncodedSym+1)*sizeof(char));
fread(fileIn, totalEncodedSym, 1, in);
if(globalUniqueSymbols == 256)
{
fwrite(tooManySym, 1, sizeof(char), out);
}
else
{
fwrite(uniqueSymbols, 1, sizeof(uniqueSymbols)-7, out);
}
for(i = 0; i < 256; i++)
{
if(globalFreqs[i] != 0)
{
fwrite(globalUsedCh+i, 1, sizeof(char), out);
fwrite(globalFreqs+i, 8, sizeof(char), out);
}
}
for(t = 0; t < totalEncodedSym; t++)
{
fwrite(symCodes[fileIn[t]], 8, sizeof(char), out);
}
for(q = 0; q < totalEncodedSym; q++)
{
symCodes[q] = malloc(255*sizeof(char));
a = symCodes[q];
while(*a != '\0')
printf("%c\n", *(a++));
}
printf("Total encoded symbols: %lu\n", totalEncodedSym);
printf("%s\n", fileIn);
}
void encodeFile(FILE* in, FILE* out)
{
int top = 0;
int i;
char *code;
char *symCodes[256] = {0};
int depth = 0;
code = malloc(255*sizeof(char));
getFileFreq(in, out);
buildSortedList();
makeCodes(buildHuffmanTree(globalSortedLL), code, symCodes, depth);
printTable(symCodes);
headerEncode(in, out, symCodes);
free(code);
}
/*
void decodeFile(FILE* in, FILE* out)
{
}*/
There are many problems in your code:
[major] function compareTwoNodes does not always return a value. The compiler can detect such problems if instructed to output more warnings.
[major] the member symbol in the HuffmanTreeNode should have type int. Type char is problematic as an index value because it can be signed or unsigned depending on compiler configuration and platform specificities. You assume that char has values from 0 to 255, which is incorrect for most platforms where char actually has a range of -128 .. 127. Use unsigned char or int but cast the char values to unsigned char to ensure proper promotion.
[major] comparison if (globalUniqueSymbols == 256) is always false because globalUniqueSymbols is an unsigned char. The maximum number of possible byte values is indeed 256 for 8-bit bytes, but it does not fit in an unsigned char, make globalUniqueSymbols an int.
[major] *uniqueSymbols = globalUniqueSymbols; in function headerEncode stores globalUniqueSymbols into an uninitialized pointer, definitely undefined behavior, probable segmentation fault.
[major] sizeof(uniqueSymbols) is the size of a pointer, not the size of the array nor the size of the type. Instead of hacking it as sizeof(uniqueSymbols)-7, use fputc(globalUniqueSymbols, out);
[major] fwrite(tooManySym, 1, sizeof(char), out); is incorrect too, since tooManySym is initialized to 0, ie: it is a NULL pointer. You need a special value to tell that all bytes values are used in the source stream, use 0 for that and write it with fputc(0, out);.
You have nested C style comments before function insert, this is not a bug but error prone and considered bad style.
function newNode should take type unsigned long for freq for consistency.
function buildHuffmanTree has unused local variables: right, top and topNode.
variable i is unused in function makeCodes.
many unused variables in headerEncode: byteStream, c, ch, b...
totalEncodedSym is an unsigned long, use an index of the proper type in the loops where you stop at totalEncodedSym.
unused variables in encodeFile: i, top...
Most of these can be detected by the compiler with the proper warning level: gcc -Wall -W or clang -Weverything...
There are probably also errors in the program logic, but you cannot see these until you fix the major problems above.
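Two of the major points above can be sketched briefly: indexing a 256-entry table through unsigned char, and a comparison function that returns a value on every path. This is a minimal sketch, not the asker's code. The names count_freqs, struct node and compare_two_nodes are hypothetical, and the ordering rule is assumed to match the one in compareTwoNodes (lower frequency first, ties broken by higher symbol).

```c
#include <stddef.h>

/* Count byte frequencies in buf. Each char is converted to unsigned char
 * before indexing, so bytes >= 128 cannot turn into negative indices. */
void count_freqs(const char *buf, size_t len, unsigned long freqs[256])
{
    for (size_t i = 0; i < len; i++)
        freqs[(unsigned char)buf[i]]++;
}

struct node {
    int symbol;           /* int holding a byte value in 0..255 */
    unsigned long freq;
};

/* Returns 1 if a has the lower frequency, or the same frequency and the
 * higher symbol; returns 0 otherwise. Every path returns a value. */
int compare_two_nodes(const struct node *a, const struct node *b)
{
    if (a->freq != b->freq)
        return a->freq < b->freq;
    return a->symbol > b->symbol;
}
```

The same cast applies wherever a byte read from the file is used as an index, for example symCodes[(unsigned char)fileIn[t]].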

Comparing contents of files for groupings of words

Background:
I am currently working on a project. The main objective is to read files and compare groupings of words. Only user interaction will be to specify group length. My programs are placed into a directory. Inside that directory, there will be multiple textfiles(Up to 30). I use
system("ls /home/..... > inputfile.txt");
From there, I open the files from inputfile.txt to read for their contents.
Now to the actual question/problem part.
The method I am using for this is a queue, because it is FIFO. (Code for link.c: http://pastebin.com/rLpVGC00)
link.c
#include <stdlib.h>
#include <stdio.h>
#include <math.h>
#include <string.h>
#include "linkedlist.h"
struct linkedList
{
char *data;
int key;
int left;
int right;
int size;
};
LinkedList createLinkedList(int size)
{
LinkedList newLL = malloc(sizeof *newLL);
newLL->data = malloc(sizeof(int) * (size+1));
newLL->size = size;
newLL->left = 0;
newLL->right = 0;
return newLL;
}
bool isFull(LinkedList LL)
{
return abs(abs(LL->left)- abs(LL->right)) == LL->size;
}
void insertFront(LinkedList LL, char *newInfo)
{
if(isFull(LL))
{
printf("FULL");
exit(1);
}
LL->data[((--(LL->left) % LL->size) + LL->size) % LL->size] = newInfo;
}
bool isEmpty(LinkedList LL)
{
return LL->left == LL->right;
}
const char * removeEnd(LinkedList LL)
{
if(isEmpty(LL))
{
return "EMPTY";
//exit(1);
}
return LL->data[((--(LL->right) % LL->size) + LL->size) % LL->size];
}
I get two warnings when I compile with link.c and my main (Start11.c)
link.c: In function ‘insertFront’:
link.c:39:64: warning: assignment makes integer from pointer without a cast [enabled by default]
LL->data[((--(LL->left) % LL->size) + LL->size) % LL->size] = newInfo;
^
link.c: In function ‘removeEnd’:
link.c:54:5: warning: return makes pointer from integer without a cast [enabled by default]
return LL->data[((--(LL->right) % LL->size) + LL->size) % LL->size];
^
Full start11.c code: http://pastebin.com/eskn5yxm
The bulk of the read() function that I have questions about:
fp = fopen(filename, "r");
//We want two word or three word or four word PHRASES
for (i = 0; fgets(name, 100, fp) != NULL && i < 31; i++)
{
char *token = NULL; //setting to null before using it to strtok
token = strtok(name, ":");
strtok(token, "\n");//Getting rid of that dirty \n that I hate
strcat(fnames[i], token);
char location[350];
//Copying it back to a static array to avoid erros with fopen()
strcpy(location, fnames[i]);
//Opening the files for their contents
fpp = fopen(location, "r");
printf("\nFile %d:[%s] \n", i+1, fnames[i]);
char* stringArray[400];
//Reading the actual contents
int y;
for(j = 0; fgets(info,1600,fpp) != NULL && j < 1600; j++)
{
for( char *token2 = strtok(info," "); token2 != NULL; token2 = strtok(NULL, " ") )
{
puts(token2);
++y;
stringArray[y] = strdup(token2);
insertFront(index[i],stringArray[y]);
}
}
}
//Comparisons
char take[20010],take2[200100], take3[200100],take4[200100];
int x,z;
int count, count2;
int groupV,groupV2;
for(x = 0; x < 10000; ++x)
{
if(removeEnd(index[0])!= "EMPTY")
{
take[x] = removeEnd(index[0]);
}
if(removeEnd(index[1])!= "EMPTY")
{
take2[x] = removeEnd(index[1]);
}
if(removeEnd(index[2])!= "EMPTY")
{
take3[x] = removeEnd(index[2]);
}
}
for(z = 0; z < 10; z++)
{
if(take[z] == take2[z])
{
printf("File 1 and File 2 are similar\n");
++count;
if(count == groupL)
{
++groupV;
}
}
if(take[z] == take3[z])
{
printf("File 1 and File 3 are similar\n");
++count2;
if(count == groupL)
{
++groupV2;
}
}
}
Are those two warnings the reason why the comparison of the files is not correct? (Yes, I realize I "hardcoded" the comparisons. That is just temporary until I get some of this down...)
I'll post header files as a comment. Won't let me post more than two links.
Additional notes:
removeEnd() returns "EMPTY" if there is nothing left to remove.
insertFront() is a void function.
Before I created this account so I could post, I read a previous question regarding strtok and how, if I want to insert something, I have to strdup() it.
I have not added my free functions to my read() function. I will do that last too.
start.h (pastebin.com/NTnEAPYE)
#ifndef START_H
#define START_H
#include "linkedlist.h"
void read(LinkedList LL,char* filename, int lineL);
#endif
linkedlist.h (pastebin.com/ykzbnCTV)
#include <stdlib.h>
#include <stdio.h>
#ifndef LINKEDLIST_H
#define LINKEDLIST_H
typedef int bool;
typedef struct linkedList *LinkedList;
LinkedList createLinkedList(int size);
bool isFull(LinkedList LL);
void insertFront(LinkedList LL, char *newInfo);
const char * removeEnd(LinkedList LL);
bool isEmpty(LinkedList LL);
#endif
The main problem is around the removeEnd (resp. insertFront) function:
const char * removeEnd(LinkedList LL)
{
if (...)
return "EMPTY";
else
return LL->data[xxx];
You return a const char * in the first return but a char in the second, hence the warning, which is a serious one: data is declared as char *, so each slot holds a single char rather than a pointer to a string.
And comparing the return value to "EMPTY" with != in the caller is just wrong: that compares addresses, not contents. Use strcmp instead. Two identical string literals may happen to share one address because some compilers merge them, but that is only by chance and is not portable.
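A minimal sketch of a queue that stores string pointers, assuming a fixed capacity greater than zero. It uses a char ** buffer so each slot holds a pointer, and it signals "empty" with NULL instead of a sentinel string. All names here (str_queue, queue_create, queue_push, queue_pop, queue_destroy, same_word) are hypothetical and not part of the posted linkedlist.h.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical fixed-capacity FIFO of string pointers. */
struct str_queue {
    char **data;       /* one char * per slot, not one char */
    size_t head;       /* index of the oldest element */
    size_t count;      /* number of stored elements */
    size_t capacity;
};

struct str_queue *queue_create(size_t capacity)
{
    struct str_queue *q = malloc(sizeof *q);

    if (!q)
        return NULL;
    q->data = malloc(capacity * sizeof *q->data);
    if (!q->data) {
        free(q);
        return NULL;
    }
    q->head = 0;
    q->count = 0;
    q->capacity = capacity;
    return q;
}

/* Stores the pointer only (no copy). Returns 0, or -1 if the queue is full. */
int queue_push(struct str_queue *q, char *s)
{
    if (q->count == q->capacity)
        return -1;
    q->data[(q->head + q->count) % q->capacity] = s;
    q->count++;
    return 0;
}

/* Returns the oldest string, or NULL when the queue is empty. */
char *queue_pop(struct str_queue *q)
{
    char *s;

    if (q->count == 0)
        return NULL;
    s = q->data[q->head];
    q->head = (q->head + 1) % q->capacity;
    q->count--;
    return s;
}

void queue_destroy(struct str_queue *q)
{
    free(q->data);
    free(q);
}

/* Compares contents; == on two char * values would compare addresses only. */
int same_word(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}
```

With this shape the caller checks for NULL from queue_pop and then compares words with same_word, rather than comparing against "EMPTY".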

Overlapping and too long integer values in dynamic c structs

I have the following problem.
I need to create a list of savestates with dynamic length. That's why I decided to define some structs and link dynamically created instances together to build a list that can be extended at run time.
However, some things seem to not work at all. Here's the relevant code first:
saves.h:
#ifndef SAVES_H
#include<time.h>
#define SAVES_H
#define SVS_STRLEN 500
#define SVS_FILE "savefile.dat"
#define True 1
#define False 0
typedef struct SVS_STATE SVS_STATE;
typedef struct SVS_STATES SVS_STATES;
struct SVS_STATE {
int i_playfield[6][7];
int i_turn;
time_t i_time;
void *next;
};
struct SVS_STATES {
SVS_STATE *states;
int count;
int loaded;
};
void SVS_Add_State(int i_playfield[][7], int i_turn, time_t i_time);
void SVS_Debug_State(SVS_STATE *state);
void SVS_Format_State(SVS_STATE *state, char text[]);
SVS_STATE *SVS_Get_State(int number);
#endif
saves.c:
#include "saves.h"
#include<string.h>
#include<time.h>
SVS_STATE *SVS_Get_State(int number)
{
int i = 1;
SVS_STATE *state;
if (svs_current_state.loaded == False) return NULL;
if (number > svs_current_state.count) return NULL;
state = svs_current_state.states;
printf("printing state 1:");
SVS_Debug_State(state);
while( i < number)
{
i++;
state = (SVS_STATE*)(state->next);
printf("printing state %i:", i);
SVS_Debug_State(state);
}
return state;
}
void SVS_Format_State(SVS_STATE *state, char text[])
{
int i, j;
if (svs_current_state.loaded == False) return;
text[0] = '\0';
strcat(text, "{\0");
for (i = 0; i < X_SIZE; i++)
{
strcat(text, "{\0");
for(j = 0; j < Y_SIZE; j++)
{
strcat(text, "%i,\0");
sprintf(text, text, state->i_playfield[i][j]);
}
strcat(text, "}\0");
}
strcat(text, "};%i;%i\n\0");
sprintf(text, text, state->i_turn, state->i_time);
printf("\nFormatted state:%s\n", text);
}
void SVS_Debug_State(SVS_STATE *state)
{
char text[SVS_STRLEN];
SVS_Format_State(state, text);
printf("%s\n", text);
}
void SVS_Add_State(int i_playfield[][7], int i_turn, time_t i_time)
{
int i, j;
SVS_STATE *laststate, *newstate;
newstate = (SVS_STATE*)malloc(sizeof(SVS_STATE));
printf("adding state with time:%i\n", i_time);
if (svs_current_state.loaded == False) return;
for (i = 0; i < 6; i++)
for (j = 0; j < 7; j++)
newstate->i_playfield[i][j] = i_playfield[i][j];
newstate->i_turn = i_turn;
newstate->i_time = i_time;
newstate->next = NULL;
printf("initialized state:");
SVS_Debug_State(newstate);
if (svs_current_state.count > 0)
{
laststate = SVS_Get_State(svs_current_state.count);
laststate->next = (void*)newstate;
} else
svs_current_state.states=newstate;
svs_current_state.count++;
}
int main()
{
int i_playfield[6][7] = {0};
// mark saves library as loaded here, but removed function, since it
// just sets svs_current_state.loaded (which is the global struct of
// type SVS_STATES) to 1
SVS_Add_State(i_playfield, 1, time(NULL));
i_playfield[0][0] = 2;
SVS_Add_State(i_playfield, 2, time(NULL));
return 0;
}
The actual problems I encountered while using the printf's and Debug_State calls in these functions:
- The i_time I pass in is printed once in Add_State(), correctly, so it is a valid time. But when it is printed after creating the full state with Format_State(), the string is 50 percent too long and the last part is displayed twice. For example:
if the time is 12345678, it is displayed correctly while debugging in Add_State, but Format_State() displays 123456785678.
- second problem: the first state added works, more or less, fine. But after adding a second one, printing the first state (retrieved by using Get_State and formatted with Format_State) prints a mixture of two states, for example something like this:
state 1: {{0,0,0,0,0,0,0}{0,0,0,0,0,0,0}{0,0,0,0,0,0,0}...
{0,0,0,0,0,0}};1;123456785678
state 2: {{0,0,0,0,0,0}{0,0,0,0,0,0}...
{0,0,0,0,0,0}};2;1234567856785678,0}{0,0,0,0,0,0}...
Thanks for reading.
These calls
sprintf(text, text, ...
invoke undefined behaviour, because text is passed both as the target buffer and as the format string, so the source and destination overlap.
From the POSIX specification for sprintf():
If copying takes place between objects that overlap as a result of a call to sprintf() [...], the results are undefined.
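One way to avoid the overlap is to never pass text as both destination and format, and instead append each piece at the current end of the buffer with snprintf. Here is a minimal sketch, assuming a hypothetical helper format_row (not part of the original code) that formats one row of the playfield in the same "{a,b,...,}" shape:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not in the original code): writes "{v0,v1,...,}"
   for one row into out. Returns the string length, or -1 if it does not fit. */
int format_row(char *out, size_t cap, const int *row, size_t n)
{
    size_t used = 0;
    size_t i;
    int w;

    assert(out != NULL && row != NULL && cap > 0);

    w = snprintf(out, cap, "{");
    if (w < 0 || (size_t)w >= cap)
        return -1;
    used += (size_t)w;

    for (i = 0; i < n; i++) {
        /* Write at the current end of the string: the format is a literal
           and the destination starts past everything already written,
           so source and destination never overlap. */
        w = snprintf(out + used, cap - used, "%d,", row[i]);
        if (w < 0 || (size_t)w >= cap - used)
            return -1;
        used += (size_t)w;
    }

    w = snprintf(out + used, cap - used, "}");
    if (w < 0 || (size_t)w >= cap - used)
        return -1;
    used += (size_t)w;

    return (int)used;
}
```

The same pattern should extend to the full state: keep one running offset, append each row, then append the turn and the time, always writing at text + used.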

Seg. Fault in Hash Table ADT - C

Edit:
hash.c is updated with revisions from the comments, but I am still getting a seg fault. I must be missing something that you guys are saying.
I have created a hash table ADT using C, but I am encountering a segmentation fault when I try to call a function (find_hash) in the ADT.
I have posted all 3 files that I created (parse.c, hash.c, and hash.h) so you can see all of the variables. We are reading from the file gettysburg.txt, which is also attached.
The seg fault is occurring in parse.c when I call find_hash. I cannot figure out for the life of me what is going on here. If you need any more information I can surely provide it.
Sorry for the large amount of code; I have been completely stumped on this for a week now. Thanks in advance.
The way I run the program is first:
gcc -o parse parse.c hash.c
then: cat gettysburg.txt | parse
Parse.c
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include "hash.h"
#define WORD_SIZE 40
#define DICTIONARY_SIZE 1000
#define TRUE 1
#define FALSE 0
void lower_case_word(char *);
void dump_dictionary(Phash_table );
/*Hash and compare functions*/
int hash_func(char *);
int cmp_func(void *, void *);
typedef struct user_data_ {
char word[WORD_SIZE];
int freq_counter;
} user_data, *Puser_data;
int main(void)
{
char c, word1[WORD_SIZE];
int char_index = 0, dictionary_size = 0, num_words = 0, i;
int total=0, largest=0;
float average = 0.0;
Phash_table t; //Pointer to main hash_table
int (*Phash_func)(char *)=NULL; //Function Pointers
int (*Pcmp_func)(void *, void *)=NULL;
Puser_data data_node; //pointer to hash table above
user_data * find;
printf("Parsing input ...\n");
Phash_func = hash_func; //Assigning Function pointers
Pcmp_func = cmp_func;
t = new_hash(1000,Phash_func,Pcmp_func);
// Read in characters until end is reached
while ((c = getchar()) != EOF) {
if ((c == ' ') || (c == ',') || (c == '.') || (c == '!') || (c == '"') ||
(c == ':') || (c == '\n')) {
// End of a word
if (char_index) {
// Word is not empty
word1[char_index] = '\0';
lower_case_word(word1);
data_node = (Puser_data)malloc(sizeof(user_data));
strcpy(data_node->word,word1);
printf("%s\n", data_node->word);
//!!!!!!SEG FAULT HERE!!!!!!
if (!((user_data *)find_hash(t, data_node->word))){ //SEG FAULT!!!!
insert_hash(t,word1,(void *)data_node);
}
char_index = 0;
num_words++;
}
} else {
// Continue assembling word
word1[char_index++] = c;
}
}
printf("There were %d words; %d unique words.\n", num_words,
dictionary_size);
dump_dictionary(t); //???
}
void lower_case_word(char *w){
int i = 0;
while (w[i] != '\0') {
w[i] = tolower(w[i]);
i++;
}
}
void dump_dictionary(Phash_table t){ //???
int i;
user_data *cur, *cur2;
stat_hash(t, &(t->total), &(t->largest), &(t->average)); //Call to stat hash
printf("Number of unique words: %d\n", t->total);
printf("Largest Bucket: %d\n", t->largest);
printf("Average Bucket: %f\n", t->average);
cur = start_hash_walk(t);
printf("%s: %d\n", cur->word, cur->freq_counter);
for (i = 0; i < t->total; i++)
cur2 = next_hash_walk(t);
printf("%s: %d\n", cur2->word, cur2->freq_counter);
}
int hash_func(char *string){
int i, sum=0, temp, index;
for(i=0; i < strlen(string);i++){
sum += (int)string[i];
}
index = sum % 1000;
return (index);
}
/*array1 and array2 point to the user defined data struct defined above*/
int cmp_func(void *array1, void *array2){
user_data *cur1= array1;
user_data *cur2= array2;//(user_data *)array2;
if(cur1->freq_counter < cur2->freq_counter){
return(-1);}
else{ if(cur1->freq_counter > cur2->freq_counter){
return(1);}
else return(0);}
}
hash.c
#include "hash.h"
Phash_table new_hash (int size, int(*hash_func)(char*), int(*cmp_func)(void*, void*)){
int i;
Phash_table t;
t = (Phash_table)malloc(sizeof(hash_table)); //creates the main hash table
t->buckets = (hash_entry **)malloc(sizeof(hash_entry *)*size); //creates the hash table of "size" buckets
t->size = size; //Holds the number of buckets
t->hash_func = hash_func; //assigning the pointer to the function in the user's program
t->cmp_func = cmp_func; // " "
t->total=0;
t->largest=0;
t->average=0;
t->sorted_array = NULL;
t->index=0;
t->sort_num=0;
for(i=0;i<size;i++){ //Sets all buckets in hash table to NULL
t->buckets[i] = NULL;}
return(t);
}
void free_hash(Phash_table table){
int i;
hash_entry *cur;
for(i = 0; i<(table->size);i++){
if(table->buckets[i] != NULL){
for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next){
free(cur->key); //Freeing memory for key and data
free(cur->data);
}
free(table->buckets[i]); //free the whole bucket
}}
free(table->sorted_array);
free(table);
}
void insert_hash(Phash_table table, char *key, void *data){
Phash_entry new_node; //pointer to a new node of type hash_entry
int index;
new_node = (Phash_entry)malloc(sizeof(hash_entry));
new_node->key = (char *)malloc(sizeof(char)*(strlen(key)+1)); //creates the key array based on the length of the string-based key
new_node->data = data; //stores the user's data into the node
strcpy(new_node->key,key); //copies the key into the node
//calling the hash function in the user's program
index = table->hash_func(key); //index will hold the hash table value for where the new node will be placed
table->buckets[index] = new_node; //Assigns the pointer at the index value to the new node
table->total++; //increment the total (total # of buckets)
}
void *find_hash(Phash_table table, char *key){
int i;
hash_entry *cur;
printf("Inside find_hash\n"); //REMOVE
for(i = 0;i<table->size;i++){
if(table->buckets[i]!=NULL){
for(cur = table->buckets[i]; cur->next != NULL; cur = cur->next){
if(strcmp(table->buckets[i]->key, key) == 0)
return((table->buckets[i]->data));} //returns the data to the user if the key values match
} //otherwise return NULL, if no match was found.
}
return NULL;
}
void stat_hash(Phash_table table, int *total, int *largest, float *average){
int node_num[table->size]; //creates an array, same size as table->size(# of buckets)
int i,j, count = 0;
int largest_buck = 0;
hash_entry *cur;
for(i = 0; i < table->size; i ++){
if(table->buckets[i] != NULL){
for(cur=table->buckets[i]; cur->next!=NULL; cur = cur->next){
count ++;}
node_num[i] = count;
count = 0;}
}
for(j = 0; j < table->size; j ++){
if(node_num[j] > largest_buck)
largest_buck = node_num[j];}
*total = table->total;
*largest = largest_buck;
*average = (table->total) / (table->size);
}
void *start_hash_walk(Phash_table table){
Phash_table temp = table;
int i, j, k;
hash_entry *cur; //CHANGE IF NEEDED to HASH_TABLE *
if(table->sorted_array != NULL) free(table->sorted_array);
table->sorted_array = (void**)malloc(sizeof(void*)*(table->total));
for(i = 0; i < table->total; i++){
if(table->buckets[i]!=NULL){
for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next){
table->sorted_array[i] = table->buckets[i]->data;
}}
}
for(j = (table->total) - 1; j > 0; j --) {
for(k = 1; k <= j; k ++){
if(table->cmp_func(table->sorted_array[k-1], table->sorted_array[k]) == 1){
temp -> buckets[0]-> data = table->sorted_array[k-1];
table->sorted_array[k-1] = table->sorted_array[k];
table->sorted_array[k] = temp->buckets[0] -> data;
}
}
}
return table->sorted_array[table->sort_num];
}
void *next_hash_walk(Phash_table table){
table->sort_num ++;
return table->sorted_array[table->sort_num];
}
hash.h
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
typedef struct hash_entry_ { //Linked List
void *data; //Generic pointer
char *key; //String-based key value
struct hash_entry_ *next; //Self-Referencing pointer
} hash_entry, *Phash_entry;
typedef struct hash_table_ {
hash_entry **buckets; //Pointer to a pointer to a Linked List of type hash_entry
int (*hash_func)(char *);
int (*cmp_func)(void *, void *);
int size;
void **sorted_array; //Array used to sort each hash entry
int index;
int total;
int largest;
float average;
int sort_num;
} hash_table, *Phash_table;
Phash_table new_hash(int size, int (*hash_func)(char *), int (*cmp_func)(void *, void *));
void free_hash(Phash_table table);
void insert_hash(Phash_table table, char *key, void *data);
void *find_hash(Phash_table table, char *key);
void stat_hash(Phash_table table, int *total, int *largest, float *average);
void *start_hash_walk(Phash_table table);
void *next_hash_walk(Phash_table table);
Gettysburg.txt
Four score and seven years ago, our fathers brought forth upon this continent a new nation: conceived in liberty, and dedicated to the proposition that all men are created equal.
Now we are engaged in a great civil war. . .testing whether that nation, or any nation so conceived and so dedicated. . . can long endure. We are met on a great battlefield of that war.
We have come to dedicate a portion of that field as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this.
But, in a larger sense, we cannot dedicate. . .we cannot consecrate. . . we cannot hallow this ground. The brave men, living and dead, who struggled here have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember, what we say here, but it can never forget what they did here.
It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us. . .that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion. . . that we here highly resolve that these dead shall not have died in vain. . . that this nation, under God, shall have a new birth of freedom. . . and that government of the people. . .by the people. . .for the people. . . shall not perish from the earth.
One of several possible problems with this code is loops like:
for(table->buckets[i];
table->buckets[i]->next != NULL;
table->buckets[i] = table->buckets[i]->next)
...
The initializing part of the for loop (table->buckets[i]) has no effect. If i is 0 and table->buckets[0] == NULL, then the condition on this loop (table->buckets[i]->next != NULL) will dereference a null pointer and crash.
That's where your code seemed to be crashing on my box, at least. When I changed several of your loops to:
if (table->buckets[i] != NULL) {
for(;
table->buckets[i]->next != NULL;
table->buckets[i] = table->buckets[i]->next)
...
}
...it kept crashing, but in a different place. Maybe that will help get you unstuck?
Edit: another potential problem is that those for loops are destructive. When you call find_hash, do you really want all of those buckets to be modified?
I'd suggest using something like:
hash_entry *cur;
// ...
if (table->buckets[i] != NULL) {
for (cur = table->buckets[i]; cur->next != NULL; cur = cur->next) {
// ...
}
}
When I do that and comment out your dump_dictionary function, your code runs without crashing.
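To make the non-destructive idea concrete, here is a minimal sketch of a lookup over one bucket chain. The entry type is a simplified stand-in for hash_entry from hash.h and find_in_bucket is a hypothetical name. Note that, unlike the loop above, it tests cur != NULL rather than cur->next != NULL, so the last node of a chain is examined too:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for hash_entry from hash.h (assumption). */
typedef struct entry {
    void *data;
    const char *key;
    struct entry *next;
} entry;

/* Hypothetical helper: walks one bucket chain with a local cursor.
   The chain is only read, never modified, so repeated lookups are safe. */
void *find_in_bucket(const entry *head, const char *key)
{
    const entry *cur;

    assert(key != NULL);

    for (cur = head; cur != NULL; cur = cur->next) {
        if (strcmp(cur->key, key) == 0)
            return cur->data;
    }
    return NULL; /* empty bucket or no matching key */
}
```

Declaring the head pointer const means the compiler should complain if the function ever tries to reassign links through it.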
Hmm,
here's hash.c
#include "hash.h"
Phash_table new_hash (int size, int(*hash_func)(char*), int(*cmp_func)(void*, void*)){
int i;
Phash_table t;
t = (Phash_table)calloc(1, sizeof(hash_table)); //creates the main hash table
t->buckets = (hash_entry **)calloc(size, sizeof(hash_entry *)); //creates the hash table of "size" buckets
t->size = size; //Holds the number of buckets
t->hash_func = hash_func; //assigning the pointer to the function in the user's program
t->cmp_func = cmp_func; // " "
t->total=0;
t->largest=0;
t->average=0;
for(i=0;i<size;i++){ //Sets all buckets in hash table to NULL
t->buckets[i] = NULL;}
return(t);
}
void free_hash(Phash_table table){
int i;
hash_entry *cur, *next;
for(i = 0; i<(table->size);i++){
for(cur = table->buckets[i]; cur != NULL; cur = next){
next = cur->next; //save the link before the node is freed
free(cur->key); //Freeing memory for key and data
free(cur->data);
free(cur); //free the node itself
}
}
free(table->sorted_array);
free(table);
}
void insert_hash(Phash_table table, char *key, void *data){
Phash_entry new_node; //pointer to a new node of type hash_entry
int index;
new_node = (Phash_entry)calloc(1,sizeof(hash_entry));
new_node->key = (char *)malloc(sizeof(char)*(strlen(key)+1)); //creates the key array based on the length of the string-based key
new_node->data = data; //stores the user's data into the node
strcpy(new_node->key,key); //copies the key into the node
//calling the hash function in the user's program
index = table->hash_func(key); //index will hold the hash table value for where the new node will be placed
table->buckets[index] = new_node; //Assigns the pointer at the index value to the new node
table->total++; //increment the total (total # of buckets)
}
void *find_hash(Phash_table table, char *key){
int i;
hash_entry *cur;
printf("Inside find_hash\n"); //REMOVE
for(i = 0;i<table->size;i++){
if(table->buckets[i]!=NULL){
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
if(strcmp(cur->key, key) == 0)
return((cur->data));} //returns the data to the user if the key values match
} //otherwise return NULL, if no match was found.
}
return NULL;
}
void stat_hash(Phash_table table, int *total, int *largest, float *average){
int node_num[table->size];
int i,j, count = 0;
int largest_buck = 0;
hash_entry *cur;
for(i = 0; i < table->size; i ++)
{
if(table->buckets[i]!=NULL)
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
count ++;}
node_num[i] = count;
count = 0;
}
for(j = 0; j < table->size; j ++){
if(node_num[j] > largest_buck)
largest_buck = node_num[j];}
*total = table->total;
*largest = largest_buck;
*average = (table->total) /(float) (table->size); //oook: i think you want a fp average
}
void *start_hash_walk(Phash_table table){
void* temp = 0; //oook: this was another way of overwriting your input table
int i, j, k;
int l=0; //oook: new counter for elements in your sorted_array
hash_entry *cur;
if(table->sorted_array !=NULL) free(table->sorted_array);
table->sorted_array = (void**)calloc((table->total), sizeof(void*));
for(i = 0; i < table->size; i ++){
//for(i = 0; i < table->total; i++){ //oook: i don't think you meant total ;)
if(table->buckets[i]!=NULL)
for (cur = table->buckets[i]; cur != NULL; cur = cur->next){
//for(table->buckets[i]; table->buckets[i]->next != NULL; table->buckets[i] = table->buckets[i]->next){
table->sorted_array[l++] = cur->data;
}
}
//oook: sanity check/assert on expected values
if (l != table->total)
{
printf("oook: l[%d] != table->total[%d]\n",l,table->total);
}
for(j = (l) - 1; j > 0; j --) {
for(k = 1; k <= j; k ++){
if (table->sorted_array[k-1] && table->sorted_array[k])
{
if(table->cmp_func(table->sorted_array[k-1], table->sorted_array[k]) == 1){
temp = table->sorted_array[k-1]; //ook. changed temp to void* see assignment
table->sorted_array[k-1] = table->sorted_array[k];
table->sorted_array[k] = temp;
}
}
else
printf("if (table->sorted_array[k-1] && table->sorted_array[k])\n");
}
}
return table->sorted_array[table->sort_num];
}
void *next_hash_walk(Phash_table table){
/*oook: this was blowing up since you were incrementing past the size of sorted_array..
NB: *you **need** to implement some bounds checking here or you will endup with more seg-faults!!*/
//table->sort_num++
return table->sorted_array[table->sort_num++];
}
here's parse.c
#include <stdio.h>
#include <ctype.h>
#include <string.h>
#include <assert.h> //oook: added so you can assert ;)
#include "hash.h"
#define WORD_SIZE 40
#define DICTIONARY_SIZE 1000
#define TRUE 1
#define FALSE 0
void lower_case_word(char *);
void dump_dictionary(Phash_table );
/*Hash and compare functions*/
int hash_func(char *);
int cmp_func(void *, void *);
typedef struct user_data_ {
char word[WORD_SIZE];
int freq_counter;
} user_data, *Puser_data;
int main(void)
{
char c, word1[WORD_SIZE];
int char_index = 0, dictionary_size = 0, num_words = 0, i;
int total=0, largest=0;
float average = 0.0;
Phash_table t; //Pointer to main hash_table
int (*Phash_func)(char *)=NULL; //Function Pointers
int (*Pcmp_func)(void *, void *)=NULL;
Puser_data data_node; //pointer to hash table above
user_data * find;
printf("Parsing input ...\n");
Phash_func = hash_func; //Assigning Function pointers
Pcmp_func = cmp_func;
t = new_hash(1000,Phash_func,Pcmp_func);
// Read in characters until end is reached
while ((c = getchar()) != EOF) {
if ((c == ' ') || (c == ',') || (c == '.') || (c == '!') || (c == '"') ||
(c == ':') || (c == '\n')) {
// End of a word
if (char_index) {
// Word is not empty
word1[char_index] = '\0';
lower_case_word(word1);
data_node = (Puser_data)calloc(1,sizeof(user_data));
strcpy(data_node->word,word1);
printf("%s\n", data_node->word);
//!!!!!!SEG FAULT HERE!!!!!!
if (!((user_data *)find_hash(t, data_node->word))){ //SEG FAULT!!!!
dictionary_size++;
insert_hash(t,word1,(void *)data_node);
}
char_index = 0;
num_words++;
}
} else {
// Continue assembling word
word1[char_index++] = c;
}
}
printf("There were %d words; %d unique words.\n", num_words,
dictionary_size);
dump_dictionary(t); //???
}
void lower_case_word(char *w){
int i = 0;
while (w[i] != '\0') {
w[i] = tolower(w[i]);
i++;
}
}
void dump_dictionary(Phash_table t){ //???
int i;
user_data *cur, *cur2;
stat_hash(t, &(t->total), &(t->largest), &(t->average)); //Call to stat hash
printf("Number of unique words: %d\n", t->total);
printf("Largest Bucket: %d\n", t->largest);
printf("Average Bucket: %f\n", t->average);
cur = start_hash_walk(t);
if (!cur) //ook: do test or assert for null values
{
printf("oook: null== (cur = start_hash_walk)\n");
exit(-1);
}
printf("%s: %d\n", cur->word, cur->freq_counter);
for (i = 0; i < t->total; i++)
{//oook: i think you needed these braces
cur2 = next_hash_walk(t);
if (!cur2) //ook: do test or assert for null values
{
printf("oook: null== (cur2 = next_hash_walk(t) at i[%d])\n",i);
}
else
printf("%s: %d\n", cur2->word, cur2->freq_counter);
}//oook: i think you needed these braces
}
int hash_func(char *string){
int i, sum=0, temp, index;
for(i=0; i < strlen(string);i++){
sum += (int)string[i];
}
index = sum % 1000;
return (index);
}
/*array1 and array2 point to the user defined data struct defined above*/
int cmp_func(void *array1, void *array2){
user_data *cur1= array1;
user_data *cur2= array2;//(user_data *)array2;
/* ooook: do assert on programmatic errors.
this function *requires non-null inputs. */
assert(cur1 && cur2);
if(cur1->freq_counter < cur2->freq_counter){
return(-1);}
else{ if(cur1->freq_counter > cur2->freq_counter){
return(1);}
else return(0);}
}
Follow the //oook comments.
Explanation:
There were one or two places this was going to blow up in.
The quick fix and answer to your question was in parse.c, circa L100:
cur = start_hash_walk(t);
printf("%s: %d\n", cur->word, cur->freq_counter);
Checking that cur is not null before calling printf fixes your immediate seg fault.
But why would cur be null? Because of this bad boy:
void *start_hash_walk(Phash_table table)
Your hash_func(char *string) can (and does) return non-unique values. This is of course OK, except that you have not yet implemented your linked list chains: insert_hash simply overwrites table->buckets[index] when two keys hash to the same index, while still incrementing table->total. Hence you end up with table->sorted_array containing fewer than table->total elements, or you would if you were iterating over all table->size buckets ;)
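For what it's worth, chaining can be as small as pushing the new node onto the front of whatever is already in the bucket. This is only a sketch under assumptions: entry is a simplified stand-in for hash_entry, and chain_insert is a hypothetical name, not a function from the posted code:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for hash_entry from hash.h (assumption). */
typedef struct entry {
    void *data;
    char *key;
    struct entry *next;
} entry;

/* Hypothetical helper: pushes a new node onto the front of the chain at
   buckets[index] instead of overwriting it.
   Returns 0 on success, -1 if an allocation fails. */
int chain_insert(entry **buckets, size_t index, const char *key, void *data)
{
    entry *node;

    assert(buckets != NULL && key != NULL);

    node = malloc(sizeof *node);
    if (node == NULL)
        return -1;

    node->key = malloc(strlen(key) + 1);
    if (node->key == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->key, key);
    node->data = data;

    node->next = buckets[index]; /* keep whatever was already in the bucket */
    buckets[index] = node;       /* new node becomes the head of the chain */
    return 0;
}
```

With the sum-of-characters hash, words such as "ago" and "oga" land in the same bucket. With this approach both stay reachable through next.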
There are one or two other issues.
For now I hacked Nate Kohl's for(cur=table->buckets[i]; cur->next != NULL; cur=cur->next) further, to be for(cur=table->buckets[i]; cur != NULL; cur=cur->next). Since you have no chains, every node's next is NULL and the original condition would skip it. But this is *your* TODO, so enough said about that.
Finally, note that in next_hash_walk(Phash_table table) you have:
table->sort_num++;
return table->sorted_array[table->sort_num];
Ouch! Do check those array bounds!
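A bounds-checked version of the walk might look roughly like this. The walk struct is a simplified stand-in for the sorted_array, total and sort_num fields of hash_table, and next_walk is a hypothetical name:

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the walk state kept in hash_table (assumption). */
typedef struct walk {
    void **sorted_array; /* elements in sorted order */
    int total;           /* number of valid elements in sorted_array */
    int sort_num;        /* index of the next element to hand out */
} walk;

/* Hypothetical helper: returns the next element, or NULL once the walk
   is exhausted, so the index never runs past the end of the array. */
void *next_walk(walk *w)
{
    assert(w != NULL);

    if (w->sorted_array == NULL || w->sort_num < 0 || w->sort_num >= w->total)
        return NULL;

    return w->sorted_array[w->sort_num++];
}
```

The caller can then loop until it gets NULL instead of counting elements itself.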
Notes
1) If your function isn't designed to change its input, then make the input const. That way the compiler may well tell you when you're inadvertently trashing something.
2) Do bound checking on your array indices.
3) Do test/assert for Null pointers before attempting to use them.
4) Do unit test each of your functions; never write too much code before compiling & testing.
5) Use minimal test-data; craft it such that it limit-tests your code & attempts to break it in cunning ways.
6) Do initialise your data structures!
7) Never use Egyptian braces! {
only joking ;)
}
PS Good job so far: pointers are tricky little things! And a well-asked question with all the necessary details, so +1 and gl ;)
(//oook: maybe add a homework tag)

Resources