Comparing strings with user-created string class - c

For this assignment I had to create my own string class. I initially wrote the compareto method to compare two strings and report whichever is larger overall. What I want instead is to report which one is alphabetically larger. For example, take two strings: smith and htims. With the way I designed the compareto method, the result is that they are equal. What I want is to be told which one comes first alphabetically, so in my example htims would come first. I understand how to do this in Java, or even in C using the <string.h> library; I am just confused as to how to do this myself.
EDIT: I just wanted to note that I am not looking for a code answer, but rather a nudge toward how I should write the code.
int compareto(void * S1, void * S2){
    String s1 = (String) S1;
    String s2 = (String) S2;
    int i = 0, cs1 = 0, cs2 = 0; //cs1 is count of s1, cs2 is count of s2
    while(s1->c[i] != '\0'){ //basically, while there is a word
        if(s1->c[i] < s2->c[i]) // if string 1 char is less than string 2 char
            cs2++; //add to string 2 count
        else if(s1->c[i] > s2->c[i]) //vice versa
            cs1++;
        i++;
    }
    //for my return I basically have
    if(cs1>cs2){
        return 1;
    }
    else if(cs2 > cs1){
        return 2;
    }
    return 0;
}
Here is mystring.h:
typedef struct mystring {
char * c;
int length;
int (*sLength)(void * s);
char (*charAt)(void * s, int i);
int (*compareTo)(void * s1, void * s2);
struct mystring * (*concat)(void * s1, void * s2);
struct mystring * (*subString)(void * s, int begin, int end);
void (*printS)(void * s);
} string_t;
typedef string_t * String;
Any suggestions? All of my Google searches turn up answers that use the <string.h> library, so I've had no luck.
I'm using this to traverse a linked list and remove the person whose last name matches the one the user wants to delete.
Here is my test code to help clarify my problem (note that compareto is called in the remove function):
int main() {
Node startnode, currentnode, newnode;
int ans, success;
String who;
who = newString2();
startnode = (Node) malloc(sizeof(pq_t));
startnode->next = NULL;
currentnode = startnode;
ans = menu();
while (ans != 0) {
switch (ans) {
case add:
newnode = getStudent();
startnode = insert(newnode, startnode);
break;
case remove:
printf("Enter the last name of the person you want to delete : \n");
scanf("%s", &who->c);
startnode = removeStudent(startnode, who, &success);
if (success == 0)
printf("UNFOUND\n");
else
printf("permanently DELETED\n");
break;
case view:
printf("Now displaying the list : \n");
displaylist(startnode);
break;
}
ans = menu();
}
}
Node removeStudent(Node head, String who, int * success) {
Node p, l; //p = pointer node, l = previous node
Student current; //I'm using generics, so I have to cast the current node->obj as a Student.
String ln, cln; //the last name of the person I want to remove, and the last name of the current node
p = head;
l = p;
//there can be three cases, p->first node, p->last node, p->some node in between
if (head->obj == NULL) {
printf("The list is empty\n"); //when the list is empty
*success = 0;
return NULL;
}
while (p != NULL) {
current = (Student) p->obj;
cln = current->ln;
if (ln->compareTo(who, cln) == 0) {
if (head == p) { //when there is only one node
head = head->next;
free(p);
*success = 1;
return head;
} else if (p->next == NULL) { //last node
l->next = NULL;
free(p);
*success = 1;
return head;
} else {
l->next = p->next; //middle
free(p);
*success = 1;
return head;
}
}
l = p;
p = p->next;
}
*success = 0;
return head;//couldnt find the node
}

Try comparing the following pairs of strings:
"ABC" vs "DEF"
"ADF" vs "BBB"
"ABC" vs "CBA"
What results do you get? More importantly, why? How do these results compare to what you want to get?
(You should first work it out in your head. Work out the values of cs1 and cs2 for each step of the comparison loop.)

First, ln isn't properly initialized in the sample removeStudent(), so calling ln->compareTo will probably cause a segfault. Hopefully, ln is properly initialized in your actual code.
To define an ordering on the strings, you can first define what's known in database circles as a "collation": an ordering on characters. You can implement the collation as a function (called, say, chrcmp), or inline within your string comparison function. The important thing is to define it.
Generally speaking, an ordering on a type induces a lexicographic order on sequences of that type: to compare two sequences, find the first place they differ; the lesser sequence is the one with the lesser element at that position.
More formally, suppose sequences are indexed starting at 0. Let a and b be sequences of the base type with lengths m and n, respectively. The lexicographic order is defined by:
a < b if there is an index i such that a_i < b_i (under the ordering on the base type) and a_j = b_j for all 0 ≤ j < i
a < b if a is a prefix of b
a = b if m = n and a_i = b_i for all 0 ≤ i < m
where "a is a prefix of b" means m < n and a_i = b_i for all 0 ≤ i < m.
The advantage of this approach is you can write a comparison function that will work with any homogeneous sequence type: strings, lists of strings, arrays of integers, what-have-you. If you specialize the comparison function for null-terminated strings, you don't need to worry about the cases for prefixes; simply have '\0' be the least character in the collation.
From a general comparison function (called, say, lexicCompare), you can define
lexicCompareString(a, b):
    return lexicCompare(a, b, chrcmp)
and set a String's compareTo member to lexicCompareString.
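For null-terminated strings, the idea above could be sketched roughly as follows. This is only an illustration under some assumptions: chrcmp here is plain character-code order (your collation may differ), the functions take the raw char buffers rather than the String struct, and all three names are the placeholder names from the pseudocode, not an existing API.

```c
#include <stddef.h>

/* Assumed collation: plain character-code order, so '\0' is the least
   character. Returns a negative, zero or positive value. */
int chrcmp(char x, char y)
{
    unsigned char ux = (unsigned char)x, uy = (unsigned char)y;
    return (ux > uy) - (ux < uy);
}

/* Lexicographic comparison of two null-terminated sequences under the
   collation cmp: the first position where they differ decides the result. */
int lexicCompare(const char *a, const char *b, int (*cmp)(char, char))
{
    size_t i = 0;
    /* Advance while a has not ended and both characters collate equal. */
    while (a[i] != '\0' && cmp(a[i], b[i]) == 0)
        i++;
    /* Either the characters differ here or both strings ended. */
    return cmp(a[i], b[i]);
}

int lexicCompareString(const char *a, const char *b)
{
    return lexicCompare(a, b, chrcmp);
}
```

A compareTo wrapper for the String type would cast its void * arguments and pass s1->c and s2->c to lexicCompareString. Because '\0' collates lowest, a prefix such as "abc" automatically sorts before "abcd" without a separate length check.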

Related

Need help to create my list in alphabetic order

I'm trying to create an alphabetically ordered linked list from a file by placing each node in the correct spot as it is read. The file itself is not necessarily in alphabetical order. The program reads the file correctly and I'm able to add everything at the end of the list.
Place search_place(Place first, char *new){
Place aux = first;
while (aux->abcnext != NULL){
if ( strcmp(new,aux->place) > 0)
aux = aux->abcnext;
else
break;
}
return aux;
}
void insert_place(Place first, char* string){
Place previous,temp,new;
previous = search_place(first, string);
if (previous->abcnext == NULL){
new = create_place();
previous->place = string;
new->abcnext = previous->abcnext;
previous->abcnext = new;
}
else{
new = (Place)malloc(sizeof(place_node));
new->place = string;
new->abcnext = previous;
previous = new;
}
}
Place create_place(){
Place aux;
aux=(Place)malloc(sizeof(place_node));
if (aux!=NULL){
aux->place=malloc(25*sizeof(char));
aux->abcnext=NULL;
}
return aux;
}
typedef struct placenode*Place;
typedef struct placenode{
char *place;
Place abcnext;
}place_node;
Considering the results that I've obtained from this code I suppose the problem is related to either pointers or the header of the linked list or both. With 4 places: P, Z, W, L - I get only P -> Z from the list.
if (previous->abcnext == NULL){
new = create_place();
previous->place = string;
new->abcnext = previous->abcnext;
previous->abcnext = new;
}
There are a couple of obvious problems with the above code. Firstly, you never set new->place; instead you overwrite previous->place, which doesn't look right. The new node's "place" is left pointing at the uninitialized buffer allocated in create_place(), and you lose the value stored in the previous node.
Secondly you're assigning the value of string rather than making a new copy. If you're using the same string each time you call the function, you'd end up with all the nodes pointing to the same string.
You should do something like
new->place = malloc(strlen(string)+1);
strcpy(new->place, string);
or if your version of C has it, use strdup
new->place = strdup(string);
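Putting the two points together, a sorted insert might look something like the sketch below. It is not the asker's code: insert_sorted is a made-up name, the list is assumed to have no dummy header node, and the head is passed by address so that inserting in front of the first node also works.

```c
#include <stdlib.h>
#include <string.h>

typedef struct placenode {
    char *place;
    struct placenode *abcnext;
} place_node;

/* Inserts a private copy of name into the list at *head, keeping the list
   in alphabetical order. Returns 0 on success, -1 if allocation fails. */
int insert_sorted(place_node **head, const char *name)
{
    place_node *node = malloc(sizeof *node);
    if (node == NULL)
        return -1;
    node->place = malloc(strlen(name) + 1);   /* copy, don't share the buffer */
    if (node->place == NULL) {
        free(node);
        return -1;
    }
    strcpy(node->place, name);

    /* Walk the link that should end up pointing at the new node. */
    place_node **link = head;
    while (*link != NULL && strcmp((*link)->place, name) < 0)
        link = &(*link)->abcnext;

    node->abcnext = *link;   /* new node points at the rest of the list */
    *link = node;            /* previous link (or the head) points at it */
    return 0;
}
```

Starting from an empty list (a NULL head) and inserting P, Z, W, L in that order should leave the list as L, P, W, Z.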

Is there a C function that returns the second, third, etc. instance an int value occurs?

I have implemented code that imports data from a file containing 5 different values, one of them being Time. I have converted the time given in the format Hour.Minute.Second.Millisecond into just Milliseconds.
With this data I created a function Find that finds the data for a given time. This is where the problem arises: the file contains multiple days of data, so the same time value repeats multiple times. Is there a function in the C library that returns all instances of a value? For example, given arr = [2, 3, 4, 1, 2], I want it to tell me where the second 2 appears, returning 4.
Edit: For better clarity
These are the functions
void Find(SortedLinkedList *list,int target,int date, char *search) {
if(strcmp(search, "Time") == 0){
Sate *found = findTime(list, target,date);
printf("The Node with time:%d\n Is from the date:%d\n Contains the following:",found->Time,found->Date);
printf("RMag:%6.3f ", found->rmag);
printf("NSmag:%6.3f ", found->NSmag);
printf("azmag:%6.3f ", found->azmag);
printf("avgmag:%6.3f \n", found->avgmag);
}
}
Sate *findTime(SortedLinkedList *list, int target,int date){
Node *current = list->head;
for (int i = 0; i < (list->size)+1 && current != NULL; i++) {
if(current->data->Time == target && current->data->Date == date)
return current->data;
else{
current = current->next;
}
}
}
Right now for it to work I implemented a date insert to differentiate between the times but I'm wondering if it can be done without it.
The Standard C library has no general function for iterating over a collection. The closest thing is strtok(), which iterates over a text string, splitting it at the delimiter characters you provide.
There is also bsearch(), which searches a sorted array of items, but that is not really what you want either.
It sounds like you want something like the following. It demonstrates one way to implement the algorithm; I am not sure what your time point data looks like, so that is something you will need to provide.
typedef unsigned long long TimePoint; // made up time data item
typedef struct {
int bFound;
unsigned long ulCurOffset; // array position where item found if bFound is true.
unsigned long ulOffset; // next array position to test
unsigned long ulCount; // count of times found
} IteratorThing;
IteratorThing IterateFunc (IteratorThing x, TimePoint *array, size_t len, TimePoint search)
{
x.bFound = 0; // assume we didn't find one.
// resuming from the current place in the array, search until we
// find a match or we reach the end of the array.
for ( ; x.ulOffset < len; x.ulOffset++) {
// this is a simple comparison for equality which may need to be
// more complex for your specific application.
if (array[x.ulOffset] == search) {
// we have found a match so lets update counts, etc.
x.ulCount++; // count of this search item found.
x.bFound = 1; // indicate we found one.
x.ulCurOffset = x.ulOffset; // remember where we found it.
x.ulOffset++; // point to the next array item to look at
break;
}
}
return x;
}
This would be used as in:
void main_xfun(void)
{
TimePoint array[] = { 1, 2, 3, 2, 3, 4, 0 };
TimePoint search = 2;
size_t len = sizeof(array) / sizeof(array[0]);
{
IteratorThing x = { 0 }; // define and initialize our iterator
while ((x = IterateFunc(x, array, len, search)).bFound) {
// do what is needed when we find a time value
// array offset to the item is x.ulCurOffset
// current count of times found is in x.ulCount;
printf(" found item %ld at offset %lu count is %lu\n", (long)array[x.ulCurOffset], x.ulCurOffset, x.ulCount);
}
printf(" item %ld found %lu time\n", (long)search, x.ulCount);
}
{
IteratorThing x = { 0 }; // define and initialize our iterator
search = 25;
while ((x = IterateFunc(x, array, len, search)).bFound) {
// do what is needed when we find a time value
// array offset to the item is x.ulCurOffset
// current count of times found is in x.ulCount;
printf(" found item %ld at offset %lu count is %lu\n", (long)array[x.ulCurOffset], x.ulCurOffset, x.ulCount);
}
printf(" item %ld found %lu time\n", (long)search, x.ulCount);
}
}
produces output of
found item 2 at offset 1 count is 1
found item 2 at offset 3 count is 2
item 2 found 2 time
item 25 found 0 time
To restart the search from the beginning just initialize the iterator struct to all zeros again.
What would be really interesting is to provide a pointer to a comparison function in the interface of the function IterateFunc() which would be called to do the comparisons. This would be along the lines of the bsearch() function which requires a pointer to a comparison function but then that is probably overkill for your specific needs.
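A rough sketch of that comparison-function variant is shown below. The names MatchIter, next_match and int_equal_cmp are invented for this illustration, and the callback convention (return 0 for a match, as with bsearch()) is an assumption about what would be convenient.

```c
#include <stddef.h>

/* Hypothetical iterator state for resumable searching. */
typedef struct {
    size_t next;   /* next array position to test */
    size_t found;  /* position of the most recent match */
    size_t count;  /* number of matches so far */
} MatchIter;

/* Advances it to the next element for which compar(key, element) == 0.
   Returns 1 and updates it->found and it->count on a match,
   or 0 once the array is exhausted. */
int next_match(MatchIter *it, const void *key, const void *base,
               size_t nel, size_t width,
               int (*compar)(const void *, const void *))
{
    const char *bytes = base;   /* byte view, so we can step by width */
    while (it->next < nel) {
        size_t i = it->next++;
        if (compar(key, bytes + i * width) == 0) {
            it->found = i;
            it->count++;
            return 1;
        }
    }
    return 0;
}

/* Example comparison function for int elements: 0 means equal. */
int int_equal_cmp(const void *a, const void *b)
{
    return *(const int *)a != *(const int *)b;
}
```

The same next_match() could then be reused for an array of structs by supplying a comparison function that looks at the time field only.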
If you want this hypothetical function to work for either an array of integers or for your time indexed structures, you will probably need to write a generic function.
If POSIX functions are available to you, you can use lfind() as a starting point for such a generic function.
The lsearch() function shall linearly search the table and return a pointer into the table for the matching entry. If the entry does not occur, it shall be added at the end of the table. ...
The lfind() function shall be equivalent to lsearch(), except that if the entry is not found, it is not added to the table. Instead, a null pointer is returned.
Since lfind() returns the first instance, you need to invoke lfind() again, starting just past that instance, to find the second one.
void * lfind_Nth (const void *key, const void *base, size_t *nelp,
size_t width, int (*compar)(const void *, const void *),
int N)
{
const char (*array)[width] = base;
char (*p)[width] = NULL;
size_t n = *nelp;
while (N-- > 0) {
p = n ? lfind(key, array, &n, width, compar) : NULL;
if (p == NULL) break;
n -= (p + 1) - array;
array = p + 1;
}
return p;
}
For your integer array example:
int compar_int (const void *a, const void *b) {
return *(const int *)a != *(const int *)b;
}
int where_Nth_int(int key, int *arr, size_t nelm, int N) {
int *w = lfind_Nth(&key, arr, &nelm, sizeof(*arr),
compar_int, N);
return w ? w - arr : -1;
}
int main (void) {
int arr[] = {2,3,4,1,2,};
int nelm = sizeof(arr)/sizeof(*arr);
printf("Second 2 @ %d\n", where_Nth_int(2, arr, nelm, 2));
}
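If POSIX is not available, the same result for a plain int array can be had with a simple counting loop. This is a minimal sketch; nth_index_of is a made-up name, and occurrences are counted starting at 1.

```c
#include <stddef.h>

/* Returns the index of the n-th occurrence (n starts at 1) of key in arr,
   or -1 if key occurs fewer than n times. */
long nth_index_of(const int *arr, size_t len, int key, size_t n)
{
    size_t seen = 0;   /* matches encountered so far */
    for (size_t i = 0; i < len; i++) {
        if (arr[i] == key && ++seen == n)
            return (long)i;
    }
    return -1;
}
```

For the array {2, 3, 4, 1, 2} from the question, asking for the second 2 gives index 4.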

Linear search through LinkedList

I'm attempting to implement a linear search function for strings in C, but it isn't currently working. Here is my code:
// Linear search for name matching input string
int listSearch(struct LinkedList* linkedList, char name)
{
struct StudentRecord* temp = linkedList->head; // Go to first item in linked list
int count = 0; // Count variable to give index of search item
while((temp != NULL) && (name != temp->name))
{
temp = temp->next;
count++;
}
return count;
}
And here is the function call to listSearch:
printf("\nItem: Tim\nIndex: %d", listSearch(list_1, "Tim"));
'Tim' is at index 3, but the output consistently puts him at index 4 (there are 4 total items in the list and thus index 4 doesn't exist) - and the same is true for any item we search for. This leads me to believe that the (name != temp->name) condition is failing, but I can't for the life of me see why...Could anyone give me a hint as to why it isn't working?
You're passing in a char, not a pointer to char, so you end up comparing a char with a string pointer. You also need to compare the string contents with strcmp rather than comparing the pointers.
int listSearch(struct LinkedList* linkedList, const char * name)
{
    struct StudentRecord* temp = linkedList->head; // Go to first item in linked list
    int count = 0; // Index of the node currently being examined
    while(temp != NULL) {
        if (temp->name != NULL && strcmp(name, temp->name) == 0) {
            return count; // Found: count is the index of the match
        }
        temp = temp->next;
        count++;
    }
    return -1; // Reached the end without finding the name
}
Use strcmp to compare two strings, for example:
if(strcmp(a,b)==0)
printf("Entered strings are equal");
else
printf("Entered strings are not equal");

A function searching through an array of structures (C)

This is an address:
struct Adress {
char name[31], lastname[31], email[48];
};
The goal is to have an address book in the main function, and the user should be able to type in a string, and the program lists out all of the people from the address book whose name or the last name contains the given string.
For example, if the address book contains "john" "doe", "jane" "doey" and "george" "johnson", and the user types in "doe", the output is:
1. john doe johndoe@email.com
2. jane doey janedoey@email.com
This part of the main function should use a function
int search(struct Adress array[], int size_of_the_addressbook, char* string_to_search)
which returns the index of the first found address, and -1 in case no address has been found.
Here's my try:
In the snippet from my main function (there 's no need to post input stuff here):
struct Adress adressbook[1000], *member;
int i = 0;
member = adressbook;
if (search(member, number_of_elements, string_to_seach) == -1)
printf("No person found.\n");
else while((search(member, number_of_elements, string_to_seach)) != -1)
{
member = adressbook + search(member, number_of_elements, string_to_seach);
++i;
printf("%d. %s %s - %s\n", i, (*member).name, (*member).lastname, (*member).email);
++member;
}
And here's the search function:
int search(struct Adress array[], int size_of_the_addressbook, char* string_to_search)
{
int j, index;
struct Adress *i;
i = array;
while (strstr((*i).name, string_to_search) == 0 && strstr((*i).lastname, string_to_search) == 0)
{
index = ((i - array)/(sizeof (struct Adress)));
if (index == size_of_the_addressbook) return -1;
++i;
}
index = ((i - array)/(sizeof (struct Adress)));
return index;
}
However, this program gets stuck in an infinite loop in pretty much every case where there is more than one member in the address book. I suspect that in the while loop the search doesn't continue from the previously found member but starts from the beginning each time, so it keeps finding the same first match.
Your search never actually returns -1, so the loop that invokes it has no exit condition. Further, you should adjust the starting point of each subsequent search to be one slot beyond the last discovery point.
I'm nearly certain this is what you're trying to do. I've not tested this (I have no data to do so, nor any info on how this functionality is invoked), but I hope the point is obvious:
int search(const struct Adress array[],
int size_of_the_addressbook,
const char* string_to_search)
{
const struct Adress *end = array + size_of_the_addressbook;
const struct Adress *i = array;
for (; i != end; ++i)
{
if (strstr(i->name, string_to_search) != NULL ||
strstr(i->lastname, string_to_search) != NULL)
break;
}
return i == end ? -1 : (int)(i - array);
}
void do_search(const struct Adress *array,
int number_of_elements,
const char *string_to_search)
{
int i = search(array, number_of_elements, string_to_search), base=0;
if (i == -1)
{
printf("No person found.\n");
return;
}
while (i != -1)
{
base += i;
printf("%d. %s %s - %s\n", base,
array[base].name,
array[base].lastname,
array[base].email);
base += 1;
// note adjustment of starting point using pointer arithmetic.
i = search(array + base,
number_of_elements - base,
string_to_search);
}
}
Hope it helps. Best of luck.
You have a few problems:
You call search() twice in your main loop, which is unnecessary; call it once and store its return value.
Your member pointer never moves past the first match, so the first match is always found again, leading to an infinite loop.
You increase the member pointer but still pass number_of_elements to the search function. When you advance member, the number of elements remaining to its right decreases by the same amount.
This expression does not give the value you think it does:
((i - array)/(sizeof (struct Adress)));
You are computing the distance between the two pointers i and array and then dividing it by sizeof(struct Adress), which is 110. As another answer mentioned, pointer subtraction is already scaled by the element size, so
((i - array)/(sizeof (struct Adress))); -> i - array;
To see what I mean, try printing these values:
printf("\t%td\n", (char *)member - (char *)adressbook); /* distance in bytes */
printf("\t%td\n", ((char *)member - (char *)adressbook) / (ptrdiff_t)sizeof(*member)); /* distance in elements */
printf("\t%td\n", member - adressbook); /* also distance in elements */
Note: the difference of two pointers has type ptrdiff_t (declared in <stddef.h>), which is printed with "%td" on both 32-bit and 64-bit systems.
This is the code that will do what you need
int search(struct Adress **array, int size_of_the_addressbook, char* string_to_search)
{
int index;
struct Adress *pointer;
if ((size_of_the_addressbook == 0) || (array == NULL) || (*array == NULL))
return -1;
pointer = *array;
index = 0;
while (strstr(pointer->name, string_to_search) == 0 &&
strstr(pointer->lastname, string_to_search) == 0)
{
/* check that we are not at the end of the array. */
if (++index == size_of_the_addressbook)
return -1;
/* not found yet, increment both arrays */
(*array)++;
pointer = *array;
}
return index;
}
and in main()
int index;
int foundIndex;
index = 0;
while ((foundIndex = search(&member, number_of_elements, string_to_seach)) != -1)
{
index += foundIndex; /* index of this match within adressbook */
printf("%d. %s %s - %s\n", index + 1, member->name, member->lastname, member->email); /* 1-based position */
index += 1; /* step past the match */
number_of_elements -= 1 + foundIndex;
++member;
}
in this approach, the member pointer is increased inside the search() function to point to the found element, a counter is added to reflect how much was advanced.
After the search() function returns, member should be increased by 1 again to point to the next element, and number_of_elements should be decreased by the number of elements advanced in the search function + 1 for the found element.
Also, keep a variable that you update on each iteration that gives you the actual index of the element in the array.
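An alternative that avoids adjusting the pointer and the element count by hand is to give the search an explicit starting index. The sketch below assumes that change of interface is acceptable; search_from and count_matches are invented names, and start is assumed to be non-negative.

```c
#include <string.h>

struct Adress {
    char name[31], lastname[31], email[48];
};

/* Returns the index of the first entry at or after start (start >= 0)
   whose name or lastname contains needle, or -1 if there is none. */
int search_from(const struct Adress array[], int size, int start,
                const char *needle)
{
    for (int i = start; i < size; i++) {
        if (strstr(array[i].name, needle) != NULL ||
            strstr(array[i].lastname, needle) != NULL)
            return i;
    }
    return -1;
}

/* Counts all matches by resuming one slot past the previous hit each time. */
int count_matches(const struct Adress array[], int size, const char *needle)
{
    int count = 0;
    for (int i = search_from(array, size, 0, needle); i != -1;
         i = search_from(array, size, i + 1, needle))
        count++;
    return count;
}
```

The loop in count_matches() is the same shape the printing loop in main would take: the returned value is always an index into the original array, so no separate bookkeeping of the base offset is needed.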

Hash table sorting and execution time

I wrote a program that counts word frequencies using a hash table, but I don't know how to sort the result.
I use a struct to store each word and its count.
My hash function uses the modulo operation, and my hash table resolves collisions with linked lists.
1. How do I sort the entries by frequency?
2. Why is the printed execution time always zero, even though I have checked it many times? What am I doing wrong?
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <time.h>
#include <ctype.h>
#define HASHSIZE 29989
#define FACTOR 31
#define VOCABULARYSIZE 30
typedef struct HashNode HashNode;
struct HashNode{
char* voc;//vocabulary
int freq;//frequency
struct HashNode *next;//pointed to the same hashcode
//but actually are different numbers
};
HashNode *HashTable[HASHSIZE] = {NULL,0,NULL};//an array of pointers
unsigned int HashCode(const char *pVoc){//generate hashcode
unsigned int index = 0;
int n = strlen(pVoc);
int i = 0;
for(; i < n; i ++)
index = FACTOR*index + pVoc[i];
return index % HASHSIZE;
}
void InsertVocabulary(const char *pVoc){//insert vocabulary to hash table
HashNode *ptr;
unsigned int index = HashCode(pVoc);
for(ptr = HashTable[index]; ptr != NULL; ptr = ptr -> next){//search if already exist
if(!strcmp (pVoc, ptr -> voc)){
(ptr->freq)++;
return;
}
}
ptr = (HashNode*)malloc(sizeof(HashNode));//if doesn't exist, create it
ptr -> freq = 1;
ptr -> voc = (char*)malloc(strlen(pVoc)+1);
strcpy(ptr -> voc, pVoc);
ptr -> next = HashTable[index];
HashTable[index] = ptr;
}
void ReadVocabularyTOHashTable(const char *path){
FILE *pFile;
char buffer[VOCABULARYSIZE];
pFile = fopen(path, "r");//open file for read
if(pFile == NULL){
perror("Fail to Read!");//error message
return;//nothing to read
}
int ch;//int rather than char, so EOF can be told apart from real characters
int i =0;
do{
ch = fgetc(pFile);
if(isalpha(ch))
buffer[i++] = tolower(ch);//all convert to lowercase
else{
buffer[i] = '\0';//c-style string
i = 0;
if(!isalpha(buffer[0]))
continue;//blank line
else //printf("%s\n",buffer);
InsertVocabulary(buffer);
}
}while(ch != EOF);
fclose(pFile);
}
void WriteVocabularyTOHashTable(const char *path){
FILE *pFile;
pFile = fopen(path, "w");
if(pFile == NULL){
perror("Fail to Write");
return;
}
int i = 0;
for(; i < HASHSIZE; i++){
HashNode *ptr = HashTable[i];
for(; ptr != NULL; ptr = ptr -> next){
fprintf(pFile, "Vocabulary:%s,Count:%d\n", ptr -> voc, ptr -> freq);
if(ptr -> next == NULL)
fprintf(pFile,"\n");
}
}
fclose(pFile);
}
int main(void){
time_t start, end;
time(&start);
ReadVocabularyTOHashTable("test.txt");
WriteVocabularyTOHashTable("result.txt");
time(&end);
double diff = difftime(end,start);
printf("%.21f seconds.\n", diff);
system("pause");
return 0;
}
This is an answer to your first question, sorting by frequency. Every hash node in your table is a distinct vocabulary entry. Some hash to the same code (thus your collision chains) but eventually you have one HashNode for every unique entry. To sort them by frequency with minimal disturbing of your existing code you can use qsort() with a pointer list (or any other sort of your choice) with relative ease.
Note: the most efficient way to do this would be to maintain a sorted linked-list during vocab-insert, and you may want to consider that. This code assumes you already have a hash table populated and need to get the frequencies out in sorted order of highest to lowest.
First, keep a running tally of all unique insertions. Simple enough: declare a global counter (for example, int gVocabCount = 0;) and increment it in your allocation subsection:
gVocabCount++; // increment with each unique entry.
ptr = (HashNode*)malloc(sizeof(HashNode));//if doesn't exist, create it
ptr -> freq = 1;
ptr -> voc = (char*)malloc(strlen(pVoc)+1);
strcpy(ptr -> voc, pVoc);
ptr -> next = HashTable[index];
HashTable[index] = ptr;
Next, allocate a list of pointers to HashNodes as large as your total unique vocab count. Then walk your entire hash table, including collision chains, and put each node into a slot in this list. The list had better be the same size as your total node count, or you did something wrong:
HashNode **nodeList = malloc(gVocabCount * sizeof(HashNode*));
int i;
int idx = 0;
for (i=0;i<HASHSIZE;++i)
{
HashNode* p = HashTable[i];
while (p)
{
nodeList[idx++] = p;
p = p->next;
}
}
So now we have a list of all unique node pointers. We need a comparison function to send to qsort(). We want the items with the largest numbers to be at the head of the list.
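int compare_nodeptr(const void* left, const void* right)
{
    int l = (*(const HashNode* const*)left)->freq;
    int r = (*(const HashNode* const*)right)->freq;
    return (r > l) - (r < l); // descending: larger counts sort first
}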
And finally, fire qsort() to sort your pointer list.
qsort(nodeList, gVocabCount, sizeof(HashNode*), compare_nodeptr);
The nodeList array of HashNode pointers will have all of your nodes sorted in descending frequency:
for (i=0; i<gVocabCount; ++i)
printf("Vocabulary:%s,Count:%d\n", nodeList[i]->voc, nodeList[i]->freq);
Finally, don't forget to free the list:
free(nodeList);
As I said at the beginning, the most efficient way to do this would be to keep a sorted linked list: whenever a node's count is incremented, pull it out and run an insertion-sort step to slip it back into the right place (by definition, all new entries can go at the end). In the end that list will look virtually identical to what the above code creates (the order of entries with equal counts notwithstanding; i.e. if a->freq = 5 and b->freq = 5, either a-b or b-a can happen).
Hope this helps.
EDIT: Updated to show OP an idea of what the Write function that outputs sorted data may look like:
static int compare_nodeptr(const void* left, const void* right)
{
    int l = (*(const HashNode* const*)left)->freq;
    int r = (*(const HashNode* const*)right)->freq;
    return (r > l) - (r < l); // descending: larger counts sort first
}
void WriteVocabularyTOHashTable(const char *path)
{
HashNode **nodeList = NULL;
size_t i=0;
size_t idx = 0;
FILE* pFile = fopen(path, "w");
if(pFile == NULL)
{
perror("Fail to Write\n");
return;
}
nodeList = malloc(gVocabCount * sizeof(HashNode*));
for (i=0,idx=0;i<HASHSIZE;++i)
{
HashNode* p = HashTable[i];
while (p)
{
nodeList[idx++] = p;
p = p->next;
}
}
// send to qsort()
qsort(nodeList, idx, sizeof(HashNode*), compare_nodeptr);
for(i=0; i < idx; i++)
fprintf(pFile, "Vocabulary:%s,Count:%d\n", nodeList[i]->voc, nodeList[i]->freq);
fflush(pFile);
fclose(pFile);
free(nodeList);
}
Something like that, anyway. From the OP's test file, these are the top few lines of output:
Vocabulary:the, Count:912
Vocabulary:of, Count:414
Vocabulary:to, Count:396
Vocabulary:a, Count:388
Vocabulary:that, Count:260
Vocabulary:in, Count:258
Vocabulary:and, Count:221
Vocabulary:is, Count:220
Vocabulary:it, Count:215
Vocabulary:unix, Count:176
Vocabulary:for, Count:142
Vocabulary:as, Count:121
Vocabulary:on, Count:111
Vocabulary:you, Count:107
Vocabulary:user, Count:102
Vocabulary:s, Count:102
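On the second question (the elapsed time always printing as zero): time() reports calendar time, typically in whole seconds, so difftime() of two readings taken less than a second apart is 0 no matter how many decimals you print. One common alternative is clock(), which measures processor time at a finer resolution. A minimal sketch follows; cpu_seconds is a made-up helper name.

```c
#include <time.h>

/* Converts a pair of clock() readings into seconds of processor time. */
double cpu_seconds(clock_t start, clock_t end)
{
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

In main you would take clock_t start = clock(); before the read and write calls and clock_t end = clock(); after them, then print cpu_seconds(start, end) with something like "%.3f seconds.\n". Note that clock() counts CPU time used by the program, not wall-clock time, so time spent waiting on I/O may not be included.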
