Parsing a String in C, Without Strtok() - c

I need to parse a string in C by removing all non-alphabetic characters from it. To do this I am checking the ASCII value of every char and making sure it's within the correct bounds. It works just the way I want it to, so that's not the problem. What I am having trouble with, however, is storing the resulting strings after the parse is completed. (I am 3 weeks into C, by the way.) Also, if you notice that I used weird sizes for the arrays, that's because I purposely made them bigger than they needed to be.
char * carry[2]; // This is to simulate argv
carry[1] = "hello1whats2up1"; // 0 is title so I placed at 1
char array[strlen(carry[1])]; // char array of string length
strcpy(array, carry[1]); // copied string to char array
char temp[strlen(carry[1]) + 1]; // Reusable char array
char * finalAnswer[10];
int m = 0, x = 0; // Indexes
if ((sizeof(carry))/8 > 1) { // We were given arguments
printf("Array: %lu\n\n", sizeof(array));
for (int i = 0; i < sizeof(array); i++)
{
if(isalpha(array[i])) { // A-Z & a-z
//printf("%s\n", temp);
temp[x] = array[i]; // Placing chars in temp array
x++;
}
else {
printf("String Length: %lu \nString Name: %s \nWord Index: %d \n\n",
strlen(temp), temp, m); // Testing Purposes
strcpy(finalAnswer[m], temp); // Copies temp into the final answer *** Source of Error
for(int w = 0; w < sizeof(temp); w++) { temp[w] = '\0'; } // Clears temp
x = 0;
m++;
}
}
printf("String Length: %lu \nString Name: %s \nWord Index: %d \n",
strlen(temp), temp, m); // Testing Purposes
strcpy(finalAnswer[m], temp);
for(int w = 0; w < sizeof(temp); w++) { temp[w] = '\0'; } // Clears temp
x = 0;
}
else { printf("No Arguments Given\n"); }
printf("\n");
** Edit
The error I keep getting is when I try copying temp to finalAnswer
** Edit 2
I solved the problem I was having with char * finalAnswer[10]
When I was trying to use strcpy on finalAnswer, I never specified the size that was needed to store the particular string. Works fine after I did it.
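The fix described in Edit 2 (giving every finalAnswer[m] its own storage before copying into it) can be sketched as a small helper. This is only an illustration under stated assumptions: the name split_alpha_words and its parameters are hypothetical and do not come from the original post, and the caller is expected to free each stored word.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `input` into runs of alphabetic characters
   and stores a freshly allocated copy of each run in words[].
   Returns the number of words stored, or -1 if an allocation fails.
   The caller frees each words[i]. */
int split_alpha_words(const char *input, char *words[], int max_words)
{
    int count = 0;
    const char *p = input;

    while (*p != '\0' && count < max_words) {
        /* skip anything that is not a letter */
        while (*p != '\0' && !isalpha((unsigned char)*p))
            p++;

        /* measure the run of letters that starts here */
        const char *start = p;
        while (isalpha((unsigned char)*p))
            p++;
        size_t len = (size_t)(p - start);
        if (len == 0)
            break;                      /* only non-letters were left */

        words[count] = malloc(len + 1); /* +1 for the terminator */
        if (words[count] == NULL) {
            while (count > 0)           /* undo the earlier allocations */
                free(words[--count]);
            return -1;
        }
        memcpy(words[count], start, len);
        words[count][len] = '\0';
        count++;
    }
    return count;
}
```

Called with "hello1whats2up1" and an array of 10 pointers, this should store "hello", "whats" and "up". The point is that strcpy (or memcpy) needs a destination that already owns enough bytes; an uninitialized char * in finalAnswer does not.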

Since you have solved the actual string parsing, your last comment, I shall take as the actual requirement.
"... I want to create a list of words with varying length that can be accessed by index ..."
That is certainly not a task to be solved easily if one is "three weeks into C". The data structure that represents it is the same one as the second argument of main():
// array (of unknown size)
// of pointers to char
char * argv[] ;
This can be written as a pointer to a pointer:
// same data structure as char * []
char ** list_of_words ;
And this is pushing you straight into the deep waters of C: a non-trivial C data structure. As such, it might require a bit more than four weeks of C.
But we can be creative. There is "inbuilt in C" one non trivial data structure we might use. A file.
We can write the words into the file. One word one line. And that is our output: list of words, separated by new line character, stored in a file.
We can even imagine and write a function that will read the word from that result "by index". As you (it seems) need.
// hint: there is a FILE * behind
int words_count = result_size () ;
const char * word = result_get_word(3) ;
Now, I have boldly gone ahead and have written "all" of it, beside that last "crucial" part. After all, I am sure you would like to contribute too.
So the working code (minus result_size() and result_get_word()) is alive and kicking here: https://wandbox.org/permlink/uLpAplNl6A3fgVGw
To avoid the "Wrath of Khan" I have also pasted it here:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
/*
task: remove all non alpha chars from a given string, store the result
*/
int process_and_save (FILE *, const char *) ;
int dump_result(FILE *) ;
int main( const int argc, const char * argv[] )
{
const char * filename = "words.txt";
const char * to_parse = "0abra123ka456dabra789" ;
(void)(&argc) ; (void)argv ; // pacify the compiler warnings
printf("\nInput: %s", to_parse ) ;
int retval = process_and_save(fopen(filename, "w"), to_parse ) ;
if ( EXIT_FAILURE != retval )
{
printf("\n\nOutput:\n") ;
retval = dump_result(fopen(filename, "r"));
}
return retval ;
}
int process_and_save (FILE * fp, const char * input )
{
if(!fp) {
perror("File opening failed");
return EXIT_FAILURE;
}
//
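// walk the input one character at a time, starting at the first one
// (incrementing before the test would skip input[0])
for ( const char * walker = input ; *walker ; ++walker )
{
// isalpha() needs a value representable as unsigned char
if ( isalpha((unsigned char)*walker) ) {
fprintf( fp, "%c", *walker ) ;
// I am alpha but next one is not
// so write word end, next
if ( ! isalpha((unsigned char)*(walker +1) ) )
fprintf( fp, "\n" ) ;
}
}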
fclose(fp);
return EXIT_SUCCESS;
}
int dump_result(FILE* fp )
{
if(!fp) {
perror("\nFile opening failed");
return EXIT_FAILURE;
}
int c; while ((c = fgetc(fp)) != EOF) { putchar(c); }
if (ferror(fp))
puts("\nI/O error when reading");
fclose(fp);
return EXIT_SUCCESS;
}
I think this is functional and does the job of parsing and storing the result, not in a complex data structure but in a simple file. The rest should be easy. If you need help, please do let me know.
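For readers who want to see what the missing part could look like, here is one possible sketch of result_size and result_get_word. The answer only names these two functions; the signatures below (an explicit FILE * plus a caller-supplied buffer) are assumptions made for illustration, and the code assumes one word per line with every line shorter than 255 characters.

```c
#include <stdio.h>
#include <string.h>

/* Counts the words in the file, assuming one word per line. */
int result_size(FILE *fp)
{
    char line[256];
    int count = 0;

    rewind(fp);
    while (fgets(line, sizeof line, fp) != NULL)
        count++;
    return count;
}

/* Copies the word at zero-based `index` into buf and returns buf,
   or returns NULL if the file has no such line. */
const char *result_get_word(FILE *fp, int index, char *buf, size_t bufsize)
{
    if (index < 0 || bufsize == 0)
        return NULL;

    rewind(fp);
    for (int i = 0; fgets(buf, (int)bufsize, fp) != NULL; i++) {
        if (i == index) {
            buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */
            return buf;
        }
    }
    return NULL;
}
```

Each lookup rescans the file from the start, which is simple but linear in the index; that seems acceptable for a small word list like the one in this example.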

Related

Read specific variables from a file and assign their values to those in my code

I'm not too practical with c or the "c way of doing stuff", and I stumbled upon a problem.
I have a file with some variables and their corresponding values such as
Var1 1
Var2 15
Var3 1.6
var4 SomeText
How can I read the variables from this file and assign it to variables with corresponding name in my code?
I'm looking for something in the style of
double Var3 = ReadFromFile(File, "Var3");
so my main attempt was focused on trying to parse the "Var3" part from the file, but I can't manage to do it in c, so any help would be appreciated. I don't want to simply read the file in order since the placement of the variables in the file should be arbitrary so also
var4 SomeText
Var2 15
Var1 1
Var3 1.6
should be readable.
My code thus far is
FILE* InstFile = fopen("Filename", "r");
char row[128];
while(fgets(row, sizeof(row), InstFile) != NULL) {
//I don't know what to put here to select only the text I want and extract the value.
}
Here's a function that uses strtok and strtod to get a double based on a string you enter:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define WHITESPACE " \t\r\n"
double get_double_from_file(FILE *fp, const char *var_name)
{
rewind(fp);
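for (char buf[1024]; fgets(buf, sizeof buf, fp);)
{
// strtok returns NULL for a blank line, so check before comparing
char *vname = strtok(buf, WHITESPACE);
if (vname == NULL || strcmp(vname, var_name) != 0)
continue;
char *vstr = strtok(NULL, WHITESPACE);
if (vstr == NULL)
break; // name found, but no value follows it
char *errptr;
double dv = strtod(vstr, &errptr);
if (*errptr != '\0')
{
// handle error
}
return dv;
}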
// handle if string not found
return 0.0;
}
int main(void)
{
FILE *fp = fopen("foo.txt", "r");
if (fp == NULL)
{
perror("foo.txt");
return EXIT_FAILURE;
}
printf("%lf\n", get_double_from_file(fp, "Var2"));
fclose(fp);
}
So if my foo.txt is:
Var1 1.23
Var2 4.56
Var3 7.89
Output is:
4.560000
You want something like:
int Var1 = ReadFromFile(File, "Var1");
int Var2 = ReadFromFile(File, "Var2");
double Var3 = ReadFromFile(File, "Var3");
char* Var4 = ReadFromFile(File, "Var4");
Well, that's not possible.
The reason is that a) a function can't return an int sometimes and a double other times, and b) when you read from the file all values come as text, so to get an int you need code to convert a text string to int, and likewise for double.
In other words - it can't be done without type information.
Instead you can do:
int Var1 = ReadIntFromFile(File, "Var1");
int Var2 = ReadIntFromFile(File, "Var2");
double Var3 = ReadDoubleFromFile(File, "Var3");
char* Var4 = ReadTextFromFile(File, "Var4");
so that you have a specific function for each type.
Unless the file is extremely big this is what I would do:
At program start-up I would read the whole file into a (dynamically allocated) array of structs. Each struct would have two strings, i.e. a key-string to hold the variable name and another string to hold the value. Like:
struct key_value
{
char key[32];
char val[992];
};
So when fgets give a string like "Var2 15" the string must be split into two strings where the first string is copied to key and the second string to val. Something like:
char row[2000];
int i = 0;
while(fgets(row, sizeof(row), InstFile) != NULL)
{
if (sscanf(row,
"%31s %991s",
key_value_array[i].key,
key_value_array[i].val) != 2)
{
// Error - unexpected format
exit(1); // or something better...
}
++i;
}
strcpy(key_value_array[i].key, "end_of_array");
strcpy(key_value_array[i].val, "");
note: The code above doesn't check that size of key_value_array is sufficient to hold all the keys. Adding such a check I'll leave to OP.
I added an extra struct in the end to indicate end-of-array.
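The size check that the note leaves to the reader could be handled by growing the array as lines arrive. The sketch below is one way to do it and is not the answer's own code: load_key_values is a hypothetical name, and it reports the number of entries through a count parameter instead of the "end_of_array" sentinel.

```c
#include <stdio.h>
#include <stdlib.h>

struct key_value
{
    char key[32];
    char val[992];
};

/* Hypothetical loader: reads "name value" lines from fp into a heap array
   that doubles in size whenever it is full. On success it stores the
   number of entries in *count and returns the array (the caller frees it).
   Returns NULL on allocation failure or on a malformed line. */
struct key_value *load_key_values(FILE *fp, size_t *count)
{
    size_t used = 0, capacity = 8;
    struct key_value *arr = malloc(capacity * sizeof *arr);
    char row[2000];

    if (arr == NULL)
        return NULL;

    while (fgets(row, sizeof row, fp) != NULL) {
        if (used == capacity) {
            /* full: double the capacity before storing the next entry */
            struct key_value *tmp = realloc(arr, 2 * capacity * sizeof *arr);
            if (tmp == NULL) {
                free(arr);
                return NULL;
            }
            arr = tmp;
            capacity *= 2;
        }
        if (sscanf(row, "%31s %991s", arr[used].key, arr[used].val) != 2) {
            free(arr);          /* unexpected format */
            return NULL;
        }
        used++;
    }
    *count = used;
    return arr;
}
```

Keeping a count is a design choice rather than a correction: the sentinel approach works too, as long as no real key is ever named "end_of_array".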
Now you can search the array of structs for the correct key and - if found - convert the value string in accordance with the type information.
Something like:
int ReadInt(struct key_value * key_value_array, char * var_name)
{
int result = 0;
int i = 0;
while( strcmp(key_value_array[i].key, "end_of_array") != 0 )
{
if( strcmp(key_value_array[i].key, var_name) == 0 )
{
// Variable found
result = atoi(key_value_array[i].val);
break;
}
++i;
}
return result;
}
The good thing about this approach is that you only need to access the file once. Further the code for reading the file is completely independent of type information.
However, there is one big problem here... The code can't give you error information in case a specific variable name wasn't found or a type conversion failed.
So we need to add that.
This is what I would do:
int ReadInt(struct key_value * key_value_array, char * var_name, int* value)
{
int i = 0;
while( strcmp(key_value_array[i].key, "end_of_array") != 0 )
{
if( strcmp(key_value_array[i].key, var_name) == 0 )
{
// Variable found
*value = atoi(key_value_array[i].val);
return 0;
}
++i;
}
return -1; // Error, key not found
}
and call it like:
int Var1;
if (ReadInt(key_value_array, "Var1", &Var1))
{
// Error handling... perhaps
Var1 = some_default_value;
}
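The version above reports a missing key, but atoi still cannot report a failed conversion: it returns 0 for text such as "SomeText", which is indistinguishable from a real 0. A minimal sketch of a checked conversion using strtol follows; parse_int is a hypothetical helper name, not part of the answer.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts text to int.
   Returns 0 and stores the result in *value on success, or -1 if text
   is not a complete decimal integer that fits in an int. */
int parse_int(const char *text, int *value)
{
    char *end;

    errno = 0;
    long n = strtol(text, &end, 10);

    if (end == text || *end != '\0')
        return -1;      /* no digits at all, or trailing characters */
    if (errno == ERANGE || n < INT_MIN || n > INT_MAX)
        return -1;      /* value does not fit in an int */

    *value = (int)n;
    return 0;
}
```

Inside ReadInt, the line that calls atoi could instead return the result of parse_int(key_value_array[i].val, value), so the caller sees -1 both for a missing key and for a value such as "1.6" that is not an integer.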
A last remark:
Global variables are something that should be avoided. However, configuration data like this is (IMO) an exception. Placing a file-scope global array variable and all the associated functions in one compilation unit would be fine. By doing that there would be no need to pass the array in all function calls. So a call could be:
int Var1;
if (ReadInt("Var1", &Var1))
{
// Error handling... perhaps
Var1 = some_default_value;
}
The simplest design is just sscanf, checking one variable at a time:
int var1;
float var2;
// etc...
while(fgets(...) != NULL) {
if (sscanf(line, "Var1 %d", &var1) == 1) {
// yay - var1 assigned
} else if (sscanf(line, "Var2 %f", &var2) == 1) {
// yay - var2 assigned
} else {
// invalid input - display error
continue;
}
}
You can extract the first word with strtok and then sscanf the value, or scan the value with strtol/strtof/other.
int var1;
float var2;
// etc...
while(fgets(...) != NULL) {
const char delim[] = " ";
char *firstword = strtok(line, delim);
char *value = strtok(NULL, delim);
if (firstword == NULL) {
// error - nothing on the line
continue;
}
if (value == NULL) {
// error - only one thing on the line
continue;
}
if (strcmp(firstword, "Var1") == 0) {
if (sscanf(value, "%d", &var1) != 1) {
// error
continue;
}
// use var1
} else if (strcmp(firstword, "Var2") == 0) {
if (sscanf(value, "%f", &var2) != 1) {
// error
continue;
}
// use var2
} else {
// error - invalid first word
continue;
}
}
i.e. the sky is the limit. You should definitely research how to deal with strings in C and the string-related functions, and read the sscanf manual: scanf can be tricky to get safe and right.
Here is my version of the function ReadDoubleFromFile.
My version of the function has the following advantages:
It returns whether it was able to find the name in the file or not.
It performs extensive input validation.
It does not use strtok, so it is thread-safe.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdbool.h>
//The following function will return true if it was able to
//find the name in the file, false otherwise. If it returns
//true, the found value will be written to *pd.
bool ReadDoubleFromFile( FILE *fp, const char *name, double *pd )
{
char line[200];
size_t name_len = strlen(name);
//rewind file position to start
rewind( fp );
//read one line per loop iteration
while( fgets( line, sizeof line, fp ) != NULL)
{
//count and remember length of line
size_t line_len = strlen(line);
//remove newline character, if it exists
if ( line_len > 0 && line[line_len-1] == '\n' )
line[--line_len] = '\0';
//check line length
if ( line_len == sizeof line - 1 )
{
fprintf( stderr, "error: line too long\n" );
return false;
}
//check if all characters of name match
if ( strncmp( name, line, name_len ) != 0 )
continue;
//check if there are additional characters (which would
//mean that there is no match)
if ( line[name_len] != ' ' )
continue;
//attempt to convert the rest of the line to double
char *end;
*pd = strtod( line + name_len + 1 , &end );
//check if the entire rest of the line was converted
if ( end == line + name_len + 1 || *end != '\0' )
{
fprintf( stderr, "error: conversion to double failed\n" );
return false;
}
return true;
}
return false;
}

How to read word for word that are only separated by a ":" from the buffer?

I am making a language translator, and want to read from the buffer word by word and store them in a key-value struct.
The buffer contains such a file:
hola:hello
que:what
and so on. I already tried everything and I keep getting errors such as segmentation fault: 11, or it just reads the same line again and again.
struct key_value{
char *key;
char *value;
};
...
struct key_value *kv = malloc(sizeof(struct key_value) * count);
char k[20]; //key
char v[20]; //value
int x = 0;
for(i = 0; i < numbytes; i++){
sscanf(buffer,"%21[^:]:%21[^\n]\n",k,v);
(kv + i)->key = k;
(kv + i)->value = v;
}
for(i = 0; i < count; i++){
printf("key: %s, value: %s\n",(kv + i)->key,(kv + i)->value);
}
free(buffer);
free(kv);
I expect the output to be key: hola, value: hello key: que, value: what,
but the actual output is just key: hola, value: hello again and again.
Which is the right way to do it?
There are multiple problems with your code, among them
On each loop iteration, you read from the beginning of the buffer. It is natural, then, that each iteration extracts the same key and value.
More generally, your read loop iteration variable seems to have no relationship with the data read. It appears to be a per-byte iteration, but you seem to want a per-line iteration. You might want to look into scanf's %n directive to help you track progress through the buffer.
You are scanning each key / value pair into the same local k and v variables, then you are assigning pointers to those variables to your structures. The resulting pointers are all the same, and they will become invalid when the function returns. I suggest giving struct key_value arrays for its members instead of pointers, and copying the data into them.
Your sscanf format reads up to 21 characters each for key and value, but the provided destination arrays are not long enough for that. You need them to be dimensioned for at least 22 characters to hold 21 plus a string terminator.
Your sscanf() format and usage do not support recognition of malformed input, especially overlength keys or values. You need to check the return value, and you probably need to match the trailing newline with a %c field (the literal newline in the format does not mean what you think it means).
Tokenizing (the whole buffer) with strtok_r or strtok or even strchr instead of sscanf() might be easier for you.
Also, style note: your expressions of the form (kv + i)->key are valid, but it would be more idiomatic to write kv[i].key.
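To make the strchr suggestion concrete, here is a minimal sketch that walks a writable, NUL-terminated buffer line by line and copies each pair into array members. The names kv_pair and parse_pairs and the 22-byte member size are assumptions chosen for illustration; lines without a ':' or with overlength fields are skipped.

```c
#include <string.h>

struct kv_pair {
    char key[22];
    char value[22];
};

/* Hypothetical parser: reads "key:value" lines from buffer into out[].
   The buffer is modified (separators are overwritten with '\0').
   Returns the number of pairs stored, at most max_pairs. */
size_t parse_pairs(char *buffer, struct kv_pair *out, size_t max_pairs)
{
    size_t count = 0;
    char *line = buffer;

    while (line != NULL && *line != '\0' && count < max_pairs) {
        /* terminate the current line and remember where the next starts */
        char *next = strchr(line, '\n');
        if (next != NULL)
            *next++ = '\0';

        char *sep = strchr(line, ':');
        if (sep != NULL) {
            *sep++ = '\0';      /* line is now the key, sep the value */
            if (strlen(line) < sizeof out[count].key &&
                strlen(sep) < sizeof out[count].value) {
                strcpy(out[count].key, line);
                strcpy(out[count].value, sep);
                count++;
            }
        }
        line = next;            /* advance, so no line is read twice */
    }
    return count;
}
```

Because every iteration advances line to the start of the next line, the "same pair again and again" symptom cannot occur, and because the data is copied into arrays, each entry keeps its own key and value.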
I've written a simple piece of code that may help you solve your problem. I've used the function fgets to read from a file named "file.txt" and the function strchr to locate the first occurrence of the separator ':'.
Here is the code:
#include <stdio.h>
#include <string.h>
#include <errno.h>
#define MAX_LINE_SIZE 256
#define MAX_DECODED_LINE 1024
struct decod {
char key[MAX_LINE_SIZE];
char value[MAX_DECODED_LINE];
};
static struct decod decod[1024];
int main(void)
{
FILE * fptr = NULL;
char fbuf[MAX_LINE_SIZE];
char * value;
int cnt=0,i;
if ( !(fptr=fopen("file.txt","r")) )
{
perror("");
return errno;
}
while( fgets(fbuf,MAX_LINE_SIZE,fptr)) {
// Eliminate UNIX/DOS line terminator
value=strrchr(fbuf,'\n');
if (value) *value=0;
value=strrchr(fbuf,'\r');
if (value) *value=0;
//Find first occurrence of the separator ':'
value=strchr(fbuf,':');
if (value) {
// Truncates fbuf string to first word
// and (++) points second word
*value++=0;
}
if (cnt<MAX_DECODED_LINE) {
strcpy(decod[cnt].key,fbuf);
if (value!=NULL) {
strcpy(decod[cnt].value,value);
} else {
decod[cnt].value[0]=0;
}
cnt++;
} else {
fprintf(stderr,
"Cannot read more than %d lines\n", MAX_DECODED_LINE);
break;
}
}
if (fptr)
fclose(fptr);
for(i=0;i<cnt;i++) {
printf("key:%s\tvalue:%s\n",decod[i].key,decod[i].value);
}
return 0;
}
This code reads all the lines (max 1024) of the file named file.txt, loads each key/value pair it finds (max 1024) into the struct array decod, and then prints the contents of the array.
I wrote this code, and I think it does the job. I think it is simpler than the accepted answer, and it uses just as much memory as is needed, no more.
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
struct key_value{
char key[22];
char value[22];
};
void parse_str(char* str, struct key_value** kv_arr, int* num){
int n = 0;
int read = -1;
char k[22];
char v[22];
int current_pos = 0;
int consumed = 0;
/*counting number of key-value pairs*/
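while (1){
consumed = 0;
read = sscanf(str + current_pos, "%21[^:]:%21[^\n]\n%n", k, v, &consumed);
if(read != 2){
break; /* end of input or malformed pair: stop instead of looping forever */
}
++n;
if(consumed == 0){
break; /* no progress was made, so there is nothing left to scan */
}
current_pos += consumed;
}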
printf("n = %d\n", n);
*kv_arr = malloc(sizeof(struct key_value) * n);
/*filling key_value array*/
int i = 0;
read = -1;
current_pos = 0;
consumed = 0;
while (i < n){
consumed = 0;
read = sscanf(str + current_pos, "%21[^:]:%21[^\n]\n%n", k, v, &consumed);
if(read != 2){
break; /* end of input or malformed pair */
}
struct key_value* kv = &((*kv_arr)[i]);
strncpy(kv->key, k, 22);
strncpy(kv->value, v, 22);
++i;
if(consumed == 0){
break; /* no progress was made */
}
current_pos += consumed;
}
*num = n;
}
int main(){
char* str = "hola:hello\n"
"que:what\n";
int n;
struct key_value* kv_arr;
parse_str(str, &kv_arr, &n);
for (int i = 0; i < n; ++i) {
printf("%s <---> %s\n", kv_arr[i].key, kv_arr[i].value);
}
free(kv_arr);
return 0;
}
output :
n = 2
hola <---> hello
que <---> what
Process finished with exit code 0
Note: sscanf operates on a const char*, not an input stream from a file, so it will NOT store any information about what it has consumed.
solution : I used %n in the format string to get the number of characters that it has consumed so far (C89 standard).

How to read pointer values into an array of structs

I have following struct
typedef struct
{
char* city;
int temp;
} Place;
I am attempting to read in two values from a line into an array of structs.
The lines look like:
Los Angeles; 88
I am attempting to read data into the array. Assuming my memory allocation is correct what is the correct way to read in these values.
my code
void readData(FILE** fpData, FILE** fpOutput)
{
char s[100];
int index = 0;
Place *values;
values=malloc(size * sizeof(Place));
if (values == NULL)
{
MEM_ERROR;
exit(1);
}
for (int a = 0; a < size; a++)
{
(values[a]).city = (char *) malloc(100 * sizeof(char));
if(values[a].city == NULL)
{
MEM_ERROR;
exit(100);
}
}
while(fgets(s, sizeof(s), *fpData)!=NULL)
{
sscanf(s, "%[^:]%*c%d\n", values[index].city, &values[index].temp);
index++;
}
sortInsertion(values, size, fpOutput);
free(values);
return;
}
The city is not going into the array so I am assuming the part where it says values[index].city is incorrect.
How can I fix this ?
Your data uses a semicolon (;) while your sscanf format uses a colon (:); make sure they are the same character. If your data really uses a semicolon, change the %[^:] part of the sscanf format to %[^;].
Here is my code, and how I run it to show you that it works:
#include <stdio.h>
struct Place {
char city[100];
int temp;
} values[30];
int main() {
char s[100];
int i=0, n=0;
while ( n < 30 && fgets(s, sizeof(s), stdin) != NULL ) {
// %99[^;] limits the city to 99 characters plus the terminator
if ( sscanf(s, "%99[^;]%*c%d", values[n].city, &values[n].temp) == 2 )
n++;
}
printf("n=%d\n", n);
for ( i=0; i<n; i++ ) {
printf("values[%d] = (%s, %d)\n", i, values[i].city, values[i].temp);
}
}
This is how I run it on Linux:
% for a in `seq 1 3`; do echo "City-$a; $a$a"; done | ./a.out
n=3
values[0] = (City-1, 11)
values[1] = (City-2, 22)
values[2] = (City-3, 33)
sscanf(s, "%[^:]%*c%d\n", values[index].city, &values[index].temp);
This will copy everything from the start of the line read up to the first colon (:) into the city array you allocated. Your example input would seem to have a semi-colon (;), so you'll get the entire line in the city array and nothing in the temp field.
You don't do any input checking, so any too-long input line will get split into multiple (probably corrupted) cities. You'll also have problems if there are not exactly size lines in the file, as you don't check to make sure index doesn't go past size while reading, and you assume you have size entries at the end without checking.
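The checks mentioned here (field width, return value, and an upper bound on the index) might look like the following sketch. PlaceRecord, parse_place and read_places are hypothetical names, and storing the city in a fixed 100-byte array is an assumption made to keep the example short.

```c
#include <stdio.h>

typedef struct {
    char city[100];
    int temp;
} PlaceRecord;

/* Parses one "City Name; 88" line.
   Returns 1 on success, 0 if the line does not have exactly that shape. */
int parse_place(const char *line, PlaceRecord *out)
{
    char trailing;
    /* %99[^;] caps the city at 99 characters; the final %c only converts
       if something other than whitespace follows the number. */
    int fields = sscanf(line, " %99[^;]; %d %c",
                        out->city, &out->temp, &trailing);
    return fields == 2;
}

/* Reads at most max records from fp; malformed lines are skipped.
   Returns the number of records actually stored. */
size_t read_places(FILE *fp, PlaceRecord *out, size_t max)
{
    char line[256];
    size_t n = 0;

    while (n < max && fgets(line, sizeof line, fp) != NULL) {
        if (parse_place(line, &out[n]))
            n++;
    }
    return n;
}
```

The caller should then pass the returned count, not the allocated size, to whatever sorts or prints the records, since the file may hold fewer valid lines than the array has room for.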

Read lines from a file and create alphabetically sorted array

I am learning C and I want to do this specific task. I know there are a number of similar questions and answers, but still... I will try to be more specific. Let's say I have a file with the following lines:
program01
programs
aprogram
1program
prog
5program
And I want now an array with:
1program
5program
aprogram
prog
program01
programs
So there are ONLY lowercase Latin letters and digits in the strings, no spaces. I know how to perform some separate steps, but want to get a feel for the whole (and proper) concept, so to say. Could it perhaps make some sorting decisions on the fly while first reading from the file? A manual sort is preferred for my particular case, just for the sake of better learning and possible optimisation. Let's say the maximum length of one line is 256 and the maximum number of lines is 256. Thanks in advance.
Check the below code:
#include <stdio.h>
#include<string.h>
int main(void) {
char a[256][256];
int i=0,j=0,k=0,n;
while(i<256 && fgets(a[i],256,stdin) != NULL)
{
n = strlen(a[i]);
if(n >0 && a[i][n-1] == '\n')
a[i][n -1] = '\0';
i++;
}
for(j=0;j<i;j++)
{
char max[256];
strcpy(max,a[j]);
for(k=j+1;k<i;k++)
{
if(strcmp(a[k],max) < 0)
{
char tmp[256];
strcpy(tmp,a[k]);
strcpy(a[k],max);
strcpy(max,tmp);
}
}
strcpy(a[j],max);
}
for(j=0;j<i;j++)
{
printf("%s\n",a[j]);
}
return 0;
}
The following compiles cleanly; however, I have not tested it. You might want to modify it to get the file name from the command line.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAX_ROWS (256)
#define MAX_COLUMNS (256)
#define FILE_NAME "myInputFile"
// prototypes
void bubbleSortWordsArray( int wordCount );
void printWordsArray( int wordCount );
static char words[MAX_ROWS][MAX_COLUMNS] = {{'\0','\0'}};
int main(void)
{
FILE *fp = NULL;
if( NULL == (fp = fopen( FILE_NAME, "r") ) )
{
perror( "fopen failed" );
exit( EXIT_FAILURE );
}
// implied else, fopen successful
// read each line from file into entry in words array
int i = 0;
while( i < MAX_ROWS && fgets(words[i], MAX_COLUMNS, fp ) )
{
// remove trailing newline from string, if there is one
words[i][strcspn(words[i], "\n")] = '\0';
i++;
}
fclose( fp );
// 'i' contains number of valid entries in words[][]
// sort the array of strings
bubbleSortWordsArray(i);
printWordsArray(i);
return(0);
} // end function: main
void bubbleSortWordsArray( int wordCount )
{
int c; // outer index through rows
int d; // inner index through rows
char swap[MAX_COLUMNS] = {'\0'};
for (c = 0 ; c < ( wordCount - 1 ); c++)
{
for (d = 0 ; d < (wordCount - c - 1); d++)
{
if( 0 < strcmp( words[d], words[d+1] ) ) // words[d] sorts after words[d+1]
{ // then words need to be swapped
strcpy( swap, words[d] );
strcpy( words[d], words[d+1]);
strcpy( words[d+1], swap );
} // end if compare/swap
} // end for
} // end for each row
} // end function: bubbleSortWordsArray
void printWordsArray( int wordCount )
{
int i; // loop index
printf( "\n" ); // start on new output line
for( i=0; i<wordCount; i++ )
{
printf( "%s\n", words[i] );
}
} // end function: printWordsArray
Make a 2D char array.
Read the file with the fscanf function, storing each string in its own row. The "%255s" conversion reads one whitespace-delimited token, so it reads a whole line here only because the lines contain no spaces.
Then sort the rows by comparing the strings character by character, starting from the first index.
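The question also asks whether sorting decisions can be made on the fly while reading. One manual way to do that is an insertion sort that places each line at its sorted position as soon as it is read. The sketch below assumes the limits stated in the question (256 lines of at most 255 characters plus terminator); read_sorted is a hypothetical name.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LINES 256
#define MAX_LEN   256

/* Reads lines from fp and keeps lines[] sorted after every insertion.
   Returns the number of lines stored. */
int read_sorted(FILE *fp, char lines[][MAX_LEN], int max_lines)
{
    char buf[MAX_LEN];
    int count = 0;

    while (count < max_lines && fgets(buf, sizeof buf, fp) != NULL) {
        buf[strcspn(buf, "\n")] = '\0';   /* drop the newline, if any */

        /* shift entries that sort after buf one row down to open a gap */
        int pos = count;
        while (pos > 0 && strcmp(lines[pos - 1], buf) > 0) {
            strcpy(lines[pos], lines[pos - 1]);
            pos--;
        }
        strcpy(lines[pos], buf);
        count++;
    }
    return count;
}
```

strcmp compares character codes, so in ASCII the digits sort before the lowercase letters, which matches the ordering shown in the question. Copying whole rows is simple but moves a lot of bytes; sorting an array of pointers to the rows would be a natural next optimisation.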

Loop crashing in C

I'm very new to C and I'm still learning the basics. I'm creating an application that reads in a text file and breaks down the words individually. My intention will be to count the amount of times each word occurs.
Anyway, the last do-while loop in the code below executes fine, and then crashes. This loop prints memory address to this word (pointer) and then prints the word. It accomplishes this fine, and then crashes on the last iteration. My intention is to push this memory address into a singly linked list, albeit once it's stopped crashing.
Also, just a quick mention regarding the array sizes below: I have yet to figure out how to set the correct size needed to hold the word character array etc., because you must define the size before the array is filled, and I don't know how to do this. Hence I've set them to 1024.
#include<stdio.h>
#include<string.h>
int main (int argc, char **argv) {
FILE * pFile;
int c;
int n = 0;
char *wp;
char wordArray[1024];
char delims[] = " "; // delims spaces in the word array.
char *result = NULL;
result = strtok(wordArray, delims);
char holder[1024];
pFile=fopen (argv[1],"r");
if (pFile == NULL) perror ("Error opening file");
else {
do {
c = fgetc (pFile);
wordArray[n] = c;
n++;
} while (c != EOF);
n = 0;
fclose (pFile);
do {
result = strtok(NULL, delims);
holder[n] = *result; // holder stores the value of 'result', which should be a word.
wp = &holder[n]; // wp points to the address of 'holder' which holds the 'result'.
n++;
printf("Pointer value = %d\n", wp); // Prints the address of holder.
printf("Result is \"%s\"\n", result); // Prints the 'result' which is a word from the array.
//sl_push_front(&wp); // Push address onto stack.
} while (result != NULL);
}
return 0;
}
Please ignore the bad program structure, as I mentioned, I'm new to this!
Thanks
As others have pointed out, your second loop attempts to dereference result before you check for it being NULL. Restructure your code as follows:
result = strtok( wordArray, delims ); // do this *after* you have read data into
// wordArray
while( result != NULL )
{
holder[n] = *result;
...
result = strtok( NULL, delims );
}
Although...
You're attempting to read the entire contents of the file into memory before breaking it up into words; that's not going to work for files bigger than the size of your buffer (currently 1K). If I may make a suggestion, change your code such that you're reading individual words as you go. Here's an example that breaks the input stream up into words delimited by whitespace (blanks, newlines, tabs, etc.) and punctuation (period, comma, etc.):
#include <stdio.h>
#include <ctype.h>
int main(int argc, char **argv)
{
char buffer[1024];
int c;
size_t n = 0;
FILE *input = stdin;
if( argc > 1 )
{
input = fopen( argv[1], "r");
if (!input)
input = stdin;
}
while(( c = fgetc(input)) != EOF )
{
if (isspace(c) || ispunct(c))
{
if (n > 0)
{
buffer[n] = 0;
printf("read word %s\n", buffer);
n = 0;
}
}
else if (n < sizeof buffer - 1) // leave room for the terminator
{
buffer[n++] = c;
}
}
if (n > 0)
{
buffer[n] = 0;
printf("read word %s\n", buffer);
}
fclose(input);
return 0;
}
No warranties express or implied (having pounded this out before 7:00 a.m.). But it should give you a flavor of how to parse a file as you go. If nothing else, it avoids using strtok, which is not the greatest of tools for parsing input. You should be able to adapt this general structure to your code. For best results, you should abstract that out into its own function:
int getNextWord(FILE *stream, char *buf, size_t bufsize)
{
int c;
size_t n = 0;
while(( c = fgetc(stream)) != EOF )
{
if (isspace(c) || ispunct(c))
{
if (n > 0)
break; // delimiter after a word: the word is complete
// otherwise keep skipping leading delimiters
}
else if (n + 1 < bufsize) // leave room for the terminator
{
buf[n++] = c;
}
}
if (bufsize > 0)
buf[n] = 0;
return n > 0; // 1 if a word was read, 0 at end of input
}
and you would call it like
void foo(void)
{
char word[SOME_SIZE];
...
while (getNextWord(inFile, word, sizeof word))
{
do_something_with(word);
}
...
}
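Since the stated goal is to count how often each word occurs, the per-word reader can feed a small tally table. The sketch below is one simple way to do that with a linear search; word_count, next_word, count_words and the 64-byte word limit are all assumptions for illustration, and a linked list or hash table would be the natural replacement for larger inputs.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define WORD_MAX 64

struct word_count {
    char word[WORD_MAX];
    int count;
};

/* Reads the next word (delimited by whitespace or punctuation) into buf.
   Returns 1 if a word was read, 0 at end of input.
   Words longer than bufsize - 1 characters are truncated. */
static int next_word(FILE *stream, char *buf, size_t bufsize)
{
    int c;
    size_t n = 0;

    while ((c = fgetc(stream)) != EOF) {
        if (isspace(c) || ispunct(c)) {
            if (n > 0)
                break;          /* end of the current word */
        } else if (n + 1 < bufsize) {
            buf[n++] = (char)c;
        }
    }
    buf[n] = '\0';
    return n > 0;
}

/* Tallies every word in stream. Returns the number of distinct words
   stored in table; new words are ignored once the table is full. */
size_t count_words(FILE *stream, struct word_count *table, size_t max_entries)
{
    char word[WORD_MAX];
    size_t used = 0;

    while (next_word(stream, word, sizeof word)) {
        size_t i;
        for (i = 0; i < used; i++) {
            if (strcmp(table[i].word, word) == 0) {
                table[i].count++;       /* seen before */
                break;
            }
        }
        if (i == used && used < max_entries) {
            strcpy(table[used].word, word);
            table[used].count = 1;      /* first occurrence */
            used++;
        }
    }
    return used;
}
```

The comparison is case-sensitive, so "The" and "the" are counted separately; lowercasing each character with tolower before storing it would merge them.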
If your do...while code expects that result can be NULL (that is the loop's exit condition), how do you expect this line to work on the final iteration?
holder[n] = *result;
It dereferences result before the NULL check, which looks like the reason your program crashes.
Change do while loop to while
use
while (condition)
{
}
instead of
do {
}while(condition)
It is crashing because you are trying to dereference a NULL pointer, result, in the do-while loop.
I work mostly with Objective-C and was just looking at your question for fun, but I may have a solution.
Before setting n=0; after your first do-while loop, create another variable called totalWords and set it equal to n, totalWords can be declared anywhere within the file (except within one of the do-while loops), but can be defined at the top to the else block since its lifetime is short:
totalWords = n;
then you can set n back to zero:
n = 0;
Your conditional for the final do-while loop should then say:
...
} while (n <= ++totalWords);
The logic behind the application will thus say, count the words in the file (there are n words, which is the totalWords in the file). When program prints the results to the console, it will run the second do-while loop, which will run until n is one result past the value of totalWords (this ensures that you print the final word).
Alternately, it is better practice and clearer for other programmers to use a loop and a half:
do {
result = strtok(NULL, delims);
if (result == NULL) break; // no more tokens: leave before dereferencing result
holder[n] = *result;
wp = &holder[n];
printf("Pointer value = %p\n", (void *)wp); // %p is the conversion for pointers
printf("Result is \"%s\"\n", result);
//sl_push_front(&wp); // Push address onto stack.
if (n == totalWords) break; // This forces the program to exit the do-while after we have printed the last word
n++; // We only need to increment if we have not reached the last word
// if our logic is bad, we will enter an infinite loop, which will tell us while testing that our logic is bad.
} while (true); // true requires <stdbool.h>

Resources