Reading formatted strings from file into Array in C

Reading formatted strings from file into Array in C - c

I am new to the C programming language and trying to improve by solving problems from the Project Euler website using only C and its standard libraires. I have covered basic C fundamentals(I think), functions, pointers, and some basic file IO but now am running into some issues.
The question is about reading a text file of first names and calculating a "name score" blah blah, I know the algorithm I am going to use and have most of the program setup but just cannot figure out how to read the file correctly.
The file is in the format
"Nameone","Nametwo","billy","bobby","frank"...
I have searched and searched and tried countless things but cannot seem to read these as individual names into an array of strings(I think thats the right way to store them individually?) I have tried using sscanf/fscanf with %[^\",]. I have tried different combos of those functions and fgets, but my understanding of fgets is everytime I call it it will get a new line, and this is a text file with over 45,000 characters all on the same line.
I am unsure if I am running into problems with my misunderstanding of the scanf functions, or my misunderstanding with storing an array of strings. As far as the array of strings goes, I (think) I have realized that when I declare an array of strings it does not allocate memory for the strings themselves, something that I need to do. But I still cannot get anything to work.
Here is the code I have now to try to just read in some names I enter from the command line to test my methods.
This code works to input any string up to buffer size(100):
int main(void)
{
int i;
char input[100];
char* names[10];
printf("\nEnter up to 10 names\nEnter an empty string to terminate input: \n");
for(int i = 0; i < 10; i++)
{
int length = 0;
printf("%d: ", i);
fgets(input, 100, stdin);
length = (int)strlen(input);
input[length-1] = 0; // Delete newline character
length--;
if(length < 1)
{
break;
}
names[i] = malloc(length+1);
assert(names[i] != NULL);
strcpy(names[i], input);
}
}
However, I simply cannot make this work for reading in the formatted strings.
PLEASE advise me as to how to read it in with format. I have previously used sscanf on the input buffer and that has worked fine, but I dont feel like I can do that on a 45000+ char line? Am I correct in assuming this? Is this even an acceptable way to read strings into an array?
I apologize if this is long and/or not clear, it is very late and I am very frustrated.
Thank anyone and everyone for helping, and I am looking forward to finally becoming an active member on this site!

There are really two basic issues here:
Whether scanning string input is the proper strategy here. I would argue not because while it might work on this task you are going to run into more complicated scenarios where it too easily breaks.
How to handle a 45k string.
In reality you won't run into too many string of this size but it is nothing that a modern computer of any capacity can't easily handle. Insofar as this is for learning purposes then learn iteratively.
The easiest first approach is to fread() the entire line/file into an appropriately sized buffer and parse it yourself. You can use strtok() to break up the comma-delimited tokens and then pass the tokens to a function that strips the quotes and returns the word. Add the word to your array.
For a second pass you can do away with strtok() and just parse the string yourself by iterating over the buffer and breaking up the comma tokens yourself.
Last but not least you can write a version that reads smaller chunks of the file into a smaller buffer and parses them. This has the added complexity of handling multiple reads and managing the buffers to account for half-read tokens at the end of a buffer and so on.
In any case, break the problem into chunks and learn with each refinement.
EDIT
#define MAX_STRINGS 5000
#define MAX_NAME_LENGTH 30
char* stripQuotes(char *str, char *newstr)
{
char *temp = newstr;
while (*str)
{
if (*str != '"')
{
*temp = *str;
temp++;
}
str++;
}
return(newstr);
}
int main(int argc, char *argv[])
{
char fakeline[] = "\"Nameone\",\"Nametwo\",\"billy\",\"bobby\",\"frank\"";
char *token;
char namebuffer[MAX_NAME_LENGTH] = {'\0'};
char *name;
int index = 0;
char nameArray[MAX_STRINGS][MAX_NAME_LENGTH];
token = strtok(fakeline, ",");
if (token)
{
name = stripQuotes(token, namebuffer);
strcpy(nameArray[index++], name);
}
while (token != NULL)
{
token = strtok(NULL, ",");
if (token)
{
memset(namebuffer, '\0', sizeof(namebuffer));
name = stripQuotes(token, namebuffer);
strcpy(nameArray[index++], name);
}
}
return(0);
}

fscanf("%s", input) reads one token (a string surrounded by spaces) at a time. You can either scan the input until you encounter a specific "end-of-input" string, such as "!", or you can wait for the end-of-file signal, which is achieved by pressing "Ctrl+D" on a Unix console or by pressing "Ctrl+Z" on a Windows console.
The first option:
fscanf("%s", input);
if (input[0] == '!') {
break;
}
// Put input on the array...
The second option:
result = fscanf("%s", input);
if (result == EOF) {
break;
}
// Put input on the array...
Either way, as you read one token at a time, there are no limits on the size of the input.

Why not search the giant string for quote characters instead? Something like this:
#include <stdio.h>
#include <string.h>
int main(void)
{
char mydata[] = "\"John\",\"Smith\",\"Foo\",\"Bar\"";
char namebuffer[20];
unsigned int i, j;
int begin = 1;
unsigned int beginName, endName;
for (i = 0; i < sizeof(mydata); i++)
{
if (mydata[i] == '"')
{
if (begin)
{
beginName = i;
}
else
{
endName = i;
for (j = beginName + 1; j < endName; j++)
{
namebuffer[j-beginName-1] = mydata[j];
}
namebuffer[endName-beginName-1] = '\0';
printf("%s\n", namebuffer);
}
begin = !begin;
}
}
}
You find the first double quote, then the second, and then read out the characters in between to your name string. Then you process those characters as needed for the problem in question.

Related

strtok() C-Strings to Array

Currently learning C, Having some trouble with passing c-string tokens into array. Lines come in by standard input, strtok is used to split the line up, and I want to put each into an array properly. an EOF check is required for exiting the input stream. Here's what I have, set up so that it will print the tokens back to me (these tokens will be converted to ASCII in a different code segment, just trying to get this part to work first).
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char string[1024]; //Initialize a char array of 1024 (input limit)
char *token;
char *token_arr[1024]; //array to store tokens.
char *out; //used
int count = 0;
while(fgets(string, 1023, stdin) != NULL) //Read lines from standard input until EOF is detected.
{
if (count == 0)
token = strtok(string, " \n"); //If first loop, Get the first token of current input
while (token != NULL) //read tokens into the array and increment the counter until all tokens are stored
{
token_arr[count] = token;
count++;
token = strtok(NULL, " \n");
}
}
for (int i = 0; i < count; i++)
printf("%s\n", token_arr[i]);
return 0;
}
this seems like proper logic to me, but then i'm still learning. The issue seems to be with streaming in multiple lines before sending the EOF signal with ctrl-D.
For example, given an input of:
this line will be fine
the program returns:
this
line
will
be
fine
But if given:
none of this
is going to work
It returns:
is going to work
ing to work
to work
any help is greatly appreciated. I'll keep working at it in the meantime.

There are a couple of issues here:
You never call token = strtok(string, " \n"); again once the string is "reset" to a new value, so strtok() still thinks it is tokenizing your original string.
strtok is returning pointers to "substrings" inside string. You are changing the contents of what is in string and so your second line effectively corrupts your first (since the original contents of string are overwritten).
To do what you want you need to either read each line into a different buffer or duplicate the strings returned by strtok (strdup() is one way - just remember to free() each copy...)

Reading strings with spaces from a file

I'm working on a project and I just encountered a really annoying problem. I have a file which stores all the messages that my account received. A message is a data structure defined this way:
typedef struct _message{
char dest[16];
char text[512];
}message;
dest is a string that cannot contain spaces, unlike the other fields.
Strings are acquired using the fgets() function, so dest and text can have "dynamic" length (from 1 character up to length-1 legit characters). Note that I manually remove the newline character after every string is retrieved from stdin.
The "inbox" file uses the following syntax to store messages:
dest
text
So, for example, if I have a message from Marco which says "Hello, how are you?" and another message from Tarma which says "Are you going to the gym today?", my inbox-file would look like this:
Marco
Hello, how are you?
Tarma
Are you going to the gym today?
I would like to read the username from the file and store it in string s1 and then do the same thing for the message and store it in string s2 (and then repeat the operation until EOF), but since text field admits spaces I can't really use fscanf().
I tried using fgets(), but as I said before the size of every string is dynamic. For example if I use fgets(my_file, 16, username) it would end up reading unwanted characters. I just need to read the first string until \n is reached and then read the second string until the next \n is reached, this time including spaces.
Any idea on how can I solve this problem?

#include <stdio.h>
int main(void){
char username[16];
char text[512];
int ch, i;
FILE *my_file = fopen("inbox.txt", "r");
while(1==fscanf(my_file, "%15s%*c", username)){
i=0;
while (i < sizeof(text)-1 && EOF!=(ch=fgetc(my_file))){
if(ch == '\n' && i && text[i-1] == '\n')
break;
text[i++] = ch;
}
text[i] = 0;
printf("user:%s\n", username);
printf("text:\n%s\n", text);
}
fclose(my_file);
return 0;
}

As the length of each string is dynamic then, if I were you, I would read the file first for finding each string's size and then create a dynamic array of strings' length values.
Suppose your file is:
A long time ago
in a galaxy far,
far away....
So the first line length is 15, the second line length is 16 and the third line length is 12.
Then create a dynamic array for storing these values.
Then, while reading strings, pass as the 2nd argument to fgets the corresponding element of the array. Like fgets (string , arrStringLength[i++] , f);.
But in this way you'll have to read your file twice, of course.

You can use fgets() easily enough as long as you're careful. This code seems to work:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { MAX_MESSAGES = 20 };
typedef struct Message
{
char dest[16];
char text[512];
} Message;
static int read_message(FILE *fp, Message *msg)
{
char line[sizeof(msg->text) + 1];
msg->dest[0] = '\0';
msg->text[0] = '\0';
while (fgets(line, sizeof(line), fp) != 0)
{
//printf("Data: %zu <<%s>>\n", strlen(line), line);
if (line[0] == '\n')
continue;
size_t len = strlen(line);
line[--len] = '\0';
if (msg->dest[0] == '\0')
{
if (len < sizeof(msg->dest))
{
memmove(msg->dest, line, len + 1);
//printf("Name: <<%s>>\n", msg->dest);
}
else
{
fprintf(stderr, "Error: name (%s) too long (%zu vs %zu)\n",
line, len, sizeof(msg->dest)-1);
exit(EXIT_FAILURE);
}
}
else
{
if (len < sizeof(msg->text))
{
memmove(msg->text, line, len + 1);
//printf("Text: <<%s>>\n", msg->dest);
return 0;
}
else
{
fprintf(stderr, "Error: text for %s too long (%zu vs %zu)\n",
msg->dest, len, sizeof(msg->dest)-1);
exit(EXIT_FAILURE);
}
}
}
return EOF;
}
int main(void)
{
Message mbox[MAX_MESSAGES];
int n_msgs;
for (n_msgs = 0; n_msgs < MAX_MESSAGES; n_msgs++)
{
if (read_message(stdin, &mbox[n_msgs]) == EOF)
break;
}
printf("Inbox (%d messages):\n\n", n_msgs);
for (int i = 0; i < n_msgs; i++)
printf("%d: %s\n %s\n\n", i + 1, mbox[i].dest, mbox[i].text);
return 0;
}
The reading code will handle (multiple) empty lines before the first name, between a name and the text, and after the last name. It is slightly unusual in they way it decides whether to store the line just read in the dest or text parts of the message. It uses memmove() because it knows exactly how much data to move, and the data is null terminated. You could replace it with strcpy() if you prefer, but it should be slower (the probably not measurably slower) because strcpy() has to test each byte as it copies, but memmove() does not. I use memmove() because it is always correct; memcpy() could be used here but it only works when you guarantee no overlap. Better safe than sorry; there are plenty of software bugs without risking extras. You can decide whether the error exit is appropriate — it is fine for test code, but not necessarily a good idea in production code. You can decide how to handle '0 messages' vs '1 message' vs '2 messages' etc.
You can easily revise the code to use dynamic memory allocation for the array of messages. It would be easy to read the message into a simple Message variable in main(), and arrange to copy into the dynamic array when you get a complete message. The alternative is to 'risk' over-allocating the array, though that is unlikely to be a major problem (you would not grow the array one entry at a time anyway to avoid quadratic behaviour when the memory has to be moved during each allocation).
If there were multiple fields to be processed for each message (say, date received and date read too), then you'd need to reorganize the code some more, probably with another function.
Note that the code avoids the reserved namespace. A name such as _message is reserved for 'the implementation'. Code such as this is not part of the implementation (of the C compiler and its support system), so you should not create names that start with an underscore. (That over-simplifies the constraint, but only slightly, and is a lot easier to understand than the more nuanced version.)
The code is careful not to write any magic number more than once.
Sample output:
Inbox (2 messages):
1: Marco
How are you?
2: Tarma
Are you going to the gym today?

Array of strings being overwritten

I have a program that is trying to take a text file that consists of the following and feed it to my other program.
Bruce, Wayne
Bruce, Banner
Princess, Diana
Austin, Powers
This is my C code. It is trying to get the number of lines in the file, parse the comma-separated keys and values, and put them all in a list of strings. Lastly, it is trying to iterate through the list of strings and print them out. The output of this is just Austin Powers over and over again. I'm not sure if the problem is how I'm appending the strings to the list or how I'm reading them off.
#include<stdio.h>
#include <stdlib.h>
int main(){
char* fileName = "Example.txt";
FILE *fp = fopen(fileName, "r");
char line[512];
char * keyname = (char*)(malloc(sizeof(char)*80));
char * val = (char*)(malloc(sizeof(char)*80));
int i = 0;
int ch, lines;
while(!feof(fp)){
ch = fgetc(fp);
if(ch == '\n'){ //counts how many lines there are
lines++;
}
}
rewind(fp);
char* targets[lines*2];
while (fgets(line, sizeof(line), fp)){
strtok(line,"\n");
sscanf(line, "%[^','], %[^',']%s\n", keyname, val);
targets[i] = keyname;
targets[i+1] = val;
i+=2;
}
int q = 0;
while (q!=i){
printf("%s\n", targets[q]);
q++;
}
return 0;
}

The problem is with the two lines:
targets[i] = keyname;
targets[i+1] = val;
These do not make copies of the string - they only copy the address of whatever memory they point to. So, at the end of the while loop, each pair of target elements point to the same two blocks.
To make copies of the string, you'll either have to use strdup (if provided), or implement it yourself with strlen, malloc, and strcpy.
Also, as #mch mentioned, you never initialize lines, so while it may be zero, it may also be any garbage value (which can cause char* targets[lines*2]; to fail).

First you open the file. The in the while loop, check the condition to find \n or EOF to end the loop. In the loop, if you get anything other than alphanumeric, then separate the token and store it in string array. Increment the count when you encounter \n or EOF. Better use do{}while(ch!=EOF);

How would I compare a string (entered by the user) to the first word of a line in a file?

I am really struggling to understand how character arrays work in C. This seems like something that should be really simple, but I do not know what function to use, or how to use it.
I want the user to enter a string, and I want to iterate through a text file, comparing this string to the first word of each line in the file.
By "word" here, I mean substring that consists of characters that aren't blanks.
Help is greatly appreciated!
Edit:
To be more clear, I want to take a single input and search for it in a database of the form of a text file. I know that if it is in the database, it will be the first word of a line, since that is how to database is formatted. I suppose I COULD iterate through every single word of the database, but this seems less efficient.
After finding the input in the database, I need to access the two words that follow it (on the same line) to achieve the program's ultimate goal (which is computational in nature)

Here is some code that will do what you are asking. I think it will help you understand how string functions work a little better. Note - I did not make many assumptions about how well conditioned the input and text file are, so there is a fair bit of code for removing whitespace from the input, and for checking that the match is truly "the first word", and not "the first part of the first word". So this code will not match the input "hello" to the line "helloworld 123 234" but it will match to "hello world 123 234". Note also that it is currently case sensitive.
#include <stdio.h>
#include <string.h>
int main(void) {
char buf[100]; // declare space for the input string
FILE *fp; // pointer to the text file
char fileBuf[256]; // space to keep a line from the file
int ii, ll;
printf("give a word to check:\n");
fgets(buf, 100, stdin); // fgets prevents you reading in a string longer than buffer
printf("you entered: %s\n", buf); // check we read correctly
// see (for debug) if there are any odd characters:
printf("In hex, that is ");
ll = strlen(buf);
for(ii = 0; ii < ll; ii++) printf("%2X ", buf[ii]);
printf("\n");
// probably see a carriage return - depends on OS. Get rid of it!
// note I could have used the result that ii is strlen(but) but
// that makes the code harder to understand
for(ii = strlen(buf) - 1; ii >=0; ii--) {
if (isspace(buf[ii])) buf[ii]='\0';
}
// open the file:
if((fp=fopen("myFile.txt", "r"))==NULL) {
printf("cannot open file!\n");
return 0;
}
while( fgets(fileBuf, 256, fp) ) { // read in one line at a time until eof
printf("line read: %s", fileBuf); // show we read it correctly
// find whitespace: we need to keep only the first word.
ii = 0;
while(!isspace(fileBuf[ii]) && ii < 255) ii++;
// now compare input string with first word from input file:
if (strlen(buf)==ii && strstr(fileBuf, buf) == fileBuf) {
printf("found a matching line: %s\n", fileBuf);
break;
}
}
// when you get here, fileBuf will contain the line you are interested in
// the second and third word of the line are what you are really after.
}

Your recent update states that the file is really a database, in which you are looking for a word. This is very important.
If you have enough memory to hold the whole database, you should do just that (read the whole database and arrange it for efficient searching), so you should probably not ask about searching in a file.
Good database designs involve data structures like trie and hash table. But for a start, you could use the most basic improvement of the database - holding the words in alphabetical order (use the somewhat tricky qsort function to achieve that).
struct Database
{
size_t count;
struct Entry // not sure about C syntax here; I usually code in C++; sorry
{
char *word;
char *explanation;
} *entries;
};
char *find_explanation_of_word(struct Database* db, char *word)
{
for (size_t i = 0; i < db->count; i++)
{
int result = strcmp(db->entries[i].word, word);
if (result == 0)
return db->entries[i].explanation;
else if (result > 0)
break; // if the database is sorted, this means word is not found
}
return NULL; // not found
}
If your database is too big to hold in memory, you should use a trie that holds just the beginnings of the words in the database; for each beginning of a word, have a file offset at which to start scanning the file.
char* find_explanation_in_file(FILE *f, long offset, char *word)
{
fseek(f, offset, SEEK_SET);
char line[100]; // 100 should be greater than max line in file
while (line, sizeof(line), f)
{
char *word_in_file = strtok(line, " ");
char *explanation = strtok(NULL, "");
int result = strcmp(word_in_file, word);
if (result == 0)
return explanation;
else if (result > 0)
break;
}
return NULL; // not found
}

I think what you need is fseek().
1) Pre-process the database file as follows. Find out the positions of all the '\n' (carriage returns), and store them in array, say a, so that you know that ith line starts at a[i]th character from the beginning of the file.
2) fseek() is a library function in stdio.h, and works as given here. So, when you need to process an input string, just start from the start of the file, and check the first word, only at the stored positions in the array a. To do that:
fseek(inFile , a[i] , SEEK_SET);
and then
fscanf(inFile, "%s %s %s", yourFirstWordHere, secondWord, thirdWord);
for checking the ith line.
Or, more efficiently, you could use:
fseek ( inFile , a[i]-a[i-1] , SEEK_CURR )
Explanation: What fseek() does is, it sets the read/write position indicator associated with the file at the desired position. So, if you know at which point you need to read or write, you can just go there and read directly or write directly. This way, you won't need to read whole lines just to get first three words.

How do I parse a string in C?

I am a beginner learning C; so, please go easy on me. :)
I am trying to write a very simple program that takes each word of a string into a "Hi (input)!" sentence (it assumes you type in names). Also, I am using arrays because I need to practice them.
My problem is that, some garbage gets putten into the arrays somewhere, and it messes up the program. I tried to figure out the problem but to no avail; so, it is time to ask for expert help. Where have I made mistakes?
p.s.: It also has an infinite loop somewhere, but it is probably the result of the garbage that is put into the array.
#include <stdio.h>
#define MAX 500 //Maximum Array size.
int main(int argc, const char * argv[])
{
int stringArray [MAX];
int wordArray [MAX];
int counter = 0;
int wordCounter = 0;
printf("Please type in a list of names then hit ENTER:\n");
// Fill up the stringArray with user input.
stringArray[counter] = getchar();
while (stringArray[counter] != '\n') {
stringArray[++counter] = getchar();
}
// Main function.
counter = 0;
while (stringArray[wordCounter] != '\n') {
// Puts first word into temporary wordArray.
while ((stringArray[wordCounter] != ' ') && (stringArray[wordCounter] != '\n')) {
wordArray[counter++] = stringArray[wordCounter++];
}
wordArray[counter] = '\0';
//Prints out the content of wordArray.
counter = 0;
printf("Hi ");
while (wordArray[counter] != '\0') {
putchar(wordArray[counter]);
counter++;
}
printf("!\n");
//Clears temporary wordArray for new use.
for (counter = 0; counter == MAX; counter++) {
wordArray[counter] = '\0';
}
wordCounter++;
counter = 0;
}
return 0;
}
Solved it! I needed to add to following if sentence to the end when I incremented the wordCounter. :)
if (stringArray[wordCounter] != '\n') {
wordCounter++;
}

You are using int arrays to represent strings, probably because getchar() returns in int. However, strings are better represented as char arrays, since that's what they are, in C. The fact that getchar() returns an int is certainly confusing, it's because it needs to be able to return the special value EOF, which doesn't fit in a char. Therefore it uses int, which is a "larger" type (able to represent more different values). So, it can fit all the char values, and EOF.
With char arrays, you can use C's string functions directly:
char stringArray[MAX];
if(fgets(stringArray, sizeof stringArray, stdin) != NULL)
printf("You entered %s", stringArray);
Note that fscanf() will leave the end of line character(s) in the string, so you might want to strip them out. I suggest implementing an in-place function that trims off leading and trailing whitespace, it's a good exercise as well.

for (counter = 0; counter == MAX; counter++) {
wordArray[counter] = '\0';
}
You never enter into this loop.

user1799795,
For what it's worth (now that you've solved your problem) I took the liberty of showing you how I'd do this given the restriction "use arrays", and explaining a bit about why I'd do it that way... Just beware that while I am experienced programmer I'm no C guru... I've worked with guys who absolutely blew me into the C-weeds (pun intended).
#include <stdio.h>
#include <string.h>
#define LINE_SIZE 500
#define MAX_WORDS 50
#define WORD_SIZE 20
// Main function.
int main(int argc, const char * argv[])
{
int counter = 0;
// ----------------------------------
// Read a line of input from the user (ie stdin)
// ----------------------------------
char line[LINE_SIZE];
printf("Please type in a list of names then hit ENTER:\n");
while ( fgets(line, LINE_SIZE, stdin) == NULL )
fprintf(stderr, "You must enter something. Pretty please!");
// A note on that LINE_SIZE parameter to the fgets function:
// wherever possible it's a good idea to use the version of the standard
// library function that allows you specificy the maximum length of the
// string (or indeed any array) because that dramatically reduces the
// incedence "string overruns", which are a major source of bugs in c
// programmes.
// Also note that fgets includes the end-of-line character/sequence in
// the returned string, so you have to ensure there's room for it in the
// destination string, and remember to handle it in your string processing.
// -------------------------
// split the line into words
// -------------------------
// the current word
char word[WORD_SIZE];
int wordLength = 0;
// the list of words
char words[MAX_WORDS][WORD_SIZE]; // an array of upto 50 words of
// upto 20 characters each
int wordCount = 0; // the number of words in the array.
// The below loop syntax is a bit cyptic.
// The "char *c=line;" initialises the char-pointer "c" to the start of "line".
// The " *c;" is ultra-shorthand for: "is the-char-at-c not equal to zero".
// All strings in c end with a "null terminator" character, which has the
// integer value of zero, and is commonly expressed as '\0', 0, or NULL
// (a #defined macro). In the C language any integer may be evaluated as a
// boolean (true|false) expression, where 0 is false, and (pretty obviously)
// everything-else is true. So: If the character at the address-c is not
// zero (the null terminator) then go-round the loop again. Capiche?
// The "++c" moves the char-pointer to the next character in the line. I use
// the pre-increment "++c" in preference to the more common post-increment
// "c++" because it's a smidge more efficient.
//
// Note that this syntax is commonly used by "low level programmers" to loop
// through strings. There is an alternative which is less cryptic and is
// therefore preferred by most programmers, even though it's not quite as
// efficient. In this case the loop would be:
// int lineLength = strlen(line);
// for ( int i=0; i<lineLength; ++i)
// and then to get the current character
// char ch = line[i];
// We get the length of the line once, because the strlen function has to
// loop through the characters in the array looking for the null-terminator
// character at its end (guess what it's implementation looks like ;-)...
// which is inherently an "expensive" operation (totally dependant on the
// length of the string) so we atleast avoid repeating this operation.
//
// I know I might sound like I'm banging on about not-very-much but once you
// start dealing with "real word" magnitude datasets then such habits,
// formed early on, pay huge dividends in the ability to write performant
// code the first time round. Premature optimisation is evil, but my code
// doesn't hardly ever NEED optimising, because it was "fairly efficient"
// to start with. Yeah?
for ( char *c=line; *c; ++c ) { // foreach char in line.
char ch = *c; // "ch" is the character value-at the-char-pointer "c".
if ( ch==' ' // if this char is a space,
|| ch=='\n' // or we've reached the EOL char
) {
// 1. add the word to the end of the words list.
// note that we copy only wordLength characters, instead of
// relying on a null-terminator (which doesn't exist), as we
// would do if we called the more usual strcpy function instead.
strncpy(words[wordCount++], word, wordLength);
// 2. and "clear" the word buffer.
wordLength=0;
} else if (wordLength==WORD_SIZE-1) { // this word is too long
// so split this word into two words.
strncpy(words[wordCount++], word, wordLength);
wordLength=0;
word[wordLength++] = ch;
} else {
// otherwise: append this character to the end of the word.
word[wordLength++] = ch;
}
}
// -------------------------
// print out the words
// -------------------------
for ( int w=0; w<wordCount; ++w ) {
printf("Hi %s!\n", words[w]);
}
return 0;
}
In the real world one can't make such restrictive assumptions about the maximum-length of words, or how many there will be, and if such restrictions are given they're almost allways arbitrary and therefore proven wrong all too soon... so straight-off-the-bat for this problem, I'd be inclined to use a linked-list instead of the "words" array... wait till you get to "dynamic data structures"... You'll love em ;-)
Cheers. Keith.
PS: You're going pretty well... My advise is "just keep on truckin"... this gets a LOT easier with practice.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight

Reading formatted strings from file into Array in C - c

Related

strtok() C-Strings to Array

Reading strings with spaces from a file

Array of strings being overwritten

How would I compare a string (entered by the user) to the first word of a line in a file?

How do I parse a string in C?

Categories

Resources