Reading strings with spaces from a file - c

I'm working on a project and I just encountered a really annoying problem. I have a file which stores all the messages that my account received. A message is a data structure defined this way:
typedef struct _message{
char dest[16];
char text[512];
}message;
dest is a string that cannot contain spaces, unlike the other fields.
Strings are acquired using the fgets() function, so dest and text can have "dynamic" length (from 1 character up to length-1 legit characters). Note that I manually remove the newline character after every string is retrieved from stdin.
The "inbox" file uses the following syntax to store messages:
dest
text
So, for example, if I have a message from Marco which says "Hello, how are you?" and another message from Tarma which says "Are you going to the gym today?", my inbox-file would look like this:
Marco
Hello, how are you?
Tarma
Are you going to the gym today?
I would like to read the username from the file and store it in string s1 and then do the same thing for the message and store it in string s2 (and then repeat the operation until EOF), but since text field admits spaces I can't really use fscanf().
I tried using fgets(), but as I said before the size of every string is dynamic. For example, if I use fgets(username, 16, my_file) it would end up reading unwanted characters. I just need to read the first string until \n is reached and then read the second string until the next \n is reached, this time including spaces.
Any idea on how can I solve this problem?

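#include <stdio.h>
int main(void){
    char username[16];
    char text[512];
    int ch, i;
    FILE *my_file = fopen("inbox.txt", "r");
    if (my_file == NULL){ /* reading from a NULL stream is undefined behaviour */
        perror("inbox.txt");
        return 1;
    }
    /* %15s reads the name (no spaces); %*c discards the newline after it */
    while(1==fscanf(my_file, "%15s%*c", username)){
        i=0;
        /* copy the text, spaces included, up to an empty line or end of file */
        while (i < (int)sizeof(text)-1 && EOF!=(ch=fgetc(my_file))){
            if(ch == '\n' && i && text[i-1] == '\n')
                break;
            text[i++] = ch;
        }
        text[i] = 0;
        printf("user:%s\n", username);
        printf("text:\n%s\n", text);
    }
    fclose(my_file);
    return 0;
}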

Because the length of each string varies, I would first read the file once to find each line's length and store those lengths in a dynamically allocated array.
Suppose your file is:
A long time ago
in a galaxy far,
far away....
So the first line length is 15, the second line length is 16 and the third line length is 12.
Then create a dynamic array for storing these values.
Then, while reading strings, pass the corresponding element of the array plus 2 as the 2nd argument to fgets, because fgets stores at most size-1 characters and each line also ends with a newline. Like fgets (string , arrStringLength[i++] + 2 , f);.
But in this way you'll have to read your file twice, of course.

You can use fgets() easily enough as long as you're careful. This code seems to work:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
enum { MAX_MESSAGES = 20 };
typedef struct Message
{
char dest[16];
char text[512];
} Message;
static int read_message(FILE *fp, Message *msg)
{
char line[sizeof(msg->text) + 1];
msg->dest[0] = '\0';
msg->text[0] = '\0';
while (fgets(line, sizeof(line), fp) != 0)
{
//printf("Data: %zu <<%s>>\n", strlen(line), line);
if (line[0] == '\n')
continue;
size_t len = strlen(line);
if (len > 0 && line[len - 1] == '\n') line[--len] = '\0'; /* strip the newline only if fgets() stored one */
if (msg->dest[0] == '\0')
{
if (len < sizeof(msg->dest))
{
memmove(msg->dest, line, len + 1);
//printf("Name: <<%s>>\n", msg->dest);
}
else
{
fprintf(stderr, "Error: name (%s) too long (%zu vs %zu)\n",
line, len, sizeof(msg->dest)-1);
exit(EXIT_FAILURE);
}
}
else
{
if (len < sizeof(msg->text))
{
memmove(msg->text, line, len + 1);
//printf("Text: <<%s>>\n", msg->text);
return 0;
}
else
{
fprintf(stderr, "Error: text for %s too long (%zu vs %zu)\n",
msg->dest, len, sizeof(msg->text)-1);
exit(EXIT_FAILURE);
}
}
}
return EOF;
}
int main(void)
{
Message mbox[MAX_MESSAGES];
int n_msgs;
for (n_msgs = 0; n_msgs < MAX_MESSAGES; n_msgs++)
{
if (read_message(stdin, &mbox[n_msgs]) == EOF)
break;
}
printf("Inbox (%d messages):\n\n", n_msgs);
for (int i = 0; i < n_msgs; i++)
printf("%d: %s\n %s\n\n", i + 1, mbox[i].dest, mbox[i].text);
return 0;
}
The reading code will handle (multiple) empty lines before the first name, between a name and the text, and after the last name. It is slightly unusual in the way it decides whether to store the line just read in the dest or text parts of the message. It uses memmove() because it knows exactly how much data to move, and the data is null terminated. You could replace it with strcpy() if you prefer, but it should be slower (though probably not measurably slower) because strcpy() has to test each byte as it copies, but memmove() does not. I use memmove() because it is always correct; memcpy() could be used here but it only works when you guarantee no overlap. Better safe than sorry; there are plenty of software bugs without risking extras. You can decide whether the error exit is appropriate; it is fine for test code, but not necessarily a good idea in production code. You can decide how to handle '0 messages' vs '1 message' vs '2 messages' etc.
You can easily revise the code to use dynamic memory allocation for the array of messages. It would be easy to read the message into a simple Message variable in main(), and arrange to copy into the dynamic array when you get a complete message. The alternative is to 'risk' over-allocating the array, though that is unlikely to be a major problem (you would not grow the array one entry at a time anyway to avoid quadratic behaviour when the memory has to be moved during each allocation).
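A minimal sketch of the dynamic-allocation variant described above, assuming the same Message layout. The Inbox type and the inbox_append() and inbox_free() names are hypothetical and not part of the original answer; the capacity is doubled on each growth so the array is not reallocated one entry at a time.

```c
#include <stdlib.h>

/* Same layout as the Message type used in the answer. */
typedef struct Message
{
    char dest[16];
    char text[512];
} Message;

/* Hypothetical growable container; the names are illustrative only. */
typedef struct Inbox
{
    Message *items;    /* heap-allocated array, or NULL when empty */
    size_t   count;    /* number of messages stored */
    size_t   capacity; /* number of messages allocated */
} Inbox;

/* Append a copy of *msg; returns 0 on success, -1 if allocation fails. */
static int inbox_append(Inbox *box, const Message *msg)
{
    if (box->count == box->capacity)
    {
        /* Double the capacity so n appends cost O(n) copying overall. */
        size_t new_cap = (box->capacity == 0) ? 4 : box->capacity * 2;
        Message *tmp = realloc(box->items, new_cap * sizeof(*tmp));
        if (tmp == NULL)
            return -1;          /* box->items is still valid here */
        box->items = tmp;
        box->capacity = new_cap;
    }
    box->items[box->count++] = *msg;
    return 0;
}

/* Release the array and reset the container to its empty state. */
static void inbox_free(Inbox *box)
{
    free(box->items);
    box->items = NULL;
    box->count = box->capacity = 0;
}
```

In main() you would start with Inbox box = {0};, read each message into a single local Message with read_message(), and call inbox_append(&box, &msg) only when a complete message has been read.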
If there were multiple fields to be processed for each message (say, date received and date read too), then you'd need to reorganize the code some more, probably with another function.
Note that the code avoids the reserved namespace. A name such as _message is reserved for 'the implementation'. Code such as this is not part of the implementation (of the C compiler and its support system), so you should not create names that start with an underscore. (That over-simplifies the constraint, but only slightly, and is a lot easier to understand than the more nuanced version.)
The code is careful not to write any magic number more than once.
Sample output:
Inbox (2 messages):
1: Marco
How are you?
2: Tarma
Are you going to the gym today?

Related

fscanf() how to go in the next line?

So I have a wall of text in a file and I need to recognize some words that are between the $ sign and call them as numbers then print the modified text in another file along with what the numbers correspond to.
Also lines are not defined and columns should be max 80 characters.
Ex:
I $like$ cats.
I [1] cats.
[1] --> like
That's what I did:
#include <stdio.h>
#include <stdlib.h>
#define N 80
#define MAX 9999
int main()
{
FILE *fp;
int i=0,count=0;
char matr[MAX][N];
if((fp = fopen("text.txt","r")) == NULL){
printf("Error.");
exit(EXIT_FAILURE);
}
while((fscanf(fp,"%s",matr[i])) != EOF){
printf("%s ",matr[i]);
if(matr[i] == '\0')
printf("\n");
//I was thinking maybe to find two $ but Idk how to replace the entire word
/*
if(matr[i] == '$')
count++;
if(count == 2){
...code...
}
*/
i++;
}
fclose(fp);
return 0;
}
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array..also I don't know how to replace $word$ with a number.
Not only will fscanf("%s") read one whitespace-delimited string at a time, it will also eat all whitespace between those strings, including line terminators. If you want to reproduce the input whitespace in the output, as your example suggests you do, then you need a different approach.
Also lines are not defined and columns should be max 80 characters.
I take that to mean the number of lines is not known in advance, and that it is acceptable to assume that no line will contain more than 80 characters (not counting any line terminator).
When you say
My problem is that fscanf doesn't recognize '\0' so it doesn't go in the next line when I print the array
I suppose you're talking about this code:
char matr[MAX][N];
/* ... */
if(matr[i] == '\0')
Given that declaration for matr, the given condition will always evaluate to false, regardless of any other consideration. fscanf() does not factor in at all. The type of matr[i] is char[N], an array of N elements of type char. That evaluates to a pointer to the first element of the array, which pointer will never be NULL. It looks like you're trying to determine when to write a newline, but nothing remotely resembling this approach can do that.
I suggest you start by taking @Barmar's advice to read line-by-line via fgets(). That might look like so:
char line[N+2]; /* N + 2 leaves space for both newline and string terminator */
if (fgets(line, sizeof(line), fp) != NULL) {
/* one line read; handle it ... */
} else {
/* handle end-of-file or I/O error */
}
Then for each line you read, parse out the "$word$" tokens by whatever means you like, and output the needed results (everything but the $-delimited tokens verbatim; the bracket substitution number for each token). Of course, you'll need to memorialize the substitution tokens for later output. Remember to make copies of those, as the buffer will be overwritten on each read (if done as I suggest above).
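One possible way to do the per-line parsing is sketched below. It is only an illustration: substitute_line(), tokens, n_tokens and the two limits are made-up names and values, and the output goes to a caller-supplied buffer rather than straight to a file. Each $word$ is copied into tokens[] (because the line buffer is reused on the next read) and replaced by its bracketed number.

```c
#include <stdio.h>
#include <string.h>

#define MAX_TOKENS    100   /* assumed limit, not from the question */
#define MAX_TOKEN_LEN 80    /* a token cannot be longer than a line */

static char tokens[MAX_TOKENS][MAX_TOKEN_LEN + 1];
static int n_tokens = 0;

/* Copy line into out (outsize bytes), replacing each $word$ with [k]
 * and saving the word in tokens[k-1]. Output is truncated if too long. */
static void substitute_line(const char *line, char *out, size_t outsize)
{
    size_t used = 0;
    const char *p = line;

    if (outsize == 0)
        return;
    out[0] = '\0';
    while (*p != '\0' && used + 1 < outsize)
    {
        const char *open = strchr(p, '$');
        const char *close = (open != NULL) ? strchr(open + 1, '$') : NULL;
        size_t len;
        int n;

        if (close == NULL || n_tokens == MAX_TOKENS)
        {
            /* No complete token left (or table full): copy the rest as-is. */
            snprintf(out + used, outsize - used, "%s", p);
            return;
        }
        len = (size_t)(close - open - 1);
        if (len > MAX_TOKEN_LEN)
            len = MAX_TOKEN_LEN;
        memcpy(tokens[n_tokens], open + 1, len);
        tokens[n_tokens][len] = '\0';
        n_tokens++;
        /* Text before the token, then the bracketed token number. */
        n = snprintf(out + used, outsize - used, "%.*s[%d]",
                     (int)(open - p), p, n_tokens);
        if (n < 0 || (size_t)n >= outsize - used)
            return;             /* output buffer full */
        used += (size_t)n;
        p = close + 1;
    }
}
```

After the last line has been processed, a loop over tokens[0] to tokens[n_tokens-1] can print the "[k] --> word" legend.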
fscanf() does recognize '\0', under select circumstances, but that is not the issue here.
Code needs to detect '\n'. fscanf(fp,"%s"... will not do that. The first thing "%s" directs is to consume (and not save) any leading white-space including '\n'. Read a line of text with fgets().
Simply read one line at a time, then march down the buffer looking for words.
The following uses "%n" to track how far into the buffer scanning got.
// more room for \n \0
#define BUF_SIZE (N + 1 + 1)
char buffer[BUF_SIZE];
while (fgets(buffer, sizeof buffer, stdin) != NULL) {
char *p = buffer;
char word[sizeof buffer];
int n;
while (sscanf(p, "%s%n", word, &n) == 1) {
// do something with word
if (strcmp(word, "$zero$") == 0) fputs("0", stdout);
else if (strcmp(word, "$one$") == 0) fputs("1", stdout);
else fputs(word, stdout);
fputc(' ', stdout);
p += n;
}
fputc('\n', stdout);
}
Use fread() to read the file contents into a char[] buffer. Then iterate through this buffer, and whenever you find a $, use strncmp() to work out which value to replace the word with (keep in mind that there is a second $ at the end of the word). To replace $word$ with a number you need to either shrink or extend the buffer at the position of the word, depending on how many characters the number takes in ASCII form; memmove() is normally the right tool for shifting the rest of the buffer. Then you can write the number into the gap that results (overwriting the $word$ as well).
Then write the buffer to the file, overwriting all its previous contents.

How to copy one line from long string C

I'm looking to copy the FIRST line from a LONG string P into a buffer
I have no idea how to make it.
while (*pros_id != '/n'){
*pros_id_line=*pros_id;
pros_id++;
pros_id_line++;
}
And tried
fgets(pros_id_line, sizeof(pros_id_line), pros_id);
Both are not working. Can I get some help please?
Note, as Adriano Repetti pointed out in a comment and an answer, that the newline character is '\n' and not '/n'.
Your initial code can be fixed up to work, provided that the destination buffer is big enough:
while (*pros_id != '\n' && *pros_id != '\0')
*pros_id_line++ = *pros_id++;
*pros_id_line = '\0';
This code does not include the newline in the copied buffer; it is easy enough to add it if you need it.
One advantage of this code is that it makes a single pass through the data up to the newline (or end of string). An alternative makes two passes through the data, one to find the newline and another to copy to the newline:
if ((end = strchr(pros_id, '\n')) != 0)
{
memmove(pros_id_line, pros_id, end - pros_id);
pros_id_line[end - pros_id] = '\0';
}
This ensures that the string is null-terminated; again, it omits the newline, and assumes there is enough space in the pros_id_line buffer for the data. You have to decide what is the correct behaviour when there is no newline in the buffer. It might be sufficient to copy the buffer without the newline into the target area, or you might prefer to report a problem.
You can use strncpy() instead of memmove() but it has a more complex loop condition than memmove() — it has to check for a null byte as well as the count, whereas memmove() only has to check the count. You can use memcpy() instead of memmove() if you're sure there's no overlap between source and target, but memmove() always works and memcpy() sometimes doesn't (though only when the source and target areas overlap), and I prefer reliability over possible misbehaviour.
Note that setting a buffer to zero before copying a string to it is a waste of energy. The parts that you're about to overwrite with data didn't need to be zeroed. The parts that you aren't going to overwrite with data didn't need to be zeroed either. You should know exactly which byte needs to be zeroed, so why waste the time on zeroing anything except the one byte that needs to be zeroed?
(One exception to this is if you are dealing with sensitive data and are concerned that some function that your code will call may deliberately read beyond the end of the string and come across parts of a password or other sensitive data. Then it may be appropriate to wipe the memory before writing new data to it. On the whole, though, most people aren't writing such code.)
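Both snippets above assume the destination is large enough. A bounded variant might look like the sketch below; copy_first_line() is a hypothetical helper name, and strcspn() is used here in place of strchr() so the no-newline case needs no separate branch.

```c
#include <string.h>

/* Copy the first line of src (without the newline) into dst, which holds
 * dstsize bytes. Returns the length of the whole first line, so a return
 * value >= dstsize means the copy was truncated. */
static size_t copy_first_line(char *dst, size_t dstsize, const char *src)
{
    size_t len = strcspn(src, "\n");   /* stops at '\n' or at the terminator */

    if (dstsize > 0)
    {
        size_t n = (len < dstsize) ? len : dstsize - 1;
        memmove(dst, src, n);          /* memmove() is safe even on overlap */
        dst[n] = '\0';                 /* zero only the one byte that needs it */
    }
    return len;
}
```

A call such as copy_first_line(pros_id_line, sizeof(line_buffer), pros_id) then needs just one comparison of the result against the buffer size to detect truncation (line_buffer here stands for whatever array pros_id_line points into).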
New line is \n not /n. Anyway, I'd use strchr for this:
char* endOfFirstLine = strchr(inputString, '\n');
if (endOfFirstLine != NULL)
{
strncpy(yourBuffer, inputString,
endOfFirstLine - inputString);
}
else // Input is one single line
{
strcpy(yourBuffer, inputString);
}
With inputString as your char* multiline string and yourBuffer (assuming it's big enough to contain all data from inputString and it has been zeroed, since strncpy() does not add a terminator here) as your required output (first line of inputString).
If you're going to be doing a lot of reading from long text buffers, you could try using a memory stream, if your system supports them: https://www.gnu.org/software/libc/manual/html_node/String-Streams.html
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
static char buffer[] = "foo\nbar";
int
main()
{
char arr[100];
FILE *stream;
stream = fmemopen(buffer, strlen(buffer), "r");
fgets(arr, sizeof arr, stream);
printf("First line: %s\n", arr);
fgets(arr, sizeof arr, stream);
printf("Second line: %s\n", arr);
fclose (stream);
return 0;
}
POSIX 2008 (e.g. most Linux systems) has getline(3) which heap-allocates a buffer for a line.
So you could code
FILE* fil = fopen("something.txt","r");
if (!fil) { perror("fopen"); exit(EXIT_FAILURE); }
char *linebuf=NULL;
size_t linesiz=0;
if (getline(&linebuf, &linesiz, fil) >= 0) { /* getline() returns -1 on error or end of file */
    do_something_with(linebuf);
}
else { perror("getline"); exit(EXIT_FAILURE); }
free(linebuf); /* the buffer was heap-allocated by getline() */
If you want to read an editable line from stdin in a terminal consider GNU readline.
If you are restricted to pure C99 code you have to do the heap allocation yourself (malloc or calloc or perhaps -with care- realloc)
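For the pure C99 case, a do-it-yourself replacement could be sketched as follows. read_line() is a hypothetical name, the initial capacity of 64 is arbitrary, and unlike getline() this version allocates a fresh buffer per call and drops the newline.

```c
#include <stdio.h>
#include <stdlib.h>

/* Read one line of any length from fp into a malloc'd buffer.
 * The trailing newline is not stored. Returns NULL at end of file
 * (when nothing was read) or on allocation failure; the caller frees. */
static char *read_line(FILE *fp)
{
    size_t cap = 64, len = 0;
    char *buf = malloc(cap);
    int ch;

    if (buf == NULL)
        return NULL;
    while ((ch = fgetc(fp)) != EOF && ch != '\n')
    {
        if (len + 1 == cap)             /* keep one byte for the terminator */
        {
            char *tmp = realloc(buf, cap * 2);
            if (tmp == NULL)
            {
                free(buf);              /* realloc failure leaves buf valid */
                return NULL;
            }
            buf = tmp;
            cap *= 2;
        }
        buf[len++] = (char)ch;
    }
    if (ch == EOF && len == 0)
    {
        free(buf);
        return NULL;
    }
    buf[len] = '\0';
    return buf;
}
```

Assigning the realloc() result to a temporary first is the "with care" part: on failure the original block is still allocated and must be freed.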
If you just want to copy the first line of some existing buffer char*bigbuf; which is non-NULL, valid, and zero-byte terminated:
char *line = NULL;
char *eol = strchr(bigbuf, '\n');
if (!eol) { // bigbuf is a single line so duplicate it
    line = strdup(bigbuf);
    if (!line) { perror("strdup"); exit(EXIT_FAILURE); }
} else {
    size_t linesize = eol-bigbuf;
    line = malloc(linesize+1);
    if (!line) { perror("malloc"); exit(EXIT_FAILURE); }
    memcpy (line, bigbuf, linesize);
    line[linesize] = '\0';
}

How would I compare a string (entered by the user) to the first word of a line in a file?

I am really struggling to understand how character arrays work in C. This seems like something that should be really simple, but I do not know what function to use, or how to use it.
I want the user to enter a string, and I want to iterate through a text file, comparing this string to the first word of each line in the file.
By "word" here, I mean substring that consists of characters that aren't blanks.
Help is greatly appreciated!
Edit:
To be more clear, I want to take a single input and search for it in a database of the form of a text file. I know that if it is in the database, it will be the first word of a line, since that is how to database is formatted. I suppose I COULD iterate through every single word of the database, but this seems less efficient.
After finding the input in the database, I need to access the two words that follow it (on the same line) to achieve the program's ultimate goal (which is computational in nature)
Here is some code that will do what you are asking. I think it will help you understand how string functions work a little better. Note - I did not make many assumptions about how well conditioned the input and text file are, so there is a fair bit of code for removing whitespace from the input, and for checking that the match is truly "the first word", and not "the first part of the first word". So this code will not match the input "hello" to the line "helloworld 123 234" but it will match to "hello world 123 234". Note also that it is currently case sensitive.
#include <stdio.h>
#include <string.h>
#include <ctype.h> // for isspace()
int main(void) {
char buf[100]; // declare space for the input string
FILE *fp; // pointer to the text file
char fileBuf[256]; // space to keep a line from the file
int ii, ll;
printf("give a word to check:\n");
fgets(buf, 100, stdin); // fgets prevents you reading in a string longer than buffer
printf("you entered: %s\n", buf); // check we read correctly
// see (for debug) if there are any odd characters:
printf("In hex, that is ");
ll = strlen(buf);
for(ii = 0; ii < ll; ii++) printf("%2X ", buf[ii]);
printf("\n");
// probably see a carriage return - depends on OS. Get rid of it!
// note I could have used the result that ii is strlen(buf) but
// that makes the code harder to understand
for(ii = strlen(buf) - 1; ii >=0; ii--) {
if (isspace(buf[ii])) buf[ii]='\0';
}
// open the file:
if((fp=fopen("myFile.txt", "r"))==NULL) {
printf("cannot open file!\n");
return 0;
}
while( fgets(fileBuf, 256, fp) ) { // read in one line at a time until eof
printf("line read: %s", fileBuf); // show we read it correctly
// find whitespace: we need to keep only the first word.
ii = 0;
while(ii < 255 && fileBuf[ii] != '\0' && !isspace((unsigned char)fileBuf[ii])) ii++; // also stop at the terminator
// now compare input string with first word from input file:
if (strlen(buf)==ii && strstr(fileBuf, buf) == fileBuf) {
printf("found a matching line: %s\n", fileBuf);
break;
}
}
// when you get here, fileBuf will contain the line you are interested in
// the second and third word of the line are what you are really after.
fclose(fp);
return 0;
}
Your recent update states that the file is really a database, in which you are looking for a word. This is very important.
If you have enough memory to hold the whole database, you should do just that (read the whole database and arrange it for efficient searching), so you should probably not ask about searching in a file.
Good database designs involve data structures like trie and hash table. But for a start, you could use the most basic improvement of the database - holding the words in alphabetical order (use the somewhat tricky qsort function to achieve that).
struct Database
{
size_t count;
struct Entry // not sure about C syntax here; I usually code in C++; sorry
{
char *word;
char *explanation;
} *entries;
};
char *find_explanation_of_word(struct Database* db, char *word)
{
for (size_t i = 0; i < db->count; i++)
{
int result = strcmp(db->entries[i].word, word);
if (result == 0)
return db->entries[i].explanation;
else if (result > 0)
break; // if the database is sorted, this means word is not found
}
return NULL; // not found
}
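The sorting step mentioned above, and a binary search that exploits it, might look like the sketch below. It declares its own stand-alone struct Entry with the same two fields; compare_entries(), sort_entries() and lookup() are hypothetical names. Using bsearch() instead of the linear scan is a swapped-in refinement, not something the answer itself shows.

```c
#include <stdlib.h>
#include <string.h>

/* Stand-alone copy of the entry layout; field names follow the answer. */
struct Entry
{
    char *word;
    char *explanation;
};

/* Comparison callback for qsort() and bsearch(): order entries by word. */
static int compare_entries(const void *a, const void *b)
{
    const struct Entry *ea = a;
    const struct Entry *eb = b;
    return strcmp(ea->word, eb->word);
}

/* Sort the entries alphabetically by word, in place. */
static void sort_entries(struct Entry *entries, size_t count)
{
    qsort(entries, count, sizeof(entries[0]), compare_entries);
}

/* Binary search in a sorted array; returns NULL if word is absent. */
static char *lookup(struct Entry *entries, size_t count, const char *word)
{
    struct Entry key;
    struct Entry *hit;

    key.word = (char *)word;    /* only read, never modified, by the callback */
    key.explanation = NULL;
    hit = bsearch(&key, entries, count, sizeof(entries[0]), compare_entries);
    return hit ? hit->explanation : NULL;
}
```

The "tricky" part of qsort() is mostly the callback: it receives pointers to the array elements (here pointers to struct Entry), not the elements' word fields.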
If your database is too big to hold in memory, you should use a trie that holds just the beginnings of the words in the database; for each beginning of a word, have a file offset at which to start scanning the file.
char* find_explanation_in_file(FILE *f, long offset, char *word)
{
    // 100 should be greater than max line in file;
    // static so that the returned pointer stays valid after the function returns
    static char line[100];
    fseek(f, offset, SEEK_SET);
    while (fgets(line, sizeof(line), f))
    {
        char *word_in_file = strtok(line, " ");
        char *explanation = strtok(NULL, "");
        if (word_in_file == NULL)
            continue; // line contained only spaces
        int result = strcmp(word_in_file, word);
        if (result == 0)
            return explanation;
        else if (result > 0)
            break;
    }
    return NULL; // not found
}
I think what you need is fseek().
1) Pre-process the database file as follows. Find the positions of all the '\n' characters (newlines), and store the offset just after each one in an array, say a (with a[0] = 0 for the first line), so that you know that the ith line starts at the a[i]th character from the beginning of the file.
2) fseek() is a library function in stdio.h, and works as given here. So, when you need to process an input string, just start from the start of the file, and check the first word, only at the stored positions in the array a. To do that:
fseek(inFile , a[i] , SEEK_SET);
and then
fscanf(inFile, "%s %s %s", yourFirstWordHere, secondWord, thirdWord);
for checking the ith line.
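Or, if nothing has been read since the previous seek (so the file position is still at the start of line i-1), you could seek relative to the current position:
fseek ( inFile , a[i]-a[i-1] , SEEK_CUR )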
Explanation: What fseek() does is, it sets the read/write position indicator associated with the file at the desired position. So, if you know at which point you need to read or write, you can just go there and read directly or write directly. This way, you won't need to read whole lines just to get first three words.
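The pre-processing in step 1 could be sketched like this. index_lines() is a hypothetical name; the offsets are taken from ftell() rather than counted by hand, because on text-mode streams only values returned by ftell() are guaranteed to be valid arguments for fseek() with SEEK_SET.

```c
#include <stdio.h>
#include <stdlib.h>

/* Record the offset at which each line of fp starts.
 * Returns a malloc'd array (caller frees) and stores the number of
 * lines in *n_lines; returns NULL only if allocation fails. */
static long *index_lines(FILE *fp, size_t *n_lines)
{
    size_t cap = 16, n = 0;
    long *a = malloc(cap * sizeof(*a));
    int at_line_start = 1;

    *n_lines = 0;
    if (a == NULL)
        return NULL;
    rewind(fp);
    for (;;)
    {
        /* Ask for the position only when a new line is about to begin. */
        long pos = at_line_start ? ftell(fp) : -1L;
        int ch = fgetc(fp);

        if (ch == EOF)
            break;
        if (at_line_start)
        {
            if (n == cap)
            {
                long *tmp = realloc(a, cap * 2 * sizeof(*a));
                if (tmp == NULL)
                {
                    free(a);
                    return NULL;
                }
                a = tmp;
                cap *= 2;
            }
            a[n++] = pos;
        }
        at_line_start = (ch == '\n');
    }
    *n_lines = n;
    return a;
}
```

With the array built once, checking line i is the fseek(inFile, a[i], SEEK_SET) followed by fscanf() shown above.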

Issue reading from sscanf

All this is probably a real simple one but I am missing something and hope you can help. Ok this is my issue as simple as I can put it.
I am returning a buffer from readfile after using a USB device. This all works ok and I can output the buffer fine by using a loop like so
for (long i=0; i<sizeof(buffer); i++) //for all chars in string
{
unsigned char c = buffer[i];
switch (Format)
{
case 2: //hex
printf("%02x",c);
break;
case 1: //asc
printf("%c",c);
break;
} //end of switch format
}
When I use the text (%c) version I can see the data in the buffer on my screen the way I expected it. However, my issue is when I come to read it using sscanf. I use strstr to search for some key in the buffer and use sscanf to retrieve its data. However, sscanf fails. What could be the problem?
Below is an example of the code I am using to scan the buffer and it works fine with this standalone version. Buffer section in the above code can't be read. Even though I can see it with printf.
#include <stdio.h>
#include <string.h>
#include <windows.h>
int main ()
{
// in my application this comes from the handle and readfile
char buffer[255]="CODE-12345.MEP-12453.PRD-222.CODE-12355" ;
//
int i;
int codes[256];
char *pos = buffer;
size_t current = 0;
//
while ((pos=strstr(pos, "PRD")) != NULL) {
if (sscanf(pos, "PRD - %d", codes+current))
++current;
pos += 4;
}
for (i=0; i<current; i++)
printf("%d\n", codes[i]);
system("pause");
return 0;
}
Thanks
The problem is that, your ReadFile is giving you non-printable characters before the data you are interested in, specifically with a '\0' in the beginning. Since strings in C are NUL-terminated, all standard functions assume there is nothing in the buffer.
I don't know what it is exactly that you are reading, but perhaps you are reading a message that contains a header? In such a case you should skip the header first.
Blindly trying to solve the problem, you can skip the bad characters manually, assuming they are all in the beginning.
First of all, let's make sure the buffer is always NUL-terminated:
char buffer[1000 + 1]; // +1 in case it read all 1000 characters
ReadFile(h,buffer,0x224,&read,NULL);
buffer[read] = '\0';
Then, we know that there are read number of bytes filled by ReadFile. We first need to go back from that to find out where the good data start. Then, we need to go further back and find the first place where the data is not interesting. Note that, I am assuming in the end of the message, there are no printable characters. If there are, then this gets more complicated. In such a case, it is better if you write your own strstr that doesn't terminate on '\0', but reads up to a given length.
So instead of
char *pos = buffer;
We do
// strip away the bad part in the end
for (; read > 0; --read)
if (buffer[read - 1] >= ' ' && buffer[read - 1] <= 126)
break;
buffer[read] = '\0';
// find where the good data start
int good_position;
for (good_position = read; good_position > 0; --good_position)
if (buffer[good_position - 1] < ' ' || buffer[good_position - 1] > 126)
break;
char *pos = buffer + good_position;
The rest can remain the same.
Note: I am going from the back of the array, because assuming the beginning is a header, then it may contain data that might be interpreted as printable characters. On the other hand, in the end it may be all zeros or something.
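The alternative mentioned earlier, a strstr() that works on a length instead of stopping at '\0', could be sketched as below. find_bytes() is a hypothetical name; some systems offer memmem() for this, but it is not part of standard C, so the sketch sticks to memcmp().

```c
#include <stddef.h>
#include <string.h>

/* Find the first occurrence of needle (nlen bytes) in haystack (hlen bytes).
 * Unlike strstr(), embedded '\0' bytes in haystack do not end the search. */
static const char *find_bytes(const char *haystack, size_t hlen,
                              const char *needle, size_t nlen)
{
    size_t i;

    if (nlen == 0)
        return haystack;
    if (nlen > hlen)
        return NULL;
    for (i = 0; i + nlen <= hlen; i++)
    {
        if (haystack[i] == needle[0] &&
            memcmp(haystack + i, needle, nlen) == 0)
            return haystack + i;
    }
    return NULL;
}
```

It would be called with the byte count reported by ReadFile, for example find_bytes(buffer, read, "PRD", 3), and the returned pointer can be handed to sscanf() as before, provided the buffer has been NUL-terminated after the last byte read.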

Reading formatted strings from file into Array in C

I am new to the C programming language and trying to improve by solving problems from the Project Euler website using only C and its standard libraries. I have covered basic C fundamentals (I think), functions, pointers, and some basic file IO but now am running into some issues.
The question is about reading a text file of first names and calculating a "name score" blah blah, I know the algorithm I am going to use and have most of the program setup but just cannot figure out how to read the file correctly.
The file is in the format
"Nameone","Nametwo","billy","bobby","frank"...
I have searched and searched and tried countless things but cannot seem to read these as individual names into an array of strings (I think that's the right way to store them individually?). I have tried using sscanf/fscanf with %[^\",]. I have tried different combos of those functions and fgets, but my understanding of fgets is that every time I call it, it will get a new line, and this is a text file with over 45,000 characters all on the same line.
I am unsure if I am running into problems with my misunderstanding of the scanf functions, or my misunderstanding with storing an array of strings. As far as the array of strings goes, I (think) I have realized that when I declare an array of strings it does not allocate memory for the strings themselves, something that I need to do. But I still cannot get anything to work.
Here is the code I have now to try to just read in some names I enter from the command line to test my methods.
This code works to input any string up to buffer size(100):
int main(void)
{
int i;
char input[100];
char* names[10];
printf("\nEnter up to 10 names\nEnter an empty string to terminate input: \n");
for(int i = 0; i < 10; i++)
{
int length = 0;
printf("%d: ", i);
fgets(input, 100, stdin);
length = (int)strlen(input);
input[length-1] = 0; // Delete newline character
length--;
if(length < 1)
{
break;
}
names[i] = malloc(length+1);
assert(names[i] != NULL);
strcpy(names[i], input);
}
}
However, I simply cannot make this work for reading in the formatted strings.
PLEASE advise me as to how to read it in with format. I have previously used sscanf on the input buffer and that has worked fine, but I dont feel like I can do that on a 45000+ char line? Am I correct in assuming this? Is this even an acceptable way to read strings into an array?
I apologize if this is long and/or not clear, it is very late and I am very frustrated.
Thank anyone and everyone for helping, and I am looking forward to finally becoming an active member on this site!
There are really two basic issues here:
Whether scanning string input is the proper strategy here. I would argue not because while it might work on this task you are going to run into more complicated scenarios where it too easily breaks.
How to handle a 45k string.
In reality you won't run into too many string of this size but it is nothing that a modern computer of any capacity can't easily handle. Insofar as this is for learning purposes then learn iteratively.
The easiest first approach is to fread() the entire line/file into an appropriately sized buffer and parse it yourself. You can use strtok() to break up the comma-delimited tokens and then pass the tokens to a function that strips the quotes and returns the word. Add the word to your array.
For a second pass you can do away with strtok() and just parse the string yourself by iterating over the buffer and breaking up the comma tokens yourself.
Last but not least you can write a version that reads smaller chunks of the file into a smaller buffer and parses them. This has the added complexity of handling multiple reads and managing the buffers to account for half-read tokens at the end of a buffer and so on.
In any case, break the problem into chunks and learn with each refinement.
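As an illustration of the last refinement (reading in small chunks), here is one possible sketch. read_names(), names, n_names and the three limits are hypothetical; CHUNK_SIZE is deliberately tiny so that names regularly straddle two reads. The parser state (in_quotes, len and the partly filled names[] slot) lives outside the chunk loop, which is what takes care of half-read tokens.

```c
#include <stdio.h>
#include <string.h>

#define MAX_NAMES       6000    /* assumed upper bound on the name count */
#define MAX_NAME_LENGTH 30
#define CHUNK_SIZE      64      /* small on purpose, to exercise splits */

static char names[MAX_NAMES][MAX_NAME_LENGTH];
static size_t n_names = 0;

/* Read "A","B",... from fp chunk by chunk; returns the number of names.
 * Characters outside quotes (the commas) are ignored. */
static size_t read_names(FILE *fp)
{
    char chunk[CHUNK_SIZE];
    size_t got, len = 0;
    int in_quotes = 0;

    n_names = 0;
    while ((got = fread(chunk, 1, sizeof chunk, fp)) > 0)
    {
        for (size_t i = 0; i < got; i++)
        {
            char c = chunk[i];
            if (c == '"')
            {
                /* A closing quote finishes the name being collected. */
                if (in_quotes && n_names < MAX_NAMES)
                    names[n_names++][len] = '\0';
                in_quotes = !in_quotes;
                len = 0;
            }
            else if (in_quotes && n_names < MAX_NAMES &&
                     len < MAX_NAME_LENGTH - 1)
            {
                names[n_names][len++] = c;  /* over-long names are cut */
            }
        }
    }
    return n_names;
}
```

Because only quote characters change state, this version needs neither strtok() nor a separate quote-stripping pass.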
EDIT
#include <string.h> /* strtok, strcpy, memset */
#define MAX_STRINGS 5000
#define MAX_NAME_LENGTH 30
char* stripQuotes(char *str, char *newstr)
{
char *temp = newstr;
while (*str)
{
if (*str != '"')
{
*temp = *str;
temp++;
}
str++;
}
*temp = '\0'; /* terminate the copy instead of relying on a pre-zeroed buffer */
return(newstr);
}
int main(int argc, char *argv[])
{
char fakeline[] = "\"Nameone\",\"Nametwo\",\"billy\",\"bobby\",\"frank\"";
char *token;
char namebuffer[MAX_NAME_LENGTH] = {'\0'};
char *name;
int index = 0;
char nameArray[MAX_STRINGS][MAX_NAME_LENGTH];
token = strtok(fakeline, ",");
if (token)
{
name = stripQuotes(token, namebuffer);
strcpy(nameArray[index++], name);
}
while (token != NULL)
{
token = strtok(NULL, ",");
if (token)
{
memset(namebuffer, '\0', sizeof(namebuffer));
name = stripQuotes(token, namebuffer);
strcpy(nameArray[index++], name);
}
}
return(0);
}
scanf("%s", input) reads one token (a string delimited by whitespace) at a time; fscanf() does the same but takes the stream as its first argument, as in fscanf(stdin, "%s", input). You can either scan the input until you encounter a specific "end-of-input" string, such as "!", or you can wait for the end-of-file signal, which is sent by pressing "Ctrl+D" on a Unix console or by pressing "Ctrl+Z" on a Windows console.
The first option:
scanf("%99s", input); // the width stops a long token overflowing char input[100]
if (input[0] == '!') {
break;
}
// Put input on the array...
The second option:
result = scanf("%99s", input);
if (result == EOF) {
break;
}
// Put input on the array...
Either way, as you read one token at a time, there is no limit on the total size of the input, only on the length of each token.
Why not search the giant string for quote characters instead? Something like this:
#include <stdio.h>
#include <string.h>
int main(void)
{
char mydata[] = "\"John\",\"Smith\",\"Foo\",\"Bar\"";
char namebuffer[20];
unsigned int i, j;
int begin = 1;
unsigned int beginName, endName;
for (i = 0; i < sizeof(mydata); i++)
{
if (mydata[i] == '"')
{
if (begin)
{
beginName = i;
}
else
{
endName = i;
for (j = beginName + 1; j < endName; j++)
{
namebuffer[j-beginName-1] = mydata[j];
}
namebuffer[endName-beginName-1] = '\0';
printf("%s\n", namebuffer);
}
begin = !begin;
}
}
}
You find the first double quote, then the second, and then read out the characters in between to your name string. Then you process those characters as needed for the problem in question.

Resources