C Multiple Strtok to determine delimiter


I'm writing a C program where the user enters a string of 1-3 digits, followed by either a backslash or a comma, followed by another 1-3 digits. The pattern can repeat any number of times.
I need to determine whether the input delimiter is a backslash or a comma (this decides what to do with the numbers) and put the numbers into an array.
My idea was to use strtok as follows. The input string is held in char *token.
op_tok1 = strtok(token, "\\");
if(op_tok1 != NULL)
{
/* Process numbers */
return;
}
op_tok2 = strtok(token, ",");
if(op_tok2 != NULL)
{
/* Process other numbers */
return;
}
This works for anything delimited with a backslash, but not with a comma. I believe this is because strtok modifies the token variable. Is this true? Is there a better way to go about this? Thanks!

There are certainly ways I'd consider better. If you can depend reasonably well on the format of the input (i.e., really being three digits followed by one of the allowed delimiters), you could do something like:
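int number;
char delimiter;
size_t pos = 0; /* offset into input; an integer, not a pointer */
while (2 == sscanf(input+pos, "%d%c", &number, &delimiter)) {
    if ('\\' == delimiter)
        process_backslash(number);
    else if (',' == delimiter)
        process_comma(number);
    else
        error_invalid_delimiter(delimiter);
    pos += 4; /* exactly three digits plus one delimiter */
}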
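If the numbers can be 1-3 digits long, a fixed step of 4 no longer works. Here is a minimal sketch that lets sscanf report how far it got via %n instead. The function name scan_numbers and its parameters are my own invention for illustration, and it assumes the last number is not followed by a delimiter.

```c
#include <stdio.h>

/* Walk the input with sscanf. %n records how many characters were
   consumed, so the offset advances correctly for numbers of any width.
   Returns the count of numbers stored in out[], or -1 if a delimiter
   other than backslash or comma is found. */
int scan_numbers(const char *input, int *out, int max, char *delim_seen)
{
    size_t pos = 0;
    int count = 0;

    *delim_seen = '\0';
    while (count < max) {
        int number, consumed = 0;
        char delimiter = '\0';
        int got = sscanf(input + pos, "%d%c%n", &number, &delimiter, &consumed);
        if (got < 1)
            break;                  /* no further number */
        out[count++] = number;
        if (got == 1)
            break;                  /* last number: nothing follows it */
        if (delimiter != '\\' && delimiter != ',')
            return -1;              /* invalid delimiter */
        *delim_seen = delimiter;
        pos += (size_t)consumed;    /* skip the number and its delimiter */
    }
    return count;
}
```

Note that %d also accepts leading whitespace and a sign, so stricter validation would need an extra check.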

Others have posted better solutions; strtok is not really suitable for this task. To answer the first question, though: yes, strtok changes the underlying string (how it works is evil in my mind, and many a young player has fallen into this trap).
strtok replaces the delimiter that ends a token with \0 (the null terminator) and returns a pointer to the start of that token. Subsequent calls to strtok(NULL, <delimiters>) continue scanning the same string for the next token, and the delimiter set does not need to be the same as before.
Therefore you could do:
op_tok1 = strtok(token, "\\");
if(op_tok1 != NULL)
{
/* Process numbers */
return;
}
op_tok2 = strtok(NULL, ",");
if(op_tok2 != NULL)
{
/* Process other numbers */
return;
}
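Note that the first strtok call returns non-NULL for any input containing at least one character that is not a backslash, so on its own this does not tell you which delimiter was used. Also beware that strtok is not thread-safe.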
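If the goal is just to find out which delimiter the input uses, one option is to avoid strtok entirely. This is a rough sketch, assuming the whole input uses a single delimiter and that any trailing newline has already been stripped. The names detect_delimiter and parse_numbers are hypothetical helpers, not anything from the question.

```c
#include <stdlib.h>
#include <string.h>

/* Return the first '\\' or ',' found in s, or '\0' if neither occurs.
   strpbrk only reads the string, so the input stays intact. */
char detect_delimiter(const char *s)
{
    const char *p = strpbrk(s, "\\,");
    return p ? *p : '\0';
}

/* Parse "n<delim>n<delim>n" into out[]. Returns the count stored, or -1
   on malformed input, a mixed delimiter, or more than max numbers. */
int parse_numbers(const char *s, char delim, int *out, int max)
{
    int count = 0;
    for (;;) {
        char *end;
        long v = strtol(s, &end, 10);
        if (end == s || count == max)
            return -1;              /* no digits here, or out[] is full */
        out[count++] = (int)v;
        if (*end == '\0')
            return count;           /* clean end of input */
        if (*end != delim)
            return -1;              /* unexpected or mixed delimiter */
        s = end + 1;                /* step over the delimiter */
    }
}
```

Because nothing is written into the string, the same buffer can be inspected again afterwards, which is exactly what the two-strtok approach could not do.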

Why not just use scanf()?
~/tmp$ cat test.c
#include <stdio.h>
int main(void) {
    int i;
    char c[2]; /* %[ stores a string, so leave room for the terminating NUL */
    while (2 == scanf("%d%1[\\.]", &i, c)) {
        printf("Int %d\nChar %s\n", i, c);
    }
    return 0;
}
... worked for me.
~/tmp$ gcc test.c && echo "123.456\789.4" | ./a.out
Int 123
Char .
Int 456
Char \
Int 789
Char .
~/tmp$

Related

SOLVED-what am I doing wrong that strtok does right in splitting a string

Previous question was : what am I doing wrong that strtok does right in splitting a string. Also separating the strtok to a function suddenly doesn't produce correct result?
This is the first time that I ask a question in stackoverflow so forgive me if this is wordy and incoherent. The last part of the question is elaborated at the bottom part of this question body.
So, I was doing a course assessment assigned by my college, in that, one question is :
Remove duplicate words and print only unique words
Input : A single sentence in which each word separated by a space
Output : Unique words separated by a space [Order of words should be same as in input]
Example:
Input : adc aftrr adc art
Output : adc aftrr art
Now, I have the solution, which is to split the string on whitespace and add each word to an array (set) if it does not already exist, but it is the implementation part that makes me want to pull my hair out.
#include <stdio.h>
#include <string.h>
#define MAX 20
int exists(char words[][MAX], int n, char *word){ // The existence check function
for(int i=0;i < n;i++)
if(strcmp(words[i],word) == 0)
return 1;
return 0;
}
void removeDuplicateOld(char*);
void removeDuplicateNew(char*);
int main(){
char sentence[MAX*50] = {0}; //arbitrary length
fgets(sentence,MAX*50,stdin);
sentence[strcspn(sentence,"\n")]=0;
printf("The Old One : \n");
removeDuplicateOld(sentence);
printf("\nThe New One : \n");
removeDuplicateNew(sentence);
}
The function that uses strtok to split the string:
void removeDuplicateNew(char *sentence){
char words[10][MAX] = {0};
int wi=0;
char *token = strtok(sentence," ");
while(token != NULL){
if(exists(words,wi,token)==0) {
strcpy(words[wi++],token);
}
token = strtok(NULL," ");
}
for(int i=0;i<wi;i++) printf("%s ",words[i]);
}
The old function that uses my old method (which is constructing a word until I hit whitespace) :
void removeDuplicateOld(char *sentence){
char objects[10][MAX] = {0}; //10 words with MAX letters
char tword[MAX];
int oi=0, si=0, ti=0;
while(sentence[si]!='\0'){
if(sentence[si] != ' ' && sentence[si+1] != '\0')
tword[ti++] = sentence[si];
else{
if(sentence[si+1] == '\0')
tword[ti++]=sentence[si];
tword[ti]='\0';
if(exists(objects,oi,tword) == 0){
strcpy(objects[oi++],tword);
}
ti=0; // The buffer[tword] is made to be overwritten
}
si++;
}
for(int i=0;i<oi;i++)
printf("%s ",objects[i]);
}
Solved : changed if(sentence[si+1] == '\0') to if(sentence[si+1] == '\0' && sentence[si]!=' ')
Here is the output :
input : abc def ghi abc jkl ghi
The Old One :
abc def ghi jkl
The New One :
abc def ghi jkl
Note: trailing whitespace in the input and output is not checked, because their own driver code doesn't handle it properly, while the strtok method does and passes all tests.
Both methods seem to produce the same results, but they produce different outputs according to the test cases. On top of that, moving the strtok method into a separate function [removeDuplicateNew] fails one test case, while writing it directly in main passes all tests. See these results:
Old Method Test Case results
Strtok Method as Separate Function Test Case Results
Following Was Moved To A separate Question Thread
When Coded in main method itself :
int main(){
char sentence[MAX*50] = {0}; //arbitrary length
fgets(sentence,MAX*50,stdin);
sentence[strcspn(sentence,"\n")] = 0;
char words[10][MAX] = {0};
int wi=0;
char *token = strtok(sentence," ");
while(token != NULL){
if(exists(words,wi,token)==0) {
strcpy(words[wi++],token);
}
token = strtok(NULL," ");
}
for(int i=0;i<wi;i++) printf("%s ",words[i]);
}
Strtok Method as inline code Test Case Results
For the record, it is the same code just placed in main, so what the heck happens here? When I separate it into a function and pass the string as an argument, it suddenly doesn't work properly.
Also, any advice on how I structured and worded my question is appreciated.

Your code...
void removeDuplicateOld(char *sentence){
char objects[10][MAX] = {0}; //10 words with MAX letters
char tword[MAX];
int oi=0, si=0, ti=0;
while(sentence[si]!='\0'){
if(sentence[si] != ' ' && sentence[si+1] != '\0')
tword[ti++] = sentence[si];
else{
// right here have hit SP.
// if SP followed by '\0'
// then append SP to my word... wrong! <=====
if(sentence[si+1] == '\0')
tword[ti++]=sentence[si];
tword[ti]='\0';
This is why the library function strtok() works better than hand-rolled code. It has been tested and proven to work as documented.
There's a better way to use strtok()
for( char *p = sentence; (p = strtok( p, " \n") ) != NULL; p = NULL )
if( exists( words, wi, p ) == 0 )
strcpy( words[wi++], p );
That's all you need. strtok() will even trim the LF off the buffer for you, no extra charge.
Final suggestion: Instead of a fixed-size array of words, you might consider a linked list (LL) that can easily grow. The function that appends a new word to the end of the list can quietly eat the word if it turns out to be a duplicate found while traversing to the end of the LL.
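To make the linked-list idea concrete, here is one possible sketch. The struct layout and the names append_unique and free_list are my own assumptions for illustration. The traversal that finds the tail doubles as the duplicate check.

```c
#include <stdlib.h>
#include <string.h>

struct node {
    char *word;
    struct node *next;
};

/* Append word to the list unless it is already present.
   Returns 1 if appended, 0 if it was a duplicate, -1 if malloc failed. */
int append_unique(struct node **head, const char *word)
{
    struct node **link = head;
    while (*link) {
        if (strcmp((*link)->word, word) == 0)
            return 0;               /* duplicate: quietly drop it */
        link = &(*link)->next;
    }
    struct node *n = malloc(sizeof *n);
    if (!n)
        return -1;
    n->word = malloc(strlen(word) + 1);
    if (!n->word) {
        free(n);
        return -1;
    }
    strcpy(n->word, word);          /* own copy, independent of the input buffer */
    n->next = NULL;
    *link = n;                      /* link points at the tail's next pointer */
    return 1;
}

/* Release every node and its word. */
void free_list(struct node *head)
{
    while (head) {
        struct node *next = head->next;
        free(head->word);
        free(head);
        head = next;
    }
}
```

Each token returned by strtok() can be passed straight to append_unique(); walking the list from head then prints the words in input order.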

parsing a file while reading in c

I am trying to read each line of a file and store binary values into appropriate variables.
I can see that there are many many other examples of people doing similar things and I have spent two days testing out different approaches that I found but still having difficulties getting my version to work as needed.
I have a txt file with the following format:
in = 00000000000, out = 0000000000000000
in = 00000000001, out = 0000000000001111
in = 00000000010, out = 0000000000110011
......
I'm attempting to use fscanf to consume the unwanted characters "in = ", "," and "out = "
and keep only the characters that represent binary values.
My goal is to store the first column of binary values, the "in" values into one variable
and the second column of binary values, the "out" value into another buffer variable.
I have managed to get fscanf to consume the "in" and "out" characters but I have not been
able to figure out how to get it to consume the "," "=" characters. Additionally, I thought that fscanf should consume the white space but it doesn't appear to be doing that either.
I can't seem to find any comprehensive list of available directives for scanners, other than the generic "%d, %s, %c....." and it seems that I need a more complex combination of directives to filter out the characters that I'm trying to ignore than I know how to format.
I could use some help with figuring this out. I would appreciate any guidance you could
provide to help me understand how to properly filter out "in = " and ", out = " and how to store
the two columns of binary characters into two separate variables.
Here is the code I am working with at the moment. I have tried other iterations of this code using fgetc() in combination with fscanf() without success.
int main()
{
FILE * f = fopen("hamming_demo.txt","r");
char buffer[100];
rewind(f);
while((fscanf(f, "%s", buffer)) != EOF) {
fscanf(f,"%[^a-z]""[^,]", buffer);
printf("%s\n", buffer);
}
printf("\n");
return 0;
}
The outputs from my code appear as follows:
= 00000000000,
= 0000000000000000
= 00000000001,
= 0000000000001111
= 00000000010,
= 0000000000110011
Thank you for your time.

The scanf family of functions is said to be a poor man's parser because it is not very tolerant of input errors. But if you are sure of the format of the input data, it allows for simple code. The only magic here is that a space in the format string matches any run of whitespace characters, including newlines, or none at all. Your code could become:
int main()
{
FILE * f = fopen("hamming_demo.txt", "r");
if (NULL == f) { // always test open
perror("Unable to open input file");
return 1;
}
    char in[50], out[50]; // directly get in and out
    // BEWARE: xscanf returns the number of converted elements (or EOF on
    // input failure before the first conversion), so compare against 2
    while (fscanf(f, " in = %49[01], out = %49[01]", in, out) == 2) {
        printf("%s - %s\n", in, out);
    }
    fclose(f);
    printf("\n");
    return 0;
}
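One drawback of scanning the stream directly is that a single malformed line stops the loop. A possible variation, sketched here under the assumption that each record sits on its own line, is to read lines with fgets and hand each one to sscanf. The helper name parse_in_out is made up for this example.

```c
#include <stdio.h>

/* Parse one line of the form "in = 0101, out = 0011".
   Returns 1 on success, 0 if the line does not match.
   The field widths (49) are one less than the buffer sizes,
   leaving room for the terminating NUL. */
int parse_in_out(const char *line, char in[50], char out[50])
{
    return sscanf(line, " in = %49[01], out = %49[01]", in, out) == 2;
}
```

A caller would loop with while (fgets(line, sizeof line, f)) and simply skip or report any line for which parse_in_out returns 0.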

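So basically you want to filter '0' and '1'? In this case fgets and a simple loop will be enough: copy each 0 and 1 to the front of the buffer, counting as you go, and null-terminate the string at the end. (Note that this joins the in and out digits of a line into one string.)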
#include <stdio.h>
int main(void)
{
char str[50];
char *ptr;
// Replace stdin with your file
while ((ptr = fgets(str, sizeof str, stdin)))
{
int count = 0;
while (*ptr != '\0')
{
if ((*ptr >= '0') && (*ptr <= '1'))
{
str[count++] = *ptr;
}
ptr++;
}
str[count] = '\0';
puts(str);
}
}

How do I make this shell to parse the statement with quotes around them in C?

I am trying to make this shell parse. How do I make the program implement parsing in a way so that commands that are in quotes will be parsed based on the starting and ending quotes and will consider it as one token? During the second while loop where I am printing out the tokens I think I need to put some sort of if statement, but I am not too sure. Any feedback/suggestions are greatly appreciated.
#include <stdio.h> //printf
#include <unistd.h> //isatty
#include <string.h> //strlen,sizeof,strtok
int main(int argc, char **argv[]){
int MaxLength = 1024; //size of buffer
int inloop = 1; //loop runs forever while 1
char buffer[MaxLength]; //buffer
bzero(buffer,sizeof(buffer)); //zeros out the buffer
char *command; //character pointer of strings
char *token; //tokens
const char s[] = "-,+,|, ";
/* part 1 isatty */
if (isatty(0))
{
while(inloop ==1) // check if the standard input is from terminal
{
printf("$");
command = fgets(buffer,sizeof(buffer),stdin); //fgets(string of char pointer,size of,input from where
token = strtok(command,s);
while (token !=NULL){
printf( " %s\n",token);
token = strtok(NULL, s); //checks for elements
}
if(strcmp(command,"exit\n")==0)
inloop =0;
}
}
else
printf("the standard input is NOT from a terminal\n");
return 0;
}

For an arbitrary command-line syntax, strtok is not the best function. It works for simple cases, where the words are delimited by special characters or white space, but there will come a time when you want to split something like ls>out into three tokens. strtok can't handle this, because it needs to place its terminating zeros somewhere, and here there is no spare delimiter character to overwrite.
Here's a quick and dirty custom command-line parser:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int error(const char *msg)
{
printf("Error: %s\n", msg);
return -1;
}
int token(const char *begin, const char *end)
{
printf("'%.*s'\n", (int)(end - begin), begin); // precision for %.*s must be an int
return 1;
}
int parse(const char *cmd)
{
const char *p = cmd;
int count = 0;
for (;;) {
while (isspace(*p)) p++;
if (*p == '\0') break;
if (*p == '"' || *p == '\'') {
int quote = *p++;
const char *begin = p;
while (*p && *p != quote) p++;
if (*p == '\0') return error("Unmatched quote");
count += token(begin, p);
p++;
continue;
}
if (strchr("<>()|", *p)) {
count += token(p, p + 1);
p++;
continue;
}
if (isalnum(*p)) {
const char *begin = p;
while (isalnum(*p)) p++;
count += token(begin, p);
continue;
}
return error("Illegal character");
}
return count;
}
This code understands words separated by white-space, strings enclosed in single or double quotation marks, and single-character operators. It doesn't understand escaped quotation marks inside quotes, nor non-alphanumeric characters in words, such as the dot.
The code is not hard to understand and you can extend it easily to understand double-char operators such as >> or comments.
If you want to escape quotation marks, you'll have to recognise the escape character in parse and unescape it and possible other escape sequences in token.
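As a rough sketch of what that escape handling might look like, assuming backslash is the escape character: find_closing_quote would replace the inner scanning loop in parse, and unescape would be called from token. Both names are hypothetical and not part of the parser above.

```c
#include <stddef.h>

/* Scan from p (just after the opening quote) to the matching closing
   quote. A backslash protects the character that follows it.
   Returns a pointer to the closing quote, or NULL if it is missing. */
const char *find_closing_quote(const char *p, int quote)
{
    while (*p && *p != quote) {
        if (*p == '\\' && p[1] != '\0')
            p++;                    /* skip the escaped character too */
        p++;
    }
    return *p ? p : NULL;
}

/* Copy [begin, end) into dst, turning each "\x" into "x".
   Writes at most dstsize-1 characters plus a terminating NUL and
   returns the number of characters written (excluding the NUL). */
size_t unescape(const char *begin, const char *end, char *dst, size_t dstsize)
{
    size_t n = 0;
    if (dstsize == 0)
        return 0;
    while (begin < end && n + 1 < dstsize) {
        if (*begin == '\\' && begin + 1 < end)
            begin++;                /* drop the backslash, keep what follows */
        dst[n++] = *begin++;
    }
    dst[n] = '\0';
    return n;
}
```

This treats every escaped character literally; sequences such as \n or \t would need their own cases in unescape.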

First, you've declared argv to be an array of pointers to... pointers. In fact, it is an array of pointers to chars. So:
int main(int argc, char **argv){
The trend is you want to reach for [], which got you into incorrect code here, but the idiom in C/C++ is more commonly to use pointer syntax, e.g.:
const char* s = "-+| ";
FWIW.
Also, note that fgets() will return NULL when it hits end of file (e.g., the user types CTRL-D on *nix or CTRL-Z on DOS/Windows). You probably don't want a segmentation violation when that happens.
Also, bzero() is a nonportable function (you probably don't care in this context) and the C compiler will happily initialize an array to zeroes for you if you ask it to (possibly worth caring about; syntax demonstrated below).
Next, as soon as you allow quoted strings, the next language question that immediately arises is: "how do I quote a quote?". Then, you are immediately out of the territory that can be handled cleanly with strtok(). I'm not 100% sure how you want to break your string into tokens. Using strtok() in the way you do, I think the string "a|b" would produce two tokens, "a" and "b", making you overlook the "|". You're treating "|" and "-" and "+" like whitespace, to be ignored, which is not generally what a shell does. For example, given this command-line:
echo 'This isn''t so hard' | cp -n foo.h .. >foo.out
I would probably want to get the following list of tokens:
echo
'This isn''t so hard'
|
cp
-n
foo.h
..
>
foo.out
Usually, characters like '+' and '-' are not special for most shells' tokenizing process (unlike '|' and '&' and '<', etc. which are instructions to the shell that the spawned command never sees). They get passed onto the application that is then free to decide "'-' indicates this word is an option and not a filename" or whatever.
What follows is a version of your code that produces the output I described (which may or may not be exactly what you want) and allows either double or single-quoted arguments (trivial to extend to handle back-ticks too) that can contain quote marks of the same kind, etc.
#include <stdio.h> //printf
#include <unistd.h> //isatty
#include <string.h> //strlen,sizeof,strtok
#define MAXLENGTH 1024
int main(int argc, char **argv){
int inloop = 1; //loop runs forever while 1
char buffer[MAXLENGTH] = {'\0'}; //compiler inits entire array to NUL bytes
// bzero(buffer,sizeof(buffer)); //zeros out the buffer
char *command; //character pointer of strings
char *token; //tokens
char* rover;
const char* StopChars = "|&<> ";
size_t toklen;
/* part 1 isatty */
if (isatty(0))
{
while(inloop ==1) // check if the standard input is from terminal
{
printf("$");
token = command = fgets(buffer,sizeof(buffer),stdin); //fgets(string of char pointer,size of,input from where
if(command)
while(*token)
{
// skip leading whitespace
while(*token == ' ')
++token;
rover = token;
// if possible quoted string
if(*rover == '\'' || *rover == '\"')
{
char Quote = *rover++;
while(*rover)
if(*rover != Quote)
++rover;
else if(rover[1] == Quote)
rover += 2;
else
{
++rover;
break;
}
}
// else if special-meaning character token
else if(strchr(StopChars, *rover))
++rover;
// else generic token
else
while(*rover)
if(strchr(StopChars, *rover))
break;
else
++rover;
toklen = (size_t)(rover-token);
if(toklen)
printf(" %*.*s\n", (int)toklen, (int)toklen, token); // * arguments must be int
token = rover;
}
if(!command || strcmp(command,"exit\n")==0) // also stop on EOF (fgets returned NULL)
inloop =0;
}
}
else
printf("the standard input is NOT from a terminal\n");
return 0;
}

Regarding your specific request: commands that are in quotes will be parsed based on the starting and ending quotes.
You can use strtok() by tokenizing on the " character. Here's how:
char a[]={"\"this is a set\" this is not"};
char *buf;
buf = strtok(a, "\"");
In that code snippet, buf will point to the text this is a set (without the quotes).
Note the use of \ to escape the " character so that it can be used as a token delimiter.
Also, though it is not your main issue, you need to
change this:
const char s[] = "-,+,|, "; //strtok will parse on -,+| and a " " (space)
To:
const char s[] = "-+| "; //strtok will parse on only -+| and a " " (space)
strtok() will parse out whatever you have in the delimiter string, including ","

Reading a file in C

I have an input file I need to extract words from. The words can only contain letters and numbers so anything else will be treated as a delimiter. I tried fscanf,fgets+sscanf and strtok but nothing seems to work.
while(!feof(file))
{
fscanf(file,"%s",string);
printf("%s\n",string);
}
Above one clearly doesn't work because it doesn't use any delimiters so I replaced the line with this:
fscanf(file,"%[A-z]",string);
It reads the first word fine but the file pointer keeps rewinding so it reads the first word over and over.
So I used fgets to read the first line and use sscanf:
sscanf(line,"%[A-z]%n",word,len);
line+=len;
This one doesn't work either because whatever I try I can't move the pointer to the right place. I tried strtok but I can't find how to set the delimiters:
while(p != NULL) {
printf("%s\n", p);
p = strtok(NULL, " ");
This one obviously takes the blank character as a delimiter, but I have literally hundreds of delimiters.
Am I missing something here? Extracting words from a file seemed a simple concept at first, but nothing I try really works.

Consider building a minimal lexer. When in state word it would remain in it as long as it sees letters and numbers. It would switch to state delimiter when encountering something else. Then it could do an exact opposite in the state delimiter.
Here's an example of a simple state machine which might be helpful. For the sake of brevity it works only with digits. echo "2341,452(42 555" | ./main will print each number in a separate line. It's not a lexer but the idea of switching between states is quite similar.
#include <stdio.h>
#include <string.h>
int main(void) {
    enum { WORD = 1, DELIM = 2, BUFLEN = 1024 };
    int state = DELIM, ptr = 0, c;
    char buffer[BUFLEN];
    const char *digits = "1234567890";
    while ((c = getchar()) != EOF) {
        if (c != '\0' && strchr(digits, c)) {
            if (DELIM == state)
                ptr = 0;                /* a new number starts here */
            if (ptr < BUFLEN - 1)       /* leave room for the terminator */
                buffer[ptr++] = (char)c;
            state = WORD;
        } else {
            if (WORD == state) {
                buffer[ptr] = '\0';
                printf("%s\n", buffer);
            }
            state = DELIM;
        }
    }
    if (WORD == state) {                /* input ended inside a number */
        buffer[ptr] = '\0';
        printf("%s\n", buffer);
    }
    return 0;
}
If the number of states increases you can consider replacing if statements checking the current state with switch blocks. The performance can be increased by replacing getchar with reading a whole block of the input to a temporary buffer and iterating through it.
In case of having to deal with a more complex input file format you can use lexical analysers generators such as flex. They can do the job of defining state transitions and other parts of lexer generation for you.

Several points:
First of all, do not use feof(file) as your loop condition; feof won't return true until after you attempt to read past the end of the file, so your loop will execute once too often.
Second, you mentioned this:
fscanf(file,"%[A-z]",string);
It reads the first word fine but the file pointer keeps rewinding so it reads the first word over and over.
That's not quite what's happening; if the next character in the stream doesn't match the format specifier, scanf returns without having read anything, and string is unmodified.
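Both points come down to the same habit: drive the loop with the return value of the read call rather than with feof. A minimal sketch, using a made-up helper count_words:

```c
#include <stdio.h>

/* Count whitespace-separated words in a stream. The loop ends as soon
   as fscanf fails to convert, which covers end of file and read errors
   alike, so no extra iteration happens after the last word. */
long count_words(FILE *file)
{
    char word[100];
    long count = 0;
    while (fscanf(file, "%99s", word) == 1)   /* width guards the buffer */
        count++;
    return count;
}
```

The same test applies to a scanset such as %[A-z]: if the call returns 0, nothing was consumed, and the offending character has to be skipped explicitly before trying again.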
Here's a simple, if inelegant, method: it reads one character at a time from the input file, checks to see if it's either an alpha or a digit, and if it is, adds it to a string.
#include <stdio.h>
#include <ctype.h>
int get_next_word(FILE *file, char *word, size_t wordSize)
{
    size_t i = 0;
    int c;
    /**
     * Skip over any non-alphanumeric characters
     */
    while ((c = fgetc(file)) != EOF && !isalnum(c))
        ; // empty loop
    /**
     * Store alphanumeric characters in word until a delimiter, EOF,
     * or a full buffer; a character that does not belong to this
     * word is pushed back so the next call sees it
     */
    while (c != EOF && isalnum(c) && i < wordSize - 1)
    {
        word[i++] = (char)c;
        c = fgetc(file);
    }
    if (c != EOF)
        ungetc(c, file);
    word[i] = 0;
    return i > 0; // 0 only when no word was left to read
}
int main(void)
{
char word[SIZE]; // where SIZE is large enough to handle expected inputs
FILE *file;
...
while (get_next_word(file, word, sizeof word))
// do something with word
...
}

I would use:
FILE *file;
char string[200];
while(fscanf(file, "%*[^A-Za-z]"), fscanf(file, "%199[a-zA-Z]", string) > 0) {
/* do something with string... */
}
This skips over non-letters and then reads a string of up to 199 letters. The only oddness is that if you have any 'words' that are longer than 199 letters they'll be split up into multiple words, but you need the limit to avoid a buffer overflow...

What are your delimiters? The second argument to strtok should be a string containing your delimiters, and the first should be a pointer to your string the first time round then NULL afterwards:
char * p = strtok(line, ","); // assuming a , delimiter
while(p)
{
    printf("%s\n", p);
    p = strtok(NULL, ","); // returns NULL when no tokens remain
}
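For the "hundreds of delimiters" case, the delimiter string does not have to be typed out by hand. One possible sketch, assuming single-byte characters and the current C locale, builds it from everything that isalnum rejects. The names build_delims and split_words are invented for the example.

```c
#include <ctype.h>
#include <string.h>

/* Fill delims with every non-NUL byte value that is not a letter or
   digit. delims must have room for 256 bytes. */
void build_delims(char delims[256])
{
    size_t n = 0;
    for (int c = 1; c < 256; c++)
        if (!isalnum(c))
            delims[n++] = (char)c;
    delims[n] = '\0';
}

/* Split line in place with strtok. Stores up to max word pointers in
   words[] and returns how many were stored. */
size_t split_words(char *line, char **words, size_t max)
{
    char delims[256];
    size_t n = 0;
    build_delims(delims);
    for (char *p = strtok(line, delims); p != NULL && n < max;
         p = strtok(NULL, delims))
        words[n++] = p;
    return n;
}
```

In the default "C" locale this also treats bytes above 127 as delimiters, so multi-byte UTF-8 letters would split words.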

Parsing text in C

I have a file like this:
...
words 13
more words 21
even more words 4
...
(General format is a string of non-digits, then a space, then any number of digits and a newline)
and I'd like to parse every line, putting the words into one field of the structure, and the number into the other. Right now I am using an ugly hack of reading the line while the chars are not numbers, then reading the rest. I believe there's a clearer way.

Edit: You can use pNum-buf to get the length of the alphabetical part of the string, and use strncpy() to copy that into another buffer. Be sure to add a '\0' to the end of the destination buffer. I would insert this code before the pNum++.
size_t len = pNum - buf; /* characters before the last space */
strncpy(newBuf, buf, len);
newBuf[len] = '\0';
You could read the entire line into a buffer and then use:
char *pNum;
if ((pNum = strrchr(buf, ' ')) != NULL) {
pNum++;
}
to get a pointer to the number field.

fscanf(file, "%s %d", word, &value);
This gets the values directly into a string and an integer, and copes with variations in whitespace and numerical formats, etc.
Edit
Ooops, I forgot that you had spaces between the words.
In that case, I'd do the following. (Note that it truncates the original text in 'line')
// Scan to find the last space in the line
char *p = line;
char *lastSpace = NULL;
while(*p != '\0')
{
    if (*p == ' ')
        lastSpace = p;
    p++;
}
if (lastSpace == NULL)
    return("parse error");
// Replace the last space in the line with a NUL
*lastSpace = '\0';
// Advance past the NUL to the first character of the number field
lastSpace++;
char *word = line; // line now ends where the last space used to be
int number = atoi(lastSpace);
You can solve this using stdlib functions, but the above is likely to be more efficient as you're only searching for the characters you are interested in.
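For comparison, here is roughly what the standard-library version could look like: strrchr finds the last space and strtol both converts and validates the number. The function name split_text_number is an assumption of mine, and like the loop above it truncates the original line.

```c
#include <stdlib.h>
#include <string.h>

/* Split "some words 13" in place at the last space.
   On success the text part remains in line, *number receives the
   value, and 0 is returned. Returns -1 if there is no space or the
   last field is not a whole decimal number. */
int split_text_number(char *line, long *number)
{
    char *sp = strrchr(line, ' ');
    char *end;
    long value;

    if (sp == NULL || sp[1] == '\0')
        return -1;
    value = strtol(sp + 1, &end, 10);
    if (end == sp + 1 || (*end != '\0' && *end != '\n'))
        return -1;                  /* no digits, or junk after them */
    *sp = '\0';                     /* cut the line at the last space */
    *number = value;
    return 0;
}
```

Unlike atoi, strtol reports where conversion stopped, so a line such as "words 12abc" is rejected instead of silently yielding 12.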

Given the description, I think I'd use a variant of this (now tested) C99 code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <ctype.h>
struct word_number
{
char word[128];
long number;
};
int read_word_number(FILE *fp, struct word_number *wnp)
{
char buffer[140];
if (fgets(buffer, sizeof(buffer), fp) == 0)
return EOF;
size_t len = strlen(buffer);
if (buffer[len-1] != '\n') // Error if line too long to fit
return EOF;
buffer[--len] = '\0';
char *num = &buffer[len-1];
while (num > buffer && !isspace((unsigned char)*num))
num--;
if (num == buffer) // No space in input data
return EOF;
char *end;
wnp->number = strtol(num+1, &end, 0);
if (*end != '\0') // Invalid number as last word on line
return EOF;
*num = '\0';
if ((size_t)(num - buffer) >= sizeof(wnp->word)) // Non-number part too long
return EOF;
memcpy(wnp->word, buffer, num - buffer + 1); // +1 copies the terminating '\0'
return(0);
}
int main(void)
{
struct word_number wn;
while (read_word_number(stdin, &wn) != EOF)
printf("Word <<%s>> Number %ld\n", wn.word, wn.number);
return(0);
}
You could improve the error reporting by returning different values for different problems.
You could make it work with dynamically allocated memory for the word portion of the lines.
You could make it work with longer lines than I allow.
You could scan backwards over digits instead of non-spaces - but this allows the user to write "abc 0x123" and the hex value is handled correctly.
You might prefer to ensure there are no digits in the word part; this code does not care.

You could try using strtok() to tokenize each line, and then check whether each token is a number or a word (a fairly trivial check once you have the token string - just look at the first character of the token).

Assuming that the number is immediately followed by '\n'.
you can read each line to chars buffer, use sscanf("%d") on the entire line to get the number, and then calculate the number of chars that this number takes at the end of the text string.

Depending on how complex your strings become you may want to use the PCRE library. At least that way you can compile a perl'ish regular expression to split your lines. It may be overkill though.

Given the description, here's what I'd do: read each line as a single string using fgets() (making sure the target buffer is large enough), then split the line using strtok(). To determine if each token is a word or a number, I'd use strtol() to attempt the conversion and check the error condition. Example:
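#include <stdlib.h>
#include <stdio.h>
#include <string.h>
/* Illustrative limits (assumed values; adjust to the real input) */
#define MAX_LINE_LENGTH 256
#define MAX_STR_SIZE 64
#define MAX_NUM_STRINGS 16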
/**
* Read the next line from the file, splitting the tokens into
* multiple strings and a single integer. Assumes input lines
* never exceed MAX_LINE_LENGTH and each individual string never
* exceeds MAX_STR_SIZE. Otherwise things get a little more
* interesting. Also assumes that the integer is the last
* thing on each line.
*/
int getNextLine(FILE *in, char (*strs)[MAX_STR_SIZE], int *numStrings, int *value)
{
char buffer[MAX_LINE_LENGTH];
int rval = 1;
if (fgets(buffer, sizeof buffer, in))
{
char *token = strtok(buffer, " ");
*numStrings = 0;
while (token)
{
char *chk;
*value = (int) strtol(token, &chk, 10);
if (*chk != 0 && *chk != '\n')
{
strcpy(strs[(*numStrings)++], token);
}
token = strtok(NULL, " ");
}
}
else
{
/**
* fgets() hit either EOF or error; either way return 0
*/
rval = 0;
}
return rval;
}
/**
* sample main
*/
int main(void)
{
FILE *input;
char strings[MAX_NUM_STRINGS][MAX_STR_SIZE];
int numStrings;
int value;
input = fopen("datafile.txt", "r");
if (input)
{
while (getNextLine(input, strings, &numStrings, &value))
{
/**
* Do something with strings and value here
*/
}
fclose(input);
}
return 0;
}
