What is wrong with the following strtok() usage?

I am using VC2010 and am trying to read the contents of a file into a struct as follows, but it gives me a runtime error.
char buf[100];
char *token = NULL;
while( fgets (buf , 100 , rd) != NULL )
{
token = strtok( buf,", ");
test_st.fp.chunk_offset = atol(token);
printf("\n %llu ", test_st.fp.chunk_offset);
//OPTION 1: if i do this there will be no runtime error but the same
// value as the first token will be assigned to chunk_length
token = strtok(buf, ",");
//OPTION 2: this line gives error in the second while loop iteration
token=strtok(NULL,",");
test_st.fp.chunk_length = atol(token);
printf("%llu ", test_st.fp.chunk_length);
token = strtok(NULL, " ");
....
}
My other problem is: how can I use strtok(), or any other function if there is one, to assign a very long character string (like the one after the second comma in the following file content) to a structure data member (in this case fp.fing_print)?
Here is the first part of the file I am trying to read:
0,4096,2ed692b40e335f7c20b71c5ed0486634
4096,3028,da40bf20c8ff189087b8bd7e8580118a
7124,2177,e6dfaee81e96095d302c82f9fd73dc55
9301,1128,76201eadff3c89a381a314ed311f75ff
The structure definition I am trying to read this into is:
typedef struct fpinfo
{
unsigned long chunk_offset;
unsigned long chunk_length;
unsigned char fing_print[32];
}fpinfo;
typedef struct Hash_Entry {
struct Hash_Entry *next; /* Link entries within same bucket. */
unsigned namehash; /* hash value of key */
struct fpinfo fp;
} Hash_Entry;
EDIT:
Yes, you are right, I have to use the second option.
I have found the problem with this one: the file has an empty line after every line, and the error occurs when the code reaches these empty lines.
But I am still stuck on the other problem, where I have to read the last part of each line into the array member (fp.fing_print) of the structure.
Thank you everybody for your help!
EDIT:
The fing_print[32] array is supposed to hold 32 characters, which are the result of an MD5 hash function. I am not sure whether I have to make it a null-terminated string or not. If you could give me a tip about it I would be more than grateful.

As stated by paulsm4 you need to check strtok() return value. The following code works and includes the creation of the string:
while( fgets (buf , 100 , rd) != NULL )
{
token = strtok(buf,", \n");
if (0 != token)
{
test_st.fp.chunk_offset = atol(token);
printf("\tchunk_offset=%lu\n", test_st.fp.chunk_offset);
token = strtok(0, ", \n");
}
if (0 != token)
{
test_st.fp.chunk_length = atol(token);
printf("\tchunk_length=%lu\n", test_st.fp.chunk_length);
token = strtok(0, ", \n");
}
if (0 != token)
{
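/* EDIT: fing_print datatype changed (presumably to a char array).
It must hold at least 33 bytes: with only 32, the size - 1 limit
below truncates the 32-character hash to 31 characters. */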
memset(test_st.fp.fing_print, 0, sizeof(test_st.fp.fing_print));
strncpy(test_st.fp.fing_print,
token,
sizeof(test_st.fp.fing_print) - 1);
/* test_st.fp.fing_print = _strdup(token); */
printf("\tfing_print=[%s]\n", test_st.fp.fing_print);
}
printf("\n");
}
The delimiter set is ", \n" because fgets() stores the newline character in buf. With "\n" in the set, an empty line yields no token at all, so the NULL checks simply skip it.
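For the fingerprint part of the question, here is a minimal sketch of one way to do it. It assumes the field is widened to 33 chars so that the 32 hex digits plus a terminating '\0' fit; fp_record and parse_fp_line are hypothetical names that do not appear in the question, and strtoul is swapped in for atol because the fields are unsigned.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical variant of the question's fpinfo: one extra byte so the
   32 hex digits can be stored as a null-terminated string. */
typedef struct fp_record {
    unsigned long chunk_offset;
    unsigned long chunk_length;
    char fing_print[33];
} fp_record;

/* Parses "offset,length,fingerprint" in place (line is modified by strtok).
   Returns 1 on success, 0 if a field is missing or the fingerprint is not
   exactly 32 characters long. */
int parse_fp_line(char *line, fp_record *out)
{
    static const char delims[] = ", \r\n";
    char *tok;

    assert(line != NULL && out != NULL);

    tok = strtok(line, delims);
    if (tok == NULL)
        return 0;                       /* empty line: nothing to parse */
    out->chunk_offset = strtoul(tok, NULL, 10);

    tok = strtok(NULL, delims);
    if (tok == NULL)
        return 0;
    out->chunk_length = strtoul(tok, NULL, 10);

    tok = strtok(NULL, delims);
    if (tok == NULL || strlen(tok) != sizeof out->fing_print - 1)
        return 0;
    /* The token is exactly 32 characters, so this also copies its '\0'. */
    memcpy(out->fing_print, tok, sizeof out->fing_print);
    return 1;
}
```

If the field is instead kept as unsigned char[32] without a terminator, copy exactly 32 bytes with memcpy and print it with a bounded format such as %.32s (casting the pointer to char *), since %s alone would read past the array.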

You need to check for a NULL return:
if ((token = strtok(buf, ",")) != NULL) {
// found a first token (the text before the first comma)
}
if (token && (token = strtok(NULL, "'")) != NULL) {
// two in a row: a first token, then a second one ending at an apostrophe
}
...

Related

Reading from an input file and storing words into an array [duplicate]

The end goal is to output a text file where repeating words are encoded as single digits instead. The current problem I'm having is reading words and storing them into an array.
#include<stdio.h>
#include<stdlib.h>
#include<string.h>
#define MAX_CODE 120
void main() {
FILE *inputfile = fopen("input.txt","rw");
char buffer[128];
char *token;
char *words[MAX_CODE];
int i = 0;
while(fgets(buffer, 128, inputfile)){
token = strtok(buffer," ");
printf("Token %d was %s",i,token);
while(token != NULL) {
words[i] = malloc(strlen(token)+1);
strcpy(words[i], token);
i++;
token = strtok(buffer," ");
}
}
for(int i = 0; i<3; i++) printf("%d\n%s\n",i,words[i]);
printf("End");
}
What I get is segmentation fault errors, or nothing. What I want is for words to be an array of strings. I'm allocating memory for each string, so where am I going wrong?

Your second call to strtok should pass NULL for the first argument. Otherwise, strtok will parse the first token over and over again.
token = strtok(buffer," ");
printf("Token %d was %s\n",i,token);
while(i < MAX_CODE && token != NULL) {
words[i] = malloc(strlen(token)+1);
strcpy(words[i], token);
i++;
token = strtok(NULL," ");
}
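The check against MAX_CODE is for safety's sake. Within a single line it cannot trigger today, because a 128-byte buffer holds at most 64 space-delimited tokens, but i is never reset between lines, so a file with more than MAX_CODE words in total would overflow words without it. It also protects you if you ever increase the size of your buffer or reduce the value of MAX_CODE.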
From cppreference:
If str != NULL, the call is treated as the first call to strtok for this particular string. ...
If str == NULL, the call is treated as a subsequent calls to strtok: the function continues from where it left in previous invocation. ...
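To make the first-call/subsequent-call distinction concrete, here is a small sketch of the word-collecting loop as a function. The name collect_words and its parameters are hypothetical, not from the question; "\n" is added to the delimiters on the assumption that the lines come from fgets().

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Appends a copy of every space/newline-separated word in line to words[],
   starting at index count and never exceeding max entries.
   Returns the new number of stored words. line is modified by strtok. */
size_t collect_words(char *line, char *words[], size_t count, size_t max)
{
    char *token;

    assert(line != NULL && words != NULL);

    token = strtok(line, " \n");          /* first call: pass the buffer */
    while (token != NULL && count < max) {
        words[count] = malloc(strlen(token) + 1);
        if (words[count] == NULL)
            break;                        /* out of memory: stop early */
        strcpy(words[count], token);
        count++;
        token = strtok(NULL, " \n");      /* later calls: pass NULL */
    }
    return count;
}
```

It would be called once per line read, for example count = collect_words(buffer, words, count, MAX_CODE); and the caller frees each words[i] when finished.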

Taking the last word in the line with strtok

Given a file with the following line:
word1 word2 word3 word4
I tried to write the following code:
FILE* f = fopen("my_file.txt", "r");
char line[MAX_BUFFER + 1];
if (fgets(line, MAX_LENGTH_LINE, f) == NULL) {
return NULL;
}
char* word = strtok(line, " ");
for (int i = 0; i < 4; i++) {
printf("%s ", word);
word = strtok(NULL, " ");
}
This prints the words.
It works, but there is something I don't understand.
How does it retrieve the last word, word4? (I don't understand it because there is no space after "word4".)

I'm not quite sure what you're asking. Are you asking how the program was able to correctly read word4 from the file even though it wasn't followed by a space? Or are you asking why, when the program printed word4 back out, it didn't seem to print a space after it?
The answer to the first question is that strtok is designed to give you tokens separated by delimiters, not terminated by delimiters. There is no requirement that the last token be followed by a delimiter.
To see the answer to the second question, it may be more clear if we adjust the program and its printout slightly:
char* word = strtok(line, " ");
for (int i = 0; word != NULL; i++) {
printf("%d: \"%s\"\n", i, word);
word = strtok(NULL, " ");
}
I have made two changes here:
The loop runs until word is NULL, that is, as long as strtok finds another word on the line. (This is to make sure we see all the words, and to make sure we're not trying to treat the fourth word specially in any way. If you were trying to treat the fourth word specially in some way, please say so.)
The words are printed back out surrounded by quotes, so that we can see exactly what they contain.
When I run the modified program, I see:
0: "word1"
1: "word2"
2: "word3"
3: "word4
"
That last line looks very strange at first, but the explanation is straightforward. You originally read the line using fgets, which does copy the terminating \n character into the line buffer. So it ends up staying tacked onto word4; that is, the fourth "word" is "word4\n".
For this reason, it's often a good idea to include \n in the set of whitespace delimiter characters you hand to strtok -- that is, you can call strtok(line, " \n") instead. If I do that (in both of the strtok calls), the output changes to
0: "word1"
1: "word2"
2: "word3"
3: "word4"
which may be closer to what you expected.
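An alternative to adding \n to the delimiter set is to strip the newline before tokenizing. The sketch below does that with strcspn() and returns the last word of the line; last_word is a hypothetical helper name used only for illustration.

```c
#include <assert.h>
#include <string.h>

/* Returns the last space-separated word of line, or NULL if there is none.
   The trailing newline left by fgets() is removed first. line is modified. */
char *last_word(char *line)
{
    char *word, *last = NULL;

    assert(line != NULL);
    line[strcspn(line, "\n")] = '\0';     /* cut at the first newline, if any */

    for (word = strtok(line, " "); word != NULL; word = strtok(NULL, " "))
        last = word;                      /* tokens point into line */
    return last;
}
```

Because the returned pointer refers to memory inside line, it stays valid only until that buffer is overwritten, for instance by the next fgets() call.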

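Your code doesn't check the return value of strtok(), which is unsafe: if the line contains fewer than four words, strtok() returns NULL and printf() is handed a null pointer for %s. A helper that collects every token avoids hard-coding the count:
#include <stdlib.h>
#include <string.h>
/* Split a string into tokens.
content: string to split (modified in place by strtok)
delim: delimiter characters
psize: receives the number of tokens found
returns: malloc'd array of pointers into content (the caller frees it),
or NULL if there are no tokens (in which case *psize is not written)
*/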
const char **split(char *content, const char *delim, int *psize)
{
char *token;
const char **tokens;
int capacity;
int size = 0;
token = strtok(content, delim);
if (!token)
{
return NULL;
}
// Initialize tokens
tokens = malloc(sizeof(char *) * 64);
if (!tokens)
{
exit(-1);
}
capacity = 64;
tokens[size++] = token;
while ((token = strtok(NULL, delim)))
{
if (size >= capacity)
{
tokens = realloc(tokens, sizeof(char *) * capacity * 2);
if (!tokens)
{
exit(-1);
}
capacity *= 2;
}
tokens[size++] = token;
}
// if (size < capacity)
// {
// tokens = realloc(tokens, sizeof(char *) * size);
// if (!tokens)
// {
// exit(-1);
// }
// }
*psize = size;
return tokens;
}

How would I read a line, parse the information and then attribute it to a struct in C?

I am currently trying to understand how to go through a .txt file in C. I think I have almost everything worked out, but what I need to do is confusing: I need to create an array of pointers that point to structs.
Each line in my .txt file should have information corresponding to a single struct. Each line should start with a name followed by some float values.
My first question is: when I read the lines and parse them using strtok, how would I get that information into a struct?
Second, how would I then make the pointer at index i point to the struct?
I tried handling the name separately from the numbers, since the numbers need their own atof conversion (they start out as strings). However, I think this is probably incorrect: I want to read multiple lines, and the code I have before the while loop for obtaining the name will only run once, so the following lines will not have the name separated. I can delimit my text file as I choose, so maybe I can just separate the name with a semicolon and the rest with spaces?
If this question seems confusing, it's probably because I am overthinking.
Should I be declaring a struct such as : Sample tmp;
I've been reading examples but I can't figure out how to put the information together. Let me know if I declared my array of pointers incorrectly, which I think I did. I think the line that says:
sample arr[SIZE] = {NULL}; might be incorrect, but I am not sure. If you can help me work out the logic behind all this I would appreciate it. Thanks.
typedef struct sample{
char* name;
int list_len;
float* value_list;
}sample;
void read_and_parse(){
const int SIZE = 1024;
sample* sample = (sample*)malloc(sizeof(sample); //pointer allocation?
FILE* fin;
fin = fopen("record.txt", "r");
if (fin == NULL) {
printf("record.txt could not be opened \n");
exit(1);
}
else {
int i= 0;
sample arr[SIZE] = {NULL}; //here I try to make the array of pointers
char linebuf[SIZE];
token = strtok(linebuf, " "); //grab the first item
while (fgets(linebuf, SIZE, fin) && i<SIZE) {
arr[i] = malloc(sizeof(sample));
arr[i.name] = token;
token = strtok(NULL, " ");
// now parse the linebuf and fill arr[i] with it
i++;
}
Edited: 11/02/2017
Any print statements you see are just markers I placed for testing, so that I can recognize what is running once I finally get this code to compile.
Here is a much better edited version of the code. I think it should work now.
typedef struct sample{
char* name;
int list_len;
float* value_list;
}sample;
void read_and_parse(FILE **fin, sample* arr[]){
const int SIZE = 1024;
if (*fin == NULL) {
printf("record.txt could not be opened \n");
exit(1);
}
else {
printf("successfully opened file\n");
char linebuf[SIZE];
while ( fgets(linebuf, SIZE, fin) ) {
arr[i] = malloc(sizeof(sample));
int floats_per_line = 0;
while(linebuf[i]){
if(linebuf[i] == ' ');
++floats_per_line;
}
arr[i]->list_len = values_per_line;
arr[i]->value_list = (float*)malloc(sizeof(float)*floats_per_line);
arr[i]->name = strdup(strtok(linebuf, ' '));
char* tok;
int j = 0
while(tok = strtok(NULL, ' ')){
arr[i]->value_list[j] = atof(tok);
++j
}
i++;
}
}
fclose(fin);
}

How would I read a line, parse the information and then attribute it to a struct ?
Read with fgets() which converts a line of file input into a string. OP does that well. Then parse the string.
when I read the lines and parse them using strtok first, how (to) get that information in a struct?
Should I be declaring a struct such as : sample tmp;
Pass the string to a helper function to parse it into a sample that can hold any input. So the pointer members of tmp need to point to maximal space.
char name[SIZE];
float f[SIZE/2]; // value_list is a float*, so the scratch buffer must hold floats
sample tmp = { name, 0, f };
while (i<SIZE && fgets(linebuf, SIZE, fin)) {
if (sample_parse(&tmp, linebuf) == NULL) {
break; // Parsing failed for some reason, perhaps an error message?
}
// Now populate arr[i] with right-sized memory allocations
arr[i].name = strdup(tmp.name); // ToDo: add NULL check
arr[i].list_len = tmp.list_len;
size_t f_size = sizeof *(tmp.value_list) * tmp.list_len;
arr[i].value_list = malloc(f_size); // ToDo: add NULL check
memcpy(arr[i].value_list, tmp.value_list, f_size);
i++;
}
so maybe I can just separate the name with a semicolon and the rest spaces?
Yes. Also allow other white-spaces too.
if I declared my array of pointers incorrectly.
Code does not have an array of pointers anywhere.
Recommend using size_t for array size type.
typedef struct sample {
char* name;
// int list_len;
size_t list_len;
float* value_list;
} sample;
Some untested code for parsing. Parse the line with strtok(). Further parse the number tokens with strtof().
#define sample_NAME_DELIMITER ":"
#define sample_NUMBER_DELIMITER " \n\t\r"
// parse for a name and then 0 or more numbers
static sample *sample_parse(sample *dest, char *linebuf) {
char *s = strtok(linebuf, sample_NAME_DELIMITER);
if (s == NULL) {
return NULL; // no name - TBD on if this is allowed
}
strcpy(dest->name, s);
size_t i = 0;
while ((s = strtok(NULL, sample_NUMBER_DELIMITER)) != NULL) {
char *endptr;
dest->value_list[i] = strtof(s, &endptr);
if (s == endptr || *endptr) {
// conversion failed or extra junk
break;
}
i++;
}
dest->list_len = i;
return dest;
}
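Since the question asked for an array of pointers to structs, the following sketch shows one way to build each element on the heap and store its address. It is an illustration under assumptions, not the answer's code: sample_from_line and sample_free are hypothetical names, the name is assumed to be terminated by ":" as in the delimiter macro above, and the value array grows with realloc instead of using a maximal scratch buffer.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct sample {
    char  *name;
    size_t list_len;
    float *value_list;
} sample;

/* Frees a sample created by sample_from_line(); NULL is allowed. */
void sample_free(sample *s)
{
    if (s != NULL) {
        free(s->name);
        free(s->value_list);
        free(s);
    }
}

/* Builds a heap-allocated sample from a line such as "name: 1.5 2 3.25\n".
   Returns NULL on a missing name, a malformed number, or allocation failure.
   line is modified by strtok. */
sample *sample_from_line(char *line)
{
    sample *s;
    char *tok;
    size_t cap = 0;

    assert(line != NULL);

    tok = strtok(line, ":\n");
    if (tok == NULL)
        return NULL;                        /* no name on this line */

    s = malloc(sizeof *s);
    if (s == NULL)
        return NULL;
    s->list_len = 0;
    s->value_list = NULL;
    s->name = malloc(strlen(tok) + 1);
    if (s->name == NULL) {
        sample_free(s);
        return NULL;
    }
    strcpy(s->name, tok);

    while ((tok = strtok(NULL, " \t\r\n")) != NULL) {
        char *end;
        float v = strtof(tok, &end);
        if (end == tok || *end != '\0') {   /* not a complete number */
            sample_free(s);
            return NULL;
        }
        if (s->list_len == cap) {           /* grow the value array */
            size_t new_cap = cap ? cap * 2 : 4;
            float *p = realloc(s->value_list, new_cap * sizeof *p);
            if (p == NULL) {
                sample_free(s);
                return NULL;
            }
            s->value_list = p;
            cap = new_cap;
        }
        s->value_list[s->list_len++] = v;
    }
    return s;
}
```

With an array declared as sample *arr[SIZE], the read loop would then store arr[i] = sample_from_line(linebuf) for each line and stop or skip when the result is NULL.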

Access the next word/string

I have simple C code that reads a file line by line, tokenizes each line, and prints the current token. My problem is that I want to print the next token if some condition is satisfied. Do you have any idea how to do it? I really need your help for this project. Thank you.
Here is the code:
main(){
FILE *input;
FILE *output;
//char filename[100];
const char *filename = "sample1.txt";
input=fopen(filename,"r");
output=fopen("test.st","w");
char word[1000];
char *token;
int num =0;
char var[100];
fprintf(output,"LEXEME, TOKEN");
while( fgets(word, 1000, input) != NULL ){ //reads a line
token = strtok(word, " \t\n" ); // tokenize the line
while(token!=NULL){ // while line is not equal to null
fprintf(output,"\n");
if (strcmp(token,"SIOL")==0)
fprintf(output,"SIOL, SIOL", token);
else if (strcmp(token,"DEFINE")==0)
fprintf(output,"DEFINE, DEFINE", token);
else if (strcmp(token,"INTEGER")==0){
fprintf(output,"INTEGER, INTEGER");
strcpy(var,token+1);
fprintf(output,"\n%s,Ident",var);
}
else{
printf("%s\n", token);
}
token = strtok(NULL, " \t\n" ); //tokenize the word
}
}
fclose(output);
return 0;
}

Continuing from my comment. I'm not sure I completely understand what you need, but if you have the string:
"The quick brown fox";
And, you want to tokenize the string, printing the next word, only if a condition concerning the current word is met, then you need to adjust your thinking just a bit. In your example, you want to print the next word "quick", only if the current word is "The".
The adjustment in thinking is how you look at the test. Instead of thinking about printing the next word if the current matches some condition, you need to save the last word, and only print the current if the last word matches some condition -- "The" in your example.
To handle that situation, you can make use of a fixed-size character array of at least 47 characters (the longest word in Merriam-Websters Unabridged Dictionary is 46-character). I'll use 48 in the example below. You may be tempted to just save a pointer to the last word. Tokens returned by strtok point into the buffer being tokenized, so such a pointer is only valid until that buffer is overwritten, which in your program happens on the next fgets call -- so make a copy of the word.
Putting the pieces together, you could do something like the following. It saves the prior token in last and then compares the current word to the last and prints the current word if last == "The":
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define MAXW 48
int main (void) {
char str[] = "The quick brown fox";
char last[MAXW] = {0};
char *p;
for (p = strtok (str, " "); p; p = strtok (NULL, " "))
{
if (*last && strcmp (last, "The") == 0)
printf (" '%s'\n", p);
strncpy (last, p, MAXW - 1); /* last[MAXW-1] stays 0, so last is always terminated */
}
return 0;
}
Output
$ ./bin/str_chk_last
'quick'
Let me know if you have any questions.
Test Explanation
As written in the comment *last is simply shorthand for last[0]. So the first part of the test, *last is just testing if ((last[0] != 0) && ... Since last was initially declared and initialized:
char last[MAXW] = {0};
All chars in last are 0 for the first pass through the loop. By including the check last[0] != 0, that just causes the printf to be skipped the first time the for loop executes. The longhand for the test would look like:
if ((last[0] != 0) && strcmp (last, "The") == 0)
printf (" '%s'\n", p);
Which in pseudo code just says:
if (NOT first iteration && last == "The")
printf (" '%s'\n", p);
Let me know if that doesn't make sense.

It is easy to achieve with the strtok function. Note that if you pass a null pointer as the first argument, the function continues scanning the same string from where the previous successful call ended. So if you need the next token, just call
char* token = strtok(NULL, delimiters);
See the small example below
#include <stdio.h>
#include <string.h>
int main(void)
{
char str[] = "The quick brown fox";
// split str by space
char* token = strtok(str, " ");
// if a token is found
if(token != NULL) {
// print current token
printf("%s\n", token);
// if token is "The"
if(strcmp(token, "The") == 0) {
// print next token, if there is one
char* next = strtok(NULL, " ");
if(next != NULL) {
printf("%s\n", next);
}
}
}
return 0;
}
The output will be
The
quick
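Applied to the original question, where the identifier follows the INTEGER keyword, the same idea can be wrapped in a small function. This is only a sketch: token_after is a hypothetical name, and the delimiter set is the one used in the question's code.

```c
#include <assert.h>
#include <string.h>

/* Returns the token that follows the first occurrence of keyword in line,
   or NULL if keyword is absent or is the last token. line is modified. */
char *token_after(char *line, const char *keyword)
{
    static const char delims[] = " \t\n";
    char *tok;

    assert(line != NULL && keyword != NULL);

    for (tok = strtok(line, delims); tok != NULL; tok = strtok(NULL, delims)) {
        if (strcmp(tok, keyword) == 0)
            return strtok(NULL, delims);  /* may be NULL: caller must check */
    }
    return NULL;
}
```

The caller has to test the result before printing it, because a keyword at the end of a line has no following token.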

How would I use strtok to compare word by word

I've been reading up on strtok and thought it would be the best way for me to compare two files word by word. So far I can't really figure out how I would do it, though.
Here is my function that performs it:
int wordcmp(FILE *fp1, FILE *fp2)
{
char *s1;
char *s2;
char *tok;
char *tok2;
char line[BUFSIZE];
char line2[BUFSIZE];
char comp1[BUFSIZE];
char comp2[BUFSIZE];
char temp[BUFSIZE];
int word = 1;
size_t i = 0;
while((s1 = fgets(line,BUFSIZE, fp1)) && (s2 = fgets(line2,BUFSIZE, fp2)))
{
;
}
tok = strtok(line, " ");
tok2 = strtok(line, " ");
while(tok != NULL)
{
tok = strtok (NULL, " ");
}
return 0;
}
Don't mind the unused variables; I've been at this for 3 hours and have tried every way I can think of to compare the values of the first and second strtok. I would also like to know how to check which file reaches EOF first.
When I tried
if(s1 == EOF && s2 != EOF)
{
return -1;
}
It returns -1 even when the files are the same! Is it because, in order to reach the if statement outside of the loop, both files must have reached EOF, which makes the program always go into this if statement?
Thanks in advance!

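If you want to check whether the files are the same, try the following (s1 and s2 must be declared as int here, because fgetc() returns an int so that EOF can be told apart from every valid character):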
do {
s1 = fgetc(fp1);
s2 = fgetc(fp2);
if (s1 == s2) {
if (s1 == EOF) {
return 1; // RETURN TRUE
}
continue;
}
else {
return -1; // RETURN FALSE
}
} while (1);
Good Luck :)

When you use strtok() you typically use code like this:
tok = strtok(line, " ");
while (NULL != tok)
{
tok = strtok(NULL, " ");
}
The NULL in the call in the loop tells strtok to continue from after the previously found token until it finds the null terminating character in the value you originally passed (line) or until there are no more tokens. The current pointer is stored in the run time library, and once strtok() returns NULL to indicate no more tokens any more calls to strtok() using NULL as the first parameter (to continue) will result in NULL. You need to call it with another value (e.g. another call to strtok(line, " ")) to get it to start again.
What this means is that to use strtok on two different strings at the same time you need to manually update the string position and pass in a modified value on each call.
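/* Remember where each string ends before strtok starts modifying it */
char *end = line + strlen(line);
char *end2 = line2 + strlen(line2);
tok = strtok(line, " ");
tok2 = strtok(line2, " ");
while (NULL != tok && NULL != tok2)
{
/* Do stuff with tok and tok2 here */
if (strcmp(tok, tok2) != 0) { /* the words differ */ }
/* Update strtok pointers: skip the current token and the '\0' that
replaced its delimiter, but never step past the end of the string */
tok += strlen(tok);
if (tok < end) tok++;
tok2 += strlen(tok2);
if (tok2 < end2) tok2++;
/* Get next token */
tok = strtok(tok, " ");
tok2 = strtok(tok2, " ");
}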
You'll still need to add logic for determining whether lines are different - you've not said whether the files are equivalent if a line break occurs at different position but the words surrounding it are the same. I assume it should be, given your description, but it makes the logic more awkward as you only need to perform the initial fgets() and strtok() for a file if you don't already have a token. You also need to look at how files are read in. Currently your first while loop just reads lines until the end of the file without processing them.
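The manual pointer bookkeeping can be avoided with strtok_r(), which keeps its position in a caller-supplied pointer and so can tokenize two strings side by side. The sketch below assumes a POSIX system (MSVC offers strtok_s with a similar shape instead); same_words is a hypothetical name, and it compares only two lines that have already been read, not whole files.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r; assumes a POSIX system */
#include <assert.h>
#include <string.h>

/* Returns 1 if both lines contain the same words in the same order
   (ignoring the amount of whitespace), 0 otherwise. Both are modified. */
int same_words(char *line1, char *line2)
{
    static const char delims[] = " \t\n";
    char *save1, *save2;              /* independent tokenizer states */
    char *tok1, *tok2;

    assert(line1 != NULL && line2 != NULL);

    tok1 = strtok_r(line1, delims, &save1);
    tok2 = strtok_r(line2, delims, &save2);
    while (tok1 != NULL && tok2 != NULL) {
        if (strcmp(tok1, tok2) != 0)
            return 0;                 /* first differing word */
        tok1 = strtok_r(NULL, delims, &save1);
        tok2 = strtok_r(NULL, delims, &save2);
    }
    /* Equal only if both lines ran out of words at the same time. */
    return tok1 == NULL && tok2 == NULL;
}
```

Whichever token is still non-NULL when the loop ends tells you which line had words left over, which is the per-line analogue of checking which file reaches EOF first.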
