I have this code in my program:
char* tok = NULL;
char move[100];
if (fgets(move, 100, stdin) != NULL)
{
/* then split into tokens using strtok */
tok = strtok(move, " ");
while (tok != NULL)
{
printf("Element: %s\n", tok);
tok = strtok(NULL, " ");
}
}
I have tried adding printf statements before and after fgets, and the one before gets printed, but the one after does not.
I cannot see why this fgets call is causing a segmentation failure.
If someone has any idea, I would much appreciate it.
Thanks
Corey
The strtok runtime function works like this:
The first time you call strtok, you provide the string that you want to tokenize:
char s[] = "this is a string";
In the above string, a space is a good delimiter between words, so let's use that:
char* p = strtok(s, " ");
What happens now is that 's' is searched until a space character is found. That space is overwritten with '\0', the first token ('this') is returned, and p points to that token.
To get the next token from the same string, pass NULL as the first argument; strtok keeps a static pointer to the position where it stopped in the string you passed previously:
p = strtok(NULL," ");
p now points to 'is',
and so on until no more spaces are found; the remainder ('string') is then returned as the last token, and the call after that returns NULL.
More conveniently, you could write it like this to print out all tokens:
for (char *p = strtok(s," "); p != NULL; p = strtok(NULL, " "))
{
puts(p);
}
EDITED HERE:
If you want to store the values returned from strtok, copy each token to another buffer, e.g. with strdup(p). strtok returns pointers into the original string, which it modifies as it goes, so the tokens are lost as soon as that buffer is reused or goes out of scope.
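As a minimal sketch of the copying advice above: split_copy is a hypothetical helper name, and malloc plus memcpy stands in for strdup so the example does not depend on POSIX. It stores an independent copy of every token, so the copies stay valid after the source buffer is overwritten.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: splits `text` on spaces and stores a private copy of
 * each token in `out` (at most `max` entries). Returns the number stored.
 * The caller must free() every stored copy. */
size_t split_copy(char *text, char **out, size_t max)
{
    size_t n = 0;
    for (char *p = strtok(text, " "); p != NULL && n < max; p = strtok(NULL, " ")) {
        size_t len = strlen(p) + 1;      /* include the terminating '\0' */
        char *copy = malloc(len);        /* same effect as strdup(p) */
        if (copy == NULL)
            break;                       /* out of memory: stop early */
        memcpy(copy, p, len);
        out[n++] = copy;
    }
    return n;
}
```

A caller would pass a writable buffer such as char s[] = "this is a string"; and later free each entry of the output array.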
I am getting this error:
Error in `./sorter': double free or corruption (!prev): 0x0000000000685010
and then a bunch of numbers which is the memory map.
My program reads a CSV file of movies and their attributes from stdin and tokenizes it. The titles of movies with commas in them are surrounded by quotes, so I split each line into 3 tokens and tokenize the front and back tokens again using the comma as the delimiter. I free all my mallocs at the end of the code, but I still get this error. The CSV is scanned until the end, but then I get the error message. If I don't free the mallocs at all I don't get an error message, but I highly doubt that is right. This is my main():
char* CSV = (char*)malloc(sizeof(char)*500);
char* fronttoken = (char*)malloc(sizeof(char)*500);
char* token = (char*)malloc(sizeof(char)*500);
char* backtoken = (char*)malloc(sizeof(char)*500);
char* title = (char*)malloc(sizeof(char)*100);
while(fgets(CSV, sizeof(CSV)*500,stdin))
{
fronttoken = strtok(CSV, "\""); //store token until first quote, if no quote, store whole line
title = strtok(NULL,"\""); //store token after first quote until 2nd quote
if(title != NULL) //if quotes in line, consume comma meant to delim title
{
token = strtok(NULL, ","); //eat comma
}
backtoken = strtok(NULL,"\n"); //tokenize from second quote to \n character (remainder of line)
printf("Front : %s\nTitle: %s\nBack: %s\n", fronttoken, title, backtoken); //temp print statement to see front,back,title components
token = strtok(fronttoken, ","); //tokenizing front using comma delim
while (token != NULL)
{
printf("%s\n", token);
token = strtok(NULL, ",");
}
if (title != NULL) //print if there is a title with comma
{
printf("%s\n",title);
}
token = strtok(backtoken,","); //tokenizing back using comma delim
while (token != NULL)
{
printf("%s\n", token);
token = strtok(NULL, ",");
}
}
free(CSV);
free(token);
free(fronttoken);
free(backtoken);
free(title);
return 0;
Focus here:
char* title = (char*)malloc(sizeof(char)*100);
title = strtok(NULL,"\"");
You dynamically allocate memory that title points to.
You then assign the return value of strtok to title, losing the only reference to the memory allocated with malloc()! This means that you will definitely have a memory leak, since you will never be able to de-allocate that memory.
The reference's strtok() example is very informative:
/* strtok example */
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
As a result, there is no need to allocate memory for what strtok() returns; doing so is actually harmful, as I explained above.
Back to your code:
free(token);
does nothing, since token is NULL at that point (the last while loop only ends once strtok() has returned NULL).
free(title); is harmless only when the last line contained no quotes, because title is then NULL; otherwise title points into the middle of CSV and must not be passed to free().
Furthermore, fronttoken and backtoken also result in memory leaks, since they are assigned the return value of strtok() after malloc() has been called. Their free() is problematic too, since they point into the memory originally allocated for CSV rather than to memory that malloc() returned for them.
So, when free(fronttoken); or free(backtoken); is called after free(CSV);, a double free or memory corruption occurs.
Moreover, change this:
while(fgets(CSV, sizeof(CSV)*500,stdin))
to this:
while(fgets(CSV, sizeof(*CSV)*500,stdin))
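since sizeof(CSV) is the size of the pointer itself (typically 4 or 8 bytes), so sizeof(CSV)*500 tells fgets that the buffer is far larger than the 500 bytes you allocated. sizeof(*CSV) is the size of what CSV points to (one char), which gives the 500 bytes you actually allocated.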
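Putting these points together, here is a rough sketch of the ownership pattern, not a fix for the whole program: it ignores the quoted-title handling. count_fields and LINE_BUF_LEN are hypothetical names; the 500 is taken from the question. Only the line buffer is allocated, the token pointer merely points into it, and free() is called once on the pointer malloc() returned.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { LINE_BUF_LEN = 500 };   /* assumed buffer size, taken from the question */

/* Hypothetical helper: reads lines from `in`, splits each one on commas and
 * returns the total number of fields seen, or -1 if allocation fails. */
long count_fields(FILE *in)
{
    char *line = malloc(LINE_BUF_LEN);   /* the only allocation */
    long fields = 0;

    if (line == NULL)
        return -1;
    while (fgets(line, LINE_BUF_LEN, in) != NULL) {
        /* token points INTO line; it owns no memory of its own */
        for (char *token = strtok(line, ",\n"); token != NULL;
             token = strtok(NULL, ",\n"))
            fields++;
    }
    free(line);   /* free exactly what malloc returned, exactly once */
    return fields;
}
```

Calling count_fields(stdin) would process the same input as the original loop.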
Here is a weird problem:
token = strtok(NULL, s);
printf(" %s\n", token); // these two lines can read the token and print
However!
token = strtok(NULL, s);
printf("%s\n", token); // these two lines give me a segmentation fault
I don't know what happened; I only added a space before %s\n, and now I can see the value of token.
My code:
int main() {
FILE *bi;
struct _record buffer;
const char s[2] = ",";
char str[1000];
const char *token;
bi = fopen(DATABASENAME, "wb+");
/*get strings from input, and divide them into separate structs*/
while(fgets(str, sizeof(str), stdin)!= NULL) {
printf("%s\n", str); // can print string line by line
token = strtok(str, s);
strcpy(buffer.id, token);
printf("%s\n", buffer.id); //can print the value in the struct
while(token != NULL){
token = strtok(NULL, s);
printf("%s\n", token); // problem starts here
/*strcpy(buffer.lname, token);
printf("%s\n", buffer.lname); // cant do anything with token */
}}
fclose(bi);
return 1;}
Here is an example of a string I read from stdin, followed by the output after parsing (I just tried to strtok the first two elements to see if it works):
<15322101,MOZNETT,JOSE,n/a,n/a,2/23/1943,MALE,824-75-8088,42 SMITH AVENUE,n/a,11706,n/a,n/a,BAYSHORE,NY,518-215-5848,n/a,n/a,n/a
<
< 15322101
< MOZNETT
In the version without the leading space, your compiler transforms printf("%s\n", token) into puts(token), and puts() does not accept null pointers, because internally it calls strlen() to determine the length of the string.
In the version with a space in front of the format specifier, the compiler cannot substitute puts() without joining the two strings together, so it calls the actual printf() function, whose implementation happens to tolerate a NULL pointer (glibc, for example, prints "(null)"). That is why that version appears to work.
Your problem reduces to the following question: What is the behavior of printing NULL with printf's %s specifier?
In short, passing NULL as the argument for printf's %s is undefined behavior, so you need to check for NULL, as suggested by @kninnug.
You need to change your printf as follows:
token = strtok(NULL, s);
if (token != NULL) printf("%s\n", token);
Or else
printf ("%s\n", token == NULL ? "" : token);
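Another way to avoid the problem is to restructure the loop so that the pointer strtok just returned is tested before it is used. The following is only a sketch; print_tokens is a hypothetical helper, not part of the original program.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: prints every token of `str` on its own line and
 * returns how many tokens were printed. `str` is modified by strtok. */
int print_tokens(char *str, const char *delims)
{
    int count = 0;

    /* The loop condition checks the value strtok has just returned, so the
     * body never runs with token == NULL. */
    for (char *token = strtok(str, delims); token != NULL;
         token = strtok(NULL, delims)) {
        printf("%s\n", token);
        count++;
    }
    return count;
}
```

In the original program this would replace the inner while loop, called as print_tokens(str, s).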
So I'm getting a file with strings, and I want to tokenize each string whenever I come to whitespace or a newline. I am able to separate the tokens using the delimiter strings, but I'm not able to copy them into an array.
int lexer(FILE* file){
char line[50];
char* delim;
int i = 0;
int* intptr = &i;
while(fgets(line,sizeof(line),file)){
printf("%s\n", line);
if(is_empty(line) == 1)
continue;
delim = strtok(line," ");
if(delim == NULL)
printf("%s\n", "ERROR");
while(delim != NULL){
if(delim[0] == '\n'){
//rintf("%s\n", "olala");
break;
}
tokenArray[*intptr] = delim;
printf("Token IN array: %s\n", tokenArray[*intptr]);
*intptr = *intptr + 1;
delim = strtok(NULL, " ");
}
if i run this i get the output :
Token IN array: 012
Token IN array: 23ddd
Token IN array: vs32
Token IN array: ,344
Token IN array: 0sdf
which is correct according to my textfile, but when i try to reprint the array at a later time in the same function and out
*intptr = *intptr + 1;
delim = strtok(NULL, " ");
}
}
printf("%s\n", tokenArray[3]);
fclose(file);
return 0;
I don't get any output. I tried writing all the contents of the array to a txt file and got gibberish. I don't know what to do, please help.
First, your pointer to i is useless. Why not use i directly?
I'll assume that from now on.
Then, the real problem: you have to allocate and copy the strings that strtok returns each time, because strtok does not allocate the tokens for you; it just returns pointers into line. That buffer is overwritten by every fgets call, so the pointers you stored no longer refer to the tokens you saw when you printed them.
Something like this would help:
tokenArray[i] = strdup(delim);
(instead of tokenArray[*intptr] = delim;). Note that I have replaced the index with i; just do i++ afterwards, and free() each copy when you are done with it.
BTW I wouldn't recommend using strtok for anything other than quick hacks. This function has a memory, so if you call several functions that use it in different parts of your program, they can conflict (I made that mistake a long time ago). Check the manual for strtok_r in that case (r for reentrant).
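To illustrate the strtok_r remark, here is a sketch assuming a POSIX system (strtok_r is not part of ISO C); count_words is a hypothetical helper. Two scans are nested, each with its own saved position, which plain strtok cannot do because it has only one hidden state.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r (POSIX, not ISO C) */
#include <string.h>

/* Hypothetical helper: counts the words in a buffer that holds several lines.
 * Lines are split on '\n' and each line is split on spaces. */
int count_words(char *text)
{
    int words = 0;
    char *line_state, *word_state;   /* one saved position per scan */

    for (char *line = strtok_r(text, "\n", &line_state); line != NULL;
         line = strtok_r(NULL, "\n", &line_state)) {
        for (char *w = strtok_r(line, " ", &word_state); w != NULL;
             w = strtok_r(NULL, " ", &word_state))
            words++;
    }
    return words;
}
```

With plain strtok, the inner scan would overwrite the outer scan's position and the outer loop would stop after the first line.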
tokenArray[*intptr] = delim;
In this line, delim is a pointer into a char array whose content keeps changing as the while loop runs. So in your case, the content that delim points to should be copied and the copy stored in tokenArray[*intptr], that is:
tokenArray[*intptr] = strdup(delim);
I've been reading up on strtok and thought it would be the best way for me to compare two files word by word. So far I can't really figure out how I would do it, though.
Here is my function that performs it:
int wordcmp(FILE *fp1, FILE *fp2)
{
char *s1;
char *s2;
char *tok;
char *tok2;
char line[BUFSIZE];
char line2[BUFSIZE];
char comp1[BUFSIZE];
char comp2[BUFSIZE];
char temp[BUFSIZE];
int word = 1;
size_t i = 0;
while((s1 = fgets(line,BUFSIZE, fp1)) && (s2 = fgets(line2,BUFSIZE, fp2)))
{
;
}
tok = strtok(line, " ");
tok2 = strtok(line, " ");
while(tok != NULL)
{
tok = strtok (NULL, " ");
}
return 0;
}
Don't mind the unused variables; I've been at this for 3 hours and have tried every way I can think of to compare the values of the first and second strtok. Also, I would like to know how I would check which file reaches EOF first.
When I tried
if(s1 == EOF && s2 != EOF)
{
return -1;
}
It returns -1 even when the files are the same! Is it because in order for it to reach the if statement outside of the loop both files have reached EOF which makes the program always go to this if statement?
Thanks in advance!
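If you want to check whether the files are the same, try comparing them character by character. fgetc returns an int, so s1 and s2 must be declared as int here, not as the char * used in your function:
int s1, s2; /* int, so that EOF can be told apart from any character */
do {
    s1 = fgetc(fp1);
    s2 = fgetc(fp2);
    if (s1 == s2) {
        if (s1 == EOF) {
            return 1; /* both files ended together: identical */
        }
        continue;
    }
    else {
        return -1; /* a character differs, or one file ended first */
    }
} while (1);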
Good Luck :)
When you use strtok() you typically use code like this:
tok = strtok(line, " ");
while (NULL != tok)
{
tok = strtok(NULL, " ");
}
The NULL in the call in the loop tells strtok to continue from after the previously found token until it finds the null terminating character in the value you originally passed (line) or until there are no more tokens. The current pointer is stored in the run time library, and once strtok() returns NULL to indicate no more tokens any more calls to strtok() using NULL as the first parameter (to continue) will result in NULL. You need to call it with another value (e.g. another call to strtok(line, " ")) to get it to start again.
What this means is that to use strtok on two different strings at the same time you need to manually update the string position and pass in a modified value on each call.
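char *end = line + strlen(line);    /* remember where each line ends */
char *end2 = line2 + strlen(line2);
tok = strtok(line, " ");
tok2 = strtok(line2, " ");
while (NULL != tok && NULL != tok2)
{
    /* Do stuff with tok and tok2 here */
    if (strcmp(tok, tok2) != 0) { /* the words differ */ }
    /* Update strtok pointers: skip the token and the '\0' strtok wrote, */
    /* but never step past the end of the original line */
    tok += strlen(tok);
    if (tok < end) tok++;
    tok2 += strlen(tok2);
    if (tok2 < end2) tok2++;
    /* Get next token */
    tok = strtok(tok, " ");
    tok2 = strtok(tok2, " ");
}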
You'll still need to add logic for determining whether lines are different - you've not said whether the files are equivalent if a line break occurs at different position but the words surrounding it are the same. I assume it should be, given your description, but it makes the logic more awkward as you only need to perform the initial fgets() and strtok() for a file if you don't already have a token. You also need to look at how files are read in. Currently your first while loop just reads lines until the end of the file without processing them.
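Where POSIX is available, strtok_r is an alternative to updating the positions by hand, since each string gets its own saved state. The sketch below makes that assumption; same_words is a hypothetical helper that compares two lines word by word, treating runs of spaces and the trailing newline as separators.

```c
#define _POSIX_C_SOURCE 200809L   /* for strtok_r (POSIX, not ISO C) */
#include <string.h>

/* Hypothetical helper: returns 1 if both lines contain the same words in the
 * same order, 0 otherwise. Both buffers are modified. */
int same_words(char *line, char *line2)
{
    char *save1, *save2;   /* independent positions for the two lines */
    char *tok = strtok_r(line, " \n", &save1);
    char *tok2 = strtok_r(line2, " \n", &save2);

    while (tok != NULL && tok2 != NULL) {
        if (strcmp(tok, tok2) != 0)
            return 0;                        /* words differ */
        tok = strtok_r(NULL, " \n", &save1);
        tok2 = strtok_r(NULL, " \n", &save2);
    }
    /* equal only if both lines ran out of words at the same time */
    return tok == NULL && tok2 == NULL;
}
```

This handles one pair of lines only; the decision about line breaks falling at different positions still has to be made by the calling code.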
I'm using strtok() in C to parse a CSV string. First I tokenize it just to find out how many tokens there are, so I can allocate an array of the correct size. Then I go through it again using the same variable I used for the first tokenization. On that second pass, though, strtok(NULL, ",") returns NULL even though there are still more tokens to parse. Can somebody tell me what I'm doing wrong?
char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
count++;
tok = strtok(NULL, ",");
}
//allocate array
tok = strtok(buffer, ",");
while(tok != NULL) {
//do other stuff
tok = strtok(NULL, ",");
}
So on that second while loop it always ends after the first token is found even though there are more tokens. Does anybody know what I'm doing wrong?
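strtok() modifies the string it operates on, replacing delimiter characters with nulls. After the first pass, buffer therefore ends at the first '\0', and a second strtok(buffer, ",") sees only the first token. So if you want to use it more than once, you'll have to make a copy.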
There's not necessarily a need to make a copy - strtok() does modify the string it's tokenizing, but in most cases that simply means the string is already tokenized if you want to deal with the tokens again.
Here's your program modified a bit to process the tokens after your first pass:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i;
char buffer[] = "some, string with , tokens";
char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
count++;
tok = strtok(NULL, ",");
}
// walk through the tokenized buffer again
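tok = buffer + strspn(buffer, ","); // skip any leading delimiters
for (i = 0; i < count; ++i) {
printf( "token %d: \"%s\"\n", i+1, tok);
if (i + 1 < count) { // only advance while another token remains, so we never read past the end of buffer
tok += strlen(tok) + 1; // get the next token by skipping past the '\0'
tok += strspn(tok, ","); // then skipping any starting delimiters
}
}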
return 0;
}
Note that this is unfortunately trickier than I first posted - the call to strspn() needs to be performed after skipping the '\0' placed by strtok() since strtok() will skip any leading delimiter characters for the token it returns (without replacing the delimiter character in the source).
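Use strsep, which actually updates your pointer. Instead of passing NULL on every call after the first, as with strtok, you pass the address of your string pointer each time. The only catch with strsep is that if the string was allocated on the heap, you must keep a separate pointer to the beginning so you can free it later. Note that strsep is a BSD/glibc extension rather than standard C, and unlike strtok it returns an empty token for each pair of adjacent delimiters.
char *strsep(char **stringp, const char *delim);
char *string = buffer; /* strsep moves string past each token it returns */
char *token;
token = strsep(&string, ",");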
strtok is used in your normal intro to C course - use strsep, it's much better. :-)
No getting confused on "oh shit - i have to pass in NULL still cuz strtok screwed up my positioning."
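For completeness, a small sketch of a full strsep loop, assuming a platform that provides it (glibc or a BSD libc); count_fields_sep is a hypothetical helper. It also shows one behavioral difference from strtok: empty fields between adjacent delimiters are counted.

```c
#define _DEFAULT_SOURCE   /* exposes strsep on glibc; BSD libcs declare it by default */
#include <string.h>

/* Hypothetical helper: counts comma-separated fields, including empty ones.
 * `record` is modified: each delimiter is overwritten with '\0'. */
int count_fields_sep(char *record)
{
    int n = 0;
    char *cursor = record;   /* strsep advances this pointer itself */

    /* strsep sets cursor to NULL after the last field, and returns NULL on
     * the call after that. */
    while (strsep(&cursor, ",") != NULL)
        n++;
    return n;
}
```

If record came from malloc(), the caller still frees record, not cursor, which is NULL once the loop finishes.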