Copying content of strtok token in C - c

Need to separate a string and then do another separation.
char *token = strtok(str, ",");
while(token){
char *current_string = malloc(sizeof(char) * strlen(token));
strcpy(current_string, token);
char *tk = strtok(current_string, ":"); // KEY
printf("key: %s ", tk);
tk = strtok(0, ":"); // VALUE
printf("value: %s\r\n", tk);
printf("%s\n", token);
token = strtok(0, ",");
}
printf("Done\n");
Trying to copy the content of token, but doing so messes with what remains in the token variable. It only processes one line instead of the three it should. I suspect the issue is with the strcpy(current_string, token) but unsure how I should go about it.

The strtok function uses an internal static buffer to keep track of where it left off. This means you can't use it to go back and forth parsing two different strings.
In your specific case, on this call:
token = strtok(0, ",");
The internal buffer is still pointing to a location inside of current_string, so attempting to go back to token won't work.
What you need is strtok_r. This version takes an additional parameter that keeps track of the current state. That way, you can parse two or more strings in an interleaved fashion by using a different state pointer for each one:
char *state1, *state2;
char *token = strtok_r(str, ",", &state1);
while(token){
char *current_string = strdup(token);
char *tk = strtok_r(current_string, ":", &state2); // KEY
printf("key: %s ", tk);
tk = strtok_r(NULL, ":", &state2); // VALUE
printf("value: %s\r\n", tk);
printf("%s\n", token);
free(current_string);
token = strtok_r(NULL, ",", &state1);
}
printf("Done\n");
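Note that NULL is passed to the later strtok_r calls instead of 0. A literal 0 is also a valid null pointer constant in C, but NULL makes the intent clearer. The malloc/strcpy pair was replaced with a call to strdup, which allocates strlen(token) + 1 bytes and copies the string (the original malloc was one byte short, leaving no room for the terminating '\0'), and a call to free was added to prevent a memory leak.
strtok_r is specified by POSIX, so it is available on UNIX/Linux systems. On Windows, use Microsoft's strtok_s, which takes the same three arguments and works the same way.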

Related

Segmentation fault (core dumped) c

Here is a weird problem:
token = strtok(NULL, s);
printf(" %s\n", token); // these two lines can read the token and print
However!
token = strtok(NULL, s);
printf("%s\n", token); // these two lines give me a segmentation fault
I don't know what happened: I just added a space before %s\n, and now I can see the value of token.
my code:
int main() {
FILE *bi;
struct _record buffer;
const char s[2] = ",";
char str[1000];
const char *token;
bi = fopen(DATABASENAME, "wb+");
/* get strings from input and divide them into separate structs */
while(fgets(str, sizeof(str), stdin)!= NULL) {
printf("%s\n", str); // can print string line by line
token = strtok(str, s);
strcpy(buffer.id, token);
printf("%s\n", buffer.id); //can print the value in the struct
while(token != NULL){
token = strtok(NULL, s);
printf("%s\n", token); // problem starts here
/*strcpy(buffer.lname, token);
printf("%s\n", buffer.lname); // cant do anything with token */
}}
fclose(bi);
return 1;}
Here is an example of the string I read from stdin, followed by the output after parsing (I only tried to strtok the first two elements to see if it works):
<15322101,MOZNETT,JOSE,n/a,n/a,2/23/1943,MALE,824-75-8088,42 SMITH AVENUE,n/a,11706,n/a,n/a,BAYSHORE,NY,518-215-5848,n/a,n/a,n/a
<
< 15322101
< MOZNETT
In the version without the leading space, printf("%s\n", token), your compiler transforms the printf() call into puts(). puts() does not accept null pointers, because internally it calls strlen() to determine the length of the string.
In the version with a space in front of the format specifier, the compiler cannot call puts() without joining the two strings together, so it invokes the actual printf() function. That implementation happens to tolerate a NULL argument (glibc, for example, prints "(null)"), so your code appears to work.
Your problem reduces to the following question: What is the behavior of printing NULL with printf's %s specifier?
In short, passing NULL as the argument for printf("%s") is undefined behavior, so you need to check for NULL as suggested by @kninnug.
You need to change your printf as follows:
token = strtok(NULL, s);
if (token != NULL) printf("%s\n", token);
Or else
printf ("%s\n", token == NULL ? "" : token);

Splitting string into strings using strtok() not working

I have a problem with regards to separating the contents of a string passed to a function. The function is called with a string like this:
ADD:Nathaniel:50
Where ADD will be the protocol name, Nathaniel will be the key, and 50 will be the value, all separated with a :.
My code looks like this:
bool add_to_list(char* buffer){
char key[40];
char value[40];
int* token;
char buffer_copy[1024];
const char delim[2] = ":";
strcpy(buffer_copy, buffer);
token = strtok(NULL, delim);
//strcpy(key, token);
printf("%d",token);
printf("%p",token);
while(token != NULL){
token = strtok (NULL, delim);
}
//strcpy(value, token);
printf("%s", key);
printf("%s", value);
push(key, value);
return true;
}
What I am trying to do is store each key and value in a separate variable, using strtok(). Note that I am trying to store the second and third values (Nathaniel and 50) not the first bit (ADD).
When I run the code, it gives me a segmentation fault, so I am guessing that I am trying to access an invalid memory address rather than a value. I just need to store the second and third bit of the string. Can anyone help please?
EDIT:
I have changed the code to look like this:
bool add_to_list(char* buffer){
char *key, *value, *token;
const char *delim = ":";
token = strtok(buffer, delim);
//printf("%d",token);
printf("%s",token);
key = strtok(NULL, delim);
value = strtok(NULL, delim);
printf("%s", key);
printf("%s", value);
//push(key, value);
return true;
}
But I am still getting the same segmentation fault (core dumped) error
The first call to strtok() needs to provide the string to scan. You only pass NULL on the repeated calls, so that it keeps processing the rest of the same string. So the first call should be:
token = strtok(buffer_copy, delim);
Then when you want to get the key and value, you need to copy them to the arrays (strcpy takes the destination first, then the source):
token = strtok(NULL, delim);
strcpy(key, token);
token = strtok(NULL, delim);
strcpy(value, token);
You don't need a loop, since you just want to extract these two values.
Actually, you don't need to declare key and value as arrays, you could use pointers:
char *key, *value;
Then you can do:
token = strtok(buffer_copy, delim);
key = strtok(NULL, delim);
value = strtok(NULL, delim);
Your main problem is that when you first call strtok, the first parameter should be the string you want to parse, so not:
strcpy(buffer_copy, buffer);
token = strtok(NULL, delim);
but
strcpy(buffer_copy, buffer);
token = strtok(buffer_copy, delim);
Additionally when you detect the tokens in your while loop, you are throwing them away. You want to do something at that point (or simply unroll the loop and call strtok three times).
Also:
const char* delim = ":";
would be a more conventional way of ensuring a NUL-terminated string than:
const char delim[2] = ":";
Also consider using strtok_r instead of strtok, as strtok is not thread-safe (it keeps hidden static state) and horrible. While you are not using threads here (it seems), you might as well get into good practice.

Segmentation fault on line with fgets() - C

I have this code in my program:
char* tok = NULL;
char move[100];
if (fgets(move, 100, stdin) != NULL)
{
/* then split into tokens using strtok */
tok = strtok(move, " ");
while (tok != NULL)
{
printf("Element: %s\n", tok);
tok = strtok(NULL, " ");
}
}
I have tried adding printf statements before and after fgets, and the one before gets printed, but the one after does not.
I cannot see why this fgets call is causing a segmentation failure.
If someone has any idea, I would much appreciate it.
Thanks
Corey
The strtok runtime function works like this.
The first time you call strtok, you provide the string that you want to tokenize:
char s[] = "this is a string";
In the above string, a space is a good delimiter between words, so let's use that:
char* p = strtok(s, " ");
Now 's' is searched until a space character is found. That space is overwritten with '\0', the first token ('this') is returned, and p points to that token (string).
To get the next token and continue with the same string, pass NULL as the first argument, since strtok maintains a static pointer into the string you passed previously:
p = strtok(NULL," ");
p now points to 'is',
and so on until no more spaces can be found; then the rest of the string is returned as the last token, 'string'.
More conveniently, you could write it like this to print out all tokens:
for (char *p = strtok(s," "); p != NULL; p = strtok(NULL, " "))
{
puts(p);
}
EDITED HERE:
If you want to store the values returned from strtok, you need to copy each token to another buffer, e.g. with strdup(p), since the returned pointers point into the original string (which strtok modifies by writing '\0' over delimiters) and stay valid only as long as that buffer is left untouched.

C - strtok give back less data

char line[81] = "$11,$10,1";
token = strtok(line, " \t\v,$");
token = strtok(NULL, ",");
printf("%s\n",token ); // its $10 from the previous strtok
if(strstr (token, "$") != NULL){
token = strtok(NULL, "$");
printf("%s\n",token ); // I want to print 10 but it prints 1.
}
I'm trying to remove one char with strtok. However, as you can see, it gives back only one digit.
If you want 10 as a result then I think what you are looking for is
char line[81] = "$11,$10,1";
token = strtok(line, " \t\v,$");
token = strtok(NULL, ",");
if(*token == '$')
printf("%s\n", token + 1);
else
... do something else ...
After all you have already got your token, there is no use looking for more.
As @Dietrich said, the output looks correct. Let me break it down for you.
char line[81] = "$11,$10,1";
token = strtok(line, " \t\v,$");
This skips the initial "$" (because it's a delimiter) and returns "11".
token = strtok(NULL, ",");
printf("%s\n",token ); // its $10 from the previous strtok
The strtok skips past the "," and returns "$10". It also consumes the following ",": strtok overwrites the delimiter that ends a token with '\0' and resumes the next search just after it.
if(strstr (token, "$") != NULL){
token = strtok(NULL, "$");
The remaining string is non-empty, but has no more "$" delimiters. Thus this strtok will return the whole remainder, that is "1".
Change token = strtok(NULL, ","); to token = strtok(NULL, "$,");
So I'm trying to follow it:
char line[81] = "$11,$10,1";
token = strtok(line, " \t\v,$");
You might expect it to return "\0" while the buffer holds "11,$10,1",
but it returns "11" while the buffer holds "$10,1", since strtok skips leading delimiters and never returns an empty token.
token = strtok(NULL, ",");
returns "$10" while in the buffer there is "1"
printf("%s\n",token ); // its $10 from the previous strtok
token = strtok(NULL, "$");
returns "1" while in the buffer there is "\0", since there's no separator
printf("%s\n",token ); // it prints 1, not 10.

tokenizing a string twice in c with strtok()

I'm using strtok() in C to parse a CSV string. First I tokenize it just to find out how many tokens there are, so I can allocate a string of the correct size. Then I go through it again using the same variable I used the first time. On the second pass, though, strtok(NULL, ",") returns NULL even though there are still more tokens to parse. Can somebody tell me what I'm doing wrong?
char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
count++;
tok = strtok(NULL, ",");
}
//allocate array
tok = strtok(buffer, ",");
while(tok != NULL) {
//do other stuff
tok = strtok(NULL, ",");
}
So on that second while loop it always ends after the first token is found even though there are more tokens. Does anybody know what I'm doing wrong?
strtok() modifies the string it operates on, replacing delimiter characters with nulls. So if you want to use it more than once, you'll have to make a copy.
There's not necessarily a need to make a copy - strtok() does modify the string it's tokenizing, but in most cases that simply means the string is already tokenized if you want to deal with the tokens again.
Here's your program modified a bit to process the tokens after your first pass:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i;
char buffer[] = "some, string with , tokens";
char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
count++;
tok = strtok(NULL, ",");
}
// walk through the tokenized buffer again
tok = buffer;
for (i = 0; i < count; ++i) {
printf( "token %d: \"%s\"\n", i+1, tok);
tok += strlen(tok) + 1; // get the next token by skipping past the '\0'
tok += strspn(tok, ","); // then skipping any starting delimiters
}
return 0;
}
Note that this is unfortunately trickier than I first posted - the call to strspn() needs to be performed after skipping the '\0' placed by strtok() since strtok() will skip any leading delimiter characters for the token it returns (without replacing the delimiter character in the source).
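Use strsep: it actually updates your pointer. With strtok you have to keep passing NULL, whereas with strsep you pass the address of your string pointer on every call. The only catch is that strsep moves that pointer, so if the string was allocated on the heap, keep a separate pointer to the beginning and free it later. Note that strsep is a BSD/glibc extension rather than ISO C, and unlike strtok it returns an empty token for adjacent delimiters.
char *strsep(char **stringp, const char *delim); // prototype
char *string = buffer; // must point at a writable string before the first call
char *token;
token = strsep(&string, ","); // string now points past the delimiter, or is NULL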
strtok is used in your normal intro to C course - use strsep, it's much better. :-)
No getting confused on "oh shit - i have to pass in NULL still cuz strtok screwed up my positioning."
