C - Unexpected Segmentation Fault on strtok(...) - c

I am using strtok(...) from the <string.h> library, and it appears to work fine until the end condition, where it causes a segmentation fault and the program crashes. The API documentation says that strtok(...) returns NULL when there are no more tokens to be found, which I took to mean that you have to catch this NULL in order to terminate any loop that uses strtok(...). What do I need to do to catch this NULL and keep my program from crashing? I assumed the NULL could be used as a terminating condition.
I have prepared an SSCCE so you can observe this behavior. I need strtok(...) to work for a much larger piece of software I am writing, and I am getting exactly the same segmentation fault there. The output at the command line is shown below the code (yes, I know you use <...> to enclose library headers, but I was having difficulty getting this post to display them). I am using gcc version 4.5.3 on Windows 8, and the code shows two different ways I imagined one could catch the NULL in a loop.
#include <stdlib.h>
#include <stdio.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
main(){
char* from = "12.34.56.78";
char * ch = ".";
char * token = strtok(from, ch);
printf("%s\n",token);
while(token != NULL){
token = strtok(NULL, ch);
printf("%s\n", token);
}
printf("Broke out of loop!");
while(strcmp(token, 0) != 0){
printf("%s\n",token);
token = strtok(NULL, ch);
}
}
############ OUTPUT: ############
$ ./test
12
34
56
78
Segmentation fault (core dumped)

strtok modifies its first argument. You are passing it a string from read-only memory, and the segfault occurs when strtok tries to change it. Try changing from:
char* from = "12.34.56.78";
to
char from[] = "12.34.56.78";

You first check whether token is not NULL; once it is NULL, the first while loop ends. You then pass token, which is now NULL, to strcmp together with the constant 0: strcmp(token, 0). strcmp expects two strings, and both arguments here are null pointers, so strcmp tries to read a string at address 0 (NULL), which gives you a segmentation fault.
while(strcmp(token, 0) != 0){
printf("%s\n",token);
token = strtok(NULL, ch);
}
The first loop also needs fixing. Change
char * token = strtok(from, ch);
printf("%s\n",token);
while(token != NULL){
token = strtok(NULL, ch);
printf("%s\n", token);
}
to
char * token = strtok(from, ch);
while(token != NULL){
printf("%s\n", token); /* print only after the NULL check has passed */
token = strtok(NULL, ch);
}

This is a problem:
while(token != NULL){
token = strtok(NULL, ch);
printf("%s\n", token);
}
You're checking for NULL, but then calling strtok again and not checking after that but before printing.
There are other problems with the code, but I suspect this is why it crashes where it does now.

The problem is that even though you terminate the loop when strtok() returns NULL, you try to print the NULL first:
while(token != NULL){
token = strtok(NULL, ch);
printf("%s\n", token); // not good when token is NULL
}
It turns out there are several opportunities in addition to this one for segfaults in this example, as pointed out by other answers.
Here's one way to handle your example tokenization:
char from[] = "12.34.56.78";
char * ch = ".";
char * token = strtok(from, ch);
while (token != NULL){
printf("%s\n", token);
token = strtok(NULL, ch);
}

If the purpose of the code is only to print the elements separated by '.', you only need to change the declaration of from to a char array and check that token is not NULL before printing it:
main(){
char from[] = "12.34.56.78.100.101";
char * ch = ".";
char * token = strtok(from, ch);
//printf("%s\n",token);
while(token != NULL){
printf("%s\n", token);
token = strtok(NULL, ch);
}
}
OUTPUT
./test1
12
34
56
78
100
101

You have both memory access errors and logic errors. I will only address the memory access errors that are causing your program to crash.
strtok modifies its first argument. Since you are passing in a string literal, strtok cannot modify it (string literals are not modifiable, and attempting to write to one is undefined behavior).
A possible fix is to define from as a modifiable char array:
char from[] = "12.34.56.78";
Because strtok modifies the string passed into it, you cannot process that string again in your second while loop: by then token is NULL, so you are essentially passing NULL into the strcmp function. A possible fix is to copy the from array into another buffer each time you wish to run strtok over it.

Related

Segmentation fault (core dumped) c

Here is a weird problem:
token = strtok(NULL, s);
printf(" %s\n", token); // these two lines can read the token and print
However!
token = strtok(NULL, s);
printf("%s\n", token); // these two lines give me a segmentation fault
I don't know what happened: all I did was add a space before %s\n, and now I can see the value of token.
My code:
int main() {
FILE *bi;
struct _record buffer;
const char s[2] = ",";
char str[1000];
const char *token;
bi = fopen(DATABASENAME, "wb+");
/* get strings from input and divide each one into separate struct fields */
while(fgets(str, sizeof(str), stdin)!= NULL) {
printf("%s\n", str); // can print string line by line
token = strtok(str, s);
strcpy(buffer.id, token);
printf("%s\n", buffer.id); //can print the value in the struct
while(token != NULL){
token = strtok(NULL, s);
printf("%s\n", token); // problem starts here
/*strcpy(buffer.lname, token);
printf("%s\n", buffer.lname); // cant do anything with token */
}}
fclose(bi);
return 1;}
Here is an example of a string I read from stdin, followed by the output after parsing (I only tried to strtok the first two elements to see if it works):
<15322101,MOZNETT,JOSE,n/a,n/a,2/23/1943,MALE,824-75-8088,42 SMITH AVENUE,n/a,11706,n/a,n/a,BAYSHORE,NY,518-215-5848,n/a,n/a,n/a
<
< 15322101
< MOZNETT
In the first version, your compiler most likely transforms the printf() call into a puts() call. puts() does not accept null pointers, because internally it calls strlen() to determine the length of the string.
In the second version you added a space in front of the format specifier. The output is then no longer just the string followed by a newline, so the compiler cannot substitute puts() and calls the actual printf() function instead. Many printf() implementations (glibc, for example) print "(null)" for a NULL pointer, so your code appears to work.
Your problem reduces to the following question: What is the behavior of printing NULL with printf's %s specifier?
In short, passing NULL as the argument for %s in printf is undefined behavior, so you need to check for NULL, as suggested by @kninnug.
You need to change your printf as follows:
token = strtok(NULL, s);
if (token != NULL) printf("%s\n", token);
Or else
printf ("%s\n", token == NULL ? "" : token);

Length of string returned by strtok()

I want to find the length of each word in a string. When I use strlen(split) outside the while loop, it's OK. But when I use it inside the loop, I get a segmentation fault. What's the problem?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char string[] = "Hello world!";
char* word = strtok(string, " ");
printf("%d\n", strlen(word));
while(split != NULL) {
word = strtok(NULL, " ");
printf("%d\n", strlen(word ));
}
}
You need to check that strtok didn't return NULL before calling strlen.
From the strtok man page (my emphasis):
Each call to strtok() returns a pointer to a null-terminated string
containing the next token. This string does not include the delimiting
byte. If no more tokens are found, strtok() returns NULL.
while(word != NULL) {
word = strtok(NULL, " ");
if (word != NULL) {
printf("%zu\n", strlen(word)); /* strlen returns size_t, so use %zu rather than %d */
}
}
Note that there was also a typo in your code. The while loop should test word rather than split.

strtok and strncat error

I want to add string "ay" to each word by using both strtok and strncat. But there seemed to be a conflict somewhere that I cannot find. It only gives me the first word "Computeray" for an output. Help?
#include <stdio.h>
#include <string.h>
int main(void)
{
char str[] = "Computer science is hard";
char* Token;
char* work = "ay";
Token = strtok(str, " ");
while (Token != NULL)
{
strncat(Token, work, 2);
printf("%s", Token);
Token = strtok(NULL, " ");
}
return 0;
}
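You're modifying the string (with strncat) while strtok is still working through it, and that's not going to work: appending "ay" to a token writes past the token's terminating null and over the start of the next word, which strtok has not scanned yet. Instead of using strncat, just print the "ay" separately: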
while (Token != NULL)
{
printf("%say ", Token);
Token = strtok(NULL, " ");
}
Even if it were working the way you'd like, you'd be overwriting a bunch of your input along the way. Probably not what you were going for - if you need to build up a whole new string, you should do it into a new buffer, instead of overwriting the input.

Strtok and Strcat conflict

I am trying to work with strtok and strcat but the second printf never shows up. Here is the code:
int i = 0;
char *token[128];
token[i] = strtok(tmp, "/");
printf("%s\n", token[i]);
i++;
while ((token[i] = strtok(NULL, "/")) != NULL) {
strcat(token[0], token[i]);
printf("%s", token[i]);
i++;
}
If my input is 1/2/3/4/5/6 for tmp then the console output would be 13456. The 2 is always missing. Does anyone know how to fix this?
The two is always missing because on the first iteration of your loop you overwrite it with the call to strcat.
On entry to the loop body your buffer contains "1\02\03/4/5/6" (each \0 is a null byte written by strtok); the internal strtok pointer is pointing to "3", and token[1] points to "2".
You then call strcat, which turns the buffer into "12\0\03/4/5/6", so your token[i] pointer is now pointing to "\0". The first printf inside the loop prints nothing.
Subsequent calls are OK because the null characters do not overwrite the input data.
To fix it you should build up your output string into a second buffer, not the one you are parsing.
A working(?) version:
#include <stdio.h>
#include <string.h>
int main(void)
{
int i = 0;
char *token[128];
char tmp[128];
char removed[128] = {0};
strcpy(tmp, "1/2/3/4/5/6");
token[i] = strtok(tmp, "/");
strcat(removed, token[i]);
printf("%s\n", token[i]);
i++;
while ((token[i] = strtok(NULL, "/")) != NULL) {
strcat(removed, token[i]);
printf("%s", token[i]);
i++;
}
return (0);
}
strtok modifies the input string in place and returns pointers to that string. You then take one of those pointers (token[0]) and pass it to another operation (strcat) that writes to that pointer. The writes are clobbering each other.
If you want to concatenate all the tokens, you should allocate a separate char buffer and copy them into that instead of writing into the string being tokenized.

tokenizing a string twice in c with strtok()

I'm using strtok() in C to parse a CSV string. First I tokenize it just to find out how many tokens there are, so I can allocate a string of the correct size. Then I go through it again, using the same variable I used the first time for tokenization. Every time I do it a second time, though, strtok(NULL, ",") returns NULL even though there are still more tokens to parse. Can somebody tell me what I'm doing wrong?
char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
count++;
tok = strtok(NULL, ",");
}
//allocate array
tok = strtok(buffer, ",");
while(tok != NULL) {
//do other stuff
tok = strtok(NULL, ",");
}
So on that second while loop it always ends after the first token is found even though there are more tokens. Does anybody know what I'm doing wrong?
strtok() modifies the string it operates on, replacing delimiter characters with nulls. So if you want to use it more than once, you'll have to make a copy.
There's not necessarily a need to make a copy - strtok() does modify the string it's tokenizing, but in most cases that simply means the string is already tokenized if you want to deal with the tokens again.
Here's your program modified a bit to process the tokens after your first pass:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
int i;
char buffer[] = "some, string with , tokens";
char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
count++;
tok = strtok(NULL, ",");
}
// walk through the tokenized buffer again
tok = buffer;
for (i = 0; i < count; ++i) {
printf( "token %d: \"%s\"\n", i+1, tok);
tok += strlen(tok) + 1; // get the next token by skipping past the '\0'
tok += strspn(tok, ","); // then skipping any starting delimiters
}
return 0;
}
Note that this is unfortunately trickier than I first posted - the call to strspn() needs to be performed after skipping the '\0' placed by strtok() since strtok() will skip any leading delimiter characters for the token it returns (without replacing the delimiter character in the source).
Use strsep: it actually updates your pointer. With strtok you have to keep passing NULL on subsequent calls, whereas with strsep you pass the address of your string pointer every time. The only catch with strsep is that if the string was allocated on the heap, you need to keep a pointer to the beginning so you can free it later.
char *strsep(char **stringp, const char *delim);
char *string;
char *token;
token = strsep(&string, ",");
strtok is used in your normal intro to C course - use strsep, it's much better. :-)
No getting confused on "oh shit - i have to pass in NULL still cuz strtok screwed up my positioning."

Resources