strtok and strncat error - c

I want to add string "ay" to each word by using both strtok and strncat. But there seemed to be a conflict somewhere that I cannot find. It only gives me the first word "Computeray" for an output. Help?
#include <stdio.h>
#include <string.h>
int main(void)
{
    char str[] = "Computer science is hard";
    char* Token;
    char* work = "ay";
    Token = strtok(str, " ");
    while (Token != NULL)
    {
        strncat(Token, work, 2);
        printf("%s", Token);
        Token = strtok(NULL, " ");
    }
    return 0;
}

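You're modifying the string (with strncat) and expecting strtok to still behave properly; that's not going to work. strncat writes "ay" and a new terminator over the '\0' that strtok placed and over the first characters of the next word, so the position strtok saved no longer points at intact input. Instead of using strncat, just print the "ay" separately: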
while (Token != NULL)
{
    printf("%say ", Token);
    Token = strtok(NULL, " ");
}
Even if it were working the way you'd like, you'd be overwriting a bunch of your input along the way. Probably not what you were going for - if you need to build up a whole new string, you should do it into a new buffer, instead of overwriting the input.
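If you do need the result as a string rather than printed output, building it in a separate buffer could look roughly like the sketch below. The helper name append_suffix_to_words and its return convention are made up for illustration; it assumes words are separated by spaces and that the caller supplies the destination buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copies each
 * space-separated word of `src` into `dst` with `suffix` appended,
 * joining the words with single spaces.
 * `src` is tokenized in place by strtok, so it must be writable.
 * Returns 0 on success, -1 if `dst` is too small. */
int append_suffix_to_words(char *src, const char *suffix,
                           char *dst, size_t dst_size)
{
    assert(dst != NULL && dst_size > 0);
    size_t used = 0;                         /* characters written so far */
    size_t suffix_len = strlen(suffix);
    dst[0] = '\0';

    for (char *tok = strtok(src, " "); tok != NULL; tok = strtok(NULL, " ")) {
        size_t tok_len = strlen(tok);
        size_t sep = (used > 0) ? 1 : 0;     /* space before every word but the first */
        if (used + sep + tok_len + suffix_len + 1 > dst_size)
            return -1;                       /* appending would overflow dst */
        if (sep)
            dst[used++] = ' ';
        memcpy(dst + used, tok, tok_len);    /* the word itself */
        used += tok_len;
        memcpy(dst + used, suffix, suffix_len);
        used += suffix_len;
        dst[used] = '\0';
    }
    return 0;
}
```

Because the suffix is written into dst and never into the tokenized input, strtok's saved position stays valid for the whole loop.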

Related

How does strcat affect the strtok?

Assume we need to copy user's input into another string by concatenating the tokens of input, e.g., "hello world" -> "helloworld".
#include <stdio.h>
#include <string.h>
int main(void) {
    char buffer[50];
    printf("\nEnter a string: ");
    while (fgets(buffer, sizeof(buffer), stdin) != 0) {
        size_t size = strlen(buffer);
        if (size > 0 && buffer[size - 1] == '\n') {
            char input[1]; // set it too small
            buffer[size - 1] = '\0';
            char *tok = strtok(buffer, " "); // works fine
            do {
                strcat(input, tok); // append to "input" that has not enough space
                printf("\nfound token: %s", tok);
                tok = strtok(NULL, " "); // produces garbage
            } while (tok);
            break;
        }
    }
Running the code above:
Enter a string: hello world
found token: hello
found token: w
found token: r
*** stack smashing detected ***: <unknown> terminated
I struggle to understand how strtok is related to strcat failing to append tok. They do not share any variables except for tok, which (according to the docs) is copied by strcat, so whatever strcat is doing shouldn't affect strtok's behavior, and the program should crash on the second strcat call at the latest, right? But we see that strcat gets called 3 times before stack smashing is detected. Can you please explain why?
For starters this array
char input[1];
is not initialized and does not contain a string.
So this call of strcat
strcat(input, tok);
invokes undefined behavior also because the array input is not large enough to store the copied string. It can overwrite memory beyond the array.
There are multiple problems in the code:
char input[1]; is too small to do anything. You cannot concatenate the tokens from the line into this minuscule array. You must define it with a sufficient length, namely the same length as buffer for simplicity.
input must be initialized as an empty string for strcat(input, tok); to have defined behavior. As coded, the first call to strcat corrupts other variables causing the observed behavior, but be aware anything else could happen as a result of this undefined behavior.
char *tok = strtok(buffer, " "); works fine but may return a null pointer if buffer contains only spaces or nothing at all. The do loop will then invoke undefined behavior on strcat(input, tok). Use a for or while loop instead.
There is a missing } in the code; it is unclear whether you mean to break from the while loop after the first iteration or only upon getting the end of the line.
Here is a modified version:
#include <stdio.h>
#include <string.h>
int main(void) {
    char buffer[50];
    char input[sizeof buffer] = "";
    printf("Enter a string: ");
    if (fgets(buffer, sizeof(buffer), stdin)) {
        char *tok = strtok(buffer, " \n");
        while (tok) {
            strcat(input, tok);
            printf("found token: %s\n", tok);
            tok = strtok(NULL, " \n");
        }
        printf("token string: %s\n", input);
    }
    return 0;
}

Segmentation fault (core dumped) c

Here is a weird problem:
token = strtok(NULL, s);
printf(" %s\n", token); // these two lines can read the token and print
However!
token = strtok(NULL, s);
printf("%s\n", token); // these two lines give me a segmentation fault
I don't know what happened: I just added a space before %s\n, and now I can see the value of token.
my code:
int main() {
    FILE *bi;
    struct _record buffer;
    const char s[2] = ",";
    char str[1000];
    const char *token;
    bi = fopen(DATABASENAME, "wb+");
    /* get strings from input, and divide them into separate structs */
    while(fgets(str, sizeof(str), stdin)!= NULL) {
        printf("%s\n", str); // can print string line by line
        token = strtok(str, s);
        strcpy(buffer.id, token);
        printf("%s\n", buffer.id); //can print the value in the struct
        while(token != NULL){
            token = strtok(NULL, s);
            printf("%s\n", token); // problem starts here
            /*strcpy(buffer.lname, token);
            printf("%s\n", buffer.lname); // cant do anything with token */
        }
    }
    fclose(bi);
    return 1;
}
Here is an example of a string I read from stdin, followed by the parsed output (I only tried to strtok the first two elements to see if it works):
<15322101,MOZNETT,JOSE,n/a,n/a,2/23/1943,MALE,824-75-8088,42 SMITH AVENUE,n/a,11706,n/a,n/a,BAYSHORE,NY,518-215-5848,n/a,n/a,n/a
<
< 15322101
< MOZNETT
With the format "%s\n", your compiler may transform the printf() call into puts(), and puts() does not accept a null pointer: it has to read the string to find its length, so it crashes. When you add a space in front of the format specifier, the compiler can no longer substitute puts(), so it calls the actual printf() function, whose implementation in your C library happens to tolerate a null pointer (glibc, for example, prints "(null)"). That is why the version with the space appears to work.
Your problem reduces to the following question: What is the behavior of printing NULL with printf's %s specifier?
In short, passing NULL as the argument for %s in printf is undefined behavior, so you need to check for NULL as suggested by @kninnug.
You need to change your printf as follows:
token = strtok(NULL, s);
if (token != NULL) printf("%s\n", token);
Or else
printf ("%s\n", token == NULL ? "" : token);
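The same NULL check matters for the commented-out strcpy in the question, since strcpy with a null source is just as undefined. A minimal sketch of a guarded copy is shown below; copy_field is a hypothetical helper, and the destination size is an assumption because the definition of struct _record is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not from the original post): copies a token returned
 * by strtok into a fixed-size field.
 * A NULL token leaves the field empty and returns -1 instead of crashing.
 * A token that is too long is truncated so the field stays terminated. */
int copy_field(char *dst, size_t dst_size, const char *token)
{
    assert(dst != NULL && dst_size > 0);
    if (token == NULL) {
        dst[0] = '\0';            /* no more tokens: store an empty string */
        return -1;
    }
    size_t len = strlen(token);
    if (len >= dst_size)
        len = dst_size - 1;       /* truncate to leave room for '\0' */
    memcpy(dst, token, len);
    dst[len] = '\0';
    return 0;
}
```

A call such as copy_field(buffer.lname, sizeof buffer.lname, strtok(NULL, s)) would then be safe even on the final iteration, assuming lname is a char array member.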

Only getting 2 tokens per line from strtok()

I'm trying to make tokens from an input file. So, I get one line with fgets and feed it to a helper method that takes in a char* and returns a char* of the token. I am utilizing strtok() with delimiter as " " since the tokens are all separated by " ". But, I can't figure out why the code only makes 2 tokens per line and just moves on to the next line even though there is more in that line needed to be tokenized. Here is the code:
char *TKGetNextToken( char * start ) {
    /* fill in your code here */
    printf("Entered TKGetNextToken \n");
    printf(&start[0]);
    char* temp = &start[0];
    //Delimiters for the tokens
    const char* delim = " ";
    //store tempToken
    char* tempTok = strtok(temp, delim);
    //return the token
    return tempTok;
}
Here is how I'm storing the tokens in the main method:
//call get next token and get the token and store into temptok
while (temp!= NULL) {
    tempTok = TKGetNextToken(temp);
    printf("tempTok: %s\n",tempTok);
    token.charPtr[tempNum] = tempTok;
    tempNum++;
    printf("Temp: %s\n",tempTok);
    temp = strtok(NULL, " \0\n");
}
So, lets say I have a file.txt with:
abcd ef ghij asf32
fsadf ads adf
The tokens created would be "abcd" and "ef" and it will go on to the next line without making tokens for "ghij" and "asf32".
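Use the proper calling pattern for strtok: pass the string on the first call only, and NULL on every later call for the same string. Your helper calls strtok(temp, delim) with a non-NULL pointer each time, which restarts tokenization on the short string it was handed (the single token returned by the previous strtok(NULL, ...) call), so nothing is left after the second token.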
char *tempTok = strtok(line, " \n"); //initialize
while (tempTok != NULL) {
    //do the work
    tempTok = strtok(NULL, " \n"); //update
}
If you follow the pattern above, getting the tokens is straightforward. Here is a complete example similar to your code; note how strtok is called once with the line and then with NULL inside the loop, which advances through the line one token at a time.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
    FILE *fp = fopen("data.txt", "r");
    if (fp == NULL) { // stop early if the file cannot be opened
        perror("data.txt");
        return 1;
    }
    char line[256];
    while (fgets(line, sizeof(line), fp)) {
        char *tempTok = strtok(line, " \n"); // first call: pass the line
        while (tempTok != NULL) {
            printf("token %s\n", tempTok);
            tempTok = strtok(NULL, " \n"); // later calls: pass NULL
        }
    }
    fclose(fp);
    return 0;
}
File data.txt
abcd ef ghij asf32
fsadf ads adf
Output
./a.out
token abcd
token ef
token ghij
token asf32
token fsadf
token ads
token adf
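If you want to keep a helper function as in the question, one possible rework is sketched below. The names tk_get_next_token and collect_tokens are hypothetical; the sketch assumes the caller passes the line on the first call and NULL afterwards, exactly as strtok itself expects.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical rework of the question's helper: `start` is the line on the
 * first call and NULL on every later call for the same line. */
char *tk_get_next_token(char *start)
{
    return strtok(start, " \n");
}

/* Hypothetical driver: stores up to `max_tokens` token pointers from `line`
 * and returns how many were stored. The pointers refer into `line`. */
int collect_tokens(char *line, char *tokens[], int max_tokens)
{
    assert(line != NULL && tokens != NULL);
    int n = 0;
    char *tok = tk_get_next_token(line);     /* first call: the line */
    while (tok != NULL && n < max_tokens) {
        tokens[n++] = tok;
        tok = tk_get_next_token(NULL);       /* later calls: NULL */
    }
    return n;
}
```

The stored pointers refer into the line buffer, so they become stale as soon as fgets reads the next line into it; copy the tokens if they must outlive the line.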

Length of string returned by strtok()

I want to find the length of each word in a string. When I use strlen(split) outside the while loop it's OK, but when I use it inside the loop I get a segmentation fault. What's the problem?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
    char string[] = "Hello world!";
    char* word = strtok(string, " ");
    printf("%d\n", strlen(word));
    while(split != NULL) {
        word = strtok(NULL, " ");
        printf("%d\n", strlen(word ));
    }
}
You need to check that strtok didn't return NULL before calling strlen.
From the strtok man page (my emphasis):
Each call to strtok() returns a pointer to a null-terminated string
containing the next token. This string does not include the delimiting
byte. If no more tokens are found, strtok() returns NULL.
while(word != NULL) {
    word = strtok(NULL, " ");
    if (word != NULL) {
        printf("%zu\n", strlen(word));
    }
}
Note that there was also a typo in your code. The while loop should test word rather than split. Also, strlen returns a size_t, so it should be printed with %zu rather than %d.
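A for loop keeps the NULL test and the advancing call in one place, which makes this mistake harder to write. The sketch below collects the lengths into an array instead of printing them; token_lengths and its parameters are illustrative names, not part of the original code.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: stores the length of each token of `text` in
 * `lengths` (at most `max_tokens` entries) and returns how many were stored.
 * `text` is modified by strtok. */
size_t token_lengths(char *text, const char *delims,
                     size_t lengths[], size_t max_tokens)
{
    assert(text != NULL && delims != NULL);
    size_t n = 0;
    for (char *word = strtok(text, delims);
         word != NULL && n < max_tokens;      /* never pass NULL to strlen */
         word = strtok(NULL, delims)) {
        lengths[n++] = strlen(word);
    }
    return n;
}
```

Each stored length can then be printed with printf("%zu\n", lengths[i]).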

tokenizing a string twice in c with strtok()

I'm using strtok() in C to parse a CSV string. First I tokenize it just to find out how many tokens there are, so I can allocate an array of the correct size. Then I go through it again using the same variable I used last time for tokenization. Every time I do it a second time, though, strtok(NULL, ",") returns NULL even though there are still more tokens to parse. Can somebody tell me what I'm doing wrong?
char* tok;
int count = 0;
tok = strtok(buffer, ",");
while(tok != NULL) {
    count++;
    tok = strtok(NULL, ",");
}
//allocate array
tok = strtok(buffer, ",");
while(tok != NULL) {
    //do other stuff
    tok = strtok(NULL, ",");
}
So on that second while loop it always ends after the first token is found even though there are more tokens. Does anybody know what I'm doing wrong?
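strtok() modifies the string it operates on, replacing delimiter characters with null characters. After the first pass, buffer therefore ends right after the first token, which is why the second pass finds only one token. If you want to tokenize the same text more than once, you'll have to make a copy first.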
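One way to apply this is to do the counting pass on a temporary copy, so the original buffer is still intact for the real pass. The sketch below assumes a hypothetical helper named count_tokens; returning 0 when malloc fails is a simplification chosen for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: counts the tokens in `s` without modifying it,
 * by running strtok over a heap-allocated copy. */
size_t count_tokens(const char *s, const char *delims)
{
    assert(s != NULL && delims != NULL);
    size_t len = strlen(s);
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return 0;                  /* simplification: treat failure as "no tokens" */
    memcpy(copy, s, len + 1);      /* includes the terminating '\0' */

    size_t count = 0;
    for (char *tok = strtok(copy, delims); tok != NULL; tok = strtok(NULL, delims))
        count++;

    free(copy);
    return count;
}
```

After count_tokens(buffer, ",") returns, buffer is unchanged, so a normal strtok(buffer, ",") loop can follow it.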
There's not necessarily a need to make a copy - strtok() does modify the string it's tokenizing, but in most cases that simply means the string is already tokenized if you want to deal with the tokens again.
Here's your program modified a bit to process the tokens after your first pass:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void)
{
    int i;
    char buffer[] = "some, string with , tokens";
    char* tok;
    int count = 0;
    tok = strtok(buffer, ",");
    while(tok != NULL) {
        count++;
        tok = strtok(NULL, ",");
    }
    // walk through the tokenized buffer again
    tok = buffer + strspn(buffer, ","); // skip any leading delimiters, as strtok() did
    for (i = 0; i < count; ++i) {
        printf( "token %d: \"%s\"\n", i+1, tok);
        if (i + 1 < count) { // stay inside the buffer after the last token
            tok += strlen(tok) + 1; // get the next token by skipping past the '\0'
            tok += strspn(tok, ","); // then skipping any starting delimiters
        }
    }
    return 0;
}
Note that this is unfortunately trickier than I first posted - the call to strspn() needs to be performed after skipping the '\0' placed by strtok() since strtok() will skip any leading delimiter characters for the token it returns (without replacing the delimiter character in the source).
Use strsep: it actually updates your pointer. With strtok you have to keep passing NULL, whereas with strsep you pass the address of your string pointer on every call. The only catch is that strsep advances that pointer, so if the string was allocated on the heap, keep a pointer to the beginning so you can free it later. Note that strsep is a BSD/glibc extension rather than standard C, and unlike strtok it returns empty tokens for adjacent delimiters.
char *strsep(char **stringp, const char *delim);
char *string = buffer; // strsep advances this pointer
char *token;
token = strsep(&string, ",");
strtok is what a typical intro to C course uses; strsep is much better. :-)
No more confusion over having to keep passing NULL because strtok tracks its position behind your back.
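A fuller strsep loop might look like the sketch below. split_with_strsep is a hypothetical helper, and the _DEFAULT_SOURCE define is an assumption about glibc, where it makes sure strsep is declared; BSD-derived systems declare it in string.h without it.

```c
#define _DEFAULT_SOURCE   /* assumption: glibc; asks string.h to declare strsep */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: splits `line` on any character in `delims` using
 * strsep and stores pointers to the fields, including empty ones.
 * Returns the number of fields stored (at most `max_fields`). */
size_t split_with_strsep(char *line, const char *delims,
                         char *fields[], size_t max_fields)
{
    assert(line != NULL && delims != NULL);
    size_t n = 0;
    char *cursor = line;          /* strsep advances this, `line` stays put */
    char *field;

    while (n < max_fields && (field = strsep(&cursor, delims)) != NULL)
        fields[n++] = field;
    return n;
}
```

Because the caller's own pointer to the line is never advanced, it can still be passed to free later if the line was heap-allocated.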

Resources