The difference between using strtok() on an input string and on a declared string - c

To understand the behavior of strtok() in ANSI C, I wrote two programs.
#include <stdio.h>
#include <string.h>
int main()
{
char str[101] = "This is";
char *pch;
printf("Splitting string %s into tokens : \n",str);
pch = strtok(str," ");
while(pch != NULL)
{
printf("%s\n",pch);
pch = strtok(NULL, " ");
}
return 0;
}
The result of this program is
Splitting string "This is " into tokens:
This
is
Next, I changed it a little bit.
#include <stdio.h>
#include <string.h>
int main()
{
char str[101];
char *pch;
scanf("%s",str); //After launching the program, I typed "This is "
str[strcspn(str,"\n")] = '\0';
printf("Splitting string %s into tokens : \n",str);
pch = strtok(str," ");
while(pch != NULL)
{
printf("%s\n",pch);
pch = strtok(NULL, " ");
}
return 0;
}
It prints
Splitting string "This" into tokens:
This
I can't understand why the second word is gone when I use stdin.

The problem isn't with strtok, but with your use of scanf and the "%s" format specifier. That format specifier reads whitespace-delimited strings, i.e. you cannot use "%s" to read anything with a space in it, so only "This" ever reaches str.
The natural solution is to use fgets instead, which you have already prepared for by "removing the newline" (which scanf would not usually read anyway).
It should have been pretty obvious that strtok can't be involved, since you print the input string before even calling strtok.

Related

strtok is not working in c

here is my code,
#include <string.h>
#include <stdio.h>
main ()
{
explode (" ", "this is a text");
}
explode (char *delimiter, char string[])
{
char *pch;
printf ("Splitting string \"%s\" into tokens:\n",string);
pch = strtok (string,delimiter);
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, delimiter);
}
return 0;
}
I compile this code using gcc -o 1.exe 1.c and it shows no error. But when I execute 1.exe it shows Splitting string "this is a text" into tokens: and at that moment 1.exe stops working (a Windows dialog box appears). Can anybody tell me what the problem is and how to solve it? I am using Windows 10.
While you can't do this with strtok because the literal can't be modified, it can be done with strcspn.
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
void explode (char *delimiter, char *string);
int main()
{
explode (" ", "this is a text");
return 0;
}
void explode (char *delimiter, char *string)
{
int span = 0;
int offset = 0;
int length = 0;
if ( delimiter && string) {
length = strlen ( string);
printf ("Splitting string \"%s\" into tokens:\n",string);
while (offset < length) {
span = strcspn ( &string[offset],delimiter);//work from offset to find next delimiter
printf ("%.*s\n",span, &string[offset]);//print span number of characters
offset += span + 1;// increment offset by span and one characters
}
}
}
In your explode() function, you're passing a string literal ("this is a text") and using it as the input to strtok().
As strtok() modifies the input string, this invokes undefined behavior. As mentioned in the C11 standard, chapter §6.4.5, String literals:
[...] If the program attempts to modify such an array, the behavior is
undefined.
You can either
Define an array, initialize it with the string literal, and then use the array as input to strtok().
Take a pointer, use strdup() to copy the initializer, and then supply that pointer to strtok().
The bottom line is, the input string to strtok() should be modifiable.

Length of string returned by strtok()

I want to find the length of each word in a string. When I use strlen(split) outside the while loop it's OK. But when I use it inside the loop I get a segmentation fault. What's the problem?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(void) {
char string[] = "Hello world!";
char* word = strtok(string, " ");
printf("%d\n", strlen(word));
while(split != NULL) {
word = strtok(NULL, " ");
printf("%d\n", strlen(word ));
}
}
You need to check that strtok didn't return NULL before calling strlen.
From the strtok man page (my emphasis):
Each call to strtok() returns a pointer to a null-terminated string
containing the next token. This string does not include the delimiting
byte. If no more tokens are found, strtok() returns NULL.
while(word != NULL) {
word = strtok(NULL, " ");
if (word != NULL) {
printf("%zu\n", strlen(word)); // strlen returns size_t, so use %zu rather than %d
}
}
Note that there was also a typo in your code. The while loop should test word rather than split.

How to cut a string using 2 delimiters

How to cut a string using 2 delimiters in C?
I'm getting a string from the user in this format:
cp <path1> <path2>
I need to get the paths into new strings (each path in its own string).
I tried to use strstr and strtok but it didn't work.
I don't know the length of the paths. I only know that they start with " \" (these are the delimiters that I have: space + \).
This is what I tried:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main()
{
char *c;
char *ch = malloc(1024);
while (strcmp(ch, "exit"))
{
scanf("%[^\n]%*c", ch); //what was the input (cp /dor/arthur /king/apple)
c = malloc(sizeof(strlen(ch) + 1));
strcpy(c, ch);
char *pch = strtok(c, " //");
printf("this is : %s \n", pch); //printed "this is: cp"
}
}
Use strtok(). The linked reference contains an example of using strtok().
You can use the 2 delimiters (space + \) with strtok() in this way:
str = strtok(str, " \\");
Is this in the main function? If it is, main has the argc (int) and argv (char *[]) parameters, which you can use to do what you want.

How to scan multiple words using sscanf in C?

I'm trying to scan a line that contains multiple words in C. Is there a way to scan it word by word and store each word as a different variable?
For example, I have the following types of lines:
A is the 1 letter;
B is the 2 letter;
C is the 3 letter;
If I'm parsing through the first line: "A is the 1 letter" and I have the following code, what do I put in each case so I can get the individual tokens and store them as variables. To clarify, by the end of this code, I want "is," "the," "1," "letter" in different variables.
I have the following code:
while (feof(theFile) != 1) {
string = "A is the 1 letter"
first_word = sscanf(string);
switch(first_word):
case "A":
what to put here?
case "B":
what to put here?
...
You shouldn't use feof() like that. You should use fgets() or equivalent. You probably need to use the little-known (but present in standard C89) conversion specifier %n.
#include <stdio.h>
int main(void)
{
char buffer[1024];
while (fgets(buffer, sizeof(buffer), stdin) != 0)
{
char *str = buffer;
char word[256];
int posn;
while (sscanf(str, "%255s%n", word, &posn) == 1)
{
printf("Word: <<%s>>\n", word);
str += posn;
}
}
return(0);
}
This reads a line, then uses sscanf() iteratively to fetch words from the line. The %n format specifier doesn't count towards the successful conversions, hence the comparison with 1. Note the use of %255s to prevent overflows in word. Note too that sscanf() could write a null after the 255 count specified in the conversion specification, hence the difference of one between the declaration of char word[256]; and the conversion specifier %255s.
Clearly, it is up to you to decide what to do with each word as it is extracted; the code here simply prints it.
One advantage of this technique over any solution based on strtok() is that sscanf() does not modify the input string so if you need to report an error, you have the original input line to use in the error report.
After editing the question, it seems that punctuation such as the semicolon is not wanted in a word; the code above would include punctuation as part of the word. In that case, you have to think a bit harder about what to do. The starting point might well be using an alphanumeric scan-set as the conversion specification in place of %255s:
"%255[a-zA-Z_0-9]%n"
You probably then have to look at what's in the character at the start of the next component and skip it if it is not alphanumeric:
if (!isalnum((unsigned char)*str))
{
if (sscanf(str, "%*[^a-zA-Z_0-9]%n", &posn) == 0)
str += posn;
}
Leading to:
#include <stdio.h>
#include <ctype.h>
int main(void)
{
char buffer[1024];
while (fgets(buffer, sizeof(buffer), stdin) != 0)
{
char *str = buffer;
char word[256];
int posn;
while (sscanf(str, "%255[a-zA-Z_0-9]%n", word, &posn) == 1)
{
printf("Word: <<%s>>\n", word);
str += posn;
if (!isalnum((unsigned char)*str))
{
if (sscanf(str, "%*[^a-zA-Z_0-9]%n", &posn) == 0)
str += posn;
}
}
}
return(0);
}
You'll need to consider the I18N and L10N aspects of the alphanumeric ranges chosen; what's available may depend on your implementation (POSIX doesn't specify support in scanf() scan-sets for the notations such as [[:alnum:]], unfortunately).
You can use strtok() to tokenize or split strings. Please refer to the following link for an example: http://www.cplusplus.com/reference/cstring/strtok/
You can take an array of character pointers and assign the tokens to them.
Example:
char *tokens[100];
int i = 0;
char *token = strtok(string, " ");
while (token != NULL && i < 100) { // stop before overflowing tokens[]
tokens[i] = token;
token = strtok(NULL, " ");
i++;
}
printf("Total Tokens: %d\n", i);
Note that the %s specifier skips leading whitespace. So you can write:
std::string s = "A is the 1 letter";
typedef char Word[128];
Word words[6];
// %127s leaves room for the terminating null in each 128-byte Word
int wordsRead = sscanf(s.c_str(), "%127s%127s%127s%127s%127s%127s", words[0], words[1], words[2], words[3], words[4], words[5] );
std::cout << wordsRead << " words read" << std::endl;
for(int i = 0; i != wordsRead; ++i)
std::cout << "'" << words[i] << "'" << std::endl;
Note how this approach (unlike strtok) effectively requires an assumption about the maximum number of words to read, as well as their lengths.
I would recommend using strtok().
Here is the example from http://www.cplusplus.com/reference/cstring/strtok/
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
Output will be:
Splitting string "- This, a sample string." into tokens:
This
a
sample
string

C - Split a string

Is there any pre-defined function in C that can split a string given a delimiter? Say I have a string:
"Command:Context"
Now, I want to store "Command" and "Context" to a two dimensional array of characters
char ch[2][10];
or to two different variables
char ch1[10], ch2[10];
I tried using a loop and it works fine. I'm just curious if there is such function that already exists, I don't want to reinvent the wheel. Please provide a clear example, thank you very much!
You can use strtok
Online Demo:
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="Command:Context";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str,":");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, ":");
}
return 0;
}
Output:
Splitting string "Command:Context" into tokens:
Command
Context
You can tokenise a string with strtok as per the following sample:
#include <stdio.h>
#include <string.h>
int main (void) {
char instr[] = "Command:Context";
char words[2][10];
char *chptr;
int idx = 0;
chptr = strtok (instr, ":");
while (chptr != NULL) {
strcpy (words[idx++], chptr);
chptr = strtok (NULL, ":");
}
printf ("Word1 = [%s]\n", words[0]);
printf ("Word2 = [%s]\n", words[1]);
return 0;
}
Output:
Word1 = [Command]
Word2 = [Context]
The strtok function has some minor gotchas that you probably want to watch out for. Primarily, it modifies the string itself to weave its magic so won't work on string literals (for example).
