Why is my strtok breaking up my strings after space when I specified my delimiter as ","?
I can only suggest that you're doing something wrong though it's a little hard to tell exactly what (you should generally post your code when asking about specifics). Sample programs, like the following, seem to work fine:
#include <stdio.h>
#include <string.h>
int main (void) {
    char *s;
    char str[] =
        "This is a string,"
        " with both spaces and commas,"
        " for testing.";
    printf ("[%s]\n", str);
    s = strtok (str, ",");
    while (s != NULL) {
        printf (" [%s]\n", s);
        s = strtok (NULL, ",");
    }
    return 0;
}
It outputs:
[This is a string, with both spaces and commas, for testing.]
[This is a string]
[ with both spaces and commas]
[ for testing.]
The only possibility that springs to mind immediately is if you're using " ," instead of ",". In that case, you would get:
[This is a string, with both spaces and commas, for testing.]
[This]
[is]
[a]
[string]
[with]
[both]
[spaces]
[and]
[commas]
[for]
[testing.]
Thanks! I looked around and figured out that the problem was with my scanf, which doesn't read the whole line the user inputs. It seems that my strtok was working fine, but the value I am comparing against the return value of strtok is wrong.
For example, my strtok call takes "Jeremy Whitfield,Ronny Whitfield" and gives me "Jeremy Whitfield" and "Ronny Whitfield". In my program, I am using scanf to take in the user input "Ronny Whitfield", which actually reads only "Ronny". So it's a problem with my scanf, not strtok.
My virtual machine is getting stuck every time I open it, so I am unable to access my code for now.
Related
I am having a struggle with the following exercise in my book:
Write a program that prompts the user to enter a series of words separated by single spaces, then prints the words in reverse order. Read the input as a string, and then use strtok to break it into words.
Input: hi there you are cool
Output: None; the program just closes.
Expected: cool are you there hi
My program only gets the string and waits and shuts after a couple of seconds. Here's the code:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
int main(void){
    int ch ;
    char * str , * str2;
    char * p;
    str = (char*)malloc(sizeof(char) * 100);
    str2 =(char*)malloc(sizeof(char) * 100);
    if((fgets(str , sizeof(str) , stdin)) != NULL){
        str = strtok(str ," \t");
        p = strrchr(str , '\0');
        strcat(str2,p);
        printf("%s",p);
        while(str != NULL){
            str = strtok(NULL ," \t");
            p = strrchr(str + 1, '\0');
            strcat(str2,p);
            printf("%s",p);
        }
    }
    return 0;
}
I know this question has been asked here. I get the idea there but my problem is implementation and carrying out. This is more of a beginner question.
Since you yourself stated that this is for an exercise I will not provide a working solution but an outline of what you might want to do.
Functions you want to use:
getline - for an easy read of an input line (notice that the newline character will not be eliminated)
strtok_r to get the tokens (i.e. the words) from the input string
the _r means that this function is re-entrant, which means that it can safely be called by multiple threads at the same time. The normal version keeps internal state, and strtok_r lets you manage that state via a parameter.
(Please also read the docs for these functions if you have further questions)
For the algorithm:
Use getline to read a single line from input and replace the newline character with the '\0' character. Then extract the tokens one after the other from the input and store them in a stack-like fashion. After you have tokenized the input, just pop the tokens from the stack and print them to stdout.
Another approach would be:
Write a function that simply reverses a string. Use it to reverse the whole input string, then read the tokens from the reversed string, reverse each token again, and print it to stdout.
Here is my program (written in C, compiled and run on Omega, if it makes any difference):
#include <stdio.h>
#include <string.h>
int main (void)
{
    char string[] = " hello!how are you? I am fine.";
    char *token = strtok(string,"!?.");
    printf("Token points to '%c'.\n",*token);
    return 0;
}
This is the output I'm expecting:
"Token points to '!'."
But the output I'm getting is:
"Token points to ' '."
From trial and error, I know this is referring to the first character in the string: the space before "hello!".
Why am I not getting the output I'm expecting, and how can I fix it? I do understand from what I've read on here already that strtok is better off buried in a ditch, but let's assume that (if it's possible) I have to use it here, and I have to make it work.
As per strtok man page description
The strtok() function parses a string into a sequence of tokens. On
the first call to strtok() the string to be parsed should be specified
in str. In each subsequent call that should parse the same string, str
should be NULL.
It parses the string based on the delimiters and returns the token, not the delimiter.
In your case the delimiters are "!?."
char string[] = " hello!how are you? I am fine.";
The first delimiter to occur, "!", comes right after " hello". So strtok returns " hello", and your output is simply the first character of that string, ' '.
Someone just posted an answer. It worked for me and now I can't find it. Reposting as best I remember in case someone else has the same question.
char *token = strtok(string,"!?.");
token = strtok(NULL, "!?."); //<--THIS
token points to the first letter after the first delimiter, which is at least something I can work with. Thank you stranger!
I've been working on a C program that will allow the user to type in lines of text until they type in the phrase "The end" on a line by itself. The program will replace every occurrence of "is" with the string "was" and count the number of changes made.
So far I've written some code, but I'm getting a little lost on how to get it to work correctly. As of right now the program is giving me a buffer overflow error.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main ()
{
    char str[500];// ="- This, a sample string.";
    char * pch;
    char endTerm[5000];
    printf("Enter a string to parse: ");
    scanf("%[^\n]",str);
    strcat(endTerm, str);
    while ( (strcmp(str, "the end")) != 0 || (strcmp(str, "the end.")) != 0 )
    {
        scanf("%[^\n]",str);
        strcat(endTerm, str);
    }
    printf ("Your original string was: %s\n\n",endTerm);
    pch = strtok (endTerm," ,-");
    while (pch != NULL)
    {
        if ((strcmp(pch, "is")) == 0)
        {
            pch = "was";
        }
        else if ((strcmp(pch, "is.")) == 0)
        {
            pch="was.";
        }
        printf ("%s ",pch);
        pch = strtok (NULL, " ,-");
    }
    printf("\n\n");
    return 0;
}
I can probably figure out how to end the program if the user types "the end", but I really need help with replacing the word "is" with "was".
As others have pointed out in the comments above, there are several flaws in your program.
The first flaw is your use of the strcat function. If you read the documentation, you will see that strcat treats its first argument as the destination, which must already hold a null-terminated string and have enough room for the concatenated result. In your case the destination does not satisfy this: endTerm is never initialized before the first strcat(endTerm, str), and nothing limits how much gets appended to it. This is the reason you are getting the buffer overflow or segmentation fault.
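The second error in your program is in the use of the strcmp function. This function returns 0 (which is what false, not true, expands to in stdbool.h) when two strings are equal. Your loop condition joins the two != 0 tests with ||, so it is true for every input: no string can equal both "the end" and "the end." at once, and the loop never terminates.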
The third problem in your program is in the usage of the function strtok. You need to pass NULL as the first argument from the second call onward to get the pointers to the remaining tokens.
So fix these 3 errors first and then try to think about what else needs to be corrected in order to get your desired output.
I'm trying to split some strings on the {white_space} symbol.
However, there is a problem with some of the splits: I want to split on {white_space}, but keep quoted sub-strings together.
For example,
char *pch;
char str[] = "hello \"Stack Overflow\" good luck!";
pch = strtok(str," ");
while (pch != NULL)
{
    printf ("%s\n",pch);
    pch = strtok(NULL, " ");
}
This will give me
hello
"Stack
Overflow"
good
luck!
But what I want is
hello
Stack Overflow
good
luck!
Any suggestion or idea please?
You'll need to tokenize twice. The program flow you currently have is as follows:
1) Search for space
2) Print all characters prior to space
3) Search for next space
4) Print all characters between last space, and this one.
You'll need to start thinking in a different manner, with two layers of tokenization:
1) Search for quotation marks
2) On odd-numbered strings, perform your original program (search for spaces)
3) On even-numbered strings, print blindly
In this case, even numbered strings are (ideally) within quotes. ab"cd"ef would result in ab being odd, cd being even... etc.
The other side is remembering what you need to do. What you're actually looking for (in regex) is "[a-zA-Z0-9 \t\n]*" or [a-zA-Z0-9]+. The difference between the two options is whether the text is enclosed in quotes. So separate by quotes, and identify from there.
Try altering your strategy.
Look at non-white space things, then when you find quoted string you can put it in one string value.
So, you need a function that examines the characters between white space. When you find '"' you can change the rules and hoover up everything until the matching '"'. If this function returns a TOKEN value and a value (the string matched), then whatever calls it can decide how to produce the correct output. At that point you have written a tokeniser. Tools that generate them, called "lexers", actually exist, as tokenisers are widely used to implement programming languages and config files.
Assuming nextc reads next char from string, begun by firstc( str) :
for (firstc(str); (c = nextc()) != '\0'; ) {
    if (isspace(c))
        continue;
    else if (c == '"')
        return readQuote; /* Handle Quoted string */
    else
        return readWord; /* Terminated by space & '"' */
}
return EOS;
return EOS;
You'll need to define return values for EOS, QUOTE and WORD, and a way to get the text in each Quote or Word.
Here's the code that works... in C
The idea is that you first tokenize on the quote, since that's the priority (if a string is inside the quotes then we don't tokenize it, we just print it). Then, for each of those tokenized strings, we tokenize within that string on the space character, but only for alternate strings, because alternate strings will be inside and outside of the quotes.
#include <stdio.h>
#include <string.h>
#include <stdbool.h>
int main() {
    char *pch1, *pch2, *save_ptr1, *save_ptr2;
    char str[] = "hello \"Stack Overflow\" good luck!";
    /* Outer pass: split on the quote character. */
    pch1 = strtok_r(str,"\"", &save_ptr1);
    bool in = false;   /* true while the current piece was inside quotes */
    while (pch1 != NULL) {
        if(in) {
            /* Quoted piece: print it whole. */
            printf ("%s\n", pch1);
            pch1 = strtok_r(NULL, "\"", &save_ptr1);
            in = false;
            continue;
        }
        /* Unquoted piece: inner pass splits it on spaces. */
        pch2 = strtok_r(pch1, " ", &save_ptr2);
        while (pch2 != NULL) {
            printf ("%s\n",pch2);
            pch2 = strtok_r(NULL, " ", &save_ptr2);
        }
        pch1 = strtok_r(NULL, "\"", &save_ptr1);
        in = true;
    }
}
References
Tokenizing multiple strings simultaneously
http://linux.die.net/man/3/strtok_r
http://www.cplusplus.com/reference/cstring/strtok/
Forgive the basic question, but I'm missing something foundational. I recently found myself having to code after 8 years of complete programming inactivity.
What I'm trying to do is read in a string (with a known number of tokens in it), and then take these characters and convert them to integers.
This second part is my problem. I'm sure it's the way I'm handling the pointers, but I am unsure exactly where I'm going wrong.
Any tips would be welcome; I'm sure there are several issues below.
#include <stdio.h>
#include <string.h>
int main ()
{
    char str[] ="20\n3000 53\n96";
    char * pch;
    int num_one = 0;
    int num_two = 0;
    int num_three = 0;
    int num_four = 0;
    printf ("Splitting string \"%s\" into tokens:\n",str);
    printf("\n");
    pch = strtok (str," \n");
    num_one = atoi(&pch);
    printf ("%s\n",pch);
    printf ("%u\n",num_one);
    pch = strtok (NULL, " \n");
    printf ("%s\n",pch);
    num_two = atoi(&pch);
    printf ("%u\n",num_two);
    pch = strtok (NULL, " \n");
    num_three = atoi(&pch);
    printf ("%s\n",pch);
    printf ("%u\n",num_three);
    pch = strtok (NULL, " \n");
    num_four = atoi(&pch);
    printf ("%s\n",pch);
    printf ("%u\n",num_four);
    return 0;
}
Current output(can't get it to format correctly, but the \n are behaving as they should):
Splitting string "20
3000 53
96" into tokens:
20
0
3000
0
53
0
96
0
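Firstly, note that string literals are const: you cannot modify them. strtok() modifies the string it is given, so calling it on a literal (for example through char *str = "...") can give rise to a GPF or core dump unless you copy it first. Your char str[] declaration already makes a modifiable copy, so that part is fine.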
I see you are calling atoi(&pch), which is incorrect; that should be atoi(pch). You gave it a pointer to a pointer to the string, not a pointer to the string.
This would probably be much simpler using sscanf(), it will do it all in a single call.
Your format string would become something like "%d %d %d %d", since newline is just whitespace to sscanf().
If your input is "entrenched" you might also need to use %n to figure out how many characters of input were consumed. This is helpful to skip to the next "record". Also remember to check the return value to know whether the call succeeded or not.