Weird behaviour of strtok in C? - c

I was writing a program that prints the words common to two strings. I use two nested loops and split each string inside its own loop, but I didn't get the expected result. After changing the program a bit, I found that the outer loop runs only once, and I can't fathom why. Any ideas?
#include<stdio.h>
#include<string.h>
#include<stdlib.h>
int main()
{
char str1[] = "Japan Korea Spain Germany Australia France ";
char str2[] = "England USA Russia Italy Australia India Nepal France";
char *tar1 = strtok(str1," ");
char *tar2 = NULL;
while(tar1)
{
tar2 = strtok(str2," ");
while(tar2)
{
if(strcmp(tar1,tar2)) printf("%s %s\n",tar1 , tar2);
tar2 = strtok(NULL," ");
}
tar1 = strtok(NULL," ");
tar2 = NULL;
}
return 0;
}

You cannot use strtok on two different strings at the same time, because it keeps only one saved position internally. You also cannot parse a string more than once, because strtok has already modified the string by breaking it up with NUL terminators.
This example extracts the token pointers into an array of pointers for each input string, before checking for matches.
#include <stdio.h>
#include <string.h>
#define MAXSTR 20
int main()
{
char str1[] = "Japan Korea Spain Germany Australia France ";
char str2[] = "England USA Russia Italy Australia India Nepal France";
char *tar1[MAXSTR];
char *tar2[MAXSTR];
char *tok;
int ind1 = 0, ind2 = 0;
int i, j;
tok = strtok(str1, " \t");
while(tok != NULL && ind1 < MAXSTR) {
tar1[ind1++] = tok;
tok = strtok(NULL, " \t");
}
tok = strtok(str2, " \t");
while(tok != NULL && ind2 < MAXSTR) {
tar2[ind2++] = tok;
tok = strtok(NULL, " \t");
}
for(i=0; i<ind1; i++) {
for(j=0; j<ind2; j++) {
if(strcmp(tar1[i], tar2[j]) == 0) {
printf("%s\n", tar1[i]);
break;
}
}
}
return 0;
}
Program output:
Australia
France

The strtok() function breaks a string into a sequence of zero or more
nonempty tokens.
In other words, strtok overwrites the delimiter that ends each token (here ' ') with a NUL ('\0').
Consequently, you cannot call tar2 = strtok(str2," "); a second time on the same string and expect to see all of its tokens again.
And as pointed out by @WeatherVane: You cannot use strtok on two different strings at the same time.
An alternative to your code:
#include <stdio.h>
#include <string.h>
int main(void)
{
char str1[] = "Japan Korea Spain Germany Australia France ";
char str2[] = "England USA Russia Italy Australia India Nepal France";
char *tar = strtok(str1, " ");
char *ptr;
size_t sz;
while (tar) {
if ((ptr = strstr(str2, tar)) != NULL) {
/* First string or starts with " " */
if ((ptr == str2) || (*(ptr -1) == ' ')) {
sz = strlen(tar);
/* Last string or ends with " " */
if ((*(ptr + sz) == ' ') || (*(ptr + sz) == '\0')) {
puts(tar);
}
}
}
tar = strtok(NULL, " ");
}
return 0;
}
Output:
Australia
France

Related

How to split a string into separate words and create the array of these words in C language?

So, the task is the following:
Find the number of words in the text in which the first and last characters are the same.
In order to do this, I think I first should split the text and create the array of separate words.
For example, the string is:
"hello goodbye river dog level"
I want to split it and get the following array:
{"hello", "goodbye", "river", "dog", "level"}
I have the code that splits the string:
#include<stdio.h>
#include <string.h>
int main() {
char string[100] = "hello goodbye river dog level";
// Extract the first token
char * token = strtok(string, " ");
// loop through the string to extract all other tokens
while( token != NULL ) {
printf( " %s\n", token ); //printing each token
token = strtok(NULL, " ");
}
return 0;
}
However, it just prints these words, and I need to append each word to some array. The array shouldn't be of fixed size, because it may need to hold as many elements as the text requires. How can I do this?
I don't see any reason to split into words. Just iterate the string while keeping a flag that tells whether you are inside or outside a word (i.e. a state variable). Then have variables for first and last character that you maintain as you iterate. Compare them when you go out of a word or reach end-of-string.
A simple approach could look like:
#include <stdio.h>
int count(const char* s)
{
int res = 0;
int in_word = 0;
char first;
char last;
while(*s)
{
if (in_word)
{
if (*s == ' ')
{
// Found end of a word
if (first == last) ++res;
in_word = 0;
}
else
{
// Word continues so update last
last = *s;
}
}
else
{
if (*s != ' ')
{
// Found start of new word. Update first and last
first = *s;
last = *s;
in_word = 1;
}
}
++s;
}
if (in_word && first == last) ++res;
return res;
}
int main(void)
{
char string[100] = "hello goodbye river dog level";
printf("found %d words\n", count(string));
return 0;
}
Output:
found 2 words
Note: The current code assumes that the word delimiter is always a space. It also doesn't handle punctuation such as , and . but all of that can be added pretty easily.
Here is a simple (but naive) implementation based on the existing strtok code. It doesn't just count the words but also records which ones were found, by storing a pointer to each in a separate array of pointers.
This works since strtok changes the string in-place, replacing spaces with null terminators.
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[100] = "hello goodbye river dog level";
char* words[10]; // this is just assuming there's not more than 10 words
size_t count=0;
for(char* token=strtok(string," "); token!=NULL; token=strtok(NULL, " "))
{
if( token[0] == token[strlen(token)-1] ) // strlen(token)-1 gives index of last character
{
words[count] = token;
count++;
}
}
printf("Found: %zu words. They are:\n", count);
for(size_t i=0; i<count; i++)
{
puts(words[i]);
}
return 0;
}
Output:
Found: 2 words. They are:
river
level
A version with strtok, based on Alexander's code, that also treats punctuation as delimiters:
#include <stdio.h>
#include <string.h>
int main(void)
{
char string[] = "hello, goodbye; river, dog; level.";
char *token = strtok(string, " ,;.");
int counter =0;
while( token != NULL )
{
if(token[0]==token[strlen(token)-1]) counter++;
token = strtok(NULL, " ,;.");
}
printf("found : %d", counter);
return 0;
}

Split string and append them to an array

Let's say I have a string containing integers "1 3 4 9" and I want to separate them based on the whitespace between them, and then save them into an array.
For example:
Input:
char str[] = "1 3 4 9";
int arr[4];
Then arr should be like:
arr[] = {1, 3, 4, 9}
How can I do that in C? I hope the question is clear.
You can use strtok:
#include <stdio.h>
#include <string.h>
int main()
{
char str[] = "1 3 4 9";
char *token = strtok(str, " ");
while (token != NULL)
{
printf("%s\n", token);
token = strtok(NULL, " ");
}
return 0;
}
Your desired functionality in C.
#include <stdio.h>
#include <string.h>
int main()
{
char str[] = "1 3 4 9";
char newstr[50] = ""; /* must start as an empty string before strcat is used on it */
char *token = strtok(str, " ");
while (token != NULL)
{
printf("%s\n", token);
strcat(newstr, token);
token = strtok(NULL, " ");
}
printf("Newstr: %s",newstr);
return 0;
}
It works with any whitespace-separated string whose tokens fit in newstr.

How to fill fields in a Struct type? error: variable-sized object may not be initialized

I'm having trouble using malloc to create each row vector to store data in. Also, I can't seem to assign the struct's fields using the functions I've written.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "redo_hw4_functs.h"
typedef struct _Stores{
char name[10];
char addr[50];
char city[30];
char state[3];
} Store;
//function that creates a new matrix of size r x c
int main(int argc, char* argv[])
{
Store* pStores;
FILE* pFile;
char* stateGiven;
char buffer[180];
char* storeGiven;
char lineChar;
int lineNumb = 0;
int r;
char* tempName, tempAddr, tempCity, tempState;
if(argc < 4){
printf("Too few arguments! \n");
}
else if(argc > 4){
printf("Too many arguments! \n");
}
pFile = fopen(argv[1],"r");
for (lineChar = getc(pFile); lineChar != EOF; lineChar = getc(pFile))
{
if (lineChar == '\n') // Increment count if this character is newline
lineNumb = lineNumb + 1;
}
fclose(pFile);
pFile = fopen(argv[1],"r");
while(fgets(buffer, sizeof(buffer), pFile) != NULL)
{
for (r = 0; r < lineNumb; r++)
{
pStores = realloc(pStores, lineNumb * sizeof(Store*));
Store pStores[r] = malloc(sizeof(Store));
getName(pStores[r].name, buffer);
getAddress(pStores[r].addr, buffer);
getCity(pStores[r].city, buffer);
getState(pStores[r].state, buffer);
printf(" Store name: %s \n", pStores[r].name);
printf(" Address: %s \n", pStores[r].addr);
printf(" City: %s \n", pStores[r].city);
printf(" State: %s \n", pStores[r].state);
}
}
}
^^^ In the above block of code I made some improvements and also included realloc(). I initialized the lineNumb variable. I believe the problem lies in how I initialize each row that Store* pStores is supposed to point to.
Here are the helper functions:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "redo_hw4_functs.h"
//accepts a line of string formatted as expected and stores the store name in char file
void getName(char strName[], char strLine[])
{
char* token;
char delim[] = " ,\t\n";
token = strtok(strLine, delim);
while(token != NULL)
{
if(strcmp(token, "sears") == 0 || strcmp(token, "kmart"))
{
strcpy(strName, token);
break;
}
token = strtok(NULL, delim);
}
}
//accepts a line of string formatted as expected and stores the store address in char file
void getAddress(char strAddress[], char strLine[])
{
char* token;
char delim[] = ",\t\n";
token = strtok(strLine, delim);
while(token != NULL)
{
if(isdigit(token[0]) && isalpha(token[sizeof(token)-1]))
{
strcpy(strAddress, token);
break;
}
token = strtok(NULL, delim);
}
}
//accepts a line of string formatted as expected and stores the store city in char file
void getCity(char strCity[], char strLine[])
{
int i;
char* token;
char delim[] = ",\t\n";
token = strtok(strLine, delim);
while( token != NULL )
{
strcpy(strCity, token + strlen(token)-3);
token = strtok(NULL, delim);
}
}
//accepts a line of string formatted as expected and stores the store state in char file. Watch out! This is the hardest one because you can't rely on delimiters alone to find the state
void getState(char strState[], char strLine[])
{
int i;
char* token;
char delim[] = "\n";
token = strtok(strLine, delim);
while( token != NULL )
{
strcpy(strState, token + strlen(token)-3);
token = strtok(NULL, delim);
}
}
Here is some sample input:
Kmart, 217 Forks Of River Pkwy, Sevierville TN
Kmart, 4110 E Sprague Ave, Spokane WA
Kmart, 1450 Summit Avenue, Oconomowoc WI
Sears, 2050 Southgate Rd, Colorado Spgs CO
Sears, 1650 Briargate Blvd, Colorado Spgs CO
Sears, 3201 Dillon Dr, Pueblo CO

How to count words on each line of input in C

I'm trying to write a program that counts the words on each line.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define NUM_LINES 50
int main()
{
char text[NUM_LINES];
int count = 0;
int nw = 0;
char *token;
char *space = " ";
printf("Enter the text:\n");
while (fgets(text, NUM_LINES, stdin)){
token = strtok(text, space);
while (token != NULL){
if (strlen(token) > 0){
++nw;
}
token = strtok(NULL, space);
}
if (strcmp(text , "e") == 0 || strcmp(text , "e\n") == 0){
break;
}
}
printf("%d words", nw-1);
return 0;
}
For example if the input is:
Hello my name is John
I would like to have a snack
I like to play tennis
e
My program outputs the total number of words (17 in this case). How do I count the words on each line individually? The output I want in this example is "5 7 5".
How do I count the words on each line individually?
Simply add a local counter line_word_count.
Suggest expanding the delimiter list to cope with spaces after the last word.
char *space = " \r\n";
while (fgets(text, NUM_LINES, stdin)){
int line_word_count = 0;
token = strtok(text, space);
while (token != NULL){
if (strlen(token) > 0){
line_word_count++;
}
token = strtok(NULL, space);
}
if (strcmp(text , "e") == 0 || strcmp(text , "e\n") == 0){
break;
}
printf("%d ", line_word_count);
nw += line_word_count;
}
printf("\n%d words\n", nw);

Parsing command line statements as a list of tokens

#include <stdio.h>
#include <string.h> /* needed for strtok */
#include <unistd.h>
#include <stdlib.h>
int main(int argc, char **argv) {
char text[10000];
fgets(text, sizeof(text), stdin);
char *t;
int i;
t = strtok(text, "\"\'| ");
for (i=0; t != NULL; i++) {
printf("token %d is \"%s\"\n", i, t);
t = strtok(NULL, "\"\'| ");
}
}
This is part of the code I'm trying to write. It is supposed to separate the input into tokens.
Let's say the input is 'abc' "de f'g" hij| k "lm | no"
The output should be
token 1: "abc"
token 2: "de f'g"
token 3: "hij"
token 4: "|"
token 5: "k"
token 6: "lm | no"
I get something different but close. Is there any way I can change it to produce this format?
What you're trying to do is essentially a parser. strtok isn't a very good tool for this, and you may have better luck writing your own. strtok works on the presumption that whatever delimits your tokens is unimportant and so can be overwritten with '\0'. But you DO care what the delimiter is.
The only problem you'll have is that | syntax. The fact that you want to use it as a token delimiter and a token is likely to make your code more complicated (but not too much). Here, you have the issue that hij is followed immediately by |. If you terminate hij to get the token, you will have to overwrite the |. You either have to store the overwritten character and restore it, or copy the string out somewhere else.
You basically have three cases:
The bar | is a special delimiter that is also a token;
Quoted delimiters " and ' match everything until the next quote of the same kind;
Otherwise, tokens are delimited by whitespace.
#include <stdio.h>
#include <string.h>
char *getToken(char **sp){
static const char *sep = " \t\n";
static char vb[] = "|", vbf;
char *p, *s;
if(vbf){
vbf = 0;
return vb;
}
if (sp == NULL || *sp == NULL || **sp == '\0') return(NULL);
s = *sp;
if(*s == '"')
p = strchr(++s, '"');
else if(*s == '\'')
p = strchr(++s, '\'');
else
p = s + strcspn(s, "| \t\n");
if(p == NULL) /* unmatched quote: treat the rest of the input as the token */
p = s + strlen(s);
if(*p != '\0'){
if(*p == '|'){
*vb = vbf = '|';
}
*p++ = '\0';
p += strspn(p, sep);
}
*sp = p;
if(!*s){
vbf = 0;
return vb;
}
return s;
}
int main(int argc, char **argv) {
char text[10000];
fgets(text, sizeof(text), stdin);
char *t, *p = text;
int i;
t = getToken(&p);
for (i=1; t != NULL; i++) {
printf("token %d is \"%s\"\n", i, t);
t = getToken(&p);
}
return 0;
}

Resources