I want to compare two strings which contain some other characters as well. To eliminate those characters I am using strtok().
First I am copying strings into temp buffers, which I will use in strtok().
#include<stdio.h>
#include<string.h>
int main()
{
char ch[50]="supl-dev.google.com";
char ch1[50]="*.google.com";
printf("ch =%s\n",ch);
printf("ch1 =%s\n",ch1);
char temp_ch[50], temp_ch1[50];
strcpy(temp_ch,ch);
strcpy(temp_ch1,ch1);
char *ch_token, *ch1_token;
ch_token = strtok(temp_ch,".");
ch1_token = strtok(temp_ch1,"*");
printf("ch_token=%s\n",ch_token);
printf("ch1_token = %s\n",ch1_token);
return 0;
}
Expected results :
ch =supl-dev.google.com
ch1 =*.google.com
ch_token=supl-dev
ch1_token = *
Actual results :
ch =supl-dev.google.com
ch1 =*.google.com
ch_token=supl-dev
ch1_token = .google.com
Here I am expecting ch1_token should contain '*'.
Nope. Your expectation is wrong. You set the delimiter for ch1 to *, which means that strtok will skip the leading * in *.google.com and return .google.com as the first token. To get what you want, you have to set the delimiter to . instead.
#include<stdio.h>
#include<string.h>
int main()
{
char ch[50]="supl-dev.google.com";
char ch1[50]="*.google.com";
printf("ch =%s\n",ch);
printf("ch1 =%s\n",ch1);
char temp_ch[50], temp_ch1[50];
strcpy(temp_ch,ch);
strcpy(temp_ch1,ch1);
char *ch_token, *ch1_token;
ch_token = strtok(temp_ch,".");
ch1_token = strtok(temp_ch1,".");
printf("ch_token=%s\n",ch_token);
printf("ch1_token = %s\n",ch1_token);
return 0;
}
Now ch_token should be supl-dev and ch1_token should be *.
The thing to keep in mind is that strtok will go on to find the next token if the current token is empty.
So, when you strtok the string *.google.com with delimiter *, it finds the delimiter in the first position itself. As the current token is empty, the next token is returned which is .google.com
You are splitting ch1 by *, so the first piece is an empty string, which strtok ignores, and what it returns is the rest of the string, .google.com (the * itself is dropped because it is your delimiter).
Just change your splitting code to ch1_token = strtok(temp_ch1,"."); and successive calls will return *, google and then com.
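To see both behaviours side by side, here is a minimal sketch that collects every token strtok produces for a given delimiter set. The helper name split_tokens and its parameters are made up for illustration; only strtok itself comes from the standard library.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits s in place and stores up to max_out token
 * pointers in out. Returns the number of tokens stored.
 * s must be a writable buffer because strtok overwrites delimiters. */
size_t split_tokens(char *s, const char *delims, char *out[], size_t max_out)
{
    size_t n = 0;
    assert(s != NULL && delims != NULL);
    /* First call passes the buffer, later calls pass NULL to continue. */
    for (char *tok = strtok(s, delims);
         tok != NULL && n < max_out;
         tok = strtok(NULL, delims))
        out[n++] = tok;
    return n;
}
```

With "*.google.com" and the delimiter "." this yields three tokens (*, google, com). With the delimiter "*" it yields a single token, .google.com, because the leading * is skipped rather than returned.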
Your stated need is to search for a common sub-string within two strings.
Using strtok may work, but there are simpler ways to do this without parsing.
Have you considered using strstr()?
char ch[50]="supl-dev.google.com";
char ch1[50]="*.google.com";
if((strstr(ch, "google.com")) && (strstr(ch1, "google.com")))
{
/// sub-string exists in both strings
}
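If the real goal is to check a host name against a pattern such as *.google.com, a suffix comparison may be closer to what you need than a substring search. The following is only a sketch under that assumption: the function name matches_wildcard is hypothetical, and it does not restrict the wildcard to a single label, which real certificate matching rules would require.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper: returns true if host matches pattern.
 * A pattern starting with "*." matches any host that ends with the
 * rest of the pattern (".google.com"); any other pattern must be equal. */
bool matches_wildcard(const char *host, const char *pattern)
{
    assert(host != NULL && pattern != NULL);
    if (strncmp(pattern, "*.", 2) != 0)
        return strcmp(host, pattern) == 0;

    const char *suffix = pattern + 1;       /* skip the '*', keep the '.' */
    size_t host_len = strlen(host);
    size_t suffix_len = strlen(suffix);
    if (host_len <= suffix_len)             /* nothing left for the '*' */
        return false;
    return strcmp(host + host_len - suffix_len, suffix) == 0;
}
```

Unlike the strstr test, this does not accept a host that merely contains google.com somewhere in the middle.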
I'm relatively new to C, so any help understanding what's going on would be awesome!!!
I have a struct called Token that is as follows:
//Token struct
struct Token {
char type[16];
char value[1024];
};
I am trying to read from a file and append characters read from the file into Token.value like so:
struct Token newToken;
char ch;
ch = fgetc(file);
strncat(newToken.value, &ch, 1);
THIS WORKS!
My problem is that Token.value begins with several values I don't understand, preceding the characters that I appended. When I print the result of newToken.value to the console, I get #�����TheCharactersIWantedToAppend. I could probably figure out a band-aid solution to retroactively remove or work around these characters, but I'd rather not if I don't have to.
In analyzing the � characters, I see them as (in order from index 1-5): \330, \377, \377, \377, \177. I read that \377 is a special character for EOF in C, but also 255 in decimal? Do these values make up a memory address? Am I adding the address to newToken.value by using &ch in strncat? If so, how can I keep them from getting into newToken.value?
Note: I get a segmentation fault if I use strncat(newToken.value, ch, 1) instead of strncat(newToken.value, &ch, 1) (ch vs. &ch).
I'll try to consolidate the answers already given in the comments.
This version of the code uses strncat(), like yours, but solves the problems noted by Nick (we must initialize the target) and Dúthomhas (the second parameter to strncat() must be a string, and not a pointer to a single char). Yes, a "string" is actually a char[] and the value passed to the function is a char*, but it must point to an array of at least two chars, the last one containing a '\0'.
Please be aware that strncat(), strncpy() and all related functions are tricky. They don't copy more than N chars from the source. But strncpy() only adds the final '\0' to the target string when the source has fewer than N chars, whereas strncat() always adds it, even if the source has exactly N chars or more (edited; thanks, @Clifford).
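The difference in termination behaviour can be sketched with two small wrappers. The names copy_prefix and append_prefix are invented for this example; the buffer size requirements in the comments are the caller's responsibility.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies at most n chars of src into dst and
 * always terminates. dst must have room for n + 1 chars. */
void copy_prefix(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    strncpy(dst, src, n);   /* writes no '\0' when strlen(src) >= n */
    dst[n] = '\0';          /* so terminate explicitly */
}

/* Hypothetical helper: appends at most n chars of src to dst.
 * dst must already hold a terminated string and have room for
 * strlen(dst) + n + 1 chars. */
void append_prefix(char *dst, const char *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    strncat(dst, src, n);   /* strncat writes the final '\0' itself */
}
```

Note that strncat can therefore write up to n + 1 chars in total, which is why its size argument should be the remaining space minus one.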
#include <stdio.h>
#include <string.h>
int main() {
FILE* file = stdin; // fopen("test.txt", "r");
if (file) {
struct Token {
char type[16];
char value[1024];
};
struct Token newToken;
newToken.value[0] = '\0'; // A '\0' at the first position means "empty"
int aux;
char source[2] = ""; // A literal "" has a single char with value '\0', but this syntax fills the entire array with '\0's
while ((aux = fgetc(file)) != EOF) {
source[0] = (char)aux;
strncat(newToken.value, source, 1); // This appends AT MOST 1 CHAR (and always adds a final '\0')
}
strncat(newToken.value, "", 1); // As the source string is empty, it just adds a final '\0' (superfluous in this case)
printf("%s", newToken.value); // Never pass file data as the format string
}
return 0;
}
This other version uses an index variable and writes each single char directly into the "current" position of the target string, without using strncat(). I think it is simpler and more secure, because it doesn't mix the confusing semantics of single chars and strings.
#include <stdio.h>
#include <string.h>
int main() {
FILE* file = stdin; // fopen("test.txt", "r");
if (file) {
struct Token {
size_t index; // C does not allow default member initializers, so index is set below
char type[16];
char value[1024]; // Max size is 1023 chars + '\0'
};
struct Token newToken;
newToken.index = 0;
newToken.value[0] = '\0'; // A '\0' at the first position means "empty". This is not really necessary anymore
int aux;
while ((aux = fgetc(file)) != EOF)
// Index will stop BEFORE 1024-1 (value[1022] will be the last "real" char, leaving space for a final '\0')
if (newToken.index < sizeof newToken.value -1)
newToken.value[newToken.index++] = (char)aux;
newToken.value[newToken.index] = '\0';
printf("%s", newToken.value); // Never pass file data as the format string
}
return 0;
}
Edited: fgetc() returns an int and we should check for EOF before casting it to a char (thanks, @chqrlie).
You are appending to a string that is not initialised, so it can contain anything. The end of a string is indicated by a NUL (0) character, and in your example there happened to be one after 6 bytes, but there need not be any within the value array, so the code is seriously flawed and will result in non-deterministic behaviour.
You need to initialise the newToken instance to empty string. For example:
struct Token newToken = { "", "" } ;
or to zero initialise the whole structure:
struct Token newToken = { 0 } ;
The point is that C does not initialise non-static objects without an explicit initialiser.
Furthermore using strncat() is very inefficient and has non-deterministic execution time that depends on the length of the destination string (see https://www.joelonsoftware.com/2001/12/11/back-to-basics/).
In this case you would do better to maintain a count of the number of characters added, and write the character and terminator directly to the array. For example:
size_t index = 0 ;
int ch = 0 ;
do
{
ch = fgetc(file);
if( ch != EOF )
{
newToken.value[index] = (char)ch ;
index++ ;
newToken.value[index] = '\0' ;
}
} while( ch != EOF &&
index < sizeof(newToken.value) - 1 ) ;
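Putting the initialisation and the index-based loop together, a self-contained version might look like the sketch below. The function name read_value is made up for illustration; the struct is the one from the question.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

struct Token {
    char type[16];
    char value[1024];
};

/* Hypothetical helper: reads characters from file into tok->value until
 * EOF or until the buffer is full. Returns the number of chars stored. */
size_t read_value(FILE *file, struct Token *tok)
{
    size_t index = 0;
    int ch;                         /* int, so EOF can be told apart from data */

    assert(file != NULL && tok != NULL);
    memset(tok, 0, sizeof *tok);    /* start from a fully zeroed structure */

    /* Check for space first so no character is read and then dropped. */
    while (index < sizeof tok->value - 1 && (ch = fgetc(file)) != EOF)
        tok->value[index++] = (char)ch;

    tok->value[index] = '\0';       /* terminate once, after the loop */
    return index;
}
```

Each character is written in constant time, so the cost no longer depends on how long the destination string already is.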
I am reading a .txt file which contains data in a random form i.e. it contains integers and strings mixed in it.
Sample .txt file:
this is a 22 string 33 sorry222 stack33ing
still yysi288 2nd line
I want to read the file and pick out all valid strings, i.e. those which do not have integers concatenated with them, and store those strings in an array.
Any leads on how to differentiate?
You can use the character classification functions from <ctype.h> for that purpose.
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int checkString( const char s[] ) {
unsigned char c;
while ( ( c = *s ) && ( isalpha( c ) || isblank( c ) ) ) ++s;
return *s == '\0';
}
void printNonNumericWords(char str[]) {
char delim[] = " ";
char *ptr = strtok(str, delim);
while(ptr != NULL)
{
if (checkString(ptr))
printf("'%s'\n", ptr);
ptr = strtok(NULL, delim);
}
printf("\n");
}
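Call the function like this (strtok modifies its argument, so pass a writable array, not a string literal):
char text[] = "this is a 22 string 33 sorry222 stack33ing";
printNonNumericWords(text);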
First: I can't write the program for you. It is your task, and besides, I can't change or suggest changes to your code because you haven't provided any. I can only give you a rough outline of an algorithm for this case:
The words you see are not separate strings in the file. A line is a single string in which white space characters separate the character sequences. Each sequence looks like a word or a separate string to you, but it isn't one yet.
You have to get the whole string from the file first and store it in a char array, let's name it source, using fgets():
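#include <stdio.h>
int main(void)
{
    FILE *input;
    char source[200];
    input = fopen("text.txt","r");
    if (input == NULL) return 1; // the file could not be opened
    fgets(source, sizeof source, input); // reads one line, at most 199 chars plus '\0'
    fclose(input);
}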
After that you need to turn those "words" into separate strings by copying the characters of the source string, one after another, into other char arrays, and stop writing to an array as soon as you reach a space character or the NUL byte after the last word in the source string. Don't forget to terminate each string with a NUL byte ('\0').
Thereafter you check each of the now valid, separate strings to see whether it contains digits or not.
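The steps above could be sketched roughly as follows. Everything named here (collect_alpha_words, MAX_WORD_LEN, the 2D array layout) is an assumption made for the example, and "valid" is taken to mean "letters only".

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define MAX_WORD_LEN 32   /* hypothetical limit, longer words are truncated */

/* Hypothetical helper: walks line character by character, copies each
 * whitespace-separated word into words[], and keeps only the words that
 * consist of letters. Returns the number of words kept. */
size_t collect_alpha_words(const char *line,
                           char words[][MAX_WORD_LEN], size_t max_words)
{
    size_t count = 0;
    const char *p = line;

    assert(line != NULL);
    while (*p != '\0' && count < max_words) {
        while (isspace((unsigned char)*p))      /* skip separators */
            p++;
        if (*p == '\0')
            break;

        size_t len = 0;
        int only_letters = 1;
        while (*p != '\0' && !isspace((unsigned char)*p)) {
            if (!isalpha((unsigned char)*p))
                only_letters = 0;               /* digit or other char seen */
            if (len < MAX_WORD_LEN - 1)
                words[count][len++] = *p;
            p++;
        }
        words[count][len] = '\0';               /* terminate the copy */

        if (only_letters)
            count++;        /* keep it, otherwise the slot is reused */
    }
    return count;
}
```

For the first sample line this keeps this, is, a and string, and drops 22, 33, sorry222 and stack33ing.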
I want to parse a string into a note and an octave. For example, if the user inputs "A#4", the note (A#) should be stored in n and the octave (4) in o. Why am I getting a blank line instead of 4 as output after A#?
#include <cs50.h>
#include <stdio.h>
#include <string.h>
int main()
{
string src = get_string();
char *n;
char *o;
char *note = "ABCDEFG#b";
char *octave = "12345678";
o = strtok(src, note);
n = strtok(src, octave);
printf("%s\n", n);
printf("%s\n", o);
}
Output:
A#
Can you please point to error and suggest a solution?
strtok is not the function you want to use in this instance.
When you call it, it alters the string, replacing the character that matches the delimiter with a NUL, so you lose that character. In your code the second call, strtok(src, octave), overwrites the 4 that o points to, which is why o prints as an empty line. Also note that you're meant to call it on subsequent times with the first parameter set to NULL so that it knows you're searching for the next token in the same string.
You might want to use strspn which counts the number of characters that match your set (ie note) or strpbrk that finds the first character that matches.
Or you could traverse the string yourself and use strchr like this
char *pos;
for(pos=src;*pos!='\0';pos++)
{
if(strchr(note,*pos))
{
// *pos is a note character
}
}
Whatever you use, you'll need to build a new string based on your results as the original string won't have space to put NUL terminators inside to separate out the two parts you're looking for.
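As a sketch of the strspn route, the function below measures the note part and the octave part and copies each into its own buffer. The name split_note and the buffer parameters are hypothetical; the two character sets are the ones from the question.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: splits e.g. "A#4" into note "A#" and octave "4".
 * Returns 1 on success, 0 if the input is malformed or a buffer is too small. */
int split_note(const char *src, char *note, size_t note_cap,
               char *octave, size_t octave_cap)
{
    assert(src != NULL && note != NULL && octave != NULL);

    /* Length of the leading run of note characters. */
    size_t note_len = strspn(src, "ABCDEFG#b");
    /* Length of the run of octave digits that follows it. */
    size_t octave_len = strspn(src + note_len, "12345678");

    if (note_len == 0 || octave_len == 0)
        return 0;                               /* one part is missing */
    if (src[note_len + octave_len] != '\0')
        return 0;                               /* trailing characters */
    if (note_len >= note_cap || octave_len >= octave_cap)
        return 0;                               /* would not fit */

    memcpy(note, src, note_len);
    note[note_len] = '\0';
    memcpy(octave, src + note_len, octave_len);
    octave[octave_len] = '\0';
    return 1;
}
```

The source string is never modified, so the same input can be parsed again or printed afterwards.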
I was trying to parse strings using strtok(); I am trying to parse strings delimited by a semicolon ( ; ). But when I input a string with no semicolons to strtok(), it returns the entire string. Shouldn't it be returning NULL if there are no token matches?
This is my code:
int main(int argc, char** argv)
{
char cmd[] = "INSERT A->B B->C INSERT C->D";
char delim[] = ";";
char *result = NULL;
result = strtok(cmd,delim);
if(result == NULL)
{
printf("\n NO TOKENS\n");
}
else
{
printf("\nWe got something !! %s ",result);
}
return (EXIT_SUCCESS);
}
The output is : We got something !! INSERT A->B B->C INSERT C->D
No. A delimiter is the thing that separates tokens, so if there are no delimiters, the entire string is considered the first token.
Consider what happens if you have two tokens and then take one of them away.
if you have
a;b
then you have tokens a and b
now if you take b away...
a
you still have token a
If you read the man page (http://man7.org/linux/man-pages/man3/strtok.3.html) carefully, you will see that it says:
The strtok() function breaks a string into a sequence of zero or
more nonempty tokens.
So, basically, it breaks your input string into multiple tokens, or, when it finds none of the given delimiters in the string, it returns the whole string as a single token.
Example:
input_string || delimiter || tokens
"abc:def" || ":" || "abc" and "def"
"abcdef" || ":" || "abcdef"
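If you want to know in advance whether the first strtok call would return NULL, you can check for a non-delimiter character without touching the string. This is a small sketch; the name has_token is made up, and it relies on the fact that strtok starts by skipping leading delimiters.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a first call strtok(s, delims) would
 * yield a token, 0 if it would return NULL. s is not modified. */
int has_token(const char *s, const char *delims)
{
    assert(s != NULL && delims != NULL);
    /* strspn gives the length of the leading run of delimiters. If that
     * run reaches the terminator, there is nothing left to tokenize. */
    return s[strspn(s, delims)] != '\0';
}
```

So NULL on the first call means the string is empty or made up only of delimiters, not that the delimiter is absent.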
I'm having a lot of trouble figuring this out. I have a C string, and I want to remove the first part of it. Let's say it's: "Food,Amount,Calories". I want to copy out each one of those values, but not the commas. I find the comma, and return the position of the comma to my method. Then I use
strncpy(aLine.field[i], theLine, end);
To copy "theLine" to my array at position "i", with only the first "end" characters (for the first time, "end" would be 4, because that is where the first comma is). But then, because it's in a Loop, I want to remove "Food," from the array, and do the process over again. However, I cannot see how I can remove the first part (or move the array pointer forward?) and keep the rest of it. Any help would be useful!
What you need is to chop off strings with comma as your delimiter.
You need strtok to do this. Here's an example code for you:
#include <stdio.h>
#include <string.h>
int main (int argc, const char * argv[]) {
const char *s = "asdf,1234,qwer";
char str[15];
strcpy(str, s);
printf("\nstr: %s", str);
char *tok = strtok(str, ",");
printf("\ntok: %s", tok);
tok = strtok(NULL, ",");
printf("\ntok: %s", tok);
tok = strtok(NULL, ",");
printf("\ntok: %s", tok);
return 0;
}
This will give you the following output:
str: asdf,1234,qwer
tok: asdf
tok: 1234
tok: qwer
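Note that strtok does not keep the original string intact: it overwrites each separator with '\0'. If you have to keep the original, run it on a copy. If not, you can replace each separator with '\0' yourself and use the resulting strings directly: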
char s_RO[] = "abc,123,xxxx", *s = s_RO;
while (s){
char* old_str = s;
s = strchr(s, ',');
if (s){
*s = '\0';
s++;
};
printf("found string %s\n", old_str);
};
The function you might want to use is strtok()
Here is a nice example - http://www.cplusplus.com/reference/clibrary/cstring/strtok/
Personally, I would use strtok().
I would not recommend removing extracted tokens from the string. Removing part of a string requires copying the remaining characters, which is not very efficient.
Instead, you should keep track of your positions and just copy the sections you want to the new string.
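Keeping track of the position could look like the sketch below: a cursor points at the start of the next field, and each call copies one field out and moves the cursor past the comma. The name next_field and its parameters are invented for this example.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: copies the field starting at *cursor into out
 * (truncating to out_cap - 1 chars) and advances *cursor past the comma.
 * Returns 1 if a field was produced, 0 when no fields remain.
 * The source line is never modified. */
int next_field(const char **cursor, char *out, size_t out_cap)
{
    assert(cursor != NULL && out != NULL && out_cap > 0);

    const char *start = *cursor;
    if (start == NULL)
        return 0;                           /* previous call hit the end */

    const char *comma = strchr(start, ',');
    size_t len = comma ? (size_t)(comma - start) : strlen(start);
    if (len >= out_cap)
        len = out_cap - 1;                  /* truncate to fit */

    memcpy(out, start, len);
    out[len] = '\0';

    /* Move forward instead of deleting anything from the string. */
    *cursor = comma ? comma + 1 : NULL;
    return 1;
}
```

Starting with the cursor at "Food,Amount,Calories", three calls give Food, Amount and Calories, and the fourth returns 0. Unlike strtok, this keeps empty fields, so "a,,b" yields three fields.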
But, again, I would use strtok().
If you know where the comma is, you can just keep reading the string from that point on.
For example:
void readTheString(const char *theLine)
{
    const char *wordStart = theLine;
    const char *wordEnd = theLine;
    int i = 0;
    while (*wordStart) // while we haven't reached the null terminator
    {
        while (*wordEnd != ',' && *wordEnd != '\0') // stop at a comma or at the end of the string
            wordEnd++;
        // ... copy the substring ranging from wordStart up to (not including) wordEnd
        if (*wordEnd == '\0')
            break; // that was the last field
        wordStart = ++wordEnd; // start the next word
    }
}
or something like that.
The inner loop has to stop at the null terminator as well, otherwise it runs off the end of the string unless the string also ends with a ','... but you get the idea.
anyway, using strtok would probably be a better idea.