Let's say I have to parse some phone numbers that can have different delimiters.
Example: 01/555555 01/555-5555
Can I use strtok() in C and give it a regex as the delimiter parameter, one that would cover all the different possible delimiters?
No, strtok() does not support regular expressions. You don't need one here, though: the second argument is a set of characters, and every character in it is treated as a delimiter, so you simply pass a string containing all the possible delimiters. The documentation describes this behavior.
Here is an example:
#include <stdio.h>
#include <string.h>
int main(void)
{
    char example[] = "exa$mple#str#ing";
    char *token;

    /* Every character in the second argument is a delimiter. */
    token = strtok(example, "#$");
    if (token == NULL)
        return -1;
    do
    {
        fprintf(stdout, "%s\n", token);
    } while ((token = strtok(NULL, "#$")) != NULL);
    return 0;
}
As a complement to iharob's answer, sscanf may sometimes be an alternative to strtok. Here's an illustration with the given example:
#include <stdio.h>
int main(void) {
const char *s = "01/555555 01/555-5555";
int a, b, c, d, e;
int ret = sscanf(s, "%02d/%d %02d/%d-%d", &a, &b, &c, &d, &e);
if (ret != 5) {
printf("The string is in bad format.\n");
} else {
printf("%02d/%d %02d/%d-%d\n", a, b, c, d, e);
}
return 0;
}
Like strtok, sscanf doesn't support regex, but it lets you extract all the data in a single call. It works exactly like scanf, except that it reads from a given string instead of from standard input.
http://linux.die.net/man/3/sscanf
In an effort of solving a textbook problem, I'm trying to create a case insensitive version of the function called strstr() which is in the C language. So far, I've run into two problems. The first problem being that when I make the case insensitive version of strstr() it worked, but it didn't stop at the first matching string and continued to return the string even if they didn't match.
strstr() is supposed to see the first instance of a matching character up to n counts specified and then stop. Like if I wrote: "Xehanort" in string A and "Xemnas" in string B and specified 4, as the number, it would return Xe.
The idea behind the case insensitive version is that I can write : "Xehanort" in one string and "xemnas" in the next string and have it return Xe.
However, I've run into a new problem in the new code I've tried: the function doesn't seem to run at all. I've tested this, and it turns out the function crashes, and I'm not sure how to make it stop.
I've tried editing the code, I've tried using different for loops but figured that the code doesn't need to be too sophisticated yet, I've also tried different code entirely than what you are going to read, but that resulted in the problem mentioned earlier.
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#include <limits.h>
#define MAX 100
char *stristr4(const char *p1, const char *p2, size_t num);
int main() {
char c[MAX], d[MAX];
printf("Please enter the string you want to compare.");
gets(c);
printf("Please enter the next string you want to compare.");
gets(d);
printf("The first string to be obtained from \n%s, and \n%s is \n%s",
c, d, stristr4(c, d, MAX));
}
char *stristr4(const char *p1, const char *p2, size_t num) {
const char *str1 = p1;
const char *str2 = p2;
char *str3;
int counter = 0;
for (int i = 0; i < num; i++) {
for (int j = 0; j < num; j++) {
if (tolower(str1[i]) == tolower(str2[j])) {
str3[i] = str1[i];
counter++;
} else {
if (counter > 0) {
break;
} else
continue;
}
}
}
return str3;
}
The code you see will ask for the strings you want to input. Ideally, it should return the input.
Then it should do the stristr function and return the first instance of matching string with case insensitivity.
However, the function I've created doesn't even seem to run.
Your code has undefined behavior (in this case causing a segmentation fault), because you try to store the resulting string via an uninitialized pointer str3.
The standard function strstr returns a pointer to the matching subsequence; you should do the same. The third argument is useless if the first and second arguments are proper C strings.
Here is a modified version:
char *stristr4(const char *p1, const char *p2) {
for (;; p1++) {
for (size_t i = 0;; i++) {
if (p2[i] == '\0')
return (char *)p1;
if (tolower((unsigned char)p1[i]) != tolower((unsigned char)p2[i]))
break;
}
if (*p1 == '\0')
return NULL;
}
}
Notes:
function tolower(), like the other functions from <ctype.h>, takes an int argument that must have the value of an unsigned char or the special negative value EOF. char arguments must be converted to unsigned char to avoid undefined behavior for negative char values. char can be signed or unsigned by default, depending on the platform and the compiler's settings.
you should never use gets(). This function is obsolete and cannot be used safely with uncontrolled input. Use fgets() and strip the trailing newline:
if (fgets(c, sizeof c, stdin)) {
c[strcspn(c, "\n")] = '\0'; // strip the trailing newline if any
...
}
A third string could be passed to the function, which then fills it with the matching characters.
Use fgets instead of gets.
#include <ctype.h>
#include <stdio.h>
#include <string.h>
#define MAX 100
int stristr4(const char* p1, const char *p2, char *same);
int main( void)
{
int comp = 0;
char c[MAX] = "", d[MAX] = "", match[MAX] = "";//initialize to all zero
printf ( "Please enter the string you want to compare. ");
fflush ( stdout);//printf has no newline so make sure it prints
fgets ( c, MAX, stdin);
c[strcspn ( c, "\n")] = 0;//remove newline
printf ( "Please enter the next string you want to compare. ");
fflush ( stdout);//printf has no newline so make sure it prints
fgets ( d, MAX, stdin);
d[strcspn ( d, "\n")] = 0;//remove newline
comp = stristr4 ( c, d, match);
printf ( "Comparison of \n%s, and \n%s is \n%d\n", c, d, comp);
if ( *match) {
printf ( "The matching string to be obtained from \n%s, and \n%s is \n%s\n"
, c, d, match);
}
return 0;
}
int stristr4 ( const char *p1,const char *p2, char *same)
{
//pointers not pointing to zero and tolower values are equal
while ( *p1 && *p2 && tolower ( (unsigned char)*p1) == tolower ( (unsigned char)*p2))
{
*same = tolower ( (unsigned char)*p1);//store the matching character in lower case
same++;//increment to next character
*same = 0;//zero terminate
p1++;
p2++;
}
return tolower ( (unsigned char)*p1) - tolower ( (unsigned char)*p2);//return case-insensitive difference
}
I'm new to the C language and I need help with string functions.
I have a string variable called mcname from which I would like to extract the characters between special characters.
For example:
*mcname="G2-99-77"
I expect the output to be 99 as this is between the - characters.
How can I do this in C please?
Traverse the string (with a walking pointer) until you hit a special character.
Then start copying the characters into a separate array until you hit the next special character (place a null character there when you encounter the special character the second time).
You can do this by using strtok or sscanf
using sscanf:
#include <stdio.h>
int main()
{
char str[64];
int out;
char mcname[] = "G2-99-77";
/* %63 keeps the prefix within str; both conversions must succeed */
if (sscanf(mcname, "%63[^-]-%d", str, &out) == 2) {
printf("%d\n", out);
}
return 0;
}
Using strtok:
#include <stdio.h>
#include <string.h>
int main()
{
char *str;
int out;
char mcname[] = "G2-99-77";
str = strtok(mcname, "-");
str = strtok (NULL, "-"); /* second token; NULL if there is none */
if (str == NULL)
return 1;
out = atoi(str);
printf("%d\n", out);
return 0;
}
sscanf() has great flexibility. Used correctly, code may readily parse a string.
Be sure to test the sscanf() return value.
%2[A-Z0-9] means to scan up to 2 characters from the set 'A' to 'Z' and '0' to '9'.
Use %2[^-] if the goal is to accept any 2 characters other than '-'.
char *mcname = "G2-99-77";
char prefix[3];
char middle[3];
char suffix[3];
int cnt = sscanf(mcname, "%2[A-Z0-9]-%2[A-Z0-9]-%2[A-Z0-9]", prefix, middle,
suffix);
if (cnt != 3) {
puts("Parse Error");
}
else {
printf("Prefix:<%s> Middle:<%s> Suffix:<%s>\n", prefix, middle, suffix);
}
I'm building a linked list and need your assistance please as I'm new to C.
I need to input a string that looks like this: (word)_#_(year)_#_(DEFINITION(UPPER CASE))
Ex: Enter a string
Input: invest_#_1945_#_TRADE
Basically, I'm looking to build a function that scans the DEFINITION and gives me back the word it relates to.
Enter a word to search in the dictionary
Input: TRADE
Output: Found "TRADE" in the word "invest"
So far I have managed to split the string using the strtok() function, but I'm not sure how to print the first word afterwards.
Here's what I could come up with:
char split(char words[99],char *p)
{
p=strtok(words, "_#_");
while (p!=NULL)
{
printf("%s\n",p);
p = strtok(NULL, "_#_");
}
return 0;
}
int main()
{
char hello[99];
char *s = NULL;
printf("Enter a string you want to split\n");
scanf("%s", hello);
split(hello,s);
return 0;
}
Any ideas on what should I do?
I reckon that your problem is how to extract the three bits of information from your formatted string.
The function strtok does not work as you think it does: The second argument is not a literal delimiting string, but a string that serves as a set of characters that are delimiters.
In your case, sscanf seems to be the better choice:
#include <stdlib.h>
#include <stdio.h>
int main()
{
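const char *line = "invest_#_1945 _#_TRADE ";
char word[41]; /* %40[^_] stores up to 40 chars plus the terminating '\0' */
int year;
char def[41];
int n;
/* the space before the second _#_ skips optional white space after the year */
n = sscanf(line, "%40[^_]_#_%d _#_%40s", word, &year, def);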
if (n == 3) {
printf("word: %s\n", word);
printf("year: %d\n", year);
printf("def'n: %s\n", def);
} else {
printf("Unrecognized line.\n");
}
return 0;
}
The function sscanf examines a given string according to a given pattern. Roughly, that pattern consists of format specifiers that begin with a percent sign, of spaces which denote any amount of white-space characters (including none) and of other characters that have to be matched verbatim. The format specifiers yield a result, which has to be stored. Therefore, for each specifier, a result variable must be given after the format string.
In this case, there are several chunks:
%40[^_] reads up to 40 characters that are not the underscore into a char array (the array needs room for one more byte, the terminating null character). This is a special case of reading a string. Strings in sscanf are really words and may not contain white space. The underscore, however, would be part of a string, so in order not to eat up the underscore of the first delimiter, you have to use the notation [^(chars)], which means: any sequence of chars that does not contain the given chars. (The caret does the negation here; [(chars)] would mean any sequence of the given chars.)
_#_ matches the first delimiter literally, i.e. only if the next chars are underscore, hash mark, underscore.
%d reads a decimal number into an integer. Note that the address of the integer has to be given here with &.
_#_ matches the second delimiter.
%40s reads a string of up to 40 non-whitespace characters into a char array.
The function returns the number of matched results, which should be three if the line is valid. The function sscanf can be cumbersome, but is probably your best bet here for quick and dirty input.
#include <stdio.h>
#include <string.h>
char *strtokByWord_r(char *str, const char *word, char **store){
char *p, *ret;
if(str != NULL){
*store = str;
}
if(*store == NULL) return NULL;
p = strstr(ret=*store, word);
if(p){
*p='\0';
*store = p + strlen(word);
} else {
*store = NULL;
}
return ret;
}
char *strtokByWord(char *str, const char *word){
static char *store = NULL;
return strtokByWord_r(str, word, &store);
}
int main(){
char input[]="invest_#_1945_#_TRADE";
char *array[3];
char *p;
int i, size = sizeof(array)/sizeof(char*);
for(i=0, p=input;i<size;++i){
if(NULL!=(p=strtokByWord(p, "_#_"))){
array[i]=p;//strdup(p);
p=NULL;
} else {
array[i]=NULL;
break;
}
}
for(i = 0;i<size;++i)
printf("array[%d]=\"%s\"\n", i, array[i]);
/* result
array[0]="invest"
array[1]="1945"
array[2]="TRADE"
*/
return 0;
}
There is a string with a line of text. Let's say:
char * line = "Foo|bar|Baz|23|25|27";
I would have to find the numbers.
I was thinking of something like this:
If the given char is a number, let's put it into a temporary char array. (buffer)
If the next character is NOT a number, let's make the buffer a new int.
The problem is... how do I find numbers in a string like this?
(I'm not familiar with C99/gcc that much.)
Compiler used: gcc 4.3 (Environment is a Debian Linux stable.)
I would approach it as follows:
Considering '|' as the separator, tokenize the line of text, i.e. split the line into multiple fields.
For each token:
If the token is numeric:
Convert the token to a number
Some library functions that might be useful are strtok, isdigit, atoi.
One possible implementation for the approach suggested in this answer, based on sscanf.
#include <stdio.h>
#include <string.h>
void find_integers(const char* p) {
size_t s = strlen(p)+1;
char buf[s];
const char * p_end = p+s;
int n;
/* tokenize string */
for (; p < p_end && sscanf(p, "%[^|]%n", buf, &n) == 1; p += (n+1))
{
int x;
/* try to parse an integer */
if (sscanf(buf, "%d", &x) == 1) {
printf("got int :) %d\n", x);
}
else {
printf("got str :( %s\n", buf);
}
}
}
int main() {
const char * line = "Foo|bar|Baz|23|25|27";
find_integers(line);
}
Output:
$ gcc test.c && ./a.out
got str :( Foo
got str :( bar
got str :( Baz
got int :) 23
got int :) 25
got int :) 27
I'm trying to extract a string and an integer out of a string using sscanf:
#include<stdio.h>
int main()
{
char Command[20] = "command:3";
char Keyword[20];
int Context;
sscanf(Command, "%s:%d", Keyword, &Context);
printf("Keyword:%s\n",Keyword);
printf("Context:%d",Context);
getch();
return 0;
}
But this gives me the output:
Keyword:command:3
Context:1971293397
I'm expecting this output:
Keyword:command
Context:3
Why does sscanf behave like this? Thanks in advance for your help!
sscanf expects the %s tokens to be whitespace delimited (tab, space, newline), so you'd have to have a space between the string and the :
As an ugly-looking hack, you can try:
sscanf(Command, "%19[^:]:%d", Keyword, &Context);
which forces the string token to stop at the colon; the width 19 keeps it within the 20 bytes of Keyword.
If you aren't particular about using sscanf, you could always use strtok, since what you want is to tokenize your string.
char Command[20] = "command:3";
char* key;
int val;
key = strtok(Command, ":");
val = atoi(strtok(NULL, ":"));
printf("Keyword:%s\n",key);
printf("Context:%d\n",val);
This is much more readable, in my opinion.
Use the %[ conversion here. See the manual page of scanf: http://linux.die.net/man/3/scanf
#include <stdio.h>
int main()
{
char *s = "command:3";
char s1[0xff];
int d;
sscanf(s, "%[^:]:%d", s1, &d);
printf("here: %s:%d\n", s1, d);
return 0;
}
which gives "here: command:3" as its output.