Read the characters between special characters in C

I'm new to C and need help with string functions.
I have a string variable called mcname, and I would like to read the characters between its special characters.
For example:
*mcname="G2-99-77"
I expect the output to be 99 as this is between the - characters.
How can I do this in C please?

Walk the string with a pointer until you hit a special character.
Then start copying the characters into a separate array until you hit the next special character, and place a null character in the array when you reach that second special character.

You can do this by using strtok or sscanf
Using sscanf:
#include <stdio.h>
int main(void)
{
char str[64];
int out;
char mcname[] = "G2-99-77";
/* %63[^-] keeps the prefix within str; two conversions are expected */
if (sscanf(mcname, "%63[^-]-%d", str, &out) == 2)
printf("%d\n", out);
return 0;
}
Using strtok:
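#include <stdio.h>
#include <stdlib.h> /* atoi */
#include <string.h>
int main(void)
{
char *str;
int out;
char mcname[] = "G2-99-77";
str = strtok(mcname, "-"); /* first token: "G2" */
str = strtok(NULL, "-"); /* second token: "99" */
if (str == NULL)
return 1; /* there was no second token */
out = atoi(str);
printf("%d\n", out);
return 0;
}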

sscanf() has great flexibility. Used correctly, code may readily parse a string.
Be sure to test the sscanf() return value.
%2[A-Z0-9] means to scan up to 2 characters from the set 'A' to 'Z' and '0' to '9'.
Use %2[^-] if code goal is any 2 char other than '-'.
char *mcname = "G2-99-77";
char prefix[3];
char middle[3];
char suffix[3];
int cnt = sscanf(mcname, "%2[A-Z0-9]-%2[A-Z0-9]-%2[A-Z0-9]", prefix, middle,
suffix);
if (cnt != 3) {
puts("Parse Error"); /* puts() appends the newline itself */
}
else {
printf("Prefix:<%s> Middle:<%s> Suffix:<%s>\n", prefix, middle, suffix);
}

Related

sscanf get string until second symbol (include one)

How to get string until second symbol through sscanf?
for example:
char *str = "struct1.struct2.struct3.int";
char buf[256] = {0};
sscanf(str, "", buf); // is there a format string that reads up to the second dot?
Not generally possible with a single use of sscanf().
Certainly, without a lot of work, a more involved use of sscanf() will work for many input strings, yet fail for select ones [1]. sscanf() is not the best fit for this task.
strchr() and strcspn() are better suited.
#include <string.h>
#include <stdlib.h>
// Return offset to 2nd needle occurrence
// or end of string, if not found.
size_t foo(const char *haystack, const char *needle) {
size_t offset = strcspn(haystack, needle);
if (haystack[offset]) {
offset++;
offset += strcspn(haystack + offset, needle);
}
return offset;
}
#include <stdio.h>
int main() {
const char *haystack = "struct1.struct2.struct3.int";
printf("<%.*s>\n", (int) foo(haystack, "."), haystack);
}
Output
<struct1.struct2>
[1] Consider: "struct1.struct2", "struct1..", "..struct2", ".struct2.", "..", ".", "".
You can use a * to tell scanf to ignore an element:
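#include <stdio.h>
const char *str = "struct1.struct2.struct3.int";
int main(void) {
char buf[256];
/* %*[^.] skips the first field; %255[^.] stores the second, bounded by buf */
int i = sscanf(str, "%*[^.].%255[^.]", buf);
if (i == 1)
printf("%d >%s<\n", i, buf);
return 0;
}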
This outputs as expected:
1 >struct2<
because exactly 1 element was assigned even if another one was parsed.

Strtok outputting just a part of the string

#include <stdio.h>
#include <string.h>
int main(){
char name[] = "eseumdesconhecidolheoferecerflores.issoeimpulse.cities";
char *str;
printf("%s\n", name);
str = strtok(name, ".cities");
printf("%s\n", str);
return 0;
}
This is the output:
eseumdesconhecidolheoferecerflores.issoeimpulse.cities
umd
I have no idea what is happening at all. What I want is for the output of strtok to be a pointer to "eseumdesconhecidolheoferecerflores.issoeimpulse"
The delimiter argument to strtok is a string containing individual characters used to separate the string.
You specified delimiters ., c, i, t, e, and s.
So it's no surprise the output is umd for the first token, since it is surrounded by characters in your delimiter string.
If you want to find a whole string, you should use strstr instead.
For example:
char name[] = "eseumdesconhecidolheoferecerflores.issoeimpulse.cities";
char *pos;
pos = strstr(name, ".cities");
if (pos)
{
*pos = '\0';
printf("%s\n", name);
}

c, delete words which contain digits from a string

I need to delete all words that contain digits from the string.
E.g. if input is abdgh 67fgh 32ghj hj dfg43 11 fg, output should be abdgh hj fg.
I thought of using while( text[i] != ' '), but I don't know how to continue it for the rest of the string (after the first whitespace).
I don't have any other idea, and couldn't find anything by googling. Please, help me!
Here, I gave it a try. It works fine for me. I tried to explain the logic throughout the code via comments. Hope it helps.
#include <stdio.h>
#include <string.h>
int containsNum(char * str);
int main()
{
char str[] = "abdgh 67fgh 32ghj hj dfg43 11 fg"; // input string
char newstr[100] = ""; //new string to create with filtered data
char * pch; //temp string to use in strtok
printf("given string : %s\n",str );
pch = strtok (str," ");
while (pch != NULL)
{
if(!containsNum(pch))// creation of new string with strcat
{ // if the current word not contains any number
strcat(newstr,pch);
strcat(newstr," "); //adding a space between words for readability
}
pch = strtok (NULL, " ");
}
printf("modified string : %s\n", newstr );
return 0;
}
//function containsNum
//gets a string and checks if it has any numbers in it
//returns 1 if so , 0 otherwise
int containsNum(char * str)
{
size_t i, size = strlen(str);
for(i=0; i<size ; ++i)
{
if(str[i] >= '0' && str[i] <= '9') // the character is a decimal digit
return 1;
}
return 0;
}
Algorithm:
1- Break the input string into smaller components, also called tokens. For example, for the string abdgh 67fgh 32ghj hj dfg43 11 fg the tokens are abdgh, 67fgh, 32ghj, hj, dfg43, 11 and fg.
2- These tokens can be formed using the strtok function, which is declared as
char * strtok ( char * str, const char * delimiters );. The str in the first argument is the input string, which in the code presented below is string1. The second argument, delimiters, defines where to divide the input string into tokens.
For instance, a space as a delimiter divides the input string wherever a space is encountered, which is how the string is divided in the code.
3- Since your program needs to delete the words that contain digits, we can use the isdigit() function to check exactly that.
WORKING CODE:
#include <string.h>
#include <ctype.h>
#include<stdio.h>
int main ()
{
char output[100]="";
int counter;
int check=0; /* An integer variable which takes the value of "1" whenever a digit
is encountered in one of the smaller strings or tokens.
So, whenever check is 1 for any of the tokens that token is to be ignored, that is,
not shown in the output string.*/
char string1[] = "abdgh 67fgh 32ghj hj dfg43 11 fg";
char delimiters[] = " ";//A whitespace character functions as a delimiter in the program
char * token;//Tokens are the sub-strings or the smaller strings which are part of the input string.
token=strtok(string1,delimiters);/*The first strtok call forms the first token/substring which for the input
given would be abdgh*/
while(token!=NULL)/*For the last substring(token) the strtok function call will return a NULL pointer, which
also indicates the last of the tokens(substrings) that can be formed for a given input string.
The while loop finishes when the NULL pointer is encountered.*/
{
for(counter=0;counter<(int)strlen(token);counter++)/*This for loop iterates through each character of the token.
Example: In case of abdgh, it will first check for 'a',
then 'b', then 'd' and so on..*/
{
if(isdigit((unsigned char)token[counter]))/*This is to check if a digit has been encountered inside a token(substring).
isdigit() returns a nonzero value (not necessarily a positive one) for a digit.
If a digit is encountered we make check equal to 1 and break our loop, as
then that token is to be ignored and there is no real need to iterate
through the rest of the elements of the token*/
{
check=1;
break;
}
}
if(check==1) /* Outside the for loop, if check is equal to one that means we have to ignore that token and
it is not to be made a part of the output string. So we just concatenate(join) an
empty string ( represented by "" ) with the output string, which leaves it unchanged*/
{
strcat(output,"");
check=0;
}
else /*If a token does not contain any digit we simply make it a part of the output string
by concatenating(joining) it with the output string. We also add a space for clarity.*/
{
strcat(output,token);
strcat(output," ");
}
token = strtok( NULL, delimiters ); /*This line of code forms a new token(substring) every time it is executed
inside the while loop*/
}
printf( "Output string is:: %s\n", output ); //Prints the final result
return 0;
}
#include <stdio.h>
#include <ctype.h>
#include <stdbool.h>
char *filter(char *str){
char *p, *r;
p = r = str;
while(*r){
char *prefetch = r;
bool contain_digit = false;
while(!isspace((unsigned char)*prefetch) && *prefetch){
if(contain_digit)
++prefetch;
else if(isdigit((unsigned char)*prefetch++))
contain_digit = true;
}
if(contain_digit){
r = prefetch;
}else {
while(r < prefetch){
*p++ = *r++;
}
}
if(!*r)
break;
if(p == str || p[-1] == *r) /* p[-1] is out of bounds while nothing has been written yet */
++r;
else
*p++ =*r++;
}
*p = '\0';
return str;
}
int main(void) {
char text[] = "abdgh 67fgh 32ghj hj dfg43 11 fg";
printf("%s\n", filter(text));//abdgh hj fg
return 0;
}

How do I make this shell to parse the statement with quotes around them in C?

I am trying to make this shell parse. How do I make the program implement parsing in a way so that commands that are in quotes will be parsed based on the starting and ending quotes and will consider it as one token? During the second while loop where I am printing out the tokens I think I need to put some sort of if statement, but I am not too sure. Any feedback/suggestions are greatly appreciated.
#include <stdio.h> //printf
#include <unistd.h> //isatty
#include <string.h> //strlen,sizeof,strtok
int main(int argc, char **argv[]){
int MaxLength = 1024; //size of buffer
int inloop = 1; //loop runs forever while 1
char buffer[MaxLength]; //buffer
bzero(buffer,sizeof(buffer)); //zeros out the buffer
char *command; //character pointer of strings
char *token; //tokens
const char s[] = "-,+,|, ";
/* part 1 isatty */
if (isatty(0))
{
while(inloop ==1) // check if the standard input is from terminal
{
printf("$");
command = fgets(buffer,sizeof(buffer),stdin); //fgets(string of char pointer,size of,input from where
token = strtok(command,s);
while (token !=NULL){
printf( " %s\n",token);
token = strtok(NULL, s); //checks for elements
}
if(strcmp(command,"exit\n")==0)
inloop =0;
}
}
else
printf("the standard input is NOT from a terminal\n");
return 0;
}
For an arbitrary command-line syntax, strtok is not the best function. It works for simple cases, where the words are delimited by special characters or white space, but there will come a time where you want to split something like this ls>out into three tokens. strtok can't handle this, because it needs to place its terminating zeros somewhere.
Here's a quick and dirty custom command-line parser:
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <ctype.h>
int error(const char *msg)
{
printf("Error: %s\n", msg);
return -1;
}
int token(const char *begin, const char *end)
{
printf("'%.*s'\n", (int)(end - begin), begin); /* the precision for %.*s must be an int */
return 1;
}
int parse(const char *cmd)
{
const char *p = cmd;
int count = 0;
for (;;) {
while (isspace((unsigned char)*p)) p++;
if (*p == '\0') break;
if (*p == '"' || *p == '\'') {
int quote = *p++;
const char *begin = p;
while (*p && *p != quote) p++;
if (*p == '\0') return error("Unmatched quote");
count += token(begin, p);
p++;
continue;
}
if (strchr("<>()|", *p)) {
count += token(p, p + 1);
p++;
continue;
}
if (isalnum((unsigned char)*p)) {
const char *begin = p;
while (isalnum((unsigned char)*p)) p++;
count += token(begin, p);
continue;
}
return error("Illegal character");
}
return count;
}
This code understands words separated by white-space, words enclosed in single or double quotation marks and single-character operators. It doesn't understand escaped quotation marks inside quotes or non-alphanumeric characters, such as the dot, in words.
The code is not hard to understand and you can extend it easily to understand double-char operators such as >> or comments.
If you want to escape quotation marks, you'll have to recognise the escape character in parse and unescape it and possibly other escape sequences in token.
First, you've declared argv to be an array of pointers to... pointers. In fact, it is an array of pointers to chars. So:
int main(int argc, char **argv){
It is tempting to reach for [], which led to the incorrect declaration here; for strings, the more common idiom in C/C++ is pointer syntax, e.g.:
const char* s = "-+| ";
FWIW.
Also, note that fgets() will return NULL when it hits end of file (e.g., the user types CTRL-D on *nix or CTRL-Z on DOS/Windows). You probably don't want a segmentation fault when that happens.
Also, bzero() is a nonportable function (you probably don't care in this context) and the C compiler will happily initialize an array to zeroes for you if you ask it to (possibly worth caring about; syntax demonstrated below).
Next, as soon as you allow quoted strings, the next language question that immediately arises is: "how do I quote a quote?". Then, you are immediately out of the territory that can be handled cleanly with strtok(). I'm not 100% sure how you want to break your string into tokens. Using strtok() in the way you do, I think the string "a|b" would produce two tokens, "a" and "b", making you overlook the "|". You're treating "|" and "-" and "+" like whitespace, to be ignored, which is not generally what a shell does. For example, given this command-line:
echo 'This isn''t so hard' | cp -n foo.h .. >foo.out
I would probably want to get the following list of tokens:
echo
'This isn''t so hard'
|
cp
-n
foo.h
..
>
foo.out
Usually, characters like '+' and '-' are not special for most shells' tokenizing process (unlike '|' and '&' and '<', etc. which are instructions to the shell that the spawned command never sees). They get passed onto the application that is then free to decide "'-' indicates this word is an option and not a filename" or whatever.
What follows is a version of your code that produces the output I described (which may or may not be exactly what you want) and allows either double or single-quoted arguments (trivial to extend to handle back-ticks too) that can contain quote marks of the same kind, etc.
#include <stdio.h> //printf
#include <unistd.h> //isatty
#include <string.h> //strlen,sizeof,strtok
#define MAXLENGTH 1024
int main(int argc, char **argv){
int inloop = 1; //loop runs forever while 1
char buffer[MAXLENGTH] = {'\0'}; //compiler inits entire array to NUL bytes
// bzero(buffer,sizeof(buffer)); //zeros out the buffer
char *command; //character pointer of strings
char *token; //tokens
char* rover;
const char* StopChars = "|&<> ";
size_t toklen;
/* part 1 isatty */
if (isatty(0))
{
while(inloop ==1) // check if the standard input is from terminal
{
printf("$");
token = command = fgets(buffer,sizeof(buffer),stdin); //fgets(string of char pointer,size of,input from where
if(command)
while(*token)
{
// skip leading whitespace
while(*token == ' ')
++token;
rover = token;
// if possible quoted string
if(*rover == '\'' || *rover == '\"')
{
char Quote = *rover++;
while(*rover)
if(*rover != Quote)
++rover;
else if(rover[1] == Quote)
rover += 2;
else
{
++rover;
break;
}
}
// else if special-meaning character token
else if(strchr(StopChars, *rover))
++rover;
// else generic token
else
while(*rover)
if(strchr(StopChars, *rover))
break;
else
++rover;
toklen = (size_t)(rover-token);
if(toklen)
printf(" %*.*s\n", (int)toklen, (int)toklen, token); // width and precision must be int
token = rover;
}
if(!command || strcmp(command,"exit\n")==0) // also leave the loop at end of input
inloop =0;
}
}
else
printf("the standard input is NOT from a terminal\n");
return 0;
}
Regarding your specific request: commands that are in quotes will be parsed based on the starting and ending quotes.
You can use strtok() by tokenizing on the " character. Here's how:
char a[]={"\"this is a set\" this is not"};
char *buf;
buf = strtok(a, "\"");
In that code snippet, buf will point to the text this is a set (without the quotes).
Note the use of \ to escape the " character so that it can be used as a token delimiter.
Also, not your main issue, but you need to:
Change this:
const char s[] = "-,+,|, "; //strtok will parse on -,+| and a " " (space)
To:
const char s[] = "-+| "; //strtok will parse on only -+| and a " " (space)
strtok() treats every character in the delimiter string as a delimiter, including ","

Using Pointers and strtok()

I'm building a linked list and need your assistance please as I'm new to C.
I need to input a string that looks like this: (word)_#_(year)_#_(DEFINITION(UPPER CASE))
Ex: Enter a string
Input: invest_#_1945_#_TRADE
Basically I'm looking to build a function that scans the DEFINITION and gives me back the word it relates to.
Enter a word to search in the dictionary
Input: TRADE
Output: Found "TRADE" in the word "invest"
So far I have managed to split the string using the strtok() function, but I'm not sure how to go on from there and print the first word.
Here's what I could come up with:
char split(char words[99],char *p)
{
p=strtok(words, "_#_");
while (p!=NULL)
{
printf("%s\n",p);
p = strtok(NULL, "_#_");
}
return 0;
}
int main()
{
char hello[99];
char *s = NULL;
printf("Enter a string you want to split\n");
scanf("%s", hello);
split(hello,s);
return 0;
}
Any ideas on what should I do?
I reckon that your problem is how to extract the three bits of information from your formatted string.
The function strtok does not work as you think it does: The second argument is not a literal delimiting string, but a string that serves as a set of characters that are delimiters.
In your case, sscanf seems to be the better choice:
#include <stdlib.h>
#include <stdio.h>
int main()
{
const char *line = "invest_#_1945_#_TRADE";
char word[41]; /* up to 40 characters plus the terminating null */
int year;
char def[41];
int n;
n = sscanf(line, "%40[^_]_#_%d_#_%40s", word, &year, def);
if (n == 3) {
printf("word: %s\n", word);
printf("year: %d\n", year);
printf("def'n: %s\n", def);
} else {
printf("Unrecognized line.\n");
}
return 0;
}
The function sscanf examines a given string according to a given pattern. Roughly, that pattern consists of format specifiers that begin with a percent sign, of spaces, which denote any amount of white-space characters (including none), and of other characters that have to be matched verbatim. The format specifiers yield a result, which has to be stored. Therefore, for each specifier, a result variable must be given after the format string.
In this case, there are several chunks:
%40[^_] reads up to 40 characters that are not the underscore into a char array, which needs room for those characters plus the terminating null. This is a special case of reading a string. Strings in sscanf are really words and may not contain white space. The underscore, however, would be part of a string, so in order not to eat up the underscore of the first delimiter, you have to use the notation [^(chars)], which means: any sequence of chars that does not contain the given chars. (The caret does the negation here; [(chars)] would mean any sequence of the given chars.)
_#_ matches the first delimiter literally, i.e. only if the next chars are underscore, hash mark, underscore.
%d reads a decimal number into an integer. Note that the address of the integer has to be given here with &.
_#_ matches the second delimiter.
%40s reads a string of up to 40 non-whitespace characters into a char array.
The function returns the number of matched results, which should be three if the line is valid. The function sscanf can be cumbersome, but is probably your best bet here for quick and dirty input.
#include <stdio.h>
#include <string.h>
char *strtokByWord_r(char *str, const char *word, char **store){
char *p, *ret;
if(str != NULL){
*store = str;
}
if(*store == NULL) return NULL;
p = strstr(ret=*store, word);
if(p){
*p='\0';
*store = p + strlen(word);
} else {
*store = NULL;
}
return ret;
}
char *strtokByWord(char *str, const char *word){
static char *store = NULL;
return strtokByWord_r(str, word, &store);
}
int main(){
char input[]="invest_#_1945_#_TRADE";
char *array[3];
char *p;
int i, size = sizeof(array)/sizeof(char*);
for(i=0, p=input;i<size;++i){
if(NULL!=(p=strtokByWord(p, "_#_"))){
array[i]=p;//strdup(p);
p=NULL;
} else {
array[i]=NULL;
break;
}
}
for(i = 0;i<size;++i)
printf("array[%d]=\"%s\"\n", i, array[i]);
/* result
array[0]="invest"
array[1]="1945"
array[2]="TRADE"
*/
return 0;
}

Resources