I'm trying to scan words and numbers from a string that looks like: "hello, world, I, 287876, 6.0" <-- this string is stored in a char array (string)
What I need to do is split it up and assign the pieces to different variables, so it would be like
char a = "hello"
char b = "world"
char c = "I"
unsigned long d = 287876
float e = 6.0
I know that regular scanf stops reading a field when it reaches whitespace. So I've been thinking that there might be a way to make sscanf stop reading when it reaches a "," (comma)
I've been exploring the library to find a format for sscanf that reads only letters and digits. I couldn't find such a thing; maybe I should look once more.
Any help?
Thanks in advance :)
If the order of the fields in the string is fixed, that is, it's always:
string, string, string, int, float
then use the following format specifier in sscanf():
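size_t len = strlen(str) + 1; /* +1 for the terminating null */
char a[len];
char b[len];
char c[len];
unsigned long d;
float e;
/* str is the source string; %f (not %lf) is the conversion for a float */
sscanf(str, " %[^,] , %[^,] , %[^,] , %lu , %f", a, b, c, &d, &e);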
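A slightly more defensive sketch of the same idea, assuming fixed 32-byte buffers instead of variable-length arrays. The helper name parse_record and the buffer size are made up for illustration; the point is the field widths and the return value, which tells the caller how many fields were actually converted.

```c
#include <stdio.h>

/* Hypothetical helper: parses "text, text, text, integer, real".
   a, b and c must each have room for at least 32 chars.
   Returns the number of fields converted; 5 means complete success. */
int parse_record(const char *str, char *a, char *b, char *c,
                 unsigned long *d, float *e)
{
    /* %31[^,] reads at most 31 non-comma characters;
       a space in the format skips any run of whitespace. */
    return sscanf(str, " %31[^,] , %31[^,] , %31[^,] , %lu , %f",
                  a, b, c, d, e);
}
```

Anything other than 5 from parse_record means the line did not match. Note that %[^,] keeps trailing spaces, so a field written as "hello ," would come back as "hello " with the space included.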
This example using strtok should be helpful:
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="hello, world, I, 287876, 6.0" ;
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str,",");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, ",");
}
return 0;
}
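With "," as the only delimiter, every token after the first keeps its leading space (" world", " I", and so on). A minimal sketch of one way around that, assuming no field contains a space itself: put the space into the delimiter set as well and collect the tokens in an array. The helper name split_fields is hypothetical.

```c
#include <string.h>

/* Hypothetical helper: splits s in place on commas and spaces.
   Stores up to max token pointers in out and returns how many were stored. */
int split_fields(char *s, char *out[], int max)
{
    int n = 0;
    char *tok = strtok(s, ", ");
    while (tok != NULL && n < max) {
        out[n++] = tok;
        tok = strtok(NULL, ", ");
    }
    return n;
}
```

The numeric fields are still strings at this point; they could then be converted with strtoul() and strtof(), both of which allow error checking through their end pointer.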
Assuming the format of the text file is constant you can use the following solution.
std::ifstream ifs("datacar.txt");
if(ifs)
{
std::string line;
while(std::getline(ifs,line))
{
/* optional to check number of items in a line*/
std::vector<std::string> row;
std::istringstream iss(line);
std::copy(
std::istream_iterator<std::string>(iss),
std::istream_iterator<std::string>(),
std::back_inserter(row)
);
/*To avoid parsing the first line and avoid any error in text file */
if(row.size()<=2)
continue;
std::string format = "%s %s %s %f %[^,] %d";
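/* +1 leaves room for the terminating null; variable-length arrays are a compiler extension in C++ */
char year[line.size()+1],make[line.size()+1],three[line.size()+1],full[line.size()+1];
float numberf;
int numberi;
/* char arrays are passed without &, the numeric variables with & */
std::sscanf(line.c_str(),format.c_str(),year,make,three,&numberf,full,&numberi);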
/* create your object and parse the next line*/
}
}
See the documentation for strtok and/or strtok_r
I'm trying to read strings from a file using
while(fscanf(fd, "%s ", word) != EOF) {}
Where fd is the file and word is where I'm storing the string.
However, this effectively uses whitespace as the delimiter. Currently, if I have a file that reads "this% is, the4 str%ng" it results in the strings "this%", "is,", "the4", and "str%ng". I need it to be "this" "is" "the" "str" "ng". Is it possible to do this with fscanf, or is there something else I need to use?
I saw some answers here and here but they didn't seem to help me out.
Those answers show the use of the %[] format specifier. As an example, suppose you have this to get two strings from the console:
#include <stdio.h>
int main(void){
char s1[100] = "", s2[100] = "";
int res;
res = scanf("%99[^%]%%%99[^%]%%", s1, s2);
printf("%d %s %s\n", res, s1, s2);
}
The first % starts each format spec, the ^% tells scanf to stop at %, and the next "escaped" double % tells scanf to read the % that stopped the scan. It then repeats for the second string, so the format spec for one string is %99[^%]%% .
To make the format look simpler, suppose the delimiter is not % but #, then the code would be:
#include <stdio.h>
int main(void){
char s1[100] = "", s2[100] = "";
int res;
res = scanf("%99[^#]#%99[^#]#", s1, s2);
printf("%d %s %s\n", res, s1, s2);
}
The function fscanf is similar.
EDIT
This answer does not handle "unknown" delimiters, so I modified the code.
#include <stdio.h>
int main(void){
char s1[100] = "";
while(scanf("%99[^!£$%&*()_+={};:'#~,.<>/?0123456789-]", s1) == 1) { // '-' goes last so it is not read as a range
getchar(); // remove the delimiter
printf("%s\n", s1);
}
}
Note I have not included characters ^ or " or [ or ] as delimiters.
If you don't have a specific delimiter (which seems to be your case), you need to parse each line of the file manually. You can read each line with fgets(), then parse it yourself (for example, by ignoring every non-alphabetic character).
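A minimal sketch of that manual parse, assuming a "word" is any run of letters as classified by isalpha() and that words longer than 31 characters may be truncated. The names split_alpha and WORD_MAX are made up for illustration; the line itself would come from fgets().

```c
#include <ctype.h>
#include <stddef.h>

#define WORD_MAX 32  /* assumed maximum word size, including the null */

/* Hypothetical helper: copies each run of letters in line into words[].
   Returns the number of words stored (at most max_words). */
size_t split_alpha(const char *line, char words[][WORD_MAX], size_t max_words)
{
    size_t count = 0;
    const char *p = line;

    while (*p != '\0' && count < max_words) {
        /* Skip everything that is not a letter. */
        while (*p != '\0' && !isalpha((unsigned char)*p))
            p++;
        if (*p == '\0')
            break;

        /* Copy one run of letters, truncating if it is too long. */
        size_t len = 0;
        while (isalpha((unsigned char)*p)) {
            if (len < WORD_MAX - 1)
                words[count][len++] = *p;
            p++;
        }
        words[count][len] = '\0';
        count++;
    }
    return count;
}
```

For the sample line "this% is, the4 str%ng" this yields the five words this, is, the, str and ng.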
Regards
I'm new to C and I need help with string functions.
I have a string variable called mcname from which I would like to extract the characters between special characters.
For example:
*mcname="G2-99-77"
I expect the output to be 99 as this is between the - characters.
How can I do this in C please?
Walk the string with a pointer until you hit a special character.
Then start copying the characters into a separate array until you hit the next special character (place a null character in the array when you encounter the special character the second time).
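The two steps above can be sketched roughly as follows. The function name between_delims and its signature are made up for illustration, and it assumes a single known delimiter character such as '-'.

```c
#include <stddef.h>

/* Hypothetical helper: copies the text between the first and second
   occurrence of delim into out (capacity outsz, truncating if needed).
   Returns 1 if both delimiters were found, 0 otherwise. */
int between_delims(const char *s, char delim, char *out, size_t outsz)
{
    size_t len = 0;

    if (outsz == 0)
        return 0;

    /* Walk to the first delimiter. */
    while (*s != '\0' && *s != delim)
        s++;
    if (*s == '\0')
        return 0;
    s++;  /* step past it */

    /* Copy until the next delimiter or the end of the string. */
    while (*s != '\0' && *s != delim) {
        if (len < outsz - 1)
            out[len++] = *s;
        s++;
    }
    out[len] = '\0';

    return *s == delim;
}
```

Called with "G2-99-77" and '-', it leaves "99" in out; atoi() or strtol() can then turn that into a number if needed.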
You can do this by using strtok or sscanf
using sscanf:
#include <stdio.h>
int main()
{
char str[64];
int out;
char mcname[] = "G2-99-77";
sscanf(mcname, "%63[^-]-%d", str, &out); /* %63 keeps the prefix within str[64] */
printf("%d\n", out);
return 0;
}
Using strtok:
#include <stdio.h>
#include <string.h>
int main()
{
char *str;
int out;
char mcname[] = "G2-99-77";
str = strtok(mcname, "-");
str = strtok (NULL, "-");
out = atoi(str);
printf("%d\n", out);
return 0;
}
sscanf() has great flexibility. Used correctly, code may readily parse a string.
Be sure to test the sscanf() return value.
%2[A-Z0-9] means to scan up to 2 characters from the set 'A' to 'Z' and '0' to '9'.
Use %2[^-] if code goal is any 2 char other than '-'.
char *mcname = "G2-99-77";
char prefix[3];
char middle[3];
char suffix[3];
int cnt = sscanf(mcname, "%2[A-Z0-9]-%2[A-Z0-9]-%2[A-Z0-9]", prefix, middle,
suffix);
if (cnt != 3) {
puts("Parse Error\n");
}
else {
printf("Prefix:<%s> Middle:<%s> Suffix:<%s>\n", prefix, middle, suffix);
}
I'm building a linked list and need your assistance please as I'm new to C.
I need to input a string that looks like this: (word)_#_(year)_#_(DEFINITION(UPPER CASE))
Ex: Enter a string
Input: invest_#_1945_#_TRADE
Basically I'm looking to build a function that scans the DEFINITION and gives me back the word it relates to.
Enter a word to search in the dictionary
Input: TRADE
Output: Found "TRADE" in the word "invest"
So far I managed to come up using the strtok() function but right now I'm not sure what to do about printing the first word then.
Here's what I could come up with:
char split(char words[99],char *p)
{
p=strtok(words, "_#_");
while (p!=NULL)
{
printf("%s\n",p);
p = strtok(NULL, "_#_");
}
return 0;
}
int main()
{
char hello[99];
char *s = NULL;
printf("Enter a string you want to split\n");
scanf("%s", hello);
split(hello,s);
return 0;
}
Any ideas on what should I do?
I reckon that your problem is how to extract the three bits of information from your formatted string.
The function strtok does not work as you think it does: The second argument is not a literal delimiting string, but a string that serves as a set of characters that are delimiters.
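A small sketch that makes this visible, using a hypothetical helper count_tokens and a made-up input in which the word itself contains an underscore. Because "_#_" is treated as the set {'_', '#'}, the word is split as well.

```c
#include <string.h>

/* Hypothetical helper: counts the tokens strtok() produces for s.
   s is modified in place, as always with strtok(). */
int count_tokens(char *s, const char *delims)
{
    int n = 0;
    for (char *tok = strtok(s, delims); tok != NULL; tok = strtok(NULL, delims))
        n++;
    return n;
}
```

For "invest_#_1945_#_TRADE" this returns the expected 3, but for "big_deal_#_1945_#_TRADE" it returns 4 (big, deal, 1945, TRADE), which is why strtok is a poor fit for a multi-character delimiter.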
In your case, sscanf seems to be the better choice:
#include <stdlib.h>
#include <stdio.h>
int main()
{
const char *line = "invest_#_1945_#_TRADE";
char word[41]; /* 40 characters plus the terminating null */
int year;
char def[41];
int n;
n = sscanf(line, "%40[^_]_#_%d_#_%40s", word, &year, def);
if (n == 3) {
printf("word: %s\n", word);
printf("year: %d\n", year);
printf("def'n: %s\n", def);
} else {
printf("Unrecognized line.\n");
}
return 0;
}
The function sscanf examines a given string according to a given pattern. Roughly, that pattern consists of format specifiers that begin with a percent sign, of spaces, which denote any amount of white-space characters (including none), and of other characters that have to be matched verbatim. The format specifiers yield a result, which has to be stored. Therefore, for each specifier, a result variable must be given after the format string.
In this case, there are several chunks:
%40[^_] reads up to 40 characters that are not the underscore into a char array. This is a special case of reading a string. Strings in sscanf are really words and may not contain white space. The underscore, however, would be part of a string, so in order not to eat up the underscore of the first delimiter, you have to use the notation [^(chars)], which means: Any sequence of chars that do not contain the given chars. (The caret does the negation here, [(chars)] would mean any sequence of the given chars.)
_#_ matches the first delimiter literally, i.e. only if the next chars are underscore, hash mark, underscore.
%d reads a decimal number into an integer. Note that the address of the integer has to be given here with &.
_#_ matches the second delimiter.
%40s reads a string of up to 40 non-whitespace characters into a char array.
The function returns the number of matched results, which should be three if the line is valid. The function sscanf can be cumbersome, but is probably your best bet here for quick and dirty input.
#include <stdio.h>
#include <string.h>
char *strtokByWord_r(char *str, const char *word, char **store){
char *p, *ret;
if(str != NULL){
*store = str;
}
if(*store == NULL) return NULL;
p = strstr(ret=*store, word);
if(p){
*p='\0';
*store = p + strlen(word);
} else {
*store = NULL;
}
return ret;
}
char *strtokByWord(char *str, const char *word){
static char *store = NULL;
return strtokByWord_r(str, word, &store);
}
int main(){
char input[]="invest_#_1945_#_TRADE";
char *array[3];
char *p;
int i, size = sizeof(array)/sizeof(char*);
for(i=0, p=input;i<size;++i){
if(NULL!=(p=strtokByWord(p, "_#_"))){
array[i]=p;//strdup(p);
p=NULL;
} else {
array[i]=NULL;
break;
}
}
for(i = 0;i<size;++i)
printf("array[%d]=\"%s\"\n", i, array[i]);
/* result
array[0]="invest"
array[1]="1945"
array[2]="TRADE"
*/
return 0;
}
I'm trying to scan a line that contains multiple words in C. Is there a way to scan it word by word and store each word as a different variable?
For example, I have the following types of lines:
A is the 1 letter;
B is the 2 letter;
C is the 3 letter;
If I'm parsing through the first line: "A is the 1 letter" and I have the following code, what do I put in each case so I can get the individual tokens and store them as variables. To clarify, by the end of this code, I want "is," "the," "1," "letter" in different variables.
I have the following code:
while (feof(theFile) != 1) {
string = "A is the 1 letter"
first_word = sscanf(string);
switch(first_word):
case "A":
what to put here?
case "B":
what to put here?
...
You shouldn't use feof() like that. You should use fgets() or equivalent. You probably need to use the little-known (but present in standard C89) conversion specifier %n.
#include <stdio.h>
int main(void)
{
char buffer[1024];
while (fgets(buffer, sizeof(buffer), stdin) != 0)
{
char *str = buffer;
char word[256];
int posn;
while (sscanf(str, "%255s%n", word, &posn) == 1)
{
printf("Word: <<%s>>\n", word);
str += posn;
}
}
return(0);
}
This reads a line, then uses sscanf() iteratively to fetch words from the line. The %n format specifier doesn't count towards the successful conversions, hence the comparison with 1. Note the use of %255s to prevent overflows in word. Note too that sscanf() could write a null after the 255 count specified in the conversion specification, hence the difference of one between the declaration of char word[256]; and the conversion specifier %255s.
Clearly, it is up to you to decide what to do with each word as it is extracted; the code here simply prints it.
One advantage of this technique over any solution based on strtok() is that sscanf() does not modify the input string so if you need to report an error, you have the original input line to use in the error report.
After the question was edited, it seems that punctuation such as the semi-colon is not wanted in a word; the code above would include punctuation as part of the word. In that case, you have to think a bit harder about what to do. The starting point might well be using an alphanumeric scan-set as the conversion specification in place of %255s:
"%255[a-zA-Z_0-9]%n"
You probably then have to look at what's in the character at the start of the next component and skip it if it is not alphanumeric:
if (!isalnum((unsigned char)*str))
{
if (sscanf(str, "%*[^a-zA-Z_0-9]%n", &posn) == 0)
str += posn;
}
Leading to:
#include <stdio.h>
#include <ctype.h>
int main(void)
{
char buffer[1024];
while (fgets(buffer, sizeof(buffer), stdin) != 0)
{
char *str = buffer;
char word[256];
int posn;
while (sscanf(str, "%255[a-zA-Z_0-9]%n", word, &posn) == 1)
{
printf("Word: <<%s>>\n", word);
str += posn;
if (!isalnum((unsigned char)*str))
{
if (sscanf(str, "%*[^a-zA-Z_0-9]%n", &posn) == 0)
str += posn;
}
}
}
return(0);
}
You'll need to consider the I18N and L10N aspects of the alphanumeric ranges chosen; what's available may depend on your implementation (POSIX doesn't specify support in scanf() scan-sets for the notations such as [[:alnum:]], unfortunately).
You can use strtok() to tokenize or split strings. Please refer to the following link for an example: http://www.cplusplus.com/reference/cstring/strtok/
You can take an array of character pointers and assign the tokens to them.
Example:
char *tokens[100];
int i = 0;
char *token = strtok(string, " ");
while (token != NULL && i < 100) { /* stop before overflowing tokens[] */
tokens[i] = token;
token = strtok(NULL, " ");
i++;
}
printf("Total Tokens: %d", i);
Note that the %s specifier skips leading whitespace. So you can write:
std::string s = "A is the 1 letter";
typedef char Word[128];
Word words[6];
int wordsRead = sscanf(s.c_str(), "%127s%127s%127s%127s%127s%127s", words[0], words[1], words[2], words[3], words[4], words[5] ); // %127s leaves room for the null in each 128-byte Word
std::cout << wordsRead << " words read" << std::endl;
for(int i = 0;
i != wordsRead;
++i)
std::cout << "'" << words[i] << "'" << std::endl;
Note how this approach (unlike strtok) effectively requires an assumption about the maximum number of words to read, as well as their lengths.
I would recommend using strtok().
Here is the example from http://www.cplusplus.com/reference/cstring/strtok/
#include <stdio.h>
#include <string.h>
int main ()
{
char str[] ="- This, a sample string.";
char * pch;
printf ("Splitting string \"%s\" into tokens:\n",str);
pch = strtok (str," ,.-");
while (pch != NULL)
{
printf ("%s\n",pch);
pch = strtok (NULL, " ,.-");
}
return 0;
}
Output will be:
Splitting string "- This, a sample string." into tokens:
This
a
sample
string
I'm trying to extract a string and an integer out of a string using sscanf:
#include<stdio.h>
int main()
{
char Command[20] = "command:3";
char Keyword[20];
int Context;
sscanf(Command, "%s:%d", Keyword, &Context);
printf("Keyword:%s\n",Keyword);
printf("Context:%d",Context);
getch();
return 0;
}
But this gives me the output:
Keyword:command:3
Context:1971293397
I'm expecting this output:
Keyword:command
Context:3
Why does sscanf behave like this? Thanks in advance for your help!
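sscanf expects %s tokens to be whitespace-delimited (tab, space, newline), so %s consumed the whole of "command:3", the literal : then had nothing left to match, and Context was never assigned. With %s you'd have to have a space between the string and the :
For an ugly-looking hack you can try:
sscanf(Command, "%19[^:]:%d", Keyword, &Context);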
which will force the token to not match the colon.
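Whichever format is used, it is worth checking how many items sscanf actually converted before trusting the results. A minimal sketch, assuming a 20-byte keyword buffer as in the question; the helper name parse_command is made up for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: splits "keyword:number".
   keyword must have room for at least 20 chars.
   Returns 1 if both parts were read, 0 otherwise. */
int parse_command(const char *cmd, char *keyword, int *context)
{
    /* %19[^:] reads at most 19 characters up to, but not including, ':' */
    return sscanf(cmd, "%19[^:]:%d", keyword, context) == 2;
}
```

With "command:3" this returns 1 and stores "command" and 3; with "command" alone it returns 0, so the caller never prints an unassigned integer.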
If you aren't particular about using sscanf, you could always use strtok, since what you want is to tokenize your string.
char Command[20] = "command:3";
char* key;
int val;
key = strtok(Command, ":");
val = atoi(strtok(NULL, ":"));
printf("Keyword:%s\n",key);
printf("Context:%d\n",val);
This is much more readable, in my opinion.
Use the %[ conversion here. See the manual page of scanf: http://linux.die.net/man/3/scanf
#include <stdio.h>
int main()
{
char *s = "command:3";
char s1[0xff];
int d;
sscanf(s, "%254[^:]:%d", s1, &d); /* %254 keeps the keyword within s1[0xff] */
printf("here: %s:%d\n", s1, d);
return 0;
}
which gives "here: command:3" as its output.