Sscanf delimiters for parsing? - c

I am trying to parse the following string with sscanf:
query=testword&diskimg=simple.img
How can I use sscanf to parse out "testword" and "simple.img"? The delimiter arguments for sscanf really confuse me :/
Thank you!

If you know that the length of "testword" will always be 8 characters, you can do it like this:
char str[] = "query=testword&diskimg=simple.img";
char buf1[100];
char buf2[100];
sscanf(str, "query=%8s&diskimg=%s", buf1, buf2);
buf1 will now contain "testword" and buf2 will contain "simple.img".
Alternatively, if you know that testword will always be preceded by = and followed by &, and that simple.img will always be preceded by =, you can use this:
sscanf(str, "%*[^=]%*c%[^&]%*[^=]%*c%s", buf1, buf2);
It's pretty cryptic, so here's the summary: each % designates the start of a chunk of text. If there's a * following the %, that means that we ignore that chunk and don't store it in one of our buffers. The ^ within the brackets means that this chunk contains any number of characters that are not the characters within the brackets (excepting ^ itself). %s reads a string of arbitrary length, and %c reads a single character.
So to sum up:
We keep reading and ignoring characters if they are not =.
We read and ignore another character (the equal sign).
Now we're at testword, so we keep reading and storing characters into buf1 until we encounter the & character.
More characters to read and ignore; we keep going until we hit = again.
We read and ignore a single character (again, the equal sign).
Finally, we store what's left ("simple.img") into buf2.

Related

fscanf: how to read but not assign commas?

I have a C program that reads in a list of values separated by commas from a .txt file and assigns the values to variables. I want to assign the first value to a string, the second to an int, the third to a double and the fourth to a double. However, the entire line gets assigned to the string, and the rest are garbage or random values. I want to be able to "skip over" the commas and read assign values between the commas. The final double has a percentage sign at the end so I read the value using a %%, at least thats what I believe should be done.
fscanf(text_file, "%s,%d,%lf,%lf%%%[^\n]", title, &count, &size, &percentage);
A data point would look like this:
yellow-leaves,43,4.50,9.00%
But the values of title contain the entire line, and the rest of the values are just random garbage values.
Bear in mind that scanf() is ill-suited for user-input.
Anyway ... "%s" reads commas (it skips whitespace), try "%[^,]".
// assuming
// char title[99];
if (fscanf(f, " %98[^,],%d...", title, ...) != 4) /* error */;
// ^ skip whitespace
Better yet: read a whole line with fgets() then parse it, possibly with sscanf().
char buff[999];
while (fgets(buff, sizeof buff, f)) {
// parse buff
}

Getting a char* with spaces in C from sscanf

I am attempting to read a line written in the format:
someword: .asciiz "want this as a char*"
There is an arbitrary amount of white space between words. I am curious if there is a simple way of getting the internal characters in the quotes into a char* variable using something like sscanf? I am guaranteed the quotes and that where will be no more than 32 characters (including spaces). There will also be a new line character immediately following the quotes.
Most scanf() field descriptors implicitly cause leading whitespace to be skipped and expect the field to be whitespace-terminated. To scan a string that may contain whitespace, however, you can use the %[] field descriptor with an appropriate scan set. Thus, you might scan sequence of lines following the pattern you describe like so by looping calls like this:
char keyword[32], value[32], description[32];
scanf("%s%s%*[ \t]\"%[^\"]\"", keyword, value, description);
That format string:
scans two whitespace-delimited strings into char arrays keyword and value,
scans but does not assign one or more whitespace characters followed by a quotation mark,
scans everything up to but not including the next quotation mark into char array description, and scans and discards a quotation mark.
It relies on the data to be correctly formatted; among other things, this is vulnerable to a buffer overflow if the data are malformed. You can address that by specifying maximum field widths in the format string.
Note, too, that you should check the return value of the function to ensure that all fields were successfully matched. That will allow you to terminate early in the event of malformed input, and even to present valid information about the location of the malformation.
You can use scanf ("%s%s%31[^\n]",s1,s2,s3);
Example:
#include <stdio.h>
int main()
{
char s1[32],s2[32],s3[32];
printf ("write something: ");
scanf ("%s%s%31[^\n]",s1,s2,s3);
printf ("%s %s %s",s1,s2,s3);
return 0;
}
s1 and s2 will ignore spaces but s3 won't
Use \"%32[^\"]\" to capture the quoted phrase. Use "%n" to detect success.
char w1[32+1];
char w2[32+1];
char w3[32+1];
int n = 0;
sscanf(buffer, "%32s%32s \"%32[^\"]\" %n", w1, w2, w3, &n);
if (n == 0) return fail; // format mis-match
if (buffer[n]) return fail; // Extra garbage detected
// else good to go.
"%32s" Skip white-space,then read & save up to 32 non-white-space char. Append '\0'.
" " Skip white space.
"\"" Match a '\"'.
"%32[^\"]" Read and save up to 32 non-'\"' char. Append '\0'.
"%n" Save the count of characters scanned.

Using fscanf to get 2 string in 2 columns

I'm trying to make my program read a file with 2 columns and the first column contains some strings and i cant make it to store into an array
Here is my code:
fp2=fopen("Symbol Table.txt","r");char str[100];
while(fscanf(fp2,"%s %s",str,stemp[scnt])!=NULL) {
puts(stemp[scnt++]);getch(); //This is just here to display conents of second col
}
fclose(fp2)
and here is my txt file:
void void
main Main
( Left Parenthesis
) Right Parenthesis
{ Left Brace
S Identifier
: Colon
$% Start of Block Comment
This program is a simple calculatorFuctions:ADD,SUB,MULT,DIV String
%$ End of Block Comment
Unsigned Noise Words
int Integer
the code store the long string is divided before going into the array
When you try to read strings with scanf or fscanf it reads the string until it encounters a space, a tab, or a newline. In this case, as I can see, in the first loop the str and stemp will be assigned to the stings "void" and "void", in the second loop, it will be "main" and "Main" and in the third loop, "(" and "Left" and so on.
You need to separate the two columns with multiple strings with a special character like the tab (\t) character so that one knows when the 1st column ends and read with the getc function instead of scanf until you reach that special character in your line and then combine all the characters (i.e. letters) in to a string. You can adopt the same method to read the second column with a getc function until you encounter the newline (\n).
while(fscanf(fp2,"%s %s",str,stemp[scnt])!=NULL) {
is not going to work
"%s" scans for non- white-space text. "This program is" and "Left Parenthesis" have embedded white-space.
Need to read a line-at-a-time and then parse the line into 2 strings for the 2 columns.
char buf[100];
if (fgets(buf, sizeof buf, fp2) == NULL) Handle_EOForIOError();
column[2][100];
size_t len = strlen(buf);
if (len < 62) Handle_ShortLine();
buf[61] = = '\0'; // cut the line in 2
column[0][0] = '\0'; // In case column is all white-space
sscanf(buf, " %[^\n]", column[0]);
column[1][0] = '\0';
sscanf(&buf[62], " %[^\n]", column[1]);
Other answers contain statements which are not true. Actually,
fscanf does not always read strings only until a white-space character. The conversion specifier [ also reads a string, and includes the set of expected characters.
You do not need to separate the two columns with a special character, since the columns are of known width 62.
You do not need to read a line at a time and then parse it; though, that might be easier.
So, if you prefer fscanf, you can use
while (fscanf(fp2, "%62[^\n]%[^\n]\n", str, stemp[scnt]) == 2)
(your comparing with NULL is wrong, because NULL is a pointer constant and fscanf doesn't return a pointer but the number of input items assigned or EOF).
A version without fscanf:
while (fgets(str, 62+1, fp2) && fgets(stemp[scnt], 62+1, fp2) && getc(fp2))
(The getc reads the \n.)
The Nature of while(fscanf(fp2,"%s %s",str,stemp[scnt])!=NULL) is will read only the two string separated by a space. if your text file contains long strings like this
This program is a simple calculatorFuctions:ADD,SUB,MULT,DIV
While fetching it will divide this strings into parts of %s %s and fetch. So change your text file in the following way and try:
void void
main Main
( LeftParenthesis
) RightParenthesis
{ LeftBrace
S Identifier
: Colon
$% StartofBlockComment
ThisProgramIsASimpleCalculatorFuctions:ADD,SUB,MULT,DIV String
%$ EndofBlockComment
Unsigned NoiseWords
int Integer

sscanf reading multiple characters

I am trying to read a list of interfaces given in a config file.
char readList[3];
memset(readList,'\0',sizeof(readList));
fgets(linebuf, sizeof(linebuf),fp); //I get the file pointer(fp) without any error
sscanf(linebuf, "List %s, %s, %s \n",
&readList[0],
&readList[1],
&readList[2]);
Suppose the line the config file is something like this
list value1 value2 value3
I am not able to read this. Can someone please tell me the correct syntax for doing this. What am I doing wrong here.
The %s conversion specifier in the format string reads a sequence of non-whitespace characters and stores in buffer pointed to by the corresponding argument passed to sscanf. The buffer must be large enough to store the input string plus the terminating null byte which is added automatically by sscanf else it is undefined behaviour.
char readList[3];
The above statement defines readList to be an array of 3 characters. What you need is an array of characters arrays large enough to store the strings written by sscanf. Also the format string "List %s, %s, %s \n" in your sscanf call means that it must exactly match "List" and two commas ' in that order in the string linebuf else sscanf will fail due to matching failure. Make sure that you string linebuf is formatted accordingly else here is what I suggest. Also, you must guard against sscanf overrunning the buffer it writes into by specifying the maximum field width else it will cause undefined behaviour. The maximum field width should be one less than the buffer size to accommodate the terminating null byte.
// assuming the max length of a string in linebuf is 40
// +1 for the terminating null byte which is added by sscanf
char readList[3][40+1];
fgets(linebuf, sizeof linebuf, fp);
// "%40s" means that sscanf will write at most 40 chars
// in the buffer and then append the null byte at the end.
sscanf(linebuf,
"%40s%40s%40s",
readList[0],
readList[1],
readList[2]);
Your char readlist[3] is an array of three chars. What you need however is an array of three strings, like char readlist[3][MUCH]. Then, reading them like
sscanf(linebuf, "list %s %s %s",
readList[0],
readList[1],
readList[2]);
will perhaps succeed. Note that the alphabetic strings in scanf need to match character-by-character (hence list, not List), and any whitespace in the format string is a signal to skip all whitespace in the string up to the next non-whitespace character. Also note the absence of & in readList arguments since they are already pointers.

splitting string in c

I have a file where each line looks like this:
cc ssssssss,n
where the two first 'c's are individual characters, possibly spaces, then a space after that, then the 's's are a string that is 8 or 9 characters long, then there's a comma and then an integer.
I'm really new to c and I'm trying to figure out how to put this into 4 seperate variables per line (each of the first two characters, the string, and the number)
Any suggestions? I've looked at fscanf and strtok but i'm not sure how to make them work for this.
Thank you.
I'm assuming this is a C question, as the question suggests, not C++ as the tags perhaps suggest.
Read the whole line in.
Use strchr to find the comma.
Do whatever you want with the first two characters.
Switch the comma for a zero, marking the end of a string.
Call strcpy from the fourth character on to extract the sssssss part.
Call atoi on one character past where the comma was to extract the integer.
A string is a sequence of characters that ends at the first '\0'. Keep this in mind. What you have in the file you described isn't a string.
I presume n is an integer that could span multiple decimal places and could be negative. If that's the case, I believe the format string you require is "%2[^ ] %9[^,\n],%d". You'll want to pass fscanf the following expressions:
Your FILE *,
The format string,
An array of 3 chars silently converted to a pointer,
An array of 9 chars silently converted to a pointer,
... and a pointer to int.
Store the return value of fscanf into an int. If fscanf returns negative, you have a problem such as EOF or some other read error. Otherwise, fscanf tells you how many objects it assigned values into. The "success" value you're looking for in this case is 3. Anything else means incorrectly formed input.
I suggest reading the fscanf manual for more information, and/or for clarification.
fscanf function is very powerful and can be used to solve your task:
We need to read two chars - the format is "%c%c".
Then skip a space (just add it to the format string) - "%c%c ".
Then read a string until we hit a comma. Don't forget to specify max string size. So, the format is "%c%c %10[^,]". 10 - max chars to read. [^,] - list of allowed chars. ^, - means all except a comma.
Then skip a comma - "%c%c %10[^,],".
And finally read an integer - "%c%c %10[^,],%d".
The last step is to be sure that all 4 tokens are read - check fscanf return value.
Here is the complete solution:
FILE *f = fopen("input_file", "r");
do
{
char c1 = 0;
char c2 = 0;
char str[11] = {};
int d = 0;
if (4 == fscanf(f, "%c%c %10[^,],%d", &c1, &c2, str, &d))
{
// successfully got 4 values from the file
}
}
while(!feof(f));
fclose(f);

Resources