sscanf reading multiple characters - c

I am trying to read a list of interfaces given in a config file.
char readList[3];
memset(readList,'\0',sizeof(readList));
fgets(linebuf, sizeof(linebuf),fp); //I get the file pointer(fp) without any error
sscanf(linebuf, "List %s, %s, %s \n",
&readList[0],
&readList[1],
&readList[2]);
Suppose the line the config file is something like this
list value1 value2 value3
I am not able to read this. Can someone please tell me the correct syntax for doing this. What am I doing wrong here.

The %s conversion specifier in the format string reads a sequence of non-whitespace characters and stores in buffer pointed to by the corresponding argument passed to sscanf. The buffer must be large enough to store the input string plus the terminating null byte which is added automatically by sscanf else it is undefined behaviour.
char readList[3];
The above statement defines readList to be an array of 3 characters. What you need is an array of characters arrays large enough to store the strings written by sscanf. Also the format string "List %s, %s, %s \n" in your sscanf call means that it must exactly match "List" and two commas ' in that order in the string linebuf else sscanf will fail due to matching failure. Make sure that you string linebuf is formatted accordingly else here is what I suggest. Also, you must guard against sscanf overrunning the buffer it writes into by specifying the maximum field width else it will cause undefined behaviour. The maximum field width should be one less than the buffer size to accommodate the terminating null byte.
// assuming the max length of a string in linebuf is 40
// +1 for the terminating null byte which is added by sscanf
char readList[3][40+1];
fgets(linebuf, sizeof linebuf, fp);
// "%40s" means that sscanf will write at most 40 chars
// in the buffer and then append the null byte at the end.
sscanf(linebuf,
"%40s%40s%40s",
readList[0],
readList[1],
readList[2]);

Your char readlist[3] is an array of three chars. What you need however is an array of three strings, like char readlist[3][MUCH]. Then, reading them like
sscanf(linebuf, "list %s %s %s",
readList[0],
readList[1],
readList[2]);
will perhaps succeed. Note that the alphabetic strings in scanf need to match character-by-character (hence list, not List), and any whitespace in the format string is a signal to skip all whitespace in the string up to the next non-whitespace character. Also note the absence of & in readList arguments since they are already pointers.

Related

Getting a char* with spaces in C from sscanf

I am attempting to read a line written in the format:
someword: .asciiz "want this as a char*"
There is an arbitrary amount of white space between words. I am curious if there is a simple way of getting the internal characters in the quotes into a char* variable using something like sscanf? I am guaranteed the quotes and that where will be no more than 32 characters (including spaces). There will also be a new line character immediately following the quotes.
Most scanf() field descriptors implicitly cause leading whitespace to be skipped and expect the field to be whitespace-terminated. To scan a string that may contain whitespace, however, you can use the %[] field descriptor with an appropriate scan set. Thus, you might scan sequence of lines following the pattern you describe like so by looping calls like this:
char keyword[32], value[32], description[32];
scanf("%s%s%*[ \t]\"%[^\"]\"", keyword, value, description);
That format string:
scans two whitespace-delimited strings into char arrays keyword and value,
scans but does not assign one or more whitespace characters followed by a quotation mark,
scans everything up to but not including the next quotation mark into char array description, and scans and discards a quotation mark.
It relies on the data to be correctly formatted; among other things, this is vulnerable to a buffer overflow if the data are malformed. You can address that by specifying maximum field widths in the format string.
Note, too, that you should check the return value of the function to ensure that all fields were successfully matched. That will allow you to terminate early in the event of malformed input, and even to present valid information about the location of the malformation.
You can use scanf ("%s%s%31[^\n]",s1,s2,s3);
Example:
#include <stdio.h>
int main()
{
char s1[32],s2[32],s3[32];
printf ("write something: ");
scanf ("%s%s%31[^\n]",s1,s2,s3);
printf ("%s %s %s",s1,s2,s3);
return 0;
}
s1 and s2 will ignore spaces but s3 won't
Use \"%32[^\"]\" to capture the quoted phrase. Use "%n" to detect success.
char w1[32+1];
char w2[32+1];
char w3[32+1];
int n = 0;
sscanf(buffer, "%32s%32s \"%32[^\"]\" %n", w1, w2, w3, &n);
if (n == 0) return fail; // format mis-match
if (buffer[n]) return fail; // Extra garbage detected
// else good to go.
"%32s" Skip white-space,then read & save up to 32 non-white-space char. Append '\0'.
" " Skip white space.
"\"" Match a '\"'.
"%32[^\"]" Read and save up to 32 non-'\"' char. Append '\0'.
"%n" Save the count of characters scanned.

sscanf not extracting pattern

I am trying to figure out the pattern I should be giving to sscanf.
I have a string abcde(1GB). I want to extract 1 and GB. I am using
char list[]= "abcde(1GB)";
int memory_size =0;
char unit[3]={0} ;
sscanf(list, "%*s%d%s" , &memory_size, unit);
I do not see tokens extracted when I print I see memory_size =0 and NULL in unit.
Thanks
your sscanf() string format should be:
sscanf(list, "%*[^(](%d%[^)]" , &memory_size, unit);
%[^)] means catch charachters and stop ctaching when finding the charachter ) or end of the string
%*[^(] means:
[^\(] means catch charachters and stop ctaching when finding the charachter ( - as opposed to a more conventional %s - catching charachters and stop ctaching when finding space characters"
* means "read but not store"
The %s conversion specifier in the format string of scanf means that scanf will read a sequence of non-whitespace characters. The * in %*s is called the assignment suppression character. It means that scanf will read and discard the non-whitespace characters.
Since %s matches any sequence of non-whitespace characters, %*s in "%*s%d%s" means that sscanf will read an discard all the characters in the string list. Therefore, there's nothing left in the array list to be read and assigned to the arguments &memory_size and unit. This explains why memory_size and unit are unchanged. What you need is the format string
sscanf(list, "%*[^(](%d%2[^)]%*s", &memory_size, unit);
Here, in the format string "%*[^(](%d%2[^)]%*s" -
%*[^(] means that sscanf will first read and discard a sequence of character not containing the character (.
( means that sscanf will read and discard (.
%d means the sscanf will read a decimal integer.
%2[^)] means that sscanf will read at most a sequence of 2 characters not containing ) and store them in the corresponding argument. This is to ensure that sscanf does not overrun the buffer unit in case the string to be stored is too large. It's one less than the size of unit to save space for the terminating null byte which is automatically added by sscanf at the end of the buffer.
%*s means sscanf will read and discard any sequence of leftover non-whitespace characters.
plz go through the following link for the sscanf() function.
sscanf function description
Now,
as per MOHAMED said %[^(] is for catch the characters till "("
means %*[^(] truncate the string until ( after that %d as integer data and then capture the string until ).
After accept answer - supplemental
#MOHAMED answer is good, but below are candidate improvements.
1) Always check the sscanf() result to insure data was scanned as intended.
if (sscanf(list, "%*[^(](%d%[^)]", &memory_size, unit) != 2) ScanFailure();
2) When using the "%s" or "%[]" specifiers, include a limiting length. #ajay
char unit[3]={0} ;
// v--- 1 less than buffer size.
if (sscanf(list, "%*[^(](%d%2[^)]" , &memory_size, unit) != 2) ScanFailure();
3) Appending a "%n" (save scan position) is a sure-fire means of detecting the trailing ) was there and extra junk was not at the end of the input string.
"%n" does not add to sscaanf() result.
int n = 0;
// )%n <--- Look for ) then save scan position
if (sscanf(list, "%*[^(](%d%2[^)])%n", &memory_size, unit, &n) != 2) ||
list[n] != '\0') ScanFailure();
4) Making room for optional white-space may/may not be useful. Specifiers already allow optional leading white-space. 3 exceptions: %c %[] %n
"%*[^(](%d%2[^)])%n"
// v v v
" %*[^(](%d %2[^)]) %n"

Doesn't %[] or %[^] specifier in scanf(),sscanf() or fscanf() store the input in null-terminated character array?

Here's what the Beez C guide (LINK) tells about the %[] format specifier:
It allows you to specify a set of characters to be stored away (likely in an array of chars). Conversion stops when a character that is not in the set is matched.
I would appreciate if you can clarify some basic questions that arise from this premise:
1) Are the input fetched by those two format specifiers stored in the arguments(of type char*) as a character array or a character array with a \0 terminating character (string)? If not a string, how to make it store as a string , in cases like the program below where we want to fetch a sequence of characters as a string and stop when a particular character (in the negated character set) is encountered?
2) My program seems to suggest that processing stops for the %[^|] specifier when the negated character | is encountered.But when it starts again for the next format specifier,does it start from the negated character where it had stopped earlier?In my program I intend to ignore the | hence I used %*c.But I tested and found that if I use %c and an additional argument of type char,then the character | is indeed stored in that argument.
3) And lastly but crucially for me,what is the difference between passing a character array for a %s format specifier in printf() and a string(NULL terminated character array)?In my other program titled character array vs string,I've passed a character array(not NULL terminated) for a %s format specifier in printf() and it gets printed just as a string would.What is the difference?
//Program to illustrate %[^] specifier
#include<stdio.h>
int main()
{
char *ptr="fruit|apple|lemon",type[10],fruit1[10],fruit2[10];
sscanf(ptr, "%[^|]%*c%[^|]%*c%s", type,fruit1, fruit2);
printf("%s,%s,%s",type,fruit1,fruit2);
}
//character array vs string
#include<stdio.h>
int main()
{
char test[10]={'J','O','N'};
printf("%s",test);
}
Output JON
//Using %c instead of %*c
#include<stdio.h>
int main()
{
char *ptr="fruit|apple|lemon",type[10],fruit1[10],fruit2[10],char_var;
sscanf(ptr, "%[^|]%c%[^|]%*c%s", type,&char_var,fruit1, fruit2);
printf("%s,%s,%s,and the character is %c",type,fruit1,fruit2,char_var);
}
Output fruit,apple,lemon,and the character is |
It is null terminated. From sscanf():
The conversion specifiers s and [ always store the null terminator in addition to the matched characters. The size of the destination array must be at least one greater than the specified field width.
The excluded characters are unconsumed by the scan set and remain to be processed. An alternative format specifier:
if (sscanf(ptr, "%9[^|]|%9[^|]|%9s", type,fruit1, fruit2) == 3)
The array is actually null terminated as remaining elements will be zero initialized:
char test[10]={'J','O','N' /*,0,0,0,0,0,0,0*/ };
If it was not null terminated then it would keep printing until a null character was found somewhere in memory, possibly overruning the end of the array causing undefined behaviour. It is possible to print a non-null terminated array:
char buf[] = { 'a', 'b', 'c' };
printf("%.*s", 3, buf);
1) Are the input fetched by those two format specifiers stored in the
arguments(of type char*) as a character array or a character array
with a \0 terminating character (string)? If not a string, how to make
it store as a string , in cases like the program below where we want
to fetch a sequence of characters as a string and stop when a
particular character (in the negated character set) is encountered?
They're stored in ASCIIZ format - with a NUL/'\0' terminator.
2) My program seems to suggest that processing stops for the %[^|]
specifier when the negated character | is encountered.But when it
starts again for the next format specifier,does it start from the
negated character where it had stopped earlier?In my program I intend
to ignore the | hence I used %*c.But I tested and found that if I use
%c and an additional argument of type char,then the character | is
indeed stored in that argument.
It shouldn't consume the next character. Show us your code or it didn't happen ;-P.
3) And lastly but crucially for me,what is the difference between
passing a character array for a %s format specifier in printf() and a
string(NULL terminated character array)?In my other program titled
character array vs string,I've passed a character array(not NULL
terminated) for a %s format specifier in printf() and it gets printed
just as a string would.What is the difference?
(edit: the following addresses the question above, which talks about array behaviours generally and is broader than the code snippet in the question that specifically posed the case char[10] = "abcd"; and is safe)
%s must be passed a pointer to a ASCIIZ text... even if that text is explicitly in a char array, it's the mandatory presence of the NUL terminator that defines the textual content and not the array length. You must NUL terminate your character array or you have undefined behaviour. You might get away with it sometimes - e.g. strncpy into the array will NUL terminate it if-and-only-if there's room to do so, and static arrays start with all-0 content so if you only overwrite before the final character you'll have a NUL, your char[10] example happens to have elements for which values aren't specified populated with NULs, but you should generally take responsibility for ensuring that something is ensuring NUL termination.

splitting string in c

I have a file where each line looks like this:
cc ssssssss,n
where the two first 'c's are individual characters, possibly spaces, then a space after that, then the 's's are a string that is 8 or 9 characters long, then there's a comma and then an integer.
I'm really new to c and I'm trying to figure out how to put this into 4 seperate variables per line (each of the first two characters, the string, and the number)
Any suggestions? I've looked at fscanf and strtok but i'm not sure how to make them work for this.
Thank you.
I'm assuming this is a C question, as the question suggests, not C++ as the tags perhaps suggest.
Read the whole line in.
Use strchr to find the comma.
Do whatever you want with the first two characters.
Switch the comma for a zero, marking the end of a string.
Call strcpy from the fourth character on to extract the sssssss part.
Call atoi on one character past where the comma was to extract the integer.
A string is a sequence of characters that ends at the first '\0'. Keep this in mind. What you have in the file you described isn't a string.
I presume n is an integer that could span multiple decimal places and could be negative. If that's the case, I believe the format string you require is "%2[^ ] %9[^,\n],%d". You'll want to pass fscanf the following expressions:
Your FILE *,
The format string,
An array of 3 chars silently converted to a pointer,
An array of 9 chars silently converted to a pointer,
... and a pointer to int.
Store the return value of fscanf into an int. If fscanf returns negative, you have a problem such as EOF or some other read error. Otherwise, fscanf tells you how many objects it assigned values into. The "success" value you're looking for in this case is 3. Anything else means incorrectly formed input.
I suggest reading the fscanf manual for more information, and/or for clarification.
fscanf function is very powerful and can be used to solve your task:
We need to read two chars - the format is "%c%c".
Then skip a space (just add it to the format string) - "%c%c ".
Then read a string until we hit a comma. Don't forget to specify max string size. So, the format is "%c%c %10[^,]". 10 - max chars to read. [^,] - list of allowed chars. ^, - means all except a comma.
Then skip a comma - "%c%c %10[^,],".
And finally read an integer - "%c%c %10[^,],%d".
The last step is to be sure that all 4 tokens are read - check fscanf return value.
Here is the complete solution:
FILE *f = fopen("input_file", "r");
do
{
char c1 = 0;
char c2 = 0;
char str[11] = {};
int d = 0;
if (4 == fscanf(f, "%c%c %10[^,],%d", &c1, &c2, str, &d))
{
// successfully got 4 values from the file
}
}
while(!feof(f));
fclose(f);

using scanf and family to read in two strings separated by a space from a file in c

I am trying to read in 2 string that are separated by a space from a file.
Whatever I try I keep getting the 1st string initialized but the second string is always NULL.
Some of the formatters that I have tried are "%s%s" , "%s %s" , "%s[\n\t ]%s"
Any ideas of what I am doing wrong?
I think it has to do with the internal buffer of scanf -- reads the first %s then puts some invisible character in buffer reads that with 2nd %s which then is read and the second string is NULL when complete.
What do your strings look like?
I don't think your hypothesis of fscanf() modifying the input data by placing "some invisible character in buffer" is true.
It seems more likely that your strings don't conform to the requirements of the %s format specifier.
between your string, if there is just one space
fscanf ( ..., "%s %s", ... ) ; // you know how to fill space marked with ...
but if number of white space is not known :
char stack[YourscreenSize];
fscanf ( ..., "YourscreenSize[^\n]", stack ); // take all line in one data,
then parse it
if number of white space is not known, ( second way )
take data from file
check it, send first char in isgraph function whether it is white space or not
if it is white space, then erase stored data
do it, iteratively.When you see EOF break from iteration, (you can chech return value of
fscanf to know whether it reads char or EOF )

Resources