C fscanf reading in the correct format - c

I'm totally stuck on the fscanf format string in C.
Alice:(44;69) Bob:(74;68) John:(57;98)
This is what I need to read from a file: Name:(score1;score2). But I failed to construct the correct format string for it:
while(fscanf(f, "%[a-zA-Z]%[;(]%d %d", &buff, &garbage, &s1, &s2)!= EOF){
What am I doing wrong?

First of all, if you check a reference for scanf (and family), you can see that you can add an asterisk to a conversion specification to suppress assignment, so there is no need to pass "garbage" variables.
Secondly, for your problem: the numbers are separated by a semicolon, but the format has a space there, which only matches whitespace.
In fact, thanks to the pattern matching built into scanf, you should be able to simplify the format specification to e.g.
fscanf(f, " %[^:]:(%d;%d)", buff, &s1, &s2)
The "%[^:]" format reads everything as a string until it sees a colon. The rest of the format then matches the colon, the left parenthesis, a decimal number, a semicolon, another decimal number and a right parenthesis. I also added a leading space in the format, to skip leading whitespace if there is any.
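As a rough sketch of how that format could be applied to the sample line, assuming the text is already in memory and stepping through it with sscanf and %n (the struct entry type, the parse_entries name and the 31-character name limit are made up for illustration and are not from the question):

```c
#include <stdio.h>

/* Hypothetical record type; the 32-byte name buffer is an assumption. */
struct entry {
    char name[32];
    int s1;
    int s2;
};

/* Parses up to max items of the form Name:(s1;s2) from text.
   Returns the number of complete items stored in out. */
int parse_entries(const char *text, struct entry *out, int max)
{
    int count = 0;

    while (count < max) {
        int used = 0;  /* stays 0 unless the closing ')' was matched */
        int n = sscanf(text, " %31[^:]:(%d;%d)%n",
                       out[count].name, &out[count].s1, &out[count].s2,
                       &used);
        if (n != 3 || used == 0)
            break;
        text += used;  /* continue after the item just parsed */
        count++;
    }
    return count;
}
```

When reading from a FILE * with fscanf instead, the same idea applies: compare the return value with 3 rather than with EOF, because a matching failure returns a count below 3 without ever returning EOF.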

Related

Wanting scanf to proceed without all parameters filled [duplicate]

I want to have a scanf function that allows the user to input up to four integers separated by spaces but still run if only 2 integers are put in.
scanf("%d %d %d %d", &command, &num_one, &num_two, &num_three);
scanf does exactly that. It returns the number of successful conversions it performed. If it cannot perform a conversion (or cannot match a literal character), it stops reading precisely at that point.
You should always check its return value, even if the examples you are copying don't do that.
What scanf doesn't guarantee is that the values converted are separated by spaces. They might be separated by newlines. If you want a newline character to stop the scan, you need to read the line using something like fgets (or, better if possible, the Posix getline function), and then call sscanf on the line which was read.
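A minimal sketch of the read-a-line-then-sscanf approach, assuming lines fit in 128 bytes (the function names, the values array and the buffer size are illustrative choices, not from the question):

```c
#include <stdio.h>

/* Converts up to four integers from one line.
   Returns sscanf's result: the number of integers converted. */
int parse_command(const char *line, int values[4])
{
    return sscanf(line, "%d %d %d %d",
                  &values[0], &values[1], &values[2], &values[3]);
}

/* Reads one line from in and parses it.
   Returns the conversion count, or EOF if no line could be read. */
int read_command(FILE *in, int values[4])
{
    char line[128];  /* assumed maximum line length */

    if (fgets(line, sizeof line, in) == NULL)
        return EOF;
    return parse_command(line, values);
}
```

The caller can then branch on the return value: 2 means only the first two integers were supplied, 4 means all of them were.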
You could also force scanf to stop at the end of the line by using %*[ \t] instead of a space to separate the %ds, which will only match space and tab characters. (The * causes scanf to not try to save the matched string, and also to not count the conversion in its return count.) But that will run you into the other problem with scanf: if there is garbage in the line, you normally want to continue reading with the next line. The getline/sscanf solution will do that for you. If you use scanf, you'll need to manually flush the rest of the input line, which requires calling fgets or getline anyway.
And while I'm at it, note that there is no difference between scanf("%d %d %d %d", ...) and scanf("%d%d%d%d", ...), because %d, like all scanf conversions other than %c, %[ and %n, skips leading whitespace.
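To check that claim on a given input, a small comparison helper could look like this (formats_agree is a hypothetical name used only for this sketch):

```c
#include <stdio.h>

/* Returns 1 if "%d %d %d %d" and "%d%d%d%d" give the same return value
   and the same converted numbers for input, 0 otherwise. */
int formats_agree(const char *input)
{
    int a[4] = {0, 0, 0, 0};
    int b[4] = {0, 0, 0, 0};
    int na = sscanf(input, "%d %d %d %d", &a[0], &a[1], &a[2], &a[3]);
    int nb = sscanf(input, "%d%d%d%d", &b[0], &b[1], &b[2], &b[3]);
    int i;

    if (na != nb)
        return 0;
    for (i = 0; i < 4; i++) {
        if (a[i] != b[i])
            return 0;
    }
    return 1;
}
```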

Parsing string with sscanf that has a string in it

Project is in C. I need to parse strings that are always formatted the following way: integer, whitespace, plus sign, multi-word string, plus sign, white space, integer, whitespace, integer, end-of-line
Example:
10 +This is 1 string+ 2 -1
I'm having a hard time figuring out what to put in the sscanf format so that the string surrounded by the '+' signs gets parsed correctly, without including the + signs (assuming sscanf can be used for this case).
I tried "%d +%s+ %d %d" and that didn't work.
You use %s but that reads up to the first white space character. You want to read a string of not-plus-signs, so say that's what sscanf() should do:
"%d +%[^+]+ %d %d"
That's a scan set — see POSIX sscanf(). You should also protect yourself from buffer overflow. If you have:
char buffer[256];
use:
"%d +%255[^+]+ %d %d"
Note the off-by-one in the lengths — this is a design feature of the scanf() family of functions. You could skip leading spaces by putting a space after the first + in the format string. It is not possible to skip trailing spaces before the second + in the data; you'll have to remove those separately.
You ask for 'end of line' after the 3rd number. That's fairly hard. You might use:
"%d +%255[^+]+ %d %d %n"
passing an extra pointer to int argument to hold the number of characters consumed, i.e. the offset just past the last character parsed. The blank before the %n skips white space, including newlines, so if you read into int nbytes; (passing &nbytes), then you'd check if (line[nbytes] != '\0') { /* handle trailing garbage */ }, where line is the string passed to sscanf(), but only after checking that you had four successful conversions (%n conversion specifications are not counted in the return value from sscanf() et al). There are other solutions to that; they're all grubby to some extent.
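Putting the scan set, the width limit and the %n check together, one possible sketch is shown below. The parse_record name and its parameter list are assumptions made for illustration; the text buffer must hold at least 256 bytes to match the %255 width.

```c
#include <stdio.h>

/* Returns 1 if line has the form "<int> +<text>+ <int> <int>" followed
   only by white space, 0 otherwise. text must have room for 256 bytes. */
int parse_record(const char *line, int *first, char text[256],
                 int *second, int *third)
{
    int nbytes = 0;
    int n = sscanf(line, "%d +%255[^+]+ %d %d %n",
                   first, text, second, third, &nbytes);

    if (n != 4)          /* %n is not counted in the return value */
        return 0;
    return line[nbytes] == '\0';  /* anything left is trailing garbage */
}
```

For the example line "10 +This is 1 string+ 2 -1" this stores 10, "This is 1 string", 2 and -1.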

Read File: fscanf doesn't read whitespaces?

I have a problem fetching lines from File Pointer using fscanf.
Let's say I want to fetch a line like this:
<123324><sport><DESCfddR><spor ds>
fscanf fetches only this part:
<123324><sport><DESCfddR><spor
Does anybody know how to overcome this problem?
Thanks in advance.
In conclusion, the best way to read lines that contain whitespace is to use fgets:
fgets(currentLine, MAX_LENGTH, filePointer);
With fscanf you are going to run into a lot of problems.
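A minimal sketch of the fgets approach, including removal of the trailing newline that fgets keeps. The MAX_LENGTH value and the two helper names are assumptions; the answer gives neither.

```c
#include <stdio.h>
#include <string.h>

#define MAX_LENGTH 256  /* assumed size; not given in the original */

/* Removes a trailing newline, if any, and returns the new length. */
size_t strip_newline(char *s)
{
    size_t len = strcspn(s, "\n");

    s[len] = '\0';
    return len;
}

/* Reads one line, spaces included, into currentLine.
   Returns 1 on success, 0 at end of file or on a read error. */
int read_line(FILE *filePointer, char currentLine[MAX_LENGTH])
{
    if (fgets(currentLine, MAX_LENGTH, filePointer) == NULL)
        return 0;
    strip_newline(currentLine);
    return 1;
}
```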
You are probably using %s in the fscanf to read data. From the C11 standard,
7.21.6.2 The fscanf function
[...]
The conversion specifiers and their meanings are:
[...]
s Matches a sequence of non-white-space characters.
[...]
So %s stops scanning at the first whitespace character or, if a maximum field width is given, after that many characters, whichever comes first.
How to fix this problem? Use a different format specifier:
fscanf(fp ," %[^\n]", buffer);
In the above fscanf, the leading space in the format skips any whitespace up to the first non-whitespace character, and then %[^\n] scans everything up to, but not including, the next \n character.
You can further improve security by using
fscanf(fp ," %M[^\n]", buffer);
Replace M with the size of buffer minus one (one byte is reserved for the NUL terminator). M is the maximum field width. Checking the return value of fscanf is also a good idea.
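To contrast %s with the width-limited scan set, here is a small sketch that uses sscanf on an in-memory string so it is easy to try out. The 64-byte buffer, and therefore the width 63, is an arbitrary assumption, and both function names are hypothetical.

```c
#include <stdio.h>

/* Stores the first whitespace-delimited word of src in buffer. */
int scan_word(const char *src, char buffer[64])
{
    return sscanf(src, "%63s", buffer);
}

/* Stores the first non-empty line of src, spaces included, in buffer.
   At most 63 characters are stored, plus the terminating NUL. */
int scan_line(const char *src, char buffer[64])
{
    return sscanf(src, " %63[^\n]", buffer);
}
```

Both return 1 when something was stored; scan_word stops at the first space, while scan_line keeps going until the newline.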
Using fgets() is a better way though.

How to use the FORMAT part of the sscanf function (or how to use the sscanf function in general)

I have encountered the following line in a program. From reading the manual, I know that sscanf reads from wherever argv[2] points, but I'm not sure why the FORMAT has been specified as %d and, at the same time, %c (I've seen other examples where more format specifiers are included in the double quotes). Is it because sscanf writes a decimal to the struct element g.number and a character to c? Thanks!
sscanf(argv[2], " %d %c", &g.number, &c)
Yes. Reading from argv[2], one integer to g.number and one char to c.
For example argv[2] could be "123 a"
The format specifies zero or more whitespace characters, followed by a number, followed by zero or more whitespace characters, then a character. It stores the number in g.number and the character in c. I'm not sure why it scans for a number and then a character, but if the input does not start with a number, nothing is assigned to either variable and sscanf returns zero, which is the number of items it assigned. In that case g.number keeps whatever value it had before the call; if it was never initialized, that value is indeterminate.
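A short sketch of checking the return value for this format, with plain int and char outputs standing in for g.number and c (the wrapper name is hypothetical):

```c
#include <stdio.h>

/* Returns how many of the two fields were converted: 2 on full success,
   1 if only the number was found, 0 if arg does not start with a number,
   or EOF if arg ends before the first conversion. */
int parse_number_and_char(const char *arg, int *number, char *c)
{
    return sscanf(arg, " %d %c", number, c);
}
```

Only when the result is 2 are both outputs safe to use; on a result of 0 neither has been written.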

Making fscanf Ignore Optional Parameter

I am using fscanf to read a file which has lines like
Number <-whitespace-> string <-whitespace-> optional_3rd_column
I wish to extract the number and string out of each column, but ignore the 3rd_column if it exists
Example Data:
12 foo something
03 bar
24 something #randomcomment
I would want to extract 12,foo; 03,bar; 24, something while ignoring "something" and "#randomcomment"
I currently have something like
while(scanf("%d %s %*s",&num,&word)>=2)
{
assign stuff
}
However this does not work with lines with no 3rd column. How can I make it ignore everything after the 2nd string?
The problem is that the %*s is eating the number on the next line when there's no third column, and then the next %d is failing because the next token is not a number. To fix it without using fgets() followed by sscanf(), you can use a scanset (character class) specifier:
while (scanf("%d %s%*[^\n]", &num, word) == 2)
{
    /* assign stuff */
}
The [^\n] says to match as many characters as possible that aren't newlines, and the * suppresses assignment as before. Also note that you can't put a space between the %s and the %*[^\n], because otherwise that space in the format string would match the newline, causing the %*[^\n] to match the entire subsequent line, which is not what you want.
It would appear to me that the simplest solution is to scanf("%d %s", &num, &word) and then fgets() to eat the rest of the line.
Use fgets() to read a line at a time and then use sscanf() to look for the two columns you are interested in. This is more robust, and you don't have to do anything special to ignore trailing data.
I often use fgets() (not gets(), which cannot bound its input and was removed in C11) followed by an sscanf() on the string you just, er, got.
Bonus: you can separate the test for end-of-input from the parsing.
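A minimal sketch of the line-at-a-time approach for this data, assuming lines fit in 256 bytes and words in 63 characters (both limits and both function names are made up for illustration):

```c
#include <stdio.h>

/* Extracts the first two columns of one line; anything after them is
   ignored. Returns sscanf's result; 2 means both columns were found. */
int parse_columns(const char *line, int *num, char word[64])
{
    return sscanf(line, "%d %63s", num, word);
}

/* Reads lines from in until end of file.
   Returns how many lines had both columns. */
int count_valid_lines(FILE *in)
{
    char line[256];  /* assumed maximum line length */
    char word[64];
    int num;
    int valid = 0;

    while (fgets(line, sizeof line, in) != NULL) {
        if (parse_columns(line, &num, word) == 2)
            valid++;  /* "assign stuff" would go here */
    }
    return valid;
}
```

Here the fgets() result signals end of input and the sscanf() result signals a malformed line, so the two conditions are tested separately.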
