How to have scanf ignore extraneous text from an input file? - c

I'm attempting to give a program input of the following form:
11 22 #3
30 2 #^
1 4 #B asdfghj
However, there are several lines (like the last) that have extra text in them. I'm trying to have the program ignore them to no avail.
Below is my current code:
int x_coord;
int y_coord;
char type[3];
do(grid[y_coord][x_coord]=type[1]);
while(scanf("%d %d %s",&x_coord,&y_coord,type)!=EOF);{
for(i=1; i<=30;i++){
for(j=1; j<=30;j++){
printf("%c",grid[i][j]);
}
printf("\n");
}
}
I've tried to add an extra parameter %*s to scanf to attempt to catch any extra text, but I couldn't get it to run.
Does anyone have any suggestions on how to handle the extra text in the input file?

Your loop seems very oddly formed.
do(grid[y_coord][x_coord]=type[1]);
while(scanf("%d %d %s",&x_coord,&y_coord,type)!=EOF);
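Are y_coord, x_coord, and type initialized properly for the assignment to work? In a do...while loop the body runs once before the first call to scanf().
Also, scanf() returns the number of items successfully converted and assigned, so comparing the result against 3 rather than EOF gives you additional error checking.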
In any case, if your input is well-formed, then you can use the following format string:
scanf("%d %d %s%*[^\n]%*c",&x_coord,&y_coord,type)
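After parsing the first three arguments, this will discard anything that is not a newline character, and then discard the newline itself. If nothing follows the third field, %*[^\n] matches nothing and scanf() stops there without reaching %*c; that is harmless here, because the %d in the next call skips the leftover newline.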
Using scanf() is error prone because badly formed input can cause scanning to jam or unexpected errors as unintended portions of the input get interpreted incorrectly. For line delimited input, it is better to read in the entire line first with fgets() and then parse the line with sscanf().

Related

C skips user input from subsequent scanf statement [duplicate]

I've been having a lot of problems trying to figure out how to use scanf(). It seems to work fine with integers, being fairly straight forward scanf("%d", &i).
Where I am running into issues is using scanf() in loops trying to read input. For example:
do {
printf("counter: %d: ", counter);
scanf("%c %c%d", &command, &prefix, &input);
} while (command != 'q');
When I enter in a validly structured input like c P101, it seems to loop again before prompting me. This seems to happen even with a single:
scanf("%c", &c)
in a while loop. It'll do the loop twice before prompting me again. What is making it loop twice, and how do I stop it?
When I enter in less amount of input that programmatically wouldn't have another character or number such as q, pressing enter seems to prompt me to enter more. How do I get scanf() to process both single and double character entries?
When you enter "c P101" the program actually receives "c P101\n". Most conversion specifiers skip leading whitespace, including newlines, but %c does not. The first time around, everything up to the "\n" is read. The second time around, the "\n" is read into command, "c" is read into prefix, and "P" is left, which is not a number, so the conversion fails and "P101\n" is left on the stream. The next time, "P" is stored into command, "1" is stored into prefix, and 1 (from the remaining "01") is stored into input, with the "\n" still on the stream for next time. You can fix this by putting a space at the beginning of the format string, which skips any leading whitespace, including newlines.
A similar thing happens in the second case. When you enter "q", the stream receives "q\n". The first call reads the "q", the second call reads the "\n", and only the third call reads the second "q". Again, you can avoid the problem by adding a space character at the beginning of the format string.
A better way to do this would be to use something like fgets() to process a line at a time and then use sscanf() to do the parsing.
It's really broken! I didn't know it
#include <stdio.h>
int main(void)
{
    int counter = 1;
    char command, prefix;
    int input;
    do
    {
        printf("counter: %d: ", counter);
        scanf("%c %c%d", &command, &prefix, &input);
        printf("---%c %c%d---\n", command, prefix, input);
        counter++;
    } while (command != 'q');
}
counter: 1: a b1
---a b1---
counter: 2: c d2
---
 c1---
counter: 3: e f3
---d 21---
counter: 4: ---e f3---
counter: 5: g h4
---
 g3---
The output seems to fit with Robert's answer.
Once you have the string that contains the line, e.g. "C P101", you can use the parsing abilities of sscanf().
See:
http://www.cplusplus.com/reference/clibrary/cstdio/sscanf.html
For question 1, I suspect that you've got a problem with your printf(), since there is no terminating "\n".
When stdout is connected to a terminal it is typically line-buffered, so output without a trailing "\n" may not appear immediately unless you call fflush(stdout) or explicitly change the buffering on stdout.
For question 2, you've just hit one of the biggest problems with scanf(). Unless your input exactly matches the scan string that you've specified, your results are going to be nothing like what you expect.
If you've got an option you'll have better results (and fewer security issues) by ignoring scanf() and doing your own parsing. For example, use fgets() to read an entire line into a string, and then process the individual fields of the string — maybe even using sscanf().
Perhaps using a while loop, not a do...while loop will help. This way the condition is tested before execution of the code.
Try the following code snippet:
while(command != 'q')
{
//statements
}
Also, if you know the string length ahead of time, 'for' loops can be much easier to work with than 'while' loops. There are also some trickier ways to dynamically determine the length as well.
As a final rant: scanf() does not "suck." It does what it does and that is all.
The gets() function is very dangerous (though convenient for no-risk applications), since it does no checking of the input length. It is VERY commonly known as a point of exploit, specifically buffer overflow attacks, which overwrite memory (not registers) beyond the space allocated for that variable. Because the caller cannot limit how much gets() writes, error checking after the call cannot prevent the overflow, and the function was removed from the C standard in C11.
Almost invariably, either fgets() or POSIX getline() should be used to read the line instead. Note that both functions include the newline in the input string, unlike gets(). You can remove the trailing newline from the string read by either fgets() or getline() using string[strcspn(string, "\n")] = '\0'; this works reliably.

How scanf works if I add a new line '\n' at the end

#include <stdio.h>
int main(){
    int a,b,c;
    scanf("%d%d%d\n",&a,&b,&c);
    printf("%d %d %d",a,b,c);
    return 0;
}
Look, I am taking 3 inputs and printing 3 outputs. But since I added a newline at the end of the scanf format string, I have to give 4 inputs to get the 3 outputs (I get the first 3 that I gave as input). How does scanf work here?
And in this case:
#include <stdio.h>
int main(){
    double n[2][5]; int i,j;
    for (i=0;i<=1;i++){
        for(j=0;j<=4;j++){
            scanf("%lf\n",&n[i][j]);
            printf("Class=%d Roll=%d Marks=%lf\n",i+6,j+1,n[i][j]);
        }
    }
    return 0;
}
Look, I have to give 11 inputs to get the 10 outputs, and each time I give an input I get the previous input as output. How is scanf working here?
A whitespace character in a scanf format matches any sequence of whitespace characters in the input (including none), up to the next non-whitespace character.
Newline is a whitespace character, and this explains the behavior of your program: if your scanf format ends with a newline, scanf does not return until it sees a non-whitespace character after the last parsed input, or reaches end of file.
scanf is only used to read values. What you need is for your output to be on a new line, so you must put the \n in the printf format string, since that is where you want a newline printed. As for why it is asking for four inputs, I'm not sure. The syntax says that you must use % and a letter according to the type of data to be accepted. I'll read more on this and get back.
Thank You.

arguments of a scanf statement

I wrote this code:
scanf("%d \n", &n);
for(i=0;i<n;i++)
printf("%d \n",i);
It was not printing. I realised that there was a '\n' in the call to scanf. When I removed that I got the expected output. Why was it not giving the output when the scanf format string contained a '\n'?
What is the reason?
The '\n' in your format string is a whitespace directive, and whitespace in a scanf format matches any amount of whitespace in the input, including newlines. After reading the number, scanf consumed your first return as part of that whitespace and then kept waiting for a non-whitespace character to show that the whitespace had ended. If you provide another token, followed by a return, you get the expected results.
So, if you supplied:
2
7
You would get the output:
0
1
This is because the first number (2) has been matched against your first format specifier. You need to provide another token because just pressing return on the subsequent line only adds more whitespace, which the whitespace directive at the end of the format keeps consuming. scanf returns only when it sees a non-whitespace character (which it leaves unread for the next call) or end of file.

Making fscanf Ignore Optional Parameter

I am using fscanf to read a file which has lines like
Number <-whitespace-> string <-whitespace-> optional_3rd_column
I wish to extract the number and string out of each column, but ignore the 3rd_column if it exists
Example Data:
12 foo something
03 bar
24 something #randomcomment
I would want to extract 12,foo; 03,bar; 24, something while ignoring "something" and "#randomcomment"
I currently have something like
while(scanf("%d %s %*s",&num,&word)>=2)
{
assign stuff
}
However this does not work with lines with no 3rd column. How can I make it ignore everything after the 2nd string?
The problem is that the %*s is eating the number on the next line when there's no third column, and then the next %d fails because the next token is not a number. To fix it without using fgets() followed by sscanf(), you can use the character class (scanset) specifier:
while(scanf("%d %s%*[^\n]", &num, &word) == 2)
{
assign stuff
}
The [^\n] says to match as many characters as possible that aren't newlines, and the * suppresses assignment as before. Also note that you can't put a space between the %s and the %*[^\n], because that space in the format string would match the newline, causing the %*[^\n] to match the entire subsequent line, which is not what you want.
It would appear to me that the simplest solution is to scanf("%d %s", &num, &word) and then fgets() to eat the rest of the line.
Use fgets() to read a line at a time and then use sscanf() to look for the two columns you are interested in, more robust and you don't have to do anything special to ignore trailing data.
I often use gets() followed by an sscanf() on the string you just, er, gots.
Bonus: you can separate the test for end-of-input from the parsing.
