fgets() reading numbers to space - c

I've got a little problem:
I want fgets() to act like scanf("%d",...) - read input to whitespace, not whole line. Is there any way to make it work like that?
Thanks in advance

Use fgets() to save the whole line to a char array. Then write a function that uses strtok() to slice your line into substrings, separated by spaces, and check each substring to see if it contains only digits. If it is so, use sscanf() to read from that substring to a variable.
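A minimal sketch of that first approach, assuming tokens are separated by blanks, tabs or newlines and that every accepted value fits in an int. The names is_integer_token and parse_ints are hypothetical helpers, not library functions:

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if tok is an optional sign followed by
   one or more digits and nothing else. */
int is_integer_token(const char *tok)
{
    if (*tok == '+' || *tok == '-')
        tok++;
    if (*tok == '\0')
        return 0;
    for (; *tok != '\0'; tok++)
        if (!isdigit((unsigned char)*tok))
            return 0;
    return 1;
}

/* Hypothetical helper: splits line in place (strtok modifies it) and stores
   up to max valid integers in out; returns how many were stored. */
size_t parse_ints(char *line, int *out, size_t max)
{
    size_t count = 0;
    char *tok;

    for (tok = strtok(line, " \t\r\n"); tok != NULL && count < max;
         tok = strtok(NULL, " \t\r\n"))
    {
        /* only digit-only tokens reach sscanf(), so "6c7" is rejected */
        if (is_integer_token(tok) && sscanf(tok, "%d", &out[count]) == 1)
            count++;
    }
    return count;
}
```

A caller would fill a buffer with fgets(line, sizeof line, f) and pass it to parse_ints(line, values, max) once per line.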
Alternatively, you can use fscanf() in the first place, with format "%s" to read a string from the file. fscanf() will stop reading upon reaching a separator (space, new line, etc). Check the string read and if it contains a valid number, use sscanf() or atoi() to convert it into a numeric value.
I've come up with this code:
#include <stdio.h>
#define VALUE_NOT_PRESENT -1 /* A value you won't expect in your file */
int main(void)
{
    FILE *f;
    char s[256];
    int n;
    f = fopen("test.txt", "r");
    if (f == NULL)
    {
        perror("test.txt");
        return 1;
    }
    /* Loop on fscanf()'s return value rather than on feof(), and cap the
       token length at 255 characters so s cannot overflow */
    while (fscanf(f, "%255s", s) == 1)
    {
        n = VALUE_NOT_PRESENT;
        sscanf(s, "%d", &n); /* if s cannot be converted to a number, n won't
                                be updated, so we can use that to check if
                                the number in s is actually a valid number */
        if (n == VALUE_NOT_PRESENT)
            printf("Discarded ");
        else
            printf("%d ", n);
    }
    fclose(f);
    printf("\n");
    return 0;
}
It works by using the ability of *scanf family functions to not update the variable if the characters read cannot form a valid number.
Executed with a file with this content:
1 2 -3
-4 abc
5 6 a12 6c7
It recognizes abc and a12 as invalid numbers, so they are discarded. Unfortunately, it reads 6c7 as the number 6, because sscanf() stops at the first character that cannot be part of a number. I don't know if this is OK for you. If not, you will probably have to write a function that uses a state-machine-driven parser to accept or reject the string as a number. I don't know if such a function exists in the standard library, but one is surely available out there.
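For what it's worth, strtol() from <stdlib.h> reports where the conversion stopped, which is enough to reject tokens such as 6c7 without a hand-written parser. A hedged sketch, where parse_int_strict is a hypothetical helper name and the token is assumed to have no leading whitespace (true for anything read with "%s"):

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: converts s to an int only if the whole string is a
   valid decimal number. Returns 1 on success, 0 otherwise. */
int parse_int_strict(const char *s, int *out)
{
    char *end;
    long v;

    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s)           /* no digits were consumed: "abc", "a12", "" */
        return 0;
    if (*end != '\0')       /* trailing junk: "6c7" stops at 'c' */
        return 0;
    if (errno == ERANGE || v < INT_MIN || v > INT_MAX)
        return 0;           /* out of range for int */
    *out = (int)v;
    return 1;
}
```

Because success is reported through the return value, no sentinel such as VALUE_NOT_PRESENT is needed, so -1 becomes a legal input.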

Related

General questions about scanf and fscanf in C programming language

If I'm not wrong, library function int fscanf(FILE *stream, const char *format, ...) works
exactly the same as function int scanf(const char *format, ...) except that it requires stream selection.
For example if I wanted to read two ints from standard input the code would look something like this.
int first_number;
int second_number;
scanf("%d%d", &first_number, &second_number);
There's no point of me adding newline character in between format specifiers even though the second number is entered in next line of input? Function just looks for next decimal integer right? What happens when I enter two characters instead of ints? Why the function sometimes doesn't work if there's a space between format specifiers?
In addition to that: when reading from a file with fscanf(..), let's say the txt file contains the following lines:
P6
255
1920 1080
Do I need to specify next line characters in fscanf(..)? I read it like this.
FILE *input = ..
char type[2];
int tr;
int width; int height;
fscanf(input, "%s\n", &type);
fscanf(input, "%d\n", &tr);
fscanf(input, "%d %d\n", &width, &height);
Is there a need for \n to signal next line?
Can fscanf(..) anyhow affect any other functions for reading files like fread()? Or is it a good practice to just stick to one function through the whole file?
scanf(...) operates like fscanf(stdin, ....).
Unless '\n', ' ', or other white-space characters are inside a "%[...]", any white-space in a *scanf() format functions the same: ' ', '\t' and '\n' are interchangeable. (The same goes for '\v', '\r', '\f'.)
// All function the same.
fscanf(input, "%d\n", &tr);
fscanf(input, "%d ", &tr);
fscanf(input, "%d\t", &tr);
There's no point of me adding newline character in between format specifiers even though the second number is entered in next line of input?
All format specifiers except "%n", "%[...]", "%c" consume optional leading white-spaces. With "%d" there is no need, other than style, to code a leading white-space in the format.
Function just looks for next decimal integer right?
Simply: yes.
What happens when I enter two characters instead of ints?
Scanning stops. The first non-numeric input remains in stdin as well as any following input. The *scanf() return value reflects the incomplete scan.
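A small sketch of how the return value reflects an incomplete scan. It uses sscanf() on a string so the behaviour is easy to reproduce; scan_two_ints is a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: tries to read two ints from text and returns the
   sscanf() result so the caller can see how far the scan got. */
int scan_two_ints(const char *text, int *a, int *b)
{
    return sscanf(text, "%d%d", a, b);
}
```

With "12\n34" the result is 2, with "7 x" it is 1 (only the first int was stored), with "ab" it is 0, and with an empty string it is EOF.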
Why the function sometimes doesn't work if there's a space between format specifiers?
Need example. Having spaces between specifiers is not an issue unless the following specifier is one of "%n", "%[...]", "%c".
When reading from file with fscanf(..), .... Do I need to specify next line characters in fscanf(..)?
No. fscanf() is not line orientated. Use fgets() to read lines. fscanf() is challenging to use to read a line. Something like
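char buf[100];
int cnt = fscanf(f, "%99[^\n]", buf);  // read up to 99 non-newline characters
if (cnt == 0) {
    buf[0] = 0;                        // empty line: nothing was stored
}
if (cnt != EOF) {
    cnt = fscanf(f, "%*1[\n]");        // consume the one trailing '\n'
}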
I read it like this. ... fscanf(input, "%s\n", &type); fscanf(input, "%d\n", &tr); ....
"It", as in a line, is not read properly: "%s", "%d" and "\n" each consume 0, 1, 2, ... '\n' and other white-space characters. They do not read a line, nor just the one character in the format.
Further, "\n" does not complete upon reading 1 '\n', but continues reading all white-spaces until a non-white-space is detected (or end-of-file). Do not append such to the end of a format to read the rest of the line.
If you want to read the trailing '\n', code could use int cnt = fscanf(input, "%d%*1[\n]", &tr);, but code will not know if it succeeded in reading the trailing '\n' after the int. It will have simply read it if it was there. Could use other formats, but really, using fgets() to read a line is better.
Is there a need for \n to signal next line?
No. As a format, "\n" reads 0 or more white-spaces, not just new-lines.
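One way to see this is to ask "%n" how many characters a format ending in "\n" has consumed. This is only a sketch on a string via sscanf(); consumed_by_newline_format is a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: scans one int with the format "%d\n" and reports,
   via %n, how many characters of text were consumed in total. */
int consumed_by_newline_format(const char *text)
{
    int value = 0;
    int consumed = -1;   /* stays -1 if the scan never reaches %n */

    sscanf(text, "%d\n%n", &value, &consumed);
    return consumed;
}
```

For "12 \t\n\n  34" the count covers the number and every white-space character up to the 3, and for "12 34" the single blank is consumed even though the format says "\n".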
Can fscanf(..) anyhow affect any other functions for reading files like fread()?
Yes. All input functions affect what is available next for other input functions. Mixing fread() and fscanf() is challenging to get right.
is it a good practice to just stick to one function through the whole file?
It certainly is simpler. I recommend to use input functions as building blocks for a helper function to handle your file input.
Tip: Read lines with fgets(), then parse. Set fscanf() aside until you understand why it has so much trouble with unexpected input.
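As a rough illustration of that tip applied to the three-line header from the question, here is a sketch that reads each line with fgets() and parses it with sscanf(). The name read_header is hypothetical, lines are assumed to be shorter than 128 characters, and the type buffer holds three bytes because "P6" needs room for its terminating null:

```c
#include <stdio.h>

/* Hypothetical helper for the three-line header shown in the question
   (type, maximum value, then width and height). Returns 1 on success. */
int read_header(FILE *in, char type[3], int *maxval, int *width, int *height)
{
    char line[128];

    /* line 1: at most two non-blank characters, e.g. "P6" */
    if (fgets(line, sizeof line, in) == NULL || sscanf(line, "%2s", type) != 1)
        return 0;
    /* line 2: a single int */
    if (fgets(line, sizeof line, in) == NULL || sscanf(line, "%d", maxval) != 1)
        return 0;
    /* line 3: two ints separated by white-space */
    if (fgets(line, sizeof line, in) == NULL ||
        sscanf(line, "%d %d", width, height) != 2)
        return 0;
    return 1;
}
```

Since every line is consumed whole, a malformed line cannot leave stray characters behind for the next read.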
The %d conversion specifier tells scanf and fscanf to skip over any leading whitespace, then read up to the first character that cannot be part of the number, so you don’t need to put a newline between the two %d in the scanf call. In fact, a newline in the format would not force the inputs onto separate lines: any whitespace in a format matches any amount of whitespace in the input, blanks included.
Most conversion specifiers skip over leading whitespace - the only ones that don’t are %c, %[ and %n, so you’ll want to be careful when using them.

Scanf: detect that the input was too long

We can easily limit the length of the input accepted by scanf:
char str[101];
scanf("%100s", str);
Is there any efficient way to find out that the string was trimmed? We could, for example, report an error in such case.
We could read "%101s" into char strx[102] and check with strlen() but this involves extra cost.
Use the %n conversion to write the scan position to an integer. If it was 100 past the beginning then the string was too big.
I find that %n is useful for all kinds of things.
I thought the above was plenty of information for anyone who had read the scanf docs / man page and had actually tried it.
The idea is that you make your buffer and your scan limit bigger than whatever size string you expect to find. Then if you find a scan result that is exactly as big as your scan limit you know it is an invalid string. Then you report an error or exit or whatever it is that you do.
Also, if you're about to say "But I want to report an error and continue on the next line but scanf left my file in an unknown position."
That is why you read a line at a time using fgets and then use sscanf instead of scanf. It removes the possibility of ending the scan in the middle of the line and makes it easy to count line numbers for error reporting.
So here is the code that I just wrote:
#include <stdio.h>
#include <stdlib.h>
int scan_input(const char *input) {
    char buf[101] = "";  /* stays empty if nothing matches */
    int position = 0;
    /* %n stores the number of characters consumed so far */
    int matches = sscanf(input, "%100s%n", buf, &position);
    printf("'%s' matches=%d position=%d\n", buf, matches, position);
    if (matches < 1)
        return 2;
    if (position >= 100)
        return 3;
    return 0;
}
int main(int argc, char *argv[]) {
    if (argc < 2)
        exit(1);
    const char *input = argv[1];
    return scan_input(input);
}
And here is what happens:
$ ./a.out 'This is a test string'
'This' matches=1 position=4
$ ./a.out 'This-is-a-test-string'
'This-is-a-test-string' matches=1 position=21
$ ./a.out '01234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789'
'0123456789012345678901234567890123456789012345678901234567890123456789012345678901234567890123456789' matches=1 position=100
You could use fgets() to read an entire line. Then you verify if the newline character is in the string. However, this has a few disadvantages:
It will consume the entire line, and maybe that's not what you want. Notice that fgets() is not equivalent to scanf("%100s") -- the latter only reads until the first blank character appears;
If the input stream is closed before a newline character is supplied, you will be undecided;
You have to go through the array to search for the newline character.
So the better option seems to be as such:
/* needs <stdio.h> and <ctype.h> */
char str[101];
int c;
scanf("%100s", str);
c = getchar();
ungetc(c, stdin);
if (c == EOF || isspace(c)) {
    /* successfully read everything */
}
else {
    /* input was too long */
}
This reads the string normally and checks for the next character. If it's a blank or if the stream has been closed, then everything was read.
The ungetc() is there in case you don't want your test to modify the input stream. But it's probably unnecessary.
fgets() is a better way to go, read the line of user input and then parse it.
But if OP still wants to use scanf()....
Since it is not possible to "detect that the input was too long" without attempting to read more than the n maximum characters, code needs to read beyond.
unsigned char sentinel;
char str[101];
str[0] = '\0';
if (scanf("%100s%c", str, &sentinel) == 2) {
    ungetc(sentinel, stdin); // put back for next input function
    if (isspace(sentinel)) NoTrimOccurred();
    else TrimOccurred();
} else {
    NoTrimOccurred();
}
A very rough but easy way of doing this would be, adding a getchar() call after the scanf().
scanf() leaves the newline in the input buffer after reading the actual input. If the supplied input is shorter than the maximum field width, getchar() returns the newline. Otherwise, it returns the first unconsumed character.
That said, the ideal way of doing it is to actually read a bit more than the required value and see if anything appears in the buffer area. You can make use of fgets() and then, check for the 100th element value to be a newline or not but this also comes with additional cost.
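A sketch of the fgets() variant, assuming that an over-long line should simply have its remainder thrown away. The name read_line_checked is hypothetical:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: reads one line into buf (size bytes).
   Returns 1 if the whole line fit, 0 if it was cut short, EOF if nothing
   could be read. On truncation the rest of the line is discarded. */
int read_line_checked(FILE *in, char *buf, size_t size)
{
    int c;

    if (fgets(buf, (int)size, in) == NULL)
        return EOF;
    if (strchr(buf, '\n') != NULL)
        return 1;               /* newline stored: the line fit */
    c = getc(in);               /* no newline: look at what follows */
    if (c == EOF || c == '\n')
        return 1;               /* only the line ending was left over */
    while (c != '\n' && c != EOF)
        c = getc(in);           /* drop the remainder of the long line */
    return 0;
}
```

The extra cost mentioned above is the strchr() pass over the buffer; in exchange the stream is always left at the start of the next line.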

Regarding about file input and output in C

I normally don't ask questions on here unless I'm really stuck! I was wondering if anyone can please explain why my code prints out a '5 47'. I understand why there is a 5, but not why there is a 47? I looked up the ASCII values for blankspace (32) and I tried changing the second letter to e, f, g, for example but the output remains '5 47' unchanged.
In general, when I use fscanf(fp, "%d", &variablename), does the fscanf skip over miscellaneous characters? For example: in my file test.txt I had "5 hello 6 ben jerry\n". How would I scan in the 5 and the 6? Would fscanf(fp, "%d %d", &test1, &test2) scan in the 5 and 6, skipping over the word "hello"?
Here is my simple code I am using to test output:
int main(int argc, char *argv[]) {
    int blah, test;
    FILE * fp;
    fp = fopen(argv[1], "r");
    fscanf(fp, "%d %d", &blah, &test);
    printf("%d %d\n", blah, test);
    return 0;
}
My file I am using as argv[1] contents:
5g
P.S. is FILE *fp an actual pointer to each character/number and does it work as a placeholder when it scans through the file? Is that why we need rewind(fp) once it hits the end of the file?
The %d conversion specifier looks for an integer, not a character. Because g cannot be part of an integer, the second %d fails to match, and the output will not always be 5 47. The 47 could be anything: it could be 5 7, 5 23, etc. This is because fscanf is not reading a second number, so no value is assigned to test. Therefore, test keeps whatever indeterminate value happened to be sitting in that piece of memory.
To fix this, replace %d with %c and change the type of blah and test to char. Also, as WhozCraig said, it is good practice to check the return value of fscanf to confirm that two values have been found. This way, you can be sure that everything you're looking for has been read.
Note that the scanf() family of functions stop reading when they come across a character that is not expected by the format string. The unexpected character is left in the input for the next input operation to process.
If you want to read two integers that are definitely separated by a 'word' that is not an integer, then you will need to skip the word. If you don't know in advance what the word will be, you need to use assignment suppression (see the POSIX scanf() page for lots of information).
Hence, your code to read the two integers from input containing
5 hello 6 ben jerry
should be:
if (fscanf(fp, "%d %*s %d", &blah, &test) != 2)
…Oops; format error?…
Note that the code tests that it got the expected result. However, if you don't know whether there'll be a word between the two numbers, you are much better off using fgets() and sscanf() because you can try different parses of the same line:
char buffer[4096];
while (fgets(buffer, sizeof(buffer), fp) != 0)
{
if (sscanf(buffer, "%d %*s %d", &blah, &test) == 2)
…got two numbers with a word — let's go!
else if (sscanf(buffer, "%d %d", &blah, &test) == 2)
…got two numbers but no word — let's go!
else
…didn't recognize the format…
}
One of the major advantages of this is that you can report the error in terms of the complete line of input, rather than just the part that fscanf() didn't manage to work on.
As for your last question: a FILE * is not a pointer to each character in the file. It is a handle which allows you to invoke functions that take a file pointer argument to read from or write to the associated file. However, you can't use indexing based on the file pointer (so fp[1024] does not identify the character at offset 1024 in the file or anything useful like that). If you want that sort of behaviour, you need a memory-mapped file (mmap() for POSIX systems).
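Short of memory-mapping, the handle can still be used to fetch the byte at a given offset by seeking first. A minimal sketch, assuming a seekable stream opened in binary mode; byte_at is a hypothetical helper name:

```c
#include <stdio.h>

/* Hypothetical helper: returns the byte at offset in the stream, or EOF if
   the offset cannot be reached or lies past the end of the file. */
int byte_at(FILE *fp, long offset)
{
    if (fseek(fp, offset, SEEK_SET) != 0)
        return EOF;
    return getc(fp);
}
```

Each call moves the stream's position, which is also why rewind(fp) is needed before reading the file again from the start.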

How do I get scanf in C to print an error when a char is detected?

This is what I think the code should look like. It's inside a function (main) by the way.
char a;
if (a [is detected]) {
printf("Incorrect input format \n");
exit( EXIT_FAILURE );
}
Remember that digits are also characters. What you want to do is use scanf to scan for an integer and check the return value. The return value from the scanf family of functions is the number of successfully scanned items, or EOF if the input ends or a read error occurs before the first conversion. If you scan for a single integer (format "%d") and scanf doesn't return 1, there was an error.
So you could do something like
if (scanf(" %d", &number) == 1)
{
/* Got a number okay */
}
else
{
/* Not a number in the input */
}
Also remember that if scanf fails, the input is still there, so you can't just loop and hope the current input will be disregarded. A simple way around that is to use fgets to read one line of input, and then use sscanf to scan the newly read line.
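A sketch of that fgets-then-sscanf loop, assuming input lines are shorter than 256 characters and that a bad line should be reported and skipped rather than ending the program. The name read_int_line is hypothetical:

```c
#include <stdio.h>

/* Hypothetical helper: keeps reading lines from in until one starts with an
   integer. Returns 1 and stores it in *out, or 0 at end of input. */
int read_int_line(FILE *in, int *out)
{
    char line[256];

    while (fgets(line, sizeof line, in) != NULL)
    {
        if (sscanf(line, "%d", out) == 1)
            return 1;
        /* the bad line has already been consumed, so looping is safe */
        fprintf(stderr, "Incorrect input format\n");
    }
    return 0;
}
```

Calling read_int_line(stdin, &number) in place of the scanf call gives the same check without leaving rejected characters in the input.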
Use fread instead of scanf to read the input from stdin. Parse the input provided by user to check if char is given as input, then print the error.

How can I use fgets to scan from the keyboard a formatted input (string)?

I need to use fgets to get a formatted input from the keyboard (i.e., student number with a format 13XXXX). I know with scanf this could be possible, but I'm not too sure with fgets. How can I make sure that the user inputs a string starting with 13?
printf ( ">>\t\tStudent No. (13XXXX): " );
fgets ( sNum[b], sizeof (sNum), stdin);
EDIT: sNum is the string for the student number.
Thanks.
Your call to fgets() is probably wrong.
char line[4096];
char student_no[7];
printf(">>\t\tStudent No. (13XXXX): ");
if (fgets(line, sizeof(line), stdin) == 0)
    ...deal with EOF...
if (sscanf(line, "%6s", student_no) != 1)
    ...deal with oddball input...
if (strlen(student_no) != 6 || strncmp(student_no, "13", 2) != 0)
    ...too short or not starting 13...
You can apply further conditions as you see fit. Should the XXXX be digits, for example? If so, you can convert the string to an int, probably using strtol() since it tells you the first non-converted character, which you'd want to be the terminating null of the string. You can also validate that the number is 13XXXX by dividing by 10,000 and checking that the result is 13. You might also want to look at what comes after the first 6 non-blank characters (what's left in line), etc.
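The digit check described above might look like the following sketch, which assumes the XXXX part must be four decimal digits. The name is_valid_student_no is hypothetical:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns 1 if s is exactly six digits starting "13". */
int is_valid_student_no(const char *s)
{
    char *end;
    long value;

    if (strlen(s) != 6 || !isdigit((unsigned char)s[0]))
        return 0;               /* wrong length, or a sign/space up front */
    value = strtol(s, &end, 10);
    if (*end != '\0')
        return 0;               /* a non-digit stopped the conversion */
    return value / 10000 == 13; /* accepts 130000 through 139999 */
}
```

It would be called on the token extracted from the line, for example is_valid_student_no(student_no).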
Use fgets to get a string, then sscanf to interpret it. A format string of %i will interpret the string as a number. The return value from sscanf will tell you whether that interpretation was successful (i.e. the user typed in a number). Once you know that, divide the number by 10,000. If the result is 13, then you know the number typed in was in the range 13XXXX.
printf ( ">>\t\tStudent No. (13XXXX): " );
fgets ( sNum[b], sizeof (sNum), stdin);
The sNum[b] should be sNum, which is a pointer.
And after getting a line from stdin with fgets, you can check the line with regular expression: "^13.+".
Sounds like you're looking for fscanf.
int fscanf ( FILE * stream, const char * format, ... );

Resources