C language reading columnated text file

First of all, please forgive me if this is too trivial; I am not a C developer, I usually program in Fortran.
I need to read some columnated text files. The problem is that some columns can be blank (no value filled in) or only partially filled.
Let me use a short example of the problem. Let's say I have a generator program like:
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("xxxx%4d%4.2f\n",99,3.14);
}
When I execute this program I get:
$ ./t1
xxxx  993.14
If I get it into a text file and try to read using (e.g.) sscanf with the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char *fmt = "%*4c%4d%4f";
char *line = "xxxx  993.14";
int ival;
float fval;
sscanf(line,fmt,&ival,&fval);
printf(">>>>%d|%f\n",ival,fval);
}
The result is:
$ ./t2
>>>>993|0.140000
What is the problem here? sscanf seems to treat all white space as meaningless and discards it. The "%*4c" does what it is meant to do: it counts 4 characters without skipping any blanks and discards them because of the "*". Next, "%4d" skips all leading blanks and only starts counting the 4 characters of the field once it finds the first character valid for the conversion. So the value meant to be 99 becomes 993, and 3.14 becomes 0.14.
In Fortran the reading code would be:
program t3
implicit none
integer :: ival
real :: fval
character(len=30) :: fmt="(4x,i4,f4.0)"
character(len=30) :: line="xxxx  993.14"
read(line,fmt) ival, fval
write(*,"('>>>>',i4,'|',f4.2)") ival,fval
end program t3
and the result would be:
$ ./t3
>>>>  99|3.14
That is, the format specification states the field width, and nothing is discarded in the conversion unless instructed by the "nX" specification.
Some final remarks to help the helpers:
The format to be read is an international standard and there is no
way to change it.
The number of existing files is too big to think of intervention or
format change.
It is not a CSV or similar format.
The code has to be in C for integration in a free software package.
Sorry to be too long, trying to state the problem as completely as possible.
The question is: is there a way to tell sscanf not to skip the blank spaces? If not, is there a simple way to do it in C, or will it be necessary to write a specialized parser for each record type?
Thank you in advance.

When reading fixed-length fields with sscanf, it is best to parse the values as character strings (which you could do a number of ways), and then perform independent conversion of each of the fields. This allows you to handle conversion/error detection on a per-field basis. For example, you could use a format string of:
char *fmt = "%*4c%4[^\n]%4[^\n]";
which reads and discards the 4 leading characters, then reads the next 4 characters (blanks included) as the string holding your integer, followed by the next 4 characters as the string holding your float value. Unlike %s and %d, the %c and %[ conversions do not skip leading whitespace, so every blank counts toward the field width.
To handle the storage and parsing of line as fixed length fields, you could use temporary character arrays to hold each of the strings and then use sscanf to fill them much as you have attempted to do with the integer and float directly. e.g.:
char istr[8] = {0};
char fstr[16] = {0};
...
sscanf (line,fmt,istr,fstr);
(note: you could use minimum storage of istr[5] and fstr[5] in this given case; adjust the storage length as required, but always provide space for the nul-terminating character)
You can then use strtol and strtof to provide conversion with error checking on each value. For example:
errno = 0;
if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
    fprintf (stderr, "error: integer conversion failed.\n");
/* underflow/overflow checks omitted */
and
errno = 0;
if ((fval = strtof (fstr, NULL)) == 0 && errno)
    fprintf (stderr, "error: float conversion failed.\n");
/* nan and inf checks omitted */
Putting all the pieces together in your example, you could use something like:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
int main (void) {
    char *fmt = "%*4c%4[^\n]%4[^\n]";
    char *line = "xxxx  993.14";
    char istr[8] = {0};
    char fstr[16] = {0};
    int ival;
    float fval;
    /* both fields must be filled before converting them */
    if (sscanf (line,fmt,istr,fstr) != 2) {
        fprintf (stderr, "error: failed to split line into fields.\n");
        return 1;
    }
    errno = 0;
    if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
        fprintf (stderr, "error: integer conversion failed.\n");
    /* underflow/overflow checks omitted */
    errno = 0;
    if ((fval = strtof (fstr, NULL)) == 0 && errno)
        fprintf (stderr, "error: float conversion failed.\n");
    /* nan and inf checks omitted */
    printf(">>>>%d|%6.2f\n",ival,fval);
    return 0;
}
Example/Output
>>>>99|  3.14

*scanf() is not designed to handle fixed column width with non-intervening white-space.
With sscanf(), to not skip spaces, code must use "%c", "%n", "%[]" as all other specifiers skip leading white-space and those skipped characters do not contribute to a width limit.
To scan the printed line, which is now in buffer, take advantage of the fact that the only '\n' is at the end of the line.
char str_int[5];
char str_float[5];
int n = 0;
sscanf(buffer, "%*4c%4[^\n]%4[^\n]%n", str_int, str_float, &n);
if (n != 12 || buffer[n] != '\n') Fail();
// Now convert str_int, str_float as needed.
Another way to use sscanf() would be to parse buffer as
int ival;
float fval;
if (strlen(buffer) != 13) Fail();
if (sscanf(&buffer[8], "%f", &fval) != 1) Fail();
buffer[8] = '\0';
if (sscanf(&buffer[4], "%d", &ival) != 1) Fail();
Note: The 4s in the format below do not fix the output width at 4 characters; 4 is the minimum width to print.
printf("xxxx%4d%4.2f\n",ival, fval);
Code could use the following to detect problems.
if (13 != printf("xxxx%4d%4.2f\n",ival, fval)) Fail();
Watch out for
printf("xxxx%4d%4.2f\n",123, 9.995000001f); // "xxxx 12310.00\n"

First off, I dunno. There might be some way to wrangle sscanf into counting the whitespace towards your integer width. But I just don't think scanf was made with this sort of format in mind. The tool's trying to be smart or helpful and it's biting you in the ass.
But if it's columnated data and you know the position of the various fields, there's a really easy workaround. Just extract the field you want.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
char line[] = "xxxx  893.14";
char tmp[100];
int thatDamnNumber;
float myfloatykins;
//Get that field
memcpy(tmp, line+4, 4);
tmp[4] = '\0'; //Terminate the copy so sscanf stops at the end of the field
sscanf(tmp, "%d", &thatDamnNumber);
//Kill that field so it doesn't goober-up the float
memset(line+4, ' ', 4);
sscanf(line, "%*4c%f", &myfloatykins);
printf("%d %f\n", thatDamnNumber, myfloatykins);
return 0;
}
If there is a lot of this, you could make some generalized functions: integerExtract(int positionStart, int sizeInCharacters), floatExtract(), etc.

If each element is of fixed width you don't really need scanf(), try this
char copy[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
copy[0] = line[4];
copy[1] = line[5];
copy[2] = line[6];
copy[3] = line[7];
copy[4] = '\0'; // nul terminate for `atoi' to work
ival = atoi(copy);
fval = atof(&line[8]);
fprintf(stdout, "%d -- %f\n", ival, fval);
If you want (probably should) you can use strtol() instead of atoi() and strtof() instead of atof() to check for malformed data.
Both these functions take a parameter to store the unconverted/invalid characters, you can check the passed pointer in order to verify that there was a problem with conversion.
Or, if you really want scanf() to do the same, capture the integer together with its whitespace in a char array and convert it to int later, like this
char integer[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
if (sscanf(line, "%*4c%4[0-9 ]%f", integer, &fval) != 2)
return -1;
ival = atoi(integer);
fprintf(stdout, "%d -- %f\n", ival, fval);
The format "%*4c%4[0-9 ]%f" will
Skip the first four characters including white spaces.
Scan the next four characters if they consist only of digits or white spaces.
Scan the rest of the input string searching for a matching float value.

I am posting what I think is a final conclusion from the answers I have got so far and from other sources.
What is a very trivial task in Fortran is not so trivial in other languages. I guess, though I am not sure, that the same task could be as easy as in Fortran in some other languages; Cobol, Pascal, PL/I and others from the punched-card era probably make it trivial.
I think that most languages nowadays are more comfortable with different data structures and inherited their I/O model from C. I think that Java, Python, Perl(?) and others could serve as examples.
From what I saw in this thread there are two main problems in reading and converting fixed-column text data with C.
The first problem is that, as Philip said in his answer: “The tool’s trying to be smart or helpful and it’s biting you in the ass.” Quite right! The point is that C text I/O seems to treat “white space” as something like a null character that should be thrown away, completely disregarding any information about where the field starts. The only exceptions seem to be %nc, which gets exactly n chars, even blanks, and the %[ scanset conversion mentioned in the answers.
The second problem is that the conversion “tag” (how is that called?) %nf starts counting its n characters only after the skipped blanks, so it keeps converting past the column where the field was supposed to end.
If we join those two problems with a field completely filled with white space, then, depending on the conversion tool used, it throws an error or keeps going madly looking for something meaningful.
At the end of the day, it seems that the only way is to copy each field into another memory area, dynamically allocated or not (we can have one area for each column width), and parse this separate area, taking into account the possibility of an all-blank field in order to catch the error.

Related

Extracting numbers from the string using regex

I am trying to extract the number 4 and 3 from the string /ab/cd__my__sep__4__some__sep__3. I am trying with regex but not sure how would I do this. I wrote the following code, but it just prints out __my__sep__4__some__sep__3
#include <stdio.h>
#include <regex.h>
#include <string.h>
#include <stdlib.h>
int main() {
char* s = "/ab/cd__my__sep__4__some__sep__3";
regex_t regex;
int reti = regcomp(&regex,"__my__sep__([0-9]+)",REG_EXTENDED);
if(reti!=0) {
exit(-1);
}else {
regmatch_t match[2];
reti = regexec(&regex, s, 2, match, 0);
if(reti == 0) {
char *v = &s[match[1].rm_so];
ssize_t fl;
sscanf(v, "%zu", &fl);
printf("%s",v);
}else {
printf("else");
}
}
}
How could I extract the numbers 4 and 3 ?
match[0] refers to the part of the text matched by the entire pattern. match[1] is the match corresponding to the first capture (parenthesized subpattern).
Note that &s[match[1].rm_so] gives you a pointer to the start of the capture, but if you print the string at that point, you will get the part of the string starting at the beginning of the capture. In this case, that doesn't really matter. Since you're using sscanf to extract the integer value of the captured text, the fact that the substring isn't terminated immediately doesn't matter; it's not going to be followed by a digit, and sscanf will stop at the first non-digit.
But in the general case, it's possible that it will not be so easy to identify the end of the matched capture, and you can use one of these techniques:
If you want to print the capture, you can use a computed string width format: (See Note 1.)
printf("%.*s\n", (int)(match[1].rm_eo - match[1].rm_so), &s[match[1].rm_so]);
If you have strndup, you can easily create a dynamically-allocated copy of the capture: (See Note 2.)
char* capture = strndup(&s[match[1].rm_so], match[1].rm_eo - match[1].rm_so);
As a quick-and-dirty hack, it is also possible to just insert a NUL terminator (assuming that the searched string is not immutable, which means that it cannot be a string literal). You'll probably want to save the old value of the following character so that you can restore the string to its original state:
char* capture = &s[match[1].rm_so];
char* rest = &s[match[1].rm_eo];
char saved_char = *rest;
*rest = 0;
/* capture now points to a NUL-terminated string. */
/* ... */
/* restore s */
*rest = saved_char;
None of the above is really necessary in the context of the original question, since the sscanf as written will work perfectly if you change the start of the string to scan from match[0] to match[1].
Notes:
1. In the general case, you should test to make sure that a capture was actually found before trying to use its offset. The rm_so member will be -1 if the capture was not found during the regex search. That doesn't necessarily mean that the search failed, because the capture could be part of an alternative not used in the match.
2. Don't forget to free the copy when you no longer need it. If you don't have strndup, it's pretty easy to implement. But watch out for the corner cases.
Since you are using sscanf(), there is no need to use a regex. You can parse the two numbers from your string using sscanf() alone using the format string: "%*[^0-9]%d%*[^0-9]%d" where "%*[^0-9]" uses the assignment suppression '*' to read and discard all non-digit characters and then uses "%d" to extract the integer value. The full format-string just repeats those two patterns twice.
A short example using your input could be:
#include <stdio.h>
int main (void) {
char *s = "/ab/cd__my__sep__4__some__sep__3";
int a, b;
if (sscanf (s, "%*[^0-9]%d%*[^0-9]%d", &a, &b) == 2)
printf ("a: %d\nb: %d\n", a, b);
else {
fputs ("error: parse of integers failed.\n", stderr);
return 1;
}
}
Example Use/Output
$ ./bin/parse2ints
a: 4
b: 3
If you find yourself attempting to parse something that sscanf() cannot handle, then a regex is appropriate. Here, sscanf() is more than capable of handling your needs alone.
Create a regex format that only holds [0-9]. Then create a separate boolean function checking whether a character belongs or not to your regex. Then apply the function to your string. If true, add the character to the string you want to output

How to convert a string value to numerical value?

I have tried this code to separate my Str[] string into 2 strings, but my problem is that I want to separate John (name) as a string and 100 (marks) as an integer. How can I do it? Any suggestions?
#include <stdio.h>
#include <string.h>
void main()
{
char Str[] = "John,100";
int i, j, xchange;
char name[50];
char marks[10];
j = 0; xchange = 0;
for(i=0; Str[i]!='\0'; i++){
if(Str[i]!=',' && xchange!=-1){
name[i] = Str[i];
}else{
xchange = -1;
}
if(xchange==-1){
marks[j++] = Str[i+1];
}
}
printf("Student name is %s\n", name);
printf("Student marks is %s", marks);
}
How to separate "John,100" into 2 strings?
There are three common approaches:
Use strtok() to split the string into individual tokens. This will modify the original string, but is quite simple to implement:
int main(void)
{
char line[] = "John,100;passed";
char *name, *score, *status;
/* Detach the initial part of the line,
up to the first comma, and set name
to point to that part. */
name = strtok(line, ",");
/* Detach the next part of the line,
up to the next comma or semicolon,
setting score to point to that part. */
score = strtok(NULL, ",;");
/* Detach the final part of the line,
setting status to point to it. */
status = strtok(NULL, "");
Note that if you change char line[] = "John,100"; then status will be NULL, but the code is otherwise safe to run.
So, in practice, if you required all three fields to exist in line, it would be sufficient to ensure the last one was not NULL:
if (!status) {
fprintf(stderr, "line[] did not have three fields!\n");
return EXIT_FAILURE;
}
Use sscanf() to convert the string. For example,
char line[] = "John,100";
char name[20];
int score;
if (sscanf(line, "%19[^,],%d", name, &score) != 2) {
fprintf(stderr, "Cannot parse line[] correctly.\n");
return EXIT_FAILURE;
}
Here, the 19 refers to the number of chars in name (one is always reserved for the end-of-string nul char, '\0'), and [^,] is a string conversion, consuming everything except a comma. %d converts an int. The return value is the number of successful conversions.
This approach does not modify the original string, and it allows you to try a number of different parsing patterns; as long as you try the most complex one first, you can allow multiple input formats with very little added code. I do this regularly when taking 2D or 3D vectors as inputs.
The downside is that sscanf() (like all functions in the scanf family) ignores overflow. For example, where int is 32 bits wide, the largest int is 2147483647, but scanf functions will happily convert e.g. 9999999999 to 1410065407 (or some other value!) without returning an error. You can only assume the numerical inputs are sane and within the limits; you cannot verify.
Use helper functions to tokenise and/or parse the string.
Typically, the helper functions are something like
char *parse_string(char *source, char **to);
char *parse_long(char *source, long *to);
where source is a pointer to the next character in the string to be parsed, and to is a pointer to where the parsed value will be stored; or
char *next_string(char **source);
long next_long(char **source);
where source is a pointer to a pointer to the next character in the string to be parsed, and the return value is the value of the extracted token.
These tend to be longer than above, and if written by me, then quite paranoid about the inputs they accept. (I want my programs to complain if their input cannot be reliably parsed, rather than silently produce garbage.)
If the data is some variant of CSV (comma-separated values) read from a file, then the proper approach is a different one: instead of reading the file line by line, you read the file token by token.
The only "trick" is to remember the separator character that ended the token (you can use ungetc() for this), and use a different function to (read and ignore the rest of the tokens in the current record, and) consume the newline separator.

Printf on data stored as union gives no output after called for double variable

I'm working on a program whose input looks as follows:
3.14 (it's variable stored in union)
4 (number of calls)
int (asked types to return)
long
float
double
On output I should get:
1078523331
1078523331
3.140000
0.000000
Full instruction to this task
My program works except in the double case: instead of printing a value, the program prints nothing. Can anyone explain why? Here is my code.
#include <stdio.h>
#include <string.h>
#define SIZE 1000
#define CHARLENGTH 6
union Data {
int i;
long long l;
float f;
double d;
};
int main(){
union Data x;
char types[SIZE][CHARLENGTH];
int n;
scanf("%f",&x.f);
scanf("%d",&n);
for(int i = 0;i<=n+1;i++){
fgets(types[i],CHARLENGTH,stdin);
types[i][strcspn(types[i],"\n")] ='\0';//removing newline
}
for(int i = 1;i<=n+1;i++){
if(strcmp(types[i], "int") == 0){
printf("%d\n",x.i);
}
else if(strcmp(types[i], "long") == 0){
printf("%lli\n",x.l);
}
else if(strcmp(types[i], "float") == 0){
printf("%f\n",x.f);
}
else if(strcmp(types[i], "double") == 0){
printf("%lf\n",x.d);
}
}
}
You do not allow sufficient space in array types for a six-character string such as "double", because you need an extra byte for the terminator. Because you have used fgets() in a reasonable way, however, you have saved yourself from overrunning the bounds of that array -- fgets() just stops reading after the fifth character of "double", and appends a terminator. Therefore, what actually gets stored is "doubl". Naturally, that compares different from "double", so no corresponding output is produced.
In the first place, you should increase CHARLENGTH to at least 7. Doing so will take care of your immediate problem.
You should also consider adding a final else clause inside your loop that prints out a diagnostic message in the event that none of the other cases is satisfied. Such a message could have clued you in to what's going on.
For robustness, you might consider making sure to read and discard any trailing junk on the type lines; as it is, even trailing whitespace after one of the shorter type names will screw up your matching.
Perhaps it matches the exercise as specified, but your program would be a lot more user friendly if it prompted for each input item.
Four quick observations:
0) As a minimum, the main function should be: int main(void).
1) Because a C string is an array of char terminated by a null character ('\0'), the string "double" requires a buffer with space for 7 char to contain it.
|d|o|u|b|l|e|\0| //includes the terminating null character
Change
#define CHARLENGTH 6
to
#define CHARLENGTH 7
2) Because it is not clear from the cmd line prompts in the running program what items are to be entered, if one of the string types, eg "double", is not entered, the line:
fgets(types[i],CHARLENGTH,stdin);
will not do as it is intended. Suggest adding some printf statements with instructions for what to enter for all 3 entries per line.
3) types is not initialized before use.
This can be addressed by simply initializing like this:
memset(types, 0, SIZE*CHARLENGTH);
or even simpler:
char types[SIZE][CHARLENGTH] = {0};

Using atof() function in C with multiple input values

The goal of this program is to create a function which reads in a single string, user typed, command (ultimately for program to be used in conjunction with a robot) which consists of an unknown command word(stored and printed as command), and an unknown number of decimal parameters(the quantity is stored and printed as num, and the parameters are to be stored as float values in the array params). In the User input, the command and parameters will be separated by spaces. I believe my issue is with the atof function when I go to extract the decimal values from the string. What am I doing wrong? Thank you for the help!
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
void func(char *input, char *command, int *num, float *params);
int main()
{
char input[40]={};
char command[40]={};
int num;
float params[10];
printf("Please enter your command: ");
gets(input);
func(input,command,&num,params);
printf("\n\nInput: %s",input);
printf("\nCommand: %s",command);
printf("\n# of parameters: %d",num);
printf("\nParameters: %f\n\n",params);
return 0;
}
void func(char *input, char *command, int *num, float *params)
{
int i=0, k=0, j=0, l=0;
int n=0;
while(input[i]!=32)
{
command[i]=input[i];
i++;
}
for (k=0; k<40;k++)
{
if ((input[k]==32)&&(input[k-1]!=32))
{
n++;
}
}
*num=n;
while (j<n)
{
for (l=0;l<40;l++)
{
if((input[l-1]==32)&&(input[l]!=32))
{
params[j]=atof(input[l]);
j++;
}
}
}
}
A Sample Output Screen:
Please enter your command: Move 10 -10
Input: Move 10 -10
Command: Move
# of parameters: 2
Parameters: 0.000000
The Parameters output should, ideally, read "10 -10" for the output. Thanks!
Change atof(input[l]) to atof(input + l). input[l] is single char but you want to get substring from l position. See also strtod() function.
Other people have already remarked the problem in your code, but may I suggest that you have a look at strtod() instead?
While both atof() and strtod() discard spaces at the start for you (so you don't need to do it manually), strtod() will point you to the end of the number, so that you know where to continue:
while(j < MAX_PARAMS) // avoid a buffer overflow via this check
{
params[j] = strtod(ptr, &end); // `end` is where your number ends
if(ptr == end) // if end == ptr, input wasn't a number (say, if there are none left)
break;
// input was a number, so ...
ptr = end; // continue at end for next iteration
j++; // increment number of params
}
Do note that the above solution does not differentiate between invalid arguments (say, foo instead of 3.5) and missing ones (because we've hit the last argument). You can check for that by doing this: if(!ptr[strspn(ptr, " \t\v\r\n\f")]). This checks if we're at the end of the string (but allowing trailing whitespace). See the second side-note for what it does.
SIDE-NOTES:
You can use ' ' instead of 32 to check for space; this has two advantages:
It is clearer to the reader (it's very clear that it's a whitespace, instead of "some magic number that happens to have meaning")
It works in non-ASCII encodings (and the standard allows other encodings, though ASCII is by far the most popular; one common encoding is EBCDIC)
For future reference, this trick can help you skip whitespace: ptr += strspn(ptr, " \t\v\r\n\f");. strspn returns the number of characters at the start of the string that match the set (in this case, one of " \t\v\r\n\f"). Check documentation for more info.
Example for strspn: strspn("abbcbaa", "ab"); returns 3 because you have abb (which match) before c (which doesn't).
You are trying to convert a single char into a float:
params[j]=atof(input[l]);
You should pass the entire word (substring) that holds the float.
For example, "12.01" is a null-terminated string of 5 characters; pass it to atof, as in atof("12.01"), and it will return a double of 12.01.
So you should first extract the string for each float parameter and pass it to atof.
Avoid comparing characters to ASCII values; you could use ' ' (space) directly.
Instead of using a for loop with a fixed size, you can use strlen() or strnlen() to find the length of the input string.

C programming: How to use sscanf with a ':' delimiter?

I have been trying to extract hours, seconds and minutes from an input text using sscanf. After sscanf function is performed, only s variable which holds the seconds has the right value. h and m which have hours and minutes in them hold only zeros. Please suggest changes to my code below.
char text[20];
if (fgets(text, sizeof text, stdin)!= NULL){
char* newline = strchr(text, '\n');
if (newline != NULL){
*newline = '\0';
}
}
uint8_t s = 0;
uint8_t m = 0;
uint8_t h = 0;
sscanf(text, "%02i:%02i:%02i",&h,&m,&s);
Note in the debugger, text has the right values.
This program:
#include <stdio.h>
int main(void)
{
const char hhmmss[] = "10:32:54";
int hh, mm, ss;
if (sscanf(hhmmss, "%i:%i:%i", &hh, &mm, &ss) != 3)
printf("Failed to scan 3 values from '%s'\n", hhmmss);
else
printf("From <<%s>> hh = %d, mm = %d, ss = %d\n", hhmmss, hh, mm, ss);
return 0;
}
gives this output:
From <<10:32:54>> hh = 10, mm = 32, ss = 54
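The %02i conversions should also work, but the digits are somewhat superfluous. Note, however, that %i treats a leading 0 as an octal prefix, so a field such as "08" or "09" will not convert as intended; %d avoids that.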
The amended question shows that the variables are of type uint8_t, in which case you must use the correct conversion specifiers from <inttypes.h>:
#include <stdio.h>
#include <inttypes.h>
int main(void)
{
const char hhmmss[] = "10:32:54";
uint8_t s;
uint8_t m;
uint8_t h;
if (sscanf(hhmmss, "%02" SCNu8 ":%02" SCNu8 ":%02" SCNu8, &h, &m, &s) != 3)
printf("Failed to scan 3 values from '%s'\n", hhmmss);
else
printf("From <<%s>> h = %d, m = %d, s = %d\n", hhmmss, h, m, s);
return 0;
}
This produces the same output as before. With any of the scanf() family of functions, it is crucial that your format conversion specifiers match the types of the pointers you are passing into the function. You can get away with quite a lot of mismatches in printf() - certainly by comparison - because of default integer (in particular) promotions, but scanf() is a lot less forgiving.
@Jonathan Leffler's answer is entirely correct, but...
You should never use scanf or fscanf or sscanf for parsing input from file handle or a string, except perhaps in known-to-be-thrown-away-tomorrow code. They are too error-prone and too hard to control. For an exhaustive summary of the various problems with scanf, I recommend this series of articles. A few highlights:
If you need to read single characters, use getchar.
If you want to read a string, scanf has all the buffer overflow problems of gets.
If you want to read numbers, scanf's parsing is error-prone and hard to use. Use strtoul and strtod instead.
If you have more complicated input, everything is just worse.
What to do?
Read your own input using something better than gets, that is, without buffer overflow problems. Do not attempt to combine getting the bytes in with interpreting them.
Use a combination of strcspn, strspn, strtoul and strtod, combined with some custom code, to scan the input.
There are times when this too is a drag, but by that time you're typically building some sort of input language that needs more generic techniques anyway.
