Read tab separated content line by line with last column empty string - c

I have a file format like this
1.9969199999999998 2.4613199999999997 130.81278270000001 AA
2.4613199999999997 2.5541999999999998 138.59131554109211 BB
2.5541999999999998 2.9953799999999995 146.83238401449094 CC
...........................
I have to read the first three columns as float and the last column as a char array in C. All the columns are tab separated and there is a newline character at the end of each line. Everything works fine with fscanf(fp1, "%f\t%f\t%f\t%s\n", ...) as long as there is some text at the end of each line (the char string part).
There are cases where, instead of AA/BB/CC, I have an empty string in the file. How do I handle that case? I have tried fscanf(fp1, "%f\t%f\t%f\t%s[^\n]\n", ...) and many other things, but I am unable to figure out the right way. Can you please help me out here?

Using float rather than double will throw away about half the digits shown. You get 6-7 decimal digits with float; you get 15+ digits with double.
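To make the precision point concrete, here is a minimal sketch that parses the same text once as float and once as double and reports whether digits were lost. The helper name float_loses_digits is made up for illustration; it is not part of the question's code.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the same text as float and as double.
 * Returns 1 if the float result differs from the double result
 * (digits were lost), 0 if both agree, -1 if the text is not a number. */
int float_loses_digits(const char *text)
{
    float f = 0.0f;
    double d = 0.0;

    assert(text != NULL);
    if (sscanf(text, "%f", &f) != 1 || sscanf(text, "%lf", &d) != 1)
        return -1;

    /* Print both with 17 significant digits so the difference is visible. */
    printf("float : %.17g\ndouble: %.17g\n", (double)f, d);
    return (double)f != d;
}
```

Calling float_loses_digits("130.81278270000001") returns 1, while a value such as "0.5", which is exactly representable in both types, returns 0.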
As to your main question: use fgets() (or POSIX getline()) to read lines and then sscanf() to parse the line that is read. This will avoid confusion. When the input is line-based but not regular enough, don't use fscanf() and family to read the data; the file-reading scanf() functions don't care about newlines, even when you do.
Note that sscanf() will return either 3 or 4, indicating whether there was a string at the end of a line or not (or EOF, 0, 1 or 2 if it is given an empty string, or a string which doesn't start with a number, or a string which only contains one or two numbers). Always test the return value from scanf() and friends — but do so carefully. Look for the number of values that you expect (3 or 4 in this example), rather than 'not EOF'.
This leads to roughly:
#include <stdio.h>
int main(void)
{
double d[3];
char text[20];
char line[4096];
while (fgets(line, sizeof(line), stdin) != 0)
{
int rc = sscanf(line, "%lf %lf %lf %19s", &d[0], &d[1], &d[2], &text[0]);
if (rc == 4)
printf("%13.6f %13.6f %13.6f [%s]\n", d[0], d[1], d[2], text);
else if (rc == 3)
printf("%13.6f %13.6f %13.6f -NA-\n", d[0], d[1], d[2]);
else
printf("Format error: return code %d\n", rc);
}
return 0;
}
If given this file as standard input:
1.9969199999999998 2.4613199999999997 130.81278270000001 AA
2.4613199999999997 2.5541999999999998 138.59131554109211 BB
2.5541999999999998 2.9953799999999995 146.83238401449094 CC
19.20212223242525 29.3031323334353637 3940.41424344454647
19.20212223242525 29.3031323334353637 3940.41424344454647 PolyVinyl-PolySaccharide
the output is:
1.996920 2.461320 130.812783 [AA]
2.461320 2.554200 138.591316 [BB]
2.554200 2.995380 146.832384 [CC]
19.202122 29.303132 3940.414243 -NA-
19.202122 29.303132 3940.414243 [PolyVinyl-PolySacch]
You can tweak the output format to suit yourself. Note that the %19s avoids buffer overflow even when the text is longer than 19 characters.

Related

Won't read from file to struct

I've been sitting with this problem for 2 days and I can't figure out what I'm doing wrong. I've tried debugging (kind of? Still kind of new), followed this link: https://ericlippert.com/2014/03/05/how-to-debug-small-programs/ And I've tried Google and all kinds of things. Basically I'm reading from a file with this format:
R1 Fre 17/07/2015 18.00 FCN - SDR 0 - 2 3.211
and I have to make the program read this into a struct, but when I try printing the information it comes out all wrong. My code looks like this:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#define MAX_INPUT 198
typedef struct game{
char weekday[4],
home_team[4],
away_team[4];
int round,
hour,
minute,
day,
month,
year,
home_goals,
away_goals,
spectators;}game;
game make_game(FILE *superliga);
int main(void){
int input_number,
number_of_games = 198,
i = 0;
game tied[MAX_INPUT];
FILE *superliga;
superliga = fopen("superliga-2015-2016.txt", "r");
for(i = 0; i < number_of_games; ++i){
tied[i] = make_game(superliga);
printf("R%d %s %d/%d/%d %d.%d %s - %s %d - %d %d\n",
tied[i].round, tied[i].weekday, tied[i].day, tied[i].month,
tied[i].year, tied[i].hour, tied[i].minute, tied[i].home_team,
tied[i].away_team, tied[i].home_goals, tied[i].away_goals,
tied[i].spectators);}
fclose(superliga);
return 0;
}
game make_game(FILE *superliga){
double spect;
struct game game_info;
fscanf(superliga, "R%d %s %d/%d/%d %d.%d %s - %s %d - %d %lf\n",
&game_info.round, game_info.weekday, &game_info.day, &game_info.month,
&game_info.year, &game_info.hour, &game_info.minute, game_info.home_team,
game_info.away_team, &game_info.home_goals, &game_info.away_goals,
&spect);
game_info.spectators = spect * 1000;
return game_info;
}
The problem is in your file. It starts with whitespace, not with an R as your control string requires.
Check the return value of fscanf() and you'll see that it's zero every time.
If you add a leading space to the format string in your fscanf() call, your problem will be solved, like this:
fscanf(superliga, " R%d %s %d/%d/%d %d.%d %s - %s %d - %d %lf\n",
&game_info.round, game_info.weekday, &game_info.day, &game_info.month,
&game_info.year, &game_info.hour, &game_info.minute, game_info.home_team,
game_info.away_team, &game_info.home_goals, &game_info.away_goals,
&spect);
If each line in the file is a separate record, you should read each line as a string, then try to parse each string.
(Note that this also has the added feature of speculative parsing: you can try parsing the line in several different formats, and accept the one that parses correctly. I like to use this when I accept e.g. vector inputs, so that the user can use x y z, x, y, z, x/y/z, (x,y,z), [x,y,z], <x y z>, <x,y,z>, and so on, depending on what they like. It's only one additional scanf per format, after all.)
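A rough sketch of that speculative parsing idea, assuming the vector formats listed above. The function name parse_vec3 and the pattern table are illustrative only. Each pattern ends in %n so that a line is accepted only when the whole string was consumed; otherwise an input such as "1 2 3)" would pass the first pattern.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Try each pattern in turn; accept the first one that converts all three
 * numbers AND consumes the whole string (checked via %n).
 * Returns 0 on success, -1 if no pattern matches. */
int parse_vec3(const char *s, double v[3])
{
    static const char *const patterns[] = {
        " %lf %lf %lf %n",
        " %lf , %lf , %lf %n",
        " %lf / %lf / %lf %n",
        " ( %lf , %lf , %lf ) %n",
        " [ %lf , %lf , %lf ] %n",
        " < %lf %lf %lf > %n",
        " < %lf , %lf , %lf > %n",
    };
    size_t i;

    assert(s != NULL && v != NULL);
    for (i = 0; i < sizeof patterns / sizeof patterns[0]; i++) {
        double t[3];
        int end = -1;           /* stays -1 unless %n is reached */

        if (sscanf(s, patterns[i], &t[0], &t[1], &t[2], &end) >= 3
            && end >= 0 && s[end] == '\0') {
            v[0] = t[0];
            v[1] = t[1];
            v[2] = t[2];
            return 0;
        }
    }
    return -1;
}
```

The result is copied into v only after a pattern has matched completely, so a failed attempt never leaves half-parsed values behind.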
To read lines, you can use fgets() into a local buffer. The local buffer must be long enough. If the program is to run on POSIX.1 machines only (i.e., not on Windows), then you can use getline() instead, which can dynamically reallocate the given buffer as needed, so you're not limited to any specific line length.
To parse the string, use sscanf().
Note that all tabs, spaces, and newlines in the pattern in all of the scanf family of functions are treated exactly the same: they match any amount of any type of whitespace, including none. In other words, \n does not mean "and then a newline"; it means the same as a space, i.e. "and possibly some whitespace here". However, all conversions except %c, %[ and %n automatically skip any leading whitespace; so, apart from a space before one of those three or before a literal character such as the - in the pattern, the spaces in the pattern are only meaningful to us humans and have no functional effect on the scanning.
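Two small checks illustrate the whitespace rules. This is only a sketch; the function names are invented for the example.

```c
#include <assert.h>
#include <stdio.h>

/* "\n" in a pattern matches any run of whitespace, not specifically a
 * newline: the input below has spaces and a tab, but no newline at all. */
int newline_is_just_whitespace(void)
{
    int a = 0, b = 0;

    return sscanf("1 \t 2", "%d\n%d", &a, &b) == 2 && a == 1 && b == 2;
}

/* %c does not skip leading whitespace unless the pattern asks for it
 * with a space. Returns the character read, or '\0' on failure. */
char first_char(const char *s, int skip_space)
{
    char c = '\0';

    assert(s != NULL);
    if (sscanf(s, skip_space ? " %c" : "%c", &c) != 1)
        return '\0';
    return c;
}
```

With the input "  x", first_char(s, 0) yields a space, while first_char(s, 1) yields 'x'.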
All scanf family functions return the number of successful conversions that were assigned. (The "conversion" %n, which stores the number of characters consumed so far, is not supposed to be counted, but the standards have been unclear on this point and some implementations have differed.) If end of input or a read error occurs before the first conversion completes, the functions return EOF. If the input merely fails to match the pattern, they return the number of conversions assigned up to that point, which may be 0.
If you suppress saving the result of a conversion, for example with %*s to skip a word in the input you don't need, that conversion is not counted. So, for example, sscanf(line, " %*d %*s %d", &n) returns 1 if the line starts with an integer, followed by a word (anything that is not a newline nor contains whitespace), followed by an integer; only the last integer is stored and counted.
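To see how the return value behaves in practice, here is a small sketch (the helper name last_int is made up for illustration). It stores only the last integer and suppresses the first two conversions, so the return value reflects assigned conversions only.

```c
#include <assert.h>
#include <stdio.h>

/* Parse "integer word integer", storing only the last integer.
 * Returns whatever sscanf() reports:
 *   1   the last integer was assigned,
 *   0   the input did not match the pattern,
 *   EOF the input ended before anything could be converted. */
int last_int(const char *line, int *out)
{
    assert(line != NULL && out != NULL);
    return sscanf(line, " %*d %*s %d", out);
}
```

For "12 abc 34" this returns 1 and stores 34; for "abc" it returns 0; for an empty string it returns EOF.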
Rather than have the function return the parsed structure, pass a pointer to the structure (and the file handle to read from), and return a status code. I prefer 0 for success, and nonzero for failure, but feel free to change that.
In other words, I'd suggest you change your read function into
#ifndef GAME_LINE_MAX
#define GAME_LINE_MAX 1022
#endif
int read_game(game *one, FILE *in)
{
char buffer[GAME_LINE_MAX + 2]; /* + '\n' + '\0' */
char *line;
/* Sanity check: no NULL pointers accepted! */
if (!one || !in)
return -1;
/* Paranoid check: Fail if read error has already occurred. */
if (ferror(in))
return -1;
/* Read the line */
line = fgets(buffer, sizeof buffer, in);
if (!line)
return -1;
/* Parse the game; pattern from OP's example. The spectator count is
written in thousands (3.211 means 3211), so read it as a double. */
double spect;
if (sscanf(line, " R%d %3s %d/%d/%d %d.%d %3s - %3s %d - %d %lf\n",
&(one->round), one->weekday,
&(one->day), &(one->month), &(one->year),
&(one->hour), &(one->minute),
one->home_team,
one->away_team,
&(one->home_goals),
&(one->away_goals),
&spect) < 12)
return -1; /* Line not formatted like above */
/* Spectators in the file are in units of 1000; convert and round: */
one->spectators = (int)(spect * 1000.0 + 0.5);
/* Success. */
return 0;
}
To use the above function in a loop, reading games one after another from standard input (stdin):
game g;
while (!read_game(&g, stdin)) {
/* Do something with current game stats, g */
}
if (ferror(stdin)) {
/* Read error occurred! */
} else
if (!feof(stdin)) {
/* Not all data was read/parsed! */
}
The two if clauses above are to check if there was a real read error (as in, a problem with the hardware or something like that), and whether there was unread/unparsed data (not at end of file), respectively.
There are three differences in the scanning pattern compared to the OP: First, the pattern starts with a space, so any whitespace before the R is skipped. Second, all strings parsed are limited to 3 characters, because the structure has only room for 3+1 each. The one character is reserved for the end of string '\0', which is not counted in the maximum length for %s. Third, the spectator count, written in thousands like 3.211, is read into a local double and then multiplied by 1000 and rounded into the integer field.
Also note how I used one->weekday, one->home_team, and one->away_team to refer to the character arrays. This works, because an array variable can be used as if it was a pointer to the first element in that array. (Given char a[5];, both a and &(a[0]) yield a pointer to the first element of a. &a has the same address but a different type, pointer to the whole array, so it is best avoided in scanf arguments.) I like to use this "raw form" when scanning, because it makes it easier to match them to %s conversions, and ensure the pattern matches the parameters.

C language reading columnated text file

First of all let me ask for your forgiveness if this is too trivial, I am not a C developer, usually I program in Fortran.
I need to read some columnated text files. The problem I have is that some columns can be blank (no value filled in) or contain a field that is not completely filled.
Let me use a short example of the problem. Let's say I have a generator program like:
#include <stdio.h>
#include <stdlib.h>
int main(){
printf("xxxx%4d%4.2f\n",99,3.14);
}
When I execute this program I get:
$ ./t1
xxxx  993.14
If I get it into a text file and try to read using (e.g.) sscanf with the code:
#include <stdio.h>
#include <stdlib.h>
int main() {
char *fmt = "%*4c%4d%4f";
char *line = "xxxx  993.14";
int ival;
float fval;
sscanf(line,fmt,&ival,&fval);
printf(">>>>%d|%f\n",ival,fval);
}
The result is:
$ ./t2
>>>>993|0.140000
What is the problem here? sscanf seems to think that all whitespace is meaningless and should be discarded. So the "%*4c" does what it is meant to do: it counts 4 characters without discarding any blank space, and discards everything because of the "*". Next, the %4d starts by skipping all blank spaces and only starts counting the 4 characters of the field upon finding the first valid character for the conversion. So the value meant to be 99 becomes 993, and the 3.14 becomes 0.14.
In Fortran the reading code would be:
program t3
implicit none
integer :: ival
real :: fval
character(len=30) :: fmt="(4x,i4,f4.0)"
character(len=30) :: line="xxxx  993.14"
read(line,fmt) ival, fval
write(*,"('>>>>',i4,'|',f4.2)") ival,fval
end program t3
and the result would be:
$ ./t3
>>>>  99|3.14
That is, the format specification states the field width and nothing is discarded in conversion, unless instructed to by the "nX" specification.
Some final remarks to help the helpers:
The format to be read is an international standard and there is no way to change it.
The number of existing files is too big to think of intervention or format change.
It is not a CSV or similar format.
The code has to be in C for integration in a free software package.
Sorry to be too long, trying to state the problem as completely as possible.
The question is: Is there a way to tell sscanf not to skip the blank spaces? If not, is there a simple way to do it in C, or will it be necessary to write a specialized parser for each record type?
Thank you in advance.
When reading fixed-length fields with sscanf, it is best to parse the values as character strings (which you could do a number of ways), and then perform independent conversion of each of the fields. This allows you to handle conversion/error detection on a per-field basis. For example, you could use a format string of:
char *fmt = "%*4s%2[^0-9]%s";
which would read/discard the 4 leading characters, then read up to 2 non-digit characters (here, the blank padding) into the first string, followed by the rest of the line (or up until the next whitespace) into the second string.
To handle the storage and parsing of line as fixed length fields, you could use temporary character arrays to hold each of the strings and then use sscanf to fill them much as you have attempted to do with the integer and float directly. e.g.:
char istr[8] = {0};
char fstr[16] = {0};
...
sscanf (line,fmt,istr,fstr);
(note: you could use minimum storage of istr[3] and fstr[7] in this given case, adjust the storage length as required, but providing space for the nul-terminating character)
You can then use strtol and strtof to provide conversion with error checking on each value. For example:
errno = 0;
if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
fprintf (stderr, "error: integer conversion failed.\n");
/* underflow/overflow checks omitted */
and
errno = 0;
if ((fval = strtof (fstr, NULL)) == 0 && errno)
fprintf (stderr, "error: float conversion failed.\n");
/* nan and inf checks omitted */
Putting all the pieces together in your example, you could use something like:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
int main() {
char *fmt = "%*4s%2[^0-9]%s";
char *line = "xxxx  993.14";
char istr[8] = {0};
char fstr[16] = {0};
int ival;
float fval;
sscanf (line,fmt,istr,fstr);
errno = 0;
if ((ival = (int)strtol (istr, NULL, 10)) == 0 && errno)
fprintf (stderr, "error: integer conversion failed.\n");
/* underflow/overflow checks omitted */
errno = 0;
if ((fval = strtof (fstr, NULL)) == 0 && errno)
fprintf (stderr, "error: float conversion failed.\n");
/* nan and inf checks omitted */
printf(">>>>%d|%6.2f\n",ival,fval);
return 0;
}
Example/Output
$ >>>>0|993.14
*scanf() is not designed to handle fixed column width with non-intervening white-space.
With sscanf(), to not skip spaces, code must use "%c", "%n", "%[]" as all other specifiers skip leading white-space and those skipped characters do not contribute to a width limit.
To scan the printed line, which is now in buffer, take advantage of the fact that the only use of '\n' is at the end of the line.
char str_int[5];
char str_float[5];
int n = 0;
sscanf(buffer, "%*4c%4[^\n]%4[^\n]%n", str_int, str_float, &n);
if (n != 12 || buffer[n] != '\n') Fail();
// Now convert str_int, str_float as needed.
Another way to use sscanf() would be to parse buffer as
int ival;
float fval;
if (strlen(buffer) != 13) Fail();
if (sscanf(&buffer[8], "%f", &fval) != 1) Fail();
buffer[8] = '\0';
if (sscanf(&buffer[4], "%d", &ival) != 1) Fail();
Note: The 4s in the format below do not fix the output width at 4 characters. 4 is the minimum width to print; wider values still print in full.
printf("xxxx%4d%4.2f\n",ival, fval);
Code could use the following to detect problems.
if (13 != printf("xxxx%4d%4.2f\n",ival, fval)) Fail();
Watch out for
printf("xxxx%4d%4.2f\n",123, 9.995000001f); // "xxxx 12310.00\n"
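A minimal sketch of that check on the writing side, using snprintf() so the record can be inspected before it is written out. The name format_record and the RECORD_LEN constant are assumptions for illustration; the format string is the one from the question.

```c
#include <assert.h>
#include <stdio.h>

#define RECORD_LEN 13   /* "xxxx" + 4 columns + 4 columns + '\n' */

/* Format one record into buf. Returns 0 only if the record came out
 * exactly RECORD_LEN characters long and fit in the buffer; returns -1
 * if a value was too wide for its column. */
int format_record(char *buf, size_t size, int ival, double fval)
{
    int n;

    assert(buf != NULL);
    n = snprintf(buf, size, "xxxx%4d%4.2f\n", ival, fval);
    if (n < 0 || (size_t)n >= size)
        return -1;              /* encoding error or truncated */
    return n == RECORD_LEN ? 0 : -1;
}
```

format_record(buf, sizeof buf, 99, 3.14) succeeds, whereas an integer of 12345 or a float that rounds to 10.00 makes the record longer than 13 characters and is rejected.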
First off, I dunno. There might be some way to wrangle sscanf to recognize the whitespace towards your integer count. But I just don't think scanf was made for this sort of format in mind. The tool's trying to be smart of helpful and it's biting you in the ass.
But if it's columnated data and you know the position of the various fields, there's a really easy work around. Just extract the field you want.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
int main(int argc, char** argv)
{
char line[] = "xxxx  893.14";
char tmp[100];
int thatDamnNumber;
float myfloatykins;
//Get that field
memcpy(tmp, line+4, 4);
tmp[4] = '\0'; //memcpy does not nul-terminate
sscanf(tmp, "%d", &thatDamnNumber);
//Kill that field so it doesn't goober-up the float
memset(line+4, ' ', 4);
sscanf(line, "%*4c%f", &myfloatykins);
printf("%d %f\n", thatDamnNumber, myfloatykins);
return 0;
}
If there is a lot of this, you could make some generalized functions: integerExtract(int positionStart, int sizeInCharacters), floatExtract(), etc.
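One possible shape for such generalized helpers, as a sketch only. The names field_to_long and field_to_double, the 0-based column convention and the FIELD_MAX limit are all assumptions, not part of the answer above. Each helper copies the field into a scratch buffer and converts it with strtol()/strtod(), so a blank or malformed field is reported instead of silently read as the next column.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

#define FIELD_MAX 63   /* widest field this sketch supports (assumption) */

/* Copy `width` characters starting at 0-based column `start` into buf.
 * Returns 0 on success, -1 if the line is too short or the field too wide. */
static int copy_field(const char *line, size_t start, size_t width,
                      char buf[FIELD_MAX + 1])
{
    assert(line != NULL);
    if (width == 0 || width > FIELD_MAX || strlen(line) < start + width)
        return -1;
    memcpy(buf, line + start, width);
    buf[width] = '\0';
    return 0;
}

/* Returns 0 and stores the value; -1 for a blank, malformed or
 * out-of-range field. */
int field_to_long(const char *line, size_t start, size_t width, long *out)
{
    char buf[FIELD_MAX + 1], *end;

    assert(out != NULL);
    if (copy_field(line, start, width, buf) != 0)
        return -1;
    errno = 0;
    *out = strtol(buf, &end, 10);
    if (end == buf || errno == ERANGE)
        return -1;                          /* blank field or overflow */
    return end[strspn(end, " ")] == '\0' ? 0 : -1;  /* only blanks may follow */
}

/* Same contract as field_to_long(), for floating-point fields. */
int field_to_double(const char *line, size_t start, size_t width, double *out)
{
    char buf[FIELD_MAX + 1], *end;

    assert(out != NULL);
    if (copy_field(line, start, width, buf) != 0)
        return -1;
    errno = 0;
    *out = strtod(buf, &end);
    if (end == buf || errno == ERANGE)
        return -1;
    return end[strspn(end, " ")] == '\0' ? 0 : -1;
}
```

For the line "xxxx  993.14", field_to_long(line, 4, 4, &i) stores 99 and field_to_double(line, 8, 4, &d) stores 3.14. A field that is entirely blank returns -1, which leaves the caller free to decide what a missing value means.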
If each element is of fixed width you don't really need scanf(), try this
char copy[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
copy[0] = line[4];
copy[1] = line[5];
copy[2] = line[6];
copy[3] = line[7];
copy[4] = '\0'; // nul terminate for `atoi' to work
ival = atoi(copy);
fval = atof(&line[8]);
fprintf(stdout, "%d -- %f\n", ival, fval);
If you want (probably should) you can use strtol() instead of atoi() and strtof() instead of atof() to check for malformed data.
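Both these functions take a second parameter that receives a pointer to the first character that was not converted. If that pointer equals the start of the string, nothing was converted; if it points at anything other than the end of the field, there were invalid trailing characters.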
Or if you really want scanf() do the same, capture the integer + whitespaces to a char array and then convert it to int later, like this
char integer[5];
const char *line = "xxxx  993.14";
int ival;
float fval;
if (sscanf(line, "%*4c%4[0-9 ]%f", integer, &fval) != 2)
return -1;
ival = atoi(integer);
fprintf(stdout, "%d -- %f\n", ival, fval);
The format "%*4c%4[0-9 ]%f" will
Skip the first four characters including white spaces.
Scan the next four characters if they consist only of digits or white spaces.
Scan the rest of the input string searching for a matching float value.
I am posting what I think is a final conclusion from the answers I have got so far and from other sources.
What is a very trivial task in Fortran is not such a trivial task in other languages. I guess, though I am not sure, that the same task could be as easy as in Fortran in some other languages. I think that in Cobol, Pascal, PL/I and others from the time of punched cards it would probably be trivial.
I think that most languages nowadays are more comfortable with different data structure and inherited its I/O structure from C. I think that Java, Python, Perl(?) and others could serve as examples.
From what I saw in this thread there are two main problems to read / convert fixed column length text data with C.
The first problem is that, as Philip said in his answer: “The tool’s trying to be smart of helpful and it’s biting you in the ass.” Quite right! The point is that it seems that C text I/O thinks that “white space” is something like a NULL character and should be thrown away, completely disregarding any information of the start of field. The only exception to that seems to be the %nc that get exactly n chars, even blanks.
The second problem is that the conversion “tag” (how is that called?) %nf will keep converting while it finds a valid character, even if you say stop at the 4th character.
If we join those two problems with a field completely filled with white space, depending on the conversion tool used, it throws an error or keeps going madly looking for something meaningful.
At the end of the day, it seems that the only way is to extract the field to another memory area, dynamically allocated or not (we can have an area for each column length), and try to parse this separate area, taking into account the possibility of an area that is entirely white space in order to catch the error.

Reading data in C

So my file has data of the form full_file_path # for example,
C:/dev/Java/src/java/util/concurrent/ConcurrentHashMap.java 212
C:/dev/Java/src/java/util/HashMap.java 212
C:/dev/Java/src/java/lang/CharacterData02.java 190
C:/dev/Java/src/java/lang/CharacterData0E.java 190
C:/dev/Java/src/java/nio/DirectCharBufferS.java 123
C:/dev/Java/src/java/nio/DirectCharBufferU.java 123
...
and I'm trying to read the file with
int dup;
char file[MAX_LINE];
...
FILE *fp;
fp = fopen("OUTPUT100.txt", "r");
while (fscanf(fp, "%s %d\n", &file, &dup) == 1) {
printf("%s %d\n", file, dup);
}
fclose(fp);
However the output is junk like
has 0 duplciate lines of code
RJ9 has 0 duplciate lines of code
has 0 duplciate lines of code
▒▒" has 0 duplciate lines of code
has 0 duplciate lines of code
"A▒ has 0 duplciate lines of code
has 0 duplciate lines of code
7▒cw has 0 duplciate lines of code
has 0 duplciate lines of code
has 0 duplciate lines of code
What am I doing wrong?
Edit: My "still spits out junk" comments were just brain farts. I had a printf in a loop below the while loop that was spitting out junk. After scrolling up a few hundred lines I started seeing sensible data. Commenting out the printf resulted in the expected results. The working read uses while (fscanf(fp, "%s %d\n", file, &dup) == 2). Thanks all.
Modify your loop call like this
while (fscanf(fp, "%s %d\n", file, &dup) == 2) {
As @mjswartz said, file is an array that will decay to char* when you pass it to fscanf. Also, fscanf will return the number of values it managed to assign, and you are scanning 2 variables per line.
Well, fscanf returns the number of successfully assigned conversions, so it will return 2 on success, and you're checking if it returns 1. You could also loop until it stops returning 2, which covers both end of file (EOF) and lines that do not match.
Also, the file variable is an array, which already decays to a pointer, so you shouldn't take its address.
int r;
while (1)
{
r = fscanf(fp, "%s %d\n", file, &dup);
if( r != 2 )
break; /* EOF, or a line that did not match */
printf("%s %d\n", file, dup);
}
regarding this line:
while (fscanf(fp, "%s %d\n", &file, &dup) == 1) {
the fscanf() returns the number of successful input/conversions
I.E. if it is successful, and the format string has 2 conversion parameters, then it will return 2, NOT 1.
Please read/understand the man page for fscanf()
The format string should NOT have a trailing '\n' (unlike printf() which almost always has a trailing '\n')
To consume any 'leftover' white space, like a newline, place a leading space in the format string.
The current format specifier '%s' does not have a max length modifier, so the data in the input file could overrun the available input buffer, resulting in undefined behaviour that can lead to a seg fault event.
Note: in C, the name of an array decays to a pointer to the first element of the array, so do not take the address of the array: file
suggest:
while (fscanf(fp, " %#s %d", file, &dup) == 2) {
where # is the value MAX_LINE-1, written out as a literal number.
There is absolutely no possibility that the posted code output the text that is posted as the output.
Suggest posting the actual code, the actual output and the expected output.
Note: file is a poor name for an array of char. suggest something meaningful, like buffer
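The width in a scanf format has to be a literal number, so the MAX_LINE-1 suggested above cannot be written as an expression. One common workaround is to build the format with the preprocessor's stringizing operator. The sketch below assumes MAX_LINE is 256 (the question does not say) and uses a hypothetical helper parse_entry that works on a line already read into memory.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_LINE   256          /* assumed buffer size */
#define PATH_WIDTH 255          /* must equal MAX_LINE - 1 */
#define STR_(x) #x
#define STR(x)  STR_(x)         /* expands PATH_WIDTH before stringizing */

/* Compile-time check that the two constants stay in step. */
typedef char path_width_matches_buffer[(PATH_WIDTH == MAX_LINE - 1) ? 1 : -1];

/* Parse "path count" from one line.
 * Returns 0 on success, -1 if either field is missing. */
int parse_entry(const char *line, char path[MAX_LINE], int *dup)
{
    assert(line != NULL && path != NULL && dup != NULL);
    /* The format expands to " %255s %d". */
    return sscanf(line, " %" STR(PATH_WIDTH) "s %d", path, dup) == 2 ? 0 : -1;
}
```

Because %s stops at whitespace, this only works for paths without spaces, which matches the sample data in the question.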

Check if user input into an array is too long?

I am getting the user to input 4 numbers. They can be input: 1 2 3 4 or 1234 or 1 2 34 , etc. I am currently using
int array[4];
scanf("%1x%1x%1x%1x", &array[0], &array[1], &array[2], &array[3]);
However, I want to display an error if the user inputs too many numbers: 12345 or 1 2 3 4 5 or 1 2 345 , etc.
How can I do this?
I am very new to C, so please explain as much as possible.
//
Thanks for your help.
What I have now tried to do is:
char line[101];
printf("Please input");
fgets(line, 101, stdin);
if (strlen(line)>5)
{
printf("Input is too large");
}
else
{
array[0]=line[0]-'0'; array[1]=line[1]-'0'; array[2]=line[2]-'0'; array[3]=line[3]-'0';
printf("%d%d%d%d", array[0], array[1], array[2], array[3]);
}
Is this a sensible and acceptable way? It compiles and appears to work on Visual Studios. Will it compile and run on C?
OP is on the right track, but needs some adjustments to deal with errors.
The current approach, using scanf(), can detect problems but cannot recover from them well. Instead, use a fgets()/sscanf() combination.
char line[101];
if (fgets(line, sizeof line, stdin) == NULL) HandleEOForIOError();
unsigned arr[4];
char ch; /* %c needs a pointer to char */
int cnt = sscanf(line, "%1x%1x%1x%1x %c", &arr[0], &arr[1], &arr[2],&arr[3],&ch);
if (cnt == 4) JustRight();
if (cnt < 4) Handle_TooFew();
if (cnt > 4) Handle_TooMany(); // cnt == 5
ch catches any lurking non-whitespace char after the 4 numbers.
Use %1u if looking for 1 decimal digit into an unsigned.
Use %1d if looking for 1 decimal digit into an int.
OP 2nd approach array[0]=line[0]-'0'; ..., is not bad, but has some shortcomings. It does not perform good error checking (non-numeric) nor handles hexadecimal numbers like the first. Further, it does not allow for leading or interspersed spaces.
Your question might be operating system specific. I am assuming it could be Linux.
You could first read an entire line with getline(3) (or readline(3), or even fgets(3) if you accept to set an upper limit to your input line size) then parse that line (e.g. with sscanf(3) and use the %n format specifier). Don't forget to test the result of sscanf (the number of read items).
So perhaps something like
int a=0,b=0,c=0,d=0;
char* line=NULL;
size_t linesize=0;
int lastpos= -1;
ssize_t linelen=getline(&line,&linesize,stdin);
if (linelen<0) { perror("getline"); exit(EXIT_FAILURE); };
int nbscanned=sscanf(line," %1d%1d%1d%1d %n", &a,&b,&c,&d,&lastpos);
if (nbscanned>=4 && lastpos==linelen) {
// be happy
do_something_with(a,b,c,d);
}
else {
// be unhappy
fprintf(stderr, "wrong input line %s\n", line);
exit(EXIT_FAILURE);
}
free(line); line=NULL;
And once you have the entire line, you could parse it by other means like successive calls of strtol(3).
Then, the issue is what happens if the stdin has more than one line. I cannot guess what you want in that case. Maybe feof(3) is relevant.
I believe that my solution might not be Linux specific, but I don't know. It probably should work on Posix 2008 compliant operating systems.
Be careful about the result of sscanf when having a %n conversion specification. The man page tells that standards might be contradictory on that corner case.
If your operating system is not Posix compliant (e.g. Windows) then you should find another way. If you accept to limit line size to e.g. 128 you might code
char line[128];
memset (line, 0, sizeof(line));
fgets(line, sizeof(line), stdin);
int linelen = (int)strlen(line);
then append the sscanf and following code from the previous (i.e. first) code chunk (but without the last line calling free(line)).
What you are trying to get is 4 digits with or without spaces between them. For that, you can take a string as input and then check that string character by character and count the number of digits(and spaces and other characters) in the string and perform the desired action/ display the required message.
You can't do that with scanf. Problem is, there are ways to make scanf search for something after the 4 numbers, but all of them will just sit there and wait for more user input if the user does NOT enter more. So you'd need to use fgets() and parse the string to do that (not gets(), which cannot limit the input length and was removed from the language in C11).
It would probably be easier for you to change your program, so that you ask for one number at a time - then you ask 4 times, and you're done with it, so something along these lines, in pseudo code:
i = 0
while i < 4
ask for number
scanf number and save in array at index i
E.g
#include <stdio.h>
int main(void){
unsigned array[4];//%1x stores into an unsigned
int ch, rc;
size_t i, size = sizeof(array)/sizeof(*array);//4
i = 0;
while(i < size){
rc = scanf("%1x", &array[i]);
if(rc == EOF){
return 1;//no more input
} else if(1 != rc){
//printf("invalid input");
scanf("%*[^0123456789abcdefABCDEF]");//or "%*[^0-9A-Fa-f]"
} else {
++i;
}
}
if('\n' != (ch = getchar())){
printf("Extra input !\n");
scanf("%*[^\n]");//remove extra input
}
for(i=0;i<size;++i){
printf("%x", array[i]);
}
printf("\n");
return 0;
}

using fscanf to read in items

There is a file a.txt looks like this:
1 abc
2
3 jkl
I want to read each line of this file as an int and a string, like this:
fp = fopen("a.txt", "r");
while (1) {
int num;
char str[10];
int ret =fscanf(fp, "%d%s", &num, str);
if (EOF == ret)
break;
else if (2 != ret)
continue;
// do something with num and str
}
But there is a problem, if a line in a.txt contains just a num, no string (just like line 2), then the above code will be stuck in that line.
So any way to jump to the next line?
Do it in two steps:
Read a full line using fgets(). This assumes you can put a simple static limit on how long lines you want to support, but that's very often the case.
Use sscanf() to inspect and parse the line.
This works around the problem you ran into, and is generally the better approach for problems like these.
UPDATE: Trying to respond to your comment. The way I see it, the easiest approach is to always read the full line (as above), then parse out the two fields you consider it to consist of: the number and the string.
If you do this with sscanf(), you can use the return value to figure out how it went, like you tried with fscanf() in your code:
const int num_fields = sscanf(line, "%d %9s", &num, str); /* str holds 10 chars */
if( num_fields == 1 )
; /* we got only the number */
else if( num_fields == 2 )
; /* we got both the number and the string */
else
; /* failed to get either */
Not sure when you wouldn't "need" the string; it's either there or it isn't.
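Putting the two steps together, a reader for the a.txt format might look roughly like this. The function name read_item and the 256-byte line buffer are assumptions; lines longer than the buffer are not handled in this sketch.

```c
#include <assert.h>
#include <stdio.h>

/* Read one line from fp and parse "number [string]" from it.
 * Returns 2 if both fields were found, 1 if only the number was found,
 * 0 if the line did not start with a number (or was blank),
 * and EOF when there are no more lines. */
int read_item(FILE *fp, int *num, char str[10])
{
    char line[256];     /* assumed upper bound on line length */
    int n;

    assert(fp != NULL && num != NULL && str != NULL);
    if (fgets(line, sizeof line, fp) == NULL)
        return EOF;

    str[0] = '\0';      /* so the caller sees an empty string if absent */
    n = sscanf(line, "%d %9s", num, str);
    return n == EOF ? 0 : n;    /* a blank line counts as "no fields" */
}
```

Because each call consumes exactly one line, a line holding only a number can no longer swallow part of the following line, which is what made the original fscanf() loop get stuck.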
If the first character of the line is \r or \n, the line is empty; you can test for that with a simple comparison. fscanf() is not suitable if the input contains empty lines or words with spaces in them. In that case it is better to use fgets().
How to solve "using fscanf":
After the int, look for spaces (don't save), then look for char that are not '\n'.
int num;
char str[10];
#define ScanInt "%d"
#define ScanSpacesDontSave "%*[ ]"
#define ScanUntilEOL "%9[^\n]"
int ret = fscanf(fp, ScanInt ScanSpacesDontSave ScanUntilEOL, &num, str);
if (EOF == ret) break;
else if (1 == ret) HandleNum(num);
else if (2 == ret) HandleNumStr(num, str);
else HandleMissingNum();
ret will be 2 if something was scanned into str, else ret will typically be 1. The trick to the above is not to scan in a '\n' after the int. Thus code can not use "%s" nor " " after the "%d" which both consume all (leading) white-space. The next fscanf() call will consume the '\n' as part of leading white-space consumption via "%d".
Minor note: reading the line with fgets() then parsing the buffer is usually a better approach, but coding goals may preclude that.
