A way to fscanf on only the first line - c

I've been looking for a way to obtain 2 integers, separated by a space, that are located in the first line of a file that I read. I considered using
fscanf(file, "%d %d\n", &wide, &high);
But that read 2 integers that were anywhere in the file, and would give the wrong output if the first line was in the wrong format. I also tried using
char line[1001];
fgets(line, 1000, file);
Which seems like the best bet, except for how clumsy it is. It leaves me with a string that has up to a few hundred blank spaces, from which I must extract my precious integers, never mind checking for errors in formatting.
Surely there is a better option than this? I'll accept any solution, but the most robust solution seems (to me) to be a fscanf on the first line only. Any way to do that?

You can capture the character immediately following the second number in a char, and check that the captured character is '\n', like this:
int wide, high;
char c;
if (fscanf(file, "%d%d%c", &wide, &high, &c) != 3 || c != '\n') {
printf("Incorrect file format: expected two ints followed by a newline.");
}

Which seems like the best bet, except for how clumsy it is.
Nah, it's not clumsy at all (except that you are using the size argument of fgets() in the wrong way...). It's perfectly fine & idiomatic. strtol() does its job pretty well:
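char line[LINE_MAX];                 /* LINE_MAX comes from <limits.h> (POSIX) */
if (fgets(line, sizeof line, file) == NULL) {
    /* handle EOF or read error here */
}
char *endp;
int width = strtol(line, &endp, 10); /* endp is left just past the first number */
int height = strtol(endp, NULL, 10);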
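The fragment above leaves out validation. A minimal sketch of how the end pointer could be used to reject a malformed first line follows; `parse_two_ints` is a hypothetical helper name, and the policy of allowing trailing blanks plus a single newline is an assumption, not something stated in the answer.

```c
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/* Hypothetical helper: parse exactly two ints from one line of text.
 * Returns 0 on success, -1 if the line is malformed. */
int parse_two_ints(const char *line, int *wide, int *high)
{
    char *end;
    long a, b;

    errno = 0;
    a = strtol(line, &end, 10);
    if (end == line || errno == ERANGE || a < INT_MIN || a > INT_MAX)
        return -1;                  /* no first number, or out of range */

    const char *second = end;
    b = strtol(second, &end, 10);
    if (end == second || errno == ERANGE || b < INT_MIN || b > INT_MAX)
        return -1;                  /* no second number, or out of range */

    /* Assumed policy: trailing blanks and one newline are fine, nothing else. */
    while (*end == ' ' || *end == '\t')
        end++;
    if (*end == '\n')
        end++;
    if (*end != '\0')
        return -1;

    *wide = (int)a;
    *high = (int)b;
    return 0;
}
```

The caller would pass the buffer filled by fgets, so only the first line of the file is ever examined.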

Related

Store hex input into int variable without using scanf() function in C

Pre-History:
I had a problem where getchar() was not processed properly: the program did not wait for any input and simply continued processing.
I searched the internet for the cause and found that if scanf() is called in a program before getchar(), getchar() does not behave as expected, which matched what I was seeing.
Citation:
I will bet you ONE HUNDRED DOLLARS you only see this problem when the call to getchar() is preceded by a scanf().
Don't use scanf for interactive programs. There are two main reasons for this:
1) scanf can't recover from malformed input. You have to get the format string right, every time, or else it just throws away whatever input it couldn't match and returns a value indicating failure. This might be fine if you're parsing a fixed-format file when poor formatting is unrecoverable anyway, but it's the exact opposite of what you want to do with user input. Use fgets() and sscanf(), fgets() and strtok(), or write your own user input routines using getchar() and putchar().
1.5) Even properly used, scanf inevitably discards input (whitespace) that can sometimes be important.
2) scanf has a nasty habit of leaving newlines in the input stream. This is fine if you never use anything but scanf, since scanf will usually skip over any whitespace characters in its eagerness to find whatever it's expecting next. But if you mix scanf with fgets/getchar, it quickly becomes a total mess trying to figure out what might or might not be left hanging out in the input stream. Especially if you do any looping -- it's quite common for the input stream to be different on the first iteration, which results in a potentially weird bug and even weirder attempts to fix it.
tl;dr -- scanf is for formatted input. User input is not formatted.
Here is the link, to that thread: https://bbs.archlinux.org/viewtopic.php?id=161294
scanf() with:
scanf("%x", &integer_variable);
seems for me as a newbie to the scene as the only way possible to input a hex number from the keyboard (or better said, from stdin) and store it in an int variable.
Is there a different way to read a hex value from stdin and store it in an integer variable?
Bonus challenge: it would also be nice if I could write negative values (through negative hex input, of course) into a signed int variable.
INFO: I have read many threads about similar C problems here on Stack Overflow, but none of them answers my specific question, so I've posted this one.
I work under Linux Ubuntu.
The quote about the hundred dollar bet is accurate. Mixing scanf with getchar is almost always a bad idea; it almost always leads to trouble. It's not that they can't be used together, though. It's possible to use them together -- but usually, it's just way too difficult. There are too many fussy little details and "gotcha!"s to keep track of. It's more trouble than it's worth.
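One of those fussy details is the newline that scanf leaves behind. As a rough illustration only, a program that does mix the two could discard the rest of the line after each scanf call; `skip_to_eol` is a hypothetical helper name, not a standard function.

```c
#include <stdio.h>

/* Hypothetical helper: discard the rest of the current line, including
 * its newline. Returns the last character read ('\n' or EOF). */
int skip_to_eol(FILE *fp)
{
    int c;
    while ((c = fgetc(fp)) != EOF && c != '\n')
        ;                           /* throw the character away */
    return c;
}
```

Calling `skip_to_eol(stdin)` right after a scanf means the next getchar() waits for fresh input instead of returning the leftover newline.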
At first you had said
scanf() with ... %d ... seems for me as a newbie to the scene as the only way possible to input a hex number from the keyboard
There was some side confusion there, because of course %d is for decimal input. But since I'd written this answer by the time you corrected that, let's proceed with decimal for the moment.
(Also for the moment I'm leaving out error checking -- that is, these code fragments don't check for or do anything graceful if the user doesn't type the requested number.) Anyway, here are several ways of reading an integer:
scanf("%d", &integer_variable);
You're right, this is the (superficially) easiest way.
char buf[100];
fgets(buf, sizeof(buf), stdin);
integer_variable = atoi(buf);
This is, I think, the easiest way that doesn't use scanf. But most people these days frown on using atoi, because it doesn't do much useful error checking.
char buf[100];
fgets(buf, sizeof(buf), stdin);
integer_variable = strtol(buf, NULL, 10);
This is almost the same as before, but avoids atoi in favor of the preferred strtol.
char buf[100];
fgets(buf, sizeof(buf), stdin);
sscanf(buf, "%d", &integer_variable);
This reads a line and then uses sscanf to parse it, another popular and general technique.
All of these will work; all of these will handle negative numbers. It's important to think about error conditions, though -- I'll have more to say about that later.
If you want to input hexadecimal numbers, the techniques are similar:
scanf("%x", &integer_variable);
char buf[100];
fgets(buf, sizeof(buf), stdin);
integer_variable = strtol(buf, NULL, 16);
char buf[100];
fgets(buf, sizeof(buf), stdin);
sscanf(buf, "%x", &integer_variable);
These should all work, too. I wouldn't necessarily expect them to handle "negative hexadecimal", though, because that's an unusual requirement. Most of the time, hexadecimal notation is used for unsigned integers. (In fact, strictly speaking, %x with scanf and sscanf must be used with an integer_variable that has been declared as unsigned int, not plain int.)
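For the "negative hexadecimal" bonus, one option is strtol itself: with base 16 it accepts an optional sign and an optional 0x prefix. A minimal sketch, assuming the line has already been read with fgets (`parse_signed_hex` is a hypothetical name):

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a hex number with an optional sign and an
 * optional 0x prefix. Returns 0 on success, -1 on failure. */
int parse_signed_hex(const char *s, long *out)
{
    char *end;

    errno = 0;
    long v = strtol(s, &end, 16);
    if (end == s || errno == ERANGE)
        return -1;                  /* no hex digits found, or out of range */
    *out = v;
    return 0;
}
```

With this, input such as -1A yields -26 in a signed variable, which scanf's %x is not specified to do for a plain int.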
Sometimes it's useful or necessary to do this sort of thing "by hand". Here's a code fragment that reads exactly two hexadecimal digits. I'll start out with the version using getchar:
int c1 = getchar();
if(c1 != EOF && isascii(c1) && isxdigit(c1)) {
int c2 = getchar();
if(c2 != EOF && isascii(c2) && isxdigit(c2)) {
if(isdigit(c1)) integer_variable = c1 - '0';
else if(isupper(c1)) integer_variable = 10 + c1 - 'A';
else if(islower(c1)) integer_variable = 10 + c1 - 'a';
integer_variable = integer_variable * 16;
if(isdigit(c2)) integer_variable += c2 - '0';
else if(isupper(c2)) integer_variable += 10 + c2 - 'A';
else if(islower(c2)) integer_variable += 10 + c2 - 'a';
}
}
As you can see, it's a bit of a jawbreaker. Me, although I almost never use members of the scanf family, this is one place where I sometimes do, precisely because doing it "by hand" is so much work. You can simplify it considerably by using an auxiliary function or macro to do the digit conversion:
int c1 = getchar();
if(c1 != EOF && isascii(c1) && isxdigit(c1)) {
int c2 = getchar();
if(c2 != EOF && isascii(c2) && isxdigit(c2)) {
integer_variable = Xctod(c1);
integer_variable = integer_variable * 16;
integer_variable += Xctod(c2);
}
}
Or you could collapse those inner expressions down to just
integer_variable = 16 * Xctod(c1) + Xctod(c2);
These work in terms of an auxiliary function:
int Xctod(int c)
{
if(!isascii(c)) return 0;
else if(isdigit(c)) return c - '0';
else if(isupper(c)) return 10 + c - 'A';
else if(islower(c)) return 10 + c - 'a';
else return 0;
}
Or perhaps a macro (though this is definitely an old-school sort of thing):
#define Xctod(c) (isdigit(c) ? (c) - '0' : (c) - (isupper(c) ? 'A' : 'a') + 10)
Often I'm parsing hexadecimal digits like this not from stdin using getchar(), but from a string. Often I'm using a character pointer (char *p) to step through the string, meaning that I end up with code more like this:
char c1 = *p++;
if(isascii(c1) && isxdigit(c1)) {
char c2 = *p++;
if(isascii(c2) && isxdigit(c2))
integer_variable = 16 * Xctod(c1) + Xctod(c2);
}
It's tempting to omit the temporary variables and the error checking and boil this down still further:
integer_variable = 16 * Xctod(*p++) + Xctod(*p++);
But don't do this! Besides the lack of error checking, this expression is probably undefined, and it definitely won't always do what you want, because there's no longer any guarantee about what order you read the characters in. If you know p points at the first of two hex digits, you don't want to collapse it any further than
integer_variable = Xctod(*p++);
integer_variable = 16 * integer_variable + Xctod(*p++);
and even then, this will work only with the function version of Xctod, not the macro, since the macro evaluates its argument multiple times.
Finally, let's talk about error handling. There are quite a few possibilities to worry about:
1. The user hits Return without typing anything.
2. The user types whitespace before or after the number.
3. The user types extra garbage after the number.
4. The user types non-numeric input instead of a number.
5. The code hits end-of-file; there are no characters to read at all.
And then how you handle these depends on what input techniques you're using. Here are the basic rules:
A. If you're calling scanf, fscanf, or sscanf, always check the return value. If it's not 1 (or, in the case where you had multiple % specifiers, it's not the number of values you expected to read), it means something went wrong. This will generally catch problems 4 and 5, and will handle case 2 gracefully. But it will often quietly ignore problems 1 and 3. (In particular, scanf and fscanf treat an extra \n just like leading whitespace.)
B. If you're calling fgets, again, always check the return value. You'll get NULL on EOF (problem 5). Handling the other problems depends on what you do with the line you read.
C. If you're calling atoi, it will deal gracefully with problem 2, but it will ignore problem 3, and it will quietly turn problem 4 into the number 0 (which is why atoi is usually not recommended any more).
D. If you're calling strtol or any of the other "strto" functions, they will deal gracefully with problem 2, and if you let them give you back an "end pointer", you can check for and deal with problems 3 and 4. (Note that I left the end-pointer handling out of my two strtol examples above.)
E. Finally, if you're doing something down-and-dirty like my "hardway" two-digit hex converter, you generally have to take care of all these problems, explicitly, yourself. If you want to skip leading whitespace you have to do so (the isspace function from <ctype.h> can help), and if there might be unexpected non-digit characters, you have to check for those, too. (That's what the calls to isascii and isxdigit are doing in my "hardway" two-digit hex converter.)
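Rule D can be sketched as a single function that reports which of the problems occurred. This is only an illustration: the enum and function names are hypothetical, the out-of-range case is an extra that the list above does not mention, and problem 5 (end-of-file) is left to the fgets call that produces the line.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

enum parse_status {
    PARSE_OK,
    PARSE_EMPTY,        /* problem 1: nothing but whitespace */
    PARSE_NOT_NUMBER,   /* problem 4: no digits where a number should start */
    PARSE_TRAILING,     /* problem 3: garbage after the number */
    PARSE_RANGE         /* extra case: value does not fit in a long */
};

/* Hypothetical helper: parse one decimal long from a line read by fgets. */
enum parse_status parse_long_line(const char *line, long *out)
{
    const char *p = line;
    char *end;

    while (isspace((unsigned char)*p))
        p++;                        /* problem 2: leading whitespace is fine */
    if (*p == '\0')
        return PARSE_EMPTY;

    errno = 0;
    long v = strtol(p, &end, 10);
    if (end == p)
        return PARSE_NOT_NUMBER;
    if (errno == ERANGE)
        return PARSE_RANGE;

    while (isspace((unsigned char)*end))
        end++;                      /* problem 2: trailing whitespace is fine */
    if (*end != '\0')
        return PARSE_TRAILING;

    *out = v;
    return PARSE_OK;
}
```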
Per the scanf man page, you can use scanf to read a hex number from stdin into an (unsigned) integer variable.
unsigned int v ;
if ( scanf("%x", &v) == 1 ) {
// do something with v.
}
As per man page, %x is always unsigned. If you want to support negative values, you will have to add explicit logic.
As mentioned in the link you posted, using fgets and sscanf is the best way to handle this. fgets will read a full line of text and sscanf will parse the line.
For example
char line[100];
fgets(line, sizeof(line), stdin);
unsigned int x;   /* %x requires a pointer to unsigned int */
int rval = sscanf(line, "%x", &x);
if (rval == 1) {
printf("read value %x\n", x);
} else {
printf("please enter a hexadecimal integer\n");
}
Since you're only reading in a single integer, you could also use strtol instead of sscanf. This also has the advantage of detecting if any additional characters were entered:
char *ptr;
errno = 0;
long x = strtol(line, &ptr, 16);
if (errno) {
perror("parsing failed");
} else if (ptr == line) {
printf("no hexadecimal digits entered\n");
} else if (*ptr != '\n' && *ptr != 0) {
printf("extra characters entered: %s\n", ptr);
} else {
printf("read value %lx\n", x);
}

Reading a specific line from a file in C

Okay, so after reading both: How to read a specific line in a text file in C (integers) and What is the easiest way to count the newlines in an ASCII file? I figured that I could use the points mentioned in both to both efficiently and quickly read a single line from a file.
Here's the code I have:
char buf[BUFSIZ];
intmax_t lines = 2; // when set to zero, reads two extra lines.
FILE *fp = fopen(filename, "r");
while ((fscanf(fp, "%*[^\n]"), fscanf(fp, "%*c")) != EOF)
{
/* globals.lines_to_feed__queue is the line that we _do_ want to print,
that is we want to ignore all lines up to that point:
feeding them into "nothingness" */
if (lines == globals.lines_to_feed__queue)
{
fgets(buf, sizeof buf, fp);
}
++lines;
}
fprintf(stdout, "%s", buf);
fclose(fp);
Now the above code works wonderfully, and I'm extremely pleased with myself for figuring out that you can fscanf a file up to a certain point and then use fgets to read whatever data is at that point into a buffer, instead of having to fgets every single line and then fprintf the buf when all I care about is the line that I'm printing. I don't want to store strings that I couldn't care less about in a buffer that I'm only going to use once for a single line.
However, the only issue I've run into, as noted by the // when set to zero, reads two extra lines comment: when lines is initialized with a value of 0, and the line I want is like 200, the line I'll get will actually be line 202. Could someone please explain what I'm doing wrong here/why this is happening and whether my quick fix lines = 2; is fine or if it is insufficient (as in, is something really wrong going on here, and it just happens to work?)
There are two reasons why you have to set the lines to 2, and both can be derived from the special case where you want the first line.
On one hand, in the while loop the first thing you do is use fscanf to consume a line, then you check if the lines counter matches the line you want. The thing is that if the line you want is the one you just consumed you are out of luck. On the other hand you are basically moving through lines by finding the next \n and incrementing lines after you check if the current line is the one you're after.
These two factors combined cause the offset in the lines count, so the following is a version of the same function taking them into account. Additionally it also contains a break; statement once you get to the line you are looking for, so that the while loop stops looking further into the file.
void read_and_print_line(char * filename, int line) {
char buf[BUFSIZ];   /* BUFSIZ is defined in <stdio.h> */
int lines = 0;
FILE *fp = fopen(filename, "r");
if (fp == NULL) return;   /* nothing to read */
do
{
if (++lines == line) {
fgets(buf, sizeof buf, fp);
break;
}
}while((fscanf(fp, "%*[^\n]"), fscanf(fp, "%*c")) != EOF);
if(lines == line)
printf("%s", buf);
fclose(fp);
}
Just as another way of looking at the problem… Assuming that your global specifies 1 when the first line is to be printed, 2 for the second, etc, then:
char buf[BUFSIZ];
FILE *fp = fopen(filename, "r");
if (fp == 0)
return; // Error exit — report error.
for (int lineno = 1; lineno < globals.lines_to_feed__queue; lineno++)
{
fscanf(fp, "%*[^\n]");
if (fscanf(fp, "%*c") == EOF)
break;
}
if (fgets(buf, sizeof(buf), fp) != 0)
fprintf(stdout, "%s", buf);
else
…requested line not present in file…
fclose(fp);
You could replace the break with fclose(fp); and return; if that's appropriate (but do make sure you close the file before exiting; otherwise, you leak resources).
If your line numbers are counted from 0, then change the lower limit of the for loop to 0.
First, about what is wrong here: this code is unable to read the very first line in the file (what happens if globals.lines_to_feed__queue is 0?). It would also miscount lines should the file contain successive newlines.
Second, you must realize that there is no magic. Since you don't know at which offset the string in question lives, you have to patiently read file character by character, counting end-of-strings along the way. It doesn't matter if you delegate the reading/counting to fgets/fscanf, or fgetc each character for manual inspection - either way an uninteresting piece of file will make its way from the disk into the OS buffers, and then into the userspace for interpretation.
Your gut feeling is absolutely correct: the code is broken.
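The "read character by character and count newlines" idea can be sketched as below. This is an illustration under assumptions, not any answerer's code: `read_nth_line` is a hypothetical name, and line numbers are assumed to start at 1.

```c
#include <stdio.h>

/* Hypothetical helper: copy line number `target` (1-based) of fp into buf.
 * Returns buf on success, NULL if target < 1 or the file is too short. */
char *read_nth_line(FILE *fp, long target, char *buf, int size)
{
    long line = 1;
    int c;

    if (target < 1)
        return NULL;

    /* Skip whole lines, empty ones included, by counting newlines. */
    while (line < target) {
        c = fgetc(fp);
        if (c == EOF)
            return NULL;
        if (c == '\n')
            line++;
    }
    return fgets(buf, size, fp);
}
```

Because the newline itself is what advances the counter, the first line and runs of empty lines need no special cases.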

Reading file with different format in each line in C

I've just started learning C (an optional school course), and I've been stuck on a small problem for 2 days. The basic idea is that I have a bunch of data in a file that I want to extract. However, the data comes in 2 formats, and the first letter on each line determines what action I need to take.
For example, the data in the file looks like this:
S:John,engineer,male,30
S:Alice,teacher,female,40
C:Ford Focus,4-door,25000
C:Chevy Corvette,sports,56000
S:Anna,police,female,36
What I want to do is, after opening the file, read each line. If the first letter is S, then use
fscanf(fp, "%*c:%[^,],%[^,],%[^,],%d%*c",name,job,sex,&age)
to store all variable so I can pass them to function people().
But if the first letter is C, then use
fscanf(fp, "%*c:%[^,],%[^,],%d%*c",car,type,&price)
to store so I can pass them to function vehicle().
I would really appreciate it if anyone could give me some pointers on how to do this. Thanks.
There are many approaches, but separating IO from parsing is a good first step.
With line oriented data, it is so much cleaner to simply
FILE *inf = ...;
char buf[100];
if (fgets(buf, sizeof buf, inf) == NULL) Handle_EOForIOError();
Then parse it.
char name[sizeof buf];
char job[sizeof buf];
char sex[sizeof buf];
unsigned age;
char car[sizeof buf];
char type[sizeof buf];
unsigned cost;
int n;
if (sscanf(buf, "S:%[^,],%[^,],%[^,],%u %n", name, job, sex, &age, &n) == 4 &&
buf[n] == '\0')
Good_SRecord();
else if (sscanf(buf, "C:%[^,],%[^,],%u %n", car, type, &cost, &n) == 3 &&
buf[n] == '\0')
Good_CRecord();
else Garbage();
The " %n" trick is really good at making sure all the data parsed as expected without extra junk.
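Pulled together into one function, the approach might look like the sketch below. The `struct record` layout, the 32-byte field size, and the name `parse_record` are assumptions made for illustration; the field widths in the format strings keep an over-long field from overflowing its buffer.

```c
#include <stdio.h>

/* Hypothetical container for either kind of record. */
struct record {
    char kind;                      /* 'S' or 'C' */
    char f1[32], f2[32], f3[32];    /* f3 is unused for 'C' records */
    unsigned num;                   /* age for 'S', price for 'C' */
};

/* Returns 1 if line is a complete S or C record, 0 otherwise. */
int parse_record(const char *line, struct record *r)
{
    int n = 0;

    if (sscanf(line, "S:%31[^,],%31[^,],%31[^,],%u %n",
               r->f1, r->f2, r->f3, &r->num, &n) == 4 && line[n] == '\0') {
        r->kind = 'S';
        return 1;
    }
    n = 0;
    if (sscanf(line, "C:%31[^,],%31[^,],%u %n",
               r->f1, r->f2, &r->num, &n) == 3 && line[n] == '\0') {
        r->kind = 'C';
        r->f3[0] = '\0';
        return 1;
    }
    return 0;                       /* neither format matched completely */
}
```

The caller can then switch on `r.kind` to decide whether to call people() or vehicle().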
Here is the rough idea: parse the string in 2 steps:
"%c:%[^\n]"
will get you the first character and the rest of the line (a plain %s would stop at the first space, for example inside "Ford Focus"). Then, based on what you read in the first character, you can continue parsing the remaining part as
"%[^,],%[^,],%[^,],%d%c"
or
"%[^,],%[^,],%d%c"

Check if user input into an array is too long?

I am getting the user to input 4 numbers. They can be input: 1 2 3 4 or 1234 or 1 2 34 , etc. I am currently using
int array[4];
scanf("%1x%1x%1x%1x", &array[0], &array[1], &array[2], &array[3]);
However, I want to display an error if the user inputs too many numbers: 12345 or 1 2 3 4 5 or 1 2 345 , etc.
How can I do this?
I am very new to C, so please explain as much as possible.
Thanks for your help.
What I have now tried to do is:
char line[101];
printf("Please input");
fgets(line, 101, stdin);
if (strlen(line)>5)
{
printf("Input is too large");
}
else
{
array[0]=line[0]-'0'; array[1]=line[1]-'0'; array[2]=line[2]-'0'; array[3]=line[3]-'0';
printf("%d%d%d%d", array[0], array[1], array[2], array[3]);
}
Is this a sensible and acceptable way? It compiles and appears to work in Visual Studio. Will it compile and run as standard C?
OP is on the right track, but needs some adjustments to deal with errors.
The current approach, using scanf(), can detect problems but cannot recover from them well. Instead, use a fgets()/sscanf() combination.
char line[101];
if (fgets(line, sizeof line, stdin) == NULL) HandleEOForIOError();
unsigned arr[4];
char ch;   /* %c stores into a char */
int cnt = sscanf(line, "%1x%1x%1x%1x %c", &arr[0], &arr[1], &arr[2],&arr[3],&ch);
if (cnt == 4) JustRight();
if (cnt < 4) Handle_TooFew();
if (cnt > 4) Handle_TooMany(); // cnt == 5
ch catches any lurking non-whitespace char after the 4 numbers.
Use %1u if looking for 1 decimal digit into an unsigned.
Use %1d if looking for 1 decimal digit into an int.
OP 2nd approach array[0]=line[0]-'0'; ..., is not bad, but has some shortcomings. It does not perform good error checking (non-numeric) nor handles hexadecimal numbers like the first. Further, it does not allow for leading or interspersed spaces.
Your question might be operating system specific. I am assuming it could be Linux.
You could first read an entire line with getline(3) (or readline(3), or even fgets(3) if you accept to set an upper limit to your input line size) then parse that line (e.g. with sscanf(3) and use the %n format specifier). Don't forget to test the result of sscanf (the number of read items).
So perhaps something like
int a=0,b=0,c=0,d=0;
char* line=NULL;
size_t linesize=0;
int lastpos= -1;
ssize_t linelen=getline(&line,&linesize,stdin);
if (linelen<0) { perror("getline"); exit(EXIT_FAILURE); };
int nbscanned=sscanf(line," %1d%1d%1d%1d %n", &a,&b,&c,&d,&lastpos);
if (nbscanned>=4 && lastpos==linelen) {
// be happy
do_something_with(a,b,c,d);
}
else {
// be unhappy
fprintf(stderr, "wrong input line %s\n", line);
exit(EXIT_FAILURE);
}
free(line); line=NULL;
And once you have the entire line, you could parse it by other means like successive calls of strtol(3).
Then, the issue is what happens if the stdin has more than one line. I cannot guess what you want in that case. Maybe feof(3) is relevant.
I believe that my solution might not be Linux specific, but I don't know. It probably should work on Posix 2008 compliant operating systems.
Be careful about the result of sscanf when having a %n conversion specification. The man page tells that standards might be contradictory on that corner case.
If your operating system is not Posix compliant (e.g. Windows) then you should find another way. If you accept to limit line size to e.g. 128 you might code
char line[128];
memset (line, 0, sizeof(line));
fgets(line, sizeof(line), stdin);
int linelen = (int)strlen(line);   /* strlen needs <string.h>; int avoids the POSIX-only ssize_t */
then you do append the sscanf and following code from the previous (i.e. first) code chunk (but without the last line calling free(line)).
What you are trying to get is 4 digits with or without spaces between them. For that, you can take a string as input and then check that string character by character and count the number of digits(and spaces and other characters) in the string and perform the desired action/ display the required message.
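A minimal sketch of that character-by-character check, assuming decimal digits and a line already read with fgets; the function name and the return codes are hypothetical choices made for this example.

```c
#include <ctype.h>

/* Hypothetical helper: extract exactly four decimal digits from a line,
 * ignoring whitespace between them.
 * Returns 0 on success, -1 if too few digits, 1 if too many,
 * and -2 if any other character is present. */
int parse_four_digits(const char *line, int out[4])
{
    int count = 0;

    for (const char *p = line; *p != '\0'; p++) {
        unsigned char c = (unsigned char)*p;
        if (isspace(c))
            continue;               /* spaces and the newline are skipped */
        if (!isdigit(c))
            return -2;
        if (count == 4)
            return 1;               /* a fifth digit: too many */
        out[count++] = c - '0';
    }
    return count == 4 ? 0 : -1;
}
```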
You can't do that with scanf. The problem is that there are ways to make scanf look for something after the 4 numbers, but all of them will just sit there and wait for more user input if the user does NOT enter more. So you'd need to use fgets() and parse the string to do that. (Avoid gets(): it cannot limit how much it reads, and it was removed from the language in C11.)
It would probably be easier for you to change your program, so that you ask for one number at a time - then you ask 4 times, and you're done with it, so something along these lines, in pseudo code:
i = 0
while i < 4
ask for number
scanf number and save in array at index i
E.g
#include <stdio.h>
int main(void){
unsigned array[4];   /* %1x and %x expect unsigned int */
int ch;
size_t i, size = sizeof(array)/sizeof(*array);//4
i = 0;
while(i < size){
if(1!=scanf("%1x", &array[i])){
//printf("invalid input");
scanf("%*[^0123456789abcdefABCDEF]");//or "%*[^0-9A-Fa-f]"
} else {
++i;
}
}
if('\n' != (ch = getchar())){
printf("Extra input !\n");
scanf("%*[^\n]");//remove extra input
}
for(i=0;i<size;++i){
printf("%x", array[i]);
}
printf("\n");
return 0;
}

Simple count how many integers are in file in C

I'm currently learning C through random maths questions and have hit a wall. I'm trying to read 1000 digits into an array, but I can't do that without specifying the size of the array first.
My answer was to count how many integers there are in the file and then set that as the size of the array.
However, my program returns 4200396 instead of 1000 as I had hoped.
I'm not sure what's going on.
my code: EDIT
#include <stdio.h>
#include <stdlib.h>
int main (void)
{
FILE* fp;
const char filename[] = "test.txt";
int ch;   /* int, not char, so that EOF can be told apart from real data */
int count = 0;
fp = fopen(filename, "r");
if( fp == NULL )
{
printf( "Cannot open file: %s\n", filename);
exit(8);
}
do
{
ch = fgetc (fp);
count++;
}while (ch != EOF);
fclose(fp);
printf("Text file contains: %d\n", count);
return EXIT_SUCCESS;
}
test.txt file:
731671765313306249192251196744265747423553491949349698352031277450632623957831801698480186947885184385861560789112949495459501737958331952853208805511
125406987471585238630507156932909632952274430435576689664895044524452316173185640309871112172238311362229893423380308135336276614282806444486645238749
303589072962904915604407723907138105158593079608667017242712188399879790879227492190169972088809377665727333001053367881220235421809751254540594752243
525849077116705560136048395864467063244157221553975369781797784617406495514929086256932197846862248283972241375657056057490261407972968652414535100474
821663704844031998900088952434506585412275886668811642717147992444292823086346567481391912316282458617866458359124566529476545682848912883142607690042
242190226710556263211111093705442175069416589604080719840385096245544436298123098787992724428490918884580156166097919133875499200524063689912560717606
0588611646710940507754100225698315520005593572972571636269561882670428252483600823257530420752963450
Any help would be great.
You forgot to initialize count, so it contains random garbage.
int count = 0;
(But note that with this change it's still not going to work, since %d in a scanf format means read as many digits as you find rather than read a single digit.)
Turn on your compiler's warnings (-Wall), it will tell you that you didn't initialize count, which is a problem: it could contain absolutely anything when your program starts.
So initialize it:
int count = 0;
The other problem is that the scanfs won't do what you want, at all. %d will match a series of digits (a number), not an individual digit. If you do want to do your counting like that, use %c to read individual characters.
Another approach typically used (as long as you know the file isn't being updated) is to use fseek/ftell to seek to the end of the file, get the position (which will tell you its size), then seek back to the start.
The fastest approach though would be to use stat or fstat to get the file size information from the filesystem.
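The fseek/ftell variant might look like the sketch below (`file_size` is a hypothetical name). Two caveats: the result counts every byte, newlines included, so it is an upper bound on the number of digits rather than the exact count; and the C standard only promises a meaningful byte offset from ftell for streams opened in binary mode.

```c
#include <stdio.h>

/* Hypothetical helper: size of an open file in bytes, or -1 on error.
 * Leaves the stream positioned at the start of the file. */
long file_size(FILE *fp)
{
    if (fseek(fp, 0L, SEEK_END) != 0)
        return -1;
    long size = ftell(fp);          /* offset of the end equals the size */
    if (fseek(fp, 0L, SEEK_SET) != 0)
        return -1;
    return size;
}
```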
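If you want the number of digits, then you have to do it char by char, e.g.:
int c;
while ((c = fgetc(fp)) != EOF)     /* fp is the FILE * returned by fopen */
    if (isdigit(c)) count++;       /* isdigit is declared in <ctype.h> */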
Look up fgetc, getc and scanf in the man pages; you don't seem to understand what's going on in your code.
In C, a local (automatic) variable that is not explicitly initialized has an indeterminate value, which in practice is usually garbage. Your count variable is not initialized, so it will most likely hold some huge value like 1243435. Try int count = 0.
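Putting the advice together, a two-pass sketch could count the digits first and then fill an array of exactly that size. `read_digits` is a hypothetical name, and the code assumes the stream can be rewound (a regular file, not a pipe).

```c
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read every decimal digit in fp into a newly
 * allocated array. Stores the number of digits in *count and returns the
 * array (to be freed by the caller), or NULL if allocation fails. */
int *read_digits(FILE *fp, size_t *count)
{
    size_t n = 0;
    int c;

    while ((c = fgetc(fp)) != EOF)  /* first pass: count digits only */
        if (isdigit(c))
            n++;

    int *digits = malloc((n ? n : 1) * sizeof *digits);
    if (digits == NULL)
        return NULL;

    rewind(fp);                     /* second pass: store the digits */
    size_t i = 0;
    while (i < n && (c = fgetc(fp)) != EOF)
        if (isdigit(c))
            digits[i++] = c - '0';

    *count = n;
    return digits;
}
```

Newlines and any other non-digit characters are skipped in both passes, so the line breaks in test.txt do not affect the count.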

Resources