Catch errors on stdin, stdout, stderr - C

I was writing a bit of C99 code that reads from stdin:
// [...]
fgets(buf, sizeof(buf), stdin);
// [...]
But I am wondering if I should catch errors in this case, since the shell could redirect stdin to something less robust than a plain terminal. That would also mean that every access to stdin, stdout and stderr has to be checked for errors, and I seldom see any checks after printf and friends.
So, is it recommended to check every stream access for errors?
The above example would then be something like:
// [...]
if (!fgets(buf, sizeof(buf), stdin) && ferror(stdin)) {
    exit(EXIT_FAILURE);
}
// [...]
Thanks in advance!

You always have to check the return value from fgets(), every time you use it. If you don't, you have no idea whether there is useful data in the buffer; it could hold the last line a second time. Similar comments apply to every read operation; you must check whether the read operation returned what you expected.
if (fgets(buf, sizeof(buf), stdin) == 0)
    ...EOF, or some error...
In the handling code, you need to decide what to do. You can legitimately use feof() and ferror() in that code. The correct reaction to the problem depends on your code. Detecting EOF is usually a cause for exiting the loop or exiting the function (break or return, but you only return if the function did not open the file; otherwise you have to close the file at least). Detecting errors on stdin is going to be a rare occurrence; you will have to decide what's appropriate to do.
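For instance, a minimal sketch of a read loop along those lines (the error message wording is mine):

#include <stdio.h>
#include <stdlib.h>

/* Sketch: read lines until fgets() fails, then decide whether that
   was plain EOF (normal) or a genuine stream error. */
int main(void)
{
    char buf[256];

    while (fgets(buf, sizeof(buf), stdin) != NULL) {
        /* ... process the line in buf ... */
        fputs(buf, stdout);
    }

    if (ferror(stdin)) {
        /* A real read error, not just end of input. */
        fprintf(stderr, "read error on stdin\n");
        return EXIT_FAILURE;
    }
    /* Otherwise feof(stdin) is set: the normal way to finish. */
    return EXIT_SUCCESS;
}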
Detecting errors on write to stderr or stdout is something that's less often done, but it is arguably sloppy programming to omit them. One issue, especially if it is stderr that has the problem, is "how are you going to report the error?" You might need to use syslog() instead, but that's the sort of issue you have to think about.
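As a sketch of that fallback idea (POSIX-only, since syslog() is not part of standard C; the report_error name is just for the example):

#include <stdio.h>
#include <syslog.h>   /* POSIX, not standard C */

/* Sketch: report an error on stderr, and if even that fails, fall back
   to syslog().  syslog() returns no status, so it really is the last
   resort. */
static void report_error(const char *msg)
{
    if (fprintf(stderr, "%s\n", msg) < 0 || fflush(stderr) == EOF) {
        syslog(LOG_ERR, "%s (stderr unavailable)", msg);
    }
}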

It depends on the nature of the application you are developing. For example, if you are developing a hard real-time system whose abnormal termination results in severe problems, then you should take precautions to deal with all kinds of data-streaming errors. In such situations use the following code,
if (!fgets(buf, sizeof(buf), stdin) && ferror(stdin)) {
    exit(EXIT_FAILURE);
}
or some construct similar to it. But if rare failures in your application won't have any severe consequences, you don't need to check every data-streaming operation.

Here's a fun exercise. Find someone's interactive program, run it until it asks for input from the terminal, and press Ctrl-D (EOF). Odds are that the author doesn't check for EOF, so the input calls keep returning immediately with nothing in the buffer, which the code interprets as a blank line. If it takes that as invalid input and re-prompts, it ends up in an infinite loop!
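A minimal sketch of a prompt loop that avoids that trap (the "command>" prompt is just illustrative):

#include <stdio.h>

/* Sketch: a prompt loop that cannot spin forever at EOF, because the
   fgets() return value is checked instead of being treated as a blank line. */
int main(void)
{
    char line[128];

    for (;;) {
        fputs("command> ", stdout);
        fflush(stdout);
        if (fgets(line, sizeof(line), stdin) == NULL)
            break;              /* EOF or error: stop prompting */
        /* ... handle the command in line ... */
    }
    return 0;
}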

Related

What happens if you don't check feof in C?

During read/write operations is it absolutely necessary to check feof()?
The reason I ask is because I have a program that performs read/write ONCE, here is the code below:
while (1) {
    data = read_file_data(file);
    write_file_data(data, filename);
    if (feof(file))
        print(read error)
}
This is just pseudocode but is it necessary to check feof() in a case like this where a read will occur once? Currently, I think it is only necessary if you will do ANOTHER read after the one above like this:
while (1) {
    data = read_file_data(file);
    write_file_data(data, filename);
    if (feof(file)) // feof error occurred oops
        print(read error)
    data = read_file_data(file); // reading after error
}
Lastly, what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
During read/write operations is it absolutely necessary to check feof()?
No. During normal operations, the best way of detecting EOF is to check the return value of the particular input call you're using. Input calls can always fail, so their return values should always be checked. This is especially true of scanf and fgets, which many beginning programmers (and unfortunately many beginning programming tutorials) conspicuously neglect to check.
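For example, a short sketch of checking scanf's return value:

#include <stdio.h>

int main(void)
{
    int n;

    /* scanf() reports how many conversions succeeded; EOF means the
       input ended (or failed) before any conversion took place. */
    if (scanf("%d", &n) != 1) {
        fprintf(stderr, "expected an integer\n");
        return 1;
    }
    printf("read %d\n", n);
    return 0;
}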
Explicit calls to feof are rarely necessary. You might need to call feof if:
An input call has just returned 0 or EOF, but you'd like to know whether that was due to an actual end-of-file, or a more serious error.
You've used an input call such as getw() that has no way to indicate EOF or error.
(rare, and arguably poor form) You've made a long series of input calls in a row, and you didn't expect EOF from any of them, so you didn't feel like checking all of them explicitly, but decided to check with one catch-all call to feof() at the end.
The other thing to know is that feof() only tells you that you did hit end-of-file — that is, past tense. It does not predict the future; it does not tell you that the next input call you try to make will hit EOF. It tells you that the previous call you made did hit EOF. See also Why is “while( !feof(file) )” always wrong?
See also how to detect read/write errors when using fread() and fwrite?
what can the consequences be of reading even after an EOF reached error (reading past EOF) occurs?
That's a good question. The answer is, "it depends," and since unpredictability can be a real problem, your best bet is usually not to try to read past EOF.
When I was first learning C, if you got an EOF, but tried reading some more, and if the EOF had somehow "gone away", your next read might succeed. And it could be quite common for the EOF to "go away" like that, if you were reading from the keyboard, and the user indicated EOF by typing control-D. But if they typed more input after typing control-D, you could go ahead and read it.
But that was in the old days. These days EOF is "sticky", and once the end-of-file flag has been set for a stream, I think any future attempts to read are supposed to immediately return EOF. These days, if the user hits control-D and you want to keep reading, you have to call clearerr().
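A sketch of that clearerr() pattern (only sensible for an interactive stream; on a regular file it would spin forever, and the function name is made up for the example):

#include <stdio.h>

/* Sketch: read a line from an interactive stream, treating Ctrl-D as
   something to recover from rather than a final EOF. */
char *read_line_interactive(char *buf, size_t size)
{
    while (fgets(buf, (int)size, stdin) == NULL) {
        if (ferror(stdin))
            return NULL;        /* real error: give up */
        clearerr(stdin);        /* EOF is sticky; clear it and retry */
    }
    return buf;
}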
I'm pretty sure everything I just said about feof() and the per-stream EOF flag is also true of ferror() and the per-stream error flag.
In a comment, #0___________ said that "if you ignore I/O errors and continue you invoke Undefined Behaviour". I don't think this is true, but I don't have my copy of the Standard handy to check.
The reason to call feof is to figure out whether the EOF return from an input function was due to some kind of actual I/O error (ferror() would return a non-zero value then but feof() wouldn't), or due to the input being exhausted (which is not an error, but a condition; feof() would return a non-zero value).
For example if your program is to consume all the input and process it, it might be crucial to be able to distinguish that you actually did read all of the input vs someone removed the USB stick from the drive when you'd read just half of the input.
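For example, a minimal sketch of that "consume it all, then classify" pattern using fread() (the function name is illustrative):

#include <stdio.h>

/* Sketch: consume an entire stream, then decide whether the loop
   ended because the data was exhausted (feof) or because the medium
   failed part-way through (ferror). */
static int consume_all(FILE *fp)
{
    char chunk[4096];
    size_t n;

    while ((n = fread(chunk, 1, sizeof(chunk), fp)) > 0) {
        /* ... process n bytes of chunk ... */
    }
    if (ferror(fp))
        return -1;   /* I/O error: the input is incomplete */
    return 0;        /* feof(fp) is set: all input was read */
}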

Guarantee that getchar receives newline or EOF (eventually)?

I would like to read characters from stdin until one of the following occurs:
an end-of-line marker is encountered (the normal case, in my thinking),
the EOF condition occurs, or
an error occurs.
How can I guarantee that one of the above events will happen eventually? In other words, how do I guarantee that getchar will eventually return either \n or EOF, provided that no error (in terms of ferror(stdin)) occurs?
// (How) can we guarantee that the LABEL'ed statement will be reached?
int c;
int done = 0;
while (!0)
    if ((c = getchar()) == EOF || ferror(stdin) || c == '\n')
        break;
LABEL: done = !0;
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur. It seems like the answer will have to do with the properties of the device. Where can those details be found (in the documentation for the compiler, device firmware, or device hardware perhaps)?
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition. Similarly for files stored on disc / SSD.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I am using C89, but I am curious if the answer depends on which C standard is used.
You can't.
Let's say I run your program, then I put a weight on my keyboard's "X" key and go on vacation to Hawaii. On the way there, I get struck by lightning and die.
There will never be any input other than 'x'.
Or, I may decide to type the complete story of Moby Dick, without pressing enter. It will probably take a few days. How long should your program wait before it decides that maybe I won't ever finish typing?
What do you want it to do?
Looking at all the discussion in the comments, it seems you are looking in the wrong place:
It is not a matter of keyboard drivers or wrapping stdin.
It is also not a matter of what programming language you are using.
It is a matter of the purpose of the input in your software.
Basically, it is up to you as a programmer to know how much input you want or need, and then decide when to stop reading input, even if valid input is still available.
Note, that not only are there devices that can send input forever without triggering EOF or end of line condition, but there are also programs that will happily read input forever.
This is by design.
Common examples can be found in POSIX style OS (like Linux) command line tools.
Here is a simple example:
cat /dev/urandom | hexdump
This will print random data for as long as your computer is running, or until you hit Ctrl+C.
Though cat will stop when there is nothing more to read (EOF or any read error), it does not expect such an end, so unless there is a bug in the implementation you are using it will happily run forever.
So the real question is:
When does your program need to stop reading characters and why?
If stdin is connected to a device that always delivers some character other than '\n', none of the above conditions will occur.
A device such as /dev/zero, for example. Yes, stdin can be connected to a device that never provides a newline or reaches EOF, and that is not expected ever to report an error condition.
It seems like the answer will have to do with the properties of the device.
Indeed so.
Where can those details be found (in the doumentation for compiler, device firmware, or device hardware perhaps)?
Generally, it's a question of the device driver. And in some cases (such as the /dev/zero example) that's all there is anyway. Generally drivers do things that are sensible for the underlying hardware, but in principle, they don't have to.
In particular, I am interested to know if keyboard input is guaranteed to be terminated by an end-of-line marker or end-of-file condition.
No. Generally speaking, an end-of-line marker is sent by a terminal device if and only if the <Enter> key is pressed. An end-of-file condition might be signaled if the terminal disconnects (but the program continues), or if the user explicitly causes one to be sent (by typing <Ctrl>+<D> on Linux or Mac, for example, or <Ctrl>+<Z> on Windows). Neither of those events need actually happen on any given run of a program, and it is very common for the latter not to happen.
Similarly for files stored on disc / SSD.
You can generally rely on data read from an ordinary file to contain newlines where they are present in the file itself. If the file is open in text mode, then the system-specific text line terminator will also be translated to a newline, if it differs. It is not necessary for a file to contain any of those, so a program reading from a regular file might never see a newline.
You can rely on EOF being signaled when a read is attempted while the file position is at or past the end of the file's data.
Typical use case: user enters text on the keyboard. Program reads first few characters and discards all remaining characters, up to the end-of-line marker or end-of-file (because some buffer is full or after that everything is comments, etc.).
I think you're trying too hard.
Reading to end-of-line might be a reasonable thing to do in some cases. Expecting a newline to eventually be reached is reasonable if the program is intended to support interactive use. But trying to ensure that invalid data cannot be fed to your program is a losing cause. Your objective should be to accept the widest array of inputs you reasonably can, and to fail gracefully when other inputs are presented.
If you need to read input in a line-by-line mode then by all means do that, and document that you do it. If only the first n characters of each line are significant to the program then document that, too. Then, if your program never terminates when a user connects its input to /dev/zero that's on them, not on you.
On the other hand, try to avoid placing arbitrary constraints, especially on sizes of things. If there is not a natural limit on the size of something, then no artificial limit you introduce will ever be enough.
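As a rough sketch of the use case from the question (read the first few characters, discard the rest of the line, stop on newline, EOF, or error; C89-style comments and declarations, function name is mine):

#include <stdio.h>

/* Sketch: keep at most size-1 characters of a line and discard the
   remainder.  Returns 0 normally, -1 on a read error (size must be >= 1). */
static int read_prefix(char *buf, size_t size)
{
    int c;
    size_t i = 0;

    while ((c = getchar()) != EOF && c != '\n') {
        if (i + 1 < size)
            buf[i++] = (char)c;     /* keep what fits, drop the rest */
    }
    buf[i] = '\0';
    return (c == EOF && ferror(stdin)) ? -1 : 0;
}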

Is this the proper way to flush the C input stream?

Well, I've been doing a lot of searching on Google and on here about how to flush the input stream properly. All I hear is mixed arguments about how fflush() is undefined for the input stream; some say just do it that way, and others say don't do it. I haven't had much luck finding a clear, efficient/proper way of doing so that the majority of people agree on. I am quite new at programming, so I don't know all the syntax/tricks of the language yet. So my question is: which way is the most efficient/proper solution for clearing the C input stream?
(A) Use getchar() twice before I try to receive more input?
(B) Just use the fflush() function on the input? or
(C) This is how I thought I should do it:
void clearInputBuf(void);
void clearInputBuf(void)
{
    int garbageCollector;
    while ((garbageCollector = getchar()) != '\n' && garbageCollector != EOF)
    {}
}
So, whenever I need to read a new scanf(), or use getchar() to pause the program I just call the clearInputBuf.. So what would be the best way out of the three solutions or is there a even better option?
All I hear is mixed arguments about how fflush() is undefined for the input stream
That's correct. Don't use fflush() to flush an input stream.
(A) will work for simple cases where you leave a single character in the input stream (such as scanf() leaving a newline).
(B) Don't use. It's defined on some platforms, but don't rely on what the standard calls undefined behaviour.
(C) Clearly the best out of the 3 options as it can "flush" any number of characters in the input stream.
But if you read lines (such as using fgets()), you'll probably have much less need to clear input streams.
It depends on what you think of as "flushing an input stream".
For output streams, the flush operation makes sure that all data that were written to the stream but were being kept buffered in memory have been flushed to the underlying filesystem file. That's a very well defined operation.
For input streams, there is no well defined operation of what flushing the stream should do. I would say that it does not make any sense.
Some implementations of the C standard library redefine the meaning of "flush" for input streams to mean "clear the rest of the current line which has not been read yet". But that's entirely arbitrary, and other implementations choose to do nothing instead.
As of C11 this disparity has been corrected, and the standard now explicitly states that the fflush() function does not work with input streams, precisely because it makes no sense, and we do not want each runtime library vendor to go implementing it in whatever way they feel like.
So, please by all means do go ahead and implement your clearInputBuf() function the way you did, but do not think of it as "flushing the input stream". There is no such thing.
It turns out to be platform dependent.
fflush() cannot take an input stream as a parameter because, according to the C standard, IT'S UNDEFINED BEHAVIOR; the behavior is simply not defined anywhere.
On Windows, there is a defined behavior for fflush() and it does what you need it to do.
On Linux, there is fpurge(3) which does what you want it to do too.
The best way is to simply read all characters in a loop until
A newline character is found.
EOF is returned from getchar().
like your clearInputBuf() function.
Note that flushing an output stream means writing all unwritten data to the stream, the data that is still in a buffer waiting to be flushed. But reading all the unread bytes from a stream, does not have the same meaning.
That's why it doesn't make sense to fflush() an input stream. On the other hand, fpurge() is designed specifically for this, and its name is a better choice because you want to clear the input stream and start fresh. The problem is, it's not a standard function.
Reading fpurge(3) should clarify why fflush(stdin) is undefined behavior, and why an implementation like the one on Windows doesn't make sense, because it makes fflush() behave differently with different inputs. That's like making C compliant with PHP.
The problem is more subtle than it looks:
On systems with elaborate terminal devices, such as Unix and OS X, input from the terminal is buffered at 2 separate levels: the system terminal uses a buffer to handle line editing, from just correcting input with backspace to full line editing with cursor and control keys. This is called cooked mode. A full line of input is buffered in the system until the enter key is typed or the end-of-file key combination is entered.
The FILE functions perform their own buffering, which is line buffered by default for streams associated with a terminal. The buffer size is set to BUFSIZ by default and bytes are requested from the system when the buffered contents have been consumed. For most requests, a full line will be read from the system into the stream buffer, but in some cases, such as when the buffer is full, only part of the line will have been read from the system when scanf() returns. This is why discarding the contents of the stream buffer might not always suffice.
Flushing the input buffer may mean different things:
discarding extra input, including the newline character, that have been entered by the user in response to input requests such as getchar(), fgets() or scanf(). The need for flushing this input is especially obvious in the case of scanf() because most format lines will not cause the newline to be consumed.
discarding any pending input and waiting for the user to hit a key.
You can implement a flush function portably for the first case:
int flush_stream(FILE *fp) {
    int c;
    while ((c = getc(fp)) != EOF && c != '\n')
        continue;
    return c;
}
And this is exactly what your clearInputBuf() function does for stdin.
For the second case, there is no portable solution, and system-specific methods are non-trivial.
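For what it's worth, on POSIX systems the usual system-specific tool for that second case is tcflush(); a minimal sketch, not portable to Windows, and note that it only discards input queued in the terminal driver, not data already sitting in stdio's buffer (which still needs the portable loop above):

#include <termios.h>   /* POSIX only */
#include <unistd.h>

/* Sketch: discard any input already queued in the terminal driver,
   without waiting for the user to type anything. */
static void discard_pending_terminal_input(void)
{
    if (isatty(STDIN_FILENO))
        tcflush(STDIN_FILENO, TCIFLUSH);
}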

Error checking fprintf when printing to stderr

According to the docs, fprintf can fail and will return a negative number on failure. There are clearly many situations where it would be useful to check this value.
However, I usually use fprintf to print error messages to stderr. My code will usually look something like this:
rc = foo();
if (rc) {
    fprintf(stderr, "An error occurred\n");
    // Sometimes stuff will need to be cleaned up here
    return 1;
}
In these cases, is it still possible for fprintf to fail? If so, is there anything that can be done to display the error message somehow or is there is a more reliable alternative to fprintf?
If not, is there any need to check fprintf when it is used in this way?
The C standard says that the file streams stdin, stdout, and stderr shall be connected somewhere, but it doesn't specify where, of course.
C11 §7.21.3 Files ¶7:
At program startup, three text streams are predefined and need not be opened explicitly -- standard input (for reading conventional input), standard output (for writing conventional output), and standard error (for writing diagnostic output). As initially opened, the standard error stream is not fully buffered; the standard input and standard output streams are fully buffered if and only if the stream can be determined not to refer to an interactive device.
It is perfectly feasible to run a program with the standard streams redirected:
some_program_of_yours >/dev/null 2>&1 </dev/null
Your writes will succeed - but the information won't go anywhere. A more brutal way of running your program is:
some_program_of_yours >&- 2>&- </dev/null
This time, it has been run without open file streams for stdout and stderr — in contravention of the standard. It is still reading from /dev/null in the example, which means it doesn't get any useful data input from stdin.
Many a program doesn't bother to check that the standard I/O channels are open. Many a program doesn't bother to check that the error message was successfully written. Devising a suitable fallback as outlined by Tim Post and whitey04 isn't always worth the effort. If you run the ls command with its outputs suppressed, it will simply do what it can and exit with a non-zero status:
$ ls; echo $?
gls
0
$ ls >&- 2>&-; echo $?
2
$
(Tested on RHEL Linux.) There really isn't a need for it to do more. On the other hand, if your program is supposed to run in the background and write to a log file, it probably won't write much to stderr, unless it fails to open the log file (or spots an error on the log file).
Note that if you fall back on syslog(3) (or POSIX), you have no way of knowing whether your calls were 'successful' or not; the syslog functions all return no status information. You just have to assume that they were successful. It is your last resort, therefore.
Typically, you'd employ some kind of logging system that could try to handle this for you, or you'll need to duplicate that logic in every area of your code that prints to standard error and exits.
You have some options:
If fprintf fails, try syslog.
If both fail, try creating a 'crash.{pid}.log' file that contains information that you'd want in a bug report. Check for the existence of these files when you start up, as they can tell your program that it crashed previously.
Let net connected users check a configuration option that allows your program to submit an error report.
Incidentally, open(), read() and write() are good friends to have when the fprintf family of functions isn't working.
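A minimal sketch of that raw-write fallback (POSIX calls, not standard C; the function name is just for the example):

#include <string.h>
#include <unistd.h>   /* POSIX write() */

/* Sketch: if the stdio machinery itself is suspect, a raw write() to
   file descriptor 2 sidesteps FILE buffering entirely, and is
   async-signal safe, unlike fprintf(). */
static void raw_stderr(const char *msg)
{
    /* Ignore the results: if even this fails there is nowhere left to report it. */
    (void)write(STDERR_FILENO, msg, strlen(msg));
    (void)write(STDERR_FILENO, "\n", 1);
}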
As whitey04 says, sometimes you just have to give up and do your best to not melt down with fireworks going off. But do try to isolate that kind of logic into a small library.
For instance:
best_effort_logger(LOG_CRIT, "Heap corruption likely, bailing out!");
is much cleaner than a series of if / else if checks at every place things could possibly go wrong.
You could put the error on stdout or somewhere else... At some point you just have to give error reporting a best effort and then give up.
The key is that your app "gracefully" handles it (e.g. the OS doesn't have to kill it for being bad and it tells you why it exited [if it can]).
Yes, of course fprintf to stderr can fail. For instance stderr could be an ordinary file and the disk could run out of space, or it could be a pipe that gets closed by the reader, etc.
Whether you should check an operation for failure depends largely on whether you could achieve better program behavior by checking. In your case, the only conceivable things you could do on failure to print the error message are try to print another one (which will almost surely also fail) or exit the program (which is probably worse than failing to report an error, but perhaps not always).
Some programs that really want to log error messages will set up an alternate stack at program start-up to reserve some amount of memory (see sigaltstack(2)) that can be used by a signal handler (usually SIGSEGV) to report errors. Depending upon the importance of logging your error, you could investigate using alternate stacks to pre-allocate some chunk of memory. It might not be worth it :) but sometimes you'd give anything for some hint of what happened.

forcing fgets to block (i.e. faking an "interactive device")

I have a C application which provides a "shell" for entering commands. I'm trying to write some automated test-code for the application (Using CUnit). The "shell" input is read from stdin like so:
fgets(buf, sizeof(buf), stdin);
I can "write" commands automatically to the application by freopen()'ning stdin and hooking it to an intermediate file. When the application is executed "normally" the fgets() call blocks untill characters are available because it is "an interactive device", but not so on the intermediate file. So how can I fake fgets into thinking the intermediate file is an "interactive device".
The C program is for Windows (XP) compiled using MinGW.
Regards!
fgets is not blocking when you are reading from a file because it reaches the end of the file, which causes EOF to be set on the stream, and thus calls to fgets return immediately. When you are reading from interactive input, EOF is never set, unless you type Ctrl-Z (or Ctrl-D on a UNIX system), of course.
If you really want to use an intermediate file I think you'll need to enhance your shell so that when it hits an EOF it clears and retests it after a suitable wait. A function like this should work I think:-
void waitForEofClear(FILE *f)
{
    while (feof(f)) {
        clearerr(f);
        sleep(1);   /* needs <unistd.h>; on Windows/MinGW use Sleep() from <windows.h> */
    }
}
You could then call this before the fgets:-
waitForEofClear(stdin);
fgets(buf, sizeof(buf), stdin);
Simply using a file is not going to work, as the other answers have indicated. So, you need to decide what you are going to do instead. A FIFO (named pipe) or plain (anonymous) pipe could be used to feed the interactive program under test - or, on Unix, you could use a pseudo-tty. The advantage of all these is that a program blocks when there is no data read, waiting for the next information to arrive, rather than immediately deciding 'no data to read, must be EOF'.
You will then need a semi-intelligent (or even intelligent) program periodically writing data to the channel for the program under test to read. This program will need to know how long to pause between the messages it writes. This might be as simplistic as 'wait one second; write the next line of data'. Or you might do something more complex.
One scheme that I know of has two programs: a capture program to record what a user types and the timing of it (so the 'data' file is structured; it has records consisting of a delay (in seconds and fractions of a second) plus a set of characters to send (count and list of bytes)). This is run to capture what the user types and record it (as well as send the data to the program). There is then a second replay program that reads the file, and interprets the delays and character sequences.
This scheme works adequately if the input sequence is stable; if the same sequence of key strokes is always needed to get the required result. If the data sent to the program needs to adapt to what the program under test is doing and its responses, and may do different things at different times, then you are probably better off going with 'expect'. This has the capacity to do whatever you need - at least for non-GUI programs.
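A very rough sketch of the replay side, assuming a simplified, hypothetical text record format (one "<delay in milliseconds><TAB><text to send>" line per record) rather than the exact binary format described above; usleep() is POSIX:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>   /* usleep(); POSIX */

/* Hypothetical replay loop: sleep for the recorded delay, then forward
   the recorded text to stdout (which would be piped into the program
   under test). */
int main(void)
{
    char line[1024];

    while (fgets(line, sizeof(line), stdin) != NULL) {
        char *sep = strchr(line, '\t');
        if (sep == NULL)
            continue;                      /* malformed record: skip */
        *sep = '\0';
        usleep((useconds_t)(atol(line) * 1000));
        fputs(sep + 1, stdout);
        fflush(stdout);                    /* deliver promptly */
    }
    return 0;
}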
I'm not sure what the Windows equivalent is, but on Linux I would make the intermediate file a FIFO. If I was going to do real, non-trivial autopiloting, I would wrap it in an expect script.
