Why are system calls in C always error-checked?

Obviously, it's good practice. That goes without saying. I see it every time in example code (like socket(), fork(), or malloc(), to name a few). I know to do it, I just don't understand the why of it so much. Are they prone to failing often? Is it because system calls are made in kernel mode? What's the reasoning behind it?

I presume you are asking why code that calls these routines checks the results to determine whether an error occurred.
Each of the routines you cite, socket, fork, and malloc, requires resources. Those resources may be unavailable either because the calling process has exceeded limits set by the system administrator or the user or because the system has exhausted the resources it has and cannot provide any more to processes. Therefore, it is possible, even if not frequent, that a call to one of these routines will return failure. So a calling process should check for failure.
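For illustration, a minimal sketch of such checks (the spawn_worker name and the allocation size are invented for the example):

#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/* Illustrative sketch: resource-allocating calls can fail, so check the
   result and inspect errno before relying on what they return. */
int spawn_worker(void)
{
    char *buf = malloc(1 << 20);        /* may fail with ENOMEM */
    if (buf == NULL) {
        fprintf(stderr, "malloc: %s\n", strerror(errno));
        return -1;
    }

    pid_t pid = fork();                 /* may fail with EAGAIN or ENOMEM */
    if (pid == -1) {
        fprintf(stderr, "fork: %s\n", strerror(errno));
        free(buf);
        return -1;
    }
    if (pid == 0) {
        /* child: do work with buf here */
        _exit(0);
    }
    free(buf);
    return 0;
}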
Additionally, in some implementations, some system routines (such as read and write) can be interrupted if a signal is delivered to the process before the operation has completed. (When a signal arrives, it is considered important, and it is desirable to deliver it to the process immediately rather than wait for a potentially long operation to complete. So the operation is interrupted, the signal is delivered, and the process may handle the signal and return from the signal handler. Then control is returned to the code that called the original routine, and that code must be informed that the operation was interrupted.) This interruption results in returning failure with an error status indicating the operation was interrupted.
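A minimal sketch of coping with that interruption, assuming the caller simply wants to retry (the read_retry name is invented for the example):

#include <errno.h>
#include <unistd.h>

/* Illustrative sketch: if read() is interrupted by a signal before any data
   has been transferred, it returns -1 with errno == EINTR, and the caller
   can simply retry the call. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    return n;
}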

Always, if only...
Way back when, when a C function could only return an integer and exceptions were science fiction, they came up with the idea of returning either success or a code that provided a clue as to what had gone wrong. It became a convention.
It depends on what you call a failure.
Something like opening a file (given the developer can be bothered) is relatively easy to deal with: "file not found", for instance. With malloc, it is a bit more difficult to take any remedial action.
The key point, though, is to check as near to the error as possible. If you don't, you find out that the file you wanted to open and append to didn't exist 10,000 lines of code later, when you try to write the results of your extensive computation to it and get, say, an access violation.
Basically, this stuff is the reason exceptions were invented. Checking a return value is "optional"; swallowing an exception is explicit.

example:
#include <stdio.h>

FILE *fp;
fp = fopen("c:\\removedDirectory\\nonexistingFile.txt", "r"); // returns NULL: the path does not exist
if (fp != NULL)
{
    // safe to use fp here; this block never runs when fopen failed
}
If you do not check the result of fopen (substitute any function that returns an error indication) and fp is NULL, the subsequent functions that depend on a valid file stream will not work.

Related

What to do if a posix close call fails?

On my system (Ubuntu Linux, glibc), the man page for the close call specifies several error values it can return. It also says
Not checking the return value of close() is a common but nevertheless serious programming error.
and at the same time
Note that the return value should only be used for diagnostics. In particular close() should not be retried after an EINTR since this may cause a reused descriptor from another thread to be closed.
So I am not allowed to ignore the return value nor to retry the call.
Given that, how shall I handle the close() call failure?
If the error happened when I was writing something to the file, I am probably supposed to try to write the information somewhere else to avoid the data loss.
If I was only reading the file, can I just log the failure and continue the program, pretending nothing happened? Are there any caveats, such as a leak of file descriptors or anything else?
In practice, close should never be retried on error, and the fd you passed to close is always invalid (closed) after close returns, regardless of whether an error occurred. In some cases, an error may indicate that data was lost (certain NFS setups) or unusual hardware conditions for devices (e.g. tape could not be rewound), so you may want to be cautious to avoid data loss, but you should never attempt to close the fd again.
In theory, POSIX was unclear in the past as to whether the fd remains open when close fails with EINTR, and systems disagreed. Since it's important to know the state (otherwise you have either fd leaks or double-close bugs which are extremely dangerous in multithreaded programs), the resolution to Austin Group issue #529 specified the behavior strictly for future versions of POSIX, that EINTR means the fd remains open. This is the right behavior consistent with the definition of EINTR elsewhere, but Linux refuses to accept it. (FWIW there's an easy workaround for this that's possible at the libc syscall wrapper level; see glibc PR #14627.) Fortunately it never arises in practice anyway.
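As a sketch of that advice in code (the close_once name is invented for the example): report the error for diagnostics, but never call close() on the fd again.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Illustrative sketch: after close() returns, the descriptor is gone whether
   or not an error was reported, so it is never closed a second time.  EINTR
   is treated as success; anything else is logged as a possible data-loss hint. */
int close_once(int fd)
{
    if (close(fd) == -1 && errno != EINTR) {
        fprintf(stderr, "close(%d): %s\n", fd, strerror(errno));
        return -1;
    }
    return 0;
}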
Some related questions you might find informative:
What are the reasons to check for error on close()?
Trying to make close sleep on Linux
First of all: EINTR means exactly that: the system call was interrupted. If this happens on a close() call, there is exactly nothing you can do.
Apart from perhaps keeping track of the fact that, if the fd belonged to a file, this file is possibly corrupt, there is not much you can do about errors on close() at all, depending on the return value. AFAIK the only case where a close can be retried is EBUSY, but I have yet to see that.
So:
Not checking the result of close() might mean that you miss file corruption, especially truncation.
Depending on the error, most of the time you can do nothing - a failed close() just means something has gone awfully wrong outside the scope of your application.
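One way to surface such problems earlier, as a sketch (assuming the data matters enough to pay for the extra flush): call fsync() before close(), so deferred write errors show up while you can still react to them rather than only in close()'s return value.

#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Illustrative sketch: fsync() forces buffered writes out to the device, so
   errors such as ENOSPC or an NFS write failure are reported here instead of
   being folded into (or lost after) the final close(). */
int finish_file(int fd, const char *path)
{
    int rc = 0;
    if (fsync(fd) == -1) {
        fprintf(stderr, "fsync %s: %s\n", path, strerror(errno));
        rc = -1;
    }
    if (close(fd) == -1) {
        fprintf(stderr, "close %s: %s\n", path, strerror(errno));
        rc = -1;
    }
    return rc;
}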

What happens to a process when the filesystem is full

What happens to a process if the filesystem is full? Does the kernel send us a signal to shut down, and if so, what signal is it? Obviously, a program will probably crash if it writes to the filesystem, but I'm curious as to how this occurs (in gory kernel/operating-system detail).
What happens to a process if the filesystem fills up?
Operations that would require additional disk space on the full partition (like creating or appending to a file) fail with an errno of ENOSPC.
No signal is sent, as a full filesystem is not a critical condition which makes a signal necessary. It's a routine, easily handled error.
There is no reason a program should crash when the filesystem is full. Obviously file writes will fail, but a well-written program should be able to cope with that: in C, this would mean that fopen returns NULL or ferror returns a non-zero value, etc. I have encountered this many times, and some nasty things can happen, such as overwriting a file with a blank version, but never a program crash. If it does happen, it is presumably because the author of the program tried to use a NULL file pointer or hit some similar problem, in which case the program would receive a SIGSEGV as usual.
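As a sketch of what coping with a full filesystem can look like (the append_line name is invented for the example): check the stdio results and inspect errno, which will be ENOSPC when the partition has no room left.

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Illustrative sketch: a failed write on a full filesystem is just an error
   return plus errno == ENOSPC; no signal is involved. */
int append_line(const char *path, const char *line)
{
    FILE *fp = fopen(path, "a");
    if (fp == NULL) {
        fprintf(stderr, "fopen %s: %s\n", path, strerror(errno));
        return -1;
    }
    if (fprintf(fp, "%s\n", line) < 0 || fflush(fp) == EOF) {
        fprintf(stderr, "write %s: %s\n", path, strerror(errno));  /* ENOSPC when full */
        fclose(fp);
        return -1;
    }
    return fclose(fp) == 0 ? 0 : -1;
}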

Why do system calls return EFAULT instead of sending a segfault?

To be clear, this is a design rather than an implementation question
I want to know the rationale behind why POSIX behaves this way. POSIX system calls, when given an invalid memory location, return EFAULT rather than crashing the userspace program (by sending a SIGSEGV), which makes their behavior inconsistent with userspace functions.
Why? Doesn't this just hide memory bugs? Is it a historical mistake or is there a good reason for it?
Because system calls are executed by the kernel, not by the user program --- when the system call occurs, the user process halts and waits for the kernel to finish.
The kernel itself, of course, isn't allowed to seg fault, so it has to manually check all the address areas the user process gives it. If one of these checks fails, the system call fails with EFAULT. So in this situation a segmentation fault hasn't actually happened: it's been avoided by the kernel explicitly checking to make sure all the addresses are valid. Hence it makes sense that no signal is sent.
In addition, if a signal were sent, there'd be no way the kernel could attach a meaningful program counter to the signal, since the user process isn't actually executing while the system call is running. This means there'd be no way for the user process to produce decent diagnostics, restart the failed instruction, etc.
To summarise: mostly historical, but there is actual logic to the reasoning. Like EINTR, this doesn't make it any less irritating to deal with.
Well, what would you want to happen? A system call is a request to the system. If you ask "when does the ferry to München leave?", would you like the program to crash, or to get a return of -1 with errno = ENOHARBOR? If you ask the system to put your car into your handbag, would you like to have your handbag destroyed, or a return of -1 with errno set to EBAGTOOSMALL?
There is a technical detail: arguments have to be converted (copied) between user space and kernel space when entering/leaving the system call. Mostly for security reasons, the system is very reluctant to write into user space. (Linux has copy_to_user and copy_from_user functions for this, which check that the user-space addresses are valid before doing the actual copying.)
Why? Doesn't this just hide memory bugs?
On the contrary. It allows your program to handle the error (which it could not do if it were killed by a signal) and terminate gracefully. But the program must check the return value from system calls and inspect errno. In the case of a SIGSEGV, there is very little for your program to do, so mapping EFAULT to SIGSEGV would be a bad idea.
System calls were designed to always return (or block indefinitely...), whether they succeed or fail.
And a technical aspect could be that segmentation faults, bus errors, floating-point exceptions, and so on are (often) generated by hardware interrupts.
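A small demonstration of the difference, assuming a Linux-like system with a /dev/zero device so the read never blocks: the bad buffer address yields -1 with errno set to EFAULT from the kernel's address check, not a SIGSEGV.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    char c;
    int fd = open("/dev/zero", O_RDONLY);
    if (fd == -1)
        return 1;

    /* The kernel's address check fails, so read() returns -1 with
       errno == EFAULT ("Bad address") instead of the process segfaulting. */
    if (read(fd, (void *)1, 1) == -1)
        printf("bad buffer: %s\n", strerror(errno));

    /* With a valid buffer the same call succeeds. */
    if (read(fd, &c, 1) == 1)
        printf("good buffer: read one byte\n");

    close(fd);
    return 0;
}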

Why is this not a bug in qmail?

I was reading DJB's "Some thoughts on security after ten years of Qmail 1.0" and he listed this function for moving a file descriptor:
int fd_move(to,from)
int to;
int from;
{
  if (to == from) return 0;
  if (fd_copy(to,from) == -1) return -1;
  close(from);
  return 0;
}
It occurred to me that this code does not check the return value of close, so I read the man page for close(2), and it seems it can fail with EINTR, in which case the appropriate behavior would seem to be to call close again with the same argument.
Since this code was written by someone with far more experience than I in both C and UNIX, and additionally has stood unchanged in qmail for over a decade, I assume there must be some nuance that I'm missing that makes this code correct. Can anyone explain that nuance to me?
I've got two answers:
He was trying to make a point about factoring out common code and often such examples omit error checking for brevity and clarity.
close(2) may return EINTR, but does it in practice? And if so, what would you reasonably do? Retry once? Retry until success? What if you get EIO? That could mean almost anything, so you really have no reasonable recourse except logging it and moving on. If you retry after an EIO, you might get EBADF; then what? Assume that the descriptor is closed and move on?
Every system call can return EINTR, especially one that blocks, like read(2) waiting on a slow human. This is a more likely scenario, and a good "get input from the terminal" routine will indeed check for this. That also means that write(2) can fail, even when writing a log file. Do you try to log the error that the logger generated, or should you just give up?
When a file descriptor is dup'd, as it is in the fd_copy or dup2 function, you will end up with more than one file descriptor referring to the same thing (i.e. the same struct file in the kernel). Closing one of them will simply decrement its reference count. No operation is performed on the underlying object unless it is the last close. As a result, conditions such as EINTR and EIO are not possible.
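For reference, a minimal sketch of what an fd_copy along these lines could look like in terms of dup2 (not necessarily qmail's actual implementation): after it succeeds, both descriptors refer to the same open file description, which is why the final close only drops a reference.

#include <unistd.h>

/* Illustrative sketch: dup2() makes `to` refer to the same open file
   description as `from`, so a later close(from) merely decrements the
   kernel's reference count on that object. */
int fd_copy(int to, int from)
{
    if (to == from) return 0;
    if (dup2(from, to) == -1) return -1;
    return 0;
}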
Another possibility is that his function is used only in an application (or a part of one) which has done something to ensure that the call will not be interrupted by a signal. If you aren't going to do anything important with signals, then you don't have to be responsive to them, and it might make sense to mask them all out rather than wrap every single blocking system call in an EINTR retry. Except, of course, the ones that will kill you anyway: SIGKILL, and frequently SIGPIPE if you handle it by quitting, along with SIGSEGV and similar fatal errors, which will in any case never be delivered to a correct user-space app.
Anyway, if all he's talking about is security, then quite possibly he doesn't have to retry close. If close failed with EIO, then he would not be able to retry it; it would be a permanent failure. Therefore, it is not necessary for the correctness of his program that close succeed. It may well be that it is not necessary for the correctness of his program that close be retried on EINTR, either.
Usually you want your program to make a best effort to succeed, and that means retrying on EINTR. But this is a separate concern from security. If your program is designed so that some function failing for any reason isn't a security flaw, then in particular the fact that it happens to have failed EINTR, rather than for a permanent reason, isn't a flaw. DJB has been known to be fairly opinionated, so I would not be at all surprised if he has proved some reason why he doesn't need to retry, and therefore doesn't bother, even if doing so would allow his program to succeed in flushing the handle in certain situations where maybe it currently fails (like being explicitly sent a harmless signal with kill by the user at a crucial moment).
Edit: it occurs to me that retrying on EINTR could potentially itself be a security flaw under certain conditions. It introduces a new behaviour to that section of code: it can loop indefinitely in response to a signal flood, where previously it would make one attempt to close and then return. I don't know for sure that this would cause qmail any problems (after all, close itself makes no guarantees how soon it will return). But if giving up after one attempt does make the code easier to analyse then it could plausibly be a smart move. Or not.
You might think that retrying prevents a DoS flaw, where a signal causes a spurious failure. But retrying allows another (more difficult) DoS flaw, where a signal flood causes an indefinite stall. In terms of binary "can this app be DoSed?", which is the kind of absolute security question DJB was interested in when he wrote qmail and djbdns, it makes no difference. If something can happen once, then normally that means it can happen many times.
Only broken unices ever return EINTR without you explicitly asking for it. The sane semantics for signal() enable restartable system calls ("BSD style"). When building a program on a system with the sysv semantics (interrupting signals), you should always replace calls to signal() with calls to bsd_signal(), which you can define yourself in terms of sigaction() if it doesn't exist.
It's further worth noting that no systems will return EINTR on signal receipt unless you have installed signal handlers. If the default action is left in place, or if the signal is set to no action, it's impossible for system calls to be interrupted.
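If you do find yourself on a system with the SysV semantics, a minimal sketch of such a replacement built on sigaction might look like this (the my_bsd_signal name is invented to avoid clashing with any bsd_signal the libc already provides):

#include <signal.h>

typedef void (*sig_handler_fn)(int);

/* Illustrative sketch: SA_RESTART gives the BSD "restartable system call"
   semantics, so blocking calls such as read() are resumed after the handler
   returns instead of failing with EINTR. */
sig_handler_fn my_bsd_signal(int signum, sig_handler_fn handler)
{
    struct sigaction act, oldact;

    act.sa_handler = handler;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART;

    if (sigaction(signum, &act, &oldact) == -1)
        return SIG_ERR;
    return oldact.sa_handler;
}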

What does select(2) do if you close(2) a file descriptor in a separate thread?

What is the behavior of the select(2) function when a file descriptor it is watching for reading is closed by another thread?
From some cursory testing, it does return right away. I suspect the outcome is either that (a) it still continues to wait for data, but if you actually tried to read from it you'd get EBADF (possibly -- there's a potential race) or (b) that it pretends as though the file descriptor were never passed in. If the latter case is true, passing in a single fd with no timeout would cause a deadlock if it were closed.
From some additional investigation, it appears that both dwc and bothie are right.
bothie's answer to the question boils down to: it's undefined behavior. That doesn't mean that it's unpredictable necessarily, but that different OSes do it differently. It would appear that systems like Solaris and HP-UX return from select(2) in this case, but Linux does not based on this post to the linux-kernel mailing list from 2001.
The argument on the linux-kernel mailing list is essentially that it is undefined (and broken) behavior to rely upon. In Linux's case, calling close(2) on the file descriptor effectively decrements a reference count on it. Since the select(2) call also holds a reference to it, the fd will remain open and waiting for input until the select(2) returns. This is basically dwc's answer. You will get an event on the file descriptor and then it'll be closed. Trying to read from it will result in EBADF, assuming the fd hasn't been recycled. (A concern that MarkR raised in his answer, although I think it's probably avoidable in most cases with proper synchronization.)
So thank you all for the help.
I would expect that it would behave as if the end-of-file had been reached, that's to say, it would return with the file descriptor shown as ready but any attempt to read it subsequently would return "bad file descriptor".
Having said that, doing that is very bad practice anyway, as you'd always have potential race conditions: another file descriptor with the same number could be opened by yet another thread immediately after the second thread closed it, and then the selecting thread would end up waiting on the wrong one.
As soon as you close a file, its number becomes available for reuse, and may get reused by the next call to open(), socket() etc, even if by another thread. Therefore you really, really need to avoid this kind of thing.
The select system call is a way to wait for file descriptors to change state while the program doesn't have anything else to do. The main use is in server applications, which open a bunch of file descriptors and then wait for anything to do on them (accept new connections, read requests, or send responses). Those file descriptors will be opened in non-blocking I/O mode so that the server process won't hang in a syscall at any time.
This also means there is no need for separate threads, because all the work that could be done in a thread can be done prior to the select call as well. And if the work takes long, it can be interrupted: select is called with timeout={0,0}, the file descriptors are handled, and afterwards the work is resumed.
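A minimal sketch of that polling pattern (listen_fd is assumed to be an already-created, non-blocking listening socket):

#include <sys/select.h>

/* Illustrative sketch: do the pending work elsewhere, then poll the
   descriptors with a zero timeout so this call never blocks. */
void serve_step(int listen_fd)
{
    fd_set readfds;
    struct timeval tv = {0, 0};     /* timeout={0,0}: return immediately */

    FD_ZERO(&readfds);
    FD_SET(listen_fd, &readfds);

    if (select(listen_fd + 1, &readfds, NULL, NULL, &tv) > 0
        && FD_ISSET(listen_fd, &readfds)) {
        /* accept the new connection / read the request here */
    }
    /* otherwise: nothing ready yet, resume the long-running work */
}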
Now, you close a file descriptor in another thread. Why do you have that extra thread at all, and why shall it close the file descriptor?
The POSIX standard doesn't provide any hints, what happens in this case, so what you're doing is UNDEFINED BEHAVIOR. Expect that the result will be very different between different operating systems and even between version of the same OS.
Regards, Bodo
It's a little confusing what you're asking...
Select() should return upon an "interesting" change. If the close() merely decremented the reference count and the file was still open for writing somewhere, then there's no reason for select() to wake up.
If the other thread did close() on the only open descriptor then it gets more interesting, but I'd need to see a simple version of the code to see if something's really wrong.
