Should errno/perror methodology be used today to detect errors?

Should errno/perror methodology be used today to detect errors? - c

I know many questions have been asked previously about error handling in C but this is specifically about errno stuff.
I want to ask whether we should use the errno/perror functionality to handle errors gracefully at runtime.I am asking this because MSVC uses it and Win32 api also uses it heavily.I don't know anything about gcc or 'linux api'.Today both gcc and MSVC say that errno/perror can be used safely in a multithreaded environment.So what's your view?
thanks.

Note that using errno alone is a bad idea: standard library functions invoke other standard library functions to do their work. If one of the called functions fails, errno will be set to indicate the cause of the error, and the library function might still succeed, if it has been programmed in a manner that it can fall back to other mechanisms.
Consider malloc(3) -- it might be programmed to try mmap(.., MAP_PRIVATE|MAP_ANONYMOUS) as a first attempt, and if that fails fall back to sbrk(2) to allocate memory. Or consider execvp(3) -- it may probe a dozen directories when attempting to execute a program, and many of them might fail first. The 'local failure' doesn't mean a larger failure. And the function you called won't set errno back to 0 before returning to you -- it might have a legitimate but irrelevant value left over from earlier.
You cannot simply check the value of errno to see if you have encountered an error. errno only makes sense if the standard library function involved also returned an error return. (Such as NULL from getcwd(3) or -1 from read(2), or "a negative value" from printf(3).)
But in the cases when standard library functions do fail, errno is the only way to discover why they failed. When other library functions (not supplied by the standard libraries) fail, they might use errno or they might provide similar but different tools (see e.g. ERR_print_errors(3ssl) or gai_strerror(3).) You'll have to check the documentation of the libraries you're using for full details.

I don't know if it is really a question of "should" but if you are programming in C and using the low level C/posix API, there really is no other option. Of course you can wrap it up if this offends your stylistic sensibilities, but under the hood that is how it has to work (at least as long as POSIX is a standard).

In Linux, errno is safe to read/write in multiple thread or process, but not with perror(). It's a standard library that not re-entrant.

Related

Why does system() exist?

Many papers and such mention that calls to 'system()' are unsafe and unportable. I do not dispute their arguments.
I have noticed, though, that many Unix utilities have a C library equivalent. If not, the source is available for a wide variety of these tools.
While many papers and such recommend against goto, there are those who can make an argument for its use, and there are simple reasons why it's in C at all.
So, why do we need system()? How much existing code relies on it that can't easily be changed?

sarcastic answer Because if it didn't exist people would ask why that functionality didn't exist...
better answer
Many of the system functionality is not part of the 'C' standard but are part of say the Linux spec and Windows most likely has some equivalent. So if you're writing an app that will only be used on Linux environments then using these functions is not an issue, and as such is actually useful. If you're writing an application that can run on both Linux and Windows (or others) these calls become problematic because they may not be portable between system. The key (imo) is that you are simply aware of the issues/concerns and program accordingly (e.g. use appropriate #ifdef's to protect the code etc...)

The closest thing to an official "why" answer you're likely to find is the C89 Rationale. 4.10.4.5 The system function reads:
The system function allows a program to suspend its execution temporarily in order to run another program to completion.
Information may be passed to the called program in three ways: through command-line argument strings, through the environment, and (most portably) through data files. Before calling the system function, the calling program should close all such data files.
Information may be returned from the called program in two ways: through the implementation-defined return value (in many implementations, the termination status code which is the argument to the exit function is returned by the implementation to the caller as the value returned by the system function), and (most portably) through data files.
If the environment is interactive, information may also be exchanged with users of interactive devices.
Some implementations offer built-in programs called "commands" (for example, date) which may provide useful information to an application program via the system function. The Standard does not attempt to characterize such commands, and their use is not portable.
On the other hand, the use of the system function is portable, provided the implementation supports the capability. The Standard permits the application to ascertain this by calling the system function with a null pointer argument. Whether more levels of nesting are supported can also be ascertained this way; assuming more than one such level is obviously dangerous.
Aside from that, I would say mainly for historical reasons. In the early days of Unix and C, system was a convenient library function that fulfilled a need that several interactive programs needed: as mentioned above, "suspend[ing] its execution temporarily in order to run another program". It's not well-designed or suitable for any serious tasks (the POSIX requirements for it make it fundamentally non-thread-safe, it doesn't admit asynchronous events to be handled by the calling program while the other program is running, etc.) and its use is error-prone (safe construction of command string is difficult) and non-portable (because the particular form of command strings is implementation-defined, though POSIX defines this for POSIX-conforming implementations).
If C were being designed today, it almost certainly would not include system, and would either leave this type of functionality entirely to the implementation and its library extensions, or would specify something more akin to posix_spawn and related interfaces.

Many interactive applications offer a way for users to execute shell commands. For instance, in vi you can do:
:!ls
and it will execute the ls command. system() is a function they can use to do this, rather than having to write their own fork() and exec() code.
Also, fork() and exec() aren't portable between operating systems; using system() makes code that executes shell commands more portable.

Do errno values differ across *nix systems?

I'm writing a library that emits Linux kernel 4.4 errno values when things go wrong --- these are defined in a header for the program and aren't necessarily the same as the host errno values. (There's a good reason for doing this, and I can't change this part of it.) But I'm guaranteed that the environment it's running on:
can run ELF64 executables
implements the libc interface for all syscalls (i.e. I'm guaranteed that the system has a function named open with the same signature and semantics as open(2)).
I realize that in theory, the C/POSIX standards allow errno values to be whatever the implementer wants them to be, and in theory, I could compile my own kernel with whatever bizarre errno values I want. But then I would never be able to reliably use any binary that I didn't compile myself, so I'm probably going to have a bad time, and I'm not going to be surprised when things break at random.
In practice, can I count on this kind of host having the same errno values as the values defined in the kernel's errno.h? i.e. can I rely on getting sensible behavior from the host's perror if I directly set errno in my library?

Here's a very large list comparing ERRNO values from POSIX with the actual associated messages and numbers on various systems. Some differences between Linux and BSD for instance are obvious in the spreadsheet:
http://www.ioplex.com/~miallen/errcmp.html
So the answer is, maybe in practice it's fairly safe, depending on exactly what code you are looking at? For instance ENOMEM, EACCESS, are the same on all listed platforms here.
But in general no.

The names are reliable, at least the ones which are in Posix. The actual values, not. They certainly differ between Linux and *BSD, for example.
If you translate using the host's errno.h, you will be fine. Anything else is speculative.

Actually it really depends on the error. Below about 35 they are the same, except for EAGAIN which isn't so much changed but split differently. (Who gets the old number? EAGAIN or EDEADLK?)
I can think of two things that would work:
Perhaps you can just return errors that are common to Linux, OSX, and BSD.
You could compile a master include (thanks, #Chris Beck) and make some kind of hash map keyed by printable names, then return the right value at runtime.

Why is 'sys_errlist' deprecated in glibc?

sys_errlist is a handy array which allows getting static errno descriptions. The alternative to it is the strerror_r function, which is available in two confusing incompatible flavors. The GNU version of it returns char *, which would be from that same aforementioned array as long as the error is known, or otherwise from the user-supplied buffer. The standards-compliant version of strerror_r returns an int instead, and always uses the user-supplied buffer. The problem is, those two functions share the same name despite having completely different semantics, so you basically have to perform a fairly complex #ifdef check and write two completely different versions of your code depending on which version you get. In addition to that, both of those functions are worse than sys_errlist, as both require for the caller to provide a "large enough" buffer to hold the description, even though the GNU version would rarely use it, and neither function allows to know just how large the buffer should really be. If instead you choose to use sys_errlist instead, you can simply check whether value >= sys_nerr and allocate the buffer only in that case, just to put the Unknown error %d there via snprintf, and be done.
Given that strerror_r is a horrible, incomprehensible and inefficient mess, why did GNU developers mark sys_errlist as deprecated, effectively forcing one to either use strerrror_r or to observe the ugly warning each time the code is compiled?

strerror and its relative are localized. The usefulness of a non-localized system message can be debated, but glibc's maintainers went with the prevailing direction (Solaris and other systems).
However: sys_errlist has been deprecated for quite a while. It is not a POSIX interface. Some systems do not have it.
Further reading:
Where can I find the contents of sys_errlist?
Use strerror() or strerror_r() instead of sys_errlist and sys_nerr (fish-shell bug #1830)
RE: sys_errlist (Cygwin in 1999)
GNU Hurd/ hurd/ porting/ guidelines
It's been a while since this was an issue, but it used to be the case that some systems did not have strerror (see Unix Incompatibility Notes:
String and Memory Functions).

get_nprocs() and get_nprocs_conf() -- How to do error checking

I am using get_nproc() and get_nprocs_conf() top get number of online and all processor cores present in my machine.
How do I check for errors with these functions. Any specific error values?
Do they even notify of errors ? not sure.
I would really like to check for error with the call as my program would depend on the returned values a lot.
FYI - As these functions are available from GNU library I prefer these over
sysconf (_SC_NPROCESSORS_ONLN) and sysconf (_SC_NPROCESSORS_CONF)
so basically I want to avoid including extra file
Also, I see that these are declared in -- sys/sysinfo.h but wasn't able to find the definition. Any idea on where can I get that ?

get_nprocs and get_nprocs_conf are GNU extensions which are not documented to return errors. These functions are very unlikely to fail because they parse interfaces provided by the kernel in /sys or /proc. Still, failures can happen, either due to a kernel misconfiguration, a bug in the parser, or (most likely) a lack of resources causing open() to fail. In that case, the current implementation of both functions returns 1 without setting an error flag.
In other words, you are expected to use the return value as if the functions cannot fail. Since the fallback value returned in the unlikely case of error is fairly reasonable, doing just that does not seem like it will cause problems.

Here's a copy of the appropriate manual page: http://www.unix.com/man-page/linux/3/get_nprocs/
No error indicators are documented, though it follows from the function descriptions that if either function ever returns a value less than 1 then the function must have failed (else the function could not have run).

Error Code Handling

Are there any standards for error code handling? What I mean by this is not how a particular error event is handled but what is actually returned in the process. An example of this would be when an error in opening a file is identified. While the open() function may return its own value, the function that called the open() function may return a different value.

I don't think ther's a standard, all errors must be detected and handled (the caller should always handle errors).
in Unix in general:
the standard C library for exemple always return -1 on fail and set the global variable errno to the correct value.
Some libraries for example return NULL for inexistant field rather than aborting.
You should always return as much useful information as possible.
Hope this help.
Regards.

It sounds entirely context dependent to me. In some cases it's even advisable to just abort() the whole process. The failing function is called from a program or library using its own coding standards, you should probably adhere to that.