Why is 'sys_errlist' deprecated in glibc? - c

sys_errlist is a handy array that gives access to static errno descriptions. The alternative is the strerror_r function, which comes in two confusingly incompatible flavors. The GNU version returns char *, which points into that same array as long as the error is known, and otherwise into the user-supplied buffer. The standards-compliant version of strerror_r returns an int instead and always uses the user-supplied buffer. The problem is that the two functions share a name despite having completely different semantics, so you have to perform a fairly complex #ifdef check and write two different versions of your code depending on which one you get. On top of that, both functions are worse than sys_errlist: both require the caller to provide a "large enough" buffer to hold the description, even though the GNU version rarely uses it, and neither offers a way to find out how large the buffer really needs to be. If you use sys_errlist instead, you can simply check whether value >= sys_nerr and allocate a buffer only in that case, just to put Unknown error %d there via snprintf, and be done.
Given that strerror_r is a horrible, incomprehensible and inefficient mess, why did GNU developers mark sys_errlist as deprecated, effectively forcing one to either use strerror_r or to observe the ugly warning each time the code is compiled?

strerror and its relatives are localized; sys_errlist is not. The usefulness of a non-localized system message can be debated, but glibc's maintainers went with the prevailing direction (Solaris and other systems).
However: sys_errlist has been deprecated for quite a while. It is not a POSIX interface. Some systems do not have it.
Further reading:
Where can I find the contents of sys_errlist?
Use strerror() or strerror_r() instead of sys_errlist and sys_nerr (fish-shell bug #1830)
RE: sys_errlist (Cygwin in 1999)
GNU Hurd/ hurd/ porting/ guidelines
It's been a while since this was an issue, but it used to be the case that some systems did not have strerror (see Unix Incompatibility Notes: String and Memory Functions).
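For what it's worth, the #ifdef check the question complains about can be confined to one small wrapper. What follows is only a sketch: portable_strerror is a made-up name, and the preprocessor test assumes glibc's documented rule that the GNU flavor is selected when _GNU_SOURCE is defined or _POSIX_C_SOURCE is below 200112L. Other C libraries are assumed to provide the int-returning flavor.

```c
/* Ask for the POSIX (int-returning) flavor unless the build says otherwise.
 * This must appear before the first #include to have any effect. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif

#include <errno.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns a message for errnum.
 * The result points either into buf or at a static string owned by libc. */
static const char *portable_strerror(int errnum, char *buf, size_t buflen)
{
    if (buflen == 0)
        return "";
    buf[0] = '\0';
#if defined(__GLIBC__) && (defined(_GNU_SOURCE) || !(_POSIX_C_SOURCE >= 200112L))
    /* GNU flavor: returns char *, which may or may not point into buf. */
    return strerror_r(errnum, buf, buflen);
#else
    /* POSIX flavor: returns int and always writes into buf. */
    if (strerror_r(errnum, buf, buflen) != 0 || buf[0] == '\0')
        snprintf(buf, buflen, "Unknown error %d", errnum);
    return buf;
#endif
}
```

Callers pass a scratch buffer and use whatever pointer comes back, for example portable_strerror(errno, buf, sizeof buf), without caring which flavor was compiled in.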

Related

When should Win32/WinAPI types be used vs. Standard C types?

I'm late to the Win32 party and there's a sea of functions such as _tprintf and TEXT(), and there are also libraries like strsafe.h with functions like StringCchCopy(), StringCchLength(), etc. Basically, the Win32 API introduces a bunch of extra functions and types on top of C which can be confusing to a C programmer who hasn't worked with Win32 much. I do not have a problem finding the definitions of these types and functions on MSDN. However, I do have a problem finding guidelines on when and why they should be used.
I have 2 questions:
How important is it to use all of these types and special functions which Microsoft has provided on top of standard C when programming with Win32? Is it considered good practice to do away with all standard C functions and types and use entirely Microsoft wrappers?
Is it okay to mix standard C functions in with these Microsoft types and functions? For example, to use malloc() instead of HeapAlloc(), or to use printf() instead of _tprintf() and etc...?
I have a copy of Charles Petzold's Programming Windows Fifth Edition book but it mostly covers GUI stuff and not a lot of the remainder of the API.
There are actually 3 questions here, the ones you explicitly asked, and the one you didn't. Let's get that last one out of the way first, as it tends to cause the most confusion:
What are those _t-extensions offered by the Microsoft-provided CRT?
They are generic-text mappings, introduced to make it possible to write code that targets both ANSI-based systems (Win9x) as well as Unicode-based systems (Windows NT). They are macros that expand to the actual function calls, based on the _UNICODE and _MBCS preprocessor symbols. For example, the symbol _tprintf expands to either printf or wprintf.
Likewise, the Windows API provides both ANSI and Unicode versions of the API calls. They, too, are preprocessor macros that expand to the actual API call, depending on the preprocessor symbol UNICODE. For example, the CreateFile symbol expands to CreateFileA or CreateFileW.
Generic-text mappings haven't been useful in the past two decades. Today, simply use the Unicode versions of the CRT and API calls (e.g. wprintf and CreateFileW). You can define _UNICODE and UNICODE for good measure, too, so that you don't accidentally call an ANSI version.
there are also libraries like strsafe.h which have such functions like StringCchCopy(), StringCchLength()
Those are safe variants of the CRT string manipulation calls. They are safer than e.g. strcpy by providing the buffer size of the destination, similar to strncpy. The latter, however, suffers from an awkward design decision, that causes the destination buffer to not get zero-terminated, in case the source won't fit. StringCchCopy will always zero-terminate the destination buffer, and thus provides additional safety over the CRT implementations. (Note: C11 introduces safe variants, e.g. strncpy_s, that will always zero-terminate the destination array, in case the input is valid. They also validate the input, calling the currently installed constraint handler when validation fails, thus providing even stronger safety than the strsafe.h implementations. The bounds-checked implementations are a conditional feature of C11.)
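To make the termination point concrete: strncpy(dst, "abcdef", 4) fills all four bytes with 'a' to 'd' and writes no terminator. Below is a minimal portable sketch of the "always terminate" behaviour described above. copy_terminated is a hypothetical helper written in standard C; it is not the strsafe.h implementation and does not reproduce StringCchCopy's return codes.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies src into dst, truncating if necessary.
 * dst is always zero-terminated when dst_size > 0.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
static int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    size_t len;

    if (dst_size == 0)
        return 0;                      /* nowhere to put even a terminator */

    len = strlen(src);
    if (len >= dst_size) {
        memcpy(dst, src, dst_size - 1);
        dst[dst_size - 1] = '\0';      /* terminate even though src did not fit */
        return 0;
    }
    memcpy(dst, src, len + 1);         /* len + 1 <= dst_size: includes the '\0' */
    return 1;
}
```

The return value matters as much as the terminator: a silently truncated string is still a bug in most programs, just not a memory-safety one.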
How important is it to use all of these types and special functions which Microsoft has provided on top of standard C when programming with Win32? Is it considered good practice to do away with all standard C functions and types and use entirely Microsoft wrappers?
It is not important at all. You can use whichever is more suitable in your scenario. If in doubt, writing portable (i.e. Standard C) code is generally preferable. You only ever want to call the Windows API calls, if you need the additional control they offer (e.g. HeapAlloc allows more control over the allocation than malloc does; likewise CreateFile provides more options than fopen).
Is it okay to mix standard C functions in with these Microsoft types and functions? For example, to use malloc() instead of HeapAlloc(), or to use printf() instead of _tprintf() and etc...?
In general, yes, as long as you match those calls: HeapFree what you HeapAlloc, free what you malloc. You must not mix HeapAlloc and free, for example. In case a Windows API call requires special memory management functions to be used, it is explicitly pointed out in the documentation. For example, if FormatMessage is requested to allocate the buffer to return data, it must be freed using LocalFree. If you do not request the API to allocate a buffer, you can pass in a buffer allocated any way you like (malloc, HeapAlloc, IMalloc::Alloc, etc.).
It is possible to create programs on Windows without using any standard C library functions but most programs do and then you might as well use malloc over HeapAlloc. malloc will use HeapAlloc or VirtualAlloc internally but it is probably tuned for better performance/less fragmentation compared to the raw API. It also makes it easier to port to POSIX in the future. You will still be forced to use LocalFree/GlobalFree/HeapFree in some places where the API allocates memory for you.
Handling text needs special consideration and you need to decide if you need Unicode support or not. A stroll down memory lane might shed some light on why things are the way they are.
Back when Windows 95/98 was king you could use the char/CHAR narrow string types with both the C standard functions and the Windows API. There was virtually no Unicode support except for a handful of functions.
On Windows NT4/2000, however, the native string type is WCHAR (UTF-16 LE, but Microsoft just calls it Unicode). If you are using Microsoft Visual C++ then you have access to wide string versions of the C standard library beyond what the C standard actually requires, to ease coding for this platform. When coding for Windows using the Microsoft toolchain you can assume that the Windows SDK WCHAR type is the same as the wchar_t type defined by C.
Because the development of 95 and NT4 overlapped, they share the same API, and every function that receives or returns a string has two versions: one with an A suffix ("ANSI") and one with a W suffix. On Windows 95 the W functions are just stubs that return failure.
When you include Windows.h it will create defines like #define CreateProcess CreateProcessW if UNICODE is defined or #define CreateProcess CreateProcessA if not.
Visual C++ does the same thing with the tchar.h header. It uses the _UNICODE define to decide if the TCHAR type and the _t* functions use the char or wchar_t type. This meant that you could create two releases from the same source code, one for Windows 95/98/ME and one with full Unicode support.
This is not that relevant anymore but you still need to make a choice because things will be defined for one or the other.
It is still perfectly valid to do
#define UNICODE
#define _UNICODE
#include <windows.h>
#include <tchar.h>
void foo()
{
    TCHAR buf[100];
    SomeWindowsFunction(buf, 100);
    _tprintf(_T("foo: %s\n"), buf);
}
although you will see many people go straight for WCHAR and wprintf these days.
The StrSafe functions were added to make it easier to write bug-free code; they still have the same A/W duplication.
You cannot mix and match WCHAR with printf. Even if you use %ls in the format string, the string will be converted internally, and not all Unicode strings will convert correctly.
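The conversion problem is not specific to printf; it shows up with any wide-to-narrow conversion in standard C. A rough sketch using wcstombs, which converts according to the current locale (wide_to_narrow is a hypothetical helper, and the exact set of characters that fail depends on the locale and the C library):

```c
#include <stddef.h>
#include <stdlib.h>
#include <wchar.h>

/* Hypothetical helper: converts src to a narrow string in the current locale.
 * Returns 0 on success, -1 if a character cannot be represented or the
 * result does not fit. dst is always terminated when dst_size > 0. */
static int wide_to_narrow(char *dst, size_t dst_size, const wchar_t *src)
{
    size_t n;

    if (dst_size == 0)
        return -1;

    n = wcstombs(dst, src, dst_size);
    if (n == (size_t)-1) {             /* some character has no narrow form */
        dst[0] = '\0';
        return -1;
    }
    if (n == dst_size) {               /* buffer filled, no terminator written */
        dst[dst_size - 1] = '\0';      /* may cut a multibyte sequence short */
        return -1;
    }
    return 0;                          /* n < dst_size: wcstombs terminated dst */
}
```

In the default "C" locale a plain ASCII string converts fine, while something like L"\u20ac" (the euro sign) typically fails, which is the same failure %ls runs into.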
If POSIX portability is not a requirement then I suggest that you use the wide function extensions provided by Microsoft when you need a C library function.
Note that different versions of the OS use different definitions of the base types and different alignment and padding. Remember the 8086, the 386, and now the Core i7 (16, 32, and 64 bits).
When structs need to be compatible with earlier versions, they typically use pre-defined integer widths and pad for legacy alignment.
For that reason, the types from the API must be used in API calls.
Memory is a similar story: there have been, and may still be, different memory models for process memory and shared memory. The clipboard, for example, uses a form of shared memory. It is important to use the memory allocation mechanisms Microsoft advises for such API calls.
For everything non-API I use the standard C functions and types.

Stand-alone portable snprintf(), independent of the standard library?

I am writing code for a target platform with NO C-runtime. No stdlib, no stdio. I need a string formatting function like snprintf but that should be able to run without any dependencies, not even the C library.
At most it can depend on memory alloc functions provided by me.
I checked out Trio but it needs stdio.h header. I can't use this.
Edit
Target platform: PowerPC64 home-made OS (not by me). However, the library shouldn't rely on OS-specific stuff.
Edit2
I have tried out some 3rd-party open source libs, such as Trio (http://daniel.haxx.se/projects/trio/), snprintf and miniformat (https://bitbucket.org/jj1/miniformat/src), but all of them rely on headers like string.h, stdio.h, or (even worse) stdlib.h. I don't want to write my own implementation if one already exists, as that would be time-wasting and bug-prone.
Try using the snprintf implementation from uClibc; it is likely to have the fewest dependencies. A bit of digging shows that snprintf is implemented in terms of vsnprintf, which is in turn implemented in terms of vfprintf (oddly enough) and uses a fake "stream" to write to the string.
This is a pointer to the code: http://git.uclibc.org/uClibc/tree/libc/stdio/_vfprintf.c
A quick Google search also turned up these:
http://www.ijs.si/software/snprintf/
http://yallara.cs.rmit.edu.au/~aholkner/psnprintf/psnprintf.html
http://www.jhweiss.de/software/snprintf.html
Hopefully one is suitable for your purposes. This is likely to not be a complete list.
There is a different list here:
http://trac.eggheads.org/browser/trunk/src/compat/README.snprintf?rev=197
You will probably at least need stdarg.h or low level knowledge of the specific compiler/architecture calling convention in order to be able to process the variadic arguments.
I have been using code based on Kustaa Nyholm's implementation. It provides printf() (with a user-supplied character output stub) and sprintf(), but adding snprintf() would be simple enough. I added vprintf() and vsprintf() in my implementation, for example.
No dynamic memory allocation is required, but it does depend on stdarg.h. As I said, you are unlikely to get away without that for any variadic function, though you could potentially implement your own.
I am guessing you are in a norming environment where you need to explicitly document and verify COTS code.
However, I think in the case of stdarg.h this is worthwhile. You could pull in the source for just this and treat it like handwritten code (review, lint, unit-test, etc.). Any self-written replacement will be a lot of work, probably less stable and absolutely not portable.
That said, the actual snprintf implementation should not be too hard, and you could do this yourself, probably. Especially if you might be able to strip a few features away.
Keep in mind that vararg code has no typechecking and is prone to errors. For library snprintf you may find gcc's warnings helpful.
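To give an idea of the size of the job, here is a rough sketch of a stripped-down snprintf that needs only stdarg.h and stddef.h, both of which a freestanding compiler provides. The names mini_snprintf and mini_vsnprintf are made up, and only %s, %d, %u, %x, %c and %% are handled (no widths, precision or length modifiers), so treat it as a starting point, not a drop-in replacement. The format attribute is the GCC extension that turns on the usual printf argument checking.

```c
#include <stdarg.h>   /* freestanding header: va_list, va_arg */
#include <stddef.h>   /* freestanding header: size_t */

/* Store c if there is room (one byte is kept for the terminator);
 * always advance the count so the caller learns the full length. */
static void put_char(char *buf, size_t size, size_t *pos, char c)
{
    if (*pos + 1 < size)
        buf[*pos] = c;
    (*pos)++;
}

static void put_string(char *buf, size_t size, size_t *pos, const char *s)
{
    if (s == NULL)
        s = "(null)";
    while (*s != '\0')
        put_char(buf, size, pos, *s++);
}

static void put_unsigned(char *buf, size_t size, size_t *pos,
                         unsigned long value, unsigned base)
{
    char digits[sizeof(unsigned long) * 3];  /* ample for base 10 and 16 */
    size_t n = 0;

    do {                                     /* least significant digit first */
        digits[n++] = "0123456789abcdef"[value % base];
        value /= base;
    } while (value != 0);
    while (n > 0)
        put_char(buf, size, pos, digits[--n]);
}

int mini_vsnprintf(char *buf, size_t size, const char *fmt, va_list ap)
{
    size_t pos = 0;

    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            put_char(buf, size, &pos, *fmt);
            continue;
        }
        switch (*++fmt) {
        case 'd': {
            int v = va_arg(ap, int);
            unsigned long u = (unsigned long)v;
            if (v < 0) {
                put_char(buf, size, &pos, '-');
                u = 0UL - u;                 /* magnitude, safe for INT_MIN */
            }
            put_unsigned(buf, size, &pos, u, 10);
            break;
        }
        case 'u': put_unsigned(buf, size, &pos, va_arg(ap, unsigned int), 10); break;
        case 'x': put_unsigned(buf, size, &pos, va_arg(ap, unsigned int), 16); break;
        case 'c': put_char(buf, size, &pos, (char)va_arg(ap, int)); break;
        case 's': put_string(buf, size, &pos, va_arg(ap, const char *)); break;
        case '%': put_char(buf, size, &pos, '%'); break;
        case '\0': fmt--; break;             /* lone '%' at the end: stop cleanly */
        default:                             /* unsupported: echo it verbatim */
            put_char(buf, size, &pos, '%');
            put_char(buf, size, &pos, *fmt);
            break;
        }
    }
    if (size > 0)
        buf[pos < size ? pos : size - 1] = '\0';
    return (int)pos;                         /* length the full output needs */
}

int mini_snprintf(char *buf, size_t size, const char *fmt, ...)
#ifdef __GNUC__
    __attribute__((format(printf, 3, 4)))    /* let gcc check the arguments */
#endif
    ;

int mini_snprintf(char *buf, size_t size, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = mini_vsnprintf(buf, size, fmt, ap);
    va_end(ap);
    return n;
}
```

As with C99 snprintf, a return value greater than or equal to the buffer size means the output was truncated.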

Should errno/perror methodology be used today to detect errors?

I know many questions have been asked previously about error handling in C but this is specifically about errno stuff.
I want to ask whether we should use the errno/perror functionality to handle errors gracefully at runtime. I am asking because MSVC uses it and the Win32 API also uses it heavily. I don't know anything about gcc or the 'Linux API'. Today both gcc and MSVC say that errno/perror can be used safely in a multithreaded environment. So what's your view?
thanks.
Note that using errno alone is a bad idea: standard library functions invoke other standard library functions to do their work. If one of the called functions fails, errno will be set to indicate the cause of the error, and the library function might still succeed, if it has been programmed in a manner that it can fall back to other mechanisms.
Consider malloc(3) -- it might be programmed to try mmap(.., MAP_PRIVATE|MAP_ANONYMOUS) as a first attempt, and if that fails fall back to sbrk(2) to allocate memory. Or consider execvp(3) -- it may probe a dozen directories when attempting to execute a program, and many of them might fail first. The 'local failure' doesn't mean a larger failure. And the function you called won't set errno back to 0 before returning to you -- it might have a legitimate but irrelevant value left over from earlier.
You cannot simply check the value of errno to see if you have encountered an error. errno only makes sense if the standard library function involved also returned an error return. (Such as NULL from getcwd(3) or -1 from read(2), or "a negative value" from printf(3).)
But in the cases when standard library functions do fail, errno is the only way to discover why they failed. When other library functions (not supplied by the standard libraries) fail, they might use errno or they might provide similar but different tools (see e.g. ERR_print_errors(3ssl) or gai_strerror(3).) You'll have to check the documentation of the libraries you're using for full details.
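In code, the rule above comes down to checking the return value first and reading errno only when it reports failure. A minimal POSIX sketch (open_readonly is a hypothetical helper, not part of any library):

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: opens path read-only.
 * Returns 0 and stores the descriptor on success, or the errno value
 * that explains the failure. */
static int open_readonly(const char *path, int *fd_out)
{
    int fd = open(path, O_RDONLY);

    if (fd == -1)        /* the return value says whether the call failed */
        return errno;    /* errno says why; read it before any other call */

    *fd_out = fd;        /* success: whatever errno holds now is stale */
    return 0;
}
```

Note that the success path never looks at errno at all, so a leftover value from some earlier, unrelated failure cannot be mistaken for an error.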
I don't know if it is really a question of "should" but if you are programming in C and using the low level C/posix API, there really is no other option. Of course you can wrap it up if this offends your stylistic sensibilities, but under the hood that is how it has to work (at least as long as POSIX is a standard).
On Linux, errno is thread-local, so it is safe to read and write from multiple threads or processes. perror() is another matter: it is a standard library function that is not re-entrant.

Should I use secure versions of POSIX functions on MSVC - C

I am writing some C code which is expected to compile on multiple compilers (at least MSVC and GCC). Since I am a beginner in C, I have all warnings turned on and treated as errors (-Werror in GCC and /WX in MSVC) to keep me from making silly mistakes.
When I compile code that uses strcpy on MSVC, I get a warning like this:
warning C4996: 'strcpy': This function or variable may be unsafe. Consider using strcpy_s instead. To disable deprecation, use _CRT_SECURE_NO_WARNINGS. See online help for details.
I am a bit confused. A lot of common functions are deprecated on MSVC. Should I use the secure versions when on Windows? If yes, should I wrap strcpy in something like this?
my_strcpy()
{
#ifdef WIN32
    // use strcpy_s
#else
    // use strcpy
#endif
}
Any thoughts?
Whenever you move data between non-constant-size buffers, you have to (gasp! omg!) actually think about whether it fits. Using functions (like the MS-specific strcpy_s or the BSD strlcpy) that purport to be "safe" will protect you from some obvious buffer overflow conditions, but won't protect you from the bugs that result from string truncation. It also won't protect you from integer overflows in computing the necessary sizes of buffers.
Unless you're an expert dealing with C strings, I would recommend forgetting about special functions and commenting every line of your code that will perform variable-length/position writes with a justification for how you know, at this point in the program, that the length/offset you're about to use is within the bounds of the size of the buffer. Do this for lines where you perform arithmetic on sizes/offsets too - document how you know that the arithmetic will not overflow, and add tests for overflow if you find you don't know.
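As an illustration of what such commented justifications might look like, here is a small sketch. concat_checked is a made-up helper; the point is the comments and the overflow test, not the function itself.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: writes the concatenation of a and b into dst.
 * Returns 0 on success, -1 if the result (plus terminator) does not fit. */
static int concat_checked(char *dst, size_t dst_size,
                          const char *a, const char *b)
{
    size_t la = strlen(a);
    size_t lb = strlen(b);

    /* la + lb + 1 must not wrap around SIZE_MAX; test without adding. */
    if (la > SIZE_MAX - 1 || lb > SIZE_MAX - 1 - la)
        return -1;

    /* No wrap is possible here, so this comparison means what it says. */
    if (la + lb + 1 > dst_size)
        return -1;

    /* In bounds: la <= la + lb + 1 <= dst_size. */
    memcpy(dst, a, la);

    /* In bounds: writes end at offset la + lb + 1 <= dst_size,
     * and the + 1 copies b's terminator. */
    memcpy(dst + la, b, lb + 1);
    return 0;
}
```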
Another approach is to completely wrap all your string handling in a string object that stores the length of the buffer along with the string and automatically reallocates when a string needs to be enlarged, and then only use const char * for read-only access to strings when you need to pass them to system functions or other libraries. This will sacrifice a good bit of the performance you'd expect from C, but it will help you ensure that you don't make mistakes. Just don't take it to the extreme. There's no need to duplicate stuff like strchr, strstr, etc. in your string wrapper. Just provide methods to duplicate string objects, concatenate them, and truncate them, and then with the existing library functions that operate on const char * you can do just about anything you'd want to.
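A bare-bones sketch of such a string object, assuming malloc/realloc are available. The str_buf type and the str_* names are invented for illustration, and only append is shown; duplicate and truncate would follow the same pattern.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical string object: buffer, current length, allocated capacity. */
typedef struct {
    char  *data;   /* NULL until the first append */
    size_t len;    /* length without the terminator */
    size_t cap;    /* bytes allocated for data */
} str_buf;

/* Appends text, growing the buffer as needed. Returns 0 on success, -1 on
 * overflow or allocation failure. text must not point into s->data, since
 * realloc may move the buffer. */
static int str_append(str_buf *s, const char *text)
{
    size_t add = strlen(text);
    size_t needed;

    if (s->len > SIZE_MAX - 1 || add > SIZE_MAX - 1 - s->len)
        return -1;                         /* len + add + 1 would wrap */
    needed = s->len + add + 1;

    if (needed > s->cap) {
        size_t new_cap = s->cap ? s->cap : 16;
        char *p;

        while (new_cap < needed) {
            if (new_cap > SIZE_MAX / 2) {  /* doubling would wrap */
                new_cap = needed;
                break;
            }
            new_cap *= 2;
        }
        p = realloc(s->data, new_cap);
        if (p == NULL)
            return -1;                     /* old buffer is still valid */
        s->data = p;
        s->cap = new_cap;
    }
    memcpy(s->data + s->len, text, add + 1);   /* fits: needed <= cap */
    s->len += add;
    return 0;
}

/* Read-only view for passing to system functions or other libraries. */
static const char *str_cstr(const str_buf *s)
{
    return s->data ? s->data : "";
}

static void str_free(str_buf *s)
{
    free(s->data);
    s->data = NULL;
    s->len = s->cap = 0;
}
```

A zero-initialised str_buf (str_buf s = {0};) is a valid empty string, so no separate constructor is needed.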
There are lots and lots of discussions about this topic here on SO. The usual suspects like strncpy, strlcpy and whatever will pop up here again, I'm sure. Just type "strcpy" in the search box and read some of the longer threads to get an overview.
My advice is: Whatever your final choice will be, it is a good idea to follow the DRY principle and continue to do it as in your example of my_strcpy(). Don't throw the raw calls all over your code, use wrappers and centralize them in your own string handling library. This will reduce overall code (boilerplate), and you have one central location to make modifications, if you change your mind later.
Of course this opens up some other cans of worms, especially for a beginner: memory handling responsibility and interface design. Each is a topic of its own, and 5 people will give you 10 suggestions for how to do it. A central library usually has the nice effect of enforcing one decision that you follow throughout your whole codebase, instead of using method a in module A and method b in module B and getting into trouble when you try to connect A with B...
I would tend to use the safer function snprintf †, which is available on both platforms, rather than having different code paths depending on the platform. You will need to use the define to prevent the warnings on MSVC.
† though possibly slightly less safe: it will return a string which is not nul-terminated on error, so you must check the return value, but it won't cause a buffer overflow.
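Checking the return value might look roughly like this. format_checked is a hypothetical wrapper around standard snprintf; the forced terminator is a defensive measure for implementations that do not terminate on truncation, and is harmless elsewhere.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical wrapper: formats "name=value" into dst.
 * Returns 0 on success, -1 if the output was truncated or formatting failed.
 * dst is terminated in either case when dst_size > 0. */
static int format_checked(char *dst, size_t dst_size,
                          const char *name, int value)
{
    int n = snprintf(dst, dst_size, "%s=%d", name, value);

    if (n < 0 || (size_t)n >= dst_size) {
        if (dst_size > 0)
            dst[dst_size - 1] = '\0';   /* never leave dst unterminated */
        return -1;
    }
    return 0;
}
```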

How to get the absolute path of a file programmatically without realpath() under Linux?

I know it is possible to get the absolute path of a file with the realpath() function. However, according to the BUGS section of the man page, there are some problems with its implementation. The details follow:
BUGS
Avoid using this function. It is broken by design since (unless using the non-standard resolved_path == NULL feature) it is impossible to determine a suitable size for the output buffer, resolved_path. According to POSIX a buffer of size PATH_MAX suffices, but PATH_MAX need not be a defined constant, and may have to be obtained using pathconf(3). And asking pathconf(3) does not really help, since on the one hand POSIX warns that the result of pathconf(3) may be huge and unsuitable for mallocing memory. And on the other hand pathconf(3) may return -1 to signify that PATH_MAX is not bounded.
The libc4 and libc5 implementation contains a buffer overflow (fixed in libc-5.4.13). Thus, set-user-ID programs like mount(8) need a private version.
So, the question is what is the best practice to get the absolute path of a file?
I know this question is old, but I don't see any answers that address the core issue: The man page OP referenced is wrong and outdated, for at least two reasons.
One is that POSIX 2008 added/mandated support for the NULL argument option, whereby realpath allocates the string for you. Programs using this feature will be portable to all relevant versions of GNU/Linux, probably most other modern systems, and anything conforming to POSIX 2008.
The second reason the man page is wrong is the admonition against PATH_MAX. This is purely GNU religious ideology against "arbitrary limits". In the real world, not having a pathname length limit would add all sorts of avenues for abuse/DoS, would add lots of failure cases to tasks that otherwise could not fail, and would break more interfaces than just realpath.
If you care about maximum portability, it's probably best to use a mix of both methods. See the POSIX documentation for details:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/realpath.html
I would use a fixed-size, caller-provided buffer if PATH_MAX is defined, and otherwise pass NULL. This seems to cover all cases, but you might also want to check older versions of POSIX to see if they have any guidelines for what to do if PATH_MAX is not defined.
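A sketch of that mixed approach, with a made-up wrapper name (absolute_path). It always hands back a malloc'ed string so the caller frees the result the same way in both branches:

```c
/* Request the XSI interfaces (realpath) if the build has not already chosen.
 * This must appear before the first #include to have any effect. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <limits.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical wrapper: returns a malloc'ed absolute path, or NULL on
 * failure (errno is set by realpath or malloc). The caller must free it. */
static char *absolute_path(const char *path)
{
#ifdef PATH_MAX
    /* PATH_MAX is a defined constant: a fixed-size buffer suffices. */
    char buf[PATH_MAX];
    char *copy;

    if (realpath(path, buf) == NULL)
        return NULL;
    copy = malloc(strlen(buf) + 1);
    if (copy != NULL)
        strcpy(copy, buf);             /* fits: allocated strlen(buf) + 1 */
    return copy;
#else
    /* No compile-time limit: let realpath allocate (POSIX 2008). */
    return realpath(path, NULL);
#endif
}
```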
Use getcwd() and readlink(), both of which take a buffer size, to reimplement realpath(). Note that to do it correctly you have to resolve symbolic links, "." and ".." from left to right.
From the shell, I can get a full path using readlink -f $FILE. There's a readlink() function in glibc, maybe that'll help you.
# man 2 readlink
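Since readlink() neither terminates its output nor reports the size it needed, the usual way to call it is in a loop with a growing buffer. A rough sketch (readlink_alloc is a hypothetical helper; it reads a single link and is only one building block of a realpath() replacement, not the whole thing):

```c
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: returns the target of the symbolic link at path as a
 * malloc'ed, terminated string, or NULL on failure (for example when path is
 * not a symbolic link). The caller must free the result. */
static char *readlink_alloc(const char *path)
{
    size_t size = 64;

    for (;;) {
        char *buf = malloc(size);
        ssize_t n;

        if (buf == NULL)
            return NULL;
        n = readlink(path, buf, size);
        if (n < 0) {
            free(buf);
            return NULL;
        }
        if ((size_t)n < size) {        /* room left, so nothing was cut off */
            buf[n] = '\0';             /* readlink does not terminate */
            return buf;
        }
        free(buf);                     /* possibly truncated: retry larger */
        if (size > (size_t)-1 / 2)
            return NULL;
        size *= 2;
    }
}
```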
