What are the differences between the various C standard library implementations? [closed]

While researching how fopen and, in particular, the FILE structure are built, I found out that it isn't all just GNU libc. I began to look more closely at the various implementations, such as Google's Bionic, musl, Newlib and several others. I had previously only been aware of musl, and my understanding was that it was just a more optimized version of GNU libc. I hadn't given much thought to how, or whether, the implementations are allowed to differ, or to how similar they have to be.
Specifically, how do the POSIX and ANSI C standards dictate the various implementations? That is, are fopen and FILE (and all other libc functions and structures) required to be built more or less the same way? Or are they opaque and only held to the "standard" of behaving the same (return values and exposed interfaces)?
Intuitively, I would think it's as I have described, but maybe musl is literally built more efficiently and optimized at compilation and linkage.
Any insight is greatly appreciated.

Specifically how does the Posix and ANSI C standard dictate the various implementations?
They don't; they only dictate the expected behavior of calling each of the library's functions.
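As a rough illustration of what "only the behavior is dictated" means for FILE, the sketch below uses a stream purely through the standard stdio calls and never looks inside the structure. The helper name roundtrip_through_file is made up for this example; any conforming library should behave the same here, whatever it stores in FILE.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes `text` to a temporary file and reads it back
 * using only the standard stdio interface. No member of FILE is touched,
 * so the source does not depend on how a given libc lays out the struct.
 * Returns 1 if the text survived the round trip, 0 if it did not, and
 * -1 if the temporary file could not be created or written. */
int roundtrip_through_file(const char *text)
{
    char buf[128];
    size_t len = strlen(text);
    size_t got;
    FILE *fp;

    if (len >= sizeof buf)
        return -1;              /* keep the sketch simple: short strings only */

    fp = tmpfile();             /* standard C; opened for update in binary mode */
    if (fp == NULL)
        return -1;

    if (fwrite(text, 1, len, fp) != len) {
        fclose(fp);
        return -1;
    }

    rewind(fp);                 /* reposition before switching from writing to reading */
    got = fread(buf, 1, sizeof buf - 1, fp);
    fclose(fp);

    buf[got] = '\0';
    return got == len && memcmp(buf, text, len) == 0;
}
```

Calling roundtrip_through_file("hello") from your own main should give the same result on glibc, musl or any other conforming library, even though their FILE definitions differ.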
What are the differences between the various C standard library implementations
You can split this into 2 categories:
implementations for different operating systems. E.g. a standard C library designed for Windows must use the Windows kernel API and/or depend on other Windows-specific dynamically linked libraries, a standard C library designed for Linux must use the Linux kernel API, etc. For operating systems that use micro-kernels or exo-kernels the standard C library could be radically different (e.g. maybe "open()" sends a message to a different process).
implementations for the same OS. In this case most differences are likely minor. E.g. if you asked 5 different programmers to implement their own version of strlen() you might get 2 simple versions that are almost identical, 1 almost as simple version that is slightly different, and 2 complex versions that are very different (and contain highly optimized inline assembly). However, it's convenient to put things that aren't part of the standard C library into the same library (e.g. OS-specific extensions, and extensions like POSIX), so different implementations of the C standard library can have different extensions and/or different versions of extensions.
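The strlen() point can be sketched with two of the "simple" variants. Both function names below are invented for the example (a real libc would call its version strlen and would often use hand-tuned assembly instead); the point is only that different source can give identical observable behavior.

```c
#include <stddef.h>

/* Variant 1: count with an index until the terminating NUL is found. */
size_t strlen_indexed(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Variant 2: walk a pointer to the terminator, then subtract. */
size_t strlen_pointer(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

A caller cannot tell the two apart from their results, which is all the standard asks for.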
Of course an implementation of the standard C library can attempt to be portable and support multiple environments (e.g. 32-bit and 64-bit code) and multiple operating systems (e.g. with lots of #ifdef ...), so there are also differences in which targets the source code of a standard C library tries to support.
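A minimal sketch of the #ifdef approach, assuming the predefined macros that the common compilers provide (_WIN32, __linux__, __APPLE__, __unix__). The function names are hypothetical, and a real libc selects whole source files or system call wrappers this way rather than a string.

```c
#include <limits.h>

/* Picks a branch at compile time based on compiler-predefined macros. */
const char *target_os_name(void)
{
#if defined(_WIN32)
    return "Windows";
#elif defined(__linux__)
    return "Linux";
#elif defined(__APPLE__) && defined(__MACH__)
    return "Apple (Mach-based)";
#elif defined(__unix__)
    return "other Unix-like";
#else
    return "unknown";
#endif
}

/* Width of a data pointer in bits, e.g. to tell 32-bit from 64-bit builds. */
unsigned pointer_bits(void)
{
    return (unsigned)(sizeof(void *) * CHAR_BIT);
}
```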

Related

I want to know about C Library Configuration

The C standard library is independent of any particular operating system or platform.
So, why use the input/output functions from the standard library?
Unix-specific POSIX system calls exist. Windows-specific input/output system calls exist.
Don't standard library functions eventually call system calls internally? Is this just for portability?
The API presented by the C standard library is uniform between operating systems, well, uniform provided all the "unspecified" parts of C roughly align (like the size of int).
The implementation of the C standard library is not independent of the operating system. Basically, the implementation consists of the compiled source code that provides the API, and that compiled code matches the CPU's machine instruction set, and possibly other hardware-specific details such as bus width and supporting chipsets.
So, programming against the C API helps your program be "more portable" but that doesn't mean that any specific implementation of the C API is portable. Finally, there are lots of small details that aren't specified in detail, or are allowed to vary between platforms (like byte order, size of int, and so on). So even a program written against the standard C API might not work correctly on another machine, unless you write code that accommodates and reacts to the parts of the C API that might differ between platforms.
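One way to write code that "accommodates" a varying detail such as byte order is to avoid depending on it at all. The sketch below (function names are made up for the example) stores and loads a 32-bit value in big-endian order using shifts, so it gives the same bytes on any host; is_little_endian only shows how the host order could be inspected.

```c
#include <stdint.h>

/* Returns 1 if the host stores the least significant byte first, else 0. */
int is_little_endian(void)
{
    const uint16_t probe = 1;
    return *(const unsigned char *)&probe == 1;
}

/* Stores v as four bytes, most significant first, regardless of host order. */
void store_be32(unsigned char out[4], uint32_t v)
{
    out[0] = (unsigned char)((v >> 24) & 0xFFu);
    out[1] = (unsigned char)((v >> 16) & 0xFFu);
    out[2] = (unsigned char)((v >> 8) & 0xFFu);
    out[3] = (unsigned char)(v & 0xFFu);
}

/* Reads four big-endian bytes back into a host-order value. */
uint32_t load_be32(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8) |
            (uint32_t)in[3];
}
```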
POSIX is basically a standard that eventually became incorporated into most C development environments. It is designed to provide a single API to program against for multiple UNIX platforms for items that lie outside of the core C language. There are POSIX implementations for Windows too, but Microsoft's historical offerings are notorious for not actually working correctly.
Yes, these APIs (if available) are implemented with code that eventually performs operating-system-specific calls, and that code is delivered as machine code that is very specific to the CPU instruction set. There are dozens of CPUs out there, and each major platform has its own matching compiler and matching C API libraries, if the C language is available.
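To make the layering concrete, here is a toy puts()-like function built directly on the POSIX write() call. This is only a sketch under the assumption of a POSIX system; the names write_all and tiny_puts_fd are invented, and a real stdio adds buffering, locking and error state on top of this.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>   /* POSIX, not ISO C: declares write() */

/* write() may transfer fewer bytes than requested, so loop until done.
 * Returns 0 on success, -1 on error. */
static int write_all(int fd, const char *buf, size_t len)
{
    while (len > 0) {
        ssize_t n = write(fd, buf, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: retry */
            return -1;
        }
        buf += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Writes s followed by a newline to the file descriptor fd.
 * Returns 0 on success, -1 on error. */
int tiny_puts_fd(int fd, const char *s)
{
    if (write_all(fd, s, strlen(s)) != 0)
        return -1;
    return write_all(fd, "\n", 1);
}
```

On Windows the same function would have to be written against a different OS interface, which is exactly the part the standard library hides from portable programs.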
The C language and API are there for portability, but portability isn't their primary reason for existing (and there are lots of small corner cases where the same code isn't portable across all platforms unless it is written a certain way). The primary reason is consistency: if the language features weren't consistently available across all platforms, you wouldn't have "one C language" that could be used on multiple machines. You would have "many C-like languages, where each supported item would have to be checked", meaning you might know C on your development platform but not on another platform.
As for the libraries, there are many libraries that might be absent in a typical machine, and when developing, you generally have to use a dependency checker to ensure the library is present (and sometimes the correct functions are available in the library) before successful use of the machine for development. Autoconf, for example, has m4 macros that can be configured to check if a library is present before compiling the programs.

What is the relationship between POSIX and the C language?

I understand that the C language is an ISO standard, and I can see from Wikipedia that the standard includes 29 header files, and that conforming to these header files, a C application is theoretically 'portable'.
In practice, however, I recently tried doing a tutorial on a simple C HTTP server that uses header files that aren't part of the C standard. So in this case, the simplest of applications that I can think of - a C application comprising a single int main(void) function, and that is less than 100 lines, with the aim of listening on a network interface goes beyond the C standard?
In this case what is the relationship between the C language as a specification and (assuming I'm writing an application for Linux) the POSIX specification as a language?
As far as I can tell, "man7.org" provides a list of the C header files that define the API of all Unix/Linux systems (I'm assuming this is the same as 'POSIX') systems, as well as a list of system calls for the Linux platform.
This includes 82 header files, of which the 29 C standard library headers are a subset, and some 10 000 system calls (at least I assume that everything in this list that is NOT a header file is a system call).
I would assume that any reasonably functional program written in C would go beyond the standard library and make use of OS specific header files. Would it not be more accurate to say that programming an application to run on Linux would actually be "POSIX programming"?
I guess it would also be possible to stick to the standard library, and define custom header files for portable logic implementation across POSIX & non-POSIX systems (including platform-specific assembly routines). Is this ever done?
POSIX is not a specification for a language; it is a specification for an operating system, just one part of which is the wider C library specification, plus additional restrictions on how the C language itself needs to be implemented on such operating systems.
There are many popular cross-platform libraries. One popular library that concerns the areas that the POSIX C specification is mostly concerned with is the Apache Portable Runtime:
The mission of the Apache Portable Runtime (APR) project is to create and maintain software libraries that provide a predictable and consistent interface to underlying platform-specific implementations. The primary goal is to provide an API to which software developers may code and be assured of predictable if not identical behaviour regardless of the platform on which their software is built, relieving them of the need to code special-case conditions to work around or take advantage of platform-specific deficiencies or features.
APR includes things like the sockets and threads and processes and can be used to compile the same application for various operating systems - many unix-like ones and Windows - with minimal changes.
POSIX is not a standard but a family of standards specifying an entire operating system.
The POSIX C library specification is a superset of the standard C library; the relationship between the two is well described in this other question.
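A program can check at compile time whether it is being built against a POSIX system at all. The sketch below assumes the usual convention that <unistd.h> defines _POSIX_VERSION on POSIX systems; the function name posix_version is hypothetical.

```c
/* <unistd.h> is a POSIX header, not an ISO C one, so only include it
 * where the compiler says the target is Unix-like. */
#if defined(__unix__) || (defined(__APPLE__) && defined(__MACH__))
#include <unistd.h>
#endif

/* Returns the POSIX version the headers claim to support (a value such as
 * 200809L), or -1 if no POSIX support is advertised. */
long posix_version(void)
{
#ifdef _POSIX_VERSION
    return (long)_POSIX_VERSION;
#else
    return -1L;
#endif
}
```

Code that only uses the ISO C headers compiles either way; code that needs sockets or processes can use a check like this to pick a POSIX or a non-POSIX path.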

No library C implementation

I've heard while looking at different C implementations that any system that hopes to implement C must minimally include certain libraries, stdarg.h etc. My question is why this is, it can't be that the C library is not Turing complete without some headers, and since the headers have been written it must be true that I could write them myself. Why, then, is it not permissible to have a C implementation consisting of just a compiler+linker toolchain? (of course, in this case interacting with the OS would require inline assembly or linked assembly code as well as knowledge of the system's syscalls etc., but that doesn't mean that C can't be written, does it?)
You confuse a property of the programming language, i.e. the language itself with additional features mandated by the standard.
"Turing complete" is just about the language itself; basically if you can use it to solve a certain class of problems (for a more exact definition, please see Wikipedia for a starter(!) ). That is quite an abstract concept and does not include any libraries. Basically, if you use such libraries, you just have to be able to write those libraries in the language itself. This is true for the C language.
About the libraries required: your premise is wrong. C does allow the libraries themselves to be omitted. That is the difference between a hosted implementation (full library) and a freestanding one (a few target-specific headers, but no generated code). See section 4, paragraph 6 of the C standard.
The few headers are normally part of the compiler itself. They basically provide some typedefs and #defined constants, e.g. the range of the integer types (limits.h) and types of guaranteed minimum width (stdint.h, often also fixed-width types). stddef.h e.g. provides size_t and NULL.
While you do not need to use those headers, they already allow writing portable code for the program logic. Just see them as part of the language itself, tailored to the target.
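As a small sketch of what those headers alone make possible, the code below uses only <limits.h>, <stddef.h> and <stdint.h>, which a freestanding implementation has to provide, and calls no library function. The function names are made up; __STDC_HOSTED__ is the standard macro (C99 and later) that reports whether the implementation is hosted.

```c
#include <limits.h>   /* CHAR_BIT */
#include <stddef.h>   /* size_t */
#include <stdint.h>   /* uint_least32_t: at least 32 bits wide */

/* Sums the bytes of a buffer; pure program logic, no library calls. */
uint_least32_t byte_sum(const unsigned char *data, size_t len)
{
    uint_least32_t sum = 0;
    size_t i;

    for (i = 0; i < len; i++)
        sum += data[i];
    return sum;
}

/* Number of bits in a size_t on this target. */
unsigned size_t_bits(void)
{
    return (unsigned)(sizeof(size_t) * CHAR_BIT);
}

/* 1 for a hosted implementation, 0 for a freestanding one. */
int is_hosted(void)
{
    return __STDC_HOSTED__;
}
```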
The gcc C compiler on its own, for instance, is effectively a freestanding implementation: it only provides the required headers, but not the standard library. Instead, it relies on the system library, which is e.g. glibc on Linux.
Note: generally it is a bad idea to reinvent the wheel. So if you are in a hosted environment (i.e. a full-grown OS), you should use the features available. Otherwise you might run into trouble, as these might provide additional functionality not directly visible to your code, e.g. debugging or system-wide and per-user configuration such as localisation support. Debugging tools such as valgrind might also depend on you using the standard library. Replacing memory allocation with your own code at least makes this much more difficult.
Not to mention maintainability. Not only will others understand your code more easily if you use the standard names and semantics, but so will you: just wait some years and try understanding your old code.
OTOH, if you are on a bare-metal embedded system, there is actually little use for most features of the standard library. Including e.g. printf or scanf just bloats your firmware, often without any actual benefit. For such systems, there are stripped-down libraries (e.g. newlib) which may not be completely compliant or which allow omitting certain costly features, e.g. floating-point conversion or the math library. Still, you should only use them if you really need many of their features. And sometimes there is a middle way, but that requires some knowledge about the dependencies of the library.
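For a feel of what "not pulling in printf" can look like, here is a sketch of a tiny decimal formatter that needs nothing from the library. The name format_unsigned is invented for this example; it is not a newlib function, and real firmware would pair it with its own output routine (e.g. a UART write).

```c
#include <limits.h>
#include <stddef.h>

/* Writes the decimal form of value into buf (capacity `size` bytes),
 * NUL-terminated. Returns the number of digits written, or 0 if the
 * buffer is too small. */
size_t format_unsigned(unsigned long value, char *buf, size_t size)
{
    /* Every 3 bits hold less than one decimal digit, so this is enough. */
    char tmp[sizeof(unsigned long) * CHAR_BIT / 3 + 2];
    size_t n = 0;
    size_t i;

    do {                         /* digits come out least significant first */
        tmp[n++] = (char)('0' + value % 10);
        value /= 10;
    } while (value != 0);

    if (n + 1 > size)
        return 0;                /* no room for the digits plus the NUL */

    for (i = 0; i < n; i++)      /* reverse into the caller's buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```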
Two reasons: compatibility and system interaction.
If you don't implement the whole C standard library, then code other people write won't work. Even the most basic C program uses library calls.
#include <stdio.h>

int main(void) {
    printf("Hello world!\n");
    return 0;
}
Without an agreed upon and fully implemented stdio.h that program will not run because the compiler doesn't know what printf() means.
Then there's system interaction. C has been called "portable assembly". This is because different computing environments do things differently, but C takes care of that for you (well, some of it). You can't write a portable stdio.h in assembly without losing your mind. But it's more than that. Each C header file protects you from something that each environment does (or used to do) very differently.
stdlib.h shields you from differing memory models and process control.
stdio.h shields you from differing IO systems.
math.h shields you from differing floating point implementations.
limits.h shields you from differing data sizes.
locale.h shields you from differing locales.
And so on...
The C libraries provide a standard API that each environment can write to. When C is ported to a new environment, that environment is responsible for implementing those libraries according to the particulars of that system. You don't have to do that.
Nowadays we live in a much more homogeneous environment than when C was developed, but most of the C standard library still protects you from basic differences in how operating systems and hardware do things.
It wasn't always this way. An example that comes to mind is the hell of running a game on DOS. There was no standard interface to the sound and video card (if you had them). Each program had to ship drivers for each sound and video card it supported. If yours wasn't in there, sorry. If its driver was buggy, sorry.
Programming without the C standard library is kind of like that, but far far worse.

Zopfli is written in C for portability... wait what? [closed]

So I am not a C programmer so pardon this question.
I was reading this blog entry Google Zopfli Compression and I was a little dumbfounded by the following sentence : "Zopfli is written in C for portability".
How exactly is C a portable language? Or does he not mean portable in a compile-to-machine-code sense, but some other context? I guess C is more portable than writing assembly code. But is that really the comparison he is trying to make? I hope someone can enlighten me as to what he means and how exactly C is a portable language.
Thanks a lot!
Portable in this context means something like "Anybody can take this source code and compile it on their own computer and have this program." Very nearly all computers drawing power somewhere today have a C compiler available for them (it may not be installed on that machine, but it's either available to be installed or is available as a cross-compiler, e.g. for embedded systems), so that same source code is portable virtually everywhere. (EDIT: I'm assuming based on context that the source code doesn't have system-specific things in it, as system-specific things would limit your portability.)
"Portability" has multiple meanings, depending on the context:
The C language is "portable" in the sense that C compilers have been written for a wide variety of platforms, from mainframes to microcontrollers;
The language is also "portable" in the sense that there is an agreed-upon standard that implementations conform to (to greater or lesser degree), so you don't have subtly different versions of the language depending on the vendor - the behavior of a conforming program should be the same on any conforming implementation;
C programs that don't make any assumptions about the system they're running on (type sizes, alignment, endianness) or use system-specific libraries are often "trivially" portable; they only need to be recompiled for the target platform, without needing to edit the source code.
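The alignment caveat in the last point can be illustrated with a short sketch. Casting a byte pointer to uint32_t * and dereferencing it works on some platforms and faults on others; copying the bytes with memcpy makes no alignment assumption. The function name is made up for the example.

```c
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t (in host byte order) from any address, aligned or not.
 * The non-portable alternative would be: return *(const uint32_t *)p; */
uint32_t load_u32_unaligned(const unsigned char *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);   /* byte-wise copy; no alignment requirement */
    return v;
}
```

Compilers typically turn the memcpy into a single load where the target allows it, so the portable form usually costs nothing.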
Compared to the majority of its contemporaries (Pascal, Fortran, etc.), C is highly portable, and I spent the bulk of the '90s writing C code that had to run on multiple platforms concurrently (one project required the same code to run on Windows NT, Solaris, and Classic MacOS).
C's portability can be summed up as "write once [1], build and run everywhere", where Java and C#'s portability can be summed up as "write and build once, run everywhere."
[1] Subject to the caveats in the third bullet.
For a piece of software to be considered cross-platform, it must be able to function on more than one computer architecture or operating system.
Developing such a program can be a time-consuming task because different operating systems have different application programming interfaces (APIs).
For example, Linux uses a different API for application software than Windows does.
C is a language you can use with most of these APIs.
C code can be called directly from C++, and easily used from C# and, I believe, Objective-C. Given that and the wide availability of C compilers, the choice does make sense.
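The usual way to make a C function callable from C++ is to wrap its declaration in an extern "C" guard, so a C++ compiler does not apply its own name mangling. A minimal sketch, with add_ints as a hypothetical function and the header and source shown in one block for brevity:

```c
/* --- what would normally live in a header, e.g. a hypothetical addlib.h --- */
#ifdef __cplusplus
extern "C" {            /* seen only by C++ compilers */
#endif

int add_ints(int a, int b);

#ifdef __cplusplus
}
#endif

/* --- what would normally live in the C source file --- */
int add_ints(int a, int b)
{
    return a + b;
}
```

A C compiler never sees the extern "C" lines because __cplusplus is not defined, so the same header serves both languages.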
Of course, the argument can also be made that Java is more portable as far as running it directly on other machines. But Java can't be moved from language to language as easily.

Is the ABI part of the C standard? [duplicate]

This question already has answers here:
Does C have a standard ABI?
It seems to me that C libraries almost never have issues mixing libraries compiled with different versions or (sometimes) even different compilers, and that many languages seem to be able to interface with C libraries either directly or with minimal effort.
Is this all because the ABI is standard?
ABIs are not codified in the language standard. You can get a copy of any of the C standard drafts to see it yourself.
And there's a good reason for ABIs not being in the standard. The standard cannot anyhow foresee all hardware and OSes for which C compilers can be implemented.
The ABI is most definitely not standard. At least, not in the C standard. Each operating system or tool chain specifies these things, but the language itself does not. Try running a Windows program on a Linux machine, for example.
The ABI is defined by the operating system and/or the toolchain and is not defined by the standard. It defines, for example, how parameters are passed to a function call, what layout the stack frame has, or how system calls are invoked.
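Structure layout is another detail fixed by the ABI rather than by the C standard. The sketch below (struct and function names are invented) reports how much padding a particular platform inserts. The standard only guarantees that members appear in declaration order with the first at offset 0; the actual numbers depend on the target.

```c
#include <stddef.h>

/* The compiler may insert padding after tag and after flag so that
 * value is suitably aligned; how much is an ABI decision. */
struct sample {
    char tag;
    int  value;
    char flag;
};

/* Bytes of padding in struct sample on this platform. */
size_t sample_padding(void)
{
    return sizeof(struct sample) - (sizeof(char) + sizeof(int) + sizeof(char));
}

/* Offset of the value member, which reflects int's alignment here. */
size_t sample_value_offset(void)
{
    return offsetof(struct sample, value);
}
```

Two libraries can only exchange a struct sample if they were built against the same ABI and therefore agree on these numbers.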
The reason why most languages are able to interface with C libraries is most likely that most operating systems are (more or less) written in C, expose C libraries as their API and define an ABI based on that. And if a library written in a certain language wants to interface with a specific OS, it has to be able to work with the ABI of that OS.
The ABI is not a part of the C standard. However, there have been efforts to standardize ABIs. Quoting from "Linux System Programming":
Although several attempts have been made at defining a single ABI for a given
architecture across multiple operating systems (particularly for i386 on Unix systems), the
efforts have not met with much success. Instead, operating systems—Linux
included—tend to define their own ABIs however they see fit.
