Is the ABI part of the C standard?

It seems to me that C programs almost never have issues mixing libraries compiled with different compiler versions or (sometimes) even different compilers, and that many languages seem to be able to interface with C libraries either directly or with minimal effort.
Is this all because the ABI is standard?

ABIs are not codified in the language standard. You can get a copy of any of the C standard drafts to see for yourself.
And there's a good reason for ABIs not being in the standard: the standard cannot possibly foresee all the hardware and OSes for which C compilers can be implemented.

The ABI is most definitely not standard. At least, not in the C standard. Each operating system or toolchain specifies these things, but the language itself does not. Try running a Windows program on a Linux machine, for example.

The ABI is defined by the operating system and/or the toolchain, not by the standard. It defines, for example, how parameters are passed in a function call, what layout the stack frame has, and how system calls are invoked.
The reason why most languages are able to interface with C libraries is most likely that most operating systems are (more or less) written in C, expose C libraries as their API, and define an ABI based on that. And if a library written in a certain language wants to interface with a specific OS, it has to be able to follow the ABI of that OS.
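To make the "layout" part concrete, here is a minimal sketch. The struct and function names are made up for illustration. The C standard only guarantees that the first member sits at offset 0; how much padding follows it is chosen by the platform ABI and the compiler, so the printed numbers may differ between targets.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical record, used only to illustrate layout. */
struct sample {
    char tag;
    long value;
};

/* Number of padding bytes inserted between tag and value on this target. */
size_t sample_padding(void)
{
    return offsetof(struct sample, value) - sizeof(char);
}

/* Print the layout chosen by the current compiler and target. */
void print_sample_layout(void)
{
    /* Guaranteed by the language: the first member is at offset 0. */
    assert(offsetof(struct sample, tag) == 0);

    /* Everything below is decided by the ABI, not by the C standard. */
    printf("sizeof(struct sample) = %zu\n", sizeof(struct sample));
    printf("offsetof(value)       = %zu\n", offsetof(struct sample, value));
    printf("padding after tag     = %zu\n", sample_padding());
}
```

Two compilers that follow the same platform ABI should print the same numbers, which is what lets their object files be linked together.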

The ABI is not part of the C standard. However, there have been efforts to standardize ABIs. Quoting from "Linux System Programming":
Although several attempts have been made at defining a single ABI for a given
architecture across multiple operating systems (particularly for i386 on Unix systems), the
efforts have not met with much success. Instead, operating systems—Linux
included—tend to define their own ABIs however they see fit.


What are the differences between the various C standard library implementations

While researching how fopen, and in particular the FILE structure, are built, I found out that it isn't all just GNU libc. I began to look more closely at the various implementations, such as Google's Bionic, musl, Newlib and several others. Until now I had only been aware of musl, and my understanding was that it was just a more optimized version of GNU libc. I hadn't given much thought to how, or whether, they are allowed to differ, and in what respects they are similar.
Specifically, how do the POSIX and ANSI C standards dictate the various implementations? That is, are fopen and FILE (and all libc functions and structures) required to be built more or less the same way? Or are they opaque and only held to the "standard" of behaving the same (return values and exposed interfaces)?
Intuitively, I would think it's as I have described, but maybe musl is literally built more efficiently and optimized at compilation/linkage.
Any insight is greatly appreciated.
Specifically, how do the POSIX and ANSI C standards dictate the various implementations?
They don't; they only dictate the expected behavior of each library function when it is called.
What are the differences between the various C standard library implementations
You can split this into 2 categories:
implementations for different operating systems. E.g. a standard C library designed for Windows must use the Windows kernel API and/or depend on other Windows-specific dynamically linked libraries, a standard C library designed for Linux must use the Linux kernel API, etc. For operating systems that use micro-kernels or exo-kernels the standard C library could be radically different (e.g. maybe "open()" sends a message to a different process).
implementations for the same OS. In this case most differences are likely minor. E.g. if you asked 5 different programmers to implement their own version of strlen() you might get 2 simple versions that are almost identical, 1 almost as simple version that is slightly different, and 2 complex versions that are very different (and contain highly optimized inline assembly). However, it's convenient to shove things that aren't part of the standard C library into the same library (e.g. OS-specific extensions, and extensions like POSIX), so different implementations of the C standard library can have different extensions and/or different versions of extensions.
Of course an implementation of the standard C library can attempt to be portable and support multiple environments (e.g. 32-bit and 64-bit code) and multiple operating systems (e.g. with lots of #ifdef ...), so there are also differences in which targets the source code of a standard C library tries to support.
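The strlen() point can be sketched as follows. Both functions are hypothetical toy versions written for this illustration (real libc implementations are usually far more optimized). They are built differently, but a caller cannot tell them apart, which is all the standard asks for.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical implementation 1: count with an index. */
size_t strlen_index(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* Hypothetical implementation 2: walk a pointer, then subtract. */
size_t strlen_pointer(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Both versions must agree with the system's strlen() on any valid string. */
void check_same(const char *s)
{
    assert(strlen_index(s) == strlen(s));
    assert(strlen_pointer(s) == strlen(s));
}
```

Calling check_same("fopen") or check_same("") passes with either version, even though the code inside differs.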

Linux kernel: What kind of C Linux kernel is using?

I am confused here. They say the Linux kernel is developed using C. But to my knowledge, the C library is built on top of the Linux kernel, so in kernel land there should be no C just yet. And yet the kernel code I saw on GitHub was all written in C, all with those weird includes! It's just like the classic chicken-and-egg puzzle to me. Which one came first?
Thanks in advance for your patience with my stupid question(s).
C isn't built on top of Linux. C is a compiled programming language: a compiler translates it into machine code. Depending on your OS, the compiler may do this differently (for some C code).
But the language C itself is really just a very long list of rules about what functions should do and how things should behave, and compilers obey these rules. That's what is called the C "standard". There is a committee that sets it, and there are multiple versions.
The Linux kernel was indeed written in C. So someone wrote it and then compiled it with a C compiler (in practice GCC or Clang, since the kernel source relies on GNU C extensions beyond the standard).
As for libraries, they're optional. The Linux kernel is developed without dependencies, which means it implements everything it needs itself, in plain C. The includes you see are just files from the kernel itself, defining its functions, types etc.
The Linux kernel (like other kernels) is developed freestanding, which means it doesn't use any external libraries. Every function it needs is implemented inside the kernel. What you call "weird includes" are includes declaring its own internal functions and types.
The C specification makes a distinction between hosted and freestanding implementations. For some details, see Is there a meaningful distinction between freestanding and hosted implementations? and https://stackoverflow.com/questions/35164489/what-is-the-reason-for-creating-freestanding-vs-hosted-implementation.
One of the differences is that freestanding implementations are not required to provide all the standard library functions. When compiling a Unix kernel, we use the compiler in freestanding mode, because many of the standard libraries depend on having a kernel beneath them. In particular, the standard I/O library requires an operating system with files, but the kernel is where that all gets implemented, so it can't be used from the kernel.
While there are some library functions, like the ones in <string.h>, that could be the same in the kernel, to keep things simple it doesn't link with any of the standard libraries. There are functions like strcpy() in the kernel, but they're copies of the standard library code, not linked with the same libraries (on many systems, the standard C library is dynamically linked, but this isn't feasible in the kernel).
So the kernel makes use of the C language, but none of the C libraries.
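As a rough sketch of what "the language but not the library" looks like: the two routines below need nothing beyond <stddef.h>, which even a freestanding implementation provides. The k_ names are hypothetical and this is not the Linux kernel's actual code. The self-check at the end uses <assert.h> only so the sketch can be tried in an ordinary hosted program.

```c
#include <stddef.h>   /* freestanding header: size_t */
#include <assert.h>   /* hosted only; used by the self-check below */

/* Hypothetical kernel-style byte copy, written without any libc call. */
void *k_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--)
        *d++ = *s++;
    return dst;
}

/* Hypothetical kernel-style string copy, including the terminating '\0'. */
char *k_strcpy(char *dst, const char *src)
{
    char *d = dst;
    while ((*d++ = *src++) != '\0')
        ;
    return dst;
}

/* Userspace self-check; a real kernel would not have assert() available. */
void k_selfcheck(void)
{
    char buf[4];
    k_strcpy(buf, "abc");
    assert(buf[0] == 'a' && buf[1] == 'b' && buf[2] == 'c' && buf[3] == '\0');
}
```

With GCC or Clang, code like the first two functions can be built with the -ffreestanding option, which tells the compiler not to assume a hosted standard library.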

What is the difference between the C programming language and C programming under linux?

What is the difference between the C programming language and C programming under Linux?
Are the syntax same in both them?
Or is the difference only when you execute the program?
The C language is governed by the ISO-approved C standard, which does not take into account the underlying platform on which you use C. So from the perspective of the language standard there is no difference, and a standard-compliant program shall work correctly on both.
However, in practical usage one needs to do platform-specific things, for example IPC mechanisms, multithreading, file access and so on. Such functionality varies from platform to platform because each platform provides its own. Note that such functionality is not covered by the C language standard, so using it makes the program non-portable across other platforms.
Linux is a platform that can be used to develop programs and applications in languages such as C. The only differences are its supposed simplicity and one's liking for a particular operating system. Otherwise there is no difference in the syntax; it is exactly the same.
There are languages and there are platforms. Popular languages are typically governed by standards (e.g., ANSI). C is a programming language.
Linux, Windows, Android, etc, are platforms (or, specifically, operating systems). Each platform offers a set of libraries (API calls) that you can access to do different things on that platform. System/library calls for file system access, networking, specific windowing/GUI system, etc, can be different on different platforms. So knowing how to "write C on Linux" means you know C and you know a lot of Linux platform calls. Even different windowing systems under Linux can have different API calls.
There are also standards across platforms, such as POSIX, which work to make the library calls the same across different platforms, although this doesn't address most of the disparity between GUI APIs.
The syntax of the C programming language is defined by the ISO C standard. The resulting execution depends on the compiler used to turn the code into an executable program and on the machine on which the compiled program runs (or at least the target architecture it was compiled for). The result of compilation depends on how the compiler interprets the code. If the programmer restricts themselves to conformant C code that avoids implementation-defined and undefined behavior, the resulting executable will behave identically on any platform.
You can then think of roughly three "layers" of C programming: kernel programming, system programming and userspace programming.
Kernel programming is hardware-level programming and usually leverages implementation-defined behavior to interface the hardware world with the software world. Kernels provide a C interface to system programmers. They differ from machine to machine, and the architecture resulting from these implementations defines the difference between operating systems (e.g. Windows vs Linux vs OS X vs the MIT exokernel).
System programmers leverage the kernel's (the system's) API to build the C standard library (they implement the higher-level C standard functionality). For example, glibc and the GNU C compiler (gcc) should conform to the unambiguous sections of the ISO C standard, and they define what happens for implementation-defined AND undefined behavior. That layer is hardware independent (to some extent), since the kernel constitutes a hardware abstraction. But it handles resources through that abstraction layer (e.g. RAM, writing to a file on the hard drive, or sending a stream of data over an internet socket).
Userspace programmers write the programs that use the standard API and the compilers to build "usable" pieces of software such as gnome-terminal or the i3 tiling window manager (I can't think of an example of "user-friendly" C code running under Windows off the top of my head...). Unless such software resorts to implementation-defined or undefined behavior, it should be platform independent.
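For a feel of what "implementation-defined" means in practice, here is a small sketch (function names are made up for this example). The asserts state things every conforming implementation must satisfy; the printed values are the choices left to the compiler and platform, so they can legitimately differ between, say, Linux and Windows.

```c
#include <assert.h>
#include <limits.h>
#include <stdio.h>

/* Returns 1 if plain char is signed on this implementation, 0 otherwise. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Print a few choices the C standard leaves to the implementation. */
void print_implementation_choices(void)
{
    /* Minimum guarantees that hold everywhere. */
    assert(sizeof(char) == 1);
    assert(CHAR_BIT >= 8);
    assert(INT_MAX >= 32767);
    assert(LONG_MAX >= INT_MAX);

    /* The actual values are implementation-defined. */
    printf("CHAR_BIT      = %d\n", CHAR_BIT);
    printf("plain char is %s\n", plain_char_is_signed() ? "signed" : "unsigned");
    printf("sizeof(int)   = %zu\n", sizeof(int));
    printf("sizeof(long)  = %zu\n", sizeof(long));
    printf("sizeof(void*) = %zu\n", sizeof(void *));
}
```

A program that only relies on the asserted minimums stays portable; one that hard-codes the printed values does not.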
The answer is simple: There is no difference!
However each operating system has its own API. This API does not depend on the programming language.
Example: The "MessageBox()" function exists in Windows only, not in Linux. It is a Windows-Specific function (available in any programming language under Windows).
There are also some library functions that are named differently in Linux and in Windows.
One example would be the "stricmp()" function (Windows) that is named "strcasecmp()" under Linux. However this is not an issue of the C programming language but of the libraries (.h files and .so files).
Different operating systems have different APIs (application programming interfaces), exposed through libraries and headers for building application software on that specific OS. GNU/Linux provides headers such as sys/socket.h and sys/types.h (both specified by POSIX) as well as Linux-specific headers under linux/.
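The naming difference is usually papered over with a small preprocessor shim. Below is a minimal sketch: portable_strcasecmp and same_ignoring_case are names invented for this example, and it assumes the Microsoft C runtime spelling _stricmp on Windows and the POSIX strcasecmp from <strings.h> elsewhere.

```c
#include <assert.h>
#include <string.h>

#if defined(_WIN32)
/* Assumption: Microsoft C runtime, which declares _stricmp in <string.h>. */
#define portable_strcasecmp _stricmp
#else
/* POSIX systems declare strcasecmp in <strings.h>. */
#include <strings.h>
#define portable_strcasecmp strcasecmp
#endif

/* Returns nonzero if a and b are equal when letter case is ignored. */
int same_ignoring_case(const char *a, const char *b)
{
    return portable_strcasecmp(a, b) == 0;
}

/* Tiny usage example. */
void demo_case_insensitive(void)
{
    assert(same_ignoring_case("Linux", "LINUX"));
    assert(!same_ignoring_case("Linux", "Windows"));
}
```

The C code calling same_ignoring_case() is identical on both platforms; only the library name behind the macro changes.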

C libraries are distributed along with compilers or directly by the OS?

As per my understanding, C libraries must be distributed along with compilers. For example, GCC must be distributing its own C library and Forte must be distributing its own C library. Is my understanding correct?
But, can a user library compiled with GCC work with Forte C library? If both the C libraries are present in a system, which one will get invoked during run time?
Also, if an application links to multiple libraries, some compiled with GCC and some with Forte, will the libraries compiled with GCC automatically link to the GCC C library, and will it behave likewise for Forte?
GCC comes with libgcc which includes helper functions to do things like long division (or even simpler things like multiplication on CPUs with no multiply instruction). It does not require a specific libc implementation. FreeBSD uses a BSD derived one, glibc is very popular on Linux and there are special ones for embedded systems like avr-libc.
Systems can have many libraries installed (libc and other) and the rules for selecting them vary by OS. If you link statically it's entirely determined at compile time. If you link dynamically there are versioning and path rules which come into play. Generally you cannot mix and match at runtime because of bits of the library (from headers) that got compiled into the executable.
The compile products of two compilers should be compatible if they both follow the ABI for the platform. That's the purpose of defining specific register and calling conventions.
As far as Solaris is concerned, your assumption is incorrect. Being the interface between the kernel and userland, the standard C library is provided with the operating system. That means whatever C compiler you use (Forte/Studio or gcc), the same libc is always used. In any case, the rare ports of the GNU standard C library (glibc) to Solaris are quite limited and probably lack too many features to be usable. http://csclub.uwaterloo.ca/~dtbartle/opensolaris/
None of the other answers (yet) mentions an important feature that promotes interworking between compilers and libraries - the ABI or Application Binary Interface. On Unix-like machines, there is a well documented ABI, and the C compilers on the system all follow the ABI. This allows a great deal of mix'n'match. Normally, you use the system-provided C library, but you can use a replacement version provided with a compiler, or created separately. And normally, you can use a library compiled by one compiler with programs compiled by other compilers.
Sometimes, one compiler uses a runtime support library for some operations - perhaps 64-bit arithmetic routines on a 32-bit machine. If you use a library built with this compiler as part of a program built with another compiler, you may need to link this library. However, I've not seen that as a problem for a long time - with pure C.
Linking C++ is a different matter. There isn't the same degree of interworking between different C++ compilers - they disagree on details of class layout (vtables, etc) and on how exception handling is done, and so on. You have to work harder to create libraries built with one C++ compiler that can be used by others.
Only a few things from the C library are mandatory, in the sense that they must be provided even in a freestanding environment. Such an implementation only has to provide what is necessary for the headers
<float.h>, <iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>
These headers usually don't involve many functions that must be provided.
The other type of environment is called "hosted". As the name indicates, it supposes that there is some entity that "hosts" the running program, usually the OS. So usually the C library is provided by that "hosting environment", but as Ben said, on different systems there may even be alternative implementations.
Forte? That's really old.
The preferred compilers and developer tools for Solaris are all contained in Oracle Solaris Studio.
C/C++/Fortran with a debugger, performance analyzer, and IDE based on NetBeans, and lots of libraries.
http://www.oracle.com/technetwork/server-storage/solarisstudio/index.html
It's (still) free, too.
I think there is a bit of confusion about terms: a library is NOT a DLL or .so. In the strict sense of programming languages, libraries are compiled code that the LINKER merges with our binary (.o). So the linker (or the compiler via some directives...) can manage them, but the OS can't; it simply is NOT a concept related to the OS.
We are used to thinking that OSes are written in C and that we can rebuild the OS using gcc and its libraries or similar, but C is NOT Linux/Unix.
We can also have an OS written in Pascal (Mac OS was, many years ago) AND use libraries with our favorite C compiler, OR have an OS written in ASM (even if not entirely, as in the first Windows version), but we must have C libraries to build an exe.

How does linking to OS C libraries under Windows and Linux work?

I understand Linux ships with a C library, which implements the ISO C functions and system call functions, and that this library is there to be linked against when developing in C. However, different C compilers do not necessarily produce linkable code (e.g. one might pad data structures used in function arguments differently from another). How is the built-in C library meant to be linked to when I could use any compiler to compile my C? Is the story any different for static versus dynamic linking?
Under Windows on the other hand, each compiler provides its own standard library, which solves part of the problem, but system calls are still in a single set of DLLs. How are C applications linked to these DLLs successfully? How about different languages? (The same DLLs can be used by pre-.Net Visual Basic, etc.)
Each platform has some "calling conventions" that each C implementation must adhere to in order to be able to talk to the operating system correctly. For Windows, for example, OS functions (the Win32 API on 32-bit x86) have to be called using the stdcall convention, as opposed to the default C convention of cdecl.
In Linux, since the standard C library (and kernel) is compiled using GCC, any other compilers for Linux must make sure their calling conventions are compatible with the one used by GCC.
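In source code, the calling convention usually shows up as a macro in front of the function name. The sketch below uses a hypothetical OS_CALL macro (similar in spirit to what Windows headers do with WINAPI) and assumes the __stdcall keyword is only meaningful on 32-bit x86 Windows; everywhere else the macro expands to nothing and the platform default is used.

```c
#include <assert.h>

/* Hypothetical macro naming the convention an OS-facing function uses. */
#if defined(_WIN32) && (defined(_M_IX86) || defined(__i386__))
#define OS_CALL __stdcall   /* 32-bit x86 Windows: callee cleans the stack */
#else
#define OS_CALL             /* elsewhere: the platform's single default */
#endif

/* A function pointer type that carries the convention with it. */
typedef int (OS_CALL *os_callback)(int);

/* Example callback declared with the same convention. */
int OS_CALL twice(int x)
{
    return 2 * x;
}

/* The caller and callee agree because both were compiled against OS_CALL. */
int invoke(os_callback cb, int arg)
{
    return cb(arg);
}
```

As long as every compiler on the platform honours the same annotation, invoke(twice, 21) works no matter which compiler built which side.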
Compilers do come with their own implementations of the standard library. It's just that under Linux it's assumed that any compiler will follow the same conventions as the version of GCC that compiled the library.
As for interoperability, it can be easier than you think. There are established calling conventions that allow compilers to produce a valid call to a function, even if the function wasn't compiled with the same software.
As for structures and padding, you'll notice that most frameworks work with opaque types, that is, pointers to structures. Often, the structure's layout isn't even available to clients. As such, clients never work with the actual data, only with pointers to the data, which sidesteps the padding issue.
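Here is a minimal sketch of the opaque-type pattern, squeezed into one file for brevity. The counter API is hypothetical. In a real library the first part would live in a public header and the second part in the library's own source file, so client code never sees the struct members or their padding.

```c
#include <assert.h>
#include <stdlib.h>

/* ---- what a public header would expose (hypothetical counter API) ---- */
typedef struct counter counter;      /* incomplete type: layout is hidden */
counter *counter_new(void);
void     counter_increment(counter *c);
int      counter_value(const counter *c);
void     counter_free(counter *c);

/* ---- what only the library's own source file would see ---- */
struct counter {
    char flag;    /* any padding after this member is invisible to clients */
    int  count;
};

counter *counter_new(void)
{
    counter *c = malloc(sizeof *c);
    if (c != NULL) {
        c->flag = 0;
        c->count = 0;
    }
    return c;
}

void counter_increment(counter *c) { c->count++; }

int counter_value(const counter *c) { return c->count; }

void counter_free(counter *c) { free(c); }
```

A client only ever holds a counter * and passes it back to the library, so it does not matter if the client's compiler would have laid out struct counter differently. FILE * in <stdio.h> is used in much the same way.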
Standards. You'll note that stdlib stuff operates on primitive values and arrays - and the standard for that stuff is pretty explicit on how things are to be done.
