C - What are C implementations?

Could you give me an explanation of what exactly an implementation is, taking inspiration from the three highlighted expressions?
From "C Primer Plus" > Language Standards
Currently, many C implementations are available. Ideally, when you
write a C program, it should work the same on any implementation,
providing it doesn’t use machine-specific programming. For this to be
true in practice, different implementations need to conform to a
recognized standard.

A standard-conforming C implementation consists of a compiler that translates translation units as mandated by the standard, an implementation of the standard library providing all functions required by the standard, and something (normally a linker) that puts everything together to build an executable file. In fact, the implementation also includes all the software required to run the produced executable.
We commonly speak of compilers (gcc, clang, msvc) when we should really speak of C development environments. And within each vendor's system you may have several implementations, because gcc or clang, for example, can generate executables for targets with different integer sizes (32 or 64 bits) and possibly different endiannesses. Each such configuration then constitutes a distinct implementation.
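A quick way to see this is to build the same translation unit under two configurations of the same compiler. The sketch below assumes an x86-64 Linux gcc that supports the -m32 and -m64 options; the printed sizes differ between the two builds, which is exactly what makes each configuration a different implementation.

#include <stdio.h>

/* Build twice (assumed gcc on x86-64 Linux):
 *   gcc -m64 sizes.c -o sizes64    typical output: int=4 long=8 ptr=8
 *   gcc -m32 sizes.c -o sizes32    typical output: int=4 long=4 ptr=4
 * The exact numbers are implementation-defined; that is the point. */
int main(void)
{
    printf("int=%zu long=%zu ptr=%zu\n",
           sizeof(int), sizeof(long), sizeof(void *));
    return 0;
}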
To be more exhaustive, it should be noted that support for the standard library may be optional in a so-called freestanding execution environment (as opposed to a hosted execution environment). In the real world, freestanding mode is used for kernel development, because the kernel must be able to start before the functions of the standard library are available. Otherwise we would have a chicken-and-egg problem if the kernel required functions that it can only provide once it is fully loaded...
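As an illustration, here is a minimal sketch of a freestanding translation unit; the entry-point name kmain, the build flags and the memory-mapped port address are assumptions made up for this example, and a real system would also need startup code and a linker script.

/* Hypothetical freestanding module: no standard library functions,
 * only headers a freestanding implementation must provide.
 * Possible build (gcc-style, assumed): gcc -ffreestanding -nostdlib -c kmain.c */
#include <stdint.h>
#include <stddef.h>

/* Hypothetical memory-mapped debug output port, purely for illustration. */
static volatile uint8_t *const DEBUG_PORT = (uint8_t *)0x10000000u;

void kmain(void)
{
    static const char msg[] = "kernel up";
    for (size_t i = 0; i < sizeof msg - 1; ++i)
        *DEBUG_PORT = (uint8_t)msg[i];
    for (;;)
        ;   /* never return: there is no hosted environment to return to */
}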
References: The draft n1570 for C11 defines an implementation as:
3.12 implementation
particular set of software, running in a particular translation environment under particular control options, that performs translation of programs for, and supports execution of functions in, a particular execution environment

C is typically implemented by a C compiler. In that paragraph you could simply read "compiler" for "implementation".
A reason for the more generic term "implementation" is that a compiler might contain multiple front ends or be split into several tools, and forms other than a compiler (an interpreter, say) are theoretically conceivable.
The C standard is defined against an "abstract machine" which behaves as the standard describes but carries no other requirements; the mapping to real machines is left as an exercise for the implementor.

An implementation is a concrete realization of a particular specification.
For example, the specification could be the ISO international standard, and an implementation could be gcc or clang: a compiler that implements the standard's specification, coupled with an implementation of the standard library.

Related

I want to know about C Library Configuration

The C Standard Library is independent of any operating system and hardware platform.
So, why use the input/output functions from the standard library?
Unix-specific POSIX system calls exist. Windows-specific input/output system calls exist.
Don't standard library functions eventually call system calls internally? Is this just for portability?
The API presented by the C standard library is uniform between operating systems, well, uniform provided all the "unspecified" parts of C roughly align (like the size of int).
The implementation of the C standard library is not independent of the operating system. Basically, the implementation consists of the compiled code that provides the API, and that compiled code matches the CPU's instruction set and possibly other hardware specifics such as bus width and supporting chipsets.
So, programming against the C API helps your program be "more portable", but that doesn't mean that any specific implementation of the C API is portable. Finally, there are lots of small details that aren't specified, or are allowed to vary between platforms (like byte order, the size of int, and so on). So even a program written against the standard C API might not work correctly on another machine unless it accommodates the parts of the C API that may differ between platforms.
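As a sketch of what "accommodating the differences" can look like, the following queries a few implementation-defined properties instead of hard-coding them; the byte-order probe is a common runtime idiom, not anything mandated by the standard.

#include <stdio.h>
#include <limits.h>
#include <stdint.h>

int main(void)
{
    /* Query properties instead of assuming them. */
    printf("CHAR_BIT    = %d\n", CHAR_BIT);
    printf("sizeof(int) = %zu\n", sizeof(int));
    printf("INT_MAX     = %d\n", INT_MAX);

    /* Common runtime byte-order probe: inspect the first byte of a
     * known multi-byte value. */
    uint32_t probe = 0x01020304u;
    unsigned char first = *(unsigned char *)&probe;
    printf("byte order  = %s\n",
           first == 0x04 ? "little-endian" :
           first == 0x01 ? "big-endian" : "other/unusual");
    return 0;
}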
POSIX is basically a standard that eventually became incorporated into most C development environments. It is designed to provide a single API to program against for multiple UNIX platforms for items that lie outside of the core C language. There are POSIX implementations for Windows too, but Microsoft's historical offerings are notorious for not actually working correctly.
Yes, these APIs (if available) are implemented with code that eventually performs operating-system-specific calls, and is shipped as machine code that is very specific to the CPU instruction set. There are dozens of CPUs out there, and each major platform has its own matching compiler and matching C API libraries, where the C language is available at all.
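As an illustration of that layering, here is a hedged sketch of how a C library routine might ultimately funnel into an OS-specific call; the use of the POSIX write() function assumes a Unix-like hosted environment, and real libraries add buffering, locking and error handling that this omits.

/* Illustrative only: a toy "puts"-like routine layered on a POSIX call.
 * Assumes a Unix-like system providing <unistd.h>; a Windows C library
 * would route the same request through a Win32 call instead. */
#include <string.h>
#include <unistd.h>     /* write(), STDOUT_FILENO: POSIX, not ISO C */

int toy_puts(const char *s)
{
    /* A real implementation buffers; this writes straight through. */
    if (write(STDOUT_FILENO, s, strlen(s)) < 0)
        return -1;
    if (write(STDOUT_FILENO, "\n", 1) < 0)
        return -1;
    return 0;
}

int main(void)
{
    return toy_puts("hello via a system call") == 0 ? 0 : 1;
}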
The C language and API are there for portability, but portability isn't their primary reason for existing (and there are lots of small corner cases where the same code isn't portable across all platforms unless it is written a certain way). The primary reason is that if the language features weren't consistently available across all platforms, you wouldn't have "one C language" usable on multiple machines; you would have many C-like languages, where every supported feature would have to be checked, meaning you might know C on your development platform but not know C on another platform.
As for the libraries, many libraries might be absent on a typical machine, and when developing you generally have to use a dependency checker to ensure the library (and sometimes the specific functions within it) is present before the machine can be used for development. Autoconf, for example, has m4 macros that can be configured to check whether a library is present before compiling the programs.

No library C implementation

I've heard, while looking at different C implementations, that any system that hopes to implement C must minimally include certain libraries, stdarg.h etc. My question is why this is. It can't be that the C language is not Turing complete without some headers, and since the headers have been written, it must be true that I could write them myself. Why, then, is it not permissible to have a C implementation consisting of just a compiler+linker toolchain? (Of course, in this case interacting with the OS would require inline assembly or linked assembly code, as well as knowledge of the system's syscalls etc., but that doesn't mean that C can't be written, does it?)
You are confusing a property of the programming language itself with additional features mandated by the standard.
"Turing complete" is just about the language itself; basically if you can use it to solve a certain class of problems (for a more exact definition, please see Wikipedia for a starter(!) ). That is quite an abstract concept and does not include any libraries. Basically, if you use such libraries, you just have to be able to write those libraries in the language itself. This is true for the C language.
About the libraries required: your premise is wrong. C very well allows omitting the library functions themselves. That is the difference between a hosted implementation (full library) and a freestanding one (a few target-specific headers, but no library code). See C11 4p6.
The few required headers are normally part of the compiler itself. They basically provide some typedefs and #defined constants, e.g. the ranges of the integer types (limits.h) and types of guaranteed minimum width (stdint.h, which often also provides fixed-width types). stddef.h, for example, provides size_t and NULL.
While you do not need to use those headers, they already allow writing portable code for the program logic. Just see them as part of the language itself, tailored to the target.
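A small sketch of the kind of portable program logic those headers enable, using nothing beyond <stdint.h>, <stddef.h> and <limits.h>; the helper names are made up for illustration.

/* Uses only headers a freestanding implementation must provide. */
#include <stdint.h>
#include <stddef.h>
#include <limits.h>

/* Illustrative helper: saturating add on a fixed-width type. */
uint16_t sat_add_u16(uint16_t a, uint16_t b)
{
    uint32_t sum = (uint32_t)a + b;
    return sum > UINT16_MAX ? UINT16_MAX : (uint16_t)sum;
}

/* Illustrative helper: count bytes in a buffer with the top bit set. */
size_t count_high_bits(const unsigned char *buf, size_t n)
{
    size_t hits = 0;
    for (size_t i = 0; i < n; ++i)
        if (buf[i] & (1u << (CHAR_BIT - 1)))
            ++hits;
    return hits;
}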
The gcc C compiler, for instance, is actually a freestanding implementation: it only provides the required headers, not the standard library. Instead, it relies on the system library, which is e.g. glibc on Linux.
Note: generally it is a bad idea to re-invent the wheel. So if you are on a hosted environment (i.e. a full-grown OS), you should use the features available. Otherwise you might run into trouble, as these might provide additional facilities not directly seen by your code, e.g. debugging hooks or system/user-wide configuration like localisation support. Also, debugging tools such as valgrind may depend on you using the standard library; replacing memory allocation with your own code, at the very least, makes that much more difficult.
Not to mention maintainability. Not only will others understand your code more easily if you use the standard names and semantics, but so will you: just wait a few years and try to understand your old code.
OTOH, if you are on a bare-metal embedded system, there is actually little use for most features of the standard library. Including e.g. printf or scanf just bloats your firmware, often without any actual benefit. For such systems there are stripped-down libraries (e.g. newlib) which may not be completely compliant or which allow you to omit certain costly features, e.g. floating-point conversion or the math library. Still, you should only use them if you really need many of their features. And sometimes there is a middle way, but that requires some knowledge of the library's dependencies.
Two reasons: compatibility and system interaction.
If you don't implement the whole C standard library, then code other people write won't work. Even the most basic C program uses library calls.
#include <stdio.h>

int main() {
    printf("Hello world!\n");
    return 0;
}
Without an agreed-upon and fully implemented stdio.h, that program will not build and run, because the compiler doesn't know what printf() means.
Then there's system interaction. C has been called "portable assembly". This is because different computing environments do things differently, but C takes care of that for you (well, some of it). You can't write a portable stdio.h in assembly without losing your mind. But it's more than that: each C header file protects you from something that each environment does (or used to do) very differently.
stdlib.h shields you from differing memory models and process control.
stdio.h shields you from differing IO systems.
math.h shields you from differing floating point implementations.
limits.h shields you from differing data sizes.
locale.h shields you from differing locales.
And so on...
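To make the list above concrete, here is a short sketch that leans on several of those headers at once; on every hosted implementation it behaves the same, even though the underlying memory model, floating-point hardware and locale machinery differ.

#include <stdio.h>
#include <stdlib.h>   /* memory model: malloc/free */
#include <math.h>     /* floating point: sqrt */
#include <limits.h>   /* data sizes: INT_MAX */
#include <locale.h>   /* locales: setlocale */

int main(void)
{
    setlocale(LC_ALL, "C");              /* same behaviour everywhere */

    double *v = malloc(3 * sizeof *v);   /* no platform memory details */
    if (!v)
        return EXIT_FAILURE;
    v[0] = 2.0;
    v[1] = sqrt(v[0]);                   /* no FPU-specific code */
    v[2] = (double)INT_MAX;              /* size queried, not assumed */

    printf("sqrt(2) = %.6f, INT_MAX = %.0f\n", v[1], v[2]);
    free(v);
    return EXIT_SUCCESS;
}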
The C libraries provide a standard API that each environment can write to. When C is ported to a new environment, that environment is responsible for implementing those libraries according to the particulars of that system. You don't have to do that.
Nowadays we live in a much more homogeneous environment than when C was developed, but most of the C standard library still protects you from basic differences in how operating systems and hardware do things.
It wasn't always this way. An example that comes to mind is the hell of running a game on DOS. There was no standard interface to the sound or video card (if you had them). Each program had to ship drivers for every sound and video card it supported. If yours wasn't in there, sorry. If their driver was buggy, sorry.
Programming without the C standard library is kind of like that, but far far worse.

C headers: compiler specific vs library specific?

Is there some clear-cut distinction between the standard C *.h header files that are provided by the C compiler, as opposed to those which are provided by a standard C library? Is there some list, or some standard locations?
Motivation: in this answer I got a while ago, regarding a missing unistd.h in the latest TinyC compiler, the author argued that unistd.h (unlike sys/unistd.h) should not be provided by the compiler but by your C library.
I could not make much sense of that response (for one thing, shouldn't that also apply to, say, stdio.h?), but I'm still wondering about it. Is that correct? Is there some authoritative reference for this?
Looking in other compilers, I see that other "self contained" POSIX C compilers that are hosted in Windows (like the GCC toolchain that comes with MinGW, in several incarnations; or Digital Mars compiler), include all header files.
And in a standard Linux distribution (say, Centos 5.10) I see that the gcc package provides a few header files (eg, stdbool.h, syslimits.h) in /usr/lib/gcc/i386-redhat-linux/4.1.1/include/, and the glibc-headers package provides the majority of the headers in /usr/include/ (including stdio.h, /usr/include/unistd.h and /usr/include/sys/unistd.h).
So in neither case do I see support for the above claim.
No, there is no clear-cut distinction.
As far as the C standard is concerned (here's a recent draft), the compiler and the library together make up the implementation, distinguished mostly by being described in sections 6 and 7 of the standard, respectively.
For some implementations, the compiler and the runtime library are provided by the same vendor/organization/person, either as a single installable package or as two separate packages. For other implementations (including gcc), the bulk of the standard library is provided by the underlying operating system, but the installation package for the compiler includes a few of its own headers.
Another example: when you install gcc from source on Solaris, the installer runs a script that grabs copies of some of the existing header files (provided by Sun's, now Oracle's, runtime library) and edits them, installing the modified copies in a separate directory.
On GNU/Linux systems, the default C compiler is usually gcc, and the runtime library is provided by glibc -- both GNU packages, but developed separately. The MinGW implementation under Windows uses the gcc compiler with Microsoft's runtime library (which leads to some problems because they disagree on the representation of long double).
The choice of which standard headers need to be provided by the compiler is made by the authors of the compiler. Headers whose implementation is tightly tied to a particular compiler (such as <stdint.h>, <limits.h>, and <float.h>) are typically provided by the compiler; headers that provide an interface to operating system services, like <stdio.h> and <stdlib.h> are typically provided by the runtime library or perhaps by the OS.
The C standard provides no direct guidance regarding how this choice should be made.
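As a rough illustration of the split, the sketch below prints values that come from compiler-tied headers next to a call that has to reach library code; which package installs which header varies by implementation, as noted above.

#include <limits.h>   /* typically shipped with the compiler */
#include <float.h>    /* typically shipped with the compiler */
#include <stdio.h>    /* typically shipped with the C runtime library */

int main(void)
{
    /* These are pure compile-time constants baked in by the compiler. */
    printf("INT_MAX = %d, DBL_DIG = %d\n", INT_MAX, DBL_DIG);

    /* This call needs actual library code (and, below it, the OS). */
    fputs("printed via the runtime library\n", stdout);
    return 0;
}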
Except for embedded (freestanding) C implementations, it makes little sense to separate C into compiler and libraries. Neither a compiler without a library, nor a C library without a compiler, makes much sense. Only a compiler together with a library makes up a complete implementation.
Since C89, the standard library is part of the C standard, and the list of required header files is listed in the standard. Other sets of libraries are standardized by Posix, X/Open ...
See this answer for a list: List of standard header files in C and C++
There are some headers that are by nature closer to the compiler itself, e.g. limits.h, which specifies the sizes of the data types. Some headers are closer to the OS, e.g. unistd.h. In both cases, however, there are intersections, and if the OS's idea of, say, size_t and the compiler's implementation do not agree, nothing will work.
One could also argue that unistd.h should be provided by the OS and not by the C library - after all, the knowledge how to call a kernel function belongs into the OS.
In summary, I think this distinction into compiler, C library, operating system makes little sense.
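When you do need to pin down such assumptions, C11's _Static_assert lets the build fail rather than the program misbehave at runtime; the particular conditions below are only examples of the kind of agreement one might want to check.

#include <stddef.h>
#include <stdint.h>
#include <limits.h>

/* Example compile-time checks of assumptions the rest of the code relies on.
 * If the toolchain/OS combination violates one, compilation stops here. */
_Static_assert(CHAR_BIT == 8, "this code assumes 8-bit bytes");
_Static_assert(sizeof(size_t) >= sizeof(uint32_t),
               "size_t too small for the buffer sizes used here");
_Static_assert(sizeof(void *) == sizeof(uintptr_t),
               "pointer/uintptr_t round-trip of equal size assumed");

int main(void) { return 0; }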

What is the difference between the C programming language and C programming under linux?

What is the difference between the C programming language and C programming under Linux?
Is the syntax the same in both of them?
Or is the difference only when you execute the program?
The C language is governed by the ISO-approved C standard, and it does not take into account the underlying platform on which you use C. So from the perspective of the language standard there is no difference, and a standard-compliant program shall work correctly on both.
However, in practical usage one needs to do platform-specific things, for example IPC mechanisms, multithreading, file access and so on; such functionality will vary from platform to platform because each provides functionality specific to itself. Note that such functionality is not covered by the C language standard, so using it makes the program non-portable across other platforms.
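A common pattern for keeping such code contained is to isolate the platform-specific part behind a small wrapper; the sketch below assumes the usual _WIN32 predefined macro and POSIX nanosleep(), and the wrapper name is made up.

/* Hypothetical wrapper isolating one platform-specific operation:
 * putting the calling thread to sleep for a number of milliseconds. */
#ifdef _WIN32
#include <windows.h>

void portable_sleep_ms(unsigned ms)
{
    Sleep(ms);               /* Win32 API, not ISO C */
}
#else
#include <time.h>

void portable_sleep_ms(unsigned ms)
{
    struct timespec ts = { ms / 1000, (long)(ms % 1000) * 1000000L };
    nanosleep(&ts, NULL);    /* POSIX, not ISO C */
}
#endif

int main(void)
{
    portable_sleep_ms(100);  /* the rest of the program stays portable */
    return 0;
}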
Linux is a platform that can be used for the development of programs and applications using languages such as C. What it offers is mainly simplicity and, for some, a preference for that particular operating system. Otherwise there is no difference in the syntax. It is absolutely the same.
There are languages and there are platforms. Popular languages are typically governed by standards (e.g., ANSI). C is a programming language.
Linux, Windows, Android, etc, are platforms (or, specifically, operating systems). Each platform offers a set of libraries (API calls) that you can access to do different things on that platform. System/library calls for file system access, networking, specific windowing/GUI system, etc, can be different on different platforms. So knowing how to "write C on Linux" means you know C and you know a lot of Linux platform calls. Even different windowing systems under Linux can have different API calls.
There are also standards that span platforms, such as POSIX, which work to make the library calls the same across different platforms, although this doesn't deal with most of the disparity between GUI APIs.
The C programming syntax is defined by the ISO C standard. The resulting execution depends on the compiler used to turn code into an executable program and the machine on which the compiled program runs (or at least the target architecture it is built for). The results of that compilation depend on the use of the programming syntax (the code) and on the compiler's interpretation of that code. If the programmer restricts his programming habits to writing conformant C code, excluding implementation-defined or undefined behavior, the resulting executable will behave identically on any platform.
You can then think of it as roughly three "layers" of C programming: kernel programming, system programming and userspace programming.
Kernel programming is hardware-level programming and usually leverages implementation-defined behavior to interface the hardware world to the software world. Kernels provide a C interface to system programmers. They differ from machine to machine, and the architecture resulting from these implementations defines the differences between the various OSes (e.g. Windows vs Linux vs OS X vs the MIT exokernel, etc.).
System programmers leverage the kernel's (the system's) API to build the C standard library (they define the implementation of higher-level C standard functionality). E.g. glibc and the GNU C compiler (gcc) should be ISO C conformant for the unambiguous sections of the C standard, and they define the implementation of implementation-defined AND undefined behavior. That layer of implementation is hardware-independent (to some extent), since the kernel level constitutes a hardware abstraction, but it manages resources through that abstraction layer (e.g. RAM, writing to a file on the hard drive, or sending a stream of data over an Internet socket).
Userspace programmers write the programs that use the standard API and the compilers to build "usable" pieces of software such as gnome-terminal or the i3 tiling window manager (I can't think of an example of a "user-friendly" C program running under Windows off the top of my head...). Unless such software resorts to implementation-defined or undefined behavior, it should be platform-independent.
The answer is simple: There is no difference!
However each operating system has its own API. This API does not depend on the programming language.
Example: The "MessageBox()" function exists in Windows only, not in Linux. It is a Windows-Specific function (available in any programming language under Windows).
There are also some library functions that are named differently in Linux and in Windows.
One example would be the "stricmp()" function (Windows), which is named "strcasecmp()" under Linux. However, this is not an issue of the C programming language but of the libraries (.h files and .so files).
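A typical way to paper over such naming differences is a one-line portability shim; the macro below follows that common pattern, with _WIN32 assumed as the platform test.

#include <stdio.h>

/* Map the case-insensitive compare to whatever the platform provides. */
#ifdef _WIN32
#include <string.h>
#define compare_nocase _stricmp    /* Microsoft CRT name */
#else
#include <strings.h>
#define compare_nocase strcasecmp  /* POSIX name */
#endif

int main(void)
{
    const char *a = "Hello";
    const char *b = "hello";
    printf("equal ignoring case: %s\n",
           compare_nocase(a, b) == 0 ? "yes" : "no");
    return 0;
}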
Different operating systems will have different APIs (application programming interfaces), i.e. libraries built for writing application software for your specific OS. GNU/Linux has headers specific to it such as sys/socket.h, linux.h, sys/types.h, etc.

Are C libraries distributed along with compilers or directly by the OS?

As per my understanding, C libraries must be distributed along with compilers. For example, GCC must distribute its own C library and Forte must distribute its own C library. Is my understanding correct?
But can a user library compiled with GCC work with Forte's C library? If both C libraries are present on a system, which one will get invoked at run time?
Also, if an application links to multiple libraries, some compiled with GCC and some with Forte, will the libraries compiled with GCC automatically link to the GCC C library, and will it behave likewise for Forte?
GCC comes with libgcc, which includes helper functions to do things like long division (or even simpler things like multiplication on CPUs with no multiply instruction). It does not require a specific libc implementation. FreeBSD uses a BSD-derived one, glibc is very popular on Linux, and there are special ones for embedded systems, like avr-libc.
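To see where such helper routines come in, consider 64-bit division in a program built for a 32-bit target; there the compiler typically emits a call to a libgcc support routine (commonly named __udivdi3) rather than a single divide instruction. The code itself is plain C.

#include <stdio.h>
#include <stdint.h>

/* On a 32-bit target (e.g. built with gcc -m32) this division is usually
 * lowered to a call into the compiler's support library (libgcc for gcc),
 * because the CPU has no instruction for 64-by-64-bit division.
 * On a 64-bit target it compiles to an ordinary divide instruction. */
uint64_t ticks_to_seconds(uint64_t ticks, uint64_t ticks_per_second)
{
    return ticks / ticks_per_second;
}

int main(void)
{
    printf("%llu\n",
           (unsigned long long)ticks_to_seconds(123456789012345ull, 1000000ull));
    return 0;
}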
Systems can have many libraries installed (libc and other) and the rules for selecting them vary by OS. If you link statically it's entirely determined at compile time. If you link dynamically there are versioning and path rules which come into play. Generally you cannot mix and match at runtime because of bits of the library (from headers) that got compiled into the executable.
The compile products of two compilers should be compatible if they both follow the ABI for the platform. That's the purpose of defining specific register and calling conventions.
As far as Solaris is concerned, your assumption is incorrect. Being the interface between the kernel and userland, the standard C library is provided with the operating system. That means whatever C compiler you use (Forte/Studio or gcc), the same libc is always used. In any case, the rare ports of the GNU standard C library (glibc) to Solaris are quite limited and probably lack too many features to be usable. http://csclub.uwaterloo.ca/~dtbartle/opensolaris/
None of the other answers (yet) mentions an important feature that promotes interworking between compilers and libraries - the ABI or Application Binary Interface. On Unix-like machines, there is a well documented ABI, and the C compilers on the system all follow the ABI. This allows a great deal of mix'n'match. Normally, you use the system-provided C library, but you can use a replacement version provided with a compiler, or created separately. And normally, you can use a library compiled by one compiler with programs compiled by other compilers.
Sometimes, one compiler uses a runtime support library for some operations - perhaps 64-bit arithmetic routines on a 32-bit machine. If you use a library built with this compiler as part of a program built with another compiler, you may need to link this library. However, I've not seen that as a problem for a long time - with pure C.
Linking C++ is a different matter. There isn't the same degree of interworking between different C++ compilers - they disagree on details of class layout (vtables, etc) and on how exception handling is done, and so on. You have to work harder to create libraries built with one C++ compiler that can be used by others.
Only a few parts of the C library are mandatory, in the sense that a freestanding environment can do without the rest. A freestanding implementation only has to provide what is necessary for the headers
<float.h>, <iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>
and these usually don't require many functions to be provided.
The other type of environment is called a "hosted" environment. As the name indicates, it supposes that there is some entity that "hosts" the running program, usually the OS. So usually the C library is provided by that hosting environment, but as Ben said, on different systems there may even be alternative implementations.
Forte? That's really old.
The preferred compilers and developer tools for Solaris are all contained in Oracle Solaris Studio.
C/C++/Fortran with a debugger, performance analyzer, and IDE based on NetBeans, and lots of libraries.
http://www.oracle.com/technetwork/server-storage/solarisstudio/index.html
It's (still) free, too.
I think there is a bit of confusion about terms: a library is NOT just DLLs or .so files. In the true programming-language sense, libraries are compiled code the LINKER will merge with our binary (.o). So the linker (or the compiler via some directives...) can manage them, but the OS can't; it is simply NOT a concept tied to the OS.
We are used to thinking that OSes are written in C and that we can rebuild the OS using gcc and its libraries or similar, but C is NOT Linux/Unix.
We can also have an OS written in Pascal (the classic Mac OS was like this many years ago...) AND use libraries with our favorite C compiler, OR have an OS written largely in ASM (as in the first Windows versions), but we must have C libraries to build an executable.
