What GNU C extensions are available that aren't trivial to implement in C99? - c

How come the Linux kernel can compile only with GCC? What GNU C extensions are really necessary for some projects and why?

Here's a couple gcc extensions the Linux kernel uses:
inline assembly
gcc builtins, such as __builtin_expect,__builtin_constant,__builtin_return_address
function attributes to specify e.g. what registers to use (e.g. __attribute__((regparm(0)),__attribute__((packed, aligned(PAGE_SIZE))) ) )
specific code depending on gcc predefined macros (e.g. workarounds for certain gcc bugs in certain versions)
ranges in switch cases (case 8 ... 15:)
Here's a few more: http://www.ibm.com/developerworks/linux/library/l-gcc-hacks/
Much of these gcc specifics are very architecture dependent, or is made possible because of how gcc is implemented, and probably do not make sense to be specified by a C standard. Others are just convenient extensions to C. As the Linux kernel is built to rely on these extensions, other compilers have to provide the same extensions as gcc to be able to build the kernel.
It's not that Linux had to rely on these features of gcc, e.g. the NetBSD kernel relies very little on gcc specific stuff.

GCC supports Nested Functions, which are not part of the C99 standard. That said, some analysis is required to see how prevalent they actually are within the linux kernel.

Linux kernel was written to be compiled by GCC, so standard compliance was never an objective for kernel developers.
And if GCC offers some useful extensions that make coding easier or compiled kernel smaller or faster, it was a natural choice to use these extensions.

I guess it's not they are really that necessary. Just there are many useful ones and cross-compiler portability is not that much an issue for Linux kernel to forgo the niceties. Not to mention sheer amount of work it would take to get rid of relying on extensions.

Related

relationship of c compiler and c standard library

I have been doing a lot reading lately about how glibc functions wrap system calls in linux. I am wondering however about the relationship between glibc and the GNU C Compiler.
Lets say for example I wanted to write my own C Standard implementation and write a new library called "newglibc" and I change things just slightly. Like for example I take more checks and actions before and after the system calls. Would I have to write a new compiler? Or would I be able to use the same GNU gcc compiler?
If the compiler is completely separate from the library, then would someone be able to, THEORETICALLY, use the gcc on windows system if they could turn it into a .exe and provide the standard C library that windows provides?
Thank you
The Linux kernel, the GNU C Library ("glibc"), and the GNU Compiler Collection (gcc) are three separate development projects. They are often used all together, but they don't have to be. The ones with "GNU" in their name are offically part of the GNU Project; Linux isn't.
The C standard does not make a distinction between the "compiler" and the "library"; it's all one "implementation" to the committee. It is largely a historical accident that GCC is a separate development project from glibc—but a motivated one: back in the day, each commercial Unix variant shipped with its own C library and compiler, and they were terrible, 90% bugs by volume was typical. GNU got its start providing a less terrible replacement for the compiler (and the shell utilities, which were also terrible).
Replacing the compiler on a traditional commercial Unix is a lot easier than replacing the C library, because the C library isn't just the functions defined in clause 7 of the C standard; as you have noticed, it also provides the lowest-level interface to the kernel, and often that wasn't very well documented. glibc did at one time at least sort-of support a bunch of these Unixes, but nowadays it can only be used with Linux and an experimental kernel called the Hurd. By contrast, GCC supports dozens of different CPUs and kernels, and Linux supports dozens of different CPUs.
If you write your own C library and/or kernel, it is relatively easy to write a "back end" so that GCC can generate code for them as a cross-compiler, and somewhat more difficult to port GCC to run in that environment. You may also need to write a back end for the assembler and linker, which are yet a fourth project ("GNU Binutils"). Porting glibc to a new CPU running Linux is a large but straightforward task; porting glibc to a new operating system is hard, especially if that OS is not Unix-ish. (Windows is decidedly not Unix-ish, so much so that when Microsoft wanted to make it easier to run programs written for Unix under Windows, the path of least resistance was to bolt an in-house clone of the Linux kernel onto the side of the NT kernel. I am not making this up.)
If you write your own C compiler, you will have to make it conform to the expectations of the library and kernel that it is generating code for. A lot of that is documented in the "ABI" specification for the environment you're working in, but not all, unfortunately.
If that doesn't clarify, please let us know what is still unclear.

Crosscompiler Binary compatibility in C

I need to verify something for which I have doubts. If a shared library ( .dll) is written in C, with the C99 standard and compiled under a compiler. Say MinGw. Then in my experience it is binary compatible and hence useable from any other compiler. Say MS Visual Studio. I say in my experience because I have tried it successfully more than once. But I need to verify if this is a rule.
And in addition I would like to ask if it is indeed so, then why libraries written completely in C, like openCV for example don't provide compiled binaries for every different OS? I know that the obvious reason would be to set all the compile-time parameters, but other than that there is none right?
EDIT: I am adding an additional question which I see as a logical extension to the original. Isn't this how one would go and create a closed source library? Since the option of giving source goes out of the window there, giving binaries is the only choice. And in that case providing binaries for as many architectures as possible is the desired result, with C being an obvious choice for having the best portability between systems and compilers. Right?
In the specific case of C compilers (MSVC and GCC/MinGW) in the Windows world, you are correct in the assumption of binary compatibility. One can link a C interface DLL compiled by GCC to a program in Visual Studio. This is the way C99 projects like ffmpeg allow developers to write application wiht Visual Studio. One only needs to create the import library with lib.exe found in the Microsoft toolchain from the DLL. Or vice versa, using mingw.org's pexports or better, mingw-w64's gendef tool, one can create a GCC import lib for a MSVC produced DLL.
This handy interoperability breaks down when you enter the C++ interface world, where the ABI of MSVC and GCC is different and incompatible. It may work, it may not, no guarantees are made and no effort is (currently) being done in changing that. Also, debugging info is obviously different, until someone writes a debug information generator/writer in GCC that is compatible to MSVC's debugger (along with gdb support of course).
I don't think C99 specifically changes anything to function declarations or the way arguments are handled in symbol definitions, so there should be no problem here either.
Note that as Vijay said, there is still the architecture difference, so a x86 library can't be used when linking to an AMD64 library.
To also answer your additional question about closed source binaries and distributing a version for all available compilers/architectures.
This is exactly the way you would create a closed source binary. In addition to the import library, it is also very important to hide exports from the DLL, making the DLL itself useless for linking (if you don't want client code to use private functions in the library, see for example the output of dumpbin /exports on a MSOffice DLL, lots of hidden stuff there). You can achieve the same thing with GCC (I believe, never used or tried it) using things like __attribute(hidden) etc...
Some compiler specific points:
MSVC comes with four (well, actually only three remaining in newer versions) different runtime libraries through /MT, /MD, and /LD. On top of this, you would have to provide a build for each version of Visual Studio (including Service Packs) to assure compatibility. But that is closed source binary and Windows for you...
GCC does not have this problem; MinGW always links to msvcrt.dll provided by Windows (since Windows 98), equivalent with /MD (and maybe also a debug library equivalent with /MDd). But I there are two versions of MinGW (mingw.org and mingw-w64) which do not guarantee binary compatibility. THe latter is more complete as it provides 64-bit options as well as 32-bit, and provides a more complete header/library set (including a substantial part of DirectX and DDK).
The general rule is that IF your OS/CPU combination has a standard ABI, and IF that ABI is powerful enough for your language, most compilers will follow that ABI and as a result will be binary compatible, allowing you to link libraries (shared or static) compiled with different compilers to programs compiled with other compilers just fine.
The problem is that most ABIs are fairly weak -- they're designed around low-level languages like C and FORTRAN and date back to the days before object oriented languages like C++. So they tend to lack support for things like function overloading, user-defined operators, exceptions, global contructors and destructors, virtual functions, inheritance, and such that are needed by C++.
This lack was recognized when C++ was designed which is why C++ has extern "C" -- which causes the compiler to limit itself to the standard ABI for certain functions, while disabling all the extra C++ features that the ABIs generally don't support.
A shared library or dll compiled to a particular architecture can be linked to applications compiled by other compilers that target the same architecture. (By architecture, I mean a processor/OS combination). But it is not practical for a library developer to compile against all possible architectures. Moreover, when a library is distributed in source form, users can build binaries optimized to their specific requirements.

Optimal Windows C tool stack

I am planning a bigger C-only project. It should run on both Linux and Windows. My question is, what is the optimal development stack (compiler, IDE) on Windows? Problem is, we would like to use C99 (if possible).
On Linux it's quite easy because usually combination GCC+VIM+GIT is optimal. But on Windows?
I am concerning: Visual Studio 2010 (no C99 support), MinGW and Intel C++ Compiler.
How do they compare with each other in terms of performance?
For IDE, I've happily used VC++, MonoDevelop, and Code::Blocks (the latter two are cross-platform, as a plus). C isn't a very difficult language to make an IDE for, so most anything will work and it'll just come down to personal preference.
For compiler... C is very quick to compile on all of them, so I guess by performance you mean of generated code?
In my experience, Intel C++ optimizes best if you're targeting Intel CPUs. MinGW GCC optimizes best for everything else. VC++ optimizes very good, but not quite as much as GCC or ICC. This is of course in the general sense only -- I've had plenty experiences where VC++ bests them both.
VC++ compiler can integrate with some non-VC++ IDEs, like Code::Blocks. As you said, lacks C99 support (though, it does have stdint.h).
MinGW integrates with most IDEs, except VC++. Once upon a time its port of libstdc++ lacked support for wchar_t, making Unicode apps very difficult to write. I'm not sure if this has changed.
ICC integrates with the VC++ IDE, some non-VC++ IDEs, as well as supporting C99 and Linux, however it has been shown in the past to deliberately use sub-optimal code when used with non-Intel CPUs -- I'm not sure if this is still the case.
Agner Fog's Optimizing software in C++ provides a decent comparison of compilers, included optimization capabilities.
Well, GCC is GCC. Performance is identical over different OSes, minus ABI and C stdlib differences that may impact performance.
This is an easy problem to solve, use GCC everywhere, meaning MinGW, or the more up to date mingw-w64 (includes 32 and 64-bit compilation capabilities). It provides all (most, meaning 99.9%) of the Win32 API when you need it.
Note that although it's the same compiler, ABI is different for say, Windows x64 vs Linux x64 (in this case: the size of long), and you should ensure the code compiles and works on all platforms you intend to target, regularly.
Using GCC with -pedantic-errors -Wall -Wextra is a nice help for this (if you silence all warnings!), but not perfect.
The Intel compiler will bring better performance, but is only free for personal use on Linux (you can't distribute binaries produced by the free version if I remember correctly), so if you want free tools, that's out. Visual C sucks at C99. It's a C89 compiler, and that isn't going to change soon.
Most development tools are available on Windows as well, see for example msysgit and vim.

C libraries are distributed along with compilers or directly by the OS?

As per my understanding, C libraries must be distributed along with compilers. For example, GCC must be distributing it's own C library and Forte must be distributing it's own C library. Is my understanding correct?
But, can a user library compiled with GCC work with Forte C library? If both the C libraries are present in a system, which one will get invoked during run time?
Also, if an application is linking to multiple libraries some compiled with GCC and some with Forte, will libraries compiled with GCC automatically link to the GCC C library and will it behave likewise for Forte.
GCC comes with libgcc which includes helper functions to do things like long division (or even simpler things like multiplication on CPUs with no multiply instruction). It does not require a specific libc implementation. FreeBSD uses a BSD derived one, glibc is very popular on Linux and there are special ones for embedded systems like avr-libc.
Systems can have many libraries installed (libc and other) and the rules for selecting them vary by OS. If you link statically it's entirely determined at compile time. If you link dynamically there are versioning and path rules which come into play. Generally you cannot mix and match at runtime because of bits of the library (from headers) that got compiled into the executable.
The compile products of two compilers should be compatible if they both follow the ABI for the platform. That's the purpose of defining specific register and calling conventions.
As far as Solaris is concerned, you assumption is incorrect. Being the interface between the kernel and the userland, the standard C library is provided with the operating system. That means whatever C compiler you use (Forte/studio or gcc), the same libc is always used. In any case, the rare ports of the Gnu standard C library (glibc) to Solaris are quite limited and probably lacking too much features to be usable. http://csclub.uwaterloo.ca/~dtbartle/opensolaris/
None of the other answers (yet) mentions an important feature that promotes interworking between compilers and libraries - the ABI or Application Binary Interface. On Unix-like machines, there is a well documented ABI, and the C compilers on the system all follow the ABI. This allows a great deal of mix'n'match. Normally, you use the system-provided C library, but you can use a replacement version provided with a compiler, or created separately. And normally, you can use a library compiled by one compiler with programs compiled by other compilers.
Sometimes, one compiler uses a runtime support library for some operations - perhaps 64-bit arithmetic routines on a 32-bit machine. If you use a library built with this compiler as part of a program built with another compiler, you may need to link this library. However, I've not seen that as a problem for a long time - with pure C.
Linking C++ is a different matter. There isn't the same degree of interworking between different C++ compilers - they disagree on details of class layout (vtables, etc) and on how exception handling is done, and so on. You have to work harder to create libraries built with one C++ compiler that can be used by others.
Only few things of the C library are mandatory in the sense that they are not needed for a freestanding environment. It only has to provide what is necessary for the headers
<float.h>, <iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>
These usually don't implement a lot of functions that must be provided.
The other type of environments are called "hosted" environments. As the name indicated they suppose that there is some entity that "hosts" the running program, usually the OS. So usually the C library is provided by that "hosting environment", but as Ben said, on different systems there may even be alternative implementations.
Forte? That's really old.
The preferred compilers and developer tools for Solaris are all contained in Oracle Solaris Studio.
C/C++/Fortran with a debugger, performance analyzer, and IDE based on NetBeans, and lots of libraries.
http://www.oracle.com/technetwork/server-storage/solarisstudio/index.html
It's (still) free, too.
I think there a is a bit of confusion about terms: a library is NOT DLL's or .so: in the real sense of programming languages, Libraries are compiled code the LINKER will merge with our binary (.o). So the linker (or the compiler via some directives...) can manage them, but OS can't, simply is NOT a concept related to OS.
We are used to think OSes are written in C and we can rebuild the OS using gcc/libraries or similar, but C is NOT linux / unix.
We can also have an OS written in Pascal (Mac OS was in this manner many years ago..) AND use libraries with our favorite C compiler, OR have an OS written in ASM (even if not all, as in first Windows version), but we must have C libraries to build an exe.

Current Standard C Compiler?

I wanted to know what is the current standard C compiler being used by companies. I know of the following compilers and don't understand which one to use for learning purposes.
Turbo C
Borland C
GCC
DJGPP
I am learning C right now and referring to the K&R book.
Can anyone please guide me to which compiler to use?
GCC would be the standard, best supported and fastest open source compiler used by most (sane) people.
GCC is going to have the best support of the choices you've listed for the simple reason that it comes standard in GNU and is the target of Linux. It's very unlikely any organization would use the other three beyond possibly supporting some horrible legacy application.
Other C compilers you might look into include:
Clang: an up-and-comer, particularly for BSD and Mac OS X
Visual Studio Express: for Windows programming
Intel Compiler Suite: very high performance; costs money
Portland Group: another high-performance commercial compiler; used typically for supercomputers
PathScale: yet another commercial high-performance compiler
If you are starting to learn the language, Clang's much better diagnostics will help you.
To make your (job) applications tools section look better, GCC (and maybe Visual Studio) are good to have knowledge of.
GCC (which I use in those rare moments when I use C) or ICC (Intel C Compiler), though ICC is known for making code that runs slowly on AMD processors.
Depends on the platform you are using and planning to learn on or will do future development.
On Windows you can use Visual Studio Express C++ which supports standard ANSI C usage. Option two is Cygwin which is a library and tool set that replicates much of what you would use on Linux or other Unix style OS's ( it uses GCC ).
On the Mac you would want XCode which is the standard development tools including C compiler ( based on GCC ).
On many Unix type systems it will be cc or gcc depending on the OS vendor.
If you have the money some of the paid compilers like the Intel one are exceptional but likely won't be much help in learning the programming craft at this point.
If you use LINUX operating system GCC is the best compiler. You can separate each compiler steps like preprocessing , assembler , linker separately in GCC compiler using some command line options. You can analyze step by step of compilation of your C source code easily. I suggest to go for "GNU C COMPILER(GCC)". You can use "CC" command, its nothing but a symbolic link to GCC.
I can recommend OpenWatcom which was once used to develop Netware. Only supports IA-32 but does it well. Contains a basic IDE and a basic but competent profiler. Something for the real programmer :)
Then there is Pelles C which supports x86-64. It has a basic VC-like IDE but few support programs.
I like these two because the compilers are competent and you get going quickly without having to pore over manuals and wondering what the options mean.
If you are on windows use MinGW or like most have suggested ggo with GCC on Linux
Though ofcourse since it's commandline so you might find Dev-C++ and/or Code::Blocks, Eclipse CDT etc which are IDEs useful for you to make your job simpler.
There is no standard and each compiler and it's libraries differ from one another.
gcc is best and free. GO FOR GNU!

Resources