No library C implementation - c

I've heard while looking at different C implementations that any system that hopes to implement C must minimally include certain libraries, stdarg.h etc. My question is why this is, it can't be that the C library is not Turing complete without some headers, and since the headers have been written it must be true that I could write them myself. Why, then, is it not permissible to have a C implementation consisting of just a compiler+linker toolchain? (of course, in this case interacting with the OS would require inline assembly or linked assembly code as well as knowledge of the system's syscalls etc., but that doesn't mean that C can't be written, does it?)

You confuse a property of the programming language, i.e. the language itself with additional features mandated by the standard.
"Turing complete" is just about the language itself; basically if you can use it to solve a certain class of problems (for a more exact definition, please see Wikipedia for a starter(!) ). That is quite an abstract concept and does not include any libraries. Basically, if you use such libraries, you just have to be able to write those libraries in the language itself. This is true for the C language.
About the libraries required: Your premise is wrong. C very well allows to omit the libraries themselves. That is the difference between a hosted (full libraries) and a freestanding (few target-specific headers, but no generated code). See 4p6.
The few headers are normally part of the compiler itself. They basically provide some typedefs and #defined constants, e.g. the range of the integer types (limits.h) and types of guaranteed minimum width (stdint.h, often also fixed-width types). stddef.h e.g. provides size_t and NULL.
While you do not need to use those headers, they already allow writing portable code for the program logic. Just see them as part of the language itself, tailored to the target.
The gcc C compiler, for instance actually is a freestanding implementation: It only provides the required headers, but not the standard library. Instead, it relies on the system library, which is e.g. glibc on Linux.
Note: Generally it is a bad idea to re-invent the wheel. So if you are on a hosted environment (i.e. full-grown OS), you should use the features available. Otherwise you might run into trouble, as these e.g. mightr provide additional functions not directly seen by your code. E.g. debugging or system/user-wide configuration like localisation support. Also debugging support might depend on you using the standard library, e.g. valgrind. Replacing memory allocation with your own code at least makes this much more difficult.
Not to mention maintainability. Not just others will understand your code easier if you use the standard names&semantics, but also yourself - just wait some years and try understanding your old code.
OTOH, if you are on a bare-metal embedded system, there is actually little use of most features the standard library. Including e.g. printf or scnaf just bloats your firmware, often without any actual use. For such systems, there are stripped-down libraries (e.g. newlib) which may be not completely compliant or allow to omit certain costly features, e.g. floating point conversion or the math lib. Still you only should use them iff you really need many of their features. And sometimes there is a middle way, but that requires some knowledge about the dependencies of the library.

Two reasons: compatibility and system interaction.
If you don't implement the whole C standard library, then code other people write won't work. Even the most basic C program uses library calls.
#include <stdio.h>
int main() {
printf("Hello world!\n");
return 0;
}
Without an agreed upon and fully implemented stdio.h that program will not run because the compiler doesn't know what printf() means.
Then there's system interaction. C has been called "portable assembly". This is because different computing environments do things differently, but C takes are of that for you (well, some of it). You can't write a portable stdio.h in assembly without losing your mind. But its more than that. Each C header file protects you from something that each environment does (or used to) do very differently.
stdlib.h shields you from differing memory models and process control.
stdio.h shields you from differing IO systems.
math.h shields you from differing floating point implementations.
limits.h shields you from differing data sizes.
locale.h shields you from differing locales.
And so on...
The C libraries provide a standard API that each environment can write to. When C is ported to a new environment, that environment is responsible for implementing those libraries according to the particulars of that system. You don't have to do that.
Nowadays we live in a much more homogeneous environment than when C was developed, but most of the C standard library still protects you from basic differences in how operating systems and hardware do things.
It didn't always used to be this way. An example that comes to mind is the hell of running a game on DOS. There was no standard interface to the sound and video card (if you had them). Each program had to ship drivers for each sound and video card they supported. If yours wasn't in there, sorry. If their driver was buggy, sorry.
Programming without the C standard library is kind of like that, but far far worse.

Related

I want to know about C Library Configuration

The C Standard Library is independent of any operating system and system.
So, why use the input/output functions from the standard library?
Unix-specific POSIX system calls exist. Windows-specific input/output system calls exist.
Don't standard library functions eventually call system calls internally? Is this just for portability?
The API presented by the C standard library is uniform between operating systems, well, uniform provided all the "unspecified" parts of C roughly align (like the size of int).
The implementation of the C standard library is not independent of the operating system. Basically the implementation consists of the compiled source code the provides the API, and that compiled code matches the CPU / machine instruction sets, and possibly other items specific to the hardware bus width, supporting chip sets and other actual hardware details.
So, programming against the C API helps your program be "more portable" but that doesn't mean that any specific implementation of the C API is portable. Finally, there are lots of small details that aren't specified in detail, or are allowed to vary between platforms (like byte order, size of int, and so on). So even a program written against the standard C API might not work correctly on another machine, unless you write code that accommodates and reacts to the parts of the C API that might differ between platforms.
POSIX is basically a standard that eventually became incorporated into most C development environments. It is designed to provide a single API to program against for multiple UNIX platforms for items that lie outside of the core C language. There are POSIX implementations for Windows too, but Microsoft's historical offerings are notorious for not actually working correctly.
Yes, these APIs (if available) are implemented with code that eventually performs operating specific calls, and is presented in "machine code" that is very specific to the CPU instruction set. There are dozens of CPUs out there, and each major platform has its own matching compiler and matching C API libraries, if the C language is available.
The C language and API is there for portability, but portability isn't it's primary reason for existing (and there are lots of small corner cases where the same code isn't portable across all platforms unless it is written a certain way.) The primary reason it's there is not portability, it is because if the language features weren't consistently available across all platforms, then you wouldn't have "one C language" that could be used on multiple machines, you would have "many C-like languages, where each supported item would have to be checked" meaning you might know C on your development platform, but not know C on another platform.
As for the libraries, there are many libraries that might be absent in a typical machine, and when developing, you generally have to use a dependency checker to ensure the library is present (and sometimes the correct functions are available in the library) before successful use of the machine for development. Autoconf, for example, has m4 macros that can be configured to check if a library is present before compiling the programs.

C runtime library : what for?

Context:
Creating a toy OS, written in assembly and C.
X86. 32-bits first, 64-bits then.
Currently read how to make a C library and try to understand the underlyings.
My question comes from the reading on this page.
I read the other SO questions concerning runtime libraries but still don't get it.
Question:
Ok, a runtime library seems to help providing low level functionalities that an application library cannot provide.
What wonders can I do with a runtime library ?
My searches on the subject lead to theoretical explanations (MSDN and so on).
I need a practical, visual explanation.
Update:
I saw the question the admins refer to, and I obviously read it already. But it was not enough high level. But that is fine :)
Thanks
A C runtime library is more a part of your C implementation than it is a core part of the operating system, especially if the C implementation provides only static linking, as might be the case in a toy OS.
Among other things, though, the C runtime library provides all the functions necessary for programs to obtain services from the OS, such as memory allocation and I/O. These are not necessarily the same functions the OS kernel uses internally for the same or similar purposes.
Programs written in other languages may or may not rely on the C language runtime (they may provide their own, independent one instead), and statically-linked C programs include all necessary functions in their own images, instead of relying on dynamically loading them from a library at run time. In a sense, the C runtime library is distributed and duplicated across all statically-linked programs built from C sources.

is it possible to write c code and library, for multiplatform: embedded or all operating systems?

Is it possible to run a written code or written library that only uses initial c libraries, on every platform?
For example:
Windows,
ARM Microprocessors,
PIC microprocessors,
They have their compilers seperately and this difference is not important for me, I can compile in different compilers for need. But do I have to change code totally or partially to run on this platforms?
Note: For libraries, I will just use default c libraries.
It depends. If your library use only standard C and the multiplatform you are about to port has a compiler compatiable to standard C, you can always write the library code. But if your library have to call native API of each platform, you have to encapsulate these code seperately.
In embedded systems you will need to implement certain functions that the operating system gives you (like _sbrk, _read, etc.) for standard library functions like malloc and printf.
If you take care of that I don't see a reason for your code not to work so long as you take GREAT CARE in how you write it. By GREAT CARE, I mean be very careful with floating points, processor word size and any other things that are not common between your desired targets.
Short answer: possible, but not easy.

What is the C runtime library?

What actually is a C runtime library and what is it used for? I was searching, Googling like a devil, but I couldn't find anything better than Microsoft's: "The Microsoft run-time library provides routines for programming for the Microsoft Windows operating system. These routines automate many common programming tasks that are not provided by the C and C++ languages."
OK, I get that, but for example, what is in libcmt.lib? What does it do? I thought that the C standard library was a part of C compiler. So is libcmt.lib Windows' implementation of C standard library functions to work under win32?
Yes, libcmt is (one of several) implementations of the C standard library provided with Microsoft's compiler. They provide both "debug" and "release" versions of three basic types of libraries: single-threaded (always statically linked), multi-threaded statically linked, and multi-threaded dynamically linked (though, depending on the compiler version you're using, some of those may not be present).
So, in the name "libcmt", "libc" is the (more or less) traditional name for the C library. The "mt" means "multi-threaded". A "debug" version would have a "d" added to the end, giving "libcmtd".
As far as what functions it includes, the C standard (part 7, if you happen to care) defines a set of functions a conforming (hosted) implementation must supply. Most vendors (including Microsoft) add various other functions themselves (for compatibility, to provide capabilities the standard functions don't address, etc.) In most cases, it will also contain quite a few "internal" functions that are used by the compiler but not normally by the end user.
The runtime library is basically a collection of the implementations of those functions in one big file (or a few big files--e.g., on UNIX the floating point functions are traditionally stored separately from the rest). That big file is typically something on the same general order as a zip file, but without any compression, so it's basically just some little files collected together and stored together into one bigger file. The archive will usually contain at least some indexing to make it relatively fast/easy to find and extract the data from the internal files. At least at times, Microsoft has used a library format with an "extended" index the linker can use to find which functions are implemented in which of the sub-files, so it can find and link in the parts it needs faster (but that's purely an optimization, not a requirement).
If you want to get a complete list of the functions in "libcmt" (to use your example) you could open one of the Visual Studio command prompts (under "Visual Studio Tools", normally), switch to the directory where your libraries were installed, and type something like: lib -list libcmt.lib and it'll generate a (long) list of the names of all the object files in that library. Those don't always correspond directly to the names of the functions, but will generally give an idea. If you want to look at a particular object file, you can use lib -extract to extract one of those object files, then use dumpbin /symbols <object file name> to find what function(s) is/are in that particular object file.
At first, we should understand what a Runtime Library is; and think what it could mean by "Microsoft C Runtime Library".
see: http://en.wikipedia.org/wiki/Runtime_library
I have posted most of the article here because it might get updated.
When the source code of a computer program is translated into the respective target language by a compiler, it would cause an extreme enlargement of program code if each command in the program and every call to a built-in function would cause the in-place generation of the complete respective program code in the target language every time. Instead the compiler often uses compiler-specific auxiliary functions in the runtime library that are mostly not accessible to application programmers. Depending on the compiler manufacturer, the runtime library will sometimes also contain the standard library of the respective compiler or be contained in it.
Also some functions that can be performed only (or are more efficient or accurate) at runtime are implemented in the runtime library, e.g. some logic errors, array bounds checking, dynamic type checking, exception handling and possibly debugging functionality. For this reason, some programming bugs are not discovered until the program is tested in a "live" environment with real data, despite sophisticated compile-time checking and pre-release testing. In this case, the end user may encounter a runtime error message.
Usually the runtime library realizes many functions by accessing the operating system. Many programming languages have built-in functions that do not necessarily have to be realized in the compiler, but can be implemented in the runtime library. So the border between runtime library and standard library is up to the compiler manufacturer. Therefore a runtime library is always compiler-specific and platform-specific.
The concept of a runtime library should not be confused with an ordinary program library like that created by an application programmer or delivered by a third party or a dynamic library, meaning a program library linked at run time. For example, the programming language C requires only a minimal runtime library (commonly called crt0) but defines a large standard library (called C standard library) that each implementation has to deliver.
I just asked this myself and was hurting my brain for some hours. Still did not find anything that really makes a point. Everybody that does write something to a topic is not able to actually "teach". If you want to teach someone, take the most basic language a person understands, so he does not need to care about other topics when handling a topic. So I came to a conclusion for myself that seems to fit well in all this chaos.
In the programming language C, every program starts with the main() function.
Other languages might define other functions where the program starts. But a processor does not know the main(). A processor knows only predefined commands, represented by combinations of 0 and 1.
In microprocessor programming, not having an underlying operating system (Microsoft Windows, Linux, MacOS,..), you need to tell the processor explicitly where to start by setting the ProgramCounter (PC) that iterates and jumps (loops, function calls) within the commands known to the processor. You need to know how big the RAM is, you need to set the position of the program stack (local variables), as well as the position of the heap (dynamic variables) and the location of global variables (I guess it was called SSA?) within the RAM.
A single processor can only execute one program at a time.
That's where the operating system comes in. The operating system itself is a program that runs on the processor. A program that allows the execution of custom code. Runs multiple programs at a time by switching between the execution codes of the programs (which are loaded into the RAM). But the operating system IS A PROGRAM, each program is written differently. Simply putting the code of your custom program into RAM will not run it, the operating system does not know it. You need to call functions on the operating system that registers your program, tell the operating system how much memory the program needs, where the entry point into the program is located (the main() function in case of C). And this is what I guess is located within the Runtime Library, and explains why you need a special library for each operating system, cause these are just programs themselves and have different functions to do these things.
This also explains why it is NOT dynamically linked at runtime as .dll files are, even if it is called a RUNTIME Library. The Runtime Library needs to be linked statically, because it is needed at startup of your program. The Runtime Library injects/connects your custom program into/to another program (the operating system) at RUNTIME. This really causes some brain f...
Conclusion:
RUNTIME Library is a fail in naming. There might not have been a .dll (linking at runtime) in the early times and the issue of understanding the difference simply did not exist. But even if this is true, the name is badly chosen.
Better names for the Runtime Library could be: StartupLibrary/OSEntryLibrary/SystemConnectLibrary/OSConnectLibrary
Hope I got it right, up for correction/expansion.
cheers.
C is a language and in its definition, there do not need to be any functions available to you. No IO, no math routines and so on. By convention, there are a set of routines available to you that you can link into your executable, but you don't need to use them. This is, however, such a common thing to do that most linkers don't ask you to link to the C runtime libraries anymore.
There are times when you don't want them - for example, in working with embedded systems, it might be impractical to have malloc, for example. I used to work on embedding PostScript into printers and we had our own set of runtime libraries that were much happier on embedded systems, so we didn't bother with the "standard".
The runtime library is that library that is automatically compiled in for any C program you run. The version of the library you would use depends on your compiler, platform, debugging options, and multithreading options.
A good description of the different choices for runtime libraries:
http://www.davidlenihan.com/2008/01/choosing_the_correct_cc_runtim.html
It includes those functions you don't normally think of as needing a library to call:
malloc
enum, struct
abs, min
assert
Microsoft has a nice list of their runtime library functions:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/crt-alphabetical-function-reference?view=msvc-170
The exact list of functions would vary depending on compiler, so for iOS you would get other functions like dispatch_async() or NSLog().
If you use a tool like Dependency Walker on an executable compiled from C or C++ , you will see that one of the the DLLs it is dependent on is MSVCRT.DLL. This is the Microsoft C Runtime Library. If you further examine MSVCRT.DLL with DW, you will see that this is where all the functions like printf(), puts(0, gets(), atoi() etc. live.
i think Microsoft's definition really mean:
The Microsoft implementation of
standard C run-time library provides...
There are three forms of the C Run-time library provided with the Win32 SDK:
* LIBC.LIB is a statically linked library for single-threaded programs.
* LIBCMT.LIB is a statically linked library that supports multithreaded programs.
* CRTDLL.LIB is an import library for CRTDLL.DLL that also supports multithreaded programs. CRTDLL.DLL itself is part of Windows NT.
Microsoft Visual C++ 32-bit edition contains these three forms as well, however, the CRT in a DLL is named MSVCRT.LIB. The DLL is redistributable. Its name depends on the version of VC++ (ie MSVCRT10.DLL or MSVCRT20.DLL). Note however, that MSVCRT10.DLL is not supported on Win32s, while CRTDLL.LIB is supported on Win32s. MSVCRT20.DLL comes in two versions: one for Windows NT and the other for Win32s.
see: http://support.microsoft.com/?scid=kb%3Ben-us%3B94248&x=12&y=9

How does linking to OS C libraries under Windows and Linux work?

I understand Linux ships with a c library, which implements the ISO C functions and system call functions, and that this library is there to be linked against when developing C. However, different c compilers do not necessarily produce linkable code (e.g. one might pad datastructures used in function arguments differently from another). How is the built-in c library meant to be linked to when I could use any compiler to compile my C? Is the story any different for static versus dynamic linking?
Under Windows on the other hand, each compiler provides its own standard library, which solves part of the problem, but system calls are still in a single set of DLLs. How are C applications linked to these DLLs successfully? How about different languages? (The same DLLs can be used by pre-.Net Visual Basic, etc.)
Each platform has some "calling conventions" that each C implementation must adhere to in order to be able to talk to the operating system correctly. For Windows, for example, all OS-based functions have to be called using stdcall convention, as opposed to the default C convention of cdecl.
In Linux, since the standard C library (and kernel) is compiled using GCC, any other compilers for Linux must make sure their calling conventions are compatible to the one used by GCC.
Compilers do come with their implementations of the standard library. It's just that under Linux it's assumed that any compiler will follow the same conventions the version of GCC that compiled the library had.
As of interoperability, it can be easier than you think. There are established calling conventions that will allow compilers to produce a valid call to a function, even if the function wasn't compiled with the same software.
As of structures and padding, you'll notice that most frameworks work with opaque types, that is, pointers to structures. Often, the structure's layout isn't even available to clients. As such, they never works with the actual data, only pointers to the data, which clears the padding issue.
Standards. You'll note that stdlib stuff operates on primitive values and arrays - and the standard for that stuff is pretty explicit on how things are to be done.

Resources