Portable way to implement variadic arguments in kernel space? - c

I am wondering if it is possible to implement the variadic macros in C or assembly.
I would prefer to have at least va_start() be a C macro but looks like this might not be possible. I have seen other answers to different questions saying it is not possible to do in C because you have to rely on undefined behaviour.
For context I am writing a kernel and I do not want to rely on any specific C89 compiler or unix-like assembler. Building the source with any C compiler is important for the project. Keeping it simple is another goal, unfortunately supporting something like variadic arguments seems to be complex on some architectures (amd64 ABI).
I know the __builtin_va_start(v,l), __builtin_va_arg(v, l), etc. macros exist but these are only available to specific compilers?
Right now I have the kernel printf(, ...) and panic(, ...) routines written in assembly (i386 ABI) which set up the va_list (pointer to first va argument on the stack) and pass it to vprintf(, va_list) which then uses the va_arg() macro (written in C). This does not rely on any undefined or implementation-defined behaviour but I would prefer that all the macros are written in C.

Summary: Just #include <stdarg.h> and use va_start and friends as you normally would. A standard-conformant C compiler will support this, even without what we normally think of as a "C library", and it is perfectly usable in a kernel that must run on the bare metal without OS support. This is also the most portable solution, and avoids needing an architecture-, compiler- or ABI-dependent solution.
Of course when writing a kernel, you are used to not using library facilities like the functions from <stdio.h>, <stdlib.h>, and even <string.h> (printf, malloc, strcpy, etc), or having to write your own. But <stdarg.h> is in a different category. Its functionality can be provided by the compiler without OS support or extensive library code, and is in some sense more a part of the compiler/language than the "library".
From the point of view of the C standard, there are two kinds of conforming implementations (see C17 section 4, "Conformance"). Application programmers mostly think about conforming hosted implementations, which must provide printf and all that. But for a kernel or embedded code or anything else that runs on the bare metal, what you want is a conforming freestanding implementation (I'll write CFI for short). This is, informally speaking, "just the compiler" without "the standard library". But there are a few standard headers whose contents a CFI must still support, and <stdarg.h> is one of them. The others are things like <limits.h>, <stddef.h> and <stdint.h> that are mainly constants, macros and typedefs.
(This same distinction has existed all the way back to C89, with the same guarantee of <stdarg.h> being available.)
If your kernel will build with any CFI, that's pretty much the gold standard of portability for a kernel. In fact, you'll be pretty hard-pressed not to use some more compiler-specific feature at some points (inline assembly is awfully useful, for instance). But <stdarg.h> doesn't have to be one of them; you're really not giving up any portability by using it. You can expect it to be supported by any usable compiler targeting any given architecture, and that includes cross compilers (which will be configured to use the correct header for the target). For instance, in the case of a GNU system, <stdarg.h> ships with the gcc compiler itself, and not with the glibc standard library.
As some further assurance, until very recently, the Linux kernel itself used <stdarg.h> in precisely this way. (About a month ago there was a commit to create their own <linux/stdarg.h> file, which just copy-pastes from an old version of gcc's <stdarg.h> and defines the macros as their gcc-specific __builtin versions. Linux only supports building with gcc and compilers that implement the same builtins, such as clang, so this doesn't hurt them. But my best guess is that this was done for licensing reasons - the commit message emphasizes that they copied a GPL 2 version - rather than based on anything technical.)
By contrast, writing your variadic functions in assembly will naturally tie you to that specific architecture, and they'd be one more thing to be rewritten if you ever want to port to another architecture. And trying to access variadic arguments on the stack from C, with tricks like arg = *((int *)&fixed_arg + 1), is (a) ABI-dependent, (b) only possible at all for ABIs which actually pass args on the stack, which these days isn't much besides x86-32, and (c) is undefined behavior that might be "miscompiled" by some compilers. Finally, things like __builtin_va_start are strictly compiler-dependent (gcc and clang in this case), and using <stdarg.h> is no worse because gcc's <stdarg.h> simply contains macros like #define va_start __builtin_va_start.
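To make this concrete, here is a minimal sketch of a kernel-style formatter built on nothing but <stdarg.h> and <stddef.h>. The names kbuf, kputc, kvprintf and kprintf are hypothetical, the fixed buffer merely stands in for a real console driver, and only %s, %d, %c and %% are handled.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical output sink: a fixed buffer standing in for a console driver. */
static char kbuf[128];
static size_t kbuf_len;

static void kputc(char c)
{
    if (kbuf_len + 1 < sizeof kbuf) {
        kbuf[kbuf_len++] = c;
        kbuf[kbuf_len] = '\0';
    }
}

static void kput_uint(unsigned long v)
{
    char tmp[3 * sizeof v];   /* more than enough decimal digits */
    size_t n = 0;

    do {
        tmp[n++] = (char)('0' + v % 10);
        v /= 10;
    } while (v != 0);
    while (n > 0)
        kputc(tmp[--n]);
}

/* Formats into kbuf; supports only %s, %d, %c and %%. */
void kvprintf(const char *fmt, va_list ap)
{
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            kputc(*fmt);
            continue;
        }
        fmt++;
        switch (*fmt) {
        case 's': {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0')
                kputc(*s++);
            break;
        }
        case 'd': {
            int d = va_arg(ap, int);
            unsigned long u = (unsigned long)d;
            if (d < 0) {
                kputc('-');
                u = 0UL - u;   /* magnitude, also correct for INT_MIN */
            }
            kput_uint(u);
            break;
        }
        case 'c':
            kputc((char)va_arg(ap, int));   /* char is promoted to int */
            break;
        case '%':
            kputc('%');
            break;
        case '\0':
            return;                          /* stray '%' at end of format */
        default:
            kputc('%');
            kputc(*fmt);
            break;
        }
    }
}

void kprintf(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    kvprintf(fmt, ap);
    va_end(ap);
}
```

A call such as kprintf("cpu%d: %s", 0, "ok") leaves the formatted text in kbuf. Nothing here depends on the ABI; the compiler's <stdarg.h> takes care of that.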

Since you are highlighting kernel space, is it right that you want a user space function which is implemented via some sort of kernel call and is variadic?
This is a bit problematic; as a typical kernel entry point transitions the flow of control onto a kernel stack. Your va_start(), va_arg() implementations would have to be aware of how to traverse to the user's stack, and possibly map bits of a register save area into the vector.
An easier approach would be to have the user function:
int ufunc(char *fmt, ...) {
    va_list v;
    int n;
    va_start(v, fmt);
    n = __ufunc(fmt, v);
    va_end(v);
    return n;
}
And implement __ufunc in the kernel. Traditionally this is how the execl and execv family of functions co-operate to make handy interfaces, but only use one kernel call.
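The execl/execv style of cooperation can be sketched as follows: a variadic "l" front-end gathers a NULL-terminated argument list into an array and hands it to a "v" worker, which is the only part that would need to cross into the kernel. The function names and the MAX_ARGS limit are assumptions made for this illustration, and the worker just adds up string lengths so the example stays self-contained.

```c
#include <stdarg.h>
#include <stddef.h>

#define MAX_ARGS 8   /* arbitrary limit for this sketch */

/* "v" form: does the real work on an array (stands in for the single kernel call). */
size_t total_length_v(const char *const argv[])
{
    size_t total = 0;

    for (size_t i = 0; argv[i] != NULL; i++)
        for (const char *p = argv[i]; *p != '\0'; p++)
            total++;
    return total;
}

/* "l" form: collects a NULL-terminated argument list and forwards it. */
size_t total_length_l(const char *first, ...)
{
    const char *argv[MAX_ARGS + 1];
    size_t n = 0;
    va_list ap;

    va_start(ap, first);
    for (const char *s = first; s != NULL && n < MAX_ARGS;
         s = va_arg(ap, const char *))
        argv[n++] = s;
    va_end(ap);
    argv[n] = NULL;

    return total_length_v(argv);
}
```

A caller would write total_length_l("ab", "cde", (const char *)NULL). The cast on the sentinel matters because a bare NULL may not have pointer type in a variadic call.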
Your kernel will still have a bit of work dealing with the user stack though. For example, I could craft a va_list value for your call that caused the kernel to read out some private data. But if you are able to sort that the va_list points somewhere valid, and whatever processing you are doing with va_arg() supplied values are also valid, you would be able to use the stock compiler provided implementation.
Do note that if the user program used a different calling convention than your kernel, you could be in for a bit of work. For example, Microsoft did not adopt the published System V ABI for amd64 and uses its own calling convention, so that might cause a problem.

Related

When should Win32/WinAPI types be used vs. Standard C types?

I'm late to the Win32 party and there's a sea of functions such as _tprintf, TEXT(), there are also libraries like strsafe.h which have such functions like StringCchCopy(), StringCchLength(), etc.. Basically, Win32 API introduces a bunch of extra functions and types on top of C which can be confusing to a C programmer who hasn't worked with Win32 much. I do not have a problem finding the definitions of these types and functions on MSDN. However, I do have a problem finding guidelines on when and why they should be used.
I have 2 questions:
How important is it to use all of these types and special functions which Microsoft has provided on top of standard C when programming with Win32? Is it considered good practice to do away with all standard C functions and types and use entirely Microsoft wrappers?
Is it okay to mix standard C functions in with these Microsoft types and functions? For example, to use malloc() instead of HeapAlloc(), or to use printf() instead of _tprintf() and etc...?
I have a copy of Charles Petzold's Programming Windows Fifth Edition book but it mostly covers GUI stuff and not a lot of the remainder of the API.
There are actually 3 questions here, the ones you explicitly asked, and the one you didn't. Let's get that last one out of the way first, as it tends to cause the most confusion:
What are those _t-extensions offered by the Microsoft-provided CRT?
They are generic-text mappings, introduced to make it possible to write code that targets both ANSI-based systems (Win9x) as well as Unicode-based systems (Windows NT). They are macros that expand to the actual function calls, based on the _UNICODE and _MBCS preprocessor symbols. For example, the symbol _tprintf expands to either printf or wprintf.
Likewise, the Windows API provides both ANSI and Unicode versions of the API calls. They, too, are preprocessor macros that expand to the actual API call, depending on the preprocessor symbol UNICODE. For example, the CreateFile symbol expands to CreateFileA or CreateFileW.
Generic-text mappings haven't been useful in the past two decades. Today, simply use the Unicode versions of the CRT and API calls (e.g. wprintf and CreateFileW). You can define _UNICODE and UNICODE for good measure, too, so that you don't accidentally call an ANSI version.
there are also libraries like strsafe.h which have such functions like StringCchCopy(), StringCchLength()
Those are safe variants of the CRT string manipulation calls. They are safer than e.g. strcpy by providing the buffer size of the destination, similar to strncpy. The latter, however, suffers from an awkward design decision, that causes the destination buffer to not get zero-terminated, in case the source won't fit. StringCchCopy will always zero-terminate the destination buffer, and thus provides additional safety over the CRT implementations. (Note: C11 introduces safe variants, e.g. strncpy_s, that will always zero-terminate the destination array, in case the input is valid. They also validate the input, calling the currently installed constraint handler when validation fails, thus providing even stronger safety than the strsafe.h implementations. The bounds-checked implementations are a conditional feature of C11.)
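The difference in termination behaviour can be illustrated in portable C. The helper below is a hypothetical stand-in, not the strsafe.h implementation: like StringCchCopy it always terminates the destination, whereas strncpy(dst, src, n) leaves dst unterminated whenever src has n or more characters.

```c
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: copies src into dst (capacity dst_size bytes) and
 * always NUL-terminates dst when dst_size > 0.
 * Returns 0 if the whole string fit, -1 if it was truncated or the
 * arguments were invalid.
 */
int copy_terminated(char *dst, size_t dst_size, const char *src)
{
    if (dst == NULL || src == NULL || dst_size == 0)
        return -1;

    size_t len = strlen(src);
    size_t n = len < dst_size ? len : dst_size - 1;

    memcpy(dst, src, n);
    dst[n] = '\0';
    return len < dst_size ? 0 : -1;
}
```

With a 4-byte buffer, copy_terminated(buf, sizeof buf, "abcdef") stores "abc" and reports truncation through its return value, so the caller can decide what to do about it.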
How important is it to use all of these types and special functions which Microsoft has provided on top of standard C when programming with Win32? Is it considered good practice to do away with all standard C functions and types and use entirely Microsoft wrappers?
It is not important at all. You can use whichever is more suitable in your scenario. If in doubt, writing portable (i.e. Standard C) code is generally preferable. You only ever want to call the Windows API calls, if you need the additional control they offer (e.g. HeapAlloc allows more control over the allocation than malloc does; likewise CreateFile provides more options than fopen).
Is it okay to mix standard C functions in with these Microsoft types and functions? For example, to use malloc() instead of HeapAlloc(), or to use printf() instead of _tprintf() and etc...?
In general, yes, as long as you match those calls: HeapFree what you HeapAlloc, free what you malloc. You must not mix HeapAlloc and free, for example. In case a Windows API call requires special memory management functions to be used, it is explicitly pointed out in the documentation. For example, if FormatMessage is requested to allocate the buffer to return data, it must be freed using LocalFree. If you do not request the API to allocate a buffer, you can pass in a buffer allocated any way you like (malloc, HeapAlloc, IMalloc::Alloc, etc.).
It is possible to create programs on Windows without using any standard C library functions but most programs do and then you might as well use malloc over HeapAlloc. malloc will use HeapAlloc or VirtualAlloc internally but it is probably tuned for better performance/less fragmentation compared to the raw API. It also makes it easier to port to POSIX in the future. You will still be forced to use LocalFree/GlobalFree/HeapFree in some places where the API allocates memory for you.
Handling text needs special consideration and you need to decide if you need Unicode support or not. A stroll down memory lane might shed some light on why things are the way they are.
Back when Windows 95/98 was king you could use the char/CHAR narrow string types with both the C standard functions and the Windows API. There was virtually no Unicode support except for a handful of functions.
On Windows NT4/2000 however the native string type is WCHAR (UTF-16 LE but Microsoft just calls it Unicode). If you are using Microsoft Visual C++ then you have access to wide string versions of the C standard library beyond what the C standard actually requires to ease coding for this platform. When coding for Windows using the Microsoft toolchain you can assume that the Windows SDK WCHAR type is the same as the wchar_t type defined by C.
Because the development of 95 and NT4 overlapped they share the same API and every function that receives/returns a string has two versions, one with an A suffix ("ANSI") and one with a W suffix. On Windows 95 the W functions are just stubs that return failure.
When you include Windows.h it will create defines like #define CreateProcess CreateProcessW if UNICODE is defined or #define CreateProcess CreateProcessA if not.
Visual C++ does the same thing with the tchar.h header. It uses the _UNICODE define to decide if the TCHAR type and the _t* functions use the char or wchar_t type. This meant that you could create two releases from the same source code, one for Windows 95/98/ME and one with full Unicode support.
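The mechanism is plain preprocessor work and can be imitated in portable C. In this sketch every DEMO_/Demo name is made up for illustration and only mirrors the roles of TCHAR, _T() and the A/W function pairs; it is not how the Windows headers are actually written.

```c
#include <stddef.h>

/* Narrow and wide versions of a hypothetical API function (like FooA/FooW). */
size_t DemoLengthA(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

size_t DemoLengthW(const wchar_t *s)
{
    size_t n = 0;
    while (s[n] != L'\0')
        n++;
    return n;
}

/* One preprocessor symbol picks the character type, literal prefix and function. */
#ifdef DEMO_UNICODE
typedef wchar_t DEMO_TCHAR;
#define DEMO_T(x)  L##x
#define DemoLength DemoLengthW
#else
typedef char DEMO_TCHAR;
#define DEMO_T(x)  x
#define DemoLength DemoLengthA
#endif

/* The same source line compiles against either variant. */
size_t greeting_length(void)
{
    const DEMO_TCHAR *greeting = DEMO_T("hello");
    return DemoLength(greeting);
}
```

Compiling once with -DDEMO_UNICODE and once without yields two builds from one source file, which is the same trick that produced separate Windows 9x and NT releases.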
This is not that relevant anymore but you still need to make a choice because things will be defined for one or the other.
It is still perfectly valid to do
#define UNICODE
#define _UNICODE
#include <windows.h>
#include <tchar.h>
void foo()
{
    TCHAR buf[100];
    SomeWindowsFunction(buf, 100);
    _tprintf(_T("foo: %s\n"), buf);
}
although you will see many people go straight for WCHAR and wprintf these days.
The StrSafe functions were added to make it easier to write bug free code, they still have the same A/W duplication.
You cannot mix and match WCHAR with printf, even if you use %ls in the format string the string will be converted internally and not all Unicode strings will convert correctly.
If POSIX portability is not a requirement then I suggest that you use the wide function extensions provided by Microsoft when you need a C library function.
Note that different versions of the OS use different definitions of base types and use different alignments/padding. Remember the 8086, the 386 and now the Core i7 (16, 32 and 64 bits).
When structs need to be compatible with earlier versions, they typically use pre-defined integer widths and pad for legacy alignment.
For that reason, the types from the API must be used in API calls.
Also, there were, and perhaps still are, different memory models for process memory and shared memory. For example the clipboard uses a form of shared memory. It is important to use the memory allocation mechanisms Microsoft advises for such API calls.
For everything non-API I use the standard C functions and types.

What remains in C if I exclude libraries and compiler extensions?

Imagine a situation where you can't or don't want to use any of the libraries provided by the compiler as "standard", nor any external library. You can't use even the compiler extensions (such as gcc extensions).
What is the remaining part you get if you strip C language of all the things a lot of people use as a matter of course?
In such a way, probably a list of every callable function supported by any big C compiler (not only ANSI C) out-of-box would be satisfying as an answer, as it'd at least approximately show the use-case of the language.
First I thought about sizeof() and printf() (those were already clarified in the comments - operator + stdio), so... what remains? In-line assembly seems like an extension too, so that pretty much strips even the option to use assembly with C, if I'm right.
Probably in the matter of code it'd be easier to understand. Imagine a code compiled with only e.g. gcc main.c (output flag permitted) that has no #include, nor extern.
int main() {
    // replace_me
    return 0;
}
What can I call to actually do something else than "boring" type math and casting from type to type?
Note that switch, goto, if, loops and other constructs that do nothing and only allow repeating a piece of code aren't the thing I'm looking for (if it isn't obvious).
(Hopefully the edit clarified wtf I'm actually asking, but Matteo's answer pretty much did it.)
If you remove all libraries, essentially you have something similar to a freestanding implementation of C (which still has to provide a few headers such as <stddef.h>, <limits.h> and <stdarg.h>, but not <string.h> and the like - though those are nothing you couldn't easily implement yourself in portable C), and that's what you normally start with when programming microcontrollers and other computers that don't have a ready-made operating system - and what operating system writers in general use when they compile their operating systems.
There you typically have two ways of doing stuff besides "raw" computation:
assembly blocks (where you can do literally anything the underlying machine can do);
memory mapped IO (you set a volatile pointer to some hardware dependent location and read/write from it; that affects hardware stuff).
That's really all you need to build anything - and after all, it all boils down to that stuff anyway: the C library of a regular hosted implementation is normally written in C itself, with some assembly used either for speed or to communicate with the operating system (typically the syscalls are invoked through some kind of interrupt).
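As a rough sketch of the memory-mapped I/O approach, the following drives an imaginary UART-like device through volatile accesses. The register layout, the DEMO_UART_TX_READY bit and the address mentioned in the comment are all assumptions for illustration; a real device's datasheet defines these.

```c
#include <stdint.h>

/* Hypothetical register block of a simple UART-like device. */
typedef struct {
    volatile uint32_t data;     /* write a byte here to transmit it */
    volatile uint32_t status;   /* bit 0 set = transmitter ready    */
} demo_uart_regs;

#define DEMO_UART_TX_READY (1u << 0)   /* assumed bit position */

/* Busy-waits until the device is ready, then writes one character. */
void demo_uart_putc(demo_uart_regs *uart, char c)
{
    while ((uart->status & DEMO_UART_TX_READY) == 0) {
        /* volatile forces a fresh read of the status register each time */
    }
    uart->data = (uint32_t)(unsigned char)c;
}

/*
 * On real hardware the pointer would come from a fixed, device-specific
 * address, for example (address made up):
 *     demo_uart_regs *uart = (demo_uart_regs *)0x10000000u;
 */
```

Because the function takes the register block as a parameter, it can also be pointed at an ordinary struct in RAM for testing on a hosted system.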
Again, it's nothing you couldn't implement yourself. But the point of having a standard library is both to avoid continuously reinventing the wheel, and to have a set of portable functions that spare you from having to rewrite everything with knowledge of the details of each target platform.
And mainstream operating systems, in turn, are generally written in a mix of C and assembly as well.
C has no "built-in" functions as such. A compiler implementation may include "intrinsic" functions that are implemented directly by the compiler without provision of an external library, although a prototype declaration is still required for intrinsics, so you would still normally include a header file for such declarations.
C is a systems-level language with a minimal run-time and start-up requirement. Because it can directly access memory and memory mapped I/O there is very little that it cannot do (and what it cannot do is what you use assembly, in-line assembly or intrinsics for). For example, much of the library code you are wondering what you can do without is written in C. When running in an OS environment however (using C as an application-level rather than system-level language), you cannot practically use C in that manner - the OS has control over such things as I/O and memory-management and in modern systems will normally prevent unmediated access to such resources. Of course that OS itself is likely to be largely written in C (and/or C++).
In a standalone or bare-metal environment with no OS, C is often used very early in the bootstrap process, initialising hardware and establishing an application execution environment. In fact on ARM Cortex-M processors it is possible to boot directly into C code from reset, since the hardware loads an initial stack pointer and start address from the vector table on start-up; this is enough to run C code that does not rely on library or static data initialisation - such initialisation can however be written in C before calling main().
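The static data initialisation mentioned above usually amounts to two loops: copy the initial values of the .data section from flash to RAM, and zero the .bss section. Here is a minimal sketch; the function names are invented, and the linker symbols named in the closing comment are hypothetical, since every linker script chooses its own.

```c
#include <stdint.h>

/* Copies initialised data from its load address (e.g. flash) into RAM. */
void copy_data_section(const uint32_t *load_addr, uint32_t *start,
                       const uint32_t *end)
{
    while (start < end)
        *start++ = *load_addr++;
}

/* Clears zero-initialised data, as C requires for static objects. */
void zero_bss_section(uint32_t *start, const uint32_t *end)
{
    while (start < end)
        *start++ = 0;
}

/*
 * A reset handler would call these with symbols exported by the linker
 * script before calling main(), along the lines of (symbol names are
 * hypothetical):
 *     copy_data_section(&_data_load, &_data_start, &_data_end);
 *     zero_bss_section(&_bss_start, &_bss_end);
 *     main();
 */
```

Until both loops have run, the startup code itself must not rely on any initialised or zeroed static variables.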
Note that sizeof is not a function, it is an operator.
I don't think you really understand the situation.
You don't need a header to call a function in C. You can call with unchecked parameters - a bad idea and an obsolete feature, but still supported. And if a compiler links a library by default instead of only when you explicitly tell it to, that's only a little switch within the compiler to "link libc". Notoriously Unix compilers need to be told to link the math library, it wasn't linked by default because some very early programs didn't use floating point.
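A small sketch of the point about headers: the translation unit below includes nothing and declares puts by hand. Writing the prototype yourself is the safer variant of calling with unchecked parameters, since implicit function declarations were removed from the language in C99 and modern compilers reject or warn about them. The function say_hello is a made-up name.

```c
/* No #include here: the library function is declared by hand. */
int puts(const char *s);

/* Returns a non-negative value if the line was written successfully. */
int say_hello(void)
{
    return puts("hello from a hand-written prototype");
}
```

This links because the compiler driver adds the C library by default; the header only ever supplied the declaration.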
To be fair, some standard library functions like memcpy tend to be special-cased these days as they lend themselves to inlining and optimisation.
The standard library is documented and is usually available, though in effect deprecated by Microsoft for security reasons. You can write pretty much any function quite easily with only stdlib functions, what you can't do is fancy IO.

How to install C11 compiler on Mac OS with optional string functions included?

I'm trying the below code to see if the optional string functions in C are supported (I've got Mac OS X El Capitan and XCode installed)...
#include <stdio.h>
int main(void)
{
#if defined __STDC_LIB_EXT1__
    printf("Optional functions are defined.\n");
#else
    printf("Optional functions are not defined.\n");
#endif
    return 0;
}
...but it suggests they aren't.
I've tried all the different compilers I have from XCode (cc, gcc, llvm-gcc, clang).
I've also tried brew install gcc assuming that the GNU C compiler would give me these extra functions, but it doesn't.
Is there a way to simply install a C11 compatible compiler on Mac OS that'll give me these additional (i.e. safe) string functions?
Summary: You won't get it to work. There are better ways to make sure your code is correct. For now, use the address sanitizer instead.
Also known as "Annex K" of the C11 standard or TR 24731, these functions are not widely implemented. The only commonly available implementation is part of Microsoft Visual Studio, other common C implementations have rejected (explicitly, even) the functionality in annex K. So, while annex K is technically part of the standard, for practical purposes it should be treated as a Microsoft-specific extension.
See Field Experience With Annex K — Bounds Checking Interfaces (document N1967) for more information. According to this report, there are only four implementations of annex K, two are for Windows, one is considered "very incomplete" and the remaining one is "unsuitable for production use without considerable changes."
However, the argument that these string functions are "safe" is a bit misleading. These functions merely add bounds checking, which only works if the functions are called correctly—but then again, the "non-safe" functions only work if they are called correctly too. From the report cited above,
Despite more than a decade since the original proposal and nearly ten years since the ratification of ISO/IEC TR 24731-1:2007, and almost five years since the introduction of the Bounds checking interfaces into the C standard, no viable conforming implementations has emerged. The APIs continue to be controversial and requests for implementation continue to be rejected by implementers.
The design of the Bounds checking interfaces, though well-intentioned, suffers from far too many problems to correct. Using the APIs has been seen to lead to worse quality, less secure software than relying on established approaches or modern technologies. More effective and less intrusive approaches have become commonplace and are often preferred by users and security experts alike.
Therefore, we propose that Annex K be either removed from the next revision of the C standard, or deprecated and then removed.
I suggest using the address sanitizer as an alternative.
Do not use strncpy, strncat or the like as "safe" functions, they're not designed to do that and they are not drop-in replacements for strcpy, strcat, etc., unlike strcpy_s, strcat_s, which are drop-in replacements.
If you are not using Windows or Embarcadero you need to use the external safeclib: https://github.com/rurban/safeclib/releases
No other libc comes with the safe C11 Annex K extensions.
For an overview of the various libc quirks regarding this see https://rurban.github.io/safeclib/doc/safec-3.3/d1/dae/md_doc_libc-overview.html
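Since so few C libraries ship Annex K, code that wants to use it where present needs a fallback. One possible shape is sketched below: it requests the interfaces with __STDC_WANT_LIB_EXT1__, checks __STDC_LIB_EXT1__, and otherwise uses a plain bounded copy. The wrapper name bounded_copy and its return convention are assumptions for this example, and the fallback branch does not replicate Annex K's constraint-handler behaviour.

```c
#define __STDC_WANT_LIB_EXT1__ 1   /* must come before the first #include */
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical wrapper: copies src into dst (capacity dst_size).
 * Returns 0 on success and -1 if src does not fit.
 */
int bounded_copy(char *dst, size_t dst_size, const char *src)
{
#ifdef __STDC_LIB_EXT1__
    /* The library provides Annex K: use its bounds-checked function. */
    return strcpy_s(dst, dst_size, src) == 0 ? 0 : -1;
#else
    /* No Annex K (the usual case outside Windows): check the length by hand. */
    size_t len = strlen(src);

    if (dst_size == 0)
        return -1;
    if (len >= dst_size) {
        dst[0] = '\0';
        return -1;
    }
    memcpy(dst, src, len + 1);   /* includes the terminating NUL */
    return 0;
#endif
}
```

On macOS and Linux this compiles down to the fallback branch, which matches what the test program in the question reports.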

C struct alignment and portability across compilers

Assuming the following header file corresponding to, for example, a shared library. The exported function takes a pointer to a custom structure defined in this header:
// lib.h
typedef struct {
    char c;
    double d;
    int i;
} A;
DLL_EXPORT void f(A* p);
If the shared library is built using one compiler and then is used from C code built with another compiler it might not work because of a different memory alignment, as Memory alignment in C-structs suggests. So, is there a way to make my structure definition portable across different compilers on the same platform?
I am interested specifically in Windows platform (apparently it does not have a well-defined ABI), though would be curious to learn about other platforms as well.
TL;DR in practice you should be fine.
The C standard does not define this but a platform ABI generally does. That is, for a given CPU architecture and operating system, there can be a definition for how C maps to assembly that allows different compilers to interoperate.
Struct alignment isn't the only thing that a platform ABI has to define, you also have function calling conventions and stuff like that.
C++ makes it even more complex and the ABI has to specify vtables, exceptions, name mangling, etc.
On Windows I think there are multiple C++ ABIs depending on compiler but C is mostly compatible across compilers. I could be wrong, not a Windows expert.
Some links:
what is an ABI? http://gcc.gnu.org/ml/libstdc++/2001-11/msg00063.html
things an ABI has to define C++ ABI issues list
example C++ ABI spec http://sourcery.mentor.com/public/cxx-abi/abi.html
how the ABI evolved on Solaris http://developers.sun.com/solaris/articles/CC_abi/CC_abi_content.html
Anyway the bottom line is that you're looking for your guarantee in the platform/compiler ABI spec, not the C standard.
The only way to know for sure is to consult the documentation of the compilers in question. However, it is usually the case that C struct layout (except, as you say, for bitfields) is defined by an ABI description for the environment you're using, and C compilers will tend to follow the native ABI.
Not only is it not guaranteed, but even if you use the same compiler there might be differences due to different compiler switches used in the build, or between different versions of the same compiler with the same switches (this happened in an embedded compiler I worked on).
You need to make sure the structs are represented exactly the same way; use switches, #pragmas, whatever the compilers give you.
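One compiler-independent way to pin the representation down, sketched here under the assumption of a C11 compiler and an 8-byte double, is to spell out the padding explicitly, use fixed-width types, and let static assertions fail the build if any compiler disagrees. The type A_fixed and its padding members are illustrative names, not part of the original header.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as A, with the padding written out and a fixed-width integer. */
typedef struct {
    char          c;
    unsigned char pad0[7];   /* explicit padding so that d starts at offset 8 */
    double        d;
    int32_t       i;
    unsigned char pad1[4];   /* explicit tail padding, total size 24 */
} A_fixed;

/* Compilation stops here if a compiler lays the struct out differently. */
_Static_assert(sizeof(double) == 8, "this sketch assumes an 8-byte double");
_Static_assert(offsetof(A_fixed, d) == 8, "unexpected offset of d");
_Static_assert(offsetof(A_fixed, i) == 16, "unexpected offset of i");
_Static_assert(sizeof(A_fixed) == 24, "unexpected struct size");

size_t a_fixed_size(void)
{
    return sizeof(A_fixed);
}
```

Putting the assertions in the shared header means both the library build and every client build check the same layout.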
My advice: stay away from this altogether. Pass your arguments in the function, not wrapped within a struct.
And even in this simple form, if you deal with two compilers, it's not trivial. You need to make sure that an int takes the same number of bytes, for example. The calling convention (argument order, left to right or right to left) can also differ between compilers.

What can you do in C without "std" includes? Are they part of "C," or just libraries?

I apologize if this is a subjective or repeated question. It's sort of awkward to search for, so I wasn't sure what terms to include.
What I'd like to know is what the basic foundation tools/functions are in C when you don't include standard libraries like stdio and stdlib.
What could I do if there's no printf(), fopen(), etc?
Also, are those libraries technically part of the "C" language, or are they just very useful and effectively essential libraries?
The C standard has this to say (5.1.2.3/5):
The least requirements on a conforming implementation are:
— At sequence points, volatile objects are stable in the sense that previous accesses are complete and subsequent accesses have not yet occurred.
— At program termination, all data written into files shall be identical to the result that execution of the program according to the abstract semantics would have produced.
— The input and output dynamics of interactive devices shall take place as specified in 7.19.3.
So, without the standard library functions, the only behavior that a program is guaranteed to have, relates to the values of volatile objects, because you can't use any of the guaranteed file access or "interactive devices". "Pure C" only provides interaction via standard library functions.
Pure C isn't the whole story, though, since your hardware could have certain addresses which do certain things when read or written (whether that be a SATA or PCI bus, raw video memory, a serial port, something to go beep, or a flashing LED). So, knowing something about your hardware, you can do a whole lot writing in C without using standard library functions. Potentially, you could implement the C standard library, although this might require access to special CPU instructions as well as special memory addresses.
But in pure C, with no extensions, and the standard library functions removed, you basically can't do anything other than read the command line arguments, do some work, and return a status code from main. That's not to be sniffed at, it's still Turing complete subject to resource limits, although your only resource is automatic and static variables, no heap allocation. It's not a very rich programming environment.
The standard libraries are part of the C language specification, but in any language there does tend to be a line drawn between the language "as such", and the libraries. It's a conceptual difference, but ultimately not a very important one in principle, because the standard says they come together. Anyone doing something non-standard could just as easily remove language features as libraries. Either way, the result is not a conforming implementation of C.
Note that a "freestanding" implementation of C only has to implement a subset of standard includes not including any of the I/O, so you're in the position I described above, of relying on hardware-specific extensions to get anything interesting done. If you want to draw a distinction between the "core language" and "the libraries" based on the standard, then that might be a good place to draw the line.
What could I do if there's no printf(), fopen(), etc?
As long as you know how to interface the system you are using you can live without the standard C library. In embedded systems where you only have several kilobytes of memory, you probably don't want to use the standard library at all.
Here is a Hello World! example on Linux and Windows without using any standard C functions:
For example on Linux you can invoke the Linux system calls directly in inline assembly:
/* 64-bit Linux (x86-64), GCC inline assembly. */
#define SYSCALL_EXIT 60
#define SYSCALL_WRITE 1
void sys_exit(int error_code)
{
    asm volatile
    (
        "syscall"
        :
        : "a"(SYSCALL_EXIT), "D"(error_code)
        : "rcx", "r11", "memory"
    );
}
int sys_write(unsigned fd, const char *buf, unsigned count)
{
    int ret;
    asm volatile
    (
        "syscall"
        : "=a"(ret)
        : "a"(SYSCALL_WRITE), "D"(fd), "S"(buf), "d"(count)
        : "rcx", "r11", "memory"
    );
    return ret;
}
void _start(void)
{
    const char hwText[] = "Hello world!\n";
    /* sizeof includes the terminating NUL, which should not be written. */
    sys_write(1, hwText, sizeof(hwText) - 1);
    sys_exit(12);
}
You can look up the manual page for "syscall" to find out how to make system calls. On x86_64 you put the system call number into RAX, and the return value comes back in RAX. The arguments must be put into RDI, RSI, RDX, R10, R8 and R9, in this order (as far as they are used).
Once you have this you should look up how to write inline assembly in gcc.
The syscall instruction overwrites the RCX and R11 registers, and the kernel may modify memory, so we add these to the clobber list to make GCC aware of it.
The default entry point for the GNU linker is _start. Normally the standard library provides it, but without it you need to provide it.
It isn't really a function as there is no caller function to return to. So we must make another system call to exit our process.
Compile this with:
gcc -nostdlib nostd.c
And it outputs Hello world!, and exits.
On Windows the system calls are not published; instead they are hidden behind another layer of abstraction, kernel32.dll, which is always loaded when your program starts whether you want it or not. So you can simply include windows.h from the Windows SDK and use the Win32 API as usual:
#include <windows.h>
void _start(void)
{
    const char str[] = "Hello world!\n";
    HANDLE hStdout = GetStdHandle(STD_OUTPUT_HANDLE);
    DWORD written;
    /* sizeof includes the terminating NUL, which should not be written. */
    WriteFile(hStdout, str, sizeof(str) - 1, &written, NULL);
    ExitProcess(12);
}
The windows.h has nothing to do with the standard C library, as you should be able to write Windows programs in any other language too.
You can compile it using the MinGW tools like this:
gcc -nostdlib C:\Windows\System32\kernel32.dll nostdlib.c
Then the compiler is smart enough to resolve the import dependencies and compile your program.
If you disassemble the program, you can see only your code is there, there is no standard library bloat in it.
So you can use C without the standard library.
What could you do? Everything!
There is no magic in C, except perhaps the preprocessor.
The hardest part, perhaps, is writing putchar, as that is platform-dependent I/O.
It's a good undergrad exercise to create your own version of varargs and, once you've got that, do your own version of vprintf, then printf and sprintf.
I did all of them on a Macintosh in 1986 when I wasn't happy with the stdio routines that were provided with Lightspeed C - wrote my own window handler with win_putchar, win_printf, win_getchar, and win_scanf.
This whole process is called bootstrapping and it can be one of the most gratifying experiences in coding - working with a basic design that makes a fair amount of practical sense.
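As a rough illustration of the exercise described above, here is a minimal sketch of a printf built on a single character-output routine. It uses the compiler's <stdarg.h> rather than a hand-rolled varargs, and all the names (my_putchar, my_vprintf, my_printf, out_buf) are made up for this example. The sink just appends to a buffer; on a real platform it would write to a UART, a console or a window instead. Only %d, %u, %x, %s, %c and %% are handled.

```c
#include <stdarg.h>
#include <stddef.h>

/* Hypothetical output sink: appends to a fixed buffer, dropping overflow. */
static char out_buf[128];
static size_t out_len;

static void my_putchar(char c)
{
    if (out_len + 1 < sizeof out_buf) {
        out_buf[out_len++] = c;
        out_buf[out_len] = '\0';
    }
}

/* Print v in the given base (2..16), most significant digit first. */
static void put_unsigned(unsigned long v, unsigned base)
{
    char digits[sizeof(unsigned long) * 8]; /* enough even for base 2 */
    size_t n = 0;
    do {
        digits[n++] = "0123456789abcdef"[v % base];
        v /= base;
    } while (v != 0);
    while (n > 0)
        my_putchar(digits[--n]);
}

void my_vprintf(const char *fmt, va_list ap)
{
    for (; *fmt != '\0'; fmt++) {
        if (*fmt != '%') {
            my_putchar(*fmt);
            continue;
        }
        fmt++; /* look at the conversion character */
        switch (*fmt) {
        case 'd': {
            int v = va_arg(ap, int);
            unsigned long u = (unsigned long)v;
            if (v < 0) {
                my_putchar('-');
                u = 0UL - u; /* magnitude, safe even for INT_MIN */
            }
            put_unsigned(u, 10);
            break;
        }
        case 'u': put_unsigned(va_arg(ap, unsigned), 10); break;
        case 'x': put_unsigned(va_arg(ap, unsigned), 16); break;
        case 's': {
            const char *s = va_arg(ap, const char *);
            while (*s != '\0')
                my_putchar(*s++);
            break;
        }
        case 'c': my_putchar((char)va_arg(ap, int)); break;
        case '%': my_putchar('%'); break;
        case '\0': return; /* format ended with a lone '%' */
        default: my_putchar('%'); my_putchar(*fmt); break;
        }
    }
}

void my_printf(const char *fmt, ...)
{
    va_list ap;
    va_start(ap, fmt);
    my_vprintf(fmt, ap);
    va_end(ap);
}
```

Calling my_printf("%s=%d, 0x%x%c", "n", -42, 255u, '!') leaves "n=-42, 0xff!" in out_buf. Keeping the va_list version separate from the variadic wrapper is what lets printf, sprintf and friends share one formatter.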
You're certainly not obligated to use the standard libraries if you have no need for them. Quite a few embedded systems either have no standard library support or can't use it for one reason or another. The standard even specifically talks about implementations with no library support, C99 standard 5.1.2.1 "Freestanding environment":
In a freestanding environment (in which C program execution may take place without any benefit of an operating system), the name and type of the function called at program startup are implementation-defined. Any library facilities available to a freestanding program, other than the minimal set required by clause 4, are implementation-defined.
The headers required by C99 to be available in a freestanding implementation are <float.h>, <iso646.h>, <limits.h>, <stdarg.h>, <stdbool.h>, <stddef.h>, and <stdint.h>. These headers define only types and macros so there's no need for a function library to support them.
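To give a feel for how far those headers alone can take you, here is a small sketch that uses nothing but <limits.h>, <stdbool.h>, <stddef.h> and <stdint.h>. The function names checked_add and fnv1a are invented for this example; the hash is the well-known 32-bit FNV-1a, picked only because it is short.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Adds a and b; returns false (leaving *sum untouched) if the result
   would not fit in an int. Uses only the limits from <limits.h>. */
bool checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *sum = a + b;
    return true;
}

/* 32-bit FNV-1a hash over len bytes, using fixed-width types
   from <stdint.h> and size_t from <stddef.h>. */
uint32_t fnv1a(const void *data, size_t len)
{
    const uint8_t *p = data;
    uint32_t h = UINT32_C(2166136261); /* FNV offset basis */
    for (size_t i = 0; i < len; i++) {
        h ^= p[i];
        h *= UINT32_C(16777619);       /* FNV prime */
    }
    return h;
}
```

Neither function calls into a library, so both link fine with -nostdlib or in a kernel.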
Without the standard library, you're entirely reliant on your own code, any non-standard libraries that might be available to you, and any operating system calls that you might be able to interface to (which might be considered non-standard library calls). Quite possibly you'd have to have your C program call assembly routines to interface to devices and/or whatever operating system might be on the platform.
You can't do a lot, since most of the standard library functions rely on system calls; you are limited to what you can do with the built-in C keywords and operators. It also depends on the system; in some systems you may be able to manipulate bits in a way that results in some external functionality, but this is likely to be the exception rather than the rule.
C's elegance is in its simplicity, however. Unlike Fortran, which includes much functionality as part of the language, C is quite dependent on its library. This gives it a great degree of flexibility, at the expense of being somewhat less consistent from platform to platform.
This works well, for example, in the operating system, where completely separate "libraries" are implemented, to provide similar functionality with an implementation inside the kernel itself.
Some parts of the libraries are specified as part of ANSI C; they are part of the language, I suppose, but not at its core.
None of them is part of the language keywords. However, all C distributions must include an implementation of these libraries. This ensures portability of many programs.
First of all, you could theoretically implement all these functions yourself using a combination of C and assembly, so you could theoretically do anything.
In practical terms, library functions are primarily meant to save you the work of reinventing the wheel. Some things (like string and library functions) are easier to implement. Other things (like I/O) very much depend on the operating system. Writing your own version would be possible for one O/S, but it is going to make the program less portable.
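The string routines really are the easy end of that spectrum. A minimal sketch, with my_ prefixes added so the names don't clash with a real libc; these are straightforward versions, not tuned ones:

```c
#include <stddef.h>

/* Length of a NUL-terminated string, not counting the terminator. */
size_t my_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}

/* Copy n bytes from src to dst; the regions must not overlap. */
void *my_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n-- > 0)
        *d++ = *s++;
    return dst;
}

/* Compare two strings; negative, zero or positive like strcmp. */
int my_strcmp(const char *a, const char *b)
{
    while (*a != '\0' && *a == *b) {
        a++;
        b++;
    }
    return (unsigned char)*a - (unsigned char)*b;
}
```

Nothing here touches the operating system, which is why these port anywhere, while anything that has to produce output does not.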
But you could write programs that do a lot of useful things (e.g., calculate PI or the meaning of life, or simulate an automaton). Unless you directly used the OS for I/O, however, it would be very hard to observe what the output is.
In day to day programming, the success of a programming language typically necessitates the availability of a useful high-quality standard library and libraries for many useful tasks. These can be first-party or third-party, but they have to be there.
The std libraries are "standard" libraries, in that for a C compiler to be compliant to a standard (e.g. C99), these libraries must be "include-able." For an interesting example that might help in understanding what this means, have a look at Jessica McKellar's challenge here:
http://blog.ksplice.com/2010/03/libc-free-world/
Edit: The above link has died (thanks Oracle...)
I think this link mirrors the article: https://sudonull.com/post/178679-Hello-from-the-libc-free-world-Part-1
The CRT is part of the C language just as much as the keywords and the syntax. If you are using C, your compiler MUST provide an implementation for your target platform.
Edit:
It's the same as the STL for C++. All languages have a standard library. Maybe assembler as the exception, or some other seriously low level languages. But most medium/high levels have standard libs.
The Standard C Library is part of ANSI C89/ISO C90. I've recently been working on the library for a C compiler that previously was not ANSI-compliant.
The book The Standard C Library by P.J. Plauger was a great reference for that project. In addition to spelling out the requirements of the standard, Plauger explains the history of each .h file and the reasons behind some of the API design. He also provides a full implementation of the library, something that helped me greatly when something in the standard wasn't clear.
The standard describes the macros, types and functions for each of 15 header files (including stdio.h and stdlib.h, but also float.h, limits.h, math.h, locale.h and more).
A compiler can't claim to be ANSI C unless it includes the standard library.
Assembly language has simple instructions that move values between CPU registers and memory and perform the machine's basic operations and calculations. C libraries ultimately compile down to the same kind of machine code. You can also use assembly code in your C programs. Assembly code is the human-readable form of machine code, which in turn directly encodes what the processor's circuits do.
So while the machine code, and therefore the assembly code, is built into the machine, C languages are combined of all kinds of pre-formed combinations of code, including your own functions that might be in part assembly language and in part calling on other functions of assembly language or other C libraries. So the assembly code is the foundation of all the programming, and after that it's anyone's guess about what is what. That's why there are so many languages and so few true standards.
Yes you can do a ton of stuff without libraries.
The lifesaver is __asm__ in GCC. It is a keyword so yes you can.
Mostly because every programming language is built on Assembly, and you can make system calls directly under some OSes.

Resources