What is the relation between Linux kernel and GNU C library?

We know that Linux kernel is written in C. But does it also call standard C functions like malloc() or extra functions like mmap() which are provided by GNU C library (glibc)? In that case, it's strange, because direct low-level interaction with hardware, e.g. memory management, is supposed to be almost always the task of a kernel. So, which is dependent on the other? Which is more fundamental/low-level?

We know that Linux kernel is written in C. But does it also call standard C functions like malloc()
No. However, the kernel defines similar functions like kmalloc. Note this is not part of a library; it's part of the kernel itself.
or extra functions like mmap()
Not mmap, but there are a lot of memory management functions in the kernel.
which are provided by GNU C library (glibc)?
Definitely not. The kernel does not use glibc ever.
So, which is dependent on the other?
Some parts of glibc depend on the kernel. Other parts (like strcpy) have nothing to do with the kernel and don't depend on it. The kernel never depends on glibc. You can run programs on Linux that use a different libc (like "musl") or that don't use a libc at all.

Related

Why and when malloc() will not be available in C?

I've been given a 8051 based board with an embedded in-house operating system. I am using SDCC to create applications above the OS. And malloc is not available so I have to allocate memory statically. Why is that? Isn't malloc supposed to be on a dynamic library within the compiler?
TL;DR:
Why and when malloc() will not be available in C?
The only thing that can be said in general is that malloc() will be provided by every conforming, hosted C implementation, but there are other kinds, including another conforming kind.
Isn't malloc supposed to be on a dynamic library within the compiler?
Not exactly. malloc() is part of the C standard library, therefore it is provided by every conforming, hosted C implementation. A C implementation comprises a system for translating C source code into executable programs and a mechanism and environment for running the resulting programs. The former typically revolves around a compiler. The latter includes as much of the C standard library as the implementation provides, and this part is where malloc resides if it is available. Thus, no, malloc is technically not part of the compiler.
I'm sure that's not a distinction you meant to invoke, but it does bear on the answer. Note well that I said that malloc is provided by hosted implementations. These are the kind you ordinarily run into on general-purpose operating systems. They create programs that are launched in a standard way via the host OS, and they provide all the features of the C standard library in conjunction with the OS. But there are also freestanding implementations. One of the key differences is that freestanding implementations are excused from providing most of the standard library, including malloc().
You will commonly find freestanding implementations in use for and on embedded systems, such as yours. They are also used for OS kernels, boot loaders, and other such programs that run directly on bare metal. That your programs run on top of an OS makes your environment a bit of a Cadillac among embedded systems, but does not ensure that the C implementation is a hosted one. Inasmuch as it does not provide malloc, it cannot be a conforming hosted implementation, but it can be a conforming freestanding implementation. It ought to document which, if either, it claims to be. If it is freestanding but provides other standard library functions then you can consider that a luxury.
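One way to check which kind of implementation a compiler claims to be is the standard __STDC_HOSTED__ macro, which expands to 1 for a hosted implementation and 0 for a freestanding one. The following is a minimal sketch; the helper name is_hosted_implementation is made up for illustration, and whether a particular embedded toolchain defines the macro is something to confirm in its documentation.

```c
/* __STDC_HOSTED__ is specified by the C standard (C99 and later):
 * 1 for a hosted implementation, 0 for a freestanding one. */
#if defined(__STDC_HOSTED__) && __STDC_HOSTED__ == 1
#define BUILT_AS_HOSTED 1
#else
#define BUILT_AS_HOSTED 0
#endif

/* Hypothetical helper: returns 1 if this translation unit was compiled
 * as hosted, 0 if it was compiled as freestanding. */
int is_hosted_implementation(void)
{
    return BUILT_AS_HOSTED;
}
```

With GCC, for example, compiling the same file with -ffreestanding makes the function return 0.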
Some guidelines for (safety) critical systems do not allow dynamic memory allocation.
For example, the MISRA C:2004 guidelines have the following rule:
20.4 - Dynamic heap memory allocation shall not be used.
One way to follow the rule is: Don't bring or implement malloc() and other dynamic memory allocation functions to the system.
Those kinds of systems are typically embedded systems, where memory needs are well known and bounded at or before compile time. So dynamic memory allocation can be avoided without pain.
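When memory needs are known at compile time, a statically reserved pool of fixed-size blocks is a common substitute for malloc(). The sketch below is only an illustration: the names pool_alloc and pool_free and the sizes POOL_BLOCKS and BLOCK_SIZE are assumptions, not part of SDCC or of any MISRA rule, and the blocks are only byte-aligned.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BLOCKS 8   /* assumed capacity, fixed at compile time */
#define BLOCK_SIZE  32  /* assumed block size in bytes */

/* All storage is reserved statically; no heap is involved. */
static unsigned char pool[POOL_BLOCKS][BLOCK_SIZE];
static unsigned char in_use[POOL_BLOCKS];

/* Returns the first free block, or NULL when the pool is exhausted. */
void *pool_alloc(void)
{
    for (size_t i = 0; i < POOL_BLOCKS; i++) {
        if (!in_use[i]) {
            in_use[i] = 1;
            return pool[i];
        }
    }
    return NULL;
}

/* Returns a block to the pool; p must have come from pool_alloc(). */
void pool_free(void *p)
{
    if (p == NULL)
        return;
    size_t offset = (size_t)((unsigned char *)p - &pool[0][0]);
    /* Reject pointers that are not the start of a block in the pool. */
    assert(offset % BLOCK_SIZE == 0 && offset / BLOCK_SIZE < POOL_BLOCKS);
    in_use[offset / BLOCK_SIZE] = 0;
}
```

Because the worst-case memory use is visible in the array declaration, exhaustion shows up as a NULL return rather than as heap fragmentation.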
With C libraries included in your project you can use functions like malloc and printf. But the 8051 is a low-memory device, with memory on the order of a few KB, so including the C libraries would increase the size of the output .hex file and you would run out of memory.

Linux kernel: What kind of C Linux kernel is using?

I am confused here. They say linux kernel is developed using C. But to my knowledge, C library is built on top of Linux kernel, so at kernel land, there should be no C just yet. And yet again, the kernel code I saw from GitHub were all written in C, all with those weird includes! It's just like that classic chicken vs egg puzzle to me. Which one exists first?
Thanks in advance for your patience with my stupid question(s).
C isn't built on top of Linux. C itself is a compiled programming language that a compiler translates into machine code. Depending on your OS, the compiler may do this differently (for some C code).
But the C language itself is really just a very long list of rules for what functions should do and how things should behave, and compilers obey these rules. That is what is called the C "standard". A committee sets it, and there are multiple versions.
The Linux kernel was indeed written in C. So someone wrote it and then compiled it using a standard-compliant C compiler.
As for libraries, they're optional. The Linux kernel is developed without dependencies, that means it implements everything it needs itself, in plain C. These includes you see are just files from the kernel itself, defining its functions, types etc.
The Linux kernel (like other kernels) is developed freestanding, which means it doesn't use any external libraries. Every function it needs is implemented inside the kernel. What you call "weird includes" are includes declaring its own internal functions and types.
The C specification makes a distinction between hosted and freestanding implementations. For some details, see Is there a meaningful distinction between freestanding and hosted implementations? and https://stackoverflow.com/questions/35164489/what-is-the-reason-for-creating-freestanding-vs-hosted-implementation.
One of the differences is that freestanding implementations are not required to provide all the standard library functions. When compiling a Unix kernel, we use the compiler in a freestanding mode, because many of the standard libraries depend on having a kernel beneath them. In particular, the standard I/O library requires an operating system with files, but the kernel is where that all gets implemented, so it can't be used from the kernel.
While there are some library functions, like the ones in <string.h>, that could be the same in the kernel, to keep things simple it doesn't link with any of the standard libraries. There are functions like strcpy() in the kernel, but they're copies of the standard library code, not linked with the same libraries (on many systems, the standard C library is dynamically linked, but this isn't feasible in the kernel).
So the kernel makes use of the C language, but none of the C libraries.
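To illustrate what "the language without the library" looks like, here is a sketch of string helpers that rely on nothing beyond <stddef.h>, one of the few headers a freestanding implementation must provide. The names k_strcpy and k_strlen are made up for this example; this is not the kernel's actual code.

```c
#include <stddef.h>  /* size_t; available even in freestanding mode */

/* Hypothetical stand-in for strcpy(): copies src, including its
 * terminating NUL, into dest and returns dest. */
char *k_strcpy(char *dest, const char *src)
{
    char *d = dest;
    while ((*d++ = *src++) != '\0')
        ;
    return dest;
}

/* Hypothetical stand-in for strlen(): counts bytes before the NUL. */
size_t k_strlen(const char *s)
{
    const char *p = s;
    while (*p != '\0')
        p++;
    return (size_t)(p - s);
}
```

Nothing here makes a system call or needs a linked library, which is why functions of this kind can live inside a kernel unchanged.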

The C language and Mac OSX

I was wondering whether anybody here could help me better understand the relationship between OSX and C. There's some developer information related to C++ in xcode but nothing for C.
I believe one fundamental difference is that osx uses libc as opposed to glibc. Can anybody point me to libc documentation? I can't seem to find any.
I've seen the usr/includes folder but all that does is make me wonder where I can get a reference that elucidates all the options available to me. For instance, I just discovered <tree.h>. That's all well and good but is there any documentation? Or do I need to trawl the includes folder?
It seems that you're asking whether the functionality that OSX provides to you as a programmer is partially different from other *nix systems; focusing on the functionality that OSX's implementation of the C Standard Library provides you with.
Now keep in mind that while the C Standard Library is a very common way to take advantage of the functionality the operating system kernel exposes, it's not the only way. You can use other low-level libraries, or write low-level functions yourself.
Having said that, consider the following:
OSX, like many other *nix systems, is "mostly POSIX-compliant". Meaning that its particular C Standard Library implementation will likely expose headers defined by the POSIX standard. This is the stuff you can rely on regardless of whether you use libc, glibc, or some other implementation of the C Standard Library.
Depending on the particular C Standard Library you're using, it might come with additional functionality, like BSD libc - we say "superset of the POSIX Standard Library" to that. While it can contain implementations of things specific to BSD (and therefore OSX), it mostly seems to contain things that can be implemented regardless of the operating system flavour. For example, the sys/tree.h header that you mention is "an implementation of Red-black tree and Splay tree" - by no means something that couldn't have been implemented on a Linux system!
To sum up:
OSX comes with an implementation of the C Standard Library called BSD libc that provides some additional headers on top of what the POSIX Standard defines.
The difference in functionality between the XNU kernel used by OSX and other *nix kernels will not necessarily be captured in the difference between the C Standard Library implementations. If you want to know what the XNU kernel can do for you that the Linux kernel can't, the place to start is with the kernels themselves.
So your question can be split into:
What is the difference between glibc and BSD libc?
and
What is the difference between the XNU kernel and the Linux kernel?
It's a bit unclear what you're asking.
OS X is built on Darwin, whose kernel (XNU) and userland draw heavily on Mach and FreeBSD; it is a POSIX-compliant UNIX operating system. The relationship between OS X and C is that C is one of many programming languages that you can code in to develop for the platform (C is the core of Objective-C, an otherwise unused language that Apple champions).
OS X does use a libc: it is part of libSystem, which programs link against regardless of which compiler built them, so the C library is a property of the system rather than of clang. glibc is not available on OS X; installing GCC from Homebrew or MacPorts changes the compiler, not the C library.
Lastly, you won't find separate documentation for libc, because all C libraries, such as libc and glibc, provide the same set of functions if they are standards-compliant. There tend to be few differences between the C libraries from the user's point of view; so, if you want to find out about a function or header, use man. For example, man clang shows the clang documentation.
Hope this helps.

Are libc and malloc part of the operating system?

I was having a discussion with a co-worker about malloc, and was wondering whether certain libc calls like malloc are implemented by the operating system.
I always thought that malloc was calling some symbols exposed in "sys" to declare which memory addresses it would use. I assumed the operating system would allow the program's segmentation to be specified using some OS-level API, which might look similar to:
int assign_memory_segmention(size_t start, size_t end);
I know my stdlib.h header is part of GNU because of the GPL header... and as GNU have made sure to inform me... they are not Unix. So is malloc just some type of function pointer to an OS heap implementation?
This question is best answered with another question: what is an Operating System? Or if you prefer: where do you draw the line between the OS and the standard libraries?
Technically, malloc is part of the standard C library. And since Linux is mainly written in C, and the same library also includes wrappers for many system calls that are not part of the C language, it is reasonable to think of this library as part of the OS.
But, on the other hand, there are several implementations of the C library, and also, the GNU C library is available for other operating systems, such as Windows. And I'm sure that there are other languages out there that call the OS without using the standard C library. So, from that POV, it is not part of the OS.
But then, Linux is the kernel, the OS should be named GNU/Linux (citation needed). But again, there are Linux systems without GNU, such as Android...
The conclusion is: the term "Operating System" is not a technical one. If you want to be precise, use kernel or standard C library, etc.
Yes... and no. C malloc() is usually a sub-allocator that carves up memory areas obtained through OS calls. The OS manages all virtual memory - that is part of its job.
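The sub-allocator idea can be sketched as a toy bump allocator: it asks the OS for one region with mmap() and then hands out pieces of it entirely in user space. This is an illustration under assumptions (the names toy_alloc and ARENA_SIZE are invented, MAP_ANONYMOUS is assumed to be available as on Linux, and nothing is ever freed); real malloc implementations are far more elaborate.

```c
#define _DEFAULT_SOURCE  /* exposes MAP_ANONYMOUS on glibc */
#include <stddef.h>
#include <sys/mman.h>

#define ARENA_SIZE (64 * 1024)  /* assumed size of the region requested from the OS */

static unsigned char *arena;  /* start of the region obtained from the OS */
static size_t arena_used;     /* bytes handed out so far */

/* Hypothetical toy allocator: returns a 16-byte-aligned piece of the
 * arena, or NULL if the OS refuses the mapping or the arena is full. */
void *toy_alloc(size_t size)
{
    if (arena == NULL) {
        /* The only point where the OS is involved. */
        void *p = mmap(NULL, ARENA_SIZE, PROT_READ | PROT_WRITE,
                       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
        if (p == MAP_FAILED)
            return NULL;
        arena = p;
    }
    /* Round up to a multiple of 16 so later pointers stay aligned. */
    size_t rounded = (size + 15u) & ~(size_t)15u;
    if (rounded < size || rounded > ARENA_SIZE - arena_used)
        return NULL;
    void *result = arena + arena_used;
    arena_used += rounded;
    return result;
}
```

After the first call, every allocation is plain pointer arithmetic inside the process; the kernel only ever saw a single request for 64 KB.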

Writing a POSIX-compliant kernel

I've wanted to write a kernel for some time now. I already have a sufficient knowledge of C and I've dabbled in x86 Assembler. You see, I've wanted to write a kernel that is POSIX-compliant in C so that *NIX applications can be potentially ported to my OS, but I haven't found many resources on standard POSIX kernel functions. I have found resources on the filesystem structure, environment variables, and more on the Open Group's POSIX page.
Unfortunately, I haven't found anything explaining what calls and kernel functions a POSIX-compliant kernel must have (in other words, what kind of internal structure must a kernel have to comply with POSIX). If anyone could find that information, please tell me.
POSIX doesn't define the internal structure of the kernel, the kernel-to-userspace interface, or even libc, at all. Indeed, even Windows has a POSIX-compliant subsystem. Just make sure the POSIX interfaces defined at your link there work somehow. Note, however, that POSIX does not require anything to be implemented specifically in the kernel - you can implement things in the C library using simpler kernel interfaces of your own design where possible, if you prefer.
It just so happens that a lot of the POSIX-compliant OSes (BSD, Linux, etc.) have a fairly close relationship between many of those calls and the kernel layer, but there are exceptions. For example, on Linux, a write() call is a direct syscall, invoking a sys_write() function in the kernel. On Windows, however, write() is implemented in a POSIX support DLL, which translates the file descriptor to an NT handle and calls NtWriteFile() to service it, which in turn invokes a corresponding system call in ntoskrnl.exe. So you have a lot of freedom in how to do things - which makes things harder, if anything :)
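On Linux, the closeness between write() and the kernel can be seen by issuing the same system call by number. The sketch below is Linux-specific and assumes glibc's syscall() function and the SYS_write constant from <sys/syscall.h>; the two function names are made up for illustration. Note that syscall() is itself a libc helper, so this bypasses the write() wrapper rather than libc as a whole.

```c
#define _GNU_SOURCE       /* makes glibc declare syscall() */
#include <stddef.h>
#include <sys/syscall.h>  /* SYS_write */
#include <unistd.h>       /* write(), syscall() */

/* Writes len bytes through the ordinary libc wrapper. */
long write_via_libc(int fd, const char *msg, size_t len)
{
    return (long)write(fd, msg, len);
}

/* Issues the same system call by number, skipping the write() wrapper. */
long write_via_syscall(int fd, const char *msg, size_t len)
{
    return syscall(SYS_write, fd, msg, len);
}
```

Both functions return the number of bytes written, or -1 on error, because the kernel entry point they end up in is the same.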
The Open Group (opengroup.org) leaves the decisions about kernel syscalls to each implementation.
write(), for example, has to look and behave as specified, but what it calls underneath is not defined. A lot of calls like write, read, and lseek are free to call whatever entry point they want inside the kernel.
So, no, there really is nothing that says you have to have a certain function name with a defined set of semantics available in the kernel. It just has to be available in the C runtime library.

Resources