Are libc and malloc part of the operating system? - c

I was having a discussion with a co-worker about malloc, and was wondering whether certain libc calls like malloc are implemented by the operating system.
I always thought that malloc was calling some symbols exposed in "sys" to declare which memory addresses it would use. I thought the operating system would let a program specify its segmentation through some OS-level API, which might look similar to:
int assign_memory_segmention(size_t start, size_t end);
I know my stdlib.h header is part of GNU because of the GPL header... and as GNU have made sure to inform me... they are not Unix. So is malloc just some type of function pointer to an OS heap implementation?

This question is best answered with another question: what is an Operating System? Or if you prefer: where do you draw the line between the OS and the standard libraries?
Technically, malloc is part of the standard C library. And since Linux is mainly written in C, and the same library also includes wrappers for many system calls that are not part of the C language, it is reasonable to think that this library is part of the OS.
But, on the other hand, there are several implementations of the C library, and the GNU C library is also available for other operating systems, such as GNU Hurd. And I'm sure that there are other languages out there that call the OS without using the standard C library. So, from that POV, it is not part of the OS.
But then, Linux is the kernel, the OS should be named GNU/Linux (citation needed). But again, there are Linux systems without GNU, such as Android...
The conclusion is: the term "Operating System" is not a technical one. If you want to be precise, use kernel or standard C library, etc.

Yes... and no. C malloc() is usually a sub-allocator for memory areas provided by OS calls. The OS manages all virtual memory - that is part of its job.

Related

What is the relation between Linux kernel and GNU C library?

We know that Linux kernel is written in C. But does it also call standard C functions like malloc() or extra functions like mmap() which are provided by GNU C library (glibc)? In that case, it's strange, because direct low-level interaction with hardware, e.g. memory management, is supposed to be almost always the task of a kernel. So, which is dependent on the other? Which is more fundamental/low-level?
We know that Linux kernel is written in C. But does it also call standard C functions like malloc()
No. However, the kernel defines similar functions like kmalloc. Note this is not part of a library; it's part of the kernel itself.
or extra functions like mmap()
Not mmap, but there are a lot of memory management functions in the kernel.
which are provided by GNU C library (glibc)?
Definitely not. The kernel does not use glibc ever.
So, which is dependent on the other?
Some parts of glibc depend on the kernel. Other parts (like strcpy) have nothing to do with the kernel and don't depend on it. The kernel never depends on glibc. You can run programs on Linux that use a different libc (like "musl") or that don't use a libc at all.

Why and when malloc() will not be available in C?

I've been given an 8051-based board with an embedded in-house operating system. I am using SDCC to create applications above the OS. And malloc is not available so I have to allocate memory statically. Why is that? Isn't malloc supposed to be on a dynamic library within the compiler?
TL;DR:
Why and when malloc() will not be available in C?
The only thing that can be said in general is that malloc() will be provided by every conforming, hosted C implementation, but there are other kinds, including another conforming kind.
Isn't malloc supposed to be on a dynamic library within the compiler?
Not exactly. malloc() is part of the C standard library, therefore it is provided by every conforming, hosted C implementation. A C implementation comprises a system for translating C source code into executable programs and a mechanism and environment for running the resulting programs. The former typically revolves around a compiler. The latter includes as much of the C standard library as the implementation provides, and this part is where malloc resides if it is available. Thus, no, malloc is technically not part of the compiler.
I'm sure that's not a distinction you meant to invoke, but it does bear on the answer. Note well that I said that malloc is provided by hosted implementations. These are the kind you ordinarily run into on general-purpose operating systems. They create programs that are launched in a standard way via the host OS, and they provide all the features of the C standard library in conjunction with the OS. But there are also freestanding implementations. One of the key differences is that freestanding implementations are excused from providing most of the standard library, including malloc().
You will commonly find freestanding implementations in use for and on embedded systems, such as yours. They are also used for OS kernels, boot loaders, and other such programs that run directly on bare metal. That your programs run on top of an OS makes your environment a bit of a Cadillac among embedded systems, but does not ensure that the C implementation is a hosted one. Inasmuch as it does not provide malloc, it cannot be a conforming hosted implementation, but it can be a conforming freestanding implementation. It ought to document which, if either, it claims to be. If it is freestanding but provides other standard library functions then you can consider that a luxury.
Some guidelines for (safety) critical systems do not allow dynamic memory allocation.
For example, the MISRA C:2004 guidelines have the following rule:
20.4 - Dynamic heap memory allocation shall not be used.
One way to follow the rule is: Don't bring or implement malloc() and other dynamic memory allocation functions to the system.
Those kinds of systems are typically embedded systems, where memory needs are well known and bounded at or before compile time. So dynamic memory allocation can be avoided without pain.
With C libraries included in your project, you can use functions like malloc, printf, etc. Keep in mind that the 8051 is a low-memory device, with memory on the order of a few KB. Hence, including C libraries would increase the size of the output .hex file, and you would run out of memory.

Linux kernel: What kind of C Linux kernel is using?

I am confused here. They say the Linux kernel is developed using C. But to my knowledge, the C library is built on top of the Linux kernel, so in kernel land, there should be no C just yet. And yet again, the kernel code I saw on GitHub was all written in C, all with those weird includes! It's just like that classic chicken vs egg puzzle to me. Which one exists first?
Thanks in advance for your patience with my stupid question(s).
C isn't built on top of Linux. C in itself is a compiled programming language, which a compiler translates into machine code. Depending on your OS, the compiler may do it differently (for some C code).
But the language C itself really is just a very long list of things functions should do and how things should behave, and compilers just obey these rules. That's what is called the C "standard". There is a committee that sets it, and there are multiple versions.
Linux Kernel was indeed written in C. So someone wrote it and then compiled it using a standard-compliant C compiler.
As for libraries, they're optional. The Linux kernel is developed without dependencies, that means it implements everything it needs itself, in plain C. These includes you see are just files from the kernel itself, defining its functions, types etc.
The linux kernel (and other kernels) is developed freestanding, this means it doesn't use any external libraries. Every function it needs is implemented inside the kernel. What you call "weird includes" are includes declaring its own internal functions and types.
The C specification makes a distinction between hosted and freestanding implementations. For some details, see Is there a meaningful distinction between freestanding and hosted implementations? and https://stackoverflow.com/questions/35164489/what-is-the-reason-for-creating-freestanding-vs-hosted-implementation.
One of the differences is that freestanding implementations are not required to provide all the standard library functions. When compiling a Unix kernel, we use the compiler in a freestanding mode, because many of the standard libraries depend on having a kernel beneath them. In particular, the standard I/O library requires an operating system with files, but the kernel is where that all gets implemented, so it can't be used from the kernel.
While there are some library functions, like the ones in <string.h>, that could be the same in the kernel, to keep things simple it doesn't link with any of the standard libraries. There are functions like strcpy() in the kernel, but they're copies of the standard library code, not linked with the same libraries (on many systems, the standard C library is dynamically linked, but this isn't feasible in the kernel).
So the kernel makes use of the C language, but none of the C libraries.

The C language and Mac OSX

I was wondering whether anybody here could help me better understand the relationship between OSX and C. There's some developer information related to C++ in xcode but nothing for C.
I believe one fundamental difference is that osx uses libc as opposed to glibc. Can anybody point me to libc documentation? I can't seem to find any.
I've seen the usr/includes folder but all that does is make me wonder where I can get a reference that elucidates all the options available to me. For instance, I just discovered <tree.h>. That's all well and good but is there any documentation? Or do I need to trawl the includes folder?
It seems that you're asking whether the functionality that OSX provides to you as a programmer is partially different from other *nix systems; focusing on the functionality that OSX's implementation of the C Standard Library provides you with.
Now keep in mind that while the C Standard Library is a very common way to take advantage of the functionality the operating system kernel exposes, it's not the only way. You can use other low-level libraries, or write low-level functions yourself.
Having said that, consider the following:
OSX, like many other *nix systems, is "mostly POSIX-compliant". Meaning that its particular C Standard Library implementation will likely expose headers defined by the POSIX standard. This is the stuff you can rely on regardless of whether you use libc, glibc, or some other implementation of the C Standard Library.
Depending on the particular C Standard Library you're using, it might come with additional functionality, like BSD libc - we say "superset of the POSIX Standard Library" to that. While it can contain implementations of things specific to BSD (and therefore OSX), it mostly seems to contain things that can be implemented regardless of the operating system flavour. For example, the sys/tree.h header that you mention is "an implementation of Red-black tree and Splay tree" - by no means something that couldn't have been implemented on a Linux system!
To sum up:
OSX comes with an implementation of the C Standard Library called BSD libc that provides some additional headers on top of what the POSIX Standard defines.
The difference in functionality between the XNU kernel used by OSX and other *nix kernels will not necessarily be captured in the difference between the C Standard Library implementations. If you want to know what the XNU kernel can do for you that the Linux kernel can't, the place to start is with the kernels themselves.
So your question can be split into:
What is the difference between glibc and BSD libc?
and
What is the difference between the XNU kernel and the Linux kernel?
It's a bit unclear what you're asking.
OS X is based on top of FreeBSD, a POSIX-compliant UNIX operating system. The relationship between OS X and C is that C is one of many programming languages that you can code in to develop for the platform (C is the core of Objective-C, an otherwise unused language that Apple champions).
OS X doesn't use libc. clang, the compiler that ships as part of Apple's developer tools package for OS X, uses libc. There's a difference. If you want to use glibc, grab GCC from Homebrew or Macports and use it to compile your programs instead of clang.
Lastly, you can't find documentation for libc, as all C libraries, like libc, glibc, etc, all provide the same set of functions if they are standards-compliant. There tend to be few differences end-user-wise between the different C libraries; so, if you want to find out about a header file, use man, like this: man clang to read clang documentation, for example.
Hope this helps.

I'm confused with C libraries

Ok here's the thing.
Most people learn about the C standard library at the same time as they first come into contact with the C language, and I wasn't an exception either. But as I am studying Linux now, I tend to get confused with C libraries. Well, first, I know that you get a nice old C standard lib as a static lib when you install gcc on your Linux distro. After that, you get a new stable version of glibc pretty soon as you connect to the internet.
I started to look into the glibc API and here's where I got messed up. glibc seems to support a vast number of libs, basically starting from the POSIX C Standard lib (which implements the standard C lib (including C99, as far as I know)) to its own extensions based on the POSIX standard C lib.
Does this mean that glibc actually modified or added functions in the POSIX C Standard lib? or even add whole new header set? Cause I see some functions that are not in the standard C lib but actually included in the standard C header (such as strnlen() in
Also referring to what I mentioned about a 'glibc making whole new header set', is because I'm starting to see some header files that seems pretty unique such as linux/blahblah.h or sys/syscalls.h <= (are these the libs that only glibc support?)
The next question is that I actually heard Linux is built based on the C language. Does this mean Linux compiles itself with its own gcc compiler?
For the first question, glibc follows both standard C and POSIX, from About glibc
The GNU C Library is primarily designed to be a portable and high performance C library. It follows all relevant standards including ISO C11 and POSIX.1-2008. It is also internationalized and has one of the most complete internationalization interfaces known.
For the second question, yes, you can compile Linux using gcc. Even gcc itself can be compiled using gcc, it's called bootstrapping.
Glibc implements the POSIX, ANSI and ISO C standards, and adds its own 'fluff', which it calls "glibc extensions". The reason that they are all "mixed together" is because they wrote the library as one package, there is no separate POSIX-only glibc.
<linux/blah> is not part of glibc. It is a set of headers written specifically for the operating system, by people outside of glibc, to give the programmer access to the Linux kernel API. It is "part" of the Linux kernel and is installed with it, and is used for kernel hacking. <sys/blah> is part of glibc, and is specific to Linux. It gives access to a fairly abstracted Linux system API.
As for your second question, yes. Linux is written in C, as it is (according to Linus) the only programming language for kernel and system programming. The way this is done is through a technique called bootstrapping, where a small compiler is built (usually manually in ASM) and builds the entire kernel or the entirety of GCC.
There is one more thing to be aware of: one of the purposes of the libc is to abstract from the actual system kernel. As such, the libc is the one part of your app that is kernel specific. If you had a different kernel with different syscalls, you would need to have a specially compiled libc. AFAIK, the libc is therefore usually linked as a shared library.
On linux, we usually have the glibc installed, because linux systems usually are GNU/Linux systems with a GNU toolchain on top of the linux kernel.
And yes, the glibc does extend the standards in certain spots: the asprintf() function, for instance, originated as a GNU addition. It almost made it into the C11 standard subsequently, but until it becomes part of the standard, its use will require a glibc-based system, or statically linking with the glibc.
By default, the glibc headers do not define these GNU additions. You can switch them on by defining the preprocessor macro _GNU_SOURCE before including the appropriate headers, or by specifying -std=gnu11 to the gcc call.

Resources