Why and when malloc() will not be available in C? - c

I've been given a 8051 based board with an embedded in-house operating system. I am using SDCC to create applications above the OS. And malloc is not available so I have to allocate memory statically. Why is that? Isn't malloc supposed to be on a dynamic library within the compiler?

TL;DR:
Why and when malloc() will not be available in C?
The only thing that can be said in general is that malloc() will be provided by every conforming, hosted C implementation, but there are other kinds, including another conforming kind.
Isn't malloc supposed to be on a dynamic library within the compiler?
Not exactly. malloc() is part of the C standard library, therefore it is provided by every conforming, hosted C implementation. A C implementation comprises a system for translating C source code into executable programs and a mechanism and environment for running the resulting programs. The former typically revolves around a compiler. The latter includes as much of the C standard library as the implementation provides, and this part is where malloc resides if it is available. Thus, no, malloc is technically not part of the compiler.
I'm sure that's not a distinction you meant to invoke, but it does bear on the answer. Note well that I said that malloc is provided by hosted implementations. These are the kind you ordinarily run into on general-purpose operating systems. They create programs that are launched in a standard way via the host OS, and they provide all the features of the C standard library in conjunction with the OS. But there are also freestanding implementations. One of the key differences is that freestanding implementations are excused from providing most of the standard library, including malloc().
You will commonly find freestanding implementations in use for and on embedded systems, such as yours. They are also used for OS kernels, boot loaders, and other such programs than run directly on bare metal. That your programs run on top of an OS makes your environment a bit of a Cadillac among embedded systems, but does not ensure that the C implementation is a hosted one. Inasmuch as it does not provide malloc, it cannot be a conforming hosted implementation, but it can be a conforming freestanding implementation. It ought to document which, if either, it claims to be. If it is freestanding but provides other standard library functions then you can consider that a luxury.

Some guidelines for (safety) critical systems does not allow dynamic memory allocations.
For example MISRA C:2004 guideline have the following rule:
20.4 - Dynamic heap memory allocation shall not be used.
One way to follow the rule is: Don't bring or implement malloc() and other dynamic memory allocation functions to the system.
Those kind of systems are typically embedded systems, where memory needs are well known/limited during or before compile time. So dynamic memory allocations can be avoided without pain.

With C libraries included in your project you can leverage the functions like malloc, printf....etc Understand that 8051 is a low memory foot print device in order of few KB. Hence inclusion of C libraries would increase the size of output .hex file and you will run out of memory.

Related

I want to know about C Library Configuration

The C Standard Library is independent of any operating system and system.
So, why use the input/output functions from the standard library?
Unix-specific POSIX system calls exist. Windows-specific input/output system calls exist.
Don't standard library functions eventually call system calls internally? Is this just for portability?
The API presented by the C standard library is uniform between operating systems, well, uniform provided all the "unspecified" parts of C roughly align (like the size of int).
The implementation of the C standard library is not independent of the operating system. Basically the implementation consists of the compiled source code the provides the API, and that compiled code matches the CPU / machine instruction sets, and possibly other items specific to the hardware bus width, supporting chip sets and other actual hardware details.
So, programming against the C API helps your program be "more portable" but that doesn't mean that any specific implementation of the C API is portable. Finally, there are lots of small details that aren't specified in detail, or are allowed to vary between platforms (like byte order, size of int, and so on). So even a program written against the standard C API might not work correctly on another machine, unless you write code that accommodates and reacts to the parts of the C API that might differ between platforms.
POSIX is basically a standard that eventually became incorporated into most C development environments. It is designed to provide a single API to program against for multiple UNIX platforms for items that lie outside of the core C language. There are POSIX implementations for Windows too, but Microsoft's historical offerings are notorious for not actually working correctly.
Yes, these APIs (if available) are implemented with code that eventually performs operating specific calls, and is presented in "machine code" that is very specific to the CPU instruction set. There are dozens of CPUs out there, and each major platform has its own matching compiler and matching C API libraries, if the C language is available.
The C language and API is there for portability, but portability isn't it's primary reason for existing (and there are lots of small corner cases where the same code isn't portable across all platforms unless it is written a certain way.) The primary reason it's there is not portability, it is because if the language features weren't consistently available across all platforms, then you wouldn't have "one C language" that could be used on multiple machines, you would have "many C-like languages, where each supported item would have to be checked" meaning you might know C on your development platform, but not know C on another platform.
As for the libraries, there are many libraries that might be absent in a typical machine, and when developing, you generally have to use a dependency checker to ensure the library is present (and sometimes the correct functions are available in the library) before successful use of the machine for development. Autoconf, for example, has m4 macros that can be configured to check if a library is present before compiling the programs.

Linux kernel: What kind of C Linux kernel is using?

I am confused here. They say linux kernel is developed using C. But to my knowledge, C library is built on top of Linux kernel, so at kernel land, there should be no C just yet. And yet again, the kernel code I saw from GitHub were all written in C, all with those weird includes! It's just like that classic chicken vs egg puzzle to me. Which one exists first?
Thanks in advance for your patience with my stupid question(s).
C isn't built ontop of linux. C in itself is a compiled programming language, that a compiler translates into machine code. Based on your OS, the compiler may do it differently (for some C code).
But the language C itself really is just a very long list of things functions should do and how things should behave, and compilers just obey these rules. Thats what is called the C "standard". There is a comittee that sets it, and there are multiple versions.
Linux Kernel was indeed written in C. So someone wrote it and then compiled it using a standard-compliant C compiler.
As for libraries, they're optional. The Linux kernel is developed without dependencies, that means it implements everything it needs itself, in plain C. These includes you see are just files from the kernel itself, defining its functions, types etc.
The linux kernel (and other kernels) is developed freestanding, this means it doesn't use any external libraries. Every function it needs is implemented inside the kernel. What you call "weird includes" are includes declaring its own internal functions and types.
The C specification makes a distinction between hosted and freestanding implementations. For some details, see Is there a meaningful distinction between freestanding and hosted implementations? and https://stackoverflow.com/questions/35164489/what-is-the-reason-for-creating-freestanding-vs-hosted-implementation.
One of the differences is that freestanding implementations are not required to provide all the standard library functions. When compiling a Unix kernel, we use the compiler in a freestanding mode, because the many of the standard libraries depend on having a kernel beneath them. In particular, the standard I/O library requires an operating system with files, but the kernel is where that all gets implemented, so it can't be used from the kernel.
While there are some library functions, like the ones in <string.h>, that could be the same in the kernel, to keep things simple it doesn't link with any of the standard libraries. There are functions like strcpy() in the kernel, but they're copies of the standard library code, not linked with the same libraries (on many systems, the standard C library is dynamically linked, but this isn't feasible in the kernel).
So the kernel makes use of the C language, but none of the C libraries.

No library C implementation

I've heard while looking at different C implementations that any system that hopes to implement C must minimally include certain libraries, stdarg.h etc. My question is why this is, it can't be that the C library is not Turing complete without some headers, and since the headers have been written it must be true that I could write them myself. Why, then, is it not permissible to have a C implementation consisting of just a compiler+linker toolchain? (of course, in this case interacting with the OS would require inline assembly or linked assembly code as well as knowledge of the system's syscalls etc., but that doesn't mean that C can't be written, does it?)
You confuse a property of the programming language, i.e. the language itself with additional features mandated by the standard.
"Turing complete" is just about the language itself; basically if you can use it to solve a certain class of problems (for a more exact definition, please see Wikipedia for a starter(!) ). That is quite an abstract concept and does not include any libraries. Basically, if you use such libraries, you just have to be able to write those libraries in the language itself. This is true for the C language.
About the libraries required: Your premise is wrong. C very well allows to omit the libraries themselves. That is the difference between a hosted (full libraries) and a freestanding (few target-specific headers, but no generated code). See 4p6.
The few headers are normally part of the compiler itself. They basically provide some typedefs and #defined constants, e.g. the range of the integer types (limits.h) and types of guaranteed minimum width (stdint.h, often also fixed-width types). stddef.h e.g. provides size_t and NULL.
While you do not need to use those headers, they already allow writing portable code for the program logic. Just see them as part of the language itself, tailored to the target.
The gcc C compiler, for instance actually is a freestanding implementation: It only provides the required headers, but not the standard library. Instead, it relies on the system library, which is e.g. glibc on Linux.
Note: Generally it is a bad idea to re-invent the wheel. So if you are on a hosted environment (i.e. full-grown OS), you should use the features available. Otherwise you might run into trouble, as these e.g. mightr provide additional functions not directly seen by your code. E.g. debugging or system/user-wide configuration like localisation support. Also debugging support might depend on you using the standard library, e.g. valgrind. Replacing memory allocation with your own code at least makes this much more difficult.
Not to mention maintainability. Not just others will understand your code easier if you use the standard names&semantics, but also yourself - just wait some years and try understanding your old code.
OTOH, if you are on a bare-metal embedded system, there is actually little use of most features the standard library. Including e.g. printf or scnaf just bloats your firmware, often without any actual use. For such systems, there are stripped-down libraries (e.g. newlib) which may be not completely compliant or allow to omit certain costly features, e.g. floating point conversion or the math lib. Still you only should use them iff you really need many of their features. And sometimes there is a middle way, but that requires some knowledge about the dependencies of the library.
Two reasons: compatibility and system interaction.
If you don't implement the whole C standard library, then code other people write won't work. Even the most basic C program uses library calls.
#include <stdio.h>
int main() {
printf("Hello world!\n");
return 0;
}
Without an agreed upon and fully implemented stdio.h that program will not run because the compiler doesn't know what printf() means.
Then there's system interaction. C has been called "portable assembly". This is because different computing environments do things differently, but C takes are of that for you (well, some of it). You can't write a portable stdio.h in assembly without losing your mind. But its more than that. Each C header file protects you from something that each environment does (or used to) do very differently.
stdlib.h shields you from differing memory models and process control.
stdio.h shields you from differing IO systems.
math.h shields you from differing floating point implementations.
limits.h shields you from differing data sizes.
locale.h shields you from differing locales.
And so on...
The C libraries provide a standard API that each environment can write to. When C is ported to a new environment, that environment is responsible for implementing those libraries according to the particulars of that system. You don't have to do that.
Nowadays we live in a much more homogeneous environment than when C was developed, but most of the C standard library still protects you from basic differences in how operating systems and hardware do things.
It didn't always used to be this way. An example that comes to mind is the hell of running a game on DOS. There was no standard interface to the sound and video card (if you had them). Each program had to ship drivers for each sound and video card they supported. If yours wasn't in there, sorry. If their driver was buggy, sorry.
Programming without the C standard library is kind of like that, but far far worse.

C runtime library : what for?

Context:
Creating a toy OS, written in assembly and C.
X86. 32-bits first, 64-bits then.
Currently read how to make a C library and try to understand the underlyings.
My question comes from the reading on this page.
I read the other SO questions concerning runtime libraries but still don't get it.
Question:
Ok, a runtime library seems to help providing low level functionalities that an application library cannot provide.
What wonders can I do with a runtime library ?
My searches on the subject lead to theoretical explanations (MSDN and so on).
I need a practical, visual explanation.
Update:
I saw the question the admins refer to, and I obviously read it already. But it was not enough high level. But that is fine :)
Thanks
A C runtime library is more a part of your C implementation than it is a core part of the operating system, especially if the C implementation provides only static linking, as might be the case in a toy OS.
Among other things, though, the C runtime library provides all the functions necessary for programs to obtain services from the OS, such as memory allocation and I/O. These are not necessarily the same functions the OS kernel uses internally for the same or similar purposes.
Programs written in other languages may or may not rely on the C language runtime (they may provide their own, independent one instead), and statically-linked C programs include all necessary functions in their own images, instead of relying on dynamically loading them from a library at run time. In a sense, the C runtime library is distributed and duplicated across all statically-linked programs built from C sources.

Are libc and malloc part of the operating system?

I was having a discussion with a co-worker about malloc, and Was wondering if it is the cases that certain libc calls like malloc are implemented by the operating system?
I always thought that malloc was calling some symbols exposed in "sys" to declare which memory addresses it would use. From what I thought the operating system would allow the program's segmentation to be specified using some os level api... which might similar to:
int assign_memory_segmention(size_t start, size_t end);
I know my stdlib.h header is part of GNU because of the GPL header... and as GNU have made sure to inform me... they are not Unix. So is malloc just some type of function pointer to an OS heap implementation?
This question is best asked with another question: what is an Operating System? Or if you prefer: where do you print the line between OS and standard libraries?
Technically, malloc is part of the standard C library. And since the Linux is mainly written in C, and that the same library also includes many system calls, not in the C language, it is reasonable to think that this library is part of the OS.
But, on the other hand, there are several implementations of the C library, and also, the GNU C library is available for others operating systems, such as Windows. And I'm sure that there are other languages out there that call the OS without using the standard C library. So, from that POV, it is not part of the OS.
But then, Linux is the kernel, the OS should be named GNU/Linux (citation needed). But again, there are Linux systems without GNU, such as Android...
The conclusion is: the term "Operating System" is not a technical one. If you want to be precise, use kernel or standard C library, etc.
Yes... and no. C malloc() is usually a sub-allocator to memory areas provided by OS calls. The OS manages all virtual memory - that is part of it's job.

Resources