What kind of C is an operating system written in? - c

It makes sense that something like an operating system would be written in C. But how much of it, and what kind of C? I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something. What about a call stack? The OS is responsible for setting up all of this stuff that other applications use, but how does it do that? When you want to open or create a file in C, the appropriate functions ask the operating system for that file. so... What kind of C is on the other side of that call? Or on the other end of a memory allocation?
Also, how much of an operating system would actually be written in C? All of it? What about architecture dependent code? What about the higher levels of abstraction--does that ever get written in higher level languages, like C++?
I mean, I'm just asking this out of sheer curiosity. I'm downloading the latest linux kernel now but it's taking forever. I'm not sure if I'll wind up being able to follow the code--or if I'll be caught in an inescapably complex web of stuff I've never seen before.

Excellent questions, all. The answer is: little to none of the standard C library is available in the "dialect" of C used to write an operating system. In the Linux kernel, for example, the standard memory allocation functions malloc, calloc, free etc. are replaced with special kernel-internal memory allocation functions kmalloc and kfree, with special restrictions on their use. The operating system must provide its own "heap" -- in the Linux kernel, physical memory pages that have been allocated for kernel use must be non-pageable and often physically contiguous. See this Linux Journal article on kmalloc and kfree. Similarly, the operating system kernel maintains its own special call stack, the use of which requires, from memory, special support from the GCC compiler.
Also, how much of an operating system would actually be written in C? All of it?
As far as I'm aware, operating systems are overwhelmingly written in C. Some architecture-specific features are coded in assembler, but usually as little as possible, to improve portability and maintainability: the Linux kernel has some assembler but tries to minimize it.
What about architecture dependent code? What about the higher levels of abstraction--does that ever get written in higher level languages, like C++?
Usually the kernel will be written in pure C, but sometimes the higher level frameworks and APIs are written in a higher level language. For example, the Cocoa framework/API on macOS is written in Objective-C, and the BeOS higher level APIs were written in C++. Much of Microsoft's .NET framework was written in C#, with the "Common Language Runtime" written in a mix of C++ and assembler. The Qt widget set most often used on Linux is written in C++. Of course, this introduces philosophical questions about what counts as "the operating system."
The Linux kernel is definitely worth looking at for this, although, it must be said, it is huge and intimidating for anyone to read from scratch.

What kind of C?
Mostly ANSI C, with a lot of time looking at the machine code it generates.
But, does an OS even have a heap?
Malloc asks the operating system for a pointer to some memory it is allowed to use. If a program running on an OS (user mode) tries to access memory it doesn't own, it will give a segmentation fault. An OS is allowed to directly access all the physical memory on the system, malloc not needed, no seg-faults on any address that exists.
What about a call stack?
The call stack actually often works at the hardware level, with a link register.
For file access, the OS needs access to a disk driver, which needs to know how to read the file system that's on the disk (there are a lot of different kinds). Sometimes the OS has one built in, but I think it's more common that the boot loader hands it one to start with, and it then loads another (bigger) one. The disk driver has access to the hardware I/O of the physical disk, and builds from that.

C is a very low level language, and you can do a lot of things directly. Any of the C library functions (like malloc, printf, clrscr etc.) need to be implemented first before you can invoke them from C (have a look at libc concepts, for example). I'll give an example below.
Let us see how the C library functions are implemented under the hood. We'll go with a clrscr example. When you implement such functions, you access system devices directly. For example, for clrscr (clearing the screen) we know that on a PC in text mode the video memory is resident at 0xB8000. Hence, to write to the screen or to clear it, we start by assigning a pointer to that location.
In video.c
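void clrscr()
{
    /* Text-mode video memory: 80x25 cells, each a character byte
       followed by an attribute byte. volatile keeps the compiler
       from optimizing the stores away. */
    volatile unsigned char *vidmem = (volatile unsigned char *)0xB8000;
    const long size = 80*25;
    long loop;

    for (loop=0; loop<size; loop++) {
        *vidmem++ = 0;   /* character byte */
        *vidmem++ = 0xF; /* attribute byte */
    }
}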
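The loop above can only run on bare metal, but its logic can be checked on an ordinary hosted system by pointing it at a plain array instead of 0xB8000. The following is a minimal sketch under that assumption; clear_text_buffer and count_cleared_cells are hypothetical helper names introduced here for illustration, not part of the original answer.

```c
#include <assert.h>
#include <stddef.h>

enum { TEXT_COLS = 80, TEXT_ROWS = 25 };

/* Hypothetical helper: same loop as clrscr(), but the buffer is passed in
   so it can target ordinary memory instead of the address 0xB8000. */
void clear_text_buffer(volatile unsigned char *vidmem, size_t cells)
{
    assert(vidmem != NULL);
    for (size_t i = 0; i < cells; i++) {
        *vidmem++ = 0;    /* character byte */
        *vidmem++ = 0x0F; /* attribute byte */
    }
}

/* Hypothetical helper: counts the cells that hold character 0 and
   attribute 0x0F, i.e. the cells that look "cleared". */
size_t count_cleared_cells(const unsigned char *buf, size_t cells)
{
    size_t cleared = 0;
    for (size_t i = 0; i < cells; i++) {
        if (buf[2 * i] == 0 && buf[2 * i + 1] == 0x0F)
            cleared++;
    }
    return cleared;
}
```

Passing a 4000-byte array (80 x 25 cells, two bytes each) to clear_text_buffer and then to count_cleared_cells should report every cell as cleared. In the real kernel the only difference is that the pointer is the fixed hardware address.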
Let us write our mini kernel now. This will clear the screen when the control is handed over to our 'kernel' from the boot loader. In main.c
void clrscr(void); /* defined in video.c */

void main()
{
    clrscr();
    for(;;); /* there is nothing to return to, so spin forever */
}
To compile our 'kernel', you might use gcc to compile it to a pure bin format.
gcc -ffreestanding -c main.c -o main.o
gcc -c video.c -o video.o
ld -e _main -Ttext 0x1000 -o kernel.o main.o video.o
ld -i -e _main -Ttext 0x1000 -o kernel.o main.o video.o
objcopy -R .note -R .comment -S -O binary kernel.o kernel.bin
If you noticed the ld parameters above, you see that we are specifying the default load location of your Kernel as 0x1000. Now, you need to create a boot loader. From your boot loader logic, you might want to pass control to your Kernel, like
jmp 08h:01000h
You normally write your boot loader logic in asm. Even before that, you may need to have a look at how a PC boots.
Better start with a tinier Operating system to explore. See this Roll Your Own OS Tutorial
http://www.acm.uiuc.edu/sigops/roll_your_own/

But how much of it, and what kind of C?
Some parts must be written in assembly.
I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something.
Some OS's have a heap. At the lowest level, memory is doled out in fixed-size slabs called pages. Your C library then partitions those pages with its own scheme into variable-sized blocks, which is what malloc hands back. You should learn about virtual memory, which is a common memory scheme in modern OS's.
When you want to open or create a file in C, the appropriate functions ask the operating system for that file. so... What kind of C is on the other side of that call?
You call into assembly routines that query hardware with instructions like IN and OUT. With raw memory access, you sometimes have regions of memory that are dedicated to communicating to and from hardware. This is called memory-mapped I/O. With DMA, by contrast, the device itself reads and writes main memory directly.
I'm not sure if I'll wind up being able to follow the code--or if I'll be caught in an inescapably complex web of stuff I've never seen before.
Yes you will. You should pick up a book on hardware and OS's first.

I mean, in C, if you needed some heap memory, you would call malloc. But, does an OS even have a heap? As far as I know, malloc asks the operating system for memory and then adds it to a linked list, or binary tree, or something. What about a call stack?
A lot of what you say in your question is actually done by the runtime library in userspace.
All that the OS needs to do is load the program into memory and jump to its entry point; most details after that can be handled by the user space program. Heap and stack are just areas of the process's virtual memory. The stack is tracked by nothing more than a pointer register in the CPU.
Allocating physical memory is something that is done at the OS level. The OS usually allocates fixed-size pages, which are then mapped into a user space process.

You should read the Linux Device Drivers 3. It explains pretty well the internals of the linux kernel.

I wouldn't start by reading the Linux kernel; it's too complicated for starters.
Osdev is an excellent place to start reading.
I have done a little OS with information from Osdev for a school subject. It runs on VMware, Bochs, and QEMU, so it's easy to test. Here is the source code.

Traditionally, C is mostly needed for the kernel and device drivers due to interaction with hardware. However, languages such as C++ and Java could be used for the entire operating system.
For more information, I've found Operating Systems Design and Implementation by Andrew Tanenbaum particularly useful, with LOTS of code samples.

malloc and the other memory management functions aren't keywords in C. They are functions of the standard C library, declared in <stdlib.h> and specified by the ISO C standard itself rather than by POSIX, which is why you can use malloc in C applications on most platforms.
If you want to know how the Linux kernel works, I recommend this book: http://oreilly.com/catalog/9780596005658/ . I think it's a good explanation with some C code inserted :).

Related

Compiling old C code + x86 disassembly of it

I'm working on book "Hacking: the art of exploitation" and I'm trying to go with the writer and get my hands dirty.
I downloaded the source code, and when I compile it the resulting executables produce the same output. But when I disassemble them using GDB, they have different addresses and different disassembly! I run the same commands as in the book!
Btw I've compiled with the command:
gcc -m32 -g code.c
I'm using 64bit PC and I learn x86 assembly.
So what's wrong? Is it because it's an old source code or what?
TL;DR You cannot, under normal circumstances, match the exact addresses of a binary that was compiled on a different machine than the one in the book.
Even though the question is kind of abstract, I will try to be as concise as possible. Please keep in mind that the reasons why the addresses in your local debugger differ from the ones in the book are numerous, so the ones I list below are definitely not exhaustive.
ASLR (Address Space Layout Randomization)
What ASLR does is randomize the higher bytes of the memory addresses at load time (so it doesn't randomize the offsets between the functions and variables inside the ELF), as a security mechanism against well-known binary exploitation strategies.
Let's assume that we have some compiled code containing, e.g., function_A and function_B (assuming we are on a Unix-like system, and the compiler flag is just the one that you suggested). If you look at the ELF file right before it gets loaded in memory, for example in the disassembler of gdb (so you are looking at the byte representation of the ELF itself), you'll find function_A at an address similar to 0x0000ABCD and function_B at an address similar to 0x0000EF12. If you set a breakpoint in main, run the binary and check the addresses again, you'll observe that the addresses have now changed to something like 0xUUUUABCD and 0xUUUUEF12, where U = unknown.
P.S. GDB disables ASLR by default, so inside gdb the load addresses stay the same from run to run. To observe different load addresses, run the binary outside gdb, or turn randomization back on inside gdb with set disable-randomization off.
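The effect described above can be observed without a debugger by printing a few addresses and comparing them across runs. This is a minimal sketch, assuming a Linux-like system; report_addresses and global_marker are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

static int global_marker; /* lives in the program's data segment */

/* Hypothetical helper: prints one address from each region of the process
   (code, globals, heap, stack). Returns the number of lines written,
   or -1 if the heap allocation failed. */
int report_addresses(FILE *out)
{
    int local_marker = 0; /* lives on the stack */
    void *heap_block;

    assert(out != NULL);
    heap_block = malloc(16);
    if (heap_block == NULL)
        return -1;

    /* Going through uintptr_t avoids a direct function-to-object
       pointer cast, which ISO C does not define. */
    fprintf(out, "code   %p\n", (void *)(uintptr_t)&report_addresses);
    fprintf(out, "global %p\n", (void *)&global_marker);
    fprintf(out, "heap   %p\n", heap_block);
    fprintf(out, "stack  %p\n", (void *)&local_marker);

    free(heap_block);
    return 4;
}
```

Calling report_addresses(stdout) from main and running the program several times from a shell should show at least the stack address changing when ASLR is enabled. Whether the code and global addresses also move depends on whether the compiler produced a position independent executable, which varies between toolchains and distributions.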
Compiler changes
The book was first published around 2003, if I recall correctly. Since then the GCC compiler has changed a lot, and even small changes to the code of the compiler can make a significant difference to the executable that it produces. So it is easy to see why, for example, the assembly representation of function_A may not even be close to its representation almost 20 years ago. (I know this is a bit abstract, but looking into it further would take a book to explain. I can suggest taking a look at Compilers: Principles, Techniques, and Tools, aka The Dragon Book.)
OS Environment
Since the book was published, Ubuntu (and Linux distros in general) have changed a lot. They have evolved, adding some features and removing others, in ways that affect, for example, the loader, which is responsible for loading the program into RAM. With that being said, keep in mind that changing the OS, especially if you change the family of Linux distribution (e.g. going from a Debian-based system to a Fedora-based one), affects the way the binary is loaded in memory, and that of course changes the addresses in memory.

C function code in malloc'd memory

Is there a way to malloc memory space and then copy function code inside the space in C?
This question might not make sense in practice. I ask this question out of curiosity so that I can get a better understanding about how c and its underlying implementation work.
Here's the follow-up questions if it is possible to copy the code into heap:
How to determine the size for the function binary code when copy?
Can we use function pointer to execute the code? (the code is placed inside malloc'd memory, and that part of memory might be marked as non-executable for safety reason, but I'm not sure about this)
This (or something like it) is possible on most machines, but the techniques you'd use are system-specific -- there's no standard C or C++ way to do it.
Even figuring out the length of a function so you can copy it is difficult. I don't think you can do it reliably if the function is in the same translation unit, because the compiler may have done optimization magic that you can't see. However, if the function is in a different file, then the interface to it will probably be more reliable (although there could be linker magic going on that you would have to understand and emulate to accomplish your goal.)
Other problems (on some systems) are that malloc'd memory may not be executable. (This is often the case to improve security by preventing execution of code placed in an overrun buffer area.) However, systems with executable protection often have an alternate memory allocation function that can give you a chunk of memory where executable code can be placed, and to which execution can transfer. Some variation of this feature is necessary to implement shared libraries.
Finally, although self-modifying code is probably the first thing people think of when considering your question, a reasonable, legitimate use of the relevant techniques might be in a native-code, just-in-time compilation system.
You may get better answers by specifying a particular OS and CPU where you want to do this.
The C standard (e.g. C11, read n1570) or the C++ one (e.g. C++11, C++14; notice that they have lambda expressions and std::function; read more about closures ...) does not define what a function address or pointer is (it only defines what calling such an address does; function pointers should point to existing functions, and there is no standard way to build new ones dynamically at runtime). In some systems (pure Harvard architectures) a function sits in a different address space than the C heap (and on these systems executing anything in malloc-ed heap makes no sense and is undefined behavior), so the C11 standard does not define conversions between function pointers and data pointers.
So, to your question
Is there a way to malloc memory space and then put function code inside the space in C?
the answer is NO in general (but on some systems you could generate code at runtime, see below).
However, on desktop or laptop PCs or server PCs or tablets (running common OSes like Linux, Windows, MacOSX, Android), you usually have a Von Neumann architecture and there is (for a given process) a single virtual address space sharing both code and data (notably heap data obtained with malloc). That virtual address space is organised in pages, and each page has its own memory protection. Read more about computer architecture, instruction sets, MMUs. Quite often heap allocated data is non-executable through the NX bit.
The operating system plays an essential role. You need to read an entire book about OS, such as Operating Systems: Three Easy Pieces.
(I am guessing that you want to "create" some new functions in your program at runtime and call them thru C function pointers; you should explain why; I suppose you are coding some application for a PC or a tablet with a Unix-like OS, practically a Linux-x86_64 distribution, but you could adapt my answer to Windows)
You could use some libraries for JIT compilation such as asmjit, libgccjit, LLVM (or libjit or GNU lightning) and they generate code which is executable.
You could also use dynamic loading techniques on some plugin; on POSIX systems look into dlopen & dlsym (which can be used to "create" function addresses from a loaded plugin, beyond what the C11 standard allows). A possible way would be to generate some C code in a temporary file, compile it into a plugin, and dlopen that generated plugin. See this answer for more details.
On Linux, you can use the mmap(2) and related system calls (used to implement malloc in your C standard library, and also by dlopen(3)) to change your virtual address space, and the mprotect(2) system call to change protection (on a page by page basis). So if you want to explicitly copy or generate some function code it has to go into an executable page (PROT_EXEC).
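As a rough illustration of the mmap(2) and mprotect(2) route, the sketch below places six hand-written machine code bytes into a fresh page, flips the page from writable to executable, and calls it. It assumes Linux on x86 or x86-64 and returns -1 anywhere else or if the system refuses the mapping; run_generated_code and int_fn are hypothetical names, and the byte sequence (mov eax, 42 followed by ret) is chosen here only for the demonstration.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

typedef int (*int_fn)(void);

/* Hypothetical helper: returns 42 if the generated code ran,
   or -1 if this platform or its security policy does not allow it. */
int run_generated_code(void)
{
#if defined(__x86_64__) || defined(__i386__)
    /* mov eax, 42 ; ret  (no addresses inside, so no relocation needed) */
    static const unsigned char code[] = { 0xB8, 0x2A, 0x00, 0x00, 0x00, 0xC3 };
    long page = sysconf(_SC_PAGESIZE);
    void *mem;
    int_fn fn;
    int result;

    if (page <= 0)
        return -1;
    assert(sizeof code <= (size_t)page);

    /* 1. Get a writable (not yet executable) page from the OS. */
    mem = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;

    /* 2. Copy the machine code in, then switch the page to read+execute. */
    memcpy(mem, code, sizeof code);
    if (mprotect(mem, (size_t)page, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, (size_t)page);
        return -1;
    }

    /* 3. Call it. The data-to-function pointer conversion is outside
       ISO C; it goes through uintptr_t and relies on the platform. */
    fn = (int_fn)(uintptr_t)mem;
    result = fn();

    munmap(mem, (size_t)page);
    return result;
#else
    return -1; /* the byte sequence above is x86-specific */
#endif
}
```

This only works because the copied bytes contain no relative or absolute addresses. The page is never writable and executable at the same time, which is the pattern hardened systems tend to tolerate, although some security policies reject even that.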
Notice that because of relocation issues (and offsets or absolute addresses in machine code), it is not easy to copy machine code. Copying with memcpy the bytes of a given function code into some executable page usually won't work without pain: often CALL or JUMP machine instructions are using PC-relative addressing, so copying them without changing their offset won't work.
if it is possible to copy the code into heap
No, it is not possible in general; and in practice it is much more difficult than what you believe (even on Linux-x86_64, where other approaches that I mentioned are preferable); if you want to go that route you need to care about low level implementation details (instruction set, processor, compiler, calling conventions, ABIs, relocation) and your code would be non-portable and brittle.
How to determine the size for the function binary code when copy?
That question (and the notion of function size) makes no sense in general. Some optimizing compilers are able to emit machine code which is shared between several C functions, or to emit several non-contiguous machine code chunks for a given function (and gcc -O2 is likely to do these optimizations, read about function cloning). On Linux you could use dladdr(3) (or the nm or readelf programs) to get a "symbol size" in the ELF sense, but that size might not mean much. And as I explained, you can't just byte-copy binary machine code, you need to relocate (some parts of) it.

How to instrument/profile memory(heap, pointers) reads and writes in C?

I know this might be a bit vague and far-fetched (sorry, stackoverflow police!).
Is there a way, without external forces, to instrument (track basically) each pointer access and track reads and writes - either general reads/writes or quantity of reads/writes per access. Bonus if it can be done for all variables and differentiate between stack and heap ones.
Is there a way to wrap pointers in general or should this be done via custom heap? Even with custom heap I can't think of a way.
Ultimately I'd like to see a visual representation of said logs that would show me variables represented as blocks (of bytes or multiples of) and heatmap over them for reads and writes.
Ultra simple example:
int i = 5;
int *j = &i;
printf("%d", *j); /* Log would write *j was accessed for read and read sizeof(int) bytes */
Attempt of rephrasing in more concise manner:
(How) can I intercept (and log) access to a pointer in C without external instrumentation of binary? - bonus if I can distinguish between read and write and get name of the pointer and size of read/write in bytes.
I guess (or hope for you) that you are developing on Linux/x86-64 with a recent GCC (5.2 in october 2015) or perhaps Clang/LLVM compiler (3.7).
I also guess that you are tracking a naughty bug, and not asking this (too broad) question from a purely theoretical point of view.
(Notice that practically there is no simple answer to your question, because in practice C compilers produce machine code close to the hardware, and most hardware does not have sophisticated instrumentation like the one you dream of)
Of course, compile with all warnings and debug info (gcc -Wall -Wextra -g). Use the debugger (gdb), notably its watchpoint facilities which are related to your issue. Use also valgrind.
Notice also that GDB (recent versions like 7.10) is scriptable in Python (or Guile), and you could code some scripts for GDB to assist you.
Notice also that recent GCC & Clang/LLVM have several sanitizers. Use some of the -fsanitize= debugging options, notably the address sanitizer with -fsanitize=address; they instrument the code to help in detecting bad pointer accesses, so they are sort-of doing what you want. Of course, the performance of the instrumented generated code decreases (depending on the sanitizer, the slowdown can be 10 or 20% or a factor of 50x).
Finally, you might even consider adding your own instrumentation by customizing your compiler, e.g. with MELT, a high level domain specific language designed for such customization tasks for GCC. This would take months of work, unless you are already familiar with GCC internals (then, only several weeks). You could add an "optimization" pass inside GCC which would instrument (by changing the Gimple code) whatever accesses or stores you want.
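Short of a compiler pass, a much cruder source-level approximation is to route the accesses you care about through logging macros. This is not what the tools above do, and it only sees accesses that are written with the macros; it is a minimal sketch in which TRACED_READ, TRACED_WRITE, log_access and access_totals are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Running totals, usable later as input for a heatmap or report. */
struct access_stats {
    unsigned long reads;
    unsigned long writes;
    size_t bytes_read;
    size_t bytes_written;
};

static struct access_stats access_totals;

/* Records one access and writes a log line to stderr.
   kind is 'R' for a read and 'W' for a write. */
static void log_access(const char *expr, const void *addr, size_t size, char kind)
{
    assert(addr != NULL);
    if (kind == 'R') {
        access_totals.reads++;
        access_totals.bytes_read += size;
    } else {
        access_totals.writes++;
        access_totals.bytes_written += size;
    }
    fprintf(stderr, "%c %s @%p (%zu bytes)\n", kind, expr, addr, size);
}

/* The pointer expression is evaluated twice, so it must not have side
   effects (no TRACED_READ(p++)). #ptr captures the pointer's source text. */
#define TRACED_READ(ptr) \
    (log_access(#ptr, (const void *)(ptr), sizeof *(ptr), 'R'), *(ptr))

#define TRACED_WRITE(ptr, value) \
    (log_access(#ptr, (const void *)(ptr), sizeof *(ptr), 'W'), *(ptr) = (value))
```

With int i = 5; int *j = &i; the expression TRACED_READ(j) yields 5 and logs a read of sizeof(int) bytes through j, and TRACED_WRITE(j, 7) logs a write before storing. Telling stack from heap would need extra bookkeeping, for example comparing the address against ranges recorded by a custom allocator.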
Read more about aspect-oriented programming.
Notice also that if your C code is generated, that is if you are meta-programming, then changing the C code generator might be very relevant. Read more about reflection and homoiconicity. Dynamic software updating is also related to your issues.
Look also into profiling tools like oprofile and into sound static source analyzers like Frama-C.
You could also run your program inside some (instrumenting) emulator (like Qemu, Unisim, etc...).
You might also compile for a fictitious architecture like MMIX and instrument its emulator.

CSR1000 allocate memory

I'm currently playing around with the CSR 1000 chip and I wanted to allocate memory. I tried using malloc but the compiler tells me:
undefined reference to `malloc'
I assume that is because gcc is run with -nostdlib parameter
So please could somebody with CSR uEnergy SDK experience, tell me why I can't allocate memory, and how I should do it instead??
If there is an SDK bundled with that chip that provides basic routines for memory allocation, then use those. Alternatively, you can write your own allocator or use an existing one off of the web (with some fiddling).
As a quick solution you can probably mark a region in memory using a modified linker script or by using the gcc 'section' attribute (more here) and then use that as your heap arena in your malloc allocator.
A very simple allocator would not keep any accounting information such as headers/footers, but would instead allocate linearly, one region after another (freeing would essentially be a no-op in this case). This won't get you far, but you will be able to run simple programs.
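A linear allocator of the kind just described might look like the sketch below. It assumes nothing about the CSR SDK: the arena is an ordinary static array rather than a linker-script section, and bump_alloc, bump_free, bump_reset and the arena size are hypothetical choices made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* One allocation unit, sized and aligned for the strictest basic types,
   so every returned block is suitably aligned for any object. */
typedef union {
    long long ll;
    long double ld;
    void *p;
} align_unit;

#define ARENA_UNITS 64 /* arena capacity in units; pick to fit the target's RAM */

static align_unit arena[ARENA_UNITS];
static size_t arena_next; /* index of the first unused unit */

/* Hands out the next free region, or NULL when the arena is exhausted. */
void *bump_alloc(size_t size)
{
    size_t units;
    void *block;

    assert(arena_next <= ARENA_UNITS);
    if (size == 0 || size > sizeof arena)
        return NULL;
    units = (size + sizeof(align_unit) - 1) / sizeof(align_unit); /* round up */
    if (units > ARENA_UNITS - arena_next)
        return NULL;
    block = &arena[arena_next];
    arena_next += units;
    return block;
}

/* Freeing a single block is a no-op: there is no per-block bookkeeping. */
void bump_free(void *block)
{
    (void)block;
}

/* The only way to reclaim memory is to release everything at once. */
void bump_reset(void)
{
    arena_next = 0;
}
```

This suits programs that allocate during start-up and never free, or that work in phases and call bump_reset between them. Anything with long-lived mixed allocations needs a real free list.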
You probably want something more sophisticated. You could also look into implementing some kind of memory pool or any of the standard allocation algorithms.
The classic book The C Programming Language by Brian Kernighan and Dennis Ritchie provides a simple memory allocator, if I recall correctly. You may want to have a look at that.
I have three months of experience with this chip.
The malloc function is found in standard C library, which is typically available in desktop software development or embedded linux. But this is a small and resource-limited embedded chip. There is no standard C library.
If you browse the uEnergy SDK installation directory, you will find something like this: C:\uEnergy_SDK-2.0.0\doc\reference\html\index.html. Click the Modules tab at the top. You will find that under the section "C Standard Library APIs", CSR provides a few functions that mimic a subset of the standard C library. Unfortunately, there are no functions like malloc.
In general, when you work with small embedded systems, it is quite often that there is no dynamic memory allocation. However, for RF applications which are usually event-driven, there is typically a simple dynamic memory allocation function provided so incoming packets can be handed to you by the OS to your application. I used TI's CC2430 and its Zigbee stacks. They provide functions osal_mem_alloc and osal_mem_free, which mimic the malloc and free in the standard C library.
From my experience working with both chips, I found that CSR is much more protective than TI, in the same way as iOS vs. Android. You don't know what MCU they use except they tell you that it is a 16-bit RISC.
I suspect they have the dynamic memory allocation internally but your application just can't use those functions. RF packets are handed to you by the OS in the AppProcessLmEvent function, from there you get your data via the p_event_data pointer. You don't have to deallocate it as the OS will do it for you once you finish handling that event.
So back to your question: you can't allocate memory dynamically, so you just reserve a block of memory as a global array and work on it.
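One common way to "work on" such a global array is to carve it into fixed-size blocks and keep the free ones on a linked list, which gives allocation and release without fragmentation. The sketch below makes no use of the uEnergy SDK; pool_alloc, pool_free, BLOCK_SIZE and BLOCK_COUNT are hypothetical names and sizes picked for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define BLOCK_SIZE  32 /* payload bytes per block, e.g. one packet */
#define BLOCK_COUNT 8  /* number of blocks reserved at build time */

/* A free block stores the link to the next free block inside itself;
   an allocated block uses the same bytes as payload. */
typedef union pool_block {
    union pool_block *next;
    unsigned char payload[BLOCK_SIZE];
} pool_block;

static pool_block pool[BLOCK_COUNT]; /* the reserved global array */
static pool_block *free_list;
static int pool_ready;

/* Threads every block onto the free list; runs once, on first use. */
static void pool_init(void)
{
    size_t i;
    free_list = NULL;
    for (i = 0; i < BLOCK_COUNT; i++) {
        pool[i].next = free_list;
        free_list = &pool[i];
    }
    pool_ready = 1;
}

/* Returns a BLOCK_SIZE-byte block, or NULL when all blocks are in use. */
void *pool_alloc(void)
{
    pool_block *block;
    if (!pool_ready)
        pool_init();
    if (free_list == NULL)
        return NULL;
    block = free_list;
    free_list = block->next;
    return block->payload;
}

/* Returns a block obtained from pool_alloc to the free list. */
void pool_free(void *ptr)
{
    pool_block *block = (pool_block *)ptr;
    if (block == NULL)
        return;
    assert(block >= &pool[0] && block <= &pool[BLOCK_COUNT - 1]);
    block->next = free_list;
    free_list = block;
}
```

Both operations take constant time and the memory footprint is fixed at link time, which is why this pattern is popular on small event-driven targets. The trade-off is that every request costs a full block, whatever its real size.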
Hope this helps.
add #include <malloc.h> to the head of your file

What is the need of randomizing memory addresses for loading libraries?

ldd displays the memory addresses where the shared libraries are linked at runtime
$ cat one.c
#include<stdio.h>
int main() {
printf ("%d", 45);
}
$ gcc one.c -o one -O3
$ ldd one
linux-gate.so.1 => (0x00331000)
libc.so.6 => /lib/tls/i686/cmov/libc.so.6 (0x00bc2000)
/lib/ld-linux.so.2 (0x006dc000)
$
From this answer to another question,
... The addresses are basically random numbers. Before secure implementations were devised, ldd would consistently indicate the memory addresses where the program sections were loaded. Since about five years ago, many flavors of Linux now intentionally randomize load addresses to frustrate would-be virus writers, etc.
I do not fully understand how these memory addresses can be used for exploitations.
Is the problem something like "If the addresses are fixed, one can put some undesirable code at that address which would be linked as if it was a library" or is it something more than this?
"If the addresses are fixed, one can put some undesirable code at that address which would be linked as if it was a library"
Yes.
Also, buffer overflow exploits require a consistent memory model so that the bytes that overflow the buffer do known things to known parts of the code.
http://www.corewars.org/ A great illustration of the principle.
Some vulnerabilities allow overwriting some address (stack overflows allow overwriting return addresses, exploits for heap overflows typically overwrite SEH pointers on Win32 and addresses (GOT entries) of dynamically called functions on Linux, ...). So the attacker needs to make the overwritten address point to something interesting. To make this more difficult, several counter-measures have been adopted:
Non-executable stacks prevent exploits from just jumping to some code the attacker has put on the stack.
W^X segments (segments which can never be writable and executable at the same time) prevent the same for other memory areas.
Randomized load addresses for libraries and position independent executables decrease the probability of successful exploitation via return-into-libc and return-oriented-programming techniques, ...
Randomized load addresses also prevent attackers from knowing in advance where to find some interesting function (e.g: imagine an attacker that can overwrite the GOT entry and part of the message for the next logging call, knowing the address of system would be "interesting").
So, you have to view load address randomization as another counter-measure among many (several layers of defense and all that).
Also note that exploits aren't restricted to arbitrary code execution. Getting a program to print some sensitive information instead of (or in addition to, think of string truncation bugs) some non-sensitive information also counts as an exploit; it would not be difficult to write some proof-of-concept program with this kind of vulnerability where knowing absolute addresses would make reliable exploits possible.
You should definitely take a look at return-into-libc and return-oriented-programming. These techniques make heavy use of knowledge of addresses in the executable and libraries.
And finally, I'll note there are two ways to randomize library load addresses:
Do it on every load: this makes (some) exploits less reliable even if an attacker can obtain info about addresses on one run and try to use that info on another run.
Do it once per system: this is what prelink -R does. It avoids attackers using generic information for e.g: all Redhat 7.2 boxes. Obviously, its advantage is that it doesn't interfere with prelink :).
A simple example:
If on a popular operating system the standard C library was always loaded at address 0x00100000, and a recent version of the standard C library had the system function at offset 0x00000100, then someone able to exploit a flaw in a program running on a computer with this operating system (such as a web server), causing it to write some data to the stack (via a buffer overrun), would know what to write. If they wrote 0x00100100 to the place on the stack where the current function expected its return address to be, then upon returning from the current function the system function would very likely be called. While they still haven't done everything needed to cause system to execute something that they want it to, they are close, and there are some tricks, writing more stuff to the stack after the address mentioned above, that have a high likelihood of resulting in a valid string pointer and a command (or series of commands) being run by this forced call to system.
By randomizing the addresses at which libraries are loaded the attacker is more likely to just crash the web server than gain control of the system.
The typical method is by a buffer overrun, where you put a particular address on the stack, and then return to it. You typically pick an address in the kernel where it assumes the parameters you've passed it on the stack have already been checked, so it just uses them without any further checking, allowing you to do things that normally wouldn't be allowed.

Resources