How can Linux boot code be written in C?

I'm a newbie learning OS development. The book I'm reading says the BIOS copies the MBR (the first sector) to 0x7c00 and starts executing there in real mode.
And its example begins with 16-bit assembly code.
But when I look at today's Linux kernel, arch/x86/boot has 'header.S' and 'boot.h', yet the actual code is implemented in main.c.
This seems useful, since it avoids writing assembly.
But how, specifically, is this done in Linux?
I can roughly imagine that there are special gcc options and a link strategy involved, but I can't see the details.

I'm reading this question more as an X-Y problem. It seems to me the question is really whether you can write a bootloader (boot code) in C for your own OS development. The simple answer is yes, but it's not recommended. Modern Linux kernels are probably not the best source of information for creating bootloaders written in C unless you already understand what their code is doing.
If using GCC there are restrictions on what you can do with the generated code. In newer versions of GCC there is an -m16 option that is documented this way:
The -m16 option is the same as -m32, except for that it outputs the ".code16gcc" assembly directive at the beginning of the assembly output so that the binary can run in 16-bit mode.
This is a bit deceptive. Although the code can run in 16-bit real mode, the code generated by the back end uses 386 address-size and operand-size prefixes to make normally 32-bit code execute in 16-bit real mode. This means the code generated by GCC can't be used on processors earlier than the 386 (the 8086/80186/80286, etc.). This can be a problem if you want a bootloader that runs on the widest array of hardware. If you don't care about pre-386 systems, then GCC will work.
Bootloader code built with GCC has another downside. The address-size and operand-size prefixes that get added to many instructions add up and can make a bootloader bloated. The first stage of a bootloader is usually very constrained in space, so this could become a real problem.
You will need inline assembly or separate assembly-language objects for functions that interact with the hardware. You don't have access to the Linux C library (printf etc.) in bootloader code. For example, if you want to write to the video display, you have to code that functionality yourself, either by writing directly to video memory or by going through BIOS interrupts, as in the sketch below.
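As a hedged illustration of the BIOS-interrupt route (not code from the Linux tree): this sketch prints one character using the INT 10h teletype service. It assumes GCC/clang inline-assembly syntax and only makes sense in 16-bit real mode, i.e. compiled with -m16.

    /* Sketch: print a character via the BIOS INT 10h teletype service
     * (AH = 0x0E, AL = character, BH = page, BL = colour in graphics
     * modes). Real-mode only; compile with gcc -m16. */
    static void bios_putc(char c)
    {
        __asm__ volatile ("int $0x10"
                          : /* no outputs */
                          : "a"((unsigned short)(0x0e00 | (unsigned char)c)),
                            "b"((unsigned short)0x0007)
                          : "memory");
    }

Some BIOSes clobber registers inside their interrupt handlers, which is exactly the kind of detail that makes inline assembly here easy to get subtly wrong.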
To tie it all together and place things in a binary file usable as an MBR, you will likely need a specially crafted linker script (see the sketch below). In most projects these linker scripts have an .ld extension. The script drives the process of taking all the object files and putting them together in a fashion compatible with the legacy BIOS boot process (code that runs in real mode at 0x7c00).
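For a sense of what such a script looks like, here is a minimal, untested sketch in GNU ld syntax (the section layout and the OUTPUT_FORMAT("binary") trick are illustrative, not taken from any particular project):

    /* boot.ld (illustrative): flat binary, code at 0x7C00,
     * boot signature 0xAA55 in the last two bytes of the sector */
    OUTPUT_FORMAT("binary")
    SECTIONS
    {
        . = 0x7C00;
        .text : { *(.text) *(.rodata) *(.data) }
        . = 0x7C00 + 510;          /* pad out to byte 510 */
        .sig : { SHORT(0xAA55); }  /* BIOS boot signature */
    }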
There are so many pitfalls in doing this that I recommend against it. If you intend to write a 32-bit or 64-bit kernel, I'd suggest not writing your own bootloader; use an existing one like GRUB. Versions of Linux from the 1990s had their own bootloader that could boot from a floppy. Modern Linux relies on third-party bootloaders to do most of that work now. In particular, it supports bootloaders that conform to the Multiboot specification.
There are many tutorials on the internet that use GRUB as a bootloader. The OSDev Wiki is an invaluable resource. It has a Bare Bones tutorial that uses the original Multiboot specification (supported by GRUB) to bootstrap a basic kernel. The Multiboot specification can be targeted with a minimal amount of assembly language. Multiboot-compatible bootloaders automatically place the CPU in protected mode, enable the A20 line, can be queried for a memory map, and can be told to place you in a specific video mode at boot time.
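To make the "minimal assembly" point concrete, the Multiboot 1 header itself can even be emitted from C; the loader just scans the first 8192 bytes of the image for it. A hedged sketch (the section name .multiboot is an assumption, and your linker script must place that section near the start of the image):

    #include <stdint.h>

    #define MB_MAGIC 0x1BADB002u   /* Multiboot 1 header magic */
    #define MB_FLAGS 0x00000000u

    struct multiboot_header {
        uint32_t magic;
        uint32_t flags;
        uint32_t checksum;         /* magic + flags + checksum == 0 */
    };

    __attribute__((section(".multiboot"), aligned(4), used))
    static const struct multiboot_header mb_header = {
        .magic    = MB_MAGIC,
        .flags    = MB_FLAGS,
        .checksum = -(MB_MAGIC + MB_FLAGS),
    };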
Last year someone on the #Osdev chat asked about writing a two-stage bootloader located in the first two sectors of a floppy disk (or disk image), developed entirely with GCC and inline assembly. I don't recommend this, as it is rather complex and inline assembly is very hard to get right. It is very easy to write bad inline assembly that seems to work but isn't correct.
I have made available some sample code that uses a linker script and C with inline assembly to work with the BIOS interrupts to read from the disk and write to the video display. If anything, this code should be an example of why it's non-trivial to do what you are asking.

Related

Is it possible to build C source code written for ARM to run on x86 platform?

I got some source code in plain C. It is built to run on ARM with a cross-compiler on Windows.
Now I want to do some white-box unit testing of the code. And I don't want to run the test on an ARM board because it may not be very efficient.
Since the C source code is instruction set independent, and I just want to verify the software logic at the C-level, I am wondering if it is possible to build the C source code to run on x86. It makes debugging and inspection much easier.
Or is there some proper way to do white-box testing of C code written for ARM?
Thanks!
BTW, I have read the thread: How does native android code written for ARM run on x86?
It seems not to be what I need.
ADD 1 - 10:42 PM 7/18/2021
The physical ARM hardware that the code targets may not be ready yet, so I want to verify the software logic at a very early phase. Based on John Bollinger's answer, I am thinking about another option: just build the binary as usual for ARM, then use QEMU to find a compatible ARM CPU to run the code. The code is guaranteed not to touch any special hardware I/O, so a compatible CPU should be enough to run all of it, I think. If this is possible, I need to find a way to let QEMU load my binary on a piece of emulated bare metal. And to get some output, I need to at least write a serial port driver to bridge my binary to the serial port (see the sketch below).
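For what it's worth, on QEMU's generic virt machine the "serial port driver" can be almost nothing, because the PL011 UART's data register is memory-mapped at 0x09000000 there (an assumption that holds for -M virt; real boards differ). A bare-metal sketch, glossing over startup details:

    #include <stdint.h>

    /* PL011 UART data register on QEMU's virt machine (assumed) */
    static volatile uint32_t *const UART0_DR = (uint32_t *)0x09000000;

    static void uart_puts(const char *s)
    {
        while (*s)
            *UART0_DR = (uint32_t)*s++;  /* QEMU transmits immediately */
    }

    void _start(void)
    {
        /* real code needs a few assembly instructions before this to
         * set up a stack before any C runs */
        uart_puts("hello from emulated bare metal\r\n");
        for (;;)
            ;                            /* nowhere to return to */
    }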
ADD 2 - 8:55 AM 7/19/2021
Some more background: the C code targets the ARMv8 ISA, and it manipulates some hardware IPs which are not ready yet. I am planning to create a software HAL for those IPs and verify the C code over the HAL. If the HAL is good enough, everything can be purely software, and I guess the only missing part is an ARMv8-compatible CPU, which I believe QEMU can provide.
ADD 3 - 11:30 PM 7/19/2021
Just found this link. It seems QEMU user-mode emulation can be leveraged to run ARM binaries directly on x86 Linux. Will try it and get back later.
ADD 4 - 11:42 AM 7/29/2021
And some useful links (a small weak-symbol example follows the list):
Override a function call in C
__attribute__((weak)) and static libraries
What are weak functions and what are their uses? I am using a stm32f429 micro controller
Why the weak symbol defined in the same .a file but different .o file is not used as fall back?
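The links above all revolve around the same trick: GCC's __attribute__((weak)) lets a HAL ship default implementations that a test build can override just by linking a strong definition of the same symbol. A minimal sketch (hal_read_register is an illustrative name, not from the links):

    #include <stdio.h>

    /* weak default: a strong definition elsewhere silently replaces it */
    __attribute__((weak)) int hal_read_register(int addr)
    {
        (void)addr;
        return 0;   /* pretend the hardware returned zero */
    }

    int main(void)
    {
        printf("reg 0 = %d\n", hal_read_register(0));
        return 0;
    }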
Now I want to do some white-box unit testing of the code. And I don't want to run the test on an ARM board because it may not be very efficient.
What does efficiency have to do with it if you cannot be sure that your test results are representative of the real target platform?
Since the C source code is instruction set independent,
C programs vary widely in how portable they are. This tends to be less related to the CPU instruction set than to target-machine and implementation details such as data type sizes, word endianness, memory size, and floating-point implementation, as well as implementation-defined and undefined program behavior.
It is not at all safe to assume that just because the program is written in C, it can be successfully built for a different target machine than it was developed for, or that if it is built for a different target, its behavior there will be the same.
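A quick way to see this concretely is to compile a small probe of commonly divergent implementation details on both host and target and compare the output. A sketch (the items are typical ARM-versus-x86 differences; e.g. plain char is often unsigned on ARM ABIs and signed on x86):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint32_t probe = 0x01020304;

        printf("sizeof(long) = %zu\n", sizeof(long));
        printf("lowest byte in memory: 0x%02x (0x04 means little-endian)\n",
               *(unsigned char *)&probe);
        printf("plain char is %s\n", (char)-1 < 0 ? "signed" : "unsigned");
        return 0;
    }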
I am wondering if it is possible to build the C source code to run on x86. It makes debugging and inspection much easier.
It is probably possible to build the program. There are several good C compilers for various x86 and x86_64 platforms, and if your C code conforms to one of the language specifications then those compilers should accept it. Whether the behavior of the result is representative of the behavior on ARM is a different question, however (see above).
It may nevertheless be a worthwhile exercise to port the program to another platform, such as x86 or x86_64 Windows. Such an exercise would be likely to unmask some bugs. But this would be a project in its own right, and I doubt that it would be worth the effort if there is no intention to run the program on the new platform other than for testing purposes.
Or is there some proper way to do white-box testing of C code written for ARM?
I don't know what proper means to you, but there is no substitute for testing on the target hardware that you ultimately want to support. You might find it useful to perform initial testing on emulated hardware, however, instead of on a physical ARM device.
If your ARM code were a plain desktop-style application, for the most part there would be no difference and the code would just compile and run. My guess is you are developing for some device that does a specific task.
I do this for a lot of my embedded ARM code. Typically the core algorithms work just fine when built on x86, but the whole application does not. The problems come from the hardware other than the CPU. For example, I might be using an LCD display, some sensors, and FreeRTOS on the ARM project, but the code that runs on Windows has none of these. What I do is extract the important pieces of C/C++ code and write a test framework around them. In the real ARM code the device reads values from a sensor and does something with them. In the test code that runs on a desktop, the code reads fake sensor values from a data file and writes its output to another data file that can be analyzed (see the sketch below). This way I can have white-box tests for the most complicated code.
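A hedged sketch of that file-backed substitution (all names here, like sensor_read and fake_sensor.txt, are illustrative, not from the post):

    #include <stdio.h>

    #ifdef HOST_TEST
    /* host build: replay recorded values instead of touching hardware */
    static FILE *fake_input;

    int sensor_read(void)
    {
        int value;
        if (!fake_input)
            fake_input = fopen("fake_sensor.txt", "r");
        if (!fake_input || fscanf(fake_input, "%d", &value) != 1)
            return -1;              /* no more recorded data */
        return value;
    }
    #else
    /* target build: talk to the real ADC / I2C sensor here */
    int sensor_read(void)
    {
        return 0;   /* placeholder for the hardware access */
    }
    #endif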
May I ask, roughly what does this code do? An ARM processor with no peripherals would be kind of useless. Typically we use the processor to interact with some other hardware like a screen, some buttons, or Bluetooth. It's those interactions that are going to be the most problematic.

Why is /arch/x86/boot/header.S in assembly?

Why is this file written in assembly when it could simply use an easier language like C? And why hasn't anyone attempted to rewrite it in C?
It's boot-related code, and it's architecture dependent. Some bootloader code constructs (say, stack-related ones) might not be representable in C without breaking its major conventions. At the same time, the use of such constructs is typically unavoidable in the boot process.
Well, there is a school of thought that you could write boot-related code in C, but you would still have to use a lot of inline assembly in it to access very low-level features.
Also, a typical bootloader has to deal with a handful of limitations.
One of the limitations (at least for x86) is that when the machine is powered on, the processor starts running in 16-bit real mode. Less than 1 MB of RAM is available for use (boot facilities need to be small enough to fit), no virtual memory mechanism is available, and, in general, the memory addressing mode is quite restricted. When the BIOS POST program reads the boot sector from the boot device (say, an HDD), the loaded program has to read more facilities from the disk into memory and pass control to them. Obviously, as no OS is running at that point, no OS device drivers are available, and no standard C approach (say, using the standard I/O library) is applicable. Instead, it is the BIOS that provides device drivers, which serve a well-defined set of interrupts (for example, https://en.wikipedia.org/wiki/INT_13H ) to access data on different boot drives. So, roughly, the boot manager has to be written in assembly to use a very specific set of BIOS features in real mode.
All in all, taking all the points into account (code size, 16-bit real mode limitations, the need to use BIOS-specific features, and code constructs not representable in C), writing the whole thing in assembly is the most efficient and unambiguous way, rather than extending C to handle non-standard constructs or using a barely-readable mixture of C and inline assembly.
P.S. If you're interested in a more detailed description of bootloader internals, it would be useful to refer to a very eloquent example of FreeBSD bootstrapping and kernel initialisation: https://www.freebsd.org/doc/en_US.ISO8859-1/books/arch-handbook/boot-overview.html

Compiling C and assembling ASM into machine code [closed]

I have three questions:
What compiler can I use and how can I use it to compile C source code into machine code?
What assembler can I use and how can I use it to assemble ASM to machine code?
(optional) How would you recommend placing machine code in the proper addresses (i.e. bootloader machine code must be placed in the boot sector)?
My goal:
I'm trying to make a basic operating system, with a personally made bootloader and kernel. I would also try to take bits and pieces from the Linux kernel (namely the drivers) and integrate them into my kernel. I hope to create a 32-bit DOS-like operating system for messing with memory on most modern computers. I don't think I will be creating an executable format for my operating system, as my operating system won't be dynamic enough to require it.
My situation:
I'm running an x86-64 Windows 8 laptop with an Intel Celeron CPU; I believe it uses Secure Boot. I would be testing my operating system on an x86-64 desktop with an Intel Core i3 CPU. I have an average understanding of operating systems and their techniques. I know the C, ASM, and computer theory required for this project. It is also noteworthy that I'm sixteen with no formal education in computer science.
My research: After searching Google for what C normally compiles into, I found answers ranging from machine code, binary, plain binary, raw binary, assembly, and relocatable object code. Assembly as I understand normally assembles into a PE formatted executable. I have heard of the Cygwin, GCC, and MinGW C compilers. As for assemblers, I have heard of FASM, MASM, and NASM. I have searched websites such as OSDev and OSDever.
What I have tried: I tried to set up GCC (a nightmare) and create a cross-compiler (another nightmare).
Conclusion: As you can tell, I'm very confused about compilers, assemblers, and executable formats. Please dispel my ignorance along with answering my questions. These are probably the only things keeping me from having an OS on my resume. Sorry, I would have included more links, but Stack Overflow wouldn't let me post more than two. Thanks a ton!
First, some quick answers to your three questions.
Pretty much any compiler will translate C code into assembly code. That's what compilers do. GCC and clang are popular and free.
clang -S -o example.s example.c
Whichever compiler you choose will probably support assembly as well, simply by using the same compiler driver.
clang -o example.o example.s
Your linker documentation will tell you how to put specific code at specific addresses and so forth. If you use GCC or clang as described above, you will probably use ld(1). In that case, read up on 'linker scripts'.
Next, some notes:
You don't need a cross compiler or to set up GCC by yourself. You're working on an Intel machine, generating code for an Intel machine. Any binary distribution of clang or GCC that comes with your Linux distribution should work fine.
C compilers normally compile code into assembly, and then pass the resulting assembly off to a system assembler to end up with machine code. Machine code, binary, plain binary, and raw binary are all basically synonymous.
The generated machine code is packaged into some kind of executable file format, to tell the host operating system how to load and run the code. On windows, it's PE, on Linux, it's ELF, and on Mac OS X it's Mach-O.
You don't need to create an executable format for your OS, but you will probably want to use one. ELF is a pretty straightforward (and well-documented) option.
And a bit of a personal note that I hope doesn't discourage you too much - If you are not very familiar with how compilers, assemblers, linkers, and all of those tools work, your project is going to be very difficult and confusing. You might want to start with some smaller projects to get your "sea legs", so to speak.
At first "machine code" and "binary" are synonyms. "Object code" is some kind of intermediate form, that the linker will convert to binary at the end. Some C/C++ compilers generate not directly binary, but assembler source code, that they feed to the assembler, that produces object code and then to the linker, that makes the final binary. In the most cases these processes are transparent to the user. You feed the compiler with C/C++/Pascal/whatever source code and get a binary file at the output.
FASM assembler, aka flatassembler is the best assembler for OS development. There are several OSes already created in FASM.
That is because FASM is self-compiling and very easy to port. This way, in two or three days you can port it to your OS, and then your OS becomes self-sufficient, i.e. you will be able to compile programs from within your OS.
Another good feature of FASM is that it does not need a linker: it can directly generate binary files in several formats.
The big, active community is also very important. There are tons of sources available for FASM, including for OS development.
The message board is very active and is a place where one can learn a lot.
I think the first part of your question has been answered, so I'll take on the other two:
What assembler can I use and how can I use it to assemble ASM to machine code?
Any of nasm, yasm (basically very like nasm), fasm, or "masm", i.e. ml64.exe and ml.exe, freely available as part of the Microsoft tools.
Of these, I probably recommend either nasm or yasm. That recommendation is based entirely on personal preference - but the wide range of platforms they support, plus using Intel syntax by default are my reasons. I'd try a few and see what you like.
(optional) How would you recommend placing machine code in the proper addresses (i.e. bootloader machine code must be placed in the boot sector)?
Well, there is only one way to place the bootloader at the correct address for MBR - open the disk at LBA 0 and write exactly 512 bytes there, ending in 0x55AA. Flush, then close. The MBR usually also contains a partition table embedded in it - it is both code and data. The sciency term for this stuff is Von Neumann Architecture which can be briefly summarised as "programs and data are stored in the same place". The action of the BIOS on wanting to boot from disk will be to read the first 512 bytes into memory, check the signature and if it matches, execute that memory (starting from byte 0).
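To make the 512-byte rule concrete, here is a hedged sketch of a little host-side tool that pads a raw code blob to one sector and stamps the signature bytes at offsets 510 and 511 (file names are illustrative):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        unsigned char sector[512];
        memset(sector, 0, sizeof sector);

        FILE *in = fopen("boot.bin", "rb");   /* raw 16-bit code blob */
        if (!in) { perror("boot.bin"); return 1; }
        fread(sector, 1, 510, in);            /* code must fit in 510 bytes */
        fclose(in);

        sector[510] = 0x55;                   /* boot signature */
        sector[511] = 0xAA;

        FILE *out = fopen("mbr.img", "wb");
        fwrite(sector, 1, sizeof sector, out);
        fclose(out);
        return 0;
    }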
OK, that's those questions out of the way. Now I'll give you some more notes:
512 bytes is not really enough for anyone's bootloader. As such, some file systems contain boot sectors, and the first-stage bootloader simply loads the code/data found in these. This allows larger amounts of code to be loaded - enough to get a kernel going. For example, legacy GRUB contains stage1, stage1_5 and stage2 components.
Although most operating systems require you to use an executable format container, you don't need one. On disk and in memory, executable code is just sequences of bytes: opcodes and their operands. You can read the opcode reference or the Intel/AMD manuals to find out what hexadecimal value translates to what. Anyway, you can perform a direct conversion from assembler to binary using nasm like this:
nasm -f bin input.asm -o output.bin
This will work quite happily for 16-, 32- or 64-bit assembler, although the result likely won't execute. The only place it will is if you explicitly use the [bits 16] directive in your code along with org 100h; then you have an MS-DOS .com program. Unfortunately, this is the simplest binary format in existence: you only have code and data in one big lump, and it must not exceed the size of a single segment.
I feel this might handle this point:
I found answers ranging from machine code, binary, plain binary, raw binary, assembly, and relocatable object code.
The answer as to what assembly assembles to: it assembles to opcodes and memory addresses, depending on the assembler. This is represented as bytes, which are data all by themselves. You can read them raw with a hex editor, although there are few occasions where this is strictly necessary. I mention memory addresses because some opcodes control how memory addresses are interpreted; relocatable object code, for example, requires that addresses are not hard-coded (instead, they are interpreted as offsets from the current location).
Assembly as I understand normally assembles into a PE formatted executable.
It is fair to say that the assembly from which your C/C++ was derived is compiled to opcodes, which are then, along with anything else to be included in the program (data, resources), stored in an executable format such as PE. "Normally" depends on your OS.
If you have thoroughly read the OSDev Wiki, you'll realise segmented addressing is an utter pain - the standard and only usage of segments in modern operating systems is to define four segments spanning the entire address space - two data segments at ring 0 and 3, two code segments at ring 0 and 3.
If you haven't read the OSDev Wiki thoroughly, you should. I'd also recommend JamesM's kernel tutorials, which contain practical advice on building a kernel in C.
If you simply want to do bad things to a DOS kernel, you actually still can without writing a full kernel yourself. You should also be able to switch the CPU to protected mode from DOS, too. You need FreeDOS and an assembler of your choice. There is an excellent tutorial on terminate-and-stay-resident programs (which basically means hooking an interrupt routine, then editing yourself out of the active process list) in The Rootkit Arsenal. There are probably tutorials on the internet for this, too.
I might be tempted to recommend doing this as a first, just to get yourself used to this kind of low level stuff.
If you just wanted to poke an OS, you can set up kernel debugging on Windows. WinDbg is a bit... arcane, but once you get used to it, it makes sense.
You mention your laptop uses Secure Boot. If so, your laptop uses UEFI. If you want to read up on this, the UEFI spec is 100% guaranteed more boring than your maths homework, but I recommend skimming it just to understand the goals and the basic environment. The important thing is to have the EFI SDK, which enables you to build EFI-compatible applications (which are in PE format and live on a FAT32 partition on your disk, so installing an EFI bootloader is very simple even if writing one is not). If I had to make an honest recommendation, I'd stick to MBR for now, since emulating OSes with MBR is much easier than EFI at the time of writing, and you really do want to do this in some form of VM for now. Also, I'd use an existing bootloader like GRUB, since bootloaders are not all that exciting, really.
Others have said it, and I will say it: you absolutely want to do anything like this under some form of emulator or virtual machine. You will make a mistake, guaranteed, and you will come up against things you don't understand. Emulators and VM software are free these days, and some, such as BOCHS, will tell you the reason for a given fault, trap, etc. This is massively helpful!
First, use something like VirtualBox for your testing.
I think you might want to take some smaller steps and get comfortable writing C code.
Then look into how boot sectors on disks work (well documented on the internet), and look at the code of other open-source bootloaders.
Then look at how to do task switching. It's not too hard to write. You can even write most of it while running under your normal OS before trying to embed it into your own OS.
With C compilers you can generally mix in inline assembly, usually with asm { /* assembly code */ }.

How to write a custom kernel on mac?

I've been following the "Mike OS Guide" to make my own kernel, and I got it working. But then I moved on to the many guides on the internet for making a boot sector in NASM that loads a main function from a compiled C object. I have tried compiling and linking with all kinds of GCC installations:
x86_64-pc-linux-
arm-uclinux-elf-
arm-agb-elf-
arm-elf-
arm-apple-darwin10-
powerpc-apple-darwin10-
i686-apple-darwin10-
i586-pc-linux-
i386-elf-
All of them fail once I put them onto a floppy like I do with the MikeOS bootstrap. I've tried various tutorials on http://www.osdever.net/ like the one here, and I've tried http://wiki.osdev.org/Bare_Bones , but none work when compiling on a Mac; I have not tried on an actual Linux machine yet. I was wondering how I could get a bootstrap in assembly that calls the C function, put them together into a working kernel file, load that onto a floppy image, and then onto an ISO like in the MikeOS tutorial. Or should I just make the kernel.bin and load it with syslinux? Could anyone give me a tip on how to make this all work in a Mac development environment? I have tools via MacPorts and Homebrew, so that helps. Has anyone successfully done this?
EDIT
Here's my bootsector so far.
I just want to know how to jump to an extern C function and link it.
There are a few problems with this. First of all, all the compilers you mentioned output either 32-bit or 64-bit code. That's great, but when the boot sector starts, it's running in 16-bit real mode. If you want to be able to run that 32-bit or 64-bit code, you'll need to first switch to the appropriate mode (32-bit protected mode for, well, 32-bit, and long mode for 64-bit).
Then, once you switch to the appropriate mode, you don't even have that much space for code: boot sectors are 512 bytes; two bytes are reserved for the bootable signature, and you'll need some bytes for the code that switches to the appropriate mode. If you want to be able to use partitions on that disk or maybe a FAT filesystem, take away even more usable bytes. You simply won't have enough space for all but the most trivial program.
So how do real operating systems deal with that? Real operating systems tend to use the boot sector to load a bigger bootloader from the disk. Then that bigger bootloader can load the actual kernel and switch to the appropriate mode (although that might be the responsibility of the loaded kernel — it depends).
It can be a lot of work to write a bootloader, so rather than rolling your own, you may want to use GRUB and have your kernel comply to the Multiboot standard. GRUB is a bootloader which will be able to load your kernel from the disk (probably in ELF format) and jump to the entry point in 32-bit protected mode. Helpful, right?
This does not free you from learning assembly, though: the entry point of the kernel must be assembly. Often, all it does is set up a little stack and pass the appropriate registers to a C function with the correct calling convention (the C side of that handoff is sketched below).
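A hedged sketch of that C side, assuming GRUB and Multiboot 1 (kernel_main is an illustrative name; 0x2BADB002 is the value a Multiboot 1 loader leaves in EAX, which the assembly stub passes along):

    #include <stdint.h>

    void kernel_main(uint32_t magic, uint32_t *multiboot_info)
    {
        if (magic != 0x2BADB002u)
            return;                 /* not started by a Multiboot loader */
        (void)multiboot_info;       /* memory map, modules, etc. live here */

        /* kernel initialisation continues here */
        for (;;)
            ;
    }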
You may think that you can just copy that rather than writing it yourself, and you'd be right, but it doesn't end there. You also need assembly for (at least):
Loading a new global descriptor table.
Handling interrupts.
Using non-memory-mapped I/O ports (see the sketch after this list).
…and so on, not to mention that if you have to debug, you may not have a nice debugger; instead, you'll have to look at disassemblies, register values, and memory dumps. Even if your code is compiled from C, you'll have to know what the underlying assembly does or you won't be able to debug it.
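For instance, port I/O has no C operator at all, so even a mostly-C kernel wraps the in/out instructions in inline assembly. A common sketch, assuming GCC/clang inline-assembly syntax:

    #include <stdint.h>

    static inline void outb(uint16_t port, uint8_t value)
    {
        __asm__ volatile ("outb %0, %1" : : "a"(value), "Nd"(port));
    }

    static inline uint8_t inb(uint16_t port)
    {
        uint8_t value;
        __asm__ volatile ("inb %1, %0" : "=a"(value) : "Nd"(port));
        return value;
    }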
In summary, your main problem is not knowing assembly. As stated before, assembly is essential for operating system development. Once you know assembly thoroughly, then you may be able to start writing an operating system.

How can the Linux kernel compile itself?

I don't quite understand the compiling process of the Linux kernel when I install
a Linux system on my machine.
Here are some things that confused me:
The kernel is written in C, however how did the kernel get compiled without a compiler installed?
If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?
I've been confused about this for a couple of days. Thanks for any responses.
The first round of binaries for your Linux box were built on some other Linux box (probably).
The binaries for the first Linux system were built on some other platform.
The binaries for that computer can trace their root back to an original system that was built on yet another platform.
...
Push this far enough, and you find compilers built with more primitive tools, which were in turn built on machines other than their host.
...
Keep pushing and you find computers built so that their instructions could be entered by setting switches on the front panel of the machine.
Very cool stuff.
The rule is "build the tools to build the tools to build the tools...". Very much like the tools which run our physical environment. Also known as "pulling yourself up by the bootstraps".
I think you should distinguish between:
compile, v: To use a compiler to process source code and produce executable code [1].
and
install, v: To connect, set up or prepare something for use [2].
Compilation produces binary executables from source code. Installation merely puts those binary executables in the right place to run them later. So installation and use do not require compilation if the binaries are available. Think of "compile" and "install" like "cook" and "serve", respectively.
Now, your questions:
The kernel is written in C, however how did the kernel get compiled without a compiler installed?
The kernel cannot be compiled without a compiler, but it can be installed from a compiled binary.
Usually, when you install an operating system, you install a pre-compiled kernel (a binary executable). It was compiled by someone else. Only if you want to compile the kernel yourself do you need the source, the compiler, and all the other tools.
Even in "source-based" distributions like Gentoo, you start by running a compiled binary.
So, you can live your entire life without compiling kernels, because you have them compiled by someone else.
If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?
The compiler cannot be run if there is no kernel (OS). So one has to install a compiled kernel to run the compiler, but does not need to compile the kernel himself.
Again, the most common practice is to install compiled binaries of the compiler, and use them to compile anything else (including the compiler itself and the kernel).
Now, the chicken-and-egg problem: the first binary was compiled by someone else... See the excellent answer by dmckee.
The term describing this phenomenon is bootstrapping, it's an interesting concept to read up on. If you think about embedded development, it becomes clear that a lot of devices, say alarm clocks, microwaves, remote controls, that require software aren't powerful enough to compile their own software. In fact, these sorts of devices typically don't have enough resources to run anything remotely as complicated as a compiler.
Their software is developed on a desktop machine and then copied once it's been compiled.
If this sort of thing interests you, an article that comes to mind off the top of my head is: Reflections on Trusting Trust (pdf), it's a classic and a fun read.
The kernel doesn't compile itself -- it's compiled by a C compiler in userspace. In most CPU architectures, the CPU has a number of bits in special registers that represent what privileges the code currently running has. In x86, these are the current privilege level bits (CPL) in the code segment (CS) register. If the CPL bits are 00, the code is said to be running in security ring 0, also known as kernel mode. If the CPL bits are 11, the code is said to be running in security ring 3, also known as user mode. The other two combinations, 01 and 10 (security rings 1 and 2 respectively) are seldom used.
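You can see those CPL bits from an ordinary program, since the low two bits of CS encode the current privilege level. A toy sketch, assuming x86 GCC/clang inline-assembly syntax (run as a normal user-space program, it prints CPL = 3):

    #include <stdio.h>
    #include <stdint.h>

    int main(void)
    {
        uint16_t cs;
        __asm__ volatile ("mov %%cs, %0" : "=r"(cs));
        printf("CS = 0x%04x, CPL = %u\n", cs, cs & 3);
        return 0;
    }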
The rules about what code can and can't do in user mode versus kernel mode are rather complicated, but suffice to say, user mode has severely reduced privileges.
Now, when people talk about the kernel of an operating system, they're referring to the portions of the OS's code that get to run in kernel mode with elevated privileges. Generally, the kernel authors try to keep the kernel as small as possible for security reasons, so that code which doesn't need extra privileges doesn't have them.
The C compiler is one example of such a program -- it doesn't need the extra privileges offered by kernel mode, so it runs in user mode, like most other programs.
In the case of Linux, the kernel consists of two parts: the source code of the kernel, and the compiled executable of the kernel. Any machine with a C compiler can compile the kernel from the source code into the binary image. The question, then, is what to do with that binary image.
When you install Linux on a new system, you're installing a precompiled binary image, usually from either physical media (such as a CD or DVD) or from the network. The BIOS loads the (binary image of the) kernel's bootloader from the media or network, and the bootloader installs the (binary image of the) kernel onto your hard disk. Then, when you reboot, the BIOS loads the kernel's bootloader from your hard disk, the bootloader loads the kernel into memory, and you're off and running.
If you want to recompile your own kernel, that's a little trickier, but it can be done.
Which came first, the chicken or the egg?
Eggs have been around since the time of the dinosaurs..
..and some confuse everything by saying chickens are actually descendants of the great beasts. Long story short: the technology (the egg) existed before the current product (the chicken).
You need a kernel to build a kernel, i.e. you build one with the other.
The first kernel can be anything you want (preferably something sensible that can create your desired end product ^__^)
This tutorial from Bran's Kernel Development teaches you to develop and build a smallish kernel which you can then test with a Virtual Machine of your choice.
Meaning: you write and compile a kernel someplace, then load it on an empty (no-OS) virtual machine.
What happens with those Linux installs follows the same idea with added complexity.
It's not turtles all the way down. Just like you say, you can't compile an operating system that has never been compiled before on a system that's running that operating system. Similarly, at least the very first build of a compiler must be done on another compiler (and usually some subsequent builds too, if that first build turns out not to be able to compile its own source code just yet).
I think the very first Linux kernels were compiled on a Minix box, though I'm not certain about that. GCC was available at the time. One of the very early goals of many operating systems is to run a compiler well enough to compile their own source code. Going further back, the first compiler was almost certainly written in assembly language, and the first assemblers were written by those poor folks who had to write in raw machine code.
You may want to check out the Linux From Scratch project. You actually build two systems in the book: a "temporary system" that is built on a system you didn't build yourself, and then the "LFS system" that is built on your temporary system. The way the book is currently written, you actually build the temporary system on another Linux box, but in theory you could adapt it to build the temporary system on a completely different OS.
If I am understanding your question correctly: the kernel isn't "compiling itself" these days. Most Linux distributions today provide system installation through a Linux live CD. The kernel is loaded from the CD into memory and operates as it normally would if it were installed to disk. With a Linux environment up and running on your system, it is easy to commit the necessary files to your disk.
If you were asking about the bootstrapping issue, dmckee summed it up pretty nicely.
Just offering another possibility...
