How does the Master Boot Record (MBR) change? I cannot figure out whether there is an MBR specific to Linux-based operating systems, or whether each operating system has a different MBR. How do I compare the MBRs of different operating systems? And if I write one in C for Ubuntu, will it work for other distributions as well?
If an MBR is always cross-platform, does that mean it will work on both Unix-like systems and Windows, for example? And if not, what's the difference between an MBR for Windows and an MBR for Unix?
In "idealistic theory", for dual boot scenarios, the MBR contains a "Boot Manager" that does things like determine which partitions are bootable, selects one (possibly by offering the user a menu to choose from), loads the selected partition's first sector (the selected operating system's boot sector) and passes control to it.
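The first step a boot manager takes (finding out which partitions exist and which one is marked bootable) can be sketched in a few lines. This is a minimal sketch, assuming the classic 512-byte MBR layout (446 bytes of boot code, four 16-byte partition entries, and the 0x55 0xAA signature); parse_mbr and same_boot_code are hypothetical helper names, not part of any boot loader. It also gives one rough way to compare two MBRs: compare their boot code areas byte for byte.

```python
import struct

SECTOR_SIZE = 512
BOOT_CODE_SIZE = 446        # bytes 0..445 hold the boot code
PARTITION_ENTRY_SIZE = 16   # four entries follow the boot code
SIGNATURE_OFFSET = 510      # the last two bytes must be 0x55 0xAA


def parse_mbr(sector: bytes) -> dict:
    """Split a 512-byte MBR into its boot code and used partition entries."""
    if len(sector) != SECTOR_SIZE:
        raise ValueError("an MBR is exactly 512 bytes")
    if sector[SIGNATURE_OFFSET:] != b"\x55\xaa":
        raise ValueError("missing 0x55 0xAA boot signature")

    partitions = []
    for index in range(4):
        start = BOOT_CODE_SIZE + index * PARTITION_ENTRY_SIZE
        entry = sector[start:start + PARTITION_ENTRY_SIZE]
        status, part_type = entry[0], entry[4]
        # Starting LBA and sector count are little-endian 32-bit values.
        first_lba, sector_count = struct.unpack_from("<II", entry, 8)
        if part_type == 0:
            continue  # type 0 marks an unused slot
        partitions.append({
            "index": index,
            "bootable": status == 0x80,
            "type": part_type,
            "first_lba": first_lba,
            "sectors": sector_count,
        })
    return {"boot_code": sector[:BOOT_CODE_SIZE], "partitions": partitions}


def same_boot_code(mbr_a: bytes, mbr_b: bytes) -> bool:
    """Report whether two MBRs carry byte-identical boot code."""
    return parse_mbr(mbr_a)["boot_code"] == parse_mbr(mbr_b)["boot_code"]
```

Note that some systems store a disk signature in bytes 440 to 445, so two MBRs written by the same tool can still differ there; comparing only the first 440 bytes would be a stricter test of "same boot code".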
In this case, the MBR (and other associated code in the first track of the disk) is owned by a third-party utility that does not belong to any OS; and no OS should ever be allowed to touch/modify the MBR (or any other data that is before the start of the first partition).
However, in practice people often install only one OS, and for convenience OS installers provide their own MBR (so that people don't need to worry about installing a third-party utility); and almost every OS has nasty/egotistical "we only care about ourselves and/or want to be in control of any other OS you install" tendencies. This led to operating systems ignoring common sense and trashing each other and/or providing their own "anti-competitive" alternatives for dual-boot scenarios (GRUB on Linux and "boot.ini" on Windows).
This nasty nonsense became significantly worse (for users) when manufacturers/firmware started caring about security. The plan was for all code used during boot (including firmware, MBR and the operating system's boot loader) to be "measured" (checksummed using fancy cryptography built into a TPM chip) so that an OS (and other software - e.g., "remote attestation") can detect if malware had tampered with any of the code that an operating system depended on (because that "fancy checksum" would be different).
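The "fancy checksum" idea can be illustrated with a hash chain. This is a simplified sketch, assuming a single SHA-256 register that is extended as new = H(old || H(component)); real TPMs have several registers and hash banks, and extend and measure_boot_chain are hypothetical names chosen for illustration.

```python
import hashlib

PCR_SIZE = hashlib.sha256().digest_size  # 32 bytes for a SHA-256 register


def extend(pcr: bytes, component: bytes) -> bytes:
    """Return the new register value: H(old_pcr || H(component))."""
    measurement = hashlib.sha256(component).digest()
    return hashlib.sha256(pcr + measurement).digest()


def measure_boot_chain(components) -> bytes:
    """Fold every boot component, in order, into one final value."""
    pcr = bytes(PCR_SIZE)  # the register starts out as all zeroes
    for blob in components:
        pcr = extend(pcr, blob)
    return pcr


# Changing any one component (here the MBR) changes the final value.
before = measure_boot_chain([b"firmware", b"original MBR", b"boot loader"])
after = measure_boot_chain([b"firmware", b"replaced MBR", b"boot loader"])
print(before.hex() == after.hex())
```

Because each step hashes the previous value, the result depends on every component and on their order, which is why replacing the MBR alone is enough to change it.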
This means that if you install a second OS (corrupting/modifying the MBR of an existing OS), you change that "fancy checksum" and break the original operating system's identity, so after you fix the damage and get the original OS to boot again various other things (that depended on the "fancy checksum") remain broken.
Fortunately, UEFI mostly fixed this by taking on the role of "boot manager" (and deprecating the MBR completely), where (if multiple operating systems are installed) the firmware uses "UEFI variables" to determine which one gets booted, and each operating system can have its own different boot loaders in the UEFI system partition without conflict.
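As a rough illustration of how the firmware's choice is stored, the sketch below decodes a BootOrder variable, which is a list of 16-bit little-endian numbers that each name a Boot#### entry. It assumes the byte layout that Linux's efivarfs exposes, where four attribute bytes precede the data; parse_boot_order is a hypothetical helper name, and the sample bytes are made up.

```python
import struct


def parse_boot_order(raw: bytes, has_attribute_prefix: bool = True) -> list:
    """Decode a BootOrder variable into names such as 'Boot0001'."""
    # Assumption: efivarfs puts a 4-byte attribute field before the payload.
    payload = raw[4:] if has_attribute_prefix else raw
    if len(payload) % 2:
        raise ValueError("BootOrder must be a whole number of 16-bit values")
    count = len(payload) // 2
    numbers = struct.unpack("<%dH" % count, payload)
    return ["Boot%04X" % number for number in numbers]


# Made-up sample: attributes = 7, then entries 0x0001, 0x0000 and 0x0010.
sample = struct.pack("<I", 7) + struct.pack("<3H", 0x0001, 0x0000, 0x0010)
print(parse_boot_order(sample))
```

Each name in the result refers to a separate variable that describes one boot option (for example, one operating system's loader in the UEFI system partition), so adding a second OS means adding an entry rather than overwriting someone else's code.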
Sadly, the nasty/egotistical "we only care about ourselves and want to be in control" tendencies remain and operating systems are trying to find alternative ways to ruin everything (SecureBoot key management, using GRUB as a shim so that different operating systems can fight for control of GRUB's configuration, etc.).
How does the Master Boot Record (MBR) change?
The MBR doesn't change unless you install an OS. When you do install an OS, the MBR may or may not change depending on the OS installer and (in some cases) what you tell the OS installer to do.
I can not figure out if there is an MBR specific to Linux-based operating systems, or does each operating system have a different MBR?
Linux (the kernel) defines a "Linux boot protocol" which allows anyone to write a compatible boot loader. For booting from BIOS there were three common boot loaders (LILO, GRUB and SYSLINUX), but most Linux-based operating systems gravitated towards GRUB, and most OS installers for Linux-based operating systems tended to install GRUB in the MBR as "boot manager plus boot loader".
I want to know if an MBR is always cross-platform, meaning that it will work on both Unix-like systems and Windows, for example. And if not, I want to know what's the difference between an MBR for Windows and an MBR for Unix.
Sadly, no - the MBR should be "OS neutral/cross-platform" (and can be in some cases - e.g., if the user doesn't mind installing LILO or GRUB in its own partition and if the OS installer supports that), but mostly isn't. If you want to write your own MBR then you'll probably need to deal with a specific operating system's failure to cooperate with other operating systems.
I am trying to boot an ELF microkernel in a UEFI environment, so I compiled a minimal boot loader and created an ESP image. This works fine if I boot via an HDD, but I want to direct boot it via the qemu -kernel option (this is a special requirement, as I am working with AMD SEV). This doesn't work.
I can boot my kernel like this with GRUB if I use grub-mkimage with a FAT image included, i.e. like this:
mcopy -i "${basedir}/disk.fat" -- "${basedir}/kernel" ::kernel
mcopy -i "${basedir}/disk.fat" -- "${basedir}/module" ::module
grub-mkimage -O x86_64-efi \
    -c "${basedir}/grub-bootstrap.cfg" \
    -m "${basedir}/disk.fat" \
    -o "${basedir}/grub.efi"
But the goal for my system is minimalism and security, hence the microkernel, so GRUB and its vulnerabilities are out of the question.
So my question is:
How to create a bootable application image similar to grub-mkimage?
I have read about EFI stub boot but couldn't really figure out how to build an EFI stub image.
Normally I am a bare-metal embedded programmer, so the whole UEFI boot thing is a bit weird to me. I am glad for any tips or recommendations. Also, I figured Stack Overflow might not be the best place for such low-level questions; can you maybe recommend other forums?
I want to direct boot it via the qemu -kernel option
Why? It's a qemu-specific hack that doesn't exist on anything else (including any real computer). By using this hack the only thing you're doing is failing to test anything you'd normally use to boot (and therefore failing to test anything that actually matters).
(This is some special requirement as I am working with AMD SEV)
That doesn't make any sense (it's a little bit like saying "I have a banana in my ear because I'm trying to learn how to play piano").
AMD's SEV is a set of extensions intended to enhance the security of virtual machines that has nothing at all to do with how you boot (or whether you boot from BIOS or UEFI or a qemu-specific hack).
I am glad for any tips or recommendations.
My recommendation is to stop using GRUB-specific (multi-boot), Qemu-specific (-kernel) and Linux/Unix-specific (ELF) tools and actually try to use UEFI. This will require you to write your own boot loader, in (Microsoft's) PE32+ file format, that uses UEFI's services itself. Note that GNU's tools (their "GNU-EFI" stuff for GCC) are relatively awful (they put a PE32+ wrapper around an ELF file and do run-time patching to make the resulting Franken-monster work); and there are much better alternatives now (e.g. the Clang/LLVM/lld toolchain).
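Whatever toolchain you pick, it can help to check that the output really is a PE32+ image marked as an EFI application. The sketch below is a minimal header check, assuming the standard PE layout (the PE header offset at 0x3C, a 20-byte COFF header, and the subsystem field 68 bytes into the optional header); describe_pe and looks_like_x64_uefi_app are hypothetical helper names, and this is no substitute for actually booting the image.

```python
import struct

IMAGE_FILE_MACHINE_AMD64 = 0x8664
PE32_PLUS_MAGIC = 0x20B
IMAGE_SUBSYSTEM_EFI_APPLICATION = 10


def describe_pe(image: bytes) -> dict:
    """Read the few header fields that identify a UEFI application."""
    if image[:2] != b"MZ":
        raise ValueError("missing MZ header")
    (pe_offset,) = struct.unpack_from("<I", image, 0x3C)
    if image[pe_offset:pe_offset + 4] != b"PE\0\0":
        raise ValueError("missing PE signature")
    (machine,) = struct.unpack_from("<H", image, pe_offset + 4)
    optional = pe_offset + 4 + 20  # signature, then the 20-byte COFF header
    (magic,) = struct.unpack_from("<H", image, optional)
    (subsystem,) = struct.unpack_from("<H", image, optional + 68)
    return {"machine": machine, "magic": magic, "subsystem": subsystem}


def looks_like_x64_uefi_app(image: bytes) -> bool:
    """True if the headers say: x86-64, PE32+, EFI application."""
    info = describe_pe(image)
    return (info["machine"] == IMAGE_FILE_MACHINE_AMD64
            and info["magic"] == PE32_PLUS_MAGIC
            and info["subsystem"] == IMAGE_SUBSYSTEM_EFI_APPLICATION)
```

For a file on disk, reading its contents with open(path, "rb").read() and passing them to looks_like_x64_uefi_app is enough; the check only inspects the headers.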
If you care about security, then it'll also involve learning about UEFI SecureBoot (and key management, and digital signatures). If you care about secure virtual machines I'd also recommend learning about the SKINIT instruction from AMD's manual (used to create a dynamic root of trust after boot); but don't forget that this is AMD specific and won't work on any Intel CPU, and is mostly obsolete (the "trusted measurement" stuff from BIOS and TPM was mostly superseded by SecureBoot anyway), and (even on AMD CPUs) if you're only the guest then the hypervisor can emulate it in any way it wants (and it won't guarantee anything is secure).
Finally; note that booting a micro-kernel directly doesn't make much sense either. There are no device drivers in a micro-kernel; so after booting a micro-kernel you end up with a "can't start any device drivers because there are no device drivers" problem. Instead you need to load many files (e.g. maybe an initial RAM disk), then (e.g.) start some kind of "boot log handler" (to display error messages, etc); then find and start the kernel, then start other processes (e.g. "device manager" to detect devices and drivers; "VFS layer" to handle file systems and file IO; etc). For the whole thing; starting the kernel is just one relatively insignificant small step (not much more than starting a global shared library that provides multi-tasking) buried among a significantly larger amount of code that does all the work.
Sadly; booting a monolithic kernel directly can make sense because it can contain all the drivers (or at least, has enough built into the kernel's executable file to handle an initial RAM disk if it's "modular monolithic" with dynamically loaded drivers); and this "monolithic with stuff that doesn't belong in any micro-kernel" idea is what most beginner tutorials assume.
I have compiled a simple executable application written in C, using the arm-linux-gnueabi compiler for ARM.
How do I run it on a device?
Assume that I have two devices to test it on:
A Samsung phone with Windows Mobile 6.1 and an ARM926EJ OMAP1710 processor
A Foston tablet with Android 2.x; I could not find the processor name, but it is one of the processors in the ARM family.
If it is not possible to run it on the current operating system, then how do I format the device and put my own kernel on it instead of Android/Linux?
An application is typically built to run on top of an operating system. An operating system is typically built to run on top of hardware. Keep this in mind.
Running your application instead of Android/Linux implies that your application is an operating system of some sort. If you didn't write or include explicit code to control the hardware chips in the device, then you are asking the wrong question; you should ask "I've written an application in C, now how do I run it on my phone's operating system?" If you did write or include explicit code to control the hardware chips in the device, then you did ask the right question (but some of the details seem off). This style of development happens a lot in the Arduino/PIC/embedded ARM community.
Assuming you are not doing embedded development, the application must be compiled with some understanding of what the operating system offers (against the operating system's available APIs), which generally makes it incompatible with other operating systems. This means the first step is to determine which operating system you are targeting and obtain its development suite. Once you have that, assuming that it supports C code (as most do), the suite will recompile your source code into a format that is compatible with both the CPU of the device and the API of the operating system on the device.
Small devices like phones typically run operating systems that have a tiny fraction of the features of a PC, so be prepared for fewer convenience features, and possibly "missing" libraries. That said, if you do get it to compile, typically you then hook the device up with the supported "bus" (USB is very popular), and save the program on the device (which sometimes involves sending "development / debugging" codes across the bus, and the development suite does this for you).
If everything worked well, you can then launch your program from the phone. If the program misbehaves and renders the phone inoperable, each development suite / phone has specific instructions on how to recover or reload a fresh operating system.
Here are resources for a few well known platforms (and percentages of the phones using them)
(worldwide according to Gartner's latest study, US according to Nielsen's latest study)
As referenced from Wikipedia on 4/27/2012
(52%, 46.3%) Android Standard Development Kit
(16.9%, 1.4%) Symbian Standard Development Kit
(15%, 30%) iOS Phone Standard Development Kit
(11%, 14.9%) Blackberry Phone Standard Development Kit
(2.2%, 0%) Bada Standard Development Kit
(1.5%, 5.9%) Windows Phone Standard Development Kit
Note that these measurements are like most surveys: while they attempt to be random and unbiased, they are prone to measurement error and sampling error, so the numbers are more useful as relative indicators than as absolute values.
If you compile a program in say, C, on a Linux based platform, then port it to use the MacOS libraries, will it work?
Is the core machine-code that comes from a compiler compatible on both Mac and Linux?
The reason I ask this is because both are "UNIX based" so I would think this is true, but I'm not really sure.
No, Linux and Mac OS X binaries are not cross-compatible.
For one thing, Linux executables use a format called ELF.
Mac OS X executables use Mach-O format.
Thus, even if a lot of the libraries ordinarily compile separately on each system, they would not be portable in binary format.
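The difference in container formats is visible in the first few bytes of a file. The following is a minimal sketch that classifies an executable by its magic number; executable_format is a hypothetical helper name, and only the single-architecture Mach-O magics are listed (multi-architecture "fat" binaries are left out on purpose).

```python
ELF_MAGIC = b"\x7fELF"

# 32-bit and 64-bit Mach-O magics, in both byte orders as they appear on disk.
MACH_O_MAGICS = {
    b"\xfe\xed\xfa\xce", b"\xce\xfa\xed\xfe",
    b"\xfe\xed\xfa\xcf", b"\xcf\xfa\xed\xfe",
}


def executable_format(header: bytes) -> str:
    """Guess the container format from the first bytes of a file."""
    if header[:4] == ELF_MAGIC:
        return "ELF"
    if header[:4] in MACH_O_MAGICS:
        return "Mach-O"
    if header[:2] == b"MZ":
        return "PE"  # Windows executables start with an MZ stub
    return "unknown"


print(executable_format(b"\x7fELF\x02\x01\x01\x00"))
print(executable_format(b"\xcf\xfa\xed\xfe\x07\x00\x00\x01"))
```

A loader that only understands one of these formats rejects the others before a single instruction of the program runs, regardless of whether the machine code inside would be valid for the CPU.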
Furthermore, Linux is not actually UNIX-based. It does share a number of common features and tools with UNIX, but a lot of that has to do with computing standards like POSIX.
All this said, people can and do create pretty cool ways to deal with the problem of cross-compatibility.
EDIT:
Finally, to address your point on byte-code: when making a binary, compilers usually generate machine code that is specific to the platform you're developing on. (This isn't always the case, but it usually is.)
In general you can easily port a program across various Unix brands. However you need (at least) to recompile it on each platform.
Executables (binaries) are not usable on several platforms, because an executable is tightly coupled with the operating system's ABI (Application Binary Interface), i.e. the conventions of how an application communicates with the operating system.
For instance if your program prints a string onto the console using the POSIX write call, the ABI specifies:
How a system call is made (Linux used to raise software interrupt 0x80 on 32-bit x86; it now uses the dedicated sysenter instruction there, and syscall on x86-64)
The system call number
How the function's arguments are passed to the system
Any kind of alignment
...
And this varies a lot across operating systems.
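To make the list above concrete, here is a small lookup table for the write call on three targets. The numbers and registers are the commonly documented ones for these platforms, but treat the table as an illustrative sketch rather than a reference; WRITE_SYSCALL and same_write_abi are hypothetical names.

```python
# How write(fd, buf, count) reaches the kernel on a few platform/CPU pairs.
WRITE_SYSCALL = {
    ("linux", "i386"): {
        "number": 4,
        "instruction": "int 0x80",
        "arg_registers": ("ebx", "ecx", "edx"),
    },
    ("linux", "x86_64"): {
        "number": 1,
        "instruction": "syscall",
        "arg_registers": ("rdi", "rsi", "rdx"),
    },
    ("macos", "x86_64"): {
        "number": 0x2000004,  # a syscall class is encoded in the high bits
        "instruction": "syscall",
        "arg_registers": ("rdi", "rsi", "rdx"),
    },
}


def same_write_abi(target_a, target_b) -> bool:
    """True only if both targets agree on every detail of the call."""
    return WRITE_SYSCALL[target_a] == WRITE_SYSCALL[target_b]


print(same_write_abi(("linux", "x86_64"), ("macos", "x86_64")))
```

Linux and macOS on the same x86-64 CPU use the same instruction and registers here, yet the system call number differs, so machine code that issues the call directly on one system would request something else (or nothing valid) on the other.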
Note however that in some cases there may be “ABI adapters” that allow binaries of one OS to run on another OS. For instance, Wine allows you to run Windows executables on various Unix flavors, and NDISwrapper allows you to use Windows network drivers on Linux.
"Bytecode" usually refers to code executed by a virtual machine (e.g. for Java or Python). C is compiled to machine code, which the CPU can execute directly. Machine language is hardware-specific, so it would be the same under any OS running on an Intel chip (even under Windows), but the details of how the machine code is wrapped into an executable file, and how it is integrated with system calls and dynamically linked libraries, differ from system to system.
So no, you can't take compiled code and use it in a different OS. (However, there are "cross-compilers" that run on one OS but generate code that will run on another OS).
There is no "core byte-code that comes from a compiler". There is only machine code.
While the same machine instructions may be applicable under several operating systems (as long as they're run on the same hardware), there is much more to a hosted executable than that, and since a compiled and linked native executable for Linux has very different runtime and library requirements from one on BSD or Darwin, you won't be able to run one binary on the other system.
By contrast, Windows binaries can sometimes be executed under Linux, because Linux provides both a binary format loader for Windows's PE format and an extensive API implementation (Wine). In principle this idea can be used on other platforms as well, but I'm not aware of anyone having written this for Linux<->Darwin. If you already have the source code, and it compiles in Linux, then you have a good chance of it also compiling under MacOS (modulo UI components, of course).
Well, maybe... but most probably not.
But if it does, it's not "because both are UNIX" it's because:
Mac computers happen to use the same processor nowadays (this was very different in the past)
You happen to use a program that has no dependency on any library at all (very unlikely)
You happen to use the same runtime libraries
You happen to use a loader/binary format that is compatible with both.
I know that OS kernels are made up of drivers, but how does a driver become part of the OS? Does the kernel decompile itself, add the driver, and recompile itself? Or are drivers plug-ins for the kernel? Someone told me that for most operating systems the drivers actually become part of the kernel, but whenever I compile a C program, it turns into an ordinary executable.
The driver architecture depends entirely on your operating system. For most operating systems running on computers (as opposed to embedded devices), thinking of drivers as 'plug-ins' for the kernel is pretty much accurate. That said, there are plenty of older, smaller, and less sophisticated operating systems which require you to build the driver in as part of the kernel - no dynamic loading possible. These days, several operating systems have support for "user-mode" drivers, which are device drivers that don't ever run in the kernel memory space at all.
It depends on the o/s.
Classically, the kernel was a monolithic executable that contained all the drivers - and was rebuilt when a new driver needed to be added, including the code for the new driver along with all the old ones.
In modern Linux, and probably other o/s too, the drivers are dynamically loaded by the kernel when needed. The driver is created in a form that allows the kernel to do that loading; typically, that means in a shared object or dynamic link library format.
In operating systems like Linux, drivers can actually be compiled into the kernel image. Even when statically linked, though, they may well exhibit a plug-in type architecture that allows one to easily include only the drivers one needs.
Alternatively, they are dynamically linked and loaded either at boot time or on demand when required by some system level software.
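The "plug-in" picture from the answers above can be modelled with a toy driver table. This is only an analogy in user-space Python, not how any real kernel is implemented; ToyKernel and its methods are hypothetical names. Built-in drivers are present from the start, and others are registered and removed while the "kernel" keeps running.

```python
class ToyKernel:
    """Hypothetical stand-in for a kernel's driver table (not a real API)."""

    def __init__(self, built_in=None):
        # Drivers "compiled in": present from the moment the kernel starts.
        self._drivers = dict(built_in or {})

    def load_driver(self, name, handles_device):
        """Register a driver at run time, like loading a module."""
        if name in self._drivers:
            raise ValueError(f"driver {name!r} is already loaded")
        self._drivers[name] = handles_device

    def unload_driver(self, name):
        """Remove a driver again without restarting the kernel."""
        del self._drivers[name]

    def driver_for(self, device_id):
        """Return the name of the first driver that claims the device."""
        for name, handles_device in self._drivers.items():
            if handles_device(device_id):
                return name
        return None


kernel = ToyKernel(built_in={"console": lambda dev: dev == "tty0"})
kernel.load_driver("usb-storage", lambda dev: dev.startswith("usb:"))
print(kernel.driver_for("usb:1234"))
```

Nothing is decompiled or recompiled when a driver is added: the running kernel only gains a new entry that it can call into, which is the essential idea behind loadable drivers.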
I don't quite understand the compiling process of the Linux kernel when I install a Linux system on my machine.
Here are some things that confused me:
The kernel is written in C, however how did the kernel get compiled without a compiler installed?
If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?
I was so confused for a couple of days, thanks for the response.
The first round of binaries for your Linux box were built on some other Linux box (probably).
The binaries for the first Linux system were built on some other platform.
The binaries for that computer can trace their root back to an original system that was built on yet another platform.
...
Push this far enough, and you find compilers built with more primitive tools, which were in turn built on machines other than their host.
...
Keep pushing and you find computers built so that their instructions could be entered by setting switches on the front panel of the machine.
Very cool stuff.
The rule is "build the tools to build the tools to build the tools...". Very much like the tools which run our physical environment. Also known as "pulling yourself up by the bootstraps".
I think you should distinguish between:
compile, v: To use a compiler to process source code and produce executable code [1].
and
install, v: To connect, set up or prepare something for use [2].
Compilation produces binary executables from source code. Installation merely puts those binary executables in the right place to run them later. So, installation and use do not require compilation if the binaries are available. Think of "compile" and "install" as "cook" and "serve", respectively.
Now, your questions:
The kernel is written in C, however how did the kernel get compiled without a compiler installed?
The kernel cannot be compiled without a compiler, but it can be installed from a compiled binary.
Usually, when you install an operating system, you install a precompiled kernel (a binary executable). It was compiled by someone else. Only if you want to compile the kernel yourself do you need the source, the compiler, and all the other tools.
Even in "source-based" distributions like Gentoo, you start by running a compiled binary.
So, you can live your entire life without compiling kernels, because you have them compiled by someone else.
If the C compiler is installed on my machine before the kernel is compiled, how can the compiler itself get compiled without a compiler installed?
The compiler cannot be run if there is no kernel (OS). So one has to install a compiled kernel to run the compiler, but does not need to compile the kernel himself.
Again, the most common practice is to install compiled binaries of the compiler, and use them to compile anything else (including the compiler itself and the kernel).
Now, the chicken-and-egg problem. The first binary is compiled by someone else... See an excellent answer by dmckee.
The term describing this phenomenon is bootstrapping; it's an interesting concept to read up on. If you think about embedded development, it becomes clear that a lot of devices that require software (say alarm clocks, microwaves, remote controls) aren't powerful enough to compile their own software. In fact, these sorts of devices typically don't have enough resources to run anything remotely as complicated as a compiler.
Their software is developed on a desktop machine and then copied once it's been compiled.
If this sort of thing interests you, an article that comes to mind off the top of my head is Reflections on Trusting Trust (pdf); it's a classic and a fun read.
The kernel doesn't compile itself -- it's compiled by a C compiler in userspace. In most CPU architectures, the CPU has a number of bits in special registers that represent what privileges the code currently running has. In x86, these are the current privilege level bits (CPL) in the code segment (CS) register. If the CPL bits are 00, the code is said to be running in security ring 0, also known as kernel mode. If the CPL bits are 11, the code is said to be running in security ring 3, also known as user mode. The other two combinations, 01 and 10 (security rings 1 and 2 respectively) are seldom used.
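The privilege check described above is just a matter of reading the low two bits of the CS selector. Here is a minimal sketch; the two sample selector values are the ones 64-bit Linux is commonly described as using for kernel and user code, and should be treated as assumptions, as should the helper names.

```python
RING_NAMES = {0: "kernel mode", 1: "ring 1", 2: "ring 2", 3: "user mode"}


def current_privilege_level(cs_selector: int) -> int:
    """Extract the CPL: the low two bits of the CS selector."""
    return cs_selector & 0b11


def mode_name(cs_selector: int) -> str:
    """Translate a CS selector into the ring names used in the text."""
    return RING_NAMES[current_privilege_level(cs_selector)]


# Assumed sample selectors for kernel and user code segments.
for selector in (0x10, 0x33):
    print(hex(selector), current_privilege_level(selector), mode_name(selector))
```

The rest of the selector picks a descriptor table entry; only these two bits say which ring the currently running code is in.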
The rules about what code can and can't do in user mode versus kernel mode are rather complicated, but suffice to say, user mode has severely reduced privileges.
Now, when people talk about the kernel of an operating system, they're referring to the portions of the OS's code that get to run in kernel mode with elevated privileges. Generally, the kernel authors try to keep the kernel as small as possible for security reasons, so that code which doesn't need extra privileges doesn't have them.
The C compiler is one example of such a program -- it doesn't need the extra privileges offered by kernel mode, so it runs in user mode, like most other programs.
In the case of Linux, the kernel consists of two parts: the source code of the kernel, and the compiled executable of the kernel. Any machine with a C compiler can compile the kernel from the source code into the binary image. The question, then, is what to do with that binary image.
When you install Linux on a new system, you're installing a precompiled binary image, usually from either physical media (such as a CD or DVD) or from the network. The BIOS will load the (binary image of the) kernel's bootloader from the media or network, and then the bootloader will install the (binary image of the) kernel onto your hard disk. Then, when you reboot, the BIOS loads the kernel's bootloader from your hard disk, and the bootloader loads the kernel into memory, and you're off and running.
If you want to recompile your own kernel, that's a little trickier, but it can be done.
Which came first, the chicken or the egg?
Eggs have been around since the time of the dinosaurs.
Some confuse everything by saying chickens are actually descendants of the great beasts. Long story short: the technology (egg) existed before the current product (chicken).
You need a kernel to build a kernel, i.e. you build one with the other.
The first kernel can be anything you want (preferably something sensible that can create your desired end product ^__^)
This tutorial from Bran's Kernel Development teaches you to develop and build a smallish kernel which you can then test with a Virtual Machine of your choice.
Meaning: you write and compile a kernel someplace, and load it on an empty (no OS) virtual machine.
What happens with those Linux installs follows the same idea with added complexity.
It's not turtles all the way down. Just like you say, you can't compile an operating system that has never been compiled before on a system that's running that operating system. Similarly, at least the very first build of a compiler must be done on another compiler (and usually some subsequent builds too, if that first build turns out not to be able to compile its own source code just yet).
I think the very first Linux kernels were compiled on a Minix box, though I'm not certain about that. GCC was available at the time. One of the very early goals of many operating systems is to run a compiler well enough to compile their own source code. Going further, the first compiler was almost certainly written in assembly language. The first assemblers were written by those poor folks who had to write in raw machine code.
You may want to check out the Linux From Scratch project. You actually build two systems in the book: a "temporary system" that is built on a system you didn't build yourself, and then the "LFS system" that is built on your temporary system. The way the book is currently written, you actually build the temporary system on another Linux box, but in theory you could adapt it to build the temporary system on a completely different OS.
If I am understanding your question correctly, the kernel isn't "compiling itself" these days. Most Linux distributions today provide system installation through a Linux live CD. The kernel is loaded from the CD into memory and operates as it normally would if it were installed to disk. With a Linux environment up and running on your system, it is easy to just commit the necessary files to your disk.
If you were talking about the bootstrapping issue, dmckee summed it up pretty nicely.
Just offering another possibility...