I am trying to fuzz part of a codebase written for the big-endian MIPS architecture using libFuzzer.
I have run libFuzzer on a little-endian Debian machine and it reports segmentation faults, but I don't think those results are 100% trustworthy. How can I use libFuzzer with an emulator for big-endian architectures? Is this possible? Are there any other techniques for fuzz-testing big-endian code on little-endian machines?
You could cross-compile your software to big-endian MIPS on your host machine, and then use QEMU user-mode emulation. In this mode, QEMU runs a single process on the emulated CPU with no emulated hardware at all; it merely translates system calls to the host kernel, so the process accesses all the host files, networks, etc. -- just as if your host CPU had gained the ability to execute MIPS instructions. That also means the emulated process can mess with your host files too, so you have been warned. :)
I'm not specifically familiar with libFuzzer, but this setup should suffice to at least validate crashes you have already found (assuming QEMU simulates the MIPS CPU realistically enough). AFAIK libFuzzer is an in-process fuzzer, so unlike AFL, the fuzzed process should not run into problems communicating with a separate fuzzer process over shared memory, etc.
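To make the endianness concern concrete, here is a minimal sketch of a libFuzzer target. LLVMFuzzerTestOneInput is the real libFuzzer entry point; parse_record is a hypothetical stand-in for the code under test. Because it reads a length prefix in host byte order, the paths it takes -- and therefore the crashes found or missed -- differ between a little-endian x86 host and the big-endian MIPS target:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical code under test: reads a 32-bit length prefix in HOST byte
 * order.  On the big-endian MIPS target the bytes decode one way, on a
 * little-endian x86 fuzzing host they decode reversed, so the set of
 * inputs that reaches each branch differs between the two machines. */
static void parse_record(const uint8_t *buf, size_t len) {
    char local[64];
    uint32_t payload_len;

    if (len < 4)
        return;
    memcpy(&payload_len, buf, sizeof(payload_len));   /* host-endian read */
    if (payload_len <= len - 4 && payload_len <= sizeof(local))
        memcpy(local, buf + 4, payload_len);
    (void)local;
}

/* libFuzzer calls this entry point once per generated input. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
    parse_record(data, size);
    return 0;
}
```

Built natively with clang -fsanitize=fuzzer this runs on the host; in principle the same target, cross-compiled with a big-endian MIPS toolchain, can then be run under qemu-mips user-mode emulation to reproduce crashes with the target's byte order.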
Related
I'm trying to boot a kernel (extracted from a firmware image) using QEMU.
QEMU emulation seems to start at 0x0.
The problem is that the memory from 0x0 to 0x04000000 is filled only with zeroes.
How can I debug the bootloader?
You don't say what your command line is. The address where QEMU starts execution depends on many things:
the guest CPU architecture
which board model you are emulating
whether you passed QEMU a BIOS image file
the file format of any file passed to -kernel (ELF, plain kernel image, uImage, etc)
In general, though, you should not expect to be able to pull a random kernel image out of a firmware dump for a piece of Arm hardware and run it under QEMU. This is because every Arm board or machine is different -- RAM may be in different places, the devices such as the serial port are at different addresses, and so on -- and the kernel will only boot on systems which it has been compiled to support. The chances are very high that (a) QEMU does not have a specific emulation of the bit of hardware that the firmware dump is for and (b) the kernel from the firmware has not been built to also run on any of the board types that QEMU does support. So it will almost certainly simply crash very early on in bootup without producing any output.
If you want to debug what's going on in early bootup, the best approach is probably to use QEMU's built-in gdbstub and attach a guest-architecture-aware gdb to it. You may also find QEMU's internal logging via the '-d' option useful, though it requires some familiarity with how QEMU works to make sense of the output.
In the case of x86 the same (real mode) bootloader works on virtually any x86 device.
Is that possible on ARM or do I need to create a specific bootloader for each 'cortex'?
x86, or let's say PC-compatible, systems are ... PC compatible. They support the ancient BIOS calls, so there is massive compatibility -- by design, by the chip vendor (Intel), the software vendors (BIOS, operating system), and the motherboard vendors.
ARM is in no way, shape, or form like that. There are instruction sets you can choose that work almost or all the way across the board, but remember that with ARM you buy an ARM core and add it to your own special chip, along with your special/custom peripherals, and then that chip is put on one or more different boards. There is little to no compatibility. The instruction set and the ARM core are a small part of the whole picture; most of the code is for the non-ARM stuff.
U-Boot and perhaps others are fairly massive bootloaders, pretty much operating systems themselves, and have to be ported just like an operating system to each chip/board combination. If this is a Linux-capable system, the chip vendor most likely has a reference design and a BSP including a U-Boot port and/or some other solution (the Raspberry Pi is a good example). It is fairly trivial to boot Linux, or at least it used to be; there is no real need for the massively overcomplicated U-Boot. Without a DTB you set up a few memory locations and a register or two and branch to the kernel, that's it (again, look at the Raspberry Pi). I assume that with a DTB you build the DTB, put it somewhere, set up a few registers, and branch to the Linux kernel (Raspberry Pi? NTC CHIP?).
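As a concrete illustration of that "set up a register or two and branch to the kernel" hand-off, here is a rough sketch for 32-bit ARM with a DTB, assuming the kernel image and DTB have already been copied to the made-up load addresses below:

```c
#include <stdint.h>

/* Rough sketch of the hand-off described above, for 32-bit ARM Linux.
 * Addresses are invented for illustration; a real loader uses wherever it
 * actually placed the images.  The kernel expects
 *   r0 = 0, r1 = machine type (ignored with a DTB; ~0 is a common
 *   placeholder), r2 = physical address of the DTB,
 * and with the AAPCS calling convention a three-argument call puts the
 * values in exactly those registers, which is why a C function-pointer
 * call is enough. */
typedef void (*kernel_entry_t)(uint32_t zero, uint32_t machine_type,
                               uint32_t dtb_addr);

#define KERNEL_LOAD_ADDR 0x00008000u   /* where the kernel image was copied */
#define DTB_LOAD_ADDR    0x02000000u   /* where the device tree blob sits   */

void boot_linux(void)
{
    kernel_entry_t entry = (kernel_entry_t)(uintptr_t)KERNEL_LOAD_ADDR;

    /* Caches/MMU should be in the state the kernel expects (off) and
     * interrupts disabled before jumping -- omitted here. */
    entry(0, ~0u, DTB_LOAD_ADDR);   /* never returns */
}
```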
There is an Arm open-source project that covers bootloaders for Armv7/v8 Cortex-A processors.
https://git.trustedfirmware.org/TF-A/trusted-firmware-a.git/
Another open source project for Cortex-M processors:
https://git.trustedfirmware.org/TF-M/trusted-firmware-m.git/
Is it possible to compile some Linux kernel and run it under QEMU, emulating some big-endian ARM processor?
If QEMU is not capable of that, I'd love to hear about other system emulators that can.
My basic goal is to run and debug dedicated big-endian ELFs in an environment that is as close to native as possible.
Any close solution or idea would help!
QEMU has support for big-endian ARM CPUs, but it does not currently have support for emulation of any specific machines (boards) which have big-endian ARM CPUs in them. ARM Linux kernels will generally only run on the hardware they're compiled for, so you can't just take a random big-endian ARM Linux kernel and run it on anything -- you'd need to model the hardware the kernel wanted to see first.
The underlying reason for this is that big-endian ARM systems are very rare -- almost everybody runs ARM CPUs in little-endian mode, and all the boards QEMU models today are little-endian.
I have recently started to take an interest in the topics of operating systems. I have a couple of things that are weighing on my mind, but I have decided to split the questions.
Let's assume we're designing a kernel for a new instruction set architecture that's out on the market. There are no C runtime libraries, no nothing. Only a compatible compiler for that ISA.
Presumably, this means that the only C constructs available to the kernel programmer are basic assignment operators, bitwise operators, and loops. Is this correct?
If so, how are more complex things like main memory I/O and process scheduling achieved on the lowest level? Can they only be implemented in pure assembly?
What does it mean, then, for a kernel to be written in C (Linux, for example)? Are some parts of the kernel inherently written in assembly?
Presumably, this means that the only C constructs available to the kernel programmer are basic assignment operators, bitwise operators, and loops. Is this correct?
Pretty much all C language features will still work in your kernel without needing any particular runtime support; your C compiler will be able to translate them to assembly that runs just as well in kernel mode as it would in a normal user-mode program.
However, libraries such as the C standard library will not be available; you will have to write your own implementations. In particular, this means no malloc and free until you implement them yourself.
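For illustration, the very first allocator in a new kernel can be as crude as the bump allocator sketched below; the fixed heap region and the kmalloc/kfree names are assumptions for the example, and a real kernel would take its heap region from the boot-time memory map:

```c
#include <stddef.h>
#include <stdint.h>

/* Minimal "bump" allocator sketch -- the kind of first-pass kmalloc a new
 * kernel might start with before it has a real heap.  This version never
 * actually frees anything. */
#define HEAP_SIZE (64 * 1024)

static _Alignas(8) uint8_t heap[HEAP_SIZE];
static size_t heap_used;

void *kmalloc(size_t size)
{
    /* Round the request up to 8 bytes so returned pointers stay aligned. */
    size = (size + 7u) & ~(size_t)7u;

    if (size > HEAP_SIZE - heap_used)
        return NULL;                     /* out of memory */

    void *p = &heap[heap_used];
    heap_used += size;
    return p;
}

void kfree(void *ptr)
{
    (void)ptr;   /* a bump allocator cannot free; a real one tracks blocks */
}
```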
If so, how are more complex things like main memory I/O and process scheduling achieved on the lowest level? Can they only be implemented in pure assembly?
Memory I/O is something much lower level, handled by the CPU, the BIOS, and various other hardware in your computer. Thankfully the OS doesn't have to bother with most of this (with some exceptions, such as certain addresses being reserved, and some memory management features).
Process scheduling is a concept that doesn't really exist at the machine-code level on most architectures. x86 does have a concept of tasks and hardware task switching, but nobody uses it. Scheduling is an abstraction set up by the OS as needed: you would have to implement it yourself, or you could decide to have a single-tasking OS if you do not want to spend the effort -- it will still work.
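As a sketch of what that abstraction might look like (every name here is hypothetical), a minimal round-robin scheduler is little more than a task table and a pick-the-next-runnable-task routine; the actual switch to the chosen task is the assembly context-switch routine discussed further down:

```c
#include <stddef.h>

/* Hypothetical sketch of an OS-built scheduling abstraction: a fixed task
 * table and a round-robin policy.  Nothing here is architecture-specific. */
enum task_state { TASK_UNUSED, TASK_RUNNABLE, TASK_BLOCKED };

struct task {
    enum task_state state;
    void           *saved_sp;   /* stack pointer saved by the context switch */
};

#define MAX_TASKS 16
static struct task tasks[MAX_TASKS];
static size_t current;

/* Return the index of the next runnable task, or `current` if none found. */
size_t schedule_next(void)
{
    for (size_t i = 1; i <= MAX_TASKS; i++) {
        size_t candidate = (current + i) % MAX_TASKS;
        if (tasks[candidate].state == TASK_RUNNABLE) {
            current = candidate;
            break;
        }
    }
    return current;
}
```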
What does it mean, then, for a kernel to be written in C (Linux, for example)? Are some parts of the kernel inherently written in assembly?
Some parts of the kernel will be heavily architecture-dependent and will have to be written in assembly. For example, on x86, switching between modes (e.g. to run 16-bit code, or as part of the boot process) or interrupt handling can only be done with certain privileged instructions. The reference manual of your architecture of choice, such as the Intel® 64 and IA-32 Architectures Software Developer's Manual for x86, is the first place to look for those kinds of details.
But C is a portable language; it has no need for such low-level, architecture-specific concepts (although you could in theory do everything from a .c file with compiler intrinsics and inline assembly). It is more useful to abstract this away into assembler routines and build your C code on top of a clean interface that you could maintain if you wanted to port your OS to another architecture.
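For example, the "clean interface" can be nothing more than a handful of C declarations whose definitions live in per-architecture assembly files; the names below are purely illustrative, not from any real kernel:

```c
/* Sketch of the clean-interface idea: the portable C side only sees these
 * declarations, while the definitions live in per-architecture assembly
 * files. */
#include <stdint.h>

struct cpu_context;                 /* opaque saved register state */

/* Implemented in <arch>/context.S: save the outgoing task's registers
 * into *from and restore *to. */
void arch_context_switch(struct cpu_context *from, struct cpu_context *to);

/* Implemented in <arch>/irq.S: mask/unmask interrupts on this CPU. */
void arch_irq_disable(void);
void arch_irq_enable(void);

/* Implemented in <arch>/boot.S: halt the CPU until the next interrupt. */
void arch_cpu_idle(void);
```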
If you are interested in the subject, I highly recommend you pay a visit to the OS Development Wiki, it's a great source of information about Operating Systems and you'll find many hobbyists that share your interest.
About the only things you need to code in assembler are:
Context switches (swapping out the machine state of one abstract process for another)
Access to device registers (and you don't even need this if the devices are memory mapped -- see the volatile-pointer sketch after this list)
Entry and exit from interrupt handlers (this is a kind of context switch)
Perhaps a boot loader
Everything else you should be able to do in C code.
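As an example of the memory-mapped case mentioned in the list, device registers can be reached from plain C through volatile pointers; the base address and register layout below are invented for a hypothetical UART, and real values would come from the board's datasheet:

```c
#include <stdint.h>

/* Sketch of memory-mapped device register access from plain C: a volatile
 * pointer to a fixed physical address.  The UART here is hypothetical. */
#define UART_BASE 0x10000000u

struct uart_regs {
    volatile uint32_t data;      /* write a byte here to transmit it  */
    volatile uint32_t status;    /* bit 0 set => transmitter is busy  */
};

#define UART ((struct uart_regs *)UART_BASE)

void uart_putc(char c)
{
    while (UART->status & 1u)
        ;                        /* spin until the transmitter is free */
    UART->data = (uint32_t)(uint8_t)c;
}
```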
If you want to see this job done spectacularly well, you should go and check out the Multics OS, dating from the mid-60s, supporting large-scale information services (multiple CPUs, virtual memory, ...). It was coded almost entirely in PL/1 (a C-like language), with only very small bits coded in the native assembly language of the Honeywell processor that supported Multics. The Organick book on Multics is worth its weight in gold in terms of showing how Multics worked and how clean most of it is. (We got "Eunuchs" instead.)
There are some places where it will be worthwhile to code in assembler anyway. Regardless of the quality of your compiler's code generator, you will be able to hand-code certain routines that occur in time-critical areas better in assembler than the compiler can. Places I'd expect this to matter: the scheduler, and system call entry and exit. Other places only as measurement indicates. (On older, much smaller systems, one tended to write the OS using a lot of assembler, but that was as much for space savings as for efficiency of execution; C compilers weren't nearly as good.)
I'm wondering how a new architecture that's "out on the market" would not already have some type of operating system.
Device drivers - someone is going to have to write code for these, perhaps one driver for the BIOS and another for the OS. Memory-mapped I/O can get complicated depending on the hardware, such as a controller with a set of descriptors, each containing a physical address and length. If the OS supports virtual memory, then that memory has to be "locked" and the physical addresses obtained in order to program the controller. This is one reason for having a set of descriptors: a single memory-mapped I/O operation can then handle scattered physical pages that have been mapped into a contiguous virtual address space (see the descriptor sketch below).
Assembly code - the other comments here have already noted that some assembly will be required (context switches, and interrupt handlers, which could call C functions, so most of the code could be in C).
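To illustrate the descriptor idea from the device-driver point above, a scatter-gather list might look roughly like the sketch below; the layout and the virt_to_phys/start_dma helpers are assumptions for the example, not from any real controller or OS:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical scatter-gather descriptor: the driver fills in one entry
 * per locked physical page so a single I/O can cover a buffer that is
 * contiguous only in virtual memory. */
struct dma_descriptor {
    uint64_t phys_addr;   /* physical address of this fragment           */
    uint32_t length;      /* fragment length in bytes                    */
    uint32_t last;        /* nonzero on the final descriptor in the list */
};

extern uint64_t virt_to_phys(void *vaddr);             /* assumed helper */
extern void start_dma(struct dma_descriptor *descs);   /* assumed helper */

#define PAGE_SIZE 4096u

/* Build one descriptor per page of a (page-aligned) virtual buffer. */
void submit_io(void *buf, size_t len, struct dma_descriptor *descs)
{
    size_t i = 0;
    for (size_t off = 0; off < len; off += PAGE_SIZE, i++) {
        size_t chunk = (len - off < PAGE_SIZE) ? len - off : PAGE_SIZE;
        descs[i].phys_addr = virt_to_phys((uint8_t *)buf + off);
        descs[i].length    = (uint32_t)chunk;
        descs[i].last      = (off + chunk >= len);
    }
    start_dma(descs);
}
```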
I have an open-source Atari 2600 emulator (Z26), and I'd like to add support for cartridges containing an embedded ARM processor (NXP 21xx family). The idea would be to simulate the 6507 until it tries to read or write a byte of memory (which it will do every 841ns). If the 6507 performs a write, put the address and data on some of the ARM's I/O ports and let the ARM code run 20 cycles, confirm that the ARM is floating its data bus, and let the ARM run for another 38 cycles. If the 6507 performs a read, put the address on the ARM's I/O ports, let the ARM run 38 cycles, grab the data from the ARM's I/O port (hopefully the ARM software will have put it there), and let the ARM run another 20 cycles.
The ARM7 seems pretty straightforward to implement; I don't need to simulate a whole lot of hardware features. Any thoughts?
Edit
What I have in mind would be a routine that would take as parameters a struct holding the machine state and a pointer to a memory access routine. When called, the routine would emulate the ARM's instruction engine, generating appropriate reads, writes, and code fetches. I could then write the memory access routine to regard appropriate areas as flash (with roughly-approximated wait states), RAM, I/O ports, and timer registers. Some other areas would be marked as don't-care, and accesses to any other areas would flag an error and stop the emulator.
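Such an interface might look roughly like the following sketch; all names are hypothetical and only the shape of the API is the point:

```c
#include <stdint.h>

/* Sketch of the interface described above: the caller owns the machine
 * state and supplies a memory-access callback; the core just steps the
 * ARM instruction engine and funnels every read, write, and code fetch
 * through that callback. */
enum mem_op { MEM_READ, MEM_WRITE, MEM_FETCH };

struct arm_state {
    uint32_t regs[16];          /* r0-r12, sp, lr, pc */
    uint32_t cpsr;

    /* Host-supplied callback: perform the access and return the value read
     * (ignored for writes).  `ctx` is opaque host data, e.g. the Z26
     * cartridge state. */
    uint32_t (*mem_access)(void *ctx, enum mem_op op,
                           uint32_t addr, uint32_t value);
    void *ctx;
};

/* Step the core for up to `cycles` cycles; returns the cycles actually
 * consumed so the caller can interleave it with the 6507 core. */
int arm_run(struct arm_state *s, int cycles);
```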
Perhaps QEMU uses such a thing internally. Since the ARM emulation would be integrated into an already-existing emulation engine (which I didn't write and don't fully understand--the only parts of Z26 I've patched have been the memory read/write logic) I would need something with a fairly small footprint.
Any idea how QEMU works inside? Any idea what the GPL licence would require if I just use 2% of the code in QEMU--whether I'd have to bundle the code for the whole thing, or just the part that I use, or what?
Try QEMU.
With some work, you can make my emulator do what you want. It was written for ARM920, and the Thumb instruction set isn't done yet. Neither is the MMU/cache interface. Also, it's slow because it is an interpreter. On the bright side, it's all written in C99.
http://code.google.com/p/gp2xemu/
I haven't worked on it for a while (The svn trunk is 2 years old), but if you're going to use the code, I'll be glad to help you out with the missing features. It is licensed under MIT, so it's just the same as the broad BSD license.