Create a Debugger using C

I have been asked to write a program in C that debugs another C program and stores the value of each variable at every line, loop, or function in a log file.
I have been searching the internet and I found articles on debugging using gdb.
Can I somehow use GDB in my program for this purpose and then store the values of each variable line by line?
I've got basic knowledge of C/C++ so please reply in simple terms.
Thanks

Debuggers depend on some special capability of the hardware, which must be exposed by the operating system (if any).
The basic idea is that the hardware is configured to transfer control to a debugger stub either after every instruction of the target program, or after certain types of instructions such as system calls, or those meeting a hardware breakpoint condition. Typically this would look like an interrupt, supervisor exception, or the like - very platform-specific detail.
As mentioned in comments, on Linux, you use the ptrace functionality of the kernel to interact with the debugger support provided by the hardware and the kernel, abstracting away a lot of the hardware-unique detail and managing the permission issues. Typically you must either be the same user id as the process being debugged, or be the superuser (root). Linux's ptrace also gives you an indirect ability to do things like access the memory (literally, address space) of the target application, something critical to debugger functionality which you cannot ordinarily do from another user-mode program on a multitasking operating system.
Other operating systems will have different methods. Some embedded targets use debug pods which connect your development machine to the embedded board by a few wires. In other cases, debug capability built into the hardware is managed by a small program running on the target processor, which then talks back over a serial or network port to the full debugger program residing on the development machine.
A program such as GDB can do more than just the basics of setting debug stop conditions, dumping registers, and dumping program instructions. Much of its code deals with annotating what it displays based on debug metadata optionally left behind by compilers, walking back through stack frames, and giving the user powerful tools to configure all of this - and of course it does most of this in a target-independent way, with the target-unique code mostly confined to a few interchangeable directories.
You can indeed "drive" GDB from another program - many, many GUI type debuggers do exactly that, existing as graphical front ends for GDB. However, if you were assigned to write a debugger, doing it that way may or may not be consistent with your assignment.

Related

Will static linking allow cross platform execution?

I am curious about how statically linked C executables would work in different environments. Lets say we compile our C code to target x86 MacOs and we statically include everything it uses in the executable as well (print, strlen). What really stops this executable from running in a Windows OS if we include every library it needs? I understand the file format could be different and break but other than that would this technically be able to run?

I see where you're coming from: operating systems make us think, as programmers, that libraries are the be-all and end-all of programming, that a call to a library is all you need to make complex things happen and that everything is contained in them.
But the truth is, libraries mostly provide an abstraction layer. As an example, let's create a library called "hello_world.so" which prints "Hello World!" to the console. That library relies on stdio to handle the complex I/O stuff, but stdio itself depends on at least one other thing: the kernel (some specific targets work without a kernel, but these systems are outside the scope of this answer).
In the desktop world, things can get really complicated: we have several hundred processes all running at once even in an idle system, and all these apps need access to the hardware (possibly at once too), so it was decided a controller was needed, some piece of software that would coordinate all other software running on the same computer. This piece of software is usually called a kernel. On Windows it's the NT kernel, on macOS it's XNU and on Linux it's... the Linux kernel!
On these systems, the biggest job of a library is to abstract the kernel, to make us believe printing text on a Linux or a Windows console works the exact same way when actually it can be completely different! Libraries like stdio/time/etc have different "implementations" but the same "interface": they look the same from the dev's point of view, but the way they achieve their goals can vary wildly; they can do conversions, calls to other hidden or non-hidden functions... All this is completely portable from one OS to the other though; things start to go south for your idea when kernel calls start to show up.
Kernel calls are ways a program can "talk" to the kernel. They can be used to do literally thousands of different things, but for example there's one (or several) to ask for memory (usually reached through malloc), one to print to the console, one to ask if a network packet arrived, one to ask to talk to your GPU... And these system calls are completely different from one kernel to the other, sometimes even between two versions of the same kernel!
These "kernel calls" are the only thing preventing you from running statically-compiled Linux programs on Windows.
PS: Even though all the above is completely true and kernels can be as different from one another as they wish, due to the history of kernels and of computing in general, some kernels actually share the same interface (even though their implementation as you guessed, can be nothing alike). The best example I know of is how most kernels I know of are based on the UNIX kernel.
It means that, even though I have never tested it myself, I think you should be able to statically link a Linux app and use it on Linux, most BSDs and possibly even macOS.

The binary and libraries are specific to the operating system.

The TL;DR is that the linking process translates function calls into addresses that point into the operating system's specific libraries. There are some differences, like alignment, that are settled at compile time, but what is responsible for your x86 instructions not running under a different OS is the linker.
Your compiler produces x86 instructions that are ready to execute, but the result is incomplete. The linker goes through every function call and supplies an address for that function in the executable file, even for functions of the standard library.
The linker creates the executable file following a file format which has a header with information like size, metadata and entry point. Windows and Unix have different specifications for executable files: Windows uses the PE format and most Unix-like systems use ELF, both for executables and libraries.
Through some hacks and non-trivial tricks it is possible to create an executable that can run on Windows and Unix (see αcτµαlly pδrταblε εxεcµταblε for how).
But even if you do all that, there is an obstacle that cannot be circumvented: the kernel. The kernel is, well, a kernel. It's the most important part of an OS, and it provides a set of API calls that give basic, low-level access to computer resources, so functions like malloc are implemented using kernel-specific API calls: VirtualAlloc on Windows, and mmap or brk on Linux and other Unix-like systems.

Main Answer
If your program does anything useful (print output, return a value, communicate on the network), it contains some form of system-call instruction. Each system-call instruction is a request to a particular operating system, and macOS system calls will not work on Windows and vice-versa.
The system-call instruction sends information to the operating system, including a number identifying which service is requested. The operating system then performs that service. When you build your program for macOS, it includes library routines that contain system-call instructions. If you execute those instructions on a Microsoft Windows system, Windows will not understand the macOS requests. It will interpret the information differently, and the program will not work.
So, in theory, there is nothing preventing you from writing your own program loader that reads a Mach-O executable file intended for macOS, loads its contents into memory, and transfers control to its entry point. But the program will not work because of the system calls.
Supplement
You might consider translating all the system calls in the program. Changing the primary number that identifies the service request might be feasible; it might not require changing the executable too much. For example, if 37 is a “write to file” request on macOS, your program loader might change it to 48 on Windows. However, the system calls also require other data be passed, such as pointers to buffers, lengths, and so on, and there are likely many discrepancies in how those are passed, so that macOS requests cannot be easily translated into Windows requests. Also, it can be technically challenging to identify all places in a program that a certain instruction is used—some of the contents of memory of a loaded program are instructions and some are data. Most normal programs may be well-behaved and easy to analyze in this regard, but not all are.
Another potential issue is that programs may expect to have certain modes set in the processor, and the host operating system may or may not have set those modes as needed.

How to make a program run by BIOS?

I searched for info about this but didn't find anything.
The idea is:
If I code a program in C, or any other languages, what else do I need to do for it to get recognized in BIOS and started by it as a DOS program or just a prompt program?
I got this idea after I booted a flash drive with Windows using the ISO and Rufus, which put some code on the flash drive for the BIOS to recognize and run, so I would like to do the same with a program of mine, for example.
Thanks in advance!

An interesting, but rather challenging exercise!
The BIOS will fetch a specific zone from the boot device, called a master boot record. In a "normal" situation with an OS and one or more partitions, the MBR will need to figure out where to find the OS, load that into memory, and pass control to it. At that time the regular boot sequence starts and somewhat later the OS will be running and be able to interact with you. More detail on the initial activities can be found here
Now, for educational purposes, this is not strictly necessary. You could write an MBR that just reads in a fixed part of the disk (the BIOS has functions that will allow you to read raw sectors off a disk, a disk can be considered as just a bunch of sectors each containing 512 bytes of information) and starts that code. You can find an open source MBR here and basically in any open source OS.
That was the "easy" part, because now you probably want to do something interesting. Unless you want to interact with each part of the hardware yourself, you will have to rely on the services provided by the BIOS to interact with keyboard, screen and disk. The traditionally best source about BIOS services is Ralf Brown's interrupt list.
One specific consideration: your C compiler comes with a standard library, and that library will need a specific OS for many of its operations (eg, to perform output to the screen, it will ask the operating system to perform that output, and the OS will typically use the BIOS or some direct access to the hardware to perform that task). So, in going the route explained above, you will also need to figure out a way to replace these services by some that use the BIOS and nothing more - ie, more or less rewrite the standard library.
In short, to arrive at something usable, you will be writing the essential parts of an operating system...

Actually, BIOS is going to be dead in the next two years (Intel will not support any BIOSes after that), so you may want to learn the UEFI standard. UEFI from v2.4 allows you to write and add custom UEFI applications. (BTW, the "traditional" BIOS settings screen on UEFI computers is often implemented as a custom UEFI app.)

Qemu: trace MMU operation

Currently I'm trying to run qemu-system-arm for armv7 architecture, do some initial setup for paging and then enable MMU.
I run qemu with the gdb stub and then connect to it with gdb.
I must have screwed something up with translation tables/registers/etc. The thing is, the minute I set the MMU-enable bit in the control register, gdb can't fetch data from memory anymore: after the ni command which executes the mmu-enable instruction, it doesn't fetch the next instruction and I can't access memory.
Is there any way to see what happens inside QEMU's MMU? Where it takes translation tables from, what it calculates, etc.
Or should I just recompile it with my additional debug output?

No, there's no way to trace this without modifying QEMU's sources yourself.
So I did. For ARM architecture, the relevant code is found in target-arm/helper.c - get_phys_addr* functions.

No, there's no way to trace this without modifying QEMU's sources yourself to add debugging output. This is a specific case of a more general tendency, which is that QEMU's design and approach is largely "run correct code quickly", not to provide detailed introspection into the behaviour of possibly buggy guests. Sometimes, as in this case, there's a nice easy location to add your own debug printing; sometimes, as in the fairly common desire to print all the memory accesses made by the guest, there is nowhere in the C code where tracing can be put to catch all accesses.

When I used QEMU for debugging VM issues in an operating system kernel that I had built with a colleague, I ended up connecting GDB to QEMU itself (instead of debugging the guest inside QEMU).
You can place breakpoints on the MMU table walking function and step through it.

Run executable on MINI2440 with NO OS

I have Fedora installed on my PC and I have a Friendly ARM Mini2440 board. I have successfully installed Linux kernel and everything is working. Now I have some image processing program, which I want to run on the board without OS. The only process running on board should be my program. And in that program how can I access the on board camera to take image from, and serial port to send output to the PC.

You're talking about what is often called a bare-metal environment. Google can help you, for example here. In a bare-metal environment you have to have a good understanding of your hardware because you have to take care of a lot of things that the OS normally handles.
I've been working (off and on) on bare-metal support for my ELLCC cross development tool-chain. I have the ARM implementation pretty far along but there is still quite a bit of work to do. I have written about some of my experiences on my blog.
First off, you have to get your program started. You'll need to write some start-up code, usually in assembly, to handle the initialization of the processor as it comes out of reset (or is powered on). The start-up code then typically passes control to code written in C that ultimately directly or indirectly calls your main() function. Getting to main() is a huge step in your bare-metal adventure!
Next, you need to decide how to support your hardware's I/O devices which in your case include the camera and serial port. How much of the standard C (or C++) library does your image processing require? You might need to add some support for functions like printf() or malloc() that normally need some kind of OS support. A simple "hello world" would be a good thing to try next.
ELLCC has examples of various levels of ARM bare-metal in the examples directory. They range from a simple main() up to and including MMU and TCP/IP support. The source for all of it can be browsed here.
I started writing this before I left for work this morning and didn't have time to finish. Both dwelch and Clifford had good suggestions. A bootloader might make your job a lot simpler and documentation on your hardware is crucial.

First you must realise that without an OS, you are responsible for bringing the board up from reset, including configuring the PLL and SDRAM, and also for the driver code for every device on the board you wish to use. To do that requires adequate documentation of the board and its devices.
It is possible that you can use the existing bootloader to configure the core and SDRAM, but that may not meet your requirement for the only process running on the board should be your image processing program.
Additionally you will need some means of loading and bootstrapping; again the existing Linux bootstrapper may suit.
It is by no means straightforward and cannot really be described in detail here.

Can I execute any c made prog without any os platform?

I googled about it and somewhere I read ....
Yes, you can. That is happening in the case of embedded systems
I think NO, it's not possible. Any platform must have an operating system. Or else, your program must itself be an OS.
Either soft or hard-wired. Without an operating system your component wouldn't work.
Am I right, or can anybody explain the answer to me? (I don't have any idea about embedded systems...)

Of course you can. All a (typical) CPU needs is power and access to a memory, then it will execute its hard-coded boot sequence.
Typically this will involve reading some pre-defined address, interpreting the contents there as instructions, and starting to run them.
These instructions could of course come from a C program, although at this level it's more common to write the very early stages (called bootstrapping) in assembly.
This of course doesn't mean, if I were to read your question title literally, that any C program can be run this way. If the program assumes there is an OS, but there isn't, it won't work. This should be pretty obvious.

You can run a program in a system without an Operating System ... and that program need not be an Operating System itself.
Think about all the computers (or processors if you prefer) inside a car: engine management, air conditioning, ABS, ..., ...
All of those systems have a program (possibly written in C) running. None of the processors have an OS.
The Standard specifically differentiates between hosted implementations and freestanding implementations:
5.1.2.1 Freestanding environment
1 In a freestanding environment (in which C program execution may take place
without any benefit of an operating system), the name and type of the
function called at program startup are implementation-defined. Any library
facilities available to a freestanding program, other than the minimal set
required by clause 4, are implementation-defined.
2 The effect of program termination in a freestanding environment is
implementation-defined.
5.1.2.2 Hosted environment
1 A hosted environment need not be provided, but shall conform to the
following specifications if present.
...

I think you would have fun writing 'toy' kernels that are designed to run under simulators like QEMU (or virtualization platforms, Xen + MiniOS is one of my favorites). Without (much) difficulty, you could get a basic console up and running and start printing things to it. It's really fun, educational and satisfying all at once.
If you are working on x86 .. and get your spiffy kernel working under QEMU .. there's a very good chance that it will also work on real hardware. You might enjoy it.
Anyway, the answer to your question is most decidedly yes. It's especially easy if you happen to be using a boot loader .. for instance, google memtest86 and grab the code.

Usually, any C program will have a variety of system calls which depend on the operating system. For example, printf makes a system call to write to the screen buffer. Opening files and things like that are also system calls.
So basically, you can run C code that has just been compiled and assembled into machine code on a processor, but if the code makes any system calls, it would just freeze up the processor when it tries to jump to a memory location that it thinks is the operating system. This of course would depend on your being able to get the program running in the first place, which is not easy without the operating system either.

Embedded systems are legitimate OSes in their own right; they're just not general-purpose OSes. Any userland program (i.e. a program that is not itself an operating system) needs an operating system to run on top of.

As an example: Building Bare-Metal ARM Systems with GNU
Many embedded systems do not have enough resources for a full OS, some may use a scheduler kernel or RTOS, others are coded 'bare metal'. The main() C entry point is entered after reset. Only a small amount of assembler code is required to initialise a microprocessor, to execute C code. All C requires to run generally is a stack - usually simply a case of initialising the stack pointer to a specific address. Some processor specific initialisation of interrupt/exception vectors, system clocks, memory controllers etc. may be necessary also.
On a desktop PC, typically you have a BIOS that handles basic hardware initialisation such as SDRAM controller setup and timing, and then bootstrapping from a disk boot-sector, which then in turn bootstraps an OS. Any of that code could be written in C (and some of it probably is), and it could do something other than boot an OS - it could do anything - it is just code.
OSs are useful for non-dedicated computing devices where the end user may select one of many programs to execute and possibly several simultaneously. Most embedded systems do just one thing, the software is often loaded from ROM or executes directly from ROM, and is never changed and executes indefinitely (usually stopped only by power-down).
You still of course might implement device drivers and the like, but often these are an integral part of the application rather than a separate entity. Even when you do use an RTOS in an embedded system, it is still generally integral to your application rather than an OS in the sense you might understand. In these cases the RTOS is simply a library like any other, and is often initialised and started from main() rather than the other way around as you might expect.

every piece of hardware has to have a piece of software that operates it, be it embedded firmware (smaller and relatively fixed, like vxworks) or an operating system software that can run complex arbitrary code on top of it (like windows, linux, or mac).
think of it as a stack. at the bottom, you have the hardware. on top of that, a piece of software that can control that hardware. on top of that, you can have all sorts of stuff. in the case of a voip phone, you'll have vxworks controlling the hardware, and a layer on top of that that handles all the phone applications.
so going back to your question, yes, you CAN run any c program on anything, BUT it depends what kind of c program it is. if it's a low level c program that can talk to hardware, then you dont need anything other than your program and the hardware. if it's a higher level c program (like a chat program), then you need a whole bunch of stuff between your program and the hardware.
make sense?

Obviously, you cannot execute any arbitrary C program without some sort of OS or OS-equivalent. Similarly, I can write a C program under Linux that won't run under Microsoft Windows.
However, you can write C programs on almost anything. It's a popular language to write software for embedded systems in, and they very often don't have an OS.
Many embedded systems have just a CPU hooked up to a ROM, with pins coming out of the chip that are directly attached to inputs and outputs. There is no user I/O, no file system, no process scheduling, nothing you'd typically want an OS for. In those cases, a C programmer might write a program that is burned into a ROM, which will handle everything itself.
(Some embedded systems are more complicated, and can use an OS. Linux is frequently used, since it's free for the use, can be made very compact, and can be changed at any level. Not all do, though.)

You definitely don't need an OS to run your C code on any system. What you will need is two pieces of initialization code - one to initialize the hardware needed (processor, clock, memory) and another to set up your stack and C runtime (i.e. initialization of data and BSS sections). This, of course, means that you cannot take advantage of the multithreading, messaging and synchronization services that an OS would provide. I'll try and break it down into some steps to give you an idea:
Write a "reset_routine" that runs when the board starts. This will initialize the clock and any external memory needed. (This routine will have to execute from a memory that is either internal or one that can be initialized and programmed externally).
The reset_routine, after the hardware initializations, transfers control to a "sw_runtime_init" routine that will set up the stack and the globals defined by your application. (Do a jump from reset_routine to sw_runtime_init instead of a call to avoid stack usage).
Compile and link this to your application, whilst ensuring that the "reset_routine" is linked to the location where the reset vector points to.
Load this onto your target and pray.
