Coding C libraries for an Operating System - c

I am trying to create a DOS-like OS. I have read multiple articles and multiple books (I even paid for a lifetime subscription to O'Reilly Media), but to no avail: I found nothing useful. I want to learn how to make operating system libraries, which raises the question: are the libraries you code for a program the same as the ones you code when compiling for an operating system?
I know operating systems are very challenging to make, and that very few of the programmers who attempt one ever produce a functioning one, which is why it's described as "the great pinnacle of programming". But still, I'm going to make an attempt at making one (just for fun, and maybe to learn a few pointers on the way).
All I need to do this is basically to learn how to make the libraries, C (which I already know and love), assembly (which I kind of know how to use along with C) and a compiler (I use the GNU toolchain). The only thing I am having trouble with is coding the libraries. I'm like wow right now, who knew that coding libraries is so hard, like point me to a book or something! But no. I'm not asking for a book here, all I'm asking for is some advice on how to do this, like:
How do you start making some basic I/O libraries?
Is it the same as making a regular C library?
And finally, is it going to be hard? (JK, I already know that this is going to be extremely hard, which is why I prepared so much)
In summary, the main question is: how can I make this work, or is there a pre-built library that would speed up the process?

Are libraries which you code for a program the same if you are compiling it for an operating system?
Absolutely not. A user-space C library at its lowest level makes system calls to an operating system to interact with hardware via device drivers; it is the device driver and interaction with hardware you will be writing.
From my experience doing embedded system bringups, the way you start is with a development board with a legacy RS-232 port. It's about the easiest possible device to write a driver for - you write bytes to a memory mapped IO address, wait a bit then write some more. This is where your first debug output goes too.
You might find yourself waggling IO pins and probing them with a logic analyser or DSO on the route to this though - hence why you want a development board where the signals are accessible.
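As a rough illustration of the "write a byte, wait, write some more" pattern described above, here is a minimal polling-driver sketch. The register layout (`uart_regs_t`), the `UART_TX_READY` bit and the function names are all hypothetical; a real UART's offsets and status bits come from its datasheet, and the base address comes from the board's memory map.

```c
#include <stdint.h>

/* Hypothetical register layout; a real UART's offsets and bits
 * come from its datasheet. */
typedef struct {
    volatile uint32_t data;    /* write a byte here to transmit it */
    volatile uint32_t status;  /* bit 0 set = transmitter ready (assumed) */
} uart_regs_t;

#define UART_TX_READY 0x1u

/* Transmit one byte, busy-waiting until the transmitter can accept it. */
void uart_putc(uart_regs_t *uart, char c)
{
    while ((uart->status & UART_TX_READY) == 0) {
        /* spin: this is the "wait a bit" step */
    }
    uart->data = (uint8_t)c;
}

/* Transmit a NUL-terminated string; early debug output can go through this. */
void uart_puts(uart_regs_t *uart, const char *s)
{
    while (*s != '\0') {
        uart_putc(uart, *s++);
    }
}
```

On hardware, `uart` would be a pointer cast from the device's base address; taking it as a parameter also lets you point it at an ordinary struct in RAM to exercise the logic on a desktop.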
None of the standard C library will be available to you, so you'll need equivalents of some of the things it provides, but in kernel space, including type definitions, memory management, and intrinsics the compiler expects, particularly those for memory barriers. The C library doesn't provide much in the way of data structures or algorithms anyway, but you'll definitely want to write some early on.
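To make "equivalents of things the C library provides" concrete, here is a sketch of two of the memory routines a kernel typically has to supply itself. GCC, for instance, documents that even a freestanding environment must provide memcpy, memmove, memset and memcmp. The `k_` prefix is a hypothetical naming choice so the sketch can also be compiled on a hosted system without clashing with libc; in a real kernel they would carry the standard names.

```c
#include <stddef.h>

/* Fill n bytes at dst with the given byte value; returns dst. */
void *k_memset(void *dst, int value, size_t n)
{
    unsigned char *d = dst;
    while (n--) {
        *d++ = (unsigned char)value;
    }
    return dst;
}

/* Copy n bytes from src to dst (regions must not overlap); returns dst. */
void *k_memcpy(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;
    while (n--) {
        *d++ = *s++;
    }
    return dst;
}
```

These byte-at-a-time loops favour clarity over speed. Be aware that an optimising compiler may recognise such a loop and turn it back into a call to the standard routine, so check the generated code when building a real kernel.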

Related

C runtime library : what for?

Context:
Creating a toy OS, written in assembly and C.
X86. 32-bits first, 64-bits then.
Currently reading about how to make a C library and trying to understand the underlying concepts.
My question comes from the reading on this page.
I read the other SO questions concerning runtime libraries but still don't get it.
Question:
Ok, a runtime library seems to help providing low level functionalities that an application library cannot provide.
What wonders can I do with a runtime library?
My searches on the subject led to theoretical explanations (MSDN and so on).
I need a practical, visual explanation.
Update:
I saw the question the admins refer to, and I had obviously read it already. But it was not high-level enough. But that is fine :)
Thanks
A C runtime library is more a part of your C implementation than it is a core part of the operating system, especially if the C implementation provides only static linking, as might be the case in a toy OS.
Among other things, though, the C runtime library provides all the functions necessary for programs to obtain services from the OS, such as memory allocation and I/O. These are not necessarily the same functions the OS kernel uses internally for the same or similar purposes.
Programs written in other languages may or may not rely on the C language runtime (they may provide their own, independent one instead), and statically-linked C programs include all necessary functions in their own images, instead of relying on dynamically loading them from a library at run time. In a sense, the C runtime library is distributed and duplicated across all statically-linked programs built from C sources.
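For a practical picture of "functions that let programs obtain services from the OS", here is a minimal sketch of a runtime-library routine layered on a system call. The names `rt_strlen` and `rt_puts` are hypothetical. The sketch assumes a POSIX host and uses its `write()` wrapper; on a toy OS that call would be replaced by your own system-call stub.

```c
#include <stddef.h>
#include <unistd.h>   /* write(): POSIX wrapper around the system call */

/* Length of a NUL-terminated string, without relying on <string.h>. */
size_t rt_strlen(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0') {
        n++;
    }
    return n;
}

/* Write s plus a newline to standard output.
 * Returns 0 on success, -1 on failure. */
int rt_puts(const char *s)
{
    size_t len = rt_strlen(s);
    while (len > 0) {
        ssize_t n = write(STDOUT_FILENO, s, len);
        if (n <= 0) {
            return -1;           /* error, or no progress */
        }
        s += n;                  /* handle partial writes */
        len -= (size_t)n;
    }
    return write(STDOUT_FILENO, "\n", 1) == 1 ? 0 : -1;
}
```

The program only ever sees `rt_puts`; how the bytes actually reach the screen is the kernel's business, which is why the kernel itself cannot use this function internally.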

Is it viable to write a Linux kernel-mode debugger for Intel x86-64 in Common Lisp, and with which Common Lisp implementation[s]?

I'm interested in developing some kind of ring0 kernel-mode debugger for x86-64 in Common Lisp that would be loaded as a Linux kernel module and as I prefer Common Lisp to C in general programming, I wonder how different Common Lisp implementations would fit this kind of programming task.
The debugger would use some external disassembling library, such as udis86 via some FFI. It seems to me that it's easiest to write kernel modules in C as they need to contain C functions int init_module(void) and void cleanup_module(void) (The Linux Kernel Module Programming Guide), so the kernel-land module code would call Common Lisp code from C by using CFFI. The idea would be to create a ring0 debugger for 64-bit Linux inspired by the idea of Rasta Ring 0 Debugger, which is only available for 32-bit Linux and requires a PS/2 keyboard. I think the most challenging part would be the actual debugger code with hardware and software breakpoints and low-level video, keyboard or USB input device handling. Inline assembly would help a lot with that. It seems to me that in SBCL inline assembly can be implemented by using VOPs (SBCL Internals: VOP) (SBCL Internals: Adding VOPs), and this IRC log mentions that ACL (Allegro Common Lisp), CCL (Clozure Common Lisp) and CormanCL have LAPs (Lisp Assembly Programs). Both ACL and CormanCL are proprietary and thus discarded, but CCL (Clozure Common Lisp) could be one option. The ability to build standalone executables is a requirement too; SBCL, which I'm currently using, has it, but as the executables are entire Lisp images, their size is quite big.
My question is: is it viable to create a ring0 kernel-mode debugger for Intel x86-64 in Common Lisp, with low-level code implemented in C and/or assembly, and if it is, which Common Lisp implementations for 64-bit Linux are best suited to this kind of endeavour, and what are the pros and cons if there is more than one suitable Common Lisp implementation? Scheme can be one possible option too, if it offers some benefits over Common Lisp. I am well aware that the great majority of kernel modules are written in C, and I know C and x86 assembly well enough to be able to write the required low-level code in C and/or assembly. This is not an attempt to port the Linux kernel to Lisp (see: https://stackoverflow.com/questions/1848029/why-not-port-linux-kernel-to-common-lisp), but a plan to write in Common Lisp a Linux kernel module that would be used as a ring0 debugger.
You might want to take a look at the Feb 2 2008 lispvan talk "Doing Evil Things with Common Lisp" by Brad Beveridge on working with a filesystem driver from SBCL.
Talk description & files
In it he mentions:
"A C/C++ Debugger written in CL??
Totally pie in the sky right now
But, how cool would that be?
Not that much of a stretch, only need to be able to write to memory where the library is located to insert break points & then trap signals on the Lisp side
Could use dirty tricks to replace the C functions with calls to Lisp functions
Apart from some details, it's probably not that hard – certainly nothing “new”
The dirty trick would involve overwriting the C code with another jump (branch without link) into a Lisp callback. When the Lisp code returns it can jump directly back to the original calling function via the link register.
Also, I'm totally glossing over the real difficulty in writing a debugger – it would be time consuming."
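The "write to memory to insert break points" idea from the talk can be sketched in C for x86, where the one-byte `int3` instruction has opcode 0xCC. The `breakpoint_t` type and the `bp_insert`/`bp_remove` names are hypothetical. The sketch only patches an ordinary writable buffer; real code pages usually need their permissions changed first, and handling the resulting trap is omitted entirely.

```c
#include <stdint.h>

#define INT3_OPCODE 0xCCu   /* x86 one-byte breakpoint instruction */

typedef struct {
    uint8_t *addr;    /* where the breakpoint was placed */
    uint8_t  saved;   /* original byte, needed to restore the code */
    int      active;  /* nonzero while the breakpoint is installed */
} breakpoint_t;

/* Save the byte at addr and overwrite it with int3. */
void bp_insert(breakpoint_t *bp, uint8_t *addr)
{
    bp->addr = addr;
    bp->saved = *addr;
    *addr = INT3_OPCODE;
    bp->active = 1;
}

/* Put the original byte back, if the breakpoint is installed. */
void bp_remove(breakpoint_t *bp)
{
    if (bp->active) {
        *bp->addr = bp->saved;
        bp->active = 0;
    }
}
```

A debugger keeps one such record per breakpoint so that it can restore the original instruction before single-stepping over it.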
Nope, it would not be feasible to implement a kernel module in CL, for the obvious reason that just to make it work you would need a lot of hacks, and you may end up losing all the benefits the Lisp language provides: your Lisp code will look like C code in S-expressions.
Write your kernel module to export whatever functionality/data you need from the kernel; then, using ioctl or read/write operations, any user-mode program can communicate with the module.
I am not sure whether there is any kernel module generic enough that it implements Lisp (or maybe a subset of it), such that you write code in Lisp and this generic module reads the Lisp code and runs it as a sub-component, basically Kernel -> Generic Lisp module -> your Lisp code (that will run in the kernel).

Sample solutions for low-level problems written in C [closed]

Closed. This question is off-topic. It is not currently accepting answers.
Closed 11 years ago.
Does anyone know where I might find sample solutions written in C for low level / systems level applications? A really good website or book recommendation would be cool too.
I've learned some of the basics, but would like to see some code within the context of a real solution written in C, and specifically for a lower-level problem. I'd be interested in how C is used within the context of OS programming, for example. What are some areas where C is used for lower-level programming?
Thanks.
I would suggest you study MINIX3 from Tanenbaum: http://www.minix3.org/
It's a microkernel architecture, and with his book ( http://vig.prenhall.com/catalog/academic/product/0,1144,0131429388,00.html ) it is really enlightening.
In my opinion, studying the Linux kernel is a bit hardcore for a start ;), and from an academic point of view the microkernel architecture is superior to the monolithic kernel.
Furthermore, with only a few thousand lines of code, unlike the Linux kernel, it's digestible in a realistic timeframe.
And it's a really serious project; the European Union sponsored it with some millions, as far as I am aware. I think I remember him saying that in one of his talks.
And you have an X server running there, a gcc toolchain, etc.
Have fun :)
EDIT: As I read the comments, someone mentions the Ruby interpreter. It's written in a mixture of C and Ruby, and as far as it was mentioned in one episode of se-radio.net, it is really nice source code. Though I have to admit, I haven't looked into it myself. Might be worth digging into if you have some interest in Ruby too.
I'd suggest looking at some (for you) interesting open source projects written in C. For example, there's busybox, a piece of software that runs on embedded devices and has lots of smaller programs to study. You could, for example, take the source for the telnet client on one side and the corresponding RFC on the other. Or, for a steeper learning curve, you could also try studying the open source OSes, like the Linux kernel (here's the tree for browsing) or the BSDs. It's a lot more involved than busybox, but you can still find some parts that are fairly easy to understand if you're familiar with the context.
Studying the Linux kernel, maybe in conjunction with one of the several books on the kernel or device drivers would provide a wealth of material. Much of this is available free.
Any or all of the books by W. Richard Stevens that walk through the implementation (TCP/IP Illustrated) or use (UNIX Network Programming) of the networking stack, or his Advanced Programming in the UNIX Environment book.
If you have a leaning toward Windows there are several good books, even if they're quite old, including:
Programming Server-Side Applications for Microsoft Windows 2000 by Richter and Clark
Programming Applications for Microsoft Windows by Richter
I would suggest the following sources might be interesting re: operating systems from a learning perspective. Be aware that modern kernels contain many advancements that these do not:
The original Linux code.
xv6. This is a simple unix OS that goes along with MIT's excellent OpenCourseWare course on Operating Systems.
Other ideas:
The current grub stage 1 bootloader isn't that complicated - it's pretty hard to be complicated with 512 bytes to play with.
The Linux kernel module guide gives you an introduction to building kernel modules. You could experiment with building custom, yet pointless, drivers that add say character devices to /dev/ or proc devices to /proc and work towards implementing something interesting. People have implemented web servers in kernel space...
If you want to experiment with Windows kernels, have a go with Native NT applications. I'd start with printing a pointless boot message, then move up to drivers.
Beyond that, it's hard to suggest where you might want to go. Systems level is a wide space.
In the context of low-level programming, C and C++ are portable assembler. In many of the above spaces the standard library is either partially or totally missing, and extra functionality may be implemented by existing parts of the system-level code you're modifying, so you have to be aware of the API functions available to you in any given space and what you need to implement yourself, as well as what your memory and processing requirements must be. For example, a bootloader written to the MBR has to use BIOS interrupts and starts in real (16-bit) mode. Those are the constraints of the hardware design. Likewise, functions like fopen() aren't available in kernel space since they wrap system calls - you'd need to use kernel-specific constructs to achieve this if it really made sense to write a file from kernel space.
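As a small example of the hardware-imposed constraints mentioned above: a classic BIOS treats a 512-byte sector as bootable only if its last two bytes are 0x55 and 0xAA. The sketch below checks that signature on a buffer; the function name `has_boot_signature` is hypothetical, and reading the sector from a disk or image file is left out.

```c
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u

/* Return 1 if the buffer holds a full sector ending in the 0x55 0xAA
 * boot signature, 0 otherwise. */
int has_boot_signature(const uint8_t *sector, size_t len)
{
    if (len < SECTOR_SIZE) {
        return 0;
    }
    return sector[510] == 0x55 && sector[511] == 0xAA;
}
```

Everything a stage 1 loader does has to fit in the bytes before that signature, which is why such loaders stay simple.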

Can I execute any c made prog without any os platform?

I googled about it and somewhere I read ....
Yes, you can. That is happening in the case of embedded systems
I think NO, it's not possible. Any platform must have an operating system. Or else, your program must itself be an OS.
Either soft or hard-wired. Without an operating system your component wouldn't work.
Am I right, or can anybody explain the answer to me? (I don't have any idea about embedded systems...)
Of course you can. All a (typical) CPU needs is power and access to a memory, then it will execute its hard-coded boot sequence.
Typically this will involve reading some pre-defined address, interpreting the contents there as instructions, and starting to run them.
These instructions could of course come from a C program, although at this level it's more common to write the very early stages (called bootstrapping) in assembly.
This of course doesn't mean, if I were to read your question title literally, that any C program can be run this way. If the program assumes there is an OS, but there isn't, it won't work. This should be pretty obvious.
You can run a program in a system without an Operating System ... and that program need not be an Operating System itself.
Think about all the computers (or processors if you prefer) inside a car: engine management, air conditioning, ABS, ..., ...
All of those systems have a program (possibly written in C) running. None of the processors have an OS.
The Standard specifically differentiates between hosted implementations and freestanding implementations:
5.1.2.1 Freestanding environment
1 In a freestanding environment (in which C program execution may take place
without any benefit of an operating system), the name and type of the
function called at program startup are implementation-defined. Any library
facilities available to a freestanding program, other than the minimal set
required by clause 4, are implementation-defined.
2 The effect of program termination in a freestanding environment is
implementation-defined.
5.1.2.2 Hosted environment
1 A hosted environment need not be provided, but shall conform to the
following specifications if present.
...
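Code can find out which of the two environments it is being compiled for: the standard predefined macro `__STDC_HOSTED__` is 1 for a hosted implementation and 0 for a freestanding one. The sketch below simply reports it; the function name `is_hosted_build` is hypothetical. With GCC, passing `-ffreestanding` is what switches the macro to 0.

```c
/* Report whether this translation unit was compiled for a hosted
 * implementation (1) or a freestanding one (0). */
int is_hosted_build(void)
{
#if __STDC_HOSTED__
    return 1;
#else
    return 0;
#endif
}
```

A kernel or bare-metal project can use the same macro to guard code that depends on hosted-only headers such as <stdio.h>.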
I think you would have fun writing 'toy' kernels that are designed to run under simulators like QEMU (or virtualization platforms, Xen + MiniOS is one of my favorites). Without (much) difficulty, you could get a basic console up and running and start printing things to it. It's really fun, educational and satisfying all at once.
If you are working on x86 .. and get your spiffy kernel working under QEMU .. there's a very good chance that it will also work on real hardware. You might enjoy it.
Anyway, the answer to your question is most decidedly yes. It's especially easy if you happen to be using a boot loader .. for instance, google memtest86 and grab the code.
Usually, any C program will make a variety of system calls which depend on the operating system. For example, printf ultimately makes a system call to write its output (to the terminal, by default). Opening files and things like that are also system calls.
So basically, you can run C code that has just been compiled and assembled into machine code on a processor, but if the code makes any system calls, it would just freeze up the processor when it tries to jump to a memory location that it thinks is the operating system. This of course depends on your being able to get the program running in the first place, which is not easy without the operating system either.
Embedded systems are legitimate OS's in their own right, they're just not general purpose OS's. Any userland program (i.e. a program that is not itself an operating system) needs an operating system to run on top of.
As an example: Building Bare-Metal ARM Systems with GNU
Many embedded systems do not have enough resources for a full OS, some may use a scheduler kernel or RTOS, others are coded 'bare metal'. The main() C entry point is entered after reset. Only a small amount of assembler code is required to initialise a microprocessor, to execute C code. All C requires to run generally is a stack - usually simply a case of initialising the stack pointer to a specific address. Some processor specific initialisation of interrupt/exception vectors, system clocks, memory controllers etc. may be necessary also.
On a desktop PC, typically you have a BIOS that handles basic hardware initialisation such as SDRAM controller setup and timing, and then bootstrapping from a disk boot-sector, which then in turn bootstraps an OS. Any of that code could be written in C (and some of it probably is), and it could do something other than boot an OS - it could do anything - it is just code.
OSs are useful for non-dedicated computing devices where the end user may select one of many programs to execute, and possibly several simultaneously. Most embedded systems do just one thing; the software is often loaded from ROM or executes directly from ROM, is never changed, and executes indefinitely (usually stopped only by power-down).
You still of course might implement device drivers and the like, but often these are an integral part of the application rather than a separate entity. Even when you do use an RTOS in an embedded system, it is still generally integral to your application rather than an OS in the sense you might understand. In these cases the RTOS is simply a library like any other, and is often initialised and started from main() rather than the other way around as you might expect.
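To illustrate the "scheduler as a library started from main()" arrangement, here is a deliberately tiny cooperative round-robin sketch. It is not the API of any real RTOS: `sched_add`, `sched_run`, `tick_task` and `MAX_TASKS` are hypothetical names, there is no preemption, and tasks share one stack by simply returning.

```c
#include <stddef.h>

#define MAX_TASKS 4

typedef void (*task_fn)(void *arg);

typedef struct {
    task_fn fn;    /* function to call on each round */
    void   *arg;   /* opaque argument passed to fn */
} task_t;

static task_t tasks[MAX_TASKS];
static size_t task_count;

/* Register a task; returns 0 on success, -1 if the table is full. */
int sched_add(task_fn fn, void *arg)
{
    if (task_count == MAX_TASKS) {
        return -1;
    }
    tasks[task_count].fn = fn;
    tasks[task_count].arg = arg;
    task_count++;
    return 0;
}

/* Call every registered task once per round. A real system would
 * loop forever; a round limit keeps the sketch testable. */
void sched_run(unsigned rounds)
{
    for (unsigned r = 0; r < rounds; r++) {
        for (size_t i = 0; i < task_count; i++) {
            tasks[i].fn(tasks[i].arg);
        }
    }
}

/* Example task: increments the int counter it is given. */
void tick_task(void *arg)
{
    ++*(int *)arg;
}
```

An application's main() would call `sched_add` for each of its tasks and then `sched_run`, so the application starts the scheduler rather than the scheduler starting the application.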
Every piece of hardware has to have a piece of software that operates it, be it embedded firmware (smaller and relatively fixed, like VxWorks) or operating system software that can run complex arbitrary code on top of it (like Windows, Linux, or Mac).
Think of it as a stack. At the bottom, you have the hardware. On top of that, a piece of software that can control that hardware. On top of that, you can have all sorts of stuff. In the case of a VoIP phone, you'll have VxWorks controlling the hardware, and a layer on top of that that handles all the phone applications.
So going back to your question, yes, you CAN run any C program on anything, BUT it depends what kind of C program it is. If it's a low-level C program that can talk to hardware, then you don't need anything other than your program and the hardware. If it's a higher-level C program (like a chat program), then you need a whole bunch of stuff between your program and the hardware.
Make sense?
Obviously, you cannot execute any arbitrary C program without some sort of OS or OS-equivalent. Similarly, I can write a C program under Linux that won't run under Microsoft Windows.
However, you can write C programs on almost anything. It's a popular language to write software for embedded systems in, and they very often don't have an OS.
Many embedded systems have just a CPU hooked up to a ROM, with pins coming out of the chip that are directly attached to inputs and outputs. There is no user I/O, no file system, no process scheduling, nothing you'd typically want an OS for. In those cases, a C programmer might write a program that is burned into a ROM, which will handle everything itself.
(Some embedded systems are more complicated, and can use an OS. Linux is frequently used, since it's free to use, can be made very compact, and can be changed at any level. Not all do, though.)
You definitely don't need an OS to run your C code on any system. What you will need is two pieces of initialization code - one to initialize the hardware needed (processor, clock, memory) and another to set up your stack and C runtime (i.e. initialization of data and BSS sections). This, of course, means that you cannot take advantage of the multithreading, messaging and synchronization services that an OS would provide. I'll try and break it down into some steps to give you an idea:
Write a "reset_routine" that runs when the board starts. This will initialize the clock and any external memory needed. (This routine will have to execute from a memory that is either internal or one that can be initialized and programmed externally).
The reset_routine, after the hardware initializations, transfers control to a "sw_runtime_init" routine that will set up the stack and the globals defined by your application. (Do a jump from reset_routine to sw_runtime_init instead of a call to avoid stack usage).
Compile and link this to your application, whilst ensuring that the "reset_routine" is linked to the location where the reset vector points to.
Load this onto your target and pray.
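The data/BSS part of the "sw_runtime_init" step above can be sketched in C as follows. This is only an illustration: on a real target the addresses and lengths would come from symbols defined in the linker script, and the stack pointer would already have been set in assembly before any C runs. Here they are hypothetical parameters so the logic can be tried on a desktop.

```c
#include <stddef.h>
#include <stdint.h>

/* Prepare RAM for C code:
 *  - copy initial values of initialised globals from their load
 *    location (e.g. ROM/flash) to their run location in RAM
 *  - zero the BSS region, where zero-initialised globals live */
void sw_runtime_init(uint8_t *data_start, const uint8_t *data_load,
                     size_t data_len, uint8_t *bss_start, size_t bss_len)
{
    for (size_t i = 0; i < data_len; i++) {
        data_start[i] = data_load[i];
    }
    for (size_t i = 0; i < bss_len; i++) {
        bss_start[i] = 0;
    }
}
```

After this has run, globals hold the values the C language promises, and control can be transferred to the application's entry point.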

Content for Linux Operating Systems Class

I will be TA for an operating systems class this upcoming semester. The labs will deal specifically with the Linux Kernel.
What concepts/components of the Linux kernel do you think are the most important to cover in the class?
What do you wish was covered in your studies that was left out?
Any suggestions regarding the Linux kernel or overall operating systems design would be much appreciated.
My list:
What an operating system's concerns are: Abstraction and extension of the physical machine and resource management.
How the build process works, i.e., how architecture-specific/machine code is incorporated
How system calls work and how modules can link up
Memory management / Virtual Memory / Paging and all the rest
How processes are born, live and die in POSIX and other systems
userspace vs kernel threads and what the difference is between process/threads
Why the monolithic Kernel design is growing tiresome and what are the alternatives
Scheduling (and some of the alternative / domain specific schedulers)
I/O, Driver development and how they are dynamically loaded
The early stages of booting and what the kernel does to setup the environment
Problems with clocks, mmu-less systems etc
... I could go on ...
I almost forgot IPC and Unix 'everything is a file' design decisions
POSIX, why it exists, why it shouldn't
In the end just get them to go through Tanenbaum's Modern Operating Systems and also do case studies on some other kernels like Mach/Hurd's microkernel setup and maybe some distributed and exokernel stuff.
Give a broad view past Linux too, I reckon
For those who are super geeky, the history of operating systems and why they are the way they are.
The Virtual File System layer is an absolute must for any Linux Operating System class.
I took a similar class in college. The most frustrating but, at the same time, helpful project was writing a small file system for the Linux operating system. Getting this to work takes ~2-3 weeks for a group of 4 people and really teaches you the ins and outs of the Kernel.
I recently took an operating systems class, and I found the projects to be challenging, but essential in understanding the concepts in class. The projects were also fun, in that they involved us actually working with the Linux source code (version 2.6.12, or thereabouts).
Here's a list of some pretty good projects/concepts that I think should be covered in any operating systems class:
The difference between user space and kernel space
Process management (i.e. fork(), exec(), etc.)
Write a small shell that demonstrates knowledge of fork() and exec()
How system calls work, i.e. how do we switch from user to kernel mode
Add a simple system call to the Linux kernel, write a test application that calls the system call to demonstrate it works.
Synchronization in and out of the kernel
Implement synchronization primitives in user space
Understand how synchronization primitives work in kernel space
Understand how synchronization primitives differ between single-CPU architectures and SMP
Add a simple system call to the Linux kernel that demonstrates knowledge of how to use synchronization primitives in the Linux kernel (i.e. something that has to acquire, say, the tasklist lock, etc. but also make it something where you have to kmalloc, which can't be done while holding a lock (unless you GFP_ATOMIC, but you shouldn't, really))
Scheduling algorithms, and how scheduling takes place in the Linux kernel
Modify the Linux task scheduler by adding your own scheduling policy
What is paging? How does it work? Why do we have paging? How does it work in the Linux kernel?
Add a system call to the Linux kernel which, given an address, will tell you if that address is present or if it's been swapped out (or some other assignment involving paging).
File systems - what are they? Why do they exist? How do they work in the Linux kernel?
Disk scheduling algorithms - why do they exist? What are they?
Add a VFS to the Linux kernel
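For the "small shell" project in the list above, the core is one fork/exec/wait cycle. Here is a minimal sketch using the POSIX calls `fork()`, `execvp()` and `waitpid()`; the function name `run_command` is hypothetical, and parsing a command line into `argv` is left out.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Run the program named by argv[0] with the given NULL-terminated
 * argument vector. Returns its exit status, or -1 on failure. */
int run_command(char *const argv[])
{
    pid_t pid = fork();
    if (pid < 0) {
        return -1;               /* fork failed */
    }
    if (pid == 0) {
        /* Child: replace this process image with the command. */
        execvp(argv[0], argv);
        _exit(127);              /* only reached if exec failed */
    }
    /* Parent: wait for the child and collect its status. */
    int status;
    if (waitpid(pid, &status, 0) < 0) {
        return -1;
    }
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

A shell is essentially a loop that reads a line, splits it into `argv`, and calls something like this; redirection and pipes are added between the fork and the exec.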
Well, I just finished my OS course this semester so I thought I'd chime in.
I was kind of upset that we didn't actually play around with the actual OS itself; rather, we just did system programming. I'd recommend having the labs be on something that is in the OS itself, which sounds like what you want to do.
One lab that I did enjoy and found useful however was writing our own malloc/free routines. It was difficult, but pretty entertaining as well.
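To give an idea of what such a lab involves, here is a minimal first-fit allocator sketch over a fixed static arena. It is one possible design, not the one used in that course: the names `toy_malloc`, `toy_free` and `ARENA_SIZE` are hypothetical, it is not thread-safe, and it needs a C11 compiler for `_Alignas` and `max_align_t`.

```c
#include <stddef.h>

#define ARENA_SIZE 4096

/* Header placed before every payload. The alignment request rounds
 * sizeof(block_t) up so payloads stay suitably aligned. */
typedef struct block {
    _Alignas(max_align_t) size_t size;  /* payload bytes after header */
    int free;                           /* 1 if available */
    struct block *next;                 /* next block in address order */
} block_t;

static _Alignas(max_align_t) unsigned char arena[ARENA_SIZE];
static block_t *head;

void *toy_malloc(size_t n)
{
    const size_t a = _Alignof(max_align_t);
    if (head == NULL) {                 /* first call: one big free block */
        head = (block_t *)arena;
        head->size = ARENA_SIZE - sizeof(block_t);
        head->free = 1;
        head->next = NULL;
    }
    if (n == 0 || n > ARENA_SIZE) {
        return NULL;
    }
    n = (n + a - 1) / a * a;            /* round up to alignment */
    for (block_t *b = head; b != NULL; b = b->next) {
        if (!b->free || b->size < n) {
            continue;
        }
        /* Split if the remainder can hold a header plus some payload. */
        if (b->size >= n + sizeof(block_t) + a) {
            block_t *rest = (block_t *)((unsigned char *)(b + 1) + n);
            rest->size = b->size - n - sizeof(block_t);
            rest->free = 1;
            rest->next = b->next;
            b->size = n;
            b->next = rest;
        }
        b->free = 0;
        return b + 1;                   /* payload starts after header */
    }
    return NULL;                        /* no block large enough */
}

void toy_free(void *p)
{
    if (p == NULL) {
        return;
    }
    ((block_t *)p - 1)->free = 1;
    /* Merge runs of adjacent free blocks to limit fragmentation. */
    for (block_t *c = head; c != NULL; c = c->next) {
        while (c->free && c->next != NULL && c->next->free) {
            c->size += sizeof(block_t) + c->next->size;
            c->next = c->next->next;
        }
    }
}
```

Students can extend this with best-fit search, a doubly linked list for constant-time coalescing, or growing the arena from the OS.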
Maybe also cover loading programs into memory and/or setting up the memory manager (such as paging).
For labs, one thing that may be cool is to show them actual code and discuss it, asking why they think things are done that way and not another, etc.
If I were at university again I would certainly appreciate more in-depth lessons about synchronization primitives, concurrency and so on... those are hard matters that are more difficult to approach without proper guidance. I remember I went to a talk by Paul "Rusty" Russell about spinlocks and other synchronization primitives that was absolutely rad; maybe you could find it on YouTube and borrow some ideas.
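As a starting point for a lesson on spinlocks, here is a minimal user-space sketch built on C11 atomics. The `toy_` names are hypothetical, chosen so they are not confused with the Linux kernel's own spinlock API, which additionally deals with preemption and interrupts.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag locked;   /* set while some thread holds the lock */
} toy_spinlock_t;

#define TOY_SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Try to take the lock once; returns true if we got it. */
bool toy_spin_trylock(toy_spinlock_t *l)
{
    /* test_and_set returns the previous value: false means it was free. */
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is ours. */
void toy_spin_lock(toy_spinlock_t *l)
{
    while (!toy_spin_trylock(l)) {
        /* spin */
    }
}

/* Release the lock, publishing writes made while holding it. */
void toy_spin_unlock(toy_spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

The acquire/release ordering is what makes data protected by the lock visible to the next holder; discussing why a plain `int` flag would not be enough makes a good exercise.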
Another good topic (or possibly exercise for the students) would be looking at virtualisation, especially Rusty Russell's "lguest", which is designed as a simple introduction to what is required to virtualise an operating system. The docs are good reading too.
I actually just took a class that perfectly fits your description (OS Design using linux) in the spring. I was actually very frustrated with it because I felt like the teacher focused too narrowly for the projects rather than give a broader understanding. For instance, our last project revolved around futexes. My partner and I barely learned what they were, got it working (kinda) and then turned it in. I came away with no general knowledge of anything really from that project. I wish one of the projects had been to write a simple device driver or something like that.
In other words, I think it's good to make sure a good broad overview is presented, with as much detail as you can afford, but ultimately broad. I felt like my teacher nitpicked these tiny areas and made us intensely focus on those, while in the end I did NOT come away with that great of a general understanding of the inner-workings of Linux.
Another thing I'd like to note is a lot of why I didn't retain knowledge from the class was lack of organization. Topics came out of nowhere any given week, and there was no roadmap. Give the material a logical flow. Mental organization is the key to retaining the knowledge.
The networking sub-system is also quite interesting. You could follow a packet as it goes from the socket system call to the wire and the other way around.
Fun assignments could be:
create a stateful firewall by using netfilter
create an HTTP load balancer
design and implement a simple tunneling protocol
Memory-mapped I/O and the 1g/3g vs 2g/2g split between kernel address space and user-addressable space in 32-bit operating systems.
Limitations of 32-bit architecture on hard drive size and what this means for the design of file systems.
Actually, just all the pros and cons of going to 64-bit: what it means and why, as well as the history and why we aren't there yet.

Resources