Where to start with Linux Kernel Modules? - c

A little background, I'm a CMPE Student currently in an Operating Systems class. I have some basic knowledge of C coding but am more comfortable with C++ (taken about 3 semesters of that). Other than that, never had any other formal training in coding. Also, I've got a basic understanding of the linux environment.
I am working on a project that requires me and my team to code a linux kernel module that can do the following:
echoes data passed from user-level processes by printing the data received to the kernel log
is able to pass data from one user process to another.
must be possible to use the kernel module as an inter-process communication abstraction. module should provide for situations where a sender posts data to it but no receiver is waiting.module must cover the situation where a receiver asks for data but there is no data available.
module must cover the situation where a receiver asks for data but there is no data available.
must be a limit in the buffer capacity in your module.
Now I don't know how difficult this seems to those with a background in programming, but this seems like an impossibly complicated task for someone in my position.
Here's what I've done so far:
Coded, Compiled, Inserted, and Removed the basic "hello world" linux kernel module successfully
Read through about the first 4 or 5 chapters of The Linux Kernel Module Programming Guide
Read through a few stackoverflow posts, none of which seem to be able to direct me to where I need to go.
So finally here's my question: Can someone please point me in the direction that I need to go with this? I don't even know where to being to find commands to use for reading in user-level process data and I need somewhere to start me off. TLPD was great for insight on the topic but isn't helping me get to the point where I will have a workable project to turn in. In the past, I would learn off of reading source code and reverse engineering, is there anywhere I can find something like that? Any and all help is appreciated.
-Will

I've found that the Linux Kernel Module Programming Guide is a pretty good resource. From the sounds of it, something like a character device might work best for your purposes, but I'm not sure if you have other constraints.
Another direction I might consider (though this could be a bad path) is to look at examples in the Linux kernel for a kernel module that has similar functionality. I don't have a good example offhand, but perhaps look through /drivers/char/.

What you describe is pretty much the same as a pipe.
Read chapter three of Linux Device Drivers.
(But don't just copy the scull pipe example …)

Related

Coding C libraries for an Operating System

I am trying to create a DOS-like OS. I have read multiple articles, read multiple books (I even paid a lifetime subscription for O'Reilly Media), but to no avail, I found nothing useful. I want to learn how to make operating system libraries, which rises the question which is: are libraries which you code for a program the same if you are compiling it for an operating system?
I know Operating Systems are very challenging to make and the very few programmers that do attempt to make one never produce a functioning one which is why it's described as "the great pinnacle of programming.". But still, I'm going to make an attempt at making one (just for fun, and maybe learn a few pointers on the way).
All I need to do this is basically learning how to make the libraries, C (which I already know and love), assembly (which I kind-of know how to use along with C) and a compiler (I use the GNU toolchain). The only thing I am having trouble with are coding the libraries. I'm like wow right now, who knew that coding libraries are so hard, like point me to a book or something! But no. I'm not asking for a book right here, all I'm asking for is some advice on how to do this like:
How do you start making some basic I/O libraries
Is it the same as making a regular C library
And finally, is it going to be hard? (JK I know already that this is going to be extremely hard which is why I prepared so much)
In summary, the main question is, how I can make this work or is there a pre-built library that would most likely speed up the process?
Are libraries which you code for a program the same if you are compiling it for an operating system?
Absolutely not. A user-space C library at its lowest level makes system calls to an operating system to interact with hardware via device drivers; it is the device driver and interaction with hardware you will be writing.
From my experience doing embedded system bringups, the way you start is with a development board with a legacy RS-232 port. It's about the easiest possible device to write a driver for - you write bytes to a memory mapped IO address, wait a bit then write some more. This is where your first debug output goes too.
You might find yourself waggling IO pins and probing them with a logic analyser or DSO on the route to this though - hence why you want a development board where the signals are accessible.
None of the standard C-library will be available to you - so you'll need to equivalents of some of things it provides - but in kernel space - including type definitions, memory management, and intrinsics the compiler expects - particularly those for memory barriers. The C-library doesn't provide any data structures or algorithms anyway, but you'll definitely be wanting to write some early on.

Custom Kernel: Implement filesystem

As a out of course project, I am currently developing a kernel in an attempt to better understand all the aspects of an actual OS. So far, I am done setting up a flat physical memory model with support for paging and the basic interrupts (keyboard and perhaps trackpad/mouse next). I thought the step forward would be to implement a filesystem and I am keen about the ext2. I have looked around, even on SO but there isn't anything explicit that answers my questions:
Is it possible to write a driver to access an ext2 filesystem in C or do I need to go lower?
If I plan to access the filesystem off a USB device, I am assuming I will need to get the device driver for USB running first. Any help on this would be greatly appreciated.
I know the code for detecting a filesystem is already available on the MINIX and other kernels but what I really want to know is if I want to build a custom albeit simple filesystem, how do I go about it? I am considering this possibility too.
My apologies if the question and details sound a little ignorant but I am still in the learning process.
Thanks :)
I'll try to give you a few tips/hints - a clear answer isn't that simple:
An ext2 filesystem written in C is just C. C is just a programming language - you can use C++, plain assembly or a few others (A few os'dever use D) - but not a "managed" language etc. But it is important that you have a rock solid understanding of this language. In my opinion assembly is a MUST (Take a look at the scheduler in an operating system -> plain assembly)
Do you really want to write an USB driver ? It isn't "just" a simple USB driver (Layer of abstractions). Why a USB driver and not a floppy disk or CD driver (Believe me - a floppy driver in 32 bit protected mode isn't that hard) ?
Please focus on your project. Of course Linux (Early versions) and Minix have example code, but take care of the design structure (Monolithic/Microkernel or hybrid-kernel) - and don't mix it, write your own code.
Please make on step after the other. You wrote a basic IRQ handling and the plan is to write a keyboard/mouse driver - write the keyboard driver ! Don't dream about loading and executing files (Rom wasn't built in one day).
You have to read documentations, for example the Intel manuals or other "books". A very popular forum is osdev.org - take a look at the wiki. As twalberg said, it's a very huge module - stay focused on the main parts of your operating system.
I know, this is not the answer to your question - but it's important not to go in the wrong direction and dream of a fancy UI or something like this ;)
osdev.org forum
osdev.org wiki
Intel manuals
And a few other books in my book shelf can you find here (Tanenbaum, Silberschatz with Peter Galvin - great books!):
Books

Getting started with http media streaming in C

Alrighty so, I've been wanting to learn C for awhile, and now I have a project idea that's actually relevant to a website I want to build, but I have a few initial questions on how to get started. This isn't really a "how to program" question or anything, I can get started with C programming fine, I know how to read and communicate with various APIs and protocols as long as I have documentation, etc. I'm just looking for a starting point, I guess.
The program would be something similar to ice or shoutcast, so basically audio streaming. Does anyone think they could give a brief, high-level overview of what would be required? As I said the end product would be a url you pop in a .pls file and you can stream it to w/e client you want. What protocols, libraries, and documentation should I be looking at?
If you want this to be a toy for learning, you might want to do all the work yourself; it's a complicated problem, and getting it right is definitely going to be educational. A copy of Advanced Programming the Unix Environment, 2nd edition or TCP/IP Illustrated, Vol 1 would be helpful but not strictly necessary.
If you want this to be useful too, I'd suggest starting with libev or libevent. libevent has some built-in HTTP handling, which might be nice, but there are reports that libevents HTTP handling isn't perfect. libev doesn't provide built-in HTTP handling, but it should be easier to write with libev than performing all the work by hand. Using these pre-written event-based libraries will improve the stability and reliability of your program compared to writing the entire thing by hand, though they aren't doing anything that you can't do yourself.

Content for Linux Operating Systems Class

I will be TA for an operating systems class this upcoming semester. The labs will deal specifically with the Linux Kernel.
What concepts/components of the Linux kernel do you think are the most important to cover in the class?
What do you wish was covered in your studies that was left out?
Any suggestions regarding the Linux kernel or overall operating systems design would be much appreciated.
My list:
What an operating system's concerns are: Abstraction and extension of the physical machine and resource management.
How the build process works ie, how architecture specific/machine code stuff is implanted
How system calls work and how modules can link up
Memory management / Virtual Memory / Paging and all the rest
How processes are born, live and die in POSIX and other systems
userspace vs kernel threads and what the difference is between process/threads
Why the monolithic Kernel design is growing tiresome and what are the alternatives
Scheduling (and some of the alternative / domain specific schedulers)
I/O, Driver development and how they are dynamically loaded
The early stages of booting and what the kernel does to setup the environment
Problems with clocks, mmu-less systems etc
... I could go on ...
I almost forgot IPC and Unix 'eveything is a file' design decisions
POSIX, why it exists, why it shouldn't
In the end just get them to go through tanenbaum's modern operating systems and also do case studies on some other kernels like Mach/Hurd's microkernel setup and maybe some distributed and exokernel stuff.
Give a broad view past Linux too, I recon
For those who are super geeky, the history of operating systems and why they are the way they are.
The Virtual File System layer is an absolute must for any Linux Operating System class.
I took a similar class in college. The most frustrating but, at the same time, helpful project was writing a small file system for the Linux operating system. Getting this to work takes ~2-3 weeks for a group of 4 people and really teaches you the ins and outs of the Kernel.
I recently took an operating systems class, and I found the projects to be challenging, but essential in understanding the concepts in class. The projects were also fun, in that they involved us actually working with the Linux source code (version 2.6.12, or thereabouts).
Here's a list of some pretty good projects/concepts that I think should be covered in any operating systems class:
The difference between user space and kernel space
Process management (i.e. fork(), exec(), etc.)
Write a small shell that demonstrates knowledge of fork() and exec()
How system calls work, i.e. how do we switch from user to kernel mode
Add a simple system call to the Linux kernel, write a test application that calls the system call to demonstrate it works.
Synchronization in and out of the kernel
Implement synchronization primitives in user space
Understand how synchronization primitives work in kernel space
Understand how synchronization primitives differ between single-CPU architectures and SMP
Add a simple system call to the Linux kernel that demonstrates knowledge of how to use synchronization primitives in the Linux kernel (i.e. something that has to acquire, say, the tasklist lock, etc. but also make it something where you have to kmalloc, which can't be done while holding a lock (unless you GFP_ATOMIC, but you shouldn't, really))
Scheduling algorithms, and how scheduling takes place in the Linux kernel
Modify the Linux task scheduler by adding your own scheduling policy
What is paging? How does it work? Why do we have paging? How does it work in the Linux kernel?
Add a system call to the Linux kernel which, given an address, will tell you if that address is present or if it's been swapped out (or some other assignment involving paging).
File systems - what are they? Why do they exist? How do they work in the Linux kernel?
Disk scheduling algorithms - why do they exist? What are they?
Add a VFS to the Linux kernel
Well, I just finished my OS course this semester so I thought I'd chime in.
I was kind of upset that we didn't actually play around with the actual OS itself, rather we just did system programming. I'd recommend having the labs be on something that is in the OS itself, which is what it sounds like what you want to do.
One lab that I did enjoy and found useful however was writing our own malloc/free routines. It was difficult, but pretty entertaining as well.
Maybe also cover loading programs into memory and/or setting up the memory manager (such as paging).
For labs, one thing that may be cool is to show them actual code and discuss about it, ask questions about what do they think things are done that way and not another, etc.
If I were again in the University I would certainly appreciate more in depth lessons about synchronization primitives, concurrency and so on... those are hard matters that are more difficult to approach without proper guidance. I remember I went to a speech by Paul "Rusty" Russell about spinlocks and other synchronization primitives that was absolutely rad, maybe you could find it in youtube and borrow some ideas.
Another good topic (or possibly exercise for the students) would be looking at virtualisation. Especially Rusty Russel's "lguest" which is designed as a simple introduction to what is required to virtualise an operating system. The docs are good reading too.
I actually just took a class that perfectly fits your description (OS Design using linux) in the spring. I was actually very frustrated with it because I felt like the teacher focused too narrowly for the projects rather than give a broader understanding. For instance, our last project revolved around futexes. My partner and I barely learned what they were, got it working (kinda) and then turned it in. I came away with no general knowledge of anything really from that project. I wish one of the projects had been to write a simple device driver or something like that.
In other words, I think it's good to make sure a good broad overview is presented, with as much detail as you can afford, but ultimately broad. I felt like my teacher nitpicked these tiny areas and made us intensely focus on those, while in the end I did NOT come away with that great of a general understanding of the inner-workings of Linux.
Another thing I'd like to note is a lot of why I didn't retain knowledge from the class was lack of organization. Topics came out of nowhere any given week, and there was no roadmap. Give the material a logical flow. Mental organization is the key to retaining the knowledge.
The networking sub-system is also quite interesting. You could follow a packet as it goes from the socket system call to the wire and the other way around.
Fun assignments could be:
create a state-full firewall by using netfilter
create an HTTP load balancer
design and implement a simple tunneling protocol
Memory mapped I/O and the 1g/3g vs 2g/2g split between kernel address space and user addressable space in 32bit operating systems.
Limitations of 32 bit architecture on hard drive size and what this means for the design of file systems.
Actually just all the pros and cons of going to 64 bit, what it means and why as well as the history and why are aren't there yet.

Linux kernel code that uses procfs: what should I be aware of?

I have a very nice idea for a kernel patch, and I want to conduct some research and see code examples before I shape my idea.
I'm looking for interesting code examples that would demonstrate advanced usage of procfs (the Linux /proc file system). By interesting, I mean more than just reading a documented value.
My idea is to provide every process with an easy broadcast mechanism. For example, let's consider a process that runs multiple instances of rsync and wants to check the transfer status (how many bytes have been transfered so far) for each child. Currently, I don't know of any way that can be done.
I intend to provide the process with a minimal interface to write data to the procfs. That data would be placed under the PID directory. For example:
/procfs/1343/data_transfered/incoming
I can think of numerous advantage for this, mainly in the concurrency field.
By the way, if such a mechanism already exists, do tell...
Yes, I've written stuff that pokes around in /proc. I suspect you are unlikely to get linux kernel patches accepted that do anything with proc, unless they are just fixing something that is already there that was broken in some way.*
/sysfs seems to be where things are moving.
/proc was originally for process information, but a lot of misc. driver stuff ended up in there.
*well, maybe they'll take it if whatever you're doing has to do with processes, and isn't in a driver.
Go look at the source code for the procps package for code that uses /proc
http://github.com/tialaramex/leakdice/tree/master
Uses proc to figure out the memory address layout of a process, and dump random pages from its heap (for reasons which are explained in its documentation).

Resources