input and output without a library in C - c

I'm writing a small kernel for my programs in C.
This is not (at the moment) an OS kernel, it's merely a way for me to keep track of input and output in programs without relying on external source (i.e. stdio.h). You might ask me why I'd ever want to do this; it's just so I know how this works, and so that I have more, and more (end goal is total) control of program flow.
I was wondering if anyone knows some tutorials on input and output in C (with inline asm?) without relying on any other code.

There is a lot of room between the bare metal and stdio. You have said you aren't writing an OS kernel, but not whether or not you are running under an OS.
Running directly on hardware without an OS, you will still want to encapsulate all of your I/O operations in a module, even if you don't formally define a device driver interface and framework for all of your I/O modules to follow. This is hugely architecture dependent, and makes you responsible for knowing all of the details of interaction with every I/O device you might ever use. For some devices, this can quickly become a huge development effort. That isn't a problem for embedded systems, but running on commercial hardware this way is neither easy nor recommended.
Running within an OS, you probably don't get (and shouldn't want to get) access to the actual hardware registers and interrupts. If you are developing a custom I/O device, the best practice is to make it conform to existing standards so that you need as little low level custom software for it as possible. This is why you see a lot of custom user interface gadgets connecting via USB and identifying themselves as HIDs (Human Interface Devices). As a HID, the existing USB drivers take care of the physical layer, and the OS-supplied HID driver takes care of the logical interface, providing a very simple high level access API to the application.
One of the operating system's key roles is to provide a consistent I/O API across all devices. Generally, that takes the form of open(), close(), read(), write(), and ioctl() functions (the names vary, but some form of at least the first four will always exist). The OS layer is quite raw, however. Typically, an OS call is forwarded without much processing to a device driver, which then forwards the data on to the device. Usually, the OS low level calls block the caller until they complete, and often they have restrictions on the sizes of the buffers that make sense. For instance, raw access to a disk device is usually required to be for an integral number of disk blocks at a time.
And don't forget about things like file systems and network protocols... all of which are made much more reliable and compatible by encapsulation within an operating system.
Even if it is acceptable to call read() and write() for single characters, that is usually not the best performance possible. Operating system calls are relatively expensive, and if you can read multiple characters in a single call, your performance can go way up.
That is the origin of the stdio library for C, and various other buffering libraries in other environments. The stdio library provides a buffering layer that isolates the C code from the block size of the underlying hardware. Even on an entirely home-grown operating system where you have full control over all the devices, something like C stdio will still be valuable.
Writing your own stdio replacement is a highly valuable exercise, even if you don't use it in production code, and is one I would recommend to anyone wanting to learn about what really goes on between printf() and scanf() and the terminal or files.
One valuable resource is the book The Standard C Library by P.J. Plauger. In it, the author presents an implementation of the complete C runtime library specified in the ANSI standard. His discussion of the specific implementation choices he made is valuable and apropos to the context of this question, and the discussions of why some of the standard library features were specified is interesting as well.

This sort of thing is very architecture specific. To put it simply, your I/O devices will raise hardware interrupts to the CPU. The CPU will call the code associated with the interrupt which will deal with it appropriately; for an input device it will fetch the data that is available from the device, for an output device the interrupt usually means that the device is ready to send the next piece.
The old 8088/8086 CPU architecture is a nice simple place to start to get your head around this. Typically, the BIOS would be where the hardware interrupts would have been handled, but it was always possible to write your own. ;)

Related

Using sock_create, accept, bind etc in kernel

I'm trying to implement an echo TCP server as a loadable kernel module.
Should I use sock_create, or sock_create_kern?
Should I use accept, or kernel_accept?
I mean it does make sense that I should use kernel_accept for example; but I don't know why. Can't I use normal sockets in the kernel?
The problem is, you are trying to shoehorn an user space application into the kernel.
Sockets (and files and so on) are things the kernel provides to userspace applications via the kernel-userspace API/ABI. Some, but not all, also have an in-kernel callable, for cases when another kernel thingy wishes to use something provided to userspace.
Let's look at the Linux kernel implementation of the socket() or accept() syscalls, in net/socket.c in the kernel sources; look for SYSCALL_DEFINE3(socket, and SYSCALL_DEFINE3(accept,, SYSCALL_DEFINE4(recv,, and so on.
(I recommend you use e.g. Elixir Cross Referencer to find specific identifiers in the Linux kernel sources, then look up the actual code in one of the official kernel Git trees online; that's what I do, anyway.)
Note how pointer arguments have a __user qualifier: this means the data pointed to must reside in user space, and that the functions will eventually use copy_from_user()/copy_to_user() to retrieve or set the data. Furthermore, the operations access the file descriptor table, which is part of the process context: something that normally only exist for userspace processes.
Essentially, this means your kernel module must create an userspace "process" (enough of one to satisfy the requirements of crossing the userspace-kernel boundary when using kernel interfaces) to "hold" the memory and file descriptors, at minimum. It is a lot of work, and in the end, it won't be any more performant than an userspace application would be. (Linux kernel developers have worked on this for literally decades. There are some proprietary operating systems where doing stuff in "kernel space" may be faster, but that is not so in Linux. The cost to do things in userspace is some context switches, and possibly some memory copies (for the transferred data).)
In particular, the TCP/IP and UDP/IP interfaces (see e.g. net/ipv4/udp.c for UDP/IPv4) do not seem to have any interface for kernel-side buffers (other than directly accessing the rx/tx socket buffers, which are in kernel memory).
You have probably heard of TUX web server, a subsystem patch to the Linux kernel by Ingo Molnár. Even that is not a "kernel module server", but more like a subsystem that an userspace process can use to implement a server that runs mostly in kernel space.
The idea of a kernel module that provides a TCP/IP and/or UDP/IP server, is simply like trying to use a hammer to drive in screws. It will work, after a fashion, but the results won't be pretty.
However, for the particular case of an echo server, it just might be possible to bolt it on top of IPv4 (see net/ipv4/) and/or IPv6 (see net/ipv6/) similar to ICMP packets (net/ipv4/icmp.c, net/ipv6/icmp.c). I would consider this route if and only if you intend to specialize in kernel-side networking stuff, as otherwise everything you'd learn doing this is very specialized and not that useful in practice.
If you need to implement something kernel-side for an exercise or something, I'd recommend steering away from "application"-type ideas (services or similar).
Instead, I would warmly recommend developing a character device driver, possibly implementing some kind of inter-process communications layer, preferably bus-style (i.e., one sender, any number of recipients). Something like that has a number of actual real-world use cases (both hardware drivers, as well as stranger things like kdbus-type stuff), so anything you'd learn doing that would be real-world applicable.
(In fact, an echo character device -- which simply outputs whatever is written to it -- is an excellent first target. Although LDD3 is for Linux kernel 2.6.10, it should be an excellent read for anyone diving into Linux kernel development. If you use a more recent kernel, just remember that the example code might not compile as-is, and you might have to do some research wrt. Linux kernel Git repos and/or a kernel source cross referencer like Elixir above.)
In short sockets are just a mechanism that enable two processes to talk, localy or remotely.
If you want to send some data from kernel to userspace you have to use kernel sockets sock_create_kern() with it's family of functions.
What would be the benefit of TCP echo server as kernel module?
It makes sense only if your TCP server provides data which is otherwise not accessible from userspace, e.g. read some post-mortem NVRAM which you can't read normally and to send it to rsyslog via socket.

cannot find pthread.c in linux folder

I have downloaded kernel and the kernel resides in folder called Linux-2.6.32.28 in which I can find /Kernel/Kthread.c
I found Kthread.c but I cannot find pthread.c in Linux-2.6.32.28
I found Kthread.c in Linux-3.13/Kernel and Linux-4.7.2/Kernel
locate pthread.c finds file pthread.c in Computer/usr folder that comes when I install Ubuntu but pthread.c is not available in downloaded folders Linux-2.6.32.28, Linux-3.13, Linux-4.7.2
MORE: There are two sets of function calls. 1. System Calls 2. Library Calls.
For a computer to do any task it has to use hardware resources. So, how library calls differ from system calls?
System calls always use kernel which means hardware.
Library calls means no usage of kernel or hardware?
I know that library calls sometimes may resolve to system call.
What I want to know is that, if every set of function calls uses hardware then to what degree system calls will use hardware resources when compared with library calls and vice-versa.
Whether a function call is System or Library, at least hardware resource like RAM has to be utilized. Right?
Read first pthreads(7). It explains you that pthreads are implemented in the C standard library as nptl(7).
The C standard library is the cornerstone of Linux systems, and you might have several variants of it; however, most Linux distributions have only one libc, often the GNU glibc, which contains the NPTL. You might use another C standard library (such as musl-libc or dietlibc). With care, you can have several C standard libraries co-existing on your system.
The C standard library (and every user-space program) is using system calls to interact with the kernel. They are listed in syscalls(2). BTW most C standard library implementations on Linux are free software, so you can study (or even improve) their source code if you want to. You often use a system call thru the small wrapper C function (e.g. write(2)) for it in your C standard library, and even more often thru some higher-level function (e.g. fprintf(3)) provided by your C standard library.
Pthreads are implemented (in the NPTL layer of glibc) using low-level stuff like clone(2) and futex(7) and a bit of assembler code. But you generally won't use these directly unless you are implementing a thread library (like NPTL).
Most programs are using the libc and linking with it (and also with crt0) as a shared library, which is /lib/x86_64-linux-gnu/libc.so.6 on my Debian/Sid/x86-64. However, you might (but you usually don't) invoke system calls directly thru some assembler code (e.g. using a SYSCALL or SYSENTER machine instruction). See also this.
The question was edited to also ask
What I want to know is that, if every set of function calls uses hardware then to what degree system calls will use hardware resources
Please read a lot more about operating systems. So read carefully Operating Systems: Three Easy pieces (a freely downloadable textbook) and read about Instruction Set Architecture and Computer Architecture. Study several of them, e.g. x86-64 and RISC-V (or ARM, PowerPC, etc...). Read about CPU modes and virtual memory.
You'll find out that the OS manages physical resources (including RAM and cores of processors). Each process has its own virtual address space. So from a user-space point of view, a process don't use directly hardware (e.g. it runs in virtual address space, not in RAM), it runs in some virtual machine (provided by the OS kernel) defined by the system calls and the ISA (the unpriviledged machine instructions).
Whether a function call is System or Library, at least hardware resource like RAM has to be utilized. Right?
Wrong, from a user-space point of view. All hardware resources are (by definition) managed by the operating system (which provides abstractions thru system calls). An ordinary application executable program uses the abstractions and software resources (files, processes, file descriptors, sockets, memory mappings, virtual address space, etc etc...) provided by the OS.
(so several books are required to really answer your questions; I gave some references, please follow them and read a lot more; we can't explain everything here)
Regarding your second cluster of questions: Everything a computer does is ultimately done by hardware. But we can make some distinctions between different kinds of hardware within the computer, and different kinds of programs interacting with them in different ways.
A modern computer can be divided into three major components: central processing unit (CPU), main memory (random access memory, RAM), and "peripherals" such as disks, network transceivers, displays, graphics cards, keyboards, mice, etc. Things that the CPU and RAM can do by themselves without involving peripherals in any way are called computation. Any operation involving at least one peripheral is called input/output (I/O). All programs do some of both, but some programs mostly do computation, and other programs mostly do I/O.
A modern operating system is also divided into two major components: the kernel, which is a self-contained program that is given special abilities that no other program has (such as the ability to communicate with peripherals and the ability to control the allocation of RAM), and is responsible for supervising execution of all the other programs; and the user space, which is an unbounded set of programs that have no such special abilities.
User space programs can do computation by themselves, but to do I/O they must, essentially, ask the kernel to do it for them. A system call is a request from a user-space program to the kernel. Many system calls are requests to perform I/O, but the kernel can also be requested to do other things, such as provide general information, set up communication channels among programs, allocate RAM, etc. A library is a component of a user space program. It is, essentially, a collection of "functions" that someone else has written for you, that you can use as if you wrote them yourself. There are many libraries, and most of them don't do anything out of the ordinary. For instance, zlib is a library that (provides functions that) compress and uncompress data.
However, "the C library" is special because (on all modern operating systems, except Windows) it is responsible for interacting directly with the kernel. Nearly all programs, and nearly all libraries, will not make system calls themselves; they will instead ask the C library to do it for them. Because of this, it can often be confusing trying to tell whether a particular C library function does all of its work itself or whether it "wraps" one or more system calls. The pthreads functions in particular tend to be a complicated tangle of both. If you want to learn how operating systems are put together at their lower levels, I recommend you start with something simpler, such as how the stdio.h "buffered I/O" routines (fopen, fclose, fread, fwrite, puts, etc) ultimately call the unistd.h/fcntl.h "raw I/O" routines (open, close, read, write, etc) and how the latter set of functions are just wrappers around system calls.
(Assigning the task of wrapping system calls to the runtime library for the C programming language is a historical accident and we probably wouldn't do it that way again if we were going to start over from scratch.)
pthread is POSIX thread. It is a library that helps application to create threads in OS. Kthread in kernel source code is for kernel threads in linux.
POSIX is a standard defined by IEEE to maintain the compatibility between different OS.
So pthread source code cannot be seen in linux kernel source code.
you may refer this link for pthread source code http://www.gnu.org/software/hurd/libpthread.html

Why do we need a RTOS on ARM Cortex-M

If we can already execute C programs on cortex-m like micro-controllers, Why do we even need to install RTOS (or other operating systems).?
What benefits it can provide if micro-controller is intended to be multi-purpose.?
No you dont need an RTOS only if you need/want the features of the (particular) RTOS. You can program the microcontroller the way you/we always have without one if you prefer.
Typical things an RTOS might bring,
Memory management (who owns memory)
Interrupt handling support
Scheduling (pre-emptive or co-operative)
Usually several drivers in a BSP for your hardware/SOC
Debug tools
Some sort of shell
File systems
IPC (inter-process communitation)
A tool suite
A build environment
Memory protection
Networking
Your application may or may not need these features depending on your end goal. Some of them may be detrimental to your organizations work flow (like the tool suite and build environment). As a product matures, you may end up needing features you didn't account for.
However, a completely custom solution will probably have a smaller foot print. The race conditions involved in interrupt handling can be quite difficult to get right. Probably most RTOS will give a better implementation than something custom that evolves over time. If you are very dedicated, a state machine with polling of devices can be more optimal (hard real time) but again it is difficult to get right.
If the RTOS is BSD (or other permissive) licensed , it maybe possible to reuse the driver code to your own custom infra-structure. At some point your code may become an 'RTOS' of sorts. There are many to choose from.
POSIX compliance is a common standard. If you confine your code to POSIX, you are portable to many different RTOS/OS. However, most often an API that is more rich than POSIX; it is one way they differentiate each other. You may be able to use more 3rd party libraries if the RTOS is POSIX compliant.
An operating system provides a level of abstraction between the code written by an application programmer and the actual hardware the program runs on.
So you don't have to worry, as an application programmer, about the details of the hardware, as they are handled by drivers.
And thus you can compile the same program for many different hardware platforms, if they run the same (or a compatible) operating system.

Getting access to Raspberry PI registers in C programming

I am trying to get access to the register of my Raspberry Pi.
To be a bit more specific, http://www.raspberrypi.org/wp-content/uploads/2012/02/BCM2835-ARM-Peripherals.pdf has some Hardware Timers on page 172-173.
I want to use them because I have to write two functions HW_GetTimer() and HW_ClearTimer().
I can't find a good way to communicate with those registers. Is this possible? Is there an existing C function that I don't know about?
First of all, a word of warning: These registers are likely used by the Operating System, so if you fiddle with them, chances are that you break something...
That said, there are two options:
The proper one: write a kernel driver and you'll have plenty of functions to access the hardware in a sane and controlled way. Or chances are that there is already a driver that does exactly what you are trying to do, if that's the case, you just find it and use the interface it exposes. Reading the kernel source is fun!
The easy one: from user-mode land, open /dev/mem and mmap() the addresses you want to access into your process memory. Then you can read/write (use volatile pointers, please!) as you will. Note that you cannot read()/write() from/to /dev/mem, only mmap().
Obviously, for the user-mode thing you have to have the proper permissions or be root.
Guess: You are using linux.
If you are trying to do this in conjunction with Linux, there usually is a driver for (yes even for timers!) which are used internally for scheduling, tasklets and other stuff - in userspace you should use poll or epoll without any filedescriptors and just use the timeout. This will get you as close as it can get to schedulers granularity.
Another way would be to check the kernel code if the timer is used, if not you could simply export it via a kernel module, though that requires at least a basic understanding of the CPU, how the kernel works and how it is implemented without security implications or risk of crash (if not both).
I omit the bare metal way here...

Dependency of Run time library on operating system

I was going through this tutorial about how to write a minimalist kernel. I read this in between :
The Run-Time Library
A major part of writing code for your OS is rewriting the run-time library, also known as libc. This is because
the RTL is the most OS-dependent part of the compiler package: the C
RTL provides enough functionality to allow you to write portable
programs, but the inner workings of the RTL are dependent on the OS in
use. In fact, compiler vendors often use different RTLs for the same
OS: Microsoft Visual C++ provides different libraries for the various
combinations of debug/multi-threaded/DLL, and the older MS-DOS
compilers offered run-time libraries for up to 6 different memory
models.
I am kind of confused with this part. Suppose I write my kernel in C code and against the advice use the inbuilt printf() function to print something. Finally my code will be translated to the machine code. When it will be executed, processer will directly run it. Why does the author says :
inner workings of the RTL are dependent on the OS in use ?
There are two separate issues:
What will printf() do when run inside of your kernel? Most likely it will crash or do nothing, since the RTL of the C compiler you use to develop your kernel is probably assuming some runtime environment with console, operating system, etc. Even if you're using a freestanding implementation of C/C++, the runtime will likely take over serial ports or whatnot to perform the output. You don't want that, probably, since your kernel's drivers will control the I/O. So you need to reimplement the underlying file I/O from the RTL.
What will printf() do when run in a user process that runs on top of your kernel? If the kernel protects access to hardware resources, it can't do anything. The underlying file I/O code from the RTL has to be aware of how to communicate with the kernel to open whatever passes for standard input/output "files" and to perform data exchange.
You need to be aware of whether you're using a free-standing or hosted implementation of the C/C++ compiler + RTL, and all of the implications. For kernel development, you'll be using a free-standing implementation. For userspace development, you'll want a hosted implementation, perhaps a cross-compiler, but the runtime library must be written as for a hosted implementation. Note that in both cases you can use the same compiler, you just need to point it to appropriate header files and libraries. On Linux, for example, kernel and userspace development can be done using the very same gcc compiler, with different headers and libraries.
The processor has no clue what a console is, or what a kernel is. Some code has to actually access the hardware. When you take printf() from a hosted C/C++ implementation, that implementation, somewhere deep in its guts, will invoke a system call for the particular platform it was meant to run on. That system call is meant to write to some abstraction that wraps the "console". On the other side of this system call is kernel code that will push this data to some hardware. It may not even be hardware directly, it may well be userspace of another process.
For example, whenever you run things in a GUI-based terminal on a Unix machine (KDE's Konsole, X11 xterm, OS X Terminal, etc.), the userland process invoking printf() has very, very far to go before anything hits hardware. Namely (even this is simplified!):
printf() writes data to a buffer
The buffer is flushed to (written to) a file handle. The write() library function is called.
The write() library function invokes a system call that transfers the control over to the kernel.
The kernel code copies the data from the userspace pages, since those can vanish at any time, to a kernel-side non-paged buffer.
The kernel code invokes the write handler for a given file handle - a file handle, in many kernels, is implemented as class with virtual methods.
The file handle happens to be a pseudo-terminal (pty) slave. The write method passes the data to the pty master.
The pty master fills up the read buffer of given pseudo-terminal, and wakes up the process waiting on the related file handle.
The process implementing the GUI terminal wakes up and read()s the file handle. This goes through the library to a syscall.
The kernel invokes the read handler for the pty master file handle.
The read handler copies its buffered data to the userspace.
The syscall returns.
The terminal process takes the data, parses it for control codes, and updates its internal data structure representing the emulated screen. It also queues an update event in the event queue. Control returns to the event loop of the GUI library/framework. This is done through an event since those events are usually coalesced. If there's a lot of data available, it will be all processed to update the screen data structure before anything gets repainted.
The event dispatcher dispatches the update/repaint event to the "screen" widget/window.
The event handler code in the widget/window uses the internal data structure to "paint" somewhere. Usually it'd be on a bitmap backing store.
The GUI library/framework code signals the operating system's graphics driver that new data is available on the backing store.
Again, through a syscall, the control is passed over to the kernel. The graphics driver running in the kernel will do the necessary magic on the graphics hardware to pass the backing bitmap to the screen. It may be an explicit memory copy, or a simple queuing of a texture copy with the graphics hardware.
Printf() is a high-level function that can be independent of the OS. It is however just part of the puzzle, it has dependencies itself. It needs to be able to write to stdout. Which will result in low-level OS dependent system calls, like create() to open the stdout stream and write() to send printf output there. Different OSes have different system calls so there's always an adaption layer, there will be in yours.
So sure, you can make printf() work in your kernel. Actually seeing the output of calls to printf() is going to be the real problem to solve. Nothing like a terminal window in kernel mode.

Resources