How are trap handlers, exception handlers, and interrupt handlers different from system calls?

Considering Linux environment, what is the difference between them?
How is a system call different from a normal function call?

According to Wikipedia, a trap is an exception. Exceptions are defined differently depending on who you talk to. In a generic sense, interrupts are exceptions. An exception could be a page fault (code or data), an alignment fault, an undefined instruction, divide by zero, etc.
Generally, they are all very similar. They switch context to the OS to handle the issue, which means saving registers (a user-space-to-OS context switch) and possibly a process context switch, depending on the request or circumstance. When you transition to the OS, different MMU protections (the CPU's view of memory) are in effect and a different stack is used. In most cases, the instruction that caused the fault is the one that was executing when the switch happens.
The interrupt is different in that any user-space instruction can be interrupted. For most of the others, only specific classes of instructions can cause a fault. This has ramifications for compilers and libraries that need to do things atomically (with respect to the thread, the process, or the system globally). Further details really depend on the CPU in use.
Considering Linux environment, what is the difference between them?
This is almost unanswerable in a definitive way. Linux versions, CPU versions, and your definition of these terms all influence the answer. However, I think the above is a good conceptual guide.
How is a system call different from a normal function call?
A normal function call doesn't transition to 'kernel space'. Many access permissions change when entering kernel space; usually this is hard-wired into the CPU, although the Linux 'mm' and 'io' layers are most definitely different and code may be required to make it so. It can also depend on what the 'system call' does. In some cases, Linux has been optimized so that a system call isn't needed (from one version to the next); see for example the vdso man page. In other cases, the C library or some other mechanism might avoid the system call, for instance DNS name caching.

Related

What's the relationship between VDSO(7) and SYSCALL(2)?

From this post, I learned
syscall is the default way of entering kernel mode on x86-64.
In practice, recent kernels are implementing a VDSO
Then I looked up the manual, at http://man7.org/linux/man-pages/man2/syscall.2.html :
The first table lists the instruction used to transition to kernel mode (which might not be the fastest or best way to transition to the kernel, so you might have to refer to vdso(7)), the register used to indicate the system call number, the register used to return the system call result, and the register used to signal an error...
But I lack some essential knowledge to understand the statements.
Is it true that vdso(7) is the implementation of syscall(2), or does syscall(2) invoke the vDSO to complete a system call?
If neither is true, what is the relationship between vdso(7) and syscall(2)?
No, the vDSO is not the implementation of syscall(2).
Without the vDSO, every system call made by a user-space application requires a transition into the kernel, with the context switching that implies.
With the vDSO, a few system calls can run entirely in user space, without that transition.
The kernel automatically maps the vDSO into the address space of all user-space applications.
Read more carefully the man pages syscalls(2), vdso(7) and the wikipages on system calls and VDSO. Read also the operating system wikipage and Operating Systems: Three Easy Pieces (freely downloadable).
System calls are fundamental: they are the only way a user-space application can interact with the operating system kernel and use the services it provides. So every program uses some system calls (unless it crashes and is terminated by some signal(7)). A system call requires a user-to-kernel transition (e.g. thru a SYSCALL or SYSENTER machine instruction on x86), which is somewhat "costly" (it could take a microsecond).
The vDSO is only a clever optimization (to avoid the cost of a genuine system call for a very few functions, like clock_gettime(2), which also still exists as a genuine system call), a bit like a shared library magically provided by the kernel without any real file. Some programs (e.g. statically linked ones, or those not using the libc, like BONES or probably busybox) don't use it.
You can avoid the vDSO (or not use it), and earlier kernels did not have it. But you cannot avoid making system calls, and programs usually make a lot of them.
Play also with strace(1) to understand the (many) system calls made by an application or a running process.

cannot find pthread.c in linux folder

I have downloaded the kernel sources, which reside in a folder called linux-2.6.32.28, in which I can find kernel/kthread.c.
I found kthread.c, but I cannot find pthread.c in linux-2.6.32.28.
I also found kthread.c in linux-3.13/kernel and linux-4.7.2/kernel.
locate pthread.c finds a pthread.c under /usr that came with my Ubuntu installation, but pthread.c is not in the downloaded trees linux-2.6.32.28, linux-3.13, or linux-4.7.2.
MORE: There are two sets of function calls: 1. system calls; 2. library calls.
For a computer to do any task, it has to use hardware resources. So how do library calls differ from system calls?
System calls always use the kernel, which means hardware.
Do library calls mean no usage of the kernel or hardware?
I know that a library call may sometimes resolve to a system call.
What I want to know is: if every set of function calls uses hardware, then to what degree do system calls use hardware resources compared with library calls, and vice versa?
Whether a function call is a system call or a library call, at least a hardware resource like RAM has to be utilized. Right?
Read pthreads(7) first. It explains that pthreads are implemented in the C standard library, as nptl(7).
The C standard library is the cornerstone of Linux systems, and you might have several variants of it; however, most Linux distributions have only one libc, often the GNU glibc, which contains the NPTL. You might use another C standard library (such as musl-libc or dietlibc). With care, you can have several C standard libraries co-existing on your system.
The C standard library (and every user-space program) is using system calls to interact with the kernel. They are listed in syscalls(2). BTW most C standard library implementations on Linux are free software, so you can study (or even improve) their source code if you want to. You often use a system call thru the small wrapper C function (e.g. write(2)) for it in your C standard library, and even more often thru some higher-level function (e.g. fprintf(3)) provided by your C standard library.
Pthreads are implemented (in the NPTL layer of glibc) using low-level stuff like clone(2) and futex(7) and a bit of assembler code. But you generally won't use these directly unless you are implementing a thread library (like NPTL).
Most programs are using the libc and linking with it (and also with crt0) as a shared library, which is /lib/x86_64-linux-gnu/libc.so.6 on my Debian/Sid/x86-64. However, you might (but you usually don't) invoke system calls directly thru some assembler code (e.g. using a SYSCALL or SYSENTER machine instruction). See also this.
The question was edited to also ask
What I want to know is that, if every set of function calls uses hardware then to what degree system calls will use hardware resources
Please read a lot more about operating systems. So read carefully Operating Systems: Three Easy Pieces (a freely downloadable textbook) and read about instruction set architectures and computer architecture. Study several ISAs, e.g. x86-64 and RISC-V (or ARM, PowerPC, etc.). Read about CPU modes and virtual memory.
You'll find out that the OS manages physical resources (including RAM and processor cores). Each process has its own virtual address space. So from a user-space point of view, a process doesn't use hardware directly (e.g. it runs in a virtual address space, not in RAM); it runs in a kind of virtual machine (provided by the OS kernel) defined by the system calls and the ISA (the unprivileged machine instructions).
Whether a function call is System or Library, at least hardware resource like RAM has to be utilized. Right?
Wrong, from a user-space point of view. All hardware resources are (by definition) managed by the operating system (which provides abstractions thru system calls). An ordinary application executable uses the abstractions and software resources (files, processes, file descriptors, sockets, memory mappings, virtual address space, etc.) provided by the OS.
(so several books are required to really answer your questions; I gave some references, please follow them and read a lot more; we can't explain everything here)
Regarding your second cluster of questions: Everything a computer does is ultimately done by hardware. But we can make some distinctions between different kinds of hardware within the computer, and different kinds of programs interacting with them in different ways.
A modern computer can be divided into three major components: central processing unit (CPU), main memory (random access memory, RAM), and "peripherals" such as disks, network transceivers, displays, graphics cards, keyboards, mice, etc. Things that the CPU and RAM can do by themselves without involving peripherals in any way are called computation. Any operation involving at least one peripheral is called input/output (I/O). All programs do some of both, but some programs mostly do computation, and other programs mostly do I/O.
A modern operating system is also divided into two major components: the kernel, which is a self-contained program that is given special abilities that no other program has (such as the ability to communicate with peripherals and the ability to control the allocation of RAM), and is responsible for supervising execution of all the other programs; and the user space, which is an unbounded set of programs that have no such special abilities.
User space programs can do computation by themselves, but to do I/O they must, essentially, ask the kernel to do it for them. A system call is a request from a user-space program to the kernel. Many system calls are requests to perform I/O, but the kernel can also be requested to do other things, such as provide general information, set up communication channels among programs, allocate RAM, etc. A library is a component of a user space program. It is, essentially, a collection of "functions" that someone else has written for you, that you can use as if you wrote them yourself. There are many libraries, and most of them don't do anything out of the ordinary. For instance, zlib is a library that (provides functions that) compress and uncompress data.
However, "the C library" is special because (on all modern operating systems, except Windows) it is responsible for interacting directly with the kernel. Nearly all programs, and nearly all libraries, will not make system calls themselves; they will instead ask the C library to do it for them. Because of this, it can often be confusing trying to tell whether a particular C library function does all of its work itself or whether it "wraps" one or more system calls. The pthreads functions in particular tend to be a complicated tangle of both. If you want to learn how operating systems are put together at their lower levels, I recommend you start with something simpler, such as how the stdio.h "buffered I/O" routines (fopen, fclose, fread, fwrite, puts, etc) ultimately call the unistd.h/fcntl.h "raw I/O" routines (open, close, read, write, etc) and how the latter set of functions are just wrappers around system calls.
(Assigning the task of wrapping system calls to the runtime library for the C programming language is a historical accident and we probably wouldn't do it that way again if we were going to start over from scratch.)
pthread is POSIX threads: a library that helps applications create threads in the OS. kthread in the kernel source code is for kernel threads in Linux.
POSIX is a standard defined by the IEEE to maintain compatibility between different OSes.
So pthread source code cannot be found in the Linux kernel source code.
You may refer to this link for pthread source code: http://www.gnu.org/software/hurd/libpthread.html

Low interrupt latency via dedicated architectures and operating systems

This question may seem slightly vague; however, I am researching how interrupt systems work and their latency. I am trying to understand how architectural facilities such as the FIQ in ARM help decrease latency. How does this differ from using an operating system that does not have, or cannot provide, access to these facilities? For example, Windows RT is built for ARM, and that operating system cannot be ported to other architectures.
Simply put - how is interrupt latency different in dedicated architectures that have dedicated operating systems as compared to operating systems that can be ported across many different architectures (Linux for example)?
Sorry for the rant - I'm pretty confused as you can probably tell.
I'll start with your Windows RT example. Windows RT is a port of Windows to the ARM architecture; it is not a 'dedicated operating system'. There are (probably) many OSes that run on only one architecture, but that is more a function of nobody having bothered to port them, for one reason or another.
What does 'port' really mean though?
Windows has a kernel (we'll call it NT here; it doesn't matter) and that NT kernel has a bunch of concepts that need to be implemented. These concepts are things like timers, memory virtualisation, exceptions etc...
These concepts are implemented differently between architectures, so the port of the kernel and drivers (I will ignore the rest of the OS here, often that is a recompile only) will be a matter of using the available pieces of silicon to implement the required concepts. This implementation is called a 'port'.
Let's zoom in on interrupts (AKA exceptions) on an ARM that has FIQ and IRQ.
In general an interrupt can occur asynchronously; by that I mean at any time. The CPU is generally busy doing something when an IRQ is asserted, so that context (we'll call it UserContext1) needs to be stored before the CPU can use any resources in use by UserContext1. Generally this means storing registers on the stack before using them.
On ARM, when an IRQ occurs the CPU switches to IRQ mode. Registers r13 and r14 have their own copies in IRQ mode; the rest need to be saved if they are used, so that is what happens. Those stores to memory take some time. The IRQ is handled, UserContext1 is popped back off the stack, then IRQ mode is exited.
So the latency in this case might be the time from IRQ assertion to the time the IRQ vector starts executing. That's going to be some set number of clock cycles, depending on what the CPU was doing when the IRQ happened.
The latency before the IRQ handling can occur is the time from the IRQ assert to the time the CPU has finished storing the context.
The latency before user mode code can execute depends on too much stuff in the OS/Kernel to explain here, but the minimum boils down to the time from the IRQ assertion to the return after restoring UserContext1 + the time for the OS context switch.
FIQ - If you are a hard-as-nails programmer you might need only 7 registers to completely handle your interrupt servicing. I mentioned that IRQ mode has its own copy of 2 registers; well, FIQ mode has its own copy of 7 registers. Yup, that's 28 bytes of context that doesn't need to be pushed onto the stack (actually one of them is the link register, so it's really 6 you have). That can remove the need to store and then restore UserContext1, so the latency can be reduced by up to the time needed for that save/restore.
None of this has much to do with the OS. The OS can choose to use or not use these features. The OS can choose to make guarantees regarding how long it will take to execute the OSes concept of an interrupt handler, or it may not. This is one of the basic concepts of an RTOS, the contract about how long before the handler will run.
The OS is designed for some purpose (and that purpose may be 'general'); that design goal will have a lot more effect on latency than how many targets the OS has been ported to.
Go have a read about something like FreeRTOS, then buy some hardware and try it. Annotate the code to figure out the latencies you really want to look at. It will likely be the best way to get your head around it.
(Multi-CPU systems do it much the same way, but with some synchronization and barrier functions and a sprinkling of complexity.)

What's the difference between threads (and processes) in kernel mode and ones in user mode?

My questions:
1) In the book Modern Operating Systems, it says that threads and processes can be in kernel mode or user mode, but it does not say clearly what the difference between them is.
2) Why does switching kernel-mode threads and processes cost more than switching user-mode threads and processes?
3) I am now learning Linux. How would I create threads and processes in kernel mode and user mode respectively, in Linux?
4) The book also says that a process can be in user mode while threads created within that user-mode process are in kernel mode. How is this possible?
There are some terminology problems due more to historical accident than anything else here.
"Thread" usually refers to thread-of-control within a process, and may (does in this case) mean "a task with its own stack, but which shares access to everything not on that stack with other threads in the same protection domain".
"Process" tends to refer to a self-contained "protection domain" which may (and does in this case) have the ability to have multiple threads within it. Given two processes P1 and P2, the only way for P1 to affect P2 (or vice versa) is through some particular defined "communications channel" such as a file, pipe, or socket; via "inter-process" signals like Unix/Linux signals; and so on.
Since threads don't have this kind of barrier between each other, one thread can easily interfere with (corrupt the data used by) another thread.
All of this is independent of user vs kernel, with one exception: in "the kernel"—note that there is an implicit assumption here that there is just one kernel—you have access to the entire machine state at all times, and full privileges to do anything. Hence you can deliberately (or in some cases accidentally) disregard or turn off hardware protection and mess with data "belonging to" someone else.
That mostly covers several possibly-confused items in Q1. As for Q2, the answer to the question as asked is "it doesn't". In general, because threads do not involve (as much) protection, it's cheaper to switch from one thread to another: you do not have to tell the hardware (in whatever fashion) that it should no longer allow various kinds of access, since threads T1 and T2 have "the same" access. Switching between processes, however, as with P1 and P2, you "cross a protection barrier", which has some penalty (the actual penalty varies widely with hardware, and to some extent the skills of the OS writers).
It's also worth noting that crossing from user to kernel mode, and vice versa, is also crossing a protection domain, which again has some kind of cost.
In Linux, there are a number of ways for user processes to create what amount to threads, including both "POSIX threads" (pthreads) and the clone call (details for clone, which is extremely flexible, are beyond the scope of this answer). If you want to write portable code, you should probably stick with pthreads.
Within the Linux kernel, threads are done completely differently, and you will need Linux kernel documentation.
I can't properly answer Q4 since I don't have the book and am not sure what they are referring to here. My guess is that they mean that whenever any user process-or-thread makes a "system call" (requests some service from the OS), this crosses that user/kernel protection barrier, and it is then up to the kernel to verify that the user code has appropriate privileges for that operation, and then to do that operation. The part of the kernel that does this is running with kernel-level protections and thus needs to be more careful.
Some hardware (mostly obsolete these days) has (or had) more than just two levels of hardware-provided protection. On these systems, "user processes" had the least direct privilege, but above those you would find "executive mode", "system mode", and (most privileged) "kernel" or "nucleus" mode. These were intended to lower the cost of crossing the various protection barriers. Code running in "executive" did not have full access to everything in the machine, so it could, for instance, just assume that a user-provided address was valid, and try to use it. If that address was in fact invalid, the exception would rise to the next higher level. With only two levels—"user", unprivileged; and "kernel", completely-privileged—kernel code must be written very carefully. However, it's possible to provide "virtual machines" at low cost these days, which pretty much obsoletes the need for multiple hardware levels of protection. One simply writes a true kernel, then lets it run other things in what they "think" is "kernel mode". This is what VMware and other "hypervisor" systems do.
User-mode threads are scheduled in user mode by something in the process, and the process itself is the only thing handled by the kernel scheduler.
That means your process gets a certain amount of grunt from the CPU and you have to share it amongst all your user mode threads.
Simple case, you have two processes, one with a single thread and one with a hundred threads.
With a simplistic kernel scheduling policy, the thread in the single-thread process gets 50% of the CPU and each thread in the hundred-thread process gets 0.5% each.
With kernel mode threads, the kernel itself manages your threads and schedules them independently. Using the same simplistic scheduler, each thread would get just a touch under 1% of the CPU grunt (101 threads to share the 100% of CPU).
In terms of why kernel-mode switching is more expensive, it probably has to do with the fact that you need to switch to kernel mode to do it. User-mode threads do all their work in user mode (obviously), so the kernel isn't involved at all in a thread-switch operation.
Under Linux, you create threads (and processes) with the clone call, similar to fork but with much finer control over things.
Your final point is a little obtuse. I can't be certain but it's probably talking about user and kernel mode in the sense that one could be executing user code and another could be doing some system call in the kernel (which requires switching to kernel or supervisor mode).
That's not the same as the distinction when talking about the threading support (user or kernel mode support for threading). Without having a copy of the book to hand, I couldn't say definitively, but that'd be my best guess.

how do you do system call interrupts in C?

I learned from download.savannah.gnu.org/.../ProgrammingGroundUp-1-0-booksize.pdf
that programs interrupt the kernel, and that is how things are done. What I want to know is how you do that in C (if it's possible)
There is no platform-independent way (obviously)! On x86 platforms, system calls are typically implemented by placing the system-call number in the eax register and triggering int 80h in assembler, which causes a switch to kernel mode. The kernel then executes the relevant code based on what it sees in eax.
User processes usually request kernel services by calling system call wrapper functions from Standard C Library. You can do it manually with syscall(2).
The user program's interaction with the kernel is going to be very platform-specific, so it usually happens behind the scenes in the various library routines. So one just calls printf, write, select, or other library routines, which allow the programmer to write code without worrying about the details of the kernel interface, device drivers, and so forth.
And the way it usually works is that when one of those library routines needs the kernel to do something on its behalf, it performs a low-level system call that yields its control of the CPU to the kernel. It's the user program, not the kernel, that is the one being interrupted.
If you're using glibc (which you probably are if you are using gcc and Linux) then there is a syscall function in unistd.h that you can use. It has different implementations for different architectures and operating systems, but the implementation is done in assembly (possibly inline assembly). syscall has a man page, so:
man syscall
will give you some info.
If you are just curious about how all of this works, then you should know that this has changed in Linux on x86 in recent years. Originally interrupt 0x80 was used by Linux as the normal system call entry point on x86. This worked well enough, but as processors gained more advanced pipelining (starting an instruction before previous instructions have completed), interrupts slowed down relative to regular code, which has sped up (some tests have shown they slowed down in absolute terms as well). The reason is that even when the int instruction is used to trigger an interrupt, it works mostly the same as hardware-triggered interrupts, which occur unpredictably and therefore don't play nicely with instruction pipelining (pipelining works better when code paths are predictable).
To help with this, newer x86 processors have instructions specifically intended for making system calls, but Intel and AMD use different instructions for this (sysenter and syscall, respectively). Additionally, the Intel sysenter instruction clobbers a general-purpose register that Linux has used on x86_32 to pass a parameter to the kernel. This means that programs have to know which of three possible system-call mechanisms to use, as well as possibly different ways of passing arguments to the kernel. To get around all of this, newer kernels map a special page of memory into programs (this page is called vsyscall, and if you cat /proc/self/maps you will see an entry for it) that contains code for the system-call mechanism the kernel has determined should be used on the system, and newer versions of glibc can implement their system-call entry using the code in this page.
The point of all of this is that it isn't as simple as it used to be, but if you are just playing around on x86_32 then you should be able to use the int 80h instruction, because that is supported for backwards compatibility on systems that can use one of the other mechanisms.
In C, you don't really do it directly, but you'll end up doing this indirectly any time you use library functions that end up invoking system calls. File access, network access, etc, are typical examples of this.
Those functions will all end up "trapping" to the kernel, which will handle the request.

Resources