i am trying to find the select() source code (linux, i386 arch) in the glibc source code,
but i cannot find anything (related to the said architecture)
Could anybody point me to the select() source code ?
mh's answer is pretty good, but I will try to be more specific:
select is a Linux system call, not a libc function. Its source code can be found here.
libc only has a wrapper for invoking the Linux system call. The wrapper for the select syscall is generated at build time, because select is listed in the syscalls.list file.
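To make the wrapper/system-call split concrete, here is a minimal sketch that waits on a pipe either through the libc select() wrapper or through syscall() directly. The helper names wait_readable and select_demo, and the RAW_SELECT macro, are made up for this illustration. It assumes that on i386 and ARM the five-argument call is exposed as SYS__newselect (plain SYS_select there is the old single-argument variant), and it falls back to the wrapper where neither number is defined.

```c
#define _GNU_SOURCE
#include <sys/select.h>
#include <sys/syscall.h>
#include <unistd.h>

/* Pick the syscall number for the five-argument select, if there is one.
   Assumption: SYS__newselect exists on i386/ARM, SYS_select elsewhere. */
#if defined(SYS__newselect)
#define RAW_SELECT SYS__newselect
#elif defined(SYS_select)
#define RAW_SELECT SYS_select
#endif

/* Hypothetical helper: wait up to one second for fd to become readable.
   With use_raw != 0 the libc wrapper is bypassed where possible. */
static int wait_readable(int fd, int use_raw)
{
    fd_set rfds;
    struct timeval tv = { .tv_sec = 1, .tv_usec = 0 };

    FD_ZERO(&rfds);
    FD_SET(fd, &rfds);
#ifdef RAW_SELECT
    if (use_raw)
        return (int)syscall(RAW_SELECT, fd + 1, &rfds, NULL, NULL, &tv);
#else
    (void)use_raw; /* no raw number available: always use the wrapper */
#endif
    return select(fd + 1, &rfds, NULL, NULL, &tv);
}

/* Hypothetical demo: put one byte into a pipe and wait on the read end.
   Returns the select result (1 when the byte is ready), or -1 on error. */
static int select_demo(int use_raw)
{
    int p[2];
    int n = -1;

    if (pipe(p) == -1)
        return -1;
    if (write(p[1], "x", 1) == 1)
        n = wait_readable(p[0], use_raw);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Both select_demo(0) and select_demo(1) should report one ready descriptor, since the wrapper and the raw call end up in the same kernel code.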
select() is not a function of the libc, but a kernel function, so you need to take a look into the kernel source.
You can tell this by looking into the man page: If it is in section 2, it's a kernel function, if it's in section 3, it's a function of the standard C library, in your case the glibc.
Edit: Like some other people remarked correctly (thank you!), a function described in section 2 is officially called a system call and it is actually a call to a library that wraps the operating system's actual call interface.
I have a bit of confusion regarding these two, so here are my questions:
The Linux man-pages project lists all these functions:
https://www.kernel.org/doc/man-pages/
Looking at recvfrom as an example, this function exists both as a Linux system call as well as a C library function. Their documentation seems different but they are both reachable using #include <sys/socket.h>.
I don't understand their difference?
I also thought systems calls are defined using hex values which can be implemented in assembly directly, their list is here:
https://syscalls.kernelgrok.com/
However I cannot find recvfrom in the above link. I'm a bit confused between Linux system calls vs C lib functions at this point!
Edit: To add to the questions, a lot of functions are under (3) but not (2), e.g. clean. Does this mean these are done by the C runtime directly, without relying on system calls and the underlying OS?
First, understand that C functions and system calls are two completely different things.
System calls not wrapped in the C library must be called through the syscall function. One example of such a call is gettid (glibc only gained a wrapper for it in version 2.30).
To create a gettid system call wrapper with syscall, do this:
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Thin wrapper: invoke the gettid system call by number via syscall(). */
pid_t gettid(void)
{
    pid_t tid = (pid_t)syscall(SYS_gettid);
    return tid;
}
Here's an excerpt from the NOTES section of the man-page, which explicitly states that this function is not defined within the C library:
NOTES
Glibc does not provide a wrapper for this system call; call it using syscall(2).
recvfrom is a C library wrapper around a system call.
Everything in section (2) is a system call. Everything in section (3) is not. Everything in section (3) (with a few notable exceptions, such as getumask) has a definition in the C library. About half of everything in section (2) does not have a definition (or wrapper) within the C library (with the exception of functions mandated by POSIX, and some other extensions, which all do), as with gettid.
When calling recvfrom in C, the C library calls the kernel to do the syscall.
The syscall function is the function that, in the classic 32-bit x86 convention, puts the system call number in the %eax register and executes int $0x80.
The reason you don't see recvfrom in https://syscalls.kernelgrok.com/ is because https://syscalls.kernelgrok.com/ is very, very incomplete.
The reason there are many functions in (3) that you don't see in (2) is because many functions on (3) don't have a system call. They may or may not rely on system calls, they just don't have a system call with that specific name that backs them.
exists both under the linux system call
The way userspace programs communicate with the kernel is by using the syscall function. All syscall() does is load some numbers into specific registers and then execute a special interrupt instruction. On the interrupt, execution is transferred to the kernel, which then reads the data passed by userspace from those registers.
Each syscall has a number and different arguments. User space programs are expected to "find out" arguments for each syscall by for example inspecting the documentation.
A Linux system call is identified by just a number, like __NR_recvfrom, which is equal to 45 on the x86-64 architecture.
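The register-and-instruction mechanism described above can be sketched by hand. The following is only an illustration, assuming GCC or Clang inline assembly on x86-64, where the number goes in rax and the syscall instruction (rather than an interrupt) enters the kernel. The helper name raw_syscall0 is made up, and on other architectures the sketch simply defers to the libc syscall() function.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <unistd.h>

/* Hypothetical helper: issue a system call that takes no arguments. */
static long raw_syscall0(long nr)
{
#if defined(__x86_64__)
    long ret;

    /* x86-64 convention: system call number in rax, result returned in
       rax; the syscall instruction itself clobbers rcx and r11. */
    __asm__ volatile ("syscall"
                      : "=a"(ret)
                      : "a"(nr)
                      : "rcx", "r11", "memory");
    return ret;
#else
    /* Assumption: elsewhere, let libc load the registers for us. */
    return syscall(nr);
#endif
}
```

Calling raw_syscall0(SYS_getpid) should return the same process ID as getpid(), which shows that the C function is only a convenient front end for the number-plus-registers protocol.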
C Lib function
A C library function is a function implemented by the C library. So, for example, glibc implements recvfrom as a simple wrapper around syscall(__NR_recvfrom, ...). This is the C interface the library provides to programmers for accessing kernel functionality, so C programmers don't need to read the documentation for each syscall and have a nice C interface for calling the kernel.
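As a rough illustration of the "wrapper around syscall(__NR_recvfrom, ...)" idea, the sketch below receives a datagram over a socketpair without going through the libc recvfrom function. The helper names raw_recvfrom and recv_roundtrip are invented for this example, and it assumes SYS_recvfrom is defined for the target (it falls back to the wrapper if not, as on older 32-bit x86 headers that only offer socketcall).

```c
#define _GNU_SOURCE
#include <string.h>
#include <sys/socket.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: recvfrom by system call number where available. */
static ssize_t raw_recvfrom(int fd, void *buf, size_t len, int flags)
{
#ifdef SYS_recvfrom
    /* syscall() returns -1 and sets errno on failure, like the wrapper. */
    return syscall(SYS_recvfrom, fd, buf, len, flags, NULL, NULL);
#else
    return recvfrom(fd, buf, len, flags, NULL, NULL);
#endif
}

/* Hypothetical demo: send msg over a datagram socketpair and read it back.
   Returns the number of bytes received, or -1 on error. */
static ssize_t recv_roundtrip(const char *msg, char *out, size_t outlen)
{
    int sv[2];
    ssize_t n = -1;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) == -1)
        return -1;
    if (send(sv[0], msg, strlen(msg), 0) != -1)
        n = raw_recvfrom(sv[1], out, outlen, 0);
    close(sv[0]);
    close(sv[1]);
    return n;
}
```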
However I cannot find recvfrom in the above link.
Don't use the link then. At best inspect kernel sources under uapi directory.
First of all, functions listed in section (2) are functions. They are different from functions in section (3) in that there is always a system call behind.
Those functions will usually do additional work to make them behave like POSIX functions (converting the returned value to -1 and errno), or to just make them usable (the clone syscall requires libc integration to be useful). Sometimes arguments are passed to a system call differently than the function prototype suggests; for example, they can be packed into a structure and a pointer to that structure can be passed through a register.
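The "returned value to -1 and errno" conversion can be sketched in a few lines. This is an illustration rather than any particular libc's code; the helper name to_posix_result is made up. It relies on the Linux convention that a failing system call returns a negated error number between -4095 and -1 in the result register.

```c
#include <errno.h>

/* Hypothetical helper: turn a raw kernel return value into the POSIX
   convention (result on success, -1 with errno set on failure). */
static long to_posix_result(long raw)
{
    if (raw < 0 && raw >= -4095) {
        errno = (int)-raw; /* the kernel handed back -errno */
        return -1;
    }
    return raw;
}
```

For example, a raw result of -EBADF becomes a return value of -1 with errno set to EBADF, while a raw result of 7 (say, seven bytes written) is passed through unchanged.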
Sometimes a new syscall is added to fix some issues of an older syscall. In this case a function can be implemented using the new syscall transparently (see mmap vs mmap2, sys_select vs sys_old_select).
As for recvfrom, socket-related functions are implemented by either their respective syscalls or by a legacy sys_socketcall. For example musl still has this code:
#ifdef SYS_socket
#define __socketcall(nm,a,b,c,d,e,f) syscall(SYS_##nm, a, b, c, d, e, f)
#define __socketcall_cp(nm,a,b,c,d,e,f) syscall_cp(SYS_##nm, a, b, c, d, e, f)
#else
#define __socketcall(nm,a,b,c,d,e,f) syscall(SYS_socketcall, __SC_##nm, \
((long [6]){ (long)a, (long)b, (long)c, (long)d, (long)e, (long)f }))
#define __socketcall_cp(nm,a,b,c,d,e,f) syscall_cp(SYS_socketcall, __SC_##nm, \
((long [6]){ (long)a, (long)b, (long)c, (long)d, (long)e, (long)f }))
#endif
which tries to use appropriate syscall if available, backing off to socketcall otherwise.
Looking at recvfrom as an example, this function exists both as a Linux system call as well as a C library function.
I was able to find 2 pages for recvfrom:
recvfrom(2) in Linux Programmer's Manual
recvfrom(3p) in POSIX Programmer's Manual
Often, the Linux page also tells how Linux version of the function differs from POSIX one.
They are different from functions in section (3) in that there is always a system call behind [section 2].
Not necessarily. Section 2 is Linux-specific API for user-space applications. Linus Torvalds insists that user-space applications must never be broken because of Linux kernel API changes. glibc or another library normally implement the functions, to maintain the stable user-space API and delegate to the kernel.
I have a couple of assumptions, most likely some of them will be incorrect. Please correct me where they are wrong.
We could categorize the functions in a program written in C as follows:
Functions that are sent to dynamically loaded libraries:
These are sent to the library that translates them in to multiple standard C-functions
The library passes them on to libc where they are translated into multiple system calls.
Libc passes those on to the kernel where they are executed and the returns are sent back to libc.
Libc will collect the returns, group them by C function and use them to create 1 return for each C function. These returns will be sent back to the dynamically loaded library.
This library will collect all returns and use them to create 1 return that is sent back to the original program.
Functions that are either defined in the code or part of statically compiled libraries: Everything is the same as the category above but:
The program already does the translation into standard C functions where they are known, or into functions calling dynamically loaded libraries in the other case.
The standard C functions are sent to libc, the others to the dynamically loaded libraries (where they will be handled as above).
The creation of 1 final return based on the returns from both types of functions will be done by the program
Functions that are standard C functions: They will just be sent to libc which will communicate with the kernel in the same way as above and 1 return will be sent to the program
Functions that are system calls: They are NOT sent directly to the kernel but have to pass to libc although it doesn't do any extra work.
Security checks (permissions, writing to unallocated mem, ...) are always done by the kernel, although libc and libraries above might also check it first.
All POSIX-compliant systems follow these rules
It might not be the same on Linux and on some other POSIX system (like FreeBSD).
On Linux, the ABI defines how a system call is done. Read about Linux kernel interfaces. The system calls are listed in syscalls(2) (but see also /usr/include/asm*/unistd.h ...). Read also vdso(7). The assembler HowTo explains more details, but for 32 bits i686 only.
Most Linux libc are free software, you can study their source code. IMHO the source code of musl-libc is very readable.
To simplify a tiny bit, most system calls (e.g. write(2)) are small C functions in the libc which:
call the kernel using SYSENTER machine instruction (and take care of passing the system call number and its arguments with the kernel convention, which is not the usual C ABI). What the kernel considers as a system call is only that machine instruction (and conventions about it).
handle the failure case, by passing it to errno(3) and returning -1.
(On Linux the kernel does not use a flag bit for this: a failing system call returns a negated error number, a value between -4095 and -1, in the result register. Signalling failure through the carry flag is the convention on some BSD systems.)
handle the success case, by returning a result.
You could invoke system calls without libc, with some assembler code. This is unusual, but has been done (e.g. in BusyBox or in Bones).
So the libc code for write is doing some tiny extra work (passing arguments, handling failure & errno and success cases).
A few system calls (such as gettimeofday & clock_gettime) avoid the overhead of the SYSENTER machine instruction (and the user-mode -> kernel-mode switch) thanks to the vDSO.
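A small sketch of how a program can observe the vDSO, assuming glibc 2.16 or later (for getauxval) on Linux. The helper names vdso_base and monotonic_now are made up. Whether a given clock_gettime call is really served from the vDSO depends on the architecture and clock source, so the second helper only shows the call that is a candidate for that fast path.

```c
#define _GNU_SOURCE
#include <elf.h>
#include <sys/auxv.h>
#include <time.h>

/* Hypothetical helper: address at which the kernel mapped the vDSO into
   this process, taken from the auxiliary vector; 0 if none was provided. */
static unsigned long vdso_base(void)
{
    return getauxval(AT_SYSINFO_EHDR);
}

/* Hypothetical helper: an ordinary clock_gettime call. On common setups
   libc resolves this through the vDSO, so no kernel entry is needed. */
static int monotonic_now(struct timespec *ts)
{
    return clock_gettime(CLOCK_MONOTONIC, ts);
}
```

Running a program that calls monotonic_now under strace is one way to check this: if the vDSO path is taken, no clock_gettime system call should show up in the trace.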
No, you can't categorize things like that. When you program in C (and it makes no difference in almost all other languages), there are only functions, and whatever the real status of these is, you call them exactly the same way. This is defined by the ABI (how to pass parameters, get returned values, etc.) and enforced by the compiler/linker. Of course some functions are just stubs. For example, stubs to shared library functions (stubs may be needed to load the library, dynamically link to the real function, etc.) or system calls (this is more technical and differs from kernel to kernel). But from the viewpoint of your program everything is the same (this is why it is hard to understand the difference between fread and read at the beginning: you call them the same way, they do almost the same job, so what's the difference?).
POSIX doesn't say a single word about kernels... It just lists the C (and formerly Ada) API of a set of functions with minimal semantics (plus some commands, tools, etc.). Implementation of these is totally free.
I know that read is a system call. But when I read man 2 and man 3 of read, they show me different explanations. So I suspect that read exists both as a library function and as a system call. In that case, if I use read in my C program, will the compiler treat read as a library function or as a system call? Please clear up this confusion for me.
It doesn't. System calls are present in libc (the C standard library) just like library functions are. The implementations of system calls in libc are just "stubs" which invoke system-specific methods of calling into the kernel.
I'm assuming you're on Linux. On that platform, the manpage read(2) describes the Linux system call, while read(3) describes the POSIX specification for read, if you have the POSIX manpages installed. The latter is in category 3 because POSIX doesn't specify a difference between system calls and library functions.
There's only one read in libc, which is (a thin wrapper around) the system call.
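To see that the libc read is only a thin layer over the system call, here is a minimal sketch that reads from a pipe either through the wrapper or by system call number. The helper names read_raw and pipe_roundtrip are invented for the illustration.

```c
#define _GNU_SOURCE
#include <sys/syscall.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: read by system call number, skipping the libc
   read() function (syscall() still applies the -1/errno convention). */
static ssize_t read_raw(int fd, void *buf, size_t count)
{
    return syscall(SYS_read, fd, buf, count);
}

/* Hypothetical demo: write len bytes of msg into a pipe and read them
   back, via read_raw() when raw != 0 and via read() otherwise.
   Returns the number of bytes read, or -1 on error. */
static ssize_t pipe_roundtrip(const char *msg, size_t len, char *out, int raw)
{
    int p[2];
    ssize_t n = -1;

    if (pipe(p) == -1)
        return -1;
    if (write(p[1], msg, len) == (ssize_t)len)
        n = raw ? read_raw(p[0], out, len) : read(p[0], out, len);
    close(p[0]);
    close(p[1]);
    return n;
}
```

Both paths should return the same byte count and the same data, because read() in libc does little more than what read_raw does here.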
I'm currently learning about operating systems and the use of traps to facilitate system calls within the Linux kernel. I've located the table of the traps in traps.c and the implementation of many of the traps within entry.S.
However, I'm instructed to find an implementation of two system calls in the Linux kernel which utilize traps to implement a system call. Although I can find the definition of the traps themselves, I'm not sure what a "call" to one of these traps within the kernel would look like. Therefore, I'm struggling to find an example of this behavior.
Before anyone asks, yes, this is homework.
As a note, I'm using Github to browse the kernel source, since kernel.org is down:
https://github.com/torvalds/linux/
For the x86 architecture the SYSCALL_VECTOR (0x80) interrupt is used only for 32-bit kernels. You can see the interrupt vector layout in arch/x86/include/asm/irq_vectors.h. The trap_init() function from traps.c is the one that sets the trap handler defined in entry_32.S:
set_system_trap_gate(SYSCALL_VECTOR, &system_call);
For 64-bit kernels, the newer SYSENTER (Intel) or SYSCALL (AMD) instructions are used for performance reasons. The syscall_init() function from arch/x86/kernel/cpu/common.c sets up the "handler" defined in entry_64.S and bearing the same name (system_call).
For the user-space perspective you might want to take a look at this page (a bit outdated for the function/file names).
I'm instructed to find an implementation of two system calls in the Linux kernel which utilize traps to implement a system call
Every system call utilizes a trap (interrupt 0x80 if I recall correctly) so the "kernel" bit will be turned on in PSW, and privileged operations will be available to the processor.
As you mentioned the system calls are specified in entry.S under sys_call_table: and they all start with the "sys" prefix.
You can find the system call function headers in include/linux/syscalls.h, available here:
http://lxr.linux.no/#linux+v3.0.4/include/linux/syscalls.h
In general, use lxr (as the comment above has already mentioned) to browse the source code.
Anyhow, the functions are implemented using SYSCALL_DEFINE1 or other versions of the macro, see
http://lxr.linux.no/#linux+v3.0.4/kernel/sys.c
If you're looking for an actual system call, not an implementation of a system call, maybe you want to check some C libraries. Why would a kernel include a system call? (I'm not talking about a system call implementation, I'm talking about e.g. an actual chdir call for example. There is a chdir system call, which is a request for changing the directory and there is a chdir system call implementation which actually changes it and must be somewhere in the kernel). Ok, maybe some kernels do include some syscalls too but that's another story :)
Anyway, if I get your question right, you're not looking for an implementation but an actual call. GNU libc is too complicated for me, but you can try browsing the dietlibc sources. Some examples:
chdir.S
syscalls.h