How do system calls work?
What operations happen during a system call?
There are various system calls like open, read, write, socket, etc. I would like to know how they work in general.
In short, here's how a system call works:
First, the user application program sets up the arguments for the system call.
After the arguments are all set up, the program executes the "system call" instruction.
This instruction causes an exception: an event that causes the processor to jump to a new address and start executing the code there.
The instructions at the new address save your user program's state, figure out which system call you want, call the function in the kernel that implements that system call, restore your user program's state, and return control back to the user program.
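As a rough illustration (Linux and glibc assumed), the sketch below bypasses the usual library wrapper and issues a write through the generic syscall(2) interface; the kernel performs the same save-state, dispatch, restore sequence described above no matter how the call is entered.

#define _GNU_SOURCE
#include <unistd.h>      /* syscall() */
#include <sys/syscall.h> /* SYS_write */

int main(void)
{
    const char msg[] = "hello via raw syscall\n";

    /* The arguments (fd, buffer, length) are set up, then the generic
       syscall() wrapper executes the CPU's system-call instruction.
       The kernel saves our state, dispatches to its write implementation,
       restores our state and hands back the result. */
    long n = syscall(SYS_write, 1, msg, sizeof msg - 1);

    return n < 0 ? 1 : 0;
}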
A visual explanation of a user application invoking the open() system call:
It should be noted that the system call interface (which serves as the link to the system calls made available by the operating system) invokes the intended system call in the OS kernel and returns the status of the system call and any return values. The caller needs to know nothing about how the system call is implemented or what it does during execution.
Another example: a C program invoking the printf() library call, which in turn calls the write() system call.
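A small sketch of that layering (Linux/glibc assumed): printf() does its formatting and buffering in user space, and the C library eventually hands the bytes to the kernel with the same write() call you could make directly.

#include <stdio.h>
#include <unistd.h>

int main(void)
{
    /* Library call: formatting happens in user space; the buffered
       bytes are eventually flushed to the kernel with write(). */
    printf("answer = %d\n", 42);
    fflush(stdout);

    /* Direct system call wrapper: no formatting or buffering,
       the bytes go straight to the kernel. */
    write(STDOUT_FILENO, "answer = 42\n", 12);

    return 0;
}

Running it under strace shows a write() system call at the bottom of both paths.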
For a more detailed explanation, read Section 1.5.1 in Chapter 1 and Section 2.3 in Chapter 2 of Operating System Concepts.
Related
I learned how assembly (x86) works, broadly, from the book "Programming from the Ground Up".
In this book, every program ends with an interruption call to exit.
However, in compiled C programs, I found that programs end with a ret. This implies that there is an address to be popped, and that jumping to it leads to the end of the program.
So my question is :
What is this address? (And what is the code there?)
You start your program by asking the OS to pass control to the start or _start function of your program, by jumping to that label in your code. In a C program the start function comes from the C library and (as others have already said) does some platform-specific environment initialization. Then the start function calls your main and control is yours. After you return from main, control passes back to the C library, which terminates the program properly and makes the platform-specific system call to return control to the OS.
So the address main pops is a label coming from the C library. If you want to check it, it should be in stdlib.h (cstdlib), and you will see it calling exit, which does the cleanup.
Its function is to destroy static objects (in C++, of course) at program termination or thread termination (C++11). In the C case it just closes the streams, flushes their buffers, calls the atexit functions, and makes the system call.
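Conceptually, the startup/shutdown path looks roughly like the hypothetical sketch below; the function name hypothetical_start is made up for illustration, and the real crt0/_start code is platform-specific and far more involved.

#include <stdlib.h>   /* exit() */

/* Hypothetical outline of what the C runtime's startup code does. */
extern int main(int argc, char **argv, char **envp);

void hypothetical_start(int argc, char **argv, char **envp)
{
    /* ... platform-specific environment initialization happens here ... */

    int status = main(argc, argv, envp);

    /* main "returning" just brings control back here; exit() flushes stdio,
       runs atexit() handlers, and makes the exit system call to the OS. */
    exit(status);
}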
I hope this is the answer you seek.
It is implementation specific.
On Linux, main is called by crt0, whose _start entry point analyzes the initial call stack set up by the kernel when it interpreted the execve(2) system call for your executable program. On return from main, the epilogue part of crt0 deals with atexit(3)-registered functions and flushes stdio.
FWIW, crt0 is provided by your GCC compiler, and perhaps your C standard library. All of this (along with the Linux kernel) is free software on a Linux distribution.
every program ends with an interruption call to exit.
Not really. It is a system call (see syscalls(2) for their list), not an interrupt.
I was asked about system calls: what they are, which mode they are used in, and whether read(), getchar() and sqrt() use system calls.
For the first part I answered that system calls provide an interface between a process and the OS, and that they are executed in kernel mode.
What bothers me is that, as far as I can tell, the only one of those three functions that uses a system call is read().
Am I right? Or do getchar() and sqrt() also use system calls?
(NOTE: read() is from unistd.h, getchar() from stdio.h, and sqrt() from math.h.)
The difference between a system call and a regular call is that a system call has to issue a trap to the operating system, whereas a regular call just calls another user-level subroutine. You're right in saying that the difference is in what mode the calls are executed in.
sqrt is not a system call. All it does is perform a simple calculation. If I remember correctly, both read() and getchar() end up making system calls, because the operating system is the one that handles input/output operations.
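One way to check this yourself (assuming Linux with strace available) is to exercise all three in one small program and watch which of them traps into the kernel:

#include <stdio.h>
#include <math.h>
#include <unistd.h>

int main(void)
{
    double r = sqrt(2.0);        /* pure user-space computation */

    int c = getchar();           /* library call; the first call fills
                                    stdio's buffer with a read() system
                                    call underneath */

    char buf[16];
    ssize_t n = read(STDIN_FILENO, buf, sizeof buf);  /* thin wrapper
                                                         around the read
                                                         system call */

    printf("sqrt=%f first-char=%d bytes=%zd\n", r, c, n);
    return 0;
}

Compile with -lm for sqrt and run it as strace ./a.out: read() shows up both for the explicit read() and underneath getchar() (via stdio's buffer fill), while sqrt() never leaves user space.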
I'm creating a presentation on how to program in C, and since I'm fairly new to C, I want to check whether my assumptions are correct, and what am I missing.
Every C program has to have an entry point for the OS to know where to begin execution. This is defined by the main() function. This function always has a return value, whether it be user defined or an implicit return 0;.
Since this function is returning something, we must define the type of the thing it returns.
This is where my understanding starts to get hazy...
Why does the entry point need to have a return value?
Why does it have to be an int?
What does the OS do with the address of int main() after the program executes?
What happens in that address when say a segfault or some other error halts the program without reaching a return statement?
Every program terminates with an exit code. This exit code is determined by the return of main().
Programs typically return 0 for success or 1 for failure, but you can choose to use exit codes for other purposes.
1 and 2 are because the language says so.
For 3: Most operating systems have some sort of process management, and a process exits by invoking a suitable operating system service to do so, which takes a status value as an argument. For example, both DOS and Linux have "exit" system calls which accept one numeric argument.
For 4: Following from the above, operating systems typically also allow processes to die in response to receiving a signal which is not ignored or handled. In a decent OS you should be able to distinguish whether a process has exited normally (and retrieve its exit status) or been killed because of a signal (and retrieve the signal number). For instance, in Linux the wait system call provides this service.
Exit statuses and signals provide a simple mechanism for processes to communicate with one another in a generic way without the need for a custom communications infrastructure. It would be significantly more tedious and cumbersome to use an OS which didn't have such facilities or something equivalent.
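A minimal sketch of that mechanism (POSIX assumed): a parent forks a child, the child exits with a status, and the parent retrieves it, or the terminating signal, with waitpid() and the W* macros.

#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
#include <unistd.h>

int main(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        /* Child: exiting with 42 is equivalent to main returning 42. */
        exit(42);
    }

    int status;
    waitpid(pid, &status, 0);

    if (WIFEXITED(status))
        printf("child exited normally with status %d\n", WEXITSTATUS(status));
    else if (WIFSIGNALED(status))
        printf("child was killed by signal %d\n", WTERMSIG(status));

    return 0;
}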
To be clear, this is a design rather than an implementation question
I want to know the rationale behind why POSIX behaves this way. POSIX system calls, when given an invalid memory location, return EFAULT rather than crashing the userspace program (by sending it SIGSEGV), which makes their behavior inconsistent with userspace functions.
Why? Doesn't this just hide memory bugs? Is it a historical mistake or is there a good reason for it?
Because system calls are executed by the kernel, not by the user program --- when the system call occurs, the user process halts and waits for the kernel to finish.
The kernel itself, of course, isn't allowed to seg fault, so it has to manually check all the address ranges the user process gives it. If one of these checks fails, the system call fails with EFAULT. So in this situation a segmentation fault hasn't actually happened: it has been avoided by the kernel explicitly checking that all the addresses are valid. Hence it makes sense that no signal is sent.
In addition, if a signal were sent, there would be no way for the kernel to attach a meaningful program counter to it, since the user process isn't actually executing its own code while the system call is running. This means there would be no way for the user process to produce decent diagnostics, restart the failed instruction, etc.
To summarise: mostly historical, but there is actual logic to the reasoning. Like EINTR, this doesn't make it any less irritating to deal with.
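A quick sketch of the difference (Linux assumed): handing the kernel an obviously bad buffer through read() gets you -1 and errno == EFAULT, while dereferencing the same pointer yourself would raise SIGSEGV.

#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    void *bad = (void *)1;                  /* clearly not a mapped address */
    int fd = open("/dev/zero", O_RDONLY);   /* always has bytes to give us */

    /* The kernel notices the destination buffer is invalid while copying
       and fails the call with EFAULT instead of faulting the process. */
    ssize_t n = read(fd, bad, 16);
    if (n < 0)
        printf("read failed: %s (errno=%d)\n", strerror(errno), errno);

    /* By contrast, touching the address ourselves is an ordinary user-space
       access and would raise SIGSEGV:
       *(char *)bad = 'x';    (left commented out on purpose) */

    close(fd);
    return 0;
}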
Well, what would you want to happen? A system call is a request to the system. If you ask "when does the ferry to München leave?", would you like the program to crash, or to get a return of -1 with errno = ENOHARBOR? If you ask the system to put your car into your handbag, would you like to have your handbag destroyed, or a return of -1 with errno set to EBAGTOOSMALL?
There is a technical detail: arguments have to be converted (copied) between user land and kernel land when entering and leaving the system call. Mostly for security reasons, the system is very reluctant to write into user space. (Linux has a copy_to_user function for this, and copy_from_user for the other direction, which validate the user-space addresses before doing the actual copying.)
Why? Doesn't this just hide memory bugs?
On the contrary. It allows your program to handle the error (impossible in this case) and terminate gracefully. But the program must check the return value from system calls and inspect errno. In the case of SIGSEGV there is very little for your program to do, so mapping EFAULT to SIGSEGV would be a bad idea.
System calls were designed to always return (or block indefinitely...), whether they succeed or fail.
And a technical aspect could be that {segmentation faults, bus errors, floating point exceptions, ...} are (often) generated by hardware exceptions.
How can I detect hung processes in Linux using C?
Under Linux, the way to do this is by examining the contents of /proc/[PID]/*; a good one-stop location is /proc/[PID]/status. Its first two lines are:
Name: [program name]
State: R (running)
Of course, detecting hung processes is an entirely separate issue.
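A minimal sketch of reading that file from C (Linux assumed; the PID below is just an example value):

#include <stdio.h>

int main(void)
{
    int pid = 1;                    /* example PID; use the one you care about */
    char path[64], line[256];

    snprintf(path, sizeof path, "/proc/%d/status", pid);

    FILE *f = fopen(path, "r");
    if (!f) {
        perror("fopen");
        return 1;
    }

    /* Print the first two lines, which are Name: and State:. */
    for (int i = 0; i < 2 && fgets(line, sizeof line, f); i++)
        fputs(line, stdout);

    fclose(f);
    return 0;
}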
/proc/[PID]/stat is a more machine-readable format of the same info as /proc/[PID]/status, and is, in fact, what the ps(1) command reads to produce its output.
Monitoring and/or killing a process is just a matter of system calls. I'd think the toughest part of your question would really be reliably determining that a process is "hung", rather than merely very busy (or waiting for a temporary condition).
In the general case, I'd think this would be rather difficult. Even Windows asks for a decision from the user when it thinks a program might be "hung" (on my system it is often wrong about that, too).
However, if you have a specific program that likes to hang in a specific way, I'd think you ought to be able to reliably detect that.
Seeing as the question has changed:
http://procps.sourceforge.net/
is the source of ps and other process tools. They do indeed use /proc (indicating it is probably the conventional and best way to read process information). Their source is quite readable; the file
/procps-3.2.8/proc/readproc.c
is a good place to start.
You can also link your program against libproc, which should be available in your repo (or already installed, I would say), but you will need the "-dev" variant for the headers and whatnot. Using this API you can read process information and status.
You can use the psState() function through libproc to check for things like
#define PS_RUN 1 /* process is running */
#define PS_STOP 2 /* process is stopped */
#define PS_LOST 3 /* process is lost to control (EAGAIN) */
#define PS_UNDEAD 4 /* process is terminated (zombie) */
#define PS_DEAD 5 /* process is terminated (core file) */
#define PS_IDLE 6 /* process has not been run */
In response to comment
IIRC, unless your program is on the CPU and you can prod it from within the kernel with signals... you can't really tell how responsive it is. Even then, after the trap, a signal handler is called, which may run fine even in a hung state.
Best bet is to schedule another process on another core that can poke the process in some way while it is running (or in a loop, or non-responsive). But I could be wrong here, and it would be tricky.
Good Luck
You may be able to use the mechanism strace uses (ptrace(2)) to determine what system calls the process is making. Then you could determine which system calls you end up stuck in for things like pthread_mutex deadlocks, or whatever... You could then use a heuristic approach and just decide that if a process has been hung on a lock system call for more than 30 seconds, it's deadlocked.
You can run 'strace -p <pid>' on a process's PID to determine what (if any) system calls it is making. If a process is not making any system calls but is using CPU time, then it is either hung or running in a tight calculation loop inside userspace. You'd really need to know the expected behaviour of the individual program to know for sure. If it is not making system calls and is not using CPU, it could also just be idle or deadlocked.
The only bulletproof way to do this is to modify the program being monitored to either send a 'ping' every so often to a 'watchdog' process, or to respond to a ping request when asked, e.g., a socket connection where you can ask it "Are you alive?" and get back "Yes". The program can be coded in such a way that it is unlikely to send the ping if it has gone off into the weeds somewhere and is not executing properly. I'm pretty sure this is how Windows knows a process is hung, because every Windows program has some sort of event queue where it processes a known set of messages from the operating system.
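A rough sketch of that watchdog idea (POSIX assumed, purely illustrative): the monitored child writes a heartbeat byte to a pipe from its main loop, and the parent treats a missed heartbeat as a hang.

#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
    int hb[2];
    pipe(hb);                       /* heartbeat channel: child -> parent */

    pid_t pid = fork();
    if (pid == 0) {
        /* Child (the monitored program): send a heartbeat every second.
           A real program would do this from its main work loop, so the
           heartbeat stops exactly when the loop stops making progress. */
        close(hb[0]);
        for (;;) {
            write(hb[1], "x", 1);
            sleep(1);
        }
    }

    /* Parent (the watchdog): declare the child hung if no heartbeat
       arrives within 5 seconds. */
    close(hb[1]);
    for (;;) {
        struct pollfd pfd = { .fd = hb[0], .events = POLLIN };
        int ready = poll(&pfd, 1, 5000);
        if (ready <= 0) {
            fprintf(stderr, "no heartbeat for 5s: child %d looks hung\n", (int)pid);
            break;
        }
        char c;
        read(hb[0], &c, 1);
    }
    return 0;
}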
Not necessarily a programmatic way, but one way to tell if a program is 'hung' is to break into it with gdb and pull a backtrace and see if it is stuck somewhere.