I'm working on a project that requires handling process scheduling. I tried to stop a certain process by sending it a SIGINT signal (Ctrl+C), but I found that the sleeping process doesn't wake up.
I worked around this weird issue, but I couldn't find out why the SIGINT signal failed to wake up the sleeping process.
Here's my original code (where the process gets stuck):
set_current_state(TASK_INTERRUPTIBLE);
while (!item->assigned) {
    schedule_timeout(2*HZ);
    set_current_state(TASK_INTERRUPTIBLE);
}
set_current_state(TASK_RUNNING);
So inside the while loop, while !item->assigned is still 1, the SIGINT signal doesn't do anything, even though the process's state is set to TASK_INTERRUPTIBLE.
As far as I know, a process can deal with a signal in these ways:
execute the signal's default action
block the signal by setting a signal mask (this is done using the system call sigprocmask)
assign a custom handler to the signal, executing a custom action (using the system call signal)
So in this case I assumed it would execute the default action.
I added signal_pending(current) to check whether there is a pending signal and, if there is, break out of the loop so that the process can handle it.
set_current_state(TASK_INTERRUPTIBLE);
while (!item->assigned) {
    schedule_timeout(2*HZ);
    if (signal_pending(current))
    {
        break;
    }
    set_current_state(TASK_INTERRUPTIBLE);
}
set_current_state(TASK_RUNNING);
But I still have no idea why the original loop didn't react to the SIGINT signal.
Related
Let's say we have a program in C that uses the sleep() function.
The program executes and goes to sleep. Then we type Ctrl+C to send a SIGINT signal to the process.
We know that the default action upon receipt of SIGINT is to terminate the process. We also know that the sleep() function resumes the process whenever the sleeping process receives a signal.
And my textbook says that in order to allow the sleep() function to return, we must install a SIGINT handler like this:
void handler(int sig){
    return; /* Catch the signal and return */
}
...
int main(int argc, char **argv) {
    ...
    if (signal(SIGINT, handler) == SIG_ERR) /* Install SIGINT handler */
        unix_error("signal error\n");
    ...
    sleep(1000);
}
Although the code seems straightforward, I still have questions if I want to dig deeper:
Background: When the process is sleeping and we type Ctrl+C to send SIGINT
Q1- My understanding is that the kernel sends SIGINT to the process by setting SIGINT's corresponding pending bit in the pending bit vector. Is my understanding correct?
Q2- The processor detects the existence of SIGINT, but since we overrode the handler to make it return instead of terminating the process, our handler gets executed, and then the kernel clears SIGINT's corresponding pending bit. Is my understanding correct?
Q3- Since SIGINT's corresponding pending bit is cleared, how can the sleep() function return? I think it should still be sleeping because, in theory, the sleep() function has no way of knowing about the existence of SIGINT (it has been cleared).
Q1: The kernel checks whether the process has blocked the received signal. If so, it sets the pending-signal bit in the process entry (a single bit is unreliable; on systems with reliable signals this should be a counter), so that the signal handler is called once the signal is unblocked again (see below). If the signal is not blocked, the system call prepares its return value and errno value and returns to user mode with special code installed on the program's stack that makes it call the signal handler (already in user mode) before returning from the generic syscall code. The system call returns -1 to the caller, and the errno variable is set to EINTR. This requires the process to have installed a signal handler, because the default action is to abort the process, in which case it never returns from the system call it is waiting in. Note that when one says "the kernel", the code actually executing is the system call itself, which is woken up and notified of the special condition (a signal received). The interrupted call detects that a signal handler is to be called and prepares the user stack to jump to the proper place (the signal handler in user code) before returning from the syscall() wrapper.
Q2: The pending bit is only used to record that a signal handler still has to be called, so this is not the case. In the executable part of the process, the Unix program loader installs some basic code that jumps to the signal handler before returning from the system call. This is because the signal handler has to execute in user mode (not in kernel mode), so everything happens when the system call terminates. The signal handler executed is the one for SIGINT, but the code interrupted is a system call, and nothing happens until the system call returns (with the return code and the errno variable already set).
Q3: Well, your reasoning is based on a wrong premise, namely that the signal-pending flag indicates that a signal has been received. This bit only indicates that an unprocessed signal has been marked for delivery as soon as you unblock it, and that only happens in another system call (the one that unblocks the signal). As soon as the signal is unblocked, the return path of the sigsetmask(2) syscall executes the signal handler. In this case, the signal is delivered to the process as soon as the timer elapses; the system call is interrupted and, if you have not installed a signal handler for the SIGALRM signal (but the sleep(3) implementation does this, or at least old implementations did), the program is aborted.
NOTE
When I say that the program is aborted by the kernel, note that in both cases the signals involved (SIGINT and SIGALRM) do not make it dump a core file. The program is terminated without generating a core. This differs from the behaviour of the abort() routine, which sends SIGABRT and so makes the kernel dump a core file of the process.
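The "pending only while blocked" point from Q3 can be observed from user space. Below is a minimal sketch, assuming a single-threaded POSIX program; the helper name pending_then_delivered() and the handler on_sigint() are made up for this illustration. It blocks SIGINT, raises it, checks that the signal shows up as pending without the handler having run, and then unblocks it so the handler is invoked.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Counts how many times the handler has run. */
static volatile sig_atomic_t handler_runs = 0;

static void on_sigint(int sig)
{
    (void)sig;
    handler_runs++;
}

/*
 * Hypothetical helper for this illustration.
 * Returns 1 if SIGINT was pending (and the handler had not run) while the
 * signal was blocked, and the handler ran exactly once after unblocking.
 */
int pending_then_delivered(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old_mask, pending;

    handler_runs = 0;
    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGINT, &sa, &old_sa) == -1)
        return 0;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    sigprocmask(SIG_BLOCK, &block, &old_mask);

    raise(SIGINT);                        /* generated, but blocked */
    sigpending(&pending);
    int was_pending = sigismember(&pending, SIGINT) == 1;
    int ran_while_blocked = handler_runs;

    sigprocmask(SIG_UNBLOCK, &block, NULL);   /* handler runs here */
    int ran_after_unblock = handler_runs;

    /* Put the caller's mask and disposition back. */
    sigprocmask(SIG_SETMASK, &old_mask, NULL);
    sigaction(SIGINT, &old_sa, NULL);

    return was_pending && ran_while_blocked == 0 && ran_after_unblock == 1;
}
```

Calling pending_then_delivered() from main() should report 1 when SIGINT was not already blocked by the caller.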
Q3- Since SIGINT's corresponding pending bit is cleared, how can the sleep() function return?
Imagine the sleep() function in the kernel as a function that:
allocates and sets fields in some kind of "timer event" structure
adds the "timer event" to a list of timer events for the timer's IRQ handler to worry about later (when the expiry time has elapsed)
moves the task from the "RUNNING" state to the "SLEEPING" state (so the scheduler knows not to give the task CPU time), causing the scheduler to do a task switch to some other task
configures return parameters for user-space (the amount of time remaining or 0 if the time expired)
figures out why the scheduler gave it CPU time again (did the time expire or was the sleep interrupted by a signal?)
potentially mangles the stack a bit (so that the kernel returns to the signal handler if the sleep() was interrupted by a signal instead of returning to the code that called sleep())
returns to user-space
Also imagine that there's a second function (that I'm going to call wake() for no particular reason) that:
removes the "timer event" from the list of timer events (so the timer's IRQ handler no longer has to worry about it)
moves the task from the "SLEEPING" state to the "READY TO RUN" state (so the scheduler knows that the task can be given CPU time again)
Naturally, if the timer's IRQ handler notices that the "timer event" has expired then the timer's IRQ handler would call the wake() function to wake the task up again.
Now imagine there's a third function (that I'm going to call send_signal()) which might be called by other functions (e.g. called by kill()). This function might set a "pending signal" flag for the task that's supposed to receive the signal, then check what state the receiving task is in; and if the receiving task is in the "SLEEPING" state it calls the wake() function to wake it up (and then lets the latter part of the sleep() function worry about delivering the signal back to user-space whenever the scheduler feels like giving the task CPU time later).
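To make the imagined sleep() / wake() / send_signal() interplay concrete, here is a toy user-space model of it. This is only a sketch of the description above, not how any real kernel is written: every name (toy_task, toy_sleep_begin(), toy_timer_tick() and so on) is invented for the illustration, and a "tick" stands in for the timer's IRQ.

```c
#include <stdbool.h>

/* Toy task states named after the description (not real kernel states). */
enum toy_state { TOY_RUNNING, TOY_SLEEPING, TOY_READY };

struct toy_task {
    enum toy_state state;
    bool timer_armed;      /* a "timer event" is queued for this task */
    bool signal_pending;   /* the "pending signal" flag */
    unsigned remaining;    /* ticks left until the timer expires */
};

/* First half of sleep(): arm the timer and mark the task as sleeping. */
void toy_sleep_begin(struct toy_task *t, unsigned ticks)
{
    t->timer_armed = true;
    t->remaining = ticks;
    t->state = TOY_SLEEPING;
}

/* wake(): drop the timer event and make the task runnable again. */
void toy_wake(struct toy_task *t)
{
    t->timer_armed = false;
    t->state = TOY_READY;
}

/* Timer IRQ: count down and wake the task once the time has expired. */
void toy_timer_tick(struct toy_task *t)
{
    if (!t->timer_armed)
        return;
    if (t->remaining > 0)
        t->remaining--;
    if (t->remaining == 0)
        toy_wake(t);
}

/* send_signal(): set the pending flag and wake a sleeping task early. */
void toy_send_signal(struct toy_task *t)
{
    t->signal_pending = true;
    if (t->state == TOY_SLEEPING)
        toy_wake(t);
}

/*
 * Second half of sleep(): runs once the scheduler picks the task again.
 * Returns the remaining ticks (0 if the timer expired) and reports through
 * *interrupted whether a signal handler should run before user code resumes.
 */
unsigned toy_sleep_finish(struct toy_task *t, bool *interrupted)
{
    t->state = TOY_RUNNING;
    *interrupted = t->signal_pending;
    t->signal_pending = false;   /* the signal is being delivered now */
    return t->remaining;
}
```

The point of the model is that the sleep is cut short by toy_send_signal() calling toy_wake(), not by the pending flag itself; the flag is only consulted afterwards, in toy_sleep_finish(), to decide whether a handler has to run.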
Your understanding is correct.
Think about it. The process is blocked in the kernel. We need to return to user space to run the handler. How can we do that without interrupting whatever blocking kernel call was running? We only have one process/thread context to work with here. The process can't be both sleeping and running a signal handler.
The sequence is:
Process blocks in some blocking kernel call.
Signal is sent to it.
Bit is set, process is made ready-to-run.
Process resumes running in kernel mode, checks for pending non-blocked signals.
Signal dispatcher is invoked.
Process context is modified to execute signal handler upon resumption.
Process is resumed in user space.
Signal handler runs.
Signal handler returns.
Kernel is invoked by end of signal handler.
Kernel makes decision whether to resume system call or return interruption error.
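The sequence above can be watched from user space with a small experiment. The following is a sketch under a few assumptions: a POSIX system, a handler installed with sigaction() without SA_RESTART, and a child process standing in for the Ctrl+C by sending SIGINT after roughly 200 ms. The helper name sleep_until_sigint() is invented for the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Empty handler: its only job is to replace the default "terminate". */
static void on_sigint(int sig)
{
    (void)sig;
}

/*
 * Hypothetical helper: sleeps for up to `secs` seconds while a child
 * process sends SIGINT to us after about 200 ms.
 * Returns what sleep() reported, i.e. the number of unslept seconds.
 */
unsigned sleep_until_sigint(unsigned secs)
{
    struct sigaction sa, old_sa;
    struct timespec delay = { .tv_sec = 0, .tv_nsec = 200 * 1000 * 1000 };

    sa.sa_handler = on_sigint;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGINT, &sa, &old_sa) == -1)
        return 0;

    pid_t child = fork();
    if (child == -1) {
        sigaction(SIGINT, &old_sa, NULL);
        return 0;
    }
    if (child == 0) {
        /* Child: wait a little, then signal the sleeping parent. */
        nanosleep(&delay, NULL);
        kill(getppid(), SIGINT);
        _exit(0);
    }

    /* Parent: blocks here; the signal makes the task runnable again,
       the handler runs, and sleep() returns early. */
    unsigned left = sleep(secs);

    waitpid(child, NULL, 0);
    sigaction(SIGINT, &old_sa, NULL);
    return left;
}
```

With secs set to 5, the call should come back after a fraction of a second with a nonzero remainder, which is sleep()'s way of reporting that it was interrupted.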
I'm writing a program that uses threads. I also have a SIGINT handler that closes everything correctly for an orderly shutdown. However, since my threads are in a while(1) loop, the pthread_join call in my handler gets stuck and I have to press Ctrl+C a bunch of times to close each thread individually. How can I do this with just one Ctrl+C?
Here's my thread worker function:
void *worker(){
    struct message msg;
    while(1){
        if(wr.fnode != NULL){
            sem_wait(&tsem);
            stats->nptri++;
            msg.patient = *(wr.fnode);
            wr_deletefnode();
            sem_post(&tsem);
            sleep((float)(msg.patient.ttime/1000));
            msgsnd(mqid,&msg,sizeof(msg)-sizeof(long),0);
        }
    }
}
It depends on how you are sending the signal (SIGINT or any other) to a thread. To send a signal to a specific thread, use pthread_kill() instead of kill(): kill() targets the process as a whole, and handlers installed with signal() or sigaction() are shared by all threads of the process rather than belonging to one thread.
int pthread_kill(pthread_t thread, int sig);
If you try to kill a running thread using the kill command or function, the OS may print a warning like Warning: Program '/bin/bash' crashed.
Observe the running threads using ps -eL | grep pts/0 before and after sending the signal.
I hope you got something.
Two things you need to solve:
How to break the endless while loop
How to return from a blocking system call (for example sem_wait())
Referring 1.:
Change
while (1)
to be
while (!exit_flag)
Define exit_flag globally
volatile sig_atomic_t exit_flag = 0;
and set it to 1 on reception of the SIGINT signal.
To catch the signal, do not set up a signal handler; instead, create a separate thread that uses sigwait() to wait for and receive SIGINT.
Make sure all other threads are created with SIGINT blocked.
Referring 2.:
Keep track of all running threads by recording their thread IDs.
After the signal-listener thread mentioned under 1. has set the exit_flag, loop over the list of running threads and, one by one,
enable them to receive SIGINT
signal them using pthread_kill(); if they were stuck inside a system call, the call returns with errno set to EINTR (this requires a handler to be installed for the signal, since with the default disposition SIGINT would terminate the whole process).
join the thread by calling pthread_join().
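Here is a minimal sketch of the sigwait()-based shutdown described above, with two deliberate deviations that are my assumptions rather than part of the answer: the flag is a C11 atomic_int instead of volatile sig_atomic_t (it is written by a thread, not by a handler), and blocked workers are released with sem_post() rather than pthread_kill(), which sidesteps the case where a signal arrives just before a worker blocks. The names run_and_shutdown(), sigint_listener() and NUM_WORKERS are invented, and error checks are omitted for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <unistd.h>

#define NUM_WORKERS 3          /* illustrative pool size */

static atomic_int exit_flag;   /* set once SIGINT has been received */
static sem_t work_sem;         /* workers block here waiting for work */

static void *worker(void *arg)
{
    (void)arg;
    while (!atomic_load(&exit_flag)) {
        sem_wait(&work_sem);   /* blocks until work or shutdown is posted */
        /* ... a real worker would process one item here ... */
    }
    return NULL;
}

/* Signal-listener thread: the only place where SIGINT is received. */
static void *sigint_listener(void *arg)
{
    sigset_t *set = arg;
    int sig;

    sigwait(set, &sig);
    atomic_store(&exit_flag, 1);
    for (int i = 0; i < NUM_WORKERS; i++)
        sem_post(&work_sem);   /* release every worker stuck in sem_wait() */
    return NULL;
}

/* Returns the number of worker threads joined after a single SIGINT. */
int run_and_shutdown(void)
{
    pthread_t workers[NUM_WORKERS], listener;
    sigset_t set, old;
    int joined = 0;

    atomic_store(&exit_flag, 0);
    sem_init(&work_sem, 0, 0);

    /* Block SIGINT first so that every thread created below inherits it. */
    sigemptyset(&set);
    sigaddset(&set, SIGINT);
    pthread_sigmask(SIG_BLOCK, &set, &old);

    pthread_create(&listener, NULL, sigint_listener, &set);
    for (int i = 0; i < NUM_WORKERS; i++)
        pthread_create(&workers[i], NULL, worker, NULL);

    kill(getpid(), SIGINT);    /* stands in for a single Ctrl+C */

    pthread_join(listener, NULL);
    for (int i = 0; i < NUM_WORKERS; i++)
        if (pthread_join(workers[i], NULL) == 0)
            joined++;

    pthread_sigmask(SIG_SETMASK, &old, NULL);
    sem_destroy(&work_sem);
    return joined;
}
```

In a real program the kill() line would be dropped and the main thread would simply join the listener, so one Ctrl+C from the terminal ends all workers.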
I'm currently learning about signals in computer systems and I'm stuck on a problem. The code is given below:
#include <signal.h>
#include <stdio.h>
#include <unistd.h>

int i = 0;

void handler(int s) {
    if (!i) kill(getpid(), SIGINT);
    i++;
}

int main() {
    signal(SIGINT, handler);
    kill(getpid(), SIGINT);
    printf("%d\n", i);
    return 0;
}
The solution says that the possible outputs are 0, 1, or 2. I understand that these are possible, but why not 3, 4, or other values?
For example, we send SIGINT in the main function. The handler receives the SIGINT signal and sends SIGINT because i is zero. Before it reaches the increment, the handler might receive that SIGINT and send SIGINT once more, since this happens before the increment (i is still 0). This could repeat again and again, and the program might print 3, 4, 5, or even bigger numbers.
Historically, lots of details about how signals work have changed.
For instance, in the earliest variant, the processing of the signal reverted to default when the handler was called, and the handler had to re-establish itself. In this situation, sending the signal from the handler would kill the process.
Currently, it is often the case that while a handler is called for a particular signal, that signal is blocked. That means that the handler won't be called right then, but it will be called when the signal gets unblocked. Since there is no memory of how often the signal was sent, some of them may be "lost".
See POSIX <signal.h>, Signal Concepts, signal() and sigaction().
This is because a signal is usually blocked while its handler is running. So, in your example, the kill inside the handler can't take effect at that point. You must wait until the handler returns before the signal can be caught again.
In theory, you can get 0 because it is unspecified when a signal is delivered. So it is possible that you send the signal in main and execute the printf before it is delivered.
You can get 1 because, in general, signal delivery occurs at the end of a system call or at the beginning or end of a time quantum. So in your case, just after sending the signal you return to user space and the signal is delivered, which runs the handler, increments i, and then returns to the normal flow of execution, which prints.
You can get 2 because, when returning from the handler, the signal is unblocked and then delivered a second time. This is the most common case.
You can't get more than 2 because you set a condition for this: when i != 0 you don't send the signal again, so it can't be sent more than 2 times.
Beware that there is no "recursion" here...
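The bound of 2 can be checked with a small variation of the program from the question. This is a sketch, assuming the handler is installed with sigaction() and without SA_NODEFER, so that SIGINT stays blocked while the handler runs; count_handler_calls() is a made-up helper name.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>
#include <unistd.h>

/* Number of times the handler has been entered. */
static volatile sig_atomic_t calls = 0;

static void handler(int sig)
{
    /* SIGINT is blocked while this handler runs, so the signal sent here
       only becomes pending; it is delivered after the handler returns. */
    if (calls == 0)
        kill(getpid(), sig);
    calls++;
}

/*
 * Hypothetical helper: sends SIGINT to the current process once and
 * returns how many handler invocations have completed by the time
 * kill() returns, or -1 if the handler could not be installed.
 * The handler is left installed afterwards to keep the sketch short.
 */
int count_handler_calls(void)
{
    struct sigaction sa;

    calls = 0;
    sa.sa_handler = handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;             /* no SA_NODEFER: signal blocked in handler */
    if (sigaction(SIGINT, &sa, NULL) == -1)
        return -1;

    kill(getpid(), SIGINT);
    return calls;                /* what the question's printf would show */
}
```

On Linux this typically returns 2, but in every case the result stays within 0 to 2, because the second kill() is only issued while calls is still 0 and cannot re-enter the handler.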
I have a signal handler for SIGABRT. When the signal is received, I need some more time for other threads to exit gracefully. Then I will call _exit() inside the signal handler to exit the entire process.
But I am not sure how to wait inside a signal handler. I think there are some limitations on using sleep inside a signal handler. I don't want to use a busy wait.
Can somebody suggest any ideas, please?
Nothing stops you from executing any kind of code inside a signal handler, but POSIX only guarantees that functions on its async-signal-safe list can be called safely from one. You have to keep in mind that the handler, or another signal arriving while your handler is executing, may interrupt the program in the middle of half-finished functions, locked mutexes, or other things that had better remain uninterrupted.
Normally you would have the signal handler set a flag signalling to all threads to nicely exit, return from the signal handler and then have your code gracefully exit.
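A minimal sketch of that flag pattern follows. The names shutdown_requested, install_shutdown_handler() and shutdown_was_requested() are invented for the illustration; the handler does nothing except set a volatile sig_atomic_t flag, and all real cleanup happens in ordinary code that polls it. If several threads poll the flag, a C11 atomic would be a more robust choice than sig_atomic_t.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/* Set by the handler, polled by normal code. */
static volatile sig_atomic_t shutdown_requested = 0;

/* The handler only records the request; it does no cleanup itself. */
static void request_shutdown(int sig)
{
    (void)sig;
    shutdown_requested = 1;
}

/* Installs the flag-setting handler for `signo`; returns 0 on success. */
int install_shutdown_handler(int signo)
{
    struct sigaction sa;

    sa.sa_handler = request_shutdown;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    return sigaction(signo, &sa, NULL);
}

/* Polled from the main loop and from worker loops. */
int shutdown_was_requested(void)
{
    return shutdown_requested != 0;
}
```

A main loop would then look like while (!shutdown_was_requested()) { ... }, followed by joining the worker threads and a normal exit, so nothing has to wait inside the handler.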
I have noticed that on my copy of FreeBSD 9 the man page for sem_wait from semaphore.h does not list an EINTR error return value. I currently have some code that has a signal handler, and I am raising a SIGINT signal. This does not seem to wake up my sem_wait() so that I can check the return value; thus the thread that is running the function with the sem_wait hangs indefinitely.
According to the Linux man page, I should be able to raise the signal and test for the EINTR value in the thread that is doing the sem_wait, but that seems to be missing in FreeBSD.
What is the right way of fixing this?
In pseudocode, here is what I have:
signal_handler() //handles SIGINT
{
    loopvar = 0;
}
thread 1:
while(loopvar)
{
    if((r = sem_wait()))
    {
        check error value
        continue
    }
    ..
    sem_post()
}
thread 2:
raise(SIGINT);
So I was expecting that when thread 2 raises SIGINT, it would cause sem_wait to return with a value and the loop would continue, but now loopvar would be zero, so I would exit my infinite loop.
Edit: to be clear, I am not using the SA_RESTART flag.
raise raises the signal for the calling thread, not for the process. If you want to signal the whole process (with delivery to a random thread that has the signal unmasked), you need the kill function. If you want to signal a specific thread, you need pthread_kill.
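To illustrate the pthread_kill() route, here is a sketch in which one thread blocks in sem_wait() and another interrupts it. Several things are assumptions of the example rather than part of the question: it uses SIGUSR1 with a no-op handler installed without SA_RESTART, the helper name interrupt_sem_wait() is made up, and the EINTR result is what Linux with glibc reports (other systems may differ).

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <signal.h>
#include <stdatomic.h>
#include <time.h>

static sem_t never_posted;       /* nobody posts, so sem_wait() blocks */
static atomic_int waiter_done;   /* set once sem_wait() has returned */
static int waiter_errno;         /* errno seen by the waiting thread */

/* No-op handler: its only purpose is to interrupt the blocking call. */
static void on_sigusr1(int sig)
{
    (void)sig;
}

static void *waiter(void *arg)
{
    (void)arg;
    if (sem_wait(&never_posted) == -1)
        waiter_errno = errno;
    atomic_store(&waiter_done, 1);
    return NULL;
}

/* Returns the errno value the blocked thread got from sem_wait(). */
int interrupt_sem_wait(void)
{
    struct sigaction sa;
    pthread_t tid;
    struct timespec pause = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

    sa.sa_handler = on_sigusr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;                      /* no SA_RESTART */
    sigaction(SIGUSR1, &sa, NULL);

    waiter_errno = 0;
    atomic_store(&waiter_done, 0);
    sem_init(&never_posted, 0, 0);
    pthread_create(&tid, NULL, waiter, NULL);

    /* Re-send until the waiter reports back: a signal that arrives before
       the thread has entered sem_wait() would otherwise go unnoticed. */
    while (!atomic_load(&waiter_done)) {
        pthread_kill(tid, SIGUSR1);       /* targets that thread only */
        nanosleep(&pause, NULL);
    }

    pthread_join(tid, NULL);
    sem_destroy(&never_posted);
    return waiter_errno;
}
```

Replacing pthread_kill(tid, SIGUSR1) with raise(SIGUSR1) would run the handler in the calling thread instead, and the waiter would stay blocked, which matches the hang described in the question.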