I'm writing a program that forks two children to do some work. Each child sends a different signal to its parent when its job is done. Meanwhile, the parent waits for its children with two calls to pause().
However, the program gets past the first pause() and then blocks at the second pause(), waiting for another signal. Using gdb, I found that both signals from the children are received, but only one pause() returns.
What is the cause of this problem?
In main:
struct sigaction parent_act;
struct sigaction child_act[2];
// set the signal handlers
parent_act.sa_handler = &p0_handler;
child_act[0].sa_handler = &p1_handler;
child_act[1].sa_handler = &p2_handler;
// set the behavior when child get signal SIGUSR1 and SIGUSR2
sigaction(SIGUSR1, &child_act[0], NULL);
sigaction(SIGUSR2, &child_act[1], NULL);
// fork two child
for(i = 0; i < 2; i++){
pid[i] = fork();
// child process
if(pid[i] == 0){
pause(); // wait for signal
return 0;
}
}
// set the behavior when parent get signal SIGUSR1 and SIGUSR2
sigaction(SIGUSR1, &parent_act, NULL);
sigaction(SIGUSR2, &parent_act, NULL);
kill(pid[0], SIGUSR1); // signal the child to do its job
kill(pid[1], SIGUSR2); // signal the other child to do its job
pause(); // wait for child
pause(); // wait for child
In handlers:
void p0_handler(int dummy)
{
return;
}
void p1_handler(int dummy)
{
// do something
kill(getppid(), SIGUSR1); // tell parent it's done
return;
}
void p2_handler(int dummy)
{
// do something
kill(getppid(), SIGUSR2); // tell parent it's done
return;
}
First child sends SIGUSR1 to parent and the second one sends SIGUSR2. It seems the first pause() received two signals. Is that possible?
Most likely both signals arrive at nearly the same time (or the second one arrives while the handler for the first is still running). pause() returns only once after a handler has run, so both signals are consumed by the first pause() and the second pause() waits indefinitely.
The problem is that even if you set a flag such as "one child finished" in the handler, there is no guarantee that the other signal will not arrive after you check the flag but before pause() is called.
One way to make this robust with signals is to count them in the handler and poll the counter using select with a timeout.
volatile int num_signals_received = 0; // volatile: modified asynchronously by the handler
void p0_handler(int dummy)
{
// signal safe action
__sync_fetch_and_add(&num_signals_received, 1);
return;
}
Then, instead of pause:
while (num_signals_received < 2) {
struct timeval timeout = {1, 0}; // 1 second; reset on every pass because select may modify it
select(0, NULL, NULL, NULL, &timeout);
}
If you are not set on using signals for synchronisation, I would advise moving to either shared memory (mmap with MAP_SHARED) or file-descriptor-based synchronisation (pipe(2)).
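The pipe-based alternative mentioned above could look roughly like the following. This is only a sketch: the helper name run_children_with_pipe is made up for illustration, and the children here do no real work. Each child writes one byte when it is done, and the parent reads one byte per child, so a report cannot be merged or lost the way a standard signal can.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork `nchildren` children, have each one report
 * completion by writing one byte to a shared pipe, and return the number
 * of reports the parent read (or -1 if the pipe could not be created). */
int run_children_with_pipe(int nchildren)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    int started = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;                      /* stop forking; wait for those started */
        if (pid == 0) {                 /* child */
            close(fds[0]);
            /* ... the child's real work would go here ... */
            char done = 1;
            ssize_t w = write(fds[1], &done, 1);
            _exit(w == 1 ? 0 : 1);
        }
        started++;
    }
    close(fds[1]);                      /* parent keeps only the read end */

    int reports = 0;
    char byte;
    while (reports < started) {
        ssize_t r = read(fds[0], &byte, 1);
        if (r == 1)
            reports++;                  /* one child finished */
        else if (r == 0 || errno != EINTR)
            break;                      /* EOF or a real error */
    }
    close(fds[0]);

    while (waitpid(-1, NULL, 0) > 0)    /* reap the children */
        ;
    return reports;
}
```

Calling run_children_with_pipe(2) from main would play the role of the two pause() calls in the question: the parent blocks in read() until both children have reported.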
In this example from the CSAPP book chap.8:
#include "csapp.h"
/* WARNING: This code is buggy! */
void handler1(int sig)
{
int olderrno = errno;
if ((waitpid(-1, NULL, 0)) < 0)
sio_error("waitpid error");
Sio_puts("Handler reaped child\n");
Sleep(1);
errno = olderrno;
}
int main()
{
int i, n;
char buf[MAXBUF];
if (signal(SIGCHLD, handler1) == SIG_ERR)
unix_error("signal error");
/* Parent creates children */
for (i = 0; i < 3; i++) {
if (Fork() == 0) {
printf("Hello from child %d\n", (int)getpid());
exit(0);
}
}
/* Parent waits for terminal input and then processes it */
if ((n = read(STDIN_FILENO, buf, sizeof(buf))) < 0)
unix_error("read");
printf("Parent processing input\n");
while (1)
;
exit(0);
}
It generates the following output:
......
Hello from child 14073
Hello from child 14074
Hello from child 14075
Handler reaped child
Handler reaped child //more than one child reaped
......
The if block around waitpid() is there to demonstrate a mistake: a single waitpid() call per handler invocation is not able to reap all children. I understand that waitpid() should be put in a while() loop to ensure that all children are reaped. What I don't understand is why, with only one waitpid() call in the handler, more than one child was reaped (note that in the output the handler reaps more than one child). According to this answer, "Why does waitpid in a signal handler need to loop?", waitpid() is only able to reap one child.
Thanks!
Update:
This is not directly relevant, but the handler is corrected in the following way (also taken from the CSAPP book):
void handler2(int sig)
{
int olderrno = errno;
while (waitpid(-1, NULL, 0) > 0) {
Sio_puts("Handler reaped child\n");
}
if (errno != ECHILD)
Sio_error("waitpid error");
Sleep(1);
errno = olderrno;
}
I am running this code on my Linux computer.
The signal handler you installed runs every time its signal (SIGCHLD in this case) is delivered. While it is true that waitpid is only executed once per delivery, the handler still executes it multiple times because it gets called each time a child terminates.
Child n terminates (SIGCHLD), the handler springs into action and uses waitpid to "reap" the child that just exited.
Child n+1 terminates and the same thing happens. This goes on for every child there is.
As long as each SIGCHLD is delivered separately, there is no need to loop, because the handler is called once per terminated child.
Edit: As pointed out below, the reason the book later corrects it with a loop is that if multiple children terminate at about the same time, the handler may end up being called for only one of them.
signal(7):
Standard signals do not queue. If multiple instances of a
standard signal are generated while that signal is blocked, then
only one instance of the signal is marked as pending (and the
signal will be delivered just once when it is unblocked).
Looping on waitpid ensures that all exited children are reaped, not just one of them as is the case right now.
Why does looping solve the issue of multiple signals?
Picture this: you are inside the handler, handling a SIGCHLD, and meanwhile more children terminate. Their SIGCHLD signals cannot queue up; at most one is left pending. By calling waitpid in a loop, you make sure that even though the handler is not invoked once per signal, every terminated child is still reaped, because waitpid keeps being called until nothing is left, rather than once per handler invocation, which may or may not work as intended depending on whether signals have been merged.
waitpid still stops correctly once there are no more children to reap: it returns -1 with errno set to ECHILD. It is important to understand that the loop is only there to catch children whose SIGCHLD was merged while you were already in the signal handler. A child that terminates during normal code execution triggers the handler as usual.
If you are still in doubt, try reading these two answers to your question.
How to make sure that `waitpid(-1, &stat, WNOHANG)` collect all children processes
Why does waitpid in a signal handler need to loop? (first two paragraphs)
The first one uses flags such as WNOHANG, but this only makes waitpid return immediately instead of waiting, if there is no child process ready to be reaped.
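To see the merging and the loop together, here is a minimal sketch. The helper name spawn_and_reap and the counter reaped are invented for this example, and the children exit immediately. SIGCHLD is kept blocked while the children are forked, so several terminations can collapse into one pending signal; the handler's WNOHANG loop still reaps every child.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t reaped;    /* children reaped so far */

static void sigchld_handler(int sig)
{
    (void)sig;
    int saved_errno = errno;
    /* Reap every child that has already exited, without blocking. */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;
    errno = saved_errno;
}

/* Hypothetical helper: fork `nchildren` children that exit at once and
 * return how many of them the SIGCHLD handler reaped (-1 on error). */
int spawn_and_reap(int nchildren)
{
    struct sigaction sa;
    sigset_t block, old, waitmask;

    sa.sa_handler = sigchld_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    /* Block SIGCHLD so that several exits may merge into one pending signal. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);

    reaped = 0;
    int started = 0;
    for (int i = 0; i < nchildren; i++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            _exit(0);                   /* child: terminate immediately */
        started++;
    }

    /* Wait with SIGCHLD unblocked until the handler has reaped them all. */
    waitmask = old;
    sigdelset(&waitmask, SIGCHLD);
    while (reaped < started)
        sigsuspend(&waitmask);

    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped;
}
```

With a single waitpid call in the handler instead of the loop, this function could hang, since a merged SIGCHLD would reap only one of the children.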
I'm trying to understand how blocking and unblocking signals work and I'm trying to understand the following piece of code. Specifically I am looking at line 28 (commented in the code): int a = sigprocmask(SIG_UNBLOCK, &mask, NULL);, aka where the signal is unblocked in the child.
The textbook I got the code from says that the code uses signal blocking in order to ensure that the program performs its add function (simplified to printf("adding %d\n", pid);) before its delete function (simplified to printf("deleting %d\n", pid);). This makes sense to me; by blocking the SIGCHLD signal, then unblocking it after we perform the add function, we ensure that handler isn't called until we perform the add function. However, why would we unblock the signal in the child? Doesn't that just eliminate the whole point of blocking by immediately unblocking it, allowing the child to delete before the parent adds?
However, the output (shown after the code) is identical whether or not that line is commented out, so that is clearly not what happens. The textbook states:
"Notice that children inherit the blocked set of their parents, so we must be careful to unblock the SIGCHLD signal in the child before calling execve."
But that still seems to me like the unblocking would result in the handler being called. What exactly does this line do?
void handler(int sig) {
pid_t pid;
printf("here\n");
while ((pid = waitpid(-1, NULL, 0)) > 0); /* Reap a zombie child */
printf("deleting %d\n", pid); /* Delete the child from the job list */
}
int main(int argc, char **argv) {
int pid;
sigset_t mask;
signal(SIGCHLD, handler);
sigemptyset(&mask);
sigaddset(&mask, SIGCHLD);
sigprocmask(SIG_BLOCK, &mask, NULL); /* Block SIGCHLD */
pid = fork();
if (pid == 0) {
printf("in child\n");
int a = sigprocmask(SIG_UNBLOCK, &mask, NULL); // LINE 28
printf("a is %d\n",a);
execve("/bin/date", argv, NULL);
exit(0);
}
printf("adding %d\n", pid);/* Add the child to the job list */
sleep(5);
printf("awake\n");
int b = sigprocmask(SIG_UNBLOCK, &mask, NULL);
printf("b is %d\n", b);
sleep(3);
exit(0);
}
Outputs:
adding 652
in child
a is 0
Wed Apr 24 20:18:04 UTC 2019
awake
here
deleting -1
b is 0
However, why would we unblock the signal in the child? Doesn't that
just eliminate the whole point of blocking by immediately unblocking
it, allowing the child to delete before the parent adds?
No. Each process has its own signal mask. A new process inherits its parent's signal mask, but only in the same sense that it inherits the contents of the parent's memory -- the child gets what amounts to an independent copy. Its modifications to that copy are not reflected in the parent's copy, nor vice versa after the child starts. If this were not the case, then all processes in the system would share a single signal mask.
It is only the parent that must not receive SIGCHLD too soon, so only the parent needs to have that signal blocked.
[...] The textbook states:
"Notice that children inherit the blocked set of their parents, so we must be careful to unblock the SIGCHLD signal in the child before
calling execve."
But that still seems to me like the unblocking would result in the
handler being called.
Again, "inherit" in the sense of inheriting a copy, not in the sense of sharing the same mask.
What exactly does this line do?
It unblocks SIGCHLD in the child -- again, having no effect on the parent. The signal mask is preserved across execve, so without this call /bin/date, which the child is about to exec, would start with SIGCHLD blocked, and that could interfere with its behavior.
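The independence of the two masks can be checked directly. The following is a small sketch with made-up helper names (sigchld_is_blocked, child_unblock_leaves_parent_blocked): the parent blocks SIGCHLD, the child unblocks it in its own copy and reports what it saw through its exit status, and the parent then checks that its own mask is unchanged.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Return 1 if SIGCHLD is blocked in the calling process, 0 otherwise. */
static int sigchld_is_blocked(void)
{
    sigset_t cur;
    sigprocmask(SIG_BLOCK, NULL, &cur);     /* NULL set: query only */
    return sigismember(&cur, SIGCHLD) == 1;
}

/* Hypothetical check: returns 1 if the child inherited a blocked SIGCHLD,
 * unblocked it in its own mask, and the parent's mask stayed blocked. */
int child_unblock_leaves_parent_blocked(void)
{
    sigset_t mask, old;
    sigemptyset(&mask);
    sigaddset(&mask, SIGCHLD);
    sigprocmask(SIG_BLOCK, &mask, &old);    /* parent blocks SIGCHLD */

    pid_t pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0) {                         /* child */
        int inherited = sigchld_is_blocked();       /* copy of parent's mask */
        sigprocmask(SIG_UNBLOCK, &mask, NULL);      /* changes the child only */
        int after = sigchld_is_blocked();
        _exit((inherited && !after) ? 0 : 1);
    }

    int status;
    waitpid(pid, &status, 0);
    int parent_blocked = sigchld_is_blocked();      /* unaffected by the child */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return parent_blocked && WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```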
I'm writing a process related program in C and I'm having a small problem waking up a process:
I have a parent process that I put to sleep with waitpid(), but I need it to carry on either when its children complete or when a certain time is reached. My plan was to call alarm(timeout) and then call waitpid(-1,&status,0), so the process would wait until a child finished, and if the child didn't finish within the timeout, a signal would be sent and the parent would exit after killing the child. The issue I'm having is that this alarm() call just prints "Alarm clock" to the console, and it doesn't seem to be waking up the parent in time. Thanks!
You need to install a signal handler for SIGALRM. alarm() sends a SIGALRM signal when it expires, and if you don't handle that signal it will terminate your process.
static volatile sig_atomic_t g_timeout; // written from the signal handler
void alrm_handler(int signo)
{
g_timeout = 1;
}
And in your main code, e.g.:
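// sigaction without SA_RESTART: waitpid() then fails with EINTR when the alarm fires
// (on glibc, signal() would make waitpid() restart instead)
struct sigaction sa;
sa.sa_handler = alrm_handler;
sigemptyset(&sa.sa_mask);
sa.sa_flags = 0;
sigaction(SIGALRM, &sa, NULL);
alarm(10);
pid_t p = waitpid(-1,&status,0);
if (p == -1) {
if (errno == EINTR && g_timeout) {
//timeout occurred
} else {
//other error
}
}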
...
signal( SIGUSR1, sigusr);
bla = 0;
for(;;)
{
if(pid=fork()==0)
{
fprintf(stderr,"Child %d: Waiting in queue.\n",getpid());
pause();
fprintf(stderr,"im here"); //can't get to this line
kill(deque(q),SIGUSR1);
_exit(0);
}
else
{
sem_wait(&q2);
enque(q,pid);
sem_post(&q2);
if(!bla)
{
bla=1;
sem_wait(&q2);
kill(deque(q),SIGUSR1);
sem_post(&q2);
}
sleep(n);
}
}
...
void sigusr()
{
signal(SIGUSR1, sigusr);
fprintf(stderr, "Child %d: Got it.\n", getpid());
}
The child doesn't continue running after receiving the signal in pause(). The parent sends the signal to the first child and I get the output "Got it.", but the child never gets past pause(). After the parent sends the signal, the first child needs to send a signal to the next child, and so on.
The expression pid=fork()==0 does not work as you expect. It assigns the value of fork() == 0 to the variable pid, because the equality operator has higher precedence than assignment.
That means that pid will be either 0 (in the parent) or 1 (in the child), neither of which is a correct process identifier.
Change to (pid = fork()) == 0 instead.
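The difference can be seen without forking at all. In this small sketch, fake_fork_in_parent is a hypothetical stand-in for fork() that pretends we are the parent and the child's PID is 1234.

```c
/* Hypothetical stand-in for fork(): pretend we are the parent and the
 * newly created child has PID 1234. */
static int fake_fork_in_parent(void)
{
    return 1234;
}

/* What the question's code does: parsed as pid = (f() == 0). */
int pid_without_parens(void)
{
    int pid;
    pid = fake_fork_in_parent() == 0;   /* stores the comparison result */
    return pid;                         /* 0, not a PID */
}

/* The corrected form: assign first, then compare. */
int pid_with_parens(void)
{
    int pid;
    int is_child = (pid = fake_fork_in_parent()) == 0;
    (void)is_child;                     /* would select the child branch */
    return pid;                         /* the value the call returned */
}
```

In the original program this means the parent enqueues 0 instead of the child's PID, so the later kill() is not aimed at the child it was meant for.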
In addition to pid=fork()==0, there is a race condition: the child may receive the signal before the call to pause(), and then wait for another signal in pause().
To avoid this problem:
Block the signal using sigprocmask.
Check whether it is appropriate to wait (that is, whether the signal has already arrived).
Wait for the signal using sigsuspend.
Unblock the signal.
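The four steps above can be sketched as follows. The names wait_for_early_signal, on_usr1 and got_usr1 are invented for this example. To provoke the race on purpose, the parent reaps the child before it starts waiting, so the signal has certainly been sent by then; because it was blocked, it is still pending and sigsuspend picks it up.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_usr1;  /* set by the handler */

static void on_usr1(int sig)
{
    (void)sig;
    got_usr1 = 1;
}

/* Hypothetical demo: returns 1 if a SIGUSR1 sent before the wait began
 * was still received, -1 on error. */
int wait_for_early_signal(void)
{
    struct sigaction sa, old_sa;
    sigset_t block, old, waitmask;

    sa.sa_handler = on_usr1;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGUSR1, &sa, &old_sa) < 0)
        return -1;

    /* Step 1: block the signal before anyone can send it. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigprocmask(SIG_BLOCK, &block, &old);
    got_usr1 = 0;

    pid_t pid = fork();
    if (pid < 0) {
        sigprocmask(SIG_SETMASK, &old, NULL);
        return -1;
    }
    if (pid == 0) {                     /* child: signal the parent, then exit */
        kill(getppid(), SIGUSR1);
        _exit(0);
    }
    waitpid(pid, NULL, 0);              /* the signal has been sent by now */

    /* Steps 2 and 3: check the flag, and wait only while it is not set.
     * sigsuspend unblocks SIGUSR1 and waits in one atomic step. */
    waitmask = old;
    sigdelset(&waitmask, SIGUSR1);
    while (!got_usr1)
        sigsuspend(&waitmask);

    /* Step 4: restore the previous mask and handler. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    sigaction(SIGUSR1, &old_sa, NULL);
    return got_usr1;
}
```

With pause() in place of the sigsuspend loop and no blocking, the same sequence would hang, because the handler would already have run before pause() was called.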
I'm writing a program that uses fork to create child processes and count them when they're done.
How can I be sure I'm not losing signals?
What will happen if a child sends the signal while the main program is still handling the previous signal? Is the signal "lost"? How can I avoid this situation?
void my_prog()
{
for(i = 0; i<numberOfDirectChildrenGlobal; ++i) {
pid = fork();
if(pid > 0)//parent
//do parent thing
else if(0 == pid) //child
//do child thing
else
//exit with error
}
while(numberOfDirectChildrenGlobal > 0) {
pause(); //waiting for signal as many times as number of direct children
}
kill(getppid(),SIGUSR1);
exit(0);
}
void sigUsrHandler(int signum)
{
//re-register to SIGUSR1
signal(SIGUSR1, sigUsrHandler);
//update number of children that finished
--numberOfDirectChildrenGlobal;
}
It's recommended to use sigaction instead of signal, but neither will give you what you need here. If a child sends a signal while the previous one is still being handled, it becomes a pending signal, but if more signals of the same kind are sent while one is already pending, they are discarded. (On systems where signal() does not block the signal during the handler, further signals can also be delivered before the handler is re-established, again resulting in missed signals.) With standard signals there is no workaround for this.
What one usually does is assume that some signals may be missed, and let the handler take care of all exited children each time it runs.
In your case, instead of sending a signal from your children, just let the children terminate. Once they terminate, the parent's SIGCHLD handler should be used to reap them. Calling waitpid with the WNOHANG option in a loop ensures that the parent will catch all the children even if they all terminate at the same time.
For example, a SIGCHLD handler that counts the number of exited children can look like this:
pid_t pid;
while((pid = waitpid(-1, NULL, WNOHANG)) > 0) {
nrOfChildrenHandled++;
}
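To avoid this situation you can use the POSIX real-time signals (SIGRTMIN to SIGRTMAX). Unlike standard signals, multiple instances of a real-time signal are queued rather than merged.
Use sigaction instead of signal to register your handlers; each queued instance is then delivered, up to the system's limit on queued signals.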
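The difference between the two kinds of signal can be observed with a short experiment. This is a sketch that assumes a single-threaded process on a system with POSIX real-time signals and sigqueue (Linux, for example); the helper name count_deliveries is made up. The process queues several instances of a signal to itself while that signal is blocked, then counts how many times the handler runs.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <unistd.h>

static volatile sig_atomic_t deliveries;    /* handler invocations */

static void count_handler(int sig)
{
    (void)sig;
    deliveries++;
}

/* Hypothetical helper: send `sends` instances of `signo` to this process
 * while it is blocked, then let them through and return how many times
 * the handler ran (-1 on error). */
int count_deliveries(int signo, int sends)
{
    struct sigaction sa, old_sa;
    sigset_t block, old, waitmask, pending;
    union sigval value;

    sa.sa_handler = count_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(signo, &sa, &old_sa) < 0)
        return -1;

    sigemptyset(&block);
    sigaddset(&block, signo);
    sigprocmask(SIG_BLOCK, &block, &old);

    deliveries = 0;
    for (int i = 0; i < sends; i++) {
        value.sival_int = i;
        sigqueue(getpid(), signo, value);   /* stays pending while blocked */
    }

    /* Let pending instances through until none is left. */
    waitmask = old;
    sigdelset(&waitmask, signo);
    for (;;) {
        sigpending(&pending);
        if (sigismember(&pending, signo) != 1)
            break;
        sigsuspend(&waitmask);
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    sigaction(signo, &old_sa, NULL);
    return deliveries;
}
```

Under these assumptions, count_deliveries(SIGUSR1, 3) reports a single delivery because the three instances are merged, while count_deliveries(SIGRTMIN, 3) reports three.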