prctl(PR_SET_PDEATHSIG) race condition - c

As I understand, the best way to achieve terminating a child process when its parent dies is via prctl(PR_SET_PDEATHSIG) (at least on Linux): How to make child process die after parent exits?
There is one caveat to this mentioned in man prctl:
This value is cleared for the child of a fork(2) and (since Linux 2.4.36 / 2.6.23) when executing a set-user-ID or set-group-ID binary, or a binary that has associated capabilities (see capabilities(7)). This value is preserved across execve(2).
So, the following code has a race condition:
parent.c:
#include <unistd.h>
int main(int argc, char **argv) {
pid_t f = fork(); // fork exactly once
if (f == 0) {
execl("./child", "child", NULL, NULL);
}
return 0;
}
child.c:
#include <sys/prctl.h>
#include <signal.h>
int main(int argc, char **argv) {
prctl(PR_SET_PDEATHSIG, SIGKILL); // ignore error checking for now
// ...
return 0;
}
Namely, the parent could die before prctl() is executed in the child (and thus the child will not receive the SIGKILL). The proper way to address this is to call prctl() in parent.c, in the forked child before the exec():
parent.c:
#include <unistd.h>
#include <sys/prctl.h>
#include <signal.h>
int main(int argc, char **argv) {
pid_t f = fork(); // fork exactly once
if (f == 0) {
prctl(PR_SET_PDEATHSIG, SIGKILL); // ignore error checking for now
execl("./child", "child", NULL, NULL);
}
return 0;
}
child.c:
int main(int argc, char **argv) {
// ...
return 0;
}
However, if ./child is a setuid/setgid binary, then this trick to avoid the race condition doesn't work (exec()ing the setuid/setgid binary causes the PDEATHSIG to be lost as per the man page quoted above), and it seems like you are forced to employ the first (racy) solution.
Is there any way, if child is a setuid/setgid binary, to prctl(PR_SET_PDEATHSIG) in a non-racy way?

It is much more common to have the parent process set up a pipe. The parent process keeps the write end open (pipefd[1]), closing the read end (pipefd[0]). The child process closes the write end (pipefd[1]), and sets the read end (pipefd[0]) nonblocking.
This way, the child process can use read(pipefd[0], buffer, 1) to check if the parent process is still alive. If the parent is still running, it will return -1 with errno == EAGAIN (or errno == EINTR).
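The check described above can be sketched roughly as follows. The helper names set_nonblocking() and parent_alive() are made up for this illustration, and the sketch assumes the child has already closed its own copy of the write end:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: switch the read end of the control pipe to
 * nonblocking mode. Returns 0 on success, -1 on error. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL);
    if (flags == -1)
        return -1;
    return (fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) ? -1 : 0;
}

/* Hypothetical helper: returns 1 if some write end is still open
 * (parent alive, or it wrote data), 0 if read() reports end-of-file
 * (every write end is closed), and -1 on an unexpected error. */
int parent_alive(int fd)
{
    char buffer[8];
    ssize_t n;

    assert(fd >= 0);
    do {
        n = read(fd, buffer, sizeof buffer);
    } while (n == -1 && errno == EINTR);

    if (n > 0)
        return 1;
    if (n == 0)
        return 0;
    return (errno == EAGAIN || errno == EWOULDBLOCK) ? 1 : -1;
}
```

The child would call set_nonblocking(pipefd[0]) once, then parent_alive(pipefd[0]) whenever it wants to poll.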
Now, in Linux, the child process can also set the read end async, in which case it will be sent a signal (SIGIO by default) when the parent process exits:
fcntl(pipefd[0], F_SETSIG, desired_signal);
fcntl(pipefd[0], F_SETOWN, getpid());
fcntl(pipefd[0], F_SETFL, O_NONBLOCK | O_ASYNC);
Use a siginfo handler for desired_signal. If info->si_code == POLL_IN && info->si_fd == pipefd[0], the parent process either exited or wrote something to the pipe. Because read() is async-signal safe, and the pipe is nonblocking, you can use read(pipefd[0], &buffer, sizeof buffer) in the signal handler whether the parent wrote something, or if parent exited (closed the pipe). In the latter case, the read() will return 0.
As far as I can see, this approach has no race conditions (if you use a realtime signal, so that the signal is not lost because a user-sent one is already pending), although it is very Linux-specific. After setting the signal handler, and at any point during the lifetime of the child process, the child can always explicitly check if the parent is still alive, without affecting the signal generation.
So, to recap, in pseudocode:
Construct pipe
Fork child process
Child process:
Close write end of pipe
Install pipe signal handler (say, SIGRTMIN+0)
Set read end of pipe to generate pipe signal (F_SETSIG)
Set own PID as read end owner (F_SETOWN)
Set read end of pipe nonblocking and async (F_SETFL, O_NONBLOCK | O_ASYNC)
If read(pipefd[0], buffer, sizeof buffer) == 0,
the parent process has already exited.
Continue with normal work.
Child process pipe signal handler:
If siginfo->si_code == POLL_IN and siginfo->si_fd == pipefd[0],
parent process has exited.
To immediately die, use e.g. raise(SIGKILL).
Parent process:
Close read end of pipe
Continue with normal work.
I do not expect you to believe my word.
Below is a crude example program you can use to check this behaviour yourself. It is long, but only because I wanted it to be easy to see what is happening at runtime. To implement this in a normal program, you only need a couple of dozen lines of code. example.c:
#define _GNU_SOURCE
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
static volatile sig_atomic_t done = 0;
static void handle_done(int signum)
{
if (!done)
done = signum;
}
static int install_done(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_handler = handle_done;
act.sa_flags = 0;
if (sigaction(signum, &act, NULL) == -1)
return errno;
return 0;
}
static int deathfd = -1;
static void death(int signum, siginfo_t *info, void *context)
{
if (info->si_code == POLL_IN && info->si_fd == deathfd)
raise(SIGTERM);
}
static int install_death(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_sigaction = death;
act.sa_flags = SA_SIGINFO;
if (sigaction(signum, &act, NULL) == -1)
return errno;
return 0;
}
int main(void)
{
pid_t child, p;
int pipefd[2], status;
char buffer[8];
if (install_done(SIGINT)) {
fprintf(stderr, "Cannot set SIGINT handler: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
if (pipe(pipefd) == -1) {
fprintf(stderr, "Cannot create control pipe: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
child = fork();
if (child == (pid_t)-1) {
fprintf(stderr, "Cannot fork child process: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
if (!child) {
/*
* Child process.
*/
/* Close write end of pipe. */
deathfd = pipefd[0];
close(pipefd[1]);
/* Set a SIGHUP signal handler. */
if (install_death(SIGHUP)) {
fprintf(stderr, "Child process: cannot set SIGHUP handler: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
/* Set SIGTERM signal handler. */
if (install_done(SIGTERM)) {
fprintf(stderr, "Child process: cannot set SIGTERM handler: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
/* We want a SIGHUP instead of SIGIO. */
fcntl(deathfd, F_SETSIG, SIGHUP);
/* We want the SIGHUP delivered when deathfd closes. */
fcntl(deathfd, F_SETOWN, getpid());
/* Make the deathfd (read end of pipe) nonblocking and async. */
fcntl(deathfd, F_SETFL, O_NONBLOCK | O_ASYNC);
/* Check if the parent process is dead. */
if (read(deathfd, buffer, sizeof buffer) == 0) {
printf("Child process (%ld): Parent process is already dead.\n", (long)getpid());
return EXIT_FAILURE;
}
while (1) {
status = __atomic_fetch_and(&done, 0, __ATOMIC_SEQ_CST);
if (status == SIGINT)
printf("Child process (%ld): SIGINT caught and ignored.\n", (long)getpid());
else
if (status)
break;
printf("Child process (%ld): Tick.\n", (long)getpid());
fflush(stdout);
sleep(1);
status = __atomic_fetch_and(&done, 0, __ATOMIC_SEQ_CST);
if (status == SIGINT)
printf("Child process (%ld): SIGINT caught and ignored.\n", (long)getpid());
else
if (status)
break;
printf("Child process (%ld): Tock.\n", (long)getpid());
fflush(stdout);
sleep(1);
}
printf("Child process (%ld): Exited due to %s.\n", (long)getpid(),
(status == SIGINT) ? "SIGINT" :
(status == SIGHUP) ? "SIGHUP" :
(status == SIGTERM) ? "SIGTERM" : "Unknown signal.\n");
fflush(stdout);
return EXIT_SUCCESS;
}
/*
* Parent process.
*/
/* Close read end of pipe. */
close(pipefd[0]);
while (!done) {
fprintf(stderr, "Parent process (%ld): Tick.\n", (long)getpid());
fflush(stderr);
sleep(1);
fprintf(stderr, "Parent process (%ld): Tock.\n", (long)getpid());
fflush(stderr);
sleep(1);
/* Try reaping the child process. */
p = waitpid(child, &status, WNOHANG);
if (p == child || (p == (pid_t)-1 && errno == ECHILD)) {
if (p == child && WIFSIGNALED(status))
fprintf(stderr, "Child process died from %s. Parent will now exit, too.\n",
(WTERMSIG(status) == SIGINT) ? "SIGINT" :
(WTERMSIG(status) == SIGHUP) ? "SIGHUP" :
(WTERMSIG(status) == SIGTERM) ? "SIGTERM" : "an unknown signal");
else
fprintf(stderr, "Child process has exited, so the parent will too.\n");
fflush(stderr);
break;
}
}
if (done) {
fprintf(stderr, "Parent process (%ld): Exited due to %s.\n", (long)getpid(),
(done == SIGINT) ? "SIGINT" :
(done == SIGHUP) ? "SIGHUP" : "Unknown signal.\n");
fflush(stderr);
}
/* Reached once the loop above ends (signal caught or child reaped). */
return EXIT_SUCCESS;
}
Compile and run the above using e.g.
gcc -Wall -O2 example.c -o example
./example
The parent process will print to standard error, and the child process to standard output. The parent process will exit if you press Ctrl+C; the child process will ignore that signal. The child process uses SIGHUP instead of SIGIO (although a realtime signal, say SIGRTMIN+0, would be safer); if generated by the parent process exiting, the SIGHUP signal handler will raise SIGTERM in the child.
To make the termination causes easy to see, the child catches SIGTERM, and exits the next iteration (a second later). If so desired, the handler can use e.g. raise(SIGKILL) to terminate itself immediately.
Both parent and child processes show their process IDs, so you can easily send a SIGINT/SIGHUP/SIGTERM signal from another terminal window. (The child process ignores SIGINT and SIGHUP sent from outside the process.)

Your last code snippet still contains a race condition:
int main(int argc, char **argv) {
pid_t f = fork(); // fork exactly once
if (f == 0) {
// <- !!!race time!!!
prctl(PR_SET_PDEATHSIG, SIGKILL); // ignore error checking for now
execl("./child", "child", NULL, NULL);
}
return 0;
}
Meaning that in the child, after the fork, until the prctl() has visible effects (think: returns), there is a time-window where the parent may exit.
To fix this race you have to save the PID of the parent before the fork and check it after the prctl() call, e.g.:
pid_t ppid_before_fork = getpid();
pid_t pid = fork();
if (pid == -1) { perror(0); exit(1); }
if (pid) {
; // continue parent execution
} else {
int r = prctl(PR_SET_PDEATHSIG, SIGTERM);
if (r == -1) { perror(0); exit(1); }
// test in case the original parent exited just
// before the prctl() call
if (getppid() != ppid_before_fork)
exit(1);
// continue child execution ...
}
(see also)
Regarding executing a setuid/setgid program: You can then pass the ppid_before_fork by other means (e.g. in the argument or environment vector) and execute the prctl() (including the comparison) after the exec, i.e. inside the execed binary.
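A rough sketch of what the exec'd binary could do, assuming the forking side stored its own PID in an environment variable before the exec. The variable name PARENT_PID_HINT and both helper names are invented for this example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: parse the PID stored in environment variable
 * `name`. Returns (pid_t)-1 if it is missing or malformed. */
pid_t parent_pid_from_env(const char *name)
{
    const char *text;
    char *end;
    long value;

    assert(name != NULL);
    text = getenv(name);
    if (!text || !*text)
        return (pid_t)-1;
    errno = 0;
    value = strtol(text, &end, 10);
    if (errno || *end != '\0' || value <= 0)
        return (pid_t)-1;
    return (pid_t)value;
}

/* Hypothetical helper: ask for `sig` on parent death (0 clears the
 * setting), then compare getppid() with the expected parent.
 * Returns 1 if the parent is unchanged, 0 if the process has been
 * reparented, and -1 if prctl() failed. */
int arm_pdeathsig(int sig, pid_t expected_parent)
{
    if (prctl(PR_SET_PDEATHSIG, (unsigned long)sig) == -1)
        return -1;
    /* The comparison must come after prctl() to close the race window. */
    return (getppid() == expected_parent) ? 1 : 0;
}
```

At the top of main() in the exec'd binary, something like arm_pdeathsig(SIGTERM, parent_pid_from_env("PARENT_PID_HINT")) != 1 would then mean "exit now". Note that the environment is supplied by the caller, so this is only a hint from a cooperating parent, not a security boundary.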

I don't know this for sure, but clearing the parent death signal on execve when invoking a set-id binary looks like an intentional restriction for security reasons. I'm not sure why, considering that you can use kill to send signals to setuid programs that share your real user ID, but they wouldn't have bothered making that change in 2.6.23 if there wasn't a reason to disallow it.
Since you control the code of the set-id child, here is a kludge: make the call to prctl, then immediately afterward, call getppid and see if it returns 1. If it does, then either the process was started directly by init (which is not as rare as it used to be) or the process was reparented to init before it had a chance to call prctl, which means its original parent is dead and it should exit.
(This is a kludge because I know of no way to rule out the possibility that the process was started directly by init. init never exits, so you have one case where it should exit and one case where it shouldn't and no way to tell which. But if you know from the larger design that the process will not be started directly by init, it should be reliable.)
(You must call getppid after prctl, or you have only narrowed the race window, not eliminated it.)

Related

Reinstalling set signals in C

I'm trying to handle multiple signals with one signal handler. The expected result is for ctrl-c to exit the child process and also exit the parent process, while ctrl-z prints a random number every time it is pressed, but it doesn't seem to work after the first signal is handled. The other part of the code is a child process that loops until ctrl-c is pressed.
This is the code.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/wait.h>
#include <time.h>
#include <string.h>
int sig_ctrlc = 0;
int sig_ctrlz = 0;
//main signal handler to handle all signals
void SIGhandler(int sig) {
switch (sig) {
case SIGINT:
sig_ctrlc = SIGINT;
break;
case SIGTSTP:
sig_ctrlz = SIGTSTP;
break;
case SIGCHLD:
default: break;
}
}
int main() {
int fd[2]; //to store the two ends of the pipe
char get_inode[] = "ls -il STATUS.TXT";
FILE *fp;
FILE *log;
FILE *command;
time_t t;
time(&t);
int rv = 0;
log = fopen("file_log.txt", "w");
struct sigaction act;
memset (&act, 0, sizeof(struct sigaction));
act.sa_handler = SIGhandler;
//if pipe can't be created
if (pipe(fd) < 0) {
fprintf(log, "Pipe error");
}
int pid = fork();
switch(pid) {
case -1:
fprintf(stderr, "fork failed\n");
exit(1);
case 0:
/*child process */
// maps STDOUT to the writing end of the pipe
// if (dup2(fd[1], STDOUT_FILENO) == -1) {
// fprintf(log, "error in mapping stdout to the writing pipe\n");
// }
act.sa_flags = SA_RESTART;
sigemptyset(&act.sa_mask);
sigaction(SIGINT, &act, NULL);
sigaction(SIGTSTP, &act, NULL);
for (size_t i = 1;; i++) {
/* code */
printf(" running in child\n");
sleep(1);
if (sig_ctrlc != 0) {
printf("ctrlc handled\n");
printf("exiting...\n");
sig_ctrlc = 0;
break;
}
if (sig_ctrlz != 0) {
printf("ctlrz handled.\n");
/* random generator, the problem with this is it runs with time if ctrlz
is handled within a second it returns the same number
*/
srand(time(0));
int rand_num;
rand_num = rand() % (50 - 10 + 1) + 10;
printf("random number: %d\n", rand_num);
sig_ctrlz = 0;
sigaction(SIGINT, &act, NULL);
sigaction(SIGTSTP, &act, NULL);
}
}
default:
/* parent process */
close(fd[1]);
//maps STDIN to the reading end of the pipe
// if (dup2(fd[0], STDIN_FILENO) < 0) {
// fprintf(log, "can't redirect");
// exit(1);
// }
// //checks for fopen not working and writes to STATUS.TXT with a redirect
// if ((fp = freopen("STATUS.TXT", "w", stdout)) != NULL) {
// printf("start time of program: %s\n", ctime(&t));
// printf("Parent process ID: %d\n", getppid());
// printf("Child process ID: %d\n", getpid());
//
// //gets inode information sends the command to and receives the info from the terminal
// command = popen(get_inode, "w");
// fprintf(command, "STATUS.TXT");
// fclose(command);
//
//// map STDOUT to the status file
// if(freopen("STATUS.TXT", "a+ ", stdout) == NULL) {
// fp = fopen("file_log.txt","w");
// fprintf(log, "can't map STATUS.TXT to stdout\n");
// exit(1);
// }
//
printf("parent has started\n");
wait(NULL);
time(&t);
printf("PARENT: My child's termination status is: %d at: %s\n", WEXITSTATUS(rv), ctime(&t));
// fprintf(fp, "PARENT: My child's termination status is: %d at: %s\n", WEXITSTATUS(rv), ctime(&t));
// fclose(fp);
// fclose(log);
sigaction(SIGINT, &act, NULL);
for (size_t i = 1;; i++) {
/* code */
printf("PARENT: in parent function\n");
sleep(1);
if (sig_ctrlc != 0)
exit(0);
}
}
return 0;
}
There are some good comments on the original post that help make minor fixes. I think there is also an issue with the global variables sig_ctrlc and sig_ctrlz: flags written from a signal handler should be declared volatile sig_atomic_t. Other than that though, I think your signal handling setup should work in a case where you repeatedly send SIGTSTP and then SIGINT after. I think how you are going about testing your program may be the issue.
Based on some clues you've given:
"ctrlz is pressed but it doesn't seem to work after the first signal is handled"
"doesn't handle both ctrlc and ctrlz after the first ctrlz"
It leads me to believe that what you are experiencing is actually the terminal's job control getting in your way. This sequence of events may explain it:
parent (process A) is started in terminal foreground in group %1
child (process B) is forked and also in terminal foreground in group %1
signal handlers are set up within child
in an attempt to signal the child, press ctrl-z to send SIGTSTP
owning terminal (grandparent of child (process B) in this case) receives the request
owning terminal broadcasts the signal to all processes in the foreground group
owning terminal removes group %1 from foreground
parent (process A) receives SIGTSTP and is suspended (default action)
child (process B) receives SIGTSTP and the signal handler is invoked
the random number is generated and printed on next iteration of child loop
subsequent attempts to signal the child via ctrl-z or ctrl-c are not forwarded to the child (or parent) by the terminal because nothing is in the terminal foreground
If that was indeed the case, at that point, you should be able to bring the processes back to the foreground by manually typing in fg and hitting enter. You could then try and signal again. However, a better way to test a program like this would be to run it in one terminal, then send the signals via kill(...) using their pid's from another terminal.
One extra note: unlike signal(...), sigaction(...) does not require "re-installation" after each delivery of the signal. A good explanation by Jonathan here https://stackoverflow.com/a/232711/7148416
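To illustrate the last two points, here is a minimal sketch (not the original poster's code) with flags declared volatile sig_atomic_t and handlers installed exactly once through sigaction(). The names on_signal(), install_handlers() and take_flag() are hypothetical:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <string.h>

/* Flags shared between the handler and the main loop. */
volatile sig_atomic_t got_sigint = 0;
volatile sig_atomic_t got_sigtstp = 0;

/* The handler only records which signal arrived. */
void on_signal(int sig)
{
    if (sig == SIGINT)
        got_sigint = 1;
    else if (sig == SIGTSTP)
        got_sigtstp = 1;
}

/* Install the handler once; it stays installed after each delivery. */
int install_handlers(void)
{
    struct sigaction act;

    memset(&act, 0, sizeof act);
    act.sa_handler = on_signal;
    sigemptyset(&act.sa_mask);
    act.sa_flags = SA_RESTART;
    if (sigaction(SIGINT, &act, NULL) == -1)
        return -1;
    if (sigaction(SIGTSTP, &act, NULL) == -1)
        return -1;
    return 0;
}

/* Read a flag from the main loop and clear it only if it was set,
 * so a signal arriving right after the read is not thrown away. */
int take_flag(volatile sig_atomic_t *flag)
{
    int seen;

    assert(flag != NULL);
    seen = *flag;
    if (seen)
        *flag = 0;
    return seen;
}
```

The main loop would call install_handlers() once before looping, then test take_flag(&got_sigtstp) and take_flag(&got_sigint) on each iteration.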

Pipe guarantee to close after the child has exited

In the code below, is it safe to rely on read() failure to detect termination of child?
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
int main(void)
{
int pipefd[2];
pipefd[0] = 0;
pipefd[1] = 0;
pipe(pipefd);
pid_t pid = fork();
if (pid == 0)
{
// child
close(pipefd[0]); // close unused read end
while ((dup2(pipefd[1], STDOUT_FILENO) == -1) && (errno == EINTR)) {} // send stdout to the pipe
while ((dup2(pipefd[1], STDERR_FILENO) == -1) && (errno == EINTR)) {} // send stderr to the pipe
close(pipefd[1]); // close unused write end
char *argv[3];
argv[0] = "worker-app";
argv[1] = NULL;
argv[2] = NULL;
execvp("./worker-app", argv);
printf("failed to execvp, errno %d\n", errno);
exit(EXIT_FAILURE);
}
else if (pid == -1)
{
}
else
{
// parent
close(pipefd[1]); // close the write end of the pipe in the parent
char buffer[1024];
memset(buffer, 0, sizeof(buffer));
while (1) // <= here is it safe to rely on read below to break from this loop ?
{
ssize_t count = read(pipefd[0], buffer, sizeof(buffer)-1);
printf("pipe read return %d\n", (int)count);
if (count > 0)
{
buffer[count] = '\0'; // terminate exactly what this read() returned
printf("child: %s\n", buffer);
}
else if (count == 0)
{
printf("end read child pipe\n");
break;
}
else if (count == -1)
{
if (errno == EINTR)
{ continue;
}
printf("error read child pipe\n");
break;
}
}
close(pipefd[0]); // close read end, prevent descriptor leak
int waitStatus = 0;
waitpid(pid, &waitStatus, 0);
}
fprintf(stdout, "All work completed :-)\n");
return EXIT_SUCCESS;
}
Should I add something in the while(1) loop to detect child termination? What specific scenario could happen and break this app ?
Some thoughts of improvements below. However would I just waste CPU cycles?
Use kill with special argument 0 that won't kill the process but just check if it is responsive:
if (kill(pid, 0)) { break; /* child exited */ };
/* If sig is 0, then no signal is sent, but error checking is still performed; this can be used to check for the existence of a process ID or process group ID. https://linux.die.net/man/2/kill */
Use waitpid non-blocking in the while(1) loop to check if child has exited.
Use select() to check for pipe readability to prevent read() from possibly hanging?
Thanks!
Regarding your ideas:
If the child spawns children of its own, the read() won't return 0 until all of its descendants either die or close stdout and stderr. If it doesn't, or if the child always outlives all of its descendants, then just waiting for read() to return 0 is good enough and won't ever cause a problem.
If the child dies but the parent hasn't yet wait(2)ed on it, then kill(pid, 0) will succeed as if the child were still alive (at least on Linux), so this isn't an effective check from within your parent program.
A non-blocking waitpid() on its own would appear to fix the problem with the child having children of its own, but would actually introduce a subtle race condition. If the child exited right after the waitpid() but before the read(), then the read() would block until the rest of the descendants exited.
On its own, if you used select() in a blocking way, it's no better than just calling read(). If you used select() in a non-blocking way, you'd just end up burning CPU time in a loop.
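The kill(pid, 0) point above can be checked with a small experiment. This is only a sketch; pid_exists() and probe_zombie() are hypothetical names, and the "after reaping" result assumes the PID is not reused in the meantime:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 1 if kill(pid, 0) says the process exists, 0 on ESRCH,
 * and -1 on any other error (for example EPERM). */
int pid_exists(pid_t pid)
{
    assert(pid > 0);
    if (kill(pid, 0) == 0)
        return 1;
    return (errno == ESRCH) ? 0 : -1;
}

/* Forks a child that exits at once, then records what pid_exists()
 * reports while the child is an unreaped zombie and after it has
 * been reaped. Returns 0 on success, -1 on error. */
int probe_zombie(int *while_zombie, int *after_reap)
{
    siginfo_t info;
    pid_t child = fork();

    if (child == -1)
        return -1;
    if (child == 0)
        _exit(0);

    /* WNOWAIT: wait for the exit, but leave the child unreaped. */
    if (waitid(P_PID, (id_t)child, &info, WEXITED | WNOWAIT) == -1)
        return -1;
    *while_zombie = pid_exists(child);

    if (waitpid(child, NULL, 0) != child)
        return -1;
    *after_reap = pid_exists(child);
    return 0;
}
```

On Linux the first result should be 1 (the dead child still "exists") and the second 0.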
What I'd do:
Add a no-op signal handler function for SIGCHLD, just so that it causes EINTR when it occurs.
Block SIGCHLD in the parent before you start looping.
Use non-blocking reads, and use pselect(2) to block to avoid spinning the CPU forever.
During the pselect, pass in a sigset_t that doesn't have SIGCHLD blocked, so that it's guaranteed to cause an EINTR for it when it eventually gets sent.
Somewhere in the loop, do a non-blocking waitpid(2), and handle its return appropriately. (Make sure you do this at least once after blocking SIGCHLD but before calling select for the first time, or you'll have a race condition.)
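The steps above could look roughly like this. It is a sketch under the assumption that a single child is being watched; drain_until_exit() is a hypothetical name, and it leaves the no-op SIGCHLD handler installed:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/select.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* No-op handler: its only job is to make pselect() fail with EINTR. */
static void on_sigchld(int sig) { (void)sig; }

/* Hypothetical helper: read from fd until child `pid` has been reaped
 * or end-of-file is seen. Returns the number of bytes read, or -1 on
 * error; the child's wait status is stored in *status. */
long drain_until_exit(int fd, pid_t pid, int *status)
{
    struct sigaction act;
    sigset_t block, orig, unblocked;
    char buf[1024];
    long total = 0;
    int reaped = 0, eof = 0, flags;

    assert(status != NULL);
    memset(&act, 0, sizeof act);
    act.sa_handler = on_sigchld;
    sigemptyset(&act.sa_mask);
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    flags = fcntl(fd, F_GETFL);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ||
        sigaction(SIGCHLD, &act, NULL) == -1 ||
        sigprocmask(SIG_BLOCK, &block, &orig) == -1)
        return -1;
    unblocked = orig;
    sigdelset(&unblocked, SIGCHLD); /* mask used only inside pselect() */

    for (;;) {
        fd_set readable;
        ssize_t n;
        /* Non-blocking reap first, so an early exit is never missed. */
        pid_t p = waitpid(pid, status, WNOHANG);
        if (p == pid)
            reaped = 1;
        else if (p == -1 && errno != EINTR) { total = -1; break; }

        /* Drain whatever is in the pipe right now. */
        while ((n = read(fd, buf, sizeof buf)) > 0)
            total += n;
        if (n == 0)
            eof = 1;
        else if (errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR) {
            total = -1;
            break;
        }
        if (reaped || eof)
            break;

        /* Sleep until data arrives or SIGCHLD interrupts the call. */
        FD_ZERO(&readable);
        FD_SET(fd, &readable);
        if (pselect(fd + 1, &readable, NULL, NULL, NULL, &unblocked) == -1 &&
            errno != EINTR) { total = -1; break; }
    }
    if (total >= 0 && !reaped && waitpid(pid, status, 0) != pid)
        total = -1;
    sigprocmask(SIG_SETMASK, &orig, NULL);
    return total;
}
```

Because the final drain happens after the child has been reaped, output written just before the child exited is still collected, while descendants holding the pipe open no longer keep the parent blocked.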

How to implement producer-consumer using processes?

I'm trying to implement a producer-consumer application using 1 parent process and 1 child process. The program should work like this:
1 - The parent process is the producer and the child process is the consumer.
2 - The producer creates a file, the consumer removes the file.
3 - After the file has been created, the parent process sends a SIGUSR1 signal to the child process which then removes the file and sends a SIGUSR2 signal to the parent, signaling it that the file can be created again.
I've tried implementing this problem but I keep getting this error:
User defined signal 1: 30.
I don't really understand what could be the problem. I've just started learning about process and signals and maybe I'm missing something. Any help would be appreciated. Here's my implementation:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
pid_t child, parent;
void producer()
{
system("touch file");
printf("File was created.\n");
}
void consumer()
{
system("rm file");
printf("File was deleted.\n");
kill(parent, SIGUSR2); // signal -> file can created by parent
}
int main(void)
{
system("touch file");
pid_t pid = fork();
for(int i = 0; i < 10; ++i)
{
if(pid < 0) // error fork()
{
perror("fork()");
return -1;
}
else if(pid == 0) // child proces - consumer
{
child = getpid();
signal(SIGUSR1, consumer);
pause();
}
else // parent process - producer
{
parent = getpid();
signal(SIGUSR2, producer);
// signal -> file can be deleted by child
kill(child, SIGUSR1);
}
}
return 0;
}
Edit: I forgot to mention that there can only be one file at a time.
...Any help would be appreciated.
Regarding the error "User defined signal 1: 30", it is possible that the speed of execution is precipitating a race condition, causing termination before your handler functions are registered. Keep in mind, each signal has a default disposition (or action). For SIGUSR1 and SIGUSR2 the disposition is Term (from the table in the signal(7) page linked below):
Signal     Value      Action   Comment
SIGUSR1    30,10,16   Term     User-defined signal 1
SIGUSR2    31,12,17   Term     User-defined signal 2
(Note the value 30 listed by SIGUSR1 matches the exit condition you cite.)
The implication here would be that your handler functions had not registered before the first encounter with SIGUSR1, causing the default action of terminating your application and throwing the signal related error.
The relationship between synchronization and timing comes to mind as something to look at. I found several things written on synchronization, and linked one below.
Timing may be implicitly addressed with an adequate approach to synchronization, negating the need for any explicit execution flow control functions. However, if help is needed, experiment with the sleep family of functions.
Here are a couple of other general suggestions:
1) printf (and family) should really not be used in a signal handler.
2) But, if used, a newline ( \n ) is a good idea (which you have), or use fflush to force a write.
3) Run the program under strace (a command-line tool, not a function) to check whether any system call traffic is occurring.
Another code example of Synchronizing using signal().
Take a look at the signal(7) page (which is a lot of information, but it implies why using printf or fprintf inside a signal handler may not be a good idea in the first place).
Another collection of detailed information on Signal Handling.
Apart from what @ryyker mentioned, another problem is the global variable child. The child assigns it only in its own address space, so in the parent it is still 0 when the parent calls kill(child, SIGUSR1). kill(0, sig) sends the signal to every process in the caller's process group, including the parent itself, which has no SIGUSR1 handler and is terminated. A better approach is to use the pid variable in the parent and getppid() in the child. Here is the code, which seems to give the desired output:
void producer()
{
system("touch file");
printf("File was created.\n");
}
void consumer()
{
system("rm file");
printf("File was deleted.\n");
kill(getppid(), SIGUSR2); // signal -> file can created by parent
}
int main(void)
{
system("touch file");
pid_t pid = fork();
if(pid < 0) // error fork()
{
perror("fork()");
return -1;
}
if(pid > 0) { //parent
signal(SIGUSR2, producer);
}
else { //child
signal(SIGUSR1, consumer);
}
for(int i = 0; i < 10; ++i)
{
if(pid == 0) {// child proces - consumer
pause();
}
else // parent process - producer
{
printf("Iter %d\n",i);
kill(pid, SIGUSR1);
pause();
}
}
return 0;
}
Try using semaphores instead of signals.
Signals serve special purposes in the OS, whereas semaphores are meant for process synchronization.
POSIX named semaphores (sem_open()) can be used across processes.
The following pseudocode will help.
Semaphore Full,Empty;
------
Producer() //producer
{
WaitFor(Empty); //wait for an empty slot
system("touch file");
printf("File was created.\n");
Signal(Full); //Signal one slot is full
}
Consumer() //Consumer
{
WaitFor(Full); //wait for producer to produce
system("rm file");
printf("File was deleted.\n");
Signal(Empty);//Signal that it has consumed, so one empty slot created
}
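A possible C rendering of the pseudocode above, using POSIX named semaphores. This is a sketch only: run_rounds(), wait_sem() and the semaphore names are invented for the example, it assumes named semaphores are available on the system, and on older glibc versions it may need to be linked with -pthread:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <semaphore.h>
#include <signal.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* sem_wait() that retries when interrupted by a signal. */
static int wait_sem(sem_t *sem)
{
    int r;
    do {
        r = sem_wait(sem);
    } while (r == -1 && errno == EINTR);
    return r;
}

/* Hypothetical demo: the parent creates `path`, the child removes it,
 * `rounds` times. Returns the number of files created, or -1 on error. */
int run_rounds(const char *path, int rounds)
{
    char full_name[64], empty_name[64];
    sem_t *full, *empty;
    pid_t pid;
    int i, fd, done = 0;

    assert(path != NULL && rounds >= 0);
    snprintf(full_name, sizeof full_name, "/pc_full_%ld", (long)getpid());
    snprintf(empty_name, sizeof empty_name, "/pc_empty_%ld", (long)getpid());
    full = sem_open(full_name, O_CREAT | O_EXCL, 0600, 0);   /* no file yet */
    empty = sem_open(empty_name, O_CREAT | O_EXCL, 0600, 1); /* one free slot */
    /* The names are only needed for creation; fork() shares the handles. */
    sem_unlink(full_name);
    sem_unlink(empty_name);
    if (full == SEM_FAILED || empty == SEM_FAILED)
        return -1;

    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        /* Consumer: wait for a file, remove it, free the slot. */
        for (i = 0; i < rounds; i++) {
            if (wait_sem(full) == -1)
                _exit(1);
            unlink(path);
            sem_post(empty);
        }
        _exit(0);
    }
    /* Producer: wait for a free slot, create the file, announce it. */
    for (i = 0; i < rounds; i++) {
        if (wait_sem(empty) == -1)
            break;
        /* O_EXCL fails if the file still exists, i.e. two files at once. */
        fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0600);
        if (fd == -1)
            break;
        close(fd);
        done++;
        sem_post(full);
    }
    if (done < rounds)
        kill(pid, SIGKILL); /* do not leave the consumer waiting forever */
    waitpid(pid, NULL, 0);
    sem_close(full);
    sem_close(empty);
    return (done == rounds) ? done : -1;
}
```

Unlike the signal-based versions, no wake-up can be lost here: a sem_post() that happens before the matching sem_wait() is simply remembered in the semaphore's count.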
After a lot of research and reading all of the suggestions I finally managed to make the program work. Here is my implementation. If you see any mistakes or perhaps something could have been done better, then feel free to correct my code. I'm open to suggestions.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
void signal_handler(int signal_number)
{
sigset_t mask;
if(sigemptyset(&mask) == -1 || sigfillset(&mask) == -1)
{// initialize signal set || block all signals
perror("Failed to initialize the signal mask.");
return;
}
switch(signal_number)
{
case SIGUSR1:
{
if(sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
{ // entering critical zone
perror("sigprocmask(1)");
return;
} //---------------------
sleep(1);
system("rm file"); /* critical zone */
puts("File was removed.");
//--------------------
if(sigprocmask(SIG_UNBLOCK, &mask, NULL) == -1)
{// exiting critical zone
perror("1 : sigprocmask()");
return;
}
break;
}
case SIGUSR2:
{
if(sigprocmask(SIG_BLOCK, &mask, NULL) == -1)
{// entering critical zone
perror("2 : sigprocmask()");
return;
} //---------------------
sleep(1);
system("touch file");
puts("File was created."); /* critical zone */
// --------------------
if(sigprocmask(SIG_UNBLOCK, &mask, NULL) == -1)
{// exiting critical zone
perror("sigprocmask(2)");
return;
}
break;
}
}
}
int main(void)
{
pid_t pid = fork();
struct sigaction sa;
sa.sa_handler = &signal_handler; // handler function
sigemptyset(&sa.sa_mask); // sa_mask was left uninitialized before
sa.sa_flags = SA_RESTART;
sigaction(SIGUSR1, &sa, NULL);
sigaction(SIGUSR2, &sa, NULL);
if(pid < 0)
{
perror("fork()");
return -1;
}
for(int i = 0; i < 10; ++i)
{
if(pid > 0) // parent - producer
{
sleep(2);
// signal -> file was created
kill(pid, SIGUSR1);
pause();
}
else // child - consumer
{
pause();
// signal -> file was removed
kill(getppid(), SIGUSR2);
}
}
return 0;
}

c fork() and kill() at the same time not working?

Main program: Start a certain amount of child processes then send SIGINT right away.
int main()
{
pid_t childs[CHILDS];
char *execv_argv[3];
int n = CHILDS;
execv_argv[0] = "./debugging_procs/wait_time_at_interrupt";
execv_argv[1] = "2";
execv_argv[2] = NULL;
for (int i = 0; i < n; i++)
{
childs[i] = fork();
if (childs[i] == 0)
{
execv(execv_argv[0], execv_argv);
perror("execv"); // execv() only returns on failure
_exit(1);
}
}
if (errno != 0)
perror(strerror(errno));
// sleep(1);
for (int i = 0; i < n; i++)
kill(childs[i], SIGINT);
if (errno != 0)
perror(strerror(errno));
// Wait for all children.
while (wait(NULL) > 0);
return 0;
}
Forked program: Wait for any signal, if SIGINT is sent, open a certain file and write SIGINT and the current pid to it and wait the amount specified of seconds (in this case, I send 2 from the main program).
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
void sigint_handler(int signum)
{
int fd = open("./aux/log1", O_WRONLY | O_APPEND);
char buf[124];
(void)signum;
sprintf(buf, "SIGINT %d\n", getpid());
write(fd, buf, strlen(buf));
close(fd);
}
int main(int argc, char **argv)
{
int wait_time;
wait_time = (argv[1]) ? atoi(argv[1]) : 5;
signal(SIGINT, &sigint_handler);
// Wait for any signal.
pause();
sleep(wait_time);
return 0;
}
The problem is, that the log file that the children should write, doesn't have n lines, meaning that not all children wrote to it. Sometimes nobody writes anything and the main program doesn't wait at all (meaning that sleep() isn't called in this case).
But if I uncomment sleep(1) in the main program, everything works just as I expected.
I suspect that the child processes don't get enough time to listen to SIGINT.
The program I'm working on is a task control and when I run a command like:
restart my_program; restart my_program I get an unstable behaviour. When I call restart, a SIGINT is sent, then a new fork() is called then another SIGINT is sent, just like the example above.
How can I make sure all children will parse SIGINT without the sleep(1) line? I'm testing my program if it can handle programs that don't exit right away after SIGINT is sent.
If I add for example, printf("child process started\n"); at the top of the child program, it doesn't get printed and the main program doesn't wait for anything, unless I sleep for a second. This happens even with only 1 child process.
Everything is working as it should. Some of your child processes get killed by the signal, before they set up the signal handler, or even before they start executing the child binary.
In your parent process, instead of just wait()ing until there are no more child processes, you could examine the identity and exit status of each of the processes reaped. Replace while (wait(NULL) > 0); with
{
pid_t p;
int status;
while ((p = wait(&status)) > 0) {
if (WIFEXITED(status))
printf("Child %ld exit status was %d.\n", (long)p, WEXITSTATUS(status));
else
if (WIFSIGNALED(status))
printf("Child %ld was killed by signal %d.\n", (long)p, WTERMSIG(status));
else
printf("Child %ld was lost.\n", (long)p);
fflush(stdout);
}
}
and you'll see that the "missing" child processes were terminated by the signals. This means that the child process was killed before it was ready to catch the signal.
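One way to close that window without sleeping is to block the signal before fork(). The signal mask is inherited by the child and preserved across execve(), so a signal sent "too early" stays pending instead of killing the child. A minimal sketch, where signal_child_at_once() is a hypothetical name and the child waits with sigwaitinfo() instead of exec'ing another binary:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical demo: fork a child and signal it immediately.
 * Returns the child's exit status (42 if it received `sig`),
 * or -1 on error. */
int signal_child_at_once(int sig)
{
    sigset_t block, orig;
    siginfo_t info;
    pid_t child;
    int status;

    assert(sig > 0);
    sigemptyset(&block);
    sigaddset(&block, sig);
    /* Block before fork(): the child starts with the signal blocked. */
    if (sigprocmask(SIG_BLOCK, &block, &orig) == -1)
        return -1;

    child = fork();
    if (child == 0) {
        /* Even if the signal was sent already, it is still pending. */
        if (sigwaitinfo(&block, &info) == sig)
            _exit(42);
        _exit(1);
    }
    if (child == -1) {
        sigprocmask(SIG_SETMASK, &orig, NULL);
        return -1;
    }

    kill(child, sig); /* no sleep() needed */
    sigprocmask(SIG_SETMASK, &orig, NULL);
    if (waitpid(child, &status, 0) != child || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In the exec'ing variant, the new program would either keep the signal blocked and use sigwaitinfo(), or install its handler first and only then unblock the signal with sigprocmask().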
I wrote my own example program pairs, with complete error checking. Instead of a signal handler, I decided to use sigprocmask() and sigwaitinfo(), just to show another way to do the same thing (and to not be limited to async-signal safe functions in a signal handler).
parent.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
const char *signal_name(const int signum)
{
static char buffer[32];
switch (signum) {
case SIGINT: return "INT";
case SIGHUP: return "HUP";
case SIGTERM: return "TERM";
default:
snprintf(buffer, sizeof buffer, "%d", signum);
return (const char *)buffer;
}
}
static int compare_pids(const void *p1, const void *p2)
{
const pid_t pid1 = *(const pid_t *)p1;
const pid_t pid2 = *(const pid_t *)p2;
return (pid1 < pid2) ? -1 :
(pid1 > pid2) ? +1 : 0;
}
int main(int argc, char *argv[])
{
size_t count, r, i;
int status;
pid_t *child, *reaped, p;
char dummy;
if (argc < 3 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s COUNT PATH-TO-BINARY [ ARGS ... ]\n", argv[0]);
fprintf(stderr, "\n");
fprintf(stderr, "This program will fork COUNT child processes,\n");
fprintf(stderr, "each child process executing PATH-TO-BINARY.\n");
fprintf(stderr, "Immediately after all child processes have been forked,\n");
fprintf(stderr, "they are sent a SIGINT signal.\n");
fprintf(stderr, "\n");
return EXIT_FAILURE;
}
if (sscanf(argv[1], " %zu %c", &count, &dummy) != 1 || count < 1) {
fprintf(stderr, "%s: Invalid count.\n", argv[1]);
return EXIT_FAILURE;
}
child = malloc(count * sizeof child[0]);
reaped = malloc(count * sizeof reaped[0]);
if (!child || !reaped) {
fprintf(stderr, "%s: Count is too large; out of memory.\n", argv[1]);
return EXIT_FAILURE;
}
for (i = 0; i < count; i++) {
p = fork();
if (p == -1) {
if (i == 0) {
fprintf(stderr, "Cannot fork child processes: %s.\n", strerror(errno));
return EXIT_FAILURE;
} else {
fprintf(stderr, "Cannot fork child %zu: %s.\n", i + 1, strerror(errno));
count = i;
break;
}
} else
if (!p) {
/* Child process */
execvp(argv[2], argv + 2);
{
const char *errmsg = strerror(errno);
fprintf(stderr, "Child process %ld: Cannot execute %s: %s.\n",
(long)getpid(), argv[2], errmsg);
exit(EXIT_FAILURE);
}
} else {
/* Parent process. */
child[i] = p;
}
}
/* Send all children the INT signal. */
for (i = 0; i < count; i++)
kill(child[i], SIGINT);
/* Reap and report each child. */
r = 0;
while (1) {
p = wait(&status);
if (p == -1) {
if (errno == ECHILD)
break;
fprintf(stderr, "Error waiting for child processes: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
if (r < count)
reaped[r++] = p;
else
fprintf(stderr, "Reaped an extra child process!\n");
if (WIFEXITED(status)) {
switch (WEXITSTATUS(status)) {
case EXIT_SUCCESS:
printf("Parent: Reaped child process %ld: EXIT_SUCCESS.\n", (long)p);
break;
case EXIT_FAILURE:
printf("Parent: Reaped child process %ld: EXIT_FAILURE.\n", (long)p);
break;
default:
printf("Parent: Reaped child process %ld: Exit status %d.\n", (long)p, WEXITSTATUS(status));
break;
}
fflush(stdout);
} else
if (WIFSIGNALED(status)) {
printf("Parent: Reaped child process %ld: Terminated by %s.\n", (long)p, signal_name(WTERMSIG(status)));
fflush(stdout);
} else {
printf("Parent: Reaped child process %ld: Lost.\n", (long)p);
fflush(stdout);
}
}
if (r == count) {
/* Sort both pid arrays. */
qsort(child, count, sizeof child[0], compare_pids);
qsort(reaped, count, sizeof reaped[0], compare_pids);
for (i = 0; i < count; i++)
if (child[i] != reaped[i])
break;
if (i == count)
printf("Parent: All %zu child processes were reaped successfully.\n", count);
}
return EXIT_SUCCESS;
}
child.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
const char *signal_name(const int signum)
{
static char buffer[32];
switch (signum) {
case SIGINT: return "INT";
case SIGHUP: return "HUP";
case SIGTERM: return "TERM";
default:
snprintf(buffer, sizeof buffer, "%d", signum);
return (const char *)buffer;
}
}
int main(void)
{
const long mypid = getpid();
sigset_t set;
siginfo_t info;
int result;
printf("Child: Child process %ld started!\n", mypid);
fflush(stdout);
sigemptyset(&set);
sigaddset(&set, SIGINT);
sigaddset(&set, SIGHUP);
sigaddset(&set, SIGTERM);
sigprocmask(SIG_BLOCK, &set, NULL);
result = sigwaitinfo(&set, &info);
if (result == -1) {
printf("Child: Child process %ld failed: %s.\n", mypid, strerror(errno));
return EXIT_FAILURE;
}
if (info.si_pid == 0)
printf("Child: Child process %ld terminated by signal %s via terminal.\n", mypid, signal_name(result));
else
if (info.si_pid == getppid())
printf("Child: Child process %ld terminated by signal %s sent by the parent process %ld.\n",
mypid, signal_name(result), (long)info.si_pid);
else
printf("Child: Child process %ld terminated by signal %s sent by process %ld.\n",
mypid, signal_name(result), (long)info.si_pid);
return EXIT_SUCCESS;
}
Compile both using e.g.
gcc -Wall -O2 parent.c -o parent
gcc -Wall -O2 child.c -o child
and run them using e.g.
./parent 100 ./child
where the 100 is the number of child processes to fork, each running ./child.
Errors are output to standard error. Each line from parent to standard output begins with Parent:, and each line from any child to standard output begins with Child:.
On my machine, the last line in the output is always Parent: All # child processes were reaped successfully., which means that every child process fork()ed, was reaped and reported using wait(). Nothing was lost, and there were no issues with fork() and kill().
(Do note that if you specify more child processes than you are allowed to fork, the parent program does not consider that an error, and just uses the allowed number of child processes for the test.)
On my machine, forking and reaping 100 child processes is enough work for the parent process, so that every child process gets to the part where it is ready to catch the signal.
On the other hand, the parent can handle 10 child processes (running ./parent 10 ./child) so fast that every one of the child processes gets killed by the INT signal before they are ready to handle the signal.
Here is the output from a pretty typical case when running ./parent 20 ./child:
Child: Child process 19982 started!
Child: Child process 19983 started!
Child: Child process 19984 started!
Child: Child process 19982 terminated by signal INT sent by the parent process 19981.
Child: Child process 19992 started!
Child: Child process 19983 terminated by signal INT sent by the parent process 19981.
Child: Child process 19984 terminated by signal INT sent by the parent process 19981.
Parent: Reaped child process 19982: EXIT_SUCCESS.
Parent: Reaped child process 19985: Terminated by INT.
Parent: Reaped child process 19986: Terminated by INT.
Parent: Reaped child process 19984: EXIT_SUCCESS.
Parent: Reaped child process 19987: Terminated by INT.
Parent: Reaped child process 19988: Terminated by INT.
Parent: Reaped child process 19989: Terminated by INT.
Parent: Reaped child process 19990: Terminated by INT.
Parent: Reaped child process 19991: Terminated by INT.
Parent: Reaped child process 19992: Terminated by INT.
Parent: Reaped child process 19993: Terminated by INT.
Parent: Reaped child process 19994: Terminated by INT.
Parent: Reaped child process 19995: Terminated by INT.
Parent: Reaped child process 19996: Terminated by INT.
Parent: Reaped child process 19983: EXIT_SUCCESS.
Parent: Reaped child process 19997: Terminated by INT.
Parent: Reaped child process 19998: Terminated by INT.
Parent: Reaped child process 19999: Terminated by INT.
Parent: Reaped child process 20000: Terminated by INT.
Parent: Reaped child process 20001: Terminated by INT.
Parent: All 20 child processes were reaped successfully.
Of the 20 child processes, 16 were killed by INT signal before they executed the first printf() (or fflush(stdout)) line. (We could add a printf("Child: Child process %ld executing %s\n", (long)getpid(), argv[2]); fflush(stdout); to parent.c just before the execvp() line, to see if any of the child processes get killed before they execute at all.)
Of the four remaining child processes (19982, 19983, 19984, and 19992), one (19992) was terminated after the first printf() or fflush(), but before it managed to run sigprocmask(), which blocks the signal and prepares the child for catching it.
Only the three remaining child processes (19982, 19983, and 19984) caught the INT signal sent by the parent process.
As you can see, just adding complete error checking, and adding sufficient output (and fflush(stdout); where useful, as standard output is buffered by default), lets you run several test cases, and construct a much better overall picture of what is happening.
The program I'm working on is a task control and when I run a command like: restart my_program; restart my_program I get an unstable behaviour. When I call restart, a SIGINT is sent, then a new fork() is called then another SIGINT is sent, just like the example above.
In that case, you are sending the signal before the new fork is ready, so the default disposition of the signal (Termination, for INT) defines what happens.
The solutions to this underlying problem vary. Note that it is at the core of many init system issues. It is easy to solve if the child (my_program here) co-operates, but difficult in all other cases.
One simple co-operation method is to have the child send a signal to its parent process, whenever it is ready for action. To avoid killing parent processes that are unprepared for such information, a signal that is ignored by default (SIGWINCH, for example) can be used.
The option of sleeping for some duration, so that the new child process has enough time to become ready for action, is a common, but pretty unreliable method of mitigating this issue. (In particular, the required duration depends on the child process priority, and the overall load on the machine.)
Try calling waitpid() inside the for loop, right after each fork(). This way the next child will only run (and write its output) once the previous child is done.

How to trace a process for system calls?

I am trying to code a program that traces itself for system calls. I am having a difficult time making this work. I tried calling a fork() to create an instance of itself (the code), then monitor the resulting child process.
The goal is for the parent process to return the index of every system call made by the child process and output it to the screen. Somehow it is not working as planned.
Here is the code:
#include <unistd.h> /* for read(), write(), close(), fork() */
#include <fcntl.h> /* for open() */
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <sys/wait.h>
#include <sys/types.h>
int main(int argc, char *argv[]) {
pid_t child;
long orig_eax;
child = fork();
if (0 == child)
{
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
if (argc != 3) {
fprintf(stderr, "Usage: copy <filefrom> <fileto>\n");
return 1;
}
int c;
size_t file1_fd, file2_fd;
if ((file1_fd = open(argv[1], O_RDONLY)) < 0) {
fprintf(stderr, "copy: can't open %s\n", argv[1]);
return 1;
}
if ((file2_fd = open(argv[2], O_WRONLY | O_CREAT)) < 0) {
fprintf(stderr, "copy: can't open %s\n", argv[2]);
return 1;
}
while (read(file1_fd, &c, 1) > 0)
write(file2_fd, &c, 1);
}
else
{
wait(NULL);
orig_eax = ptrace (PTRACE_PEEKUSER, child, 4 * ORIG_EAX, NULL);
printf("copy made a system call %ld\n", orig_eax);
ptrace(PTRACE_CONT, child, NULL, NULL);
}
return 0;
}
This code was based on this code:
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <linux/user.h> /* For constants
ORIG_EAX etc */
int main()
{
pid_t child;
long orig_eax;
child = fork();
if(child == 0) {
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
execl("/bin/ls", "ls", NULL);
}
else {
wait(NULL);
orig_eax = ptrace(PTRACE_PEEKUSER,
child, 4 * ORIG_EAX,
NULL);
printf("The child made a "
"system call %ld\n", orig_eax);
ptrace(PTRACE_CONT, child, NULL, NULL);
}
return 0;
}
The output of this one is:
The child made a system call 11
which is the index for the exec system call.
According to the man pages for wait():
All of these system calls are used to wait for state changes in a child
of the calling process, and obtain information about the child whose
state has changed. A state change is considered to be: the child terminated;
the child was stopped by a signal; or the child was resumed by
a signal.
The way I understand it is that every time a system call is invoked by a user program, the kernel will first inspect if the process is being traced prior to executing the system call routine and pauses that process with a signal and returns control to the parent. Wouldn't that be a state change already?
The problem is that when the child calls ptrace(PTRACE_TRACEME) it sets itself up for tracing but doesn't actually stop -- it keeps going until it calls exec (in which case it stops with a SIGTRAP), or it gets some other signal. So in order for you to have the parent see what it does WITHOUT an exec call, you need to arrange for the child to receive a signal. The easiest way to do that is probably to have the child call raise(SIGCONT); (or any other signal) immediately after calling ptrace(PTRACE_TRACEME).
Now in the parent you just wait (once) and assume that the child is now stopped at a system call. This won't be the case if it stopped at a signal, so you instead need to call wait(&status) to get the child status and call WIFSTOPPED(status) and WSTOPSIG(status) to see WHY it has stopped. If it has stopped due to a syscall, the signal will be SIGTRAP.
If you want to see multiple system calls in the child, you'll need to do all of this in a loop; something like:
while(1) {
wait(&status);
if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
// stopped before or after a system call -- query the child and print out info
}
if (WIFEXITED(status) || WIFSIGNALED(status)) {
// child has exited or terminated
break;
}
ptrace(PTRACE_SYSCALL, childpid, 0, 0); // ignore any signal and continue the child
}
Note that it will stop TWICE for each system call -- once before the system call and a second time just after the system call completes.
You are basically trying to write the strace binary on Linux, which traces the system calls of a process. Linux provides the ptrace(2) system call for this. ptrace takes four arguments, and the first argument tells it what you want to do. The kernel notifies the parent through wait(), and the traced child is stopped whenever a signal is delivered to it (for example, the SIGTRAP it gets after a successful exec). Broadly, you need to follow the steps below.
if (fork() == 0)
{
    /* child process */
    ptrace(PTRACE_TRACEME, 0, 0, 0);
    exec(...);
}
else
{
start:
    wait4(...);
    if (WIFSIGNALED(status)) {
        /* done */
    }
    if (WIFEXITED(status)) {
        /* done */
    }
    if (flag == startup)
    {
        flag = startupdone;
        ptrace(PTRACE_SYSCALL, pid, 0, 0);
        goto start;
    }
    if (WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP) {
        /* extract the registers; on Linux the buffer goes in the data (4th) argument */
        ptrace(PTRACE_GETREGS, pid, 0, &regs);
    }
}
Note that reading and interpreting the registers depends on your architecture. The above code is just an example; to get it right you need to dig deeper. Have a look at the strace source code for further understanding.
In your parent how many calls do you want to monitor? If you want more than one you're going to need some kind of loop.
Note the line in the example, it's important:
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
Looking at the man page the child needs to either do a PTRACE_TRACEME and an exec, or the parent needs to trace using PTRACE_ATTACH. I don't see either in your code:
The parent can initiate a trace by calling fork(2) and having the resulting child do a PTRACE_TRACEME, followed (typically) by an exec(3). Alternatively, the parent may commence trace of an existing process using PTRACE_ATTACH.
Just putting together what Chris Dodd said:
#include <unistd.h> /* for read(), write(), close(), fork() */
#include <fcntl.h> /* for open() */
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/reg.h>
#include <sys/wait.h>
#include <sys/types.h>
#include <signal.h> /* for raise(), SIGCONT, SIGTRAP */
int main(int argc, char *argv[]) {
pid_t child;
int status;
long orig_eax;
child = fork();
if (0 == child)
{
ptrace(PTRACE_TRACEME, 0, NULL, NULL);
raise(SIGCONT);
if (argc != 3) {
fprintf(stderr, "Usage: copy <filefrom> <fileto>\n");
return 1;
}
int c;
int file1_fd, file2_fd; /* open() returns int; a size_t could never be < 0 */
if ((file1_fd = open(argv[1], O_RDONLY)) < 0) {
fprintf(stderr, "copy: can't open %s\n", argv[1]);
return 1;
}
if ((file2_fd = open(argv[2], O_WRONLY | O_CREAT, 0644)) < 0) { /* O_CREAT needs a mode argument */
fprintf(stderr, "copy: can't open %s\n", argv[2]);
return 1;
}
while (read(file1_fd, &c, 1) > 0)
write(file2_fd, &c, 1);
}
else
{
while(1){
wait(&status);
if(WIFSTOPPED(status) && WSTOPSIG(status) == SIGTRAP){
orig_eax = ptrace(PTRACE_PEEKUSER, child, sizeof(long) * ORIG_EAX, NULL);
printf("copy made a system call %ld\n", orig_eax);
}
if(WIFEXITED(status) || WIFSIGNALED(status)){
break;
}
ptrace(PTRACE_SYSCALL, child, 0, 0);
}
}
return 0;
}

Resources