I have a small utility that needs to use fork() and wait(). However, I am facing an issue: the child process is not terminated once the parent program is running. Any idea how I can fix it?
int test(void)
{
pid_t PID;
PID = fork();
if (PID == 0) {
sprintf(execmd, "/root/test);
system(execmd);
sprintf(filename, "test_results.txt);
FILE *fp = fopen(filename, "r");
fscanf (fp, "%s%s%s%s%s", &A, &B, &C, &D, &E);
printf ("A=%s B=%s C=%s D=%s E=%s\n", A, B, C, D, E);
fclose (fp);
}
else // *** Parent Process ***
{
int status;
wait(&status);
}
return 0;
}
At the very beginning: your code should not compile at all, as you do not close your strings:
sprintf(execmd, "/root/test);
system(execmd); // ^ missing quote!
(By the way, why don't you simply call system("/root/test");? At least from the code shown, I do not see any reason why you would need a copy...)
Then see wait documentation:
The wait() function shall suspend execution of the calling thread until status information for one of the terminated child processes of the calling process is available, or until delivery of a signal whose action is either to execute a signal-catching function or to terminate the process. If more than one thread is suspended in wait() or waitpid() awaiting termination of the same process, exactly one thread shall return the process status at the time of the target process termination. If status information is available prior to the call to wait(), return shall be immediate.
Return Value
If wait() or waitpid() returns because the status of a child process is available, these functions shall return a value equal to the process ID of the child process for which status is reported. If wait() or waitpid() returns due to the delivery of a signal to the calling process, -1 shall be returned and errno set to [EINTR]. If waitpid() was invoked with WNOHANG set in options, it has at least one child process specified by pid for which status is not available, and status is not available for any process specified by pid, 0 is returned. Otherwise, (pid_t)-1 shall be returned, and errno set to indicate the error.
So - if wait returns, you first should check if the process id was actually returned, otherwise you might re-enter into wait.
Then, if wait does not return, your child process is still running. I don't see why your process should be blocked because of the file handling (well, you do not perform any error checking, though, but that's a different matter unrelated to your problem), so most likely your child process is caught in the call to system.
What you now need is a timeout mechanism. My proposition is now as follows:
create a pipe for self-triggering events
install a signal handler for SIGCHLD using sigaction
in the signal handler, do the wait and write a single byte to your pipe
in your main function, where you currently wait for your child process, you would now first select or poll for your pipe with a timeout. If the timeout occurs, send a signal to your child process (you could do that twice, first sending SIGTERM to allow your child to terminate gracefully, and if that does not help, sending SIGKILL afterwards).
Alternative: If on linux, have a look at signalfd - you are not portable then, but get the same work done easier (select/poll for your fd and on success, you can call wait in the main function again).
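The self-pipe steps listed above could be sketched roughly as follows. This is only an illustration under assumptions: the names chld_pipe, on_sigchld, wait_for_sigchld and run_with_timeout are made up, the handler only writes the byte, and waitpid() is called from the main flow rather than inside the handler, which is a simplification of the steps described.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical names throughout; a sketch, not the answer's own code. */
static int chld_pipe[2] = { -1, -1 };

/* SIGCHLD handler: async-signal-safe, it only writes one byte. */
static void on_sigchld(int sig)
{
    int saved_errno = errno;
    ssize_t n = write(chld_pipe[1], "x", 1);
    (void)n;
    (void)sig;
    errno = saved_errno;
}

/* Returns 1 if a SIGCHLD byte arrived, 0 on timeout, -1 on error. */
static int wait_for_sigchld(int timeout_ms)
{
    struct pollfd pfd;
    pfd.fd = chld_pipe[0];
    pfd.events = POLLIN;
    pfd.revents = 0;
    for (;;) {
        int r = poll(&pfd, 1, timeout_ms);
        if (r > 0) {
            char buf[16];
            ssize_t n = read(chld_pipe[0], buf, sizeof buf); /* drain */
            (void)n;
            return 1;
        }
        if (r == 0)
            return 0;
        if (errno != EINTR)
            return -1;
        /* Interrupted by the handler itself: poll again. The timeout
           restarts, which is acceptable for a sketch. */
    }
}

/* Runs argv[0] and returns its wait status, or -1 on setup failure. */
int run_with_timeout(char *const argv[], int timeout_ms)
{
    struct sigaction sa;
    int status, i;
    pid_t pid;

    if (chld_pipe[0] == -1) {
        if (pipe(chld_pipe) < 0)
            return -1;
        for (i = 0; i < 2; i++) {
            fcntl(chld_pipe[i], F_SETFD, FD_CLOEXEC);
            fcntl(chld_pipe[i], F_SETFL, O_NONBLOCK);
        }
    }
    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = on_sigchld;
    sa.sa_flags = SA_RESTART;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                  /* only reached if exec failed */
    }
    if (wait_for_sigchld(timeout_ms) == 0) {
        kill(pid, SIGTERM);          /* ask politely first */
        if (wait_for_sigchld(timeout_ms) == 0)
            kill(pid, SIGKILL);      /* then force termination */
    }
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;
    return status;
}
```

A caller would pass a NULL-terminated argument vector and then inspect the result with WIFEXITED() or WIFSIGNALED(). A child that had to be stopped with SIGTERM shows up as WIFSIGNALED with WTERMSIG equal to SIGTERM.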
Additionally, I recommend to re-structure your program a little: system will internally call fork and execve again, so you actually create another child process.
So I'd rather do it this way:
// preparations as described above
if (PID == 0)
{
execl("/root/test", "test", (char *)NULL); // won't return unless on error!
// some error handling?
return -1;
}
// parent process
if(PID < 0)
{
// error, the child process could not be created!
return -1;
}
// select/poll
if(timeout)
{
kill(PID, SIGTERM);
// select/poll
if(timeout)
kill(PID, SIGKILL);
}
//***************************************************
// if using signalfd:
int status;
wait(&status);
// yet to be done: check return value, status, errno
// you might need a loop...
// otherwise, prefer doing this in the signal handler
// however, the following output MUST NOT be done there, as
// file handling is not async-signal-safe!
// just set some global flag there if the process terminated successfully!
//***************************************************
if(child_was_successful)
{
FILE* fp = fopen("test_results.txt", "r");
if(fp) // only if successful!
{
// missing declarations (presumably global), adding them here:
char A[32], B[32], C[32], D[32], E[32];
// some fixes:
// 1. checking return value
// 2. arrays already decay to pointers when being passed
// -> you do not need to take the address of again!
// 3. adding max length to your arrays prevents fscanf
// from writing beyond your array boundaries
if(fscanf (fp, "%31s%31s%31s%31s%31s", A, B, C, D, E) == 5)
// ^ no ampersand!
// ^ 31: need to leave space for terminating 0 character
{
printf ("A=%s B=%s C=%s D=%s E=%s\n", A, B, C, D, E);
}
fclose (fp);
}
}
In this example from the CSAPP book chap.8:
#include "csapp.h"
/* WARNING: This code is buggy! */
void handler1(int sig)
{
int olderrno = errno;
if ((waitpid(-1, NULL, 0)) < 0)
sio_error("waitpid error");
Sio_puts("Handler reaped child\n");
Sleep(1);
errno = olderrno;
}
int main()
{
int i, n;
char buf[MAXBUF];
if (signal(SIGCHLD, handler1) == SIG_ERR)
unix_error("signal error");
/* Parent creates children */
for (i = 0; i < 3; i++) {
if (Fork() == 0) {
printf("Hello from child %d\n", (int)getpid());
exit(0);
}
}
/* Parent waits for terminal input and then processes it */
if ((n = read(STDIN_FILENO, buf, sizeof(buf))) < 0)
unix_error("read");
printf("Parent processing input\n");
while (1)
;
exit(0);
}
It generates the following output:
......
Hello from child 14073
Hello from child 14074
Hello from child 14075
Handler reaped child
Handler reaped child //more than one child reaped
......
The single waitpid() call in the if block is the deliberate bug: it cannot be relied on to reap all children. I understand that waitpid() should be put in a while() loop to ensure all children are reaped. What I don't understand is why, with only one waitpid() call per handler invocation, more than one child was reaped (note in the output that the handler reaped more than one child). According to this answer: Why does waitpid in a signal handler need to loop?
waitpid() is only able to reap one child.
Thanks!
update:
This is irrelevant, but the handler is corrected in the following way (also taken from the CSAPP book):
void handler2(int sig)
{
int olderrno = errno;
while (waitpid(-1, NULL, 0) > 0) {
Sio_puts("Handler reaped child\n");
}
if (errno != ECHILD)
Sio_error("waitpid error");
Sleep(1);
errno = olderrno;
}
Running this code on my linux computer.
The signal handler you designated runs every time the signal you assigned to it (SIGCHLD in this case) is received. While it is true that waitpid is only executed once per signal received, the handler still executes it multiple times because it gets called every time a child terminates.
Child n terminates (SIGCHLD), the handler springs into action and uses waitpid to "reap" the just exited child.
Child n+1 terminates and its behaviour follows the same as Child n. This goes on for every child there is.
There is no need to loop it as it gets called only when needed in the first place.
Edit: As pointed out below, the reason as to why the book later corrects it with the intended loop is because if multiple children send their termination signal at the same time, the handler may only end up getting one of them.
signal(7):
Standard signals do not queue. If multiple instances of a
standard signal are generated while that signal is blocked, then
only one instance of the signal is marked as pending (and the
signal will be delivered just once when it is unblocked).
Looping waitpid assures the reaping of all exited children and not just one of them as is the case right now.
Why is looping solving the issue of multiple signals?
Picture this: you are currently inside the handler, handling a SIGCHLD signal you have received, and while you are doing that, you receive more signals from other children that have terminated in the meantime. These signals cannot queue up. By constantly looping waitpid, you are making sure that even if the handler itself can't deal with the multiple signals being sent, waitpid still picks up the exited children as it's constantly running, rather than only running when the handler activates, which may or may not work as intended depending on whether signals have been merged or not.
waitpid still exits correctly once there are no more children to reap. It is important to understand that the loop is only there to catch signals that are sent when you are already in the signal handler and not during normal code execution as in that case the signal handler will take care of it as normal.
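A minimal sketch of such a reaping loop, using WNOHANG so it never blocks once no exited children remain. The function names reap_exited_children, sigchld_handler and install_reaper are made up for illustration and are not from the book.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reaps every child that has already exited and returns how many.
   waitpid() is async-signal-safe, so this may run inside a handler. */
int reap_exited_children(void)
{
    int saved_errno = errno;
    int reaped = 0;

    /* One SIGCHLD may stand for several exits, so keep going until
       waitpid() reports nothing left (0) or no children at all (-1). */
    while (waitpid(-1, NULL, WNOHANG) > 0)
        reaped++;

    errno = saved_errno;
    return reaped;
}

static void sigchld_handler(int sig)
{
    (void)sig;
    reap_exited_children();
}

/* Installs the handler; returns 0 on success, -1 on error. */
int install_reaper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = sigchld_handler;
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}
```

Because the loop drains everything that is ready, it does not matter how many SIGCHLD instances were merged into the one delivery.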
If you are still in doubt, try reading these two answers to your question.
How to make sure that `waitpid(-1, &stat, WNOHANG)` collect all children processes
Why does waitpid in a signal handler need to loop? (first two paragraphs)
The first one uses flags such as WNOHANG, but this only makes waitpid return immediately instead of waiting, if there is no child process ready to be reaped.
I read in an ebook that waitpid(-1, &status, WNOHANG) should be put in a while loop so that if multiple child processes exit simultaneously, they all get reaped.
I tried this by creating and terminating 2 child processes at the same time and reaping them with waitpid WITHOUT a loop, and they were all reaped.
The question is: is it really necessary to put waitpid in a loop?
#include<stdio.h>
#include<sys/wait.h>
#include<signal.h>
int func(int pid)
{
if(pid < 0)
return 0;
func(pid - 1);
}
void sighand(int sig)
{
int i=45;
int stat, pid;
printf("Signal caught\n");
//while( (
pid = waitpid(-1, &stat, WNOHANG);
//) > 0){
printf("Reaped process %d----%d\n", pid, stat);
func(pid);
}
int main()
{
int i;
signal(SIGCHLD, sighand);
pid_t child_id;
if( (child_id=fork()) == 0 ) //child process
{
printf("Child ID %d\n",getpid());
printf("child exiting ...\n");
}
else
{
if( (child_id=fork()) == 0 ) //child process
{
printf("Child ID %d\n",getpid());
printf("child exiting ...\n");
}
else
{
printf("------------Parent with ID %d \n",getpid());
printf("parent exiting ....\n");
sleep(10);
sleep(10);
}
}
}
Yes.
Okay, I'll elaborate.
Each call to waitpid reaps one, and only one, child. Since you put the call inside the signal handler, there is no guarantee that the second child will exit before you finish executing the first signal handler. For two processes that is okay (the pending signal will be handled when you finish), but for more, it might be that two children will finish while you're still handling another one. Since signals are not queued, you will miss a notification.
If that happens, you will not reap all children. To avoid that problem, the loop recommendation was introduced. If you want to see it happen, try running your test with more children. The more you run, the more likely you'll see the problem.
With that out of the way, let's talk about some other issues.
First, your signal handler calls printf. That is a major no-no. Very few functions are signal handler safe, and printf definitely isn't one. You can try and make your signal handler safer, but a much saner approach is to put in a signal handler that merely sets a flag, and then do the actual wait call in your main program's flow.
Since your main flow is, typically, to call select/epoll, make sure to look up pselect and epoll_pwait, and to understand what they do and why they are needed.
Even better (but Linux specific), look up signalfd. You might not need the signal handler at all.
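As a rough illustration of the flag approach, here is a sketch in which the handler only sets a flag and all reaping happens in the main flow. It uses sigsuspend() as a stand-in for pselect()/epoll_pwait(), since all three atomically swap the signal mask while waiting; the names got_sigchld, note_sigchld and spawn_and_reap are hypothetical.

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_sigchld = 0;

/* The handler does nothing but record that SIGCHLD arrived. */
static void note_sigchld(int sig)
{
    (void)sig;
    got_sigchld = 1;
}

/* Forks n children that exit at once, then reaps them from the main
   flow. Returns the number reaped, or -1 on setup failure. */
int spawn_and_reap(int n)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    int started = 0, reaped = 0, i;

    memset(&sa, 0, sizeof sa);
    sigemptyset(&sa.sa_mask);
    sa.sa_handler = note_sigchld;
    if (sigaction(SIGCHLD, &sa, NULL) < 0)
        return -1;

    /* Keep SIGCHLD blocked except while we are explicitly waiting,
       so the flag test and the wait cannot race with the handler. */
    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return -1;
    wait_mask = old;
    sigdelset(&wait_mask, SIGCHLD);

    for (i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            _exit(0);
        if (pid > 0)
            started++;
    }

    while (reaped < started) {
        pid_t pid;

        while (!got_sigchld)
            sigsuspend(&wait_mask);   /* atomically unblock and sleep */
        got_sigchld = 0;

        /* Merged signals: drain every child that is ready right now. */
        while ((pid = waitpid(-1, NULL, WNOHANG)) > 0)
            reaped++;
        if (pid < 0)
            break;                    /* no children left at all */
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    return reaped;
}
```

In a real program the sigsuspend() call would typically be replaced by the pselect() or epoll_pwait() call that the main loop already makes.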
Edited to add:
The loop does not change the fact that two signal deliveries are merged into one handler call. What it does do is that this one call handles all pending events.
Of course, once that's the case, you must use WNOHANG. The same artifacts that cause signals to be merged might also cause you to handle an event for which a signal is yet to be delivered.
If that happens, then once your first signal handler exits, it will get called again. This time, however, there will be no pending events (as the events were already extracted by the loop). If you do not specify WNOHANG, your wait will block, and the program will be stuck indefinitely.
This is a continuation of How to prevent SIGINT in child process from propagating to and killing parent process?
In the above question, I learned that SIGINT wasn't being bubbled up from child to parent, but rather, is issued to the entire foreground process group, meaning I needed to write a signal handler to prevent the parent from exiting when I hit CTRL + C.
I tried to implement this, but here's the problem. Regarding specifically the kill syscall I invoke to terminate the child, if I pass in SIGKILL, everything works as expected, but if I pass in SIGTERM, it also terminates the parent process, showing Terminated: 15 in the shell prompt later.
Even though SIGKILL works, I want to use SIGTERM because, from what I've read, it seems like a better idea in general: it gives the process being signaled a chance to clean itself up.
The below code is a stripped down example of what I came up with
#include <stdio.h>
#include <signal.h>
#include <stdlib.h>
#include <unistd.h>
pid_t CHILD = 0;
void handle_sigint(int s) {
(void)s;
if (CHILD != 0) {
kill(CHILD, SIGTERM); // <-- SIGKILL works, but SIGTERM kills parent
CHILD = 0;
}
}
int main() {
// Set up signal handling
char str[2];
struct sigaction sa = {
.sa_flags = SA_RESTART,
.sa_handler = handle_sigint
};
sigaction(SIGINT, &sa, NULL);
for (;;) {
printf("1) Open SQLite\n"
"2) Quit\n"
"-> "
);
scanf("%1s", str);
if (str[0] == '1') {
CHILD = fork();
if (CHILD == 0) {
execlp("sqlite3", "sqlite3", NULL);
printf("exec failed\n");
} else {
wait(NULL);
printf("Hi\n");
}
} else if (str[0] == '2') {
break;
} else {
printf("Invalid!\n");
}
}
}
My educated guess as to why this is happening would be something intercepts the SIGTERM, and kills the entire process group. Whereas, when I use SIGKILL, it can't intercept the signal so my kill call works as expected. That's just a stab in the dark though.
Could someone explain why this is happening?
As a side note, I'm not thrilled with my handle_sigint function. Is there a more standard way of killing an interactive child process?
You have too many bugs in your code (from not clearing the signal mask on the struct sigaction) for anyone to explain the effects you are seeing.
Instead, consider the following working example code, say example.c:
#define _POSIX_C_SOURCE 200809L
#include <stdlib.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <signal.h>
#include <string.h>
#include <stdio.h>
#include <errno.h>
/* Child process PID, and atomic functions to get and set it.
* Do not access the internal_child_pid, except using the set_ and get_ functions.
*/
static pid_t internal_child_pid = 0;
static inline void set_child_pid(pid_t p) { __atomic_store_n(&internal_child_pid, p, __ATOMIC_SEQ_CST); }
static inline pid_t get_child_pid(void) { return __atomic_load_n(&internal_child_pid, __ATOMIC_SEQ_CST); }
static void forward_handler(int signum, siginfo_t *info, void *context)
{
const pid_t target = get_child_pid();
if (target != 0 && info->si_pid != target)
kill(target, signum);
}
static int forward_signal(const int signum)
{
struct sigaction act;
memset(&act, 0, sizeof act);
sigemptyset(&act.sa_mask);
act.sa_sigaction = forward_handler;
act.sa_flags = SA_SIGINFO | SA_RESTART;
if (sigaction(signum, &act, NULL))
return errno;
return 0;
}
int main(int argc, char *argv[])
{
int status;
pid_t p, r;
if (argc < 2 || !strcmp(argv[1], "-h") || !strcmp(argv[1], "--help")) {
fprintf(stderr, "\n");
fprintf(stderr, "Usage: %s [ -h | --help ]\n", argv[0]);
fprintf(stderr, " %s COMMAND [ ARGS ... ]\n", argv[0]);
fprintf(stderr, "\n");
return EXIT_FAILURE;
}
/* Install signal forwarders. */
if (forward_signal(SIGINT) ||
forward_signal(SIGHUP) ||
forward_signal(SIGTERM) ||
forward_signal(SIGQUIT) ||
forward_signal(SIGUSR1) ||
forward_signal(SIGUSR2)) {
fprintf(stderr, "Cannot install signal handlers: %s.\n", strerror(errno));
return EXIT_FAILURE;
}
p = fork();
if (p == (pid_t)-1) {
fprintf(stderr, "Cannot fork(): %s.\n", strerror(errno));
return EXIT_FAILURE;
}
if (!p) {
/* Child process. */
execvp(argv[1], argv + 1);
fprintf(stderr, "%s: %s.\n", argv[1], strerror(errno));
return EXIT_FAILURE;
}
/* Parent process. Ensure signals are reflected. */
set_child_pid(p);
/* Wait until the child we created exits. */
while (1) {
status = 0;
r = waitpid(p, &status, 0);
/* Error? */
if (r == -1) {
/* EINTR is not an error. Occurs more often if
SA_RESTART is not specified in sigaction flags. */
if (errno == EINTR)
continue;
fprintf(stderr, "Error waiting for child to exit: %s.\n", strerror(errno));
status = EXIT_FAILURE;
break;
}
/* Child p exited? */
if (r == p) {
if (WIFEXITED(status)) {
if (WEXITSTATUS(status))
fprintf(stderr, "Command failed [%d]\n", WEXITSTATUS(status));
else
fprintf(stderr, "Command succeeded [0]\n");
} else
if (WIFSIGNALED(status))
fprintf(stderr, "Command exited due to signal %d (%s)\n", WTERMSIG(status), strsignal(WTERMSIG(status)));
else
fprintf(stderr, "Command process died from unknown causes!\n");
break;
}
}
/* This is a poor hack, but works in many (but not all) systems.
Instead of returning a valid code (EXIT_SUCCESS, EXIT_FAILURE)
we return the entire status word from the child process. */
return status;
}
Compile it using e.g.
gcc -Wall -O2 example.c -o example
and run using e.g.
./example sqlite3
You'll notice that Ctrl+C does not interrupt sqlite3 -- but then again, it does not even if you were to run sqlite3 directly --; instead, you just see ^C on screen. This is because sqlite3 sets up the terminal in such a way that Ctrl+C does not cause a signal, and is just interpreted as normal input.
You can exit from sqlite3 using the .quit command, or pressing Ctrl+D at the start of a line.
You'll see that the original program will output a Command ... [] line afterwards, before returning you to the command line. Thus, the parent process is not killed/harmed/bothered by the signals.
You can use ps f to look at a tree of your terminal processes, and that way find out the PIDs of the parent and child processes, and send signals to either one to observe what happens.
Note that because SIGSTOP signal cannot be caught, blocked, or ignored, it would be nontrivial to reflect the job control signals (as in when you use Ctrl+Z). For proper job control, the parent process would need to set up a new session and a process group, and temporarily detach from the terminal. That too is quite possible, but a bit beyond the scope here, as it involves quite detailed behaviour of sessions, process groups, and terminals, to manage correctly.
Let's deconstruct the above example program.
The example program itself first installs some signal reflectors, then forks a child process, and that child process executes the command sqlite3. (You can specify any executable and any parameter strings to the program.)
The internal_child_pid variable, and set_child_pid() and get_child_pid() functions, are used to manage the child process atomically. The __atomic_store_n() and __atomic_load_n() are compiler-provided built-ins; for GCC, see the GCC documentation on the __atomic built-ins for details. They avoid the problem of a signal occurring while the child pid is only partially assigned. On some common architectures this cannot occur, but this is intended as a careful example, so atomic accesses are used to ensure only a complete (old or new) value is ever seen. We could avoid using these completely, if we blocked the related signals temporarily during the transition instead. Again, I decided the atomic accesses are simpler, and might be interesting to see in practice.
The forward_handler() function obtains the child process PID atomically, then verifies it is nonzero (that we know we have a child process), and that we are not forwarding a signal sent by the child process (just to ensure we don't cause a signal storm, the two bombarding each other with signals). The various fields in the siginfo_t structure are listed in the man 2 sigaction man page.
The forward_signal() function installs the above handler for the specified signal signum. Note that we first use memset() to clear the entire structure to zeros. Clearing it this way ensures future compatibility, if some of the padding in the structure is converted to data fields.
The .sa_mask field in the struct sigaction is an unordered set of signals. The signals set in the mask are blocked from delivery in the thread that is executing the signal handler. (For the above example program, we can safely say that these signals are blocked while the signal handler is run; it's just that in multithreaded programs, the signals are only blocked in the specific thread that is used to run the handler.)
It is important to use sigemptyset(&act.sa_mask) to clear the signal mask. Simply setting the structure to zero does not suffice, even if it works (probably) in practice on many machines. (I don't know; I haven't even checked. I prefer robust and reliable over lazy and fragile any day!)
The flags used include SA_SIGINFO because the handler uses the three-argument form (and uses the si_pid field of the siginfo_t). SA_RESTART flag is only there because the OP wished to use it; it simply means that if possible, the C library and the kernel try to avoid returning errno == EINTR error if a signal is delivered using a thread currently blocking in a syscall (like wait()). You can remove the SA_RESTART flag, and add a debugging fprintf(stderr, "Hey!\n"); in a suitable place in the loop in the parent process, to see what happens then.
The sigaction() function will return 0 if there is no error, or -1 with errno set otherwise. The forward_signal() function returns 0 if the forward_handler was assigned successfully, but a nonzero errno number otherwise. Some do not like this kind of return value (they prefer just returning -1 for an error, rather than the errno value itself), but I've for some unreasonable reason grown fond of this idiom. Change it if you want, by all means.
Now we get to main().
If you run the program without parameters, or with a single -h or --help parameter, it'll print a usage summary. Again, doing it this way is just something I'm fond of -- getopt() and getopt_long() are more commonly used to parse command-line options. For this kind of trivial program, I just hardcoded the parameter checks.
In this case, I intentionally left the usage output very short. It would really be much better with an additional paragraph about exactly what the program does. These kinds of texts -- and especially comments in the code (explaining the intent, the idea of what the code should do, rather than describing what the code actually does) -- are very important. It's been well over two decades since the first time I got paid to write code, and I'm still learning how to comment -- describe the intent of -- my code better, so I think the sooner one starts working on that, the better.
The fork() part ought to be familiar. If it returns -1, the fork failed (probably due to limits or some such), and it is a very good idea to print out the errno message then. The return value will be 0 in the child, and the child process ID in the parent process.
The execvp() function takes two arguments: the name of the binary file (the directories specified in the PATH environment variable will be used to search for such a binary), as well as an array of pointers to the arguments to that binary. The first argument will be argv[0] in the new binary, i.e. the command name itself.
The execvp(argv[1], argv + 1); call is actually quite simple to parse, if you compare it to the above description. argv[1] names the binary to be executed. argv + 1 is basically equivalent to (char **)(&argv[1]), i.e. it is an array of pointers that start with argv[1] instead of argv[0]. Once again, I'm simply fond of the execvp(argv[n], argv + n) idiom, because it allows one to execute another command specified on the command line without having to worry about parsing a command line, or executing it through a shell (which is sometimes downright undesirable).
The man 7 signal man page explains what happens to signal handlers at fork() and exec(). In short, the signal handlers are inherited over a fork(), but reset to defaults at exec(). Which is, fortunately, exactly what we want, here.
If we were to fork first, and then install the signal handlers, we'd have a window during which the child process already exists, but the parent still has default dispositions (mostly termination) for the signals.
Instead, we could just block these signals using e.g. sigprocmask() in the parent process before forking. Blocking a signal means it is made to "wait"; it will not be delivered until the signal is unblocked. In the child process, the signals should be unblocked again before the exec(): the signal dispositions are reset to defaults over an exec(), but the signal mask is inherited, so a signal left blocked would stay blocked in the new program. In the parent process, we could then -- or before forking, it does not matter -- install the signal handlers, and finally unblock the signals. This way we would not need the atomic stuff, nor even check if the child pid is zero, since the child pid will be set to its actual value well before any signal can be delivered!
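A sketch of that blocking alternative might look like this. The helper name spawn_with_signals_blocked and the choice of SIGINT and SIGTERM are assumptions for illustration. Because the signal mask is preserved across exec, the child restores the original mask before calling execvp().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper. Blocks SIGINT and SIGTERM, forks, stores the
   child PID in *child_slot, and only then unblocks, so a handler that
   reads *child_slot never runs before the PID has been recorded.
   Returns the child PID, or (pid_t)-1 on failure. */
pid_t spawn_with_signals_blocked(char *const argv[],
                                 volatile pid_t *child_slot)
{
    sigset_t block, old;
    pid_t pid;

    sigemptyset(&block);
    sigaddset(&block, SIGINT);
    sigaddset(&block, SIGTERM);
    if (sigprocmask(SIG_BLOCK, &block, &old) < 0)
        return (pid_t)-1;

    pid = fork();
    if (pid == 0) {
        /* The mask survives exec, so put the original one back. */
        sigprocmask(SIG_SETMASK, &old, NULL);
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }
    if (pid > 0)
        *child_slot = pid;          /* signals are still blocked here */

    /* Any signal that arrived meanwhile is delivered now. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return pid;
}
```

The handlers themselves can be installed before or after this call returns the mask to normal, as long as they are in place before the final sigprocmask() if pending signals should be forwarded.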
The while loop is basically just a loop around the waitpid() call, until the exact child process we started exits, or something funny happens (the child process vanishes somehow). This loop contains pretty careful error checking, as well as the correct EINTR handling if the signal handlers were to be installed without the SA_RESTART flags.
If the child process we forked exits, we check the exit status and/or reason it died, and print a diagnostic message to standard error.
Finally, the program ends with a horrible hack: instead of returning EXIT_SUCCESS or EXIT_FAILURE, we return the entire status word we obtained with waitpid when the child process exited. The reason I left this in, is because it is sometimes used in practice, when you want to return the same or as similar exit status code as a child process returned with. So, it's for illustration. If you ever find yourself to be in a situation when your program should return the same exit status as a child process it forked and executed, this is still better than setting up machinery to have the process kill itself with the same signal that killed the child process. Just put a prominent comment there if you ever need to use this, and a note in the installation instructions so that those who compile the program on architectures where that might be unwanted, can fix it.
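If returning the raw status word is a concern, one alternative (not part of the example program above) is to map the status to an ordinary exit code. The sketch below assumes the common shell convention of 128 plus the signal number for a child killed by a signal; the function names are made up.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Maps a waitpid() status word to a value that is safe to return from
   main(). The 128 + signal convention is borrowed from shells and is
   an assumption here, not something POSIX requires of programs. */
int exit_code_from_status(int status)
{
    if (WIFEXITED(status))
        return WEXITSTATUS(status);
    if (WIFSIGNALED(status))
        return 128 + WTERMSIG(status);
    return 255;                     /* neither exited nor signaled */
}

/* Runs argv[0], waits for it, and returns the mapped exit code. */
int run_and_get_exit_code(char *const argv[])
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return 255;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);                 /* only reached if exec failed */
    }
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return 255;
    return exit_code_from_status(status);
}
```

The trade-off is that the caller can no longer tell a child that exited with, say, 137 from one that was killed by signal 9, which is the same ambiguity shells have.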
So, I'm exiting from the child thread back to the parent. I am using the _exit() system call. I was wondering a few things. One was what parameter for the _exit for my child. Here is the code that my child process is executing:
printf("\n****Child process.****\n\nSquence: ");
do{
//Print the integer in the sequence.
printf("%d\t",inputInteger);
if((inputInteger%2) == 0){
//printf("inputInteger = %d\n", inputInteger);
inputInteger = inputInteger / 2;
}else{
inputInteger = 3*inputInteger +1;
//printf("%d\t",inputInteger);
}
}while(inputInteger != 1);
//Makes sure we print off 1!
printf("%d\n\n", inputInteger);
//Properly exit
_exit(status);
I use status because back in my parent thread I use it in the waitpid() system call. Here is the code for parent process that is executed after the child is completed.
waitpid_check = waitpid(processID, &status, 0);
printf("\n****Parent process.****\n");
if(waitpid_check == -1){
printf("Error in waitpid.\n");
exit(EXIT_FAILURE);
}
if(WIFEXITED(status)){
printf("Child process terminated normally!\n");
}
Here I'm using waitpid() system call that ensures that the child was exited, then use status to check if it was exited properly. I was wondering if I was going about this in the right way of creating the child and exiting it.
Then I was also wondering if I was correctly checking the exiting of the child in the parent.
Thanks for your help!
From the waitpid linux manual.
"If status is not NULL, wait() and waitpid() store status information in the int to which
it points."
You don't need the return value of waitpid to check if the child failed. You need to check the value of status. There are a handful of macros to check status.
WIFEXITED(status)
returns true if the child terminated normally, that is, by calling exit(3) or _exit(2), or by returning from main().
WEXITSTATUS(status)
returns the exit status of the child. This consists of the least significant 8 bits of the status argument that the child specified in a call to exit(3) or _exit(2) or as the argument for a return statement in main(). This macro should only be employed if WIFEXITED returned true.
WIFSIGNALED(status)
returns true if the child process was terminated by a signal.
WTERMSIG(status)
returns the number of the signal that caused the child process to terminate. This macro should only be employed if WIFSIGNALED returned true.
WCOREDUMP(status)
returns true if the child produced a core dump. This macro should only be employed if WIFSIGNALED returned true. This macro is not specified in POSIX.1-2001 and is not available on some UNIX implementations (e.g., AIX, SunOS). Only use this enclosed in #ifdef WCOREDUMP ... #endif.
WIFSTOPPED(status)
returns true if the child process was stopped by delivery of a signal; this is only possible if the call was done using WUNTRACED or when the child is being traced (see ptrace(2)).
WSTOPSIG(status)
returns the number of the signal which caused the child to stop. This macro should only be employed if WIFSTOPPED returned true.
WIFCONTINUED(status)
(since Linux 2.6.10) returns true if the child process was resumed by delivery of SIGCONT.
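To illustrate how WIFEXITED and WEXITSTATUS fit together, here is a small sketch with a hypothetical helper, child_exit_status(), that forks a child which calls _exit(code) and then decodes the status in the parent. It also shows that only the least significant 8 bits of the code reach the parent.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks a child that immediately calls _exit(code). Returns the exit
   status seen by the parent, or -1 if fork/waitpid failed or the
   child did not terminate normally. */
int child_exit_status(int code)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(code);                /* child: pass the code to the parent */

    /* Check the return value of waitpid() first ... */
    while (waitpid(pid, &status, 0) < 0)
        if (errno != EINTR)
            return -1;

    /* ... and only then decode status with the macros. */
    if (WIFEXITED(status))
        return WEXITSTATUS(status); /* low 8 bits of the child's code */
    return -1;
}
```

For example, a child that calls _exit(256 + 5) is reported with exit status 5, because the upper bits are discarded.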
As for whether or not you are exiting the child process correctly, that really depends. You would exit as you would in any other program: when you fork a process you are really just duplicating an address space, and the child then runs as its own independent program (with the same open FDs, already-declared values, etc. as the parent). Below is a typical implementation for this problem (although NULL is being passed to wait instead of a status, so I think you are doing it right).
/* fork a child process */
pid = fork();
if (pid < 0) { /* error occurred */
fprintf(stderr, "Fork Failed\n");
return 1;
}
else if (pid == 0) { /* child process */
printf("I am the child %d\n",pid);
execlp("/bin/ls","ls",NULL);
}
else { /* parent process */
/* parent will wait for the child to complete */
printf("I am the parent %d\n",pid);
wait(NULL);
printf("Child Complete\n");
}
return 0;
I'd love to help but I'm really rusty on these calls. If you've read through the documentation on these API calls and you're checking everywhere for error returns, then you should be in good shape.
The idea seems good at a high level.
One thing to keep in mind is you might want to surround the meat of your child method in a try/catch. With threads, you often don't want an exception to mess up your main flow.
You won't have that problem with multiple processes, but think about whether you want _exit to be called in the face of an exception, and how to communicate (to the parent or to the user) that an exception occurred.
I'm trying to solve a problem I've got where a child process runs execvp() and needs to let the parent know if it returns. So, after the execvp() returns (because there's been an error), how can I tell the parent that this particular event has happened so it can handle it.
There's one method of writing a string of text through the pipe I'm using and then reading that from the parent.. but it seems a bit sloppy. Is there a better way?
Thanks!
Edit: Here is some code I'm trying where I can't seem to get the read to return.
int errorPipe[2];
signal( SIGPIPE, SIG_IGN );
int oldflags = fcntl (errorPipe[0], F_GETFD, 0);
oldflags |= FD_CLOEXEC;
fcntl (errorPipe[0], F_SETFD, oldflags);
oldflags = fcntl (errorPipe[1], F_GETFD, 0);
oldflags |= FD_CLOEXEC;
fcntl (errorPipe[1], F_SETFD, oldflags);
pipe( errorPipe );
// in the child..
char *error_message = "exec failed";
write( errorPipe[1], error_message, strlen(error_message)+1 );
exit(-1);
// in the parent
printf("read gives: %d\n", read(errorPipe[0], error_read, MAX_LINE_LENGTH) );
The easiest way is a pipe with the FD_CLOEXEC flag set, as then you can detect a successful exec as easily as a failure. In the event of a failure, I'd write the whole error message back to the parent over the pipe, but you could just write the status code or anything else that is meaningful. (Definitely write something though; nothing written has got to be a sign of a successful start of the other executable.)
[EDIT]: How to make use of this:
If the parent needs to wait until it knows whether the child successfully ran execve() (the underlying syscall) then it should do a blocking read() on the pipe. A zero result from that indicates success. (Make sure you've got SIGPIPE ignored.)
If the parent has some kind of event handling framework based on non-blocking IO and select() (or poll() or kqueue() or …) then wait for the pipe to become readable before trying to read the message (which will be zero-length if the child did the execve() correctly).
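The blocking variant could be sketched as below. This version writes the child's errno over the pipe instead of a text message, and the name spawn_checked is made up; both are choices made for the illustration only.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns 0 if exec succeeded (child PID stored in *pid_out), the
   child's errno if exec failed, or -1 on any other failure. */
int spawn_checked(char *const argv[], pid_t *pid_out)
{
    int fds[2], err = 0;
    ssize_t n;
    pid_t pid;

    if (pipe(fds) < 0)
        return -1;
    /* Close-on-exec: a successful exec closes the write end for us. */
    if (fcntl(fds[1], F_SETFD, FD_CLOEXEC) < 0 || (pid = fork()) < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        close(fds[0]);
        execvp(argv[0], argv);
        err = errno;                /* exec failed: report why */
        n = write(fds[1], &err, sizeof err);
        (void)n;
        _exit(127);
    }

    /* The parent must close its copy of the write end, otherwise the
       read() below can never see end-of-file. */
    close(fds[1]);
    do {
        n = read(fds[0], &err, sizeof err);
    } while (n < 0 && errno == EINTR);
    close(fds[0]);

    if (n == 0) {                   /* EOF without data: exec worked */
        *pid_out = pid;
        return 0;
    }
    /* Exec failed (or the read broke): reap the child right away. */
    while (waitpid(pid, NULL, 0) < 0 && errno == EINTR)
        ;
    return n == (ssize_t)sizeof err ? err : -1;
}
```

On success the caller still owns the child and has to wait for it later; on failure the helper has already reaped it and the return value can be passed to strerror().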
execvp() never returns, except when it fails to even start the executable at all. If it can start the executable, it will not return, no matter what the executable does (i.e. regardless to whether the executable succeeds at its task or not).
Your parent process will receive a SIGCHLD signal, for which you can install a signal handler.
Or you can wait(2) for the child process.
int child_pid = fork();
if (child_pid == 0) {
execvp("/path/to/executable", ...);
exit(123); /* this happens only if execvp() fails to invoke executable */
}
/* ... */
int status = 0;
int exit_pid = waitpid(-1, &status, WNOHANG);
if (exit_pid == child_pid && WIFEXITED(status)) {
if (WEXITSTATUS(status) == 0) {
/* child process exited fine */
} else if (WEXITSTATUS(status) == 123) {
/* execvp() itself failed */
} else {
/* executed child process failed */
}
}
Cache the pid for the (child) process for which you want to examine the status in the parent.
Add a handler for SIGCHLD in the parent. In the child call exit with some status value of your choosing to denote that execvp failed. On receiving the signal in the parent you now have 2 options
a) Call waitpid with a pid of -1 (i.e. wait for any child), examine the return value, if that matches your cached pid, examine the status using macros like WEXITSTATUS.
b) Call waitpid with your cached pid, then on return examine the exit status.
To make this robust you should call WIFEXITED(status) before examining the exit status via WEXITSTATUS. WIFEXITED returns true if the child terminated normally, i.e. by calling exit or _exit, and not as a result of a segfault, an unhandled signal, etc.
Also see man wait(2).