How does parent process convert child process exit code to status code? - c

When I run the following code (headers and main entry ommitted)
void fork11()
{
pid_t pid[N];
int i;
int child_status;
for (i = 0; i < N; i++)
if ((pid[i] = fork()) == 0)
exit(100+i); /* Child */
for (i = N-1; i >= 0; i--) {
pid_t wpid = waitpid(pid[i], &child_status, 0);
if (WIFEXITED(child_status)) {
printf("Child %d terminated with exit status %d\n", wpid, WEXITSTATUS(child_status));
printf("child_status: %d\n", child_status);
} else {
printf("Child %d terminate abnormally\n", wpid);
}
}
}
the result is
Child 5126 terminated with exit status 104
child_status: 26624
Child 5125 terminated with exit status 103
child_status: 26368
Child 5124 terminated with exit status 102
child_status: 26112
Child 5123 terminated with exit status 101
child_status: 25856
Child 5122 terminated with exit status 100
child_status: 25600
with some digging around I find WEXITSTATUS is a simple macro
#define WEXITSTATUS(x) ((_W_INT(x) >> 8) & 0x000000ff)
take child process 5126 for example, waitpid converts 104 to 26624=0x6800, WEXITSTATUS converts 26624 back to 104=0x68, I tried to look up source code of waitpid but ended up with some kernel functions which I can't understand... could anyone explain a little bit about how does waitpid convert exit code? It seems fairly simple in the above example but I think it's much more than that, thanks!

When a process exits, the return value passed to the exit() function is used as a parameter for the actual int (4 bytes) passed to any process that waits for it. In addition to the 8 bits of this status code, more information is being passed.
For example, in the actual (final) exit code (the entire 4 bytes), it's being specified whether the process was signaled or not.
As seen here about WEXITSTATUS:
This consists of the least significant 8 bits of the status argument that the child specified in a call to exit(3)...
This means that processes cannot exit with codes that require more than 8 bits.

Related

C, What process returns for wait()?

Let's suppose that I wrote the following in main:
int main() {
int status;
if ((p1 = fork()) != 0) {
if ((p2 = fork()) != 0) {
wait(&status);
wait(&status);
} else {
processB();
}
}
return 3;
}
plus:
int processB() {
return 1;
}
when process p2 ends, what will be the value it saves in &status?
If it ended using exit(100) for example then it will return 100, but I don't know what will happen in this case?
The exit status from the process is encoded in 16 bits. The high-order 8 bits contain the return value from main() or the value passed to exit() (or one of its relatives); the low-order 8 bits contain the signal that killed the process and a flag indicating that a core dump was created. The low order bits are zero if the process exited normally. The return value from wait() is the PID of the dead process.
Use code similar to this to print out the information in a semi-legible format:
int corpse;
int status;
while ((corpse = wait(&status)) > 0)
printf("PID %5d exited with status 0x%.4X\n", corpse, status);
If your program has return 3; at the end of main(), then the value reported via the status pointer to wait() will be 0x0300 (768 = 3 * 256). If your program calls exit(100);, the value returned by wait() will be 0x6400 (100 = 6 * 16 + 4). If your program is killed by an interrupt (SIGINT, 2), the value returned by wait() will be 0x0002. (Technically, the actual values are system-dependent, but this is historically accurate and I know of no system where this is not what happens.)
See also ExitCodes bigger than 255 — Possible?.
POSIX defines the <sys/wait.h>
header with a bunch of relevant macros, such as:
WIFEXITED
WEXITSTATUS
WIFSIGNALED
WTERMSIG
WIFSTOPPED
WSTOPSIG
And implementations often define a macro to detect core dumps — usually WCOREDUMP.
See also the POSIX specification for wait()
(which also specifies waitpid()).

Why should we check WIFEXITED after wait in order to kill child processes in Linux system call?

I came across some code in C where we check the return value of wait and if it's not an error there's yet another check of WIFEXITED and WIFEXITSTATUS. Why isn't this redundant? As far as I understand wait returns -1 if an error occurred while WIFEXITED returns non-zero value if wait child terminated normally. So if there wasn't any error in this line if ( wait(&status) < 0 ) why would anything go wrong durng WIFEXITED check?
This is the code:
#include <stdio.h>
#include <signal.h>
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>
#define CHILDREN_NUM 5
int main () {
int i, status, pid, p;
for(i = 0; (( pid = fork() ) < 0) && i < CHILDREN_NUM;i++)
sleep(5);
if ( pid == 0 )
{
printf(" Child %d : successfully created!\n",i);
exit( 0 ); /* Son normally exits here! */
}
p = CHILDREN_NUM;
/* The father waits for agents to return succesfully */
while ( p >= 1 )
{
if ( wait(&status) < 0 ) {
perror("Error");
exit(1);
}
if ( ! (WIFEXITED(status) && (WEXITSTATUS(status) == 0)) ) /* kill all running agents */
{
fprintf( stderr,"Child failed. Killing all running children.\n");
//some code to kill children here
exit(1);
}
p--;
}
return(0);
}
wait returning >= 0 tells you a child process has terminated (and that calling wait didn't fail), but it does not tell you whether that process terminated successfully or not (or if it was signalled).
But, here, looking at your code, it's fairly obvious the program does care about whether the child process that terminated did so successfully or not:
fprintf( stderr,"Child failed. Killing all running children.\n");
So, the program needs to do further tests on the status structure that was populated by wait:
WIFEXITED(status): did the process exit normally? (as opposed to being signalled).
WEXITSTATUS(status) == 0: did the process exit with exit code 0 (aka "success"). For more information, see: Meaning of exit status 1 returned by linux command.
wait(&status) waits on the termination of a child process. The termination may be due to a voluntary exit or due to the receipt of an unhandled signal whose default disposition is to terminate the process.
WIFEXITED(status) and WIFSIGNALED(status) allow you to distinguish the two* cases, and you can later use either WEXITSTATUS or WTERMSIG to retrieve either the exit status (if(WIFEXITED(status)) or termination signal (if(WIFSIGNALED(status)).
*with waitpid and special flags (WUNTRACED, WCONTINUED), you can also wait on child process stops and resumptions, which you can detect with WIFSTOPPED or WIFCONTINUED (linux only) respectively. See waitpid(2) for more information.

c - Proper range of return status / value

Recently, when reading a book about linux programming, I got a message that:
The status argument given to _exit() defines the termination status of the process, which is available to the parent of this process when it calls wait(). Although defined as an int, only the bottom 8 bits of status are actually made available to the parent. And only 0 ~ 127 is recommanded to use, because 128 ~ 255 could be confusing in shell due to some reason. Due to that -1 will become 255 in 2's complement.
The above is about the exit status of a child process.
My question is:
Why the parent process only get the 8 bits of the child process's exit status?
What about return value of normal functions? Does it reasonable or befinit to use only 0 ~ 127 ? Because I do use -1 as return value to indicate error sometimes, should I correct that in future.
Update - status get by wait() / waitpid():
I read more chps in the book (TLPI), and found there are more trick in the return status & wait()/waitpid() that worth mention, I should have read more chps before ask the question. Anyhow, I have add an answer by myself to describe about it, in case it might help someone in future.
Why the parent process only get the 8 bits of the child process's exit status?
Because POSIX says so. And POSIX says so because that's how original Unix worked, and many operating system derived from it and modeled after it continue to work.
What about return value of normal functions?
They are unrelated. Return whatever is reasonable. -1 is as good as any other value, and is in fact a standard way to indicate an error in a huge lot of standard C and POSIX APIs.
The answer from #n.m. is good.
But later on, I read more chps in the book (TLPI), and found there are more trick in the return status & wait()/waitpid() that worth mention, and that might be another important or root reason why child process can't use full bits of int when exit.
Wait status
basicly:
child process should exit with value within range of 1 byte, which is set as part of the status parameter of wait() / waitpid(),
and only 2 LSB bytes of the status is used,
byte usage of status:
event byte 1 byte 0
============================================================
* normal termination exit status (0 ~ 255) 0
* killed by signal 0 termination signal (!=0)
* stopped by signal stop signal 0x7F
* continued by signal 0xFFFF
*
dissect returned status:
header 'sys/wait.h', defines a set of macros that help to dissect a wait status,
macros:
* WIFEXITED(status)
return true if child process exit normally,
*
* WIFSIGNALED(status)
return true if child process killed by signal,
* WTERMSIG(status)
return signal number that terminate the process,
* WCOREDUMP(status)
returns ture if child process produced a core dump file,
tip:
this macro is not in SUSv3, might absent on some system,
thus better check whether it exists first, via:
#ifdef WCOREDUMP
// ...
#endif
*
* WIFSTOPPED(status)
return true if child process stopped by signal,
* WSTOPSIG(status)
return signal number that stopp the process,
*
* WIFCONTINUED(status)
return true if child process resumed by signal SIGCONT,
tip:
this macro is part of SUSv3, but some old linux or some unix might didn't impl it,
thus better check whether it exists first, via:
#ifdef WIFCONTINUED
// ...
#endif
*
Sample code
wait_status_test.c
// dissect status returned by wait()/waitpid()
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <errno.h>
#include <sys/wait.h>
#define SLEEP_SEC 10 // sleep seconds of child process,
int wait_status_test() {
pid_t cpid;
// create child process,
switch(cpid=fork()) {
case -1: // failed
printf("error while fork()\n");
exit(errno);
case 0: // success, child process goes here
sleep(SLEEP_SEC);
printf("child [%d], going to exit\n",(int)getpid());
_exit(EXIT_SUCCESS);
break;
default: // success, parent process goes here
printf("parent [%d], child created [%d]\n", (int)getpid(), (int)cpid);
break;
}
// wait child to terminate
int status;
int wait_flag = WUNTRACED | WCONTINUED;
while(1) {
if((cpid = waitpid(-1, &status, wait_flag)) == -1) {
if(errno == ECHILD) {
printf("no more child\n");
exit(EXIT_SUCCESS);
} else {
printf("error while wait()\n");
exit(-1);
}
}
// disset status
printf("parent [%d], child [%d] ", (int)getpid(), (int)cpid);
if(WIFEXITED(status)) { // exit normal
printf("exit normally with [%d]\n", status);
} else if(WIFSIGNALED(status)) { // killed by signal
char *dumpinfo = "unknow";
#ifdef WCOREDUMP
dumpinfo = WCOREDUMP(status)?"true":"false";
#endif
printf("killed by signal [%d], has dump [%s]\n", WTERMSIG(status), dumpinfo);
} else if(WIFSTOPPED(status)) { // stopped by signal
printf("stopped by signal [%d]\n", WSTOPSIG(status));
#ifdef WIFCONTINUED
} else if(WIFCONTINUED(status)) { // continued by signal
printf("continued by signal SIGCONT\n", WSTOPSIG(status));
#endif
} else { // this should never happen
printf("unknow event\n");
}
}
return 0;
}
int main(int argc, char *argv[]) {
wait_status_test();
return 0;
}
Compile:
gcc -Wall wait_status_test.c
Execute:
./a.out and wait it to terminate normally, child process id is printed after fork(),
./a.out, then kill -9 <child_process_id> before it finish sleep,
./a.out, then kill -STOP <child_process_id> before it finish sleep, then kill -CONT <child_process_id> to resume it,

C - using exec() instead of system()

In the following code:
int main ( int argc, char *argv[] ) {
int i, pid, status;
for(i = 0; i < atoi(argv[1]); i++) {
pid = fork();
if(pid < 0) {
printf("Error occured");
exit(1);
} else if (pid == 0) {
status = system("./frun test.txt 1");
if (status != 0)
printf("ERROR, exited with %d\n", status);
else
printf("SUCCESS, exited with %d\n", status); // 0
exit(0);
} else {
wait(NULL);
}
}
return 0;
}
I am using system() to run a program called 'frun'
My question is - how would I change the code to use one of the exec() family functions instead?
My main problem is that I need to get the exit code of the program and I can't find a way to do it with exec(), while system() just returns the exit status.
Read the manual pages for wait and exec.
You wait on the process created from forking and execing (recall that exec replaces the current process, so you must fork to get its exit code). This is from the man page for wait:
Regardless of its value, this information may be interpreted using the following
macros, which are defined in <sys/wait.h> and evaluate to integral expressions; the
stat_val argument is the integer value pointed to by stat_loc.
WIFEXITED(stat_val)
Evaluates to a non-zero value if status was returned for a child process that
terminated normally.
WEXITSTATUS(stat_val)
If the value of WIFEXITED(stat_val) is non-zero, this macro evaluates to the
low-order 8 bits of the status argument that the child process passed to _exit()
or exit(), or the value the child process returned from main().
WIFSIGNALED(stat_val)
Evaluates to a non-zero value if status was returned for a child process that
terminated due to the receipt of a signal that was not caught (see <signal.h>).
WTERMSIG(stat_val)
If the value of WIFSIGNALED(stat_val) is non-zero, this macro evaluates to the
number of the signal that caused the termination of the child process.
WIFSTOPPED(stat_val)
Evaluates to a non-zero value if status was returned for a child process that is
currently stopped.
WSTOPSIG(stat_val)
If the value of WIFSTOPPED(stat_val) is non-zero, this macro evaluates to the
number of the signal that caused the child process to stop.
WIFCONTINUED(stat_val)
Evaluates to a non-zero value if status was returned for a child process that
has continued from a job control stop.
In your case you'd probably want to use WEXITSTATUS(stat_val) to get the 8 bit exit code.
You call waitpid(pid, stat_loc, options) where you'd pass the pid returned from fork(), a pointer to a local int (where the status is stored), and your options flags (or 0). You would do this is in the branch where pid != 0, since this is the original process. The original process calling fork() (where pid == 0) would call exec and thus be replaced by the exec'd command.
Here is a pseudo-example that you can adapt to your code:
pid_t pid;
int status;
pid = fork();
if (pid == 0)
{
exec(/*look up the different functions in the exec family and how to use them*/);
}
else if (pid < 0)
{
//uh oh
}
else
{
waitpid(pid, &status, 0);
printf("Exited with %d\n", WEXITSTATUS(status));
}
Though you should check the result of waitpid.

I used wait(&status) and the value of status is 256, why?

I have this code:
int status;
t = wait(&status);
When the child process works, the value of status is 0, well.
But why does it return 256 when it doesn't work? And why changing the value of the argument given to exit in the child process when there is an error doesn't change anything (exit(2) instead of exit(1) for example)?
Edit : I'm on linux, and I compiled with GCC.
Given code like this...
int main(int argc, char **argv) {
pid_t pid;
int res;
pid = fork();
if (pid == 0) {
printf("child\n");
exit(1);
}
pid = wait(&res);
printf("raw res=%d\n", res);
return 0;
}
...the value of res will be 256. This is because the return value from wait encodes both the exit status of the process as well as the reason the process exited. In general, you should not attempt to interpret non-zero return values from wait directly; you should use of the WIF... macros. For example, to see if a process exited normally:
WIFEXITED(status)
True if the process terminated normally by a call to _exit(2) or
exit(3).
And then to get the exit status:
WEXITSTATUS(status)
If WIFEXITED(status) is true, evaluates to the low-order 8 bits
of the argument passed to _exit(2) or exit(3) by the child.
For example:
int main(int argc, char **argv) {
pid_t pid;
int res;
pid = fork();
if (pid == 0) {
printf("child\n");
exit(1);
}
pid = wait(&res);
printf("raw res=%d\n", res);
if (WIFEXITED(res))
printf("exit status = %d\n", WEXITSTATUS(res));
return 0;
}
You can read more details in the wait(2) man page.
The status code contains various information about how the child process exited. Macros are provided to get information from the status code.
From wait(2) on linux:
If status is not NULL, wait() and waitpid() store status information in the int to which it points. This integer can be inspected with the following macros (which take the integer itself as an argu-
ment, not a pointer to it, as is done in wait() and waitpid()!):
WIFEXITED(status)
returns true if the child terminated normally, that is, by calling exit(3) or _exit(2), or by returning from main().
WEXITSTATUS(status)
returns the exit status of the child. This consists of the least significant 8 bits of the status argument that the child specified in a call to exit(3) or _exit(2) or as the argument for a
return statement in main(). This macro should be employed only if WIFEXITED returned true.
WIFSIGNALED(status)
returns true if the child process was terminated by a signal.
WTERMSIG(status)
returns the number of the signal that caused the child process to terminate. This macro should be employed only if WIFSIGNALED returned true.
WCOREDUMP(status)
returns true if the child produced a core dump. This macro should be employed only if WIFSIGNALED returned true. This macro is not specified in POSIX.1-2001 and is not available on some UNIX
implementations (e.g., AIX, SunOS). Only use this enclosed in #ifdef WCOREDUMP ... #endif.
WIFSTOPPED(status)
returns true if the child process was stopped by delivery of a signal; this is possible only if the call was done using WUNTRACED or when the child is being traced (see ptrace(2)).
WSTOPSIG(status)
returns the number of the signal which caused the child to stop. This macro should be employed only if WIFSTOPPED returned true.
WIFCONTINUED(status)
(since Linux 2.6.10) returns true if the child process was resumed by delivery of SIGCONT.
SUGGESTION: try one of the following "Process Completion Status" macros:
http://www.gnu.org/software/libc/manual/html_node/Process-Completion-Status.html
EXAMPLE:
int status = 0;
..
int retval = wait (&status);
if (WIFEXITED(status))
printf("OK: Child exited with exit status %d.\n", WEXITSTATUS(status));
else
printf("ERROR: Child has not terminated correctly.\n");

Resources