I'm trying to write a shell and I came across this problem: after I run the fork() and execute the commands, in the main process I wait for all child processes like this:
while (wait(NULL) > 0);
But when I try to suspend a child process, the main process won't go past this loop.
So how do I wait only for non-suspended processes?
I could save the pid_t of every child process I start and then check whether each one is suspended, but I thought there might be a better way.
To wait for any child that has either exited (ended, terminated) or stopped (been suspended), use waitpid() instead.
int wstatus;
{
    pid_t result;
    /* Use WUNTRACED|WCONTINUED to return on continued children as well. */
    while ((pid_t) -1 == (result = waitpid(-1, &wstatus, WUNTRACED)))
    {
        if (EINTR == errno)
        {
            continue; /* interrupted by a signal, try again */
        }
        if (ECHILD == errno)
        {
            exit(EXIT_SUCCESS); /* no children */
        }
        perror("waitpid() failed");
        exit(EXIT_FAILURE);
    }
}
if (WIFEXITED(wstatus))
{
    /* child exited normally with exit code rc = ... */
    int rc = WEXITSTATUS(wstatus);
    ...
}
else if (WIFSIGNALED(wstatus))
{
    /* child exited by signal sig = ... */
    int sig = WTERMSIG(wstatus);
    ...
}
else if (WIFSTOPPED(wstatus))
{
    /* child stopped by signal sig = ... */
    int sig = WSTOPSIG(wstatus);
    ...
}
else if (WIFCONTINUED(wstatus))
{
    /* child continued (occurs only if WCONTINUED was passed to waitpid()) */
}
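To make the status macros concrete, here is a minimal sketch that wraps the same waitpid() call in a helper and tries it on a child that suspends itself. The names wait_for_event, child_event and spawn_self_stopping_child are made up for this example and are not part of any API.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical labels for this sketch. */
enum child_event { EV_EXITED, EV_SIGNALED, EV_STOPPED, EV_CONTINUED, EV_ERROR };

/* Block until child `pid` (or any child if pid is -1) changes state.
 * `options` is passed to waitpid(), e.g. WUNTRACED.
 * *detail receives the exit code or the signal number where applicable. */
static enum child_event wait_for_event(pid_t pid, int options, int *detail)
{
    int wstatus;
    pid_t result;

    assert(detail != NULL);
    *detail = 0;
    do {
        result = waitpid(pid, &wstatus, options);
    } while (result == (pid_t) -1 && errno == EINTR);

    if (result == (pid_t) -1)
        return EV_ERROR;
    if (WIFEXITED(wstatus)) {
        *detail = WEXITSTATUS(wstatus);
        return EV_EXITED;
    }
    if (WIFSIGNALED(wstatus)) {
        *detail = WTERMSIG(wstatus);
        return EV_SIGNALED;
    }
    if (WIFSTOPPED(wstatus)) {
        *detail = WSTOPSIG(wstatus);
        return EV_STOPPED;
    }
    if (WIFCONTINUED(wstatus))
        return EV_CONTINUED;
    return EV_ERROR;
}

/* Fork a child that suspends itself and exits with code 7 once resumed. */
static pid_t spawn_self_stopping_child(void)
{
    pid_t pid = fork();
    if (pid == 0) {
        raise(SIGSTOP);
        _exit(7);
    }
    return pid;
}
```

Called with WUNTRACED on such a child, the helper first reports EV_STOPPED; after kill(pid, SIGCONT) the next call reports EV_EXITED with detail 7. A shell would mark the job as suspended on the first result and stop waiting for it.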
Related
I want to fully terminate a process on Linux, in C89.
The flow is: check whether it is already dead; if not, let it die peacefully by sending SIGTERM and waiting up to 10 seconds for it to exit. If it is still alive after that, SIGKILL it.
int TerminateProcessIMP(pid_t process_to_kill)
{
assert(process_to_kill);
/*---------------------------------------------------------*/
/* check if process is already does not exist */
if (!IsProcessAliveIMP(process_to_kill))
{
return (SUCCESS);
}
/*---------------------------------------------------------*/
/* terminate process_to_kill */
if (kill(process_to_kill, SIGTERM))
{
fprintf(stderr, "%s\n", strerror(errno));
}
if (!IsProcessAliveIMP(process_to_kill))
{
return (SUCCESS);
}
/*---------------------------------------------------------*/
/* if its still alive, SIGKILL it */
if (kill(process_to_kill, SIGKILL))
{
fprintf(stderr, "%s\n", strerror(errno));
}
if (!IsProcessAliveIMP(process_to_kill))
{
return (SUCCESS);
}
/*---------------------------------------------------------*/
return (FAILURE);
}
/******************************************************************************/
int IsProcessAliveIMP(pid_t process_to_check)
{
time_t start_time = 0;
time_t end_time = 0;
time_t time_to_wait = 10; /* in seconds */
assert(process_to_check);
start_time = time(0);
end_time = start_time + time_to_wait;
/* give it time to be terminated because maybe it frees memory meanwhile */
while (0 != kill(process_to_check, 0) && time(0) < end_time)
{}
/* check if it still exists */
if (0 == kill(process_to_check, 0))
{
return (0);
}
/* the process is still alive */
return (1);
}
What do you think?
Right now, it does not work and does not terminate the process.
It tries to terminate the process but fails to do that. I can't figure out why.
Thanks.
You're doing unnecessary things.
To check if a process exists:
kill(pid, 0);
If the return value is 0, the process exists. If it is -1, you have to check errno. For ESRCH the man page says:
The target process or process group does not exist. Note that an existing process might be a zombie, a process that has terminated execution, but has not yet been wait(2)ed for.
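As a small sketch of that check, the helper below wraps kill(pid, 0). The name process_exists is made up for this example; treating EPERM as "exists" is my reading of the man page (the signal could not be sent, but there was a target to refuse it).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>

/* Returns 1 if a process with this pid exists (possibly as a zombie),
 * 0 if it does not exist, and -1 on any other error. */
static int process_exists(pid_t pid)
{
    assert(pid > 0); /* 0 and negative values address process groups */
    if (kill(pid, 0) == 0)
        return 1;
    if (errno == EPERM)
        return 1; /* exists, but we may not signal it */
    return (errno == ESRCH) ? 0 : -1;
}
```

Note that a result of 1 does not mean the process is still running: a terminated child that has not been waited for yet also counts as existing.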
To terminate a process:
kill(pid, SIGTERM); // signal can be blocked, handled or ignored
or
kill(pid, SIGKILL); // signal cannot be blocked, handled or ignored
After a kill, if the process is your child, you have to wait for it via:
waitpid(pid, &status, 0);
If status is not NULL, wait() and waitpid() store status information in the int to which it points. This integer can be inspected with the macros, described in the man page.
See
man kill
man waitpid
Example:
int TerminateProcessIMP(pid_t pid)
{
    //check if process exists
    int res = kill(pid, 0);
    if ((res == -1) && (errno != ESRCH)) {
        //error: either EINVAL or EPERM
        //ESRCH: an existing process might be a zombie
        return -1;
    }
    if (res == 0) { //process exists
        //ask politely to terminate
        if (kill(pid, SIGTERM) == -1) {
            //error: unable to send a signal to the process
            return -1;
        }
        //let us see if the child complied with our request
        //WNOHANG only: with WUNTRACED or WCONTINUED a stop or continue
        //would also return the pid and look like a termination here
        res = waitpid(pid, NULL, WNOHANG);
        if (res == -1) {
            //most likely not our child (errno == ECHILD)
            //but the process could follow our request and still terminate
            //if you want to be sure goto SIGKILL below
            //or return
        } else if (res == 0) {
            //our child, but at this point the child has not terminated yet
            //(maybe it never will)
            //either continue to wait or goto SIGKILL below
        } else {
            //child complied with our request and terminated in time
            //res contains the id of the child (res == pid)
            return res;
        }
        //--- or ---
        /* do {
            res = waitpid(pid, NULL, WNOHANG);
            //because of WNOHANG
            sleep(1);
            //your timeout method goes here
        } while (!res);
        if (res == -1) { //same as above }
        if (res > 0) { return res; } */
    }
    //at this point, the process either does not exist (maybe zombie),
    //is not our child or refused our request (SIGTERM)
    //send a SIGKILL signal to the process
    kill(pid, SIGKILL);
    //wait for the process to terminate
    res = waitpid(pid, NULL, 0);
    /*if (res == -1) {
        //not our child, or process does not exist
    }*/
    /*if (res > 0) {
        //child successfully terminated
    }*/
    return res;
}
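The "your timeout method goes here" part can be filled in many ways. The following is one possible sketch, assuming the target is a direct child of the caller: it polls waitpid() with WNOHANG every 100 ms during a grace period and falls back to SIGKILL. The names reap_with_timeout, terminate_child and spawn_idle_child are made up for this example, and the last one exists only to try the others out.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Poll for the termination of child `pid` for roughly `seconds` seconds.
 * Returns 1 if the child was reaped, 0 on timeout, -1 on error (e.g. ECHILD). */
static int reap_with_timeout(pid_t pid, int seconds)
{
    struct timespec tick = { 0, 100 * 1000 * 1000 }; /* 100 ms */
    int i;

    for (i = 0; i <= seconds * 10; i++) {
        pid_t r = waitpid(pid, NULL, WNOHANG);
        if (r == pid)
            return 1;
        if (r == (pid_t) -1 && errno != EINTR)
            return -1;
        nanosleep(&tick, NULL);
    }
    return 0;
}

/* Returns 0 if the child ended after SIGTERM, 1 if SIGKILL was needed,
 * -1 on error. */
static int terminate_child(pid_t pid, int grace_seconds)
{
    int r;

    assert(pid > 0);
    if (kill(pid, SIGTERM) == -1)
        return -1;
    r = reap_with_timeout(pid, grace_seconds);
    if (r != 0)
        return (r == 1) ? 0 : -1;
    if (kill(pid, SIGKILL) == -1)
        return -1;
    while (waitpid(pid, NULL, 0) == (pid_t) -1) {
        if (errno != EINTR)
            return -1;
    }
    return 1;
}

/* Demo helper: fork a child that idles forever. If ignore_sigterm is set,
 * SIGTERM is ignored before the fork so the child inherits that setting. */
static pid_t spawn_idle_child(int ignore_sigterm)
{
    pid_t pid;

    if (ignore_sigterm)
        signal(SIGTERM, SIG_IGN);
    pid = fork();
    if (pid == 0) {
        for (;;)
            pause();
    }
    if (ignore_sigterm)
        signal(SIGTERM, SIG_DFL);
    return pid;
}
```

For the 10 second grace period from the question you would call terminate_child(pid, 10). If the process is not your child, waitpid() fails with ECHILD and you are back to polling with kill(pid, 0).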
While building a shell program I'm having trouble recognizing process states. I have a list of child processes and I'm trying to figure out their state using waitpid and WNOHANG. I wish to distinguish between 3 states: TERMINATED, RUNNING and SUSPENDED (as defined in the code below).
I want to set each process's status to one of these three. Right now, however, this function marks running processes as terminated, and it doesn't recognize suspended processes either.
What am I doing wrong, and how should the function updateProcessList be written to achieve this?
#define TERMINATED -1
#define RUNNING 1
#define SUSPENDED 0
typedef struct process{
cmdLine* cmd; /* the parsed command line*/
pid_t pid; /* the process id that is running the command*/
int status; /* status of the process: RUNNING/SUSPENDED/TERMINATED */
struct process *next; /* next process in chain */
} process;
void updateProcessList(process **process_list) {
process *p = *process_list;
int code = 0, status = 0,pidd = 0;
while (p) {
pidd = p->pid;
code = waitpid(pidd, &status, WNOHANG);
if (code == -1) { /* child terminated*/
p->status = TERMINATED;
} else if(WIFEXITED(status)){
p->status = TERMINATED;
}else if(WIFSTOPPED(status)){
p->status = SUSPENDED;
}
p = p->next;
}
}
From man 2 waitpid:
RETURN VALUE
waitpid(): on success, returns the process ID of the child whose state has changed;
if WNOHANG was specified and one or more child(ren) specified by pid exist, but have
not yet changed state, then 0 is returned. On error, -1 is returned.
You should check the return value for 0... and also fix the rest of the checks.
code = waitpid(pidd, &status, WNOHANG | WUNTRACED | WCONTINUED);
if (code == -1) {
    // Handle error somehow...
    // This doesn't necessarily mean that the child was terminated!
    // See manual page section "ERRORS".
    if (errno == ECHILD) {
        // Child was already terminated by something else.
        p->status = TERMINATED;
    } else {
        perror("waitpid failed");
    }
} else if (code == 0) {
    // Child still in previous state.
    // Do nothing.
} else if (WIFEXITED(status)) {
    // Child exited.
    p->status = TERMINATED;
} else if (WIFSIGNALED(status)) {
    // Child killed by a signal.
    p->status = TERMINATED;
} else if (WIFSTOPPED(status)) {
    // Child stopped.
    p->status = SUSPENDED;
} else if (WIFCONTINUED(status)) {
    // This branch seems unnecessary, you should already know this
    // since you are the one that should kill(pid, SIGCONT) to make the
    // children continue.
    p->status = RUNNING;
} else {
    // This should never happen!
    abort();
}
Also, notice:
My addition of WUNTRACED and WCONTINUED in the flags: WIFSTOPPED() cannot happen unless you are tracing the child with ptrace() or you used the WUNTRACED flag, and WIFCONTINUED() cannot happen unless WCONTINUED is used.
The code and pidd variables should be pid_t, not int (the pidd variable also seems unneeded, since you can pass p->pid directly).
In any case, consider adding a signal handler for SIGCHLD and updating the children's statuses there. Your program will receive a SIGCHLD for every child that terminates, stops or resumes. It's much simpler and also faster (it does not require continuously calling waitpid() on every single child process).
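Here is a rough sketch of the SIGCHLD idea. To stay on the safe side with async-signal-safety, the handler in this version only sets a flag and the actual waitpid() call happens outside the handler; that split is my own choice, not something the question requires. All function names are made up for this example, and the three state values are copied from the question.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define TERMINATED (-1)
#define RUNNING 1
#define SUSPENDED 0

static volatile sig_atomic_t child_event_pending = 0;

/* The handler only records that some child changed state. */
static void on_sigchld(int sig)
{
    (void) sig;
    child_event_pending = 1;
}

static int install_sigchld_handler(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigchld;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = SA_RESTART;
    return sigaction(SIGCHLD, &sa, NULL);
}

/* Sleep until at least one SIGCHLD has been recorded, then clear the flag.
 * Assumes SIGCHLD is not blocked by the caller. */
static void wait_for_child_event(void)
{
    sigset_t block, old;

    sigemptyset(&block);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    while (!child_event_pending)
        sigsuspend(&old); /* atomically unblocks SIGCHLD and sleeps */
    child_event_pending = 0;
    sigprocmask(SIG_SETMASK, &old, NULL);
}

/* Non-blocking state check for one child; returns the new state, or
 * `previous` if nothing has changed since the last call. */
static int poll_child_state(pid_t pid, int previous)
{
    int status;
    pid_t r;

    assert(pid > 0);
    r = waitpid(pid, &status, WNOHANG | WUNTRACED | WCONTINUED);
    if (r == 0)
        return previous;
    if (r == (pid_t) -1)
        return (errno == ECHILD) ? TERMINATED : previous;
    if (WIFSTOPPED(status))
        return SUSPENDED;
    if (WIFCONTINUED(status))
        return RUNNING;
    return TERMINATED; /* exited or killed by a signal */
}
```

In the shell's main loop you would check child_event_pending (or call wait_for_child_event() when there is nothing else to do) and then run poll_child_state() over the process list, storing the result in p->status.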
The program is meant to exchange signals back and forth indefinitely. SIGUSR1 is caught by the parent and SIGUSR2 by the child. Each handler does nothing but set the flag. The parent runs first, that is, the parent sends the first signal. The child waits in pause() until its handler runs. I thought this would give me a simple synchronization, but apparently it does not. However, if I uncomment the usleep(1000), the code works, like this:
initial value, flag = -99
child process, flag = 0
parent process, flag = 1
child process, flag = 0
parent process, flag = 1
child process, flag = 0
.
.
.
child process, flag = 0
parent process, flag = 1
child process, flag = 0
parent process, flag = 1
child process, flag = 0
.
.
.
But without the sleep I can't get this behaviour, and I want to achieve it without sleeping. The wrong output is:
initial value, flag = -99
parent process, flag = -99
waits forever..................
How can it be made to run as intended, and what is the reason for this behaviour? By the way, I have to do the synchronization with signals only, without semaphores, mutexes, etc. All POSIX signal features such as sigaction and sigsuspend can be used, but not sleep, nanosleep, pause or busy waiting.
#include <stdio.h>
#include <signal.h>
#include <unistd.h>
#include <stdlib.h>
volatile sig_atomic_t flag = -99; // child = 0, parent = 1;
void catcher(int sig) {
switch (sig) {
case SIGUSR1 : flag = 1; break;
case SIGUSR2 : flag = 0; break;
}
}
int safeBlockParent(int signum) {
sigset_t maskall, maskmost, maskold;
sigfillset(&maskall);
sigfillset(&maskmost);
sigdelset(&maskmost, signum);
sigprocmask(SIG_SETMASK, &maskall, &maskold);
if (flag == 0)
sigsuspend(&maskmost);
sigprocmask(SIG_SETMASK, &maskold, NULL);
}
int safeBlockChild(int signum) {
sigset_t maskall, maskmost, maskold;
sigfillset(&maskall);
sigfillset(&maskmost);
sigdelset(&maskmost, signum);
sigprocmask(SIG_SETMASK, &maskall, &maskold);
if (flag == 1)
sigsuspend(&maskmost);
sigprocmask(SIG_SETMASK, &maskold, NULL);
}
void ChildProcess() {
while(1) {
safeBlockChild(SIGUSR2);
fprintf(stderr, "child process, flag = %d\n", flag);
kill( getppid(), SIGUSR1 );
}
}
void ParentProcess(pid_t childPid) {
flag = 1;
while(1) {
//usleep(1000);
fprintf(stderr, "parent process, flag = %d\n", flag);
kill( childPid, SIGUSR2 );
safeBlockParent(SIGUSR1);
}
}
int main() {
pid_t pid;
struct sigaction sact = { 0 };
fprintf(stderr, "initial value, flag = %d\n", flag);
sigemptyset( &sact.sa_mask );
sact.sa_flags = 0;
sact.sa_handler = catcher;
if (sigaction (SIGUSR1, &sact, NULL) < 0) {
perror("sigaction sigusr1 error");
exit(1);
}
if (sigaction (SIGUSR2, &sact, NULL) < 0) {
perror("sigaction sigusr2 error");
exit(2);
}
pid = fork();
if (pid < 0) { perror("fork problem"); exit(3); }
if (pid == 0) {
//kill(getppid(), SIGUSR1);
ChildProcess();
}
else {
ParentProcess(pid);
//wait(NULL);
}
return 0;
}
The code sometimes gets stuck and sometimes runs.
You have two race conditions:
The parent process could send a signal before the child has had a chance to register a signal handler for SIGUSR2.
One process could send a signal while the other is outside pause.
The latter can happen the first time round, when the child process has yet to reach pause, but the parent has sent SIGUSR2 anyway. This causes the effect you're seeing.
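One way to close both windows, sketched below under the constraint of using only signal primitives: install the handlers and block SIGUSR1/SIGUSR2 before fork(), and only ever unblock them inside sigsuspend(). A signal sent "too early" then simply stays pending until the receiver is ready. The function names and the fixed number of rounds are made up for this example so that it terminates.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t got_signal = 0;

static void on_signal(int sig)
{
    (void) sig;
    got_signal = 1;
}

/* Wait for our turn. SIGUSR1/SIGUSR2 are blocked everywhere except inside
 * sigsuspend(), so the handler cannot run between the test and the sleep. */
static void wait_turn(const sigset_t *wait_mask)
{
    while (!got_signal)
        sigsuspend(wait_mask);
    got_signal = 0;
}

/* Parent and child alternate `rounds` times.
 * Returns the number of rounds the parent completed, or -1 on error. */
static int run_ping_pong(int rounds)
{
    struct sigaction sa;
    sigset_t block, old, wait_mask;
    pid_t child;
    int i, status;

    assert(rounds >= 0);
    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_signal;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGUSR1, &sa, NULL) == -1 || sigaction(SIGUSR2, &sa, NULL) == -1)
        return -1;

    /* Block both signals BEFORE fork(); the child inherits the mask. */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigaddset(&block, SIGUSR2);
    sigprocmask(SIG_BLOCK, &block, &old);
    wait_mask = old;
    sigdelset(&wait_mask, SIGUSR1);
    sigdelset(&wait_mask, SIGUSR2);

    child = fork();
    if (child == -1)
        return -1;
    if (child == 0) {
        pid_t parent = getppid();
        for (i = 0; i < rounds; i++) {
            wait_turn(&wait_mask);   /* wait for SIGUSR2 from the parent */
            kill(parent, SIGUSR1);   /* hand the turn back */
        }
        _exit(0);
    }
    for (i = 0; i < rounds; i++) {
        kill(child, SIGUSR2);        /* the parent moves first */
        wait_turn(&wait_mask);       /* wait for SIGUSR1 from the child */
    }
    sigprocmask(SIG_SETMASK, &old, NULL);
    while (waitpid(child, &status, 0) == (pid_t) -1) {
        if (errno != EINTR)
            return -1;
    }
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? i : -1;
}
```

The important design point is that the check of the flag and the wait are made atomic by sigsuspend(); with pause() there is always a gap between the two in which the signal can arrive and be lost.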
I've been asked to develop the consumer (client) side to a producer (server), where the producer creates processes, waits until the consumer has read shared memory and deleted processes, then passes control back to the producer for the killing of processes and the shutting down of the shared memory block.
I've researched the difference between sleep and wait, and realise that as soon as fork() is called, the child process begins running.
The code below runs after the processes are created and checks whether each one is the parent process. If it is, it calls wait(0). Now for my question: how do I know where the code in the consumer starts to be executed, and how do I pass control back?
else if(pid > 0)
{
wait(0);
}
Below can be seen the main loop the producer uses.
int noToCreate = atoi(argv[2]); // (user inputs on cmd line "./prod 20 10 5" - 20 size of shared mem, 10 process to be created, 5 processes to be deleted)
while(*memSig != 2)
{
while(*memSig == 1) // set memsignature to sleep while..
{
sleep(1);
}
for(B = 0; B < noToCreate; B++)
{
pid = fork();
if(pid == -1)
{
perror("Error forking");
exit(1);
}
else if(pid > 0)
{
wait(0);
}
else
{
srand(getpid());
while(x == 0)
{
if(*randNum == 101)
{
*randNum = rand() % (100 - 1) + 1;
*pidNum = getpid();
printf("priority: %d Process ID: %d \n", *randNum, *pidNum);
x = 1;
}
else
{
*randNum++;
*pidNum++;
}
}
exit(0);
}
} /* Closes main for loop */
if(*memSig == 0)
{
*memSig = 1;
}
} /* Closes main while loop */
Thanks a bunch guys :)
wait() blocks the parent until any child ends. You can use waitpid() to make the parent wait for a specific child.
When a child process ends, the kernel sends the SIGCHLD signal to the parent.
fork() returns zero in the child process, so you are in the child process at your call to the srand function.
In the parent, fork() returns the pid of the child process, which allows the original process to wait for the child to finish. If you wish to pass data between the processes, consider using a pipe. A pipe() call returns two file descriptors, one for the read end and the other for the write end. Set this up before the fork and the two processes can communicate.
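A minimal sketch of the pipe approach, assuming a single int is all that needs to travel from the child to the parent. The function name receive_int_from_child is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a child that writes `value` into a pipe and return what the parent
 * reads from the other end. Returns -1 on error, so `value` must be >= 0. */
static int receive_int_from_child(int value)
{
    int fds[2]; /* fds[0] is the read end, fds[1] is the write end */
    int received = 0;
    ssize_t n;
    pid_t pid;

    assert(value >= 0);
    if (pipe(fds) == -1) /* create the pipe BEFORE forking */
        return -1;
    pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    if (pid == 0) {
        /* Child: only writes, so close the read end. */
        close(fds[0]);
        if (write(fds[1], &value, sizeof value) != (ssize_t) sizeof value)
            _exit(1);
        _exit(0);
    }
    /* Parent: only reads, so close the write end. */
    close(fds[1]);
    n = read(fds[0], &received, sizeof received);
    close(fds[0]);
    while (waitpid(pid, NULL, 0) == (pid_t) -1 && errno == EINTR)
        ;
    return (n == (ssize_t) sizeof received) ? received : -1;
}
```

Closing the unused ends matters: the parent's read() only sees end-of-file once every copy of the write end has been closed.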
wait makes the parent wait for any child to terminate before going on (preferably use waitpid to wait for a certain child), whereas sleep puts the process to sleep and resumes it, as soon as the time passed as argument is over.
Both calls will make the process block.
And it is NOT guaranteed that the child will run immediately; which process runs first after fork() is unspecified!
If you want to pass data between producer and consumer, use pipes or *NIX sockets, or use the return-value of exit from the child if a single integer is sufficient.
See man wait, you can get the return value of the child with the macro WEXITSTATUS.
#include <sys/wait.h>
#include <stdlib.h>
#include <unistd.h>
#include <stdio.h>
int
main(int argc, char *argv[])
{
pid_t cpid, w;
int status;
cpid = fork();
if (cpid == -1) {
perror("fork");
exit(EXIT_FAILURE);
}
if (cpid == 0) { /* Code executed by child */
printf("Child PID is %ld\n", (long) getpid());
if (argc == 1)
pause(); /* Wait for signals */
_exit(atoi(argv[1]));
} else { /* Code executed by parent */
do {
w = waitpid(cpid, &status, WUNTRACED | WCONTINUED);
if (w == -1) {
perror("waitpid");
exit(EXIT_FAILURE);
}
if (WIFEXITED(status)) {
printf("exited, status=%d\n", WEXITSTATUS(status));
} else if (WIFSIGNALED(status)) {
printf("killed by signal %d\n", WTERMSIG(status));
} else if (WIFSTOPPED(status)) {
printf("stopped by signal %d\n", WSTOPSIG(status));
} else if (WIFCONTINUED(status)) {
printf("continued\n");
}
} while (!WIFEXITED(status) && !WIFSIGNALED(status));
exit(EXIT_SUCCESS);
}
}
How could I track down the death of a child process without making the parent process wait until the child process got killed?
I am trying a client-server scenario where the server accepts the connection from a client and forks a new process for each and every connection it accepts.
I am ignoring SIGCHLD signals to prevent zombie creation.
signal(SIGCHLD, SIG_IGN);
while(1)
{
accept();
clients++;
if(fork() ==0)
{
childfunction();
clients--;
}
else
{
}
}
The problem in the above scenario is that if the child process gets killed in the childfunction() function, the global variable clients is not getting decremented.
NOTE: I am looking for a solution without using SIGCHLD signal ... If possible
Typically you write a handler for SIGCHLD which calls waitpid() on pid -1. You can use the return value from that to determine what pid died. For example:
void my_sigchld_handler(int sig)
{
pid_t p;
int status;
while ((p=waitpid(-1, &status, WNOHANG)) > 0)
{
/* Handle the death of pid p */
}
}
/* It's better to use sigaction() over signal(). You won't run into the
* issue where BSD signal() acts one way and Linux or SysV acts another. */
struct sigaction sa;
memset(&sa, 0, sizeof(sa));
sa.sa_handler = my_sigchld_handler;
sigaction(SIGCHLD, &sa, NULL);
Alternatively you can call waitpid(pid, &status, 0) with the child's process ID specified, and synchronously wait for it to die. Or use WNOHANG to check its status without blocking.
None of the solutions so far offer an approach without using SIGCHLD as the question requests. Here is an implementation of an alternative approach using poll as outlined in this answer (which also explains why you should avoid using SIGCHLD in situations like this):
Make sure you have a pipe to/from each child process you create. It can be either their stdin/stdout/stderr or just an extra dummy fd. When the child process terminates, its end of the pipe will be closed, and your main event loop will detect the activity on that file descriptor. From the fact that it closed, you recognize that the child process died, and call waitpid to reap the zombie.
(Note: I omitted some best practices like error-checking and cleaning up file descriptors for brevity)
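#include <poll.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
/**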
* Specifies the maximum number of clients to keep track of.
*/
#define MAX_CLIENT_COUNT 1000
/**
* Tracks clients by storing their process IDs and pipe file descriptors.
*/
struct process_table {
pid_t clientpids[MAX_CLIENT_COUNT];
struct pollfd clientfds[MAX_CLIENT_COUNT];
} PT;
/**
* Initializes the process table. -1 means the entry in the table is available.
*/
void initialize_table() {
for (int i = 0; i < MAX_CLIENT_COUNT; i++) {
PT.clientfds[i].fd = -1;
}
}
/**
* Returns the index of the next available entry in the process table.
*/
int get_next_available_entry() {
for (int i = 0; i < MAX_CLIENT_COUNT; i++) {
if (PT.clientfds[i].fd == -1) {
return i;
}
}
return -1;
}
/**
* Adds information about a new client to the process table.
*/
void add_process_to_table(int i, pid_t pid, int fd) {
PT.clientpids[i] = pid;
PT.clientfds[i].fd = fd;
}
/**
* Removes information about a client from the process table.
*/
void remove_process_from_table(int i) {
PT.clientfds[i].fd = -1;
}
/**
* Cleans up any dead child processes from the process table.
*/
void reap_zombie_processes() {
int p = poll(PT.clientfds, MAX_CLIENT_COUNT, 0);
if (p > 0) {
for (int i = 0; i < MAX_CLIENT_COUNT; i++) {
/* Has the pipe closed? */
if ((PT.clientfds[i].revents & POLLHUP) != 0) {
// printf("[%d] done\n", PT.clientpids[i]);
waitpid(PT.clientpids[i], NULL, 0);
remove_process_from_table(i);
}
}
}
}
/**
* Simulates waiting for a new client to connect.
*/
void accept() {
sleep((rand() % 4) + 1);
}
/**
* Simulates useful work being done by the child process, then exiting.
*/
void childfunction() {
sleep((rand() % 10) + 1);
exit(0);
}
/**
* Main program
*/
int main() {
/* Initialize the process table */
initialize_table();
while (1) {
accept();
/* Create the pipe */
int p[2];
pipe(p);
/* Fork off a child process. */
pid_t cpid = fork();
if (cpid == 0) {
/* Child process */
close(p[0]);
childfunction();
}
else {
/* Parent process */
close(p[1]);
int i = get_next_available_entry();
add_process_to_table(i, cpid, p[0]);
// printf("[%d] started\n", cpid);
reap_zombie_processes();
}
}
return 0;
}
And here is some sample output from running the program with the printf statements uncommented:
[31066] started
[31067] started
[31068] started
[31069] started
[31066] done
[31070] started
[31067] done
[31068] done
[31071] started
[31069] done
[31072] started
[31070] done
[31073] started
[31074] started
[31072] done
[31075] started
[31071] done
[31074] done
[31081] started
[31075] done
You don't want a zombie. If a child process dies while the parent is still running, and the parent never issues a wait()/waitpid() call to harvest the status, the system does not release the resources associated with the child, and a zombie (defunct) process is left in the process table.
Try changing your SIGCHLD handler to something closer to the following:
void chld_handler(int sig) {
pid_t p;
int status;
/* loop as long as there are children to process */
while (1) {
/* retrieve child process ID (if any) */
p = waitpid(-1, &status, WNOHANG);
/* check for conditions causing the loop to terminate */
if (p == -1) {
/* continue on interruption (EINTR) */
if (errno == EINTR) {
continue;
}
/* break on anything else (EINVAL or ECHILD according to manpage) */
break;
}
else if (p == 0) {
/* no more children to process, so break */
break;
}
/* valid child process ID retrieved, process accordingly */
...
}
}
SIGCHLD itself is blocked automatically while its handler runs (unless SA_NODEFER is set). If you want further signals blocked during the handler, add them to sa_mask when calling sigaction(); the original mask is restored when the handler returns.
If you really don't want to use a SIGCHLD handler, you could try adding the child processing loop somewhere where it would be called regularly and poll for terminated children.
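Such a polling loop could look roughly like the sketch below. It assumes SIGCHLD is left at its default disposition: with signal(SIGCHLD, SIG_IGN) terminated children are not kept around as zombies, so waitpid() has nothing to report. The function name reap_finished_children is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Reap every child that has already terminated, without blocking, and
 * decrement *clients once per reaped child. Returns the number reaped. */
static int reap_finished_children(int *clients)
{
    int reaped = 0;

    assert(clients != NULL);
    /* waitpid() returns 0 if children exist but none has terminated yet,
     * and -1 (ECHILD) if there are no children at all. */
    while (waitpid(-1, NULL, WNOHANG) > 0) {
        (*clients)--;
        reaped++;
    }
    return reaped;
}
```

In the server you would call reap_finished_children(&clients) once per iteration of the accept loop, so the parent's own copy of the counter is the one being decremented. Since accept() blocks, the count can lag behind until the next connection arrives.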
The variable 'clients' lives in a different process address space after fork(), so decrementing it in the child does not affect the value in the parent. I think you need to handle SIGCHLD to keep the count correct.
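A tiny experiment shows the separate copies. In this sketch (the function name is made up) the child decrements its copy of a counter and reports it through its exit status, which is why the start value has to fit into 8 bits.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Returns the parent's counter after a child has decremented ITS copy.
 * *child_value receives the value the child ended up with (or -1). */
static int parent_counter_after_fork(int start, int *child_value)
{
    int counter = start;
    int status;
    pid_t pid;

    assert(start > 0 && start <= 255 && child_value != NULL);
    pid = fork();
    if (pid == -1)
        return -1;
    if (pid == 0) {
        counter--;       /* changes only the child's copy */
        _exit(counter);  /* report it through the exit status */
    }
    while (waitpid(pid, &status, 0) == (pid_t) -1) {
        if (errno != EINTR)
            return -1;
    }
    *child_value = WIFEXITED(status) ? WEXITSTATUS(status) : -1;
    return counter;      /* unchanged in the parent */
}
```

Starting from 5, the child ends with 4 while the parent still sees 5, which is exactly what happens to clients in the question.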