I am trying to create an assignment where I want to check whether all child processes created by students have exited. As I am not the one calling fork, I don't have access to the child process IDs. Is there a way to check that the current process doesn't have any children without knowing the process IDs of the children that were created?
I checked many questions but every solution consists of the use of return value from the fork call. Any help is appreciated.
Thank you.
You can call
int st = waitpid(-1, NULL, WNOHANG);
The first argument tells waitpid() to wait for any child process to exit, not for a specific pid.
The third argument is a flag that makes waitpid() return immediately instead of blocking.
Now there are three possible outcomes:
return value is -1 and errno is ECHILD: there are no child processes at all.
return value is > 0: a child exited earlier, but its status had not yet been collected (a so-called zombie process). The return value is that child's PID. Call waitpid() again to check for further children.
return value is 0: there are child processes, and none of them has exited yet.
This should cover all cases you ask for.
I am learning about forks, execl and parent and child processes in my systems programming class. One thing that is confusing me is waitpid() and getpid(). Could someone confirm or correct my understanding of these two functions?
getpid() will return the process ID of whatever process calls it. If the parent calls it, it returns the pid of the parent. Likewise for the child. (It actually returns a value of type pid_t, according to the manpages).
waitpid() seems more complex. I know that if I use it in the parent process, without any flags to prevent it from blocking (using WNOHANG), it will halt the parent process until the child process terminates. I'm a little unsure as to how waitpid() manages all this, however. waitpid() also returns pid_t. What is the value of the pid_t waitpid() returns? How does this change depending on whether or not a parent or child calls it, and whether or not a child process is still running, or has terminated?
Your understanding of getpid is correct: it returns the PID of the calling process.
waitpid is used (as you said) to block the execution of a process (unless
WNOHANG is passed) and resume execution when a child of the process changes
state, for example by terminating. waitpid returns the PID of the child whose
state has changed, or -1 on failure. It can also return 0 if WNOHANG was
specified but no matching child has changed state yet. See:
man waitpid
RETURN VALUE
waitpid(): on success, returns the process ID of the child whose state has changed; if WNOHANG
was specified and one or more child(ren) specified by pid exist, but have not yet changed state,
then 0 is returned. On error, -1 is returned.
Depending on the arguments passed to waitpid, it will behave differently. Here
I'll quote the man page again:
man waitpid
pid_t waitpid(pid_t pid, int *wstatus, int options);
...
The waitpid() system call suspends execution of the calling process until a child specified by pid argument
has changed state. By default, waitpid() waits only for terminated children, but this behavior is modifiable
via the options argument, as described below:
The value of pid can be:
< -1: meaning wait for any child process whose process group ID is equal to the absolute value of pid.
-1: meaning wait for any child process.
0: meaning wait for any child process whose process group ID is equal to that of the calling process.
> 0: meaning wait for the child whose process ID is equal to the value of pid.
The value of options is an OR of zero or more of the following constants:
WNOHANG: return immediately if no child has exited.
WUNTRACED: also return if a child has stopped (but not traced via ptrace(2)).
Status for traced children which have stopped is provided even if this option is not specified.
WCONTINUED (since Linux 2.6.10): also return if a stopped child has been resumed by delivery of SIGCONT.
I'm a little unsure as to how waitpid() manages all this
waitpid is a syscall and the OS handles this.
How does this change depending on whether or not a parent or child calls it, and whether or not a child process is still running, or has terminated?
wait should only be called by a process that has executed fork(). So the parent
process should call wait()/waitpid(). If the child process hasn't called
fork(), then it doesn't need to call either one of these functions. If however
the child process has called fork(), then it should also call
wait()/waitpid().
The behaviour of these functions is very well explained in the man page; I quoted the important parts of it. You should read the whole man page
to get a better understanding of it.
waitpid "shall only return the status of a child process" (from the POSIX spec). So the pid_t waitpid returns belongs to one of the current or former children of the process calling waitpid. For example, if a child has recently terminated, it returns that child's PID.
waitpid is only useful when called from a parent process. If called from a process that does not have any children, it returns -1 and sets errno to ECHILD.
waitpid can check the status of children that have terminated, or that have recently stopped or continued (e.g., ^Z from a shell). The various pid/option argument combinations in the spec tell you the various types of information you can request. For example, the WCONTINUED option additionally requests the status of recently-continued children.
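To make the return value and the status word concrete, here is a small sketch. The function name exit_code_of_child is invented for this example; it forks a child that exits at once, then shows that waitpid() hands back that child's PID and that the exit code has to be extracted with the W* macros.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (name chosen for this example).
 * Forks a child that exits immediately with `code`, waits for it,
 * and returns the exit code seen by the parent, or -1 on failure. */
int exit_code_of_child(int code)
{
    pid_t child = fork();
    if (child < 0)
        return -1;                 /* fork failed */
    if (child == 0)
        _exit(code);               /* child: terminate right away */

    int status;
    pid_t reaped;
    do {
        reaped = waitpid(child, &status, 0);   /* blocks until child ends */
    } while (reaped == -1 && errno == EINTR);

    if (reaped != child)
        return -1;                 /* on success waitpid returns the child's PID */
    if (WIFEXITED(status))
        return WEXITSTATUS(status);    /* low 8 bits of the exit code */
    return -1;                     /* e.g. the child was killed by a signal */
}
```

Only the parent branch calls waitpid(); the child never does, because it has no children of its own.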
I am writing a program in C. I call fork() in the main process and then execve() in the forked child process to execute an unknown app (given by a user on the command line). I know the PID of the executed app, since it is returned by fork(), but this unknown app can possibly fork() many times and I do not know the PIDs of all its children (they are grandchildren of the main parent process). How can I check in the main parent process WHEN its child process (the unknown app) and ALL children of the unknown app have exited? (I do not even know how many children it can have, and I do not know the PIDs of these children.)
This can be done by making your parent process a subreaper. A subreaper gets all children orphaned by its descendants, which would traditionally always go to init (process ID 1). The subreaper status needs to be enabled before forking the interesting child process. Once this is done, a waitpid() or similar call for any process will return the child process and all orphaned descendants until it returns error [ECHILD] when the entire tree is gone.
On Linux, this is enabled using prctl()'s PR_SET_CHILD_SUBREAPER option and on FreeBSD this is enabled using procctl() PROC_REAP_ACQUIRE command (see man pages for details).
On Linux you will be able to monitor only one child process individually this way, since the orphans do not remember from which original fork call they came. On FreeBSD, PROC_REAP_GETPIDS allows distinguishing individual subtrees, although this is less efficient if the tree contains many processes.
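A Linux-only sketch of the subreaper approach follows. The function name run_and_wait_for_tree is made up for this example, and the code assumes the caller has no other children, since wait() cannot tell them apart from the adopted orphans.

```c
#include <errno.h>
#include <sys/prctl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (name chosen for this example), Linux only.
 * Marks the caller as a child subreaper, runs argv[0] with argv, and
 * blocks until that child and every orphaned descendant have exited.
 * Returns the number of processes reaped, or -1 on error. */
int run_and_wait_for_tree(char *const argv[])
{
    /* Must happen before fork() so that orphans are handed to us. */
    if (prctl(PR_SET_CHILD_SUBREAPER, 1, 0, 0, 0) == -1)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        execvp(argv[0], argv);
        _exit(127);                /* reached only if exec failed */
    }

    int reaped = 0;
    for (;;) {
        pid_t pid = wait(NULL);    /* direct child or adopted orphan */
        if (pid > 0) {
            reaped++;
            continue;
        }
        if (errno == EINTR)
            continue;              /* interrupted by a signal; retry */
        if (errno == ECHILD)
            return reaped;         /* the whole tree is gone */
        return -1;
    }
}
```

The subreaper flag stays set on the calling process after the function returns; a program that needs the old behaviour back can call prctl(PR_SET_CHILD_SUBREAPER, 0, 0, 0, 0).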
You can use waitpid(-1, NULL, WNOHANG) to tell whether a child has exited. If you receive a positive number (a PID), then that child has exited. In your parent process, keep a counter of the child processes you have, here called x: increment x whenever you create a child and decrement it whenever waitpid() reports that one has exited. While x is greater than 0, call waitpid() to see whether any child has ended. When x reaches zero, all your children have exited.
Good morning,
I'm trying to learn how to use Peterson's solution for critical section protection. Each process is trying to increment total to 100,000 and I have to make sure each child calls process#(). I also need to use the "wait" function so the parent knows when the child finishes. Once a child finishes I need to print the process ID and the number of times process 1 interrupts process 2, and vice versa. I really have no idea what I'm doing even though I've been reading around a lot. What is this "Waiting" function I'm supposed to use? How do I use it? Why is my code incrementing to 200,000 instead of 100,000?
Code removed, unnecessary for question.
Apparently somewhere in the main function, I need to loop for the parent to wait for the child and then print the process ID of the children, but I have no idea how to do this.
The wait() command you are referring to (and waitpid()) is a command you use in the parent process to "wait" for a child to terminate (blocking, meaning the parent won't continue execution until the child changes state). If your child terminates and you do not wait() in the parent, the child process will become a "zombie". wait()ing effectively "reaps" the child process.
This is the signature for waitpid():
pid_t waitpid(pid_t pid, int *status, int options);
status is an output parameter through which the parent learns how the child ended. It is not the raw exit code: use WIFEXITED(status) and WEXITSTATUS(status) to extract the child's exit code, which is limited to the range 0-255 (so it can carry a small number, such as the number of times this process was interrupted, assuming you keep track of this in the child, but not arbitrary data). Suppose you have a child with PID 1234: you would call waitpid(1234, &status, 0). If you want a non-blocking wait, pass WNOHANG for options; the call then returns immediately if no child has exited. Check out http://linux.die.net/man/2/waitpid because there are useful values you can pass to waitpid(), such as -1 to wait for any child to exit (this is the same as regular wait()). Please post comments if you have any more questions, but hopefully this is enough to point you in the right direction :).
Since you want to "loop" in the parent to wait for your children, you can either skip the loop altogether and use the normal, "blocking" wait, or you can use an infinite loop with a non-blocking wait (or a series of them, one for each child if you don't use regular wait()).
Just in case you didn't know, you can determine whether you are in the parent or child based on the return value of fork(), but I think you already knew this. So in the body of an if (checking for parent) is where you would do the wait().
Also, is process2() supposed to have while (k < 200000)? Could this be why you say it's incrementing to 200,000?
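Since the original code was removed from the question, the following is only a sketch of the overall shape. The names run_children and child_work are invented here; child_work is an empty stand-in for process1()/process2(). The parent remembers each PID returned by fork() and then does a blocking waitpid() on each one.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAX_CHILDREN 16

/* Hypothetical stand-in for the poster's process1()/process2(). */
static void child_work(int id) { (void)id; }

/* Hypothetical helper (name chosen for this example).
 * Forks up to `n` children, waits for each one in turn, prints its PID,
 * and returns how many exited normally (-1 if `n` is out of range).
 * Simplification: a waitpid() interrupted by a signal is not retried. */
int run_children(int n)
{
    pid_t pids[MAX_CHILDREN];
    if (n < 0 || n > MAX_CHILDREN)
        return -1;

    for (int i = 0; i < n; i++) {
        pids[i] = fork();
        if (pids[i] < 0) {         /* fork failed: wait only for those started */
            n = i;
            break;
        }
        if (pids[i] == 0) {        /* child branch */
            child_work(i);
            _exit(0);
        }
        /* parent branch: keep forking */
    }

    int finished = 0;
    for (int i = 0; i < n; i++) {
        int status;
        if (waitpid(pids[i], &status, 0) == pids[i] && WIFEXITED(status)) {
            printf("child %d (pid %ld) finished with exit code %d\n",
                   i + 1, (long)pids[i], WEXITSTATUS(status));
            finished++;
        }
    }
    return finished;
}
```

Waiting on specific PIDs keeps the output in creation order; replacing waitpid(pids[i], ...) with wait(&status) would instead report children in the order they finish.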
I was reading about the wait() function in a Unix systems book. The book contains a program which has wait(NULL) in it. I don't understand what that means. In other program there was
while(wait(NULL)>0)
...which also made me scratch my head.
Can anybody explain what the function above is doing?
man wait(2)
All of these system calls are used to wait for state changes in
a child of the calling process, and obtain information about the child
whose state has changed. A state change is considered to be: the child terminated; the child was stopped by a signal; or the child was resumed by a signal
So wait() allows a process to wait until one of its child processes changes state (exits, for example). If waitpid() is called with a process ID, it waits for that specific child process to change state; if -1 is passed instead of a specific PID, it behaves like wait() and waits for any child process to change state.
The wait() function returns the child's PID on success, so when it is called in a loop like this:
while(wait(NULL)>0)
It means: wait until all child processes have exited and no more child processes are unwaited-for (or until an error occurs).
A quick search suggests that wait(NULL) waits for any one of the child processes to complete.
wait(NULL) should be equivalent to waitpid(-1, NULL, 0).
wait(NULL) blocks until any one child process terminates; to wait for all of them, call it in a loop until it returns -1.
Here's a breakdown of my code.
I have a program that forks a child (and registers the child's pid in a file) and then does its own thing. The child becomes whatever program the programmer has specified via argv. When the child is finished executing, it sends a signal (using SIGUSR1) back to the parent process so the parent knows to remove the child from the file. The parent should stop for a second, acknowledge the deleted entry by updating its table, and continue where it left off.
pid = fork();
switch(pid){
case -1:{
exit(1);
}
case 0 :{
(*table[numP-1]).pid = getpid(); //Global that stores pids
add(); //saves table into a text file
freeT(table); //Frees table
execv(argv[3], &argv[4]); //Executes new program with argv
printf("finished execution\n");
del(getpid()); //Erases pid from file
refreshReq(); //Sends SIGUSR1 to parent
return 0;
}
default:{
... //Does its own thing
}
}
The problem is that after execv successfully starts and finishes (a printf statement before the return 0 lets me know), I do not see the rest of the commands in the switch statement being executed. I am wondering if execv has something like a ^C command in it which kills the child when it finishes, so that the rest of the commands never run. I looked into the man pages but did not find anything useful on the subject.
Thanks!
execv replaces the currently executing program with a different one. It doesn't restore the old program once that new program is done, hence it's documented "on success, execv does not return".
So, you should see your message "finished execution" if and only if execv fails.
execv replaces the current process with a new one. In order to spawn a new process, you can use e.g. system(), popen(), or a combination of fork() and exec()
Other people have already explained what execv and similar functions do, and why the next line of code is never executed. The logical next question is, so how should the parent detect that the child is done?
In the simple cases where the parent should do absolutely nothing while the child is running, just use system instead of fork and exec.
Or if the parent will do something else before the child exits, these are the key points:
When the child exits, the parent will get SIGCHLD. The default disposition of SIGCHLD is to ignore it. If you want to catch that signal, install a handler before calling fork.
After a child has exited, the parent should call waitpid to clean up the child and find out what its exit status was.
The parent can also call wait or waitpid in a blocking mode to wait until a child exits.
The parent can also call waitpid in a non-blocking mode to find out whether the child has exited yet.
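A minimal sketch of the blocking variant of these points: the cleanup that the question placed after execv belongs in the parent, after waitpid() reports that the child is gone. The function name run_program is invented for this example, and the exit code 127 for a failed exec is a shell convention adopted here, not something the original code does.

```c
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper (name chosen for this example).
 * Runs the program at `path` with `argv` in a child process and returns
 * its exit code, 127 if execv failed, or -1 on any other failure. */
int run_program(const char *path, char *const argv[])
{
    pid_t child = fork();
    if (child < 0)
        return -1;

    if (child == 0) {
        execv(path, argv);
        /* Only reached if execv failed: on success the new program
         * has replaced this one and nothing below ever runs. */
        _exit(127);
    }

    /* Parent: this is where "the child finished" is detected. */
    int status;
    pid_t reaped;
    do {
        reaped = waitpid(child, &status, 0);
    } while (reaped == -1 && errno == EINTR);

    if (reaped != child)
        return -1;
    /* Bookkeeping such as removing the PID from a file would go here. */
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

If the parent must keep working while the child runs, the same waitpid() call can be made with WNOHANG from the main loop or after a SIGCHLD handler sets a flag.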
What did you expect to happen? This is what execv does. Please read the documentation which says:
The exec() family of functions replaces the current process image with a new process image.
Perhaps you were after system or something, to ask the environment to spawn a new process in addition to the current one. Or.. isn't that what you already achieved through fork? It's hard to see what you want to accomplish here.