question about Fork() - c

When a parent process creates a child process with fork(), my understanding is that
the child process is in a Running state whereas the parent process is in a Ready state, i.e. waiting for the child to end.
Am I right?

No, the fork creates a copy of the parent.
You then generally test the return value of fork(): 0 means "I am the child"; any positive value means "I am the parent", and that value is the child's PID; -1 means the fork failed.
If the parent has to wait for the child to end, you need to use the wait function.
Edit:
see http://linux.die.net/man/2/fork and http://linux.die.net/man/2/wait for the fork() in C.

Here is something from
After a fork(), it is indeterminate which process, the parent or the child, next has access to the CPU. Applications that implicitly or explicitly rely on a particular sequence of execution in order to achieve correct results are open to failure due to race conditions.
It goes on to point out different behaviors in different kernels. The bottom line is that it's implementation-defined and not to be relied upon.
Also, if you do want to rely on it, on Linux since 2.6.32 "there's a sysctl for that":
kernel.sched_child_runs_first
Cheers

Related

What does reaping children imply?

I have just had a lecture that sums reaping as:
Reaping
Performed by parent on terminated child (using wait or waitpid)
Parent is given exit status information
Kernel then deletes zombie child process
So I understand that reaping is done by calling wait or waitpid from the parent process, after which the kernel deletes the zombie process. If reaping really happens only when wait or waitpid is called, why do child processes seem to go away after returning from their entry function? It looks as if they have been reaped, with no resources wasted, even though the parent process may not be waiting.
So is "reaping" only possible by calling wait or waitpid? If processes are "reaped" as soon as they return and exit from their entry function (which I assume all processes do), what is the point of talking about "reaping" as if it were something special?
The child process does not fully "go away" when it exits. It ceases to exist as a running process, and most/all of its resources (memory, open files, etc.) are released, but it still remains in the process table. It remains in the process table because that's where its exit status is stored, so that the parent can retrieve it by calling one of the wait variants. If the parent fails to call wait, the process table entry sticks around — and that's what makes it a "zombie".
I said that most/all of its resources are released, but the one resource that's definitely still consumed is that process table slot.
As long as the (dead) child's parent exists, the kernel doesn't know that the parent isn't going to call wait eventually, so the process table slot has to stay there, so that the eventual call to wait (if there is one) can return the proper exit status.
If the parent eventually exits (without ever calling wait), the child will be inherited by the grandparent, which is usually a "master" process like the shell, or init, that does routinely call wait and that will finally "reap" the poor young zombie.
So, yes, it really is true that the only way for the parent to properly "reap" the child is, just as was said in your lecture, to call one of the wait functions. (Or to exit, but that's not an option if the parent is long-running.)
Footnote: I said "the child will be inherited by the grandparent", but I think I was wrong, there. Under Unix and Linux, orphaned processes are generally always inherited by pid 1, aka init.
The purpose of the wait*() call is to allow the child process to report a status back to the parent process. When the child process exits, the operating system holds that status data in a little data structure until the parent reads it. Reaping in that sense is cleaning out that little data structure.
If the parent does not care about waiting for status from the child, the code could be written in a way to allow the parent to ignore the status, and so the reaping occurs semi-automatically. One way is to ignore the SIGCHLD signal.
Another way is to perform a double-fork to create a grandchild process instead. When doing this, the "parent" does a blocking wait() after a call to fork(). Then, the child performs another fork() to create the grandchild and then immediately exits, causing the parent to unblock. The grandchild now does the real work, and is automatically reaped by the init process.

What happens when two processes wait for the same child?

From what I've read, the default behavior for wait/waitpid is to wait for a state change in a process. What I can't find is the expected behavior when two processes call waitpid with the same pid_t argument.
Do both return and continue execution, or is it a race condition where only one notices the state change?
Only the parent can wait() for a process, and a process can of course have only one parent.
The parent process might, however, have multiple threads. In the case of multiple threads waiting for the same child, POSIX specifies that only one of them will see the state change. To allow multiple threads to see the state change, you must use waitid() with the WNOWAIT flag.
POSIX: status information

fork() in C; which should be parent process which should be child process

This may seem to be a dumb question, but I don't really have a good understanding of fork() other than knowing that it is about multi-threading. A child process is like a thread. If a task needs to be processed via fork(), how do I correctly assign tasks to the parent process and the child process?
Check the return value of fork. The child process will receive the value of 0. The parent will receive the value of the process id of the child.
Read Advanced Linux Programming which has an entire chapter dedicated to processes (because fork is difficult to explain);
then read documentation of fork(2); fork is not about multi-threading, but about creating processes. Threads are generally created with pthread_create(3) (which is implemented above clone(2), a Linux specific syscall). Read some pthreads tutorial to learn more about threads.
PS. fork is difficult to understand (you'll need hours of reading, some experimentation, perhaps using strace(1), until you reach the "aha" moment when you have understood it) since it returns twice on success. You need to keep its result, and you need to test the result for the three cases: <0 (failure), ==0 (child), >0 (parent). Don't forget to later call waitpid(2) (or something similar) in the parent, to avoid having zombie processes.

How to restrict a child thread or a child process from forking in C

In C, I have a child thread (using pthreads).
Is there any way to restrict this child so that it can't call fork inside the thread?
If fork is written inside it, the program should not compile.
I could also use a child process instead of a child thread, as long as it cannot fork further.
Basically, how can I have a child process or child thread that cannot fork any further processes?
You can always try to play games with pthread_atfork: http://pubs.opengroup.org/onlinepubs/009695399/functions/pthread_atfork.html
Basically, you can use pthread_atfork() to install a "child" callback which always calls exit(). This way, your threads may still fork, but the forked process will exit immediately, so no harm will be done (and only a minimal overhead incurred).
With processes it may be somewhat more complicated. Linux allows you to limit a number of processes per user (so called RLIMIT_NPROC when set with setrlimit()). When this limit is reached, no further forks are possible for a given user id. Thus, you can create a parent process with a CAP_SETUID capability and a dummy user, having the RLIMIT_NPROC set to 1. This way, you can fork from parent, change the child uid to that of the "limited" user you've created in advance and drop the CAP_SETUID capability. At this point, child will have no possible way to fork itself.

Is it possible to adopt a process?

Process A fork()s process B.
Process A dies and therefore init adopts B.
A watchdog creates process C.
Is it somehow possible for C to adopt B from init?
Update:
Or would it even be possible to have C adopt B directly (when A dies), if C were created prior to A's death, without init becoming an intermediate parent of B?
Update-1:
Also, I would appreciate any comments on why having the possibility to adopt a process the way I described would be a bad thing, or difficult to impossible to implement.
Update-2 - The use case (parent and children refer to process(es)):
I have an app using a parent to manage a whole bunch of children, which rely on the parent's management facility. To do its job, the parent relies on being notified of a child's termination, which is done by receiving the related SIGCHLD signal.
If the parent itself dies due to some accident (including segfaulting), I need to restart the whole "family", as it's now impossible to trigger something on a child's termination (which might also be due to a segfault).
In such a case I need to bring down all children and do a full system's restart.
A possible approach to avoid this situation, would be to have a spare-process in place which could take over the dead parent's role ... - if it could again receive the step children's SIGCHLD signals!
No, most definitely not possible. It couldn't be implemented either, without some nasty race conditions. The POSIX guys who make these APIs would never create something with an inherent race condition, so even if you're not bothered, your kernel's not getting it anytime soon.
One problem is that pids get reused (they're a scarce resource!), and you can't get a handle or lock on one either; it's just a number. So, say, somewhere in your code, you have a variable where you put the pid of the process you want to reparent. Then you call make_this_a_child_of_me(thepid). What would happen then? In the meantime, the other process might have exited and thepid changed to refer to some other process! Oops. There can't be a way to provide a make_this_a_child_of_me API without large restructuring of the way unix handles processes.
Note that the whole deal with waiting on child pids is precisely to prevent this problem: a zombie process still exists in the process table in order to prevent its pid being reused. The parent can then refer to its child by its pid, confident that the process isn't going to exit and have the child pid reused. If the child does exit, its pid is reserved until the parent waits for it (typically after being notified by SIGCHLD). Once the process is reaped, its pid is up for grabs immediately for other programs to start using when they fork, but the parent is guaranteed to already know about it.
Response to update: consider a more complicated scheme, where processes are reparented to their next ancestor. Clearly, this can't be done in every case, because you often want a way of disowning a child, to ensure that you avoid zombies. init fulfills that role very well. So, there has to be some way for a process to specify that it intends to either adopt, or not, its grandchildren (or lower). The problem with this design is exactly the same as the first situation: you still get race conditions.
If it's done by pid again, then the grandparent exposes itself to a race condition: only the parent is able to reap a pid, so only the parent really knows which process a pid goes with. Because the grandparent can't reap, it can't be sure that the grandchild process hasn't changed from the one it intended to adopt (or disown, depending on how the hypothetical API would work). Remember, on a heavily-loaded machine, there's nothing stopping a process from being taken off the CPU for minutes, and a whole load could have changed in that time! Not ideal, but POSIX's got to account for it.
Finally, suppose then that this API doesn't work by pid, but just generally says, "send all grandchildren to me" or "send them to init". If it's called after the child processes are spawned, then you get race conditions just as before. If it's called before, then the whole thing's useless: you should be able to restructure your application a little bit to get the same behaviour. That is, if you know before you start spawning child processes who should be the parent of whom, why can't you just go ahead and create them the right way round in the first place? Pipes and IPC really are able to do all the required work.
No, there is no way to enforce reparenting in the way you have described.
I don't know of a good way to do this, but one reason for having it is that a running process can stand on its own or add a capability to a parent process. The adoption would occur as the result of an event known by the (not yet) child, but not by the parent. The soon-to-be child would send a signal to the parent. The parent would adopt (or not) the child. Once part of the parent, the parent/child process would be able to react to the event, whereas neither could react to the event when standing alone.
This docking behavior could be coded into the apps, but I don't know how to do it in real-time. There are other ways to achieve the same functionality. A parent, who could accept docking children could have its functionality extended in novel ways not previously known to the parent.
While the original question is tagged with unix, there is a way to achieve this on Linux, so it's worth mentioning: a subreaper process. When a process's parent dies, the process is adopted by the nearest subreaper ancestor, or by init if there is none. So in your case, process C would mark itself as a subreaper via prctl(PR_SET_CHILD_SUBREAPER) and then spawn process A; when process A dies, process B will be adopted by C.
An alternative on Linux would be to spawn C in a separate PID namespace, making it the init process of that namespace, so that it can adopt the children of A when A dies.

Resources