Forking in a for loop clarification - c

I've seen lots of examples of forking in for loops on here, but not much clarification of how they work. Let's use this simple example from an answer to "How to use Fork() to create only 2 child processes?" as an illustration.
for (i = 0; i < n; ++i) {
    pid = fork();
    if (pid) {
        continue;
    } else if (pid == 0) {
        break;
    } else {
        printf("fork error\n");
        exit(1);
    }
}
Most of the examples I've seen follow this general format. But what I don't understand is, how does this prevent child processes from forking as well? From my understanding, every child that gets created has to go through this loop as well. But fork() is called at the very beginning of the for loop, and then the 3 comparisons happen. Could someone explain how, even though the children seem to call fork(), this for loop still ensures only the parent can create children?

The child starts at the line after fork. fork returns 0 for the child. In your example, the child would go into the pid == 0 block and break out of the for loop.
After a fork everything is exactly the same for the child and parent (including the next instruction to execute and variable values). The only difference is the return value from fork (0 for the child, and the child's pid for the parent).

When fork returns, it actually returns twice: once to the parent and once to the child. It returns 0 to the child and it returns the pid of the child to the parent.
The if block then detects which process returned. In the parent process, if (pid) evaluates to true, so it executes continue and jumps to the top of the loop.
In the child process, if (pid) evaluates to false, then if (pid == 0) evaluates to true, so it executes break to jump out of the loop. So the child doesn't do any more forking.
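As a minimal sketch of that pattern (the name spawn_children is made up for illustration, and the children do no real work), the loop can be wrapped so that the parent also reaps what it created. It returns the number of direct children the parent waited for, which equals n when every fork succeeds:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork n children from the parent only.
 * Returns how many direct children the parent reaped. */
int spawn_children(int n)
{
    pid_t pid = 1;              /* nonzero, so n == 0 still counts as "parent" */
    int i;

    for (i = 0; i < n; ++i) {
        pid = fork();
        if (pid == -1)
            break;              /* fork failed: stop creating children */
        if (pid == 0)
            break;              /* child: leave the loop, never fork again */
        /* parent: pid is the child's PID, go on to the next iteration */
    }

    if (pid == 0)
        _exit(0);               /* child would do its work here, then exit */

    int reaped = 0;
    while (wait(NULL) > 0)      /* parent: collect every child it created */
        reaped++;
    return reaped;
}
```

Calling spawn_children(3) in an otherwise childless process is expected to return 3. The children leave with _exit() so that they never return into the caller's code.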

But what I don't understand is, how does this prevent child processes from forking as well?
fork() returns 0 in the child. In your example code, that causes the child to break out of the loop instead of performing another iteration, so the children in fact do not call fork().

After checking my answer to Why does this program print “forked!” 4 times?, it seems straightforward to me why.
how does this prevent child processes from forking as well?
if (pid) {
    continue;
}
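When the child is created, fork() returns 0 to it, so pid is 0 in the child, the if (pid) test fails, and the child never reaches continue. A child would only become a parent itself if it went on to call fork(), and then pid would be nonzero in that process.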

Try man 2 fork. fork(2) returns a different value in the parent and in the child: the parent gets the child's pid and the child gets 0.

Related

What is the point in using fork() system call when both the processes (parent and child) work on the same code?

I read that the parent and the child will work on the identical code after the fork() system call. I cannot understand the point of doing a fork() as I cannot understand what good will it do in executing the same code twice.
The return value of fork() is different in the child and parent processes, so you'll typically have something along the lines of
pid_t child_pid = fork();
if (child_pid == 0) {
    // do stuff in child process
} else {
    // do stuff in parent process
}
You can use an if/else condition to execute a different piece of code in the parent and in the child. fork returns 0 to the child process and the pid of the child to the parent, so use the return value as the differentiator in the if condition.
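To make the "different code in each branch" idea concrete, here is a small sketch in which the child's only job is to exit with a given code and the parent's job is to collect it. The helper name child_exit_status is hypothetical, and the code assumes a value in the range 0..255:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: the child exits with `code`, the parent waits for it
 * and returns the exit status it observed, or -1 on any error. */
int child_exit_status(int code)
{
    pid_t child_pid = fork();

    if (child_pid == -1)
        return -1;                      /* fork failed, no child exists */

    if (child_pid == 0)
        _exit(code);                    /* child branch: runs only in the child */

    /* parent branch: child_pid holds the child's PID */
    int status;
    if (waitpid(child_pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Both processes run the same function, but only one of them ever reaches waitpid().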

How many processes are created in this code?

I have this question in my text book that I am not able to wrap my head around. The question is: What is the maximum number of processes running simultaneously in the program code below?
In the code below no return value checks are made to fork(), hence both parent and child will execute all of the code, right? Am I wrong in assuming that in the first fork() call the parent will just wait first and then exit? So the maximum number of processes running at once would be 2? (Just before the parent exits it did a fork).
int main()
{
    if ( fork() )
        wait(0);
    else
        exit(0);
    if ( fork() )
        wait(0);
    else
    {
        if ( fork() )
            wait(0);
        else
        {
            if ( fork() )
                wait(0);
            else
                exit(0);
        }
    }
    return 0;
}
I think the code is very poorly written and it is very unclear what actually happens in the code. I would be very thankful for a useful answer.
Thanks in advance.
In the code below no return value checks are made to fork(), hence both parent and child will execute all of the code, right?
No. On success, fork() returns a positive number in the parent process and 0 in the child process, so each if (fork()) is true in the parent.
Am I wrong in assuming that in the first fork() call the parent will just wait first and then exit?
No. After wait() returns, the parent continues to the next if (fork()); the child has already exited.
So the maximum number of processes running at once would be 2?
No. The right answer is 4.
From the fork(2) manpage:
On success, the PID of the child process is returned in the parent,
and 0 is returned in the child. On failure, -1 is returned in the
parent, no child process is created, and errno is set appropriately.
So those conditions behave like this (assuming the fork succeeds):
if ( fork() )
{
    // parent, fork() returned the (nonzero) PID of the child process
}
else
{
    // child, fork() returned 0
}
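One way to check the answer of 4 is to replay the same fork structure and let every process count how many processes of the chain are alive at that moment (itself plus its ancestors, which are all blocked in wait()). The sketch below is an instrumented variant, not the textbook program itself: the function name max_live_processes, the alive counter, and the pipe used for reporting are additions made for illustration, and children leave with _exit() instead of falling through to return.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Replays the textbook fork structure. The innermost child reports how many
 * processes are alive at that point. Returns that count, or -1 on error. */
int max_live_processes(void)
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();                 /* 1st fork: the child exits at once */
    if (pid == 0)
        _exit(0);
    if (pid > 0)
        wait(NULL);                     /* at most 2 processes so far */

    int alive = 1;                      /* only the original process is left */
    pid = fork();                       /* 2nd fork */
    if (pid == 0) {
        alive++;                        /* 2: original (waiting) + this child */
        pid = fork();                   /* 3rd fork */
        if (pid == 0) {
            alive++;                    /* 3 */
            pid = fork();               /* 4th fork */
            if (pid == 0) {
                alive++;                /* 4: nobody above has exited yet */
                if (write(fds[1], &alive, sizeof alive) != sizeof alive)
                    _exit(1);
                _exit(0);
            }
            wait(NULL);
            _exit(0);
        }
        wait(NULL);
        _exit(0);
    }
    if (pid > 0)
        wait(NULL);

    int result = -1;
    close(fds[1]);                      /* so read() sees EOF if nothing was written */
    if (read(fds[0], &result, sizeof result) != sizeof result)
        result = -1;
    close(fds[0]);
    return result;
}
```

If all four forks succeed, the function returns 4: the original process and two intermediate children are waiting while the innermost child runs.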

How to use fork() in an if statement

Can someone please explain to me what fork() != 0 means? From what I understand, I think it means "if fork is not false", or "if fork is true then...". I don't understand how Fork() can be true or false, seeing that it just creates a copy of a process, giving a parent and a child. Also, if a program were to say if (Fork() == 0), what would that mean?
#include "csapp.h"
int main(void)
{
    int x = 3;
    if (Fork() != 0)
        printf("x=%d\n", ++x);
    printf("x=%d\n", --x);
    exit(0);
}
fork() returns -1 if it fails, and if it succeeds, it returns the forked child's pid in the parent, and 0 in the child. So if (fork() != 0) tests whether it's the parent process.
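One caveat with a bare != 0 test on plain fork(): -1 is also nonzero, so a failed fork takes the "parent" branch. The following sketch spells out the three cases as a pure function. The enum and both function names are invented for this example and are not part of any library:

```c
#include <sys/types.h>

/* Hypothetical names, used only to label the three possible outcomes. */
enum fork_role { FORK_FAILED, FORK_CHILD, FORK_PARENT };

/* Maps a value returned by fork() to the role of the calling process. */
enum fork_role classify_fork_result(pid_t result)
{
    if (result == -1)
        return FORK_FAILED;     /* still in the original process, no child */
    if (result == 0)
        return FORK_CHILD;
    return FORK_PARENT;         /* result is the child's PID */
}

/* What `if (fork() != 0)` actually tests. */
int passes_nonzero_test(pid_t result)
{
    return result != 0;         /* true for the parent AND for -1 */
}
```

Here passes_nonzero_test(-1) is 1, which is why code that can see a failing fork() usually checks for -1 before anything else.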
From man fork
Return Value
On success, the PID of the child process is returned in the parent,
and 0 is returned in the child. On failure, -1 is returned in the
parent, no child process is created, and errno is set appropriately.
Assuming success, fork returns twice: once in the parent, and once in the child.
OK, I did the OP a disservice: I don't know where csapp.h comes from, but if it's this one then it isn't doing you any favours. I guess it is a thin wrapper on POSIX (eg. around fork()), but maybe works on other platforms too?
Because you mentioned fork() before Fork(), I assumed the latter was a typo, whereas it's actually a library function.
If you had been using fork() directly, it would be reasonable to expect you to check the manpage.
Since you're using a Fork() function provided by some library, that library really ought to document it, and doesn't seem to.
Standard (non csapp) usage is:
pid_t child = fork();
if (child == -1) {
    printf("fork failed - %d - %s\n", errno, strerror(errno));
    exit(-1);
}
if (child) {
    printf("I have a child with pid %d, so I must be the parent!\n", child);
} else {
    printf("I don't have a child ... so I must be the child!\n");
}
exit(0);
Let's try explaining it differently... When the function starts there's 1 process, and this process has an int x = 3.
Once you hit this line of code:
if (fork() != 0)
Now, assuming the fork() worked, we have two processes. They start out with identical memory contents and both run the same code (to a point), but the child gets its own copy of x to play with.
fork() will return 0 to the child process, so from the child process's perspective, the rest of the function is this:
printf("x=%d\n", --x);
exit(0);
The parent process, on the other hand, gets a nonzero value (the child's PID) back from fork(), so it will execute:
printf("x=%d\n", ++x);
printf("x=%d\n", --x);
exit(0);
What the output will be at this point is anyone's guess... You can't tell whether the parent or the child will run first.
But if we assume the parent's ++x is the next operation and the parent finishes before the child prints, then the output is:
x=4
x=3
x=2
Both parent and child hit the --x: the parent's x was 4 after the ++x and is 3 at the end; the child's x was 3 and is 2 at the end.
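The final values can be checked without depending on print order. This is a sketch under a few assumptions: plain fork() instead of the csapp Fork() wrapper, a made-up function name x_after_fork, and the child handing its x back through its exit status rather than printing it:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the same ++x / --x sequence as the question's program.
 * Returns the parent's final x and stores the child's final x in
 * *child_final. Returns -1 on error. */
int x_after_fork(int *child_final)
{
    int x = 3;
    pid_t pid = fork();

    if (pid == -1)
        return -1;

    if (pid != 0)
        ++x;                    /* parent only: its copy becomes 4 */
    --x;                        /* both: parent 4 -> 3, child 3 -> 2 */

    if (pid == 0)
        _exit(x);               /* child reports its own copy of x */

    int status;
    if (waitpid(pid, &status, 0) == -1 || !WIFEXITED(status))
        return -1;
    *child_final = WEXITSTATUS(status);
    return x;
}
```

The parent ends with 3 and the child with 2, whichever of them happens to run first, because each process modifies only its own copy of x.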
From fork() manual:
Upon successful completion, fork() returns a value of 0 to the child process and returns the process ID of the child process to the parent
process. Otherwise, a value of -1 is returned to the parent process, no child process is created, and the global variable errno is set to indicate
the error.
After the fork() call you have two execution threads. The if branch runs in the parent process' thread and the else branch runs in the child process' thread.
if ( fork() ) {
    printf("I am the parent!\n");
} else {
    printf("I am the child\n");
}
EDIT
For clarification: fork starts a process, which has a thread, memory, and possibly other resources. I tried (apparently without success) to emphasize the flows of execution by adding the word "thread".
However, by no means can one say that "parent" relates to "thread" in "parent process' thread".
Of course, my answer could be improved but I think there are already enough good answers here.
fork returns 0 to the child process and the process ID of the child to the parent process. Hence code commonly has the form if (fork()) { ... } else { ... }, which means that the code inside the if is executed only in the parent.
The better way to deal with it is
pid = fork();
if (pid) {
    // I am parent. Let us do something that only the parent has to do
} else {
    // I am child. Let us do something only the child has to do
}
// This code is common to both
The child pid may be useful to wait upon later or to detach from the parent.
I recommend replacing the if with a switch because there are 3 possible results:
#include <sys/types.h>
#include <unistd.h>
pid_t pid;
switch ((pid = fork ())) {
    case -1: /* error creating child. */
        break;
    case 0: /* I am the child process. */
        break;
    default: /* I am the parent process. */
        break;
}

What is the difference between fork()!=0 and !fork() in process creation

Currently, I am doing some exercises on operating system based on UNIX. I have used the fork() system call to create a child process and the code snippet is as follows :
if(!fork())
{
    printf("I am parent process.\n");
}
else
    printf("I am child process.\n");
And this program first executes the child process and then parent process.
But when I replace if(!fork()) with if(fork()!=0), the parent block executes first and then the child block. My question is: should the result be the same in both cases, or is there some reason behind this? Thanks in advance!
There is no guaranteed order of execution.
However, if(!fork()) and if(fork()!=0) do give opposite results logically: if fork() returns zero, then !fork() is true whilst fork()!=0 is false.
Also, from the man page for fork():
On success, the PID of the child process is returned in the parent, and 0 is returned in the child. On failure, -1 is returned in the parent, no child process is created, and errno is set appropriately.
So the correct check is
pid_t pid = fork();
if(pid == -1) {
    // ERROR in PARENT
} else if(pid == 0) {
    // CHILD process
} else {
    // PARENT process, and the child has ID pid
}
EDIT: As Wyzard says, you should definitely make sure you make use of pid later as well. (Also, fixed the type to be pid_t instead of int.)
You shouldn't really use either of those, because when the child finishes, it'll remain as a zombie until the parent finishes too. You should either capture the child's pid in a variable and use it to retrieve the child's exit status:
pid_t child_pid = fork();
if (child_pid == -1)
{
    // Fork failed, check errno
}
else if (child_pid)
{
    // Do parent stuff...
    int status;
    waitpid(child_pid, &status, 0);
}
else
{
    // Child stuff
}
or you should use the "double-fork trick" to dissociate the child from the parent, so that the child won't remain as a zombie waiting for the parent to retrieve its exit status.
Also, you can't rely on the child executing before the parent after a fork. You have two processes, running concurrently, with no guarantee about relative order of execution. They may take turns, or they may run simultaneously on different CPU cores.
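The double-fork trick mentioned above can be sketched roughly as follows. The names spawn_detached and example_work are placeholders invented for this example. The idea is that the first child exits immediately, so the grandchild is orphaned and is eventually reaped by init (or a subreaper) instead of by the original parent:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Placeholder for whatever the detached process should do. */
void example_work(void)
{
}

/* Runs work() in a grandchild that the caller never has to wait for.
 * Returns 0 if the intermediate child reported success, -1 otherwise. */
int spawn_detached(void (*work)(void))
{
    pid_t pid = fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* Intermediate child: fork again and exit right away. */
        pid_t grandchild = fork();
        if (grandchild == -1)
            _exit(1);
        if (grandchild == 0) {
            work();             /* grandchild: the real work happens here */
            _exit(0);
        }
        _exit(0);               /* orphans the grandchild on purpose */
    }

    /* Parent: the intermediate child exits quickly, so this wait is short
     * and leaves no zombie behind. */
    int status;
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A call such as spawn_detached(example_work) returns as soon as the intermediate child has exited. The trade-off is that the parent can no longer retrieve the worker's exit status.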
The order in which the parent and child reach their respective printf() statements is unspecified. It is likely that if you were to repeat your tests a large number of times, the results would be similar for both, in that for either version there would be times that the parent prints first and times the parent prints last.
!fork() and fork() == 0 both behave in the same way.
The condition itself cannot be the reason the execution sequence is any different.
The process is replicated, which means that child is now competing with parent for resources, including CPU. It is the OS scheduler that decides which process will get the CPU.
The sequence in which the child and parent processes are executed is determined by the scheduler. It determines when and for how long each process runs on the processor, so the sequence of the output may vary for one and the same program code. It is purely coincidental that the change in the source code led to the change in the output sequence.
By the way, your printf's should be just the other way round: if fork() returns 0, it's the child, not the parent process.
See the code example at http://en.wikipedia.org/wiki/Fork_%28operating_system%29. The German version of this article (http://de.wikipedia.org/wiki/Fork_%28Unix%29) contains a sample output and a short discussion of the execution order.
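If a particular order is actually required, it has to be enforced explicitly rather than left to the scheduler. A minimal sketch, assuming "child first" is the desired order: the parent calls waitpid() before producing its own output. The function name ordered_marks and the pipe (used here in place of printf so that the result can be inspected) are illustrative choices only:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* The child writes 'C' and the parent writes 'P' into the same pipe, but the
 * parent waits for the child first. Copies the two bytes into out in arrival
 * order. Returns 0 on success, -1 on error. */
int ordered_marks(char out[2])
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        char c = 'C';
        _exit(write(fds[1], &c, 1) == 1 ? 0 : 1);
    }

    /* Parent: block until the child has finished, then write. */
    int ok = (waitpid(pid, NULL, 0) == pid);
    char p = 'P';
    ok = ok && (write(fds[1], &p, 1) == 1);
    close(fds[1]);
    ok = ok && (read(fds[0], out, 2) == 2);
    close(fds[0]);
    return ok ? 0 : -1;
}
```

With the waitpid() call in place the bytes arrive as 'C' then 'P' on every run. Without it, either order would be possible.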

fork: where does the child start running?

Does the child, after the fork, start the program from the beginning or from the place where its parent is?
For example, in this program, does the child start from line 1 or line 3?
int i=1
fork()
i=i*2
fork
i=i*2
fork() creates a new process by duplicating the calling process.
The new process, referred to as the child, is an exact duplicate of
the calling process, referred to as the parent, except for the
following points: […]
from fork(2)
As it is an exact duplicate, it will also have the same instruction pointer and stack. So the child will be right after the call to fork(). Now, you may ask, how do I find out whether the current program is the child or the parent? See the manpage on the return value:
On success, the PID of the child process is returned in the parent,
and 0 is returned in the child. On failure, -1 is returned in the
parent, no child process is created, and errno is set appropriately.
So if the result of fork() is equal to 0, you're in the child process; if it's greater than 0, you're in the parent; and if it's below 0, you're in trouble.
Please note that this implies that all code which does not depend on the return value of fork() will be executed in both the child and the parent. So if you're, for example, creating a pool of 16 processes, you should be doing:
for (int i = 0; i < 16; i++) {
    pid_t pid = fork();
    if (pid == 0) {
        do_some_work();
        exit(0);
    } else if (pid < 0) {
        // fork failed
        do_some_error_handling();
    }
}
If you miss the exit(0), you'll spawn 2¹⁶-1 processes (been there, just with 100 instead of 16. No fun.)
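The doubling is easy to observe with a small n. In the sketch below (the function name and the pipe-based counting are invented for this illustration), no child exits inside the loop, so every process keeps forking in its remaining iterations. Each process then drops one byte into a shared pipe, and the original process counts the bytes. The limit of 10 is an arbitrary safety cap so that the experiment stays at 1024 processes or fewer:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the loop WITHOUT an exit in the child branch and returns the total
 * number of processes that came out of it (original included), or -1. */
int count_processes_without_exit(int n)
{
    int fds[2];
    if (n < 0 || n > 10 || pipe(fds) == -1)
        return -1;

    int is_original = 1;
    for (int i = 0; i < n; i++) {
        pid_t pid = fork();
        if (pid == 0)
            is_original = 0;    /* child: keeps looping, which is the bug */
        /* a failed fork is ignored here; the count just comes out lower */
    }

    char mark = 'x';
    if (write(fds[1], &mark, 1) != 1)
        _exit(1);               /* not expected: the pipe holds 1024 bytes easily */

    while (wait(NULL) > 0)      /* every process reaps its own children */
        ;
    if (!is_original)
        _exit(0);

    close(fds[1]);              /* all writers are gone: read until EOF */
    int count = 0;
    while (read(fds[0], &mark, 1) == 1)
        count++;
    close(fds[0]);
    return count;
}
```

For n = 3 this yields 8 processes in total, that is 2³ - 1 = 7 spawned plus the original, matching the 2ⁿ - 1 figure above.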
The child starts from line 3, right after the point where the fork occurred.
When fork returns, it returns in both the parent (returning the PID of the child) and the child (returning 0). Execution continues from there in both the parent and the child.
As such, typical use of fork is something like:
if (0 == (child = fork())) {
    // Continue as child.
} else {
    // Continue as parent.
}
The child is created at line 2, i.e. at fork(), but it starts executing from line 3, i.e. at i = i*2. What confuses me here is your line 4. What are you trying to do there?
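The question's five lines can be turned into a small experiment, assuming that line 4 was meant to be a second fork() call. If children restarted from line 1, i would be reset and the forking would never end. Instead, every process resumes right after the fork() that created it and carries the current value of i along. The function name run_two_forks and the pipe used to collect the values are additions for this sketch:

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Runs the question's sequence, collects the final i of every process, and
 * stores up to four of them in values. Returns the number of processes. */
int run_two_forks(int values[4])
{
    int fds[2];
    if (pipe(fds) == -1)
        return -1;
    pid_t original = getpid();

    int i = 1;
    fork();                     /* line 2: now 2 processes, both with i == 1 */
    i = i * 2;                  /* line 3: both continue here, i == 2 */
    fork();                     /* line 4: now 4 processes, all with i == 2 */
    i = i * 2;                  /* line 5: i == 4 everywhere */

    if (write(fds[1], &i, sizeof i) != sizeof i)
        _exit(1);
    while (wait(NULL) > 0)      /* reap own children, if any */
        ;
    if (getpid() != original)
        _exit(0);

    close(fds[1]);
    int count = 0, v;
    while (read(fds[0], &v, sizeof v) == sizeof v) {
        if (count < 4)
            values[count] = v;
        count++;
    }
    close(fds[0]);
    return count;
}
```

When both forks succeed in every process, the function returns 4 and all four collected values are 4. The fork() results are deliberately left unchecked here to mirror the question, so a failed fork simply shows up as a lower count.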

Resources