How is the flow of control in this program involving fork() system call? - c

From what I read about fork() system call
The fork system call creates a new process, called the child process, which runs concurrently with the parent process
After a new child process created, both processes will execute the next instruction following the fork() system call
fork() returns 0 to the child process
fork() returns Process ID of newly created child process to parent process (Positive value)
fork() returns negative value if child process creation fails
In this piece of code
#include <stdio.h>
#include <unistd.h>
void foo(void) {
    if (fork() == 0)
        printf("Hello from Child!\n");
    else
        printf("Hello from Parent!\n");
}
int main(void) {
    foo();
    return 0;
}
The output is
Hello from Parent!
Hello from Child!
The child process was created when the control was inside the condition of if-else of function foo in main process.
So from where (which instruction) did the child process start executing?
As it can be observed from the output, Hello from Parent is printed when fork() returns 0. So from my understanding Hello from Parent was actually printed by the Child Process
fork() returned a positive value to the parent process and the parent process printed Hello from Child. Is my understanding about this correct?
And from which instruction exactly did the child process start executing? The function call to fork() was given inside the condition section of an if-else. So the child should have started executing after that if-else, but that is not what is happening?

Let's start by identifying a primary misconception here:
As it can be observed from the output, Hello from Parent is printed when fork() returns 0. So from my understanding Hello from Parent was actually printed by the Child Process
The child and the parent are two separate processes running concurrently. The order of these two outputs isn't well-defined, will vary based on your kernel and other timing considerations, and isn't correlated with the fact that your code contains the if/else block written as you have it.1
Let's rewrite your code as a linear stream of "instructions" in an abstract sense:
0: Function foo():
1: Invoke system call fork(), no arguments, store result to $1
2: If $1 is non-zero, jump to label #1.
3: Invoke C function printf(), argument "Hello from Child!"
4: Jump to label #2.
5: Label #1:
6: Invoke C function printf(), argument "Hello from Parent!"
7: Label #2:
8: return control to calling function.
Once your program reaches 1:, the system call is invoked, transferring control to the kernel. The kernel duplicates the process, puts the PID of the child into the return value of fork in the parent process, and puts 0 into the return value of fork in the child. On x86, the return value is stored in register eax (rax for x64) as part of the syscall calling convention.
One of these two processes will eventually get scheduled to run by the kernel. In your case, the child process happened to be the first to get scheduled. Your user-mode code took control back from kernel mode, read the return value (out of eax/rax if on x86) which was zero, and did not jump to label #1. It printed Hello from Child!, and then returned from the function (to the caller of foo, since the child got a copy of the parent's stack).
The same happened for the parent, except the parent got a non-zero value back from the system call, and printed Hello from Parent!. It got scheduled to run, and your user-mode code took control from the kernel at the same point, just with a different value returned by the system call.
1 It's also possible that the two outputs might become interleaved in some way, but that's not as relevant to this discussion, and requires understanding how Linux processes perform I/O.

The child process is a second process that executes in parallel. You might just as easily have gotten
Hello from Child!
Hello from Parent!
For example, if you have a terminal window open, and you start firefox &, which runs “first,” the terminal window or the browser window? Both are running at the same time.
In fact, some Linux kernel versions start the child process slightly before they restart the parent; the default has changed between versions, so do not rely on either order. The rationale for running the child first is that a large number of programs that call fork() immediately have the child exec() a program, which frees the parent from needing to share all its memory with the child. This is more efficient, because the shared memory is copy-on-write.

Related

How can output from parent and child of fork() system call interleave with each other?

code reads something like :
pid=fork()
if(pid){
print parent_id
print parent_id
}
else{
print child_id
print child_id
}
when it was executed it was 
child 
parent 
child 
parent
I don't understand how this could happen because child and parent should be printed consecutively.
You probably have a multi-core CPU, so the parent and child can be running on separate CPU cores. Printing is slow enough that the parent process has time to get its 1st print started before your child process's 2nd print starts.
Or if not, on a single-core machine then scheduling could context-switch after one print.
The whole point of fork is to create a 2nd process that's also running; there should be no expectation that their system calls don't interleave with each other.
There are existing near-duplicates about fork race conditions / timing, although those are primarily about which process runs first, not about any expectation of continuing to run for multiple system calls before the other process can do anything.
fork(), runs always the parent first and then the child
Who executes first after fork(): parent or the child?
In fork() which will run first, parent or child?
Which process runs first when a fork() is called
And more generally Understanding parent and child process execution order about making multiple system calls including read in both parent and child, causing one or both to block.
And two where people had the opposite problem: they expected interleaving of output but didn't get it. (Perhaps because buffered I/O resulted in only one write system call, e.g. when run from an IDE with output connected to a pipe instead of a tty.)
fork() does not run parallel
Child process starts after parent process

Control flow of fork system call when wait is present or not present

In this code (run on linux):
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
void child_process()
{
    int count=0;
    for(;count<1000;count++)
    {
        printf("Child Process: %04d\n",count);
    }
    printf("Child's process id: %d\n",(int)getpid());
}
void parent_process()
{
    int count=0;
    for(;count<1000;count++)
    {
        printf("Parent Process: %04d\n",count);
    }
}
int main()
{
    pid_t pid;
    int status;
    if((pid = fork()) < 0)
    {
        printf("unable to create child process\n");
        exit(1);
    }
    if(pid == 0)
        child_process();
    if(pid > 0)
    {
        printf("Return value of wait: %d\n",(int)wait(&status));
        parent_process();
    }
    return 0;
}
If the wait() were not present in the code, one of the processes (child or parent) would finish its execution, then control would be given to the Linux terminal, and finally the remaining process (child or parent) would run. The output in such a case is:
Parent Process: 0998
Parent Process: 0999
guest@debian:~/c$ Child Process: 0645 //Control given to terminal & then child process is again picked for processing
Child Process: 0646
Child Process: 0647
In case wait() is present in the code, what should be the flow of execution?
When fork() is called, a process tree containing the parent and child processes must be created. In the above code, when the child process ends, the parent is informed about the death of the (zombie) child process via the wait() system call. But since parent and child are two separate processes, is it mandatory that control is passed directly to the parent after the child process is over (with no control given to any other process, like the terminal, at all)? If yes, then it is as if the child process were a part of the parent process (like a function called from another function).
This comment is, at least, misleading:
//Control given to terminal & then child process is again picked for processing
The "terminal" process doesn't really enter into the equation. It's always running, assuming that you are using a terminal emulator to interact with your program. (If you're using the console, then there is no terminal process. But that's unlikely these days.)
The process in control of the user interface is whatever shell you're using. You type some command-line like
$ ./a.out
and the shell arranges for your program to run. (The shell is an ordinary user program without special privileges, by the way. You could write your own.)
Specifically, the shell:
Uses fork to create a child process.
Uses waitpid to wait for that child process to finish.
The child process sets up any necessary redirects and then uses some exec system call, typically execve, to replace itself with the ./a.out program, passing execve (or whatever) the command line arguments you specified.
That's it.
Your program, in ./a.out, uses fork to create a child and then possibly waits for the child to finish before terminating. As soon as your parent process terminates, the shell's waitpid() can return, and as soon as it returns, the shell prints a new command prompt.
So there are at least three relevant processes: the shell, your parent process, and your child process. In the absence of synchronisation functions like waitpid(), there are no guarantees about ordering. So when your parent process calls fork(), the created child could start executing immediately. Or not. If it does start executing immediately, it does not necessarily preempt your parent process, assuming your computer is reasonably modern and has more than one core. They could both be executing at the same time. But that's not going to last very long because your parent process will either immediately call exit or immediately call wait.
When a process calls wait (or waitpid), it is suspended and becomes runnable again when the process it is waiting for terminates. But again there are no guarantees. The mere fact that a process is runnable doesn't mean that it will immediately start running. But generally, in the absence of high load, the operating system will start running it pretty soon. Again, it might be running at the same time as another process, such as your child process (if your parent didn't wait for it to finish).
In short, if you performed your experiment a million times, and your parent waits for your child, then you will see the same result a million times; the child must finish before the parent is unsuspended, and your parent must finish before the shell is unsuspended. (If your parent process printed something before waiting, you would see different results; the parent and child outputs could be in any order, or even overlapped.)
If, on the other hand, your parent does not wait for the child, then you could see any of a number of results, and in a million repetitions you're likely to see more than one of them (but not with the same probability). Since there is no synchronisation between parent and child, the outputs could appear in either order (or be interleaved). And since the child is not synchronised with the shell, its output could appear before or after the shell's prompt, or be interleaved with the shell's prompt. No guarantees, other than that the shell will not resume until your parent is done.
Note that the terminal emulator, which is a completely independent process, is runnable the entire time. It owns a pseudo-terminal ("pty") which is how it emulates a terminal. The pseudo-terminal is a kind of pipe; at one end of the pipe is the process which thinks it's communicating with a console, and at the other end is the terminal emulator which interprets whatever is being written to the pty in order to render it in the GUI, and which sends any keystrokes it receives, suitably modified as a character stream back through the pipe. Since the terminal emulator is never suspended and its execution is therefore interleaved with whatever other processes are active on your computer, it will (more or less) immediately show you any output which is sent by your shell or the processes it starts up. (Again, assuming the machine is not overloaded.)

How does fork() know when to return 0?

Take the following example:
int main(void)
{
pid_t pid;
pid = fork();
if (pid == 0)
ChildProcess();
else
ParentProcess();
}
So correct me if I am wrong, once fork() executes a child process is created. Now going by this answer fork() returns twice. That is once for the parent process and once for the child process.
Which means that two separate processes come into existence DURING the fork call and not after it ending.
Now I don't get it how it understands how to return 0 for the child process and the correct PID for the parent process.
This is where it gets really confusing. This answer states that fork() works by copying the context information of the process and manually setting the return value to 0.
First am I right in saying that the return to any function is placed in a single register? Since in a single processor environment a process can call only one subroutine that returns only one value (correct me if I am wrong here).
Let's say I call a function foo() inside a routine and that function returns a value, that value will be stored in a register say BAR. Each time a function wants to return a value it will use a particular processor register. So if I am able to manually change the return value in the process block I am able to change the value returned to the function right?
So am I correct in thinking that is how fork() works?
How it works is largely irrelevant - as a developer working at a certain level (ie, coding to the UNIX APIs), you really only need to know that it works.
Having said that however, and recognising that curiosity or a need to understand at some depth is generally a good trait to have, there are any number of ways that this could be done.
First off, your contention that a function can only return one value is correct as far as it goes but you need to remember that, after the process split, there are actually two instances of the function running, one in each process. They're mostly independent of each other and can follow different code paths. The following diagram may help in understanding this:
Process 314159 | Process 271828
-------------- | --------------
runs for a bit |
calls fork     |
               | comes into existence
returns 271828 | returns 0
Here's one possibility on how it could work.
When the fork() function starts running, it stores the current process ID (PID).
Then, when it comes time to return, if the PID is the same as that stored, it's the parent. Otherwise it's the child. Pseudo-code follows:
def fork():
    saved_pid = getpid()
    # Magic here, returns PID of other process or -1 on failure.
    other_pid = split_proc_into_two()
    if other_pid == -1:        # fork failed -> return -1
        return -1
    if saved_pid == getpid():  # pid same, parent -> return child PID
        return other_pid
    return 0                   # pid changed, child, return zero
Note that there's a lot of magic in the split_proc_into_two() call and it almost certainly won't work that way at all under the covers(a). It's just to illustrate the concepts around it, which is basically:
get the original PID before the split, which will remain identical for both processes after they split.
do the split.
get the current PID after the split, which will be different in the two processes.
You may also want to take a look at this answer, it explains the fork/exec philosophy.
(a) It's almost certainly more complex than I've explained. For example, in MINIX, the call to fork ends up running in the kernel, which has access to the entire process tree.
It simply copies the parent process structure into a free slot for the child, along the lines of:
sptr = (char *) proc_addr (k1); // parent pointer
chld = (char *) proc_addr (k2); // child pointer
dptr = chld;
bytes = sizeof (struct proc); // bytes to copy
while (bytes--) // copy the structure
    *dptr++ = *sptr++;
Then it makes slight modifications to the child structure to ensure it will be suitable, including the line:
chld->p_reg[RET_REG] = 0; // make sure child receives zero
So, basically identical to the scheme I posited, but using data modifications rather than code path selection to decide what to return to the caller - in other words, you'd see something like:
return rpc->p_reg[RET_REG];
at the end of fork() so that the correct value gets returned depending on whether it's the parent or child process.
In Linux fork() happens in kernel; the actual place is the _do_fork here. Simplified, the fork() system call could be something like
pid_t sys_fork() {
    pid_t child = create_child_copy();
    wait_for_child_to_start();
    return child;
}
So in the kernel, fork() really returns once, into the parent process. However the kernel also creates the child process as a copy of the parent process; but instead of returning from an ordinary function, it would synthetically create a new kernel stack for the newly created thread of the child process; and then context-switch to that thread (and process); as the newly created process returns from the context switching function, it would make the child process' thread end up returning to user mode with 0 as the return value from fork().
Basically, fork() in userland is just a thin wrapper that returns the value that the kernel put onto its stack/into the return register. The kernel sets up the new child process so that it returns 0 via this mechanism from its only thread; and the child pid is returned in the parent system call as any other return value from any system call such as read(2) would be.
You first need to know how multitasking works. It is not necessary to understand all the details, but every process runs in some kind of a virtual machine controlled by the kernel: a process has its own memory, processor and registers, etc. There is a mapping of these virtual objects onto the real ones (the magic is in the kernel), and there is some machinery that swaps virtual contexts (processes) onto the physical machine as time passes.
Then, when the kernel forks a process (fork() is an entry to the kernel), and creates a copy of almost everything in the parent process to the child process, it is able to modify everything needed. One of these is the modification of the corresponding structures to return 0 for the child and the pid of the child in the parent from current call to fork.
Note: never say "fork returns twice"; a function call returns only once.
Just think about a cloning machine: you enter alone, but two persons exit, one is you and the other is your clone (very slightly different); while cloning the machine is able to set a name different than yours to the clone.
The fork system call creates a new process and copies a lot of state from the parent process. Things like the file descriptor table gets copied, the memory mappings and their contents, etc. That state is inside the kernel.
One of the things the kernel keeps track of for every process is the set of register values the process needs to have restored on return from a system call, trap, interrupt or context switch (most context switches happen on system calls or interrupts). Those registers are saved on a syscall/trap/interrupt and then restored when returning to userland. System calls return values by writing into that state, which is what fork does. The parent's fork gets one value, the child process a different one.
Since the forked process is different from the parent process, the kernel could do anything to it. Give it any values in registers, give it any memory mappings. To actually make sure that almost everything except the return value is the same as in the parent process requires more effort.
For each running process, the kernel has a table of registers, to load back when a context switch is made. fork() is a system call; a special call that, when made, the process gets a context switch and the kernel code executing the call runs in a different (kernel) thread.
The value returned by system calls is placed in a special register (EAX in x86) that your application reads after the call. When the fork() call is made, the kernel makes a copy of the process, and in the table of registers of each process descriptor writes the appropriate value: 0 for the child, and the child's pid for the parent.

How to wait for system() completion before advancing into next cycle

I'm using system("./foo 1 2 3") within C to call an external application. I use it inside a for cycle and I want to wait for the foo execution to complete (each execution takes 20/30 seconds) before going into the next cycle iteration. This is a MUST.
The returned system() value only tells me if the process was successfully started or not. So how can I do this?
I looked into fork() and wait() already but didn't manage to do what I want.
Edit:Here's my fork and wait code:
for(i=0;i<64;i++){
    if((pid=fork()==-1)){
        perror("fork error");
        return -1;
    }
    else if(pid==0){
        status=system("./foo 1 2 3"); //THESE 1 2 3 PARAMETERS CHANGE WITHIN EACH ITERATION
    }
    else{ /* start of parent process */
        printf("Parent process started.\n");
        if ((pid = wait(&status)) == -1)/* Wait for child process. */
            printf("wait error");
        else { /* Check status. */
            if (WIFSIGNALED(status) != 0)
                printf("Child process ended because of signal %d.\n",
                       WTERMSIG(status));
            else if (WIFEXITED(status) != 0)
                printf("Child process ended normally; status = %d.\n",
                       WEXITSTATUS(status));
            else
                printf("Child process did not end normally.\n");
        }
    }
}
What happens when I do this is that the PC gets extremely slow, to the point that I need to manually reboot. So what I guess this is doing is starting 64 simultaneous child processes, causing the computer to become really slow.
On a POSIX system, the system function should already be waiting for the command to finish.
http://pubs.opengroup.org/onlinepubs/009695399/functions/system.html
If command is not a null pointer, system() shall return the termination status of the command language interpreter in the format specified by waitpid(). The termination status shall be as defined for the sh utility; otherwise, the termination status is unspecified. If some error prevents the command language interpreter from executing after the child process is created, the return value from system() shall be as if the command language interpreter had terminated using exit(127) or _exit(127). If a child process cannot be created, or if the termination status for the command language interpreter cannot be obtained, system() shall return -1 and set errno to indicate the error.
The one thing to watch out for is if you're starting the program in the background within the command (i.e. if you're doing "./foo &") - the obvious answer is just don't do that.
After you call fork, the child calls system which starts another child that foo runs in. Once it completes, the child continues the next iteration of the for loop. So after the first loop iteration you have 2 processes, then 4 after the next, and so forth. You're spawning off processes at an exponential rate which causes the system to grind to a halt.
There are a few ways to address this:
After the call to system, you have to call exit so the forked off child quits.
Use exec instead of system. This will start foo in the same process as the child. A successful call to exec does not return, however if it fails you still want to print an error and call exit after exec.
Don't bother with fork or wait at all and just call system in a loop, since system doesn't return until the command is completed.
EDIT:
This loop is exhibiting some strange behavior. Here is the culprit:
if((pid=fork()==-1)){
You've got some misplaced parentheses here. The innermost expression is pid=fork()==-1. Because == has higher precedence than =, it first evaluates fork()==-1. If fork was successful, this evaluates to false, i.e. 0. So then it evaluates pid=0. So after this conditional, both the parent and the child have pid==0.
After applying one of the above changes, put the parentheses in the right place:
if((pid=fork())==-1){
And everything should work fine.
wait(2)
All of these system calls are used to wait for state changes in a
child of the calling process, and obtain information about the child
whose state has changed. A state change is considered to be: the
child terminated; the child was stopped by a signal; or the child was
resumed by a signal. In the case of a terminated child, performing a
wait allows the system to release the resources associated with the
child; if a wait is not performed, then the terminated child remains
in a "zombie" state (see NOTES below).
If a child has already changed state, then these calls return
immediately. Otherwise, they block until either a child changes
state or a signal handler interrupts the call (assuming that system
calls are not automatically restarted using the SA_RESTART flag of
sigaction(2)). In the remainder of this page, a child whose state
has changed and which has not yet been waited upon by one of these
system calls is termed waitable.
I found out what the problem was.
I saw here: https://askubuntu.com/questions/420981/how-do-i-save-terminal-output-to-a-file
that in order to save the stderr to file I needed to do &>output.txt.
So I was doing "./foo 1 2 3 &>output.txt" but that & causes the system process to go into background.
+1 to @Random832 for guessing it (even though I never said I was using &> - sorry guys, my bad).
Btw, if you want the stderr to be exported to a file you can use 2>output.txt

What is the purpose of fork()?

In many programs and man pages of Linux, I have seen code using fork(). Why do we need to use fork() and what is its purpose?
fork() is how you create new processes in Unix. When you call fork, you're creating a copy of your own process that has its own address space. This allows multiple tasks to run independently of one another as though they each had the full memory of the machine to themselves.
Here are some example usages of fork:
Your shell uses fork to run the programs you invoke from the command line.
Web servers like apache use fork to create multiple server processes, each of which handles requests in its own address space. If one dies or leaks memory, others are unaffected, so it functions as a mechanism for fault tolerance.
Google Chrome uses fork to handle each page within a separate process. This will prevent client-side code on one page from bringing your whole browser down.
fork is used to spawn processes in some parallel programs (like those written using MPI). Note this is different from using threads, which don't have their own address space and exist within a process.
Scripting languages use fork indirectly to start child processes. For example, every time you use a command like subprocess.Popen in Python, you fork a child process and read its output. This enables programs to work together.
Typical usage of fork in a shell might look something like this:
pid_t child_process_id = fork();
if (child_process_id > 0) {
    // Fork returns the child's pid in the parent process. Parent executes this.
    // wait for the child process to complete
    int status;
    waitpid(child_process_id, &status, 0);
    // child process finished!
} else if (child_process_id == 0) {
    // Fork returns 0 in the child process. Child executes this.
    // new argv array for the child process
    char *argv[] = {"arg1", "arg2", "arg3", NULL};
    // now start executing some other program; execv only returns on failure
    execv("/path/to/a/program", argv);
    _exit(127);
}
The shell spawns a child process using fork and exec and waits for it to complete, then continues with its own execution. Note that you don't have to use fork this way. You can always spawn off lots of child processes, as a parallel program might do, and each might run a program concurrently. Basically, any time you're creating new processes in a Unix system, you're using fork(). For the Windows equivalent, take a look at CreateProcess.
If you want more examples and a longer explanation, Wikipedia has a decent summary. And here are some slides on how processes, threads, and concurrency work in modern operating systems.
fork() is how Unix creates new processes. At the point you called fork(), your process is cloned, and two different processes continue the execution from there. One of them, the child, will have fork() return 0. The other, the parent, will have fork() return the PID (process ID) of the child.
For example, if you type the following in a shell, the shell program will call fork(), and then execute the command you passed (telnetd, in this case) in the child, while the parent will display the prompt again, as well as a message indicating the PID of the background process.
$ telnetd &
As for the reason you create new processes, that's how your operating system can do many things at the same time. It's why you can run a program and, while it is running, switch to another window and do something else.
fork() is used to create child process. When a fork() function is called, a new process will be spawned and the fork() function call will return a different value for the child and the parent.
If the return value is 0, you know you're the child process, and if the return value is a positive number (which happens to be the child process id), you know you're the parent. (And if it's a negative number, the fork failed and no child process was created.)
http://www.yolinux.com/TUTORIALS/ForkExecProcesses.html
fork() is basically used to create a child process of the process that calls it. In the child, fork() returns zero.
pid = fork();
if (pid == 0) {
    // this is the child process
} else if (pid > 0) {
    // this is the parent process
}
In this way you can give the parent and the child different actions and have them run concurrently as separate processes (this is multiprocessing rather than multithreading).
fork() will create a new child process identical to the parent. So everything you run in the code after that will be run by both processes — very useful if you have for instance a server, and you want to handle multiple requests.
System call fork() is used to create processes. It takes no arguments and returns a process ID. The purpose of fork() is to create a new process, which becomes the child process of the caller. After a new child process is created, both processes will execute the next instruction following the fork() system call. Therefore, we have to distinguish the parent from the child. This can be done by testing the returned value of fork():
If fork() returns a negative value, the creation of a child process was unsuccessful.
fork() returns a zero to the newly created child process.
fork() returns a positive value, the process ID of the child process, to the parent. The returned process ID is of type pid_t defined in sys/types.h. Normally, the process ID is an integer. Moreover, a process can use function getpid() to retrieve the process ID assigned to this process.
Therefore, after the system call to fork(), a simple test can tell which process is the child. Please note that Unix will make an exact copy of the parent's address space and give it to the child. Therefore, the parent and child processes have separate address spaces.
Let us understand it with an example to make the above points clear. This example does not distinguish parent and the child processes.
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>
#define MAX_COUNT 200
#define BUF_SIZE 100
int main(void)
{
    pid_t pid;
    int i;
    char buf[BUF_SIZE];
    fork();
    pid = getpid();
    for (i = 1; i <= MAX_COUNT; i++) {
        sprintf(buf, "This line is from pid %d, value = %d\n", (int)pid, i);
        write(1, buf, strlen(buf));
    }
    return 0;
}
Suppose the above program executes up to the point of the call to fork().
If the call to fork() is executed successfully, Unix will make two identical copies of address spaces, one for the parent and the other for the child.
Both processes will start their execution at the next statement following the fork() call. In this case, both processes will start their execution at the assignment
pid = .....;
Both processes start their execution right after the system call fork(). Since both processes have identical but separate address spaces, those variables initialized before the fork() call have the same values in both address spaces. Since every process has its own address space, any modifications will be independent of the others. In other words, if the parent changes the value of its variable, the modification will only affect the variable in the parent process's address space. Other address spaces created by fork() calls will not be affected even though they have identical variable names.
Why use write rather than printf? Because printf() is "buffered", meaning printf() collects a process's output in memory before passing it on. While the parent's output is being buffered, the child may also call printf, and its output is buffered as well. Since the output is not sent to the screen immediately, the lines may not appear in the order you expect. Worse, the output from the two processes may be mixed in strange ways. To overcome this problem, consider using the "unbuffered" write.
If you run this program, you might see the following on the screen:
................
This line is from pid 3456, value = 13
This line is from pid 3456, value = 14
................
This line is from pid 3456, value = 20
This line is from pid 4617, value = 100
This line is from pid 4617, value = 101
................
This line is from pid 3456, value = 21
This line is from pid 3456, value = 22
................
Process ID 3456 may be the one assigned to the parent or the child. Because these processes run concurrently, their output lines are intermixed in a rather unpredictable way. Moreover, the order of these lines is determined by the CPU scheduler. Hence, if you run this program again, you may get a totally different result.
You probably don't need to use fork in day-to-day programming if you are writing applications.
Even if you do want your program to start another program to do some task, there are other, simpler interfaces which use fork behind the scenes, such as "system" in C and Perl.
For example, if you wanted your application to launch another program such as bc to do some calculation for you, you might use 'system' to run it. System does a 'fork' to create a new process, then an 'exec' to turn that process into bc. Once bc completes, system returns control to your program.
You can also run other programs asynchronously, but I can't remember how.
If you are writing servers, shells, viruses or operating systems, you are more likely to want to use fork.
Multiprocessing is central to computing. For example, your IE or Firefox can create a process to download a file for you while you are still browsing the internet. Or, while you are printing out a document in a word processor, you can still look at different pages and still do some editing with it.
Fork creates new processes. Without fork you would have a unix system that could only run init.
fork() is used to create new processes, as everybody has written.
Here is my code that creates processes in the form of a binary tree. It asks for the number of levels up to which you want to create processes in the tree.
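#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int main()
{
    pid_t t1 = -1, t2 = -1, p;
    int i, n, ab;
    p = getpid();
    printf("enter the number of levels\n"); fflush(stdout);
    if (scanf("%d", &n) != 1)
        return 1;
    printf("root %d\n", (int)p); fflush(stdout);
    for (i = 1; i < n; i++)
    {
        t1 = fork();
        if (t1 != 0)
            t2 = fork();
        if (t1 != 0 && t2 != 0)
            break; /* parent: it has forked both children, so it stops here */
        /* only the two new children get here and go on to the next level */
        printf("child pid %d parent pid %d\n", (int)getpid(), (int)getppid()); fflush(stdout);
    }
    if (i < n) { /* left the loop via break, so this process has children */
        waitpid(t1, &ab, 0);
        waitpid(t2, &ab, 0);
    }
    return 0;
}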
OUTPUT
enter the number of levels
3
root 20665
child pid 20670 parent pid 20665
child pid 20669 parent pid 20665
child pid 20672 parent pid 20670
child pid 20671 parent pid 20670
child pid 20674 parent pid 20669
child pid 20673 parent pid 20669
First, one needs to understand what the fork() system call is. Let me explain.
The fork() system call creates an exact duplicate of the parent process. It duplicates the parent's stack, heap, initialized data, and uninitialized data, and shares the code in read-only mode with the parent process.
fork() copies memory on a copy-on-write basis, which means a page of virtual memory is actually copied for the child only when one of the processes writes to it.
Now the purpose of fork():
fork() can be used wherever work is divided up. For example, a server has to handle multiple clients, so the parent keeps accepting connections and forks a child for each client to do the reading and writing.
fork() is used to spawn a child process. Typically it's used in similar sorts of situations as threading, but there are differences. Unlike threads, fork() creates whole separate processes. Although the child and the parent are direct copies of each other at the point that fork() is called, they are completely separate, and neither can access the other's memory space (without going to the normal troubles you go to in order to access another program's memory).
fork() is still used by some server applications, mostly ones that run as root on a *NIX machine and drop permissions before processing user requests. There are some other use cases still, but mostly people have moved to multithreading now.
The rationale behind fork() versus just having an exec() function to initiate a new process is explained in an answer to a similar question on the unix stack exchange.
Essentially, since fork copies the current process, all of the various possible options for a process are established by default, so the programmer does not have to supply them.
In the Windows operating system, by contrast, programmers have to use the CreateProcess function which is MUCH more complicated and requires populating a multifarious structure to define the parameters of the new process.
So, to sum up, the reason for forking (versus exec'ing) is simplicity in creating new processes.
The fork() system call is used to create a child process, which is an exact duplicate of the parent process. fork() copies the stack section, heap section, data section, environment variables, and command line arguments from the parent.
refer: http://man7.org/linux/man-pages/man2/fork.2.html
fork() was created as a way to create another process that starts with a copy of the parent's memory state. It works the way it does because it was the most minimal change possible to get good threading capabilities in time-slicing mainframe systems that previously lacked this capability. Additionally, programs needed remarkably little modification to become multi-process: fork() could simply be added in the appropriate locations, which is rather elegant. Basically, fork() was the path of least resistance.
Originally it actually had to copy the entire parent process' memory space. With the advent of virtual memory, it has been hacked and changed to be more efficient, with copy-on-write mechanisms avoiding the need to actually copy memory until it is modified.
However, modern systems now allow the creation of actual threads, which simply share the parent process' actual heap. With modern multi-threading programming paradigms and more advanced languages, it's questionable whether fork() provides any real benefit, since forked processes cannot by default communicate through memory directly and have to fall back on slower message-passing mechanisms.
