if else statement concurrency with fork() - c

While reading through some articles about fork() in C, I saw this example (code below) which I couldn't understand:
Understanding problem:
We only run "if" or "else" and not both. But since the child and parent processes run "simultaneously", in this example we see that both the "if" and the "else" branch were taken!
Even though it's called simultaneous, it isn't in reality; it depends on which of the processes gets the CPU first (right?).
What makes everything "weirder" is that we might first go through the "else" and then through the "if". How is this possible?
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
void forkexample()
{
    // child process because return value zero
    if (fork() == 0)
        printf("Hello from Child!\n");
    // parent process because return value non-zero.
    else
        printf("Hello from Parent!\n");
}
int main()
{
    forkexample();
    return 0;
}
Possible Outputs are:
Hello from Child!
Hello from Parent!
(or)
Hello from Parent!
Hello from Child!

Remember that the fork function actually returns twice: once to the parent with the child's pid and once to the child with 0.
At that point you have two independent processes running the same code and printing to the same terminal. And since you have two processes, the kernel is free to schedule them in any way it sees fit, meaning the output of either process can appear in any order.

fork is a system call which does the following (in abstraction; the actual implementation is likely to differ on modern systems):
Creates a copy of the process image in memory and sets it up as a process with its own PID
In one image (the child) returns 0
In the second image (the parent) returns the PID of the new process
Schedules execution of both images as usual
As a result, on a multi-core system, parent and child execute independently and, absent other synchronization, are in no way related to each other.
If you observe the output of those two processes, it can come in any order: output from the parent can precede the output from the child, follow it, or be interleaved with it. It is unpredictable what you will end up seeing.
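If a fixed order is wanted, the parent has to wait for the child explicitly. The following is a minimal sketch of that idea, assuming a POSIX system; the helper name child_then_parent is made up for illustration and is not part of the question's code.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: fork, let the child print first, then the parent.
 * Returns the child's exit status, or -1 on failure. */
int child_then_parent(void)
{
    fflush(stdout);              /* don't let the child inherit unflushed output */

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {              /* fork() returned 0: this is the child */
        printf("Hello from Child!\n");
        fflush(stdout);
        _exit(0);                /* leave without running the parent's code */
    }

    /* fork() returned the child's PID: this is the parent */
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)   /* block until the child has finished */
        return -1;
    printf("Hello from Parent!\n");
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling child_then_parent() from main should print the child's line before the parent's line, because the parent does not reach its printf until waitpid has returned.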

Related

How can the multi-core cpu run the program interleaved?

The output of the program is obviously not the contents of the printf()s in the code. Instead it looks like characters in an irregular sequence. I know the reason is that the parent process and child process are running at the same time, but in this program I only see pid=fork(), which I think means pid is only the ID of the child process.
So why can the parent process print?
How do the two processes run together?
// fork.c: create a new process
#include "kernel/types.h"
#include "user/user.h"
int
main()
{
  int pid;
  pid = fork();
  printf("fork() returned %d\n", pid);
  if(pid == 0){
    printf("child\n");
  } else {
    printf("parent\n");
  }
  exit(0);
}
output:
ffoorrkk(()) rreettuurrnende d 0
1c9h
ilpda
rent
I focus my answer on showing how the observed output can result from the shown program. I think that it will already clear things up for you.
This is your output.
I edited it to use a good guess of what is parent (p) and child (c):
ffoorrkk(()) rreettuurrnende d 0\n
cpcpcpcpcpcpcpcpcpcpcpcpccpcpcppccc
1 c9h\n
pccpcpp
ilpda\n
ccpcpcc
rent
pppp
If you only use the chars with a "c" beneath, you get
fork() returned 0
child
If you only use the chars with a "p" beneath, you get
fork() returned 19
parent
Split that way, it should match what you know about how fork() works.
Comments already provided the actual answer to the three "?"-adorned questions in title and body of your question post.
Lundin:
It creates two processes and they are executed just as any other process, decided by the OS scheduler.
Yourself:
each time fork() is called it will return twice, the parent process will return the id of child process, and child process will return 0
Maybe for putting a more obvious point on it:
The parent process receives the child ID and also continues executing the program after the fork().
That is why the output occurs twice, similarly, interleaved, with differences in the PID value and the selected if branch.
Also relevant is that in the given situation there is no line buffering. Otherwise there would be no character-by-character interleaving and everything would be much more readable.
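To illustrate the buffering point: if each process hands a complete line to the kernel in one write() call, the line is much less likely to be chopped up character by character. This is a minimal sketch assuming a POSIX system with snprintf (so it is not xv6 code); emit_line is a hypothetical helper, not something from the question.

```c
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: format one line into a buffer and pass it to the
 * kernel with a single write(), instead of emitting it byte by byte.
 * Returns the number of bytes written, or -1 on error. */
ssize_t emit_line(int fd, const char *label, int value)
{
    char line[128];
    int len = snprintf(line, sizeof line, "%s %d\n", label, value);
    if (len < 0 || (size_t)len >= sizeof line)
        return -1;               /* formatting failed or line too long */
    return write(fd, line, (size_t)len);
}
```

For example, emit_line(STDOUT_FILENO, "fork() returned", pid) in both parent and child would typically produce two whole lines in some order, rather than the character soup shown in the question.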

what should be returned when calling "getpid() == fork()"

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
int main(int argc, char **argv) {
    printf ("%d", getpid() == fork());
    return 0;
}
The output of this program is 00. I don't quite understand why 0 is printed twice. My understanding is that after fork() is called, a child process is created, and both processes then continue running from the next line of the program. Doesn't that mean the child process will just run return 0? I can see that I would also get 00 if it were fork() == getpid(), though. Thanks!
fork does not cause the child process to jump to the "next line". Assuming the operation succeeds, the fork function call returns twice, once in each process. So both processes execute the comparison to getpid.
Also, the C standard doesn't specify whether the call to getpid happens before the call to fork (in which case it happens only once) or after (in which case it happens twice and returns two different values), but this turns out not to matter, because all of the possible situations lead to the comparison being false:
fork fails: no new process is created and it returns −1 to the parent, which is not a valid process ID and therefore cannot compare equal to the value returned to the parent by getpid. It doesn't matter whether the getpid happens before or after the fork, because it's the same process in either case.
fork succeeds: it returns the child's process ID to the parent, and it returns zero to the child.
The child's process ID cannot be equal to the parent's process ID (because both of them are running at the same time), so, in the parent, the comparison will always be false, and it doesn't matter whether the getpid happens before or after the fork because it's the same process in either case.
If the getpid call happened before the fork, the child will compare zero to the parent's process ID; if it happened after the fork, the child will compare zero to the child's process ID. Zero is not a valid process ID either, so the comparison will be false either way.
getpid is one of a very few system calls that POSIX says cannot ever fail, so we don't have to worry about that possibility.
Therefore, the only things this program can print are 0 (if fork fails) or 00 (if fork succeeds).
I would strongly recommend not writing anything like this in a real program. Operations with "abnormal" control-flow behavior, like fork, should always be done as stand-alone statements, because this makes the program easier for humans to read. You might not have realized yet just how important that is, so let me leave you with an exercise: reread a program that you wrote more than three months ago, and try to remember what it does and why. (If you haven't been programming for long enough to do that, make a note to do this exercise when you can.)
fork documentation
When fork() returns, in both parent and child processes - it returns immediately after the ending of the invocation of the fork() system call. It could well be in the middle of the line as is the case here - you return with a value for the expression fork() and use that value to continue evaluating the expression that contained it - in this case - a condition within a printf.
fork() returns the PID of the child process in the parent process or 0 in the child process (or -1 on error).
In this case, the condition getpid() == fork() evaluates to 0 (false) in both processes. In the parent, getpid() returns the parent process's own PID, which differs from the child's PID returned by fork(). In the child, fork() returns 0, which is not a valid PID and so can never be returned by getpid().
Thus the 00 output.
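The same comparison can be written with fork() as a stand-alone statement, which makes it easier to see what each process compares. A minimal sketch, assuming POSIX; the helper name count_true_comparisons is invented for this illustration, and getpid is deliberately called before the fork here (one of the two orders the original expression allows).

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: evaluates the comparison in both processes and
 * returns how many of them saw "true", or -1 on failure. */
int count_true_comparisons(void)
{
    pid_t self = getpid();       /* called once, before the fork */
    fflush(stdout);

    pid_t result = fork();       /* returns twice on success */
    if (result < 0)
        return -1;

    if (result == 0) {
        /* child: result is 0, and 0 is never a valid PID */
        _exit(getpid() == result ? 1 : 0);
    }

    /* parent: result is the child's PID, which differs from our own */
    int parent_true = (self == result);
    int status = 0;
    if (waitpid(result, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return parent_true + WEXITSTATUS(status);
}
```

The function should return 0: neither process ever sees the comparison succeed, which matches the 00 output.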

C - fork() and processes behaviour

I have a small program written in C on Linux. Its purpose is to examine the behaviour of the fork() call and the resulting child processes.
Upon first inspection everything seems simple enough. However:
1. Sometimes output is written in a funny order.
2. Sometimes the child PPID is '1', not whatever the parent PID is.
I can't find any pattern or correlation between when it works as expected and when it does not.
I think that point 2 is probably caused by the parent process dying before the child process has executed fully. If so, is there a way to stop this from happening?
However, I have no idea what is causing point 1.
Code below:
#include <stdio.h>
#include <unistd.h>
int main()
{
    int x = fork();
    if (x == 0)
    {
        printf("Child:");
        printf ("\nChild PID : %d", getpid());
        printf ("\nChild PPID: %d", getppid());
        printf("\nHello Child World\n");
    }
    if (x != 0)
    {
        printf("Parent :");
        printf ("\nParent PID : %d", getpid());
        printf ("\nParent PPID: %d", getppid());
        printf("\nHello Parent World\n");
    }
    return 0;
}
This behaviour comes from the operating system's scheduling policy. When the parent is running and calls fork(), the child is created, but if the parent's time slice has not yet expired, the parent keeps running. If the parent finishes and terminates within that time slice, the child becomes an orphan process before it even gets to run. That is why getppid() returns 1: the orphaned child has been adopted by the init process, which is the first process started when the operating system boots and has process ID 1.
Explanation of Behaviour 1:
The order of output normally cannot be controlled by the program. That's the point of parallel processes. The OS decides which process to execute at any point in time, and both processes are executed simultaneously (to the human eye).
Thus the output will generally be intertwined.
Explanation of Behaviour 2:
You guessed that right.
The parent process has finished before the forked one.
If you want the child to see the real parent PID, call waitpid(x, &status, 0) in the parent process so that the parent stays alive until the child has finished executing.
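A minimal sketch of that fix, assuming POSIX: the parent records its own PID, forks, and then blocks in waitpid so it cannot exit before the child has called getppid(). The helper name child_sees_real_parent is hypothetical.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns 1 if the child saw the real parent PID,
 * 0 if it saw something else, -1 on failure. */
int child_sees_real_parent(void)
{
    pid_t parent = getpid();     /* remembered so the child can compare */
    fflush(stdout);

    pid_t x = fork();
    if (x < 0)
        return -1;

    if (x == 0) {
        /* child: the parent is still alive, blocked in waitpid() */
        printf("Child PPID: %d (parent was %d)\n", (int)getppid(), (int)parent);
        fflush(stdout);
        _exit(getppid() == parent ? 0 : 1);
    }

    int status = 0;
    if (waitpid(x, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0;
}
```

Because the parent only continues after the child has exited, the child is never orphaned while it runs, so its PPID is the parent's PID rather than 1.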

Is there any formula to know how fork() makes near-perfect copy of the current process?

#include <stdio.h>
#include <unistd.h>
int main(void)
{
    int i, n=1;
    for(i=0;i<n;i++) {
        fork();
        printf("Hello!");
    }
    return 0;
}
I am confused if I put n=1, it prints Hello 2 times.
If n=2, it prints Hello 8 times
If n=3, it prints Hello 24 times..
and so on..
There is no single "formula" for how it's done, as different operating systems do it in different ways. But what fork() does is make a copy of the process. The rough steps usually involved are:
Stop current process.
Make a new process, and initialize or copy related internal structures.
Copy process memory.
Copy resources, like open file descriptors, etc.
Copy CPU/FPU registers.
Resume both processes.
You don't only fork the 'main' process, you also fork the children!
First iteration:
m -> c1
//then
m -> c2 c1-> c1.1
m -> c3 c1-> c1.1 c2->c2.1 c1.1 -> c1.1.1
for i = ....
Write it down this way:
main
fork
child(1)
fork child(2)
fork child(1.1)
fork child(3)
fork child(1.2)
fork child(2.1)
and so on ...
fork() makes a near-perfect copy of the current process; the differences you will notice are the process ID and the value that fork() returns.
Indeed in most modern operating systems, immediately after fork() the two processes share exactly the same chunk of memory. The OS makes a new copy only when the new process writes to that memory (in a mode of operation called "copy-on-write") -- if neither process ever changes the memory, then they can carry on sharing that chunk of memory, saving overall system memory.
But all of that is hidden from you as a programmer. Conceptually you can think of it as a perfect copy.
That copy includes all the values of variables at the moment the fork() happened. So if the fork() happens inside a for() loop, with i==2, then the new process will also be mid-way through a for() loop, with i==2.
However, the processes do not share i. When one process changes the value of i after the fork(), the other process's i is not affected.
To understand your program's results, modify the program so that you know which process is printing each line, and which iteration it is on.
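#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main(void) {
    int i, n = 4;
    for (i = 0; i < n; i++) {
        /* remember who is about to fork, so the child can report its parent */
        pid_t pid = getpid();
        if (fork() == 0) {
            printf("New process forked from %d with i=%d and pid=%d\n", (int)pid, i, (int)getpid());
        }
        printf("Hello number %d from pid %d\n", i, (int)getpid());
    }
    return 0;
}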
Since the timing will vary, you'll get output in different orders, but you should be able to make sense of where all the "Hellos" are coming from, and why there are as many as there are.
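The counts in the question (2, 8, 24) equal n * 2^n, not the 2 + 4 + ... + 2^n = 2^(n+1) - 2 you would get by counting printf calls. A likely explanation is stdio buffering: "Hello!" has no newline, so it stays in the process's stdout buffer, and fork() copies that buffer along with everything else. Each of the 2^n processes alive at the end then flushes n buffered copies. The sketch below tries to measure this under that assumption; count_hellos is a hypothetical helper that runs the question's loop with stdout redirected into a pipe.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: run the question's loop in a child whose stdout is a
 * pipe and count the "Hello!" strings that arrive. If flush_each is nonzero,
 * every printf is followed by fflush(). Returns -1 on failure. */
long count_hellos(int n, int flush_each)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    fflush(stdout);                      /* start the child with an empty buffer */
    pid_t root = fork();
    if (root < 0)
        return -1;

    if (root == 0) {
        close(fds[0]);
        dup2(fds[1], STDOUT_FILENO);     /* stdout now feeds the pipe */
        close(fds[1]);
        for (int i = 0; i < n; i++) {
            fork();                      /* buffered text is copied too */
            printf("Hello!");
            if (flush_each)
                fflush(stdout);
        }
        fflush(stdout);                  /* what a normal exit would do */
        while (wait(NULL) > 0)           /* reap this process's own children */
            ;
        _exit(0);
    }

    close(fds[1]);                       /* only descendants hold the write end */
    long bytes = 0;
    char buf[256];
    ssize_t got;
    while ((got = read(fds[0], buf, sizeof buf)) > 0)
        bytes += got;
    close(fds[0]);
    waitpid(root, NULL, 0);
    return bytes / 6;                    /* "Hello!" is 6 bytes long */
}
```

With flush_each set to 0 this should reproduce the question's numbers (8 for n=2, 24 for n=3); with flush_each set to 1 each printf is emitted exactly once, giving 6 for n=2.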
When you fork() you create a new child process that has its own virtual memory, which initially maps to the same physical memory as the parent's. At that point both processes can only read from that memory. As soon as one of them writes to the shared memory, it gets its own physical copy with write permission. Therefore, when you fork(), the child initially has the same i value as its parent, but from then on that value changes separately in every child.
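That separation can be checked directly. In this minimal sketch (POSIX assumed; the helper name parent_i_after_child_write is made up), the child overwrites its copy of i and the parent then looks at its own.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: returns the parent's value of i after the child
 * has overwritten its own copy, or -1 on failure. */
int parent_i_after_child_write(void)
{
    int i = 2;
    fflush(stdout);

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    if (pid == 0) {
        i = 99;                  /* changes only the child's copy */
        _exit(i == 99 ? 0 : 1);
    }

    if (waitpid(pid, NULL, 0) < 0)   /* make sure the child's write has happened */
        return -1;
    return i;                    /* the parent's copy is untouched */
}
```

The function returns 2: even after the child has finished writing 99 into i, the parent's variable still holds the value it had at the fork.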
Yes, there is a formula:
fork() makes exactly one near-identical copy of the process. It returns twice, however. Once in the original process (the parent, returning the child pid) and once in the child (returning 0).

Where does code Execution start in a child process?

Consider the code:
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>
/* main --- do the work */
int main(int argc, char **argv)
{
    pid_t child;
    if ((child = fork()) < 0) {
        fprintf(stderr, "%s: fork of child failed: %s\n",
                argv[0], strerror(errno));
        exit(1);
    } else if (child == 0) {
        // do something in child
    } else {
        // do something in parent
    }
    return 0;
}
My question is: where in the code does the child process start executing, i.e. which line is executed first?
If it executed the whole code, it would also create its own child process, and this would go on forever, which certainly does not happen!
If it starts after the fork() call, how does it get into the if statement in the first place?
The child starts executing at the return from the fork function, not at the start of the code. fork returns the PID of the child in the parent process, and returns 0 in the child process.
When you execute a fork(), the process is duplicated in memory.
So what effectively happens is that you will have two processes executing the snippet you posted, but their fork() return values will be different.
For the child process fork() will return 0, so the other branches of the if won't be executed; the same thing happens, with a different branch, for the parent process.
When fork() is called, the operating system assigns a new address space to the new process and then starts it. Both processes run the same code, but since the return value is different they execute different parts of it (if correctly split, as in your example).
The child starts by executing the next instruction (not line) after fork. So in your case it is the assignment of the fork's return value to the child variable.
Well, if I understand your question correctly: your code is already running as a process when it reaches the if statement, so that process evaluates the if statement anyway. After fork(), you have another process (the child process).
In Unix, a process can create another process, that's why that happens.
Code execution in a child process starts from the next instruction following the fork() system call.
The fork() system call just creates a separate address space for the child process. The child is therefore a cloned copy of the parent process and has all the memory contents of its parent.
Thus, after spawning a child process through fork(), both processes (the parent process and the child process) resume execution right from the next instruction following the fork() system call.
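One way to see this is to leave a marker before the fork() call and another after it, then count them. The sketch below assumes POSIX and writes the markers into a pipe instead of the terminal; markers_around_fork is a hypothetical helper, and the encoding of its result (10 times the number of 'A' markers plus the number of 'B' markers) is only a convenience for this example.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: writes 'B' before fork() and 'A' after it, then
 * reports how often each marker was written as 10 * (#A) + (#B).
 * Returns -1 on failure. */
int markers_around_fork(void)
{
    int fds[2];
    if (pipe(fds) < 0)
        return -1;

    if (write(fds[1], "B", 1) != 1)      /* before fork(): one process runs this */
        return -1;

    pid_t pid = fork();
    if (pid < 0)
        return -1;

    /* after fork(): both parent and child continue from here */
    int ok = (write(fds[1], "A", 1) == 1);

    if (pid == 0)
        _exit(ok ? 0 : 1);               /* the child stops here */

    close(fds[1]);                       /* so read() below can reach end of file */
    waitpid(pid, NULL, 0);

    int before = 0, after = 0;
    char c;
    while (read(fds[0], &c, 1) == 1) {
        if (c == 'B')
            before++;
        else if (c == 'A')
            after++;
    }
    close(fds[0]);
    return 10 * after + before;
}
```

The result is 21: the code before fork() ran once, the code after it ran twice, so the child did not start over from the top of the function.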

Resources