fork vs vfork functionality in a C program

I am doing some C exercise for self-learning, and came across the following problem:
Part a:
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int main(int argc, char **argv) {
    int a = 5, b = 8;
    int v;
    v = fork();
    if (v == 0) {
        // a = 10
        a = a + 5;
        // b = 10
        b = b + 2;
        exit(0);
    }
    // Parent code
    wait(NULL);
    printf("Value of v is %d.\n", v); // line a
    printf("Sum is %d.\n", a + b); // line b
    exit(0);
}
Part b:
int main(int argc, char **argv) {
    int a = 5, b = 8;
    int v;
    v = vfork();
    if (v == 0) {
        // a = 10
        a = a + 5;
        // b = 6
        b = b - 2;
        exit(0);
    }
    // Parent code
    wait(NULL);
    printf("Value of v is %d.\n", v); // line a
    printf("Sum is %d.\n", a + b); // line b
    exit(0);
}
We have to compare the outputs of line a and line b.
The output of part a is:
Value of v is 79525.
Sum is 13.
The output of part b is:
Value of v is 79517.
Sum is 16.
It appears that in part a the sum uses the initial values of a and b, whereas in part b the sum includes the changes made within the child process.
My question is - why is this happening?
According to this post:
The basic difference between the two is that when a new process is
created with vfork(), the parent process is temporarily suspended, and
the child process might borrow the parent's address space. This
strange state of affairs continues until the child process either
exits, or calls execve(), at which point the parent process continues.
The phrase "parent process is temporarily suspended" doesn't make much sense to me. Does this mean that for part b, the parent waits until the child process finishes running (which would explain why the child's changes show up in the sum) before it continues?
The problem statement also assumes that "the process ID of the parent process maintained by the kernel is 2500, and that the new processes are created by the operating system before the child process is created."
By this definition, what would the value of v for both programs be?

the parent process is temporarily suspended
Basically, the parent process will not run until the child calls either _exit or one of the exec functions. In your example this means the child will run and therefore perform the summation before the parent runs and does the prints.
As for:
My question is - why is this happening?
First, your part b has undefined behavior because you are violating the vfork semantics. Undefined behavior for a program means the program will not behave in a predictable manner. See this SO post on undefined behavior for more details (it includes some C++ but most of the ideas are the same). From the POSIX specs on vfork:
The vfork() function has the same effect as fork(2), except that the
behavior is undefined if the process created by vfork() either
modifies any data other than a variable of type pid_t used to store
the return value from vfork(), or returns from the function in which
vfork() was called, or calls any other function before successfully
calling _exit(2) or one of the exec(3) family of functions.
So your part b could really do anything. However, you will probably see a somewhat consistent output from part b. This is because when you use vfork you are not creating a new address space. Instead, the child process basically "borrows" the address space of the parent, usually with the intent that it will call one of the exec functions and create a new program image. Instead in your part b you are using the parent address space. Basically, after the child has called exit (which is also invalid as it should call _exit) a most likely will equal 10 and b will most likely equal 6 in the parent. Therefore, the summation is 16 as shown in part b. I say most likely because as stated before this program has undefined behavior.
For part a where fork is used the child gets its own address space and its modifications are not seen in the parent, therefore the value printed is 13 (5 + 8).
Finally, with regard to the value of v, this seems to be just something the question states so that the output it shows makes sense. The value of v could be any valid value returned by vfork or fork and does not have to be limited to 2500.

Related

Understanding fork() order in C

So I have this program I'm trying to understand. It's from an old exam, but I just can't get a grip on it. How do I know the order of the forks and how the variables are changed?
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
static int g = -1;
int main(int argc, char *argv[])
{
    int v = 0;
    pid_t p;
    while (v++ < 6)
        if ((p = fork()) < 0) {
            perror("fork error");
            return 1;
        } else if (p == 0) {
            v++;
            g--;
        } else {
            g++;
            v+=3;
            if (waitpid(p, NULL, 0) != p) {
                perror("waitpid error");
                return 1;
            }
        }
    printf("mypid = %d parentpid = %d p = %d v = %d g = %d\n",
           getpid(), getppid(), p, v, g);
    return 0;
}

Each call to fork generates its own process with its own variables, which are copied at the time of the call (logically; optimization may change when the actual copy happens, but not in a way that'll change the outcome).
So when you enter the loop, v gets incremented to 1, then you fork. At this point the parent process has g=-1, v=1, p=(the child's PID) and the new child has g=-1, v=1, p=0. The parent then drops into the else case, incrementing g to 0 and v to 4 and then waiting for the child to complete, whereas the child drops into the "else if (p == 0)", increments v to 2, decrements g to -2, and goes around the loop again.
From there, you've hopefully now got enough information to follow the logic as the next two child processes get forked off, finish off the loop, and print their respective results. When they do, the first child will also come to the end of its waitpid with v=6, drop out of the loop, and print its results.
At this point, the parent will unblock, go around the loop one more time (forking off one more child along the way), and (once that child has completed) drop out of the loop.

The call to fork() both starts a new process and continues the old one. If there is some kind of error, it returns an error value. All errors and only errors are negative numbers. This is what the first if block checks.
In the new process, fork() returns 0. The branch that increments v and decrements g is therefore called only in the child process, not the parent.
In the original process, the fork() function returns the process identifier (PID) of the child process, which is a positive integer. (This will later be passed to waitpid().) Therefore, the branch that increments g and adds 3 to v is only executed in the parent process, not the child.
Each process has its own copy of v and g. (That's the main difference between a process and a thread: threads share memory.) On a modern SMP operating system, what will happen is that the child process gets a copy of the parent's memory map, but the two maps refer to the same pages of physical memory until one process or the other writes to them. When that happens, a copy is made of that page of memory and both processes now get their own, different copies.
Which process runs first after fork() is not guaranteed by POSIX and depends on the kernel's scheduler; some Linux kernels have let the child process continue before the parent does. This can make a significant difference to performance. Most programs that call fork() immediately have the child process call exec() to start a new program. That means it isn't going to need its copy of the parent's memory at all. (There is a newer, simpler way to start a different program in a new process now, posix_spawn().) The parent process, on the other hand, almost always keeps running and modifying its memory. Therefore, giving the child the chance to declare that it's going to discard the memory it inherited means that the parent doesn't need to worry about leaving an unmodified copy of any memory pages for its children, and the kernel does not have to go through the rigmarole of copy-on-write.
In practice, though, any decent compiler will keep both local variables in registers, so this issue will not arise.
On the next iteration of the loop, which only happens after the child process terminates, a new child process is spawned using the updated values of the parent’s variables. Each child process also continues to run the loop with the same values of v and g that it inherited from its parent.

C: printf (with fork())

In the next code:
int i = 1;
fork();
i=i*2;
fork();
i=i*2;
fork();
i=i*2;
printf("%d\n", i);
Why is 8,8,8,8,8,8,8,8 printed, and not 1,2,2,4,4,8,8,8? fork() duplicates the process, so I expected it to print the i from before each fork. What am I missing?

Given the code shown, you should be seeing eight lots of 6 (you wrote i = i + 2; instead of i = i * 2; for the last computation).
Since each process follows the same code path, each process will produce the same result.
To get the result you expected, you'd have to track whether each fork() yielded the parent or child process:
int i = 1;
if (fork())
{
    i=i*2;
    if (fork())
    {
        i=i*2;
        if (fork())
            i=i*2; // + --> *
    }
}
printf("%d\n", i);
I'm assuming there are no problems with the fork() operation. It is also interesting to note that you could invert any or all of the conditions and end up with the same result.

Because after fork, each process continues to execute the code that follows, going downwards. So each of the processes will run through every remaining i = i * 2 as it spawns off more children. That gives the output you got rather than the one you expected (i.e. a process doesn't jump to the end of the block once forked).
Info on fork: http://www.csl.mtu.edu/cs4411/www/NOTES/process/fork/create.html

Each new process gets a copy of the stack of the parent, so immediately after calling fork(), both parent and child have the same value for i -- but they don't have the same stack, just a copy... so changing i's value in one process has no effect on the other.
If you want two parallel pieces of code to share the same memory, either use threads (and memory that's in the heap, not on the stack), or use an explicit shared memory region.

working of fork in c language [closed]

Now I have a problem understanding how the fork() system call works.
I wrote the following code:
#include <stdio.h>
#include <unistd.h>
int main()
{
    int a, b;
    b=fork();
    printf("\n the value of b = %d",b);
}
The output of this code is the following:
Now I don't understand why the output is like this.
After that I just added a line to my code, and the output is completely different.
My code is the following:
int main()
{
int a, b;
b=fork();
When I run the code, the output is the following:
2389my name is manish
the value of b = 0
Now I'm totally confused about how the fork() call works.
My questions are the following:
How does fork() work?
Where does control go after the fork() call?
Can anybody explain the outputs of the code in this question?
Why does the output of b occur at different places? In the first code
the output b = 2260 comes just before the output b = 0, while the value b = 2389 does not come just before b = 0.
Please explain how fork works in the code in this question so that I can learn it properly.

It might help to first understand why the word fork was used to name this function. Ever heard of a "fork in the road"? At a fork, the process has to split paths.
First there is a single process executing normally until you reach the fork call. When fork is called, a new process is created, which is identical in virtually every way as the original process, except for the return value of the fork function. The newly created process is called the child process, and hence the process that spawned it is referred to as the parent process.
Since you'd want to perform different tasks for each branch of the fork, it necessitates that you be able to distinguish the child process from the parent process. That's where the return value of fork comes in: fork returns the process id (pid) of the child (the newly created process) to the parent; it returns 0 to the child. Also, should the execution of fork go wrong, the return value is -1.
In your code, you don't distinguish between the child and parent process, so both processes run the entire code that follows after the fork call.
// what the child process looks like after fork is called
int main()
{
    int a, b;
    b=fork(); // <-- current line of execution: 0 is returned to b
    printf("\nmy name is manish\n");
    printf("\n my name is anil\n");
    printf("\n the value of b = %d",b);
}
// what the parent process looks like after fork is called
int main()
{
    int a, b;
    b=fork(); // <-- current line: child process id is returned
    printf("\nmy name is manish\n");
    printf("\n my name is anil\n");
    printf("\n the value of b = %d",b);
}
As you can see, both processes have the same code following the fork, hence the output is repeated. Perhaps if you want the parent process to output Manish and the child to output Anil, then you can do something like:
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>
int main()
{
    pid_t b; // note that the actual return type of fork is
             // pid_t, though it's probably just an int typedef'd or macro'd
    b = fork();
    if (b == -1)
        perror("Fork failed");
    else if (b > 0)
        printf("My name is Manish\n"); // parent process
    else
        printf("My name is Anil\n"); // child process
    printf("The value of b is %d\n", b);
    return 0;
}
Finally, the last comment that must be made is that in your code, the output appears to have been produced first by one process in its entirety and then by the other process in its entirety. That may not always be the case. For example, the operating system might allow the parent to execute the 'manish' output, then make this process wait and hand the CPU over to the child process, which then executes 'manish'. However, the child process may continue and execute the 'anil' and 'b' outputs, completing execution of the child process and thus returning execution back to the parent process. Now the parent finishes its execution by outputting 'anil' and 'b' itself. The final output of running this program may look something like:
my name is manish // executed by parent
my name is anil // child
the value of b = 0 // child
my name is anil // parent
the value of b = 2244 // parent
manish.yadav#ws40-man-lin:~$
Take a look at the man page for fork.
Also look at waitpid for proper handling of child processes by parent processes so you don't create zombies.
Edit: In response to your questions in the comments, I'll answer how you can simply run each process consecutively.
int main()
{
    pid_t pid;
    int i;
    for (i=0; i<NUM_PROCESSES; i++)
    {
        pid = fork();
        if (pid == -1)
        {
            perror("Error forking");
            return -1;
        }
        else if (pid > 0)
        {
            // parent process
            waitpid(-1, NULL, 0); // might want to look at man page for this
                                  // it will wait until the child process is done
        }
        else
        {
            // do whatever each process needs to do;
            // then exit()
            doProcess(i);
            exit(0);
        }
    }
    // do anything else the parent process needs to do
    return 0;
}
Of course, this isn't the best code, but it's just to illustrate the point. The big idea here is the waitpid call, which causes the parent process to wait until the child process it just forked terminates. After the child process completes, the parent continues after the waitpid call, starting another iteration of the for loop and forking another (the next) process. This continues until all child processes have executed sequentially and execution finally returns to the parent.

Fork creates a copy of your current process.
Both the original and the copy continue executing from the point at which fork() was called.
Because your code is executed twice, your print statements are also evaluated twice. In the copied process, the value of b is 0. In the original process, the value of b is the process ID of the copied process.
Once your processes start running concurrently, they will be scheduled independently by your operating system and thus you have no guarantees about when they will actually be run.

1. Forking is implemented by the OS. It basically creates a child process and starts running it after the fork().
2. The parent process receives the process ID of the child process: after b=fork(); b holds that process ID. In the child process, fork() returns zero.
3. (and 4) Because both processes can either run in parallel or be time-sliced, your output will vary.
You may want to check this out: http://en.wikipedia.org/wiki/Fork_(operating_system)

You'd better start from this.
Here you find explanation and code example.

Functionality of fork()

System information: I am running 64bit Ubuntu 10.10 on a 2 month old laptop.
Hi everyone, I've got a question about the fork() function in C. From the resources I'm using (Stevens/Rago, YoLinux, and Opengroup) it is my understanding that when you fork a process, both the parent and child continue execution from the next command. Since fork() returns 0 to the child, and the process id of the child to the parent, you can diverge their behavior with two if statements, one if(pid == 0) for the child and if(pid > 0), assuming you forked with pid = fork().
Now, I am having the weirdest thing occur. At the beginning of my main function, I am printing to stdout a couple of command line arguments that have been assigned to variables. This is the first non-assignment statement in the entire program, yet it would seem that every time I call fork later in the program, these print statements are executed.
The goal of my program is to create a "process tree" with each process having two children, down to a depth of 3, thus creating 15 total children of the initial executable. Each process prints its parent's process ID and its own process ID before and after the fork.
My code is as follows and is properly commented. Command line arguments should be "ofile 3 2 -p" (I haven't gotten to implementing the -p/-c flags yet):
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
int main (int argc, char *argv[])
{
    if(argc != 5)//checks for correct amount of arguments
    {
        return 0;
    }
    FILE * ofile;//file to write to
    pid_t pid = 1;//holds child process id
    int depth = atoi(argv[2]);//depth of the process tree
    int arity = atoi(argv[3]);//number of children each process should have
    printf("%d%d", depth, arity);
    ofile = fopen(argv[1], "w+");//opens specified file for writing
    int a = 0;//counter for arity
    int d = 0;//counter for depth
    while(a < arity && d < depth)//makes sure depth and arity are within limits, if the children reach too high(low?) of a depth, loop fails to execute
    //and if the process has forked arity times, then the loop fails to execute
    {
        fprintf(ofile, "before fork: parent's pid: %d, current pid: %d\n", getppid(), getpid());//prints parent and self id to buffer
        pid = fork(); //forks program
        if(pid == 0)//executes for child
        {
            fprintf(ofile, "after fork (child):parent's pid: %d, current pid: %d\n", getppid(), getpid());//prints parent's id and self id to buffer
            a=-1;//resets arity to 0 (after current iteration of loop is finished), so new process makes correct number of children
            d++;//increases depth counter for child and all of its children
        }
        if(pid > 0)//executes for parent process
        {
            waitpid(pid, NULL, 0);//waits on child to execute to print status
            fprintf(ofile, "after fork (parent):parent's pid: %d, current pid: %d\n", getppid(), getpid());//prints parent's id and self id to buffer
        }
        a++;//increments arity counter
    }
    fclose(ofile);
}
When I run gcc main.c -o ptree then ptree ofile 3 2 -p, the console is spammed with "32" repeating seemingly infinitely, and the file ofile is of seemingly proper format, but far far too large for what I think my program should be doing.
Any help would be greatly appreciated.

I am not sure why the fputs to stdout would be executed for the children, and don't have a Unix box to hand to verify/test.
However, the following jumps out:
int depth = *argv[2];//depth of the process tree
int arity = *argv[3];//number of children each process should have
You are taking the ASCII codes of the first character in argv[2] and argv[3] as your depth and arity, so your code is trying to spawn 50^51 processes instead of 2^3.
What you want is:
int depth = atoi(argv[2]);//depth of the process tree
int arity = atoi(argv[3]);//number of children each process should have
Once you fix this, bleh[0] = depth and its twin will also need correcting.
edit Although this is not a problem right now, you're cutting it pretty close with the length of some of the things you're sprintfing into obuf. Make some of the messages just a little bit longer and Kaboom! At the very least you want to use snprintf or, better yet, fprintf into the file directly.
edit I've just realised that fork, being an OS function, most probably isn't aware of internal buffering done by C I/O functions. This would explain why you get duplicates (both parent and child get a copy of buffered data on fork). Try fflush(stdout) before the loop. Also fflush(ofile) before every fork.

You have 2 errors in your code :
1)
int depth = *argv[2];//depth of the process tree
int arity = *argv[3];//number of children each process should have
With this code you are getting the first char of the strings argv[2] and argv[3].
The correct code would look like this:
int depth = atoi(argv[2]);
int arity = atoi(argv[3]);
2)
bleh[0] = depth;
fputs(bleh, stdout);
bleh[0] = arity;
fputs(bleh, stdout);
You can do something like bleh[0] = (char) depth; but you'll keep only the first byte of your integer, and I guess that's not what you want. If you want to print the whole integer, simply use:
printf("%d\n%d", depth, arity);
I just tried your code with those modifications and it seems to work well :)
Anhuin

You can't print out numbers using that code at the start of your function. It's probably invoking undefined behavior by passing a non-string to fputs(). You should use sprintf() (or, even better, snprintf()) to format the number into the string properly, and of course make sure the buffer is large enough to hold the string representation of the integers.
Also, you seem to be emitting text to the file, but yet it is opened in binary mode which seems wrong.

How to make multiple `fork()`-ed processes communicate using shared memory?

I have a parent with 5 child processes. I want to send a random variable to each child process. Each child will square the variable and send it back to the parent, and the parent will sum them all together.
Is this even possible? I can't figure it out...
edit: this process would use shared memory.

There are a great number of ways to do this, all involving some form of inter-process communication. Which one you choose will depend on many factors, but some are:
shared memory.
pipes (popen and such).
sockets.
In general, I would probably popen a number of communications sessions in the parent before spawning the children; the parent will know about all five but each child can be configured to use only one.
Shared memory is also a possibility, although you'd probably have to have a couple of values in it per child to ensure communications went smoothly:
a value to store the variable and return value.
a value to store the state (0 = start, 1 = variable ready for child, 2 = variable ready for parent again).
In all cases, you need a way for the children to only pick up their values, not those destined for other children. That may be as simple as adding a value to the shared memory block to store the PID of the child. All children would scan every element in the block but would only process those where the state is 1 and the PID is their PID.
For example:
Main creates shared memory for five children. Each element has state, PID and value.
Main sets all states to "start".
Main starts five children who all attach to the shared memory.
Main stores all their PIDs.
All children start scanning the shared memory for state = "ready for child" and their PID.
Main puts in first element (state = "ready for child", PID = pid1, value = 7).
Main puts in second element (state = "ready for child", PID = pid5, value = 9).
Child pid1 picks up first element, changes value to 49, sets state to "ready for parent", goes back to monitoring.
Child pid5 picks up second element, changes value to 81, sets state to "ready for parent", goes back to monitoring.
Main picks up pid5's response, sets that state back to "start".
Main picks up pid1's response, sets that state back to "start".
This gives a measure of parallelism, with each child continuously monitoring the shared memory for work it's meant to do while Main places the work there and periodically receives the results.

The nastiest method uses vfork() and lets the different children trample on different parts of memory before exiting; the parent then just adds up the modified bits of memory.
Highly unrecommended - but about the only case I've come across where vfork() might actually have a use.
Just for amusement (mine) I coded this up:
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>
#include <sys/wait.h>
int main(void)
{
    int i;
    int array[5];
    int square[5];
    long sum = 0;
    srand(time(0));
    for (i = 0; i < 5; i++)
    {
        array[i] = rand();
        if (vfork() == 0)
        {
            square[i] = array[i] * array[i];
            execl("/bin/true", "/bin/true", (char *)0);
        }
        else
            wait(0);
    }
    for (i = 0; i < 5; i++)
    {
        printf("in: %d; square: %d\n", array[i], square[i]);
        sum += square[i];
    }
    printf("Sum: %ld\n", sum); /* sum is a long, so use %ld */
    return(0);
}
This works. The previous trial version using 'exit(0)' in place of 'execl()' did not work; the square array was all zeroes. Example output (32-bit compilation on Solaris 10, SPARC):
in: 22209; square: 493239681
in: 27082; square: 733434724
in: 2558; square: 6543364
in: 17465; square: 305026225
in: 6610; square: 43692100
Sum: 1581936094
Sometimes, the sum overflows - there is much room for improvement in the handling of that.
The Solaris manual page for 'vfork()' says:
Unlike with the fork() function, the child process borrows
the parent's memory and thread of control until a call to
execve() or an exit (either abnormally or by a call to
_exit() (see exit(2)). Any modification made during this
time to any part of memory in the child process is reflected
in the parent process on return from vfork(). The parent
process is suspended while the child is using its resources.
That probably means the 'wait()' is unnecessary in my code. (However, trying to simplify the code seemed to make it behave indeterminately. It is rather crucial that i does not change prematurely; the wait() does ensure that synchronicity. Using _exit() instead of execl() also seemed to break things. Don't use vfork() if you value your sanity - or if you want any marks for your homework.)

Things like the anti thread might make this a little easier for you, see the examples (in particular the ns lookup program).
