Understanding forks in C

I am having some trouble understanding the following simple C code:
#include <stdio.h>
#include <unistd.h>
int main(int argc, char *argv[]) {
    int n = 0;
    fork();
    n++;
    printf("hello: %d\n", n);
    return 0;
}
My current understanding of a fork is that from that line of code on, it will split the rest of the code in 2, that will run in parallel until there is "no more code" to execute.
From that prism, the code after the fork would be:
a)
n++; //sets n = 1
printf("hello: %d\n", n); //prints "hello: 1"
b)
n++; //sets n = 2
printf("hello: %d\n", n); //prints "hello: 2"
What happens, though, is that both print
hello: 1
Why is that?
EDIT: Only now it occurred to me that, contrary to threads, processes don't share the same memory. Is that right? If yes, then that'd be the reason.

After fork() you have two processes, each with its own "n" variable.

fork() starts a new process, sharing no variables/memory locations.
It is very similar to what happens if you execute ./yourprogram twice in a shell, assuming the first thing the program does is forking.

When fork() returns, both processes may still be backed by the same physical copy of n (copy-on-write). At n++, however, each process gets its own private copy, which starts at n=0. After n++, n is 1 in both processes, and the printf statement outputs this value.

Actually you spawn a new process of the same program. It is not the closure kind of thing. You could use pipes to exchange data between parent and child.

You did indeed answer your own question in your edit.

Examine this code and everything should be clearer (see the man pages if you don't know what a certain function does):
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <time.h>
int count = 1;
int main(int argc, char *argv[]) {
    // seed the random number generator
    srand(time(NULL));
    // -1 means "not forked yet", so nothing is printed before count == 9
    int pid = -1;
    // as long as count is <= 50
    for (; count <= 50; count++) {
        // create a new process when count == 9
        if (count == 9) {
            pid = fork();
            // reseed so that parent and child draw different random numbers
            srand(time(NULL) + pid);
        }
        if (count <= 25) {
            // sleep for 300 ms
            usleep(3 * 100000);
        } else {
            // pick a random number between 1 and 5
            int r = (rand() % 5) + 1;
            // sleep for r * 100 ms
            usleep(r * 100000);
        }
        if (pid == 0) {
            printf("Child: count:%d pid:%d\n", count, pid);
        } else if (pid > 0) {
            printf("Father: count:%d pid:%d\n", count, pid);
        }
    }
    return 0;
}
happy coding ;-)

The system call forks more than the execution thread: also forked is the data space. You have two n variables at that point.
There are a few interesting things that follow from all this:
A program that fork()s must consider unwritten output buffers. They can be flushed before the fork, or cleared after the fork, or the program can _exit() instead of exit() to at least avoid automatic buffer flushing on exit.
Fork is often implemented with copy-on-write in order to avoid unnecessarily duplicating a large data memory that won't be used in the child.
Finally, an alternate call vfork() has been revived in most current Unix versions, after vanishing for a period of time following its introduction in 3.0BSD. vfork() does not pretend to duplicate the data space, and so the implementation can be even faster than a copy-on-write fork(). (Its implementation in Linux may be due less to speed reasons than because a few programs actually depend on the vfork() semantics.)

Related

Unix system calls : read/write and the buffer

I am writing a pretty simple program.
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
int main(){
    int pipefd[2];
    pid_t c;
    int value[2];
    c = fork();
    if(c<0){
        perror("in fork");
        exit(1);
    }
    if(c==0){
        printf("i am the child\n");
        int buf[2];
        buf[0]=3;
        buf[1]=0;
        write(pipefd[1], buf, 4);
        write(pipefd[1],buf+1,4);
        close(pipefd[1]);
        exit(0);
    }
    if (pipe(pipefd) == -1) { /*UPDATE */
        perror("pipe");
        exit(EXIT_FAILURE);
    }
    read(pipefd[0], value, 4);
    read(pipefd[0], value+1, 4);
    close(pipefd[0]);
    printf("%d %d\n", value[0], value[1]);
    exit(0);
}
What I intend to do is to achieve:
value[0] = buf[0];
value[1] = buf[1];
( and print those of course).
But all I get as a result is :
-1299582208 32766
i am the child
Because I have ints, I assumed that each will hold 4 bytes. And I think that for an int array each element will hold 4 bytes. But clearly I am missing something. Any help?
As I mentioned in my top comment: Where is the pipe syscall?
Without it, the write and read calls will probably fail because pipefd has random values.
So, the parent will never have value filled in correctly.
Because these [uninitialized] values are on the stack, they will have random values, which is what you're seeing.
This is UB [undefined behavior].
Different systems/compilers may manipulate the stack differently, which is why you see different [yet still random] results on different configurations.
To fix, add the following above your fork call:
pipe(pipefd);
I downloaded, built, and ran your program. Before I added the fix, I got random values. After applying the fix, I get 3 0 as the output, which is what you expected/wanted.
Note: As others have mentioned, you should check the return codes for read and write. Had you done so, they would probably have returned -1 and set an error code in errno, which would have helped you debug the issue.
A very simple fix would be to put a sleep(1) call right above your read() calls - obviously this isn't a great solution.
An important early lesson in multi process programming and communications is "race conditions". Your fork'd child is executing before the parent, it seems. I bet if you ran this 20 times, you might get X number of times where it does what you want!
You cannot guarantee the order of execution. So a sleep(1) will suffice until you learn more advanced techniques on resource locking (mutexes, semaphores).

Convert a process based program into a thread based version?

I currently have this program which spawns an arbitrary number of child processes and I'm interested in having it implement threads instead of processes. I'm having trouble understanding how to convert from what I have, here is the code which uses processes:
#include<stdio.h>
#include<stdlib.h>
#include<unistd.h>
void childprocess(int num);
int main(int argc, char **argv) {
    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s num-procs\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    int counter;
    pid_t pid = getpid();
    int x = atoi(argv[1]);
    printf("Parent Process, my PID is %d\n", pid);
    for(counter = 1; counter <= x; counter++){
        if(!fork()){
            printf("Child %d is born, my PID is %d\n", counter, getpid());
            childprocess(counter);
            printf("Child %d dies\n", counter);
            exit(0);
        }
    }
}
void childprocess(int num){
    srand(getpid());
    int max = rand() % 100;
    for(int i = 0; i < max; i++){
        printf("Child %d executes iteration: %d\n", num, i);
    }
}
Is it as simple as changing a few lines to make it use threads instead? I understand the theory behind using threads but not how to actually write it in C given what I have here
The general case depends on whether the threads are wholly independent of each other. If you go multi-threaded, you must ensure that any shared resource is accessed with appropriate protection.
The code shown has a shared resource in the use of srand() and rand(). You will get different results in a multi-threaded process compared with those from multiple separate processes. Does this matter? If not, you may be able to let the threads run, but the C standard says the srand() and rand() functions don't have to be thread-safe.
If it matters — and it probably does — you need to use a different PRNG (pseudo-random number generator) which allows you to have independent sequences (for example, POSIX nrand48() — there may well be better alternatives, especially if you use modern C++).
The use of getpid() won't work well in a multi-threaded process either. Each thread will get the same value (because the threads are all part of the same process), so the threads won't get independent sequences of random numbers for another reason.

Understanding POSIX - fork()

I was reading about the fork function and how it creates new processes. The following program runs fine and prints here sixteen times, but, I am having trouble understanding the flow of execution:
#include <stdio.h>
#include <string.h>
#include <stdlib.h>
#include <limits.h>
#include <sys/types.h>
#include <unistd.h>
#include <pthread.h>
int main()
{
    int i;
    for (i = 0; i < 4; i++) { // line no. 12
        fork(); // line no. 13
    }
    printf("%s\n", "here");
    return 0;
}
It seems to me that there are two ways this program can be viewed as:
1st approach: fork() is called a total of four times. If I replace the loop with four calls to the fork() function, things seem to fall in place and I understand why here is printed 2 ^ 4 times.
2nd approach: fork() spawns a new process exactly from where it is invoked and each of these child processes have their own local variables. So, after line no. 13, each of these child processes see the end of the loop (}) and they go to line no. 12. Since, all these child processes have their own local variable i set to 0 (maybe i is set to some garbage value?), they all fork again. Again for these child processes their local variable i is set to 0. This should result in a fork bomb.
I am certainly missing something in my 2nd approach. Could someone please help?
Thanks.
Your 2nd approach is not right, because after fork() the child process inherits the current value of i. It is neither set to 0 every time fork() is called, nor does it hold a garbage value. So, your code can't produce a fork bomb. The fact that it's a local variable is irrelevant. fork() clones pretty much everything and the child process is identical to its parent except for certain things as noted in the POSIX manual.
I'll reduce the loop count to 2 for ease of explaining and assume all fork() calls succeed:
for (i = 0; i < 2; i++) {
    fork();
}
printf("%s\n", "here");
1) When i=0, fork() is executed and there are two processes now. Call them P1 and P2.
2) Now, both P1 and P2 continue the loop with i=0 and increment i to 1. The for loop condition is still true, so each of them forks once more, giving 4 processes in total. Call them P1a & P1b and P2a & P2b. All 4 processes now have i=1 and increment it to 2 (as they continue the loop).
3) Now, all 4 processes have the value of i as 2 and for loop condition is false in all of them and "here" will be printed 4 times (one by each process).
If it helps, you can convert the for loop to a while loop and how i gets incremented by both processes returning from each fork() might become a bit more clear:
i = 0;
while(i < 2) {
    fork();
    i++;
}
printf("%s\n", "here");
Your first approach was right.
That's a rather boring answer, so I'll give you all the technical details.
When fork is called, several things happen:
A new process is created ('the child').
The parent's address space (stack, heap, and global data) is duplicated and assigned to the child.
The stack pointer of the child is set to that of the parent.
The PID (process ID) of the child is returned to the parent.
Zero is returned to the child.
Variables declared inside a function are stored on the stack, and therefore start at the same value in both processes, but are not shared.
Variables declared outside a function (at the top level) are not on the stack, but they are duplicated in the same way, so they are not shared between child and parent either.
(Some other things are or are not duplicated; see man fork for more information.)
So, when you run your code:
    what happens                      # of processes
1.  the parent forks.                 2
2.  the parent and its child fork.    4
3.  everyone forks.                   8
4.  everyone forks.                   16
5.  everyone prints "here".           16
You end up with sixteen processes, and the word 'here' sixteen times.
Basically:
if(fork() != 0) {
    parent_stuff();
} else {
    child_stuff();
}
Using the following code, you can easily see how fork() copies the value of the variable i:
for (i = 0; i < 4; i++) {
    printf("%d %s\n", i, "here");
    fork();
}
As you would expect, each child process copies the value from its parent, so we get 1 line with i = 0, 2 lines with i = 1, 4 lines with i = 2, and 8 lines with i = 3. I think this answers your 2nd question.

In a fork() program what is executed first? Parent or child?

I don't know what is executed first in fork. For example I have this code:
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/wait.h>
int main() {
    int n = 1;
    if (fork() == 0) {
        n = n + 1;
        exit(0);
    }
    n = n + 2;
    printf("%d: %d\n", getpid(), n);
    wait(0);
    return 0;
}
What will this print on the screen?
1: 3
0: 4
or
0: 4
1: 3
It's not specified. It's up to the OS scheduler to decide which process to schedule first.
After a fork(), it is indeterminate which process—the parent
or the child—next has access to the CPU. On a multiprocessor system,
they may both simultaneously get access to a CPU.
When I run this program, it only prints one line, instead of two, because it exit()s in the fork()'s block.
So the question is, did you forget a printf() in the fork()'s block, or shouldn't the exit() be there?
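If you want one process to run before the other, a sleep() call in the other process usually works but guarantees nothing. To make the parent continue only after the child has finished, call wait() or waitpid() in the parent.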

Functionality of fork()

System information: I am running 64bit Ubuntu 10.10 on a 2 month old laptop.
Hi everyone, I've got a question about the fork() function in C. From the resources I'm using (Stevens/Rago, YoLinux, and Opengroup) it is my understanding that when you fork a process, both the parent and child continue execution from the next command. Since fork() returns 0 to the child, and the process id of the child to the parent, you can diverge their behavior with two if statements, one if(pid == 0) for the child and if(pid > 0), assuming you forked with pid = fork().
Now, I am having the weirdest thing occur. At the beginning of my main function, I am printing to stdout a couple of command line arguments that have been assigned to variables. This is the first non-assignment statement in the entire program, yet it would seem that every time I call fork later in the program, these print statements are executed.
The goal of my program is to create a "process tree" with each process having two children, down to a depth of 3, thus creating 15 processes in total, including the initial executable. Each process prints its parent's process ID and its own process ID before and after the fork.
My code is as follows and is properly commented. Command line arguments should be "ofile 3 2 -p" (I haven't gotten to implementing the -p/-c flags yet):
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/wait.h>
int main (int argc, char *argv[])
{
    if(argc != 5)//checks for correct amount of arguments
    {
        return 0;
    }
    FILE * ofile;//file to write to
    pid_t pid = 1;//holds child process id
    int depth = atoi(argv[2]);//depth of the process tree
    int arity = atoi(argv[3]);//number of children each process should have
    printf("%d%d", depth, arity);
    ofile = fopen(argv[1], "w+");//opens specified file for writing
    int a = 0;//counter for arity
    int d = 0;//counter for depth
    //makes sure depth and arity are within limits: the loop stops once a process has reached the maximum depth
    //or has forked arity times
    while(a < arity && d < depth)
    {
        fprintf(ofile, "before fork: parent's pid: %d, current pid: %d\n", getppid(), getpid());//prints parent and self id to buffer
        pid = fork(); //forks program
        if(pid == 0)//executes for child
        {
            fprintf(ofile, "after fork (child):parent's pid: %d, current pid: %d\n", getppid(), getpid());//prints parent's id and self id to buffer
            a=-1;//resets arity counter to 0 (after current iteration of loop is finished), so new process makes correct number of children
            d++;//increases depth counter for child and all of its children
        }
        if(pid > 0)//executes for parent process
        {
            waitpid(pid, NULL, 0);//waits on child to execute to print status
            fprintf(ofile, "after fork (parent):parent's pid: %d, current pid: %d\n", getppid(), getpid());//prints parent's id and self id to buffer
        }
        a++;//increments arity counter
    }
    fclose(ofile);
}
When I run gcc main.c -o ptree then ptree ofile 3 2 -p, the console is spammed with "32" repeating seemingly infinitely, and the file ofile is of seemingly proper format, but far far too large for what I think my program should be doing.
Any help would be greatly appreciated.
I am not sure why the fputs to stdout would be executed for the children, and don't have a Unix box to hand to verify/test.
However, the following jumps out:
int depth = *argv[2];//depth of the process tree
int arity = *argv[3];//number of children each process should have
You are taking the ASCII codes of the first character in argv[2] and argv[3] as your depth and arity, so your code is trying to spawn 50^51 processes instead of 2^3.
What you want is:
int depth = atoi(argv[2]);//depth of the process tree
int arity = atoi(argv[3]);//number of children each process should have
Once you fix this, bleh[0] = depth and its twin will also need correcting.
edit Although this is not a problem right now, you're cutting it pretty close with the length of some of the things you're sprintfing into obuf. Make some of the messages just a little bit longer and Kaboom! At the very least you want to use snprintf or, better yet, fprintf into the file directly.
edit I've just realised that fork, being an OS function, most probably isn't aware of internal buffering done by C I/O functions. This would explain why you get duplicates (both parent and child get a copy of buffered data on fork). Try fflush(stdout) before the loop. Also fflush(ofile) before every fork.
You have 2 errors in your code :
1)
int depth = *argv[2];//depth of the process tree
int arity = *argv[3];//number of children each process should have
With this code you are getting the first char of the strings argv[2] and argv[3].
The correct code would be:
int depth = atoi(argv[2]);
int arity = atoi(argv[3]);
2)
bleh[0] = depth;
fputs(bleh, stdout);
bleh[0] = arity;
fputs(bleh, stdout);
You can do something like bleh[0] = (char) depth; but you'll keep only the first byte of your integer, and I guess that's not what you want. If you want to print the whole integer, simply use:
printf("%d\n%d", depth, arity);
I just tried your code with those modifications and it seems to work well :)
Anhuin
You can't print out numbers using that code at the start of your function. It's probably invoking undefined behavior by passing a non-string to fputs(). You should use sprintf() (or, even better, snprintf()) to format the number into the string properly, and of course make sure the buffer is large enough to hold the string representation of the integers.
Also, you seem to be emitting text to the file, but yet it is opened in binary mode which seems wrong.
