I read a book that gives the following example:
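#include <stdio.h>
#include <stdint.h>
#include <pthread.h>
#include <unistd.h>
#include <sys/wait.h>

int value = 0;

/* pthread_create() expects a function of type void *(*)(void *) */
void *thread_func(void *arg) {
    int id = (int)(intptr_t)arg;
    int temp;
    temp = value + id;              /* unsynchronized read of the global */
    printf("Thread%d value: %d", id, temp);
    value = temp;                   /* unsynchronized write of the global */
    return NULL;
}

int main() {
    int fork_id, status, i;
    pthread_t tids[3];
    fork_id = fork();
    if (fork_id == 0) {
        for (i = 1; i <= 3; i++)
            pthread_create(&tids[i-1], NULL, thread_func, (void *)(intptr_t)i);
        for (i = 0; i <= 2; i++)
            pthread_join(tids[i], NULL);
        printf("Second process value: %d", value);
    }
    else {
        wait(&status);
        printf("First process value: %d", value);
    }
    return 0;
}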
I don't understand two main things:
According to the book, the only value that the line printf("First process value: %d", value) can print is 0.
But why? wait(&status) waits until the child process terminates. In our case, it will terminate only after all the joins are done, that is, when value is 6.
Second, in the line printf("Second process value: %d", value);, the value can be anything from 1 to 6. This is also strange, because we have the join calls.
The answers to your questions:
The value will be 0 in the parent process because when the fork occurs the address space of the parent is duplicated (along with the variable value) in the child process. Hence, although value is changed in the child, this change is not reflected in the parent as they are different variables.
Since there is no synchronization involved, there is no way to know in which order the variable value is changed by the three child threads. Specifically, each thread has a local temp variable with a different value, which is then copied in the global value variable, but there is no way to know in which order the threads will overwrite value with temp here: value = temp;. Hence, its value can vary between executions.
Because when you fork, you get a brand new process with its own memory, which means that changes to a variable in one process won't show up in another. Threads, on the other hand, share their memory, which means that changes to variables in a threaded program show up in all threads.
The value is incremented by the child process, so the parent process will show its value as 0.
if (fork_id == 0) {
    ......
    ......
}
is executed by the child, which spawns the threads.
The child process has its own copy of memory, so changing value in the child does not change it in the parent.
The threads, by contrast, all have access to the same global variables.
In this C program, data is not shared between the parent and child processes: the child has its own data and the parent has its own, yet the pointer shows the same address in both processes. How is this done in the background? Does fork generate copies of the same data? If so, why do we see the same pointer address in both processes? Or is it because the statically allocated data is copied for each process, so that the data is independent in each one? I want to know how this works.
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>   // waitpid()
#include <unistd.h>

int main()
{
    // Fixed-size local array
    int X[] = {1,2,3,4,5};
    int i, status;

    // The fork call
    int pid = fork();

    if (pid == 0) // Child process
    {
        // Child process modifies Array
        for (i = 0; i < 5; i++)
            X[i] = 5 - i;

        // Child prints Array
        printf("Child Array:\t");
        for (i = 0; i < 5; i++)
            printf("%d\t", X[i]);

        printf("\nArray ptr = %p\n", (void *)X);
    }
    else // Parent process
    {
        // Wait for the child to terminate and let
        // it modify and print the array
        waitpid(-1, &status, 0);

        // Parent prints Array
        printf("Parent Array:\t");
        for (i = 0; i < 5; i++)
            printf("%d\t", X[i]);

        printf("\nArray ptr = %p\n", (void *)X);
    }

    return 0;
}
Here is the output of the program.
Child Array: 5 4 3 2 1
Array ptr = 0x7fff06c9f670
Parent Array: 1 2 3 4 5
Array ptr = 0x7fff06c9f670
When the child process modifies the array, shouldn't it also modify the parent's data? What is going on in the background?
When you fork a new process the new child process is a copy of the parent process. That's why pointers etc. are equal. Due to the wonders of virtual memory two processes can have the same memory map, but still be using different memory.
fork creates an exact copy of the parent process's memory image (see the man page for the exceptions). In practice this is implemented with copy-on-write (COW): as long as the child only reads data, parent and child share the same physical copy, but when one of them writes, a new copy of the affected page is made, and from then on parent and child each have their own copy of that data.
fork() creates a copy of the calling process, including all the memory allocated to it.
Each process has its own address space and the values of pointers are within context of that address space. So printing the address of some variable in the original process will give the same output as printing that address in the spawned process.
However, as far as the operating system is concerned, the addresses are not equal. The operating system takes care of ensuring each process has the illusion of its own memory.
There are means of sharing memory between processes (i.e. what one process writes to the shared memory, the other one sees). However, that is not what happens by default, and still happens with the help of the host operating system.
So I have this program I'm trying to understand. It's from an old exam, but I just can't get a grip on it. How do I know the order of the forks and how the variables are changed?
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static int g = -1;

int main(int argc, char *argv[])
{
    int v = 0;
    pid_t p;

    while (v++ < 6)
        if ((p = fork()) < 0) {
            perror("fork error");
            return 1;
        } else if (p == 0) {            /* child */
            v++;
            g--;
        } else {                        /* parent */
            g++;
            v+=3;
            if (waitpid(p, NULL, 0) != p) {
                perror("waitpid error");
                return 1;
            }
        }

    printf("mypid = %d parentpid = %d p = %d v = %d g = %d\n",
           getpid(), getppid(), p, v, g);
    return 0;
}
Each call to fork generates its own process with its own variables, which are copied at the time of the call (logically; optimization may change when the actual copy happens, but not in a way that'll change the outcome).
So when you enter the loop, v gets incremented to 1, then you fork. At this point the parent process has g=-1, v=1, p=(the child's PID) and the new child has g=-1, v=1, p=0. The parent then drops into the else case, incrementing g to 0 and v to 4 and then waiting for the child to complete, whereas the child drops into the "else if (p == 0)", increments v to 2, decrements g to -2, and goes around the loop again.
From there, you've hopefully now got enough information to follow the logic as the next two child processes get forked off, finish off the loop, and print their respective results. When they do, the first child will also come to the end of its waitpid with v=6, drop out of the loop, and print its results.
At this point, the parent will unblock, go around the loop one more time (forking off one more child along the way), and (once that child has completed) drop out of the loop.
The call to fork() both starts a new process and continues the old one. If there is some kind of error, it returns an error value. All errors and only errors are negative numbers. This is what the first if block checks.
In the new process, fork() returns 0. The branch that increments v and decrements g is therefore called only in the child process, not the parent.
In the original process, fork() returns the process identifier (PID) of the child process, which is a positive integer. (This is later passed to waitpid().) Therefore, the branch that increments g and adds 3 to v runs only in the parent process, not the child.
Each process has its own copy of v and g. (That’s the main difference between a process and a thread: threads share memory.) On a modern operating system, what happens is that the child process gets a copy of the parent’s memory map, but both maps refer to the same pages of physical memory until one process or the other writes to them. When that happens, a copy is made of that page of memory and both processes now get their own, different copies.
The order in which parent and child run after fork() is not specified by POSIX and should not be relied upon, but some Linux kernels have let the child process continue before the parent does. Running the child first can make a significant difference to performance. Most programs that call fork() immediately have the child process call exec() to start a new program. That means it isn’t going to need its copy of the parent’s memory at all. (There is a newer, simpler way to start a different program in a new process now, posix_spawn().) The parent process, on the other hand, almost always keeps running and modifying its memory. Therefore, giving the child the chance to declare that it’s going to discard the memory it inherited means that the parent doesn’t need to worry about leaving an unmodified copy of any memory pages for its children, and the kernel does not have to go through the rigmarole of copy-on-write.
In practice, though, any decent compiler will keep both local variables in registers, so this issue will not arise.
On the next iteration of the loop, which only happens after the child process terminates, a new child process is spawned using the updated values of the parent’s variables. Each child process also continues to run the loop with the same values of v and g that it inherited from its parent.
I'm having a tricky time understanding how to alternate control between two processes using semaphores. Here's a contrived example of the process handling code.
int pid = fork();
if (pid) {
    int counter = 0;
    while (true) {
        counter += 1;
        printf("P%d = %d", pid, counter);
    }
} else {
    int counter = 0;
    while (true) {
        counter += 1;
        printf("P%d = %d", pid, counter);
    }
}
I was expecting the above code to run in parallel, but it seems like control flow continues instantly for the forked process and only later resumes for the parent process.
This is fundamentally breaking my existing code that uses a semaphore to control which process can act.
int id = get_semaphore(1);
int pid = fork();
if (pid) {
    int counter = 0;
    while (true) {
        sem_wait(id);
        counter += 1;
        printf("P%d = %d\n", pid, counter);
        sem_signal(id);
    }
} else {
    int counter = 0;
    while (true) {
        sem_wait(id);
        counter += 1;
        printf("P%d = %d\n", pid, counter);
        sem_signal(id);
    }
}
The sem_wait helper just subtracts 1 from the semaphore value and blocks until the result is > 0 (uses semop under the hood).
The sem_signal helper just adds 1 to the semaphore value (uses semop under the hood).
I'd like the code to alternate between the two processes, using sem_wait to block until the other process releases the resources with sem_signal. The desired output would be:
P1 = 0
P0 = 0
P1 = 1
P0 = 1
...
However, because of the initial execution delay between the processes, the child process takes the available semaphore resource, uses it to print a number, then restores it and loops — at which point the resource is available again, so it continues without ever waiting for the other process.
What's the best way to prevent a process from using resources if it released them itself?
it seems like control flow continues instantly for the forked process and only later resumes for the parent process
That is because stream IO buffers the output on stdout until either
the buffer is full
fflush() is called on stdout
a newline (\n) is encountered (when stdout is line-buffered, as it is by default when connected to a terminal)
In your program, each process will fill a buffer before sending its contents to stdout giving the appearance of one process running for a long time, then the other. Terminate the format strings of your printf statements with \n and you'll see behaviour in your first program more like you expect.
I am not sure why your semaphore thing isn't working - I'm not very knowledgeable about system V semaphores but it seems like a red flag to me that you are getting the semaphore after you have forked. With the more common POSIX semaphores, the semaphore has to be in memory that both processes can see otherwise it's two semaphores.
Anyway, assuming your get_semaphore() function does the right thing to share the semaphore, there is still a problem because there is no guarantee that, when one process signals the semaphore, the other one will start soon enough for it to grab it again before the first process loops round and grabs it itself.
You need two semaphores, one for the parent and one for the child. Before the print each process should wait on its own semaphore. After the print, each process should signal the other semaphore. Also, one semaphore should be initialised with a count of 1 and the other should be initialised with a count of 0.
Semaphores have two general use cases. One is mutual exclusion and the second is synchronization. What's been done in your code is mutual exclusion. What you actually want is synchronization (alternation) between the parent and child processes.
Let me explain a bit:
Mutual exclusion means that at any time only one process can access a "critical section", which is a piece of code that you want only one process/thread to execute at a time. Critical sections generally contain code that manipulates a shared resource.
Coming to your code, since you have used only a single semaphore, there is no guarantee as to the "order" in which each process is allowed to enter the critical section.
For example, sem_wait(id) in your code can be executed by either process, and nothing forces the two processes to alternate.
For process synchronization (more specifically alternation), you need to use two semaphores, one for the parent and another for the child.
Sample code:
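int parent_sem = get_semaphore(0);  // create both semaphores before fork()
int child_sem = get_semaphore(1);   // so that parent and child share them
int pid = fork();
if (pid) {
    int counter = 0;
    while (true) {
        sem_wait(parent_sem);       // parent waits for its own turn
        counter += 1;
        printf("P%d = %d\n", pid, counter);
        sem_signal(child_sem);      // then hands the turn to the child
    }
} else {
    int counter = 0;
    while (true) {
        sem_wait(child_sem);        // child waits for its own turn
        counter += 1;
        printf("P%d = %d\n", pid, counter);
        sem_signal(parent_sem);     // then hands the turn to the parent
    }
}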
You need to initialize one semaphore (in my case the child's) to 1 and the other to zero. That way only one of the two processes gets to start while the other goes into a wait. Once the child is done printing, it signals the parent. Now the child's semaphore value is zero, so it blocks in sem_wait(child_sem) while the parent, which was signaled by the child, executes. Next time, the parent signals the child and the child executes. This continues in alternating sequence and is a classic synchronization problem.
I want to create copies of a process using fork() in C.
I can't figure out how to pass arguments to the copies of my process.
For example, I want to pass an integer to the process copies.
Or what should I do if I have a loop in which I call fork() and want to pass a unique value to each process (e.g. 0...N)?
for (int i = 0; i < 4; ++i) {
    fork();
    // pass a unique value to new processes.
}
The nice part about fork() is that each process you spawn automatically gets a copy of everything the parent has. For example, let's say we want each of two new processes to have an int myvar with a value different from the parent's:
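#include <stdio.h>
#include <unistd.h>

int main()
{
    int myvar = 0;
    if (fork())
        myvar = 1;      // original process
    else if (fork())
        myvar = 2;      // first child
    else
        myvar = 3;      // child of the first child
    printf("I'm %d: myvar is %d\n", getpid(), myvar);
    return 0;
}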
So doing this allows each process to have a "copy" of myvar with its own value.
I'm 8517: myvar is 1
I'm 8518: myvar is 2
I'm 8521: myvar is 3
If you didn't change the value, then each fork'd process would have the same value.
Local and global variables are inherently preserved across a fork(), so there's no need to "pass arguments". If you're calling a function in the forked process, you can do something like:
pid_t pid = fork();
if (pid == 0) {
    funcToCallInChild(argument);
    exit(0);
}
I'm late to respond, but here is how I do it:
const char *progname = "./yourProgName";
const char *argument1 = "arg1";
const char *argument2 = "arg2";
if (fork() == 0)
{
    // We are the child process, so replace the process with a new executable.
    execl(progname, progname, argument1, argument2, (char *)NULL);
    // execl() only returns on failure; don't fall through into the parent's code.
    perror("execl");
    _exit(127);
}
// The parent process continues from here.
First, you fork() to create a new process. It starts out with a copy of the old process's memory. fork() returns in both the parent and the child. If fork() returns zero, you are the child process. The child process then uses execl() to replace the process image with one loaded from a new file.
Notice that progname is given twice to execl(). The first is what execl() will actually try to run, the second is argv[0]. You must provide both or the argument count will be off by one. Progname must contain all the required path information to find the desired executable image.
I give two arguments in this example, but you can pass as many as you want. The list must be terminated with a null pointer, and because execl() is variadic the NULL should be cast to (char *) as shown.
This approach gives you a fully independent process with arguments and a unique pid. It can continue running long after the parent process terminates, or it may terminate before the parent.
You can use clone() (which is actually used by fork() itself). It lets you pass an arg to your entry function.
http://linux.die.net/man/2/clone
See the exec() family of functions.
EDIT: If you're trying to initialize copies of the same program as the base process, just continue using variables as suggested by duskwuff.
In the following code:
int i = 1;
fork();
i=i*2;
fork();
i=i*2;
fork();
i=i*2;
printf("%d\n", i);
Why is 8,8,8,8,8,8,8,8 printed, and not 1,2,2,4,4,8,8,8? I thought fork() duplicates the process, so that each copy would print the value i had before its fork. What am I missing?
Given the code shown, you should be seeing eight lots of 6 (you wrote i = i + 2; instead of i = i * 2; for the last computation).
Since each process follows the same code path, each process will produce the same result.
To get the result you expected, you'd have to track whether each fork() yielded the parent or child process:
int i = 1;
if (fork())
{
    i=i*2;
    if (fork())
    {
        i=i*2;
        if (fork())
            i=i*2; // + --> *
    }
}
printf("%d\n", i);
I'm assuming there are no problems with the fork() operation. It is also interesting to note that you could invert any or all of the conditions and end up with the same result.
Because after a fork, both processes continue executing the code downwards from that point. So each process runs through every remaining i = i * 2 as it spawns more children, which gives the output you got rather than the one you expected (i.e. a process doesn't jump to the end of the block once forked).
Info on fork: http://www.csl.mtu.edu/cs4411/www/NOTES/process/fork/create.html
Each new process gets a copy of the stack of the parent, so immediately after calling fork(), both parent and child have the same value for i -- but they don't have the same stack, just a copy... so changing i's value in one process has no effect on the other.
If you want two parallel pieces of code to share the same memory, either use threads (and memory that's in the heap, not on the stack), or use an explicit shared memory region.