fork() system call and memory space of the process - c

I quote "when a process creates a new process using fork() call, Only the shared memory segments are shared between the parent process and the newly forked child process. Copies of the stack and the heap are made for the newly created process" from "operating system concepts" solutions by Silberschatz.
But when I tried this program out
#include <stdio.h>
#include <sys/types.h>
#define MAX_COUNT 200
void ChildProcess(void); /* child process prototype */
void ParentProcess(void); /* parent process prototype */
void main(void)
{
pid_t pid;
char * x=(char *)malloc(10);
pid = fork();
if (pid == 0)
ChildProcess();
else
ParentProcess();
printf("the address is %p\n",x);
}
void ChildProcess(void)
{
printf(" *** Child process ***\n");
}
void ParentProcess(void)
{
printf("*** Parent*****\n");
}
the result is like:
*** Parent*****
the address is 0x1370010
*** Child process ***
the address is 0x1370010
both parent and child printing the same address which is in heap.
can someone explain me the contradiction here. please clearly state what are all the things shared by the parent and child in memory space.

Quoting myself from another thread.
When a fork() system call is issued, a copy of all the pages
corresponding to the parent process is created, loaded into a separate
memory location by the OS for the child process. But this is not
needed in certain cases. Consider the case when a child executes an
"exec" system call or exits very soon after the fork(). When the
child is needed just to execute a command for the parent process,
there is no need for copying the parent process' pages, since exec
replaces the address space of the process which invoked it with the
command to be executed.
In such cases, a technique called copy-on-write (COW) is used. With
this technique, when a fork occurs, the parent process's pages are not
copied for the child process. Instead, the pages are shared between
the child and the parent process. Whenever a process (parent or child)
modifies a page, a separate copy of that particular page alone is made
for that process (parent or child) which performed the modification.
This process will then use the newly copied page rather than the
shared one in all future references. The other process (the one which
did not modify the shared page) continues to use the original copy of
the page (which is now no longer shared). This technique is called
copy-on-write since the page is copied when some process writes to it.
Also, to understand why these programs appear to be using the same space of memory (which is not the case), I would like to quote a part of the book "Operating Systems: Principles and Practice".
Most modern processors introduce a level of indirection, called
virtual addresses. With virtual addresses, every process's memory
starts at the "same" place, e.g., zero.
Each process thinks that it has the entire machine to itself, although
obviously that is not the case in reality.
So these virtual addresses are translations of physical addresses and doesn't represent the same physical memory space, to leave a more practical example we can do a test, if we compile and run multiple times a program that displays the direction of a static variable, such as this program.
#include <stdio.h>
int main() {
static int a = 0;
printf("%p\n", &a);
getchar();
return 0;
}
It would be impossible to obtain the same memory address in two
different programs if we deal with the physical memory directly.
And the results obtained from running the program several times are...

Yes, both processes are using the same address for this variable, but these addresses are used by different processes, and therefore aren't in the same virtual address space.
This means that the addresses are the same, but they aren't pointing to the same physical memory. You should read more about virtual memory to understand this.

The address is the same, but the address space is not. Each process has its own address space, so parent's 0x1370010 is not the same as child's 0x1370010.

You're probably running your program on an operating system with virtual memory. After the fork() call, the parent and child have separate address spaces, so the address 0x1370010 is not pointing to the same place. If one process wrote to *x, the other process would not see the change. (In fact those may be the same page of memory, or even the same block in a swap-file, until it's changed, but the OS makes sure that the page is copied as soon as either the parent or the child writes to it, so as far as the program can tell it's dealing with its own copy.)

When the kernel fork()s the process, the copied memory information inherits the same address information since the heap is effectively copied as-is. If addresses were different, how would you update pointers inside of custom structs? The kernel knows nothing about that information so those pointers would then be invalidated. Therefore, the physical address may change (and in fact often will change even during the lifetime of your executable even without fork()ing, but the logical address remains the same.

Yes address in both the case is same. But if you assign different value for x in child process and parent process and then also prints the value of x along with address of x, You will get your answer.
#include <stdio.h>
#include <sys/types.h>
#include <stdlib.h>
#include <unistd.h>
#define MAX_COUNT 200
void ChildProcess(void); /* child process prototype */
void ParentProcess(void); /* parent process prototype */
void main(void)
{
pid_t pid;
int * x = (int *)malloc(10);
pid = fork();
if (pid == 0) {
*x = 100;
ChildProcess();
}
else {
*x = 200;
ParentProcess();
}
printf("the address is %p and value is %d\n", x, *x);
}
void ChildProcess(void)
{
printf(" *** Child process ***\n");
}
void ParentProcess(void)
{
printf("*** Parent*****\n");
}
Output of this will be:
*** Parent*****
the address is 0xf70260 and value is 200
*** Child process ***
the address is 0xf70260 and value is 100
Now, You can see that value is different but address is same. So The address space for both the process is different. These addresses are not actual address but logical address so these could be same for different processes.

Related

Forking a child process that does not use it´s own memory copy

What i want to achieve is the following:
Spawn a new childprocess (pchild) that DOES NOT use its own, but the memoryblock from its parentprocess (pparent).
Why i want to achieve this behavior:
Think of multiple tests where the first one leads to segvault.
Normally your process would stop here due to segfault, all the other tests wouldn't be executed anymore. Therefore i want to encapsule each test in its own process.
Main Problem:
Once i spawn a process it gets its own memory copy (well, i'm
aware of the fact that this is not totally true for all OS, due to 'copy on write' technique). Think of e.g. testing tree functionallity where i have a node structure that has two pointers to other nodes. Once i retrieve a node by e.g. using a pipe or some shared memory block those pointers point to an address which is part of the memory block of the pchild and therefore i get a segvault when i try from pparent to get the childnode by following the pointers inside the node structure.
A Thread isn´t usefull, due to the main behavior some OS have once a segfault happens. (Killing child and father due to 'unclear state' ).
What i have so far (only fork testing part):
int main (void) {
// forking
pid_t pid = fork();
if (pid < 0 ) {
// somewhat went wrong
printf("An error occured!");
} else if (pid != 0) { // inside parent
// closing writing end, as not needed
if(wait(NULL)!=0){
printf("Segfault in Child\n");
} else {
printf("Everyone is done!\n");
}
} else {
printf("Child forked");
char *s = (char *)0xDEADBEEF;
*s = 'a';
printf("this probally is never executed due to segfault\n");
}
return 0;
}
Now my idea is to try to let the pchild access the memory segment of pparent only.
I'd welcome any ideas on how to do so.
Greetings,
Lars
Your question isn't very clear, but I think you have an XY problem.
My understanding is that you want to run a series of tests, where each test sees the results of previous tests that succeeded, but not tests that crashed/failed. If this is the case, one approach would be:
fork before each test
Execute the test in the child.
If the test succeeded, have the parent exit or simply wait while the child forks again and executes the next test in its child. If the test failed, have the parent fork again to do the next test.
Another approach might be to keep your data structures only in shared memory allocated by mmap with MAP_SHARED|MAP_ANON, but then if one test left them in an inconsistent state, all future test results would be junk.
Your idea of sharing all memory between processes is technically possible, but it will immediately blow up because they clobber each other's state.
Every process has it's own virtual memory adress space so there is no way to give a RAM adress to another process (well you can send a number, but in another virtual memory address space it will point to different 'real' place).
All you need is another thread. Threads of same process share one virtual memory space.
try to use vfork(); instead of fork();systemcall.
that shares the Resources of parent and child process.

Forks and Pointers in C

Can someone help me understand how the system handles variables that are set before a process makes a fork() call. Below is a small test program I wrote to try understanding what is going on behind the scenes.
I understand that the current state of a process is "cloned", variables included, at the time of the forking. My thought was, that if I malloc'd a 2D array before calling fork, I would need to free the array both in the parent and the child processes.
As you can see from the results below the sample code, the two values act as if they are totally separate from each other, yet they have the exact same address space. I expected that my final result for tmp would be -4 no matter which process completed first.
I am newer to C and Linux, so could someone explain how this is possible? Perhaps the variable tmp becomes a pointer to a pointer which is distinct in each process? Thanks so much.
#include <sys/types.h>
#include <unistd.h>
#include <stdlib.h>
int main()
{
int tmp = 1;
pid_t forkReturn= fork();
if(!forkReturn) {
/*child*/
tmp=tmp+5;
printf("Value for child %d\n",tmp);
printf("Address for child %p\n",&tmp);
}
else if(forkReturn > 0){
/*parent*/
tmp=tmp-10;
printf("Value for parent %d\n",tmp);
printf("Address for parent %p\n",&tmp);
}
else {
/*Error calling fork*/
printf("Error calling fork);
}
return 0;
}
RESULTS of standard out:
Value for child 6
Address for child 0xbfb478d8
Value for parent -9
Address for parent 0xbfb478d8
It did indeed copy the entire address space, and changing memory in the child process does not affect the parent. The key to understanding this is to remember that a pointer can only point to something in your own process, and the copy happens at a lower level.
However, you should not call malloc() or free() at all in the child of fork. This can deadlock (another thread was in malloc() when you called fork()). The only functions safe to call in the child are the ones also listed as safe for signal handlers. I used to be able to claim this was true only if you wrote multithreaded code; however Apple was kind enough to spawn a background thread in the standard library, so the deadlock is real all the time. The child of fork should never be allowed to drop out of the if block. Call _exit to make sure it doesn't.

how memory area is shared between processes [closed]

Closed. This question needs to be more focused. It is not currently accepting answers.
Want to improve this question? Update the question so it focuses on one problem only by editing this post.
Closed 5 years ago.
Improve this question
How memory is shared in following scenarios?
Between Parent and child Processes
Between two irrelevant Processes
In which part of the physical memory does the shared memory (or) any other IPC used for communicating between processes exists?
Here it the program with explanation of Memory management between Parent and Child Process..
/*
SHARING MEMORY BETWEEN PROCESSES
In this example, we show how two processes can share a common
portion of the memory. Recall that when a process forks, the
new child process has an identical copy of the variables of
the parent process. After fork the parent and child can update
their own copies of the variables in their own way, since they
dont actually share the variable. Here we show how they can
share memory, so that when one updates it, the other can see
the change.
*/
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h> /* This file is necessary for using shared
memory constructs
*/
main()
{
int shmid. status;
int *a, *b;
int i;
/*
The operating system keeps track of the set of shared memory
segments. In order to acquire shared memory, we must first
request the shared memory from the OS using the shmget()
system call. The second parameter specifies the number of
bytes of memory requested. shmget() returns a shared memory
identifier (SHMID) which is an integer. Refer to the online
man pages for details on the other two parameters of shmget()
*/
shmid = shmget(IPC_PRIVATE, 2*sizeof(int), 0777|IPC_CREAT);
/* We request an array of two integers */
/*
After forking, the parent and child must "attach" the shared
memory to its local data segment. This is done by the shmat()
system call. shmat() takes the SHMID of the shared memory
segment as input parameter and returns the address at which
the segment has been attached. Thus shmat() returns a char
pointer.
*/
if (fork() == 0) {
/* Child Process */
/* shmat() returns a char pointer which is typecast here
to int and the address is stored in the int pointer b. */
b = (int *) shmat(shmid, 0, 0);
for( i=0; i< 10; i++) {
sleep(1);
printf("\t\t\t Child reads: %d,%d\n",b[0],b[1]);
}
/* each process should "detach" itself from the
shared memory after it is used */
shmdt(b);
}
else {
/* Parent Process */
/* shmat() returns a char pointer which is typecast here
to int and the address is stored in the int pointer a.
Thus the memory locations a[0] and a[1] of the parent
are the same as the memory locations b[0] and b[1] of
the parent, since the memory is shared.
*/
a = (int *) shmat(shmid, 0, 0);
a[0] = 0; a[1] = 1;
for( i=0; i< 10; i++) {
sleep(1);
a[0] = a[0] + a[1];
a[1] = a[0] + a[1];
printf("Parent writes: %d,%d\n",a[0],a[1]);
}
wait(&status);
/* each process should "detach" itself from the
shared memory after it is used */
shmdt(a);
/* Child has exited, so parent process should delete
the cretaed shared memory. Unlike attach and detach,
which is to be done for each process separately,
deleting the shared memory has to be done by only
one process after making sure that noone else
will be using it
*/
shmctl(shmid, IPC_RMID, 0);
}
}
/*
POINTS TO NOTE:
In this case we find that the child reads all the values written
by the parent. Also the child does not print the same values
again.
1. Modify the sleep in the child process to sleep(2). What
happens now?
2. Restore the sleep in the child process to sleep(1) and modify
the sleep in the parent process to sleep(2). What happens now?
Thus we see that when the writer is faster than the reader, then
the reader may miss some of the values written into the shared
memory. Similarly, when the reader is faster than the writer, then
the reader may read the same values more than once. Perfect
i /*
SHARING MEMORY BETWEEN PROCESSES
In this example, we show how two processes can share a common
portion of the memory. Recall that when a process forks, the
new child process has an identical copy of the variables of
the parent process. After fork the parent and child can update
their own copies of the variables in their own way, since they
dont actually share the variable. Here we show how they can
share memory, so that when one updates it, the other can see
the change.
*/
#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h> /* This file is necessary for using shared
memory constructs
*/
main()
{
int shmid. status;
int *a, *b;
int i;
/*
The operating system keeps track of the set of shared memory
segments. In order to acquire shared memory, we must first
request the shared memory from the OS using the shmget()
system call. The second parameter specifies the number of
bytes of memory requested. shmget() returns a shared memory
identifier (SHMID) which is an integer. Refer to the online
man pages for details on the other two parameters of shmget()
*/
shmid = shmget(IPC_PRIVATE, 2*sizeof(int), 0777|IPC_CREAT);
/* We request an array of two integers */
/*
After forking, the parent and child must "attach" the shared
memory to its local data segment. This is done by the shmat()
system call. shmat() takes the SHMID of the shared memory
segment as input parameter and returns the address at which
the segment has been attached. Thus shmat() returns a char
pointer.
*/
if (fork() == 0) {
/* Child Process */
/* shmat() returns a char pointer which is typecast here
to int and the address is stored in the int pointer b. */
b = (int *) shmat(shmid, 0, 0);
for( i=0; i< 10; i++) {
sleep(1);
printf("\t\t\t Child reads: %d,%d\n",b[0],b[1]);
}
/* each process should "detach" itself from the
shared memory after it is used */
shmdt(b);
}
else {
/* Parent Process */
/* shmat() returns a char pointer which is typecast here
to int and the address is stored in the int pointer a.
Thus the memory locations a[0] and a[1] of the parent
are the same as the memory locations b[0] and b[1] of
the parent, since the memory is shared.
*/
a = (int *) shmat(shmid, 0, 0);
a[0] = 0; a[1] = 1;
for( i=0; i< 10; i++) {
sleep(1);
a[0] = a[0] + a[1];
a[1] = a[0] + a[1];
printf("Parent writes: %d,%d\n",a[0],a[1]);
}
wait(&status);
/* each process should "detach" itself from the
shared memory after it is used */
shmdt(a);
/* Child has exited, so parent process should delete
the cretaed shared memory. Unlike attach and detach,
which is to be done for each process separately,
deleting the shared memory has to be done by only
one process after making sure that noone else
will be using it
*/
shmctl(shmid, IPC_RMID, 0);
}
}
/*
POINTS TO NOTE:
In this case we find that the child reads all the values written
by the parent. Also the child does not print the same values
again.
1. Modify the sleep in the child process to sleep(2). What
happens now?
2. Restore the sleep in the child process to sleep(1) and modify
the sleep in the parent process to sleep(2). What happens now?
Thus we see that when the writer is faster than the reader, then
the reader may miss some of the values written into the shared
memory. Similarly, when the reader is faster than the writer, then
the reader may read the same values more than once. Perfect
inter-process communication requires synchronization between the
reader and the writer. You can use semaphores to do this.
Further note that "sleep" is not a synchronization construct.
We use "sleep" to model some amount of computation which may
exist in the process in a real world application.
Also, we have called the different shared memory related
functions such as shmget, shmat, shmdt, and shmctl, assuming
that they always succeed and never fail. This is done to
keep this proram simple. In practice, you should always check for
the return values from this function and exit if there is
an error.
*/nter-process communication requires synchronization between the
reader and the writer. You can use semaphores to do this.
Further note that "sleep" is not a synchronization construct.
We use "sleep" to model some amount of computation which may
exist in the process in a real world application.
Also, we have called the different shared memory related
functions such as shmget, shmat, shmdt, and shmctl, assuming
that they always succeed and never fail. This is done to
keep this proram simple. In practice, you should always check for
the return values from this function and exit if there is
an error.
*/

Is there any formula to know how fork() makes near-perfect copy of the current process?

#include <stdio.h>
main()
{
int i, n=1;
for(i=0;i<n;i++) {
fork();
printf("Hello!");
}
}
I am confused if I put n=1, it prints Hello 2 times.
If n=2, it prints Hello 8 times
If n=3, it prints Hello 24 times..
and so on..
There's is no single "formula" how it's done as different operating systems do it in different ways. But what fork() does is it makes a copy of the process. Rough steps that are usually involved in that:
Stop current process.
Make a new process, and initialize or copy related internal structures.
Copy process memory.
Copy resources, like open file descriptors, etc.
Copy CPU/FPU registers.
Resume both processes.
you dont only fork the 'main' prozess, you also fork the children!
first itteration:
m -> c1
//then
m -> c2 c1-> c1.1
m -> c3 c1-> c1.1 c2->c2.1 c1.1 -> c1.1.1
for i = ....
write it in down this way:
main
fork
child(1)
fork child(2)
fork child(1.1)
fork child(3)
fork child(1.2)
fork child(2.1)
and so on ...
fork() always makes a perfect copy of the current process -- the only difference is the process ID).
Indeed in most modern operating systems, immediately after fork() the two processes share exactly the same chunk of memory. The OS makes a new copy only when the new process writes to that memory (in a mode of operation called "copy-on-write") -- if neither process ever changes the memory, then they can carry on sharing that chunk of memory, saving overall system memory.
But all of that is hidden from you as a programmer. Conceptually you can think of it as a perfect copy.
That copy includes all the values of variables at the moment the fork() happened. So if the fork() happens inside a for() loop, with i==2, then the new process will also be mid-way through a for() loop, with i==2.
However, the processes do not share i. When one process changes the value of i after the fork(), the other process's i is not affected.
To understand your program's results, modify the program so that you know which process is printing each line, and which iteration it is on.
#include <stdio.h>
# include <sys/types.h>
main() {
int i , n=4;
for(i=0;i<n;i++); {
int pid = getpid();
if(fork() == 0) {
printf("New process forked from %d with i=%d and pid=%d", pid, i, getpid());
}
printf("Hello number %d from pid %d\n", i, getpid());
}
}
Since the timing will vary, you'll get output in different orders, but you should be able to make sense of where all the "Hellos" are coming from, and why there are as many as there are.
When you fork() you create a new child process that has different virtual memory but initially shows in the same physical memory with the father. At that point both processes can only read data from the memory. If they want to write or change anything in that common memory, then they use different physical memory and have write properties to that memory. Therefore when you fork() your child has initially the same i value as its father, but that value changes separately for every child.
Yes, there is a formula:
fork() makes exactly one near-identical copy of the process. It returns twice, however. Once in the original process (the parent, returning the child pid) and once in the child (returning 0).

Difference between the address space of parent process and its child process in Linux?

I am confused about something. I have read that when a child is created by a parent process, the child gets a copy of its parent's address space. What does it mean by copy?
If I use the code below, then it prints the same value for variable 'a' which is on the heap in both tthe child and parent. So what is happening here?
int main ()
{
pid_t pid;
int *a = (int *)malloc(4);
printf ("heap pointer %p\n", a);
pid = fork();
if (pid < 0) {
fprintf (stderr, "Fork Failed");
exit(-1);
}
else if (pid == 0) {
printf ("Child\n");
printf ("in child heap pointer %p\n", a);
}
else {
wait (NULL);
printf ("Child Complete\n");
printf ("in parent heap pointer %p\n", a);
exit(0);
}
}
The child gets an exact copy of the parents address space, which in many cases is likely to be laid out in the same format as the parent address space. I have to point out that each one will have it's own virtual address space for its memory, such that each could have the same data at the same address, yet in different address spaces. Also, Linux uses copy on write when creating child processes. This means that the parent and child will share the parent address space until one of them does a write, at which point the memory will be physically copied to the child. This eliminates unneeded copies when execing a new process. Since you're just going to overwrite the memory with a new executable, why bother copying it?
Yes, you will get the same virtual address, but remember each one has it's own process virtual address spaces.
Till there is a Copy-On-Write operation done everything is shared.
So when you try to strcpy or any write operation the Copy-On-Write takes place which means the child process virtual address of pointer a will be updated for the child process, but not so for the parent process.
A copy means exactly that, a bit-identical copy of the virtual address space. For all intents and purposes, the two copies are indistinguishable, until you start writing to one (the changes are not visible in the other copy).
With fork() the child process receives a new address space where all the contents of the parent address space are copied (actually, modern kernels use copy-on-write).
This means that if you modify a or the value pointed by it in a process, the other process still sees the old value.
You get two heaps, and since the memory addresses are translated to different parts of physical memory, both of them have the same virtual memory address.

Resources