Interesting parent and child behavior while doing fork


Can someone please explain the output of the program below? Why am I getting the same value of &a for both parent and child?
They must have different physical addresses. If I assume that I am getting the virtual address, then how can they have the same virtual address? As far as I know, each physical address is uniquely bound to a virtual address.
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
int main(void) {
    int pid=fork();
    int a=10;
    if(pid==0)
    {
        a=a+5;
        printf("%d %d\n",a,&a);
    }
    else
    {
        a=a-5;
        printf("%d %d\n",a,&a);
    }
    return 0;
}

The child process inherits its virtual address space from the parent, even though the virtual addresses start referring to different physical addresses once the child writes to a page. That's called copy-on-write (CoW) semantics.
So, in the parent &a is mapped to some physical address. Fork initially just copies the mapping. Then, when either process writes to a, CoW kicks in: the physical page that holds a is copied, the writing process's mapping is updated to refer to the copy, and both processes end up with their own copy of a, at the same virtual address &a but at different physical addresses.
"each physical address is uniquely bound to virtual address"
That's not true. A physical memory address may be unmapped, or it may be mapped to multiple virtual addresses in one or more processes' address spaces.
Conversely, a single virtual address can be mapped to several physical addresses, as long as each mapping exists in a different process's virtual address space.
[Btw., you can't reliably print memory addresses with %d (that just happens to work on 32-bit x86). Use %p instead. Also, the return type of fork is pid_t, not int.]

Related

Are memory addresses portable in C?

Say we have program 1...
./program1
int main(int argc, char *argv[])
{
    int *i;
    *i = 10;
    printf("%lld", i);
    return 0;
}
Now program 2...
./program2 program1output 10
int main(int argc, char *argv[])
{
    int *t;
    t = (int*)atoll(argv[1]);
    *t = atoi(argv[2]);
    return 0;
}
Will this work? Can you share memory addresses between different programs?

This behavior is not defined by the C standard. On any general-purpose multi-user operating system, each process is given its own virtual address space. All of the memory assigned to a process is separate from the memory assigned to other processes except for certain shared memory:
Read-only data may be shared between processes, especially the instructions and constant data of two processes running the same executable and the instructions and constant data of shared libraries. That data may have the same address in different processes or different addresses (depending on various factors, including whether the code is position-independent and whether address space layout randomization is in use).
Some operating systems also map system-wide shared data into processes by default.
Memory may be shared between processes by explicit request of those processes to map shared memory segments. Those segments may or may not appear at the same virtual address in the different processes. (A request to map shared memory may request a certain address, in which case different processes could arrange to use the same address, or it could let the mapping software choose the address, in which case different processes cannot rely on receiving the same address assignment.)
In a special-purpose operating system, different processes could share one address space.
Supplement
This is not correct code:
int *i;
*i = 10;
The declaration int *i; defines i to be a pointer but does not assign it a value. Then using *i is improper because it attempts to refer to where i points, but i has not been assigned to point to anything.
To define an int and make its address visible in output, you could define int i; and then print &i.
This is not the proper way to print an address:
printf("%lld", i);
To print an address, cast it to void * and format it with %p. The result of the formatting is implementation-defined:
printf("%p", (void *) &i);
This is not a good way to reconstruct an address:
int *t;
t = (int*)atoll(argv[1]);
As with printf, the type should be void *, and there are problems attempting the conversion with atoll. The C standard does not guarantee it will work; the format produced by printing with %p might not be a normal integer format. Instead, use the %p specifier with sscanf:
void *temp;
if (1 != sscanf(argv[1], "%p", &temp))
    exit(EXIT_FAILURE);
int *t = temp;
When the address comes from another process, the behavior of the sscanf conversion is not defined by the C standard.

In principle, an application operates on its own private memory. There are ways of sharing memory among different processes, but this requires a special mechanism to overcome the above-mentioned principle (memory-mapped files, for example). Have a short look at, for example, this article on sharing memory.
In your case, program one will have ended and its memory is not available any more; and the way you access it is definitely not one of the "special mechanisms" necessary to access shared memory:
Though an integer value may be converted to a pointer value, accessing this pointer is only valid if the integer value has originally been converted from a pointer to a valid object. This is not the case in your example, since the integral value calculated in t = (int*)atoll(argv[1]); never pointed to a valid object in the current program.

In general, memory addresses are tied to processes because each process may have its own memory space. So, the addresses are virtual addresses rather than physical addresses, which means they are references to a location in the process's memory space rather than references to a location on a chip.
(Not all environments have virtual memory. For example, an embedded system might not.)
If you have two programs running in the same process, a pointer can be passed between them. For example, a main program can pass a pointer to a dynamically linked library it loads.

Fork and local variable [duplicate]

I am confused about something. I have read that when a child is created by a parent process, the child gets a copy of its parent's address space. What does it mean by copy?
If I use the code below, it prints the same value for the pointer 'a', which points to the heap, in both the child and the parent. So what is happening here?
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int main ()
{
    pid_t pid;
    int *a = (int *)malloc(sizeof(int));
    printf ("heap pointer %p\n", (void *)a);
    pid = fork();
    if (pid < 0) {
        fprintf (stderr, "Fork Failed");
        exit(-1);
    }
    else if (pid == 0) {
        printf ("Child\n");
        printf ("in child heap pointer %p\n", (void *)a);
    }
    else {
        wait (NULL);
        printf ("Child Complete\n");
        printf ("in parent heap pointer %p\n", (void *)a);
        exit(0);
    }
}

The child gets an exact copy of the parent's address space, laid out the same way as the parent's. I have to point out that each process has its own virtual address space, so each can have the same data at the same address, yet in different address spaces. Also, Linux uses copy-on-write when creating child processes. This means that the parent and child share the same physical pages until one of them writes to a page, at which point that page is physically copied for the writer. This eliminates unneeded copies when exec'ing a new process. Since you're just going to overwrite the memory with a new executable, why bother copying it?

Yes, you will get the same virtual address, but remember that each process has its own virtual address space.
Until a copy-on-write operation happens, everything is shared.
So when the child calls strcpy or does any other write, copy-on-write takes place: the page behind pointer a is copied and the child's mapping is updated to point to the copy. The virtual address stays the same in the child, and the parent's mapping is not affected.

A copy means exactly that, a bit-identical copy of the virtual address space. For all intents and purposes, the two copies are indistinguishable, until you start writing to one (the changes are not visible in the other copy).

With fork() the child process receives a new address space where all the contents of the parent address space are copied (actually, modern kernels use copy-on-write).
This means that if you modify a or the value pointed by it in a process, the other process still sees the old value.

You get two heaps. Both have the same virtual memory addresses, but those addresses are translated to different parts of physical memory.

C pass void pointer using shared memory

I need to pass a void handle to another application. To replicate the scenario, I have created a small program that uses shared memory and tries to pass the void pointer to another application and print the value. I can get the void pointer's address in the other application, but when I try to dereference the pointer, the second application crashes.
Here is the sample application, wire.c:
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
int main() {
    key_t key=1235;
    int shm_id;
    void *shm;
    void *vPtr;
    shm_id = shmget(key,10,IPC_CREAT | 0666);
    shm = shmat(shm_id,NULL,NULL);
    sprintf(shm,"%d",&vPtr);
    printf("Address is %p, Value is %d \n", (void *)&vPtr, * (int *)vPtr);
    return;
}
Here is read.c:
#include <sys/ipc.h>
#include <sys/shm.h>
#include <stdio.h>
#include <stdlib.h>
int main() {
    key_t key=1235;
    int shm_id;
    void *shm;
    void *p = (void *)malloc(sizeof(void));
    shm_id = shmget(key,10,NULL);
    shm = shmat(shm_id,NULL,NULL);
    if(shm == NULL)
    {
        printf("error");
    }
    sscanf(shm,"%d",&p);
    printf("Address is %p %d\n",(void *)p);
    return 0;
}
When I try to dereference p, it crashes. I need to pass the void pointer's address and value to the second application.
I don't want to share the value itself between the applications; I know that works using shared memory.
By default, void *ptr holds some garbage value (for example, address = 0xbfff7f, value = 23456). Can you please tell me how I can pass the void pointer's address to another application so that the second application can use that address to print the value found in the first application (i.e. 23456)?
Apart from shared memory, is there any alternative available?
Thanks.

This is probably because the pointer is still virtual; it's pointing at memory that is shared, but there is no guarantee that two processes sharing the same memory map it to the same virtual address. The physical addresses are of course the same (it's the same actual memory, after all), but processes never deal with physical addresses.
You can try to request a specific address by using mmap() (in Linux) to do the sharing. You can also verify the virtual address theory by simply printing the addresses of the shared memory block in both processes.
Also, don't cast the return value of malloc() in C and don't allocate memory when you're going to be using the shared memory. That part just makes no sense.

If you want to call a function within a process from another process, it is NOT IPC.
IPC is sharing of data between multiple threads/processes.
Consider adding the shared function into a DLL/shared-object to share the code across processes. If not, then you could add RPC support to your executables as shown here.
Why does passing function pointers between two processes NOT work?
A function pointer is a virtual address referring to the location where the function's code is currently loaded. Whenever a function pointer (virtual address) is used in the process, the MMU translates it to a physical address using the page tables that the kernel maintains for that process. This succeeds because the mapping is present in the page tables of the current process.
However, when a context switch occurs and another process is running, the page tables containing that process's mappings are loaded and active. These will NOT contain the mapping for the function pointer from the previous process. Hence, attempting to use a function pointer from another process will fail.
Why do the page tables NOT contain the mapping of a function in another process?
If this was done then there would be no advantages with having multiple processes. All the code that could ever be run would have to be loaded into physical memory simultaneously. Also the entire system would then effectively be a single process.
Practically speaking whenever a context-switch happens and a different process is executing, the code/data segments of the earlier process can even be swapped out of physical memory. Hence even maintaining a function pointer and passing it to the new process is useless as it cannot guarantee that the function code will be held in memory even after newer process is loaded in memory and starts executing.

This is illegal:
void *p = (void *) malloc(sizeof(void));
void is an incomplete type, so sizeof(void) is invalid. You can't do this for the same reason that you can't declare a void variable:
void i; /* WRONG */
What happens when you dereference p is undefined. If you just want the pointer value, you don't need to call malloc in read.c. That's the whole concept of shared memory: if it's shared, why would you allocate space in the reader program?

If different processes have their own memory space, how can the address of a local variable be the same? [duplicate]

So far, my understanding is that after fork() is called, the local variables are duplicated in the parent and child processes, and the copies are separate. But when I print the address of a local variable in each process, the addresses turn out to be the same:
int main(void){
    int local = 10;
    pid_t childPid;
    childPid = fork();
    if(childPid == 0 ){
        printf("[Child] the local value address is %p\n",&local);
    }else if(childPid < 0){
        printf("there is something wrong");
    }else{
        printf("[Parent] the local value address is %p\n",&local);
    }
    return (EXIT_SUCCESS);
}
The output is:
[Parent] the local value address is 0x7fff5277baa8
[Child] the local value address is 0x7fff5277baa8
Any idea about this?

Being in a different "space" means that the "same" index point in different spaces does not refer to the same thing. Think of "spaces" as pieces of paper. "The 4th character of the 3rd line" on page 1 does not refer to the same thing as on page 2.

Because the memory space a process gets is virtual. That means the actual physical addresses on the memory chips can be different. In the case you mentioned, the local object lives at the same virtual address in both processes, but each process has its own private copy of it, backed by a different physical page once either process writes to that page.
That being said, there are circumstances when two non-local object addresses from different processes map to the same physical address. Most commonly, that could be a shared library or shared memory.
If you do not compile your shared library as position-independent code, you really could end up with the same virtual address mapping to the same physical address when two concurrent processes use this shared library.

virtual address assignment in C and linux

In the program given below, the virtual address is the same in both processes. I understand the reason for global variables but cannot understand it for local variables.
How are virtual addresses assigned to local variables before running?
int main()
{
    int a;
    if (fork() == 0)
    {
        a = a + 5;
        printf("%d,%d\n", a, &a);
    }
    else
    {
        a = a - 5;
        printf("%d, %d\n", a, &a);
    }
}

Virtual addresses are... virtual. That means the same virtual address in two different processes (like a parent process and its child process) can point to two different physical addresses.

While compiling, the compiler decides to use either the stack or a register for each local variable. In this case, because the address of a is taken, it uses the stack.
Where the stack sits in the (virtual) address space is decided when the program is loaded, and fork duplicates that layout.
So for both processes the stack starts at the same (virtual) address. And since the flow of this specific program is rather deterministic, the stack frames look exactly the same for both processes, resulting in the same offset in the stack for 'a'.

Whatever the address of a was before the fork, it must surely be the same after the fork, so it necessarily is the same in the two processes, since their addresses for a are both equal to the same thing. In most implementations, the address of a is derived by adding an offset (determined by the compiler) to the content of the stack pointer. The content of the stack pointer is duplicated by fork.