I have used fork() to create 2 different processes operating on 2 different address spaces.
Now, in the parent process, I need the value of a variable from the child's address space, or I need the child process to be able to modify a variable in the parent's address space.
Is this possible?
No, once you've forked, each process gets its own address space and you'll have to look into either:
some form of IPC between the processes to access each other's data (such as shared memory or message queues).
some more lightweight variant of fork that allows sharing of data (including possibly threading).
Once you have two processes, sharing data needs interprocess communication: file, pipe or shared memory.
If you mean exchanging data directly between these two processes, you cannot. You can do it through system facilities such as shared memory, message passing, pipes, sockets, ...
As you have created two processes using fork(), they live in different address spaces, so they can only communicate through IPC: message passing, pipes, shared memory, etc. Otherwise one process can't access the other's data, because each process has its own process-specific data,
and similarly threads also have thread-specific data.
Related
fork() runs the same program and gives the child a copy of the parent's variables as they were at the moment of the fork. How does the OS keep both processes in memory while making sure each process accesses only its own variables?
When the kernel creates a new process, it also creates a new memory mapping. Initially all pages in the new mapping are shared with the parent process, but once a page is modified by either process, that process gets its own private copy of the page.
Useful terms to search for: Virtual memory, on demand paging, memory mapping, shared memory, copy on write.
The OS copies virtual memory space of the forking process (with possible optimizations like copy-on-write).
Fork is a technique that in general makes a separate address space for the child. The child starts with a copy of the parent's memory, but the two have different PIDs. So you can distinguish them: specifically, fork() returns 0 in the child process and a positive value (the child's PID) in the parent process, or -1 in the parent if the fork failed.
I have an application which uses mmap for IPC. Can I run this application multiple times? Will it have any side effects?
My application scenario:
My application forks off a child process whose job is to kill the parent process at random times, but it should do this in a controlled manner, for example by setting a variable in the parent process that tells the child process to kill the parent process (this is where mmap comes in). The parent process has a signal handler from which it can resume the application; then the child process kills the parent process again, and it continues like that...
Can anyone help me? Thanks in advance.
Whether running your application multiple times will have side effects or not depends on how you implement it. Please have a look at this answer. It contains a lot of helpful information. For example:
mmap is great if you have multiple processes accessing data in a read only fashion from the same file [...]
This means: if you want to use the same shared memory for multiple parent/child pairs, then you need to synchronize access to that shared memory. Please have a look at this Q&A on how to do that. Of course, you have to make sure that each parent/child pair uses its own variables in the shared memory.
Another option is to use a separate shared memory segment for each parent/child pair. You could do this, for example, by making the process ID of the parent process a part of the shared memory file name. Then, when you fork the child process, you pass the process ID (or the shared memory file name) to the child process, so that parent and child know which shared memory to use in order to communicate with each other.
Following question:
I created a shared memory segment (in my main.c), containing multiple structures, a few variables etc. Right after that, I am
-creating a pipe, and
-fork()-ing.
I am making both the child and the parent process communicate through the pipe, whose file descriptors are both stored in a global structure saved in the shared memory segment.
Now I read that for elements contained in a shared memory segment, after forking, both processes can manipulate the shared variables and structures, and that the other process sharing the memory would thereby have access to the same, manipulated data. So far, so good!
My question is not a source code issue; it is more of a theoretical point I seem to be missing, since my code is working exactly the way it should, but I don't understand why this works:
After forking, I make each process close its irrelevant (for my purposes) side of the pipe (e.g. the parent closes the reading side of the pipe, the child the writing side). However, the pipe_fd[2] is stored in the global struct in the SHM segment. So how come, if one side is closed from one process, and the other side from the other process (accessing respectively by using
close(nameOfSHMStruct->pipe_fd[0]);
and
close(nameOfSHMStruct->pipe_fd[1]);
), but both access it from the struct, that they are still able to communicate with each other? Am I missing something about the pipe() call, or is it something with the SHM, or with the fork(), or with the combination of all three? As I said already, the code actually works this way; I'm printing (as a debug message) the data exchanged between the processes, but I just don't really get the core theoretical aspect behind the way it functions...
They are able to communicate because each of them only closes its own descriptors of the pipe. I will explain in more detail:
FATHER PROCESS -----> FORK() ------>>> FATHER PROCESS
pipe() -> pipe_fd[2]     |             pipe_fd[2] (father pipe fds)
                         |
                         ----->>> CHILD PROCESS
                                  pipe_fd[2] (child pipe fds)
A fork clones the father process, including the file descriptors: the child owns a copy of the file descriptors of its father. So after the fork, each process has its own two file descriptors for the pipe.
So, considering this, you should not store the pipe file descriptors in a shared memory structure, because the same numbers refer to conceptually different descriptors in the father and in the child: each process has its own descriptor table, and close() only affects the entry of the process that calls it.
Here and here more info.
It would be helpful to see more of the code, but I'll take a guess.
The 'pipe_fd' created with the call to pipe() is copied to the child process upon fork(). Since the memory space is also copied on fork, that pointer in your shm object points to the memory address in the parent or in the child respectively. So calling close, even though on the 'pipe_fd' in the shm, is actually acting on the 'pipe_fd' in the parent or child respectively.
I guess an easier way of looking at it is: all you've placed in that shm object is a pointer, which is shared across the processes, and since the address space is copied (which includes that pipe_fd), the pointer points to the same address in the parent or child, which is their own copy of that 'pipe_fd'.
How is the implementation of threads done in a system?
I know that child processes are created using the fork() call
and that a thread is a lightweight process. How does the creation of a thread differ from that of a child process?
On Linux, threads are created using the clone() system call, which can make a new process that shares its memory space and some of the kernel control structures with its parent. These processes are called LWPs (light-weight processes) and are also known as kernel-level threads.
fork() creates a new process that initially shares memory with its parent, but the pages are copy-on-write, which means that a separate memory page is created when the content of the original one is altered. Thus parent and child cannot change each other's memory, and effectively they run as separate processes. Also, the newly forked child is a full-blown process with its own separate kernel control structures.
Each process has its own address space, i.e. the range of virtual addresses that the process can access. When a new process is forked, a duplicate copy of all the resources involved has to be made. After the forking is complete, the child and the parent have their own distinct address spaces and all the resources involved within them. Naturally, this is a performance-intensive operation.
All threads in the same process, on the other hand, share the same address space. So when a new thread is spawned, it only needs its own stack, and there is no duplication of all resources as in the case of processes. Hence spawning a thread is considerably less performance-intensive.
Of course, the two operations cannot and should not be compared, because they provide essentially different features for different requirements.
Well, it differs very much. First of all, a child process is in some way a copy of the parent program and has all variables duplicated, and you tell the child from the parent by its PID. Threads are like new programs: they run at the same time as the main program (or appear to, because the OS slices CPU time). Threads can use the program's global variables, but they don't duplicate them as processes do. So it's much cheaper to use threads than new processes.
Well you've read the important parts, now here's something behind the curtains:
In current implementations (where current means the last few decades), the process memory isn't technically copied immediately upon forking. Read-only sections are just shared between the two processes (as they can't change anyway), as well as the read-only parts of shared libraries, of course. But most importantly, everything writeable is initially also just shared. However, it is shared in a write-protected manner, and as soon as one of the processes writes to that memory (e.g. by incrementing a variable), a page fault is generated in the kernel, which only then causes the kernel to actually copy the respective page (where the modification then occurs).
This great optimization, which is called "copy on write", results in child processes usually not really consuming exactly as much (physical) memory as their parent processes. To the program developer (and user), however, it's completely transparent.
My question is somewhat conceptual: how is the parent process's data shared with a child process created by a fork() call, or with a thread created by pthread_create()?
For example, are global variables directly passed into the child process, and if so, does a modification of such a variable made by the child process affect its value in the parent process?
I appreciate partial and complete answers in advance. If I'm missing any existing resource, I'm sorry; I've done some searching on Google but couldn't find good results.
Thanks again for your time and answers.
The semantics of fork() and pthread_create() are a little different.
fork() will create a new process, where the global variables will be separate between the parent and children. Most OS implementations will use copy-on-write semantics, meaning that both the parent and child process will use the same physical memory pages for all global variables until one of the processes attempts to edit the physical memory, at which point a copy of that page is made, so that now each process gets its own copy and does not see the other process's, so that the processes are isolated.
pthread_create() on the other hand, creates a new thread within the same process. The new thread will have a separate stack space from the other running threads of the same process, however the global variables and heap space are shared between all threads of the same process. This is why you often need a mutex to coordinate access to a shared piece of memory between multiple threads of the same process.
TL;DR version: with fork(), you don't see the other guy's changes; with pthread_create() you do.
A fork creates an almost exact copy of the calling process, including memory and file descriptors. Global variables are copied along with everything else, but they are not in any way linked to the parent process. Since file descriptors are also copied, parent and child can interact via these (as long as they're set up properly, usually via pipe or socketpair).
There's a big difference between processes created by fork and threads created with pthread_create. Processes don't share global variables and should communicate through pipes, sockets, or other tools provided by the OS. A good solution is MPI, a message-passing library for inter-process communication.
Threads are quite different. A thread created with pthread_create shares all the global variables with its caller. Moreover, the caller can pass an arbitrary structure into the thread, and this structure will also be shared. This means that one should be extremely careful when programming with threads - such amounts of sharing are dangerous and error prone. The pthread API provides mutexes and conditions for robust synchronization between threads (although it still requires practice and expertise to implement correctly).