I'm working with semaphores in C , especifically to control the access to a shared memory zone in linux. but there is one thing that I can't understand.
I am using a mutex to control the access to a specific zone because i have 2 processes that must read/write from that zone. the thing is, when we use the fork() to create a new child process, the whole program is "copied" to another program as if they were two seperate programs right ? so, when i do V(mutex) in one process, how does the other one know he can't access ?
I know its a noob question but nobody could explain this to me until now.
After the fork neither process is going to know about the memory actions of the other because they are separate copies. You have to put your shared variables in shared memory, including mutexes and semaphores. Then all the processes are operating on the same resource.
For unrelated (i.e. non-forked) process there are usually system facilities (e.g. named semaphores) that each process can open based on a path name or similar method that each can use to find and use the resource.
You synchronisation objects must be placed in process shared memory, for example created with mmap (... MAP_ANONYMOUS ...). In addition, they must have the PTHREAD_PROCESS_SHARED attribute set, for example, by using pthread_mutexattr_setpshared.
See here:
Semaphores and Mutex for Thread and Process Synchronization
So mutex in practice is often used in threads, which makes sharing trivial. For processes however, mutex could be stored as a part of the shared mem.
For semaphores however, linux has built in library, which identifies global semaphores by keys. See below.
http://beej.us/guide/bgipc/output/html/multipage/semaphores.html
Or you can use other IPC to sync. Signals, for example.
Hope this helps.
Related
I'm currently working on a school projet (C language) which aims to make processes who can write into shared memory, one by one.
I can't use multi-threading neither fork.
I can have only one executable, which must be run as many times as we want.
For example if i do : ./main john then ./main toto, John would be able to write in shared memory while Toto has to wait that John unlock the semaphore.
But I just can't find any documentation who explain clearly how it works without doing threads, forks etc.
May someone can help me ? Thanks
Assuming that Linux is used:
Because multi-threading is not allowed, you may ignore pthread-semaphores.
Maybe system-V IPC semaphores are what you are looking for:
https://man7.org/linux/man-pages/man7/sysvipc.7.html
Those can be used between processes.
The system-V IPC also provides functions for shared memory handling.
Posix-semaphores and shm API are also another alternative:
https://linux.die.net/man/7/sem_overview
https://man7.org/linux/man-pages/man7/shm_overview.7.html
https://opensource.com/article/19/4/interprocess-communication-linux-storage has examples for using POSIX semaphores to synchronize shared memory access (in the second part of that article, past the file lock example).
It is not unlike synchronizing threads, except that it uses operating system facilities for inter process communication.
POSIX semaphores are available on POSIX systems, notably many *nixes.
I'm taking over some C code running in Linux (Centos) with extensive use of semaphores.
The way the code is written :
./Program1
This program launches a bunch of processes which makes use of mutexes and semaphores.
./Program2
This program also launches a bunch of processes which makes use of mutexes and semaphores.
I've realised that Program1 and Program2, they make use of semaphores with the same names.
In Linux C programming, can different programs use the same semaphores?
My guess is no, but the same naming is confusing the hell out of me. They are using the same source code to launch and handle the semaphores.
The semaphores are invoked using the following commands:
semget
semctl
semop
I've read that these are called processes semaphores.. if Program1 creates SEMAPHORE1, can Program2 access SEMAPHORE1?
Appreciate any help here, thanks!
Assuming you mean named semaphores (or even unnamed semaphores stored in shared memory, both which can be created with sem_open), they generally are shared amongst processes.
Semaphores using semget and related calls use an ID key rather than a name but their usage patterns are similar.
Semaphores are one of the IPC (inter-process communication) methods.
You can create a one-process-only semaphore by using an unnamed variant in non-shared memory and this will only be accessible to threads of the given process but, in my experience, that's not a common use case. The semget family of calls can also give you process-private semaphores.
Mutexes, on the other hand, tend to be more used within a single process for inter-thread communication but there is even a variant of them that can work inter-process.
You create a pthread_mutexattr (attribute) which allows sharing of mutexes and then use that attribute when initialising the mutex you want to share. Obviously, the mutex needs to be in shared memory so that multiple processes can get at it.
I am trying to create a shared memory which will be used by multiple processes. these processes communicate with each other using MPI calls (MPI_Send, MPI_Recv).
I need a mechanism to control the access of this shared memory I added a question yesterday to see if MPI provides any facility to do that. Shared memory access control mechanism for processes created by MPI , but it seems that there is no such provision by MPI.
So I have to choose between named semaphore or flock.
For named semaphore if any of the process dies abruptly without calling sem_cloe(), than that semaphore always remains and can be seen by ll /dev/shm/. This results in deadlock sometimes(if I run the same code again!), for this reason I am currently thinking of using flock.
Just wanted to confirm if flock is best suited for this type of operation ?
Are there any disadvantages of using flock?
Is there anything else apart from named semaphore and flock that can be used here ?
I am working on C under linux.
You can also use a POSIX mutex in shared memory; you just have to set the "pshared" attribute on it first. See pthread_mutexattr_setpshared. This is arguably the most direct way to do what you want.
That said, you can also call sem_unlink on your named semaphore while you are still using it. This will remove it from the file system, but the underlying semaphore object will continue to exist until the last process calls sem_close on it (which happens automatically if the process exits or crashes).
I can think of two minor disadvantages to using flock. First, it is not POSIX, so it makes your code somewhat less portable, although I believe most Unixes implement it in practice. Second, it is implemented as a system call, so it will be slower. Both pthread_mutex_lock and sem_wait use the "futex" mechanism on Linux, which only does a system call when you actually have to wait. This is only a concern if you are grabbing and releasing the lock a lot.
I am new to threads and processes.
I have code that works fine right now with forking the code into multiple processes. However each process needs to add to a global variable, but from what I read, each time the process forks, it takes a copy of the global, and adds them independently. Is there a way to join them, like you can with threads?
Different processes can communicate and exchange data via shared memory.
On linux, you can look:
man shm_overview
for attaching a memory segment on several processes
and
man sem_overview
for the semaphore library for controlling parallel access.
You should define a struct with two fields, one for your global and one for a semaphore. Then, before any forking occurs, create some shared memory in the parent process big enough to hold this struct and initialize one there. In the children, map in the shared memory so they can access the global. All processes, parent and children, should obey the rules of the semaphore when accessing the global.
To avoid unnecessary blocking which can hurt performance, try not to hold the semaphore too long. When reading the global, make a quick copy of it in a process and use that, rather than holding the semaphore for the entire time you are using its value. Likewise, when changing the global, prepare your changes ahead of time (before you grab the semaphore) and, once you have the semaphore, copy them in all at once. Sometimes your work depends on reading and writing the global without it changing in between being read and written. In this case, some blocking may be inevitable.
It is not clear what platform you are on, but all major PC and server platforms (Windows, Linux/Unix/Mac OS) have support for shared memory and semaphores. The APIs may be different, but the functionality you need is there.
my question is somewhat conceptual, how is parent process' data shared with child process created by a fork() call or with a thread created by pthread_create()
for example, are global variables directly passed into child process and if so, does modification on that variable made by child process effect value of it in parent process?
i appreciate partial and complete answers in advance, if i'm missing any existing resource, i'm sorry, i've done some search on google but couldn't find good results
thanks again for your time and answers
The semantics of fork() and pthread_create() are a little different.
fork() will create a new process, where the global variables will be separate between the parent and children. Most OS implementations will use copy-on-write semantics, meaning that both the parent and child process will use the same physical memory pages for all global variables until one of the processes attempts to edit the physical memory, at which point a copy of that page is made, so that now each process gets its own copy and does not see the other process's, so that the processes are isolated.
pthread_create() on the other hand, creates a new thread within the same process. The new thread will have a separate stack space from the other running threads of the same process, however the global variables and heap space are shared between all threads of the same process. This is why you often need a mutex to coordinate access to a shared piece of memory between multiple threads of the same process.
TL;DR version: with fork(), you don't see the other guy's changes; with pthread_create() you do.
A fork creates an almost exact copy of the calling process, including memory and file descriptors. Global variables are copied along with everything else, but they are not in any way linked to the parent process. Since file descriptors are also copied, parent and child can interact via these (as long as they're setup properly, usually via pipe or socketpair).
There's a big difference between processes created by fork and between threads created with pthread_create. Processes don't share global variables and should communicate through pipes, sockets, or other tools provided by the OS. A good solution is MPI - which is a message-passing library for inter-process communication.
Threads are quite different. A thread created with pthread_create shares all the global variables with its caller. Moreover, the caller can pass an arbitrary structure into the thread, and this structure will also be shared. This means that one should be extremely careful when programming with threads - such amounts of sharing are dangerous and error prone. The pthread API provides mutexes and conditions for robust synchronization between threads (although it still requires practice and expertise to implement correctly).