Shared memory access control mechanism for processes created by MPI - c

I have a shared memory region used by multiple processes, and these processes are created using MPI.
Now I need a mechanism to control access to this shared memory.
I know that named semaphores and flock can be used for this, but I wanted to know whether MPI provides any special locking mechanism for shared memory usage.
I am working in C under Linux.

MPI actually does provide support for shared memory now (as of version 3.0). You might try looking at the one-sided communication chapter of the standard (http://www.mpi-forum.org/docs/mpi-3.0/mpi30-report.pdf), starting with MPI_WIN_ALLOCATE_SHARED (section 11.2.3). To use this, you'll have to make sure your implementation supports it; I know that recent versions of both MPICH and Open MPI do.
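For illustration, here is a minimal sketch (my addition, not from the answer above) of using MPI_Win_allocate_shared together with an exclusive window lock as the locking mechanism the question asks about. It assumes an MPI 3.0 implementation and glosses over memory-model subtleties such as MPI_Win_sync:

#include <mpi.h>
#include <stdio.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    /* Group together the ranks that can actually share memory (same node). */
    MPI_Comm node;
    MPI_Comm_split_type(MPI_COMM_WORLD, MPI_COMM_TYPE_SHARED, 0,
                        MPI_INFO_NULL, &node);

    int rank;
    MPI_Comm_rank(node, &rank);

    /* Each rank contributes one int; rank 0's segment serves as a shared counter. */
    MPI_Win win;
    int *local;
    MPI_Win_allocate_shared(sizeof(int), sizeof(int), MPI_INFO_NULL,
                            node, &local, &win);

    /* Obtain a direct load/store pointer to rank 0's segment. */
    int *counter;
    MPI_Aint size;
    int disp;
    MPI_Win_shared_query(win, 0, &size, &disp, &counter);

    if (rank == 0)
        *counter = 0;
    MPI_Barrier(node);

    /* The exclusive window lock provides the mutual exclusion the question
     * asks about, with no named semaphores or flock needed. */
    MPI_Win_lock(MPI_LOCK_EXCLUSIVE, 0, 0, win);
    (*counter)++;
    MPI_Win_unlock(0, win);

    MPI_Barrier(node);
    if (rank == 0)
        printf("counter = %d (should equal the node-local size)\n", *counter);

    MPI_Win_free(&win);
    MPI_Comm_free(&node);
    MPI_Finalize();
    return 0;
}

Compile and run with something like mpicc shm_counter.c && mpiexec -n 4 ./a.out (the file name is made up).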

No, MPI itself doesn't provide any support for shared memory. In fact, MPI was not designed to: a program written with MPI is supposed to scale to a large number of processors, and a large number of processors never share memory.
However, it may happen, and often does, that small groups of processors (within that large set) do have shared memory. To exploit that shared memory, OpenMP is typically used.
OpenMP is very simple. I strongly suggest you learn it.

Related

Unified Shared Memory Systems

I am working with some older real-time control system code written using the RTAI extensions to Linux. I see four different mechanisms in use to create and share memory across process boundaries.
1) RTAI shared memory (rt_shm_alloc & rt_shm_free):
This uses an unsigned long globally unique value as a key for accessing the shared memory. Behind the scenes (from user space at least) it uses an ioctl on a character device to allocate the memory, then mmap to make it available.
2) System V (ftok, shmget, shmat, shmctl, etc):
This uses ftok to generate a key that is used, along with an index value, to find and map a block of memory. I've not tried to see how this is actually implemented, but I'm assuming that somewhere behind the curtain it is using mmap.
3) Posix shared memory (shm_open, mmap, shm_unlink, etc):
This takes a string (with some restrictions on content) and provides a file handle which can be used to mmap the linked block of memory. This seems to be supported using a virtual filesystem.
4) direct use of mmap and character driver ioctl calls
Certain kernel modules provide an interface which directly supports using mmap to create and share a block of memory.
All of these mechanisms seem to use mmap, either explicitly or implicitly, to alter the virtual memory subsystem in order to set up and manage the shared memory (the System V variant is sketched below).
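As a concrete point of reference (my addition, not part of the original question), here is a minimal sketch of mechanism 2 on its own; the ftok path and project id are made up. The mapping happens inside shmat, which is why no explicit mmap appears in user code:

#include <stdio.h>
#include <string.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    /* Derive a key from an existing file path plus a project id. */
    key_t key = ftok("/tmp", 'A');
    if (key == (key_t)-1) { perror("ftok"); return 1; }

    /* Create (or find) a 4 KiB segment identified by that key. */
    int id = shmget(key, 4096, IPC_CREAT | 0600);
    if (id < 0) { perror("shmget"); return 1; }

    /* Attach the segment into this process's address space. */
    char *p = shmat(id, NULL, 0);
    if (p == (void *)-1) { perror("shmat"); return 1; }

    strcpy(p, "hello from System V shm");
    printf("%s\n", p);

    shmdt(p);                    /* detach */
    shmctl(id, IPC_RMID, NULL);  /* mark the segment for removal */
    return 0;
}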
The question is: if a block of memory is shared using one of these systems, is there any way to set up an alias that will access the same memory through the other systems?
A use case:
I have two I/O subsystems. The first is implemented as a Linux kernel driver and exports its current I/O state in a chunk of shared memory created using the RTAI shared memory mechanism. The second is based on the EtherLab EtherCAT master, which uses a custom kernel module and directly uses ioctl and mmap to create a shared memory block.
I have 40 or so other systems which need access to certain I/O fields, but don't really need to know which I/O subsystem the data came from.
What I want is a way to open and access the different types of shared memory in a single coherent way, isolating the underlying implementation details from the users. Does such a mechanism exist?
I have altered the ethercat stack to use the RTAI shared memory mechanism to solve this instance, but that's just a temporary solution (Read: hack).

How to portably share a variable between threads/processes?

I have a server that spawns a new process or thread for every incoming request, and I need to read and write a variable defined in this server from both the threads and the processes. Since the server program needs to work on both UNIX and Windows, I need to share the variable in a portable way, but how do I do it?
I need to use the standard C library or the native syscalls, so please don’t suggest third party libraries.
Shared memory is operating-system specific. On Linux, consider reading shm_overview(7) and, since with shared memory you always need some way to synchronize, sem_overview(7).
Of course you will need to find the similar (but probably not equivalent) Windows function calls.
Notice that threads are not the same as processes. Threads by definition share a common single address space; with threads, the main issue is then mostly synchronization, often using mutexes (e.g. pthread_mutex_lock etc.). On Linux, read a pthreads tutorial and pthreads(7).
Recall that several libraries (glib, QtCore, Poco, ...) provide useful abstractions above operating-system-specific functionality, but you seem to want to avoid them.
Finally, I am not at all sure that sharing a variable as you ask is the best way to achieve your goals. I would definitely consider some message-passing approach with an event loop: pipe(7) & poll(2), perhaps with a textual protocol à la JSON.
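To make the threads case concrete, here is a minimal sketch (my addition, not part of the answer) of guarding a shared variable with a pthread mutex. For the cross-process case you would instead place the variable in shm_open()/mmap() memory and use a process-shared mutex or a semaphore; on Windows you would reach for the equivalent Win32 primitives (e.g. CreateMutex):

#include <pthread.h>
#include <stdio.h>

static long counter = 0;
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        pthread_mutex_lock(&lock);   /* serialize access to the variable */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

int main(void)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    printf("counter = %ld\n", counter);  /* always 400000 with the mutex */
    return 0;
}

Compile with gcc -pthread.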

OpenMP, multithreading or multiprocessing (C)?

I'm having some trouble understanding how OpenMP works. I know that it executes tasks in parallel and that it's a multiprocessing tool, but what does that mean?
It uses "threads", but at the same time it's a multiprocessing tool? Aren't the two mutually exclusive, where you use one method but not the other? Can you help explain which one it is?
To clarify, I've only worked with multithreading using POSIX pthreads, and that's totally different from multiprocessing with fork and exec and shared memory.
Thank you.
OpenMP was developed as an abstraction layer for parallel architectures utilizing multithreading and shared memory, so you don't have to write commonly used parallel code from scratch. Note that, in general, the threads still have access to shared memory (memory allocated by the master thread). It takes advantage of multiple processors, but uses threads.
MPI is its counterpart for distributed systems. This might be more of the traditional "multiprocessing" version you are thinking of, since all the "ranks" operate independently of each other without shared memory and must communicate through concepts such as scatter/map/reduce.
OpenMP is used for multithreading. I go into some depth on how to use OpenMP and its pitfalls here:
http://austingwalters.com/the-cache-and-multithreading/
It works very similarly to POSIX pthreads, except with no fuss. It was developed so it could be incorporated into already-written code, which is then recompiled with an appropriate compiler (g++ works; clang/llvm did not at the time of writing). If you follow the link above you'll note that a thread enables multiprocessing, since it can be executed on any of the available processors.
Meaning that if you have a single core, threads could still execute faster, because your processor shares its time among all running programs. If you have multiple processors and multiple threads, the threads can run on different processors simultaneously and therefore execute even faster.
Further, OpenMP allows shared (and private) memory, depending on the implementation, and I believe you can combine OpenMP with POSIX threading as well, though you will gain no advantage if the pthreads were already used correctly.
Below is a link to an excellent guide to OpenMP:
http://bisqwit.iki.fi/story/howto/openmp/
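To make the model concrete, here is a minimal sketch (my addition) of an OpenMP parallel loop in C: the array is shared among the threads, the loop index is private, and the reduction clause takes care of the synchronization. Compile with gcc -fopenmp:

#include <omp.h>
#include <stdio.h>

int main(void)
{
    enum { N = 1000000 };
    static double a[N];
    double sum = 0.0;

    for (int i = 0; i < N; i++)
        a[i] = 1.0;

    /* The iterations are split across threads; each thread accumulates a
     * private partial sum, which is combined into "sum" at the end. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += a[i];

    printf("sum = %.0f (up to %d threads)\n", sum, omp_get_max_threads());
    return 0;
}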

OpenMP and Shared Memory definition

According to the OpenMP web site, OpenMP is "the de-facto standard for parallel programming on shared memory systems". According to Wikipedia, "Using memory for communication inside a single program, for example among its multiple threads, is generally not referred to as shared memory."
What is wrong here? Is it the term "generally"?
OpenMP really just creates threads that "share memory" through a single common virtual address space, doesn't it?
Moreover, I guess OpenMP is able to run on NUMA architectures, where all the memory can be addressed by all the processors but with increased access time when threads that share data are assigned to cores attached to different memories. Is this true?
I'm writing a full-fledged answer here to try to address further questions asked as comments to lucas1024's answer.
On the meaning of "shared memory"
On the one hand, you have the software-oriented (i.e. OS-oriented) meaning of shared memory: a way to enable different processes to access the same chunk of memory (i.e. to relax the usual OS constraint that a given process should not be able to tamper with other processes' memory). As stated in the Wikipedia page, the POSIX shared memory API is one implementation of such a facility. In this sense, it does not make much sense to speak of threads (an OS might well provide shared memory without even providing threads).
On the other hand, you have the hardware-oriented meaning of "shared memory": a hardware configuration in which all CPUs have access to the same RAM.
On the meaning of "thread"
Now we have to disambiguate another term: "thread". An OS might provide a way to have multiple concurrent execution flows within a process. POSIX threads are an implementation of such a feature.
However, the OpenMP specification has its own definitions:
thread: An execution entity with a stack and associated static memory, called threadprivate memory.
OpenMP thread: A thread that is managed by the OpenMP runtime system.
Such definitions fit nicely with the definition of e.g. POSIX threads, and most OpenMP implementations indeed use POSIX threads to create OpenMP threads. But you might imagine OpenMP implementations on top of OSes which do not provide POSIX threads or equivalent features. Such implementations would have to manage execution flows internally, which is difficult but entirely doable. Alternatively, they might map OpenMP threads onto OS processes and use some kind of "shared memory" feature (in the OS sense) to let them share memory (though I don't know of any OpenMP implementation that does this).
In the end, the only constraint you have for an OpenMP implementation is that all CPUs should have a way to share access to the same central memory. That is to say OpenMP programs should run on "shared memory" systems in the hardware sense. However, OpenMP threads do not necessarily have to be POSIX threads of the same OS process.
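As a small illustration (my addition) of the "threadprivate memory" from the definitions quoted above, each OpenMP thread below gets its own copy of the variable:

#include <omp.h>
#include <stdio.h>

int tls_counter = 0;
#pragma omp threadprivate(tls_counter)

int main(void)
{
    #pragma omp parallel
    {
        /* No race here: every thread writes its own copy. */
        tls_counter = omp_get_thread_num();
        printf("thread %d sees tls_counter = %d\n",
               omp_get_thread_num(), tls_counter);
    }
    return 0;
}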
A "shared memory system" is simply a system where multiple cores or CPUs are accessing a single pool of memory through a local bus. So the OpenMP site is correct.
Communicating between threads in a program is not done using "shared memory" - instead the term typically refers to communication between processes on the same machine through memory. So the Wikipedia entry is not in contradiction and it, in fact, points out the difference in terminology between hardware and software.

What are the disadvantages of Linux's message queues?

I am working on a message queue used for communication among processes on embedded Linux. I am wondering why I shouldn't use the message queues provided by Linux, namely:
msgctl, msgget, msgrcv, and msgsnd,
instead of creating shared memory and synchronizing with a semaphore.
What are the disadvantages of using this set of functions directly in a commercial embedded product?
The functions msgctl(), msgget(), msgrcv(), and msgsnd() are the 'System V IPC' message queue functions. They'll work for you, but they're fairly heavy-weight. They are standardized by POSIX.
POSIX also provides a more modern set of functions, mq_close(), mq_getattr(), mq_notify(), mq_open(), mq_receive(), mq_send(), mq_setattr(), and mq_unlink() which might be better for you (such an embarrassment of riches).
However, you will need to check which, if either, is installed on your target platforms by default. Especially in an embedded system, it could be that you have to configure them, or even get them installed because they aren't there by default (and the same might be true of shared memory and semaphores).
The primary advantage of either set of message facilities is that they are pre-debugged (probably) and therefore have concurrency issues already resolved - whereas if you're going to do it for yourself with shared memory and semaphores, you've got a lot of work to do to get to the same level of functionality.
So, (re)use when you can. If it is an option, use one of the two message queue systems rather than reinvent your own. If you eventually find that there is a performance bottleneck or something similar, then you can investigate writing your own alternatives, but until then — reuse!
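For reference, here is a minimal sketch (my addition) of the POSIX API recommended above, sending and receiving within one process; the queue name "/demo_mq" is made up. Link with -lrt on older glibc:

#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 64 };

    /* Create (or open) the queue; it lives in the kernel and is visible
     * under /dev/mqueue if that filesystem is mounted. */
    mqd_t q = mq_open("/demo_mq", O_CREAT | O_RDWR, 0600, &attr);
    if (q == (mqd_t)-1) { perror("mq_open"); return 1; }

    const char *msg = "hello";
    if (mq_send(q, msg, strlen(msg) + 1, 0) < 0) { perror("mq_send"); return 1; }

    char buf[64];               /* must be at least mq_msgsize bytes */
    unsigned prio;
    ssize_t n = mq_receive(q, buf, sizeof buf, &prio);
    if (n < 0) { perror("mq_receive"); return 1; }
    printf("received: %s (priority %u)\n", buf, prio);

    mq_close(q);
    mq_unlink("/demo_mq");      /* remove the name when finished */
    return 0;
}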
System V message queues (the ones manipulated by the msg* system calls) have a lot of weird quirks and gotchas. For new code, I'd strongly recommend using UNIX domain sockets.
That being said, I'd also strongly recommend message-passing IPC over shared-memory schemes. Shared memory is much easier to get wrong, and tends to go wrong much more catastrophically.
Message passing is great for small data chunks and where immutability needs to be maintained, as message queues copy data.
A shared memory area does not copy data on send/receive and can be more efficient for larger data sets at the tradeoff of a less clean programming model.
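For comparison, here is a minimal sketch (my addition) of the shared-memory-plus-semaphore scheme the question contrasts with message queues: a POSIX named semaphore guards a counter in POSIX shared memory. The names are made up; compile with gcc -pthread (plus -lrt on older glibc):

#include <fcntl.h>
#include <semaphore.h>
#include <stdio.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = shm_open("/demo_counter", O_CREAT | O_RDWR, 0600);
    if (fd < 0) { perror("shm_open"); return 1; }
    if (ftruncate(fd, sizeof(int)) < 0) { perror("ftruncate"); return 1; }

    int *counter = mmap(NULL, sizeof(int), PROT_READ | PROT_WRITE,
                        MAP_SHARED, fd, 0);
    if (counter == MAP_FAILED) { perror("mmap"); return 1; }

    /* A named semaphore with initial value 1 acts as a cross-process mutex. */
    sem_t *sem = sem_open("/demo_sem", O_CREAT, 0600, 1);
    if (sem == SEM_FAILED) { perror("sem_open"); return 1; }

    sem_wait(sem);       /* enter the critical section */
    (*counter)++;        /* unlike a queue, no copy: the write is in place */
    printf("counter = %d\n", *counter);
    sem_post(sem);       /* leave the critical section */

    sem_close(sem);
    munmap(counter, sizeof(int));
    close(fd);
    /* sem_unlink("/demo_sem") and shm_unlink("/demo_counter") when done. */
    return 0;
}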
The disadvantages of message queues are minuscule: some system-call and copying overhead, which amounts to nothing for most applications. The benefits far outweigh that overhead. Synchronization is automatic, and they can be used in a variety of ways: blocking or non-blocking, and, since Linux implements message queues as file descriptors, they can even be used in select() calls for multiplexing. In the POSIX variety, which you should use unless you have a really compelling need for the System V queues, you can even have threads spawned or signals raised automatically to process queue items (mq_notify). And best of all, they are fully debugged.
Message queues and shared memory are different, and it is up to the programmer and the requirements to select which to use. With shared memory you have to be careful about reads and writes, and the processes must be synchronized, so the order of execution matters. There is no way to tell whether the value you read is the newly written one or a stale one, and there is no built-in mechanism to wait. Message queues, in contrast, come with predefined functions that make your life easy.
