I've never seen any project or anything utilizing POSIX or System V message queues - and, being curious, what problems or projects have you guys used them for?
I had a series of commands that needed to be executed in order, but the main program flow did not depend on their completion so I queued them up and passed them to another process via a System V message queue to be executed independently of the main program. Since message queues provide an asynchronous communications protocol, they were a good fit for this task.
To be honest, I used System V message queues because I had never used them before and I wanted to. I'm sure there are other IPC methods I could have used.
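For anyone who hasn't seen the calls before, a minimal sketch of that kind of fire-and-forget command hand-off might look something like this (the key, struct layout and function name are purely illustrative, not the original poster's code):

    #include <stdio.h>
    #include <sys/types.h>
    #include <sys/ipc.h>
    #include <sys/msg.h>

    #define CMD_QUEUE_KEY 0x4d51           /* hypothetical key both processes agree on */

    struct cmd_msg {
        long mtype;                        /* required by System V; must be > 0 */
        char mtext[128];                   /* the command to run */
    };

    /* Queue a command and return immediately; a separate worker process
     * drains the queue with msgrcv() and executes commands in FIFO order. */
    int send_command(const char *cmd)
    {
        int qid = msgget(CMD_QUEUE_KEY, IPC_CREAT | 0600);
        if (qid == -1)
            return -1;

        struct cmd_msg msg;
        msg.mtype = 1;
        snprintf(msg.mtext, sizeof msg.mtext, "%s", cmd);

        return msgsnd(qid, &msg, sizeof msg.mtext, 0);
    }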
It's been a while since I've done any real VxWorks programming, but you can also find message queues used in VxWorks applications. According to the VxWorks Application Programmer's Guide (Google search), the primary intertask communication mechanism within a single CPU is message queues. VxWorks uses two message queue subroutine libraries (POSIX and VxWorks).
I once wrote a text-mode I/O generator utility that had one thread in charge of updating the UI and a number of worker threads to do the actual I/O work. When a worker thread completed an I/O, it sent an update message to the UI thread. I implemented this message system using a POSIX message queue.
Why implement it like this? It sounded like a good idea at the time, and I was curious about how they worked. I figured I could solve the problem and learn something at the same time. There were many different techniques I could have used, and I don't suppose there was any profound reason why I chose this technique. I didn't realize it until later, but I was glad I used a POSIX queue when I had to port the utility to another system (it was also POSIX compliant, so I didn't have to worry about porting external libraries to get my app to run).
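For illustration only, a stripped-down version of that worker-to-UI path might look something like this (queue name, sizes and struct are invented; link with -lrt on older glibc):

    #include <fcntl.h>
    #include <mqueue.h>
    #include <string.h>

    #define UI_QUEUE "/io_gen_ui"              /* POSIX mq names start with '/' */

    struct ui_update {
        int  worker_id;
        long bytes_done;
    };

    /* Create/open the queue with a message size that matches our struct. */
    mqd_t open_ui_queue(void)
    {
        struct mq_attr attr = { 0 };
        attr.mq_maxmsg  = 16;                  /* queue depth */
        attr.mq_msgsize = sizeof(struct ui_update);
        return mq_open(UI_QUEUE, O_CREAT | O_RDWR, 0600, &attr);
    }

    /* Worker thread: report a completed I/O to the UI thread. */
    int post_update(mqd_t q, int worker_id, long bytes_done)
    {
        struct ui_update u = { worker_id, bytes_done };
        return mq_send(q, (const char *)&u, sizeof u, 0);
    }

    /* UI thread: block until the next update arrives, then redraw. */
    int wait_for_update(mqd_t q, struct ui_update *out)
    {
        char buf[sizeof(struct ui_update)];    /* must be >= the queue's mq_msgsize */
        if (mq_receive(q, buf, sizeof buf, NULL) < 0)
            return -1;
        memcpy(out, buf, sizeof *out);
        return 0;
    }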
You can certainly use it for IPC, because that is what it is. With this mechanism you can write multi-process event-processing applications in which all of the processes use the queue, each waiting for a particular type of message (a specific event to occur). When the message arrives, that process takes it, processes it, and puts the result back into the queue so that another process can use it.
Once I wrote such an application using message queues. They are pretty easy to work with and do not need inter-process synchronization mechanisms such as semaphores. You can use them in place of shared memory or memory-mapped files as well; in situations where all you need is to send a structure or some kind of packed data to other processes, message queues are far easier to use than any other IPC mechanism.
This book contains all information you need to know about Message Queues and other IPC mechanisms in Linux.
As the title suggests, is there a way in C to detect when a user-level thread running on top of a kernel-level thread e.g., pthread has blocked (or about to block) for I/O?
My use case is as follows: I need to execute tasks in a multithreaded environment (on top of kernel threads e.g., pthreads). The tasks are basically user functions that can be synchronized and may use blocking operations within. I need to hide latency in my implementation. So, I am exploring the idea of implementing the tasks as user-level threads for better control of their execution context such that, when a task blocks or synchronizes, I context-switch to other ready tasks (i.e., implementing my own scheduler for the user-level threads). Consequently, almost the full use of the OS’s time quantum per kernel thread can be achieved.
There used to be code that did this, for example GNU pth. It's generally been abandoned because it just doesn't work very well and we have much better options now. You have two choices:
1) If you have OS help, you can use the OS mechanisms. Windows provides OS help for this; IOCP dispatching uses it.
2) If you have no OS help, then you have to convert all blocking operations into non-blocking ones that call your dispatcher rather than blocking. So, for example, if someone calls socket, you intercept that call and set the socket non-blocking. When they call read, you intercept that call and if they get a "would block" indication, you arrange to resume when the operation might succeed and schedule another thread.
You can look at GNU pth to see how you might make option 2 work. But be warned, GNU pth is full of reported bugs that have never been fixed since it was abandoned. It will give you an idea of how to implement things like mutexes and sleeps in a cooperative user-space threading environment. But don't actually use the code.
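To make option 2 concrete, here's a rough sketch of what an intercepted read() might look like; sched_block_on_fd() is a stand-in for whatever wake-up/switch primitive your own user-level scheduler provides, it is not a real API:

    #include <errno.h>
    #include <fcntl.h>
    #include <unistd.h>

    /* Hypothetical scheduler hook: parks the current user-level thread,
     * switches to another ready one, and resumes this one when fd is readable
     * (the scheduler watches fds with its own poll/epoll loop). */
    void sched_block_on_fd(int fd);

    /* Replacement for read(): never blocks the underlying kernel thread. */
    ssize_t my_read(int fd, void *buf, size_t len)
    {
        fcntl(fd, F_SETFL, fcntl(fd, F_GETFL, 0) | O_NONBLOCK);

        for (;;) {
            ssize_t n = read(fd, buf, len);
            if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
                return n;                 /* data, EOF, or a real error */

            sched_block_on_fd(fd);        /* run other tasks until fd is ready */
        }
    }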
I have a Linux C program where I'm passing data between threads. I was looking into using POSIX message queues to solve this since they don't require mutexes/locks.
Looking at the mq_open() call, I have to specify permissions and the path to the queue. This leads me to two questions.
1) Is there a well-known convention for specifying the filepath? I was just going to dump the queues in the same folder as the executable.
2) In terms of permissions, I was going to use 0600, but I want to restrict this even further to prevent other processes from accessing the queues (I'm sharing data between threads and not processes). Given that the queue is "just" a file, can I use flock() with LOCK_EX to prevent accesses from other processes?
Thanks in advance.
Regarding your question 1 look at the implementation notes for mq_open on your system. At least on Linux and FreeBSD message queue names must start with a slash, but must not contain other slashes.
So while the name of a message queue looks like a path, it might or might not be an actual inode in a filesystem, depending on the implementation. According to mq_overview(7), Linux uses a virtual filesystem for message queues, which may or may not be mounted.
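A tiny sketch of what that means in practice (the queue name here is made up; on Linux, link with -lrt on older glibc):

    #include <fcntl.h>
    #include <mqueue.h>
    #include <stdio.h>

    int main(void)
    {
        /* Valid: exactly one slash, and it is the leading one. */
        mqd_t q = mq_open("/myapp_events", O_CREAT | O_RDWR, 0600, NULL);
        if (q == (mqd_t)-1) {
            perror("mq_open");
            return 1;
        }

        /* Something like mq_open("/tmp/myapp/events", ...) would fail,
         * because the name may not contain any further slashes. */

        mq_close(q);
        mq_unlink("/myapp_events");        /* shows up under /dev/mqueue if mounted */
        return 0;
    }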
In view of this, question 2 might be moot. You'd have to run a test or check the kernel source if locking of a file in /dev/mqueue is actually even supported and if it accomplishes what you want.
I would not bother protecting the queue from outside processes.
Since flock() is only advisory, not mandatory, it will not do you any good.
Also, I'm not sure that flock() will even work on queue descriptors.
Running your service as its own user will, with mode 0600, keep other processes from being able to access the queue, of course.
I would, however, ensure on startup that only one service can work on a queue at a time.
You could use pid locking or d-bus to do so.
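If it helps, one common way to do the pid-locking part is to hold an exclusive advisory lock on a pid file for the life of the process; the path and function name below are only examples:

    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/file.h>
    #include <unistd.h>

    /* Returns the held fd on success, -1 if another instance is already running. */
    int ensure_single_instance(void)
    {
        int fd = open("/var/run/myservice.pid", O_RDWR | O_CREAT, 0600);
        if (fd == -1)
            return -1;

        /* Non-blocking exclusive lock: fails at once if another instance holds it. */
        if (flock(fd, LOCK_EX | LOCK_NB) == -1) {
            close(fd);
            return -1;
        }

        char buf[32];
        int len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());
        if (ftruncate(fd, 0) == 0)
            (void)write(fd, buf, (size_t)len);

        return fd;    /* keep it open; the lock is released automatically on exit */
    }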
Problem definition:
We are designing an application for an industrial embedded system running Linux.
The system is driven by events from the outside world. The inputs to the system could be any of the following:
A few inputs to the system in the form of digital I/O lines (connected to the GPIOs of the processor, like e-stop).
The system runs a web server which allows the system to be controlled via a web browser.
The system runs a TCP server. Any PC or HMI device could send commands over TCP/IP.
The system needs to drive or control RS485 slave devices over UART using Modbus. The system also needs to control a few I/O lines, like Cooler ON/OFF etc.
We believe that a state machine is essential to define this application. The core application shall be a multithreaded application with the following threads:
Main thread
Thread to control the RS485 slaves.
Thread to handle events from the Web interface.
Thread to handle digital I/O events.
Thread to handle commands over TCP/IP (sockets)
For inter-thread communication, we are using pthread condition signal & wait. As per our initial design approach (one state machine in the main thread), any input event to the system (web or TCP/IP or digital I/O) shall be relayed to the main thread, and it shall communicate to the appropriate thread for which the event is destined. A typical scenario would be getting the status of an RS485 slave through the web interface. In this case, the web interface thread shall relay the event to the main thread, which shall change the state and then communicate the event to the thread that controls the RS485 slaves and respond back. The main thread shall send the response back to the web interface thread.
Questions:
1) Should each thread have its own state machine, thereby reducing the complexity of the main thread? In such a case, do we still need to have a state machine in the main thread?
2) Can any thread processing an input event communicate directly with the thread that handles the event, bypassing the main thread? For example, could the web interface thread communicate directly with the thread controlling the RS485 slaves?
3) Is it fine to use pthread condition signals & wait for inter-thread communication, or is there a better approach?
4) How can we have one thread wait for an event from outside and for responses from other threads? For example, the web interface thread usually waits for events on a POSIX message queue for inter-process communication from the web server CGI bins. The CGI bins send events to the web interface thread through this message queue. When processing such an event, the web interface thread would wait for responses from other threads. In such a situation, it couldn't process any new event from the web interface until it had completed processing the previous event and got back to waiting on the POSIX message queue.
Sorry for the overly long explanation... I hope I have put it forward in the best possible way for others to understand and help me.
I can give more details if needed.
What I always try to do with such requirements is to use one state machine, run by one 'SM' thread, which could be the main thread. This thread waits on an 'EventQueue' input producer-consumer queue with a timeout. The timeout is used to run an internal delta-queue that can provide timeout events into the state-machine when they are required.
All other threads communicate their events to the state engine by pushing messages onto the EventQueue, and the SM thread processes them in a serial manner.
If an action routine in the SM decides that it must do something, it must not synchronously wait for anything, so it must request the action by pushing a request message onto the input queue of whatever thread/subsystem can perform it.
My message class (OK, struct in your C case) typically contains a 'command' enum, a 'result' enum, a data buffer pointer (in case it needs to transport bulk data), an error-message pointer (null if no error), and as much other state as is necessary to allow the asynchronous queueing up of any kind of request and the return of the complete result (whether success or fail).
This message-passing, one-SM design is the only one I have found that is capable of doing such tasks in a flexible, expandable manner without entering a nightmare world of deadlocks, uncontrolled communications and unrepeatable, undebuggable interactions.
The first question that should be asked about any design is 'OK, how can the system be debugged if there is some strange problem?'. In my design above, I can answer straightaway: 'we log all events dequeued in the SM thread - they all come in serially, so we always know exactly what actions are taken based on them'. If any other design is suggested, ask the above question and, if a good answer is not immediately forthcoming, it will never be made to work.
So:
1) If a thread, or threaded subsystem, can use a separate state machine to do its own INTERNAL functionality, OK, fine. These SMs should be invisible to the rest of the system.
2) NO!
3) Use the pthread condition signals & wait to implement producer-consumer blocking queues (a rough sketch follows below).
4) One input queue per thread/subsystem. All inputs go to this queue in the form of messages. Command/state fields in each message identify the message and what should be done with it.
BTW, I would 100% do this in C++ unless shotgun-at-head :)
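Sticking with C as the question requires, a rough sketch of the blocking queue from points 3 and 4 might look like this (the msg layout and names are illustrative; initialize the mutex and condition variable with PTHREAD_MUTEX_INITIALIZER / PTHREAD_COND_INITIALIZER or the *_init calls):

    #include <pthread.h>
    #include <stddef.h>

    struct msg {
        int          command;      /* what to do */
        int          result;       /* filled in by the handler */
        void        *data;         /* optional bulk payload */
        const char  *error;        /* NULL if no error */
        struct msg  *next;
    };

    struct queue {
        pthread_mutex_t lock;
        pthread_cond_t  not_empty;
        struct msg     *head, *tail;
    };

    /* Producer side: any thread pushes its event/request message. */
    void queue_push(struct queue *q, struct msg *m)
    {
        m->next = NULL;
        pthread_mutex_lock(&q->lock);
        if (q->tail) q->tail->next = m; else q->head = m;
        q->tail = m;
        pthread_cond_signal(&q->not_empty);
        pthread_mutex_unlock(&q->lock);
    }

    /* Consumer side: the SM thread pops messages one at a time, serially. */
    struct msg *queue_pop(struct queue *q)
    {
        pthread_mutex_lock(&q->lock);
        while (q->head == NULL)
            pthread_cond_wait(&q->not_empty, &q->lock);  /* lock released while waiting */
        struct msg *m = q->head;
        q->head = m->next;
        if (q->head == NULL) q->tail = NULL;
        pthread_mutex_unlock(&q->lock);
        return m;
    }

A pthread_cond_timedwait() variant of queue_pop() gives the SM thread the timeout it needs to run the delta-queue mentioned above.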
I have implemented a legacy embedded library that was originally written for a clone (EC115/EC270) of the Siemens ES122C terminal controller. This library and OS included more or less what you describe. The original hardware was based on the 80186 CPU. The OS, RMOS for Siemens, FXMOS for us (don't google it, it was never published), had all the stuff needed for basic controller work.
It had preemptive multi-tasking, task-to-task communication, semaphores, timers and I/O events, but no memory protection.
I ported that stuff to RaspberryPi (i.e. Linux).
I used pthreads to simulate our legacy "tasks" because we had no memory protection, so threads are semantically the closest.
The rest of the implementation then turned around the epoll API. This means that everything generates an event. An event is when something happens: a timer expires, another thread sends data, a TCP socket is connected, an I/O pin changes state, etc.
This requires that all the event sources be transformed into file descriptors. Linux provides several syscalls that do exactly that:
For task-to-task communication I used classic Unix pipes.
For timer events I used the timerfd API.
For TCP communication I used normal sockets.
For serial I/O I simply opened the right device /dev/???.
Signals were not necessary in my case, but Linux provides signalfd if needed.
I then wrapped epoll_wait to simulate the original semantics.
It works like a charm.
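A condensed sketch of that loop, in case it helps (fd setup is abbreviated and error handling omitted; the pipe and listening socket are assumed to be created elsewhere):

    #include <stdint.h>
    #include <sys/epoll.h>
    #include <sys/timerfd.h>
    #include <time.h>
    #include <unistd.h>

    void run_event_loop(int pipe_rd, int tcp_listen_fd)
    {
        int ep = epoll_create1(0);

        /* Timer events become a file descriptor too. */
        int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
        struct itimerspec its = { { 1, 0 }, { 1, 0 } };   /* 1 s periodic tick */
        timerfd_settime(tfd, 0, &its, NULL);

        struct epoll_event ev;
        ev.events = EPOLLIN; ev.data.fd = tfd;            /* timer */
        epoll_ctl(ep, EPOLL_CTL_ADD, tfd, &ev);
        ev.events = EPOLLIN; ev.data.fd = pipe_rd;        /* task-to-task pipe */
        epoll_ctl(ep, EPOLL_CTL_ADD, pipe_rd, &ev);
        ev.events = EPOLLIN; ev.data.fd = tcp_listen_fd;  /* TCP server socket */
        epoll_ctl(ep, EPOLL_CTL_ADD, tcp_listen_fd, &ev);

        for (;;) {
            struct epoll_event events[8];
            int n = epoll_wait(ep, events, 8, -1);
            for (int i = 0; i < n; i++) {
                int fd = events[i].data.fd;
                if (fd == tfd) {
                    uint64_t expirations;
                    read(tfd, &expirations, sizeof expirations);  /* consume tick */
                    /* ... handle timer event ... */
                } else if (fd == pipe_rd) {
                    /* ... read and handle a message from another task ... */
                } else if (fd == tcp_listen_fd) {
                    /* ... accept() and register the new connection ... */
                }
            }
        }
    }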
TL;DR
Take a deep look at the epoll API; it does what you probably need.
EDIT: Yes, and the advice of Martin James is very good, especially point 4. Each thread should only ever be in a loop waiting on an event via epoll_wait.
I am working on a message queue used for communication among processes on embedded Linux. I am wondering why not use the message queues provided by Linux, namely the following:
msgctl, msgget, msgrcv, msgsnd,
instead of creating shared memory and syncing up with a semaphore?
What's the disadvantage of using this set of functions directly in a commercial embedded product?
The functions msgctl(), msgget(), msgrcv(), and msgsnd() are the 'System V IPC' message queue functions. They'll work for you, but they're fairly heavy-weight. They are standardized by POSIX.
POSIX also provides a more modern set of functions, mq_close(), mq_getattr(), mq_notify(), mq_open(), mq_receive(), mq_send(), mq_setattr(), and mq_unlink() which might be better for you (such an embarrassment of riches).
However, you will need to check which, if either, is installed on your target platforms by default. Especially in an embedded system, it could be that you have to configure them, or even get them installed because they aren't there by default (and the same might be true of shared memory and semaphores).
The primary advantage of either set of message facilities is that they are pre-debugged (probably) and therefore have concurrency issues already resolved - whereas if you're going to do it for yourself with shared memory and semaphores, you've got a lot of work to do to get to the same level of functionality.
So, (re)use when you can. If it is an option, use one of the two message queue systems rather than reinvent your own. If you eventually find that there is a performance bottleneck or something similar, then you can investigate writing your own alternatives, but until then — reuse!
System V message queues (the ones manipulated by the msg* system calls) have a lot of weird quirks and gotchas. For new code, I'd strongly recommend using UNIX domain sockets.
That being said, I'd also strongly recommend message-passing IPC over shared-memory schemes. Shared memory is much easier to get wrong, and tends to go wrong much more catastrophically.
Message passing is great for small data chunks and where immutability needs to be maintained, as message queues copy data.
A shared memory area does not copy data on send/receive and can be more efficient for larger data sets at the tradeoff of a less clean programming model.
The disadvantages of message queues are minuscule - some system call and copying overhead - which amounts to nothing for most applications. The benefits far outweigh that overhead. Synchronization is automatic, and they can be used in a variety of ways: blocking, non-blocking, and, since in Linux the message queue types are implemented as file descriptors, they can even be used in select() calls for multiplexing. In the POSIX variety, which you should be using unless you have a really compelling need to use SysV queues, you can even automatically generate threads or signals to process the queue items. And best of all, they are fully debugged.
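For the "automatically generate threads" part, that is mq_notify() with SIGEV_THREAD. Here is a hedged sketch (queue name and buffer size are arbitrary; the registration is one-shot and must be re-armed, and the queue is opened non-blocking so the drain loop stops when it is empty):

    #include <fcntl.h>
    #include <mqueue.h>
    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    static mqd_t q;
    static void register_notify(void);

    /* Runs in a new thread whenever a message lands on a previously empty queue. */
    static void on_message(union sigval sv)
    {
        (void)sv;
        register_notify();                     /* re-arm before draining */

        char buf[8192];                        /* must be >= the queue's mq_msgsize */
        ssize_t n;
        while ((n = mq_receive(q, buf, sizeof buf, NULL)) >= 0)
            printf("got %zd-byte message\n", n);
    }

    static void register_notify(void)
    {
        struct sigevent sev = { 0 };
        sev.sigev_notify = SIGEV_THREAD;
        sev.sigev_notify_function = on_message;
        mq_notify(q, &sev);
    }

    int main(void)
    {
        q = mq_open("/demo_notify", O_CREAT | O_RDONLY | O_NONBLOCK, 0600, NULL);
        register_notify();
        pause();                               /* handler threads do all the work */
        return 0;
    }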
Message queues and shared memory are different, and it is up to the programmer and their requirements to select which to use. With shared memory you have to be a bit careful with reading and writing, and the processes should be synchronized, so the order of execution is very important. In shared memory there is no way to tell whether the value you are reading is the newly written value or the older one, and there is no explicit mechanism to wait. With message queues, there are predefined functions to make your life easy.
I'm writing a TCP server/client application on Windows, to become familiar with the Winsock API. I come from a UNIX background and would like to know which of these could be the best approach to implement the application:
First, the specification:
Must scale well on multiprocessor and single-processor systems.
No hard-set limit on connections.
Application can both listen for connections, acting as server, and act as client.
Multi threaded.
First approach:
Non-blocking select-like socket for listening, in the 'server' thread.
For each client connecting we spawn a separate thread.
Second approach:
Blocking socket for listening, in the 'server' thread.
For each client connecting we spawn a separate thread.
Third approach:
Non-blocking select-like socket for listening, in the 'server' thread.
No separate thread for each incoming connection; the protocol would need state information kept across sessions, I suppose.
I wonder what is the most efficient and scalable approach, and especially if it can work with a UDP socket too.
Note: I'm writing the application in plain old C. No .NET or C++ involved; C++ exceptions disabled too.
As Gary says, I/O Completion Ports are the most efficient way to manage multiple network connections in a non-blocking/async manner on Windows platforms.
With IOCP you get notified when your networking operations complete, and you can process these completions with a small number of threads. You get to decide how many threads you allocate to process the completions, and the kernel decides when to use the threads that you're providing. It uses them in LIFO order to reduce context switching, so that at any point you are only using the minimal number of threads required, and you're reusing the same threads rather than cycling through all of the threads that you have available for use.
The asynchronous nature of IOCP programming can be a little confusing to start with, but once you get the hang of it it's fairly straightforward.
I have some free IOCP server code which demonstrates the basics and provides some example servers that are pretty easy to build on. You can find the code here: http://www.serverframework.com/products---the-free-framework.html. That page also links to some articles that I wrote to explain the code.
Relating this to the detail of your question: you should be looking at a variation on your third approach. Use AcceptEx() to accept new connections; this can be used in an asynchronous manner, so you don't need a separate thread for connection acceptance and can use the threads that are also processing your overlapped/async read and write operations.
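Since the question asks for plain C, here is a very rough skeleton of the IOCP worker-loop shape (this is my illustration, not the framework linked above; the structure name, buffer size and setup comments are invented, and AcceptEx plumbing and error handling are left out):

    #include <winsock2.h>
    #include <windows.h>
    #include <string.h>
    /* link with ws2_32.lib */

    typedef struct {
        OVERLAPPED ov;                   /* must be the first member for the cast below */
        WSABUF     wsabuf;
        char       buf[4096];
        SOCKET     sock;
    } CONN;

    static DWORD WINAPI worker(LPVOID arg)
    {
        HANDLE iocp = (HANDLE)arg;
        for (;;) {
            DWORD bytes = 0;
            ULONG_PTR key = 0;
            OVERLAPPED *ov = NULL;
            if (!GetQueuedCompletionStatus(iocp, &bytes, &key, &ov, INFINITE))
                continue;                /* failed/aborted I/O: real code cleans up here */

            CONN *c = (CONN *)ov;        /* recover our per-connection context */
            /* ... process c->buf[0 .. bytes), then post the next read: ... */
            DWORD flags = 0;
            memset(&c->ov, 0, sizeof c->ov);
            c->wsabuf.buf = c->buf;
            c->wsabuf.len = sizeof c->buf;
            WSARecv(c->sock, &c->wsabuf, 1, NULL, &flags, &c->ov, NULL);
        }
        return 0;
    }

    /* Setup sketch, per accepted socket s:
     *   HANDLE iocp = CreateIoCompletionPort(INVALID_HANDLE_VALUE, NULL, 0, 0);
     *   CreateIoCompletionPort((HANDLE)s, iocp, 0, 0);
     *   then issue the first WSARecv() with a zeroed OVERLAPPED as above,
     *   and start a small pool of worker() threads with CreateThread().   */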
I've written an asynchronous client which does not use blocking sockets, so if you're interested in that approach, then take a look at my client: http://codesprout.blogspot.com/2011/04/asynchronous-http-client.html
It's an HTTP client, but I've shown very little HTTP protocol processing in there; it's all just .NET sockets. The server would work in a similar way: you can take advantage of the *Async methods such as AcceptAsync.
Under Windows, the best performances are achieved by using I/O completion calls.
This is because the list and queuing mechanisms are handled in the kernel, far from the heavy user-mode overhead (which drags your code down if you dare to do the hard work yourself).
Unfortunately, Windows I/O completion calls need to allocate many threads to scale and this is quickly killing the performances (as compared to Linux epoll which can scale independently of the number of worker threads you decide to involve in the task).
Recently, I discovered http://gwan.com/, a web server which came from Windows and was then ported to Linux. Its authors describe the problem in detail on their forum.