I know a lot has been written about mutex implementation, but I couldn't find a solution to my problem.
I am working on a single-core multitasking system running under an RTOS.
Task_1() // lower priority preemptive task
Task_2() // higher priority task
I am trying to implement a mutex to lock the UART communication port, but I am facing the following problem:
Task_1 locks the mutex and starts sending a message over the UART port.
In the meantime it is preempted (interrupted) by the higher-priority Task_2, which also attempts to send data over the UART.
Task_2, however, cannot lock the mutex, because the task holding it was interrupted and cannot unlock it. This blocks both tasks.
Is there any good solution to avoid or resolve this situation?
The goal is also not to corrupt the UART data sent by Task_1.
My design is as follows:
Make the receiving socket non-blocking.
Set the epoll_wait timeout to 0.
Loop in the main thread to accept clients and check epoll_wait. If an event occurs, create a thread to handle the I/O operation.
I think this scheme has disadvantages:
Too many threads are created. Although they can be managed by a thread pool, it seems that a large number of threads is needed to handle concurrency.
Queues could be used to manage read and write tasks, so that only a small number of threads is required to process the queued tasks. But this may cause server delays.
I have an idea:
Hand the task over to the Linux kernel through aio_read and aio_write. This seems to reduce the number of server threads, but I'm not sure whether it is potentially risky.
What I have not used:
No multi-process programming is used in the server.
Thanks to everyone for the suggestions.
I'm new to kernel programming and I am making changes to a Linux driver. I want to block/wait in a critical section for user input (the communication between the driver and the user-space application works). The problem is that when I use wait_event_timeout(), the system crashes and I get
BUG: scheduling while atomic: swapper.
Does anybody have an idea how to solve this problem, and can you give me some advice on where to start?
As explained in other questions, you are calling wait_event_timeout in a context where you already hold a lock (inside a critical section), typically a spinlock or some other state in which sleeping is not allowed. At this point your process can potentially deadlock with other processes, and the scheduler complains about it. Please review the point where you call wait_event_timeout, check that the I/O is performed in the correct place, and make sure you have released all synchronization primitives before putting your process to sleep.
I am trying to understand why IOCP is used. I can think of two reasons:
Since WSARecv() will not block, I can handle thousands of clients without having to create a new thread for each client (there is also a limit on how many threads you can create, so the number of clients you can handle would be limited).
Since WSASend() will not block, when I want to send a large file I don't have to create a new thread to send it (if I did not create a new thread, the UI thread would of course block).
What other reasons are there to use IOCP?
IOCP has the benefits that you mention, but they are not exclusive to IOCP. I'm not that familiar with the native socket APIs, but some Win32 APIs have "overlapped IO", which is asynchronous but does not require IOCP.
Another benefit is that with IOCP the number of request serving threads is (kind of) optimized by the kernel. The kernel is aware of all blocking that request serving threads do and it will see to it that there are enough, and not more, threads unblocked at all times so that the CPU is well-utilized. Ideally, you would never block and there would be as many threads as there are cores (assuming 100% load). That would be very efficient.
IOCP also helps to reduce context switching: instead of switching to another thread to process the results of an IO, an existing thread that is already busy simply calls GetQueuedCompletionStatus again.
GetQueuedCompletionStatusEx can be used to reduce the number of transitions to the kernel because you can dequeue multiple IOs in one call.
Also, it cuts down on avoidable bulk data copying and protection ring cycles. Instead of the kernel having to copy data from the network stack buffers into a user-space buffer when requested by a recv() call, user-space buffers are supplied by WSARecv() and the stack can then load them directly in kernel space.
Problem definition:
We are designing an application for an industrial embedded system running Linux.
The system is driven by events from the outside world. The inputs to the system could be any of the following:
A few inputs to the system in the form of digital I/O lines (connected to the GPIOs of the processor, such as e-stop).
The system runs a web server, which allows the system to be controlled via a web browser.
The system runs a TCP server. Any PC or HMI device could send commands over TCP/IP.
The system needs to drive or control RS485 slave devices over UART using Modbus. The system also needs to control a few I/O lines, such as Cooler ON/OFF.
We believe that a state machine is essential to define this application. The core application shall be a multithreaded application with the following threads:
Main thread
Thread to control the RS485 slaves.
Thread to handle events from the Web interface.
Thread to handle digital I/O events.
Thread to handle commands over TCP/IP (sockets).
For inter-thread communication, we are using pthread condition signal & wait. As per our initial design approach (one state machine in the main thread), any input event to the system (web, TCP/IP or digital I/O) shall be relayed to the main thread, which shall communicate it to the thread for which the event is destined. A typical scenario would be to get the status of an RS485 slave through the web interface. In this case, the web interface thread shall relay the event to the main thread, which shall change the state and then communicate the event to the thread that controls the RS485 slaves, which responds back. The main thread shall then send the response back to the web interface thread.
Questions:
1. Should each thread have its own state machine, thereby reducing the complexity of the main thread? In that case, do we still need a state machine in the main thread?
2. Can any thread processing an input event communicate directly with the thread that handles the event, bypassing the main thread? For example, could the web interface thread communicate directly with the thread controlling the RS485 slaves?
3. Is it fine to use pthread condition signals & wait for inter-thread communication, or is there a better approach?
4. How can we have one thread wait both for events from outside and for responses from other threads? For example, the web interface thread usually waits for events on a POSIX message queue used for inter-process communication with the web server's CGI binaries. The CGI binaries send events to the web interface thread through this message queue. While processing such an event, the web interface thread would wait for a response from other threads. In that situation, it could not process any new event from the web interface until it has finished processing the previous event and gets back to waiting on the POSIX message queue.
Sorry for the long explanation. I hope I have explained it clearly enough for others to understand and help me.
I can give more details if needed.
What I always try to do with such requirements is to use one state machine, run by one 'SM' thread, which could be the main thread. This thread waits on an 'EventQueue' input producer-consumer queue with a timeout. The timeout is used to run an internal delta-queue that can provide timeout events into the state machine when they are required.
All other threads communicate their events to the state engine by pushing messages onto the EventQueue, and the SM thread processes them in a serial manner.
If an action routine in the SM decides that it must do something, it must not synchronously wait for anything, and so it must request the action by pushing a request message to an input queue of whatever thread/subsystem can perform it.
My message class, (OK, *struct in your C case), typically contains a 'command' enum, 'result' enum, a data buffer pointer, (in case it needs to transport bulk data), an error-message pointer, (null if no error), and as much other state as is necessary to allow the asynchronous queueing up of any kind of request and returning the complete result, (whether success or fail).
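As a rough illustration of the message described above, here is a minimal C sketch. The enum values, the field names and the `reply_to` field are my own assumptions for the example, not something the design prescribes:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical command and result codes; the real set depends on the system. */
enum msg_command { CMD_NONE, CMD_GET_SLAVE_STATUS, CMD_SET_OUTPUT };
enum msg_result  { RES_PENDING, RES_OK, RES_FAILED };

struct message {
    enum msg_command command;  /* what the receiver is asked to do */
    enum msg_result  result;   /* filled in by whoever handles the request */
    void            *data;     /* optional bulk data buffer */
    size_t           data_len; /* number of valid bytes in data */
    const char      *error;    /* NULL if no error */
    int              reply_to; /* assumed: id of the queue that gets the result */
    struct message  *next;     /* link used while the message sits in a queue */
};

/* Prepare a request that has not been handled yet. */
void message_init_request(struct message *m, enum msg_command cmd, int reply_to)
{
    m->command  = cmd;
    m->result   = RES_PENDING;
    m->data     = NULL;
    m->data_len = 0;
    m->error    = NULL;
    m->reply_to = reply_to;
    m->next     = NULL;
}

/* Record the outcome; a non-NULL error text marks the request as failed. */
void message_complete(struct message *m, const char *error)
{
    m->error  = error;
    m->result = error ? RES_FAILED : RES_OK;
}
```

The same struct travels to the worker and back, so the SM thread gets the complete result (success or failure) as one more queued event.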
This message-passing, one SM design is the only one I have found that is capable of doing such tasks in a flexible, expandable manner without entering into a nightmare world of deadlocks, uncontrolled communications and unrepeatable, undebuggable interactions.
The first question that should be asked about any design is 'OK, how can the system be debugged if there is some strange problem?'. In my design above, I can answer straightaway: 'we log all events dequeued in the SM thread - they all come in serially so we always know exactly what actions are taken based on them'. If any other design is suggested, ask the above question and, if a good answer is not immediately forthcoming, it will never be got working.
So:
1. If a thread, or threaded subsystem, can use a separate state machine to do its own INTERNAL functionality, OK, fine. These SMs should be invisible from the rest of the system.
2. NO!
3. Use the pthread condition signals & wait to implement producer-consumer blocking queues.
4. One input queue per thread/subsystem. All inputs go to this queue in the form of messages. Commands/state in each message identify the message and what should be done with it.
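A producer-consumer blocking queue built on a pthread mutex and condition variable, with a timed pop for the SM thread, could look roughly like this. It is a sketch only: the names `event`, `event_queue`, `event_queue_push` and `event_queue_pop` are made up for the example, and error checking of the pthread calls is omitted for brevity.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>
#include <time.h>

struct event {
    int command;          /* identifies what should be done */
    struct event *next;   /* intrusive link, owned by the queue while queued */
};

struct event_queue {
    pthread_mutex_t lock;
    pthread_cond_t  not_empty;
    struct event   *head, *tail;
};

void event_queue_init(struct event_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
    q->head = q->tail = NULL;
}

/* Producer side: append an event and wake one waiting consumer. */
void event_queue_push(struct event_queue *q, struct event *e)
{
    e->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

/* Consumer side: wait up to timeout_ms for an event; NULL means timeout. */
struct event *event_queue_pop(struct event_queue *q, long timeout_ms)
{
    struct timespec deadline;
    clock_gettime(CLOCK_REALTIME, &deadline);
    deadline.tv_sec  += timeout_ms / 1000;
    deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
    if (deadline.tv_nsec >= 1000000000L) {
        deadline.tv_sec++;
        deadline.tv_nsec -= 1000000000L;
    }

    pthread_mutex_lock(&q->lock);
    /* Loop to cope with spurious wakeups; stop waiting once the deadline passes. */
    while (q->head == NULL) {
        if (pthread_cond_timedwait(&q->not_empty, &q->lock, &deadline) == ETIMEDOUT)
            break;
    }
    struct event *e = q->head;
    if (e) {
        q->head = e->next;
        if (q->head == NULL)
            q->tail = NULL;
    }
    pthread_mutex_unlock(&q->lock);
    return e;
}
```

The SM thread would call `event_queue_pop` in a loop and treat a NULL return as the tick for its timeout handling; every other thread only ever calls `event_queue_push`.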
BTW, I would 100% do this in C++ unless shotgun-at-head :)
I have implemented a legacy embedded library that was originally written for a clone (EC115/EC270) of the Siemens ES122C terminal controller. This library and OS included more or less what you describe. The original hardware was based on the 80186 CPU. The OS, RMOS for Siemens, FXMOS for us (don't google it, it was never published), had all the stuff needed for basic controller work.
It had preemptive multi-tasking, task-to-task communication, semaphores, timers and I/O events, but no memory protection.
I ported that stuff to RaspberryPi (i.e. Linux).
I used pthreads to simulate our legacy "tasks": because we had no memory protection, threads are semantically the closest.
The rest of the implementation then revolved around the epoll API. This means that everything generates an event. An event is when something happens: a timer expires, another thread sends data, a TCP socket is connected, an IO pin changes state, etc.
This requires that all the event sources be transformed into file descriptors. Linux provides several syscalls that do exactly that:
for task to task communication I used classic Unix pipes.
for timer events I used timerfd API.
for TCP communication I used normal sockets.
for serial I/O I simply opened the right device /dev/???.
signals are not necessary in my case but Linux provides 'signalfd' if necessary.
I then wrapped epoll_wait to simulate the original semantics.
It works like a charm.
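To give an idea of how the pieces fit together, here is a small sketch (not my original wrapper) that waits on a pipe and a timerfd through one epoll instance. The `event_source` tags and the function names are invented for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>
#include <sys/timerfd.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical tags telling the event loop which source fired. */
enum event_source { SRC_NONE = 0, SRC_PIPE = 1, SRC_TIMER = 2 };

/* Register fd for readability and remember which source it represents. */
int watch(int epfd, int fd, enum event_source src)
{
    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.u32 = (uint32_t)src;
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait for one event, drain it, and report its source (SRC_NONE on timeout or error). */
enum event_source wait_one_event(int epfd, int pipe_rd, int timer_fd, int timeout_ms)
{
    struct epoll_event ev;
    if (epoll_wait(epfd, &ev, 1, timeout_ms) != 1)
        return SRC_NONE;
    if (ev.data.u32 == SRC_PIPE) {
        char byte;  /* one byte stands in for a task-to-task message */
        return read(pipe_rd, &byte, 1) == 1 ? SRC_PIPE : SRC_NONE;
    }
    if (ev.data.u32 == SRC_TIMER) {
        uint64_t expirations;  /* a timerfd read yields the expiration count */
        return read(timer_fd, &expirations, sizeof expirations) == sizeof expirations
                   ? SRC_TIMER : SRC_NONE;
    }
    return SRC_NONE;
}

/* Demo: a pipe write and a 20 ms one-shot timer both arrive as epoll events. */
int epoll_demo(void)
{
    int pipefd[2];
    int epfd = epoll_create1(0);
    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
    if (epfd < 0 || tfd < 0 || pipe(pipefd) < 0)
        return -1;
    if (watch(epfd, pipefd[0], SRC_PIPE) < 0 || watch(epfd, tfd, SRC_TIMER) < 0)
        return -1;

    /* Normally another task would do this write. */
    if (write(pipefd[1], "x", 1) != 1)
        return -1;
    if (wait_one_event(epfd, pipefd[0], tfd, 1000) != SRC_PIPE)
        return -1;

    struct itimerspec its = { .it_value = { .tv_sec = 0, .tv_nsec = 20 * 1000 * 1000 } };
    if (timerfd_settime(tfd, 0, &its, NULL) < 0)
        return -1;
    if (wait_one_event(epfd, pipefd[0], tfd, 1000) != SRC_TIMER)
        return -1;

    close(pipefd[0]);
    close(pipefd[1]);
    close(tfd);
    close(epfd);
    return 0;
}
```

Sockets and serial devices would be registered with `watch` in the same way; only the draining code differs per source.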
TL;DR
Take a close look at the epoll API; it does what you probably need.
EDIT: Yes, and the advice from Martin James is very good, especially point 4. Each thread should only ever be in a loop waiting on an event via epoll_wait.
Is there any way to clean up POSIX shared synchronization objects, especially after a process crash? Unblocking locked POSIX semaphores is what I need most, but automatically 'collected' queues / shared memory regions would be nice too. Another thing to keep in mind is that we can't rely on signal handlers in general, because SIGKILL cannot be caught.
I see only one alternative: an external daemon that accepts subscriptions and 'keep-alive' requests and works as a watchdog, so that when it stops receiving notifications about some object, it can close / unlock that object according to a registered policy.
Does anyone have a better alternative or suggestion? I have never worked seriously with POSIX shared objects before (sockets were enough for all my needs and are much more useful in my opinion), and I did not find any applicable article. I'd gladly use sockets here but can't, for historical reasons.
Rather than using semaphores, you could use file locking to coordinate your processes. The big advantage of file locks is that they are released if the process terminates. You can map each semaphore onto a lock for a byte in a shared file and know that the locks will get released on exit; in most versions of Unix the bytes you lock don't even have to exist. There is code for this in Marc Rochkind's book Advanced Unix Programming, 1st edition; I don't know if it's in the latest 2nd edition though.
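This is not the code from the book, just a minimal sketch of the idea using fcntl() record locks, with one byte per "semaphore". The helper names and the temporary file path are assumptions for the demo. The child takes the lock and is then killed with SIGKILL, after which the parent can take the lock:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Non-blocking lock (F_WRLCK) or unlock (F_UNLCK) of the single byte at index. */
int byte_lock(int fd, off_t index, short type)
{
    struct flock fl = { 0 };
    fl.l_type = type;
    fl.l_whence = SEEK_SET;
    fl.l_start = index;   /* the byte need not exist in the file */
    fl.l_len = 1;
    return fcntl(fd, F_SETLK, &fl);
}

/* Returns 0 if the lock was held by the child and released when it was killed. */
int file_lock_demo(void)
{
    char path[] = "/tmp/lockdemoXXXXXX";  /* hypothetical shared lock file */
    int fd = mkstemp(path);
    int ready[2];
    if (fd < 0 || pipe(ready) < 0)
        return -1;

    pid_t child = fork();
    if (child < 0)
        return -1;
    if (child == 0) {
        /* Child: take "semaphore 0", tell the parent, then hang until killed. */
        if (byte_lock(fd, 0, F_WRLCK) != 0)
            _exit(1);
        if (write(ready[1], "k", 1) != 1)
            _exit(1);
        for (;;)
            pause();
    }

    close(ready[1]);  /* so read() sees EOF if the child fails early */
    char c;
    if (read(ready[0], &c, 1) != 1)
        return -1;

    /* While the child is alive, the byte is locked against us. */
    int held = byte_lock(fd, 0, F_WRLCK) == -1 &&
               (errno == EAGAIN || errno == EACCES);

    /* Simulate a crash that no signal handler could intercept. */
    kill(child, SIGKILL);
    waitpid(child, NULL, 0);

    /* The kernel dropped the child's lock, so we can take it now. */
    int released = byte_lock(fd, 0, F_WRLCK) == 0;

    close(ready[0]);
    close(fd);
    unlink(path);
    return (held && released) ? 0 : -1;
}
```

A blocking acquire would use F_SETLKW instead of F_SETLK. Note that these locks are per process, so they coordinate processes, not threads inside one process.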
I know this question is old, but another great solution is POSIX robust mutexes. They automatically unlock and enter an "inconsistent flag" state when the owner dies, and the next thread to attempt locking the mutex gets an EOWNERDEAD error but succeeds in becoming the new owner of the mutex. It's then able to clean up whatever state the mutex was protecting (which could be in a very bad inconsistent state due to asynchronous termination of the previous owner!) and mark the mutex as consistent again before unlocking it.
See the documentation on robust mutexes here:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/pthread_mutex_lock.html
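A short sketch of the sequence, assuming the mutex is created with pthread_mutexattr_setrobust(). To keep it self-contained, the "dead owner" here is a thread that exits while holding the lock; for real cross-process use the mutex would also need PTHREAD_PROCESS_SHARED and would have to live in shared memory. The helper names and `shared_state` are made up for the example:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t robust_lock;
static int shared_state;  /* stands in for the data the mutex protects */

/* Create a robust mutex (process-private in this sketch). */
int robust_mutex_init(pthread_mutex_t *m)
{
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc)
        return rc;
    rc = pthread_mutexattr_setrobust(&attr, PTHREAD_MUTEX_ROBUST);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Lock; if the previous owner died, repair the state and mark it consistent. */
int robust_mutex_lock(pthread_mutex_t *m, void (*repair)(void))
{
    int rc = pthread_mutex_lock(m);
    if (rc == EOWNERDEAD) {
        /* We own the mutex now, but the protected data may be half-updated. */
        if (repair)
            repair();
        rc = pthread_mutex_consistent(m);
    }
    return rc;
}

/* Simulated crash: take the lock, start an update, terminate without unlocking. */
void *die_holding_lock(void *arg)
{
    (void)arg;
    pthread_mutex_lock(&robust_lock);
    shared_state = -1;
    return NULL;
}

void repair_state(void)
{
    shared_state = 0;
}

/* Returns 0 if the next locker recovered the mutex and repaired the state. */
int robust_demo(void)
{
    pthread_t t;
    if (robust_mutex_init(&robust_lock))
        return -1;
    if (pthread_create(&t, NULL, die_holding_lock, NULL))
        return -1;
    pthread_join(t, NULL);

    if (robust_mutex_lock(&robust_lock, repair_state) != 0)
        return -1;
    int repaired = (shared_state == 0);
    pthread_mutex_unlock(&robust_lock);

    /* After recovery the mutex behaves normally again. */
    if (robust_mutex_lock(&robust_lock, repair_state) != 0)
        return -1;
    pthread_mutex_unlock(&robust_lock);
    return repaired ? 0 : -1;
}
```

If the new owner unlocks without calling pthread_mutex_consistent(), the mutex becomes permanently unusable and later lock attempts fail with ENOTRECOVERABLE, so the repair step should not be skipped.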
The usual way is to work with signal handlers. Just catch the signals and call the cleanup functions.
But your watchdog daemon has some merits, too. It would surely make the system simpler to understand and manage. To make it simpler to administer, your application should start the daemon when it's not running, and the daemon should be able to clean up any residue from the last crash.