Multi-threaded embedded Linux application state machine design - C

Problem definition:
We are designing an application for an industrial embedded system running Linux.
The system is driven by events from the outside world. The inputs to the system could be any of the following:
A few inputs in the form of digital I/O lines (connected to the GPIOs of the processor, e.g. an e-stop).
The system runs a web server, which allows the system to be controlled via a web browser.
The system runs a TCP server. Any PC or HMI device can send commands over TCP/IP.
The system needs to drive or control RS485 slave devices over UART using Modbus. The system also needs to control a few I/O lines, such as Cooler ON/OFF. We believe that a state machine is essential to define this application. The core application shall be a multithreaded application with the following threads:
Main thread
Thread to control the RS485 slaves.
Thread to handle events from the Web interface.
Thread to handle digital I/O events.
Thread to handle commands over TCP/IP (sockets).
For inter-thread communication, we are using pthread condition variables (signal & wait). As per our initial design approach (one state machine in the main thread), any input event to the system (web, TCP/IP, or digital I/O) shall be relayed to the main thread, which shall then communicate it to the thread for which the event is destined. A typical scenario would be getting the status of an RS485 slave through the web interface. In this case, the web interface thread relays the event to the main thread, which changes state and then communicates the event to the thread that controls the RS485 slaves and waits for its reply. The main thread then sends the response back to the web interface thread.
Questions:
Should each thread have its own state machine, thereby reducing the complexity of the main thread? In that case, do we still need a state machine in the main thread?
Can a thread processing an input event communicate directly with the thread that handles the event, bypassing the main thread? For example, could the web interface thread communicate directly with the thread controlling the RS485 slaves?
Is it fine to use pthread condition variables (signal & wait) for inter-thread communication, or is there a better approach?
How can we have one thread wait for events from outside as well as for responses from other threads? For example, the web interface thread usually waits for events on a POSIX message queue used for inter-process communication with the web server's CGI binaries. The CGI binaries send events to the web interface thread through this message queue. While processing such an event, the web interface thread has to wait for responses from other threads. In that situation, it cannot process any new event from the web interface until it has finished the previous event and returned to waiting on the POSIX message queue.
Sorry for the long explanation... I hope I have presented the problem clearly enough for others to understand and help. I can provide more details if needed.

What I always try to do with such requirements is to use one state machine, run by one 'SM' thread, which could be the main thread. This thread waits on an 'EventQueue' input producer-consumer queue with a timeout. The timeout is used to run an internal delta-queue that can deliver timeout events into the state machine when they are required.
All other threads communicate their events to the state engine by pushing messages onto the EventQueue, and the SM thread processes them in a serial manner.
If an action routine in the SM decides that it must do something, it must not synchronously wait for anything, so it must request the action by pushing a request message onto the input queue of whatever thread/subsystem can perform it.
My message class (OK, struct in your C case) typically contains a 'command' enum, a 'result' enum, a data buffer pointer (in case it needs to transport bulk data), an error-message pointer (NULL if no error), and as much other state as is necessary to allow the asynchronous queueing of any kind of request and the return of the complete result, whether success or failure.
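In C such a message might be sketched roughly like this (the enum values and field names are illustrative, not part of the original design):

    #include <stddef.h>

    /* Illustrative message passed through the EventQueue. */
    typedef enum { CMD_NONE, CMD_READ_SLAVE_STATUS, CMD_SET_COOLER, CMD_SHUTDOWN } command_t;
    typedef enum { RESULT_PENDING, RESULT_OK, RESULT_FAIL } result_t;

    typedef struct message {
        command_t cmd;          /* what is being requested */
        result_t  result;       /* filled in by the handler */
        void     *data;         /* optional bulk data carried with the message */
        size_t    data_len;
        char     *error_msg;    /* NULL if no error */
        int       reply_queue;  /* identifies where to post the completed result */
    } message_t;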
This message-passing, single-SM design is the only one I have found that is capable of doing such tasks in a flexible, expandable manner without descending into a nightmare world of deadlocks, uncontrolled communications and unrepeatable, undebuggable interactions.
The first question that should be asked about any design is: 'OK, how can the system be debugged if there is some strange problem?'. In my design above, I can answer straight away: 'We log all events dequeued in the SM thread - they all come in serially, so we always know exactly what actions are taken based on them'. If any other design is suggested, ask the above question and, if a good answer is not immediately forthcoming, the design will never be got working.
So:
If a thread, or threaded subsystem, can use a separate state machine for its own INTERNAL functionality, OK, fine. These SMs should be invisible to the rest of the system.
NO!
Use pthread condition signals & wait to implement producer-consumer blocking queues (see the sketch after this list).
One input queue per thread/subsystem. All inputs go to this queue in the form of messages. The command/state in each message identifies the message and what should be done with it.
BTW, I would 100% do this in C++ unless shotgun-at-head :)
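For illustration, a minimal sketch of such a producer-consumer blocking queue in C, built on a mutex and a condition variable (unbounded here; a real version would add a capacity limit and use pthread_cond_timedwait so the SM thread can run its delta-queue timeouts):

    #include <pthread.h>
    #include <stdlib.h>

    typedef struct node {
        struct node *next;
        void *msg;                       /* points at a message_t or similar */
    } node_t;

    typedef struct {
        node_t *head, *tail;
        pthread_mutex_t lock;
        pthread_cond_t  not_empty;
    } queue_t;

    void queue_init(queue_t *q)
    {
        q->head = q->tail = NULL;
        pthread_mutex_init(&q->lock, NULL);
        pthread_cond_init(&q->not_empty, NULL);
    }

    /* Producer side: any thread may push an event/message. */
    void queue_push(queue_t *q, void *msg)
    {
        node_t *n = malloc(sizeof *n);
        n->msg = msg;
        n->next = NULL;
        pthread_mutex_lock(&q->lock);
        if (q->tail)
            q->tail->next = n;
        else
            q->head = n;
        q->tail = n;
        pthread_cond_signal(&q->not_empty);
        pthread_mutex_unlock(&q->lock);
    }

    /* Consumer side: the SM thread blocks here until an event arrives. */
    void *queue_pop(queue_t *q)
    {
        pthread_mutex_lock(&q->lock);
        while (q->head == NULL)
            pthread_cond_wait(&q->not_empty, &q->lock);
        node_t *n = q->head;
        q->head = n->next;
        if (q->head == NULL)
            q->tail = NULL;
        pthread_mutex_unlock(&q->lock);
        void *msg = n->msg;
        free(n);
        return msg;
    }

queue_pop() is what the SM thread calls at the top of its loop; every other thread only ever calls queue_push().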

I have implemented a legacy embedded library that was originally written for a clone (EC115/EC270) of the Siemens ES122C terminal controller. This library and OS included more or less what you describe. The original hardware was based on an 80186 CPU. The OS, RMOS for Siemens, FXMOS for us (don't Google it, it was never published), had all the stuff needed for basic controller work.
It had preemptive multi-tasking, task-to-task communication, semaphores, timers and I/O events, but no memory protection.
I ported that stuff to the Raspberry Pi (i.e. Linux).
I used pthreads to simulate our legacy "tasks": since we had no memory protection, threads are semantically the closest match.
The rest of the implementation then revolves around the epoll API. This means that everything generates an event. An event occurs when something happens: a timer expires, another thread sends data, a TCP socket is connected, an I/O pin changes state, etc.
This requires that all the event sources be transformed into file descriptors. Linux provides several syscalls that do exactly that:
For task-to-task communication I used classic Unix pipes.
For timer events I used the timerfd API.
For TCP communication I used normal sockets.
For serial I/O I simply opened the right device, /dev/???.
Signals are not necessary in my case, but Linux provides signalfd if needed.
I then have epoll_wait wrapped in a loop to simulate the original semantics.
It works like a charm.
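For illustration, a minimal sketch of such an epoll loop with a timerfd and one end of a pipe as event sources (the 100 ms period and the dispatch comments are placeholders, not the original code):

    #include <stdint.h>
    #include <time.h>
    #include <unistd.h>
    #include <sys/epoll.h>
    #include <sys/timerfd.h>

    int main(void)
    {
        int epfd = epoll_create1(0);

        /* Periodic 100 ms timer as one event source (period is arbitrary). */
        int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
        struct itimerspec its = {
            .it_interval = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 },
            .it_value    = { .tv_sec = 0, .tv_nsec = 100 * 1000 * 1000 },
        };
        timerfd_settime(tfd, 0, &its, NULL);

        /* Pipe used by another thread/task to post messages to this one. */
        int pfd[2];
        pipe(pfd);

        struct epoll_event ev = { .events = EPOLLIN };
        ev.data.fd = tfd;
        epoll_ctl(epfd, EPOLL_CTL_ADD, tfd, &ev);
        ev.data.fd = pfd[0];
        epoll_ctl(epfd, EPOLL_CTL_ADD, pfd[0], &ev);

        for (;;) {
            struct epoll_event events[8];
            int n = epoll_wait(epfd, events, 8, -1);
            for (int i = 0; i < n; i++) {
                if (events[i].data.fd == tfd) {
                    uint64_t expirations;
                    read(tfd, &expirations, sizeof expirations);  /* drain the timer */
                    /* ...handle the timer event... */
                } else if (events[i].data.fd == pfd[0]) {
                    char buf[128];
                    read(pfd[0], buf, sizeof buf);
                    /* ...handle the message from the other thread... */
                }
            }
        }
    }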
TL;DR
Take a deep look at the epoll API; it does what you probably need.
EDIT: Yes, and Martin James's advice is very good, especially point 4. Each thread should only ever be in a loop waiting on an event via epoll_wait.

Related

Execution Pattern of a Multi-Threaded Server on Linux

I would like to know what the execution pattern of a server's multiple threads should be to implement the TCP request-response cycle of a high-performance server (handling, say, dozens of packets with a single system call or none, on Linux using packet mmap or some other mechanism).
Design 1) For simplicity, start two threads in main() at the start of the server program. One thread just reads packets directly from the network interface(s) such as wlan0/eth0, and once a number of packets have been read in one cycle (using a while loop with poll() on Linux), it wakes the other thread with a condition variable signal call. After waking up, the other thread (the sender) processes the packets and sends them as TCP responses.
Design 2) Start the receiver thread at the start of the main program. The packet receiver thread reads packets from the interfaces using a while loop and poll(). When a number of packets have been received, it creates a sender thread and passes the number of packets received in one cycle to the sender as a parameter. The sender thread processes the packets and responds with TCP responses.
(I think Design 2 will be easier to implement, but is there any design issue or possible performance issue with this approach? That is the question.) Since the buffer passed from the receiver thread to the sender thread needs to be allocated before the packets are received, I know the size of the buffer to allocate. Also, in this execution pattern I am creating a new thread each time (which returns and ends execution after processing the packets and sending the TCP responses). I would like to know what the performance impact of this approach will be, since I am creating a new thread every time I get a batch of packets from the interfaces.
In the first approach I am not creating more than two threads (or a limited number of threads that can be tracked easily for logging and debugging, since I know how many threads are created initially). In the second approach I don't know how many threads are hanging around and executing concurrently.
I would like advice on how a real website like YouTube or others may have handled this in their high-performance servers, if they followed this way of implementing their front-facing servers.
First, when it comes to a 'real' website, the magic lies in having load balancers and a whole bunch of worker nodes to take the load; you easily exceed the boundary of a single system. For example, take a look at the AWS reference architecture for serving web pages at scale, the AWS Cloud Architecture for serving web whitepaper.
That being said, taking this one level down, it is always interesting to look at how other well-known products have solved this issue. For example, NGINX has an excellent infographic available and a matching blog post describing their architecture and threading.

C/C++ code using pthreads to execute sync and async communications

I am using a Linux machine to communicate with a PLC. The PLC and the Linux machine are connected within a local network and use UDP/IP as the base protocol. Also, the port number is fixed on both sides.
This communication needs to achieve the following:
Requirement 1: The Linux machine can send commands (one command at a time) to the PLC. After each command is received, the PLC will respond to the Linux machine with a success/failure message within 50 ms.
Requirement 2: Vice versa, the PLC can send commands to the Linux machine. The Linux machine has to respond with a message within 50 ms. The PLC sends asynchronously with respect to the Linux machine, therefore the Linux machine needs to monitor (or listen on) the port continuously.
Simple C/C++ code has been used to test the communication separately for each of the aforementioned requirements. It worked, but it used blocking calls.
Here comes the challenging part. I would like to use pthreads for this communication. My solution is simply to create two threads, one for each requirement. I sketched my idea in the attached picture https://www.dropbox.com/s/vriyrprl7j6tntx/multi-thread%20solution.png?dl=0, with 'thread 0' denoting the main thread, 'thread 1' denoting the Requirement 1 thread and 'thread 2' denoting the Requirement 2 thread. 'Shared data' indicates data that can be shared by all the child threads. 'Thread 1 data' is dedicated to thread 1, and other threads will not access it. Likewise, 'thread 2 data' is only used by thread 2.
My concern arises from the fact that two threads will be making system calls on the same port. Hence, I need reviews of my solution, and it would be awesome to get more working solutions. P.S. I am not too worried about thread synchronization and creation, and it is totally fine with me if thread sync and creation are necessary in your solution.
Thanks in advance.
There is no general problem with two threads executing system calls on the same socket. You may encounter some specific issues, though:
If you call recvfrom() in both threads (one waiting for the PLC to send a request, and the other waiting for the PLC to respond to a command from the server), you don't know which one will receive the response. To get around this, you can dedicate one thread to reading from the PLC, and have it pass reply messages from the PLC to the sending thread using a shared queue or similar structure.
You have to be careful when you close a socket that could be in use by another thread - because of the way file descriptors are reused, it's easy to have a race condition that ends up with a thread acting on the wrong socket.
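A rough sketch of the single-reader arrangement, assuming a fixed port and PLC address (both made up here), with all error handling and the actual protocol omitted:

    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <pthread.h>
    #include <unistd.h>

    static int sock;  /* one UDP socket shared by both threads */

    /* Dedicated receive thread: the only caller of recvfrom() on this socket.
     * It decides whether a datagram is a new request from the PLC or a reply
     * to a command we sent, and hands it to the right place (e.g. a queue). */
    static void *rx_thread(void *arg)
    {
        char buf[512];
        for (;;) {
            struct sockaddr_in from;
            socklen_t fromlen = sizeof from;
            ssize_t n = recvfrom(sock, buf, sizeof buf, 0,
                                 (struct sockaddr *)&from, &fromlen);
            if (n > 0) {
                /* dispatch: push a reply to the sender's queue, or handle a request */
            }
        }
        return NULL;
    }

    int main(void)
    {
        sock = socket(AF_INET, SOCK_DGRAM, 0);

        struct sockaddr_in local = { .sin_family = AF_INET,
                                     .sin_port = htons(5000),            /* assumed port */
                                     .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(sock, (struct sockaddr *)&local, sizeof local);

        pthread_t rx;
        pthread_create(&rx, NULL, rx_thread, NULL);

        /* Sending from this (or any other) thread is fine: sendto() on the
         * same socket does not conflict with the blocked recvfrom() above. */
        struct sockaddr_in plc = { .sin_family = AF_INET,
                                   .sin_port = htons(5000) };
        inet_pton(AF_INET, "192.168.0.10", &plc.sin_addr);               /* assumed PLC address */
        const char cmd[] = "STATUS?";
        sendto(sock, cmd, sizeof cmd, 0, (struct sockaddr *)&plc, sizeof plc);

        pthread_join(rx, NULL);
        return 0;
    }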

c linux multithreading networking

I have a network application on a gateway. It receives and sends packets. For most of them, my gateway acts as a router, but in some cases it is also the destination of the packets.
Should I have:
only one main thread
a main thread + a dispatch thread in charge of handing each packet to the correct flow handler
as many threads as there are flows
something else?
Doing multithreading correctly is no simple matter; in many cases a solution based on select() and friends will be a whole lot easier to create.
Your case sounds a lot like a typical Unix service daemon. The popular solution to your problem is not to use threads, but forked processes.
The idea is that your program listens on a socket and waits for connections. As soon as a connection arrives, it forks. The child process then goes on to process the connection, while the parent process just continues in the loop and waits for incoming connections.
Advantages over threading:
Very simple program design
No problems with concurrency
Established method for Unix/Linux systems
Disadvantages:
Things get complicated when several connections interact with each other (your use case doesn't sound like they would)
Performance penalty on Windows systems (not on Unix systems!)
You can find many code examples online.
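For illustration, here is a minimal sketch of such an accept-and-fork loop over TCP (the port number is arbitrary and error handling is omitted):

    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <signal.h>
    #include <unistd.h>

    static void handle_connection(int conn)
    {
        /* read the request, send the response ... */
        close(conn);
    }

    int main(void)
    {
        signal(SIGCHLD, SIG_IGN);          /* let the kernel reap exited children */

        int lsock = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(8000),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(lsock, (struct sockaddr *)&addr, sizeof addr);
        listen(lsock, 16);

        for (;;) {
            int conn = accept(lsock, NULL, NULL);
            if (conn < 0)
                continue;
            pid_t pid = fork();
            if (pid == 0) {                /* child: handle this connection */
                close(lsock);
                handle_connection(conn);
                _exit(0);
            }
            close(conn);                   /* parent: back to accept() */
        }
    }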
I don't know much about networking applications, but I think it's like this:
If you have the ability to react asynchronously to the requests, you would probably use just one single thread (as in Node.js). If you cannot react asynchronously, the main thread would always block the other actions.
If you are not able to react asynchronously to your requests, you have to use more than one thread. But you could achieve that in many different ways: you could create a thread for every request, or create a limited number of threads and then assign them to your requests.
My personal preference is to use one main thread and one worker thread per connection, with no cap whatsoever. I am assuming that your server will be stateless, like an HTTP server.
For stateful servers you will have to figure out some way to control the number of threads.
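A minimal sketch of that thread-per-connection pattern, with detached workers and a placeholder handler:

    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <pthread.h>
    #include <stdlib.h>
    #include <unistd.h>

    static void *worker(void *arg)
    {
        int conn = *(int *)arg;
        free(arg);
        /* read the request, send the response ... */
        close(conn);
        return NULL;
    }

    int main(void)
    {
        int lsock = socket(AF_INET, SOCK_STREAM, 0);
        struct sockaddr_in addr = { .sin_family = AF_INET,
                                    .sin_port = htons(8000),
                                    .sin_addr.s_addr = htonl(INADDR_ANY) };
        bind(lsock, (struct sockaddr *)&addr, sizeof addr);
        listen(lsock, 16);

        for (;;) {
            int conn = accept(lsock, NULL, NULL);
            if (conn < 0)
                continue;
            int *arg = malloc(sizeof *arg);
            *arg = conn;
            pthread_t tid;
            pthread_create(&tid, NULL, worker, arg);
            pthread_detach(tid);           /* no join: thread cleans up on exit */
        }
    }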

Implementing correct inter-module synchronization in Linux kernel

I'm implementing a custom serial bus driver for a certain ARM-based Linux board (a custom UART driver, actually). This driver shall enable communication with a certain MCU on the other end of the bus via a custom protocol. The driver will not (and actually must not) expose any of its functions to userspace, nor is it possible to implement it in userspace at all (hence the need for a custom driver instead of using the stock TTY subsystem).
The driver will implement the communication protocol and the UART reads/writes, and it has to export a set of higher-level functions to its users to allow them to communicate with the MCU (e.g. read_register(), drive_gpios(), all that stuff). There will be only one user of this module.
The calling module will have to wait for the completion of the operations (the aforementioned read_register() and others). I'm currently considering using semaphores: the user module calls my driver's function, which initiates the transfer and waits on a semaphore; the IRQ handler of my driver sends requests to the MCU and reads the answers and, when done, posts to the semaphore, thus waking up the calling module. But I'm not really familiar with kernel programming, and I'm baffled by the multitude of possible alternative implementations (tasklets? wait queues?).
The question is: is my semaphore-based approach OK, or too naïve? What are the possible alternatives? Are there any pitfalls I may be missing?
Traditionally IRQ handling in Linux is done in two parts:
So called "upper-half" is actual working in IRQ context (IRQ handler itself). This part must exit as fast as possible. So it basically checks interrupt source and then starts bottom-half.
"Bottom-half". It may be implemented as work queue. It is where actual job is done. It runs in normal context, so it can use blocking functions, etc.
If you only want to wait for IRQ in your worker thread, better to use special object called completion. It is exactly created for this task.
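A rough sketch of that completion-based pattern in kernel C; the transfer and parsing functions, and the exact signatures, are hypothetical placeholders for whatever the driver actually does:

    #include <linux/completion.h>
    #include <linux/interrupt.h>
    #include <linux/jiffies.h>
    #include <linux/errno.h>
    #include <linux/types.h>

    /* Implemented elsewhere in the driver; both are hypothetical placeholders. */
    static void start_uart_transfer(u8 reg);
    static u16 parse_answer(void);

    static DECLARE_COMPLETION(xfer_done);
    static u16 reg_value;                    /* filled in by the IRQ path */

    /* Called by the single user module; sleeps until the IRQ path signals
     * completion (signature is an assumption, not the actual API). */
    int read_register(u8 reg)
    {
        reinit_completion(&xfer_done);
        start_uart_transfer(reg);            /* kick off the request to the MCU */
        if (!wait_for_completion_timeout(&xfer_done, msecs_to_jiffies(100)))
            return -ETIMEDOUT;
        return reg_value;
    }

    /* IRQ handler (or its bottom half): parses the MCU's answer and wakes the caller. */
    static irqreturn_t uart_irq(int irq, void *dev_id)
    {
        reg_value = parse_answer();
        complete(&xfer_done);
        return IRQ_HANDLED;
    }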

What have you used SysV/POSIX message queues for?

I've never seen any project utilizing POSIX or SysV message queues - so, being curious, what problems or projects have you used them for?
I had a series of commands that needed to be executed in order, but the main program flow did not depend on their completion so I queued them up and passed them to another process via a System V message queue to be executed independently of the main program. Since message queues provide an asynchronous communications protocol, they were a good fit for this task.
To be honest, I used System V message queues because I had never used them before and I wanted to. I'm sure there are other IPC methods I could have used.
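A minimal sketch of handing a command to another process through a System V message queue, roughly along those lines (the key, permissions and command format are illustrative):

    #include <sys/ipc.h>
    #include <sys/msg.h>
    #include <stdio.h>

    struct cmd_msg {
        long mtype;          /* required first field, must be > 0 */
        char command[128];
    };

    int main(void)
    {
        /* Both processes derive the same key, e.g. from a well-known path. */
        key_t key = ftok("/tmp", 'q');               /* illustrative key */
        int qid = msgget(key, IPC_CREAT | 0600);

        struct cmd_msg msg = { .mtype = 1 };
        snprintf(msg.command, sizeof msg.command, "backup --incremental");

        /* Fire and forget: the executor process msgrcv()s and runs the command. */
        msgsnd(qid, &msg, sizeof msg.command, 0);
        return 0;
    }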
It's been a while since I've done any real VxWorks programming, but you can also find message queues used in VxWorks applications. According to the VxWorks Application Programmer's Guide (Google search), the primary intertask communication mechanism within a single CPU is message queues. VxWorks uses two message queue subroutine libraries (POSIX and VxWorks).
I once wrote a text-mode I/O generator utility that had one thread in charge of updating the UI and a number of worker threads to do the actual I/O work. When a worker thread completed an I/O, it sent an update message to the UI thread. I implemented this message system using a POSIX message queue.
Why implement it like this? It sounded like a good idea at the time, and I was curious about how they worked. I figured I could solve the problem and learn something at the same time. There were many different techniques I could have used, and I don't suppose there was any profound reason why I chose this technique. I didn't realize it until later, but I was glad I used a POSIX queue when I had to port the utility to another system (it was also POSIX compliant, so I didn't have to worry about porting external libraries to get my app to run).
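A rough sketch of that worker-to-UI notification scheme using a POSIX message queue (the queue name and message layout are made up for illustration):

    /* Link with -lrt on older glibc. */
    #include <mqueue.h>
    #include <fcntl.h>
    #include <stdio.h>

    #define UI_QUEUE "/io_ui_updates"        /* illustrative queue name */

    struct ui_update {
        int  worker_id;
        long bytes_done;
    };

    /* Worker thread: post a progress update to the UI thread. */
    void report_progress(int worker_id, long bytes_done)
    {
        mqd_t q = mq_open(UI_QUEUE, O_WRONLY);
        struct ui_update u = { .worker_id = worker_id, .bytes_done = bytes_done };
        mq_send(q, (const char *)&u, sizeof u, 0);
        mq_close(q);
    }

    /* UI thread: create the queue and block waiting for updates. */
    void ui_loop(void)
    {
        struct mq_attr attr = { .mq_maxmsg = 16, .mq_msgsize = sizeof(struct ui_update) };
        mqd_t q = mq_open(UI_QUEUE, O_CREAT | O_RDONLY, 0600, &attr);
        for (;;) {
            struct ui_update u;
            if (mq_receive(q, (char *)&u, sizeof u, NULL) == sizeof u)
                printf("worker %d: %ld bytes done\n", u.worker_id, u.bytes_done);
        }
    }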
You can certainly use them for IPC, because they are an IPC mechanism. With this mechanism you can write multi-process event-processing applications in which all of the processes use the queue and each of them waits for a particular type of message (a particular event to occur). When the message arrives, that process takes the message, processes it and puts the result back into the queue so that another process can use it.
I once wrote such an application using message queues. They are pretty easy to work with and do not need inter-process synchronization mechanisms such as semaphores. You can use them in place of shared memory or memory-mapped files as well; in situations in which all you need is to send a structure or some kind of packed data to other processes, message queues are far easier to use than any other IPC mechanism.
This book contains all information you need to know about Message Queues and other IPC mechanisms in Linux.
