C - simultaneous receiving and handling data from Unix sockets

I have a C program which communicates with PHP through Unix Sockets. The procedure is as follows: PHP accepts a file upload by the user, then sends a "signal" to C which then dispatches another process (fork) to unzip the file (I know this could be handled by PHP alone, this is just an example; the whole problem is more complex).
The problem is that I don't want to have more than say 4 processes running at the same time. I think this could be solved like this: C, when it gets a new "task" from PHP dumps it on a queue and handles them one-by-one (assuring that there are no more than 4 running) while still listening on the socket.
I'm unsure how to achieve this though, as I cannot do that in the same process (or can I)? I have thought I could have another child process for managing the queue which would be accessible by the parent with the use of shared memory, but that seems overly complicated. Is there any other way around?
Thanks in advance.

If you need a separate process for each task handler, then you might consider having five separate processes. The first one is the listener: it handles new incoming tasks and places them into a queue. Each task handler sends a request for work when it starts up, and again each time it finishes processing a task. When the listener receives such a request, it delivers the next task on the queue to the task handler, or, if the task queue is empty, it places the task handler on a handler queue. When the task queue transitions from empty to non-empty, the listener checks whether there is a ready task handler in the handler queue; if so, it takes that task handler out of the handler queue and delivers the task from the task queue to it.
The PHP process would put tasks to the listener, while the task handlers would get tasks from the listener. The listener simply waits for put or get requests, and processes them. You can think of the listener as a simple web server, but each of the socket connections to the PHP process and to each task handler can be persistent.
Since the number of sockets is small and they are persistent, any of the multiplexing calls could work (select, poll, epoll, kqueue, or whatever is best and/or available on your system), but it may be easiest to use a separate thread to handle each socket synchronously. The ready-task-handler queue would then be a semaphore or a condition variable on the task queue. The thread that handles puts from the PHP process would place tasks on the task queue and up the semaphore. Each thread that handles a ready task handler would down the semaphore and then take a task off the task queue. The task queue itself may need mutual-exclusion protection depending on how it is implemented.
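A rough sketch of that semaphore-plus-queue arrangement inside the listener, assuming pthreads and POSIX semaphores; the task_t layout and function names are illustrative, not from the question:

/* Task queue shared by the listener's threads: the PHP-facing thread enqueues,
 * the four handler-facing threads dequeue. Names are illustrative only. */
#include <pthread.h>
#include <semaphore.h>

typedef struct task {
    char path[256];                       /* e.g. the uploaded file to unzip */
    struct task *next;
} task_t;

static task_t *queue_head = NULL, *queue_tail = NULL;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t tasks_available;             /* counts queued tasks */

void queue_init(void) {
    sem_init(&tasks_available, 0, 0);     /* queue starts empty */
}

/* Called by the thread that reads "put" requests from the PHP socket. */
void enqueue_task(task_t *t) {
    t->next = NULL;
    pthread_mutex_lock(&queue_lock);
    if (queue_tail) queue_tail->next = t; else queue_head = t;
    queue_tail = t;
    pthread_mutex_unlock(&queue_lock);
    sem_post(&tasks_available);           /* "up" the semaphore */
}

/* Called by each handler-facing thread when its task handler asks for work. */
task_t *dequeue_task(void) {
    sem_wait(&tasks_available);           /* "down": blocks while the queue is empty */
    pthread_mutex_lock(&queue_lock);
    task_t *t = queue_head;
    queue_head = t->next;
    if (queue_head == NULL) queue_tail = NULL;
    pthread_mutex_unlock(&queue_lock);
    return t;
}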

Related

How to handle shared memory in a multi threaded environment?

I have a client-server model. A multithreaded client sends a message to the server over the TCP sockets. The server is also multiple threaded with each request handled by a thread from the worker pool.
Now, the server must send back the message to the client via shared-memory IPC. For example:
multi threaded client --- GET /a.png --> server
                                           |
                                       one worker
                                           |
                                           v
                        puts the file descriptor into the shared memory
When worker thread adds the information into the shared memory, how do I make sure that it is read by the same client that requested it?
I feel clueless here as to how to proceed. Currently, I have created one segment of shared memory and there are 20 threads on the server and 10 threads on the client.
While you can use IPC between threads, it's generally not a good idea. Threads share all memory anyway since they are part of the same process and there are very efficient mechanisms for communications between threads.
It might just be easier to have the same thread handle a request all the way through. That way, you don't have to hand off a request from thread to thread. However, if you have a pool of requests that are being worked on, it often makes sense to have a thread be able to "put down" a request and then later be able to have that thread or a different thread "pick up" the request.
The easiest way to do this is to make all the information related to the request live in a single structure or object. Use standard thread synchronization tools (like mutexes) to control finding the object, taking ownership of it, and so on.
So when an I/O thread receives a request, it creates a new request object, acquires a mutex, and adds it to the global collection of requests the server is working on. Worker threads can check this global collection to see which requests need work or they can be explicitly dispatched by the thread that created the request.
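A minimal sketch of that idea, assuming pthreads; the request fields and function names are illustrative, and in practice you would pair the collection with a condition variable so workers sleep instead of polling:

/* All per-request state lives in one struct, guarded by a mutex, in a global
 * list that worker threads scan for work. Names and fields are examples only. */
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

enum req_state { REQ_NEW, REQ_IN_PROGRESS, REQ_DONE };

typedef struct request {
    int client_fd;               /* which client connection to answer */
    char uri[128];               /* e.g. "/a.png" */
    enum req_state state;
    int result_fd;               /* filled in by the worker */
    struct request *next;
} request_t;

static request_t *requests = NULL;
static pthread_mutex_t requests_lock = PTHREAD_MUTEX_INITIALIZER;

/* I/O thread: create the request object and add it to the global collection. */
void submit_request(int client_fd, const char *uri) {
    request_t *r = calloc(1, sizeof *r);
    r->client_fd = client_fd;
    strncpy(r->uri, uri, sizeof r->uri - 1);
    r->state = REQ_NEW;
    pthread_mutex_lock(&requests_lock);
    r->next = requests;
    requests = r;
    pthread_mutex_unlock(&requests_lock);
}

/* Worker thread: take ownership of the first request that still needs work. */
request_t *claim_request(void) {
    pthread_mutex_lock(&requests_lock);
    request_t *r;
    for (r = requests; r != NULL; r = r->next) {
        if (r->state == REQ_NEW) { r->state = REQ_IN_PROGRESS; break; }
    }
    pthread_mutex_unlock(&requests_lock);
    return r;                    /* NULL if nothing needs work right now */
}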

LibUV multi-threaded handler architecture design

I'm trying to architect the main event handling of a libuv-based application. The application is composed of one (or more) UDP receivers sharing a socket, whose job is to delegate the processing of incoming messages to a common worker pool.
As the protocol handled is stateful, all packets coming from any given server should always be directed to the same worker – this constraint seems to make using the LibUV built-in worker pool impossible.
The workers should also be able to send packets themselves.
As such, and as I am new to LibUV, I wanted to share with you the intended architecture, in order to get feedback and best practices about it.
– Each worker runs its very own LibUV loop, allowing it to send packets directly over the network. Additionally, each worker has a dedicated concurrent queue for sending it messages.
– When a packet is received, its source address is hashed to select the corresponding worker from the pool.
– The receiver creates a unique async handle on the receiver loop, to act as a callback when processing has finished.
– The receiver notifies the worker via an async handle that a new message is available, which wakes up the worker, which then processes all enqueued messages (a rough sketch of this hand-off follows the list).
– The worker thread calls the async handle on the receiver loop, which causes the receiver to return the buffer to the pool and free all allocated resources (so the pool does not need to be thread-safe).
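For reference, the hand-off described above could look roughly like this with libuv's uv_async_t; the worker struct, queue, and function names are placeholders, not an actual implementation:

/* Rough sketch: the receiver pushes a message onto the worker's mutex-protected
 * inbox and wakes the worker's loop with uv_async_send(). */
#include <uv.h>
#include <pthread.h>

typedef struct msg { void *packet; struct msg *next; } msg_t;

typedef struct worker {
    uv_loop_t loop;              /* each worker runs its own loop in its own thread */
    uv_async_t wakeup;           /* async handle registered on that loop */
    pthread_mutex_t lock;        /* stand-in for a real concurrent queue */
    msg_t *head, *tail;
} worker_t;

/* Runs on the worker's loop whenever uv_async_send() was called from any thread. */
static void on_wakeup(uv_async_t *handle) {
    worker_t *w = handle->data;
    pthread_mutex_lock(&w->lock);
    msg_t *m = w->head;
    w->head = w->tail = NULL;
    pthread_mutex_unlock(&w->lock);
    while (m) {                  /* drain and process all enqueued messages */
        msg_t *next = m->next;
        /* ... process m->packet, then notify the receiver loop back ... */
        m = next;
    }
}

static void worker_init(worker_t *w) {
    uv_loop_init(&w->loop);
    uv_async_init(&w->loop, &w->wakeup, on_wakeup);
    w->wakeup.data = w;
    pthread_mutex_init(&w->lock, NULL);
    w->head = w->tail = NULL;
    /* the worker thread then calls uv_run(&w->loop, UV_RUN_DEFAULT) */
}

/* Called from the receiver thread; uv_async_send() is safe to call cross-thread. */
static void post_to_worker(worker_t *w, msg_t *m) {
    m->next = NULL;
    pthread_mutex_lock(&w->lock);
    if (w->tail) w->tail->next = m; else w->head = m;
    w->tail = m;
    pthread_mutex_unlock(&w->lock);
    uv_async_send(&w->wakeup);
}

uv_async_send() is the built-in way to poke another loop from a different thread; note that libuv may coalesce several sends into a single callback, which is why the callback drains the whole queue.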
The main questions I have would be:
– What is the overhead of creating an async handle for each received message? Is it a good design?
– Is there any built-in way to send a message to another event loop?
– Would it be better to send outgoing packets using another loop, instead of doing it right from the worker loop?
Thanks.

How can I subscribe to a channel and then do something else without blocking?

I am using redis pub/sub to do some real-time processing.
On the subscriber end, I want to subscribe to a specified channel, then do some other computations. I am under the impression that if I send a subscribe command to the server, it will block the code.
So how can I do something else, and, when a subscribed message arrives, process it via a callback handler?
You need two different connections to do that. This was a design choice: when you SUBSCRIBE / PSUBSCRIBE, the connection semantics actually changes from request-response to push-style, so it is not suitable for running other commands without implementing a more complex semantics like that of, for example, the IMAP protocol.
The first point is to dedicate a Redis connection to the subscriptions. Once SUBSCRIBE or PSUBSCRIBE has been applied on a connection, only subscription-related commands can be used on it. So in a C program, you need at least one connection for your subscription(s), and one connection to do anything else with Redis.
Then you need to also find a way to handle those two connections from the C program. Several solutions are possible. For instance:
use multi-threading, and dedicate a thread to the connection responsible for subscriptions. On reception of a new event, the dedicated thread should post a message to your main application thread, which will activate a callback.
use a non-blocking, asynchronous API. Hiredis comes with event loop adapters. You need an event loop to handle the connections to Redis (including the one dedicated to subscriptions). Upon reception of a publication event, the associated callback will be triggered directly by the event loop. Here is an example of subscription with hiredis and libevent.
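For illustration, a minimal sketch of that second approach, assuming hiredis' async API with its libevent adapter; the host, port and channel name are placeholders:

/* Dedicate one async hiredis connection to SUBSCRIBE and let the libevent
 * loop invoke a callback on every published message. */
#include <stdio.h>
#include <string.h>
#include <event2/event.h>
#include <hiredis/async.h>
#include <hiredis/adapters/libevent.h>

static void on_message(redisAsyncContext *c, void *r, void *privdata) {
    redisReply *reply = r;
    (void)c; (void)privdata;
    if (reply == NULL || reply->type != REDIS_REPLY_ARRAY || reply->elements < 3)
        return;
    if (strcmp(reply->element[0]->str, "message") != 0)
        return;                              /* skip the subscribe confirmation */
    printf("channel %s: %s\n", reply->element[1]->str, reply->element[2]->str);
}

int main(void) {
    struct event_base *base = event_base_new();

    redisAsyncContext *sub = redisAsyncConnect("127.0.0.1", 6379);
    if (sub == NULL || sub->err) {
        fprintf(stderr, "connection error\n");
        return 1;
    }
    redisLibeventAttach(sub, base);
    redisAsyncCommand(sub, on_message, NULL, "SUBSCRIBE %s", "updates");

    /* Any other Redis work would use a second (non-subscribed) connection,
     * attached to the same event_base. */
    event_base_dispatch(base);
    return 0;
}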

Libevent: how to close all open sockets on shutdown?

I have created a simple HTTP proxy using libevent. It can be shutdown by sending it a SIGHUP signal which is caught by the signal handler. The shutdown function calls event_base_loopexit, frees structures and other heap allocations and exits.
The problem is if a SIGHUP is caught when a connection is open. I need to be able to close the socket, ideally invoking the same close function that is called when a close event is caught.
Is there a correct or standard way to do this?
At the moment, the only thing I can think of is to keep a linked list of connections and simply iterate through this on shutdown, closing each in turn.
> At the moment, the only thing I can think of is to keep a linked list of connections and simply iterate through this on shutdown, closing each in turn.
That's what you have to do.
(Note that sockets are closed when the application exits. But if you need to do custom logic on all the connections on shutdown, you need to keep track of them and iterate through them.)
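As a rough sketch (not the asker's code), the bookkeeping could look like this with bufferevents; the struct and function names are made up, and removing entries when a connection closes normally is omitted for brevity:

/* Track every accepted connection in a list so the SIGHUP handler can run the
 * same close logic on each before leaving the event loop. */
#include <stdlib.h>
#include <event2/event.h>
#include <event2/bufferevent.h>

struct conn {
    struct bufferevent *bev;
    struct conn *next;
};

static struct conn *connections = NULL;      /* head of the open-connection list */

/* Call this from the accept/connect callback for every new connection. */
static void track_conn(struct bufferevent *bev) {
    struct conn *c = malloc(sizeof *c);
    c->bev = bev;
    c->next = connections;
    connections = c;
}

/* Registered with evsignal_new(base, SIGHUP, on_sighup, base). */
static void on_sighup(evutil_socket_t sig, short events, void *arg) {
    struct event_base *base = arg;
    (void)sig; (void)events;
    while (connections) {
        struct conn *c = connections;
        connections = c->next;
        bufferevent_free(c->bev);            /* closes the fd if BEV_OPT_CLOSE_ON_FREE was set */
        free(c);
    }
    event_base_loopexit(base, NULL);
}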

Multi threaded embedded linux application state machine design

Problem definition:
We are designing an application for an industrial embedded system running Linux.
The system is driven by events from the outside world. The inputs to the system could be any of the following:
A few inputs to the system in the form of digital I/O lines (connected to the GPIOs of the processor, e.g. an e-stop).
The system runs a web server which allows the system to be controlled via a web browser.
The system runs a TCP server. Any PC or HMI device can send commands over TCP/IP.
The system needs to drive or control RS485 slave devices over UART using Modbus. The system also needs to control a few I/O lines, like Cooler ON/OFF. We believe that a state machine is essential to define this application. The core application shall be a multi-threaded application with the following threads:
Main thread
Thread to control the RS485 slaves.
Thread to handle events from the Web interface.
Thread to handle digital I/O events.
Thread to handle commands over TCP/IP (sockets).
For inter-thread communication, we are using pthread condition signal & wait. As per our initial design approach (one state machine in the main thread), any input event to the system (web, TCP/IP or digital I/O) shall be relayed to the main thread, which shall communicate it to the appropriate thread for which the event is destined. A typical scenario would be getting the status of an RS485 slave through the web interface. In this case, the web interface thread shall relay the event to the main thread, which shall change state and then communicate the event to the thread that controls the RS485 slaves, which responds back. The main thread shall then send the response back to the web interface thread.
Questions:
1. Should each thread have its own state machine, thereby reducing the complexity of the main thread? In that case, do we still need a state machine in the main thread?
2. Can a thread processing an input event communicate directly with the thread that handles the event, bypassing the main thread? For example, could the web interface thread communicate directly with the thread controlling the RS485 slaves?
3. Is it fine to use pthread condition signals & wait for inter-thread communication, or is there a better approach?
4. How can we have one thread wait both for events from outside and for responses from other threads? For example, the web interface thread usually waits for events on a POSIX message queue used for inter-process communication with the web server CGI bins. The CGI bins send events to the web interface thread through this message queue. When processing such an event, the web interface thread would wait for responses from other threads. In that situation, it couldn't process any new event from the web interface until it had completed processing the previous event and got back to waiting on the POSIX message queue.
Sorry for the long explanation... I hope I have put it forward in the best possible way for others to understand and help me.
I could give more inputs if needed.
What I always try to do with such requirements is to use one state machine, run by one 'SM' thread, which could be the main thread. This thread waits on an 'EventQueue' input producer-consumer queue with a timeout. The timeout is used to run an internal delta-queue that can provide timeout events into the state machine when they are required.
All other threads communicate their events to the state engine by pushing messages onto the EventQueue, and the SM thread processes them in a serial manner.
If an action routine in the SM decides that it must do something, it must not synchronously wait for anything; instead it must request the action by pushing a request message onto the input queue of whatever thread/subsystem can perform it.
My message class (OK, struct in your C case) typically contains a 'command' enum, a 'result' enum, a data buffer pointer (in case it needs to transport bulk data), an error-message pointer (null if no error), and as much other state as is necessary to allow the asynchronous queueing up of any kind of request and returning the complete result (whether success or fail).
This message-passing, one SM design is the only one I have found that is capable of doing such tasks in a flexible, expandable manner without entering into a nightmare world of deadlocks, uncontrolled communications and unrepeatable, undebuggable interactions.
The first question that should be asked about any design is 'OK, how can the system be debugged if there is some strange problem?'. In my design above, I can answer straightaway: 'we log all events dequeued in the SM thread - they all come in serially, so we always know exactly what actions are taken based on them'. If any other design is suggested, ask the above question and, if a good answer is not immediately forthcoming, it will never be made to work.
So:
1. If a thread, or threaded subsystem, can use a separate state machine to do its own INTERNAL functionality, OK, fine. These SMs should be invisible to the rest of the system.
2. NO!
3. Use the pthread condition signals & wait to implement producer-consumer blocking queues.
4. One input queue per thread/subsystem. All inputs go to that queue in the form of messages. Command/state fields in each message identify the message and what should be done with it.
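For illustration only, a minimal pthread sketch of such a message struct and producer-consumer blocking queue; field and function names are mine, not from the answer:

/* Messages carried between threads, and a blocking queue built on a mutex
 * plus condition variable. Illustrative sketch, not production code. */
#include <pthread.h>

typedef struct message {
    int command;             /* 'command' enum: what the receiver should do */
    int result;              /* 'result' enum: filled in on completion */
    void *data;              /* optional bulk payload */
    char *error;             /* NULL if no error */
    struct message *next;
} message_t;

typedef struct {
    message_t *head, *tail;
    pthread_mutex_t lock;
    pthread_cond_t not_empty;
} msg_queue_t;

void queue_init(msg_queue_t *q) {
    q->head = q->tail = NULL;
    pthread_mutex_init(&q->lock, NULL);
    pthread_cond_init(&q->not_empty, NULL);
}

void queue_push(msg_queue_t *q, message_t *m) {   /* producer side */
    m->next = NULL;
    pthread_mutex_lock(&q->lock);
    if (q->tail) q->tail->next = m; else q->head = m;
    q->tail = m;
    pthread_cond_signal(&q->not_empty);
    pthread_mutex_unlock(&q->lock);
}

message_t *queue_pop(msg_queue_t *q) {            /* consumer blocks until work arrives */
    pthread_mutex_lock(&q->lock);
    while (q->head == NULL)
        pthread_cond_wait(&q->not_empty, &q->lock);
    message_t *m = q->head;
    q->head = m->next;
    if (q->head == NULL) q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return m;
}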
BTW, I would 100% do this in C++ unless shotgun-at-head :)
I have implemented a legacy embedded library that was originally written for a clone (EC115/EC270) of the Siemens ES122C terminal controller. This library and OS included more or less what you describe. The original hardware was based on an 80186 CPU. The OS, RMOS for Siemens, FXMOS for us (don't google it, it was never published), had all the stuff needed for basic controller work.
It had preemptive multi-tasking, task-to-task communication, semaphores, timers and I/O events, but no memory protection.
I ported that stuff to RaspberryPi (i.e. Linux).
I used pthreads to simulate our legacy "tasks": since we had no memory protection, threads are semantically the closest.
The rest of the implementation then revolved around the epoll API. This means that everything generates an event. An event is when something happens: a timer expires, another thread sends data, a TCP socket is connected, an I/O pin changes state, etc.
This requires that all the event sources be transformed into file descriptors. Linux provides several syscalls that do exactly that:
for task to task communication I used classic Unix pipes.
for timer events I used timerfd API.
for TCP communication I used normal sockets.
for serial I/O I simply opened the right device /dev/???.
signals are not necessary in my case but Linux provides 'signalfd' if necessary.
I then wrapped epoll_wait to simulate the original semantics.
It works like a charm.
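For illustration, a stripped-down sketch of that pattern (one thread multiplexing a timerfd and a pipe through epoll_wait); this is not the author's code and error handling is omitted:

/* One thread waits on several event sources turned into file descriptors:
 * a periodic timerfd and a pipe another thread writes into. */
#include <sys/epoll.h>
#include <sys/timerfd.h>
#include <unistd.h>
#include <stdio.h>
#include <stdint.h>

int main(void) {
    int ep = epoll_create1(0);

    /* Timer event source: fires every second. */
    int tfd = timerfd_create(CLOCK_MONOTONIC, 0);
    struct itimerspec its = { .it_interval = {1, 0}, .it_value = {1, 0} };
    timerfd_settime(tfd, 0, &its, NULL);

    /* Task-to-task event source: another thread writes to pipefd[1]. */
    int pipefd[2];
    pipe(pipefd);

    struct epoll_event ev = { .events = EPOLLIN, .data.fd = tfd };
    epoll_ctl(ep, EPOLL_CTL_ADD, tfd, &ev);
    ev.data.fd = pipefd[0];
    epoll_ctl(ep, EPOLL_CTL_ADD, pipefd[0], &ev);

    for (;;) {
        struct epoll_event events[8];
        int n = epoll_wait(ep, events, 8, -1);
        for (int i = 0; i < n; i++) {
            if (events[i].data.fd == tfd) {
                uint64_t expirations;
                read(tfd, &expirations, sizeof expirations);
                printf("timer event\n");
            } else if (events[i].data.fd == pipefd[0]) {
                char buf[64];
                read(pipefd[0], buf, sizeof buf);
                printf("message from another thread\n");
            }
        }
    }
}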
TL;DR
take a deep look at the epoll API; it does what you probably need.
EDIT: Yes, and the advice from Martin James is very good, especially point 4: each thread should only ever be in a loop waiting on an event via epoll_wait.

Resources