Shared data queue between processes - c

I have a C program that currently uses multiple threads to process data. I use a GLib GAsyncQueue for the producer threads to send their data to consumer threads. Now I need to move the threads into independent processes and I'm not sure how to proceed with pushing data between them. Using pipes does not seem very suitable for my task since the amount of data being pushed is rather large. Another option is to obtain a piece of shared memory but, since calculating an upper bound on the amount of shared data is a little difficult, this option is less than attractive.
Do you know of something like GAsyncQueue that can be used with multiple processes? Since I'm already using GLib, I prefer to use its facilities, but I'm open to using other libraries if they provide what I need.

POSIX specifies the msgget(2)/msgsnd(2) message-queue interface, though the message and queue sizes may be smaller than you wish. (Linux allows you to modify the sizes with the /proc/sys/kernel/msgmax and /proc/sys/kernel/msgmnb tunable files; the defaults are 8k and 16k.)
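For illustration, a minimal sketch of that interface, assuming a fixed key of 0x1234 and a 512-byte payload; error handling is omitted:

#include <string.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct message {
    long mtype;          /* must be > 0 */
    char mtext[512];     /* payload; keep below the msgmax limit */
};

int main(void)
{
    int qid = msgget(0x1234, IPC_CREAT | 0600);  /* both processes use the same key */

    struct message out = { .mtype = 1 };
    strcpy(out.mtext, "hello");
    msgsnd(qid, &out, sizeof out.mtext, 0);      /* producer: blocks if the queue is full */

    struct message in;
    msgrcv(qid, &in, sizeof in.mtext, 0, 0);     /* consumer: blocks until a message arrives */
    return 0;
}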
Since message buses are a fairly common need you may wish to pick something like RabbitMQ, which provides prewritten bindings to many languages and may make future development easier.

Related

Optimal division of functions between source files

My C program has two threads, both of which interact with two external interfaces. There's too much code for one source file, so I'm splitting it in two. What is the right split?
One thread, MtoD, takes a message off an IPC message queue, processes it, and then sends commands to the driver of a physical interface. The other thread, DtoM, receives interrupts from that driver, processes the input, and then posts the results in a message to an IPC queue.
The obvious ways to split the code in two are:
by thread: two source files, MtoD.c and DtoM.c, each holding all the functions of a single thread - but both files will have to deal with both of the interfaces
by interface: two source files, M.c and D.c, each doing all the business related to a certain external interface - but the threads run through both files.
My concerns are
code maintenance. Doing it by thread makes it easier to follow the logic of a thread (no switching between files). But someone who'd write this object-oriented would probably wrap the interface to the IPC queues in one class, which would be in one file, and the driver interface in another, in the other file.
performance. If you have object files M.o and D.o, each will have just one external library to deal with - but they have to call into each other during execution of a thread. Does that incur any overhead (once the linker has made them into one binary)? If you have MtoD.o and DtoM.o, you could declare most functions as static, which might enable some more compiler optimizations. But would they both need to link with the external libraries?
Which way is optimal?
That's an interesting one, and you'll probably get BOTH options recommended, simply because both have advantages and disadvantages, and it much depends on how one values these.
OK, third option: one thread? If I understand you correctly, you connect an interface to an IPC queue, so one thread could both react to input on either side and send it out the other side. I don't think you lose much response time this way, if any, and you have it all in one place. If the source is too big, you can look into which modules you may naturally separate, rather than separating by threads or interfaces.

Opening two programs in the same memory space

Is it possible to launch two completely independent programs into one shared memory area?
For example, I have skype.exe and opera.exe and I want to launch them in a way that allows them to share common memory. Sounds like threading to me.
That's quite a few questions at once; let me try to dissect them:
It is the definition of a process on a modern OS to have its own virtual address space. So running two processes in the same address space can't happen without a modification to the OS to allow exactly that.
Even if such a modification were available, it would be a less-than-perfect idea: access to memory shared between threads is governed by synchronisation primitives explicitly built into the program. There is no such mechanism to manage memory access between two processes that have not been explicitly designed for it.
Sharing memory between processes, where so designed, does not require them to run in the same virtual address space in their entirety: shared memory segments exist in virtually all modern OSes to facilitate exactly that. Again, those processes have to be explicitly designed to use this feature.
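For illustration, a minimal sketch of a POSIX shared-memory segment, assuming the made-up name "/demo_shm" and a fixed 4 kB size; every process that maps the same name sees the same bytes (link with -lrt on older systems):

#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    int fd = shm_open("/demo_shm", O_CREAT | O_RDWR, 0600);
    ftruncate(fd, 4096);                   /* size the segment */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    p[0] = 42;                             /* visible to every process mapping it */
    munmap(p, 4096);
    close(fd);
    /* shm_unlink("/demo_shm") removes the name once everyone is done */
    return 0;
}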
If they are two independent programs, then you have to ensure that the data is passed between them in an independent way. Say the two programs are running and the first program computes some data that the second program needs. The simplest thing to do is to print the data from the first program into a file, with a status marker at the end of the file (to indicate that it is safe for the other program to start reading it). In the other program you have a loop that checks the last line of that file every so often.
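For illustration, a rough sketch of the reading side, assuming the writer finishes results.txt with a final "DONE" line (both names are made up):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int file_is_ready(const char *path)
{
    char line[256], last[256] = "";
    FILE *f = fopen(path, "r");
    if (!f)
        return 0;
    while (fgets(line, sizeof line, f))    /* remember the last line */
        strcpy(last, line);
    fclose(f);
    return strncmp(last, "DONE", 4) == 0;
}

int main(void)
{
    while (!file_is_ready("results.txt"))
        sleep(1);                          /* poll once per second */
    /* results.txt is now safe to read */
    return 0;
}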
The other option is to use a library like MPI, which has message-passing protocols implemented.

Calling convention which only allows one instance of a function at a time

Say I have multiple threads and all threads call the same function at approximately the same time.
Is there a calling convention which would only allow one instance of the function at any time? What I mean is that the function called by the second thread would only start after the function called by the first thread had returned.
Or are these calling conventions compiler specific? I don't have a whole lot of experience using them.
(Skip to the bottom if you don't care about the threading mumbo-jumbo)
As mentioned before, this is not a "calling convention" but a general problem of computing: concurrency. The particular case where two or more threads can enter a shared zone at the same time, with the outcome depending on their ordering, is called a race condition (the concept also extends to/from electronics and other areas).
The hard thing about threading is that computing is otherwise such a deterministic affair; when threading gets involved, it adds a degree of uncertainty that varies per platform/OS.
A one-thread affair guarantees that all tasks run in the same order, always; but with multiple threads, the order depends on how fast each completes its task, on other applications wanting the CPU, and on the underlying hardware.
There's no sure-fire way to do threading; there are techniques, tools and libraries to deal with individual cases.
Locking in
The most well-known technique is using semaphores (or locks), and the most well-known of these is the mutex, which allows only one thread at a time to access a shared space by having a sort of "flag" that is raised once a thread has entered.
if (locked == NO)
{
    locked = YES;
    // Do ya' thing
    locked = NO;
}
The code above, although it looks like it could work, does not guard against cases where both threads pass the if () before either sets the variable (which threads can easily do). So there is hardware support for this kind of operation that guarantees only one thread can execute it: the test-and-set operation, which checks and, if clear, sets the variable in one indivisible step (x86, for instance, provides such an instruction).
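For illustration, a minimal sketch of that idea using C11 atomics, whose atomic_flag_test_and_set() is exactly such an indivisible check-and-set (assuming a C11 toolchain):

#include <stdatomic.h>

static atomic_flag locked = ATOMIC_FLAG_INIT;

void do_ya_thing(void)
{
    while (atomic_flag_test_and_set(&locked))
        ;                        /* spin until the flag was previously clear */
    /* Do ya' thing -- only one thread at a time gets here */
    atomic_flag_clear(&locked);
}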
In the same vein of locks and semaphores, there's also the read-write lock, which allows multiple readers but only one writer, especially useful for data with low volatility. And there are many other variations, some that limit access to at most X threads, and so on.
But overall, locks are lame, since they basically force serialisation of multi-threading: threads get stuck trying to acquire a lock (or just test it and leave). Kinda defeats the purpose of having multiple threads, doesn't it?
The best solution, in terms of threading, is to minimise the amount of shared space that threads need, possibly eliminating it completely. Maybe use rwlocks when volatility is low, or have "try and leave" kinds of threads that test whether the lock is free and go away if it isn't, etc.
As my OS teacher once said (in Zen-like fashion): "The best kind of locking is the one you can avoid".
Thread Pools
Now, threading is hard, no way around it; that's why there are patterns to deal with these kinds of problems, and the Thread Pool pattern is a popular one, at least on iOS since the introduction of Grand Central Dispatch (GCD).
Instead of having a bunch of threads running amok and getting enqueued all over the place, let's have a set of threads waiting for tasks in a "pool", with queues of things to do; ideally, tasks that shouldn't overlap each other.
Now, the thread pool pattern doesn't solve the problems discussed before, but it changes the paradigm to make them easier to deal with, mentally. Instead of thinking about "threads that need to execute such and such", you switch focus to "tasks that need to be executed", and which thread is doing them becomes irrelevant.
Again, pools won't solve all your problems, but they will make them easier to understand. And easier to understand may lead to better solutions.
All the theoretical things mentioned above are already implemented at the POSIX level (semaphore.h, pthread.h, etc.; pthreads has a very nice set of r/w locking functions), so try reading about them.
(Edit: I thought this thread was about Obj-C, not plain C, edited out all the Foundation and GCD stuff)
Calling convention defines how stack & registers are used to implement function calls. Because each thread has its own stack & registers, synchronising threads and calling convention are separate things.
To prevent multiple threads from executing the same code at the same time, you need a mutex. In your example of a function, you'd typically put the mutex lock and unlock inside the function's code, around the statements you don't want your threads to be executing at the same time.
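For example, a minimal sketch with POSIX threads (the function name is made up):

#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

void one_at_a_time(void)
{
    pthread_mutex_lock(&lock);    /* a second caller blocks here... */
    /* ...statements that must not run concurrently... */
    pthread_mutex_unlock(&lock);  /* ...until the first caller releases */
}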
In general terms: Plain code, including function calls, does not know about threads, the operating system does. By using a mutex you tap into the system that manages the running of threads. More details are just a Google search away.
Note that C11, the new C standard revision, does include multi-threading support. But this does not change the general concept; it simply means that you can use standard C library functions instead of operating-system-specific ones.
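For instance, the same idea sketched with C11's <threads.h> (this header is optional, so not every toolchain ships it; mtx_t has no static initialiser, hence call_once):

#include <threads.h>

static mtx_t lock;
static once_flag lock_once = ONCE_FLAG_INIT;

static void lock_init(void) { mtx_init(&lock, mtx_plain); }

void one_at_a_time(void)
{
    call_once(&lock_once, lock_init);  /* initialise the mutex exactly once */
    mtx_lock(&lock);
    /* ...critical section... */
    mtx_unlock(&lock);
}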

How to share data between Tasks/Threads without coupling them?

I am developing a rather complex microcontroller application in C, and I have some doubts about how to "link" my shared data between the different tasks/threads without coupling them.
Until now I have used a time-sliced scheduler to run my application, and therefore there has been no need for data protection. But I want to do the application right, and I want to make it ready for a multi-threaded OS later on.
I have tried to simplify my question by using a completely different system than the one I am actually working on. I couldn't add a picture because I am a new user, but I'll try to explain instead:
We have 4 tasks/threads: 3 input threads which read sensor data from different sensors through hardware abstraction layers (HAL). The collected sensor data is stored within the task domain (i.e., it won't be global!).
We also have 1 output task; let's call it "Regulator". Regulator has to use (read) the sensor data collected by all 3 sensors in order to generate a proper output.
Question: How will Regulator read the collected data stored in the different input tasks without coupling to the other tasks?
Regulator must only know of the input tasks and their data by reference (i.e., no #includes, no coupling).
Until now, Regulator has had a pointer to each piece of needed sensor data, and these pointers are set up at initialization time. This won't work in a multi-threaded application due to data protection.
I could make getSensorValue() functions, which make use of semaphores, for each sensor value, and then link these to Regulator with function pointers. But this would take up a lot of memory! Is there a more elegant way of doing this? I am just searching for input.
I hope all this is understandable :)
From what you described in the question and comments it seems like you're most worried about the interfacing between Sensors and Regulators being low-memory with minimal implementation details and without knowing the explicit details of each Sensor implementation.
Since you're in C and don't have the C++ class features that would make encapsulation easier via inheritance, I'd suggest you pass a common data package from each Sensor thread to the Regulators rather than a function pointer. A struct of the form
struct SensorDataWrap {
    DataType *data;
    LockType *lock;
    /* ... other attributes such as newData or sensorName ... */
};
would allow you to pass data to Regulators, where you could lock before reading. Similarly, the Sensors would need to lock before writing. If you changed data to be a double pointer, DataType **data, you could make the write operation lock only for the time it takes to swap the underlying pointer. The Regulator then needs just a single SensorDataWrap struct from each thread to process that thread's information, regardless of the Sensor implementation details.
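For illustration, a hypothetical sketch of that pointer-swap variant, assuming data has been changed to DataType ** and that lock_acquire()/lock_release() are made-up wrappers around your LockType; buffer lifetime management is left out:

void sensor_publish(struct SensorDataWrap *w, DataType *fresh)
{
    lock_acquire(w->lock);         /* hypothetical wrapper around LockType */
    *w->data = fresh;              /* the swap is the only locked operation */
    lock_release(w->lock);
}

DataType *regulator_read(struct SensorDataWrap *w)
{
    lock_acquire(w->lock);
    DataType *current = *w->data;  /* grab the latest buffer */
    lock_release(w->lock);
    return current;
}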
The LockType could be a semaphore, or any higher level lock object which enables single-access acquisition. The memory footprint for any such lock should only be a couple bytes. Furthermore you're not duplicating data here, so you shouldn't have any multiplicative effects on your memory size relative to sensor read-outs. The hardware you're using should have more than enough space for holding a single copy of the data from the sensors you described as well as enough flash space to accommodate the semaphore or lock objects.
The implementation details for communication are now restricted to lock, do operation, unlock, and don't need complicated function pointers or SensorN-specific header includes. It should take close to the minimal logic needed for any threaded shared-data program. The program should also be transferable to other microcontrollers without major changes; communication is really only restricted by the presence or absence of threading and locks.
Another option is to pass a triple-buffer object and do buffer flipping in order to avoid semaphores and locks. Creating that requires atomic integer/bool support (which the compiler most likely exposes if you have semaphores). A guide to using triple buffers for concurrency can be found on this blog. This approach uses a little more active memory but is a very slick way of avoiding most concurrency problems.

Library for Dataflow in C

How can I do dataflow (pipes and filters, stream processing, flow based) in C? And not with UNIX pipes.
I recently came across stream.py.
Streams are iterables with a pipelining mechanism to enable data-flow programming and easy parallelization.
The idea is to take the output of a function that turns an iterable into another iterable and plug that as the input of another such function. While you can already do this using function composition, this package provides an elegant notation for it by overloading the >> operator.
I would like to duplicate a simple version of this kind of functionality in C. I particularly like the overloading of the >> operator to avoid function composition mess. Wikipedia points to this hint from a Usenet post in 1990.
Why C? Because I would like to be able to do this on microcontrollers and in C extensions for other high level languages (Max, Pd*, Python).
* (ironic given that Max and Pd were written, in C, specifically for this purpose – I'm looking for something barebones)
I know it's not a good answer, but you should make your own simple dataflow framework.
I've written a prototype DF server (together with a friend of mine) which still has several unimplemented features: it can only pass Integer and Trigger data in messages, and it does not support parallelism. I've just skipped that work: each component's producer ports have a list of function pointers to consumer ports, which is set up during initialization, and they call them (if the list is not empty). So, when an event fires, the components perform a tree-like walk-through of the dataflow graph. As they work with Integers and Triggers, it's extremely quick.
Also, I've written a strange component which has one consumer and one producer port; it simply passes the data through, but in another thread. Its consumer routine finishes quickly, as it just stores the data and sets a flag for the producer-side thread. Dirty, but it suits my needs: it detaches long processes from the tree-walk.
So, as you may have recognized, it's a low-traffic asynchronous system for quick tasks, where the graph size does not matter.
Unfortunately, your problem differs from mine in as many ways as one dataflow system can differ from another: you need a synchronous, parallel, stream-handling solution.
I think the biggest issue in a DF server is the dispatcher. Concurrency, collisions, threads, priority... as I said, I just skipped the problem rather than solving it. You should skip it, too. And you should skip other problems as well.
Dispatcher
In a synchronous DF architecture, all the components must run once per cycle, except in special cases. They have a simple precondition: is the input data available? So you should just scan through the components and pass each one to a free caller thread if its data is available. After processing all of them, you will have N remaining components which haven't been processed. Process the list again. After the second pass you will have M remaining. If N == M, the cycle is over.
I think this kind of approach will work as long as the number of components stays below about 100; see the sketch below.
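A rough sketch of that scan-until-no-progress loop; Component, MAX_COMPONENTS, input_available() and run_component() are hypothetical names standing in for your own types and helpers:

void run_cycle(Component *components, int count)
{
    int done[MAX_COMPONENTS] = { 0 };           /* hypothetical upper bound */
    int remaining = count;
    for (;;) {
        int waiting = 0;
        for (int i = 0; i < count; i++) {
            if (done[i])
                continue;
            if (input_available(&components[i])) {
                run_component(&components[i]);  /* or hand off to a free caller thread */
                done[i] = 1;
            } else {
                waiting++;
            }
        }
        if (waiting == remaining)
            break;                              /* N == M: the cycle is over */
        remaining = waiting;
    }
}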
Binding
Yep, the best way of binding is visual programming. Until the editor is finished, config-like code should be used instead, something like:
// disclaimer: not actual code
Component* c1 = new AddComponent();
Component* c2 = new PrintComponent();
c2->format = "The result is %d\n";
bind(c1->result, c2->feed);
It's easy to write and well-readable; what more could you wish for?
Message
You should pass pure raw packets between components' ports. You need only a list of bindings, each containing a pair of pointers to the producer and consumer ports, plus the processed flag which the "dispatcher" uses.
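A minimal sketch of such a binding list; Port stands in for whatever your port type is:

struct Binding {
    Port *producer;
    Port *consumer;
    int   processed;   /* used by the dispatcher, reset each cycle */
};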
Calling issue
The problem is that the producer should not call the consumer port but the component: all component (class) variables and firings live in the component. So the producer should either call the component's common entry point directly, passing the consumer's ID to it, or call the port, which should then call a method of the component it belongs to.
So, if you can live with some restrictions, I say go ahead and write your lite framework. It's a good task, and writing small components and seeing how smartly they can be wired together to build a great app is the ultimate fun.
If you have further questions, feel free to ask, I often scan the "dataflow" keyword here.
Possibly, you can figure out a simpler dataflowish model for your program.
I'm not aware of any library for this purpose. A friend of mine implemented something similar at university as a lab assignment. The main problems of such systems are low performance (really bad if the functions in long pipelines are smallish) and the potential need to implement scheduling (detecting deadlocks and boosting priority to avoid overflowing pipe buffers).
From my experience with similar data processing, error handling is quite burdensome. Since functions in the pipeline know little of the context (intentionally, for reusability), they can't produce sensible error messages. One can implement in-line error handling, passing errors down the pipe as data, but that requires special handling all over the place, especially on the output side, as it is not possible with streams to correlate an error with the input it corresponds to.
Considering the known performance problems of the approach, it is hard for me to imagine how it would fit microcontrollers. Performance-wise, nothing beats a plain function: one can create a function for every path through the data pipeline.
You could probably look at some Petri net implementation (simulator or code generator), as they are one of the theoretical bases for streams.
This is cool: http://code.google.com/p/libconcurrency/
A lightweight concurrency library for C, featuring symmetric coroutines as the main control flow abstraction. The library is similar to State Threads, but using coroutines instead of green threads. This simplifies inter-procedural calls and largely eliminates the need for mutexes and semaphores for signaling.
Eventually, coroutine calls will also be able to safely migrate between kernel threads, so the achievable scalability is consequently much higher than State Threads, which is purposely single-threaded.
This library was inspired by Douglas W. Jones' "minimal user-level thread package". The pseudo-platform-neutral probing algorithm on the svn trunk is derived from his code.
There is also a safer, more portable coroutine implementation based on stack copying, which was inspired by sigfpe's page on portable continuations in C. Copying is more portable and flexible than stack switching, and making copying competitive with switching is being researched.
