How callbacks work and global data synchronization - c

I am having some difficulty understanding callbacks, program flow, and the synchronization issues involved.
Let's say I have a global variable g_peers, and I register a callback with a system app which will notify me of peer events such as join/leave/change. In the callback, I modify g_peers based on the event and its associated information. In other parts of the code (i.e., the regular code flow) I have functions which read from g_peers.
Will this result in synchronization issues? Let's say I am in the middle of reading from g_peers when a peer leaves and the callback is invoked, which modifies g_peers.
How does a callback work? Is the normal flow interrupted until the callback finishes?

Global variables in a multithreaded environment always need to be synchronized when multiple threads can access them concurrently.
If your environment is multithreaded, the callback may be called in a separate thread, so access to g_peers must be synchronized.
If your environment is single threaded then no synchronization is needed.
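For the multithreaded case, a minimal sketch of guarding g_peers with a POSIX mutex could look like the following. Everything except the name g_peers is an assumption made for illustration: the peer list layout, the callback names on_peer_join() and on_peer_leave(), and the notifier thread that stands in for the system app delivering events.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

#define MAX_PEERS 16
#define NAME_LEN  32

/* Shared state: every access goes through g_peers_lock. */
static pthread_mutex_t g_peers_lock = PTHREAD_MUTEX_INITIALIZER;
static char g_peers[MAX_PEERS][NAME_LEN];
static int g_peer_count;

/* Hypothetical callback: may run on a thread owned by the notifier. */
static void on_peer_join(const char *name)
{
    pthread_mutex_lock(&g_peers_lock);
    if (g_peer_count < MAX_PEERS) {
        strncpy(g_peers[g_peer_count], name, NAME_LEN - 1);
        g_peers[g_peer_count][NAME_LEN - 1] = '\0';
        g_peer_count++;
    }
    pthread_mutex_unlock(&g_peers_lock);
}

/* Hypothetical callback: removes a peer by name if it is present. */
static void on_peer_leave(const char *name)
{
    pthread_mutex_lock(&g_peers_lock);
    for (int i = 0; i < g_peer_count; i++) {
        if (strcmp(g_peers[i], name) == 0) {
            /* Fill the gap with the last entry. */
            memmove(g_peers[i], g_peers[g_peer_count - 1], NAME_LEN);
            g_peer_count--;
            break;
        }
    }
    pthread_mutex_unlock(&g_peers_lock);
}

/* Reader used by the regular code flow; takes the same lock. */
static int peer_is_known(const char *name)
{
    int found = 0;
    pthread_mutex_lock(&g_peers_lock);
    for (int i = 0; i < g_peer_count; i++) {
        if (strcmp(g_peers[i], name) == 0) {
            found = 1;
            break;
        }
    }
    pthread_mutex_unlock(&g_peers_lock);
    return found;
}

/* Stand-in for the system app delivering events on its own thread. */
static void *notifier_thread(void *arg)
{
    (void)arg;
    on_peer_join("alice");
    on_peer_join("bob");
    on_peer_leave("alice");
    return NULL;
}

/* Runs the notifier while the calling thread reads; returns the peer count. */
static int run_peer_demo(void)
{
    pthread_t t;
    int rc = pthread_create(&t, NULL, notifier_thread, NULL);
    assert(rc == 0);
    (void)peer_is_known("bob");   /* may overlap with the callbacks */
    pthread_join(t, NULL);
    return g_peer_count;
}
```

The point of the design is that readers and the callbacks take the same lock, so a reader never observes the list halfway through a join or leave.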
What is a Callback?
In simple terms, a Callback function is one that is not called explicitly by the programmer. Instead, there is some mechanism that continually waits for events to occur, and it will call selected functions in response to particular events.
This mechanism is typically used when an operation (function) can take a long time to execute and the caller does not want to wait until the operation is complete, but does wish to be informed of its outcome. Typically, callback functions help implement such an asynchronous mechanism: the caller registers to be notified about the result of the time-consuming processing and continues with other operations, and at a later point in time the caller is informed of the result.
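As a rough illustration of the registration idea, here is a single-threaded sketch. The names register_callback(), dispatch_event() and count_events() are made up for this example; dispatch_event() stands in for whatever mechanism the library uses to detect events.

```c
#include <assert.h>
#include <stddef.h>

/* Signature every callback must have in this sketch. */
typedef void (*event_cb)(int event, void *user_data);

static event_cb g_cb;       /* registered function, or NULL */
static void *g_cb_data;     /* opaque pointer handed back to it */

/* The application calls this once to say "call me when something happens". */
static void register_callback(event_cb cb, void *user_data)
{
    g_cb = cb;
    g_cb_data = user_data;
}

/* Stand-in for the library side: invoked when an event is detected. */
static void dispatch_event(int event)
{
    if (g_cb != NULL)
        g_cb(event, g_cb_data);   /* runs on the dispatching thread */
}

/* Application callback: accumulates the event values it is given. */
static void count_events(int event, void *user_data)
{
    int *total = user_data;
    *total += event;
}

/* Registers the callback, fires two events, returns the accumulated total. */
static int run_callback_demo(void)
{
    int total = 0;
    register_callback(count_events, &total);
    dispatch_event(2);
    dispatch_event(3);
    assert(total == 5);
    return total;
}
```

In this single-threaded form the callback simply runs inside dispatch_event() on the caller's thread, so nothing is interrupted. Whether a real library calls back on your thread or on one of its own is something only its documentation can tell you.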

Related

Emit signal from separate thread in glib/gtk

I've programmed a GTK3 application in C. To speed up performance I want to put some calculations in separate threads. Currently I have not yet decided how to exactly implement it. But I think I will go with a GTask that I will trigger to run in a separate thread.
I want to emit certain status updates about the calculation progress on my GUI.
The way I imagine this:
I've got a calculation GObject with a do_the_stuff_async() function that triggers the thread/GTask. Ideally, I want to connect to a 'progress-changed' signal which gives me the current status that I can display on my GUI. Also it would be great to trigger an event once the task has finished, which seems doable with a GTask
How do I safely emit a signal from a GTask/GThread to my GTK main loop?
Because I have not yet started implementing the asynchronous stuff: Is a GTask a suitable way for this or should I use something entirely different?
A GTask is suitable for this.
To emit the signal from a different thread:
1. Store the GMainContext of the main thread somewhere, and pass it into the GTask as task data.
2. When you want to emit a signal, create a GSource with g_idle_source_new(), add a callback for it, and attach it to the GMainContext of the main thread. That will make the callback be called in the main thread the next time the main thread’s context iterates.
3. The callback should call g_signal_emit() as appropriate.
Unless you want to make the GObject that you’re emitting the signal from thread safe, you should not pass it to the GTask worker thread. My general advice would be to avoid locking by only passing immutable objects to and from the worker thread.
Note that you could use g_idle_add() to create an idle GSource and add it to the global default GMainContext (the one GTK uses in the main thread). However, this makes the use of the GMainContext less explicit, and hence makes the code less maintainable. It’s not an approach which I would recommend in general.

How to synchronise processing from WSARecvFrom() when using CompletionRoutine with multiple sockets

From the MSDN Documentation:
The transport providers allow an application to invoke send and receive operations from within the context of the socket I/O completion routine, and guarantee that, for a given socket, I/O completion routines will not be nested. This permits time-sensitive data transmissions to occur entirely within a preemptive context.
In our system we have one thread calling WSARecvFrom() for multiple sockets. There is one CompletionRoutine for that thread, handling all callbacks from WSARecvFrom() overlapped I/O.
Our tests showed that this completion routine is called as if triggered by an interrupt: it is called for one socket while it is still processing the completion for another socket.
How can we prevent this completion routine from being called while it is still processing input from another socket?
What serialisation of data processing can we use?
Note that there are hundreds of sockets receiving and sending realtime data. Synchronisation by waiting for multiple objects is not applicable, as there is a maximum of 64 defined by the Win32 API.
We cannot use a semaphore because, when the routine is called anew, the old ongoing processing is interrupted, so the semaphore would not be released and the new processing would block forever.
Critical sections or a mutex are not an option because the completion routine callback is made from within the same thread, so the CS or mutex would be acquired anyway and would not wait until the old processing is finished.
Does anyone have an idea, or even a better approach, to serialize (synchronize) the data processing?
If you read the WSARecvFrom() documentation again more carefully, it also says:
The completion routine follows the same rules as stipulated for Windows file I/O completion routines. The completion routine will not be invoked until the thread is in an alertable wait state such as can occur when the function WSAWaitForMultipleEvents with the fAlertable parameter set to TRUE is invoked.
The Alertable I/O documentation then states:
When the thread enters an alertable state, the following events occur:
1. The kernel checks the thread's APC queue. If the queue contains callback function pointers, the kernel removes the pointer from the queue and sends it to the thread.
2. The thread executes the callback function.
3. Steps 1 and 2 are repeated for each pointer remaining in the queue.
4. When the queue is empty, the thread returns from the function that placed it in an alertable state.
So it should be practically impossible for a given thread to overlap multiple pending completion routines on top of each other, because the thread receives and processes the routines in a serialized manner. The only way I could see that being different is if a completion routine is doing something to put the thread into a second alertable state while a previous alertable state is still in effect. I'm not sure what Windows does in that situation, but you should avoid doing it anyway.
Note there are hundreds of sockets receiving and sending realtime data. Synchronisation with waiting for multiple objects is not applicable as there is a maximum of 64 defined by the Win32 API
The WaitForMultipleObjects() documentation tells you how to work around that limitation:
To wait on more than MAXIMUM_WAIT_OBJECTS handles, use one of the following methods:
• Create a thread to wait on MAXIMUM_WAIT_OBJECTS handles, then wait on that thread plus the other handles. Use this technique to break the handles into groups of MAXIMUM_WAIT_OBJECTS.
• Call RegisterWaitForSingleObject to wait on each handle. A wait thread from the thread pool waits on MAXIMUM_WAIT_OBJECTS registered objects and assigns a worker thread after the object is signaled or the time-out interval expires.
I wouldn't wait on the sockets anyway, that is not very efficient. Using completion routines is fine as long as they are doing safe things.
Otherwise, I would suggest you stop using completion routines and switch to using an I/O Completion Port for the socket I/O instead. Then you are in more control of when the completion results are reported to you, because you have to call GetQueuedCompletionStatus() yourself to get the results of each I/O operation. You can have multiple sockets associated with a single IOCP, and then have a small pool of threads (typically one thread per CPU core works best) all calling GetQueuedCompletionStatus() on that IOCP. This way, you can process multiple I/O results in parallel, as they will be in different thread contexts and cannot overlap each other in the same thread. This does mean, however, that you can perform an I/O operation in one thread and the result may show up in a different thread. Just make sure your completion processing is thread-safe.
First of all, let me thank you for all the helpful hints and comments on my question.
We have now stopped using completion routines and changed the application to use completion ports.
The biggest problem we had with completion routines is that every time the thread goes into an alertable state, the completion routines can (and will) be called again by the OS. As seen in the debugger, calling WSASendTo() from inside the completion routine also puts the thread into an alertable state, so the completion routine is executed again before its previous execution has finished.
This makes it nearly impossible to synchronize data processing from multiple different sockets.
The approach using completion ports seems to be the perfect one. You then have control over what you are doing when you are released from GetQueuedCompletionStatus() to process a data buffer. You can, and have to, do the synchronization of data processing yourself, in a linear fashion, without being interrupted and re-entered while trying to process the data.

Can a C callback function be run simultaneously?

I am working on raspberry pi camera module interface, and it involves a callback function which you register to the mmal buffer, and it is run each time a frame is grabbed.
My confusion emerges in the case where another frame arrives when the previously running callback function has not finished in time.
Let's imagine we give some loop a callback function, which will be run when a certain event happens. If the callback invocation for the previous occurrence of that event has not yet finished its job, what will happen when the next event arrives? Can the same callback function start running (like another thread?) while the previous one is still running?
I doubt it would, but I had to ask to understand.
It depends on the implementation of the software which is calling your callback.
Most probably it will not launch a thread each time a frame is received, and will instead wait for the callback to return before waiting for the next frame.
You have to look into the mmal buffer documentation (and/or code) to understand how it works and whether you need thread-safe code in your callback.
It could be implemented in different ways.
It could be the case that the frames are stored in a buffer and the function is called on them one after another. This seems to be somewhat like a bounded buffer producer-consumer. Maybe the callback function is a realtime function (they have strict runtime guarantees which ensure the buffer does not overflow) in this case?
Or the function is spawned in a separate thread for each frame. If it's calling the function simultaneously in separate threads, the callback function should be thread-safe.
http://en.wikipedia.org/wiki/Thread-safety
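If the caller does turn out to invoke the callback from several threads at once, the callback must not rely on unsynchronized shared state. A minimal sketch using C11 atomics follows. It is not MMAL code: on_frame(), frame_source() and run_frame_demo() are hypothetical names, and the worker threads merely simulate a library that calls back concurrently.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

#define MAX_THREADS 8

/* Shared counter updated atomically, so concurrent callbacks are safe. */
static atomic_int g_frames_seen;

/* Hypothetical frame callback; touches shared state only via atomics. */
static void on_frame(const unsigned char *data, int len)
{
    (void)data;
    (void)len;
    atomic_fetch_add(&g_frames_seen, 1);
}

/* Simulates a library thread delivering a number of frames. */
static void *frame_source(void *arg)
{
    unsigned char frame[4] = {0};
    int frames = *(int *)arg;
    for (int i = 0; i < frames; i++)
        on_frame(frame, (int)sizeof frame);
    return NULL;
}

/* Starts `threads` concurrent sources and returns the total frames counted. */
static int run_frame_demo(int threads, int frames_per_thread)
{
    pthread_t tids[MAX_THREADS];
    assert(threads > 0 && threads <= MAX_THREADS);
    atomic_store(&g_frames_seen, 0);
    for (int i = 0; i < threads; i++) {
        int rc = pthread_create(&tids[i], NULL, frame_source,
                                &frames_per_thread);
        assert(rc == 0);
    }
    for (int i = 0; i < threads; i++)
        pthread_join(tids[i], NULL);
    return atomic_load(&g_frames_seen);
}
```

With a plain int counter the increments could be lost when two invocations overlap. The atomic version counts every frame regardless of how the calls interleave.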

How to handle hooked WSARecv

I'm working on a project that involves hooking WSARecv. I know how to hook this function, I mean its just the same as hooking another function. Anyway the hard part is when WSARecv is used to perform overlapped operations. The idea is that when an application receives data to intercept that and be possible to modify this, I'm using pipes for this. The native DLL tunnels all data to a managed 'server'. This processes the input etc and returns it back to the native DLL. This works great for WSASend, send and recv. However the hard part is when an application uses overlapped sockets.
So I need the received data first before I can process it, this is the hard part. How would I do something like this? I thought of this, but they both seem like a mess:
When WSARecv is called using the WSAOverlapped:
Create a new thread, use WaitForSingleObject and pass the hEvent of the WSAOverlapped structure. When the event is signaled process the data to the managed server and pass the data to the program.
When WSARecv is called using the completion routine:
Create a new thread, modify the call to the original function with lpOperationCompleted to a new function. Use SleepEx to put the thread in an alertable state. When the OperationCompleted is called process the data and pass data back to the program.
I could post my code but I didn't write because it seems like a bad solution.. So there is not really a point for that.
I cannot think of a better solution and this seems horrible because when an application calls WSARecv a lot (for example a large server using overlapped sockets to handle lots of clients) it creates a new thread for every call and that just seems like a bad idea.
So how can I do such thing?
There's no need to create a thread for each overlapped IO call.
When overlapped operations are used, they either have an associated event (which you can safely ignore), a completion routine, or are associated with an I/O Completion port.
To handle the first two cases you should hook both WSARecv() and WSAGetOverlappedResult().
If you need to handle the last, you'll also need to hook GetQueuedCompletionStatus()
Now, when you get a call to WSARecv(), for the event case, you do nothing special there (except possibly save some information in relation to the lpOverlapped, eg. the buffer), and process the data in WSAGetOverlappedResult() (which the application must call to get the success/error and bytes transferred.)
If a completion routine is present, save the lpOverlapped and lpCompletionRoutine, and pass your own completion routine to the real WSARecv().
Your routine should process the data and call the original completion routine.
To handle the I/O completion port case, have your WSARecv() hook save lpOverlapped, the buffers, etc. Then, in your GetQueuedCompletionStatus() hook, call the original, and if the returned overlapped structure matches a saved one, handle the data.
You should also note that overlapped operations may complete immediately, in which case the event isn't signaled, the completion routine isn't called, and (IIRC) no completion is queued on the IOCP.

What is C's analogy to LabVIEW's Event Structure?

One programming construct I use quite a bit in LabVIEW is the Event Structure. This gives me the benefit of not having to needlessly waste CPU cycles via polling but only perform actions when an event I'm interested in is generated.
As an experienced LabVIEW programmer with a decent understanding of C, I'm curious how one would go about emulating LabVIEW's event structure in C; preferably under Linux. A small code sample (like the one in the link above) illustrating how this might be done would be much appreciated. Also, if there already exists 3rd party libraries (for Linux) to add this event framework to C, that would be nice to know as well. Thanks.
The Event Structure is really just an abstraction that hides the thread of execution from you. There has to be some code running somewhere on the computer that is checking for these events and then calling your event handlers. In C, you'd be expected to provide this code (the "main loop" of the program) yourself. This code would check the various event sources you are interested in and call your event handler functions.
The trick then becomes how to not have this main loop wildly spinning the CPU. One easy trick is to have the main loop sleep for a period of time and then check if any events need to be handled, and then sleep again. This has the downside of introducing latency. A better trick, when applicable, is to have the Operating System do these checks as part of its normal operations, and then wake your application's main loop up when something interesting happened. In Linux, this is done with the 'select' system call, but select has the limitation that it can only specify a resource that can be associated with a file descriptor, so devices, stdin, files, network ports are fine.
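As a small sketch of letting the kernel do the waiting, the helper below blocks in select() until a file descriptor becomes readable or a timeout expires. The names wait_readable() and run_select_demo() are made up for this example, and a pipe stands in for a real event source.

```c
#include <assert.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Sleeps in the kernel until fd is readable or timeout_ms elapses.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_readable(int fd, int timeout_ms)
{
    fd_set readfds;
    struct timeval tv;

    FD_ZERO(&readfds);
    FD_SET(fd, &readfds);
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    int rc = select(fd + 1, &readfds, NULL, NULL, &tv);
    if (rc < 0)
        return -1;
    return (rc > 0 && FD_ISSET(fd, &readfds)) ? 1 : 0;
}

/* Uses a pipe as the event source; returns 0 if both waits behave as expected. */
static int run_select_demo(void)
{
    int fds[2];
    if (pipe(fds) != 0)
        return -1;

    int before = wait_readable(fds[0], 50);   /* nothing written: times out */
    ssize_t n = write(fds[1], "x", 1);
    assert(n == 1);
    int after = wait_readable(fds[0], 50);    /* one byte pending: readable */

    close(fds[0]);
    close(fds[1]);
    return (before == 0 && after == 1) ? 0 : 1;
}
```

While the process sits inside select() it consumes no CPU time, which is the difference from a sleep-and-poll loop: the wake-up happens as soon as data arrives instead of at the next polling interval.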
Edit: To clarify for my downvoters: I am not denying the existence of hardware interrupts. Yes, in cases where code has direct access to hardware interrupts for all events that it wishes to handle (such as an embedded system or device driver) you can write truly "event driven" code with multiple entry points that does not busy wait or sleep. However, in a normal application level C program running under Linux, this code architecture does not literally exist but is emulated at the application level. Any Linux application is going to have a main loop, and at least one thread of execution. This thread may get paused by the scheduler, but it always exists and always has an instruction pointer at a particular instruction. If the code leaves the main() the program ends. There is no facility for the code to return from main and get a callback later on from the kernel. The code has a single entry point and must call its various event handlers manually. Other than in a device driver (or very specific system code using signals), you can not have the kernel or hardware automatically call a certain function if the user clicked on a certain menu item, instead your code is running, detects this event itself, and calls the correct event handler.
You can tell LabView "Call this function when XX happens". In C, you tell your own event dispatch code "Call this function when XX happens".
What I'm trying to say (poorly?) is that the Event framework architecture is not native to a C / Linux application. It must be emulated by your code by having a main dispatch thread that gives the appearance of an event driven framework. Either you do this manually, or use an event library that does this behind the scenes to give the appearance of an event driven model. LabView takes the second approach, so it appears that no code is running when no events are happening, but in reality there is LabView's own C++ code running managing the event queues. This doesn't mean that it is busy waiting all the time, as I said before there are system calls such as select and sleep that the code can use to yield cpu time when it has no work to do, but the code can not simply stop executing.
Let's say you want to write an "event driven" program with three event handlers: one called tick() that gets called every ten seconds, one called key() that gets called every time a key gets pressed, and one called foobar() that gets called every time the word "foobar" gets typed. You can define these three event handlers, but in addition you need some dispatch main thread that basically does
while not quitting
    If 10 seconds have elapsed, call tick()
    If Key has been Pressed
        call key()
        save the key to our key buffer
        If buffer now contains "foobar" call foobar() and clear buffer
    Wait()
If all of the events you care about are system level events or time level events, your Wait() can simply be telling the kernel 'wake me up when one of these things happens', so you don't need to 'busy wait'. But you can't simply tell the kernel "call foobar() when "foobar" is typed". You have to have application level dispatch code that emulates the Event Structure. Your C program only has a single entry point from the kernel for each thread of execution. If you look at libraries that provide event dispatch models, such as Qt, you will find that they are working like this under the hood.
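The dispatch loop described above could be sketched in C roughly as follows, using select() as the Wait() step. This is only one possible shape, under a few assumptions: keys arrive as bytes on a file descriptor, the loop ends at end of input, and tick() fires whenever select() times out (that is, after tick_ms without input) rather than on a strict fixed schedule. The names run_event_loop() and run_loop_demo() are invented for the example.

```c
#include <assert.h>
#include <string.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Counters let a caller observe which handlers ran. */
static int g_ticks, g_keys, g_foobars;

static void tick(void)   { g_ticks++; }
static void key(char c)  { (void)c; g_keys++; }
static void foobar(void) { g_foobars++; }

/* Application-level dispatch: waits in select(), then calls the handlers. */
static void run_event_loop(int fd, int tick_ms)
{
    const char *word = "foobar";
    const size_t word_len = strlen(word);
    char buf[16];          /* most recent keys */
    size_t len = 0;

    for (;;) {
        fd_set readfds;
        struct timeval tv;
        FD_ZERO(&readfds);
        FD_SET(fd, &readfds);
        tv.tv_sec = tick_ms / 1000;
        tv.tv_usec = (tick_ms % 1000) * 1000;

        int rc = select(fd + 1, &readfds, NULL, NULL, &tv);   /* Wait() */
        if (rc < 0)
            break;                     /* error: stop the loop */
        if (rc == 0) {                 /* timeout: no key within tick_ms */
            tick();
            continue;
        }

        char c;
        ssize_t n = read(fd, &c, 1);
        if (n <= 0)
            break;                     /* end of input or error: quit */
        key(c);

        if (len == sizeof buf) {       /* buffer full: drop the oldest key */
            len--;
            memmove(buf, buf + 1, len);
        }
        buf[len++] = c;
        if (len >= word_len &&
            memcmp(buf + len - word_len, word, word_len) == 0) {
            foobar();
            len = 0;                   /* clear the key buffer */
        }
    }
}

/* Feeds "abfoobarc" through a pipe so the loop can run without a keyboard. */
static int run_loop_demo(void)
{
    const char *input = "abfoobarc";
    int fds[2];
    if (pipe(fds) != 0)
        return -1;
    ssize_t n = write(fds[1], input, strlen(input));
    assert(n == (ssize_t)strlen(input));
    close(fds[1]);                     /* reader sees EOF after the data */

    g_ticks = g_keys = g_foobars = 0;
    run_event_loop(fds[0], 1000);
    close(fds[0]);
    return 0;
}
```

To drive it from a terminal you would pass STDIN_FILENO as fd; note that a terminal is line-buffered by default, so keys only arrive after Enter unless the terminal mode is changed.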
I like libev for this sort of thing.
Most GUI toolkits (GTK, Qt, etc.) implement their own abstraction of an event loop. I've pastebinned a sample program here, because it was a bit long to include in the answer. It's a port of the LabVIEW example you mentioned to C using the GTK toolkit, because that's the one I'm familiar with. The basics of the event loop are not much different in other toolkits, though.
If all you care about is keyboard input, C standard I/O is what you want. By default input streams are buffered and will stall your program until input is received. Use scanf, getchar, whatever else in <stdio.h>.
If you want mouse input, you'll need to be more specific about your platform as C/C++ has no native support for the mouse or windows.
A good analogy to LabVIEW's event structure is Win32's "event pull" function GetMessage(). GetMessage() waits forever until a GUI event occurs. There are many more events in Windows than in LabVIEW, even for every child window (LabVIEW: control or indicator). GetMessage() simply returns on every event; fine filtering (as in LabVIEW) has to be done later, typically using DispatchMessage() and the window's event handler procedure WindowProc() with its more or less large switch() statement.
Most toolkits use an "event push" style, which does not correspond to the event structure. The same goes for interrupt-driven programs.
If a timeout is used, think that MsgWaitForMultipleObjects() with zero file handles is called before PeekMessage(). The timeout case applies when no event arrived in the given time span.
Actually, LabVIEW's event structure should be inside a separate loop. A separate loop is a thread. For typical Win32 programming, GetMessage() is used in the main thread, and additional ("worker") threads are generated by user interaction as needed.
LabVIEW cannot easily create a thread. It is only possible by invoking an asynchronous SubVI. Really! Therefore, most LabVIEW programs use a second while loop as a permanently available worker thread that will run when something has to be done and block (i.e. stop consuming CPU power) otherwise. To instruct what has to be done in background, a queue is used.
As a bad side effect, when the worker thread does something, the user cannot do something else in background as there is only one worker thread.
LabVIEW's event structure differs from other programming languages in one major way: LabVIEW events can have multiple consumers! If multiple event structures are used, everything continues to work well (except for events with boolean return values). In Windows, events are posted to a specific thread, mostly to a window's thread. To feed multiple threads, events have to be posted multiple times. The same holds for other programming languages. Events there are handled by something similar to LabVIEW's "Queue" related functions: once someone receives the event, it is out of the queue.
Multi-targeting requires that every consumer registers itself somehow with the producer. For GUI events, this is done automatically. For user events, this must be done programmatically. See the LabVIEW examples.
Distributing events to multiple listeners is realized in Windows using DDE, but that is meant more for processes than for threads. Registering to a thread is done using DdeConnect() or similar, and events are pushed to a callback function. (To be more exact about how Win32 works, GetMessage() receives DDE messages, and DispatchMessage() actually calls the callback function.)

Resources