C: pattern for returning asynchronous errors from a background thread?

I'm writing an open source C library. This library is quite complex, and some operations can take a long time. I therefore created a background thread which manages the long-running tasks.
My problem is that I have not yet found an elegant way to return errors from the background thread. Suppose the background thread reorganizes a file or does periodic maintenance, and it fails – what to do?
I currently see two options:
1) if the user is interested in seeing these errors, he can register a callback function.
I don't like this option – the user doesn't even know that there's a background thread, so he will most likely forget to set the callback function. From a usability point of view, this option is bad.
2) the background thread stores the error in a global variable and the next API function returns this error.
That's what I'm currently doing, but I'm also not 100% happy with it, because it means that users have to expect EVERY possible error code being returned from every API function. I.e. if the background thread sets an IO Error, and the user just wants to know the library version, he will get an IO error although the get_version() API call doesn't access the disk at all. Again, bad usability…
Any other suggestions/ideas?

Perhaps for the "long running operations" (the ones you'd like to use a thread for) give users two options:
a blocking DoAction(...) that returns status
a non-blocking DoActionAsync(..., <callback>) that gives the status to a user provided callback function
This gives the user the choice in how they want to handle the long operation (instead of you deciding for them), and it is clear how the status will be returned.
Note: I suppose that if they call DoActionAsync, and the user doesn't specify a callback (e.g. they pass null) then the call wouldn't block, but the user wouldn't have/need to handle the status.

I am interested in how the completion status is reported to the caller of the API.
Since the background thread carries out all the execution, the foreground thread can either wait for completion, as in a synchronous call, or do other work and register a callback.
The first method is synchronous, like your use of a global variable. Instead of the global variable, you could use a message queue with a single slot. Then:
- The caller can either poll the message queue for the status
- The caller can block waiting on the message queue for the status
That is what I can think of.
But if I were the caller, I would also like to know the progress if the operation takes a very long time. So it would be better to report some kind of percentage completion, so that the end user can build a better application with a progress bar and so on.

You should keep a thread-safe list (or queue) of error events and warnings. The worker thread can post events to the list, and the main thread can then read events from the list, either one at a time or in a batch to prevent race conditions. Ideally, the main thread should fetch a copy of the event queue and flush it, so there is no chance of duplicating events when there are multiple main or worker threads. Events on the list would have a type and details.
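A minimal sketch of such a list, assuming POSIX threads. The names `error_event`, `event_queue`, `event_queue_post` and `event_queue_drain` are invented for this illustration and are not part of any existing library. The drain function detaches the whole list under the lock, which is one way to "fetch a copy and flush" in a single step.

```c
#include <pthread.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical event record: a type code plus a short detail string. */
typedef struct error_event {
    int type;
    char details[64];
    struct error_event *next;
} error_event;

typedef struct {
    pthread_mutex_t lock;
    error_event *head;   /* oldest event */
    error_event *tail;   /* newest event */
} event_queue;

void event_queue_init(event_queue *q)
{
    pthread_mutex_init(&q->lock, NULL);
    q->head = q->tail = NULL;
}

/* Called by the worker thread. Returns 0 on success, -1 if out of memory. */
int event_queue_post(event_queue *q, int type, const char *details)
{
    error_event *ev = malloc(sizeof *ev);
    if (ev == NULL)
        return -1;
    ev->type = type;
    snprintf(ev->details, sizeof ev->details, "%s", details);
    ev->next = NULL;

    pthread_mutex_lock(&q->lock);
    if (q->tail)
        q->tail->next = ev;
    else
        q->head = ev;
    q->tail = ev;
    pthread_mutex_unlock(&q->lock);
    return 0;
}

/* Called by the main thread: detaches the whole list in one step and
   leaves the queue empty, so no event can be delivered twice. The caller
   owns the returned list (oldest first) and must free() each node. */
error_event *event_queue_drain(event_queue *q)
{
    pthread_mutex_lock(&q->lock);
    error_event *batch = q->head;
    q->head = q->tail = NULL;
    pthread_mutex_unlock(&q->lock);
    return batch;
}
```

The main thread can then walk the returned batch without holding the lock, so the worker is never blocked while events are being processed.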

If you're providing a library and try to hide expensive work via a thread I'd suggest to not do it that way. If something is expensive it should be visible to the caller and if it bugs him, he should take care of backgrounding/threading himself. That way he also has full control over the error.
It also takes the control over his process away from the developer who uses your library.
If you still want to use threads I'd suggest to really follow the callback-route but make it clearly visible in the API and documentation that there will be a background thread running on this task and therefore the callback is necessary.
Best way would be if you offered both ways, synchronous and asynchronous, to the users of the library so they can choose what fits best for them in their specific situation.

Thanks for all the good answers. You provided me with a lot of material to think about.
Some of you suggested callbacks. Initially, I thought a callback was a good idea. But it just moves the problem to the user. If a user gets an asynchronous error notification, how will he deal with it? He will have to interrupt and/or notify his synchronous program flow, and that's usually tricky and often breaks a design.
The solution I'm using now: if the background thread generates an error, the next API call will return the error BACKGROUND_ERROR_PENDING. With a separate API function (get_background_error()) the user can look at the actual error code, if he's interested in it.
Also, I added documentation so users won't be too surprised if this error is returned.
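A rough sketch of how this scheme could look, assuming C11 atomics. Only BACKGROUND_ERROR_PENDING and get_background_error() come from the description above; the other names (LIB_OK, LIB_IO_ERROR, report_background_error, lib_do_something) are made up for the example, and clearing the error when it is read is an assumption, not something the post states.

```c
#include <stdatomic.h>

/* Hypothetical status codes; only BACKGROUND_ERROR_PENDING is named in
   the post, the others are invented for this sketch. */
enum {
    LIB_OK = 0,
    LIB_IO_ERROR = 1,
    BACKGROUND_ERROR_PENDING = 2
};

/* Last error reported by the background thread.
   Static storage is zero-initialized, which equals LIB_OK. */
static atomic_int background_error;

/* Called from the background thread when a maintenance task fails. */
void report_background_error(int code)
{
    atomic_store(&background_error, code);
}

/* Returns the pending background error and clears it
   (clearing on read is an assumption of this sketch). */
int get_background_error(void)
{
    return atomic_exchange(&background_error, LIB_OK);
}

/* Example API function: it only reports that something is pending and
   never returns the background error code itself. */
int lib_do_something(void)
{
    if (atomic_load(&background_error) != LIB_OK)
        return BACKGROUND_ERROR_PENDING;
    /* ... the real work of the call would go here ... */
    return LIB_OK;
}
```

With this shape, every API function can return at most one extra code, and callers that care fetch the details separately.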

You might take a look at Java's Future API for an alternative mechanism for dealing with asynchronous calls and errors. You could easily substitute the checked exceptions with some isError() or getError() methods if you preferred.

I agree with Sean: a message queue with an event loop.
If there is an error, the background thread can insert it into the queue. The event loop will block until a new message becomes available.
I have used the Apache Portable Runtime (APR) with great success with this design in building telecoms servers with a high transaction rate. It has never failed.
I use one thread to insert into the queue; that would be your background thread. The event loop runs in another thread and blocks until a new message is inserted.
I would recommend an APR thread pool with an APR FIFO queue (which is also thread-safe).
Quick design here:
/* Called by the background thread when an error has occurred.
   `queue` (apr_queue_t *) and `data` (void *) are assumed to be set up elsewhere.
   Returns TRUE if the error record was queued, FALSE otherwise. */
int background_job(void)
{
    /* There has been an error: insert it into the queue */
    apr_status_t rv = apr_queue_push(queue, data);
    if (rv == APR_EOF) {
        MODULE_LOG(APK_PRIO_WARNING, "Message queue has been terminated");
        return FALSE;
    }
    else if (rv == APR_EINTR) {
        MODULE_LOG(APK_PRIO_WARNING, "Message queue was interrupted");
        return FALSE;
    }
    else if (rv != APR_SUCCESS) {
        char err_buf[BUFFER_SIZE];
        MODULE_LOG(APK_PRIO_CRITICAL, "Failed to push to queue %s", apr_strerror(rv, err_buf, BUFFER_SIZE));
        return FALSE;
    }
    return TRUE;
}
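/* Runs in its own thread: blocks until a message arrives, then handles it.
   Returns FALSE if the queue fails, TRUE once continue_loop is cleared. */
int evt_loop(void)
{
    while (continue_loop) {
        void *msg = NULL;
        /* apr_queue_pop() takes a void ** and blocks while the queue is empty */
        apr_status_t rv = apr_queue_pop(queue, &msg);
        if (rv == APR_EOF) {
            MODULE_LOG(APK_PRIO_WARNING, "Message queue has been terminated");
            return FALSE;
        }
        else if (rv == APR_EINTR) {
            MODULE_LOG(APK_PRIO_WARNING, "Message queue was interrupted");
            return FALSE;
        }
        else if (rv != APR_SUCCESS) {
            char err_buf[BUFFER_SIZE];
            MODULE_LOG(APK_PRIO_CRITICAL, "Failed to pop from the queue %s", apr_strerror(rv, err_buf, BUFFER_SIZE));
            return FALSE;
        }
        /* msg now points to the error record: handle it here, then loop again */
    }
    return TRUE;
}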
The above are just simple code snippets; if you want, I can post more complete code.
Hope that helps.


X11 drawing on resize [duplicate]

I'm writing a program that has an X11/Xlib interface, and my event processing loop looks like this:
while (XNextEvent(display, &ev) >= 0) {
switch (ev.type) {
// Process events
}
}
The problem is when the window is resized, I get a bunch of Expose events telling me which parts of the window to redraw. If I redraw them in direct response to the events, the redraw operation lags terribly because it is so slow (after resizing I get to see all the newly invalidated rectangles refresh one by one.)
What I would like to do is to record the updated window size as it changes, and only run one redraw operation on the entire window (or at least only two rectangles) when there are no more events left to process.
Unfortunately I can't see a way to do this. I tried this:
do {
    XPeekEvent(display, &ev);
    while (XCheckMaskEvent(display, ExposureMask | StructureNotifyMask, &ev)) {
        switch (ev.type) {
        // Process events, record but don't process redraw events
        }
    }
    // No more events, do combined redraw here
} while (1);
This does actually work, but it's a little inefficient, and if an event arrives that I am not interested in, the XCheckMaskEvent call doesn't remove it from the queue, so it stays there and stops XPeekEvent from blocking, resulting in 100% CPU use.
I was just wondering whether there is a standard way to achieve the delayed/combined redraw that I am after? Many of the Xlib event processing functions seem to block, so they're not really suitable to use if you want to do some processing just before they block, but only if they would block!
EDIT: For the record, this is the solution I used. It's a simplified version of n.m.'s:
while (XNextEvent(display, &ev) >= 0) {
switch (ev.type) {
// Process events, remember any redraws needed later
}
if (!XPending(display)) {
// No more events, redraw if needed
}
}
FWIW a UI toolkit such as GTK+ does it this way:
- for each window, it maintains a "damage region" (the union of all expose events)
- when the damage region becomes non-empty, it adds an "idle handler", which is a function the event loop will run when it doesn't have anything else to do
- the idle handler will run when the event queue is empty AND the X socket has nothing to read (according to poll() on ConnectionNumber(dpy))
- the idle handler of course repaints the damage region
In GTK+, they're changing this over to a more modern 3D-engine oriented way (clean up the damage region on vertical sync) in a future version, but it's worked in the fairly simple way above for many years.
When translated to raw Xlib, this looks about like n.m.'s answer: repaint when you have a damage region and !XPending(). So feel free to accept that answer; I just figured I'd add a little extra info.
If you wanted things like timers and idles, you could consider something like libev (http://software.schmorp.de/pkg/libev.html). It's designed so that you just drop a couple of source files into your app (it isn't set up to be an external dependency). You would add the display's file descriptor to the event loop.
For tracking damage regions, people often cut-and-paste the file "miregion.c", which is from the "machine independent" code in the X server. Just google for miregion.c or download the X server sources and look for it. A "region" here is simply a list of rectangles which supports operations such as union and intersect. To add damage, union it with the old region; to repair damage, subtract it, etc.
Try something like the following (not actually tested):
while (TRUE) {
if (XPending(display) || !pendingRedraws) {
// if an event is pending, fetch it and process it
// otherwise, we have neither events nor pending redraws, so we can
// safely block on the event queue
XNextEvent (display, &ev);
if (isExposeEvent(&ev)) {
pendingRedraws = TRUE;
}
else {
processEvent(&ev);
}
}
else {
// we must have a pending redraw
redraw();
pendingRedraws = FALSE;
}
}
It could be beneficial to wait for 10 ms or so before doing the redraw. Unfortunately the raw Xlib has no interface for timers. You need a higher-level toolkit for that (all toolkits including Xt have some kind of timer interface), or work directly with the underlying socket of the X11 connection.

SetWindowPos/ShowWindow with a timeout

I'm using the SetWindowPos function for an automation task to show a window. I know that there are two ways that Windows provides to do this:
Synchronously: SetWindowPos or ShowWindow.
Asynchronously: SetWindowPos with SWP_ASYNCWINDOWPOS or ShowWindowAsync.
Now, I'd like to get the best of both worlds: I want to be able to show the window synchronously, because I'd like it to be done when the function returns. But I don't want the call to hang my process - if it takes too long, I want to be able to abort the call.
While looking for an answer, the only thing I could come up with is using a separate thread and SendMessageTimeout, but even then, if the thread hangs, there's not much I can do to end it except TerminateProcess, which is not a clean solution.
I also have seen this answer, but as far as I understand, it has no alternative for native WinAPI.
The answer in the question you linked to simply loops until either the desired condition occurs or the timeout expires. It uses Sleep() every iteration to avoid hogging the processor. So a version for WinAPI can be written quite simply, as follows:
bool ShowWindowAndWait(HWND hWnd, DWORD dwTimeout) {
if (IsWindowVisible(hWnd)) return true;
if (!ShowWindowAsync(hWnd, SW_SHOW)) return false;
DWORD dwTick = GetTickCount();
do {
if (IsWindowVisible(hWnd)) return true;
Sleep(15);
} while (dwTimeout != 0 && GetTickCount() - dwTick < dwTimeout);
return false;
}
Unfortunately I think this is the best you're going to get. SendMessageTimeout can't actually be used for this purpose because (as far as I know anyway) there's no actual message you could send with it that would cause the target window to be shown. ShowWindowAsync and SWP_ASYNCWINDOWPOS both work by scheduling internal window events, and this API isn't publicly exposed.

What's the best way to asynchronously return a result (as a struct) that hasn't been fully "set up" (or processed) yet?

Alright, I honestly have tried looking up "Asynchronous Functions in C" (Results are for C# exclusively), but I get nothing for C. So I'm going to ask it here, but if there are better, already asked questions on StackExchange or what-have-you, please direct me to them.
So I'm teaching myself about concurrency and asynchronous functions and all that, so I'm attempting to create my own thread pool. So far, I'm still in the planning phase of it, and I'm trying to find a clear path to travel on, however I don't want a hand-out of code, I just want a nudge in the right direction (or else the exercise is pointless).
What would be the best way to asynchronously return from a function that isn't really "ready"? In that, it will return almost immediately, even if it's currently processing the task given by the user. The "task" is going to be a callback and arguments to fit the necessary pthread_t arguments needed, although I'll work on attributes later. The function returns a struct called "Result", which contains the void * return value and a byte (unsigned char) called "ready" which will hold values 0 and 1. So while "Result" is not "ready", then the user shouldn't attempt to process the item yet. Then again, the "item" can be NULL if the user returns NULL, but "ready" lets the user know it finished.
struct Result {
/// Determines whether or not it has been processed.
unsigned char ready;
/// The return type, NULL until ready.
void *item;
};
The struct isn't really complete, but it's a basic prototype embodying what I'm attempting to do. This isn't really the issue here though, although let me know if it's the wrong approach.
Next I have to actually process the thing, while not blocking until everything is finished. As I said, the function will create the Result, then asynchronously process it and return immediately (by returning this result). The problem is the asynchronous processing. I was thinking of spawning another thread inside of the thread_pool, but I feel it's missing the point of a thread pool, as it no longer remains simple.
Here's what I was thinking (which I've a feeling is grossly over-complicated). In the function add_task, spawn a new thread (Thread A) with a passed sub_process struct, then return the non-processed but initialized result. The spawned thread will also spawn another thread (see the problem? This is Thread B) with the original callback and arguments, and join Thread A with Thread B to capture its return value, which is then stored in the result's item member. Since the result will be pointing to the very same struct the user holds, it shouldn't be a problem.
My problem is that it spawns 2 threads instead of being able to do it in 1, so I'm wondering if I'm doing this wrong and complicating things. Is there a better way to do this? Does the pthread library have a function which will do this asynchronously for me? Anyway, the prototype Sub_Process struct is below.
/// Makes it easier than having to retype everything.
typedef void *(*thread_callback)(void *args);
struct Sub_Process {
    /// Result to be processed (in C the struct tag is required without a typedef).
    struct Result *result;
    /// Thread callback to be processed
    thread_callback cb;
    /// Arguments to be passed to the callback
    void *args;
};
Am I doing it wrong? I've a feeling I'm missing the whole point of a Thread_Pool. Another question is, is there a way to spawn a thread that is created, but waiting and not doing anything? I was thinking of handling this by creating all of the threads by having them just wait in a processing function until called, but I've a feeling this is the wrong way to go about this.
To further elaborate, I'll also post some pseudocode of what I'm attempting here
Notes: Was recommended I post this question here for an answer, so it's been copy and pasted, lemme know if there is any faulty editing.
Edit: No longer spawns another thread, instead calls callback directly, so the extra overhead of another thread shouldn't be a problem.
I presume your intention is that a thread will request the asynchronous work to be performed, then go on to perform some different work itself until the point where it requires the result of the asynchronous operation in order to proceed.
In this case, you need a way for the requesting thread to stop and wait for the Result to be ready. You can do this by embedding a mutex and condition variable pair inside the Result:
struct Result {
/// Lock to protect contents of `Result`
pthread_mutex_t lock;
/// Condition variable to signal result being ready
pthread_cond_t cond;
/// Determines whether or not it has been processed.
unsigned char ready;
/// The return type, NULL until ready.
void *item;
};
When the requesting thread reaches the point that it requires the asynchronous result, it uses the condition variable:
pthread_mutex_lock(&result->lock);
while (!result->ready)
pthread_cond_wait(&result->cond, &result->lock);
pthread_mutex_unlock(&result->lock);
You can wrap this inside a function that waits for the result to be available, destroys the mutex and condition variable, frees the Result structure and returns the return value.
The corresponding code in the thread pool thread when the processing is finished would be:
pthread_mutex_lock(&result->lock);
result->item = item;
result->ready = 1;
pthread_cond_signal(&result->cond);
pthread_mutex_unlock(&result->lock);
Another question is, is there a way to spawn a thread that is created,
but waiting and not doing anything? I was thinking of handling this by
creating all of the threads by having them just wait in a processing
function until called, but I've a feeling this is the wrong way to go
about this.
No, you're on the right track here. The mechanism to have the thread pool threads wait around for some work to be available is the same as the above - condition variables.
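Putting the pieces of this answer together, a wait function of the kind described could look roughly like the sketch below. The function names (`result_create`, `result_complete`, `result_wait`, `example_worker`) are invented for illustration, and the worker here is a stand-in for a real pool thread.

```c
#include <pthread.h>
#include <stdlib.h>

struct Result {
    pthread_mutex_t lock;   /* protects ready and item */
    pthread_cond_t cond;    /* signalled when ready becomes 1 */
    unsigned char ready;
    void *item;
};

/* Allocates a Result that is not yet ready. Returns NULL on failure. */
struct Result *result_create(void)
{
    struct Result *r = malloc(sizeof *r);
    if (r == NULL)
        return NULL;
    pthread_mutex_init(&r->lock, NULL);
    pthread_cond_init(&r->cond, NULL);
    r->ready = 0;
    r->item = NULL;
    return r;
}

/* Called by the pool thread once the task has finished. */
void result_complete(struct Result *r, void *item)
{
    pthread_mutex_lock(&r->lock);
    r->item = item;
    r->ready = 1;
    pthread_cond_signal(&r->cond);
    pthread_mutex_unlock(&r->lock);
}

/* Blocks until the Result is ready, then destroys the mutex and
   condition variable, frees the Result and returns the stored item.
   The Result must not be used after this call. */
void *result_wait(struct Result *r)
{
    pthread_mutex_lock(&r->lock);
    while (!r->ready)
        pthread_cond_wait(&r->cond, &r->lock);
    void *item = r->item;
    pthread_mutex_unlock(&r->lock);

    pthread_mutex_destroy(&r->lock);
    pthread_cond_destroy(&r->cond);
    free(r);
    return item;
}

/* Stand-in for a pool thread: completes the Result it is handed. */
void *example_worker(void *arg)
{
    static int answer = 42;
    result_complete(arg, &answer);
    return NULL;
}
```

Because result_wait frees the structure, this sketch assumes exactly one waiter per Result; supporting several waiters would need reference counting or a separate destroy call.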

Multi media timer works fine in release mode but not on debug mode

I'm trying to use mmTimer with a callback function, which is a static CALLBACK function.
I know that a static function cannot call a non-static function (thanks to all of you), except when the static function receives a pointer to an object as an argument.
The weird thing is that my timer works fine in release mode, but when I run it in debug mode an unhandled exception pops up and the program breaks.
void CMMTimerDlg::TimerProc(UINT uID, UINT uMsg, DWORD dwUser, DWORD dw1, DWORD dw2)
{
CMMTimerDlg* p = (CMMTimerDlg*)dwUser;
if(p)
{
p->m_MMTimer += p->m_TimeDelay;
p->UpdateData(FALSE);
}
}
My questions are:
- Is there any way to resolve this problem?
- If this error occurs in debug mode, what guarantees that it won't happen once I release the program?
This is where the program stops:
#ifdef _DEBUG
void CWnd::AssertValid() const
{
if (m_hWnd == NULL)
return; // null (unattached) windows are valid
// check for special wnd??? values
ASSERT(HWND_TOP == NULL); // same as desktop
if (m_hWnd == HWND_BOTTOM)
ASSERT(this == &CWnd::wndBottom);
else if (m_hWnd == HWND_TOPMOST)
ASSERT(this == &CWnd::wndTopMost);
else if (m_hWnd == HWND_NOTOPMOST)
ASSERT(this == &CWnd::wndNoTopMost);
else
{
// should be a normal window
ASSERT(::IsWindow(m_hWnd));
// should also be in the permanent or temporary handle map
CHandleMap* pMap = afxMapHWND();
ASSERT(pMap != NULL);
When it gets to pMap, it stops at that assertion.
Here is the static CALLBACK function:
static void CALLBACK TimerProc(UINT uID, UINT uMsg, DWORD dwUser, DWORD dw1, DWORD dw2);
Here is how I set the timer:
UINT unTimerID = timeSetEvent(m_TimeDelay,1,(LPTIMECALLBACK)TimerProc,(DWORD)this,TIME_PERIODIC);
The problem here is that the multimedia timer API, unlike many others, restricts what you are allowed to do inside the callback. You are not allowed much: basically you can update internal structures, produce some debug output, and set a synchronization event.
Remarks
Applications should not call any system-defined functions from inside
a callback function, except for PostMessage, timeGetSystemTime,
timeGetTime, timeSetEvent, timeKillEvent, midiOutShortMsg,
midiOutLongMsg, and OutputDebugString.
Assertion failures display message boxes, which are not allowed and can eventually crash the process. Additionally, windowing API functions such as IsWindow and friends are not allowed either, and they are the initial cause that leads to the assertion failures.
The best approach here is to avoid multimedia timers altogether. In most cases you have less restrictive alternatives.
It only looks like your code works in the Release build, it will not assert() that you are doing it right. And you are not doing it right.
The callback from a multi-media timer runs on an arbitrary thread-pool thread. You have to be very careful about what you do in the callback. For one, you cannot directly touch the UI, that code is fundamentally thread-unsafe. So you most certainly cannot call UpdateData(). At best, you update a variable and let the UI thread know that it needs to refresh the window. Use PostMessage(). In general you need a critical section to ensure that your callback doesn't update that variable while the UI thread is using it to update the window.
The assert you get in the Debug build suggests more trouble. Looks like you are not making sure that the timer can no longer callback when the user closes the window. That's pretty hard to solve cleanly, it is a fundamental threading race. PostMessage() will already keep you out of the worst trouble. To do it perfectly clean, you must prevent the window from closing until you know that the timer cannot callback anymore. Which requires setting an event when you get WM_CLOSE and not call DestroyWindow. The timer's callback needs to check that event, call timeKillEvent() and post another message. Which the UI thread can now use to really close the window.
Threading is hard, do make sure that SetTimer() isn't already good enough to get the job done. It certainly will be if the UI update is the only side-effect. You only need timeSetEvent() when you require an accurate timer that needs to do something that is not UI related. Human eyes just don't have that requirement. Only our ears do.

Stop a CPU-intensive operation by pressing a GTK button

I'm extending a GTK-application that does a group of operations that takes high CPU loads. I want to include the possibility to stop this operation by clicking on a button in the GUI.
The problem is that, as expected, the signal coming from the button is actually fired just after the operation is completed.
For now, the code kinda looks like this:
[...]
// code snippet to show the dialog and to enable user interactions with the buttons on the lower side of the window
while(TRUE) {
gint run = gtk_dialog_run (window_main);
if (run == GTK_RESPONSE_APPLY) {
gboolean success = start_long_operation();
}
else if (run == GTK_RESPONSE_HELP) {
open_about();
}
else if (run == GTK_RESPONSE_CANCEL) {
stop_long_operation();
}
else {
gtk_widget_destroy (window_main);
return;
}
}
I've declared a global variable busy_state that is checked by the long operation's function: if it is TRUE, simply the inner loop continues to cycle. Else, the loop exits and the function returns a result.
stop_long_operation() simply sets this global var to FALSE.
As written before, I can't press the "stop" button and "send" GTK_RESPONSE_CANCEL until the operation finishes, because it blocks the entire window.
I've tried the use of while (g_main_context_iteration(NULL, FALSE)) trick inside the stop_long_operation() function, as explained in the gtk's docs, but without results.
Do I really need to set up a multithread functionality? Can I avoid this?
Thanks for the help.
If you can break up your long operation into multiple smaller tasks, you may be able to avoid using threads. The easiest way would be to create a callback that you pass to g_idle_add (or g_idle_add_full). Each time the callback runs, it does a small amount of work, then returns TRUE. When the task is completed, return FALSE and the callback will not be run again. When you would like to interrupt the task, simply remove the callback by passing the value returned by g_idle_add to g_source_remove.
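To illustrate the "small amount of work per call" idea, here is a toolkit-independent sketch of such a step function. It is written without GLib so that it compiles on its own; the task (summing an array) and the names `sum_task`, `sum_task_step` and `CHUNK` are invented for the example. In a GTK program, a thin wrapper with the GSourceFunc shape would call the step function and return its result, and that wrapper is what would be handed to g_idle_add.

```c
#include <stddef.h>

/* Hypothetical task state: sums an array a few elements at a time. */
typedef struct {
    const int *data;
    size_t count;
    size_t next;      /* index of the next element to process */
    long total;       /* running result */
} sum_task;

enum { CHUNK = 4 };   /* elements handled per call; an arbitrary choice */

/* Does one small slice of the work and returns quickly.
   Returns 1 if it should be called again and 0 when the task is
   complete, mirroring the TRUE/FALSE convention of an idle callback. */
int sum_task_step(sum_task *t)
{
    size_t end = t->next + CHUNK;
    if (end > t->count)
        end = t->count;
    while (t->next < end)
        t->total += t->data[t->next++];
    return t->next < t->count;
}
```

Because all progress lives in the struct, interrupting the task only requires that the function is no longer called; nothing needs to be unwound.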
If you can't break up the operation then threads are pretty much your only choice. g_thread_new is the low-level way to do that, but it's generally easier to use a GThreadPool. A more advanced option would be to use g_simple_async_result_run_in_thread.
Here's another option if you don't want to use threads (although you should use threads and this is very insecure):
Use processes. Processes are much simpler, and can allow you some greater flexibility. Here's what you need to do:
1) Create another C/C++/Any language you want program that does the task
2) Spawn it using spawn() or popen()
3) (Optional) Pass arguments using the command line, or IPC
4) When the button is pressed, use either the kill() call on UNIX, or the Win32 kill function to kill the process. You can use SIGTERM on UNIX and register a handler so that you can have a controlled shutdown.
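On UNIX, the steps above could be sketched as follows. This version uses fork() and execvp() rather than spawn() or popen(), because the parent needs the child's pid for kill(); that substitution, and the helper names `start_worker` and `stop_worker`, are choices made for this example only.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts argv[0] as a child process (searched for in PATH).
   Returns the child's pid, or -1 if fork() failed. */
pid_t start_worker(char *const argv[])
{
    pid_t pid = fork();
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);            /* only reached if exec failed */
    }
    return pid;
}

/* Asks the worker to stop with SIGTERM and waits for it to finish.
   Returns the wait status (for WIFSIGNALED/WIFEXITED), or -1 on error. */
int stop_worker(pid_t pid)
{
    int status;
    if (kill(pid, SIGTERM) != 0)
        return -1;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return status;
}
```

In a GTK program, stop_worker would be called from the cancel button's handler. If the worker program installs a SIGTERM handler, it can clean up and exit normally, in which case the status reports a normal exit instead of termination by signal.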
