I'm learning about MQTT (specifically the paho C library) by reading and experimenting with variations on the async pub/sub examples.
What's the difference between the MQTTAsync_deliveryComplete callback that you set with MQTTAsync_setCallbacks() vs. the MQTTAsync_onSuccess or MQTTAsync_onSuccess5 callbacks that you set in the MQTTAsync_responseOptions struct that you pass to MQTTAsync_sendMessage() ?
All seem to deal with "successful delivery" of published messages, but from reading the example code and doxygen, I can't tell how they relate to or conflict with or supplement each other. Grateful for any guidance.
Basically MQTTAsync_deliveryComplete and MQTTAsync_onSuccess do the same thing: they notify you via callback about the delivery of a message. Both callbacks are executed asynchronously, on a thread separate from the one on which the client application is running.
(In the current version of the Paho client both callbacks even use the same thread, but this is an undocumented implementation detail. That thread is of course not the application thread; otherwise it would not be an asynchronous callback.)
The difference is that the MQTTAsync_deliveryComplete callback is set once via MQTTAsync_setCallbacks, and you are then informed about every delivery of a message.
In contrast, MQTTAsync_onSuccess informs you once, for exactly the message that you sent via MQTTAsync_sendMessage().
You can even define both callbacks, which will both be called when a message is delivered.
This gives you the flexibility to choose the approach that best suits your needs.
Artificial example
Suppose you have three different functions, each sending a specific type of message (e.g. sendTemperature(), sendHumidity(), sendAirPressure()), and each of them calls MQTTAsync_sendMessage. If you want a matching callback function to run after each delivery, choose MQTTAsync_onSuccess. You then do not need to keep track of each MQTTAsync_token and associate it with your callbacks.
If you want to implement a logging function instead, MQTTAsync_deliveryComplete is more useful because it is called for every delivery.
And of course one can imagine that one would want to have both the specific one with some actions and the generic one for logging, so in this case both variants could be used at the same time.
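To make the relationship concrete, here is a minimal toy model in plain C. It does not use the Paho library at all: toy_client, toy_send_options, toy_send and the two hook functions are hypothetical names invented for this sketch, and delivery is reported immediately instead of later from a library thread. It only shows that a hook set once on the client fires for every message, while a hook passed with one send fires for that message only.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names come from the Paho library. */
typedef void (*toy_delivery_cb)(void *context, int token);
typedef void (*toy_success_cb)(void *context, int token);

typedef struct {
    toy_delivery_cb on_any_delivery; /* set once, fires for every message */
    void *context;
    int next_token;
} toy_client;

typedef struct {
    toy_success_cb on_success; /* optional, fires for this message only */
    void *context;
} toy_send_options;

/* "Send" a message and report its delivery right away. A real client
 * would report delivery later, from its own thread. */
int toy_send(toy_client *c, const toy_send_options *opts)
{
    assert(c != NULL);
    int token = c->next_token++;
    if (opts != NULL && opts->on_success != NULL)
        opts->on_success(opts->context, token);
    if (c->on_any_delivery != NULL)
        c->on_any_delivery(c->context, token);
    return token;
}

/* Generic hook: counts every delivery, e.g. for logging. */
void count_delivery(void *context, int token)
{
    (void)token;
    ++*(int *)context;
}

/* Specific hook: remembers the token of one particular message. */
void remember_token(void *context, int token)
{
    *(int *)context = token;
}
```

With count_delivery installed on the client and remember_token passed to only one of two sends, the counter ends up at 2 while the remembered token is that of the first message.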
Documentation
You should note that the documentation of MQTTAsync_deliveryComplete explicitly states that it takes the Quality of Service setting into account. The MQTTAsync_onSuccess documentation does not say so, which of course does not mean that the implementation ignores it. If this is important to you, check the source code.
Related
I have a use case where I need to apply multiple functions to every incoming message, each producing 0 or more results.
Having a loop won't scale for me, and ideally I would like to be able to emit results as soon as they are ready instead of waiting for all the functions to be applied.
I thought about using AsyncIO for this, maintaining a ThreadPool, but if I am not mistaken I can only emit one record using this API. That is not a deal-breaker, but I'd like to know if there are other options, like using a ThreadPool in a Map/Process function so that I can send the results as they are ready.
Would this be an anti-pattern, or cause any problems with regard to checkpointing or at-least-once guarantees?
Depending on the number of different functions involved, one solution would be to fan each incoming message out to n operators, each applying one of the functions.
I fear you'll get into trouble if you try this with a multi-threaded map/process function.
How about this instead:
You could have something like a RichCoFlatMap (or KeyedCoProcessFunction, or BroadcastProcessFunction) that is aware of all of the currently active functions, and for each incoming event, emits n copies of it, each being enriched with info about a specific function to be performed. Following that can be an async i/o operator that has a ThreadPool, and it takes care of executing the functions and emitting results if and when they become available.
I am implementing a use case in Flink stateful functions. My specification says that a stateful function f starts a business workflow (in other words, a group of stateful functions f1, f2, … fn that are called sequentially, in parallel, or both). Stateful function f waits for a result to be returned so that it can update a local state; it also starts a timeout callback, i.e. a message to itself. At timeout, f checks whether the local state has been updated (i.e. it has received a result); if so, life is good.
However, if at timeout f discovers that it has not received a result yet, it has to launch a compensating workflow to undo any changes that stateful functions f1, f2, … fn might have made.
Does the Flink stateful functions framework support such a design pattern/use case, or should it be implemented at the application level? What is the simplest design to achieve such a solution? For instance, how do I know which of the workflow's stateful functions f1, f2, … fn were affected by the timed-out invocation (where the control flow has timed out)? How do Flink stateful functions and the concept of integrated messaging and state facilitate such a pattern?
Thank you.
I posted the question on the Apache Flink mailing list and got the following response from Igal Shilman. Thanks to Igal.
The first thing that I would like to mention is that, if your original
motivation for that scenario is a concern about transient failures such as:
did function Y ever receive a message sent by function X?
did sending a message fail?
is the target function there to accept a message sent to it?
did the order of messages get mixed up?
etc.
Then, StateFun eliminates all of these problems and a whole class of
transient errors that otherwise you would have to deal with by yourself in
your business logic (like retries, backoffs, service discovery etc.).
Now if your motivating scenario is not about transient errors but more
about transactional workflows, then as Dawid mentioned you would have to
implement this in your application logic. I think that the way you have
described the flow should map directly to a coordinating function (per flow
instance) that keeps track of results/timeouts in its internal state.
Here is a sketch:
A Flow Coordinator Function - it would be invoked with the input
necessary to kick off a flow. It would start invoking the relevant
functions (as defined by the flow's DAG) and would keep an internal state
indicating what functions (addresses) were invoked and their completion
statuses. When the flow completes successfully the coordinator can safely
discard its state.
In any case where the coordinator decides to abort the flow (an internal
timeout / an external message / etc.) it would have to check its internal
state and kick off a compensating workflow (sending a special message to
the already succeeded/in progress functions).
Each function in the flow has to accept a message from the coordinator,
in turn, and reply with either a success or a failure.
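The bookkeeping part of that sketch can be modelled in a few lines of plain C. This is not StateFun code and uses none of its APIs: flow_state, step_status and the flow_* functions are hypothetical names, steps are identified by array index instead of function addresses, and timers and messaging are left out entirely. It only shows the state the coordinator needs in order to decide which functions must receive the compensating message.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_STEPS 8

typedef enum {
    STEP_NOT_INVOKED,
    STEP_IN_PROGRESS,
    STEP_SUCCEEDED,
    STEP_FAILED
} step_status;

/* Per-flow coordinator state: one status per function in the flow. */
typedef struct {
    step_status steps[MAX_STEPS];
    size_t step_count;
} flow_state;

void flow_init(flow_state *f, size_t step_count)
{
    assert(step_count <= MAX_STEPS);
    f->step_count = step_count;
    for (size_t i = 0; i < step_count; i++)
        f->steps[i] = STEP_NOT_INVOKED;
}

/* The coordinator has sent a message to this step. */
void flow_invoke(flow_state *f, size_t step)
{
    assert(step < f->step_count);
    f->steps[step] = STEP_IN_PROGRESS;
}

/* The step replied with success or failure. */
void flow_reply(flow_state *f, size_t step, int succeeded)
{
    assert(step < f->step_count);
    f->steps[step] = succeeded ? STEP_SUCCEEDED : STEP_FAILED;
}

/* The flow is complete once every step has reported success. */
int flow_is_complete(const flow_state *f)
{
    for (size_t i = 0; i < f->step_count; i++)
        if (f->steps[i] != STEP_SUCCEEDED)
            return 0;
    return 1;
}

/* On abort: collect the steps that already succeeded or are still in
 * progress; these are the ones that need the compensating message. */
size_t flow_steps_to_compensate(const flow_state *f, size_t *out, size_t out_cap)
{
    size_t n = 0;
    for (size_t i = 0; i < f->step_count && n < out_cap; i++)
        if (f->steps[i] == STEP_IN_PROGRESS || f->steps[i] == STEP_SUCCEEDED)
            out[n++] = i;
    return n;
}
```

On timeout the coordinator would check flow_is_complete and, if it returns 0, send the compensating message to each step returned by flow_steps_to_compensate. Steps that were never invoked are skipped.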
I've been working with the Wayland protocol lately and many functions include a uint32_t serial parameter. Here's an example from wayland-client-protocol.h:
struct wl_shell_surface_listener {
    /**
     * ping client
     *
     * Ping a client to check if it is receiving events and sending
     * requests. A client is expected to reply with a pong request.
     */
    void (*ping)(void *data,
                 struct wl_shell_surface *wl_shell_surface,
                 uint32_t serial);
    // ...
};
The intent of this parameter is that a client responds to the display server with a pong, passing back the value of serial. The server then compares the serial it received via the pong with the serial it sent with the ping.
There are numerous other functions that include such a serial parameter. Furthermore, implementations of other functions within the API often increment the global wl_display->serial property to obtain a new serial value before doing some work. My question is, what is the rationale for this serial parameter, in a general sense? Does it have a name? For example, is this an IPC thing, or a common practice in event-driven / asynchronous programming? Is it kind of like the XCB "cookie" concept for asynchronous method calls? Is this technique found in other programs (cite examples please)?
Another example is in glut, see glutTimerFunc discussed here as a "common idiom for asynchronous invocation." I'd love to know if this idiom has a name, and where (good citations please) it's discussed as a best practice or technique in asynchronous / event-driven programming, such as continuations or "signals and slots." Or, for example, how shared resource counts are just integers, but we consider them to be "semaphores."
You may find this helpful
Some actions that a Wayland client may perform require a trivial form
of authentication in the form of input event serials. For example, a
client which opens a popup (a context menu summoned with a right click
is one kind of popup) may want to "grab" all input events server-side
from the affected seat until the popup is dismissed. To prevent abuse
of this feature, the server can assign serials to each input event it
sends, and require the client to include one of these serials in the
request.
When the server receives such a request, it looks up the input event
associated with the given serial and makes a judgement call. If the
event was too long ago, or for the wrong surface, or wasn't the right
kind of event — for example, it could reject grabs when you wiggle the
mouse, but allow them when you click — it can reject the request.
From the server's perspective, they can simply send an incrementing
integer with each input event, and record the serials which are
considered valid for a particular use-case for later validation. The
client receives these serials from their input event handlers, and can
simply pass them back right away to perform the desired action.
https://wayland-book.com/seat.html#event-serials
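The server-side bookkeeping described in that passage could look roughly like the sketch below. It is a self-contained toy and not code from any Wayland compositor: toy_server, event_kind, server_send_event and server_accept_grab are hypothetical names, and the policy (only a recent button press authorises a grab, with the last 16 events remembered) is an assumption chosen to mirror the "click but not wiggle" example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SERIAL_HISTORY 16

typedef enum { EV_POINTER_MOTION, EV_BUTTON_PRESS } event_kind;

typedef struct {
    uint32_t serial;
    event_kind kind;
} issued_event;

typedef struct {
    uint32_t next_serial;
    issued_event history[SERIAL_HISTORY]; /* ring buffer of recent events */
    size_t count;                         /* total events issued so far */
} toy_server;

/* Send an input event: hand out the next serial and remember it. */
uint32_t server_send_event(toy_server *s, event_kind kind)
{
    assert(s != NULL);
    uint32_t serial = s->next_serial++;
    s->history[s->count % SERIAL_HISTORY] = (issued_event){ serial, kind };
    s->count++;
    return serial;
}

/* Accept a grab request only if its serial belongs to a remembered
 * button press. Unknown or overwritten (too old) serials are rejected. */
int server_accept_grab(const toy_server *s, uint32_t serial)
{
    assert(s != NULL);
    size_t kept = s->count < SERIAL_HISTORY ? s->count : SERIAL_HISTORY;
    for (size_t i = 0; i < kept; i++) {
        if (s->history[i].serial == serial)
            return s->history[i].kind == EV_BUTTON_PRESS;
    }
    return 0;
}
```

The client side stays trivial: it receives the serial in its input event handler and passes it back unchanged with the request.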
As Hans Passant and Tom Zych state in the comments, the argument distinguishes one asynchronous invocation from another.
I'm still curious about the deeper question, which is whether this technique is commonly used in asynchronous / event-driven software, and whether it has a well-known name.
When do you use a callback function? I know how they work, I have seen them in use and I have used them myself many times.
An example from the C world would be libcurl which relies on callbacks for its data retrieval.
An opposing example would be OpenSSL: where I have used it, I used out parameters:
ret = somefunc(&target_value);
if (ret != 0) {
    /* error case */
}
I am wondering when to use which? Is a callback only useful for async stuff? I am currently in the process of designing my application's API and I am wondering whether to use a callback or just an out parameter. Under the hood it will use libcurl and OpenSSL as the main libraries it builds on and the parameter "returned" is an OpenSSL data type.
I don't see any benefit of a callback over just returning. Is this only useful, if I want to process the data in any way instead of just giving it back? But then I could process the returned data. Where is the difference?
In the simplest case, the two approaches are equivalent. But if the callback can be called multiple times to process data as it arrives, then the callback approach provides greater flexibility, and this flexibility is not limited to async use cases.
libcurl is a good example: it provides an API that allows specifying a callback for all newly arrived data. The alternative, as you present it, would be to just return the data. But return it — how? If the data is collected into a memory buffer, the buffer might end up very large, and the caller might have only wanted to save it to a file, like a downloader. If the data is saved to a file whose name is returned to the caller, it might incur unnecessary IO if the caller in fact only wanted to store it in memory, like a web browser showing an image. Either approach is suboptimal if the caller wanted to process data as it streams, say to calculate a checksum, and didn't need to store it at all.
The callback approach allows the caller to decide how the individual chunks of data will be processed or assembled into a larger whole.
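A small synchronous sketch of that idea, in plain C without libcurl: deliver_in_chunks stands in for a transfer that hands data to a caller-supplied callback piece by piece, and the two callbacks decide what to do with each piece. All names here (chunk_cb, deliver_in_chunks, sum_bytes, count_chunks) are made up for the illustration, and the byte sum is just a stand-in for a real checksum.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Called once per chunk; `user` is whatever the caller passed in. */
typedef void (*chunk_cb)(const unsigned char *data, size_t len, void *user);

/* Stand-in for a transfer: delivers buf in pieces of at most chunk_size
 * bytes. Nothing here is asynchronous. */
void deliver_in_chunks(const unsigned char *buf, size_t len,
                       size_t chunk_size, chunk_cb cb, void *user)
{
    assert(chunk_size > 0 && cb != NULL);
    for (size_t off = 0; off < len; off += chunk_size) {
        size_t n = len - off < chunk_size ? len - off : chunk_size;
        cb(buf + off, n, user);
    }
}

/* Caller policy 1: keep a running byte sum, store nothing. */
void sum_bytes(const unsigned char *data, size_t len, void *user)
{
    uint32_t *sum = user;
    for (size_t i = 0; i < len; i++)
        *sum += data[i];
}

/* Caller policy 2: only count how many chunks arrived. */
void count_chunks(const unsigned char *data, size_t len, void *user)
{
    (void)data;
    (void)len;
    ++*(size_t *)user;
}
```

The same deliver_in_chunks call serves both callers; an API that returned one big buffer would force the first caller to hold all the data in memory just to add it up.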
Callbacks are useful for asynchronous notification. When you register a callback with some API, you are expecting that callback to be run when some event occurs. In the same vein, you can use them as an intermediate step in a data processing pipeline (similar to an 'insert' if you're familiar with the audio/recording industry).
So, to summarise, these are the two main paradigms that I have encountered and/or implemented callback schemes for:
I will tell you when data arrives or some event occurs - you use it as you see fit.
I will give you the chance to modify some data before I deal with it.
If the value can be returned immediately then yes, there is no need for a callback. As you surmised, callbacks are useful in situations wherein a value cannot be returned immediately for whatever reason (perhaps it is just a long running operation which is better performed asynchronously).
My take on this: I see it as a question of which module has to know about which. Let's call them Data-User and IO.
Assume you have some IO, where data comes in. The IO-Module might not even know who is interested in the data. The Data-User however knows exactly which data it needs. So the IO should provide a function like subscribe_to_incoming_data(func) and the Data-User module will subscribe to the specific data the IO-Module has. The alternative would be to change code in the IO-Module to call the Data-User. But with existing libs you definitely don't want to touch existing code that someone else has provided to you.
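A minimal sketch of such a subscription interface in C follows. The io_module struct, the extra io and user parameters, the fixed subscriber limit and io_data_arrived are assumptions added for the example; only the name subscribe_to_incoming_data comes from the text above. The point is that the IO side stores and calls function pointers without knowing anything about the Data-User.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SUBSCRIBERS 4

/* Handler signature the IO module promises to call. */
typedef void (*data_handler)(int value, void *user);

typedef struct {
    data_handler fn;
    void *user;
} subscriber;

typedef struct {
    subscriber subs[MAX_SUBSCRIBERS];
    size_t count;
} io_module;

/* Register a handler. Returns 0 on success, -1 if fn is NULL or the
 * subscriber table is full. */
int subscribe_to_incoming_data(io_module *io, data_handler fn, void *user)
{
    assert(io != NULL);
    if (fn == NULL || io->count == MAX_SUBSCRIBERS)
        return -1;
    io->subs[io->count].fn = fn;
    io->subs[io->count].user = user;
    io->count++;
    return 0;
}

/* Called by the IO module itself when data comes in: notify everyone. */
void io_data_arrived(io_module *io, int value)
{
    assert(io != NULL);
    for (size_t i = 0; i < io->count; i++)
        io->subs[i].fn(value, io->subs[i].user);
}

/* Example Data-User handler: just remembers the latest value. */
void keep_latest(int value, void *user)
{
    *(int *)user = value;
}
```

The Data-User calls subscribe_to_incoming_data(&io, keep_latest, &latest) once; after that, every io_data_arrived call updates latest without any Data-User code living inside the IO module.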
In the GObject Reference Manual, it says that for the function:
g_signal_connect(instance, detailed_signal, c_handler, data)
A detailed_signal string parameter of form "signal-name::detail" is desired. My initial understanding of that is that there are predefined signal details to pass in. If that is the case, where can I find a list of these? If not, then what exactly does it mean, as the manual doesn't make that too terribly obvious.
The ::detail part of the signal name is optional. If a signal takes a detail parameter, then it will say so in the signal's documentation. Otherwise you can ignore it.
The only signal I'm aware of that actually uses a detail parameter is the notify signal of GObject. The notify signal without a detail fires whenever any property changes on the object, so it's fairly useless. But if you connect to the notify::visible signal, then it will fire whenever the object's visible property changes.
Unless things have changed a lot recently, there's no complete, official list of signals. The predefined signals depend entirely on what technologies you're using.
What you can do is look at the online documentation for the GObject instance classes you're working with. For instance, if you're working with GtkButton, you can look it up online and find out that it emits six signals (activate, clicked, enter, leave, pressed, released). GtkButton is derived from GtkContainer, which also emits several documented signals that can potentially be emitted by GtkButton. And GtkContainer is derived from GtkWidget, which emits many documented signals which can potentially be emitted by GtkButton.
If you find an object isn't emitting a kind of signal you expect, you might also look in the source code for that object, because sometimes objects emit undocumented signals,