x.stop_sequences() is causing this UVM FATAL Item_done() called with no outstanding requests - uvm

x.stop_sequences() is causing this
UVM FATAL Item_done() called with no outstanding requests. Each call
to item_done() must be paired with a previous call to get_next_item()
Can someone tell me how to use stop_sequences while making sure the driver is inactive?

I don't think there is any built-in mechanism; you have to write the code yourself. Basically, you need to implement a reset or interrupt mechanism in your driver. Here is a skeleton idea:
task run_phase (uvm_phase phase);
forever begin
#(posedge <ENABLE INPUT>);
fork
<DO DRIVERY THINGS>;
join_none
#(negedge <ENABLE INPUT>);
disable fork;
end
endtask: run_phase

In addition to #mathew-taylor's suggestion, you may need to also consider the monitor, since it will need to discard partially assembled data collections.
If you have a reactive driver, this gets even trickier. It would be prudent to provide an boolean validity attribute in your transactions. Construction would set it to true (1'b1). If responses are outstanding upon reset, send all the outstanding responses after setting the validity field to false (1'b0). This will keep the sequencer from jamming. Any consumer of transaction data would then need to examine the validity. To simplify, you could build in the check via accessor functions and make all attributes local. This would also work on the monitor.

Related

Flink stateful functions : compensating callback on a timeout

I am implementing a use case in Flink stateful functions. My specification highlights that starting from a stateful function f a business workflow (in other words a group of stateful functions f1, f2, … fn are called either sequentially or in parallel or both ). Stateful function f waits for a result to be returned to update a local state, it as well starts a timeout callback i.e. a message to itself. At timeout, f checks if the local state is updated (it has received a result), if this is the case life is good.
However, if at timeout f discovers that it has not received a result yet, it has to launch a compensating workflow to undo any changes that stateful functions f1, f2, … fn might have received.
Does Flink stateful functions framework support such as a design pattern/use case, or it should be implemented at the application level? What is the simplest design to achieve such a solution? For instance, how to know what functions of the workflow stateful functions f1, f2, … fn were affected by the timedout invocation (where the control flow has been timed out)? How does Flink sateful functions and the concept of integrated messaging and state facilitate such a pattern?
Thank you.
I posted the question on Apache Flink mailing list and got the following response by Igal Shilman, Thanks to Igal.
The first thing that I would like to mention is that, if your original
motivation for that scenario is a concern of a transient failures such as:
did function Y ever received a message sent by function X ?
did sending a message failed?
did the target function is there to accept a message sent to it?
did the order of message got mixed up?
etc'
Then, StateFun eliminates all of these problems and a whole class of
transient errors that otherwise you would have to deal with by yourself in
your business logic (like retries, backoffs, service discovery etc').
Now if your motivating scenario is not about transient errors but more
about transactional workflows, then as Dawid mentioned you would have to
implement
this in your application logic. I think that the way you have described the
flow should map directly to a coordinating function (per flow instance)
that keeps track of results/timeouts in its internal state.
Here is a sketch:
A Flow Coordinator Function - it would be invoked with the input
necessary to kick off a flow. It would start invoking the relevant
functions (as defined by the flow's DAG) and would keep an internal state
indicating
what functions (addresses) were invoked and their completion statues.
When the flow completes successfully the coordinator can safely discard its
state.
In any case that the coordinator decides to abort the flow (an internal
timeout / an external message / etc') it would have to check its internal
state and kick off a compensating workflow (sending a special message to
the already succeed/in progress functions)
Each function in the flow has to accept a message from the coordinator,
in turn, and reply with either a success or a failure.

Using Broadcast State To Force Window Closure Using Fake Messages

Description:
Currently I am working on using Flink with an IOT setup. Essentially, devices are sending data such as (device_id, device_type, event_timestamp, etc) and I don't have any control over when the messages get sent. I then key the steam by device_id and device_type to preform aggregations. I would like to use event-time given that is ensures the timers which are set trigger in a deterministic nature given a failure. However, given that this isn't always a high throughput stream a window could be opened for a 10 minute aggregation period, but not have its next point come until approximately 40 minutes later. Although the calculation would aggregation would eventually be completed it would output my desired result extremely late.
So my work around for this is to create an additional external source that does nothing other than pump fake messages. By having these fake messages being pumped out in alignment with my 10 minute aggregation period, even if a device hadn't sent any data, the event time windows would have something to force the windows closed. The critical part here is to make it possible that all parallel instances / operators have access to this fake message because I need to close all the windows with this single fake message. I was thinking that Broadcast state might be the most appropriate way to accomplish this goal given: "Broadcast state is replicated across all parallel instances of a function, and might typically be used where you have two streams, a regular data stream alongside a control stream that serves rules, patterns, or other configuration messages." Quote Source
Questions:
Is broadcast state the best method for ensuring all parallel instances (e.g. windows) receive my fake messages?
Once the operators have access to this fake message via the broadcast state can this fake message then be used to advance the event time watermark?
You can make this work with broadcast state, along the lines you propose, but I'm not convinced it's the best solution.
In an ideal world I'd suggest you arrange for the devices to send occasional keepalive messages, but assuming that's not possible, I think a custom Trigger would work well here. You can extend the EventTimeTrigger so that in addition to the event time timer it creates via
ctx.registerEventTimeTimer(window.maxTimestamp());
you also create a processing time timer, as a fallback, and you FIRE the window if the window still exists when that processing time timer fires.
I'm recommending this approach because it's simpler and more directly addresses the specific need. With the broadcast state approach you'll have to introduce a source for these messages, add a broadcast state descriptor and stream, add special fake watermarks for the non-broadcast stream (set to Watermark.MAX_WATERMARK), connect the broadcast and non-broadcast streams and implement a BroadcastProcessFunction (that probably doesn't really do anything), etc. It's a lot of moving parts spread across several different operators.

What is SourceFunction#run is supposed to work in Flink?

I have implemented a Source by extending RichSourceFunction for our Message Queue that Flink doesn't support.
When I implements the run method whose signature is:
override def run(sc: SourceFunction.SourceContext[String]): Unit = {
val msg = read_from_mq
sc.collect(msg)
}
When the run method is called, if there is no newer message in message queue,
Should I run without calling sc.collect or
I can wait until newer data comes(in this case, run method will be blocked).
I would prefer the 2nd one,not sure if this is the correct usage.
The run method of a Flink source should loop, endlessly producing output until its cancel method is called. When there's nothing to produce, then it's best if you can find a way to do a blocking wait.
The apache nifi source connector is another reasonable example to use as a model. You will note that it sleeps for a configurable interval when there's nothing for it to do.
As you probably know both options are functionally correct and will yield correct results.
This being said the second one is preferred because you're not holding the thread. In fact, if you take a look at the RabbitMQ connector implementation you'll notice that this exactly how it is implemented: inside its run it indirectly waits for messages to be placed on a BlockingQueue.

Libev: how to schedule a callback to be called as soon as possible

I'm learning libev and I've stumbled upon this question. Assume that I want to process something as soon as possible but not now (i.e. not in the current executing function). For example I want to divide some big synchronous job into multiple pieces that will be queued so that other callbacks can fire in between. In other words I want to schedule a callback with timeout 0.
So the first idea is to use ev_timer with timeout 0. The first question is: is that efficient? Is libev capable of transforming 0 timeout timer into an efficient "call as soon as possible" job? I assume it is not.
I've been digging through libev's docs and I found other options as well:
it can artificially delay invoking the callback by using a prepare or idle watcher
So the idle watcher is probably not going to be good here because
Idle watchers trigger events when no other events of the same or higher priority are pending
Which probably is not what I want. Prepare watchers might work here. But why not check watcher? Is there any crutial difference in the context I'm talking about?
The other option these docs suggest is:
or more sneakily, by reusing an existing (stopped) watcher and pushing it into the pending queue:
ev_set_cb (watcher, callback);
ev_feed_event (EV_A_ watcher, 0);
But that would require to always have a stopped watcher. Also since I don't know a priori how many calls I want to schedule at the same time then I would have to have multiple watchers and additionally keep track of them via some kind of list and increase it when needed.
So am I on the right track? Are these all possibilities or am I missing something simple?
You may want to check out the ev_prepare watcher. That one is scheduled for execution as the last handler in the given event loop iteration. It can be used for 'Execute this task ASAP' implementations. You can create dedicated watcher for each task you want to execute, or you can implement a queue with a one prepare watcher that is started once queue contains at least one task.
Alternatively, you can implement similar mechanism using ev_idle watcher, but this time, it will be executed only if the application doesn't process any 'higher priority' watcher handlers.

DeadlineExceededException and DataStore/Task Queue Operations

I'm doing some operations that should complete under 60 seconds but there may be some rare cases where it takes longer (but will never take longer than 10 minutes). It says in the app engine docs if you catch a DeadlineExceededException you have less than a second to do operations before it permanently fails. Would this be enough time to add a task to a queue and/or do a datastore write? I assume the safest way would be to add a task async/write a datastore entity (async) at the beginning of an operation and remove it from the queue if the operation completes. The latter method would use up twice as many api calls but is it worth it?
I would suggest to use the queue as default for all operations so you won't have to implement the fallback to it if you catch a dead line exceed error. It is more clean and easier to maintain along with the fact that the user doesn't have to wait for the operation to complete. In order to achieve this you can trigger your queue with an ajax call and get the result in the background, so the user will not wait for the operation to complete. Yes it worth's it, since it can "guarantee" the window of time you might need.
The runtime environment gives the request handler a little bit more time (less than a second) after raising the exception to prepare a custom response. so it would be sufficient to add that it into task queue.
If you do not want the client to keep polling for a task queue result, I suggest you have a look at the Channel API. It will enable you to implement push notifications to the client.
At the end of your task queue, you'll just have to send a notification to the client to let him now that is task has been processed.

Resources