Watermarks and late event handling are easy to understand, but what about early events? For example, if the original stream contains events that happened from 3:00 to 4:00, and I insert some events that happened from 6:00 to 7:00 into the stream, how does Flink handle them? Would it create separate window(s) for them and process them when those windows expire?
Depending on the watermarking strategy, early events can advance the watermark and then cause subsequent "on time" events to be considered late.
Early events are not dropped but put into the corresponding window. The window is processed when the watermark passes the end timestamp of the window. So, Flink is able to maintain several windows at the same time.
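To make the answer above concrete, here is a minimal plain-Java sketch of the idea. It is not Flink code: the class EarlyEventWindows, its methods, and the one-hour tumbling window size are all assumptions made for illustration. It only models the behaviour described, namely that an early event lands in its own window and that each window fires once the watermark reaches the window's last timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

/** Plain-Java model of event-time tumbling windows; hypothetical, not Flink's API. */
final class EarlyEventWindows {
    static final long HOUR = 60 * 60 * 1000L;

    // Window start -> event timestamps buffered in that window.
    private final TreeMap<Long, List<Long>> windows = new TreeMap<>();

    /** Puts an event into the window covering its timestamp, however far ahead it is. */
    void onEvent(long timestamp) {
        long start = timestamp - (timestamp % HOUR); // assumes non-negative timestamps
        windows.computeIfAbsent(start, k -> new ArrayList<>()).add(timestamp);
    }

    /** Fires and removes every window whose last timestamp the watermark has reached. */
    List<Long> onWatermark(long watermark) {
        List<Long> firedStarts = new ArrayList<>();
        while (!windows.isEmpty() && windows.firstKey() + HOUR - 1 <= watermark) {
            firedStarts.add(windows.pollFirstEntry().getKey());
        }
        return firedStarts;
    }

    int openWindows() {
        return windows.size();
    }

    public static void main(String[] args) {
        EarlyEventWindows w = new EarlyEventWindows();
        w.onEvent(3 * HOUR + 10); // event in the 3:00-4:00 window
        w.onEvent(6 * HOUR + 10); // early event in the 6:00-7:00 window
        System.out.println("open windows: " + w.openWindows());
        System.out.println("fired at watermark 4:00: " + w.onWatermark(4 * HOUR));
        System.out.println("fired at watermark 7:00: " + w.onWatermark(7 * HOUR));
    }
}
```

In this model both windows are open at the same time, and the 6:00 window simply waits until the watermark catches up with it.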
Let's say we have an EventTimeSlidingWindow with an EventTime trigger based on some watermark. If the watermark is generated very infrequently, say every five minutes, and the window size is, say, one minute, will five window results be fired at the same time when the watermark progresses? That is, would all of their outputs in my output stream carry the same timestamp, namely the time at which the watermark generator produced the watermark?
Not exactly. Flink registers a timer for each window, with the timestamp at which that window should fire. The timer is registered when an event first causes the window to be opened, and its timestamp is the window's maximum timestamp (window.maxTimestamp()). As the watermark advances, it causes all of those timers to fire (more or less at once), each with its previously registered timestamp. So you are right that all the windows will fire together, but each carries the timestamp of its own timer rather than the time at which the watermark was generated.
But be aware that if you "only" create a watermark every 5 minutes, window results can be delayed by up to those 5 minutes. The timestamps will still be correct.
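The effect described above can be sketched in plain Java. This is a simplified model and not Flink code: it uses one-minute tumbling windows instead of sliding ones to keep it short, and the class InfrequentWatermarkDemo and its methods are hypothetical names. It shows that a single late-arriving watermark releases several timers at once, each with its own timestamp.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Plain-Java model of event-time timers; hypothetical, not Flink's API. */
final class InfrequentWatermarkDemo {
    static final long MINUTE = 60_000L;

    // Pending timer timestamps, one per open window (the window's last millisecond).
    private final TreeSet<Long> timers = new TreeSet<>();

    /** Registers a timer at the last timestamp of the one-minute window containing the event. */
    void onEvent(long timestamp) {
        long windowStart = timestamp - (timestamp % MINUTE); // assumes non-negative timestamps
        timers.add(windowStart + MINUTE - 1);
    }

    /** Fires every timer at or below the watermark and returns the timers' own timestamps. */
    List<Long> onWatermark(long watermark) {
        List<Long> fired = new ArrayList<>(timers.headSet(watermark, true));
        timers.removeAll(fired);
        return fired;
    }

    public static void main(String[] args) {
        InfrequentWatermarkDemo demo = new InfrequentWatermarkDemo();
        for (int minute = 0; minute < 5; minute++) {
            demo.onEvent(minute * MINUTE + 1_000); // one event in each minute
        }
        // A single watermark arrives after five minutes and releases all five timers.
        System.out.println(demo.onWatermark(5 * MINUTE));
    }
}
```

Running main prints [59999, 119999, 179999, 239999, 299999]: five firings triggered by one watermark, but with five distinct timestamps.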
There are a lot of late events in my Flink job, so I set allowedLateness() to 10 minutes (using TumblingEventTimeWindows, with a complex AggregateFunction that runs on every window).
It seems the aggregation happens on every late event, but I'd like it to fire less frequently.
Is there any trigger which fires only once every minute?
Do triggers affect late events?
Are there any triggers which affect only late events?
You could implement a custom Trigger with whatever behavior you desire.
If you look at the implementation of EventTimeTrigger, the default trigger for tumbling event time windows,
    public TriggerResult onElement(Object element, long timestamp, TimeWindow window, TriggerContext ctx) throws Exception {
        if (window.maxTimestamp() <= ctx.getCurrentWatermark()) {
            // if the watermark is already past the window fire immediately
            return TriggerResult.FIRE;
        } else {
            ctx.registerEventTimeTimer(window.maxTimestamp());
            return TriggerResult.CONTINUE;
        }
    }
you'll see that whenever an event is assigned to the window after the stream's watermark has reached or surpassed the window's end, then the trigger returns FIRE. This is why every late event causes another firing.
An alternative would be to have no allowed lateness, but instead collect the late events into their own stream (using a side output), and then process the late events independently.
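As a rough illustration of the custom trigger idea, the decision logic for "fire at most once per interval for late events" could be modelled like this. This is a plain-Java sketch, not an implementation of Flink's Trigger interface: the class LateFiringThrottle, its method signatures, and the convention of returning -1 for "no timer" are all assumptions. It covers only the late-event path and leaves out the regular end-of-window firing. In a real trigger, the returned time would be registered as a timer through the trigger context.

```java
/** Plain-Java model of throttled firing for late events; hypothetical, not Flink's API. */
final class LateFiringThrottle {
    enum Result { CONTINUE, FIRE }

    private final long minIntervalMillis;
    private long pendingTimer = -1; // -1 means no late firing is currently scheduled

    LateFiringThrottle(long minIntervalMillis) {
        this.minIntervalMillis = minIntervalMillis;
    }

    /**
     * Called for each element. Returns the time at which a timer should be
     * registered, or -1 if nothing needs to be registered.
     */
    long onElement(long windowMaxTimestamp, long currentWatermark, long now) {
        boolean late = windowMaxTimestamp <= currentWatermark;
        if (late && pendingTimer < 0) {
            // First late element since the last firing: schedule one firing.
            pendingTimer = now + minIntervalMillis;
            return pendingTimer;
        }
        // On-time element, or a late firing is already scheduled: do nothing here.
        return -1;
    }

    /** Called when a timer fires; only the scheduled late-firing timer produces output. */
    Result onTimer(long time) {
        if (time == pendingTimer) {
            pendingTimer = -1;
            return Result.FIRE;
        }
        return Result.CONTINUE;
    }

    public static void main(String[] args) {
        LateFiringThrottle throttle = new LateFiringThrottle(60_000L);
        long timer = throttle.onElement(999L, 2_000L, 10L);          // first late element
        System.out.println("timer at " + timer);
        System.out.println(throttle.onElement(999L, 2_000L, 20L));   // second late element
        System.out.println(throttle.onTimer(timer));                 // one firing for both
    }
}
```

With this design, any number of late elements arriving within the interval share a single firing, so the aggregate is recomputed at most once per interval instead of once per late element.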
Just to clarify, the late events I mention below are those that still arrive within the allowed lateness you set.
Is there any trigger which fires only once every minute?
No. However, you can write your own Trigger; try using the event-time timer service to achieve that.
Do triggers affect late events?
Yes. Late events are passed to the trigger through its onElement function.
Are there any triggers which affect only late events?
You can detect late events in a custom trigger like this:
    if (window.maxTimestamp() <= ctx.getCurrentWatermark()) {
        // the element is late: the watermark has already passed the end of the window
        return TriggerResult.FIRE;
    }
In my opinion, the best approach is to implement a process function with a custom watermark.
When I use a Flink event time window, the window just doesn't trigger. How can I solve the problem, and are there any ways to debug it?
Since you are using an event time window, it is probably a watermark problem. The window only produces output when the watermark makes progress. There are several reasons why event time might not have advanced:
There is no data from the source.
One of the parallel source instances doesn't have data.
The time field extracted from the record is in seconds, but it should be in milliseconds.
The data doesn't cover a longer time span than the window size, which is needed to advance event time past the end of a window.
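The seconds-versus-milliseconds point in the list above is easy to miss, so here is a small plain-Java sketch of why it matters. The class TimestampUnits, the threshold heuristic, and both method names are assumptions for illustration only. The heuristic treats any value below 10^12 as epoch seconds, which only makes sense if your events are more recent than September 2001.

```java
/** Hypothetical helper illustrating the seconds-vs-milliseconds pitfall; not Flink's API. */
final class TimestampUnits {
    // 10^12 ms since the epoch falls in September 2001; epoch seconds are far below this.
    static final long MILLIS_THRESHOLD = 1_000_000_000_000L;

    /** Heuristic: values below the threshold are assumed to be epoch seconds. */
    static long toEpochMillis(long timestamp) {
        return timestamp < MILLIS_THRESHOLD ? timestamp * 1000L : timestamp;
    }

    /** Number of window boundaries lying between two timestamps, for a window size in ms. */
    static long windowBoundariesCrossed(long first, long last, long windowSizeMillis) {
        return last / windowSizeMillis - first / windowSizeMillis;
    }

    public static void main(String[] args) {
        long firstSeconds = 1_600_000_000L;      // an epoch timestamp in seconds
        long lastSeconds = firstSeconds + 3_600; // one hour later
        long window = 60_000L;                   // one-minute window, in milliseconds

        // Interpreted as milliseconds, an hour of data spans only 3.6 "seconds".
        System.out.println(windowBoundariesCrossed(firstSeconds, lastSeconds, window));
        // After conversion, the same hour spans sixty one-minute windows.
        System.out.println(windowBoundariesCrossed(
                toEpochMillis(firstSeconds), toEpochMillis(lastSeconds), window));
    }
}
```

In the first case no window boundary is ever crossed, so a watermark derived from those timestamps would never pass the end of a window and nothing would fire.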
The window will produce output if we switch from event time to processing time. Furthermore, we can monitor event time by checking the watermarks in the web dashboard [1], or print-debug it with a ProcessFunction, which can look up the current watermark.
[1] https://ci.apache.org/projects/flink/flink-docs-master/monitoring/debugging_event_time.html#monitoring-current-event-time
Be sure you're setting environment.setStreamTimeCharacteristic(TimeCharacteristic.EventTime).
I am new to Flink, and am trying to learn from the Event Time and Watermarks section.
Can you explain what watermarks are, and what problem they solve? The example is not clear to me.
Are they only needed for event time (out-of-order processing)?
The purpose of the watermark is to define when a time-based window should fire.
Watermarks allow for the idea that events might be slightly out of order, so the time "extracted" from an event might differ by some amount from where you would like to draw the "low water" mark for firing that window. This happens, for example, if your data is generated by disparate sources that have varied latency before arriving (consider distributed logging). However, you might not need this allowance if your data is guaranteed to have only ascending timestamps, for example if it is generated from sensor readings.
This goes hand in hand with some of the predefined watermark generators that Flink provides, which, not surprisingly, line up with these options.
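To illustrate the "slightly out of order" idea, here is a minimal plain-Java model of a bounded-out-of-orderness watermark. It is a sketch rather than Flink code: the class BoundedOutOfOrdernessModel and its methods are hypothetical names. The watermark trails the largest timestamp seen so far by a fixed bound, and a bound of zero corresponds to the ascending-timestamps case.

```java
/** Plain-Java model of a bounded-out-of-orderness watermark; hypothetical, not Flink's API. */
final class BoundedOutOfOrdernessModel {
    private final long maxOutOfOrdernessMillis;
    private long maxTimestampSeen;

    BoundedOutOfOrdernessModel(long maxOutOfOrdernessMillis) {
        this.maxOutOfOrdernessMillis = maxOutOfOrdernessMillis;
        // Start low enough that the subtraction in currentWatermark() cannot underflow.
        this.maxTimestampSeen = Long.MIN_VALUE + maxOutOfOrdernessMillis + 1;
    }

    /** Tracks the largest event timestamp observed so far. */
    void onEvent(long timestamp) {
        maxTimestampSeen = Math.max(maxTimestampSeen, timestamp);
    }

    /** The watermark trails the largest timestamp by the allowed out-of-orderness. */
    long currentWatermark() {
        return maxTimestampSeen - maxOutOfOrdernessMillis - 1;
    }

    /** An event at or below the current watermark is behind it. */
    boolean isBehindWatermark(long timestamp) {
        return timestamp <= currentWatermark();
    }

    public static void main(String[] args) {
        BoundedOutOfOrdernessModel model = new BoundedOutOfOrdernessModel(5_000L);
        model.onEvent(10_800_000L);                                // 3:00:00
        System.out.println(model.isBehindWatermark(10_810_000L));  // 3:00:10, still ahead
        model.onEvent(21_600_000L);                                // an early event at 6:00:00
        System.out.println(model.isBehindWatermark(10_810_000L));  // now behind the watermark
    }
}
```

The second half of main also shows the earlier point about early events: under this strategy, a single event from 6:00 pushes the watermark forward so far that subsequent events from around 3:00 end up behind it.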
Apologies for the rather verbose and long-winded post, but this problem's been perplexing me for a few weeks now so I'm posting as much information as I can in order to get this resolved quickly.
We have a WPF UserControl which is being loaded by a 3rd party app. The 3rd party app is a presentation application which loads and unloads controls on a schedule defined by an XML file which is downloaded from a server.
Our control, when it is loaded into the application, makes a web request to a web service and uses the data from the response to display some information. We're using an MVVM architecture for the control. The entry point of the control is a method that implements an interface exposed by the main app, and this is where the control's configuration is set up. This is also where I set the DataContext of our control to our MainViewModel.
The MainViewModel has two other view models as properties and the main UserControl has two child controls. Depending on the data received from the web service, the main UserControl decides which child control to display, e.g. if there is a HTTP error or the data received is not valid, then display child control A, otherwise display child control B. As you'd expect, these two child controls bind two separate view models each of which is a property of MainViewModel.
Now child control B (which is displayed when the data is valid) has a RefreshService property/field. RefreshService is an object that is responsible for updating the model in a number of ways and contains 4 System.Timers.Timer instances:
a _modelRefreshTimer
a _viewRefreshTimer
a _pageSwitchTimer
a _retryFeedRetrievalOnErrorTimer (this is only enabled when something goes wrong with retrieving data).
I should mention at this point that there are two types of data; the first changes every minute, the second changes every few hours. The controls' configuration decides which type we are using/displaying.
If data is of the first type then we update the model quite frequently (every 30 seconds) using the _modelRefreshTimer's events.
If the data is of the second type then we update the model after a longer interval. However, the view still needs to be refreshed every 30 seconds as stale data needs to be removed from the view (hence the _viewRefreshTimer).
The control also paginates the data so we can see more than we can fit on the display area. This works by breaking the data up into Lists and switching the CurrentPage (which is a List) property of the view model to the right List. This is done by handling the _pageSwitchTimer's Elapsed event.
Now the problem
My problem is that the control, when removed from the visual tree, doesn't dispose of its timers. This was first noticed when we started getting an unusually high number of requests on the web server end very soon after deploying this control and found that requests were being made at least once a second! We found that the timers were living on and not stopping hours after the control had been removed from view, and that the more timers there were, the more requests piled up at the web server.
My first solution was to implement IDisposable for the RefreshService and do some clean-up when the control's Unloaded event was fired. Within the RefreshService's Dispose method I've set Enabled to false for all the timers, then called the Stop() method on all of them. I've then called Dispose() too and set them to null. None of this worked.
After some reading around I found that event handlers may hold references to Timers and prevent them from being disposed and collected. After some more reading and researching I found that the best way around this was to use the Weak Event Pattern. Using this blog and this blog I've managed to work around the shortcomings in the Weak Event pattern.
However, none of this solves the problem. Timers are still not being disabled or stopped (let alone disposed) and web requests are continuing to build up. Mem Profiler tells me that "This type has N instances that are directly rooted by a delegate. This can indicate the delegate has not been properly removed" (where N is the number of instances). As far as I can tell though, all listeners of the Elapsed event for the timers are being removed during the cleanup so I can't understand why the timers continue to run.
Thanks for reading. Eagerly awaiting your suggestions/comments/solutions (if you got this far :-p)
Have you tried using the DispatcherTimer instead of System.Timers.Timer?
If a Timer is used in a WPF application, it is worth noting that the Timer runs on a different thread than the user interface (UI) thread. In order to access objects on the user interface (UI) thread, it is necessary to post the operation onto the Dispatcher of the user interface (UI) thread using Invoke or BeginInvoke. Reasons for using a DispatcherTimer as opposed to a Timer are that the DispatcherTimer runs on the same thread as the Dispatcher and a DispatcherPriority can be set. (from this blog)
It sounds to me like the timer signals are being queued faster than they are processed. For example, this would happen if each elapse event takes 2 seconds to process while the timer elapses every second. This would result in a backlog of "raise Elapse event now" signals being scheduled on the .NET thread pool.
Such a backlog would then continue to cause elapse events even after you stop the timer. From the System.Timers.Timer documentation:
Even if SynchronizingObject is true, Elapsed events can occur after the Dispose or Stop method has been called or after the Enabled() property has been set to false, because the signal to raise the Elapsed event is always queued for execution on a thread pool thread. One way to resolve this race condition is to set a flag that tells the event handler for the Elapsed event to ignore subsequent events.
To avoid a backlog of unprocessed timer signals building up in the thread pool, you can set AutoReset to false. That way the timer disables itself at each elapse. You can then set Enabled back to true after handling the elapse event. That way no additional elapse events will be scheduled until the last one has been handled.
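Call timer.Elapsed -= new ElapsedEventHandler(OnTimerElapsed); when your control is disposed of to manually detach the handler. Here OnTimerElapsed is only a placeholder for whichever handler method you originally attached to the event.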