Apache Flink - how to skip all but most recent window on startup

In Flink, I have a Job with a Keyed Stream of events (e.g.: 10 events for each Key every day on average).
They are handled as Sliding Windows based on Event-Time (e.g.: 90-day window size and 1-day window slide).
Events are consumed from Kafka, which persists all event history (e.g.: last 3 years).
Sometimes I'd like to restart Flink: for maintenance, bug handling, etc.
Or start a fresh Flink instance with Kafka already containing event history.
In such a case I would like to skip triggering for all but the most recent window for each Key.
(It's specific to my use case: each window, when processed, effectively overrides the processing results from previous windows. So at startup, I would like to process only the single most recent window for each Key.)
Is it possible in Flink? If so, then how to do it?

You can use
FlinkKafkaConsumer<T> myConsumer = new FlinkKafkaConsumer<>(...);
myConsumer.setStartFromTimestamp(...); // start from specified epoch timestamp (milliseconds)
which is described, along with other related options, in the Kafka Consumers Start Position Configuration section of the docs.
Or you could use a savepoint to do a clean upgrade/redeploy without losing your Kafka offsets and associated window contents.
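As a minimal sketch of the first approach (the topic name, schema, and connection properties are placeholders, and the 90-day lookback matches the window size from the question), the consumer can be started just far enough back to cover the most recent window:

import java.time.Duration;
import java.time.Instant;
import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

Properties props = new Properties();
props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder
props.setProperty("group.id", "event-processor");         // placeholder

FlinkKafkaConsumer<String> myConsumer =
        new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props);

// Read only the events that can still fall into the most recent 90-day window.
long startMillis = Instant.now().minus(Duration.ofDays(90)).toEpochMilli();
myConsumer.setStartFromTimestamp(startMillis);

Note that the windows covering that data will still fire as time advances; this only avoids replaying history that can no longer affect the most recent window.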

Related

Flink: What happens when ProcessAllWindowFunction takes more time than the TumblingProcessingTimeWindows defined in windowAll()

I have a ProcessAllWindowFunction implementation (see AttributeBackLogEvents() in the code below) that does quite a bit of I/O and might take more than 30 seconds. windowAll() is windowing the data using TumblingProcessingTimeWindows of 30 seconds.
attributedStream
    .windowAll(TumblingProcessingTimeWindows.of(Time.seconds(30)))
    .process(new AttributeBackLogEvents())
    .forceNonParallel()
    .addSink(ConfluentKafkaSink.createKafkaSinkFromApplicationProperties())
    .name("Enriched Event kafka topic sink");
AttributeBackLogEvents fetches a set of events from MySQL based on the iterable passed in and, after some processing, deletes some of the fetched events from MySQL. I'm seeing that the records the current window is fetching (and which should ideally be deleted before the next window fires) are also being fetched by the next window, which means the next window fires even while the current window is still processing.
My questions are:
Is it possible that AttributeBackLogEvents is still running when the next window fires?
If so, how can I enforce that the next window doesn't fire until the current window's processing is complete?
This question doesn't describe what happens inside the logic, but conceptually:
Your window means 'time range of source data', so processing of one window can never be guaranteed to finish before the next window starts.
There may be a way, but in a streaming tool a source like MySQL is typically treated as reference data (which you typically want to read often), unless you are doing Change Data Capture.

Why are my Flink windows using so much state?

The checkpoints for my Flink job are getting larger and larger. After drilling down into individual tasks, the keyed window function seems to be responsible for most of the size. How can I reduce this?
If you have a lot of state tied up in windows, there are several possibilities:
Using incremental aggregation (by using reduce or aggregate) can dramatically reduce your storage requirements; otherwise every event is copied into the list of events kept for each window it is assigned to. (See the sketch after this list.)
If you are aggregating over multiple timeframes, e.g., every minute and every 10 minutes, you can cascade these windows, so that the 10 minute windows are only consuming the output of the minute-long windows, rather than every event.
If you are using sliding windows, each event is being assigned to each of the overlapping windows. For example, if your windows are 2 minutes long and sliding by 1 second, each event is being copied into 120 windows. Incremental and/or pre-aggregation will help here (a lot!), or you may want to use a KeyedProcessFunction instead of a window in order to optimize your state footprint.
If you have keyed count windows, you could have keys for which the requisite batch size is never (or only very slowly) reached, leading to more and more partial batches sitting around in state. You could implement a custom Trigger that incorporates a timeout in addition to the count-based triggering so that these partial batches are eventually processed.
If you are using globalState in a ProcessWindowFunction, the globalState for stale keys will accumulate. You can use state TTL on the state descriptor for the globalState. Note: this is the only place where window state isn't automatically freed when windows are cleared.
Or it may simply be that your key space is growing over time, and there's really nothing that can be done except to scale up the cluster.
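To illustrate the first point, here is a minimal sketch of incremental aggregation (the Event input type is a hypothetical placeholder); per window, Flink then stores only the accumulator, a single Long, rather than every element:

import org.apache.flink.api.common.functions.AggregateFunction;

// Incrementally counts the events in each window; the state kept per window
// is just the accumulator (one Long), not the list of events.
public class CountAggregate implements AggregateFunction<Event, Long, Long> {
    @Override
    public Long createAccumulator() {
        return 0L;
    }

    @Override
    public Long add(Event value, Long accumulator) {
        return accumulator + 1;
    }

    @Override
    public Long getResult(Long accumulator) {
        return accumulator;
    }

    @Override
    public Long merge(Long a, Long b) {
        return a + b;
    }
}

It would be applied as stream.keyBy(...).window(...).aggregate(new CountAggregate()), or combined with a window function via the aggregate(aggFunction, windowFunction) variant if you also need window metadata.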

Emitting the results of a session window every X minutes

I've implemented a Flink processor that aggregates events into sessions and then writes them to a sink. Now I'd like to extend it so that I can get the number of concurrent sessions every five minutes.
The events coming into my system are of the form:
{
  "SessionId": "UniqueUUID",
  "Customer": "CustomerA",
  "EventType": "EventTypeA",
  [ ... ]
}
And a single session usually contains several events of different EventTypes. I then aggregate the events into sessions by doing the following in Flink.
DataStream<Session> sessions = events
    .keyBy((KeySelector<HashMap, String>) event -> (String) event.get(Field.SESSION_ID))
    .window(ProcessingTimeSessionWindows.withGap(org.apache.flink.streaming.api.windowing.time.Time.minutes(5)))
    .trigger(SessionProcessingTimeTrigger.create())
    .aggregate(new SessionAggregator());
Each session is then emitted (by the SessionProcessingTimeTrigger) when an event with a specific EventType is processed ("EventType": "Session.Ended"). Finally, the stream is sent to a sink and written to Kafka.
Now I want to write a similar Flink processor, but instead of only emitting a session once it is finished, I want to emit all sessions every 5 minutes in order to keep track of how many concurrent sessions we have at each 5-minute mark.
So in a sense, I guess what I want is a SessionWindow that also emits its contents at regular intervals without purging them.
I'm stumped on how to accomplish this in Flink and am therefore looking for some aid.
Whenever you want a Flink window to emit results at non-default times, you can do this by implementing a custom Trigger. Your trigger just needs to return FIRE each time a 5-minute timer fires, in addition to its original logic. You'll want to register this timer when the first event is assigned to a window, and again every time the timer fires.
In the case of session windows this can be more complex because of the manner in which session windows are merged. But I believe that in the case of processing time session windows what I've outlined above will work.
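A minimal sketch of such a trigger (assuming processing-time session windows; the original Session.Ended logic from SessionProcessingTimeTrigger would be layered on top, and a production version would also handle merging via canMerge()/onMerge() and delete its timers in clear()):

import org.apache.flink.streaming.api.windowing.triggers.Trigger;
import org.apache.flink.streaming.api.windowing.triggers.TriggerResult;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;

public class PeriodicSessionTrigger extends Trigger<Object, TimeWindow> {
    private static final long INTERVAL_MS = 5 * 60 * 1000;

    @Override
    public TriggerResult onElement(Object element, long timestamp,
                                   TimeWindow window, TriggerContext ctx) {
        // Align the timer to the next 5-minute boundary; registering the same
        // timestamp repeatedly is deduplicated by Flink's timer service.
        long now = ctx.getCurrentProcessingTime();
        ctx.registerProcessingTimeTimer(now - (now % INTERVAL_MS) + INTERVAL_MS);
        return TriggerResult.CONTINUE;
    }

    @Override
    public TriggerResult onProcessingTime(long time, TimeWindow window, TriggerContext ctx) {
        // Emit the window's current contents without purging, then re-arm.
        ctx.registerProcessingTimeTimer(time + INTERVAL_MS);
        return TriggerResult.FIRE;
    }

    @Override
    public TriggerResult onEventTime(long time, TimeWindow window, TriggerContext ctx) {
        return TriggerResult.CONTINUE;
    }

    @Override
    public void clear(TimeWindow window, TriggerContext ctx) {
        // A full implementation would delete any pending timers here.
    }
}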

Some questions related to Fraud detection demo from Flink DataStream API

The example is very useful at first; it illustrates how KeyedProcessFunction works in Flink.
But there is something worth noticing that suddenly came to me...
It is from the Fraud Detector v2: State + Time part.
It is reasonable to set a timer here, given the real application requirements:
override def onTimer(
    timestamp: Long,
    ctx: KeyedProcessFunction[Long, Transaction, Alert]#OnTimerContext,
    out: Collector[Alert]): Unit = {
  // remove flag after 1 minute
  timerState.clear()
  flagState.clear()
}
Here is the problem:
The TimeCharacteristic is ProcessingTime, which is determined by the system clock of the running machine. Under processing time, the watermark will NOT change over time, so that means onTimer will never be called, unless the TimeCharacteristic is changed to EventTime.
According to the Flink website:
An hourly processing time window will include all records that arrived at a specific operator between the times when the system clock indicated the full hour. For example, if an application begins running at 9:15am, the first hourly processing time window will include events processed between 9:15am and 10:00am, the next window will include events processed between 10:00am and 11:00am, and so on.
If the watermark doesn't change over time, will the window function be triggered? After all, the condition for a window to be triggered is that the watermark passes the end time of the window.
I'm wondering whether, in processing time, triggering a window doesn't depend on the watermark at all; even though the official website doesn't mention it, the window would be triggered based on processing time instead.
Hope someone can spend a little time on this, many thanks!
Let me try to clarify a few things:
Flink provides two kinds of timers: event time timers, and processing time timers. An event time timer is triggered by the arrival of a watermark equal to or greater than the timer's timestamp, and a processing time timer is triggered by the system clock reaching the timer's timestamp.
Watermarks are only relevant when doing event time processing, and the only purpose they serve is to trigger event time timers. They play no role at all in applications like the one in the DataStream API Code Walkthrough you have referred to. If this application used event time timers, either directly or indirectly (by using event time windows, or through one of the higher-level APIs like SQL or CEP), then it would need watermarks. But since it only uses processing time timers, it has no use for watermarks.
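To make the distinction concrete, here is a minimal sketch of both timer kinds in a KeyedProcessFunction (the String key and String input/output types are placeholders):

import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

public class TimerExample extends KeyedProcessFunction<String, String, String> {
    @Override
    public void processElement(String value, Context ctx, Collector<String> out) {
        // A processing time timer fires when the machine's clock reaches its
        // timestamp; no watermarks are involved.
        ctx.timerService().registerProcessingTimeTimer(
                ctx.timerService().currentProcessingTime() + 60 * 1000);

        // An event time timer would instead fire when a watermark >= its
        // timestamp arrives, e.g.:
        // ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 60 * 1000);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<String> out) {
        out.collect("timer fired at " + timestamp);
    }
}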
BTW, this fraud detection example isn't using Flink's Window API, because Flink's windowing mechanism isn't a good fit for this application's requirements. Here we are trying to match a pattern against a sequence of events within a specific timeframe -- so we want a different kind of "window" that begins at the moment of a special triggering event (a small transaction, in this case), rather than a TimeWindow (like those provided by Flink's Window API) that is aligned to the clock (i.e., 10:00am to 10:01am).

Flink window operator checkpointing

I want to know how Flink checkpoints the window operator, and how it ensures exactly-once semantics when recovering -- for example, how it saves the tuples in the current window and the progress of the current window's processing. I want to know the detailed process of the window operator's checkpoint and recovery.
All of Flink's stateful operators participate in the same checkpointing mechanism. When instructed to do so by the checkpoint coordinator (part of the job manager), the task managers initiate a checkpoint in each parallel instance of every source operator. The sources checkpoint their offsets and insert a checkpoint barrier into the stream. This divides the stream into the parts before and after the checkpoint. The barriers flow through the graph, and each stateful operator checkpoints its state upon having processed the stream up to the checkpoint barrier. The details are described at the link shared by @bupt_ljy.
Thus these checkpoints capture the entire state of the distributed pipeline, recording offsets into the input queues as well as the state throughout the job graph that has resulted from having ingested the data up to that point. When a failure occurs, the sources are rewound, the state is restored, and processing is resumed.
Given that during recovery the sources are rewound and replayed, "exactly once" means that the state managed by Flink is affected exactly once, not that the stream elements are processed exactly once.
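For reference, checkpointing itself is enabled on the execution environment; a minimal sketch (the 10-second interval is arbitrary):

import org.apache.flink.streaming.api.CheckpointingMode;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
env.enableCheckpointing(10_000); // trigger a checkpoint every 10 seconds
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);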
There's nothing particularly special about windows in this regard. Depending on the type of window function being applied, a window's contents are kept in an element of managed ListState, ReducingState, AggregatingState, or FoldingState. As stream elements arrive and are assigned to a window, they are appended, reduced, aggregated, or folded into that state. Other components of the window API, including Triggers and ProcessWindowFunctions, can have state that is checkpointed as well. For example, CountTrigger uses ReducingState to keep track of how many elements have been assigned to the window, adding one to the count as each element is added.
In the case where the window function is a ProcessWindowFunction, all of the elements assigned to the window are saved in Flink state and passed in an Iterable to the ProcessWindowFunction when the window is triggered. That function iterates over the contents and produces a result. The internal state of the ProcessWindowFunction is not checkpointed; if the job fails during the execution of the ProcessWindowFunction, the job will resume from the most recently completed checkpoint. This will involve rewinding back to a time before the window received the event that triggered the window firing (that event can't be part of that checkpoint, because a checkpoint barrier following it can't have taken effect yet). Sooner or later the window will again reach the point where it is triggered, and the ProcessWindowFunction will be called again -- with the same window contents it received the first time -- and hopefully this time it won't fail. (Note that I've ignored the case of processing-time windows, which do not behave deterministically.)
When a ProcessWindowFunction uses managed/checkpointed state, it is used to remember things between firings, not within a single firing. For example, a window that allows late events might want to store the result previously reported, and then issue an update for each late event.
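A minimal sketch of that pattern (the Event input type is a hypothetical placeholder): per-window state remembers what was last reported, so a late firing emits an update only when the result has actually changed.

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.streaming.api.functions.windowing.ProcessWindowFunction;
import org.apache.flink.streaming.api.windowing.windows.TimeWindow;
import org.apache.flink.util.Collector;

public class CountWithUpdates
        extends ProcessWindowFunction<Event, Long, String, TimeWindow> {

    private final ValueStateDescriptor<Long> lastReportedDesc =
            new ValueStateDescriptor<>("lastReported", Long.class);

    @Override
    public void process(String key, Context ctx,
                        Iterable<Event> elements, Collector<Long> out) throws Exception {
        long count = 0;
        for (Event ignored : elements) {
            count++;
        }

        // windowState() is scoped per key *and* per window, and survives between
        // firings of the same window (e.g., an initial firing plus late firings).
        ValueState<Long> lastReported = ctx.windowState().getState(lastReportedDesc);
        Long previous = lastReported.value();
        if (previous == null || previous != count) {
            out.collect(count); // initial result, or an update after late events
            lastReported.update(count);
        }
    }
}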
