Question about flink watermark illustrations in official documents - apache-flink

Recently I read the official Flink documentation on watermarks:
https://ci.apache.org/projects/flink/flink-docs-release-1.9/dev/event_time.html
There are two pictures illustrating Flink's watermark mechanism, which puzzle me a lot:
flink watermark
The first picture is easy to understand, but in the second, I wonder how we get W(11) and W(17).
As we know, we can define how watermarks are generated in a Flink job; in other words, watermarks are generated according to certain rules. So what are the rules by which the watermarks in the second picture are generated?
I looked through almost all of the official documentation for different Flink versions, and they all use the same pictures.
Is there any explanation?

You're right; that example is confusing. While it does illustrate a possible scenario, it's not easy to understand.
Typically watermarks are generated using a bounded-out-of-orderness watermarking strategy, where the timestamp assigner keeps track of the largest timestamp seen so far (as a side effect of assigning the timestamps in the stream records' metadata). Then each time the periodic watermark generator's timer goes off (every 200 msec by default), the specified bounded delay is subtracted from that maximum timestamp, and the result is used to create a new watermark (provided the resulting timestamp is larger than the previous watermark).
In the example shown in that figure, the maximum timestamp before the W(17) appears to be the 22, so the bounded delay is presumably 5. By the same reasoning, there should therefore be an event for time 16 preceding the W(11), but if there was, it's somewhere before the event from time 7.
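The arithmetic behind that reasoning can be written down as a small plain-Java sketch (no Flink dependency). The class and method names are made up for illustration, and the bound of 5 is only inferred from the figure, not stated in the docs. As far as I know, Flink's built-in bounded-out-of-orderness generator also subtracts one extra millisecond, which this sketch leaves out to match the figure.

```java
/** Illustrative arithmetic only; names are hypothetical and not part of Flink. */
class FigureWatermarkArithmetic {

    /** Watermark produced from the largest timestamp seen so far and a fixed delay. */
    static long watermark(long maxTimestampSeen, long boundedDelay) {
        return maxTimestampSeen - boundedDelay;
    }

    /** Works backwards: which delay explains a given watermark? */
    static long impliedDelay(long maxTimestampSeen, long watermark) {
        return maxTimestampSeen - watermark;
    }

    /** Works backwards: which maximum timestamp must have been seen before a watermark? */
    static long impliedMaxTimestamp(long watermark, long boundedDelay) {
        return watermark + boundedDelay;
    }

    public static void main(String[] args) {
        long delay = impliedDelay(22, 17);                  // max 22 before W(17)
        System.out.println(delay);                          // prints 5
        System.out.println(impliedMaxTimestamp(11, delay)); // prints 16
        System.out.println(watermark(22, delay));           // prints 17
    }
}
```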

Related

How to handle the case for watermarks when num of kafka partitions is larger than Flink parallelism

I am trying to figure out a solution to the problem of watermark progress when the number of Kafka partitions is larger than the Flink parallelism employed.
Consider, for example, that I have a Flink app with a parallelism of 3 that needs to read data from 5 Kafka partitions. My issue is that when the Flink app starts, it has to consume historical data from these partitions. As I understand it, each Flink task starts consuming events from one partition (probably buffering a significant number of events) and advances event time (and therefore watermarks) before the same task moves on to another partition, whose data is by then stale relative to the watermarks already issued.
I tried a watermark strategy that uses watermark alignment of a few seconds, but that does not solve the problem, since historical data is consumed immediately from one partition and so event time (and the watermark) has already progressed. Below is a snippet of code that shows the watermark strategy I implemented.
WatermarkStrategy.forGenerator(ws)
    .withTimestampAssigner(
        (event, timestamp) -> (long) event.get("event_time"))
    .withIdleness(IDLENESS_PERIOD)
    .withWatermarkAlignment(
        GROUP,
        Duration.ofMillis(DEFAULT_MAX_WATERMARK_DRIFT_BETWEEN_PARTITIONS),
        Duration.ofMillis(DEFAULT_UPDATE_FOR_WATERMARK_DRIFT_BETWEEN_PARTITIONS));
I also tried using a downstream operator to sort events, as described in "Sorting union of streams to identify user sessions in Apache Flink", but this cannot effectively tackle my issue either, since event record times can deviate significantly.
How can I tackle this issue? Do I need to have the same number of Flink tasks as Kafka partitions, or am I missing something about the way data is read from Kafka partitions?
The easiest solution to this problem is to pass the WatermarkStrategy to fromSource instead of applying it afterwards with assignTimestampsAndWatermarks.
When you use the WatermarkStrategy directly in fromSource with the Kafka connector, the watermarks are partition-aware: the watermark generated by a given source operator is the minimum over all partitions assigned to that operator.
Assigning watermarks directly in the source will solve the problem you are facing, but it has one main drawback: since the generated watermark is the minimum over all partitions processed by the given operator, if some partition is idle, the watermark for that operator will not progress either.
The docs describe Kafka connector watermarking here.
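To see why partition awareness matters for the scenario in the question, here is a minimal plain-Java model (no Flink dependency) of one task that reads two partitions one after the other. All names, the bound of 5, and the event data are assumptions made up for illustration, and the watermark is recomputed on every event rather than periodically to keep the sketch short.

```java
import java.util.Arrays;

/** Toy model of one source task reading several partitions; not Flink code. */
class PartitionAwareWatermarkSketch {
    static final long BOUND = 5; // assumed out-of-orderness bound

    /** Counts events whose timestamp is below the watermark in effect when they are read. */
    static int countLate(long[][] events, int numPartitions, boolean partitionAware) {
        long[] maxPerPartition = new long[numPartitions];
        Arrays.fill(maxPerPartition, Long.MIN_VALUE); // nothing seen yet
        int late = 0;
        for (long[] event : events) {
            int partition = (int) event[0];
            long timestamp = event[1];
            if (timestamp < watermark(maxPerPartition, partitionAware)) {
                late++;
            }
            maxPerPartition[partition] = Math.max(maxPerPartition[partition], timestamp);
        }
        return late;
    }

    /** Partition-aware: minimum of the per-partition maxima. Otherwise: one global maximum. */
    static long watermark(long[] maxPerPartition, boolean partitionAware) {
        long combined = partitionAware ? Long.MAX_VALUE : Long.MIN_VALUE;
        for (long max : maxPerPartition) {
            combined = partitionAware ? Math.min(combined, max) : Math.max(combined, max);
        }
        // Guard against overflow while some (or all) partitions have produced nothing.
        return combined == Long.MIN_VALUE ? Long.MIN_VALUE : combined - BOUND;
    }

    public static void main(String[] args) {
        // {partition, timestamp}: partition 0 is drained before partition 1 is touched.
        long[][] events = {{0, 100}, {0, 110}, {0, 120}, {1, 100}, {1, 110}, {1, 120}};
        System.out.println(countLate(events, 2, false)); // prints 2
        System.out.println(countLate(events, 2, true));  // prints 0
    }
}
```

In this model the partition-aware watermark does not move at all until the second partition has produced an event, which is the same mechanism behind the idle-partition drawback mentioned above.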

Apache Flink : Watermarks per partitions?

I see that there are a lot of discussions going on about adding support for watermarks per key. But does Flink support per-partition watermarks?
Currently, the minimum of all the watermarks (of non-idle partitions) is taken into account. Because of this, the last remaining records in a window get stuck as well (when advancing the watermark using periodic emit).
Any info on this is really appreciated!
Some of the sources, such as the FlinkKafkaConsumer, support per-partition watermarking. You get this by calling assignTimestampsAndWatermarks on the source, rather than on the stream produced by the source.
What this does is that each consumer instance tracks the maximum timestamp within each partition, and takes as its watermark the minimum of these maximums, less the configured bounded out-of-orderness. Idle partitions will be ignored, if you configure it to do so.
Not only does this yield more accurate watermarking, but if your events are in-order within each partition, this also makes it possible to take advantage of the WatermarkStrategy.forMonotonousTimestamps() strategy.
See Watermark Strategies and the Kafka Connector for more details.
As for why the last window isn't being triggered, this is related to watermarking, but not to per-partition watermarking. The problem is simply that windows are triggered by watermarks, and the watermarks are trailing behind the timestamps in the events. So the watermarks can never catch up to the final events, and can never trigger the last window.
This isn't a problem for unbounded streaming jobs, since they never stop and never have a last window. And it isn't a problem for batch jobs, since they are aware of all of the data. But for bounded streaming jobs, you need to do something to work around this issue. Broadly speaking, what you must do is to inform Flink that the input stream has ended -- whenever the Flink sources detect that they have reached the end of an event-time-based input stream, they emit one last watermark whose value is MAX_WATERMARK, and this will trigger any open windows.
One way to do this is to use a KafkaDeserializationSchema with an implementation of isEndOfStream that returns true when the job reaches its end.
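The effect of that final watermark can be illustrated with a small plain-Java simulation (no Flink dependency). The window size of 10, the bound of 5, the class name and the event timestamps are all assumptions chosen for illustration. The sketch assumes non-negative, in-order timestamps, treats a window as ready to fire once the watermark reaches its last timestamp (end minus 1), and models the end-of-input watermark as Long.MAX_VALUE.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.TreeSet;

/** Toy model of tumbling event-time windows on a bounded stream; not Flink code. */
class FinalWindowSketch {
    static final long SIZE = 10; // assumed window size
    static final long BOUND = 5; // assumed out-of-orderness bound

    /** Returns the start times of the windows that fired, in firing order. */
    static List<Long> firedWindows(long[] timestamps, boolean emitFinalMaxWatermark) {
        TreeSet<Long> openWindows = new TreeSet<>();
        List<Long> fired = new ArrayList<>();
        long maxTimestamp = Long.MIN_VALUE;
        for (long timestamp : timestamps) {
            openWindows.add(timestamp - timestamp % SIZE); // start of the event's window
            maxTimestamp = Math.max(maxTimestamp, timestamp);
            fire(openWindows, fired, maxTimestamp - BOUND); // watermark trails the events
        }
        if (emitFinalMaxWatermark) {
            fire(openWindows, fired, Long.MAX_VALUE); // end of input reached
        }
        return fired;
    }

    /** Fires every open window whose last timestamp is covered by the watermark. */
    static void fire(TreeSet<Long> openWindows, List<Long> fired, long watermark) {
        while (!openWindows.isEmpty() && openWindows.first() + SIZE - 1 <= watermark) {
            fired.add(openWindows.pollFirst());
        }
    }

    public static void main(String[] args) {
        long[] timestamps = {1, 4, 12, 18};
        System.out.println(firedWindows(timestamps, false)); // prints [0]
        System.out.println(firedWindows(timestamps, true));  // prints [0, 10]
    }
}
```

Without the final watermark, the window starting at 10 never fires, because the last regular watermark is 13 and the window needs 19.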

How to handle future events in flink streaming?

We're working on calculating a max concurrent count for different types of events within a 1-minute tumbling time window.
These events are like sensor data, collected from our desktop agents every minute. However, some agents report a bad timestamp, for example one that is several hours later than now.
So my question is how to handle or drop these events. Currently I just apply the predicate
filter(s => s.ct.getTime < now) to exclude them.
My first question: if I don't do this, I suspect such a bad "future" event would trigger the window calculation even for windows whose data is still incomplete.
My second question: is there a better way to prevent this?
Thanks
Interesting use case.
So first some background, then some solutions:
Windows in Flink do not fire based on timestamps but based on watermarks. There is a close connection between the two, and often it's okay to treat them the same when it comes to window firing, but in this case it's important to keep them clearly separate. So yes, your suspicion is probably valid if you use a watermark generator that is strictly bound to the event timestamps.
So with that in mind, you have a few options:
Filter invalid events (timestamp > now())
Adjust the timestamp (timestamp = min(timestamp, now())), or fix it at the source by understanding why specific sensors are off (timezone issues?)
Use a more sophisticated watermark generator
I think the first two options are straightforward, and I would personally go for the second (fixing data is always good). Let's focus on the watermark generator.
There is basically no limit on how you generate watermarks - you can rely on your imagination. Here are some ideas:
Only advance the watermark once you have seen X events with a timestamp greater than the current watermark.
Use some low pass filter = slow moving average.
Ignore events with timestamp > now() (so filter only for watermark generation).
...
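As a rough illustration of the third idea, here is a plain-Java sketch (no Flink dependency) of a generator that still lets future-stamped events through but ignores them when computing the watermark. The class name, the bound, and passing in the current time as a parameter (instead of reading the system clock) are assumptions made so the sketch stays deterministic; in a real job the same logic would live inside a custom watermark generator.

```java
/** Toy watermark logic that ignores future timestamps; names are hypothetical. */
class FutureTolerantWatermarkSketch {
    private final long maxOutOfOrderness;       // assumed bound, in milliseconds
    private long maxTimestamp = Long.MIN_VALUE; // largest plausible timestamp seen so far

    FutureTolerantWatermarkSketch(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    /** Records an event; timestamps later than the current time do not move the watermark. */
    void onEvent(long eventTimestamp, long now) {
        if (eventTimestamp > now) {
            return; // suspicious "future" event: keep it in the stream, skip it here
        }
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
    }

    /** Watermark that would be emitted on the next periodic tick. */
    long currentWatermark() {
        return maxTimestamp == Long.MIN_VALUE ? Long.MIN_VALUE : maxTimestamp - maxOutOfOrderness;
    }

    public static void main(String[] args) {
        FutureTolerantWatermarkSketch generator = new FutureTolerantWatermarkSketch(1_000);
        long threeHours = 3L * 60 * 60 * 1_000;
        generator.onEvent(10_000, 10_500);
        System.out.println(generator.currentWatermark()); // prints 9000
        generator.onEvent(10_000 + threeHours, 11_000);   // bad agent clock
        System.out.println(generator.currentWatermark()); // prints 9000
        generator.onEvent(10_800, 11_000);
        System.out.println(generator.currentWatermark()); // prints 9800
    }
}
```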
I'd be happy to hear which way you have chosen and I can help you further down.

Flink Time Characteristic and AutoWatermarkInterval

In Apache Flink, setAutoWatermarkInterval(interval) produces watermarks to downstream operators so that they advance their event time.
If the watermark has not changed during the specified interval (no events arrived), will the runtime not emit any watermark? On the other hand, if a new event arrives before the next interval, will a new watermark be emitted immediately, or will it be queued until the next setAutoWatermarkInterval interval is reached?
I am curious about what the best AutoWatermarkInterval configuration is (especially for high-rate sources): the smaller this value, the smaller the lag between processing time and event time, but at the cost of more bandwidth used to send the watermarks. Is that accurate?
On the other hand, if I use env.setStreamTimeCharacteristic(TimeCharacteristic.IngestionTime), the Flink runtime will automatically assign timestamps and watermarks (timestamps correspond to the time the event entered the Flink dataflow pipeline, i.e. the source operator). Nevertheless, even with IngestionTime we can still define a processing-time timer (in the processElement function) as shown below:
long timer = context.timestamp() + Timeout;
context.timerService().registerProcessingTimeTimer(timer);
where context.timestamp() is the ingestion time set by Flink.
Thank you.
The autoWatermarkInterval only affects watermark generators that pay attention to it. They also have an opportunity to generate a watermark in combination with event processing.
For those watermark generators that use the autoWatermarkInterval (which is definitely the normal case), they are collecting evidence for what the next watermark should be as a side effect of assigning timestamps for each event. When a timer fires (based on the autoWatermarkInterval), the watermark generator is then asked by the Flink runtime to produce the next watermark. The watermark wasn't waiting somewhere, nor was it queued, but rather it is created on demand, based on information that had been stored by the timestamp assigner -- which is typically the maximum timestamp seen so far in the stream.
Yes, more frequent watermarks means more overhead to communicate and process them, and lower latency. You have to decide how to handle this throughput/latency tradeoff based on your application's requirements.
You can always use processing time timers, regardless of the TimeCharacteristic. (By the way, at a low level, the only thing watermarks do is to trigger event time timers, be they in process functions, windows, etc.)
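The "created on demand" behaviour described above can be modelled in plain Java (no Flink dependency). This is only a sketch under stated assumptions: the class name, the 200 ms interval, the bound of 100 and the event data are made up, and a watermark is only emitted when it advances, as described earlier in this thread.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of a periodic watermark generator driven by a timer; not Flink code. */
class PeriodicWatermarkSketch {

    /**
     * events: {arrivalTimeMs, eventTimestamp}, sorted by arrival time.
     * Returns {tickTimeMs, watermark} for every tick that actually emitted a watermark.
     */
    static List<long[]> run(long[][] events, long intervalMs, long bound, long endMs) {
        List<long[]> emitted = new ArrayList<>();
        long maxTimestamp = Long.MIN_VALUE;  // updated as a side effect of event processing
        long lastWatermark = Long.MIN_VALUE;
        int next = 0;
        for (long tick = intervalMs; tick <= endMs; tick += intervalMs) {
            // Events arriving before this tick only update the stored maximum.
            while (next < events.length && events[next][0] <= tick) {
                maxTimestamp = Math.max(maxTimestamp, events[next][1]);
                next++;
            }
            // The timer fires: compute the watermark now, emit only if it advanced.
            if (maxTimestamp != Long.MIN_VALUE && maxTimestamp - bound > lastWatermark) {
                lastWatermark = maxTimestamp - bound;
                emitted.add(new long[] {tick, lastWatermark});
            }
        }
        return emitted;
    }

    public static void main(String[] args) {
        long[][] events = {{50, 1000}, {120, 1300}, {150, 1100}, {450, 1250}, {700, 1900}};
        for (long[] w : run(events, 200, 100, 800)) {
            System.out.println("t=" + w[0] + "ms watermark=" + w[1]);
        }
    }
}
```

With this data, five events produce only two watermarks (at the 200 ms and 800 ms ticks); the ticks at 400 ms and 600 ms emit nothing because the maximum timestamp did not grow.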

How to gather late data in Flink Stream Processing Windowing

Consider a data stream that contains event-time data. I want to gather the input stream into windows of 8 milliseconds and reduce the data of each window. I do that using the following code:
aggregatedTuple
.keyBy( 0).timeWindow(Time.milliseconds(8))
.reduce(new ReduceFunction<Tuple2<Long, JSONObject>>()
Note: the key of the data stream is the processing-time timestamp rounded down to a multiple of 8 milliseconds; for example, 1531569851297 is mapped to 1531569851296.
But data may arrive late and enter the wrong time window. For example, suppose I set the window size to 8 milliseconds. If data enters the Flink engine in order, or at least with a delay smaller than the window size (8 milliseconds), that is the best case. But suppose an event (whose event time is also a field in the record) arrives with a latency of 30 milliseconds. It will then enter the wrong window. I think that if I check the event time of every record as it is about to enter the window, I can filter out such late data.
So I have two questions:
How can I filter the data stream as records are about to enter the window, and check whether each record was created at the right timestamp for that window?
How can I gather such late data in a variable to do some processing on it?
Flink has two different, related abstractions that deal with different aspects of computing windowed analytics on streams with event-time timestamps: watermarks and allowed lateness.
First, watermarks, which come into play whenever working with event-time data (whether or not you are using windows). Watermarks provide information to Flink about the progress of event-time, and give you, the application writer, a means of coping with out-of-order data. Watermarks flow with the data stream, and each one marks a position in the stream and carries a timestamp. A watermark serves as an assertion that at that point in the stream, the stream is now (probably) complete up to that timestamp -- or in other words, the events that follow the watermark are unlikely to be from before the time indicated by the watermark. The most common watermarking strategy is to use a BoundedOutOfOrdernessTimestampExtractor, which assumes that events arrive within some fixed, bounded delay.
This now provides a definition of lateness -- events that follow a watermark with timestamps less than the watermark's timestamp are considered late.
The window API provides a notion of allowed lateness, which is set to zero by default. If allowed lateness is greater than zero, then the default Trigger for event-time windows will accept late events into their appropriate windows, up to the limit of the allowed lateness. The window action will fire once at the usual time, and then again for each late event, up to the end of the allowed lateness interval. After that, late events are discarded (or collected to a side output if one is configured).
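A plain-Java sketch (no Flink dependency) of how an event could be classified under these rules is shown below. The names are hypothetical, timestamps are assumed to be non-negative, and the exact boundary conditions (a window fires once the watermark reaches its end minus 1, and is dropped from consideration once the watermark reaches that value plus the allowed lateness) reflect my understanding of the default behaviour rather than a quoted specification.

```java
/** Toy classification of events against a tumbling window; not Flink code. */
class LatenessSketch {
    enum Outcome { ON_TIME, LATE_BUT_ACCEPTED, DROPPED }

    /** Start of the tumbling window (offset 0) that a timestamp falls into. */
    static long windowStart(long timestamp, long size) {
        return timestamp - (timestamp % size);
    }

    /** Classifies an event given the watermark in effect when it arrives. */
    static Outcome classify(long timestamp, long size, long allowedLateness, long watermark) {
        long lastTimestampInWindow = windowStart(timestamp, size) + size - 1;
        if (watermark < lastTimestampInWindow) {
            return Outcome.ON_TIME;           // window has not fired yet
        }
        if (watermark < lastTimestampInWindow + allowedLateness) {
            return Outcome.LATE_BUT_ACCEPTED; // window fires again with this event
        }
        return Outcome.DROPPED;               // or sent to a side output, if configured
    }

    public static void main(String[] args) {
        long ts = 1531569851297L; // timestamp from the question, 8 ms windows
        System.out.println(windowStart(ts, 8));                    // prints 1531569851296
        System.out.println(classify(ts, 8, 0, 1531569851300L));    // prints ON_TIME
        System.out.println(classify(ts, 8, 30, 1531569851310L));   // prints LATE_BUT_ACCEPTED
        System.out.println(classify(ts, 8, 30, 1531569851340L));   // prints DROPPED
    }
}
```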
How can I filter data stream as it wants to enter the window and check if the data created at the right timestamp for the window?
Flink's window assigners are responsible for assigning events to the appropriate windows -- the right thing will happen automatically. New window instances will be created as needed.
How can I gather such late data in a variable to do some processing on them?
You can either be sufficiently generous in your watermarking so as to avoid having any late data, and/or configure the allowed lateness to be long enough to accommodate the late events. Be aware, however, that Flink will be forced to keep all windows open that are still accepting late events, which will delay garbage collecting old windows and may consume considerable memory.
Note that this discussion assumes you want to work with time windows -- e.g. the 8msec long windows you are working with. Flink also supports count windows (e.g. group events into batches of 100), session windows, and custom window logic. Watermarks and lateness don't play any role if you are using count windows, for example.
If you want per-key results for your analytics, then use keyBy to partition the stream by key (e.g., by userId) before applying windowing. For example
stream
.keyBy(e -> e.userId)
.timeWindow(Time.seconds(10))
.reduce(...)
will produce separate results for each userId.
Update: Note that in recent versions of Flink it is now possible for windows to collect late events to a side output.
Some relevant documentation:
Event Time and Watermarks
Allowed Lateness
