Flink SQL Windows not Reporting Final Results

I'm using Flink SQL to compute event-time-based windowed analytics. Everything works fine until my data source becomes idle each evening, after which the results for the last minute aren't produced until the next day when data begins to flow again.
CREATE TABLE input (
  id STRING,
  data BIGINT,
  rowtime TIMESTAMP(3) METADATA FROM 'timestamp',
  WATERMARK FOR rowtime AS rowtime - INTERVAL '1' SECOND
) WITH (
  'connector' = 'kafka',
  'topic' = 'input',
  'properties.bootstrap.servers' = 'localhost:9092',
  'scan.startup.mode' = 'latest-offset',
  'format' = 'json'
)
SELECT ...
FROM (
  SELECT * FROM
    TABLE(TUMBLE(TABLE input, DESCRIPTOR(rowtime), INTERVAL '1' MINUTES))
)
GROUP BY ..., window_start, window_end
I've tried setting table.exec.source.idle-timeout, but it didn't help. What can I do?

table.exec.source.idle-timeout (and the corresponding withIdleness construct used with the DataStream API for WatermarkStrategy) detects idle input partitions and prevents them from holding back the progress of the overall watermark. However, for the overall watermark to advance, there must still be some input, somewhere.
Some options:
(1) Live with the problem, which means waiting until the watermark can advance normally, based on observing larger timestamps in the input stream. As you've indicated, in your use case this can require waiting several hours.
(2) Arrange for the input stream(s) to contain keep-alive messages. This way the watermark generator will have evidence (based on the timestamps in the keep-alive messages) that it can advance the watermark. You'll have to modify your queries to ignore these otherwise extraneous events.
(3) Upon reaching the point where the job has fully ingested all of the daily input, but hasn't yet produced the final set of results, stop the job and specify --drain. This will send a watermark with value MAX_WATERMARK through the pipeline, which will close all pending windows. You can then restart the job.
(4) Implement a custom watermark strategy that uses a processing-time timer to detect idleness and artificially advance the watermark based on the passage of wall clock time. This will require converting your table input to a DataStream, adding the watermarks there, and then converting back to a table for the windowing. See https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/table/data_stream_api/ for examples of these conversions.
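A minimal sketch of such a generator (hypothetical class name, timeout, and out-of-orderness values; the generic type parameter stands in for your event type):

import java.time.Duration;
import org.apache.flink.api.common.eventtime.Watermark;
import org.apache.flink.api.common.eventtime.WatermarkGenerator;
import org.apache.flink.api.common.eventtime.WatermarkOutput;

// Hypothetical generator: advances the watermark using wall-clock time
// once the source has been idle for longer than IDLE_TIMEOUT_MS.
public class IdleAwareWatermarkGenerator<T> implements WatermarkGenerator<T> {

    private static final long IDLE_TIMEOUT_MS = Duration.ofMinutes(1).toMillis();
    private static final long MAX_OUT_OF_ORDERNESS_MS = 1000;

    private long maxTimestamp = Long.MIN_VALUE;
    private long lastEventWallClock = System.currentTimeMillis();

    @Override
    public void onEvent(T event, long eventTimestamp, WatermarkOutput output) {
        maxTimestamp = Math.max(maxTimestamp, eventTimestamp);
        lastEventWallClock = System.currentTimeMillis();
    }

    @Override
    public void onPeriodicEmit(WatermarkOutput output) {
        if (maxTimestamp == Long.MIN_VALUE) {
            return; // no events seen yet
        }
        long idleFor = System.currentTimeMillis() - lastEventWallClock;
        if (idleFor > IDLE_TIMEOUT_MS) {
            // Source is idle: artificially advance the watermark by the
            // wall-clock time elapsed since the last event.
            output.emitWatermark(new Watermark(maxTimestamp + idleFor));
        } else {
            output.emitWatermark(new Watermark(maxTimestamp - MAX_OUT_OF_ORDERNESS_MS - 1));
        }
    }
}

You would attach it with WatermarkStrategy.forGenerator(ctx -> new IdleAwareWatermarkGenerator<>()) on the DataStream before converting back to a table.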

Related

flink cep sql Event Not triggering

I use a CEP pattern in Flink SQL, which works as expected when connecting to a Kafka broker. But when I connect to a cluster-based cloud Kafka setup, the Flink CEP is not triggering. Here is my SQL:
create table agent_action_detail (
  agent_id String,
  room_id String,
  create_time Bigint,
  call_type String,
  application_id String,
  connect_time Bigint,
  row_time TIMESTAMP_LTZ(3),
  WATERMARK for row_time as row_time - INTERVAL '1' MINUTE
) with ('connector'='kafka', 'topic'='agent-action-detail', ...)
Then I send messages in JSON format like:
{"agent_id":"agent_221","room_id":"room1","create_time":1635206828877,"call_type":"inbound","application_id":"app1","connect_time":1635206501735,"row_time":"2021-10-25 16:07:09.019Z"}
In the Flink web UI, the watermark advances as expected (screenshot omitted).
Then I run my CEP SQL:
select * from agent_action_detail
match_recognize(
  partition by agent_id
  order by row_time
  measures
    last(BF.create_time) as create_time,
    first(AF.connect_time) as connect_time
  one row per match
  AFTER MATCH SKIP PAST LAST ROW
  pattern (BF+ AF)
  define
    BF as BF.connect_time > 0,
    AF as AF.connect_time > 0
)
In every Kafka message connect_time is > 0, but Flink is not triggering any matches.
Can somebody help with this issue? Thanks in advance!
Here is another CEP SQL query that still produces no results:
select * from agent_action_detail
match_recognize(
  partition by agent_id
  order by row_time
  measures AF.connect_time as connect_time
  one row per match
  pattern (BF AF) WITHIN INTERVAL '1' second
  define
    BF as (last(BF.connect_time, 1) < 1),
    AF as AF.connect_time >= 100
)
And the agent_action_detail table is populated by another Flink SQL job:
insert into agent_action_detail
select data.agent_id, data.room_id, data.create_time, data.call_type,
       data.application_id, data.connect_time, now()
from source_table
where type = 'xxx'
There are several things that can cause pattern matching to produce no results:
the input doesn't actually contain the pattern
watermarking is being done incorrectly
the pattern is pathological in some way
This particular pattern loops with no exit condition. This sort of pattern doesn't allow the internal state of the pattern matching engine to ever be cleared, which will lead to problems.
If you were using Flink CEP directly, I would tell you to try adding until(condition) or within(time) to constrain the number of possible matches.
With MATCH_RECOGNIZE, see if you can add a distinct terminating element to the pattern.
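For illustration, here is a hedged sketch of the direct Flink CEP version with a within() constraint; the AgentEvent type, its getConnectTime() accessor, and the 1-minute bound are hypothetical stand-ins for the question's schema:

import org.apache.flink.cep.pattern.Pattern;
import org.apache.flink.cep.pattern.conditions.SimpleCondition;
import org.apache.flink.streaming.api.windowing.time.Time;

// Equivalent of pattern (BF+ AF), but bounded in time so the matching
// engine can eventually discard partial matches and clear its state.
Pattern<AgentEvent, ?> pattern =
    Pattern.<AgentEvent>begin("BF")
        .where(new SimpleCondition<AgentEvent>() {
            @Override
            public boolean filter(AgentEvent e) {
                return e.getConnectTime() > 0;
            }
        })
        .oneOrMore()
        .next("AF")
        .where(new SimpleCondition<AgentEvent>() {
            @Override
            public boolean filter(AgentEvent e) {
                return e.getConnectTime() > 0;
            }
        })
        .within(Time.minutes(1)); // bounds the otherwise unbounded BF+ loop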
Update: since you are still getting no results after modifying the pattern, you should determine if watermarking is the source of your problem. CEP relies on sorting the input stream by time, which depends on watermarking -- but only if you are using event time.
The easiest way to test this would be to switch to using processing time:
create table agent_action_detail (
  agent_id String,
  ...
  row_time AS PROCTIME()
) with (...)
If that works, then either the timestamps or the watermarks are the problem. For example, if all of the events are late, you'll get no results. In your case, I'm wondering whether the row_time column has any data in it.
If that doesn't reveal the problem, please share a minimal reproducible example, including the data needed to observe the problem.

Improper window output intervals from Flink

I am new to Flink. I am replacing the Kafka Streams API with Flink, because Kafka Streams internally creates multiple internal topics, which adds overhead.
However, in the Flink job, all I am doing is:
Dedupe the records in a given window of 1 hour. (Window(TumblingEventTimeWindows(3600000), EventTimeTrigger, Job$$Lambda$1097/1241473750, PassThroughWindowFunction))
deDupedStream = deserializedStream
.keyBy(msg -> new StringBuilder()
.append("XXX").append("YYY"))
.timeWindow(Time.milliseconds(3600000)) // 1 hour
.reduce((event1, event2) -> {
event2.setEventTimeStamp(Math.max(event1.getEventTimeStamp(), event2.getEventTimeStamp()));
return event2;
})
.setParallelism(mapParallelism > 0 ? mapParallelism : defaultMapParallelism);
After deduping, I do another level of windowing and count the records before producing to the Kafka topic. (Window(TumblingEventTimeWindows(3600000), EventTimeTrigger, Job$$Lambda$1101/2132463744, PassThroughWindowFunction) -> Map)
SingleOutputStreamOperator<PlImaItemInterimMessage> countedStream = deDupedStream
.filter(event -> event.getXXX() != null)
.map(this::buildXXXObject)
.returns(XXXObject.class)
.setParallelism(deDupMapParallelism > 0 ? deDupMapParallelism : defaultDeDupMapParallelism)
.keyBy(itemInterimMsg -> String.valueOf("key1") + "key2" + "key3")
.timeWindow(Time.milliseconds(3600000))
.reduce((existingMsg, currentMsg) -> { // Aggregate
currentMsg.setCount(existingMsg.getCount() + currentMsg.getCount());
return currentMsg;
})
.setParallelism(deDupMapParallelism > 0 ? deDupMapParallelism : defaultDeDupMapParallelism);
countedStream.addSink(kafkaProducerSinkFunction);
With the above setup, my assumption is that the destination Kafka topic will get the aggregated results every 3600000 ms (1 hour). But the Grafana graph shows results being emitted roughly every 30 minutes. I do not understand why, when the window is still a 1-hour range. Any suggestions?
The emit intervals observed on the Kafka destination topic were attached as a graph (screenshot omitted).
While I can't fully diagnose this without seeing more of the project, here are some points that you may have overlooked:
When the Flink Kafka producer is used in exactly once mode, it only commits its output when checkpointing. Consumers of your job's output, if set to read committed, will only see results when checkpoints complete. How often is your job checkpointing?
When the Flink Kafka producer is used in at least once mode, it can produce duplicated output. Is your job restarting at regular intervals?
Flink's event time window assigners use the timestamps in the stream record metadata to determine the timing of each event. These metadata timestamps are set when you call assignTimestampsAndWatermarks (see the sketch after this list). Calling setEventTimeStamp in the window reduce function has no effect on these timestamps in the metadata.
The stream record metadata timestamps on events emitted by a time window are set to the end time of the window, and those are the timestamps considered by the window assigner of any subsequent window.
keyBy(msg -> new StringBuilder().append("XXX").append("YYY")) is partitioning the stream by a constant, and will assign every record to the same partition.
The second keyBy (right before the second window) is replacing the first keyBy (rather than imposing further partitioning).
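As a minimal sketch of the third point, assuming the Event POJO and getEventTimeStamp() accessor implied by the question (both assumptions, as is the 10-second out-of-orderness bound), timestamps and watermarks could be assigned before the first window like this:

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.streaming.api.datastream.DataStream;

// Copy each event's own timestamp into the stream record metadata so
// that the event-time window assigners see the intended event times.
DataStream<Event> withTimestamps = deserializedStream
    .assignTimestampsAndWatermarks(
        WatermarkStrategy
            .<Event>forBoundedOutOfOrderness(Duration.ofSeconds(10))
            .withTimestampAssigner((event, previousTs) -> event.getEventTimeStamp()));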

the policy of new watermark generation when defined in DDL

In https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/table/streaming/time_attributes.html, the DDL for the event time attribute and watermark is:
CREATE TABLE user_actions (
user_name STRING,
data STRING,
user_action_time TIMESTAMP(3),
-- declare user_action_time as event time attribute and use 5 seconds delayed watermark strategy
WATERMARK FOR user_action_time AS user_action_time - INTERVAL '5' SECOND
) WITH (
...
);
I would like to ask about the policy for generating new watermarks. With the DataStream API, Flink provides the following two policies for watermark generation:
periodically, like AssignerWithPeriodicWatermarks does, that is, try to generate a new watermark periodically
punctuated, like AssignerWithPunctuatedWatermarks does, that is, try to generate a new watermark when a new event comes
What about in DDL?
The watermark is assigned periodically. You can specify the interval via the configuration option pipeline.auto-watermark-interval [1].
Also note that the API for watermarks was changed in the DataStream API, and the two classes you mention are now deprecated.
[1] https://ci.apache.org/projects/flink/flink-docs-release-1.12/dev/event_timestamps_watermarks.html#introduction-to-watermark-strategies
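For example, a hedged sketch of setting this option through the Table API configuration (the 5-second interval is an arbitrary choice):

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

TableEnvironment tEnv =
    TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

// Generate a new watermark at most every 5 seconds (the default is 200 ms).
tEnv.getConfig().getConfiguration().setString("pipeline.auto-watermark-interval", "5 s");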

Flink Watermark

In Flink, I found two ways to set up watermarks.
The first is:
val env = StreamExecutionEnvironment.getExecutionEnvironment
env.getConfig.setAutoWatermarkInterval(5000)
The second is:
env.addSource(
new FlinkKafkaConsumer[...](...)
).assignTimestampsAndWatermarks(
WatermarkStrategy.forBoundedOutOfOrderness[...](Duration.ofSeconds(10)).withTimestampAssigner(...)
)
I would like to know which will take effect eventually.
There's no conflict at all between those two -- they are dealing with separate concerns. Everything specified will take effect.
The first one,
env.getConfig.setAutoWatermarkInterval(5000)
is specifying how often you want watermarks to be generated (one watermark every 5000 msec). If this wasn't specified, the default of 200 msec would be used instead.
The second,
env.addSource(
new FlinkKafkaConsumer[...](...)
).assignTimestampsAndWatermarks(
WatermarkStrategy.forBoundedOutOfOrderness[...](Duration.ofSeconds(10)).withTimestampAssigner(...)
)
is specifying the details of how those watermarks are to be computed. I.e., they should be generated by the FlinkKafkaConsumer using a BoundedOutOfOrderness strategy, with a bounded delay of 10 seconds. The WatermarkStrategy also needs a timestamp assigner.
There's no default WatermarkStrategy, so something like this second code snippet is required if you want to work with event time.

Flink CEP cannot get correct results on a unioned table

I use Flink SQL and CEP to recognize some really simple patterns. However, I found a weird thing (likely a bug). I have two example tables password_change and transfer as below.
transfer
transid,accountnumber,sortcode,value,channel,eventtime,eventtype
1,123,1,100,ONL,2020-01-01T01:00:01Z,transfer
3,123,1,100,ONL,2020-01-01T01:00:02Z,transfer
4,123,1,200,ONL,2020-01-01T01:00:03Z,transfer
5,456,1,200,ONL,2020-01-01T01:00:04Z,transfer
password_change
accountnumber,channel,eventtime,eventtype
123,ONL,2020-01-01T01:00:05Z,password_change
456,ONL,2020-01-01T01:00:06Z,password_change
123,ONL,2020-01-01T01:00:08Z,password_change
123,ONL,2020-01-01T01:00:09Z,password_change
Here are my SQL queries.
First create a temporary view event as
(SELECT accountnumber,rowtime,eventtype FROM password_change WHERE channel='ONL')
UNION ALL
(SELECT accountnumber,rowtime, eventtype FROM transfer WHERE channel = 'ONL' )
The rowtime column is the event time, extracted directly from the original eventtime column, with a periodic bounded watermark of 1 second.
Then output the query result of
SELECT * FROM `event`
MATCH_RECOGNIZE (
PARTITION BY accountnumber
ORDER BY rowtime
MEASURES
transfer.eventtype AS event_type,
transfer.rowtime AS transfer_time
ONE ROW PER MATCH
AFTER MATCH SKIP PAST LAST ROW
PATTERN (transfer password_change ) WITHIN INTERVAL '5' SECOND
DEFINE
password_change AS eventtype='password_change',
transfer AS eventtype='transfer'
)
It should output
123,transfer,2020-01-01T01:00:03Z
456,transfer,2020-01-01T01:00:04Z
But I got nothing when running Flink 1.11.1 (also no output for 1.10.1).
What's more, if I change the pattern to only password_change, it still outputs nothing; if I change the pattern to transfer, it outputs several rows, but not all of the transfer rows. If I exchange the event times of the two tables, so that the password_change events happen first, then the pattern password_change outputs several rows while transfer does not.
On the other hand, if I extract those columns from the two tables, merge them into one table manually, and then emit them into Flink, the result is correct.
I searched and tried a lot to get this right, including changing the SQL statement, the watermark, the buffer timeout, and so on, but nothing helped. I hope someone here can help. Thanks.
10/10/2020 update:
I use Kafka as the table source. tEnv is the StreamTableEnvironment.
Kafka kafka=new Kafka()
.version("universal")
.property("bootstrap.servers", "localhost:9092");
tEnv.connect(
kafka.topic("transfer")
).withFormat(
new Json()
.failOnMissingField(true)
).withSchema(
new Schema()
.field("rowtime",DataTypes.TIMESTAMP(3))
.rowtime(new Rowtime()
.timestampsFromField("eventtime")
.watermarksPeriodicBounded(1000)
)
.field("channel",DataTypes.STRING())
.field("eventtype",DataTypes.STRING())
.field("transid",DataTypes.STRING())
.field("accountnumber",DataTypes.STRING())
.field("value",DataTypes.DECIMAL(38,18))
).createTemporaryTable("transfer");
tEnv.connect(
kafka.topic("pchange")
).withFormat(
new Json()
.failOnMissingField(true)
).withSchema(
new Schema()
.field("rowtime",DataTypes.TIMESTAMP(3))
.rowtime(new Rowtime()
.timestampsFromField("eventtime")
.watermarksPeriodicBounded(1000)
)
.field("channel",DataTypes.STRING())
.field("accountnumber",DataTypes.STRING())
.field("eventtype",DataTypes.STRING())
).createTemporaryTable("password_change");
Thanks to @Dawid Wysakowicz's answer. To confirm it, I added 4,123,1,200,ONL,2020-01-01T01:00:10Z,transfer to the end of the transfer table, and then the output became correct, which means it really is a problem with the watermarks.
So now the question is how to fix it. Since a user will not change his/her password frequently, the time gap between these two tables is unavoidable. I just need the UNION ALL table to behave the same as the table I merged manually.
Update Nov. 4th 2020:
WatermarkStrategy with idle sources may help.
Most likely the problem is somewhere around watermark generation in conjunction with the UNION ALL operator. Could you share how you create the two tables, including how you define the time attributes and what the connectors are? That would let me confirm my suspicions.
I think the problem is that one of the sources stops emitting watermarks. If the transfer table (the table with the lower timestamps) does not finish and produces no further records, it emits no new watermarks. After emitting its fourth row it will emit Watermark = :03 (:04 minus the 1-second bound). The watermark of a union of inputs is the smallest of the watermarks of the two inputs. Therefore the transfer table holds the overall watermark back at :03, and thus you see no progress for the original query, while you see some records emitted for the pattern over the table with the smaller timestamps.
If you manually join the two tables, you have just a single input with a single source of Watermarks and thus it progresses further and you see some results.
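Following the Nov. 4th update: a hedged sketch of what withIdleness looks like on a DataStream WatermarkStrategy (the TxnEvent type, its timestamp accessor, and the 30-second timeout are all assumptions). An input marked idle is temporarily excluded from the combined watermark calculation, so it stops holding back the union:

import java.time.Duration;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;

// Hypothetical row type standing in for the transfer/password_change rows.
WatermarkStrategy<TxnEvent> strategy =
    WatermarkStrategy
        .<TxnEvent>forBoundedOutOfOrderness(Duration.ofSeconds(1))
        .withTimestampAssigner((event, previousTs) -> event.getEventTimeMillis())
        // After 30 seconds without events, mark this input as idle so it
        // no longer holds back the watermark of the UNION ALL.
        .withIdleness(Duration.ofSeconds(30));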
