Will the Broadcast state source block processing? - apache-flink

I am using Flink version 1.13.0.
I use Broadcast state in my application, and it loads a large amount of data every 2 minutes (about 500,000 entries in a Map).
In the topology graph in the web UI, I see that every time the Broadcast source loads, it shows 50%-100% backpressure, and the operator it is connected to is 50%-100% busy. I want to know whether, during this time, that operator is blocked and processes data slowly, or stops processing data altogether.

BroadcastProcessFunction is thread-safe: its two methods are never called concurrently. While processBroadcastElement is being called to load state, the processElement function will not be executed.
Therefore, when the broadcast state is relatively large, normal data processing is blocked while it loads, and backpressure builds up as well.

Related

Is there a way to broadcast configuration into all task managers or all FlatMapFunctions?

We currently have a Flink-based streaming job (the task is composed of a complex DAG of FlatMapFunctions), and an HTTP interface for fetching configuration.
Now I hope to read configuration from the HTTP interface through a source function every 5 minutes with a parallelism of 1, and then distribute it to all task managers or FlatMapFunctions of the job. In the FlatMapFunctions, the configuration will only be read and will never be changed.
I have read the documentation, The Broadcast State Pattern, but the method in the documentation seems to apply only to the first function that receives the broadcast; other subsequent downstream FlatMapFunctions cannot read the broadcast state. As shown in the figure below, only Co-Process-Broadcast can obtain the broadcast, but map func 1 and map func 2 cannot.
[Figure: Broadcast state graph]
This is similar to QUESTION but different: I have many downstream FlatMapFunctions and expect all of them to get the broadcast configuration.
You can send the broadcast stream to multiple functions, so if your config state isn't big then that's likely what I'd do.
If the config state is very small (relative to the size of records being processed) then you could attach it to every incoming record in your BroadcastProcessFunction, so downstream operators have it in hand when processing each of their records.
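The second suggestion can be sketched without any Flink dependency. The class below is only a library-free model, and its names (`ConfigAttacher`, `Enriched`, `onBroadcast`, `onElement`) are hypothetical. In a real job the two methods would correspond to `processBroadcastElement` and `processElement` of a `BroadcastProcessFunction`, and the latest config would presumably live in broadcast state rather than in a plain field.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Library-free model (not Flink code); all names here are hypothetical.
final class ConfigAttacher {

    // A record paired with the config that was current when it was processed.
    static final class Enriched {
        final String event;
        final Map<String, String> config;

        Enriched(String event, Map<String, String> config) {
            this.event = event;
            this.config = config;
        }
    }

    // Stand-in for broadcast state; empty until the first config arrives.
    private Map<String, String> latestConfig = Map.of();

    // Plays the role of processBroadcastElement: remember the newest config.
    void onBroadcast(Map<String, String> config) {
        latestConfig = Map.copyOf(config);
    }

    // Plays the role of processElement: wrap the record with the current config.
    Enriched onElement(String event) {
        return new Enriched(event, latestConfig);
    }

    public static void main(String[] args) {
        ConfigAttacher attacher = new ConfigAttacher();
        List<Enriched> out = new ArrayList<>();

        attacher.onBroadcast(Map.of("threshold", "10"));
        out.add(attacher.onElement("event-1"));
        attacher.onBroadcast(Map.of("threshold", "20"));
        out.add(attacher.onElement("event-2"));

        // Downstream operators read the config straight off each record.
        for (Enriched e : out) {
            System.out.println(e.event + " -> threshold=" + e.config.get("threshold"));
        }
    }
}
```

The trade-off is the one already mentioned: every record now carries the config with it, so this only makes sense while the config stays small relative to the records.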

Using Broadcast State To Force Window Closure Using Fake Messages

Description:
Currently I am working on using Flink with an IoT setup. Essentially, devices are sending data such as (device_id, device_type, event_timestamp, etc.) and I don't have any control over when the messages get sent. I then key the stream by device_id and device_type to perform aggregations. I would like to use event time, given that it ensures the timers that are set trigger deterministically after a failure. However, given that this isn't always a high-throughput stream, a window could be opened for a 10-minute aggregation period but not receive its next point until approximately 40 minutes later. Although the aggregation would eventually be completed, it would output my desired result extremely late.
So my workaround for this is to create an additional external source that does nothing other than pump fake messages. By having these fake messages pumped out in alignment with my 10-minute aggregation period, even if a device hadn't sent any data, the event-time windows would have something to force them closed. The critical part here is to make sure that all parallel instances / operators have access to this fake message, because I need to close all the windows with this single fake message. I was thinking that Broadcast state might be the most appropriate way to accomplish this goal given: "Broadcast state is replicated across all parallel instances of a function, and might typically be used where you have two streams, a regular data stream alongside a control stream that serves rules, patterns, or other configuration messages." (Quote Source)
Questions:
Is broadcast state the best method for ensuring all parallel instances (e.g. windows) receive my fake messages?
Once the operators have access to this fake message via the broadcast state can this fake message then be used to advance the event time watermark?
You can make this work with broadcast state, along the lines you propose, but I'm not convinced it's the best solution.
In an ideal world I'd suggest you arrange for the devices to send occasional keepalive messages, but assuming that's not possible, I think a custom Trigger would work well here. You can extend the EventTimeTrigger so that in addition to the event time timer it creates via
ctx.registerEventTimeTimer(window.maxTimestamp());
you also create a processing time timer, as a fallback, and you FIRE the window if the window still exists when that processing time timer fires.
I'm recommending this approach because it's simpler and more directly addresses the specific need. With the broadcast state approach you'll have to introduce a source for these messages, add a broadcast state descriptor and stream, add special fake watermarks for the non-broadcast stream (set to Watermark.MAX_WATERMARK), connect the broadcast and non-broadcast streams and implement a BroadcastProcessFunction (that probably doesn't really do anything), etc. It's a lot of moving parts spread across several different operators.
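The decision logic of such a trigger can be sketched independently of Flink. The class below is only a library-free model for a single window, with hypothetical names (`FallbackTriggerModel`, `fallbackDelayMs`). Its methods mirror the `onElement`, `onEventTime` and `onProcessingTime` callbacks of a Flink `Trigger`; a real implementation would register both timers through the trigger context instead of keeping them in fields. When the fallback timer should fire is an assumption here: a fixed wall-clock delay after the window's first element.

```java
// Library-free model (not Flink code) of one window's trigger; names are hypothetical.
final class FallbackTriggerModel {

    enum Result { CONTINUE, FIRE }

    private final long windowMaxTimestamp;  // event-time end of the window
    private final long fallbackDelayMs;     // assumed wall-clock grace period
    private long processingTimeDeadline = Long.MAX_VALUE; // the "registered" fallback timer
    private boolean fired = false;          // models "the window no longer exists"

    FallbackTriggerModel(long windowMaxTimestamp, long fallbackDelayMs) {
        this.windowMaxTimestamp = windowMaxTimestamp;
        this.fallbackDelayMs = fallbackDelayMs;
    }

    // On the first element, set the processing-time fallback timer.
    Result onElement(long currentWatermark, long currentProcessingTime) {
        if (fired) {
            return Result.CONTINUE;
        }
        if (processingTimeDeadline == Long.MAX_VALUE) {
            processingTimeDeadline = currentProcessingTime + fallbackDelayMs;
        }
        // Fire right away if the watermark is already past the window end.
        return currentWatermark >= windowMaxTimestamp ? fire() : Result.CONTINUE;
    }

    // Normal path: the watermark has reached the end of the window.
    Result onEventTime(long watermark) {
        return (!fired && watermark >= windowMaxTimestamp) ? fire() : Result.CONTINUE;
    }

    // Fallback path: wall-clock time ran out and the window has not fired yet.
    Result onProcessingTime(long now) {
        return (!fired && now >= processingTimeDeadline) ? fire() : Result.CONTINUE;
    }

    private Result fire() {
        fired = true;
        return Result.FIRE;
    }

    public static void main(String[] args) {
        // A 10-minute window with an 11-minute wall-clock fallback.
        FallbackTriggerModel quiet = new FallbackTriggerModel(599_999L, 660_000L);
        System.out.println(quiet.onElement(0L, 1_000L));      // CONTINUE
        System.out.println(quiet.onProcessingTime(661_000L)); // FIRE (watermark never advanced)
    }
}
```

Whichever timer comes first closes the window, so a busy stream still fires on event time and only a quiet one falls back to the wall clock.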

Process elements after sinking to Destination

I am setting up a Flink pipeline that reads from Kafka and sinks to HDFS. I want to process the elements after the addSink() step, because I want to set up trigger files indicating that writing data (to the sink) for a certain partition/hour is complete. How can this be achieved? Currently I am using the Bucketing sink.
DataStream messageStream = env
    .addSource(flinkKafkaConsumer011);
// some aggregations to convert the message stream to a keyedStream
keyedStream.addSink(sink);
// How to process elements after the addSink() step?
The Flink APIs do not support extending the job graph beyond the sink(s). (You can, however, fork the stream and do additional processing in parallel with writing to the sink.)
With the Streaming File Sink you can observe the part files transition to the finished state when they complete. See the JavaDoc for more information.
State lives within a single operator -- only that operator (e.g., a ProcessFunction) can modify it. If you want to modify the keyed value state after the sink has completed, there's no straightforward way to do that. One idea would be to add a processing time timer in the ProcessFunction that has the keyed state that wakes up periodically and checks for newly finished part files, and based on their existence, modifies the state. Or if that's the wrong granularity, write a custom source that does something similar and streams or broadcasts information into the ProcessFunction (which will then have to be a CoProcessFunction or a KeyedBroadcastProcessFunction) that it can use to do the necessary state updates.

Flink window operator checkpointing

I want to know how Flink checkpoints the window operator. How does it ensure exactly-once semantics when recovering? For example, does it save the tuples in the current window and the progress of the current window's processing? I want to know the detailed process of the window operator's checkpoint and recovery.
All of Flink's stateful operators participate in the same checkpointing mechanism. When instructed to do so by the checkpoint coordinator (part of the job manager), the task managers initiate a checkpoint in each parallel instance of every source operator. The sources checkpoint their offsets and insert a checkpoint barrier into the stream. This divides the stream into the parts before and after the checkpoint. The barriers flow through the graph, and each stateful operator checkpoints its state upon having processed the stream up to the checkpoint barrier. The details are described at the link shared by @bupt_ljy.
Thus these checkpoints capture the entire state of the distributed pipeline, recording offsets into the input queues as well as the state throughout the job graph that has resulted from having ingested the data up to that point. When a failure occurs, the sources are rewound, the state is restored, and processing is resumed.
Given that during recovery the sources are rewound and replayed, "exactly once" means that the state managed by Flink is affected exactly once, not that the stream elements are processed exactly once.
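This distinction can be illustrated with a toy, single-operator simulation. It is not how Flink is implemented, and every name in it is hypothetical: the "source" is a list with an offset, the "state" is a running sum, and a "checkpoint" is a copy of both taken together.

```java
import java.util.List;

// Toy simulation (not Flink code); all names are hypothetical.
final class ReplayDemo {

    // A checkpoint captures the source offset and the operator state together.
    static final class Checkpoint {
        final int offset;
        final long sum;

        Checkpoint(int offset, long sum) {
            this.offset = offset;
            this.sum = sum;
        }
    }

    // Processes the input with one checkpoint and one simulated failure.
    // Returns {final sum, number of times an element was processed}.
    static long[] run(List<Integer> input, int checkpointAt, int failAt) {
        int offset = 0;
        long sum = 0;
        long calls = 0;
        Checkpoint last = new Checkpoint(0, 0);
        boolean failed = false;

        while (offset < input.size()) {
            if (offset == checkpointAt) {
                last = new Checkpoint(offset, sum); // barrier reaches the operator
            }
            if (offset == failAt && !failed) {
                failed = true;
                offset = last.offset;               // rewind the source
                sum = last.sum;                     // restore the state
                continue;
            }
            sum += input.get(offset);               // update the state
            offset++;
            calls++;
        }
        return new long[] {sum, calls};
    }

    public static void main(String[] args) {
        long[] result = run(List.of(1, 2, 3, 4, 5), 2, 4);
        // prints: sum=15, calls=7
        System.out.println("sum=" + result[0] + ", calls=" + result[1]);
    }
}
```

The elements at offsets 2 and 3 are processed twice, yet the final sum is the same as in a failure-free run: the state reflects each element exactly once even though the processing does not.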
There's nothing particularly special about windows in this regard. Depending on the type of window function being applied, a window's contents are kept in an element of managed ListState, ReducingState, AggregatingState, or FoldingState. As stream elements arrive and are being assigned to a window, they are appended, reduced, aggregated, or folded into that state. Other components of the window API, including Triggers and ProcessWindowFunctions, can have state that is checkpointed as well. For example, CountTrigger uses ReducingState to keep track of how many elements have been assigned to the window, adding one to the count as each element is added to the window.
In the case where the window function is a ProcessWindowFunction, all of the elements assigned to the window are saved in Flink state, and are passed in an Iterable to the ProcessWindowFunction when the window is triggered. That function iterates over the contents and produces a result. The internal state of the ProcessWindowFunction is not checkpointed; if the job fails during the execution of the ProcessWindowFunction the job will resume from the most recently completed checkpoint. This will involve rewinding back to a time before the window received the event that triggered the window firing (that event can't be included in the checkpoint because a checkpoint barrier following it can not have had its effect yet). Sooner or later the window will again reach the point where it is triggered and the ProcessWindowFunction will be called again -- with the same window contents it received the first time -- and hopefully this time it won't fail. (Note that I've ignored the case of processing-time windows, which do not behave deterministically.)
When a ProcessWindowFunction uses managed/checkpointed state, it is used to remember things between firings, not within a single firing. For example, a window that allows late events might want to store the result previously reported, and then issue an update for each late event.

How to stream delayed data from a real-time data source in Scala

I have to design a way to stream data in Scala/Java with a certain delay relative to the moment I receive the data. I have an API for my original data source (the real-time one) which lets me query it as if it were a database, using a query-like format, and also get notifications when something happens, much as with EHCACHE.
Now I want, for example, to stream data with different delays, according to the user's privileges. Some users will see data streamed with 0 delay, others with 15 minutes, others with 60 minutes.
I will therefore need a top-level cache composed of multiple lower-level caches (0 delay, 15 min delay, 60 min delay).
Initially, I would like the cache to contain only elements of the same type. This would make things simpler, because I can decide on a unique ID for the elements and would only have to route each request to the right cache according to the required delay.
Later, I would like my delayed cache to be queryable. Is there a part of the code from EHCache which I can reuse, for example?
An idea: Convert the stream data into some kind of event objects that implement Delayed. As soon as such a real-time event occurs, put it into a DelayQueue. Another thread can then call take() in a loop and retrieve and process the delayed events.
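A minimal sketch of this idea, using `java.util.concurrent.DelayQueue` from the JDK. The class and field names (`DelayedStreamDemo`, `DelayedEvent`, `payload`) are hypothetical, and millisecond delays stand in for the 15- and 60-minute tiers so that the example finishes quickly.

```java
import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// Minimal sketch; DelayedStreamDemo, DelayedEvent and payload are hypothetical names.
final class DelayedStreamDemo {

    // An event that becomes visible to consumers only after its delay has expired.
    static final class DelayedEvent implements Delayed {
        final String payload;
        private final long releaseAtNanos;

        DelayedEvent(String payload, long delay, TimeUnit unit) {
            this.payload = payload;
            this.releaseAtNanos = System.nanoTime() + unit.toNanos(delay);
        }

        // Remaining delay; zero or negative means the event may be taken.
        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(releaseAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        // Orders events so that the one released soonest is at the head.
        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public static void main(String[] args) throws InterruptedException {
        DelayQueue<DelayedEvent> queue = new DelayQueue<>();
        queue.put(new DelayedEvent("slow", 200, TimeUnit.MILLISECONDS));
        queue.put(new DelayedEvent("fast", 50, TimeUnit.MILLISECONDS));

        // take() blocks until the delay of the head element has expired.
        System.out.println(queue.take().payload); // fast
        System.out.println(queue.take().payload); // slow
    }
}
```

For the tiers in the question, one option would be one queue per delay tier with a consumer thread each; another would be a single queue in which each event is enqueued once per tier with the corresponding delay.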

Resources