Map State lifecycle in a keyed process function or windowed stream - apache-flink

Is MapState content automatically cleared after the window expires, or when the onTimer function for that particular key is called, or does it have to be cleared manually when no TTL config is defined?

Any state you register yourself and that doesn't have TTL defined will be retained indefinitely.
Flink's built-in windows are cleaned up automatically, but windows you implement yourself in a KeyedProcessFunction using MapState need to be manually cleared when they are no longer useful.
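To make the second point concrete, here is a minimal plain-Java model of the bookkeeping. It is not Flink code: a TreeMap stands in for one key's MapState<Long, Long> (window end timestamp to count), and a direct method call stands in for the onTimer callback. The class name, method names, and window size are hypothetical.

```java
import java.util.TreeMap;

/** Plain-Java model of one key's hand-rolled tumbling windows (not Flink API). */
final class ManualWindowSketch {
    // Stand-in for MapState<Long, Long>: window end timestamp -> event count.
    private final TreeMap<Long, Long> countsByWindowEnd = new TreeMap<>();
    private final long windowSizeMs;

    ManualWindowSketch(long windowSizeMs) {
        this.windowSizeMs = windowSizeMs;
    }

    /** Models processElement; assumes non-negative event timestamps. */
    void processElement(long eventTimestampMs) {
        long windowEnd = eventTimestampMs - (eventTimestampMs % windowSizeMs) + windowSizeMs;
        countsByWindowEnd.merge(windowEnd, 1L, Long::sum);
    }

    /** Models onTimer for a timer registered at windowEnd: return the result, drop the entry. */
    Long onTimer(long windowEnd) {
        // Without this remove(), the entry for the finished window would stay in state forever.
        return countsByWindowEnd.remove(windowEnd);
    }

    int openWindows() {
        return countsByWindowEnd.size();
    }
}
```

In a real KeyedProcessFunction the equivalent of the remove() call would be an explicit removal from the MapState inside onTimer; nothing else deletes the entry.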

Related

Cleanup configuration for ProcessWindowFunction's window state without TTL with RocksDB as backend

Flink offers TTL configuration for managed state and,
when using RocksDB as backend,
it executes cleanup in a custom compaction filter
(if I understand correctly).
However, in the case of keyed windowed state in a ProcessWindowFunction,
the expectation is that we override the clear method and explicitly call something like
context.windowState().*.clear()
If the state descriptor does not configure TTL,
does cleanup still occur after the clear callback?
If not, and cleanup for this type of state depends solely on sizes in RocksDB's levels,
what's the default setting and is it configurable?
If the state descriptor does not configure TTL, does cleanup still occur after the clear callback?
Yes, unless the state descriptor was used to create state in the KeyedStateStore returned by ProcessWindowFunction.Context#globalState(). This global state is the only state that is kept after windows are cleared. If you have an ever-growing key space, you should configure state TTL for any globalState you use, as otherwise globalState for stale keys will never be cleaned up.
FWIW, there's nothing RocksDB-specific about this. The answer is the same for any of the state backends.

TTL for state in ProcessWindowFunction

I would like to set the TTL of the state in a ProcessWindowFunction. This state is shared across windows. The TTL needs to be based on an attribute of the event itself, so I cannot set it in the state descriptor. Also, the onTimer function is not supported in ProcessWindowFunction.
Is there any other way to achieve this?
If the time-to-live must be computed as a function of the event itself, then you can't use the state TTL mechanism.
The only alternative is to use timers with a KeyedProcessFunction, rather than using the window API. There's an example in the Flink documentation: https://ci.apache.org/projects/flink/flink-docs-stable/learn-flink/event_driven.html#example

What happens to keyed window-global state without TTL if a key is never seen again?

Flink's ProcessWindowFunction can use so-called global state with something like context.globalState().getState.
Usually, such state could grow and shrink as time moves forward,
but what happens if global state without TTL was created for a key and that key is never seen again?
According to the documentation,
TTL cannot be added during upgrade,
so the state will stay there forever?

How is keyed state managed for KeyedBroadcastProcessFunction in Flink?

I am using BroadcastState to perform streaming computation in Flink. I have defined a class extending KeyedBroadcastProcessFunction for my job. Say I have a stream A, which is keyed by (user_id, location), and a stream B, which is broadcast to all executors to process elements in A using my class. I understand I can register a timer in processBroadcastElement or processElement in this class so that when it fires, I can delete the associated state for a specific key group by calling state.clear(). I wonder whether this key group still exists after that.
For example, in stream A, a new message comes with (user_id=1, location='usa') and we have such key group and its associated states generated. After that if another message with (user_id=1, location='usa') comes, it will trigger processElement() and emit result.
Say after 24 hours I'm no longer interested in this key group (user_id=1, location='usa'). I can register a timer to clear the associated state, but I have no control over the key group itself. As a result, after 24 hours, when another message with (user_id=1, location='usa') comes, since this key group still exists, processElement() will still be invoked. As the job runs, although the associated states will be cleared after 24 hours, will key groups accumulate, or is that not a concern for memory usage?
Relevant blogs: https://www.da-platform.com/blog/a-practical-guide-to-broadcast-state-in-apache-flink
Flink's keyed state is organized as a distributed (or sharded) key-value store, where the keys can be simple things, like integers and strings, or composites, like (user_id=1, location='usa'). Key groups are something different from composite keys. A key group is a runtime construct that was introduced in Flink 1.2 (see FLINK-3755) to permit efficient rescaling of key-value state. A key group is a subset of the key space, and is checkpointed as an independent unit. At runtime, all of the keys in the same key group are partitioned together in the job graph -- each subtask has the key-value state for one or more complete key groups. This design doc gives more details. As a user working with the DataStream API, key groups are an implementation detail, and not something you work with directly.
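As a rough illustration of the difference between a key and a key group, the following plain-Java sketch models the two mappings involved: key to key group, and key group to parallel subtask. It is a simplified model, not Flink's implementation: to my understanding Flink applies a murmur hash to key.hashCode() before taking the modulus, whereas this sketch uses Math.floorMod directly. The class and method names are hypothetical.

```java
/** Plain-Java model of key-group assignment (simplified; not Flink's actual hash). */
final class KeyGroupSketch {

    /** Maps a key to a key group in [0, maxParallelism). */
    static int keyGroupFor(Object key, int maxParallelism) {
        // Simplified stand-in for the hash Flink applies to key.hashCode().
        return Math.floorMod(key.hashCode(), maxParallelism);
    }

    /** Maps a key group to the parallel subtask that owns it. */
    static int subtaskFor(int keyGroup, int maxParallelism, int parallelism) {
        // Each subtask owns a contiguous range of key groups.
        return keyGroup * parallelism / maxParallelism;
    }
}
```

The number of key groups is fixed by the maximum parallelism, so it does not grow with the number of distinct keys. Rescaling only changes which subtask owns a given key group.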
As for timers in a KeyedBroadcastProcessFunction, they can be registered in the processElement or onTimer method, but not in the processBroadcastElement method. This is because timers are always associated with a key, and there is no key associated with a broadcast element. You can, however, manipulate any or all of the keyed state during your processBroadcastElement method by using the applyToKeyedState method on the KeyedBroadcastProcessFunction.Context object. See the docs for more details.
Once you call state.clear(), the state entry for that key is deleted. New stream events for that key may, of course, arrive after the state has been cleared, and you are able to once again store value state for that key, if you wish. In order to avoid unbounded memory usage due to keeping state for no-longer-relevant keys, you do need to be careful. You might want some logic like this to expire the state 24 hours after each time it is created:
processElement:
    if state.value() is null, register timer
    state.update(...)
onTimer:
    state.clear()
Or you might need more complex logic that extends the lifetime of the state whenever it is updated or accessed.
Another option would be to use the state time-to-live feature.
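The timer-based pseudocode above can be modelled in plain Java to show its behaviour over time. This is only a sketch of the logic, not Flink code: two HashMaps stand in for keyed ValueState and for the per-key timers, and advanceTo plays the role of Flink firing onTimer. All names and the fixed 24-hour constant are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

/** Plain-Java model of "expire state 24 hours after it is created" (not Flink API). */
final class ExpiringStateSketch {
    static final long TTL_MS = 24L * 60 * 60 * 1000;

    private final Map<String, Long> state = new HashMap<>();   // stand-in for keyed ValueState<Long>
    private final Map<String, Long> timers = new HashMap<>();  // stand-in for one timer per key

    /** Models processElement: register a timer only when the state is first created. */
    void processElement(String key, long value, long nowMs) {
        if (!state.containsKey(key)) {
            timers.put(key, nowMs + TTL_MS);
        }
        state.put(key, value);
    }

    /** Models time advancing: fire every due timer and clear the state for its key. */
    void advanceTo(long nowMs) {
        timers.entrySet().removeIf(timer -> {
            if (timer.getValue() <= nowMs) {
                state.remove(timer.getKey()); // corresponds to state.clear() in onTimer
                return true;
            }
            return false;
        });
    }

    Long valueFor(String key) {
        return state.get(key);
    }
}
```

After the timer fires, nothing is retained for the key. A later event for the same key simply creates fresh state and a fresh timer.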
Update:
Whenever you are in a processElement or onTimer method of any of the ProcessFunction types, there is a specific key implicitly in context, and anything done to keyed state (such as .update() or .clear()) will only affect the state for that one key.
Broadcast state works differently. Broadcast state is always MapState, and is replicated into all of the parallel subtasks. Broadcast state is keyless -- if you read broadcast state during the processElement method you will see the same value for the broadcast state regardless of what key is in context during that call.
Only in the processBroadcastElement method of a KeyedBroadcastProcessFunction can you modify (or clear) broadcast state, and it's important that whatever modifications (or deletions) occur be done in the same way in all of the parallel instances. It is designed this way to guarantee that every parallel instance will have the same contents in broadcast state. Ignoring this rule will lead to inconsistencies in the state, which can be very difficult to debug. See the docs for more info.
So yes, if you call .clear() on the broadcast state, then all of the broadcast state for all keys will be removed. Or you might remove a specific item from the broadcast state (remember, broadcast state is MapState), in which case that specific item will be removed for all keys.
There are several examples of working with broadcast state in the Flink training site. See
https://training.da-platform.com/exercises/ongoingRides.html
https://training.da-platform.com/exercises/nearestTaxi.html
https://training.da-platform.com/exercises/taxiQuery.html

Apache Flink: Window checkpoint

I want to know how to checkpoint a window. For example, windowed wordcount:
DataStream<Tuple3<String, Long, Long>> counts =
    // split up the lines in pairs (2-tuples) containing: (word,1)
    text
        .flatMap(new Tokenizer())
        .assignTimestampsAndWatermarks(new timestamp())
        .keyBy(0)
        .timeWindow(Time.seconds(2))
        .process(new CountFunction());
Q1: What state should I save in CountFunction()? Do I need to save the buffer element of the window? Should I use ListState to store the buffered data in the window and use ValueState to store the current sum value?
Q2: When the fault occurs, how are the elements in the window handled? What happens when the window is restored?
Thank you for the help.
All of the state needed for Flink's windowing APIs is managed by Flink -- so you don't need to do anything. So long as checkpointing is enabled, the window buffer will be checkpointed and restored as needed.
Normally the CountFunction won't have any state that needs to be checkpointed. If the job fails while CountFunction is in the middle of iterating over the window's contents, the job will be rewound, and CountFunction will be called again with the same inputs.
If you do need to keep state in your CountFunction, then see Using per-window state in ProcessWindowFunction for information on how to go about that. It sounds like you will want to use globalState() (state that endures across all time), which you can access via the Context object passed to your process window function.
While you don't have a keyed stream, I suggest you use the keyed state mechanism described above. You can transform your non-keyed stream into a keyed stream by using keyBy with a constant key.
