How Apache Flink manages MQTT consumer offsets - apache-flink

I'm using an MQTT consumer as my Flink job's data source. I'm wondering how to save the consumer's offsets into a checkpoint so that no data is lost when the Flink cluster restarts after a failure. I've seen plenty of articles explaining how Apache Flink manages Kafka consumer offsets. Does anyone know whether Apache Flink has a built-in mechanism for managing an MQTT consumer? Thanks.

If you have an MQTT consumer, you should make sure it uses the Data Source API. You can read about that at https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/sources/ - that page also covers how a source integrates with checkpointing. You can also read the details in FLIP-27: https://cwiki.apache.org/confluence/display/FLINK/FLIP-27%3A+Refactor+Source+Interface
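To make that concrete, here is a rough, hypothetical sketch (not an existing connector) of where offsets enter the picture in a FLIP-27 source: the SourceReader's snapshotState() returns the currently assigned splits, and whatever position you keep on those splits is what Flink writes into each checkpoint and hands back via addSplits() on recovery. Class and field names such as MqttSplit and lastDeliveredMessageId are made up for illustration.

```java
import org.apache.flink.api.connector.source.ReaderOutput;
import org.apache.flink.api.connector.source.SourceReader;
import org.apache.flink.api.connector.source.SourceSplit;
import org.apache.flink.core.io.InputStatus;

import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;

/** One "split" per subscribed topic; it carries the reader's current position. */
class MqttSplit implements SourceSplit {
    final String topic;
    long lastDeliveredMessageId;   // whatever notion of "offset" your broker/session exposes

    MqttSplit(String topic, long lastDeliveredMessageId) {
        this.topic = topic;
        this.lastDeliveredMessageId = lastDeliveredMessageId;
    }

    @Override
    public String splitId() {
        return topic;
    }
}

class MqttSourceReader implements SourceReader<String, MqttSplit> {
    private final List<MqttSplit> assignedSplits = new ArrayList<>();

    @Override
    public void start() { /* connect to the broker and subscribe to the assigned topics */ }

    @Override
    public InputStatus pollNext(ReaderOutput<String> output) {
        // Stub: emit buffered MQTT messages here and advance lastDeliveredMessageId on the split.
        return InputStatus.NOTHING_AVAILABLE;
    }

    @Override
    public List<MqttSplit> snapshotState(long checkpointId) {
        // This is the hook Flink calls on every checkpoint: whatever is returned here
        // (the splits with their current positions) is written into the checkpoint
        // and handed back via addSplits() when the job is restored.
        return new ArrayList<>(assignedSplits);
    }

    @Override
    public CompletableFuture<Void> isAvailable() {
        return CompletableFuture.completedFuture(null);
    }

    @Override
    public void addSplits(List<MqttSplit> splits) {
        assignedSplits.addAll(splits);   // called on startup and on restore from a checkpoint
    }

    @Override
    public void notifyNoMoreSplits() { }

    @Override
    public void close() { /* disconnect from the broker */ }
}
```

The SplitEnumerator side has an analogous snapshotState() hook, and both the split state and the enumerator state need SimpleVersionedSerializer implementations; see the linked docs for the full interface.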

You should read the state backends section of the documentation, as well as the checkpoints section:
When checkpointing is enabled, managed state is persisted to ensure consistent recovery in case of failures. Where the state is persisted during checkpointing depends on the chosen Checkpoint Storage.
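For context, here is a minimal sketch (assuming Flink 1.13+ API names) of the two knobs that paragraph refers to: the state backend, which holds the working state, and the checkpoint storage, which is where snapshots are persisted. The bucket path is a placeholder.

```java
import org.apache.flink.runtime.state.hashmap.HashMapStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class CheckpointSetup {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.enableCheckpointing(60_000);                 // take a checkpoint every 60 s
        env.setStateBackend(new HashMapStateBackend());  // keep working state on the JVM heap
        env.getCheckpointConfig()
           .setCheckpointStorage("s3://my-bucket/flink-checkpoints");  // persist snapshots durably

        // ... define your MQTT source and the rest of the pipeline, then env.execute(...)
    }
}
```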

Related

Deployment of new version of Flink application failed

env
flink 1.7.1
kafka 1.0.1
I run a Flink application as a streaming job: it reads a topic from Kafka and sinks the data to a new Kafka topic.
When I deploy a new version of the application code, the job fails on execution.
If I deploy with the same group.id after changing the application code, could there be a conflict with the previous checkpointed state?
Yes, if you are trying to do a stateful upgrade of your Flink application, there are a few things that can cause it to fail.
The UIDs of the stateful operators are used to find the state for each operator. If you haven't set UIDs and the job's topology has changed, the state restore will fail because Flink won't be able to match the state to the operators. See the docs on Assigning Operator IDs for details.
If you have dropped a stateful operator, then you should run the new job with --allowNonRestoredState specified.
If you have modified your data types, the job can fail when attempting to deserialize the state in the checkpoint or savepoint. Flink 1.7 did not have any support for automatic schema evolution or state migration. In more recent versions of Flink, if you stick to POJOs or Avro, this is handled automatically. Otherwise you need custom serializers.
If this doesn't help you figure out what's going wrong, please share the information from the logs showing the specific exception.
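As a minimal sketch of the UID point above (topic names, brokers, and the map step are placeholders, and the legacy FlinkKafkaConsumer/FlinkKafkaProducer classes are assumed):

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

import java.util.Properties;

public class UpgradeSafeJob {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.enableCheckpointing(60_000);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");
        props.setProperty("group.id", "my-group");

        DataStream<String> in = env
                .addSource(new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props))
                .uid("kafka-source");   // stable UID: the Kafka offsets are restored to this operator

        in.map(String::toUpperCase)
                .uid("enricher")        // every stateful operator gets its own stable UID
                .addSink(new FlinkKafkaProducer<>("broker:9092", "output-topic", new SimpleStringSchema()))
                .uid("kafka-sink");

        env.execute("upgrade-safe-job");
    }
}
```

If a later version drops one of these stateful operators, resume with the --allowNonRestoredState flag (e.g. flink run -s <savepoint> --allowNonRestoredState ...) so Flink tolerates the state it can no longer map.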

Querying Kafka Offsets from Flink Savepoint

Can I somehow find out which offsets the Kafka consumers of my Flink job were at the exact moment I took a savepoint? I assume I could hack up something using the State Processor API, peering into the internal state of the Kafka consumer. But I'd like to know whether I've missed a nicer way that doesn't rely on implementation details.
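For what it's worth, the "hack" the question mentions can be sketched roughly like this with the State Processor API (legacy DataSet-based variant, roughly Flink 1.9-1.14). The operator uid "kafka-source" and the savepoint path are assumptions, and the state name and element type mirror internals of FlinkKafkaConsumerBase, so this is version-dependent and unsupported:

```java
import org.apache.flink.api.common.typeinfo.TypeHint;
import org.apache.flink.api.common.typeinfo.TypeInformation;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.runtime.state.memory.MemoryStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.streaming.connectors.kafka.internals.KafkaTopicPartition;

public class ReadKafkaOffsetsFromSavepoint {
    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        ExistingSavepoint savepoint = Savepoint.load(
                env, "s3://my-bucket/savepoints/savepoint-abc123", new MemoryStateBackend());

        // The Kafka consumer stores its offsets as *union* list state under this internal name.
        DataSet<Tuple2<KafkaTopicPartition, Long>> offsets =
                savepoint.readUnionState(
                        "kafka-source",                    // uid of the Kafka source operator
                        "topic-partition-offset-states",   // internal state name used by the consumer
                        TypeInformation.of(new TypeHint<Tuple2<KafkaTopicPartition, Long>>() {}));

        offsets.print();
    }
}
```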

How does Flink make checkpoint asynchronously with RocksDB backend

I am using Flink with RocksDB. From the Flink documentation I understand that Flink makes checkpoints asynchronously when using the RocksDB backend. See the description in the docs:
It is possible to let an operator continue processing while it stores its state snapshot, effectively letting the state snapshots happen asynchronously in the background. To do that, the operator must be able to produce a state object that should be stored in a way such that further modifications to the operator state do not affect that state object. For example, copy-on-write data structures, such as are used in RocksDB, have this behavior.
From my understanding, when a checkpoint needs to be made, an operator performs these steps for RocksDB:
Flush the data in the memtables
Copy the DB folder into another temporary folder, which contains all the data in RocksDB
Upload the copied data to the remote filesystem (this step is asynchronous)
Is my understanding right, or could anyone explain the details?
Thanks a lot; I cannot find any documentation describing them.
Found a blog post that describes the process:
To do this, Flink triggers a flush in RocksDB, forcing all memtables into sstables on disk, and hard-linked in a local temporary directory. This process is synchronous to the processing pipeline, and Flink performs all further steps asynchronously and does not block processing.
See the link for more details: Incremental Checkpoint
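As a small configuration sketch (Flink 1.13+ class names assumed), this is how incremental checkpoints are switched on, so that only newly created sstables are uploaded in the asynchronous step; the paths are placeholders:

```java
import org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IncrementalRocksDbConfig {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.enableCheckpointing(60_000);

        // 'true' enables incremental checkpoints: the synchronous part flushes memtables and
        // hard-links sstables locally; the upload of new sstables happens asynchronously.
        env.setStateBackend(new EmbeddedRocksDBStateBackend(true));
        env.getCheckpointConfig().setCheckpointStorage("hdfs:///flink/checkpoints");

        // ... build and execute the job
    }
}
```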

Do I really need Flink checkpointing?

I have a Flink application that reads some events from Kafka, enriches the data from MySQL, buffers the data using a window function, and writes the data inside a window to HBase. I've currently enabled checkpointing, but it turns out that checkpointing is quite expensive: over time the checkpoints take longer and longer, and this hurts my job's latency (it falls behind the Kafka ingest rate). If I figure out a way to make my HBase writes idempotent, is there a strong reason for me to use checkpointing? I can just configure the internal Kafka consumer client to commit every so often, right?
If the only thing you are checkpointing is the Kafka consumer offset(s), then it would surprise me that the checkpointing time is significant enough to slow down your workflow. Or is state being saved elsewhere as well? If so, you could skip that (as long as, per your note, the HBase writes are idempotent).
Note that you can also adjust the checkpointing interval, and (if need be) use incremental checkpoints with RocksDB.
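For the "just commit every so often" route the question mentions: with checkpointing disabled, the legacy FlinkKafkaConsumer falls back to the Kafka client's own auto-commit, driven by the standard consumer properties. A sketch follows (topic, group, and broker names are placeholders); note that committing independently of processing gives weaker guarantees around a restart, which is why the idempotent HBase writes matter.

```java
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

import java.util.Properties;

public class AutoCommitOnly {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        // No env.enableCheckpointing(...): offsets are the only "state" we care about here.

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "broker:9092");
        props.setProperty("group.id", "hbase-writer");
        props.setProperty("enable.auto.commit", "true");       // Kafka client commits on its own
        props.setProperty("auto.commit.interval.ms", "5000");  // roughly every 5 s

        env.addSource(new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props))
           .print();   // stand-in for the enrich/window/HBase part of the pipeline

        env.execute("at-least-once-via-auto-commit");
    }
}
```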

Flink Kinesis Consumer not storing last successfully processed sequence nos

We are using Flink Kinesis Consumer to consume data from Kinesis stream into our Flink application.
KCL library uses a DynamoDB table to store last successfully processed Kinesis stream sequence nos. so that the next time application starts, it resumes from where it left off.
But it seems that the Flink Kinesis Consumer does not maintain any such sequence numbers in a persistent store. As a result, we need to rely upon the ShardIteratorType (TRIM_HORIZON, LATEST, etc.) to decide where to resume processing when the Flink application restarts.
A possible solution could be to rely on Flink's checkpointing mechanism, but that only works when the application resumes after a failure, not when it has been deliberately cancelled and needs to be restarted from the last successfully consumed sequence number.
Do we need to store these last successfully consumed sequence numbers ourselves?
Best practice with Flink is to use checkpoints and savepoints, as these create consistent snapshots that contain offsets into your message queues (in this case, Kinesis stream sequence numbers) together with all of the state throughout the rest of the job graph that resulted from having consumed the data up to those offsets. This makes it possible to recover or restart without any loss or duplication of data.
Flink's checkpoints are snapshots taken automatically by Flink itself for the purpose of recovery from failures, and are in a format optimized for rapid restoration. Savepoints use the same underlying snapshot mechanism, but are triggered manually, and their format is more concerned about operational flexibility than performance.
Savepoints are what you are looking for. In particular, cancel with savepoint and resume from savepoint are very useful.
Another option is to use retained checkpoints with ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION.
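A short sketch of that retained-checkpoints option (pre-1.15 method names assumed; newer releases rename them slightly). A retained checkpoint survives an explicit cancel and can later be used as the restore point, e.g. with flink run -s <checkpoint-path> ...:

```java
import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RetainedCheckpoints {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.enableCheckpointing(60_000);
        // Keep the latest checkpoint around even if the job is cancelled on purpose.
        env.getCheckpointConfig()
           .enableExternalizedCheckpoints(ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        // ... Kinesis source and the rest of the job
    }
}
```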
To add to David's response, I'd like to explain the reasoning behind not storing sequence numbers.
Any kind of offset committing into the source system would limit the checkpointing/savepointing feature to fault tolerance only. That is, only the latest checkpoint/savepoint could be used for recovery.
However, Flink actually supports jumping back to a previous checkpoint/savepoint. Consider an application upgrade: you take a savepoint beforehand, upgrade, and let the new version run for a couple of minutes, during which it creates a few checkpoints. Then you discover a critical bug. You would like to roll back to the savepoint that you took and discard all the checkpoints.
Now, if Flink committed the source offsets only to the source system, we would not be able to replay the data between now and the restored savepoint. So Flink needs to store the offsets in the savepoint itself, as David pointed out. At that point, additionally committing them to the source system does not yield any benefit and is confusing when restoring to a previous savepoint/checkpoint.
Do you see any benefit in storing the offsets additionally?