Flink checkpoint not replaying the kafka events which were in process during the savepoint/checkpoint - apache-flink

I want to test end-to-end exactly once processing in flink. My job is:
Kafka-source -> mapper1 -> mapper-2 -> kafka-sink
I had put a Thread.sleep(100000) in mapper1 and then ran the job. I took the savepoint while stopping the job and then I removed the Thread.sleep(100000) form the mapper1, and I expect that the event should be replayed as it was not sinked. But that didnt happen and job is waiting for new event.
My Kafka source:
KafkaSource.<String>builder()
.setBootstrapServers(consumerConfig.getBrokers())
.setTopics(consumerConfig.getTopic())
.setGroupId(consumerConfig.getGroupId())
.setStartingOffsets(OffsetsInitializer.latest())
.setValueOnlyDeserializer(new SimpleStringSchema())
.setProperty("commit.offsets.on.checkpoint", "true")
.build();
My kafka sink:
KafkaSink.<String>builder()
.setBootstrapServers(producerConfig.getBootstrapServers())
.setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
.setRecordSerializer(KafkaRecordSerializationSchema.builder()
.setTopic(producerConfig.getTopic())
.setValueSerializationSchema(new SimpleStringSchema()).build())
.build();
My environmentSetup for flink job:
StreamExecutionEnvironment environment = StreamExecutionEnvironment.getExecutionEnvironment();
environment.enableCheckpointing(2000);
environment.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
environment.getCheckpointConfig().setMinPauseBetweenCheckpoints(100);
environment.getCheckpointConfig().setCheckpointTimeout(60000);
environment.getCheckpointConfig().setTolerableCheckpointFailureNumber(2);
environment.getCheckpointConfig().setExternalizedCheckpointCleanup(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
environment.getCheckpointConfig().setCheckpointTimeout(1000);
environment.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
environment.getCheckpointConfig().enableUnalignedCheckpoints();
environment.getCheckpointConfig().setCheckpointStorage("file:///tmp/flink-checkpoints");
Configuration configuration = new Configuration();
configuration.set(ExecutionCheckpointingOptions.ENABLE_CHECKPOINTS_AFTER_TASKS_FINISH, true);
environment.configure(configuration);
What am I doing wrong here?
I want that any event which is in process during the cancellation/stop of the job, should restart again.
EDIT 1:
I observed that my kafka was showing offset lag for my flink's kafka-source consumer group. I am assuming it means my checkpointing is behaving right, is that correct ?
I also observed when i restarted my job from checkpoint, it didnt start to consume from the remaining offsets, while I have the consumer offset set to EARLIEST. I had to send more events to trigger the consumption on kafka-source side and then it consumed all the events.

For exactly-once, you must provide a TransactionalIdPrefix unique across all applications running against the same Kafka cluster (this is a change compared to the legacy FlinkKafkaConsumer):
KafkaSink<T> sink =
KafkaSink.<T>builder()
.setBootstrapServers(...)
.setKafkaProducerConfig(...)
.setRecordSerializer(...)
.setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
.setTransactionalIdPrefix("unique-id-for-your-app")
.build();
When resuming from a checkpoint, Flink always uses the offsets stored in the checkpoint rather than those configured in the code or stored in the broker.

Related

Strange transactional id errors when using the kafka sink

I had a Flink 1.15.1 job configured with
execution.checkpointing.mode='EXACTLY_ONCE'
that was failing with the following error
Sink: Committer (2/2)#732 (36640a337c6ccdc733d176b18adab979) switched from INITIALIZING to FAILED with failure cause: java.lang.IllegalStateException: Failed to commit KafkaCommittable{producerId=4521984, epoch=0, transactionalId=}
...
Caused by: org.apache.kafka.common.config.ConfigException: Invalid value for configuration transactional.id: String must be non-empty
that happened after the first checkpoint was triggered. The strange thing about it is that the KafkaSinkBuilder was used without calling setDeliverGuarantee, and hence the default delivery guarantee was expected to be used, which is NONE 1.
Is that even possible to start with? Shouldn't kafka transactions be involved only when one follows this recipe in 2?
* <p>One can also configure different {#link DeliveryGuarantee} by using {#link
* #setDeliverGuarantee(DeliveryGuarantee)} but keep in mind when using {#link
* DeliveryGuarantee#EXACTLY_ONCE} one must set the transactionalIdPrefix {#link
* #setTransactionalIdPrefix(String)}.
So, in my case, without calling setDeliverGuarantee (nor setTransactionalIdPrefix), I cannot understand why I was seeing these errors. To avoid the problem, I temporarily relaxed the checkpointing settings to
execution.checkpointing.mode='AT_LEAST_ONCE'
but I'd like to understand what was happening.
Like the JavaDoc mentions, if you enable exactly-once, you must set a transactionalIdPrefix. A complete recipe on how-to configure exactly-once with Apache Kafka can be found in this recipe: https://www.docs.immerok.cloud/docs/cookbook/exactly-once-with-apache-kafka-and-apache-flink/
Disclaimer: I work for Immerok

Apache Flink with Kinesis Analytics : java.lang.IllegalArgumentException: The fraction of memory to allocate should not be 0

Background :
I have been trying to setup BATCH + STREAMING in the same flink application which is deployed on kinesis analytics runtime. The STREAMING part works fine, but I'm having trouble adding support for BATCH.
Flink : Handling Keyed Streams with data older than application watermark
Apache Flink : Batch Mode failing for Datastream API's with exception `IllegalStateException: Checkpointing is not allowed with sorted inputs.`
The logic is something like this :
The logic is something like this :
streamExecutionEnvironment.setRuntimeMode(RuntimeExecutionMode.BATCH);
streamExecutionEnvironment.fromSource(FileSource.forRecordStreamFormat(new TextLineFormat(), path).build(),
WatermarkStrategy.noWatermarks(),
"Text File")
.process(process function which transforms input)
.assignTimestampsAndWatermarks(WatermarkStrategy
.<DetectionEvent>forBoundedOutOfOrderness(orderness)
.withTimestampAssigner(
(SerializableTimestampAssigner<Event>) (event, l) -> event.getEventTime()))
.keyBy(keyFunction)
.window(TumblingEventWindows(Time.of(x days))
.process(processWindowFunction);
On doing this I'm getting the below exception :
java.lang.Exception: Exception while creating StreamOperatorStateContext.
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:254)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.initializeState(AbstractStreamOperator.java:272)
at org.apache.flink.streaming.runtime.tasks.OperatorChain.initializeStateAndOpenOperators(OperatorChain.java:441)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restoreGates(StreamTask.java:582)
at org.apache.flink.streaming.runtime.tasks.StreamTaskActionExecutor$1.call(StreamTaskActionExecutor.java:55)
at org.apache.flink.streaming.runtime.tasks.StreamTask.executeRestore(StreamTask.java:562)
at org.apache.flink.streaming.runtime.tasks.StreamTask.runWithCleanUpOnFail(StreamTask.java:647)
at org.apache.flink.streaming.runtime.tasks.StreamTask.restore(StreamTask.java:537)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:764)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:571)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: org.apache.flink.util.FlinkException: Could not restore keyed state backend for WindowOperator_90bea66de1c231edf33913ecd54406c1_(1/1) from any of the 1 provided restore options.
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:160)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.keyedStatedBackend(StreamTaskStateInitializerImpl.java:345)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.streamOperatorStateContext(StreamTaskStateInitializerImpl.java:163)
... 10 more
Caused by: java.io.IOException: Failed to acquire shared cache resource for RocksDB
at org.apache.flink.contrib.streaming.state.RocksDBOperationUtils.allocateSharedCachesIfConfigured(RocksDBOperationUtils.java:306)
at org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend.createKeyedStateBackend(EmbeddedRocksDBStateBackend.java:426)
at org.apache.flink.contrib.streaming.state.EmbeddedRocksDBStateBackend.createKeyedStateBackend(EmbeddedRocksDBStateBackend.java:90)
at org.apache.flink.streaming.api.operators.StreamTaskStateInitializerImpl.lambda$keyedStatedBackend$1(StreamTaskStateInitializerImpl.java:328)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.attemptCreateAndRestore(BackendRestorerProcedure.java:168)
at org.apache.flink.streaming.api.operators.BackendRestorerProcedure.createAndRestore(BackendRestorerProcedure.java:135)
... 12 more
Caused by: java.lang.IllegalArgumentException: The fraction of memory to allocate should not be 0. Please make sure that all types of managed memory consumers contained in the job are configured with a non-negative weight via `taskmanager.memory.managed.consumer-weights`.
at org.apache.flink.util.Preconditions.checkArgument(Preconditions.java:160)
at org.apache.flink.runtime.memory.MemoryManager.validateFraction(MemoryManager.java:672)
at org.apache.flink.runtime.memory.MemoryManager.computeMemorySize(MemoryManager.java:653)
at org.apache.flink.runtime.memory.MemoryManager.getSharedMemoryResourceForManagedMemory(MemoryManager.java:521)
at org.apache.flink.contrib.streaming.state.RocksDBOperationUtils.allocateSharedCachesIfConfigured(RocksDBOperationUtils.java:302)
... 17 more
Seems like kinesis-analytics does not allow clients to define a flink-conf.yaml file to define taskmanager.memory.managed.consumer-weights. Is there any way around this ?
It's not clear to me what the underlying cause of this exception is, nor how to make batch processing work on KDA.
You can try this (but I'm not sure KDA will allow it):
Configuration conf = new Configuration();
conf.setString("taskmanager.memory.managed.consumer-weights", "put-the-value-here");
StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment(conf);

"Could not complete snapshot" due to ConcurrentModificationException causes Flink pipeline to fail

After enabling checkpointing for our Flink pipeline, we regularly get the exception below, which causes the pipeline to fail.
The pipeline reads from Kafka, makes some stateless transformations (map) and then writes to HDFS via StreamingFileSink.
org.apache.flink.runtime.checkpoint.CheckpointException: Could not complete snapshot 1080 for operator foo -> bar -> Sink: Hadoop (1/2). Failure reason: Checkpoint was declined.
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:431)
at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.checkpointStreamOperator(StreamTask.java:1282)
at org.apache.flink.streaming.runtime.tasks.StreamTask$CheckpointingOperation.executeCheckpointing(StreamTask.java:1216)
at org.apache.flink.streaming.runtime.tasks.StreamTask.checkpointState(StreamTask.java:872)
at org.apache.flink.streaming.runtime.tasks.StreamTask.performCheckpoint(StreamTask.java:777)
at org.apache.flink.streaming.runtime.tasks.StreamTask.triggerCheckpointOnBarrier(StreamTask.java:708)
at org.apache.flink.streaming.runtime.io.CheckpointBarrierHandler.notifyCheckpoint(CheckpointBarrierHandler.java:88)
at org.apache.flink.streaming.runtime.io.CheckpointBarrierAligner.processBarrier(CheckpointBarrierAligner.java:113)
at org.apache.flink.streaming.runtime.io.CheckpointedInputGate.pollNext(CheckpointedInputGate.java:155)
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.pollNextNullable(StreamTaskNetworkInput.java:102)
at org.apache.flink.streaming.runtime.io.StreamTaskNetworkInput.pollNextNullable(StreamTaskNetworkInput.java:47)
at org.apache.flink.streaming.runtime.io.StreamOneInputProcessor.processInput(StreamOneInputProcessor.java:135)
at org.apache.flink.streaming.runtime.tasks.StreamTask.processInput(StreamTask.java:279)
at org.apache.flink.streaming.runtime.tasks.StreamTask.run(StreamTask.java:301)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:406)
at org.apache.flink.runtime.taskmanager.Task.doRun(Task.java:705)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:530)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.util.ConcurrentModificationException
at java.util.HashMap$HashIterator.nextNode(HashMap.java:1445)
at java.util.HashMap$EntryIterator.next(HashMap.java:1479)
at java.util.HashMap$EntryIterator.next(HashMap.java:1477)
at org.apache.flink.api.common.typeutils.base.MapSerializer.copy(MapSerializer.java:105)
at org.apache.flink.api.common.typeutils.base.MapSerializer.copy(MapSerializer.java:43)
at org.apache.flink.api.java.typeutils.runtime.PojoSerializer.copy(PojoSerializer.java:239)
at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.copy(StreamElementSerializer.java:105)
at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.copy(StreamElementSerializer.java:46)
at org.apache.flink.runtime.state.ArrayListSerializer.copy(ArrayListSerializer.java:73)
at org.apache.flink.runtime.state.PartitionableListState.<init>(PartitionableListState.java:68)
at org.apache.flink.runtime.state.PartitionableListState.deepCopy(PartitionableListState.java:80)
at org.apache.flink.runtime.state.DefaultOperatorStateBackendSnapshotStrategy.snapshot(DefaultOperatorStateBackendSnapshotStrategy.java:88)
at org.apache.flink.runtime.state.DefaultOperatorStateBackend.snapshot(DefaultOperatorStateBackend.java:261)
at org.apache.flink.streaming.api.operators.AbstractStreamOperator.snapshotState(AbstractStreamOperator.java:406)
... 17 more
Currently, there is just a single node, and checkpointing is configured to use the local filesystem:
state.backend: filesystem
state.checkpoints.dir: file://opt/flink/checkpoints
I am completely unsure how to deal with this error.
This is Flink 1.9.1.

Flink1.5.4 exception: Corrupt stream, found tag: 105

My program wants to join two streams without Flink Window.
I connect two streams and define a class A extends RichCoFlatMapFunction to handle them.
In class A, I use a Guava cache to hold all the data from flatmap1/2 method, and join them by a tag from streams.
Then Guava cache has a remove listener to collect joined&expired data to next Flink Function.
private synchronized void collect(ReqFeatures features) {
feaCollector.collect(features);
}
Each time at the beginning, it runs well, but a few hours later, it's always dead because of this exception.
java.io.IOException: Corrupt stream, found tag: 105
at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:220)
at org.apache.flink.streaming.runtime.streamrecord.StreamElementSerializer.deserialize(StreamElementSerializer.java:49)
at org.apache.flink.runtime.plugable.NonReusingDeserializationDelegate.read(NonReusingDeserializationDelegate.java:55)
at org.apache.flink.runtime.io.network.api.serialization.SpillingAdaptiveSpanningRecordDeserializer.getNextRecord(SpillingAdaptiveSpanningRecordDeserializer.java:106)
at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:172)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:104)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
at java.lang.Thread.run(Thread.java:748)
And sometimes there's another error log:
java.lang.IllegalStateException: When there are multiple buffers, an unfinished bufferConsumer can not be at the head of the buffers queue.
at org.apache.flink.util.Preconditions.checkState(Preconditions.java:195)
at org.apache.flink.runtime.io.network.partition.PipelinedSubpartition.pollBuffer(PipelinedSubpartition.java:158)
at org.apache.flink.runtime.io.network.partition.PipelinedSubpartitionView.getNextBuffer(PipelinedSubpartitionView.java:51)
at org.apache.flink.runtime.io.network.partition.consumer.LocalInputChannel.getNextBuffer(LocalInputChannel.java:186)
at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:551)
at org.apache.flink.runtime.io.network.partition.consumer.SingleInputGate.getNextBufferOrEvent(SingleInputGate.java:508)
at org.apache.flink.streaming.runtime.io.BarrierTracker.getNextNonBlocked(BarrierTracker.java:94)
at org.apache.flink.streaming.runtime.io.StreamInputProcessor.processInput(StreamInputProcessor.java:209)
at org.apache.flink.streaming.runtime.tasks.OneInputStreamTask.run(OneInputStreamTask.java:104)
at org.apache.flink.streaming.runtime.tasks.StreamTask.invoke(StreamTask.java:306)
at org.apache.flink.runtime.taskmanager.Task.run(Task.java:712)
at java.lang.Thread.run(Thread.java:748)
If I use Flink Window Function instead, this exception doesn't occur.
Why does this exception occur, and how can I resolve it?
I can confirm this also happens in Flink 1.9.1 (albeit for us, it happens when we run flink stop <job-id>)
I fixed the same problem with getting checkpointing lock while collecting output. The users flatMap function already hold the checkpointing lock, so if u collect output in flatMap function could also fix this problem.
in flink's code:
synchronized (checkpointingLock) {
numRecordsIn.inc();
streamOperator.setKeyContextElement1(record);
streamOperator.processElement(record);
}

Camel ActiveMQ client blocking, temp storage usage immediately hits 100%

I'm seeing 100% utilisation of activemq's temp storage (configured to be 100mb), and the activemq client is blocking. This 100% usage remains permanently, and I have no idea what's going on
I have a camel route, which consumes from a queue (QUEUE.IN) using the JmsTransactionManager.
public final class RouteUnderTest extends RouteBuilder {
#Override
public void configure() throws Exception {
from("activemq-transacted:QUEUE.IN")
.bean(myBean)
.to("activemq:QUEUE.OUT");
}
}
While processing the message from this queue I'm invoking a spring-integration client (myBean) which is configured as follows
<int:gateway id="myBean" service-interface="MyBean">
<int:method name="request" request-channel="channel"/>
</int:gateway>
<int:chain input-channel="channel">
<int:transformer ref="transformedToJsonHere"/>
<jms:outbound-gateway request-destination-name="QUEUE.MYBEAN"
receive-timeout="5000"
explicit-qos-enabled="true"
time-to-live="5000"
delivery-persistent="false"/>
<int:transformer ref="transformedToAnObjectHere"/>
</int:chain>
My broker is configured to use LevelDB, and with the following usage limits:
<persistenceAdapter>
<levelDB directory="${activemq.data}/leveldb"/>
</persistenceAdapter>
<systemUsage>
<systemUsage>
<memoryUsage>
<memoryUsage percentOfJvmHeap="70"/>
</memoryUsage>
<storeUsage>
<storeUsage limit="500 mb"/>
</storeUsage>
<tempUsage>
<tempUsage limit="100 mb"/>
</tempUsage>
</systemUsage>
</systemUsage>
When my route consumes the message and then attempts to put a non-persistent message on QUEUE.OUT the client is blocked and my broker shows 100% usage of temp storage.
And I see the following activemq logs
2015-07-28 15:44:59,678 | INFO | Usage(default:temp:queue://QUEUE.MYBEAN:temp) percentUsage=0%, usage=104857600, limit=104857600, percentUsageMinDelta=1%;Parent:Usage(default:temp) percentUsage=100%, usage=104857600, limit=104857600, percentUsageMinDelta=1%: Temp Store is Full (0% of 104857600). Stopping producer (ID:orbit-vm-55561-1438094698190-1:1:3:1) to prevent flooding queue://QUEUE.MYBEAN. See http://activemq.apache.org/producer-flow-control.html for more info (blocking for: 1s) | org.apache.activemq.broker.region.Queue | ActiveMQ NIO Worker 6
The queues look like (You can see that the QUEUE.IN message has been not been dequeued because it's still being processed transactionally, and no message has gone to QUEUE.MYBEAN)
I can fix this problem with any one of the following approaches:
Use KahaDB instead of LevelDB
Increase temp storage limit (150MB seems to do it but I haven't experimented a great deal)
Configure tempDataStore in activemq.xml (see below)
When configuring the tempDataStore it looks like:
<tempDataStore>
<bean xmlns="http://www.springframework.org/schema/beans" class="org.apache.activemq.leveldb.LevelDBStore">
<property name="directory" value="${activemq.data}/tmp" />
</bean>
</tempDataStore>
I should add, we were using KahaDB previously and this worked fine, but the upgrade to LevelDB has exposed this issue. Reverting to KahaDB is not an option.
I'm hoping someone could explain what we're seeing here, as the results are really difficult to understand. Why does using LevelDB necessitate a higher temp usage limit?, and why does configuring the tempDataStore explicitly also fix the problem?
I don't fully understand what's going on here so I'm worried that simply increasing the temp usage limit a little will just hide the problem until a later date.
Versions:
ActiveMQ: 5.11.1
Camel: 2.14.0
Spring: 4.0.8.RELEASE
Spring Integration: 4.0.5.RELEASE
We ran into exactly the same issue with ActiveMQ 5.13.2
The solution when using LevelDB is to explicitly configure a dedicated tempDataStore as you did.
If not, the broker uses the same store (LevelDB) for both persistent (persistent usage) and non-persistent messages (temp usage). You may therefore end-up in situations where the broker doesn't accept any non-persistent message anymore just because the store already holds persistent ones up to the configured tempUsage limit. It will however accept persistent ones if your storeUsage limit is set higher...
When using KahaDB, the broker automatically uses another store for the non-persistent messages (created in the tmp directory). So you don't have the problem...
Look at the following code for more indepth information: https://github.com/apache/activemq/blob/activemq-5.13.2/activemq-broker/src/main/java/org/apache/activemq/broker/BrokerService.java#L1739
When reading that code, remember LevelDBStore implements PListStore, but KahaDBStore doesn't...

Resources