Flink Running out of Memory - apache-flink

I have some fairly simple stream code that aggregating data via time windows. The windows are on the large side (1 hour, with a 2 hour bound), and the values in the streams are metrics coming from hundreds of servers. I keep running out of memory, and so I added the RocksDBStateBackend. This caused the JVM to segfault. Next I tried the FsStateBackend. Both of these backends never wrote any data to disk, but simply created a directory with the JobID. I'm running this code in standalone mode, not deployed. Any thoughts as to why the State Backends aren't writing data, and why it runs out of memory even when provided with 8GB of heap?
final SingleOutputStreamOperator<Metric> metricStream =
objectStream.map(node -> new Metric(node.get("_ts").asLong(), node.get("_value").asDouble(), node.get("tags"))).name("metric stream");
final WindowedStream<Metric, String, TimeWindow> hourlyMetricStream = metricStream
.assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<Metric>(Time.hours(2)) { // set how long metrics can come late
#Override
public long extractTimestamp(final Metric metric) {
return metric.get_ts() * 1000; // needs to be in ms since Java epoch
}
})
.keyBy(metric -> metric.getMetricName()) // key the stream so we can run the windowing in parallel
.timeWindow(Time.hours(1)); // setup the time window for the bucket
// create a stream for each type of aggregation
hourlyMetricStream.sum("_value") // we want to sum by the _value
.addSink(new MetricStoreSinkFunction(parameters, "sum"))
.name("hourly sum stream")
.setParallelism(6);
hourlyMetricStream.aggregate(new MeanAggregator())
.addSink(new MetricStoreSinkFunction(parameters, "mean"))
.name("hourly mean stream")
.setParallelism(6);
hourlyMetricStream.aggregate(new ReMedianAggregator())
.addSink(new MetricStoreSinkFunction(parameters, "remedian"))
.name("hourly remedian stream")
.setParallelism(6);
env.execute("flink test");

It is tough to say why you would run out of memory unless you have a very large number of metric names (that is the only explanation I can come up with based on the code you posted).
With respect to the disk writing, RocksDB will actually use a temporary directory by default for its actual database files. You can also pass an explicit directory during configuration. You would do this by calling state.setDbStoragePath(someDirectory)
Somewhat confusingly, the FSStateBackend in fact only writes to disk during checkpointing, it otherwise is entirely heap based. So you likely did not see anything in the directory if you did not have checkpointing enabled. So that would explain why you might still run out of memory when the FSStateBackend is used.
Assuming you do have the RocksDB (or any) state backend working, you can enable checkpointing by doing:
env.enableCheckpointing(5000); // value is in MS, so however frequently you want to checkpoint
env.getCheckpointConfig.setMinPauseBetweenCheckpoints(5000); // this is to help prevent your job from making progress if checkpointing takes a bit. For large state checkpointing can take multiple seconds

Related

No Output Received When Flink Streaming Execution Environment Passed With Custom Configuration

I'm running Apache Flink version 1.12.7 and configured Streaming Execution Environment with number of task slots for task manager = 3 (just experimenting) but unable to see the output of a file read by the environment. Instead, as seen in the logs, the Execution Graph is stuck as SCHEDULED and does not get into RUNNING state.
Note that if no configuration is passed in the properties file, everything works good and output is seen as environment is able to read the file since Execution Graph get switched to RUNNING state.
The code is as follows :
ParameterTool parameters = ParameterTool.fromPropertiesFile("src/main/resources/application.properties");
Configuration config = Configuration.fromMap(parameters.toMap());
TaskExecutorResourceUtils.adjustForLocalExecution(config);
StreamExecutionEnvironment env =
StreamExecutionEnvironment.getExecutionEnvironment(config);
System.out.println("Config Params : " + config.toMap());
DataStream<String> inputStream =
env.readTextFile(FILEPATH);
DataStream<String> filteredData = inputStream.filter((String value) -> {
String[] tokens = value.split(",");
return Double.parseDouble(tokens[3]) >= 75.0;
});
filteredData.print(); // no o/p seen if configuration object is set otherwise everything works as expected
env.execute("Filter Country Details");
Need help in understanding this behaviour and what changes should be made in order that the output gets printed along with having some custom configuration. Thank you.
Okay..So found the answer to the above puzzle by referring to some links mentioned below.
Solution : So I set the parallelism (env.setParallelism) in the above code just after configuring the streaming execution environment and the file was read with output generated as expected.
Post that, experimented with a few things :
set parallelism equal to number of task slots = everything worked
set parallelism greater than number of task slots = intermittent results
set parallelism less than number of task slots = intermittent results.
As per this link corresponding to Flink Architecture,
A Flink cluster needs exactly as many task slots as the highest parallelism used in the job
So its best to go with no. of task slots for a task manager equal to the parallelism configured.

Flink and Kinesis stream app for non continous data

We've built a Flink app to process data from Kinesis stream. The execution flow of the app contains basic operations for filtering data based on registered types, assigning watermarks based on event timestamps, map, process and aggregate functions applied on windows of data of 5 mins as shown below:
final SingleOutputStreamOperator<Object> inputStream = env.addSource(consumer)
.setParallelism(..)
.filter(..)
.assignTimestampsAndWatermarks(..);
// Processing flow
inputStream
.map(..)
.keyBy(..)
.window(..)
.sideOutputLateData(outputTag)
.aggregate(aggregateFunction, processWindowFunction);
// store processed data to external storage
AsyncDataStream.unorderedWait(...);
Ref code for my watermark assigner:
#Override
public void onEvent(#NonNull final MetricSegment metricSegment,
final long eventTimestamp,
#NonNull final WatermarkOutput watermarkOutput) {
if (eventTimestamp > eventMaxTimestamp) {
currentMaxTimestamp = Instant.now().toEpochMilli();
}
eventMaxTimestamp = Math.max(eventMaxTimestamp, eventTimestamp);
}
#Override
public void onPeriodicEmit(#NonNull final WatermarkOutput watermarkOutput) {
final Instant maxEventTimestamp = Instant.ofEpochMilli(eventMaxTimestamp);
final Duration timeElaspsed = Duration.between(Instant.ofEpochMilli(lastCurrentTimestamp), Instant.now());
if (timeElaspsed.getSeconds() >= emitWatermarkIntervalSec) {
final long watermarkTimestamp = maxEventTimestamp.plus(1, ChronoUnit.MINUTES).toEpochMilli();
watermarkOutput.emitWatermark(new Watermark(watermarkTimestamp));
}
}
Now this app was working with good performance (in terms of latency in order of few seconds) sometime back. However, recently there was a change in the upstream system post which the data in Kinesis stream gets published to the stream in bursts (only for 2-3 hours every day). Post this change, we have seen a huge spike in latency of our app (measured using flink gauge method by recording start time in first filter method and then emitting the metric in Async method by calculating the diff in the timetamp at that point from the start timestmap). Wondering if there is any issue in using Flink apps with Kinesis stream for bursty traffic/non continuous stream of data?
Since the input stream is now idle for long periods of time, this is probably creating situations where the watermarks are held up. If this is the case, then I would expect to see a lot of variance in the latency, as it would (probably) only be the final windows for each burst whose results are delayed until the arrival of the next burst.

Flink CEP Event Not triggering

I have implement the CEP Pattern in Flink which is working as expected connecting to local Kafka broker. But when i connecting to cluster based cloud kafka setup, the Flink CEP is not triggering.
final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
//saves checkpoint
env.getCheckpointConfig().enableExternalizedCheckpoints(
CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
I am using AscendingTimestampExtractor,
consumer.assignTimestampsAndWatermarks(
new AscendingTimestampExtractor<ObjectNode>() {
#Override
public long extractAscendingTimestamp(ObjectNode objectNode) {
long timestamp;
Instant instant = Instant.parse(objectNode.get("value").get("timestamp").asText());
timestamp = instant.toEpochMilli();
return timestamp;
}
});
And also i am getting Warn Message that,
AscendingTimestampExtractor:140 - Timestamp monotony violated: 1594017872227 < 1594017873133
And Also i tried using AssignerWithPeriodicWatermarks and AssignerWithPunctuatedWatermarks none of one is working
I have attached Flink console screenshot where Watermark is not assigning.
Updated flink console screenshot
Could Anyone Help?
CEP must first sort the input stream(s), which it does based on the watermarking. So
the problem could be with watermarking, but you haven't shown us enough to debug the cause. One common issue is having an idle source, which can prevent the watermarks from advancing.
But there are other possible causes. To debug the situation, I suggest you look at some metrics, either in the Flink Web UI or in a metrics system if you have one connected. To begin, check if records are flowing, by looking at numRecordsIn, numRecordsOut, or numRecordsInPerSecond and numRecordsOutPerSecond at different stages of your pipeline.
If there are events, then look at currentOutputWatermark throughout the different tasks of your job to see if event time is advancing.
Update:
It appears you may be calling assignTimestampsAndWatermarks on the Kafka consumer, which will result in per-partition watermarking. In that case, if you have an idle partition, that partition won't produce any watermarks, and that will hold back the overall watermark. Try calling assignTimestampsAndWatermarks on the DataStream produced by the source instead, to see if that fixes things. (Of course, without per-partition watermarking, you won't be able to use an AscendingTimestampExtractor, since the stream won't be in order.)

Non blocking streaming on Flink

Hi, I'm trying to run a Flink job that it should process incoming data as below. In the process operator right after keyBy(), there should be a case that takes too much time according to some property in data. Even though incoming data have different ids (which is used to keyBy() the stream), long processing code in process function blocks other incoming data. I mean the entire stream.
SingleOutputStreamOperator<Envelope> processingStream = deviceStream
.map(e -> (Envelope) e)
.keyBy((KeySelector<Envelope, String>) value -> value.eventId) // key by scenarios
.process(new RuleProcessFunction());
In RuleProcessFunction.java:
...
#Override
public void processElement(Envelope value, Context ctx, Collector<Envelope> out) throws Exception {
//handleEvent(value, ctx, out);
if (value.getEventId().equals("I")) {
System.out.println("hello i");
for (long i = 0; i < 10000000000L; i++) {
}
}
out.collect(value);
}
I expect the long-running code block should not block the entire stream. I know there is AsyncFunction for blocking IO situations but I don't know that it's correct solution for this.
Since you aren't pulling data from an external database like Cassandra, I don't think you need to use an AsyncFunction.
What it could be that you are running the flink job with a single parallelism. Try increasing the parallelism so one core isn't responsible for all of the processing as well as receiving data. Granted, there can still be back pressure if you do this. Since if the core responsible for ingesting data from the source is reading in data faster than the core(s) that are running the processFunction Flink's back pressure handling will slow the rate of ingestion.

Neo4j store is not cleanly shut down; Recovering from inconsistent db state from interrupted batch insertion

I was importing ttl ontologies to dbpedia following the blog post http://michaelbloggs.blogspot.de/2013/05/importing-ttl-turtle-ontologies-in-neo4j.html. The post uses BatchInserters to speed up the task. It mentions
Batch insertion is not transactional. If something goes wrong and you don't shutDown() your database properly, the database becomes inconsistent.
I had to interrupt one of the batch insertion tasks as it was taking time much longer than expected which left my database in an inconsistence state. I get the following message:
db_name store is not cleanly shut down
How can I recover my database from this state? Also, for future purposes is there a way for committing after importing every file so that reverting back to the last state would be trivial. I thought of git, but I am not sure if it would help for a binary file like index.db.
There are some cases where you cannot recover from unclean shutdowns when using the batch inserter api, please note that its package name org.neo4j.unsafe.batchinsert contains the word unsafe for a reason. The intention for batch inserter is to operate as fast as possible.
If you want to guarantee a clean shutdown you should use a try finally:
BatchInserter batch = BatchInserters.inserter(<dir>);
try {
} finally {
batch.shutdown();
}
Another alternative for special cases is registering a JVM shutdown hook. See the following snippet as an example:
BatchInserter batch = BatchInserters.inserter(<dir>);
// do some operations potentially throwing exceptions
Runtime.getRuntime().addShutdownHook(new Thread() {
public void run() {
batch.shutdown();
}
});

Resources