We are running a Beam application on a Flink cluster with a side input of about 50 MB.
The side input is refreshed (pulled from an external data source) whenever a notification arrives on the notification topic in Kafka.
As the application runs, the side input causes frequent full GCs, each taking about 30 seconds, which stops the task manager from sending heartbeats to the master.
After consecutive missed heartbeats, the master assumes the worker is dead and starts reassigning the jobs, which restarts the application.
When we remove the side input, the application works fine.
Questions:
Is there any limit on the size of a side input in Apache Beam?
I created the side input map using asSingleton(). Does it create a separate copy for each task? I have set the parallelism to 15; will it create 15 copies in one JVM (assuming all tasks are assigned to the same worker)?
What are the alternatives to side inputs?
This is a sample pipeline:
public class BeamApplication {
public static final CloseableHttpClient httpClient = HttpClients.createDefault();
public static void main(String[] args) {
PipelineOptions options = PipelineOptionsFactory.create();
options.as(FlinkPipelineOptions.class).setRunner(FlinkRunner.class);
Pipeline pipeline = Pipeline.create(options);
PCollection<Map<String, Double>> sideInput = pipeline
.apply(KafkaIO.<String, String>read().withBootstrapServers("localhost:9092")
.withKeyDeserializer(StringDeserializer.class).withValueDeserializer(StringDeserializer.class)
.withTopic("testing"))
.apply(ParDo.of(new DoFn<KafkaRecord<String, String>, Map<String, Double>>() {
@ProcessElement
public void processElement(ProcessContext processContext) {
KafkaRecord<String, String> record = processContext.element();
String message = record.getKV().getValue().split("##")[0];
String change = record.getKV().getValue().split("##")[1];
if (message.equals("START_REST")) {
Map<String, Double> map = new HashMap<>();
Map<String,Double> changeMap = new HashMap<>();
HttpGet request = new HttpGet("http://localhost:8080/config-service/currency");
try (CloseableHttpResponse response = httpClient.execute(request)) {
HttpEntity entity = response.getEntity();
String responseString = EntityUtils.toString(entity, "UTF-8");
ObjectMapper objectMapper = new ObjectMapper();
CurrencyDTO jsonObject = objectMapper.readValue(responseString, CurrencyDTO.class);
map.putAll(jsonObject.getQuotes());
System.out.println(change);
Random rand = new Random();
Double db = rand.nextDouble();
System.out.println(db);
changeMap.put(change,db);
entity.getContent();
} catch (Exception e) {
e.printStackTrace();
}
processContext.output(changeMap);
}
}
}));
PCollection<Map<String, Double>> currency = sideInput
.apply(Window.<Map<String, Double>>into(new GlobalWindows())
.triggering(Repeatedly.forever(AfterPane.elementCountAtLeast(1)))
.withAllowedLateness(Duration.ZERO).discardingFiredPanes());
PCollectionView<Map<String, Double>> sideInputView = currency.apply(View.asSingleton());
PCollection<KafkaRecord<Long, String>> kafkaEvents = pipeline
.apply(KafkaIO.<Long, String>read().withBootstrapServers("localhost:9092")
.withKeyDeserializer(LongDeserializer.class).withValueDeserializer(StringDeserializer.class)
.withTopic("event_testing"));
PCollection<String> output = kafkaEvents
.apply("Extract lines", ParDo.of(new DoFn<KafkaRecord<Long, String>, String>() {
@ProcessElement
public void processElement(ProcessContext processContext) {
String element = processContext.element().getKV().getValue();
Map<String, Double> map = processContext.sideInput(sideInputView);
System.out.println("This is it : " + map.entrySet());
}
}).withSideInputs(sideInputView));
pipeline.run().waitUntilFinish();
}
}
What state-backend are you using?
If I'm not mistaken, side inputs are implemented as state in Flink. If you're using MemoryStateBackend as the state backend, you might indeed run into memory pressure.
Also, the processing of events blocks until the side input is ready, and events are buffered in the meantime. If preparing the side input takes a long time or the rate of incoming events is high, you might run into memory pressure.
Can you try an alternative state backend? Preferably RocksDBStateBackend, which holds in-flight data in a RocksDB database instead of in memory.
It's difficult to guess what's the issue. I would recommend monitoring memory related metrics - see a good post on that here.
You could also run profiling on the Task Managers and analyse the dumps - see here
Is the memory increasing also if you only publish the first message to "testing" topic?
Maybe to isolate the problem I would use a simpler side-input. Remove the HTTP call and make the data static. Maybe a periodic triggered one instead of Kafka:
GenerateSequence.from(0).withRate(1, Duration.standardSeconds(5L))
I have an Apache flink usecase that works as follows:
I have data events coming in through first stream. Part of each event is a foreign key for which I expect data from the second stream. E.g.: I am getting data for major cities in the first stream which has a city-code and I need the average temperature over time for this city code streamed through the second stream. It is not possible to have temperatures streamed for all possible cities, we have to request the city for which we need the data.
So we need some way to "notify" the second stream source that we need data for this city "pushed" when we encounter it the first time in the first stream.
This would have been easy if this notification could be done from the first stream. The problem is that the second stream is coming to us through a websocket part of which is a control channel through which we have to make the request - so the request HAS to be made from the second stream.
1. Check the event in the first stream. Read city code x.
2. Have we seen this city code? If not, notify the second stream that we need data for city code x.
3. The second stream sends a message to the source requesting data for x.
4. Data starts flowing in for city x, which is used to join downstream.
If notification from the first stream were possible, this would be easy: I could do it from step 2, so that data starts flowing in the second stream. But that is not possible, as the request needs to be sent on the same websocket connection that feeds the second stream.
I have explored using CoProcessFunction or RichCoMapFunction for this - but it is not clear how this can be done. I have seen some examples of Broadcast State Pattern - but even that does not seem to fit the usecase.
Can someone help me with some pointers on possible solutions?
So I made it work using the suggestion of the side output stream. Thanks @whatisinthename and @kkrugler for the suggestions.
Still trying to figure out details, but here's a summary
From the notification stream (stream 1), create a side output stream (stream 1-1).
Use an extended class (TempRequester) of KeyedProcessFunction, to process the side output stream 1-1 and create Stream 2 from it. The KeyedProcessFunction has the websocket connection.
In the open method of the KeyedProcessFunction create the connection to websocket (handshaking etc.). Have a ListState state to keep the list of city codes.
In the processElement function of TempRequester, check the city code coming in from side output stream 1-1. If present in ListState, do nothing. Else, send a message through websocket control channel and request city data and add the code to ListState. Create a process timer (this is one time) to fire after 500 milliseconds or so. The websocket server writes the temp data very frequently and that is saved in a queue.
In the onTimer method, check the queue, read the data and push out (out.collect...). Create a timer again. So essentially, once the first city code gets in, we create a timer that runs every 500 milliseconds and dumps the records received out into the second stream.
Now the first and second streams can be joined downstream (I used the table API).
Not sure if this is the most elegant solution, but it worked. Thanks for the suggestions.
Here's the approximate main code:
DataStream<Event> notificationStream =
env.addSource(this.notificationSource)
.returns(TypeInformation.of(Event.class));
notificationStream.assignTimestampsAndWatermarks(WatermarkStrategy.forMonotonousTimestamps());
final OutputTag<String> outputTag = new OutputTag<String>("cities-seen"){};
SingleOutputStreamOperator<Event> mainDataStream = notificationStream.process(new ProcessFunction<Event, Event>() {
@Override
public void processElement(
Event value,
Context ctx,
Collector<Event> out) throws Exception {
// emit data to regular output
out.collect(value);
// emit data to side output
ctx.output(outputTag, value.cityCode);
}
});
DataStream<String> sideOutputStream = mainDataStream.getSideOutput(outputTag);
DataStream<TemperatureData> temperatureStream = sideOutputStream
.keyBy(value -> value)
.process(new TempRequester());
temperatureStream.assignTimestampsAndWatermarks(WatermarkStrategy.forMonotonousTimestamps());
// set up the Java Table API and rest of SQL based joins ...
And the approximate code for TempRequester (a KeyedProcessFunction):
public static class TempRequester extends KeyedProcessFunction<String, String, TemperatureData> {
private ListState<String> allCities;
private volatile boolean running = true;
//This is the queue for requesting city codes
private BlockingQueue<String> messagesToSend = new ArrayBlockingQueue<>(100);
//This is the queue for receiving temperature data
private ConcurrentLinkedQueue<TemperatureData> messages = new ConcurrentLinkedQueue<TemperatureData>();
private static final int TIMEOUT = 500;
@Override
public void open(Configuration parameters) throws Exception {
super.open(parameters);
allCities = getRuntimeContext().getListState(new ListStateDescriptor<>("List of cities seen", String.class));
... rest of websocket client setup code ...
}
@Override
public void close() throws Exception {
running = false;
super.close();
}
private boolean initialized = false;
@Override
public void processElement(String cityCode, Context ctx, Collector<TemperatureData> collector) throws Exception {
boolean citycodeFound = StreamSupport.stream(allCities.get().spliterator(), false)
.anyMatch(s -> cityCode.equals(s));
if (!citycodeFound) {
allCities.add(cityCode);
messagesToSend.put(.. add city code ..);
if (!initialized) {
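// ctx.timestamp() is the element's timestamp and may be null; a processing-time timer should be based on the current processing time
ctx.timerService().registerProcessingTimeTimer(ctx.timerService().currentProcessingTime() + TIMEOUT);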
initialized = true;
}
}
}
@Override
public void onTimer(long timestamp, OnTimerContext ctx, Collector<TemperatureData> out) throws Exception {
TemperatureData p;
while ((p = messages.poll()) != null) {
out.collect(p);
}
ctx.timerService().registerProcessingTimeTimer(ctx.timestamp() + TIMEOUT);
}
}
I am new to Flink, so apologies if my understanding is wrong. I am building a dataflow application whose flow contains multiple data streams that check whether the required fields are present in the incoming DataStream. The application validates the incoming data and, if validation succeeds, appends the data to the given file if it already exists. I want an exception in one DataStream not to affect the other data streams, so to simulate that I am explicitly throwing an exception in one of the flows. In the example below, for simplicity, I append the data to a text file on Windows.
Note: my flow has no state, since I have nothing to store in state.
public class ExceptionTest {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// start a checkpoint every 1000 ms
env.enableCheckpointing(1000);
// env.setParallelism(1);
//env.setStateBackend(new RocksDBStateBackend("file:///C://flinkCheckpoint", true));
// to set minimum progress time to happen between checkpoints
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(500);
// checkpoints have to complete within 5000 ms, or are discarded
env.getCheckpointConfig().setCheckpointTimeout(5000);
// set mode to exactly-once (this is the default)
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
// allow only one checkpoint to be in progress at the same time
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
// enable externalized checkpoints which are retained after job cancellation
env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION); // DELETE_ON_CANCELLATION
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
3, // number of restart attempts
Time.of(10, TimeUnit.SECONDS) // delay
));
DataStream<String> input1 = env.fromElements("hello");
DataStream<String> input2 = env.fromElements("hello");
DataStream<String> output1 = input1.flatMap(new FlatMapFunction<String, String>() {
@Override
public void flatMap(String value, Collector<String> out) throws Exception {
//out.collect(value.concat(" world"));
throw new Exception("=====================NO VALUE TO CHECK=================");
}
});
DataStream<String> output2 = input2.flatMap(new FlatMapFunction<String, String>() {
@Override
public void flatMap(String value, Collector<String> out) throws Exception {
out.collect(value.concat(" world"));
}
});
output2.addSink(new SinkFunction<String>() {
@Override
public void invoke(String value) throws Exception {
try {
File myObj = new File("C://flinkOutput//filename.txt");
if (myObj.createNewFile()) {
System.out.println("File created: " + myObj.getName());
BufferedWriter out = new BufferedWriter(
new FileWriter("C://flinkOutput//filename.txt", true));
out.write(value);
out.close();
System.out.println("Successfully wrote to the file.");
} else {
System.out.println("File already exists.");
BufferedWriter out = new BufferedWriter(
new FileWriter("C://flinkOutput//filename.txt", true));
out.write(value);
out.close();
System.out.println("Successfully wrote to the file.");
}
} catch (IOException e) {
System.out.println("An error occurred.");
e.printStackTrace();
}
}
});
env.execute();
}
}
I have a few doubts, as below.
When I throw the exception in the output1 stream, the second flow, output2, keeps running even after the exception is encountered and writes data to the file on my local machine. But when I check the file, the output is as below:
hello world
hello world
hello world
hello world
As per my understanding of the Flink documentation, if I use the checkpointing mode EXACTLY_ONCE, the data should not be written to the file more than once, since the process has already completed and written the data. But that is not what happens in my case, and I cannot tell whether I am doing something wrong.
Please help me clear up my doubts about checkpointing and how I can achieve EXACTLY_ONCE. I read about TWO_PHASE_COMMIT in Flink, but I did not find any example of how to implement it.
As suggested by @Mikalai Lushchytski, I implemented StreamingFileSink as below.
With StreamingFileSink
public class ExceptionTest {
public static void main(String[] args) throws Exception {
StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
// start a checkpoint every 1000 ms
env.enableCheckpointing(1000);
// env.setParallelism(1);
//env.setStateBackend(new RocksDBStateBackend("file:///C://flinkCheckpoint", true));
// to set minimum progress time to happen between checkpoints
env.getCheckpointConfig().setMinPauseBetweenCheckpoints(500);
// checkpoints have to complete within 5000 ms, or are discarded
env.getCheckpointConfig().setCheckpointTimeout(5000);
// set mode to exactly-once (this is the default)
env.getCheckpointConfig().setCheckpointingMode(CheckpointingMode.EXACTLY_ONCE);
// allow only one checkpoint to be in progress at the same time
env.getCheckpointConfig().setMaxConcurrentCheckpoints(1);
// enable externalized checkpoints which are retained after job cancellation
env.getCheckpointConfig().enableExternalizedCheckpoints(CheckpointConfig.ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION); // DELETE_ON_CANCELLATION
env.setRestartStrategy(RestartStrategies.fixedDelayRestart(
3, // number of restart attempts
Time.of(10, TimeUnit.SECONDS) // delay
));
DataStream<String> input1 = env.fromElements("hello");
DataStream<String> input2 = env.fromElements("hello");
DataStream<String> output1 = input1.flatMap(new FlatMapFunction<String, String>() {
@Override
public void flatMap(String value, Collector<String> out) throws Exception {
//out.collect(value.concat(" world"));
throw new Exception("=====================NO VALUE TO CHECK=================");
}
});
DataStream<String> output2 = input2.flatMap(new FlatMapFunction<String, String>() {
@Override
public void flatMap(String value, Collector<String> out) throws Exception {
out.collect(value.concat(" world"));
}
});
String outputPath = "C://flinkCheckpoint";
final StreamingFileSink<String> sink = StreamingFileSink
.forRowFormat(new Path(outputPath), new SimpleStringEncoder<String>("UTF-8"))
.withRollingPolicy(
DefaultRollingPolicy.builder()
.withRolloverInterval(TimeUnit.MINUTES.toMillis(15))
.withInactivityInterval(TimeUnit.MINUTES.toMillis(5))
.withMaxPartSize(1)
.build())
.build();
output2.addSink(sink);
env.execute();
}
}
But when I check the checkpoint folder, I can see that it created four part files in the in-progress state.
Am I doing something that causes it to create multiple part files?
In order to guarantee end-to-end exactly-once record delivery (in addition to exactly-once state semantics), the data sink needs to take part in the checkpointing mechanism (as well as the data source).
If you are going to write the data to a file, then you can use a StreamingFileSink, which emits its input elements to FileSystem files within buckets. It is integrated with the checkpointing mechanism to provide exactly-once semantics out of the box.
If you are going to implement your own sink, then the sink function must implement the CheckpointedFunction interface and properly implement the snapshotState(FunctionSnapshotContext context) method, which is called when a snapshot for a checkpoint is requested and should flush the current application state. In addition, I would recommend implementing the CheckpointListener interface to be notified once a distributed checkpoint has been completed.
Flink already provides an abstract TwoPhaseCommitSinkFunction, which is the recommended base class for every SinkFunction that intends to implement exactly-once semantics. It does that by implementing a two-phase commit algorithm on top of CheckpointedFunction and CheckpointListener. As an example, you can have a look at the FlinkKafkaProducer.java source code.
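The two-phase idea can be illustrated without Flink. The following is only a minimal sketch in plain Java, not Flink's TwoPhaseCommitSinkFunction: the class TwoPhaseFileWriter and its method names are hypothetical. Records are buffered, pre-commit persists them to a temporary file, and commit renames that file to its final name, so a failure before commit leaves no visible output. In a Flink sink, pre-commit would roughly correspond to taking the checkpoint snapshot and commit to the checkpoint-completed notification.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of a two-phase commit to a file; not a Flink class.
final class TwoPhaseFileWriter {
    private final Path target;   // file that readers see after commit
    private final Path pending;  // temporary file, not yet published
    private final List<String> buffer = new ArrayList<>();

    TwoPhaseFileWriter(Path target) {
        this.target = target;
        this.pending = target.resolveSibling(target.getFileName() + ".pending");
    }

    // Collect a record in the open transaction; nothing is written yet.
    void write(String record) {
        buffer.add(record);
    }

    // Phase 1: persist the buffered records to the temporary file.
    void preCommit() {
        try {
            Files.write(pending, buffer, StandardCharsets.UTF_8);
            buffer.clear();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Phase 2: publish the temporary file under its final name.
    void commit() {
        try {
            Files.move(pending, target, StandardCopyOption.ATOMIC_MOVE);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Roll back: drop buffered records and any pre-committed temporary file.
    void abort() {
        try {
            buffer.clear();
            Files.deleteIfExists(pending);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would invoke write(...) per record, preCommit() when a checkpoint is taken, and commit() only once that checkpoint is known to be complete; abort() covers the failure path.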
When I use windowAll, there is a processing bottleneck because it runs with a parallelism of one. I therefore switched to timeWindow with processing time, but ran into a new problem: data is not being ingested. From the console log, only a little more than ten records are processed per second, whereas with windowAll tens of thousands of records are processed per second. I don't know why.
When I added a watermark to the time window, I found that it can handle a large amount of data per second, but upstream data still accumulates.
SingleOutputStreamOperator<DataSetPOJO> dataSetPOJOSingleOutputStreamOperator = sdkInfos.flatMap(...);
dataSetPOJOSingleOutputStreamOperator.keyBy(new KeySelector<DataSetPOJO, String>() {
@Override
public String getKey(DataSetPOJO dataSet) {
return dataSet.getPartitionKey();
}
}).timeWindow(Time.seconds(3))
.process(new ProcessWindowFunction<DataSetPOJO, List<DataSetPOJO>, String, TimeWindow>() {
@Override
public void process(String key, Context context, Iterable<DataSetPOJO> elements,
Collector<List<DataSetPOJO>> out) throws Exception {
ArrayList<DataSetPOJO> dataSetPOJO = Lists.newArrayList(elements);
if (dataSetPOJO.size() > 0) {
// log.info("key~~~~~~~~~~~~~~:" + key);
// log.info("dataSetPOJO.size():" + dataSetPOJO.size());
out.collect(dataSetPOJO);
}
}
}).addSink(new Sink2Postgre());
I want to accumulate large enough batches in the window before writing to PostgreSQL. If this approach is incorrect, how should I write it? If it is fine, what could the problem be? Flink version 1.5.3.
We are in the middle of testing Flink's ability to scale, but we found that scaling does not work, no matter whether we add more slots or more Task Managers. We expected linear, or at least close-to-linear, scaling, but the results even show degradation. Any comments are appreciated.
Test details:
- VMware vSphere
- Just a simple pass-through test:
- auto-generated source of 3 million records, each 1 KB in size, parallelism = 1
- the source feeds a map operator that just returns the same record and sends a counter to StatsD; its parallelism is 2, 4, or 6 depending on the case
- 3 TMs, 6 slots in total (2 per TM); each JM/TM has 32 vCPUs and 100 GB of memory
Result:
2 slots: 26 seconds, 3mil/26=115k TPS
4 slots: 23 seconds, 3mil/23=130k TPS
6 slots: 22 seconds, 3mil/22=136k TPS
As shown, there is almost no scaling. Any clue? Thanks.
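To put a number on "almost no scaling", the measurements above can be turned into speedup and scaling efficiency relative to the 2-slot run. This is just a small arithmetic sketch; the class ScalingMath and its methods are hypothetical helpers, not part of the test job.

```java
// Hypothetical helper for reading the benchmark numbers; not part of the test job.
final class ScalingMath {
    // Records per second for a run.
    static double throughput(long records, double seconds) {
        return records / seconds;
    }

    // Speedup of a run relative to the baseline run.
    static double speedup(double baselineSeconds, double seconds) {
        return baselineSeconds / seconds;
    }

    // Fraction of the ideal (linear) speedup that was actually achieved.
    static double efficiency(double baselineSeconds, int baselineSlots,
                             double seconds, int slots) {
        double ideal = (double) slots / baselineSlots;
        return speedup(baselineSeconds, seconds) / ideal;
    }

    public static void main(String[] args) {
        int[] slots = {2, 4, 6};
        double[] seconds = {26, 23, 22};
        for (int i = 0; i < slots.length; i++) {
            System.out.printf("%d slots: %.0f rec/s, speedup %.2f, efficiency %.2f%n",
                    slots[i],
                    throughput(3_000_000L, seconds[i]),
                    speedup(seconds[0], seconds[i]),
                    efficiency(seconds[0], slots[0], seconds[i], slots[i]));
        }
    }
}
```

With 6 slots the speedup over the 2-slot run is about 1.18 where linear scaling would give 3, an efficiency of roughly 0.39.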
You really should be using a RichParallelSourceFunction. If you care about making the records from different instances of the source distinct, you can get ahold of each instance's index from the RuntimeContext, which is available via the getRuntimeContext() method in the RichFunction interface.
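One way to keep the records from parallel source instances distinct is to let each instance stride over the index space, starting at its own subtask index. The sketch below is plain Java with no Flink dependency, and the class and method names are hypothetical; in a RichParallelSourceFunction the first two arguments would come from getRuntimeContext().getIndexOfThisSubtask() and getRuntimeContext().getNumberOfParallelSubtasks().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; shows which record indices one parallel source instance would emit.
final class SubtaskStride {
    // Indices handled by instance `subtaskIndex` out of `parallelism` instances,
    // for a total of `total` records shared across all instances.
    static List<Integer> indicesFor(int subtaskIndex, int parallelism, int total) {
        List<Integer> indices = new ArrayList<>();
        for (int i = subtaskIndex; i < total; i += parallelism) {
            indices.add(i);
        }
        return indices;
    }
}
```

Each index is produced by exactly one instance, so the instances together cover the whole range without duplicates and the source no longer has to run with a parallelism of 1.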
Also, Flink has a built-in statsd metrics reporter that you should be using instead of rolling your own. Moreover, numRecordsIn, numRecordsOut, numRecordsInPerSecond, and numRecordsOutPerSecond are already being computed for you, so no need to create this instrumentation yourself. You can also access these metrics via Flink's web interface, or the REST API.
As for why you might be experiencing poor scalability with the Kafka consumer, there are many things that could cause this. If you are using event time processing, then idle partitions could be holding things up (see https://issues.apache.org/jira/browse/FLINK-5479). If the stream is keyed, then data skew could be an issue. If you are connecting to an external database or service, then it could easily be a bottleneck. If checkpointing is misconfigured it could cause this. Or insufficient network capacity.
I would start to debug this by looking at some key metrics in the Flink web UI. Is the load well balanced across the sub-tasks, or is it skewed? You could turn on latency tracking and see if one of the kafka partitions is misbehaving (by inspecting the latency at the sink(s), which will be reported on a per-partition basis). And you could look for back pressure.
Please refer to the sample code:
public class passthru extends RichMapFunction<String, String> {
public void open(Configuration configuration) throws Exception {
... ...
stats = new NonBlockingStatsDClient();
}
public String map(String value) throws Exception {
... ...
stats.increment();
return value;
}
}
public class datagen extends RichSourceFunction<String> {
... ...
public void run(SourceContext<String> ctx) throws Exception {
int i = 0;
while (run){
String idx = String.format("%09d", i);
ctx.collect("{\"<a 1kb json content with idx in certain json field>\"}");
i++;
if(i == loop)
run = false;
}
}
... ...
}
public class Job {
public static void main(String[] args) throws Exception {
... ...
DataStream<String> stream = env.addSource(new datagen(loop)).rebalance();
DataStream<String> convert = stream.map(new passthru(statsdUrl));
env.execute("Flink");
}
}
The ReducingState code:
dataStream.flatMap(xxx).keyBy(new KeySelector<xxx, AggregationKey>() {
public AggregationKey getKey(rec r) throws Exception {
... ...
}
}).process(new Aggr());
public class Aggr extends ProcessFunction<rec, rec> {
private ReducingState<rec> store;
public void open(Configuration parameters) throws Exception {
store= getRuntimeContext().getReducingState(new ReducingStateDescriptor<>(
"reduction store", new ReduceFunction<rec>() {
... ...
}
public void processElement(rec r, Context ctx, Collector<rec> out)
throws Exception {
... ...
store.add(r);
I have a pulling scenario:
HTTP -> Kafka -> Flink -> some output
If I'm not wrong, I can use the Kafka consumer only on a stream?
Therefore I need to "block" the stream in order to sum/count the data I receive from the HTTP call.
The easiest way to "block" is to add a window.
What is the best approach for this pulling scenario?
UPDATE
I want to prevent the collector from summing each value:
SingleOutputStreamOperator<Tuple2<String, Integer>> t =
in.flatMap(new FlatMapFunction<String, Tuple2<String, Integer>>() {
@Override
public void flatMap(String s, Collector<Tuple2<String, Integer>>
collector) throws Exception {
ObjectMapper mapper = new ObjectMapper();
JsonNode node = mapper.readTree(s);
node.elements().forEachRemaining(v -> {
collector.collect(new Tuple2<>(v.textValue(), 1));
});
}
}).keyBy(0).sum(1);
If I understand correctly I think what you may want to use is a session window. This will continue to collect messages into the window and will only process the contents of the window when an event hasn't been received after a certain amount of time. See the documentation on session windows here: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/windows.html
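The session idea can be sketched in plain Java, independent of Flink. This is only an illustration with hypothetical names (SessionGrouper, sessions), and the exact boundary handling is an assumption of the sketch, not a statement about Flink's behaviour: events are grouped together until the gap to the previous event exceeds a threshold, and only then is the group closed. In Flink itself the equivalent would be a session window assigner such as ProcessingTimeSessionWindows.withGap(...) applied after keyBy.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; groups sorted event timestamps into sessions separated by inactivity.
final class SessionGrouper {
    // A new session starts when an event arrives more than gapMillis after the previous one.
    static List<List<Long>> sessions(List<Long> sortedTimestamps, long gapMillis) {
        List<List<Long>> result = new ArrayList<>();
        List<Long> current = new ArrayList<>();
        for (long ts : sortedTimestamps) {
            if (!current.isEmpty() && ts - current.get(current.size() - 1) > gapMillis) {
                result.add(current);          // close the session that just went quiet
                current = new ArrayList<>();
            }
            current.add(ts);
        }
        if (!current.isEmpty()) {
            result.add(current);              // close the last open session
        }
        return result;
    }
}
```

A sum or count would then be computed once per closed session rather than once per incoming element.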