Flink checkpoint size growing over 20GB and checkpoint time taking over 1 minute - apache-flink

First and foremost:
I'm kind of new to Flink (I understand the principles and am able to create any basic streaming job I need to).
I'm using Kinesis Analytics to run my Flink job, and by default it uses incremental checkpointing with a 1 minute interval.
The Flink job reads events from a Kinesis stream using a FlinkKinesisConsumer and a custom deserializer (it deserializes the bytes into a simple Java object which is used throughout the job).
What I would like to achieve is simply counting how many events of ENTITY_ID/FOO and ENTITY_ID/BAR there are for the past 24 hours. It is important that this count is as accurate as possible, and this is why I'm using this Flink feature instead of doing a running sum myself on a 5 minute tumbling window.
I also want to have a count of 'TOTAL' events from the start (and not just for the past 24h), so I also output the count of events for the past 5 minutes so that the post-processing app can simply take these 5 minutes of data and do a running sum. (This count doesn't have to be accurate, and it's OK if there is an outage and I lose some counts.)
Now, this job was working pretty well up until last week, when we had a surge (10 times more) in traffic. From that point on Flink went bananas.
The checkpoint size slowly grew from ~500MB to 20GB, and checkpoint times were taking around 1 minute and growing over time.
The application started failing and was never able to fully recover, the event iterator age shot up and never went back down, so no new events were being consumed.
Since I'm new to Flink, I'm not entirely sure whether the way I'm doing the sliding count is completely unoptimised or plain wrong.
This is a small snippet of the key part of the code:
The source (MyJsonDeserializationSchema extends AbstractDeserializationSchema and simply reads the bytes and creates the Event object):
SourceFunction<Event> source =
new FlinkKinesisConsumer<>("input-kinesis-stream", new MyJsonDeserializationSchema(), kinesisConsumerConfig);
The input event, a simple Java POJO which will be used in the Flink operators:
public class Event implements Serializable {
    public String entityId;
    public String entityType;
    public String entityName;
    public long eventTimestamp = System.currentTimeMillis();
}
env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
DataStream<Event> eventsStream = kinesis
    .assignTimestampsAndWatermarks(new BoundedOutOfOrdernessTimestampExtractor<Event>(Time.seconds(30)) {
        @Override
        public long extractTimestamp(Event event) {
            return event.eventTimestamp;
        }
    });
DataStream<Event> fooStream = eventsStream
    .filter(new FilterFunction<Event>() {
        @Override
        public boolean filter(Event event) throws Exception {
            return "foo".equalsIgnoreCase(event.entityType);
        }
    });
DataStream<Event> barStream = eventsStream
    .filter(new FilterFunction<Event>() {
        @Override
        public boolean filter(Event event) throws Exception {
            return "bar".equalsIgnoreCase(event.entityType);
        }
    });
StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);
Table fooTable = tEnv.fromDataStream(fooStream, "entityId, entityName, entityType, eventTimestamp.rowtime");
tEnv.registerTable("Foo", fooTable);
Table barTable = tEnv.fromDataStream(barStream, "entityId, entityName, entityType, eventTimestamp.rowtime");
tEnv.registerTable("Bar", barTable);
Table slidingFooCountTable = fooTable
    .window(Slide.over("24.hour").every("5.minute").on("eventTimestamp").as("minuteWindow"))
    .groupBy("entityId, entityName, minuteWindow")
    .select("concat(concat(entityId,'_'), entityName) as slidingFooId, entityId as slidingFooEntityId, entityName as slidingFooEntityName, entityType.count as slidingFooCount, minuteWindow.rowtime as slidingFooMinute");
Table slidingBarCountTable = barTable
    .window(Slide.over("24.hour").every("5.minute").on("eventTimestamp").as("minuteWindow"))
    .groupBy("entityId, entityName, minuteWindow")
    .select("concat(concat(entityId,'_'), entityName) as slidingBarId, entityId as slidingBarEntityId, entityName as slidingBarEntityName, entityType.count as slidingBarCount, minuteWindow.rowtime as slidingBarMinute");
Table tumblingFooCountTable = fooTable
    .window(Tumble.over(tumblingWindowTime).on("eventTimestamp").as("minuteWindow"))
    .groupBy("entityId, entityName, minuteWindow")
    .select("concat(concat(entityId,'_'), entityName) as tumblingFooId, entityId as tumblingFooEntityId, entityName as tumblingFooEntityName, entityType.count as tumblingFooCount, minuteWindow.rowtime as tumblingFooMinute");
Table tumblingBarCountTable = barTable
    .window(Tumble.over(tumblingWindowTime).on("eventTimestamp").as("minuteWindow"))
    .groupBy("entityId, entityName, minuteWindow")
    .select("concat(concat(entityId,'_'), entityName) as tumblingBarId, entityId as tumblingBarEntityId, entityName as tumblingBarEntityName, entityType.count as tumblingBarCount, minuteWindow.rowtime as tumblingBarMinute");
Table aggregatedTable = slidingFooCountTable
    .leftOuterJoin(slidingBarCountTable, "slidingFooId = slidingBarId && slidingFooMinute = slidingBarMinute")
    .leftOuterJoin(tumblingFooCountTable, "slidingFooId = tumblingFooId && slidingFooMinute = tumblingFooMinute")
    .leftOuterJoin(tumblingBarCountTable, "slidingFooId = tumblingBarId && slidingFooMinute = tumblingBarMinute")
    .select("slidingFooMinute as timestamp, slidingFooEntityId as entityId, slidingFooEntityName as entityName, slidingFooCount, slidingBarCount, tumblingFooCount, tumblingBarCount");
DataStream<Result> result = tEnv.toAppendStream(aggregatedTable, Result.class);
result.addSink(sink); // write to an output stream to be picked up by a lambda function
I would greatly appreciate it if someone with more experience working with Flink could comment on the way I have done my counting. Is my code completely over-engineered? Is there a better and more efficient way of counting events over a 24h period?
I have read somewhere on Stack Overflow that @DavidAnderson suggested creating your own sliding window using MapState and slicing the events by timestamp.
However, I'm not exactly sure what this means and I didn't find any code example showing it.
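A rough sketch of that idea (my own interpretation of the suggestion, untested and not David Anderson's actual code) would be to key the stream by entityId, bucket counts into 5-minute slices in MapState, and whenever the watermark passes a slice boundary emit the sum of the slices covering the last 24 hours while dropping expired slices:

import org.apache.flink.api.common.state.MapState;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.functions.KeyedProcessFunction;
import org.apache.flink.util.Collector;

import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Counts events per key over a sliding 24h window, sliced into 5-minute buckets.
// State per key is at most 288 (24h / 5min) MapState entries, not the raw events.
public class SlicedSlidingCount extends KeyedProcessFunction<String, Event, Tuple2<String, Long>> {

    private static final long SLICE_MS = 5 * 60 * 1000L;          // one 5-minute slice
    private static final long WINDOW_MS = 24 * 60 * 60 * 1000L;   // total sliding window

    // slice start timestamp -> number of events that fell into that slice
    private transient MapState<Long, Long> sliceCounts;

    @Override
    public void open(Configuration parameters) {
        sliceCounts = getRuntimeContext().getMapState(
                new MapStateDescriptor<>("sliceCounts", Long.class, Long.class));
    }

    @Override
    public void processElement(Event event, Context ctx, Collector<Tuple2<String, Long>> out) throws Exception {
        long slice = event.eventTimestamp - (event.eventTimestamp % SLICE_MS);
        Long current = sliceCounts.get(slice);
        sliceCounts.put(slice, current == null ? 1L : current + 1);
        // emit a result when the watermark passes the end of this slice
        ctx.timerService().registerEventTimeTimer(slice + SLICE_MS - 1);
    }

    @Override
    public void onTimer(long timestamp, OnTimerContext ctx, Collector<Tuple2<String, Long>> out) throws Exception {
        long windowStart = timestamp + 1 - WINDOW_MS;
        long total = 0;
        List<Long> expiredSlices = new ArrayList<>();
        for (Map.Entry<Long, Long> entry : sliceCounts.entries()) {
            if (entry.getKey() < windowStart) {
                expiredSlices.add(entry.getKey());   // slice fell out of the 24h window
            } else {
                total += entry.getValue();
            }
        }
        for (Long expired : expiredSlices) {
            sliceCounts.remove(expired);
        }
        out.collect(Tuple2.of(ctx.getCurrentKey(), total));
    }
}

It would be wired in with something like eventsStream.keyBy(e -> e.entityId).process(new SlicedSlidingCount()); keying by the entityId/entityName pair instead is just a matter of changing the key selector. The state kept per key is then a handful of longs rather than the contents of hundreds of overlapping windows.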

You are creating quite a few windows there. A sliding window with a size of 24h and a slide of 5 minutes means that each event is assigned to 24 * 60 / 5 = 288 overlapping windows, so there will be a lot of open windows, and effectively all the data You have received in a given day will be held (and checkpointed) in at least one window. So it's expected that the size and time of the checkpoint will grow as the data itself grows.
To be able to answer whether the code can be rewritten, You would need to provide more details on what exactly You are trying to achieve here.

Related

Unique Count for Multiple timewindows - Process or Reduce function combined with ProcessWindowFunction?

We need to find the number of unique elements in the input stream for multiple time windows.
The input data object has the following definition: InputData(ele1: Integer, ele2: String, ele3: String)
The stream is keyed by ele1 and ele2. The requirement is to find the number of unique ele3 values in the last 1 hour, last 12 hours and last 24 hours, and the result should refresh every 15 mins.
We are using sliding time windows with a sliding interval of 15 mins and window sizes of 1, 12 and 24 hours.
Since we need to find unique elements, we were using a process function as the window function, which stores all the elements (events) for each key until the end of the window in order to process them and count the unique elements. This, we thought, could be optimized for its memory consumption.
Instead, we tried using a combination of a reduce function and a process window function to incrementally aggregate: we keep storing unique elements in a HashSet in the reduce function and then count the size of the HashSet in the process window function.
https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#processwindowfunction-with-incremental-aggregation
public class UserDataReducer implements ReduceFunction<UserData> {
    @Override
    public UserData reduce(UserData u1, UserData u2) {
        u1.getElement3().addAll(u2.getElement3());
        return new UserData.Builder(u1.getElement1(), u1.getElement2())
                .withUniqueUsers(u1.getElement3())
                .createUserData();
    }
}
public class UserDataProcessor extends ProcessWindowFunction<UserData, Metrics,
        Tuple2<Integer, String>, TimeWindow> {
    @Override
    public void process(Tuple2<Integer, String> key,
                        ProcessWindowFunction<UserData, Metrics, Tuple2<Integer, String>, TimeWindow>.Context context,
                        Iterable<UserData> elements,
                        Collector<Metrics> out) throws Exception {
        if (Objects.nonNull(elements.iterator().next())) {
            UserData aggregatedUserAttribution = elements.iterator().next();
            out.collect(new Metrics(
                    key.f0,
                    key.f1,
                    aggregatedUserAttribution.getElement3().size()));
        }
    }
}
We expected the heap memory consumption to decrease, since we are now storing only one object per key per slide as the state.
But there was no decrease in the heap memory consumption; it was almost the same or a bit higher.
In the heap dump of the new job we observed a high number of HashMap instances, consuming more memory than the input data objects occupied in the earlier job.
What would be the best way to solve this? A process function alone, or incremental aggregation with a combination of reduce and process functions?
State Backend: Hashmap
Flink Version: 1.14.2 on Yarn
In this case I'm not really sure that partial aggregation will reduce the heap size. It should allow You to reduce the state size by some factor depending on the uniqueness of the dataset. That is because (as far as I understand) You are effectively copying the HashSet for every single element that is assigned to the window; while those copies are garbage collected, that doesn't happen immediately, so You will see quite a few of them in heap dumps.
Overall, the ProcessFunction will quite probably generate a larger state, but in terms of heap size they may be quite similar, as You have noticed.
One thing You might consider is applying more advanced processing. You can read up on Triggers and implement a trigger in such a way that You have a single 24h window, but it emits results every 1h, 12h and 24h (after which the window would be purged). Note that in such a case You would need to do some work in the ProcessFunction to make sure the results are correct. One more thing You can look at is this post.
Note that both proposed solutions will require some understanding of Flink and more manual processing of window elements.
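As a sketch of what the incremental aggregation from the docs linked in the question could look like here (an assumption on my part: getElement3() is taken to return a collection of strings), an AggregateFunction keeps a single HashSet accumulator per key and window and mutates it in place, instead of building a new UserData/HashSet for every reduced pair:

import org.apache.flink.api.common.functions.AggregateFunction;

import java.util.HashSet;

public class DistinctElement3Aggregate implements AggregateFunction<UserData, HashSet<String>, Integer> {

    @Override
    public HashSet<String> createAccumulator() {
        return new HashSet<>();                    // one accumulator per key and window
    }

    @Override
    public HashSet<String> add(UserData value, HashSet<String> accumulator) {
        accumulator.addAll(value.getElement3());   // mutate in place, no per-event copy
        return accumulator;
    }

    @Override
    public Integer getResult(HashSet<String> accumulator) {
        return accumulator.size();                 // number of unique element3 values
    }

    @Override
    public HashSet<String> merge(HashSet<String> a, HashSet<String> b) {
        a.addAll(b);
        return a;
    }
}

It would be used as .aggregate(new DistinctElement3Aggregate(), new UserDataProcessor()), with the process window function adapted to receive the pre-computed Integer. With the hashmap state backend the accumulator is stored on the heap as a plain object, so it is updated in place rather than copied per element.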

Persist Apache Flink window

I'm trying to use Flink to consume bounded data from a message queue in a streaming fashion. The data will be in the following format:
{"id":-1,"name":"Start"}
{"id":1,"name":"Foo 1"}
{"id":2,"name":"Foo 2"}
{"id":3,"name":"Foo 3"}
{"id":4,"name":"Foo 4"}
{"id":5,"name":"Foo 5"}
...
{"id":-2,"name":"End"}
The start and end of a batch can be determined using the event id. I want to receive such batches and store the latest batch (overwriting the previous one) on disk or in memory. I can write a custom window trigger to extract the events using the start and end flags, as shown below:
DataStream<Foo> fooDataStream = ...
AllWindowedStream<Foo, GlobalWindow> fooWindow = fooDataStream.windowAll(GlobalWindows.create())
.trigger(new CustomTrigger<>())
.evictor(new Evictor<Foo, GlobalWindow>() {
@Override
public void evictBefore(Iterable<TimestampedValue<Foo>> elements, int size, GlobalWindow window, EvictorContext evictorContext) {
for (Iterator<TimestampedValue<Foo>> iterator = elements.iterator();
iterator.hasNext(); ) {
TimestampedValue<Foo> foo = iterator.next();
if (foo.getValue().getId() < 0) {
iterator.remove();
}
}
}
@Override
public void evictAfter(Iterable<TimestampedValue<Foo>> elements, int size, GlobalWindow window, EvictorContext evictorContext) {
}
});
But how can I persist the output of the latest window? One way would be to use a ProcessAllWindowFunction to receive all the events and write them to disk manually, but that feels like a hack. I'm also looking into the Table API with a Flink CEP Pattern (like this question) but couldn't find a way to clear the Table after each batch to discard the events from the previous batch.
There are a couple of things getting in the way of what you want:
(1) Flink's window operators produce append streams, rather than update streams. They're not designed to update previously emitted results. CEP also doesn't produce update streams.
(2) Flink's file system abstraction does not support overwriting files. This is because object stores, like S3, don't support this operation very well.
I think your options are:
(1) Rework your job so that it produces an update (changelog) stream. You can do this with toChangelogStream, or by using Table/SQL operations that create update streams, such as GROUP BY (when it's used without a time window). On top of this, you'll need to choose a sink that supports retractions/updates, such as a database. (A small sketch of this option follows after these two options.)
(2) Stick to producing an append stream and use something like the FileSink to write the results to a series of rolling files. Then do some scripting outside of Flink to get what you want out of this.
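As an illustration of option (1) only, here is a minimal sketch, with made-up view and field names and assuming Foo is a POJO with id and name fields: a non-windowed GROUP BY turns the result into an update stream, and toChangelogStream exposes it as a stream of Row changes.

import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.bridge.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

StreamTableEnvironment tableEnv = StreamTableEnvironment.create(env);
tableEnv.createTemporaryView("foos", fooDataStream);

// GROUP BY without a time window produces an update (changelog) stream:
// each new event for an id retracts the previous row and emits the updated one.
Table latest = tableEnv.sqlQuery(
        "SELECT id, LAST_VALUE(name) AS name FROM foos GROUP BY id");

DataStream<Row> changelog = tableEnv.toChangelogStream(latest);
changelog.print();   // in practice this would go to an upsert-capable sink, e.g. a JDBC table

With a sink that understands retractions/upserts, only the latest value per id is kept, which approximates the "overwrite the previous batch" behaviour described above.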

FLINK- Load historical data and maintain window of 30 days

My requirement is to hold 30 days of data in the stream so it is available for processing on any given day. So on the first day, when the Flink application starts, it will fetch 30 days of data from the database and merge it with the current stream data.
My challenge is managing the 30-day data window. If I create a sliding window of 30 days with a slide of 1 day, something like:
WatermarkStrategy<EventResponse> wmStrategy = WatermarkStrategy.<EventResponse>forBoundedOutOfOrderness(Duration.ofMillis(1))
.withTimestampAssigner((eventResponse, l) -> eventResponse.getLocalDateTime().toEpochSecond(ZoneOffset.MAX));
ds.assignTimestampsAndWatermarks(wmStrategy)
.windowAll(SlidingEventTimeWindows.of(Time.days(30), Time.days(1)))
    .process(new ProcessAllWindowFunction<EventResponse, Object, TimeWindow>() {
        @Override
        public void process(Context context, Iterable<EventResponse> iterable, Collector<Object> collector) throws Exception {
            // processing logic
        }
    });
In this case process() does not start processing immediately when the first element of historical data is added. My assumptions are: a) by default the first event will be part of the first window and will be available for processing immediately; b) the next day the job will drop the oldest day of data from the window. Are my assumptions correct for that piece of code? Thank you for your help on this.
I don't think that Your assumptions are correct in this case. When You use a time window with a process function, the function only gets to process the data once the window closes (in Your case after 30 days). Also, the slide of the sliding window means that the second window will contain 29 days of the first window plus the 31st day, which was not part of the first window.
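If partial results are needed before the 30-day window closes, one possible direction (my own suggestion, not part of the answer above, and it still buffers all 30 days of events in window state) is to attach a trigger that fires periodically while the window is open, for example once per day:

import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger;

ds.assignTimestampsAndWatermarks(wmStrategy)
    .windowAll(SlidingEventTimeWindows.of(Time.days(30), Time.days(1)))
    .trigger(ContinuousEventTimeTrigger.of(Time.days(1)))   // fire daily with the window's current contents
    .process(new ProcessAllWindowFunction<EventResponse, Object, TimeWindow>() {
        @Override
        public void process(Context context, Iterable<EventResponse> iterable, Collector<Object> collector) {
            // same processing logic as above, now invoked on every daily firing,
            // not only when the 30-day window finally closes
        }
    });

Each firing then sees whatever events are currently assigned to the window, so early results are partial by definition.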

Flink's CoProcessFunction doesn't trigger onTimer

I'm trying to aggregate two streams like this:
val joinedStream = finishResultStream.keyBy(_.searchId)
.connect(startResultStream.keyBy(_.searchId))
.process(new SomeCoProcessFunction)
and then working on them in the SomeCoProcessFunction class like this:
class SomeCoProcessFunction extends CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated] {
override def processElement1(finished: SearchFinished, ctx: CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated]#Context, out: Collector[SearchAggregated]): Unit = {
// aggregating some "finished" data ...
}
override def processElement2(created: SearchCreated, ctx: CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated]#Context, out: Collector[SearchAggregated]): Unit = {
val timerService = ctx.timerService()
timerService.registerEventTimeTimer(System.currentTimeMillis + 5000)
// aggregating some "created" data ...
}
override def onTimer(timestamp: Long, ctx: CoProcessFunction[SearchFinished, SearchCreated, SearchAggregated]#OnTimerContext, out: Collector[SearchAggregated]): Unit = {
val watermark: Long = ctx.timerService().currentWatermark()
println(s"watermark!!!! $watermark")
// clean up the state
  }
}
What I want is to clean up the state after a certain time (5000 milliseconds), and that is what onTimer is supposed to be used for. But since it never gets fired, I'm asking myself what I am doing wrong here.
Thanks in advance for any hint.
UPDATE:
The solution was to set the timer like this (thanks to both fabian-hueske and Beckham):
timerService.registerProcessingTimeTimer(timerService.currentProcessingTime() + 5000)
I still didn't really figure out what timerService.registerEventTimeTimer does; the watermark ctx.timerService().currentWatermark() always shows -9223372036854775808 (Long.MIN_VALUE), no matter how long before the event-time timer was registered.
I see that you are using System.currentTimeMillis which might be different from the TimeCharacteristic (event time, processing time, ingestion time) that your Flink job uses.
Try getting the timestamp of the event with ctx.timestamp() and then add the 5000 ms on top of it.
The problem is that you are registering an event-time timer (timerService.registerEventTimeTimer) with a processing-time timestamp (System.currentTimeMillis + 5000).
System.currentTimeMillis returns the current machine time, but event time is not based on the machine time; it is based on the time computed from watermarks.
Either you should register a processing-timer or register an event-time timer with an event-time timestamp. You can get the timestamp of the current watermark or the timestamp of the current record from the Context object that is passed as a parameter to processElement1() and processElement2().
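Concretely, a small sketch of the two working variants (shown in Java; the calls are the same from Scala):

// Event-time timer, relative to the current record's event-time timestamp
// (ctx.timestamp() is null if no timestamps/watermarks were assigned upstream):
ctx.timerService().registerEventTimeTimer(ctx.timestamp() + 5000);

// Processing-time timer, relative to the machine clock:
ctx.timerService().registerProcessingTimeTimer(ctx.timerService().currentProcessingTime() + 5000);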

How do I avoid STANDARD_PRICE_NOT_DEFINED when unit-testing an OpportunityLineItem in Apex v24.0?

Apparently a new feature of the Spring '12 / v24.0 release of Apex in Salesforce.com is that unit tests no longer have access to 'real' data -- thus (if I'm understanding the change correctly) a SOQL query will now only retrieve objects that have been inserted during the course of the unit test -- and even that is subject to some limitations.
At any rate this seems to throw OpportunityLineItem testing out the window, because:
It's impossible to insert an OpportunityLineItem without a PriceBookEntryId, BUT
You can't insert a new price-book entry for product X unless you already have a Standard Price Book entry for product X, BUT
There isn't a Standard Price Book in the test data because the Pricebook2 table, like all tables, is effectively empty at the beginning of the unit-test run, AND
There's no way to create a Standard Price Book in Apex
I'm really hoping I got at least one of those four points wrong, but so far no variation on my existing unit-tests has shown any of them to be wrong. Which of course means my unit tests no longer work. This happened literally overnight -- the tests ran fine in my sandbox on Friday, and now they fail.
Am I missing something, or is this a bug in the new SFDC release?
There is new functionality introduced in Summer '14: you can now use Test.getStandardPricebookId() to get the standard price book ID without having to set SeeAllData to true.
Firstly, to put your mind at ease, there are no plans ever to deprecate the seeAllData flag. We're not going to pull the rug out from under you. As to the creation of standard price book in an apex test, I'm not sure. There are, I'm sure, several areas where testing without existing data is difficult on the platform today, which is one reason why the seeAllData flag is there. We'll be trying to close those gaps in the next few releases.
I just ran into this, and although your post is old, it's the first result on Google so I thought I'd share what I did.
My basic architecture is a test class that calls a utility class as a way of creating test data on the fly (there are other ways; this is my habit).
Short version:
set seeAllData to true
make sure the standard price book is active
add a pricebook entry for the standard price book - flag as active
add a pricebook entry for your test price book - flag as active
Test class:
@isTest(seeAllData=true)
public with sharing class RMA_SelectLineItemsControllerTest {
static testmethod void testBasicObjects() {
Pricebook2 standard = [Select Id, Name, IsActive From Pricebook2 where IsStandard = true LIMIT 1];
if (!standard.isActive) {
standard.isActive = true;
update standard;
}
Pricebook2 pb = RMA_TestUtilities.createPricebook();
Product2 prod = RMA_TestUtilities.createProduct();
PricebookEntry pbe = RMA_TestUtilities.createPricebookEntry(standard,pb,prod);
}
}
The utility method looks like this (only showing the part around the new PBE):
public static PricebookEntry createPricebookEntry (Pricebook2 standard, Pricebook2 newPricebook, Product2 prod) {
System.debug('***** starting one');
PricebookEntry one = new PricebookEntry();
one.pricebook2Id = standard.id;
one.product2id = prod.id;
one.unitprice = 1249.0;
one.isactive = true;
insert one;
System.debug('***** one complete, ret next');
PricebookEntry ret = new PricebookEntry();
ret.pricebook2Id = newPricebook.id;
ret.product2id = prod.id;
ret.unitprice = 1250.0;
ret.isactive = true;
insert ret;
return ret;
}
Another workaround would be to make your trigger aware of being run in a test using Test.isRunningTest(), but I think this solution misses the point of the best practice, which is to keep tests isolated from pre-existing data.
Perhaps Salesforce could make the Pricebook2.isStandard field writeable when code is running in the context of a test, or the specific Standard Price Book record could be given the same special status as User and Profile?
Please let me know if anyone has used Test.getStandardPricebookId() and been able to insert an opportunity line item in a test class. I tried this method with the code below but got the error ": STANDARD_PRICE_NOT_DEFINED, No standard price defined for this product: []".
Note: I have seeAllData=false
ID standardPBID = Test.getStandardPricebookId();
PriceBook2 pb = new PriceBook2();
pb.name = 'GEW Water CMS';
pb.isActive=true;
insert pb;
Product2 prod= new Product2();
prod.name='TestProd';
prod.productcode='4568';
prod.isActive=true;
insert prod;
PricebookEntry standardPrice = new PricebookEntry(Pricebook2Id = standardPBID, Product2Id = prod.Id, UnitPrice = 10000, IsActive = true, currencyISOCode='USD' );
PriceBookEntry pbe= new PricebookEntry(pricebook2id=pb.id, product2id=prod.id,unitprice=2000, isActive=true, currencyISOCode='EUR');
insert pbe;
OpportunityLineItem oli = new OpportunityLineItem(OpportunityId = OppList[0].Id, pricebookentryid=pbe.id, UnitPrice = 100, Quantity = 1);
insert oli;
