I am new to Flink. Are all the messages sent to downstream nodes guaranteed to arrive in order? For example,
[Stream] -> [DownStream]
Stream: [1,2,3,4,5,6,7,8,9]
Downstream gets [3,2,1,4,5,7,6,8,9]
If this can happen, how do we handle it when we need the messages in order?
Any help would be much appreciated!
An operator can have multiple input channels. It will process the events from each channel in the order in which they were received. (Operators can also have multiple output channels.)
If your job has more than one pathway between the stream and the downstream node, then the events can race and the ordering will be non-deterministic. Otherwise the ordering will be preserved.
An example: Suppose you are reading, in parallel, from a Kafka topic with multiple partitions. Further imagine that all events from a given user are in the same Kafka partition (and are in order, by timestamp, for each user). Then in Flink you can use keyBy(user) and be sure that the event stream for each user will remain in order. On the other hand, if the events for a given user are spread across multiple partitions, then keyBy(user) will end up creating a stream of events for each user that is (almost certainly) out of order, because it will be pulling together events from several different FlinkKafkaConsumer instances that are reading in parallel.
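For illustration, a minimal sketch of that setup (broker address, topic name, group id, and the "userId,payload" record format are all hypothetical):

import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer

object PerUserOrderingSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment

    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092") // hypothetical broker
    props.setProperty("group.id", "per-user-ordering")       // hypothetical group id

    // Several FlinkKafkaConsumer subtasks read the partitions in parallel.
    val events = env.addSource(
      new FlinkKafkaConsumer[String]("user-events", new SimpleStringSchema(), props))

    events
      // keyBy routes all events for one user to the same parallel instance.
      // Per-user order survives only if each user's events sit in a single
      // Kafka partition; otherwise they race between the parallel consumers.
      .keyBy(line => line.split(",")(0)) // assuming "userId,payload" records
      .print()

    env.execute("per-user ordering sketch")
  }
}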
I'm getting data from a Kafka topic, then exploding the array and producing multiple events using flatMap.
Incoming event format:
Event(eventId: Long, time: Long)
IncomingEvent(customerId: Long, events: List[Event])
Event format after exploding incoming event:
EventAfterExploding(customerId: Long, eventId: Long, time: Long)
These events will be written to MySQL using the JDBC sink provided by Flink.
The data stored in the same Kafka partition has the same customer id, so I don't have any ordering issue there. But there can be lots of eventIds in an event, which means there can be lots of events in the same Flink partition after the flatMap operation. This can cause latency or maybe OOM issues because one operator has to process more data. To prevent this, I can repartition or increase the parallelism. But there is one more concern: every (customerId, eventId) pair has to be sent to the same sink operator, because there can be race conditions if different writers try to operate on the same pair. For example;
event1 => EventAfterExploding(1, 1, 1)
event2 => EventAfterExploding(1, 1, 2)
In this scenario, the database has to contain event2, which has the latest time, but if these two records go to different sink partitions, event1 can end up in the database instead of event2.
How can I solve the race condition problem and the scaling problem that happens when there is a lot of data in the same partition? Does applying the code block given below solve these problems? I came up with the code below because after the keyBy operation the data will be redistributed, and it should also guarantee that the same pair is always sent to the same partition, but I just want to make sure. Thanks!
incomingEvents
  .flatMap(new ExplodingFunction())
  .keyBy(event => (event.customerId, event.eventId))
  .addSink(JdbcSink.sink(...))
Yes, that code will have the effect you are looking for. All events for the same customerId and eventId will go to the same instance of the sink.
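In case it helps, here is one hedged way the elided JdbcSink.sink(...) could be filled in; the table, columns, and connection details are hypothetical. Since the plain JDBC sink is at-least-once, an upsert that keeps the greatest time per (customer_id, event_id) also protects against replays after a failure:

import java.sql.PreparedStatement

import org.apache.flink.connector.jdbc.{JdbcConnectionOptions, JdbcSink, JdbcStatementBuilder}

case class EventAfterExploding(customerId: Long, eventId: Long, time: Long)

// Hypothetical MySQL upsert keyed on (customer_id, event_id): the row keeps the
// greatest `time` seen, so a replayed older event can never overwrite a newer one.
val sink = JdbcSink.sink[EventAfterExploding](
  """INSERT INTO events (customer_id, event_id, time) VALUES (?, ?, ?)
    |ON DUPLICATE KEY UPDATE time = GREATEST(time, VALUES(time))""".stripMargin,
  new JdbcStatementBuilder[EventAfterExploding] {
    override def accept(ps: PreparedStatement, e: EventAfterExploding): Unit = {
      ps.setLong(1, e.customerId)
      ps.setLong(2, e.eventId)
      ps.setLong(3, e.time)
    }
  },
  new JdbcConnectionOptions.JdbcConnectionOptionsBuilder()
    .withUrl("jdbc:mysql://localhost:3306/appdb") // hypothetical connection
    .withDriverName("com.mysql.cj.jdbc.Driver")
    .build()
)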
I have a custom Flink Source, and a SerializableTimestampAssigner that assigns event timestamps to the records emitted by the source. The source may emit records out of order because of the nature of the underlying data storage; however, in BATCH mode I expect Flink to sort these records by event timestamp before any operator processes them.
Excerpted from the Flink documentation on execution mode:
In BATCH mode, where the input dataset is known in advance, there is no need for such a heuristic as, at the very least, elements can be sorted by timestamp so that they are processed in temporal order.
However, this doesn't seem to be the case. If I create a datastream out of the Source (StreamExecutionEnvironment.fromSource) with my timestamp assigner, and then call datastream.addSink(x => println(extractTimestamp(x))), the output isn't strictly ascending. Is my understanding of the documentation wrong? Or does Flink expect me (the user) to sort the input dataset myself?
BATCH execution mode first sorts by key, and within each key, it sorts by timestamp. By operating this way, it only needs to keep state for one key at a time, so this keeps the runtime simple and efficient.
If your pipeline isn't using keyed streams, then you won't be using keyed state or timers, so the ordering shouldn't matter (and I'm not sure what happens).
For keyed co-streams, both inputs are keyed in the same way; both streams are sorted by those keys, and the keys are advanced in lockstep.
Broadcast streams are sent in their entirety before anything else.
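To make the per-key sorting concrete, here is a small runnable sketch (the Reading type and its fields are made up); within each key the timestamps come out ascending, but the output as a whole is not globally sorted:

import org.apache.flink.api.common.RuntimeExecutionMode
import org.apache.flink.api.common.eventtime.{SerializableTimestampAssigner, WatermarkStrategy}
import org.apache.flink.streaming.api.functions.KeyedProcessFunction
import org.apache.flink.streaming.api.scala._
import org.apache.flink.util.Collector

case class Reading(key: String, ts: Long)

object BatchSortingSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setRuntimeMode(RuntimeExecutionMode.BATCH) // enables the batch-only sort

    env
      .fromElements(Reading("a", 3L), Reading("b", 2L), Reading("a", 1L), Reading("b", 9L))
      .assignTimestampsAndWatermarks(
        WatermarkStrategy
          .forMonotonousTimestamps[Reading]()
          .withTimestampAssigner(new SerializableTimestampAssigner[Reading] {
            override def extractTimestamp(r: Reading, recordTs: Long): Long = r.ts
          }))
      // Only the input to keyed operators is sorted: first by key, then by
      // timestamp within each key. Without the keyBy there is no guarantee.
      .keyBy(_.key)
      .process(new KeyedProcessFunction[String, Reading, String] {
        override def processElement(
            r: Reading,
            ctx: KeyedProcessFunction[String, Reading, String]#Context,
            out: Collector[String]): Unit =
          out.collect(s"${ctx.getCurrentKey} -> ${r.ts}")
      })
      .print() // timestamps ascend per key, not across the whole output

    env.execute("batch sorting sketch")
  }
}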
Suppose I want to implement an ETL job with Flink, where the source and sink are both Kafka topics with only one partition.
The order of records in the source and sink matters to downstream consumers (there are more jobs that consume the sink of my ETL; those jobs are maintained by other teams).
Is there any way to make sure the order of records in the sink is the same as in the source, while making the parallelism more than 1?
https://stackoverflow.com/a/69094404/2000823 covers parts of your question. The basic principle is that two events will maintain their relative ordering so long as they take the same path through the execution graph. Otherwise, the events will race against each other, and there is no guarantee regarding ordering.
If your job only has FORWARD connections between the tasks, then the order will always be preserved. If you use keyBy or rebalance (to change the parallelism), then it will not.
A Kafka topic with one partition cannot be read from (or written to) in parallel. You can increase the parallelism of the job, but this will only have a meaningful effect on intermediate tasks (since in this case the source and sink cannot operate in parallel) -- which then introduces the possibility of events ending up out-of-order.
If it's enough to maintain the ordering on a key-by-key basis, then with just one partition, you'll always be fine. With multiple partitions being consumed in parallel, then if you use keyBy (or GROUP BY in SQL), you'll be okay only if all events for a key are always in the same Kafka partition.
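As a concrete sketch (topic names, broker, and the map transformation are hypothetical): if strict source order must reach the sink, the safest shape is a single pathway with parallelism 1 end to end, so every connection stays FORWARD:

import java.util.Properties

import org.apache.flink.api.common.serialization.SimpleStringSchema
import org.apache.flink.streaming.api.scala._
import org.apache.flink.streaming.connectors.kafka.{FlinkKafkaConsumer, FlinkKafkaProducer}

object OrderedEtlSketch {
  def main(args: Array[String]): Unit = {
    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(1) // one pathway end to end: every connection is FORWARD

    val props = new Properties()
    props.setProperty("bootstrap.servers", "localhost:9092") // hypothetical broker
    props.setProperty("group.id", "ordered-etl")             // hypothetical group id

    env
      .addSource(new FlinkKafkaConsumer[String]("source-topic", new SimpleStringSchema(), props))
      .map(_.toUpperCase) // stand-in for the real transformation
      // No keyBy or rebalance here: raising the parallelism mid-pipeline would
      // create racing pathways and break the source order.
      .addSink(new FlinkKafkaProducer[String]("sink-topic", new SimpleStringSchema(), props))

    env.execute("ordered ETL sketch")
  }
}

This satisfies the ordering requirement at the cost of parallelism, but with only one partition on each end, parallelism 1 is all the source and sink could use anyway.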
I'm implementing a real-time streaming ETL pipeline using Apache Flink. The pipeline has these characteristics:
Ingest a single Kinesis stream: stream-A
The stream has records of type EventA which have a category_id, representing distinct logical streams
Because of how they are written to Kinesis (separate producer per category_id, writing serially), these logical streams are guaranteed to be read in order by FlinkKinesisConsumer
Flink does some in-order processing work, keyed by the category_id, generating a stream of EventB data records
These records are written to Kinesis stream-B
A separate service ingests the data from stream-B and it is important that this happens in order.
The processing looks something like this:
val in_events = env.addSource(new FlinkKinesisConsumer[EventA]( // these are guaranteed ordered
  "stream-A",
  new EventASchema,
  consumerConfig))

val out_events = in_events
  .keyBy(event => event.category_id)
  .process(new EventAStreamProcessor)

out_events.addSink(new FlinkKinesisProducer[EventB](
  "stream-B",
  new EventBSchema,
  producerConfig))

// a separate service reads the out_events and wants them in order
Based on the guidelines here, it seems it is impossible to guarantee the ordering of EventB records written to the sink. I only care that events with the same category_id are written in order, since the downstream service will keyBy this. Thinking from first principles, if I were to implement the threading manually, I would have a separate queue per category_id KeyedStream and ensure those are written serially to Kinesis (this seems like a strict generalization of what is done by default, which is to use a ThreadPool with a single global queue). Does the FlinkKinesisProducer support this mechanism, or is there a way around this limitation using Flink's keyBy or a similar construct? A separate sink per category_id, maybe? For this last option, I'm anticipating 100k category_ids, so it might have too much memory overhead.
One option is to buffer events read from stream-B in the downstream service to order them (with high probability if the buffer window is large). In theory this should work, but it makes the downstream service more complex than it needs to be, precludes determinism since it depends on the random timing of network calls, and, more importantly, adds latency to the pipeline (though maybe less latency overall than forcing serial writes to stream-B?). So ideally, I'm hoping to go with another option. And this feels like a common problem, so perhaps there are more clever solutions out there, or I'm missing something obvious.
Many thanks in advance.
I have a specific task to join two data streams in one aggregation using Apache Flink with some additional logic.
Basically I have two data streams: a stream of events and a stream of so-called meta-events. I use Apache Kafka as a message backbone. What I'm trying to achieve is to trigger the evaluation of the aggregation/window based on the information given in the meta-event. The basic scenario is:
The Data Stream of events starts to emit records of Type A;
The records keep accumulating in some aggregation or window based on some key;
The Data Stream of meta-events receives a new meta-event with the given key, which also defines the total number of events that will be emitted in the Data Stream of events.
The number of events from step 3 becomes the trigger criterion for the aggregation: once the total count of Type A events with a given key equals the number defined in the meta-event with that key, the aggregation should be triggered for evaluation.
Steps 1 and 3 occur in a non-deterministic order, so they can be reordered.
What I've tried is to analyze Flink Global Windows, but I'm not sure whether that would be a good and adequate solution. I'm also not sure whether this problem has a solution in Apache Flink.
Any possible help is highly appreciated.
The simplistic answer is to .connect() the two streams, keyBy() the appropriate fields in each stream, and then run them into a custom KeyedCoProcessFunction. You'd save the current aggregation result & count in the left-hand (Type A) stream's state and the target count in the right-hand (meta-event) stream's state, and generate results when the aggregation count == the target count (see the sketch after this answer).
But there is an issue here - what happens if you get N records in the Type A stream before you get the meta-event record for that key, and N > the target count? Essentially you either have to guarantee that doesn't happen, or you need to buffer Type A events (in state) until you get the meta-event record.
A similar situation could occur if the meta-event target can be changed to a smaller value, of course.
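Here is a minimal sketch of that approach, using a plain sum as the aggregation; all names and types are illustrative, and the overshoot problem above is deliberately not handled (results are emitted only when the counts match exactly):

import org.apache.flink.api.common.state.{ValueState, ValueStateDescriptor}
import org.apache.flink.configuration.Configuration
import org.apache.flink.streaming.api.functions.co.KeyedCoProcessFunction
import org.apache.flink.util.Collector

case class EventA(key: String, value: Long)
case class MetaEvent(key: String, expectedCount: Long)

class CountTriggeredAggregator
    extends KeyedCoProcessFunction[String, EventA, MetaEvent, (String, Long)] {

  // Running sum and count of Type A events seen so far for the current key.
  private var sum: ValueState[java.lang.Long] = _
  private var count: ValueState[java.lang.Long] = _
  // Target count delivered by the meta-event (null until it arrives).
  private var target: ValueState[java.lang.Long] = _

  override def open(parameters: Configuration): Unit = {
    sum = getRuntimeContext.getState(new ValueStateDescriptor("sum", classOf[java.lang.Long]))
    count = getRuntimeContext.getState(new ValueStateDescriptor("count", classOf[java.lang.Long]))
    target = getRuntimeContext.getState(new ValueStateDescriptor("target", classOf[java.lang.Long]))
  }

  override def processElement1(
      e: EventA,
      ctx: KeyedCoProcessFunction[String, EventA, MetaEvent, (String, Long)]#Context,
      out: Collector[(String, Long)]): Unit = {
    sum.update(Option(sum.value()).map(_.longValue).getOrElse(0L) + e.value)
    count.update(Option(count.value()).map(_.longValue).getOrElse(0L) + 1L)
    maybeEmit(ctx.getCurrentKey, out)
  }

  override def processElement2(
      m: MetaEvent,
      ctx: KeyedCoProcessFunction[String, EventA, MetaEvent, (String, Long)]#Context,
      out: Collector[(String, Long)]): Unit = {
    target.update(m.expectedCount)
    maybeEmit(ctx.getCurrentKey, out) // handles the meta-event arriving last
  }

  // Emit (and reset) once the observed count equals the announced target.
  private def maybeEmit(key: String, out: Collector[(String, Long)]): Unit = {
    val t = target.value()
    val c = count.value()
    if (t != null && c != null && c.longValue == t.longValue) {
      out.collect((key, sum.value().longValue))
      sum.clear(); count.clear(); target.clear()
    }
  }
}

Wiring it up (assuming eventsA: DataStream[EventA] and metaEvents: DataStream[MetaEvent], with org.apache.flink.streaming.api.scala._ imported):

val results = eventsA
  .connect(metaEvents)
  .keyBy(_.key, _.key)
  .process(new CountTriggeredAggregator)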