Apache Camel - EIP design query

I have an EIP design question. I need to process a CSV file in chunks and call a REST API for each chunk. Once the whole file has been processed, I need to call another REST API to report that processing is complete. I want the route to be transacted, so I put a queue in between: if the end system is unavailable, the retry happens at the broker level.
My flow is as follows.
First flow:
CSV file -> split into chunks of 100 records -> place each message on a queue
Second flow (transacted route):
Pick a message from the queue -> call the REST API
Because I am breaking the flow in two and the second part is asynchronous, I am not sure how to trigger the completion call. I do not have a persistent store to track the status of each chunk.
Is there any way I can achieve this using JMS functionality or Camel?

What you can use for your first flow is the Camel Splitter EIP:
http://camel.apache.org/splitter.html
Looking closely at the documentation, you will find three exchange properties available on each split exchange:
CamelSplitIndex: A split counter that increases for each Exchange being split. The counter starts from 0.
CamelSplitSize: The total number of Exchanges that were split. This header is not applied for stream-based splitting. From Camel 2.9 onwards this header is also set in stream-based splitting, but only on the completed Exchange.
CamelSplitComplete: Whether or not this Exchange is the last.
As they are exchange properties, you need to copy them into JMS headers before sending the messages to the queue. The second flow can then use that information to know which message is the last.
Keep in mind, though, that it is all asynchronous, so the CamelSplitComplete flag does not necessarily mark the last message to be processed in the second flow. You could create a stateful counter or use the Resequencer EIP http://camel.apache.org/resequencer.html to deal with the asynchronicity.
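The stateful counter idea can be sketched in plain Java. This is only an illustration under some assumptions: the class name, the `fileId` correlation value and the method signature are hypothetical, the state lives in memory (so it is lost on restart), and each chunk is assumed to be recorded once, after its REST call has succeeded.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** In-memory completion tracker (hypothetical helper); state is lost if the JVM restarts. */
final class ChunkCompletionTracker {

    /** Per-file progress: chunks recorded so far and the expected total (-1 until known). */
    private static final class Progress {
        int processed;
        int expected = -1;
    }

    private final Map<String, Progress> progressByFile = new ConcurrentHashMap<>();

    /**
     * Records one successfully processed chunk.
     *
     * @param fileId        hypothetical correlation value identifying the source file
     * @param splitIndex    value copied from CamelSplitIndex (starts at 0)
     * @param splitComplete value copied from CamelSplitComplete
     * @return true when the recorded count reaches the total, i.e. all chunks are done
     */
    boolean recordChunk(String fileId, int splitIndex, boolean splitComplete) {
        boolean[] done = {false};
        // compute() runs atomically per key, so concurrent consumers do not race.
        progressByFile.compute(fileId, (id, existing) -> {
            Progress p = (existing == null) ? new Progress() : existing;
            p.processed++;
            if (splitComplete) {
                p.expected = splitIndex + 1; // the last chunk reveals the total
            }
            done[0] = (p.expected == p.processed);
            return done[0] ? null : p; // returning null removes the finished entry
        });
        return done[0];
    }
}
```

The second flow would call `recordChunk(...)` after each successful REST call and trigger the completion call when it returns true. Because the total is derived from the last chunk's index, the order in which chunks arrive does not matter. Duplicate deliveries would need extra handling, for example by tracking the set of seen indexes instead of a count.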

Related

Reading two streams (main and configs) in sequential in Flink

I have two streams. One is the main stream; in a fraud detection example, this is the transactions stream. The second is a configs stream; in the example, these are the rules. I connect the main stream to the config stream in order to do the processing. But when Flink starts and we add the job, it begins consuming the transactions and configs streams in parallel, and when it wants to process a transaction it sometimes finds that there is no config yet, so we have to send the transaction to a dead letter queue. What I want instead is this: if there is a potential config that I could receive a bit later, I want to get that config first and then process the transaction, rather than sending it to the dead letter queue. Transactions and configs share the same key.
Long story short, is there a way to tell Flink, when the job first starts, to consume one stream until there are no new values and only then start processing the main stream? How can I make them more or less sequential?
The recommended way to approach this is to connect the 2 streams and apply a RichCoFlatMap that will allow you to buffer events from main while you're waiting to receive the config events.
Check out this useful section of the Flink tutorials. The very last paragraph actually describes your problem.
It is important to recognize that you have no control over the order in which the flatMap1 and flatMap2 callbacks are called. These two input streams are racing against each other, and the Flink runtime will do what it wants to regarding consuming events from one stream or the other. In cases where timing and/or ordering matter, you may find it necessary to buffer events in managed Flink state until your application is ready to process them. (Note: if you are truly desperate, it is possible to exert some limited control over the order in which a two-input operator consumes its inputs by using a custom Operator that implements the InputSelectable interface.)
So in a nutshell you should connect your 2 streams and have some kind of ListState where you can "buffer" your main elements while waiting to receive the rules. When you receive an element from the config stream, you check whether you had some pending elements "waiting" for that config in your ListState (your buffer). If you do, you can then process these elements and emit them through the collector of your flatmap.
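The buffering logic described above can be modelled in plain Java. This is a simulation only and deliberately does not use the Flink API: the class and method names are hypothetical, the maps stand in for keyed state (a ListState for the buffer and a value state for the config), and `process` is a placeholder for the real rule evaluation.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of "buffer main events until the config arrives"; not Flink code. */
final class ConfigGate {
    private final Map<String, String> configByKey = new HashMap<>();
    private final Map<String, List<String>> pendingByKey = new HashMap<>();
    private final List<String> output = new ArrayList<>();

    /** Plays the role of flatMap1: a transaction arrives for a key. */
    void onTransaction(String key, String transaction) {
        String config = configByKey.get(key);
        if (config == null) {
            // No config yet: buffer instead of sending to the dead letter queue.
            pendingByKey.computeIfAbsent(key, k -> new ArrayList<>()).add(transaction);
        } else {
            output.add(process(transaction, config));
        }
    }

    /** Plays the role of flatMap2: a config arrives for a key. */
    void onConfig(String key, String config) {
        configByKey.put(key, config);
        List<String> pending = pendingByKey.remove(key);
        if (pending != null) {
            for (String transaction : pending) {
                output.add(process(transaction, config)); // flush buffered events in arrival order
            }
        }
    }

    /** Placeholder for the real processing step. */
    private static String process(String transaction, String config) {
        return transaction + "@" + config;
    }

    /** Stands in for the collector: everything emitted so far. */
    List<String> output() {
        return output;
    }
}
```

In a real job the buffer would also need a policy for transactions whose config never arrives, for example a timer that eventually routes them to the dead letter queue.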
Starting with version 1.16, you can use the hybrid source support in Flink to read all of one source (configs, in your case) before reading the second source. Though I imagine you'd have to map the events to an Either<config, transaction> so that the data stream has consistent record types.

apache camel - 2 source queues with one destination-resume/suspend route

I've configured two routes that consume messages from two ActiveMQ queues (queue1 and queue2) and move the messages to another destination (say an aws-sqs queue, sqsqueue1). This setup works as expected (queue1 --> sqsqueue1, queue2 --> sqsqueue1). Now my requirement is to change the consumption so that queue1 is consumed first, and the queue2 route starts consuming only when there are no messages on queue1. I've explored Control Bus and RoutePolicy, but I'm not sure whether they fit my case. Moreover, is this a valid use case for an EIP pattern? Any advice or pointers would be helpful.
// pseudocode
from("activemq-queue:queue2")
.to("controlbus:route?routeId=queue1&action=suspend")
.onCompletion("controlbus:route?routeId=queue1&action=resume").end()
.to("aws-sqs:sqsqueue1")

Apache Camel - Dead Letter Channel: apply comparison after dequeuing

I'm having some issues trying to figure out the solution for this problem:
I need to implement a DLC on Apache Camel, but when messages are dequeued from the dead letter queue I have on ActiveMQ, every single one of them has to be compared with the latest message present on another AMQ queue.
So to be clear: when Camel is consuming from queue1 (dead letter queue) the message M1, before trying to resend it to a certain route, it has to compare M1 (for example header comparison) with the latest message present on queue2, M2. M2 is not to be removed from queue2 (it shall serve also for the next comparison) while M1 has to be removed from queue1.
I want to understand if this is possible and which EIP I'm missing in order to implement this.
What you need is a QueueBrowser to browse the messages of queue2 without consuming them.
Alternatively, you could consume from queue2 in a transaction and then force a rollback so that the message is not consumed. But if "latest message present on queue2" does not mean the first message, this will not work, because this approach only lets you process the first message.
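The difference between browsing and consuming can be modelled with plain Java collections. This sketch does not use the JMS API: the class and method names are hypothetical, strings stand in for messages, and string equality stands in for the header comparison.

```java
import java.util.Deque;
import java.util.Iterator;

/** Plain-Java model of "browse queue2, consume queue1"; no JMS API is used here. */
final class DeadLetterComparer {

    /** Walks the whole queue as a browser would and returns the last element, or null if empty. */
    static String browseLatest(Deque<String> queue) {
        String latest = null;
        for (Iterator<String> it = queue.iterator(); it.hasNext(); ) {
            latest = it.next(); // reading through an iterator does not remove anything
        }
        return latest;
    }

    /** Removes the head of the dead letter queue and compares it with the latest reference message. */
    static boolean consumeAndCompare(Deque<String> deadLetters, Deque<String> reference) {
        String m1 = deadLetters.poll();      // M1 is removed from queue1
        String m2 = browseLatest(reference); // M2 stays on queue2
        return m1 != null && m1.equals(m2);
    }
}
```

After a call to `consumeAndCompare`, the dead letter queue is one element shorter while the reference queue is unchanged, which is the behaviour the question asks for.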

Aggregate results of batch consumer in Camel (for example from SQS)

I'm consuming messages from SQS FIFO queue with maxMessagesPerPoll=5 set.
Currently I'm processing each message individually which is a total waste of resources.
In my case, as we are using a FIFO queue and all 5 of those messages relate to the same object, I could process them all together.
I thought this might be done with the aggregate pattern, but I wasn't able to get any results.
My consumer route looks like this:
from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
.process(exchange -> {
// process the message
})
I believe it should be possible to do something like this
from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
.aggregate(constant(true), new GroupedExchangeAggregationStrategy())
.completionFromBatchConsumer()
.process(exchange -> {
// process ALL messages together as I now have a list of all exchanges
})
but the processor is never invoked.
Second thing:
If I'm able to make this work, when is the ACK sent to SQS? When each individual message is processed, or when the aggregate process finishes? I hope the latter.
When the processor is not called, the aggregator probably still waits for new messages to aggregate.
You could try to use completionSize(5) instead of completionFromBatchConsumer() for a test. If this works, the batch completion definition is the problem.
For the ACK against the broker: unfortunately no. I think the message is committed when it arrives at the aggregator.
The Camel aggregator component is a "stateful" component and therefore it must end the current transaction.
For this reason you can equip such components with persistent repositories to avoid data loss when the process is killed. In such a scenario the already aggregated messages would obviously be lost if you don't have a persistent repository attached.
The problem lies in GroupedExchangeAggregationStrategy.
When I use this strategy, the output is an "array" of all exchanges. This means that the exchange that reaches the completion predicate no longer has the initial properties. Instead it has CamelGroupedExchange and CamelAggregatedSize, which are of no use to completionFromBatchConsumer().
As I don't actually need the full exchanges to be aggregated, it's enough to use GroupedBodyAggregationStrategy. Then the exchange properties remain as in the original exchange and only the body contains an "array".
Another solution would be to use completionPredicate(Predicate predicate) with a custom predicate that extracts the necessary value from the grouped exchanges.
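The idea of grouping only the bodies while still reacting to the batch-complete marker can be modelled in plain Java. This is a simulation and not Camel code: the class name and the `add` signature are hypothetical, and the boolean parameter stands in for the flag the batch consumer sets on the last message of a poll.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

/** Plain-Java model of grouping bodies until the batch is complete; no Camel API is used. */
final class BatchBodyGrouper {
    private List<String> bodies = new ArrayList<>();

    /**
     * Adds one message body to the current group.
     *
     * @param body          message payload
     * @param batchComplete flag carried by the last message of a poll
     * @return the grouped bodies when the batch closes, otherwise empty
     */
    Optional<List<String>> add(String body, boolean batchComplete) {
        bodies.add(body);
        if (!batchComplete) {
            return Optional.empty();
        }
        List<String> group = bodies;
        bodies = new ArrayList<>(); // start a fresh group for the next poll
        return Optional.of(group);
    }
}
```

The point of the model is that the completion check reads the flag from the incoming message rather than from the accumulated group, which is why a strategy that keeps the original exchange properties works with batch completion.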

Request Reply and Scatter Gather Using Apache Camel

I am attempting to construct a route which will do the following:
Consume a message from jms:sender-in. I am using an InOut request-reply pattern. The JMSReplyTo = sender-out
The above message will be routed to multiple recipients such as jms:consumer1-in, jms:consumer2-in and jms:consumer3-in. All use a request-reply pattern. The JMSReplyTo is specified per consumer (in this case, the JMSReplyTo values are, in order, jms:consumer1-out, jms:consumer2-out and jms:consumer3-out).
I need to aggregate all the replies together and send the result back to jms:sender-out.
I constructed a route which will resemble this:
from("jms:sender-in")
.to("jms:consumer1-in?exchangePattern=InOut&replyTo=queue:consumer1-out&preserveMessageQos=true")
.to("jms:consumer2-in?exchangePattern=InOut&replyTo=queue:consumer2-out&preserveMessageQos=true")
.to("jms:consumer3-in?exchangePattern=InOut&replyTo=queue:consumer3-out&preserveMessageQos=true");
I then send the replies back to some queue to gather and aggregate:
from("jms:consumer1-out?preserveMessageQos=true").to("jms:gather");
from("jms:consumer2-out?preserveMessageQos=true").to("jms:gather");
from("jms:consumer3-out?preserveMessageQos=true").to("jms:gather");
from("jms:gather").aggregate(header("TransactionID"), new GatherResponses()).completionSize(3).to("jms:sender-out");
To emulate the behavior of my consumers, I added the following route:
from("jms:consumer1-in").setBody(body());
from("jms:consumer2-in").setBody(body());
from("jms:consumer3-in").setBody(body());
I am running into a couple of issues:
I am getting a timeout error on the replies. If I comment out the gather part, there are no issues. Why is there a timeout even though the replies are coming back to the queue and are then forwarded to another queue?
How can I store the original JMSReplyTo value so Camel is able to send the aggregated result back to the sender's reply queue.
I have a feeling that I am struggling with some basic concepts. Any help is appreciated.
Thanks.
A good question!
There are two things you need to consider
Don't mix the exchange patterns, Request Reply (InOut) vs Event message (InOnly). (Unless you have a good reason).
If you do a scatter-gather, you need to make the requests multicast, otherwise they will be pipelined which is not really scatter-gather.
I've made two examples which are similar to your case - one with Request Reply and one with (one way) Event messages.
Feel free to replace the activemq component with jms - it's the same thing in these examples.
Example one, using event messages - InOnly:
from("activemq:amq.in")
.multicast()
.to("activemq:amq.q1")
.to("activemq:amq.q2")
.to("activemq:amq.q3");
from("activemq:amq.q1").setBody(constant("q1")).to("activemq:amq.gather");
from("activemq:amq.q2").setBody(constant("q2")).to("activemq:amq.gather");
from("activemq:amq.q3").setBody(constant("q3")).to("activemq:amq.gather");
from("activemq:amq.gather")
.aggregate(new ConcatAggregationStrategy())
.header("breadcrumbId")
.completionSize(3)
.to("activemq:amq.out");
from("activemq:amq.out")
.log("${body}"); // logs "q1q2q3"
Example two, using Request Reply. Note that the scattering route has to gather the responses as they come in. The result is the same as in the first example, but with fewer routes and less configuration.
from("activemq:amq.in2")
.multicast(new ConcatAggregationStrategy())
.inOut("activemq:amq.q4")
.inOut("activemq:amq.q5")
.inOut("activemq:amq.q6")
.end()
.log("Received replies: ${body}"); // logs "q4q5q6"
from("activemq:amq.q4").setBody(constant("q4"));
from("activemq:amq.q5").setBody(constant("q5"));
from("activemq:amq.q6").setBody(constant("q6"));
As for your second question: of course it's possible to pass JMSReplyTo headers around and force exchange patterns along the way, but you will create hard-to-debug code. Keep your exchange patterns simple and clean; it keeps bugs away.
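To make the multicast-versus-pipeline distinction concrete, here is a plain-Java model of request-reply scatter-gather. It uses CompletableFuture from the standard library and no Camel or JMS API; the class and method names are hypothetical, each recipient is modelled as a function from request to reply, and concatenation stands in for the aggregation strategy.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

/** Plain-Java model of multicast plus aggregation; no Camel or JMS API is used. */
final class ScatterGather {

    /**
     * Sends the same request to every recipient in parallel and
     * concatenates the replies in recipient order.
     */
    static String scatterGather(String request, List<Function<String, String>> recipients) {
        // Scatter: every recipient receives the original request, not the previous reply.
        List<CompletableFuture<String>> replies = recipients.stream()
                .map(recipient -> CompletableFuture.supplyAsync(() -> recipient.apply(request)))
                .collect(Collectors.toList());
        // Gather: wait for each reply and aggregate them.
        return replies.stream()
                .map(CompletableFuture::join) // blocks until that recipient has replied
                .collect(Collectors.joining());
    }
}
```

In a pipeline, by contrast, the second recipient would receive the first recipient's reply as its input. Here all recipients see the same request, and the caller receives one aggregated reply, so no reply-queue bookkeeping is needed outside the scattering step.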
