How to interrupt camel multicast for subset of endpoints/subroute - apache-camel

I have requirement to poll two ftp folders say "output" and "error" for file. Actual file could come in either of two folders(not both). I tried multicast (even used completionSize of 1) but process keeps waiting as camel waits for both output and error endpoint routes to complete.
from("direct:got_filename_to_poll")
.multicast()
.parallelProcessing(true)
.to("ftp:output", "ftp:error")
.end()
.to("direct:process_extracted_file")
Is there any way to interrupt "ftp:error" sub-route if I get response from "ftp:output" sub-route(or vice versa) or is there any other option to solve this problem without compromising response time, example adding timeout on slow response will slow down overall response time.

Related

Aggregate results of batch consumer in Camel (for example from SQS)

I'm consuming messages from SQS FIFO queue with maxMessagesPerPoll=5 set.
Currently I'm processing each message individually which is a total waste of resources.
In my case, as we are using FIFO queue and all of those 5 messages are related to the same object, I could process them all toghether.
I though this might be done by using aggregate pattern but I wasn't able to get any results.
My consumer route looks like this:
from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
.process(exchange -> {
// process the message
})
I believe it should be possible to do something like this
from("aws-sqs://my-queue?maxMessagesPerPoll=5&messageGroupIdStrategy=usePropertyValue")
.aggregate(const(true), new GroupedExchangeAggregationStrategy())
.completionFromBatchConsumer()
.process(exchange -> {
// process ALL messages together as I now have a list of all exchanges
})
but the processor is never invoked.
Second thing:
If I'm able to make this work, when does ACK is sent to SQS? When each individual message is processed or when the aggregate process finishes? I hope the latter
When the processor is not called, the aggregator probably still waits for new messages to aggregate.
You could try to use completionSize(5) instead of completionFromBatchConsumer() for a test. If this works, the batch completion definition is the problem.
For the ACK against the broker: unfortunately no. I think the message is commited when it arrives at the aggregator.
The Camel aggregator component is a "stateful" component and therefore it must end the current transaction.
For this reason you can equip such components with persistent repositories to avoid data loss when the process is killed. In such a scenario the already aggregated messages would obviously be lost if you don't have a persistent repository attached.
The problem lies in GroupedExchangeAggregationStrategy
When I use this strategy, the output is an "array" of all exchanges. This means that the exchange that comes to the completion predicate no longer has the initial properties. Instead it has CamelGroupedExchange and CamelAggregatedSize which makes no use for the completionFromBatchConsumer()
As I don't actually need all exchanges being aggregated, it's enough to use GroupedBodyAggregationStrategy. Then exchange properties will remain as in the original exchange and just the body will contain an "array"
Another solution would be to use completionSize(Predicate predicate) and use a custom predicate that extracts necessary value from groupped exchanges.

Apache camel - EIP design query

I have a EIP design related query.I have a requirement to process csv file by chunks and call a Rest API.After completion of processing of whole file i need to call another Rest API telling processing is complete.I wanted the route to be transacted so i have queue in between in case of end system not available the retry will happen at broker level.
My flow is as below.
First flow:
csv File->Split by chunk of 100 records->Place message in queue
the second flow(Transacted route):
Picks message from queue ->call the rest API
the second flow is transacted.Since iam breaking the flow and it is asynchronous iam not sure how to call to the completion call.I do not have a persistent store to status of each chunk processing.
is there anyway i can achive it using JMS functionality or Camel?
What you can use for your first flow is the Camel Splitter EIP:
http://camel.apache.org/splitter.html
And closely looking at the doc, you will find that there are three exchange properties available for each split exchange:
CamelSplitIndex: A split counter that increases for each Exchange being split. The counter starts from 0.
CamelSplitSize: The total number of Exchanges that was splitted. This header is not applied for stream based splitting. From Camel 2.9 onwards this header is also set in stream based splitting, but only on the completed Exchange.
CamelSplitComplete: Whether or not this Exchange is the last.
As they are exchange properties, you should put them to JMS headers before sending the messages to a queue. But then you should be able to make use of the information at the second flow, so you can know which is the last message.
Keep in mind, though, that it's all asynchronous so the CamelSplitComplete flag doesn't necessarily mean the last message at the second flow. You may create a stateful counter or utilise the Resequencer EIP http://camel.apache.org/resequencer.html to deal with the asynchronicity.

Camel Multicast Subroutes out of order

I have a scenario where I get as input Message A. Message A must then be split into 3 different types of message, and forwarded to other routes. It is important that the messages arrive in a precise order, Ie. A-1 must be sent before A-2, which must be sent before A-3.
To do this I have done the following (outline):
from("activemq:queue:somequeue-local")
.multicast().to("direct:a1","direct:a2","direct:a3");
from("direct:a1)
//split incoming message and prepare output document for A-1
.to("activemq:queue:otherqueue")
.from("direct:a2)
//split incoming message and prepare output document for A-2
.to("activemq:queue:otherqueue")
.from("direct:a3)
//split incoming message and prepare output document for A-3
.to("activemq:queue:otherqueue")
And in another context, responsible for sending out the info to the external system, I have
.from("activemq:queue:otherqueue?maxMessagesPerTask=1&concurrentConsumers=1&maxConcurrentConsumers=1")
// do different stuff based on which type we are called with then end with
.beanref("somebean","writeToFileAndCallImportbat");
Now, my problem is, that when I get to the receiver, I get the messages in random order. Sometimes A-1,A-3,A-2, sometimes right, A-1,A-2,A-3.
I have tried adding JMSXGroupID and JMSXGroupSeq to the messages, but without any luck.
I have also tried skipping the MQ part entirely, and use direct-vm: to call the shared receiver, but then it looks like I have three simultanious invocations of the receiver at once, and still in random execution order.
I was under the impression that multicast would run sequential, unless otherwise prompted to?
Is there something fundamentally wrong with the approach taken?
I am using Camel version 2.12.
Or, said more plainly:
I would like a route that creates three different output messages, and executes a batch file on them, in order. How do I go about that?
If you use the Splitter pattern, have you checked to see if the streaming property is set to false.
If enabled then Camel will split in a streaming fashion, which means it will split the input message in chunks. This reduces the memory overhead. For example if you split big messages its recommended to enable streaming. If streaming is enabled then the sub-message replies will be aggregated out-of-order, eg in the order they come back. If disabled, Camel will process sub-message replies in the same order as they where splitted.
So, it turned out to not be a problem with multicast after all.
Rather, in each of my sub-routes, I did this:
.split(..stax(SpecialClass)).streaming()
.beanRef("transformationBean","somefunction")
.aggregate(constant("1"), new MyAggregator())
.completionTimeout(5000)
.completionSize(1000)
.to(writeToFileAndRunBat)
Which, I assumed meant "Process all elements in the split, and if you aren't finished in 5 seconds or after 1000 elements, break out".
I changed it to
.split(..stax(SpecialClass), , new MyAggregator()).streaming()
.beanRef("transformationBean","somefunction")
.end()
.to(writeToFileAndRunBat)
Coming to think of it, it makes perfect sense, as the first version couldn't really know when we were done, while the last (I assume) just iterate over all elements in the split and calls the Aggregator for each.
Also, I had to .end() in the first version. So I guess the whole thing was just acting random.

Request Reply and Scatter Gather Using Apache Camel

I am attempting to construct a route which will do the following:
Consume a message from jms:sender-in. I am using a INOUTrequest reply pattern. The JMSReplyTo = sender-out
The above message will be routed to multiple recipients like jms:consumer1-in, jms:consumer2-in and jms:consumer3-in. All are using a request reply pattern. The JMSReplyTo is specified per consumer ( in this case, the JMSReplyTo are in this order jms:consumer1-out, jms:consumer2-out, jms:consumer3-out
I need to aggregate all the replies together and send the result back to jms:sender-out.
I constructed a route which will resemble this:
from("jms:sender-in")
.to("jms:consumer1-in?exchangePattern=InOut&replyTo=queue:consumer1-out&preserveMessageQos=true")
.to("jms:consumer2-in?exchangePattern=InOut&replyTo=queue:consumer2-out&preserveMessageQos=true")
.to("jms:consumer3-in?exchangePattern=InOut&replyTo=queue:consumer3-out&preserveMessageQos=true");
I then send the replies back to some queue to gather and aggreagte:
from("jms:consumer1-out?preserveMessageQos=true").to("jms:gather");
from("jms:consumer1-out?preserveMessageQos=true").to("jms:gather");
from("jms:consumer1-out?preserveMessageQos=true").to("jms:gather");
from("jms:gather").aggregate(header("TransactionID"), new GatherResponses()).completionSize(3).to("jms:sender-out");
To emulate the behavior of my consumers, I added the following route:
from("jms:consumer1-in").setBody(body());
from("jms:consumer2-in").setBody(body());
from("jms:consumer3-in").setBody(body());
I am getting a couple off issues:
I am getting a timeout error on the replies. If I comment out the gather part, then no issues. Why is there a timeout even though the replies are coming back to the queue and then forwarded to another queue.
How can I store the original JMSReplyTo value so Camel is able to send the aggregated result back to the sender's reply queue.
I have a feeling that I am struggling with some basic concepts. Any help is appreciated.
Thanks.
A good question!
There are two things you need to consider
Don't mix the exchange patterns, Request Reply (InOut) vs Event
message (InOnly). (Unless you have a good reason).
If you do a scatter-gather, you need to make the requests
multicast, otherwise they will be pipelined which is not
really scatter-gather.
I've made two examples which are similar to your case - one with Request Reply and one with (one way) Event messages.
Feel free to replace the activemq component with jms - it's the same thing in these examples.
Example one, using event messages - InOnly:
from("activemq:amq.in")
.multicast()
.to("activemq:amq.q1")
.to("activemq:amq.q2")
.to("activemq:amq.q3");
from("activemq:amq.q1").setBody(constant("q1")).to("activemq:amq.gather");
from("activemq:amq.q2").setBody(constant("q2")).to("activemq:amq.gather");
from("activemq:amq.q3").setBody(constant("q3")).to("activemq:amq.gather");
from("activemq:amq.gather")
.aggregate(new ConcatAggregationStrategy())
.header("breadcrumbId")
.completionSize(3)
.to("activemq:amq.out");
from("activemq:amq.out")
.log("${body}"); // logs "q1q2q3"
Example two, using Request reply - note that the scattering route has to gather the responses as they come in. The result is the same as the first example, but with less routes and less configuration.
from("activemq:amq.in2")
.multicast(new ConcatAggregationStrategy())
.inOut("activemq:amq.q4")
.inOut("activemq:amq.q5")
.inOut("activemq:amq.q6")
.end()
.log("Received replies: ${body}"); // logs "q4q5q6"
from("activemq:amq.q4").setBody(constant("q4"));
from("activemq:amq.q5").setBody(constant("q5"));
from("activemq:amq.q6").setBody(constant("q6"));
As for your question two - of course, it's possible to pass around JMSReplyTo headers and force exchange patterns along the road - but you will create hard to debug code. Keep your exchange patterns simple and clean - it keep bugs away.

Respond to MINA request after aggregating several requests in Camel

I'm trying to implement a mina service where the response to the final message should be based on the previous messages. Each message (header (1), data (n), end (1)) should receive a response, but the response to the "end" message should be based on the "header", and any "data" messages received as well as the "end" message. Currently, I'm routing the messages to an aggregator which completes when it finds a "header" and "end" message for a particular correlation id. Unfortunately, the response is being sent before (or at the same time?) the message is sent to the aggregator, so I don't have access to the aggregated message (which contains all the data I need to build the correct response) when building the response.
Is there a way to do this without manually storing and accessing the accumulated data (that is, without re-implementing camel's aggregator)?
Edit:
Route is something like:
<camelContext>
<route>
<from uri="mina:..."/>
<process ref="messageProcessor"/>
<aggregate>
<process ref="completeMessageProcessor"/>
</aggregate>
</route>
</camelContext>
I left out some tags and attributes (correlationExpression, completionPredicate, strategyRef, etc.) for clarity.
The messages were being aggregated properly, and they were being processed properly when "completed" (that is, when aggregated). But the response sent back through the mina endpoint to the client was the one generated by the messageProcessor, never the one generated by the completeMessageProcessor.
For example (and yes, it's a rather contrived example, but bear with me), let's say the protocol involves the client sending a header message which includes the total number of data messages it expects to send. Then it sends a number of data messages, which might be different in number to what it expected to send. Finally, it sends a footer, or end, message. The server should then respond back with the difference between the expected number of messages and the actual number of messages. With the route as written, that is impossible, since the number of messages is not known by the messageProcessor, which only processes individual messages. The completeMessageProcessor, having the aggregated message (consisting of header, all the data, and the end) does know this number, but the response generated at that point is not propagated back to the mina endpoint.
Changing the parsing of the messages to generate a message only when entire composed message is received is not an option, since the server must respond to the individual messages.
off the top, my guess is that the messageProcessor is setting up the OUT message, but the completeMessageProcessor is setting up the IN message. The mina consumer response is expecting/using the OUT message instead.
you can add some logging to verify this. if this is the case, then you might change your messageProcessor to use the IN body instead (or use exchanges headers) and add a transform after your completeMessageProcessor to set the OUT body based on the IN body
<transform>
<simple>${in.body}</simple>
</transform>
see this for more information: http://camel.apache.org/using-getin-or-getout-methods-on-exchange.html
UPDATE: after some discussion, the real issue is that the aggregator currently only handles "InOnly" exchanges

Resources