Apache Camel testing: wait for multicast to finish processing

I have a route FirstRoute which at the end multicasts and sends to SecondRoute.
While writing my route test, I noticed that multicast starts a new thread. If SecondRoute takes longer to persist the data, my integration test (which starts FirstRoute) can't read that data, because SecondRoute runs separately and FirstRoute has already signalled that it finished processing.
I am trying to find a way for my FirstRoute test to wait for SecondRoute to finish processing before it runs my verifications.
Following is my route code:
from("First_route_id")
.process() // bla bla
.multicast()
.to("Second_route_id");
---
from("Second_route_id")
.proces() // save data
.end()

You can use Camel's NotifyBuilder to build expressions like "when 1 message has completed in my route SecondRoute".
Example:
NotifyBuilder notification = new NotifyBuilder(context)
    .from("direct:secondRoute").whenDone(1)
    .create();
You can then wait for the defined condition to become true, optionally passing a timeout as a parameter:
boolean done = notification.matches(10, TimeUnit.SECONDS);
As soon as the condition becomes true, the test is notified and continues, so you can put your assertions after this point. If the condition does not become true, the test continues once the timeout is exhausted.
See the Camel documentation on NotifyBuilder for more examples.
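For illustration, a minimal JUnit sketch of how this could look in a CamelTestSupport test; the direct:first endpoint and the payload are assumptions, not from the question:

import java.util.concurrent.TimeUnit;

import org.apache.camel.builder.NotifyBuilder;
import org.apache.camel.test.junit4.CamelTestSupport;
import org.junit.Test;

public class FirstRouteTest extends CamelTestSupport {

    @Test
    public void shouldWaitForSecondRouteBeforeAsserting() throws Exception {
        // condition: one exchange has been completed by the second route
        NotifyBuilder notification = new NotifyBuilder(context)
            .from("direct:secondRoute").whenDone(1)
            .create();

        template.sendBody("direct:first", "test payload");

        // block until the second route is done, or fail after 10 seconds
        assertTrue(notification.matches(10, TimeUnit.SECONDS));

        // the data persisted by the second route can safely be verified here
    }
}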

Related

How to perform parallel processing of GCP PubSub messages in Apache Camel

I have the code below that takes a message from a Pub/Sub source topic, transforms it with a template, and then publishes the transformed message to a target topic.
To improve performance I need to do this task in parallel: poll 500 messages, transform them in parallel, and then publish them to the target topic.
From the Camel GCP component documentation I believe the maxMessagesPerPoll and concurrentConsumers parameters will do the job, but due to the lack of documentation I am not sure how they work internally. I mean:
a) If I poll, say, 500 messages, will Camel then create 500 parallel routes that process the messages and publish them to the target topic?
b) What about the ordering of the messages?
c) Should I be looking at the parallel-processing EIPs as an alternative?
The concept is not clear to me.
// my route
private void addRouteToContext(final PubSub pubSub) throws Exception {
    this.camelContext.addRoutes(new RouteBuilder() {
        @Override
        public void configure() throws Exception {
            errorHandler(deadLetterChannel("google-pubsub:{{gcp_project_id}}:{{pubsub.dead.letter.topic}}")
                .useOriginalMessage().onPrepareFailure(new FailureProcessor()));

            // from topic
            from("google-pubsub:{{gcp_project_id}}:" + pubSub.getFromSubscription() + "?"
                    + "maxMessagesPerPoll={{consumer.maxMessagesPerPoll}}&"
                    + "concurrentConsumers={{consumer.concurrentConsumers}}")
                .routeId(pubSub.getRouteId())
                // transform using the Velocity template
                .to("velocity:" + pubSub.getToTemplate() + "?contentCache=true")
                // attach a header to the transformed message
                .setHeader("Header", simple("${date:now:yyyyMMdd}"))
                // log the transformed event
                .log("${body}")
                // publish the transformed event to the target topic
                .to("google-pubsub:{{gcp_project_id}}:" + pubSub.getToTopic());
        }
    });
}
a) If I poll, say, 500 messages, will Camel then create 500 parallel routes that process the messages and publish them to the target topic?
No, Camel does not create 500 parallel threads in this case. As you suspected, the number of concurrent consumer threads is set with concurrentConsumers. So if you define 5 concurrentConsumers with a maxMessagesPerPoll of 500, every consumer fetches up to 500 messages and processes them one after the other in a single thread. That is, you have 5 messages processed in parallel.
b) What about the ordering of the messages?
As soon as you process messages in parallel, the order of the messages is no longer guaranteed. But this already happens with a single consumer when you get processing errors and messages are detoured to your deadLetterChannel and reprocessed later.
c) Should I be looking at the parallel-processing EIPs as an alternative?
Only if the concurrentConsumers option is not sufficient.
When you use the concurrentConsumers option (let's say concurrentConsumers=10), you are asking Camel to create a thread pool of 10 threads, each of which picks up a different message from the Pub/Sub queue and processes it.
The thing to note here is that with concurrentConsumers the thread pool has a fixed size, meaning a fixed number of active threads is waiting at all times to process incoming messages. So 10 threads (since I specified concurrentConsumers=10) will be waiting to process my messages, even if there aren't 10 messages coming in simultaneously.
Obviously, this is not going to guarantee that the incoming messages are processed in order. If you need the messages in order, have a look at the Resequencer EIP, sketched below.
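A minimal resequencer sketch; the mySeqNo header and the endpoint names are hypothetical, and it assumes each message carries a numeric sequence header to sort on:

from("direct:unordered")
    // collect up to 100 exchanges (or wait at most 1 second), then emit them
    // sorted by the hypothetical mySeqNo header
    .resequence(header("mySeqNo")).batch().size(100).timeout(1000)
    .to("direct:ordered");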
As for your third question, I don't think the google-pubsub component offers a parallel-processing option. You can build your own using the Threads EIP, which definitely gives you more control over your concurrency.
Using Threads, your code would look something like this:
from("google-pubsub:project-id:destinationName?maxMessagesPerPoll=20")
// the 2 parameters are 'pool size' and 'max pool size'
.threads(5, 20)
.to("direct:out");

How to manually ack/nack a PubSub message in Camel Route

I am setting up a Camel route with ackMode=NONE, meaning acknowledgements are not done automatically. How do I explicitly acknowledge the message in the route?
In my Camel route definition I've set ackMode to NONE. According to the documentation, the downstream process should be able to acknowledge the message manually:
https://github.com/apache/camel/blob/master/components/camel-google-pubsub/src/main/docs/google-pubsub-component.adoc
"AUTO = exchange gets ack’ed/nack’ed on completion. NONE = downstream process has to ack/nack explicitly"
However, I cannot figure out how to send the ack.
from("google-pubsub:<project>:<subscription>?concurrentConsumers=1&maxMessagesPerPoll=1&ackMode=NONE")
.bean("processingBean");
My Pub/Sub subscription has an acknowledgement deadline of 10 seconds, so my message keeps getting re-sent every 10 seconds due to ackMode=NONE. This is as expected, but I cannot find a way to manually acknowledge the message once processing is complete and stop the re-deliveries.
I was able to dig through the Camel component code and figure out how it is done. First I created a GooglePubsubConnectionFactory bean:
@Bean
public GooglePubsubConnectionFactory googlePubsubConnectionFactory() {
    GooglePubsubConnectionFactory connectionFactory = new GooglePubsubConnectionFactory();
    connectionFactory.setCredentialsFileLocation(pubsubKey);
    return connectionFactory;
}
Then I was able to reference the ack id of the message from the header:
@Header(GooglePubsubConstants.ACK_ID) String ackId
Then I used the following code to acknowledge the message:
List<String> ackIdList = new ArrayList<>();
ackIdList.add(ackId);
AcknowledgeRequest ackRequest = new AcknowledgeRequest().setAckIds(ackIdList);
Pubsub pubsub = googlePubsubConnectionFactory.getDefaultClient();
pubsub.projects().subscriptions()
    .acknowledge("projects/<my project>/subscriptions/<my subscription>", ackRequest)
    .execute();
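Putting those pieces together, a bean method receiving the ack id could look roughly like this (a sketch; the injected connection factory field and the placeholder subscription path are assumptions):

public void process(@Body String body,
                    @Header(GooglePubsubConstants.ACK_ID) String ackId) throws Exception {
    // ... process the message body first ...

    // then acknowledge explicitly so Pub/Sub stops re-delivering the message
    AcknowledgeRequest ackRequest = new AcknowledgeRequest()
        .setAckIds(Collections.singletonList(ackId));
    Pubsub pubsub = googlePubsubConnectionFactory.getDefaultClient();
    pubsub.projects().subscriptions()
        .acknowledge("projects/<my project>/subscriptions/<my subscription>", ackRequest)
        .execute();
}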
I think it is best if you look at how the Camel component does it with ackMode=AUTO; have a look at this class (method acknowledge).
But why do you want to do this extra work? Camel is your friend: it simplifies integration by abstracting away low-level code.
So when you use ackMode=AUTO, Camel automatically commits your successfully processed messages (once a message has successfully passed the whole route) and rolls back the messages it could not process.
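In other words, with the default ackMode=AUTO the route from the question needs no explicit ack code at all:

from("google-pubsub:<project>:<subscription>?concurrentConsumers=1&maxMessagesPerPoll=1")
    // ack'ed automatically when the exchange completes the route, nack'ed on failure
    .bean("processingBean");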

Camel: Poll Enrich with Aggregation

From the Camel book, in the section 'Using pollEnrich to merge additional data with an existing message', it is shown that you can merge the old exchange (from the Quartz trigger) with the new one (from FTP).
My problem is that I have a file from a topic (the old exchange) and I use pollEnrich to get a new file from an FTP server, and I want to merge these too: I am interested in copying some headers from the old exchange to the new exchange.
What I am facing is that the oldExchange is always null.
I have read the aggregator examples from the Camel book, and there it says: "The first message arrives for the first group. oldExchange == null."
I don't understand: where, then, is my oldExchange, the one from the topic? Why is the exchange non-null only at the second iteration (for the same group)?
from("myTopic")
.pollEnrich()
.simple("ftp://myUrl&fileName=${in.headers.test}")
.aggregate((Exchange oldExchange, Exchange newExchange) -> {
final String oldHeader = oldExchange.getIn().getHeader("test", String.class);
newExchange.getIn().setHeader("test", oldHeader);
return newExchange;
})
I have read this thread: http://camel.465427.n5.nabble.com/Split-and-Aggregate-Old-Exchange-is-null-everytime-in-AggregationStrategy-td5746365.html#a5746405 and I still don't understand how both messages can belong to the same group.
"The first message arrives for the first group. oldExchange == null." I don't understand ...
This is true for a standard aggregation where you aggregate, for example, multiple incoming messages into one. In that case, on the first incoming message the aggregator is still empty, and therefore the oldExchange (the aggregator content) is null. You have to wait for a second message to be able to aggregate something.
However, in your case (enrich) the oldExchange should not be null, because the first message, i.e. the message from your topic, is already there.
Have you tried inspecting the message from the topic in the debugger, or logging it before it reaches the enricher, just to make sure it is not empty?
Added after a test
This is fascinating: I tried this in a unit test, and when I define the pollEnrich as you do, I get the inverse result: my consumed message routed by .from(...) is the oldExchange, and my newExchange is always null.
However, if I define the pollEnrich "inline", it works fine:
.pollEnrich("URI", timeout, aggregationStrategy)
I suspect this is explainable if you analyze what the DSL does with these two definitions, but from my quick test's perspective it looks a bit strange.
@burki: true, it works as you said with the aggregation strategy inside pollEnrich(), but I need simple() because I am calling an endpoint dynamically, and I cannot do that inside pollEnrich (or at least I don't know how).
I was able to solve it like this:
from("myTopic")
.pollEnrich()
.simple("ftp://myUrl&fileName=${in.headers.test}")
.aggregationStrategy((Exchange oldExchange, Exchange newExchange) -> {
final String oldHeader = oldExchange.getIn().getHeader("test", String.class);
newExchange.getIn().setHeader("test", oldHeader);
return newExchange;
})
So instead of the .aggregate call, I am using .aggregationStrategy. What I understood is that .aggregate is for the standard aggregation (as @burki mentioned), where we want to aggregate multiple messages, while .aggregationStrategy can be used to merge two messages (one of them coming from an external service).
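For comparison, the inline form described above would look roughly like this (a sketch with a static URI and a hypothetical 5000 ms timeout, so it does not cover the dynamic-endpoint case):

from("myTopic")
    .pollEnrich("ftp://myUrl?fileName=someFile", 5000, (oldExchange, newExchange) -> {
        // copy the 'test' header from the consumed message onto the polled file
        final String oldHeader = oldExchange.getIn().getHeader("test", String.class);
        newExchange.getIn().setHeader("test", oldHeader);
        return newExchange;
    })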

Infinite AMQP Consumer with Alpakka

I'm trying to implement a very simple service connected to an AMQP broker with Alpakka. I just want it to consume messages from its queue as a stream, at the moment they are pushed to a given exchange/topic.
Everything seemed to work fine in my tests, but when I started my service I realized that my stream consumed my messages once and then exited.
Basically I'm using the code from the Alpakka documentation:
def consume() = {
  val amqpSource = AmqpSource.committableSource(
    TemporaryQueueSourceSettings(connectionProvider, exchangeName)
      .withDeclaration(exchangeDeclaration)
      .withRoutingKey(topic),
    bufferSize = prefetchCount
  )
  val amqpSink = AmqpSink.replyTo(AmqpReplyToSinkSettings(connectionProvider))
  amqpSource.mapAsync(4)(msg => onMessage(msg)).runWith(amqpSink)
}
I tried scheduling the consume() execution every second, but I ran into OutOfMemoryError issues.
Is there a proper way to make this code run as an infinite loop?
If you want a Source to be restarted when it fails or is cancelled, wrap it with RestartSource.withBackoff.
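Applied to the code above, that could look roughly like this (a sketch assuming the Akka 2.5-era API; newer Akka versions take a RestartSettings object instead):

import scala.concurrent.duration._
import akka.stream.scaladsl.RestartSource

def consumeForever() =
  RestartSource
    .withBackoff(minBackoff = 1.second, maxBackoff = 30.seconds, randomFactor = 0.2) { () =>
      // the factory recreates the source on every (re)start
      AmqpSource.committableSource(
        TemporaryQueueSourceSettings(connectionProvider, exchangeName)
          .withDeclaration(exchangeDeclaration)
          .withRoutingKey(topic),
        bufferSize = prefetchCount
      )
    }
    .mapAsync(4)(msg => onMessage(msg))
    .runWith(AmqpSink.replyTo(AmqpReplyToSinkSettings(connectionProvider)))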

How to schedule JMS consuming in Apache Camel?

I need to consume JMS messages with Camel every day at 9 pm (or from 9 pm to 10 pm, to give it time to consume all the messages).
I can't see any "scheduler" option for URIs like cMQConnectionFactory:queue:myQueue, while it exists for file:// or ftp:// URIs.
If I put a cTimer before it, it will send an empty message to the queue, not schedule the consumer.
You can use a route policy, where you can set up, for example, a cron expression that tells Camel when the route is started and when it is stopped:
http://camel.apache.org/scheduledroutepolicy.html
Other alternatives are to start/stop the route via the Java API or JMX etc., with some other logic that knows when to do that according to the clock.
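For example, with the Quartz-based CronScheduledRoutePolicy, a sketch of the route-policy approach (the queue name and cron expressions are assumptions for a 9 pm to 10 pm window):

CronScheduledRoutePolicy policy = new CronScheduledRoutePolicy();
policy.setRouteStartTime("0 0 21 * * ?");   // start the route at 9 pm
policy.setRouteStopTime("0 0 22 * * ?");    // stop it again at 10 pm
policy.setRouteStopGracePeriod(5000);       // let in-flight exchanges finish

from("jms:queue:myQueue")
    .routePolicy(policy).noAutoStartup()    // only consume within the window
    .to("direct:process");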
This is something that has caused me a significant amount of trouble. There are a number of ways of skinning this cat, and none of them are great as far as I can see.
One is to set the route not to start automatically, and use a schedule to start the route and then stop it again after a short time using the Control Bus EIP: http://camel.apache.org/controlbus.html
I didn't like this approach because I didn't trust that it would drain the queue completely once and only once per trigger.
Another is to use pollEnrich to query the queue, but that only seems to pick up one item from the queue, whereas I wanted to drain it completely (only once).
So I wrote a custom bean that uses consumer and producer templates to read all the entries in a queue with a specified timeout.
I found an example on the internet somewhere, but it took me a long time to find, and quickly searching again I can't find it now.
So what I have is:
from("timer:myTimer...")
.beanRef( "myConsumerBean", "pollConsumer" )
from("direct:myProcessingRoute")
.to("whatever");
And a simple pollConsumer method:
public void pollConsumer() throws Exception {
    if (consumerEndpoint == null) {
        consumerEndpoint = consumer.getCamelContext().getEndpoint(endpointUri);
    }
    consumer.start();
    producer.start();
    while (true) {
        // wait up to 1 second for the next message; null means the queue is drained
        Exchange exchange = consumer.receive(consumerEndpoint, 1000);
        if (exchange == null) break;
        producer.send(exchange);
        consumer.doneUoW(exchange);
    }
    producer.stop();
    consumer.stop();
}
where producer is a DefaultProducerTemplate and consumer is a DefaultConsumerTemplate, both configured in the bean configuration (the producer's default endpoint presumably points at direct:myProcessingRoute).
This seems to work for me, but if anyone gives you a better answer I'll be very interested to see how it can be done better.
