How to schedule JMS consuming in Apache Camel? - apache-camel

I need to consume JMS messages with Camel everyday at 9pm (or from 9pm to 10pm to give it the time to consume all the messages).
I can't see any "scheduler" option for URIs "cMQConnectionFactory:queue:myQueue" while it exists for "file://" or "ftp://" URIs.
If I put a cTimer before it will send an empty message to the queue, not schedule the consumer.

You can use a route policy where you can setup for example a cron expression to tell when the route is started and when its stopped.
http://camel.apache.org/scheduledroutepolicy.html
Other alternatives is to start/stop the route via the Java API or JMX etc and have some other logic that knows when to do that according to the clock.

This is something that has caused me a significant amount of trouble. There are a number of ways of skinning this cat, and none of them are great as far as I can see.
On is to set the route not to start automatically, and use a schedule to start the route and then stop it again after a short time using the controlbus EIP. http://camel.apache.org/controlbus.html
I didn't like this approach because I didn't trust that it would drain the queue completely once and only once per trigger.
Another is to use a pollEnrich to query the queue, but that only seems to pick up one item from the queue, but I wanted to completely drain it (only once).
I wrote a custom bean that uses consumer and producer templates to read all the entries in a queue with a specified time-out.
I found an example on the internet somewhere, but it took me a long time to find, and quickly searching again I can't find it now.
So what I have is:
from("timer:myTimer...")
.beanRef( "myConsumerBean", "pollConsumer" )
from("direct:myProcessingRoute")
.to("whatever");
And a simple pollConsumer method:
public void pollConsumer() throws Exception {
if ( consumerEndpoint == null ) consumerEndpoint = consumer.getCamelContext().getEndpoint( endpointUri );
consumer.start();
producer.start();
while ( true ) {
Exchange exchange = consumer.receive( consumerEndpoint, 1000 );
if ( exchange == null ) break;
producer.send( exchange );
consumer.doneUoW( exchange );
}
producer.stop();
consumer.stop();
}
where the producer is a DefaultProducerTemplate, consumer is a DefaultConsumerTemplate, and these are configured in the bean configuration.
This seems to work for me, but if anyone gives you a better answer I'll be very interested to see how it can be done better.

Related

how to perform parallel processing of gcp pubsub messages in apache camel

I have this code below that takes message from pubsub source topic -> transform it as per a template -> then publish the transformed message to a target topic.
But to improve performance I need to do this task in parallel.That is i need to poll 500 messages,and then transform it in parallel and then publish them to the target topic.
From the camel gcp component documentation I believe maxMessagesPerPoll and concurrentConsumers parameter will do the job.Due to lack of documentation I am not sure how does it internally works.
I mean a) if I poll say 500 message ,will then it create 500 parallel route that will process the messages and publish it to the target topic b)what about ordering of the messages c) should I be looking at parallel processing EIPs as an alternative
etc.
The concept is not clear to me
Was go
// my route
private void addRouteToContext(final PubSub pubSub) throws Exception {
this.camelContext.addRoutes(new RouteBuilder() {
#Override
public void configure() throws Exception {
errorHandler(deadLetterChannel("google-pubsub:{{gcp_project_id}}:{{pubsub.dead.letter.topic}}")
.useOriginalMessage().onPrepareFailure(new FailureProcessor()));
/*
* from topic
*/
from("google-pubsub:{{gcp_project_id}}:" + pubSub.getFromSubscription() + "?"
+ "maxMessagesPerPoll={{consumer.maxMessagesPerPoll}}&"
+ "concurrentConsumers={{consumer.concurrentConsumers}}").
/*
* transform using the velocity
*/
to("velocity:" + pubSub.getToTemplate() + "?contentCache=true").
/*
* attach header to the transform message
*/
setHeader("Header ", simple("${date:now:yyyyMMdd}")).routeId(pubSub.getRouteId()).
/*
* log the transformed event
*/
log("${body}").
/*
* publish the transformed event to the target topic
*/
to("google-pubsub:{{gcp_project_id}}:" + pubSub.getToTopic());
}
});
}
a) if I poll say 500 message ,will then it create 500 parallel route that will process the messages and publish it to the target topic
No, Camel does not create 500 parallel threads in this case. As you suspect, the number of concurrent consumer threads is set with concurrentConsumers. So if you define 5 concurrentConsumers with a maxMessagesPerPoll of 500, every consumer will fetch up to 500 messages and process them one after the other in a single thread. That is, you have 5 messages processed in parallel.
what about ordering of the messages
As soon as you process messages in parallel, the order of messages is messed up. But this already happens with 1 Consumer when you got processing errors and they are detoured to your deadLetterChannel and reprocessed later.
should I be looking at parallel processing EIPs as an alternative
Only if the concurrentConsumers option is not sufficient.
When you mention the concurrentConsumers option(let's say concurrentConsumers=10), you are asking Camel to create a thread pool of 10 threads, and each of those 10 threads will pick up a different message from the pub-sub queue and process them.
The thing to note here is that when you are specifying the concurrentConsumers option, the thread pool uses a fixed size, which means that a fixed number of active threads are waiting at all times to process incoming messages. So 10 threads(since I specified concurrentConsumers=10) will be waiting to process my messages, even if there aren't 10 messages coming in simultaneously.
Obviously, this is not going to guarantee that the incoming messages will be processed in the same order. If you are looking to have the messages in the same order, you can have a look at the Resequencer EIP to order your messages.
As for your third question, I don't think google-pubsub component allows a parallel processing option. You can make your own using the Threads EIP. This would definitely give more control over your concurrency.
Using Threads, your code would look something like this:
from("google-pubsub:project-id:destinationName?maxMessagesPerPoll=20")
// the 2 parameters are 'pool size' and 'max pool size'
.threads(5, 20)
.to("direct:out");

How to manually ack/nack a PubSub message in Camel Route

I am setting up a Camel Route with ackMode=NONE meaning acknowlegements are not done automatically. How do I explicitly acknowledge the message in the route?
In my Camel Route definition I've set ackMode to NONE. According to the documentation, I should be able to manually acknowledge the message downstream:
https://github.com/apache/camel/blob/master/components/camel-google-pubsub/src/main/docs/google-pubsub-component.adoc
"AUTO = exchange gets ack’ed/nack’ed on completion. NONE = downstream process has to ack/nack explicitly"
However I cannot figure out how to send the ack.
from("google-pubsub:<project>:<subscription>?concurrentConsumers=1&maxMessagesPerPoll=1&ackMode=NONE")
.bean("processingBean");
My PubSub subscription has an acknowledgement deadline of 10 seconds and so my message keeps getting re-sent every 10 seconds due to ackMode=NONE. This is as expected. However I cannot find a way to manually acknowledge the message once processing is complete and stop the re-deliveries.
I was able to dig through the Camel components and figure out how it is done. First I created a GooglePubSubConnectionFactory bean:
#Bean
public GooglePubsubConnectionFactory googlePubsubConnectionFactory() {
GooglePubsubConnectionFactory connectionFactory = new GooglePubsubConnectionFactory();
connectionFactory.setCredentialsFileLocation(pubsubKey);
return connectionFactory;
}
Then I was able to reference the ack id of the message from the header:
#Header(GooglePubsubConstants.ACK_ID) String ackId
Then I used the following code to acknowledge the message:
List<String > ackIdList = new ArrayList<>();
ackIdList.add(ackId);
AcknowledgeRequest ackRequest = new AcknowledgeRequest().setAckIds(ackIdList);
Pubsub pubsub = googlePubsubConnectionFactory.getDefaultClient();
pubsub.projects().subscriptions().acknowledge("projects/<my project>/subscriptions/<my subscription>", ackRequest).execute();
I think it is best if you look how the Camel component does it with ackMode=AUTO. Have a look at this class (method acknowledge)
But why do you want to do this extra work? Camel is your fried to simplify integration by abstracting away low level code.
So when you use ackMode=AUTO Camel automatically commits your successfully processed messages (when the message has successfully passed the whole route) and rolls back your not processable messages.

Infinite AMQP Consumer with Alpakka

I'm trying to implement a very simple service connected to an AMQP broker with Alpakka. I just want it to consume messages from its queue as a stream at the moment they are pushed on a given exchange/topic.
Everything seemed to work fine in my tests, but when I tried to start my service, I realized that my stream was only consuming my messages once and then exited.
Basically I'm using the code from Alpakka documentation :
def consume()={
val amqpSource = AmqpSource.committableSource(
TemporaryQueueSourceSettings(connectionProvider, exchangeName)
.withDeclaration(exchangeDeclaration)
.withRoutingKey(topic),
bufferSize = prefetchCount
)
val amqpSink = AmqpSink.replyTo(AmqpReplyToSinkSettings(connectionProvider))
amqpSource.mapAsync(4)(msg => onMessage(msg)).runWith(amqpSink)
}
I tried to schedule the consume() execution every second, but I experienced OutOfMemoryException issues.
Is there any proper way to make this code run as an infinite loop ?
If you want to have a Source restarted when it fails or is cancelled, wrap it with RestartSource.withBackoff.

How to use multithreading inside an JPA camel route

We have an importer running on a powerful, multi-core server. However, our Apache Camel routes are single threaded, which is a shame.
Our [camel] importer is a single-instance program. How can I make a specific route process the messages using multiple threads? The messages are atomic and are processed by a bean, which already does this in a thread-safe way.
I am already happy if I could process batches (maxMessagesPerPoll) in threads and have idle time until the next poll takes place (after all, that's still better than sequential processing).
Here is one of the routes I would like to make multithreaded:
public void onConfigure() throws Exception {
// This is a JPA query which selects all unprocessed modules
String query = RouteQueryHelper.selectNextUnprocessedStaged(ImportAction.IMPORT_MODULES);
from("jpa:com.so.importer.entity.ModuleStageEntity" +
"?consumer.query=" + query +
"&maxMessagesPerPoll=2000" +
"&consumeLockEntity=false" +
"&consumer.delay=1000" +
"&consumeDelete=false")
.transacted().policy("CAMEL_DEFAULT_POLICY")
.bean(moduleImportService) // processes the entity and updates it's status flag
.to("log:import-module?groupInterval=10000")
.routeId("so.route.import-module");
}
The route has consumeDelete=false, because we use a status property on the entity instead (which is modified and saved). The status property is also respected in the consumer.query.
We use camel version 2.17.1 in spring boot (1.3.8.RELEASE) on Java 8.
EDIT 2019-Jan-21: The entities have a method with #Consumed on them, which pushes the entity into the next route after it was processed:
#Consumed
public void gotoNextStatus() {
switch (stageStatus) {
case STAGED: setStageStatus(StageStatus.IMPORTED); break;
case IMPORTED: setStageStatus(StageStatus.RENDERED); break;
case RENDERED: setStageStatus(StageStatus.DONE); break;
}
}
You could introduce some asynchronisation by sending your messages to an intermediate SEDA endpoint:
from("jpa:")
...
.to("seda:intermediateStage")
And then put the real processing inside a new route with N concurrrent SEDA consumers (default is one):
from("seda:intermediateStage?concurrentConsumers=5")
.process(...)

Camel ApacheMQ -> AHC behaviour (blocking?)

I just started using Apache Camel and I'm curious about the seemingly counter-intuitive default behaviour of asynchronous http client (AHC). While consuming messages from ActiveMQ, I can't get it to act in a non-blocking fashion.
My route looks like this:
#Component
public class Broadcaster extends RouteBuilder {
#Override
public void configure() throws Exception {
errorHandler(deadLetterChannel("activemq:failed.messages"));
from("activemq:outbound.messages")
.setExchangePattern(ExchangePattern.InOnly)
.recipientList(simple("ahc:${in.header[PublishDestination]}"))
.end();
}
}
I enqueued several messages, half of which I sent to a delayed web server, and the other half to a normal one. I expected to see all the normal messages consumed immediately by the fast server, and the slow messages gradually over time. However, this was the behaviour observed on the fast web server:
00:24:02.585, <hello>World</hello>
00:24:03.622, <hello>World</hello>
00:24:04.640, <hello>World</hello>
00:24:05.658, <hello>World</hello>
As you can see there is exactly one second between each logged request that corresponds to the artificial 1 second delay on the slow server. Based on the route timings, it looks like the JMS consumer is waiting for AHC to complete before it consumes the next message off the queue:
Processor Elapsed (ms)
[activemq://outbound.messages ] [ 1020]
[setExchangePattern[InOnly] ] [ 0]
[ahc:${in.header[PublishDestination]}} ] [ 1018]
Am I supposed to explicitly use async producers and write callback handlers in these cases, or is there something else I'm missing? Thank you!
Well, case of a RTFM I guess, although I ActiveMQ page leaves a lot to be desired in terms of properties available for endpoint configuration. There should probably be a note to say most (all?) JMS config options are also available for ActiveMQ component. In any case, the solution is to define the consumer as follows:
from("activemq:outbound.messages?asyncConsumer=true")

Resources