We have created a little Spring Boot application with Camel embedded. It simply polls an Office365 mailbox via IMAP for unread emails.
With verbose logging enabled we can see that box 1 consumes the message, processes it (sends some REST requests) and finishes. About 2 seconds after box 1 has finished, box 2 picks up the same message and processes it again.
We implemented an Idempotent consumer:
from(casesMailBox.getUri())
    .idempotentConsumer(simple("${in.headers.Message-ID}"), repo)
    .routeId("messaging")
    .process(emailToCaseProcessor);
We can see duplicate entries in the underlying Oracle tables.
The documentation is not clear, but I assume the idempotentConsumer would commit to the DB as soon as possible.
Am I missing something here?
The idempotent consumer will not work in a clustered environment if the idempotent repository is an in-memory one.
You have to use a central database or a Hazelcast data grid based implementation.
For more info refer to: http://camel.apache.org/idempotent-consumer.html
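For example, a database-backed repository shared by both boxes could look roughly like this. This is a minimal sketch, assuming the camel-sql dependency and an injected DataSource pointing at the shared Oracle schema; the endpoint options, the "mailbox" processor name and the class name are illustrative, not your actual code:

import javax.sql.DataSource;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.idempotent.jdbc.JdbcMessageIdRepository;

public class MailRoute extends RouteBuilder {

    private final DataSource dataSource; // the same shared Oracle datasource on every node

    public MailRoute(DataSource dataSource) {
        this.dataSource = dataSource;
    }

    @Override
    public void configure() {
        // "mailbox" is the processorName value used to scope the entries in the table
        JdbcMessageIdRepository repo = new JdbcMessageIdRepository(dataSource, "mailbox");

        from("imaps://outlook.office365.com?username=...&password=...&unseen=true")
            .idempotentConsumer(header("Message-ID"), repo)
            .routeId("messaging")
            .log("Processing ${header.Message-ID}");
    }
}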
ZooKeeper:
If you want to use a polling consumer or scheduler in a clustered environment and want to avoid duplicate route triggering, you can use ZooKeeper with a route policy.
Ref: http://camel.apache.org/zookeeper.html
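A rough sketch of such a route policy, assuming the camel-zookeeper component is on the classpath; the ZooKeeper address, node path and endpoint are placeholders:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.component.zookeeper.policy.ZooKeeperRoutePolicy;

public class PollingRoute extends RouteBuilder {
    @Override
    public void configure() {
        // only 1 instance of this route across the cluster is allowed to be active at a time
        ZooKeeperRoutePolicy policy =
                new ZooKeeperRoutePolicy("zookeeper:zkhost:2181/cases/mailbox-policy", 1);

        from("imaps://outlook.office365.com?username=...&password=...&unseen=true")
            .routePolicy(policy)
            .routeId("messaging-master")
            .log("Active node processing ${header.Message-ID}");
    }
}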
Hope it helps!!!
I'm looking for a best practice for monitoring the functionality of Camel routes.
I know there are monitoring tools like hawtio and camelwatch, but that's not exactly what I'm looking for.
I want to know if a route is "working" as expected. For example, you have a route which listens on a queue (from("jms...")). Maybe there are messages in the queue, but the listener is not able to dequeue them because of some DB issue or something else (depending on the JMS provider). With the monitoring tools mentioned above you just see inflight/failed/completed messages, but you don't see whether the listener is able to get the messages at all -> so the route is not "working".
I know there is also Apache BAM. Maybe I have to do some more research, but it looks like BAM creates new routes and you can't monitor existing routes. I also don't want to implement/define such business cases for each route; I'm looking for a more generic way. It's also mentioned on the Camel 3.0 idea board that BAM hasn't been touched for 5 years, so I think people don't use it that often (which suggests to me it doesn't fit their needs exactly).
I had a similar requirement some time ago, and in the end I developed a small Camel application for monitoring.
It runs on a timer, queries the different Camel applications installed on remote servers through JMX/Jolokia, and if the LastExchangeCompletedTimestamp of the route I am interested in is older than some time interval, it sends a mail to the administrators.
Maybe this approach is too simple for your scenario, but could be an option.
(Edit: more details added)
Principal points:
The main route queries the DB for the entities to control and spawns the controlling routes
The controlling routes fire on Quartz and HTTP POST to the following URL
.to("http://server:port/app/jolokia/?"+
"maxDepth=7&maxCollectionSize=500&ignoreErrors=true&canonicalNaming=false")
sending the following jsonRequest body
// 'mapper' is a Jackson ObjectMapper; entity.getRouteId() identifies the route MBean to read
LinkedHashMap<String, Object> request = new LinkedHashMap<String, Object>();
request.put("type", "read");
request.put("mbean", "org.apache.camel:" + entity.getRouteId());
String jsonRequest = mapper.writeValueAsString(request);
As a response you get another JSON document; parse it and read the LastExchangeCompletedTimestamp value.
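A rough sketch of that last step, assuming Jackson for parsing and assuming Jolokia serializes the timestamp as an ISO-8601 string with an offset (this depends on the Jolokia version/settings); the method name and parameters are illustrative:

import java.time.OffsetDateTime;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;

/** Returns true if the route has not completed an exchange within maxIdleMinutes. */
boolean isStale(String jsonResponse, long maxIdleMinutes) throws Exception {
    JsonNode root = new ObjectMapper().readTree(jsonResponse);
    // Jolokia wraps the MBean attributes under "value"
    String lastCompleted = root.path("value").path("LastExchangeCompletedTimestamp").asText();
    return lastCompleted.isEmpty()
            || OffsetDateTime.parse(lastCompleted)
                             .isBefore(OffsetDateTime.now().minusMinutes(maxIdleMinutes));
}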
We have some Camel routes defined in a single CamelContext; the routes use web services, ActiveMQ, etc.
Initially we deployed the routes as a WAR on a single JBoss node.
To scale out (as we usually do for web services), I deployed the same CamelContext on multiple JBoss nodes.
But the performance actually decreased.
FYI: all the CamelContexts point to the same ActiveMQ brokers.
Here are my questions:
How do I load balance / fail over Camel contexts on different machines?
If CamelContexts are deployed on multiple nodes, will aggregation work correctly?
Kindly give your thoughts!
Without seeing your system in detail, there is no way of knowing why it has slowed down so I'll pass over that. For your other two questions:
Failover
You don't say what sort of failover/load balancing behaviour you want. The not-very-helpful Camel documentation is here: http://camel.apache.org/clustering-and-loadbalancing.html.
One mechanism that works easily with Camel and ActiveMQ is to deploy to multiple servers and run active-active, sharing the same ActiveMQ queues. Each route attempts to read from the same queue to get a message to process. Only one route will get the message and therefore only one route processes it. Other routes are free to read subsequent messages, giving you simple load balancing. If one route crashes, the other routes will continue to process the messages, there will just be reduced capacity on your system.
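As an illustration, a minimal competing-consumers route that can be deployed unchanged on every node; this assumes the "activemq" component on each node is configured against the same shared broker, and the queue name and processing bean are placeholders:

import org.apache.camel.builder.RouteBuilder;

public class OrdersWorkerRoute extends RouteBuilder {
    @Override
    public void configure() {
        // deployed identically on every node; each message on the shared queue is
        // delivered to exactly one node, so adding nodes simply adds competing consumers
        from("activemq:queue:orders?concurrentConsumers=5")
            .routeId("orders-worker")
            .to("bean:orderProcessor"); // illustrative processing step
    }
}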
If you need to provide fault tolerance for your web services then you need to look outside Camel and use something like Elastic Load Balancing. http://aws.amazon.com/elasticloadbalancing/
Aggregation
Each Camel context will run independently of the other contexts, so one context will aggregate messages independently of what the other contexts are up to. For example, suppose you have an aggregator that stores messages from an ActiveMQ queue until it receives a special end-of-batch message. If you have the aggregator running in two different routes, the messages will be split between the two routes and only one route will receive the end-of-batch message. So one aggregator will sit there with half the messages and do nothing. The other aggregator will have the other messages and will process the end-of-batch message, but it won't know about the messages the other route picked up.
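To make that concrete, an aggregator of that shape might look like the following sketch (Camel 2.x is assumed; the header name, queue and endpoints are illustrative). It only behaves correctly if every message of a batch arrives at the same context:

import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.aggregate.GroupedExchangeAggregationStrategy;

public class BatchRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("activemq:queue:batch-items")
            .aggregate(constant(true), new GroupedExchangeAggregationStrategy())
                // the batch is only released when the end-of-batch marker arrives,
                // so all messages of the batch must reach this one context
                .completionPredicate(header("endOfBatch").isEqualTo(true))
            .to("direct:processBatch");
    }
}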
I am using Camel to implement a route that loads data from a DB and then applies some processing to it before producing results that are saved in the DB again.
This is part of a web application.
My problem is that this WAR is going to be deployed behind a load balancer onto two servers. There will then be two Camel contexts with two routes performing the same processing on the same DB.
I will have the case where the same record is being processed by both routes. How can I prevent the routes from performing the same job twice?
If your setup means each server might receive the same record, then you need an idempotent route. And you need to make sure your idempotent repository is shared between your machines. Using a database as the repository is an easy option. If you do not have a database, a Hazelcast-based repository might be an option.
What can be an issue is determining what is unique in your records, such as an order number, customer + date/time, or some increasing transaction ID number.
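If you go the Hazelcast route, a rough sketch could look like this (assuming the camel-hazelcast component; the cluster setup, map name, header key and endpoints are illustrative):

import com.hazelcast.core.Hazelcast;
import com.hazelcast.core.HazelcastInstance;
import org.apache.camel.builder.RouteBuilder;
import org.apache.camel.processor.idempotent.hazelcast.HazelcastIdempotentRepository;

public class DedupRoute extends RouteBuilder {
    @Override
    public void configure() {
        // both servers join the same Hazelcast cluster, so the repository is shared
        HazelcastInstance hz = Hazelcast.newHazelcastInstance();
        HazelcastIdempotentRepository repo =
                new HazelcastIdempotentRepository(hz, "processed-records");

        from("direct:records")
            // "recordId" stands in for whatever is unique in your data
            // (order number, customer + date/time, transaction ID, ...)
            .idempotentConsumer(header("recordId"), repo)
            .to("direct:processRecord");
    }
}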
We're using sessions in our GAE/J application. Over the weekend, we had a large spike in our datastore writes that appears to have been caused by a large number of _ah_SESSION entities being created (about 100-200 every minute). Near as we can tell, there was a rogue task queue creating them because they stopped when we purged the queue. The task was part of a mapper process we run hourly.
We don't need sessions in that hourly mapper (or indeed in any of our task queues or cron jobs or many other requests). Is there a way to disable creating a session for selected URLs?
Unfortunately that cannot be done.
This is particularly nasty when you have non-browser clients (devices via REST, or mapreduce jobs) where every request generates a new _ah_SESSION entity in the database.
The only way to avoid this is to write your own session handler, e.g. a servlet filter that sets/checks cookies and is configured to ignore certain paths.
EDIT:
I just realized that there could be another way: make sure your client (the mapreduce job) sets a dummy cookie with the proper name. GAE uses cookies named ACSID in production and dev_appserver_login on the dev server. Just always use the same cookie value, so all requests will be treated as one user/session.
There will still be the overhead of looking up/saving session objects, but at least it will not create countless _ah_SESSION entities.
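For a task-queue client, that could look roughly like the sketch below. The cookie name simply follows the suggestion above, and the queue name, URL and cookie value are placeholders, not something GAE requires:

import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.TaskOptions;

// enqueue the hourly mapper task with a fixed Cookie header, so every request
// maps onto the same session instead of creating a new _ah_SESSION entity
Queue queue = QueueFactory.getQueue("mapper-queue");
queue.add(TaskOptions.Builder
        .withUrl("/tasks/hourly-mapper")
        .header("Cookie", "ACSID=task-queue-shared-session"));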
I am in the process of learning ActiveMQ and Camel, with the goal to create a little prototype system that works something like this:
(diagram source: paulstovell.com)
When an order is placed in the Orders system, a message is sent out to any subscribers (a pub/sub system), and they can play their part in processing the order. The Orders, Shipping and Invoicing applications have their own ActiveMQ installations, so that if any of the three systems are offline, the others can continue to function. Something takes care of moving messages between the ActiveMQ installs.
Getting Apache Camel to move messages from one queue to another via routes is quite easy, if they are on the same ActiveMQ instance. So this works for managing the subscription queues.
The next challenge is pushing messages from one ActiveMQ instance to another, and it's the bit where I am not sure what to look at next.
Can Camel route between different ActiveMQ installations? (I can't figure out what the JMS endpoint URI would be if they are on different machines.)
I understand ActiveMQ has store and forward capabilities. Is this what I would use to move messages between Orders and Shipping/Invoicing?
Or is this what Apache ServiceMix is meant to solve?
This is a pretty straightforward asynchronous, event-driven application that is well-suited for ActiveMQ and Camel.
Actually you do not move messages explicitly from one ActiveMQ instance to another. The way it works is using what's known as a network of brokers. In your case, you'd have three brokers: ActiveMQ-purple, ActiveMQ-green and ActiveMQ-blue. ActiveMQ-purple creates a uni-directional broker network with ActiveMQ-green and ActiveMQ-blue. This allows ActiveMQ-purple to store-and-forward messages to ActiveMQ-green and ActiveMQ-blue based on consumer demand.
The Orders app accepts orders on the orders queue on ActiveMQ-purple. The Orders app uses Camel to consume and process a message to determine if it is an invoicing message or a shipping message. Camel routes the messages to either the invoicing queue or the shipping queue on ActiveMQ-purple.
Consumer demand comes from the Invoicing app and the Shipping app. The Invoicing app uses Camel to consume messages from the invoicing queue on ActiveMQ-green. The Shipping app uses Camel to consume messages from the shipping queue on ActiveMQ-blue. Because of the broker network, and because of the consumer demand on the ActiveMQ-green invoicing queue and the ActiveMQ-blue shipping queue, messages will be forwarded from ActiveMQ-purple to the appropriate broker and queue. There is no need to explicitly route messages to a specific broker.
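In configuration terms, that might look roughly like this on the purple broker, sketched with the embedded BrokerService API (hostnames and ports are placeholders; the same thing is usually done with networkConnector elements in activemq.xml):

import org.apache.activemq.broker.BrokerService;

public class PurpleBroker {
    public static void main(String[] args) throws Exception {
        BrokerService broker = new BrokerService();
        broker.setBrokerName("activemq-purple");
        broker.addConnector("tcp://0.0.0.0:61616");

        // uni-directional network connectors: purple stores and forwards messages to
        // green and blue whenever those brokers have consumers demanding them
        broker.addNetworkConnector("static:(tcp://green-host:61616)");
        broker.addNetworkConnector("static:(tcp://blue-host:61616)");

        broker.start();
        broker.waitUntilStopped();
    }
}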
I hope this answers your questions. Let me know if you have any more.
Bruce
Hmmmm, I've only dabbled at best, and not for a fair while, but I'll try and offer something.
ActiveMQ can route between different installations and to my knowledge just uses standard URIs, so I'm not sure what the problem is here. I would think that using TCP you'd be fine. Using ServiceMix (you mention it later) you'd just specify a connectionFactory and then provide the URI in that. This link shows some examples: http://servicemix.apache.org/servicemix-jms-new-endpoints.html.
Camel has support for the Durable Subscriber pattern if that's what you were after (http://camel.apache.org/durable-subscriber.html). This pattern ensures that if the subscriber is offline when the message is ready, the message is held until the subscriber is back online. This is also supported by ServiceMix (see the link above and look for 'subscriptionDurable').
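A minimal durable-subscriber route might look like the following sketch; the clientId, subscription name, topic and processing bean are placeholders:

import org.apache.camel.builder.RouteBuilder;

public class ShippingSubscriber extends RouteBuilder {
    @Override
    public void configure() {
        // durable subscription: if this consumer is offline, the broker keeps the topic
        // messages until it reconnects with the same clientId/subscription name
        from("activemq:topic:orders"
                + "?clientId=shipping-app"
                + "&durableSubscriptionName=shipping-orders")
            .routeId("shipping-orders-subscriber")
            .to("bean:shippingService"); // illustrative processing step
    }
}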