How to include multiple statements in an Apache Camel Esper endpoint?

I have a list of Esper statements which I would like to run within the Apache Camel flow.
How can I make sure all statements are evaluated for all messages?
Do I need to have a separate route for each Esper statement (i.e. from: esper:// ...)?
Note: Each statement may be yielding a result at a different time (e.g. aggregating over 1 second, another one over 5 seconds, etc).

According to the documentation, each Camel route that consumes from an Esper endpoint starts a single event processing statement and consumes its results. If you must have a single Camel route, craft one EPL statement that performs all the desired work (or at least selects the appropriate data for further processing later in the Camel pipeline). The alternative, as you suggest, is to stand up multiple Camel routes, each consuming from an Esper endpoint with a different EPL statement. Those routes can later be merged into a single route using one of Camel's route-linking components (seda, vm, direct, or jms).
There is a two route example with source code here.
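A minimal sketch of the multi-route option. It assumes the camel-esper component from Camel Extra is on the classpath, that its eql URI option carries the EPL statement, and that a hypothetical StockTick event type is known to Esper. The endpoint name ticks, the window lengths and the seda:esperResults queue are made up for illustration:

```java
import org.apache.camel.builder.RouteBuilder;

public class EsperFanInRoutes extends RouteBuilder {
    @Override
    public void configure() {
        // One route per statement: each esper consumer runs its own EPL
        // statement and emits results on its own schedule.
        from("esper://ticks?eql=select avg(price) from StockTick.win:time(1 sec)")
            .to("seda:esperResults");

        from("esper://ticks?eql=select avg(price) from StockTick.win:time(5 sec)")
            .to("seda:esperResults");

        // A single downstream route receives the results of both statements.
        from("seda:esperResults")
            .to("log:esperResults");
    }
}
```

Because seda is an in-memory queue, the two statement routes do not block each other, and the shared route sees results whenever either statement fires.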

Related

In Apache Camel, which components create an endpoint if it doesn't exist?

I'm new to Camel and have some basic questions that I couldn't find answers to online. Any help is appreciated.
I have read many examples online and seen a bunch like this:
from("direct:A").to("jms:queue:B");
But I didn't see any configuration for them. My question is: what happens if the direct endpoint doesn't exist? How about from("jms:queue:A").to("direct:A")? And what about the other components?
For this example, what's the execution order? Does it pass the original message to B first, then process it and pass it to C?
from("direct:A")
    .to("jms:B")
    .process(something)
    .to("jms:C");
Direct is an in-memory component provided by Camel that invokes the consuming route synchronously (it is not a queue; for in-memory queueing see seda). Prior to Camel 3 it was bundled with the camel-core module, so you did not need any configuration at all in order to use it. Since Camel 3, for the sake of modularity, direct has been its own component, and camel-direct needs to be imported in order to use it.
JMS, on the other hand, is a generic component with which you can connect to different JMS providers such as ActiveMQ (though in Camel ActiveMQ has its own component), IBM MQ, WebLogic JMS server, and others.
For your first question: if the direct component does not exist, you need to import it into your build. If the resource named by the URI given to a component is not present, Camel will create it. This is true for most of the Camel components. One of the most common examples is the file component, which is used to pick up files from a given directory. If the given directory is not present, Camel creates it by default. These are default behaviours, and you have a lot of control over how you want your route to behave.
For your second question, the route is processed strictly in order: the message is picked up from direct:A, then sent to jms:B. After that it is processed by the something processor and finally sent to jms:C.
The thing to note here is that the direct:A is just an example to show the syntax of a route. You can use any component which can act as a consumer.

Multiple route builders defined in single Camel context are sequential?

I want to understand how multiple routes in the same Camel context behave.
For example, let's say I have an application with a single Camel context which contains three different route builders, each defining one route. Each of these routes is listening to a different queue.
Can someone please let me know whether all three routes will work in parallel, or whether only one will process a message while the others wait?
I believe the queues will function in parallel. You can also specify the number of threads (.threads()) the route uses. So, with that, it is safe to presume that the routes run in parallel.
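As a rough sketch of what that could look like (the queue names are hypothetical, a configured jms component is assumed, and the three routes are placed in one RouteBuilder only for brevity):

```java
import org.apache.camel.builder.RouteBuilder;

public class ParallelQueueRoutes extends RouteBuilder {
    @Override
    public void configure() {
        // Each from(...) gets its own consumer, so these routes receive
        // messages independently of one another.
        from("jms:queue:orders").to("log:orders");
        from("jms:queue:invoices").to("log:invoices");

        // threads(5) continues processing on a thread pool with a core
        // size of 5, so this one route can also work on several
        // messages at the same time.
        from("jms:queue:payments")
            .threads(5)
            .to("log:payments");
    }
}
```

Splitting the routes across three RouteBuilder classes instead should not change this, since all of them are added to the same context and started together.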

How to Efficiently Pass XML Documents between Camel routes through ActiveMQ

I have a series of Camel routes that retrieve, transform, split, and combine XML documents. This all works fine.
These routes are linked by ActiveMQ topics and queues.
All good.
However, in some cases I have a large number of documents to process, and because Camel's JMS component transforms XML documents into text for the message, the queues result in the rendering of the XML to string, and re-parsing to documents more than once, which is a significant processing overhead.
I've tried setting the JMS producer jmsMessageType to Object, but when the consumer retrieves the message, and I output exchange.getIn().getBody().getClass().getCanonicalName() I get java.lang.String.
What settings would I need to put on the producer and the consumer for the XML Document objects to be passed directly through the ActiveMQ topic/queue without being rendered to String and re-parsed?
Thanks for looking.
Xerces supports Java serialization of its DOMs and Camel supports Java serialization. It's questionable though if it's really more efficient, quoting from Xerces documentation:
Some rough measurements have shown that XML serialization performs better than Java object serialization and that XML instance documents require less storage space than object serialized DOMs.
And there's another catch: Camel's Java serialization data format is deprecated, and there's a risk that it will be removed in an upcoming Camel version. The implementation is very straightforward though, and in case it gets removed you could add a custom data format replicating the current Camel SerializationDataFormat.
If you want to try it out though, the producer could look like:
from(...)
    // you need to have a Xerces DOM object in the exchange body at this point
    .marshal().serialization()
    .to("jms:myqueue");
...and the consumer:
from("jms:myqueue")
    .unmarshal().serialization()
    // you should have your Xerces DOM again
    ...

Building Apache Camel Routes Dynamically

I am working on an application that uses Apache Camel to flow a single request message (input) through some initial Camel components/logic and then to a multicast at which point the route branches out into multiple branches. The purpose of each branch is to retrieve data from a specific web service (or other back-end data source, e.g. database) and then after the web service invocation / data retrieval operation completes, each of the branches dumps its output data in the same way via a custom bean endpoint. I expect to eventually have approximately 40 different branches in the Camel route, each of which might flow through a different set of Camel components in order to prepare its request, submit the request, process the response, etc... I anticipate that a fair number of the branches will be quite similar (e.g. all SOAP invocations quite similar, all REST invocations quite similar, etc.) and so have concocted an approach whereby a config file stores the list of back-end data sources to be invoked/retrieved-from along with the ability to define (indirectly) the route that should be taken to reach each of those sources. The config file looks something like this:
[a]
route=X + Y
Y.url=http://someservice
[b]
route=Z
Z.someproperty=123
And then I have code that reads through that config file and treats each of the "sections" (e.g. "[a]", "[b]", etc.) as a branch (i.e. a destination out of the multicast) and relies on classes that are dynamically instantiated (e.g. XRouteSegment, YRouteSegment, ZRouteSegment) in order to each in-turn populate/define the route for its specific branch. As some examples, I have built RouteSegment helper classes for wiring up components such as Velocity, CXF, CXF-RS, for data marshalling/unmarshalling, etc... based on properties set in config file.
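To make the parsing step concrete, here is a minimal sketch of how such a file could be read into one entry per branch. It is not the actual application code: BranchConfigParser and Branch are hypothetical names, and the format is assumed to be exactly the "[section]", "route=A + B", "Segment.key=value" shape shown above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Camel: turns the config text into branches.
final class BranchConfigParser {

    /** One multicast branch: ordered segment names plus its raw properties. */
    static final class Branch {
        final List<String> segments = new ArrayList<>();
        final Map<String, String> properties = new LinkedHashMap<>();
    }

    /** Parses the whole config text; the map keeps the sections in file order. */
    static Map<String, Branch> parse(String text) {
        Map<String, Branch> branches = new LinkedHashMap<>();
        Branch current = null;
        for (String raw : text.split("\\R")) {
            String line = raw.trim();
            if (line.isEmpty()) {
                continue;
            }
            if (line.startsWith("[") && line.endsWith("]")) {
                // "[a]" opens a new branch named "a"
                current = new Branch();
                branches.put(line.substring(1, line.length() - 1).trim(), current);
                continue;
            }
            int eq = line.indexOf('=');
            if (current == null || eq < 0) {
                throw new IllegalArgumentException("Unexpected line: " + line);
            }
            String key = line.substring(0, eq).trim();
            String value = line.substring(eq + 1).trim();
            if (key.equals("route")) {
                // "X + Y" becomes the ordered segment list [X, Y]
                for (String segment : value.split("\\+")) {
                    current.segments.add(segment.trim());
                }
            } else {
                // e.g. "Y.url" stays a plain property of this branch
                current.properties.put(key, value);
            }
        }
        return branches;
    }
}
```

Each Branch could then be handed to the matching RouteSegment classes, which would look up their own "X." or "Y." prefixed properties.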
As far as initialization of the Camel context goes, it starts out in a fairly typical way with a single RouteBuilder which builds out the first part of the route up to the multicast. But then I go into a for loop and loop through all of the sources found in the config file (e.g. "a", "b", etc.) and create seda nodes for each of those which the multicast flows to. And then I call into each of the RouteSegment instances associated with a given source (e.g. X + Y) and allow those to add to the RouteDefinition as they need (e.g. from their seda start point going forward). And then back in my "main" RouteBuilder I tack on some final routing/components that is to be the same for all of the branches (i.e. the logic that forces each of the branches to store its data via the same custom bean).
The code works just fine, but I am questioning whether this approach is overkill and/or whether there is some easier/cleaner way of doing this that I am overlooking. Would I be better off just having individual Java classes (i.e. RouteBuilders) for each of the branches (in addition to the "trunk" and "tails" of the route)? What I was trying to avoid was having too much duplicated logic/code across all of those classes ... e.g. 20 classes all pulling data from SOAP web services in pretty much exactly the same way. So I am using a RouteSegment instance like "X" referenced above as re-usable shorthand for what would otherwise be a sequence of different Camel Java DSL calls (e.g. from/to/process/log/etc ... with parameters to control the specifics of the individual statements). Are there any other strategies/approaches I should consider in order to dynamically build out a Camel route (+ sizable number of branches) at runtime (e.g. within a for loop, or via some sort of reflection/discovery process (app runs using Spring Boot))?
Thanks in advance for any ideas you might be able to provide/suggest that I might not have thought of / tried yet!
I just want to throw in some aspects that I am missing in your description.
If I understand your description correctly, the branch components called by the multicast are not real components but rather building blocks for assembling Camel routes at runtime. That sounds like they are not testable and cannot be started standalone (but perhaps you just did not explain that aspect).
If you built individual small components (each with its own RouteBuilder), you would have something similar to a microservice architecture: small units to develop and deploy individually.
Since you use Spring Boot, you have autodiscovery of routes, so they are kind of "pluggable". The components are also testable using Camel route tests etc.
The components would be much more "static" and small standalone projects. This also ensures a fast development roundtrip when you work on the components.
But as you write, this can lead to lots of redundant code. So I guess you have to decide what is more important for you.

Enabling Replay mechanism with camel from messages from DB

I am trying to implement a replay mechanism with Camel, i.e. I have to retrieve all the messages already persisted and forward them to the appropriate Camel route for reprocessing. This will be triggered by a Quartz scheduler.
I achieved this as follows.
1) Once the Quartz scheduler is triggered, forward to a processor which queries the DB, gets the messages as a list, and sets that list in the Camel exchange properties.
2) Use the Camel loop, in which a LoopProcessor sets the appropriate XML on the exchange in each iteration.
3) Forward it to ActiveMQ, which forwards it to the appropriate Camel route for reprocessing.
Everything works fine.
I see the following two issues:
a) There might be a large number of messages (10,000+) held in the Camel exchange properties as a List. I can remove the message which is sent for processing, but I think this will do more good on performance and memory usage.
b) I don't want to forward all the 10,000+ messages to ActiveMQ, which I guess would exhaust it. Is there a better mechanism to forward 10,000+ messages to ActiveMQ?
-- I am thinking of using SEDA/VM (using different Camel contexts). How well would this work, considering the questions above?
Thanks.
Regards
Senthil Kumar Sekar
If the number of messages is a problem, then not all messages should be loaded at once.
Process as follows (see also my answer to your other SO question):
1. Limit the number of results when querying the DB.
2. Set a marker (e.g. processedFlag) on the DB entries that have been processed.
3. Go back to step 1 and query only the entries not yet processed, until all records are processed.
However, you should also test the ActiveMQ approach to see whether 10,000+ messages are really a problem.
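The batching loop described above (limit the query, flag what has been processed, repeat) can be sketched in plain Java. This is only an illustration under assumptions: an in-memory list stands in for the DB table, and ReplayBatcher, StoredMessage and the forward callback are hypothetical names. In a real route the callback would be the hand-off to ActiveMQ or SEDA.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical helper showing the "limited query + processed flag" loop.
final class ReplayBatcher {

    /** Stand-in for one persisted message row. */
    static final class StoredMessage {
        final String payload;
        boolean processedFlag;

        StoredMessage(String payload) {
            this.payload = payload;
        }
    }

    /** Step 1: return at most limit rows that are not yet flagged as processed. */
    static List<StoredMessage> fetchUnprocessed(List<StoredMessage> table, int limit) {
        List<StoredMessage> batch = new ArrayList<>();
        for (StoredMessage message : table) {
            if (batch.size() == limit) {
                break;
            }
            if (!message.processedFlag) {
                batch.add(message);
            }
        }
        return batch;
    }

    /** Repeats steps 1 to 3 until nothing is left; returns the number of batches. */
    static int replayAll(List<StoredMessage> table, int limit, Consumer<String> forward) {
        int batches = 0;
        List<StoredMessage> batch;
        while (!(batch = fetchUnprocessed(table, limit)).isEmpty()) {
            for (StoredMessage message : batch) {
                forward.accept(message.payload); // hand over for reprocessing
                message.processedFlag = true;    // step 2: mark as processed
            }
            batches++;
        }
        return batches;
    }
}
```

With this shape only one batch is held in memory at a time, instead of the full list of 10,000+ messages sitting in the exchange properties.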
