How can I preserve message history across a split - apache-camel

It seems like if I print the message history before and after a split, history traced before split is lost after the split. is there any way to preserve the message history

When the exchange is split() each sub-exchange is newly created by default.
What you're looking for is sharing the unit of work; for example,
.split(body()).shareUnitOfWork()
See the section Sharing Unit of Work in Camel's Splitter documentation.

Related

in-Message copied in out-Message

I have this simple route in my RouteBuilder.
from("amq:MyQueue").routeId(routeId).log(LoggingLevel.DEBUG, "Log: ${in.headers} - ${in.body}")
As stated in the doc for HTTP-component:
Camel will store the HTTP response from the external server on the OUT body. All headers from the IN message will be copied to the OUT message, ...
I would like to know if this concept also applies to amq-component, routeId, and log? Is it the default behaviour, that IN always gets copied to OUT?
Thank you,
Hadi
First of all: The concept of IN and OUT messages is deprecated in Camel 3.x.
This is mentioned in the Camel 3 migration guide and also annotated on the getOut method of the Camel Exchange.
However, it is not (yet) removed, but what you can take from it: don't care about the OUT message. Use the getMessage method and don't use getIn and getOut anymore.
To answer your question:
Yes, most components behave like this
Every step in the route takes the (IN) message and processes it
The body is typically overwritten with the new processing result
The headers typically stay, new headers can be added
So while the Camel Exchange traverses the route, typically the body is continuously updated and the header list grows.
However, some components like aggregator create new messages based on an AggregationStrategy. In such cases nothing is copied automatically and you have to implement the strategy to your needs.

Camel route with two MQs in it

I have a flow that I've defined using camel but I ended up using 6+ routes to accomplish what I wanted. Is there a way to do this with one or fewer than 4 routes?
I've outlined the routes below. The first one reads a message (containing an ID) from MQ1 and produces N-messages (expected to be in thousands) in MQ2 based on what it finds in a database for that one ID (db is not shown here). The messages produced in MQ2 have a type field defined, and based on that the choice element filters the messages to put them in the right MQ. Then each MQ(3,4,5) is processed by it's own processor (ProcessorMQ3,4,5). Once that's done, they output the result of the processing to MQ6 and ProcessorMQ6 reads the result and updates the database (also not shown).
Route 1: [Start]--->[MQ1]--->[ProcessorMQ1]--->[MQ2]
Route 2: [MQ2]--->[Choice]--->[MQ3,MQ4,MQ5 based on header value]
Route 3: [MQ3]--->[ProcessorMQ3]--->[MQ6]
Route 4: [MQ4]--->[ProcessorMQ4]--->[MQ6]
Route 5: [MQ5]--->[ProcessorMQ5]--->[MQ6]
Route 6: [MQ6]--->[ProcessorMQ6]--->[End]
Is there a way to do this using one route or am I doing this correctly? I will need to introduce 10 more "types" so the range MQ3-5 will increase by 10.
As far as I can see, your route design looks better now. Why do you want to minimise ? It will complicate your exception handling at the least.
I would always prefer to create micro routes which are functionally independent to each other, that way its makes life easier to write exception handling & unit test cases in a better way. (Even it is very good for refactoring, in the future if you want to move these routes to its own unit of deployment)
Yes, you are doing it in a correct way !
Cheers

Camel condition on aggregate of messages

I'm looking for a way to conditionally handle messages based on the aggregation of messages. I've looked into a lot of ways to do this, but it seems that Apache Camel doesn't support it. I'll explain the scenario and then the solutions I tried.
Scenario:
I'm trying to conditionally clean a directory. I poll from the directory every x days and fetch all the files (file://...). I route this into an aggregation, that aggregates the files into a single size (directorySize). I then check if this size passes a certain threshold.
Here is where the problem lies. I now want to remove certain files if this condition passes, but I don't have access to the original messages anymore because they were aggregated in a new exchange.
Solutions:
I tried to fetch the files again to process them. Problem is that you can't make a consumer fetch on demand as far as I know. I tried using pollEnrich, but that will only fetch a single file and not all files in the directory.
I tried to filter/stop the parent route. The problem here is that filter()/choice...stop()/end() will only stop the aggregated route with the directory size and not the parent route with the file messages. I can't conditionally process these.
I tried to move the aggregated condition to another route that I would call, but this causes the same problem as the first solution.
Things I consider doing:
Rewrite the aggregation strategy to not only aggregate the size, but also the files itself into a groupedExchange. This way I can split the aggregation again after the check. I don't really like this solution because it causes a lot boilerplate, both in code as during runtime.
Move the file size calculator to a processor instead of the aggregator. This would defeat the purpose of using camel in the first place.. I would manually be fetching the files and adding the sizes.. And that for every single file..
Use a ControlBus to dynamically start the delete route on that directory. Once again a lot of workaround to achieve something that I feel should be able to be done in a simple route.
I would like to set the calculated size on every parent message, but I have no clue how this could be achieved?
Another way to stop the parent route that I haven't thought of?
I'm a bit stunned that you can't elegantly filter messages based on the aggregation of these messages. Is there something that I missed in Camel that would provide an elegant solution? Or is this a case of the least bad solution?
Simple Schema
Message(File)
Message(File) --> AggregatedMessage(directorySize) --> delete certain Files?
Message(File)
Camel is really awesome, but sometimes it's sure difficult to see exactly which design pattern to use ;)
Firstly, you need to keep a copy of the file objects, because you don't know whether to delete them or not until you reach your threshold - there are basically (at least) two ways to do this.
Alternative 1
The first way is to use a List in an exchange property. This property will hang around no matter what you do with the exchange body. If you have a look at the source code for GroupedExchangeAggregationStrategy, it does precisely this:
list = new ArrayList<Exchange>();
answer.setProperty(Exchange.GROUPED_EXCHANGE, list);
// ...
list.add(newExchange);
Or you could do the same thing manually on your own exchange property. In any case, it's completely fine to use the Grouped aggregation strategy as you have done.
Alternative 2
The second way to "keep" old messages is to send a copy to a stopped SEDA queue. So you would do to("seda:xyz"). You define this queue as .noAutoStartup(). Then you can send messages to it and they will queue up on an internal queue, managed by camel. When you want to process the messages, you simply start it up via controlbus and stop it again afterwards.
Generally, messing around with starting and stopping queues should be avoided unless absolutely necessary, but that's certainly another way to do it
Suggested solution
I suggest you do as you have done (i.e. alternative 1):
aggregate via GroupedExchangeAggregationStrategy to keep the individual files in a list
Compute the total file size (use a processor, or do it along the way with a custom aggregation strategy)
Use a filter(simple("${body} < 123"))
"Unwind" your aggregation via a splitter(simple("${property.CamelGroupedExchange}"))
Delete your files one by one
Please let me know if this doesn'y makes sense, or if I have misunderstood your problem in any way.

Camel # route steps vs memory/performance

It might be a silly question, but say I have a hughe message that I want to process with Camel. How will the number of steps in my route affect the memory usage? Does camel deep copy my message payload for every step in the route, even if the DSL-step only reads from the message or does it do something smart here?
Is it better to keep the route down and do things in a "hughe" bean for large messages or not?
This is an example route that does various things, but not changing the payload.
from("foo:bar")
.log(..)
.setProperty(..)
.setHeader(..)
.log(..)
.choice()
.when(simple(... ) )
.log(..)
.to(..)
.when(simple(..))
.log(..)
.to(..)
.end()
from my understanding, for a simple pipelined route like this, an Exchange is created containing the body once and passed along each step in the route. Other EIPs do cause the Exchange to be copied though (like multicast, wiretap, etc)...
as well, if you have steps along the route which interface with external resources which could result in any type of copy/clone/conversion/serialization of the body unnecessarily, then you might use something like the claim check pattern to reduce this.
The camel exchange is the same through the route the message objects are copied or recereated in the steps. The body is just referenced though. So normally you should not have a problem.
This is handled by each camel processor individually though. So some of the processors may copy the body. Typically this is the case when the processor really works on the body. So in this case it can not be avoided.

What's the difference between "direct:" and to() in Apache Camel?

The DirectComponent documentation gives the following example:
from("activemq:queue:order.in")
.to("bean:orderServer?method=validate")
.to("direct:processOrder");
from("direct:processOrder")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
Is there any difference between that and the following?
from("activemq:queue:order.in")
.to("bean:orderServer?method=validate")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
I've tried to find documentation on what the behaviour of the to() method is on the Java DSL, but beyond the RouteDefinition javadoc (which gives the very curt "Sends the exchange to the given endpoint") I've come up blank :(
In the very case above, you will not notice much difference. The "direct" component is much like a method call.
Once you start build a bit more complex routes, you will want to segment them in several different parts for multiple reasons.
You can, for instance, create "sub routes" that could be reused among multiple routes in your Camel context. Much like you segment out methods in regular programming to allow reusability and make code more clear. The same goes for sub routes using, for instance the direct component.
The same approach can be extended. Say you want multiple protocols to be used as endpoints to your route. You can use the direct endpoint to create the main route, something like this:
// Three endpoints to one "main" route.
from("activemq:queue:order.in")
.to("direct:processOrder");
from("file:some/file/path")
.to("direct:processOrder");
from("jetty:http://0.0.0.0/order/in")
.to("direct:processOrder");
from("direct:processOrder")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
Another thing is that one route is created for each "from()" clause in DSL. A route is an artifact in Camel, and you could do certain administrative tasks towards it with the Camel API, such as start, stop, add, remove routes dynamically. The "to" clause is just an endpoint call.
Once starting to do some real cases with somewhat complexity in Camel, you will note that you cannot get too many "direct" routes.
Direct Component is used to name the logical segment of the route. This is similar process to naming procedures in structural programming.
In your example there is no difference in message flow. In the terms of structural programming, we could say that you make a kind of inline expansion to your route.
Another difference is Direct component doesn't has any thread pool, the direct consumer process method is invoked by the calling thread of direct producer.
Mainly its used for break the complex route configuration like in java we used to have method for reusability. And also by configuring threads at direct route we can reduce the work for calling thread .
from(A).to(B).to(OUT)
is chaining
A --- B --- OUT
But
from(A ).to( X)
from(B ).to( X)
from( X).to( OUT )
where X is a direct:?
is basically like a join
A
\____ OUT
/
B
obviously these are different behaviours, and with the second you could implement anylogic you wanted, not just a serial chain

Resources