Camel route with two MQs in it - apache-camel

I have a flow that I've defined using camel but I ended up using 6+ routes to accomplish what I wanted. Is there a way to do this with one or fewer than 4 routes?
I've outlined the routes below. The first one reads a message (containing an ID) from MQ1 and produces N-messages (expected to be in thousands) in MQ2 based on what it finds in a database for that one ID (db is not shown here). The messages produced in MQ2 have a type field defined, and based on that the choice element filters the messages to put them in the right MQ. Then each MQ(3,4,5) is processed by it's own processor (ProcessorMQ3,4,5). Once that's done, they output the result of the processing to MQ6 and ProcessorMQ6 reads the result and updates the database (also not shown).
Route 1: [Start]--->[MQ1]--->[ProcessorMQ1]--->[MQ2]
Route 2: [MQ2]--->[Choice]--->[MQ3,MQ4,MQ5 based on header value]
Route 3: [MQ3]--->[ProcessorMQ3]--->[MQ6]
Route 4: [MQ4]--->[ProcessorMQ4]--->[MQ6]
Route 5: [MQ5]--->[ProcessorMQ5]--->[MQ6]
Route 6: [MQ6]--->[ProcessorMQ6]--->[End]
Is there a way to do this using one route or am I doing this correctly? I will need to introduce 10 more "types" so the range MQ3-5 will increase by 10.

As far as I can see, your route design looks better now. Why do you want to minimise ? It will complicate your exception handling at the least.
I would always prefer to create micro routes which are functionally independent to each other, that way its makes life easier to write exception handling & unit test cases in a better way. (Even it is very good for refactoring, in the future if you want to move these routes to its own unit of deployment)
Yes, you are doing it in a correct way !
Cheers

Related

React Simple Global Entity Cache instead of Flux/React/etc

I am writing a little "fun" Scala/Scala.js project.
On my server I have Entities which are referenced by uuid-s
(inside Ref-s).
For the sake of "fun", I don't want to use flux/redux architecture but still use React on the client (with ScalaJS-React).
What I am trying to do instead is to have a simple cache, for example:
when a React UserDisplayComponent wants the display the Entity User with uuid=0003
then the render() method calls to the Cache (which is passed in as a prop)
let's assume that this is the first time that the UserDisplayComponent asks for this particular User (with uuid=0003) and the Cache does not have it yet
then the Cache makes an AjaxCall to fetch the User from the server
when the AjaxCall returns the Cache triggers re-render
BUT ! now when the component is asking for the User from the Cache, it gets the User Entity from the Cache immediately and does not trigger an AjaxCall
The way I would like to implement this is the following :
I start a render()
"stuff" inside render() asks the Cache for all sorts of Entities
Cache returns either Loading or the Entity itself.
at the end of render the Cache sends all the AjaxRequest-s to the server and waits for all of them to return
once all AjaxRequests have returned (let's assume that they do - for the sake of simplicity) the Cache triggers a "re-render()" and now all entities that have been requested before are provided by the Cache right away.
of course it can happen that the newly arrived Entity-s will trigger the render() to fetch more Entity-s if for example I load an Entity that is for example case class UserList(ul: List[Ref[User]]) type. But let's not worry about this now.
QUESTIONS:
1) Am I doing something really wrong if I am doing the state handling this way ?
2) Is there an already existing solution for this ?
I looked around but everything was FLUX/REDUX etc... along these lines... - which I want to AVOID for the sake of :
"fun"
curiosity
exploration
playing around
I think this simple cache will be simpler for my use-case because I want to take the "REF" based "domain model" over to the client in a simple way: as if the client was on the server and the network would be infinitely fast and zero latency (this is what the cache would simulate).
Consider what issues you need to address to build a rich dynamic web UI, and what libraries / layers typically handle those issues for you.
1. DOM Events (clicks etc.) need to trigger changes in State
This is relatively easy. DOM nodes expose callback-based listener API that is straightforward to adapt to any architecture.
2. Changes in State need to trigger updates to DOM nodes
This is trickier because it needs to be done efficiently and in a maintainable manner. You don't want to re-render your whole component from scratch whenever its state changes, and you don't want to write tons of jquery-style spaghetti code to manually update the DOM as that would be too error prone even if efficient at runtime.
This problem is mainly why libraries like React exist, they abstract this away behind virtual DOM. But you can also abstract this away without virtual DOM, like my own Laminar library does.
Forgoing a library solution to this problem is only workable for simpler apps.
3. Components should be able to read / write Global State
This is the part that flux / redux solve. Specifically, these are issues #1 and #2 all over again, except as applied to global state as opposed to component state.
4. Caching
Caching is hard because cache needs to be invalidated at some point, on top of everything else above.
Flux / redux do not help with this at all. One of the libraries that does help is Relay, which works much like your proposed solution, except way more elaborate, and on top of React and GraphQL. Reading its documentation will help you with your problem. You can definitely implement a small subset of relay's functionality in plain Scala.js if you don't need the whole React / GraphQL baggage, but you need to know the prior art.
5. Serialization and type safety
This is the only issue on this list that relates to Scala.js as opposed to Javascript and SPAs in general.
Scala objects need to be serialized to travel over the network. Into JSON, protobufs, or whatever else, but you need a system for this that will not involve error-prone manual work. There are many Scala.js libraries that address this issue such as upickle, Autowire, endpoints, sloth, etc. Key words: "Scala JSON library", or "Scala type-safe RPC", depending on what kind of solution you want.
I hope these principles suffice as an answer. When you understand these issues, it should be obvious whether your solution will work for a given use case or not. As it is, you didn't describe how your solution addresses issues 2, 4, and 5. You can use some of the libraries I mentioned or implement your own solutions with similar ideas / algorithms.
On a minor technical note, consider implementing an async, Future-based API for your cache layer, so that it returns Future[Entity] instead of Loading | Entity.

Camel condition on aggregate of messages

I'm looking for a way to conditionally handle messages based on the aggregation of messages. I've looked into a lot of ways to do this, but it seems that Apache Camel doesn't support it. I'll explain the scenario and then the solutions I tried.
Scenario:
I'm trying to conditionally clean a directory. I poll from the directory every x days and fetch all the files (file://...). I route this into an aggregation, that aggregates the files into a single size (directorySize). I then check if this size passes a certain threshold.
Here is where the problem lies. I now want to remove certain files if this condition passes, but I don't have access to the original messages anymore because they were aggregated in a new exchange.
Solutions:
I tried to fetch the files again to process them. Problem is that you can't make a consumer fetch on demand as far as I know. I tried using pollEnrich, but that will only fetch a single file and not all files in the directory.
I tried to filter/stop the parent route. The problem here is that filter()/choice...stop()/end() will only stop the aggregated route with the directory size and not the parent route with the file messages. I can't conditionally process these.
I tried to move the aggregated condition to another route that I would call, but this causes the same problem as the first solution.
Things I consider doing:
Rewrite the aggregation strategy to not only aggregate the size, but also the files itself into a groupedExchange. This way I can split the aggregation again after the check. I don't really like this solution because it causes a lot boilerplate, both in code as during runtime.
Move the file size calculator to a processor instead of the aggregator. This would defeat the purpose of using camel in the first place.. I would manually be fetching the files and adding the sizes.. And that for every single file..
Use a ControlBus to dynamically start the delete route on that directory. Once again a lot of workaround to achieve something that I feel should be able to be done in a simple route.
I would like to set the calculated size on every parent message, but I have no clue how this could be achieved?
Another way to stop the parent route that I haven't thought of?
I'm a bit stunned that you can't elegantly filter messages based on the aggregation of these messages. Is there something that I missed in Camel that would provide an elegant solution? Or is this a case of the least bad solution?
Simple Schema
Message(File)
Message(File) --> AggregatedMessage(directorySize) --> delete certain Files?
Message(File)
Camel is really awesome, but sometimes it's sure difficult to see exactly which design pattern to use ;)
Firstly, you need to keep a copy of the file objects, because you don't know whether to delete them or not until you reach your threshold - there are basically (at least) two ways to do this.
Alternative 1
The first way is to use a List in an exchange property. This property will hang around no matter what you do with the exchange body. If you have a look at the source code for GroupedExchangeAggregationStrategy, it does precisely this:
list = new ArrayList<Exchange>();
answer.setProperty(Exchange.GROUPED_EXCHANGE, list);
// ...
list.add(newExchange);
Or you could do the same thing manually on your own exchange property. In any case, it's completely fine to use the Grouped aggregation strategy as you have done.
Alternative 2
The second way to "keep" old messages is to send a copy to a stopped SEDA queue. So you would do to("seda:xyz"). You define this queue as .noAutoStartup(). Then you can send messages to it and they will queue up on an internal queue, managed by camel. When you want to process the messages, you simply start it up via controlbus and stop it again afterwards.
Generally, messing around with starting and stopping queues should be avoided unless absolutely necessary, but that's certainly another way to do it
Suggested solution
I suggest you do as you have done (i.e. alternative 1):
aggregate via GroupedExchangeAggregationStrategy to keep the individual files in a list
Compute the total file size (use a processor, or do it along the way with a custom aggregation strategy)
Use a filter(simple("${body} < 123"))
"Unwind" your aggregation via a splitter(simple("${property.CamelGroupedExchange}"))
Delete your files one by one
Please let me know if this doesn'y makes sense, or if I have misunderstood your problem in any way.

How to make another search call inside SOLR

I would like to implement some kind of fallback querying mechanism inside SOLR. That is if a first search call doesn't generate enough results, I would like to make another call with different ranking and then combine the results and return it. I guess this can be done in the SOLR client side but I hope to do this inside the SOLR. By reading documentation, I guess I need to implement a search component and then add it next to "query" component? Any reference or experience in this regard would be highly appreciated.
SearchHandler calls all the registered search components in order you define, and there are several stages (prepare,process etc.).
You know the number of results only after the distributed processing phase (I suppose you work with distributed mode),so your custom search component should check the number of results in response object and run its own query if necessary.
Actually you may inherit (or wrap) a regular QueryComponent for that, augmenting its process/distributed process phases.

Camel # route steps vs memory/performance

It might be a silly question, but say I have a hughe message that I want to process with Camel. How will the number of steps in my route affect the memory usage? Does camel deep copy my message payload for every step in the route, even if the DSL-step only reads from the message or does it do something smart here?
Is it better to keep the route down and do things in a "hughe" bean for large messages or not?
This is an example route that does various things, but not changing the payload.
from("foo:bar")
.log(..)
.setProperty(..)
.setHeader(..)
.log(..)
.choice()
.when(simple(... ) )
.log(..)
.to(..)
.when(simple(..))
.log(..)
.to(..)
.end()
from my understanding, for a simple pipelined route like this, an Exchange is created containing the body once and passed along each step in the route. Other EIPs do cause the Exchange to be copied though (like multicast, wiretap, etc)...
as well, if you have steps along the route which interface with external resources which could result in any type of copy/clone/conversion/serialization of the body unnecessarily, then you might use something like the claim check pattern to reduce this.
The camel exchange is the same through the route the message objects are copied or recereated in the steps. The body is just referenced though. So normally you should not have a problem.
This is handled by each camel processor individually though. So some of the processors may copy the body. Typically this is the case when the processor really works on the body. So in this case it can not be avoided.

What's the difference between "direct:" and to() in Apache Camel?

The DirectComponent documentation gives the following example:
from("activemq:queue:order.in")
.to("bean:orderServer?method=validate")
.to("direct:processOrder");
from("direct:processOrder")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
Is there any difference between that and the following?
from("activemq:queue:order.in")
.to("bean:orderServer?method=validate")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
I've tried to find documentation on what the behaviour of the to() method is on the Java DSL, but beyond the RouteDefinition javadoc (which gives the very curt "Sends the exchange to the given endpoint") I've come up blank :(
In the very case above, you will not notice much difference. The "direct" component is much like a method call.
Once you start build a bit more complex routes, you will want to segment them in several different parts for multiple reasons.
You can, for instance, create "sub routes" that could be reused among multiple routes in your Camel context. Much like you segment out methods in regular programming to allow reusability and make code more clear. The same goes for sub routes using, for instance the direct component.
The same approach can be extended. Say you want multiple protocols to be used as endpoints to your route. You can use the direct endpoint to create the main route, something like this:
// Three endpoints to one "main" route.
from("activemq:queue:order.in")
.to("direct:processOrder");
from("file:some/file/path")
.to("direct:processOrder");
from("jetty:http://0.0.0.0/order/in")
.to("direct:processOrder");
from("direct:processOrder")
.to("bean:orderService?method=process")
.to("activemq:queue:order.out");
Another thing is that one route is created for each "from()" clause in DSL. A route is an artifact in Camel, and you could do certain administrative tasks towards it with the Camel API, such as start, stop, add, remove routes dynamically. The "to" clause is just an endpoint call.
Once starting to do some real cases with somewhat complexity in Camel, you will note that you cannot get too many "direct" routes.
Direct Component is used to name the logical segment of the route. This is similar process to naming procedures in structural programming.
In your example there is no difference in message flow. In the terms of structural programming, we could say that you make a kind of inline expansion to your route.
Another difference is Direct component doesn't has any thread pool, the direct consumer process method is invoked by the calling thread of direct producer.
Mainly its used for break the complex route configuration like in java we used to have method for reusability. And also by configuring threads at direct route we can reduce the work for calling thread .
from(A).to(B).to(OUT)
is chaining
A --- B --- OUT
But
from(A ).to( X)
from(B ).to( X)
from( X).to( OUT )
where X is a direct:?
is basically like a join
A
\____ OUT
/
B
obviously these are different behaviours, and with the second you could implement anylogic you wanted, not just a serial chain

Resources