Apache Camel - split and aggregation bug

I tried to file a bug in Camel's issue tracker, but it is not easy to get access there at the moment, so maybe someone can help me here.
I'm gradually migrating to the newest Camel version. Currently I'm going from 3.7.3 to 3.11.7, but I checked that this bug also happens on 3.20.1.
OK, to the point.
When I have a pipeline like this:
.to(SPLIT_WORKER_ROUTE_ID, OTHER_ROUTE_ID)
the two routes should execute in sequence. But when SPLIT_WORKER_ROUTE_ID contains aggregation code like this:
.split(body())
.process(splitWorkerProcessor)
.aggregate(exchangeProperty(CORRELATION_ID), new SplitAggregator()).completionSize(exchangeProperty(SPLIT_SIZE))
.to(AFTER_SPLIT_ROUTE_ID)
before it goes to AFTER_SPLIT_ROUTE_ID the OTHER_ROUTE_ID kicks in and starts to run in parallel with SPLIT_WORKER_ROUTE_ID.
When I rewrite the code like this (or go back to Camel 3.7.3):
.split(body(), new SplitAggregator()).parallelProcessing()
.process(splitWorkerProcessor)
.end()
.to(AFTER_SPLIT_ROUTE_ID)
everything runs sequentially, as it should. Unfortunately, I need more complex aggregation conditions, so I'm afraid I cannot use this workaround, because the aggregation cannot be configured with this approach.
I guess that according to
https://camel.apache.org/manual/camel-3x-upgrade-guide-3_11.html#_aggregate_eip
something has changed in this area. (EDIT: I've just checked Camel 3.10 and it works properly so I'm 99.99% sure this change introduced this bug)
As a result, the order of execution is disturbed. With this:
.to(SPLIT_WORKER_ROUTE_ID, OTHER_ROUTE_ID)
OTHER_ROUTE_ID can complete before the sequence SPLIT_WORKER_ROUTE_ID -> AFTER_SPLIT_ROUTE_ID does.
Here is the log presenting the problem:
2023-02-02T18:45:41,229 [main] INFO direct://Main [...] [] [] [] [] - MAIN START
2023-02-02T18:45:41,230 [Camel (camel-1) thread #1 - Threads] INFO direct://splitWorker [...] [] [] [] [] - SPLIT_WORKER_ROUTE_ID START
2023-02-02T18:45:41,399 [Camel (camel-1) thread #1 - Threads] INFO direct://other [...] [] [] [] [] - OTHER_ROUTE_ID START
2023-02-02T18:45:41,399 [Camel (camel-1) thread #3 - Aggregator] INFO direct://splitWorker [...] [] [] [] [] - Aggregation just finished inside SPLIT_WORKER_ROUTE_ID START!
2023-02-02T18:45:41,399 [Camel (camel-1) thread #3 - Aggregator] INFO direct://splitWorker [...] [] [] [] [] - SPLIT_WORKER_ROUTE_ID FINISH
2023-02-02T18:45:41,400 [Camel (camel-1) thread #3 - Aggregator] INFO direct://afterSplit [...] [] [] [] [] - AFTER_SPLIT_ROUTE_ID START
2023-02-02T18:45:42,404 [Camel (camel-1) thread #1 - Threads] INFO direct://other [...] [] [] [] [] - OTHER_ROUTE_ID FINISH
2023-02-02T18:45:43,406 [Camel (camel-1) thread #3 - Aggregator] INFO direct://afterSplit [...] [] [] [] [] - AFTER_SPLIT_ROUTE_ID FINISH
2023-02-02T18:45:47,417 [Camel (camel-1) thread #6 - Delay] INFO direct://Main [...] [] [] [] [] - MAIN FINISH
I would appreciate any help, thanks a lot!

By default, the output of the aggregator is executed on a thread from the aggregator's thread pool. However, you can have the aggregator output run in the same thread as the calling route:
.aggregate(exchangeProperty(CORRELATION_ID), new SplitAggregator())
.completionSize(exchangeProperty(SPLIT_SIZE))
.executorService(new SynchronousExecutorService())
This technique is briefly described here.
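To illustrate the difference in plain Java (a minimal sketch, not Camel code): an executor that runs each task directly on the calling thread behaves like the synchronous executor service above, whereas a thread pool hands the task to another thread, so the caller can move on before the task's work is done. CallerThreadDemo, threadUsedBy and SAME_THREAD are hypothetical names used only for this illustration.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executor;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class CallerThreadDemo {

    // Hypothetical stand-in for a synchronous executor: runs the task on the calling thread.
    static final Executor SAME_THREAD = Runnable::run;

    // Submits a task to the executor and returns the name of the thread that actually ran it.
    static String threadUsedBy(Executor executor) {
        CompletableFuture<String> name = new CompletableFuture<>();
        executor.execute(() -> name.complete(Thread.currentThread().getName()));
        return name.join();
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        try {
            System.out.println("caller thread:        " + Thread.currentThread().getName());
            System.out.println("pool executor:        " + threadUsedBy(pool));
            System.out.println("same-thread executor: " + threadUsedBy(SAME_THREAD));
        } finally {
            pool.shutdown();
        }
    }
}
```

With the same-thread executor the reported name matches the caller's thread, which is the property that keeps the aggregator output inside the calling route's flow.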

Related

Camel K SEDA component throws out of memory error

I am getting an OutOfMemoryError in my Camel K route. I am using SEDA (one producer and multiple consumers) and feeding it 10,000 events per second. The error is thrown even with multiple consumers. Does anyone know how to improve the performance?
I have tried increasing the number of consumers, but that did not resolve the issue. I also tried increasing the memory, but the in-memory queue takes too much memory.
Error Message:
2021-12-01 17:42:58,401 syslog-basic-68d776c9b4-js4cd io.quarkus.bootstrap.runner.QuarkusEntryPoint[1] WARN [io.net.cha.AbstractChannelHandlerContext] (Camel (camel-1) thread #1 - NettyConsumerExecutorGroup) An exception 'java.lang.OutOfMemoryError: Java heap space' [enable DEBUG level for full stacktrace] was thrown by a user handler's exceptionCaught() method while handling the following exception:: java.lang.OutOfMemoryError: Java heap space
Here's my source code for the SEDA routes:
from("netty:udp://" + HOST + ":" + PORT + "?sync=false&receiveBufferSize=16777216")
// .log("in: ${body}")
.to("seda:next?size=5000&timeout=0&blockWhenFull=true");
from("seda:next?concurrentConsumers=5")
.unmarshal(myDataFormat)
.routePolicy(myPolicy)
.process(myProcessor)
.choice()
.when(body().contains(MISSING_REQUIRED_FIELDS)).to("log:warning").otherwise()
.log("log:${body}");

Fatalfallbackerrorhandler getting called though handled is set to true

I have a Camel route like the one below. Although I set handled(true), I don't understand why DefaultErrorHandler calls FatalFallbackErrorHandler after all retries are exhausted.
onException(Exception.class)
    .handled(true)
    .to("direct:errors"); // -----> (1)
from("direct:errors")
    .log("hello world ");
from("timer:testRoute")
    .routeId("testRoute_1")
    .throwException(new Exception("Dummy Exception"))
    .pollEnrich("file://source")
    .to("http://localhost:8080");
Logs:
20.04.03 11:46:53.907 INFO ad #6 - timer://testRoute route1 BreadCrumbId=ID-xxxxxx-1585894556662-0-4 | hello world
20.04.03 11:46:53.913 ERROR ad #6 - timer://testRoute mel.processor.DefaultErrorHandler BreadCrumbId=ID-xxxxxx-1585894556662-0-4 | Failed delivery for (MessageId: ID-xxxxxx-1585894556662-0-5 on ExchangeId: ID-xxxxxx-1585894556662-0-4). Exhausted after delivery attempt: 4 caught: java.lang.Exception: Dummy exception. Processed by failure processor: FatalFallbackErrorHandler[Channel[sendTo(direct://errors)]]
If I comment out line (1), DefaultErrorHandler does not call FatalFallbackErrorHandler.

This looks perfectly valid. I even tried it in a test class and it worked as you expect: the timer "generates" a log entry every second.
In fact, it is the forwarding of the message to direct:errors that is retried 5 times and does not succeed. This is strange, because the direct component is part of camel-core.
I would suggest checking your project dependencies. Do you have Camel JARs with different versions on your classpath? If you use Maven, you can try the Maven Enforcer Plugin to check for classpath conflicts.

Apache Camel - Parallel Routes Inflight Exchanges

I have a Camel context with many routes that start every 15 minutes via the Timer component.
These routes set some properties on the exchange (target host, query, and current date; I use a Processor to take the current date, subtract 12 hours, and convert it to GMT).
After setting these properties, another route is called via Direct to execute the HTTP GET. When the request finishes, another route is called to post the response to Artemis ActiveMQ.
The project is deployed on WildFly 13.
The problem is:
Sometimes the routes simply freeze and don't start after 15 minutes.
When I try to stop/start the route, I get the following log:
08:27:45,230 INFO [org.apache.camel.impl.DefaultShutdownStrategy] (Camel (camel-example) thread #70 - ShutdownTask) There are 1 inflight exchanges: InflightExchange: [exchangeId=ID-exchange-ID, fromRouteId=Route1, routeId=GetDataAutoBySinceTime, nodeId=toD7, elapsed=0, duration=216958569]
08:27:46,231 INFO [org.apache.camel.impl.DefaultShutdownStrategy] (Camel (camel-example) thread #70 - ShutdownTask) Waiting as there are still 1 inflight and pending exchanges to complete, timeout in 299 seconds. Inflights per route: [Route1 = 1]
08:27:46,231 INFO [org.apache.camel.impl.DefaultShutdownStrategy] (Camel (camel-example) thread #70 - ShutdownTask) There are 1 inflight exchanges: InflightExchange: [exchangeId=ID-exchange-ID, fromRouteId=Route1, routeId=GetDataAutoBySinceTime, nodeId=toD7, elapsed=0, duration=216959570]
08:27:47,231 INFO [org.apache.camel.impl.DefaultShutdownStrategy] (Camel (camel-example) thread #70 - ShutdownTask) Waiting as there are still 1 inflight and pending exchanges to complete, timeout in 298 seconds. Inflights per route: [Route1 = 1]
I don't know whether some process is stuck and preventing other processes from starting.
I thought about removing the generic routes (PostMessageInActiveMQ and GetDataAutomaticallyBySinceTime) and implementing the same code in the other routes (Route1, Route2 and Route3), but I don't think this is the best approach.
Routes:
Route1 (Route2 and Route3 are almost the same, just change properties values)
from("timer:Route1Timer?period=15m")
.routeId("Route1")
.autoStartup(false)
.setProperty("targetAddress", simple("hostname.route1"))
.process(new GetCurrentDate())
.setProperty("query",
simple("DataQuery%26URI=Route1%26format=xml%26Mode=since-time%26p1=${header.currentDate}"))
.to("direct:GetDataAutoBySinceTime");
GetDataAutomaticallyBySinceTime
from("direct:GetDataAutoBySinceTime")
.routeId("GetDataAutoBySinceTime")
.autoStartup(true)
.removeHeaders("*")
.setHeader(Exchange.HTTP_METHOD, constant("GET"))
.toD("http4:${header.targetAddress}/command=${header.query}%26httpClient.socketTimeout=3000")
.convertBodyTo(String.class, "utf-8")
.to("direct:PostMessageInActiveMQ");
PostMessageInActiveMQ
CamelArtemisComponent components = new CamelArtemisComponent();
getContext().addComponent("artemis", components.getArtemisComponent());
from("direct:PostMessageInActiveMQ")
.routeId("PostMessageInActiveMQ")
.autoStartup(true)
.convertBodyTo(String.class, "utf-8")
.inOnly("artemis:ARTEMIS.QUEUE");
Entire code: https://github.com/vitorvr/camel-example
EDIT:
Camel Version: 2.22.0

Apache camel how to implement an optional consumer for a wire tap

I set up some routes (Camel 2.22.1) that use a wire tap to log some data to a MongoDB.
from(DIRECT_NEXT).process(sendFile)
.wireTap( "direct:count-fetch?failIfNoConsumers=false" )
As you can see, I am using failIfNoConsumers=false.
from(COUNT_FETCH)
.routeId( MONGO_COUNT_FETCH_ROUTEID )
.autoStartup( false )
.process(countFetchProcessor)
.to(persistenceEndpoints.updateImage())
.log(LoggingLevel.DEBUG, "Counted fetch.");
MongoDB is an optional component; the whole application will run without it.
I am using Mongo's ServerMonitorListener to check whether Mongo is available, and I suspend or resume the route using Camel's ControlBus accordingly.
All of this works fine.
My problem is that Camel tries to send the exchanges to the non-running route for 30 seconds:
...
[DEBUG] 2019-01-03 14:02:45.848 [Camel (camel-1) thread #23 - WireTap] DirectBlockingProducer - Waited 20025 for consumer to be ready
...
Why does the producer block? Shouldn't the default value for "block" be false?
And after that we of course see an exception:
No consumers available on endpoint: direct://count-fetch?failIfNoConsumers=false
What is the best approach to make Camel discard the exchange immediately (how do I set the timeout?) without throwing any exception (this is normal application behavior, and an exception would only slow things down)?
UPDATE:
here is the complete exception:
[ERROR] 2019-01-07 10:21:22.702 [Camel (camel-1) thread #4 - WireTap] DefaultErrorHandler - Failed delivery for (MessageId: ID-moritz-1546852848013-0-3 on ExchangeId: ID-moritz-1546852848013-0-2). Exhausted after delivery attempt: 1 caught: org.apache.camel.component.direct.DirectConsumerNotAvailableException: No consumers available on endpoint: direct://update-all?failIfNoConsumers=false. Exchange[ID-moritz-1546852848013-0-2]
Message History
---------------------------------------------------------------------------------------------------------------------------------------
RouteId ProcessorId Processor Elapsed (ms)
[route4 ] [route4 ] [timer://updateAll ] [ 30065]
[route4 ] [log1 ] [log ] [ 1]
[route4 ] [to3 ] [direct:updateAll ] [ 19]
[route5 ] [process2 ] [Processor@0x4e92466a ] [ 9]
[route5 ] [process3 ] [Processor@0x1b29d52b ] [ 7]
[route5 ] [wireTap1 ] [wireTap[direct:update-all?failIfNoConsumers=false] ] [ 1]
Stacktrace
---------------------------------------------------------------------------------------------------------------------------------------
org.apache.camel.component.direct.DirectConsumerNotAvailableException: No consumers available on endpoint: direct://update-all?failIfNoConsumers=false. Exchange[ID-moritz-1546852848013-0-2]
at org.apache.camel.component.direct.DirectBlockingProducer.getConsumer(DirectBlockingProducer.java:67) ~[camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.component.direct.DirectBlockingProducer.process(DirectBlockingProducer.java:53) ~[camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.processor.SendDynamicProcessor$1.doInAsyncProducer(SendDynamicProcessor.java:178) ~[camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.impl.ProducerCache.doInAsyncProducer(ProducerCache.java:445) ~[camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.processor.SendDynamicProcessor.process(SendDynamicProcessor.java:160) ~[camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:548) [camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:201) [camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.processor.DelegateAsyncProcessor.process(DelegateAsyncProcessor.java:97) [camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.processor.WireTapProcessor$1.call(WireTapProcessor.java:160) [camel-core-2.22.1.jar:2.22.1]
at org.apache.camel.processor.WireTapProcessor$1.call(WireTapProcessor.java:155) [camel-core-2.22.1.jar:2.22.1]
at java.util.concurrent.FutureTask.run(FutureTask.java:266) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) [?:1.8.0_181]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) [?:1.8.0_181]
at java.lang.Thread.run(Thread.java:748) [?:1.8.0_181]

Make sure to check the documentation for the version of Camel you use, which is 2.22.x.
There you can see that block is enabled by default: https://github.com/apache/camel/blob/camel-2.22.x/camel-core/src/main/docs/direct-component.adoc
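As a rough sketch, assuming the block option documented on that page for the 2.22.x direct component, the wire tap endpoint could switch blocking off explicitly so that the producer does not wait for a consumer. The class name OptionalWireTapRoute and the direct:next endpoint are placeholders for illustration, not the constants from the question.

```java
import org.apache.camel.builder.RouteBuilder;

// Hypothetical route: the wire tap target may have no running consumer.
public class OptionalWireTapRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("direct:next")
            // block=false: do not wait for a consumer to become ready.
            // failIfNoConsumers=false: do not throw when no consumer is available.
            .wireTap("direct:count-fetch?block=false&failIfNoConsumers=false");
    }
}
```

If blocking is wanted but with a shorter wait, the same documentation page also lists a timeout option for the blocking case; check its exact semantics for your version before relying on it.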

Unable to process large files in camel

I am trying to do a simple transformation on a CSV file, but my program gets stuck without producing any output, and the console prints something like this:
22:38:02.001 [main] INFO o.a.camel.impl.DefaultCamelContext - Apache Camel 2.15.2 (CamelContext: camel-1) is shutting down
22:38:02.135 [main] INFO o.a.c.impl.DefaultShutdownStrategy - Starting to graceful shutdown 1 routes (timeout 300 seconds)
22:38:02.167 [main] DEBUG o.a.c.i.DefaultExecutorServiceManager - Created new ThreadPool for source: org.apache.camel.impl.DefaultShutdownStrategy@65ead16a with name: ShutdownTask. -> org.apache.camel.util.concurrent.RejectableThreadPoolExecutor@52c0a65f[Running, pool size = 0, active threads = 0, queued tasks = 0, completed tasks = 0][ShutdownTask]
22:38:02.173 [Camel (camel-1) thread #1 - ShutdownTask] DEBUG o.a.c.impl.DefaultShutdownStrategy - There are 1 routes to shutdown
22:38:02.177 [Camel (camel-1) thread #1 - ShutdownTask] DEBUG o.a.c.impl.DefaultShutdownStrategy - Route: route1 suspended and shutdown deferred, was consuming from: Endpoint[file:///home/cloudera/Desktop/camelinput/?delay=15m&noop=true]
22:38:02.177 [Camel (camel-1) thread #1 - ShutdownTask] INFO o.a.c.impl.DefaultShutdownStrategy - Waiting as there are still 2 inflight and pending exchanges to complete, timeout in 300 seconds.
22:38:02.179 [Camel (camel-1) thread #1 - ShutdownTask] DEBUG o.a.c.impl.DefaultShutdownStrategy - There are 1 inflight exchanges:
InflightExchange: [exchangeId=ID-quickstart-cloudera-40574-1441345060577-0-2, fromRouteId=route1, routeId=route1, nodeId=unmarshal1, elapsed=10787, duration=10791]
22:38:05.436 [Camel (camel-1) thread #1 - ShutdownTask] INFO o.a.c.impl.DefaultShutdownStrategy - Waiting as there are still 2 inflight and pending exchanges to complete, timeout in 299 seconds.
22:38:05.437 [Camel (camel-1) thread #1 - ShutdownTask] DEBUG o.a.c.impl.DefaultShutdownStrategy - There are 1 inflight exchanges:
InflightExchange: [exchangeId=ID-quickstart-cloudera-40574-1441345060577-0-2, fromRouteId=route1, routeId=route1, nodeId=unmarshal1, elapsed=14045, duration=14049]
22:38:08.201 [Camel (camel-1) thread #1 - ShutdownTask] INFO o.a.c.impl.DefaultShutdownStrategy - Waiting as there are still 2 inflight and pending exchanges to complete, timeout in 298 seconds.
22:38:08.202 [Camel (camel-1) thread #1 - ShutdownTask] DEBUG o.a.c.impl.DefaultShutdownStrategy - There are 1 inflight exchanges:
InflightExchange: [exchangeId=ID-quickstart-cloudera-40574-1441345060577-0-2, fromRouteId=route1, routeId=route1, nodeId=unmarshal1, elapsed=16810, duration=16814]
The same program worked for a small file, but with a large file I get this issue. I think it may be a problem with threads. Please help me figure out the issue.
The following is my program.
Main Class
TestRouter myRoute = new TestRouter();
HDFSTransfer hdfsTransfer = new HDFSTransfer();
String copy = hdfsTransfer.copyToLocal(
        "hdfs://localhost:8020",
        "/user/cloudera/input/CamelTestIn.csv",
        "/home/cloudera/Desktop/camelinput/");
boolean flag = false;
if ("SUCCESS".equals(copy)) {
    myContext.addRoutes(myRoute);
    // Launching the context
    myContext.start();
    // Pausing to let the route do its work
    Thread.sleep(10000);
    myContext.stop();
    flag = true;
}
if (flag) {
    hdfsTransfer.moveFile(
            "hdfs://localhost:8020",
            "file:/home/cloudera/Desktop/camelout/out.csv",
            "/user/cloudera/output/");
}
RouterBuilder Class
{
    CsvDataFormat csv = new CsvDataFormat();
    from("file:/home/cloudera/Desktop/camelinput/?noop=true&delay=15m")
        .unmarshal(csv)
        .convertBodyTo(List.class)
        .process(new Processor() {
            @Override
            public void process(Exchange msg) throws Exception {
                List<List<String>> data = (List<List<String>>) msg.getIn().getBody();
                for (List<String> line : data) {
                    // Checks if column two contains text STANDARD
                    // and alters its value to DELUXE.
                    // System.out.println("line "+line);
                    /*
                    if("Aug-04".equalsIgnoreCase(line.get(6))){
                    line.set(6, "04-August");}
                    */
                }
            }
        }).marshal(csv)
        .to("file:/home/cloudera/Desktop/camelout/?fileName=out.csv")
        .log("done.").end();
}

If you have a bigger file, you need to sleep for longer than 10 seconds to give the route time to process it.
Also mind that your current route reads the whole file into memory, which means you can run out of memory if the file is very big.
See the lazyLoad option at: http://camel.apache.org/csv.html
Also, if all your route does is change some text in a big file, there are better and faster ways of doing that than a Camel route.
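One such way, sketched here in plain Java under simplifying assumptions, is to stream the file line by line so that only one line is held in memory at a time. The class and method names are hypothetical, and the naive split on commas assumes no quoted fields containing commas; a real CSV parser would be needed otherwise. The column index and values mirror the commented-out check in the question.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

public class StreamingCsvRewrite {

    // Replaces the value in column `index` with `to` when it equals `from` (ignoring case).
    // Assumption: fields never contain quoted commas.
    static String rewriteLine(String line, int index, String from, String to) {
        String[] cols = line.split(",", -1); // -1 keeps trailing empty columns
        if (index < cols.length && from.equalsIgnoreCase(cols[index])) {
            cols[index] = to;
        }
        return String.join(",", cols);
    }

    // Streams `in` to `out` one line at a time, so memory use does not grow with file size.
    static void rewrite(Path in, Path out, int index, String from, String to) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                writer.write(rewriteLine(line, index, from, to));
                writer.newLine();
            }
        }
    }
}
```

A call such as rewrite(inputPath, outputPath, 6, "Aug-04", "04-August") would apply the replacement from the question's commented-out code to every row.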
