Camel Kinesis ignoring maxResultsPerRequest parameter - apache-camel

Camel/Kinesis seems to be ignoring the maxResultsPerRequest and greedy parameters that I have set in the URI.
<camelContext xmlns="http://camel.apache.org/schema/blueprint">
  <route>
    <from uri="aws-kinesis://my-stream?maxResultsPerRequest=25&amp;greedy=true"/>
    <to uri="stream:out"/>
  </route>
</camelContext>
Since there are a bunch of messages waiting to be consumed in the shard, I would expect it to read the first 25 all at once, see that there are more, and then immediately poll to get the next 25.
But instead, it still reads them one at a time, at a rate of about one every half second (which lines up with the default polling delay of 500 ms in the documentation). Adjusting the delay parameter to be shorter also seems to do nothing.
So it seems to be ignoring both maxResultsPerRequest and the greedy flag.
maxResultsPerRequest: Maximum number of records that will be fetched in each poll (int, default 1)
greedy: If greedy is enabled, then the ScheduledPollConsumer will run immediately again, if the previous run polled 1 or more messages. (boolean, default false)
delay: Milliseconds before the next poll (long, default 500)
Am I misunderstanding what these parameters do?
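For reference, the same consumer can be sketched in Java DSL (inside a RouteBuilder#configure()); same stream name and options as the XML route above, with the default 500 ms delay made explicit, and no XML escaping needed for the option separators:
from("aws-kinesis://my-stream?maxResultsPerRequest=25&greedy=true&delay=500")
    .to("stream:out");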

Related

Splitter behaviour with parallel processing

I have this route which exhibits a behaviour that I can't understand.
from("sftp://...)
.autoStartup(isReadyToStart())
.routeId(model.getName())
.filter().method(startTrigger)
.bean(RouteHelperIn.class,"start")
.bean(initialize)
.bean(RouteHelperIn.class, "startChunk")
.convertBodyTo(InputStream.class)
.split().tokenize(newLine)
.streaming()
.stopOnException()
.executorServiceRef(splitterExecutorService)
.filter().method(RouteHelperIn.class,"isContentRow") //Skip header rows using the removeFirstLines route parameter
.unmarshal(jdf)
.bean(mapper)
.bean(RouteHelperIn.class, "generateAndUpdateLog")
.bean(writeToDb)
.end()
.end()
.bean(RouteHelperIn.class, "endChunk")
.bean(postProcessing).to("direct://" + sub.getCopy())
.bean(RouteHelperIn.class, "writeLastImportLog")
.bean(RouteHelperIn.class, "end")
.end()
Let's suppose that the SFTP consumer finds two files, and that splitterExecutorService is set to null.
In this case, the split is executed without parallelism and the two exchanges are processed sequentially. This is confirmed by the fact that the second RouteHelperIn#start is called after the first RouteHelperIn#end, as expected.
When splitterExecutorService is set, then parallelism in the splitter is enabled. According to the documentation:
If enabled then processing each split messages occurs concurrently. Note the caller thread will still wait until all messages has been fully processed, before it continues. It’s only processing the sub messages from the splitter which happens concurrently.
What I see is that the second RouteHelperIn#start runs before the first RouteHelperIn#end, which is not what I was expecting and causes a serious bug in my scenario. Also, the thread ID is always the same, so the documented behaviour is not what I observe. I was expecting the same behaviour as in the previous case, with the only difference being that the steps inside the splitter block are executed in parallel.
Could it be a bug in Apache Camel or am I making some wrong assumption here? I'm using version 3.11.3 now.
UPDATE
I've found that adding synchronous=true to the consumer URI solves the problem; in other words, it then behaves as expected. What I don't understand is why the route's consumer is synchronous when the splitter is not parallelized but asynchronous when the splitter is parallelized, again in light of what the documentation states.
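A stripped-down sketch of the arrangement described in the update (the SFTP host, the newline token and the executor service reference are placeholders for the ones used in the real route):
from("sftp://host/inbox?synchronous=true")              // synchronous=true: files are handed to the route one after the other
    .bean(RouteHelperIn.class, "start")
    .split().tokenize("\n").streaming()
        .executorServiceRef("splitterExecutorService")  // split sub-messages still run on this thread pool
        .bean(RouteHelperIn.class, "generateAndUpdateLog")
    .end()
    .bean(RouteHelperIn.class, "end");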

Apache Camel not forwarding all messages to AMQ when using delay

I am trying to read messages from the ActiveMQ queue "AMQ:ORIGIN" using Apache Camel. After reading a message, I need to pass it to two different AMQ queues, under the following conditions:
The message should be passed to queue "AMQ:A" immediately.
The message should be passed to queue "AMQ:B" after a one-minute delay.
To achieve this I created two routes. In the first route I read from the AMQ queue and multicast to "AMQ:A" and "seda:delay-route". In the second route I read from "seda:delay-route", delay for one minute and then pass the message to "AMQ:B".
This works fine if I send 1 or 10 messages to "AMQ:ORIGIN".
If I send 100 messages at the same time to "AMQ:ORIGIN", then:
All 100 messages are delivered to the "AMQ:A" queue.
Only 10 or 12 messages are delivered to the "AMQ:B" queue. The rest are stuck in the route.
These are my routes.
<route id="read-origin">
<from uri="activemq:ORIGIN"/>
<multicast stopOnException="true">
<to uri="activemq:A"/>
<to uri="seda:delay-route"/>
</multicast>
</route>
<route id="delay-route">
<from uri="seda:delay-route"/>
<delay asyncDelayed="true">
<constant>60000</constant>
</delay>
<to uri="activemq:B"/>
</route>
Please suggest the changes needed to achieve the above.
Thanks,
That seems to be obvious since you delay every message for 1 minute.
If you send 100 messages to the ORIGIN queue, it takes 100 minutes until all these messages arrive at queue B.
The first message is consumed immediately and delayed for 1 minute. The second is taken when the first is delivered (assuming one consumer on the seda queue) and is also delayed for one minute, and so on.
I assume you want a message that has already waited for 1 minute in the queue to be delivered immediately when it is consumed.
You can easily achieve this by making the delay dynamic.
Implement a bean that calculates the difference between the JMSTimestamp header (enqueue time) of a message and the current time:
currentTime - JMSTimestamp = alreadyWaited
your minimal delay - alreadyWaited = time to wait before delivery (use 0 for negative values, i.e. messages that were already queued for longer than the delay)
Use this difference as the value for the delay (I use Java DSL because I know it better); a sketch of such a bean follows below.
from("seda:delay-route").routeId("delay-route")
.delay().expression(method(YourDelayCalculationBean.class))
.to("activemq:B");
This way, if your messages pile up in the queue, they will probably all be delivered immediately on consumption because they have already waited in the queue for more than 1 minute.
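A minimal sketch of what such a bean could look like (the class name matches the route above; the method name and the fixed 60-second minimum are illustrative):
import org.apache.camel.Header;

public class YourDelayCalculationBean {

    private static final long MIN_DELAY_MS = 60_000;   // the one-minute minimum from the question

    // Returns how many more milliseconds the message still has to wait.
    // JMSTimestamp is set by the broker when the message was originally enqueued.
    public long remainingDelay(@Header("JMSTimestamp") long enqueuedAt) {
        long alreadyWaited = System.currentTimeMillis() - enqueuedAt;
        return Math.max(0, MIN_DELAY_MS - alreadyWaited);
    }
}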
Addition due to comment
OK, sorry, I did not spot the asyncDelayed.
What the docs say about asyncDelayed sounds like what you expect. But according to your comment, it sounds like the Delay EIP is no longer blocking the consumer, but is blocking itself.
So the seda consumer takes a message, hands it over to the Delay and continues with the next message. After 10 messages (10 threads is Camel's default thread pool size), the Delay is "full" (all threads are blocked in a fixed 1-minute delay).
Therefore the consumer becomes blocked because the Delay can take no more messages. After a minute, the Delay can deliver the first messages and it continues.
That is just a guess based on what you write about how your route behaves.

Apache Camel route timing out

I have two Camel routes, configured in XML and pasted below:
Route 1:
<camel:route id="statementsArchivingPollRoute">
<camel:from uri="timer://tempQueue?fixedRate=true&period=30s&delay=30s"/>
<camel:transacted ref="PROPAGATION_REQUIRED">
<camel:process ref="statementsArchivingRequestZipProcessor"/>
<camel:choice>
<camel:when>
<camel:simple>${body.size} >= 1</camel:simple>
<camel:split>
<camel:simple>${body}</camel:simple>
<camel:marshal ref="archiveFileInterfaceMetadataMapper"/>
<camel:to pattern="InOnly"
uri="activemq:{{ccs.activemq.queue.prefix}}.sr.archive.bulk.ingestion.req?jmsMessageType=Text"/>
</camel:split>
<camel:log loggingLevel="INFO" message="Archiving content was processed"/>
</camel:when>
<camel:otherwise>
<camel:log loggingLevel="INFO" message="No archiving content to process"/>
</camel:otherwise>
</camel:choice>
</camel:transacted>
</camel:route>
Route 2:
<camel:route id="statementsArchivingBulkIngestionRequestRoute">
<camel:from uri="activemq:{{ccs.activemq.queue.prefix}}.sr.archive.bulk.ingestion.req"/>
<camel:throttle timePeriodMillis="4000">
<camel:constant>1</camel:constant>
<camel:setExchangePattern pattern="InOnly"/>
<camel:unmarshal ref="archiveFileInterfaceMetadataMapper"/>
<camel:bean ref="archiveFileEntryTransformer" method="transform"/>
<camel:setHeader headerName="CamelHttpMethod">
<camel:constant>POST</camel:constant>
</camel:setHeader>
<camel:toD uri="{{ccs.bulk.ingestion.service.ingest.archive.file}}"/>
</camel:throttle>
</camel:route>
The processor in the first route returns a list of request objects. The list is then split and each request is marshalled and placed on a queue.
The second route listens to this queue. When it de-queues a message, it unmarshals it, performs a transform and then uses it to send a post request to another service. I am throttling this route so that it only processes one message per second so as not to overwhelm the downstream service.
This all works fine when the list only contains a few requests and hence only a few messages enter the queue, but when there are many items in the list, route 2 times out and the log entry below appears:
Atomikos:12] c.a.icatch.imp.ActiveStateHandler : Timeout/setRollbackOnly of ACTIVE coordinator !
The timeout leads to the process repeating itself and the downstream service ends up getting called multiple times per message instead of just once.
I can't understand why the number of times route 2 is invoked should cause it to time out. I thought that a separate instance of the route would be launched for every message de-queued from ActiveMQ. If a single message took a long time to complete, then I would understand, but clearly the timeout is based on the cumulative time of all the messages being de-queued.
I am fairly new to Camel and I have clearly misunderstood something from an architectural point of view. I would be extremely grateful for any guidance on how to stop these timeouts from occurring. Thank you for reading.
Jon, you may want to try the following to investigate:
Disable/comment out the 2nd route. The purpose of using ActiveMQ is to make the process asynchronous, which means the 1st route should not be affected by the 2nd route. If it works without route 2, then the problem is elsewhere.
If you find the 1st route works fine without the 2nd route, the next step would be to try a smaller number of threads in the 2nd route. Maybe use 1 or 2 threads and see if that helps. I suspect contention on ActiveMQ rather than these route configurations.
Check the payload size you are pushing to ActiveMQ. If you are publishing very large messages, that may also have an impact: a large number of items, each of them large, causes contention in ActiveMQ, and the transaction takes longer than the timeout setting.
If you are pushing a large data set to ActiveMQ, you may want to revisit the design. Save the payload in some persistent store (db/file/cache) and send notification events containing only a reference to the payload and some metadata. The 2nd route can then take the reference from the event and retrieve the payload from wherever route 1 saved it (see the sketch below).
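A rough Java DSL sketch of that last idea (PayloadStore is a hypothetical bean that persists a request and returns a small reference, and loads it back by that reference; the other names come from the routes above; the routes go inside a RouteBuilder#configure()):
from("timer://tempQueue?fixedRate=true&period=30s&delay=30s")
    .process(statementsArchivingRequestZipProcessor)        // same processor as in route 1, assumed available as a Processor instance here
    .split(body())
        .bean(PayloadStore.class, "save")                   // hypothetical: persist the request, return a small reference/ID
        .to(ExchangePattern.InOnly, "activemq:{{ccs.activemq.queue.prefix}}.sr.archive.bulk.ingestion.req")
    .end();

from("activemq:{{ccs.activemq.queue.prefix}}.sr.archive.bulk.ingestion.req")
    .throttle(1).timePeriodMillis(4000)                     // keep the existing throttle towards the downstream service
    .bean(PayloadStore.class, "load")                       // hypothetical: fetch the payload back by its reference
    .to("bean:archiveFileEntryTransformer?method=transform")
    .setHeader("CamelHttpMethod", constant("POST"))
    .toD("{{ccs.bulk.ingestion.service.ingest.archive.file}}");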

How do I send Camel's held-back messages to a directory when the throttling limit is reached?

I have a situation where I want to accept the first 100 messages in a one-hour time frame and process them via a bean called handleMessage. Any messages over the 100 limit I would like to simply place into a directory (or pass to a separate bean) and never send through the main processing bean. Currently, if 101 messages are received, the 101st message is held back and processed by handleMessage once the one-hour time limit is up.
<route>
  <from uri="file://inputdir/"/>
  <!-- throttle 100 messages per hour -->
  <throttle timePeriodMillis="3600000">
    <constant>100</constant>
    <bean name="handleMessage" method="process"/>
  </throttle>
</route>
The throttler does not support this; it will always hold back messages and execute them later when a time slot is available.
You would need to write your own custom EIP / bean / processor that does what you want (sketched below).
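One way to sketch such a custom approach (the route goes inside a RouteBuilder#configure(); HourlyLimit and the overflow directory are hypothetical, and the window counting is deliberately simplistic):
from("file://inputdir/")
    .choice()
        .when(method(HourlyLimit.class, "tryAcquire"))      // under the limit: normal processing
            .to("bean:handleMessage?method=process")
        .otherwise()                                        // over the limit: park the file elsewhere
            .to("file://overflowdir/")
    .end();

// Hypothetical counter: allows the first 100 messages of each one-hour window.
public class HourlyLimit {
    private static final int LIMIT = 100;
    private static final long WINDOW_MS = 3_600_000;
    private long windowStart = System.currentTimeMillis();
    private int count;

    public synchronized boolean tryAcquire() {
        long now = System.currentTimeMillis();
        if (now - windowStart >= WINDOW_MS) {   // start a new one-hour window
            windowStart = now;
            count = 0;
        }
        return ++count <= LIMIT;
    }
}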

Handling incoming JMSCorrelationId in an Apache Camel route

I have a Camel route consuming from a JMS (ActiveMQ) queue, intended to be called in a request/reply manner. Inside this route I split the message and invoke another ActiveMQ queue (also in a request/reply manner).
Here's a minimal route showing the situation:
<route>
  <from uri="activemq:A"/>
  <split>
    <xpath>/root/subpart</xpath>
    <inOut uri="activemq:B"/>
  </split>
</route>
The problem is that Camel does not set a new JMSCorrelationId (since there is already one from the incoming message). If nothing is done, you get responses with unknown correlation IDs and the exchanges never end.
I didn't go into details, but my guess is that the same temporary queue is used for the whole splitter while it (logically) expects a different correlation ID for each of the messages. Since they all use the same one, it receives the first reply and does not know what to do with the others.
What would be the best way to handle this situation?
The one I've found to work is to save the incoming JMSCorrelationId in another header (not sure I even need to, though) and then remove it. This is not really as clean as I would like, but I couldn't think of anything else. Any ideas?
Essentially, your case is described in this Jira issue. It seems there will be an addition in 2.11 where you can ask Camel to create a new correlation id.
So, in the meantime, why don't you continue with what you had working: remove the JMSCorrelationId header, <removeHeader headerName="JMSCorrelationId" />, before you send the message to "activemq:B"? I guess that is the best solution for now.
You could, of course, also play with the useMessageIDAsCorrelationID option on the second endpoint.
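In Java DSL, the interim workaround would look roughly like this (a sketch inside a RouteBuilder#configure(); same queue names as above):
from("activemq:A")
    .split().xpath("/root/subpart")
        .removeHeader("JMSCorrelationID")            // let the JMS producer generate a fresh correlation id per split message
        .to(ExchangePattern.InOut, "activemq:B")     // or: "activemq:B?useMessageIDAsCorrelationID=true"
    .end();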
