AggregationStrategy warns on timeout all the time - apache-camel

Why does AggregationStrategy implementation always log a warning when it times out? I do not see any exchange/data loss in the aggregation, when this happens.
AggregateProcessor calls this timeout method when the completionTimeout requirement has been met. Any logging of that event could be debug or informational, but shouldn't rise to a warning.
2020-06-25 16:06:54.454 WARN 1 --- [eTimeoutChecker] o.e.s.e.a.ElasticBulkAggregationStrategy : Parallel processing timed out after 1000 millis for number -1. This task will be cancelled and will not be aggregated.
Here is the aggregate portion of my route.
from (direct:...)
...
.aggregate(constant(true)).id("aggregator"+id)
.aggregationStrategyRef("elasticAggregationStrategy")
.completionSize(aggregatorbatchSize)
.completionTimeout(aggregatorbatchTimeout)
.to("seda:aggregatedPayload")
.end()

It is harmless. There is an open Jira issue, CAMEL-15244, to remove the message or reduce its severity.
You have the following options:
Ignore the warning
Submit a PR (please add a comment to the Jira issue before working on the task)
Wait until someone else resolves it
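If the warning is too noisy in the meantime, one possible workaround is to override the timeout callback in your own strategy and log at a lower level. This is only a sketch, assuming Camel 3.x where timeout() is a default method on AggregationStrategy and, as the logger category in the message suggests, is where the WARN is emitted:

import org.apache.camel.AggregationStrategy;
import org.apache.camel.Exchange;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class ElasticBulkAggregationStrategy implements AggregationStrategy {

    private static final Logger LOG = LoggerFactory.getLogger(ElasticBulkAggregationStrategy.class);

    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        // existing aggregation logic goes here
        return newExchange;
    }

    @Override
    public void timeout(Exchange oldExchange, int index, int total, long timeout) {
        // log the (harmless) completion-by-timeout at DEBUG instead of WARN
        LOG.debug("Aggregation completed by timeout after {} millis", timeout);
    }
}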

Related

Strange transactional id errors when using the kafka sink

I had a Flink 1.15.1 job configured with
execution.checkpointing.mode='EXACTLY_ONCE'
that was failing with the following error
Sink: Committer (2/2)#732 (36640a337c6ccdc733d176b18adab979) switched from INITIALIZING to FAILED with failure cause: java.lang.IllegalStateException: Failed to commit KafkaCommittable{producerId=4521984, epoch=0, transactionalId=}
...
Caused by: org.apache.kafka.common.config.ConfigException: Invalid value for configuration transactional.id: String must be non-empty
that happened after the first checkpoint was triggered. The strange thing about it is that the KafkaSinkBuilder was used without calling setDeliverGuarantee, and hence the default delivery guarantee was expected to be used, which is NONE 1.
Is that even possible to start with? Shouldn't Kafka transactions be involved only when one follows this recipe in 2?
* <p>One can also configure different {@link DeliveryGuarantee} by using {@link
* #setDeliverGuarantee(DeliveryGuarantee)} but keep in mind when using {@link
* DeliveryGuarantee#EXACTLY_ONCE} one must set the transactionalIdPrefix {@link
* #setTransactionalIdPrefix(String)}.
So, in my case, without calling setDeliverGuarantee (nor setTransactionalIdPrefix), I cannot understand why I was seeing these errors. To avoid the problem, I temporarily relaxed the checkpointing settings to
execution.checkpointing.mode='AT_LEAST_ONCE'
but I'd like to understand what was happening.
Like the JavaDoc mentions, if you enable exactly-once, you must set a transactionalIdPrefix. A complete recipe on how to configure exactly-once with Apache Kafka can be found here: https://www.docs.immerok.cloud/docs/cookbook/exactly-once-with-apache-kafka-and-apache-flink/
Disclaimer: I work for Immerok
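For completeness, a minimal sketch of the exactly-once configuration described in the JavaDoc, using the Flink 1.15 KafkaSink builder; the broker address, topic, and prefix values are placeholders:

import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.connector.base.DeliveryGuarantee;
import org.apache.flink.connector.kafka.sink.KafkaRecordSerializationSchema;
import org.apache.flink.connector.kafka.sink.KafkaSink;

KafkaSink<String> sink = KafkaSink.<String>builder()
        .setBootstrapServers("broker:9092")              // placeholder
        .setRecordSerializer(KafkaRecordSerializationSchema.builder()
                .setTopic("output-topic")                // placeholder
                .setValueSerializationSchema(new SimpleStringSchema())
                .build())
        // EXACTLY_ONCE uses Kafka transactions, so a transactional id prefix is mandatory
        .setDeliverGuarantee(DeliveryGuarantee.EXACTLY_ONCE)
        .setTransactionalIdPrefix("my-app")              // placeholder
        .build();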

Setting handled as true in onException block prevents redeliveries

I am reading a file from a directory and trying to call an API based on the data in the file.
While trying to handle the exceptions, I am facing an issue. I am trying to configure the onException block to redeliver 3 times, with a delay of 5 seconds. The issue occurs when I set handled(true): with that configuration there is no redelivery, and processing stops as soon as the exception occurs.
This is my onException block:
onException(HttpOperationFailedException.class)
.log(LoggingLevel.ERROR, logger, "Error occurred while connecting to API for file ${header.CamelFileName} :: ${exception.message}")
.log("redelivery counter :: ${header.CamelRedeliveryCounter}")
.maximumRedeliveries(3)
.redeliveryDelay(5000)
.handled(true);
How do I do both, i.e. handle as well as redeliver?
Unless you use a buggy version of Camel, the redeliveries are made as expected, whether the exception is handled or not.
The only difference between handled and not handled is the result sent back to the client once the retries are exhausted: either the exception (not handled) or the result of your onException (handled).
Your mistake here is assuming that the log EIPs defined in your onException are called for each retry, while they are actually called only once the retries are exhausted.
If you want to see the retries in your logs, you can use retryAttemptedLogLevel as follows:
onException(HttpOperationFailedException.class)
.maximumRedeliveries(3)
.redeliveryDelay(5000)
.retryAttemptedLogLevel(LoggingLevel.WARN);
You will then get warning messages of type:
Failed delivery for (MessageId: X on ExchangeId: Y). On delivery attempt: Z caught: ...
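Putting both together, a sketch of the question's block with retry logging enabled (settings copied from the question; note the log EIP still only runs once the retries are exhausted):

onException(HttpOperationFailedException.class)
.maximumRedeliveries(3)
.redeliveryDelay(5000)
.retryAttemptedLogLevel(LoggingLevel.WARN)
.log(LoggingLevel.ERROR, logger, "Error occurred while connecting to API for file ${header.CamelFileName} :: ${exception.message}")
.handled(true);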

Aggregate after exception from ftp consumer: FatalFallbackErrorHandler

My Camel route tries to pick up some files from sftp, transfer them to a network location, and delete them from sftp. If the sftp is unreachable after 3 attempts, I want the route to send an email warning the admin about the problem.
For this reason my sftp address has the following parameters:
maximumReconnectAttempts=2&throwExceptionOnConnectFailed=true&consumer.bridgeErrorHandler=true
In case the network location is not available, I want the route to notify the admin and not delete the files from sftp.
For this reason I have set .handled(false) in onException.
However, when connecting to sftp fails, aggregation also fails and no emails are coming. I have made a minimalist example below:
//configure
onException(Throwable.class)
.retryAttemptedLogLevel(LoggingLevel.WARN)
.redeliveryDelay(1000)
.handled(false)
.log(LoggingLevel.ERROR, LOG, "XXX - Error moving files")
.to(AGGREGATEROUTE)
.end();
from(downloadFrom)
.to(to)
.log(LoggingLevel.INFO, LOG, "XXX - Moving file OK")
.to(AGGREGATEROUTE);
from(AGGREGATEROUTE)
.log(LoggingLevel.INFO, LOG, "XXX - Starting aggregation.")
.aggregate(constant(true), new GroupedExchangeAggregationStrategy())
.completionFromBatchConsumer()
.completionTimeout(10000)
.log(LoggingLevel.INFO, LOG, "XXX - Aggregation completed, sending mail.");
In the logs I see:
16:02| ERROR | CamelLogger.java 156 | XXX - Error moving files
Then the logs for the Exception occurring during connection.
And then this:
16:02| ERROR | FatalFallbackErrorHandler.java 174 | Exception occurred while trying to handle previously thrown exception on exchangeId: ID-LP0641-1552662095664-0-2 using: [Pipeline[[Channel[Log(proefjes.camel_cursus.routebuilders.MoveWithPickupExceptions)[XXX - Error moving files]], Channel[sendTo(direct://aggregate)]]]].
16:02| ERROR | FatalFallbackErrorHandler.java 172 | \--> New exception on exchangeId: ID-LP0641-1552662095664-0-2
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot connect to sftp://user@mycompany.nl:22
at org.apache.camel.component.file.remote.SftpOperations.connect(SftpOperations.java:149)
I do not see "XXX - Starting aggregation." which I would expect to see in the log. Does some kind of error occur before aggregation? The breakpoint on aggregate(*, *) is never reached.
First, I just want to clarify something. You write "In case the network location is not available, i want the route to notify the admin and not delete the files from sftp", but shouldn't that be obvious anyhow? I mean, if the network location is not available, wouldn't deleting the files from sftp be impossible?
It's a little confusing that your exception handler is also routing .to(AGGREGATEROUTE). Given that you want to email an admin, shouldn't that be in the exception handler, not in the happy path? Why would you and how would you "aggregate" a connection failure?
Finally, and here I think is a real problem with your implementation, you may have misunderstood what handled(false) does. Setting this to false means routing should stop and the exception should be propagated back to the caller. I'm not sure what the .to(AGGREGATEROUTE) would do in this case, but I'm not surprised it's not being called.
I suggest trying a few things. I don't have your code so I'm not sure which will work best. These are all related and any might work:
Change handled(false) to handled(true).
Replace handled with continued(true).
Use a Dead Letter Channel.
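For illustration, a sketch of the second and third suggestions in the Java DSL; the dead letter endpoint is a placeholder, AGGREGATEROUTE is taken from the question:

// suggestion 2: continue routing after the exception instead of propagating it
onException(Throwable.class)
.maximumRedeliveries(2)
.redeliveryDelay(1000)
.continued(true)
.to(AGGREGATEROUTE);

// suggestion 3: use a dead letter channel as the route's error handler
errorHandler(deadLetterChannel("direct:notifyAdmin")   // placeholder endpoint
.maximumRedeliveries(2)
.redeliveryDelay(1000));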
Reference:
Handle and Continue Exceptions
Dead Letter Channel
Since error handling differs depending on which endpoint causes the error, I have solved this by having two different versions of onException:
//configure exception on sft end
onException(Throwable.class)
.maximumRedeliveries(2)
.retryAttemptedLogLevel(LoggingLevel.WARN)
.redeliveryDelay(1000)
.onWhen(new hasSFTPErrorPredicate())
// .continued(true) // tries to connect once, mails and continues to aggregation with empty exchange
//.handled(false) // tries to connect twice but does not reach mail
.handled(true) // tries to connect once, does reach mail
// handled not defined: tries to connect twice but does not reach mail
.log(LoggingLevel.INFO, LOG, "XXX - SFTP exception")
.to(MAIL_ROUTE)
.end();
// exception anywhere else
onException(Throwable.class)
.maximumRedeliveries(2)
.retryAttemptedLogLevel(LoggingLevel.WARN)
.redeliveryDelay(1000)
.log(LoggingLevel.ERROR, LOG, "XXX - Error moving file ${file:name}: ${exception}")
.to(AGGREGATEROUTE)
.handled(false)
.end();
Exceptions occurring at the sftp end are handled by the first onException, because there the hasSFTPErrorPredicate returns 'true'. All this predicate does is check whether any exception, or its cause, has "Cannot connect to sftp:" in the message.
No rollback is required in this case because nothing has happened yet.
Any other exception is handled by the second onException.
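For reference, a sketch of what hasSFTPErrorPredicate could look like based on that description (the exact implementation here is illustrative, not the original):

import org.apache.camel.Exchange;
import org.apache.camel.Predicate;

public class hasSFTPErrorPredicate implements Predicate {

    @Override
    public boolean matches(Exchange exchange) {
        Throwable t = exchange.getProperty(Exchange.EXCEPTION_CAUGHT, Throwable.class);
        // walk the exception and its causes looking for the sftp connection failure
        while (t != null) {
            if (t.getMessage() != null && t.getMessage().contains("Cannot connect to sftp:")) {
                return true;
            }
            t = t.getCause();
        }
        return false;
    }
}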

CamelExchangeException Invalid Correlation Key on exchange header

Apache Camel is throwing Invalid Correlation Key exception when trying to aggregate messages from my AWS SQS queue.
The messages were placed in the queue using ZipSplitter and they all appear in the queue with matching "parentId" values (which I added using a random UUID as part of the splitting; I've tried CamelSourceFile as well). I get the exception repeatedly until the retries are exhausted.
My aggregate expression:
from(--queue--).aggregate(header("parentId"), customAggregationStrategy).completionTimeout(3000).process(new Processor() {...}).to(--next queue--);
There is no logging emitted from my customAggregationStrategy nor from any of the subsequent processors. It fails to aggregate:
... DeadLetterChannel - Failed delivery for (MessageId: ...). On Delivery attempt: 0 caught ...CamelExchangeException: Invalid correlation key. Exchange[ID...]
The delivery attempt is 0 through 9 for my retry attempts.
The infuriating thing is that the code works everywhere but locally... which you would think would narrow things down, but neither the exception nor anything else logged sheds any light on what is going on here.
You could try to use the Camel simple language when expressing the correlation key, i.e.:
.aggregate(simple("${headers.parentId}"), customAggregationStrategy)
This way, the exceptions might be silently ignored?
Did you activate the Camel tracer (http://camel.apache.org/tracer.html) to analyse your exchanges and ease the debugging?
I suspect you have an Exchange which does NOT have the "parentId" header. If you want to skip them, just activate the ignoreInvalidCorrelationKeys option (see http://camel.apache.org/aggregator2.html)
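For illustration, a sketch of the aggregate step from the question with that option enabled; the queue URIs are placeholders standing in for the elided ones in the question:

from("aws-sqs://my-queue")                      // placeholder for --queue--
.aggregate(header("parentId"), customAggregationStrategy)
    // drop exchanges without a "parentId" header instead of throwing
    .ignoreInvalidCorrelationKeys()
    .completionTimeout(3000)
.to("aws-sqs://my-next-queue");                 // placeholder for --next queue--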

Only retry delivery on 500 responses

I only want to retry delivery on 500 errors but can't seem to find a way to limit the scope of the exception to only that status code. My code:
onException(HttpOperationFailedException.class)
.handled(true)
.maximumRedeliveries(5)
.redeliveryDelay(200);
.to("http4://localhost:8088/ws/v1/camel?bridgeEndpoint=true&throwExceptionOnFailure=false")
See the Camel in Action book (1st or 2nd edition); it has such an example at the end of its error handling chapter.
You just add an onWhen to the onException, where you add a bit of code that checks whether the status code is 500.
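A sketch of how that could look, assuming the caught HttpOperationFailedException exposes the HTTP status via getStatusCode() (redelivery settings copied from the question):

onException(HttpOperationFailedException.class)
.onWhen(exchange -> {
    // only retry when the caught exception reports an HTTP 500
    HttpOperationFailedException e = exchange.getProperty(
        Exchange.EXCEPTION_CAUGHT, HttpOperationFailedException.class);
    return e != null && e.getStatusCode() == 500;
})
.handled(true)
.maximumRedeliveries(5)
.redeliveryDelay(200);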

Resources