Camel SFTP file polling synchronize - apache-camel

We have an SFTP polling consumer route that fetches files from an SFTP server. Here is my problem: the polling folder on the SFTP server holds hundreds of files, but I want to process them in batches. For example, with maxMessagesPerPoll=10 we would like to process those 10 files first and only then fetch the next batch. As of now, the SFTP consumer starts consuming all the files before processing them. Is there a way to do this?
sftp.options: shuffle=true&moveFailed=../error/&preMove=../processing&delete=true&readLockLoggingLevel=INFO&maxMessagesPerPoll=10&delay=5000&disconnect=false&scheduledExecutorService=#sftpScheduledExecutor
from(getSftpOptions())
    .onCompletion()
        .onCompleteOnly()
        .process("routeCompleteProcessor")
        .log("Route processing is Completed ...")
    .end()
    .log("Starting the file process")
    .process("routeCBCheckProcessor")
    .unmarshal().gzipDeflater()
    .split(body().tokenizeXML("root", "*"), new SimpleStringAggregator()).streaming()
        .parallelProcessing(true)
        .process("lineProcessor")
    .end()
    .log("aggregate is completed...");

Related

In Mulesoft how to stop SFTP connector to read file when ActiveMQ is trying to reconnect

I have a reliable acquisition flow in which an inbound SFTP connector polls and reads files and publishes them to JMS ActiveMQ. I have an issue when ActiveMQ is down. The JMS connector goes into reconnection mode, retrying, let's say, every minute, but the SFTP connector keeps actively reading files and tries to publish them to ActiveMQ. This causes message loss, and the dead-letter queue concept does not work either, because JMS is down.
Is there a way to stop reading files until JMS has successfully reconnected? Also, what if a message is in flight and JMS goes down before the message ends up in the queue? Will the message be rolled back?
I am using Mule 3.9.
If you set the flow processing strategy to synchronous and use only 1 receiver thread in the connector configuration, then it should not try to read a new file if the previous one hasn't been processed. Without those changes, it is not a correct implementation of the reliable pattern.
More information:
Documentation of reliable patterns: https://docs.mulesoft.com/mule-runtime/3.9/reliability-patterns.
KB Article: https://help.mulesoft.com/s/article/How-to-fetch-files-in-order-using-the-FTP-connector
Example:
<sftp:connector name="sftpConn">
    <receiver-threading-profile maxThreadsActive="1" />
</sftp:connector>
<flow name="mainFlow" processingStrategy="synchronous">
    ...

apache camel ftp: how to prevent ftp component to process file when that file being written

My Camel app processes files from an FTP server. While testing, I found that while a file is still being uploaded, my route already picks it up and starts processing it. I have set readLock to 'changed' and delay to '60000'; my file is around 500 MB. Am I missing anything?
Notice that the delay option is just a fixed interval that is not "coupled" with the readLock.
With readLock=changed, Camel checks every second whether the file size has changed. With slow uploads, this could be why files that are still uploading are already being consumed.
You could try increasing readLockCheckInterval to more than 1 second.
See Camel FTP docs for more details and options (Option readLock)
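To illustrate why a short check interval can misfire, here is a minimal plain-Java sketch of the idea behind a "changed"-style read lock. It is not Camel's implementation; the class name StableSizeCheck, the method waitUntilStable and its parameters are hypothetical names chosen for this example. The size is sampled repeatedly, and the file is treated as complete as soon as two consecutive samples match.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Illustrative model of a "changed"-style read lock; not Camel's actual code. */
class StableSizeCheck {

    /**
     * Samples the size every checkIntervalMillis. Returns true as soon as two
     * consecutive samples are equal, or false if timeoutMillis elapses first.
     */
    static boolean waitUntilStable(LongSupplier sizeProbe,
                                   long checkIntervalMillis,
                                   long timeoutMillis) {
        long deadline = System.currentTimeMillis() + timeoutMillis;
        long previous = sizeProbe.getAsLong();
        while (System.currentTimeMillis() < deadline) {
            try {
                Thread.sleep(checkIntervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            long current = sizeProbe.getAsLong();
            if (current == previous) {
                // Size unchanged across one interval: assume the upload finished.
                return true;
            }
            previous = current;
        }
        return false;
    }

    public static void main(String[] args) {
        // Simulated upload whose size grows on every sample and never settles.
        AtomicLong growing = new AtomicLong();
        System.out.println(waitUntilStable(() -> growing.addAndGet(1024), 10, 100)); // false
        // Simulated finished upload: the size no longer changes.
        System.out.println(waitUntilStable(() -> 500_000_000L, 10, 100)); // true
    }
}
```

In this model, an upload that merely stalls for longer than one interval looks identical to a finished one, which is why a longer interval is the safer choice for slow uploads.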

Camel file polling: finish processing a batch before polling again

I have a file drop endpoint that I poll from. I need to poll the files in sequential order as they are received, and I am using a cron expression to poll only at certain hours of the day. Here is my file input:
file:///tmp/input?idempotent=true&moveFailed=/tmp/error&readLock=changed&readLockCheckInterval=2500&sortBy=file:modified&move=processed/&scheduler=quartz2&scheduler.cron=0+0/5+0-3,5-23+*+*+?
The issue I have is that Camel polls a batch of files, but newer files are then written to the directory, so in a subsequent poll a new file is processed before the previous batch has completed.
I added some properties to my route to show the batch size and whether or not it has been completed just for some info:
<camel:log message="Camel batch size: $simple{property.CamelBatchSize}, Camel Batch Index: $simple{property.CamelBatchIndex}, Camel Batch finished: $simple{property.CamelBatchComplete}"/>
How can I tell Camel not to poll until the previous batch is complete? I do this because order of file processing is important. Thanks!
I am not sure whether there is an existing way to accomplish this with a cron schedule directly in the file route. However, you could achieve it by using 3 routes:
1. Cron job route
   - Emit a suspend signal to the Stopper route if the Collector route is already started (check via the controlBus component)
   - Start the Collector route at the correct time (triggered via the controlBus component)
2. Collector route
   - Control the file consumer behavior
   - Emit a complete signal to the Stopper route when the batch has completed
3. Stopper route
   - Suspend the Collector route when a signal is received (triggered via the controlBus component)
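The coordination between these routes can be modelled in plain Java as a simple gate. This is only a sketch of the idea, not Camel code: the class BatchGate and its methods are hypothetical names. In a real route the start and suspend signals would go through the controlBus component, and the complete signal could presumably be driven by the CamelBatchComplete property mentioned in the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

/** Plain-Java model of "do not poll again until the batch is done"; not Camel code. */
class BatchGate {
    private final AtomicBoolean batchInFlight = new AtomicBoolean(false);

    /**
     * Called on every scheduler tick (the role of the cron job route).
     * Returns true if a new batch may start, false if one is still running.
     */
    boolean tryStartBatch() {
        return batchInFlight.compareAndSet(false, true);
    }

    /** Called when the last file of the batch is done (the "complete signal"). */
    void completeBatch() {
        batchInFlight.set(false);
    }

    public static void main(String[] args) {
        BatchGate gate = new BatchGate();
        List<String> log = new ArrayList<>();
        log.add("tick1 " + gate.tryStartBatch()); // batch starts
        log.add("tick2 " + gate.tryStartBatch()); // skipped: batch still in flight
        gate.completeBatch();                     // collector reports completion
        log.add("tick3 " + gate.tryStartBatch()); // next batch may start
        System.out.println(log); // [tick1 true, tick2 false, tick3 true]
    }
}
```

The compare-and-set makes the check and the state change a single atomic step, so two overlapping ticks cannot both start a batch.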

Camel: Concurrent consumers, process file in route by order of arrival

I have a scenario with 2 routes. The first route downloads files from an FTP server and puts them in a local directory.
The second route picks up the files from the local directory and processes them further.
Multiple files can arrive on the FTP server at any time. The second route uses a thread pool (default 10 consumers), and as files are downloaded to the local directory, it picks them up and starts processing.
But this route picks files from the local directory in random order. I want the second route to pick files in timestamp order.
So if the second route is currently processing 10 files (as there are 10 threads configured) and more files arrive in the local directory, then whenever a consumer becomes free it should pick the file that arrived in the local directory first.
Can anyone please guide me on how to achieve this?

Move option using camel ftp component not working

I am using camel version 2.16.0.
I am trying to send some files through ftp camel component and move them to another location after finishing the transmission.
I am using the "ftp://127.0.0.1/folder1/folder2?username=dev_user&passiveMode=true&password=dev_password&maximumReconnectAttempts=500&reconnectDelay=300000&move=folder3" route.
My files are sent properly, but not moved from Folder2 to Folder3 as I would expect after finishing the transmission.
Any thoughts?
Thanks!
Marcelo
The move option exists only for the consumer (from), not for the producer (to). Camel's FTP component is built on the File2 component; please study its documentation at http://camel.apache.org/file2.html
If, however, you are using a consumer, then it could be, for example, a permission issue, and you should study the logs.
