I have a scenario with two routes. The first route downloads files from an FTP server and puts them in a local directory.
The second route picks files up from the local directory and processes them further.
Multiple files can arrive on the FTP server at any time. The second route uses a thread pool (10 consumers by default), and as files are downloaded to the local directory it picks them up and starts processing.
However, this route picks files from the local directory in random order. I want it to pick files in timestamp order.
So if the second route is currently processing 10 files (since 10 threads are configured) and more files arrive in the local directory, then whenever a consumer becomes free it should pick the file that arrived first.
Can anyone please guide me on how to achieve this?
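To make the desired ordering concrete, here is a minimal plain-JDK sketch of "oldest file first". It is not Camel code: the class name OldestFirst is made up for illustration, and it assumes the file's last-modified time is an acceptable stand-in for arrival time. Camel's file component also has a sortBy option (for example sortBy=file:modified), which sorts the files found within each poll.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical helper, not part of Camel.
final class OldestFirst {
    // Oldest last-modified time first; ties are broken by file name so the order is stable.
    static final Comparator<File> BY_MODIFIED =
            Comparator.comparingLong(File::lastModified).thenComparing(File::getName);

    // Returns the regular files in dir, oldest first.
    // Returns an empty list if dir does not exist or cannot be listed.
    static List<File> list(File dir) {
        File[] files = dir.listFiles(File::isFile);
        if (files == null) {
            return new ArrayList<>();
        }
        List<File> sorted = new ArrayList<>(Arrays.asList(files));
        sorted.sort(BY_MODIFIED);
        return sorted;
    }
}
```

A free consumer would then take the first element of OldestFirst.list(dir). Note that sorting only orders the files visible at the moment of listing; files arriving later are only seen on the next listing.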
Related
My Camel app processes files from an FTP server. While testing, I found that my route picks up a file and starts processing it while the file is still being uploaded. I have set readLock to 'changed' and delay to '60000'; my file is around 500 MB. Am I missing anything?
Note that the delay option is just a fixed polling interval; it is not coupled to readLock.
With readLock=changed, Camel checks every second whether the file size has changed. With slow uploads, this could be why files that are still uploading are already being consumed.
You could try increasing readLockCheckInterval to more than 1 second.
See the Camel FTP docs for more details and options (option readLock).
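To illustrate why the check interval matters, here is a simplified plain-JDK model of a "changed" style read lock. It is a sketch under stated assumptions, not Camel's actual implementation: the class ChangedLockSketch and its parameters are hypothetical, and it only compares file sizes between two consecutive checks.

```java
import java.util.function.LongSupplier;

// Hypothetical, simplified model of a "changed" read lock; not Camel's code.
final class ChangedLockSketch {
    // Samples the size, waits intervalMillis, samples again, and repeats up to maxChecks times.
    // Returns true as soon as two consecutive samples are equal (the file "looks finished"),
    // or false if the size kept changing or the thread was interrupted.
    static boolean acquire(LongSupplier size, long intervalMillis, int maxChecks) {
        long last = size.getAsLong();
        for (int i = 0; i < maxChecks; i++) {
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
            long now = size.getAsLong();
            if (now == last) {
                return true;
            }
            last = now;
        }
        return false;
    }
}
```

In this model, an upload that stalls for longer than the interval reports the same size twice and is treated as complete, which is the failure described above. A longer interval makes that less likely, at the cost of a longer wait before each file is picked up.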
I have a file drop endpoint that I poll from. I need to process the files in the order they are received, and I am using a cron expression to poll only at certain hours of the day. Here is my file input:
file:///tmp/input?idempotent=true&moveFailed=/tmp/error&readLock=changed&readLockCheckInterval=2500&sortBy=file:modified&move=processed/&scheduler=quartz2&scheduler.cron=0+0/5+0-3,5-23+*+*+?
The issue is that Camel polls a batch of files, but newer files are then written to the directory, so in a subsequent poll a new file is processed before the previous batch is completed.
I added some properties to my route to show the batch size and whether or not it has been completed just for some info:
<camel:log message="Camel batch size: $simple{property.CamelBatchSize}, Camel Batch Index: $simple{property.CamelBatchIndex}, Camel Batch finished: $simple{property.CamelBatchComplete}"/>
How can I tell Camel not to poll until the previous batch is complete? I do this because order of file processing is important. Thanks!
I am not sure whether the file route itself offers a way to do this with a cron schedule. However, you could achieve it with three routes:
1. Cron job route
   - Emit a suspend signal to the Stopper route if the Collector route is already started (check via the controlBus component)
   - Start the Collector route at the correct time (triggered via the controlBus component)
2. Collector route
   - Control the file consumer behavior
   - Emit a complete signal to the Stopper route when the batch is completed
3. Stopper route
   - Suspend the Collector route when a signal is received (triggered via the controlBus component)
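The coordination between these routes boils down to a gate that allows only one batch in flight at a time. The following plain-JDK sketch shows that idea only. BatchGate is a hypothetical name, and wiring it to the controlBus component or to the CamelBatchComplete property is left out and would be an assumption about your setup.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical gate: at most one batch may be in progress at any time.
final class BatchGate {
    private final AtomicBoolean inProgress = new AtomicBoolean(false);

    // Called when the cron trigger fires. Returns true if a new batch may start,
    // or false if the previous batch has not completed yet.
    boolean tryStart() {
        return inProgress.compareAndSet(false, true);
    }

    // Called when the last exchange of the batch has been processed.
    void complete() {
        inProgress.set(false);
    }
}
```

In the three-route design, tryStart() corresponds to the Cron job route deciding whether to start the Collector route, and complete() corresponds to the signal the Collector route sends when its batch is finished.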
I have two separate routes running in an application and I want to control the total amount of work in flight across the entire path.
Route 1: Gzipped file on SFTP --> unzip --> local directory
Route 2: local directory --> process stuff --> Kafka
If route 2 has problems or falls behind in its work, I don't want route 1 to fill up the local directory. How can I limit the total number of files sitting in local dir waiting to be processed?
(If it were a single route I might be able to use throttle() more easily, but are there other options that look at the overall picture across multiple routes?)
You can implement a custom RoutePolicy that checks the number of files in that directory, suspends the route if the count is bigger than X, and resumes it once the count is lower than X.
See more details in the Camel docs: http://camel.apache.org/routepolicy.html
You can look at how the existing ThrottlingInflightRoutePolicy is implemented for inspiration.
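The decision such a policy has to make can be sketched in plain Java. This is only the counting and threshold logic, under the assumption that "X" is a fixed limit on regular files in the directory. BacklogThrottle and its method names are hypothetical, and hooking the result into a RoutePolicy callback is not shown.

```java
import java.io.File;

// Hypothetical helper for a directory-backlog check; not part of Camel.
final class BacklogThrottle {
    enum Action { SUSPEND, RESUME, NONE }

    // Counts regular files directly inside dir (subdirectories are ignored).
    // Returns 0 if dir does not exist or cannot be listed.
    static int countFiles(File dir) {
        File[] files = dir.listFiles(File::isFile);
        return files == null ? 0 : files.length;
    }

    // Suspend when the backlog is above the limit, resume when it is below the limit,
    // and leave the route alone otherwise (including when the count equals the limit).
    static Action decide(int pendingFiles, int limit, boolean suspended) {
        if (!suspended && pendingFiles > limit) {
            return Action.SUSPEND;
        }
        if (suspended && pendingFiles < limit) {
            return Action.RESUME;
        }
        return Action.NONE;
    }
}
```

A policy attached to route 1 would call decide(countFiles(localDir), X, currentlySuspended) after each exchange and act on the result. Using two different thresholds for suspend and resume would reduce flapping around X, but that is a design choice beyond what the answer describes.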
I have a camel route consuming from an FTP server and storing any files it consumes to a directory with move=.dealtWith. However, the number of files in this .dealtWith directory can quickly become unmanageable for users to view, so I would like to move the file to a .dealtWith/{the_date} directory. Is there a way to specify this functionality in camel without bringing the route down?
Use the Camel Simple expression language:
ftp:url?move=.dealtWith/${date:now:yyyy-MM-dd}/${header.CamelFileName}
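For a concrete picture of the path that expression is meant to produce, here is a plain-JDK sketch that builds the same kind of dated target. DatedMovePath is a hypothetical helper for illustration only. Camel evaluates the expression itself, so nothing like this is needed in the route.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

// Hypothetical illustration of the dated move target; not needed in a real route.
final class DatedMovePath {
    private static final DateTimeFormatter DAY = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // Builds ".dealtWith/<yyyy-MM-dd>/<fileName>" for the given date.
    static String target(LocalDate date, String fileName) {
        return ".dealtWith/" + date.format(DAY) + "/" + fileName;
    }
}
```

For example, a file named report.csv processed on 5 January 2024 would end up at .dealtWith/2024-01-05/report.csv.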
I am currently working on a Camel based test harness application, that processes groups of files from multiple folders and compares with the files present in the local repository.
Is there any way to dynamically change the folder location of the endpoint in a Camel route? I want to use a single route for polling files from multiple folders.
According to "dynamic change endpoint camel", use the following procedure:
stop the route
remove the route
change the endpoint
add the route
start the route