download specific file with camel and shutdown - apache-camel

Can I download a specific list of files from an SFTP server with Camel and then shut down the service?
I know this should be a common question, but I can't figure out how to do it without waiting for the context to stop.
Can Camel ensure data integrity in some way?

I guess you can do something like this using a direct route, pollEnrich, and a template:
from("direct:grabOneFile")
.pollEnrich("sftp://somewhere/blah/blah?fileName=foobar");
Then, from some Java code somewhere, just grab a Camel template and invoke the "direct:grabOneFile" route:
String ret = template.requestBody("direct:grabOneFile","",String.class);
In this case, you don't have to worry about when to shut down the Camel context or about the chance of picking up multiple files, etc.
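A minimal end-to-end sketch of that idea, with the SFTP URI and file name as placeholders; since nothing else is consuming, the context can be stopped explicitly once the file has been fetched:
CamelContext context = new DefaultCamelContext();
context.addRoutes(new RouteBuilder() {
    @Override
    public void configure() {
        from("direct:grabOneFile")
            .pollEnrich("sftp://somewhere/blah/blah?fileName=foobar");
    }
});
context.start();

// the polled file content becomes the reply body of the direct route
ProducerTemplate template = context.createProducerTemplate();
String ret = template.requestBody("direct:grabOneFile", "", String.class);

// no always-on consumer is running, so stop the context when done
context.stop();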

The Camel FTP component can only poll directories.
You can use a combination of maxMessagesPerPoll and fileName, like:
from("ftp://.../xyz?maxMessagesPerPoll=x&fileName=y");
Also take a look at this; the link has examples regarding shutdown.
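On the shutdown side, a minimal sketch of the usual pattern for stopping a route from within the route itself, assuming Camel 2.x where stopRoute is on the CamelContext; the route id and target folder are illustrative:
from("ftp://.../xyz?maxMessagesPerPoll=x&fileName=y")
    .routeId("ftpRoute")
    .to("file:target/inbox")
    .process(exchange -> {
        CamelContext ctx = exchange.getContext();
        // stop the route from a separate thread so the current exchange can complete first
        new Thread(() -> {
            try {
                ctx.stopRoute("ftpRoute");   // or ctx.stop() to shut down the whole context
            } catch (Exception e) {
                // log in a real application
            }
        }).start();
    });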

Did you check this out at the bottom of the Camel FTP page?

Related

restarting a route initialized with File component does not poll the existing files again

Using JMX (the Java console), I am trying to restart a route with a file component consumer endpoint:
from("file:<some dir>?noop=true")
I am using the wiretap pattern to record the intermediate data transformations through other file endpoints.
On the first start of the Camel application, everything is fine, and all the files already present in the input directory are polled and processed.
But when I try to restart the route via JMX, nothing happens.
I tried manually removing the .camel directory (created, I guess, by the default FileIdempotentRepository) before restarting the route, in vain.
I also tried changing the kind of IdempotentRepository to a MemoryIdempotentRepository:
from("file:<somedir>?noop=true").idempotentConsumer(header("CamelFileName"), MemoryIdempotentRepository.memoryIdempotentRepository(1000))
Even if I trigger the clear() operation of this MemoryIdempotentRepository in the Java console before restarting the route, nothing is polled from the input directory after restarting.
If I add a file, it works. Everything behaves as if there were a persistent history of the files already polled once.
I wonder if the option noop=true creates an unmanaged idempotent repository that I cannot control with JMX. The documentation says:
If true, the file is not moved or deleted in any way. This option is good for readonly data, or for ETL type requirements. If noop=true, Camel will set idempotent=true as well, to avoid consuming the same files over and over again.
Any idea? (I am using camel-core 2.21.)
I found the solution to my issue.
I was misusing idempotentConsumer; I needed to configure the idempotent repository in the endpoint URI parameter list instead.
First, create an entry in a bean registry:
registry.put("myIdempotentRepository", MemoryIdempotentRepository.memoryIdempotentRepository(1000));
Then, refer to this idempotentRepository in the endpoint:
from("file:<somedire>noop=true&initialDelay=10&delay=1000&idempotentRepository=#myIdempotentRepository")
By doing this, GenericFileEndpoint:
will not create a default idempotentRepository
will add the idempotentRepository given in the endpoint options to the services of the Camel context, which means it can be managed through JMX
I think it would be useful to be able to manage the default idempotent repository in the FileEndpoint class of camel-core as well.
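Put together, a minimal sketch for Camel 2.x; the input directory and the log endpoint are placeholders:
SimpleRegistry registry = new SimpleRegistry();
registry.put("myIdempotentRepository", MemoryIdempotentRepository.memoryIdempotentRepository(1000));

CamelContext context = new DefaultCamelContext(registry);
context.addRoutes(new RouteBuilder() {
    @Override
    public void configure() {
        from("file:<somedir>?noop=true&initialDelay=10&delay=1000"
                + "&idempotentRepository=#myIdempotentRepository")
            .to("log:polled");   // illustrative downstream endpoint
    }
});
context.start();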

Apache Camel SFTP Consumer not deleting file after reading

I see a weird behavior with Apache Camel SFTP. Even after setting the delete=true attribute, it doesn't delete the file after receiving it. I am using version 3.0.0-M3 of camel-ftp.
Following is my SFTP configuration,
"sftp://<<HOST_NAME>>:<<PORT>>/<<PATH>>?username=<<USERNAME>>" +
"&password=<<PASSWORD>>" +
"&preferredAuthentications=password" +
"&readLock=changed" +
"&readLockMinAge=30000" +
"&delay=20000" +
"&delete=true";
Now Camel is able to read the file, but it doesn't delete the file after reading. Going through the docs, it says:
delete (consumer) -
If true, the file will be deleted after it is processed successfully.
How does Camel determine whether the file was processed successfully? Do we need to set any exchange property for Camel to mark it as processed successfully?
After receiving the file, all I am doing is passing it to another route, like the following:
from(endpointUri).to("direct:procesSftpFile");
Should I change it from direct to vm or seda?
Looks like nobody else has faced this issue, and I somehow figured out where this started happening.
The issue was not with the Camel SFTP component, but with the piece of code I was calling.
The second part of my flow looks like this:
from("direct:procesSftpFile")
.log("...")
// logging and other regular processing
....
// sending to vm InOnly
.to("vm:queue1?exchangePattern=InOnly")
.. some more processing..
.to("vm:queue2?exchangePattern=InOnly")
So the issue was with calling those queue1 and queue2 endpoints in the above snippet.
Commenting them out fixed it, and SFTP started deleting the files. For calling the VM endpoints, instead of to(), I used producerTemplate.asyncSend as a workaround (sketched below).
One thing I am still confused about: if we are using the InOnly exchange pattern, why does it affect the SFTP behavior? I should probably ask this in a separate question.
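A minimal sketch of that workaround, assuming the body is simply forwarded to the two VM queues; in real code you would create the ProducerTemplate once and reuse it rather than per exchange:
from("direct:procesSftpFile")
    .log("...")
    .process(exchange -> {
        ProducerTemplate template = exchange.getContext().createProducerTemplate();
        Object body = exchange.getIn().getBody();
        // fire-and-forget, so the SFTP exchange completes (and the remote file is deleted)
        // without waiting on the downstream VM consumers
        template.asyncSendBody("vm:queue1", body);
        template.asyncSendBody("vm:queue2", body);
    });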

pollEnrich with dynamic URI and its number of executions

I want to listen on an ActiveMQ topic based on the hostname of the system and some other logic. I planned to use pollEnrich for this, so I can evaluate my logic and provide the topic name to pollEnrich, but as per the documentation:
pollEnrich or enrich does not access any data from the current Exchange which means when polling it cannot use any of the existing headers you may have set on the Exchange. For example you cannot set a filename in the Exchange.FILE_NAME header and use pollEnrich to consume only that file. For that you must set the filename in the endpoint URI.
How can I work around this?
from("timer://ipc?repeatCount=1")
.. some logic..
.setHeader("topic_no",simple("{{env:HOSTNAME}}"))
.pollEnrich("mqtt:foo?host=tcp://0.0.0.0:1883&subscribeTopicNames=${header.topic_no}/status&clientId=ipc")
.to("log:my?showAll=true&multiline=true");
Please don't suggest using the hostname directly in the URI. As I highlighted, I have to compute other logic too.
What other option or way can I use?
Will pollEnrich keep listening on the topic, or will it listen once and end the route?
Update 1:
I figured out that we can use a simple expression for the dynamic URI, but one issue with pollEnrich is that it only picks up one message. How can I make sure it keeps listening as a consumer? I want the part before pollEnrich to execute once and the topic listener to keep listening for as long as the application is up.
Will pollEnrich keep listening on the topic, or will it listen once and end the route?
As you have already figured out, Camel's pollEnrich will listen on the topic and consume at most one message per call.
What other option or way can I use?
Option 1: repeat pollEnrich with a loop
Option 2: create a new route at run time with a RouteBuilder
Option 1 is naive but simple in concept: pollEnrich runs once and the loop repeats it. However, this method needs to handle more scenarios than you might expect.
Option 2 is a better approach: you create a route at run time and pass the consumer endpoint URI in a variable, so you can create the consumer route dynamically after your computation logic runs.
Example for routeBuilder
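Here is a minimal sketch of that approach, assuming the topic name is computed once at startup; the MQTT endpoint values are the ones from the question:
String topic = System.getenv("HOSTNAME");   // plus whatever other logic you need

camelContext.addRoutes(new RouteBuilder() {
    @Override
    public void configure() {
        from("mqtt:foo?host=tcp://0.0.0.0:1883&subscribeTopicNames=" + topic + "/status&clientId=ipc")
            .to("log:my?showAll=true&multiline=true");
    }
});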

Using camel as a directory watcher

For a week now I have been dealing with the Apache Camel framework. I wanted to use it as a directory watcher, and for new files it works fine. But if a file is deleted, Camel sends no event, so my application cannot be triggered to start the related action for deleted files, such as unregistering them in a database.
Therefore my question: is this possible to implement with Apache Camel, or is it recommended to use the FileWatcher from the JDK?
I do not think it is possible at the moment, but it looks interesting, so it would be nice if you open a JIRA so we won't forget.
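If you fall back to the JDK in the meantime, a minimal WatchService sketch that reacts to deletions; the directory path is a placeholder:
import java.nio.file.*;

public class DeleteWatcher {
    public static void main(String[] args) throws Exception {
        Path dir = Paths.get("/some/dir");
        WatchService watcher = FileSystems.getDefault().newWatchService();
        dir.register(watcher, StandardWatchEventKinds.ENTRY_DELETE);
        while (true) {
            WatchKey key = watcher.take();               // blocks until an event arrives
            for (WatchEvent<?> event : key.pollEvents()) {
                // event.context() is the relative path of the deleted file
                System.out.println("Deleted: " + event.context());
            }
            if (!key.reset()) {
                break;                                   // directory is no longer accessible
            }
        }
    }
}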

Throttling FTP Polling consumers using apache camel

I have a requirement where, at one point in time, I need to connect to multiple FTP/SFTP endpoints (say 100 FTP endpoints) to download files and process them.
I have a route like the one below. The SEDA queue further processes the messages by moving them into appropriate folders.
from("ftp://username#host/foldername?password=XXXXX&include=.*").to("seda:" + routeId)
Starting all the FTP endpoints at the same time results in JVM memory issues. How can I throttle the starting of the FTP endpoints? Can I use a SEDA queue before the FTP consumers to throttle them (if so, how)? Are there any other EIPs or ideas I could use to throttle the triggering of the polling FTP consumers?
You can look into the Throttler DSL if you want to throttle the fetching of the messages:
http://camel.apache.org/throttler.html
For controlling the startup, you can look into the SimpleScheduledRoutePolicy:
http://camel.apache.org/simplescheduledroutepolicy.html
It handles route activation and deactivation. I haven't used it myself, but it looks like you can add a controlled delay on when routes should start and stop, as sketched below.
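A minimal sketch, assuming camel-quartz2 is on the classpath; the start delay, route id, and SEDA endpoint are illustrative, and you would vary the start date per FTP route to stagger the consumers:
SimpleScheduledRoutePolicy policy = new SimpleScheduledRoutePolicy();
policy.setRouteStartDate(new Date(System.currentTimeMillis() + 60000)); // start one minute after boot
policy.setRouteStartRepeatCount(1);
policy.setRouteStartRepeatInterval(3000);

from("ftp://username#host/foldername?password=XXXXX&include=.*")
    .routeId("ftpRoute1")
    .routePolicy(policy)
    .noAutoStartup()        // let the policy start the route instead of the context
    .to("seda:ftpRoute1");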
I have had this problem in the past and solved it using cron in the following way:
from("ftp://username#host/foldername?password=XXXXX&include=.*&scheduler=quartz2&scheduler.cron=0/2+*+*+*+*+?")
You can set up every FTP consumer to poll at a different time (say, with a one-minute difference).
If you decide to go down this path, you can use the following website to construct your crons easily:
http://www.cronmaker.com/
Hope this helps.
R.
