camel file sortBy and poll failed - apache-camel

I have this endpoint:
<from
uri="file://{{incomingFileBaseFolder}}?filter=#fileFilter&recursive=true&readLock=changed&move=${file:parent}/.backup/${date:now:yyyy}/backup_${exchangeId}_${file:onlyname.noext}.${file:name.ext}&sortBy=file:modified&delay={{incomingFileDelay}}" />
So it is sorted by file:modified.
My question is: what happens when the poll fails? Will the next poll move on to the next file in the directory, or will it stay with the failed file?

By default it will leave the failed file in the directory and proceed with the rest. But it's better to define the moveFailed URI option to specify a "failed" directory. For more info on moveFailed, take a look at the documentation.
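For example, a minimal sketch in Java DSL of the same endpoint with a moveFailed option added (the .failed directory and the direct:process endpoint are illustrative, not part of the original route):
// Failed exchanges leave the file in place by default; moveFailed moves them aside instead.
from("file://{{incomingFileBaseFolder}}"
        + "?filter=#fileFilter&recursive=true&readLock=changed&sortBy=file:modified"
        + "&delay={{incomingFileDelay}}"
        + "&move=${file:parent}/.backup/${date:now:yyyy}/backup_${exchangeId}_${file:onlyname.noext}.${file:name.ext}"
        + "&moveFailed=.failed")
    .to("direct:process"); // hypothetical downstream endpoint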

Related

Apache Camel doneFileName with changing name

I'm currently creating some routes, and for one of them I have a problem.
Usually I have a data file and then a done file with the same name prefixed by "ACK", and this works perfectly with Camel and the doneFileName option.
But for one of my routes I have to deal with a different situation: I still receive two files, but they follow the same naming pattern, like MyFILE-{{timestamp}}. The data file contains the data, and the done file contains just "done".
So I need something to check the content of the file, and if it's just "done", then process the other file.
Is there a way to handle this with Camel?
The most pragmatic solution I see is to write an "adapter script" (bash or whatever you have at your disposal) that peeks into every file with a timestamp in its name.
If the file content is "done":
Look up the other "MyFILE-{{timestamp}}" (the data file) and rename it to "MyFILE"
Rename the done file to "MyFILE.done"
Camel can then import the data file using the standard done-file option. Because both files are renamed to something without a timestamp, the peek script ignores them after renaming.
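A rough sketch of such an adapter in Java (the MyFILE- naming comes from the question; the directory argument and everything else is illustrative):
import java.io.IOException;
import java.nio.file.*;

public class DonePeeker {
    public static void main(String[] args) throws IOException {
        Path dir = Paths.get(args[0]); // incoming directory, passed as an argument
        Path doneFile = null;
        Path dataFile = null;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, "MyFILE-*")) {
            for (Path file : files) {
                // Peek into each timestamped file; the marker file contains just "done".
                if ("done".equals(new String(Files.readAllBytes(file)).trim())) {
                    doneFile = file;
                } else {
                    dataFile = file;
                }
            }
        }
        if (doneFile != null && dataFile != null) {
            // Rename both to timestamp-free names so Camel's doneFileName option picks them up
            // and this adapter ignores them on the next run.
            Files.move(dataFile, dir.resolve("MyFILE"), StandardCopyOption.REPLACE_EXISTING);
            Files.move(doneFile, dir.resolve("MyFILE.done"), StandardCopyOption.REPLACE_EXISTING);
        }
    }
}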

camel unpacking tar.gzip files

After downloading several files with Camel over FTP I have to process them, but they are in tar.gz format. Camel supports gzip and, as far as I can see, also tar from 2.16.0 onwards (http://camel.apache.org/camel-2160-release.html).
The code I have for extracting the gzip:
from("file:modelFiles?readLock=changed&recursive=true&consumer.delay=1000")
.unmarshal(new ZipFileDataFormat())
.choice()
.when(body().isNotNull())
.log("Uziping file ${file:name}.")
.to("file:modelFiles_unzipped")
.endChoice()
.end();
All the files run through the route, but they are written out as .tar.gz again, and worse, the content becomes corrupt, so they cannot even be opened afterwards as gzip archives.
Questions:
How should I unpack the gzip archives?
How could I do the same for the tar files?
Update 1:
Thanks for the post, Jeremie. I changed the code as proposed:
from("file:modelFilesSBML2?readLock=changed&recursive=true&consumer.delay=1000")
.unmarshal().gzip()
.split(new TarSplitter())
.to("file:modelFilesSBML_unzipped");
Then I receive the following exception (just for info, the tar.gz files are not zero length):
2016-03-22 14:11:47,950 [ERROR|org.apache.camel.processor.DefaultErrorHandler|MarkerIgnoringBase] Failed delivery for (MessageId: ID-JOY-49807-1458652278822-0-592 on ExchangeId: ID-JOY-49807-1458652278822-0-591). Exhausted after delivery attempt: 1 caught: org.apache.camel.component.file.GenericFileOperationFailedException: Cannot write null body to file: modelFilesSBML_unzipped\2006-01-31\BioModels_Database-r4-sbml_files.tar.gz
Solution:
After trying many approaches, I am finally using it as follows (with Camel 2.17.0; it did not work with 2.16.0 or 2.16.1):
from("file:modelFilesSBML?noop=true&recursive=true&consumer.delay=1000" )
.unmarshal().gzip()
.split(new TarSplitter())
.to("log:tar.gzip?level=INFO&showHeaders=true")
.choice()
.when(body().isNotNull())
.log("### Extracting file: ${file:name}.")
.to("file:modelFilesSBML_unzipped?fileName=${in.header.CamelFileRelativePath}_${file:name}")
.endChoice()
.end();
With Camel 2.17.0 you can also skip the body().isNotNull() check.
Jeremie's proposal helped a lot, so I will accept his answer as the solution. Nevertheless, the exception would still occur if I did not check the message body for null. The fileName=${in.header.CamelFileRelativePath}_${file:name} expression also keeps the original file structure, where the file name is prefixed with the source .tar.gz path, but I have not found any other way to preserve the directory structure, since the file endpoint does not accept expressions for the directory part of "file:directory?options...".
You can use the camel-tarfile component.
If your tar.gz contains multiple files, you should un-gzip it, then untar and split the exchange for each file. TarSplitter is an expression that splits a tar into an iterator over the files contained in the tar.
from("file:target/from")
.unmarshal().gzip()
.split(new TarSplitter())
.to("file:target/to");

hadoop write file and put in Distributed cache

I have a requirement to create a file dynamically, based on content in the Hadoop job.properties, and then put it in the distributed cache.
When I create the file I see that it is created under "/tmp".
I create a symbolic name and refer to this file in the cache. Now, when I try to read the file from the distributed cache, I am not able to access it. I get the error: Caused by: java.io.FileNotFoundException: Requested file /tmp/myfile6425152127496245866.txt does not exist.
Can you please let me know whether I need to specify the path while creating the file and also use that path while accessing/reading the file?
I only need the file to be available while the job is running.
I don't really get your meaning of
I only need the file to be available while the job is running
But when I use the distributed cache in practice, I use a path like this:
final String NAME_NODE = "hdfs://sandbox.hortonworks.com:8020";
job.addCacheFile(new URI(NAME_NODE + "/user/hue/users/users.dat"));
Hope this helps.
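For reading the file back on the task side, a minimal sketch of a mapper's setup() (assuming the Hadoop 2.x API and that the file was added with job.addCacheFile as above; the class name and key/value types are illustrative):
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.net.URI;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class CacheAwareMapper extends Mapper<LongWritable, Text, Text, Text> {
    @Override
    protected void setup(Context context) throws IOException, InterruptedException {
        URI[] cacheFiles = context.getCacheFiles();
        if (cacheFiles != null && cacheFiles.length > 0) {
            // The cached file is localized into the task's working directory,
            // so open it by its local name rather than the original HDFS path.
            String localName = new Path(cacheFiles[0].getPath()).getName();
            try (BufferedReader reader = new BufferedReader(new FileReader(localName))) {
                String line;
                while ((line = reader.readLine()) != null) {
                    // use the cached content here
                }
            }
        }
    }
}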

Apache camel multiple file processing with exec

I am having trouble fixing this simple route; I get an exception right after the exec step. It seems like exec is acting as a producer and overwriting the file.
Exception:
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot store file: C:\camel_tests\stage\Downloads.rar
Route:
The home directory will have a rar file with images that should be extracted with winrar.exe; each file in the rar is processed, and eventually moved to the arch directory once this route is done. The last successful stage is extracting the files into the stage directory.
Here CMD_EXPLODE = "\"C:/Program Files/WinRAR/WinRAR.exe\"";
from("file://C:/camel_tests/home?fileName=Downloads.rar&preMove=//C:/camel_tests/stage")
.to("exec:"+Consts.CMD_EXPLODE+"?args=e Downloads.rar&workingDir=C:/camel_tests/stage&outFile=decompress_output.txt")
.to("file://C:/camel_tests/stage?exclude=.*.rar")
.process(new PrintFiles())
.to("file://C:/camel_tests/stage?fileName=Downloads.rar&move=//C:/camel_tests/arch").end();
You should split that into two routes: the first does the from -> exec, and the second does from -> process -> to. The second will then process each of the files extracted by WinRAR.
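A rough sketch of that split, reusing the endpoints from the question (all option values come from the original route; the final target of the second route is illustrative, and moving the original rar to the arch directory is left out here):
// Route 1: pick up the archive and let WinRAR extract it into the stage directory.
from("file://C:/camel_tests/home?fileName=Downloads.rar&preMove=//C:/camel_tests/stage")
    .to("exec:" + Consts.CMD_EXPLODE
        + "?args=e Downloads.rar&workingDir=C:/camel_tests/stage&outFile=decompress_output.txt");

// Route 2: consume each extracted file (excluding the rar itself), process it,
// and write it on to a target of your choice.
from("file://C:/camel_tests/stage?exclude=.*.rar")
    .process(new PrintFiles())
    .to("file://C:/camel_tests/arch");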

how to move file to error directory and rollback other move with apache camel?

First of all: I'm a Camel newbie :-)
I want to transfer a file from an input directory to an output directory and do some Java stuff.
If something goes wrong, I want to move the file to an error directory and roll back the move to the output directory.
This is my route in java dsl:
onException(Exception.class).handled(true).to("file://C:/temp/camel/error");
from("file://C:/temp/camel/in?delete=true").to("file://C:/temp/camel/out").bean(ServiceBean.class, "callWebservice");
If an error is thrown in the ServiceBean, the file is copied to the error folder, but it also stays in the out directory.
What is the best way to rollback?
Thanks
There is a moveFailed option. Just use that; then you don't need the onException etc.
http://camel.apache.org/file2
from("file://C:/temp/camel/in?delete=true&moveFailed=C:/temp/camel/error")
.to("file://C:/temp/camel/out")
.bean(ServiceBean.class, "callWebservice");
And instead of storing to out in the route, just use the move option, so it becomes:
from("file://C:/temp/camel/in?move=/temp/camel/out&moveFailed=/temp/camel/error")
.bean(ServiceBean.class, "callWebservice");
I don't think you can easily 'roll back' filesystem operations. Perhaps you could redesign your flow to first copy the file to an intermediary 'stage' directory, do the work you need, and, depending on the outcome of that work, move the file to either the 'output' or the 'error' directory, as in the sketch below.
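A rough sketch of that staged layout in Camel (directory names follow the question; the exact paths are illustrative):
// First hop: just stage the incoming file, no processing yet.
from("file://C:/temp/camel/in?move=C:/temp/camel/stage/${file:name}")
    .log("staged ${file:name}");

// Second hop: do the work from the stage directory; Camel moves the file to out
// on success and to error when the bean throws.
from("file://C:/temp/camel/stage?move=C:/temp/camel/out&moveFailed=C:/temp/camel/error")
    .bean(ServiceBean.class, "callWebservice");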
