Files are moved to Completed folder even though the processor throws an exception - apache-camel

I have a weird issue in the code below. I grab a file from a file path and send it to a downstream system. The problem is that when the actual processor class throws an exception, the file is still moved from the InProgress folder and then to the Completed folder. It should not work like this: the file should remain in the same location (InProgress) until the processor processes it successfully. What am I doing wrong here?
from("file:src/path?preMove=InProcess&move=Completed&delete=false&idempotent=false&idempotentKey=${file:name}-${file:size}&readLock=rename&readLockMinAge=10s&renameUsingCopy=true&filterFile=$simple{file:size} > 0 &include=.*.xml")
    .convertBodyTo(String.class)
    .routeId("test-route")
    .doTry()
        .bean(ActualProcessor.class, "process")
    .doCatch(Throwable.class)
        .to("file:src/path?move=Reprocess")
    .end();
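Note that because doCatch handles the exception, the exchange completes successfully from the file consumer's point of view, so the consumer's move=Completed still applies. A minimal sketch of an alternative, assuming Reprocess remains the desired failure folder: drop the doTry/doCatch, let the exception propagate, and use the consumer's moveFailed option instead:

```java
from("file:src/path?preMove=InProcess&move=Completed&moveFailed=Reprocess"
        + "&readLock=rename&readLockMinAge=10s&renameUsingCopy=true&include=.*.xml")
    .convertBodyTo(String.class)
    .routeId("test-route")
    // No doTry/doCatch: a thrown exception marks the exchange as failed,
    // so the consumer applies moveFailed instead of move.
    .bean(ActualProcessor.class, "process");
```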

Related

"Filenotfound errno 2" but file is actually there

I've got a folder of images and want to iterate through them so I can write data corresponding to each one to a CSV file. The image directory contains four images, but an error is thrown at the third image ('Loss Label_0_2.tif'). Note that the directory containing all the images is a folder called DS_Test (see dataset_dir). But, judging by the error I get, it seems the lookup starts in the parent directory (CNN) rather than in CNN/DS_Test, the path specified in dataset_dir.
Could anyone tell me what mistake I'm making?
I took the parent directories above CNN out of the path for privacy reasons. Apologies if this question is a bit of an information overload.
dataset_dir = 'directorypath/CNN/DS_Test/'
with open('csvfilepath/Labels.csv', 'w') as file:
    writer = csv.writer(file)
    for image in os.listdir(dataset_dir):
        if image.endswith('.tif'):
            # ... series of file operations ...
            writer.writerow(data)
file.close()
Error: "FileNotFoundError: [Errno 2] No such file or directory: 'directorypath/CNN/Loss Label_0_2.tif'"

Flink produces out file in log folder but does not print anything

I am using Flink local mode with parallelism = 1.
In my Flink code, I have tried to print the incoming source using:
DataStream<String> ds = env.addSource(source);
ds.print();
In my local Flink_dir/log folder, I could see that an xxx.out file has been created, but nothing was printed into the file. Is there any config that I might have overlooked? I am sure that my source data contains text, as I have managed to add the data to the sink successfully. Thanks!
ds.print will write to stdout and not to a file. ${flink_dir}/log contains only the logs of your task and/or job manager.

camel unpacking tar.gzip files

After downloading several files with Camel over FTP, I need to process them, but they are in tar.gzip format. Camel supports gzip and, as far as I can see, also tar from 2.16.0 onwards (http://camel.apache.org/camel-2160-release.html).
The code I have for extracting the gzip:
from("file:modelFiles?readLock=changed&recursive=true&consumer.delay=1000")
    .unmarshal(new ZipFileDataFormat())
    .choice()
        .when(body().isNotNull())
            .log("Unzipping file ${file:name}.")
            .to("file:modelFiles_unzipped")
    .endChoice()
    .end();
All the files run through the route, but they are written out as .tar.gz again; worse, the content also becomes corrupt, so they cannot even be opened afterwards as gzip archives.
Questions:
1. How should I unpack the gzip archives?
2. How could I do the same for the tar files?
Update 1:
Thanks for the post, Jeremie. I changed the code as proposed:
from("file:modelFilesSBML2?readLock=changed&recursive=true&consumer.delay=1000")
    .unmarshal().gzip()
    .split(new TarSplitter())
    .to("file:modelFilesSBML_unzipped");
Then I receive the following exception (just for info: the tar.gzip files are not of zero length): FailedException: Cannot write null body to file: modelFilesSBML_unzipped\2006-01-31\BioModels_Database-r4-sbml_files.tar.gz:
2016-03-22 14:11:47,950 [ERROR|org.apache.camel.processor.DefaultErrorHandler|MarkerIgnoringBase] Failed delivery for (MessageId: ID-JOY-49807-1458652278822-0-592 on ExchangeId: ID-JOY-49807-1458652278822-0-591). Exhausted after delivery attempt: 1 caught: org.apache.camel.component.file.GenericFileOperationFailedException: Cannot write null body to file: modelFilesSBML_unzipped\2006-01-31\BioModels_Database-r4-sbml_files.tar.gz
Solution:
After trying many approaches, I finally ended up with the following (it works with Camel 2.17.0; it did not work with 2.16.0 or 2.16.1):
from("file:modelFilesSBML?noop=true&recursive=true&consumer.delay=1000")
    .unmarshal().gzip()
    .split(new TarSplitter())
    .to("log:tar.gzip?level=INFO&showHeaders=true")
    .choice()
        .when(body().isNotNull())
            .log("### Extracting file: ${file:name}.")
            .to("file:modelFilesSBML_unzipped?fileName=${in.header.CamelFileRelativePath}_${file:name}")
    .endChoice()
    .end();
With Camel 2.17.0 you can also skip the body().isNotNull() check.
Jeremie's proposal helped a lot, so I will accept his answer as the solution. Nevertheless, the exception would still occur if I did not check the message body for null. The fileName=${in.header.CamelFileRelativePath}_${file:name} expression also keeps the original file structure, where each file name is prefixed with the name of its tar.gz; I have not found any other way to preserve the directory structure, as the file endpoint does not accept expressions for the directory part in ("file:directory?options...").
You can use the camel-tarfile component.
If your tar.gz contains multiple files, you should ungzip, then untar and split the exchange for each file. TarSplitter is an expression that splits a tar into an iterator over each file contained in it.
from("file:target/from")
    .unmarshal().gzip()
    .split(new TarSplitter())
    .to("file:target/to");

Apache camel multiple file processing with exec

I am having trouble fixing this simple route; I get an exception right after the exec step. It seems like exec is acting as a producer and overwriting the file.
Exception:
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot store file: C:\camel_tests\stage\Downloads.rar
Route:
The home directory will contain a rar file with images, which should be extracted with WinRAR.exe; each file in the rar is then processed, and the rar is eventually moved to the arch directory once the route is done. The last stage that succeeds is extracting the files into the stage directory.
Here CMD_EXPLODE = "\"C:/Program Files/WinRAR/WinRAR.exe\"";
from("file://C:/camel_tests/home?fileName=Downloads.rar&preMove=//C:/camel_tests/stage")
    .to("exec:" + Consts.CMD_EXPLODE + "?args=e Downloads.rar&workingDir=C:/camel_tests/stage&outFile=decompress_output.txt")
    .to("file://C:/camel_tests/stage?exclude=.*.rar")
    .process(new PrintFiles())
    .to("file://C:/camel_tests/stage?fileName=Downloads.rar&move=//C:/camel_tests/arch")
    .end();
You should split that into two routes: a first route that does the from -> exec, and a second that does from -> process -> to.
The second route will then process each of the files extracted by WinRAR.
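A minimal sketch of that two-route split, reusing the directories, Consts.CMD_EXPLODE, and PrintFiles from the question (untested; treat the endpoint options as a starting point):

```java
// Route 1: pick up the archive, stage it, and let WinRAR extract it.
from("file://C:/camel_tests/home?fileName=Downloads.rar&preMove=//C:/camel_tests/stage")
    .to("exec:" + Consts.CMD_EXPLODE
        + "?args=e Downloads.rar&workingDir=C:/camel_tests/stage&outFile=decompress_output.txt");

// Route 2: consume each extracted (non-rar) file, process it, and archive it.
from("file://C:/camel_tests/stage?exclude=.*.rar")
    .process(new PrintFiles())
    .to("file://C:/camel_tests/arch");
```

This keeps the exec producer out of the consumer loop for the extracted files, which avoids the exec step re-running per file.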

how to move file to error directory and rollback other move with apache camel?

First of all: I'm a Camel newbie :-)
I want to transfer a file from an input directory to an output directory and do some Java work.
If something goes wrong, I want to move the file to an error directory and roll back the move to the output directory.
This is my route in java dsl:
onException(Exception.class).handled(true).to("file://C:/temp/camel/error");
from("file://C:/temp/camel/in?delete=true")
    .to("file://C:/temp/camel/out")
    .bean(ServiceBean.class, "callWebservice");
If an error is thrown in the ServiceBean, the file is copied to the error folder, but it also stays in the out directory.
What is the best way to roll back?
Thanks
There is a moveFailed option. Just use that; then you don't need the onException etc.
http://camel.apache.org/file2
from("file://C:/temp/camel/in?delete=true&moveFailed=C:/temp/camel/error")
    .to("file://C:/temp/camel/out")
    .bean(ServiceBean.class, "callWebservice");
And instead of storing to out in the route, just use the move option, so it becomes:
from("file://C:/temp/camel/in?move=/temp/camel/out&moveFailed=/temp/camel/error")
.bean(ServiceBean.class, "callWebservice");
I don't think you can easily 'roll back' filesystem operations. Perhaps you could redesign your flow to first copy the file to an intermediary 'stage' directory, do the work you need, and, depending on the outcome of that work, move the file to either the 'output' or the 'error' directory.
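That staged flow maps directly onto the file component's preMove/move/moveFailed options; a minimal sketch, assuming the same directories as the question:

```java
// preMove stages the file before processing; on success it is moved to
// 'out' (move), on failure to 'error' (moveFailed) - no manual rollback needed.
from("file://C:/temp/camel/in?preMove=stage&move=C:/temp/camel/out&moveFailed=C:/temp/camel/error")
    .bean(ServiceBean.class, "callWebservice");
```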
