Apache Camel multiple file processing with exec

I am having trouble fixing this simple route: I get an exception right after the exec step. It seems the exec endpoint is acting as a producer and overwriting the file.
Exception:
org.apache.camel.component.file.GenericFileOperationFailedException: Cannot store file: C:\camel_tests\stage\Downloads.rar
Route:
The home directory will contain a RAR file with images, which should be extracted with WinRAR.exe. Each file in the RAR is then processed, and the archive is eventually moved to the arch directory once the route is done. The last successful stage is extracting the files into the stage directory.
Here CMD_EXPLODE = "\"C:/Program Files/WinRAR/WinRAR.exe\"";
from("file://C:/camel_tests/home?fileName=Downloads.rar&preMove=//C:/camel_tests/stage")
.to("exec:"+Consts.CMD_EXPLODE+"?args=e Downloads.rar&workingDir=C:/camel_tests/stage&outFile=decompress_output.txt")
.to("file://C:/camel_tests/stage?exclude=.*.rar")
.process(new PrintFiles())
.to("file://C:/camel_tests/stage?fileName=Downloads.rar&move=//C:/camel_tests/arch").end();

You should split that into two routes. The first does from -> exec.
The second does from -> process -> to.
The second route will then process each of the files extracted by WinRAR.

Related

Files are moved to Completed folder even though the processor throws an exception

I have a weird issue with the code below. It picks up a file from a file path and sends it to a downstream system. The issue is that when the actual processor class throws an exception, the file is still moved from the in-progress folder to the completed folder. That should not happen. I want the file to remain in the same location (the in-progress folder) until the processor has processed it successfully. What am I doing wrong here?
'''
from("file:src/path?preMove=InProcess&move=Completed&delete=false&idempotent=false&idempotentKey=${file:name}-${file:size}&readLock=rename&readLockMinAge=10s&renameUsingCopy=true&filterFile=$simple{file:size} > 0 &include=.*.xml")
.convertBodyTo(String.class)
.routeId("test-route")
.doTry().bean(ActualProcessor.class, "process")
.doCatch(Throwable.class)
.to("file:src/path?move=Reprocess")
.end();
'''

CMD how to write merged text file name on first line

I have created a cmd batch file to run several R scripts, and each script has its own log file in the same folder, like below:
coc_prod_xgb.log
ds_prod_xgb.log
ccpa_prod_xgb.log
pletb_prod_xgb.log
and many more
Then I merge all the log files into one log file:
copy *.log all_log.log
The problem is that errors sometimes occur in different jobs, so I need to know which log an error came from. Currently I have to open each log file one by one, because in the merged log file I can't tell which log file contains the error.
How can I modify the copy command above so that it writes the file name on the first row, then the log contents on the following rows, and repeats the same for each subsequent file?
I couldn't find any option for the cmd copy command to write the file name, so I found another solution in this question:
Easiest way to add a text to the beginning of another text file in Command Line (Windows)
It serves my needs, even though I have to write a block of code for each log file, but that is only a one-time effort.
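As an alternative to writing one block per log file, the merge can be scripted. The following is a minimal sketch in Python rather than cmd, assuming Python is available on the machine; the function name merge_logs and the "=====" header format are hypothetical choices for illustration, not something from the question.

```python
from pathlib import Path


def merge_logs(log_dir, merged_name="all_log.log"):
    """Concatenate every *.log file in log_dir into merged_name,
    writing each source file's name on its own line before its content."""
    log_dir = Path(log_dir)
    merged_path = log_dir / merged_name
    # Collect the sources first and skip the merged file itself,
    # so a previous merge result is never read back in.
    sources = sorted(p for p in log_dir.glob("*.log") if p.name != merged_name)
    with merged_path.open("w", encoding="utf-8") as out:
        for src in sources:
            # Header line (hypothetical format) identifying the source log.
            out.write(f"===== {src.name} =====\n")
            text = src.read_text(encoding="utf-8", errors="replace")
            out.write(text)
            # Make sure the next header starts on a fresh line.
            if text and not text.endswith("\n"):
                out.write("\n")
    return merged_path
```

Calling merge_logs(".") from the folder that holds the logs would produce all_log.log with one header line per source file, so an error can be traced back to the log it came from.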

"Filenotfound errno 2" but file is actually there

I've got a folder of images and want to iterate through them so I can write data corresponding with each to a csv file. The image directory contains four images but an error is thrown at the third image ('Loss Label_0_2.tif'). Note the directory containing all images is in a folder called DS_Test (see dataset_dir). But, with the error I get, it seems it starts searching in the parent directory (CNN) rather than CNN/Dataset_dir, specified in dataset_dir.
Could anyone tell what mistake I'm making?
I took the parent directories above CNN out of the path for privacy reasons. Apologies if this question is a bit of an overload of info.
dataset_dir = 'directorypath/CNN/DS_Test/'
with open('csvfilepath/Labels.csv', 'w') as file:
    writer = csv.writer(file)
    for image in os.listdir(dataset_dir):
        if image.endswith('.tif'):
            *
            series of file operations
            *
            writer.writerow(data)
file.close()
Error: "FileNotFoundError: [Errno 2] No such file or directory: 'directorypath/CNN/Loss Label_0_2.tif'"
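The elided file operations are not shown, so the actual cause cannot be confirmed from the question. One common source of this kind of error is that os.listdir returns bare file names, which then get opened relative to some other directory. A minimal sketch, assuming that is what happens here: always join each name to dataset_dir before touching the file. The function write_labels and the file-size column are hypothetical stand-ins for the real per-image work.

```python
import csv
import os


def write_labels(dataset_dir, csv_path):
    """Write one CSV row per .tif file, resolving every name against dataset_dir."""
    rows = []
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        for name in sorted(os.listdir(dataset_dir)):
            if not name.endswith(".tif"):
                continue
            # os.listdir returns bare file names, so build the full path explicitly.
            full_path = os.path.join(dataset_dir, name)
            # Placeholder for the real per-image operations: record the file size.
            row = [name, os.path.getsize(full_path)]
            writer.writerow(row)
            rows.append(row)
    return rows
```

Because the path is built with os.path.join, the lookup does not depend on the current working directory or on whether dataset_dir ends with a slash.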

camel unpacking tar.gzip files

After downloading several files with Camel over FTP, I need to process them, but they are in tar.gzip format. Camel supports gzip and, as far as I can see, also a tar endpoint from 2.16.0 onwards (http://camel.apache.org/camel-2160-release.html).
The code I have for extracting the gzip:
from("file:modelFiles?readLock=changed&recursive=true&consumer.delay=1000")
.unmarshal(new ZipFileDataFormat())
.choice()
.when(body().isNotNull())
.log("Uziping file ${file:name}.")
.to("file:modelFiles_unzipped")
.endChoice()
.end();
All the files run through the route, but they are created as .tar.gz again. Worse, the content also becomes corrupt, so they cannot even be opened afterwards as gzip archives.
Questions:
How should I unpack the gzip archives?
How could I do the same for the tar files?
Update 1:
Thanks for the post Jeremie. I changed the code like this as proposed:
from("file:modelFilesSBML2?readLock=changed&recursive=true&consumer.delay=1000")
.unmarshal().gzip()
.split(new TarSplitter())
.to("file:modelFilesSBML_unzipped");
Then I receive the following exception (just for info, the tar.gzip files are not of zero length): GenericFileOperationFailedException: Cannot write null body to file: modelFilesSBML_unzipped\2006-01-31\BioModels_Database-r4-sbml_files.tar.gz:
2016-03-22 14:11:47,950 [ERROR|org.apache.camel.processor.DefaultErrorHandler|MarkerIgnoringBase] Failed delivery for (MessageId: ID-JOY-49807-1458652278822-0-592 on ExchangeId: ID-JOY-49807-1458652278822-0-591). Exhausted after delivery attempt: 1 caught: org.apache.camel.component.file.GenericFileOperationFailedException: Cannot write null body to file: modelFilesSBML_unzipped\2006-01-31\BioModels_Database-r4-sbml_files.tar.gz
Solution:
After trying many approaches, I finally use it as follows (with Camel 2.17.0; it did not work with 2.16.0 or 2.16.1):
from("file:modelFilesSBML?noop=true&recursive=true&consumer.delay=1000" )
.unmarshal().gzip()
.split(new TarSplitter())
.to("log:tar.gzip?level=INFO&showHeaders=true")
.choice()
.when(body().isNotNull())
.log("### Extracting file: ${file:name}.")
.to("file:modelFilesSBML_unzipped?fileName=${in.header.CamelFileRelativePath}_${file:name}")
.endChoice()
.end();
With Camel 2.17.0 you can also skip the body().isNotNull() check.
Jeremie's proposal helped a lot, so I will accept his answer as the solution. Nevertheless, the exception would still occur if I did not check the message body for null. The fileName=${in.header.CamelFileRelativePath}_${file:name} option also keeps the original file structure, with each file name prefixed by the name of the tar.gz file, but I have not found any other way to preserve the directory structure, as the file endpoint does not accept expressions for the directory in ("file:directory?options...").
You can use the camel-tarfile component.
If your tar.gz contains multiple files, you should gunzip it, then untar it and split the exchange for each file. The TarSplitter is an expression which splits a tar into an iterator over the files contained in the tar.
from("file:target/from")
.unmarshal().gzip()
.split(new TarSplitter())
.to("file:target/to");

Connect Direct Multiple files, one Node.

I'm working on a project that requires sending multiple files to the same node. The files are available for sending at the same time, and I created a simple C.D shell script to send them. I call this script in a loop to send all the files (about 20) at the same time.
In my script I intend to delete each file within the loop, after the CD script is called. However, a colleague told me that the files may not be sent on the spot but rather put in a queue for transmission at a later stage if the C.D node is busy, and hence deleting the files would cause errors.
Can someone advise whether this is the case? Are the files not physically copied even if they are put in a queue?
I find it weird that the CD script would complete with a successful return code and give me the process number, and yet I still cannot delete the file.
Thanks,
Sergio
You can use an if statement for each file: only if the exit code of the copy step for that file is 0 is the file deleted, and CD then moves on to the step that copies the next file.
if (step01=0) then
run task (Do something)
sysopts="rm filename"
