Camel ZipFileDataFormat splitting doesn't set the headers when streaming - apache-camel

I have a simple route which polls zip files from FTP server. The zip file consists of one file that needs processing and zero or more attachments.
I am trying to use ZipFileDataFormat for splitting and I'm able to split and route the items as desired i.e. send the processing file to the processor and other files to the aggregator endpoint.
The route looks like below:
from("sftp://username@server/folder/path?password=password&delay=600000")
.unmarshal(getZipFileDataFormat()).split(body(Iterator.class)).streaming()
.log("CamelSplitComplete :: ${header.CamelSplitComplete}")
.log("Split Size :: ${header.CamelSplitSize}")
.choice()
.when(header(MyConstants.CAMEL_FILE_NAME_HEADER).contains(".json"))
.to(JSON_ENDPOINT).endChoice()
.otherwise()
.to(AGGREGATOR_ENDPOINT)
.endChoice()
.end();
The getZipFileDataFormat method:
private ZipFileDataFormat getZipFileDataFormat() {
ZipFileDataFormat zipFile = new ZipFileDataFormat();
zipFile.setUsingIterator(true);
return zipFile;
}
The splitting works fine. However, the logs show that the two headers CamelSplitComplete and CamelSplitSize are not set correctly: CamelSplitComplete is always false, and CamelSplitSize has no value.
Because of this, I am not able to aggregate based on the size. I am using eagerCheckCompletion() for getting the input exchange in the aggregator route. My aggregator route looks like below.
from(AGGREGATOR_ENDPOINT).aggregate(new ZipAggregationStrategy()).constant(true)
.eagerCheckCompletion().completionSize(header("CamelSplitSize")).to("file:///tmp/").end();
I read in the Apache Camel documentation that these headers are always set. Am I missing anything here? Any pointer in the right direction would be very helpful.

I was able to get the whole route to work. I had to add a sort of pre-processor which would set some essential headers (Outgoing file name and file count of the zip) I'd require for aggregation.
from("sftp://username@server/folder/path?password=password&delay=600000").to("file:///tmp/")
.beanRef("headerProcessor").unmarshal(getZipFileDataFormat())
.split(body(Iterator.class)).streaming()
.choice()
.when(header(Exchange.FILE_NAME).contains(".json"))
.to(JSON_ENDPOINT).endChoice()
.otherwise()
.to(AGGREGATOR_ENDPOINT)
.endChoice()
.end();
After that, the zip aggregation strategy worked as expected. Here is the aggregation route, for completeness.
from(AGGREGATOR_ENDPOINT)
.aggregate(header(MyConstants.HEADER_OUTGOING_FILE_NAME), new ZipAggregationStrategy())
.eagerCheckCompletion().completionSize(header(MyConstants.HEADER_TOTAL_FILE_COUNT))
.setHeader(Exchange.FILE_NAME, simple("${header.outgoingFileName}"))
.to("file:///tmp/").end();
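The headerProcessor bean mentioned in this answer is not shown. As a minimal sketch of the counting part, assuming the zip content is available as a byte array, the entries could be counted with the JDK's ZipInputStream. ZipEntryCounter, countFiles and sampleZip are hypothetical names, not taken from the post. In the route, the result would be set as the header that completionSize reads; whether the .json processing file should be subtracted depends on which entries actually reach the aggregator.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper; class and method names are illustrative only. */
final class ZipEntryCounter {

    private ZipEntryCounter() {
    }

    /** Counts the non-directory entries of a zip archive held in memory. */
    static int countFiles(byte[] zipBytes) {
        int count = 0;
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(zipBytes))) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory()) {
                    count++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return count;
    }

    /** Builds a small archive in memory so the counter can be tried without an FTP server. */
    static byte[] sampleZip(String... fileNames) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(buffer)) {
            for (String name : fileNames) {
                zip.putNextEntry(new ZipEntry(name));
                zip.write(name.getBytes(StandardCharsets.UTF_8));
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

Counting up front means the archive is read twice (once to count, once to split), which is the price of knowing the total before the streaming split starts.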

Related

Uploading a file to Camel Rest route

I'm trying to upload a file using multipart/form-data to a Camel route.
All is good, however, I can't get the original file name.
Camel version is: 3.14.1
Update
With the following modification to the route, I managed to process binary files (getting the file name and storing them). However, with text files, the boundary footer is appended to the file:
------WebKitFormBoundary7BH9nQ2RqDXvTRAJ--
The route definition:
rest("/v1/file-upload-form")
.post()
.consumes(MediaType.MULTIPART_FORM_DATA_VALUE)
.route()
.process((exchange) -> {
InputStream is = exchange.getIn().getBody(InputStream.class);
MimeBodyPart mimeMessage = new MimeBodyPart(is);
DataHandler dh = mimeMessage.getDataHandler();
exchange.getIn().setBody(dh.getInputStream());
exchange.getIn().setHeader(Exchange.FILE_NAME, dh.getName());
})
.to("file://" + incomingFolder);
Thank you in advance
Edwardo
Edit: Since you have everything else already working, I'd recommend the Stream Caching option.
As Nicolas suggested, check out Camel's MIME Multipart data format.
Also, you're getting "Missing start boundary" because your processor is consuming the InputStream. You can try to reset() it, but it might be better to consume the InputStream only once, or to enable Stream Caching.
Instead of stream caching, you could also just convert the stream to a string. Before your processor, add:
.convertBodyTo(String.class)
The string can be read over and over. If you still get the missing start boundary error, try logging the body before the unmarshal operation. Make sure the message is intact and that it indeed contains the start boundary.
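The point about the stream being consumed can be seen without Camel at all. The following plain-Java sketch (StreamReadOnceDemo, drain and readTwice are hypothetical names) reads the same InputStream twice: the first reader gets the whole payload and the second gets nothing, which is why a later parsing step cannot find the start boundary. A String, by contrast, can be handed to any number of readers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical demo class; it does not use any Camel API. */
final class StreamReadOnceDemo {

    private StreamReadOnceDemo() {
    }

    /** Reads everything that is left in the stream and returns it as UTF-8 text. */
    static String drain(InputStream in) {
        ByteArrayOutputStream copy = new ByteArrayOutputStream();
        byte[] chunk = new byte[1024];
        try {
            int read;
            while ((read = in.read(chunk)) != -1) {
                copy.write(chunk, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(copy.toByteArray(), StandardCharsets.UTF_8);
    }

    /** Returns what a first and a second reader see when they share one stream. */
    static String[] readTwice(String payload) {
        InputStream body = new ByteArrayInputStream(payload.getBytes(StandardCharsets.UTF_8));
        String first = drain(body);  // consumes the whole stream
        String second = drain(body); // nothing is left for the second reader
        return new String[] {first, second};
    }
}
```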

How to dynamically return a from endpoint in apache camel DSL

Here is my code
from("google-pubsub:123:subscription1?maxMessagesPerPoll=3&concurrentConsumers=5").routeId("myroute")
.process(new ProducerProcessor())
.to("google-pubsub:123:topic1");
In the code above, I want to make the from channel generic. Basically, it should be able to consume data from google-pubsub, or maybe from a file or a JMS queue. Hence, depending on a parameter, I want to return a different from channel, something like below:
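// Return type widened to ProcessorDefinition<?> because split() no longer returns a RouteDefinition.
private ProcessorDefinition<?> fromChannel(String parameter) {
if ("google".equals(parameter)) {
return from("google-pubsub:123:subscription1?maxMessagesPerPoll=3&concurrentConsumers=5");
}
if ("file".equals(parameter)) {
// The "file:" scheme is assumed; the original snippet gave only the folder path.
return from("file:/my/fileFolder/").split(body().tokenize("\n")).streaming().parallelProcessing();
}
throw new IllegalArgumentException("Unknown channel: " + parameter);
}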
I tried this, but I am getting a NullPointerException in the fromChannel method. Please let me know if you have better ideas.
Rewrite based on comment
You can for example create a (static) template route for every input type and generate the routes based on a configured endpoint list.
I described such an endpoint configuration and route generation scenario in this answer.
This way you can generate the split part for every file route, and any other special handling for other route types.
All these input routes end by routing to a common processing route:
from(pubsubEndpoint)
.to("direct:genericProcessingRoute");
from(fileEndpoint)
.split(body().tokenize("\n"))
.streaming()
.parallelProcessing()
.to("direct:genericProcessingRoute");
from("direct:genericProcessingRoute")
... [generic processing]
.to("google-pubsub:123:topic1");
Arranging multiple input (and output) routes around a common core route is known as hexagonal architecture, and Camel fits this style very well.

Apache Camel: create and return file in response on the fly

I have a Camel route that should return a file in the response, which is created based on the request data. While this works fine with the following (greatly simplified) route, the problem is that I need to first create an actual file on the server that I can then add to the exchange body.
As I don't want these files piling up on the disk, I would prefer to either not create them at all or delete them directly from the same route.
The only way around this I currently see is to have a regular cleanup job that deletes these temporary files.
Any suggestions on how to solve this in a better way?
from("cxfrs://...")
.process(exchange -> {
File file = new File("out.pdf");
// write data to new FileOutputStream(file);
exchange.getIn().setBody(file);
});
The response content type is application/octet-stream.
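One way to avoid the temporary file, sketched here in plain Java, would be to write the generated content into a ByteArrayOutputStream and set the resulting byte[] as the exchange body instead of a File. Whether the CXFRS endpoint in question serializes a byte[] body as application/octet-stream is an assumption that would need checking. InMemoryDocument and render are hypothetical names, and the text payload only stands in for real PDF generation.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

/** Hypothetical helper that builds the response content without touching the disk. */
final class InMemoryDocument {

    private InMemoryDocument() {
    }

    /** Renders placeholder content for the given request data into a byte array. */
    static byte[] render(String requestData) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (OutputStream out = buffer) {
            // A real implementation would hand "out" to the PDF writer instead.
            out.write(("Report for " + requestData).getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }
}
```

This keeps the whole document in memory, so it only suits responses that are small enough to buffer.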

Possible to initialize a Bindy DataFormat (Apache Camel, FixedLength) and use it in the same route

My input file consists of several types of FixedLengthRecord, so I have many FixedLengthDataFormat instances to unmarshal each post.
I split the body per row.
For each row, I first have to work out which DataFormat to use and create the object.
Then I unmarshal.
Something like this:
from(myURI)
.split().tokenize("\n")
.process(initializeMyBindyDataFormat)
.unmarshal(bindy)
.end();
But my problem is that I get an NPE for that bindy object when I initialize it via a processor.
If I create the bindy object before my route definition (before from), it works fine. However, my bindy object depends on the body, so I cannot initialize it before the route definition.
Apache Camel actually processes the initialization of the bindy object before starting the route.
The answer is to use .inOut.
Since I want to do the unmarshalling in another route, a simple example is as follows:
from(myURI)
.split().tokenize("\n")
.inOut("direct:unmarshalSpecificRow")
.end();
from("direct:unmarshalSpecificRow")
.choice()
.when(firstPredicate)
.unmarshal(new BindyFixedLengthDataFormat(package1))
.when(secondPredicate)
.unmarshal(new BindyFixedLengthDataFormat(package1))
.when(thirdPredicate)
.unmarshal(new BindyFixedLengthDataFormat(package1))
.otherwise()
.throwException(new IllegalArgumentException("Unrecognised post"))
.end();
Thanks to jakub-korab for his help.
In this case I think it is better to divide your processing into two steps.
A main route receives the different data. Here you define the predicate rules that determine what kind of body it is: check the start of the body, or anything else that identifies its type. Add a choice() with when() clauses and, based on which predicate evaluates to true, send the message to a separate route.
In the secondary routes, add the specific bindy format and do your marshal/unmarshal work.
An example from the documentation:
Predicate isWidget = header("type").isEqualTo("widget");
from("jms:queue:order")
.choice()
.when(isWidget).to("bean:widgetOrder")
.when(isWombat).to("bean:wombatOrder")
.otherwise()
.to("bean:miscOrder")
.end();
http://camel.apache.org/predicate.html
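The "check the start of the body" step can be sketched in plain Java as a lookup from a record prefix to the secondary route that owns the matching bindy format. The prefixes HDR, DTL and TRL, the direct: route names, and the RowTypeRouter class are all hypothetical; the post does not say how its record types are distinguished.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical dispatcher: maps a row's leading characters to a secondary route. */
final class RowTypeRouter {

    // Assumed record prefixes and route names, not taken from the original post.
    private static final Map<String, String> ROUTES = new LinkedHashMap<>();

    static {
        ROUTES.put("HDR", "direct:unmarshalHeader");
        ROUTES.put("DTL", "direct:unmarshalDetail");
        ROUTES.put("TRL", "direct:unmarshalTrailer");
    }

    private RowTypeRouter() {
    }

    /** Returns the route for the first prefix the row starts with, or fails for unknown rows. */
    static String routeFor(String row) {
        for (Map.Entry<String, String> route : ROUTES.entrySet()) {
            if (row.startsWith(route.getKey())) {
                return route.getValue();
            }
        }
        throw new IllegalArgumentException("Unrecognised post: " + row);
    }
}
```

In a route, each prefix check would become one when() predicate, and the exception corresponds to the otherwise() branch.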

Camel: How to go all "when" in "choice when"

I have a question about the "choice when" operator in an Apache Camel route. In the following example, if I have two soap-env:Order elements with the values 1 and 2, I want to create two XML files named output_1.xml and output_2.xml. However, the code only creates one file, output_1.xml.
Can anyone give me any ideas or hints? Thanks for any help.
public void configure() {
...
from("direct:a")
.choice()
.when(ns.xpath("//soap-env:Envelope//soap-env:Order='1'"))
.to("file://data?fileName=output_1.xml")
.when(ns.xpath("//soap-env:Envelope//soap-env:Order='2'"))
.to("file://data?fileName=output_2.xml")
.when(ns.xpath("//soap-env:Envelope//soap-env:Order='3'"))
.to("file://data?fileName=output_3.xml");
}
My understanding is that the content based router implements "if - else if - else" semantics, meaning that as soon as one test evaluates to true, then the remaining tests are skipped. If you want to create files for every case that returns true then you'd have to change the route to something like this:
from("direct:a")
.choice()
.when(ns.xpath("//soap-env:Envelope//soap-env:Order='1'"))
.to("file://data?fileName=output_1.xml")
.end()
.choice()
.when(ns.xpath("//soap-env:Envelope//soap-env:Order='2'"))
.to("file://data?fileName=output_2.xml")
.end()
.choice()
.when(ns.xpath("//soap-env:Envelope//soap-env:Order='3'"))
.to("file://data?fileName=output_3.xml")
.end()
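The difference between the two route shapes can be illustrated in plain Java, with no Camel involved. In this sketch (ChoiceSemanticsDemo and its method names are hypothetical), singleChoice stops at the first matching rule, like "if - else if - else", while separateChoices evaluates every rule independently, like the three consecutive choice blocks above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Predicate;

/** Hypothetical demo of first-match versus all-match rule evaluation. */
final class ChoiceSemanticsDemo {

    // Target file -> condition, kept in declaration order like the when() clauses.
    private static final Map<String, Predicate<List<String>>> RULES = new LinkedHashMap<>();

    static {
        RULES.put("output_1.xml", orders -> orders.contains("1"));
        RULES.put("output_2.xml", orders -> orders.contains("2"));
        RULES.put("output_3.xml", orders -> orders.contains("3"));
    }

    private ChoiceSemanticsDemo() {
    }

    /** One choice with several when clauses: only the first matching rule fires. */
    static List<String> singleChoice(List<String> orders) {
        List<String> written = new ArrayList<>();
        for (Map.Entry<String, Predicate<List<String>>> rule : RULES.entrySet()) {
            if (rule.getValue().test(orders)) {
                written.add(rule.getKey());
                break; // remaining rules are skipped
            }
        }
        return written;
    }

    /** One choice per rule: every matching rule fires. */
    static List<String> separateChoices(List<String> orders) {
        List<String> written = new ArrayList<>();
        for (Map.Entry<String, Predicate<List<String>>> rule : RULES.entrySet()) {
            if (rule.getValue().test(orders)) {
                written.add(rule.getKey());
            }
        }
        return written;
    }
}
```

For a message containing orders 1 and 2, singleChoice yields only output_1.xml, whereas separateChoices yields output_1.xml and output_2.xml.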
There is nothing wrong with the DSL and you don't need end blocks here. I would look at your data and trace through why all calls are ending up in the same when block. Put a couple of log lines in, or enable the tracer, and look at the exchanges going through.
In a Camel route, if a choice() has multiple when() cases, you have to add otherwise(). See the example below.
from("direct:a")
.choice()
.when(header("foo").isEqualTo("bar"))
.to("direct:b")
.when(header("foo").isEqualTo("cheese"))
.to("direct:c")
.otherwise()
.to("direct:d")
.end();
The solution above will check all three conditions even if the first one passes.

Resources