Camel FTP routing - two separate FTP filenames

I have a Camel FTP route that has, in the body, a list of objects which I need to break up into multiple separate FTP transmissions with different FTP filenames.
I am thinking this should use the Splitter EIP pattern; however, I am not sure how to go about producing the multiple separate FTP transmissions.
For example, assuming I have 4 records in the Exchange ${body} which I wish to organize into 2 separate FTP transmissions, I can see how I could use code in a Processor class to separate the body into 2 groups of records; however, the issue is how to then turn these into two separate FTP transmissions.
I have thought of bundling the different groups of records into separate header or property objects (in the Processor class), but the issue is still the same: how to subsequently process these headers or properties as two separate FTP transmissions.
I am considering a custom Processor approach, but I get the feeling there is a more elegant way using the Splitter pattern.
I should note that I do not know in advance how many different groups of records (and thus how many separate FTP transmissions and filenames) I shall need; in my example above there are 2, but in reality there could be many more.
Any thoughts or examples?
By way of example, let's assume the following Processor class to set up a sample Camel body:
import org.apache.camel.Exchange
import org.apache.camel.Processor

class SetTestBody implements Processor {
    @Override
    void process(Exchange exchange) throws Exception {
        // The incoming body is not used here; we replace it below.
        final String body = exchange.getIn().getBody(String.class)
        /**
         * attribute6 is the key to use as the FTP filename: all records with the same attribute6 value go
         * to the same FTP filename. In the example below we would have 3 separate FTP file transfers:
         *   one for the 2 entries having 2041,
         *   one for the 6 entries having 2143,
         *   one for the 2 entries having 2106
         */
        def r1 = [
            [ 'sku':'ALLO'  , 'attribute3':'72', 'attribute4':'Local991' , 'attribute5':'RosRen', 'attribute6':'2041' ],
            [ 'sku':'ALL2'  , 'attribute3':'73', 'attribute4':'Local9910', 'attribute5':'RosRen', 'attribute6':'2143' ],
            [ 'sku':'ALL5O' , 'attribute3':'74', 'attribute4':'Local9911', 'attribute5':'RosRen', 'attribute6':'2143' ],
            [ 'sku':'ALL6O' , 'attribute3':'75', 'attribute4':'Local9912', 'attribute5':'RosRen', 'attribute6':'2143' ],
            [ 'sku':'ALL7O' , 'attribute3':'76', 'attribute4':'Local9913', 'attribute5':'RosRen', 'attribute6':'2143' ],
            [ 'sku':'ALL8O' , 'attribute3':'77', 'attribute4':'Local9914', 'attribute5':'RosRen', 'attribute6':'2041' ],
            [ 'sku':'ALL9O' , 'attribute3':'78', 'attribute4':'Local9915', 'attribute5':'RosRen', 'attribute6':'2106' ],
            [ 'sku':'ALL10O', 'attribute3':'79', 'attribute4':'Local9916', 'attribute5':'RosRen', 'attribute6':'2143' ],
            [ 'sku':'ALL1O1', 'attribute3':'80', 'attribute4':'Local9917', 'attribute5':'RosRen', 'attribute6':'2143' ],
            [ 'sku':'ALL1O2', 'attribute3':'81', 'attribute4':'Local9918', 'attribute5':'RosRen', 'attribute6':'2106' ]
        ]
        // Sort the list so that entries sharing the same attribute6 value sit next to each other
        def sortedBody = r1.sort { a, b -> a['attribute6'] <=> b['attribute6'] }
        exchange.getIn().setBody(sortedBody)
    }
}
Let's assume a sample route as follows:
<routes xmlns="http://camel.apache.org/schema/spring">
    <route id="rod-setTestBody-bean">
        <from uri="timer://fetchData?repeatCount=1"/>
        <log message="\n1 body is : ${body}\n"/>
        <process id="process1" ref="setTestBody"/>
        <log message="\n2 body is : ${body}\n"/>
        <split>
            <simple>${body}</simple>
            <log message="\nBody after split is : ${body}\n"/>
            <!--
                I would like to send all records with the same attribute6 value to the same FTP destination, just
                with a different filename per value: all records for 2041 in one FTP transmission, all for 2106 in
                a second FTP transmission, all for 2143 in a third FTP transmission.
            -->
            <to uri="mock:ftpDestination1"/>
        </split>
    </route>
</routes>
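One way this could be approached (a sketch of my own, not from the original post): instead of splitting over individual records, group the records by attribute6 in a processor, split over the resulting groups, and derive the FTP filename from the group key via the CamelFileName header. The Java DSL below reuses the SetTestBody processor above; the FTP endpoint URI, the ".txt" suffix and the plain toString body formatting are illustrative only.

import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

import org.apache.camel.Exchange;
import org.apache.camel.builder.RouteBuilder;

public class GroupedFtpRoute extends RouteBuilder {
    @Override
    public void configure() {
        from("timer://fetchData?repeatCount=1")
            .process(new SetTestBody())                  // builds the List of record maps shown above
            .process(exchange -> {
                List<Map<String, String>> records = exchange.getIn().getBody(List.class);
                // one map entry per distinct attribute6 value -> list of its records
                Map<String, List<Map<String, String>>> groups = records.stream()
                        .collect(Collectors.groupingBy(r -> r.get("attribute6")));
                exchange.getIn().setBody(groups.entrySet());
            })
            .split(body())
                // each split exchange now holds one Map.Entry<String, List<Map<String, String>>>
                .setHeader(Exchange.FILE_NAME, simple("${body.key}.txt"))
                // a real route would format the records here (CSV, XML, ...); toString is just for the sketch
                .setBody(simple("${body.value}", String.class))
                .to("ftp://user@host/outbox?password=secret")   // hypothetical FTP endpoint
            .end();
    }
}

Because the number of distinct attribute6 values drives the number of split iterations, this works for any number of groups without knowing the count in advance.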

Related

Pick up files from multiple location using camel FTP

I have a situation where I need to pick up files from two different locations using Camel FTP.
Currently my code is:
try {
    from(format("sftp://%s@%s:22/../../..%s?password=%s&delete=true", ftpUserName, ftpServer, responsePath, ftpPassword))
        .filter(header("CamelFileName").endsWith(".PDF"))
        .to(format("sftp://%s@%s:22/../../..%s/processed?password=%s", ftpUserName, ftpServer, responsePath, ftpPassword))
        .process(documentProcessor);
        /*.log(LoggingLevel.INFO, (org.slf4j.Logger) logger, "Download file ${file:name} complete.")*/
        /*.to(downloadLocation)*/
        /*.to(format("smtp://relay.us.signintra.com?to=%s&from=noreply@dbschenker.com&subject=GTM response from cisco", emailTo))*/
} catch (Exception e) {
    e.printStackTrace();
}
This is picking up files from the location that is mentioned in the application.properties file. How can I change this to pick up files from multiple locations?
You can configure multiple FTP consumer routes and forward the messages to a shared direct endpoint.
Example:
from("ftp:{{target.host}}:{{ftp.port}}/{{target.path1}}...")
.routeId("PollForFilesInPath1")
.to("direct:handleFileFromFTP");
from("ftp:{{target.host}}:{{ftp.port}}/{{target.path2}}...")
.routeId("PollForFilesInPath2")
.to("direct:handleFileFromFTP");
from("direct:handleFileFromFTP")
.routeId("handleFileFromFTP")
.log("file received from ftp: ${headers.CamelFileName }")
// Do stuff with the file
If you need to call two FTP consumer endpoints from a single route, you can use pollEnrich, as sketched below.
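A minimal sketch of the pollEnrich alternative (my own, not from the original answer): the route is driven by the first FTP endpoint and pulls one file from the second endpoint on demand. The placeholder hosts and paths are reused from the example above; the timeout and the rest of the route are illustrative.

from("ftp:{{target.host}}:{{ftp.port}}/{{target.path1}}...")
    .routeId("PollThenEnrich")
    // wait up to 5 seconds for a file on the second endpoint; by default the polled
    // message replaces the body (pass an AggregationStrategy to merge them instead)
    .pollEnrich("ftp:{{target.host}}:{{ftp.port}}/{{target.path2}}...", 5000)
    .log("enriched with file: ${header.CamelFileName}")
    .to("direct:handleFileFromFTP");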

Camel - Poll rest endpoint and split JSON list

So I have a REST API that lives at https://foo.bar/api which returns either an empty JSON list [] or a list that contains one or more items:
[
    {
        "@class": "foo.bar.java.MyObject",
        "name": "Joe Bloggs"
    },
    {
        "@class": "foo.bar.java.MyObject",
        "name": "Fred Flinstone"
    }
]
Now I am trying to have Camel take in this data from my endpoint and hand each object within the list to a processor. I've tried the following:
fromF("timer://foo-poll?fixedRate=true&delay=5s&period=%d&bridgeErrorHandler=true", pollRate)
.toF("https4://%s/%s", host, requestPath)
.log("Received: ${body}")
.split()
.jsonpath("$")
.log("Split: ${body}")
.process(barProccessor);
I have also tried various ways to unmarshal the data, using .unmarshal(new ListJacksonDataFormat(MyObject.class)) or .unmarshal().json(JsonLibrary.Jackson, List.class),
but nothing has worked.
Using the larger code block above, there are no errors, nor is the "Split: ${body}" log message printed out.
Unmarshalling using either of the methods described above throws this, regardless of how many items are returned from the API:
com.fasterxml.jackson.databind.exc.MismatchedInputException: No
content to map due to end-of-input
Okay, I managed to figure this out, in case anyone else is facing a similar issue. The working route builder:
fromF("timer://foo-poll?fixedRate=true&delay=5s&period=%d&bridgeErrorHandler=true", pollRate)
.toF("https4://%s/%s", host, requestPath)
.log("Received: ${body}")
.streamCaching("true")
.unmarshal(new ListJacksonDataFormat(MyObject.class))
.split()
.jsonpath("$")
.log("Split: ${body}")
.process(barProccessor);
I have enabled stream caching and unmarshalled the list using Jackson. (The HTTP response body is a stream that can only be read once; without stream caching the earlier log statement had already consumed it, which is why the unmarshalling kept seeing end-of-input.)

Can't start clickhouse service, too many files in ../data/default/<TableName>

I have a strange problem with my standalone clickhouse-server installation. The server had been running for some time with a nearly default config, except that the data and tmp directories were moved to a separate disk:
cat /etc/clickhouse-server/config.d/my_config.xml
<?xml version="1.0"?>
<yandex>
    <path>/data/clickhouse/</path>
    <tmp_path>/data/clickhouse/tmp/</tmp_path>
</yandex>
Today the server stopped responding with a connection refused error. It was rebooted, and after that the service couldn't start completely:
2018.05.28 13:15:44.248373 [ 2 ] <Information> DatabaseOrdinary (default): 42.86%
2018.05.28 13:15:44.259860 [ 2 ] <Debug> default.event_4648 (Data): Loading data parts
2018.05.28 13:16:02.531851 [ 2 ] <Debug> default.event_4648 (Data): Loaded data parts (2168 items)
2018.05.28 13:16:02.532130 [ 2 ] <Information> DatabaseOrdinary (default): 57.14%
2018.05.28 13:16:02.534622 [ 2 ] <Debug> default.event_5156 (Data): Loading data parts
2018.05.28 13:34:01.731053 [ 3 ] <Information> Application: Received termination signal (Terminated)
Actually, I stopped the process at 57%, because startup was taking too long (maybe it could have started in an hour or two, I didn't try).
The log level is "trace" by default, but it didn't show any reason for this behavior.
I think the problem is the file count in /data/clickhouse/data/default/event_5156.
There are now 626023 directories in it, and ls -la does not work properly in this directory; I have to use find to count the files:
# time find . -maxdepth 1 | wc -l
626023
real 5m0.302s
user 0m3.114s
sys 0m24.848s
I have two questions:
1) Why did clickhouse-server generate so many files and directories with the default config?
2) How can I start the service in a reasonable time, without data loss?
The issue was in the data update method. I was using a script with the JDBC connector and had been sending one row per request; each such insert creates a new data part on disk. After switching to batch updates, the issue was solved.
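For reference, a minimal sketch of the batch-update approach (not from the original answer): the table, columns and JDBC URL are illustrative, and the key point is addBatch()/executeBatch(), which turns thousands of single-row inserts into one insert and therefore one new data part.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class BatchInsertExample {
    public static void main(String[] args) throws Exception {
        try (Connection conn = DriverManager.getConnection("jdbc:clickhouse://localhost:8123/default");
             PreparedStatement ps = conn.prepareStatement("INSERT INTO events (id, payload) VALUES (?, ?)")) {
            for (int i = 0; i < 10000; i++) {
                ps.setInt(1, i);
                ps.setString(2, "payload-" + i);
                ps.addBatch();        // accumulate rows client-side
            }
            ps.executeBatch();        // send them all in a single INSERT
        }
    }
}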

Setting some header with current file name with Camel

I am trying to set a header, "CamelAwsS3Key", with the current file name before uploading the file to Amazon S3, but I am not able to get the current file name. Here is the Spring snippet:
...
<to uri="file:D:/camel_out?fileName=foo_$simple{date:now:yyyyMMddHHmmss}.xml" />
<setHeader headerName="CamelAwsS3Key">
    <simple>${file:onlyname}</simple>
</setHeader>
<to uri="aws-s3://bucket?accessKey=xxx&secretKey=yyy" />
Here is the message history (I removed useless columns):
...
[to2 ] [file:D:/camel_out?fileName=foo_$simple{date:now:yyyyMMddHHmmss}.xml]
[setHeader1] [setHeader[CamelAwsS3Key] ]
[to3 ] [aws-s3://bucket?accessKey=xxx&secretKey=xxxxxx ]
But the header is not set... From the log:
Exchange[
...
Headers{... CamelAwsS3Key=null, CamelFileName=null, CamelFileNameProduced=D:\camel_out\foo_20150622124308.xml ...}
...
]
It works if I use a constant instead (the file is uploaded), like:
<setHeader headerName="CamelAwsS3Key">
    <constant>test</constant>
</setHeader>
How to set a header with the current file name?
Thanks!
The <simple>${file:onlyname}</simple> expression only works when you consume from a file endpoint (eg a from file).
The file producer (eg a to file) only sets one header, which is the file name produced:
http://camel.apache.org/maven/current/camel-core/apidocs/src-html/org/apache/camel/Exchange.html#line.125
which is also what you can see from your logs. You will need to write some code to extract only the name part of the produced file name if you only want that, as the header currently includes the path. Though we could introduce a FILE_ONLYNAME_PRODUCED header in the Camel file producer.
You are welcome to log such an ENH request at: http://camel.apache.org/support
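A small sketch of the "write some code" part (my own, not from the original answer): a Processor that takes the full path from the CamelFileNameProduced header and keeps only the name part as the S3 key. Apart from the two standard Camel headers, the class name is illustrative.

import java.io.File;

import org.apache.camel.Exchange;
import org.apache.camel.Processor;

public class SetS3KeyFromProducedFileName implements Processor {
    @Override
    public void process(Exchange exchange) throws Exception {
        // CamelFileNameProduced holds the full path written by the file producer
        String produced = exchange.getIn().getHeader(Exchange.FILE_NAME_PRODUCED, String.class);
        if (produced != null) {
            // keep only the file name portion, eg foo_20150622124308.xml
            exchange.getIn().setHeader("CamelAwsS3Key", new File(produced).getName());
        }
    }
}

In the Spring route this would replace the setHeader element with a <process ref="..."/> pointing at a bean of this class.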

StreamCache FileNotFound issues with bigger data in multicast routes

We are using Camel 2.13.2. I have a multicast route with an AggregationStrategy.
In each multicast branch, we have a custom Camel component that returns huge data (around 4 MB) and writes it to the stream cache (CachedOutputStream), and we need to aggregate the data in the multicast (AggregationStrategy).
In the AggregationStrategy, I need to do XPath evaluation using Camel's XPathBuilder.
Hence, I try to read the body and convert it from StreamCache to byte[] to avoid 'Error during type conversion from type: org.apache.camel.converter.stream.InputStreamCache.' in the XPathBuilder.
When I try to read the body at the beginning of the AggregationStrategy, I get the following error.
/tmp/camel/camel-tmp-4e00bf8a-4a42-463a-b046-5ea2d7fc8161/cos6047774870387520936.tmp (No such file or directory), cause: FileNotFoundException: /tmp/camel/camel-tmp-4e00bf8a-4a42-463a-b046-5ea2d7fc8161/cos6047774870387520936.tmp (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:138)
    at org.apache.camel.converter.stream.FileInputStreamCache.createInputStream(FileInputStreamCache.java:123)
    at org.apache.camel.converter.stream.FileInputStreamCache.getInputStream(FileInputStreamCache.java:117)
    at org.apache.camel.converter.stream.FileInputStreamCache.writeTo(FileInputStreamCache.java:93)
    at org.apache.camel.converter.stream.StreamCacheConverter.convertToByteArray(StreamCacheConverter.java:102)
    at com.sap.it.rt.camel.aggregate.strategies.MergeAtXPathAggregationStrategy.convertToByteArray(MergeAtXPathAggregationStrategy.java:169)
    at com.sap.it.rt.camel.aggregate.strategies.MergeAtXPathAggregationStrategy.convertToXpathCompatibleType(MergeAtXPathAggregationStrategy.java:161)
Following is the code where the error is thrown:
Object body = exchange.getIn().getBody();
if (body instanceof StreamCache) {
    StreamCache cache = (StreamCache) body;
    xml = new String(convertToByteArray(cache, exchange));
    exchange.getIn().setBody(xml);
}
By preventing the stream cache from spilling to file (raising the threshold shown in the configuration below for the multicast-related routes), we were able to make the aggregation strategy work. But we do not want to do that, as the incoming data may be even bigger.
<camel:camelContext id="multicast_xml_1" streamCache="true">
    <camel:properties>
        <camel:property key="CamelCachedOutputStreamCipherTransformation" value="RC4"/>
        <camel:property key="CamelCachedOutputStreamThreshold" value="100000000"/>
    </camel:properties>
    ....
</camel:camelContext>
Note: the FileNotFound issue does not appear if we have the StreamCache-based Camel component in a route with other processors, but without multicast + aggregation.
After debugging, I could understand the issue with aggregating huge data from a StreamCache with the MulticastProcessor.
In MulticastProcessor.java, doProcessParallel() is called, and on completion of a multicast branch exchange the CachedOutputStream deletes / cleans up its temporary file.
This happens even before the multicast branch exchange reaches the AggregationStrategy, which tries to read the data from the branch exchange. With huge data in the StreamCache, the temporary file is already deleted, leading to the FileNotFound issues.
public CachedOutputStream(Exchange exchange, boolean closedOnCompletion) {
    this.strategy = exchange.getContext().getStreamCachingStrategy();
    currentStream = new CachedByteArrayOutputStream(strategy.getBufferSize());
    if (closedOnCompletion) {
        // add on completion so we can cleanup after the exchange is done such as deleting temporary files
        exchange.addOnCompletion(new SynchronizationAdapter() {
            @Override
            public void onDone(Exchange exchange) {
                try {
                    if (fileInputStreamCache != null) {
                        fileInputStreamCache.close();
                    }
                    close();
                } catch (Exception e) {
                    LOG.warn("Error deleting temporary cache file: " + tempFile, e);
                }
            }

            @Override
            public String toString() {
                return "OnCompletion[CachedOutputStream]";
            }
        });
    }
}

public void close() throws IOException {
    currentStream.close();
    cleanUpTempFile();
}
I was able to circumvent the issue by setting closedOnCompletion=false when writing to the CachedOutputStream in the components used in the multicast branches.
But this is a leaky solution, because the stream-cache temporary file(s) may then never get cleaned up, so I try to close and clean up the cached stream after reading the data in the AggregationStrategy, as sketched below.
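A sketch of that read-then-release step (my own, not from the original post): drain the StreamCache into memory inside the AggregationStrategy, then close the cached stream explicitly once the data is held as a String. The helper class and method names are illustrative.

import java.io.Closeable;

import org.apache.camel.Exchange;
import org.apache.camel.StreamCache;
import org.apache.camel.util.IOHelper;

public final class StreamCacheReader {

    static String readAndRelease(Exchange exchange) {
        Object body = exchange.getIn().getBody();
        if (body instanceof StreamCache) {
            StreamCache cache = (StreamCache) body;
            // let Camel's stream-cache converter drain the cache into a byte[]
            byte[] bytes = exchange.getContext().getTypeConverter()
                    .convertTo(byte[].class, exchange, cache);
            String xml = new String(bytes);
            exchange.getIn().setBody(xml);
            // close the cached stream now that the data is in memory
            if (cache instanceof Closeable) {
                IOHelper.close((Closeable) cache);
            }
            return xml;
        }
        return exchange.getIn().getBody(String.class);
    }
}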
Can the MulticastProcessor be adjusted so that the multicast branch exchanges reach 'completion' status only after they have been aggregated at the end of the multicast?
Please advise on this issue, as I am new to using Camel multicast.
Thanks,
Lakshmi
I had a similar exception thrown when trying to send a larger-than-1 MB JSON response to a Restlet request (yes, I know 1 MB of JSON is too big):
java.io.FileNotFoundException: C:\Users\me\AppData\Local\Temp\camel\camel-tmp-7ad6e098-538d-4d4c-9357-2b7addb1f19d\cos6725022584818060586.tmp (The system cannot find the file specified)
at java.io.FileInputStream.open(Native Method)
at java.io.FileInputStream.<init>(FileInputStream.java:146)
at org.apache.camel.converter.stream.FileInputStreamCache.createInputStream(FileInputStreamCache.java:123)
at org.apache.camel.converter.stream.FileInputStreamCache.getInputStream(FileInputStreamCache.java:117)
at org.apache.camel.converter.stream.FileInputStreamCache.read(FileInputStreamCache.java:112)
at java.io.InputStream.read(InputStream.java:170)
at java.io.InputStream.read(InputStream.java:101)
at org.restlet.engine.io.BioUtils.copy(BioUtils.java:81)
at org.restlet.representation.InputRepresentation.write(InputRepresentation.java:148)
at org.restlet.engine.adapter.ServerCall.writeResponseBody(ServerCall.java:510)
at org.restlet.engine.adapter.ServerCall.sendResponse(ServerCall.java:454)
at org.restlet.ext.servlet.internal.ServletCall.sendResponse(ServletCall.java:426)
at org.restlet.engine.adapter.ServerAdapter.commit(ServerAdapter.java:196)
at org.restlet.engine.adapter.HttpServerHelper.handle(HttpServerHelper.java:153)
at org.restlet.ext.servlet.ServerServlet.service(ServerServlet.java:1089)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:848)
at org.eclipse.jetty.servlet.ServletHolder.handle(ServletHolder.java:684)
at org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1496)
at org.springframework.web.filter.OncePerRequestFilter.doFilter(OncePerRequestFilter.java:102)
The same workaround works for me:
getContext().getProperties().put(CachedOutputStream.THRESHOLD, "" + THREE_MEGABYTE_TRESHOLD_BEFORE_FILE_CACHE);
I don't use multicast in this route, just a plain
restlet request -> Service -> Jackson marshal => error
I use Camel 2.14.0 and Restlet 2.2.2 with JDK 7 and Spring Boot 1.0.2 / Jetty.
This question, Camel reverse proxy - no response stream caching, might be related to my issue.
