camel: split the message into multiple processing paths - apache-camel

My input message:
<file>
<node1>
...
</node1>
.....
<node10>
.....
</node10>
</file>
I want to:
Process the whole file using stylesheet and output to Dest A
For a few elements in the file (say, node1, node3 and node7) I want to extract them and output the content of each individually to Dest B
I know how to process the file using stylesheet but I'm at a loss how to do the other, let alone combine them together.
I'm looking for something like:
from(direct:start).magic_split(
to("xslt:mysheet").to("destA"),
setBody(xpath("//node1").to("destB"),
setBody(xpath("//node3").to("destB"),
setBody(xpath("//node7").to("destB"),
).transform(constant(responseOK);

if you can split the XML, then each node is put in it's own exchange, if you can identify the node after it is split, then you can use a Content Based Router to route the exchange to the appropriate destination. This might require a custom splitter bean, or you might be able to do it from xpath if the nodes are named nodeX where X is a number.

Use the Wire Tap pattern. This pattern allows you to route messages to a separate location while they are being forwarded to the ultimate destination:
from("cxf:bean:submitOrder")
.wireTap("direct:tap")
.beanRef("customBean2");
from("direct:tap")
.to("xslt:my.xsl")
.beanRef("customBean1);

For what I needed, custom beans did the job well. The only thing I had to figure out was how to restore the message to the original content. I guess there is a more elegant way of doing it but works fine:
from("cxf:bean:submitOrder")
.setProperty("originalData", simple("${in.body}")) //save original input msg
.to("xslt:my.xsl").beanRef("customBean1)
.setBody(simple("${property.originalData}")) //restore original message
.beanRef("customBean2");

Related

How to read a text file from resources without javaClass

I need to read a text file with readLines() and I've already found this question, but the code in the answers always uses some variation of javaClass; it seems to work only inside a class, while I'm using just a simple Kotlin file with no declared classes. Writing it like this is correct syntax-wise but it looks really ugly and it always returns null, so it must be wrong:
val lines = object {}.javaClass.getResource("file.txt")?.toURI()?.toPath()?.readLines()
Of course I could just specify the raw path like this, but I wonder if there's a better way:
val lines = File("src/main/resources/file.txt").readLines()
Thanks to this answer for providing the correct way to read the file. Currently, reading files from resources without using javaClass or similar constructs doesn't seem to be possible.
// use this if you're inside a class
val lines = this::class.java.getResourceAsStream("file.txt")?.bufferedReader()?.readLines()
// use this otherwise
val lines = object {}.javaClass.getResourceAsStream("file.txt")?.bufferedReader()?.readLines()
According to other similar questions I've found, the second way might also work within a lambda but I haven't tested it. Notice the need for the ?. operator and the lines?.let {} syntax needed from this point onward, because getResourceAsStream() returns null if no resource is found with the given name.
Kotlin doesn't have its own means of getting a resource, so you have to use Java's method Class.getResource. You should not assume that the resource is a file (i.e. don't use toPath) as it could well be an entry in a jar, and not a file on the file system. To read a resource, it is easier to get the resource as an InputStream and then read lines from it:
val lines = this::class.java.getResourceAsStream("file.txt").bufferedReader().readLines()
I'm not sure if my response attempts to answer your exact question, but perhaps you could do something like this:
I'm guessing in the final use case, the file names would be dynamic - Not statically declared. In which case, if you have access to or know the path to the folder, you could do something like this:
// Create an extension function on the String class to retrieve a list of
// files available within a folder. Though I have not added a check here
// to validate this, a condition can be added to assert if the extension
// called is executed on a folder or not
fun String.getFilesInFolder(): Array<out File>? = with(File(this)) { return listFiles() }
// Call the extension function on the String folder path wherever required
fun retrieveFiles(): Array<out File>? = [PATH TO FOLDER].getFilesInFolder()
Once you have a reference to the List<out File> object, you could do something like this:
// Create an extension function to read
fun File.retrieveContent() = readLines()
// You can can further expand this use case to conditionally return
// readLines() or entire file data using a buffered reader or convert file
// content to a Data class through GSON/whatever.
// You can use Generic Constraints
// Refer this article for possibilities
// https://kotlinlang.org/docs/generics.html#generic-constraints
// Then simply call this extension function after retrieving files in the folder.
listOfFiles?.forEach { singleFile -> println(singleFile.retrieveContent()) }
In order to have the same url that work for both Jar or in local, the url (or path) needs to be a relative path from the repository root.
..meaning, the location of your file or folder from your src folder.
could be "/main/resources/your-folder/" or "/client/notes/somefile.md"
The url must be a relative path from the repository root.
it must be "src/main/resources/your-folder/" or "src/client/notes/somefile.md"
Now you get the drill, and luckily for Intellij Idea users, you can get the correct path with a right-click on the folder or file -> copy Path/Reference.. -> Path From Repository Root (this is it)
Last, paste it and do your thing.

I am using a collector node on IIB to collect messages. Can someone guide with sample ESQL after collector node to process a message collection?

I'm using the FileOutputNode to write the data into the file. I have tried writing the collection messages in the file but every time the file created is of 0 byte and there is no data.
SET OutputRoot.Properties = InputRoot.Properties;
CREATE FIELD OutputRoot.Collection.IN;
DECLARE refCollection REFERENCE TO InputRoot.Collection.IN[1];
WHILE LASTMOVE(refCollection) DO
SET OutputRoot.Collection.IN= refCollection;
SET i = i + 1;
MOVE refCollection NEXTSIBLING REPEAT TYPE NAME;
END WHILE;
RETURN TRUE;
It is very difficult to provide help without knowing what your input message tree looks like.
You should follow the instructions here: https://www.ibm.com/support/knowledgecenter/en/SSMKHH_10.0.0/com.ibm.etools.mft.doc/bc16130_.htm
If you need further help, you should
add Trace nodes into your message flow before and after the Compute node and set the Pattern property on both nodes to ${Root}. This will allow you to see (and share) the structure of InputRoot and OutputRoot.
Enable user trace using the console commands mqsichangetrace, mqsireadlog, mqsiformatlog. This will show you exactly what the message flow is doing. It will also contain the full text of any errors that are being reported.

Apache Camel: How to use "done" files to identify records written into a file is over and it can be moved

As the title suggests, I want to move a file into a different folder after I am done writing DB records to to it.
I have already looked into several questions related to this: Apache camel file with doneFileName
But my problem is a little different since I am using split, stream and parallelProcessing for getting the DB records and writing to a file. I am not able to know when and how to create the done file along with the parallelProcessing. Here is the code snippet:
My route to fetch records and write it to a file:
from(<ROUTE_FETCH_RECORDS_AND_WRITE>)
.setHeader(Exchange.FILE_PATH, constant("<path to temp folder>"))
.setHeader(Exchange.FILE_NAME, constant("<filename>.txt"))
.setBody(constant("<sql to fetch records>&outputType=StreamList))
.to("jdbc:<endpoint>)
.split(body(), <aggregation>).streaming().parallelProcessing()
.<some processors>
.aggregate(header(Exchange.FILE_NAME), (o, n) -> {
<file aggregation>
return o;
}).completionInterval(<some time interval>)
.toD("file://<to the temp file>")
.end()
.end()
.to("file:"+<path to temp folder>+"?doneFileName=${file:header."+Exchange.FILE_NAME+"}.done"); //this line is just for trying out done filename
In my aggregation strategy for the splitter I have code that basically counts records processed and prepares the response that would be sent back to the caller.
And in my other aggregate outside I have code for aggregating the db rows and post that writing into the file.
And here is the file listener for moving the file:
from("file://<path to temp folder>?delete=true&include=<filename>.*.TXT&doneFileName=done")
.to(file://<final filename with path>?fileExist=Append);
Doing something like this is giving me this error:
Caused by: [org.apache.camel.component.file.GenericFileOperationFailedException - Cannot store file: <folder-path>/filename.TXT] org.apache.camel.component.file.GenericFileOperationFailedException: Cannot store file: <folder-path>/filename.TXT
at org.apache.camel.component.file.FileOperations.storeFile(FileOperations.java:292)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.component.file.GenericFileProducer.writeFile(GenericFileProducer.java:277)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.component.file.GenericFileProducer.processExchange(GenericFileProducer.java:165)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.component.file.GenericFileProducer.process(GenericFileProducer.java:79)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.util.AsyncProcessorConverterHelper$ProcessorToAsyncProcessorBridge.process(AsyncProcessorConverterHelper.java:61)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.processor.SendProcessor.process(SendProcessor.java:141)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.management.InstrumentationProcessor.process(InstrumentationProcessor.java:77)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.processor.RedeliveryErrorHandler.process(RedeliveryErrorHandler.java:460)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.processor.Pipeline.process(Pipeline.java:121)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.processor.Pipeline.process(Pipeline.java:83)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.processor.CamelInternalProcessor.process(CamelInternalProcessor.java:190)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.component.seda.SedaConsumer.sendToConsumers(SedaConsumer.java:298)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.component.seda.SedaConsumer.doRun(SedaConsumer.java:207)[209:org.apache.camel.camel-core:2.16.2]
at org.apache.camel.component.seda.SedaConsumer.run(SedaConsumer.java:154)[209:org.apache.camel.camel-core:2.16.2]
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)[:1.8.0_144]
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)[:1.8.0_144]
at java.lang.Thread.run(Thread.java:748)[:1.8.0_144]
Caused by: org.apache.camel.InvalidPayloadException: No body available of type: java.io.InputStream but has value: Total number of records discovered: 5
What am I doing wrong? Any inputs will help.
PS: Newly introduced to Apache Camel
I would guess that the error comes from .toD("file://<to the temp file>") trying to write a file, but finds the wrong type of body (String Total number of records discovered: 5 instead of InputStream.
I don't understand why you have one file-destinations inside the splitter and one outside of it.
As #claus-ibsen suggested try to remove this extra .aggregate(...) in your route. To split and re-aggregate it is sufficient to reference the aggregation strategy in the splitter. Claus also pointed to an example in the Camel docs
from(<ROUTE_FETCH_RECORDS_AND_WRITE>)
.setHeader(Exchange.FILE_PATH, constant("<path to temp folder>"))
.setHeader(Exchange.FILE_NAME, constant("<filename>.txt"))
.setBody(constant("<sql to fetch records>&outputType=StreamList))
.to("jdbc:<endpoint>)
.split(body(), <aggregationStrategy>)
.streaming().parallelProcessing()
// the processors below get individual parts
.<some processors>
.end()
// The end statement above ends split-and-aggregate. From here
// you get the re-aggregated result of the splitter.
// So you can simply write it to a file and also write the done-file
.to(...);
However, if you need to control the aggregation sizes, you have to combine splitter and aggregator. That would look somehow like this
from(<ROUTE_FETCH_RECORDS_AND_WRITE>)
.setHeader(Exchange.FILE_PATH, constant("<path to temp folder>"))
.setHeader(Exchange.FILE_NAME, constant("<filename>.txt"))
.setBody(constant("<sql to fetch records>&outputType=StreamList))
.to("jdbc:<endpoint>)
// No aggregationStrategy here so it is a standard splitter
.split(body())
.streaming().parallelProcessing()
// the processors below get individual parts
.<some processors>
.end()
// The end statement above ends split. From here
// you still got individual records from the splitter.
.to(seda:aggregate);
// new route to do the controlled aggregation
from("seda:aggregate")
// constant(true) is the correlation predicate => collect all messages in 1 aggregation
.aggregate(constant(true), new YourAggregationStrategy())
.completionSize(500)
// not sure if this 'end' is needed
.end()
// write files with 500 aggregated records here
.to("...");

Building a Route to output differences of two files

Trying to put together a file diff route... could someone help? here is what I have ->
CsvDataFormat csv = new CsvDataFormat();
csv.setDelimiter(",");
from("file:inputdir?delete=true&sortBy=ignoreCase:file:name")
.unmarshal(csv)
.pollEnrich("file:backup?fileName=test.csv&sendEmptyMessageWhenIdle=true")
.unmarshal(csv)
// Need to aggregate here!!!!
.log("test");
A csv file gets dropped in the /input directory and then a backup file is consumed from the /backup directory. I would like to compare these two files and output the difference.
This is not a specific Camel problem. In order to solve this problem you may implement a diff functionality on your own, or you may use an existing library such as java-diff-utils.
Pseudocode:
// read file 1 into a list "list1"
// read file 2 into a list "list2"
// use java-diff-utils to calculate the difference
Patch patch = DiffUtils.diff(list1, list2);

Importing Excel file with dynamic name into SQL table via SSIS?

I've done a few searches here, and while some issues are similar, they don't seem to be exactly what I need.
What I'm trying to do is import an Excel file into a SQL table via SSIS, but the problem is that I will never know the exact filename. We get files at no steady interval, and the file usually has a date/month in the name. For instance, our current file is "Census Data - May 2013.xls". We will only ever load ONE file at a time, so I don't need to loop through a directory for multiple Excel files.
My concept is that I can take this file, copy it to a "Loading" directory, and load it from there. At the start of the package, I will first clear out the loading directory, then scan the original directory for an Excel file, copy it to the loading directory and then load it into SQL. I suppose I may have to store the file names somewhere so I don't copy the same file into the loading directory in subsequent months, but I'm not really sure of the best way to handle that.
I've pretty much got everything down except the part that scans the directory for the Excel file and copies it to the loading directory. I've taken the majority of my info from this page, which (again) is close to what I want to do but not quite exactly the solution I need.
Can anyone get me over the finish line? I can't seem to get the Excel Connection Manager right (this is my first time using variables), and I can't figure out how to get the file into the Loading directory.
Problem statement
How do I dynamically identify a file name?
You will require some mechanism to inspect the contents of a folder and see what exists. Specifically, you are looking for an Excel file in your "Loading" directory. You know the file extension and that is it.
Resolution A
Use a ForEach File Enumerator.
Configure the Enumerator with an Expression on FileSpec of *.xls or *.xlsx depending on which flavor of Excel you're dealing with.
Add another Expression on Directory to be your Loading directory.
I typically create SSIS Variables named FolderInput and FileMask and assign those in the Enumerator.
Now when you run your package, the Enumerator is going to look in Diretory and find all the files that match the FileSpec.
Something needs to be done with what is found. You need to use that file name that the Enumerator returns. That's done through the Variable Mappings tab. I created a third Variable called CurrentFileName and assign it the results of the enumerator.
If you put a Script Task inside the ForEach Enumerator, you should be able to see that the value in the "Locals" window for #[User::CurrentFileName] has updated from the Design time value of whatever to the "real" file name.
Resolution B
Use a Script Task.
You will still need to create a Variable to hold the current file name and it probably won't hurt to also have the FolderInput and FileMask Variables available. Set the former as ReadWrite and the latter as ReadOnly variables.
Chose the .NET language of your choice. I'm using C#. The method System.IO.Directory.EnumerateFiles
using System;
using System.Data;
using System.IO;
using Microsoft.SqlServer.Dts.Runtime;
using System.Windows.Forms;
namespace ST_fe2ea536a97842b1a760b271f190721e
{
[Microsoft.SqlServer.Dts.Tasks.ScriptTask.SSISScriptTaskEntryPointAttribute]
public partial class ScriptMain : Microsoft.SqlServer.Dts.Tasks.ScriptTask.VSTARTScriptObjectModelBase
{
public void Main()
{
string folderInput = Dts.Variables["User::FolderInput"].Value.ToString();
string fileMask = Dts.Variables["User::FileMask"].Value.ToString();
try
{
var files = Directory.EnumerateFiles(folderInput, fileMask, SearchOption.AllDirectories);
foreach (string currentFile in files)
{
Dts.Variables["User::CurrentFileName"].Value = currentFile;
break;
}
}
catch (Exception e)
{
Dts.Events.FireError(0, "Script overkill", e.ToString(), string.Empty, 0);
}
Dts.TaskResult = (int)ScriptResults.Success;
}
enum ScriptResults
{
Success = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Success,
Failure = Microsoft.SqlServer.Dts.Runtime.DTSExecResult.Failure
};
}
}
Decision tree
Given the two resolutions to the above problem, how do you chose? Normally, people say "It Depends" but there only possible time it would depend is if the process should stop/error out in the case that more than one file did exist in the Loading folder. That's a case that the ForEach enumerator would be more cumbersome than a script task. Otherwise, as I stated in my original response that adds cost to your project for Development, Testing and Maintenance for no appreciable gain.
Bits and bobs
Further addressing nuances in the question: Configuring Excel - you'll need to be more specific in what isn't working. Both Siva's SO answer and the linked blogspot article show how to use the value of the Variable I call CurrentFileName to ensure the Excel File is pointing to the "right" file.
You will need to set the DelayValidation to True for both the Connection Manager and the Data Flow as the design-time value for the Variable will not be valid when the package begins execution. See this answer for a longer explanation but again, Siva called that out in their SO answer.

Resources