Unmarshalling an XML file downloaded from Amazon S3 with Camel - apache-camel

I want to download xml files from an S3 bucket, then unmarshal them to insert the data in a database:
...
from("aws-s3://myBucket?accessKey=xxx&secretKey=yyy")
.to("file:D:/output?fileName=${in.header.CamelAwsS3Key}")
.unmarshal(new JaxbDataFormat("com.xxx"))
...
I am able to download the files when I don't try to unmarshal, but I get this error when I try to unmarshal:
org.apache.camel.TypeConversionException:
Error during type conversion from type:
com.amazonaws.services.s3.model.S3ObjectInputStream to the required type:
javax.xml.stream.XMLStreamReader with value com.amazonaws.services.s3.model.S3ObjectInputStream@67b8328b due javax.xml.stream.XMLStreamException:
java.io.IOException: Attempted read on closed stream.
Since I am new to Camel, there are probably things I don't understand yet...
When I pipe endpoints, doesn't each endpoint get the message the way the previous endpoint "modified" it? In my case it looks like the S3 stream is being unmarshalled instead of the XML file newly created locally by the download, hence the error.
My understanding is that if I do .from().to().to(), the second .to() doesn't know what came from .from(), so if my first .to() creates an XML file, the second .to() handles the message as an XML file. Am I wrong?
Maybe I need to create 2 routes? I was able to do the other way around with only 1 route though, from database to file to S3.
Do I need to write my own converter in that case?
Thanks!

When you have the 2nd to, Camel needs to read the stream again, but it's closed. Some streams are only readable once, and this one was already read in the first to, where you write it to a file.
So you can either enable stream caching to allow Camel to re-read the stream again. See more in this FAQ and the links:
http://camel.apache.org/why-is-my-message-body-empty.html
http://camel.apache.org/stream-caching.html
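For example, stream caching can be enabled directly on the route. A minimal sketch, reusing the endpoints from the question:
from("aws-s3://myBucket?accessKey=xxx&secretKey=yyy")
    // cache the S3 stream so it can be re-read after the file endpoint has consumed it
    .streamCaching()
    .to("file:D:/output?fileName=${in.header.CamelAwsS3Key}")
    .unmarshal(new JaxbDataFormat("com.xxx"));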
You can also break it into 2 routes:
from aws-s3
to file
from file
unmarshal
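In the Java DSL the two-route variant could look roughly like this (again a sketch; the second route unmarshals from the file the first route wrote, so the S3 stream is only read once):
from("aws-s3://myBucket?accessKey=xxx&secretKey=yyy")
    .to("file:D:/output?fileName=${in.header.CamelAwsS3Key}");

// a second route picks up the downloaded files and unmarshals them
from("file:D:/output")
    .unmarshal(new JaxbDataFormat("com.xxx"));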

Related

Is it possible to send a byte array in a JMS message in jmeter by mapmessage?

I would like to know if anyone has been able to send a byte array in JMeter via MapMessage without the output being a string.
As per the JMS Publisher documentation it is possible:
The Object message is implemented and works as follows:
Put the JAR that contains your object and its dependencies in jmeter_home/lib/ folder
Serialize your object as XML using XStream
Either put result in a file suffixed with .txt or .obj or put XML content directly in Text Area
Note that if the message is in a file, replacement of properties will not occur, while it will if you use the Text Area.
If that is not what you're looking for, you can always call BytesMessage class functions directly, e.g. writeBytes(), from JSR223 test elements using the Groovy language.
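For illustration, the plain JMS API calls involved look like this. A Java sketch; in a JSR223 element the session and producer would come from the test element's context rather than being created by hand:
import javax.jms.BytesMessage;
import javax.jms.JMSException;
import javax.jms.MessageProducer;
import javax.jms.Session;

public class SendRawBytes {
    static void send(Session session, MessageProducer producer, byte[] payload) throws JMSException {
        BytesMessage msg = session.createBytesMessage();
        msg.writeBytes(payload); // writes the raw byte array, no string conversion
        producer.send(msg);
    }
}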

Apache Camel Interceptor with regular expression

This is my route. I want to send a file to an Azure blob, setting the blob name to the file name without the extension. I also want to filter the whitespace out of the file names. I am thinking of using an interceptor:
from("file://C:/camel/source1").recipientList(simple("azure-blob://datastorage/container1/${header.fileName}?credentials=#credentials&operation=updateBlockBlob"))
I want to invoke the interceptor only for the updateBlockBlob operation:
interceptSendToEndpoint("^(azure-blob:).+(operation=updateBlockBlob)").setHeader("fileName",simple("${file:onlyname.noext}")).convertBodyTo(File.class)
The above code works with interceptFrom().
I tried replacing the regular expression with a wildcard like azure*, i.e. interceptSendToEndpoint("azure*"). It did not work.
What's wrong with the above code? Is it because of recipientList?
Also, what features does simple have to remove whitespace?
Is there a better way to generate blob names dynamically?
Here is the documentation from Camel on interceptors:
http://camel.apache.org/intercept.html
interceptFrom that intercepts incoming Exchange in the route.
interceptSendToEndpoint that intercepts when an Exchange is about to be sent to the given Endpoint.
So I suspect the Exchange is already formed and Camel expects the URL to be resolved.
So the header needs to be set before the exchange is created for the Azure endpoint.
I did the following: to set the header I use interceptFrom, and to convert the object into a File I use interceptSendToEndpoint:
interceptSendToEndpoint("^(azure-blob:).+(operation=updateBlockBlob)").convertBodyTo(File.class)
interceptFrom().setHeader("fileName", simple("${file:onlyname.noext}".replaceAll("[^a-zA-Z\\d]", "")))
Managed to get rid of the whitespace too
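Note that String.replaceAll above runs on the expression text itself, not on the evaluated file name. A sturdier alternative, assuming the route from the question, is to sanitize the name in a small processor before the recipient list runs:
from("file://C:/camel/source1")
    .process(exchange -> {
        // CamelFileNameOnly holds the file name without the path
        String name = exchange.getIn().getHeader(Exchange.FILE_NAME_ONLY, String.class);
        int dot = name.lastIndexOf('.');
        if (dot > 0) {
            name = name.substring(0, dot); // drop the extension
        }
        // keep only letters and digits, which also removes whitespace
        exchange.getIn().setHeader("fileName", name.replaceAll("[^a-zA-Z\\d]", ""));
    })
    .recipientList(simple("azure-blob://datastorage/container1/${header.fileName}?credentials=#credentials&operation=updateBlockBlob"));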

Export Data to file in GNU Radio Companion

I have run into a slight problem with GNU Radio. I inserted a “File Sink” block into GNU Radio companion. I was receiving data last week, but coming back to the classroom today, I am not able to execute the file anymore. Do you have any idea what is wrong?
Basically, what I am trying to do is export the data created by the GRC file to a file using a File Sink block. That file will then be parsed with Python and the data uploaded to a database. My problem now is that I cannot execute the file to export the data.
Below is some code from the Python script associated with the File Sink:
audiodata = gr.file_sink(gr.sizeof_float, "audio.dat")
self.connect(src0, audio)
audiodata = gr.file_sink(gr.sink(gr.sizeof_complex, "audio.dat")
Below is a link of my GRC File.
http://i58.tinypic.com/10wv78z.png
If anyone has a better way to export the data from GRC, please let me know.
The second line of Python looks broken. Where did you get it from? I haven't seen a bug in GRC's Python code generation yet, so this is surprising.
Regarding the red arrow: this most probably indicates that something is wrong with the data type of the file sink. You should set the type to float, set it back to complex, and see if that solves the problem. If it does not, then your GRC file is broken, and you'll either need to manually look at the XML or re-build it from scratch, sorry :(
I have not seen XML corruption in GRC either, so please make sure your data storage is not corrupted.
I think the second line should be
self.connect(src0, audiodata)
The lines look similar to those in "Capturing Signals in GNU Radio.pdf", available on the internet.

Apache Camel Copying Files from multiple source folders to multiple destination folders

I am new to Camel. We are building an EDI engine, and our requirement is to read files from multiple folders, then parse the message type and the receiver id, and based on those route the messages to different folders.
The source, message type, receiver id and destination cannot be hardcoded in Camel; instead they should be read from the database, and the routes need to be built dynamically.
Please let me know what strategy we should follow.
Thanks,
Jayadeep
As I understand from your comments, you can read from multiple folders by adding routes dynamically, but you are facing an issue when trying to decide where to send the messages, as the destination, headers etc. are being read from the database.
Here's how I would do it.
Get the file --> enrich it with a database call to get the receiver id etc. --> use XPath to extract the receiver id etc. and set them in a property or header --> use XSLT to remove the values you added for the database call, so you have the original message again --> now use a router and look at the properties/headers to decide the <camel:to> path.
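A rough sketch of that flow in the Java DSL. Everything here is a placeholder: the folder URIs, the XPath expressions, the stylesheet, and RoutingEnricher, a hypothetical bean that does the database lookup:
from("file://edi/inbox")
    // enrich the message with routing data looked up from the database
    .bean(RoutingEnricher.class)
    // pull the routing values out of the enriched XML into headers
    .setHeader("receiverId", xpath("/envelope/receiverId/text()"))
    .setHeader("messageType", xpath("/envelope/messageType/text()"))
    // strip the enrichment again so the original message is restored
    .to("xslt:strip-enrichment.xsl")
    // toD (Camel 2.16+) resolves the target folder per message
    .toD("file://edi/out/${header.receiverId}/${header.messageType}");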

ssis - flat file source - The process cannot access the file because it is being used by another process

I am attempting to use the SSIS flat file source to read a log file that is being actively written to. Looping and waiting for the file to be released is not practical, as this is an active log file held by a service.
Ideally, I'm looking for a setting on the flat file source similar to the C# code below. If not that, what route can I take to read the flat file? I'm attempting to stay within SSIS, as I sincerely can't believe this can't be done with stock parts, and assume I'm just missing something.
Using C#, I can successfully open the exact file on which the flat file source errors:
System.IO.FileStream file = new System.IO.FileStream(
    file_to_hash.FullName,
    System.IO.FileMode.Open,
    System.IO.FileAccess.Read,
    System.IO.FileShare.ReadWrite
);
This is the error message experienced in SSIS:
Warning: 0x80070020 at Data Flow Task, Flat File Source [1]: The process cannot access the file because it is being used by another process.
Error: 0xC020200E at Data Flow Task, Flat File Source [1]: Cannot open the datafile "XXX".
Both ideas by Tim and Cade would work. I chose Tim's approach of copying the file, because I already had the code (both the copy and the data transformation), and changing the name/path of the file going into the data transformation was a configuration setting of the app being built. Wish I could mark it answered, but I asked the question as an unregistered user.
You probably need to write a custom data source - possibly as simple as a script task.
