ChannelHandler handling content of unknown size - apache-camel

I am still struggling with Camel (2.16.1) and Netty (4.0.33) to make them receive TCP content of arbitrary length. Because the size of the incoming TCP content is unknown, I have not yet been able to create a working decoder for it.
Let me describe my problem with an example. Let's say I have a file with a length of 3129 bytes. When I nc that file to my route, the size is not known until the last byte is read:
cat file.bin | nc localhost 10001
My route is defined like this:
from( "netty4:tcp://127.0.0.1:10001?sync=false&allowDefaultCodec=false&
decoder=#factory&receiveBufferSize=1000000")
.to("file:/temp/in");
The factory looks like this because I need to make sure that each ChannelHandler is used only once:
public class Factory implements ChannelHandlerFactory {
    @Override
    public ChannelHandler newChannelHandler() {
        return new RawPrinterDecoder();
    }
}
In my decoder I have this code:
public class RawPrinterDecoder extends ReplayingDecoder<Void> {

    private final Job job = new Job(); // Job is my own type that accumulates the received bytes

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in,
            List<Object> out) throws Exception {
        while (in.isReadable()) {
            byte readByte = in.readByte();
            job.addContent(readByte);
        }
        in.discardReadBytes();
    }

    @Override
    public void channelInactive(ChannelHandlerContext ctx) throws Exception {
        System.out.println("Bytes in job: " + job.getSize());
    }
}
The problem with this is that instead of 3129 bytes I receive 9273. The reason is that the file is split into three segments of 1024 bytes and one of 57 bytes. These segments are passed repeatedly to my decoder, and although I try to invalidate them after they are first processed with in.discardReadBytes(), they are processed again, so instead of ...
segment1
segment2
segment3
segment4
... my decoder sees them like this
segment1
segment1+segment2
segment1+segment2+segment3
segment1+segment2+segment3+segment4
I tried to solve my problem by using checkpoint(), but the segments were still delivered repeatedly.
How can I make sure that each segment is processed only once and in the correct order? If this can be done more efficiently than reading single bytes, recommendations are welcome (inside a ReplayingDecoder, readableBytes() always returns about 2 GB, so I cannot use it to get the actual number of bytes).

It gets separated into segments because of the RecvByteBufAllocator that your server is using. You can change that this way:
@Bean
public ChannelInitializer<SocketChannel> channelInitializer() {
    return new ChannelInitializer<SocketChannel>() {
        @Override
        public void initChannel(SocketChannel ch) throws Exception {
            ch.config().setRecvByteBufAllocator(new FixedRecvByteBufAllocator(2048));
        }
    };
}
You can read all of the currently available bytes at once using:
ByteBuf buffer = in.readBytes(in.readableBytes());
or
ByteBuf buffer = in.readSlice(in.readableBytes());
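For context, here is a minimal sketch of a decoder built around that idea, consuming each arriving segment exactly once. It assumes the Job accumulator from the question, and it extends ByteToMessageDecoder rather than ReplayingDecoder so that readableBytes() reports the real size of the buffered data:

import java.util.List;

import io.netty.buffer.ByteBuf;
import io.netty.channel.ChannelHandlerContext;
import io.netty.handler.codec.ByteToMessageDecoder;

public class RawPrinterDecoder extends ByteToMessageDecoder {

    private final Job job = new Job(); // assumed: the asker's accumulator type

    @Override
    protected void decode(ChannelHandlerContext ctx, ByteBuf in, List<Object> out) {
        // Drain everything that has arrived so far; once read, these bytes
        // are removed from the cumulation buffer and never presented again.
        byte[] segment = new byte[in.readableBytes()];
        in.readBytes(segment);
        for (byte b : segment) {
            job.addContent(b);
        }
    }

    @Override
    public void channelInactive(ChannelHandlerContext ctx) throws Exception {
        System.out.println("Bytes in job: " + job.getSize());
        super.channelInactive(ctx); // let decodeLast handle any remaining bytes
    }
}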

Related

How to get the number of iteration in a split?

I am new to Apache Camel.
I need to split a file line by line and do some operation on each line.
At the end I need a footer line with information from the previous lines (the number of lines and the sum of the values of one column).
My understanding is that I should be using an aggregation strategy, so I tried something like this:
.split(body().tokenize("\r\n|\n"), sumAggregationStrategy)
    .process("fileProcessor")
In my aggregation strategy I just set two headers with the incremented values:
newExchange.getIn().setHeader("sum", sum);
newExchange.getIn().setHeader("numberOfLines", numberOfLines);
And in the processor I try to access those headers:
int sum = inMessage.getIn().getHeader("sum", Integer.class);
int numberOfLines = inMessage.getIn().getHeader("numberOfLines", Integer.class);
There are two problems.
First of all, the aggregation strategy seems to be called after the first iteration of the processor.
Second, my headers don't exist in the processor, so I can't access the information I need when I am at the last line of the file. The headers do exist in the oldExchange of the aggregator, though.
I think I can still do it, but I would have to create a new processor just for the purpose of making the last line of the file.
Is there something I'm missing with the aggregation strategies ? Is there a better way to do this ?
An aggregator will be called for every iteration of the split. This is how it is supposed to work.
The reason you don't see the headers within the processor is that headers live and die with the message and are not visible outside it. You need to set 'sum' and 'numberOfLines' as exchange properties instead. Because every iteration within a split results in a new exchange, you need to get the properties from the old exchange and set them again on the new exchange to pass them to subsequent components in the route.
This is how you could do it:
AggregationStrategy:
public class SumAggregationStrategy implements AggregationStrategy {

    @Override
    public Exchange aggregate(Exchange oldExchange, Exchange newExchange) {
        long sum = 0;
        long numberOfLines = 0;
        if (oldExchange != null) {
            sum = oldExchange.getProperty("sum", Long.class);
            numberOfLines = oldExchange.getProperty("numberOfLines", Long.class);
        }
        sum = sum + ((Line) newExchange.getIn().getBody()).getColumnValue();
        numberOfLines++;
        newExchange.setProperty("sum", sum);
        newExchange.setProperty("numberOfLines", numberOfLines);
        // CamelSplitComplete is set on every split exchange by the splitter;
        // returning newExchange makes it visible to the completion predicate.
        return newExchange;
    }
}
Route:
.split(body().tokenize("\r\n|\n"), sumAggregationStrategy)
    .completionPredicate(simple("${exchangeProperty.CamelSplitComplete} == true"))
    .process("fileProcessor")
    .to("file:your_file_name?fileExist=Append");
Processor:
public class FileProcessor implements Processor {

    @Override
    public void process(Exchange exchange) throws Exception {
        long sum = exchange.getProperty("sum", Long.class);
        long numberOfLines = exchange.getProperty("numberOfLines", Long.class);
        String footer = "Your Footer String";
        exchange.getIn().setBody(footer);
    }
}
Using a custom aggregator like Srini suggested is a good idea. It might also support streaming large files better.
However, if you want to keep things simple and avoid split and aggregation, you could just use .tokenize("\r\n|\n") and convertBodyTo(List.class) to convert the string to a list of strings:
from("direct:addFooter")
.routeId("addFooter")
.setBody().tokenize("\r\n|\n")
.convertBodyTo(List.class)
.process(exchange -> {
List<String> rows = exchange.getMessage().getBody(List.class);
int sum = 0;
for (int i = 0; i < rows.size(); i++) {
sum += Integer.parseInt(rows.get(i));
}
int numberOfLines = rows.size();
exchange.getMessage().setHeader("numberOfLines", numberOfLines);
exchange.getMessage().setHeader("sum", sum);
})
// Write data to file using file or stream component
// you could also use Velocity, FreeMarker or Mustache templates to format the
// result before writing it to file.
;

Send and receive a w3c.dom.Document over socket as byte[] Java

I send a document over socket like this:
sendFXML(asByteArray(getRequiredScene(fetchSceneRequest())));
private void sendFXML(byte[] requiredFXML) throws IOException, TransformerException {
    dataOutputStream.write(requiredFXML);
    dataOutputStream.flush();
}

private Document getRequiredScene(String requiredFile) throws IOException, ParserConfigurationException, SAXException, TransformerException {
    return new XMLLocator().getDocumentOrReturnNull(requiredFile);
}

private String fetchSceneRequest() throws IOException, ClassNotFoundException {
    return dataInputStream.readUTF();
}
On the XMLLocator side it finds the correct document and parses it correctly; I can see this by printing the whole document to the console.
But I cannot handle it on the client side, where it's fetched by:
public static void receivePage() throws IOException {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();
    byte[] data = new byte[989898];
    int bytesRead = -1;
    while ((bytesRead = dataInputStream.read(data)) != -1) { // stops here
        baos.write(data, 0, bytesRead);
    }
    Files.write(Paths.get(FILE_TO_RECEIVED), data);
}
After the first iteration of the while loop, it just stops at the commented place.
I don't know whether I have an error on the server side and send the doc in an incorrect format, or whether I read the sent byte array incorrectly. Where is the problem?
Edit:
For debugging purposes, in the receivePage() method, I've chosen a different way of reading the byte array from the server:
int count = inputStream.available();
byte[] b = new byte[count];
int bytes = dataInputStream.read(b);
System.out.println(bytes);
for (byte by : b) {
    System.out.print((char) by);
}
And now I'm able to print the fetched FXML to the console, but a new problem has appeared.
On debug, it normally receives the byte[] from the server, writes 2024 for count and displays the content of the file, but if I run the app normally via Shift+F10 it fetches nothing and just writes 0 to the console.
Edit2:
For some reason, once again on debug, it's even able to write into a file:
for (byte by : b) {
    Files.write(Paths.get(FILE_TO_RECEIVED), b);
    System.out.print((char) by);
}
But when I try to return this fxml on debug and then show it like this:
Parent fxmlToShow = FXMLLoader.load(getClass().getResource("/network/gui.fxml"));
Scene childScene = new Scene(fxmlToShow);
Stage window = (Stage) ((Node) ae.getSource()).getScene().getWindow();
window.setScene(childScene);
return window;
It shows only previous files. On the first debug attempt it shows a blank page when I ask for the 1st one from the server. On the second debug attempt, when I ask for the 3rd page from the server, it shows me the previously requested one, and so on.
To me this seems absolutely insane, because the fxml file actually refreshes before the line
Parent fxmlToShow = FXMLLoader.load(getClass().getResource("/network/gui.fxml"));
is invoked.
Thanks, everybody, for participating.
The issue of FXML files displaying incorrectly was caused by an incorrect FILE_TO_RECEIVED path.
When FXMLLoader.load(getClass().getResource("/network/gui.fxml")); loads gui.fxml, it takes it, in my case, not from D:\\JetBrains\\IdeaProjects\\Client\\src\\network\\gui.fxml but from D:\\JetBrains\\IdeaProjects\\Client\\OUT\\PRODUCTION\\Client\\network\\gui.fxml.
That doesn't seem obvious to me.
As for the different behaviour between debug and run: in the receivePage() method you need to wait until data is available.
int count = inputStream.available();
If you read the docs for this method you will see:
"Returns an estimate of the number of bytes that can be read (or skipped over) from this input stream ..."
"The available method for class InputStream always returns 0 ..."
So you just need to wait for data to become available:
while (inputStream.available() == 0) {
    Thread.sleep(100);
}
Otherwise it just prepares byte[] b = new byte[count]; for 0 bytes, and nothing can be written into it.
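Putting that together, receivePage() could look roughly like this (a sketch only; inputStream, dataInputStream and FILE_TO_RECEIVED are assumed from the question, with both stream variables wrapping the same socket stream):

public static void receivePage() throws IOException, InterruptedException {
    // available() only estimates what can be read without blocking,
    // so poll until the first bytes have actually arrived.
    while (inputStream.available() == 0) {
        Thread.sleep(100);
    }
    byte[] b = new byte[inputStream.available()];
    int bytesRead = dataInputStream.read(b); // may still be a partial read
    System.out.println("Received " + bytesRead + " bytes");
    Files.write(Paths.get(FILE_TO_RECEIVED), b);
}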

TextEncodings.Base64Url.Decode vs Convert.FromBase64String

I was working on creating a method that would generate a JWT token. Part of the method reads a value from my web.config that serves as the "secret" used to generate the hash used to create the signature for the JWT token.
<add key="MySecret" value="j39djak49H893hsk297353jG73gs72HJ3tdM37Vk397" />
Initially I tried using the following to convert the "secret" value to a byte array.
byte[] key = Convert.FromBase64String(ConfigurationManager.AppSettings["MySecret"]);
However, an exception was thrown when this line was reached ...
The input is not a valid Base-64 string as it contains a non-base 64 character, more than two padding characters, or an illegal character among the padding characters.
So I looked into the OAuth code and saw another method being used to change a base64 string into a byte array:
byte[] key = TextEncodings.Base64Url.Decode(ConfigurationManager.AppSettings["MySecret"]);
This method worked without issue. To me it looks like they are doing the same thing: changing a Base64 text value into an array of bytes. However, I must be missing something. Why does Convert.FromBase64String fail while TextEncodings.Base64Url.Decode works?
I came across the same thing when I migrated our authentication service to .NET Core. I had a look at the source code for the libraries we used in our previous implementation, and the difference is actually in the name itself.
The TextEncodings class has two types of text encoders, Base64TextEncoder and Base64UrlTextEncoder. The latter modifies the string slightly so that the base64 string can be used in a URL.
My understanding is that it is quite common to replace + and / with - and _. As a matter of fact, we have been doing the same with our handshake tokens. Additionally, the padding character(s) at the end can be removed. This leaves us with the following implementation (taken from the source code):
public class Base64UrlTextEncoder : ITextEncoder
{
    public string Encode(byte[] data)
    {
        if (data == null)
        {
            throw new ArgumentNullException("data");
        }
        return Convert.ToBase64String(data).TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    public byte[] Decode(string text)
    {
        if (text == null)
        {
            throw new ArgumentNullException("text");
        }
        return Convert.FromBase64String(Pad(text.Replace('-', '+').Replace('_', '/')));
    }

    private static string Pad(string text)
    {
        var padding = 3 - ((text.Length + 3) % 4);
        if (padding == 0)
        {
            return text;
        }
        return text + new string('=', padding);
    }
}
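The Pad helper restores the stripped '=' characters: for a 22-character trimmed token, padding = 3 - ((22 + 3) % 4) = 3 - 1 = 2, so two '=' are appended before decoding.
As an aside (not part of the library above), the same standard-versus-URL-safe alphabet distinction exists in Java's java.util.Base64, which makes the failure mode easy to reproduce:

import java.util.Base64;

public class Base64UrlDemo {
    public static void main(String[] args) {
        byte[] data = {(byte) 0xfb, (byte) 0xef, (byte) 0xff};

        // Standard alphabet uses '+' and '/'; the URL-safe one uses '-' and '_'.
        System.out.println(Base64.getEncoder().encodeToString(data));    // "++//"
        System.out.println(Base64.getUrlEncoder().encodeToString(data)); // "--__"

        // Decoding "--__" with the standard decoder fails, just like
        // Convert.FromBase64String fails on a base64url-encoded secret.
        try {
            Base64.getDecoder().decode("--__");
        } catch (IllegalArgumentException e) {
            System.out.println("standard decoder rejected it: " + e.getMessage());
        }
        System.out.println(Base64.getUrlDecoder().decode("--__").length); // 3
    }
}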

How to use Java nio to write an uploaded image from ServletInputStream?

I've done the upload using ByteArrayOutputStream, and now I want to use NIO to write an image from a ServletInputStream to a file on the hard disk. I've tried a couple of ways, but with no luck so far. Right now I have:
@Override
public void doPost(final HttpServletRequest request, final HttpServletResponse response)
        throws IOException, ServletException {
    final String fileName = "img_" + UUID.randomUUID().toString() + ".jpg";
    final String filePathName = "E:\\tmp\\" + fileName;
    final FileChannel outChannel = new FileOutputStream(filePathName).getChannel();
    final ReadableByteChannel inChannel = Channels.newChannel(request.getInputStream());
    outChannel.transferFrom(inChannel, 0, request.getContentLength());
    inChannel.close();
    outChannel.close();
}
The specified file is generated with the same size as the original, but it cannot be opened. What have I done wrong here, and what is the proper way?
Thanks.
I don't see why the '--' is being put in the file, unless it is being sent to you, but you need to call transferFrom() in a loop. You can't assume the entire file is transferred in one call. It returns the number of bytes it transferred on each call, so you can track the total number transferred: if it's complete, break; otherwise add it to the offset, subtract it from the length, and repeat.
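A sketch of such a loop, reusing the channels and content length from the question (transferFrom returns how many bytes it actually moved, which may be fewer than requested):

long position = 0;
long remaining = request.getContentLength();
while (remaining > 0) {
    long transferred = outChannel.transferFrom(inChannel, position, remaining);
    if (transferred <= 0) {
        break; // the source reached end-of-stream before the expected length
    }
    position += transferred;
    remaining -= transferred;
}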

How to read byte by byte from appengine datastore Entity Object

In a nutshell, since GAE cannot write to the filesystem, I have decided to persist my data in the datastore (using JDO). Now I would like to retrieve the data byte by byte and pass it to the client as an input stream. There's code from the gwtupload library (http://code.google.com/p/gwtupload/) (see below) which breaks on GAE because it writes to the filesystem. I'd like to provide a GAE-ported solution.
public static void copyFromInputStreamToOutputStream(InputStream in, OutputStream out) throws IOException {
    byte[] buffer = new byte[100000];
    while (true) {
        synchronized (buffer) {
            int amountRead = in.read(buffer);
            if (amountRead == -1) {
                break;
            }
            out.write(buffer, 0, amountRead);
        }
    }
    in.close();
    out.flush();
    out.close();
}
One workaround I have tried (it didn't work) is to retrieve the data from the datastore as a resource, like this:
InputStream resourceAsStream = null;
Object lf = null; // holds the query result
PersistenceManager pm = PMF.get().getPersistenceManager();
try {
    Query q = pm.newQuery(ImageFile.class);
    lf = q.execute();
    resourceAsStream = getServletContext().getResourceAsStream((String) pm.getObjectById(lf));
} finally {
    pm.close();
}
if (lf != null) {
    response.setContentType(receivedContentTypes.get(fieldName));
    copyFromInputStreamToOutputStream(resourceAsStream, response.getOutputStream());
}
I welcome your suggestions.
Regards
Store the data in a byte array, and use a ByteArrayInputStream or ByteArrayOutputStream to pass it to libraries that expect streams.
If by 'client' you mean an HTTP client or browser, though, there's no reason to do this: just deal with regular byte arrays on your end and send them to/from the user as you would any other data. The only reason to mess around with streams like this is if you have some library that expects them.
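For instance, the copyFromInputStreamToOutputStream helper above could be fed straight from the bytes stored on the entity (a sketch; getData() is a hypothetical accessor for the persisted blob):

// Wrap the persisted bytes in an in-memory stream and reuse the
// existing copy helper to write them to the HTTP response.
byte[] imageBytes = imageFile.getData(); // hypothetical accessor on the JDO entity
InputStream in = new ByteArrayInputStream(imageBytes);
copyFromInputStreamToOutputStream(in, response.getOutputStream());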
