Looping over byte channels - nio

While reading from the good old InputStream, I used the following code (with which I was never comfortable):
int i;
InputStream is = ....;
while ((i = is.read()) != -1) {
    ....
}
Now I'm trying to read 10 MB from an InputStream using NIO:
protected void doPost(HttpServletRequest request,
HttpServletResponse response) throws ServletException, IOException {
// TODO Auto-generated method stub
System.out.println("In Controller.doPost(...)");
ByteBuffer chunk = ByteBuffer.allocateDirect(1000000);
/* Source channel */
int numRead = 0;
ReadableByteChannel rbc = Channels.newChannel(request.getInputStream());
/* Destination channel */
File destFile = new File(
"D:\\SegyDest.sgy");
FileOutputStream destFileFos = new FileOutputStream(destFile);
FileChannel destFileChannel = destFileFos.getChannel();
/* Read-Write code */
while (numRead >= 0) {
chunk.rewind();
numRead = rbc.read(chunk);
System.out.println("numRead = " + numRead);
chunk.rewind();
destFileChannel.write(chunk);
}
/* clean-up */
rbc.close();
destFileChannel.close();
destFileFos.close();
request.setAttribute("ops", "File Upload");
request.getRequestDispatcher("/jsp/Result.jsp").forward(request,
response);
}
My question is: how do I loop over the source channel to read all the bytes?

Or perform I/O in chunks of more than 1 byte, using the API like so:
byte[] bA = new byte[4096];
int i;
InputStream is = ....;
OutputStream os = ....;
while ((i = is.read(bA)) != -1) {
    os.write(bA, 0, i);
}
I've looked at your other question and my comments still stand. NIO is not the solution you are looking for. You have a low-end machine with limited RAM acting as a proxy.
The best you can do is have your Servlet create a new thread, and have this thread create and set up an outgoing connection using NIO sockets or HTTP libraries. This new (and extra) thread waits for any of 3 things to happen, and it calls whatever APIs are needed to make progress in these 3 areas.
The 3 things are:
Trying to write data to the remote server (if there is buffered in memory data to send)
Waiting for the main Servlet thread to indicate there is new data in the shared buffer. Or that End-of-stream was reached.
Waiting for the main Servlet thread to indicate the extra thread needs to shutdown (this is error recovery and cleanup).
You probably need a drainWithTimeout(long millis) function that the doPost() method calls on the extra thread to give it an amount of time to push the final data to the remote server. This gets called when End-of-stream is observed by the Servlet on the InputStream.
You MUST ensure your extra thread is 100% reliably reaped before the doPost() method returns. So controlling startup/shutdown of it is important, especially in the scenarios that the InputStream had an error because the sending client disconnected or was idle too long.
The two threads (the normal Servlet thread in doPost() and the new thread you create) would then set up and share some arbitrary memory buffer, maybe 16 MB or more.
If you cannot have a 16 MB buffer due to the number of clients/concurrent users and the 2 GB of RAM, then you really should stick with the example code at the top of this answer, since the network and the O/S kernels will already buffer a few MB of data.
The point of using two threads is that you cannot fix the fact that the Servlet API receiving the data is a blocking I/O API. You cannot change that if you are writing the application to conform to the Servlet specification/standards. If you know that your specific Servlet container has a feature for this, that is outside the scope of this answer.
The two threads allow the main Servlet doPost thread to be in control and STILL use a blocking I/O API for the InputStream.
There is no point in using one thread and a blocking InputStream with a non-blocking OutputStream: you still have the problem that you cannot service the output stream while the in.read() API call is blocked (waiting for more data or End-of-stream).
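The two-thread arrangement described above could be sketched roughly as follows. This is only an illustration under stated assumptions: the class name Relay, the chunk-based queue standing in for the shared buffer, and the in-memory demo() helper are all hypothetical, and a plain OutputStream stands in for the outgoing connection. A real implementation would also have to close the remote connection during shutdown, since interrupting a thread does not unblock a socket write.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.TimeUnit;

/** Hypothetical relay; names and sizes are illustrative, not from any library. */
class Relay {
    private static final byte[] END_OF_STREAM = new byte[0];   // sentinel chunk
    private final BlockingQueue<byte[]> shared;                 // bounded shared buffer
    private final Thread sender;
    private volatile IOException failure;

    Relay(OutputStream remote, int maxChunks) {
        shared = new ArrayBlockingQueue<>(maxChunks);
        sender = new Thread(() -> {
            try {
                // Wait for new data, push it to the remote side, stop at end-of-stream.
                for (byte[] c = shared.take(); c != END_OF_STREAM; c = shared.take()) {
                    remote.write(c);
                }
                remote.flush();
            } catch (IOException e) {
                failure = e;
            } catch (InterruptedException e) {
                // Shutdown requested: let the thread end.
            }
        }, "relay-sender");
        sender.start();
    }

    /** Runs on the servlet thread: blocking reads, hand-off to the sender. */
    void pump(InputStream in, int chunkSize) throws IOException, InterruptedException {
        byte[] buf = new byte[chunkSize];
        int n;
        while ((n = in.read(buf)) != -1) {
            byte[] chunk = Arrays.copyOf(buf, n);
            // Waits while the buffer is full, but gives up if the sender has died.
            while (!shared.offer(chunk, 100, TimeUnit.MILLISECONDS)) {
                if (!sender.isAlive()) throw new IOException("sender stopped", failure);
            }
        }
    }

    /** Signals end-of-stream, then gives the sender up to millis to finish. */
    boolean drainWithTimeout(long millis) throws InterruptedException {
        long wait = Math.max(1, millis);          // join(0) would wait forever
        if (shared.offer(END_OF_STREAM, wait, TimeUnit.MILLISECONDS)) {
            sender.join(wait);
        }
        if (sender.isAlive()) {
            sender.interrupt();                   // error recovery and cleanup
            sender.join();
            return false;
        }
        return failure == null;
    }

    /** In-memory demo: a ByteArrayOutputStream stands in for the remote server. */
    static byte[] demo(byte[] input) {
        ByteArrayOutputStream remote = new ByteArrayOutputStream();
        Relay relay = new Relay(remote, 4);
        try {
            relay.pump(new ByteArrayInputStream(input), 1024);
            if (!relay.drainWithTimeout(5_000)) throw new IllegalStateException("drain failed");
        } catch (IOException | InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return remote.toByteArray();
    }
}
```

In a doPost() method, pump() would be called with request.getInputStream(), and drainWithTimeout() would go in a finally block so that the extra thread is always reaped before the method returns.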

The correct way to copy between NIO channels is as follows:
while (in.read(buffer) > 0 || buffer.position() > 0)
{
buffer.flip();
out.write(buffer);
buffer.compact();
}
Note that this automatically takes care of EOS, partial reads, and partial writes.
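For reference, here is the same loop wrapped in a self-contained sketch. The class name ChannelCopy and the copyBytes() helper are hypothetical and exist only to exercise the loop in memory; blocking channels are assumed, as in the servlet case from the question.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Channels;
import java.nio.channels.ReadableByteChannel;
import java.nio.channels.WritableByteChannel;

class ChannelCopy {
    /** Copies everything from in to out and returns the number of bytes written. */
    static long copy(ReadableByteChannel in, WritableByteChannel out, ByteBuffer buffer)
            throws IOException {
        long total = 0;
        // Continue while the source still yields data, or while the buffer
        // still holds bytes left behind by a partial write.
        while (in.read(buffer) > 0 || buffer.position() > 0) {
            buffer.flip();                 // switch from filling to draining
            total += out.write(buffer);    // may write fewer bytes than remaining()
            buffer.compact();              // keep unwritten bytes, make room for more
        }
        return total;
    }

    /** Hypothetical in-memory helper, only here to exercise copy(). */
    static byte[] copyBytes(byte[] input, int bufferSize) {
        ByteArrayOutputStream sink = new ByteArrayOutputStream();
        try (ReadableByteChannel in = Channels.newChannel(new ByteArrayInputStream(input));
             WritableByteChannel out = Channels.newChannel(sink)) {
            copy(in, out, ByteBuffer.allocate(bufferSize));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sink.toByteArray();
    }
}
```

In the servlet from the question, in would be the channel obtained from request.getInputStream() and out the destination FileChannel; the buffer size only affects throughput, not correctness.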

Related

Non blocking streaming on Flink

Hi, I'm trying to run a Flink job that should process incoming data as shown below. In the process operator right after keyBy(), there can be a case that takes a long time, depending on some property of the data. Even though the incoming data have different ids (which are used to keyBy() the stream), the long-running code in the process function blocks other incoming data, that is, the entire stream.
SingleOutputStreamOperator<Envelope> processingStream = deviceStream
.map(e -> (Envelope) e)
.keyBy((KeySelector<Envelope, String>) value -> value.eventId) // key by scenarios
.process(new RuleProcessFunction());
In RuleProcessFunction.java:
...
@Override
public void processElement(Envelope value, Context ctx, Collector<Envelope> out) throws Exception {
//handleEvent(value, ctx, out);
if (value.getEventId().equals("I")) {
System.out.println("hello i");
for (long i = 0; i < 10000000000L; i++) {
}
}
out.collect(value);
}
I expect that the long-running code block should not block the entire stream. I know there is AsyncFunction for blocking I/O situations, but I don't know whether it's the correct solution for this.
Since you aren't pulling data from an external database like Cassandra, I don't think you need to use an AsyncFunction.
It could be that you are running the Flink job with a parallelism of 1. Try increasing the parallelism so that one core isn't responsible for all of the processing as well as receiving data. Granted, there can still be back pressure if you do this: if the core responsible for ingesting data from the source reads data faster than the core(s) running the process function can handle it, Flink's back pressure handling will slow the rate of ingestion.
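The effect of parallelism can be illustrated with a toy model in plain Java. This is not Flink code, and Flink does not assign keys to subtasks by a simple hashCode modulo; the class KeyedSubtasks and its methods are hypothetical. Each "subtask" is a single thread that processes the events routed to it one after another, so a slow event only holds up events that land on the same subtask.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

/** Toy model only: this is not Flink code and not how Flink assigns keys. */
class KeyedSubtasks {
    private final ExecutorService[] subtasks;

    KeyedSubtasks(int parallelism) {
        subtasks = new ExecutorService[parallelism];
        for (int i = 0; i < parallelism; i++) {
            subtasks[i] = Executors.newSingleThreadExecutor();   // one thread per subtask
        }
    }

    /** Routes work to the subtask that owns the key, similar in spirit to keyBy(). */
    Future<?> process(String key, Runnable work) {
        return subtasks[Math.floorMod(key.hashCode(), subtasks.length)].submit(work);
    }

    void shutdown() {
        for (ExecutorService s : subtasks) s.shutdown();
    }

    /** True if an event for fastKey completes while slowKey is still being processed. */
    static boolean fastKeyGetsThrough(int parallelism, String slowKey, String fastKey) {
        KeyedSubtasks job = new KeyedSubtasks(parallelism);
        CountDownLatch slowMayFinish = new CountDownLatch(1);
        try {
            job.process(slowKey, () -> {
                try {
                    slowMayFinish.await();          // stands in for the long loop
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            Future<?> fast = job.process(fastKey, () -> { });
            fast.get(1, TimeUnit.SECONDS);
            return true;
        } catch (TimeoutException e) {
            return false;                           // stuck behind the slow event
        } catch (Exception e) {
            throw new IllegalStateException(e);
        } finally {
            slowMayFinish.countDown();              // let the slow event finish
            job.shutdown();
        }
    }
}
```

With a parallelism of 1, fastKeyGetsThrough(1, "I", "J") reports that the fast event is stuck; with a parallelism of 2 the two keys happen to land on different subtasks in this model and the fast event gets through. In Flink, two keys can still share a subtask, so a higher parallelism reduces this kind of blocking but does not rule it out.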

How to implement multithreading in Libsoup server?

I want to implement multithreading in Libsoup server such that every time when a client request comes, a new thread will be created to serve that request.
How can I implement this using the Libsoup and GLib libraries?
My current server main code is like this:
sending_file = fopen("abc/project_foo.zip", "r");
fseek(sending_file, 0L, SEEK_END);
size_of_file = ftell(sending_file);
fseek(sending_file, 0L, SEEK_SET);
int port = 15000;
server = soup_server_new(SOUP_SERVER_RAW_PATHS,TRUE,SOUP_SERVER_PORT,port, SOUP_SERVER_SERVER_HEADER,"simple-httpd",NULL);
soup_server_add_handler(server , "/foo" , server_callback, NULL , NULL);
soup_server_run_async (server);
printf("Waiting for Requests...\n");
//Running a main loop so Async will work
GMainLoop *loop;
loop = g_main_loop_new (NULL, TRUE);
g_main_loop_run (loop);
return 0;
Create a new thread in the callback you pass to soup_server_add_handler. The manual explains the rest; the relevant part:
By default, libsoup assumes that you have completely finished processing the message when you return from the callback, and that it can therefore begin sending the response. If you are not ready to send a response immediately (eg, you have to contact another server, or wait for data from a database), you must call soup_server_pause_message on the message before returning from the callback. This will delay sending a response until you call soup_server_unpause_message. (You must also connect to the finished signal on the message in this case, so that you can break off processing if the client unexpectedly disconnects before you start sending the data.)
So make sure you call soup_server_pause_message in the callback you pass to soup_server_add_handler, then when you're done processing the request in your thread call soup_server_unpause_message.
Instead of creating a new thread for each request, you might want to think about using a thread pool, but the idea is pretty much the same: just add a task to the pool instead of creating a new thread.

Polling consumer thread stopping automatically

I have a scenario where I am using a consumer template to receive a file from an endpoint. The endpoint could be either the file system or an FTP site. Currently I am using only the file system, with the following endpoint URL:
file://D:/metamodel/Seach.json?noop=true&idempotent=false
On every hit to following code:
Exchange exchange = consumerTemplate.receive(endPointURI, timeout);
if (exchange != null) {
String body = exchange.getIn().getBody(String.class);
consumerTemplate.doneUoW(exchange);
return body;
}
It creates a new Camel context thread, and after some hits it gives the following error:
java.util.concurrent.RejectedExecutionException: PollingConsumer on Endpoint[file://D:/metamodel/Seach.json?noop=true&idempotent=false] is not started, but in state:Stopped
I am not sure why this is happening, and it's sporadic in nature.
Any suggestion on this would be a great help.

Stop Camel after too many retries

I am trying to implement more advanced Apache Camel error handling:
if there are too many pending retries, stop processing altogether and log all collected exceptions somewhere.
The first part (stop on too many retries) is already implemented by the following helper method, which gets the size of the retry queue; I just stop the context if the queue is over some limit:
static Long getToRetryTaskCount(CamelContext context) {
Long retryTaskCount = null;
ScheduledExecutorService errorHandlerExecutor = context.getErrorHandlerExecutorService();
if (errorHandlerExecutor instanceof SizedScheduledExecutorService)
{
SizedScheduledExecutorService svc = (SizedScheduledExecutorService) errorHandlerExecutor;
ScheduledThreadPoolExecutor executor = svc.getScheduledThreadPoolExecutor();
BlockingQueue<Runnable> queue = executor.getQueue();
retryTaskCount = (long) queue.size();
}
return retryTaskCount;
}
But this code smells to me and I don't like it. I also don't see any way here to collect the exceptions that caused all these retries.
There is also a new control bus component in Camel 2.11 which could do what you want (source)
template.sendBody("controlbus:route?routeId=foo&action=stop", null);
I wouldn't try to shut down the CamelContext, just the route in question. That way the rest of your app can still function, and you can get route stats, view/move messages to alternate queues, etc.
see https://camel.apache.org/how-can-i-stop-a-route-from-a-route.html

Download File (using Thread class)

OK, I understand that this may be a very stupid question, but I have never done it before, so I'm asking. How can I download a file (let's say, from the internet) using the Thread class?
What do you mean by "using Thread class"? I guess you want to download a file in a separate thread so it does not block your UI or some other part of your program.
I'll assume that you're using C++ and WINAPI.
First create a thread. This tutorial provides good information about WIN32 threads.
This thread will be responsible for downloading the file. To do this, you simply connect to the web server on port 80 and send an HTTP GET request for the file you want. It could look similar to this (note the newline characters):
GET /path/to/your/file.jpg HTTP/1.1\r\n
Host: www.host.com\r\n
Connection: close\r\n
\r\n
The server will then answer with an HTTP response containing the file, preceded by a header. Parse this header and read the contents.
More information on HTTP can be found here.
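The steps above (build the request, send it, split the header from the contents) could look roughly like this in Java. This is a minimal sketch, not a full HTTP client: the class and method names are hypothetical, and it assumes plain HTTP on port 80 and a server that closes the connection after an unchunked response. It ignores status codes, redirects and chunked transfer encoding.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class RawHttpDownload {
    /** Builds the GET request; the header section ends with one empty line. */
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.1\r\n"
             + "Host: " + host + "\r\n"
             + "Connection: close\r\n"
             + "\r\n";
    }

    /** Returns the bytes after the header/body separator, or null if it is absent. */
    static byte[] extractBody(byte[] response) {
        for (int i = 0; i + 3 < response.length; i++) {
            if (response[i] == '\r' && response[i + 1] == '\n'
                    && response[i + 2] == '\r' && response[i + 3] == '\n') {
                return Arrays.copyOfRange(response, i + 4, response.length);
            }
        }
        return null;
    }

    /** Blocking download over a plain socket; meant to run on a worker thread. */
    static byte[] download(String host, String path) throws IOException {
        try (Socket socket = new Socket(host, 80)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            ByteArrayOutputStream response = new ByteArrayOutputStream();
            InputStream in = socket.getInputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {   // "Connection: close" ends the stream
                response.write(buf, 0, n);
            }
            return extractBody(response.toByteArray());
        }
    }
}
```

To keep the caller responsive, download() would be invoked from the run() method of a new Thread (or from a task submitted to an executor), and the result handed back to the rest of the program from there.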
I would suggest that you do not use threads for downloading files. It's better to use asynchronous constructs that are more targeted towards I/O, since they incur a lower overhead than threads. I don't know what version of the .NET Framework you are working with, but in 4.5, something like this should work:
private static Task DownloadFileAsync(string uri, string localPath)
{
// Get the http request
HttpWebRequest webRequest = WebRequest.CreateHttp(uri);
// Get the http response asynchronously
return webRequest.GetResponseAsync()
.ContinueWith(task =>
{
// When the GetResponseAsync task is finished, we will come
// into this continuation (which is an anonymous method).
// Check if the GetResponseAsync task failed.
if (task.IsFaulted)
{
Console.WriteLine(task.Exception);
return null;
}
// Get the web response.
WebResponse response = task.Result;
// Open a file stream for the local file.
FileStream localStream = File.OpenWrite(localPath);
// Copy the contents from the response stream to the
// local file stream asynchronously.
return response.GetResponseStream().CopyToAsync(localStream)
.ContinueWith(streamTask =>
{
// When the CopyToAsync task is finished, we come
// to this continuation (which is also an anonymous
// method).
// Flush and dispose the local file stream. There
// is a FlushAsync method that will flush
// asynchronously, returning yet another task, but
// for the sake of brevity I use the synchronous
// method here.
localStream.Flush();
localStream.Dispose();
// Don't forget to check if the previous task
// failed or not.
// All Task exceptions must be observed.
if (streamTask.IsFaulted)
{
Console.WriteLine(streamTask.Exception);
}
});
// since we end up with a task returning a task we should
// call Unwrap to return a single task representing the
// entire operation
}).Unwrap();
}
You would want to elaborate a bit on the error handling. What this code does is in short:
See the code comments for more detailed explanations of how it works.

Resources