Are there any good polling strategies to use with Apache Camel? - apache-camel

I'm new to Apache Camel and currently reading Camel in Action. I'm building a system that will receive JSON messages via RESTful web services, and I plan to turn them into ASCII files and transmit them to another system via SFTP.
The problem is the response file will take over 10 minutes to return per request so I need to come up with a polling/monitoring strategy that will keep track of the request state.
Can anyone point me in the right direction if there is a specific EIP that handles this sort of problem?

Related

How to handle long requests on the frontend?

My application allows a user to enter a URL of an article he/she wishes to analyze. It goes through our API gateway to reach the correct services engaged in this process. The analysis takes between 5 and 30 seconds depending on the article's word count.
For now, my reactjs client sends the request to the API and waits 5 to 30 seconds for the response. Is there a better way to handle this, such as enqueuing the job and letting the API ping the client (reactjs frontend) once it is done?
Server-sent Events (SSEs) allow your server to push new information to your browser, and hence look ideal to me for this purpose. They work over HTTP and are well supported in all browsers except IE.
So the new process could look as follows:
The client sends a request to the server, which initiates the lookup and potentially responds with the topic the browser needs to subscribe to (in case that's unique per lookup).
The server does its thing and sends updates as it processes new content. The beauty of this is that you can inform your client about partial updates.
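As a rough sketch of what the server writes in the second step: each Server-Sent Event is a small text frame with an optional "event:" line, one "data:" line per payload line, and a blank line as terminator. The class name, event names, and JSON payloads below are made up for illustration, and the code only formats frames; how they are written to the HTTP response depends on your server framework.

```java
public class SseFrames {
    /**
     * Formats one Server-Sent Event frame.
     * eventName may be null, in which case the "event:" line is omitted.
     */
    public static String frame(String eventName, String data) {
        StringBuilder sb = new StringBuilder();
        if (eventName != null && !eventName.isEmpty()) {
            sb.append("event: ").append(eventName).append('\n');
        }
        // A multi-line payload becomes several "data:" lines.
        for (String line : data.split("\n", -1)) {
            sb.append("data: ").append(line).append('\n');
        }
        sb.append('\n'); // blank line ends the frame
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical partial update followed by the final result.
        System.out.print(frame("progress", "{\"done\": 40}"));
        System.out.print(frame("result", "{\"score\": 0.87}"));
    }
}
```

The response carrying these frames would be served with the content type text/event-stream and kept open until the analysis finishes.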
If SSEs are not an option for you, you could leverage good old WebSockets for bi-directional communication, but for such a simple endeavor, it might be too much technology to solve the problem.
A third alternative, especially if you are talking amongst services (no web or mobile clients on the other side), is to use webhooks: the interested party exposes and listens on a specific endpoint, and the publisher (the server that does the processing) writes updates to it.
Hope this is useful.

Angularjs 1 - one request, multiple responses

I have a page with multiple widgets, each receiving data from a different query in the backend. Doing a request for each will hit the limit the browser puts on the number of parallel connections and will serialize some of them. On the other hand, doing one request that returns one response means it will be as slow as the slowest query (I have no a priori knowledge about which query will be slowest).
So I want to create one request such that the backend runs the queries in parallel and writes each result as it is ready, and the frontend handles each result as it arrives. At the HTTP level I believe it can be just one body with several JSON documents, or maybe a multipart response.
Is there an AngularJS extension that handles the frontend side of things? Optimally something that works well with whatever can be done in the Java backend (I haven't started investigating my options there).
I have another suggestion to solve your problem, but I am not sure you would be able to implement such a thing, as from your question it is not very clear what you can or cannot do.
You could implement WebSockets and the server would be able to notify the front-end about the data being fetched or it could send the data via WebSockets right away.
In the first example, you would send a request to the server to fetch all the data for your dashboard. Once a piece of data is available, you could make a request for that particular piece, and given that the data was fetched a couple of seconds ago, it could be cached on the server and the response would be fast.
The second approach seems a more reasonable one. You would make an HTTP/WebSocket request to the server and wait for the data to arrive over WebSocket.
I believe this would be the most robust and efficient way to implement what you are asking for.
https://github.com/dfltr/jQuery-MXHR
This plugin allows you to parse a response that contains several parts (multipart) by having a callback to parse each part. This can be used in all our frontends to support responses for multiple data (widgets) in one request. The server side will receive one request and use servlet 3 async support (or whatever exists in other languages) to ‘park’ it, sending multiple queries and writing each response to the request as each query returns (with the right multipart boundary).
Another example can be found here: https://github.com/anentropic/stream.
While both of these may not be compatible with angularjs, the code does not seem complex to port there.
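On the server side, the "write each result as its query returns" part could be sketched with the JDK's ExecutorCompletionService, which hands back results in completion order rather than submission order. This is only an illustration under assumptions: the widget names and JSON strings are invented, the sink stands in for whatever writes a part to the parked response, and the multipart boundary handling is left out.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletionService;
import java.util.concurrent.ExecutorCompletionService;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.function.Consumer;

public class StreamedQueries {
    /** Runs all queries in parallel and passes each result to the sink as soon as it completes. */
    public static void runAll(Map<String, Callable<String>> queries, Consumer<String> sink) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, queries.size()));
        try {
            CompletionService<String> done = new ExecutorCompletionService<>(pool);
            for (Map.Entry<String, Callable<String>> q : queries.entrySet()) {
                String widget = q.getKey();
                Callable<String> query = q.getValue();
                // Tag each result with its widget name so the client can route it.
                done.submit(() -> widget + "=" + query.call());
            }
            for (int i = 0; i < queries.size(); i++) {
                // take() blocks until the next query (whichever it is) has finished.
                sink.accept(done.take().get());
            }
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical widgets: one slow query and one fast query.
        Map<String, Callable<String>> queries = new LinkedHashMap<>();
        queries.put("slow", () -> { Thread.sleep(300); return "{\"rows\": 3}"; });
        queries.put("fast", () -> "{\"rows\": 1}");
        runAll(queries, part -> System.out.println("part: " + part));
    }
}
```

In a real servlet the sink would write one part plus the boundary and flush, so the browser sees the fast widget's data without waiting for the slow one.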

Throttling FTP Polling consumers using apache camel

I have a requirement where, at a single point in time, I need to connect to multiple ftp/sftp endpoints (say 100 ftp endpoints) to download files and process them.
I have a route like the one below. The SEDA queue further processes the messages by moving them into appropriate folders.
from("ftp://username@host/foldername?password=XXXXX&include=.*").to("seda:" + routeId)
Starting all the FTP endpoints at the same time results in JVM memory issues. How could I throttle the starting of the ftp endpoints? Can I use a SEDA before the ftp to throttle (and if so, how)? Are there any other EIPs or ideas I could use to throttle the triggering of the polling ftp consumers?
You can look into the Throttler DSL if you want to throttle the fetching of the messages.
http://camel.apache.org/throttler.html
For controlling the startup you can look into the SimpleScheduledRoutePolicy.
http://camel.apache.org/simplescheduledroutepolicy.html
It handles activating and deactivating routes. I haven't used it myself, but it looks like you can add a controlled delay on when routes should start and stop.
I have had this problem in the past and solved it using cron in the following way:
from("ftp://username@host/foldername?password=XXXXX&include=.*&scheduler=quartz2&scheduler.cron=0/2+*+*+*+*+?")
You can set up every FTP consumer to pull at different times (say with one minute difference).
If you decide to go down this path, you can use the following website to construct your cron expressions easily:
http://www.cronmaker.com/
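To give an idea of how the staggering could be generated rather than written by hand, here is a small sketch that derives a Quartz cron expression per consumer, offset by one minute each. The helper name and the 10-minute period are assumptions for illustration. Spaces are written as '+' so the value can be appended to the scheduler.cron option as in the URI above.

```java
public class StaggeredCron {
    /**
     * Builds a Quartz cron expression that fires at second 0, starting at a
     * minute offset derived from the consumer index and repeating every
     * periodMinutes minutes. Spaces are encoded as '+' for use in an endpoint URI.
     */
    public static String cronFor(int index, int periodMinutes) {
        if (index < 0 || periodMinutes < 1 || periodMinutes > 59) {
            throw new IllegalArgumentException("index must be >= 0 and period within 1..59");
        }
        int offset = index % periodMinutes; // consumers wrap around once the period is used up
        return "0+" + offset + "/" + periodMinutes + "+*+*+*+?";
    }

    public static void main(String[] args) {
        // Hypothetical setup: 100 consumers, each polling every 10 minutes.
        for (int i = 0; i < 100; i++) {
            System.out.println("consumer " + i + ": scheduler.cron=" + cronFor(i, 10));
        }
    }
}
```

With 100 consumers and a 10-minute period, ten consumers still share each minute slot, so the period (or a seconds offset) would need tuning to the memory available.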
Hope this helps.
R.

Apache Camel: Test if endpoints are up

Does Camel provide anything out of the box that tells whether it is able to connect to all endpoints?
These endpoints could be MQ, web services, etc.
If not, then I have to write a servlet which will send a test request to all the endpoints. I will be using multicast or splitter for this implementation.
From my experience, Camel will only log warnings if a from() endpoint is unavailable, since it is constantly trying to read from it. Every other endpoint won't be accessed until an exchange tries to use it. If your goal is to test whether various resources are alive, I believe you would need to write your own testing program. I don't think this will be implemented as a feature, because applications typically build in error handling for when a resource is down and define appropriate behaviors.
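A home-grown check of the kind described here could start as simply as attempting a TCP connection to each endpoint's host and port. The sketch below uses only the JDK; the class name, host, and port are placeholders, and a successful connect only shows that something is listening, not that the MQ broker or web service behind it is healthy.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class EndpointProbe {
    /** Returns true if a TCP connection to host:port can be opened within the timeout. */
    public static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            // Refused, timed out, or unresolved host: treat all as "not reachable".
            return false;
        }
    }

    public static void main(String[] args) {
        // Placeholder endpoint; replace with the hosts and ports your routes use.
        System.out.println("localhost:61616 reachable? " + isReachable("localhost", 61616, 1000));
    }
}
```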
If we're talking about producers, then no. If your route is sending messages to an amq or http4 endpoint, for instance, Camel will not automatically send TCP packets on these connections for monitoring purposes. A common way to handle failure of external endpoints is by using "circuit breakers". Take a look at https://camel.apache.org/load-balancer.html. A more robust alternative, imho, is Netflix's Hystrix.
If you have a polling consumer, say a from:ftp://.. then the polling consumer will poll for messages every n milliseconds, and you'll get an error if the connection is broken.
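For readers unfamiliar with the circuit-breaker idea mentioned in this answer, here is a minimal hand-rolled sketch in plain Java. It is not Camel's or Hystrix's implementation; the class name, threshold, and timing parameters are assumptions chosen for illustration. After a number of consecutive failures the breaker "opens" and rejects calls immediately, then allows a trial call once the open period has elapsed.

```java
import java.util.concurrent.Callable;
import java.util.function.LongSupplier;

public class SimpleCircuitBreaker {
    private final int failureThreshold;
    private final long openMillis;
    private final LongSupplier clock; // injected so the timing can be tested
    private int consecutiveFailures = 0;
    private boolean tripped = false;
    private long trippedAt = 0;

    public SimpleCircuitBreaker(int failureThreshold, long openMillis, LongSupplier clock) {
        this.failureThreshold = failureThreshold;
        this.openMillis = openMillis;
        this.clock = clock;
    }

    /** True while calls are being rejected without reaching the endpoint. */
    public synchronized boolean isOpen() {
        return tripped && clock.getAsLong() - trippedAt < openMillis;
    }

    public synchronized <T> T call(Callable<T> action) throws Exception {
        if (isOpen()) {
            throw new IllegalStateException("circuit open, call rejected");
        }
        try {
            T result = action.call(); // normal call, or trial call after the open period
            consecutiveFailures = 0;
            tripped = false;
            return result;
        } catch (Exception e) {
            consecutiveFailures++;
            if (consecutiveFailures >= failureThreshold) {
                tripped = true;
                trippedAt = clock.getAsLong();
            }
            throw e;
        }
    }
}
```

A caller would wrap each send to the external endpoint in call(...), for example with new SimpleCircuitBreaker(3, 30000, System::currentTimeMillis), and treat the IllegalStateException as "endpoint currently considered down".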

Compression when using the Channel API in Google App Engine

In this FAQ question it says that compression is used automatically when the browser supports it and that I don't need to modify my application in any way.
My question is, does that apply to Channel API messages too?
I have an application that needs to send relatively large JSON (text) data through a persistent connection and I'm hoping I can get things through faster if they are compressed.
If not, I can think of a workaround to have the server send just a ping through the channel when a big load comes through and have the browser then make a GET request to fetch it (and that would "automatically" compress it), but that would add the latency of another request.
Data sent over the connection the Channel API uses is gzip compressed.
However, Channel API messages are limited to 32K uncompressed, so for anything bigger than that you'll need to use the ping/GET method anyway.
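The size decision could look roughly like the sketch below: send the JSON directly when it fits, otherwise send a small ping that tells the browser where to fetch it. This is not Channel API code. The class and method names, the ping format, and the fetch URL are invented for illustration, and "32K" is assumed here to mean 32 * 1024 bytes of UTF-8.

```java
import java.nio.charset.StandardCharsets;

public class ChannelPayload {
    // Assumption: the "32K uncompressed" limit is read as 32 KiB of UTF-8 bytes.
    static final int MAX_CHANNEL_BYTES = 32 * 1024;

    /** True if the JSON text is small enough to be sent as a single channel message. */
    public static boolean fitsInChannel(String json) {
        return json.getBytes(StandardCharsets.UTF_8).length <= MAX_CHANNEL_BYTES;
    }

    /**
     * Returns the message to push over the channel: the payload itself when it
     * fits, otherwise a small ping (hypothetical format) pointing at a URL the
     * browser should GET.
     */
    public static String messageFor(String json, String fetchUrl) {
        return fitsInChannel(json) ? json : "{\"fetch\":\"" + fetchUrl + "\"}";
    }

    public static void main(String[] args) {
        System.out.println(messageFor("{\"small\":true}", "/data/123"));
    }
}
```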
