I am researching ActiveMQ, in particular integrating it into my Java app using Camel.
Our architecture involves queuing jobs across multiple multithreaded VMs. In particular, I need two kinds of rate limits:
per VM per time period (all threads combined)
across all VMs per time period
Is there a way to specify these in Camel, or are all rate limits implemented on a per-consumer basis?
With the help of the Throttler EIP, I think you can set up a rate limit per route. But since exchange information is not shared across Camel contexts, I don't think it can work across all VMs.
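For instance, a per-route (i.e. per-VM) limit might look like this; a minimal sketch, assuming an ActiveMQ endpoint and made-up names:

    import org.apache.camel.builder.RouteBuilder;

    public class ThrottledJobRoute extends RouteBuilder {
        @Override
        public void configure() {
            // Allow at most 100 exchanges per 10-second window on this
            // route, i.e. per VM across all of its consumer threads.
            from("activemq:queue:jobs")
                .throttle(100).timePeriodMillis(10000)
                .to("bean:jobProcessor");
        }
    }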
The only way I can imagine implementing this so that it meets the requirements is to have another process that monitors the queue. At a regular interval, it would send a message to a topic saying whether the rate limit for that period had been exceeded. Any process that consumes from the queue would also subscribe to the topic, and when it received a message saying the allotted number had been processed, it would shut its consuming route down.
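A sketch of what the consuming side could look like, using Camel's control bus component (the topic name, route id, and message body convention are all made up; the monitor process publishing the verdicts is not shown):

    import org.apache.camel.builder.RouteBuilder;

    public class RateLimitedWorkerRoutes extends RouteBuilder {
        @Override
        public void configure() {
            // The worker route that we pause when the global limit is hit.
            from("activemq:queue:jobs").routeId("jobs")
                .to("bean:jobProcessor");

            // Every process subscribes to the monitor's topic and stops or
            // restarts its own "jobs" route based on the verdict.
            from("activemq:topic:rateLimitVerdict")
                .choice()
                    .when(body().isEqualTo("EXCEEDED"))
                        .to("controlbus:route?routeId=jobs&action=stop")
                    .otherwise()
                        .to("controlbus:route?routeId=jobs&action=start");
        }
    }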
We have the following set up in our project: Two applications are communicating via a GCP Pub/Sub message queue. The first application produces messages that trigger executions (jobs) in the second (i.e. the first is the controller, and the second is the worker). However, the execution time of these jobs can vary drastically. For example, one could take up to 6 hours, and another could finish in less than a minute. Currently, the worker picks up the messages, starts a job for each one, and acknowledges the messages after their jobs are done (which could be after several hours).
Now getting to the problem: The worker application runs on multiple instances, but sometimes we see very uneven message distribution across the different instances. Consider the following graph, for example:
It shows the number of messages processed by each worker instance at any given time. You can see that some are hitting the maximum of 15 (configured via the spring.cloud.gcp.pubsub.subscriber.executor-threads property) while others are idling at 1 or 2. At this point, we also start seeing messages without any started jobs (awaiting execution). We assume that these were pulled by the GCP Pub/Sub client in the busy instances but cannot yet be processed due to a lack of executor threads. The threads are busy because they're processing heavier and more time-consuming jobs.
Finally, the question: Is there any way to do backpressure (i.e. tell GCP Pub/Sub that an instance is busy and have it re-distribute the messages to a different one)? I looked into this article, but as far as I understood, the setMaxOutstandingElementCount method wouldn't help us because it would control how many messages the instance stores in its memory. They would, however, still be "assigned" to this instance/subscriber and would probably not get re-distributed to a different one. Is that correct, or did I misunderstand?
We want to utilize the worker instances optimally and have messages processed as quickly as possible. In theory, we could try to split up the more expensive jobs into several smaller messages, thus minimizing the differences in processing time, but is this the only option?
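For context, here's roughly the flow-control setup we were looking at with the plain Java client (a sketch; project, subscription, and limits are made up). Spring Cloud GCP appears to expose the same limit via the spring.cloud.gcp.pubsub.subscriber.flow-control.max-outstanding-element-count property:

    import com.google.api.gax.batching.FlowControlSettings;
    import com.google.cloud.pubsub.v1.Subscriber;
    import com.google.pubsub.v1.ProjectSubscriptionName;
    import com.google.pubsub.v1.PubsubMessage;

    public class WorkerSubscriber {
        public static void main(String[] args) {
            ProjectSubscriptionName subscription =
                ProjectSubscriptionName.of("my-project", "jobs-sub");

            // Cap how many un-acked messages this instance may hold at once.
            // Setting it near the executor thread count limits how many
            // messages sit "assigned but waiting" on a busy instance, but it
            // does not re-assign already-pulled messages to other instances.
            FlowControlSettings flowControl = FlowControlSettings.newBuilder()
                .setMaxOutstandingElementCount(15L)
                .build();

            Subscriber subscriber = Subscriber.newBuilder(subscription, (message, consumer) -> {
                    runJob(message);  // potentially hours of work
                    consumer.ack();   // ack only once the job is done
                })
                .setFlowControlSettings(flowControl)
                .build();

            subscriber.startAsync().awaitRunning();
        }

        private static void runJob(PubsubMessage message) {
            // placeholder for the actual job execution
        }
    }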
Does anyone have performance data for Amazon's new Mobile Push service?
We are looking at using it, but want to understand performance for:
How many requests per second it can handle
Latency, in seconds, for delivering a notification to a device
How long it takes to send an identical notification to a million users (using topics)
Since Amazon doesn't publish performance numbers, and because creating synthetic tests for mobile push is difficult, I was wondering if anyone had real-world data.
We've sent a message to around 300,000 devices, and it was delivered almost instantaneously. Obviously we do not have access to each of those devices, but judging by a sampling of devices subscribed to various topics at different times, all received the message within 10 seconds of the actual send.
A single publish to a device from the AWS console is startlingly fast. It appears on your device at almost the same instant that you release the "Publish" button on the AWS console.
While the delay in the AWS delivery infrastructure is nominal, and will surely be driven to near zero as they improve and add to their infrastructure, the time between the user action that generates the message in your system and the moment AWS actually receives the "send this notification" request will likely be the larger portion of the end-to-end delay. The limit per topic is 10,000 devices, so if you are sending to a million users, you'll have 100 (or more) topics to publish to. The time it takes your software to publish to all of these topics depends on how much parallelism you support in the operation. It takes somewhere around 50-100 ms to publish to a topic, so if you do this serially, it may be up to 10 seconds before you even publish your message to the 100th topic.
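To put numbers on that: publishing to the ~100 topics concurrently instead brings the wall-clock time close to a single 50-100 ms call. A minimal sketch with the AWS SDK for Java (topic ARNs and pool size are placeholders):

    import com.amazonaws.services.sns.AmazonSNS;
    import com.amazonaws.services.sns.AmazonSNSClientBuilder;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    public class FanoutPublisher {
        public static void publishToAll(List<String> topicArns, String message)
                throws InterruptedException {
            AmazonSNS sns = AmazonSNSClientBuilder.defaultClient();
            // 100 serial publishes at ~100 ms each cost up to 10 s; a pool
            // of 20 threads cuts that to roughly half a second.
            ExecutorService pool = Executors.newFixedThreadPool(20);
            for (String arn : topicArns) {
                pool.submit(() -> sns.publish(arn, message));
            }
            pool.shutdown();
            pool.awaitTermination(30, TimeUnit.SECONDS);
        }
    }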
UPDATE: As of August 19, 2014, the limit on the number of subscribers you can have per topic has been raised to 10,000,000:
https://aws.amazon.com/blogs/aws/sns-large-topics-and-mpns-auth-mode/
I'm currently developing a Camel Integration app in which resumption from a previous state of processing is important. When there's a power outage, for instance, it's important that all previously processed messages are not re-processed. The processing should resume from where it left off before the outage.
I've gone through a number of possible solutions, including Terracotta and Apache Shiro, but I'm not sure how to use either, as documentation on integrating them with Apache Camel is scarce. I haven't settled on those two, however.
I'm looking for suggestions on the potential alternatives I can use or a pointer to some tutorial to get me started.
The difficulty in surviving outages lies primarily in state, and in what to do with in-flight messages.
Usually, when you're talking about state within routes, the solution is to flush it to disk, or to other nodes in the cluster. Taking the aggregator pattern as an example, aggregated state is persisted in an aggregation repository. The default implementation is in memory, so if the power goes out, all of that state is lost. However, there are other implementations, including one for JDBC, and another using Hazelcast (a lightweight in-memory data grid). I haven't used Hazelcast myself, but the JDBC one does a synchronous write to disk. The aggregator pattern then allows you to resume from where you left off. A similar solution exists for idempotent consumption.
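For illustration, wiring up the JDBC-backed repository might look like the sketch below (names, the completion rule, and the trivial aggregation strategy are made up; the repository's backing tables have to exist in the database):

    import javax.sql.DataSource;
    import org.apache.camel.builder.RouteBuilder;
    import org.apache.camel.processor.aggregate.jdbc.JdbcAggregationRepository;
    import org.springframework.jdbc.datasource.DataSourceTransactionManager;

    public class PersistentAggregationRoute extends RouteBuilder {
        private final DataSource dataSource;

        public PersistentAggregationRoute(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        @Override
        public void configure() {
            // State is committed to the database on every update, so the
            // aggregation resumes from the last committed state after a crash.
            JdbcAggregationRepository repo = new JdbcAggregationRepository(
                new DataSourceTransactionManager(dataSource), "job_aggregation", dataSource);

            from("activemq:queue:orders")
                .aggregate(header("orderId"), (oldEx, newEx) -> newEx) // keep latest; real logic would merge
                    .aggregationRepository(repo)
                    .completionSize(10)
                .to("bean:orderProcessor");
        }
    }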
The second question, around in-flight messages, is a little more complicated, and largely depends on where you are consuming from. If you're in the middle of handling a web service request and the power goes out, does it matter if you have lost the message? The user can simply retry. Any effects on external systems can be wrapped in a transaction, or handled via an idempotent consumer with a JDBC idempotent repository.
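And the idempotent-consumption analogue, again as a rough sketch (the header and repository name are invented; camel-sql provides the JDBC message-id repository):

    import javax.sql.DataSource;
    import org.apache.camel.builder.RouteBuilder;
    import org.apache.camel.processor.idempotent.jdbc.JdbcMessageIdRepository;

    public class IdempotentPaymentRoute extends RouteBuilder {
        private final DataSource dataSource;

        public IdempotentPaymentRoute(DataSource dataSource) {
            this.dataSource = dataSource;
        }

        @Override
        public void configure() {
            from("activemq:queue:payments")
                // Processed message ids are stored in the database, so a
                // message replayed after a crash is detected and skipped.
                .idempotentConsumer(header("paymentId"),
                    new JdbcMessageIdRepository(dataSource, "payments"))
                .to("bean:paymentProcessor");
        }
    }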
If you are building out integrations based on messaging, you should consume within a transaction, so that if your server goes down, the messages go back into the broker and can be replayed to another consumer.
Be careful when using seda: or threads blocks: these use in-memory queues to pass exchanges between threads, so any messages flowing down these sorts of routes will be lost if someone trips over the power cable. If you can't afford message loss and need this sort of processing model, consider using a JMS queue as the endpoint between the two routes (with transactions to ensure you pick up where you left off).
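As a sketch of that last point, with made-up endpoint names (the transacted=true option asks the JMS consumer to use a local transaction):

    import org.apache.camel.builder.RouteBuilder;

    public class DurableHandoffRoutes extends RouteBuilder {
        @Override
        public void configure() {
            // Stage 1: consume in a transaction and hand off through a
            // broker-backed queue instead of seda:.
            from("activemq:queue:incoming?transacted=true")
                .to("bean:stageOne")
                .to("activemq:queue:handoff");

            // Stage 2: also transacted, so if the VM dies mid-exchange the
            // message rolls back to the broker and is redelivered elsewhere.
            from("activemq:queue:handoff?transacted=true")
                .to("bean:stageTwo");
        }
    }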
Can I block on a Google AppEngine Pull Task Queue until a Task is available? Or, do I need to poll an empty queue until a task is available?
You need to poll the queue. A typical use case for pull queues is to have multiple backends, each obtaining one thousand tasks at a time.
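For reference, a polling worker in the Java Task Queue API might look like this sketch (queue name, lease duration, and batch size are arbitrary):

    import com.google.appengine.api.taskqueue.Queue;
    import com.google.appengine.api.taskqueue.QueueFactory;
    import com.google.appengine.api.taskqueue.TaskHandle;
    import java.util.List;
    import java.util.concurrent.TimeUnit;

    public class PullQueueWorker {
        public void drainOnce() {
            Queue queue = QueueFactory.getQueue("my-pull-queue");
            // Lease up to 1000 tasks for 60 seconds; an empty list means the
            // queue had nothing, and the caller simply polls again later.
            List<TaskHandle> tasks = queue.leaseTasks(60, TimeUnit.SECONDS, 1000);
            for (TaskHandle task : tasks) {
                handle(task);            // hypothetical task handler
                queue.deleteTask(task);  // delete only after successful processing
            }
        }

        private void handle(TaskHandle task) {
            // placeholder for actual task processing
        }
    }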
For use cases where there are no tasks in the queue for hours at a time, push queues can be a better fit.
Not 100% sure about your question, but thought to try an answer. Having a pull task queue kicked off by a cron job may apply; it saves the expense of running a backend. I have client-side log data that needs to be serialized and stored. The online handler simply passes the client data to a pull task queue. A cron job fires up the task handler every minute, and up to 10k log items get serialized and stored on each run. (Change the settings according to your loads; these more than meet my modest needs.) In this case, the queue acts as a buffer, and load spikes get spread out into even processing runs. Obviously this is not useful if you want quick access to the task queue data or have wildly unpredictable loads. Very importantly, batching the log serialization this way cuts data writes by a factor of 1,000. May not apply to your question, so I'll end with a big HTH. -stevep
We have a GAE application that notifies users of sports score changes, and we want to notify them as fast as possible. Currently we create one entry in a GAE queue for each notification, and this works fairly well, but lately it has struggled a bit when sending 2,000-3,000 notifications for a single score change.
What is the optimal queue configuration to send these events as fast as possible?
Currently we have:
<queue>
    <name>c2dm</name>
    <rate>10/s</rate>
</queue>
and in appengine-web.xml
<threadsafe>true</threadsafe>
You can increase the rate and bucket size to make the push queue execute tasks faster, and leave max-concurrent-requests unset (unlimited). This lets task throughput scale up to the maximum number of instances you have configured times the throughput of a single instance of your app. You can also raise your maximum instance count.
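For example, the definition above might be tuned along these lines (the numbers are purely illustrative, not recommendations):

    <queue>
        <name>c2dm</name>
        <rate>100/s</rate>
        <bucket-size>100</bucket-size>
        <!-- max-concurrent-requests left unset, i.e. unlimited -->
    </queue>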
Note that "as fast as possible" might not be ideal. You'll want to tweak these values based on burst patterns, instance warm-up costs, datastore contention error rates, etc., until you're seeing comfortable performance. If you need even more control, consider using pull queues, either with one or more App Engine backend instances or with a notifications service running on another platform that calls the pull queue REST API.