I am new to Mule and trying to understand how maxConcurrency works. I understand that it determines how many simultaneous requests can be received. But I have a question: if the application receives a huge number of incoming requests and I set maxConcurrency, will the subsequent requests be queued, will they error out, or will data be lost?
For example, say I am getting around 10,000 requests to my application and I set maxConcurrency to 1000. Will the other 9,000 be queued, or is there a chance of data loss?
Also, what would be the ideal maxConcurrency to set for a flow?
Thanks in advance.
I'm not sure which queue you are referring to. In Mule 4, requests are not queued. If a flow sets maxConcurrency, any concurrent request exceeding that number will fail with an error. Whether that causes data loss will depend on what your application is doing. maxConcurrency just limits the concurrency, without you needing to think in terms of the implementation or the number of threads.
If your application is reading from a queue (i.e. a JMS or VM queue), it is expected that it will not read more than maxConcurrency messages simultaneously, so the other messages should remain in the queue and not trigger an error.
Note that maxConcurrency doesn't guarantee that an application can actually process that many concurrent requests. If you set it to 1000, you have to ensure that your application has enough resources to process 1000 concurrent requests. The only way to size it is to perform load testing under conditions similar to the expected real usage and verify at what number you start getting errors due to insufficient resources.
A special case is when you want a specific number, for example a flow reading from a queue with maxConcurrency = 1 to process only one message at a time.
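As a rough sketch of that special case (the flow name, JMS config, and queue name below are placeholders, not from the question), in Mule 4 the limit is just an attribute on the flow:

    <flow name="ordersFlow" maxConcurrency="1">
        <jms:listener config-ref="JMS_Config" destination="ordersQueue"/>
        <!-- only one message is processed at a time -->
        <logger level="INFO" message="#[payload]"/>
    </flow>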
We have the following set up in our project: Two applications are communicating via a GCP Pub/Sub message queue. The first application produces messages that trigger executions (jobs) in the second (i.e. the first is the controller, and the second is the worker). However, the execution time of these jobs can vary drastically. For example, one could take up to 6 hours, and another could finish in less than a minute. Currently, the worker picks up the messages, starts a job for each one, and acknowledges the messages after their jobs are done (which could be after several hours).
Now getting to the problem: The worker application runs on multiple instances, but sometimes we see very uneven message distribution across the different instances. Consider the following graph, for example:
It shows the number of messages processed by each worker instance at any given time. You can see that some are hitting the maximum of 15 (configured via the spring.cloud.gcp.pubsub.subscriber.executor-threads property) while others are idling at 1 or 2. At this point, we also start seeing messages without any started jobs (awaiting execution). We assume that these were pulled by the GCP Pub/Sub client in the busy instances but cannot yet be processed due to a lack of executor threads. The threads are busy because they're processing heavier and more time-consuming jobs.
Finally, the question: Is there any way to do backpressure (i.e. tell GCP Pub/Sub that an instance is busy and have it re-distribute the messages to a different one)? I looked into this article, but as far as I understood, the setMaxOutstandingElementCount method wouldn't help us because it would control how many messages the instance stores in its memory. They would, however, still be "assigned" to this instance/subscriber and would probably not get re-distributed to a different one. Is that correct, or did I misunderstand?
We want to utilize the worker instances optimally and have messages processed as quickly as possible. In theory, we could try to split up the more expensive jobs into several different messages, thus minimizing the processing time differences, but is this the only option?
Why would there be any latency in App Engine in the middle of processing a request? This only happens occasionally and at random places in the request handling, adding a latency of around 3 or more seconds after the request starts being processed.
The usual suspect is your handler reaching out to some resource, either a GAE API (datastore, memcache, etc.), other GCP APIs/infrastructure (Cloud Storage, Machine Learning, BigQuery, etc.), or an external/3rd-party service/URL.
Most, if not all, such interactions can occasionally encounter peak response times way longer than average, for various possible reasons (or combinations of reasons), for example:
temporary outages of the service being accessed or of the networking layer providing connectivity to it
retries at networking or application layers due to communication errors/packet loss
service VMs/instances needing to be launched from scratch during (re)starts or even during scaling up
normal operating conditions that require more time, such as datastore transaction retries due to collisions
If the occurrence rate becomes unacceptable, an investigation is needed to identify which of these external accesses is responsible and under what conditions, and then maybe find some way to prevent them or reduce the impact of the occurrences.
Of course, there may be other reasons as well.
We are using the google-cloud-pubsub (0.24.0-beta) pull client to read messages from a subscription and are seeing a high rate of duplicates. The Google documentation says that a little duplication is expected, but in our case we are seeing about 80% of messages being duplicated even after acknowledgement.
The weirdest part is that even if we acknowledge the message immediately in the receiver using consumer.ack(), duplicates still occur.
Does anybody know how to handle this?
A large number of message duplicates could be the result of flow control settings being set too high or too low. If your flow control settings are too high, meaning you are allowing too many messages to be outstanding to your client at the same time, then it is possible that the acks are being sent too late. If this is the cause, you would probably see the CPU of your machine at or near 100%. In this case, try setting the max number of outstanding messages or bytes to a lower number.
It could also be the case that the flow control settings are set too low. Some messages get buffered in the client before they are delivered to your MessageReceiver, particularly if you are flow controlled. In this case, messages may spend too much time buffered in the client before they are delivered. There is an issue with messages in this state that is being fixed in an outstanding PR. In this scenario, you could either increase your max outstanding bytes or messages (up to whatever your subscriber can actually handle), or you can try setting setAckExpirationPadding to a value larger than the default of 500ms.
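For illustration, here is roughly where those flow control knobs live in the Java client (this sketch uses the current google-cloud-pubsub API, whose builder methods differ slightly from the 0.24.0-beta in the question; the project, subscription, and limits are placeholders):

    import com.google.api.gax.batching.FlowControlSettings;
    import com.google.cloud.pubsub.v1.MessageReceiver;
    import com.google.cloud.pubsub.v1.Subscriber;
    import com.google.pubsub.v1.ProjectSubscriptionName;

    public class FlowControlledSubscriber {
        public static void main(String[] args) {
            ProjectSubscriptionName subscription =
                    ProjectSubscriptionName.of("my-project", "my-subscription");

            // Cap how many messages/bytes may be outstanding to this client at once.
            FlowControlSettings flowControl = FlowControlSettings.newBuilder()
                    .setMaxOutstandingElementCount(100L)
                    .setMaxOutstandingRequestBytes(10L * 1024L * 1024L)
                    .build();

            MessageReceiver receiver = (message, consumer) -> {
                // ... process the message ...
                consumer.ack(); // ack as soon as processing succeeds
            };

            Subscriber subscriber = Subscriber.newBuilder(subscription, receiver)
                    .setFlowControlSettings(flowControl)
                    .build();
            subscriber.startAsync().awaitRunning();
        }
    }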
It is also worth checking your publisher to see if it is unexpectedly publishing messages multiple times. If that is the case, you may see the contents of your messages being the same, but they aren't duplicate messages being generated by Google Cloud Pub/Sub itself.
Edited to mention a bug that was in the client library:
If you were using a version of google-cloud-pubsub between v0.22.0 and v0.29.0, you might have been running into an issue where a change in the underlying mechanism for getting messages could result in excessive duplicates. The issue has since been fixed.
I'm trying to speed up a Google App Engine request handler that has a big datastore PutMulti call (500 entities) by splitting it into batches of entities and running concurrent goroutines to send smaller PutMulti calls (100 entities each).
Before this, I had often been getting the datastore error Call error 11: Deadline exceeded (timeout) from my PutMulti calls going over the deadline when I tested the handler on many concurrent requests. After the parallelization, the handler did speed up, but I still occasionally got that error and also another type of error, API error 5 (datastore_v3: TIMEOUT): The datastore operation timed out, or the data was temporarily unavailable.
Is this error 5 due to contention in the datastore, and what is the difference between errors 5 and 11?
These errors come from two different places. The first, the call error, is a local error caused by a timeout in the RPC client. It indicates that there was a timeout waiting for the completion of an RPC. The default RPC timeout in google.golang.org/appengine is 60 seconds.
The second error comes from the service side. This error indicates that a timeout occurred performing operations within datastore. Some of these operations have timeouts much shorter than 60s, and typically this may indicate contention.
A possibly simpler way to understand the difference: a single multi operation with a very large number of changes can trigger the first timeout with ease, while a significant number of concurrent operations against a single key or a small set of keys will more readily trigger the latter. As timeouts are general indicators of saturation of shared resources, there are of course many ways and combinations of ways to generate them. In general, you will want to retry operations as appropriate, size operations appropriately, and aggregate operations on hot keys as much as possible to reduce the chance of contention-related issues. As others have suggested, the Python and Java docs already have some examples of this.
You may wish to make use of https://godoc.org/google.golang.org/appengine#IsTimeoutError, and if you need to increase the timeout for the first error class, you may be able to adjust the context deadline; see the methods here: https://godoc.org/golang.org/x/net/context#WithDeadline. Note: you will not be able to extend the deadline beyond that of the request deadline; however, if you are running in tasks or VMs you can extend to longer deadlines.
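As a rough sketch of that combination (the handler wiring, 50-second deadline, key slice, and entity type below are illustrative placeholders, not from the question):

    package batchput

    import (
        "net/http"
        "time"

        "golang.org/x/net/context"
        "google.golang.org/appengine"
        "google.golang.org/appengine/datastore"
    )

    // Entity is a placeholder for whatever is being stored.
    type Entity struct {
        Value string
    }

    func putBatch(w http.ResponseWriter, r *http.Request, keys []*datastore.Key, entities []Entity) {
        ctx := appengine.NewContext(r)

        // Extend the RPC deadline for this batch (it cannot exceed the request deadline).
        ctx, cancel := context.WithDeadline(ctx, time.Now().Add(50*time.Second))
        defer cancel()

        if _, err := datastore.PutMulti(ctx, keys, entities); err != nil {
            if appengine.IsTimeoutError(err) {
                // Back off and retry, or split the batch into smaller PutMulti calls.
            }
            http.Error(w, err.Error(), http.StatusInternalServerError)
        }
    }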
The first error you see may just be the timeout in normal operation; the second is likely because of write contention. More on this: Handling Datastore Errors https://cloud.google.com/appengine/articles/handling_datastore_errors
Our team is in a spike sprint to choose between ActiveMQ and RabbitMQ. We made two little producer/consumer spikes sending an object message with an array of 16 strings, a timestamp, and 2 integers. The spikes are OK on our dev machines (messages are consumed properly).
Then came the benchmarks. We first noticed that sometimes, on our machines, when we were sending a lot of messages, the consumer would hang. It was still there, but the messages were accumulating in the queue.
When we moved to the bench platform:
a cluster of 2 RabbitMQ machines (4 cores/3.2 GHz, 4 GB RAM), load balanced by a VIP
one to six consumers running on the RabbitMQ machines, saving the messages to a MySQL DB (same type of machine for the DB)
12 producers running on 12 application server machines (Tomcat), driven by JMeter running on another machine. The load is about 600 to 700 HTTP requests per second on the servlets, which produce the same load of RabbitMQ messages.
We noticed that sometimes consumers hang (well, they are not blocked, but they don't consume messages anymore). We can see this because each consumer saves around 100 msg/sec to the database, so when one stops consuming, the overall number of messages saved per second in the DB falls by the same ratio (if, say, 3 consumers stop, we fall from around 600 msg/sec to 300 msg/sec).
During that time, the producers are fine and still produce at the JMeter rate (around 600 msg/sec). The messages are in the queues and are taken by the consumers that are still "alive".
We load all the servlets with the producers first, then launch all the consumers one by one, checking that the connections are OK, then run JMeter.
We are sending messages to one direct exchange. All consumers are listening to one durable (persistent) queue bound to the exchange, roughly as sketched below.
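The declaration looks something like this (the exchange/queue names, routing key, and host are placeholders, not our real configuration):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.MessageProperties;

    public class TopologySetup {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // placeholder: the VIP in front of the cluster

            Connection connection = factory.newConnection();
            Channel channel = connection.createChannel();

            // Durable direct exchange and durable queue bound to it.
            channel.exchangeDeclare("bench.direct", "direct", true);
            channel.queueDeclare("bench.queue", true, false, false, null);
            channel.queueBind("bench.queue", "bench.direct", "bench");

            // Producer side: publish persistent messages with the same routing key.
            byte[] body = "payload".getBytes("UTF-8");
            channel.basicPublish("bench.direct", "bench", MessageProperties.PERSISTENT_BASIC, body);

            channel.close();
            connection.close();
        }
    }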
That point is major for our choice. Have you seen this with RabbitMQ? Do you have any idea of what is going on?
Thank you for your answers.
It's always worth setting the prefetch count when using basic.consume:

    channel.basicQos(100);

before the channel.basicConsume line, in order to ensure you never have more than 100 messages queued up in your QueueingConsumer.
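Put together, a minimal consumer along those lines might look like this (broker address and queue name are placeholders; this uses the older QueueingConsumer API mentioned above):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;
    import com.rabbitmq.client.QueueingConsumer;

    public class PrefetchConsumer {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // placeholder broker address

            Connection connection = factory.newConnection();
            Channel channel = connection.createChannel();

            // Allow at most 100 unacknowledged messages on this channel.
            channel.basicQos(100);

            QueueingConsumer consumer = new QueueingConsumer(channel);
            channel.basicConsume("bench.queue", false, consumer); // autoAck = false

            while (true) {
                QueueingConsumer.Delivery delivery = consumer.nextDelivery();
                // ... persist delivery.getBody() to the database ...
                channel.basicAck(delivery.getEnvelope().getDeliveryTag(), false);
            }
        }
    }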
I have seen this behavior when using the RabbitMQ STOMP plugin. I haven't found a solution yet.
Are you using the STOMP plugin?
Channels in RabbitMQ are not thread safe, so check whether your consumer's channel is being accessed from more than one thread.
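If that turns out to be the problem, one common pattern is to share the Connection but give each consumer thread its own Channel; a minimal sketch (host and thread count are placeholders):

    import com.rabbitmq.client.Channel;
    import com.rabbitmq.client.Connection;
    import com.rabbitmq.client.ConnectionFactory;

    public class ChannelPerThread {
        public static void main(String[] args) throws Exception {
            ConnectionFactory factory = new ConnectionFactory();
            factory.setHost("localhost"); // placeholder broker address

            // The Connection may be shared across threads; Channels should not be.
            final Connection connection = factory.newConnection();

            for (int i = 0; i < 4; i++) {
                new Thread(new Runnable() {
                    public void run() {
                        try {
                            Channel channel = connection.createChannel(); // one Channel per thread
                            channel.basicQos(100);
                            // ... basicConsume and basicAck only on this thread's own channel ...
                        } catch (java.io.IOException e) {
                            e.printStackTrace();
                        }
                    }
                }).start();
            }
        }
    }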