I'm currently using the RabbitMQ (3.6.2-1) on Ubuntu(16.04) in production. Producers publish messages and consumers consume messages and everything works correctly but sometimes RabbitMQ doesn't release memory and it touchs max memory and producers can not publish message into empty queues so I have to restart the service.
It's a bug or something else?
Update :
It is from the management plugin, so you can solve this issue by one of these solutions :
1.Update your RabbitMQ version (3.6.15 is stable)
2.Restarting statistics database periodically (in crontab hourly) https://www.rabbitmq.com/management.html#stats-db
3.set rates_mode to none in your rabbitmq.config file (it's not a good idea because in this case you can not see message rates)
You should check the number of the messages you have inside you queues.
RabbitMQ by default keeps the messages in memory to be fast.
if you have to handle lot of messages you can use the lazy queue:
https://www.rabbitmq.com/lazy-queues.html
With lazy queue you can handle milions of messages wihtout impact too much the node memory.
Or it could be a mamagement memory problem, see:
http://rabbitmq.com/management.html#stats-db
in your canse you can run:
rabbitmqctl eval 'supervisor2:terminate_child(rabbit_mgmt_sup_sup, rabbit_mgmt_sup), rabbit_mgmt_sup_sup:start_child().'
to reset the stats and free the memory.
You could call it periodically
Note:
There are different ways to reset the stats, it depends from the rabbitmq version, here all the detail
Related
I am doing a project in apache flink where I need to call multiple APIs so as to achieve my goal. The result of each API is required for the next API to work. Also as I am doing it on a KeyedStream, the same flow will be applicable to multiple data at once.
Below dig. can explain the scenario
/------API1---API2----
KeyedStream ----|------API1---API2----
\------API1---API2----
As I am doing all this, I am getting an exception saying "Buffer pool destroyed" after the job runs for sometime. Is it something related to API call, do I need to make use of Asynchronous function?? Please suggest. Thanks in advance.
a few things that are typically needed to help answer questions about Flink...
What version are you running?
How are you running it (from IDE, YARN cluster, stand-alone, etc)?
What's the complete stack trace for the exception?
(often) Can you share your code?
But at a high level, the "buffer pool destroyed" message you mentioned is not the root cause of failover, it's just a byproduct of Flink trying to kill off the workflow after an error has happened. So you need to dig deeper in the logs (typically Task Manager logs are where you'd look first).
In Google PubSub, the publish call from the client can be called asynchronously. Because of this, I would think that it would be possible to have multiple publish requests triggered and sent to the server, all at the same time, especially if the batch thresholds are too low.
If this is true, how does the pubsub client control the number of simultaneous publish requests that can be created? Is there a hard limit, or an error that can occur if too many requests are created? Is this the intended use of having an asynchronous publisher, or is simply to allow for other non-publishing activity to occur?
Though this question applies to any of the clients, we are specifically having an issue with the C# client, and are intermittently receiving the following error:
Grpc.Core.RpcException: Status(StatusCode=DeadlineExceeded, Detail="Deadline Exceeded")
at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task)
at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task)
at Google.Api.Gax.Grpc.ApiCallRetryExtensions.<>c__DisplayClass0_0`2.<<WithRetry>b__0>d.MoveNext()
--- End of stack trace from previous location where exception was thrown ---
My thought is that we are sending too many publish requests..., but I am not sure.
I would advice using the raw gRPC code, but use the client library that has a very thin wrapper.
Looking at the client source code always helps me, you can find for c# code here PublisherClient.cs (thin wrapper)
If you are using PublishAsync it queues/batch the messages anyway, the behaviour is controlled the settings you give to the client (see PublisherServiceApiClient for how to tune it). You can also control the number of client connections that are used to send the queues in the client. I suggest playing with the batch-size first, then the number of connections till you found your sweet spot for your throughput.
I had converted some tasks to run on a dynamic backend.
The tasks are failing silently [no logged error, no retry, nothing] ~20% of the time (min:10%, max:60%, sample:large, long term). Switching the task away from the backend restores retries and gets the failure rate back to ~0%.
Any ideas?
Converting it to a backend exacerbated the problem but wasn't the problem.
I had specified a task_retry_limit and the queue was a push queue. With a backend the number of instances is specified. (I believe you can replicate this issue on the frontend by ramping up requests rapidly, to a big number).
Tasks were failing 503: Instance Unavailable until they hit the task_retry_limit. This is visible temporarily in Task Queues, but will not show up in Logs.
I should be using pull queues. Even if my use case was stupid I'd probably +1 a task dying due to multiple 503: Instance Unavailable logging something so it doesn't appear like a phantom task.
Which runtime are you using on the backend?
Try running the backend for a bit without dynamic set to true and exercise the failing component.
On my project, I have seen tasks that target a static backend disappear on occasion, but no where near the rate you are seeing.
I'm using servicemix and camel to publish to some activemq topics:
from("vm:all").recipientList().simple(String.format("activemq:topic:%s.%s.%s.%s.%s.%s","${header.type}", ...
but a new producer thread gets created per topic, any idea how I can control that?
Edit1:
Actually I realized that I was creating too many topics and should you Selectors instead: http://www.andrejkoelewijn.com/blog/2011/02/21/camel-activemq-topic-route-with-jms-selector/
Thanks for the help!
The typical scenario is that Camel is using JMSTemplate to fire messages to ActiveMQ. That means it creates a new producer each time. Actually, it creates a new connection, session and producer per message. That is usually no issue except performance wise.
This is typically handled by org.apache.activemq.pool.PooledConnectionFactory which caches connections, sessions and producers for you. However, depending on your configuration, it might create "some" producers initially, but they will be reused.
Check your Connection Factory settings in activemq-broker.xml and make sure you grab the PooledConnectionFactory.
I've implemented a messaging system over SQL Server Service Broker. It is working great, with the sole exception that every once in a while (maybe once per week per server) my initiator service just vanishes without a trace. The corresponding queue is still there, but the service is missing.
Obviously this causes problems in my system. It's a simple matter to recreate the service by hand, but I'm confused as to what might cause this behavior. I understand that automatic poison message handling causes queues to be disabled, but I don't see anything that indicates services can be disabled or deleted automatically.
When this happens, I usually have a large backlog of messages in multiple application queues, but nothing extreme. Total message backlog is around 200,000.
Does anyone know what might be happening here?
You must have a bug of some sort that issues a DROP SERVICE statement. That is the only way a service gets deleted.
Check the default trace, the DROP statement gets traced and saved into it so you can track down the application/user/statement that issues the DROP. Check sys.traces to find the location of the default trace then open the .TRC file in Profiler.