Is Google PubSub suitable for low-volume (10 msg/sec) but mission-critical messaging, where timely delivery of each message is guaranteed within any fixed period of time?
Or is it rather suited for high-throughput use cases, where individual messages might occasionally be lost or delayed indefinitely?
Edit: To rephrase this question a bit: Is it true that any particular message in Pub/Sub, regardless of the volume of messages produced, can be indefinitely delayed?
Google Cloud Pub/Sub guarantees delivery of all messages, whether low throughput or high throughput, so there should be no concern about messages being lost.
Latency for message delivery from publisher to subscriber depends on many different factors. In particular, the rate at which the subscriber is able to process messages and request more messages is vitally important. For pull subscribers, this means always having several outstanding pull requests to the server. Push subscribers should return a successful HTTP response code as quickly as possible. You can read more about the difference between push and pull subscribers.
Google Cloud Pub/Sub tries to minimize latency as much as possible, though there are no guarantees made. Empirically, Cloud Pub/Sub consistently delivers messages in no more than a couple of seconds at the 99th percentile. Note that if your publishers or subscribers are not running on Google Cloud Platform, then network latency between your servers and Google servers could also be a factor.
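As an illustration of the pull-subscriber pattern, here is a minimal sketch using the Python client library (google-cloud-pubsub). The project and subscription names are placeholders; the streaming pull client keeps pull requests to the server outstanding on your behalf.

```python
# Minimal pull-subscriber sketch with the Python client library.
# "my-project" and "my-subscription" are placeholders.
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "my-subscription")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Process quickly and ack so the service can keep delivering.
    print(f"Received {message.message_id} published at {message.publish_time}")
    message.ack()

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)

with subscriber:
    try:
        streaming_pull_future.result(timeout=60)  # run for a minute, then stop
    except TimeoutError:
        streaming_pull_future.cancel()
        streaming_pull_future.result()
```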
Related
I'm working on an application where I will be getting 40 million records a day, so can Pub/Sub handle that? I have also seen that in some cases Pub/Sub sends duplicate messages; how can we avoid this?
40 million records in a day (~460/s) is definitely feasible for Pub/Sub, yes. The service is designed to scale horizontally with your load to tens of GB per second. Pub/Sub is an at-least-once delivery service by default, which means that duplicates are possible. There is an exactly-once delivery feature currently in public preview, which provides stronger guarantees, including:
Only one delivery of a message can be outstanding at a time.
A successful response to the Ack call means that the message is guaranteed not to be redelivered.
This does mean that if you don't ack a message before the deadline, the message will get redelivered, so it doesn't mean you avoid duplicates entirely. If you need exactly once processing, then Dataflow can be a good choice.
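For example, with exactly once delivery enabled on a subscription, recent versions of the Python client library let the subscriber wait for the acknowledgement result, so a successful ack really does mean no redelivery. A sketch under that assumption (names are placeholders):

```python
# Sketch: subscriber callback for a subscription with exactly once delivery
# enabled, assuming a recent google-cloud-pubsub version that provides
# ack_with_response(). Names are placeholders.
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "my-exactly-once-sub")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # ... process the message before acknowledging ...
    ack_future = message.ack_with_response()
    try:
        # A successful result means the message will not be redelivered.
        ack_future.result()
    except Exception as e:
        # The ack was not accepted (e.g. the deadline expired), so the
        # message may be redelivered and must be handled again.
        print(f"Ack failed for {message.message_id}: {e}")

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
```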
We tried to enforce a certain rate limit on a Cloud Pub/Sub push subscriber by setting the quota on "Push subscriber throughput, kB" to 1, effectively meaning that Pub/Sub should push no more than 1 kB/s to the push subscriber.
However, the actual throughput can be higher than that, around 6-8 kB/s.
Why is that not limiting the throughput as expected?
More details:
The goal is to have a rate limit of 50 messages per second.
We can assume the average message size; for the purposes of our testing we use 50-byte messages, which at roughly 60 messages per second is 50 bytes * 60 = 3,000 bytes per second, or about 3 kB/s. By setting the quota to 1 we expected far fewer than 50 messages per second to be pushed by Pub/Sub. During testing we got significantly more than that.
At the moment, there is a known issue with the enforcement of push subscriber quota in Google Cloud Pub/Sub.
In general, push subscriber quota is not really a good way to try to enforce flow control. For true flow control, it is better to use pull subscribers and the client libraries. The goal of flow control in the subscriber is to prevent the subscriber from being overwhelmed. In the client library, flow control is defined in terms of outstanding messages and/or outstanding bytes. When one of these limits is reached, Cloud Pub/Sub suspends the delivery of more messages.
The issue with rate-based flow control is that it doesn't account well for unexpected issues with the subscriber or its downstream dependencies. For example, imagine that the subscriber receives messages, writes to a database, and then acknowledges the message. If the database were suffering from high latency or just unavailable for a period of time, then rate-based flow control is still going to deliver more messages to the subscriber, which will back up and could eventually overload its memory. With flow control based on outstanding messages or bytes, the fact that the database is unavailable (which prevents the acknowledgement of messages by the subscriber) means that delivery is completely halted. In this situation where the database cannot process any messages or is processing them extremely slowly, sending more messages--even at a very low rate--is still harmful to the subscriber.
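For illustration, in the Python client library these limits are set via FlowControl when subscribing; the values below are hypothetical:

```python
# Sketch: subscriber flow control based on outstanding messages/bytes,
# using the Python client library. The limits are hypothetical; when either
# limit is reached, the client stops requesting more messages until some
# outstanding messages are acked.
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "my-subscription")

flow_control = pubsub_v1.types.FlowControl(
    max_messages=50,             # at most 50 messages outstanding
    max_bytes=10 * 1024 * 1024,  # at most 10 MiB outstanding
)

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    # Write to the database (or other downstream system) first; ack only
    # once the work has succeeded, so a slow or unavailable database
    # naturally halts further delivery.
    message.ack()

streaming_pull_future = subscriber.subscribe(
    subscription_path, callback=callback, flow_control=flow_control
)
```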
We are experiencing very high latencies when we start a Google Pub/Sub client. Messages do not arrive until several minutes after the client is initialized.
When looking in the Google Cloud console, we can indeed see that google.pubsub.v1.Subscriber.StreamingPull calls have very high latencies (around 8 minutes).
Is it expected behaviour? If not, what could cause this issue?
Best regards
The latency in the Google Cloud console would not be correlated with latency in receiving messages. The nature of a StreamingPull request is that it stays open for a long time, until shut down by a connection error or when a shutdown is initiated on the client. The latency in the console would indicate how long the connections are staying open, not how long it is taking to receive messages. This is also why the error rate is 100%.
Messages should be received quickly after starting up a subscriber, assuming there are messages available in the backlog to receive. There are many different things that could lead to delays in message delivery:
Subscriber client running on a machine with limited available resources that cannot keep up with processing the incoming messages.
Very tight flow control settings that only allow a few messages through at a time.
Publisher-side latency due to the publisher running on a machine with limited available resources.
Messages having been received earlier by another subscriber client or via a pull command in the gcloud tool on the same subscription, resulting in messages not being redelivered until the ack deadline has expired (the sketch below shows how to check a subscription's ack deadline).
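As a small diagnostic for the last point, the subscription's ack deadline can be read with the Python client library (names are placeholders):

```python
# Sketch: reading a subscription's ack deadline, i.e. how long a message
# delivered to another subscriber stays outstanding before it becomes
# eligible for redelivery. Names are placeholders.
from google.cloud import pubsub_v1

subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "my-subscription")

with subscriber:
    subscription = subscriber.get_subscription(
        request={"subscription": subscription_path}
    )
    print(f"Ack deadline: {subscription.ack_deadline_seconds} seconds")
```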
So I was looking at using Google's Pub/Sub service for queues, but by trial and error I came to the conclusion that I have no idea what it's good for in real applications.
Google says that it's "a global service for real-time and reliable messaging and streaming data", but the way it works is really strange to me. It holds acked messages for up to 7 days; if the subscriber re-subscribes, it will get all the messages from the past 7 days even if it already acked them; acked messages will most likely be sent again to the same subscriber that acked them; and there is no FIFO either.
So I really do not understand how one should use this service if the only thing it guarantees is that a message will be delivered at least once to any subscriber. It cannot be used for actions that are not idempotent: each subscriber has to store information about all messages that were already acked so it won't process a message multiple times, and so on...
Google Cloud Pub/Sub has a lot of different applications where decoupled systems need to send and receive messages. The overview page offers a number of use cases including balancing work loads, logging, and event notifications. It is true that Google Cloud Pub/Sub does not currently offer any FIFO guarantees and that messages can be redelivered.
However, the fact that the delivery guarantee is "at least once" should not be taken to mean that acked messages are redelivered when a subscriber re-subscribes. Redelivery of acked messages is a rare event. This generally only happens when the ack did not make it all the way back to the service due to a networking issue, a machine failure, or some other exceptional condition. While that means apps do need to be able to handle this case, it does not mean it will happen frequently.
For different applications, what happens on message redelivery can differ. In a case such as cache invalidation, mentioned in the overview page, getting two events to invalidate an entry in a cache just means the value will have to be reloaded an extra time, so there is not a correctness concern.
In other cases, like tracking button clicks or other events on a website for logging or stats purposes, infrequent acked message redelivery is likely not going to affect the information gathered in a significant way, so not bothering to check if events are duplicates is fine.
For cases where it is necessary to ensure that messages are processed exactly once, then there has to be some sort of tracking on the subscriber side to ensure this is the case. It might be that the subscriber is already accessing and updating an underlying database in response to messages and duplicate events can be detected via that storage.
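A minimal sketch of that idea, using SQLite purely for illustration: the message ID is recorded in the same transaction as the message's effect, so a redelivered duplicate is detected and skipped.

```python
# Sketch: subscriber-side deduplication keyed on the Pub/Sub message ID,
# with SQLite standing in for the application's database. Recording the ID
# and applying the update in one transaction makes redeliveries harmless.
import sqlite3

conn = sqlite3.connect("processed.db")
conn.execute("CREATE TABLE IF NOT EXISTS processed (message_id TEXT PRIMARY KEY)")

def apply_update(db: sqlite3.Connection, data: bytes) -> None:
    ...  # application-specific write (placeholder)

def handle(message) -> None:
    try:
        with conn:  # one transaction covering dedup record and side effect
            conn.execute(
                "INSERT INTO processed (message_id) VALUES (?)",
                (message.message_id,),
            )
            apply_update(conn, message.data)
    except sqlite3.IntegrityError:
        pass  # already processed: a redelivered duplicate, safe to drop
    message.ack()
```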
I have an application that requires really low latency (real time game).
Currently in my solution it takes less than 2 milliseconds for a message to be routed from the client front-end server to the destination server.
Does anybody know how much time will it take in Google Cloud Pub/Sub to route a message from one server to another?
Thank you!
While Cloud Pub/Sub's end-to-end latency at the 99.9th percentile is sufficient for many applications, including some using it for real-time interaction, 2 ms is lower than what the system can currently promise. We have thus far prioritized high throughput and strong delivery guarantees. End-to-end latency is also highly dependent on the rate at which a subscriber issues pull requests; a subscriber should always have at least a few open pull requests if throughput and/or latency are important. We do aim to significantly reduce our intra-region latencies, but at the moment Cloud Pub/Sub cannot guarantee 2 ms intra-region latencies at the 99.9th percentile.
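If the exact number matters for your workload, it is straightforward to measure it yourself by stamping each message with a publish-time attribute and comparing on receipt. A rough sketch (topic, subscription, and the "sent_at" attribute name are placeholders, and both machines' clocks need to be reasonably in sync):

```python
# Rough sketch: measuring publish-to-receive latency by attaching a
# client-side timestamp attribute to each message. Topic/subscription
# names and the "sent_at" attribute are placeholders.
import time
from google.cloud import pubsub_v1

project = "my-project"
publisher = pubsub_v1.PublisherClient()
subscriber = pubsub_v1.SubscriberClient()
topic_path = publisher.topic_path(project, "latency-test")
subscription_path = subscriber.subscription_path(project, "latency-test-sub")

def callback(message: pubsub_v1.subscriber.message.Message) -> None:
    sent = float(message.attributes["sent_at"])
    print(f"End-to-end latency: {(time.time() - sent) * 1000:.1f} ms")
    message.ack()

streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)

for _ in range(10):
    publisher.publish(topic_path, b"ping", sent_at=str(time.time()))
    time.sleep(1)

streaming_pull_future.cancel()
streaming_pull_future.result()
```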