My question is related to this description of the pub/sub message flow from The Basics of a Publish/Subscribe Service:
The description appears to suggest that a subscriber can receive only some of the messages arriving at a subscription: Subscriber 1 appears to get just the B message and Subscriber 2 just the A message, even though both A and B come from Subscription 1.
I have not encountered this concept anywhere else in the docs. Receiving messages appears to be tied to a particular subscription, and a subscription is tied to a particular topic, not to a particular publisher.
Am I misinterpreting the above description or is it really possible for a subscriber to select only some of the messages it receives (based on the publisher)?
The subscribers themselves do not choose which messages they get. When there are multiple subscribers for a single subscription, they can both pull from the same subscription and receive an arbitrary subset of the messages. This can be used to load balance across multiple subscribers and process more messages in parallel by increasing the number of subscribers.
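The load-balancing behaviour described above can be illustrated with a small simulation. This is only a sketch using the standard library, not the Pub/Sub client: the shared `queue.Queue` stands in for a single subscription, and `run_subscribers` is a hypothetical helper name. Each message ends up with exactly one subscriber, and which one gets it is arbitrary.

```python
import queue
import threading


def run_subscribers(messages, num_subscribers=2):
    """Simulate several subscribers pulling from one shared subscription."""
    subscription = queue.Queue()  # stand-in for a single subscription
    for msg in messages:
        subscription.put(msg)

    # One list per subscriber; each thread only appends to its own list.
    received = {i: [] for i in range(num_subscribers)}

    def subscriber(idx):
        while True:
            try:
                msg = subscription.get_nowait()
            except queue.Empty:
                return  # nothing left to pull
            received[idx].append(msg)

    threads = [
        threading.Thread(target=subscriber, args=(i,))
        for i in range(num_subscribers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return received


if __name__ == "__main__":
    # The split between subscribers varies from run to run.
    print(run_subscribers(["A", "B", "C", "D"]))
```

Together the subscribers see every message once, but no single subscriber is guaranteed any particular message.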
I have a Cloud Pub/Sub push subscription that pushes multiple instances of the same message to a processing endpoint in GAE. I can track the message ID, and it is the same message that gets pushed multiple times.
I have set the ack timeout to 600 seconds, but it still pushes multiple instances of some of the messages. Apart from the message not being acked, what can trigger this behavior? Has anyone had the same problem?
The issue seems to get worse the more instances I run, but even with basic_scaling and max_instances: 1 the problem remains.
I can see a bunch of 503 errors in GAE, but if I understand correctly that is not an issue, since these messages are automatically retried by Pub/Sub.
As it turns out this is a well known issue with Pub/Sub. Pub/Sub is "At least Once Delivery", and duplicates are to be expected. To resolve this, read here for some inspiration, https://cloud.google.com/blog/products/serverless/cloud-functions-pro-tips-building-idempotent-functions
I am posting this as an answer because I don't have enough reputation to post a comment. :)
As you have already figured out, once Pub/Sub sends a message to a subscriber, the subscriber should acknowledge the message. Any message that has not been acknowledged, Cloud Pub/Sub will repeatedly attempt to deliver (Check here). This means that occasional duplicates are to be expected. However, a high rate of duplicates may indicate that the client is not acknowledging messages within the configured ack_deadline_seconds, and Cloud Pub/Sub is retrying the message delivery.
You could use Stackdriver, to monitor if the Pub/Sub System is successful and your messages are being acknowledged (Check here & here), or if there are too many duplicates (Check here & here).
We recently integrated Google Pub/Sub into our app, and some of our long-running tasks are now having problems, as they sometimes take more than 1 minute. We have configured our subscriber's ack deadline to 600 seconds, yet anything that takes more than 600ms is being retried by Pub/Sub.
This is our config:
gcloud pubsub subscriptions describe name
ackDeadlineSeconds: 600
expirationPolicy: {}
messageRetentionDuration: 604800s
Not sure what the issue is. Most of our tasks get repeated because of this.
Pub/Sub has a built-in at-least-once delivery system which retries messages that were not acknowledged. In this case, once the 600s ack deadline passes without an acknowledgment, Pub/Sub treats the message as unacknowledged and redelivers it. It keeps redelivering the message until you acknowledge it or the messageRetentionDuration is reached.
Keep in mind that the documentation specifies that your subscriber should be idempotent. So making your code able to handle duplicate messages is the best approach to this issue.
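As a rough illustration of what an idempotent subscriber can look like, here is a minimal sketch that skips messages whose ID has already been processed. The class name `IdempotentHandler` and the in-memory set are assumptions made for the example; with more than one instance you would need a shared store for the seen IDs instead.

```python
class IdempotentHandler:
    """Run `process` at most once per message ID (hypothetical helper)."""

    def __init__(self, process):
        self._process = process
        # Assumption: an in-memory set is enough for a single process.
        # Several instances would need a shared store (e.g. a database).
        self._seen = set()

    def handle(self, message_id, data):
        """Return True if the message was processed, False if a duplicate."""
        if message_id in self._seen:
            return False  # duplicate delivery: safe to ack and skip
        self._process(data)
        # Record the ID only after processing succeeded, so a failure
        # still allows Pub/Sub's redelivery to retry the work.
        self._seen.add(message_id)
        return True


if __name__ == "__main__":
    handler = IdempotentHandler(print)
    handler.handle("id-1", "first delivery")   # processed
    handler.handle("id-1", "second delivery")  # skipped
```

Either way the subscriber should acknowledge the message, so that a duplicate is not redelivered again.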
You could also decrease the messageRetentionDuration to 600s (its minimum) so anything that passes the 10-minute mark will not be retried.
Also, it is stated in the FAQs that:
Why are there too many duplicate messages?
Cloud Pub/Sub guarantees at-least-once message delivery, which means
that occasional duplicates are to be expected. However, a high rate of
duplicates may indicate that the client is not acknowledging messages
within the configured ack_deadline_seconds, and Cloud Pub/Sub is
retrying the message delivery. This can be observed in the monitoring
metrics
pubsub.googleapis.com/subscription/pull_ack_message_operation_count
for pull subscriptions, and
pubsub.googleapis.com/subscription/push_request_count for push
subscriptions. Look for elevated expired or webhook_timeout values in
the /response_code. This is particularly likely if there are many
small messages, since Cloud Pub/Sub may batch messages internally and
a partially acknowledged batch will be fully redelivered.
Another possibility is that the subscriber is not acknowledging some
messages because the code path processing those specific messages
fails, and the Acknowledge call is never made; or the push endpoint
never responds or responds with an error.
I have watch/subscribed to the topic using the following code.
request = {
'labelIds': ['INBOX'],
'topicName': 'projects/myproject/topics/mytopic'
}
gmail.users().watch(userId='me', body=request).execute()
How can I get the status of the topic at any given point in time? The problem is, sometimes I am not getting the push from Gmail for any incoming emails.
From the Cloud Pub/Sub perspective, if you want to check on the status of messages, you could look at metrics via Stackdriver. There are many Cloud Pub/Sub metrics available. You can create graphs for any of the metrics mentioned below by going to Stackdriver, creating a new dashboard, clicking on "Add Chart," and then typing the name of the metric into the "Find resource type and metric" box.
The first thing you have to determine is whether the issue is on the publish side (from Gmail into your topic) or on the subscribe side (from the subscription to your push endpoint). To determine if the topic is receiving messages, look at the topic/send_message_operation_count metric. This should be non-zero at points where messages were sent from Gmail to the topic. If it is always zero, then it is likely that the connection from Gmail to Cloud Pub/Sub is not set up properly, e.g., you need to grant publish rights to the topic. Note that results are delayed, so from the time you expect a message to have been sent to when it would be reflected on the graph could be up to 5 minutes.
If the messages are successfully being sent to Pub/Sub, then you'll want to see the status of attempts to receive those messages. If your subscription is a push subscription, then you'll want to look at subscription/push_request_count for the subscription. Results are grouped by response code. If the responses are in the 400 or 500 ranges, then Cloud Pub/Sub is attempting to deliver messages to your subscriber, but the subscriber is returning errors. In this case, it is likely an issue with your subscriber itself.
If you are using the Cloud Pub/Sub client libraries, then you'll want to look at properties like subscription/streaming_pull_message_operation_count to determine if your subscriber is managing to try to fetch messages for a subscription. If you are calling the pull method directly in your subscriber, then you'll want to look at subscription/pull_message_operation_count to see if there are pull requests returning successfully to your subscriber.
If the metrics for push, pull, or streaming pull indicate errors, that should help to narrow down the problem. If there are no requests at all, then it indicates that the subscribers may not be running or may not be reaching Cloud Pub/Sub at all. There could be permission problems, e.g., the subscriber is running as a user that doesn't have permission to read from subscriptions.
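The troubleshooting order above can be summarised as a small decision function. This is only a sketch: it assumes you have already read the three counts off the Stackdriver graphs yourself, and the function name, parameter names, and returned strings are made up for illustration. It does not call any monitoring API.

```python
def diagnose(topic_send_count, delivery_request_count, delivery_error_count):
    """Suggest where to look first, given counts read from the metrics.

    topic_send_count:       value of topic/send_message_operation_count
    delivery_request_count: push, pull, or streaming pull request count
    delivery_error_count:   how many of those requests ended in an error
    """
    if topic_send_count == 0:
        # Nothing reached the topic, so the problem is before Pub/Sub.
        return "publish side: check the publisher's rights on the topic"
    if delivery_request_count == 0:
        # Messages arrive, but nothing is trying to deliver or fetch them.
        return "subscribe side: no requests, check subscriber and permissions"
    if delivery_error_count > 0:
        # Delivery is attempted, but the subscriber answers with errors.
        return "subscribe side: subscriber is returning errors"
    return "no problem visible in these metrics"


if __name__ == "__main__":
    print(diagnose(topic_send_count=12,
                   delivery_request_count=12,
                   delivery_error_count=5))
```

Remember that the graphs can lag by several minutes, so a zero count right after sending a test email is not conclusive.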
As a subscriber to a pubsub topic, is it possible for me to find out what time a message was received by pubsub? That is, can a subscriber that has just received a message find out what time the corresponding publisher published the message?
This is possible with a pull subscriber. Use the publishTime field of PubsubMessage.
If you are using a client library, read the library docs on how to access this. For example, with the python client lib, it is accessible via the publish_time field on the Message class.
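For instance, a subscriber callback could use that field to work out how long a message waited between being published and being received. The sketch below assumes `publish_time` is a timezone-aware `datetime`, so check the library docs for your version. The helper name `delivery_delay_seconds` is made up, and it accepts any object with a `publish_time` attribute, so it can be tried without the client library installed.

```python
from datetime import datetime, timezone


def delivery_delay_seconds(message, now=None):
    """Seconds between the message's publish time and `now`.

    `message` is anything with a `publish_time` attribute, for example
    the Message object passed to a Python client subscriber callback.
    Assumption: publish_time is a timezone-aware datetime.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return (now - message.publish_time).total_seconds()


if __name__ == "__main__":
    from types import SimpleNamespace

    # Stand-in for a received message, for demonstration only.
    fake = SimpleNamespace(
        publish_time=datetime(2020, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    )
    later = datetime(2020, 1, 1, 12, 0, 30, tzinfo=timezone.utc)
    print(delivery_delay_seconds(fake, now=later))
```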
I created a working Google Channel API setup and now I would like to send a message to all clients.
I have two servlets. The first creates the channel and tells the clients the userid and token. The second one is called by an http post and should send the message.
To send a message to a client, I use:
channelService.sendMessage(new ChannelMessage(channelUserId, "This is a server message!"));
This sends the message just to one client. How could I send this to all?
Do I have to store every ID that I use to create a channel and send the message once per ID? How could I pass the IDs to the second servlet?
With the Channel API it is not possible to create one channel and then have many subscribers to it. The server creates a unique channel for each individual JavaScript client, so if several clients share the same client ID, the messages will be received by only one of them.
If you want to send the same message to multiple clients, in short, you will have to keep track of the active clients and send the same message to each of them.
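The bookkeeping this implies can be sketched as a small registry. This is an illustration in Python rather than the Java servlet code from the question, and `ClientRegistry` and its methods are made-up names. The `send` callable is injected and stands in for whatever delivers to a single client, for example a wrapper around `channelService.sendMessage`.

```python
class ClientRegistry:
    """Track active client IDs and fan one message out to all of them."""

    def __init__(self, send):
        # `send(client_id, message)` delivers to a single client.
        self._send = send
        # Assumption: in-memory storage. Two servlets or several instances
        # would need to share these IDs through a datastore or cache.
        self._client_ids = set()

    def register(self, client_id):
        """Call when a channel is created for this client."""
        self._client_ids.add(client_id)

    def unregister(self, client_id):
        """Call when the client disconnects; unknown IDs are ignored."""
        self._client_ids.discard(client_id)

    def broadcast(self, message):
        """Send `message` once per active client; return how many were sent."""
        for client_id in sorted(self._client_ids):
            self._send(client_id, message)
        return len(self._client_ids)


if __name__ == "__main__":
    registry = ClientRegistry(lambda cid, msg: print(cid, "<-", msg))
    registry.register("user-1")
    registry.register("user-2")
    registry.broadcast("This is a server message!")
```

Storing the IDs in a shared place is also what lets the second servlet see the channels created by the first.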
If that approach sounds scary and messy, consider using PubNub for your push notification messages, where you can easily create one channel and have many subscribers. To make it run on Google App Engine is not that hard, since they support almost any platform or device.
I know this is an old question, but I just finished an open source project that uses the Channel API to implement a publish/subscribe model, i.e. you can have multiple users subscribe to a single topic, and then all those subscribers will be notified when anyone publishes a message to the topic. It also has some nice features like automatic message persistence if desired, and "return receipts", where a subscriber can be notified whenever OTHER subscribers receive that message. See https://github.com/adevine/gaewebpubsub#gae-web-pubsub. Licensed under Apache 2.0 license.