So I was looking at using Google's Pub/Sub service for queues but by trial and error I came to a conclusion that I have no idea what it's good for in real applications.
Google says that it's
"A global service for real-time and reliable messaging and streaming data"
but the way it works is really strange to me. It holds acked messages for up to 7 days; if the subscriber re-subscribes, it will get all the messages from the past 7 days even if it already acked them; acked messages will most likely be sent again to the same subscriber that already acked them; and there's no FIFO either.
So I really do not understand how one should use this service if the only thing it guarantees is that a message will be delivered at least once to any subscriber. This cannot be used for non-idempotent actions: each subscriber has to store information about all the messages it has already acked so that it won't process a message multiple times, and so on...
Google Cloud Pub/Sub has a lot of different applications where decoupled systems need to send and receive messages. The overview page offers a number of use cases including balancing work loads, logging, and event notifications. It is true that Google Cloud Pub/Sub does not currently offer any FIFO guarantees and that messages can be redelivered.
However, the fact that the delivery guarantee is "at least once" should not be taken to mean that acked messages are redelivered when a subscriber re-subscribes. Redelivery of acked messages is a rare event. It generally only happens when the ack did not make it all the way back to the service due to a networking issue, a machine failure, or some other exceptional condition. While that means apps do need to be able to handle this case, it does not mean it will happen frequently.
For different applications, what happens on message redelivery can differ. In a case such as cache invalidation, mentioned in the overview page, getting two events to invalidate an entry in a cache just means the value will have to be reloaded an extra time, so there is not a correctness concern.
In other cases, like tracking button clicks or other events on a website for logging or stats purposes, infrequent acked message redelivery is likely not going to affect the information gathered in a significant way, so not bothering to check if events are duplicates is fine.
For cases where it is necessary to ensure that messages are processed exactly once, then there has to be some sort of tracking on the subscriber side to ensure this is the case. It might be that the subscriber is already accessing and updating an underlying database in response to messages and duplicate events can be detected via that storage.
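As a minimal sketch of that kind of subscriber-side tracking, assuming the subscriber already writes to a SQL database: the message ID is recorded in the same transaction as the business update, so a redelivered message is detected and skipped. The table names, the `handle` function, and the click-counting workload are hypothetical, and SQLite stands in for whatever store the subscriber really uses.

```python
import sqlite3


def make_db():
    # In-memory database for illustration; a real subscriber would use its own store.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed (message_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE clicks (button TEXT PRIMARY KEY, n INTEGER NOT NULL)")
    return db


def handle(db, message_id, button):
    """Apply the message at most once; return False if it was a duplicate."""
    with db:  # one transaction: the dedup record and the update commit together
        cur = db.execute(
            "INSERT OR IGNORE INTO processed (message_id) VALUES (?)",
            (message_id,),
        )
        if cur.rowcount == 0:
            return False  # ID already recorded: skip the work, but still ack
        db.execute("INSERT OR IGNORE INTO clicks (button, n) VALUES (?, 0)", (button,))
        db.execute("UPDATE clicks SET n = n + 1 WHERE button = ?", (button,))
    return True
```

Calling `handle(db, "m1", "buy")` twice increments the counter only once; the second call returns `False`, and the subscriber can simply ack the message again.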
We are using the google-cloud-pubsub (0.24.0-beta) pull client to read messages from a subscription and are seeing a high rate of duplicates. The Google documentation says that a small amount of duplication is expected, but in our case about 80% of messages are duplicated, even after acknowledgement.
The weirdest part is that even if we acknowledge the message immediately in the receiver using consumer.ack(), duplicates still occur.
Does anybody know how to handle this?
A large number of message duplicates could be the result of flow control settings being set too high or too low. If your flow control settings are too high, so that you allow too many messages to be outstanding to your client at the same time, then it is possible that the acks are being sent too late. If this is the cause, you would probably see the CPU of your machine at or near 100%. In this case, try setting the max number of outstanding messages or bytes to a lower number.
It could also be the case that the flow control settings are set too low. Some messages get buffered in the client before they are delivered to your MessageReceiver, particularly if you are flow controlled. In this case, messages may spend too much time buffered in the client before they are delivered. There is an issue with messages in this state that is being fixed in an outstanding PR. In this scenario, you could either increase your max outstanding bytes or messages (up to whatever your subscriber can actually handle) or you can try to setAckExpirationPadding to a larger value than the default 500ms.
It is also worth checking your publisher to see if it is unexpectedly publishing messages multiple times. If that is the case, you may see the contents of your messages being the same, but they aren't duplicate messages being generated by Google Cloud Pub/Sub itself.
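One way to tell the two cases apart, sketched here under the assumption that the subscriber logs each received message as a `(message_id, payload)` pair: Cloud Pub/Sub assigns an ID per publish, so a redelivery repeats the ID, while a publisher that publishes twice produces the same payload under different IDs. The `classify_duplicates` helper is hypothetical, not part of any client library.

```python
from collections import Counter


def classify_duplicates(received):
    """Split duplicates into service redeliveries and publisher re-publishes.

    received: iterable of (message_id, payload) pairs as logged by the subscriber.
    Returns (redelivered_ids, republished_payloads).
    """
    received = list(received)

    # Same message ID seen more than once: the service delivered it again.
    id_counts = Counter(message_id for message_id, _ in received)
    redelivered = {message_id for message_id, n in id_counts.items() if n > 1}

    # Same payload under several distinct IDs: the publisher sent it more than once.
    ids_per_payload = {}
    for message_id, payload in received:
        ids_per_payload.setdefault(payload, set()).add(message_id)
    republished = {p for p, ids in ids_per_payload.items() if len(ids) > 1}

    return redelivered, republished
```

Identical payloads can of course be legitimate (two real events with the same content), so the second set is a hint to inspect the publisher rather than proof of a bug.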
Edited to mention bug that was in the client library:
If you were using a version of google-cloud-pubsub between v0.22.0 and v0.29.0, you might have been running into an issue where a change in the underlying mechanism for getting messages could result in excessive duplicates. The issue has since been fixed.
Is Google PubSub suitable for low-volume (10 msg/sec) but mission-critical messaging, where timely delivery of each message is guaranteed within any fixed period of time?
Or, is it rather suited for high-throughput, where individual messages might be occasionally lost or delayed indefinitely?
Edit: To rephrase this question a bit: Is it true, that any particular message in PubSub, regardless of volume of messages produced, can be indefinitely delayed?
Google Cloud Pub/Sub guarantees delivery of all messages, whether low throughput or high throughput, so there should be no concern about messages being lost.
Latency for message delivery from publisher to subscriber depends on many different factors. In particular, the rate at which the subscriber is able to process messages and request more messages is vitally important. For pull subscribers, this means always having several outstanding pull requests to the server. For push subscribers, they should be returning a successful HTTP response code as quickly as possible. You can read more about the difference between push and pull subscribers.
Google Cloud Pub/Sub tries to minimize latency as much as possible, though there are no guarantees made. Empirically, Cloud Pub/Sub consistently delivers messages in no more than a couple of seconds at the 99th percentile. Note that if your publishers or subscribers are not running on Google Cloud Platform, then network latency between your servers and Google servers could also be a factor.
We're sending messages to Apache Camel using RabbitMQ.
We have a "sender" and a Camel route that processes a RabbitMQ message sent by the sender.
We're having deployment issues regarding which end of the system comes up first.
Our system is low-volume. I am sending perhaps 100 messages at a time. The point of the message is to reduce 'temporal cohesion' between a thing happening in our primary database, and logging of same to a different database. We don't want our front-end to have to wait.
The "sender" will create an exchange if it does not exist.
The issue is causing deployment issues.
Here's what I see:
If I down the sender, down Camel, delete the exchange (clean slate), start the sender, then start Camel, and send 100 messages, the system works. (I think because the sender has to be run manually for testing, the Exchange is being created by the Camel Route...)
If I clean slate, and send a message, and then up Camel afterwards, I can see the messages land in RabbitMQ (using the web tool). No queues are bound. Once I start Camel, I can see its bound queue attached to the Exchange. But the messages have been lost to time and fate; they have apparently been dropped.
If, from the current state, I send more messages, they flow properly.
I think that if the messages that got dropped were persisted, I'd be ok. What am I missing?
For me it's hard to say what exactly is wrong, but I'll try and provide some pointers.
You should set up all exchanges and queues to be durable, and the messages persistent. You should never delete any of these entities (unless they are empty and you no longer use them); think of them as tables in a database. They are your infrastructure of sorts, and just as with a database, you wouldn't want the first DB client that happens to connect to create the tables it needs (this of course applies to your use case, at least that's how it seems to me).
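To illustrate why the queue should exist before anything is published, here is a toy in-memory model. It is not the RabbitMQ client API, and `ToyExchange` is a made-up name; it only mimics the rule that an exchange routes a message to the queues bound at publish time, and that a message with no matching queue is discarded.

```python
class ToyExchange:
    """Fanout-style toy exchange: copies each message to every queue bound right now."""

    def __init__(self):
        self.queues = {}

    def bind(self, queue_name):
        # Create the queue if needed and return its backing list.
        return self.queues.setdefault(queue_name, [])

    def publish(self, message):
        for queue in self.queues.values():
            queue.append(message)
        # Number of queues that received the message; 0 means it was dropped.
        return len(self.queues)


exchange = ToyExchange()
exchange.publish("event-1")       # no queue bound yet: nothing keeps the message
log_queue = exchange.bind("log")  # the consumer shows up and binds its queue
exchange.publish("event-2")       # routed normally
```

After this runs, `log_queue` holds only `"event-2"`, which matches the behaviour described in the question: messages sent before the consumer first binds its queue are gone, and later ones flow.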
In the comments I mentioned flow state of the queue, but with 100 messages this will probably never happen.
Regarding message delivery - persistent or not, the broker (server) keeps messages until they are consumed and the consumer sends back an acknowledgment (in lots of APIs this is done automatically, but it's actually one of the most important concepts).
If the exchange to which the messages were published is deleted, they are gone. If the server gets killed or restarted and the messages are not persisted - again, they're gone. There may well be more scenarios in which messages get dropped (if I think of some I'll edit the answer).
If you don't have control over creating (usually called declaring in the APIs) exchanges and queues, then (aside from the fact that it's not the best thing, IMHO) it can be tricky: declaring those entities is idempotent only when the parameters match, i.e. you can't create a durable queue q1 if a non-durable queue with the same name already exists. This could also be a problem in your case, since you mention the question of which part of the system comes up first - maybe something is not declared with the same parameters on both sides...
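The declaration rule can be modelled in a few lines. This is again a toy, not the RabbitMQ client API: `ToyBroker`, `queue_declare`, and `PreconditionFailed` are illustrative names chosen to resemble what a real broker reports when a redeclaration does not match.

```python
class PreconditionFailed(Exception):
    """Raised when a queue is redeclared with different parameters."""


class ToyBroker:
    def __init__(self):
        self.queues = {}  # queue name -> durable flag

    def queue_declare(self, name, durable):
        existing = self.queues.get(name)
        if existing is None:
            self.queues[name] = durable
            return "created"
        if existing != durable:
            raise PreconditionFailed(
                f"queue {name!r} already exists with durable={existing}"
            )
        return "unchanged"  # same parameters: declaring again is a no-op
```

If the sender declares `q1` as non-durable and the consumer later declares it as durable, the second call fails, so whichever side starts first decides the parameters. Declaring everything once, up front, removes that race.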
I am implementing a callback in Java to store messages in a database. I have a client subscribing to '#'. The problem is that when this '#' client disconnects and reconnects, it adds duplicate entries for retained messages to the database. If I search for previous entries, larger tables will be expensive in computing power. So should I allot a separate table per sensor, or per broker? I would really appreciate it if you could suggest better designs.
Subscribing to wildcard with a single client is definitely an anti-pattern. The reasons for that are:
Wildcard subscribers get all messages of the MQTT broker. Most client libraries can't handle that load, especially not when transforming / persisting messages.
If your wildcard subscriber dies, you will lose messages (unless the broker queues endlessly for you, which also doesn't work)
You essentially have a single point of failure in your system. Use MQTT brokers which are hardened for production use. These are much more robust single points of failure than your hand-written clients. (You can overcome the single point of failure through clustering and load balancing, though.)
So to solve the problem, I suggest the following:
Use a broker which can handle shared subscriptions (like HiveMQ or MessageSight), so you can balance all messages between many clients
Use a custom plugin for doing the persistence at the broker instead of the client.
You can also read more about that topic here: http://www.hivemq.com/blog/mqtt-sql-database
Also consider using QoS 2 (MQTT defines only QoS 0, 1 and 2) for all messages to make sure each message is delivered exactly once. You may also consider timestamping each message, so that duplicates can be rejected on insert if the QoS requirement is not met.
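A minimal sketch of the timestamp idea, assuming every publisher puts a send timestamp in the message and that topic plus timestamp identifies a reading: a composite primary key lets the database reject a replayed retained message through an index lookup, so there is no need to scan earlier rows or to split the data into one table per sensor. The schema and function names are hypothetical, and Python with SQLite is used only to keep the example self-contained; the same SQL applies from a Java callback.

```python
import sqlite3


def open_store():
    db = sqlite3.connect(":memory:")
    db.execute(
        """CREATE TABLE readings (
               topic   TEXT    NOT NULL,
               sent_at INTEGER NOT NULL,  -- publisher timestamp (assumed to be in the message)
               payload BLOB    NOT NULL,
               PRIMARY KEY (topic, sent_at)
           )"""
    )
    return db


def store(db, topic, sent_at, payload):
    """Insert a reading; return False if the same (topic, sent_at) is already stored."""
    with db:
        cur = db.execute(
            "INSERT OR IGNORE INTO readings (topic, sent_at, payload) VALUES (?, ?, ?)",
            (topic, sent_at, payload),
        )
    return cur.rowcount == 1
```

When the client reconnects and the broker replays a retained message, `store` returns `False` and the table is left unchanged.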
I have a strange problem on a PeopleSoft application. It appears that integration broker messages are being processed out of order. There is another possibility, and that is that the commit is being fired asynchronously, allowing the transactions to complete out of order.
There are many inserts of detail records, followed by a trailer record which performs an update on the rows just inserted. Some of the rows are not receiving the update. This problem is sporadic, about once every 6 months, but it causes statistically significant financial reporting errors.
I am hoping that someone has had enough dealings with the internals of PeopleTools to know what it is up to, so that perhaps I can find a work around to the problem.
You don't mention whether you've set this or not, but you have a choice with Integration Broker. All messages flow through message channels, and a channel can be either ordered or unordered. If a channel is ordered and a message errors, all subsequent messages queue up behind it and will not be processed until it succeeds.
Whether a channel is ordered or not depends on the checkbox in the message channel properties in Application Designer. From memory, channels are ordered by default, but you can uncheck the box to increase throughput.
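The difference can be sketched with a toy model. This is plain Python and has nothing to do with PeopleTools internals; `process` and the message names are made up. It only mirrors the rule described above: in an ordered channel a failed message blocks everything behind it, while an unordered channel carries on.

```python
def process(messages, handler, ordered):
    """Run handler over messages; return (done, blocked).

    handler(msg) returns True on success and False on error.
    """
    done, blocked = [], []
    for i, msg in enumerate(messages):
        if handler(msg):
            done.append(msg)
        elif ordered:
            # Ordered: the failed message and everything after it wait.
            blocked = list(messages[i:])
            break
        else:
            # Unordered: only the failed message waits; later ones proceed.
            blocked.append(msg)
    return done, blocked


batch = ["detail-1", "detail-2", "trailer"]
fails_on_detail_2 = lambda msg: msg != "detail-2"

ordered_result = process(batch, fails_on_detail_2, ordered=True)
unordered_result = process(batch, fails_on_detail_2, ordered=False)
```

In the unordered run the trailer is processed while `detail-2` is still pending, which is one way a trailer update could miss rows that have not been inserted yet.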
Hope this helps.
PS. As of Tools 8.49 the setup changed slightly: Channels became Queues, Messages became Service Operations, etc.
I heard from GSC. We had two domains on the sending end as well as two domains on the receiving end. All were active. According to them, it is possible when you have multiple domains for each of the servers to pick up some of the messages in the group, and therefore, process them asynchronously, rather than truly serially.
We are going to reduce the active servers to one and see if it happens again, but it is so sporadic that we may never know for sure.
A few things changed in Integration Broker in PSFT 9, so please let me know the version of your apps. Async services can now work with sync ones. The message channel properties need to be set properly. I found a similar problem on the www.itwisesolutions.com/PsftTraining.html website, but that one was more about the implementation itself.
thanks