Pub/Sub ordering guarantees - google-cloud-pubsub

I am going to use the Pub/Sub platform for RTDN processing. I am a bit confused about some of Pub/Sub's guarantees (if they exist, they will make my tasks much easier), so I want to ask:
1. Is it possible for a message with an earlier publishTime (e.g. publishTime = 12:00, seen in a request at 12:02) to appear later than a message with a later publishTime (publishTime = 12:01, seen in a request at 12:01)?
2. [Certainly false if (1) is possible] Are messages ordered by publishTime within an API response? If I request two batches (acknowledging all messages in between), is it guaranteed that for every message1 from the first batch and every message2 from the second, message1.publishTime <= message2.publishTime?
UPD: Okay, my question is a bit specific, so I want to ask a more general one as well: is there any attribute that can be used like an offset in Apache Kafka? I'd like to build my application logic around message offsets and be able to commit them sequentially (so that at every moment, for a committed offset, every offset less than it has been processed).

Pub/Sub's ordering guarantees aren't based exactly on timestamp; they are based on "the order in which the servers received the requests." What that means is that in general, yes, messages will be delivered in timestamp order per ordering key. However, there is technically the possibility that a message received by the server later could be considered published at a timestamp earlier than a message that was received earlier, though in practice, it is extremely unlikely.
Therefore, messages returned in a response to the client will be sent in the order in which they were received. In general, yes, the timestamps will be increasing. Keep in mind, though, that because Pub/Sub offers at-least-once delivery, messages can get redelivered, and those redelivered messages could have timestamps earlier than the timestamps of messages already sent.
In general, it's best not to think in terms of offset and commit if using Cloud Pub/Sub's ordering feature (or Pub/Sub in general). Pub/Sub expects messages to be acknowledged individually. When using ordered delivery, acks for later messages are not processed until acks for earlier messages have been. What that means is that if you have messages 1, 2, and 3 with the same ordering key and you receive them all and ack only message 3, the service will not consider message 3 acked until messages 1 and 2 are acked. If acks for 1 and 2 are never received, then messages 1, 2, and 3 will be redelivered.
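To make that concrete, here is a minimal sketch with the Python google-cloud-pubsub client (the project, topic, and subscription names are placeholders, and the subscription is assumed to have been created with message ordering enabled): messages that share an ordering key are delivered in publish order, and each one is acked individually; there is no Kafka-style offset to commit.

```python
from concurrent.futures import TimeoutError
from google.cloud import pubsub_v1

# Publisher side: message ordering must be enabled on the publisher options,
# and messages that must stay ordered share an ordering key.
publisher = pubsub_v1.PublisherClient(
    publisher_options=pubsub_v1.types.PublisherOptions(enable_message_ordering=True)
)
topic_path = publisher.topic_path("my-project", "rtdn-topic")  # placeholder names
publisher.publish(topic_path, b"purchase update #1", ordering_key="purchase-token-42").result()
publisher.publish(topic_path, b"purchase update #2", ordering_key="purchase-token-42").result()

# Subscriber side: the subscription itself must have been created with
# message ordering enabled; each message is acked on its own.
subscriber = pubsub_v1.SubscriberClient()
subscription_path = subscriber.subscription_path("my-project", "rtdn-sub")

def callback(message):
    # Messages with the same ordering key arrive in publish order; acking a later
    # message is not processed until the earlier messages for that key are acked.
    print(message.publish_time, message.ordering_key, message.data)
    message.ack()

with subscriber:
    streaming_pull_future = subscriber.subscribe(subscription_path, callback=callback)
    try:
        streaming_pull_future.result(timeout=30)  # pull for ~30 seconds in this sketch
    except TimeoutError:
        streaming_pull_future.cancel()
```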

Related

Is it possible for a DynamoDB read to return state that is older than the state returned by a previous read?

Let's say there is a DynamoDB key with a value of 0, and there is a process that repeatedly reads from this key using eventually consistent reads. While these reads are occurring, a second process sets the value of that key to 1.
Is it possible for the read process to ever read a 0 after it first reads a 1? Is it possible in DynamoDB's eventual consistency model for a client to successfully read a key's fully up-to-date value, but then read a stale value on a subsequent request?
Eventually, the write will be fully propagated and the read process will only read 1 values, but I'm unsure whether it's possible for the reads to go 'backward in time' while the propagation is occurring.
The property you are looking for is known as monotonic reads, see for example the definition in https://jepsen.io/consistency/models/monotonic-reads.
Obviously, DynamoDB's strongly consistent read (ConsistentRead=true) is also monotonic, but you rightly asked about DynamoDB's eventually consistent read mode.
@Charles in his response gave a link, https://www.youtube.com/watch?v=yvBR71D0nAQ&t=706s, to a nice official talk by Amazon on how eventually consistent reads work. The talk explains that DynamoDB replicates written data to three copies, but a write completes when two out of the three copies (including one designated as the "leader") have been updated. It is possible that the third copy will take some time (usually a very short time) to get updated.
The video goes on to explain that an eventually consistent read goes to one of the three replicas at random.
So in that short window where the third replica has old data, a request might randomly go to one of the updated nodes and return new data, and then another request slightly later might randomly go to the not-yet-updated replica and return old data. This means that the "monotonic read" guarantee is not provided.
To summarize, I believe that DynamoDB does not provide the monotonic read guarantee if you use eventually consistent reads. You can use strongly-consistent reads to get it, of course.
Unfortunately I can't find an official document which claims this. It would also be nice to test this in practice, similar to how the paper http://www.aifb.kit.edu/images/1/17/How_soon_is_eventual.pdf tested whether Amazon S3 (not DynamoDB) guaranteed monotonic reads, and discovered that it did not by actually observing monotonic-read violations.
One of the implementation details which may make it hard to see these monotonic-read violations in practice is how Amazon handles requests from the same process (which you said is your case). When the same process sends several requests in sequence, it may (but also may not...) use the same HTTP connection to do so, and Amazon's internal load balancers may (but also may not) decide to send those requests to the same backend replica, despite the statement in the video that each request is sent to a random replica. If this happens, it may be hard to see monotonic-read violations in practice, but they may still happen if the load balancer changes its mind, or the client library opens another connection, and so on, so you still can't trust the monotonic read property to hold.
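If you do want to try such an experiment, a rough boto3 sketch might look like the following (the table name, key, and attribute are hypothetical, and the reads use get_item's default eventually consistent mode). Per the caveat above, connection stickiness may prevent you from ever observing a violation even though the guarantee is absent.

```python
import boto3

table = boto3.resource("dynamodb").Table("demo")      # hypothetical table with key attribute "pk"

table.put_item(Item={"pk": "counter", "val": 0})      # the old value
table.put_item(Item={"pk": "counter", "val": 1})      # the new value

seen_new = False
for _ in range(100_000):
    # Eventually consistent read (the default): may be served by the lagging replica.
    val = table.get_item(Key={"pk": "counter"})["Item"]["val"]
    if val == 1:
        seen_new = True
    elif seen_new:
        print("monotonic-read violation: read 0 after having read 1")
        break
```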
Yes it is possible. Requests are stateless so a second read from the same client is just as likely as any other request to see slightly stale data. If that’s an issue, choose strong consistency.
You will (probably) never get the old data after getting the new data.
First off, there's no warning in the docs about repeated reads returning stale data, just that a read after a write may return stale data.
Eventually Consistent Reads
When you read data from a DynamoDB table, the response might not reflect the results of a recently completed write operation. The response might include some stale data. If you repeat your read request after a short time, the response should return the latest data.
But more importantly, every item in DDB is stored in three storage nodes. A write to DDB doesn't return a 200 - Success until that data is written to 2 of 3 storage nodes. Thus, it's only if your read is serviced by the third node that you'd see stale data. Once that third node is updated, every node has the latest.
See Amazon DynamoDB Under the Hood
EDIT
@Nadav's answer points out that it's at least theoretically possible; AWS certainly doesn't seem to guarantee monotonic reads. But I believe the reality depends on your application architecture.
Most languages, nodejs being an exception, will use persistent HTTP/HTTPS connections by default to the DDB request router, especially given how expensive it is to open a TLS connection. I suspect, though I can't find any documents confirming it, that there's at least some level of stickiness from the request router to a storage node. @Nadav discusses this possibility. But only AWS knows for sure, and they haven't said.
Assuming that belief is correct:
- curl in a shell script loop: more likely to see the old data again
- a loop in C# using a single connection: less likely
The other thing to consider is that in the normal course of things, the third storage node is "only milliseconds behind".
Ironically, if the request router truly picks a storage node at random, a non-persistent connection is then less likely to see old data again given the extra time it takes to establish the connection.
If you absolutely need monotonic reads, then you'd need to use strongly consistent reads.
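For reference, that is a single flag on the read in boto3 (the table and key here are placeholders); a strongly consistent read costs twice the read capacity and adds a little latency:

```python
import boto3

table = boto3.resource("dynamodb").Table("demo")                       # hypothetical table

eventual = table.get_item(Key={"pk": "counter"})                       # default: eventually consistent
strong = table.get_item(Key={"pk": "counter"}, ConsistentRead=True)    # reflects all prior successful writes
```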
Another option might be to stick DynamoDB Accelerator (DAX) in front of your DDB, especially if you're retrieving the key with GetItem(). As I read how it works, it does seem to imply monotonic reads, especially if you've written through DAX, though it does not come right out and say so. Even if you've written around DAX, reading from it should still be monotonic; it's just that there will be more latency until you start seeing the new data.

Dealing with poison kafka messages in flink while keeping the order per partition

As far as I can tell, when deserializing objects using KafkaDeserializationSchema[T], my 3 options are to return T, return null (ignore the record) or throw an exception (shut down the task manager) [from: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#the-deserializationschema]. I have a requirement to stop processing subsequent messages on the topic where a poison message fails deserialization, but only until a human intervenes and makes a decision whether to ignore the message or replace it with a corrected one.
Has anyone had to deal with a similar requirement?
I was thinking about introducing a separate process function for converting an array of bytes to T, connecting a broadcast stream to it, and reacting to commands from a human operator in all instances of that operator. The problem here is that I can't figure out a way to pause reading from Kafka while the system waits for a human to make a decision. I could throw exceptions and restart indefinitely, or I could keep reading from the topic and hold the incoming messages in state, but I'm worried about additional CPU usage and ballooning state for options 1 and 2 respectively.
Any thoughts anyone? Thanks!

Cassandra write consistency level ALL clarification

According to Datastax documentation for Cassandra:
"If the coordinator cannot write to enough replicas to meet the requested consistency level, it throws an Unavailable Exception and does not perform any writes."
Does this mean that while the write is in process, the data updated by the write will not be available to read requests? I mean it is possible that 4/5 nodes have successfully sent a SUCCESS to the coordinator, meaning that their data have been updated, but the 5th one is yet to do the write. Now if a read request comes in and goes to one of these 4 nodes, will it still show the old data until the coordinator receives a confirmation from the 5th node and marks the new data valid?
If the coordinator knows that it cannot possibly achieve consistency before it attempts the write, then it will fail the request immediately before doing the write. (This is described in the quote given)
However, if the coordinator believes that there are enough nodes to achieve its configured consistency level at the time of the attempt, it will start to send its data to its peers. If one of the peers does not return a success, the request will fail and you will get into a state where the nodes that fail have the old data and the ones that passed have the new data.
If a read request comes in, it will return the data it finds on the nodes it reaches, no matter whether that data is old or new.
Let us take your example to demonstrate.
If you have 5 nodes and a replication factor of 3, then 3 of those 5 nodes will have the write that you sent. However, suppose one of the three nodes returned a failure to the coordinator. Now if you read with consistency level ALL, you will read all three nodes and will always get the new write (the latest timestamp always wins).
However, if you read with consistency level ONE, there is a 1/3 chance you will get the old value.
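A short sketch of this scenario with the Python cassandra-driver (the contact point, keyspace, and table are placeholders, and the keyspace is assumed to have a replication factor of 3):

```python
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["10.0.0.1"]).connect("my_keyspace")   # placeholder contact point / keyspace

# Write at ALL: the coordinator only reports success if all 3 replicas acknowledge;
# if one fails, the request errors out, but replicas that applied it keep the new value.
write_all = SimpleStatement("UPDATE users SET name = %s WHERE id = %s",
                            consistency_level=ConsistencyLevel.ALL)
session.execute(write_all, ("new-name", 1))

# Read at ALL: all 3 replicas are consulted and the latest timestamp wins -> new value.
read_all = SimpleStatement("SELECT name FROM users WHERE id = %s",
                           consistency_level=ConsistencyLevel.ALL)

# Read at ONE: a single replica answers, so there is a 1-in-3 chance of the old value.
read_one = SimpleStatement("SELECT name FROM users WHERE id = %s",
                           consistency_level=ConsistencyLevel.ONE)

print(session.execute(read_all, (1,)).one())
print(session.execute(read_one, (1,)).one())
```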

Service Broker queue rollback on receive

As stated here:
Service Broker provides automatic poison message detection. Automatic poison message detection sets the queue status to OFF if a transaction that receives messages from the queue rolls back five times. This feature provides a safeguard against catastrophic failures that an application cannot detect programmatically.
I have a Windows service application that polls the SB queue and sends the messages to a web service endpoint. Since I need to handle any "server goes off" issues (i.e. get the message back onto the queue), I include the "queue item receiving" and "queue item sending" methods in the same transaction. On the very first exception (HttpRequestException), I start pinging the server for a predefined timeout, then continue or close the program.
However, rolling back five times is a problem. I understand that, whatever the time gap between 5 consecutive rollbacks, the rollback count always increments globally, so the queue will eventually be disabled. Am I right about this? Does the queue have a timeout for zeroing the rollback count?
If this is the behaviour, is it better to exclude the "queue item sending" method from the transaction? If I do this, I should follow the approach of keeping the message in another resource (table, file) on exception, to be sent later, or some other alternative...
What about using tables as queues, to keep my transaction united and be free of SB's rollback issue? Would that be as reliable as SB?
AFAIK, 5 consecutive rollbacks of the same message on a queue with POISON_MESSAGE_HANDLING = ON will disable the queue regardless of the time gap.
Have you considered simply turning off poison message handling for the queue? The onus would then be on your application to distinguish between a true poison message (one that can never be successfully processed) versus a problem with an external service dependency. In the first case, you could log the problem message elsewhere and commit instead of rollback.
There are other patterns one could use, such as re-queuing the message and committing but much depends on whether messages must be processed in order.
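To illustrate the "log the true poison message and commit" idea, here is a rough Python/pyodbc sketch; the queue, dead-letter table, endpoint URL, connection string, and the 4xx-means-poison classification are all assumptions for the example, not anything prescribed by Service Broker. The receive and the HTTP call share one transaction, rollback is reserved for transient outages, and messages the endpoint rejects outright are parked in a side table and committed so the rollback counter never reaches five.

```python
import pyodbc
import requests

# Placeholder connection string, queue, and table names.
conn = pyodbc.connect(
    "DRIVER={ODBC Driver 18 for SQL Server};SERVER=.;DATABASE=Broker;Trusted_Connection=yes;",
    autocommit=False,
)
cur = conn.cursor()

# Optional: hand poison detection over to the application instead of the queue.
# cur.execute("ALTER QUEUE dbo.TargetQueue WITH POISON_MESSAGE_HANDLING (STATUS = OFF);")

# Receive one message inside the same transaction as the web-service call.
cur.execute("""
    WAITFOR (
        RECEIVE TOP (1) conversation_handle, message_type_name, message_body
        FROM dbo.TargetQueue
    ), TIMEOUT 5000;
""")
row = cur.fetchone()
if row is not None:
    conversation_handle, message_type, body = row
    try:
        resp = requests.post("https://example.invalid/endpoint", data=body, timeout=10)
        if resp.status_code < 400:
            conn.commit()       # processed: the message leaves the queue for good
        elif resp.status_code < 500:
            # The endpoint rejected this particular message: treat it as poison --
            # park it in a side table and commit so the rollback counter never trips.
            cur.execute("INSERT INTO dbo.DeadLetter (message_body) VALUES (?);", body)
            conn.commit()
        else:
            conn.rollback()     # server-side failure: retry later
    except requests.RequestException:
        # Unreachable / timed out: roll back so the message returns to the queue.
        # With POISON_MESSAGE_HANDLING = ON, five consecutive rollbacks disable the queue.
        conn.rollback()
```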

Is it reasonable to assume that all HTTP headers (of a single HTTP message) arrive in the same SSL/TLS record?

The man page of openssl's SSL_read() call states:
SSL_read() works based on the SSL/TLS records. The data are received in records (with a maximum record size of 16kB for SSLv3/TLSv1). Only when a record has been completely received, it can be processed (decryption and check of integrity). Therefore data that was not retrieved at the last call of SSL_read() can still be buffered inside the SSL layer and will be retrieved on the next call to SSL_read().
Given that:
- HTTP headers of a single outgoing message can always be sent in one go
- a single SSL/TLS record can apparently hold 16KB of data, which should be enough for everyone (or at least any non-perverse HTTP request)
Browsers have little reason to divide the headers into multiple SSL records, right? Or are there browsers out there that are so aggressive with regard to latency that they will chop up even these kinds of small payloads into multiple records?
I'm asking this because it would be nice to be able to parse an entire set of HTTP headers from a single read buffer filled by a single successful SSL_read() call. If that means denying the odd few (e.g. if only 0.0000X% of all requests), that might be worth it to me.
edit: Alexei Levenkov made the valid point that cookies can be really long. But let's consider then the scenario that cookies are never set or expected by this particular server.
edit2: This question was a little premature. I've meanwhile written code that stores per client state efficiently enough to accept an arbitrary number of SSL records while parsing without incurring a performance penalty of any significance. Prior to doing so I was wondering if I could take a shortcut, but general consensus seems to be that I better play by the book. Point conceded.
No, cookies alone could be well above 16K.
According to RFC 2109 and the IE implementation of cookie limits, the recommended minimum supported cookie size per domain is 80K (20 cookies of 4096 bytes each).
RFC 2109, section "6.3 Implementation Limits":
- at least 4096 bytes per cookie (as measured by the size of the characters that comprise the cookie non-terminal in the syntax description of the Set-Cookie header)
- at least 20 cookies per unique host or domain name
You cannot make that assumption. Just like recv() can return fewer bytes than requested, so can SSL_read(). It is your responsibility to buffer inbound data, reading in however many records it takes, and then process the buffered data only when you have finished receiving everything you are expecting to receive. You should not be processing the HTTP header until you have received the <CRLF><CRLF> sequence at the end of the headers.
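A minimal sketch of that buffering loop using Python's ssl module (which wraps SSL_read underneath); it is written as a client reading response headers from a placeholder host, but the same accumulate-until-CRLF-CRLF loop applies to a server parsing request headers:

```python
import socket
import ssl

ctx = ssl.create_default_context()
with socket.create_connection(("example.com", 443)) as raw:            # placeholder host
    with ctx.wrap_socket(raw, server_hostname="example.com") as tls:
        tls.sendall(b"GET / HTTP/1.1\r\nHost: example.com\r\nConnection: close\r\n\r\n")

        buf = bytearray()
        while b"\r\n\r\n" not in buf:      # keep reading until the header terminator arrives
            chunk = tls.recv(16384)        # may be one record, part of one, or span several
            if not chunk:                  # peer closed before the headers completed
                break
            buf += chunk

        headers = bytes(buf).split(b"\r\n\r\n", 1)[0]
        print(headers.decode("latin-1"))
```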
Perhaps a more careful statement would be that the assumption is wrong, but that most of the time, in the average scenario, you would get the behavior you want. I don't know the answer for sure, but experiments could be done to find out.
I'm all for hackers ignoring the rules and abusing protocols for learning, fun, and profit, but otherwise I can see no benefit for restricting yourself to reading a single SSL record. If you wouldn't mind elaborating I am curious.
Browsers have little reason to divide the headers into multiple SSL records, right? Or are there browsers out there that are so aggressive with regard to latency that they will chop up even these kinds of small payloads into multiple records?
There can be some reasons that lead to splitting the data into multiple records. One of them is the mitigation against the BEAST attack.
You can't generally make the assumption that the data won't be split into multiple records.
I'm asking this because it would be nice to be able to parse an entire set of HTTP headers from a single read buffer filled by a single successful SSL_read() call. If that means denying the odd few (e.g. if only 0.0000X% of all requests), that might be worth it to me.
This is not just an SSL/TLS issue, but this would also apply to TCP: not treating the incoming data as a stream will simply lead to bugs and bad code.
