As stated here:
Service Broker provides automatic poison message detection. Automatic poison message detection sets the queue status to OFF if a transaction that receives messages from the queue rolls back five times. This feature provides a safeguard against catastrophic failures that an application cannot detect programmatically.
I have a Windows service application that polls an SB queue and sends the messages to a web service endpoint. Since I have to handle "server goes down" situations by returning the message to the queue, I include the "queue item receiving" and "queue item sending" methods in the same transaction. On the very first exception (HttpRequestException), I start pinging the server for a predefined timeout, then continue or close the program.
However, rolling back five times is a problem. As I understand it, whatever the time gap between 5 consecutive rollbacks, the rollback count is always incremented globally, so the queue will eventually be disabled. Am I right about this? Does the queue have a timeout for resetting the rollback count?
If this is the behaviour, is it better to exclude the "queue item sending" method from the transaction? If I do this, then on an exception I would have to keep the message in another resource (table, file) to be sent later, or use some other alternative.
What about using tables as queues to keep my transaction united and be freed from SB's rollback issue? Would it be as reliable as SB?
AFAIK, 5 consecutive rollbacks of the same message on a queue with POISON_MESSAGE_HANDLING = ON will disable the queue regardless of the time gap.
Have you considered simply turning off poison message handling for the queue? The onus would then be on your application to distinguish between a true poison message (one that can never be successfully processed) versus a problem with an external service dependency. In the first case, you could log the problem message elsewhere and commit instead of rollback.
There are other patterns one could use, such as re-queuing the message and committing but much depends on whether messages must be processed in order.
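A minimal sketch of that application-side decision, written in Python rather than T-SQL and with a plain callable standing in for the web service call. The names TransientError, PoisonError and handle_message are hypothetical and not part of Service Broker or any client library; the sketch only shows when to commit and when to roll back once poison message handling is turned off.

```python
class TransientError(Exception):
    """The external dependency is unavailable; the message itself is fine."""


class PoisonError(Exception):
    """The message can never be processed successfully."""


def handle_message(message, send, dead_letters):
    """Return 'commit' or 'rollback' for the transaction that received `message`.

    send         -- callable that delivers the message (hypothetical stand-in
                    for the web service call)
    dead_letters -- list standing in for a table where bad messages are logged
    """
    try:
        send(message)
        return "commit"
    except PoisonError as exc:
        # True poison message: log it elsewhere and commit so it leaves the queue.
        dead_letters.append((message, str(exc)))
        return "commit"
    except TransientError:
        # Dependency problem: put the message back and let the caller wait
        # for the endpoint to recover before receiving again.
        return "rollback"


def demo_send(message):
    # Hypothetical endpoint behaviour used only for the demonstration.
    if message == "bad":
        raise PoisonError("malformed payload")
    if message == "later":
        raise TransientError("endpoint down")


dead = []
outcomes = [handle_message(m, demo_send, dead) for m in ["ok", "bad", "later"]]
```

Only the transient case rolls back, so the application decides how long to wait before retrying instead of relying on the five-rollback limit.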
As far as I can tell, when deserializing objects using KafkaDeserializationSchema[T], my 3 options are to return T, return null (ignore the record) or throw an exception (shut down the task manager) [from: https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#the-deserializationschema]. I have a requirement to stop processing subsequent messages on the topic where a poison message fails deserialization, but only until a human intervenes and makes a decision whether to ignore the message or replace it with a corrected one.
Has anyone had to deal with a similar requirement?
I was thinking about introducing a separate process function for converting an array of bytes to T, connecting a broadcast stream to it, and reacting to commands from a human operator in all instances of that operator. The problem here is that I can't figure out a way to pause reading from Kafka while the system waits for a human to make a decision. I could throw exceptions and restart indefinitely, or I could keep reading from the topic and hold the incoming messages in state, but I'm worried about additional CPU usage and ballooning state for options 1 and 2 respectively.
Any thoughts anyone? Thanks!
I'm new to Aerospike.
I would like to ask about the possible timeout scenarios, as stated in this link:
https://discuss.aerospike.com/t/understanding-timeout-and-retry-policies/2852
1. Client can’t connect by specified timeout (timeout=). Timeout of zero means that there is no timeout set.
2. Client does not receive response by specified timeout (timeout=).
3. Server times out the transaction during its own processing (default of 1 second if client doesn’t specify timeout). To investigate this, confirm that the server transaction latencies are not the bottleneck.
4. Client times out after M iterations of retries when there was no error due to a failed node or a failed connection.
5. Client can’t obtain a valid node after N retries (where retries are set from your client).
6. Client can’t obtain a valid connection after X retries. The retry count is usually the limiting factor, not the timeout value. The reasoning is that if you can’t get a connection after R retries, you never will, so just timeout early.
Of all the timeout scenarios mentioned, under which circumstances can I be absolutely certain that the final result of the transaction is FAILED?
Does Aerospike offer anything, e.g. a way to roll back the transaction if the client does not respond?
In the worst case, if I couldn't be certain about the final result, how would I be able to know for certain the final state of the transaction?
Many thanks in advance.
Edit:
We came up with a temporary solution:
Keep a map of [generation -> value read] for that record (maybe a background thread constantly reading the record etc.) and then on timeouts, we would periodically check the map (key = the generation expected) to see if the true written value is actually the one put to the map. If they are the same, it means the write succeeded, otherwise it means the write failed.
Do you guys think it's necessary to do this? Or is there another way?
First, timeouts are not the only error you should be concerned with. Newer clients have an 'inDoubt' flag associated with errors that will indicate that the write may or may not have applied.
There isn't a built-in way of resolving an in-doubt transaction to a definitive answer, and if the network is partitioned, there isn't a way in AP to rigorously resolve in-doubt transactions. Rigorous methods do exist for 'Strong Consistency' mode; the same methods can be used to handle common AP scenarios, but they will fail under partition.
The method I have used is as follows:
1. Each record will need a list bin; the list bin will contain the last N transaction ids.
For my use case, I gave each client a unique 2-byte identifier, each client thread a unique 2-byte identifier, and each client thread a 4-byte counter. So a particular transaction-id would be an 8-byte identifier built from the two ids and the counter.
2. Read the record's metadata with the getHeader API - this avoids reading the record's bins from storage.
Note - my use case wasn't an increment, so I actually had to read the record and write with a generation check. This pattern should be more efficient for a counter use case.
3. Write the record using operate and gen-equal to the read generation with these operations: increment the integer bin, prepend to the list of txns, and trim the list of txns. You will prepend your transaction-id to your txns list and then trim the list to the max size you selected.
N needs to be large enough that a client can be sure to have enough time to verify its transaction given the contention on the key. N will affect the stored size of the record, so choosing too big a value will cost disk resources and choosing too small a value will render the algorithm ineffective.
4. If the transaction is successful then you are done.
5. If the transaction is 'inDoubt' then read the key and check the txns list for your transaction-id. If present then your transaction 'definitely succeeded'.
6. If your transaction-id isn't in txns, repeat step 3 with the generation returned from the read in step 5.
Return to step 3 - with the exception that on step 5 a 'generation error' would also need to be considered 'in-doubt' since it may have been the previous attempt that finally applied.
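As an illustration of the transaction-id and the prepend-and-trim list described above, here is a small Python sketch. The bit layout (client id in the top 2 bytes, thread id next, counter in the low 4 bytes) and the function names are assumptions made for the example; the answer only states the field sizes.

```python
def make_txn_id(client_id, thread_id, counter):
    """Pack a 2-byte client id, a 2-byte thread id and a 4-byte counter
    into one 8-byte integer (layout is an assumption for illustration)."""
    if not 0 <= client_id < 2**16:
        raise ValueError("client_id must fit in 2 bytes")
    if not 0 <= thread_id < 2**16:
        raise ValueError("thread_id must fit in 2 bytes")
    if not 0 <= counter < 2**32:
        raise ValueError("counter must fit in 4 bytes")
    return (client_id << 48) | (thread_id << 32) | counter


def split_txn_id(txn_id):
    """Inverse of make_txn_id: return (client_id, thread_id, counter)."""
    return (txn_id >> 48) & 0xFFFF, (txn_id >> 32) & 0xFFFF, txn_id & 0xFFFFFFFF


def prepend_and_trim(txns, txn_id, max_size):
    """Return the txns list with txn_id at the front, keeping the newest max_size ids."""
    return ([txn_id] + list(txns))[:max_size]


txn = make_txn_id(1, 2, 3)
history = prepend_and_trim([30, 20, 10], txn, max_size=3)
```

In the real record the prepend and trim would be list operations sent in the same operate call as the increment, so all three apply atomically.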
Also consider that reading the record in step 5 and not finding the transaction-id in txns does not ensure that the transaction 'definitely failed'. If you wanted to leave the record unchanged but have a 'definitely failed' semantic, you would need to have observed the generation move past the previous write's gen-check policy. If it hasn't, you could replace the operation in step 6 with a touch - if it succeeds then the initial write 'definitely failed', and if you get a generation-error you will need to check whether you raced the application of the initial transaction: the initial write may now have 'definitely succeeded'.
Again, with 'Strong Consistency' the mentions of 'definitely succeeded' and 'definitely failed' are accurate statements, but in AP these statements have failure modes (especially around network partitions).
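The retry loop can be sketched against an in-memory stand-in for a record. FakeRecord, InDoubt, GenerationError and the scripted faults are all hypothetical and exist only for this simulation; they are not the Aerospike client API. The sketch shows the shape of the check: after any uncertain outcome, read the record, look for the transaction-id, and only retry the write if it is absent.

```python
class InDoubt(Exception):
    """The write may or may not have been applied."""


class GenerationError(Exception):
    """The record's generation no longer matches the expected one."""


class FakeRecord:
    """In-memory stand-in for one record with a counter bin and a txns list bin."""

    def __init__(self, max_txns=4):
        self.generation = 1
        self.counter = 0
        self.txns = []
        self.max_txns = max_txns
        self.faults = []  # scripted faults: "dropped" or "lost_reply"

    def read(self):
        return self.generation, list(self.txns)

    def write(self, expected_gen, txn_id, delta):
        fault = self.faults.pop(0) if self.faults else None
        if fault == "dropped":
            raise InDoubt()  # request never applied; the client cannot tell
        if expected_gen != self.generation:
            raise GenerationError()
        self.counter += delta
        self.txns = ([txn_id] + self.txns)[: self.max_txns]
        self.generation += 1
        if fault == "lost_reply":
            raise InDoubt()  # applied, but the reply never arrived


def increment_once(record, txn_id, delta, max_attempts=5):
    """Apply the increment at most once; return True when known to be applied."""
    gen, _ = record.read()
    for _ in range(max_attempts):
        try:
            record.write(gen, txn_id, delta)
            return True
        except (InDoubt, GenerationError):
            # Either outcome is uncertain: an earlier attempt may have applied.
            gen, txns = record.read()
            if txn_id in txns:
                return True
    return False  # still unresolved after max_attempts


rec = FakeRecord()
rec.faults = ["dropped", "lost_reply"]
applied = increment_once(rec, txn_id=42, delta=1)
```

As the answer notes, this reasoning only holds rigorously in Strong Consistency mode; the simulation has no notion of partitions.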
Recent clients will provide an extra flag on timeouts, called "in doubt". If false, you are certain the transaction did not succeed (client couldn't even connect to the node so it couldn't have sent the transaction). If true, then there is still an uncertainty as the client would have sent the transaction but wouldn't know if it had reached the cluster or not.
You may also consider looking at Aerospike's Strong Consistency feature which could help your use case.
I am using an F4 instance (because of memory needs) with automatic scheduling to do some background processing. It is run from a task queue. It takes 40s to 60s to complete each invocation. Because of the high memory needs, each instance should only handle one request at a time.
The action that needs to be done is not urgent. If it doesn't get scheduled for 30 minutes, that isn't a problem. Even 60 minutes is acceptable, and I'd rather make use of that time than spin up more instances. However, if the service gets popular and is getting more than 60 requests an hour, I want to spin up more instances to make sure there isn't more than a 60 minute wait.
I am having trouble figuring out how to configure the instance and queue parameters to keep my costs down but be able to scale in that way. My initial thought was something like this:
<queue>
<name>non-urgent-queue</name>
<target>slow-service</target>
<rate>1/m</rate>
<bucket-size>1</bucket-size>
<max-concurrent-requests>1</max-concurrent-requests>
</queue>
<automatic-scaling>
<min-idle-instances>0</min-idle-instances>
<max-idle-instances>0</max-idle-instances>
<min-pending-latency>20m</min-pending-latency>
<max-pending-latency>1h</max-pending-latency>
<max-concurrent-requests>1</max-concurrent-requests>
</automatic-scaling>
First of all those latency settings are invalid, but I can't find documentation on the valid range or units. Can anyone direct me to that info?
Secondly, if I understand the queue settings correctly, this configuration would limit it to 60 invocations an hour getting to the service, even if the task queue had 60+ jobs waiting.
Thanks for your help!
Indeed, throttling at the queue level basically defeats the ability to scale when needed. So you can't use the <rate> in the queue configuration at the value you have right now; you need to use the value matching the maximum rate you're willing to accept (with your max number of instances running simultaneously):
the max rate of requests that can go through the queue being limited at 1/min means you can't scale above 60/h
the <bucket-size> set at 1 means no peaks above the rate can be handled (as soon as one task starts the token bucket empties).
the <max-concurrent-requests> set at 1 will basically prevent multiple instances dealing simultaneously with the queued workload. They may be started by the autoscaler because of the request latencies, but they won't be able to help since only one queue task can be handled at a time.
In the <automatic-scaling> section the <max-concurrent-requests> set to 1 is good - this ensures no instance handles more than 1 request at a time - which is what you want.
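To make the effect of <rate> and <bucket-size> concrete, here is a simplified token bucket model in Python. It steps in whole minutes and ignores <max-concurrent-requests> and task duration, so it is only an approximation of how the queue behaves; the function name and parameters are made up for the example.

```python
def tasks_dispatched(minutes, rate_per_min, bucket_size, backlog):
    """Count how many tasks a token bucket releases in `minutes`,
    starting with a full bucket and `backlog` tasks already waiting."""
    tokens = bucket_size
    sent = 0
    for _ in range(minutes):
        # Release as many waiting tasks as there are tokens.
        burst = min(tokens, backlog - sent)
        sent += burst
        tokens -= burst
        # Refill at the configured rate, never above the bucket size.
        tokens = min(bucket_size, tokens + rate_per_min)
    return sent


# Settings from the question: 1/m with a bucket of 1, and 500 tasks waiting.
per_hour_as_configured = tasks_dispatched(60, 1, 1, 500)
# A larger bucket only helps with the initial burst.
per_hour_bigger_bucket = tasks_dispatched(60, 1, 5, 500)
# Raising the rate is what lifts the hourly ceiling.
per_hour_higher_rate = tasks_dispatched(60, 6, 6, 500)
```

In this model the original settings release 60 tasks in an hour regardless of backlog, a bucket of 5 releases 64, and a rate of 6/m releases 360.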
The bad news is that the max values for the latencies appear to be 15s. At least when using the app.yaml config for python (but I think it's unlikely for that to differ across language sandboxes):
Error 400: --- begin server output ---
automatic_scaling.min_pending_latency (30s), must be in the range [0.010000s,15.000000s].
--- end server output ---
and
Error 400: --- begin server output ---
automatic_scaling.max_pending_latency (60s), must be in the range [0.010000s,15.000000s].
--- end server output ---
Which probably also explains why your 20m and 1h values aren't accepted - I used 30s and 60s and got the above errors.
This means you won't be able to use the autoscaling parameters to tune such a slow-moving processing like you desire.
The only alternative I can think of is to have 2 queues:
a fast one feeding just trigger tasks for the slow-service jobs, which your service intercepts and saves in the datastore. This could be performed by some faster service (you don't want these stuck behind a slow-service job execution, as that can cause unnecessary instance launching). Maybe, depending on the rest of your implementation, you can replace this queue completely with just storing the job info in the datastore instead of enqueuing tasks in the fast queue.
a slow one for the actual slow-service job execution tasks
You'd also have a cron job executing once a minute, checking how many triggers are pending in the datastore, deciding how much to scale, and enqueuing the corresponding number of slow-service job tasks in the slow queue. The autoscaler would simply bring up the corresponding number of instances (if needed). Low latency autoscaling configs would be desirable in this case - you already decided how you want your app to scale.
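The decision the cron job makes could look roughly like the following Python sketch. Everything here is an assumption for illustration: the function name, the 60 second job time, the 60 minute acceptable wait and the instance cap. Reading the pending count from the datastore and enqueuing the tasks are left out.

```python
import math


def tasks_to_enqueue(pending, in_flight, job_seconds=60,
                     max_wait_minutes=60, max_instances=10):
    """Decide how many slow-service tasks to enqueue on this cron run.

    pending   -- triggers stored in the datastore and not yet handed out
    in_flight -- slow-service tasks already enqueued or running
    """
    # Jobs one instance can finish within the acceptable wait.
    jobs_per_instance = (max_wait_minutes * 60) // job_seconds
    # Instances needed to drain the backlog in time, capped for cost.
    wanted = min(max_instances, math.ceil(pending / jobs_per_instance))
    # Never run more workers than there are jobs, and account for running ones.
    wanted = min(wanted, pending)
    return max(0, wanted - in_flight)


light_load = tasks_to_enqueue(pending=30, in_flight=0)
heavy_load = tasks_to_enqueue(pending=150, in_flight=1)
capped = tasks_to_enqueue(pending=10000, in_flight=10)
```

With these assumed numbers, a backlog of 30 keeps a single worker busy, 150 pending jobs with one already running adds two more, and a very large backlog stops at the cap.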
This is how I ended up doing it. I use a slow queue and a fast queue configured like this:
<queue>
<name>slow-queue</name>
<target>pdf-service</target>
<rate>2/m</rate>
<bucket-size>1</bucket-size>
<max-concurrent-requests>1</max-concurrent-requests>
</queue>
<queue>
<name>fast-queue</name>
<target>pdf-service</target>
<rate>10/m</rate>
<bucket-size>1</bucket-size>
<max-concurrent-requests>5</max-concurrent-requests>
</queue>
The max-concurrent-requests in the slow queue ensures only one task will run at a time, so there will only be one instance active.
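Before I post to the slow queue I check to see how many items are already on the queue. The result may not be totally reliable, but for my purposes it is sufficient. In Java:
import com.google.appengine.api.taskqueue.Queue;
import com.google.appengine.api.taskqueue.QueueFactory;
import com.google.appengine.api.taskqueue.QueueStatistics;
Queue queue = QueueFactory.getQueue("slow-queue");
// fetchStatistics() returns an estimate, so the count may lag slightly
QueueStatistics queueStats = queue.fetchStatistics();
if (queueStats.getNumTasks() < 30) {
    // post to slow queue
} else {
    // post to fast queue
}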
So when my slow queue gets too full, I post to the fast queue which allows concurrent requests.
The instance is configured like this:
<automatic-scaling>
<min-idle-instances>0</min-idle-instances>
<max-idle-instances>automatic</max-idle-instances>
<min-pending-latency>15s</min-pending-latency>
<max-pending-latency>15s</max-pending-latency>
<max-concurrent-requests>1</max-concurrent-requests>
</automatic-scaling>
So it will create new instances as slowly as possible (15s is the max latency) and make sure only one process runs on an instance at a time.
With this configuration I'll have a max of 6 instances at a time but that should do about 500/hr. I could increase the rate and concurrent requests to do more.
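The estimate can be checked with a little arithmetic. The sketch below takes the smaller of the concurrency limit and the rate limit for each queue, using the 40s to 60s task time from the question. The function name is hypothetical, and the model ignores instance start-up time.

```python
def hourly_capacity(concurrent_slots, seconds_per_task, rate_per_min):
    """Upper bound on tasks per hour for one queue."""
    by_concurrency = concurrent_slots * (3600 // seconds_per_task)
    by_rate = rate_per_min * 60
    return min(by_concurrency, by_rate)


# slow-queue: 1 concurrent request at 2/m; fast-queue: 5 concurrent at 10/m.
best_case = hourly_capacity(1, 40, 2) + hourly_capacity(5, 40, 10)
worst_case = hourly_capacity(1, 60, 2) + hourly_capacity(5, 60, 10)
```

Under these assumptions the six instances handle somewhere between 360 and 540 tasks an hour, and concurrency rather than the queue rate is the limiting factor in both cases.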
The negative of this solution is an element of unfairness. Under heavy load, some tasks will be stuck in the slow queue while others will get processed more quickly in the fast queue.
Because of that, I have decreased the max items on the slow queue to 13 so the unfairness won't be so extreme, maybe a 10 minute wait for jobs that go to the slow queue when it is full.
This sounds more like a design issue to me.
Scenario -
I have an embedded system with multiple threads -
One of the threads is xxx, a networking protocol that talks to the neighbour router: the producer.
Another thread is xxx-TE, the traffic engineering xxx protocol: the consumer.
They communicate with each other via a message queue. So, basically, the producer puts the data in the xxx-TE queue for the xxx-TE thread.
Problem -
When we have a lot of nodes, or in simple words a lot of routing information from xxx, messages put in the xxx-TE queue are lost.
Solution -
Is this solution correct?
Should we increase the queue depth so that messages are not lost?
[symptoms] - We see errors while pushing messages into the message queue.
Probably not.
Generally speaking, message queues should stay empty, or close to empty, as much of the time as possible. If your queue is not usually empty, you need to improve the speed at which messages are being processed.
Increasing the size of the queue is generally not a solution; if the queue is being filled faster than it is being emptied, it will always end up full in the end; increasing the size will only make it take slightly longer to fill up.
(An exception is if messages are being produced in an extremely "bursty" pattern. If this is the case, increasing the queue size may help to buffer against those bursts. However, a large burst, or several bursts back to back, may put you back in the same situation.)
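A toy simulation in Python makes both points visible. The function and the numbers are invented for the example: a bounded queue receives a given number of messages per tick, the consumer removes a fixed number per tick, and the function reports the first tick at which a message would be dropped.

```python
def first_drop_tick(capacity, arrivals, service_per_tick):
    """Return the tick at which the queue first overflows, or None if it never does."""
    depth = 0
    for tick, arriving in enumerate(arrivals):
        depth += arriving
        if depth > capacity:
            return tick
        # The consumer drains what it can before the next tick.
        depth = max(0, depth - service_per_tick)
    return None


# Sustained overload: 3 messages arrive per tick, only 2 are consumed.
sustained = [3] * 1000
small_queue = first_drop_tick(10, sustained, 2)
large_queue = first_drop_tick(100, sustained, 2)

# One burst of 20 messages with an otherwise idle producer.
single_burst = [0] * 5 + [20] + [0] * 20
burst_small = first_drop_tick(10, single_burst, 2)
burst_large = first_drop_tick(32, single_burst, 2)

# Two bursts back to back defeat even the larger queue.
double_burst = first_drop_tick(32, [20, 20], 2)
```

Under sustained overload the tenfold larger queue only moves the first drop from tick 8 to tick 98. The larger queue does absorb the single burst, but two bursts in a row overflow it again.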
Let's say I need to perform two different kinds of write operations on a datastore entity that might happen simultaneously, for example:
The client that holds a write-lock on the entry updates the entry's content
The client requests a refresh of the write-lock (updates the lock's expiration time-stamp)
As the content-update operation is only allowed if the client holds the current write-lock, I need to perform the lock-check and the content-write in a transaction (unless there is another way that I am missing?). Also, a lock-refresh must happen in a transaction because the client needs to first be confirmed as the current lock-holder.
The lock-refresh is a very quick operation.
The content-update operation can be quite complex. Think of it as the client sending the server a complicated update-script that the server executes on the content.
Given this, if there is a conflict between those two transactions (should they be executed simultaneously), I would much rather have the lock-refresh operation fail than the complex content-update.
Is there a way that I can "prioritize" the content-update transaction? I don't see anything in the docs and I would imagine that this is not a specific feature, but maybe there is some trick I can use?
For example, what happens if my content-update reads the entry, writes it back with a small modification (without committing the transaction), then performs the lengthy operation and finally writes the result and commits the transaction? Would the first write be applied immediately and cause a simultaneous lock-refresh transaction to fail? Or are all writes kept until the transaction is committed at the end?
Is there such a thing as keeping two transactions open? Or doing an intermediate commit in a transaction?
Clearly, I can just split my content-update into two transactions: The first one sets a "don't mess with this, please!"-flag and the second one (later) writes the changes and clears that flag.
But maybe there is some other trick to achieve this with fewer reads/writes/transactions?
Another thought I had was that there are 3 different "blocks" of data: The current lock-holder (LH), the lock expiration (EX), and the content that is being modified (CO). The lock-refresh operation needs to perform a read of LH and a write to EX in a transaction, while the content-update operation needs to perform a read of LH, a read of CO, and a write of CO in a transaction. Is there a way to break the data apart into three entities and somehow have the transactions span only the needed entities? Since LH is never modified by these two operations, this might help avoid the conflict in the first place?
The datastore uses optimistic concurrency control, which means that a (datastore primitive) transaction waits until it is committed, then succeeds only if someone else hasn't committed first. Typically, the app retries the failed transaction with fresh data. There is no way to modify this first-wins behavior.
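The first-wins behaviour can be illustrated with a toy in-memory model in Python. Entity, Txn, Conflict and run_in_transaction are hypothetical names for this sketch and do not correspond to the datastore API; the model tracks a single version number per entity and buffers writes until commit.

```python
class Conflict(Exception):
    """Another transaction committed first."""


class Entity:
    def __init__(self, data):
        self.data = dict(data)
        self.version = 0


class Txn:
    def __init__(self, entity):
        self.entity = entity
        self.read_version = entity.version  # version seen when the txn started
        self.writes = {}

    def put(self, key, value):
        self.writes[key] = value  # buffered; not visible until commit

    def commit(self):
        if self.entity.version != self.read_version:
            raise Conflict()
        self.entity.data.update(self.writes)
        self.entity.version += 1


def run_in_transaction(entity, body, retries=3):
    """Run body(txn) and commit, retrying with fresh data on conflict."""
    for _ in range(retries):
        txn = Txn(entity)
        body(txn)
        try:
            txn.commit()
            return True
        except Conflict:
            continue
    return False


entry = Entity({"content": "v1", "expires": 100})
update = Txn(entry)            # long content-update starts first
update.put("content", "v2")    # early write stays buffered
refresh = Txn(entry)           # quick lock-refresh starts later
refresh.put("expires", 200)
refresh.commit()               # but commits first, so it wins
try:
    update.commit()
    update_conflicted = False
except Conflict:
    update_conflicted = True
retried = run_in_transaction(entry, lambda t: t.put("content", "v2"))
```

In this model an early write inside the long transaction gives it no priority, because nothing is applied before commit; the content update only succeeds on the retry.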
It might help to know that datastore transactions are strongly consistent, so a client can first commit a lock refresh with a synchronous datastore call, and when that call returns, the client knows for sure whether it obtained or refreshed the lock. The client can then proceed with its update and lock clear. The case you describe where a lock refresh and an update might occur concurrently from the same client sounds avoidable.
I'm assuming you need the lock mechanism to prevent writes from other clients while the lock owner performs multiple datastore primitive transactions. If a client is actually only doing one update before it releases the lock and it can do so within seconds (well before the datastore RPC timeout), you might get by with just a primitive datastore transaction with optimistic concurrency control and retries. But a lock might be a good idea for simple serialization of, say, edits to a record in a user interface, where a user hits an "edit" button in a UI and you want that to guarantee that the user has some time to prepare and submit changes without the record being changed by someone else. (Whether that's the user experience you want is your decision. :) )