Cross-region replica DynamoDB transaction conflict handling - database

Since transactions are only ACID within a single region, and don't replicate until that region completes the transaction, transactions can occur simultaneously in different regions that would not both be allowed within the same region; when these transactions then replicate to other regions, they cannot be completed because some condition no longer holds.
Example:
User 1 buys item 1 from User 2 in region A
User 3 buys item 1 from User 2 in region B at the same time
Region A transaction completes, Region B transaction completes, and both begin to replicate to the other regions. Transaction 1 cannot complete in region B, and Transaction 2 cannot complete in region A, because User 2 no longer has that item (it now belongs to User 1 and User 3 respectively).
How does DynamoDB handle these conflicts? SQL databases would typically prevent this because of their single writer node, but since DynamoDB global tables have multiple writers, this potential conflict can arise.

Once the transaction completes in a region, the individual item changes enter the stream to be replicated via global tables; the transaction itself is not replicated. The transaction is over.
In your scenario, DynamoDB's conflict resolution works as it normally does on each individual item. DynamoDB uses last writer wins for its conflict resolution.
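As an illustration of why each regional transaction succeeds on its own, here is a minimal sketch of the kind of conditional write each region might run, using the AWS SDK v3 document client; the table name, key shape, and attribute names (Items, itemId, ownerId) are assumptions, not details from the question.

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, TransactWriteCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));

// Region A: transfer item1 from User 2 to User 1, but only if User 2 still owns it.
// The equivalent transaction in region B (transferring to User 3) can succeed there
// at the same time; after each commits, only the individual item writes replicate,
// and the last writer wins on the conflicting item.
await ddb.send(new TransactWriteCommand({
  TransactItems: [
    {
      Update: {
        TableName: "Items",                        // assumed table name
        Key: { itemId: "item1" },
        UpdateExpression: "SET ownerId = :buyer",
        ConditionExpression: "ownerId = :seller",  // evaluated only within this region
        ExpressionAttributeValues: { ":buyer": "user1", ":seller": "user2" },
      },
    },
    // ...further Put/Update entries (payment record, etc.) would go here
  ],
}));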

Related

General strategy to handle reverting financial transactions in DB?

I am building a home budget app with NextJS + Prisma ORM + PostgreSQL.
I am not sure if my current strategy for handling deleting/reverting past transactions makes sense in terms of scaling up/DB performance.
So, the app functions in this way:
The user adds transactions that are assigned to a chosen bank account. Every transaction row in the DB includes fields like amount, balanceBefore and balanceAfter.
After a successful transaction, the bank account's current balance is updated.
Now, assume multiple transactions have been inserted and the user realises he made a mistake somewhere along the line. He would then need to select the transaction and delete/update it, which would require updating every row following this transaction so the balanceAfter and balanceBefore fields keep the transaction history correct.
Is there a better way of handling this kind of situation? Having to update a row that is a couple thousand records in the past might be heavy on resources.
Not only should you never delete or update a financial transaction, but your input data should not contain balances (before or after) either. Instead of updating or deleting a transaction, you generate 2 new ones: one which reverses the incorrect transaction (thus restoring balances) and one that inserts the correct values.
As for balances, do not store them; just store the transaction amount. Then create a view which calculates the balances on the fly when needed. By creating a view you do not need to perform any calculations when DML is performed on your transactions. See the following example for both of the above.
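The original answer's example is not reproduced here, but a minimal sketch of both ideas, assuming a Prisma + PostgreSQL setup as in the question, might look like the following; the model and column names (Transaction, accountId, amount, createdAt) are assumptions.

import { PrismaClient } from "@prisma/client";

const prisma = new PrismaClient();

// Revert a mistaken transaction by inserting a compensating row plus the corrected one,
// instead of updating or deleting history.
async function correctTransaction(wrongId: number, correctedAmount: number) {
  const wrong = await prisma.transaction.findUnique({ where: { id: wrongId } });
  if (!wrong) throw new Error("transaction not found");
  await prisma.transaction.createMany({
    data: [
      { accountId: wrong.accountId, amount: -wrong.amount },   // reversal
      { accountId: wrong.accountId, amount: correctedAmount }, // corrected entry
    ],
  });
}

// One-off DDL: a view that derives balanceAfter on the fly instead of storing it per row.
await prisma.$executeRawUnsafe(`
  CREATE OR REPLACE VIEW transaction_with_balance AS
  SELECT t.*,
         SUM(t.amount) OVER (PARTITION BY t."accountId"
                             ORDER BY t."createdAt", t.id) AS "balanceAfter"
  FROM "Transaction" t
`);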

Concurrent updates to shared database by microservices .. Lost Updates

Our microservices call a third-party service, using a REST API to access (read/update) the same record in a shared database.
Use case 1: Same microservice making multiple calls to access same record
A customer bought 5 units of a product A
Another customer bought 2 units of the same product A
2 calls are made by the UI API to the endpoint /decrementProduct within a few milliseconds
Both calls may end up reading the same inventory count for product A at that time, and both will decrement the purchased units for product A based on the inventory count they read.
Example:
Inventory Count before calls: 10
Call 1 decrements 5 units from 10 and writes back 5 as the current inventory count.
Call 2 decrements 2 units from 10 and writes back 8 as the current inventory count.
Inventory Count after calls: 8
Correct Inventory Count after calls should be: 3
Use case 2: Multiple microservices making multiple calls to access same record
The problem explained in use case 1 is exacerbated in this case, due to the number of calls updating the same record in the database.
Edit: 13-April-2021
The shared database is exposed to our microservices via a REST API, and we don't have any control over the physical database or the exposed REST API to implement any transaction or locking mechanism at the database level.
I don't know which database you are using, but traditional relational databases like Oracle, PostgreSQL, MySQL, DB2, etc. already address these kinds of issues, using locks on the records being updated to ensure there are no concurrency problems.
I mean, if you open a transaction where you read a value and then update it, there won't be any problem, because if you try to update a row with a version number different from the one currently set (an outdated version), the database won't let you update it.
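As a minimal sketch of the version-check idea this answer describes, assuming direct SQL access to a hypothetical inventory table (which, per the question's edit, is not actually available through the REST API), a conditional UPDATE only applies when the version read earlier is still current:

import { Pool } from "pg";

const pool = new Pool();

// Optimistic decrement: only apply the update if nobody changed the row since we read it.
async function decrementProduct(productId: string, units: number): Promise<boolean> {
  const { rows } = await pool.query(
    "SELECT stock, version FROM inventory WHERE product_id = $1",
    [productId]
  );
  if (rows.length === 0) return false;   // unknown product
  const { stock, version } = rows[0];
  if (stock < units) return false;       // not enough stock

  const result = await pool.query(
    `UPDATE inventory
        SET stock = $1, version = version + 1
      WHERE product_id = $2 AND version = $3`,  // no-op if the version moved on
    [stock - units, productId, version]
  );
  return result.rowCount === 1;          // 0 rows updated => a concurrent write won; retry
}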

How to prevent update anomalies with multiple clients running non-atomic computations concurrently in PostgreSQL?

I am running three PostgreSQL instances using replication (1 master, 2 slaves) which are accessed by two separate servers:
The first (unexposed) server basically iterates over every row in a particular table and continuously updates specific columns (resources) every tick (based on production rate of those resources) for each user.
The second server is a public API that exposes various functions such as spending a certain amount of those resources.
In order to access and manipulate the data I am using an ORM library which allows me to write code as follows:
const resources = await repository.findById(1337);
// some complex computation
resources.iron = computeNewIron(resources.iron);
await repository.save(resources);
Of course it might occur that the API wants to deduct a specific amount of resources right when the server handling the ticks is trying to update the amount of resources, which can cause either server to work with an incorrect amount of resources, basically your typical UPDATE anomaly.
My problem is that I am not just writing a "simple" atomic query such as UPDATE table SET iron = iron + 42 WHERE id = :id. The ORM library internally uses a direct assignment that does not self-reference the respective columns, which yields something akin to UPDATE table SET iron = 123 WHERE id = :id, where the amount has been computed previously.
I can only assume that it's possible to prevent the mentioned anomaly if I use manually written queries that increment/decrement the values atomically with self-references. I'd like to know which other options can alleviate the issue. Should I wrap my SELECT/computation/UPDATE in a transaction? Does this suffice?
Your question is a bit unclear, but if your transaction spans several statements, yet needs to have a consistent state of the database, there are basically two options:
Use pessimistic locking: when you read values from the database, do it with SELECT ... FOR UPDATE. Then the rows are locked for the duration of your transaction, and no concurrent transaction can modify them.
Use optimistic locking: start your transaction in REPEATABLE READ isolation level. Then you see a consistent snapshot of the database for the whole duration of your transaction. If somebody else modifies your data after you read them, your UPDATE will cause a serialization error and you'll have to retry the transaction.
Optimistic locking is better if conflicts are rare, while pessimistic locking is preferable if conflicts are likely.
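A minimal sketch of the pessimistic variant, using the pg driver directly rather than the ORM from the question; the table and column names (resources, iron, id) are assumptions:

import { Pool } from "pg";

const pool = new Pool();

async function spendIron(id: number, computeNewIron: (iron: number) => number) {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");
    // Lock the row; concurrent writers block until we commit or roll back.
    const { rows } = await client.query(
      "SELECT iron FROM resources WHERE id = $1 FOR UPDATE",
      [id]
    );
    const newIron = computeNewIron(rows[0].iron); // the "complex computation"
    await client.query("UPDATE resources SET iron = $1 WHERE id = $2", [newIron, id]);
    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK");
    throw err;
  } finally {
    client.release();
  }
}

For the optimistic variant, the same block would start with BEGIN ISOLATION LEVEL REPEATABLE READ, drop the FOR UPDATE, and retry the whole transaction when the UPDATE fails with a serialization error (SQLSTATE 40001).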

Updating database keys where one table's keys refer to another's

I have two tables in DynamoDB. One has data about homes, one has data about businesses. The homes table has a list of the closest businesses to it, with walking times to each of them. That is, the homes table has a list of IDs which refer to items in the businesses table. Since businesses are constantly opening and closing, both these tables need to be updated frequently.
The problem I'm facing is that, when either one of the tables is updated, the other table will have incorrect data until it is updated itself. To make this clearer: let's say one business closes and another one opens. I could update the businesses table first to remove the old business and add the new one, but the homes table would then still refer to the now-removed business. Similarly, if I updated the homes table first to refer to the new business, the businesses table would not yet have that new business's data. Whichever table I update first, there will always be a period of time where the two tables are not in sync.
What's the best way to deal with this problem? One way I've considered is to do all the updates to a secondary database and then swap it with my primary database, but I'm wondering if there's a better way.
Thanks!
Dynamo only offers atomic operations at the item level, not at the transaction level, but you can get something similar to an atomic transaction by enforcing some rules in your application.
Let's say you need to run a transaction with two operations:
Delete Business(id=123) from the table.
Update Home(id=456) to remove association with Business(id=123) from the home.businesses array.
Here's what you can do to mimic a transaction:
Generate a timestamp for locking the items
Let's say our current timestamp is 1234567890. Using a timestamp will allow you to clean up failed transactions (I'll explain later).
Lock the two items
Update both Business-123 and Home-456 and set an attribute lock=1234567890.
Do not change any other attributes yet on this update operation!
Use a ConditionExpression (check the Developer Guide and API) to verify that attribute_not_exists(lock) holds before updating. This way you're sure there's no other process using the same items. (A code sketch of these lock/guard/unlock updates follows these steps.)
Handle update lock responses
Check whether both updates, to Home and to Business, succeeded. If yes to both, you can proceed with the actual changes you need to make: delete Business-123 and update Home-456 to remove the Business association.
For extra care, also use a ConditionExpression in both updates again, but now ensuring that lock == 1234567890. This way you're extra sure no other process overwrote your lock.
If both updates succeed again, you can consider the two items updated and consistent to be read by other processes. To do this, run a third update removing the lock attribute from both items.
If one of the operations fails, you may retry, say, X times. If it still fails after X attempts, make sure the process cleans up the lock set by the other update that succeeded previously.
Enforce the transaction lock throughout your code
Always use a ConditionExpression in any part of your code that may update/delete Home and Business items. This is crucial for the solution to work.
When reading Home and Business items, you'll need to do this (it may not be necessary for all reads; you decide whether you need to ensure consistency from start to finish while working with an item read from the DB):
Retrieve the item you want to read
Generate a lock timestamp
Update the item with lock=timestamp using a ConditionExpression
If the update succeeds, continue using the item normally; if not, wait one or two seconds and try again;
When you're done, update the item removing the lock
Regularly clean up failed transactions
Every minute or so, run a background process to look for potentially failed transactions. If your processes take at most 60 seconds to finish and there's an item with a lock value older than, say, 5 minutes (remember the lock value is the time the transaction started), it's safe to say that this transaction failed at some point and whatever process was running it didn't properly clean up the locks.
This background job ensures that no item stays locked forever.
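As a rough sketch of the lock/guard/unlock updates described above, using the AWS SDK v3 document client; the table names and key shape are assumptions, and the businesses attribute is assumed to be stored as a string set:

import { DynamoDBClient } from "@aws-sdk/client-dynamodb";
import { DynamoDBDocumentClient, UpdateCommand } from "@aws-sdk/lib-dynamodb";

const ddb = DynamoDBDocumentClient.from(new DynamoDBClient({}));
const lockTs = Date.now(); // timestamp used as the lock value

// Acquire the lock only if no lock attribute exists; throws
// ConditionalCheckFailedException when another process holds it.
async function acquireLock(table: string, key: Record<string, unknown>) {
  await ddb.send(new UpdateCommand({
    TableName: table,
    Key: key,
    UpdateExpression: "SET #lock = :ts",
    ConditionExpression: "attribute_not_exists(#lock)",
    ExpressionAttributeNames: { "#lock": "lock" },
    ExpressionAttributeValues: { ":ts": lockTs },
  }));
}

// The actual change, guarded by the lock we set earlier.
async function removeBusinessFromHome(homeId: string, businessId: string) {
  await ddb.send(new UpdateCommand({
    TableName: "Home",                           // assumed table name
    Key: { id: homeId },
    UpdateExpression: "DELETE #biz :b",          // assumes businesses is a string set
    ConditionExpression: "#lock = :ts",          // fail if someone overwrote our lock
    ExpressionAttributeNames: { "#lock": "lock", "#biz": "businesses" },
    ExpressionAttributeValues: { ":ts": lockTs, ":b": new Set([businessId]) },
  }));
}

// Release the lock so other processes can use the item again.
async function releaseLock(table: string, key: Record<string, unknown>) {
  await ddb.send(new UpdateCommand({
    TableName: table,
    Key: key,
    UpdateExpression: "REMOVE #lock",
    ExpressionAttributeNames: { "#lock": "lock" },
  }));
}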
Beware: this implementation does not ensure a truly atomic and consistent transaction in the sense that traditional ACID DBs do. If this is mission-critical for you (e.g. you're dealing with financial transactions), do not attempt to implement it. Since you said you're OK with atomicity being broken on rare failure occasions, you may live with it happily. ;)
Hope this helps!

Dealing with race condition in transactional database table

Let me lay the scenario out first. Say you have a database for a business app and one of the things it tracks is inventory. The system says you have 5 screws in stock. Say you needed all 5. The system creates an inventory transaction record for -5. After you commit that transaction, since you know you had 5 before and you pulled out 5, if you sum up all the inventory transaction records for that screw the total should be 0. The problem occurs when two people try to do this at the same time. Say one person wants 4 and the other wants 2. Both client apps check the quantity beforehand and they are both told 5. At the exact same time one creates a transaction for -4 and the other for -2. This results in a total inventory quantity of -1, which should never be possible because the system should not allow negative inventory.
How would you solve this if you didn't have a server application to help you? I mention that because a server coordinating the inventory transactions is how I would solve it, but right now our product has no server application. We just have client apps which talk to a Firebird database directly. I'm trying to figure out how to do this with just the client apps and the database. One thing that might help is that Firebird has something called a generator, which is basically an atomic unique-number generator, so you are guaranteed that if you ask Firebird to increment the generator and give you the next number, it will not give anyone else that same number.
My mind was going down the route of trying to create a makeshift record lock using a generator. I thought I could have both clients check a "lock" field on the Item table. If it is null, then no one has a lock. If it is non-null it is locked, so you need to keep checking back until it is not locked. If there is no lock, you ask the generator for a unique number and store that in the locking field for the Item you want to lock. You commit that transaction, then go back and check whether the Item table's lock field does indeed contain the number you put there. If it does, you have successfully locked it; if it doesn't, someone else was locking it at the same time and you lost the race. Once you are done you null out the lock, and the client that is waiting will then see the null, lock it themselves and repeat.
I believe this itself has a race condition, though. Trxn1 (transaction 1) checks the lock and finds null. Trxn2 checks the lock and finds null. Trxn1 gets a new lock number from the generator. Trxn2 gets a new lock number from the generator. Trxn1 updates the Item record with its lock if the lock is still null, which it is. Trxn1 commits, then starts a new transaction and verifies that the lock field contains its lock ID; it does, so it knows it has permission to make inventory transactions and starts doing so. Right after Trxn1 checks that it got the lock, Trxn2 commits its own update that stored its lock if the lock was null. If Trxn2 executed its update statement before Trxn1 committed the lock, Trxn2 would still see the value as null and the update would occur. If Trxn2's lock commit happens after Trxn1 committed its lock and already verified it, we have a problem: Trxn1 is making changes to the Item transaction table, yet Trxn2 got its lock committed because the lock was null in its transaction's view when it did the check. When Trxn2 commits, its update statement will overwrite Trxn1's lock, because the null check in the update statement happened before both committed, not at the time of commit. So now both think they have a lock and we will end up with negative inventory.
Can anyone think of a way to solve this short of having a server application with some kind of queueing system (FIFO)? I would prefer it if this could all be done via the clients "talking to the database" to coordinate, but that may not be possible technically speaking. Sorry if this got a bit wordy :D
Solution Edit:
jtahlborn seems to have the right idea. I somehow didn't realize that Firebird does in fact have row-level locking. Simple select statements (no joins, group by, etc.) can have "with lock" appended to the end of the statement, and any row returned by the statement will be locked until the transaction is committed or rolled back. No one else can obtain a lock on that row nor make changes to it. Because I don't want to lock the entire ITEM table while I'm inserting rows into the Item transaction table, I am going to create a table just for locking that has one column (the ItemID field). Because the second transaction will get an error when it tries to take its own lock, it doesn't matter that I never actually modify anything on the locking table itself. Failing to get a lock gives me all the information I need. I will put triggers on the insert/delete of the ITEM table so that for every Item record there is also a record in the ITEMLOCK table. Here is the process I'm going to use (a rough code sketch follows the list):
Start database transaction
Attempt to obtain a lock on the ITEMLOCK row with the ItemID of the Item you want to change
If you can't get a lock, keep trying until the record is unlocked
Once locked, verify that the quantity on hand of that Item is enough to cover what you want to take out; because the client could have old data this might not be the case, in which case it drops out here and messages the user
If sufficient quantity exists, insert your inventory transaction record into the inventory transaction table
Commit the transaction, which in turn releases the lock
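A rough sketch of those steps in code; the original doesn't say which driver the clients use, so the query helper here is hypothetical, and the table/column names (ITEMLOCK, INVENTORYTXN, ITEMID, QUANTITY) are assumptions:

// Hypothetical helper: runs one SQL statement inside the current Firebird transaction.
type Query = (sql: string, params: unknown[]) => Promise<any[]>;

async function withdrawInventory(query: Query, itemId: number, qty: number): Promise<void> {
  // Lock the ITEMLOCK row for this item; a concurrent transaction attempting the same
  // lock will wait (or error, depending on transaction settings) until we commit.
  await query("SELECT ITEMID FROM ITEMLOCK WHERE ITEMID = ? WITH LOCK", [itemId]);

  // Re-check the quantity on hand now that we hold the lock.
  const rows = await query(
    "SELECT COALESCE(SUM(QUANTITY), 0) AS ONHAND FROM INVENTORYTXN WHERE ITEMID = ?",
    [itemId]
  );
  if (rows[0].ONHAND < qty) {
    throw new Error("Insufficient quantity on hand"); // drop out and message the user
  }

  // Record the withdrawal; committing the surrounding transaction releases the lock.
  await query("INSERT INTO INVENTORYTXN (ITEMID, QUANTITY) VALUES (?, ?)", [itemId, -qty]);
}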
Note: Matthieu M mentioned the FOR UPDATE clause. It is mentioned in the documentation along with the WITH LOCK clause. As I understand it, you can use that when you are locking multiple rows with one statement. I am not one hundred percent sure, but it seems like doing this with WITH LOCK will try an all-or-nothing approach, while FOR UPDATE will lock each row separately, one at a time. I am not sure what happens if it locked the first 100 records you asked for but couldn't get a lock on the 101st record. Does it then release the 100 locks you did get? I will need to lock more than one Item at a time, but I do not feel comfortable with FOR UPDATE since I feel like I don't truly understand the difference. I also probably want to know which Item was already locked for user-messaging purposes (I'm going to put in a timeout so transactions won't wait forever for a lock), so I will be locking one at a time using WITH LOCK.
Note 2: I want to point out to anyone using this in their own code to be careful. I am going to have a very simple loop when waiting for a lock to be released (is it released yet? how about now? now?). If I had a ton of users possibly trying to lock the same row at the same time, there could be a starvation scenario. Say you have a slow client. That client may always end up with the short end of the stick, because every time the lock was released some other client grabbed it faster than the slow client could. If this happened over and over, the slow client would effectively never make progress. If I was worried about that I would need a way to figure out who is first in line. In my case, database transactions should be short-lived, we never have more than 50 users (not a cloud system), and it is highly unlikely that they are all using this part of the system at the same time trying to modify the exact same Item's inventory quantity.
The simplest solution is to lock some primary row (like the main "item") and use this as your distributed locking mechanism (assuming your database supports row-level locks, as most modern DBs do).
I recommend reading up about the CAP theorem and how it may be an explanation for the scenario you are describing. EDIT: Having read in more detail, my comment may be of limited use because it seems you already know this and are trying to solve the problem within Firebird.

Resources