Can the lost update problem in REST APIs be prevented with database transactions alone?

Can lost updates in REST APIs be prevented by requiring only a DB transaction (at Repeatable Read or Snapshot isolation)?
What if a user GETs a version of a resource, changes an attribute, and sends a PUT request, but in the meantime another user has changed the same attribute and successfully completed their own PUT (including the DB update)? I assume such a case cannot be prevented by a DB transaction alone?
In that case, do we additionally need an optimistic concurrency control mechanism at the application layer (say, using a version number or ETags)?
If we do have optimistic concurrency control at the application layer, do we still need a transaction, given that a single-row update with WHERE version = v1 would by itself prevent lost updates, unless we are updating multiple tables?
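To make the WHERE version = v1 idea concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name items, its columns, and the helper update_if_unchanged are made up for illustration; they are not from any particular framework.

```python
import sqlite3

def update_if_unchanged(conn, item_id, new_name, expected_version):
    """Apply the update only if the row still has the version the client read."""
    cur = conn.execute(
        "UPDATE items SET name = ?, version = version + 1 "
        "WHERE id = ? AND version = ?",
        (new_name, item_id, expected_version),
    )
    conn.commit()
    # rowcount is 0 when another writer already bumped the version.
    return cur.rowcount == 1

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, version INTEGER)")
conn.execute("INSERT INTO items VALUES (1, 'original', 1)")
conn.commit()

# Both users GET the item and see version 1 (hypothetical clients).
version_seen_by_alice = 1
version_seen_by_bob = 1

alice_ok = update_if_unchanged(conn, 1, "alice's edit", version_seen_by_alice)
bob_ok = update_if_unchanged(conn, 1, "bob's edit", version_seen_by_bob)
print(alice_ok, bob_ok)
```

The second PUT matches no row, so nothing is overwritten. In a REST API the failed case would typically be reported as 412 Precondition Failed (when the version arrives in an If-Match header) or 409 Conflict. The check and the write happen in one statement, which is why no long-lived transaction is needed for the single-row case.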

Related

How does Laravel handle locking and concurrent updates?

Grails, by default, uses optimistic locking. It maintains an update count, and it checks this and throws an exception (and rolls the second one back) if two people try to update the same record at the same time.
What is Laravel's strategy for concurrent updates?
If the answer is nothing (i.e. overwrite), this would result in a broken application. E.g. if you have an API that updates a user's "last logged in" value, and a backend admin application that allows an administrator to, say, "ban" a user, then the ban update could be overwritten (and lost) by the API update. In that case we would need pessimistic locking, which many developers do not understand well and which can easily result in deadlocks or slowdowns. Alternatively, we could split the tables into many small tables, but that has its own issues.
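The scenario can be reproduced in a few lines. This is only a sketch, and it assumes a hypothetical data layer (load_user / save_user) that writes every column back on save; it says nothing about what Laravel actually does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, last_login TEXT, banned INTEGER)")
conn.execute("INSERT INTO users VALUES (1, '2020-01-01', 0)")

def load_user(user_id):
    row = conn.execute(
        "SELECT id, last_login, banned FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return {"id": row[0], "last_login": row[1], "banned": row[2]}

def save_user(user):
    # Writes every column back, including ones this caller never changed.
    conn.execute(
        "UPDATE users SET last_login = ?, banned = ? WHERE id = ?",
        (user["last_login"], user["banned"], user["id"]),
    )

api_copy = load_user(1)      # the API reads the user
admin_copy = load_user(1)    # the admin app reads the same user

admin_copy["banned"] = 1
save_user(admin_copy)        # the ban is stored

api_copy["last_login"] = "2020-01-02"
save_user(api_copy)          # stale copy still carries banned = 0

banned_after = conn.execute("SELECT banned FROM users WHERE id = 1").fetchone()[0]
print(banned_after)
```

After the second save the ban is gone, even though the API never meant to touch that column.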

Isolation Level vs Optimistic Locking - Hibernate, JPA

I have a web application where I want to ensure concurrency with a DB level lock on the object I am trying to update. I want to make sure that a batch change or another user or process may not end up introducing inconsistency in the DB.
I see that isolation levels ensure read consistency, and that an optimistic lock with a @Version field can ensure data is written in a consistent state.
My question is: can't we ensure consistency with the isolation level alone? If I make every transaction that updates the record Serializable (not considering performance), won't I ensure that the transaction takes a proper lock and that any other transaction trying to update the record or acquire the lock will fail?
Do I really need version or timestamp management for this?
Depending on the isolation level you've chosen, a specific resource is locked until the given transaction commits or rolls back; the lock can be on a whole table, on a row, or on a block of rows. This is pessimistic locking, and the database enforces it while the transaction runs.
Optimistic locking, on the other hand, assumes that multiple transactions rarely interfere with each other, so no locks are required. It is an application-side check that uses the @Version attribute to establish whether the version of a record has changed between fetching it and attempting to update it.
It is reasonable to use optimistic locking in web applications because most operations span multiple HTTP requests. Usually you fetch some information from the database in one request and update it in another. It would be very expensive and unwise to keep transactions open, holding locks on database resources, for that long. That's why we assume that nobody else is going to modify the data we're working on: it's cheaper. If the assumption turns out to be wrong and someone else has changed the version between requests, Hibernate won't update the row and will throw an optimistic locking exception (OptimisticLockException in JPA, StaleObjectStateException in native Hibernate). As a developer, you are responsible for handling this situation.
A simple example: an online auction service. You're looking at an item page and reading its description and specification. All of that takes, let's say, 5 minutes. With pessimistic locking and some isolation levels you'd block other users from this particular item page (or even from all items!). With optimistic locking everybody can access it. After reading about the item you decide to bid on it, so you click the proper button. If any other user looking at this item has changed its state in the meantime (the owner changed its description, someone else bid on it), you will probably (depending on how the app is implemented) be informed about the changes before the application accepts your bid, because the version you've got is not the same as the version persisted in the database.
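The auction flow can be sketched in plain Python. AuctionItem and StaleVersionError are hypothetical stand-ins (not Hibernate classes) that only mimic what a versioned entity and its exception do; in a real system the version check and the write must happen atomically in the database.

```python
class StaleVersionError(Exception):
    """Hypothetical stand-in for the exception an ORM raises on a version mismatch."""

class AuctionItem:
    def __init__(self, description, highest_bid):
        self.description = description
        self.highest_bid = highest_bid
        self.version = 1

    def place_bid(self, amount, version_seen):
        # Reject the write if the item changed after the caller loaded it.
        if version_seen != self.version:
            raise StaleVersionError("item changed since it was loaded")
        self.highest_bid = amount
        self.version += 1

item = AuctionItem("vintage clock", 100)
page_version = item.version            # the version shown on your page

item.place_bid(120, item.version)      # someone else bids while you are reading

try:
    item.place_bid(110, page_version)  # your bid, based on the stale page
    outcome = "accepted"
except StaleVersionError:
    outcome = "refresh required"       # the app decides how to inform the user

print(outcome, item.highest_bid)
```

Your stale bid is rejected rather than silently replacing the higher one, and nobody was blocked from viewing the item while you were reading.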
Hope that clarifies a few things for you.
Unless we are talking about a small, isolated web application (the only app working on the database), making all of your transactions Serializable means placing a lot of confidence in your design, and it ignores the fact that yours may not be the only application hitting that database.
In my opinion, adopting the Serializable isolation level, or in other words pessimistic locking, should be a very well thought-out decision, applied in cases such as:
Large databases and short transactions that update only a few rows
Where the chance that two concurrent transactions will modify the same rows is relatively low.
Where relatively long-running transactions are primarily read-only.
Based on my experience, in most cases using just optimistic locking is the most beneficial decision, as concurrent modifications of the same data happen in only a small percentage of cases.
Optimistic locking definitely also helps other applications run faster (don't think only of yourself!).
So on the spectrum between pessimistic and optimistic locking strategies, in my opinion the truth lies more towards optimistic locking, with a flavor of Serializable here and there.
I really cannot reference anything here, as the answer is based on my personal experience with many complex web projects and on my notes from when I was preparing for my JPA certification.
Hope that helps.

Database- or user-level locks

I'm currently facing the following problem:
I have a C# .NET application connecting to a database (using NHibernate). The application basically displays the database content and lets the user edit it. Since multiple instances of the application run at the same time (on the same and on different workstations), I run into concurrency problems as soon as two users modify the same record at the same time.
Currently I have more or less solved the issue with optimistic locking. But this is not a perfect solution, since one user still loses their changes.
Now I have come up with the idea of having the application lock an entry every time it loads one from the database and release the lock as soon as the user switches to another entry. So basically all entries which are currently displayed to the user are locked in the database. If another user loads locked entries, the application will display them in read-only mode.
Now to my actual question:
Is it a good idea to do the locking on database level? That would mean opening a new transaction and locking the entry every time a user loads one. Or would it be better to do it through a "Lock Table" which holds, for example, a key to all locked entries in a table?
Thanks for your help!
Is it a good idea to do the locking on database level?
Yes, it is fine in some cases.
So basically all entries which are currently displayed to the user are locked in the database.
...
Or would it be better to do it through a "Lock Table" which holds for example a key to all locked entries in a table?
So you lock a bunch of entries on page load? And when would you release them? What if the editing takes a long time (e.g. the user started editing an entry and then went for lunch)? What if the user closes the page without editing all these locked entries; for how long would the entries remain locked?
Pessimistic locking and a "Lock Table" help to avoid some problems of optimistic locking but bring new ones.
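One common way to bound the "gone for lunch" problem is to give each lock an expiry time. The following is only a sketch under assumptions of my own: the entry_locks table, the try_lock helper, and the 300-second lease are all invented for illustration, and the current time is passed in explicitly to keep the example deterministic.

```python
import sqlite3

LOCK_TTL_SECONDS = 300  # assumed lease length; pick what fits your users

def try_lock(conn, entry_id, owner, now):
    """Claim entry_id for owner unless someone else holds an unexpired lock."""
    # Remove a lock on this entry if its lease has run out.
    conn.execute(
        "DELETE FROM entry_locks WHERE entry_id = ? AND expires_at <= ?",
        (entry_id, now),
    )
    try:
        # The primary key on entry_id allows at most one lock per entry.
        conn.execute(
            "INSERT INTO entry_locks (entry_id, owner, expires_at) VALUES (?, ?, ?)",
            (entry_id, owner, now + LOCK_TTL_SECONDS),
        )
        conn.commit()
        return True
    except sqlite3.IntegrityError:
        conn.rollback()
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE entry_locks (entry_id INTEGER PRIMARY KEY, owner TEXT, expires_at REAL)"
)

alice_got_it = try_lock(conn, 42, "alice", now=1000)   # lease runs until 1300
bob_blocked = try_lock(conn, 42, "bob", now=1100)      # alice still holds it
bob_after_expiry = try_lock(conn, 42, "bob", now=1400) # lease expired
print(alice_got_it, bob_blocked, bob_after_expiry)
```

Note the new problem this brings: if Alice comes back from lunch and saves after her lease expired, she can still overwrite Bob's work unless the save is also guarded, for example by a version check.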
Currently I kind of solved the issues with optimistic locking. But this is not the perfect solution since one user still loses its changes.
I can't agree that this counts as losing changes: in your case, if the validate and commit phases are performed as a single atomic operation, the entry won't be corrupted, and only one transaction will succeed (let's suppose the 1st) while the other (the 2nd) is rolled back.
According to NHibernate's Optimistic concurrency control:
It will be atomic if only one of these database transactions (the last one) stores the updated data, all others simply read data.
The only approach that is consistent with high concurrency and high scalability is optimistic concurrency control with versioning.
NHibernate provides for three possible approaches to writing application code that uses optimistic concurrency.
So the 2nd transaction would be gracefully rolled back, and after that the user could be notified that they have to either make a new edit (a new transaction) or skip this entry.
But everything depends on your business logic and requirements. If you don't have high contention for the data, and thus there won't be many collisions, then I suggest you use optimistic locking.

Should a correct user-provided timestamp/rowversion be a requirement to update data (is it secure)?

I am using rowversion for optimistic concurrency over a set of data: the client gets a batch of data, makes some updates, and then the updates are sent to the database. The simplest solution for managing optimistic concurrency appears to be the one described here: on retrieval, just get the single largest rowversion from the data of interest (or even just the database's last-used rowversion value), and send it to the client. When updates are requested, have the client send the value back, and then ensure that all rows involved in the update have a rowversion value that is less than or equal to the value sent by the client. On update, any row in the database with a higher rowversion than the one sent to the client must have been updated after the initial retrieval, and the user should be prompted to refresh and try again, or whatever the desired experience is.
The problem that seems obvious to me in this is that it would be easy for the client to simply send back UInt64.MaxValue or some other large value and completely defeat this.
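To illustrate the concern, here is a small sketch of the check as described above, in plain Python with made-up row ids and rowversion values (stale_rows is a hypothetical helper, not part of any database API):

```python
UINT64_MAX = 2**64 - 1

def stale_rows(rows, client_rowversion):
    """Return ids of rows whose rowversion is newer than the value the client sent."""
    return [
        row_id
        for row_id, rowversion in rows.items()
        if rowversion > client_rowversion
    ]

# row id -> current rowversion in the database (invented values)
rows = {"a": 101, "b": 105, "c": 110}

# An honest client read the data when the largest rowversion was 105;
# row "c" has been updated since then.
honest = stale_rows(rows, 105)

# A client that sends the maximum value makes every row look unchanged.
forged = stale_rows(rows, UINT64_MAX)

print(honest, forged)
```

With the honest value the check flags row "c"; with the forged value it flags nothing, so the update would go through unchecked.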
After some searching, I've seen quite a few descriptions of solutions that involve sending rowversions to the client to manage optimistic concurrency, but not a single mention of this kind of concern.
Should data values used for optimistic concurrency checking be signed and verified by the server, or perhaps stored server-side in a user session cache or something similar instead of actually sent to the user? Or should the design of an application consider optimistic concurrency checks to be only part of a good user experience and not a security feature - i.e. concurrency checking should only exist to help ensure that users (who should be properly authorized to touch this data in the first place anyways) are making decisions based on fresh data, and the app should function properly even if someone goes out of their way to defeat the concurrency checks?
I'm leaning toward the latter, but it gives me pause to think about apps that use insecure, client-provided rowversion values and just throw user updates blindly into the database without performing any kind of sanity checks on the rows being updated...

Hibernate and multiple threads, synchronize changes between multiple users

I am using Hibernate in an Eclipse RAP application. I have database tables mapped to classes with Hibernate and these classes have properties that are fetched lazily (If these weren't fetched lazily then I would probably end up loading the whole database into memory on my first query). I do not synchronize database access so there are multiple Hibernate Sessions for the users and let the DBMS do the transaction isolation. This means different instances of fetched data will belong to different users. There are things that if a user changes those things, then I would like to update those across multiple users. Currently I was thinking about using Hibernate session.refresh(object) in these cases to refresh the data, but I'm unsure how this will impact performance when refreshing multiple objects or if it's the right way to go.
Hope my problem is clear. Is my approach to the problem OK, or is it fundamentally flawed, or am I missing something? Is there a general solution for this kind of problem?
I would appreciate any comments on this.
The general solution is
to have transactions as short as possible
to link the session lifecycle to the transaction lifecycle (this is the default: the session is closed when the transaction is committed or rolled back)
to use optimistic locking concurrency to avoid two transactions updating the same object at the same time.
If each transaction is very short and transaction A updates some object from O to O', then concurrent transaction B will only see O until it commits or rolls back, and any other transaction started after A will see O', because a new session starts with the transaction.
We maintain an application that does exactly what you are trying to accomplish. Yes, every session.refresh() will hit the database, but since all sessions will refresh the same row at the same time, the DB server will answer all of these queries from memory.
The only thing that you still need to solve is how to propagate the information that something has changed and needs reloading to all the other sessions, possibly even to sessions on a different host.
For our application, we have about 30 users on RCP and 10-100 users on RAP instances that all connect to the very same DB backend (though through pgpool). We use a small network service that every runtime connects to; when a transaction commits, the application tells this change service that "row id X of table T" has changed and this is then propagated to all other "change subscribers", even across JVMs.
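The shape of such a change service can be sketched in a few lines. This is not our actual service: it is an in-process toy with invented names (ChangeService, subscribe, publish) and no networking, meant only to show the "tell everyone except the origin" pattern.

```python
class ChangeService:
    """Toy in-process stand-in for a networked change notification service."""

    def __init__(self):
        self._subscribers = {}  # subscriber name -> callback

    def subscribe(self, name, callback):
        self._subscribers[name] = callback

    def publish(self, origin, table, row_id):
        # Notify every subscriber except the one that made the change.
        for name, callback in self._subscribers.items():
            if name != origin:
                callback(table, row_id)

# Each runtime records which rows it has been told to reload.
received = {"rcp-1": [], "rap-1": [], "rap-2": []}

service = ChangeService()
for name in received:
    # name=name binds the current loop value into each callback.
    service.subscribe(
        name,
        lambda table, row_id, name=name: received[name].append((table, row_id)),
    )

# rcp-1 commits a transaction that changed row 7 of table T.
service.publish("rcp-1", "T", 7)
print(received)
```

In a real application the callback would not refresh anything directly; it would hand the (table, row id) pair over to the thread that owns the session, which then decides whether to reload the row.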
But: make sure that session.refresh() is called within the Thread that belongs to that session, possibly the RAP-Display thread. Do not call refresh() from Jobs or other unrelated threads.
As long as you don't have a large number of users updating large numbers of rows in a short time, I guess you won't have to worry about performance.
