Transactions best practices [closed] - database

Closed. This question needs to be more focused. It is not currently accepting answers.
Closed 4 years ago.
How much do you rely on database transactions?
Do you prefer small or large transaction scopes?
Do you prefer client-side transaction handling (e.g. TransactionScope in .NET) over server-side transactions, or vice versa?
What about nested transactions?
Do you have any tips and tricks related to transactions?
Any gotchas you encountered working with transactions?
All sort of answers are welcome.

I always wrap a transaction in a using statement.
// connection is assumed to be an already-open IDbConnection.
using (IDbTransaction transaction = connection.BeginTransaction())
{
    // logic goes here.
    transaction.Commit();
}
Once the transaction moves out of scope, it is disposed. If the transaction is still active, it is rolled back. This behaviour protects you from accidentally locking out the database. Even if an unhandled exception is thrown, the transaction will still roll back.
In my code I actually omit explicit rollbacks and rely on the using statement to do the work for me. I only explicitly perform commits.
I've found this pattern has drastically reduced record locking issues.
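The same "commit explicitly, roll back on scope exit" idea can be sketched outside .NET. The following is only an illustration in Python with the standard sqlite3 module; the `transaction` helper and the `orders` table are hypothetical names, not part of the original answer.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def transaction(conn):
    """Open a transaction and roll it back on exit unless commit() was called."""
    committed = False

    def commit():
        nonlocal committed
        conn.execute("COMMIT")
        committed = True

    conn.execute("BEGIN")
    try:
        yield commit
    finally:
        # Mirrors Dispose(): an uncommitted transaction is rolled back,
        # whether we leave the block normally or through an exception.
        if not committed:
            conn.execute("ROLLBACK")


# isolation_level=None stops sqlite3 from opening transactions implicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, item TEXT)")

try:
    with transaction(conn) as commit:
        conn.execute("INSERT INTO orders (item) VALUES ('book')")
        raise RuntimeError("something failed before commit")
except RuntimeError:
    pass  # the insert above was rolled back by the context manager

with transaction(conn) as commit:
    conn.execute("INSERT INTO orders (item) VALUES ('pen')")
    commit()

items = [row[0] for row in conn.execute("SELECT item FROM orders")]
```

Only the second insert survives, because the first block never reached its commit.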

Personally, developing a high-traffic, performance-sensitive website, I stay away from database transactions whenever possible. Obviously they are necessary, so I use an ORM and page-level object variables to minimize the number of server-side calls I have to make.
Nested transactions are an awesome way to minimize your calls. I steer in that direction whenever I can, as long as they are quick queries that won't cause locking. NHibernate has been a savior in these cases.

I use transactions on every write operation to the database.
So there are quite a few small "transactions" wrapped in a larger transaction, and basically there is an outstanding transaction count in the nesting code. If there are any outstanding children when you end the parent, it's all rolled back.
I prefer client-side transaction handling where available. If you are relegated to doing sps or other server side logical units of work, server side transactions are fine.
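A loose sketch of such a counting scheme, in Python with sqlite3. The `CountingTransaction` class is hypothetical, and it simplifies the rule above slightly: here a child that fails dooms the whole unit of work, and only the outermost scope issues the real COMMIT or ROLLBACK.

```python
import sqlite3


class CountingTransaction:
    """Only the outermost scope talks to the database; inner scopes just count."""

    def __init__(self, conn):
        self.conn = conn
        self.depth = 0
        self.failed = False

    def __enter__(self):
        if self.depth == 0:
            self.conn.execute("BEGIN")
            self.failed = False
        self.depth += 1
        return self

    def __exit__(self, exc_type, exc, tb):
        self.depth -= 1
        if exc_type is not None:
            self.failed = True  # any failing child dooms the whole unit
        if self.depth == 0:
            self.conn.execute("ROLLBACK" if self.failed else "COMMIT")
        return False  # never swallow the exception


conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE log (msg TEXT)")
tx = CountingTransaction(conn)

with tx:  # outer unit of work
    conn.execute("INSERT INTO log VALUES ('parent')")
    try:
        with tx:  # inner "transaction" fails
            conn.execute("INSERT INTO log VALUES ('child')")
            raise ValueError("child failed")
    except ValueError:
        pass  # the parent carries on, but the unit is already doomed

rows_after_failure = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]

with tx:
    with tx:
        conn.execute("INSERT INTO log VALUES ('ok')")

rows_after_success = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
```

After the failed child nothing is stored, including the parent's own insert; the second, clean run stores exactly one row.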

Wow! Lots of questions!
Until a year ago I relied 100% on transactions. Now it's only 98%. In special cases such as high-traffic websites (like Sara mentioned) and highly partitioned data, which would otherwise require distributed transactions, a transactionless architecture can be adopted. You'll then have to code referential integrity in the application.
Also, I like to manage transactions declaratively using annotations (I'm a Java guy) and aspects. That's a very clean way to determine transaction boundaries and it includes transaction propagation functionality.

Just as an FYI... Nested transactions can be dangerous. They simply increase the chances of getting a deadlock. So, though they are good and necessary, the way they are implemented is important in higher-volume situations.

Server side transactions, 35,000 transactions per second, SQL Server: 10 lessons from 35K tps
We only use server side transactions:
can start later and finish sooner
not distributed
can do work before and after
SET XACT_ABORT ON means immediate rollback on error
client/OS/driver agnostic
Other:
we nest calls but use @@TRANCOUNT to detect already started TXNs
each DB call is always atomic
We deal with millions of INSERT rows per day (some batched via staging tables), full OLTP, no problems. Not 35k tps though.

As Sara Chipps said, transactions are overkill for high-traffic applications, so we should avoid them as much as possible. In other words, we use a BASE architecture rather than ACID. eBay is a typical case: distributed transactions are not used at all in the eBay architecture. But for eventual consistency, you have to do some sort of trick on your own.

Related

How does multi table schema create data consistency issues?

As per this answer, it is recommended to go for single table in Cassandra.
Cassandra 3.0
We are planning for below schema:
Second table has composite key. PK(domain_id, item_id). So, domain_id is partition key & item_id will be clustering key.
GET request handler will access(read) two tables
POST request handler will access(write) into two tables
PUT request handler will access(write) details table(only)
As per CAP theorem,
What are the consistency issues in having multi-table schema? in Cassandra...
Can we avoid consistency issues in Cassandra? with these terms QUORUM, consistency level etc...
recommended to go for single table in Cassandra.
I would recommend the opposite. If you have to support multiple queries for the same data in Apache Cassandra, you should have one table for each query.
What are the consistency issues in having multi-table schema? in Cassandra...
Consistency issues between query tables can happen when writes are applied to one table but not the other(s). In that case, the application should have a way to gracefully handle it. If it becomes problematic, perhaps running a nightly job to keep them in-sync might be necessary.
You can also have consistency issues within a table. Maybe something happens (node crashes, down longer than 3 hours, hints not replayed) during the write process. In that case, a given data point may have only a subset of its intended replicas.
This scenario can be countered by running regularly-scheduled repairs. Additionally, consistency can be increased on a per-query basis (QUORUM vs. ONE, etc), and consistency levels of QUORUM and higher will occasionally trigger a read-repair (which syncs all replicas in the current operation).
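The arithmetic behind choosing QUORUM can be written down in a few lines. This is a minimal sketch of the usual rule of thumb (a read overlaps the latest acknowledged write when read replicas plus write replicas exceed the replication factor); the function names are made up for illustration.

```python
def quorum(replication_factor: int) -> int:
    """Number of replicas that must respond at consistency level QUORUM."""
    return replication_factor // 2 + 1


def read_overlaps_write(replication_factor: int,
                        write_replicas: int,
                        read_replicas: int) -> bool:
    """True when every read set must share at least one replica with the write set."""
    return write_replicas + read_replicas > replication_factor


rf = 3
quorum_both = read_overlaps_write(rf, quorum(rf), quorum(rf))  # QUORUM write + QUORUM read
one_both = read_overlaps_write(rf, 1, 1)                       # ONE write + ONE read
```

With a replication factor of 3, QUORUM means 2 replicas, so QUORUM writes plus QUORUM reads overlap (2 + 2 > 3), while ONE plus ONE does not (1 + 1 is not greater than 3).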
Can we avoid consistency issues in Cassandra? with these terms QUORUM, consistency level etc...
So Apache Cassandra was engineered to be highly-available (HA), thereby embracing the paradigm of eventual consistency. Some might interpret that to mean Cassandra is inconsistent by design, and they would not be incorrect. I can say after several years of supporting hundreds of clusters at web/retail scale, that consistency issues (while they do happen) are rare, and are usually caused by failures to components outside of a Cassandra cluster.
Ultimately though, it comes down to the business requirements of the application. For some applications like product reviews or recommendations, a little inconsistency shouldn't be a problem. On the other hand, things like location-based pricing may need a higher level of query consistency. And if 100% consistency is indeed a hard requirement, I would question whether or not Cassandra is the proper choice for data storage.
Edit
I did not get this: "Consistency issues between query tables can happen when writes are applied to one table but not the other(s)." When writes are applied to one table but not the other(s), what happens?
So let's say that a new domain is added. Perhaps a scenario arises where the domain_details_table gets updated, but the id_table does not. Nothing wrong here on the database side. Except that when the application expects to find that domain_id in the id_table, but cannot.
In that case, maybe the application can retry using a secondary index on domain_details_table.domain_id. It won't be fast, but the decision to be made is more around which scenario is more preferable; no answer, or a slow answer? Again, application requirements come into play here.
For your point: "You can also have consistency issues within a table. Maybe something happens (node crashes, down longer than 3 hours, hints not replayed) during the write process." How does RDBMS(like MySQL) deal with this?
So the answer to this used to be simple. RDBMSs only run on a single server, so there's only one replica to keep in-sync. But today, most RDBMSs have HA solutions which can be used, and thus have to be kept in-sync. In that case (from what I understand), most of them will asynchronously update the secondary replica(s), while restricting traffic only to the primary.
It's also good to remember that RDBMSs enforce consistency through locking strategies, as well. So even a single-instance RDBMS will lock a data point during an update, blocking any reads until the lock is released.
In a node-down scenario, a single-instance RDBMS will be completely offline, so instead of inconsistent data you'd have data loss instead. In a HA RDBMS scenario, there would be a short pause (during which you would likely encounter connection/query failures) until it has failed-over to the new primary. Once the replica comes up, there would probably be additional time necessary to sync-up the replicas, until HA can be restored.

Isolation Level vs Optimistic Locking-Hibernate , JPA

I have a web application where I want to ensure concurrency with a DB level lock on the object I am trying to update. I want to make sure that a batch change or another user or process may not end up introducing inconsistency in the DB.
I see that isolation levels ensure read consistency, and an optimistic lock with a @Version field can ensure data is written in a consistent state.
My question is: can't we ensure consistency with the isolation level only? By making any transaction that updates the record Serializable (not considering performance), will I not ensure that a proper lock is taken by the transaction, and that any other transaction trying to update the record or acquire a lock on it will fail?
Do I really need version or timestamp management for this?
Depending on the isolation level you've chosen, a specific resource is going to be locked until the given transaction commits or rolls back. It can be a lock on a whole table, a row or a block. This is pessimistic locking, and it's ensured at the database level when running a transaction.
Optimistic locking, on the other hand, assumes that multiple transactions rarely interfere with each other, so no locks are required in this approach. It is an application-side check that uses the @Version attribute to establish whether the version of a record has changed between fetching it and attempting to update it.
It is reasonable to use the optimistic locking approach in web applications, as most operations span multiple HTTP requests. Usually you fetch some information from the database in one request and update it in another. It would be very expensive and unwise to keep transactions open, with locks on database resources, for that long. That's why we assume that nobody else is going to use the set of data we're working on: it's cheaper. If the assumption happens to be wrong and the version has been changed by someone else between requests, Hibernate won't update the row and will throw an OptimisticLockException. As a developer, you are responsible for managing this situation.
Simple example: an online auction service. You're viewing an item page, reading its description and specification. All of it takes, let's say, 5 minutes. With pessimistic locking and some isolation levels you'd block other users from this particular item page (or even from all of the items!). With optimistic locking everybody can access it. After reading about the item you decide to bid on it, so you click the proper button. If any other user viewing this item has changed its state in the meantime (the owner changed its description, someone else bid on it), you will probably (depending on the app implementation) be informed about the changes before the application accepts your bid, because the version you've got is not the same as the version persisted in the database.
Hope that clarifies a few things for you.
Unless we are talking about some small, isolated web application (only app that is working on a database), then making all of your transactions to be Serializable would mean having a lot of confidence in your design, not taking into account the fact that it may not be the only application hitting on that certain database.
In my opinion, adopting the Serializable isolation level (a pessimistic lock, in other words) should be a very well-thought-out decision, applied for:
Large databases and short transactions that update only a few rows
Where the chance that two concurrent transactions will modify the same rows is relatively low.
Where relatively long-running transactions are primarily read-only.
Based on my experience, in most of the cases using just the Optimistic Locking would be the most beneficial decision, as frequent concurrent modifications mostly happen in only small percentage of cases.
Optimistic locking definitely also helps other applications run faster (don't think only of yourself!).
So when we take the Pessimistic - Optimistic locking strategies spectrum, in my opinion the truth lies somewhere more towards the Optimistic locking with a flavor of serializable here and there.
I really cannot reference anything here, as the answer is based on my personal experience with many complex web projects and on my notes from when I was preparing for my JPA certificate.
Hope that helps.

Best practices of distributed transactions(java) [closed]

Closed. This question needs details or clarity. It is not currently accepting answers.
Closed 6 years ago.
Please share your experience with distributed transactions. What kind of frameworks (Java) would you advise using?
The practice of distributed transactions is far behind its theory. Only one of the approaches to distributed transactions (2PC) has a large variety of libraries and frameworks to choose from; the others you need to implement on your own. Unfortunately, 2PC is also the least advanced algorithm, so by choosing it you sacrifice partition tolerance and performance for convenience and speed of development.
Let's take a look of the other major algorithms in the area of the distributed transactions. All of them allow you to do transactions which span for multiple data sources.
Two-phase commit algorithm (2PC)
2PC is the most developed algorithm. It's the heart of the X/Open XA standard, which models generic 2PC-based distributed transaction processing and formalizes the interaction between clients, a coordinator and resources. XA means vendors don't have to integrate their solution with every other solution; they just follow the standard and get the integration for free. JTA is a Java interface to the X/Open XA model.
Some problems of 2PC come from the fact that the coordinator is a single point of failure. If it is down, the system is unavailable. If there is a network partition and the coordinator happens to be in a different partition than the clients and resources, the system is also unavailable.
Another problem of the algorithm is its blocking nature: once a resource has sent an agreement message to the coordinator, it will block until a commit or rollback is received. As a result the system can't use all the potential of the hardware it uses.
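To make the two phases concrete, here is a toy in-memory simulation. It is only a sketch: the `Participant` class and `two_phase_commit` function are hypothetical names, there is no network, no timeouts and no recovery log, and it does not use the XA or JTA APIs.

```python
class Participant:
    """A resource that can vote on, then commit or roll back, one transaction."""

    def __init__(self, name, can_commit=True):
        self.name = name
        self.can_commit = can_commit
        self.state = "idle"

    def prepare(self):
        # Phase 1: vote. After voting yes, a real participant must keep its
        # locks and wait for the coordinator's decision (the blocking part).
        self.state = "prepared" if self.can_commit else "aborted"
        return self.can_commit

    def commit(self):
        self.state = "committed"

    def rollback(self):
        self.state = "aborted"


def two_phase_commit(participants):
    """Coordinator: commit only if every participant voted yes in phase 1."""
    votes = [p.prepare() for p in participants]
    if all(votes):
        for p in participants:  # Phase 2: broadcast commit
            p.commit()
        return "committed"
    for p in participants:      # Phase 2: broadcast rollback
        p.rollback()
    return "aborted"


happy = [Participant("orders-db"), Participant("billing-db")]
happy_outcome = two_phase_commit(happy)

unhappy = [Participant("orders-db"), Participant("billing-db", can_commit=False)]
unhappy_outcome = two_phase_commit(unhappy)
```

If the coordinator disappeared between the two phases, every participant in the "prepared" state would be stuck holding its locks, which is the blocking behaviour described above.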
Percolator's transactions
Percolator's transactions are distributed serializable optimistic transactions. They were introduced in the Large-scale Incremental Processing Using Distributed Transactions and Notifications paper by Google and later were implemented in the Amazon's Transaction Library for DynamoDB and in the CockroachDB database.
Unlike 2PC Percolator's transactions:
don't require a coordinator, so they keep working during a network partition as long as the client executing the transaction and the resources it touches are in the same partition
use technique similar to the lock-free algorithms so they are non-blocking and better utilize hardware of the cluster
It's very handy that Percolator's transactions can be implemented on the client side. The only requirement is that the datasources must be linearizable and provide compare-and-set operation. The downside is that in the case of race the concurrent transactions can abort each other.
You can take a look at the Visualization of the Percolator's transaction to understand how they work.
RAMP transactions
RAMP transactions are distributed transactions with the Read Committed isolation level. They were introduced in the Scalable Atomic Visibility with RAMP Transactions paper by Peter Bailis. They are pretty new, so they haven't made it into any database yet, but there are rumors that Cassandra may support them. Facebook has also reported that it is working on the Apollo database, which uses Paxos for replication and CRDT & RAMP for cross-shard transactions.
Like 2PC, RAMP transactions require coordinator-like servers, but unlike in 2PC there can be any number of such servers, so there is no availability impact.
Just like Percolator's transactions, RAMP uses a non-blocking approach, and the relaxed isolation level helps it avoid contention issues and achieve incredible performance; see the paper for the details.
RAMP also has the same requirements to the storages as Percolator's transactions: linearizability and compare-and-set operations.
You can take a look at the Visualization of the RAMP transaction to understand how they work.
I would start e.g. here
https://en.wikipedia.org/wiki/Java_Transaction_API#Open_source_JTA_implementations
As I work in QA on the Narayana project, I would personally recommend http://narayana.io

Concurrent editing of same data

I recently came across a case that makes me wonder if I'm a newbie or if something trivial has escaped me.
Suppose I have software that is run by many users and uses a table. When a user logs in to the app, a series of information from the table appears, and he just has to add, work on, or correct some information and save it. Now, if the software is run by many people, how can I guarantee that he is the only one working with that particular record? I mean, how can I know the record is not selected and being worked on by 2 or more users at the same time? And please, I wouldn't like the answer to be “SELECT FOR UPDATE...”, because from what I've read it has too negative an impact on the database. Thanks to all of you. Keep up the good work.
This is something that is not solved primarily by the database. The database manages isolation and locking of "concurrent transactions". But when the records are sent to the client, you usually (and hopefully) close the transaction and start a new one when the data comes back.
So you have to care yourself.
There are different approaches, the ones that come into my mind are:
optimistic locking strategies (first wins)
pessimistic locking strategies
last wins
Optimistic locking: you check whether a record had been changed in the meanwhile when storing. Usually it does this by having a version counter or timestamp. Some ORMs and frameworks may help a little to implement this.
Pessimistic locking: build a mechanism that stores the information that someone started to edit something and do not allow someone else to edit the same. Especially in web projects it needs a timeout when the lock is released anyway.
Last wins: the second person storing the record just overwrites the first changes.
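The pessimistic variant with a timeout could look roughly like this. It is a sketch under assumptions: the `edit_locks` table, the `try_lock` function and the five-minute lifetime are all invented for illustration, and the current time is passed in explicitly so the behaviour is easy to follow.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE edit_locks ("
    " record_id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " expires_at REAL NOT NULL)"
)

LOCK_SECONDS = 300  # assumed lock lifetime


def try_lock(record_id, owner, now):
    """Return True if `owner` may edit the record, False if someone else holds it."""
    with conn:  # commits on success, rolls back on an unhandled error
        # Drop an expired lock so an abandoned edit cannot block others forever.
        conn.execute(
            "DELETE FROM edit_locks WHERE record_id = ? AND expires_at <= ?",
            (record_id, now),
        )
        try:
            conn.execute(
                "INSERT INTO edit_locks VALUES (?, ?, ?)",
                (record_id, owner, now + LOCK_SECONDS),
            )
            return True
        except sqlite3.IntegrityError:
            # The primary key already exists: somebody holds the lock.
            row = conn.execute(
                "SELECT owner FROM edit_locks WHERE record_id = ?", (record_id,)
            ).fetchone()
            return row[0] == owner


alice_first = try_lock(1, "alice", now=0.0)
bob_while_locked = try_lock(1, "bob", now=10.0)
bob_after_timeout = try_lock(1, "bob", now=301.0)
```

The primary key on `record_id` is what makes the claim atomic: two users cannot both insert a lock row for the same record.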
... makes me wonder if I'm a newbie ...
That's what happens always when we discover that very common stuff is still not solved by the tools and frameworks we use and we have to solve it over and over again.
Now, if the software he uses is run by many people, how can I guarantee that he is the only one working with that particular record?
Ah...
And please I wouldn't like the answer use “SELECT FOR UPDATE... “ because for
what I've read it has too negative impact on the database.
Who cares? I mean, it is the only way (keep a lock on a row) to guarantee you are the only one who can change it. Yes, this limits throughput, but then this is WHAT YOU WANT.
It is called programming: choosing the right tools for the job. In this case the impact is unavoidable because of the requirements.
The alternative - not a guarantee on the database but an application server - is an in memory or in database locking mechanism (like a table indicating what objects belong to what user).
But if you need to guarantee one record is only used by one person on db level, then you MUST keep a lock around and deal with the impact.
But seriously, most programs avoid this. They deal with it either with optimistic locking (the second user submitting changes gets an error) or with other programmer-level decisions, BECAUSE the cost of such guarantees is ridiculously high.
Oracle is different from SQL server.
In Oracle, when you update a record or data set, the old information is still available to other sessions until you commit, because Oracle keeps the pre-update version for readers.
Therefore anyone reading the same record will see the old result.
If another session wants write access to this record, though, it waits on a lock until you commit; only then can it write the same record.
When two sessions end up waiting on each other's locks, a deadlock occurs.
SQL Server, by default (that is, without row versioning such as READ_COMMITTED_SNAPSHOT enabled), cannot read a record that is locked for writing; therefore, depending on which query you're running, you might lock an entire table.
First you need to separate queries and insert/updates using a data-warehouse database. Which means you could solve slow performance in update that causes locks.
The next step is to identify what is causing locks and work out each case separately.
Rebuilding indexes during working hours could cause very nasty locks. Push them to after hours.

Optimistic vs. Pessimistic locking

I understand the differences between optimistic and pessimistic locking. Now, could someone explain to me when I would use either one in general?
And does the answer to this question change depending on whether or not I'm using a stored procedure to perform the query?
But just to check, optimistic means "don't lock the table while reading" and pessimistic means "lock the table while reading."
Optimistic Locking is a strategy where you read a record, take note of a version number (other methods involve dates, timestamps or checksums/hashes) and check that the version hasn't changed before you write the record back. When you write the record back, you filter the update on the version to make sure it's atomic (i.e. the record hasn't been updated between when you check the version and when you write the record to disk) and update the version in one hit.
If the record is dirty (i.e. different version to yours) you abort the transaction and the user can re-start it.
This strategy is most applicable to high-volume systems and three-tier architectures where you do not necessarily maintain a connection to the database for your session. In this situation the client cannot actually maintain database locks as the connections are taken from a pool and you may not be using the same connection from one access to the next.
Pessimistic Locking is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking but requires you to be careful with your application design to avoid Deadlocks. To use pessimistic locking you need either a direct connection to the database (as would typically be the case in a two tier client server application) or an externally available transaction ID that can be used independently of the connection.
In the latter case you open the transaction with the TxID and then reconnect using that ID. The DBMS maintains the locks and allows you to pick the session back up through the TxID. This is how distributed transactions using two-phase commit protocols (such as XA or COM+ Transactions) work.
When dealing with conflicts, you have two options:
You can try to avoid the conflict, and that's what Pessimistic Locking does.
Or, you could allow the conflict to occur, but you need to detect it upon committing your transactions, and that's what Optimistic Locking does.
Now, let's consider the following Lost Update anomaly:
The Lost Update anomaly can happen in the Read Committed isolation level.
In the diagram above we can see that Alice believes she can withdraw 40 from her account but does not realize that Bob has just changed the account balance, and now there are only 20 left in this account.
Pessimistic Locking
Pessimistic locking achieves this goal by taking a shared or read lock on the account so Bob is prevented from changing the account.
In the diagram above, both Alice and Bob will acquire a read lock on the account table row that both users have read. The database acquires these locks on SQL Server when using Repeatable Read or Serializable.
Because both Alice and Bob have read the account with the PK value of 1, neither of them can change it until one user releases the read lock. This is because a write operation requires a write/exclusive lock acquisition, and shared/read locks prevent write/exclusive locks.
Only after Alice has committed her transaction and the read lock has been released on the account row will Bob's UPDATE resume and apply the change. Until Alice releases the read lock, Bob's UPDATE blocks.
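The lock compatibility rule described above (shared locks coexist, an exclusive lock needs the row to itself) can be illustrated with a tiny in-memory lock table. This is a toy model, not how any database implements locking; `RowLocks` and its methods are hypothetical, and a refused request simply returns False instead of blocking.

```python
class RowLocks:
    """Toy lock table: 'S' = shared/read lock, 'X' = exclusive/write lock."""

    def __init__(self):
        self.holders = {}  # row key -> {transaction name: mode}

    def acquire(self, txn, key, mode):
        others = {t: m for t, m in self.holders.get(key, {}).items() if t != txn}
        if mode == "S":
            compatible = all(m == "S" for m in others.values())
        else:
            compatible = not others  # X is incompatible with any other holder
        if not compatible:
            return False  # a real database would make the caller wait here
        held = self.holders.setdefault(key, {})
        if held.get(txn) != "X":  # never downgrade an exclusive lock
            held[txn] = mode
        return True

    def release_all(self, txn):
        """Called at commit or rollback."""
        for held in self.holders.values():
            held.pop(txn, None)


locks = RowLocks()
alice_read = locks.acquire("alice", 1, "S")
bob_read = locks.acquire("bob", 1, "S")
bob_write_while_alice_reads = locks.acquire("bob", 1, "X")
locks.release_all("alice")  # Alice commits
bob_write_after_commit = locks.acquire("bob", 1, "X")
```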
Optimistic Locking
Optimistic Locking allows the conflict to occur but detects it upon applying Alice's UPDATE as the version has changed.
This time, we have an additional version column. The version column is incremented every time an UPDATE or DELETE is executed, and it is also used in the WHERE clause of the UPDATE and DELETE statements. For this to work, we need to issue the SELECT and read the current version prior to executing the UPDATE or DELETE, as otherwise, we would not know what version value to pass to the WHERE clause or to increment.
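A minimal sketch of the version check, using Python's sqlite3 module. The `account` table layout and the starting balance of 50 are assumptions chosen to fit the Alice and Bob story; the helper names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO account VALUES (1, 50, 1)")  # assumed starting balance
conn.commit()


def read_account(account_id):
    """Return (balance, version) as seen right now."""
    return conn.execute(
        "SELECT balance, version FROM account WHERE id = ?", (account_id,)
    ).fetchone()


def withdraw(account_id, amount, seen_balance, seen_version):
    """Apply the withdrawal only if nobody changed the row since we read it."""
    cur = conn.execute(
        "UPDATE account SET balance = ?, version = version + 1"
        " WHERE id = ? AND version = ?",
        (seen_balance - amount, account_id, seen_version),
    )
    conn.commit()
    return cur.rowcount == 1  # 0 rows touched means the version moved on


alice_balance, alice_version = read_account(1)
bob_balance, bob_version = read_account(1)

bob_ok = withdraw(1, 30, bob_balance, bob_version)        # succeeds, version becomes 2
alice_ok = withdraw(1, 40, alice_balance, alice_version)  # stale version, no rows updated
final_state = read_account(1)
```

Alice's UPDATE matches zero rows, so the application can detect the conflict from the row count instead of silently overwriting Bob's change.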
Application-level transactions
Relational database systems emerged in the late 70s and early 80s, when a client would typically connect to a mainframe via a terminal. That's why we still see database systems define terms such as SESSION setting.
Nowadays, over the Internet, we no longer execute reads and writes in the context of the same database transaction, and ACID is no longer sufficient.
For instance, consider the following use case:
Without optimistic locking, there is no way this Lost Update would have been caught even if the database transactions used Serializable. This is because reads and writes are executed in separate HTTP requests, hence on different database transactions.
So, optimistic locking can help you prevent Lost Updates even when using application-level transactions that incorporate the user-think time as well.
Conclusion
Optimistic locking is a very useful technique, and it works just fine even when using less-strict isolation levels, like Read Committed, or when reads and writes are executed in subsequent database transactions.
The downside of optimistic locking is that a rollback will be triggered by the data access framework upon catching an OptimisticLockException, therefore losing all the work we've done previously by the currently executing transaction.
The more contention, the more conflicts, and the greater the chance of aborting transactions. Rollbacks can be costly for the database system as it needs to revert all current pending changes which might involve both table rows and index records.
For this reason, pessimistic locking might be more suitable when conflicts happen frequently, as it reduces the chance of rolling back transactions.
Optimistic locking is used when you don't expect many collisions. It costs less to do a normal operation but if the collision DOES occur you would pay a higher price to resolve it as the transaction is aborted.
Pessimistic locking is used when a collision is anticipated. The transactions which would violate synchronization are simply blocked.
To select proper locking mechanism you have to estimate the amount of reads and writes and plan accordingly.
Optimistic assumes that nothing's going to change while you're reading it.
Pessimistic assumes that something will and so locks it.
If it's not essential that the data is perfectly read use optimistic. You might get the odd 'dirty' read - but it's far less likely to result in deadlocks and the like.
Most web applications are fine with dirty reads - on the rare occasion the data doesn't exactly tally the next reload does.
For exact data operations (like in many financial transactions) use pessimistic. It's essential that the data is accurately read, with no un-shown changes - the extra locking overhead is worth it.
Oh, and Microsoft SQL Server historically defaulted to page locking: basically the row you're reading and a few either side (recent versions default to row locks and escalate when needed). Row locking is more accurate but much slower. It's often worth setting your transactions to read-committed or no-lock to avoid deadlocks while reading.
I would think of one more case when pessimistic locking would be a better choice.
For optimistic locking every participant in data modification must agree in using this kind of locking. But if someone modifies the data without taking care about the version column, this will spoil the whole idea of the optimistic locking.
There are basically two most popular answers. The first one basically says
Optimistic needs a three-tier architectures where you do not necessarily maintain a connection to the database for your session whereas Pessimistic Locking is when you lock the record for your exclusive use until you have finished with it. It has much better integrity than optimistic locking you need either a direct connection to the database.
Another answer is
optimistic (versioning) is faster because of no locking but (pessimistic) locking performs better when contention is high and it is better to prevent the work rather than discard it and start over.
or
Optimistic locking works best when you have rare collisions
As it is put on this page.
I created my answer to explain how "keep connection" is related to "low collisions".
To understand which strategy is best for you, think not about the Transactions Per Second your DB handles but about the duration of a single transaction. Normally, you open a transaction, perform an operation and close the transaction. This is the short, classical transaction ANSI had in mind, and it is fine to get away with locking. But how do you implement a ticket reservation system where many clients reserve the same rooms/seats at the same time?
You browse the offers and fill in the form with lots of available options and current prices. It takes a lot of time, and the options can become obsolete and all the prices invalid between the moment you start filling in the form and the moment you press the "I agree" button, because there was no lock on the data you accessed, and somebody else, more agile, has interfered and changed all the prices, so you need to restart with the new prices.
You could lock all the options as you read them, instead. This is pessimistic scenario. You see why it sucks. Your system can be brought down by a single clown who simply starts a reservation and goes smoking. Nobody can reserve anything before he finishes. Your cash flow drops to zero. That is why, optimistic reservations are used in reality. Those who dawdle too long have to restart their reservation at higher prices.
In this optimistic approach you have to record all the data that you read (as in my Repeatable Read) and come to the commit point with your version of the data (I want to buy shares at the price you displayed in this quote, not the current price). At this point, an ANSI transaction is created, which locks the DB, checks that nothing has changed and commits or aborts your operation. IMO, this is an effective emulation of MVCC, which is also associated with optimistic concurrency control and also assumes that your transaction restarts in case of abort, that is, you will make a new reservation. A transaction here involves a human user's decisions.
I am far from understanding how to implement the MVCC manually but I think that long-running transactions with option of restart is the key to understanding the subject. Correct me if I am wrong anywhere. My answer was motivated by this Alex Kuznecov chapter.
In most cases, optimistic locking is more efficient and offers higher performance. When choosing between pessimistic and optimistic locking, consider the following:
Pessimistic locking is useful if there are a lot of updates and relatively high chances of users trying to update data at the same time. For example, if each operation can update a large number of records at a time (the bank might add interest earnings to every account at the end of each month), and two applications are running such operations at the same time, they will have conflicts.
Pessimistic locking is also more appropriate in applications that contain small tables that are frequently updated. In the case of these so-called hotspots, conflicts are so probable that optimistic locking wastes effort in rolling back conflicting transactions.
Optimistic locking is useful if the possibility for conflicts is very low: there are many records but relatively few users, or very few updates and mostly read-type operations.
One use case for optimistic locking is to have your application use the database to allow one of your threads / hosts to 'claim' a task. This is a technique that has come in handy for me on a regular basis.
The best example I can think of is a task queue implemented using a database, with multiple threads claiming tasks concurrently. If a task has a status of 'Available', 'Claimed', or 'Completed', a db query can say something like "Set status='Claimed' where status='Available'". If multiple threads try to change the status in this way, all but the first thread will fail because the row is no longer 'Available' by the time their update runs.
Note that this is a use case involving only optimistic locking. So as an alternative to saying "Optimistic locking is used when you don't expect many collisions", it can also be used where you expect collisions but want exactly one transaction to succeed.
A lot of good things have been said above about optimistic and pessimistic locking.
One important point to consider is as follows:
When using optimistic locking, we need to be careful about how the application will recover from these failures.
Especially in asynchronous, message-driven architectures, this can lead to out-of-order message processing or lost updates.
Failure scenarios need to be thought through.
Let's say in an ecommerce app, a user wants to place an order. This code will get executed by multiple threads. In pessimistic locking, when we get the data from the DB, we lock it so no other thread can modify it. We process the data, update the data, and then commit the data. After that, we release the lock. The locking duration is long here: we have locked the database record from the beginning until the commit.
In optimistic locking, we get the data and process it without locking, so multiple threads can execute the code concurrently up to this point. This speeds things up. We lock the data only while we update it, and we have to verify that no other thread has updated that record in the meantime. For example, if we had 100 items in inventory and we have to update it to 99 (because your code might be quantity=quantity-1), but another thread has already used 1, the result should be 98. We have a race condition here. In this case, we restart the thread so that we execute the same code from the beginning. But this is an expensive operation: you had already come to the end, and then you restart. If we had only a few race conditions, that would not be a big deal. If race conditions were frequent, there would be a lot of threads to restart, and we might keep running in a loop. If race conditions are frequent, we should be using pessimistic locking.
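The restart-on-conflict behaviour for the inventory example could be sketched like this with Python's built-in sqlite3 module. The `version` column, the `buy_one` helper, and the `between_read_and_write` hook are assumptions introduced for the illustration; the hook only exists to simulate another thread sneaking in between our read and our write.

```python
import sqlite3

# Hypothetical schema with a version counter used for conflict detection.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE inventory (item TEXT PRIMARY KEY, quantity INTEGER, version INTEGER)"
)
conn.execute("INSERT INTO inventory VALUES ('widget', 100, 0)")
conn.commit()

def buy_one(conn, item, max_attempts=5, between_read_and_write=None):
    """Decrement the stock optimistically; return the number of attempts needed."""
    for attempt in range(1, max_attempts + 1):
        # Read without locking and remember the version we saw.
        quantity, version = conn.execute(
            "SELECT quantity, version FROM inventory WHERE item = ?", (item,)
        ).fetchone()
        if quantity < 1:
            raise ValueError("out of stock")
        if between_read_and_write is not None:
            between_read_and_write(attempt)  # hook simulating a concurrent writer
        with conn:
            cur = conn.execute(
                "UPDATE inventory SET quantity = ?, version = version + 1 "
                "WHERE item = ? AND version = ?",
                (quantity - 1, item, version),
            )
        if cur.rowcount == 1:
            return attempt
        # Someone else changed the row: start again from the read.
    raise RuntimeError("too much contention; consider pessimistic locking")

def rival(attempt):
    # Simulated competing buyer that takes one item during our first attempt only.
    if attempt == 1:
        with conn:
            conn.execute(
                "UPDATE inventory SET quantity = quantity - 1, version = version + 1 "
                "WHERE item = 'widget'"
            )

attempts_used = buy_one(conn, "widget", between_read_and_write=rival)
remaining = conn.execute(
    "SELECT quantity FROM inventory WHERE item = 'widget'"
).fetchone()[0]
```

Here the first attempt fails because the rival bumped the version, and the second attempt succeeds, leaving 98 items. The `max_attempts` cap reflects the point above: if retries pile up, that is a sign the workload is contended enough to prefer pessimistic locking.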
Optimistic locking means an exclusive lock is not taken when reading a row, so a lost update or write skew is not prevented by the lock itself. So, use optimistic locking:
If a lost update or write skew doesn't occur.
Or, if there are no problems even if a lost update or write skew occurs.
Pessimistic locking means an exclusive lock is taken when reading a row, so a lost update or write skew is prevented. So, use pessimistic locking:
If a lost update or write skew occurs.
Or, if there are some problems when a lost update or write skew occurs.
In MySQL and PostgreSQL, you can take an exclusive lock with SELECT FOR UPDATE.
You can check my answer with the lost update and write skew examples using optimistic locking (without SELECT FOR UPDATE) and pessimistic locking (with SELECT FOR UPDATE) in MySQL.
On a more practical note, when updating a distributed system, optimistic locking in the DB may be inadequate to provide the consistency needed across all parts of the distributed system.
For example, in applications built on AWS, it is common to have data in both a DB (e.g. DynamoDB) and a storage service (e.g. S3). If an update touches both DynamoDB and S3, optimistic locking in DynamoDB could still leave the data in S3 inconsistent. In cases like this, it is probably safer to use a pessimistic lock that is held in DynamoDB until the S3 update is finished. In fact, AWS provides a locking library for this purpose.
Optimistic locking and Pessimistic locking are two models for locking data in a database.
Optimistic locking: a record is locked only when changes are committed to the database.
Pessimistic locking: a record is locked while it is being edited.
Note: In both data-locking models, the lock is released after the changes are committed to the database.

Resources