In a database, we would not like the table to be dropped while we are modifying a row in it. In my understanding, a read lock on the table plus a write lock on the row should be enough when writing a row (given that a write lock on the table is needed to drop it). Why do we need an intent lock in this case? Many databases seem to use intent locks, which confuses me very much. I think pthread_rwlock should be enough.
I read here that they only exist for performance. Imagine you want to drop a table: you would have to check every row to see whether it is locked, which would be time-consuming, and you would have to lock every row that you checked.
Here's a quote from the blog post:
From a technical perspective the Intent Locks are not really needed by
SQL Server. They have to do with performance optimization. Let's have
a look at that in more detail. With an Intent Lock SQL Server just
indicates at a higher level within the Lock Hierarchy that you have
acquired a Lock somewhere else. An Intent Shared Lock tells SQL Server
that there is a Shared Lock somewhere else. An Intent Update or Intent
Exclusive Lock does the same, but this time SQL Server knows that
there is an Update Lock or an Exclusive Lock somewhere. It is just an
indication, nothing more.
But how does that indication help SQL Server with performance
optimization? Imagine you want to acquire an Exclusive Lock at the
table level. In that case, SQL Server has to know if there is an
incompatible lock (like a Shared or Update Lock) somewhere else on a
record. Without Intent Locks SQL Server would have to check every
record to see if an incompatible lock has been granted.
But with an Intent Shared Lock on the table level, SQL Server knows
immediately that a Shared Lock has been granted somewhere else, and
therefore an Exclusive Lock can’t be granted at the table level.
That’s the whole reason why Intent Locks exist in SQL Server: to allow
efficient checking if an incompatible lock exists somewhere within the
Lock Hierarchy. Quite easy, isn’t it?
read lock on table + a write lock on row
This would break the meaning of the read lock on the table.
Assume a concurrent SELECT operation, which expects the table to stay unmodified during its execution. That operation will take a read lock on the table ... and in your scheme it will succeed. This is bad, because the table is in fact being modified while the row is modified.
Instead, the following lock combination is used to modify a row in the table:
IX(Intent eXclusive) on table + X(eXclusive, similar to "write lock") on row
This combination is compatible with the modification of another row (that is, the two can run concurrently), but it is incompatible with
S (Share, similar to a "read lock") on the table
used by SELECT.
A lock compatibility table can be found, e.g., on Wikipedia.
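The compatibility rules above can be sketched as a small matrix. This is an illustrative sketch using the common IS/IX/S/X mode names, not any particular database's implementation:

```python
# Hypothetical sketch of a multigranularity lock compatibility check.
# Mode names (IS, IX, S, X) follow the common convention.

COMPATIBLE = {
    ("IS", "IS"): True,  ("IS", "IX"): True,  ("IS", "S"): True,  ("IS", "X"): False,
    ("IX", "IS"): True,  ("IX", "IX"): True,  ("IX", "S"): False, ("IX", "X"): False,
    ("S",  "IS"): True,  ("S",  "IX"): False, ("S",  "S"): True,  ("S",  "X"): False,
    ("X",  "IS"): False, ("X",  "IX"): False, ("X",  "S"): False, ("X",  "X"): False,
}

def can_grant(requested: str, held: list[str]) -> bool:
    """A table-level lock can be granted only if it is compatible
    with every lock already held on the table."""
    return all(COMPATIBLE[(requested, h)] for h in held)

# A writer of one row holds IX on the table; another row writer is fine:
assert can_grant("IX", ["IX"])
# ...but a table-level S (e.g. a full-table SELECT) must wait:
assert not can_grant("S", ["IX"])
# And with an IS held, a table-level X is refused immediately,
# without inspecting a single row:
assert not can_grant("X", ["IS"])
```

The point of the matrix is the last check: one lookup at the table level replaces a scan of every row lock.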
One of the conclusions today is that an intent lock can lock a parent node and all its child nodes in read-only mode in a cheaper and safer way.
Take making a table read-only as an example: how would we lock it with only S and X modes?
If we lock the table in S mode, users can still modify rows via S (table) + X (row). To prevent that, we would need to lock every row in S mode to make sure no row gets updated. The cost is huge, and there is a bug: users can still insert new rows. Too expensive, and not safe.
If we lock the table in X mode, how can others read the rows (S on table + S on row)? They can't, since MODE_X on the table blocks MODE_S on the table. That's not read-only.
The right solution with intent locks is:
Lock the table in MODE_S. That's all!
Any intention to modify a row must take a MODE_IX lock on the table, and that is blocked by MODE_S. The solution is cheap, efficient, and safe!
I have a Node.js program which uses Sequelize to create tables and insert data.
In the future we are going to run multiple instances of the program, and we don't want several instances to read from the table during startup: only one instance should do the setup work if required, and the other instances shouldn't get any access to the table until the first instance has completed its work.
I have looked at transaction locking - both shared and exclusive - but both of them seem to allow reads on the tables, which I don't want.
My requirement is specifically that once a transaction acquires a lock on a table, no other transaction should be able to read from that table until the first one has completed its work. How can I do this?
In MySQL use LOCK TABLES to lock an entire table.
In PostgreSQL, LOCK TABLE whatever IN ACCESS EXCLUSIVE MODE; does the trick (plain EXCLUSIVE mode would still allow reads).
For best results have your app, when starting, look for a particular table. Do something simple and fast such as SELECT id FROM whatever LIMIT 1; to probe whether the table exists. If your app gets an exception because the table isn't there, then do
CREATE TABLE whatever ....;
LOCK TABLES whatever WRITE;
from the app creating the table. It blocks access to the table from all instances of your app except the one that gets the LOCK.
Once your table is locked, the initial SELECT I suggested will block for other clients. There's a possible race condition if two clients try to create the table more-or-less concurrently. But the extra CREATE TABLE will throw an exception.
Note: if you LOCK more than one table, and it's possible to run the code from more than one instance of the app, always always lock the tables in the same order, or you have the potential for a deadlock.
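The probe-then-lock startup flow described above can be sketched in Python with SQLite as a stand-in: BEGIN EXCLUSIVE plays the role of LOCK TABLES / LOCK TABLE, and the table name setup_done is made up for the illustration.

```python
import os
import sqlite3
import tempfile

def startup(db_path: str) -> bool:
    """Return True if this instance performed the one-time setup.

    SQLite's BEGIN EXCLUSIVE takes a database-wide write lock here,
    standing in for MySQL's LOCK TABLES / PostgreSQL's LOCK TABLE."""
    conn = sqlite3.connect(db_path, isolation_level=None, timeout=30)
    try:
        conn.execute("BEGIN EXCLUSIVE")  # other instances block here
        try:
            conn.execute("SELECT id FROM setup_done LIMIT 1")  # probe
            did_setup = False  # table exists: setup already done
        except sqlite3.OperationalError:  # no such table: we go first
            conn.execute("CREATE TABLE setup_done (id INTEGER PRIMARY KEY)")
            conn.execute("INSERT INTO setup_done VALUES (1)")
            did_setup = True
        conn.execute("COMMIT")  # releases the lock
        return did_setup
    finally:
        conn.close()

path = os.path.join(tempfile.mkdtemp(), "app.db")
assert startup(path) is True   # first instance does the setup
assert startup(path) is False  # later instances find it already done
```

The structure transfers directly to MySQL/PostgreSQL: take the exclusive lock first, probe, create if missing, commit to release.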
As documented in the manual the statement to lock a table is, LOCK TABLE ...
If you lock a table in exclusive mode, then no other access is allowed - not even a SELECT. Exclusive mode is the default:
If no lock mode is specified, then ACCESS EXCLUSIVE, the most restrictive mode, is used.
The manual explains the different lock modes:
ACCESS EXCLUSIVE
This mode guarantees that the holder is the only transaction accessing the table in any way.
Assume that SQL Server receives both a SELECT and an UPDATE statement for the same table at the same time, from different threads and connections.
Does either of them get prioritized?
I know that SELECT statements are delayed until the UPDATE completes if the table is already locked for the update (UPDATE statements lock the table by default - or am I incorrect?). If the table lock is held for a long time by the UPDATE, the SELECT statement fails with a lock timeout error.
So what happens when both are received at the same time?
A SELECT statement will place a shared lock (S) on any rows it's reading - depending on the isolation levels, that lock will be held for various amounts of time. In the default READ COMMITTED isolation level, the lock is held only while actually reading the row - once it's read, the lock is released right away.
The shared lock is compatible with other shared locks - so any number of SELECT statements can read the same rows simultaneously.
The UPDATE statement will place an update (U) lock on the row it wants to update, to read the existing values. Then, after that's done, before the actual updated values are written back, the lock is converted into an exclusive (X) lock for the time the data is written. Those locks are held until the transaction they're executing in is committed (or rolled back).
An update lock is not compatible with another update lock, nor with an exclusive lock. It is compatible with a shared lock however - so if the UPDATE statement is currently only reading the existing values, another transaction might read those same values using a SELECT statement with a shared lock.
An exclusive lock is incompatible with anything - you cannot even read the row anymore while an X lock is on it.
So if you have two statements that come in and try to access the same row, then:
if the SELECT comes first, it will place an S lock on the row, read it, and typically release that lock right away
at the same time, the UPDATE statement can place a U lock on the row and read the existing values; the "promotion" of that lock to X is not possible until the S lock has been released - if that never happens, the UPDATE statement will wait and eventually time out
if the UPDATE comes first, it will place a U lock on the row to read the existing values
at the same time, another transaction could be placing an S lock on the row to read it
and again: the UPDATE statement can only progress to the X level to write back the new values once the S lock is gone - otherwise it will time out
if the UPDATE comes first and has already converted its U lock to an X lock to actually perform the update
then at that point no other transaction can even read the row - it will have to wait (or time out, if it takes too long to be serviced)
Read SQL Server Transaction Locking and Row Versioning Guide for a more in-depth overview of the topic and more details
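The S/U/X compatibility rules described above can be summarized in a small sketch (illustrative only, not SQL Server's actual implementation):

```python
# S = shared, U = update, X = exclusive (SQL Server-style row locks).
# Illustrative encoding of the compatibility rules described above.
COMPATIBLE = {
    ("S", "S"): True,  ("S", "U"): True,  ("S", "X"): False,
    ("U", "S"): True,  ("U", "U"): False, ("U", "X"): False,
    ("X", "S"): False, ("X", "U"): False, ("X", "X"): False,
}

# Any number of readers can share a row:
assert COMPATIBLE[("S", "S")]
# A reader can coexist with an UPDATE still in its reading phase:
assert COMPATIBLE[("S", "U")]
# ...but two UPDATEs cannot both hold U on the same row:
assert not COMPATIBLE[("U", "U")]
# Once the UPDATE converts to X, even readers are blocked:
assert not COMPATIBLE[("S", "X")]
```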
I am issuing the following query with an UPDLOCK applied:
select @local_var = Column
from table (UPDLOCK)
where OtherColumn = @parameter
Multiple connections hit this routine, which is used inside a stored procedure to compute a unique id. Once the lock is acquired we compute the next id, update the value in the row, and commit. This is done because the client has a specific formatting requirement for certain object IDs in their system.
The UPDLOCK locks the correct row and blocks the other processes, but every now and then we get a duplicate id. It seems the local variable is assigned the current value before the row is locked. I had assumed that the lock would be obtained before the SELECT portion of the statement was processed.
I am using SQLServer 2012 and the isolation level is set to read committed.
If there is other information required, just let me know. Or if I am doing something obviously stupid, that information is also welcome.
From the SQL Server documentation on UPDLOCK:
Use update locks instead of shared locks while reading a table, and hold locks until the end of the statement or transaction. UPDLOCK has the advantage of allowing you to read data (without blocking other readers) and update it later with the assurance that the data has not changed since you last read it.
That means that other processes can still read the values.
Try using XLOCK instead, that will lock other reads out as well.
I think the issue is that your lock is only being held during this Select.
So once your Stored Proc has the Value, it releases the Lock, BEFORE it goes on to update the id (or insert a new row or whatever).
This means that another query running in Parallel is able to Query for the same value and then Update/Insert the same row.
You should additionally add a HOLDLOCK to your WITH hint so that the lock is held until the end of the transaction rather than just the statement.
This is treated quite well in this Answer
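The fix - holding a write-intent lock from the moment the value is read until the transaction commits - can be sketched with SQLite, where BEGIN IMMEDIATE plays the role that UPDLOCK + HOLDLOCK play in T-SQL (the table and column names are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE id_seq (name TEXT PRIMARY KEY, next_id INTEGER)")
conn.execute("INSERT INTO id_seq VALUES ('order', 1)")

def next_id(conn: sqlite3.Connection, name: str) -> int:
    """Read and increment a counter atomically. BEGIN IMMEDIATE takes the
    write lock *before* the SELECT, so no other transaction can read the
    same value in between."""
    conn.execute("BEGIN IMMEDIATE")
    (value,) = conn.execute(
        "SELECT next_id FROM id_seq WHERE name = ?", (name,)).fetchone()
    conn.execute(
        "UPDATE id_seq SET next_id = next_id + 1 WHERE name = ?", (name,))
    conn.execute("COMMIT")  # lock is held until here
    return value

assert next_id(conn, "order") == 1
assert next_id(conn, "order") == 2  # monotonic, no duplicates
```

The key design point is the same in both systems: the lock must be acquired before the read, not between the read and the write.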
When I try to insert/update something in a DB table, will Oracle lock the whole table or only the row being inserted/updated?
Is this something that can be controlled through external configuration?
We can issue locks explicitly with the LOCK TABLE command.
Otherwise, an insert does not lock any other rows. Because of Oracle's read isolation model that row only exists in our session until we commit it, so nobody else can do anything with it.
An update statement only locks the affected rows. Unless we have implemented a pessimistic locking strategy with SELECT ... FOR UPDATE.
Finally, in Oracle writers do not block readers. So even locked rows can be read by other sessions, they just can't be changed.
This behaviour is baked into the Oracle kernel, and is not configurable.
Justin makes a good point about the table-level DDL lock. That lock will cause a session executing DDL on the table to wait until the DML session commits, unless the DDL is something like CREATE INDEX in which case it will fail immediately with ORA-00054.
It depends what you mean by "lock".
For 99.9% of what people are likely to care about, Oracle will acquire a row-level lock when a row is modified. The row-level lock still allows readers to read the row (because of multi-version read consistency, writers never block readers and readers never do dirty reads).
If you poke around v$lock, you'll see that updating a row also takes out a lock on the table. But that lock only prevents another session from doing DDL on the table. Since you'd virtually never want to do DDL on an active table in the first place, that generally isn't something that would actually cause another session to wait for the lock.
When a regular DML statement is executed (UPDATE, DELETE, INSERT, MERGE, or SELECT ... FOR UPDATE), Oracle obtains two locks.
Row-level lock (TX) - a lock on the particular row being touched; any other transaction attempting to modify the same row is blocked until the one already owning the lock finishes.
Table-level lock (TM) - when a row lock (TX) is obtained, an additional table lock is also obtained to prevent any DDL operations from occurring while the DML is in progress.
What matters, though, is the mode in which the table lock is obtained.
A row share lock (RS), also called a subshare table lock (SS), indicates that the transaction holding the lock on the table has locked rows in the table and intends to update them. An SS lock is the least restrictive mode of table lock, offering the highest degree of concurrency for a table.
A row exclusive lock (RX), also called a subexclusive table lock (SX), indicates that the transaction holding the lock has updated table rows or issued SELECT ... FOR UPDATE. An SX lock allows other transactions to query, insert, update, delete, or lock rows concurrently in the same table. Therefore, SX locks allow multiple transactions to obtain simultaneous SX and SS locks for the same table.
A share table lock (S) held by one transaction allows other transactions to query the table (without using SELECT ... FOR UPDATE) but allows updates only if a single transaction holds the share table lock. Multiple transactions may hold a share table lock concurrently, so holding this lock is not sufficient to ensure that a transaction can modify the table.
A share row exclusive table lock (SRX), also called a share-subexclusive table lock (SSX), is more restrictive than a share table lock. Only one transaction at a time can acquire an SSX lock on a given table. An SSX lock held by a transaction allows other transactions to query the table (except for SELECT ... FOR UPDATE) but not to update the table.
An exclusive table lock (X) is the most restrictive mode of table lock, allowing the transaction that holds the lock exclusive write access to the table. Only one transaction can obtain an X lock for a table.
You should probably read the Oracle Concepts manual regarding locking.
For standard DML operations (insert, update, delete, merge), Oracle takes a shared DML (type TM) lock.
This allows other DMLs on the table to occur concurrently (it is a share lock.)
Rows that are modified by an update or delete DML operation and are not yet committed will have an exclusive row lock (type TX). Another DML operation in another session/transaction can operate on the table, but if it modifies the same row it will block until the holder of the row lock releases it by either committing or rolling back.
Parallel DML operations and serial insert direct load operations take exclusive table locks.
In my application I have a couple of threads that execute some logic.
At the end, they add a new row to some table.
Before adding the new row, they check whether a previous entry with the same details already exists. If one is found, they update it instead of adding a new row.
The problem: thread A does the check and sees that no previous entity with the same details exists, but just before it adds the new row, thread B searches the DB for the same entity. Thread B also sees that no such entity exists, so it adds a new row too.
The result is two rows with the same data in the table.
Note: no table key is violated, because each thread gets the next sequence value just before adding the row, and the table key is an ID that is not related to the data.
Even if I change the table key to a combination of the data columns, that would prevent two rows with the same data, but it would cause a DB error when the second thread tries to add its row.
Thank you in advance for the help, Roy.
You should be using a queue, possibly a blocking queue. Threads A and B (producers) would add objects to the queue, and another thread C (consumer) would poll the queue, remove the oldest object, and persist it to the DB. This prevents the problem of A and B trying to persist equal objects at the same time.
You speak of "rows" so presumably this is a SQL database?
If so, why not just use transactions?
(Unless the threads are sharing a database connection, in which case a mutex might help, but I would prefer to give each thread a separate connection.)
I would recommend avoiding locking in the client layer. synchronized only works within one process; later you may scale so that your threads run across several JVMs, or indeed several machines.
I would enforce uniqueness in the DB. As you suggest, this will cause an exception for the second inserter; catch that exception and do an update if that's the business logic you need.
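That insert-then-handle-the-violation flow can be sketched with sqlite3 (the table and columns are made up for illustration):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (data TEXT UNIQUE, counter INTEGER)")

def upsert(conn: sqlite3.Connection, data: str) -> None:
    """Try to insert; if the UNIQUE constraint fires because another
    thread/process got there first, fall back to an update."""
    try:
        with conn:  # commits on success, rolls back on exception
            conn.execute(
                "INSERT INTO items (data, counter) VALUES (?, 1)", (data,))
    except sqlite3.IntegrityError:  # the DB enforced uniqueness for us
        with conn:
            conn.execute(
                "UPDATE items SET counter = counter + 1 WHERE data = ?",
                (data,))

upsert(conn, "abc")
upsert(conn, "abc")  # second call hits the constraint and updates instead
row = conn.execute("SELECT counter FROM items WHERE data = 'abc'").fetchone()
assert row[0] == 2
```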
But consider this argument:
Sometimes either of the following sequences may occur:
A inserts values VA, then B updates them to values VB.
B inserts VB, then A updates to VA.
If the two threads are racing, either outcome (VA or VB) is equally valid. So you can't distinguish the second case from "A inserts VA and B just fails"!
So in fact there may be no need for the "fail and then update" case.
I think this is a job for SQL constraints, namely UNIQUE on the set of columns that hold the data, plus the appropriate error handling.
Most database frameworks (Hibernate in Java, ActiveRecord etc. in Ruby) have a form of optimistic locking. What this means is that you execute each operation on the assumption that it will work without conflict. In the rare case where there is a conflict, you check this atomically at the point where you do the database operation, throw an exception or return an error code, and retry the operation in your client code after re-querying.
This is usually implemented using a version number on each record. When a database operation is done, the row is read (including the version number), the client code updates the data, then saves it back to the database with a where clause specifying the primary key ID AND the version number being the same as it was when it was read. If it is different - this means another process has updated the row, and the operation should be retried. Usually this means re-reading the record, and doing that operation again on it with the new data from the other process.
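A minimal sketch of the version-number scheme using sqlite3 (the schema is made up; frameworks like Hibernate generate the equivalent SQL for you):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (id INTEGER PRIMARY KEY, "
    "balance INTEGER, version INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100, 0)")
conn.commit()

def add_to_balance(conn: sqlite3.Connection, account_id: int, amount: int) -> None:
    """Optimistic update: retry whenever another writer bumped the
    version between our read and our write."""
    while True:
        balance, version = conn.execute(
            "SELECT balance, version FROM account WHERE id = ?",
            (account_id,)).fetchone()
        # The WHERE clause checks the primary key AND the version we read:
        cur = conn.execute(
            "UPDATE account SET balance = ?, version = version + 1 "
            "WHERE id = ? AND version = ?",
            (balance + amount, account_id, version))
        conn.commit()
        if cur.rowcount == 1:  # our version matched: the write stuck
            return             # rowcount == 0 means conflict: retry

add_to_balance(conn, 1, 50)
assert conn.execute(
    "SELECT balance FROM account WHERE id = 1").fetchone()[0] == 150
```

The rowcount check is the whole trick: a zero rowcount means some other transaction changed the row after we read it, so we loop and re-read.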
In the case of adding, you would also want a unique index on the table, so the database refuses the operation, and you can handle that in the same code.
Pseudo code would look something like
do {
    read row from database
    if no row {
        result_code = insert new row with data
    } else {
        result_code = update row with data
    }
} while result_code == conflict_code
The benefit of this is that you don't need complicated synchronization/locking in your client code - each thread just executes in isolation and uses the database as the consistency check (something the database is very quick and good at). Because you're not locking a shared resource for every operation, the code can run much faster.
It also means that you can run multiple separate operating system processes to split the load and/or scale the operation over multiple servers as well without any code changes to handle conflicts.
You need to wrap the calls to check and write the row in a critical section or mutex.
With a critical section, only one thread at a time can execute the check-and-write sequence, so both threads can't write at once.
With a mutex, the first thread would lock the mutex, perform its operations, then unlock the mutex. The second thread would attempt to do the same but the mutex lock would block until the first thread released the mutex.
Specific implementations of critical section or mutex functionality would depend on your platform.
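For example, in Python the check-then-write pair can be guarded with a threading.Lock; the dict here stands in for the table, purely for illustration:

```python
import threading

table = {}                      # stands in for the DB table: data -> row
table_lock = threading.Lock()   # the mutex guarding check-then-write

def add_or_update(data: str) -> None:
    """Check-then-write under a mutex: only one thread at a time can be
    between the existence check and the write, so no duplicate rows."""
    with table_lock:
        if data in table:
            table[data]["count"] += 1      # row exists: update it
        else:
            table[data] = {"count": 1}     # no row yet: insert one

threads = [threading.Thread(target=add_or_update, args=("abc",))
           for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

# All ten racing threads funneled through the lock: one row, count 10.
assert table == {"abc": {"count": 10}}
```

Note that this only protects threads within one process; as other answers point out, once you have multiple processes or machines you need the database itself to enforce the constraint.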
You need to perform the act of checking for existing rows and then updating / adding rows inside a single transaction.
When you perform your check you should also acquire an update lock on those records, to indicate that you are going to write to the database based on the information that you have just read, and that no-one else should be allowed to change it.
In pseudo T-SQL (for Microsoft SQL Server):
BEGIN TRANSACTION
SELECT id FROM MyTable WITH (UPDLOCK) WHERE SomeColumn = @SomeValue
-- Perform your update here
COMMIT TRANSACTION
The update lock won't prevent people from reading those records, but it will prevent them from writing anything which might change the output of your SELECT.
Multithreading is always mind-bending ^^.
The main thing to do is to delimit the critical resources and critical operations.
Critical resource: your table.
Critical operation: not just the add, but the whole check-then-add procedure.
You need to lock access to your table from the beginning of the check until the end of the add.
If a thread attempts to do the same while another is checking/adding, it waits until that thread finishes its operation. As simple as that.