We are using a clustered columnstore index on our transaction table holding order fulfillments. This table is regularly updated by different sessions, but each session is specific to an order job number, so they never try to update the same row at the same time. Even so, we are facing deadlocks between sessions in the scenarios below:
Row group locking & Page lock
Row group locking & Row group locking
This is not specific to a single stored procedure. It is caused by multiple stored procedures updating this table, one after another, as part of order fulfillment.
The sample schema of the table is very simple:
CREATE TABLE OrderFulfillments
(
OrderJobNumber INT NOT NULL,
FulfilledIndividualID BIGINT NOT NULL,
IsIndividualSuppressed BIT NOT NULL,
SuppressionReason VARCHAR(100) NULL
)
I have included a sample deadlock graph for reference. What approach can I take to avoid this deadlock situation? We need the clustered columnstore index on this table, because we run aggregation queries to see how many times an individual has already been fulfilled; without the columnstore index those queries would likely be slower.
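For reference, the clustered columnstore index itself would be created with something along these lines (the index name is just a placeholder, not the real one):

-- Placeholder index name; the table definition is as shown above.
CREATE CLUSTERED COLUMNSTORE INDEX CCI_OrderFulfillments
    ON dbo.OrderFulfillments;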
In my case, the deadlocks were caused by lock escalation: some of the fulfillments were very large (in the tens of thousands or even 100k range), which caused locks to escalate to the rowgroup level and, in some cases, to the page level.
I solved this by creating a temporary table at the very beginning of the transaction, performing the updates against the temporary table, and only at the end inserting that job's fulfillment rows into OrderFulfillments. OrderFulfillments is still read while populating the temporary table (to see how many times an individual has already been fulfilled), but those reads take only shared locks, not exclusive locks.
By using a temporary table, every session works on its own copy, and the concurrency issues are resolved.
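A rough sketch of that pattern (the temp table and the population step are illustrative assumptions, not my actual procedure code):

-- Sketch only: stage one job's rows in a session-private temp table, then insert once at the end.
-- Create the temp table with the same shape as OrderFulfillments (structure only, no rows).
SELECT OrderJobNumber, FulfilledIndividualID, IsIndividualSuppressed, SuppressionReason
INTO   #JobFulfillments
FROM   dbo.OrderFulfillments
WHERE  1 = 0;

-- ... populate and update #JobFulfillments for this job here; reads of OrderFulfillments
-- (e.g. counting how many times an individual was already fulfilled) take only shared locks ...

-- Single final insert into the shared table.
INSERT INTO dbo.OrderFulfillments (OrderJobNumber, FulfilledIndividualID, IsIndividualSuppressed, SuppressionReason)
SELECT OrderJobNumber, FulfilledIndividualID, IsIndividualSuppressed, SuppressionReason
FROM   #JobFulfillments;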
You're assuming NOLOCK is the same as no locking... that is incorrect.
NOLOCK is equivalent to READUNCOMMITTED.
READUNCOMMITTED and NOLOCK hints apply only to data locks. All queries, including those with READUNCOMMITTED and NOLOCK hints, acquire Sch-S (schema stability) locks during compilation and execution. Because of this, queries are blocked when a concurrent transaction holds a Sch-M (schema modification) lock on the table. For example, a data definition language (DDL) operation acquires a Sch-M lock before it modifies the schema information of the table. Any concurrent queries, including those running with READUNCOMMITTED or NOLOCK hints, are blocked when attempting to acquire a Sch-S lock. Conversely, a query holding a Sch-S lock blocks a concurrent transaction that attempts to acquire a Sch-M lock.
READUNCOMMITTED and NOLOCK cannot be specified for tables modified by insert, update, or delete operations. The SQL Server query optimizer ignores the READUNCOMMITTED and NOLOCK hints in the FROM clause that apply to the target table of an UPDATE or DELETE statement.
You can minimize locking contention while protecting transactions from dirty reads of uncommitted data modifications by using either of the following:
• The READ COMMITTED isolation level with the READ_COMMITTED_SNAPSHOT database option set ON.
• The SNAPSHOT isolation level.
For more information about isolation levels, see SET TRANSACTION ISOLATION LEVEL (Transact-SQL).
https://learn.microsoft.com/en-us/sql/t-sql/queries/hints-transact-sql-table
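For example, either option is enabled at the database level (the database name is a placeholder; switching READ_COMMITTED_SNAPSHOT on typically needs a moment with no other active connections):

ALTER DATABASE MyDatabase SET READ_COMMITTED_SNAPSHOT ON;

-- or allow snapshot isolation, which sessions then opt into explicitly:
ALTER DATABASE MyDatabase SET ALLOW_SNAPSHOT_ISOLATION ON;
-- SET TRANSACTION ISOLATION LEVEL SNAPSHOT;  -- per session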
Understanding how your indexes are structured matters: blocking can occur if, say, a SELECT statement needs an entire page that your UPDATE is modifying concurrently.
Limit your variables when testing (change one thing at a time).
Consider splitting your DML into batches; you may find an optimal range for performing concurrent modifications of your table data.
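One common sketch of that idea is to modify rows in bounded batches so each statement stays well below the lock-escalation threshold (table, column, predicate, and batch size below are all placeholders):

DECLARE @BatchSize INT = 2000;   -- illustrative batch size

WHILE 1 = 1
BEGIN
    -- Each statement touches at most @BatchSize rows, keeping lock counts low.
    UPDATE TOP (@BatchSize) t
    SET    t.IsProcessed = 1
    FROM   dbo.SomeTable AS t
    WHERE  t.IsProcessed = 0;

    IF @@ROWCOUNT = 0 BREAK;   -- nothing left to modify
END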
When I try to insert/update something in a DB table, will Oracle lock the whole table or only the row being inserted/updated?
Is this something that can be controlled through external configuration?
We can issue locks explicitly with the LOCK TABLE command.
Otherwise, an insert does not lock any other rows. Because of Oracle's read isolation model, that row only exists in our session until we commit it, so nobody else can do anything with it.
An update statement locks only the affected rows, unless we have implemented a pessimistic locking strategy with SELECT ... FOR UPDATE.
Finally, in Oracle writers do not block readers, so even locked rows can be read by other sessions; they just can't be changed.
This behaviour is baked into the Oracle kernel, and is not configurable.
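By way of illustration (table and column names are placeholders):

-- Explicit table lock (rarely needed):
LOCK TABLE orders IN EXCLUSIVE MODE;

-- Pessimistic row locking: the selected rows stay locked until COMMIT or ROLLBACK.
SELECT * FROM orders WHERE order_id = 42 FOR UPDATE;

-- An ordinary UPDATE locks only the rows it touches:
UPDATE orders SET status = 'SHIPPED' WHERE order_id = 42;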
Justin makes a good point about the table-level DDL lock. That lock will cause a session executing DDL on the table to wait until the DML session commits, unless the DDL is something like CREATE INDEX in which case it will fail immediately with ORA-00054.
It depends what you mean by "lock".
For 99.9% of what people are likely to care about, Oracle will acquire a row-level lock when a row is modified. The row-level lock still allows readers to read the row (because of multi-version read consistency, writers never block readers and readers never do dirty reads).
If you poke around v$lock, you'll see that updating a row also takes out a lock on the table. But that lock only prevents another session from doing DDL on the table. Since you'd virtually never want to do DDL on an active table in the first place, that generally isn't something that would actually cause another session to wait for the lock.
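For example, something along these lines shows both the row-level TX lock and the table-level TM lock held by the current session after an uncommitted update (a sketch; the join is only meaningful for the TM row):

SELECT l.type, l.lmode, l.request, o.object_name
FROM   v$lock l
LEFT JOIN dba_objects o ON o.object_id = l.id1   -- resolves the table name for TM locks
WHERE  l.sid = SYS_CONTEXT('USERENV', 'SID')
AND    l.type IN ('TM', 'TX');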
When a regular DML statement is executed (UPDATE, DELETE, INSERT, MERGE, or SELECT ... FOR UPDATE), Oracle obtains two locks.
Row-level lock (TX) - a lock on the particular row being touched; any other transaction attempting to modify the same row is blocked until the one already holding the lock finishes.
Table-level lock (TM) - when the row lock (TX) is obtained, an additional table lock is also taken to prevent DDL operations from occurring while the DML is in progress.
What matters, though, is the mode in which the table lock is obtained.
A row share lock (RS), also called a subshare table lock (SS), indicates that the transaction holding the lock on the table has locked rows in the table and intends to update them. An SS lock is the least restrictive mode of table lock, offering the highest degree of concurrency for a table.
A row exclusive lock (RX), also called a subexclusive table lock (SX), indicates that the transaction holding the lock has updated table rows or issued SELECT ... FOR UPDATE. An SX lock allows other transactions to query, insert, update, delete, or lock rows concurrently in the same table. Therefore, SX locks allow multiple transactions to obtain simultaneous SX and SS locks for the same table.
A share table lock (S) held by one transaction allows other transactions to query the table (without using SELECT ... FOR UPDATE) but allows updates only if a single transaction holds the share table lock. Multiple transactions may hold a share table lock concurrently, so holding this lock is not sufficient to ensure that a transaction can modify the table.
A share row exclusive table lock (SRX), also called a share-subexclusive table lock (SSX), is more restrictive than a share table lock. Only one transaction at a time can acquire an SSX lock on a given table. An SSX lock held by a transaction allows other transactions to query the table (except for SELECT ... FOR UPDATE) but not to update the table.
An exclusive table lock (X) is the most restrictive mode of table lock, allowing the transaction that holds the lock exclusive write access to the table. Only one transaction can obtain an X lock for a table.
You should probably read the Oracle Concepts manual regarding locking.
For standard DML operations (insert, update, delete, merge), Oracle takes a shared DML (type TM) lock.
This allows other DML on the table to occur concurrently (it is a share lock).
Rows that are modified by an update or delete DML operation and are not yet committed will have an exclusive row lock (type TX). Another DML operation in another session/transaction can operate on the table, but if it modifies the same row it will block until the holder of the row lock releases it by either committing or rolling back.
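A quick two-session illustration of that blocking behaviour (table and column names are placeholders):

-- Session 1 (no COMMIT yet, so it holds a TX lock on the row):
UPDATE accounts SET balance = balance - 100 WHERE account_id = 1;

-- Session 2:
UPDATE accounts SET balance = balance + 50 WHERE account_id = 2;  -- different row: proceeds
UPDATE accounts SET balance = balance + 50 WHERE account_id = 1;  -- same row: blocks until session 1 commits or rolls back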
Parallel DML operations and serial direct-path insert operations take exclusive table locks.
I'm currently having trouble with frequent deadlocks on a specific user table in SQL Server 2008. Here are some facts about this particular table:
Has a large number of rows (1 to 2 million)
All the indexes on this table have only "use row lock" ticked in their options
Edit: there is only one index on the table, which is its primary key
Rows are frequently updated by multiple transactions, but each update touches its own unique rows (probably a thousand or more UPDATE statements are executed against different rows every hour)
The table does not use partitions
Checking the table in sys.tables, I found that lock_escalation is set to TABLE
I'm very tempted to set lock_escalation for this table to DISABLE, but I'm not really sure what side effects this would have. From what I understand, DISABLE will prevent locks from escalating to the TABLE level, which, combined with the row-lock settings on the indexes, should in theory minimize the deadlocks I am encountering.
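For reference, the change I'm considering would be something like this (table name is a placeholder):

ALTER TABLE dbo.MyUserTable SET (LOCK_ESCALATION = DISABLE);  -- other valid values are AUTO and TABLE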
From what I have read in Determining threshold for lock escalation, it seems that locking automatically escalates when a single transaction fetches 5,000 rows.
What does a single transaction mean in this sense? A single session/connection getting 5,000 rows through individual UPDATE/SELECT statements?
Or is it a single SQL UPDATE/SELECT statement that fetches 5,000 or more rows?
Any insight is appreciated, btw, n00b DBA here
Thanks
Lock escalation triggers when a statement holds more than 5,000 locks on a SINGLE object. A statement holding 3,000 locks each on two different indexes of the same table will not trigger escalation.
When lock escalation is attempted and a conflicting lock exists on the object, the attempt is aborted and retried after another 1,250 locks (held, not acquired).
So if your updates are performed on individual rows and you have a supporting index on the column, then lock escalation is not your issue.
You will be able to verify this using the Locks -> Lock:Escalation event in Profiler.
I suggest you capture the deadlock trace to identify the actual cause of the deadlock.
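If Profiler isn't convenient, a simpler option (assuming you can enable a global trace flag) is the deadlock trace flag, which writes the deadlock graph to the SQL Server error log:

DBCC TRACEON (1222, -1);  -- -1 makes the flag global; deadlock details go to the error log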
I found this article after a quick Google search on disabling table lock escalation. Although not a real answer for the OP, I think it is still relevant for one-off scripts and worth noting here. There's a nice little trick you can use to temporarily disable table lock escalation.
Open another connection and issue something like:
BEGIN TRAN
SELECT * FROM mytable WITH (UPDLOCK, HOLDLOCK) WHERE 1=0
WAITFOR DELAY '1:00:00'
COMMIT TRAN
as
Lock escalation cannot occur if a different SPID is currently holding an incompatible table lock.
from the Microsoft KB
I have a stored procedure that performs a bulk insert into a table. I added a BEGIN TRANSACTION command just above the INSERT query to enable rollback if something goes wrong. When the bulk insert started, it locked the entire table and other users were unable to execute SELECT against the same table.
I am not following why SQL Server locks the entire table even for a SELECT.
I am using SQL Server 2005 Express. Is this a problem with this version, or does it persist in 2008 as well? How do I overcome this situation? Writers should not block readers.
Writers should not block Readers
This is true only for snapshot isolation; all other isolation levels require both readers to block writers and writers to block readers (dirty reads not considered, since they are inconsistent and should never be used). If you need this behavior, use row versioning (the link contains the solution).
Why does bulk insert lock the entire table?
This actually may or may not be true. The behavior is under your control:
TABLOCK
Specifies that a table-level lock is acquired for the duration of the bulk-import operation. A table can be loaded concurrently by multiple clients if the table has no indexes and TABLOCK is specified. By default, locking behavior is determined by the table option table lock on bulk load.
For more details, read the product specifications: Controlling Locking Behavior for Bulk Import.
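For example, with the bulk-import statement itself the hint looks like this (file path, table name, and delimiters are placeholders):

BULK INSERT dbo.TargetTable
FROM 'C:\data\import.csv'
WITH (TABLOCK, FIELDTERMINATOR = ',', ROWTERMINATOR = '\n');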
You have an open transaction. That means SQL Server needs to preserve the state of the table, and any changes you are in the process of making are "dirty" and uncommitted.
If you SELECT from a table that is currently being altered with an open (explicit) transaction, the SELECT will wait until the table is in a stable state and the transaction has been either committed or rolled back.
To get around this, you can alter the transaction isolation level on the SELECT query.
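For example, the reading session could do either of the following (table name is a placeholder; both permit dirty reads, so use them knowingly):

-- Option 1: change the session's isolation level before the SELECT.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT COUNT(*) FROM dbo.TargetTable;

-- Option 2: hint just this query (equivalent to READ UNCOMMITTED for this table).
SELECT COUNT(*) FROM dbo.TargetTable WITH (NOLOCK);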
If you're specifying TABLOCK in your proc, don't.
What is the difference between TABLOCK and TABLOCKX?
http://msdn.microsoft.com/en-us/library/ms187373.aspx states that TABLOCK is a shared lock while TABLOCKX is an exclusive lock. Is the first maybe only an index lock of sorts? And what is the concept of sharing a lock?
Big difference: TABLOCK will try to grab "shared" locks, while TABLOCKX grabs exclusive locks.
If you are in a transaction and you grab an exclusive lock on a table, e.g.:
SELECT 1 FROM MyTable WITH (TABLOCKX)
No other processes will be able to grab any locks on the table, meaning all queries attempting to talk to the table will be blocked until the transaction commits.
TABLOCK only grabs a shared lock. Shared locks are released after a statement is executed if your transaction isolation level is READ COMMITTED (the default). If your isolation level is higher, for example SERIALIZABLE, shared locks are held until the end of the transaction.
Shared locks are, hmmm, shared. Meaning two transactions can both read data from the table at the same time if they both hold an S or IS lock on the table (via TABLOCK). However, if transaction A holds a shared lock on a table, transaction B will not be able to grab an exclusive lock until all shared locks are released. Read about which locks are compatible with which on MSDN.
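A small two-connection sketch of that compatibility (table name is a placeholder; HOLDLOCK keeps the shared lock for the whole transaction):

-- Connection A:
BEGIN TRAN;
SELECT 1 FROM dbo.MyTable WITH (TABLOCK, HOLDLOCK);   -- S lock on the table, held until commit

-- Connection B (runs fine; shared locks are compatible with each other):
SELECT 1 FROM dbo.MyTable WITH (TABLOCK);

-- Connection C (blocks until A commits; X is incompatible with S):
SELECT 1 FROM dbo.MyTable WITH (TABLOCKX);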
Both hints cause the db to bypass taking more granular locks (like row or page level locks). In principle, more granular locks allow better concurrency. So, for example, one transaction could be updating row 100 in your table and another updating row 1000, at the same time, from two transactions (it gets tricky with page locks, but let's skip that).
In general, granular locks are what you want, but sometimes you may want to reduce concurrency to increase the performance of a particular operation and eliminate the chance of deadlocks.
In general you would not use TABLOCK or TABLOCKX unless you absolutely needed it for some edge case.
Quite an old article on mssqlcity attempts to explain the types of locks:
Shared locks are used for operations that do not change or update data, such as a SELECT statement.
Update locks are used when SQL Server intends to modify a page, and later promotes the update page lock to an exclusive page lock before actually making the changes.
Exclusive locks are used for the data modification operations, such as UPDATE, INSERT, or DELETE.
What it doesn't discuss are Intent (which basically is a modifier for these lock types). Intent (Shared/Exclusive) locks are locks held at a higher level than the real lock. So, for instance, if your transaction has an X lock on a row, it will also have an IX lock at the table level (which stops other transactions from attempting to obtain an incompatible lock at a higher level on the table (e.g. a schema modification lock) until your transaction completes or rolls back).
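You can see that hierarchy for yourself with something like the following (table and column names are placeholders):

BEGIN TRAN;
UPDATE dbo.MyTable SET SomeColumn = SomeColumn WHERE Id = 1;

SELECT resource_type, request_mode, request_status
FROM   sys.dm_tran_locks
WHERE  request_session_id = @@SPID;
-- typically shows an X lock on the KEY/RID, IX on the PAGE, and IX on the OBJECT (the table)

ROLLBACK;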
The concept of "sharing" a lock is quite straightforward - multiple transactions can have a Shared lock for the same resource, whereas only a single transaction may have an Exclusive lock, and an Exclusive lock precludes any transaction from obtaining or holding a Shared lock.
This is more of an example where TABLOCK did not work for me and TABLOCKX did.
I have 2 sessions, that both use the default (READ COMMITTED) isolation level:
Session 1 is an explicit transaction that will copy data from a linked server to a set of tables in a database, and takes a few seconds to run. [Example, it deletes Questions]
Session 2 is an insert statement, that simply inserts rows into a table that Session 1 doesn't make changes to. [Example, it inserts Answers].
(In practice there are multiple sessions inserting multiple records into the table, simultaneously, while Session 1 is running its transaction).
Session 1 has to query the table Session 2 inserts into because it can't delete records that depend on entries that were added by Session 2. [Example: Delete questions that have not been answered].
So, while Session 1 is executing and Session 2 tries to insert, Session 2 loses in a deadlock every time.
So, a delete statement in Session 1 might look something like this:
DELETE a FROM tblQ LEFT JOIN tblX ON ...
LEFT JOIN tblA a ON tblQ.Qid = a.Qid
WHERE ... a.QId IS NULL AND ...
The deadlock seems to be caused by contention between querying tblA while Sessions 2 [3, 4, 5, ..., n] try to insert into tblA.
In my case I could change the isolation level of Session 1's transaction to SERIALIZABLE. When I did this, I got: "The transaction manager has disabled its support for remote/network transactions."
So, I could follow instructions in the accepted answer here to get around it: The transaction manager has disabled its support for remote/network transactions
But a) I wasn't comfortable with changing the isolation level to SERIALIZABLE in the first place (supposedly it degrades performance and may have other consequences I haven't considered), b) I didn't understand why doing this suddenly caused the transaction to have a problem working across linked servers, and c) I don't know what holes I might be opening up by enabling network access.
There seemed to be just six queries within a very large transaction that were causing the trouble.
So, I read about TABLOCK and TABLOCKX.
I wasn't crystal clear on the differences, and didn't know whether either would work, but it seemed like they might. First I tried TABLOCK, and it didn't seem to make any difference; the competing sessions generated the same deadlocks. Then I tried TABLOCKX, and no more deadlocks.
So, in six places, all I needed to do was add a WITH (TABLOCKX).
So, a delete statement in Session 1 might look something like this:
DELETE a FROM tblQ q LEFT JOIN tblX x ON ...
LEFT JOIN tblA a WITH (TABLOCKX) ON q.Qid = a.Qid
WHERE ... a.QId IS NULL AND ...
IF NOT EXISTS (SELECT * FROM MyTable WITH (NOLOCK) WHERE [Key] = 'MyKey')
    INSERT MyTable ([Key]) VALUES ('MyKey')
If the value does not exist in the table, does the query acquire a lock?
From the docs:
READUNCOMMITTED and NOLOCK hints apply only to data locks. All queries, including those with READUNCOMMITTED and NOLOCK hints, acquire Sch-S (schema stability) locks during compilation and execution. Because of this, queries are blocked when a concurrent transaction holds a Sch-M (schema modification) lock on the table. For example, a data definition language (DDL) operation acquires a Sch-M lock before it modifies the schema information of the table. Any concurrent queries, including those running with READUNCOMMITTED or NOLOCK hints, are blocked when attempting to acquire a Sch-S lock. Conversely, a query holding a Sch-S lock blocks a concurrent transaction that attempts to acquire a Sch-M lock. For more information about lock behavior, see Lock Compatibility (Database Engine).
So it won't acquire a data lock, but it will still acquire a schema stability lock.
EXISTS normally will still acquire a lock. But you added a hint that told it not to, and so it won't.
Using a NOLOCK hint will indeed prevent the row lock. Just a heads up, though: this kind of 'look up, then insert' is riddled with problems. The operation is not atomic, and two sessions trying to do it will race when both find the key missing and both try to insert, resulting in one of them hitting a duplicate key violation. It is also suboptimal because the index seek occurs twice (once to look up the key, once to locate the insert position). The optimal and correct solution is to actually try the insert and recover from the duplicate key error if the row already exists.
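A sketch of that 'insert first, handle the duplicate' approach, assuming a unique index or primary key on the column (2601/2627 are the duplicate-key error numbers):

BEGIN TRY
    INSERT INTO MyTable ([Key]) VALUES ('MyKey');
END TRY
BEGIN CATCH
    IF ERROR_NUMBER() NOT IN (2601, 2627)   -- anything other than a duplicate key is rethrown
        THROW;
    -- duplicate key: the row already exists, nothing to do
END CATCH;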
That code is vulnerable to error. Instead you could try:
Put a unique index on the table so that it's not possible to insert multiple rows that conflict, and then just insert. A conflict generates an error, which you'd need to handle.
Or, if conflicts are an expected condition and not the exception, then you'll want to make the insert/check atomic:
INSERT MyTable ([Key])
SELECT 'MyKey'
WHERE NOT EXISTS (
    SELECT *
    FROM MyTable
    WHERE [Key] = 'MyKey'
)
Also, note that (NOLOCK) and READ UNCOMMITTED do not produce accurate results, by design. That's fine for reporting and such, but dangerous if you act on your data based on a decision made under (NOLOCK).