We have an issue with "deadlocks" in SQL Server where no explicit locks are involved, and we would like to know how to get around them.
Some relevant background info: Our application is quite old and large. We recently began to remove some concurrency-hindering issues, and in doing so we ran into these deadlocks on SQL Server. We do not have the resources to tackle each SELECT statement within the application, so we are looking for a more general approach at the configuration level.
We can reduce one exemplary problem as follows: Basically, we have two entities, EntityA and EntityB. Both are mapped to individual tables in the SQL Server schema. Between the two entities there is an m:n relation, mapped within the database via an AToB table, which contains some additional context (i.e. there can be more than one entry within AToB for the same A and B).
During one business operation, new instances of A and B are inserted into the database, along with multiple entries in the AToB table. Later in the same transaction all this data is read again (without FOR UPDATE). When this operation is executed in parallel, deadlocks occur, and these deadlocks are linked to the AToB table.
Say we have A1 and B1, which are linked via A1B1_1 and A1B1_2, and A2 and B2, which are linked via A2B2_1 and A2B2_2.
My guess is the following happens:
t1 -> INSERT A1
t1 -> INSERT B1
t1 -> INSERT A1B1_1 (PAGE1)
t2 -> INSERT A2
t2 -> INSERT B2
t2 -> INSERT A2B2_1 (PAGE1)
t1 -> INSERT A1B1_2 (PAGE2)
t2 -> INSERT A2B2_2 (PAGE2)
t1 -> SELECT * FROM AToB WHERE AToB.A=A1
t2 -> SELECT * FROM AToB WHERE AToB.A=A2
Now, during the concurrent reads on the AToB table, t1 requests a shared page lock on PAGE2 (still exclusively locked by t2's insert of A2B2_2) while t2 requests a shared page lock on PAGE1 (still exclusively locked by t1's insert of A1B1_1), resulting in a deadlock.
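One way to check this guess would be to capture the actual deadlock graph, e.g. via trace flag 1222 (just a sketch; requires sysadmin rights):

DBCC TRACEON (1222, -1);     -- write deadlock graphs to the SQL Server error log
EXEC xp_readerrorlog;        -- the captured graphs then show up in the error log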
First question: Is this a plausible explanation for the deadlock to occur?
Second question: During research I got the impression that an index on AToB.A might allow SQL Server to lock fewer entries on the table (possibly even taking row locks instead of page locks). Is this right?
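For reference, such an index would look roughly like this (the index name is made up; table and column are taken from the example above):

CREATE NONCLUSTERED INDEX IX_AToB_A
    ON AToB (A);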
Third question: I further got the impression that this problem might be solved by snapshot locking. Is that right?
We tried this approach; however, it led us into the next circle of hell:
At one point during the business transaction, a business identifier is assigned to A. This identifier comes from a separate table and must be unique among the As. There is no possibility to assign it via a database sequence. Our solution is to assign the identifier via a SELECT/UPDATE on a fourth table, Identifier, which is done with a FOR UPDATE statement. When employing snapshot locking, this FOR UPDATE lock is ignored during acquisition and only leads to an optimistic-locking exception at commit time. This leads us to
Fourth question: When using snapshot locking, is it possible to still have special transactions that run with pessimistic locking, or is it possible to tell SQL Server that some tables are excluded from optimistic locking?
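For context, the identifier assignment described above corresponds roughly to the following T-SQL when the lock is requested pessimistically (the layout and column names of the Identifier table are made up for illustration):

DECLARE @newId INT;

BEGIN TRANSACTION;

-- lock the counter row so that concurrent transactions queue up behind it
SELECT @newId = NextValue
FROM Identifier WITH (UPDLOCK, ROWLOCK)
WHERE EntityName = 'A';

UPDATE Identifier
SET NextValue = NextValue + 1
WHERE EntityName = 'A';

-- @newId is assigned to the new A later in the same transaction

COMMIT;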
Related
I have a database with READ_COMMITTED_SNAPSHOT_ISOLATION set ON (cannot change that).
I insert new rows into a table on many parallel sessions,
but only if they don't exist there yet (classic left join check).
The inserting code looks like this:
INSERT INTO dbo.Destination(OrderID)
SELECT DISTINCT s.OrderID
FROM dbo.Source s
LEFT JOIN dbo.Destination d ON d.OrderID = s.OrderID
WHERE d.OrderID IS NULL;
If I run this on many parallel sessions I get a lot of duplicate key errors,
since different sessions try to insert the same OrderIDs over and over again.
That is expected due to the lack of SHARED locks under RCSI.
The recommended solution here (as per my research) would be to use the READCOMMITTEDLOCK hint like this:
LEFT JOIN dbo.Destination d WITH (READCOMMITTEDLOCK) ON d.OrderID = s.OrderID
This somewhat works, as it greatly reduces the duplicate key errors, but (to my surprise) doesn't completely eliminate them.
As an experiment I removed the unique constraint on the Destination table, and saw that many duplicates enter the table in the very same millisecond, originating from different sessions.
It seems that despite the table hint, I still get false positives on the existence check, and the redundant insert fires.
I tried different hints (SERIALIZABLE), but that made things worse and swarmed me with deadlocks.
How could I make this insert work under RCSI?
The right lock hint for reading a table you are about to insert into is (UPDLOCK,HOLDLOCK), which will place U locks on the rows as you read them, and also place SERIALIZABLE-style range locks if the row doesn't exist.
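Applied to your statement, a sketch would look like this (same tables and columns as in your question):

INSERT INTO dbo.Destination (OrderID)
SELECT DISTINCT s.OrderID
FROM dbo.Source AS s
LEFT JOIN dbo.Destination AS d WITH (UPDLOCK, HOLDLOCK)
    ON d.OrderID = s.OrderID
WHERE d.OrderID IS NULL;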
The problem with your approach is that each client is attempting to insert a batch of rows, and each batch has to either succeed completely or fail. If you use row-level locking, you will always have scenarios where a session inserts one row successfully, but then becomes blocked waiting to read or insert a subsequent row. This inevitably leads to either PK failures or deadlocks, depending on the type of row lock used.
The solution is to either:
1) Insert the rows one-by-one, and not hold the locks from one row while you check and insert the next row.
2) Simply escalate to a TABLOCKX, or an application lock, to force your concurrent sessions to serialize through this bit of code (see the sketch after this list).
So you can have highly concurrent loads, or batch loads, but you can't have both. Well, mostly.
3) You could turn on IGNORE_DUP_KEY on the index, which, instead of raising an error, will just silently skip any duplicate row when inserting (also sketched below).
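Rough sketches of options 2 and 3 (the application-lock resource name and the index name are made up):

-- option 2: serialize concurrent sessions with an application lock
BEGIN TRANSACTION;

EXEC sp_getapplock @Resource = 'Destination_Load', @LockMode = 'Exclusive';

INSERT INTO dbo.Destination (OrderID)
SELECT DISTINCT s.OrderID
FROM dbo.Source AS s
LEFT JOIN dbo.Destination AS d ON d.OrderID = s.OrderID
WHERE d.OrderID IS NULL;

COMMIT;   -- the application lock is released together with the transaction

-- option 3: let the unique index silently discard duplicates instead of raising errors
CREATE UNIQUE INDEX UX_Destination_OrderID
    ON dbo.Destination (OrderID)
    WITH (IGNORE_DUP_KEY = ON);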
My source table, called Event, sits in a different database and has millions of rows. Each event can have an action of DELETE, UPDATE or NEW.
We have a Java process that goes through these events in the order they were created, applies all sorts of rules, and then inserts the results into multiple tables for lookup, analysis, etc.
I am using JdbcTemplate with batchUpdate to delete and upsert into the Postgres DB sequentially right now, but I'd like to be able to run it in parallel too. Each batch is 1,000 entities to be inserted/upserted or deleted.
However, even when running sequentially, Postgres currently blocks queries on locks in a way I don't understand.
Here is some of the code:
entityService.deleteBatch(deletedEntities);
indexingService.deleteBatch(deletedEntities);
...
entityService.updateBatch(allActiveEntities);
indexingService.updateBatch(....);
Each of these services does inserts/deletes into different tables. They all run in one transaction, though.
The following query
SELECT
activity.pid,
activity.usename,
activity.query,
blocking.pid AS blocking_id,
blocking.query AS blocking_query
FROM pg_stat_activity AS activity
JOIN pg_stat_activity AS blocking ON blocking.pid = ANY(pg_blocking_pids(activity.pid));
returns
Query being blocked: "insert INTO ENTITY (reference, seq, data) VALUES($1, $2, $3) ON CONFLICT ON CONSTRAINT ENTITY_c DO UPDATE SET data = $4",
Blocking query: delete from ENTITY_INDEX where reference = $1
There are no foreign key constraints between these tables. And we do have indexes so that we can run the queries we need as part of the processing.
Why would a statement on one table block statements on a completely different table? And how can we go about resolving this?
Your query is misleading.
What it shows as “blocking query” is really the last statement that ran in the blocking transaction.
It was probably a previous statement in the same transaction that caused entity (or rather a row in it) to be locked.
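If you want to see the lock that is actually being waited on (rather than the last statement of the blocking session), you can match the ungranted entries in pg_locks against the granted entries that conflict with them. A rough sketch:

SELECT blocked.pid                 AS blocked_pid,
       blocked.locktype,
       blocked.relation::regclass  AS relation,
       blocker.pid                 AS blocking_pid,
       blocker.mode                AS blocking_mode
FROM pg_locks AS blocked
JOIN pg_locks AS blocker
  ON  blocker.granted
  AND blocker.pid <> blocked.pid
  AND blocker.locktype = blocked.locktype
  AND blocker.database      IS NOT DISTINCT FROM blocked.database
  AND blocker.relation      IS NOT DISTINCT FROM blocked.relation
  AND blocker.page          IS NOT DISTINCT FROM blocked.page
  AND blocker.tuple         IS NOT DISTINCT FROM blocked.tuple
  AND blocker.transactionid IS NOT DISTINCT FROM blocked.transactionid
WHERE NOT blocked.granted;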
This is something very disturbing I stumbled upon while stress-testing an application using Sybase ASE 15.7.
We have the following table:
CREATE TABLE foo
(
i INT NOT NULL,
blob IMAGE
);
ALTER TABLE foo ADD PRIMARY KEY (i);
The table has, even before starting the test, a single row with some data in the IMAGE column. No rows are deleted or inserted during the test, so the table always contains a single row. The blob column is only updated (in transaction T1 below) to some non-NULL value.
Then, we have the following two transactions:
T1: UPDATE foo SET blob=<some not null value> WHERE i=1
T2: SELECT * FROM foo WHERE i=1
For some reason, the above transactions may deadlock under load (approx. 10 threads doing T1 20 times in a loop and another 10 threads doing T2 20 times in a loop).
This is already weird enough, but there's more to come. T1 is always chosen as the deadlock victim. So the application logic, in the event of a deadlock (error code 1205), simply retries T1. This should work and should normally be the end of the story. However …
… it happens that sometimes T2 will retrieve a row in which the value of the blob column is NULL! This is even though the table already starts with a row and the updates simply reset the previous (non-NULL) value to some other (non-NULL) value. This is 100% reproducible in every test run.
This is observed with the READ COMMITTED isolation level.
I verified that the above behavior also occurs with the TEXT column type but not with VARCHAR.
I've also verified that obtaining an exclusive lock on table foo in transaction T1 makes the issue go away.
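The workaround looks roughly like this (a sketch in isql; the 0x1234 literal stands in for the real non-NULL value):

begin transaction
-- take an exclusive table lock before the update so readers cannot interleave
lock table foo in exclusive mode
update foo set blob = 0x1234 where i = 1
commit
go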
So I'd like to understand how something that so fundamentally breaks transaction isolation can even be possible. In fact, I think this is worse than a broken isolation guarantee, as T1 never sets the value of the blob column to NULL at all.
The test code is written in Java using the jconn4.jar driver (class com.sybase.jdbc4.jdbc.SybDriver) so I don't rule out that this may be a JDBC driver bug.
Update:
This is reproducible simply using isql and spawning several shells in parallel that continuously execute T1 in a loop. So I am removing the Java and JDBC tags as this is definitely server-related.
Your example CREATE TABLE code would by default create an allpages-locked table, unless your DBA has changed the system-wide 'lock scheme' parameter via sp_configure to another value (you can check this yourself, as any login, via sp_configure 'lock scheme').
Unless you have a very large number of rows, they are all going to sit on a single data page, because an int is only 4 bytes long and the blob data is stored at the end of the table (unless you use the in-row LOB functionality in ASE 15.7 and up). This is why you are getting deadlocks: you have by definition created a single hotspot where all the data is accessed at the page level. This is even more likely where page sizes larger than 2K are used, since by their nature they hold even more rows per page, and with allpages locking that means an even greater likelihood of contention.
Change your locking scheme to datarows (unless you are planning to have very high row counts), as has been said above, and your problem should go away. I will add that your blob column appears to allow NULLs, so you should also consider setting the 'dealloc_first_txtpg' attribute for your table to avoid wasted space if you have NULLs in your image column.
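A minimal sketch of both the check and the change, run via isql against the database holding foo:

-- check the server-wide default locking scheme
sp_configure 'lock scheme'
go

-- switch the existing table to row-level locking
alter table foo lock datarows
go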
We've seen all kinds of weird stuff with isolation level 1. I'm under the impression that while T2 is in progress, T1 can change data and T2 might return intermediate results of T1.
Try isolation level 2 and see if it helps (it does for us).
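Roughly, for the reading session that would be (a sketch; the per-query AT ISOLATION clause is an alternative):

-- run T2 at isolation level 2 (repeatable read) for this session
set transaction isolation level 2
go
SELECT * FROM foo WHERE i=1
go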
We have a stored procedure which loads the details of an order. We always want the latest information about an order, so the order details are regenerated every time the stored procedure is called. We are using SQL Server 2016.
Pseudo code:
DELETE by clustered index based on order identifier
INSERT into the table, based on a huge query containing information about order
When multiple end users execute the stored procedure concurrently, blocking occurs on the orderdetails table. Once the first caller is done, the second caller is queued, followed by the third caller, so the time to generate the orderdetails keeps growing. This happens especially with big orders containing more than 100k, or even 1 or 2 million, detail rows, because a table-level lock is taken.
The approach we took
We partitioned the table based on the last digit of the order identifier to allow concurrent orderdetails loading. This improves performance the first time the orderdetails are loaded, as there are no deletes. But from the second time onwards, the INSERT in the first session blocks the other sessions' DELETEs. The other sessions are blocked until the first session is done with its INSERT.
We are considering creating a separate orderdetails table for every order to avoid these concurrency issues.
Question
Can you please suggest an approach which will support this concurrent DELETE & INSERT scenario?
We solved the contention issue by using a temporary table for the orderdetails. We found that the huge queries had long SELECT times, and this was contributing to long table-level locks on the orderdetails table.
So we first loaded the data into a temporary table, #orderdetail, and only then performed the DELETE and INSERT on the orderdetail table.
As the orderdetail table is already partitioned, the DELETEs were faster and the INSERTs could run in parallel. The INSERT was also very fast here, as it is a simple table scan from the #orderdetail table.
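Roughly, the pattern we ended up with looks like this (the column list and the source of the huge query are simplified placeholders; @OrderID stands for the procedure parameter):

-- 1) run the expensive query into a temp table first; no locks on orderdetail are held yet
SELECT src.OrderID, src.LineNumber, src.Amount
INTO #orderdetail
FROM dbo.HugeOrderQuerySource AS src     -- stands in for the real, huge query
WHERE src.OrderID = @OrderID;

-- 2) only now touch the partitioned orderdetail table, keeping the lock window short
BEGIN TRANSACTION;

DELETE FROM dbo.orderdetail
WHERE OrderID = @OrderID;

INSERT INTO dbo.orderdetail (OrderID, LineNumber, Amount)
SELECT OrderID, LineNumber, Amount
FROM #orderdetail;

COMMIT;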
You can take a look at the Hekaton engine (In-Memory OLTP). It is available even in SQL Server Standard Edition if you are using SP1.
If this is too complicated to implement due to hardware or software limitations, you can try playing with the isolation levels of the database. Sometimes queries that read huge amounts of data are blocked by, or even become deadlock victims of, queries which modify parts of that data. Ask yourself whether you need to guarantee that the data read by the user is always consistent, or whether you can afford, for example, some dirty reads.
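For example (a sketch; the database name, table name and literal are placeholders):

-- option A: allow dirty reads for a particular reporting query only
SELECT od.OrderID
FROM dbo.orderdetails AS od WITH (NOLOCK)
WHERE od.OrderID = 42;

-- option B: versioned reads for the whole database, so readers no longer block behind writers
ALTER DATABASE OrdersDb
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;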
I need to update an identity column in a very specific scenario (most of the time the identity will be left alone). When I do need to update it, I simply need to give it a new value and so I'm trying to use a DELETE + INSERT combo.
At present I have a working query that looks something like this:
DELETE Test_Id
OUTPUT DELETED.Data,
       DELETED.Moredata
INTO Test_Id
WHERE Id = 13;
(This is only an example, the real query is slightly more complex.)
A colleague brought up an important point. She asked if this won't cause a deadlock, since we are writing to and reading from the same table. Although in the example it works fine (half a dozen rows), in a real-world scenario with tens of thousands of rows this might not work.
Is this a real issue? If so, is there a way to prevent it?
I set up an SQL Fiddle example.
Thanks!
My first thought was: yes, it can. And maybe it is still possible; however, in this simplified version of the statement it would be very hard to hit a deadlock. You're targeting a single row, for which row-level locks are probably acquired, and the locks required for the delete and the insert are acquired very quickly after each other.
I did some testing against a table holding a million rows, executing the statement 5 million times on 6 different connections in parallel, and did not hit a single deadlock.
But add the real-life query and a table with indexes and foreign keys, and you just might have a winner. I've had a similar statement which did cause deadlocks.
I have encountered deadlock errors with a similar statement.
UPDATE A
SET x=0
OUTPUT INSERTED.ID, 'a' INTO B
So, for this statement to complete, MSSQL needs to take locks for the updates on table A, locks for the inserts on table B, and shared (read) locks on table A to validate the foreign key that table B has to table A.
And last but not least, MSSQL decided it would be wise to use parallelism on this particular query, causing the statement to deadlock on itself. To resolve this I simply set the "MAXDOP 1" query hint on the statement to prevent parallelism.
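For this statement that looked roughly like the following (the column list on B is an assumption, since the original statement above omits it):

UPDATE A
SET x = 0
OUTPUT INSERTED.ID, 'a' INTO B (AID, Tag)   -- assumed target columns on B
OPTION (MAXDOP 1);                          -- prevent parallelism for this statement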
There is, however, no definite answer for preventing deadlocks. As they say with MSSQL ever so often: it depends. You could take an exclusive lock using the TABLOCKX table hint. This would prevent the deadlock; however, it's probably not desirable for other reasons.