How to refactor this deadlock issue? - sql-server

I ran into a deadlock issue synchronizing a table multiple times in a short period of time. By synchronize I mean doing the following:
Insert data to be synchronized into a temp table
Update existing records in destination table
Insert new records into the destination table
Delete records that are not in the sync table under certain circumstances
Drop temp table
For the INSERT and DELETE statements, I'm using a LEFT JOIN similar to:
INSERT INTO destination_table (fk1, fk2, val1)
SELECT t.fk1, t.fk2, t.val1
FROM #tmp t
LEFT JOIN destination_table dt ON dt.fk1 = t.fk1
AND dt.fk2 = t.fk2
WHERE dt.pk IS NULL;
The deadlock graph reports that destination_table's primary key is under an exclusive lock. I assume the above query is taking a table or page lock instead of a row lock. How would I confirm that?
I could rewrite the above query with IN, EXISTS, or EXCEPT. Are there any additional ways of refactoring the code? Would refactoring with any of these avoid the deadlock? Which one would be best? I'm assuming EXCEPT.
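For what it's worth, a NOT EXISTS form of the same insert (a sketch reusing the column names above) would look like the following, and the lock granularity the statement actually takes can be observed from another session via sys.dm_tran_locks (the session id below is a placeholder):
INSERT INTO destination_table (fk1, fk2, val1)
SELECT t.fk1, t.fk2, t.val1
FROM #tmp t
WHERE NOT EXISTS (
    SELECT 1
    FROM destination_table dt
    WHERE dt.fk1 = t.fk1
      AND dt.fk2 = t.fk2
);
-- Run from a second session while the insert is executing; resource_type
-- shows whether locks are being taken at KEY, PAGE, or OBJECT granularity.
SELECT resource_type, request_mode, request_status, COUNT(*) AS lock_count
FROM sys.dm_tran_locks
WHERE request_session_id = 53   -- placeholder: the synchronizing session's spid
GROUP BY resource_type, request_mode, request_status;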

Well, under normal circumstances I can execute this scenario just fine. Below is the test script I created. Are you trying something else?
drop table #destination_table
drop table #tmp
Declare @x int=0
create table #tmp(fk1 int, fk2 int, val int)
set @x=2
while (@x<1000)
begin
insert into #tmp
select @x,@x,100
set @x=@x+3
end
create table #destination_table(fk1 int, fk2 int, val int)
while (@x<1000)
begin
insert into #destination_table
select @x,@x,100
set @x=@x+1
end
INSERT INTO #destination_table (fk1, fk2, val)
select t.*
FROM #tmp t
LEFT JOIN #destination_table dt ON dt.fk1 = t.fk1
AND dt.fk2 = t.fk2
WHERE dt.fk1 IS NULL

Related

How to prevent deadlock in concurrent T-SQL transactions?

I have a query which inserts hundreds of records. The idea behind the query is:
DELETE the old record with the given id
INSERT a new record with the same id
If a record with that id does not exist, a value for eternal_id will be generated
If a record with that id exists, we should keep its eternal_id value
The query executes in a transaction at the Read Committed isolation level.
The query looks like:
DECLARE @id1 int = 100
DECLARE @id2 int = 200
CREATE TABLE #t(
[eternal_id] [uniqueidentifier] NULL,
[id] [int] NOT NULL
)
DELETE FROM [dbo].[SomeTable] WITH (HOLDLOCK)
OUTPUT
DELETED.eternal_id
,DELETED.id
INTO #t
WHERE [id] IN (@id1, @id2)
INSERT INTO [dbo].[SomeTable]
([id]
,[title]
,[eternal_id])
SELECT main.*, ISNULL([eternal_id], NEWID())
FROM
(
SELECT
@id1 Id
,'Some title 1' Title
UNION
SELECT
@id2 Id
,'Some title 2' Title
) AS main
LEFT JOIN #t t ON main.[id] = t.[id]
DROP TABLE #t
I have hundreds of threads executing this query with different @id values. Everything works perfectly when the record already exists in [dbo].[SomeTable], but when a record with that @id doesn't exist I catch:
Transaction (Process ID 73) was deadlocked on lock resources with another process and has been chosen as the deadlock victim. Rerun the transaction.
So the problem appears when two or more concurrent threads pass the same @id and the record does not yet exist in [dbo].[SomeTable].
I tried removing WITH (HOLDLOCK) here:
DELETE FROM [dbo].[SomeTable] WITH (HOLDLOCK)
OUTPUT
DELETED.eternal_id
,DELETED.id
INTO #t
WHERE [id] IN (@id1, @id2)
This did not help, and I started to catch:
Violation of PRIMARY KEY constraint 'PK__SomeTable__3213E83F5D97F3D0'. Cannot insert duplicate key in object 'dbo.SomeTable'. The duplicate key value is (49).
The statement has been terminated.
So without WITH (HOLDLOCK) it misbehaves even when the record already exists.
How can I prevent deadlocks when a record with the given id doesn't exist in the table?
A conditional update of eternal_id can be done like this:
update t set
...
eternal_id = ISNULL(t.eternal_id, NEWID())
from [dbo].[SomeTable] t
where t.id = @id
Thus you will keep the old value if it exists. No need to delete/insert. Unless you have some magic in triggers.
I think the comment above from @DaleK helped me the most. I will quote it:
While it's a great ambition to try and avoid all deadlocks... it's not
always possible... and you can't prevent all future deadlocks from
happening, because as more rows are added to tables query plans change.
Any application code should have some form of retry mechanism to
handle this. – Dale K
So I decided to implement some form of retry mechanism to handle this.
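For reference, a minimal T-SQL sketch of such a retry loop (the retry count and back-off delay are arbitrary choices; the DELETE ... OUTPUT / INSERT pair from the question goes where the comment indicates):
DECLARE @retry int = 0;
DECLARE @done bit = 0;
WHILE @done = 0
BEGIN
    BEGIN TRY
        BEGIN TRANSACTION;
        -- the DELETE ... OUTPUT ... / INSERT statements from above go here
        COMMIT TRANSACTION;
        SET @done = 1;  -- success, leave the retry loop
    END TRY
    BEGIN CATCH
        IF XACT_STATE() <> 0 ROLLBACK TRANSACTION;
        SET @retry = @retry + 1;
        -- 1205 = chosen as deadlock victim; anything else, or too many retries, is re-raised
        IF ERROR_NUMBER() <> 1205 OR @retry >= 3
        BEGIN
            THROW;
        END
        WAITFOR DELAY '00:00:00.100';  -- brief back-off before the next attempt
    END CATCH
END;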

How to handle outputting values not included in SQL insert

I'm writing an import process which will import data from one (somewhat legacy) database to another. The import process takes one flat table with the source data. I have this populating a temp table (#SourcePersonAccount) at the start. The goal is to distribute this data into three destination tables (dbo.Person, dbo.Account & dbo.PersonAccount). This runs within a trigger on a table that uses SQL Server Replication, so it needs to run quickly.
#SourcePersonAccount([AccountNumber], [CompanyId], [TargetPersonId], [TargetAccountId]);
dbo.Person ([Id] pk identity(1,1), [CompanyId], ...);
dbo.Account ([Id] pk identity(1,1), [AccountNumber], ...);
dbo.PersonAccount ([Id], [PersonId] fk_Person_Id, [AccountId] fk_Account_Id);
In my code, I have the TargetPersonId already populated in the #SourcePersonAccount temp table. All that's left is to 1) insert into dbo.Account, 2) update #SourcePersonAccount with the inserted dbo.Account.Id value, 3) insert into dbo.PersonAccount.
One of the challenges is that the AccountNumber and CompanyId make up a composite primary key of the source table, so both are needed to join properly on the #SourcePersonAccount temp table.
I have seen threads addressing similar issues to a certain extent here and here which did not solve my particular problem, mostly due to performance issues.
As stated in this post, the OUTPUT clause cannot output columns that were not included in the insert, so that is not an option here.
One solution I saw that technically can give the desired output (I can't find the link to where I found the suggestion) while using the OUTPUT clause is to actually add and drop a column within the query.
DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);
ALTER TABLE [dbo].[Account]
ADD [CompanyId] INT NULL;
INSERT INTO [dbo].[Account]
([AccountNumber], [CompanyId])
OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], INSERTED.[CompanyId]
INTO @PersonAccountTbl
SELECT
[AccountNumber], [CompanyId]
FROM #SourcePersonAccount
WHERE
[TargetAccountId] IS NULL;
ALTER TABLE [dbo].[Account]
DROP COLUMN [CompanyId];
This is not a viable option for my situation.
I tried using MERGE, as every thread I've found on this issue recommends it. I don't like MERGE for a few reasons, but I tried it anyway; the code below gives the desired output, but it ended up being much too slow for my purposes.
DECLARE @PersonAccountTbl TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);
MERGE INTO [dbo].[Account] a
USING #SourcePersonAccount spa
ON spa.[TargetAccountId] IS NULL
WHEN NOT MATCHED THEN
INSERT
([AccountNumber])
VALUES
(spa.[AccountNumber])
OUTPUT INSERTED.[Id], INSERTED.[AccountNumber], spa.[CompanyId]
INTO @PersonAccountTbl ([AccountId], [AccountNumber], [CompanyId]);
UPDATE spa
SET spa.[TargetAccountId] = pat.[AccountId]
FROM #SourcePersonAccount spa
JOIN @PersonAccountTbl pat
ON pat.[AccountNumber] = spa.[AccountNumber]
AND pat.[CompanyId] = spa.[CompanyId];
INSERT INTO [dbo].[PersonAccount]
([PersonId], [AccountId])
SELECT
spa.[TargetPersonId], spa.[TargetAccountId]
FROM #SourcePersonAccount spa
LEFT JOIN [dbo].[PersonAccount] pa
ON pa.[PersonId] = spa.[TargetPersonId]
AND pa.[AccountId] = spa.[TargetAccountId]
WHERE
pa.[Id] IS NULL;
Is there a way other than MERGE or adding/dropping a column to accomplish this?
You can use a SEQUENCE instead of an IDENTITY column. Then you can assign the IDs to a temp table or table variable before you INSERT the data.
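A minimal sketch of that idea, assuming dbo.Account.Id is changed from IDENTITY to a plain int fed by a new sequence (the sequence name and seed below are illustrative):
-- One-time setup
CREATE SEQUENCE dbo.AccountIdSeq AS int START WITH 1 INCREMENT BY 1;
-- Hand out the new Account ids before touching dbo.Account at all
DECLARE @NewAccounts TABLE ([AccountId] INT, [AccountNumber] INT, [CompanyId] INT);
INSERT INTO @NewAccounts ([AccountId], [AccountNumber], [CompanyId])
SELECT NEXT VALUE FOR dbo.AccountIdSeq, [AccountNumber], [CompanyId]
FROM #SourcePersonAccount
WHERE [TargetAccountId] IS NULL;
-- The ids are already known, so the insert needs no OUTPUT clause
INSERT INTO [dbo].[Account] ([Id], [AccountNumber])
SELECT [AccountId], [AccountNumber]
FROM @NewAccounts;
-- Copy the pre-assigned ids back into the staging table
UPDATE spa
SET spa.[TargetAccountId] = na.[AccountId]
FROM #SourcePersonAccount spa
JOIN @NewAccounts na
ON na.[AccountNumber] = spa.[AccountNumber]
AND na.[CompanyId] = spa.[CompanyId];
-- dbo.PersonAccount can then be populated exactly as in the question.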

SQL Server Create Table Variable to hold records which will be inserted into permanent Table after truncating permanent table

I have a table with around 7 million rows on which I need to perform a truncate. I am going to do it like the following:
BEGIN TRY
BEGIN TRANSACTION
Declare @RecsToKeep Table
(
Id int
)
Insert into @RecsToKeep
SELECT Id
FROM RealTable
Where CONVERT (DATE, CreatedDate) > '2017-08-16'
Declare @KeepTheseRecs Table
(
Id int
)
Insert into @KeepTheseRecs
Select *
From RealTable Where Id IN (Select Id From @RecsToKeep)
Truncate Table RealTable
Insert into RealTable
Select *
From @KeepTheseRecs
COMMIT
END TRY
BEGIN CATCH
ROLLBACK
END CATCH
The real table and table variable have the same column structure. Is this the correct way to do this?
First, you haven't actually changed anything in the table with your query; you are simply moving the records from A to B and back to A.
A simpler method would be to skip the move altogether.
delete from RealTable
where someColumn = 'someValue' --or whatever condition you want
If you are really going to stage the records, you're going to at least want a WHERE clause on that Insert Into statement. I really don't see why you need to do this though.
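If the delete ends up touching a large share of those ~7 million rows, a batched variant (a sketch; the batch size is an arbitrary choice, and the predicate is just the question's keep condition inverted) keeps each transaction small:
WHILE 1 = 1
BEGIN
    -- Delete in chunks so the transaction log and lock footprint stay manageable.
    -- Written without CONVERT() on the column so an index on CreatedDate can be used;
    -- assumes CreatedDate is not nullable.
    DELETE TOP (5000)
    FROM RealTable
    WHERE CreatedDate < '2017-08-17';
    IF @@ROWCOUNT = 0
        BREAK;  -- nothing left to delete
END;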

SQL Server - how to manually increment a PK in multiple-row INSERT transaction

I am working in SQL Server. I have a table that has a PK int column. This column does not have auto-increment enabled, and I am not allowed to change the schema. I need to insert lots of rows (perhaps thousands) into this table manually. None of the data inserted will come from any existing table. However, I need to ensure that the PK column gets incremented by +1 for each new row. My current script is like the following:
BEGIN TRAN
INSERT INTO DB1.dbo.table1
(PK_col, col1)
VALUES
(10, 'a')
,(11, 'something')
,(12, 'more text')
;
where I already know via a pre-query (SELECT MAX(PK_col) + 1) that PK_col is currently at 9.
My problem is ensuring that the PK column gets incremented by +1 for each new row. Because there could be thousands of rows to insert, I want to reduce the possibility of skipping values or a PK constraint violation being thrown. I know that I can achieve this outside of the DB (via Excel), as long as I validate the PK values prior to running the SQL script. However, I would like to create a solution that handles the auto-increment within the TRAN statement itself. Is this possible (without running into a race condition)? If so, how?
The following should do what you want:
INSERT INTO DB1.dbo.table1(PK_col, col1)
SELECT COALESCE(l.max_pk_col, 0) + row_number() over (order by (select null)) as PK_col,
col1
FROM (VALUES ('a'), ('something'), ('more text')) v(col1) CROSS JOIN
(SELECT MAX(pk_col) as max_pk_col FROM DB1.dbo.table1) l;
You need to be careful with this arrangement. Locking the entire table for the duration of the INSERT is probably a good idea -- if anything else could be updating the table.
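If anything else could be writing to the table at the same time, one way to take that lock is to hold an exclusive table lock from the moment the current maximum is read until the insert commits (a sketch built on the query above; the hint placement is a judgment call, not the only option):
BEGIN TRANSACTION;
INSERT INTO DB1.dbo.table1 (PK_col, col1)
SELECT COALESCE(l.max_pk_col, 0) + row_number() over (order by (select null)) as PK_col,
       v.col1
FROM (VALUES ('a'), ('something'), ('more text')) v(col1) CROSS JOIN
     -- TABLOCKX takes an exclusive table lock that is held until COMMIT,
     -- so a concurrent session cannot read the same maximum and collide.
     (SELECT MAX(pk_col) as max_pk_col FROM DB1.dbo.table1 WITH (TABLOCKX)) l;
COMMIT TRANSACTION;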
One way to do this would be to create a new temporary table with an identity column and a data column, do your insert, and then insert the contents of this table back into your desired original table. Then clean it up.
BEGIN TRAN
DECLARE @initialIndex INT
SELECT @initialIndex = MAX(PK_col) + 1
FROM DB1.dbo.table1
-- IDENTITY seeds cannot be variables, so seed at 1 and add the offset later
CREATE TABLE #tempData(
PK_col INT IDENTITY(1, 1),
col1 VARCHAR(MAX)
)
INSERT INTO #tempData (col1)
VALUES
('a')
,('something')
,('more text')
INSERT INTO DB1.dbo.table1
SELECT PK_col + @initialIndex - 1, col1
FROM #tempData
DROP TABLE #tempData
COMMIT TRAN

DROP TABLE fails for temp table

I have a client application that creates a temp table, then performs a bulk insert into the temp table, then executes some SQL using the table before deleting it.
Pseudo-code:
open connection
begin transaction
CREATE TABLE #Temp ([Id] int NOT NULL)
bulk insert 500 rows into #Temp
UPDATE [OtherTable] SET [Status]=0 WHERE [Id] IN (SELECT [Id] FROM #Temp) AND [Group]=1
DELETE FROM #Temp WHERE [Id] IN (SELECT [Id] FROM [OtherTable] WHERE [Group]=1)
INSERT INTO [OtherTable] ([Group], [Id]) SELECT 1 as [Group], [Id] FROM #Temp
DROP TABLE #Temp
COMMIT TRANSACTION
CLOSE CONNECTION
This is failing with an error on the DROP statement:
Cannot drop the table '#Temp', because it does not exist or you do not have permission.
I can't imagine how this failure could occur without something else going on first, but I don't see any other failures occurring before this.
Is there anything that I'm missing that could be causing this to happen?
Possibly something is happening in the session in between?
Try checking for the existence of the table before it's dropped:
IF object_id('tempdb..#Temp') is not null
BEGIN
DROP TABLE #Temp
END
I've tested this on SQL Server 2005, and you can drop a temporary table in the transaction that created it:
begin transaction
create table #temp (id int)
drop table #temp
commit transaction
Which version of SQL Server are you using?
You might reconsider why you are dropping the temp table at all. A local temporary table is automatically deleted when the connection ends. There's usually no need to drop it explicitly.
A global temporary table starts with a double hash (e.g. ##MyTable). But even a global temp table is automatically deleted when no connection refers to it.
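A minimal illustration of that lifetime difference (a sketch; the table names are arbitrary):
-- Session 1: a local temp table, visible only to this session and dropped
-- automatically when this connection closes.
CREATE TABLE #MyTable (Id int);
-- Session 1: a global temp table, visible to every session and dropped
-- automatically once no session references it any more.
CREATE TABLE ##MyTable (Id int);
-- Session 2 can read the global one but not the local one:
SELECT * FROM ##MyTable;   -- works
SELECT * FROM #MyTable;    -- fails with an invalid object name error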
I think you aren't creating the table at all, because the statement
CREATE TABLE #Temp ([Id] AS int)
is incorrect. Please write it as
CREATE TABLE #Temp ([Id] int)
and see if it works.
BEGIN TRAN
IF object_id('DATABASE_NAME..#TABLE_NAME') is not null
BEGIN
DROP TABLE #TABLE_NAME
END
COMMIT TRAN
Note: replace TABLE_NAME with your table name and DATABASE_NAME with your database name.
