Avoid blocking other queries while truncating table - sql-server

Here's my problem. Let's say I have a long running SELECT query (#1):
select * from table1 -- assume this runs for a long time
While it's running, I run another SELECT (#2) from the same table. It runs in parallel and finishes in a second:
select top 1 * from table1
So #2 is not blocked by #1.
Now, let's say I want to run #3, which is truncate and reload of table1:
begin tran
truncate table table1
insert into table1
select * from table2
commit
#3 is blocked by #1 and has to wait, which is understandable. However, it also puts TABLOCK on the table1 and #2 cannot run either. Essentially #2 becomes blocked by #1.
Question: is there a way to run #3 in a way that is not blocking other queries?
I'd like to see #3 waiting for #1, but #2 still able to run.
I tried to check for locks before running #3, but can't see any locks in sys.dm_tran_locks view. Is there another place where I could see them?

I haven't done this in a while, but you could try it like this:
begin tran
CREATE TABLE table1_new <(identical to table1, incl. indexes, keys and triggers)>
insert into table1_new
select * from table2
drop table table1
RENAME N'table1_new', N'table1', N'OBJECT'
commit
You may need to try a higher isolation level to insure that your users don't get any "Object Not Found" errors.

Ok, I found the solution. Not sure why it wasn't working for me before, but sys.dm_tran_locks view is the way to go.
Basically, I'm waiting until there are no locks on the table and then start truncate/reload.
begin tran
--wait until there are no locks on the table
while exists(
select 1 from sys.dm_tran_locks
where resource_database_id = DB_ID()
and resource_associated_entity_id = OBJECT_ID(N'dbo.table1')
)
begin
waitfor delay '00:00:05'
end
truncate table table1
insert into table1
select * from table2
commit

A colleague of mine suggested another option, which I ended up using: set LOCK_TIMEOUT before running TRUNCATE.
If it can't lock the table it simply fails. (Of course there should be some mechanism to re-run it later on)
SET LOCK_TIMEOUT 0
TRUNCATE TABLE table1
Hope it helps someone.

Related

How do I acquire write locks in SQL Server?

I need to run a query that selects ten records. Then, based on their values and some outside information, update said records.
Unfortunately I am running into deadlocks when I do this in a multi-threaded fashion. Both threads A and B run their selects at the same time, acquiring read locks on the ten records. So when one of them tries to do an update, the other transaction is aborted.
So what I need to be able to say is "select and write-lock these ten records".
(Yea, I know serial transactions should be avoided, but this is a special case for me.)
Try applying UPDLOCK
BEGIN TRAN
SELECT * FROM table1
WITH (UPDLOCK, ROWLOCK)
WHERE col1 = 'value1'
UPDATE table1
set col1 = 'value2'
where col1 = 'value1'
COMMIT TRAN

Select and Delete in the same transaction using TOP clause

I have table in which the data is been continuously added at a rapid pace.
And i need to fetch record from this table and immediately remove them so i cannot process the same record second time. And since the data is been added at a faster rate, i need to use the TOP clause so only small number of records go to business logic for processing at the time.
I am using the below query to
BEGIN TRAN readrowdata
SELECT
top 5 [RawDataId],
[RawData]
FROM
[TABLE] with(HOLDLOCK)
WITH q AS
(
SELECT
top 5 [RawDataId],
[RawData]
FROM
[TABLE] with(HOLDLOCK)
)
DELETE from q
COMMIT TRANSACTION readrowdata
I am using the HOLDLOCK here, so new data cannot insert into the table while i am performing the SELECT and DELETE operation. I used it because Suppose if there are only 3 records in the table now, so the SELECT statement will get 3 records and in the same time new record gets inserted and the DELETE statement will delete 4 records. So i will loose 1 data here.
Is the query is ok in performance term? If i can improve it then please provide me your suggestion.
Thank you
Personally, I'd use a different approach. One with less locking, but also extra information signifying that certain records are currently being processed...
DECLARE #rowsBeingProcessed TABLE (
id INT
);
WITH rows AS (
SELECT top 5 [RawDataId] FROM yourTable WHERE processing_start IS NULL
)
UPDATE rows SET processing_start = getDate() WHERE processing_start IS NULL
OUTPUT INSERTED.RowDataID INTO #rowsBeingProcessed;
-- Business Logic Here
DELETE yourTable WHERE RowDataID IN (SELECT id FROM #rowsBeingProcessed);
Then you can also add checks like "if a record has been 'beingProcessed' for more than 10 minutes, assume that the business logic failed", etc, etc.
By locking the table in this way, you force other processes to wait for your transaction to complete. This can have very rapid consequences on scalability and performance - and it tends to be hard to predict, because there's often a chain of components all relying on your database.
If you have multiple clients each running this query, and multiple clients adding new rows to the table, the overall system performance is likely to deteriorate at some times, as each "read" client is waiting for a lock, the number of "write" clients waiting to insert data grows, and they in turn may tie up other components (whatever is generating the data you want to insert).
Diego's answer is on the money - put the data into a variable, and delete matching rows. Don't use locks in SQL Server if you can possibly avoid it!
You can do it very easily with TRIGGERS. Below mentioned is a kind of situation which will help you need not to hold other users which are trying to insert data simultaneously. Like below...
Data Definition language
CREATE TABLE SampleTable
(
id int
)
Sample Record
insert into SampleTable(id)Values(1)
Sample Trigger
CREATE TRIGGER SampleTableTrigger
on SampleTable AFTER INSERT
AS
IF Exists(SELECT id FROM INSERTED)
BEGIN
Set NOCOUNT ON
SET XACT_ABORT ON
Begin Try
Begin Tran
Select ID From Inserted
DELETE From yourTable WHERE ID IN (SELECT id FROM Inserted);
Commit Tran
End Try
Begin Catch
Rollback Tran
End Catch
End
Hope this is very simple and helpful
If I understand you correctly, you are worried that between your select and your delete, more records would be inserted and the first TOP 5 would be different then the second TOP 5?
If that so, why don't you load your first select into a temp table or variable (or at least the PKs) do whatever you have to do with your data and then do your delete based on this table?
I know that it's old question, but I found some solution here https://www.simple-talk.com/sql/learn-sql-server/the-delete-statement-in-sql-server/:
DECLARE #Output table
(
StaffID INT,
FirstName NVARCHAR(50),
LastName NVARCHAR(50),
CountryRegion NVARCHAR(50)
);
DELETE SalesStaff
OUTPUT DELETED.* INTO #Output
FROM Sales.vSalesPerson sp
INNER JOIN dbo.SalesStaff ss
ON sp.BusinessEntityID = ss.StaffID
WHERE sp.SalesLastYear = 0;
SELECT * FROM #output;
Maybe it will be helpfull for you.

Can I Select and Update at the same time?

This is an over-simplified explanation of what I'm working on.
I have a table with status column. Multiple instances of the application will pull the contents of the first row with a status of NEW, update the status to WORKING and then go to work on the contents.
It's easy enough to do this with two database calls; first the SELECT then the UPDATE. But I want to do it all in one call so that another instance of the application doesn't pull the same row. Sort of like a SELECT_AND_UPDATE thing.
Is a stored procedure the best way to go?
You could use the OUTPUT statement.
DECLARE #Table TABLE (ID INTEGER, Status VARCHAR(32))
INSERT INTO #Table VALUES (1, 'New')
INSERT INTO #Table VALUES (2, 'New')
INSERT INTO #Table VALUES (3, 'Working')
UPDATE #Table
SET Status = 'Working'
OUTPUT Inserted.*
FROM #Table t1
INNER JOIN (
SELECT TOP 1 ID
FROM #Table
WHERE Status = 'New'
) t2 ON t2.ID = t1.ID
Sounds like a queue processing scenario, whereby you want one process only to pick up a given record.
If that is the case, have a look at the answer I provided earlier today which describes how to implement this logic using a transaction in conjunction with UPDLOCK and READPAST table hints:
Row locks - manually using them
Best wrapped up in sproc.
I'm not sure this is what you are wanting to do, hence I haven't voted to close as duplicate.
Not quite, but you can SELECT ... WITH (UPDLOCK), then UPDATE.. subsequently. This is as good as an atomic operation as it tells the database that you are about to update what you previously selected, so it can lock those rows, preventing collisions with other clients. Under Oracle and some other database (MySQL I think) the syntax is SELECT ... FOR UPDATE.
Note: I think you'll need to ensure the two statements happen within a transaction for it to work.
You should do three things here:
Lock the row you're working on
Make sure that this and only this row is locked
Do not wait for the locked records: skip the the next ones instead.
To do this, you just issue this:
SELECT TOP 1 *
FROM mytable (ROWLOCK, UPDLOCK, READPAST)
WHERE status = 'NEW'
ORDER BY
date
UPDATE …
within a transaction.
A stored procedure is the way to go. You need to look at transactions. Sql server was born for this kind of thing.
Yes, and maybe use the rowlock hint to keep it isolated from the other threads, eg.
UPDATE
Jobs WITH (ROWLOCK, UPDLOCK, READPAST)
SET Status = 'WORKING'
WHERE JobID =
(SELECT Top 1 JobId FROM Jobs WHERE Status = 'NEW')
EDIT: Rowlock would be better as suggested by Quassnoi, but the same idea applies to do the update in one query.

In tsql is an Insert with a Select statement safe in terms of concurrency?

In my answer to this SO question I suggest using a single insert statement, with a select that increments a value, as shown below.
Insert Into VersionTable
(Id, VersionNumber, Title, Description, ...)
Select #ObjectId, max(VersionNumber) + 1, #Title, #Description
From VersionTable
Where Id = #ObjectId
I suggested this because I believe that this statement is safe in terms of concurrency, in that if another insert for the same object id is run at the same time, there is no chance of having duplicate version numbers.
Am I correct?
As Paul writes: No, it's not safe, for which I would like to add empirical evidence: Create a table Table_1 with one field ID and one record with value 0. Then execute the following code simultaneously in two Management Studio query windows:
declare #counter int
set #counter = 0
while #counter < 1000
begin
set #counter = #counter + 1
INSERT INTO Table_1
SELECT MAX(ID) + 1 FROM Table_1
end
Then execute
SELECT ID, COUNT(*) FROM Table_1 GROUP BY ID HAVING COUNT(*) > 1
On my SQL Server 2008, one ID (662) was created twice. Thus, the default isolation level applied to single statements is not sufficient.
EDIT: Clearly, wrapping the INSERT with BEGIN TRANSACTION and COMMIT won't fix it, since the default isolation level for transactions is still READ COMMITTED, which is not sufficient. Note that setting the transaction isolation level to REPEATABLE READ is also not sufficient. The only way to make the above code safe is to add
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
at the top. This, however, caused deadlocks every now and then in my tests.
EDIT: The only solution I found which is safe and does not produce deadlocks (at least in my tests) is to explicitly lock the table exclusively (default transaction isolation level is sufficient here). Beware though; this solution might kill performance:
...loop stuff...
BEGIN TRANSACTION
SELECT * FROM Table_1 WITH (TABLOCKX, HOLDLOCK) WHERE 1=0
INSERT INTO Table_1
SELECT MAX(ID) + 1 FROM Table_1
COMMIT
...loop end...
The default isolation of read commited makes this unsafe, if two of these run in perfect paralel you will get a duplicate since there is no read lock applied.
You need REPEATABLE READ or SERIALIZABLE isolation levels to make it safe.
I think you're assumption is incorrect. When you query the VersionNumber table, you are only putting a read lock on the row. This does not prevent other users from reading the same row from the same table. Therefore, it is possible for two processes to read the same row in the VersionNumber table at the same time and generate the same VersionNumber value.
You need a unique constraint on (Id, VersionNumber) to enforce it
I'd use ROWLOCK, XLOCK hints to block other folk reading the locked row where you calculate
or wrap the INSERT in a TRY/CATCH. If I get a duplicate, try again...

SQL Server READPAST hint

I'm seeing behavior which looks like the READPAST hint is set on the database itself.
The rub: I don't think this is possible.
We have table foo (id int primary key identity, name varchar(50) not null unique);
I have several threads which do, basically
id = select id from foo where name = ?
if id == null
insert into foo (name) values (?)
id = select id from foo where name = ?
Each thread is responsible for inserting its own name (no two threads try to insert the same name at the same time). Client is java.
READ_COMMITTED_SNAPSHOT is ON, transaction isolation is specifically set to READ COMMITTED, using Connection.setTransactionIsolation( Connection.TRANSACTION_READ_COMMITTED );
Symptom is that if one thread is inserting, the other thread can't see it's row -- even rows which were committed to the database before the application started -- and tries to insert, but gets a duplicate-key-exception from the unique index on name.
Throw me a bone here?
You're at the wrong isolation level. Remember what happens with the snapshot isolation level. If one transaction is making a change, no other concurrent transactions see that transaction. Period. Other transactions only will see your changes once you have committed, but only if they START after your commit. The solution to this is to use a different isolation level. Wrap your statements in a transaction and SET TRANSACTION LEVEL SERIALIZABLE. This will ensure that your other concurrent transactions work as if they were all run serially, which is what you seem to want here.
Sounds like you're not wrapping the select and insert into a transaction?
As a solution, you could:
insert into foo (col1,col2,col3) values ('a','b','c')
where not exists (select * from foo where col1 = 'a')
After this, ##rowcount will be 1 if can check if a row was inserted.
SELECT SCOPE_IDENTITY()
should do the trick here...
plus wrapping into a transaction like previous poster mentioned.
The moral of this story is fully explained in my blog post "You can't hold onto nothing" but the short version of this is that you want to use the HOLDLOCK hint. I use the pattern:
INSERT INTO dbo.Foo(Name)
SELECT TOP 1
#name AS Name
FROM (SELECT 1 AS FakeColumn) AS FakeTable
WHERE NOT EXISTS (SELECT * FROM dbo.Foo WITH (HOLDLOCK)
WHERE Name=#name)
SELECT ID FROM dbo.Foo WHERE Name=#name

Resources