I'm working on updating over 100 million rows in a table. To reduce the number of rows that are locked at once (and avoid a table lock altogether), I'm processing the updates with a while loop that works through the table 10k rows at a time.
I am running this on SQL Server 2012.
DECLARE @i int = 0
DECLARE @last int = 10000000

WHILE @i <= @last
BEGIN
    UPDATE mytbl WITH (ROWLOCK)
    SET foo = NULL
    WHERE id BETWEEN @i AND @i + 10000

    SET @i += 10000
    WAITFOR DELAY '00:00:01'
END
Does the 1 second wait in the code do anything in terms of improved performance, committing/flushing the transaction, or releasing transaction locks?
RE: "In order to reduce the number of rows that are locked (or even a table lock)..."
In general, you have a good approach. Recommendations:
I recommend that you wrap the UPDATE in an explicit transaction and commit it on each iteration. Otherwise you risk a single 100-million-row transaction. An explicit transaction also means I know exactly when each batch committed: BEGIN TRAN ... UPDATE ... COMMIT TRAN. I use explicit transactions for this type of work every time.
Reduce the increment from 10,000. I use 1,000 in all cases. You can run tests to determine the exact number that will trigger a table lock, and you can read Microsoft's guidelines on lock escalation for 2012 (lock escalation differs by version). But my goal, when I do this, is to never take a table lock. Emphasis on "never". I just use 1,000 and sleep better at night. I recommend 1,000 for you, but your mileage may vary.
Remove the WAITFOR. For the purposes of this loop, it only adds hours to the run time (10,000 seconds, or roughly 2.8 hours, for the original 10,000-row increment on 100 million rows).
I hope that helps. I have done between 50 and 100 of these on tables running into the billions of rows. The keys (as noted above): never take a table lock, and commit every iteration to keep the transactions small. Small transactions are your friend for many reasons.
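Putting those recommendations together, here is a minimal sketch of the loop (table and column names follow the question; the 1,000-row batch size is my suggested starting point, not a measured optimum):

DECLARE @i int = 0;
DECLARE @last int = 10000000;
DECLARE @batch int = 1000;   -- small enough to stay well under lock escalation

WHILE @i <= @last
BEGIN
    BEGIN TRAN;

    UPDATE mytbl WITH (ROWLOCK)
    SET foo = NULL
    WHERE id BETWEEN @i AND @i + @batch - 1;

    COMMIT TRAN;             -- commit every iteration; locks are released here

    SET @i += @batch;
END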
Related
I have a database table with thousands of entries. I have multiple worker threads, each of which picks up one row at a time and does some work on it (taking roughly one second per row). While picking up a row, each thread updates a flag on the database row (like a timestamp) so that the other threads do not pick it up. But the problem is that I end up in a scenario where multiple threads pick up the same row.
My general question is: what design approach should I follow here to ensure that each thread picks up a unique row and does its task independently?
Note: multiple threads are running in parallel to speed up the processing of the database rows, so I would like the critical section or exclusive lock to be as small as possible.
Just to give some context, below is the stored proc which picks up rows from the table after it has updated the flag on the row. Please note that the stored proc is not compilable as I have removed the irrelevant portions from it, but generally that's its structure.
The problem happens when multiple threads execute the stored proc in parallel. The change made by the UPDATE statement (note that the update is done after taking a lock) in one thread is not visible to the other threads until the transaction is committed. And as there is a SELECT statement (which takes around 50 ms) between the UPDATE and the COMMIT TRANSACTION, in about 20% of cases the UPDATE statement in one thread picks up a row which has already been processed.
I hope I am clear enough here.
USE [mydatabase]
GO

SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[GetRequest]
AS
BEGIN
    -- some variable declaration here

    BEGIN TRANSACTION

    -- check if there are blocking rows in the request table
    -- FM: Remove records that don't qualify for operation.
    -- delete operation on the table to remove rows we don't want to process
    DELETE FROM request WHERE somecondition = 1

    -- Identify the requests to process
    DECLARE @TmpTableVar table (TmpRequestId int NULL);

    UPDATE TOP (1) tur
    SET Lock = DATEADD(mi, 5, GETDATE())
    OUTPUT INSERTED.ID INTO @TmpTableVar
    FROM request tur WITH (ROWLOCK)
    WHERE (Lock IS NULL OR GETDATE() > Lock)  -- not locked or lock expired
      AND GETDATE() > NextRetry               -- next in the queue

    IF (@@ROWCOUNT = 0)
    BEGIN
        ROLLBACK TRANSACTION
        RETURN
    END

    SELECT @RequestID = TmpRequestId FROM @TmpTableVar

    -- Get details about the request that has been just updated
    SELECT somerows
    FROM request
    WHERE somecondition = 1

    COMMIT TRANSACTION
END
The analog of a critical section in SQL Server is sp_getapplock, which is simple to use. Alternatively you can SELECT the row to update with (UPDLOCK,READPAST,ROWLOCK) table hints. Both of these require a multi-statement transaction to control the duration of the exclusive locking.
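For the queue-pick pattern in the question, a rough sketch of the hint-based alternative might look like this (column names follow the question's request table; the detail SELECT is omitted):

BEGIN TRANSACTION;

DECLARE @RequestID int;

-- Claim one eligible row; READPAST skips rows currently locked by other threads,
-- and UPDLOCK holds the claim until commit
SELECT TOP (1) @RequestID = ID
FROM request WITH (UPDLOCK, READPAST, ROWLOCK)
WHERE (Lock IS NULL OR GETDATE() > Lock)
  AND GETDATE() > NextRetry;

IF @RequestID IS NOT NULL
BEGIN
    UPDATE request
    SET Lock = DATEADD(mi, 5, GETDATE())
    WHERE ID = @RequestID;

    -- ... read the details for @RequestID here ...
END

COMMIT TRANSACTION;  -- the claim is held until here, so keep this block short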
You need to set a transaction isolation level in SQL to isolate your row, but this can have an impact on your performance.
Look at this sample:
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
GO
BEGIN TRANSACTION
GO
SELECT ID, NAME, FLAG FROM SAMPLE_TABLE WHERE FLAG=0
GO
UPDATE SAMPLE_TABLE SET FLAG=1 WHERE ID=1
GO
COMMIT TRANSACTION
To finish: there is no single best isolation level to use. You need to analyze the positive and negative points of each isolation level and test your system's performance.
More information:
https://learn.microsoft.com/en-us/sql/t-sql/statements/set-transaction-isolation-level-transact-sql
http://www.besttechtools.com/articles/article/sql-server-isolation-levels-by-example
https://en.wikipedia.org/wiki/Isolation_(database_systems)
EDITED: I have a table with a composite key which is being used by multiple Windows services deployed on multiple servers.
Columns:
UserId (int) [CompositeKey],
CheckinTimestamp (bigint) [CompositeKey],
Status (tinyint)
There will be continuous insertion into this table. I want my Windows service to select the top 10000 rows and do some processing while locking those 10000 rows only. I am using ROWLOCK for this in the stored procedure below:
ALTER PROCEDURE LockMonitoringSession
AS
BEGIN
    BEGIN TRANSACTION

    SELECT TOP 10000 *
    INTO #TempMonitoringSession
    FROM dbo.MonitoringSession WITH (ROWLOCK)
    WHERE [Status] = 0
    ORDER BY UserId

    DECLARE @UserId INT
    DECLARE @CheckinTimestamp BIGINT

    DECLARE SessionCursor CURSOR FOR
        SELECT UserId, CheckinTimestamp FROM #TempMonitoringSession

    OPEN SessionCursor
    FETCH NEXT FROM SessionCursor INTO @UserId, @CheckinTimestamp

    WHILE @@FETCH_STATUS = 0
    BEGIN
        UPDATE dbo.MonitoringSession
        SET [Status] = 1
        WHERE UserId = @UserId AND CheckinTimestamp = @CheckinTimestamp

        FETCH NEXT FROM SessionCursor INTO @UserId, @CheckinTimestamp
    END

    CLOSE SessionCursor
    DEALLOCATE SessionCursor

    SELECT * FROM #TempMonitoringSession

    DROP TABLE #TempMonitoringSession

    COMMIT TRANSACTION
END
But by doing so, dbo.MonitoringSession stays locked until the stored procedure ends. I am not sure what I am doing wrong here.
The only purpose of this stored procedure is to select and update 10000 recent rows, without any primary key, while ensuring that the whole table is not locked, because multiple Windows services are accessing this table.
Thanks in advance for any help.
(Not an answer, but too long for a comment.)
The description of the purpose should explain why and what for you are updating the whole table. Your SP updates all rows with Status = 0 to Status = 1. So when one of your services decides to run this SP, all those rows become non-relevant: logically, the event which causes the status change has already occurred, you just need some time to physically record it in the database. So why do you want other services to read non-relevant rows? OK, probably you need to read the rows that are still available (not yet changed) - but again that's not clear, because you are updating the whole table.
You may use the READPAST hint to skip locked rows, and you need row locks for that.
OK, but even when processing only the top N rows, updating those N rows with one statement would be much faster than looping through that number of rows. You are doing the same job, just manually.
Check out this example of combining UPDLOCK + READPAST to process the same queue with parallel processes: https://www.mssqltips.com/sqlservertip/1257/processing-data-queues-in-sql-server-with-readpast-and-updlock/
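For illustration only, here is a rough sketch of how one statement could both claim and return a batch of rows from the question's dbo.MonitoringSession table (the hints and batch size are assumptions to test, and TOP without ORDER BY picks an arbitrary set of qualifying rows rather than the lowest UserIds):

DECLARE @Claimed TABLE (UserId int, CheckinTimestamp bigint);

-- Claim up to 10,000 unprocessed rows in one statement; READPAST skips rows
-- already locked by another service instance (the UPDATE itself takes the
-- exclusive locks, so no separate UPDLOCK is needed here)
UPDATE TOP (10000) ms
SET [Status] = 1
OUTPUT INSERTED.UserId, INSERTED.CheckinTimestamp INTO @Claimed
FROM dbo.MonitoringSession ms WITH (ROWLOCK, READPAST)
WHERE [Status] = 0;

-- The rows claimed by this instance, ready for processing
SELECT UserId, CheckinTimestamp FROM @Claimed;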
Small hint: a STATIC, READ_ONLY, FORWARD_ONLY cursor would do the same thing as storing into a temp table first. Review the STATIC option:
https://msdn.microsoft.com/en-us/library/ms180169.aspx
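For example, the cursor in the question could be declared directly against the base table instead of copying into a temp table first (a sketch only; the rest of the loop stays the same):

DECLARE SessionCursor CURSOR LOCAL FORWARD_ONLY STATIC READ_ONLY FOR
    SELECT TOP 10000 UserId, CheckinTimestamp
    FROM dbo.MonitoringSession
    WHERE [Status] = 0
    ORDER BY UserId;

STATIC materializes the result set in tempdb when the cursor is opened, which is effectively what the explicit temp table was doing.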
Another suggestion is to think about RCSI (READ_COMMITTED_SNAPSHOT isolation). This will avoid other services being blocked for sure, but it is a database-level option, so you'll have to test all your functionality. Most of it will work the same as before, but some scenarios need testing (concurrent transactions won't be blocked in situations where they were blocked before).
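Enabling it is a single database-level statement (the database name here is hypothetical; the option needs exclusive access, or WITH ROLLBACK IMMEDIATE, to take effect):

ALTER DATABASE [YourDatabase]
SET READ_COMMITTED_SNAPSHOT ON
WITH ROLLBACK IMMEDIATE;  -- rolls back open transactions so the option can be applied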
Not clear to me:
what is the percentage of 10000 out of the total number of rows?
is there a clustered index, or is this a heap?
what is the actual execution plan for the select and the update?
what are the concurrent transactions: inserts or selects?
By the way, I discovered a similar question:
why the entire table is locked while "with (rowlock)" is used in an update statement
My application runs a nightly purge process to delete old records from the primary tables in my OLTP application. I was experiencing lock escalation during the purge process which was blocking concurrent inserts into the table, so I modified the purge procedure to loop through and delete records in blocks of 4900 which should be well below SQL Server's lock escalation threshold of 5000. While lock escalation was much reduced, SQL Server Profiler still reports occasional lock escalation on the following DELETE statement in the loop:
-- outer loop increments @BatchMinId and @BatchMaxId variables
BEGIN TRAN

-- limit is set at 4900
DELETE TOP (@limit) h
OUTPUT DELETED.ChildTable1Id,
       DELETED.ChildTable2Id,
       DELETED.ChildTable3Id,
       DELETED.ChildTable4Id
INTO #ChildRecordsToDelete
FROM MainTable h WITH (ROWLOCK)
WHERE h.Id >= @BatchMinId AND h.Id <= @BatchMaxId AND h.Id < @MaxId AND
      NOT EXISTS (SELECT 1 FROM OtherTable ot WHERE ot.Id = h.Id);

-- delete from ChildTables 1-4 (no additional references to MainTable)
COMMIT TRAN;
-- end loop
The "IntegerData2" column in SQL Server Profiler for the reported lock escalation events (which is supposed to be the escalated lock count) ranges from 10197 to 10222 which does not look close to any multiple of 4900 (my purge batch size) plus any multiple of 1250 (number of additional locks SQL Server may take before attempting escalation).
Given that I am explicitly limiting the DELETE statement to 4900 rows, how are more locks ever being taken, especially to the point that SQL Server is escalating to a table lock? I would like to understand this before I resort to disabling lock escalation altogether on this table.
I can't comment on your question since I don't have enough reputation on this web site, so I'm commenting here.
I had a similar issue with a cleanup task running at night. The delete statement was blocked by the "GHOST CLEANUP" process.
Here, have a look at this:
SQL Server Lock Timeout Exceeded Deleting Records in a Loop
Hope this helps.
One weird solution that I found at the time was:
1) Insert the records you want to keep into another table with the same structure (a copy).
2) Truncate the table you want to clean.
3) Insert the kept data back from the copy into the now-empty table.
4) Truncate the copy table to release the space.
This trick was faster than the delete itself, because the truncate removed the rows in a split second. Somehow the cost of the insertions was lower than the cost of the deletions.
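A minimal sketch of that pattern, with hypothetical table and column names (an identity column would additionally need IDENTITY_INSERT handling, which is omitted here):

DECLARE @CutoffDate datetime2 = DATEADD(day, -30, SYSDATETIME());  -- hypothetical retention window

-- 1) Copy the rows to keep into a staging table with the same structure
INSERT INTO LogTable_Keep (Id, LoggedAt, Message)
SELECT Id, LoggedAt, Message
FROM LogTable
WHERE LoggedAt >= @CutoffDate;

-- 2) Empty the original table (fast and minimally logged)
TRUNCATE TABLE LogTable;

-- 3) Put the kept rows back
INSERT INTO LogTable (Id, LoggedAt, Message)
SELECT Id, LoggedAt, Message
FROM LogTable_Keep;

-- 4) Release the space used by the copy
TRUNCATE TABLE LogTable_Keep;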
But still, I would recommend avoiding this solution. You could also reduce the chunk size to somewhere between 100 and 500 rows. This increases the time the cleanup takes, but you are less likely to hit lock escalation.
I am trying to find out the performance characteristics or internal implementation of WAITFOR in T-SQL. I have gone through MSDN, Stack Overflow, and other sites without luck. Here is my question.
For the code below, I want to delete the top 10,000 rows from table DUMMY. I want this delete job to have the least possible performance impact on the database's other jobs and to give priority to them (if any). So I make it delete 100 rows at a time, 100 times, with a sleep between two adjacent deletes.
Question:
During the WAITFOR blocking time, will this transaction consume CPU, or will it just sit idle, waiting to be woken up by some event 1 second later?
During that 1 sec, if there are other transactions trying to INSERT/UPDATE on the DUMMY table, who gets priority?
I would really appreciate your help or any insights on this.
declare @cnt int
set @cnt = 0

while @cnt < 100
begin
    delete top (100) from DUMMYTABLE where FOO = 'BAR'
    set @cnt = @cnt + 1
    waitfor delay '00:00:01'
end
It does not consume any CPU
Status = suspended
You can see this with 2 query windows:
SELECT @@SPID;
GO
WAITFOR DELAY '000:03:00'; -- three minutes
Then in the other
SELECT * FROM sys.sysprocesses S WHERE S.spid = 53; -- replace 53
Note: SQL Server 2012 SP1 but AFAIK behaviour is the same
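If you prefer the DMVs over sys.sysprocesses, the same check can be done like this (a sketch; substitute the session id reported by the first window):

SELECT session_id, status, command, wait_type, wait_time
FROM sys.dm_exec_requests
WHERE session_id = 53;  -- replace 53 with the @@SPID from the first window

While the WAITFOR is running you should see status = 'suspended' and wait_type = 'WAITFOR'.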
Point 2 - sorry, I missed this:
Another session can modify the table while the WAITFOR is running. It isn't a lock.
Here is my scenario: we have a database, let's call it Logging, with a table that holds records from Log4Net (via MSMQ). The db's recovery model is set to Simple: we don't care about the transaction logs -- they can roll over.
We have a job that uses data from sp_spaceused to determine if we've met a certain size threshold. If the threshold is exceeded, we determine how many rows need to be deleted to bring the size down to x percent of that threshold. (As an aside, I'm using exec sp_spaceused MyLogTable, TRUE to get the number of rows and a rough approximation of their average size, although I'm not convinced that's the best way to go about it. But that's a different issue.)
I then try to chunk deletes (say, 5000 at a time) by looping a call to a sproc that basically does this:
DELETE TOP (#RowsToDelete) FROM [dbo].[MyLogTable]
until I've deleted what needs to be deleted.
Here's the issue: If I have a lot of rows to delete, the transaction log file fills up. I can watch it grow by running
dbcc sqlperf (logspace)
What puzzles me is that, when the job fails, ALL deleted rows get rolled back. In other words, it appears all the chunks are getting wrapped (somehow) in an implicit transaction.
I've tried expressly setting implicit transactions off, wrapping each DELETE statement in a BEGIN and COMMIT TRAN, but to no avail: either all deleted chunks succeed, or none at all.
I know the simple answer is, Make your log file big enough to handle the largest possible number of records you'd ever delete, but still, why is this being treated as a single transaction?
Sorry if I missed something easy, but I've looked at a lot of posts regarding log file growth, recovery modes, etc., and I can't figure this out.
One other thing: Once the job has failed, the log file stays up at around 95 - 100 percent full for a while before it drops back. However, if I run
checkpoint
dbcc dropcleanbuffers
it drops right back down to about 5 percent utilization.
TIA.
The log file in the simple recovery model is truncated automatically at every checkpoint, generally speaking. You can invoke a checkpoint manually as you do at the end of the loop, but you can also do it on every iteration. The frequency of checkpoints is by default determined automatically by SQL Server based on the recovery interval setting.
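For example, a minimal sketch of the chunked delete with an explicit checkpoint on each iteration (table name and batch size follow the question; the loop bound is just an illustration):

DECLARE @RowsToDelete int = 50000;   -- example target, not from the question
DECLARE @RowsDeleted int = 0;
DECLARE @BatchSize int = 5000;

WHILE @RowsDeleted < @RowsToDelete
BEGIN
    DELETE TOP (@BatchSize) FROM [dbo].[MyLogTable];

    SET @RowsDeleted = @RowsDeleted + @@ROWCOUNT;

    CHECKPOINT;  -- in SIMPLE recovery this lets the log space used by the batch be reused
END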
As far as 'all deletes are rolled back' goes, I don't see any explanation other than an external transaction. Can you post the entire code that cleans up the log? How do you invoke this code?
What is your setting of implicit transactions?
Hm.. if the log grows and doesn't truncate automatically, it may also indicate that there is a transaction running outside of the loop. Can you select @@TRANCOUNT before your loop, and perhaps with each iteration, to find out what's going on?
Well, I tried several things, but still all the deletes get rolled back. I added printing of @@TRANCOUNT both before and after the delete and I get zero as the count. Yet, on failure, all deletes are rolled back .... I added SET IMPLICIT_TRANSACTIONS OFF in several places (including within my initial call from Query Analyzer), but that does not seem to help. This is the body of the stored procedure that is being called (I have set @RowsToDelete to 5000 and 8000):
SET NOCOUNT ON;

print N'@@TRANCOUNT PRIOR TO DELETE: ' + CAST(@@TRANCOUNT AS VARCHAR(20));

set implicit_transactions off;

WITH RemoveRows AS
(
    SELECT ROW_NUMBER() OVER (ORDER BY [Date] ASC) AS RowNum
    FROM [dbo].[Log4Net]
)
DELETE FROM RemoveRows
WHERE RowNum < @RowsToDelete + 1

print N'@@TRANCOUNT AFTER DELETE: ' + CAST(@@TRANCOUNT AS VARCHAR(20));
It is called from this t-sql:
WHILE @RowsDeleted < @RowsToDelete
BEGIN
    EXEC [dbo].[DeleteFromLog4Net] @RowsToDelete

    SET @RowsDeleted = @RowsDeleted + @RowsToDelete
    SET @loops = @loops + 1

    print 'Loop: ' + cast(@loops as varchar(10))
END
I have to admit I am puzzled. I am not a DB guru, but I thought I understood enough to figure this out....