What is the impact of WAITFOR on other processes and transactions? - sql-server

I'm trying to find out the performance characteristics and internal implementation of WAITFOR in T-SQL. I have gone through MSDN, Stack Overflow, and other sites without luck. Here is my question:
For the code below, I want to delete the top 10,000 rows from table DUMMYTABLE. I want this delete job to have as little performance impact as possible on the database's other jobs and to give priority to them (if any), so I delete 100 rows at a time, 100 times, with a sleep between two adjacent deletes.
Question:
During the WAITFOR delay, will this transaction consume CPU, or does it sit idle waiting to be woken up by some event 1 second later?
During that 1 second, if other transactions try to INSERT/UPDATE the DUMMYTABLE table, who gets priority?
I'd really appreciate your help or any insights on this.
declare @cnt int
set @cnt = 0
while @cnt < 100
begin
    delete top (100) from DUMMYTABLE where FOO = 'BAR'
    set @cnt = @cnt + 1
    waitfor delay '00:00:01'
end

It does not consume any CPU: the session's status is suspended.
You can see this with 2 query windows:
SELECT @@SPID;
GO
WAITFOR DELAY '00:03:00'; -- three minutes
Then in the other
SELECT * FROM sys.sysprocesses S WHERE S.spid = 53; -- replace 53 with the SPID returned in the first window
Note: tested on SQL Server 2012 SP1, but AFAIK the behaviour is the same on other versions.
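The same thing is visible in the newer DMVs; a quick sketch (run from another window, substituting the SPID):
-- While the WAITFOR runs: status = 'suspended', wait_type = 'WAITFOR',
-- and cpu_time stops climbing because the scheduler is not running the task.
SELECT session_id, status, wait_type, wait_time, cpu_time
FROM sys.dm_exec_requests
WHERE session_id = 53; -- replace 53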
Point 2, sorry, I missed this: another session can modify the table while the WAITFOR is running. WAITFOR is not a lock.
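A quick way to see this for yourself (a sketch, assuming the DUMMYTABLE from the question):
-- Window 1: start a long wait
WAITFOR DELAY '00:01:00';
-- Window 2: completes immediately; the waiting session holds no table locks
INSERT INTO DUMMYTABLE (FOO) VALUES ('BAR');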

Related

SQL Server - Batch Processing and 1 second waits

I'm working on updating over 100 million rows in a table. In order to reduce the number of rows that are locked (or even avoid a table lock), I'm processing the updates with a combination of a while loop and increments of 10k rows at a time.
I am running this on SQL Server 2012.
DECLARE @i int = 0
DECLARE @last int = 10000000
WHILE @i <= @last
BEGIN
    UPDATE mytbl WITH (ROWLOCK)
    SET foo = null
    WHERE id BETWEEN @i AND @i + 10000
    SET @i += 10000
    WAITFOR DELAY '00:00:01'
END
Does the 1 second wait in the code do anything in terms of improved performance, committing/flushing the transaction, or releasing transaction locks?
RE: "In order to reduce the number of rows that are locked (or even a table lock)..."
In general, you have a good approach. Recommendations:
I recommend that you wrap the UPDATE in an explicit transaction and commit on each iteration. Otherwise you risk having a 100-million-row transaction. Explicit transactions mean that I know it committed: BEGIN TRAN ... UPDATE ... COMMIT TRAN. I use explicit transactions for this type of work every time.
Reduce the increment from 10,000. I use 1,000 in all cases. You can run tests to determine the exact number that will trigger lock escalation to a table lock, and you can read Microsoft's lock escalation guidelines for 2012 (lock escalation differs by version). But my goal, when I do this, is to never take a table lock. Emphasis on "never". I just use 1,000 and sleep better at night. I recommend 1,000 for you, but your mileage may vary.
Remove the WAITFOR. For the purposes of this loop, it only adds hours to the run time (10,000 seconds, nearly three hours, for the original 10,000-row increment on 100 million rows).
I hope that helps. I have done between 50 and 100 of these on tables running into the billions of rows. The keys (as noted above): never table lock, and commit every iteration to keep the transaction small. Small transactions are your friend for many reasons. A sketch combining these points follows.
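A minimal sketch of the loop with those changes applied (assuming the mytbl, id, and foo names from the question; the 1,000-row batch size is the recommendation above):
DECLARE @i int = 0
DECLARE @last int = 10000000
WHILE @i <= @last
BEGIN
    BEGIN TRAN -- explicit transaction: each batch commits on its own
    UPDATE mytbl WITH (ROWLOCK)
    SET foo = null
    WHERE id >= @i AND id < @i + 1000 -- 1,000 rows stays well below lock escalation
    COMMIT TRAN
    SET @i += 1000
    -- no WAITFOR: it only stretches the run time once batches are small
END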

Preventing blocking when using cursor over stored proc in transaction

I'm trying to work out how I can prevent blocking while running my code. There are a few things I could swap out, and I'm not sure which would be best. I'm including some fake code below to illustrate my current layout; please look past any minor syntax errors.
SELECT ID
INTO #ID
FROM Table
DECLARE @TargetID int = 0
BEGIN TRAN
DECLARE ID_Cursor CURSOR FOR
SELECT ID
FROM #ID
OPEN ID_Cursor
FETCH NEXT FROM ID_Cursor
INTO @TargetID
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC usp_ChangeID
        @ID = @TargetID,
        @NewValue = 100
    FETCH NEXT FROM ID_Cursor INTO @TargetID
END
CLOSE ID_Cursor
DEALLOCATE ID_Cursor
IF ((SELECT COUNT(*) FROM Table WHERE Value = 100) = 10000)
    COMMIT TRAN
ELSE
    ROLLBACK TRAN
The problem I'm encountering is that usp_ChangeID updates about 15 tables each run, and other SPIDs that want to work with any of those tables have to wait until the entire process is done running. The stored proc itself runs in about a second, but I need to run it repeatedly. I'm thinking that these locks are because of the transaction rather than the cursor itself, though I'm not 100% sure. Ideally, my code would finish one run of the stored proc, let other users through to the tables, then run again once that other operation is complete. The rows I'm working with each time shouldn't be frequently used, so a row lock would be perfect, but the blocking I'm seeing implies that isn't happening.
This is running on production data so I want to leave as little impact as possible while this runs. Performance hits are fine if it means less blocking. I don't want to break this into chunks because I generally want this to only be saved if every record is updated as expected and rolled back in any other case. I can't modify the proc, go around the proc, or do this without a cursor involved either. I'm leaning towards breaking my initial large select into smaller chunks, but I'd rather not have to change parts manually.
Thanks in advance!

How can I set a lock inside a stored procedure?

I've got a long-running stored procedure on a SQL Server database. I don't want it to run more often than once every ten minutes.
Once the stored procedure has run, I want to store the latest result in a LatestResult table, against a time, and have all calls to the procedure return that result for the next ten minutes.
That much is relatively simple, but we've found that, because the procedure checks the LatestResult table and updates it, that large userbases are getting a number of deadlocks, when two users call the procedure at the same time.
In a client-side/threading situation, I would solve this by using a lock, having the first user lock the function, the second user encounters the lock, waiting for the result, the first user finishes their procedure call, updates the LatestResult table, and unlocks the second user, who then picks up the result from the LatestResult table.
Is there any way to accomplish this kind of locking in SQL Server?
EDIT:
This is basically how the code looks without its error checking calls:
DECLARE @LastChecked AS DATETIME
DECLARE @LastResult AS NUMERIC(18,2)
SELECT TOP 1 @LastChecked = LastRunTime, @LastResult = LastResult FROM LastResult
DECLARE @ReturnValue AS NUMERIC(18,2)
IF DATEDIFF(n, @LastChecked, GETDATE()) >= 10 OR NOT @LastResult = 0
BEGIN
SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount,0)),0)) FROM Transactions WHERE ISNULL(DeletedFlag,0) = 0 GROUP BY GroupID ORDER BY ABS(ISNULL(SUM(ISNULL(Amount,0)),0))
UPDATE LastResult SET LastRunTime = GETDATE(), LastResult = @ReturnValue
SELECT @ReturnValue
END
ELSE
BEGIN
SELECT @LastResult
END
I'm not really sure what's going on with the grouping, but I've found a test system where execution time is coming in around 4 seconds.
I think there's some work scheduled to archive some of these records and boil them down to running totals, which will probably help things given that there's several million rows in that four second table...
This is a valid opportunity to use an Application Lock (see sp_getapplock and sp_releaseapplock), as it is a lock taken out on a concept that you define, not on any particular rows in any given table. The idea is that you create a transaction, then create this arbitrary lock that has an identifier, and other processes will wait to enter that piece of code until the lock is released. This works just like lock() at the app layer. The @Resource parameter is the label of the arbitrary "concept". In more complex situations, you can even concatenate a CustomerID or something in there for more granular locking control.
DECLARE @LastChecked DATETIME,
        @LastResult NUMERIC(18,2);
DECLARE @ReturnValue NUMERIC(18,2);
BEGIN TRANSACTION;
EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';
SELECT TOP 1 -- not sure if this helps the optimizer on a 1 row table, but seems ok
       @LastChecked = LastRunTime,
       @LastResult = LastResult
FROM LastResult;
IF (DATEDIFF(MINUTE, @LastChecked, GETDATE()) >= 10 OR @LastResult <> 0)
BEGIN
    SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount, 0)), 0))
    FROM Transactions
    WHERE DeletedFlag = 0
    OR DeletedFlag IS NULL;
    UPDATE LastResult
    SET LastRunTime = GETDATE(),
        LastResult = @ReturnValue;
END
ELSE
BEGIN
    SET @ReturnValue = @LastResult; -- This is always 0 here
END;
SELECT @ReturnValue AS [ReturnValue];
EXEC sp_releaseapplock @Resource = 'check_timing';
COMMIT TRANSACTION;
You need to manage errors / ROLLBACK yourself (as stated in the linked MSDN documentation), so put in the usual TRY / CATCH (see the sketch after the note below). But this does allow you to manage the situation.
If there are any concerns regarding contention on this process, there shouldn't be much as the lookup done right after locking the resource is a SELECT from a single-row table and then an IF statement that (ideally) just returns the last known value if the 10-minute timer hasn't elapsed. Hence, most calls should process rather quickly.
Please note: sp_getapplock / sp_releaseapplock should be used sparingly; Application Locks can definitely be very handy (such as in cases like this one) but they should only be used when absolutely necessary.
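For completeness, a minimal sketch of that error handling, assuming the same resource name as above (the work itself is elided):
BEGIN TRANSACTION;
BEGIN TRY
    EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';
    -- ... the lookup / update / select shown above ...
    EXEC sp_releaseapplock @Resource = 'check_timing';
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF @@TRANCOUNT > 0
        ROLLBACK TRANSACTION; -- rolling back also releases a transaction-owned app lock
    THROW; -- re-raise the error to the caller (SQL Server 2012+)
END CATCH;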

Deleting 1 million rows in SQL Server

I am working on a client's database and there are about 1 million rows that need to be deleted due to a bug in the software. Is there an efficient way to delete them besides:
DELETE FROM table_1 where condition1 = 'value' ?
Here is a structure for a batched delete as suggested above. Do not try 1M at once...
The size of the batch and the WAITFOR delay are obviously quite variable, and would depend on your server's capabilities, as well as your need to mitigate contention. You may need to manually delete some rows, measure how long they take, and adjust your batch size to something your server can handle. As mentioned above, anything over 5,000 rows can trigger lock escalation to a table lock (which I was not aware of).
This would be best done after hours... but 1M rows is really not a lot for SQL Server to handle. If you watch the Messages tab in SSMS, it may take a while for the print output to show; it will appear after several batches, just be aware it won't update in real time.
Edit: Added a stop time via @MAXRUNTIME and @BSTOPATMAXTIME. If you set @BSTOPATMAXTIME to 1, the script will stop on its own at the desired time, say 8:00AM. This way you can schedule it nightly to start at, say, midnight, and it will stop before production at 8AM.
Edit: Answer is pretty popular, so I have added the RAISERROR in lieu of PRINT per comments.
DECLARE @BATCHSIZE INT, @WAITFORVAL VARCHAR(8), @ITERATION INT, @TOTALROWS INT, @MAXRUNTIME VARCHAR(8), @BSTOPATMAXTIME BIT, @MSG VARCHAR(500)
SET DEADLOCK_PRIORITY LOW;
SET @BATCHSIZE = 4000
SET @WAITFORVAL = '00:00:10'
SET @MAXRUNTIME = '08:00:00' -- 8AM
SET @BSTOPATMAXTIME = 1 -- ENFORCE 8AM STOP TIME
SET @ITERATION = 0 -- LEAVE THIS
SET @TOTALROWS = 0 -- LEAVE THIS
WHILE @BATCHSIZE>0
BEGIN
    -- IF @BSTOPATMAXTIME = 1, THEN WE'LL STOP THE WHOLE JOB AT A SET TIME...
    IF CONVERT(VARCHAR(8),GETDATE(),108) >= @MAXRUNTIME AND @BSTOPATMAXTIME=1
    BEGIN
        RETURN
    END
    DELETE TOP(@BATCHSIZE)
    FROM SOMETABLE
    WHERE 1=2 -- YOUR DELETE CONDITION GOES HERE (1=2 deletes nothing)
    SET @BATCHSIZE=@@ROWCOUNT
    SET @ITERATION=@ITERATION+1
    SET @TOTALROWS=@TOTALROWS+@BATCHSIZE
    SET @MSG = 'Iteration: ' + CAST(@ITERATION AS VARCHAR) + ' Total deletes:' + CAST(@TOTALROWS AS VARCHAR)
    RAISERROR (@MSG, 0, 1) WITH NOWAIT
    WAITFOR DELAY @WAITFORVAL
END
BEGIN TRANSACTION
DoAgain:
DELETE TOP (1000)
FROM <YourTable>
IF @@ROWCOUNT > 0
GOTO DoAgain
COMMIT TRANSACTION
Maybe this solution, from Uri Dimant:
WHILE 1 = 1
BEGIN
    DELETE TOP(2000)
    FROM Foo
    WHERE <predicate>;
    IF @@ROWCOUNT < 2000 BREAK;
END
(Link: https://social.msdn.microsoft.com/Forums/sqlserver/en-US/b5225ca7-f16a-4b80-b64f-3576c6aa4d1f/how-to-quickly-delete-millions-of-rows?forum=transactsql)
Here is something I have used:
If the bad data is mixed in with the good:
SELECT columns
INTO #table
FROM old_table
WHERE <statement to exclude bad rows>
TRUNCATE TABLE old_table
INSERT INTO old_table
SELECT columns FROM #table
Not sure how good this would be, but what if you do it like below (provided table_1 is a standalone table, i.e. not referenced by any other table):
create a duplicate of table_1, e.g. table_1_dup, with the same structure
insert into table_1_dup select * from table_1 where condition1 <> 'value';
drop table table_1;
exec sp_rename 'table_1_dup', 'table_1';
If you cannot afford to get the database out of production while repairing, do it in small batches. See also: How to efficiently delete rows while NOT using Truncate Table in a 500,000+ rows table
If you are in a hurry and need the fastest way possible (a sketch of the index step follows the list):
take the database out of production
drop all non-clustered indexes and triggers
delete the records (or if the majority of records is bad, copy+drop+rename the table)
(if applicable) fix the inconsistencies caused by the fact that you dropped triggers
re-create the indexes and triggers
bring the database back in production
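For the index step, a rough sketch (the index name is hypothetical; disable only nonclustered indexes, since disabling the clustered index makes the table inaccessible):
ALTER INDEX IX_table_1_condition1 ON table_1 DISABLE; -- repeat for each nonclustered index
DELETE FROM table_1 WHERE condition1 = 'value';
ALTER INDEX ALL ON table_1 REBUILD; -- re-enables the disabled indexes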

If not Exists logic

I am doing something like this:
exec up_sp1 -- executing this stored procedure which populates the table sqlcmds
-- Check the count of the table SQLCmds;
-- if the count is zero then execute up_sp2,
-- otherwise wait till the count becomes zero, then exec up_sp2
IF NOT EXISTS ( SELECT 1 FROM [YesMailReplication].[dbo].[SQLCmds])
BEGIN
exec up_sp2
END
What would the correct T-SQL look like?
T-SQL has no WAITFOR semantics except for Service Broker queues. So all you can do, short of using Service Broker, is poll periodically and see whether the table has been populated. At small scale this works fine, but at high scale it breaks down: the right balance between wait time and poll frequency is difficult to achieve, and it is even harder to make it adapt to spikes and lulls.
But if you are willing to use Service Broker, then you can build a much more elegant and scalable solution by leveraging activation: up_sp1 drops a message into a queue, and this message activates the queue procedure, which in turn launches up_sp2 after up_sp1 has committed. This is a reliable mechanism that handles server restarts, mirroring and clustering failover, and even rebuilding of the server from backups. See Asynchronous procedure execution for an example of achieving something very similar.
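For reference, a rough sketch of the activation wiring (all object names here are hypothetical, and poison-message and error handling are omitted; see the linked article for a production-grade version):
CREATE MESSAGE TYPE RunSp2 VALIDATION = NONE;
CREATE CONTRACT RunSp2Contract (RunSp2 SENT BY INITIATOR);
CREATE QUEUE dbo.Sp2Queue;
CREATE SERVICE Sp2Service ON QUEUE dbo.Sp2Queue (RunSp2Contract);
GO
-- Activated procedure: runs when a message arrives, i.e. after up_sp1 commits.
CREATE PROCEDURE dbo.up_sp2_activated
AS
BEGIN
    DECLARE @h UNIQUEIDENTIFIER, @mt SYSNAME;
    RECEIVE TOP (1) @h = conversation_handle, @mt = message_type_name
    FROM dbo.Sp2Queue;
    IF @mt = N'RunSp2'
    BEGIN
        END CONVERSATION @h;
        EXEC dbo.up_sp2;
    END
    ELSE IF @mt = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog'
        END CONVERSATION @h;
END;
GO
ALTER QUEUE dbo.Sp2Queue WITH ACTIVATION (
    STATUS = ON, PROCEDURE_NAME = dbo.up_sp2_activated,
    MAX_QUEUE_READERS = 1, EXECUTE AS OWNER);
GO
-- At the end of up_sp1, queue the message; it is delivered only on commit:
DECLARE @h UNIQUEIDENTIFIER;
BEGIN DIALOG CONVERSATION @h
    FROM SERVICE Sp2Service TO SERVICE 'Sp2Service'
    ON CONTRACT RunSp2Contract WITH ENCRYPTION = OFF;
SEND ON CONVERSATION @h MESSAGE TYPE RunSp2;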
The Service Broker solution is surely the best - but there is a WAITFOR solution as well:
exec up_sp1;
while exists (select * from [YesMailReplication].[dbo].[SQLCmds]) begin
    waitfor delay '00:00:10'; -- wait for 10 seconds
end;
exec up_sp2;
Try this:
DECLARE @Count int
SELECT @Count = COUNT(*) FROM [YesMailReplication].[dbo].[SQLCmds]
IF @Count = 0 BEGIN
    exec up_sp2
END
Why not keep it simple and self-documenting?
DECLARE @Count int;
SELECT @Count = COUNT(*) FROM [YesMailReplication].[dbo].[SQLCmds];
IF @Count = 0 EXEC up_sp2;
