How to execute in parallel using Transact-SQL? - sql-server

I need to call a stored procedure with hundreds of different parameter sets in a scheduled SQL Agent job. Right now it executes sequentially. I want to execute the stored procedure with N (e.g. N = 8) different parameter sets at the same time.
Is there a good way to implement this in Transact-SQL? Can SQL Server Service Broker be used for this purpose? Any other options?

There is mention in a comment on the question of a table that holds the various parameters to call the proc with, and that the execution times vary a lot across the parameter values.
If you are able to add two fields to the table of parameters (StartTime DATETIME and EndTime DATETIME), then you can create 7 more SQL Agent Jobs and have them all scheduled to run at the same time.
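For reference, that schema change is nothing more than the following (the table name Schema.ParameterTable is taken from the code below):

ALTER TABLE Schema.ParameterTable
    ADD StartTime DATETIME NULL, -- stamped when a job claims the row
        EndTime   DATETIME NULL; -- stamped when the proc call completes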
The Job Step of each Job should be the same and should be similar to the following:
DECLARE @Params TABLE (ParamID INT, Param1 DataType, Param2 DataType, ...);
DECLARE @ParamID INT,
        @Param1Variable DataType,
        @Param2Variable DataType,
        ...;

WHILE (1 = 1)
BEGIN
    -- Claim the next unclaimed parameter set by stamping its StartTime,
    -- capturing the claimed values via the OUTPUT clause
    UPDATE TOP (1) param
    SET    param.StartTime = GETDATE() -- or GETUTCDATE()
    OUTPUT INSERTED.ParamID, INSERTED.Param1, INSERTED.Param2, ...
    INTO   @Params (ParamID, Param1, Param2, ...)
    FROM   Schema.ParameterTable param
    WHERE  param.StartTime IS NULL;

    IF (@@ROWCOUNT = 0)
    BEGIN
        BREAK; -- no rows left to process so just exit
    END;

    SELECT @ParamID = tmp.ParamID,
           @Param1Variable = tmp.Param1,
           @Param2Variable = tmp.Param2
    FROM   @Params tmp;

    BEGIN TRY
        EXEC Schema.MyProc @Param1Variable, @Param2Variable, ... ;

        UPDATE param
        SET    param.EndTime = GETDATE() -- or GETUTCDATE()
        FROM   Schema.ParameterTable param
        WHERE  param.ParamID = @ParamID;
    END TRY
    BEGIN CATCH
        ... do something here ...
    END CATCH;

    DELETE FROM @Params; -- clear out the last set of params
END;
That general structure should allow the 8 SQL Agent Jobs to run until all of the parameter value sets have been executed. It accounts for the fact that some sets run faster than others: each Job just picks the next available set off the queue until there are none left, at which point the Job cleanly exits.
Two things to consider adding to the above structure:
A way of resetting the StartTime field to NULL so that a row can be re-run later.
A way of handling errors, i.e. cleaning up rows where StartTime IS NOT NULL AND EndTime IS NULL and the DATEDIFF between StartTime and GETDATE / GETUTCDATE is too large. A TRY / CATCH could do it by either setting StartTime back to NULL so the row gets re-run, OR maybe by adding a third field, ErrorTime DATETIME, that is reset to NULL at the start of the run (like the other 2 fields) but only set if an error happens. Those are just some thoughts.
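A minimal cleanup sketch covering both points (the 60-minute threshold is an assumption; pick whatever makes sense for the workload):

-- Re-queue rows that were claimed but never finished and have been
-- "in flight" for longer than the chosen threshold
UPDATE param
SET    param.StartTime = NULL
FROM   Schema.ParameterTable param
WHERE  param.StartTime IS NOT NULL
  AND  param.EndTime IS NULL
  AND  DATEDIFF(MINUTE, param.StartTime, GETDATE()) > 60;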

SQL Server has nothing native built in to issue parallel queries from a T-SQL batch. You need an external driver: something that connects on N connections.
SQL Agent can do that if you create N jobs and start them manually. It is a hack, but it will work.
It is probably easier to write a small C# app to do this and put it into Windows Task Scheduler.
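If you take the N-jobs route, the jobs can themselves be created and started from T-SQL. A minimal sketch, assuming the worker loop from the first answer has been saved as a procedure named dbo.ProcessParameterQueue in a database named MyDatabase (both names hypothetical):

DECLARE @i INT = 1, @JobName SYSNAME;

WHILE (@i <= 8)
BEGIN
    SET @JobName = N'ParamWorker_' + CAST(@i AS NVARCHAR(10));

    EXEC msdb.dbo.sp_add_job @job_name = @JobName;
    EXEC msdb.dbo.sp_add_jobstep @job_name = @JobName,
                                 @step_name = N'Run worker loop',
                                 @subsystem = N'TSQL',
                                 @database_name = N'MyDatabase',
                                 @command = N'EXEC dbo.ProcessParameterQueue;';
    EXEC msdb.dbo.sp_add_jobserver @job_name = @JobName; -- target the local server

    EXEC msdb.dbo.sp_start_job @job_name = @JobName;
    SET @i = @i + 1;
END;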

Related

TVF is much slower when using parameterized query

I am trying to run an inline TVF as a raw parameterized SQL query.
When I run the following query in SSMS, it takes 2-3 seconds:
select * from dbo.history('2/1/15','1/1/15','1/31/15',2,2021,default)
I was able to capture the following query through SQL Profiler (parameterized, as generated by Entity Framework) and run it in SSMS.
exec sp_executesql N'select * from dbo.history(@First,@DatedStart,@DatedEnd,@Number,@Year,default)',
    N'@First date,@DatedStart date,@DatedEnd date,@Maturity int,@Number decimal(10,5)',
    @First='2015-02-01',@DatedStart='2015-01-01',@DatedEnd='2015-01-31',@Year=2021,@Number=2
Running the above query in SSMS takes 1:08, which is around 30x longer than the non-parameterized version.
I have tried adding option(recompile) to the end of the parameterized query, but it did absolutely nothing for performance. This looks like an indexing issue to me, but I have no idea how to resolve it.
Looking at the execution plan, the parameterized version mostly gets hung up on an Eager Spool (46%) and then a Clustered Index Scan (30%), neither of which is present in the execution plan without parameters.
Perhaps there is something I am missing; can someone please point me in the right direction as to how I can get this parameterized query to perform properly?
EDIT: Parameterized query execution plan, non-parameterized plan
Maybe it's a parameter sniffing problem.
Try modifying your function so that the parameters are assigned to local variables, and use the local variables in your SQL instead of the parameters.
So your function would have this structure:
CREATE FUNCTION history(
    @First Date,
    @DatedStart Date,
    @DatedEnd Date,
    @Maturity int,
    @Number decimal(10,5))
RETURNS @table TABLE (
    --tabledef
)
AS
BEGIN
    -- Copy each parameter into a local variable up front
    Declare @FirstVar Date = @First
    Declare @DatedStartVar Date = @DatedStart
    Declare @DatedEndVar Date = @DatedEnd
    Declare @MaturityVar int = @Maturity
    Declare @NumberVar decimal(10,5) = @Number

    --SQL Statement which uses the local 'Var' variables and not the parameters

    RETURN;
END
;
I've had similar problems in the past where this was the culprit, and copying to local variables stops SQL Server from coming up with a dud execution plan.
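If rewriting the function is not an option, another sniffing workaround worth a try (my suggestion, not something tested in this thread) is to tell the optimizer to ignore the sniffed values at the call site. Note that the parameter list below has also been made self-consistent, unlike the captured trace:

exec sp_executesql
    N'select * from dbo.history(@First,@DatedStart,@DatedEnd,@Number,@Year,default)
      option (optimize for unknown)',
    N'@First date, @DatedStart date, @DatedEnd date, @Number int, @Year int',
    @First='2015-02-01', @DatedStart='2015-01-01', @DatedEnd='2015-01-31', @Number=2, @Year=2021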

How can I set a lock inside a stored procedure?

I've got a long-running stored procedure in a SQL Server database. I don't want it to run more often than once every ten minutes.
Once the stored procedure has run, I want to store the latest result in a LatestResult table, against a time, and have all calls to the procedure return that result for the next ten minutes.
That much is relatively simple, but we've found that, because the procedure both checks and updates the LatestResult table, large userbases get a number of deadlocks when two users call the procedure at the same time.
In a client-side/threading situation, I would solve this with a lock: the first user locks the function, the second user hits the lock and waits, the first user finishes their procedure call, updates the LatestResult table, and unlocks the second user, who then picks up the result from the LatestResult table.
Is there any way to accomplish this kind of locking in SQL Server?
EDIT:
This is basically how the code looks without its error checking calls:
DECLARE @LastChecked AS DATETIME
DECLARE @LastResult AS NUMERIC(18,2)

SELECT TOP 1 @LastChecked = LastRunTime, @LastResult = LastResult FROM LastResult

DECLARE @ReturnValue AS NUMERIC(18,2)

IF DATEDIFF(n, @LastChecked, GetDate()) >= 10 OR NOT @LastResult = 0
BEGIN
    SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount,0)),0))
    FROM Transactions
    WHERE ISNULL(DeletedFlag,0) = 0
    GROUP BY GroupID
    ORDER BY ABS(ISNULL(SUM(ISNULL(Amount,0)),0))

    UPDATE LastResult SET LastRunTime = GETDATE(), LastResult = @ReturnValue

    SELECT @ReturnValue
END
ELSE
BEGIN
    SELECT @LastResult
END
I'm not really sure what's going on with the grouping, but I've found a test system where execution time is coming in around 4 seconds.
I think there's some work scheduled to archive some of these records and boil them down to running totals, which will probably help, given that there are several million rows behind that four-second query...
This is a valid opportunity to use an Application Lock (see sp_getapplock and sp_releaseapplock) as it is a lock taken out on a concept that you define, not on any particular rows in any given table. The idea is that you create a transaction and then create this arbitrary lock that has an identifier; other processes will wait to enter that piece of code until the lock is released. This works just like lock() at the app layer. The @Resource parameter is the label of the arbitrary "concept". In more complex situations, you can even concatenate a CustomerID or something in there for more granular locking control.
DECLARE @LastChecked DATETIME,
        @LastResult NUMERIC(18,2);
DECLARE @ReturnValue NUMERIC(18,2);

BEGIN TRANSACTION;

EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';

SELECT TOP 1 -- not sure if this helps the optimizer on a 1 row table, but seems ok
       @LastChecked = LastRunTime,
       @LastResult = LastResult
FROM   LastResult;

IF (DATEDIFF(MINUTE, @LastChecked, GETDATE()) >= 10 OR @LastResult <> 0)
BEGIN
    SELECT @ReturnValue = ABS(ISNULL(SUM(ISNULL(Amount, 0)), 0))
    FROM   Transactions
    WHERE  DeletedFlag = 0
      OR   DeletedFlag IS NULL;

    UPDATE LastResult
    SET    LastRunTime = GETDATE(),
           LastResult = @ReturnValue;
END;
ELSE
BEGIN
    SET @ReturnValue = @LastResult; -- This is always 0 here
END;

SELECT @ReturnValue AS [ReturnValue];

EXEC sp_releaseapplock @Resource = 'check_timing';
COMMIT TRANSACTION;
You need to manage errors / ROLLBACK yourself (as stated in the linked MSDN documentation), so put in the usual TRY / CATCH. But this does allow you to manage the situation.
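A minimal wrapper sketch (the elided body is just the code from the example above):

BEGIN TRY
    BEGIN TRANSACTION;
    EXEC sp_getapplock @Resource = 'check_timing', @LockMode = 'Exclusive';

    -- ... the work from the example above ...

    EXEC sp_releaseapplock @Resource = 'check_timing';
    COMMIT TRANSACTION;
END TRY
BEGIN CATCH
    IF (@@TRANCOUNT > 0)
        ROLLBACK TRANSACTION; -- rolling back also releases a transaction-owned app lock
    THROW; -- re-raise the error (SQL Server 2012+; use RAISERROR on older versions)
END CATCH;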
If there are any concerns regarding contention on this process, there shouldn't be much: the lookup done right after locking the resource is a SELECT from a single-row table, followed by an IF statement that (ideally) just returns the last known value if the 10-minute timer hasn't elapsed. Hence, most calls should process rather quickly.
Please note: sp_getapplock / sp_releaseapplock should be used sparingly; Application Locks can definitely be very handy (such as in cases like this one) but they should only be used when absolutely necessary.

Execute stored procedure from a Trigger after a time delay

I want to call a stored procedure from a trigger.
How can I execute that stored procedure after x minutes?
I'm looking for something other than WAITFOR DELAY
thanks
Have a SQL Agent job that runs regularly and pulls stored procedure parameters from a table. The rows should also indicate when their run of the stored procedure should occur, so that the SQL Agent job only picks rows that are due or slightly overdue. It should delete the rows, or mark them, after calling the stored procedure.
Then, in the trigger, just insert a new row into this same table.
You do not want to put anything in a trigger that will affect the execution of the original transaction in any way: you definitely don't want to cause any delays, or interact with anything outside of the same database.
E.g., if the stored procedure is
CREATE PROCEDURE DoMagic
    @Name varchar(20),
    @Thing int
AS
...
Then we'd create a table:
CREATE TABLE MagicDue (
    MagicID int IDENTITY(1,1) not null, -- may not be needed if other columns uniquely identify a row
    Name varchar(20) not null,
    Thing int not null,
    DoMagicAt datetime not null
)
And the SQL Agent job would do:
WHILE EXISTS(SELECT * FROM MagicDue WHERE DoMagicAt < CURRENT_TIMESTAMP)
BEGIN
    DECLARE @Name varchar(20)
    DECLARE @Thing int
    DECLARE @MagicID int

    SELECT TOP 1 @Name = Name, @Thing = Thing, @MagicID = MagicID
    FROM MagicDue
    WHERE DoMagicAt < CURRENT_TIMESTAMP

    EXEC DoMagic @Name, @Thing

    DELETE FROM MagicDue WHERE MagicID = @MagicID
END
And the trigger would just have:
CREATE TRIGGER Xyz ON TabY after insert
AS
    /* Do stuff, maybe calculate some values, or just a direct insert? */
    insert into MagicDue (Name, Thing, DoMagicAt)
    select YName, YThing + 1, DATEADD(minute, 30, CURRENT_TIMESTAMP) from inserted
If you're running an edition that doesn't support Agent, then you may have to fake it. What I've done in the past is to create a stored procedure that contains the "poor man's agent jobs", something like:
CREATE PROCEDURE DoBackgroundTask
AS
    WHILE 1=1
    BEGIN
        /* Add whatever SQL you would have put in an agent job here */
        WAITFOR DELAY '00:05:00'
    END
Then, create a second stored procedure, this time in the master database, which waits 30 seconds and then calls the first procedure:
CREATE PROCEDURE BootstrapBackgroundTask
AS
    WAITFOR DELAY '00:00:30'
    EXEC YourDB..DoBackgroundTask
And then, mark this procedure as a startup procedure, using sp_procoption:
EXEC sp_procoption N'BootstrapBackgroundTask', 'startup', 'on'
And restart the service; you'll now have a continuously running query.
I had kind of a similar situation where, before I processed the records inserted into the table with the trigger, I wanted to make sure all the relevant related data in the relational tables was also there.
My solution was to create a scratch table which was populated by the insert trigger on the first table.
The scratch table had an updated flag (default 0), an inserted GETDATE() date field, and the relevant identifier from the main table.
I then created a scheduled process to loop over the scratch table and perform whatever work I wanted against each record individually, updating the updated flag as each record was processed.
BUT, here is where I was a wee bit clever: in the loop-over process looking for records in the scratch table with updated flag = 0, I also added the clause AND datediff(mi, Updated_Date, getdate()) > 5. So a record would not actually be processed until 5 minutes AFTER it was inserted into the scratch table. A sketch of that loop is below.
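A sketch of the polling loop just described (the table, column, and procedure names are hypothetical, following the description above):

DECLARE @MainID int;

WHILE (1 = 1)
BEGIN
    -- Pick the oldest unprocessed row that has sat for at least 5 minutes
    SELECT TOP 1 @MainID = MainTableID
    FROM   ScratchTable
    WHERE  UpdatedFlag = 0
      AND  datediff(mi, Updated_Date, getdate()) > 5
    ORDER BY Updated_Date;

    IF (@@ROWCOUNT = 0)
        BREAK; -- nothing due yet

    EXEC ProcessOneRecord @MainID; -- whatever per-record work is needed

    UPDATE ScratchTable
    SET    UpdatedFlag = 1
    WHERE  MainTableID = @MainID;
END;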

Catch a cancelled stored procedure in SQL Server

I have a stored procedure in SQL Server that looks something like this:
insert into Record(StartTimestamp) values (GETDATE())
SET @MyID = SCOPE_IDENTITY()

begin try
    -- do something
    UPDATE Record SET EndTimestamp = GETDATE() WHERE ID = @MyID
end try
begin catch
    UPDATE Record SET EndTimestamp = GETDATE(), Error = ERROR_MESSAGE() WHERE ID = @MyID
end catch
It gets called from an application and takes a few seconds to run. If a user cancels it while it's running, I end up with a StartTimestamp in the Record table but no error and no EndTimestamp. I always want to know that the user initiated this stored proc, but I also always want to record when it finished, whether by success, error, or cancellation.
Is there any way to do that in SQL Server?
Thanks
Maybe you could rework the cancellation logic? Instead of calling SqlCommand.Cancel from the business layer, execute a second proc that sets a flag somewhere, then have the first proc check for that flag while it's running. A rough sketch is below.
The whole point of SqlCommand.Cancel is that it really stops execution, halting the SP and rolling back. I think you'll have to approach this from the business layer.
GJ
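A rough sketch of that flag idea (every table, column, and procedure name here is hypothetical):

-- The app calls RequestCancel instead of SqlCommand.Cancel
CREATE TABLE CancelFlags (RecordID int PRIMARY KEY, CancelRequested bit NOT NULL DEFAULT 0)
GO
CREATE PROCEDURE RequestCancel @RecordID int
AS
    IF NOT EXISTS (SELECT 1 FROM CancelFlags WHERE RecordID = @RecordID)
        INSERT INTO CancelFlags (RecordID, CancelRequested) VALUES (@RecordID, 1)
    ELSE
        UPDATE CancelFlags SET CancelRequested = 1 WHERE RecordID = @RecordID
GO

-- Inside the long-running proc, between units of work:
IF EXISTS (SELECT 1 FROM CancelFlags WHERE RecordID = @MyID AND CancelRequested = 1)
BEGIN
    UPDATE Record SET EndTimestamp = GETDATE(), Error = 'Cancelled by user' WHERE ID = @MyID
    RETURN
END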

SQL Server Agent Job Timeout

I have just had a scheduled SQL Server job run for longer than normal, and I could really have used a timeout to stop it after a certain length of time.
I might be being a bit blind on this, but I can't seem to find a way of setting a timeout for a job. Does anyone know the way to do it?
Thanks
We do something like the code below as part of a nightly job processing subsystem. In reality it is more complicated than this; for example, we are processing multiple interdependent sets of jobs, and we read job names and timeout values from configuration tables. But this captures the idea:
DECLARE @JobToRun NVARCHAR(128) = 'My Agent Job'
DECLARE @dtStart DATETIME = GETDATE(), @dtCurr DATETIME
DECLARE @ExecutionStatus INT, @LastRunOutcome INT, @MaxTimeExceeded BIT = 0
DECLARE @TimeoutMinutes INT = 180

EXEC msdb.dbo.sp_start_job @JobToRun
SET @dtCurr = GETDATE()

WHILE 1=1
BEGIN
    WAITFOR DELAY '00:00:10'

    SELECT @ExecutionStatus = current_execution_status, @LastRunOutcome = last_run_outcome
    FROM OPENQUERY(LocalServer, 'set fmtonly off; exec msdb.dbo.sp_help_job')
    WHERE [name] = @JobToRun

    IF @ExecutionStatus <> 4
    BEGIN -- job is running or finishing (not idle)
        SET @dtCurr = GETDATE()
        IF DATEDIFF(mi, @dtStart, @dtCurr) > @TimeoutMinutes
        BEGIN
            EXEC msdb.dbo.sp_stop_job @job_name = @JobToRun
            -- could log info, raise an error, send email etc. here
            BREAK -- stop polling once the job has been told to stop
        END
        ELSE
        BEGIN
            CONTINUE
        END
    END

    IF @LastRunOutcome = 1 -- the job just finished with the success flag
    BEGIN
        -- job succeeded, do whatever is needed here
        print 'job succeeded'
    END

    BREAK -- job is idle again, so stop polling
END
What kind of a job is this? You may want to consider putting the whole job in a T-SQL script within a WHILE loop; the condition to check would obviously be the time difference between the current time and the job start time. A sketch is below.
Raj
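A minimal sketch of that idea (the 180-minute limit and the loop body are placeholders):

DECLARE @JobStart DATETIME = GETDATE()
DECLARE @Done BIT = 0

WHILE (@Done = 0 AND DATEDIFF(mi, @JobStart, GETDATE()) < 180) -- give up after 3 hours
BEGIN
    -- do one batch of the job's work here, and set @Done = 1
    -- when there is nothing left to process
    SET @Done = 1 -- placeholder so this sketch terminates
END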
