SQL server cursor slow performance - sql-server

I'm getting started with my first use of a cursor in a stored procedure in SQL Server 2008. I've done some preliminary reading and I understand that cursors have significant performance limitations. In my current case I think they're necessary (I want to run multiple stored procedures for each stock symbol in a symbols table).
Edit:
The sprocs I'll be calling on each symbol will for the most part be insert operations to calculate symbol-dependent values, such as 5-day moving average, average daily volume, and ATR (average true range). Most of these values will be calculated from data in a daily pricing and volume table. I'd like to streamline the retrieval of data values that would otherwise be retrieved redundantly: for example, I'd like to load the daily pricing and volume data for each symbol into a table variable, and then pass that table variable to the stored procedure that calls each of the aggregate functions I just mentioned. Hope that makes sense.
So my initial "outer loop" cursor-based stored procedure is below. It times out after several minutes, without returning anything to the output window.
ALTER PROCEDURE dbo.sprocSymbolDependentAggsDriver2
AS
DECLARE @symbol nchar(10)
DECLARE symbolCursor CURSOR
STATIC FOR
SELECT Symbol FROM tblSymbolsMain ORDER BY Symbol
OPEN symbolCursor
FETCH NEXT FROM symbolCursor INTO @symbol
WHILE @@FETCH_STATUS = 0
SET @symbol = @symbol + ': Test.'
FETCH NEXT FROM symbolCursor INTO @symbol
CLOSE symbolCursor
DEALLOCATE symbolCursor
When I run it without the @symbol local variable and eliminate the assignment to it in the WHILE loop, it seems to run OK. Is there a clear violation of performance best practices within that assignment? Thanks.

"In my current case I think they're necessary (I want to run multiple
stored procedures for each stock symbol in a symbols table."
Cursors are rarely necessary.
From your example above, I think a simple WHILE loop will easily take the place of your cursor. Adapted from SQL Cursors - How to avoid them (one of my favorite SQL bookmarks)
-- Create a temporary table...
CREATE TABLE #Symbols (
    RowID int IDENTITY(1, 1),
    Symbol nvarchar(max)
)
DECLARE @NumberRecords int, @RowCount int
DECLARE @Symbol nvarchar(max)
-- Get your data that you want to loop over
INSERT INTO #Symbols (Symbol)
SELECT Symbol
FROM tblSymbolsMain
ORDER BY Symbol
-- Get the number of records you just grabbed
SET @NumberRecords = @@ROWCOUNT
SET @RowCount = 1
-- Just do a WHILE loop. No cursor necessary.
WHILE @RowCount <= @NumberRecords
BEGIN
    SELECT @Symbol = Symbol
    FROM #Symbols
    WHERE RowID = @RowCount
    EXEC <myProc1> @Symbol
    EXEC <myProc2> @Symbol
    EXEC <myProc3> @Symbol
    SET @RowCount = @RowCount + 1
END
DROP TABLE #Symbols

You don't really need all that explicit cursor jazz to build a string. Here is probably a more efficient way to do it:
DECLARE @symbol NVARCHAR(MAX) = N'';
SELECT @symbol += N': Test.'
FROM dbo.tblSymbolsMain
ORDER BY Symbol;
Though I suspect you actually wanted to see the names of the symbols, e.g.
DECLARE @symbol NVARCHAR(MAX) = N'';
SELECT @symbol += N':' + Symbol
FROM dbo.tblSymbolsMain
ORDER BY Symbol;
One caveat is that while you will typically see the ORDER BY respected, it is not guaranteed. So if you want to stick to the cursor, at least declare the cursor as follows:
DECLARE symbolCursor CURSOR
LOCAL STATIC READ_ONLY FORWARD_ONLY
FOR
...
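On the ordering caveat above: one commonly used way to concatenate in a defined order on SQL Server 2005 and later is a FOR XML PATH subquery, where the ORDER BY applies to the subquery's rows. This is only a sketch against the question's tblSymbolsMain table; the variable name @symbols and the RTRIM (to drop NCHAR padding) are assumptions for illustration.

```sql
-- Sketch: ordered concatenation via FOR XML PATH (assumes Symbol is a character column).
DECLARE @symbols NVARCHAR(MAX);

SELECT @symbols = STUFF((
        SELECT N':' + RTRIM(Symbol)          -- prefix every symbol with ':'
        FROM dbo.tblSymbolsMain
        ORDER BY Symbol                      -- ordering is applied inside the subquery
        FOR XML PATH(''), TYPE               -- TYPE + .value() avoids XML entity escaping
    ).value('.', 'NVARCHAR(MAX)'), 1, 1, N'');  -- STUFF strips the leading ':'

SELECT @symbols AS SymbolList;
```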
Also it seems to me like NCHAR(10) is not sufficient to hold the data you're trying to stuff into it, unless you only have one row (which is why I chose NVARCHAR(MAX) above).
And I agree with Abe... it is quite possible you don't need to fire a stored procedure for every row in the cursor, but to suggest ways around that (which will almost certainly be more efficient), we'd have to understand what those stored procedures actually do.
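As an illustration of what a set-based replacement for one of those per-symbol procedures might look like, here is a sketch of a 5-day moving average computed for all symbols in one statement. The table dbo.tblDailyPricing and its columns (Symbol, TradeDate, ClosePrice) are hypothetical, since the question does not show the pricing schema, and the query assumes one row per symbol per trading day. It avoids window frames so that it runs on SQL Server 2008.

```sql
-- Sketch only: dbo.tblDailyPricing(Symbol, TradeDate, ClosePrice) is an assumed schema.
WITH Ranked AS (
    SELECT Symbol, TradeDate, ClosePrice,
           ROW_NUMBER() OVER (PARTITION BY Symbol ORDER BY TradeDate) AS rn
    FROM dbo.tblDailyPricing
)
SELECT cur.Symbol,
       cur.TradeDate,
       AVG(prev.ClosePrice) AS MovingAvg5     -- average over the current row and the 4 before it
FROM Ranked AS cur
JOIN Ranked AS prev
  ON prev.Symbol = cur.Symbol
 AND prev.rn BETWEEN cur.rn - 4 AND cur.rn
GROUP BY cur.Symbol, cur.TradeDate
HAVING COUNT(*) = 5;                          -- skip days that lack a full 5-row window
```

If ClosePrice were an integer column, AVG would return an integer, so a decimal type is assumed here.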

You need a BEGIN ... END here:
WHILE @@FETCH_STATUS = 0 BEGIN
    SET @symbol = @symbol + ': Test.'
    FETCH NEXT FROM symbolCursor INTO @symbol
END
Also try DECLARE symbolCursor CURSOR LOCAL READ_ONLY FORWARD_ONLY instead of STATIC to improve performance.
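To spell out why the original procedure never finishes: without BEGIN ... END, only the SET statement belongs to the WHILE loop, so the second FETCH is never reached, @@FETCH_STATUS stays 0, and the loop spins forever. A sketch of the whole loop with the block in place follows; the PRINT (instead of assigning back into the NCHAR(10) variable, which would truncate the appended text) is an assumption about what the test was meant to show.

```sql
-- Sketch: the question's loop with the body wrapped in BEGIN ... END.
DECLARE @symbol NCHAR(10);

DECLARE symbolCursor CURSOR LOCAL FORWARD_ONLY READ_ONLY FOR
    SELECT Symbol FROM tblSymbolsMain ORDER BY Symbol;

OPEN symbolCursor;
FETCH NEXT FROM symbolCursor INTO @symbol;

WHILE @@FETCH_STATUS = 0
BEGIN
    PRINT RTRIM(@symbol) + N': Test.';          -- per-row work goes here
    FETCH NEXT FROM symbolCursor INTO @symbol;  -- must be inside the loop to advance
END

CLOSE symbolCursor;
DEALLOCATE symbolCursor;
```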

After reading all the suggestions, I ended up doing some old trick and it worked miracles!
I had this cursor which was taking almost 3 mins to run, while the enclosing query was instant. I have other databases with more complex cursors that were only taking 1 second or less, so I ruled out the global issue on using cursors. My solution:
Detach the database in question, but ensure you tick Update Statistics.
Attach the database and check performance
This seems to help optimize all the performance parameters without the detailed effort. I am using SQL Express 2008 R2.
Would like to know your experience.

Related

Can't query and put data inside a cursor when using variable inside the query

I have to put the result of a query (a single column and value) into a variable. I'm trying to use a cursor; however, I choose the database to query based on a variable. Here is my query:
SELECT productName, price FROM @ShopName.dbo.Products WHERE ProductName = @ProductName
The @ShopName variable is pulled from the database first and assigned using a cursor. The @ProductName variable is populated by an input parameter coming from an API. I have to get ProductName from a specific database (there are multiple databases with products), but the query above throws syntax errors. Additionally, when I tried an ad hoc query assigned to a variable:
SET @Sql = N'SELECT productName, price FROM ' + QUOTENAME(@ShopName) + '.dbo.Products WHERE ProductName = ' + @ProductName
it isn't accepted in
DECLARE cursorT CURSOR
FOR
@Sql
This throws: Incorrect syntax near '@Sql'. Expecting '(', SELECT, or WITH.
Is there any way to use that query in a cursor while using the variable with the database name in it?
Cursors should be right at the bottom of your bag of techniques, used sparingly and with great care, only when necessary. I can't tell if it's necessary in your case, there's not enough code to know. But I wanted to get that out before continuing.
As a point of purely academic interest, yes, there are some ways you can do this. Two main ways:
1. Declare a cursor in the dynamic SQL, as Dale suggested. You can still use the cursor in static code which follows the declaration if the cursor is global.
2. Use dynamic SQL to drop the results into something with scope outside of the dynamic SQL, like a temp table. Then cursor over the temp table.
Option 1 is just bad. It is likely to result in code which is extremely difficult to understand in future. I include it for curiosity only. Option 2 is reasonable.
Examples:
-- some dummy schema and data to work with
create table t(i int);
insert t values(1), (2);
-- option 1: declare a cursor dynamically, use it statically (don't do this)
declare @i int;
exec sp_executesql N'declare c cursor global for select i from t';
open c;
fetch next from c into @i;
while (@@fetch_status = 0)
begin
    print @i;
    fetch next from c into @i;
end
close c;
deallocate c;
-- option 2: dynamically dump data to a table, eg a temp table
-- (run as a separate batch from option 1, which already declares @i)
create table #u(i int);
exec sp_executesql N'insert #u (i) select i from t';
declare c cursor local for select i from #u;
declare @i int;
open c;
fetch next from c into @i;
while (@@fetch_status = 0)
begin
    print @i;
    fetch next from c into @i;
end
close c;
deallocate c;
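Since the question only needs a single value in a variable, a cursor may not be required at all: sp_executesql can hand a value back through an OUTPUT parameter, and it also lets the product name be passed as a real parameter instead of being concatenated into the string. The sketch below assumes the data types, the sample values, and the TOP (1); none of these are given in the question.

```sql
-- Sketch: dynamic database name, parameterised filter, result returned via OUTPUT.
DECLARE @ShopName    sysname        = N'Shop1';    -- hypothetical database name
DECLARE @ProductName nvarchar(100)  = N'Widget';   -- hypothetical input value
DECLARE @Price       decimal(18, 2);               -- assumed type of the price column

DECLARE @Sql nvarchar(max) =
      N'SELECT TOP (1) @p = price FROM ' + QUOTENAME(@ShopName)
    + N'.dbo.Products WHERE ProductName = @name;';

EXEC sys.sp_executesql
     @Sql,
     N'@name nvarchar(100), @p decimal(18, 2) OUTPUT',  -- parameter definitions
     @name = @ProductName,
     @p    = @Price OUTPUT;

SELECT @Price AS Price;   -- NULL if no matching product was found
```

Only the database name has to be spliced into the text (guarded by QUOTENAME); everything else travels as a parameter.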

Table Variable inside cursor, strange behaviour - SQL Server

I observed a strange thing inside a stored procedure with select on table variables. It always returns the value (on subsequent iterations) that was fetched in the first iteration of cursor. Here is some sample code that proves this.
DECLARE @id AS INT;
DECLARE @outid AS INT;
DECLARE sub_cursor CURSOR FAST_FORWARD
FOR SELECT [TestColumn]
FROM testtable1;
OPEN sub_cursor;
FETCH NEXT FROM sub_cursor INTO @id;
WHILE @@FETCH_STATUS = 0
BEGIN
    DECLARE @Log TABLE (LogId BIGINT NOT NULL);
    PRINT 'id: ' + CONVERT (VARCHAR (10), @id);
    INSERT INTO Testtable2 (TestColumn)
    OUTPUT inserted.[TestColumn] INTO @Log
    VALUES (@id);
    IF @@ERROR = 0
    BEGIN
        SELECT TOP 1 @outid = LogId
        FROM @Log;
        PRINT 'Outid: ' + CONVERT (VARCHAR (10), @outid);
        INSERT INTO [dbo].[TestTable3] ([TestColumn])
        VALUES (@outid);
    END
    FETCH NEXT FROM sub_cursor INTO @id;
END
CLOSE sub_cursor;
DEALLOCATE sub_cursor;
However, while I was posting the code on SO and trying various combinations, I observed that removing TOP from the line below gives me the right values out of the table variable inside the cursor.
SELECT TOP 1 @outid = LogId FROM @Log;
which would make it like this
SELECT @outid = LogId FROM @Log;
I am not sure what is happening here. I thought TOP 1 on the table variable should work, thinking that a new table is created on every iteration of the loop. Can someone shed light on table variable scoping and lifetime?
Update: I have a solution to circumvent the strange behavior here.
As a solution, I now declare the table once before the loop and delete all its rows at the beginning of each iteration.
There are numerous things a bit off with this code.
First off, you roll back your embedded transaction on error, but I never see you commit it on success. As written, this will leak a transaction, which could cause major issues for you in the following code.
What might be confusing you about the @Log table situation is that SQL Server doesn't use the same variable scoping and lifetime rules as C++ or other standard programming languages. Even when declaring your table variable in the cursor block you will only get a single @Log table which then lives for the remainder of the batch, and which gets multiple rows inserted into it.
As a result, your use of TOP 1 is not really meaningful, since there's no ORDER BY clause to impose any sort of deterministic ordering on the table. Without that, you get whatever order SQL Server sees fit to give you, which in this case appears to be the insertion order, giving you the first inserted element of that log table every time you run the SELECT.
If you truly want only the last ID value, you will need to provide some real ordering criterion for your @Log table -- some form of autonumber or date field alongside the data column that can be used to provide the proper ordering for what you want to do.
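The scoping behaviour described above can be seen in a few lines. In this sketch the loop counter @i and the Seq identity column are assumptions added for illustration; the DECLARE inside the loop does not create a fresh table on each pass, so rows accumulate, and the identity column supplies the ordering that TOP needs.

```sql
-- Sketch: a table variable declared inside a loop is created once per batch.
DECLARE @i INT = 1;

WHILE @i <= 3
BEGIN
    DECLARE @Log TABLE (Seq INT IDENTITY(1, 1), LogId BIGINT NOT NULL);
    INSERT INTO @Log (LogId) VALUES (@i * 10);
    SET @i += 1;
END

SELECT COUNT(*) AS RowsInLog FROM @Log;               -- returns 3, not 1

SELECT TOP (1) LogId AS LastLogId                      -- returns 30, the most recent insert
FROM @Log
ORDER BY Seq DESC;
```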

Could this cursor be optimized or rewritten for optimum performance?

There is a need to update all of our databases on our server and perform the same logic on each one. The databases in question all follow a common naming scheme like CorpDB1, CorpDB2, etc. Instead of creating a SQL Agent Job for each of the databases in question (over 50), I have thought about using a cursor to iterate over the list of databases and then perform some dynamic sql on each one. In light of the common notion that cursors should be a last resort; could this be rewritten for better performance or written another way perhaps with the use of the undocumented sp_MSforeachdb stored procedure?
DECLARE @db VARCHAR(100) --current database name
DECLARE @sql VARCHAR(1000) --t-sql used for processing on each database
DECLARE db_cursor CURSOR FAST_FORWARD FOR
SELECT name
FROM MASTER.dbo.sysdatabases
WHERE name LIKE 'CorpDB%'
OPEN db_cursor
FETCH NEXT FROM db_cursor INTO @db
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = 'USE ' + @db +
    ' DELETE FROM db_table --more t-sql processing'
    EXEC(@sql)
    FETCH NEXT FROM db_cursor INTO @db
END
CLOSE db_cursor
DEALLOCATE db_cursor
Cursors are bad when they are used to tackle a set-based problem with procedural code. I don't think a cursor is necessarily a bad idea in your scenario.
When operations need to be run against multiple databases (backups, integrity checks, index maintenance, etc.), there's no issue with using a cursor. Sure, you could build a temp table that contains database names and loop through that...but it's still a procedural approach.
For your specific case, if you're not deleting rows in these tables based on some WHERE clause criteria, consider using TRUNCATE TABLE instead of DELETE FROM. Differences between the two operations explained here. Note that the user running TRUNCATE TABLE will need ALTER permission on the affected objects.
This will collect the set of delete statements and run them all in a single sequence. This is not necessarily going to be better performance-wise but just another way to skin the cat.
DECLARE @sql NVARCHAR(MAX); -- if SQL Server 2000, use NVARCHAR(4000)
SET @sql = N'';
-- the placeholder is a block comment so it cannot comment out the statements concatenated after it
SELECT @sql = @sql + N';DELETE ' + QUOTENAME(name) + N'..db_table /* more t-sql */'
FROM master.sys.databases
WHERE name LIKE N'CorpDB%';
SET @sql = STUFF(@sql, 1, 1, '');
EXEC sp_executesql @sql;
You may consider building the string in a similar way inside your cursor instead of running EXEC() inside for each command. If you're going to continue using a cursor, use the following declaration:
DECLARE db_cursor CURSOR
LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR
This will have the least locking and no unnecessary tempdb usage.
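Putting those pieces together, a sketch of the question's loop with the recommended cursor options might look like the following. It uses sys.databases rather than the legacy sysdatabases view, QUOTENAME to guard the database name, and a three-part name instead of USE; the dbo schema and the ONLINE filter are assumptions not stated in the question.

```sql
-- Sketch: per-database loop using the recommended cursor declaration.
DECLARE @db  sysname;
DECLARE @sql nvarchar(max);

DECLARE db_cursor CURSOR
    LOCAL STATIC FORWARD_ONLY READ_ONLY
FOR
    SELECT name
    FROM sys.databases
    WHERE name LIKE N'CorpDB%'
      AND state_desc = N'ONLINE';          -- assumed: skip offline/restoring databases

OPEN db_cursor;
FETCH NEXT FROM db_cursor INTO @db;

WHILE @@FETCH_STATUS = 0
BEGIN
    -- assumed schema dbo; add further per-database T-SQL to the string as needed
    SET @sql = N'DELETE FROM ' + QUOTENAME(@db) + N'.dbo.db_table;';
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM db_cursor INTO @db;
END

CLOSE db_cursor;
DEALLOCATE db_cursor;
```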

Determine a cursor by condition

In SQL Server for CURSOR we say:
CREATE PROCEDURE SP_MY_PROC
(@BANKID VARCHAR(6)='')
-------------------------------
-------------------------------
DECLARE MY_CURSOR CURSOR FOR
SELECT .......
Now, what I wonder is: can we choose the SELECT statement according to a certain condition?
IF @BANKID<>'' // SELECT * FROM EMPLOYEES WHERE BANKID=@BANKID to be the cursor's query
ELSE // otherwise SELECT * FROM EMPLOYEES to be the cursor's query
Or does it have to be static?
Yes, you can do this with Dynamic SQL
IF @BANKID <> ''
SET @sql = '
DECLARE MyCursor CURSOR FOR
SELECT ...'
ELSE
SET @sql = '
DECLARE MyCursor CURSOR FOR
SELECT ...'
EXEC sp_executesql @sql
OPEN MyCursor
If it is such a simple example, it's better to re-write it as a single query:
DECLARE MY_CURSOR CURSOR FOR
SELECT * FROM EMPLOYEES WHERE BANKID=@BANKID OR @BANKID=''
And, of course, we haven't addressed whether a cursor is the right solution for the larger problem or not (cursors are frequently misused by people not used to thinking of set based solutions, which is what SQL is good at).
PS - avoid prefixing your stored procedures with sp_ - These names are "reserved" for SQL Server, and should be avoided to prevent future incompatibilities (and ignoring, for now, that it's also slower to access stored procs with such names, since SQL Server searches the master database before searching in the current database).

What is an alternative to cursors for sql looping?

Using SQL 2005 / 2008
I have to use a forward cursor, but I don't want to suffer poor performance. Is there a faster way I can loop without using cursors?
Here is the example using cursor:
DECLARE @VisitorID int
DECLARE @FirstName varchar(30), @LastName varchar(30)
-- declare cursor called ActiveVisitorCursor
DECLARE ActiveVisitorCursor CURSOR FOR
SELECT VisitorID, FirstName, LastName
FROM Visitors
WHERE Active = 1
-- Open the cursor
OPEN ActiveVisitorCursor
-- Fetch the first row of the cursor and assign its values into variables
FETCH NEXT FROM ActiveVisitorCursor INTO @VisitorID, @FirstName, @LastName
-- perform action whilst a row was found
WHILE @@FETCH_STATUS = 0
BEGIN
    EXEC MyCallingStoredProc @VisitorID, @FirstName, @LastName
    -- get next row of cursor
    FETCH NEXT FROM ActiveVisitorCursor INTO @VisitorID, @FirstName, @LastName
END
-- Close the cursor to release locks
CLOSE ActiveVisitorCursor
-- Free memory used by cursor
DEALLOCATE ActiveVisitorCursor
Now here is the example how can we get same result without using cursor:
/* Here is alternative approach */
-- Create a temporary table, note the IDENTITY
-- column that will be used to loop through
-- the rows of this table
CREATE TABLE #ActiveVisitors (
RowID int IDENTITY(1, 1),
VisitorID int,
FirstName varchar(30),
LastName varchar(30)
)
DECLARE @NumberRecords int, @RowCounter int
DECLARE @VisitorID int, @FirstName varchar(30), @LastName varchar(30)
-- Insert the resultset we want to loop through
-- into the temporary table
INSERT INTO #ActiveVisitors (VisitorID, FirstName, LastName)
SELECT VisitorID, FirstName, LastName
FROM Visitors
WHERE Active = 1
-- Get the number of records in the temporary table
SET @NumberRecords = @@ROWCOUNT
--You can use: SET @NumberRecords = (SELECT COUNT(*) FROM #ActiveVisitors)
SET @RowCounter = 1
-- loop through all records in the temporary table
-- using the WHILE loop construct
WHILE @RowCounter <= @NumberRecords
BEGIN
    SELECT @VisitorID = VisitorID, @FirstName = FirstName, @LastName = LastName
    FROM #ActiveVisitors
    WHERE RowID = @RowCounter
    EXEC MyCallingStoredProc @VisitorID, @FirstName, @LastName
    SET @RowCounter = @RowCounter + 1
END
-- drop the temporary table
DROP TABLE #ActiveVisitors
"NEVER use Cursors" is a wonderful example of how damaging simple rules can be. Yes, they are easy to communicate, but when we remove the reason for the rule so that we can have an "easy to follow" rule, then most people will just blindly follow the rule without thinking about it, even if following the rule has a negative impact.
Cursors, at least in SQL Server / T-SQL, are greatly misunderstood. It is not accurate to say "Cursors affect performance of SQL". They certainly have a tendency to, but a lot of that has to do with how people use them. When used properly, Cursors are faster, more efficient, and less error-prone than WHILE loops (yes, this is true and has been proven over and over again, regardless of who argues "cursors are evil").
First option is to try to find a set-based approach to the problem.
If logically there is no set-based approach (e.g. needing to call EXEC per each row), and the query for the Cursor is hitting real (non-Temp) Tables, then use the STATIC keyword which will put the results of the SELECT statement into an internal Temporary Table, and hence will not lock the base-tables of the query as you iterate through the results. By default, Cursors are "sensitive" to changes in the underlying Tables of the query and will verify that those records still exist as you call FETCH NEXT (hence a large part of why Cursors are often viewed as being slow). Using STATIC will not help if you need to be sensitive of records that might disappear while processing the result set, but that is a moot point if you are considering converting to a WHILE loop against a Temp Table (since that will also not know of changes to underlying data).
If the query for the cursor is only selecting from temporary tables and/or table variables, then you don't need to prevent locking as you don't have concurrency issues in those cases, in which case you should use FAST_FORWARD instead of STATIC.
I think it also helps to specify the three options of LOCAL READ_ONLY FORWARD_ONLY, unless you specifically need a cursor that is not one or more of those. But I have not tested them to see if they improve performance.
Assuming that the operation is not eligible for being made set-based, then the following options are a good starting point for most operations:
DECLARE [Thing1] CURSOR LOCAL READ_ONLY FORWARD_ONLY STATIC
FOR SELECT columns
FROM Schema.ReadTable(s);
DECLARE [Thing2] CURSOR LOCAL READ_ONLY FORWARD_ONLY FAST_FORWARD
FOR SELECT columns
FROM #TempTable(s) and/or @TableVariables;
You can do a WHILE loop, however you should seek to achieve a more set based operation as anything in SQL that is iterative is subject to performance issues.
http://msdn.microsoft.com/en-us/library/ms178642.aspx
Common Table Expressions would be a good alternative as @Neil suggested. Here's an example from AdventureWorks:
WITH cte_PO AS
(
SELECT [LineTotal]
,[ModifiedDate]
FROM [AdventureWorks].[Purchasing].[PurchaseOrderDetail]
),
minmax AS
(
SELECT MIN([LineTotal]) as DayMin
,MAX([LineTotal]) as DayMax
,[ModifiedDate]
FROM cte_PO
GROUP BY [ModifiedDate]
)
SELECT * FROM minmax ORDER BY ModifiedDate
Here's the top few lines of what it returns:
DayMin DayMax ModifiedDate
135.36 8847.30 2001-05-24 00:00:00.000
129.8115 25334.925 2001-06-07 00:00:00.000
Recursive Queries using Common Table Expressions.
I have to use a forward cursor, but I don't want to suffer poor performance. Is there a faster way I can loop without using cursors?
This depends on what you do with the cursor.
Almost everything can be rewritten using set-based operations in which case the loops are performed inside the query plan and since they involve no context switch are much faster.
However, there are some things SQL Server is just not good at, like computing cumulative values or joining on date ranges.
These kinds of queries can be made faster using a CURSOR:
Flattening timespans: SQL Server
But again, this is a quite a rare exception, and normally a set-based way performs better.
If you posted your query, we could probably optimize it and get rid of a CURSOR.
Depending on what you want it for, you may be able to use a tally table.
Jeff Moden has an excellent article on tally tables Here
Don't use a cursor, instead look for a set-based solution. If you can't find a set-based solution... still don't use a cursor! Post details of what you are trying to achieve, someone will be able to find a set-based solution for you.
There may be some scenarios where one can use tally tables. They can be a good alternative to loops and cursors, but remember that they cannot be applied in every case. A well-explained case can be found here
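For readers who have not met the idea, a tally table is simply a table (or derived set) of sequential integers that replaces a counter loop with a join. The sketch below builds a small tally on the fly and uses it to generate one row per day for a month; the start date, the 31-row size, and the use of sys.all_objects as a row source are arbitrary choices for illustration.

```sql
-- Sketch: an inline tally (1..31) used instead of a WHILE loop to generate dates.
WITH Tally AS (
    SELECT TOP (31)
           ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS n   -- 1, 2, 3, ...
    FROM sys.all_objects                                      -- any row source with at least 31 rows
)
SELECT DATEADD(DAY, n - 1, CAST('20240101' AS datetime)) AS TheDate
FROM Tally
ORDER BY n;
```

The same numbers set can be joined to real data, for example to split strings or fill gaps in a date series, wherever a loop counter would otherwise be used.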

Resources