SQL script taking long time to extract data - sql-server

This is the script that is taking a very long time:
USE [r_prod]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
-- =============================================
-- Author: Drew Borden
-- Create date: 4/16/2009
-- Description: Procedure to populate subdivision extract table
-- =============================================
IF EXISTS(SELECT * FROM sys.procedures WHERE name='sp_extract_subdivision')
BEGIN
DROP PROCEDURE sp_extract_subdivision
END
GO
CREATE PROCEDURE sp_extract_subdivision
@subdivision_cd char(2)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
declare @strap varchar(25)
-- Clear existing record
delete from dbo.subdivision_extract
-- Select list of straps to loop through
declare strapList cursor for
select strap from dbo.parcel where county_cd = @subdivision_cd
--Loop through straps and populate extract table values
BEGIN TRY
OPEN strapList
FETCH NEXT FROM strapList INTO @strap
WHILE @@FETCH_STATUS = 0
BEGIN
IF @strap IS NOT NULL
BEGIN
insert into dbo.subdivision_extract (acct_num) values (RTRIM(@strap))
exec sp_extract_parcel @strap
exec sp_extract_detail @strap
exec sp_extract_lnd_c @strap
exec sp_extract_parcel_flg @strap
exec sp_extract_owner @strap
exec sp_extract_mail @strap
exec sp_extract_legal_ln @strap
exec sp_extract_site @strap
exec sp_extract_condo_unit @strap
exec sp_extract_personal_x @strap
exec sp_extract_personal_x_dist @strap
exec sp_extract_phase_in @strap
exec sp_extract_p_tax_dist @strap
exec sp_extract_parcel_rel @strap
exec sp_extract_entzone @strap
exec sp_extract_dates @strap
exec sp_extract_sales @strap
exec sp_extract_sale_dtl @strap
exec sp_extract_pchar @strap
exec sp_extract_protest @strap
END
FETCH NEXT FROM strapList INTO @strap
END
CLOSE strapList
DEALLOCATE strapList
END TRY
BEGIN CATCH
SELECT ERROR_NUMBER() as ErrorNumber,
ERROR_MESSAGE() as ErrorMessage,
ERROR_PROCEDURE() as ExecutingProcedure,
ERROR_LINE() as LineNumber
CLOSE strapList
DEALLOCATE strapList
END CATCH
END
GO
Any way to speed this up?

The best way to speed this up involves writing versions of the stored procedures that you're calling with every row so that they run against the whole set, and ditching your cursor altogether. Otherwise, you might get a small benefit from specifying the cursor as FORWARD_ONLY, but I don't see much else that can be done.
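For illustration, the per-row INSERT at the top of the loop collapses into a single set-based statement like this (a sketch, assuming the table and column names from the question; each of the 20 extract procs would need a similar set-based version):
-- Set-based replacement for the cursor's per-row INSERT (sketch)
INSERT INTO dbo.subdivision_extract (acct_num)
SELECT RTRIM(strap)
FROM dbo.parcel
WHERE county_cd = @subdivision_cd
AND strap IS NOT NULL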

The real problem here is the fact that you're calling 20 stored procedures sequentially via a cursor.
I hate cursors for a start, and have come up with a solution for this on previous projects.
Instead of getting a variable from the cursor, are you able to run the 20 stored procedures sequentially for all of the data?
I suggest having a temporary table with the primary key of the data and a status integer which shows which have been processed and to which point. Each stored procedure can then be called in order to process all of the rows.
If you really want to do a nice job of it, have each stored proc process, say, 5% of the rows at a time, and then allow a small pause using WAITFOR before looping until all of the records have been processed by each stage. If the processing time for each batch is reasonable, locks can still be granted to other processes, so more important processes won't time out because they can't acquire a lock.
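A minimal sketch of that batching pattern, assuming a hypothetical work-queue table dbo.extract_queue with a status column (0 = pending, 1 = in progress):
-- Batched processing sketch: claim 5% of pending rows, process them,
-- then pause briefly so other sessions can acquire locks
WHILE EXISTS (SELECT 1 FROM dbo.extract_queue WHERE status = 0)
BEGIN
UPDATE TOP (5) PERCENT dbo.extract_queue
SET status = 1
WHERE status = 0
-- each stage's set-based proc would process the rows with status = 1 here
WAITFOR DELAY '00:00:02'
END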
How long does the delete from dbo.subdivision_extract take? If it takes a while and the log is not required (and you have no triggers on the table), try changing it to TRUNCATE TABLE dbo.subdivision_extract
TLDR: Redevelop the stored procs to process all of the data, then you'll only need to call 20 stored procs once each.

You are calling several stored procedures for each iteration of the loop in this stored procedure. I don't know what the others do, but it seems they each query or modify only a small amount of data. You should consider combining the stored procedures into a single one and performing the queries in blocks of several records instead of looping over every strap.

If you are extracting data to a text file, you owe it to yourself to do it in a set-based manner, or at least use SSIS. A cursor running multiple stored procs for each row is the absolute worst method you can use for this sort of thing. I'd bet you can do this in an SSIS package and take minutes instead of 9 hours.

Yes, actually it is extremely easy to fix: measure what is slow, then optimize the slow part.
All you posted is a T-SQL script that is as opaque as it can get in regard to performance: a DELETE, a SELECT, a cursor iteration with an INSERT, and a bunch of EXECs. The problem can be anywhere in these, so the best solution is to measure and see where it actually lies.
Take the script and add a PRINT GETDATE(); at the start, after the DELETE, after the first FETCH, and after each EXEC, then execute one single iteration (remove the FETCH inside the loop). From the PRINT output you can deduce the time each step takes. Does any of them stand out?
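A sketch of that timing harness for one step (it assumes @strap is already declared and fetched; repeat the pattern around each EXEC):
-- Timing sketch: measure one step and print the elapsed milliseconds
DECLARE @t0 datetime
SET @t0 = GETDATE()
EXEC sp_extract_parcel @strap
PRINT 'sp_extract_parcel: ' + CAST(DATEDIFF(ms, @t0, GETDATE()) AS varchar(20)) + ' ms'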
Attach Profiler and monitor for the SP:StmtCompleted event with a filter on Duration. Run one single iteration of the extraction loop again. Which statements stand out with the highest Duration?
Run the script for a single iteration in SSMS, but with the Include Actual Execution Plan button checked in the toolbar. In the resulting execution plan, which statements stand out as high cost relative to the batch?
You must narrow the problem down; the script as posted is impossible to diagnose, since there isn't any actual work done in it, it just calls other procedures to do the work. Once you have identified the actual slow statements inside the procedures this script invokes, post again with the exact statement that has the problem and the exact schema of your data (including all index definitions), and ask for solutions to those specific problems.
While in theory set-oriented processing can perform better than a cursor, in practice it would probably be impossible to write a single statement that extracts the same information as 20 stored procedure invocations, unless those procedures are extremely trivial single SELECTs.

Related

Preventing blocking when using cursor over stored proc in transaction

I'm trying to work out how I can prevent blocking while running my code. There are a few things I could swap out, and I'm not sure which would be best. I'm including some fake code below to illustrate my current layout; please look past any syntax errors.
SELECT ID
INTO #ID
FROM Table
DECLARE @TargetID int = 0
BEGIN TRAN
DECLARE ID_Cursor CURSOR FOR
SELECT ID
FROM TABLE
OPEN ID_Cursor
FETCH NEXT FROM ID_Cursor
INTO @TargetID
WHILE @@FETCH_STATUS = 0
BEGIN
EXEC usp_ChangeID
@ID = @TargetID,
@NewValue = 100
FETCH NEXT FROM ID_Cursor INTO @TargetID
END
CLOSE ID_Cursor
DEALLOCATE ID_Cursor
IF ((SELECT COUNT(*) FROM Table WHERE Value = 100) = 10000)
COMMIT TRAN
ELSE
ROLLBACK TRAN
The problem I'm encountering is that usp_ChangeID updates about 15 tables each run, and other spids that want to work with any of those tables have to wait until the entire process has finished. The stored proc itself runs in about a second, but I need to run it repeatedly. I'm thinking these locks are held because of the transaction rather than the cursor itself, though I'm not 100% sure. Ideally, my code would finish one run of the stored proc, let other users through to the tables, then run again once that other operation is complete. The rows I'm working with each time shouldn't be frequently used, so row locks would be perfect, but the blocking I'm seeing implies that isn't happening.
This is running against production data, so I want to leave as little impact as possible while it runs. Performance hits are fine if they mean less blocking. I don't want to break this into chunks, because I generally want the work saved only if every record is updated as expected, and rolled back in any other case. I can't modify the proc, go around the proc, or do this without a cursor involved either. I'm leaning towards breaking my initial large select into smaller chunks, but I'd rather not have to change parts manually.
Thanks in advance!

Getting "Maximum nesting level exceeded" error when wrap a recursive stored procedure in another procedure

I have a recursive procedure, FindLoopMakingNode, which is used to find loops, and I'm not expecting the nesting level to be more than 30.
Here is the recursive procedure:
alter procedure FindLoopMakingNode
@p1 int,
@p2 nvarchar(max)
as
begin
-- recursively calls itself
end
Also, I have another procedure, CheckFormulaForLoop, which is responsible for finding every kind of loop; it uses the recursive procedure mentioned above as well as other statements.
Here is the main wrapper
alter procedure CheckFormulaForLoop
@p1 int,
@p2 nvarchar(max),
@p3 bit
as
begin
--search for other kinds of loop
--if no other loop exists calls recursive procedure
EXEC dbo.FindLoopMakingNode @p1, @p2
--writes the result in a temp table which has been created by wrapper procedure
end
Because I use the second proc in different scenarios, I have different wrapper procedures which use it.
Here is the problem: when I execute CheckFormulaForLoop for a given set of parameters there is no problem, but when I execute one of those wrapper procedures with the exact same set of parameters, I get the error below:
Maximum stored procedure, function, trigger, or view nesting level exceeded (limit 32)
Here is the wrapper (the one which throws the exception; and yes, it's really that simple):
alter procedure CheckFormulaForLoopWrapper
@p1 int,
@p2 nvarchar(max),
@p3 bit
as
begin
Create table #tempLoopHolder(id int,code int)
EXEC CheckFormulaForLoop @p1, @p2, @p3
SELECT id,code from #tempLoopHolder
end
Now when I run
Execute CheckFormulaForLoopWrapper 1212,'2',1
It throws the exception but when I run
Create table #tempLoopHolder(id int,code int)
EXEC CheckFormulaForLoop 1212,'2',1
SELECT id,code from #tempLoopHolder
it runs successfully
I'm wondering, if there is a problem with recursion levels, why doesn't SQL Server throw an exception when running the main procedure? And can nesting be responsible for this error?
You may manage this by adding one more parameter to keep count of the recursive calls.
You can pass a parameter and check its value (in your case, < 30) before executing your procedure. If the check passes, increment the value by 1 before making the recursive call. This way you can keep track of how many times the recursive call has been made.
Sample code for the same would look like this:
Create proc Calltab ( @id int, @cou int = 0 )
as
Begin
if(@cou < 30)
Begin
--- perform your operation in this section
set @cou = @cou + 1 --- increment the value by 1 to keep the count
exec CallTab @id, @cou
end
end
GO
You may test the same with this:
exec CallTab 1, 0
I have to thank all of you for your valuable help.
Using all your tips, I found that the combination of nested procedures plus a recursive procedure is my problem, since, as @Zohar Peled said:
The wrapper procedure adds another nesting level. In SQL Server, a procedure is considered nested when it's called by another procedure. As you can surely understand from the error message, using a stored procedure recursively is probably not the best way to handle whatever situation you need handling. If the sole purpose of this procedure is to fill a temporary table, there are probably better ways to do that than recursion. You might want to ask a different question on how to implement whatever it is that procedure does in a set-based approach (which is SQL's strong suit).
I used @@NESTLEVEL, which @Dale Burrell had mentioned, and it turned out that my recursive procedure had to run 31 times (for the given parameters); this amount + 2 (for the 2 wrappers) led to the exception.
So finally I see no way to fix it but to find a set-based approach instead of recursion.
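For anyone else debugging this, a one-line probe dropped at the top of each procedure shows the depth as it climbs (a sketch; @@PROCID resolves to the currently executing procedure):
-- Print the current nesting depth on entry to a procedure
PRINT OBJECT_NAME(@@PROCID) + ' at nesting level ' + CAST(@@NESTLEVEL AS varchar(10))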

Return row count for each stored procedure inside a stored procedure

Although this question is somewhat similar to mine, the solutions there just aren't enough to satisfy my problem.
I have a stored procedure, sp_ImportAll, inside which I am trying to execute 5 more procs that contain simple insert/update statements.
CREATE PROCEDURE sp_ImportAll
@User_Name NVARCHAR(250)
AS
BEGIN
EXEC sp_A @User_Name
EXEC sp_B @User_Name
EXEC sp_C @User_Name
EXEC sp_D @User_Name
EXEC sp_E @User_Name
END
Within each of sp_A, sp_B, sp_C, etc. there are a bunch of insert/update statements.
My need is to get the exact number of rows affected by each sp, which I'll then have to write to a log.
So, for example, if sp_A has affected 5 rows, I should be able to write to the log this way:
sp_A executed for @User_Name. Affected row count: [Row Count].
I found one of the ways, which I have mentioned in the linked question. I was wondering if there's a better, more elegant way to achieve this.
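One common alternative (a sketch, not the linked solution; it assumes you are free to alter the child procs, and dbo.SomeTable/LastUser are hypothetical names) is to accumulate @@ROWCOUNT inside each child proc and hand the total back through an OUTPUT parameter:
-- Sketch: child proc reports its total affected rows via OUTPUT
ALTER PROCEDURE sp_A
@User_Name NVARCHAR(250),
@RowsAffected INT OUTPUT
AS
BEGIN
SET NOCOUNT ON
SET @RowsAffected = 0
UPDATE dbo.SomeTable SET LastUser = @User_Name -- hypothetical statement
SET @RowsAffected = @RowsAffected + @@ROWCOUNT
-- repeat the accumulation after every insert/update
END
GO
-- Caller logs the count:
DECLARE @rows INT
EXEC sp_A @User_Name = N'jsmith', @RowsAffected = @rows OUTPUT
PRINT 'sp_A executed for jsmith. Affected row count: ' + CAST(@rows AS varchar(10))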

How can I make a stored procedure commit immediately?

EDIT: This question is no longer valid, as the issue was something else. Please see my explanation below in my answer.
I'm not sure of the etiquette, so I'll leave this question in its current state.
I have a stored procedure that writes some data to a table.
I'm using the Microsoft Practices Enterprise Library for making my stored procedure call.
I invoke the stored procedure using a call to ExecuteNonQuery.
After ExecuteNonQuery returns, I invoke a 3rd-party library. It calls back to me on a separate thread in about 100 ms.
I then invoke another stored procedure to pull the data I had just written.
In about 99% of cases the data is returned. Once in a while it returns no rows (i.e. it can't find the data). If I put a conditional breakpoint in the debugger to detect this condition and manually rerun the stored procedure, it always returns my data.
This makes me believe the writing stored procedure is working, just not committing when it's called.
I'm fairly novice when it comes to SQL, so it's entirely possible that I'm doing something wrong. I would have thought that the writing stored procedure would block until its contents were committed to the DB.
Writing Stored Procedure
ALTER PROCEDURE [dbo].[spWrite]
@guid varchar(50),
@data varchar(50)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
-- see if this guid has already been added to the table
DECLARE @foundGuid varchar(50);
SELECT @foundGuid = [guid] from [dbo].[Details] where [guid] = @guid;
IF @foundGuid IS NULL
-- first time we've seen this guid
INSERT INTO [dbo].[Details] ( [guid], data ) VALUES (@guid, @data)
ELSE
-- updating or verifying order
UPDATE [dbo].[Details] SET data = @data WHERE [guid] = @guid
END
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
Reading Stored Procedure
ALTER PROCEDURE [dbo].[spRead]
@guid varchar(50)
AS
BEGIN
-- SET NOCOUNT ON added to prevent extra result sets from
-- interfering with SELECT statements.
SET NOCOUNT ON;
SELECT * from [dbo].[Details] where [guid] = @guid;
END
To actually block other transactions and commit manually, maybe adding the following could help you:
BEGIN TRANSACTION
--place your
--transactions you wish to do here
--if everything was okay
COMMIT TRANSACTION
--or
--ROLLBACK TRANSACTION if something went wrong
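Applied to the spWrite procedure above, that might look like this (a sketch; the UPDLOCK/HOLDLOCK hints are my addition and also keep a second caller from slipping between the existence check and the write):
ALTER PROCEDURE [dbo].[spWrite]
@guid varchar(50),
@data varchar(50)
AS
BEGIN
SET NOCOUNT ON;
BEGIN TRANSACTION
IF EXISTS (SELECT 1 FROM [dbo].[Details] WITH (UPDLOCK, HOLDLOCK) WHERE [guid] = @guid)
UPDATE [dbo].[Details] SET data = @data WHERE [guid] = @guid
ELSE
INSERT INTO [dbo].[Details] ([guid], data) VALUES (@guid, @data)
COMMIT TRANSACTION
END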
I’m not familiar with the data access tools you mention, but from your description I would guess that either the process does not wait for the stored procedure to complete execution before proceeding to the next steps, or ye olde “something else” is messing with the data in between your write and read calls.
One way to tell what's going on is to use SQL Profiler. Fire it up, monitor all possible query execution events on the database (including stored procedure and statement-level start/stop events), watch the Text and Started/Ended columns, correlate this with the times you are seeing while tracing the application, and that should help you figure out what's going on there. (SQL Profiler can be complex to use, but there are many sources on the web that explain it, and it is well worth learning.)
I'll leave my answer below as there are comments on it...
OK, I feel shame: I had simplified my question too much. What was actually happening is two things:
1) the inserting procedure is actually running on a separate machine (distributed system);
2) the inserting procedure actually inserts data into two tables without a transaction.
This means the reading query can run at the same time and find the tables in a state where one has been written to and the second hasn't yet had its write committed.
A simple transaction fixes this, as the reading query can handle either the no-write or the full-write case, but couldn't handle the case of one table written to and the other having a pending commit.
Well, it turns out that when I created the stored procedure, the MSSQLadmin tool added a line to it by default:
SET NOCOUNT ON;
If I turn that to:
SET NOCOUNT OFF;
then my procedure actually commits to the database properly. Strange that this default would actually end up causing problems.
An easy way using TRY...CATCH (like it if useful):
BEGIN TRAN
BEGIN TRY
INSERT INTO meals
(
...
)
VALUES (...)
COMMIT TRAN
END TRY
BEGIN CATCH
ROLLBACK TRAN
SET @resp = CONVERT(varchar(10), ERROR_LINE()) + ': ' + ERROR_MESSAGE()
END CATCH

How do I flush the PRINT buffer in TSQL?

I have a very long-running stored procedure in SQL Server 2005 that I'm trying to debug, and I'm using the 'print' command to do it. The problem is, I'm only getting the messages back from SQL Server at the very end of my sproc - I'd like to be able to flush the message buffer and see these messages immediately during the sproc's runtime, rather than at the very end.
Use the RAISERROR function:
RAISERROR( 'This message will show up right away...',0,1) WITH NOWAIT
You shouldn't completely replace all your prints with RAISERROR. If you have a loop or large cursor somewhere, just do it once or twice per iteration, or even just every several iterations.
Also: I first learned about RAISERROR at this link, which I now consider the definitive source on SQL Server Error handling and definitely worth a read:
http://www.sommarskog.se/error-handling-I.html
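A sketch of that throttled flushing inside a loop (the counter and modulus are only an illustration):
-- Flush the buffered PRINT output every 100 iterations
DECLARE @i int
SET @i = 0
WHILE @i < 1000
BEGIN
SET @i = @i + 1
PRINT 'processing row ' + CAST(@i AS varchar(10))
IF @i % 100 = 0
RAISERROR(N'...%d rows processed', 0, 1, @i) WITH NOWAIT
END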
Building on the answer by @JoelCoehoorn, my approach is to leave all my PRINT statements in place and simply follow them with the RAISERROR statement to cause the flush.
For example:
PRINT 'MyVariableName: ' + @MyVariableName
RAISERROR(N'', 0, 1) WITH NOWAIT
The advantage of this approach is that the PRINT statements can concatenate strings, whereas the RAISERROR cannot. (So either way you have the same number of lines of code, as you'd have to declare and set a variable to use in RAISERROR).
If, like me, you use AutoHotKey or SSMSBoost or an equivalent tool, you can easily set up a shortcut such as "]flush" to enter the RAISERROR line for you. This saves you time if it is the same line of code every time, i.e. does not need to be customised to hold specific text or a variable.
Yes... The first parameter of the RAISERROR function needs an NVARCHAR variable. So try the following;
-- Replace PRINT function
DECLARE @strMsg NVARCHAR(100)
SELECT @strMsg = 'Here''s your message...'
RAISERROR (@strMsg, 0, 1) WITH NOWAIT
OR
RAISERROR (N'Here''s your message...', 0, 1) WITH NOWAIT
Another, better option is to not depend on PRINT or RAISERROR at all: just load your "print" messages into a ##temp table in tempdb or a permanent table in your database, which gives you visibility to the data immediately via a SELECT statement from another window. This works best for me. Using a permanent table also serves as a log of what happened in the past. The print statements are handy for errors, but using the log table you can also determine the exact point of failure, based on the last logged value for that particular execution (assuming you track the overall execution start time in your log table).
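A minimal sketch of that logging pattern, using the same table_log layout that the answer further below builds on (names are illustrative):
-- Permanent log table written to instead of PRINT
CREATE TABLE table_log (
ID INT IDENTITY(1,1) PRIMARY KEY,
Msg NVARCHAR(1024),
LogTime DATETIME DEFAULT GETDATE()
)
-- Inside the long-running procedure:
INSERT INTO table_log (Msg) VALUES (N'step 1 complete')
-- From a second SSMS window, watch progress live:
SELECT * FROM table_log WITH (NOLOCK) ORDER BY ID DESC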
Just for reference: if you work in scripts (batch processing), not in stored procedures, flushing of the output is triggered by the GO command, e.g.
print 'test'
print 'test'
go
In general, my conclusion is the following: the output of an MS SQL script execution, whether run in the SSMS GUI or with sqlcmd.exe, is flushed to the file, standard output, or GUI window on the first GO statement or at the end of the script.
Flushing inside a stored procedure works differently, since you cannot place GO inside it.
Reference: tsql Go statement
To extend Eric Isaac's answer, here is how to use the table approach correctly:
Firstly, if your sp uses a transaction, you won't be able to monitor the contents of the table live unless you use the READ UNCOMMITTED option:
SELECT *
FROM table_log WITH (READUNCOMMITTED);
or
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;
SELECT *
FROM table_log;
To solve rollback issues, put an increasing ID on the log table, and use this code:
SET XACT_ABORT OFF;
BEGIN TRY
BEGIN TRANSACTION mytran;
-- already committed logs are not affected by a potential rollback
-- so only save logs created in this transaction
DECLARE @max_log_id INT = (SELECT MAX(ID) FROM table_log);
/*
* do stuff, log the stuff
*/
COMMIT TRANSACTION mytran;
END TRY
BEGIN CATCH
DECLARE @log_table_saverollback TABLE
(
ID INT,
Msg NVARCHAR(1024),
LogTime DATETIME
);
INSERT INTO @log_table_saverollback(ID, Msg, LogTime)
SELECT ID, Msg, LogTime
FROM table_log
WHERE ID > #max_log_id;
ROLLBACK TRANSACTION mytran; -- this deletes new log entries from the log table
SET IDENTITY_INSERT table_log ON;
INSERT INTO table_log(ID, Msg, LogTime)
SELECT ID, Msg, LogTime
FROM @log_table_saverollback;
SET IDENTITY_INSERT table_log OFF;
END CATCH
Notice these important details:
SET XACT_ABORT OFF; prevents SQL Server from just shutting down the entire transaction instead of running your CATCH block; always include it if you use this technique.
Use a @table_variable, not a #temp_table. Temp tables are also affected by rollbacks.
