Stored procedure execution is too slow - sql-server

I have a stored procedure that references four tables (RefurbRef, ActivationDetailRefurb, ActivationDetailReplaced, and ReplacedData), each holding roughly 100,000 (1 lakh) rows.
I need to bind the data from the stored procedure to the UI on the front end. When I executed the stored procedure on my SQL Server 2008 instance, it took almost 20 minutes to fetch the result. There's no way a user is going to wait that long staring at a "please wait, loading" screen.
This is the procedure:
CREATE procedure [dbo].[uspLotFailureDetail]
    @fromDate varchar(50),
    @toDate varchar(50),
    @vendorName varchar(50),
    @modelName varchar(50)
AS
BEGIN
    select
        d.LOTQty,
        ApprovedQty = count(distinct d.SerialNUMBER),
        d.DispatchDate,
        Installed = count(a.SerialNumber) + count(r.SerialNumber),
        DOA = sum(case when datediff(day, coalesce(a.ActivationDate, r.ActivationDate), f.RecordDate) between 0 and 10 then 1 else 0 end),
        Bounce = sum(case when datediff(day, coalesce(a.ActivationDate, r.ActivationDate), f.RecordDate) between 11 and 180 then 1 else 0 end)
    from
        RefurbRef d
    left join
        ActivationDetailRefurb a on d.SerialNUMBER = a.SerialNumber
            and d.DispatchDate <= a.ActivationDate
            and d.LOTQty = a.LOTQty
    left join
        ActivationDetailReplaced r on d.SerialNUMBER = r.SerialNumber
            and d.DispatchDate <= r.ActivationDate
            and d.LOTQty = r.LotQty
            and (a.ActivationDate is null or a.ActivationDate <= d.DispatchDate)
    left join
        ReplacedData f on f.OldSerialNumber = coalesce(a.SerialNumber, r.SerialNumber)
            and f.RecordDate >= coalesce(a.ActivationDate, r.ActivationDate)
    where
        d.DispatchDate between @fromDate and @toDate
        and d.VendorName = @vendorName
        and d.Model = @modelName
    group by
        d.LOTQty, d.DispatchDate
END
The procedure extracts two types of results: by Vendor and by Model. If the result is extracted based on Vendor alone, i.e. using only @fromDate, @toDate and @vendorName, the procedure takes less than 2 minutes to execute and return the result. But when all four parameters are used, as in the procedure above, it takes no less than 20 minutes.
Is there any way I could optimize the query to increase the performance of the procedure?
Thanks in advance

Given the information provided, I would look at RefurbRef.Model to see whether there is a covering index on that column. I would bet there is not, based on the execution time jumping to 20 minutes once you added that criterion. Additionally, I would change your parameters from varchar to date:
@fromDate date,
@toDate date,
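To illustrate the idea, here is a sketch of such a covering index. The index name and column list are assumptions taken from the query in the question, so verify them against the real schema:

```sql
-- Hypothetical index for the WHERE clause above: equality columns first
-- (VendorName, Model), then the range column (DispatchDate), with the other
-- referenced columns in INCLUDE so the query is covered end to end.
CREATE NONCLUSTERED INDEX IX_RefurbRef_Vendor_Model_Dispatch
    ON dbo.RefurbRef (VendorName, Model, DispatchDate)
    INCLUDE (SerialNUMBER, LOTQty);
```

With typed date parameters, `d.DispatchDate between @fromDate and @toDate` also avoids an implicit varchar-to-date conversion, which can otherwise prevent an index seek.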
Hope this helps,
Jason

Related

Need to get the min startdate and max enddate when there are no breaks in months or changes in ownership

I have a table that contains account ownership with start dates and end dates by account; however, some accounts have duplicate rows and some have rules with overlapping date ranges. I need a clean result set showing account, owner, start date and end date, with no duplications or overlaps.
The source table looks like this:
accountnumber | startdate | enddate    | owner
1             | 3/1/2012  | 6/30/2012  | john
1             | 3/1/2012  | 6/30/2012  | john
1             | 5/31/2012 | 7/31/2015  | john
2             | 5/1/2012  | 8/1/2012   | bill
2             | 8/2/2012  | 10/31/2012 | bill
2             | 12/1/2012 | 12/31/2012 | joe
2             | 1/1/2013  | 12/31/2025 | bill
I need the results to read like this:
accountnumber | startdate | enddate    | owner
1             | 3/1/2012  | 7/31/2015  | john
2             | 5/1/2012  | 10/31/2012 | bill
2             | 12/1/2012 | 12/31/2012 | joe
2             | 1/1/2013  | 12/31/2025 | bill
Any help is much appreciated. I'm very much a novice when it comes to SQL.
Select Distinct removes my duplicates, but I still end up with multiple overlapping date ranges.
I don't know what version of sql server we are using. It is a connector within a BI application called Sisense, and doesn't really say.
This is my select statement so far:
select distinct
r.accountnumber,
r.startdate,
r.enddate,
a.employeename Owner
from "dbo"."ruleset" r
left join "dbo"."rule" a on r.id = a.rulesetid
where
a.roleid = '1' and
r.isapproved = 'true'
The table structure is a bit interesting, and while there may be a better way to figure this out with less code (i.e., set-based), this does the trick. Here's my explanation along with my code.
Thought process: I needed to order the rows by AccountNumber and Owner and identify whenever either of these changes, as that marks a new "term"; additionally, I needed a way of marking the beginning of each term. For the former I used ROW_NUMBER, and for the latter I used LAG. These records, along with two new fields, are inserted into a temp table.
Having these two pieces of information allowed me to loop through the rows using a WHILE loop, keeping track of the current row as well as the most recent beginning of a term. I update the first record of each term with the latest end date (assuming you don't have earlier end dates for later start dates), and once we're done we select just the records marked as the start of a new term, giving the result set you asked for.
Links to documentation.
ROW_NUMBER()
LAG
WHILE
Code example:
DECLARE @RowNumber INTEGER
    ,@BeginTerm INTEGER
    ,@EndDate DATE;
DROP TABLE IF EXISTS #OwnershipChange;
SELECT
    r.accountNumber
    ,r.startDate
    ,r.endDate
    ,a.employeename AS [owner]
    ,ROW_NUMBER() OVER(ORDER BY r.accountNumber, r.startDate) AS RowNumber
    ,0 AS Processed
    ,CASE
        WHEN a.employeename = LAG(a.employeename, 1, NULL) OVER(ORDER BY r.accountNumber)
            AND r.accountNumber = LAG(r.accountNumber, 1, NULL) OVER(ORDER BY r.accountNumber)
        THEN 0
        ELSE 1
     END AS NewOwnership
INTO #OwnershipChange
FROM dbo.ruleset r
LEFT OUTER JOIN dbo.[rule] a ON r.id = a.rulesetid
WHERE a.roleid = '1'
AND r.isapproved = 'true';
WHILE EXISTS (
    SELECT 1/0
    FROM #OwnershipChange
    WHERE Processed = 0
)
BEGIN
    SET @RowNumber = (
        SELECT TOP 1 RowNumber
        FROM #OwnershipChange
        WHERE Processed = 0
        ORDER BY RowNumber
    );
    SET @BeginTerm = (
        SELECT
            CASE
                WHEN NewOwnership = 1 THEN @RowNumber
                ELSE @BeginTerm
            END
        FROM #OwnershipChange
        WHERE RowNumber = @RowNumber
    );
    SET @EndDate = (
        SELECT endDate
        FROM #OwnershipChange
        WHERE RowNumber = @RowNumber
    );
    UPDATE #OwnershipChange
    SET endDate = @EndDate
    WHERE RowNumber = @BeginTerm;
    UPDATE #OwnershipChange
    SET Processed = 1
    WHERE RowNumber = @RowNumber;
END;
SELECT
accountNumber
,startDate
,endDate
,[owner]
FROM #OwnershipChange
WHERE NewOwnership = 1;
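If the server turns out to support window functions (SQL Server 2012+ for the running SUM), the same "new term" flag can be processed set-based, with no loop. This is a sketch against the same ruleset/rule tables; it assumes end dates never move backwards within a term:

```sql
WITH deduped AS (
    -- remove exact duplicate rows first
    SELECT DISTINCT r.accountnumber, r.startdate, r.enddate, a.employeename AS [owner]
    FROM dbo.ruleset r
    LEFT JOIN dbo.[rule] a ON r.id = a.rulesetid
    WHERE a.roleid = '1' AND r.isapproved = 'true'
), flagged AS (
    -- 1 marks the first row of a new ownership "term"
    SELECT *,
           CASE WHEN [owner] = LAG([owner]) OVER (PARTITION BY accountnumber ORDER BY startdate)
                THEN 0 ELSE 1 END AS NewOwnership
    FROM deduped
), grouped AS (
    -- a running sum of the flags numbers each term
    SELECT *,
           SUM(NewOwnership) OVER (PARTITION BY accountnumber ORDER BY startdate
                                   ROWS UNBOUNDED PRECEDING) AS TermId
    FROM flagged
)
SELECT accountnumber,
       MIN(startdate) AS startdate,
       MAX(enddate)   AS enddate,
       MAX([owner])   AS [owner]   -- constant within a term
FROM grouped
GROUP BY accountnumber, TermId
ORDER BY accountnumber, MIN(startdate);
```

On the sample data above this collapses account 1 into one john row (3/1/2012 to 7/31/2015) and account 2 into the three bill/joe/bill terms shown in the expected output.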

CTE slow performance on Left join

I need to provide a report that shows all users on a table and their scores. Not all users on said table will have a score, so in my solution I calculate the score first using a few CTEs; then, in a final CTE, I pull the full roster and assign a default score to users with no actual score.
While the CTEs are not overly complex, they are not simple either. When I run just the calculation part of the CTEs for users with an actual score, it runs in less than a second. When I join to a final CTE that grabs the full roster and assigns default scores where the NULLs appear (no actual score), the wheels completely fall off and it never completes.
I've experimented with switching up the indexes and refreshing them, to no avail. I have noticed that the join at agent_effectiveness, when switched to INNER, runs in one second, but I need it to be a LEFT join so it will pull in the whole roster even when no score is present.
EDIT*
Execution Plan Inner Join
Execution Plan Left Join
WITH agent_split_stats AS (
Select
racf,
agent_stats.SkillGroupSkillTargetID,
aht_target.EnterpriseName,
aht_target.target,
Sum(agent_stats.CallsHandled) as n_calls_handled,
CASE WHEN (Sum(agent_stats.TalkInTime) + Sum(agent_stats.IncomingCallsOnHoldTime) + Sum(agent_stats.WorkReadyTime)) = 0 THEN 1 ELSE
(Sum(agent_stats.TalkInTime) + Sum(agent_stats.IncomingCallsOnHoldTime) + Sum(agent_stats.WorkReadyTime)) END
AS total_handle_time
from tblAceyusAgntSklGrp as agent_stats
-- GET TARGETS
INNER JOIN tblCrosswalkWghtPhnEffTarget as aht_target
ON aht_target.SgId = agent_stats.SkillGroupSkillTargetID
AND agent_stats.DateTime BETWEEN aht_target.StartDt and aht_target.EndDt
-- GET RACF
INNER JOIN tblAgentMetricCrosswalk as xwalk
ON xwalk.SkillTargetID = agent_stats.SkillTargetID
--GET TAU DATA LIKE START DATE AND GRADUATED FLAG
INNER JOIN tblTauClassList AS T
ON T.SaRacf = racf
WHERE
--FILTERS BY A ROLLING 15 BUSINESS DAYS UNLESS THE DAYS BETWEEN CURRENT DATE AND TAU START DATE ARE <15
agent_stats.DateTime >=
CASE WHEN dbo.fn_WorkDaysAge(TauStart, GETDATE()) <15 THEN TauStart ELSE
dbo.fn_WorkDate15(TauStart)
END
And Graduated = 'No'
--WPE FILTERS TO ENSURE ACCURATE DATA
AND CallsHandled <> 0
AND Target is not null
Group By
racf, agent_stats.SkillGroupSkillTargetID, aht_target.EnterpriseName, aht_target.target
),
agent_split_stats_with_weight AS (
-- calculate weights
-- one row = one advocate + split
SELECT
agent_split_stats.*,
agent_split_stats.n_calls_handled/SUM(agent_split_stats.n_calls_handled) OVER(PARTITION BY agent_split_stats.racf) AS [weight]
FROM agent_split_stats
),
agent_split_effectiveness AS (
-- calculate the raw Effectiveness score for each eligible advocate/split
-- one row = one agent + split, with their raw Effectiveness score and the components of that
SELECT
agent_split_stats_with_weight.*,
-- these are the components of the Effectiveness score
(((agent_split_stats_with_weight.target * agent_split_stats_with_weight.n_calls_handled) / agent_split_stats_with_weight.total_handle_time)*100)*agent_split_stats_with_weight.weight AS effectiveness_sum
FROM agent_split_stats_with_weight
), -- this is where we show effectiveness per split select * from agent_split_effectiveness
agent_effectiveness AS (
-- sum all of the individual effectiveness raw scores for each agent to get each agent's raw score
SELECT
racf AS SaRacf,
ROUND(SUM(effectiveness_sum),2) AS WpeScore
FROM agent_split_effectiveness
GROUP BY racf
),
--GET FULL CLASS LIST, TAU DATES, GOALS FOR WHOLE CLASS
tau AS (
Select L.SaRacf, TauStart, Goal as WpeGoal
,CASE WHEN agent_effectiveness.WpeScore IS NULL THEN 1 ELSE WpeScore END as WpeScore
FROM tblTauClassList AS L
LEFT JOIN agent_effectiveness
ON agent_effectiveness.SaRacf = L.SaRacf
LEFT JOIN tblCrosswalkTauGoal AS G
ON G.Year = TauYear
AND G.Bucket = 'Wpe'
WHERE TermDate IS NULL
AND Graduated = 'No'
)
SELECT tau.*,
CASE WHEN dbo.fn_WorkDaysAge(TauStart, GETDATE()) > 14 --MUST BE AT LEAST 15 DAYS TO PASS
AND WpeScore >= WpeGoal THEN 'Pass'
ELSE 'Fail' END
from tau
This style of query runs fine for 3 other calculation types (different score types), so I am unsure why it's failing so badly here. The actual results should be a list of individuals, a date, a score, and a goal; when no score exists, a default score is provided. Additionally, there is a pass/fail metric computed from the score and goal.
As @Habo mentioned, we need the actual execution plan (e.g. run the query with "Include Actual Execution Plan" turned on). I looked over what you posted and there is nothing there that will explain the problem. The difference between the actual plan and the estimated plan is that the actual number of rows retrieved is recorded; this is vital for troubleshooting poorly performing queries.
That said, I do see a HUGE problem with both queries. It's a problem that, once fixed, will improve both queries to less than a second. Your query is leveraging two scalar user-defined functions (UDFs): dbo.fn_WorkDaysAge and dbo.fn_WorkDate15. Scalar UDFs ruin
everything. Not only are they slow, they force a serial execution plan, which makes any query they are used in much slower.
I don't have the code for dbo.fn_WorkDaysAge or dbo.fn_WorkDate15, but I have my own "work days" function which is inline (code below). The syntax is a little different but the performance benefits are worth the effort. Here's the syntax difference:
-- Scalar
SELECT d.*, workDays = dbo.countWorkDays_scalar(d.StartDate,d.EndDate)
FROM <sometable> AS d;
-- Inline version
SELECT d.*, f.workDays
FROM <sometable> AS d
CROSS APPLY dbo.countWorkDays(d.StartDate,d.EndDate) AS f;
Here's a performance test I put together to show the difference between an inline version vs the scalar version:
-- SAMPLE DATA
IF OBJECT_ID('tempdb..#dates') IS NOT NULL DROP TABLE #dates;
WITH E1(x) AS (SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)) AS x(x)),
E3(x) AS (SELECT 1 FROM E1 a, E1 b, E1 c),
iTally AS (SELECT N=ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM E3 a, E3 b)
SELECT TOP (100000)
StartDate = CAST(DATEADD(DAY,-ABS(CHECKSUM(NEWID())%1000),GETDATE()) AS DATE),
EndDate = CAST(DATEADD(DAY,+ABS(CHECKSUM(NEWID())%1000),GETDATE()) AS DATE)
INTO #dates
FROM iTally;
-- PERFORMANCE TESTS
PRINT CHAR(10)+'Scalar Version (always serial):'+CHAR(10)+REPLICATE('-',60);
GO
DECLARE @st DATETIME = GETDATE(), @workdays INT;
SELECT @workdays = dbo.countWorkDays_scalar(d.StartDate,d.EndDate)
FROM #dates AS d;
PRINT DATEDIFF(MS,@st,GETDATE());
GO 3
PRINT CHAR(10)+'Inline Version:'+CHAR(10)+REPLICATE('-',60);
GO
DECLARE @st DATETIME = GETDATE(), @workdays INT;
SELECT @workdays = f.workDays
FROM #dates AS d
CROSS APPLY dbo.countWorkDays(d.StartDate,d.EndDate) AS f;
PRINT DATEDIFF(MS,@st,GETDATE());
GO 3
Results:
Scalar Version (always serial):
------------------------------------------------------------
Beginning execution loop
380
363
350
Batch execution completed 3 times.
Inline Version:
------------------------------------------------------------
Beginning execution loop
47
47
46
Batch execution completed 3 times.
As you can see, the inline version is about 8 times faster than the scalar version. Replacing those scalar UDFs with an inline version will almost certainly speed this query up regardless of join type.
Other problems I see include:
I see a lot of index scans; this is a sign that you need more filtering and/or better indexes.
dbo.tblCrosswalkWghtPhnEffTarget does not have any indexes, which means it will always be scanned.
Functions used for performance test:
-- INLINE VERSION
----------------------------------------------------------------------------------------------
IF OBJECT_ID('dbo.countWorkDays') IS NOT NULL DROP FUNCTION dbo.countWorkDays;
GO
CREATE FUNCTION dbo.countWorkDays (@startDate DATETIME, @endDate DATETIME)
/*****************************************************************************************
[Purpose]:
Calculates the number of business days between two dates (Mon-Fri), excluding weekends.
dates.countWorkDays does not take holidays into consideration; for that you would need a
separate "holiday table" to perform an anti-join against.
The idea is based on the solution in this article:
https://www.sqlservercentral.com/Forums/Topic153606.aspx?PageIndex=16
[Author]:
Alan Burstein
[Compatibility]:
SQL Server 2005+
[Syntax]:
--===== Autonomous
SELECT f.workDays
FROM dates.countWorkDays(@startdate, @enddate) AS f;
--===== Against a table using APPLY
SELECT t.col1, t.col2, f.workDays
FROM dbo.someTable t
CROSS APPLY dates.countWorkDays(t.col1, t.col2) AS f;
[Parameters]:
@startDate = datetime; first date to compare
@endDate = datetime; date to compare @startDate to
[Returns]:
Inline table valued function returns:
workDays = int; number of work days between @startdate and @enddate
[Dependencies]:
N/A
[Developer Notes]:
1. Returns NULL when either input parameter is NULL.
2. This function is what is referred to as an "inline" scalar UDF. Technically it's an
inline table valued function (iTVF), but it performs the same task as a scalar user
defined function (UDF); the difference is that it requires the APPLY table operator
to accept column values as a parameter. For more about "inline" scalar UDFs see this
article by SQL MVP Jeff Moden: http://www.sqlservercentral.com/articles/T-SQL/91724/
and for more about how to use APPLY see this article by SQL MVP Paul White:
http://www.sqlservercentral.com/articles/APPLY/69953/.
Note the syntax example above and the usage examples below to better understand how to
use the function. Although the function is slightly more complicated to use than a
scalar UDF, it will yield notably better performance for many reasons. For example,
unlike a scalar UDF or a multi-line table valued function, the inline scalar UDF does
not restrict the query optimizer's ability to generate a parallel query execution plan.
3. dates.countWorkDays requires that @endDate be equal to or later than @startDate. Otherwise
a NULL is returned.
4. dates.countWorkDays is NOT deterministic. For more on deterministic functions see:
https://msdn.microsoft.com/en-us/library/ms178091.aspx
[Examples]:
--===== 1. Basic use
SELECT f.workDays
FROM dates.countWorkDays('20180608', '20180611') AS f;
---------------------------------------------------------------------------------------
[Revision History]:
Rev 00 - 20180625 - Initial Creation - Alan Burstein
*****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT workDays =
-- If @startDate or @endDate is NULL then return a NULL
CASE WHEN SIGN(DATEDIFF(dd, @startDate, @endDate)) > -1 THEN
(DATEDIFF(dd, @startDate, @endDate) + 1) -- total days including weekends
-(DATEDIFF(wk, @startDate, @endDate) * 2) -- subtract 2 days for each full weekend
-- Subtract 1 when startDate is a Sunday and subtract 1 when endDate is a Saturday:
-(CASE WHEN DATENAME(dw, @startDate) = 'Sunday' THEN 1 ELSE 0 END)
-(CASE WHEN DATENAME(dw, @endDate) = 'Saturday' THEN 1 ELSE 0 END)
END;
GO
-- SCALAR VERSION
----------------------------------------------------------------------------------------------
IF OBJECT_ID('dbo.countWorkDays_scalar') IS NOT NULL DROP FUNCTION dbo.countWorkDays_scalar;
GO
CREATE FUNCTION dbo.countWorkDays_scalar (@startDate DATETIME, @endDate DATETIME)
RETURNS INT WITH SCHEMABINDING AS
BEGIN
RETURN
(
SELECT workDays =
-- If @startDate or @endDate is NULL then return a NULL
CASE WHEN SIGN(DATEDIFF(dd, @startDate, @endDate)) > -1 THEN
(DATEDIFF(dd, @startDate, @endDate) + 1) -- total days including weekends
-(DATEDIFF(wk, @startDate, @endDate) * 2) -- subtract 2 days for each full weekend
-- Subtract 1 when startDate is a Sunday and subtract 1 when endDate is a Saturday:
-(CASE WHEN DATENAME(dw, @startDate) = 'Sunday' THEN 1 ELSE 0 END)
-(CASE WHEN DATENAME(dw, @endDate) = 'Saturday' THEN 1 ELSE 0 END)
END
);
END
GO
UPDATE BASED ON OP'S QUESTION IN THE COMMENTS:
First, here are inline table valued function versions of each function. Note that I'm using my own tables and don't have time to make the names match your environment, but I did my best to include comments in the code. Also note that if, in your function, workingday = '1' simply pulls weekdays, then you'll find my function above to be a much faster alternative to your dbo.fn_WorkDaysAge function. If workingday = '1' also filters out holidays, then it won't work.
CREATE FUNCTION dbo.fn_WorkDaysAge_itvf
(
@first_date DATETIME,
@second_date DATETIME
)
RETURNS TABLE AS RETURN
SELECT WorkDays = COUNT(*)
FROM dbo.dimdate -- DateDimension
WHERE DateValue -- [date]
BETWEEN @first_date AND @second_date
AND IsWeekend = 0 --workingday = '1'
GO
CREATE FUNCTION dbo.fn_WorkDate15_itvf
(
@TauStartDate DATETIME
)
RETURNS TABLE AS RETURN
WITH DATES AS
(
SELECT
ROW_NUMBER() OVER(ORDER BY DateValue DESC) AS RowNum, DateValue
FROM dbo.dimdate -- DateDimension
WHERE DateValue BETWEEN @TauStartDate AND --GETDATE() testing below
CASE WHEN GETDATE() < @TauStartDate + 200 THEN GETDATE() ELSE @TauStartDate + 200 END
AND IsWeekend = 0 --workingday = '1'
)
--Get the 15th business day from the current date
SELECT DateValue
FROM DATES
WHERE RowNum = 16;
GO
Now, to replace your scalar UDFs with the inline table valued functions, you would do this (note my comments):
WITH agent_split_stats AS (
Select
racf,
agent_stats.SkillGroupSkillTargetID,
aht_target.EnterpriseName,
aht_target.target,
Sum(agent_stats.CallsHandled) as n_calls_handled,
CASE WHEN (Sum(agent_stats.TalkInTime) + Sum(agent_stats.IncomingCallsOnHoldTime) + Sum(agent_stats.WorkReadyTime)) = 0 THEN 1 ELSE
(Sum(agent_stats.TalkInTime) + Sum(agent_stats.IncomingCallsOnHoldTime) + Sum(agent_stats.WorkReadyTime)) END
AS total_handle_time
from tblAceyusAgntSklGrp as agent_stats
INNER JOIN tblCrosswalkWghtPhnEffTarget as aht_target
ON aht_target.SgId = agent_stats.SkillGroupSkillTargetID
AND agent_stats.DateTime BETWEEN aht_target.StartDt and aht_target.EndDt
INNER JOIN tblAgentMetricCrosswalk as xwalk
ON xwalk.SkillTargetID = agent_stats.SkillTargetID
INNER JOIN tblTauClassList AS T
ON T.SaRacf = racf
-- INLINE FUNCTIONS HERE:
CROSS APPLY dbo.fn_WorkDaysAge_itvf(TauStart, GETDATE()) AS wd
CROSS APPLY dbo.fn_WorkDate15_itvf(TauStart) AS w15
-- NEW WHERE CLAUSE:
WHERE agent_stats.DateTime >=
CASE WHEN wd.workdays < 15 THEN TauStart ELSE w15.workdays END
And Graduated = 'No'
AND CallsHandled <> 0
AND Target is not null
Group By
racf, agent_stats.SkillGroupSkillTargetID, aht_target.EnterpriseName, aht_target.target
),
agent_split_stats_with_weight AS (
SELECT
agent_split_stats.*,
agent_split_stats.n_calls_handled/SUM(agent_split_stats.n_calls_handled) OVER(PARTITION BY agent_split_stats.racf) AS [weight]
FROM agent_split_stats
),
agent_split_effectiveness AS
(
SELECT
agent_split_stats_with_weight.*,
(((agent_split_stats_with_weight.target * agent_split_stats_with_weight.n_calls_handled) /
agent_split_stats_with_weight.total_handle_time)*100)*
agent_split_stats_with_weight.weight AS effectiveness_sum
FROM agent_split_stats_with_weight
),
agent_effectiveness AS
(
SELECT
racf AS SaRacf,
ROUND(SUM(effectiveness_sum),2) AS WpeScore
FROM agent_split_effectiveness
GROUP BY racf
),
tau AS
(
SELECT L.SaRacf, TauStart, Goal as WpeGoal
,CASE WHEN agent_effectiveness.WpeScore IS NULL THEN 1 ELSE WpeScore END as WpeScore
FROM tblTauClassList AS L
LEFT JOIN agent_effectiveness
ON agent_effectiveness.SaRacf = L.SaRacf
LEFT JOIN tblCrosswalkTauGoal AS G
ON G.Year = TauYear
AND G.Bucket = 'Wpe'
WHERE TermDate IS NULL
AND Graduated = 'No'
)
SELECT tau.*,
-- NEW CASE STATEMENT HERE:
CASE WHEN wd.workdays > 14 AND WpeScore >= WpeGoal THEN 'Pass' ELSE 'Fail' END
from tau
-- INLINE FUNCTIONS HERE:
CROSS APPLY dbo.fn_WorkDaysAge_itvf(TauStart, GETDATE()) AS wd
CROSS APPLY dbo.fn_WorkDate15_itvf(TauStart) AS w15;
Note that I can't test this right now, but it should be correct (or close).
UPDATE
I accepted Alan's answer; I ended up doing the following. Posting examples in the hope that the formatting helps someone, since it slowed me down a bit... or maybe I am just slow, heh heh.
1. Changed my scalar UDFs to inline TVFs
SCALAR Function 1-
ALTER FUNCTION [dbo].[fn_WorkDaysAge]
(
-- Add the parameters for the function here
@first_date DATETIME,
@second_date DATETIME
)
RETURNS int
AS
BEGIN
-- Declare the return variable here
DECLARE @WorkDays int
-- Add the T-SQL statements to compute the return value here
SELECT @WorkDays = COUNT(*)
FROM DateDimension
WHERE Date BETWEEN @first_date AND @second_date
AND workingday = '1'
-- Return the result of the function
RETURN @WorkDays
END
iTVF function 1-
ALTER FUNCTION [dbo].[fn_iTVF_WorkDaysAge]
(
-- Add the parameters for the function here
@FirstDate AS Date,
@SecondDate AS Date
)
RETURNS TABLE AS RETURN
SELECT WorkDays = COUNT(*)
FROM DateDimension
WHERE Date BETWEEN @FirstDate AND @SecondDate
AND workingday = '1'
I then updated my next function the same way. I added the CROSS APPLY (something I've personally not used; I'm still a newbie) as indicated below and replaced the UDFs with the field names in my CASE statement.
Old Code
INNER JOIN tblTauClassList AS T
ON T.SaRacf = racf
WHERE
--FILTERS BY A ROLLING 15 BUSINESS DAYS UNLESS THE DAYS BETWEEN CURRENT DATE AND TAU START DATE ARE <15
agent_stats.DateTime >=
CASE WHEN dbo.fn_WorkDaysAge(TauStart, GETDATE()) <15 THEN TauStart ELSE
dbo.fn_WorkDate15(TauStart)
END
New Code
INNER JOIN tblTauClassList AS T
ON T.SaRacf = racf
--iTVFs
CROSS APPLY dbo.fn_iTVF_WorkDaysAge(TauStart, GETDATE()) as age
CROSS APPLY dbo.fn_iTVF_WorkDate_15(TauStart) as roll
WHERE
--FILTERS BY A ROLLING 15 BUSINESS DAYS UNLESS THE DAYS BETWEEN CURRENT DATE AND TAU START DATE ARE <15
agent_stats.DateTime >=
CASE WHEN age.WorkDays <15 THEN TauStart ELSE
roll.Date
END
New code runs in 3-4 seconds. I will go back and index the appropriate tables per your recommendation and probably gain more efficiency there.
Cannot thank you enough!

Purge job optimization

SQL Server 2008 R2 Enterprise
I have a database with 3 tables for which I am keeping a retention time of 15 days. This is a logging database that is very active, about 500 GB in size, and it eats about 30 GB a day unless purged. I can't seem to get caught up on one of the tables and I am falling behind. This table has 220 million rows and needs to purge around 10-12 million rows nightly. I am currently 30 million rows behind. I can only run this purge at night due to the volume of incoming inserts competing for table locks. I have confirmed that everything is indexed correctly, and have run Brent Ozar's sp_BlitzIndex just to confirm that. Is there any way to optimize what I am doing below? I am running the same purge steps for each table.
1. Drop and create the 3 purge tables: Purge_Log, Purge_SLogHeader and Purge_SLogMessage.
2. Insert rows into the purge tables (takes 5 minutes per table):
Insert Into Purge_Log
Select ID from ServiceLog
where startTime < dateadd (day, -15, getdate() )
--****************************************************
Insert into Purge_SLogMessage
select serviceLogId from ServiceLogMessage
where serviceLogId in ( select id from
ServiceLog
where startTime < dateadd (day, -15, getdate() ))
--****************************************************
Insert into Purge_SLogHeader
Select serviceLogId from ServiceLogHeader
where serviceLogId in ( select id from
ServiceLog
where startTime < dateadd (day, -15, getdate() ))
After that is inserted, then I run the following with differences for each table:
SET ROWCOUNT 1000
delete_more:
delete from ServiceLog
where Id in ( select Id from Purge_Log)
IF @@ROWCOUNT > 0 GOTO delete_more
SET ROWCOUNT 0
Basically, does anyone see a way I can make this procedure run faster, or is there a different way to go about it? I've made the queries as simple as possible, with only one subquery. I've tried a join, and the execution plan says the time to complete is the same either way. Any guidance would be appreciated.
You can use this technique for all the tables: collect the IDs first in a temporary table, to avoid scanning the original table again and again over huge data. I hope it will work well for all your tables:
DECLARE @del_query VARCHAR(MAX)
/*
Taking IDs from the ServiceLog table instead of Purge_Log, because Purge_Log may have more data than expected due to frequent purging
*/
IF OBJECT_ID('tempdb..#tmp_log_ids') IS NOT NULL DROP TABLE #tmp_log_ids
SELECT ID INTO #tmp_log_ids FROM ServiceLog WHERE startTime < DATEADD(DAY, -15, GETDATE())
SET @del_query ='
DELETE TOP(100000) sl
FROM ServiceLog sl
INNER JOIN #tmp_log_ids t ON t.id = sl.id'
WHILE 1 = 1
BEGIN
EXEC(@del_query + ' option(maxdop 5) ')
IF @@ROWCOUNT < 100000 BREAK;
END
SET @del_query ='
DELETE TOP(100000) sl
FROM ServiceLogMessage sl
INNER JOIN #tmp_log_ids t ON t.id = sl.serviceLogId'
WHILE 1 = 1
BEGIN
EXEC(@del_query + ' option(maxdop 5) ')
IF @@ROWCOUNT < 100000 BREAK;
END
SET @del_query ='
DELETE TOP(100000) sl
FROM ServiceLogHeader sl
INNER JOIN #tmp_log_ids t ON t.id = sl.serviceLogId'
WHILE 1 = 1
BEGIN
EXEC(@del_query + ' option(maxdop 5) ')
IF @@ROWCOUNT < 100000 BREAK;
END
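One small addition worth trying (a sketch; it assumes the batched DELETEs join on the collected ID column): give the temp table a clustered index right after populating it, so each batch can seek into it instead of scanning millions of unordered IDs on every iteration:

```sql
-- Collect the IDs once, then index them before the delete loops run.
IF OBJECT_ID('tempdb..#tmp_log_ids') IS NOT NULL DROP TABLE #tmp_log_ids
SELECT ID INTO #tmp_log_ids
FROM ServiceLog
WHERE startTime < DATEADD(DAY, -15, GETDATE())

-- Index the collected IDs; every batched DELETE can then seek rather than scan.
CREATE CLUSTERED INDEX cx_tmp_log_ids ON #tmp_log_ids (ID);
```

The few seconds spent building the index are usually repaid many times over across the repeated delete batches.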

Stored procedure in SQL Server 2008 R2 is running fine, but getting "Divide by zero" error in Visual Studio 2012

I am trying to create an SSRS report in Visual Studio and am getting a divide-by-zero error in VS, even though the query executes fine in SQL Server; I'm having a bit of trouble debugging it.
The report setup in VS is pretty simple -- I am simply pulling in columns from my stored procedure, passing a year parameter so the user can choose the report window, and providing a few bits of random text to make it look nice. There are no expressions in my solution except to pull in the stored procedure columns, e.g.
=Fields!TOTAL_NUMBER_RECEIVED.Value
The stored procedure has a few instances of division, which look like this:
WHILE @Month_Loop < 13
BEGIN
UPDATE #Report
SET FractionField =
CAST((SELECT
SUM(CASE WHEN
DATEPART(mm,EndDate)= @Month_Loop
AND EndDate - StartDate <= 30
THEN 1 ELSE 0 END)
FROM #YTD
JOIN OtherTable ON #YTD.id = OtherTable.id
GROUP BY DATEPART(yy,EndDate))
AS FLOAT)
/
CAST((SELECT TotalField
FROM #Report
WHERE MonthID = @Month_Loop)
AS FLOAT) * 100
WHERE MonthID = @Month_Loop
SET @Month_Loop = @Month_Loop + 1
END
The code is just returning a ratio of the "FractionField" over the "TotalField" and inserting it into a table for each month of the year. I have three more loops similar to this one that are nearly identical.
I've tested my stored procedure in SQL Server and it works fine, but for whatever reason I consistently get this error when running the report in Visual Studio:
An error occurred due to local report processing. An error has occurred during report processing. Query execution failed for dataset 'myDataset'.
Divide by zero error encountered. (x30)
The statement has been terminated. (x30)
Any idea what I could be missing?
UPDATE: Changed from DATEPART to DATEDIFF.
With what I said before in mind, your CASE WHEN will always evaluate as false. You can't subtract two dates like that; you need to use DATEDIFF to get the difference in days and compare that.
WHILE @Month_Loop < 13
BEGIN
UPDATE #Report
SET FractionField =
CAST((SELECT
SUM(CASE WHEN
DATEPART(mm,EndDate)= @Month_Loop
AND DATEDIFF(day, EndDate, StartDate) <= 30
THEN 1 ELSE 0 END)
FROM #YTD
JOIN OtherTable ON #YTD.id = OtherTable.id
GROUP BY DATEPART(yy,EndDate))
AS FLOAT)
/
CAST((SELECT TotalField
FROM #Report
WHERE MonthID = @Month_Loop)
AS FLOAT) * 100
WHERE MonthID = @Month_Loop
SET @Month_Loop = @Month_Loop + 1
END
I was testing my code using 2015 in SQL Server, and the error came in VS when I defaulted to 2016: I forgot to limit the loops for when we don't want the months that haven't happened yet. Here is the updated code:
WHILE @Month_Loop < (CASE WHEN YEAR(GETDATE()) = @Year THEN MONTH(GETDATE()) ELSE 13 END)
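Separate from the date logic, the division itself can be made safe with NULLIF, which turns a zero denominator into NULL so the row yields NULL instead of raising an error. A minimal sketch of the pattern (not the full UPDATE):

```sql
DECLARE @numerator FLOAT = 10, @denominator FLOAT = 0;

-- @numerator / @denominator would raise "Divide by zero error encountered."
-- NULLIF(@denominator, 0) returns NULL when the denominator is 0,
-- and the NULL propagates through the division instead of erroring.
SELECT @numerator / NULLIF(@denominator, 0) AS SafeRatio;
```

In the stored procedure, wrapping the CAST of the TotalField subquery in NULLIF(..., 0) would have the same effect for the months whose totals are zero.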

Select Range of Dates, Including Ones With No Results [duplicate]

This question already has answers here:
Closed 10 years ago.
Possible Duplicate:
SQL Server: How to select all days in a date range even if no data exists for some days
I wasn't really sure how to word this question, but I'll try to explain. I'm trying to build some basic reporting using queries like the following:
SELECT COUNT(*) AS [count], h_date FROM (SELECT CONVERT(VARCHAR(10), h_time, 102) AS h_date FROM hits) h GROUP BY h_date ORDER BY h_date
This returns results like this, which I use to build a graph:
8 2012.05.06
2 2012.05.07
9 2012.05.09
As you can see, it's missing the 8th as there were no hits on that day. Is there a way to get a value of 0 for dates that have no results, or will I have to parse the results after the fact and add them manually?
You can use existing catalog views to derive a sequential range of dates between your start date and your end date. Then you can just left join to your data, and any missing dates will be there with 0s.
DECLARE @min SMALLDATETIME, @max SMALLDATETIME;
SELECT @min = MIN(h_time), @max = MAX(h_time)
FROM dbo.hits
-- WHERE ?
-- or if you just want a fixed range:
-- SELECT @min = '20120101', @max = '20120131';
;WITH n(d) AS
(
SELECT TOP (DATEDIFF(DAY, @min, @max)+1)
DATEADD(DAY, ROW_NUMBER() OVER (ORDER BY [object_id]) - 1, DATEDIFF(DAY, 0, @min))
FROM sys.all_objects ORDER BY [object_id]
)
SELECT n.d, [count] = COUNT(h.h_time)
FROM n
LEFT OUTER JOIN dbo.hits AS h
ON h.h_time >= n.d
AND h.h_time < DATEADD(DAY, 1, n.d)
-- AND --WHERE clause against hits?
GROUP BY n.d;
I've never been a big fan of using system tables to create dummy records to join against, but it's a very common approach.
I took Aaron Bertrand's answer and changed the common table expression (CTE) to a recursive one instead. It's quicker, as it doesn't have to hit a table to do the query; not that the previous version is slow anyway.
You need to specify OPTION (MAXRECURSION 0), otherwise the number of rows returned is limited to the default (100). A value of 0 returns unlimited rows.
DECLARE @min SMALLDATETIME, @max SMALLDATETIME;
--SELECT @min = MIN(h_time), @max = MAX(h_time)
-- FROM dbo.hits
SELECT @min = '20120101', @max = '20121231';
WITH recursedate(each_date, date_index) AS
(
SELECT @min, 0
UNION ALL
SELECT DATEADD(DAY,date_index+1,@min), date_index+1
FROM recursedate
WHERE DATEADD(DAY,date_index+1,@min) <= @max
)
)
SELECT recursedate.each_date, [count] = COUNT(h.h_time)
FROM recursedate
LEFT OUTER JOIN dbo.hits AS h
ON --CONVERT(SMALLDATETIME,h.h_time) = recursedate.dates
h.h_time >= recursedate.each_date
AND h.h_time < DATEADD(DAY, 1, recursedate.each_date)
-- AND --WHERE clause against hits?
GROUP BY recursedate.each_date
OPTION (MAXRECURSION 0); -- The default is 100 so you'll only get 100 dates, 0 is unlimited.
