I've inherited a database from another developer and need some help with it.
I have the following stored procedure:
CREATE PROCEDURE [dbo].[TESTgetSearchResults]
(
@ids varchar(100),
@Date DateTime = Null,
@Date2 DATETIME = Null,
@Sort VARCHAR(5), /* ASC or DESC */
@SortBy VARCHAR(10), /* Sorting criteria */
@Location VARCHAR(40)
)
AS
SELECT @Date = GetDate()
SELECT @Date2 = DATEADD(day,-14,GETDATE())
BEGIN
SELECT Aircraft.Id AS AircraftID, AircraftManufacturers.Name, AircraftModels.ModelName,
Aircraft.ModelSuffix, Aircraft.ImageFileName, Aircraft.Year, Aircraft.SerialNo,
Locations.DescriptionForSite, Aircraft.Description, Aircraft.Description2,
Aircraft.InfoWebAddress, Aircraft.ImageDescription, Advertisers.Id AS AdvertisersID,
Advertisers.Name AS AdvertisersName, Aircraft.AircraftDataId, Aircraft.ForSale, Aircraft.ForLease,
Aircraft.TTAF, Aircraft.ReSend, Aircraft.ReSendReason, Aircraft.Registration, Aircraft.AdType,
Aircraft.HasAlternateImage, Aircraft.AlternateImageDescription,
Aircraft.Price, AircraftModels.AircraftType, Advertisers.CurrentEMagLink, Aircraft.CurrentEMagLink,
Aircraft.Email, Aircraft.IsSold, Aircraft.SoldDate, Aircraft.DateAdded, Aircraft.ExtendedDetails,
Aircraft.LastUpdateDate, Aircraft.ImageCount, Aircraft.ContactTelephone, AircraftModels.id, Advertisers.IsPremiumAdvertiser,
Aircraft.Location, Advertisers.ContactTelephone As AdvertisersTelephone, Aircraft.EndDate, Aircraft.VideoLink,
Aircraft.Contact, Advertisers.WASSalesEmail, Advertisers.WASSalesEmail2, Aircraft.PriceNumeric,
Aircraft.PriceQualifier, Aircraft.Currency, Aircraft.AircraftDescription
FROM (((Aircraft
INNER JOIN Advertisers ON Aircraft.AdvertiserId = Advertisers.Id)
INNER JOIN AircraftModels ON Aircraft.AircraftModelId = AircraftModels.Id)
INNER JOIN AircraftManufacturers ON AircraftModels.ManufacturerId = AircraftManufacturers.Id)
INNER JOIN Locations ON Aircraft.LocationId = Locations.Id
JOIN iter$simple_intlist_to_tbles(@ids) i ON AircraftModels.id = i.number
WHERE (Aircraft.IsActive=1 AND Advertisers.IsActive=1 AND Aircraft.IsSold=0 AND (Aircraft.EndDate>=@Date OR Aircraft.EndDate Is Null) AND Locations.Id = @Location)
OR (Aircraft.IsActive=1 AND Advertisers.IsActive=1 AND Aircraft.IsSold=1 AND Aircraft.SoldDate>=@Date2 AND Locations.Id = @Location)
ORDER BY Advertisers.IsPremiumAdvertiser ASC, Aircraft.DateAdded DESC, Aircraft.ListPosition DESC,
Aircraft.LastUpdateDate, AircraftManufacturers.Name, AircraftModels.ModelName, Aircraft.ModelSuffix,
Aircraft.Id DESC
END
iter$simple_intlist_to_tbles(@ids) simply builds a table from the @ids input. This input comes in the form of a string of ID numbers separated by ',', e.g. ,1,,2,,3,,4, etc.
Now I need to replace @Location with a string of location IDs formatted in the same fashion, e.g. ,1,,2,,3,,4, etc.
So my problem is this: how do I adapt the above stored procedure so that the two parts of the WHERE clause that filter on a single location can take multiple location IDs instead?
Any help would be really appreciated.
Thanks.
To solve your problem, retrieve the values from iter$simple_intlist_to_tbles(@Location) in a subquery and test for membership with IN:
AND Locations.Id IN (SELECT * FROM iter$simple_intlist_to_tbles(@Location))
Your WHERE clause is also more complex than it needs to be. Both sides of the OR repeat the same AND conditions, so you can move them outside the OR. It simplifies to:
WHERE Aircraft.IsActive=1
AND Advertisers.IsActive=1
AND ((Aircraft.IsSold=0 AND (Aircraft.EndDate>=@Date OR Aircraft.EndDate Is Null))
OR (Aircraft.IsSold=1 AND Aircraft.SoldDate>=@Date2))
AND Locations.Id IN (SELECT * FROM iter$simple_intlist_to_tbles(@Location))
You can do this with a subquery such as:
FROM blah ... AND Locations.Id IN (SELECT number FROM iter$simple_intlist_to_tbles(@locations))
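To make the IN-subquery idea concrete, here is a minimal sketch of filtering on an ID list in the ,1,,2,,3,,4, format. The body of iter$simple_intlist_to_tbles is not shown in the question, so this sketch substitutes the built-in STRING_SPLIT (SQL Server 2016 or later, compatibility level 130). The @Locations table variable and its rows are made up for illustration.

```sql
-- Sketch only: STRING_SPLIT stands in for iter$simple_intlist_to_tbles,
-- whose definition is not shown. Requires SQL Server 2016+ (compat level 130).
DECLARE @Location varchar(100) = ',1,,2,,3,,4,';

-- Hypothetical stand-in for the Locations table.
DECLARE @Locations TABLE (Id int, DescriptionForSite varchar(40));
INSERT INTO @Locations VALUES (1, 'Europe'), (2, 'Asia'), (5, 'Africa');

SELECT l.Id, l.DescriptionForSite
FROM @Locations AS l
WHERE l.Id IN (SELECT CAST(value AS int)
               FROM STRING_SPLIT(@Location, ',')
               WHERE value <> '');   -- doubled commas yield empty items; skip them
```

With the sample data above, only the rows with Id 1 and 2 are returned, because 5 is not in the list.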
I have a list of teams in one table and a list of cases in another table. I have to allocate a unique random case number to each member of the team. What is the best way to generate a unique random case number for each team member? I have read about the NEWID() and CRYPT_GEN_RANDOM(4) functions and tried using them, but I am not getting a unique number for each team member. Can someone please help? Thanks for your time. I am using SQL Server 2008.
I have a 'Teams' table which holds team members, their IDs (TM1, TM2, etc.) and their names.
I have another 'Cases' table which has ID numbers like 1, 2, 3, 4, etc. I want to allocate a random case to each team member. The desired output is shown below.
Team member   Random_case_allocated
TM1           3
TM2           5
TM3           7
TM4           2
TM5           8
TM6           6
I have tried
SELECT TOP 1 id FROM cases
ORDER BY CRYPT_GEN_RANDOM(4)
It is giving the same id for all team members. I want a different case id for each team member. Can someone please help. Thank you.
The TOP(1) ... ORDER BY NEWID() approach will not work the way you are trying to use it here. TOP tells the query engine that you are only interested in the first record of the result set. You need NEWID() to be evaluated for each record, which you can force inside a window function such as ROW_NUMBER(). This could probably be optimized; it is simply what I came up with off the top of my head. Please note that this is far from a truly random algorithm.
UPDATED With Previous Case Exclusions
DECLARE @User TABLE(UserId INT)
DECLARE @Case TABLE(CaseID INT)
DECLARE @UserCase TABLE (UserID INT, CaseID INT, DateAssigned DATETIME)
DECLARE @CaseCount INT =10
DECLARE @SaveCaseID INT = @CaseCount
DECLARE @UserCount INT = 100
DECLARE @NumberOfUserAllocatedAtStart INT= 85
WHILE(@CaseCount > 0)BEGIN
INSERT @Case VALUES(@CaseCount)
SET @CaseCount = @CaseCount-1
END
DECLARE @RandomCaseID INT
WHILE(@UserCount > 0)BEGIN
INSERT @User VALUES(@UserCount)
SET @UserCount = @UserCount-1
IF(@NumberOfUserAllocatedAtStart > 0 )BEGIN
SET @RandomCaseID = (ABS(CHECKSUM(NewId())) % (@SaveCaseID))+1
INSERT @UserCase SELECT @UserCount,@RandomCaseID,DATEADD(MONTH,-3,GETDATE())
SET @RandomCaseID = (ABS(CHECKSUM(NewId())) % (@SaveCaseID))+1
INSERT @UserCase SELECT @UserCount,@RandomCaseID,DATEADD(MONTH,-5,GETDATE())
SET @RandomCaseID = (ABS(CHECKSUM(NewId())) % (@SaveCaseID))+1
INSERT @UserCase SELECT @UserCount,@RandomCaseID,DATEADD(MONTH,-2,GETDATE())
SET @NumberOfUserAllocatedAtStart=@NumberOfUserAllocatedAtStart-1
END
END
;WITH RowNumberWithNewID AS
(
SELECT
U.UserID, C.CaseID, UserCase_CaseID = UC.CaseID,
RowNumber = ROW_NUMBER() OVER (PARTITION BY U.UserID ORDER BY NEWID())
FROM
@User U
INNER JOIN @Case C ON 1=1
-- Match only assignments made within the last 4 months
LEFT OUTER JOIN @UserCase UC ON UC.UserID=U.UserID AND UC.CaseID=C.CaseID AND UC.DateAssigned > DATEADD(MONTH, -4, GETDATE())
WHERE
UC.CaseID IS NULL -- keep only cases this user has not had recently
)
SELECT
UserID,
CaseID,
PreviousCases = STUFF((SELECT ', '+CONVERT(NVARCHAR(10), UC.CaseID) FROM @UserCase UC WHERE UC.UserID=RN.UserID FOR XML PATH('')),1,2,'')
FROM RowNumberWithNewID RN
WHERE
RN.RowNumber=1
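If the goal is simply that no two team members receive the same case, one common pattern is to number the members in any stable order, number the cases in random order, and join on the row number. The sketch below assumes hypothetical column names (MemberId, Id) and uses table variables in place of the real Teams and Cases tables. It also assumes there are at least as many cases as team members.

```sql
-- Hypothetical stand-ins for the Teams and Cases tables.
DECLARE @Teams TABLE (MemberId varchar(10));
DECLARE @Cases TABLE (Id int);
INSERT INTO @Teams VALUES ('TM1'), ('TM2'), ('TM3'), ('TM4');
INSERT INTO @Cases VALUES (1), (2), (3), (4), (5), (6), (7), (8);

WITH t AS (
    -- Any deterministic order will do for the members.
    SELECT MemberId, rn = ROW_NUMBER() OVER (ORDER BY MemberId)
    FROM @Teams
),
c AS (
    -- NEWID() is evaluated per row, so the cases are shuffled.
    SELECT Id, rn = ROW_NUMBER() OVER (ORDER BY NEWID())
    FROM @Cases
)
SELECT t.MemberId AS Team_member,
       c.Id       AS Random_case_allocated
FROM t
INNER JOIN c ON c.rn = t.rn;   -- each row number is used once, so no case repeats
```

Members beyond the number of available cases would get no row with an INNER JOIN; a LEFT JOIN from t would show them with a NULL case instead.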
So, my first post is less a question and more a statement! Sorry.
I needed to convert delimited strings stored in VarChar table columns to multiple/separate columns for the same record. (It's COTS software, so please don't bother telling me that the table is designed wrong.) After searching the internet ad nauseam for how to create a generic single-line call to do that - and finding lots of how not to do that - I created my own. (The name is not very creative.)
Returns: A table with sequentially numbered/named columns starting with [Col1]. If an input value is not provided, then an empty string is returned. If fewer than 32 values are provided, all columns past the last value are returned as NULL. If more than 32 values are provided, the extras are ignored.
Prerequisites: A Number/Tally Table (luckily, our database already contained 'dbo.numbers').
Assumptions: Not more than 32 delimited values. (If you need more, change "WHERE tNumbers.Number BETWEEN 1 AND XXX", and add more prenamed columns ",[Col33]...,[ColXXX]".)
Issues: The very first column always gets populated, even if @InputString is NULL.
--======================================================================
--SMOZISEK 2017/09 CREATED
--======================================================================
CREATE FUNCTION dbo.fStringToPivotTable
(@InputString VARCHAR(8000)
,@Delimiter VARCHAR(30) = ','
)
RETURNS TABLE AS RETURN
WITH cteElements AS (
SELECT ElementNumber = ROW_NUMBER() OVER(PARTITION BY @InputString ORDER BY (SELECT 0))
,ElementValue = NodeList.NodeElement.value('.','VARCHAR(1022)')
FROM (SELECT TRY_CONVERT(XML,CONCAT('<X>',REPLACE(@InputString,@Delimiter,'</X><X>'),'</X>')) AS InputXML) AS InputTable
CROSS APPLY InputTable.InputXML.nodes('/X') AS NodeList(NodeElement)
)
SELECT PivotTable.*
FROM (
SELECT ColumnName = CONCAT('Col',tNumbers.Number)
,ColumnValue = tElements.ElementValue
FROM DBO.NUMBERS AS tNumbers --DEPENDENT ON ANY EXISTING NUMBER/TALLY TABLE!!!
LEFT JOIN cteElements AS tElements
ON tNumbers.Number = tElements.ElementNumber
WHERE tNumbers.Number BETWEEN 1 AND 32
) AS XmlSource
PIVOT (
MAX(ColumnValue)
FOR ColumnName
IN ([Col1] ,[Col2] ,[Col3] ,[Col4] ,[Col5] ,[Col6] ,[Col7] ,[Col8]
,[Col9] ,[Col10],[Col11],[Col12],[Col13],[Col14],[Col15],[Col16]
,[Col17],[Col18],[Col19],[Col20],[Col21],[Col22],[Col23],[Col24]
,[Col25],[Col26],[Col27],[Col28],[Col29],[Col30],[Col31],[Col32]
)
) AS PivotTable
;
GO
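The core of the function is the XML splitting step, which can be tried in isolation. The following is a minimal sketch of that step only, using the same variable names as the function and a sample string of my own choosing. It assumes the input contains no characters that are special in XML (such as & or <), since those would make the conversion fail.

```sql
DECLARE @InputString varchar(8000) = 'Mr.^Scott^A.^Mozisek^Sr.';
DECLARE @Delimiter   varchar(30)   = '^';

-- Wrap every element in <X>...</X> by replacing the delimiter with a close/open tag pair.
DECLARE @InputXML xml =
    CAST('<X>' + REPLACE(@InputString, @Delimiter, '</X><X>') + '</X>' AS xml);

-- Shred the XML back into one row per element.
-- ORDER BY (SELECT 0) does not formally guarantee document order,
-- although in practice the rows come back in element order.
SELECT ElementNumber = ROW_NUMBER() OVER (ORDER BY (SELECT 0))
      ,ElementValue  = NodeList.NodeElement.value('.', 'varchar(1022)')
FROM @InputXML.nodes('/X') AS NodeList(NodeElement);
```

The function then joins these numbered elements to the tally table and pivots them into [Col1]..[Col32].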
Test:
SELECT *
FROM dbo.fStringToPivotTable ('|Height|Weight||Length|Width||Color|Shade||Up|Down||Top|Bottom||Red|Blue|','|') ;
Usage:
SELECT 1 AS ID,'Title^FirstName^MiddleName^LastName^Suffix' AS Name
INTO #TempTable
UNION SELECT 2,'Mr.^Scott^A.^Mozisek^Sr.'
UNION SELECT 3,'Ms.^Jane^Q.^Doe^'
UNION SELECT 5,NULL
UNION SELECT 7,'^Betsy^^Ross^'
;
SELECT SourceTable.*
,ChildTable.Col1 AS ColTitle
,ChildTable.Col2 AS ColFirst
,ChildTable.Col3 AS ColMiddle
,ChildTable.Col4 AS ColLast
,ChildTable.Col5 AS ColSuffix
FROM #TempTable AS SourceTable
OUTER APPLY dbo.fStringToPivotTable(SourceTable.Name,'^') AS ChildTable
;
No, I have not tested any plan (I just needed it to work).
Oh, yeah: SQL Server 2012 (12.0 SP2)
Comments? Corrections? Enhancements?
Here is my TVF. It is easy to expand up to 32 columns (the pattern is pretty clear).
This is straight XML without the cost of the PIVOT.
Example - notice the OUTER APPLY; use CROSS APPLY to exclude NULLs
Select A.ID
,B.*
From #TempTable A
Outer Apply [dbo].[tvf-Str-Parse-Row](A.Name,'^') B
The UDF if Interested
CREATE FUNCTION [dbo].[tvf-Str-Parse-Row] (@String varchar(max),@Delimiter varchar(10))
Returns Table
As
Return (
Select Pos1 = ltrim(rtrim(xDim.value('/x[1]','varchar(max)')))
,Pos2 = ltrim(rtrim(xDim.value('/x[2]','varchar(max)')))
,Pos3 = ltrim(rtrim(xDim.value('/x[3]','varchar(max)')))
,Pos4 = ltrim(rtrim(xDim.value('/x[4]','varchar(max)')))
,Pos5 = ltrim(rtrim(xDim.value('/x[5]','varchar(max)')))
,Pos6 = ltrim(rtrim(xDim.value('/x[6]','varchar(max)')))
,Pos7 = ltrim(rtrim(xDim.value('/x[7]','varchar(max)')))
,Pos8 = ltrim(rtrim(xDim.value('/x[8]','varchar(max)')))
,Pos9 = ltrim(rtrim(xDim.value('/x[9]','varchar(max)')))
From (Select Cast('<x>' + replace((Select replace(@String,@Delimiter,'§§Split§§') as [*] For XML Path('')),'§§Split§§','</x><x>')+'</x>' as xml) as xDim) as A
Where @String is not null
)
--Thanks Shnugo for making this XML safe
--Select * from [dbo].[tvf-Str-Parse-Row]('Dog,Cat,House,Car',',')
--Select * from [dbo].[tvf-Str-Parse-Row]('John <test> Cappelletti',' ')
I want to start off by saying that I am brand new to Stored Procedures, and am basically teaching myself how to do them. Any suggestions or advice will be greatly appreciated. I would mail you chocolate if I could.
The Gist: My organization's clients take a survey on their initial visit and on every 6th visit after that. We need to know whether each individual has shown improvement over time. The way we decided to do this is to compare the first survey to the most recent one. So if they have been to 18 sessions, the 1st and 3rd surveys would be compared (because they would have completed the survey 3 times over 18 sessions).
I have been able to obtain the "first" score and the "recent" score with two complex, multi-layered nested SELECT statements inside one stored procedure. The "first" one is a TOP(1) linking on the unique ID (DOCID), ordered by date. The "recent" one is a TOP(1) linking on the unique ID (DOCID), ordered by date descending. Each statement gets me exactly what I need on its own, but the combined output is not correct, which is obviously due to the ordering in the statements.
The end result will be to create a Crystal Report with it for grant reporting purposes.
Declare
@StartDate Date,
@EndDate Date,
@First_DOCID Int,
@First_Clientkey Int,
@First_Date_Screening Date,
@First_Composite_Score Float,
@First_Depression_Score Float,
@First_Emotional_Score Float,
@First_Relationship_Score Float,
@Recent_DOCID Int,
@Recent_Clientkey Int,
@Recent_Date_Screening Date,
@Recent_Composite_Score Float,
@Recent_Depression_Score Float,
@Recent_Emotional_Score Float,
@Recent_Relationship_Score Float,
@Difference_Composit_Score Float,
@Difference_Depression_Score Float,
@Difference_Emotional_Score Float,
@Difference_Relationship_Score Float
SET @StartDate = '1/1/2016'
SET @EndDate = '6/1/2016'
BEGIN
SELECT @First_DOCID = CB24_1.OP__DOCID, @First_Date_Screening = CB24_1.Date_Screening, @First_Clientkey = CB24_1.ClientKey, @First_Composite_Score = CB24_1.Composite_score, @First_Depression_Score = CB24_1.Depression_Results, @First_Emotional_Score = CB24_1.Emotional_Results, @First_Relationship_Score = CB24_1.Relationships_Results
FROM FD__CNSLG_BASIS24 AS CB24_1
WHERE (CB24_1.OP__DOCID =
(Select TOP(1) CB24_2.OP__DOCID
...
ORDER BY CB24_2.Date_Screening))
ORDER BY ClientKey DESC
END
BEGIN
SELECT @Recent_DOCID = CB24_1.OP__DOCID, @Recent_Date_Screening = CB24_1.Date_Screening, @Recent_Clientkey = CB24_1.ClientKey, @Recent_Composite_Score = CB24_1.Composite_score, @Recent_Depression_Score = CB24_1.Depression_Results, @Recent_Emotional_Score = CB24_1.Emotional_Results, @Recent_Relationship_Score = CB24_1.Relationships_Results
FROM FD__CNSLG_BASIS24 AS CB24_1
WHERE (CB24_1.OP__DOCID =
(Select TOP(1) CB24_2.OP__DOCID
...
ORDER BY CB24_2.Date_Screening DESC))
ORDER BY ClientKey
END
SET @Difference_Composit_Score = (@Recent_Composite_Score - @First_Composite_Score)
SET @Difference_Depression_Score = (@Recent_Depression_Score - @First_Depression_Score)
SET @Difference_Emotional_Score = (@Recent_Emotional_Score - @First_Emotional_Score)
SET @Difference_Relationship_Score = (@Recent_Relationship_Score - @First_Relationship_Score)
SELECT
@First_DOCID AS First_Docid,
@First_Clientkey AS First_Clientkey,
@First_Date_Screening AS First_Date_Screening,
@First_Composite_Score AS First_Composite_Score,
@First_Depression_Score AS First_Depression_Score,
@First_Emotional_Score AS First_Emotional_Score,
@First_Relationship_Score AS First_Relationship_Score,
@Recent_DOCID AS Recent_DOCID,
@Recent_Clientkey AS Recent_Clientkey,
@Recent_Date_Screening AS Recent_Date_Screening,
@Recent_Composite_Score AS Recent_Composite_Score,
@Recent_Depression_Score AS Recent_Depression_Score,
@Recent_Emotional_Score AS Recent_Emotional_Score,
@Recent_Relationship_Score AS Recent_Relationship_Score,
@Difference_Composit_Score AS Difference_Composit_Score,
@Difference_Depression_Score AS Difference_Depression_Score,
@Difference_Emotional_Score AS Difference_Emotional_Score,
@Difference_Relationship_Score AS Difference_Relationship_Score
In SQL you don't want unnecessary declared variables.
Here's a contrived but reproducible example which utilizes common table expressions and window functions that should get you in the right direction. I created the stored procedure from the template with the necessary input parameters (which in real life you'd like to avoid).
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE PROCEDURE dbo.Client_Improvement_Results
(@StartDate DATETIME, @EndDate DATETIME)
AS
BEGIN
SET NOCOUNT ON;
-- Insert statements for procedure here
-- You would never do this in real-life but for a simple reproducible example...
DECLARE @Survey TABLE
(
Clientkey INT,
Date_Screening DATE,
Composite_Score FLOAT
)
INSERT INTO @Survey
VALUES
(1, '2014-04-01', 42.1),
(1, '2014-04-10', 46.1),
(1, '2014-04-20', 48.1),
(2, '2014-05-10', 40.1),
(2, '2014-05-20', 30.1),
(2, '2014-05-30', 10.1)
;
--Use Common Table Expression & Window Functions to ID first/recent visit by client
WITH CTE AS (
SELECT
S.Clientkey
,S.Composite_Score
,S.Date_Screening
,First_Date_Screening = MIN(S.Date_Screening) OVER(PARTITION BY S.Clientkey)
,Recent_Date_Screening = MAX(S.Date_Screening) OVER(PARTITION BY S.Clientkey)
FROM @Survey AS S
)
--Self join of CTE with proper filters
--applied allows you to return differences in one row
SELECT
f.Clientkey
,f.First_Date_Screening
,f.Recent_Date_Screening
,Difference_Score = r.Composite_Score - f.Composite_Score
FROM
CTE AS f --first
INNER JOIN CTE AS r --recent
ON f.Clientkey = r.Clientkey
WHERE
f.Date_Screening = f.First_Date_Screening
AND r.Date_Screening = r.Recent_Date_Screening
END
GO
Here is the solution I came up with after everyone's amazing advice.
I want to go back and replace the TOP(1) with another new thing I learned at some point:
select pc.*
from (select pc.*, row_number() over (partition by Clientkey, ProgramAdmitKey order by Date_Screening) as seqnum
from FD__CNSLG_BASIS24 PC) pc
where seqnum = 1
I will have to play with the above script a bit first, however. It doesn't like to be inserted into the larger script below.
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
BEGIN
SET NOCOUNT ON;
Declare
@StartDate Date,
@EndDate Date
SET @StartDate = '1/1/2016'
SET @EndDate = '6/1/2016'; -- the statement before WITH must end with a semicolon
WITH CNSL_Clients AS (
SELECT PC_CNT.Clientkey, PC_Cnt.ProgramAdmitKey, PC_Cnt.OP__DOCID
FROM FD__Primary_Client as PC_Cnt
INNER JOIN VW__Cnsl_Session_Count_IndvFamOnly as cnt
ON PC_Cnt.Clientkey = CNT.Clientkey AND PC_Cnt.ProgramAdmitKey = CNT.ProgramAdmitKey
WHERE ((pc_CNT.StartDate between @StartDate AND @EndDate) OR (pc_CNT.StartDate <= @StartDate AND pc_CNT.ENDDate >= @StartDate) OR (pc_CNT.StartDate <= @StartDate AND pc_CNT.ENDDate is null))
AND CNT.SessionCount>=6
),
FIRST_BASIS AS (
SELECT CB24_1.OP__DOCID, CB24_1.Date_Screening, CB24_1.ClientKey, CB24_1.ProgramAdmitKey, CB24_1.Composite_score, CB24_1.Depression_Results,CB24_1.Emotional_Results, CB24_1.Relationships_Results
FROM FD__CNSLG_BASIS24 AS CB24_1
WHERE (CB24_1.OP__DOCID =
(Select TOP(1) CB24_2.OP__DOCID
FROM FD__CNSLG_BASIS24 AS CB24_2
Inner JOIN CNSL_Clients
ON CB24_2.ClientKey = CNSL_Clients.ClientKey AND CB24_2.ProgramAdmitKey = CNSL_Clients.ProgramAdmitKey
WHERE (CB24_1.ClientKey = CB24_2.ClientKey) AND (CB24_1.ProgramAdmitKey = CB24_2.ProgramAdmitKey)
ORDER BY CB24_2.Date_Screening))
),
RECENT_BASIS AS (
SELECT CB24_1.OP__DOCID, CB24_1.Date_Screening, CB24_1.ClientKey, CB24_1.ProgramAdmitKey, CB24_1.Composite_score, CB24_1.Depression_Results,CB24_1.Emotional_Results, CB24_1.Relationships_Results
FROM FD__CNSLG_BASIS24 AS CB24_1
WHERE (CB24_1.OP__DOCID =
(Select TOP(1) CB24_2.OP__DOCID
FROM FD__CNSLG_BASIS24 AS CB24_2
Inner JOIN CNSL_Clients
ON CB24_2.ClientKey = CNSL_Clients.ClientKey AND CB24_2.ProgramAdmitKey = CNSL_Clients.ProgramAdmitKey
WHERE (CB24_1.ClientKey = CB24_2.ClientKey) AND (CB24_1.ProgramAdmitKey = CB24_2.ProgramAdmitKey)
ORDER BY CB24_2.Date_Screening DESC))
)
SELECT F.OP__DOCID AS First_DOCID, R.OP__DOCID AS Recent_DOCID, F.ClientKey, F.ProgramAdmitKey,
F.Composite_Score AS FComposite_Score, R.Composite_Score AS RComposite_Score, Composite_Change = R.Composite_Score - F.Composite_Score,
F.Depression_Results AS FDepression_Results, R.Depression_Results AS RDepression_Results, Depression_Change = R.Depression_Results - F.Depression_Results,
F.Emotional_Results AS FEmotional_Results, R.Emotional_Results AS REmotional_Results, Emotional_Change = R.Emotional_Results - F.Emotional_Results,
F.Relationships_Results AS FRelationships_Results, R.Relationships_Results AS RRelationships_Results, Relationship_Change = R.Relationships_Results - F.Relationships_Results
FROM First_basis AS F
FULL Outer JOIN RECENT_BASIS AS R
ON F.ClientKey = R.ClientKey AND F.ProgramAdmitKey = R.ProgramAdmitKey
ORDER BY F.ClientKey
END
GO
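The ROW_NUMBER() idea mentioned above could replace both TOP(1) subqueries in a single pass: number each client's surveys in ascending and in descending date order, then pick row 1 of each with conditional aggregation. This is only a sketch. The @Basis24 table variable and its sample rows are made up to stand in for FD__CNSLG_BASIS24, and only the composite score is shown.

```sql
-- Hypothetical stand-in for FD__CNSLG_BASIS24 with only the columns used here.
DECLARE @Basis24 TABLE (ClientKey int, ProgramAdmitKey int,
                        Date_Screening date, Composite_score float);
INSERT INTO @Basis24 VALUES
    (1, 10, '2016-01-05', 42.1), (1, 10, '2016-02-16', 46.1), (1, 10, '2016-03-29', 48.1),
    (2, 20, '2016-01-12', 40.1), (2, 20, '2016-04-19', 30.1);

WITH ranked AS (
    SELECT ClientKey, ProgramAdmitKey, Composite_score,
           -- 1 = earliest survey for this client/admission
           rn_first  = ROW_NUMBER() OVER (PARTITION BY ClientKey, ProgramAdmitKey
                                          ORDER BY Date_Screening),
           -- 1 = latest survey for this client/admission
           rn_recent = ROW_NUMBER() OVER (PARTITION BY ClientKey, ProgramAdmitKey
                                          ORDER BY Date_Screening DESC)
    FROM @Basis24
)
SELECT ClientKey, ProgramAdmitKey,
       First_Composite  = MAX(CASE WHEN rn_first  = 1 THEN Composite_score END),
       Recent_Composite = MAX(CASE WHEN rn_recent = 1 THEN Composite_score END),
       Composite_Change = MAX(CASE WHEN rn_recent = 1 THEN Composite_score END)
                        - MAX(CASE WHEN rn_first  = 1 THEN Composite_score END)
FROM ranked
GROUP BY ClientKey, ProgramAdmitKey;
```

A client with a single survey gets the same row as both first and recent, so the change is 0 rather than NULL. Whether that is desirable depends on the report.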
I'm trying to clean up some stored procedures, and was curious about the following. I did a search, but couldn't find anything that really talked about performance.
Explanation
Imagine a stored procedure that has the following parameters defined:
@EntryId uniqueidentifier,
@UserId int = NULL
I have the following table:
tbl_Entry
-------------------------------------------------------------------------------------
| EntryId PK, uniqueidentifier | Name nvarchar(140) | Created datetime | UserId int |
-------------------------------------------------------------------------------------
All columns are NOT NULL.
The idea behind this stored procedure is that you can get an Entry by its uniqueidentifier PK and, optionally, you can validate that it has the given UserId assigned by passing that as the second parameter. Imagine administrators who can view all entries versus a user who can only view their own entries.
Option 1 (current)
DECLARE @sql nvarchar(3000);
SET @sql = N'
SELECT
a.EntryId,
a.Name,
a.UserId,
b.UserName
FROM
tbl_Entry a,
tbl_User b
WHERE
a.EntryId = @EntryId
AND b.UserId = a.UserId';
IF @UserId IS NOT NULL
SET @sql = @sql + N' AND a.UserId = @UserId';
EXECUTE sp_executesql @sql, N'@EntryId uniqueidentifier, @UserId int', @EntryId, @UserId;
Option 2 (what I thought would be better)
SELECT
a.EntryId,
a.Name,
a.UserId,
b.UserName
FROM
tbl_Entry a,
tbl_User b
WHERE
a.EntryId = @EntryId
AND a.UserId = COALESCE(@UserId, a.UserId)
AND b.UserId = a.UserId;
I realize this case is fairly simple and could likely be optimized with a single IF statement that separates two queries. I wrote a simple case to try to explain the issue concisely. The actual stored procedure has 6 nullable parameters, and others have even more. Using IF blocks would be very complicated.
Question
Will SQL Server still check a.UserId = a.UserId on every row even though that condition will always be true, or will that condition be optimized out when it sees that @UserId is NULL?
If it does check a.UserId = a.UserId on every row, would it be more efficient to build a string as in option 1, or would it still be faster to evaluate the a.UserId = a.UserId condition? Does that depend on how many rows are in the tables?
Is there another option here that I should be considering? I wouldn't call myself a database expert by any means.
You will get the best performance (and the lowest query cost) if you replace the COALESCE with a compound predicate as follows:
(@UserId IS NULL OR a.UserId = @UserId)
I would also suggest that when writing T-SQL you use explicit JOIN syntax rather than the antiquated ANSI-89 comma-join style. The revised query will look something like this:
SELECT a.EntryId, a.Name, a.UserId, b.UserName
FROM tbl_Entry a
INNER JOIN tbl_User b ON a.UserId = b.UserId
WHERE a.EntryId = @EntryId
AND (@UserId IS NULL OR a.UserId = @UserId);
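One variation worth testing with this kind of optional-parameter predicate is adding OPTION (RECOMPILE), which on reasonably recent SQL Server versions lets the optimizer see the actual value of @UserId at execution time and drop the branch that cannot apply. The trade-off is a compile on every execution. The sketch below uses table variables with made-up rows as stand-ins for tbl_Entry and tbl_User so that it runs on its own.

```sql
-- Hypothetical stand-ins for tbl_Entry and tbl_User.
DECLARE @Entry TABLE (EntryId uniqueidentifier PRIMARY KEY,
                      Name nvarchar(140) NOT NULL,
                      UserId int NOT NULL);
DECLARE @User  TABLE (UserId int PRIMARY KEY, UserName nvarchar(50) NOT NULL);

DECLARE @EntryId uniqueidentifier = NEWID();
DECLARE @UserId  int = NULL;            -- NULL means "do not filter by user"

INSERT INTO @User  VALUES (1, N'alice');
INSERT INTO @Entry VALUES (@EntryId, N'First entry', 1);

SELECT a.EntryId, a.Name, a.UserId, b.UserName
FROM @Entry AS a
INNER JOIN @User AS b ON a.UserId = b.UserId
WHERE a.EntryId = @EntryId
  AND (@UserId IS NULL OR a.UserId = @UserId)
OPTION (RECOMPILE);                     -- plan is built for the current parameter values
```

For a single-row lookup by primary key the difference is likely to be negligible; it tends to matter more for procedures with many optional filters over large tables, so it is best judged by comparing actual execution plans.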
I was looking at different ways of writing a stored procedure to return a "page" of data. This was for use with the ASP ObjectDataSource, but it could be considered a more general problem.
The requirement is to return a subset of the data based on the usual paging parameters; startPageIndex and maximumRows, but also a sortBy parameter to allow the data to be sorted. Also there are some parameters passed in to filter the data on various conditions.
One common way to do this seems to be something like this:
[Method 1]
;WITH stuff AS (
SELECT
CASE
WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
WHEN @SortBy = ...
ELSE ROW_NUMBER() OVER (ORDER BY whatever)
END AS Row,
.,
.,
.,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)
)
SELECT *
FROM stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
One problem with this is that it doesn't give the total count and generally we need another stored procedure for that. This second stored procedure has to replicate the parameter list and the complex WHERE clause. Not nice.
One solution is to append an extra column to the final select list, (SELECT COUNT(*) FROM stuff) AS TotalRows. This gives us the total but repeats it for every row in the result set, which is not ideal.
[Method 2]
An interesting alternative is given here (https://web.archive.org/web/20211020111700/https://www.4guysfromrolla.com/articles/032206-1.aspx) using dynamic SQL. He reckons that the performance is better because the CASE statement in the first solution drags things down. Fair enough, and this solution makes it easy to get the totalRows and slap it into an output parameter. But I hate coding dynamic SQL. All that 'bit of SQL ' + STR(#parm1) +' bit more SQL' gubbins.
[Method 3]
The only way I can find to get what I want, without repeating code which would have to be synchronized, and keeping things reasonably readable is to go back to the "old way" of using a table variable:
DECLARE @stuff TABLE (Row INT, ...)
INSERT INTO @stuff
SELECT
CASE
WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
WHEN @SortBy = ...
ELSE ROW_NUMBER() OVER (ORDER BY whatever)
END AS Row,
.,
.,
.,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)
SELECT *
FROM @stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
(Or a similar method using an IDENTITY column on the table variable).
Here I can just add a SELECT COUNT on the table variable to get the totalRows and put it into an output parameter.
I did some tests and, with a fairly simple version of the query (no sortBy and no filter), method 1 seems to come out on top (almost twice as quick as the other two). Then I decided I should probably test with the complexity I actually need and with the SQL in stored procedures. With this I get method 1 taking nearly twice as long as the other two methods, which seems strange.
Is there any good reason why I shouldn't spurn CTEs and stick with method 3?
UPDATE - 15 March 2012
I tried adapting Method 1 to dump the page from the CTE into a temporary table so that I could extract the TotalRows and then select just the relevant columns for the resultset. This seemed to add significantly to the time (more than I expected). I should add that I'm running this on a laptop with SQL Server Express 2008 (all that I have available) but still the comparison should be valid.
I looked again at the dynamic SQL method. It turns out I wasn't really doing it properly (just concatenating strings together). I set it up as in the documentation for sp_executesql (with a parameter description string and parameter list) and it's much more readable. Also this method runs fastest in my environment. Why that should be still baffles me, but I guess the answer is hinted at in Hogan's comment.
I would most likely split the @SortBy argument into two, @SortColumn and @SortDirection, and use them like this (note that multiplying by -1 only works when the sort expression is numeric):
…
ROW_NUMBER() OVER (
ORDER BY CASE @SortColumn
WHEN 'Name' THEN Name
WHEN 'OtherName' THEN OtherName
…
END *
CASE @SortDirection
WHEN 'DESC' THEN -1
ELSE 1
END
) AS Row
…
And this is how the TotalRows column could be defined (in the main select):
…
COUNT(*) OVER () AS TotalRows
…
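As a small self-contained illustration of the COUNT(*) OVER () idea, the sketch below pages over a made-up @People table variable and returns the total row count alongside each row of the page. The table, its rows, and the parameter values are assumptions for the example only.

```sql
-- Hypothetical sample data.
DECLARE @People TABLE (Name varchar(50));
INSERT INTO @People VALUES ('Ann'), ('Bob'), ('Cid'), ('Dee'), ('Eve');

DECLARE @startRowIndex int = 2, @maximumRows int = 2;

WITH numbered AS (
    SELECT Name,
           Row       = ROW_NUMBER() OVER (ORDER BY Name),
           TotalRows = COUNT(*) OVER ()   -- counted before the page filter is applied
    FROM @People
)
SELECT Name, Row, TotalRows
FROM numbered
WHERE Row > @startRowIndex
  AND Row <= @startRowIndex + @maximumRows
ORDER BY Row;
```

With this data the page contains Cid (row 3) and Dee (row 4), each carrying TotalRows = 5. The total is still repeated on every row, but it comes from the same scan and does not require repeating the WHERE clause in a second query.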
I would definitely want to do a combination of a temp table and NTILE for this sort of approach.
The temp table will allow you to do your complicated series of conditions just once. Because you're only storing the pieces you care about, it also means that when you start doing selects against it further in the procedure, it should have a smaller overall memory usage than if you ran the condition multiple times.
I like NTILE() for this better than ROW_NUMBER() because it's doing the work you're trying to accomplish for you, rather than having additional where conditions to worry about.
The example below is one based off a similar query I'm using as part of a research query; I have an ID I can use that I know will be unique in the results. Using an ID that was an identity column would also be appropriate here, though.
--DECLARES here would be stored procedure parameters
declare @pagesize int, @sortby varchar(25), @page int = 1;
--Create temp with all relevant columns; ID here could be an identity PK to help with paging query below
create table #temp (id int not null primary key clustered, status varchar(50), lastname varchar(100), startdate datetime);
--Insert into #temp based off of your complex conditions, but with no attempt at paging
insert into #temp
(id, status, lastname, startdate)
select id, status, lastname, startdate
from Table1 ...etc.
where ...complicated conditions
SET @pagesize = 50;
SET @page = 5;--OR CAST(@startRowIndex/@pagesize as int)+1
SET @sortby = 'name';
--Only use the id and count to use NTILE
;with paging(id, pagenum, totalrows) as
(
select id,
NTILE((SELECT COUNT(*) cnt FROM #temp)/@pagesize) OVER(ORDER BY CASE WHEN @sortby = 'NAME' THEN lastname ELSE convert(varchar(10), startdate, 112) END),
cnt
FROM #temp
cross apply (SELECT COUNT(*) cnt FROM #temp) total
)
--Use the id to join back to main select
SELECT *
FROM paging
JOIN #temp ON paging.id = #temp.id
WHERE paging.pagenum = @page
--Don't need the drop in the procedure, included here for rerunnability
drop table #temp;
I generally prefer temp tables over table variables in this scenario, largely so that there are definite statistics on the result set you have. (Search for temp table vs table variable and you'll find plenty of examples as to why)
Dynamic SQL would be most useful for handling the sorting method. Using my example, you could do the main query in dynamic SQL and only pull the sort method you want to pull into the OVER().
The example above also does the total in each row of the return set, which as you mentioned was not ideal. You could, instead, have a @totalrows output variable in your procedure and pull it as well as the result set. That would save you the CROSS APPLY that I'm doing above in the paging CTE.
I would create one procedure to stage, sort, and paginate (using NTILE()) a staging table; and a second procedure to retrieve by page. This way you don't have to run the entire main query for each page.
This example queries AdventureWorks.HumanResources.Employee:
--------------------------------------------------------------------------
create procedure dbo.EmployeesByMartialStatus
@MaritalStatus nchar(1)
, @sort varchar(20)
as
-- Init staging table
if exists(
select 1 from sys.objects o
inner join sys.schemas s on s.schema_id=o.schema_id
and s.name='Staging'
and o.name='EmployeesByMartialStatus'
where type='U'
)
drop table Staging.EmployeesByMartialStatus;
-- Populate staging table with sort value
with s as (
select *
, sr=ROW_NUMBER()over(order by case @sort
when 'NationalIDNumber' then NationalIDNumber
when 'ManagerID' then ManagerID
-- plus any other sort conditions
else EmployeeID end)
from AdventureWorks.HumanResources.Employee
where MaritalStatus=@MaritalStatus
)
select *
into #temp
from s;
-- And now pages
declare @RowCount int; select @rowCount=COUNT(*) from #temp;
declare @PageCount int=ceiling(@rowCount/20.0); --assuming 20 lines/page; 20.0 avoids integer division
select *
, Page=NTILE(@PageCount)over(order by sr)
into Staging.EmployeesByMartialStatus
from #temp;
go
--------------------------------------------------------------------------
-- procedure to retrieve selected pages
create procedure EmployeesByMartialStatus_GetPage
@page int
as
declare @MaxPage int;
select @MaxPage=MAX(Page) from Staging.EmployeesByMartialStatus;
set @page=case when @page not between 1 and @MaxPage then 1 else @page end;
select EmployeeID,NationalIDNumber,ContactID,LoginID,ManagerID
, Title,BirthDate,MaritalStatus,Gender,HireDate,SalariedFlag,VacationHours,SickLeaveHours
, CurrentFlag,rowguid,ModifiedDate
from Staging.EmployeesByMartialStatus
where Page=@page
GO
--------------------------------------------------------------------------
-- Usage
-- Load staging
exec dbo.EmployeesByMartialStatus 'M','NationalIDNumber';
-- Get pages 1 through n
exec dbo.EmployeesByMartialStatus_GetPage 1;
exec dbo.EmployeesByMartialStatus_GetPage 2;
-- ...etc (this would actually be a foreach loop, but that detail is omitted for brevity)
GO
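When computing a page count from a row count, keep in mind that T-SQL divides two integers as integers, so the fraction is gone before CEILING ever sees it. The sketch below contrasts three ways of computing the page count for 45 rows at 20 rows per page; the variable names and values are chosen purely for illustration.

```sql
DECLARE @rowCount int = 45, @pageSize int = 20;

SELECT IntegerDivision = CEILING(@rowCount / @pageSize),            -- 45/20 is already 2, so this yields 2
       DecimalDivision = CEILING(@rowCount / (@pageSize * 1.0)),    -- 2.25 rounds up to 3
       IntegerOnly     = (@rowCount + @pageSize - 1) / @pageSize;   -- 64/20 = 3, no decimals needed
```

Passing too small a page count to NTILE() silently makes every page larger than intended, so either of the last two forms is the safer choice.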
I use this method of using EXEC():
-- SP parameters:
-- @query: Your query as an input parameter
-- @maximumRows: As number of rows per page
-- @startPageIndex: As number of page to filter
-- @sortBy: As a field name or field names with supporting DESC keyword
DECLARE @query nvarchar(max) = 'SELECT * FROM sys.Objects',
@maximumRows int = 8,
@startPageIndex int = 3,
@sortBy as nvarchar(100) = 'name Desc'
SET @query = ';WITH CTE AS (' + @query + ')' +
'SELECT *, (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingPageNo' +
', pagingCountRow / ' + CAST(@maximumRows as nvarchar(10)) + ' As pagingCountPage ' +
', (dt.pagingRowNo - 1) % ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingRowInPage ' +
'FROM ( SELECT *, ROW_NUMBER() OVER (ORDER BY ' + @sortBy + ') As pagingRowNo, COUNT(*) OVER () AS pagingCountRow ' +
'FROM CTE) dt ' +
'WHERE (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 = ' + CAST(@startPageIndex as nvarchar(10))
EXEC(@query)
The result set contains the query's own columns followed by some extra paging columns.
Note:
I add these extra columns, which you can remove if you don't need them:
pagingRowNo : The row number
pagingCountRow : The total number of rows
pagingPageNo : The current page number
pagingCountPage : The total number of pages
pagingRowInPage : The row number that started with 1 in this page
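The same paging query can also be run through sp_executesql, so that the numeric values travel as real parameters and only the sort column is spliced into the text. The sketch below is one possible shape, not a drop-in replacement. The @sortColumn and @sortDescending names and the whitelist of column names are assumptions, and sys.objects is used only because it exists in every database.

```sql
DECLARE @sortColumn     sysname = N'name',
        @sortDescending bit     = 1,
        @maximumRows    int     = 8,
        @startPageIndex int     = 3;

-- Accept only column names known to exist; anything else falls back to a default.
IF @sortColumn NOT IN (N'name', N'object_id', N'create_date')
    SET @sortColumn = N'name';

DECLARE @sql nvarchar(max) = N'
WITH numbered AS (
    SELECT name, object_id, create_date,
           pagingRowNo    = ROW_NUMBER() OVER (ORDER BY ' + QUOTENAME(@sortColumn)
               + CASE WHEN @sortDescending = 1 THEN N' DESC' ELSE N' ASC' END + N'),
           pagingCountRow = COUNT(*) OVER ()
    FROM sys.objects
)
SELECT *
FROM numbered
WHERE (pagingRowNo - 1) / @maximumRows + 1 = @startPageIndex
ORDER BY pagingRowNo;';

-- Page size and page number are passed as parameters, not concatenated.
EXEC sp_executesql @sql,
     N'@maximumRows int, @startPageIndex int',
     @maximumRows    = @maximumRows,
     @startPageIndex = @startPageIndex;
```

Validating the column name against a fixed list and wrapping it in QUOTENAME() keeps arbitrary text out of the ORDER BY clause, which plain concatenation of a sort string cannot guarantee.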