I have a stored procedure that can get data from two different sources, depending on whether the user requests data from a single closed period (archived into a data warehouse table) or from an open period (data from transaction tables).
If I pass parameters that limit the select to the data warehouse table (a year and period for a closed period), the procedure takes a very long time to return results unless I comment out the ELSE BEGIN… code. No data comes from the ELSE branch, but it still slows the procedure down. With the ELSE branch commented out, it is very fast.
I have tried OPTION (RECOMPILE) and I’m using local variables to avoid parameter sniffing but it’s not helping. Is there any way to get around this?
The following is an example of what I’m doing that runs slow:
IF @Year <> 0 AND @Period <> 0 AND (SELECT PerClosedTimestamp
    FROM Period
    WHERE
        PerCompanyID = @CompanyID AND
        PerYear = @Year AND
        PerPeriod = @Period) IS NOT NULL
BEGIN
    SELECT
        datawhse.column1, datawhse.column2, etc …
    FROM
        datawhse
END
ELSE
BEGIN
    SELECT
        trantable.column1, trantable.column2, etc…
    FROM
        trantable
END
If I exclude the ELSE statement it runs very fast:
IF @Year <> 0
    AND @Period <> 0
    AND (SELECT PerClosedTimestamp
         FROM Period
         WHERE PerCompanyID = @CompanyID
           AND PerYear = @Year
           AND PerPeriod = @Period) IS NOT NULL
BEGIN
    SELECT datawhse.column1
        ,datawhse.column2, etc …
    FROM datawhse
END
Do @Year and @Period come directly from the stored procedure's input parameters? That is, is your sproc defined like this?
create proc USP_name @Year int, @Period int as
begin
    ...
end
You can try using local variables; in my experience, local variables help a lot in many cases like this.
create proc USP_name @Year int, @Period int as
begin
    declare @Year_local int, @Period_local int
    -- SET assigns one variable at a time, so use SELECT to copy both parameters
    select @Year_local = @Year, @Period_local = @Period
    if @Year_local <> 0 AND @Period_local <> 0 AND ...
    ....
end
As mentioned in the comments, the definitive answer to why it is slow is always to be found in the query plan.
At a guess, the appearance of trantable in the procedure is biasing the query optimizer in a way that disfavors datawhse. I'd be tempted to at least try UNION ALL instead of IF/ELSE, something along the lines of
SELECT
    datawhse.column1, datawhse.column2, etc …
FROM
    datawhse
WHERE @Year <> 0 AND @Period <> 0 AND (SELECT PerClosedTimestamp
    FROM Period
    WHERE
        PerCompanyID = @CompanyID AND
        PerYear = @Year AND
        PerPeriod = @Period) IS NOT NULL
UNION ALL
SELECT
    trantable.column1, trantable.column2, etc…
FROM
    trantable
WHERE @Year = 0 OR @Period = 0 OR (SELECT PerClosedTimestamp
    FROM Period
    WHERE
        PerCompanyID = @CompanyID AND
        PerYear = @Year AND
        PerPeriod = @Period) IS NULL
It would be interesting to see how the query plans compare.
Thanks everyone for your suggestions. I ended up creating 2 separate functions to return data from either the data warehouse table or the transaction tables. I select from the functions within the IF THEN ELSE statement and that seems to have solved my problem.
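A minimal sketch of what that final approach might look like, assuming inline table-valued functions. The function and procedure names (`dbo.fn_ClosedPeriodData`, `dbo.fn_OpenPeriodData`, `dbo.usp_GetPeriodData`) and the filter columns (`CompanyID`, `DataYear`, `DataPeriod`) are hypothetical, since the post does not show them; only `datawhse`, `trantable`, `Period` and the `Per…` columns come from the question.

```sql
-- Hypothetical names and filter columns; adjust to the real schema.
CREATE FUNCTION dbo.fn_ClosedPeriodData (@CompanyID int, @Year int, @Period int)
RETURNS TABLE
AS
RETURN
    SELECT d.column1, d.column2
    FROM datawhse AS d
    WHERE d.CompanyID = @CompanyID      -- assumed column
      AND d.DataYear = @Year            -- assumed column
      AND d.DataPeriod = @Period;       -- assumed column
GO

CREATE FUNCTION dbo.fn_OpenPeriodData (@CompanyID int)
RETURNS TABLE
AS
RETURN
    SELECT t.column1, t.column2
    FROM trantable AS t
    WHERE t.CompanyID = @CompanyID;     -- assumed column
GO

CREATE PROCEDURE dbo.usp_GetPeriodData @CompanyID int, @Year int, @Period int
AS
BEGIN
    -- EXISTS ... IS NOT NULL plays the role of the scalar subquery test
    -- in the question (assuming one Period row per company/year/period).
    IF @Year <> 0 AND @Period <> 0
       AND EXISTS (SELECT 1
                   FROM Period
                   WHERE PerCompanyID = @CompanyID
                     AND PerYear = @Year
                     AND PerPeriod = @Period
                     AND PerClosedTimestamp IS NOT NULL)
        SELECT column1, column2
        FROM dbo.fn_ClosedPeriodData(@CompanyID, @Year, @Period);
    ELSE
        SELECT column1, column2
        FROM dbo.fn_OpenPeriodData(@CompanyID);
END;
GO
```

Each CREATE statement sits in its own batch, separated by GO. Whether this layout actually speeds things up in a given database is something to confirm by comparing the query plans.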
I'm getting this error from the function:
CREATE FUNCTION getLavel(@id int ,@lavel char)
RETURNS date
BEGIN
    DECLARE @date date
    select @date = (select authorization_date from Authorized WHERE diver_number = @id and @lavel = level_name)
    return @date
END
GO
What can be the reason?
Thank you very much.
The function needs to be either the only function in the query window OR the only statement in the batch. If there are more statements in the query window, you can make it the only one "in the batch" by surrounding it with GO's.
e.g.
GO
CREATE FUNCTION getLavel(@id int ,@lavel char)
RETURNS date
BEGIN
    DECLARE @date date
    select @date = (select authorization_date from Authorized WHERE diver_number = @id and @lavel = level_name)
    return @date
END
GO
Turn this into an inline table valued function. This will perform better than the scalar function. Also, you should NOT use the default sizes for character datatypes. Do you know what the default length for a char is? Did you know that it can vary based on usage?
CREATE FUNCTION getLavel
(
    @id int
    , @lavel char --You need to define the length instead of the default length
)
RETURNS table
return
    select authorization_date
    from Authorized
    WHERE diver_number = @id
    and @lavel = level_name
GO
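To illustrate the point about default lengths, here is a small sketch. The sample value 'Advanced' and the argument values in the call are made up for illustration; `dbo.getLavel` assumes the function above was created in the `dbo` schema.

```sql
-- char with no length is 1 character in a variable or parameter declaration,
-- but 30 characters inside CAST/CONVERT.
DECLARE @lavel char = 'Advanced';

SELECT @lavel AS declared_value,                                 -- 'A' (silently truncated)
       DATALENGTH(@lavel) AS declared_len,                       -- 1
       DATALENGTH(CAST('Advanced' AS char)) AS cast_len;         -- 30
GO

-- An inline table-valued function is queried like a table.
SELECT authorization_date
FROM dbo.getLavel(42, 'A');
```

With a bare `char` parameter, any level name longer than one character would be truncated before the comparison against `level_name`, so declaring an explicit length is the safer choice.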
You need to add RETURN before the END statement
That should fix your issue, that's what fixed mine. :D
Make sure that this statement is the only SQL in your query window before you execute it.
Or you can highlight the function declaration and execute just that.
What solved it for me, was that I was trying to create the function inside of a transaction context - that doesn't make sense from a SQL Server point of view. Transactions are for data, not functions.
Take the CREATE FUNCTION statement out of the transaction, then wrap it in GO's
CREATE FUNCTION CalculateAge(@DOB DATE)
RETURNS INT
AS
BEGIN
    DECLARE @Age INT
    -- Years elapsed, minus one if this year's birthday has not happened yet
    SET @Age = DATEDIFF(YEAR, @DOB, GETDATE()) -
        CASE
            WHEN (MONTH(@DOB) > MONTH(GETDATE())) OR
                 (MONTH(@DOB) = MONTH(GETDATE()) AND DAY(@DOB) > DAY(GETDATE()))
            THEN 1
            ELSE 0
        END
    RETURN @Age -- a scalar function must end with RETURN
END
The error is only shown in the query editor; if you execute the query, it runs successfully.
CREATE FUNCTION getLavel(@id int ,@lavel char)
RETURNS date
BEGIN
    DECLARE @date date
    select @date = (select authorization_date from Authorized WHERE diver_number = @id and @lavel = level_name)
    return @date
END
GO
I have a query:
DECLARE @date date = '2017-09-13'
DECLARE @pos int = 111222
DECLARE @UserNo int = 122425
DECLARE @sameDist bit = 0
DECLARE @brandId int = NULL
DECLARE @sameArea bit = 0
SELECT
    PosId, SUM(NetSales) as SumSales,
    ROW_NUMBER() OVER(ORDER BY SUM(NetSales) DESC) as RowID
FROM
    dbo.t_Sales_Daily
INNER JOIN
    dbo.t_Pos ON dbo.t_Sales_Daily.PosId = dbo.t_Pos.Id
INNER JOIN
    t_User ON dbo.t_Sales_Daily.RetailerNo = t_User.UserNo
WHERE
    (CONVERT(DATE, SalesDate) = @date
    AND (@brandId IS NULL OR BrandId = @brandId)
    AND (@sameDist = 0 OR dbo.t_Pos.DistType = (SELECT TOP 1 DistType
                                                FROM t_Pos WHERE Id = @pos))
    AND (@sameArea = 0 OR t_User.RegionName = (SELECT top 1 RegionName
                                               FROM t_User WHERE UserNo = @userNo)))
GROUP BY
    PosId
When I run this query with declared parameters I get about 2000 rows, but when I put the same query into a stored procedure and run the stored procedure I get only about 200 rows.
The query is exactly the same and I triple-checked that the query is the same and that the same parameters are passed through.
Here is how I execute the stored procedure:
exec dbo.GetPosRankDaily @userNo, @date, @pos, NULL, NULL, NULL
As I said, the stored procedure does work but just returns fewer results (I used set nocount on); all results returned from the stored procedure are contained in the query's results.
OK, well, I found the solution one second after posting the question...
Though in the stored proc I declared 2 bit parameters with default values of 0, I passed NULL as the argument instead of 0, and it somehow yielded fewer results, which is strange because it shouldn't have returned anything.
So basically running the SP like this solves the issue:
exec dbo.GetPosRankDaily @userNo, @date, @pos, NULL, 0, 0
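A parameter's default value is only applied when the argument is omitted (or passed as DEFAULT); an explicit NULL overrides it. With a NULL flag, the test `@sameDist = 0` evaluates to UNKNOWN rather than TRUE, so the `OR` falls through to the `DistType = (...)` comparison and that filter is applied. A small sketch of the behaviour, plus one possible guard using ISNULL (the guard is a suggestion, not part of the original procedure):

```sql
DECLARE @sameDist bit = NULL;   -- what the procedure saw when NULL was passed

SELECT
    CASE WHEN @sameDist = 0 THEN 'filter skipped'
         ELSE 'filter applied' END AS with_null,        -- 'filter applied'
    CASE WHEN ISNULL(@sameDist, 0) = 0 THEN 'filter skipped'
         ELSE 'filter applied' END AS with_isnull;      -- 'filter skipped'
```

Inside the procedure the same guard would read `(ISNULL(@sameDist, 0) = 0 OR ...)`. Alternatively, the caller can pass the DEFAULT keyword in place of NULL for those two arguments so that the declared defaults are used.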
I would like to change the WHERE clause of a query based upon the value of an input parameter. I'm guessing this is possible, but it seems that I'm approaching it in the wrong way.
A simplified version of my SP query:
CREATE PROCEDURE [dbo].[GetMailboxMessagesByStatus]
    @UserId UNIQUEIDENTIFIER,
    @MessageStatus INT
AS
BEGIN
    SELECT *
    FROM MailboxMessages m
    WHERE
        CASE @MessageStatus
            WHEN 4 THEN m.SenderId = @UserId --Sent
            ELSE m.RecipientId = @UserId --Inbox
        END
END
GO
No need for case or iif constructs:
WHERE @MessageStatus = 4 AND m.SenderId = @UserId
OR @MessageStatus <> 4 AND m.RecipientId = @UserId
EDIT:
Be careful with this construct when the table being queried is quite large. Using 2 separate queries with an IF statement, as Chester Lim suggested, might be the better solution. Preventing parameter sniffing might also be a good idea.
Use an IF statement.
CREATE PROCEDURE [dbo].[GetMailboxMessagesByStatus]
    @UserId UNIQUEIDENTIFIER ,
    @MessageStatus INT
AS
BEGIN
    IF ( @MessageStatus = 4 )
    BEGIN
        SELECT *
        FROM MailboxMessages m
        WHERE m.SenderId = @UserId; --Sent
    END;
    ELSE
    BEGIN
        SELECT *
        FROM MailboxMessages m
        WHERE m.RecipientId = @UserId; --Inbox
    END;
END;
GO
EDIT - a much better way, provided by LukStorms (I did not know IIF until I saw his answer):
CREATE PROCEDURE [dbo].[GetMailboxMessagesByStatus]
    @UserId UNIQUEIDENTIFIER ,
    @MessageStatus INT
AS
BEGIN
    SELECT *
    FROM MailboxMessages m
    WHERE IIF (@MessageStatus = 4, m.SenderId, m.RecipientId) = @UserId; --Sent when status is 4, otherwise Inbox
END
GO
You could change that WHERE clause to
WHERE (CASE WHEN @MessageStatus = 4 THEN m.SenderId ELSE m.RecipientId END) = @UserId
Because what you put after the THEN in a CASE should just be a value, not a comparison.
Or use IIF instead of a CASE:
WHERE IIF(@MessageStatus = 4,m.SenderId,m.RecipientId) = @UserId
But the SQL will run more efficiently if you use an IF ... ELSE ... and run a different query based on @MessageStatus.
Was writing an example for that, but Chester Lim already beat me to it. ;)
(so no need to repeat that approach in this answer)
I'm going through stored procedures to make them sargable and I noticed something unexpected about how the index was used.
There's a non-clustered index on DateColumn, and a clustered index on the table (not directly referenced in the query).
The following uses an index seek on the non-clustered index that has DateColumn as an index column:
DECLARE @timestamp as datetime
SET @timestamp = '2014-01-01'
SELECT column1, column2 FROM Table WHERE DateColumn > @timestamp
However, the following uses an index scan:
DECLARE @timestamp as datetime
DECLARE @flag as bit
SET @timestamp = '2014-01-01'
SET @flag = 0
SELECT column1, column2 FROM Table WHERE (DateColumn > @timestamp) OR (@flag = 1)
I put the brackets in just in case, but of course it made no difference.
Because the @flag = 1 has nothing to do with the table, I was expecting a seek in both cases. Out of interest, if I change it to 0 = 1 it uses an index seek again. The @flag value is a parameter of the procedure that tells the query to return all records, so it is not something I can hard-code in reality.
Is there a way to make this use a seek instead of a scan? The only option I can think of is the following, however in reality the queries are much more complex, so duplication like this hurts readability and maintainability:
DECLARE @timestamp as datetime
DECLARE @flag as bit
SET @timestamp = '2014-01-01'
SET @flag = 0
IF @flag = 1
BEGIN
    SELECT column1, column2 FROM Table
END
ELSE
BEGIN
    SELECT column1, column2 FROM Table WHERE DateColumn > @timestamp
END
Try with dynamic SQL like this.
DECLARE @flag BIT,
        @query NVARCHAR(500)
SET @flag=0
SET @query='
SELECT <columnlist>
FROM <tablename>
WHERE columnname = value
or 1=' + CONVERT(NVARCHAR(1), @flag)
EXEC Sp_executesql
    @query
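A variation on the same idea, sketched under assumptions: instead of concatenating the flag into the text, the WHERE clause is only appended when it is needed, and the date is passed to sp_executesql as a real parameter. `dbo.SomeTable` and the column names are placeholders standing in for the question's table.

```sql
DECLARE @timestamp datetime = '2014-01-01';
DECLARE @flag bit = 0;
DECLARE @query nvarchar(500);

-- dbo.SomeTable is a placeholder name, as are the column names.
SET @query = N'SELECT column1, column2 FROM dbo.SomeTable';

-- Only add the filter when the caller did not ask for every row.
IF @flag = 0
    SET @query = @query + N' WHERE DateColumn > @ts';

-- @ts is passed as a parameter, so no value is concatenated into the SQL text.
EXEC sys.sp_executesql @query, N'@ts datetime', @ts = @timestamp;
```

This produces two distinct statement texts, one with the filter and one without, so each can get its own cached plan.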
Your dynamic solution is actually better, because you won't get caught out when @flag=1 is passed in the first time and the plan compiled for that call is then reused for all subsequent calls. As @RaduGheorghiu says, a scan is better than a seek in these cases.
If it were me I would have 2 procedures, one for "get everything" and one for "get for date". Two procedures, two usages, two query plans. If the repetition bothers you, you can introduce a view.
Just for completeness I'm going to post an option that I just realised works in my specific situation. This is probably what I'm going to use due to simplicity, however this probably won't work for 99.9% of cases, so I don't consider it a better answer than the dynamic SQL.
declare @flag as int
declare @date as datetime
set @flag = 1
set @date = '2015-08-11 09:12:08.553'
set @date = (select case @flag when 1 then '1753-01-01' else @date end)
select Column1, Column2
from Table_1
where DateColumn > @date
The above works because the DateColumn stores the modified date for the record (I'm returning deltas). It has a default value of getdate() and is set to getdate() on updates. This means in this specific case I know that for a value of @date = '1753-01-01' all records will be returned.
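Another option sometimes suggested for this kind of catch-all predicate is OPTION (RECOMPILE). With it, the optimizer can treat the variable values as known when it compiles the statement, so `@flag = 1` may be simplified away much like the literal `0 = 1` mentioned in the question. This is a sketch to verify against the actual query plan on your SQL Server version, and it costs a compilation on every execution. `dbo.SomeTable` is a placeholder name.

```sql
DECLARE @timestamp datetime = '2014-01-01';
DECLARE @flag bit = 0;

SELECT column1, column2
FROM dbo.SomeTable                       -- placeholder table name
WHERE (DateColumn > @timestamp) OR (@flag = 1)
OPTION (RECOMPILE);  -- compile a fresh plan for the current variable values
```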
I have a scalar function in my code that calls another scalar function, which queries 2 other tables as detailed below. I know this must be performing like a pig. It is used throughout the database... My problem is that rewriting it as a table-valued function is a little outside my development skills.
I'm attempting to win some of the developers over to rewriting the function, but we only have Java guys and no dedicated SQL developer, so they don't see any problems. Can anyone suggest how this should be rewritten? Many thanks...
CREATE FUNCTION [dbo].[getInvertCurrencyExchangeRateByDate](@casino_id char(16),@currency_code char(3), @end_date datetime)
RETURNS float AS
BEGIN
    declare @retval float;
    set @retval =
        dbo.getCurrencyExchangeRateByDate(@casino_id,@currency_code,@end_date);
    if (@retval != 0) return 1/@retval;
    return 0;
END
CREATE FUNCTION [dbo].[getCurrencyExchangeRateByDate](@casino_id char(16),@currency_code char(3), @end_date datetime)
RETURNS float AS
BEGIN
    declare @retval float;
    declare @casino_curr_code char(3);
    set @casino_curr_code =
        (SELECT TOP 1 currency_code
         FROM Casino
         WHERE
            casino_id=@casino_id
        );
    if (@currency_code = @casino_curr_code) return 1;
    set @retval =
        COALESCE(
        (
            SELECT TOP 1 exchange_rate
            FROM CurrencyExchangeRateHistory
            WHERE
                casino_id=@casino_id and
                currency_code=@currency_code AND
                transact_time <= @end_date
            ORDER BY
                transact_time DESC
        ),0.0);
    return @retval
END
I'm sorry, but that's a heck of a lot of code for something rather simple.
I think this satisfies the query's needs.
CREATE FUNCTION dbo.TVF(@casino_id char(16), @currency_code char(3), @end_date datetime)
RETURNS TABLE
AS
RETURN -- If the join finds no row or the rate is 0, the division never happens and ISNULL returns 0
SELECT TOP 1
    CASE WHEN A.currency_code = @currency_code THEN 1
         ELSE ISNULL(1 / NULLIF(CAST(B.exchange_rate AS float), 0), 0) -- float, like the scalar version
    END AS RETVAL
FROM Casino A
LEFT JOIN CurrencyExchangeRateHistory B
    ON A.casino_id = B.casino_id
    AND B.transact_time <= @end_date
    AND B.currency_code = @currency_code -- history is looked up for the requested currency, as in the scalar version
WHERE A.casino_id = @casino_id
ORDER BY B.transact_time DESC
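A sketch of how such an inline function might be called per row, assuming a hypothetical `dbo.Transactions` table with `transaction_id`, `amount`, `casino_id`, `currency_code` and `transact_time` columns (none of which appear in the original post):

```sql
-- dbo.Transactions and its columns are hypothetical; dbo.TVF is the function above.
SELECT t.transaction_id,
       t.amount,
       t.amount * ISNULL(r.RETVAL, 0) AS converted_amount  -- 0 when the function returns no row
FROM dbo.Transactions AS t
OUTER APPLY dbo.TVF(t.casino_id, t.currency_code, t.transact_time) AS r;
```

OUTER APPLY keeps transactions whose casino_id has no match in Casino, in which case the function returns no row and RETVAL is NULL. CROSS APPLY would drop those rows instead.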