tsql - optimize case statement in select clause - sql-server

This snippet in a basic select against one table increases my query time from 3 seconds to 8. Any ideas?
case
when @excludeprojtag = 1
then qtyruletag
else 0
end as qtyruletag
EDIT:
Here is the whole query:
select case when @excludeprojtag = 1 then qtyruletag else 0 end as qtyruletag,listid,quantity
from tb_sales where date between '1/1/2015' and '1/1/2016'
AND CASE WHEN @excludeTestOrders = 1 THEN AccountID ELSE 123 END <> 1234

USE AdventureWorks -- 2012 version
GO
DECLARE @includezip INT = 0
SELECT @includezip = DATEPART(ss,GETDATE()) % 2
SET STATISTICS TIME ON;
SELECT pa.AddressLine1, pa.City,
CASE -- CASE method
WHEN @includezip = 1
THEN pa.PostalCode
ELSE ''
END AS ZipCode
FROM Person.BusinessEntity pbe
LEFT JOIN Person.BusinessEntityAddress pbea
ON pbe.BusinessEntityID = pbea.BusinessEntityID
LEFT JOIN Person.Address pa
ON pbea.AddressID = pa.AddressID
SELECT pa.AddressLine1, pa.City,
SUBSTRING(pa.PostalCode,1,LEN(pa.PostalCode) * @includezip) AS ZipCode -- No CASE used
FROM Person.BusinessEntity pbe
LEFT JOIN Person.BusinessEntityAddress pbea
ON pbe.BusinessEntityID = pbea.BusinessEntityID
LEFT JOIN Person.Address pa
ON pbea.AddressID = pa.AddressID
GO
SET STATISTICS TIME OFF;
GO
(20812 row(s) affected)
SQL Server Execution Times:
CPU time = 15 ms, elapsed time = 102 ms.
(20812 row(s) affected)
SQL Server Execution Times:
CPU time = 16 ms, elapsed time = 140 ms.
I've run this a bunch of times. Seems to me like the second one is always slower. When I first created this, I assumed that zip would be numeric, but I forgot about Canada, etc. So if for some reason you are averse to CASE statements out of general principle, you can use this method when you have a binary parameter. Here I've just set it to be more or less random.

I think the WHERE clause can be improved:
select case when @excludeprojtag = 1 then qtyruletag else 0 end as qtyruletag
, listid, quantity
from tb_sales
where date between '1/1/2015' and '1/1/2016'
AND ( @excludeTestOrders <> 1 OR AccountID <> 1234 )
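As a quick sanity check, the following sketch compares the CASE predicate from the question with the OR predicate above on a small table variable. The table variable @sales and its three rows are made up for illustration; tb_sales itself is not shown in the question. The two forms agree when @excludeTestOrders is 0 or 1, but they can differ when the variable is NULL, so this assumes it is always set.

```sql
-- Hypothetical stand-in for tb_sales; only the columns needed for the filter.
DECLARE @sales TABLE (listid INT, AccountID INT);
INSERT INTO @sales (listid, AccountID) VALUES (1, 1234), (2, 42), (3, 99);

DECLARE @excludeTestOrders INT = 1;

-- CASE form from the question: with the flag set to 1, account 1234 is filtered out.
SELECT listid
FROM @sales
WHERE CASE WHEN @excludeTestOrders = 1 THEN AccountID ELSE 123 END <> 1234;
-- returns listid 2 and 3

-- OR form: same rows for a non-NULL flag, without wrapping the column in CASE.
SELECT listid
FROM @sales
WHERE (@excludeTestOrders <> 1 OR AccountID <> 1234);
-- returns listid 2 and 3
```

Setting @excludeTestOrders to 0 makes both queries return all three rows.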

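Split the two conditions: test the variable once up front, and leave only the column condition in the WHERE clause. (AccountID is a column, so it cannot be referenced in an IF outside the query.)
IF @excludeTestOrders = 1
select case when @excludeprojtag = 1 then qtyruletag else 0 end as qtyruletag,listid,quantity
from tb_sales
where date between '1/1/2015' and '1/1/2016'
AND AccountID <> 1234
ELSE
select case when @excludeprojtag = 1 then qtyruletag else 0 end as qtyruletag,listid,quantity
from tb_sales
where date between '1/1/2015' and '1/1/2016'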

Related

SQL - Finding Gaps in Coverage

I am running this on SQL Server.
Here is my problem. I have something like this:
Dataset A
FK_ID  StartDate  EndDate     Type
1      10/1/2018  11/30/2018  M
1      12/1/2018  2/28/2019   N
1      3/1/2019   10/31/2019  M
I have a second data source I have no control over with data something like this:
Dataset B
FK_ID  SpanStart  SpanEnd     Type
1      10/1/2018  10/15/2018  M
1      10/1/2018  10/25/2018  M
1      2/15/2019  4/30/2019   M
1      5/1/2019   10/31/2019  M
What I am trying to accomplish is to check to make sure every date within each TYPE M record in Dataset A has at least 1 record in Dataset B.
For example record 1 in Dataset A does NOT have coverage from 10/26/2018 through 11/30/2018. I really only care about when the coverage ends, in this case I want to return 10/26/2018 because it is the first date where the span has no coverage from Dataset B.
I've written a function that does this but it is pretty slow because it is cycling through each date within each M record and counting the number of records in Dataset B. It exits the loop when it finds the first one but I would really like to make this more efficient. I am sure I am not thinking about this properly so any suggestions anyone can offer would be helpful.
This is the section of code I'm currently running
else if @SpanType = 'M'
begin
    set @CurrDate = @SpanStart
    set @UncovDays = 0
    while @CurrDate <= @SpanEnd
    begin
        if (SELECT count(*)
            FROM eligiblecoverage ec join eligibilityplan ep on ec.plandescription = ep.planname
            WHERE ec.masterindividualid = @IndID
            and ec.planbegindate <= @CurrDate and ec.planenddate >= @CurrDate
            and ec.sourcecreateddate = @MaxDate
            and ep.medicaidcoverage = 1) = 0
        begin
            SET @Result = concat('NON Starting ',format(@currdate, 'M/d/yyyy'))
            BREAK
        end
        set @CurrDate = @CurrDate + 1
    end
end
I am not married to using a function; I just could not find a way to do this in queries that wasn't very, very slow.
EDIT: Dataset B will never have any TYPEs except M so that is not a consideration
EDIT 2: The code offered by DonPablo does de-overlap the data but only in cases where there is an overlap at all. It reduces dataset B to:
FK_ID  SpanStart  SpanEnd     Type
1      10/1/2018  10/25/2018  M
instead of
FK_ID  SpanStart  SpanEnd     Type
1      10/1/2018  10/25/2018  M
1      2/15/2019  4/30/2019   M
1      5/1/2019   10/31/2019  M
I am still futzing around with it but it's a start.
I would approach this by focusing on B. My assumption is that any absent record would follow span_end in the table. So here is the idea:
Unpivot the dates in B (adding "1" to the end dates)
Add a flag if they are present with type "M".
Check to see if any not-present records are in the span for A.
Check the first and last dates as well.
So, this looks like:
with bdates as (
select v.dte,
(case when exists (select 1
from b b2
where v.dte between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1 else 0
end) as in_b
from b cross apply
(values (spanstart), (dateadd(day, 1, spanend))
) v(dte)
where b.type = 'M' -- all we care about
group by v.dte -- no need for duplicates
)
select a.*,
(case when not exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 0
when not exists (select 1
from b b2
where a.enddate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 0
when exists (select 1
from bdates bd
where bd.dte between a.startdate and a.enddate and
bd.in_b = 0
)
then 0
when exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1
else 0
end)
from a;
What is this doing? Four validity checks:
Is the starttime valid?
Is the endtime valid?
Are any intermediate dates invalid?
Is there at least one valid record?
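The same idea (only the A start date and the day after each B span end can be the first uncovered date) can be sketched so that it returns that date directly, which is what the question asks for. This is a minimal sketch under assumptions: the table variables @a and @b stand in for Dataset A and Dataset B with the sample rows from the question, the column name FirstUncoveredDate is made up, and B is assumed to contain only type M rows, as the question's first edit states.

```sql
-- Stand-ins for Dataset A and Dataset B, loaded with the question's sample rows.
DECLARE @a TABLE (FK_ID INT, StartDate DATE, EndDate DATE, [Type] CHAR(1));
DECLARE @b TABLE (FK_ID INT, SpanStart DATE, SpanEnd DATE, [Type] CHAR(1));

INSERT INTO @a VALUES
    (1, '2018-10-01', '2018-11-30', 'M'),
    (1, '2018-12-01', '2019-02-28', 'N'),
    (1, '2019-03-01', '2019-10-31', 'M');
INSERT INTO @b VALUES
    (1, '2018-10-01', '2018-10-15', 'M'),
    (1, '2018-10-01', '2018-10-25', 'M'),
    (1, '2019-02-15', '2019-04-30', 'M'),
    (1, '2019-05-01', '2019-10-31', 'M');

SELECT a.FK_ID, a.StartDate, a.EndDate, gap.FirstUncoveredDate
FROM @a AS a
OUTER APPLY (
    -- Earliest candidate date that no B span covers; NULL when A is fully covered.
    SELECT MIN(c.dte) AS FirstUncoveredDate
    FROM (
        -- Candidate 1: the first day of the A span.
        SELECT a.StartDate AS dte
        UNION ALL
        -- Candidate 2: the day after each B span that ends inside the A span.
        SELECT DATEADD(DAY, 1, b.SpanEnd)
        FROM @b AS b
        WHERE b.FK_ID = a.FK_ID
          AND b.SpanEnd >= a.StartDate
          AND b.SpanEnd < a.EndDate
    ) AS c
    WHERE NOT EXISTS (SELECT 1
                      FROM @b AS b2
                      WHERE b2.FK_ID = a.FK_ID
                        AND c.dte BETWEEN b2.SpanStart AND b2.SpanEnd)
) AS gap
WHERE a.[Type] = 'M';
```

With the sample data, the first M row yields 2018-10-26 and the second M row yields NULL, because the spans 2/15/2019 to 4/30/2019 and 5/1/2019 to 10/31/2019 cover it completely.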
Start by framing the problem in smaller pieces, as a sequence of actions, like I did in the comment.
See George Polya, "How To Solve It" (1945).
Then Google is your friend: search for "sql de-overlap date ranges into one record" (over a million results).
UPDATED: I picked "Merge overlapping dates in SQL Server"
and updated it for our table and column names.
Also look at the theory in Allen's Interval Algebra (1983): https://www.ics.uci.edu/~alspaugh/cls/shr/allen.html
Or, from 2014: https://stewashton.wordpress.com/2014/03/11/sql-for-date-ranges-gaps-and-overlaps/
This is a primer on how to set up test data for this problem.
Finally, determine what counts by ranking the various pairs of A vs. B:
bypass those totally within, then work with the earliest partial overlaps, and lastly handle the precede/follow items.
--from Merge overlapping dates in SQL Server
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
)
Select * from DeOverlapped_B
Now we have something to feed into the next steps, and we can use the above as a CTE
======================================
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
),
-- find A row's coverage
ACoverage as (
Select
a.*, b.SpanEnd, b.SpanStart,
Case
When SpanStart <= StartDate And StartDate <= SpanEnd
And SpanStart <= EndDate And EndDate <= SpanEnd
Then '1within' -- starts, equals, during, finishes
When EndDate < SpanStart
Or SpanEnd < StartDate
Then '3beforeAfter' -- precedes, meets, preceded, met
Else '2overlap' -- one or two ends hang over spanStart/End
End as relation
From Coverage_A a
Left Join DeOverlapped_B b
On a.FK_ID = b.FK_ID
Where a.Type = 'M'
)
Select
*
,Case
When relation1 = '2' And StartDate < SpanStart Then StartDate
When relation1 = '2' Then DateAdd(d, 1, SpanEnd)
When relation1 = '3' Then StartDate
End as UnCoveredBeginning
From (
Select
*
,SUBSTRING(relation,1,1) as relation1
,ROW_NUMBER() Over (Partition by A_ID Order by relation, SpanStart) as Rownum
from ACoverage
) aRNO
Where Rownum = 1
And relation1 <> '1'

SQL Server : select when record is 1 then return line else return the 0 record

I need a little help with a query.
I have written a script that brings back an order number and the number of containers needed (code below):
SELECT
CONI.CONTNO,
CONI.ITEMNO,
CONI.[WEIGHT],
CONI.QTY,
STOK.PGROUP,
CASE WHEN CPRO.TNTCOL = 1 THEN 1
WHEN CPRO.TNTCOL = 0 THEN 0
WHEN CPRO.TNTCOL IS NULL THEN 0 END AS [TNT],
CONI.RECID,
CPRO.RECKEY
INTO
#SUB
FROM
ContItems CONI
LEFT JOIN
ContractItemProfiles CPRO ON CONI.RECID = CPRO.RECKEY
JOIN
Stock STOK ON CONI.ITEMNO = STOK.ITEMNO
WHERE
STOK.PGROUP LIKE 'FLI%'
SELECT
#SUB.CONTNO,
#SUB.TNT,
SUM(#SUB.QTY) AS [Number of flight cases]
FROM
#SUB
WHERE
#SUB.CONTNO = '123/321581'
GROUP BY
#SUB.CONTNO,
#SUB.TNT
DROP TABLE #SUB
I get this result:
Contno TNT Number of flight cases
------------------------------------------
123/321581 0 20.00
123/321581 1 1.00
I need to bring back only the line with TNT = 1 when one exists; if there is no 1 in the TNT column, bring back the record with 0 instead.
I hope this explains it well enough.
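That CASE can be replaced with
isnull(CPRO.TNTCOL, 0)
To return a single row, prefer TNT = 1 by sorting in descending order and taking the first row:
select top 1 *
from ( SELECT #SUB.CONTNO,
#SUB.TNT,
SUM(#SUB.QTY) AS [Number of flight cases]
FROM #SUB
WHERE #SUB.CONTNO = '123/321581'
GROUP BY #SUB.CONTNO, #SUB.TNT
) t
order by TNT desc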
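TOP 1 only works for a single order number. If the same rule is needed for several order numbers at once, one possible sketch numbers the grouped rows per CONTNO and keeps the first. The table variable @sub stands in for #SUB, and the second order number '123/999999' is invented for illustration.

```sql
-- Hypothetical stand-in for #SUB with the two result rows from the question
-- plus one made-up order that has no TNT = 1 line.
DECLARE @sub TABLE (CONTNO VARCHAR(20), TNT INT, QTY DECIMAL(9, 2));
INSERT INTO @sub VALUES
    ('123/321581', 0, 20.00),
    ('123/321581', 1, 1.00),
    ('123/999999', 0, 5.00);

WITH totals AS (
    SELECT CONTNO,
           TNT,
           SUM(QTY) AS [Number of flight cases],
           -- Within each order, the TNT = 1 group sorts first when it exists.
           ROW_NUMBER() OVER (PARTITION BY CONTNO ORDER BY TNT DESC) AS rn
    FROM @sub
    GROUP BY CONTNO, TNT
)
SELECT CONTNO, TNT, [Number of flight cases]
FROM totals
WHERE rn = 1;
-- 123/321581 returns its TNT = 1 row; 123/999999 falls back to its TNT = 0 row.
```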

Second level lookup with SQL statement

How do I write a SQL statement that does a second-level lookup only if the first lookup finds no match? For example:
In the query below, if the SEDOLCode condition does not return a record, the lookup should proceed to the second condition on RICCode.
select
*, GETDATE()
from
Securities sec
where
sec.SEDOLCode = 'ABCDEF'
or sec.RICCode = '002815.SZ'
This query is returning two different records - for example:
1234 ABCDEF DUMY906.X
5675 EFTFS 002815.SZ
I am taking data from a file to update the Pricetable as below. I want to use SedolCode as primary lookup.
IF @@ROWCOUNT = 0
INSERT INTO dbo.Price (sec.SecurityID, ClosingPrice, UpdatedDate, UpdatedByUser, Priced)
SELECT
..., GETDATE()
FROM
Securities sec
WHERE
sec.SEDOLCode = @SedolCode
OR sec.RICCode = @RicCode
Try this. The logic is that if the sedolcode is found, only the first condition can match. Otherwise the count for that sedolcode is 0 and the query falls back to the riccode.
select
*, GETDATE()
from
Securities sec
where
sec.sedolcode = 'ABCDEF'
OR ((SELECT COUNT(1) FROM Securities WHERE sedolcode ='ABCDEF') = 0 AND sec.riccode = '002815.SZ')
Ah - reminds me of my FTSE days........
Match on Sedol and not Ric,
or
on Ric and not Sedol, then use myOrdering and TOP to get the first.
INSERT INTO dbo.Price
(
sec.SecurityID
, ClosingPrice
, UpdatedDate
, UpdatedByUser
, Priced
)
SELECT TOP 1 [specify fields to insert]
FROM
(
select 1 as myOrdering ...
, GETDATE()
from Securities sec
WHERE
(sec.SEDOLCode = @SedolCode AND sec.RICCode != @RicCode)
UNION
select 2 as myOrdering ...
, GETDATE()
from Securities sec
WHERE
(sec.RICCode = @RicCode AND sec.SEDOLCode != @SedolCode)
)SUB_Q ORDER BY myOrdering
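Another way to express "SEDOL first, RIC as fallback" is to select both candidates and let ORDER BY put the SEDOL match first. This is only a sketch: the table variable @Securities holds the two sample rows from the question, and it assumes each code matches at most one security.

```sql
-- Stand-in for the Securities table with the two rows shown in the question.
DECLARE @Securities TABLE (SecurityID INT, SEDOLCode VARCHAR(10), RICCode VARCHAR(20));
INSERT INTO @Securities VALUES
    (1234, 'ABCDEF', 'DUMY906.X'),
    (5675, 'EFTFS',  '002815.SZ');

DECLARE @SedolCode VARCHAR(10) = 'ABCDEF',
        @RicCode   VARCHAR(20) = '002815.SZ';

-- Both rows match the WHERE clause; the SEDOL match sorts first and TOP (1) keeps it.
SELECT TOP (1) SecurityID, SEDOLCode, RICCode
FROM @Securities
WHERE SEDOLCode = @SedolCode
   OR RICCode = @RicCode
ORDER BY CASE WHEN SEDOLCode = @SedolCode THEN 0 ELSE 1 END;
-- returns SecurityID 1234
```

If @SedolCode is changed to a value that matches nothing, the same query returns the RIC match, SecurityID 5675.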

SQL Query - Same filter (obviously not) but huge performance difference

I have a query that contains this filter:
...
WHERE P.CustomerId = @CustomerId
AND NOT EXISTS (SELECT 1 FROM PaymentRestriction RS
WHERE RS.CustomerId = P.CustomerId )
and it does not finish within 5 minutes. I tried replacing the parameter with the actual value:
...
WHERE P.CustomerId = '79579DFB-5610-48E0-A585-08D97867AA1F'
AND NOT EXISTS (SELECT 1 FROM PaymentRestriction RS
WHERE RS.CustomerId = P.CustomerId )
But nothing changed. Then I tried something different, and it executed in just one second:
...
WHERE (P.CustomerId = @CustomerId OR P.CustomerId = '79579DFB-5610-48E0-A585-08D97867AA1F')
AND NOT EXISTS (SELECT 1 FROM PaymentRestriction RS
WHERE RS.CustomerId = P.CustomerId )
I changed the indexes and tried OPTION (RECOMPILE), but nothing changed. Where should I look, and what should I change? Is it related to statistics? If so, how can I fix it?
PS: It is a long query with many other filters, but I think the problem is clear.
Edit 1: I updated the query filters in the question to show the relevant parts.
Edit 2: I found that the join also affects performance. If I remove the join, there is no problem under any of the conditions. I simplified the query so that the problem still exists. Here is my simplified query:
SELECT COUNT(0)
FROM Payment (NOLOCK) P
LEFT JOIN PaymentFile (NOLOCK) PF
ON PF.PaymentFileId = P.PaymentFileId
WHERE -- (P.CustomerId = @Customer OR P.CustomerId = '79579DFB-5610-48E0-A585-08D97867AA1F')
(P.CustomerId = @Customer)
AND NOT EXISTS (SELECT 1 FROM PaymentRestriction RS
WHERE RS.CustomerId = P.CustomerId )
Some other interesting results:
If I replace COUNT(0) with *, the execution time drops from 45 sec. to 20 sec.
If I uncomment the first filter line and comment out the second, the execution time drops from 45 sec. to 0.1 sec.
If I remove the last filter (NOT EXISTS), the execution time again drops from 45 sec. to 0.1 sec.

Optimize TSQL query with 3 tables

I need to get all the runs from the database, but need to mark if there is an error for this run.
3 Tables:
Runs: contains the runs
Runfiles: contains the file ids that were processed during a run
Messages: contains errors, warnings, ...
Can this query be optimized any further?
SELECT TOP 1000 runid,
start,
end,
userid,
CASE
WHEN EXISTS(SELECT rf.fk_fileid
FROM runfiles rf
WHERE rf.fk_runid = r.runid
AND EXISTS(SELECT m.messageid
FROM messages m
WHERE m.fk_fileid =
rf.fk_fileid
AND m.fk_statusid = 4))
THEN 1
ELSE 0
END AS ContainsError
FROM runs r
ORDER BY start DESC
Please don't comment on the table names, they were translated for this question.
Thanks!
Try this:
SELECT TOP 1000
r.runid
,r.start
,r.[end]
,r.userid
,CASE WHEN m.messageid IS NOT NULL THEN 1 ELSE 0 END AS ContainsError
FROM runs r
LEFT JOIN runfiles rf
ON rf.fk_runid = r.runid
LEFT JOIN [messages] m
ON m.fk_fileid = rf.fk_fileid
AND m.fk_statusid = 4
ORDER BY r.start DESC
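Anything in the select list is evaluated for each row in the result set. This means that the nested subquery in your CASE expression is evaluated for each of those TOP 1000 rows.
Using left joins and a CASE expression that checks whether the primary key is null allows the entire statement to be evaluated as a set, which is what SQL Server is built to do. It should perform better this way. Be aware that the joins return one row per file and matching message, so a run with several files can appear more than once; add DISTINCT or GROUP BY if you need one row per run.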
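A sketch of a variant that keeps exactly one row per run: the join between the files and the messages moves inside a single EXISTS, so extra files or messages cannot multiply the rows. The table variables and sample rows are invented for illustration, and whether this is faster than the original depends on indexes and data, so compare the execution plans before adopting it.

```sql
-- Hypothetical stand-ins for the three tables described in the question.
DECLARE @runs     TABLE (runid INT, start DATETIME, [end] DATETIME, userid INT);
DECLARE @runfiles TABLE (fk_runid INT, fk_fileid INT);
DECLARE @messages TABLE (messageid INT, fk_fileid INT, fk_statusid INT);

INSERT INTO @runs     VALUES (1, '20200101', '20200102', 7), (2, '20200103', '20200104', 7);
INSERT INTO @runfiles VALUES (1, 10), (1, 11), (2, 20);
-- Run 1 has two error messages (status 4); run 2 has only a non-error message.
INSERT INTO @messages VALUES (100, 10, 4), (101, 11, 4), (102, 20, 1);

SELECT TOP (1000)
       r.runid,
       r.start,
       r.[end],
       r.userid,
       CASE WHEN EXISTS (SELECT 1
                         FROM @runfiles AS rf
                         JOIN @messages AS m
                           ON m.fk_fileid = rf.fk_fileid
                         WHERE rf.fk_runid = r.runid
                           AND m.fk_statusid = 4)
            THEN 1 ELSE 0
       END AS ContainsError
FROM @runs AS r
ORDER BY r.start DESC;
-- One row per run: run 2 with ContainsError = 0, then run 1 with ContainsError = 1.
```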

Resources