Performance of SELECT-query with branch in WHERE-clause - sql-server

I have the following legacy SP:
CREATE PROCEDURE dbo.get_orders_history
(
@FirstDt DATETIME2(6),
@LastDt DATETIME2(6),
@ClassCode VARCHAR(12),
@PeriodType SMALLINT
)
AS
SET NOCOUNT ON
CREATE TABLE #BufferTable (OrderId INT)
INSERT INTO #BufferTable
SELECT DISTINCT
O.OrderId
FROM
BaseOrders O JOIN Classes C ON O.ClassId = C.ClassId
WHERE
(O.Changed = 0) AND
(C.ClassCode = @ClassCode) AND
(
(@PeriodType = 1 AND O.LastActionDateTime >= @FirstDt AND O.OrderDateTime < @LastDt) OR
(@PeriodType = 2 AND O.OrderDateTime >= @FirstDt AND O.OrderDateTime <= @LastDt)
)
OPTION(RECOMPILE);
SELECT A.Column,
C.Column,
OB.Column1,
...
OB.Column10,
O.Column1,
...
O.Column100
FROM BaseOrders OB
JOIN #BufferTable IDL ON (OB.OrderId = IDL.OrderId)
JOIN Orders O ON (O.OrderId = IDL.OrderId)
JOIN Classes C ON (O.ClassId = C.ClassId)
ORDER BY
O.OrderId
DROP TABLE #BufferTable
GO
The parameter 'PeriodType' is to be added now, and I doubt whether this way of branching (a condition in the WHERE clause) is efficient.
The SP is rarely called but returns a lot of rows (100K+), so I think OPTION (RECOMPILE) on the SELECT is a reasonable solution in this case.
Could any SQL experts suggest a more efficient way to implement such a branch?
--
EDIT: To clarify, the current prod version of the SP has no 'PeriodType' parameter, and its WHERE clause is the following:
WHERE
(O.Changed = 0) AND
(C.ClassCode = @ClassCode) AND
(O.LastActionDateTime >= @FirstDt AND O.OrderDateTime < @LastDt)
My goal is to support two types of date range in the current SP with no or minimal performance penalty.

Did you try removing the OR from the initial insert and then deleting the non-matching rows afterwards with a negated condition? Here is a code snippet.
CREATE TABLE #BufferTable (OrderId INT, LastActionDateTime DATETIME2(6), OrderDateTime DATETIME2(6))
INSERT INTO #BufferTable
SELECT DISTINCT
o.OrderId
, o.LastActionDateTime
, o.OrderDateTime
FROM
BaseOrders O JOIN Classes C ON O.ClassId = C.ClassId
WHERE
(O.Changed = 0) AND
(C.ClassCode = @ClassCode);
delete p
from #BufferTable p
where not
(
(@PeriodType = 1 AND p.LastActionDateTime >= @FirstDt AND p.OrderDateTime < @LastDt) OR
(@PeriodType = 2 AND p.OrderDateTime >= @FirstDt AND p.OrderDateTime <= @LastDt)
)
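To check that the insert-then-delete approach selects the same orders as the single query with OR, here is a minimal sketch that uses Python's built-in sqlite3 module instead of SQL Server. The table layout, the sample rows and the helper names (`or_query`, `buffer_then_delete`) are made up for illustration. It only compares results; which variant SQL Server runs faster can only be judged from the actual execution plans.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Classes (ClassId INTEGER PRIMARY KEY, ClassCode TEXT);
CREATE TABLE BaseOrders (OrderId INTEGER, ClassId INTEGER, Changed INTEGER,
                         LastActionDateTime TEXT, OrderDateTime TEXT);
-- Hypothetical sample data; ISO date strings compare correctly as text.
INSERT INTO Classes VALUES (1, 'A'), (2, 'B');
INSERT INTO BaseOrders VALUES
  (10, 1, 0, '2020-01-05', '2020-01-01'),
  (11, 1, 0, '2019-12-20', '2020-01-10'),
  (12, 1, 0, '2020-01-15', '2020-01-31'),
  (13, 1, 1, '2020-01-05', '2020-01-05'),
  (14, 2, 0, '2020-01-05', '2020-01-05');
""")

# The branch condition from the question; {t} is the table alias it applies to.
RANGE_FILTER = """(
  (:ptype = 1 AND {t}.LastActionDateTime >= :first AND {t}.OrderDateTime <  :last) OR
  (:ptype = 2 AND {t}.OrderDateTime      >= :first AND {t}.OrderDateTime <= :last)
)"""

def or_query(params):
    """Single query with the OR branch in the WHERE clause."""
    sql = ("SELECT DISTINCT O.OrderId FROM BaseOrders O "
           "JOIN Classes C ON O.ClassId = C.ClassId "
           "WHERE O.Changed = 0 AND C.ClassCode = :cls AND "
           + RANGE_FILTER.format(t="O"))
    return sorted(row[0] for row in conn.execute(sql, params))

def buffer_then_delete(params):
    """Fill the buffer without the date branch, then delete rows that fail it."""
    conn.execute("DROP TABLE IF EXISTS BufferTable")
    conn.execute("CREATE TEMP TABLE BufferTable "
                 "(OrderId INTEGER, LastActionDateTime TEXT, OrderDateTime TEXT)")
    conn.execute("INSERT INTO BufferTable "
                 "SELECT DISTINCT O.OrderId, O.LastActionDateTime, O.OrderDateTime "
                 "FROM BaseOrders O JOIN Classes C ON O.ClassId = C.ClassId "
                 "WHERE O.Changed = 0 AND C.ClassCode = :cls", params)
    conn.execute("DELETE FROM BufferTable WHERE NOT "
                 + RANGE_FILTER.format(t="BufferTable"), params)
    return sorted(row[0] for row in conn.execute("SELECT OrderId FROM BufferTable"))

base = {"cls": "A", "first": "2020-01-01", "last": "2020-01-31"}
for period_type in (1, 2):
    params = dict(base, ptype=period_type)
    print(period_type, or_query(params), buffer_then_delete(params))
```

Both variants should print the same list of OrderIds for each period type. Note that the buffer initially holds every unchanged order of the class, so the second variant trades the OR for a larger intermediate table.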

Query works as expected, SSRS finds error?

This question was closed because someone thought it was the same issue as "SSRS multi-value parameter using a stored procedure".
But it is not. My report does not use a stored procedure and thus behaves differently. Also, that question describes getting no results when multi-valued params are used, which does not match this scenario either. So I'll try posting this again.
My report works for the most part. It is when I select more than one value from either of 2 specific params (@global, @manual) that I get this error:
Here is the SQL:
DECLARE @STATE VARCHAR(2) = 'mn'
,@START DATE = '6/1/2020'
,@END DATE = '7/1/2020'
,@GLOBAL VARCHAR(50) = 'indigent fee'
,@MANUAL VARCHAR(100) = '''misc charges'',''discount'''
DROP TABLE
IF EXISTS #customers
,#test
SELECT DISTINCT ch.amount
,ch.vehicle_program_id
,c.customer_id
,ch.customer_charge_id
,ch.charge_type
INTO #customers
FROM customer c
JOIN customer_charge ch(NOLOCK) ON c.customer_id = ch.customer_id
JOIN service_history sh(NOLOCK) ON sh.customer_id = c.customer_id
JOIN header h(NOLOCK) ON h.service_history_id = sh.service_history_id
WHERE ch.entry_date BETWEEN @START
AND @END
AND ch.price_trigger_id IN (
16
,15
)
AND ch.source_type = 1
AND sh.service_type = 5
AND h.is_duplicate = 0;
WITH CTE_global
AS (
SELECT DISTINCT ch.charge_type
,'global' AS type
FROM customer_charge ch
JOIN store s ON ch.store_id = s.store_id
JOIN address a ON a.id = s.address_id
JOIN locality l ON a.locality_id = l.id
WHERE l.region = @state
AND ch.price_trigger_id = 16
UNION ALL
SELECT 'None'
,'global'
)
,CTE_manual
AS (
SELECT DISTINCT ch.charge_type
,'manual' AS type
FROM customer_charge ch
JOIN store s ON ch.store_id = s.store_id
JOIN address a ON a.id = s.address_id
JOIN locality l ON a.locality_id = l.id
WHERE l.region = @state
AND ch.price_trigger_id = 15
UNION ALL
SELECT 'None'
,'manual'
)
SELECT DISTINCT c.last_name
,c.first_name
,vp.account_no
,cust.charge_type
,cust.amount
,sh.service_date
,s.store_name_short
,GLOBAL = g.charge_type
,manual = m.charge_type
INTO #test
FROM vehicle_program vp(NOLOCK)
JOIN vehicle v(NOLOCK) ON v.vehicle_id = vp.vehicle_id
JOIN service_history sh(NOLOCK) ON sh.vehicle_program_id = vp.program_id
AND service_type = 5
JOIN customer c(NOLOCK) ON v.customer_id = c.customer_id
AND c.customer_id = sh.customer_id
JOIN store s(NOLOCK) ON vp.current_store_id = s.store_id
JOIN #customers cust ON cust.customer_id = c.customer_id
AND cust.vehicle_program_id = sh.vehicle_program_id
JOIN customer_condition cc(NOLOCK) ON c.customer_id = cc.customer_id
JOIN customer_charge ch(NOLOCK) ON ch.customer_id = c.customer_id
JOIN service_charge sc ON sc.service_history_id = sh.service_history_id
AND sc.customer_charge_id = cust.customer_charge_id
JOIN header h(NOLOCK) ON h.service_history_id = sh.service_history_id
JOIN CTE_global g ON g.charge_type = ch.charge_type
JOIN CTE_manual m ON m.charge_type = ch.charge_type
WHERE cc.state_of_conviction = @state
AND sh.service_date BETWEEN @START
AND @END
AND h.is_duplicate = 0
SELECT *
FROM #test
WHERE GLOBAL IN (
CASE
WHEN @global IN ('None')
THEN charge_type
WHEN @global NOT IN ('None')
THEN @global
END
)
OR manual IN (
CASE
WHEN @manual IN ('None')
THEN charge_type
WHEN @manual NOT IN ('None')
THEN @manual
END
)
For clarity, the last bit of the query contains some logic that makes these two params optional: selecting 'None' effectively disables that param. It seems clear that the issue is with this last bit, specifically my WHERE clause using the CASE expression. When I remove that, I don't get the error, but of course I lose my logic. What's most confusing is that the error indicates an issue with a comma, but there's no comma in that part of the SQL. Any help would be greatly appreciated.

Assuming users will only select 'None' from the list on its own and never with another value, the following should work.
WHERE (GLOBAL IN (@Global) OR @Global = 'None')
AND
(manual IN (@manual) OR @manual = 'None')

this question was closed because someone thought it was the same issue
It is a dupe, but you kind of have to read between the lines in the other answers to apply it to this scenario. The point is that SSRS replaces multi-select parameters with delimited strings in the query body itself, and this transformation can lead either to unexpectedly getting no results or to an illegal SQL query, depending on where the parameter marker appears in the original query.
I'll make it a bit clearer exactly what's going on. You can repro this behavior with this as your Data Set query:
drop table if exists #foo
create table #foo(charge_type varchar(200) , global varchar(200))
select *
from #foo
WHERE GLOBAL IN (
CASE
WHEN @global IN ('None')
THEN charge_type
WHEN @global NOT IN ('None')
THEN @global
END
)
And configure @global as a parameter that allows multi-select. When the user selects multiple values, SSRS transforms the query into:
drop table if exists #foo
create table #foo(charge_type varchar(200) , global varchar(200))
select *
from #foo
WHERE GLOBAL IN (
CASE
WHEN N'a',N'b' IN ('None')
THEN charge_type
WHEN N'a',N'b' NOT IN ('None')
THEN N'a',N'b'
END
)
Which fails with An expression of non-boolean type specified in a context where a condition is expected, near ','.
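The textual substitution described above can be imitated in a few lines of Python. This is only a simulation of the behaviour, not SSRS's actual implementation; the function name `expand_multi_value` and the sample values are invented for illustration, and the parameter marker is written T-SQL-style as `@global`.

```python
import re

def expand_multi_value(query, name, values):
    """Replace every @name marker with a comma-separated list of N'...' literals."""
    literals = ",".join("N'" + v.replace("'", "''") + "'" for v in values)
    # \b stops @global from also matching a longer name such as @global_id.
    pattern = r"@" + re.escape(name) + r"\b"
    return re.sub(pattern, lambda match: literals, query, flags=re.IGNORECASE)

# Inside IN (...) the expanded list is still valid SQL.
print(expand_multi_value("GLOBAL IN (@global)", "global", ["a", "b"]))

# As a bare operand the same expansion leaves a stray comma behind.
print(expand_multi_value("WHEN @global IN ('None')", "global", ["a", "b"]))
```

The first fragment becomes `GLOBAL IN (N'a',N'b')`, while the second becomes `WHEN N'a',N'b' IN ('None')`, which is where the complaint about a comma comes from even though the original query text has none at that spot.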

In plpgsql (of PostgreSQL), can a CTE be preserved to an outer loop?

(Edited from the original).
In plpgsql, (PostgreSQL 9.2), I have a function defined as:
CREATE OR REPLACE FUNCTION test (patient_recid integer, tencounter timestamp without time zone)
RETURNS SETOF view_dx AS
$BODY$
#variable_conflict use_column
DECLARE
r view_dx%rowtype;
BEGIN
FOR r IN
With person AS (
select ....
)
, alldx AS (
select ....
)
............
select ... from first cte
union
select ... from second cte
union
etc., etc.,
LOOP
r.tposted = ( .
With person AS (
... SAME AS ABOVE,
alldx AS (
... SAME AS ABOVE,
)
select max(b.tposted)
from alldx b
where r.cicd9 = b.code and r.cdesc = b.cdesc);
r.treated = (
With person AS (
........SAME AS ABOVE )
, alldx AS (
........SAME AS ABOVE
)
select ...);
r.resolved = (
With person AS (
select p.chart_recid as recid
from patients p
where p.recid = patient_recid
)
...etc, etc,
RETURN NEXT r;
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION test(integer, timestamp without time zone)
OWNER TO postgres;
Edit: Essentially, I have multiple CTEs defined which work well in the "FOR r IN" section of the code with multiple unions, but when executing the LOOP...END LOOP section, each CTE needs to be redefined with each SELECT statement. Is there a good way to avoid multiple definitions of the same CTE?
Or is there a better (i.e., faster) way of doing this?
All suggestions are most welcome and appreciated.
TIA

[this is not an answer (too little information, too large program), but a hint for rewriting the stacked CTE.]
The members of the union all appear to be based on select b.* from alldx b, each with a few different extra conditions, mostly based on the existence of other tuples within the same CTE. My suggestion is to unify these, replacing them by boolean flags, as in:
WITH person AS (
SELECT p.chart_recid as recid
FROM patients p
WHERE p.recid = patient_recid
)
, alldx AS (
SELECT d.tposted, d.treated, d.resolved, d.recid as dx_recid, d.pmh, d.icd9_recid
, i.code, i.cdesc, i.chronic
FROM dx d
JOIN icd9 i ON d.icd9_recid = i.recid
JOIN person p ON d.chart_recid = p.recid
WHERE d.tposted::date <= tencounter::date
)
SELECT uni.tposted, uni.treated, uni.resolved, uni.dx_recid, uni.pmh, uni.icd9_recid
, uni.code, uni.cdesc, uni.chronic
, (uni.tposted::date = tencounter::date
) AS is_dx_at_encounter -- bitfield
, EXISTS ( -- a record from the same or a more recent date has resolved this problem.
SELECT 1
FROM alldx x
WHERE x.resolved = true
AND uni.code = x.code AND uni.cdesc = x.cdesc
AND x.tposted >= uni.tposted
) AS dx_resolved -- bitfield
, EXISTS ( -- a record from a more recent date shows this problem as unresolved again.
SELECT 1
FROM alldx x
WHERE x.resolved = false
AND uni.code = x.code AND uni.cdesc = x.cdesc
AND x.tposted > uni.tposted
) AS dx_recurred -- bitfield
, EXISTS ( SELECT * from alldx x where x.chronic = true
AND uni.code = x.code AND uni.cdesc = x.cdesc
) AS dx_chronic -- bitfield
-- etcetera
FROM alldx uni
;
The person CTE could probably be incorporated, too, and maybe you don't even need the final loop, but you'll have to find out which combination(s) of the resulting bitfields will be needed.
The UNION (without ALL) in the original is a terrible beast: it collects all the results from the union parts, but then has to remove duplicates. This will probably introduce a sort step, since CTE references tend to hide their key fields or implied ordering from the calling query.
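The difference between UNION and UNION ALL is easy to see on a toy example. The sketch below uses Python's built-in sqlite3 module rather than PostgreSQL, and the two tables and their rows are made up; it demonstrates only the duplicate removal, not the sort step or its cost in the PostgreSQL planner.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE recurrent  (code TEXT, cdesc TEXT);
CREATE TABLE chronic_dx (code TEXT, cdesc TEXT);
-- Hypothetical rows: the diabetes diagnosis appears in both sets.
INSERT INTO recurrent  VALUES ('250.00', 'diabetes'), ('401.9', 'hypertension');
INSERT INTO chronic_dx VALUES ('250.00', 'diabetes');
""")

# UNION must compare rows across both branches to drop the duplicate.
union_rows = conn.execute(
    "SELECT code, cdesc FROM recurrent UNION SELECT code, cdesc FROM chronic_dx"
).fetchall()

# UNION ALL simply concatenates the two result sets.
union_all_rows = conn.execute(
    "SELECT code, cdesc FROM recurrent UNION ALL SELECT code, cdesc FROM chronic_dx"
).fetchall()

print(len(union_rows), len(union_all_rows))
```

UNION returns two rows here and UNION ALL returns three, so UNION ALL is only a drop-in replacement when the branches cannot overlap or when duplicates are acceptable.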

As far as I can tell, CTEs defined before the LOOP do not carry over to the LOOP itself. However, a temporary table can be defined in the BEGIN block, and it is available in the LOOP block. The following solution runs 50 times faster than my original code. Does anybody have a better approach?
CREATE OR REPLACE FUNCTION test2 (patient_recid integer, tencounter timestamp without time zone)
RETURNS SETOF view_dx AS
$BODY$
#variable_conflict use_column
DECLARE
r view_dx%rowtype;
BEGIN
-- create table can only be created in the BEGIN block
Create temp table all_dx ON COMMIT DROP AS
With person AS (
select p.chart_recid as recid
from patients p
where p.recid = patient_recid
)
, alldx AS (
select d.tposted, d.treated, d.resolved, d.recid as dx_recid, d.pmh, d.icd9_recid, i.code, i.cdesc, i.chronic
from dx d
join icd9 i on (d.icd9_recid = i.recid)
join person p on (d.chart_recid = p.recid)
where d.tposted::date <= tencounter::date
)
select * from alldx order by tposted desc;
-- will loop through all the records produced by the unions and assign tposted, pmh, chronic, etc...
FOR r IN
With
dx_at_encounter AS ( -- get all diagnosis at time of encounter
select code, cdesc from all_dx a
where a.tposted::date = tencounter::date
)
, dx_resolved AS ( -- get most recent date of every resolved problem.
select b.* from all_dx b
join (
select a.code, a.cdesc , max(tposted) as tposted
from all_dx a
where a.resolved = true
group by code,cdesc) j
on (b.code = j.code and b.cdesc = j.cdesc and b.tposted = j.tposted)
)
, never_resolved AS ( -- get all problems that have never been resolved before time of encounter.
-- "not exists" is applied to each select output row AFTER the output row b.* is formed.
select b.code, b.cdesc from all_dx b
where not exists
(select 1
from dx_resolved d
where b.code = d.code and b.cdesc = d.cdesc)
)
, recurrent AS ( -- get all recurrent problems. (Problems that are now current after being resolved).
select b.code, b.cdesc
from all_dx b
join dx_resolved r on (b.cdesc = r.cdesc and b.tposted::date > r.tposted::date )
where (b.resolved is null or b.resolved = false)
)
, chronic_dx AS (
select b.code, b.cdesc
from all_dx b
where b.chronic = true
)
-- all diagnosis at time of encounter
select a.code,
a.cdesc
from dx_at_encounter a
union
-- all recurrent problems
select
a.code,
a.cdesc
from recurrent a
union
-- all problems that have never been resolved
select
a.code,
a.cdesc
from never_resolved a
union
-- all chronic problems
select
a.code,
a.cdesc
from chronic_dx a
-- LOOP goes to END LOOP which returns back to LOOP to process each of the result records from the unions.
LOOP
r.tposted = ( -- get most recent usage of a diagnosis.
select max(b.tposted)
from all_dx b
where r.cicd9 = b.code and r.cdesc = b.cdesc);
r.treated = (
select b.treated from all_dx b
where b.tposted = r.tposted and b.code = r.cicd9 and b.cdesc = r.cdesc);
r.resolved = (
select b.resolved from all_dx b
where b.tposted = r.tposted and b.code = r.cicd9 and b.cdesc = r.cdesc);
r.pmh = (
select distinct true
from all_dx b
where
b.pmh = true and
b.code = r.cicd9 and
b.cdesc = r.cdesc );
r.chronic = (
select distinct true
from all_dx b
where
b.chronic = true and
b.code = r.cicd9 and
b.cdesc = r.cdesc);
RETURN NEXT r; -- return current row of SELECT
END LOOP;
RETURN;
END
$BODY$
LANGUAGE plpgsql VOLATILE
COST 100
ROWS 1000;
ALTER FUNCTION test2(integer, timestamp without time zone)
OWNER TO postgres;

Get all parent rows that do not have a row for current date in child table?

SELECT
[dbo].[Mission].[MissionId]
FROM
[dbo].[Mission]
LEFT OUTER JOIN
[dbo].[Report] ON [dbo].[Mission].[MissionId] = [dbo].[Report].[MissionId]
WHERE
[dbo].[Report].ReportDate IS NULL
ORDER BY
[dbo].[Mission].[MissionId]
How can I change the above query such that it gives me all MissionId's from table [dbo].[Mission] that do not have a row in table [dbo].[Report] where [dbo].[Report].ReportDate is today?
MissionId is the primary key in table Mission and a foreign key in table Report. So I want to get all missions that do not have a row in table Report for the current date.

I've introduced some aliases to make the query easier to read, and added the needed condition. I've also changed the WHERE clause, not sure if that's required:
SELECT m.[MissionId]
FROM [dbo].[Mission] m LEFT OUTER JOIN [dbo].[Report] r
ON m.[MissionId] = r.[MissionId]
AND r.ReportDate = DATEADD(day,DATEDIFF(day,0,GETDATE()),0)
WHERE r.MissionId IS NULL
ORDER BY m.[MissionId]
This assumes that ReportDate contains dates with the time portions set to midnight. If that's not so, then a slightly more complex query is required:
SELECT m.[MissionId]
FROM [dbo].[Mission] m
WHERE NOT EXISTS(select * from dbo.Report r
where r.MissionID = m.MissionID and
r.ReportDate >= DATEADD(day,DATEDIFF(day,0,GETDATE()),0) and
r.ReportDate < DATEADD(day,DATEDIFF(day,0,GETDATE()),1)
)
ORDER BY m.[MissionId]
GETDATE() returns the current date and time. I'm using a couple of tricks with DATEADD and DATEDIFF to take that value and turn it into the current date at midnight, and (in the second query) tomorrow's date at midnight.
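The arithmetic behind that trick can be sketched in Python. In SQL Server the integer 0 converts to the datetime 1900-01-01 00:00, so DATEDIFF(day, 0, x) is the number of whole days since that base date, and adding those days back to 0 lands on midnight. The function name `midnight_of` and the sample timestamp are made up for illustration.

```python
from datetime import datetime, timedelta

# SQL Server treats the integer 0 as this base datetime.
BASE = datetime(1900, 1, 1)

def midnight_of(now, offset_days=0):
    """Mimic DATEADD(day, DATEDIFF(day, 0, now), offset_days)."""
    # Whole days since the base date, i.e. the time of day is discarded.
    whole_days = (now - BASE).days
    # Adding them back to the base (shifted by the offset) gives a midnight value.
    return BASE + timedelta(days=whole_days + offset_days)

sample = datetime(2020, 6, 1, 13, 45, 10)
print(midnight_of(sample))      # start of the same day
print(midnight_of(sample, 1))   # start of the next day
```

Together the two values form the half-open range [today 00:00, tomorrow 00:00) that the second query tests ReportDate against.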
Second query as a fully runnable query:
declare @mission table (MissionID int not null);
insert into @mission (MissionID) select 1 union all select 2;
declare @report table (MissionID int not null,ReportDate datetime not null);
insert into @report (MissionID,ReportDate)
select 2,GETDATE() union all select 1,DATEADD(day,-1,GETDATE());
SELECT m.[MissionId]
FROM @mission m
WHERE NOT EXISTS(select * from @report r
where r.MissionID = m.MissionID and
r.ReportDate >= DATEADD(day,DATEDIFF(day,0,GETDATE()),0) and
r.ReportDate < DATEADD(day,DATEDIFF(day,0,GETDATE()),1)
)
ORDER BY m.[MissionId]
Result:
MissionId
-----------
1

select
m.MissionId
from Mission m
left join Report r
on m.MissionId = r.MissionId
and day(r.ReportDate) = day(getdate())
and month(r.ReportDate) = month(getdate())
and year(r.ReportDate) = year(getdate())
WHERE r.ReportDate is null
ORDER BY m.MissionId

how to convert the below subquery into joins using one update statement

Below is the complete query I have; the ultimate aim is to update the Claims table. But it should be a single statement without any subquery. Only joins are allowed, because I am going to run this on an appliance which won't support subqueries:
DECLARE @DecWdrwn as TABLE(CtryId smallint, CmId int, DecWdrwnDt int);
WITH s AS
(
SELECT
Ctryid,CmId,Dt,
ISNULL((
SELECT max(CmHistDtTmId)
FROM ClaimHistory l
WHERE St = 3
AND l.Ctryid = c.Ctryid
AND l.CmId = c.CmId)
, 0) MaxDec,
ISNULL((
SELECT max(CmHistDtTmId)
FROM ClaimHistory l
WHERE St = 7
AND l.Ctryid = c.Ctryid
AND l.CmId = c.CmId)
, 0) MaxSet
FROM
ClaimHistory c
WHERE
St =3
)
INSERT INTO @DecWdrwn
SELECT CtryId, CmId, Max(Dt) DecDt
FROM s
WHERE MaxSet > MaxDec
GROUP BY CtryId,CmId
Your response is much appreciated...
UPDATE Claims
SET CmDclnWdwnDt = (
SELECT DecWdrwnDt
FROM @DecWdrwn d
WHERE d.CmId = Claims.CmId
AND d.CtryId = Claims.CtryId
)
WHERE EXISTS (
SELECT *
FROM @DecWdrwn d
WHERE d.CmId = Claims.CmId
AND d.CtryId = Claims.CtryId
)

Please try INNER JOIN Update:
UPDATE a
SET a.CmDclnWdwnDt = b.DecWdrwnDt
FROM Claims a
INNER JOIN @DecWdrwn b
ON a.CmId = b.CmId AND
a.CtryId = b.CtryId

Joining on varchar(50) foreign key slows query

I have this query which is pretty long, but adding a where clause to it, or joining on a string makes it take an extra 2 seconds to run. I can't figure out why.
Here's the query in full:
ALTER PROCEDURE [dbo].[RespondersByPracticeID]
@practiceID int = null,
@activeOnly bit = 1
AS
BEGIN
SET NOCOUNT ON;
select
isnull(sum(isResponder),0) as [Responders]
,isnull(count(*) - sum(isResponder),0) as [NonResponders]
,isnull((select
count(p.patientID)
from patient p
inner join practice on practice.practiceid = p.practiceid
inner join [lookup] l on p.dosing = l.lookupid and l.lookupid = 'da_ncd'
where
p.practiceID = isnull(@practiceID, p.practiceID)
and p.active = case @activeOnly when 1 then 1 else p.active end
) - (isnull(sum(isResponder),0) + isnull(count(*) - sum(isResponder),0)),0)
as [Undetermined]
from (
select
v.patientID
,firstVisit.hbLevel as startHb
,maxHbVisit.hblevel as maxHb
, case when (maxHbVisit.hblevel - firstVisit.hbLevel >= 1) then 1 else 0 end as isResponder
,count(v.patientID) as patientCount
from patient p
inner join visit v on v.patientid = v.patientid
inner join practice on practice.practiceid = p.practiceid
inner join [lookup] l on p.dosing = l.lookupid and l.lookupid = 'da_ncd'
inner join (
SELECT
p.PatientID
,v.VisitID
,v.hblevel
,v.VisitDate
FROM Patient p
INNER JOIN Visit v ON p.PatientID = v.PatientID
WHERE
v.VisitDate = (
SELECT MIN(VisitDate)
FROM Visit
WHERE PatientId = p.PatientId
)
) firstVisit on firstVisit.patientID = v.patientID
inner join (
select
p.patientID
,max(v.hbLevel) as hblevel
from Patient p
INNER JOIN Visit v ON p.PatientID = v.PatientID
group by
p.patientID
) MaxHbVisit on maxHbVisit.patientid = v.patientId
where
p.practiceID = isnull(@practiceID, p.practiceID)
and p.active = case @activeOnly when 1 then 1 else p.active end
group by
v.patientID
,firstVisit.hbLevel
,maxHbVisit.hblevel
having
datediff(
d,
dateadd(
day
,-DatePart(
dw
,min(v.visitDate)
) + 1
,min(v.visitDate)
)
, max(v.visitDate)
) >= (7 * 8) -- Eight weeks.
) responders
END
The line that slows it down is:
inner join [lookup] l on p.dosing = l.lookupid and l.lookupid = 'da_ncd'
Also, moving it to the where clause has the same effect:
where p.dosing = 'da_ncd'
Otherwise, the query runs almost instantly. >.<

Ah, sorry, I figured it out. Patient.Dosing was set to allow NULLs. I guess that made it a different sort of index.

For the record, even though the question is answered.
Usually things like this happen because the execution plan is changed. Compare the plans in query analyzer.

Another gotcha is data types: if p.dosing and l.lookupid differ (nvarchar vs. varchar, for example), that can have a huge impact.

Try creating an index on that table, being sure to properly include that VARCHAR field in the list of fields.
