SQL - Finding Gaps in Coverage - sql-server

I am running this on SQL Server.
Here is my problem.
I have something like this:
Dataset A
FK_ID StartDate EndDate Type
1 10/1/2018 11/30/2018 M
1 12/1/2018 2/28/2019 N
1 3/1/2019 10/31/2019 M
I have a second data source I have no control over with data something like this:
Dataset B
FK_ID SpanStart SpanEnd Type
1 10/1/2018 10/15/2018 M
1 10/1/2018 10/25/2018 M
1 2/15/2019 4/30/2019 M
1 5/1/2019 10/31/2019 M
What I am trying to accomplish is to check to make sure every date within each TYPE M record in Dataset A has at least 1 record in Dataset B.
For example, record 1 in Dataset A does NOT have coverage from 10/26/2018 through 11/30/2018. I really only care about when the coverage ends; in this case I want to return 10/26/2018 because it is the first date in the span that has no coverage from Dataset B.
I've written a function that does this but it is pretty slow because it is cycling through each date within each M record and counting the number of records in Dataset B. It exits the loop when it finds the first one but I would really like to make this more efficient. I am sure I am not thinking about this properly so any suggestions anyone can offer would be helpful.
This is the section of code I'm currently running
else if @SpanType = 'M'
begin
    set @CurrDate = @SpanStart
    set @UncovDays = 0
    while @CurrDate <= @SpanEnd
    begin
        if (SELECT count(*)
            FROM eligiblecoverage ec join eligibilityplan ep on ec.plandescription = ep.planname
            WHERE ec.masterindividualid = @IndID
              and ec.planbegindate <= @CurrDate and ec.planenddate >= @CurrDate
              and ec.sourcecreateddate = @MaxDate
              and ep.medicaidcoverage = 1) = 0
        begin
            SET @Result = concat('NON Starting ', format(@CurrDate, 'M/d/yyyy'))
            BREAK
        end
        set @CurrDate = @CurrDate + 1
    end
end
I am not married to having a function; I just could not find a way to do this in queries that wasn't very, very slow.
EDIT: Dataset B will never have any TYPEs except M so that is not a consideration
EDIT 2: The code offered by DonPablo does de-overlap the data but only in cases where there is an overlap at all. It reduces dataset B to:
FK_ID SpanStart SpanEnd Type
1 10/1/2018 10/25/2018 M
instead of
FK_ID SpanStart SpanEnd Type
1 10/1/2018 10/25/2018 M
1 2/15/2019 4/30/2019 M
1 5/1/2019 10/31/2019 M
I am still futzing around with it but it's a start.

I would approach this by focusing on B. My assumption is that any absent record would follow span_end in the table. So here is the idea:
Unpivot the dates in B (adding "1" to the end dates)
Add a flag if they are present with type "M".
Check to see if any not-present records are in the span for A.
Check the first and last dates as well.
So, this looks like:
with bdates as (
select v.dte,
(case when exists (select 1
from b b2
where v.dte between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1 else 0
end) as in_b
from b cross apply
(values (spanstart), (dateadd(day, 1, spanend))
) v(dte)
where b.type = 'M' -- all we care about
group by v.dte -- no need for duplicates
)
select a.*,
(case when not exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 0
when not exists (select 1
from b b2
where a.enddate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 0
when exists (select 1
from bdates bd
where bd.dte between a.startdate and a.enddate and
bd.in_b = 0
)
then 0
when exists (select 1
from b b2
where a.startdate between b2.spanstart and b2.spanend and
b2.type = 'M'
)
then 1
else 0
end)
from a;
What is this doing? Four validity checks:
Is the start date covered?
Is the end date covered?
Are any intermediate dates uncovered?
Is there at least one covering record?
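For illustration only, here is a small Python sketch (not the SQL above) of the assumption this answer relies on: the first uncovered date inside an A span can only be the A start date itself or the day after some B span ends. The function name and the in-memory tuples are my own; the dates are the sample rows from the question.

```python
from datetime import date, timedelta

def first_uncovered(a_start, a_end, b_spans):
    """Return the first date in [a_start, a_end] not covered by any B span, or None."""
    # Candidate dates: the A start date, plus the day after each B span ends.
    candidates = {a_start} | {end + timedelta(days=1) for _, end in b_spans}
    uncovered = [
        d for d in candidates
        if a_start <= d <= a_end
        and not any(s <= d <= e for s, e in b_spans)
    ]
    return min(uncovered) if uncovered else None

# Dataset B from the question (all Type M).
b_spans = [
    (date(2018, 10, 1), date(2018, 10, 15)),
    (date(2018, 10, 1), date(2018, 10, 25)),
    (date(2019, 2, 15), date(2019, 4, 30)),
    (date(2019, 5, 1), date(2019, 10, 31)),
]

print(first_uncovered(date(2018, 10, 1), date(2018, 11, 30), b_spans))  # 2018-10-26
print(first_uncovered(date(2019, 3, 1), date(2019, 10, 31), b_spans))   # None
```

Only a handful of dates per A record are tested, instead of every day in the span, which is where the speed-up over the loop in the question would come from.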

Start by breaking the problem into smaller pieces, as a sequence of actions, like I did in the comment.
See George Polya, "How To Solve It" (1945).
Then Google is your friend. Search for: sql de-overlap date ranges into one record (over a million results).
UPDATED: I picked "Merge overlapping dates in SQL Server"
and adapted it to our table and column names.
Also look at the theory from 1983, Allen's Interval Algebra: https://www.ics.uci.edu/~alspaugh/cls/shr/allen.html
Or from 2014: https://stewashton.wordpress.com/2014/03/11/sql-for-date-ranges-gaps-and-overlaps/
This is a primer on how to set up test data for this problem.
Finally, determine what counts by ranking the various pairs of A vs. B:
bypass those totally Within, then work with the earliest PartialOverlaps, and lastly handle the Precede/Follow items.
--from Merge overlapping dates in SQL Server
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
)
Select * from DeOverlapped_B
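As a cross-check on the de-overlap step, here is a minimal Python sketch of the same idea using a sort-and-merge pass instead of the CTEs above. It is my own illustration, not the linked answer's code. One deliberate difference: it also merges spans that abut by one day (one ends 4/30/2019, the next starts 5/1/2019), whereas the SpanStarts/SpanEnds conditions as written only join spans that actually overlap. The function name and tuples are hypothetical; the dates are the question's Dataset B.

```python
from datetime import date, timedelta

def merge_spans(spans):
    """Merge overlapping or day-adjacent (start, end) date spans."""
    merged = []
    for start, end in sorted(spans):
        # Continue the current span if this one starts no later than
        # the day after the current span ends.
        if merged and start <= merged[-1][1] + timedelta(days=1):
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged

coverage_b = [
    (date(2018, 10, 1), date(2018, 10, 15)),
    (date(2018, 10, 1), date(2018, 10, 25)),
    (date(2019, 2, 15), date(2019, 4, 30)),
    (date(2019, 5, 1), date(2019, 10, 31)),
]

for start, end in merge_spans(coverage_b):
    print(start, end)
# 2018-10-01 2018-10-25
# 2019-02-15 2019-10-31
```

Whether adjacent spans should count as continuous coverage is an assumption on my part, but for a "first uncovered date" check it seems to be what the question wants.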
Now we have something to feed into the next steps, and we can use the above as a CTE
======================================
with SpanStarts as
(
select distinct FK_ID, SpanStart
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanStart < t1.SpanStart
and t2.SpanEnd >= t1.SpanStart)
),
SpanEnds as
(
select distinct FK_ID, SpanEnd
from Coverage_B as t1
where not exists
(select * from Coverage_B as t2
where t2.FK_ID = t1.FK_ID
and t2.SpanEnd > t1.SpanEnd
and t2.SpanStart <= t1.SpanEnd)
),
DeOverlapped_B as
(
Select FK_ID, SpanStart,
(select min(SpanEnd) from SpanEnds as e
where e.FK_ID = s.FK_ID
and SpanEnd >= SpanStart) as SpanEnd
from SpanStarts as s
),
-- find A row's coverage
ACoverage as (
Select
a.*, b.SpanEnd, b.SpanStart,
Case
When SpanStart <= StartDate And StartDate <= SpanEnd
And SpanStart <= EndDate And EndDate <= SpanEnd
Then '1within' -- starts, equals, during, finishes
When EndDate < SpanStart
Or SpanEnd < StartDate
Then '3beforeAfter' -- precedes, meets, preceded by, met by
Else '2overlap' -- one or two ends hang over spanStart/End
End as relation
From Coverage_A a
Left Join DeOverlapped_B b
On a.FK_ID = b.FK_ID
Where a.Type = 'M'
)
Select
*
,Case
When relation1 = '2' And StartDate < SpanStart Then StartDate
When relation1 = '2' Then DateAdd(d, 1, SpanEnd)
When relation1 = '3' Then StartDate
End as UnCoveredBeginning
From (
Select
*
,SUBSTRING(relation,1,1) as relation1
,ROW_NUMBER() Over (Partition by A_ID Order by relation, SpanStart) as Rownum -- A_ID: presumably a unique row key on Coverage_A; the question's sample data does not show one
from ACoverage
) aRNO
Where Rownum = 1
And relation1 <> '1'

Related

Second level lookup with SQL statement

How do I write a SQL statement that does a second level lookup only if first is not matched. For example:
In the below query, if my SEDOLCode condition does not return a record, proceed to lookup with condition 2 with RICCode.
select
*, GETDATE()
from
Securities sec
where
sec.SEDOLCode = 'ABCDEF'
or sec.RICCode = '002815.SZ'
This query is returning two different records - for example:
1234 ABCDEF DUMY906.X
5675 EFTFS 002815.SZ
I am taking data from a file to update the Pricetable as below. I want to use SedolCode as primary lookup.
IF @@ROWCOUNT = 0
INSERT INTO dbo.Price (sec.SecurityID, ClosingPrice, UpdatedDate, UpdatedByUser, Priced)
SELECT
..., GETDATE()
FROM
Securities sec
WHERE
sec.SEDOLCode = @SedolCode
OR sec.RICCode = @RicCode
Try this. The logic is basically that if the sedolcode is found, only the first condition is met. Otherwise the count for that sedolcode is 0 and the query falls back to riccode.
select
*, GETDATE()
from
Securities sec
where
sec.sedolcode = 'ABCDEF'
OR ((SELECT COUNT(1) FROM Securities WHERE sedolcode = 'ABCDEF') = 0 AND sec.riccode = '002815.SZ')
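To see the fallback behave as described, here is a small sketch that runs the same pattern on an in-memory SQLite database through Python (so not SQL Server, and the table holds only the two rows from the question). It uses NOT EXISTS in place of the COUNT(1) = 0 test, which should be equivalent here; the lookup helper is my own name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Securities (SecurityID INTEGER, SEDOLCode TEXT, RICCode TEXT)"
)
conn.executemany(
    "INSERT INTO Securities VALUES (?, ?, ?)",
    [(1234, "ABCDEF", "DUMY906.X"), (5675, "EFTFS", "002815.SZ")],
)

# Primary lookup on SEDOL; the RIC condition only applies
# when no row at all carries the requested SEDOL.
LOOKUP = """
SELECT sec.SecurityID
FROM Securities sec
WHERE sec.SEDOLCode = :sedol
   OR (NOT EXISTS (SELECT 1 FROM Securities WHERE SEDOLCode = :sedol)
       AND sec.RICCode = :ric)
"""

def lookup(sedol, ric):
    rows = conn.execute(LOOKUP, {"sedol": sedol, "ric": ric}).fetchall()
    return [r[0] for r in rows]

print(lookup("ABCDEF", "002815.SZ"))  # [1234]  SEDOL found, RIC ignored
print(lookup("NOSUCH", "002815.SZ"))  # [5675]  SEDOL missing, falls back to RIC
```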
Ah - reminds me of my FTSE days........
Match on SEDOL and not RIC, or on RIC and not SEDOL, and use myOrdering with TOP to take the first.
INSERT INTO dbo.Price
(
    SecurityID
    , ClosingPrice
    , UpdatedDate
    , UpdatedByUser
    , Priced
)
SELECT TOP 1 [specify fields to insert]
FROM
(
    select 1 as myOrdering ...
        , GETDATE()
    from Securities sec
    WHERE
        (sec.SEDOLCode = @SedolCode AND sec.RICCode != @RicCode)
    UNION
    select 2 as myOrdering ...
        , GETDATE()
    from Securities sec
    WHERE
        (sec.RICCode = @RicCode AND sec.SEDOLCode != @SedolCode)
) SUB_Q ORDER BY myOrdering

How to use count in subquery in MSSQL

I would like to merge two tables in MSSQL. The first table has a task column. I would like to count each specific task and assign the count to the matching AuftNr in the second table.
Do I need a subquery and GROUP BY to solve this?
So far I have done this:
SELECT AB.PersNr as PersonalNumber
,CONVERT(char(10),DATEADD(DAY, AB.Tag, '30.12.1899'),126) AS Day
,CONVERT(char(10),DATEADD(SECOND, AB.Von, DATEADD(DAY, AB.Tag,
'30.12.1899')),108) AS [From]
,AB.Bis as [To]
,AB.Auftrag as Task
FROM AStpVonBis AB
LEFT JOIN Auftrag A ON (A.AuftNr = AB.Auftrag)
INNER JOIN Personen P ON (P.PersNr = AB.PersNr)
WHERE P.Abteilung = 170 AND AB.Tag = DATEDIFF(DAY, '30.12.1899', GETDATE())
AND AB.Bis = -2
SELECT A.AuftNr FROM Auftrag A
Using a GROUP BY and a COUNT should do it :
SELECT
AB.Auftrag as Task,
count(*) as Total
FROM AStpVonBis AB
JOIN Personen P ON (P.PersNr = AB.PersNr)
WHERE P.Abteilung = 170
AND AB.Tag = DATEDIFF(DAY, convert(date,'30.12.1899',104), GETDATE())
AND AB.Bis = -2
GROUP BY AB.Auftrag
ORDER BY AB.Auftrag
Note that the left join with [Auftrag] wasn't included.
Since there's already AB.Auftrag to group by, and there's no grouping needed on the name of the Task.
The date stamp is converted with the 104 format to a date.
Just so it'll also work on connections that use another default date format.
Disclaimer: only tested in notepad
If I understand your question correctly, the query below should work:
DECLARE @Count TABLE (PhoneNumver INT, [Day] DATE, [From] VARCHAR(150), [To] VARCHAR(3), Task INT)
INSERT INTO @Count
VALUES (1003,'2017-06-28','07:46:20','-2',150),
       (1010,'2017-06-28','11:44:47','-2',140),
       (1012,'2017-06-28','10:57:00','-2',120),
       (1016,'2017-06-28','12:20:16','-2',120),
       (1019,'2017-06-28','08:31:03','-2',120),
       (1020,'2017-06-28','11:38:02','-2',120),
       (1021,'2017-06-28','07:54:55','-2',120),
       (1025,'2017-06-28','11:38:12','-2',120),
       (1027,'2017-06-28','09:47:46','-2',130)
DECLARE @Task TABLE (AuftNr INT)
INSERT INTO @Task VALUES (110),(120),(130),(140),(150),(200),(210),(220),(230)
SELECT
    A.AuftNr,
    COUNT(C.Task) AS Total_Count
FROM @Task A
LEFT JOIN @Count C ON A.AuftNr = C.Task
--From here you can add all the exclusions in the where clause
GROUP BY A.AuftNr
ORDER BY Total_Count DESC
OUTPUT
AuftNr Total_Count
120 6
130 1
140 1
150 1
200 0
210 0
220 0
230 0
110 0

Select from same column under different conditons

I need to join these two tables. I need to select occurrences where:
ex_head_of_family_active = 1 AND tax_year = 2017
and also:
ex_head_of_family_active = 0 AND tax_year = 2016
The first time I tried to join these two tables I got the error that warehouse_data.dbo.tb_master_ascend and warehouse_data.dbo.tb_master_ascend in the FROM clause have the same exposed names. With the query as shown below, I get a syntax error on the "where". What am I doing wrong? Thank you
use [warehouse_data]
select
parcel_number as Account,
pact_code as type,
owner_name as Owner,
case
when ex_head_of_family_active >= 1
then 'X'
else ''
end 'Head_Of_Fam'
from
warehouse_data.dbo.tb_master_ascend
inner join
warehouse_data.dbo.tb_master_ascend on parcel_number = parcel_number
where
warehouse_data.dbo.tb_master_ascend.tax_year = '2016'
and ex_head_of_family_active = 0
where
warehouse_data.dbo.tb_master_ascend.t2.tax_year = '2017'
and ex_head_of_family_active >= 1
and (eff_from_date <= getdate())
and (eff_to_date is null or eff_to_date >= getdate())
@marc_s I changed the where statements and updated my code; however, the filter is not working now:
use [warehouse_data]
select
wh2.parcel_number as Account
,wh2.pact_code as Class_Type
,wh2.owner_name as Owner_Name
,case when wh2.ex_head_of_family_active >= 1 then 'X'
else ''
end 'Head_Of_Fam_2017'
from warehouse_data.dbo.tb_master_ascend as WH2
left join warehouse_data.dbo.tb_master_ascend as WH1 on ((WH2.parcel_number = wh1.parcel_number)
and (WH1.tax_year = '2016')
and (WH1.ex_head_of_family_active is null))
where WH2.tax_year = '2017'
and wh2.ex_head_of_family_active >= 1
and (wh2.eff_from_date <= getdate())
and (wh2.eff_to_date is null or wh2.eff_to_date >= getdate())
I would use a CTE to get all your parcels that meet your 2016 rules.
Then join that against your 2017 rules on parcel ID.
I'm summarizing:
with cte as
(
select parcelID
from table
where [2016 rules]
group by parcelID --If this isn't unique you will get a Cartesian product in your results
)
select columns
from table
join cte on table.parcelid=cte.parcelID
where [2017 rules]
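Here is one way the summary above might look once the placeholders are filled in, run on an in-memory SQLite database through Python rather than on SQL Server. The rules are my reading of the question (2016: ex_head_of_family_active = 0; 2017: ex_head_of_family_active >= 1), the effective-date filters are left out, and the three parcels are made-up test rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tb_master_ascend ("
    "parcel_number TEXT, tax_year TEXT, ex_head_of_family_active INTEGER)"
)
conn.executemany(
    "INSERT INTO tb_master_ascend VALUES (?, ?, ?)",
    [
        ("P1", "2016", 0), ("P1", "2017", 1),  # meets both sets of rules
        ("P2", "2016", 1), ("P2", "2017", 1),  # fails the 2016 rules
        ("P3", "2017", 1),                     # has no 2016 row at all
    ],
)

QUERY = """
WITH cte AS (
    SELECT parcel_number
    FROM tb_master_ascend
    WHERE tax_year = '2016' AND ex_head_of_family_active = 0
    GROUP BY parcel_number          -- keep one row per parcel
)
SELECT t.parcel_number
FROM tb_master_ascend AS t
JOIN cte ON t.parcel_number = cte.parcel_number
WHERE t.tax_year = '2017' AND t.ex_head_of_family_active >= 1
ORDER BY t.parcel_number
"""

result = [row[0] for row in conn.execute(QUERY)]
print(result)  # ['P1']
```

Giving the table an alias (t) in the outer query also avoids the "same exposed names" error from the question.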

Calculate a Running Monthly Average in SQL Server

We want to create a data set which shows the monthly average count in our equipment table, broken down by its status: Active, Scrapped, New.
The more I ponder this it seems that the only way to accomplish this is to first create a container temp table and evaluate each record using a cursor.
Can this be accomplished without a temp table?
The following just shows the fields we're working with:
SELECT a1.statusdate, a1.CreateDate,
RunningTotalActive = count([status]='Active'),
RunningTotalScrapped = count([status]='Scrapped'),
NewEquipment = count(Month(a1.CreateDate) )
FROM dbo.Equipment AS a1
INNER JOIN dbo.Equipment AS a2
ON a2.statusdate <= a1.CreateDate
GROUP BY a1.statusdate
ORDER BY a1.statusdate desc
I'm making a few SWAGs about your data, but the idea is to average the sums over simple counts by month; using CTEs.
; WITH A AS (
SELECT a1.statusdate,
Active = CASE a1.[status] WHEN 'Active' THEN 1 ELSE 0 END,
Scrapped = CASE a1.[status] WHEN 'Scrapped' THEN 1 ELSE 0 END,
New = CASE WHEN a2.statusdate = a1.CreateDate THEN 1 ELSE 0 END --Guessing here that "new" means status date and create date are the same
FROM dbo.Equipment AS a1
INNER JOIN dbo.Equipment AS a2
ON a2.statusdate <= a1.CreateDate --"status" can be older than "create" for a piece of equipment? Not sure I understand this criteria. May need sample data.
), B AS (
SELECT Y = DATEPART(YEAR, statusdate)
, M = DATEPART(MONTH, statusdate)
, SumActive = SUM(Active)
, SumScrapped = SUM(Scrapped)
, SumNew = SUM(New)
FROM A
GROUP BY DATEPART(YEAR, statusdate), DATEPART(MONTH, statusdate)
)
SELECT Y, M,
RunningTotalActive = AVG(SumActive)OVER(PARTITION BY Y,M ORDER BY Y,M),
RunningTotalScrapped = AVG(SumScrapped)OVER(PARTITION BY Y,M ORDER BY Y,M),
NewEquipment = AVG(SumNew)OVER(PARTITION BY Y,M ORDER BY Y,M)
FROM B;
Can you show some sample data?
I have modified your script based on my understanding. Can you try it?
SELECT YEAR(a1.statusdate) AS yr,MONTH(a1.statusdate) AS mon,
RunningTotalActive = count(CASE WHEN a1.[status]='Active' THEN 1 ELSE NULL END ), -- or SUM(CASE WHEN [status]='Active' THEN 1 ELSE 0 END ),
RunningTotalScrapped = count(CASE WHEN a1.[status]='Scrapped' THEN 1 ELSE NULL END),
NewEquipment = count(CASE WHEN YEAR(a1.CreateDate)*12+ MONTH(a1.CreateDate)= YEAR(a1.statusdate)*12+ MONTH(a1.statusdate) THEN 1 ELSE NULL END )
FROM dbo.Equipment AS a1
GROUP BY YEAR(a1.statusdate),MONTH(a1.statusdate)
ORDER BY YEAR(a1.statusdate),MONTH(a1.statusdate) DESC
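The conditional-aggregation idea in this script can be tried out without a SQL Server instance. Below is a rough sketch on an in-memory SQLite database through Python: strftime('%Y-%m', ...) stands in for the YEAR()/MONTH() pair, SUM(CASE ... THEN 1 ELSE 0 END) is the alternative mentioned in the comment above, and the four equipment rows are invented purely for the test.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Equipment (status TEXT, statusdate TEXT, CreateDate TEXT)")
conn.executemany(
    "INSERT INTO Equipment VALUES (?, ?, ?)",
    [
        ("Active",   "2020-01-10", "2020-01-05"),  # created in the same month
        ("Active",   "2020-01-20", "2019-12-01"),
        ("Scrapped", "2020-01-25", "2019-06-01"),
        ("Active",   "2020-02-03", "2020-02-01"),  # created in the same month
    ],
)

QUERY = """
SELECT strftime('%Y-%m', statusdate) AS ym,
       SUM(CASE WHEN status = 'Active'   THEN 1 ELSE 0 END) AS TotalActive,
       SUM(CASE WHEN status = 'Scrapped' THEN 1 ELSE 0 END) AS TotalScrapped,
       SUM(CASE WHEN strftime('%Y-%m', CreateDate) = strftime('%Y-%m', statusdate)
                THEN 1 ELSE 0 END) AS NewEquipment
FROM Equipment
GROUP BY strftime('%Y-%m', statusdate)
ORDER BY ym
"""

monthly = conn.execute(QUERY).fetchall()
for row in monthly:
    print(row)
# ('2020-01', 2, 1, 1)
# ('2020-02', 1, 0, 1)
```

This produces per-month counts only; averaging or accumulating them across months would be a further step on top of this result.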

SQL query - need to exclude if Requirement NOT met, and exclude if Disqualifier IS met

I have a feeling that once I see the solution I'll slap my forehead, but right now I'm not seeing it.
I have a lookup table, say TableB, which looks like this. All fields are INT except the last two, which are BOOL.
ID, TableA_ID, Value, Required, Disqualifies
I have a list of TableA_Id values (1, 2, 3, etc.)
For each record in this table, either Required can be true or Disqualifies can be true; they can't both be true at the same time. They can both be false or null, though. There can be duplicate values of TableA_Id, but there should never be duplicates of the TableA_Id and Value combination.
If Required is true for any of those TableA_ID values, and none of those values are in my list, return no records. If none of the values are marked as required (Required = 0 or null), then return records UNLESS any of the values are marked as Disqualifies and are in the list, in which case I want to return no records.
So: if a field is required and I don't have it, don't return any records. If a field is marked as disqualified and I have it, don't return any records. Only return a record if either I have a required value, or I don't have a disqualified value, or there are no required values.
I hope I explained myself clearly.
Thanks in advance for pointing me in the right direction.
As an example of what my records might look like:
ID TableA_ID Value Required Disqualifies
-- --------- ----- -------- ------------
1 123 1 True False
2 123 2 True False
3 123 3 False False
4 123 4 False True
5 456 1 False True
6 456 2 False False
Given this set of sample data, if we're working with TableA_Id 123 and my list of values is, let's say, 1 and 3, I would get data returned because I have a required value and don't have any disqualified values. If my list of values were just 3, I'd get no records since I have none of the Required values. If my list of values were 1 and 4, I'd get no records because 4 is marked as disqualified.
Now if we're working with TableA_Id 456, the only list of values that would return any records is 2.
Maybe I should post the whole SQL query. I was trying to keep this short to make it easier for everyone, but it looks like maybe that's not working so well.
Here is the full dynamically generated query. The bit I am working on now is the second line from the bottom. To equate this to my example, t.id would be TableA_ID and Value would be PDT_ID.
SELECT DISTINCT t.ID, t.BriefTitle, stat.Status, lstat.Status AS LocationStatus, st.SType, t.LAgency, l.City, state.StateCode
,( SELECT TOP 1 UserID
FROM TRecruiter
WHERE TrialID = t.ID AND Lead = 1 ), l.ID as LocationID
, l.WebBased
FROM Trial t
INNER JOIN Location l ON t.ID = l.TrialID
FULL JOIN pdt on t.ID = pdt.trialid
FULL JOIN pdm on t.ID = pdm.TrialID
FULL JOIN s on t.ID = s.TrialID
FULL JOIN hy on t.ID = hy.TrialID
FULL JOIN ta on t.ID = ta.TrialID
FULL JOIN stt on t.ID = stt.TrialID
FULL JOIN [Status] stat ON t.StatusID = stat.ID
FULL JOIN st ON t.StudyTypeID = st.ID
FULL JOIN State state ON l.StateID = state.ID
FULL JOIN [Status] lstat ON l.StatusID = lstat.ID
FULL JOIN ts ON t.ID = ts.TrialID
FULL JOIN tpdm ON t.ID = tpdm.TrialID
WHERE ((t.ID IS NOT NULL)
AND (EligibleHealthyVolunteers IS NULL OR EligibleHealthyVolunteers = 1 OR (0 = 0 AND EligibleHealthyVolunteers = 0))
AND (eligiblegenderid is null OR eligiblegenderid = 1 OR eligiblegenderid = 3)
AND ((EligibleMinAge <= 28 AND EligibleMaxAge >= 28) OR (EligibleMinAge <= 28 AND EligibleMaxAge is null) OR (EligibleMinAge IS NULL AND EligibleMaxAge >= 28))
AND (HYID = 6 AND (hy.Disqualify = 0 OR hy.Disqualify IS NULL AND NOT EXISTS (SELECT * FROM hy WHERE t.id = hy.TrialID AND hy.Req =1)) OR HYID = 6 AND hy.req = 1)
AND (PDT_ID IN (1) AND ( pdt.Disqualify = 0 OR pdt.Disqualify IS NULL AND NOT EXISTS (select * from pdt where t.id = pdt.TrialID AND pdt.Req = 1)) OR PDT_ID IN (1) AND (pdt.Req = 1 AND (pdt.Disqualify = 0 or pdt.Disqualify is null )))
) AND ((3959 * acos(cos(radians(34.18)) * cos(radians(l.Latitude)) * cos(radians(l.Longitude) - radians(-118.46)) + sin(radians(34.18)) * sin(radians(l.Latitude)))) <= 300 OR l.Latitude IS NULL) AND t.IsPublished = 1 AND (t.StatusID = 1 OR t.StatusID = 2)
I've changed/shortened some table names just for security/privacy reasons.
Edit:
I think I am close to getting this working, but I'm getting tripped up on the logic again.
I have the following bit of sql:
AND ( exists (SELECT * FROM pdt WHERE Req = 1 AND trialid = t.id AND pdT_ID IN (2) ) AND EXISTS (SELECT * FROM pdt WHERE Req = 1 AND trialid = t.id ) )
I'm not sure how to structure this. Those two EXISTS clauses should make the whole thing true for the following combinations:
True & False
True & True
False & False
If it's False & True, then the whole thing is false. In other words if there is a Req =1 AND the PDT_ID that is marked as Req=1 is not in our list (in the example above the list just contains '2') then return false.
EDIT:
I think i finally got it.
AND NOT EXISTS (SELECT * FROM pdt WHERE Disqualify = 1 AND trialid = t.id AND PDT_ID IN (2) )
AND NOT ( NOT exists (SELECT * FROM pdt WHERE Req = 1 AND trialid = t.id AND PDT_ID IN (2) ) AND EXISTS (SELECT * FROM pdt WHERE Req = 1 AND trialid = t.id ) )
So far this seems to work in testing, although I'm only working with two values of PDT_ID. If this does resolve my problem, I will come back and give someone the credit for helping me.
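For anyone wanting to sanity-check the rule outside the database, here is a small Python sketch of the same two conditions (no disqualifying value in the list, and not "required values exist but none of them is in the list"). It is only a model of the logic, not the query itself; the function name and tuple layout are made up, and the rows are the sample table from above.

```python
# (ID, TableA_ID, Value, Required, Disqualifies) rows from the sample data.
TABLE_B = [
    (1, 123, 1, True,  False),
    (2, 123, 2, True,  False),
    (3, 123, 3, False, False),
    (4, 123, 4, False, True),
    (5, 456, 1, False, True),
    (6, 456, 2, False, False),
]

def returns_records(table_b, tablea_id, my_values):
    """True if records should be returned for tablea_id given my list of values."""
    mine = set(my_values)
    rows = [r for r in table_b if r[1] == tablea_id]
    required = {r[2] for r in rows if r[3]}
    disqualifying = {r[2] for r in rows if r[4]}
    if mine & disqualifying:
        return False  # I have a disqualifying value
    if required and not (mine & required):
        return False  # something is required and I have none of it
    return True

print(returns_records(TABLE_B, 123, [1, 3]))  # True
print(returns_records(TABLE_B, 123, [3]))     # False
print(returns_records(TABLE_B, 123, [1, 4]))  # False
print(returns_records(TABLE_B, 456, [2]))     # True
```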
SELECT *
FROM TABLEB B
WHERE
(
B.REQUIRED = 1
AND EXISTS
(
SELECT 1
FROM TABLEA A
WHERE A.ID =B.TABLEA_ID
)
)
OR
(
B.REQUIRED != 1
AND B.DISQUALIFIES <> 1
)
OR
(
B.REQUIRED != 1
AND B.DISQUALIFIES = 1
AND EXISTS
(
SELECT 1
FROM TABLEA A
WHERE A.ID =B.TABLEA_ID
)
)
UPDATE - after the EDIT and explanation from OP:
Change the line
FULL JOIN pdt on t.ID = pdt.trialid
To
FULL JOIN (SELECT * FROM pdt BB WHERE
BB.TrialID IN (SELECT AA.ID FROM Trial AA WHERE AA.ID = BB.TrialID) AND
1 > (SELECT COUNT(*) FROM Trial A
LEFT OUTER JOIN pdt B ON B.Req != 1 AND B.Disqualify != 1 AND B.TrialID = A.ID
WHERE B.TrialID IS NULL)) pdt ON t.ID = pdt.TrialID
AND change the line before last from
AND (PDT_ID IN (1) AND ( pdt.Disqualify = 0 OR pdt.Disqualify IS NULL AND NOT EXISTS (select * from pdt where t.id = pdt.TrialID AND pdt.Req = 1)) OR PDT_ID IN (1) AND (pdt.Req = 1 AND (pdt.Disqualify = 0 or pdt.Disqualify is null )))
To
AND PDT_ID IN (1)
(You seem to have found a solution, yet I've decided to share my thoughts about this problem anyway.)
Given you've got a set of TableA IDs, each of which is accompanied by a set of some values, and you want to test the entire row set against this TableB thing using the rules you've set forth, I think the entire checking process might look like this:
1. Match every pair of TableA.ID and Value against TableB and get aggregate maximums of Required and Disqualifies for every TableA.ID along the way.
2. Derive a separate list of TableA_ID values with their corresponding maximum values of Required, from TableB. That will be for us to know whether a particular TableA_ID must have a required value at all.
3. Match the row set obtained at Stage 1 against the derived table (Stage 2) and check the aggregate values:
   1) if the actual aggregate Disqualifies for a TableA_ID is 1, discard this TableA_ID set;
   2) if a TableA_ID has a match in the Stage 2 derived table and the aggregate maximum of Required that we obtained at Stage 1 doesn't match the maximum Required in the derived table, discard the set as well.
Something tells me that it would be better at this point to move on to some sort of illustration. Here's a sample script, with comments explaining which part of the script implements which part of the description above:
;
WITH
/* this is the row set to be tested and which
is supposed to contain TableA.IDs and Values */
testedRowSet AS (
SELECT
TableA.ID AS TableA_ID,
SomethingElse.TestedValue AS Value,
...
FROM TableA
JOIN SomethingElse ON some_condition
...
),
/* at this point, we are getting the aggregate maximums
of TableB.Required and TableB.Disqualifies for every
TableA_ID in testedRowSet */
aggregated AS (
SELECT
testedRowSet.TableA_ID,
testedRowSet.Value,
...
DoesHaveRequiredValues = MAX(CASE TableB.Required WHEN 1 THEN 1 ELSE 0 END) OVER (PARTITION BY testedRowSet.TableA_ID),
HasDisqualifyingValues = MAX(CASE TableB.Disqualifies WHEN 1 THEN 1 ELSE 0 END) OVER (PARTITION BY testedRowSet.TableA_ID)
FROM testedRowSet
LEFT JOIN TableB ON testedRowSet.TableA_ID = TableB.TableA_ID
AND testedRowSet.Value = TableB.Value
),
/* this row set will let us see whether a particular
TableA_ID must have a required value */
properties AS (
SELECT
TableA_ID,
MustHaveRequiredValues = MAX(CASE Required WHEN 1 THEN 1 ELSE 0 END)
FROM TableB
GROUP BY TableA_ID
),
/* this is where we are actually checking the previously
obtained aggregate values of Required and Disqualifies */
tested AS (
SELECT
aggregated.TableA_ID,
aggregated.Value,
...
FROM aggregated
LEFT JOIN properties ON aggregated.TableA_ID = properties.TableA_ID
WHERE aggregated.HasDisqualifyingValues = 0
AND (properties.TableA_ID IS NULL
OR properties.MustHaveRequiredValues = aggregated.DoesHaveRequiredValues)
)
SELECT * FROM tested
