ORACLE SQL: I've got a crude solution to a problem using LEAD and CASE WHEN but would like a more elegant solution - loops

Goal: Given a table of patient records that show entry and exit dates per facility, I want to find the Original_entrydate for each record in the table.
Consider this table, which is sorted by entry_date descending and grouped by ID:
ID
Facility
entry_date
exit_date
original_entrydate
003246
C
2022-03-22
null
2012-10-01
003246
B
2015-07-24
2022-03-22 (matches row 1 entry_date)
2012-10-01
003246
A
2012-10-01
2015-07-24 (matches row 2 entry_date)
2012-10-01
003246
D
2001-02-02
2010-04-05 (does NOT match row 3 entry_date!)
2001-02-02
Starting from the earliest record, Patient #003246
Entered Facility D on 2/2/2001 and left on 4/5/2010.
Entered Facility A on 10/1/2012 and left on 7/24/2015 - so there was a gap in service between 2010 & 2012.
Entered Facility B on 7/24/2015 and left on 3/22/2022
Entered Facility C on 3/22/2022 and remains there today.
As you can see, when the individual entered Facility A, there has been no gap in service between 10/1/2012 and today--even though the patient switched facilities, they have been a resident since 10/1/2012--therefore 10/1/2012 is the original_entrydate for the first three records in the table, but not the 4th record, since there was a gap in service between Facilities D and A.
I managed to write a query that gave me what I want, but it's extremely clunky and will fail if a resident appears more than 4 times in the table (because my LEADs only go 4 rows deep--sufficient for my test table but not necessarily for the actual data).
SELECT
t1.resident_key,
t1.lidda,
t1.facility_key,
t1.admit_dt_key,
t1.separation_dt_key,
CASE
WHEN prev_sep1 IS NULL
THEN admit_dt_key
WHEN admit_dt_key != prev_sep1
THEN admit_dt_key
WHEN admit_dt_key = prev_sep1
THEN CASE
WHEN prev_sep2 IS NULL
THEN prev_admit1
WHEN prev_admit1 != prev_sep2
THEN prev_admit1
WHEN prev_admit1 = prev_sep2
THEN CASE
WHEN prev_sep3 IS NULL
THEN prev_admit2
WHEN prev_admit2 != prev_sep3
THEN prev_admit2
WHEN prev_admit2 = prev_sep3
THEN CASE
WHEN prev_sep4 IS NULL
THEN prev_admit3
WHEN prev_admit3 != prev_sep4
THEN prev_admit3
END
END
END
END
as Original_AdmDt
FROM
(SELECT my_table.*,
LEAD (separation_dt_key, 1) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_sep1,
LEAD (admit_dt_key, 1) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_admit1,
LEAD (separation_dt_key, 2) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_sep2,
LEAD (admit_dt_key, 2) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_admit2,
LEAD (separation_dt_key, 3) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_sep3,
LEAD (admit_dt_key, 3) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_admit3,
LEAD (separation_dt_key, 4) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_sep4,
LEAD (admit_dt_key, 4) OVER (
PARTITION BY resident_key
ORDER BY admit_dt_key DESC
) prev_admit4
FROM my_table) t1
Can I make this iterative, to deal with however many times a unique ID may appear in the table?

SELECT t1.date1, t2.date2,
CASE
WHEN t1.date1 < t2.date2 THEN t1.date1
ELSE t2.date2
END AS result
FROM table1 t1
JOIN table1 t2
ON t1.id < t2.id
WHERE t1.date1 < t2.date2
ORDER BY t1.id, t2.id;

Related

Custom Sort Order in CTE

I need to get a custom sort order in a CTE but the error shows
"--The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified."
What's a better way to get the custom order in the CTE?
WITH
ctedivisiondesc
as
(
SELECT * FROM (
SELECT --TOP 1 --[APPID]
DH1.[ID_NUM]
--,[SEQ_NUM_2]
--,[CUR_DEGREE]
--,[NON_DEGREE_SEEKING]
,DH1.[DIV_CDE]
,DDF.DEGREE_DESC 'DivisionDesc'
--,[DEGR_CDE]
--,[PRT_DEGR_ON_TRANSC]
--,[ACAD_DEGR_CDE]
,[DTE_DEGR_CONFERRED]
--,MAX([DTE_DEGR_CONFERRED]) AS Date_degree_conferred
,ROW_NUMBER() OVER (
PARTITION BY [ID_NUM]
ORDER BY [DTE_DEGR_CONFERRED] DESC --Getting last degree
) AS [ROW NUMBER]
FROM [TmsePrd].[dbo].[DEGREE_HISTORY] As DH1
inner join [TmsePrd].[dbo].[DEGREE_DEFINITION] AS DDF
on DH1.[DEGR_CDE] = DDF.[DEGREE]
--ORDER BY
--DIV_CDE Level
--CE Continuing Education
--CT Certificate 1
--DC Doctor of Chiropractic 4
--GR Graduate 3
--PD Pending Division
--UG Undegraduate 2
--The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
ORDER BY CASE
WHEN DDF.DEGREE_DESC = 'Certificate' THEN 1
WHEN DDF.DEGREE_DESC = 'Undegraduate' THEN 2
WHEN DDF.DEGREE_DESC = 'Graduate' THEN 3
WHEN DDF.DEGREE_DESC = 'Doctor of Chiropractic' THEN 4
ELSE 5
END
) AS t
WHERE [ROW NUMBER] <= 1
)
SELECT * FROM ctedivisiondesc
You need to sort the outer query.
Sorting a subquery is not allowed because it is meaningless, consider this simple example:
WITH CTE AS
( SELECT ID
FROM (VALUES (1), (2)) AS t (ID)
ORDER BY ID DESC
)
SELECT *
FROM CTE
ORDER BY ID ASC;
The ordering on the outer query has overridden the ordering on the inner query rendering it a waste of time.
It is not just about explicit sorting of the outer query either, in more complex scenarios SQL Server may sort the subqueries any which way it wishes to enable merge joins or grouping etc. So the only way to guarantee the order or a result is to order the outer query as you wish.
Since you may not have all the data you need in the outer query, you may would probably need to create a further column inside the CTE to use for sorting. e.g.
WITH ctedivisiondesc AS
(
SELECT *
FROM ( SELECT DH1.ID_NUM,
DH1.DIV_CDE,
DDF.DEGREE_DESC AS DivisionDesc,
DTE_DEGR_CONFERRED,
ROW_NUMBER() OVER (PARTITION BY ID_NUM ORDER BY DTE_DEGR_CONFERRED DESC) AS [ROW NUMBER],
CASE
WHEN DDF.DEGREE_DESC = 'Certificate' THEN 1
WHEN DDF.DEGREE_DESC = 'Undegraduate' THEN 2
WHEN DDF.DEGREE_DESC = 'Graduate' THEN 3
WHEN DDF.DEGREE_DESC = 'Doctor of Chiropractic' THEN 4
ELSE 5
END AS SortOrder
FROM TmsePrd.dbo.DEGREE_HISTORY AS DH1
INNER JOIN TmsePrd.dbo.DEGREE_DEFINITION AS DDF
ON DH1.DEGR_CDE = DDF.DEGREE
) AS t
WHERE t.[ROW NUMBER] <= 1
)
SELECT ID_NUM,
DIV_CDE,
DivisionDesc,
DTE_DEGR_CONFERRED
FROM ctedivisiondesc
ORDER BY SortOrder;

Getting more than two counts in one query

scenario: I have table A and Table B. Both have the primary key of Rep_time_frame. We replicate data by terms. however, that process at times was repeated.
task: Match the source counts with the destination table counts.
problem: I can match the rows of the source table but not the destination table because the replication for some terms happened twice. So, my counts are off.
SELECT d.REPT_TIME_FRAME,
c.SOURCE_TOTAL_RECORDS,
COUNT(*) TOTAL_RECORDS,
CASE
WHEN COUNT(*) > c.SOURCE_TOTAL_RECORDS THEN c.SOURCE_TOTAL_RECORDS
ELSE COUNT(*)
END FINAL_TOTAL_RECORDS,
***CASE
WHEN EXISTS (SELECT REPT_TIME_FRAME
FROM TableA_Counts
WHERE REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY REPT_TIME_FRAME
HAVING COUNT(*) > 1) THEN 'UPLOADED MORE THAN ONCE'
ELSE'UPLOADED ONCE'
END NUMBER_OF_REPLICATIONS***
FROM TableA d
INNER JOIN TableA_Counts c ON (d.REPT_TIME_FRAME = c.REPT_TIME_FRAME)
WHERE d.REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY d.REPT_TIME_FRAME, c.SOURCE_TOTAL_RECORDS```
please note that the bolded area, when ran returns for the newly made column Number_of_replications "Uploaded More than once" on all counts. For the purposes of this example. Here are the facts.
201705 was replicated 2X
201801 once and 201708 once.
expected result. I want to have 201705 'Uploaded more than once', and the rest 201801 and 201708 'Uploaded Once'
Does this help?
http://sqlfiddle.com/#!18/8692a/1
I always try and get the counts then do a case statement on the final figures. I've based that SqlFiddle on the notion that TableA_Counts is the table with more than one entry for 201705. Is that correct?
CREATE TABLE TableA (REPT_TIME_FRAME int)
INSERT INTO TableA(REPT_TIME_FRAME)VALUES
(201705),
(201708),
(201801),
(201706),
(201710),
(201803 )
CREATE TABLE TableA_Counts (REPT_TIME_FRAME int)
INSERT INTO TableA_Counts(REPT_TIME_FRAME)VALUES
(201705),
(201708),
(201705),
(201801),
(201706),
(201710),
(201803)
SELECT
d.REPT_TIME_FRAME,count(*) as TOTAL_RECORDS,SOURCE_TOTAL_RECORDS,
CASE WHEN SOURCE_TOTAL_RECORDS > 1 then 'more than' else 'just one' end as NUMBER_OF_REPLICATIONS
FROM
TableA d
LEFT JOIN
(SELECT REPT_TIME_FRAME ,count(*) as SOURCE_TOTAL_RECORDS from TableA_Counts group by REPT_TIME_FRAME) c ON (d.REPT_TIME_FRAME = c.REPT_TIME_FRAME)
group by d.REPT_TIME_FRAME,SOURCE_TOTAL_RECORDS
SELECT d.REPT_TIME_FRAME,
c.SOURCE_TOTAL_RECORDS,
COUNT(*) TOTAL_RECORDS,
CASE
WHEN COUNT(*) > c.SOURCE_TOTAL_RECORDS THEN c.SOURCE_TOTAL_RECORDS
ELSE COUNT(*)
END FINAL_TOTAL_RECORDS,
a.count AS NUM_OF_REPLICATIONS
FROM TableA d
INNER JOIN TableACounts c ON (d.REPT_TIME_FRAME = c.REPT_TIME_FRAME)
INNER JOIN (SELECT REPT_TIME_FRAME ,count(REPT_TIME_FRAME) as count
FROM TableAcounts
WHERE REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY REPT_TIME_FRAME) AS a ON (a.REPT_TIME_FRAME = d.REPT_TIME_FRAME)
WHERE d.REPT_TIME_FRAME IN (201705, 201708, 201801, 201706, 201710, 201803)
GROUP BY d.REPT_TIME_FRAME, c.SOURCE_TOTAL_RECORDS, a.count

Query running tally of open issues on a given day

I have been banging my head on this for a while now and I think I've drastically over-complicating things at this point. What I have is a table containing the fields
OpenDate
ClosedDate
Client
Contract
Service
What I need to turn that into is
Date
Client
Contract
Service
OpenedOnThisDay
OpenedYesterday
ClosedOnThisDay
ClosedYesterday
OpenAtStartOfTomorrow
OpenAtStartOfToday
For any given Date, there may or may not be any issues opened or closed ON that day. That day should still be included with 0's
I have come at this a number of ways and can produce one of the desired results at a time (opened on, closed on, Open at end of), but I cannot get them all at once, at least not without exponentially increasing the query time.
My queries currently as views are as follows
Opened On
select Cast(EntryDateTime as Date) as DateStamp
,ContractNumber
,Client
,services.Service
,sum(1) as Count
,lag(sum(1)) OVER (
partition by tickets.ContractNumber
,services.Service ORDER BY Cast(EntryDateTime as Date) ASC
) as CountDayBefore
from v_JiraImpactedServices as services
LEFT JOIN v_JiraTickets as tickets ON services.ticketnumber = tickets.TicketNumber
WHERE tickets.Client is not null
AND tickets.TicketNumber IS NOT NULL
and tickets.ContractNumber is not null
GROUP BY Cast(tickets.EntryDateTime as Date)
,tickets.ContractNumber
,tickets.Client
,services.Service;
Closed On
select Cast(ResolvedDateTime as Date) as DateStamp
,ContractNumber
,Client
,services.Service
,sum(1) as Count
,lag(sum(1)) OVER (
partition by tickets.ContractNumber
,services.Service ORDER BY Cast(ResolvedDateTime as Date) ASC
) as CountDayBefore
from v_JiraImpactedServices as services
LEFT JOIN v_JiraTickets as tickets ON services.ticketnumber = tickets.TicketNumber
WHERE tickets.Client is not null
and tickets.TicketNumber is not null
AND tickets.ContractNumber is not null
GROUP BY Cast(tickets.ResolvedDateTime as Date)
,tickets.ContractNumber
,tickets.Client
,services.Service;
Open On
SELECT calendar.FullDate as DateStamp
,tickets.ContractNumber
,tickets.client
,services.Service
,IsNull(count(tickets.TicketNumber), 0) as Count
,IsNull(lag(count(tickets.TicketNumber), 1) OVER (
partition by tickets.ContractNumber
,services.Service Order By FullDate ASC
), 0) as CountDayBefore
FROM v_Calendar as calendar
LEFT JOIN v_JiraTickets as tickets ON Cast(tickets.EntryDateTime as Date) <= calendar.FullDate
AND (
Cast(tickets.ResolvedDateTime as Date) > calendar.FullDate
OR tickets.ResolvedDateTime is null
)
LEFT JOIN v_JiraImpactedServices as services ON services.ticketnumber = tickets.TicketNumber
WHERE tickets.Client is not null
AND tickets.ContractNumber is not null
GROUP BY calendar.FullDate
,tickets.ContractNumber
,tickets.Client
,services.Service;
As I said each of these by itself gives ALMOST the desired results, but omits days with 0 values.
Aside from producing days with 0 values, I need to also combine these into a single table result. All attempts so far have either produced obviously wrong JOIN results, or takes an hour to execute.
I would be most grateful if someone could point me in the right direction here.
Just to give you an idea, although the fieldnames don't match your scenario, this is how I would approach this:
WITH
SourceData (ClientID, ContractID, ServiceID, DateStamp) AS (
SELECT a.ID, b.ID, c.ID, d.DateStamp
FROM clients a
JOIN contracts b ON a.ID = b.ClientID
JOIN [services] c ON b.ID = c.ContractID
CROSS JOIN calendar d
WHERE d.DateStamp >= DATEADD(day, -60, GETDATE())
)
SELECT d.DateStamp, s.ClientID, s.ContractID, s.ServiceID
, COUNT(CASE WHEN Cast(EntryDateTime as Date) = d.DateStamp THEN 1 END) AS OpenedOn
, COUNT(CASE WHEN Cast(ResolvedDateTime as Date) = d.DateStamp THEN 1 END) AS ClosedOn
, COUNT(CASE WHEN Cast(ResolvedDateTime as Date) > d.DateStamp OR ResolvedDateTime IS NULL AND EntryDateTime IS NOT NULL THEN 1 END) AS InProgress
FROM SourceData s
LEFT JOIN tickets t
ON s.ClientID = t.ClientID
AND s.ContractID = t.ContractID
AND s.ServiceID = t.ServiceID
AND s.DateStamp >= Cast(EntryDateTime as Date)
AND (s.DateStamp <= ResolvedDateTime OR ResolvedDateTime IS NULL)
GROUP BY d.DateStamp, s.ClientID, s.ContractID, s.ServiceID

How to combine fields from 2 columns to create a "matrix"?

I have a logging table in my application that only logs changed data, and leaves the other columns NULL. What I'm wanting to do now is create a view that takes 2 of those columns (Type and Status),
and create a resultset that returns the Type and Status on the entry of that log row, assuming that either one or both columns could be null.
For example, with this data:
Type Status AddDt
A 1 7/8/2013
NULL 2 7/7/2013
NULL 3 7/6/2013
NULL NULL 7/5/2013
B NULL 7/4/2013
C NULL 7/3/2013
C 4 7/2/2013
produce the resultset:
Type Status AddDt
A 1 7/8/2013
A 2 7/7/2013
A 3 7/6/2013
A 3 7/5/2013
B 3 7/4/2013
C 3 7/3/2013
C 4 7/2/2013
From there I'm going to figure out the first time in these results the Type and Status meet certain conditions, such as a Type of B and Status 3 (7/4/2013) and ultimately use that date in a calculation, so performance is a huge issue with this.
Here's what I was thinking so far, but it doesn't get me where I need to be:
SELECT
Type.TypeDesc
, Status.StatusDesc
, *
FROM
jw50_Item c
OUTER APPLY (SELECT TOP 10000 * FROM jw50_ItemLog csh WHERE csh.ItemID = c.ItemID AND csh.StatusCode = 'OPN' ORDER BY csh.AddDt DESC) [Status]
OUTER APPLY (SELECT TOP 10000 * FROM jw50_ItemLog cth WHERE cth.ItemID = c.ItemID AND cth.ItemTypeCode IN ('F','FG','NG','PF','SXA','AB') ORDER BY cth.AddDt DESC) Type
WHERE
c.ItemID = #ItemID
So with the help provided below, I was able to get where I needed. Here is my final solution:
SELECT
OrderID
, CustomerNum
, OrderTitle
, ItemTypeDesc
, ItemTypeCode
, StatusCode
, OrdertatusDesc
FROM
jw50_Order c1
OUTER APPLY (SELECT TOP 1 [DateTime] FROM
(SELECT c.ItemTypeCode, c.OrderStatusCode, c.OrderStatusDt as [DateTime] FROM jw50_Order c WHERE c.OrderID = c1.OrderID
UNION
select (select top 1 c2.ItemTypeCode
from jw50_OrderLog c2
where c2.UpdatedDt >= c.UpdatedDt and c2.ItemTypeCode is not null and c2.OrderID = c.OrderID
order by UpdatedDt DESC
) as type,
(select top 1 c2.StatusCode
from jw50_OrderLog c2
where c2.UpdatedDt >= c.UpdatedDt and c2.StatusCode is not null and c2.OrderID = c.OrderID
order by UpdatedDt DESC
) as status,
UpdatedDt as [DateTime]
from jw50_OrderLog c
where c.OrderID = c1.OrderID AND (c.StatusCode IS NOT NULL OR c.ItemTypeCode IS NOT NULL)
) t
WHERE t.ItemTypeCode IN ('F','FG','NG','PF','SXA','AB') AND t.StatusCode IN ('OPEN')
order by [DateTime]) quart
WHERE quart.DateTime <= #FiscalPeriod2 AND c1.StatusCode = 'OPEN'
Order By c1.OrderID
The union is to bring in the current data in addition to the log table data to create the resultset, since the current data maybe what meets the conditions required. Thanks again for the help guys.
Here is an approach that uses correlated subqueries:
select (select top 1 c2.type
from jw50_Item c2
where c2.AddDt >= c.AddDt and c2.type is not null
order by AddDt
) as type,
(select top 1 c2.status
from jw50_Item c2
where c2.AddDt >= c.AddDt and c2.status is not null
order by AddDt
) as status,
(select AddDt
from jw50_Item c
If you have indexes on jw50_item(AddDt, type) and jw50_item(AddDt, status), then the performance should be pretty good.
I suppose you want to "generate a history": for those dates that has some data missing, the next available data should be set.
Something similar should work:
Select i.AddDt, t.Type, s.Status
from Items i
join Items t on (t.addDt =
(select min(t1.addDt)
from Items t1
where t1.addDt >= i.addDt
and t1.Type is not null))
join Items s on (s.addDt =
(select min(s1.addDt)
from Items s1
where s1.addDt >= i.addDt
and s1.status is not null))
Actually I'm joining the base table to 2 secondary tables and the join condition is that we match the smallest row where the respective column in the secondary table is not null (and of course smaller than the current date).
I'm not absolutely sure that it will work, since I don't have an SQL Server in front of me but give it a try :)

Looking up the first row of results

I have a history table containing a snapshot of each time a record is changed. I'm trying to return a certain history row with the original captured date. I am currently using this at the moment:
select
s.Description,
h.CaptureDate OriginalCaptureDate
from
HistoryStock s
left join
( select
StockId,
CaptureDate
from
HistoryStock
where
HistoryStockId in ( select MIN(HistoryStockId) from HistoryStock group by StockId )
) h on s.StockId = h.StockId
where
s.HistoryStockId = #HistoryStockId
This works but with 1 Million records its on the slow side and I'm not sure how to optimize this query.
How can this query be optimized?
UPDATE:
WITH OriginalStock (StockId, HistoryStockId)
AS (
SELECT StockId, min(HistoryStockId)
from HistoryStock group by StockId
),
OriginalCaptureDate (StockId, OriginalCaptureDate)
As (
SELECT h.StockId, h.CaptureDate
from HistoryStock h join OriginalStock o on h.HistoryStockId = o.HistoryStockId
)
select
s.Description,
h.OriginalCaptureDate
from
HistoryStock s left join OriginalCaptureDate h on s.StockId = h.StockId
where
s.HistoryStockId = #HistoryStockId
I've update the code to use CTE but I'm not better off performance wise, only have small performance increase. Any ideas?
Just another note, I need to get to the first record in the history table for StockId and not the earliest Capture date.
I am not certain I understand entirely how the data works from your query but nesting queries like that is never good for performance in my opinion. You could try something along the lines of:
WITH MinCaptureDate (StockID, MinCaptureDate)
AS (
SELECT HS.StockID
,MIN(HS.CaptureDate) AS OriginalCaptureDate
FROM HistoryStock HS
GROUP BY
HS.Description
)
SELECT HS.Description
,MCD.OriginalCaptureDate
FROM HistoryStock HS
JOIN MinCaptureDate MCD
ON HS.StockID = MCD.StockID
WHERE HS.StockID = #StockID
I think i see what you are trying to achieve. You basically want the description of the specified history stock record, but you want the date associated with the first history record for the stock... so if your history table looks like this
StockId HistoryStockId CaptureDate Description
1 1 Apr 1 Desc 1
1 2 Apr 2 Desc 2
1 3 Apr 3 Desc 3
and you specify #HistoryStockId = 2, you want the following result
Description OriginalCaptureDate
Desc 2 Apr 1
I think the following query would give you a slightly better performance.
WITH OriginalStock (StockId, CaptureDate, RowNumber)
AS (
SELECT
StockId,
CaptureDate,
RowNumber = ROW_NUMBER() OVER (PARTITION BY StockId ORDER BY HistoryStockId ASC)
from HistoryStock
)
select
s.Description,
h.CaptureDate
from
HistoryStock s left join OriginalStock h on s.StockId = h.StockId and h.RowNumber = 1
where
s.HistoryStockId = #HistoryStockId

Resources