Retrieve records based on preference - sql-server

I have a table with sample data below.
PatId NetType
100 In
100 Out
100 NA
101 Out
101 NA
102 NA
103 In
When there are multiple NetType values for the same patient, return only the top one, prioritized in the order In, Out, NA. That is, when In/Out/NA are all available for a PatId, return only In; when only Out/NA are available, return only Out. If a PatId has a single row, return it as is. The output for the sample above should be:
PatId NetType
100 In
101 Out
102 NA
103 In

Use row_number() to rank each patient's rows by NetType priority, then keep only the first row per patient:
select
PatId, NetType
from (
select
PatId, NetType
, row_number() over (partition by PatId order by case NetType when 'In' then 1 when 'Out' then 2 else 3 end) rn
from
myTable
) t
where
rn = 1
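As a variation on the same idea, TOP (1) WITH TIES can take the place of the outer rn = 1 filter. This is only a sketch: the table variable @Pat is a hypothetical stand-in for the real table, loaded with the sample rows from the question.

```sql
-- @Pat is a hypothetical stand-in for the real table.
DECLARE @Pat TABLE (PatId int, NetType varchar(20));

INSERT INTO @Pat (PatId, NetType) VALUES
    (100, 'In'), (100, 'Out'), (100, 'NA'),
    (101, 'Out'), (101, 'NA'),
    (102, 'NA'),
    (103, 'In');

-- Rank each patient's rows by priority (In, Out, then anything else).
-- Every row ranked 1 ties for first place, so WITH TIES keeps one row per PatId.
SELECT TOP (1) WITH TIES PatId, NetType
FROM @Pat
ORDER BY ROW_NUMBER() OVER (
             PARTITION BY PatId
             ORDER BY CASE NetType WHEN 'In' THEN 1 WHEN 'Out' THEN 2 ELSE 3 END
         );
```

Because the outer ORDER BY is used up by the ranking expression, the order of the returned rows is not guaranteed. Wrap the query in a derived table if a specific order is needed.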

Similar to uzi's answer, but with the priority order held in a lookup table:
DECLARE @T AS TABLE (PatId int, NetType varchar(20));
insert into @T values
(100, 'In')
, (100, 'Out')
, (100, 'NA')
, (101, 'Out')
, (101, 'NA')
, (102, 'NA')
, (103, 'In');
DECLARE @O AS TABLE (ord int primary key, NetType varchar(20));
insert into @O values (1, 'In'), (2, 'Out'), (3, 'NA');
select tt.PatId, tt.NetType
from ( select t.*
, ROW_NUMBER() over (partition by PatId order by o.ord) as rn
from @T t
join @O o
on t.NetType = o.NetType
) tt
where tt.rn = 1;

Related

What is the most efficient way to replace values in a specific column of a table for this specific scenario?

I am using SQL Server 2014 and I have a table in my database called t1 (extract of only 2 columns shown below):
ResaID StayDate
100 2020-02-03
100 2020-02-04
100 2020-02-05
120 2020-04-06
120 2020-04-07
120 2020-04-08
120 2020-04-09
120 2020-04-10
I need to change the dates in the StayDate column based on the following information (extract shown exactly as provided):
ID StartDate EndDate
100 2020-06-04 2020-06-06
120 2021-03-01 2021-03-05
I have started writing my T-SQL query as follows (but it is getting quite tedious as I have to do it for more than 100 ResaID!):
USE MyDatabase
UPDATE t1
SET StayDate = CASE WHEN ResaID = 100 and StayDate = '2020-02-03' THEN '2020-06-04'
WHEN ResaID = 100 and StayDate = '2020-02-04' THEN '2020-06-05'
WHEN ResaID = 100 and StayDate = '2020-02-05' THEN '2020-06-06'
...
ELSE StayDate
END
Is there a more efficient way to tackle this problem?
You can use a recursive approach:
with r_cte as (
select id, convert(date, startdate) as startdate, convert(date, enddate) as enddate
from ( values (100, '2020-06-04', '2020-06-06'),
(120, '2021-03-01', '2021-03-05')
) t(id, startdate, enddate)
union all
select id, dateadd(day, 1, startdate), enddate
from r_cte c
where startdate < enddate
), r_cte_seq as (
select r_cte.*, row_number() over(partition by id order by startdate) as seq
from r_cte
), cte_seq as (
select t1.*, row_number() over (partition by ResaID order by staydate) as seq
from t1
)
update cs
set cs.staydate = rc.startdate
from cte_seq cs inner join
r_cte_seq rc
on rc.id = cs.ResaID and rc.seq = cs.seq;
Here is my approach to this problem. I would use a numbers table to generate a record for each date in the new range for each reservation ID. I would then partition this data by reservation ID, ordered by the date. Doing the same partition logic on the existing data will allow records to be properly joined together.
I would then do a DELETE operation followed by an INSERT operation. This would leave you with the right number of records. The only manual step is to populate the auxiliary data for reservations whose date ranges have expanded. I expanded one of your new ranges to show this scenario.
I've marked where the setup for this demo ends in the code below. Everything below that is my intended solution that should be able to be implemented with your real tables.
--Ranges Table
CREATE TABLE #ranges
(
ID INT
,StartDate DATETIME
,EndDate DATETIME
)
CREATE TABLE #t1
(
ResaID INT
,StayDate DATETIME
,ColA INT
,ColB NVARCHAR(100)
,ColC BIT
)
INSERT INTO #t1
(
ResaID
,StayDate
,ColA
,ColB
,ColC
)
VALUES
(100, '2020-02-03', 1, 'A', 0)
,(100, '2020-02-04', 100, 'B', 1)
,(100, '2020-02-05', 255, 'C', 1)
,(120, '2020-04-06', 34, 'D', 1)
,(120, '2020-04-07', 67, 'E', 0)
,(120, '2020-04-08', 87, 'F', 0)
,(120, '2020-04-09', 545, 'G', 1)
,(120, '2020-04-10', 288, 'H', 0)
INSERT INTO #ranges
(
ID
,StartDate
,EndDate
)
VALUES
(100, '2020-06-04', '2020-06-07')
,(120, '2021-03-01', '2021-03-05')
--END DEMO SETUP
DROP TABLE IF EXISTS #numbers
DROP TABLE IF EXISTS #newRecords
--GENERATE NUMBERS TABLE
;WITH e1(n) AS
(
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL
SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
), -- 10
e2(n) AS (SELECT 1 FROM e1 CROSS JOIN e1 AS b), -- 10*10
e3(n) AS (SELECT 1 FROM e2 CROSS JOIN e2 AS b), -- 100*100
e4(n) AS (SELECT 1 FROM e3 CROSS JOIN (SELECT TOP 5 n FROM e1) AS b) -- 5*10000
SELECT ROW_NUMBER() OVER (ORDER BY n) as Num
INTO #numbers
FROM e4
ORDER BY n;
;with oldData --PARTITION THE EXISTING RECORDS
AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY ResaID ORDER BY STAYDATE) as ResPartID
FROM #t1
)
,newRanges --GENERATE YOUR NEW RANGES AND PARITITION
AS
(
select
r.ID
,CAST(n.num as DATETIME) as StayDate
,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY n.num) as ResPartID
from #ranges r
inner join #numbers n on CAST(r.StartDate as INT) <= n.Num AND CAST(r.EndDate as INT) >= n.Num
)
SELECT n.ID
,n.StayDate
,o.ColA
,o.ColB
,o.ColC
into #newRecords
FROM newRanges n
left join oldData o on n.ID = o.ResaID and n.ResPartID = o.ResPartID
--DELETE OLD RECORDS
DELETE t
FROM #t1 t
inner join #ranges r on t.ResaID = r.ID
--INSERT NEW DATA
INSERT INTO #t1
(
ResaID
,StayDate
,ColA
,ColB
,ColC
)
SELECT
ID
,StayDate
,ColA
,ColB
,ColC
FROM #newRecords
SELECT * FROM #t1
The following code converts the t1 dates into ranges and then uses the corresponding range dates to calculate new StayDate values. You can swap out the final select for one of the commented statements to see what is going on in the CTEs. The final select can be replaced with an update if you want to change the original table data.
-- Thanks to Aaron Hughes for setting up the sample data.
-- I changed the DateTime columns to Date .
--Ranges Table
CREATE TABLE #ranges
(
ID INT
,StartDate DATE
,EndDate DATE
)
CREATE TABLE #t1
(
ResaID INT
,StayDate DATE
,ColA INT
,ColB NVARCHAR(100)
,ColC BIT
)
INSERT INTO #t1
(
ResaID
,StayDate
,ColA
,ColB
,ColC
)
VALUES
(100, '2020-02-03', 1, 'A', 0)
,(100, '2020-02-04', 100, 'B', 1)
,(100, '2020-02-05', 255, 'C', 1)
,(120, '2020-04-06', 34, 'D', 1)
,(120, '2020-04-07', 67, 'E', 0)
,(120, '2020-04-08', 87, 'F', 0)
,(120, '2020-04-09', 545, 'G', 1)
,(120, '2020-04-10', 288, 'H', 0)
INSERT INTO #ranges
(
ID
,StartDate
,EndDate
)
VALUES
(100, '2020-06-04', '2020-06-07')
,(120, '2021-03-01', '2021-03-05');
with
-- Calculate the date range for each stay in #t1 .
ResaRanges as (
select ResaId, Min( StayDate ) as ResaStartDate, Max( StayDate ) as ResaEndDate
from #t1
group by ResaId ),
-- Match up the #t1 date ranges with the #ranges date ranges.
CombinedRanges as (
select RR.ResaId, RR.ResaStartDate, RR.ResaEndDate, DateDiff( day, RR.ResaStartDate, RR.ResaEndDate ) + 1 as ResaDays,
R.StartDate, R.EndDate, DateDiff( day, R.StartDate, R.EndDate ) + 1 as RangeDays,
DateDiff( day, RR.ResaStartDate, R.StartDate ) as DaysOffset
from ResaRanges as RR inner join
#ranges as R on R.ID = RR.ResaId )
-- Calculate the new StayDate values for all #t1 ranges that are not longer than the corresponding #range .
-- The difference between range starting dates is added to each StayDate .
select T.ResaId, T.StayDate, DateAdd( day, CR.DaysOffset, T.StayDate ) as NewStayDate
from #t1 as T inner join
CombinedRanges as CR on CR.ResaID = T.ResaID
where CR.RangeDays >= CR.ResaDays;
-- To see the steps you can use one of the following select statements to view the intermediate results:
-- select * from ResaRanges;
-- select * from CombinedRanges;
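If the goal is to change the table rather than preview the new dates, the same offset idea can be written as an UPDATE. This is a minimal sketch, assuming each reservation's dates are consecutive and the new range is as long as the old one. The table variables @stays and @newRanges are hypothetical stand-ins for t1 and the ranges table.

```sql
-- Hypothetical stand-ins for t1 and the table of new ranges.
DECLARE @stays TABLE (ResaID int, StayDate date);
DECLARE @newRanges TABLE (ID int, StartDate date, EndDate date);

INSERT INTO @stays (ResaID, StayDate) VALUES
    (100, '2020-02-03'), (100, '2020-02-04'), (100, '2020-02-05');

INSERT INTO @newRanges (ID, StartDate, EndDate) VALUES
    (100, '2020-06-04', '2020-06-06');

-- For each reservation, compute how many days its first stay date
-- has to move to land on the new start date.
WITH Offsets AS (
    SELECT s.ResaID,
           DATEDIFF(day, MIN(s.StayDate), r.StartDate) AS DaysOffset
    FROM @stays AS s
    INNER JOIN @newRanges AS r ON r.ID = s.ResaID
    GROUP BY s.ResaID, r.StartDate
)
-- Shift every stay date of the reservation by that same offset.
UPDATE s
SET StayDate = DATEADD(day, o.DaysOffset, s.StayDate)
FROM @stays AS s
INNER JOIN Offsets AS o ON o.ResaID = s.ResaID;

SELECT ResaID, StayDate FROM @stays ORDER BY ResaID, StayDate;
```

Reservations whose new range is longer or shorter than the old one are not handled here. They would need the RangeDays and ResaDays comparison from the query above.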

Update column with 4 consecutive purchases

I need to update the Result column to Yes for all of a user's rows if the user made 4 consecutive purchases without receiving a bonus in between. How can this be done? Please see my code below.
-- drop table #Test
CREATE TABLE #Test (UserID int, TheType VARCHAR(10), TheDate DATETIME, Result VARCHAR(10))
INSERT INTO #Test
SELECT 1234, 'Bonus', GETDATE(), NULL
UNION
SELECT 1234, 'Purchase', GETDATE()-1, NULL
UNION
SELECT 1234, 'Purchase', GETDATE()-2, NULL
UNION
SELECT 1234, 'Purchase', GETDATE()-3, NULL
UNION
SELECT 1234, 'Purchase', GETDATE()-4, NULL
UNION
SELECT 1234, 'Bonus', GETDATE()-5, NULL
UNION
SELECT 1234, 'Purchase', GETDATE()-6, NULL
UNION
SELECT 1234, 'Bonus', GETDATE()-7, NULL
SELECT * FROM #Test ORDER BY TheDate
Again, please note that the purchases need to be consecutive (By TheDate)
You can do it as below:
;WITH CTE1
AS
(
SELECT
ROW_NUMBER() OVER (ORDER BY TheDate) RowId,
ROW_NUMBER() OVER (PARTITION BY UserID,TheType ORDER BY TheDate) PurchaseRowId,
*
FROM #Test
), CTE2
AS
(
SELECT
MIN(A.RowId) MinId,
MAX(A.RowId) MaxId
FROM
CTE1 A
GROUP BY
A.TheType,
A.RowId - A.PurchaseRowId
)
SELECT
A.UserID ,
A.TheType ,
A.TheDate ,
CASE WHEN B.MinId IS NULL THEN NULL ELSE 'YES' END Result
FROM
CTE1 A LEFT JOIN
CTE2 B ON A.RowId >= B.MinId AND A.RowId <= B.MaxId AND (B.MaxId - B.MinId) > 2
--AND A.TheType = 'Purchase'
ORDER BY A.TheDate
Result:
UserID TheType TheDate Result
----------- ---------- ----------------------- ------
1234 Bonus 2017-06-06 11:06:03.130 NULL
1234 Purchase 2017-06-07 11:06:03.130 NULL
1234 Bonus 2017-06-08 11:06:03.130 NULL
1234 Purchase 2017-06-09 11:06:03.130 YES
1234 Purchase 2017-06-10 11:06:03.130 YES
1234 Purchase 2017-06-11 11:06:03.130 YES
1234 Purchase 2017-06-12 11:06:03.130 YES
1234 Bonus 2017-06-13 11:06:03.130 NULL
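The SELECT above flags only the rows inside the streak, while the question asks to update every row of a qualifying user. A rough sketch of that UPDATE, using the same row-number difference to find runs, might look like this. The table variable @Events is a hypothetical stand-in for #Test. User 5678 is invented here to show a user who does not qualify.

```sql
-- @Events is a hypothetical stand-in for #Test.
DECLARE @Events TABLE (UserID int, TheType varchar(10), TheDate date, Result varchar(10));

INSERT INTO @Events (UserID, TheType, TheDate) VALUES
    (1234, 'Bonus',    '2017-06-06'), (1234, 'Purchase', '2017-06-07'),
    (1234, 'Bonus',    '2017-06-08'), (1234, 'Purchase', '2017-06-09'),
    (1234, 'Purchase', '2017-06-10'), (1234, 'Purchase', '2017-06-11'),
    (1234, 'Purchase', '2017-06-12'), (1234, 'Bonus',    '2017-06-13'),
    -- Invented user with at most 2 purchases in a row.
    (5678, 'Purchase', '2017-06-06'), (5678, 'Purchase', '2017-06-07'),
    (5678, 'Bonus',    '2017-06-08'), (5678, 'Purchase', '2017-06-09'),
    (5678, 'Purchase', '2017-06-10');

WITH Numbered AS (
    -- Rows in the same unbroken run of one type share the same Grp value.
    SELECT UserID, TheType,
           ROW_NUMBER() OVER (PARTITION BY UserID ORDER BY TheDate)
         - ROW_NUMBER() OVER (PARTITION BY UserID, TheType ORDER BY TheDate) AS Grp
    FROM @Events
), Qualifying AS (
    -- Users with at least one run of 4 or more purchases.
    SELECT DISTINCT UserID
    FROM Numbered
    WHERE TheType = 'Purchase'
    GROUP BY UserID, Grp
    HAVING COUNT(*) >= 4
)
UPDATE e
SET Result = 'Yes'
FROM @Events AS e
INNER JOIN Qualifying AS q ON q.UserID = e.UserID;

SELECT * FROM @Events ORDER BY UserID, TheDate;
```

With this data, all rows for user 1234 end up with Result = 'Yes' and the rows for user 5678 stay NULL. Change the HAVING condition to = 4 if only streaks of exactly four should count.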
First derive a grouping column that identifies each run of consecutive rows of the same type, then group by it (having count = 4) and inner join back to the original table.
drop table if exists #Test;
create table #Test
(
UserID int
, TheType varchar(10)
, TheDate date
, Result varchar(10)
);
insert into #Test
select 1234, 'Bonus', getdate(), null
union
select 1234, 'Purchase', getdate() - 1, null
union
select 1234, 'Purchase', getdate() - 2, null
union
select 1234, 'Purchase', getdate() - 3, null
union
select 1234, 'Purchase', getdate() - 4, null
union
select 1234, 'Bonus', getdate() - 5, null
union
select 1234, 'Purchase', getdate() - 6, null
union
select 1234, 'Bonus', getdate() - 7, null;
drop table if exists #temp;
select
*
, lag(t.TheDate, 1) over ( order by t.TheDate ) as Lag01
, lag(t.TheType, 1) over ( order by t.TheDate ) as LagType
into
#temp
from #Test t;
with cteHierarchy
as
(
select
UserID
, TheType
, TheDate
, Result
, Lag01
, t.TheDate as Root
from #temp t
where t.LagType <> t.TheType
union all
select
t.UserID
, t.TheType
, t.TheDate
, t.Result
, t.Lag01
, cte.Root as Root
from #temp t
inner join cteHierarchy cte on t.Lag01 = cte.TheDate
and t.TheType = cte.TheType
)
update test
set
Result = 'Yes'
from (
select
t.Root
, count(t.UserID) as Cnt
, t.UserID
from cteHierarchy t
group by t.UserID, t.Root
having count(t.UserID) = 4
) tt
inner join #Test test on tt.UserID = test.UserID
select * from #Test t
order by t.TheDate;

SQL Server Ranking issue

I am trying to apply ranking to my data set the logic is as follows:
For each ID, order the rows by ID2 ascending and IsMaster descending, give the first row rank 1, and increase the rank only when the ID4 value changes.
My dataset and desired output looks like:
Test data
CREATE TABLE Test_Table
(ID INT ,ID2 INT, IsMaster INT, ID4 VARCHAR(10))
GO
INSERT INTO Test_Table (ID ,ID2 , IsMaster , ID4 )
VALUES
(1, 101, 1 ,'AAA') -- 1 <-- Desired output for rank
,(1, 102, 0 ,'AAA') -- 1
,(1, 103, 0 ,'AAB') -- 2
,(1, 104, 0 ,'AAB') -- 2
,(1, 105, 0 ,'CCC') -- 3
,(2, 101, 1 ,'AAA') -- 1
,(2, 102, 0 ,'AAA') -- 1
,(2, 103, 0 ,'AAA') -- 1
,(2, 104, 0 ,'AAB') -- 2
,(2, 105, 0 ,'CCC') -- 3
this is what I have tried so far:
SELECT *
,DENSE_RANK() OVER (PARTITION BY ID ORDER BY ID2 ASC, IsMaster DESC ) rn
FROM Test_Table
please please please help me thank you.
This is a gaps-and-islands problem.
First, use LAG() to check whether the previous row in the same partition has a different ID4.
It is important to also partition by IsMaster.
Then start a new island each time ID4 changes.
Finally, use a cumulative SUM() to get the proper rank.
WITH id4_change as (
SELECT *,
LAG(ID4) OVER (PARTITION BY ID, IsMaster ORDER BY ID2) as prev
FROM Test_Table
), islands as (
SELECT *,
CASE WHEN ID4 = PREV
THEN 0
ELSE 1
END as island
FROM id4_change
)
SELECT *,
SUM(island) OVER (PARTITION BY ID, IsMaster ORDER BY ID2) rank
FROM islands
ORDER BY ID, ID2, IsMaster DESC
;
In the output you can see that when ID4 = prev no new island is created, so the row keeps the same rank as the previous one.
EDIT: You can merge the first two queries into one:
WITH id4_change as (
SELECT *,
CASE WHEN ID4 = LAG(ID4) OVER (PARTITION BY ID, IsMaster ORDER BY ID2)
THEN 0
ELSE 1
END as island
FROM Test_Table
)
SELECT *,
SUM(island) OVER (PARTITION BY ID, IsMaster ORDER BY ID2) rank
FROM id4_change
ORDER BY ID, ID2, IsMaster DESC
;
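The queries above partition by IsMaster, which gives the desired ranks for this sample because the master row shares its ID4 with the first non-master row. If that cannot be relied on, one possible variation is to partition by ID only and put IsMaster DESC first in the window ordering. This is a sketch under that assumption, not a tested replacement. The table variable @Test is a hypothetical copy of Test_Table.

```sql
-- @Test is a hypothetical copy of Test_Table with the question's sample rows.
DECLARE @Test TABLE (ID int, ID2 int, IsMaster int, ID4 varchar(10));

INSERT INTO @Test (ID, ID2, IsMaster, ID4) VALUES
    (1, 101, 1, 'AAA'), (1, 102, 0, 'AAA'), (1, 103, 0, 'AAB'),
    (1, 104, 0, 'AAB'), (1, 105, 0, 'CCC'),
    (2, 101, 1, 'AAA'), (2, 102, 0, 'AAA'), (2, 103, 0, 'AAA'),
    (2, 104, 0, 'AAB'), (2, 105, 0, 'CCC');

WITH flagged AS (
    -- island = 1 on the first row of each ID and whenever ID4 differs
    -- from the previous row (master row first, then by ID2).
    SELECT *,
           CASE WHEN ID4 = LAG(ID4) OVER (PARTITION BY ID ORDER BY IsMaster DESC, ID2)
                THEN 0
                ELSE 1
           END AS island
    FROM @Test
)
SELECT ID, ID2, IsMaster, ID4,
       -- The running total of island starts is the rank.
       SUM(island) OVER (PARTITION BY ID
                         ORDER BY IsMaster DESC, ID2
                         ROWS UNBOUNDED PRECEDING) AS rnk
FROM flagged
ORDER BY ID, IsMaster DESC, ID2;
```

For the sample rows this returns ranks 1, 1, 2, 2, 3 for ID 1 and 1, 1, 1, 2, 3 for ID 2, matching the desired output in the question.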
Another way, probably less efficient, but it will work:
WITH X AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID2) RowNum
FROM dbo.Test_Table
)
, CTE_VehicleNumber
as
(
SELECT T.ID , T.ID2, t.IsMaster ,T.ID4 , t.RowNum , 1 as [Rank]
FROM X as T
WHERE T.IsMaster = 1
UNION ALL
SELECT T.ID, T.ID2, t.IsMaster ,T.ID4 , t.RowNum , CASE WHEN t.ID4 <> c.ID4 THEN 1+ C.[Rank]
ELSE 0+ C.[Rank]
END as [Rank]
FROM CTE_VehicleNumber as C
inner join X as T ON T.RowNum = C.RowNum + 1
AND t.ID = c.ID
)
SELECT ID , ID2, IsMaster ,ID4 , [Rank]
FROM CTE_VehicleNumber
ORDER BY ID , ID2, IsMaster ,ID4 , [Rank]
OPTION (MAXRECURSION 0);
Are you sure that your orders of ID2 and IsMaster affect the desired result, considering the rest of the data in ID and ID4?
I just tried to use the following code:
; WITH CTE AS (
SELECT DISTINCT ID, ID4, DENSE_RANK() OVER (ORDER BY ID4) Rnk
FROM #Test_Table
)
SELECT t.*, c.Rnk
FROM #Test_Table t
INNER JOIN CTE c ON t.ID = c.ID AND t.ID4 = c.ID4;
... and even with changing the order of ID2 and IsMaster I can't get it to "misbehave" - IF there's only one IsMaster = 1 per a group of ID4's and no duplicates in ID2.

Select top 2 distinct for each id and date

I have a table like this :
Table1:
[Id] [TDate] [Score]
1 1.1.00 50
1 1.1.00 60
2 1.1.01 50
2 1.1.01 70
2 1.3.01 40
3 1.1.00 80
3 1.1.00 30
3 1.2.00 40
My desired output should be like this:
[ID] [TDate] [Score]
1 1.1.00 60
2 1.1.01 70
2 1.3.01 40
3 1.1.00 80
3 1.2.00 40
So far, I have written this:
SELECT DISTINCT TOP 2 Id, TDate, Score
FROM
( SELECT Id, TDate, Score, ROW_NUMBER() over(partition by TDate order by Score) Od
FROM Table1
) A
WHERE A.Od = 1
ORDER BY Score
But it gives me :
[ID] [TDate] [Score]
2 1.1.01 70
3 1.1.00 80
of course I can do this:
"select top 2 ...where ID = 1"
and then:
union
`"Select top 2 ... where ID = 2"`
etc..
but I have a 100,000 of this..
Any way to generalize it to any Id?
Thank you.
WITH TOPTWO AS (
SELECT Id, TDate, Score, ROW_NUMBER()
over (
PARTITION BY TDate
order by SCORE
) AS RowNo
FROM [table_name]
)
SELECT * FROM TOPTWO WHERE RowNo <= 2
Your output doesn't make sense. Let me assume you want two rows per id. Then the query would look like:
SELECT Id, TDate, Score
FROM (SELECT Id, TDate, Score,
ROW_NUMBER() over (partition by id order by Score DESC) as seqnum
FROM Table1
) t
WHERE seqnum <= 2
ORDER BY Score;
Notes:
This assumes that you want two rows per id. Hence, id is in the PARTITION BY.
The WHERE now selects two rows per group in the PARTITION BY.
There is no need for SELECT DISTINCT in the outer query -- at least for this question.
Try this: partition by ID and TDate, and sort by Score in descending order.
ROW_NUMBER() over(partition by ID,TDate order by Score DESC) Od
Complete script
WITH CTE AS(
SELECT *,
ROW_NUMBER() over(partition by ID,TDate order by Score DESC) RN
FROM TableName
)
SELECT *
FROM CTE
WHERE RN = 1
Unless I am missing something this can be done with a simple group by
First I prepare a temp table for testing :
create table #table (ID int, TDate varchar(10), Score int)
insert into #Table values(1, '1.1.00', 50)
insert into #Table values(1, '1.1.00', 60)
insert into #Table values(2, '1.1.01', 50)
insert into #Table values(2, '1.1.01', 70)
insert into #Table values(2, '1.3.01', 40)
insert into #Table values(3, '1.1.00', 80)
insert into #Table values(3, '1.1.00', 30)
insert into #Table values(3, '1.2.00', 40)
Now let's do a select on this table:
select ID, TDate, max(Score) as Score
from #table
group by ID, TDate
order by ID, TDate
The result is this :
ID TDate Score
1 1.1.00 60
2 1.1.01 70
2 1.3.01 40
3 1.1.00 80
3 1.2.00 40
So all you need to do is change #table to your table name and you are done

Maximum and Minimum Rows Alternatively in SQL Server

This is an Employee table,
Id Name Salary
1 A.J 7000
2 B.S 30000
3 C.K 2000
4 D.O 10000
5 E.L 500
Now I want to display the highest salary first, then the lowest salary, then the 2nd highest, then the 2nd lowest, and so on up to the nth row.
Expected Output,
Id Name Salary
2 B.S 30000
5 E.L 500
4 D.O 10000
3 C.K 2000
1 A.J 7000
One more variant without explicit COUNT. SQL Fiddle.
Try also to add this row to sample data (6, 'X.Y', 7000) in the fiddle. The query still returns correct results.
CREATE TABLE #Employee (ID int, Name nvarchar(50), Salary money);
INSERT INTO #Employee (ID, Name, Salary) VALUES
(1, 'A.J', 7000),
(2, 'B.S', 30000),
(3, 'C.K', 2000),
(4, 'D.O', 10000),
(5, 'E.L', 500);
WITH
CTE
AS
(
SELECT *, NTILE(2) OVER (ORDER BY Salary, ID) AS n
FROM #Employee AS E
)
SELECT
*
,SIGN(n-1.5) AS s
,SIGN(n-1.5)*Salary AS ss
,ROW_NUMBER() OVER(PARTITION BY n ORDER BY SIGN(n-1.5)*Salary DESC) AS rn
FROM CTE
ORDER BY rn, ss DESC;
Result
ID Name Salary n s ss rn
2 B.S 30000.00 2 1.0 30000.00000 1
5 E.L 500.00 1 -1.0 -500.00000 1
4 D.O 10000.00 2 1.0 10000.00000 2
3 C.K 2000.00 1 -1.0 -2000.00000 2
1 A.J 7000.00 1 -1.0 -7000.00000 3
I left intermediary columns in the output to illustrate how it works.
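For comparison, the same alternating order can be sketched with two opposite ROW_NUMBER() rankings and no NTILE. This is an illustrative variant rather than the query above. The table variable @Emp and the column aliases hi and lo are hypothetical names.

```sql
-- @Emp is a hypothetical stand-in for the Employee table.
DECLARE @Emp TABLE (Id int, Name nvarchar(50), Salary money);

INSERT INTO @Emp (Id, Name, Salary) VALUES
    (1, 'A.J', 7000), (2, 'B.S', 30000), (3, 'C.K', 2000),
    (4, 'D.O', 10000), (5, 'E.L', 500);

SELECT Id, Name, Salary
FROM (
    SELECT Id, Name, Salary,
           -- hi: position counted from the highest salary
           ROW_NUMBER() OVER (ORDER BY Salary DESC, Id DESC) AS hi,
           -- lo: position counted from the lowest salary (exact reverse of hi)
           ROW_NUMBER() OVER (ORDER BY Salary ASC, Id ASC) AS lo
    FROM @Emp
) AS x
ORDER BY
    -- The nth highest and nth lowest rows share the same round number n.
    CASE WHEN hi <= lo THEN hi ELSE lo END,
    -- Within a round, the high-side row comes first.
    CASE WHEN hi <= lo THEN 0 ELSE 1 END;
```

With the sample rows this returns the salaries in the order 30000, 500, 10000, 2000, 7000. The Id tiebreaker keeps the two rankings exact mirrors of each other when salaries repeat.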
Using Row_Number() and Count()
declare @count int=(select count(1) from Employee);
with cte1 as
(
select ROW_NUMBER() over(order by salary desc) as rn,0 Sort,Id,Name,Salary, count(Id) over () cnt from Employee
union all
select ROW_NUMBER() over(order by salary) as rn,1 Sort,Id,Name,Salary, count(Id) over () cnt from Employee
)
select top (@count) Id,Name,Salary from cte1 where rn <= (floor(cnt/2) + cnt%2) order by rn,sort
Below is the solution:
--Create dummy employee table
CREATE TABLE tbl_Employee
(
Id INT,
Name VARCHAR(100),
Salary NUMERIC(9, 2)
)
GO
--Insert few dummy rows in the table
INSERT INTO tbl_Employee
(Id, Name, Salary)
VALUES(100, 'John', 7000),
(101, 'Scott', 30000),
(102, 'Jeff', 2000),
(103, 'Jimy', 10000),
(104, 'Andrew', 500),
(105, 'Alister', 100)
GO
--Get data as required
DECLARE @Cnt INT = 0, @SeqLimit INT = 0
SELECT @Cnt = COUNT(1) FROM tbl_employee
SET @SeqLimit = CEILING(@Cnt / 2.0)
SELECT * FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY Salary DESC) AS SEQ, Id, Name, Salary FROM tbl_employee
)DT1
WHERE SEQ <= @SeqLimit
UNION ALL
SELECT * FROM
(
SELECT ROW_NUMBER() OVER(ORDER BY Salary ASC) AS SEQ, Id, Name, Salary FROM tbl_employee
)DT2
WHERE SEQ <= @SeqLimit - (@Cnt % 2)
ORDER BY SEQ ASC, Salary DESC
The same can be achieved with different approaches and here you can find more on this:
http://www.sqlrelease.com/order-max-and-min-value-rows-alternatively-in-sql-server

Resources