SQL Server - Select most recent records with condition

I have a table like this.
Table :
ID EnrollDate ExitDate
1 4/1/16 8/30/16
2 1/1/16 null
2 1/1/16 7/3/16
3 2/1/16 8/1/16
3 2/1/16 9/1/16
4 1/1/16 12/12/16
4 1/1/16 12/12/16
4 1/1/16 12/12/16
4 1/1/16 null
5 5/1/16 11/12/16
5 5/1/16 11/12/16
5 5/1/16 11/12/16
I need to select the most recent record per ID under these conditions:
1. One and only one record has the most recent enroll date: select that record.
2. Two or more records share the same most recent enroll date, and one and only one of them has either a NULL exit date or the most recent exit date: select the record with the NULL exit date. If there is no NULL record, pick the record with the most recent exit date.
3. Two or more records share the same enroll date and exit date: if this case exists, don't select those records.
So the expected result for the above table should be :
ID EnrollDate ExitDate
1 4/1/16 8/30/16
2 1/1/16 null
3 2/1/16 9/1/16
4 1/1/16 null
I wrote the query with group by. I am not sure how to select with the conditions 2 and 3.
select t1.* from table t1
INNER JOIN(SELECT Id,MAX(EnrollDate) maxentrydate
FROM table
GROUP BY Id)t2 ON EnrollDate = t2.maxentrydate and t1.Id=t2.Id
Please let me know what is the best way to do this.

Using the rank() window function, I think it's possible.
This is untested, but it should work:
select t.ID, t.EnrollDate, t.ExitDate
from (select t.*,
rank() over(
partition by ID
order by EnrollDate desc,
case when ExitDate is null then 1 else 2 end,
ExitDate desc) as rnk
from tbl t) t
where t.rnk = 1
group by t.ID, t.EnrollDate, t.ExitDate
having count(*) = 1
The basic idea is that the rank() window function will rank the most "recent" rows with a value of 1, which we filter on in the outer query's where clause.
If more than one row shares the same "most recent" values, they all get a rank of 1, and the having count(*) = 1 clause then filters them out.
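To check the rank-then-filter idea against the question's sample data, here is a minimal sketch that runs the same query shape in SQLite through Python's sqlite3 module. SQLite stands in for SQL Server here (an assumption; it needs SQLite 3.25 or later for window functions), the dates are rewritten as ISO strings so that they sort correctly, and the name ranked for the derived table is made up for the example.

```python
import sqlite3

# Sample rows from the question; dates rewritten as ISO strings (assumption).
rows = [
    (1, "2016-04-01", "2016-08-30"),
    (2, "2016-01-01", None),
    (2, "2016-01-01", "2016-07-03"),
    (3, "2016-02-01", "2016-08-01"),
    (3, "2016-02-01", "2016-09-01"),
    (4, "2016-01-01", "2016-12-12"),
    (4, "2016-01-01", "2016-12-12"),
    (4, "2016-01-01", "2016-12-12"),
    (4, "2016-01-01", None),
    (5, "2016-05-01", "2016-11-12"),
    (5, "2016-05-01", "2016-11-12"),
    (5, "2016-05-01", "2016-11-12"),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl (ID INTEGER, EnrollDate TEXT, ExitDate TEXT)")
conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)

query = """
SELECT ID, EnrollDate, ExitDate
FROM (SELECT t.*,
             rank() OVER (
                 PARTITION BY ID
                 ORDER BY EnrollDate DESC,
                          CASE WHEN ExitDate IS NULL THEN 1 ELSE 2 END,
                          ExitDate DESC) AS rnk
      FROM tbl t) ranked
WHERE rnk = 1                      -- keep only the top-ranked rows per ID
GROUP BY ID, EnrollDate, ExitDate
HAVING count(*) = 1                -- drop IDs whose top rows are duplicates
ORDER BY ID
"""
result = conn.execute(query).fetchall()
for row in result:
    print(row)
```

ID 5 disappears because its three identical rows all tie at rank 1, while ID 4 survives because its NULL exit date row is alone at rank 1.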

Use ROW_NUMBER coupled with CASE expression to achieve the desired result:
WITH Cte AS(
SELECT t.*,
ROW_NUMBER() OVER(
PARTITION BY t.ID
ORDER BY
t.EnrollDate DESC,
CASE WHEN t.ExitDate IS NULL THEN 0 ELSE 1 END,
t.ExitDate DESC
) AS rn
FROM Tbl t
INNER JOIN (
SELECT
ID,
COUNT(DISTINCT CHECKSUM(EnrollDate, ExitDate)) AS DistinctCnt, -- Count distinct combination of EnrollDate and ExitDate per ID
COUNT(*) AS RowCnt -- Count number of rows per ID
FROM Tbl
GROUP BY ID
) a
ON t.ID = a.ID
WHERE
(a.DistinctCnt = 1 AND a.RowCnt = 1)
OR a.DistinctCnt > 1
)
SELECT
ID, EnrollDate, ExitDate
FROM Cte c
WHERE Rn = 1
The ORDER BY clause in the ROW_NUMBER takes care of conditions 2 and 3.
The INNER JOIN and the WHERE clause take care of 1 and 4.

with B as (
select id, enrolldate ,
exitdate,
row_number() over (partition by id order by enrolldate desc, case when exitdate is null then 0 else 1 end, exitdate desc) rn
from ab )
select b1.id, b1.enrolldate, b1.exitdate from b b1
left join b b2
on b1.rn = b2.rn -1 and
b1.id = b2.id and
b1.exitdate = b2.exitdate and
b1.enrolldate = b2.enrolldate
where b1.rn = 1 and
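b2.id is NULL
The left join is used to fulfill requirement 3. It looks for a second-ranked row with the same enroll and exit dates as the top row; when such a row is found, the top row is a duplicate and we don't want it, so only rows with no match (b2.id is NULL) are kept.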

Related

Getting Top 3 values for each id and status

I have data something like this,
ID Time Status
--- ---- ------
1 10 B
1 20 B
1 30 C
1 70 C
1 100 B
1 490 D
The desired result should be,
ID Time Status
1 490 D
1 100 B
1 70 C
That is, I need the top 3 Time values for each ID, with distinct statuses.
For this I tried:
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY TIME DESC) AS rn
FROM MyTable
)
SELECT id,TIME,Status
FROM cte
where rn<=3
But this doesn't meet my requirement: I am getting the top 3 rows with duplicate status values. How can I solve this?

Partition by status as well:
WITH cte AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY id, status
ORDER BY TIME DESC
) AS rn
FROM MyTable t
)
SELECT id, TIME, Status
FROM cte
WHERE rn <= 3;
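As a rough check against the question's sample data, the sketch below runs a variant of the partitioned query in SQLite through Python's sqlite3 module. It is a sketch under assumptions, not the answer's exact query: it keeps only the latest row per (id, Status) with rn = 1, uses LIMIT 3 because SQLite has no TOP, and relies on the sample containing a single id (LIMIT applies to the whole result, not per id).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (id INTEGER, Time INTEGER, Status TEXT)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?, ?)",
    [(1, 10, "B"), (1, 20, "B"), (1, 30, "C"),
     (1, 70, "C"), (1, 100, "B"), (1, 490, "D")],
)

query = """
WITH cte AS (
    SELECT t.*,
           row_number() OVER (PARTITION BY id, Status ORDER BY Time DESC) AS rn
    FROM MyTable t
)
SELECT id, Time, Status
FROM cte
WHERE rn = 1          -- latest row for each (id, Status) pair
ORDER BY Time DESC
LIMIT 3               -- SQLite spelling; SQL Server would use TOP (3)
"""
latest = conn.execute(query).fetchall()
for row in latest:
    print(row)
```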

The WITH TIES option of the TOP clause returns all of the rows that tie with the last of the top values:
select top (3) with ties id, Time, Status from table1 order by Time desc
Alternatively, if you wanted to return 3 values only, but make sure they are always the same 3 values, then you will need to use something else as a tie-breaker. In this case, it looks like your id column could be unique.
select top (3) id, Time, Status from table1 order by Time desc, id

Try this:
select distinct id,max(time) over (partition by id,status) as time ,status
from mytable t order by time desc
Output -
id time status
1 490 D
1 100 B
1 70 C
EDIT:
select distinct TOP 3 id,max(time) over (partition by id,status) as time,status
from mytable t order by time desc

Try this:
SELECT TOP 3 * FROM [MyTable] WHERE [Id] = 1 ORDER BY [Time] DESC
This will give you top three records for ID = 1. For any other ID, just change the number in WHERE clause.
Additionally you can make some stored procedure to UNION all top three records for each ID - this can be done using looping through all distinct IDs in your table :)

Try using RANK.
You may use the below query to get your desired result.
select * from
(select *, RANK() over(partition by status order by time desc) as rn from myTable)T
where rn = 1

Group By and inner join with latest records based on TimeStamp

I have a History table as below:
ID | GroupCode | Category | TimeStamp
---+-----------+----------+-----------
1 | x | shoes | 2016-09-01
2 | y | blach | 2016-09-01
History table gets updated every month and a single entry for each GroupCode gets inserted in the table.
I have also a Current table which holds the latest position.
Before or after I update the History table with the current position I would like to find out whether the Category has changed from last month to this month.
I need to compare the last Category with the current Category and, if it has changed, then flag the CategoryChanged in the Current table.
Current table:
ID | GroupCode | Category | CategoryChanged
---+-----------+----------+----------------
1 | x | shoes | True
2 | y | blah | False
I tried to achieve this with an INNER JOIN, but I could not work out how to join only to the latest month and year entries in the History table.

-- get the latest history row for each group code, based on timestamp
;with LatestHistory
as
(select top 1 with ties groupcode,category
from
history
order by
row_number() over (partition by groupcode order by timestamp desc)
)
-- now do a left join with the current table
select
ct.ID,
ct.GroupCode,
ct.Category,
case when ct.category=ht.category or ht.category is null then 'False'
else 'True'
end as changed
from
currenttable ct
left join
LatestHistory ht
on ht.groupcode=ct.groupcode
Use the statement below to update, after checking that the select returns the correct values. It needs the same CTE and writes the flag to CategoryChanged:
;with LatestHistory
as
(select top 1 with ties groupcode,category
from
history
order by
row_number() over (partition by groupcode order by timestamp desc)
)
update ct
set ct.CategoryChanged=
case when ct.category=ht.category or ht.category is null then 'False'
else 'True'
end
from
currenttable ct
left join
LatestHistory ht
on ht.groupcode=ct.groupcode

If you build a CTE that numbers the history records for each GroupCode with row_number, ordered by date descending, the rows you are interested in are rows 1 and 2. Join the CTE to itself on GroupCode, select rows 1 and 2, and you can then see whether the category changed between them:
;WITH CTE AS (SELECT *, row_number() OVER (PARTITION BY GroupCode ORDER BY TimeStamp Desc) RN FROM History)
SELECT
C1.ID,
C1.GroupCode,
C1.Category,
CASE WHEN C1.Category = C2.Category THEN
'false'
else
'true'
end AS CategoryChanged
FROM CTE C1
JOIN
CTE C2
ON C1.GroupCode = C2.GroupCode
AND C1.Rn=1 AND C2.RN = 2;
If you have NULL categories, you need to decide how you want to handle them, since the question does not say. The version below treats two NULL categories as unchanged:
;WITH CTE AS (SELECT *, row_number() OVER (PARTITION BY GroupCode ORDER BY TimeStamp Desc) RN FROM History)
SELECT
C1.ID,
C1.GroupCode,
C1.Category,
CASE WHEN C1.Category = C2.Category OR C1.Category IS NULL AND C2.Category IS NULL THEN
'false'
else
'true'
end AS CategoryChanged
FROM CTE C1
JOIN
CTE C2
ON C1.GroupCode = C2.GroupCode
AND C1.Rn=1 AND C2.RN = 2;
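The row 1 versus row 2 comparison can be tried out on a small data set. The sketch below runs the same CTE in SQLite through Python's sqlite3 module (an assumption; the question targets SQL Server, and SQLite 3.25 or later is needed for window functions). The history rows are hypothetical, invented only so that every group has two months of data, including a group z whose category is NULL in both months.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE History (ID INTEGER, GroupCode TEXT, Category TEXT, TimeStamp TEXT)"
)
# Hypothetical rows: two months per group; group z has NULL in both months.
conn.executemany(
    "INSERT INTO History VALUES (?, ?, ?, ?)",
    [
        (1, "x", "boots", "2016-08-01"),
        (2, "y", "blah", "2016-08-01"),
        (3, "z", None, "2016-08-01"),
        (4, "x", "shoes", "2016-09-01"),
        (5, "y", "blah", "2016-09-01"),
        (6, "z", None, "2016-09-01"),
    ],
)

query = """
WITH CTE AS (
    SELECT *,
           row_number() OVER (PARTITION BY GroupCode ORDER BY TimeStamp DESC) AS RN
    FROM History
)
SELECT C1.GroupCode,
       C1.Category,
       CASE WHEN C1.Category = C2.Category
              OR (C1.Category IS NULL AND C2.Category IS NULL)
            THEN 'false' ELSE 'true' END AS CategoryChanged
FROM CTE C1
JOIN CTE C2
  ON C1.GroupCode = C2.GroupCode
 AND C1.RN = 1 AND C2.RN = 2       -- latest row against the one before it
ORDER BY C1.GroupCode
"""
changes = conn.execute(query).fetchall()
for row in changes:
    print(row)
```

Because this is an inner join, a group with only one history row has no row 2 and does not appear in the output at all.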

Combine two tables without repeat second table value

I have two tables that need to be joined.
Example:
Table 1: tbl_Item
Id ---- ItemName
1 ----- A
1 ----- B
1 ----- c
2 ----- A
2 ----- B
Table 2: tbl_Detail
Id ---- Total
1 ---- 100
2 ---- 300
I need to join the tables and get the following result:
Id -- ItemName -- Total
1 -- A --- Null
1 -- B --- Null
1 -- C --- 100
2 -- A --- Null
2 -- B --- 300
Thanks in advance.

You can assign the total value to an indeterminate single row by using row_number():
select t.id, t.ItemName,
(case when row_number() over (partition by t.id order by (select NULL)) = 1
then d.total
end) as total
from tbl_item t join
tbl_detail d
on t.id = d.id;
If you have an ordering (probably specified by another column), then replace (select null) with the appropriate logic. For the example data, for instance, you might use t.ItemName desc, but I doubt that is the actual ordering you are looking for.

You could use ROW_NUMBER for this:
;WITH CTE AS (
SELECT Id, ItemName,
ROW_NUMBER() OVER (PARTITION BY Id ORDER BY ItemName DESC) AS rn
FROM tbl_Item
)
SELECT t1.Id, t1.ItemName,
CASE WHEN t1.rn = 1 THEN t2.Total END AS Total
FROM CTE AS t1
LEFT JOIN tbl_Detail AS t2 ON t1.Id = t2.Id
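To see which row receives the total, the sketch below runs the ROW_NUMBER query on the question's sample data in SQLite through Python's sqlite3 module. SQLite is a stand-in for SQL Server (an assumption), and the item names are all stored in upper case as in the expected output, since SQLite's default string comparison is case-sensitive.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE tbl_Item (Id INTEGER, ItemName TEXT)")
conn.execute("CREATE TABLE tbl_Detail (Id INTEGER, Total INTEGER)")
conn.executemany(
    "INSERT INTO tbl_Item VALUES (?, ?)",
    [(1, "A"), (1, "B"), (1, "C"), (2, "A"), (2, "B")],
)
conn.executemany("INSERT INTO tbl_Detail VALUES (?, ?)", [(1, 100), (2, 300)])

query = """
WITH CTE AS (
    SELECT Id, ItemName,
           row_number() OVER (PARTITION BY Id ORDER BY ItemName DESC) AS rn
    FROM tbl_Item
)
SELECT t1.Id, t1.ItemName,
       CASE WHEN t1.rn = 1 THEN t2.Total END AS Total   -- only the first row per Id gets the total
FROM CTE AS t1
LEFT JOIN tbl_Detail AS t2 ON t1.Id = t2.Id
ORDER BY t1.Id, t1.ItemName
"""
combined = conn.execute(query).fetchall()
for row in combined:
    print(row)
```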

Join two tables with conditions depending on multiples columns

In SQL Server 2008, I want to join two tables on a key that may contain duplicates, where each match is made unique by information in other columns.
For a simplified purchase record example,
Table A:
UserId PayDate Amount
1 2015 100
1 2010 200
2 2014 150
Table B:
UserId OrderDate Count
1 2009 4
1 2014 2
2 2013 5
Desired Result:
UserId OrderDate PayDate Amount Count
1 2009 2010 200 4
1 2014 2015 100 2
2 2013 2014 150 5
It's guaranteed that:
Table A and Table B have same number of rows, and UserId in both table are same set of numbers.
For any UserId, PayDate is always later than OrderDate
Rows with same UserId are matched by sorted sequence of Date. For example, Row 1 in Table A should match Row 2 in Table B
My idea is to sort both tables by date, add a sequence Id column to each, and then join on that column. But I am not authorized to write anything into the database. How can I do this task?

Row_Number() will be your friend here. It allows you to add a virtual sequencing to your resultset.
Run this and study the output:
SELECT UserID
, OrderDate
, "Count" As do_not_use_reserved_words_for_column_names
, Row_Number() OVER (PARTITION BY UserID ORDER BY OrderDate) As sequence
FROM table_b
The PARTITION BY determines when the counter should be "reset" i.e. it should restart after a change of UserID
The ORDER BY, well, you've guessed it - determines the order of the sequence!
Pull this all together:
; WITH payments AS (
SELECT UserID
, PayDate
, Amount
, Row_Number() OVER (PARTITION BY UserID ORDER BY PayDate) As sequence
FROM table_a
)
, orders AS (
SELECT UserID
, OrderDate
, "Count" As do_not_use_reserved_words_for_column_names
, Row_Number() OVER (PARTITION BY UserID ORDER BY OrderDate) As sequence
FROM table_b
)
SELECT orders.UserID
, orders.OrderDate
, orders.do_not_use_reserved_words_for_column_names
, payments.PayDate
, payments.Amount
FROM orders
LEFT
JOIN payments
ON payments.UserID = orders.UserID
AND payments.sequence = orders.sequence
P.S. I've opted for an outer join because I assumed that there's not always going to be a payment for every order.
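The pairing can be checked against the question's sample data. The sketch below is an illustration under assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, names the tables TableA and TableB, and uses the made-up aliases seq and OrderCount. Payments are numbered from Table A and orders from Table B, then the two are joined on user and sequence number.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableA (UserId INTEGER, PayDate INTEGER, Amount INTEGER)")
conn.execute('CREATE TABLE TableB (UserId INTEGER, OrderDate INTEGER, "Count" INTEGER)')
conn.executemany(
    "INSERT INTO TableA VALUES (?, ?, ?)",
    [(1, 2015, 100), (1, 2010, 200), (2, 2014, 150)],
)
conn.executemany(
    "INSERT INTO TableB VALUES (?, ?, ?)",
    [(1, 2009, 4), (1, 2014, 2), (2, 2013, 5)],
)

query = """
WITH payments AS (
    SELECT UserId, PayDate, Amount,
           row_number() OVER (PARTITION BY UserId ORDER BY PayDate) AS seq
    FROM TableA
),
orders AS (
    SELECT UserId, OrderDate, "Count" AS OrderCount,
           row_number() OVER (PARTITION BY UserId ORDER BY OrderDate) AS seq
    FROM TableB
)
SELECT orders.UserId, orders.OrderDate, payments.PayDate,
       payments.Amount, orders.OrderCount
FROM orders
LEFT JOIN payments
  ON payments.UserId = orders.UserId
 AND payments.seq = orders.seq      -- n-th order pairs with n-th payment
ORDER BY orders.UserId, orders.OrderDate
"""
paired = conn.execute(query).fetchall()
for row in paired:
    print(row)
```

Nothing is written to the source tables; the sequence numbers exist only inside the query, which fits the constraint of having read-only access.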

Try:
;WITH t1
AS
(
SELECT UserId, PayDate, Amount,
ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY PayDate) AS RN
FROM TableA
),
t2
AS
(
SELECT UserId, OrderDate, [Count],
ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY OrderDate) AS RN
FROM TableB
)
SELECT t1.UserId, t2.OrderDate, t1.PayDate, t1.Amount, t2.[Count]
FROM t1
INNER JOIN t2
ON t1.UserId = t2.UserId AND t1.RN = t2.RN

MSSql only group those with count greater than 3 and return the rest records

I want to group the key with count greater than 3, and the query will return the rest of the records also. I don't want to use Union All, is there any other way to do it?
ID
1
1
1
2
3
3
4
4
4
4
Return
1
1
1
2
3
3
4

You can use ranking and aggregate window functions:
WITH CTE AS
(
SELECT ID,
CNT = COUNT(*) OVER (PARTITION BY ID),
RN = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY ID)
FROM dbo.TableName
)
SELECT ID
FROM CTE
WHERE CNT <= 3 OR RN = 1
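A quick way to confirm the filter is to run it on the question's ID list. The sketch below does that in SQLite through Python's sqlite3 module (an assumption; SQLite replaces SQL Server here, so the T-SQL alias form CNT = COUNT(*) is written with AS, and the table is simply called TableName).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE TableName (ID INTEGER)")
conn.executemany(
    "INSERT INTO TableName VALUES (?)",
    [(i,) for i in (1, 1, 1, 2, 3, 3, 4, 4, 4, 4)],
)

query = """
WITH CTE AS (
    SELECT ID,
           count(*)     OVER (PARTITION BY ID)             AS CNT,
           row_number() OVER (PARTITION BY ID ORDER BY ID) AS RN
    FROM TableName
)
SELECT ID
FROM CTE
WHERE CNT <= 3 OR RN = 1    -- small groups keep every row, large groups keep one
ORDER BY ID
"""
ids = [row[0] for row in conn.execute(query)]
print(ids)
```

ID 4 occurs four times, so only its first row survives; every other ID keeps all of its rows.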

I'd do it like this
SELECT
g.ID
FROM
(SELECT ID, CNT = COUNT(*)
FROM dbo.TableName
GROUP BY ID) AS g
LEFT JOIN dbo.TableName AS t
ON t.id = g.id and g.CNT<=3
This also allows you to add further columns which report details for the group or individual record as appropriate
SELECT
g.ID,
ISNULL(t.RecordName,'Grouped Records') as RecordName,
ISNULL(t.NumericField,g.NumericField) as NumericField
FROM
(
SELECT ID, CNT = COUNT(*), SUM(NumericField) as NumericField
FROM dbo.TableName
GROUP BY ID
) AS g
LEFT JOIN dbo.TableName AS t
ON t.id = g.id and g.CNT<=3
