Getting Top 3 values for each id and status - sql-server

I have data something like this,
ID Time Status
--- ---- ------
1 10 B
1 20 B
1 30 C
1 70 C
1 100 B
1 490 D
The desired result should be,
ID Time Status
1 490 D
1 100 B
1 70 C
That is, I need the top 3 Time values per ID, with each Status appearing only once.
For this I tried:
;WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY id ORDER BY TIME DESC) AS rn
FROM MyTable
)
SELECT id,TIME,Status
FROM cte
where rn<=3
But it doesn't meet my requirement: I get the top 3 rows with duplicate Status values. How can I solve this?

Partition by status as well:
WITH cte AS (
SELECT t.*,
ROW_NUMBER() OVER (PARTITION BY id, status
ORDER BY TIME DESC
) AS rn
FROM MyTable t
)
SELECT id, TIME, Status
FROM cte
WHERE rn <= 3;
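For the sample data in the question, one way to reach the desired three rows is to keep only the newest row per (id, status) and then take the top 3 of those per id. The following is a minimal sketch, not a definitive implementation. It runs through Python's sqlite3 module as a stand-in for SQL Server (an assumption made so the example is self-contained; SQLite needs version 3.25 or later for window functions), and the CTE names latest and ranked are illustrative.

```python
import sqlite3

# Sample rows from the question: (id, time, status).
ROWS = [(1, 10, "B"), (1, 20, "B"), (1, 30, "C"),
        (1, 70, "C"), (1, 100, "B"), (1, 490, "D")]

QUERY = """
WITH latest AS (   -- number the rows of each (id, status) pair, newest first
    SELECT id, time, status,
           ROW_NUMBER() OVER (PARTITION BY id, status ORDER BY time DESC) AS rn
    FROM MyTable
),
ranked AS (        -- keep the newest row per pair, then rank those within each id
    SELECT id, time, status,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY time DESC) AS pos
    FROM latest
    WHERE rn = 1
)
SELECT id, time, status
FROM ranked
WHERE pos <= 3
ORDER BY id, time DESC
"""


def top3_distinct_status(rows):
    """Load rows into an in-memory table and return the top 3 per id."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE MyTable (id INTEGER, time INTEGER, status TEXT)")
    conn.executemany("INSERT INTO MyTable VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(top3_distinct_status(ROWS))
```

The two ROW_NUMBER() expressions use syntax that SQL Server also accepts, so the query text should carry over with little change.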

The WITH TIES argument of TOP also returns every additional row that ties with the last of the top values:
select top (3) with ties id, Time, Status from table1 order by Time desc
Alternatively, if you wanted to return 3 values only, but make sure they are always the same 3 values, then you will need to use something else as a tie-breaker. In this case, it looks like your id column could be unique.
select top (3) id, Time, Status from table1 order by Time desc, id

Try this:
select distinct id,max(time) over (partition by id,status) as time ,status
from mytable t order by time desc
Output -
id time status
1 490 D
1 100 B
1 70 C
EDIT:
select distinct TOP 3 id,max(time) over (partition by id,status) as time,status
from mytable t order by time desc

Try this:
SELECT TOP 3 * FROM [MyTable] WHERE [Id] = 1 ORDER BY [Time] DESC
This will give you the top three records for ID = 1. For any other ID, just change the number in the WHERE clause.
Additionally, you could write a stored procedure that loops over all distinct IDs in your table and UNIONs the top three records for each ID.

Try using RANK.
You may use the below query to get your desired result.
select * from
(select *, RANK() over(partition by status order by time desc) as rn from myTable)T
where rn = 1

Related

Partition with select distinct

I have a dataset that looks like this
StudentName Course Studentmailid Score
Student1 A student1@gmail.com 80
Student1 A student1@gmail.com 75
Student2 A student2@gmail.com 70
Student1 B student2@gmail.com 70
Now I want records 1, 3 and 4: basically, the first occurrence of each student in each course.
My query is
select distinct StudentName,Course, Studentmailid,Score from StudentTable group by Course
and it throws an error. How should I change the query to get the desired output?
Hope it helps.
;with cte as (
select StudentName,Course, Studentmailid,Score, Row_Number() over (partition by StudentName, Course order by StudentName) as Row_Num from StudentTable
)
select * from cte where Row_Num = 1
You need a column that defines which of several rows counts as the first one. Otherwise there is no guarantee that you will get the same output every time. The following sample orders by NULL, which usually returns records in their stored order, but it will not always return the same result.
If you have a column whose values order all records, simply replace the ORDER BY part of the window function with that column.
SELECT * FROM
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY StudentName, Course ORDER BY (SELECT NULL)) RN
-- ORDER BY (SELECT NULL) imposes no ordering, so rows are numbered in whatever order the engine reads them
-- There is no guarantee that this order is the same on every run
FROM your_table
)A
WHERE RN=1
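To make "first occurrence" deterministic, the window needs a real ordering column. The sketch below assumes a hypothetical Seq column that records insertion order (the original table has no such column) and leaves out Studentmailid for brevity. It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server.

```python
import sqlite3

# (Seq, StudentName, Course, Score); Seq is a hypothetical insertion-order column.
STUDENTS = [
    (1, "Student1", "A", 80),
    (2, "Student1", "A", 75),
    (3, "Student2", "A", 70),
    (4, "Student1", "B", 70),
]

QUERY = """
SELECT Seq, StudentName, Course, Score
FROM (
    SELECT Seq, StudentName, Course, Score,
           -- ordering by Seq makes row 1 of each partition well defined
           ROW_NUMBER() OVER (PARTITION BY StudentName, Course ORDER BY Seq) AS Row_Num
    FROM StudentTable
) AS numbered
WHERE Row_Num = 1
ORDER BY Seq
"""


def first_per_student_course(rows):
    """Return the earliest row (by Seq) for every StudentName/Course pair."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE StudentTable (Seq INTEGER, StudentName TEXT, Course TEXT, Score INTEGER)"
    )
    conn.executemany("INSERT INTO StudentTable VALUES (?, ?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


print(first_per_student_course(STUDENTS))
```

With this ordering the query returns records 1, 3 and 4 regardless of how the rows are physically stored.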

How to Remove Duplicate Statement

How do I delete duplicate rows in SQL Server when no column value distinguishes them? I want to keep only one row in my sales table (dbo.Sales).
ID DESCRIPTIONS QTY RATE AMOUNT
--------------------------------
1 APPLE 50 100 1000
1 APPLE 50 100 1000
1 APPLE 50 100 1000
1 APPLE 50 100 1000
We can try using a CTE here to arbitrarily delete all but one of the duplicates:
WITH cte AS (
SELECT *,
ROW_NUMBER() OVER (PARTITION BY ID, DESCRIPTIONS, QTY, RATE, AMOUNT
ORDER BY (SELECT NULL)) rn
FROM yourTable
)
DELETE
FROM cte
WHERE rn > 1;
You can delete the duplicates like this:
DELETE A
FROM (SELECT Row_number()
OVER (
partition BY id, descriptions, qty, rate, amount
ORDER BY (SELECT 1)) AS rn
FROM table1) A
WHERE a.rn > 1
If you want to use a CTE, you can do it like this:
;WITH cte
AS (SELECT Row_number()
OVER(
partition BY id, descriptions, qty, rate, amount
ORDER BY (SELECT 1)) RN
FROM table1)
DELETE FROM cte
WHERE rn > 1
you can use this:
select distinct * into temp from tableName
delete from tableName
insert into tableName
select * from temp
drop table temp
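The copy-and-reload idea above can be sketched end to end. This is only an illustration under stated assumptions: it uses SQLite through Python's sqlite3 module instead of SQL Server, so CREATE TEMP TABLE ... AS SELECT takes the place of SELECT ... INTO, and the helper name dedupe_by_copy and the temp table name dedupe_tmp are made up for the example.

```python
import sqlite3


def dedupe_by_copy(conn, table):
    """Replace the contents of `table` with its distinct rows.

    `table` is interpolated into the SQL text, so pass only a trusted,
    hard-coded table name.
    """
    # `with conn` commits on success and rolls back if a statement raises.
    with conn:
        conn.execute(f"CREATE TEMP TABLE dedupe_tmp AS SELECT DISTINCT * FROM {table}")
        conn.execute(f"DELETE FROM {table}")
        conn.execute(f"INSERT INTO {table} SELECT * FROM dedupe_tmp")
        conn.execute("DROP TABLE dedupe_tmp")


def demo():
    """Build the four identical rows from the question and deduplicate them."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Sales (ID INTEGER, DESCRIPTIONS TEXT, QTY INTEGER, "
        "RATE INTEGER, AMOUNT INTEGER)"
    )
    conn.executemany(
        "INSERT INTO Sales VALUES (?, ?, ?, ?, ?)",
        [(1, "APPLE", 50, 100, 1000)] * 4,
    )
    dedupe_by_copy(conn, "Sales")
    rows = conn.execute("SELECT * FROM Sales").fetchall()
    conn.close()
    return rows


print(demo())
```

Unlike the ROW_NUMBER() deletes, this approach rewrites the whole table, so it is probably best kept for small tables or one-off cleanups.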
I suggest adding a column such as rn and filling it with row_number() over (Partition by ID, DESCRIPTIONS, QTY, RATE, AMOUNT order by Id).
Then delete the rows where rn is not equal to 1.
After that, drop the column. This is a one-time fix; if duplicates appear frequently, add a unique key to your table.

Select highest value in a column with other column values SQL Server

My table looks like this:
And I want to get highest bid amount for a specific product, with the row id. My query is like this
SELECT
MAX(BidAmount) as highestBid,id
FROM
[wf_bid]
WHERE
ProductId = 101 AND ClientId = 101
GROUP BY
id
I expect only one row with highest BidAmount, but the query returns all rows with this product id and client id. How can I fix this issue?
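How about a subquery? If multiple records share the same BidAmount, TOP 1 returns just one of them. The outer query repeats the product and client filter so that a row from another product with the same amount cannot be picked.
SELECT TOP 1
BidAmount as highestBid,id
FROM [wf_bid]
WHERE ProductId=101 and ClientId=101
AND BidAmount = (Select Max(BidAmount) FROM [wf_bid] WHERE ProductId=101 and ClientId=101)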
You can use row_number() and select the first row:
SELECT *
FROM
(
SELECT
id,
BidAmount,
ROW_NUMBER() OVER (ORDER BY BidAmount desc) as rn
FROM
[wf_bid]
WHERE ProductId = 101 and ClientId = 101
) i
WHERE
i.rn = 1
How about this way:
SELECT b.id, a.highestBid
FROM (SELECT MAX(BidAmount) AS highestBid, ProductId, ClientId
FROM [wf_bid]
WHERE ProductId=101 AND ClientId=101
GROUP BY ProductId, ClientId) a
INNER JOIN [wf_bid] b
ON a.ProductId = b.ProductId
AND a.ClientId = b.ClientId
AND a.highestBid = b.BidAmount -- keep only the row that holds the highest bid
try this way,
select * FROM
(SELECT
id,
BidAmount,
ROW_NUMBER() OVER (PARTITION BY ProductId ORDER BY BidAmount desc) as rn
FROM
[wf_bid]
WHERE ClientId = 101)t4
where rn=1
Your problem is the GROUP BY id. Each id forms its own group, so MAX(BidAmount) gives you the maximum for every id separately, not the single biggest bid and its id. I'm guessing you'll get the highest bid if you delete GROUP BY id (and drop id from the select list, which SQL Server would otherwise reject). If that is not what you want, you would need to explain your need further.
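The difference between grouping by id and ranking the rows is easy to see on a few rows. The sketch below uses invented sample data (the question's table was posted as an image, so these values are assumptions) and runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server.

```python
import sqlite3

# Hypothetical sample rows: (id, ProductId, ClientId, BidAmount).
BIDS = [(1, 101, 101, 50), (2, 101, 101, 80), (3, 101, 101, 65), (4, 102, 101, 90)]

# The question's query: every id is its own group, so every row comes back.
GROUPED = """
SELECT MAX(BidAmount) AS highestBid, id
FROM wf_bid
WHERE ProductId = 101 AND ClientId = 101
GROUP BY id
ORDER BY id
"""

# Ranking the filtered rows and keeping the first one yields a single row.
RANKED = """
SELECT id, BidAmount
FROM (
    SELECT id, BidAmount,
           ROW_NUMBER() OVER (ORDER BY BidAmount DESC) AS rn
    FROM wf_bid
    WHERE ProductId = 101 AND ClientId = 101
) AS i
WHERE rn = 1
"""


def run(sql):
    """Execute sql against a fresh in-memory copy of the sample table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wf_bid (id INTEGER PRIMARY KEY, ProductId INTEGER, "
        "ClientId INTEGER, BidAmount INTEGER)"
    )
    conn.executemany("INSERT INTO wf_bid VALUES (?, ?, ?, ?)", BIDS)
    rows = conn.execute(sql).fetchall()
    conn.close()
    return rows


print(run(GROUPED))  # one row per id
print(run(RANKED))   # only the row holding the highest bid
```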

SQL Server - Select most recent records with condition

I have a table like this.
Table :
ID EnrollDate ExitDate
1 4/1/16 8/30/16
2 1/1/16 null
2 1/1/16 7/3/16
3 2/1/16 8/1/16
3 2/1/16 9/1/16
4 1/1/16 12/12/16
4 1/1/16 12/12/16
4 1/1/16 12/12/16
4 1/1/16 null
5 5/1/16 11/12/16
5 5/1/16 11/12/16
5 5/1/16 11/12/16
Need to select the most recent records with these conditions.
1. One and only one record has the most recent enroll date: select that record.
2. Two or more records share the same most recent enroll date, and one and only one of them has either a NULL exit date or the most recent exit date: select the record with NULL. If there is no NULL record, pick the record with the most recent exit date.
3. Two or more records have the same enroll date and exit date: if this case exists, don't select those records.
So the expected result for the above table should be :
ID EnrollDate ExitDate
1 4/1/16 8/30/16
2 1/1/16 null
3 2/1/16 9/1/16
4 1/1/16 null
I wrote the query with group by. I am not sure how to select with the conditions 2 and 3.
select t1.* from table t1
INNER JOIN(SELECT Id,MAX(EnrollDate) maxentrydate
FROM table
GROUP BY Id)t2 ON EnrollDate = t2.maxentrydate and t1.Id=t2.Id
Please let me know what is the best way to do this.
Using the rank() window function, I think it's possible.
This is untested, but it should work:
select t.ID, t.EnrollDate, t.ExitDate
from (select t.*,
rank() over(
partition by ID
order by EnrollDate desc,
case when ExitDate is null then 1 else 2 end,
ExitDate desc) as rnk
from tbl t) t
where t.rnk = 1
group by t.ID, t.EnrollDate, t.ExitDate
having count(*) = 1
The basic idea is that the rank() window function gives the most "recent" rows a rank of 1, which we filter on in the outer query's where clause.
If more than one row has the same "most recent" data, they all share the rank of 1 and are then filtered out by the having count(*) = 1 clause.
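Since that query is described as untested, here is a small check of the same rank() plus having count(*) = 1 idea against the question's sample rows. It is a sketch under assumptions: it runs on SQLite through Python's sqlite3 module instead of SQL Server, and the dates are rewritten as ISO strings (for example 4/1/16 becomes 2016-04-01) so that they sort correctly as text.

```python
import sqlite3

# Sample rows from the question: (ID, EnrollDate, ExitDate), dates as ISO text.
ENROLLMENTS = [
    (1, "2016-04-01", "2016-08-30"),
    (2, "2016-01-01", None),
    (2, "2016-01-01", "2016-07-03"),
    (3, "2016-02-01", "2016-08-01"),
    (3, "2016-02-01", "2016-09-01"),
    (4, "2016-01-01", "2016-12-12"),
    (4, "2016-01-01", "2016-12-12"),
    (4, "2016-01-01", "2016-12-12"),
    (4, "2016-01-01", None),
    (5, "2016-05-01", "2016-11-12"),
    (5, "2016-05-01", "2016-11-12"),
    (5, "2016-05-01", "2016-11-12"),
]

QUERY = """
SELECT ID, EnrollDate, ExitDate
FROM (
    SELECT ID, EnrollDate, ExitDate,
           RANK() OVER (
               PARTITION BY ID
               ORDER BY EnrollDate DESC,
                        CASE WHEN ExitDate IS NULL THEN 1 ELSE 2 END,  -- NULL exit first
                        ExitDate DESC
           ) AS rnk
    FROM tbl
) AS t
WHERE rnk = 1
GROUP BY ID, EnrollDate, ExitDate
HAVING COUNT(*) = 1   -- drop IDs whose best row is duplicated
ORDER BY ID
"""


def most_recent(rows):
    """Return the single most recent record per ID, skipping tied duplicates."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE tbl (ID INTEGER, EnrollDate TEXT, ExitDate TEXT)")
    conn.executemany("INSERT INTO tbl VALUES (?, ?, ?)", rows)
    result = conn.execute(QUERY).fetchall()
    conn.close()
    return result


for row in most_recent(ENROLLMENTS):
    print(row)
```

On this data the query keeps IDs 1 to 4 and drops ID 5, whose three top-ranked rows are identical.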
Use ROW_NUMBER coupled with CASE expression to achieve the desired result:
WITH Cte AS(
SELECT t.*,
ROW_NUMBER() OVER(
PARTITION BY t.ID
ORDER BY
t.EnrollDate DESC,
CASE WHEN t.ExitDate IS NULL THEN 0 ELSE 1 END,
t.ExitDate DESC
) AS rn
FROM Tbl t
INNER JOIN (
SELECT
ID,
COUNT(DISTINCT CHECKSUM(EnrollDate, ExitDate)) AS DistinctCnt, -- Count distinct combination of EnrollDate and ExitDate per ID
COUNT(*) AS RowCnt -- Count number of rows per ID
FROM Tbl
GROUP BY ID
) a
ON t.ID = a.ID
WHERE
(a.DistinctCnt = 1 AND a.RowCnt = 1)
OR a.DistinctCnt > 1
)
SELECT
ID, EnrollDate, ExitDate
FROM Cte c
WHERE Rn = 1
The ORDER BY clause in the ROW_NUMBER takes care of conditions 2 and 3.
The INNER JOIN and the WHERE clause take care of 1 and 4.
with B as (
select id, enrolldate ,
exitdate,
row_number() over (partition by id order by enrolldate desc, case when exitdate is null then 0 else 1 end, exitdate desc) rn
from ab )
select b1.id, b1.enrolldate, b1.exitdate from b b1
left join b b2
on b1.rn = b2.rn -1 and
b1.id = b2.id and
b1.exitdate = b2.exitdate and
b1.enrolldate = b2.enrolldate
where b1.rn = 1 and
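b2.id is NULL
The left join is used to fulfill requirement 3. If the join finds a matching second row with the same enroll and exit dates, the top row is a duplicate and we don't want it, so only rows without a match (b2.id is NULL) are kept.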

Filter first then select page

How can I first filter the results by a parameter and then apply WHERE ... BETWEEN to select a page?
Something like
With Results as
(
Select colName,Title, Row_Number(Over...) as row from a table where colName=5
)
Select * from Results
where
row between @first and @last
But it does not work. If I move where colName=5 out of the WITH clause, I get wrong data, because it first takes the rows between @first and @last and only then searches for colName=5.
I also want the count of Results.
Any ideas?
You can use COUNT(*) OVER() to get the total number of rows before the page filter is applied:
WITH cte as
(
select *,
ROW_NUMBER() over (order by name desc) AS RN,
count(*) over() AS [Count]
from master..spt_values
)
SELECT name, number,[Count]
FROM cte
WHERE RN BETWEEN 20 AND 24
Returns
name number Count
----------------------------------- ----------- -----------
VIEW 8278 2506
VIEW 8278 2506
view 2 2506
varchar 3 2506
varbinary 1 2506
This has performance implications though. You might want to just calculate the COUNT up front and cache it somewhere rather than recalculating it for every page request.
Your ROW_NUMBER syntax is incorrect. It should be this:
With Results as
(
SELECT colName, Title, ROW_NUMBER() OVER (ORDER BY ...) AS RN
FROM your_table
WHERE colName = 5
)
SELECT * FROM Results
WHERE rn BETWEEN @first AND @last
ORDER BY rn
See the documentation for more information.
I use an approach very similar to Martin Smith's (currently the selected answer), and at least in the tests I've run, it performs better.
; WITH cte as
(
select *,
ROW_NUMBER() over (order by name desc) AS RN
from master..spt_values
)
SELECT name, number, (SELECT COUNT(*) FROM cte) AS [Count]
FROM cte
WHERE RN BETWEEN 20 AND 24
Run this and his queries side by side and compare execution plans.
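Putting the pieces together: the filter goes inside the CTE, so rows are numbered and counted only after colName has been matched, and the page range is applied outside. The sketch below is an illustration under assumptions. It runs on SQLite through Python's sqlite3 module as a stand-in for SQL Server, the table name items and its contents are invented, and ? placeholders take the place of @first and @last.

```python
import sqlite3

PAGE_SQL = """
WITH Results AS (
    SELECT colName, Title,
           ROW_NUMBER() OVER (ORDER BY Title) AS rn,  -- numbered after the WHERE filter
           COUNT(*) OVER () AS total                  -- size of the filtered set
    FROM items
    WHERE colName = ?
)
SELECT Title, rn, total
FROM Results
WHERE rn BETWEEN ? AND ?
ORDER BY rn
"""


def make_db():
    """Create a hypothetical items table: seven rows with colName 5, two with 9."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (colName INTEGER, Title TEXT)")
    rows = [(5, f"T{n:02d}") for n in range(1, 8)] + [(9, "X01"), (9, "X02")]
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    conn.commit()
    return conn


def fetch_page(conn, col_value, first, last):
    """Return rows first..last of the rows matching col_value, with the total."""
    return conn.execute(PAGE_SQL, (col_value, first, last)).fetchall()


conn = make_db()
for row in fetch_page(conn, 5, 3, 5):
    print(row)
```

Each returned row carries the total of the filtered set, so the caller can work out the number of pages without a second query.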

Resources