I have a problem that I've wasted way too much time playing with. It can be simplified to something like:
Platform: SQL Server
you have a table with age and zipcode
list the top 5 oldest people in each zipcode
I can see how to do it with cursors, but is there a way with top and group by to achieve this?
All inputs appreciated!
SELECT *
FROM (
SELECT *
,ROW_NUMBER() OVER (PARTITION BY zipcode ORDER BY Age DESC) rn
FROM TableName
)A
WHERE RN <= 5
You need to use the ROW_NUMBER analytic function:
SELECT *
FROM
( SELECT name,
age,
zipcode,
ROW_NUMBER() OVER
( PARTITION by zipcode
order by age desc)
as seq
FROM TableName
) T
Where T.seq <=5
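To try the ROW_NUMBER approach from the answers above without touching real data, here is a minimal self-contained sketch. The table variable @People, its column types, and the sample rows are made up for illustration; the extra `name` tiebreaker in the ORDER BY is my own addition so that the pick is deterministic when two people share an age.

```sql
-- Hypothetical sample data; @People stands in for the real table.
DECLARE @People TABLE (name varchar(20), age int, zipcode char(5));

INSERT INTO @People (name, age, zipcode) VALUES
    ('Ann', 81, '10001'), ('Bob', 77, '10001'), ('Cy',  77, '10001'),
    ('Dee', 64, '10001'), ('Eli', 59, '10001'), ('Fay', 40, '10001'),
    ('Gus', 90, '20002'), ('Hal', 35, '20002');

SELECT name, age, zipcode
FROM (
    SELECT name, age, zipcode,
           -- Number the rows within each zipcode, oldest first.
           -- name is only a tiebreaker (an assumption, not in the question).
           ROW_NUMBER() OVER (PARTITION BY zipcode
                              ORDER BY age DESC, name) AS seq
    FROM @People
) AS T
WHERE T.seq <= 5
ORDER BY zipcode, seq;
```

With this data, zipcode 10001 has six people, so the youngest one (Fay) is dropped, while zipcode 20002 has only two people and both are returned.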
Related
I have the following tables: https://pastebin.com/Js0Sm69S (CREATE and INSERT statements).
I would like to find the third-highest salary in each department if there is such.
I was able to get this far using the following query:
SELECT *,
DENSE_RANK() OVER
(PARTITION BY DepartmentId ORDER BY Salary DESC) AS DRank
FROM Employees
I am not sure if DENSE_RANK() is the best ranking function to use here. Maybe not, because WHERE DRank=3 may return more than one row (although we could add TOP(1)). What do you think? And how do I display the third-highest salary in each department, if there is one?
Try this
Select EmployeeID,FirstName,DepartmentID,Salary
From (
Select *
,RN = Row_Number() over (Partition By DepartmentID Order By Salary Desc)
,Cnt = sum(1) over (Partition By DepartmentID)
From Employees
) A
Where RN = case when Cnt<3 then Cnt else 3 end
You're almost there, but you can achieve this with ROW_NUMBER instead of DENSE_RANK. I think the following query should help.
WITH cte AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DepartmentId ORDER BY Salary DESC) AS DRank
FROM Employees
)
SELECT *
FROM cte
WHERE DRank= 3
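Whether ROW_NUMBER or DENSE_RANK is the right choice depends on how tied salaries should count. The sketch below puts both side by side; the table variable @Employees and its rows are invented for illustration and are not the data from the pastebin link.

```sql
-- Hypothetical data: department 1 has a tie at the top, department 2 has only two rows.
DECLARE @Employees TABLE (EmployeeID int, DepartmentID int, Salary money);

INSERT INTO @Employees (EmployeeID, DepartmentID, Salary) VALUES
    (1, 1, 5000), (2, 1, 5000), (3, 1, 4000), (4, 1, 3000),
    (5, 2, 6000), (6, 2, 2000);

SELECT EmployeeID, DepartmentID, Salary,
       -- Counts rows: tied salaries get different numbers.
       ROW_NUMBER() OVER (PARTITION BY DepartmentID
                          ORDER BY Salary DESC, EmployeeID) AS RowNum,
       -- Counts distinct salary values: tied salaries share a number.
       DENSE_RANK() OVER (PARTITION BY DepartmentID
                          ORDER BY Salary DESC) AS DRank
FROM @Employees
ORDER BY DepartmentID, RowNum;
```

In department 1, RowNum = 3 lands on the 4000 salary (the third row), while DRank = 3 lands on 3000 (the third distinct salary). Department 2 has no row with either value equal to 3, so a filter on 3 returns nothing for it.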
I have two tables:
Customer which has an Id column representing the customer Id.
CustomerDonation that contains CustomerId (FK), Amount and DatePayed
I'd like to have all the customers together with their latest donation and the amount of that donation.
I am receiving duplicate values on my query so I will not paste it here.
You could also use the WITH TIES option
Select Top 1 With Ties *
From YourTable
Order By Row_Number() over (Partition By CustomerId Order By DatePayed Desc)
WITH
SortedDonation AS
(
SELECT
ROW_NUMBER() OVER (PARTITION BY CustomerId ORDER BY DatePayed DESC) AS SeqID,
*
FROM
CustomerDonation
)
SELECT
*
FROM
Customer
LEFT JOIN
SortedDonation
ON SortedDonation.CustomerId = Customer.Id
AND SortedDonation.SeqId = 1
If the same customer can make multiple donations with the same DatePayed, then this will arbitrarily pick just one of them.
If you add additional fields to the ORDER BY you can deterministically pick which one you want.
Or, if you want all of them, use DENSE_RANK() instead of ROW_NUMBER().
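As a rough illustration of the tiebreaker point, the sketch below adds a second ORDER BY column so that the same row is always picked when two donations share a date. The table variable @CustomerDonation, the sample rows, and the choice of Amount DESC as the tiebreaker are assumptions made for the example.

```sql
-- Hypothetical data: customer 1 made two donations on the same day.
DECLARE @CustomerDonation TABLE (CustomerId int, Amount money, DatePayed date);

INSERT INTO @CustomerDonation (CustomerId, Amount, DatePayed) VALUES
    (1, 10, '20200101'), (1, 25, '20200301'), (1, 40, '20200301'),
    (2,  5, '20191231');

SELECT CustomerId, Amount, DatePayed
FROM (
    SELECT CustomerId, Amount, DatePayed,
           -- Latest date first; among same-day donations, largest amount first.
           ROW_NUMBER() OVER (PARTITION BY CustomerId
                              ORDER BY DatePayed DESC, Amount DESC) AS SeqId
    FROM @CustomerDonation
) AS d
WHERE SeqId = 1
ORDER BY CustomerId;
```

Here customer 1 gets the 40 donation from 2020-03-01 rather than an arbitrary one of the two same-day rows. Swapping ROW_NUMBER() for DENSE_RANK() and ordering by DatePayed DESC alone would return both same-day rows instead.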
Use the Row_Number() analytic function.
Select * from (
Select customerId,Amount,DatePayed, row_number() over (partition by CustomerId order by DatePayed desc) as rowN from CustomerDonation)
as tab where rowN = 1
You only need the CustomerDonation table for this. You can join with the Customer table if you want other information about the customer.
WITH cte AS (
SELECT
CustomerId
, MAX(DatePayed) AS LastDate
FROM
CustomerDonation
GROUP BY
CustomerId
)
SELECT
cd.CustomerId
, cd.Amount
, cd.DatePayed
FROM
CustomerDonation cd
JOIN cte ON cd.CustomerId = cte.CustomerId
AND cd.DatePayed = cte.LastDate
I have this table, Chips:
I'm looking to find the max for each ID, but
the code I'm using is just not correct. I need the new table to be:
select *
from
(
max (numchips) over (partition by Id)
from #chips
)
You can use ROW_NUMBER:
SELECT Id, numchips
FROM (
SELECT Id, numchips,
ROW_NUMBER() OVER (PARTITION BY Id
ORDER BY numchips DESC) as rn
FROM #chips
) t
WHERE rn = 1
rn is equal to 1 for the record having the highest numchips value within each Id partition.
Using ROW_NUMBER() makes sense only if you have some additional columns in the Chips table that you also want to retrieve.
Why can't you just do this?
SELECT
MAX(c.numchips),
c.Id
FROM
#chips as c
GROUP BY
c.Id
Struggling with what's probably a very simple problem. I have a query like this:
;WITH rankedData
AS ( /* a big, complex subquery */ )
SELECT UserId,
AttributeId,
ItemId
FROM rankedData
WHERE rank = 1
ORDER BY datEventDate DESC
The sub-query is designed to grab a big chunk of interlinked data and rank it by ItemId and date, so that the rank=1 in the above query ensures we only get unique ItemIds, ordered by date. The partition is:
Rank() OVER (partition BY ItemId ORDER BY datEventDate DESC) AS rk
The problem is that what I want is the top 75 records for each UserID, ordered by date. Seeing as I've already got a rank inside my sub-query to sort out item duplicates by date, I can't see a straightforward way of doing this.
Cheers,
Matt
I think your query should look like
SELECT t.UserId, t.AttributeId, t.ItemId
FROM (
SELECT UserId, AttributeId, ItemId, rowid = ROW_NUMBER() OVER (
PARTITION BY UserId ORDER BY datEventDate DESC
)
FROM rankedData
) t
WHERE t.rowid <= 75
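For completeness, here is a sketch of how the two rankings could be chained: the first CTE keeps the question's per-ItemId rank, and a second CTE numbers the surviving rows per user. The table variable @Events and its rows are made up, the CTE name perUser is hypothetical, and the limit is 2 instead of 75 only so the effect is visible on a tiny data set.

```sql
-- Hypothetical stand-in for the big, complex subquery's source data.
DECLARE @Events TABLE (UserId int, AttributeId int, ItemId int, datEventDate datetime);

INSERT INTO @Events (UserId, AttributeId, ItemId, datEventDate) VALUES
    (1, 10, 100, '20240101'), (1, 10, 100, '20240105'),  -- duplicate ItemId 100
    (1, 11, 101, '20240103'), (1, 12, 102, '20240102'),
    (2, 10, 200, '20240104');

WITH rankedData AS (
    -- Step 1: rank rows per ItemId so that rk = 1 is the latest row for that item.
    SELECT UserId, AttributeId, ItemId, datEventDate,
           RANK() OVER (PARTITION BY ItemId ORDER BY datEventDate DESC) AS rk
    FROM @Events
),
perUser AS (
    -- Step 2: among the de-duplicated rows, number each user's rows, newest first.
    SELECT UserId, AttributeId, ItemId, datEventDate,
           ROW_NUMBER() OVER (PARTITION BY UserId ORDER BY datEventDate DESC) AS rowid
    FROM rankedData
    WHERE rk = 1
)
SELECT UserId, AttributeId, ItemId
FROM perUser
WHERE rowid <= 2          -- use 75 for the case in the question
ORDER BY UserId, datEventDate DESC;
```

With this data, user 1 ends up with items 100 and 101 (the two most recent after de-duplication), and user 2 with item 200.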
How can I first filter the results based on parameters and then apply WHERE ... BETWEEN?
Something like:
With Results as
(
Select colName,Title, Row_Number(Over...) as row from a table where colName=5
)
Select * from Results
where
row between #first and #last
But it does not work. If I move my WHERE colName=5 out of the WITH clause, I get wrong data, because it first takes the rows between #first and #last and then searches for colName=5.
Also I want count of Results.
Any idea?
You can use COUNT(*) OVER() to get the count of the unfiltered results
WITH cte as
(
select *,
ROW_NUMBER() over (order by name desc) AS RN,
count(*) over() AS [Count]
from master..spt_values
)
SELECT name, number,[Count]
FROM cte
WHERE RN BETWEEN 20 AND 24
Returns
name                                number      Count
----------------------------------- ----------- -----------
VIEW                                8278        2506
VIEW                                8278        2506
view                                2           2506
varchar                             3           2506
varbinary                           1           2506
This has performance implications though. You might want to just calculate the COUNT up front and cache it somewhere rather than recalculating it for every page request.
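One way to read "calculate the COUNT up front" is to count the filtered rows once into a variable and then reuse it in the paging query. The sketch below assumes that reading; the table variable @Items, its rows, and the variable names @first, @last and @total are all hypothetical (the question writes the bounds as #first and #last).

```sql
-- Hypothetical data: four rows match colName = 5, one does not.
DECLARE @Items TABLE (colName int, Title varchar(50));

INSERT INTO @Items (colName, Title) VALUES
    (5, 'A'), (5, 'B'), (5, 'C'), (5, 'D'), (7, 'E');

DECLARE @first int, @last int, @total int;
SET @first = 2;
SET @last  = 3;

-- Count the filtered rows once instead of once per returned row.
SELECT @total = COUNT(*) FROM @Items WHERE colName = 5;

WITH Results AS (
    SELECT colName, Title,
           -- The filter is applied before the rows are numbered.
           ROW_NUMBER() OVER (ORDER BY Title) AS RN
    FROM @Items
    WHERE colName = 5
)
SELECT Title, RN, @total AS [Count]
FROM Results
WHERE RN BETWEEN @first AND @last
ORDER BY RN;
```

This returns titles B and C with RN 2 and 3, each carrying a Count of 4. In a real application the total could be cached between page requests, at the cost of it going stale when rows change.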
Your ROW_NUMBER syntax is incorrect. It should be this:
With Results as
(
SELECT colName, Title, ROW_NUMBER() OVER (ORDER BY ...) AS RN
FROM your_table
WHERE colName = 5
)
SELECT * FROM Results
WHERE rn BETWEEN #first AND #last
ORDER BY rn
See the documentation for more information.
I use an approach very similar to Martin Smith's (currently the selected answer), and at least in the tests I've made it gives better performance.
; WITH cte as
(
select *,
ROW_NUMBER() over (order by name desc) AS RN
from master..spt_values
)
SELECT name, number, (SELECT COUNT(*) FROM cte) AS [Count]
FROM cte
WHERE RN BETWEEN 20 AND 24
Run this and his queries side by side and compare execution plans.