Rank by 2 different levels of partitioning/grouping - sql-server

I have this set of data in Microsoft SQL Server (queried through Management Studio):
Category|pet name| date |food price|vet expenses|vat
A | jack |2017-08-28| 12.98 | 2424 |23
A | jack |2017-08-29| 2339 | 2424 |23
A | smithy |2017-08-28| 22.35 | 2324 |12
A | smithy |2017-08-29| 123.35 | 2432 |23
B | casio |2017-08-28| 11.38 | 44324 |32
B | casio |2017-08-29| 2.24 | 3232 |43
B | lala |2017-08-28| 343.36 | 42342 |54
B | lala |2017-08-29| 34.69 | 22432 |54
C | blue |2017-08-28| 223.02 | 534654 |78
C | blue |2017-08-29| 321.01 | 6654 |67
C | collie |2017-08-28| 232.05 | 4765 |43
C | collie |2017-08-29| 233.03 | 4654 |65
What I want to do is produce three rankings, each grouped by category: one by food price, one by vet expenses and one by vat. In each case the output should be ordered by category, pet name, date.
I'm thinking this will be a join statement for the table above?
Something exactly like below:
Category|pet name| date |food price|vet expenses|vat|Rankfp|Rankve|Rankvat
A | jack |2017-08-28| 12.98 | 2424 |23 | 2 | 1 |1
A | jack |2017-08-29| 2339 | 2424 |23 | 1 | 2 |1
A | smithy |2017-08-28| 22.35 | 2324 |12 | 1 | 2 |2
A | smithy |2017-08-29| 123.35 | 2432 |22 | 2 | 1 |2
B | casio |2017-08-28| 11.38 | 44324 |32 | 2 | 1 |2
B | casio |2017-08-29| 2.24 | 3232 |43 | 2 | 2 |2
B | lala |2017-08-28| 343.36 | 42342 |54 | 1 | 2 |1
B | lala |2017-08-29| 34.69 | 22432 |54 | 1 | 1 |1
C | blue |2017-08-28| 223.02 | 534654 |78 | 2 | 1 |1
C | blue |2017-08-29| 321.01 | 6654 |67 | 1 | 1 |1
C | collie |2017-08-28| 232.05 | 4765 |43 | 1 | 2 |2
C | collie |2017-08-29| 233.03 | 4654 |65 | 2 | 2 |2
NB: this is not needed in the final output, but to make it more readable here is the same outcome ordered by category, date, pet name:
Category|pet name| date |food price|vet expenses|vat|Rankfp|Rankve|Rankvat
A | jack |2017-08-28| 12.98 | 2424 |23 | 2 | 1 |1
A | smithy |2017-08-28| 22.35 | 2324 |12 | 1 | 2 |2
A | jack |2017-08-29| 2339 | 2424 |23 | 1 | 2 |1
A | smithy |2017-08-29| 123.35 | 2432 |22 | 2 | 1 |2
B | casio |2017-08-28| 11.38 | 44324 |32 | 2 | 1 |2
B | lala |2017-08-28| 343.36 | 42342 |54 | 1 | 2 |1
B | casio |2017-08-29| 2.24 | 3232 |43 | 2 | 2 |2
B | lala |2017-08-29| 34.69 | 22432 |54 | 1 | 1 |1
C | blue |2017-08-28| 223.02 | 534654 |78 | 2 | 1 |1
C | collie |2017-08-28| 232.05 | 4765 |43 | 1 | 2 |2
C | blue |2017-08-29| 321.01 | 6654 |67 | 1 | 1 |1
C | collie |2017-08-29| 233.03 | 4654 |65 | 2 | 2 |2
The code I have below only ranks by category, but does not group by food price, vet expenses and vat.
RANK ()OVER(PARTITION BY [Category], [Date] order by [Category] ,[Pet Name],[Date]) as 'Rank'
Would it be a case of grouping the costs separately then left joining the rankings on to the original data?
(I will be using pivots and slicers in excel so want to have all the data on one table/query)

After walking away for a while to refresh my brain, I had a eureka moment and solved this. It was actually easy once I thought about it.
The code to get the desired table goes something like this:
select *
, rank ()OVER(PARTITION BY [Category], [date] order by [food price], [Category] ,[pet name],[date]) as 'Rankfp'
, rank ()OVER(PARTITION BY [Category], [date] order by [vet expenses], [Category] ,[pet name], [date]) as 'Rankve'
, rank ()OVER(PARTITION BY [Category], [date] order by [vat], [Category] ,[pet name], [date]) as 'Rankvat'
from petcost
order by [category], [pet name]
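The desired output in the question appears to give rank 1 to the highest value on each date within a category, so a descending sort may be what is wanted. Below is a minimal sketch under that assumption. It uses an inline VALUES list holding the four category A rows from the question as a stand-in for petcost, so it can be run on its own. The partition columns are left out of ORDER BY because they are constant within a partition.

```sql
-- Assumption: rank 1 = highest value within each (Category, date) group.
WITH petcost AS (
    SELECT *
    FROM (VALUES
        ('A', 'jack',   CAST('2017-08-28' AS date),   12.98, 2424, 23),
        ('A', 'jack',   CAST('2017-08-29' AS date), 2339.00, 2424, 23),
        ('A', 'smithy', CAST('2017-08-28' AS date),   22.35, 2324, 12),
        ('A', 'smithy', CAST('2017-08-29' AS date),  123.35, 2432, 23)
    ) AS v ([Category], [pet name], [date], [food price], [vet expenses], [vat])
)
SELECT *
     , RANK() OVER (PARTITION BY [Category], [date] ORDER BY [food price]   DESC) AS Rankfp
     , RANK() OVER (PARTITION BY [Category], [date] ORDER BY [vet expenses] DESC) AS Rankve
     , RANK() OVER (PARTITION BY [Category], [date] ORDER BY [vat]          DESC) AS Rankvat
FROM petcost
ORDER BY [Category], [pet name], [date];
```

RANK() gives tied values the same rank, so two pets with an equal vat on the same date would both get the same Rankvat.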

Related

PostgreSQL filter entity with intermediate table

I would like to write a query that filters the entities of one table by their relations in an intermediate table.
Like this:
FIRST_TABLE
---------------------
|A_ID | TITLE |
|--------------------
|1 | TEST1 |
|2 | TEST2 |
|3 | TEST3 |
|4 | TEST4 |
---------------------
SECOND_TABLE
---------------------
|B_ID | NAME |
|--------------------
|1 | NAME1 |
|2 | NAME2 |
|3 | NAME3 |
|4 | NAME4 |
---------------------
INTERMEDIATE_TABLE
-----------------
|A_FK | B_FK|
|----------------
|2 | 1 |
|2 | 2 |
|2 | 3 |
|3 | 1 |
-----------------
QUERY
SELECT * FROM FIRST_TABLE ft
JOIN INTERMEDIATE_TABLE it
ON ft.A_ID = it.A_FK
WHERE it.B_FK = 1
AND it.B_FK = 2
Then it should only show the entity 2 from first_table because this entity has a relation with NAME1 and NAME2.
How can I make this work?
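A single row of the intermediate table can never have B_FK equal to both 1 and 2, so the WHERE clause above matches nothing. One common way to express "related to both" is to group the intermediate rows per entity and count the distinct matches. A sketch, assuming the table and column names from the question:

```sql
-- Keep only entities linked to BOTH B_ID 1 and B_ID 2.
SELECT ft.*
FROM first_table ft
WHERE ft.a_id IN (
    SELECT it.a_fk
    FROM intermediate_table it
    WHERE it.b_fk IN (1, 2)
    GROUP BY it.a_fk
    HAVING COUNT(DISTINCT it.b_fk) = 2   -- number of required B_FK values
);
```

With the sample data this should return only the row with A_ID 2: entity 3 is linked to NAME1 but not to NAME2.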

How to join two tables using a third table when NULLS are involved

I have two time card tables that I need to join. The two tables should be joined by the week ID, and employee resource code if applicable. However, with the exception of one week, the data that the two tables contain is from different time frames (i.e. in most cases there will not be matching data in both tables).
The first table (dt5) has that week’s ID, the employee's resource code, that employee's capacity for that week, and their actual hours worked for that week.
dt5:
+---------------+---------------+----------+---------------+
| id | Resource_code | capacity | time_reported |
+---------------+---------------+----------+---------------+
| 1 | 555 | 40 | 40 |
| 1 | 333 | 25 | 20 |
| 2 | 555 | 40 | 40 |
| 2 | 333 | 25 | 20 |
| 3 | 555 | 40 | 40 |
| 3 | 333 | 25 | 20 |
| 4 | 555 | 40 | 39 |
| 4 | 333 | 25 | 24 |
+---------------+---------------+----------+---------------+
The second table (dt4) has the week’s ID, the employee's resource code, and the employee's planned hours for that week.
dt4:
+---------------+---------------+---------------+
| id | Resource_code | planned_hours |
+---------------+---------------+---------------+
| 4 | 555 | 30 |
| 4 | 333 | 20 |
| 5 | 555 | 30 |
| 5 | 333 | 20 |
| 6 | 555 | 30 |
| 6 | 333 | 20 |
+---------------+---------------+---------------+
When an employee completes their time card, the planned hours data is removed; before this occurs, there is a short period of time when the data overlaps (when both tables have data for the same period, like period 4 in my example tables). Because the two tables will only have one period in common at any given time, I am using a third table (gtd) that contains each week's ID to help join them.
gtd:
+----+------------+----------+
| id | start_date | end_date |
+----+------------+----------+
| 1 | 10 | 20 |
| 2 | 30 | 40 |
| 3 | 50 | 60 |
| 4 | 70 | 80 |
| 5 | 90 | 100 |
| 6 | 110 | 120 |
| 7 | 130 | 140 |
| 8 | 150 | 160 |
| 9 | 170 | 180 |
| 10 | 190 | 200 |
+----+------------+----------+
(Dates are changed to integers in this example for simplicity.)
My result should look like this:
Note that the week 4 rows contain data from both dt4 and dt5 (capacity, time reported, planned hours), because week 4 is the only overlapping week.
+----+---------------+----------+---------------+---------------+---------------+
| id | Resource_code | capacity | time_reported | Resource_code | planned_hours |
+----+---------------+----------+---------------+---------------+---------------+
| 1 | 555 | 40 | 40 | NULL | NULL |
| 1 | 333 | 25 | 20 | NULL | NULL |
| 2 | 555 | 40 | 40 | NULL | NULL |
| 2 | 333 | 25 | 20 | NULL | NULL |
| 3 | 555 | 40 | 40 | NULL | NULL |
| 3 | 333 | 25 | 20 | NULL | NULL |
| 4 | 555 | 40 | 39 | 555 | 30 |
| 4 | 333 | 25 | 24 | 333 | 20 |
| 5 | NULL | NULL | NULL | 555 | 30 |
| 5 | NULL | NULL | NULL | 333 | 20 |
| 6 | NULL | NULL | NULL | 555 | 30 |
| 6 | NULL | NULL | NULL | 333 | 20 |
| 7 | NULL | NULL | NULL | NULL | NULL |
| 8 | NULL | NULL | NULL | NULL | NULL |
| 9 | NULL | NULL | NULL | NULL | NULL |
| 10 | NULL | NULL | NULL | NULL | NULL |
+----+---------------+----------+---------------+---------------+---------------+
Here is the SQL I have so far:
SELECT
gtd.id,
dt5.resource_code,
dt5.capacity,
dt5.time_reported,
dt4.resource_code,
dt4.planned_hours
FROM gtd
LEFT JOIN dt5 ON gtd.id = dt5.id
LEFT OUTER JOIN dt4 ON gtd.id = dt4.id
My (incorrect) results are shown below:
The errors are occurring in the week 4 rows. In two of the week 4 rows, the resource code and planned hours information from dt4 does not match up with the resource code from dt5.
+----+---------------+----------+---------------+---------------+---------------+
| id | resource_code | capacity | time_reported | resource_code | planned_hours |
+----+---------------+----------+---------------+---------------+---------------+
| 1 | 555 | 40 | 40 | NULL | NULL |
| 1 | 333 | 25 | 20 | NULL | NULL |
| 2 | 555 | 40 | 40 | NULL | NULL |
| 2 | 333 | 25 | 20 | NULL | NULL |
| 3 | 555 | 40 | 40 | NULL | NULL |
| 3 | 333 | 25 | 20 | NULL | NULL |
| 4 | 555 | 40 | 39 | 555 (Correct) | 30 |
| 4 | 555 | 40 | 39 | 333 (Wrong) | 20 |
| 4 | 333 | 25 | 24 | 555 (Wrong) | 30 |
| 4 | 333 | 25 | 24 | 333 (Correct) | 20 |
| 5 | NULL | NULL | NULL | 555 | 30 |
| 5 | NULL | NULL | NULL | 333 | 20 |
| 6 | NULL | NULL | NULL | 555 | 30 |
| 6 | NULL | NULL | NULL | 333 | 20 |
| 7 | NULL | NULL | NULL | NULL | NULL |
| 8 | NULL | NULL | NULL | NULL | NULL |
| 9 | NULL | NULL | NULL | NULL | NULL |
| 10 | NULL | NULL | NULL | NULL | NULL |
+----+---------------+----------+---------------+---------------+---------------+
Based off my research, I think that I am either incorrectly using JOINS, or that I need a CASE statement somewhere. I’ve also tried joining the tables on resource code, but that eliminated a lot of my data. Any solutions or pointers in the right direction would be much appreciated.
I am using tsql.
*Edited my question to fix inconsistencies with the column names (period_number changed to id)
There's no doubt a simpler and more elegant solution than my answer, but since I'm very tired here's a brute force approach:
Use UNION to mash the two tables together. You'll need to manufacture dummy information that is only present in one table (such as Capacity).
Take the combined table and organise the data using GROUP BY:
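-- Aggregate so each (Period, RC) pair collapses to one row; the 0 placeholders drop out of the sums
SELECT f1.Period, f1.RC, SUM(f1.PlanTime) AS PlanTime, SUM(f1.ActTime) AS ActTime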
FROM
(SELECT
dt5.period_number AS 'Period',
dt5.resource_code AS 'RC',
dt5.capacity AS 'ActCap',
0 AS 'PlanTime',
dt5.time_reported AS 'ActTime'
FROM dt5
UNION ALL
SELECT
dt4.period_number AS 'Period',
dt4.resource_code AS 'RC',
0 AS 'ActCap',
dt4.planned_hours AS 'PlanTime',
0 AS 'ActTime'
FROM dt4) AS f1
GROUP BY f1.Period, f1.RC
I do not think you need the gtd table. Please try this and see if it works for you. Please correct me if my understanding of your request is incorrect.
SELECT COALESCE(dt5.period_number, dt4.period_number) AS period_number,
dt5.Resource_code,
dt5.capacity,
dt5.time_reported,
dt4.Resource_code,
dt4.planned_hours
FROM dt5
FULL OUTER JOIN (
SELECT *
FROM dt4 a
WHERE NOT EXISTS (
SELECT 1
FROM dt5 b
WHERE b.period_number = a.period_number
AND b.Resource_code = a.Resource_code
)
) dt4
ON dt5.period_number = dt4.period_number
AND dt4.Resource_code = dt5.Resource_code
ORDER BY COALESCE(dt5.period_number, dt4.period_number) ASC
Test Data
;WITH cte_dt5(period_number,Resource_code,capacity,time_reported) AS
(
SELECT 1, 555, 40, 40 UNION ALL
SELECT 1, 333, 25, 20 UNION ALL
SELECT 2, 555, 40, 40 UNION ALL
SELECT 2, 333, 25, 20 UNION ALL
SELECT 3, 555, 40, 40 UNION ALL
SELECT 3, 333, 25, 20 UNION ALL
SELECT 4, 555, 40, 39 UNION ALL
SELECT 4, 333, 25, 24
)
,cte_dt4 (period_number, Resource_code, planned_hours) AS
(
SELECT 4, 555, 30 UNION ALL
SELECT 4, 333, 20 UNION ALL
SELECT 5, 555, 30 UNION ALL
SELECT 5, 333, 20 UNION ALL
SELECT 6, 555, 30 UNION ALL
SELECT 6, 333, 20
)
SELECT COALESCE(dt5.period_number, dt4.period_number) AS period_number,
dt5.Resource_code,
dt5.capacity,
dt5.time_reported,
dt4.Resource_code,
dt4.planned_hours
FROM cte_dt5 AS dt5
FULL OUTER JOIN (
SELECT *
FROM cte_dt4 a
WHERE NOT EXISTS (
SELECT 1
FROM cte_dt5 b
WHERE b.period_number = a.period_number
AND b.Resource_code = a.Resource_code
)
) dt4
ON dt5.period_number = dt4.period_number
AND dt4.Resource_code = dt5.Resource_code
ORDER BY COALESCE(dt5.period_number, dt4.period_number) ASC
Result
+---------------------------------------------------------------------------------+
|period_number|Resource_code|capacity |time_reported|Resource_code|planned_hours|
+-------------|-------------|-----------|-------------|-------------|-------------+
|1 |555 |40 |40 |NULL |NULL |
|1 |333 |25 |20 |NULL |NULL |
|2 |555 |40 |40 |NULL |NULL |
|2 |333 |25 |20 |NULL |NULL |
|3 |555 |40 |40 |NULL |NULL |
|3 |333 |25 |20 |NULL |NULL |
|4 |555 |40 |39 |NULL |NULL |
|4 |333 |25 |24 |NULL |NULL |
|5 |NULL |NULL |NULL |333 |20 |
|5 |NULL |NULL |NULL |555 |30 |
|6 |NULL |NULL |NULL |333 |20 |
|6 |NULL |NULL |NULL |555 |30 |
+---------------------------------------------------------------------------------+
Code changed as per the OP's request below. Commenting out the NOT EXISTS clause gives the desired result.
user7571220: Thank you for your help! Everything is correct except for
the planned hours and resource code (which come from dt4) in week 4. I
am trying to include data from both tables in the week that they
overlap (week 4). I'm essentially trying to get the data for week 4 to
look like the comments I've posted below.
| 4 | 555 | 40 | 39 | 555 | 30 |
| 4 | 333 | 25 | 24 | 333 | 20 |
SELECT COALESCE(dt5.period_number, dt4.period_number) AS period_number,
dt5.Resource_code,
dt5.capacity,
dt5.time_reported,
dt4.Resource_code,
dt4.planned_hours
FROM cte_dt5 AS dt5
FULL OUTER JOIN (
SELECT *
FROM cte_dt4 a
--WHERE NOT EXISTS (
-- SELECT 1
-- FROM cte_dt5 b
-- WHERE b.period_number = a.period_number
-- AND b.Resource_code = a.Resource_code
-- )
) dt4
ON dt5.period_number = dt4.period_number
AND dt4.Resource_code = dt5.Resource_code
ORDER BY COALESCE(dt5.period_number, dt4.period_number) ASC
Result
+---------------------------------------------------------------------------------+
|period_number|Resource_code|capacity |time_reported|Resource_code|planned_hours|
+-------------|-------------|-----------|-------------|-------------|-------------+
|1 |555 |40 |40 |NULL |NULL |
|1 |333 |25 |20 |NULL |NULL |
|2 |555 |40 |40 |NULL |NULL |
|2 |333 |25 |20 |NULL |NULL |
|3 |555 |40 |40 |NULL |NULL |
|3 |333 |25 |20 |NULL |NULL |
|4 |555 |40 |39 |555 |30 |
|4 |333 |25 |24 |333 |20 |
|5 |NULL |NULL |NULL |333 |20 |
|5 |NULL |NULL |NULL |555 |30 |
|6 |NULL |NULL |NULL |333 |20 |
|6 |NULL |NULL |NULL |555 |30 |
+---------------------------------------------------------------------------------+
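If weeks with no data in either table (7 to 10 in the question) should still appear, the gtd table can drive the query, with the two time card tables full-joined on both week and resource code first. This is a sketch only, not tested against the real schema. It uses the question's edited column name id (the answer above uses period_number), and the aliases dt5_resource_code and dt4_resource_code are made up here to tell the two resource code columns apart.

```sql
SELECT gtd.id,
       t.dt5_resource_code,
       t.capacity,
       t.time_reported,
       t.dt4_resource_code,
       t.planned_hours
FROM gtd
LEFT JOIN (
    -- Pair dt5 and dt4 rows only when week AND employee match.
    SELECT COALESCE(dt5.id, dt4.id) AS id,
           dt5.resource_code AS dt5_resource_code,
           dt5.capacity,
           dt5.time_reported,
           dt4.resource_code AS dt4_resource_code,
           dt4.planned_hours
    FROM dt5
    FULL OUTER JOIN dt4
        ON  dt4.id = dt5.id
        AND dt4.resource_code = dt5.resource_code
) AS t
    ON t.id = gtd.id
ORDER BY gtd.id;
```

Joining on the resource code as well as the week is what removes the mismatched week 4 pairs from the question's original query, and the outer LEFT JOIN from gtd keeps the weeks for which neither table has rows.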

Bio-Metric device record

Hi, I get data from a biometric device like this:
|Id |EmpCode | WorkDate |InOutMode
|247 |51 | 2017-02-13 20:08:52.000 |0
|392 |51 | 2017-02-13 22:38:51.000 |1
|405 |51 | 2017-02-13 22:59:18.000 |0
|415 |51 | 2017-02-13 23:18:17.000 |1
|423 |51 | 2017-02-13 23:33:44.000 |0
|456 |51 | 2017-02-13 01:30:15.000 |1
|463 |51 | 2017-02-13 02:52:02.000 |0
|483 |51 | 2017-02-13 05:11:54.000 |1
|1034 |51 | 2017-02-14 20:09:23.000 |0
|1172 |51 | 2017-02-14 21:59:23.000 |1
|1217 |51 | 2017-02-14 22:30:28.000 |0
|1214 |51 | 2017-02-14 22:30:39.000 |0
|1238 |51 | 2017-02-14 22:49:51.000 |1
|1257 |51 | 2017-02-14 23:19:10.000 |0
|1315 |51 | 2017-02-14 05:04:16.000 |1
|1323 |51 | 2017-02-14 05:05:17.000 |0
|1329 |51 | 2017-02-14 05:08:17.000 |1
|1330 |51 | 2017-02-14 05:08:18.000 |1
I want to turn the records above into a result like this:
|EmpCode |WorkDate |CheckIn |CheckOut |TotalHours
|51 |2017-02-13 |20:08:52 |22:38:51 |2.499722000
|51 |2017-02-13 |22:59:18 |23:18:17 |0.316388000
|51 |2017-02-13 |23:33:44 |01:30:15 |3.103330000
|51 |2017-02-13 |02:52:02 |05:11:54 |2.331111000
|51 |2017-02-14 |20:09:23 |21:59:23 |1.833333000
|51 |2017-02-14 |22:30:28 |22:49:51 |0.323055000
|51 |2017-02-14 |23:19:10 |05:04:16 |5.323055000
|51 |2017-02-14 |05:05:17 |05:08:18 |0.050000000
PS: 1. Duplicate IN or OUT records are ignored (the 13th, 14th, 17th and 18th lines in the raw data). 2. Minutes are expressed as a decimal fraction of the hour in the hours calculation.
I need help with the SQL Server query to get these results.
My current code does not help: it leaves out some rows and produces wrong results and hour totals. Thanks :)
Note: when my query executes, these two rows are missing:
|456 |51 | 2017-02-13 01:30:15.000 |1
|463 |51 | 2017-02-13 02:52:02.000 |0
Assuming 0 is In and 1 is Out.
I included an Overnight column to return 1 when CheckOut is on the next day. You can comment it out if you do not need it.
using cross apply()
rextester: http://rextester.com/ENFRC28977
with cte as (
select
Id
, EmpCode
, WorkDate
, InOutMode
, Lag_InOutMode = Lag(InOutMode) over (order by EmpCode, WorkDate)
from t
)
select
i.EmpCode
, WorkDate = convert(varchar(10),convert(date,i.WorkDate))
, Overnight = case when datediff(day,i.WorkDate,o.WorkDate)>0 then 1 else 0 end
, CheckIn = convert(time,i.WorkDate)
, CheckOut = convert(time,o.WorkDate)
, TotalHours = datediff(second,i.WorkDate,o.WorkDate)/3600.0
from cte i
cross apply (
select top 1 WorkDate
from cte o
where o.EmpCode = i.EmpCode
and o.InOutMode = 1
and o.Lag_InOutMode != 1
and o.WorkDate > i.WorkDate
order by o.WorkDate asc
) as o
where i.InOutMode = 0
and i.Lag_InOutMode != 0
order by i.WorkDate
returns:
+---------+------------+-----------+----------+----------+------------+
| EmpCode | WorkDate | Overnight | CheckIn | CheckOut | TotalHours |
+---------+------------+-----------+----------+----------+------------+
| 51 | 2017-02-13 | 0 | 02:52:02 | 05:11:54 | 2,331111 |
| 51 | 2017-02-13 | 0 | 20:08:52 | 22:38:51 | 2,499722 |
| 51 | 2017-02-13 | 0 | 22:59:18 | 23:18:17 | 0,316388 |
| 51 | 2017-02-13 | 1 | 23:33:44 | 05:04:16 | 5,508888 |
| 51 | 2017-02-14 | 0 | 05:05:17 | 05:08:17 | 0,050000 |
| 51 | 2017-02-14 | 0 | 20:09:23 | 21:59:23 | 1,833333 |
| 51 | 2017-02-14 | 0 | 22:30:28 | 22:49:51 | 0,323055 |
+---------+------------+-----------+----------+----------+------------+
I do not see a 0 InOutMode prior to '2017-02-13 01:30:15', so my results do not contain a row for:
|51 |2017-02-13 |23:33:44 |01:30:15 |3.103330000

Counting Retrieved Records from SQL Server

How can I create a dynamic numbering column based on the rows I retrieve from the database?
Ex.
Table Reservations
|ReservationNo----ClientNo------DateAdded----DateModified|
|1 | 1 | 01-01-01 | 01-01-01 |
|2 | 2 | 01-01-01 | 01-01-01 |
|3 | 2 | 01-01-01 | 01-01-01 |
|4 | 2 | 01-01-01 | 01-01-01 |
|5 | 1 | 01-01-01 | 01-01-01 |
|6 | 3 | 01-01-01 | 01-01-01 |
|7 | 3 | 01-01-01 | 01-01-01 |
|8 | 2 | 01-01-01 | 01-01-01 |
|9 | 1 | 01-01-01 | 01-01-01 |
|10 | 1 | 01-01-01 | 01-01-01 |
When I execute the statement below...
SELECT * FROM Table WHERE ClientNo = '1'
Desired result:
**Counter**-----ReservationNo----Client--------DateAdded----DateModified|
|1 | 1 | 1 | 01-01-01 | 01-01-01 |
|2 | 5 | 1 | 01-01-01 | 01-01-01 |
|3 | 9 | 1 | 01-01-01 | 01-01-01 |
|4 | 10 | 1 | 01-01-01 | 01-01-01 |
You could use the row_number() function:
select row_number() over (order by ReservationNo) as Counter
, *
from YourTable
order by
ReservationNo
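Because window functions are evaluated after the WHERE clause, adding the question's filter makes the counter restart at 1 for the filtered rows. A self-contained sketch, using a few of the question's rows in an inline VALUES list (the date columns are omitted for brevity):

```sql
-- Stand-in for the Reservations table: (ReservationNo, ClientNo) pairs only.
WITH Reservations AS (
    SELECT *
    FROM (VALUES (1, 1), (2, 2), (5, 1), (9, 1), (10, 1))
         AS v (ReservationNo, ClientNo)
)
SELECT ROW_NUMBER() OVER (ORDER BY ReservationNo) AS Counter
     , ReservationNo
     , ClientNo
FROM Reservations
WHERE ClientNo = 1          -- the filter is applied before numbering
ORDER BY ReservationNo;
```

Here reservations 1, 5, 9 and 10 get Counter values 1 to 4, matching the layout the question asks for.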
Looks like you're searching for the ROW_NUMBER function, see http://msdn.microsoft.com/de-de/library/ms186734.aspx
It seems you need the total number of rows returned by each condition/query.
If that's the case, COUNT(*) OVER() will meet your requirements.
SELECT ReservationNo
,ClientNo
,DateAdded
,DateModified
,COUNT(*) OVER()
FROM Reservations
-- WHERE ...   (add a filter condition here if required)

Select top n records based on ordinal and attribute data

I have a case where I need to show only the top rows based on a setting in a table and the ordinal set.
The example dataset below shows two customers; each customer has different products.
Since NumRowsToShow is "1" I only want to show one row (the top row based on ordinal) for EACH Customer.
| CustomerID | ProductID | Ordinal | NumRowsToShow |
+------------+-----------+---------+---------------+
| 1 |A |1 |1 |
| 1 |B |2 |1 |
| 1 |C |3 |1 |
| 5 |D |1 |1 |
| 5 |E |2 |1 |
| 5 |F |3 |1 |
The result set after query is run should be
| CustomerID | ProductID |
+------------+-----------+
| 1 |A |
| 5 |D |
In the same scenario if NumRowsToShow were 1 for customerID 1 and 2 for CustomerID 5 I would see something like.
| CustomerID | ProductID | Ordinal | NumRowsToShow |
+------------+-----------+---------+---------------+
| 1 |A |1 |1 |
| 1 |B |2 |1 |
| 1 |C |3 |1 |
| 5 |D |1 |2 |
| 5 |E |2 |2 |
| 5 |F |3 |2 |
The result set after query is run should be
| CustomerID | ProductID |
+------------+-----------+
| 1 |A |
| 5 |D |
| 5 |E |
How can this be done?
Including a screen cap of actual result set with highlights of what I'm trying to filter down to which may be a little helpful.
(source: harpernet.net)
It feels like "cheating in the exams":
SELECT CustomerID, ProductID
FROM tableX
WHERE Ordinal <= NumRowsToShow
If, as comments suggest, the Ordinal can have 10, 20, 30 values and not only 1, ..., n values, then this will work:
SELECT t.CustomerID, t.ProductID
FROM tableX AS t
JOIN tableX AS tt
ON tt.CustomerID = t.CustomerID
AND tt.Ordinal <= t.Ordinal
GROUP BY t.CustomerID
, t.ProductID
, t.NumRowsToShow
HAVING COUNT(*) <= t.NumRowsToShow
or, even better, using ROW_NUMBER():
SELECT CustomerID, ProductID
FROM
( SELECT CustomerID, ProductID, NumRowsToShow
, ROW_NUMBER() OVER( PARTITION BY CustomerID
ORDER BY Ordinal
) AS Rn
FROM tableX
) AS tmp
WHERE Rn <= NumRowsToShow ;
Test in: SQL-Fiddle
Your table does not look normalized. The NumRowsToShow column holds duplicated information, and that can lead to update anomalies. This:
| CustomerID | ProductID | Ordinal | NumRowsToShow |
+------------+-----------+---------+---------------+
| 1 |A |1 |1 |
| 1 |B |2 |1 |
| 1 |C |3 |1 |
| 5 |D |1 |2 |
| 5 |E |2 |2 |
| 5 |F |3 |2 |
could be normalized to 2 tables:
| CustomerID | ProductID | Ordinal |
+------------+-----------+---------+
| 1 |A |1 |
| 1 |B |2 |
| 1 |C |3 |
| 5 |D |1 |
| 5 |E |2 |
| 5 |F |3 |
and:
| CustomerID | NumRowsToShow |
+------------+---------------+
| 1 |1 |
| 5 |2 |
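With that split, the ROW_NUMBER() query only needs a join to pick up each customer's limit. A sketch, in which the table names CustomerProducts and CustomerSettings are hypothetical labels for the two normalized tables above:

```sql
-- Hypothetical table names: CustomerProducts (CustomerID, ProductID, Ordinal)
-- and CustomerSettings (CustomerID, NumRowsToShow).
SELECT p.CustomerID, p.ProductID
FROM (
    SELECT CustomerID, ProductID,
           ROW_NUMBER() OVER (PARTITION BY CustomerID ORDER BY Ordinal) AS Rn
    FROM CustomerProducts
) AS p
JOIN CustomerSettings AS s
    ON s.CustomerID = p.CustomerID
WHERE p.Rn <= s.NumRowsToShow   -- keep the first N products per customer
ORDER BY p.CustomerID, p.Rn;
```

Because Rn is computed per customer from the order of Ordinal, this also works when Ordinal has gaps such as 10, 20, 30.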
