SQL Rank does not work as expected

I'm trying to use the SQL function Rank() to get a list of the top records of several groups. Here is the query I started with:
select hc.hId, hc.DpId, hc.Rank
from (
select d.hId, DpId, Rank()
OVER (Partition by DpId ORDER BY d.hId) AS Rank
FROM CurDp d
INNER JOIN HostList h on d.DpId = h.hId
INNER JOIN Coll_hList pch on d.hId = pch.hId
where h.Model = 'PRIMARY'
) hc where hc.Rank <= 10
I get the top 10 records as follows:
HId | DpId | Rank
-------x------x------
7 | 590 | 1
18 | 590 | 2
23 | 590 | 3
24 | 590 | 4
26 | 590 | 5
36 | 590 | 6
63 | 590 | 7
80 | 590 | 8
84 | 590 | 9
88 | 590 | 10
But I need CROSS APPLY, because I have to get this kind of record set for different models, so I use this code:
select pch.hId, cc.DpId, cc.Rank from Coll_hList pch
cross apply
(
select hc.hId, hc.DpId, hc.Rank
from (
select d.hId, DpId, Rank()
OVER (Partition by DpId ORDER BY d.hId) AS Rank
FROM CurrDp d
INNER JOIN HostList h on d.DpId = h.hId
where h.Model = 'PRIMARY' and d.hId = pch.hId
) hc where hc.Rank <= 10
) cc
Here I always get rank 1, and nothing is filtered out (only part of the result is shown):
HId | DpId | Rank
-------x------x------
7 | 590 | 1
18 | 590 | 1
23 | 590 | 1
24 | 590 | 1
26 | 590 | 1
36 | 590 | 1
63 | 590 | 1
80 | 590 | 1
84 | 590 | 1
88 | 590 | 1
124 | 590 | 1
125 | 590 | 1
133 | 590 | 1
Am I doing it wrong? Is it because of CROSS APPLY?
I also tried dense_rank() instead of rank(), but it gives the same result.
Any help achieving this with CROSS APPLY would be greatly appreciated.
Thanks

In the first case, you join on Coll_hList and get a result set of more than 10 entries, which are then ranked.
In the second case, the sub-select inside the APPLY is evaluated once per row of Coll_hList and, because of d.hId = pch.hId, only creates a one-entry result set each time. Ranking a one-entry set always gives rank 1.
Your ranking has to be done in the outer statement:
select pch.hId, cc.DpId, Rank()
OVER (Partition by cc.DpId ORDER BY cc.hId) AS Rank
from Coll_hList pch
cross apply
(
select d.hId, DpId
FROM CurrDp d
INNER JOIN HostList h on d.DpId = h.hId
where h.Model = 'PRIMARY' and d.hId = pch.hId
) cc
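To see the effect in isolation, here is a minimal sketch that uses Python's sqlite3 module as a stand-in for SQL Server (SQLite has no CROSS APPLY, so the per-row evaluation is imitated with a loop; window functions need SQLite 3.25 or later). The four sample rows and the omission of the HostList join are assumptions made only to keep the example short. It also wraps the outer ranking in a derived table, since a window function cannot be referenced directly in WHERE, to get back the "top N" filter (N = 3 here).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CurrDp (hId INTEGER, DpId INTEGER);
    INSERT INTO CurrDp VALUES (7, 590), (18, 590), (23, 590), (24, 590);
    CREATE TABLE Coll_hList (hId INTEGER);
    INSERT INTO Coll_hList VALUES (7), (18), (23), (24);
""")

# Imitate APPLY: run the ranking subquery once per outer row.
# Each run only sees rows with one hId, so the rank is always 1.
per_row = []
for (h_id,) in conn.execute("SELECT hId FROM Coll_hList ORDER BY hId").fetchall():
    per_row += conn.execute(
        "SELECT hId, DpId, RANK() OVER (PARTITION BY DpId ORDER BY hId) "
        "FROM CurrDp WHERE hId = ?",
        (h_id,),
    ).fetchall()

# Rank over the whole joined set, then filter in an outer query.
outer = conn.execute("""
    SELECT hId, DpId, rnk
      FROM (SELECT d.hId AS hId, d.DpId AS DpId,
                   RANK() OVER (PARTITION BY d.DpId ORDER BY d.hId) AS rnk
              FROM Coll_hList pch
              JOIN CurrDp d ON d.hId = pch.hId) AS ranked
     WHERE rnk <= 3
     ORDER BY DpId, rnk
""").fetchall()

print(per_row)  # every rank is 1
print(outer)    # ranks 1, 2, 3
```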

Related

Calculate row difference within groups

I'm looking for help with calculating the difference between consecutive ordered rows within groups in SQL (Microsoft SQL Server).
I have a table like this:
ID School_ID Enrollment_Start_Date Order
1 56 1/1/2018 10
1 56 5/5/2018 24
1 56 7/7/2018 35
1 103 4/4/2019 26
1 103 3/3/2019 19
I want to calculate the difference in Order between consecutive rows, grouped by ID and School_ID and ordered by Enrollment_Start_Date.
So I want something like this:
ID School_ID Enrollment_Start_Date Order Diff
1 56 1/1/2018 10 10 # nothing to be subtracted from 10
1 56 5/5/2018 24 14 # 24-10
1 56 7/7/2018 35 11 # 35-24
1 103 3/3/2019 19 19 # nothing to be subtracted from 19
1 103 4/4/2019 26 7 # 26-19
I have hundreds of IDs, and each ID can have at most 6 Enrollment_Start_Date values, so I'm looking for a generalizable implementation.

Use the LAG(<column>) analytic function to obtain the "previous" value of the column, where "previous" is defined by the PARTITION BY and ORDER BY in the OVER part. Subtract the current value from it and multiply by -1 to flip the sign. If there is no previous value (LAG returns NULL), take the current value instead.
Pseudo code would be:
If previous_order_value exists:
-1 * (previous_order_value - current_order_value)
Else
current_order_value
where previous_order_value is based on the same id & school_id and is sorted by enrollment_start_date in ascending order
SQL Code:
select
id,
school_id,
enrollment_start_date,
[order],
coalesce(-1 * (lag([order]) over (partition by id, school_id order by enrollment_start_date ) - [order]), [order]) as diff
from yourtable
Also note that ORDER is a reserved keyword in SQL Server, which is why the column name has to be wrapped in [ ]. I suggest using some other name for this column, if possible.
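As a quick check of the LAG approach against the sample data, here is a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The table name enrollment and the ISO date strings are assumptions for illustration, and the expression is written as current minus previous, which is equivalent to the -1 * (previous - current) form.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE enrollment (id INTEGER, school_id INTEGER,
                             enrollment_start_date TEXT, "order" INTEGER);
    INSERT INTO enrollment VALUES
        (1, 56,  '2018-01-01', 10),
        (1, 56,  '2018-05-05', 24),
        (1, 56,  '2018-07-07', 35),
        (1, 103, '2019-04-04', 26),
        (1, 103, '2019-03-03', 19);
""")

# LAG returns NULL for the first row of each partition,
# so COALESCE falls back to the current value there.
rows = conn.execute("""
    SELECT school_id, enrollment_start_date, "order",
           COALESCE("order" - LAG("order") OVER (
                        PARTITION BY id, school_id
                        ORDER BY enrollment_start_date),
                    "order") AS diff
      FROM enrollment
     ORDER BY school_id, enrollment_start_date
""").fetchall()

diffs = [r[3] for r in rows]
print(diffs)
```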

Use the lag() analytic function to get the difference between two rows, and CASE WHEN to return the original value of the order column where there is no previous row:
with cte as
(
select 1 as id, 56 as sclid, '2018-01-01' as s_date, 10 as orders
union all
select 1,56,'2018-05-05',24 union all
select 1,56,'2018-07-07',35 union all
select 1,103,'2019-04-04',26 union all
select 1,103,'2019-03-03',19
) select t.*,
case when ( lag([orders])over(partition by id,sclid order by s_date ) -[orders] )
is null then [orders] else
( lag([orders])over(partition by id,sclid order by s_date ) -[orders] )*(-1) end
as diff
from cte t
output
id sclid s_date orders diff
1 56 2018-01-01 10 10
1 56 2018-05-05 24 14
1 56 2018-07-07 35 11
1 103 2019-03-03 19 19
1 103 2019-04-04 26 7

Use LAG(COLUMN_NAME)
Query
SELECT id, School_ID, Enrollment_Start_Date, cOrder,
ISNULL((cOrder - (LAG(cOrder) OVER(PARTITION BY id, School_ID ORDER BY Enrollment_Start_Date))),cOrder)Diff
FROM Table1
Sample Output
| id | School_ID | Enrollment_Start_Date | cOrder | Diff |
|----|-----------|-----------------------|--------|------|
| 1 | 56 | 2018-01-01 | 10 | 10 |
| 1 | 56 | 2018-05-05 | 24 | 14 |
| 1 | 56 | 2018-07-07 | 35 | 11 |
| 1 | 103 | 2019-03-03 | 19 | 19 |
| 1 | 103 | 2019-04-04 | 26 | 7 |

TSQL Divide set in groups

I have a table with 3 fields: wk, cor, id.
"wk" is the week, "cor" groups items from the same location, and "id" is the id of each item to retrieve from the warehouse.
Given a certain number of items to retrieve, I must take almost the same quantity of items from each group ("cor" represents the groups) to balance the warehouse load, while respecting week precedence (the previous week must be exhausted before moving on to the next one).
If you follow the link the image may be clear:
Data sample
Rows are taken in this order:
yellow, orange, green, gray (this last one starts with "cor 2" because "cor 1" was the last used in week 28)
The RES column (filled in by hand in the sample) represents the order in which I should take the items. Currently this is obtained with a cursor, which is very slow, and I'd like to do something better if possible. I've tried window functions, CTEs and recursive CTEs, but was not able to get anything right.
This script creates the same table:
CREATE TABLE #t (wk int, cor int, id int)
INSERT INTO #t
(
wk
,cor
,id
)
VALUES
(28,1,4044534),
(28,1,6778322),
(28,1,7921336),
(28,1,4326390),
(28,2,2669622),
(28,2,6580257),
(28,2,1179795),
(28,3,3980111),
(28,3,2549129),
(28,3,6763533),
(29,1,6023538),
(29,1,8219574),
(29,1,3836858),
(29,2,3355314),
(29,2,148847),
(29,2,8083320),
(29,3,1359966),
(29,3,8746308)
The expected result:
All fields are given while the RES field must be calculated and represents the order in which items will be taken out (explained below the table).
+----+-----+---------+-----+
| wk | cor | id | RES |
+----+-----+---------+-----+
| 28 | 1 | 4044534 | 1 |
| 28 | 1 | 6778322 | 4 |
| 28 | 1 | 7921336 | 7 |
| 28 | 1 | 4326390 | 10 |
| 28 | 2 | 2669622 | 2 |
| 28 | 2 | 6580257 | 5 |
| 28 | 2 | 1179795 | 8 |
| 28 | 3 | 3980111 | 3 |
| 28 | 3 | 2549129 | 6 |
| 28 | 3 | 6763533 | 9 |
| 29 | 1 | 6023538 | 11 |
| 29 | 1 | 8219574 | 14 |
| 29 | 1 | 3836858 | 17 |
| 29 | 2 | 3355314 | 12 |
| 29 | 2 | 148847 | 15 |
| 29 | 2 | 8083320 | 18 |
| 29 | 3 | 1359966 | 13 |
| 29 | 3 | 8746308 | 16 |
+----+-----+---------+-----+
The algorithm works like this:
The older week must be exhausted first (in the sample, wk 28 must be finished before taking items from wk 29).
Items must be distributed equally across the "cor"s, so if 10 items are required they must come out like this: 3 from cor 1, 3 from cor 2, 3 from cor 3. The last one may come from any cor, because 10 is not divisible by 3.
If 11 items are required, week 28 only contains 10 items, so the last one will be taken from week 29 on the same principle: distribute the picks equally among the cors, even when the week changes. If the last item from week 28 was taken from cor 1, the next one in week 29 will be taken from cor 2.

Does this solve your problem?
DROP TABLE IF EXISTS #temp
DROP TABLE IF EXISTS #temp2
CREATE TABLE #temp (idx INT PRIMARY KEY IDENTITY(1,1), wk int, cor int, id int)
INSERT INTO #temp
VALUES
(28,1,4044534),
(28,1,6778322),
(28,1,7921336),
(28,1,4326390),
(28,2,2669622),
(28,2,6580257),
(28,2,1179795),
(28,3,3980111),
(28,3,2549129),
(28,3,6763533),
(29,1,6023538),
(29,1,8219574),
(29,1,3836858),
(29,2,3355314),
(29,2,148847),
(29,2,8083320),
(29,3,1359966),
(29,3,8746308)
SELECT wk, cor, id
, ROW_NUMBER() OVER (ORDER BY wk, RES, idx) as RES
FROM (
SELECT idx
, wk
, cor
, id
, ROW_NUMBER() OVER (PARTITION BY wk, cor ORDER BY idx) AS RES -- order by idx: ordering by the partition column would be nondeterministic
FROM #temp
) AS t
ORDER BY idx
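The two-step numbering can be checked against the expected RES column with a minimal sketch like the one below. It uses Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later); the table name t and the alias pos are assumptions for illustration, and idx is an auto-increment column recording insertion order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE t (idx INTEGER PRIMARY KEY AUTOINCREMENT,
                                wk INTEGER, cor INTEGER, id INTEGER)""")
data = [
    (28, 1, 4044534), (28, 1, 6778322), (28, 1, 7921336), (28, 1, 4326390),
    (28, 2, 2669622), (28, 2, 6580257), (28, 2, 1179795),
    (28, 3, 3980111), (28, 3, 2549129), (28, 3, 6763533),
    (29, 1, 6023538), (29, 1, 8219574), (29, 1, 3836858),
    (29, 2, 3355314), (29, 2, 148847), (29, 2, 8083320),
    (29, 3, 1359966), (29, 3, 8746308),
]
conn.executemany("INSERT INTO t (wk, cor, id) VALUES (?, ?, ?)", data)

# Step 1: pos = position of the item inside its (wk, cor) group.
# Step 2: number all rows by week, then position, then cor,
#         which interleaves the cors round-robin within each week.
rows = conn.execute("""
    SELECT id, ROW_NUMBER() OVER (ORDER BY wk, pos, cor) AS res
      FROM (SELECT idx, wk, cor, id,
                   ROW_NUMBER() OVER (PARTITION BY wk, cor ORDER BY idx) AS pos
              FROM t) AS numbered
     ORDER BY idx
""").fetchall()

res = [r[1] for r in rows]
print(res)
```

Note that this restarts at the lowest cor in every week, which is what the expected table shows; it does not carry the "last used cor" over from the previous week.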

Your test data has no column that defines the order of the items within a (wk, cor) group, so the desired output cannot be derived from it.
If, however, you had an identity column that represents the insertion order (rn below), you could use something like the following...
WITH cte_RankOrder AS (
SELECT
t.rn, t.wk, t.cor, t.id,
RankOrder = DENSE_RANK() OVER (PARTITION BY t.wk, t.cor ORDER BY t.rn, t.wk)
FROM
#t t
)
SELECT
ro.rn, ro.wk, ro.cor, ro.id,
RES = ROW_NUMBER() OVER (ORDER BY wk, ro.RankOrder, ro.cor)
FROM
cte_RankOrder ro
ORDER BY ro.rn;
results...
rn wk cor id RES
----------- ----------- ----------- ----------- --------------------
1 28 1 4044534 1
2 28 1 6778322 4
3 28 1 7921336 7
4 28 1 4326390 10
5 28 2 2669622 2
6 28 2 6580257 5
7 28 2 1179795 8
8 28 3 3980111 3
9 28 3 2549129 6
10 28 3 6763533 9
11 29 1 6023538 11
12 29 1 8219574 14
13 29 1 3836858 17
14 29 2 3355314 12
15 29 2 148847 15
16 29 2 8083320 18
17 29 3 1359966 13
18 29 3 8746308 16
HTH, Jason

Print out same row multiple times based on calculated value

I have a query that returns something similar to the following:
Zone | NeededItems
===========================
209 | 5
213 | 1
216 | 1
220 | 2
218 | 1
219 | 4
215 | 1
The query behind it is something like:
SELECT
r.Zone as Zone, r.Required - COUNT(i.Item) as NeededItems
FROM
MyItems i
INNER JOIN
MyRequirements r ON i.Zone = r.Zone
GROUP BY
r.Zone, r.Required
Where MyItems looks like: (Value of Item doesn't matter)
Zone | Item
================
209 | a
209 | b
209 | c
216 | a
220 | a
213 | z
218 | x
219 | q
219 | w
219 | e
219 | r
215 | t
And MyRequirements looks like:
Zone | Required
======================
209 | 8
213 | 2
216 | 2
220 | 3
218 | 2
219 | 5
215 | 2
What I need to be able to do is print out the Zone multiple times based on the value in NeededItems. NeededItems is a calculated value, which is what makes this difficult (I can't just remove the count!).
So the results I am looking for is simply a list of zones, each appearing the number of times it is needed.
Zone
====
209
209
209
209
209
213
216
220
220
218
219
219
219
219
215
Is there any way in SQL that this can be done? Using SQL Server 2012.

Below is one way to do it. Using the e1, e2 and e3 queries is not the cleanest approach, but it is the only way I could get it working.
One big limitation: it only works for up to 1000 items per zone (more than enough for my case), because e1 yields 10 rows and e3 cross joins it up to 10 x 10 x 10. This could be changed by editing WHERE c<9, but be aware that e1 is recursive, so it is best not to generate more rows than needed.
WITH CTE as
(
SELECT
r.Zone as Zone, r.Required - COUNT(i.Item) as NeededItems
FROM
MyItems i
INNER JOIN
MyRequirements r ON i.Zone = r.Zone
GROUP BY
r.Zone, r.Required
),
e1(n,c ) AS
(
SELECT 1, 0
UNION ALL
SELECT n, c + 1
FROM e1
WHERE c<9
), -- 10
e2(n) AS
(
SELECT 1 FROM e1 CROSS JOIN e1 AS b -- 100
),
e3(n) AS
(
SELECT 1 FROM e1 CROSS JOIN e2 -- 1000
),
Numbers AS
(
SELECT n = ROW_NUMBER() OVER (ORDER BY n) FROM e3
)
SELECT
Zone
FROM
Numbers
INNER JOIN
CTE on CTE.NeededItems >= n ORDER BY Zone
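The core idea, joining each zone to a numbers table on n <= NeededItems, can be sketched as follows using Python's sqlite3 module as a stand-in for SQL Server. The table name needed is an assumption that simply holds the Zone / NeededItems result shown in the question. The sketch builds the numbers with one recursive CTE; in SQL Server a single recursive CTE counting to 1000 would exceed the default MAXRECURSION limit of 100, which is a likely reason for cross joining a 10-row CTE instead.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE needed (Zone INTEGER, NeededItems INTEGER);
    INSERT INTO needed VALUES
        (209, 5), (213, 1), (216, 1), (220, 2), (218, 1), (219, 4), (215, 1);
""")

# numbers yields 1..1000; a zone matches n = 1..NeededItems,
# so it appears exactly NeededItems times in the result.
zones = [zone for (zone,) in conn.execute("""
    WITH RECURSIVE numbers(n) AS (
        SELECT 1
        UNION ALL
        SELECT n + 1 FROM numbers WHERE n < 1000
    )
    SELECT needed.Zone
      FROM needed
      JOIN numbers ON numbers.n <= needed.NeededItems
     ORDER BY needed.Zone
""")]

print(zones)
```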

Uniquely left join many to many table

I am using SQL Server and I have two tables which contain the data below. I need to select the matching rows without duplicates.
Table_A:
A_ID | Item_ID
--------------------
1 | 101
2 | 101
3 | 103
4 | 103
5 | 199
Table_B:
B_ID | Item_ID
--------------------
11 | 101
12 | 101
13 | 102
14 | 103
15 | 103
16 | 103
Expected Result:
A_ID | Item_ID | B_ID
----------------------
1 | 101 | 11
2 | 101 | 12
3 | 103 | 14
4 | 103 | 15
I tried:
SELECT A_ID, a.Item_ID, B_ID FROM Table_A a LEFT JOIN
Table_B b ON a.Item_ID = b.Item_ID
But it shows all the possible combinations.
How can I display the expected result above?

Based on the result set you gave you want one unique record from B for each A, ignoring records in A for which there is no corresponding record in B. The following will work:
SELECT
AValues.A_ID,
AValues.Item_ID,
BValues.B_ID
FROM
(SELECT
A_ID,
Item_ID,
ROW_NUMBER() OVER(PARTITION BY Item_ID ORDER BY A_ID) ARowID
FROM
Table_A) AValues
INNER JOIN (SELECT
B_ID,
Item_ID,
ROW_NUMBER() OVER(PARTITION BY Item_ID ORDER BY B_ID) BRowID
FROM
Table_B) BValues ON AValues.Item_ID = BValues.Item_ID AND AValues.ARowID = BValues.BRowID
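The pairing logic can be checked against the sample tables with a minimal sketch using Python's sqlite3 module as a stand-in for SQL Server (window functions need SQLite 3.25 or later). The alias rn is an assumption for illustration; everything else follows the tables in the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Table_A (A_ID INTEGER, Item_ID INTEGER);
    INSERT INTO Table_A VALUES (1, 101), (2, 101), (3, 103), (4, 103), (5, 199);
    CREATE TABLE Table_B (B_ID INTEGER, Item_ID INTEGER);
    INSERT INTO Table_B VALUES (11, 101), (12, 101), (13, 102),
                               (14, 103), (15, 103), (16, 103);
""")

# Number the rows per Item_ID on both sides, then match the n-th A row
# with the n-th B row of the same item. Rows without a partner drop out.
pairs = conn.execute("""
    SELECT a.A_ID, a.Item_ID, b.B_ID
      FROM (SELECT A_ID, Item_ID,
                   ROW_NUMBER() OVER (PARTITION BY Item_ID ORDER BY A_ID) AS rn
              FROM Table_A) AS a
      JOIN (SELECT B_ID, Item_ID,
                   ROW_NUMBER() OVER (PARTITION BY Item_ID ORDER BY B_ID) AS rn
              FROM Table_B) AS b
        ON a.Item_ID = b.Item_ID AND a.rn = b.rn
     ORDER BY a.A_ID
""").fetchall()

print(pairs)
```

A_ID 5 (item 199) and B_ID 13 and 16 have no counterpart with the same item and row number, so they are left out, matching the expected result.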

Query is very slow

I have tables
table1
epid etid id EValue reqdate
----------- ----------- ----------- ------------ ----------
15 1 1 498925307069 2012-01-01
185 1 2 A5973FC43CE3 2012-04-04
186 1 2 44C6A4B776A2 2012-04-05
205 1 2 7A0ED3F1DA13 2012-09-19
206 1 2 77771D65F9C4 2012-09-19
207 1 2 AD74A4AA41BD 2012-09-19
208 1 2 9595ABE5A0C8 2012-09-19
209 1 2 7611D2FB395B 2012-09-19
210 1 2 04A510D6067A 2012-09-19
211 1 2 24D43EC268F8 2012-09-19
table2
PEId Id EPId
----------- ----------- -----------
43 9 15
44 10 15
45 11 15
46 12 15
47 13 15
48 14 15
49 15 15
50 16 15
51 17 15
52 18 15
table3
PLId PEId Id ToPayId
----------- ----------- ----------- -----------
71 43 9 1
72 43 9 2
73 44 10 1
74 44 10 2
75 45 11 1
76 45 11 2
77 46 12 1
78 46 12 2
79 47 13 1
80 47 13 2
I want to get one id whose count in table3 is less than 8, ordered by PEId in table2.
I have written this query:
SELECT Top 1 ToPayId FROM
(
SELECT Count(pl.ToPayId) C, pl.ToPayId
FROM table3 pl
INNER JOIN table2 pe ON pl.peid = pe.peid
INNER JOIN table1 e ON pe.epid = e.epid
WHERE e.EtId=1 GROUP BY pl.ToPayId
) As T
INNER JOIN table2 p ON T.ToPayId= p.Id
WHERE C < 8 ORDER BY p.PEId ASC
This query is executed more than 1000 times in a stored procedure, inside a WHILE loop driven by the entries of a user-defined table type.
But it is very slow, as we have millions of entries in each table.
Can anyone suggest a better query?

Maybe try a HAVING clause to get rid of the derived table (the SELECT inside the FROM):
select table2.id as due
from table3 inner join table2 on table2.PEId=table3.PEId...
group by ...
having count(due) <8
order by ...
Also, the Id column in table3 looks redundant: each PEId always appears with the same Id. You could remove it and reduce the size of table3 by about 25%, which should improve database performance.

Well, you did not provide enough sample data and I am not sure what exactly your business logic is, so I can only modify the code blind.
SELECT ToPayId
FROM (
SELECT TOP 1 Count(pl.ToPayId) C, pl.ToPayId, pe.PEId
FROM table3 as pl
INNER JOIN table2 as pe ON pl.peid = pe.peid AND pl.ToPayId = pe.Id
INNER JOIN table1 e ON pe.epid = e.epid
WHERE e.EtId=1
GROUP BY pl.ToPayId, pe.PEId
HAVING Count(pl.ToPayId) < 8
ORDER BY pe.PEId ASC
) AS T