Repair historical data - sql-server

I have a historical table like this:
+------------------+------------------+---------+-------+
| valid_from | valid_to | Profit | ID |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 22.05.2019 23:42 | 10 | 12345 |
| 22.05.2019 23:42 | 28.05.2019 13:11 | 10 | 12345 |
| 28.05.2019 13:11 | 28.05.2019 23:59 | 10 | 12345 |
| 28.05.2019 23:59 | 29.05.2019 06:48 | 123 | 12345 |
| 29.05.2019 06:48 | 29.05.2019 13:21 | 123 | 12345 |
| 29.05.2019 13:21 | 29.05.2019 23:59 | 123 | 12345 |
| 29.05.2019 23:59 | 30.05.2019 06:39 | 10 | 12345 |
| 30.05.2019 06:39 | 30.05.2019 12:37 | 123 | 12345 |
| 30.05.2019 12:37 | 31.05.2019 00:09 | 123 | 12345 |
| 31.05.2019 00:09 | 31.05.2019 08:41 | 145 | 12345 |
| 31.05.2019 08:41 | 01.06.2019 00:22 | 145 | 12345 |
+------------------+------------------+---------+-------+
I deleted some columns, so rows 1, 2 and 3 are now identical apart from their validity dates and can be merged into one.
At first I tried the following GROUP BY statement:
SELECT MIN(valid_from ) AS valid_from
,MAX(valid_to ) AS valid_to
,Profit
,ID
INTO [repaired_archiv]
FROM temp.[wrong_archiv]
GROUP BY Profit
,ID
The result is:
+------------------+------------------+---------+-------+
| valid_from | valid_to | Profit | ID |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 30.05.2019 06:39 | 10 | 12345 |
| 28.05.2019 23:59 | 31.05.2019 00:09 | 123 | 12345 |
| 31.05.2019 00:09 | 01.06.2019 00:22 | 145 | 12345 |
+------------------+------------------+---------+-------+
But as you can see, the valid_to value in the first row is wrong, because the GROUP BY merges all rows with the same Profit and ID, even when they are not consecutive. I don't know how to get my expected result, which looks like this:
+------------------+------------------+---------+-------+
| valid_from | valid_to | Profit | ID |
+------------------+------------------+---------+-------+
| 20.05.2019 00:02 | 28.05.2019 23:59 | 10 | 12345 |
| 28.05.2019 23:59 | 29.05.2019 23:59 | 123 | 12345 |
| 29.05.2019 23:59 | 30.05.2019 06:39 | 10 | 12345 |
| 30.05.2019 06:39 | 31.05.2019 00:09 | 123 | 12345 |
| 31.05.2019 00:09 | 01.06.2019 00:22 | 145 | 12345 |
+------------------+------------------+---------+-------+

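This is a gaps-and-islands problem. You need two row_number() calls; their difference stays constant within each run of consecutive rows that share the same id and profit:
select min(valid_from) as valid_from, max(valid_to) as valid_to, id, profit
from (select t.*,
             -- partition by id so that rows of other IDs do not disturb the numbering
             row_number() over (partition by id order by valid_from) as seq1,
             row_number() over (partition by id, profit order by valid_from) as seq2
      from temp.[wrong_archiv] t
     ) t
group by id, profit, (seq1 - seq2)
order by valid_from;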
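To see why the difference of the two row numbers isolates each run, here is a small Python sketch (not part of the original answer) that replays the same arithmetic on the sample rows. The function name merge_islands and the ISO-formatted date strings are assumptions made for illustration; seq1 is counted per ID here, which for this single-ID sample gives the same groups as counting over the whole table.

```python
from collections import defaultdict

# (valid_from, valid_to, profit, id) from the question, dates rewritten as ISO strings
ROWS = [
    ("2019-05-20 00:02", "2019-05-22 23:42", 10, 12345),
    ("2019-05-22 23:42", "2019-05-28 13:11", 10, 12345),
    ("2019-05-28 13:11", "2019-05-28 23:59", 10, 12345),
    ("2019-05-28 23:59", "2019-05-29 06:48", 123, 12345),
    ("2019-05-29 06:48", "2019-05-29 13:21", 123, 12345),
    ("2019-05-29 13:21", "2019-05-29 23:59", 123, 12345),
    ("2019-05-29 23:59", "2019-05-30 06:39", 10, 12345),
    ("2019-05-30 06:39", "2019-05-30 12:37", 123, 12345),
    ("2019-05-30 12:37", "2019-05-31 00:09", 123, 12345),
    ("2019-05-31 00:09", "2019-05-31 08:41", 145, 12345),
    ("2019-05-31 08:41", "2019-06-01 00:22", 145, 12345),
]


def merge_islands(rows):
    """Merge consecutive rows with the same (id, profit) into one validity range."""
    seq1 = defaultdict(int)  # row number per id
    seq2 = defaultdict(int)  # row number per (id, profit)
    islands = {}
    for valid_from, valid_to, profit, id_ in sorted(rows):
        seq1[id_] += 1
        seq2[(id_, profit)] += 1
        # The difference only changes when a row with another profit intervenes.
        key = (id_, profit, seq1[id_] - seq2[(id_, profit)])
        if key in islands:
            start, end = islands[key]
            islands[key] = (min(start, valid_from), max(end, valid_to))
        else:
            islands[key] = (valid_from, valid_to)
    return sorted((start, end, key[1], key[0]) for key, (start, end) in islands.items())


for row in merge_islands(ROWS):
    print(row)
```

For the sample data the differences come out as 0 for rows 1 to 3, 3 for rows 4 to 6, 3 for row 7 (a different profit, so a different group), 4 for rows 8 and 9, and 9 for rows 10 and 11, which yields the five expected ranges.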

Related

SQL Server Dynamic Resetting Running Balance

My current issue is that I have a running balance that needs to reset whenever it rises above another value. It should not just reset to zero, though: it should take that other value as its new starting point and continue accumulating from there.
Below is the table with data in it:
+-------------+--------+---------------------+-------------------+--------------+-------------+
| Tran_DateSK | Amount | Running_AccountFees | Overlimit_Balance | Restart_Calc | Actual_Calc |
+-------------+--------+---------------------+-------------------+--------------+-------------+
| 20200217 | 39 | 39 | 3867.76 | 0 | 39 |
| 20200217 | 50 | 89 | 3867.76 | 0 | 89 |
| 20200316 | 39 | 128 | 4735.52 | 0 | 128 |
| 20200316 | 50 | 178 | 4735.52 | 0 | 178 |
| 20200324 | 50 | 228 | 2685.52 | 0 | 228 |
| 20200330 | 50 | 278 | 49.52 | 1 | 49.52 |
| 20200415 | 39 | 317 | 49.52 | 1 | 49.52 |
| 20200515 | 39 | 356 | 3917.28 | 0 | 88.52 |
| 20200515 | 50 | 406 | 3917.28 | 0 | 138.52 |
| 20200519 | 50 | 456 | 3467.28 | 0 | 188.52 |
| 20200604 | 50 | 506 | 3017.28 | 0 | 238.52 |
| 20200609 | 50 | 556 | 2167.28 | 0 | 288.52 |
| 20200611 | 50 | 606 | 49.28 | 1 | 49.28 |
| 20200615 | 39 | 645 | 3917.04 | 0 | 88.28 |
| 20200615 | 50 | 695 | 3917.04 | 0 | 138.28 |
| 20200616 | 50 | 745 | 3017.04 | 0 | 188.28 |
| 20200616 | 50 | 795 | 3017.04 | 0 | 238.28 |
| 20200619 | 50 | 845 | 2567.04 | 0 | 288.28 |
| 20200624 | 50 | 895 | 47.04 | 1 | 47.04 |
| 20200715 | 39 | 934 | 47.04 | 1 | 47.04 |
+-------------+--------+---------------------+-------------------+--------------+-------------+
Actual_Calc is the desired outcome; Running_AccountFees is where the problem lies.
Running_AccountFees is the running total of Amount, and Overlimit_Balance is the value to test against. We need to check that Running_AccountFees is not greater than Overlimit_Balance.
If it is, take the Overlimit_Balance value and continue calculating from there by adding Amount again.
My query that produced this:
SELECT
[Transaction].ReportDateSK AS 'Tran_DateSK'
,[Transaction].AmountChange/100.00 AS 'Amount'
,SUM([Transaction].AmountChange/100.00)
OVER (PARTITION BY [Transaction].AccountSK
ORDER BY [Transaction].ReportDateSK
ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING) AS 'Running_AccountFees'
,[Summary].Overlimit_Balance AS 'Overlimit_Balance'
,CASE
WHEN SUM([Transaction].AmountChange/100.00)
OVER (PARTITION BY [Transaction].AccountSK
ORDER BY [Transaction].ReportDateSK
ROWS BETWEEN UNBOUNDED PRECEDING AND 0 PRECEDING) > [Summary].Overlimit_Balance
THEN 1
ELSE 0
END AS 'Restart_Calc'
,'' AS 'Actual_Calc'
FROM
Fact.[Transaction] [Transaction]
INNER JOIN Fact.AccountSummary [Summary] ON [Summary].DateSK = [Transaction].ReportDateSK
AND [Summary].AccountSK = [Transaction].AccountSK
AND [Summary].[Current] = 1
WHERE IsFeeTransaction = 1
AND [Transaction].AccountSK = 725
AND [Transaction].ReportDateSK BETWEEN 20200217 AND 20200730
I realised that the data in your question is essentially the source data, and have come up with the query below. It isn't exactly pretty, but it provides the correct output. Explanations of how it works are in the comments:
declare @t table(Tran_DateSK int, Amount decimal(10,2), Running_AccountFees int, Overlimit_Balance decimal(10,2), Restart_Calc bit, Actual_Calc decimal(10,2));
insert into @t values
 (20200217,39,39,3867.76,0,39)
,(20200217,50,89,3867.76,0,89)
,(20200316,39,128,4735.52,0,128)
,(20200316,50,178,4735.52,0,178)
,(20200324,50,228,2685.52,0,228)
,(20200330,50,278,49.52,1,49.52)
,(20200415,39,317,49.52,1,49.52)
,(20200515,39,356,3917.28,0,88.52)
,(20200515,50,406,3917.28,0,138.52)
,(20200519,50,456,3467.28,0,188.52)
,(20200604,50,506,3017.28,0,238.52)
,(20200609,50,556,2167.28,0,288.52)
,(20200611,50,606,49.28,1,49.28)
,(20200615,39,645,3917.04,0,88.28)
,(20200615,50,695,3917.04,0,138.28)
,(20200616,50,745,3017.04,0,188.28)
,(20200616,50,795,3017.04,0,238.28)
,(20200619,50,845,2567.04,0,288.28)
,(20200624,50,895,47.04,1,47.04)
,(20200715,39,934,47.04,1,47.04);
with t as
(
    select Tran_DateSK
          ,Amount
           -- Check if the Running_AccountFees are over the Overlimit_Balance
          ,case when sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) > Overlimit_Balance
                     -- If so, check if the Running_AccountFees in the previous row were also over the Overlimit_Balance
                then case when (sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) - Amount) > lag(Overlimit_Balance,1,0) over (order by Tran_DateSK,Amount,Overlimit_Balance)
                          then 0 -- and in those instances this means multiple Restart_Calcs in a row, so set the Amount to zero as we don't want to increase the fees when calculating the Actual_Calc
                          else Amount
                     end
                else Amount
           end as Amount_Adj
          ,sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) as Running_AccountFees
          ,lag(Overlimit_Balance,1,0) over (order by Tran_DateSK,Amount,Overlimit_Balance) as Prev_Overlimit_Balance
          ,Overlimit_Balance
          ,case when sum(Amount) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding) > Overlimit_Balance
                then 1
                else 0
           end as Restart_Calc
    from @t
)
,b as
(
select *
,case when Running_AccountFees > Overlimit_Balance -- If this row is the first in a possible series of balance resets
and sum(Amount_Adj) over (order by Tran_DateSK,Amount,Overlimit_Balance rows between unbounded preceding and 1 preceding) <= Prev_Overlimit_Balance
then Overlimit_Balance -- Take the Overlimit_Balance and subtract the *Adjusted* Running_AccountFees
- sum(Amount_Adj) over (order by Tran_DateSK,Amount,Overlimit_Balance rows between unbounded preceding and 1 preceding)
- Amount_Adj
else 0
end as Reset_Bal
from t
)
select Tran_DateSK
,Amount
,Running_AccountFees
,Overlimit_Balance
,Restart_Calc
-- For each *Adjusted* Running_AccountFees, apply the most negative Reset_Bal value, as this will contain the entire amount that needs to be reset from the current *Adjusted* Running_AccountFees to get the correct Balance_Calc
,sum(Amount_Adj) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding)
+ min(Reset_Bal) over (order by Tran_DateSK,Amount,Overlimit_Balance rows unbounded preceding)
as Balance_Calc
from b
order by Tran_DateSK;
Output
+-------------+--------+---------------------+-------------------+--------------+--------------+
| Tran_DateSK | Amount | Running_AccountFees | Overlimit_Balance | Restart_Calc | Balance_Calc |
+-------------+--------+---------------------+-------------------+--------------+--------------+
| 20200217 | 39.00 | 39.00 | 3867.76 | 0 | 39.00 |
| 20200217 | 50.00 | 89.00 | 3867.76 | 0 | 89.00 |
| 20200316 | 39.00 | 128.00 | 4735.52 | 0 | 128.00 |
| 20200316 | 50.00 | 178.00 | 4735.52 | 0 | 178.00 |
| 20200324 | 50.00 | 228.00 | 2685.52 | 0 | 228.00 |
| 20200330 | 50.00 | 278.00 | 49.52 | 1 | 49.52 |
| 20200415 | 39.00 | 317.00 | 49.52 | 1 | 49.52 |
| 20200515 | 39.00 | 356.00 | 3917.28 | 0 | 88.52 |
| 20200515 | 50.00 | 406.00 | 3917.28 | 0 | 138.52 |
| 20200519 | 50.00 | 456.00 | 3467.28 | 0 | 188.52 |
| 20200604 | 50.00 | 506.00 | 3017.28 | 0 | 238.52 |
| 20200609 | 50.00 | 556.00 | 2167.28 | 0 | 288.52 |
| 20200611 | 50.00 | 606.00 | 49.28 | 1 | 49.28 |
| 20200615 | 39.00 | 645.00 | 3917.04 | 0 | 88.28 |
| 20200615 | 50.00 | 695.00 | 3917.04 | 0 | 138.28 |
| 20200616 | 50.00 | 745.00 | 3017.04 | 0 | 188.28 |
| 20200616 | 50.00 | 795.00 | 3017.04 | 0 | 238.28 |
| 20200619 | 50.00 | 845.00 | 2567.04 | 0 | 288.28 |
| 20200624 | 50.00 | 895.00 | 47.04 | 1 | 47.04 |
| 20200715 | 39.00 | 934.00 | 47.04 | 1 | 47.04 |
+-------------+--------+---------------------+-------------------+--------------+--------------+
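As a cross-check on the expected numbers, the rule from the question can also be read procedurally: add Amount to the balance, and if the result exceeds Overlimit_Balance, replace it with Overlimit_Balance. The Python sketch below is only that reading, not a translation of the query above (which tests the unreset Running_AccountFees); the function name balance_with_reset is an assumption, and both readings agree on the sample data.

```python
from decimal import Decimal

# (Tran_DateSK, Amount, Overlimit_Balance) copied from the question's table
ROWS = [
    (20200217, "39", "3867.76"), (20200217, "50", "3867.76"),
    (20200316, "39", "4735.52"), (20200316, "50", "4735.52"),
    (20200324, "50", "2685.52"), (20200330, "50", "49.52"),
    (20200415, "39", "49.52"),   (20200515, "39", "3917.28"),
    (20200515, "50", "3917.28"), (20200519, "50", "3467.28"),
    (20200604, "50", "3017.28"), (20200609, "50", "2167.28"),
    (20200611, "50", "49.28"),   (20200615, "39", "3917.04"),
    (20200615, "50", "3917.04"), (20200616, "50", "3017.04"),
    (20200616, "50", "3017.04"), (20200619, "50", "2567.04"),
    (20200624, "50", "47.04"),   (20200715, "39", "47.04"),
]


def balance_with_reset(rows):
    """Running balance that restarts from Overlimit_Balance whenever it exceeds it."""
    balance = Decimal("0")
    result = []
    for _date, amount, overlimit in rows:
        balance += Decimal(amount)
        limit = Decimal(overlimit)
        if balance > limit:
            balance = limit  # reset: continue from the limit on the next row
        result.append(balance)
    return result


for (date, amount, _), balance in zip(ROWS, balance_with_reset(ROWS)):
    print(date, amount, balance)
```

Because each row depends on the already-reset balance of the previous row, a plain windowed SUM cannot express this directly, which is why the set-based query needs the adjusted amounts and reset offsets.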

How to find max(sortnumber) on item code in SQL Server?

I have following SQL Server table ITEM:
+------------+-----------+------+--------+-----------+------------+
| Date | item_code | name | in/out | total_qty | SortNumber |
+------------+-----------+------+--------+-----------+------------+
| 08/07/2019 | 001 | A | -50 | 100 | 8 |
| 07/07/2019 | 001 | A | 50 | 100 | 7 |
| 06/07/2019 | 003 | C | 25 | 25 | 6 |
| 05/07/2019 | 001 | A | 50 | 50 | 5 |
| 04/07/2019 | 002 | B | 100 | 200 | 4 |
| 03/07/2019 | 003 | C | -25 | 0 | 3 |
| 02/07/2019 | 003 | C | 25 | 25 | 2 |
| 01/07/2019 | 002 | B | 100 | 100 | 1 |
+------------+-----------+------+--------+-----------+------------+
I've tried:
select item_code, max(SortNumber)
from ITEM
group by item_code
order by item_code asc
but I want this result:
+---------------------+-----------+------------------+
| Distinct(item_code) | Total_qty | Max(Sort_Number) |
+---------------------+-----------+------------------+
| 001 | 100 | 8 |
| 002 | 200 | 4 |
| 003 | 25 | 6 |
+---------------------+-----------+------------------+
Can anyone help me?
The below query gives you the desired result -
With cteItem as
(
select item_code, total_qty, SortNumber,
Row_Number() over (partition by item_code order by SortNumber desc) maxSortNumber
from ITEM
)
select item_code, total_qty, SortNumber from cteItem where maxSortNumber = 1
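Alternatively, you just need to add max(total_qty) to your query. Note that this returns the largest total_qty per item, which matches the sample data but is not necessarily the total_qty of the row with the highest SortNumber:
select item_code, max(total_qty), max(SortNumber)
from ITEM
group by item_code
order by item_code asc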
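The difference between the two answers is easiest to see on a small example. The Python sketch below keeps, for each item, the row with the highest SortNumber, which is what the ROW_NUMBER() query does. The function name latest_per_item is an assumption, and the extra row used at the end is hypothetical, not part of the question's data.

```python
# (item_code, total_qty, SortNumber) from the question's ITEM table
ROWS = [
    ("001", 100, 8), ("001", 100, 7), ("003", 25, 6), ("001", 50, 5),
    ("002", 200, 4), ("003", 0, 3), ("003", 25, 2), ("002", 100, 1),
]


def latest_per_item(rows):
    """Return (item_code, total_qty, SortNumber) of the highest SortNumber per item."""
    best = {}
    for item_code, total_qty, sort_number in rows:
        if item_code not in best or sort_number > best[item_code][1]:
            best[item_code] = (total_qty, sort_number)
    return sorted((code, qty, sn) for code, (qty, sn) in best.items())


print(latest_per_item(ROWS))

# Hypothetical later movement for item 003 that lowers its stock to 10:
print(latest_per_item(ROWS + [("003", 10, 9)]))
```

With the hypothetical row added, the latest total_qty for item 003 is 10, whereas max(total_qty) would still report 25.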

T-SQL: Values are grouped by month, if there is no value for a month the month should also appear and display "NULL"

I have a SQL query that displays turnover, stock and other values for stores, grouped by month. Naturally, if there is no value for a month, that month doesn't appear. The goal is for the empty months to appear as well and display "NULL" for the values. The months should range from the @FROM to the @TO parameter (201807 to 201907 in this case).
Before:
+-------+--------+----------+----------+-------+
| Store | Month | Incoming | Turnover | Stock |
+-------+--------+----------+----------+-------+
| 123 | 201810 | 5 | 4 | 1 |
| 123 | 201811 | 0 | 1 | 0 |
| 123 | 201901 | 25 | 5 | 20 |
| 123 | 201902 | 5 | 10 | 15 |
| 123 | 201903 | 8 | 9 | 14 |
| 123 | 201904 | 5 | 4 | 15 |
| 123 | 201905 | 10 | 5 | 20 |
+-------+--------+----------+----------+-------+
After:
+-------+--------+----------+----------+-------+
| Store | Month | Incoming | Turnover | Stock |
+-------+--------+----------+----------+-------+
| 123 | 201807 | NULL | NULL | NULL |
| 123 | 201808 | NULL | NULL | NULL |
| 123 | 201809 | NULL | NULL | NULL |
| 123 | 201810 | 5 | 4 | 1 |
| 123 | 201811 | 0 | 1 | 0 |
| 123 | 201812 | NULL | NULL | NULL |
| 123 | 201901 | 25 | 5 | 20 |
| 123 | 201902 | 5 | 10 | 15 |
| 123 | 201903 | 8 | 9 | 14 |
| 123 | 201904 | 5 | 4 | 15 |
| 123 | 201905 | 10 | 5 | 20 |
| 123 | 201906 | NULL | NULL | NULL |
| 123 | 201907 | NULL | NULL | NULL |
+-------+--------+----------+----------+-------+
Code Example: db<>fiddle
I have absolutely no idea how to solve this and will thank you in advance for your help! :)
You can use a recursive CTE to build a calendar table and then outer join your data to it:
;WITH CTE AS (
SELECT CAST(CAST(@FROM AS VARCHAR(10)) + '01' AS DATE) fromDt,
CAST(CAST(@TO AS VARCHAR(10)) + '01' AS DATE) toDt,
Store
FROM (SELECT DISTINCT Store FROM #Test) t1
UNION ALL
SELECT DATEADD(MONTH,1,fromDt),toDt,Store
FROM CTE
WHERE DATEADD(MONTH,1,fromDt) <= toDt
)
SELECT FORMAT(fromDt,'yyyyMM') Month,
c.Store,
t.Incoming,
t.Turnover,
t.Stock
FROM CTE c
LEFT JOIN #Test t on
c.fromDt = CAST(CAST(t.Month AS VARCHAR(10)) + '01' AS DATE)
and
c.Store = t.Store
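The same idea (generate every month in the range for every store, then look up the existing values and fall back to NULL) can be sketched outside SQL. The Python below is an illustration only; the names month_range and fill_months are assumptions, and None stands in for NULL.

```python
# (Store, Month, Incoming, Turnover, Stock) from the "Before" table
ROWS = [
    (123, 201810, 5, 4, 1), (123, 201811, 0, 1, 0), (123, 201901, 25, 5, 20),
    (123, 201902, 5, 10, 15), (123, 201903, 8, 9, 14), (123, 201904, 5, 4, 15),
    (123, 201905, 10, 5, 20),
]


def month_range(start, end):
    """Yield yyyymm integers from start to end, both inclusive."""
    year, month = divmod(start, 100)
    while year * 100 + month <= end:
        yield year * 100 + month
        month += 1
        if month > 12:
            year, month = year + 1, 1


def fill_months(rows, start, end):
    """Return one row per store and month, with None where no data exists."""
    by_key = {(store, month): values for store, month, *values in rows}
    stores = sorted({store for store, *_ in rows})
    return [
        (store, month, *by_key.get((store, month), (None, None, None)))
        for store in stores
        for month in month_range(start, end)
    ]


for row in fill_months(ROWS, 201807, 201907):
    print(row)
```

The calendar plays the role of the left side of the LEFT JOIN: every (store, month) pair is emitted once, whether or not a matching data row exists.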

Sum in subquery for a group of numbers

We are trying to get a combined table in which we also sum the volume.
Dataset right now:
+-------------+-----+------------+------------+--------+---------+
| Voorziening | BSN | Begindatum | Einddatum | Volume | Product |
+-------------+-----+------------+------------+--------+---------+
| 1000 | 1 | 1-1-2017 | 31-1-2017 | 50 | AAAA |
+-------------+-----+------------+------------+--------+---------+
| 1200 | 1 | 1-2-2017 | 31-3-2017 | 200 | AAAA |
+-------------+-----+------------+------------+--------+---------+
| 1250 | 1 | 1-4-2017 | 10-4-2017 | 90 | AAAA |
+-------------+-----+------------+------------+--------+---------+
| 1111 | 2 | 4-1-2017 | 10-1-2017 | 4 | AABB |
+-------------+-----+------------+------------+--------+---------+
| 1345 | 2 | 11-1-2017 | 29-1-2017 | 80 | AABB |
+-------------+-----+------------+------------+--------+---------+
| 2000 | 1 | 10-1-2017 | 31-1-2017 | 90 | CCCC |
+-------------+-----+------------+------------+--------+---------+
| 2190 | 1 | 1-2-2017 | 31-12-2017 | 100 | CCCC |
+-------------+-----+------------+------------+--------+---------+
What I want to achieve
+-------------+-----+------------+------------+--------+---------+
| Voorziening | BSN | Begindatum | Einddatum | Volume | Product |
+-------------+-----+------------+------------+--------+---------+
| 1000 | 1 | 1-1-2017 | 10-4-2017 | 340 | AAAA |
+-------------+-----+------------+------------+--------+---------+
| 2000 | 1 | 10-1-2017 | 31-12-2017 | 190 | CCCC |
+-------------+-----+------------+------------+--------+---------+
| 1111 | 2 | 4-1-2017 | 29-1-2017 | 84 | AABB |
+-------------+-----+------------+------------+--------+---------+
What I've got so far is the following query:
SELECT min(b.Voorziening) as voorzieningsnummer
,a.BSN
,min(b.Begindatum) as mindatum
,MAX(b.Einddatum) AS maxdatum
,a.Productcode
,
(SELECT sum(Volume)
FROM Voorziening
)as totaal
FROM Voorziening a
INNER JOIN Voorziening b
ON a.BSN = b.BSN
AND a.Productcode = b.Productcode
GROUP BY a.BSN, a.Productcode
The result it gives me is this:
+-------------+-----+------------+------------+--------+
| Voorziening | BSN | Begindatum | Einddatum | Volume |
+-------------+-----+------------+------------+--------+
| 1000 | 1 | 1-1-2017 | 10-4-2017 | 424 |
+-------------+-----+------------+------------+--------+
| 1111 | 2 | 4-1-2017 | 29-1-2017 | 424 |
+-------------+-----+------------+------------+--------+
Can you help me get the sum right?
There isn't any reason to use a JOIN; you can use the aggregate functions directly.
You can try this:
SELECT min(a.Voorziening) as voorzieningsnummer
,a.BSN
,min(a.Begindatum) as mindatum
,MAX(a.Einddatum) AS maxdatum
,a.Productcode
,SUM(a.Volume) Volume
FROM Voorziening a
GROUP BY a.BSN, a.Productcode
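If you are using SQL Server 2012 or later, you can also use a windowed SUM with PARTITION BY and ORDER BY (SQL Server 2008 supports only PARTITION BY, without ORDER BY, for aggregate window functions). Here another, another stand for any further sort columns:
SUM(Volume) OVER (PARTITION BY Product ORDER BY Voorziening, another, another)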

T-SQL to create an ID column

I'm using SQL Server 2008 R2 and I have the following dataset:
+---------+--------------+--------------+----------+------------+------------+
| Dossier | refmouvement | refadmission | refunite | datedeb | datefin |
+---------+--------------+--------------+----------+------------+------------+
| P001234 | 2567 | 1234 | 227 | 2012-01-01 | 2012-01-02 |
| P001234 | 2568 | 1234 | 227 | 2012-01-02 | 2012-01-03 |
| P001234 | 2569 | 1234 | 224 | 2012-01-03 | 2012-01-06 |
| P001234 | 2570 | 1234 | 232 | 2012-01-06 | 2012-01-10 |
| P001234 | 2571 | 1234 | 232 | 2012-01-10 | 2012-01-15 |
| P001234 | 2572 | 1234 | 232 | 2012-01-15 | 2012-01-20 |
| P001234 | 2573 | 1234 | 232 | 2012-01-20 | 2012-01-25 |
| P001234 | 2574 | 1234 | 224 | 2012-01-25 | 2012-01-29 |
| P001234 | 2575 | 1234 | 227 | 2012-01-29 | 2012-02-05 |
| P001234 | 2576 | 1234 | 227 | 2012-02-05 | 2012-02-10 |
| P001234 | 2577 | 1234 | 232 | 2012-02-10 | 2012-02-15 |
| P001234 | 2578 | 1234 | 201 | 2012-02-15 | 2012-02-26 |
+---------+--------------+--------------+----------+------------+------------+
This dataset is ordered by datedeb (the start date).
As you can see, the dataset is contiguous: each row's datefin equals the next row's datedeb.
I need to create an ID column that assigns a unique ID to each run of consecutive rows with the same refunite, ordered by datedeb, like this:
+----+---------+--------------+--------------+----------+------------+------------+
| ID | Dossier | refmouvement | refadmission | refunite | datedeb | datefin |
+----+---------+--------------+--------------+----------+------------+------------+
| 1 | P001234 | 2567 | 1234 | 227 | 2012-01-01 | 2012-01-02 |
| 1 | P001234 | 2568 | 1234 | 227 | 2012-01-02 | 2012-01-03 |
| 2 | P001234 | 2569 | 1234 | 224 | 2012-01-03 | 2012-01-06 |
| 3 | P001234 | 2570 | 1234 | 232 | 2012-01-06 | 2012-01-10 |
| 3 | P001234 | 2571 | 1234 | 232 | 2012-01-10 | 2012-01-15 |
| 3 | P001234 | 2572 | 1234 | 232 | 2012-01-15 | 2012-01-20 |
| 3 | P001234 | 2573 | 1234 | 232 | 2012-01-20 | 2012-01-25 |
| 4 | P001234 | 2574 | 1234 | 224 | 2012-01-25 | 2012-01-29 |
| 5 | P001234 | 2575 | 1234 | 227 | 2012-01-29 | 2012-02-05 |
| 5 | P001234 | 2576 | 1234 | 227 | 2012-02-05 | 2012-02-10 |
| 6 | P001234 | 2577 | 1234 | 232 | 2012-02-10 | 2012-02-15 |
| 7 | P001234 | 2578 | 1234 | 201 | 2012-02-15 | 2012-02-26 |
+----+---------+--------------+--------------+----------+------------+------------+
I just can't work out which RANK(), ROW_NUMBER() or DENSE_RANK() function, or combination of them, could achieve this. I have looked everywhere but cannot find anything; maybe I'm not using the proper keywords.
Any help will be appreciated.
Thanks.
Here's the code that I've tried so far:
SELECT
ROW_NUMBER() over(order by t1.[datedeb]) as [ID1],
dense_Rank() over(partition by t1.[refunite] order by t1.[datedeb]) as [ID2],
t1.[Dossier]
,t1.[refmouvement]
,t1.[refadmission]
,t1.[refunite]
,t1.[datedeb]
,t1.[datefin]
,t2.[refmouvement] as [prev_refmouvement]
,t2.refunite as prev_refunite
FROM [sometable] t1
LEFT OUTER JOIN [sometable] t2 /*self join*/
ON t2.datefin = t1.datedeb
AND t1.[refadmission] = t2.[refadmission]
ORDER BY
t1.[datedeb]
This is what it gives me :
+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+
| ID1 | ID2 | Dossier | refmouvement | refadmission | refunite | datedeb | datefin | prev_refmouvement | prev_refunite |
+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+
| 1 | 1 | P001234 | 2567 | 1234 | 227 | 2012-01-01 | 2012-01-02 | NULL | NULL |
| 2 | 2 | P001234 | 2568 | 1234 | 227 | 2012-01-02 | 2012-01-03 | 2567 | 227 |
| 3 | 1 | P001234 | 2569 | 1234 | 224 | 2012-01-03 | 2012-01-06 | 2568 | 227 |
| 4 | 1 | P001234 | 2570 | 1234 | 232 | 2012-01-06 | 2012-01-10 | 2569 | 224 |
| 5 | 2 | P001234 | 2571 | 1234 | 232 | 2012-01-10 | 2012-01-15 | 2570 | 232 |
| 6 | 3 | P001234 | 2572 | 1234 | 232 | 2012-01-15 | 2012-01-20 | 2571 | 232 |
| 7 | 4 | P001234 | 2573 | 1234 | 232 | 2012-01-20 | 2012-01-25 | 2572 | 232 |
| 8 | 2 | P001234 | 2574 | 1234 | 224 | 2012-01-25 | 2012-01-29 | 2573 | 232 |
| 9 | 3 | P001234 | 2575 | 1234 | 227 | 2012-01-29 | 2012-02-05 | 2574 | 224 |
| 10 | 4 | P001234 | 2576 | 1234 | 227 | 2012-02-05 | 2012-02-10 | 2575 | 227 |
| 11 | 5 | P001234 | 2577 | 1234 | 232 | 2012-02-10 | 2012-02-15 | 2576 | 227 |
| 12 | 1 | P001234 | 2578 | 1234 | 201 | 2012-02-15 | 2012-02-26 | 2577 | 232 |
+-----+-----+---------+--------------+--------------+----------+------------+------------+-------------------+---------------+
DECLARE @Results TABLE(
    RowNum INT PRIMARY KEY,
    refunite INT NOT NULL,
    datedeb DATETIME NOT NULL
);
INSERT @Results (RowNum, refunite, datedeb)
SELECT ROW_NUMBER() OVER(ORDER BY datedeb) AS RowNum,
       refunite,
       datedeb
FROM dbo.MyTable;
WITH CTERecursive
AS (
    SELECT crt.RowNum,
           crt.refunite,
           crt.datedeb,
           1 AS Rnk -- Starting rank
    FROM @Results crt
    WHERE crt.RowNum = 1
    UNION ALL
    SELECT crt.RowNum,
           crt.refunite,
           crt.datedeb,
           CASE WHEN prev.refunite = crt.refunite THEN prev.Rnk ELSE prev.Rnk + 1 END
    FROM @Results crt INNER JOIN CTERecursive prev ON crt.RowNum = prev.RowNum + 1
)
SELECT *
FROM CTERecursive
-- OPTION(MAXRECURSION 1000); -- Uncomment this line if you change the number of recursion levels allowed (default 100)
Results:
RowNum refunite datedeb Rnk
----------- ----------- ----------------------- ---
1 227 2012-01-01 00:00:00.000 1
2 227 2012-01-02 00:00:00.000 1
3 224 2012-01-03 00:00:00.000 2
4 232 2012-01-06 00:00:00.000 3
5 232 2012-01-10 00:00:00.000 3
6 232 2012-01-15 00:00:00.000 3
7 232 2012-01-20 00:00:00.000 3
8 224 2012-01-25 00:00:00.000 4
9 227 2012-01-29 00:00:00.000 5
10 227 2012-02-05 00:00:00.000 5
11 232 2012-02-10 00:00:00.000 6
12 201 2012-02-15 00:00:00.000 7
You could, of course, have multiple tables in the WITH, eliminating the table variable.
Based on Bogdan Sahlean's answer, you could rewrite it like this:
WITH CTEHelper AS
(SELECT ROW_NUMBER() OVER(ORDER BY datedeb) AS RowNum,
refunite,
datedeb
FROM dbo.Sometable),
CTERecursive AS (
SELECT crt.RowNum,
crt.refunite,
crt.datedeb,
1 AS Id -- Starting rank
FROM CTEHelper crt
WHERE crt.RowNum = 1
UNION ALL
SELECT crt.RowNum,
crt.refunite,
crt.datedeb,
CASE WHEN prev.refunite = crt.refunite THEN prev.Id ELSE prev.Id + 1 END
FROM CTEHelper crt INNER JOIN CTERecursive prev ON crt.RowNum = prev.RowNum + 1
)
SELECT crt.id,
s.*
FROM CTERecursive crt
JOIN Sometable s ON s.refunite = crt.refunite AND s.datedeb = crt.datedeb
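Both recursive CTEs above walk the rows in datedeb order, carry the previous row's ID forward, and add 1 whenever refunite differs from the previous row. A minimal Python sketch of that walk, using only the refunite and datedeb columns from the question (the function name assign_ids is an assumption):

```python
# (refunite, datedeb) from the question's dataset
ROWS = [
    (227, "2012-01-01"), (227, "2012-01-02"), (224, "2012-01-03"),
    (232, "2012-01-06"), (232, "2012-01-10"), (232, "2012-01-15"),
    (232, "2012-01-20"), (224, "2012-01-25"), (227, "2012-01-29"),
    (227, "2012-02-05"), (232, "2012-02-10"), (201, "2012-02-15"),
]


def assign_ids(rows):
    """Give consecutive rows with the same refunite the same ID, in datedeb order."""
    result = []
    current_id = 0
    previous_unit = None
    for refunite, datedeb in sorted(rows, key=lambda r: r[1]):
        if refunite != previous_unit:
            current_id += 1  # a new run starts whenever the unit changes
        result.append((current_id, refunite, datedeb))
        previous_unit = refunite
    return result


for row in assign_ids(ROWS):
    print(row)
```

The recursion in the SQL versions performs one such step per level, which is why larger tables need the MAXRECURSION option mentioned in the first answer.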
with sometable as (
select *
from (
values ('P001234', 2567, 1234, 227, cast('2012-01-01' as date), cast('2012-01-02' as date)),
('P001234', 2568, 1234, 227, cast('2012-01-02' as date), cast('2012-01-03' as date)),
('P001234', 2569, 1234, 224, cast('2012-01-03' as date), cast('2012-01-06' as date)),
('P001234', 2570, 1234, 232, cast('2012-01-06' as date), cast('2012-01-10' as date)),
('P001234', 2571, 1234, 232, cast('2012-01-10' as date), cast('2012-01-15' as date)),
('P001234', 2572, 1234, 232, cast('2012-01-15' as date), cast('2012-01-20' as date)),
('P001234', 2573, 1234, 232, cast('2012-01-20' as date), cast('2012-01-25' as date)),
('P001234', 2574, 1234, 224, cast('2012-01-25' as date), cast('2012-01-29' as date)),
('P001234', 2575, 1234, 227, cast('2012-01-29' as date), cast('2012-02-05' as date)),
('P001234', 2576, 1234, 227, cast('2012-02-05' as date), cast('2012-02-10' as date)),
('P001234', 2577, 1234, 232, cast('2012-02-10' as date), cast('2012-02-15' as date)),
('P001234', 2578, 1234, 201, cast('2012-02-15' as date), cast('2012-02-26' as date))
) t (Dossier, refmouvement, refadmission, refunite, datedeb, datefin)
), pos as (
select d.*, (case when d2.refunite is null then null
when d2.refunite != d.refunite then d2.datedeb
else d.datedeb end) as forward,
(case when d3.refunite is null then null
when d3.refunite != d.refunite then d3.datedeb
else d.datedeb end) as backward
from sometable d
left outer join sometable d2 on d.refadmission = d2.refadmission and d.datefin = d2.datedeb
left outer join sometable d3 on d.refadmission = d3.refadmission and d.datedeb = d3.datefin
)
select dense_rank() over (order by isnull((select min(datedeb)
from pos
where refadmission = t.refadmission
and refunite = t.refunite
and datedeb > t.datedeb
and datedeb = backward
and ((t.datedeb = t.backward and t.datedeb = t.forward)
or t.datedeb != t.backward or t.backward is null)
and datedeb != forward), datedeb)) as ID,
Dossier, refmouvement, refadmission, refunite, datedeb, datefin
from pos t
order by datedeb
