Creating rolling window for time series data in SQL

Creating rolling window for time series data in SQL - sql-server

I have a question regarding adding rolling window column in SQL. Table A is a sample of 24 months time series data. I need to add column for difference between each month balances with pervious month and a month before pervious month. For example for Mar 2020 I need to have difference between Mar and Feb and also Mar and Jan for Deposit and Withdraw separately for each ID (Table B). I try to use 'window' function in sql but I do not know how.
**Table A**
ID | Date |A | B |
+--------+-----------+-------+---------
| 1 | Jan 20 | $200 | $100 |
| 1 | Feb 20 | $500 | $250 |
| 1 | Mar 20 | $1000 | $550 |
+--------+-----------+-------+---------+
I want results like this:
**Table B**
ID | Date |A | B | A(Mar-Feb)| A(Mar-Jan)| B(Mar-Feb)| B(Mar-Jan)|
+--------+-----------+-------+------------------------------------------------------
| 1 | Jan 20 | $200 | $100 | | | | |
| 1 | Feb 20 | $500 | $250 | | | | |
| 1 | Mar 20 | $1000 | $550 | $500 |$800 |$300 |$450 |
+--------+-----------+-------+---------+------------+-----------+----------+-----------+
I'd really appreciated if someone can help me.

Edited: See edit at bottom for corrected answer based on more information from OP
I "think" this is what you're asking for and it may not perfectly be what you want, because it fills in the other rows as well...
IF OBJECT_ID('tempdb..#TableA','U') IS NOT NULL DROP TABLE #TableA; --SELECT * FROM #TableA
CREATE TABLE #TableA (
ID int NOT NULL,
[Date] date NOT NULL,
A int NOT NULL,
B int NOT NULL,
)
INSERT INTO #TableA (ID, Date, A, B)
VALUES (1, '2020-01-01', 200, 100)
, (1, '2020-02-01', 500, 250)
, (1, '2020-03-01', 1000, 550)
SELECT ta.ID
, [Date] = FORMAT(ta.[Date],'MMM yy')
, ta.A, ta.B
, A_DiffPrev = ta.A - LAG(ta.A) OVER (ORDER BY ta.[Date])
, A_DiffFirst = ta.A - FIRST_VALUE(ta.A) OVER (ORDER BY ta.[Date])
, B_DiffPrev = ta.B - LAG(ta.B) OVER (ORDER BY ta.[Date])
, B_DiffFirst = ta.B - FIRST_VALUE(ta.B) OVER (ORDER BY ta.[Date])
FROM #TableA ta
Returns:
| ID | Date | A | B | A_DiffPrev | A_DiffFirst | B_DiffPrev | B_DiffFirst |
|----|--------|------|-----|------------|-------------|------------|-------------|
| 1 | Jan 20 | 200 | 100 | NULL | 0 | NULL | 0 |
| 1 | Feb 20 | 500 | 250 | 300 | 300 | 150 | 150 |
| 1 | Mar 20 | 1000 | 550 | 500 | 800 | 300 | 450 |
Explanation
LAG(ta.A) OVER (ORDER BY ta.[Date]) - This will give you the previous value as sorted by the provided ORDER BY. So in this case, it's saying, give me the value that occurs prior to the current row, if you sort by [Date] Ascending
FIRST_VALUE(ta.A) OVER (ORDER BY ta.[Date]) - Similar idea to LAG() except it's saying to get the very first item, rather than the previous item.
Edit
In the comments you mentioned that FIRST_VALUE() will not work for you because you don't want to compare with the first month, you want to compare with the previous month and two months back.
In that case, you can use this solution:
SELECT ta.ID
, [Date] = FORMAT(ta.[Date],'MMM yy')
, ta.A, ta.B
, A_DiffPrev1 = ta.A - LAG(ta.A,1) OVER (ORDER BY ta.[Date])
, A_DiffPrev2 = ta.A - LAG(ta.A,2) OVER (ORDER BY ta.[Date])
, B_DiffPrev1 = ta.B - LAG(ta.B,1) OVER (ORDER BY ta.[Date])
, B_DiffPrev2 = ta.B - LAG(ta.B,2) OVER (ORDER BY ta.[Date])
FROM #TableA ta
Explanation:
In this change, I'm using LAG() for everything. But instead, I'm telling LAG() how many rows I want it to look back.
So to get the previous month, I say LAG(A, 1) which means to grab the previous row, which is the default, I'm only providing it here to make it more explicitly clear what is happening.
Then I say LAG(A, 2) which means to go back two rows and grab that value.
NOTE: This is all assuming you do not have gaps in your data.

Related

SQL Server Stock age calculation - Alternative to nested cursors

I'm trying to do some kind of stock age calculation based on two tables.
I have a current stock for each reference and I'd like to match it with the most recent warehouse entrances until I have no units left.
My Stock_Table looks like this
| Product_Ref | Stock |
| ----------- | ------- |
| Prod_A | 100 |
| Prod_B | 50 |
My Entrances_Table (ordered by most recent date) looks like this
| Product_Ref | Month | Units |
| ----------- | ------- | ------- |
| Prod_A | July | 50 |
| Prod_A | June | 30 |
| Prod_A | May | 35 |
| Prod_B | May | 10 |
| Prod_B | April | 55 |
What I need (as a previous step to do some other calcuations that I have already figured out) is to build this results table:
| Product_Ref | Month | Units |
| ----------- | ------- | ------- |
| Prod_A | July | 50 |
| Prod_A | June | 30 |
| Prod_A | May | 20 | <-- previous 50+30 so only 20 "left" to achive 100 units
| Prod_B | May | 10 |
| Prod_B | April | 40 | <-- previous 10 so only 40 "left" to achive 100 units
I know I could iterate throught both tables with a nested cursor, but I would like to know if there's a more elegant solution (maybe via running sums, using lead() , CTE, or something that I'm missing ... )
Any ideas are appreciated!
Thank you very much in advance!

You can get the desired output by using Sum() over(order by). This function returns running total of the column in SQL. So the query will be:
SELECT A.Product_Ref ,
A.Month ,
CASE WHEN A.RT <= A.Stock THEN A.Units ELSE A.Stock - (LAG(A.RT) OVER (ORDER BY A.id)) END AS Units
FROM ( SELECT Product_Ref, Units, Month, Stock, Id ,
SUM(E.Units) OVER ( PARTITION BY E.Product_Ref ORDER BY E.Id ) AS RT
FROM Entrances E
INNER JOIN Stock S ON S.Prod = E.Product_Ref
) A;
To view the output of the query click on DEMO

Try with this query,
You can use Lead() in sql-server for check next row value of particular column.
select et.Product_Ref, et.Month, et.Unit,
case when SUM(et.Unit) OVER (PARTITION BY Product_Ref ORDER BY id) > st.Unit1
OR
(lead(et.id)over( order by et.id)) is null
then
case when lead(et.Product_Ref) over(order by id) <> et.Product_Ref
AND
( et.unit - ((SUM(et.Unit) OVER (PARTITION BY Product_Ref ORDER BY id) ) - st.Unit1) ) <= 0
then -- (-)ve Stock
st.Unit1 - ((SUM(et.Unit) OVER (PARTITION BY Product_Ref ORDER BY id) ))
else -- Stock Adjustment As per Stock_Table
et.unit - ((SUM(et.Unit) OVER (PARTITION BY Product_Ref ORDER BY id) ) - st.Unit1)
end
else et.unit -- Normal Stock
end as outputvalue
from Entrances_Table et
left join Stock_Table st on st.Product_Ref1 = et.Product_Ref
DB Fiddle

Reconstructing Balances By Weekly Transaction Sums

I am looking for some advice or pointers on how to construct this. I have spent the last year self-learning SQL. I am at work and I only have access to the query interface in report builder. Which for me means, no procedures, no create tables and no IDE :(. So thats the limitations!
I am trying to reconstruct account balances. I have no intervening balances. I have the current balance and a table full of the transaction history
My current approach is to sum the transactions by posting week (Which I have done) in my CTE named
[SUMTRANSREF]
+--------------+------------+-----------+
| TNCY-SYS-REF | POSTING-WK | SUM-TRANS |
+--------------+------------+-----------+
| 1 | 47 | 37.95 |
| 1 | 46 | 37.95 |
| 1 | 45 | 37.95 |
| 2 | 47 | 50.00 |
| 2 | 46 | 25.00 |
| 2 | 45 | 25.00 |
+--------------+------------+-----------+
I then get the current balances in another CTE called
[CBAL]
+--------------+-------------+-----------+
| TNCY-SYS-REF | CUR-BALANCE | CURR-WEEK |
+--------------+-------------+-----------+
| 1 | 27.52 | 47 |
| 1 | 52.00 | 47 |
+--------------+-------------+-----------+
Now I am assuming I could create intervening CTEs to sum and then splice those altogether but is there a smarter (more automated) way?
Ideally my result should be
+--------------+-------------+----------+----------+
| TNCY-SYS-REF | CUR-BALANCE | BAL-WK46 | BAL-Wk45 |
+--------------+-------------+----------+----------+
| 1 | 27.52 | -10.43 | -48.38 |
| 2 | 52.00 | 2.00 | -48.00 |
+--------------+-------------+----------+----------+
I just am uncertain because each column requires the sum of intervening transactions
So BAL-WK46 is (CURR-BALANCE) - SUM(Transactions from 47)
So BAL-WK46 is (CURR-BALANCE) - SUM(Transactions 46+47)
So BAL-WK45 is (CURR-BALANCE) - SUM(Transactions 45+46+47)
and so on.
Normally I have an idea where to start but I am flummoxed by this one.
Any help you can give would be appreciated. Thank you

Here is some T-SQL that gets the result you require. Should be easy enough to play with to get what you want.
It makes use of Recursive CTE and a PIVOT
IF OBJECT_ID('Tempdb..#SUMTRANSREF') IS NOT NULL
DROP TABLE #SUMTRANSREF
IF OBJECT_ID('Tempdb..#CBAL') IS NOT NULL
DROP TABLE #CBAL
IF OBJECT_ID('Tempdb..#TEMP') IS NOT NULL
DROP TABLE #TEMP
CREATE TABLE #SUMTRANSREF
(
[TNCY-SYS-REF] int,
[POSTING-WK] int,
[SUM-TRANS] float
)
CREATE TABLE #CBAL
(
[TNCY-SYS-REF] int ,
[CUR-BALANCE] float , [CURR-WEEK] int
)
INSERT INTO #SUMTRANSREF
VALUES (1 ,47 , 37.95),
(1 ,46 , 37.95),
(1 ,45 , 37.95),
(2 ,47 , 50.00),
(2 ,46 , 25.00),
(2 ,45 , 25.00 )
INSERT INTO #CBAL
VALUES (1,27.52,47),(2,52.00,47);
WITH CBAL AS
(SELECT * FROM #CBAL),
SUMTRANSREF AS(SELECT * FROM #SUMTRANSREF),
RecursiveTotals([TNCY-SYS-REF],[CURR-WEEK],[CUR-BALANCE],RunningBalance)
AS
(
select C.[TNCY-SYS-REF], C.[CURR-WEEK],C.[CUR-BALANCE],C.[CUR-BALANCE] + S.RunningTotal RunningBalance from CBAL C
JOIN (select *,-SUM([SUM-TRANS]) OVER (PARTITION BY [TNCY-SYS-REF] ORDER BY [POSTING-WK] DESC) RunningTotal
from SUMTRANSREF) S
ON C.[CURR-WEEK]=S.[POSTING-WK] AND C.[TNCY-SYS-REF]=S.[TNCY-SYS-REF]
UNION ALL
select RT.[TNCY-SYS-REF], RT.[CURR-WEEK] -1 [CURR_WEEK],RT.[CUR-BALANCE],RT.[CUR-BALANCE] + S.RunningTotal RunningBalance FROM RecursiveTotals RT
JOIN (select *,-SUM([SUM-TRANS]) OVER (PARTITION BY [TNCY-SYS-REF] ORDER BY [POSTING-WK] DESC) RunningTotal
from #SUMTRANSREF) S ON RT.[TNCY-SYS-REF] = S.[TNCY-SYS-REF] AND RT.[CURR-WEEK]-1 = S.[POSTING-WK]
)
select [TNCY-SYS-REF],[CUR-BALANCE],[46] as 'BAL-WK46',[45] as 'BAL-WK45',[44] as 'BAL-WK44'
FROM (
select [TNCY-SYS-REF],[CUR-BALANCE],RunningBalance,BalanceWeek from (SELECT *,R.[CURR-WEEK]-1 'BalanceWeek' FROm RecursiveTotals R
) RT) AS SOURCETABLE
PIVOT
(
AVG(RunningBalance)
FOR BalanceWeek in ([46],[45],[44])
) as PVT

Date difference for same ID

I ve got a data set similar to
+----+------------+------------+------------+
| ID | Udate | last_code | Ddate |
+----+------------+------------+------------+
| 1 | 05/11/2018 | ACCEPTED | 13/10/2018 |
| 1 | 03/11/2018 | ATTEMPT | 13/10/2018 |
| 1 | 01/11/2018 | INFO | 13/10/2018 |
| 1 | 22/10/2018 | ARRIVED | 13/10/2018 |
| 1 | 15/10/2018 | SENT | 13/10/2018 |
+----+------------+------------+------------+
I m trying to get the date difference for each code on Udate, but for the first date I want to make datedifference between Udate and Ddate.
So I ve been trying:
DATEDIFF(DAY,LAG(Udate) OVER (PARTITION BY Shipment_Number ORDER BY Udate), Udate)
to get the difference between dates and it works so far, but I also need the first date difference between Udate and Ddate.
I was thinking about ISNULL()
Also, at the end I need an average of days between codes as well, usually they keep the same pattern. Sample output data:
+----+------------+------------+------------+------------+
| ID | Udate | last_code | Ddate | Difference |
+----+------------+------------+------------+------------+
| 1 | 05/11/2018 | ACCEPTED | 13/10/2018 | 2 |
| 1 | 03/11/2018 | ATTEMPT | 13/10/2018 | 2 |
| 1 | 01/11/2018 | INFO | 13/10/2018 | 10 |
| 1 | 22/10/2018 | ARRIVED | 13/10/2018 | 7 |
| 1 | 15/10/2018 | SENT | 13/10/2018 | 2 |
+----+------------+------------+------------+------------+
Notice that when there is no previous code, the date diff is between Udate and Ddate.
Would appreciate any idea.
Thank you.

Well, ISNULL is the way to go here.
Since you also want the average difference, you can use a common table expression to get the difference, and query it to get the average:
First, Create and populate sample data (Please save us this step in your future questions)
-- This would not be needed if you've used ISO8601 for date strings (yyyy-mm-dd | yyyymmdd)
SET DATEFORMAT DMY;
DECLARE #T AS TABLE
(
ID int,
UDate date,
last_code varchar(10),
Ddate date
) ;
INSERT INTO #T (ID, Udate, last_code, Ddate) VALUES
(1, '05/11/2018', 'ACCEPTED', '13/10/2018'),
(1, '03/11/2018', 'ATTEMPT' , '13/10/2018'),
(1, '01/11/2018', 'INFO' , '13/10/2018'),
(1, '22/10/2018', 'ARRIVED' , '13/10/2018'),
(1, '15/10/2018', 'SENT' , '13/10/2018');
The cte:
WITH CTE AS
(
SELECT ID,
Udate,
last_code,
Ddate,
DATEDIFF(
DAY,
ISNULL(
LAG(Udate) OVER(PARTITION BY ID ORDER BY Udate),
Ddate
),
UDate
) As Difference
FROM #T
)
The query:
SELECT *, AVG(Difference) OVER(PARTITION BY ID) As AverageDifference
FROM CTE;
Results:
ID Udate last_code Ddate Difference AverageDifference
1 15.10.2018 SENT 13.10.2018 2 4
1 22.10.2018 ARRIVED 13.10.2018 7 4
1 01.11.2018 INFO 13.10.2018 10 4
1 03.11.2018 ATTEMPT 13.10.2018 2 4
1 05.11.2018 ACCEPTED 13.10.2018 2 4

Altering vs adding column - setting all dates in month to one end date

I'm trying to add rank by sales and also change the date column to a 'month end' field that would have one month end date per month - if that makes sense?
Would you alter table and add column or could you just rename the date field and use set and case to make all March dates = 3-31-18 and all April 4-30-18?
I got this far:
UPDATE table1
SET DATE=EOMONTH(DATE) AS MONTH_END;
ALTER TABLE table1
ADD COLUMN RANK INT AFTER sales;
UPDATE table1
SET RANK=
RANK() OVER(PARTITION BY cust ORDER BY sales DESC);
LIMIT 2
can i do two sets in a row like that without adding an update? I'm looking for top 2 within each month - would this work? I feel like this is right and most efficient query, but its not working - any help appreciated!!
orig table
+------+----------+-------+--+
| CUST | DATE | SALES | |
+------+----------+-------+--+
| 36 | 3-5-2018 | 50 | |
| 37 | 3-15-18 | 100 | |
| 38 | 3-25-18 | 65 | |
| 37 | 4-5-18 | 95 | |
| 39 | 4-21-18 | 500 | |
| 40 | 4-45-18 | 199 | |
+------+----------+-------+--+
desired output
+------+-----------+-------+------+
| CUST | Month End | SALES | Rank |
+------+-----------+-------+------+
| | | | |
| 37 | 3-31-18 | 100 | 1 |
| 38 | 3-31-18 | 65 | 2 |
| 39 | 4-30-18 | 500 | 1 |
| 40 | 4-30-18 | 199 | 2 |
+------+-----------+-------+------+

Based on your expected output I think this may work as well.
create table Salesdate (Cust int, Dates date, Sales int)
insert into Salesdate values
(36 , '2018-03-05' , 50 )
,(37 , '2018-03-15' , 100 )
,(38 , '2018-03-25' , 65 )
,(37 , '2018-04-05' , 95 )
,(40 , '2018-04-25' , 199 )
,(39 , '2018-04-21' , 500 )
Updating the same column dates to the last day of the month (EOmonth will help to give last day of the month), you can add a separate column or update the column as you prefer.
Update Salesdate
set Dates = eomonth(Dates)
Add a column called rank in the table.
Alter table Salesdate
add rank int
Update the column rank which was just added.
update Salesdate
set Salesdate.[rank] = tbl.Ranked from
(select Cust, Sales, Dates , rank() over (Partition by Dates order by Sales Desc)
Ranked from Salesdate ) tbl
where tbl.Cust = salesdate.Cust
and tbl.Sales = salesdate.Sales
and tbl.dates = salesdate.Dates
--Not sure if this step is necessary if you want your final table to have only rank 1 and 2, then you can delete the data. Or it can be filtered out only on select list as well. Please note that sometimes rank may skip the number if we don't have unique set of sales amount for a given customer.
;With cte as (
select * from Salesdate)
delete from cte
where [RANK] > 2
select * from Salesdate
order by dates, [RANK]
Output
Cust Dates Sales rank
37 2018-03-31 100 1
38 2018-03-31 65 2
39 2018-04-30 500 1
40 2018-04-30 199 2

how to get sum of each column new records in SQL Server

I have a question about SQL Server. I have a table something like this:
productname |Level| January | Feburary | March | total
------------x-----x-----------x----------x-------x------
Rin | L1 | 10 | 20 | 30 | 60
Rin | L2 | 5 | 10 | 10 | 25
Rin | L3 | 20 | 5 | 5 | 30
Pen | L1 | 5 | 6 | 10 | 21
Pen | L2 | 10 | 10 | 20 | 40
Pen | L3 | 30 |10 | 40 | 80
based on above table data I want output like below
productname |Level| January | Feburary | March | total
------------x-----x-----------x----------x-------x------
Rin | L1 | 10 | 20 | 30 | 60
Rin | L2 | 5 | 10 | 10 | 25
Rin | L3 | 20 | 5 | 5 | 30
RinTotal |All | 35 | 35 | 45 | 115
Pen | L1 | 5 | 6 | 10 | 21
Pen | L2 | 10 | 10 | 20 | 40
Pen | L3 | 30 | 10 | 40 | 80
PenTotal | All | 45 | 26 | 70 |141
I tried like bellow query
SELECT productname
,LEVEL
,sum(january) AS January
,sum(Feburary) AS Feburary )
,Sum(march) AS March
,Sum(total) AS total
FROM test
UNION
SELECT *
FROM test
but its not given exact output .Please point me to right direction on how to achieve this task in SQL Server.

please try this:
SELECT * FROM TEST
UNION
SELECT PRODUCTNAME+'TOTAL','ALL' AS LEVEL,SUM(JANUARY)AS JANUARY,SUM(FEBURARY)AS FEBURARY),SUM(MARCH)AS MARCH,SUM(TOTAL)AS TOTAL
FROM TEST GROUP BY PRODUCTNAME

This really belongs in the front end. Group subtotals and such are usually really simple from most reporting tools. Also, don't get lazy and use select *, you should always be explicit in your columns. Since you have a specific order I added a couple of extra columns to use for sorting.
Also don't be afraid to add some white space and formatting to your queries. It makes your life a lot easier to read and later debug.
I think something like this should get you close. Notice I changed to a UNION ALL. When using UNION it will exclude duplicates. Since you know for a fact that there are no duplicate rows a UNION ALL will eliminate the need to check for duplicates.
select productname + 'Total' as productname
, 'All' as level
, sum(january) as January
, sum(Feburary) as Feburary
, Sum(march) as March
, Sum(total) as total
, productname as SortName
, 1 as SortOrder
from test
group by productname
union ALL
select productname
, level
, January
, Feburary
, March
, Total
, productname as SortName
, 0 as SortOrder
from test
order by SortName, SortOrder

I would do this using Group by With Rollup. For more info check here
SELECT *
FROM (SELECT productname=productname + CASE WHEN level IS NULL THEN 'Total'
ELSE '' END,
Level=Isnull(level, 'ALL'),
Sum(january) AS January,
Sum(feburary) AS Feburary,
Sum(march) AS March,
Sum(total) AS total
FROM Yourtable
GROUP BY rollup ( productname, level )) a
WHERE productname IS NOT NULL
SQLFIDDLE DEMO