Sum running total in sql - sql-server

I am trying to insert a running total column into a SQL Server table as part of a stored procedure. I need this for a financial database, so I am dealing with accounts and departments. For example, let's say I have this data set:
Account | Dept | Date | Value | Running_Total
--------+--------+------------+----------+--------------
5000 | 40 | 2018-02-01 | 10 | 15
5000 | 40 | 2018-01-01 | 5 | 5
4000 | 40 | 2018-02-01 | 10 | 30
5000 | 30 | 2018-02-01 | 15 | 15
4000 | 40 | 2017-12-01 | 20 | 20
The Running_Total column provides a historical sum of dates less than or equal to each row's date value. However, the account and dept must match for this to be the case.
I was able to get close by using
SUM(Value) OVER (PARTITION BY Account, Dept, Date)
but it does not go back and get the previous months...
Any ideas? Thanks!

You are close. You need an order by:
Sum(Value) over (partition by Account, Dept order by Date)
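To see that fix against the question's sample rows, here is a minimal sketch. The table variable @Ledger is a hypothetical stand-in, since the question does not name the real table. The frame is spelled out only for clarity: with an ORDER BY and no frame, SQL Server defaults to RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW, which sums every row in the partition whose date is less than or equal to the current row's date.

```sql
-- Hypothetical stand-in for the real table (name and types are assumptions).
DECLARE @Ledger TABLE (Account int, Dept int, [Date] date, Value int);

INSERT INTO @Ledger (Account, Dept, [Date], Value)
VALUES (5000, 40, '2018-02-01', 10),
       (5000, 40, '2018-01-01', 5),
       (4000, 40, '2018-02-01', 10),
       (5000, 30, '2018-02-01', 15),
       (4000, 40, '2017-12-01', 20);

SELECT Account,
       Dept,
       [Date],
       Value,
       -- One running sum per (Account, Dept), accumulated in date order.
       SUM(Value) OVER (
           PARTITION BY Account, Dept
           ORDER BY [Date]
           RANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS Running_Total
FROM @Ledger
ORDER BY Account, Dept, [Date];
```

Because the frame is RANGE rather than ROWS, two rows with the same Account, Dept and Date would both show the total that includes both of them. If each row should instead add only itself on top of the previous row, ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW is the variant to try.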

Related

Creating rolling window for time series data in SQL

I have a question about adding a rolling window column in SQL. Table A is a sample of 24 months of time series data. I need to add columns for the difference between each month's balance and the previous month's, and between each month's balance and the month before the previous month. For example, for Mar 2020 I need the difference between Mar and Feb and also between Mar and Jan, for Deposit and Withdraw separately, for each ID (Table B). I tried to use a 'window' function in SQL but I do not know how.
**Table A**
+--------+-----------+-------+---------+
| ID | Date | A | B |
+--------+-----------+-------+---------+
| 1 | Jan 20 | $200 | $100 |
| 1 | Feb 20 | $500 | $250 |
| 1 | Mar 20 | $1000 | $550 |
+--------+-----------+-------+---------+
I want results like this:
**Table B**
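+--------+-----------+-------+---------+------------+-----------+----------+-----------+
| ID | Date | A | B | A(Mar-Feb) | A(Mar-Jan) | B(Mar-Feb) | B(Mar-Jan) |
+--------+-----------+-------+---------+------------+-----------+----------+-----------+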
| 1 | Jan 20 | $200 | $100 | | | | |
| 1 | Feb 20 | $500 | $250 | | | | |
| 1 | Mar 20 | $1000 | $550 | $500 |$800 |$300 |$450 |
+--------+-----------+-------+---------+------------+-----------+----------+-----------+
I'd really appreciate it if someone could help me.
Edited: See edit at bottom for corrected answer based on more information from OP
I "think" this is what you're asking for and it may not perfectly be what you want, because it fills in the other rows as well...
IF OBJECT_ID('tempdb..#TableA','U') IS NOT NULL DROP TABLE #TableA; --SELECT * FROM #TableA
CREATE TABLE #TableA (
ID int NOT NULL,
[Date] date NOT NULL,
A int NOT NULL,
B int NOT NULL,
)
INSERT INTO #TableA (ID, Date, A, B)
VALUES (1, '2020-01-01', 200, 100)
, (1, '2020-02-01', 500, 250)
, (1, '2020-03-01', 1000, 550)
SELECT ta.ID
, [Date] = FORMAT(ta.[Date],'MMM yy')
, ta.A, ta.B
, A_DiffPrev = ta.A - LAG(ta.A) OVER (ORDER BY ta.[Date])
, A_DiffFirst = ta.A - FIRST_VALUE(ta.A) OVER (ORDER BY ta.[Date])
, B_DiffPrev = ta.B - LAG(ta.B) OVER (ORDER BY ta.[Date])
, B_DiffFirst = ta.B - FIRST_VALUE(ta.B) OVER (ORDER BY ta.[Date])
FROM #TableA ta
Returns:
| ID | Date | A | B | A_DiffPrev | A_DiffFirst | B_DiffPrev | B_DiffFirst |
|----|--------|------|-----|------------|-------------|------------|-------------|
| 1 | Jan 20 | 200 | 100 | NULL | 0 | NULL | 0 |
| 1 | Feb 20 | 500 | 250 | 300 | 300 | 150 | 150 |
| 1 | Mar 20 | 1000 | 550 | 500 | 800 | 300 | 450 |
Explanation
LAG(ta.A) OVER (ORDER BY ta.[Date]) - This will give you the previous value as sorted by the provided ORDER BY. So in this case, it's saying, give me the value that occurs prior to the current row, if you sort by [Date] Ascending
FIRST_VALUE(ta.A) OVER (ORDER BY ta.[Date]) - Similar idea to LAG() except it's saying to get the very first item, rather than the previous item.
Edit
In the comments you mentioned that FIRST_VALUE() will not work for you because you don't want to compare with the first month, you want to compare with the previous month and two months back.
In that case, you can use this solution:
SELECT ta.ID
, [Date] = FORMAT(ta.[Date],'MMM yy')
, ta.A, ta.B
, A_DiffPrev1 = ta.A - LAG(ta.A,1) OVER (ORDER BY ta.[Date])
, A_DiffPrev2 = ta.A - LAG(ta.A,2) OVER (ORDER BY ta.[Date])
, B_DiffPrev1 = ta.B - LAG(ta.B,1) OVER (ORDER BY ta.[Date])
, B_DiffPrev2 = ta.B - LAG(ta.B,2) OVER (ORDER BY ta.[Date])
FROM #TableA ta
Explanation:
In this change, I'm using LAG() for everything, but now I'm telling LAG() how many rows to look back.
To get the previous month, I say LAG(A, 1), which means grab the previous row. That is the default offset; I'm only providing it here to make it explicit what is happening.
Then I say LAG(A, 2), which means go back two rows and grab that value.
NOTE: This all assumes you do not have gaps in your data. It also orders by [Date] only, which is fine for the single ID in the sample; with several IDs, add PARTITION BY ta.ID inside each OVER clause so rows from different IDs are not compared.
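Since the question asks for the differences "for each ID", here is a sketch of the same LAG() query with a PARTITION BY added. It uses a table variable instead of the temp table so it runs on its own, and the rows for ID 2 are made up purely to show that the look-back restarts for each ID.

```sql
DECLARE @TableA TABLE (
    ID     int  NOT NULL,
    [Date] date NOT NULL,
    A      int  NOT NULL,
    B      int  NOT NULL
);

INSERT INTO @TableA (ID, [Date], A, B)
VALUES (1, '2020-01-01',  200, 100),
       (1, '2020-02-01',  500, 250),
       (1, '2020-03-01', 1000, 550),
       (2, '2020-01-01',   50,  10),  -- ID 2 rows are hypothetical
       (2, '2020-02-01',   70,  40);

SELECT ta.ID
     , [Date] = FORMAT(ta.[Date], 'MMM yy')
     , ta.A, ta.B
     -- PARTITION BY ta.ID keeps LAG() from reaching into another ID's rows.
     , A_DiffPrev1 = ta.A - LAG(ta.A, 1) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
     , A_DiffPrev2 = ta.A - LAG(ta.A, 2) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
     , B_DiffPrev1 = ta.B - LAG(ta.B, 1) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
     , B_DiffPrev2 = ta.B - LAG(ta.B, 2) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
FROM @TableA ta
ORDER BY ta.ID, ta.[Date];
```

The first row of each ID gets NULL in every difference column, and the second row gets NULL in the two-back columns, because LAG() returns NULL when there is no row that far back in the partition.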

Ranking within multiple groups & Efficient query for multiple table updates

I'm trying to add rank by sales by month and also change the date column to a 'month end' field that would show only last day of month.
Can I do two SETs in a row like that without adding another UPDATE?
I'm looking for the top 2 within each month. Do LIMIT and GROUP BY work for that?
I feel like this is the right and most efficient query, but it's not working. Any help appreciated!
UPDATE table1
SET DATE=EOMONTH(DATE) AS MONTH_END;
ALTER TABLE table1
ADD COLUMN RANK INT AFTER sales;
UPDATE table1
SET RANK=
RANK() OVER(PARTITION BY cust ORDER BY sales DESC);
LIMIT 2
orig table
+------+----------+-------+--+
| CUST | DATE | SALES | |
+------+----------+-------+--+
| 36 | 3-5-2018 | 50 | |
| 37 | 3-15-18 | 100 | |
| 38 | 3-25-18 | 65 | |
| 37 | 4-5-18 | 95 | |
| 39 | 4-21-18 | 500 | |
| 40 | 4-45-18 | 199 | |
+------+----------+-------+--+
desired output
+------+-----------+-------+------+
| CUST | Month End | SALES | Rank |
+------+-----------+-------+------+
| | | | |
| 37 | 3-31-18 | 100 | 1 |
| 38 | 3-31-18 | 65 | 2 |
| 39 | 4-30-18 | 500 | 1 |
| 40 | 4-30-18 | 199 | 2 |
+------+-----------+-------+------+
I do not know why you want EOMONTH as a stored value, but the EOMONTH(DATE) call itself will work; just drop the AS MONTH_END alias, because a SET assignment cannot carry a column alias.
I would not use [rank] as a column name as I avoid any words that are used in SQL, maybe [sales_rank] or similar.
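-- SQL Server uses ADD without the COLUMN keyword and has no AFTER clause.
-- Run the ALTER in its own batch before the UPDATE so the new column is visible.
ALTER TABLE table1
ADD [sales_rank] INT;
with cte as (
select
cust
, sales_rank -- the column being updated must be exposed by the cte
, DENSE_RANK() OVER(PARTITION BY cust ORDER BY sales DESC) as ranking
from table1
)
update cte
set sales_rank = ranking
where ranking < 3
;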
LIMIT 2 is not something that can be used in SQL Server by the way, and it sure can't be used "per grouping". When you use a "window function" such as rank() or dense_rank() you can use the output of those in the where clause of the next "layer". i.e. use those functions in a subquery (or cte) and then use a where clause to filter rows by the calculated values.
Also note I used dense_rank() to guarantee that no rank numbers are skipped, so that the subsequent where clause will be effective.
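The desired output in the question ranks customers within each month, so a variant that partitions by the month end rather than by cust may be closer to what was asked. The sketch below is a read-only SELECT under that assumption. The table variable @table1 and its column types are hypothetical, and the source row dated 4-45-18 is not a valid date, so 2018-04-25 is used here as a guess.

```sql
DECLARE @table1 TABLE (cust int, [date] date, sales int);

INSERT INTO @table1 (cust, [date], sales)
VALUES (36, '2018-03-05',  50),
       (37, '2018-03-15', 100),
       (38, '2018-03-25',  65),
       (37, '2018-04-05',  95),
       (39, '2018-04-21', 500),
       (40, '2018-04-25', 199);  -- assumed date; the source shows 4-45-18

WITH ranked AS (
    SELECT cust,
           EOMONTH([date]) AS month_end,
           sales,
           -- Rank sales inside each calendar month, highest first.
           DENSE_RANK() OVER (
               PARTITION BY EOMONTH([date])
               ORDER BY sales DESC
           ) AS sales_rank
    FROM @table1
)
SELECT cust, month_end, sales, sales_rank
FROM ranked
WHERE sales_rank <= 2          -- filter in the outer layer, as described above
ORDER BY month_end, sales_rank;
```

Computing the month end and the rank in a query like this avoids both the ALTER TABLE and the UPDATE; whether that fits depends on whether the values really need to be stored.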

Days between status change in SQL Server

I need to find the number of days between status changes in SQL Server 2014.
For example, please see the data below
+--------+--------+------------+-------------+
| status | Number | updated_on | opened_at |
+--------+--------+------------+-------------+
| Draft | 100 | 2017-11-03 | 2017-11-03 |
| Draft | 100 | 2017-12-12 | 2017-11-03 |
| WIP | 100 | 2017-12-12 | 2017-11-03 |
| Appr | 100 | 2018-01-05 | 2017-11-03 |
| Launch | 100 | 2018-01-10 | 2017-11-03 |
| Close | 100 | 2018-01-11 | 2017-11-03 |
+--------+--------+------------+-------------+
Based on the above input, I need to get
Draft --- 40 days,
WIP --- 23 days,
appro -- 5 days,
deploy/launch - 1 days,
closed --- 69 days
Please help me with a SQL query to arrive at these results.
Thanks.
I don't think your numbers are right. But this should do what you want, assuming that the statuses are unique:
select status,
datediff(day, updated_on, lead(updated_on) over (order by updated_on) ) as days
from t;
I don't understand the first and last numbers, though.
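The sample data has two Draft rows, so the "statuses are unique" assumption does not quite hold there. One possible way around it, sketched below, is to collapse each status to the first date it appears and then apply LEAD() to those dates. The table variable @t and the column types are assumptions, and treating the first appearance as the moment the status was entered is an interpretation, not something the question states.

```sql
DECLARE @t TABLE ([status] varchar(10), Number int, updated_on date, opened_at date);

INSERT INTO @t ([status], Number, updated_on, opened_at)
VALUES ('Draft',  100, '2017-11-03', '2017-11-03'),
       ('Draft',  100, '2017-12-12', '2017-11-03'),
       ('WIP',    100, '2017-12-12', '2017-11-03'),
       ('Appr',   100, '2018-01-05', '2017-11-03'),
       ('Launch', 100, '2018-01-10', '2017-11-03'),
       ('Close',  100, '2018-01-11', '2017-11-03');

WITH first_seen AS (
    -- One row per status: the earliest date that status was recorded.
    SELECT Number, [status], MIN(updated_on) AS entered_on
    FROM @t
    GROUP BY Number, [status]
)
SELECT Number,
       [status],
       entered_on,
       -- Days until the next status was entered; NULL for the last status.
       DATEDIFF(day, entered_on,
                LEAD(entered_on) OVER (PARTITION BY Number ORDER BY entered_on)
       ) AS days_in_status
FROM first_seen
ORDER BY Number, entered_on;
```

The last status has no following row, so its days_in_status is NULL. If it should instead count from opened_at or up to today, that would need an explicit rule such as wrapping the LEAD() in ISNULL().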
Try this
SELECT
tb.status,
DATEDIFF(dayofyear, tb.opened_at, tb.LastUpdate) AS DaysInDifference
FROM
(
SELECT
DISTINCT
status,
Max(updated_on) OVER(PARTITION BY [status] )LastUpdate,
opened_at
FROM Table1
)AS tb

Limit RANGE with condition in Window function

As an example, I have the following transaction table, with transaction values for each department in each trimester.
TransactionID | Department | Trimester | Year | Value | Moving Avg
1 | Dep1 | T1 | 2014 | 13 |
2 | Dep1 | T1 | 2014 | 43 |
3 | Dep1 | T2 | 2014 | 36 |
300 | Dep1 | T1 | 2017 | 28 |
301 | Dep2 | T1 | 2014 | 24 |
I would like to calculate a moving average for each transaction within the same department, with a window running from 6 trimesters to 2 trimesters before the current row's trimester. For example, for transaction 300 in T1 2017, I'd like the average of the transaction values for Dep1 from T1 2015 to T2 2016.
How can I achieve this with a sliding window function in SQL Server 2014? My thought is that I should use something like
SELECT
AVG(VALUES) OVER
(PARTITION BY DEPARTMENT ORDER BY TRIMESTER,
YEAR RANGE [Take the range from previous 6 to 2 trimesters])
How would we define the RANGE clause? I suppose I cannot use ROWS because the number of rows in the window is unknown.
The same question applies to the median: how would we rewrite this to calculate the median instead of the mean?
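As far as I know, SQL Server's RANGE frame only accepts UNBOUNDED PRECEDING, UNBOUNDED FOLLOWING and CURRENT ROW as bounds, so a frame like RANGE BETWEEN 6 PRECEDING AND 2 PRECEDING is not available. One possible workaround for the mean, sketched below, is to turn each (Year, Trimester) into a sequential index and average over an index range with OUTER APPLY. It assumes three trimesters per year labelled T1 to T3, which is consistent with the T1 2017 example in the question. The table variable @Transactions is hypothetical, and rows 150 and 200 are made up so that the window for transaction 300 is not empty.

```sql
DECLARE @Transactions TABLE (
    TransactionID int, Department varchar(10),
    Trimester char(2), [Year] int, Value int
);

INSERT INTO @Transactions (TransactionID, Department, Trimester, [Year], Value)
VALUES (1,   'Dep1', 'T1', 2014, 13),
       (2,   'Dep1', 'T1', 2014, 43),
       (3,   'Dep1', 'T2', 2014, 36),
       (150, 'Dep1', 'T2', 2015, 20),  -- hypothetical row
       (200, 'Dep1', 'T1', 2016, 40),  -- hypothetical row
       (300, 'Dep1', 'T1', 2017, 28),
       (301, 'Dep2', 'T1', 2014, 24);

WITH indexed AS (
    SELECT TransactionID, Department, Value,
           -- Sequential trimester number: T1 2015 and T2 2015 differ by 1.
           [Year] * 3 + CAST(SUBSTRING(Trimester, 2, 1) AS int) - 1 AS TriIdx
    FROM @Transactions
)
SELECT cur.TransactionID, cur.Department, cur.Value, w.MovingAvg
FROM indexed AS cur
OUTER APPLY (
    -- Average over the same department, 6 to 2 trimesters back.
    SELECT AVG(CAST(prev.Value AS decimal(18, 4))) AS MovingAvg
    FROM indexed AS prev
    WHERE prev.Department = cur.Department
      AND prev.TriIdx BETWEEN cur.TriIdx - 6 AND cur.TriIdx - 2
) AS w
ORDER BY cur.TransactionID;
```

Rows with no transactions in their window get NULL. The CAST to decimal is there because AVG over an int column would otherwise return a truncated integer.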

how to get sum of each column new records in SQL Server

I have a question about SQL Server. I have a table something like this:
productname |Level| January | Feburary | March | total
------------x-----x-----------x----------x-------x------
Rin | L1 | 10 | 20 | 30 | 60
Rin | L2 | 5 | 10 | 10 | 25
Rin | L3 | 20 | 5 | 5 | 30
Pen | L1 | 5 | 6 | 10 | 21
Pen | L2 | 10 | 10 | 20 | 40
Pen | L3 | 30 | 10 | 40 | 80
based on above table data I want output like below
productname |Level| January | Feburary | March | total
------------x-----x-----------x----------x-------x------
Rin | L1 | 10 | 20 | 30 | 60
Rin | L2 | 5 | 10 | 10 | 25
Rin | L3 | 20 | 5 | 5 | 30
RinTotal |All | 35 | 35 | 45 | 115
Pen | L1 | 5 | 6 | 10 | 21
Pen | L2 | 10 | 10 | 20 | 40
Pen | L3 | 30 | 10 | 40 | 80
PenTotal | All | 45 | 26 | 70 | 141
I tried the query below
SELECT productname
,LEVEL
,sum(january) AS January
,sum(Feburary) AS Feburary )
,Sum(march) AS March
,Sum(total) AS total
FROM test
UNION
SELECT *
FROM test
but it does not give the exact output. Please point me in the right direction on how to achieve this in SQL Server.
please try this:
SELECT * FROM TEST
UNION
SELECT PRODUCTNAME+'TOTAL','ALL' AS LEVEL,SUM(JANUARY)AS JANUARY,SUM(FEBURARY)AS FEBURARY,SUM(MARCH)AS MARCH,SUM(TOTAL)AS TOTAL
FROM TEST GROUP BY PRODUCTNAME
This really belongs in the front end. Group subtotals and such are usually really simple from most reporting tools. Also, don't get lazy and use select *, you should always be explicit in your columns. Since you have a specific order I added a couple of extra columns to use for sorting.
Also don't be afraid to add some white space and formatting to your queries. It makes your life a lot easier to read and later debug.
I think something like this should get you close. Notice I changed to a UNION ALL. When using UNION it will exclude duplicates. Since you know for a fact that there are no duplicate rows a UNION ALL will eliminate the need to check for duplicates.
select productname + 'Total' as productname
, 'All' as level
, sum(january) as January
, sum(Feburary) as Feburary
, Sum(march) as March
, Sum(total) as total
, productname as SortName
, 1 as SortOrder
from test
group by productname
union ALL
select productname
, level
, January
, Feburary
, March
, Total
, productname as SortName
, 0 as SortOrder
from test
order by SortName, SortOrder
I would do this using GROUP BY with ROLLUP.
SELECT *
FROM (SELECT productname=productname + CASE WHEN level IS NULL THEN 'Total'
ELSE '' END,
Level=Isnull(level, 'ALL'),
Sum(january) AS January,
Sum(feburary) AS Feburary,
Sum(march) AS March,
Sum(total) AS total
FROM Yourtable
GROUP BY rollup ( productname, level )) a
WHERE productname IS NOT NULL
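A variation on the ROLLUP approach, offered as a sketch rather than a replacement: GROUPING() reports whether a NULL in a rollup row comes from the rollup itself instead of from the data, and an explicit ORDER BY places each subtotal directly after its product's detail rows. The table variable @test stands in for the question's test table, with assumed column types.

```sql
DECLARE @test TABLE (
    productname varchar(20), [Level] varchar(5),
    January int, Feburary int, March int, total int
);

INSERT INTO @test (productname, [Level], January, Feburary, March, total)
VALUES ('Rin', 'L1', 10, 20, 30, 60),
       ('Rin', 'L2',  5, 10, 10, 25),
       ('Rin', 'L3', 20,  5,  5, 30),
       ('Pen', 'L1',  5,  6, 10, 21),
       ('Pen', 'L2', 10, 10, 20, 40),
       ('Pen', 'L3', 30, 10, 40, 80);

SELECT r.productname + CASE WHEN r.IsTotal = 1 THEN 'Total' ELSE '' END AS productname,
       CASE WHEN r.IsTotal = 1 THEN 'All' ELSE r.[Level] END AS [Level],
       r.January, r.Feburary, r.March, r.total
FROM (
    SELECT productname, [Level],
           GROUPING([Level]) AS IsTotal,     -- 1 on the per-product subtotal row
           SUM(January)  AS January,
           SUM(Feburary) AS Feburary,
           SUM(March)    AS March,
           SUM(total)    AS total
    FROM @test
    GROUP BY ROLLUP (productname, [Level])
    HAVING GROUPING(productname) = 0         -- drop the grand-total row
) AS r
ORDER BY r.productname, r.IsTotal, r.[Level];
```

This sorts products alphabetically, so Pen comes before Rin; keeping the question's original product order would need a separate sort key that the sample table does not have.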
