Average value depending on cumulative value from other column aka running total - sql-server

I've sifted through the various sql-server tagged threads using AVERAGE and Cumulative as search terms. Various desperate answers, but I can't cobble them together for my needs. The use case is to find the initial average value (cumulative value/cumulative days on) for a time period when cumulative days on is greater than 60 and less than 90.
Below is a table where ID identifies the object, VALUE is the amount reported on a monthly basis and DAYSON is the number of days in that month where the object ran to produce the value. YEARMONTH is date value on which on can sort.
ID VALUE DASYON YEARMONTH
1 166 27 201502
1 1 2 201505
1 569 19 201507
1 312 19 201508
2 364 27 201502
2 328 31 201503
2 242 29 201504
2 273 31 201505
2 174 30 201506
2 188 25 201507
2 203 25 201508
3 474 28 201502
3 521 31 201503
3 465 30 201504
3 473 31 201505
3 434 30 201506
3 404 31 201507
I would like to create a summary table that averages the cumulative value divided by the cumulative days uniquely for each ID where cumulative days is greater than 60 and less than 90. Below is a table that with the cumulative values. (I generated this in Excel)
ID VALUE cumValue DASYON cumDaysOn YEARMONTH
1 166 166 27 27 201502
1 1 167 2 29 201505
1 569 736 19 48 201507
1 312 1048 19 67 201508
2 364 364 27 27 201502
2 328 692 31 58 201503
2 242 934 29 87 201504
2 273 1207 31 118 201505
2 174 1381 30 148 201506
2 188 1569 25 173 201507
2 203 1772 25 198 201508
3 474 474 28 28 201502
3 521 505 31 59 201503
3 465 535 30 89 201504
3 473 566 31 120 201505
3 434 596 30 150 201506
3 404 627 31 181 201507
I try this based on other threads:
SELECT
ID,
Value,
SUM(Value) OVER (ORDER BY ID, YearMonth) [cumValue],
DaysOn,
SUM (DaysOn) OVER (Order by ID, YearMonth) as cumDaysOn,
YearMonth
FROM table
WHERE DAYSON > 0 and Liquid > 0 and YearMonth > 201501
GROUP BY ID, YearMonth, Value, DaysOn
ORDER BY ID, yearmonth
I can't get it to iterate over the ID; it just keeps summing down the column. If I could create a table or view like the one above, then I could always use a select statement and divide cumvalue by cumdayson.
Below is a table to show where I would get the initial average value (InititalAverageValue) based on the criteria:
ID VALUE cumValue DASYON cumDaysOn YEARMONTH InitalAvgValue
1 166 166 27 27 201502
1 1 167 2 29 201505
1 569 736 19 48 201507
1 312 1048 19 67 201508 55
2 364 364 27 27 201502
2 328 692 31 58 201503
2 242 934 29 87 201504 32
2 273 1207 31 118 201505
2 174 1381 30 148 201506
2 188 1569 25 173 201507
2 203 1772 25 198 201508
3 474 474 28 28 201502
3 521 505 31 59 201503
3 465 535 30 89 201504 18
3 473 566 31 120 201505
3 434 596 30 150 201506
3 404 627 31 181 201507
Ultimately what I desire is table as such:
ID InitalAvgValue
1 55
2 32
3 18
Thanks in advance for any help.

The crux is that you need a running total. There are several approaches to calculating running totals, but they have various tradeoffs between simplicity and performance. The "best" approach depends on the expected size of your data set and whether you are using SQL Server 2012 or an earlier version. The following article describes some different options along with the pros and cons:
http://sqlperformance.com/2012/07/t-sql-queries/running-totals
Here's a quick example using correlated subqueries, which may be reasonable for small data sets, but likely would not scale well to larger data:
SELECT
ID,
ROUND(AVG(CAST(CumulativeValue AS FLOAT) / CAST(CumulativeDaysOn AS FLOAT)), 1) AS Average
FROM
(
SELECT
ID,
Value,
DaysOn,
(SELECT SUM(Value) FROM ExampleTable t2 WHERE t1.ID = t2.ID and t2.YearMonth <= t1.YearMonth) AS CumulativeValue,
(SELECT SUM(DaysOn) FROM ExampleTable t2 WHERE t1.ID = t2.ID and t2.YearMonth <= t1.YearMonth) AS CumulativeDaysOn
FROM
ExampleTable t1
) AS ExampleWithTotals
WHERE
CumulativeDaysOn > 60 AND CumulativeDaysOn < 90
GROUP BY
ID
ORDER BY
ID
;
Output:
ID Average
1 15.6
2 10.7
3 16.4

Related

Subtract depletion And Limit for SUM

First I'm using AdventureWork2019 as a reference
I have a query where I'm joining 5 Tables
USE [AdventureWorks2019]
GO
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
Alter PROCEDURE dbo.TestLocation
#UseDate DateTime
AS
BEGIN
SET NOCOUNT ON;
SELECT prodID
,SUM(PurchQty) AS TotalPurchase
,SUM(SalesQty) AS TotalSell
,StartDate
from (
SELECT DISTINCT WO.ProductID AS prodID
, StartDate
,WO.OrderQty AS PurchQty
,SOD.OrderQty AS SalesQty
FROM Sales.SalesOrderDetail SOD
LEFT JOIN Production.WorkOrderRouting WOR ON WOR.ProductID = SOD.ProductID
--LEFT JOIN Production.Location PL ON PL.LocationID = WOR.LocationID
--The above Join is the one for the locationID and it's working Fine
LEFT JOIN Production.WorkOrder WO ON WO.ProductID = SOD.ProductID
FULL OUTER JOIN Purchasing.PurchaseOrderDetail POD ON POD.ProductID = SOD.ProductID
WHERE StartDate = #UseDate
-- AND PL.LocationID >= 10
) Test3
Group by prodID,StartDate
order by prodID ASC, StartDate
END
GO
EXEC TestLocation '2011-07-02 00:00:00.000'
Output(sample):
prodID TotalPurc TotalSell StartDate
717 8 36 2011-07-02 00:00:00.000
730 9 47 2011-07-02 00:00:00.000
744 2 3 2011-07-02 00:00:00.000
747 12 21 2011-07-02 00:00:00.000
749 5 15 2011-07-02 00:00:00.000
761 16 138 2011-07-02 00:00:00.000
775 26 91 2011-07-02 00:00:00.000
777 12 78 2011-07-02 00:00:00.000
802 6 21 2011-07-02 00:00:00.000
804 40 60 2011-07-02 00:00:00.000
806 16 138 2011-07-02 00:00:00.000
807 24 23 2011-07-02 00:00:00.000
810 21 28 2011-07-02 00:00:00.000
811 6 21 2011-07-02 00:00:00.000
813 8 37 2011-07-02 00:00:00.000
817 21 28 2011-07-02 00:00:00.000
And another Table For LocationID (as a warehouse)
SELECT LocationID,CostRate,Availability
FROM Production.Location
WHERE LocationID >= 10
order by CostRate ASC
LocationID CostRate Availability
50 12.25 120.00
60 12.25 120.00
30 14.50 120.00
40 15.75 120.00
45 18.00 80.00
10 22.50 96.00
20 25.00 108.00
What I want to do is to take each LoactionId and ProdID and take TotalPurc to the location and decrement the quantity in the Availability column, each TotalSell will increment the Availability column. The max Availability quantity is 130.
If all locations have no Available quantity that is the Available is 0 for all locations then it will stop.
the above will work with the date specified as you can check the query and run it if you have
AdventureWork2019
simple output to check how I want the data to be:
prodID TotalPurc TotalSell StartDate
717 8 36 2011-07-02 00:00:00.000
730 9 47 2011-07-02 00:00:00.000
744 2 3 2011-07-02 00:00:00.000
747 12 21 2011-07-02 00:00:00.000
749 5 15 2011-07-02 00:00:00.000
LocationID CostRate Availability
50 12.25 120.00
60 12.25 120.00
30 14.50 120.00
40 15.75 120.00
45 18.00 80.00
10 22.50 96.00
20 25.00 108.00
Output :
prodID TotalPurc TotalSell StartDate LocationID Availability Remaining
717 8 36 2011-07-02 00:00:00.000 50 130 18
717 8 36 2011-07-02 00:00:00.000 60 130 8
717 8 36 2011-07-02 00:00:00.000 30 128 0
--what happened above is that I took the (120-8) = 112 then 112+36 = 148 we only can use 130 then the remaining is 18 then we took the next `LocationID` with the least Cost (120+18 = 138 we can use 130 so we took the 8) and used it in the next `LocationID`
730 9 47 2011-07-02 00:00:00.000 30 130 36
730 9 47 2011-07-02 00:00:00.000 40 130 26
730 9 47 2011-07-02 00:00:00.000 45 106 0
744 2 3 2011-07-02 00:00:00.000 45 107 0
747 12 21 2011-07-02 00:00:00.000 45 116 0
749 5 15 2011-07-02 00:00:00.000 45 126 0
--the above is the same as the first 3 rows we subtract and add to the availability
The other condition is that if all locations reached 0 or 130 then stop
How can I do that in SQL Server? I tried using CTE but didn't work well with me and tried the cursor which I think is the best for this kind of thing but didn't achieve anything.
Thank you in advance
Edit :
ALTER FUNCTION GetStockMovment
(
-- Add the parameters for the function here
#ForDate Datetime
)
RETURNS #Sums TABLE (
RemoveQTY Numeric(24, 7),
ADDQTY Numeric(24, 7)
)
AS
BEGIN
Declare #WoSum Numeric(24, 7),
#SODSUM Numeric(24, 7),
#WORSum Numeric(24, 7),
#PODSum Numeric(24, 7)
select #SODSUM = SUM(SOD.OrderQty) from Sales.SalesOrderDetail SOD
INNER JOIN Sales.SalesOrderHeader SOH ON SOD.SalesOrderID = SOH.SalesOrderID
where SOH.OrderDate = #ForDate
select #WoSum = sum(orderQty) from Production.WorkOrder
where StartDate = #ForDate
select #PODSum = sum(POD.OrderQty) from Purchasing.PurchaseOrderDetail POD
INNER JOIN Purchasing.PurchaseOrderHeader POH ON POD.PurchaseOrderID = POH.PurchaseOrderID
where POH.OrderDate = #ForDate
select #WoSum = sum(WO.OrderQty) from Production.WorkOrder WO
where WO.DueDate = #ForDate
INSERT INTO #Sums (RemoveQTY,ADDQTY)
SELECT isnull(#SODSUM,0) + isnull(#WORSum,0) , isnull(#PODSum,0) + isnull(#WoSum,0)
RETURN;
END;
GO
select * from dbo.GetStockMovment ('2014-05-26 00:00:00.000')
Output:
RemoveQTY ADDQTY
189.0000000 5334.0000000
You should use LAG or LEAD function.
https://learn.microsoft.com/en-us/sql/t-sql/functions/lead-transact-sql?view=sql-server-ver15
https://learn.microsoft.com/en-us/sql/t-sql/functions/lag-transact-sql?view=sql-server-ver15

Running (Average True Range) calculation in a column SQL

I have a dataset of price data and would like to get the calculation of the ongoing ATR (Average True Range) for all rows > 21. Row 21 is the AVG([TR]) from Rows 2-21 and is equal to 353.7.
The calculation that needs to be continuous for the rest of that [ATR_20] column will need to be:
ATR_20 (after row 21) = (([Previous ATR_20]*19)+[TR])/20
My dataset:
Date Open High Low Close TotalVolume Prev_Close TR_A TR_B TR_C TR ATR
2017-02-01 5961 5961 5425 5498 22689 NULL 536 NULL NULL NULL NULL
2017-02-02 5697 5868 5615 5734 22210 5498 253 370 117 370 NULL
2017-02-03 5742 5811 5560 5725 15852 5734 251 77 174 251 NULL
2017-02-06 5675 5679 5545 5554 9777 5725 134 46 180 180 NULL
2017-02-07 5597 5613 5426 5481 12692 5554 187 59 128 187 NULL
2017-02-08 5459 5630 5450 5625 9134 5481 180 149 31 180 NULL
2017-02-09 5615 5738 5532 5668 10630 5625 206 113 93 206 NULL
2017-02-10 5651 5661 5488 5602 9709 5668 173 7 180 180 NULL
2017-02-13 5700 6195 5639 6161 26031 5602 556 593 37 593 NULL
2017-02-14 6197 6594 6073 6571 35969 6161 521 433 88 521 NULL
2017-02-15 6510 6650 6275 6492 22046 6571 375 79 296 375 NULL
2017-02-16 6505 6680 6325 6419 12515 6492 355 188 167 355 NULL
2017-02-17 6434 6670 6429 6658 14947 6419 241 251 10 251 NULL
2017-02-21 6800 6957 6603 6654 23838 6658 354 299 55 354 NULL
2017-02-22 6704 6738 6145 6222 25004 6654 593 84 509 593 NULL
2017-02-23 6398 6437 5901 6343 46677 6222 536 215 321 536 NULL
2017-02-24 5280 5589 5260 5404 51757 6343 329 754 1083 1083 NULL
2017-02-27 5437 5461 5260 5300 19831 5404 201 57 144 201 NULL
2017-02-28 5258 5410 5167 5195 15900 5300 243 110 133 243 NULL
2017-03-01 5251 5299 5052 5215 16958 5195 247 104 143 247 NULL
2017-03-02 5160 5231 5063 5130 17805 5215 168 16 152 168 353.7
2017-03-03 5141 5363 5088 5320 14516 5130 275 233 42 275 NULL
I got to this point by the following
WITH cte_ACIA ([RowNumber], [Date], [Open], [High], [Low], [Close],
[Prev_Close], [TotalVolume], [TR_A], [TR_B], [TR_C])
AS
(SELECT
ROW_NUMBER() OVER (ORDER BY [Date] ASC) RowNumber,
[Date],
[Open],
[High],
[Low],
[Close],
LAG([Close]) OVER(ORDER BY [Date]) AS Prev_Close,
[TotalVolume],
ROUND([High]-[Low], 5) AS TR_A,
ABS(ROUND([High]-LAG([Close]) OVER(ORDER BY [Date]), 5)) AS TR_B,
ABS(ROUND([Low]-LAG([Close]) OVER(ORDER BY [Date]), 5)) AS TR_C,
FROM NASDAQ.ACIA_TEMP)
SELECT [RowNumber], [Date], [Open], [High], [Low], [Close], [Prev_Close],
[TotalVolume], [TR_A], [TR_B], [TR_C], [TR],
CASE
WHEN RowNumber = 21 THEN AVG([TR]) OVER (ORDER BY [Date] ASC ROWS 19 PRECEDING)
END AS ATR_20
FROM
(
SELECT [RowNumber],[Date],[Open],[High],[Low],[Close],
IIF(RowNumber = 1, NULL, Prev_Close) Prev_Close,
[TotalVolume],
[TR_A],
IIF(RowNumber > 1, [TR_B], NULL) TR_B,
IIF(RowNumber > 1, [TR_C], NULL) TR_C,
CASE
WHEN TR_A > TR_B AND TR_A > TR_C THEN TR_A
WHEN TR_B > TR_A AND TR_B > TR_C THEN TR_B
ELSE TR_C
END AS TR
FROM cte_ACIA) sub
Please let me know if you have questions or I need to clarify anything.
I suppose you are just looking for a hint. Otherwise you would have posted your table definition. We can't construct a query for you since we don't have the basic pieces. However, here's the hint! Use an aggregating window function with the OVER clause specifying ROWS PRECEDING.
See SELECT - OVER Clause

Join on Multiple columns with duplicates

I have two tables Master table and the Matched Master Table based on the same saldate , saltime , lctid ,masid the counts has to be pulled up
For count(masid,saldate,saltime,lctid)>1 , for masid 121 in master table there are 5 records in Master Matched 3 records, I need to get the output which are not in Mactched and how much in Master and the total amount with the masid
Master
ID Saldate SalTime lctid masid Sal_amt
101 1/1/2000 100 120 121 15
102 1/1/2000 100 120 121 25
103 1/1/2000 100 120 121 100
118 1/1/2001 120 118 201 25
119 1/1/2009 302 222 187 60
104 1/1/2000 100 120 121 125
108 1/1/2000 100 120 121 22
Master matched
ID Saldate SalTime lctid masid Sal_Amt
101 1/1/2000 100 120 121 15
102 1/1/2000 100 120 121 25
118 1/1/2001 120 118 201 25
119 1/1/2009 302 222 187 60
OP1-Master OP2-Master matched
Masid Count(iD) SalAMt Masid Count(iD) Sal_Amt
121 3 247 121 2 40

How to bulk update the date column

I have a table with 10000 records.
Sample
Id Transaction_Id Contract_Id Contractor_Id ServiceDetail_Id ServiceMonth UnitsDelivered CreateDate
----------------------------------------------------------------------------------------------------------------------------
1 1 352 466 590 2016-03-01 203 2016-04-25 17:01:55.000
2 1 352 466 566 2016-03-01 200 2016-04-25 17:02:38.807
3 1 352 466 138 2016-04-13 20 2016-04-13 00:00:00.000
5 1 352 466 138 2016-04-14 21 2016-04-13 00:00:00.000
6 10011 40 460 68 2016-03-17 10 2016-04-25 17:20:13.413
7 10011 40 460 511 2016-03-17 15 2016-04-25 17:20:13.413
8 10011 40 460 1611 2016-03-17 20 2016-04-25 17:20:13.413
9 20011 352 466 2563 2016-02-05 10 2016-04-25 17:20:25.307
11 100 40 460 68 2016-03-17 10 2016-04-25 17:29:23.653
In this table I have servicemonth with different dates.
I want to update the servicemonth column to the existing months last date.
suppose if I have 2016-03-17 in the table, it should be updated to 2016-3-31
suppose if I have 2016-05-12 in the table, it should be updated to 2016-5-31
Can anyone suggest a single query to update this?
EOMONTH: Returns the last day of the month that contains the specified date, with an optional offset.
UPDATE ... SET servicemonth = EOMONTH(servicemonth)
Update servicemonth
set servicemonth = DATEADD(s,-1,DATEADD(mm, DATEDIFF(m,0,GETDATE())+1,0))
--replace getdate() with the date column for which you want the end day of the month

How do I get a total from a SQL query with a caculated field

I have a table that lists visits to a clinic. I'd like to get a "histogram" of sorts showing how frequently patients visit the clinic along with totals. Here's some sample code (tested under MS SQL Server 2005) to show what I'm talking about:
CREATE TABLE #test (
visit_id int IDENTITY(1,1),
patient_id int
);
DECLARE #num_patients int;
SELECT #num_patients = 1000 + ABS(CHECKSUM(NEWID())) % 250;
INSERT INTO #test (patient_id)
SELECT TOP 15 PERCENT ABS(CHECKSUM(NEWID())) % #num_patients
FROM sysobjects a, sysobjects b;
-- SELECT COUNT(*) AS total_visits FROM #test;
-- SELECT COUNT(DISTINCT patient_id) AS distinct_patients FROM #test;
SELECT CASE GROUPING(num_pat_visits) WHEN 1 THEN 'Total'
ELSE CAST(num_pat_visits AS varchar(5)) END AS num_pat_visits,
COUNT(*) AS num_patients, num_pat_visits * COUNT(*) AS tot_pat_visit
FROM
(SELECT patient_id, COUNT(*) AS num_pat_visits FROM #test GROUP BY patient_id) a
GROUP BY num_pat_visits WITH ROLLUP
ORDER BY CAST(num_pat_visits AS int) DESC;
This gets me almost to where I want:
num_pat_visits num_patients tot_pat_visit
-------------- ------------ -------------
60 1 60
54 2 108
52 2 104
51 4 204
50 3 150
49 3 147
48 7 336
47 7 329
46 15 690
45 15 675
44 29 1276
43 36 1548
42 45 1890
41 45 1845
40 59 2360
39 71 2769
38 51 1938
37 72 2664
36 77 2772
35 74 2590
34 72 2448
33 82 2706
32 90 2880
31 74 2294
30 69 2070
29 47 1363
28 30 840
27 27 729
26 26 676
25 21 525
24 13 312
23 4 92
22 5 110
21 4 84
20 2 40
18 2 36
Total 1186 NULL
However, I can't seem to get SQL Server to display the total number of visits where it says NULL on the total row.
Any ideas?
I think you can just do:
sum(num_pat_visits) as tot_pat_visit
SELECT CASE GROUPING(num_pat_visits) WHEN 1 THEN 'Total'
ELSE CAST(num_pat_visits AS varchar(5)) END AS num_pat_visits,
COUNT(*) AS num_patients,
--num_pat_visits * COUNT(*) AS tot_pat_visit
sum(num_pat_visits) as tot_pat_visit
FROM
(SELECT patient_id, COUNT(*) AS num_pat_visits FROM #test GROUP BY patient_id) a
GROUP BY num_pat_visits WITH ROLLUP
ORDER BY CAST(num_pat_visits AS int) DESC;

Resources