Anyone know an efficient way to write a query to compare a current month's revenue to the average monthly revenue for the past 6 months?
Here's an example with just 2 columns, the actual month and the month's revenue
Columns:
MonthYear RevenueAmt
Jan2017 120
Dec2016 75
Nov2016 50
Oct2016 100
Sep2016 75
Aug2016 100
Jul2016 100
so....the average of the previous 6 months (Jul to Dec) is
(75 + 50 + 100 + 75 + 100 + 100) = 500
500 / 6 = 83.33
The current month (Jan2017) is 120,
so the difference becomes-
120 - 83.33 = 36.67
So, Jan2017 is 36.67 higher than the average of its past 6 months.
You can use the window functions and set the frame via ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING
This is a rolling variance, and I did make one modification... I used an actual date so we can set the proper Order By in the Over clause
Edit: I added the Prior6MthAvg column to illustrate the math
Declare #YourTable table (MonthYear Date,RevenueAmt int)
Insert Into #YourTable values
('2017-01-01',120),
('2016-12-01',75),
('2016-11-01',50),
('2016-10-01',100),
('2016-09-01',75),
('2016-08-01',100),
('2016-07-01',100)
Select A.*
,Prior6MthAvg = avg(RevenueAmt+0.0) over (Order By MonthYear ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING)
,Variance = RevenueAmt-avg(RevenueAmt+0.0) over (Order By MonthYear ROWS BETWEEN 6 PRECEDING AND 1 PRECEDING)
From #YourTable A
Order by MonthYear Desc
Returns
MonthYear RevenueAmt Prior6MthAvg Variance
2017-01-01 120 83.333333 36.666667
2016-12-01 75 85.000000 -10.000000
2016-11-01 50 93.750000 -43.750000
2016-10-01 100 91.666666 8.333334
2016-09-01 75 100.000000 -25.000000
2016-08-01 100 100.000000 0.000000
2016-07-01 100 NULL NULL
I was going to comment, but it is more of an answer....
Please note: efficiency, and planning thereof, requires a broader scope of design. That said, in general - assuming a larger amount of data - the most optimal way (in my experience) is to keep a separate table for the "running averages."
To explain, I would keep a separate table (single record, if only 1 set of data - i.e. not company based), keyed by current month, with the given average AND the 6th month total.
Once done - insert a trigger for adding a new month (again, in general, or by company, if divided) that will subtract the 6th month, and add the current.
Then, either the total, average or both are stored separately, and quickly accessed.
Once done - a simple join will bring the total/average into any given query.
Edited to add: (NOTE: my T-SQL may be rusty, but this should give you the idea)
NOTE: it assumes the base table of ClientData, with a Secondary Table SalesAvg, linked by id (of the rep) and the date (the month marker, if the date can vary, you would need to split out month/year for the key and link). Pulling it from the same table as given will basically task the server at the point of query. This method distributes the work to the point of insertion (normally more spread out) and as the average is keyed, allows for quickest retrieval using an inner join.
CREATE TRIGGER UpdateSalesAvg
ON schema.ClientData
AFTER INSERT
AS
DECLARE #ID as INT;
DECLARE #Date as DATE;
DECLARE #Count as INT;
SET #ID=new.id;
SET #Date=new.date;
SELECT #Count = ISNULL(RunCount, 0) FROM schema.SalesAvg
WHERE id = #ID AND monthdate = #Date;
IF (#Count = 0)
BEGIN
INSERT INTO schema.SalesAvg (id, monthdate, RunAvg, RunCount, LastPost)
VALUES (new.id, new.monthdate, new.value, 1, new.value);
END
ELSE IF (#Count = 6)
BEGIN
UPDATE schema.SalesAvg
SET RunAvg = ROUND((RunAvg * 6) - LastPost + new.value) / 6, 2),
LastPost = new.value
WHERE id = #ID
AND monthdate = #Date;
END
ELSE
BEGIN
UPDATE schema.SalesAvg
SET RunAvg = ROUND( (RunAvg * RunCount + new.value) / (RunCount + 1), 2 ),
RunCount = RunCount + 1,
LastPost = new.value
WHERE id = #ID
AND monthdate= #Date;
END
GO
Related
I have a Dimension table containing machines.
Each machine has a date created value.
I would like to have a Select statement that generates for each day after a certain start date the available number of machines. A machine is available after the date created on wards
As I have read only access to the database I am not able to create a physical calendar table
I hope somebody can help me solving my issue
I assume this is what you want. Based on this sample table:
USE tempdb;
GO
CREATE TABLE dbo.Machines
(
MachineID int,
CreatedDate date
);
INSERT dbo.Machines VALUES(1,'20200104'),(2,'20200202'),(3,'20200214');
Then say you wanted the number of active machines starting on January 1st:
DECLARE #StartDate date = '20200101';
;WITH x AS
(
SELECT n = 0 UNION ALL SELECT n + 1 FROM x
WHERE n < DATEDIFF(DAY, #StartDate, GETDATE())
),
days(d) AS
(
SELECT DATEADD(DAY, x.n, #StartDate) FROM x
)
SELECT days.d, MachineCount = COUNT(m.MachineID)
FROM days
LEFT OUTER JOIN dbo.Machines AS m
ON days.d >= m.CreatedDate
GROUP BY days.d
ORDER BY days.d
OPTION (MAXRECURSION 0);
Results:
d MachineCount
---------- ------------
2020-01-01 0
2020-01-02 0
2020-01-03 0
2020-01-04 1
2020-01-05 1
...
2020-01-31 1
2020-02-01 1
2020-02-02 2
2020-02-03 2
...
2020-02-12 2
2020-02-13 2
2020-02-14 3
2020-02-15 3
Clean up:
DROP TABLE dbo.Machines;
(Yes, some people hiss at recursive CTEs. You can replace it with any number of set generation techniques, some I talk about here, here, and here.)
Title sounds confusing but let me please explain:
I have a table that has two columns that provide a date range, and one column that provides a value. I need to query that table and "detail" the data such as this
Is it possible to do only using TSQL?
Additional Info
The table in question is about 2-3million records long (and growing)
Assuming the range of dates is fairly narrow, an alternative is to use a recursive CTE to create a list of all dates in the range and then join interpolate to it:
WITH LastDay AS
(
SELECT MAX(Date_To) AS MaxDate
FROM MyTable
),
Days AS
(
SELECT MIN(Date_From) AS TheDate
FROM MyTable
UNION ALL
SELECT DATEADD(d, 1, TheDate) AS TheDate
FROM Days CROSS JOIN LastDay
WHERE TheDate <= LastDay.MaxDate
)
SELECT mt.Item_ID, mt.Cost_Of_Item, d.TheDate
FROM MyTable mt
INNER JOIN Days d
ON d.TheDate BETWEEN mt.Date_From AND mt.Date_To;
I've also assumed an that date from and date to represent an inclusive range (i.e. includes both edges) - it is unusual to use inclusive BETWEEN on dates.
SqlFiddle here
Edit
The default MAXRECURSION on a recursive CTE in Sql Server is 100, which will limit the date range in the query to a span of 100 days. You can adjust this to a maximum of 32767.
Also, if you are filtering just a smaller range of dates in your large table, you can adjust the CTE to limit the number of days in the range:
WITH DateRange AS
(
SELECT CAST('2014-01-01' AS DATE) AS MinDate,
CAST('2014-02-16' AS DATE) AS MaxDate
),
Days AS
(
SELECT MinDate AS TheDate
FROM DateRange
UNION ALL
SELECT DATEADD(d, 1, TheDate) AS TheDate
FROM Days CROSS APPLY DateRange
WHERE TheDate <= DateRange.MaxDate
)
SELECT mt.Item_ID, mt.Cost_Of_Item, d.TheDate
FROM MyTable mt
INNER JOIN Days d
ON d.TheDate BETWEEN mt.Date_From AND mt.Date_To
OPTION (MAXRECURSION 0);
Update Fiddle
This can be achieved using Cursors.
I've simulated the test data provided and created another table with the name "DesiredTable" to store the data inside, and created the following cusror which achieved exactly what you are looking for:
SET NOCOUNT ON;
DECLARE #ITEM_ID int, #COST_OF_ITEM Money,
#DATE_FROM date, #DATE_TO date;
DECLARE #DateDiff INT; -- holds number of days between from & to columns
DECLARE #counter INT = 0; -- for loop counter
PRINT '-------- Begin the Date Expanding Cursor --------';
-- defining the cursor target statement
DECLARE Date_Expanding_Cursor CURSOR FOR
SELECT [ITEM_ID]
,[COST_OF_ITEM]
,[DATE_FROM]
,[DATE_TO]
FROM [dbo].[OriginalTable]
-- openning the cursor
OPEN Date_Expanding_Cursor
-- fetching next row data into the declared variables
FETCH NEXT FROM Date_Expanding_Cursor
INTO #ITEM_ID, #COST_OF_ITEM, #DATE_FROM, #DATE_TO
-- if next row is found
WHILE ##FETCH_STATUS = 0
BEGIN
-- calculate the number of days in between the date columns
SELECT #DateDiff = DATEDIFF(day,#DATE_FROM,#DATE_TO)
-- reset the counter to 0 for the next loop
set #counter = 0;
WHILE #counter <= #DateDiff
BEGIN
-- inserting rows inside the new table
insert into DesiredTable
Values (#COST_OF_ITEM, DATEADD(day,#counter,#DATE_FROM))
set #counter = #counter +1
END
-- fetching next row
FETCH NEXT FROM Date_Expanding_Cursor
INTO #ITEM_ID, #COST_OF_ITEM, #DATE_FROM, #DATE_TO
END
-- cleanup code
CLOSE Date_Expanding_Cursor;
DEALLOCATE Date_Expanding_Cursor;
The code fetches every row from your original table, then it calculates the number of days between DATE_FROM and DATE_TO columns, then using this number the script will create identical rows to be inserted inside the new table DesiredTable.
give it a try and let me know of the results.
You can generate an increment table and join it to your date From:
Query:
With inc(n) as (
Select ROW_NUMBER() over (order by (select 1)) -1 From (
Select 1 From (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x1(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x2(n)
) as x(n)
)
Select item_id, cost, DATEADD(day, n, dateFrom), n From #dates d
Inner Join inc i on n <= DATEDIFF(day, dateFrom, dateTo)
Order by item_id
Output:
item_id cost Date n
1 100 2014-01-01 00:00:00.000 0
1 100 2014-01-02 00:00:00.000 1
1 100 2014-01-03 00:00:00.000 2
2 105 2014-01-08 00:00:00.000 2
2 105 2014-01-07 00:00:00.000 1
2 105 2014-01-06 00:00:00.000 0
2 105 2014-01-09 00:00:00.000 3
3 102 2014-02-14 00:00:00.000 3
3 102 2014-02-15 00:00:00.000 4
3 102 2014-02-16 00:00:00.000 5
3 102 2014-02-11 00:00:00.000 0
3 102 2014-02-12 00:00:00.000 1
3 102 2014-02-13 00:00:00.000 2
Sample Data:
declare #dates table(item_id int, cost int, dateFrom datetime, dateTo datetime);
insert into #dates(item_id, cost, dateFrom, dateTo) values
(1, 100, '20140101', '20140103')
, (2, 105, '20140106', '20140109')
, (3, 102, '20140211', '20140216');
Yet another way is to create and maintain calendar table, containing all dates for many years (in our app we have table for 30 years or so, extending every year). Then you can just link to calendar:
select <whatever you need>, calendar.day
from <your tables> inner join calendar on calendar.day between <min date> and <max date>
This approach allows to include additional information (holidays etc) in calendar table - sometimes very helpful.
I am working on a stored procedure. The input is going to be a date and ID and the procedure is going to set a value to true if there are 4 weeks with less then 2 inputs per week.
In my data only the weeks with the stars have less than two inputs and if I pass the date 7-7-2015 is going to set the output value to true
Any help will be appreciated. Do I need to iterate through every record and set a counter if less then two inputs or there is an easier way ?
ID Date
1 7-7-2015
2 6-23-2015
3 6-12-2015
1 7-8-2015
1 7-14-2015 *
1 7-21-2015 *
1 7-27-2015
1 7-28-2015
1 7-29-2015
1 7-30-2015
1 8-3-2015 *
1 8-11-2015 *
There might be better ways, but one way that should work is to use a windowed aggregate function (count) in a derived table, and then sum up the number of rows with count 1 and return true if the sum is >= 4. Something like this:
create proc sp_test (#id int, #d date)
as
select case when sum(c) >= 4 then 'true' else 'false' end status
from (
select c = count(*) over (partition by id, datepart(week, date))
from t -- this is your table
where id = #id and date >= #d
) a
where c = 1
go
Execute it using:
exec sp_test #id = 1, #d = '2015-07-07'
I have a result set like this:
YearMonth Sales
201411 100
201412 100
201501 100
201502 100
201503 100
201504 100
201505 100
201506 100
201507 100
201508 100
Need to add another row with 4% more sales than the previous month. For example my Result should be
YearMonth Sales New Sales
201411 100 100.00
201412 100 104.00
201501 100 108.16
201502 100 112.49
201503 100 116.99
201504 100 121.67
201505 100 126.53
201506 100 131.59
201507 100 136.86
201508 100 142.33
Please help me to get the best way for it.
Got perfect answer for your requirement. It took long time to figure out. Just change the #Temp table name with your table name and verify the column names also.
DECLARE #nCurrentSale FLOAT
DECLARE #nYeatDate INT
DECLARE #nSale FLOAT
CREATE TABLE #TempNEW(YearMonth VARCHAR(10), Sales FLOAT, NewSale FLOAT)
SELECT TOP 1 #nCurrentSale = Sales FROM #Temp
ORDER BY (CAST('01/' + SUBSTRING (CAST(YearMonth AS VARCHAR), 5 , 2) + '/' + SUBSTRING (CAST(YearMonth AS
VARCHAR), 0 , 5) AS DATETIME)) ASC
DECLARE Cursor1 CURSOR FOR
SELECT YearMonth, Sales FROM #Temp
ORDER BY (CAST('01/' + SUBSTRING (CAST(YearMonth AS VARCHAR), 5 , 2) + '/' + SUBSTRING (CAST(YearMonth AS
VARCHAR), 0 , 5) AS DATETIME)) ASC
OPEN Cursor1
FETCH NEXT FROM Cursor1 INTO #nYeatDate, #nSale
WHILE ##FETCH_STATUS = 0
BEGIN
INSERT INTO #TempNEW(YearMonth, Sales, NewSale) VALUES(#nYeatDate, #nSale, CAST(#nCurrentSale AS DECIMAL(12,2)))
SET #nCurrentSale = #nCurrentSale + ((#nCurrentSale/100) * 4)
FETCH NEXT FROM Cursor1 INTO #nYeatDate, #nSale
END
CLOSE Cursor1
DEALLOCATE Cursor1
SELECT * FROM #TempNEW
Notify me with your status.
Yes it possible. But first you have to alter the table and add the extra column NewSales then try with this link
https://dba.stackexchange.com/questions/34243/update-row-based-on-match-to-previous-row
i think you can done it through this link
Also sql server support some "Computed Columns in SQL Server with Persisted Values"
using that you can specify the formula what you want, then the new column value will automatically created according to your formula
Here's two thoughts... Not super clear if I understood the use case... Also this solution will only work for SQL 2012 and up
So given the table
CREATE TABLE [dbo].[LagExample](
[YearMonth] [nvarchar](100) NOT NULL,
[Sales] [money] NOT NULL
)
First one is fairly simple and just assumes you are wanting to base the magnitude of your percentage increase on how many days came before it...
;WITH cte
as
(
SELECT YearMonth,
ROW_NUMBER() OVER (ORDER BY YearMonth) - 1 AS SalesEntry,
cast(LAG(Sales, 1,Sales) OVER (ORDER BY YearMonth) as float) as Sales
FROM LagExample
)
SELECT YearMonth,
Sales,
cast(Sales * POWER(cast(1.04 as float), SalesEntry) AS decimal(10,2)) as NewSales
FROM cte
The Second one uses a recursive CTE to calculate the value as you move along the months..
Here's a good link about recursive CTEs
http://www.codeproject.com/Articles/683011/How-to-use-recursive-CTE-calls-in-T-SQL
;with data
as
(
SELECT Lead(le.YearMonth, 1, null) OVER (ORDER BY le.YearMonth) as NextYearMonth,
cast(le.Sales as Decimal(10,4)) as Sales,
le.YearMonth
FROM LagExample le
)
,cte
as
(
SELECT *
FROM data
Where YearMonth = '201411'
UNION ALL
SELECT
data.NextYearMonth,
cast(cte.Sales * 1.04 as Decimal(10,4)) as Sales,
data.YearMonth
From cte join
data on data.YearMonth = cte.NextYearMonth
)
SELECT YearMonth, cast(Sales as Decimal(10,2))
FROM cte
order by YearMonth
I've written a query that groups the number of rows per hour, based on a given date range.
SELECT CONVERT(VARCHAR(8),TransactionTime,101) + ' ' + CONVERT(VARCHAR(2),TransactionTime,108) as TDate,
COUNT(TransactionID) AS TotalHourlyTransactions
FROM MyTransactions WITH (NOLOCK)
WHERE TransactionTime BETWEEN CAST(#StartDate AS SMALLDATETIME) AND CAST(#EndDate AS SMALLDATETIME)
AND TerminalId = #TerminalID
GROUP BY CONVERT(VARCHAR(8),TransactionTime,101) + ' ' + CONVERT(VARCHAR(2),TransactionTime,108)
ORDER BY TDate ASC
Which displays something like this:
02/11/20 07 4
02/11/20 10 1
02/11/20 12 4
02/11/20 13 1
02/11/20 14 2
02/11/20 16 3
Giving the number of transactions and the given hour of the day.
How can I display all hours of the day - from 0 to 23, and show 0 for those which have no values?
Thanks.
UPDATE
Using the tvf below works for me for one day, however I'm not sure how to make it work for a date range.
Using the temp table of 24 hours:
-- temp table to store hours of the day
DECLARE #tmp_Hours TABLE ( WhichHour SMALLINT )
DECLARE #counter SMALLINT
SET #counter = -1
WHILE #counter < 23
BEGIN
SET #counter = #counter + 1
--print
INSERT INTO #tmp_Hours
( WhichHour )
VALUES ( #counter )
END
SELECT MIN(CONVERT(VARCHAR(10),[dbo].[TerminalTransactions].[TransactionTime],101)) AS TDate, [#tmp_Hours].[WhichHour], CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108) AS TheHour,
COUNT([dbo].[TerminalTransactions].[TransactionId]) AS TotalTransactions,
ISNULL(SUM([dbo].[TerminalTransactions].[TransactionAmount]), 0) AS TransactionSum
FROM [dbo].[TerminalTransactions] RIGHT JOIN #tmp_Hours ON [#tmp_Hours].[WhichHour] = CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108)
GROUP BY [#tmp_Hours].[WhichHour], CONVERT(VARCHAR(2),[dbo].[TerminalTransactions].[TransactionTime],108), COALESCE([dbo].[TerminalTransactions].[TransactionAmount], 0)
Gives me a result of:
TDate WhichHour TheHour TotalTransactions TransactionSum
---------- --------- ------- ----------------- ---------------------
02/16/2010 0 00 4 40.00
NULL 1 NULL 0 0.00
02/14/2010 2 02 1 10.00
NULL 3 NULL 0 0.00
02/14/2010 4 04 28 280.00
02/14/2010 5 05 11 110.00
NULL 6 NULL 0 0.00
02/11/2010 7 07 4 40.00
NULL 8 NULL 0 0.00
02/24/2010 9 09 2 20.00
So how can I get this to group properly?
The other issue is that for some days there will be no transactions, and these days also need to appear.
Thanks.
You do this by building first the 23 hours table, the doing an outer join against the transactions table. I use, for same purposes, a table valued function:
create function tvfGetDay24Hours(#date datetime)
returns table
as return (
select dateadd(hour, number, cast(floor(cast(#date as float)) as datetime)) as StartHour
, dateadd(hour, number+1, cast(floor(cast(#date as float)) as datetime)) as EndHour
from master.dbo.spt_values
where number < 24 and type = 'p');
Then I can use the TVF in queries that need to get 'per-hour' basis data, even for missing intervals in the data:
select h.StartHour, t.TotalHourlyTransactions
from tvfGetDay24Hours(#StartDate) as h
outer apply (
SELECT
COUNT(TransactionID) AS TotalHourlyTransactions
FROM MyTransactions
WHERE TransactionTime BETWEEN h.StartHour and h.EndHour
AND TerminalId = #TerminalID) as t
order by h.StartHour
Updated
Example of a TVF that returns 24hours between any arbitrary dates:
create function tvfGetAnyDayHours(#dateFrom datetime, #dateTo datetime)
returns table
as return (
select dateadd(hour, number, cast(floor(cast(#dateFrom as float)) as datetime)) as StartHour
, dateadd(hour, number+1, cast(floor(cast(#dateFrom as float)) as datetime)) as EndHour
from master.dbo.spt_values
where type = 'p'
and number < datediff(hour,#dateFrom, #dateTo) + 24);
Note that since master.dbo.spt_values contains only 2048 numbers, the function will not work between dates further apart than 2048 hours.
You have just discovered the value of the NUMBERS table. You need to create a table with a single column containing the numbers 0 to 23 in it. Then you join again this table using an OUTER join to ensure you always get 24 rows returned.
So going back to using Remus' original function, I've re-used it in a recursive call and storing the results in a temp table:
DECLARE #count INT
DECLARE #NumDays INT
DECLARE #StartDate DATETIME
DECLARE #EndDate DATETIME
DECLARE #CurrentDay DATE
DECLARE #tmp_Transactions TABLE
(
StartHour DATETIME,
TotalHourlyTransactions INT
)
SET #StartDate = '2000/02/10'
SET #EndDate = '2010/02/13'
SET #count = 0
SET #NumDays = DateDiff(Day, #StartDate, #EndDate)
WHILE #count < #NumDays
BEGIN
SET #CurrentDay = DateAdd(Day, #count, #StartDate)
INSERT INTO #tmp_Transactions (StartHour, TotalHourlyTransactions)
SELECT h.StartHour ,
t.TotalHourlyTransactions
FROM tvfGetDay24Hours(#CurrentDay) AS h
OUTER APPLY ( SELECT COUNT(TransactionID) AS TotalHourlyTransactions
FROM [dbo].[TerminalTransactions]
WHERE TransactionTime BETWEEN h.StartHour AND h.EndHour
AND TerminalId = 4
) AS t
ORDER BY h.StartHour
SET #count = #Count + 1
END
SELECT *
FROM #tmp_Transactions
group by datepart('hour', thetime). to show those hours with no values you'd have to left join a table of times against the grouping (coalesce(transaction.amount, 0))
I've run into a version of this problem before. The suggestion that worked the best was to setup a table (temporary, or not) with the hours of the day, then do an outer join to that table and group by datepart('h', timeOfRecord).
I don't remember why, but probably due to lack of flexibility because of the need for the other table, I ended up using a method where I group by whatever datepart I want and order by the datetime, then loop through and fill any spaces that are skipped with a 0. This approach worked well for me because I'm not reliant on the database to do all my work for me, and it's also MUCH easier to write an automated test for it.
Step 1, Create #table or a CTE to generate a hours days table. Outer loop for days and inner loop hours 0-23. This should be 3 columns Date, Days, Hours.
Step 2, Write your main query to also have days and hours columns and alias it so you can join it. CTE's have to be above this main query and pivots should be inside CTE's for it to work naturally.
Step 3, Do a select from step 1 table and Left join this Main Query table
ON A.[DATE] = B.[DATE]
AND A.[HOUR] = B.[HOUR]
You can also create a order by if your date columns like
ORDER BY substring(CONVERT(VARCHAR(15), A.[DATE], 105),4,2)
Guidlines
This will then give you all data for hours and days and including zeros for hours with no matches to do that use isnull([col1],0) as [col1].
You can now graph facts against days and hours.