Moving averages with nulls in SQL Server 2012 - sql-server

So I have three columns: a location id, a year and a height.
I want to calculate a five-year rolling average. But, if there are not five years worth of data, I do not want a result.
I've been learning about using OVER. And I've seen other questions dealing with this topic, but couldn't find a solution for my issue.
Here's where I stand:
select locationID, year_num, height_num2,
avg(cast(height_num2 as float)) over (PARTITION BY locationID
ORDER BY year_num
ROWS 4 PRECEDING) as FiveYearRollingAverage
from combined;
And now I'm stumped on how best to approach this.

I think you have to just add a counter, so as to know the number of records involved in the average. If they are 5, then select the record containing the rolling average using an outer query:
SELECT locationID, year_num, FiveYearRollingAverage
FROM (
SELECT locationID, year_num,
AVG(CAST(height_num2 AS FLOAT)) OVER (PARTITION BY locationID ORDER BY year_num ROWS 4 PRECEDING) FiveYearRollingAverage,
COUNT(*) OVER (PARTITION BY locationID ORDER BY year_num ROWS 4 PRECEDING) yearsCount
FROM #combined) u
WHERE u.yearsCount = 5
With this input:
DECLARE #combined TABLE (locationID INT, year_num INT, height_num2 INT)
INSERT #combined VALUES
(1, 2009, 1),
(1, 2010, 4),
(1, 2011, 3),
(1, 2012, 2),
(1, 2013, 5),
(1, 2014, 7),
(2, 2014, 2),
(2, 2015, 1),
(2, 2016, 4),
(2, 2017, 3)
you get this output:
locationID year_num FiveYearRollingAverage
----------------------------------------------
1 2013 3
1 2014 4,2
There is no output for locationID = 2, since only 4 years are available for this ID.

Related

T-SQL: process consequtive periods and count by group

I deal with big job to find max period by FULL months of enrollment within the year (12 months period), which I did OK if we have 2 periods. Just got stuck at the end while testing if I have 3+ periods. Data below and picture hope will provide all information and easy start. Thanks to all. This is final work table I got at the end of my process, thanks all. Code below produces partially correct results. My global task find MAX period for each member, so some fields are just for easy working.
/*
DROP TABLE IF EXISTS #t;
CREATE TABLE #t ( Cust VARCHAR(10), mm INT, mm_prev INT, rn INT)
INSERT #t values
(123456, 1, NULL, 1), (123456, 2, 1, 2),
(123456, 4, 2, 3), (123456, 5, 4, 4), (123456, 6, 5, 5),
(123456, 8, 6, 6), (123456, 9, 8, 7), (123456, 10, 9, 8), (123456, 11, 10, 9), (123456, 12, 11, 10),
(777 , 1, NULL, 1),(777 , 2, 1, 2)
SELECT * from #t
*/
select
Cust, MIN(mm) mmStart, MAX(mm) mmEnd,
CASE WHEN mm = rn THEN 'Grp A' ELSE 'Grp B' END Grp
,COUNT(*) mm_count
FROM #t
WHERE 1=1
--mm - ISNULL(mm_prev,0) = 1 --check for conseq but we drop mm=6--> start of new period
-- AND mm = rn -- this brings only first group by mm
GROUP BY Cust, CASE WHEN mm = rn THEN 'Grp A' ELSE 'Grp B' END
ORDER BY 1,4
just for the case if somebody prefer to deal with initial raw data I posting it here too with some gap and islands:
CREATE TABLE #tr ( Cust varchar(10), ENR_START date, enr_END date, rn INT); -- SELECT * FROM #t
INSERT #tr VALUES
('123456' , '2018-12-01', '2019-3-1' , 1),
('123456' , '2019-3-28', '2019-6-30' , 2), -- 6 month with 2 periods, island
('123456' , '2019-7-26', '2019-8-20' , 3),
('123456' , '2019-8-15', '2019-12-31' , 4),
('777' , '2018-11-4', '2019-3-3' , 1)
select * from #tr
Screenshot is here:
looks to me, you wanted this. Not really sure what is the purpose of the case statement in your query
with cte as
(
SELECT *,
grp = mm - rn
from #t
)
SELECT Cust, MIN(mm) as mmStart, MAX(mm) as mmEnd, grp,
count(*) as mm_count
FROM cte
GROUP BY Cust, grp
order by Cust, mmStart

return each date that a value changed

I have a table with a PersonID, TeamID, and DateEffective. I'd like to return the earliest date that each person changed team. It is possible that a person can move back into a team they were previously allocated to.
Create Table PHistory
(PersonID INT,
TeamID VARCHAR(8),
DateEffective datetime)
Insert INTO PHistory
Values (1, 'TeamA', '2017-07-01'),
(1, 'TeamA', '2017-07-02'),
(1, 'TeamB', '2017-07-03'),
(1, 'TeamA', '2017-07-04'),
(2, 'TeamA', '2017-07-01'),
(2, 'TeamA', '2017-07-02'),
(2, 'TeamA', '2017-07-03'),
(2, 'TeamA', '2017-07-04')
I'd like the query to return:
1|'TeamA'|2017-07-01
1|'TeamB'|2017-07-03
1|'TeamA'|2017-07-04
2|'TeamA'|2017-07-01
I've messed around with MIN() but it doesn't cope with the fact that people can move back into old teams.
I've seen this post that uses a window function, but it only returns the latest record and I need to see all team changes (I couldn't see how to adapt it to my needs).
Here is one approach using row_number
select
PersonId, minDate = min(DateEffective)
from (
select
*, rn1 = row_number() over (partition by PersonId order by DateEffective)
, rn2 = row_number() over (partition by PersonId, TeamId order by DateEffective)
from
PHistory
) t
group by PersonId, rn1 - rn2
order by 1,2
Output
1 2017-07-01 00:00:00.000
1 2017-07-03 00:00:00.000
1 2017-07-04 00:00:00.000
2 2017-07-01 00:00:00.000

Sort by most recent but keep together by another ID column

I am trying to get some sorting and keep together (not really grouping) working.
In my sample data I would like to keep the DealerIDs together, sorted by IsPrimaryDealer DESC, but show the group (ok maybe it is grouping) of dealers by the ones with the most recent entry.
Result set 2 is the closest, but Grant and his brother should be displayed as the first two rows, in that order. (Grant should be row 1, Grants Brother row 2 because Grants Brother was the most recently added)
DECLARE #temp TABLE (
DealerPK int not null IDENTITY(1,1), DealerID int,
IsPrimaryDealer bit, DealerName varchar(50), DateAdded datetime
)
INSERT INTO #temp VALUES
(1, 1, 'Bob', GETDATE() - 7),
(2, 1, 'Robert', GETDATE() - 7),
(3, 1, 'Grant', GETDATE() - 7),
(3, 0, 'Grants Brother', GETDATE() - 1),
(2, 0, 'Roberts Nephew', GETDATE() - 2),
(1, 0, 'Bobs Cousin', GETDATE() - 3)
-- Data As Entered
SELECT * FROM #temp
-- Data Attempt at Row Numbering
SELECT *, intPosition =
ROW_NUMBER() OVER (PARTITION BY IsPrimaryDealer ORDER BY DealerID, IsPrimaryDealer DESC)
FROM #temp
ORDER BY DateAdded DESC
-- Data Attempt By DateAdded
SELECT *, intPosition =
ROW_NUMBER() OVER (PARTITION BY DealerID ORDER BY DateAdded DESC)
FROM #temp
ORDER BY intPosition, DateAdded
Expected Result
PK DID IsPr Name DateAdded
3 3 1 Grant 2015-10-08 17:14:26.497
4 3 0 Grants Brother 2015-10-14 17:14:26.497
2 2 1 Robert 2015-10-08 17:14:26.497
5 2 0 Roberts Nephew 2015-10-13 17:14:26.497
1 1 1 Bob 2015-10-08 17:14:26.497
6 1 0 Bobs Cousin 2015-10-12 17:14:26.497
As requested by OP:
;WITH Cte AS(
SELECT *,
mx = MAX(DateAdded) OVER(PARTITION BY DealerID) FROM #temp
)
SELECT *
FROM Cte
ORDER BY mx DESC, DealerID, IsPrimaryDealer DESC
Hope i understood your question,
This query results expected output :
SELECT Row_number()
OVER (
PARTITION BY DealerID
ORDER BY DealerPK)RN,
DealerPK,
DealerID,
IsPrimaryDealer,
DealerName,
DateAdded
FROM #temp
ORDER BY DealerID DESC

SQL - Filter on dates X number of days apart from the previous

I have a table containing orders. I would like to select those orders that are a certain number of days apart for a specific client. For example, in the table below I would like to select all of the orders for CustomerID = 10 that are at least 30 days apart from the previous instance. With the starting point to be the first occurrence (07/05/2014 in this data).
OrderID | CustomerID | OrderDate
==========================================
1 10 07/05/2014
2 10 07/15/2014
3 11 07/20/2014
4 11 08/20/2014
5 11 09/21/2014
6 10 09/23/2014
7 10 10/15/2014
8 10 10/30/2014
I would want to select OrderIDs (1,6,8) since they are 30 days apart from each other and all from CustomerID = 10. OrderIDs 2 and 7 would not be included as they are within 30 days of the previous order for that customer.
What confuses me is how to set the "checkpoint" to the last valid date. Here is a little "pseudo" SQL.
SELECT OrderID
FROM Orders
WHERE CusomerID = 10
AND OrderDate > LastValidOrderDate + 30
i came here and i saw #SveinFidjestøl already posted answer but i can't control my self after by long tried :
with the help of LAG and LEAD we can comparison between same column
and as per your Q you are looking 1,6,8. might be this is helpful
SQL SERVER 2012 and after
declare #temp table
(orderid int,
customerid int,
orderDate date
);
insert into #temp values (1, 10, '07/05/2014')
insert into #temp values (2, 10, '07/15/2014')
insert into #temp values (3, 11, '07/20/2014')
insert into #temp values (4, 11, '08/20/2014')
insert into #temp values (5, 11, '09/21/2014')
insert into #temp values (6, 10, '09/23/2014')
insert into #temp values (7, 10, '10/15/2014')
insert into #temp values (8, 10, '10/30/2014');
with cte as
(SELECT orderid,customerid,orderDate,
LAG(orderDate) OVER (ORDER BY orderid ) PreviousValue,
LEAD(orderDate) OVER (ORDER BY orderid) NextValue,
rownum = ROW_NUMBER() OVER (ORDER BY orderid)
FROM #temp
WHERE customerid = 10)
select orderid,customerid,orderDate from cte
where DATEDIFF ( day , PreviousValue , orderDate) > 30
or PreviousValue is null or NextValue is null
SQL SERVER 2005 and after
WITH CTE AS (
SELECT
rownum = ROW_NUMBER() OVER (ORDER BY p.orderid),
p.orderid,
p.customerid,
p.orderDate
FROM #temp p
where p.customerid = 10)
SELECT CTE.orderid,CTE.customerid,CTE.orderDate,
prev.orderDate PreviousValue,
nex.orderDate NextValue
FROM CTE
LEFT JOIN CTE prev ON prev.rownum = CTE.rownum - 1
LEFT JOIN CTE nex ON nex.rownum = CTE.rownum + 1
where CTE.customerid = 10
and
DATEDIFF ( day , prev.orderDate , CTE.orderDate) > 30
or prev.orderDate is null or nex.orderDate is null
GO
You can use the LAG() function, available in SQL Server 2012, together with a Common Table Expression. You calculate the days between the customer's current order and the customer's previous order and then query the Common Table Expression using the filter >= 30
with cte as
(select OrderId
,CustomerId
,datediff(d
,lag(orderdate) over (partition by CustomerId order by OrderDate)
,OrderDate) DaysSinceLastOrder
from Orders)
select OrderId, CustomerId, DaysSinceLastOrder
from cte
where DaysSinceLastOrder >= 30 or DaysSinceLastOrder is null
Results:
OrderId CustomerId DaysSinceLastOrder
1 10 NULL
6 10 70
3 11 NULL
4 11 31
5 11 32
(Note that 1970-01-01 is chosen arbitrarily, you may choose any date)
Update
A slighty more reliable way of doing it will involve a temporary table. But the original table tbl can be left unchanged. See here:
CREATE TABLE #tmp (id int); -- set-up temp table
INSERT INTO #tmp VALUES (1); -- plant "seed": first oid
WHILE (##ROWCOUNT>0)
INSERT INTO #tmp (id)
SELECT TOP 1 OrderId FROM tbl
WHERE OrderId>0 AND CustomerId=10
AND OrderDate>(SELECT max(OrderDate)+30 FROM tbl INNER JOIN #tmp ON id=OrderId)
ORDER BY OrderDate;
-- now list all found entries of tbl:
SELECT * FROM tbl WHERE EXISTS (SELECT 1 FROM #tmp WHERE id=OrderId)
#tinka shows how to use CTEs to do the trick, and the new windowed functions (for 2012 and later) are probably the best answer. There is also the option, assuming you do not have a very large data set, to use a recursive CTE.
Example:
declare #customerid int = 10;
declare #temp table
(orderid int,
customerid int,
orderDate date
);
insert into #temp values (1, 10, '07/05/2014')
insert into #temp values (2, 10, '07/15/2014')
insert into #temp values (3, 11, '07/20/2014')
insert into #temp values (4, 11, '08/20/2014')
insert into #temp values (5, 11, '09/21/2014')
insert into #temp values (6, 10, '09/23/2014')
insert into #temp values (7, 10, '10/15/2014')
insert into #temp values (8, 10, '10/30/2014');
with datefilter AS
(
SELECT row_number() OVER(PARTITION BY CustomerId ORDER BY OrderDate) as RowId,
OrderId,
CustomerId,
OrderDate,
DATEADD(day, 30, OrderDate) as FilterDate
from #temp
WHERE CustomerId = #customerid
)
, firstdate as
(
SELECT RowId, OrderId, CustomerId, OrderDate, FilterDate
FROM datefilter
WHERE rowId = 1
union all
SELECT datefilter.RowId, datefilter.OrderId, datefilter.CustomerId,
datefilter.OrderDate, datefilter.FilterDate
FROM datefilter
join firstdate
on datefilter.CustomerId = firstdate.CustomerId
and datefilter.OrderDate > firstdate.FilterDate
WHERE NOT EXISTS
(
SELECT 1 FROM datefilter betweens
WHERE betweens.CustomerId = firstdate.CustomerId
AND betweens.orderdate > firstdate.FilterDate
AND datefilter.orderdate > betweens.orderdate
)
)
SELECT * FROM firstdate

Having trouble with grouping rows - MS SQL 2008

I have a product table which has some duplicate records.
I need to get primarykeys atfer grouped them according to names and types
DECLARE #Products TABLE
(
pkProductId INT,
productName NVARCHAR(500),
productType INT
)
INSERT INTO #Products (pkProductId, productName, productType)
VALUES
(1, 'iphone', 0),
(2, 'iphone', 0),
(3, 'iphone', 1),
(4, 'iphone', 1),
(5, 'iphone', 1)
After I run like tsql
SELECT pr.pkProductId FROM #Products pr
GROUP BY pr.productName, pr.productType
HAVING COUNT(pr.productName) > 1
I Want To Get These IDs
pkProductId
---------------
2
4
5
Thank You For Your Hepls :)
You could use row_number() to get the result:
select pkProductId
from
(
select pkProductId,
productName,
productType,
row_number() over(partition by productName, productType order by pkproductId) rn
from #Products
) d
where rn >1;
See SQL Fiddle with Demo

Resources