How to calculate an average when some values are NULL
fact table:
cube:
SELECT NON EMPTY {[Measures].[Score]} * [Date].[Month].allmembers ON COLUMNS
,{[Name].[name].allmembers} ON ROWS
FROM Test
problem:
When I calculate the average, NULL values are excluded. I tried COALESCEEMPTY(), but still could not get the average to come out correctly: the average for months where Score=0 is wrong. Here's the code:
WITH
MEMBER [Measures].[DateCount] AS DISTINCTCOUNT([Data].[date].[date])
MEMBER [Measures].[ScoreX] AS COALESCEEMPTY([Measures].[Score],0)
MEMBER [Measures].[DateCountX] AS COALESCEEMPTY([Measures].[DateCount],0)
MEMBER [Measures].[AvgScore] AS IIF([Measures].[DateCountX]=0,0,[Measures].[ScoreX]/[Measures].[DateCountX])
SELECT NON EMPTY {[Measures].[AvgScore]} * [Date].[Month].allmembers ON COLUMNS
,{[Name].[name].allmembers} ON ROWS
FROM Test
Please help find the solution.
Maybe something like the following:
WITH
MEMBER [Measures].[Score X] AS
IIF(
[Measures].[Data Count]=0
,0
,[Measures].[Data Count]
)
MEMBER [Measures].[Data Count X] AS
COUNT(
[name].[name].CURRENTMEMBER
*[Measures].[Score X]
)
MEMBER [Measures].[Avg Score] AS
DIVIDE(
[Measures].[Score]
,[Measures].[Data Count X]
)
...
...
As Tab mentioned, you could use the function COALESCEEMPTY for the first calculated member above:
WITH
MEMBER [Measures].[Score X] AS
COALESCEEMPTY(
[Measures].[Data Count]
,0)
MEMBER [Measures].[Data Count X] AS
COUNT(
[name].[name].CURRENTMEMBER
*[Measures].[Score X]
)
MEMBER [Measures].[Avg Score] AS
DIVIDE(
[Measures].[Score]
,[Measures].[Data Count X]
)
...
...
The final solution was this:
WITH
MEMBER Measures.[AvgScore] AS
Avg(
Descendants(
[Data].[Date].CurrentMember,
[Data].[Date].[Date]
),
coalesceempty(Measures.[Score],0)
)
SELECT NON EMPTY {[Measures].[AvgScore]} * [Date].[Month].allmembers ON COLUMNS
,{[Name].[name].allmembers} ON ROWS
FROM Test
Your fact table needs a row with a score of 0 for every name that is missing in a given month. You could generate those rows with a common table expression:
declare @facttable table(name varchar(10),date datetime,score int);
insert into @facttable(name,date,score)
values
('a1','2015/01/01',15),
('a2','2015/01/01',30),
('a3','2015/01/01',26),
('a1','2015/02/01',20),
('a3','2015/02/01',14),
('a4','2015/02/01',45),
('a5','2015/02/01',3)
;
-- Cross join every distinct name with every distinct date,
-- then left join the real facts so missing combinations become 0.
with fact_cte as(
select
tDistinctNames.DistinctName AS Name,
tDistinctDates.DistinctDate AS Date,
ISNULL(t.Score,0) AS Score
from
(select distinct name as DistinctName from @facttable) tDistinctNames
cross join
(select distinct date as DistinctDate from @facttable) tDistinctDates
left outer join @facttable t on
t.name = tDistinctNames.DistinctName AND
t.date = tDistinctDates.DistinctDate
)
select *
from fact_cte
The result would be this:
Name Date Score
a1 2015-01-01 00:00:00.000 15
a2 2015-01-01 00:00:00.000 30
a3 2015-01-01 00:00:00.000 26
a4 2015-01-01 00:00:00.000 0
a5 2015-01-01 00:00:00.000 0
a1 2015-02-01 00:00:00.000 20
a2 2015-02-01 00:00:00.000 0
a3 2015-02-01 00:00:00.000 14
a4 2015-02-01 00:00:00.000 45
a5 2015-02-01 00:00:00.000 3
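The same idea can be sketched outside SQL. The Python below is only an illustration of the cross-join-and-fill step, not part of the original answer; the function names `densify` and `average_by_date` are made up for this sketch. It shows how the per-date average changes once the missing (name, date) pairs are counted as 0.

```python
from itertools import product

# Sparse fact rows (name, date, score), copied from the sample data above.
facts = [
    ("a1", "2015-01-01", 15),
    ("a2", "2015-01-01", 30),
    ("a3", "2015-01-01", 26),
    ("a1", "2015-02-01", 20),
    ("a3", "2015-02-01", 14),
    ("a4", "2015-02-01", 45),
    ("a5", "2015-02-01", 3),
]


def densify(rows):
    """Return one row per (name, date) pair, using 0 where no fact exists."""
    scores = {(name, date): score for name, date, score in rows}
    names = sorted({name for name, _, _ in rows})
    dates = sorted({date for _, date, _ in rows})
    # Iterating dates first mirrors the ordering of the result shown above.
    return [(n, d, scores.get((n, d), 0)) for d, n in product(dates, names)]


def average_by_date(rows):
    """Average score per date over whatever rows are supplied."""
    totals = {}
    for _, date, score in rows:
        total, count = totals.get(date, (0, 0))
        totals[date] = (total + score, count + 1)
    return {date: total / count for date, (total, count) in totals.items()}


sparse_avg = average_by_date(facts)           # missing names are skipped
dense_avg = average_by_date(densify(facts))   # missing names count as 0
```

With the sparse rows, January is averaged over three names; with the densified rows it is averaged over all five, which is the behaviour the question asks for.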
I have a table that records any time a certain field changes for an item, along with the date of the change. I need to query the data to find all items where that field had a specific value at any time during a requested date range.
In other words, if the item had that value at the start, end, or anytime during the data range, it should be included.
Example data:
Item Valid Date Changed
---- ----- ------------
A Yes 2015-01-01
B No 2015-01-01
B Yes 2017-03-01
C Yes 2015-01-01
C No 2017-04-01
D No 2015-01-01
D Yes 2017-05-01
D No 2017-06-01
E Yes 2015-01-01
E No 2017-05-01
E Yes 2017-06-01
F Yes 2015-01-01
F No 2018-02-01
G Yes 2017-12-31
V No 2015-01-01
V Yes 2018-02-01
W Yes 2015-01-01
W No 2016-01-01
X No 2015-01-01
Y Yes 2018-01-01
Z Yes 2015-01-01
Z No 2017-01-01
So if I need all Items that were valid during 2017, the query would include:
A (Valid since 2015)
B (Became valid during 2017)
C (Was valid until mid-2017)
D (Was valid for a month during 2017)
E (Was valid at start and end of 2017)
F (Was valid throughout 2017)
G (Became valid during 2017)
The query would not include V, W, X, Y, or Z -- none of which were valid during 2017. (Pay special attention to G & Z, which are tricky edge cases!)
-- Sample data
create table #Temp (
ItemID char,
Valid bit,
StartDate date
);
insert into #Temp (ItemID, Valid, StartDate)
values ('A', 1, '2015-01-01'),
('B', 0, '2015-01-01'),
('B', 1, '2017-03-01'),
('C', 1, '2015-01-01'),
('C', 0, '2017-04-01'),
('D', 0, '2015-01-01'),
('D', 1, '2017-05-01'),
('D', 0, '2017-06-01'),
('E', 1, '2015-01-01'),
('E', 0, '2017-05-01'),
('E', 1, '2017-06-01'),
('F', 1, '2015-01-01'),
('F', 0, '2018-02-01'),
('G', 1, '2017-12-31'),
('V', 0, '2015-01-01'),
('V', 1, '2018-02-01'),
('W', 1, '2015-01-01'),
('W', 0, '2016-01-01'),
('X', 0, '2015-01-01'),
('Y', 1, '2018-01-01'),
('Z', 1, '2015-01-01'),
('Z', 0, '2017-01-01');
FYI, here are some other SO questions I found that ask similar questions, but not exactly the same:
SQL query: list of all IDs that were active during a given time interval, sorted by their start-time
Extract signal state during specified time frame
Query to find records that were active within a range of dates
First, you can turn the original list of timestamps:
ItemID Valid StartDate
------ ----- ----------
A 1 2015-01-01
B 0 2015-01-01
B 1 2017-03-01
C 1 2015-01-01
C 0 2017-04-01
D 0 2015-01-01
D 1 2017-05-01
D 0 2017-06-01
E 1 2015-01-01
E 0 2017-05-01
E 1 2017-06-01
F 1 2015-01-01
F 0 2018-02-01
G 1 2017-12-31
V 0 2015-01-01
V 1 2018-02-01
W 1 2015-01-01
W 0 2016-01-01
X 0 2015-01-01
Y 1 2018-01-01
Z 1 2015-01-01
Z 0 2017-01-01
into a list of ranges, where the end date is either the item's next entry's StartDate or, if the current row is the last entry, today's date:
ItemID Valid StartDate EndDate
------ ----- ---------- ----------
A 1 2015-01-01 (today)
B 0 2015-01-01 2017-03-01
B 1 2017-03-01 (today)
C 1 2015-01-01 2017-04-01
C 0 2017-04-01 (today)
D 0 2015-01-01 2017-05-01
D 1 2017-05-01 2017-06-01
D 0 2017-06-01 (today)
E 1 2015-01-01 2017-05-01
E 0 2017-05-01 2017-06-01
E 1 2017-06-01 (today)
F 1 2015-01-01 2018-02-01
F 0 2018-02-01 (today)
G 1 2017-12-31 (today)
V 0 2015-01-01 2018-02-01
V 1 2018-02-01 (today)
W 1 2015-01-01 2016-01-01
W 0 2016-01-01 (today)
X 0 2015-01-01 (today)
Y 1 2018-01-01 (today)
Z 1 2015-01-01 2017-01-01
Z 0 2017-01-01 (today)
You can use the LEAD analytic function to achieve that:
EndDate = LEAD(StartDate, 1, CAST(CURRENT_TIMESTAMP AS date))
OVER (PARTITION BY ItemID ORDER BY StartDate ASC)
Once you have a list of ranges, it is easy to match the rows by using this established method of finding intersecting ranges (the ranges in the tables intersecting with the range specified in the query parameters):
StartDate < @EndDate AND EndDate > @StartDate
Here is the complete solution:
DECLARE
@StartDate date = '2017-01-01',
@EndDate date = '2018-01-01',
@ValidValue bit = 1
;
WITH
ranges AS
(
SELECT
ItemID,
Valid,
StartDate,
-- The next entry's StartDate for the same item, or today for the last entry
EndDate = LEAD(StartDate, 1, CAST(CURRENT_TIMESTAMP AS date))
OVER (PARTITION BY ItemID ORDER BY StartDate ASC)
FROM
#Temp
)
SELECT DISTINCT
ItemID
FROM
ranges
WHERE
Valid = @ValidValue
AND StartDate < @EndDate
AND EndDate > @StartDate
;
You can play with this method in this demo at db<>fiddle.
Note: After completing my answer I realised that it ended up being very similar to Sami's. The difference is in handling the items' last entries.
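For readers who want to check the range logic without a database, here is a rough Python sketch of the same two steps: turning change records into ranges (what LEAD does) and applying the overlap test. The function names and the fixed `today` value are assumptions made for the sketch so that the result is reproducible; the SQL above uses the current date instead.

```python
from datetime import date
from itertools import groupby

# (ItemID, Valid, StartDate) rows, copied from the sample data in the question.
rows = [
    ("A", 1, date(2015, 1, 1)),
    ("B", 0, date(2015, 1, 1)), ("B", 1, date(2017, 3, 1)),
    ("C", 1, date(2015, 1, 1)), ("C", 0, date(2017, 4, 1)),
    ("D", 0, date(2015, 1, 1)), ("D", 1, date(2017, 5, 1)), ("D", 0, date(2017, 6, 1)),
    ("E", 1, date(2015, 1, 1)), ("E", 0, date(2017, 5, 1)), ("E", 1, date(2017, 6, 1)),
    ("F", 1, date(2015, 1, 1)), ("F", 0, date(2018, 2, 1)),
    ("G", 1, date(2017, 12, 31)),
    ("V", 0, date(2015, 1, 1)), ("V", 1, date(2018, 2, 1)),
    ("W", 1, date(2015, 1, 1)), ("W", 0, date(2016, 1, 1)),
    ("X", 0, date(2015, 1, 1)),
    ("Y", 1, date(2018, 1, 1)),
    ("Z", 1, date(2015, 1, 1)), ("Z", 0, date(2017, 1, 1)),
]


def to_ranges(rows, today):
    """Pair each entry with the next entry's start date for the same item."""
    ranges = []
    ordered = sorted(rows, key=lambda r: (r[0], r[2]))
    for item, group in groupby(ordered, key=lambda r: r[0]):
        entries = list(group)
        for i, (_, valid, start) in enumerate(entries):
            # The last entry of an item stays open until "today".
            end = entries[i + 1][2] if i + 1 < len(entries) else today
            ranges.append((item, valid, start, end))
    return ranges


def items_with_value(rows, value, range_start, range_end, today):
    """Items whose `value` range intersects [range_start, range_end)."""
    return sorted({
        item
        for item, valid, start, end in to_ranges(rows, today)
        if valid == value and start < range_end and end > range_start
    })


# Hypothetical fixed "today" so the example does not depend on the clock.
result = items_with_value(rows, 1, date(2017, 1, 1), date(2018, 1, 1),
                          today=date(2019, 1, 1))
```

Note that the end of the requested range is exclusive here (2018-01-01), which is why G (valid from 2017-12-31) is included and Y (valid from 2018-01-01) is not.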
Here is a solution
DECLARE @SD DATE = '2017-01-01',
@ED DATE = '2017-12-31';
WITH BSD AS
(
SELECT *,
LAST_VALUE(Valid) OVER(PARTITION BY ItemID ORDER BY StartDate) LV,
COUNT(1) OVER(PARTITION BY ItemID ORDER BY StartDate DESC) CNT
FROM #Temp
WHERE StartDate <= @SD
)
SELECT ItemID
FROM BSD
WHERE LV = 1 AND CNT = 1
UNION
SELECT ItemID
FROM #Temp
WHERE Valid = 1
AND
StartDate <= @ED
AND
StartDate >= @SD;
Live Demo
Here's the solution I came up with:
-- Date range includes all of 2017
declare
@beginSearchDate date = '2017-01-01',
@endSearchDate date = '2017-12-31';
with
-- CTE: Existing data combined with current value as of today
a as (
select ItemID, Valid, StartDate
from #Temp
union
select t1.ItemID, t1.Valid, convert(date, getdate())
from (
select ItemID, max(StartDate) as LatestStartDate
from #Temp
group by ItemID
) as t2
inner join #Temp as t1
on t1.ItemID = t2.ItemID
and t1.StartDate = t2.LatestStartDate
),
-- CTE: Current and previous values included in each record
b as (
select a1.*,
lag(a1.Valid) over ( partition by a1.ItemID order by a1.StartDate )
as PrevValid,
lag(a1.StartDate) over ( partition by a1.ItemID order by a1.StartDate )
as PrevStartDate
from a as a1
inner join a as a2
on a1.ItemID = a2.ItemID
and a1.StartDate = a2.StartDate
),
-- CTE: Values as a series of date ranges
c as (
select distinct ItemID,
StartDate as UntilDate,
PrevValid as Valid,
PrevStartDate as FromDate
from b
where PrevValid is not null
)
-- Find all records where date range overlaps
select distinct ItemID
from c
where Valid = 1
and FromDate <= @endSearchDate
and UntilDate > @beginSearchDate
order by ItemID;
Result:
ItemID
------
A
B
C
D
E
F
G
Here is my take on it. I build the first table from the items that had the valid flag set to 1 on or before the end date. This accounts for Item A or any item like it.
I then match each item to its last invalid date, if it has one, and filter by date.
declare
@beginSearchDate date = '2017-01-01',
@endSearchDate date = '2017-12-31';
WITH CTE as (
select itemid, VALID, MAX(StartDate) stDate from #temp
where valid <> 0 and StartDate <= @endSearchDate
group by itemID, VALID
)
SELECT t1.ItemID, VALID , stDate
from CTE t1
outer apply (
SELECT ItemID, MAX(StartDate) inValDate from #Temp
where Valid = 0
and StartDate <= @endSearchDate
and ItemID = t1.ItemID GROUP BY ItemID) t2
WHERE t2.inValDate IS NULL
or (t1.stDate > t2.inValDate OR t1.stDate > @beginSearchDate OR t2.inValDate > @beginSearchDate)
I am trying to create a 13 period calendar in mssql but I am a bit stuck. I am not sure if my approach is the best way to achieve this. I have my base script which can be seen below:
Set DateFirst 1
Declare @Date1 date = '20180101' --startdate should always be start of financial year
Declare @Date2 date = '20181231' --enddate should always be end of financial year
SELECT * INTO #CalendarTable
FROM dbo.CalendarTable(@Date1,@Date2,0,0,0)c
DECLARE @StartDate datetime,@EndDate datetime
SELECT @StartDate=MIN(CASE WHEN [Day]='Monday' THEN [Date] ELSE NULL END),
@EndDate=MAX([Date])
FROM #CalendarTable
;With Period_CTE(PeriodNo,Start,[End])
AS
(SELECT 1,@StartDate,DATEADD(wk,4,@StartDate) -1
UNION ALL
SELECT PeriodNo+1,DATEADD(wk,4,Start),DATEADD(wk,4,[End])
FROM Period_CTE
WHERE DATEADD(wk,4,[End])<=@EndDate
OR PeriodNo+1 <=13
)
select * from Period_CTE
Which gives me this:
PeriodNo Start End
1 2018-01-01 00:00:00.000 2018-01-28 00:00:00.000
2 2018-01-29 00:00:00.000 2018-02-25 00:00:00.000
3 2018-02-26 00:00:00.000 2018-03-25 00:00:00.000
4 2018-03-26 00:00:00.000 2018-04-22 00:00:00.000
5 2018-04-23 00:00:00.000 2018-05-20 00:00:00.000
6 2018-05-21 00:00:00.000 2018-06-17 00:00:00.000
7 2018-06-18 00:00:00.000 2018-07-15 00:00:00.000
8 2018-07-16 00:00:00.000 2018-08-12 00:00:00.000
9 2018-08-13 00:00:00.000 2018-09-09 00:00:00.000
10 2018-09-10 00:00:00.000 2018-10-07 00:00:00.000
11 2018-10-08 00:00:00.000 2018-11-04 00:00:00.000
12 2018-11-05 00:00:00.000 2018-12-02 00:00:00.000
13 2018-12-03 00:00:00.000 2018-12-30 00:00:00.000
The result I am trying to get is
Even if I have to take a different approach I would not mind, as long as the result is the same as the above.
dbo.CalendarTable() is a function that returns the following results. I can share the code if desired.
I'd create a general numbers table, as suggested here, and add a column Periode13.
The trick to get the tiling is integer division:
DECLARE @PeriodeSize INT=28; --13 "moon-months" of 28 days each
SELECT TOP 100 (ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1)/@PeriodeSize
FROM master..spt_values --just a table with many rows to show the principle
You can add this to an existing numbers table with a simple update statement.
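To make the integer-division tiling concrete, here is a small Python sketch of the same arithmetic. It is an illustration only; `period_number` and `period_bounds` are names invented for this example, and the year start of 2018-01-01 is taken from the question.

```python
from datetime import date, timedelta

PERIOD_SIZE = 28  # four weeks per period


def period_number(day, year_start):
    """1-based period: zero-based day offset, integer-divided by 28, plus 1."""
    return (day - year_start).days // PERIOD_SIZE + 1


def period_bounds(period, year_start):
    """First and last calendar date of a 1-based period."""
    start = year_start + timedelta(days=(period - 1) * PERIOD_SIZE)
    return start, start + timedelta(days=PERIOD_SIZE - 1)


year_start = date(2018, 1, 1)
first = period_bounds(1, year_start)   # 2018-01-01 to 2018-01-28
last = period_bounds(13, year_start)   # 2018-12-03 to 2018-12-30
```

Because 13 × 28 = 364, a day such as 2018-12-31 lands in a period 14 under this plain division; how to treat that leftover day is a business decision the sketch does not make.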
UPDATE: A fully working example (using the logic linked above)
DECLARE @RunningNumbers TABLE (Number INT NOT NULL
,CalendarDate DATE NOT NULL
,CalendarYear INT NOT NULL
,CalendarMonth INT NOT NULL
,CalendarDay INT NOT NULL
,CalendarWeek INT NOT NULL
,CalendarYearDay INT NOT NULL
,CalendarWeekDay INT NOT NULL);
DECLARE @CountEntries INT = 100000;
DECLARE @StartNumber INT = 0;
WITH E1(N) AS(SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)), --10 ^ 1
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
E8(N) AS(SELECT 1 FROM E4 a CROSS JOIN E4 b), -- 10 ^ 8 = 100,000,000 rows
CteTally AS
(
SELECT TOP(ISNULL(@CountEntries,1000000)) ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) -1 + ISNULL(@StartNumber,0) As Nmbr
FROM E8
)
INSERT INTO @RunningNumbers
SELECT CteTally.Nmbr,CalendarDate.d,CalendarExt.*
FROM CteTally
CROSS APPLY
(
SELECT DATEADD(DAY,CteTally.Nmbr,{ts'2018-01-01 00:00:00'})
) AS CalendarDate(d)
CROSS APPLY
(
SELECT YEAR(CalendarDate.d) AS CalendarYear
,MONTH(CalendarDate.d) AS CalendarMonth
,DAY(CalendarDate.d) AS CalendarDay
,DATEPART(WEEK,CalendarDate.d) AS CalendarWeek
,DATEPART(DAYOFYEAR,CalendarDate.d) AS CalendarYearDay
,DATEPART(WEEKDAY,CalendarDate.d) AS CalendarWeekDay
) AS CalendarExt;
--The mockup table from above is now filled and can be queried
WITH AddPeriode AS
(
SELECT Number/28 +1 AS PeriodNumber
,CalendarDate
,CalendarWeek
,r.CalendarDay
,r.CalendarMonth
,r.CalendarWeekDay
,r.CalendarYear
,r.CalendarYearDay
FROM @RunningNumbers AS r
)
)
SELECT TOP 100 p.*
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [Start]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [End]
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkStart]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkEnd]
,(ROW_NUMBER() OVER(PARTITION BY PeriodNumber ORDER BY CalendarDate)-1)/7+1 AS WeekOfPeriode
FROM AddPeriode AS p
ORDER BY CalendarDate
Try it out...
Hint: Do not use a VIEW or iTVF for this.
This is non-changing data and much better placed in a physically stored table with appropriate indexes.
I'm not sure external links are accepted here, but I wrote an article that pulls off a 5-4-4 'Crop Year' fiscal year, with all the code included. Feel free to use the code in these articles.
SQL Server Calendar Table
SQL Server Calendar Table: Fiscal Years
What I need to do is get a Cost breakout for each grouping, aggregated by day. Also, only taking the top N per the whole date range. I'm probably not explaining this well so let me give examples. Say my table schema and data looks like this:
SoldDate Product State Cost
----------------------- --------------------- --------- ------
2017-07-11 01:00:00.000 Apple NY 6
2017-07-11 07:00:00.000 Banana NY 1
2017-07-11 07:00:00.000 Banana NY 1
2017-07-12 01:00:00.000 Pear NY 2
2017-07-12 03:00:00.000 Olive TX 1
2017-07-12 16:00:00.000 Banana NY 1
2017-07-13 22:00:00.000 Apple NY 6
2017-07-13 22:00:00.000 Apple NY 6
2017-07-13 23:00:00.000 Banana NY 1
Call this table SoldProduce.
Now what I'm looking for is to group by Day, Product and State but for each day, only take the top two of the group NOT the top of that particular day. Anything else gets lumped under 'other'.
So in this case, our top two groups with the greatest Cost are Apple-NY and Banana-NY. So those are the two that should show up in the output only. Anything else is under 'Other'
So in the end this is the desired output:
SoldDay Product State Total Cost
----------------------- --------------------- --------- ------
2017-07-11 00:00:00.000 Apple NY 6
2017-07-11 00:00:00.000 Banana NY 2
2017-07-11 00:00:00.000 OTHER OTHER 0
2017-07-12 00:00:00.000 OTHER OTHER 3
2017-07-12 00:00:00.000 Banana NY 1
2017-07-13 00:00:00.000 Apple NY 12
2017-07-13 00:00:00.000 Banana NY 1
2017-07-13 00:00:00.000 OTHER OTHER 0
Note how on the 12th Pear and Olive were lumped under Other, even though Pear outsold Banana on that day. This is because I want the Top N selling groups for the whole range, not on a day-by-day basis.
I did a lot of googling for a way to write a query that returns this data, but I'm not sure if this is the best way:
WITH TopX AS
(
SELECT
b.Product,
b.State,
b.SoldDate,
b.Cost,
DENSE_RANK() OVER (ORDER BY GroupedCost DESC) as [Rank]
FROM
(
SELECT
b.Product,
b.State,
b.SoldDate,
b.Cost,
SUM(b.Cost) OVER (PARTITION BY b.Product, b.State) as GroupedCost
FROM
SoldProduce b WITH (NOLOCK)
) as b
)
SELECT
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
b.Product,
b.State,
SUM(b.Cost)
FROM
TopX b
WHERE
[Rank] <= 2
GROUP BY
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
b.Product,
b.State
UNION ALL
SELECT
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
null,
null,
SUM(b.Cost)
from
TopX b
WHERE
[Rank] > 2
GROUP BY
DATEADD(d,DATEDIFF(d,0,SoldDate),0)
Step 1) Create a common query that first projects the cost each row would have had if we had grouped only by Product and State. It then does a second projection to rank that cost 1-N, where 1 is the greatest grouped cost.
Step 2) Call the common query, grouping by day and restricting to rows with rank <= 2. These are the top elements. Then union the Other category to this, that is, anything ranked > 2.
What do you think? Is this an efficient solution? Could I do this better?
Edit:
FuzzyTree's suggestion benchmarks better than mine.
Final query used:
WITH TopX AS
(
SELECT
TOP(2)
b.Product,
b.State
FROM
SoldProduce b
GROUP BY
b.Product,
b.State
ORDER BY
SUM(b.Cost) DESC -- DESC so that TOP(2) keeps the highest-cost groups
)
SELECT
DATEADD(d,DATEDIFF(d,0,a.SoldDate),0) SoldDay,
coalesce(b.Product, 'Other') Product,
coalesce(b.State, 'Other') State,
SUM(a.Cost) TotalCost -- Cost comes from SoldProduce; TopX has no Cost column
FROM
SoldProduce a
LEFT JOIN TopX b ON
(a.Product = b.Product OR (a.Product IS NULL AND b.Product IS NULL)) AND
(a.State = b.State OR (a.State IS NULL AND b.State IS NULL))
GROUP BY
DATEADD(d,DATEDIFF(d,0,a.SoldDate),0),
coalesce(b.Product, 'Other'),
coalesce(b.State, 'Other')
ORDER BY DATEADD(d,DATEDIFF(d,0,a.SoldDate),0)
-- ORDER BY is optional, just for display purposes.
-- It is more efficient to order in application code for the final product.
-- Don't use I/O if you don't have to :)
I suggest using a plain group by without window functions for your TopX view:
With TopX AS
(
select top 2 Product, State
from SoldProduce
group by Product, State
order by sum(cost) desc
)
Then you can left join to your TopX view and use coalesce to determine which products fall into the Other group
select
coalesce(TopX.Product, 'Other') Product,
coalesce(TopX.State, 'Other') State,
sum(Cost),
sp.SoldDate
from SoldProduce sp
left join TopX on TopX.Product = sp.Product
and TopX.State = sp.State
group by
coalesce(TopX.Product, 'Other'),
coalesce(TopX.State, 'Other'),
SoldDate
order by SoldDate
Note: This query will not return rows with a total of 0 (for example, an Other row on a day when only top groups were sold).
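The two-step logic (rank groups over the whole range, then bucket each day's rows) can be sketched in Python to check it against the sample data. This is an illustration, not the SQL above; `top_groups` and `daily_breakout` are hypothetical helper names.

```python
from collections import defaultdict

# (SoldDate, Product, State, Cost) rows, copied from the question's sample table.
sales = [
    ("2017-07-11 01:00", "Apple", "NY", 6),
    ("2017-07-11 07:00", "Banana", "NY", 1),
    ("2017-07-11 07:00", "Banana", "NY", 1),
    ("2017-07-12 01:00", "Pear", "NY", 2),
    ("2017-07-12 03:00", "Olive", "TX", 1),
    ("2017-07-12 16:00", "Banana", "NY", 1),
    ("2017-07-13 22:00", "Apple", "NY", 6),
    ("2017-07-13 22:00", "Apple", "NY", 6),
    ("2017-07-13 23:00", "Banana", "NY", 1),
]


def top_groups(rows, n):
    """The n (product, state) groups with the highest total cost overall."""
    totals = defaultdict(int)
    for _, product, state, cost in rows:
        totals[(product, state)] += cost
    ranked = sorted(totals, key=lambda key: totals[key], reverse=True)
    return set(ranked[:n])


def daily_breakout(rows, n):
    """Total cost per (day, product, state); non-top groups become 'Other'."""
    top = top_groups(rows, n)
    out = defaultdict(int)
    for sold, product, state, cost in rows:
        day = sold[:10]  # truncate the timestamp to its date part
        key = (product, state) if (product, state) in top else ("Other", "Other")
        out[(day, *key)] += cost
    return dict(out)


breakout = daily_breakout(sales, 2)
```

As with the SQL, days on which nothing falls into Other simply have no Other entry; producing explicit zero rows would need an extra step.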
What I'm trying to do is count how many users trigger EventCode 90, grouped by the number of months since they last received a notification.
Source tables are the following:
ServiceOne
UserNr RegisteredUntil NotificationMonth
532091985 2016-05-15 00:00:00.000 5
950628185 2016-03-15 00:00:00.000 3
561007126 2016-09-15 00:00:00.000 9
Notifications
UserNr NotificationNr NotificationDate Service
532091985 134567 2013-04-16 00:00:00.000 1
532091985 153468 2014-04-15 00:00:00.000 1
950628185 235481 2014-02-17 00:00:00.000 1
561007126 354812 2012-08-15 00:00:00.000 1
EventLog
Time EventCode UserNr
2012-12-19 00:00:00.000 90 561007126
2014-05-02 00:00:00.000 90 120456873
2009-08-24 00:00:00.000 90 935187423
The table I want is something like this:
CancMonth CancAmount
0 49091
1 53564
2 14308
What I have so far is
Select Max(datediff(month, N.NotificationDate, E.Time) ) as CancMonth
,Count(datediff(month, N.NotificationDate, E.Time) ) as CancAmount
From ServiceOne P, Eventlog E, Notifications N
Where P.UserNr=E.UserNr
AND P.UserNr=N.UserNr
AND E.EventCode = 90 --EventCode 90 is both flagging for deregistration and manual deregistration
AND N.Service=1
AND P.Status In (0,4) -- 0 is not registered and 4 is flagged for deregistration
AND datediff(month, N.NotificationDate, E.Time ) < 13 --Notifications are sent once a year
AND N.NotificationDate < E.Time
Group By datediff(month, N.NotificationDate, E.Time )
Order By CancMonth
I did a count on how many total records this gave and it returns about 35 000 more than I have passive and flagged users in ServiceOne.
Help is much appreciated since this has given me a massive headache the last couple of days.
EDIT: I added my source-tables and all possibly usable columns with some random sample-data
Is this what you are looking for?
-- I assume that the latest NotificationDate has the largest NotificationNr
SELECT MAX(DATEDIFF(MONTH, N.NotificationDate, E.Time)) AS CancMonth,
COUNT(DATEDIFF(MONTH, N.NotificationDate, E.Time)) AS CancAmount
FROM ServiceOne P
JOIN Eventlog E ON P.UserNr =E.UserNr
JOIN (
SELECT N.*
FROM Notifications N
JOIN (
SELECT UserNr,
MAX(NotificationDate) NotificationDate,
MAX(NotificationNr) NotificationNr
FROM Notifications
GROUP BY UserNr -- one "latest" row per user
) LU
ON N.UserNr = LU.UserNr
AND N.NotificationDate = LU.NotificationDate
AND N.NotificationNr = LU.NotificationNr
) N ON P.UserNr = N.UserNr
WHERE E.EventCode = 90
AND N.Service=1
AND P.Status In (0,4)
AND DATEDIFF(MONTH, N.NotificationDate, E.Time ) < 13
AND N.NotificationDate < E.Time
GROUP BY DATEDIFF(MONTH, N.NotificationDate, E.Time )
ORDER BY CancMonth
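The core of the query above is "keep only each user's latest notification, then bucket events by the month difference". A minimal Python sketch of that idea follows, using the sample rows from the question. It deliberately ignores the Service and Status filters (the sample data has no Status column), and `month_diff` and `cancellations_by_month` are names invented for the sketch.

```python
from collections import Counter
from datetime import date

# (UserNr, NotificationDate) and (Time, EventCode, UserNr) from the sample tables.
notifications = [
    (532091985, date(2013, 4, 16)),
    (532091985, date(2014, 4, 15)),
    (950628185, date(2014, 2, 17)),
    (561007126, date(2012, 8, 15)),
]
events = [
    (date(2012, 12, 19), 90, 561007126),
    (date(2014, 5, 2), 90, 120456873),
    (date(2009, 8, 24), 90, 935187423),
]


def month_diff(earlier, later):
    """Count month boundaries crossed, like DATEDIFF(MONTH, earlier, later)."""
    return (later.year - earlier.year) * 12 + (later.month - earlier.month)


def cancellations_by_month(notifications, events, code=90, max_months=13):
    # Keep only the latest notification per user.
    latest = {}
    for user, sent in notifications:
        if user not in latest or sent > latest[user]:
            latest[user] = sent
    counts = Counter()
    for when, event_code, user in events:
        sent = latest.get(user)
        # Skip other event codes, users without a notification,
        # and events that are not after the notification.
        if event_code != code or sent is None or sent >= when:
            continue
        months = month_diff(sent, when)
        if months < max_months:
            counts[months] += 1
    return dict(sorted(counts.items()))


by_month = cancellations_by_month(notifications, events)
```

Joining against a single notification per user is what prevents one event from being counted once per historical notification, which is a likely source of the inflated totals described in the question.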
for SQL Server 2008 R2
I have a resultset that looks like this (note [price] is numeric, NULL below represents a
NULL value, the result set is ordered by product_id and timestamp)
product timestamp price
------- ---------------- -----
5678 2008-01-01 12:00 12.34
5678 2008-01-01 12:01 NULL
5678 2008-01-01 12:02 NULL
5678 2008-01-01 12:03 23.45
5678 2008-01-01 12:04 NULL
I want to transform that to a result set that (essentially) copies a non-null value from the latest preceding row, to produce a resultset that looks like this:
product timestamp price
------- ---------------- -----
5678 2008-01-01 12:00 12.34
5678 2008-01-01 12:01 12.34
5678 2008-01-01 12:02 12.34
5678 2008-01-01 12:03 23.45
5678 2008-01-01 12:04 23.45
I can't find any aggregate/windowing function that will allow me to do this (again, this is ONLY needed for SQL Server 2008 R2.)
I was hoping to find an analytic aggregate function that does this for me, something like...
LAST_VALUE(price) OVER (PARTITION BY product_id ORDER BY timestamp)
But I don't seem to find any way to do a "cumulative latest non-null value" in the window (to bound the window to the preceding rows, rather than the entire partition)
Aside from creating a table-valued user defined function, is there any builtin that would accomplish this?
UPDATE:
Apparently, this functionality is available in the 'Denali' CTP, but not in SQL Server 2008 R2.
LAST_VALUE http://msdn.microsoft.com/en-us/library/hh231517%28v=SQL.110%29.aspx
I just expected it to be available in SQL Server 2008. It's available in Oracle (since 10gR2 at least), and I can do something similar in MySQL 5.1, using a local variable.
http://download.oracle.com/docs/cd/E14072_01/server.112/e10592/functions083.htm
You can try the following:
Updated:
-- Test Data
DECLARE @YourTable TABLE(Product INT, Timestamp DATETIME, Price NUMERIC(16,4))
INSERT INTO @YourTable
SELECT 5678, '20080101 12:00:00', 12.34
UNION ALL
SELECT 5678, '20080101 12:01:00', NULL
UNION ALL
SELECT 5678, '20080101 12:02:00', NULL
UNION ALL
SELECT 5678, '20080101 12:03:00', 23.45
UNION ALL
SELECT 5678, '20080101 12:04:00', NULL
;WITH CTE AS
(
SELECT *
FROM @YourTable
)
-- Query
-- For each row, look up the latest earlier row of the same product with a non-NULL price
SELECT A.Product, A.Timestamp, ISNULL(A.Price,B.Price) Price
FROM CTE A
OUTER APPLY ( SELECT TOP 1 *
FROM CTE
WHERE Product = A.Product AND Timestamp < A.Timestamp
AND Price IS NOT NULL
ORDER BY Product, Timestamp DESC) B
--Results
Product Timestamp Price
5678 2008-01-01 12:00:00.000 12.3400
5678 2008-01-01 12:01:00.000 12.3400
5678 2008-01-01 12:02:00.000 12.3400
5678 2008-01-01 12:03:00.000 23.4500
5678 2008-01-01 12:04:00.000 23.4500
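The effect of the OUTER APPLY lookup is a "carry the last non-NULL value forward" pass. A short Python sketch of that behaviour, using the sample rows, is shown here for comparison; `fill_forward` is a hypothetical name, and the sketch walks the rows once instead of doing a lookup per row.

```python
# (product, timestamp, price) rows from the question; None stands in for NULL.
rows = [
    (5678, "2008-01-01 12:00", 12.34),
    (5678, "2008-01-01 12:01", None),
    (5678, "2008-01-01 12:02", None),
    (5678, "2008-01-01 12:03", 23.45),
    (5678, "2008-01-01 12:04", None),
]


def fill_forward(rows):
    """Replace None prices with the latest earlier non-None price per product."""
    last_seen = {}  # product -> most recent non-None price
    filled = []
    for product, ts, price in sorted(rows, key=lambda r: (r[0], r[1])):
        if price is None:
            # Stays None when the product has no earlier price at all.
            price = last_seen.get(product)
        else:
            last_seen[product] = price
        filled.append((product, ts, price))
    return filled


filled = fill_forward(rows)
```

Tracking the last value per product keeps one product's price from leaking into another product's rows, which corresponds to the `Product = A.Product` condition in the query.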
I have a table containing the following data. I want to replace every NULL in the salary column with the most recent previous non-NULL value.
Table:
id name salary
1 A 4000
2 B
3 C
4 C
5 D 2000
6 E
7 E
8 F 1000
9 G 2000
10 G 3000
11 G 5000
12 G
Here is the query that works for me:
select a.*,
first_value(a.salary) over (partition by a.value order by a.id) as abc
from (
select *,
sum(case when salary is null then 0 else 1 end) over (order by id) as value
from test
) a
output:
id name salary Value abc
1 A 4000 1 4000
2 B 1 4000
3 C 1 4000
4 C 1 4000
5 D 2000 2 2000
6 E 2 2000
7 E 2 2000
8 F 1000 3 1000
9 G 2000 4 2000
10 G 3000 5 3000
11 G 5000 6 5000
12 G 6 5000
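The trick in that query is that a running count of non-NULL salaries increases only on rows that have a value, so each value and the NULLs that follow it share one group number. The Python below reproduces the Value and abc columns for the sample data as a rough check; it is a sketch of the idea, not a translation of the SQL.

```python
from itertools import accumulate

# Salaries in id order (ids 1 to 12); None stands in for NULL.
salaries = [4000, None, None, None, 2000, None, None, 1000, 2000, 3000, 5000, None]

# Running count of non-null salaries: increases only at rows that have a value.
group_ids = list(accumulate(0 if s is None else 1 for s in salaries))

# The first row of each group is the one that carries the value.
first_in_group = {}
for gid, salary in zip(group_ids, salaries):
    first_in_group.setdefault(gid, salary)

filled = [first_in_group[gid] for gid in group_ids]
```

Leading NULLs (before any value appears) would fall into group 0 and stay empty, since there is no earlier value to copy.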
Try this:
;WITH SortedData AS
(
SELECT
ProductID, TimeStamp, Price,
ROW_NUMBER() OVER(PARTITION BY ProductID ORDER BY TimeStamp DESC) AS RowNum
FROM dbo.YourTable
)
UPDATE SortedData
SET Price = (SELECT TOP 1 sd2.Price
FROM SortedData sd2
WHERE sd2.ProductID = SortedData.ProductID -- stay within the same product
AND sd2.RowNum > SortedData.RowNum
AND sd2.Price IS NOT NULL
ORDER BY sd2.RowNum) -- closest earlier row first
WHERE
SortedData.Price IS NULL
Basically, the CTE creates a list sorted by timestamp (descending), with the newest first. Whenever a NULL is found, the next row in that order that contains a NOT NULL price (that is, the closest earlier timestamp for the same product) is looked up, and its value is used to update the row with the NULL price.