Snowflake: window function 'range' not support, how to query this? - snowflake-cloud-data-platform

I have a table of transactions that includes txn_date and cust_id.
For each customer that had a transaction in December, I want to know how many transactions that customer had in the 90 days previous to the given transaction.
This seems to be a query that I could run with a window function and a RANGE sliding window, but Snowflake doesn't support the RANGE sliding window frame.
How can I run this query in Snowflake?

How about something like this:
WITH T1 AS (
SELECT CUSTOMER_ID, TX_DATE
FROM TRANSACTIONS
WHERE TX_DATE BETWEEN '2020-12-01' AND '2020-12-31')
SELECT T2.CUSTOMER_ID, T2.TX_DATE
FROM TRANSACTIONS T2
INNER JOIN T1 ON T2.CUSTOMER_ID = T2.CUSTOMER_ID
WHERE T2.TX_DATE BETWEEN (T1.TX_DATE - 90) AND T1.TX_DATE

So much the same is NickW's answer at first.
WITH data AS (
SELECT txn_date::timestamp_ntz as txn_date, cust_id, txn_id
FROM VALUES
('2020-12-04',0, 0),
('2020-12-03',1, 1),
('2020-11-04',1, 2),
('2020-10-04',1, 3),
('2020-09-04',1, 4), -- just on 90 days
('2020-09-02',1, 5), -- too far
('2021-01-05',1, 6) -- in the future
v(txn_date , cust_id, txn_id)
), dec_txn AS (
SELECT txn_id,
cust_id,
DATEADD('day',-90, txn_date) AS win_start,
txn_date AS win_end
FROM data
WHERE date_trunc('month', txn_date) = '2020-12-01'
)
SELECT dt.*
,t.*
,datediff('days', dt.win_end, t.txn_date) as win_time
FROM dec_txn AS dt
LEFT JOIN data AS t
ON t.cust_id = dt.cust_id
AND t.txn_date between dt.win_start and win_end AND t.txn_id != dt.txn_id
;
which gives:
TXN_ID CUST_ID WIN_START WIN_END TXN_DATE CUST_ID TXN_ID WIN_TIME
1 1 2020-09-04 00:00:00.000 2020-12-03 00:00:00.000 2020-11-04 00:00:00.000 1 2 -29
1 1 2020-09-04 00:00:00.000 2020-12-03 00:00:00.000 2020-10-04 00:00:00.000 1 3 -60
1 1 2020-09-04 00:00:00.000 2020-12-03 00:00:00.000 2020-09-04 00:00:00.000 1 4 -90
0 0 2020-09-05 00:00:00.000 2020-12-04 00:00:00.000 NULL NULL NULL NULL
thus to counts we:
WITH data AS (
SELECT txn_date::timestamp_ntz as txn_date, cust_id, txn_id
FROM VALUES
('2020-12-04',0, 0),
('2020-12-03',1, 1),
('2020-11-04',1, 2),
('2020-10-04',1, 3),
('2020-09-04',1, 4), -- just on 90 days
('2020-09-02',1, 5), -- too far
('2021-01-05',1, 6) -- in the future
v(txn_date , cust_id, txn_id)
), dec_txn AS (
SELECT txn_id,
cust_id,
txn_date,
DATEADD('day',-90, txn_date) AS win_start,
txn_date AS win_end
FROM data
WHERE date_trunc('month', txn_date) = '2020-12-01'
)
SELECT dt.cust_id
,dt.txn_id
,dt.txn_date
,count(t.txn_id) as c__prior_90_days_transaction
FROM dec_txn AS dt
LEFT JOIN data AS t
ON t.cust_id = dt.cust_id
AND t.txn_date >= dt.win_start and t.txn_date < dt.win_end AND t.txn_id != dt.txn_id
GROUP BY 1,2,3
ORDER BY 1,2
;
giving:
CUST_ID TXN_ID TXN_DATE C__PRIOR_90_DAYS_TRANSACTION
0 0 2020-12-04 00:00:00.000 0
1 1 2020-12-03 00:00:00.000 3
What is not well defined in the question is what to do if there are many requests in december for one customer
What to do if there are multiple transactions in the same december day.
The above will return a row for each Dec transaction per customer, and it includes transactions that happen on the same day. But if you date/timestamp has time then it will only count transtions earlier in the same day.
But if you want prior days and the txn_date is just a date then
AND t.txn_date >= dt.win_start and t.txn_date < dt.win_end AND t.txn_id != dt.txn_id
should be used.
if txn_date is a timestamp, then dec_txn should be altered to:
dec_txn AS (
SELECT txn_id,
cust_id,
DATEADD('day',-90, txn_date::date) AS win_start,
txn_date::date AS win_end
FROM data
WHERE date_trunc('month', txn_date) = '2020-12-01'
and now that the window timestamps are truncated to days, then you will have to workout if you want midnight transaction to count on the day, or if you don't have midnight timestamps...

Related

Calculate time between startdate and enddate and subtracting days that have no worktime

My goal is to check if an email is answered within 24 hours during workdays. de definition of a workday is if there is time registered in another table. this because we sometimes work on a Saturday or a Sunday or to exclude holidays. I made a view from that table that gives a 1 if the date has worktime or a 0 if there is no worktime registered.
DateWorked
HasWorked
2021-04-01 00:00:00.000
1
2021-04-02 00:00:00.000
1
2021-04-03 00:00:00.000
1
2021-04-04 00:00:00.000
0
2021-04-05 00:00:00.000
1
So for example a few situations:
1. MailIncoming: 2021-04-01 16:30:00, MailAnswering: 2021-04-02 14:00:00
This one is easy, I don't have to subtract anything and the mail is answered within 24 hours.
2. MailIncoming: 2021-04-01 09:30:00, MailAnswering: 2021-04-03 14:00:00
This one is also easy, I don't have to subtract anything and the mail is not answered within 24 hours.
3. MailIncoming: 2021-04-03 12:30:00, MailAnswering: 2021-04-05 10:00:00
There is 1 day where no one has worked, so I need to subtract 1 whole day from the total time, and in that case the email is answered within 24 hours during workdays.
4. MailIncoming: 2021-04-04 11:00:00, MailAnswering: 2021-04-05 18:00:00
The remaining 13 hours from 04 do not count toward the '24 hours during workdays' so the email is answered within 24 during workdays.
Also, there can be multiple dates with zero after each other.
So the outcome I'm looking for is:
MailIncoming
MailAnswering
TotalTime
TotalTimeWithoutDaysNotWorked
2021-04-04 11:00:00.000
2021-04-05 18:00:00.000
31
18
How can I calculate this last column? Or am I approaching this in the wrong way?
The query needs a way to generate calculated dates between MailIncoming and MailAnswering so there can be a LEFT JOIN (or INNER JOIN) to the WorkingDay table. In this case the query uses dbo.fnTally which is known to be a fast and efficient way to generate rows.
tables
drop table if exists #WorkingDay;
go
create table #WorkingDay(
DateWorked Date,
HasNotWorked int);
drop table if exists #MailIncoming;
go
create table #MailIncoming(
MailIncoming DateTime,
MailAnswering DateTime);
insert into #WorkingDay values
('2021-04-01', 0),
('2021-04-02', 0),
('2021-04-03', 0),
('2021-04-04', 1),
('2021-04-05', 0),
('2021-04-06', 0);
insert into #MailIncoming values
('2021-04-01 16:30:00', '2021-04-02 14:00:00'),
('2021-04-01 09:30:00', '2021-04-03 14:00:00'),
('2021-04-03 12:30:00', '2021-04-05 10:00:00'),
('2021-04-04 11:00:00', '2021-04-05 18:00:00');
dbo.fnTally
CREATE FUNCTION [dbo].[fnTally]
/**********************************************************************************************************************
Jeff Moden Script on SSC: https://www.sqlservercentral.com/scripts/create-a-tally-function-fntally
**********************************************************************************************************************/
(#ZeroOrOne BIT, #MaxN BIGINT)
RETURNS TABLE WITH SCHEMABINDING AS
RETURN WITH
H2(N) AS ( SELECT 1
FROM (VALUES
(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
,(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1),(1)
)V(N)) --16^2 or 256 rows
, H4(N) AS (SELECT 1 FROM H2 a, H2 b) --16^4 or 65,536 rows
, H8(N) AS (SELECT 1 FROM H4 a, H4 b) --16^8 or 4,294,967,296 rows
SELECT N = 0 WHERE #ZeroOrOne = 0 UNION ALL
SELECT TOP(#MaxN)
N = ROW_NUMBER() OVER (ORDER BY N)
FROM H8
;
query
select mi.MailIncoming, mi.MailAnswering,
avg(datediff(hour, MailIncoming, MailAnswering)) hrs_to_ans,
sum(case when w.HasNotWorked=1 and
v.calc_dt > mi_dt.inc_dt and
v.calc_dt < mi_dt.ans_dt
then -24
when w.HasNotWorked=1
then datediff(hour, dateadd(day, 1, mi_dt.inc_dt), mi.MailIncoming)
else 0 end) hrs_to_sub
from #MailIncoming mi
cross apply (values (cast(MailIncoming as date),
cast(MailAnswering as date))) mi_dt(inc_dt, ans_dt)
cross apply dbo.fnTally(0, datediff(day, mi.MailIncoming, mi.MailAnswering)) fn
cross apply (values (dateadd(day, fn.n, mi_dt.inc_dt))) v(calc_dt)
left join #WorkingDay w on v.calc_dt=w.DateWorked
group by mi.MailIncoming, mi.MailAnswering
order by mi.MailIncoming;
MailIncoming MailAnswering hrs_to_ans hrs_to_sub
2021-04-01 09:30:00.000 2021-04-03 14:00:00.000 53 0
2021-04-01 16:30:00.000 2021-04-02 14:00:00.000 22 0
2021-04-03 12:30:00.000 2021-04-05 10:00:00.000 46 -24
2021-04-04 11:00:00.000 2021-04-05 18:00:00.000 31 -13
I suggest you to use a column HasNotWorked, so the tables are
create table WorkingDay(DateWorked Date, HasNotWorked int);
create table MailIncoming(MailIncoming DateTime, MailAnswering DateTime);
and the rows
insert into WorkingDay values('2021-04-01', 0);
insert into WorkingDay values('2021-04-02', 0);
insert into WorkingDay values('2021-04-03', 0);
insert into WorkingDay values('2021-04-04', 1);
insert into WorkingDay values('2021-04-05', 0);
insert into WorkingDay values('2021-04-06', 0);
insert into MailIncoming values('2021-04-04 11:00:00.000', '2021-04-06 18:00:00.000');
I want calculate the start date. If is in working day, we must consider the hour of the mail, else the first working day with
case when
(select HasNotWorked from WorkingDay where DateWorked = convert(date, MailIncoming)) = 1 then
(select min(DateWorked) from WorkingDay where DateWorked > MailIncoming and HasNotWorked = 0)
else MailIncoming end as startDate
and discard the day that are not working day
((select sum(HasNotWorked) from WorkingDay where DateWorked between convert(date, startDate)
and convert(date, MailAnswering)
) * 24) as numNotWorkingDay
so the query could be
select startDate, MailAnswering, MailIncoming, hour, numNotWorkingDay, hour - numNotWorkingDay hourWitoutWorkingDay
from (
select
MailAnswering, startDate, MailIncoming,
DateDiff("hh", startDate, MailAnswering) hour,
((select sum(HasNotWorked) from WorkingDay where DateWorked between convert(date, startDate)
and convert(date, MailAnswering)
) * 24) as numNotWorkingDay
from (
select *,
case when
(select HasNotWorked from WorkingDay where DateWorked = convert(date, MailIncoming)) = 1 then
(select min(DateWorked) from WorkingDay where DateWorked > MailIncoming and HasNotWorked = 0)
else MailIncoming end as startDate
from MailIncoming) as startCalc
) as calcTable;
sqlfiddle

13 Period Calendar 4-4-5 Calendar T-SQL MSSQL

I am trying to create a 13 period calendar in mssql but I am a bit stuck. I am not sure if my approach is the best way to achieve this. I have my base script which can be seen below:
Set DateFirst 1
Declare #Date1 date = '20180101' --startdate should always be start of
financial year
Declare #Date2 date = '20181231' --enddate should always be start of
financial year
SELECT * INTO #CalendarTable
FROM dbo.CalendarTable(#Date1,#Date2,0,0,0)c
DECLARE #StartDate datetime,#EndDate datetime
SELECT #StartDate=MIN(CASE WHEN [Day]='Monday' THEN [Date] ELSE NULL END),
#EndDate=MAX([Date])
FROM #CalendarTable
;With Period_CTE(PeriodNo,Start,[End])
AS
(SELECT 1,#StartDate,DATEADD(wk,4,#StartDate) -1
UNION ALL
SELECT PeriodNo+1,DATEADD(wk,4,Start),DATEADD(wk,4,[End])
FROM Period_CTE
WHERE DATEADD(wk,4,[End])< =#EndDate
OR PeriodNo+1 <=13
)
select * from Period_CTE
Which gives me this:
PeriodNo Start End
1 2018-01-01 00:00:00.000 2018-01-28 00:00:00.000
2 2018-01-29 00:00:00.000 2018-02-25 00:00:00.000
3 2018-02-26 00:00:00.000 2018-03-25 00:00:00.000
4 2018-03-26 00:00:00.000 2018-04-22 00:00:00.000
5 2018-04-23 00:00:00.000 2018-05-20 00:00:00.000
6 2018-05-21 00:00:00.000 2018-06-17 00:00:00.000
7 2018-06-18 00:00:00.000 2018-07-15 00:00:00.000
8 2018-07-16 00:00:00.000 2018-08-12 00:00:00.000
9 2018-08-13 00:00:00.000 2018-09-09 00:00:00.000
10 2018-09-10 00:00:00.000 2018-10-07 00:00:00.000
11 2018-10-08 00:00:00.000 2018-11-04 00:00:00.000
12 2018-11-05 00:00:00.000 2018-12-02 00:00:00.000
13 2018-12-03 00:00:00.000 2018-12-30 00:00:00.000
The result i am trying to get is
Even if I have to take a different approach I would not mind, as long as the result is the same as the above.
dbo.CalendarTable() is a function that returns the following results. I can share the code if desired.
I'd create a general number's table like suggested here and add a column Periode13.
The trick to get the tiling is the integer division:
DECLARE #PeriodeSize INT=28; --13 "moon-months" a 28 days
SELECT TOP 100 (ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1)/#PeriodeSize
FROM master..spt_values --just a table with many rows to show the principles
You can add this to an existing numbers table with a simple update statement.
UPDATE A fully working example (using the logic linked above)
DECLARE #RunningNumbers TABLE (Number INT NOT NULL
,CalendarDate DATE NOT NULL
,CalendarYear INT NOT NULL
,CalendarMonth INT NOT NULL
,CalendarDay INT NOT NULL
,CalendarWeek INT NOT NULL
,CalendarYearDay INT NOT NULL
,CalendarWeekDay INT NOT NULL);
DECLARE #CountEntries INT = 100000;
DECLARE #StartNumber INT = 0;
WITH E1(N) AS(SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)), --10 ^ 1
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
E8(N) AS(SELECT 1 FROM E4 a CROSS JOIN E4 b), -- 10 ^ 8 = 10,000,000 rows
CteTally AS
(
SELECT TOP(ISNULL(#CountEntries,1000000)) ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) -1 + ISNULL(#StartNumber,0) As Nmbr
FROM E8
)
INSERT INTO #RunningNumbers
SELECT CteTally.Nmbr,CalendarDate.d,CalendarExt.*
FROM CteTally
CROSS APPLY
(
SELECT DATEADD(DAY,CteTally.Nmbr,{ts'2018-01-01 00:00:00'})
) AS CalendarDate(d)
CROSS APPLY
(
SELECT YEAR(CalendarDate.d) AS CalendarYear
,MONTH(CalendarDate.d) AS CalendarMonth
,DAY(CalendarDate.d) AS CalendarDay
,DATEPART(WEEK,CalendarDate.d) AS CalendarWeek
,DATEPART(DAYOFYEAR,CalendarDate.d) AS CalendarYearDay
,DATEPART(WEEKDAY,CalendarDate.d) AS CalendarWeekDay
) AS CalendarExt;
--The mockup table from above is now filled and can be queried
WITH AddPeriode AS
(
SELECT Number/28 +1 AS PeriodNumber
,CalendarDate
,CalendarWeek
,r.CalendarDay
,r.CalendarMonth
,r.CalendarWeekDay
,r.CalendarYear
,r.CalendarYearDay
FROM #RunningNumbers AS r
)
SELECT TOP 100 p.*
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [Start]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [End]
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkStart]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkEnd]
,(ROW_NUMBER() OVER(PARTITION BY PeriodNumber ORDER BY CalendarDate)-1)/7+1 AS WeekOfPeriode
FROM AddPeriode AS p
ORDER BY CalendarDate
Try it out...
Hint: Do not use a VIEW or iTVF for this.
This is non-changing data and much better placed in a physically stored table with appropriate indexes.
Not abundantly sure external links are accepted here, but I wrote an article that pulls of a 5-4-4 'Crop Year' fiscal year with all the code. Feel free to use all the code in these articles.
SQL Server Calendar Table
SQL Server Calendar Table: Fiscal Years

Time and Date Clash - Sql Server

stuck on a project. I wrote this code in sql server which finds the duplicate date matches for a staff member, but I'm stuck when trying to expand it to narrow it down to when the time ranges overlap each other also.
So there is a table called 'Rosters' with columns 'StaffID', 'Date', 'Start', 'End'
SELECT
y.[Date],y.StaffID,y.Start,y.[End]
FROM Rosters y
INNER JOIN (SELECT
[Date],StaffID, COUNT(*) AS CountOf
FROM Rosters
GROUP BY [Date],StaffID
HAVING COUNT(*)>1)
dd ON y.[Date]=dd.[Date] and y.StaffID=dd.StaffID
It returns all duplicate dates for each staff member, I wish to add the logic-
y.Start <= dd.[End] && dd.Start <= y.[End]
Is it possible with the way I'm currently doing it? Any help would be appreciated.
#TT. Sorry, below is probably a better visual explanation -
e.g This would be the roster table
ID Date Start End
1 01/01/2000 8:00 12:00
1 01/01/2000 9:00 11:00
2 01/01/2000 10:00 14:00
2 01/01/2000 8:00 9:00
3 01/01/2000 14:00 18:00
3 02/02/2002 13:00 19:00
And I'm trying to return what is below for the example as they are the only 2 rows that clash for ID, Date, and the Time range (start - end)
ID Date Start End
1 01/01/2000 8:00 12:00
1 01/01/2000 9:00 11:00
This is the logic that you would need to filter your results to overlapping time ranges, though I think this can be handled without your intermediate step of finding the duplicates. If you simply post your source table schema with some test data and your desired output, you will get a much better answer:
declare #t table (RowID int
,ID int
,DateValue date --\
,StartTime Time -- > Avoid using reserved words for your object names.
,EndTime Time --/
);
insert into #t values
(1,1, '01/01/2000', '8:00','12:00' )
,(2,1, '01/01/2000', '9:00','11:00' )
,(3,2, '01/01/2000', '10:00','14:00')
,(4,2, '01/01/2000', '8:00','9:00' )
,(5,3, '01/01/2000', '14:00','18:00')
,(6,3, '02/02/2002', '13:00','19:00');
select t1.*
from #t t1
inner join #t t2
on(t1.RowID <> t2.RowID -- If you don't have a unique ID for your rows, you will need to specify all columns so as no to match on the same row.
and t1.ID = t2.ID
and t1.DateValue = t2.DateValue
and t1.StartTime <= t2.EndTime
and t1.EndTime >= t2.StartTime
)
order by t1.RowID
Try this
with cte as
(
SELECT ROW_NUMBER() over (order by StaffID,Date,Start,End) as rno
,StaffID, Date, Start, End
FROM Rosters
)
select distinct t1.*
from cte t1
inner join cte t2
on(t1.rno <> t2.rno
and t1.StaffID = t2.StaffID
and t1.Date = t2.Date
and t1.Start <= t2.End
and t1.End >= t2.Start
)
order by t1.rno
Made some changes in #iamdave's Answer
If you use SQL Server 2012 up, you can try below script:
declare #roster table (StaffID int,[Date] date,[Start] Time,[End] Time);
insert into #roster values
(1, '01/01/2000', '9:00','11:00' )
,(1, '01/01/2000', '8:00','12:00' )
,(2, '01/01/2000', '10:00','14:00')
,(2, '01/01/2000', '8:00','9:00' )
,(3, '01/01/2000', '14:00','18:00')
,(3, '02/02/2002', '13:00','19:00');
SELECT t.StaffID,t.Date,t.Start,t.[End] FROM (
SELECT y.StaffID,y.Date,y.Start,y.[End]
,CASE WHEN y.[End] BETWEEN
LAG(y.Start)OVER(PARTITION BY y.StaffID,y.Date ORDER BY y.Start) AND LAG(y.[End])OVER(PARTITION BY y.StaffID,y.Date ORDER BY y.Start) THEN 1 ELSE 0 END
+CASE WHEN LEAD(y.[End])OVER(PARTITION BY y.StaffID,y.Date ORDER BY y.Start) BETWEEN y.Start AND y.[End] THEN 1 ELSE 0 END AS IsOverlap
,COUNT (0)OVER(PARTITION BY y.StaffID,y.Date) AS cnt
FROM #roster AS y
) t WHERE t.cnt>1 AND t.IsOverlap>0
StaffID Date Start End
----------- ---------- ---------------- ----------------
1 2000-01-01 08:00:00.0000000 12:00:00.0000000
1 2000-01-01 09:00:00.0000000 11:00:00.0000000

Fetching dates monthly from the table

Can you help out with a problem
I have table price table which has daily prices starting 31st Dec 2010 till todays date.The table contains daily prices
2009-12-31 00:00:00.000 1.0020945351
2010-01-01 00:00:00.000 1.0021009300
2010-01-04 00:00:00.000 1.0021910181
2010-01-05 00:00:00.000 1.0022005986
2010-01-06 00:00:00.000 1.0022428696
2010-01-07 00:00:00.000 1.0022647147
2010-01-08 00:00:00.000 1.0022842726
2010-01-11 00:00:00.000 1.0023374302
2010-01-12 00:00:00.000 1.0023465374
2010-01-13 00:00:00.000 1.0023638081
2010-01-14 00:00:00.000 1.0023856533
2010-01-00 00:00:00.000 1.0024083955
2010-01-18 00:00:00.000 1.0024779677
2010-01-19 00:00:00.000 1.0025020553
2010-01-20 00:00:00.000 1.002521135
2010-01-21 00:00:00.000 1.0025420688
2010-01-22 00:00:00.000 1.0025593397
2010-01-25 00:00:00.000 1.0026180146
2010-01-26 00:00:00.000 1.002637573
2010-01-27 00:00:00.000 1.0026648447
2010-01-28 00:00:00.000 1.0026957934
2010-01-29 00:00:00.000 1.0027267421
2010-02-01 00:00:00.000 1.0028195885
2010-02-02 00:00:00.000 1.0028573523
2010-02-03 00:00:00.000 1.0028964611
2010-02-04 00:00:00.000 1.00293557
2010-02-05 00:00:00.000 1.002973334
2010-02-08 00:00:00.000 1.0030879717
2010-02-09 00:00:00.000 1.0031279777
2010-02-10 00:00:00.000 1.003171166
2010-02-11 00:00:00.000 1.0032007452
2010-02-12 00:00:00.000 1.0032575895
2010-02-00 00:00:00.000 1.0033749191
2010-02-1 00:00:00.000 1.0034140292
2010-02-17 00:00:00.000 1.003452691
2010-02-18 00:00:00.000 1.0034918013
2010-02-19 00:00:00.000 1.0035395633
2010-02-22 00:00:00.000 1.0036664439
2010-02-23 00:00:00.000 1.0037042097
2010-02-24 00:00:00.000 1.0037510759
2010-02-25 00:00:00.000 1.0038001834
2010-02-26 00:00:00.000 1.003850077
I need to write a query to get index based on
(Last day of current month/Previous month last day) - 1 * 100.So that output comes something like this
31-Jan-10 0.01%
28-Feb-10 0.02%
31-Mar-10 0.00%
Following is one of the solution I thought about however please share best ideas to implement this problem
Extract last day of all the months with values into a temp table and then order by dates so that they subtract and put the values into another temp table
Looking forward to your help.
Try this....
DECLARE #StartDate DATETIME = '2010-01-01',
#EndDate DATETIME = GETDATE();
WITH data AS (
SELECT 1 AS i, CONVERT(DATETIME, NULL) AS StartDate, DATEADD(MONTH, 0, #StartDate) - 1 AS EndDate
UNION ALL
SELECT i + 1, data.EndDate, DATEADD(MONTH, i, #StartDate) - 1 AS EndDate
FROM data
WHERE DATEADD(MONTH, i, #StartDate) - 1 < #EndDate
)
SELECT (
((SELECT TOP 1 Rate FROM RateTable WHERE Date <= data.EndDate ORDER BY Date DESC) /
(SELECT TOP 1 Rate FROM RateTable WHERE Date <= data.StartDate ORDER BY Date DESC)- 1) * 100)
FROM DATA -- parenthesis were causing issues
WHERE data.StartDate IS NOT NULL
OPTION (MAXRECURSION 10000);
You'll need to replace the
(SELECT Rate FROM RateTable WHERE Date = data.StartDate)
and
(SELECT Rate FROM RateTable WHERE Date = data.EndDate)
With the values for your rate table. as you didn't mention column and table names in your question.
rwking indicated that there might be gaps in the rates table that would cause issues.
I've modified the subquery to bring back the first rate on or nearest the start and end dates.
Hope that helps
You can use the LAG function introduced in SQL2012 to make it a bit easier:
WITH DataWithOrder AS
(
SELECT DateField, PriceField,
ROW_NUMBER() OVER(PARTITION BY YEAR(DateField), Month(DateField) ORDER BY DateField DESC) AS Pos
FROM PriceTable
)
SELECT
DateField,
PriceField,
LAG(PriceField) OVER(ORDER BY DateField) AS PriceLastMonth,
((PriceField / LAG(PriceField) OVER(ORDER BY DateField)) - 1) * 100 AS PCIncrease
FROM DataWithOrder
WHERE Pos = 1
ORDER BY DateField
I took a very different approach than the other guy. His is more elegant and would work better if the daily data does represent every single day of every month. If there are gaps in days, however, as your sample data represents, you can try the following code.
with cte as (select mydate
, price
, ROW_NUMBER() over(partition by YEAR(mydate), MONTH(mydate)
order by day(mydate) desc) row_n
from #temp)
select mydate, price, ROW_NUMBER() over(order by mydate desc) row_num
into #temp2
from cte
where row_n = 1
alter table #temp2
add idx float
declare #counter int = 1
while #counter < (select MAX(row_num)+1 from #temp2)
begin
update t2
set t2.idx = ((t2.price/t3.price)-1)*100
from #temp2 t2 left join
#temp2 t3 on 1 = 1
where t2.row_num = #counter and t3.row_num = #counter + 1
set #counter = #counter + 1
end
select mydate, idx
from #temp2
As the other poster mentioned, you didn't provide column or table names. My process was to insert your data into a table called [#temp] with column names [mydate] and [price].
Also, the data sample you provided contains two invalid dates that I changed to arbitrary dates just for the purposes of getting code to run. (2010-01-00 and 2010-02-00)

Selecting rows with the nearest date using SQL

I have a SQL statement.
SELECT
ID, LOCATION, CODE,MAX(DATE),FLAG
FROM
TABLE1
WHERE
DATE <= CONVERT(DATETIME,'11-11-2012')
AND EXISTS (SELECT * FROM #TEMP_CODE WHERE TABLE1.CODE = #TEMP_CODE.CODE)
AND ID IN (14, 279)
GROUP BY
ID, LOCATION, CODE
I need rows with the nearest date to the 11-11-2012, but the table returns all the values. What am I doing wrong. Thanks
ID LOCATION CODE DATE FLAG
-------------------------------------------------------------------
14 CAR STREET,UDUPI 234 2012-08-08 00:00:00.000 0
14 CAR STREET,UDUPI 234 2012-08-10 00:00:00.000 1
14 CAR STREET,UDUPI 234 2012-08-14 00:00:00.000 0
279 MADHUGIRI 234 2012-08-08 00:00:00.000 1
279 MADHUGIRI 234 2012-08-11 00:00:00.000 0
I want to show only the rows with dates less than or equal to the given date. The required result is
ID LOCATION CODE DATE FLAG
-------------------------------------------------------------------
14 CAR STREET,UDUPI 234 2012-08-10 00:00:00.000 1
279 MADHUGIRI 234 2012-08-11 00:00:00.000 0
;WITH x AS
(
SELECT ID, Location, Code, Date, Flag,
rn = ROW_NUMBER() OVER
(PARTITION BY ID, Location, Code ORDER BY [Date] DESC)
FROM dbo.TABLE1 AS t1
WHERE [Date] <= '20121111'
AND ID IN (14, 279) -- sorry, missed this
AND EXISTS (SELECT 1 FROM #TEMP_CODE WHERE CODE = t1.CODE)
)
SELECT ID, Location, Code, Date, Flag
FROM x WHERE rn = 1;
This yields:
ID LOCATION CODE [Date] FLAG
--- ---------------- ---- ---------- ----
14 CAR STREET,UDUPI 234 2012-08-14 0
279 MADHUGIRI 234 2012-08-11 0
This disagrees with your required results, but I think those are wrong and I think you should check them.
Use a subquery to get the max date for each ID, and then join that to your table:
SELECT
ID, LOCATION, CODE, DATE, FLAG
FROM
TABLE1
JOIN (
SELECT ID AS SubID, MAX(DATE) AS SubDATE
FROM TABLE1
WHERE DATE < '11/11/2012'
AND EXISTS (SELECT * FROM #TEMP_CODE WHERE TABLE1.CODE = #TEMP_CODE.CODE)
AND ID IN (14, 279)
GROUP BY ID
) AS SUB ON ID = SubID AND DATE = SubDATE
add a Order BY DATE LIMIT 0,2
With the order by you will make the date order by the closest to your condition in where and with the limit will return only the top 2 values!
SET ROWCOUNT 2
SELECT
ID, LOCATION, CODE,MAX(DATE),FLAG
FROM
TABLE1
WHERE
DATE <= CONVERT(DATETIME,'11-11-2012')
AND EXISTS (SELECT * FROM #TEMP_CODE WHERE TABLE1.CODE = #TEMP_CODE.CODE)
AND ID IN (14, 279)
GROUP BY
ID, LOCATION, CODE
ORDER BY DATE

Resources