Given two tables, Event and Activity, I want to join each Event to all of the activities that happened at or after that Event started but before the next Event started. The difficult part is that Events have only a start datetime, not an end datetime.
CREATE TABLE Event (
id INT PRIMARY KEY
,start_datetime DATETIME
);
CREATE TABLE Activity (
id INT PRIMARY KEY
,log_datetime DATETIME
);
INSERT INTO Event (id,start_datetime)
VALUES (1,'2020-01-01')
,(2,'2020-01-01 04:55:15')
,(3,'2021-05-01 16:23:45');
INSERT INTO Activity (id,log_datetime)
VALUES (1,'2020-01-01 00:00:00')
,(2,'2020-01-01 01:20:15')
,(3,'2020-01-01 16:23:45')
,(4,'2020-02-01 00:00:00')
,(5,'2021-05-10 13:00:00')
,(6,'2022-02-01 18:30:00');
SELECT
Event.id AS EventID
,Event.start_datetime
,Activity.id AS ActivityID
,Activity.log_datetime
FROM
Event
LEFT JOIN Activity
ON Activity.log_datetime >= Event.start_datetime
ORDER BY
Event.start_datetime
,Activity.log_datetime;
EventID  start_datetime           ActivityID  log_datetime
-------  -----------------------  ----------  -----------------------
1        2020-01-01 00:00:00.000  1           2020-01-01 00:00:00.000
1        2020-01-01 00:00:00.000  2           2020-01-01 01:20:15.000
1        2020-01-01 00:00:00.000  3           2020-01-01 16:23:45.000
1        2020-01-01 00:00:00.000  4           2020-02-01 00:00:00.000
1        2020-01-01 00:00:00.000  5           2021-05-10 13:00:00.000
1        2020-01-01 00:00:00.000  6           2022-02-01 18:30:00.000
2        2020-01-01 04:55:15.000  3           2020-01-01 16:23:45.000
2        2020-01-01 04:55:15.000  4           2020-02-01 00:00:00.000
2        2020-01-01 04:55:15.000  5           2021-05-10 13:00:00.000
2        2020-01-01 04:55:15.000  6           2022-02-01 18:30:00.000
3        2021-05-01 16:23:45.000  5           2021-05-10 13:00:00.000
3        2021-05-01 16:23:45.000  6           2022-02-01 18:30:00.000
The results I am trying to get would be:
EventID  start_datetime           ActivityID  log_datetime
-------  -----------------------  ----------  -----------------------
1        2020-01-01 00:00:00.000  1           2020-01-01 00:00:00.000
1        2020-01-01 00:00:00.000  2           2020-01-01 01:20:15.000
2        2020-01-01 04:55:15.000  3           2020-01-01 16:23:45.000
2        2020-01-01 04:55:15.000  4           2020-02-01 00:00:00.000
3        2021-05-01 16:23:45.000  5           2021-05-10 13:00:00.000
3        2021-05-01 16:23:45.000  6           2022-02-01 18:30:00.000
I know I could do this with an additional condition on my join, if Events had an end time.
I think what I need to do first is write a query that generates Events with an end_datetime equal to the following Event's start_datetime:
EventID  start_datetime           end_datetime
-------  -----------------------  -----------------------
1        2020-01-01 00:00:00.000  2020-01-01 04:55:15.000
2        2020-01-01 04:55:15.000  2021-05-01 16:23:45.000
3        2021-05-01 16:23:45.000  NULL
Which I am able to achieve with:
SELECT
e.id
,e.start_datetime
,MIN(e2.start_datetime) AS end_datetime
FROM
Event e
LEFT JOIN Event e2
ON e2.start_datetime>e.start_datetime
GROUP BY e.id, e.start_datetime
So my solution is to turn the above into a subquery and then join the Activity table to it, using the start datetime and the newly derived end datetime:
SELECT
Events.id AS EventID
,Events.start_datetime
,Events.end_datetime
,Activity.id AS ActivityID
,Activity.log_datetime
FROM
(
SELECT
e.id
,e.start_datetime
,MIN(e2.start_datetime) AS end_datetime
FROM
Event e
LEFT JOIN Event e2
ON e2.start_datetime>e.start_datetime
GROUP BY e.id, e.start_datetime
) Events
LEFT JOIN Activity
ON Activity.log_datetime >= Events.start_datetime
AND (
Activity.log_datetime < Events.end_datetime
OR Events.end_datetime IS NULL
)
ORDER BY
Events.start_datetime
,Activity.log_datetime
Is there a better way to do this? I got the result I wanted, but the subquery doesn't feel very efficient.
I know I can and should add indexes on the start_datetime and log_datetime columns to improve performance. But is there any restructuring of the query that would also help?
Fiddle Link:
https://dbfiddle.uk/?rdbms=sqlserver_2016&fiddle=b5639752cc9cba32faee10bf3810d950
EDIT: The IDs of the Events are not always in the same order as the start_datetime. An Event will always last until the next event is started. If there is no next event, then the current event is still in progress. The reason that the IDs are not consistent with the start_datetime order is because this data is ingested and given auto incremented ids and sometimes the ingestion of Events may come out of sequential datetime order.
You can simplify the query a bit. Start with this subquery, which uses the LEAD window function to retrieve your Event table with an end_datetime included.
SELECT id, start_datetime,
LEAD (start_datetime, 1) OVER (ORDER BY start_datetime ASC) AS end_datetime
FROM Event
Will this make the query faster? Probably: LEAD() is generally more efficient than MIN() ... LEFT JOIN ... GROUP BY because it has to wrangle less data. The self-join pairs every Event with every later Event before grouping, whereas LEAD() only needs the rows in start_datetime order.
Then use it with your (correct) logic to join the Activity table. Using a Common Table Expression, the query now looks like this.
WITH StartEnd AS (
SELECT
id, start_datetime,
LEAD (start_datetime, 1) OVER (ORDER BY start_datetime ASC) end_datetime
FROM Event a
)
SELECT StartEnd.id EventId, StartEnd.start_datetime, StartEnd.end_datetime,
Activity.id ActivityId, Activity.log_datetime
FROM StartEnd
LEFT JOIN Activity
ON Activity.log_datetime >= StartEnd.start_datetime
AND ( Activity.log_datetime < StartEnd.end_datetime
OR StartEnd.end_datetime IS NULL)
ORDER BY StartEnd.start_datetime, Activity.log_datetime;
I suspect an index on Event (start_datetime) will help, and so will an index on Activity (log_datetime). You can omit id from those indexes: assuming the primary key is the clustered index (the SQL Server default), id is implicitly part of every nonclustered index on the table.
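The two indexes suggested above could be created along these lines. This is only a minimal sketch: the index names IX_Event_start_datetime and IX_Activity_log_datetime are made up for illustration, while the table and column names are the ones from the question.

```sql
-- Hypothetical index names; Event, Activity and their columns come from the question.
-- Both are plain nonclustered indexes on the columns used in the range join.
CREATE INDEX IX_Event_start_datetime
    ON Event (start_datetime);

CREATE INDEX IX_Activity_log_datetime
    ON Activity (log_datetime);
```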
But to be sure, when you run this query with your real data in SSMS, right-click in the query window, choose Include Actual Execution Plan, then run the query and examine the plan. Those actual plans sometimes suggest appropriate indexes.
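Alongside the execution plan, one way to compare the MIN() self-join version with the LEAD() version is to switch on SQL Server's I/O and timing statistics around each query. A sketch, assuming the Event and Activity tables from the question; the figures appear in the Messages tab in SSMS:

```sql
-- Report logical reads and CPU/elapsed time for the statements that follow.
SET STATISTICS IO ON;
SET STATISTICS TIME ON;

-- Query under test: the LEAD() version. Swap in the subquery version to compare.
WITH StartEnd AS (
    SELECT id, start_datetime,
           LEAD(start_datetime, 1) OVER (ORDER BY start_datetime ASC) AS end_datetime
    FROM Event
)
SELECT StartEnd.id AS EventId, Activity.id AS ActivityId
FROM StartEnd
LEFT JOIN Activity
       ON Activity.log_datetime >= StartEnd.start_datetime
      AND (Activity.log_datetime < StartEnd.end_datetime
           OR StartEnd.end_datetime IS NULL);

-- Switch the reporting off again.
SET STATISTICS IO OFF;
SET STATISTICS TIME OFF;
```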
Common Table Expressions definitely make queries easier to read. And the way the query planner works, they don't harm performance compared to using subqueries.
I am trying to create a 13 period calendar in mssql but I am a bit stuck. I am not sure if my approach is the best way to achieve this. I have my base script which can be seen below:
Set DateFirst 1
Declare @Date1 date = '20180101' --startdate should always be start of financial year
Declare @Date2 date = '20181231' --enddate should always be end of financial year
SELECT * INTO #CalendarTable
FROM dbo.CalendarTable(@Date1,@Date2,0,0,0)c
DECLARE @StartDate datetime,@EndDate datetime
SELECT @StartDate=MIN(CASE WHEN [Day]='Monday' THEN [Date] ELSE NULL END),
@EndDate=MAX([Date])
FROM #CalendarTable
;With Period_CTE(PeriodNo,Start,[End])
AS
(SELECT 1,@StartDate,DATEADD(wk,4,@StartDate) -1
UNION ALL
SELECT PeriodNo+1,DATEADD(wk,4,Start),DATEADD(wk,4,[End])
FROM Period_CTE
WHERE DATEADD(wk,4,[End])<=@EndDate
OR PeriodNo+1 <=13
)
select * from Period_CTE
Which gives me this:
PeriodNo Start End
1 2018-01-01 00:00:00.000 2018-01-28 00:00:00.000
2 2018-01-29 00:00:00.000 2018-02-25 00:00:00.000
3 2018-02-26 00:00:00.000 2018-03-25 00:00:00.000
4 2018-03-26 00:00:00.000 2018-04-22 00:00:00.000
5 2018-04-23 00:00:00.000 2018-05-20 00:00:00.000
6 2018-05-21 00:00:00.000 2018-06-17 00:00:00.000
7 2018-06-18 00:00:00.000 2018-07-15 00:00:00.000
8 2018-07-16 00:00:00.000 2018-08-12 00:00:00.000
9 2018-08-13 00:00:00.000 2018-09-09 00:00:00.000
10 2018-09-10 00:00:00.000 2018-10-07 00:00:00.000
11 2018-10-08 00:00:00.000 2018-11-04 00:00:00.000
12 2018-11-05 00:00:00.000 2018-12-02 00:00:00.000
13 2018-12-03 00:00:00.000 2018-12-30 00:00:00.000
The result I am trying to get is
Even if I have to take a different approach I would not mind, as long as the result is the same as the above.
dbo.CalendarTable() is a function that returns the following results. I can share the code if desired.
I'd create a general numbers table, as suggested here, and add a column Periode13.
The trick to get the tiling is the integer division:
DECLARE @PeriodeSize INT=28; --13 "moon-months" a 28 days
SELECT TOP 100 (ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1)/@PeriodeSize
FROM master..spt_values --just a table with many rows to show the principles
You can add this to an existing numbers table with a simple update statement.
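Such an update might look like the sketch below. The table name dbo.Numbers and its layout (a 0-based INT column Number plus an INT column Periode13) are assumptions for illustration; only the column name Periode13 and the 28-day period size come from this answer.

```sql
-- Assumption: dbo.Numbers(Number INT, Periode13 INT) exists and Number starts at 0.
-- If your numbers start at 1, use (Number - 1) / @PeriodeSize + 1 instead.
DECLARE @PeriodeSize INT = 28;

-- INT / INT is integer division in T-SQL, so numbers 0..27 land in period 1,
-- 28..55 in period 2, and so on.
UPDATE dbo.Numbers
SET Periode13 = Number / @PeriodeSize + 1;
```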
UPDATE: A fully working example (using the logic linked above)
DECLARE @RunningNumbers TABLE (Number INT NOT NULL
,CalendarDate DATE NOT NULL
,CalendarYear INT NOT NULL
,CalendarMonth INT NOT NULL
,CalendarDay INT NOT NULL
,CalendarWeek INT NOT NULL
,CalendarYearDay INT NOT NULL
,CalendarWeekDay INT NOT NULL);
DECLARE @CountEntries INT = 100000;
DECLARE @StartNumber INT = 0;
WITH E1(N) AS(SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)), --10 ^ 1
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
E8(N) AS(SELECT 1 FROM E4 a CROSS JOIN E4 b), -- 10 ^ 8 = 100,000,000 rows
CteTally AS
(
SELECT TOP(ISNULL(@CountEntries,1000000)) ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) -1 + ISNULL(@StartNumber,0) As Nmbr
FROM E8
)
INSERT INTO @RunningNumbers
SELECT CteTally.Nmbr,CalendarDate.d,CalendarExt.*
FROM CteTally
CROSS APPLY
(
SELECT DATEADD(DAY,CteTally.Nmbr,{ts'2018-01-01 00:00:00'})
) AS CalendarDate(d)
CROSS APPLY
(
SELECT YEAR(CalendarDate.d) AS CalendarYear
,MONTH(CalendarDate.d) AS CalendarMonth
,DAY(CalendarDate.d) AS CalendarDay
,DATEPART(WEEK,CalendarDate.d) AS CalendarWeek
,DATEPART(DAYOFYEAR,CalendarDate.d) AS CalendarYearDay
,DATEPART(WEEKDAY,CalendarDate.d) AS CalendarWeekDay
) AS CalendarExt;
--The mockup table from above is now filled and can be queried
WITH AddPeriode AS
(
SELECT Number/28 +1 AS PeriodNumber
,CalendarDate
,CalendarWeek
,r.CalendarDay
,r.CalendarMonth
,r.CalendarWeekDay
,r.CalendarYear
,r.CalendarYearDay
FROM @RunningNumbers AS r
)
)
SELECT TOP 100 p.*
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [Start]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [End]
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkStart]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkEnd]
,(ROW_NUMBER() OVER(PARTITION BY PeriodNumber ORDER BY CalendarDate)-1)/7+1 AS WeekOfPeriode
FROM AddPeriode AS p
ORDER BY CalendarDate
Try it out...
Hint: Do not use a VIEW or iTVF for this.
This is non-changing data and much better placed in a physically stored table with appropriate indexes.
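As a rough sketch of what such a stored table could look like: the table name dbo.Calendar13, the index name, and the choice of columns are all assumptions for illustration; only the column names PeriodNumber and WeekOfPeriode are taken from the query above.

```sql
-- Hypothetical permanent table holding one row per calendar day.
CREATE TABLE dbo.Calendar13 (
    CalendarDate  DATE NOT NULL PRIMARY KEY,  -- clustered by default
    PeriodNumber  INT  NOT NULL,              -- 28-day period the day belongs to
    WeekOfPeriode INT  NOT NULL               -- week 1 to 4 within that period
);

-- Hypothetical supporting index for lookups by period and week.
CREATE INDEX IX_Calendar13_Period
    ON dbo.Calendar13 (PeriodNumber, WeekOfPeriode);
```

The table would be filled once from a query like the one above and then only read.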
I'm not sure whether external links are accepted here, but I wrote an article that pulls off a 5-4-4 'Crop Year' fiscal year, with all the code included. Feel free to use all the code in these articles.
SQL Server Calendar Table
SQL Server Calendar Table: Fiscal Years
What I need to do is get a Cost breakout for each grouping, aggregated by day. Also, only taking the top N per the whole date range. I'm probably not explaining this well so let me give examples. Say my table schema and data looks like this:
SoldDate Product State Cost
----------------------- --------------------- --------- ------
2017-07-11 01:00:00.000 Apple NY 6
2017-07-11 07:00:00.000 Banana NY 1
2017-07-11 07:00:00.000 Banana NY 1
2017-07-12 01:00:00.000 Pear NY 2
2017-07-12 03:00:00.000 Olive TX 1
2017-07-12 16:00:00.000 Banana NY 1
2017-07-13 22:00:00.000 Apple NY 6
2017-07-13 22:00:00.000 Apple NY 6
2017-07-13 23:00:00.000 Banana NY 1
Call this table SoldProduce.
Now what I'm looking for is to group by Day, Product and State but for each day, only take the top two of the group NOT the top of that particular day. Anything else gets lumped under 'other'.
So in this case, our top two groups with the greatest Cost are Apple-NY and Banana-NY. So those are the two that should show up in the output only. Anything else is under 'Other'
So in the end this is the desired output:
SoldDay Product State Total Cost
----------------------- --------------------- --------- ------
2017-07-11 00:00:00.000 Apple NY 6
2017-07-11 00:00:00.000 Banana NY 2
2017-07-11 00:00:00.000 OTHER OTHER 0
2017-07-12 00:00:00.000 OTHER OTHER 3
2017-07-12 00:00:00.000 Banana NY 1
2017-07-13 00:00:00.000 Apple NY 12
2017-07-13 00:00:00.000 Banana NY 1
2017-07-13 00:00:00.000 OTHER OTHER 0
Note how on the 12th Pear and Olive were lumped under OTHER, even though that group outsold Banana on that day. This is because I want the top N selling groups for the whole range, not just on a day-by-day basis.
I did a lot of googling for a way to write a query that gets this data, but I'm not sure if it's the best way:
WITH TopX AS
(
SELECT
b.Product,
b.State,
b.SoldDate,
b.Cost,
DENSE_RANK() OVER (ORDER BY GroupedCost DESC) as [Rank]
FROM
(
SELECT
b.Product,
b.State,
b.SoldDate,
b.Cost,
SUM(b.Cost) OVER (PARTITION BY b.Product, b.State) as GroupedCost
FROM
SoldProduce b WITH (NOLOCK)
) as b
)
SELECT
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
b.Product,
b.State,
SUM(b.Cost)
FROM
TopX b
WHERE
[Rank] <= 2
GROUP BY
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
b.Product,
b.State
UNION ALL
SELECT
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
null,
null,
SUM(b.Cost)
from
TopX b
WHERE
[Rank] > 2
GROUP BY
DATEADD(d,DATEDIFF(d,0,SoldDate),0)
Step 1) Create a common query that first projects the cost each row's group would have had we just grouped by Product and State. Then it does a second projection to rank that cost 1-N, where 1 has the greatest grouped cost.
Step 2) Query the common table expression, grouping by day and restricting to rows with rank <= 2. These are the top elements. Then union the Other category to this, i.e. anything ranked > 2.
What do you guys think? Is this an efficient solution? Could I do this better?
Edit:
FuzzyTree's suggestion benchmarks better than mine.
Final query used:
WITH TopX AS
(
SELECT
TOP(2)
b.Product,
b.State
FROM
SoldProduce b
GROUP BY
b.Product,
b.State
ORDER BY
SUM(b.Cost) DESC
)
SELECT
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
coalesce(b.Product, 'Other') Product,
coalesce(b.State, 'Other') State,
SUM(a.Cost)
FROM
SoldProduce a
LEFT JOIN TopX b ON
(a.Product = b.Product OR (a.Product IS NULL AND b.Product IS NULL)) AND
(a.State = b.State OR (a.State IS NULL AND b.State IS NULL))
GROUP BY
DATEADD(d,DATEDIFF(d,0,SoldDate),0),
coalesce(b.Product, 'Other'),
coalesce(b.State, 'Other')
ORDER BY DATEADD(d,DATEDIFF(d,0,SoldDate),0)
-- ORDER BY is optional, just for display purposes.
-- It is more efficient to order in code for the final product.
-- Don't use I/O if you don't have to :)
I suggest using a plain GROUP BY without window functions for your TopX CTE:
With TopX AS
(
select top 2 Product, State
from SoldProduce
group by Product, State
order by sum(cost) desc
)
Then you can left join to TopX and use coalesce to determine which products fall into the Other group:
select
coalesce(TopX.Product, 'Other') Product,
coalesce(TopX.State, 'Other') State,
sum(Cost),
sp.SoldDate
from SoldProduce sp
left join TopX on TopX.Product = sp.Product
and TopX.State = sp.State
group by
coalesce(TopX.Product, 'Other'),
coalesce(TopX.State, 'Other'),
SoldDate
order by SoldDate
Note: This query will not return rows with a 0 total (for example, an Other row on a day when every sale belongs to a top group).
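If the zero rows are needed, one possible approach is to build every day/group combination first and left join the sales onto it. The sketch below assumes the SoldProduce table from the question and SQL Server 2008 or later (for the DATE type); the helper CTE names Grp, SoldDays and Labeled are made up for illustration. Unlike the query above, it also groups by day rather than by the full SoldDate value.

```sql
WITH TopX AS
(
    SELECT TOP (2) Product, State
    FROM SoldProduce
    GROUP BY Product, State
    ORDER BY SUM(Cost) DESC
),
Grp AS       -- hypothetical helper: the top groups plus the catch-all group
(
    SELECT Product, State FROM TopX
    UNION ALL
    SELECT 'Other', 'Other'
),
SoldDays AS  -- hypothetical helper: every day with at least one sale
(
    SELECT DISTINCT CAST(SoldDate AS DATE) AS SoldDay
    FROM SoldProduce
),
Labeled AS   -- hypothetical helper: each sale tagged with its group or 'Other'
(
    SELECT CAST(sp.SoldDate AS DATE)    AS SoldDay,
           COALESCE(t.Product, 'Other') AS Product,
           COALESCE(t.State, 'Other')   AS State,
           sp.Cost
    FROM SoldProduce sp
    LEFT JOIN TopX t
           ON t.Product = sp.Product
          AND t.State   = sp.State
)
SELECT d.SoldDay,
       g.Product,
       g.State,
       COALESCE(SUM(l.Cost), 0) AS TotalCost  -- 0 when the group sold nothing that day
FROM SoldDays d
CROSS JOIN Grp g
LEFT JOIN Labeled l
       ON l.SoldDay = d.SoldDay
      AND l.Product = g.Product
      AND l.State   = g.State
GROUP BY d.SoldDay, g.Product, g.State
ORDER BY d.SoldDay, g.Product;
```

Note that this returns a row for every group on every day, including a zero row for a top group that sold nothing that day. If two groups tie on SUM(Cost), TOP (2) picks between them arbitrarily.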
I have the SQL table below:
Id | Code | DateTime1 | DateTime2
1 3AA2 2017-02-01 14:23:00.000 2017-02-01 20:00:00.000
2 E323 2017-02-12 17:34:34.032 2017-02-12 18:34:34.032
3 DFG3 2017-03-08 09:20:10.032 2017-03-08 12:30:10.032
4 LKF0 2017-04-24 11:14:00.000 2017-04-24 13:40:00.000
5 DFG3 2017-04-20 13:34:42.132 2017-04-20 15:12:12.132
6 DFG3 2017-04-20 13:34:42.132 NULL
Id is an auto-increment field.
Code is a string, and DateTime1 and DateTime2 are of type datetime. DateTime1 cannot be null, but DateTime2 can be.
I would like to obtain the most recent row by DateTime1 (MAX DateTime1) for a specific code, but only if that row has DateTime2 set to NULL.
For example, given the table above, for code DFG3 I would like to obtain the row with Id=6, which has the max DateTime1 for that code, "2017-04-20 13:34:42.132".
But now imagine the following case:
Id | Code | DateTime1 | DateTime2
1 3AA2 2017-02-01 14:23:00.000 2017-02-01 20:00:00.000
2 E323 2017-02-12 17:34:34.032 2017-02-12 18:34:34.032
3 DFG3 2017-03-08 09:20:10.032 2017-03-08 12:30:10.032
4 LKF0 2017-04-24 11:14:00.000 2017-04-24 13:40:00.000
5 DFG3 2017-04-20 13:34:42.132 NULL
6 DFG3 2017-05-02 16:34:34.032 2017-05-02 21:00:00.032
Again, given the table above, I want the same thing: the most recent row by DateTime1 (MAX DateTime1) that matches a specific code, provided it has DateTime2 set to NULL.
In this last case, no rows must be returned for code DFG3, because the row with Id=6 is the most recent by DateTime1 for code DFG3 but its DateTime2 is not NULL.
How can I do this?
Can you try this query and let me know if it works for your case
Select * From [TableName] where [Code]='DFG3' and [datetime2] is null and [datetime1] = (select max([datetime1]) from [TableName] where [Code]='DFG3')
This numbers the rows for each code from latest to earliest by DateTime1; you then keep only the latest row, and only if its DateTime2 is NULL:
SELECT *
FROM (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Code
ORDER BY DateTime1 Desc) as rn
FROM yourTable
) as T
WHERE rn = 1 -- The row with latest date for each code will have 1
and dateTime2 IS NULL
and code = 'DFG3' -- OPTIONAL
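In the first sample table, rows 5 and 6 share the same DateTime1, so ROW_NUMBER() may number either of them first. A possible refinement, assuming that a higher Id means a more recently inserted row (an assumption, not something the question states), adds Id as a tie-breaker:

```sql
SELECT *
FROM (
    SELECT *,
           -- Assumption: among rows with equal DateTime1, the highest Id is the latest.
           ROW_NUMBER() OVER (PARTITION BY Code
                              ORDER BY DateTime1 DESC, Id DESC) AS rn
    FROM yourTable
) AS T
WHERE rn = 1              -- latest row per code
  AND DateTime2 IS NULL   -- keep it only if it is still open
  AND Code = 'DFG3';      -- optional filter
```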
Can you help out with a problem?
I have a price table that contains daily prices from 31st Dec 2009 until today's date:
2009-12-31 00:00:00.000 1.0020945351
2010-01-01 00:00:00.000 1.0021009300
2010-01-04 00:00:00.000 1.0021910181
2010-01-05 00:00:00.000 1.0022005986
2010-01-06 00:00:00.000 1.0022428696
2010-01-07 00:00:00.000 1.0022647147
2010-01-08 00:00:00.000 1.0022842726
2010-01-11 00:00:00.000 1.0023374302
2010-01-12 00:00:00.000 1.0023465374
2010-01-13 00:00:00.000 1.0023638081
2010-01-14 00:00:00.000 1.0023856533
2010-01-00 00:00:00.000 1.0024083955
2010-01-18 00:00:00.000 1.0024779677
2010-01-19 00:00:00.000 1.0025020553
2010-01-20 00:00:00.000 1.002521135
2010-01-21 00:00:00.000 1.0025420688
2010-01-22 00:00:00.000 1.0025593397
2010-01-25 00:00:00.000 1.0026180146
2010-01-26 00:00:00.000 1.002637573
2010-01-27 00:00:00.000 1.0026648447
2010-01-28 00:00:00.000 1.0026957934
2010-01-29 00:00:00.000 1.0027267421
2010-02-01 00:00:00.000 1.0028195885
2010-02-02 00:00:00.000 1.0028573523
2010-02-03 00:00:00.000 1.0028964611
2010-02-04 00:00:00.000 1.00293557
2010-02-05 00:00:00.000 1.002973334
2010-02-08 00:00:00.000 1.0030879717
2010-02-09 00:00:00.000 1.0031279777
2010-02-10 00:00:00.000 1.003171166
2010-02-11 00:00:00.000 1.0032007452
2010-02-12 00:00:00.000 1.0032575895
2010-02-00 00:00:00.000 1.0033749191
2010-02-1 00:00:00.000 1.0034140292
2010-02-17 00:00:00.000 1.003452691
2010-02-18 00:00:00.000 1.0034918013
2010-02-19 00:00:00.000 1.0035395633
2010-02-22 00:00:00.000 1.0036664439
2010-02-23 00:00:00.000 1.0037042097
2010-02-24 00:00:00.000 1.0037510759
2010-02-25 00:00:00.000 1.0038001834
2010-02-26 00:00:00.000 1.003850077
I need to write a query to get an index based on
((price on last day of current month / price on last day of previous month) - 1) * 100, so that the output looks something like this:
31-Jan-10 0.01%
28-Feb-10 0.02%
31-Mar-10 0.00%
The following is one solution I thought about; however, please share better ideas for implementing this:
Extract the last day of every month, with its value, into a temp table, then order by date so that consecutive month-end values can be compared, and put the results into another temp table.
Looking forward to your help.
Try this....
DECLARE @StartDate DATETIME = '2010-01-01',
@EndDate DATETIME = GETDATE();
WITH data AS (
SELECT 1 AS i, CONVERT(DATETIME, NULL) AS StartDate, DATEADD(MONTH, 0, @StartDate) - 1 AS EndDate
UNION ALL
SELECT i + 1, data.EndDate, DATEADD(MONTH, i, @StartDate) - 1 AS EndDate
FROM data
WHERE DATEADD(MONTH, i, @StartDate) - 1 < @EndDate
)
SELECT (
((SELECT TOP 1 Rate FROM RateTable WHERE Date <= data.EndDate ORDER BY Date DESC) /
(SELECT TOP 1 Rate FROM RateTable WHERE Date <= data.StartDate ORDER BY Date DESC)- 1) * 100)
FROM DATA -- parenthesis were causing issues
WHERE data.StartDate IS NOT NULL
OPTION (MAXRECURSION 10000);
You'll need to replace the
(SELECT Rate FROM RateTable WHERE Date = data.StartDate)
and
(SELECT Rate FROM RateTable WHERE Date = data.EndDate)
with the table and column names from your rate table, as you didn't mention them in your question.
rwking indicated that there might be gaps in the rates table that would cause issues.
I've modified the subquery to bring back the first rate on or nearest the start and end dates.
Hope that helps
You can use the LAG function, introduced in SQL Server 2012, to make it a bit easier:
WITH DataWithOrder AS
(
SELECT DateField, PriceField,
ROW_NUMBER() OVER(PARTITION BY YEAR(DateField), Month(DateField) ORDER BY DateField DESC) AS Pos
FROM PriceTable
)
SELECT
DateField,
PriceField,
LAG(PriceField) OVER(ORDER BY DateField) AS PriceLastMonth,
((PriceField / LAG(PriceField) OVER(ORDER BY DateField)) - 1) * 100 AS PCIncrease
FROM DataWithOrder
WHERE Pos = 1
ORDER BY DateField
I took a very different approach than the other guy. His is more elegant and would work better if the daily data does represent every single day of every month. If there are gaps in days, however, as your sample data represents, you can try the following code.
with cte as (select mydate
, price
, ROW_NUMBER() over(partition by YEAR(mydate), MONTH(mydate)
order by day(mydate) desc) row_n
from #temp)
select mydate, price, ROW_NUMBER() over(order by mydate desc) row_num
into #temp2
from cte
where row_n = 1
alter table #temp2
add idx float
declare @counter int = 1
while @counter < (select MAX(row_num)+1 from #temp2)
begin
update t2
set t2.idx = ((t2.price/t3.price)-1)*100
from #temp2 t2 left join
#temp2 t3 on 1 = 1
where t2.row_num = @counter and t3.row_num = @counter + 1
set @counter = @counter + 1
end
select mydate, idx
from #temp2
As the other poster mentioned, you didn't provide column or table names. My process was to insert your data into a table called [#temp] with column names [mydate] and [price].
Also, the data sample you provided contains two invalid dates that I changed to arbitrary dates just for the purposes of getting code to run. (2010-01-00 and 2010-02-00)
Let's say I have the following tables
tableA
seq datea
1 2010-01-01
2 2010-02-01
3 2010-03-01
tableb
dateb sthvalue
2010-01-11 AAA
2010-01-12 AAB
2010-02-03 CCC
2010-02-06 CCD
2010-02-10 CCE
2010-03-05 FFF
I want to join the two tables so that each tableb.dateb is matched to the tablea date range it falls within,
i.e. the output should be
seq datea dateb sthvalue
1 2010-01-01 2010-01-11 AAA
1 2010-01-01 2010-01-12 AAB
2 2010-02-01 2010-02-03 CCC
2 2010-02-01 2010-02-06 CCD
2 2010-02-01 2010-02-10 CCE
3 2010-03-01 2010-03-05 FFF
Many thanks for your kind help!
Assuming that table A values are always one month apart and set on the 1st of each month, the existing answers will do.
If your table A can contain more variety:
SELECT
*
FROM
TableB b
inner join
TableA a
on
b.dateb >= a.datea
left join
TableA a_nolater
on
a_nolater.datea > a.datea and
b.dateb >= a_nolater.datea
WHERE
a_nolater.seq is null
This joins the two tables together, then attempts to find a "better" join (a row from tablea that occurs later than the currently matching one, and would still be a match for tableb). It only returns rows where it cannot find this "better" join. As such, it finds the latest dated row in tableA that is on or before the date from tableB.
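On SQL Server 2012 or later, the same "latest datea on or before dateb" matching could also be sketched with LEAD(), by first deriving the end of each range from the next row's datea. This is an alternative to the anti-join above, not part of the original answer; the CTE name Ranges and the column alias next_datea are made up for illustration.

```sql
WITH Ranges AS
(
    -- Each row's range runs from datea up to (but not including) the next datea.
    SELECT seq,
           datea,
           LEAD(datea) OVER (ORDER BY datea) AS next_datea
    FROM TableA
)
SELECT r.seq, r.datea, b.dateb, b.sthvalue
FROM Ranges r
INNER JOIN TableB b
        ON b.dateb >= r.datea
       AND (b.dateb < r.next_datea
            OR r.next_datea IS NULL)  -- the last range is open-ended
ORDER BY r.seq, b.dateb;
```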
I believe what you are asking for is to join on year and month
select
seq,datea,dateb,sthvalue
from
TableA inner join Tableb
on datepart(year,datea) = datepart(year,dateb) and
datepart(month,datea) = datepart(month,dateb)
order by seq,dateb
You can join on a one-month date range:
select
a.seq,
a.datea,
b.dateb,
b.sthvalue
from
tablea a inner join tableb b on (b.dateb >= a.datea and b.dateb < dateadd(month, 1, a.datea))
order by
a.seq, b.sthvalue