I'm stuck on a project. I wrote this code in SQL Server, which finds the duplicate date matches for a staff member, but I'm stuck trying to extend it so that it only returns rows whose time ranges also overlap each other.
There is a table called 'Rosters' with columns 'StaffID', 'Date', 'Start', 'End':
SELECT
y.[Date],y.StaffID,y.Start,y.[End]
FROM Rosters y
INNER JOIN (SELECT
[Date],StaffID, COUNT(*) AS CountOf
FROM Rosters
GROUP BY [Date],StaffID
HAVING COUNT(*)>1)
dd ON y.[Date]=dd.[Date] and y.StaffID=dd.StaffID
It returns all duplicate dates for each staff member. I wish to add the logic:
y.Start <= dd.[End] && dd.Start <= y.[End]
Is it possible with the way I'm currently doing it? Any help would be appreciated.
@TT. Sorry, below is probably a better visual explanation.
E.g., this would be the Rosters table:
ID Date Start End
1 01/01/2000 8:00 12:00
1 01/01/2000 9:00 11:00
2 01/01/2000 10:00 14:00
2 01/01/2000 8:00 9:00
3 01/01/2000 14:00 18:00
3 02/02/2002 13:00 19:00
I'm trying to return the rows below for this example, as they are the only two rows that clash on ID, Date, and time range (Start to End):
ID Date Start End
1 01/01/2000 8:00 12:00
1 01/01/2000 9:00 11:00
This is the logic that you would need to filter your results to overlapping time ranges, though I think this can be handled without your intermediate step of finding the duplicates. If you simply post your source table schema with some test data and your desired output, you will get a much better answer:
declare @t table (RowID int
,ID int
,DateValue date --\
,StartTime Time -- > Avoid using reserved words for your object names.
,EndTime Time --/
);
insert into @t values
(1,1, '01/01/2000', '8:00','12:00' )
,(2,1, '01/01/2000', '9:00','11:00' )
,(3,2, '01/01/2000', '10:00','14:00')
,(4,2, '01/01/2000', '8:00','9:00' )
,(5,3, '01/01/2000', '14:00','18:00')
,(6,3, '02/02/2002', '13:00','19:00');
select t1.*
from @t t1
inner join @t t2
on(t1.RowID <> t2.RowID -- If you don't have a unique ID for your rows, you will need to compare all columns so as not to match a row with itself.
and t1.ID = t2.ID
and t1.DateValue = t2.DateValue
and t1.StartTime <= t2.EndTime
and t1.EndTime >= t2.StartTime
)
order by t1.RowID
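The join condition above is the usual interval-overlap test: two ranges clash when each one starts no later than the other ends. As a minimal sketch, the same self-join can be run against the sample rows using Python's built-in sqlite3 module instead of SQL Server (purely so it can be executed anywhere). The table name `roster`, the snake_case column names, and the zero-padded text times are assumptions of this sketch, not part of the original schema.

```python
import sqlite3

# In-memory SQLite database standing in for the SQL Server table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE roster (row_id INTEGER, staff_id INTEGER, "
    "date_value TEXT, start_time TEXT, end_time TEXT)"
)
# Zero-padded HH:MM strings compare correctly as plain text.
conn.executemany(
    "INSERT INTO roster VALUES (?, ?, ?, ?, ?)",
    [
        (1, 1, "2000-01-01", "08:00", "12:00"),
        (2, 1, "2000-01-01", "09:00", "11:00"),
        (3, 2, "2000-01-01", "10:00", "14:00"),
        (4, 2, "2000-01-01", "08:00", "09:00"),
        (5, 3, "2000-01-01", "14:00", "18:00"),
        (6, 3, "2002-02-02", "13:00", "19:00"),
    ],
)

# Two ranges overlap when each one starts no later than the other ends.
# DISTINCT keeps a row from appearing once per clashing partner.
overlaps = conn.execute(
    """
    SELECT DISTINCT t1.row_id, t1.staff_id, t1.date_value,
                    t1.start_time, t1.end_time
    FROM roster t1
    JOIN roster t2
      ON t1.row_id <> t2.row_id
     AND t1.staff_id = t2.staff_id
     AND t1.date_value = t2.date_value
     AND t1.start_time <= t2.end_time
     AND t1.end_time >= t2.start_time
    ORDER BY t1.row_id
    """
).fetchall()

for row in overlaps:
    print(row)
```

Only rows 1 and 2 come back for this data. Because the comparison uses `<=` and `>=`, two shifts that merely touch (one ends at 09:00, the next starts at 09:00) would also count as a clash; switching to strict `<` and `>` would allow back-to-back shifts.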
Try this
with cte as
(
SELECT ROW_NUMBER() over (order by StaffID,[Date],[Start],[End]) as rno
,StaffID, [Date], [Start], [End]
FROM Rosters
)
select distinct t1.*
from cte t1
inner join cte t2
on(t1.rno <> t2.rno
and t1.StaffID = t2.StaffID
and t1.[Date] = t2.[Date]
and t1.[Start] <= t2.[End]
and t1.[End] >= t2.[Start]
)
order by t1.rno
Made some changes to @iamdave's answer.
If you use SQL Server 2012 or later, you can try the script below:
declare @roster table (StaffID int,[Date] date,[Start] Time,[End] Time);
insert into @roster values
(1, '01/01/2000', '9:00','11:00' )
,(1, '01/01/2000', '8:00','12:00' )
,(2, '01/01/2000', '10:00','14:00')
,(2, '01/01/2000', '8:00','9:00' )
,(3, '01/01/2000', '14:00','18:00')
,(3, '02/02/2002', '13:00','19:00');
SELECT t.StaffID,t.Date,t.Start,t.[End] FROM (
SELECT y.StaffID,y.Date,y.Start,y.[End]
,CASE WHEN y.[End] BETWEEN
LAG(y.Start)OVER(PARTITION BY y.StaffID,y.Date ORDER BY y.Start) AND LAG(y.[End])OVER(PARTITION BY y.StaffID,y.Date ORDER BY y.Start) THEN 1 ELSE 0 END
+CASE WHEN LEAD(y.[End])OVER(PARTITION BY y.StaffID,y.Date ORDER BY y.Start) BETWEEN y.Start AND y.[End] THEN 1 ELSE 0 END AS IsOverlap
,COUNT (0)OVER(PARTITION BY y.StaffID,y.Date) AS cnt
FROM @roster AS y
) t WHERE t.cnt>1 AND t.IsOverlap>0
StaffID Date Start End
----------- ---------- ---------------- ----------------
1 2000-01-01 08:00:00.0000000 12:00:00.0000000
1 2000-01-01 09:00:00.0000000 11:00:00.0000000
I am stuck with a problem.
I have some data like this:
Id Creation date Creation date hour range Id vehicule Id variable Value
1 2017-03-01 9:10 2017-03-01 9:00 1 6 0.18
2 2017-03-01 9:50 2017-03-01 9:00 1 3 0.50
3 2017-03-01 9:27 2017-03-01 9:00 1 3 null
4 2017-03-01 10:05 2017-03-01 10:00 1 3 0.35
5 2017-03-01 10:17 2017-03-01 10:00 1 3 0.12
6 2017-03-01 9:05 2017-03-01 9:00 1 5 0.04
7 2017-03-01 9:57 2017-03-01 9:00 1 5 null
I need to select the row sets, grouped by Id vehicule, Id variable and Creation date hour range, and ordered by Id vehicule, Id variable and Creation date, where the first Value is null but the second, third, ... values are not null. So, in the sample above, the following row set:
Id Creation date Creation date hour range Id vehicule Id variable Value
3 2017-03-01 9:27 2017-03-01 9:00 1 3 null
2 2017-03-01 9:50 2017-03-01 9:00 1 3 0.50
Could you help me, please?
Thank you
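You will have no luck with a group by in this case. I would put two "exists" subqueries into the where clause to keep only the rows whose group fits your criteria:
(for example/not tested/probably takes forever; column names are assumed from the question's headers with the spaces removed)
select y1.*
from yourTable y1
where
--the first row of the group (earliest CreationDate) must have a null Value
--the group is matched with a correlated subquery instead of a group by
exists (select 1 from yourTable y2
where y2.IdVehicule = y1.IdVehicule
and y2.IdVariable = y1.IdVariable
and y2.CreationDateHourRange = y1.CreationDateHourRange
and y2.Value is null
--the first in the set
and y2.CreationDate = (select min(y3.CreationDate) from yourTable y3
where y3.IdVehicule = y2.IdVehicule
and y3.IdVariable = y2.IdVariable
and y3.CreationDateHourRange = y2.CreationDateHourRange))
AND
--the group must also contain a row whose Value is not null
exists (select 1 from yourTable y4
where y4.IdVehicule = y1.IdVehicule
and y4.IdVariable = y1.IdVariable
and y4.CreationDateHourRange = y1.CreationDateHourRange
and y4.Value is not null)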
hope that helped :)
Include the Value field in the ORDER BY clause and it will be sorted to the top, because SQL Server sorts NULL before any non-NULL value in ascending order.
Assuming (because your middle paragraph is hard to understand) you want all the fields output but you want the 4th and 5th columns to produce some grouping of the output, with Value = NULL at the top of each group:
SELECT Id, CreatedDate, CreatedDateHourRange, IdVehicule, IdVariable, Value
FROM yourTable -- table name assumed; the original omitted the FROM clause
ORDER BY IdVehicule, IdVariable, Value
I don't see any need for an actual GROUP BY clause.
I think it is unclear as to whether you want to limit the NULL Value rows in each block to just one row of NULL, but if you do you would need to state the order for which the datetime columns are sorted.
Indeed, group by was no use here. Also, I wasn't sure where your 10:00 records were supposed to go. Does this help?
;WITH CTE_ADD_SOME_LOGIC
AS
(
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value
, CASE WHEN Value IS NULL THEN 1 ELSE 0 END AS VALUE_IS_NULL FROM tbl
),
CTE_MORE_LOGIC
AS
(
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value,VALUE_IS_NULL
, RANK() OVER (ORDER BY CreationDateHourRange,VALUE_IS_NULL) AS RN FROM CTE_ADD_SOME_LOGIC),
CTE_ORDER
AS
(
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value,VALUE_IS_NULL, RN
, ROW_NUMBER() OVER(PARTITION BY RN ORDER BY RN,IdVehicle,IdVariable,CreationDate, VALUE_IS_NULL DESC) AS HIERARCHY FROM CTE_MORE_LOGIC
)
SELECT Id, CreationDate ,CreationDateHourRange ,IdVehicle ,IdVariable ,Value FROM CTE_ORDER WHERE HIERARCHY = 1
ORDER BY Id
Try this Query
DECLARE @Nulloccurrence INT=1 -- Give a value like 1,2,3: 1 for the first null occurrence, 2 for the 2nd null occurrence
SELECT TOP 2 *
FROM cte
WHERE Id <= (
SELECT ID FROM
(
SELECT Id, ROW_NUMBER()OVER( Order by id) AS Seq
FROM cte
WHERE (
CASE
WHEN CAST(variableValue AS VARCHAR) IS NULL
THEN 'P'
ELSE CAST(variableValue AS VARCHAR)
END
) = 'P'
)Dt
WHERE Dt.Seq=@Nulloccurrence
)
ORDER BY 1 DESC
Expected Result
Id Creationdate Creationdatehourrange Ids vehicleId variableValue
------------------------------------------------------------------------
3 2017-03-01 9:27 2017-03-01 9:00 1 3 NULL
2 2017-03-01 9:50 2017-03-01 9:00 1 3 0.50
For 'where the first Value is null but second value, third value, ... is not null', I suppose you want to keep or drop each grouped row depending on whether the group contains both a null and a non-null [Value]. This cannot be done in a standard WHERE clause, because WHERE evaluates each row in isolation. Simply put, a row being filtered cannot 'see' other rows unless you use a subquery. You need to use a HAVING clause instead (the commented-out HAVING also allows groups with two or more null records).
This will work:
DECLARE @mytbl TABLE(Id INT, [Creation date] DATETIME, [Creation date hour range] DATETIME, [Id veh] INT, [Id var] INT, Value DECIMAL(5,2)) -- DECIMAL so that 0.18 etc. are not truncated to 0

INSERT INTO @mytbl VALUES (1,'2017-03-01 9:10','2017-03-01 9:00',1, 6, 0.18)
INSERT INTO @mytbl VALUES (2,'2017-03-01 9:50','2017-03-01 9:00',1, 3, 0.50)
INSERT INTO @mytbl VALUES (3,'2017-03-01 9:27','2017-03-01 9:00',1, 3, NULL)
INSERT INTO @mytbl VALUES (4,'2017-03-01 10:05','2017-03-01 10:00',1, 3, 0.35)
INSERT INTO @mytbl VALUES (5,'2017-03-01 10:17','2017-03-01 10:00',1, 3, 0.12)
INSERT INTO @mytbl VALUES (6,'2017-03-01 9:05','2017-03-01 9:00',1, 5, 0.04)
INSERT INTO @mytbl VALUES (7,'2017-03-01 9:57','2017-03-01 9:00',1, 5, NULL)

SELECT [Id veh], [Id var],[Creation date hour range]
FROM @mytbl
GROUP BY [Id veh], [Id var],[Creation date hour range]
HAVING COUNT([Id veh]) - COUNT(Value) = 1
--HAVING COUNT([Id veh]) - COUNT(Value) >= 1
ORDER BY [Id veh], [Id var],[Creation date hour range]
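The HAVING trick relies on COUNT(column) skipping NULLs while COUNT(*) counts every row, so their difference is the number of NULLs in each group. Here is a minimal sketch of that idea on the question's sample data, run through Python's built-in sqlite3 module rather than SQL Server; the table name `readings` and the snake_case column names are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE readings (id INTEGER, created TEXT, hour_range TEXT, "
    "id_veh INTEGER, id_var INTEGER, value REAL)"
)
conn.executemany(
    "INSERT INTO readings VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, "2017-03-01 09:10", "2017-03-01 09:00", 1, 6, 0.18),
        (2, "2017-03-01 09:50", "2017-03-01 09:00", 1, 3, 0.50),
        (3, "2017-03-01 09:27", "2017-03-01 09:00", 1, 3, None),
        (4, "2017-03-01 10:05", "2017-03-01 10:00", 1, 3, 0.35),
        (5, "2017-03-01 10:17", "2017-03-01 10:00", 1, 3, 0.12),
        (6, "2017-03-01 09:05", "2017-03-01 09:00", 1, 5, 0.04),
        (7, "2017-03-01 09:57", "2017-03-01 09:00", 1, 5, None),
    ],
)

# COUNT(*) counts all rows, COUNT(value) only the non-NULL ones,
# so the difference is the number of NULL values in the group.
groups = conn.execute(
    """
    SELECT id_veh, id_var, hour_range,
           COUNT(*) - COUNT(value) AS null_count
    FROM readings
    GROUP BY id_veh, id_var, hour_range
    HAVING COUNT(*) - COUNT(value) = 1
    ORDER BY id_veh, id_var, hour_range
    """
).fetchall()

for row in groups:
    print(row)
```

Note that this only counts NULLs per group. It does not check that the NULL row is the earliest one in its group, so the (1, 5) group, where the NULL comes last, is returned alongside the (1, 3) group.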
There was one other SIMILAR answer but it is 2 pages long and my requirement doesn't need that. I have 2 tables, tableA and tableB, and I need to find the COUNTS of rows that are present in tableA but not present in tableB, OR whose updated_on in tableB is not today's date.
My tables:
tableA:
release_id book_name release_begin_date
----------------------------------------------------
1122 midsummer 2016-01-01
1123 fool's errand 2016-06-01
1124 midsummer 2016-04-01
1125 fool's errand 2016-08-01
tableB:
release_id book_name updated_on
-----------------------------------------
1122 midsummer 2016-08-17
1123 fool's errand 2016-08-16
Expected result: Since each book is missing one release id, 1 is count. But in addition fool's errand's existing row in tableB has updated_on date of yesterday and not today, it needs to be counted in count_of_not_updated.
book_name count_of_missing count_of_not_updated
-------------------------------------------------------
midsummer 1 0
fool's errand 1 1
Note: Even though fool's errand is present in tableB, I need to show it in count_of_missing because its updated_on date is yesterday and not today. I know it has to be a combination of a left join and something else, but the kicker here is not only getting the missing rows from the left table but at the same time checking whether the updated_on column holds today's date and, if not, counting that row in count_of_not_updated.
select a.book_name
, sum(case when b.release_id is null then 1 else 0 end) as noReleaseID
, sum(case when datediff(d, b.updated_on, getdate()) > 0 then 1 else 0 end) as releaseDateNotToday
from tableA a
left outer join tableB b on a.release_id = b.release_id
Group by a.book_name
This example uses a sum function on a case statement to add up the instances where the case statement returns true. Note that the current code assumes, as in your example, that you are looking to count all old updated_on dates from table b - more steps would be required if each release has multiple rows in table b, and you only want to compare to the most recent one.
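To make the sum-over-case idea concrete, here is a minimal sketch of the same conditional aggregation on the question's sample rows, run with Python's built-in sqlite3 module instead of SQL Server. "Today" is passed in as a fixed parameter (2016-08-17, taken from the sample data) so the result is reproducible; the snake_case table names and that parameter are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE table_a (release_id INTEGER, book_name TEXT, "
             "release_begin_date TEXT)")
conn.execute("CREATE TABLE table_b (release_id INTEGER, book_name TEXT, "
             "updated_on TEXT)")
conn.executemany("INSERT INTO table_a VALUES (?, ?, ?)", [
    (1122, "midsummer", "2016-01-01"),
    (1123, "fool's errand", "2016-06-01"),
    (1124, "midsummer", "2016-04-01"),
    (1125, "fool's errand", "2016-08-01"),
])
conn.executemany("INSERT INTO table_b VALUES (?, ?, ?)", [
    (1122, "midsummer", "2016-08-17"),
    (1123, "fool's errand", "2016-08-16"),
])

# Each CASE yields 1 when its condition holds, so SUM counts matching rows.
# For releases missing from table_b, updated_on is NULL, the comparison is
# not true, and the row falls through to ELSE 0.
counts = conn.execute(
    """
    SELECT a.book_name,
           SUM(CASE WHEN b.release_id IS NULL THEN 1 ELSE 0 END)
               AS count_of_missing,
           SUM(CASE WHEN b.updated_on < :today THEN 1 ELSE 0 END)
               AS count_of_not_updated
    FROM table_a a
    LEFT JOIN table_b b ON a.release_id = b.release_id
    GROUP BY a.book_name
    ORDER BY a.book_name
    """,
    {"today": "2016-08-17"},  # assumed "today", matching the sample data
).fetchall()

for row in counts:
    print(row)
```

With that fixed date the sketch gives one missing release per book and one stale row for fool's errand only, which is the expected result stated in the question.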
Try this
DECLARE @tableA TABLE (release_id INT, book_name NVARCHAR(50), release_begin_date DATETIME)
DECLARE @tableB TABLE (release_id INT, book_name NVARCHAR(50), updated_on DATETIME)
INSERT INTO @tableA
VALUES
(1122, 'midsummer', '2016-01-01'),
(1123, 'fool''s errand', '2016-06-01'),
(1124, 'midsummer', '2016-04-01'),
(1125, 'fool''s errand', '2016-08-01')
INSERT INTO @tableB
VALUES
(1122, 'midsummer', '2016-08-17'),
(1123, 'fool''s errand', '2016-08-16')
;WITH TmpTableA
AS
(
SELECT
book_name,
COUNT(1) CountOfTableA
FROM
@tableA
GROUP BY
book_name
), TmpTableB
AS
(
SELECT
book_name,
COUNT(1) CountOfTableB,
SUM(CASE WHEN CONVERT(VARCHAR(11), updated_on, 112) = CONVERT(VARCHAR(11), GETDATE(), 112) THEN 0 ELSE 1 END) count_of_not_updated
FROM
@tableB
GROUP BY
book_name
)
SELECT
A.book_name ,
A.CountOfTableA - ISNULL(B.CountOfTableB, 0) AS count_of_missing,
ISNULL(B.count_of_not_updated, 0) AS count_of_not_updated
FROM
TmpTableA A LEFT JOIN
TmpTableB B ON A.book_name = B.book_name
Result:
book_name count_of_missing count_of_not_updated
-------------------- ---------------- --------------------
fool's errand 1 1
midsummer 1 1
I have a table that tracks the Datetime of Incident_IDs created for specific Device_IDs and I am trying to find a way to track chronic issues over a range of dates. The definition of chronic issue is any Device_ID that had 3 or more Incident_IDs created in the past 5 days. I need to be able to search over a range of different dates (mostly monthly).
Given table:
IF OBJECT_ID('tempdb.dbo.#temp') IS NOT NULL
DROP TABLE #temp
CREATE TABLE #temp
(Device_ID INT,
Incident_ID INT,
Incident_Datetime DATETIME)
INSERT INTO #temp
VALUES
(2,1001,'2016-02-01'),
(3,1002,'2016-02-02'),
(2,1003,'2016-02-09'),
(2,1004,'2016-02-10'),
(5,1005,'2016-02-12'),
(2,1006,'2016-02-13'),
(5,1007,'2016-02-14'),
(5,1008,'2016-02-15'),
(3,1009,'2016-02-18'),
(3,1010,'2016-02-19'),
(3,1011,'2016-02-20'),
(5,1012,'2016-02-21'),
(3,1013,'2016-03-18'),
(3,1014,'2016-03-19'),
(3,1015,'2016-03-20');
The desired result for chronic issues for 02-2016 is:
Device_ID Incident_ID Incident_Datetime
2 1003 2/9/16 0:00
2 1004 2/10/16 0:00
2 1006 2/13/16 0:00
3 1009 2/18/16 0:00
3 1010 2/19/16 0:00
3 1011 2/20/16 0:00
5 1005 2/12/16 0:00
5 1007 2/14/16 0:00
5 1008 2/15/16 0:00
I have tried the following query, which shows me the ascending count of incidents and allows me to find those device_ids that have had chronic issues, but I'm having a hard time isolating all the incidents that make up the chronic issue while excluding those outliers that occurred outside the 5 day range.
SELECT c.Device_ID, c.Incident_ID, c.Incident_Datetime,
(SELECT COUNT(*)
FROM #temp AS t
WHERE
c.Device_ID = t.Device_ID
AND
t.Incident_Datetime BETWEEN DATEADD(DAY,-5,c.Incident_Datetime) AND c.Incident_Datetime) AS Incident_Count
FROM #temp AS c
WHERE
c.Incident_Datetime >= '2016-02-01'
AND
c.Incident_Datetime < '2016-03-01'
ORDER BY
Device_ID, Incident_Datetime
This is probably not quite as nice as Jake's answer, but here's an alternative solution that might work:
WITH cte AS
(
SELECT tmp.Device_ID, tmp.Incident_Datetime FROM #temp AS tmp
CROSS APPLY
(
SELECT Device_ID
FROM #temp AS t
WHERE tmp.Device_ID = t.Device_ID AND t.Incident_Datetime BETWEEN DATEADD(d,-5,tmp.Incident_Datetime) AND tmp.Incident_Datetime
GROUP BY Device_ID HAVING COUNT(Incident_ID) >= 3
) p
WHERE tmp.Incident_Datetime BETWEEN '02-01-2016' AND '03-01-2016'
)
SELECT f.*
FROM #temp f
INNER JOIN cte
ON f.Device_ID = cte.Device_ID
WHERE f.Incident_Datetime BETWEEN DATEADD(d,-5,cte.Incident_Datetime) AND cte.Incident_Datetime
GROUP BY f.Device_ID, f.Incident_ID, f.Incident_Datetime
ORDER BY f.Device_ID, f.Incident_Datetime
How about this...
DECLARE @StartDate datetime, @EndDate datetime
SET @StartDate='2016-02-01'
SET @EndDate='2016-03-01'
SELECT c.Device_ID, c.Incident_ID, c.Incident_DateTime FROM #temp c
-- CROSS APPLY rather than INNER JOIN, because the subquery refers to the outer row c
CROSS APPLY (SELECT t.Device_ID, Count(*) AS Incident_Count FROM #temp t
WHERE t.Device_ID = c.Device_ID
AND t.Incident_DateTime BETWEEN DATEADD(dd, -3, c.Incident_DateTime) AND DATEADD(dd, +3, c.Incident_DateTime)
GROUP BY t.Device_ID
HAVING Count(*) > 2) chronic
WHERE c.Incident_DateTime BETWEEN @StartDate AND @EndDate
ORDER BY
c.Device_ID, c.Incident_Datetime
Here's a way to derive a running total of incidents within n days:
with
incidents as (
select * from #temp cross apply (
select incident_datetime, 1 union all
select incident_datetime + 5, -1) x(dt, delta)),
rolling as (
select *, incidents_in_range = sum(delta)
over (partition by device_id order by dt)
from incidents)
select t.* from #temp t join rolling r
on r.device_id=t.device_id
and t.incident_datetime between r.incident_datetime - 5 and r.incident_datetime
where r.incidents_in_range >= 3
..basically find the points at which "3 incidents in 5 days" was reached, and then join back to include the incidents within 5 days.
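The "find where the threshold is reached, then join back" idea can be checked against the sample data. The sketch below is not the running-sum query above: it uses a plain correlated count to find the threshold points, and it runs in Python's built-in sqlite3 module rather than SQL Server. The table name `incidents`, the text dates, and the `:start`/`:end` parameters are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE incidents (device_id INTEGER, "
             "incident_id INTEGER, incident_date TEXT)")
conn.executemany("INSERT INTO incidents VALUES (?, ?, ?)", [
    (2, 1001, "2016-02-01"), (3, 1002, "2016-02-02"),
    (2, 1003, "2016-02-09"), (2, 1004, "2016-02-10"),
    (5, 1005, "2016-02-12"), (2, 1006, "2016-02-13"),
    (5, 1007, "2016-02-14"), (5, 1008, "2016-02-15"),
    (3, 1009, "2016-02-18"), (3, 1010, "2016-02-19"),
    (3, 1011, "2016-02-20"), (5, 1012, "2016-02-21"),
    (3, 1013, "2016-03-18"), (3, 1014, "2016-03-19"),
    (3, 1015, "2016-03-20"),
])

chronic = conn.execute(
    """
    WITH anchors AS (
        -- incidents in the requested month at which the device had
        -- 3 or more incidents in the preceding 5 days (inclusive)
        SELECT c.device_id, c.incident_date
        FROM incidents c
        WHERE c.incident_date >= :start AND c.incident_date < :end
          AND (SELECT COUNT(*) FROM incidents t
               WHERE t.device_id = c.device_id
                 AND t.incident_date
                     BETWEEN date(c.incident_date, '-5 days')
                         AND c.incident_date) >= 3
    )
    -- join back to collect every incident inside each qualifying window
    SELECT DISTINCT i.device_id, i.incident_id, i.incident_date
    FROM incidents i
    JOIN anchors a
      ON a.device_id = i.device_id
     AND i.incident_date
         BETWEEN date(a.incident_date, '-5 days') AND a.incident_date
    ORDER BY i.device_id, i.incident_date
    """,
    {"start": "2016-02-01", "end": "2016-03-01"},
).fetchall()

for row in chronic:
    print(row)
```

For February 2016 this returns the nine incidents listed in the question's desired result, and leaves out the isolated incidents 1001, 1002 and 1012.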
Title sounds confusing but let me please explain:
I have a table that has two columns that provide a date range, and one column that provides a value. I need to query that table and "detail" the data such as this
Is it possible to do only using TSQL?
Additional Info
The table in question is about 2-3 million records long (and growing).
Assuming the range of dates is fairly narrow, an alternative is to use a recursive CTE to create a list of all dates in the range and then join the table to it:
WITH LastDay AS
(
SELECT MAX(Date_To) AS MaxDate
FROM MyTable
),
Days AS
(
SELECT MIN(Date_From) AS TheDate
FROM MyTable
UNION ALL
SELECT DATEADD(d, 1, TheDate) AS TheDate
FROM Days CROSS JOIN LastDay
WHERE TheDate <= LastDay.MaxDate
)
SELECT mt.Item_ID, mt.Cost_Of_Item, d.TheDate
FROM MyTable mt
INNER JOIN Days d
ON d.TheDate BETWEEN mt.Date_From AND mt.Date_To;
I've also assumed that date from and date to represent an inclusive range (i.e. it includes both edges), although it is unusual to use an inclusive BETWEEN on dates.
Edit
The default MAXRECURSION on a recursive CTE in SQL Server is 100, which will limit the date range in the query to a span of about 100 days. You can raise this to a maximum of 32767, or set it to 0 to remove the limit entirely.
Also, if you are filtering just a smaller range of dates in your large table, you can adjust the CTE to limit the number of days in the range:
WITH DateRange AS
(
SELECT CAST('2014-01-01' AS DATE) AS MinDate,
CAST('2014-02-16' AS DATE) AS MaxDate
),
Days AS
(
SELECT MinDate AS TheDate
FROM DateRange
UNION ALL
SELECT DATEADD(d, 1, TheDate) AS TheDate
FROM Days CROSS APPLY DateRange
WHERE TheDate <= DateRange.MaxDate
)
SELECT mt.Item_ID, mt.Cost_Of_Item, d.TheDate
FROM MyTable mt
INNER JOIN Days d
ON d.TheDate BETWEEN mt.Date_From AND mt.Date_To
OPTION (MAXRECURSION 0);
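For anyone who wants to see the expansion without a SQL Server instance, here is a minimal sketch of the same recursive-CTE approach using Python's built-in sqlite3 module. SQLite writes it as WITH RECURSIVE and has no MAXRECURSION option, so that part does not carry over. The table name `my_table` and the three sample rows are illustrative assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE my_table (item_id INTEGER, cost_of_item INTEGER, "
             "date_from TEXT, date_to TEXT)")
# Illustrative rows (assumed), with inclusive date ranges.
conn.executemany("INSERT INTO my_table VALUES (?, ?, ?, ?)", [
    (1, 100, "2014-01-01", "2014-01-03"),
    (2, 105, "2014-01-06", "2014-01-09"),
    (3, 102, "2014-02-11", "2014-02-16"),
])

expanded = conn.execute(
    """
    WITH RECURSIVE days(the_date) AS (
        -- anchor: the earliest start date in the table
        SELECT MIN(date_from) FROM my_table
        UNION ALL
        -- step: add one day until the latest end date is reached
        SELECT date(the_date, '+1 day')
        FROM days
        WHERE the_date < (SELECT MAX(date_to) FROM my_table)
    )
    SELECT mt.item_id, mt.cost_of_item, d.the_date
    FROM my_table mt
    JOIN days d
      ON d.the_date BETWEEN mt.date_from AND mt.date_to
    ORDER BY mt.item_id, d.the_date
    """
).fetchall()

for row in expanded:
    print(row)
```

Each source row turns into one output row per day in its range (3, 4 and 6 rows here). The days CTE covers every date between the overall minimum and maximum, including gaps that no range uses, which is why a narrow overall span matters for this approach.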
This can be achieved using Cursors.
I've simulated the test data provided, created another table named "DesiredTable" to store the results, and written the following cursor, which does what you are looking for:
SET NOCOUNT ON;
DECLARE @ITEM_ID int, @COST_OF_ITEM Money,
@DATE_FROM date, @DATE_TO date;
DECLARE @DateDiff INT; -- holds number of days between from & to columns
DECLARE @counter INT = 0; -- for loop counter
PRINT '-------- Begin the Date Expanding Cursor --------';
-- defining the cursor target statement
DECLARE Date_Expanding_Cursor CURSOR FOR
SELECT [ITEM_ID]
,[COST_OF_ITEM]
,[DATE_FROM]
,[DATE_TO]
FROM [dbo].[OriginalTable]
-- opening the cursor
OPEN Date_Expanding_Cursor
-- fetching next row data into the declared variables
FETCH NEXT FROM Date_Expanding_Cursor
INTO @ITEM_ID, @COST_OF_ITEM, @DATE_FROM, @DATE_TO
-- if next row is found
WHILE @@FETCH_STATUS = 0
BEGIN
-- calculate the number of days in between the date columns
SELECT @DateDiff = DATEDIFF(day,@DATE_FROM,@DATE_TO)
-- reset the counter to 0 for the next loop
set @counter = 0;
WHILE @counter <= @DateDiff
BEGIN
-- inserting rows inside the new table
insert into DesiredTable
Values (@COST_OF_ITEM, DATEADD(day,@counter,@DATE_FROM))
set @counter = @counter +1
END
-- fetching next row
FETCH NEXT FROM Date_Expanding_Cursor
INTO @ITEM_ID, @COST_OF_ITEM, @DATE_FROM, @DATE_TO
END
-- cleanup code
CLOSE Date_Expanding_Cursor;
DEALLOCATE Date_Expanding_Cursor;
The code fetches every row from your original table, then it calculates the number of days between DATE_FROM and DATE_TO columns, then using this number the script will create identical rows to be inserted inside the new table DesiredTable.
Give it a try and let me know the results.
You can generate an increment table and join it to your date From:
Query:
With inc(n) as (
Select ROW_NUMBER() over (order by (select 1)) -1 From (
Select 1 From (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x1(n)
Cross Join (values(1), (1), (1), (1), (1), (1), (1), (1), (1), (1)) as x2(n)
) as x(n)
)
Select item_id, cost, DATEADD(day, n, dateFrom), n From @dates d
Inner Join inc i on n <= DATEDIFF(day, dateFrom, dateTo)
Order by item_id
Output:
item_id cost Date n
1 100 2014-01-01 00:00:00.000 0
1 100 2014-01-02 00:00:00.000 1
1 100 2014-01-03 00:00:00.000 2
2 105 2014-01-08 00:00:00.000 2
2 105 2014-01-07 00:00:00.000 1
2 105 2014-01-06 00:00:00.000 0
2 105 2014-01-09 00:00:00.000 3
3 102 2014-02-14 00:00:00.000 3
3 102 2014-02-15 00:00:00.000 4
3 102 2014-02-16 00:00:00.000 5
3 102 2014-02-11 00:00:00.000 0
3 102 2014-02-12 00:00:00.000 1
3 102 2014-02-13 00:00:00.000 2
Sample Data:
declare @dates table(item_id int, cost int, dateFrom datetime, dateTo datetime);
insert into @dates(item_id, cost, dateFrom, dateTo) values
(1, 100, '20140101', '20140103')
, (2, 105, '20140106', '20140109')
, (3, 102, '20140211', '20140216');
Yet another way is to create and maintain a calendar table containing all dates for many years (in our app we have a table covering 30 years or so, extended every year). Then you can just join to the calendar:
select <whatever you need>, calendar.day
from <your tables> inner join calendar on calendar.day between <min date> and <max date>
This approach also lets you include additional information (holidays etc.) in the calendar table, which is sometimes very helpful.
Can someone steer me in the right direction for solving this issue with a set-based solution versus cursor-based?
Given a table with the following rows:
Date Value
2013-11-01 12
2013-11-12 15
2013-11-21 13
2013-12-01 0
I need a query that will give me a row for each date between 2013-11-1 and 2013-12-1, as follows:
2013-11-01 12
2013-11-02 12
2013-11-03 12
...
2013-11-12 15
2013-11-13 15
2013-11-14 15
...
2013-11-21 13
2013-11-22 13
...
2013-11-30 13
Any advice and/or direction will be appreciated.
The first thing that came to my mind was to fill in the missing dates by looking at the day of the year. You can do this by joining to the spt_values table in the master DB and adding the number to the first day of the year.
DECLARE @Table AS TABLE(ADate Date, ANumber Int);
INSERT INTO @Table
VALUES
('2013-11-01',12),
('2013-11-12',15),
('2013-11-21',13),
('2013-12-01',0);
SELECT
DateAdd(D, v.number, MinDate) Date
FROM (SELECT number FROM master.dbo.spt_values WHERE name IS NULL) v
INNER JOIN (
SELECT
Min(ADate) MinDate
,DateDiff(D, Min(ADate), Max(ADate)) DaysInSpan
,Year(Min(ADate)) StartYear
FROM @Table
) dates ON v.number BETWEEN 0 AND DaysInSpan - 1
Next I would wrap that to make a derived table, and add a subquery to get the most recent number. Your end result may look something like:
DECLARE @Table AS TABLE(ADate Date, ANumber Int);
INSERT INTO @Table
VALUES
('2013-11-01',12),
('2013-11-12',15),
('2013-11-21',13),
('2013-12-01',0);
-- Uncomment the following line to see how it behaves when the date range spans a year end
--UPDATE @Table SET ADate = DateAdd(d, 45, ADate)
SELECT
AllDates.Date
,(SELECT TOP 1 ANumber FROM @Table t WHERE t.ADate <= AllDates.Date ORDER BY ADate DESC)
FROM (
SELECT
DateAdd(D, v.number, MinDate) Date
FROM
(SELECT number FROM master.dbo.spt_values WHERE name IS NULL) v
INNER JOIN (
SELECT
Min(ADate) MinDate
,DateDiff(D, Min(ADate), Max(ADate)) DaysInSpan
,Year(Min(ADate)) StartYear
FROM @Table
) dates ON v.number BETWEEN 0 AND DaysInSpan - 1
) AllDates
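The two steps described above (generate every date in the span, then look up the most recent known value for each) can be sketched outside SQL Server as well. This version uses Python's built-in sqlite3 module and swaps the spt_values lookup for a recursive CTE, since SQLite has no such system table; the table name `readings` and its column names are assumptions of the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (a_date TEXT, a_number INTEGER)")
conn.executemany("INSERT INTO readings VALUES (?, ?)", [
    ("2013-11-01", 12),
    ("2013-11-12", 15),
    ("2013-11-21", 13),
    ("2013-12-01", 0),
])

filled = conn.execute(
    """
    WITH RECURSIVE all_dates(d) AS (
        -- every date from the first reading to the last, inclusive
        SELECT MIN(a_date) FROM readings
        UNION ALL
        SELECT date(d, '+1 day')
        FROM all_dates
        WHERE d < (SELECT MAX(a_date) FROM readings)
    )
    SELECT d,
           -- most recent reading on or before this date
           (SELECT r.a_number
            FROM readings r
            WHERE r.a_date <= all_dates.d
            ORDER BY r.a_date DESC
            LIMIT 1) AS carried_value
    FROM all_dates
    ORDER BY d
    """
).fetchall()

for row in filled:
    print(row)
```

Unlike the `DaysInSpan - 1` bound in the query above, this sketch includes the final date, so 2013-12-01 appears with its own value of 0.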
Another solution, not sure how it compares to the two already posted performance wise but it's a bit more concise:
Uses a numbers table:
Query:
DECLARE @SDATE DATETIME
DECLARE @EDATE DATETIME
DECLARE @DAYS INT
SET @SDATE = '2013-11-01'
SET @EDATE = '2013-11-29'
SET @DAYS = DATEDIFF(DAY,@SDATE, @EDATE)
SELECT Num, DATEADD(DAY,N.Num,@SDATE), SUB.[Value]
FROM Numbers N
LEFT JOIN MyTable M ON DATEADD(DAY,N.Num,@SDATE) = M.[Date]
CROSS APPLY (SELECT TOP 1 [Value]
FROM MyTable M2
WHERE [Date] <= DATEADD(DAY,N.Num,@SDATE)
ORDER BY [Date] DESC) SUB
WHERE N.Num <= @DAYS
It's possible, but neither pretty nor very performant at scale:
In addition to your_table, you'll need to create a second table/view dates containing every date you'd ever like to appear in the output of this query. For your example it would need to contain at least 2013-11-01 through 2013-12-01.
SELECT m.date, y.value
FROM your_table y
INNER JOIN (
SELECT md.date, MAX(my.date) AS max_date
FROM dates md
INNER JOIN your_table my ON md.date >= my.date
GROUP BY md.date
) m
ON y.date = m.max_date