SQL Server PIVOT on multiple columns is losing data - sql-server

I've spent way too much time on this issue, and I'm not getting to the finish line. Please read this through before jumping to the conclusion that this is a duplicate of the other pivot-on-multiple-columns questions on SO.
We have properties and units, with a table which keeps track of when something changed in the unit. We cannot change the structure of the table, as this is a vendor application.
Objective: Pull out the begin and end date for when a unit had an unavailable code of "model".
Issue: I need to filter out the dates where it was available in the middle, though that seems to omit one row of data each time (for unit 105).
What I've tried: PIVOT, and CROSS APPLY in conjunction with LEAD/LAG
Here's a link to a SQLFiddle: http://sqlfiddle.com/#!6/29592/2/0
The rest of the question has the tsql from the SQLfiddle including the results which I got. The desired result is at the end.
Create table and insert sample data
DROP TABLE IF EXISTS testModelUnit;
CREATE TABLE testModelUnit(
propertykey INT NOT NULL
,unitNumber VARCHAR(10) NOT NULL
,rowStartDate DATETIME NOT NULL
,rowEndDate DATETIME NOT NULL
,unavailableCode varchar(10) NULL
,CONSTRAINT pk_testModelUnit PRIMARY KEY (propertykey, unitNumber, rowStartDate )
)
GO
INSERT INTO testModelUnit VALUES
(33,'105', '2010-11-11 00:00:00.000','2016-11-11 00:00:00.000','MODEL')
,(33,'105', '2016-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')
,(33,'105', '2016-12-14 07:51:03.307','2017-01-01 00:00:00.000',NULL)
,(33,'105', '2017-01-01 00:00:00.00','2017-03-21 12:21:13.703','MODEL')
,(33,'105', '2017-03-21 12:21:13.703','2017-04-21 12:21:13.703','MODEL')
,(33,'105', '2017-04-21 12:21:13.703','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-21 12:21:23.207','9999-12-31 00:00:00.000','MODEL')
,(33,'2606','2017-04-19 10:30:09.227','2017-04-21 12:21:23.207','MODEL')
,(33,'2703','2016-12-14 07:51:03.307','2017-04-19 10:29:47.970','MODEL')
,(33,'2703','2011-11-11 00:00:00.000','2016-12-14 07:51:03.307','MODEL')
GO
That gives you all the data which you need in order to test it, as unit 105 was available for a short period of time at the end of 2016.
Attempt 1 - use LEAD/LAG to determine if a date is the first in a series - then use multiple PIVOT statements
SELECT
propertykey
,unitNumber
,firstDate
,lastDate
FROM (
SELECT
propertykey
,unitNumber
,rowStartDate
,rowEndDate
,CASE
WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL
ELSE 'firstDate'
END ISFIRST
,CASE
WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL
ELSE 'lastDate'
END ISLAST
FROM testModelUnit
WHERE UnavailableCode = 'model'
) SRC
PIVOT (
MAX(rowStartDate)
FOR isfirst in ([firstDate])
) as pivotFirst
PIVOT (
MAX(rowEndDate)
FOR islast in ([lastDate])
) as pivotLast
Results were:
propertykey unitNumber firstDate lastDate
33 105 NULL 9999-12-31 00:00:00.000
33 105 2010-11-11 00:00:00.000 NULL
33 105 2017-01-01 00:00:00.000 NULL
33 2606 NULL 9999-12-31 00:00:00.000
33 2606 2017-04-19 10:30:09.227 NULL
33 2703 NULL 2017-04-19 10:29:47.970
33 2703 2011-11-11 00:00:00.000 NULL
The issue is twofold: firstly, the NULLs end up in different rows, and secondly, I am missing an end date for unit 105. (By reversing the order of the two PIVOT statements I reversed the issue, and was then missing one start date.)
Second attempt: use the LAG/LEAD as before, though this time use CROSS APPLY to get the first/last values into one column and then pivot the result
SELECT
propertykey
,unitNumber
,firstDate
,lastDate
FROM(
SELECT
propertykey
,unitNumber
,ca.col
,ca.value
FROM
(
SELECT
propertykey
,unitNumber
,rowStartDate
,rowEndDate
,CASE
WHEN propertykey = LAG(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LAG(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LAG(rowEndDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowStartDate THEN NULL
ELSE 'firstDate'
END ISFIRST
,CASE
WHEN propertykey = LEAD(propertykey,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND unitNumber = LEAD(unitNumber,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate)
AND LEAD(rowStartDate,1,NULL) OVER (PARTITION BY propertykey,unitNumber ORDER BY rowStartDate) = rowEndDate THEN NULL
ELSE 'lastDate'
END ISLAST
FROM testModelUnit
WHERE UnavailableCode = 'model'
) sub
OUTER APPLY (
SELECT ISFIRST, rowStartDate
UNION ALL
SELECT ISLAST, rowEndDate
) CA (col, value)
WHERE col IS NOT NULL
)src
PIVOT
(
max(value)
for col in ([firstDate],[lastDate])
) AS pivoted
Result:
propertykey unitNumber firstDate lastDate
33 105 2017-01-01 00:00:00.000 9999-12-31 00:00:00.000
33 2606 2017-04-19 10:30:09.227 9999-12-31 00:00:00.000
33 2703 2011-11-11 00:00:00.000 2017-04-19 10:29:47.970
Issue: I got rid of the NULL rows, though I am still missing one record of data for 105
Desired result:
propertykey unitNumber firstDate lastDate
33 105 2010-11-11 00:00:00.000 2016-12-14 07:51:03.307
33 105 2017-01-01 00:00:00.000 9999-12-31 00:00:00.000
33 2606 2017-04-19 10:30:09.227 9999-12-31 00:00:00.000
33 2703 2011-11-11 00:00:00.000 2017-04-19 10:29:47.970

Are you looking for a query like the one below?
Select PropertyKey, UnitNumber, Min(RowStartDate) as FirstDate, Max(rowEndDate) as LastDate from (
Select *, Bucket = Row_number() over(partition by propertykey, unitnumber order by rowStartDate) -
Row_number() over(partition by propertykey, unitnumber, unavailablecode order by rowStartDate)
from testModelUnit
) a
Where a.unavailableCode is not null
group by propertykey, unitNumber, Bucket
Output as below:
+-------------+------------+-------------------------+-------------------------+
| PropertyKey | UnitNumber | FirstDate | LastDate |
+-------------+------------+-------------------------+-------------------------+
| 33 | 105 | 2010-11-11 00:00:00.000 | 2016-12-14 07:51:03.307 |
| 33 | 105 | 2017-01-01 00:00:00.000 | 9999-12-31 00:00:00.000 |
| 33 | 2606 | 2017-04-19 10:30:09.227 | 9999-12-31 00:00:00.000 |
| 33 | 2703 | 2011-11-11 00:00:00.000 | 2017-04-19 10:29:47.970 |
+-------------+------------+-------------------------+-------------------------+
Demo
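To see why subtracting the two ROW_NUMBER() values separates the two "MODEL" runs of unit 105, it may help to look at the intermediate values. The sketch below is illustrative only: it inlines the unit 105 rows from the question in a CTE (the names unit105, rnAll and rnCode are made up here) and exposes both row numbers next to the resulting Bucket.

```sql
-- Illustrative sketch: the unit 105 rows from the question, inlined so the query runs on its own.
WITH unit105 (propertykey, unitNumber, rowStartDate, unavailableCode) AS (
    SELECT propertykey, unitNumber, CAST(rowStartDate AS datetime), unavailableCode
    FROM (VALUES
        (33, '105', '2010-11-11T00:00:00.000', 'MODEL'),
        (33, '105', '2016-11-11T00:00:00.000', 'MODEL'),
        (33, '105', '2016-12-14T07:51:03.307', NULL),
        (33, '105', '2017-01-01T00:00:00.000', 'MODEL'),
        (33, '105', '2017-03-21T12:21:13.703', 'MODEL'),
        (33, '105', '2017-04-21T12:21:13.703', 'MODEL')
    ) v (propertykey, unitNumber, rowStartDate, unavailableCode)
)
SELECT rowStartDate,
       unavailableCode,
       -- position of the row within the unit, regardless of code
       rnAll  = ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber ORDER BY rowStartDate),
       -- position of the row within the unit among rows sharing the same code
       rnCode = ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber, unavailableCode ORDER BY rowStartDate),
       -- the difference stays constant while the code does not change
       Bucket = ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber ORDER BY rowStartDate)
              - ROW_NUMBER() OVER (PARTITION BY propertykey, unitNumber, unavailableCode ORDER BY rowStartDate)
FROM unit105
ORDER BY rowStartDate;
-- For these six rows, Bucket comes out as 0, 0, 2, 1, 1, 1:
-- the first two MODEL rows share bucket 0, the NULL row sits alone,
-- and the last three MODEL rows share bucket 1.
```

Note that this grouping reacts only to a change in unavailableCode between neighbouring rows; it does not compare rowEndDate with the next rowStartDate, so it assumes the rows of a unit are contiguous in time, as they are in the sample data.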

Related

13 Period Calendar 4-4-5 Calendar T-SQL MSSQL

I am trying to create a 13 period calendar in mssql but I am a bit stuck. I am not sure if my approach is the best way to achieve this. I have my base script which can be seen below:
Set DateFirst 1
Declare @Date1 date = '20180101' --startdate should always be start of financial year
Declare @Date2 date = '20181231' --enddate should always be end of financial year
SELECT * INTO #CalendarTable
FROM dbo.CalendarTable(@Date1,@Date2,0,0,0)c
DECLARE @StartDate datetime,@EndDate datetime
SELECT @StartDate=MIN(CASE WHEN [Day]='Monday' THEN [Date] ELSE NULL END),
@EndDate=MAX([Date])
FROM #CalendarTable
;With Period_CTE(PeriodNo,Start,[End])
AS
(SELECT 1,@StartDate,DATEADD(wk,4,@StartDate) -1
UNION ALL
SELECT PeriodNo+1,DATEADD(wk,4,Start),DATEADD(wk,4,[End])
FROM Period_CTE
WHERE DATEADD(wk,4,[End])<=@EndDate
OR PeriodNo+1 <=13
)
select * from Period_CTE
Which gives me this:
PeriodNo Start End
1 2018-01-01 00:00:00.000 2018-01-28 00:00:00.000
2 2018-01-29 00:00:00.000 2018-02-25 00:00:00.000
3 2018-02-26 00:00:00.000 2018-03-25 00:00:00.000
4 2018-03-26 00:00:00.000 2018-04-22 00:00:00.000
5 2018-04-23 00:00:00.000 2018-05-20 00:00:00.000
6 2018-05-21 00:00:00.000 2018-06-17 00:00:00.000
7 2018-06-18 00:00:00.000 2018-07-15 00:00:00.000
8 2018-07-16 00:00:00.000 2018-08-12 00:00:00.000
9 2018-08-13 00:00:00.000 2018-09-09 00:00:00.000
10 2018-09-10 00:00:00.000 2018-10-07 00:00:00.000
11 2018-10-08 00:00:00.000 2018-11-04 00:00:00.000
12 2018-11-05 00:00:00.000 2018-12-02 00:00:00.000
13 2018-12-03 00:00:00.000 2018-12-30 00:00:00.000
The result I am trying to get is
Even if I have to take a different approach I would not mind, as long as the result is the same as the above.
dbo.CalendarTable() is a function that returns the following results. I can share the code if desired.
I'd create a general numbers table as suggested here and add a column Periode13.
The trick to get the tiling is the integer division:
DECLARE @PeriodeSize INT=28; --13 "moon-months" of 28 days each
SELECT TOP 100 (ROW_NUMBER() OVER(ORDER BY (SELECT NULL))-1)/@PeriodeSize
FROM master..spt_values --just a table with many rows to show the principles
You can add this to an existing numbers table with a simple update statement.
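As a rough sketch of how the integer division can yield the period boundaries directly, the following tiles 364 day offsets into blocks of 28 and takes the first and last date of each block. It assumes the financial year starts on Monday 2018-01-01 (taken from the output in the question) and uses master..spt_values only as a convenient row source; the names Days, Tiled and n are made up for illustration.

```sql
DECLARE @StartDate date = '20180101';  -- assumption: first Monday of the financial year

WITH Days AS (
    -- 364 day offsets (0..363) = 13 periods of 28 days
    SELECT TOP (364)
           n = CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int)
    FROM master..spt_values
),
Tiled AS (
    SELECT PeriodNo     = n / 28 + 1,                 -- integer division does the tiling
           CalendarDate = DATEADD(DAY, n, @StartDate)
    FROM Days
)
SELECT PeriodNo,
       MIN(CalendarDate) AS [Start],
       MAX(CalendarDate) AS [End]
FROM Tiled
GROUP BY PeriodNo
ORDER BY PeriodNo;
-- Period 1 runs 2018-01-01 to 2018-01-28, period 13 runs 2018-12-03 to 2018-12-30.
```

A 4-4-5 layout would need a different mapping from week number to period than a plain division by 28; this sketch covers only the equal-length 13-period case.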
UPDATE: A fully working example (using the logic linked above)
DECLARE @RunningNumbers TABLE (Number INT NOT NULL
,CalendarDate DATE NOT NULL
,CalendarYear INT NOT NULL
,CalendarMonth INT NOT NULL
,CalendarDay INT NOT NULL
,CalendarWeek INT NOT NULL
,CalendarYearDay INT NOT NULL
,CalendarWeekDay INT NOT NULL);
DECLARE @CountEntries INT = 100000;
DECLARE @StartNumber INT = 0;
WITH E1(N) AS(SELECT 1 FROM(VALUES (1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(N)), --10 ^ 1
E2(N) AS(SELECT 1 FROM E1 a CROSS JOIN E1 b), -- 10 ^ 2 = 100 rows
E4(N) AS(SELECT 1 FROM E2 a CROSS JOIN E2 b), -- 10 ^ 4 = 10,000 rows
E8(N) AS(SELECT 1 FROM E4 a CROSS JOIN E4 b), -- 10 ^ 8 = 100,000,000 rows
CteTally AS
(
SELECT TOP(ISNULL(@CountEntries,1000000)) ROW_NUMBER() OVER(ORDER BY(SELECT NULL)) -1 + ISNULL(@StartNumber,0) As Nmbr
FROM E8
)
INSERT INTO @RunningNumbers
SELECT CteTally.Nmbr,CalendarDate.d,CalendarExt.*
FROM CteTally
CROSS APPLY
(
SELECT DATEADD(DAY,CteTally.Nmbr,{ts'2018-01-01 00:00:00'})
) AS CalendarDate(d)
CROSS APPLY
(
SELECT YEAR(CalendarDate.d) AS CalendarYear
,MONTH(CalendarDate.d) AS CalendarMonth
,DAY(CalendarDate.d) AS CalendarDay
,DATEPART(WEEK,CalendarDate.d) AS CalendarWeek
,DATEPART(DAYOFYEAR,CalendarDate.d) AS CalendarYearDay
,DATEPART(WEEKDAY,CalendarDate.d) AS CalendarWeekDay
) AS CalendarExt;
--The mockup table from above is now filled and can be queried
WITH AddPeriode AS
(
SELECT Number/28 +1 AS PeriodNumber
,CalendarDate
,CalendarWeek
,r.CalendarDay
,r.CalendarMonth
,r.CalendarWeekDay
,r.CalendarYear
,r.CalendarYearDay
FROM @RunningNumbers AS r
)
)
SELECT TOP 100 p.*
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [Start]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber) AS [End]
,(SELECT MIN(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkStart]
,(SELECT MAX(CalendarDate) FROM AddPeriode AS x WHERE x.PeriodNumber=p.PeriodNumber AND x.CalendarWeek=p.CalendarWeek) AS [wkEnd]
,(ROW_NUMBER() OVER(PARTITION BY PeriodNumber ORDER BY CalendarDate)-1)/7+1 AS WeekOfPeriode
FROM AddPeriode AS p
ORDER BY CalendarDate
Try it out...
Hint: Do not use a VIEW or iTVF for this.
This is non-changing data and much better placed in a physically stored table with appropriate indexes.
I'm not entirely sure external links are accepted here, but I wrote an article that pulls off a 5-4-4 'Crop Year' fiscal year, with all the code. Feel free to use all the code in these articles.
SQL Server Calendar Table
SQL Server Calendar Table: Fiscal Years

MSSQL: Create incremental row label per group

In my table, I have a primary key and a date. What I'd like to achieve is to have an incremental label based on whether or not there is a break between the dates - column Goal.
Now, below is an example. The break column was calculated using LEAD function (I thought it might help).
I am able to solve it using T-SQL, but this would be last resort. Nothing I tried has worked so far. I am using MSSQL 2014.
PK | Date | break | Goal |
-------------------------------
1 | 03/2017 | 0 | 1 |
1 | 04/2017 | 0 | 1 |
1 | 08/2017 | 1 | 2 |
1 | 09/2017 | 0 | 2 |
1 | 10/2017 | 0 | 2 |
1 | 02/2018 | 1 | 3 |
1 | 03/2018 | 0 | 3 |
Here is code to reproduce this example:
CREATE TABLE #test
(
ConsumerId INT,
FullDate DATE,
Goal INT
)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-03-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-04-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-08-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-09-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-10-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-02-01',3)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-03-01',3)
SELECT ConsumerId,
FullDate,
CASE WHEN (datediff(month,
isnull(
LEAD (FullDate,1) OVER (PARTITION BY ConsumerId ORDER BY FullDate DESC),
FullDate),
FullDate) > 1)
THEN 1
ELSE 0
END AS break,
Goal
FROM #test
ORDER BY FullDate ASC
EDIT
This is apparently a famous problem "Islands and gaps" as pointed out in the comments. And Google offers many solutions as well as other questions here at SO.
Try this...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
An explanation of the code and how it works...
The 1st query, in cte_TestGap, uses the LAG function along with the ROW_NUMBER() function to mark the locations of gaps in the data. We can see that by breaking it out and looking at its results...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
)
SELECT * FROM cte_TestGap;
cte_TestGap results...
ConsumerId FullDate Gap
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 0
1 2017-08-01 3
1 2017-09-01 0
1 2017-10-01 0
1 2018-02-01 6
1 2018-03-01 0
At this point we want the 0 values to take on the value of the preceding non-0 values, allowing them to be grouped together. This is done in the 2nd query (cte_SmearGap) using the MAX function with a "window frame". So if we look at the output of cte_SmearGap, we can see that...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT * FROM cte_SmearGap;
cte_SmearGap results...
ConsumerId FullDate GV
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 3
1 2017-09-01 3
1 2017-10-01 3
1 2018-02-01 6
1 2018-03-01 6
At this point all of the rows are in distinct groups... but... we'd like to have our group numbers in a contiguous sequence (1,2,3) as opposed to (1,3,6).
Of course that's easy enough to fix using the DENSE_RANK() function, which is what's happening in the final select...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
The end result...
ConsumerId FullDate GroupValue
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 2
1 2017-09-01 2
1 2017-10-01 2
1 2018-02-01 3
1 2018-03-01 3
The comment from David Browne was actually extremely useful. If you google "Islands and Gaps", there are many variations of the solution. Below is the one I liked the most.
In the end, I needed the Goal column to be able to group the dates into MIN/MAX. This solution skips this step and directly creates the aggregated range.
Here is the source.
SELECT MIN(FullDate) AS range_start,
MAX(FullDate) AS range_end
FROM (
SELECT FullDate,
DATEADD(MM, -1 * ROW_NUMBER() OVER(ORDER BY FullDate), FullDate) AS grp
FROM #test
) a
GROUP BY a.grp
And the output:
range_start | range_end |
--------------------------
2017-03-01 | 2017-04-01 |
2017-08-01 | 2017-10-01 |
2018-02-01 | 2018-03-01 |
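The reason this works is that, within a run of consecutive months, both the date and the row number advance by one per row, so subtracting the row number (in months) from the date gives the same value for every row of the run. The sketch below is one possible per-consumer variant, not the linked source's code: it inlines the sample rows in a CTE named src (a made-up name) and adds PARTITION BY ConsumerId so that several consumers could be handled at once.

```sql
-- Illustrative sketch: sample rows from the question, inlined so the query runs on its own.
WITH src (ConsumerId, FullDate) AS (
    SELECT ConsumerId, CAST(FullDate AS date)
    FROM (VALUES
        (1, '20170301'), (1, '20170401'),
        (1, '20170801'), (1, '20170901'), (1, '20171001'),
        (1, '20180201'), (1, '20180301')
    ) v (ConsumerId, FullDate)
)
SELECT ConsumerId,
       MIN(FullDate) AS range_start,
       MAX(FullDate) AS range_end
FROM (
    SELECT ConsumerId,
           FullDate,
           -- date minus row number (in months) is constant within a run of consecutive months
           DATEADD(MONTH,
                   -CAST(ROW_NUMBER() OVER (PARTITION BY ConsumerId ORDER BY FullDate) AS int),
                   FullDate) AS grp
    FROM src
) a
GROUP BY ConsumerId, grp
ORDER BY ConsumerId, range_start;
-- For the sample rows, grp is 2017-02-01 for the first run, 2017-05-01 for the second
-- and 2017-08-01 for the third, which yields the three ranges shown above.
```

This relies on there being at most one row per consumer and month; duplicate months would shift the row numbers and split a run.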

Return a Date Is null as maximum value in t-sql

I have this table.
ID Date Value
___ ____ _____
3241 9/17/12 5
3241 9/16/12 100
3241 9/15/12 20
4355 9/16/12 12
4355 9/15/12 132
4355 9/14/12 4
1001 NULL 89
1001 9/16/12 125
5555 NULL 89
1234 9/16/12 45
2236 9/15/12 128
2236 9/14/12 323
2002 9/17/12 45
I would like to select the maximum date grouped by ID, with NULL counting as a maximum value that should appear in the result, to get something like this:
ID Date Value
___ ____ _____
3241 9/17/12 5
4355 9/16/12 12
1001 9/16/12 125
5555 NULL 89
1234 9/16/12 45
2236 9/15/12 128
2002 9/17/12 45
I found a solution, but it does not include NULL as the maximum value.
The solution by @bluefeet:
Return value at max date for a particular id
SELECT t1.id,
t2.mxdate,
t1.value
FROM yourtable t1
INNER JOIN
( SELECT max(date) mxdate,
id
FROM yourtable
GROUP BY id) t2 ON t1.id = t2.id
AND t1.date = t2.mxdate
I also searched for how to select NULL as the maximum value in T-SQL, and found this solution by @Damien_The_Unbeliever: How can I include null values in a MIN or MAX?
SELECT recordid,
MIN(startdate),
CASE
WHEN MAX(CASE
WHEN enddate IS NULL THEN 1
ELSE 0
END) = 0 THEN MAX(enddate)
END
FROM tmp
GROUP BY recordid
But I'm stuck; I don't know how to merge these two solutions to get what I want.
PS: I'm using SQL Server 2008.
You can use this
SELECT
ID
,[Date]
,[Value]
FROM(
SELECT
*
, ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ISNULL([Date],'9999-12-31') DESC) AS Row#
FROM yourtable
) A WHERE Row# = 1
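One thing to check against the expected output: with ISNULL([Date],'9999-12-31') the NULL row always wins, so ID 1001 would come back with NULL and 89, whereas the table in the question lists 9/16/12 and 125 for that ID. If NULL should only be returned when an ID has no real date at all, a possible variant is to order by [Date] DESC directly, since SQL Server sorts NULL as the lowest value. The sketch below assumes that reading of the requirement; the table variable @t is hypothetical and holds only a few of the sample rows.

```sql
-- Hypothetical table variable with a subset of the sample rows from the question.
DECLARE @t TABLE (ID int, [Date] date NULL, [Value] int);

INSERT INTO @t (ID, [Date], [Value]) VALUES
    (3241, '20120917', 5),
    (3241, '20120916', 100),
    (1001, NULL,       89),
    (1001, '20120916', 125),
    (5555, NULL,       89);

SELECT ID, [Date], [Value]
FROM (
    SELECT ID, [Date], [Value],
           -- NULL sorts lowest in SQL Server, so with DESC it comes after every real date
           rn = ROW_NUMBER() OVER (PARTITION BY ID ORDER BY [Date] DESC)
    FROM @t
) a
WHERE rn = 1;
-- 3241 returns 2012-09-17 / 5, 1001 returns 2012-09-16 / 125, 5555 returns NULL / 89.
```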

SQL Query time spent between certain value

I have a database of all temperatures for the last 10 years.
Now I want to find all periods where the temperature was at or above a threshold, e.g. 15 degrees.
Simplified example:
...
2015-05-10 12
2015-05-11 15 |
2015-05-12 16 |
2015-05-13 17 |
2015-05-14 16 |
2015-05-15 15 |
2015-05-16 12
2015-05-17 11
2015-05-18 15 |
2015-05-19 12
2015-05-20 18 |
...
So now I want to get all time periods like this:
Min Max
2015-05-11 2015-05-15
2015-05-18 2015-05-18
2015-05-20 2015-05-20
Any suggestions on what this query would look like?
You could use a CTE:
CREATE TABLE #Date (DateT datetime, Value int )
INSERT INTO #Date
VALUES ('2015-05-10',12),
('2015-05-11',15),
('2015-05-12',16),
('2015-05-13',17),
('2015-05-14',16),
('2015-05-15',15),
('2015-05-16',12),
('2015-05-17',11),
('2015-05-18',15),
('2015-05-19',12),
('2015-05-20',18);
WITH t AS (
SELECT DateT d,ROW_NUMBER() OVER(ORDER BY DateT) i
FROM #Date
WHERE Value >= 15
GROUP BY DateT
)
SELECT MIN(d) as DataStart,MAX(d) as DataFinal, ROW_NUMBER() OVER(ORDER BY DATEDIFF(day,i,d)) as RN
FROM t
GROUP BY DATEDIFF(day,i,d)
The RN column is optional; with the same CTE t you could instead use
SELECT MIN(d) as DataStart,MAX(d) as DataFinal
FROM t
GROUP BY DATEDIFF(day,i,d)
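The grouping key works because, after filtering, consecutive days get consecutive row numbers, so "date minus row number" is the same for every day of an unbroken run. As an illustration only (the CTE names src and numbered and the column grp are made up here, and DATEADD is used instead of DATEDIFF to make the key readable as a date), the intermediate values can be listed like this:

```sql
-- Illustrative sketch: sample rows from the question, inlined so the query runs on its own.
WITH src (d, temp) AS (
    SELECT CAST(d AS date), temp
    FROM (VALUES
        ('20150510', 12), ('20150511', 15), ('20150512', 16), ('20150513', 17),
        ('20150514', 16), ('20150515', 15), ('20150516', 12), ('20150517', 11),
        ('20150518', 15), ('20150519', 12), ('20150520', 18)
    ) v (d, temp)
),
numbered AS (
    SELECT d,
           i = CAST(ROW_NUMBER() OVER (ORDER BY d) AS int)  -- numbers only the qualifying days
    FROM src
    WHERE temp >= 15
)
SELECT d,
       i,
       grp = DATEADD(DAY, -i, d)  -- constant within a run of consecutive days
FROM numbered
ORDER BY d;
-- grp is 2015-05-10 for 05-11 through 05-15, 2015-05-12 for 05-18 and 2015-05-13 for 05-20,
-- so grouping by it gives the three periods asked for.
```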
Here is a solution using a gaps and islands algorithm. It looks kind of bulky but it runs fast and scales great. It is also modular if you want to add a gap-allowed parameter and you can rewrite it to partition by some other columns and it still performs nicely.
Inspired by Peter Larsson's post here: http://www.sqltopia.com/?page_id=83
WITH [theSource](Col1,Col2)
AS
(
SELECT Col1,Col2 FROM (VALUES
('2015-05-10',12),
('2015-05-11',15),
('2015-05-12',16),
('2015-05-13',17),
('2015-05-14',16),
('2015-05-15',15),
('2015-05-16',12),
('2015-05-17',11),
('2015-05-18',15),
('2015-05-19',12),
('2015-05-20',18)
) as x(Col1,Col2)
)
,filteredSource([Value])
AS
(
SELECT Col1 as [Value]
FROM theSource WHERE Col2 >= 15
)
,cteSource(RangeStart, RangeEnd)
AS (
SELECT RangeStart,
CASE WHEN [RangeStart] = [RangeEnd] THEN [RangeEnd] ELSE LEAD([RangeEnd]) OVER (ORDER BY Value) END AS [RangeEnd]
FROM (
SELECT [Value],
CASE
WHEN DATEADD(DAY,1,LAG([Value]) OVER (ORDER BY [Value])) >= [Value] THEN NULL
ELSE [Value]
END AS RangeStart,
CASE
WHEN DATEADD(DAY,-1,LEAD([Value]) OVER (ORDER BY [Value])) <= [Value] THEN NULL
ELSE [Value]
END AS RangeEnd
FROM filteredSource
) AS d
WHERE RangeStart IS NOT NULL
OR RangeEnd IS NOT NULL
)
SELECT RangeStart AS [Min],
RangeEnd AS [Max]
FROM cteSource
WHERE RangeStart IS NOT NULL;

Selecting rows with the nearest date using SQL

I have a SQL statement.
SELECT
ID, LOCATION, CODE,MAX(DATE),FLAG
FROM
TABLE1
WHERE
DATE <= CONVERT(DATETIME,'11-11-2012')
AND EXISTS (SELECT * FROM #TEMP_CODE WHERE TABLE1.CODE = #TEMP_CODE.CODE)
AND ID IN (14, 279)
GROUP BY
ID, LOCATION, CODE
I need the rows with the date nearest to 11-11-2012, but the query returns all the rows. What am I doing wrong? Thanks
ID LOCATION CODE DATE FLAG
-------------------------------------------------------------------
14 CAR STREET,UDUPI 234 2012-08-08 00:00:00.000 0
14 CAR STREET,UDUPI 234 2012-08-10 00:00:00.000 1
14 CAR STREET,UDUPI 234 2012-08-14 00:00:00.000 0
279 MADHUGIRI 234 2012-08-08 00:00:00.000 1
279 MADHUGIRI 234 2012-08-11 00:00:00.000 0
I want to show only the rows with dates less than or equal to the given date. The required result is
ID LOCATION CODE DATE FLAG
-------------------------------------------------------------------
14 CAR STREET,UDUPI 234 2012-08-10 00:00:00.000 1
279 MADHUGIRI 234 2012-08-11 00:00:00.000 0
;WITH x AS
(
SELECT ID, Location, Code, Date, Flag,
rn = ROW_NUMBER() OVER
(PARTITION BY ID, Location, Code ORDER BY [Date] DESC)
FROM dbo.TABLE1 AS t1
WHERE [Date] <= '20121111'
AND ID IN (14, 279) -- sorry, missed this
AND EXISTS (SELECT 1 FROM #TEMP_CODE WHERE CODE = t1.CODE)
)
SELECT ID, Location, Code, Date, Flag
FROM x WHERE rn = 1;
This yields:
ID LOCATION CODE [Date] FLAG
--- ---------------- ---- ---------- ----
14 CAR STREET,UDUPI 234 2012-08-14 0
279 MADHUGIRI 234 2012-08-11 0
This disagrees with your required results, but I think those are wrong and I think you should check them.
Use a subquery to get the max date for each ID, and then join that to your table:
SELECT
ID, LOCATION, CODE, DATE, FLAG
FROM
TABLE1
JOIN (
SELECT ID AS SubID, MAX(DATE) AS SubDATE
FROM TABLE1
WHERE DATE < '11/11/2012'
AND EXISTS (SELECT * FROM #TEMP_CODE WHERE TABLE1.CODE = #TEMP_CODE.CODE)
AND ID IN (14, 279)
GROUP BY ID
) AS SUB ON ID = SubID AND DATE = SubDATE
Add an ORDER BY DATE with a limit of 2 rows (LIMIT 0,2 in MySQL; in SQL Server, SET ROWCOUNT 2 as below).
The ORDER BY sorts the rows by date within the range allowed by your WHERE condition, and the limit returns only the top 2 rows!
SET ROWCOUNT 2
SELECT
ID, LOCATION, CODE,MAX(DATE),FLAG
FROM
TABLE1
WHERE
DATE <= CONVERT(DATETIME,'11-11-2012')
AND EXISTS (SELECT * FROM #TEMP_CODE WHERE TABLE1.CODE = #TEMP_CODE.CODE)
AND ID IN (14, 279)
GROUP BY
ID, LOCATION, CODE
ORDER BY DATE
