How to generate the Cartesian product of two tables in Snowflake?


** EDIT **
SELECT DATEADD(MONTH, SEQ4(), to_date('2020-12-01')) AS "REPORTING MONTH" FROM TABLE (GENERATOR(ROWCOUNT => 12))
Root cause: the SEQ4() function causes the issue when there is a large number of records. Is there an alternative to SEQ4()? (I tried SEQ8() too, but it didn't work either.)
I am trying to generate a Cartesian product of the results from two tables in Snowflake.
As the query at the end of this question shows, the second CTE, M, generates all the months of 2020 and is cross joined with query A, which returns around 1717 rows. The final output has 18887 rows.
Can you please let me know the right way to use CROSS JOIN?
Question
However, the "REPORTING MONTH" field has three invalid values ('2021-05-01', '2022-01-01', '2022-09-01') for each row in query A, i.e.:
Expected
A1 , 2020-01-01
A1 , 2020-02-01
A1 , 2020-03-01
A1 , 2020-04-01
A1 , 2020-05-01
A1 , 2020-06-01
A1 , 2020-07-01
A1 , 2020-08-01
A1 , 2020-09-01
A1 , 2020-10-01
A1 , 2020-11-01
A1 , 2020-12-01
Actual (with errors)
A1 , 2020-01-01
A1 , 2020-02-01
A1 , 2020-03-01
A1 , 2020-04-01
A1 , 2020-05-01
A1 , 2020-06-01
A1 , 2020-07-01
A1 , 2020-08-01
A1 , 2020-09-01
A1 , 2021-05-01 (invalid)
A1 , 2022-01-01 (invalid)
A1 , 2022-09-01 (invalid)
WITH A AS (
(SELECT * FROM (SELECT TO_TIMESTAMP_NTZ(TO_VARCHAR(EVENT_TIME, 'YYYY-MM-01')) AS "EVENT REPORTING MONTH", *,
RANK() OVER (PARTITION BY ID, SERVICE_TYPE ORDER BY EVENT_TIME) AS RANK FROM "MY_TABLE") AS E
WHERE RANK = 1 AND DATEDIFF(MONTH , DATE("EVENT REPORTING MONTH") ,CURRENT_DATE()) > 0)
),
M AS (
SELECT DATEADD(MONTH, SEQ4(), to_date('2020-01-01')) AS "REPORTING MONTH" FROM TABLE (GENERATOR(ROWCOUNT => 12))
),
F AS (
SELECT * FROM A cross JOIN M
)
SELECT * FROM F order by "ID","REPORTING MONTH" DESC

SEQ4 and SEQ8 are not guaranteed to be gap-free, as documented here:
https://docs.snowflake.com/en/sql-reference/functions/seq1.html
Use ROW_NUMBER instead. Here is your example with ROW_NUMBER:
SELECT
DATEADD(
MONTH,
ROW_NUMBER() OVER (ORDER BY 1) - 1, -- ROW_NUMBER starts at 1, SEQ4 starts at 0
TO_DATE('2020-12-01')
) AS "REPORTING MONTH"
FROM
TABLE (GENERATOR(ROWCOUNT => 12))
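
To see why gaps in the generated sequence would produce exactly the invalid months listed in the question, here is a small Python sketch (not Snowflake code). The gapped offsets 16, 24, and 32 are an assumption, picked because they reproduce the three invalid values; `add_months` is a hypothetical helper that mimics DATEADD(MONTH, ...) for first-of-month dates only.

```python
from datetime import date


def add_months(start: date, months: int) -> date:
    """Hypothetical stand-in for DATEADD(MONTH, n, d); assumes d is the 1st of a month."""
    total = start.year * 12 + (start.month - 1) + months
    return date(total // 12, total % 12 + 1, 1)


start = date(2020, 1, 1)

# Assumed gapped sequence: nine consecutive values, then jumps (illustrative only).
gapped_offsets = [0, 1, 2, 3, 4, 5, 6, 7, 8, 16, 24, 32]
# Dense sequence, equivalent to ROW_NUMBER() - 1 over 12 generated rows.
dense_offsets = list(range(12))

gapped_months = [add_months(start, n) for n in gapped_offsets]
dense_months = [add_months(start, n) for n in dense_offsets]

print([d.isoformat() for d in gapped_months[-3:]])
print([d.isoformat() for d in dense_months[-3:]])
```

With a dense 0-based numbering, every offset stays inside the intended 12-month window, which is what the ROW_NUMBER-based query aims for.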

Related

IF statement in SQL

I have a table table_changes (Id, start_date, end_date) and I want to add two columns, rank_end_date and new_end_date.
The problem with my data is that an end_date and the start_date that comes right after it are not always contiguous (at the month level; the day of the month does not matter to me), see example 1. So in some cases I need to "stretch" end_date so that the periods are contiguous at the month level.
In example 1, the new_end_date is 1/2/2015; it does not have to be 28/2/2015. If the end_date in rank 1 is earlier than 31/12/2015, stretch it to 31/12/9999.
Some examples:
Ex1:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/01/1970  1/1/1980    2              1/2/2015
111  01/03/2015  31/12/9999  1              31/12/9999
Ex2:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/01/1970  1/1/1980    1              31/12/9999
Ex3:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/01/1970  1/1/1980    2              01/05/1990
111  01/05/1990  31/12/1995  1              31/12/9999
Ex4:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/03/2015  31/12/9999  1              31/12/9999
Ex5:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/02/2015  31/5/2015   2              01/5/2015
111  01/06/2015  31/12/9999  1              31/12/9999
the syntax should be something like this but I don't know how to write those IF statements in SQL:
if rank_end_date ==2 then new_end_date == 1/Month(start_date(rank_end_date - 1)) - 1 /2015
if rank_end_date ==1 then new_end_date == 31/12/2015
else new_end_date = end_date
Select [Id],[StartDate],[EndDate],
Rank_End_Date, case
when t.Rank_End_Date = (2) then
CAST(CAST(Year([StartDate]) AS varchar) + '-' + CAST(Month([StartDate]) AS varchar) + '-' +
-- How do I choose the Start_Date from the record with Rank == 1? It is selecting
-- the start date from the record with Rank == 2, of course.
CAST(Day ([EMER_StartDate]) AS varchar) AS DATE)
when t.Rank_End_Date = (1) then '9999-12-31'
else t.[EMER_EndDate] end As New_End_Date
from (
Select [Id],[StartDate],[EndDate],
Rank() OVER (PARTITION BY [Id] order by [EndDate] desc) as Rank_End_Date
from [dbo].[Changes]
) t
Could anybody help in achieving the result?

If I've understood your question right, and you can only have values in rank_end_date of 1 or 2 then something like this query should give you the answer you're looking for. Either way, the LEAD (or LAG function if you sort the records ascending) will allow you to fetch the value from a different record.
SELECT ID
, start_date
, end_date
, rank_end_date
, CASE WHEN rank_end_date = 1 THEN
CASE WHEN end_date < '31/12/2015' THEN '31/12/9999' ELSE end_date END
WHEN rank_end_date = 2 THEN LEAD(start_date,1) OVER(ORDER BY ID, rank_end_date DESC)
END AS new_end_date
FROM dbo.Changes

You can't use LEAD OR LAG functions in SQL Server 2008, so you can try this solution.
with CTE as
(
Select [Id] as ID,[StartDate] as StartDate,[EndDate] as EndDate,
ROW_NUMBER() OVER (PARTITION BY [Id] order by [StartDate] DESC) as rn_Start_Date
from [dbo].[Changes]
)
Select C1.[Id] , C1.[StartDate], C1.[EndDate], C1.rn_Start_Date as Rank_end_date,
ISNULL(DATEADD(MONTH, DATEDIFF(MONTH, 0, C2.[StartDate])-1, 0), cast('9999-12-31' as DATE)) As New_End_Date
From CTE C1
LEFT JOIN CTE C2 ON C1.[ID] = C2.[ID] AND C1.Rn_Start_Date = C2.Rn_Start_Date + 1
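
The self-join logic in the answer above can be sketched in Python (illustration only, not T-SQL): for each row of one Id, look at the next later start date, step back to the first day of the month before it, and give the last row 9999-12-31. The names `new_end_dates`, `first_of_previous_month`, and `MAX_DATE` are hypothetical.

```python
from datetime import date

MAX_DATE = date(9999, 12, 31)


def first_of_previous_month(d: date) -> date:
    """First day of the month before d (mirrors the DATEADD/DATEDIFF expression)."""
    if d.month == 1:
        return date(d.year - 1, 12, 1)
    return date(d.year, d.month - 1, 1)


def new_end_dates(rows):
    """rows: (start_date, end_date) pairs for a single Id, in any order."""
    ordered = sorted(rows)
    result = []
    for i, (start, end) in enumerate(ordered):
        if i + 1 < len(ordered):
            # A later period exists: stretch up to the month before it starts.
            new_end = first_of_previous_month(ordered[i + 1][0])
        else:
            # Latest period: open-ended.
            new_end = MAX_DATE
        result.append((start, end, new_end))
    return result


ex1 = new_end_dates([(date(1970, 1, 1), date(1980, 1, 1)),
                     (date(2015, 3, 1), date(9999, 12, 31))])
ex5 = new_end_dates([(date(2015, 2, 1), date(2015, 5, 31)),
                     (date(2015, 6, 1), date(9999, 12, 31))])
```

This follows the answer's formula, so it matches Ex1 and Ex5 from the question; Ex3 as posted expects the same month as the next start date, which this rule does not produce.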

Running sum from a point

I have a forecast of change that I need to add on to actuals.
Example:
Date Group Count ActForc
Nov-15 GrpA 10 A
Dec-15 GrpA 12 A
Jan-16 GrpA -1 F
Feb-16 GrpA 2 F
What I would like to see is:
Date Group Count
Nov-15 GrpA 10
Dec-15 GrpA 12
Jan-16 GrpA 11
Feb-16 GrpA 13
All of the counting/running-sum queries I have seen assume that I want the sections kept separate, and show ways to create a sum for each section. What I want is to seed the sum for the second section with the final value from the first section and continue from that point, without disturbing the values from the second section.

If your forecasts are always at the end of the date range, you can also do this by nesting a few window functions. The inner query computes a helper column, Count2, which holds Count when the next row is 'F' (or, on the last row, when the row itself is 'F') and 0 otherwise. The outer query then shows the running total of Count2 instead of Count for those same rows, which gives the figure you want.
select
[date],
[group],
case when isnull(lead(ActForc) over (order by Date asc),ActForc) = 'F' then
sum(Count2) over (order by Date asc) else [Count] end,
[count],
ActForc
from (
select
[date],
[group],
case when isnull(lead(ActForc) over (order by Date asc),ActForc) = 'F' then [Count] else 0 end as Count2,
[count],
ActForc
from
table1
) X
This should perform better than any recursive CTEs / correlated subqueries because the data isn't read several times. If you have more groups, partitioning the window functions with the group should fix that.
Example in SQL Fiddle with few more months.
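
As a plain illustration of the expected result (a Python sketch, not the window-function query itself), the running value can be reset on every actual row and accumulated on every forecast row. The function name `seeded_running_sum` is hypothetical, and rows are assumed to be sorted by date already.

```python
def seeded_running_sum(rows):
    """rows: (label, count, kind) tuples ordered by date; kind is 'A' or 'F'."""
    result = []
    running = 0
    for label, count, kind in rows:
        if kind == "A":
            # Actuals are reported as-is and become the new seed.
            running = count
        else:
            # Forecast rows are changes applied on top of the seed.
            running += count
        result.append((label, running))
    return result


rows = [
    ("Nov-15", 10, "A"),
    ("Dec-15", 12, "A"),
    ("Jan-16", -1, "F"),
    ("Feb-16", 2, "F"),
]
seeded = seeded_running_sum(rows)
print(seeded)
```

For several groups, the same loop would be run once per group, which corresponds to partitioning the window functions by group.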

Try a recursive CTE.
First, create a subquery that adds a row number (rn).
Then create the base case with rn = 1.
Finally, the recursive step calculates each next level.
SQL Fiddle Demo
WITH addID as (
SELECT [Date], [Group], [Count], [ActForc],
ROW_NUMBER() OVER ( ORDER BY [DATE]) as rn
FROM myTable
), cte_name ( [Date], [Group], [Count], [level] ) AS
(
SELECT [Date], [Group], [Count], 1 as [level]
FROM addID
WHERE rn = 1
UNION ALL
SELECT A.[Date],
A.[Group],
CASE WHEN [ActForc] = 'F' THEN C.[Count] + A.[Count]
ELSE A.[Count]
END AS [Count],
C.[level] + 1
FROM addID A
INNER JOIN cte_name C
ON A.rn = C.[level] + 1
)
SELECT *
FROM cte_name
OUTPUT
| Date | Group | Count | level |
|----------------------------|-------|-------|-------|
| November, 01 2015 00:00:00 | GrpA | 10 | 1 |
| December, 01 2015 00:00:00 | GrpA | 12 | 2 |
| January, 01 2016 00:00:00 | GrpA | 11 | 3 |
| February, 01 2016 00:00:00 | GrpA | 13 | 4 |

How to calculate an average when some rows do not exist?

Please help me calculate an average when some values are NULL.
fact table:
cube:
SELECT NON EMPTY {[Measures].[Score]} * [Date].[Month].allmembers ON COLUMNS
,{[Name].[name].allmembers} ON ROWS
FROM Test
problem:
When I calculate the average, NULL values are excluded. I tried COALESCEEMPTY(), but did not manage to calculate the average correctly anyway: the average for months where Score = 0 is not correct. Here's the code:
WITH
MEMBER [Measures].[DateCount] AS DISTINCTCOUNT([Data].[date].[date])
MEMBER [Measures].[ScoreX] AS COALESCEEMPTY([Measures].[Score],0)
MEMBER [Measures].[DateCountX] AS COALESCEEMPTY([Measures].[DateCount],0)
MEMBER [Measures].[AvgScore] AS IIF([Measures].[DateCountX]=0,0,[Measures].[ScoreX]/[Measures].[DateCountX])
SELECT NON EMPTY {[Measures].[AvgScore]} * [Date].[Month].allmembers ON COLUMNS
,{[Name].[name].allmembers} ON ROWS
FROM Test
Please help find the solution.

Maybe something like the following:
WITH
MEMBER [Measures].[Score X] AS
IIF(
[Measures].[Data Count]=0
,0
,[Measures].[Data Count]
)
MEMBER [Measures].[Data Count X] AS
COUNT(
[name].[name].CURRENTMEMBER
*[Measures].[Score X]
)
MEMBER [Measures].[Avg Score] AS
DIVIDE(
[Measures].[Score]
,[Measures].[Data Count X]
)
...
...
As Tab mentioned, you could use the function COALESCEEMPTY for the first calculated member above:
WITH
MEMBER [Measures].[Score X] AS
COALESCEEMPTY(
[Measures].[Data Count]
,0)
MEMBER [Measures].[Data Count X] AS
COUNT(
[name].[name].CURRENTMEMBER
*[Measures].[Score X]
)
MEMBER [Measures].[Avg Score] AS
DIVIDE(
[Measures].[Score]
,[Measures].[Data Count X]
)
...
...

the final solution was this:
WITH
MEMBER Measures.[AvgScore] AS
Avg(
Descendants(
[Data].[Date].CurrentMember,
[Data].[Date].[Date]
),
coalesceempty(Measures.[Score],0)
)
SELECT NON EMPTY {[Measures].[AvgScore]} * [Date].[Month].allmembers ON COLUMNS
,{[Name].[name].allmembers} ON ROWS
FROM Test
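
The idea behind this solution, as I read it, is to average over every date of the month and treat a missing score as 0, so the divisor is the number of dates rather than the number of non-empty cells. A minimal Python sketch of that arithmetic follows; the data and the name `average_with_zero_fill` are hypothetical and this is not MDX.

```python
def average_with_zero_fill(scores_by_date, all_dates):
    """Average over all_dates, counting dates without a score as 0."""
    if not all_dates:
        return 0.0
    total = sum(scores_by_date.get(d, 0) for d in all_dates)
    return total / len(all_dates)


# Hypothetical month with two dates; the name has a score on only one of them.
dates_in_month = ["2015-01-01", "2015-01-02"]
scores = {"2015-01-01": 30}

zero_filled_avg = average_with_zero_fill(scores, dates_in_month)
# Averaging only the existing rows would ignore the missing date entirely.
naive_avg = sum(scores.values()) / len(scores)
print(zero_filled_avg, naive_avg)
```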

Your fact table needs to represent 0's for each month with missing names. You could do this with a common-table-expression.
declare @facttable table(name varchar(10),date datetime,score int);
insert into @facttable(name,date,score)
values
('a1','2015/01/01',15),
('a2','2015/01/01',30),
('a3','2015/01/01',26),
('a1','2015/02/01',20),
('a3','2015/02/01',14),
('a4','2015/02/01',45),
('a5','2015/02/01',3)
;
with fact_cte as(
select
tDistinctNames.DistinctName AS Name,
tDistinctDates.DistinctDate AS Date,
ISNULL(t.Score,0) AS Score
from
(select distinct name as DistinctName from @facttable) tDistinctNames
cross join
(select distinct date as DistinctDate from @facttable) tDistinctDates
left outer join @facttable t on
t.name = tDistinctNames.DistinctName AND
t.date = tDistinctDates.DistinctDate
)
select *
from fact_cte
The result would be this:
Name Date Score
a1 2015-01-01 00:00:00.000 15
a2 2015-01-01 00:00:00.000 30
a3 2015-01-01 00:00:00.000 26
a4 2015-01-01 00:00:00.000 0
a5 2015-01-01 00:00:00.000 0
a1 2015-02-01 00:00:00.000 20
a2 2015-02-01 00:00:00.000 0
a3 2015-02-01 00:00:00.000 14
a4 2015-02-01 00:00:00.000 45
a5 2015-02-01 00:00:00.000 3
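
The cross join plus left join above amounts to taking the Cartesian product of distinct names and distinct dates and filling in 0 where no fact row exists. A Python sketch of the same idea, using the sample rows from this answer (the function name `zero_fill` is hypothetical):

```python
from itertools import product

facts = [
    ("a1", "2015/01/01", 15),
    ("a2", "2015/01/01", 30),
    ("a3", "2015/01/01", 26),
    ("a1", "2015/02/01", 20),
    ("a3", "2015/02/01", 14),
    ("a4", "2015/02/01", 45),
    ("a5", "2015/02/01", 3),
]


def zero_fill(rows):
    """Return one (name, date, score) row per name/date pair, defaulting to 0."""
    names = sorted({name for name, _, _ in rows})
    dates = sorted({d for _, d, _ in rows})
    scores = {(name, d): score for name, d, score in rows}
    # product() plays the role of CROSS JOIN; .get(..., 0) the role of ISNULL.
    return [(name, d, scores.get((name, d), 0)) for d, name in product(dates, names)]


filled = zero_fill(facts)
for row in filled:
    print(row)
```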

Combine continuous datetime intervals by type

Say we have such a table:
declare @periods table (
s date,
e date,
t tinyint
);
with date intervals without gaps ordered by start date (s)
insert into @periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
All date intervals have different types (t).
It is required to combine date intervals of the same type where they are not broken by intervals of the other types (having all intervals ordered by start date).
So the result table should look like:
s | e | t
------------|------------|-----
2013-01-01 | 2013-01-02 | 3
2013-01-02 | 2013-01-05 | 1
2013-01-05 | 2013-01-08 | 2
2013-01-08 | 2013-01-09 | 1
Any ideas how to do this without cursor?
I've got one working solution:
declare @periods table (
s datetime primary key clustered,
e datetime,
t tinyint,
period_number int
);
insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
declare @t tinyint = null;
declare @PeriodNumber int = 0;
declare @anchor date;
update @periods
set period_number = @PeriodNumber,
@PeriodNumber = case
when @t <> t
then @PeriodNumber + 1
else
@PeriodNumber
end,
@t = t,
@anchor = s
option (maxdop 1);
select
s = min(s),
e = max(e),
t = min(t)
from
@periods
group by
period_number
order by
s;
But I doubt whether I can rely on this behavior of the UPDATE statement.
I use SQL Server 2008 R2.
Edit:
Thanks to Daniel and this article: http://www.sqlservercentral.com/articles/T-SQL/68467/
I found three important things that were missed in the solution above:
There must be a clustered index on the table.
There must be an anchor variable that references the clustered column.
The UPDATE statement should be executed by one processor, i.e. without parallelism.
I've changed the above solution in accordance with these rules.

Since your ranges are continuous, the problem essentially becomes a gaps-and-islands one. If only you had a criterion to help you to distinguish between different sequences with the same t value, you could group all the rows using that criterion, then just take MIN(s), MAX(e) for every group.
One method of obtaining such a criterion is to use two ROW_NUMBER calls. Consider the following query:
SELECT
*,
rnk1 = ROW_NUMBER() OVER ( ORDER BY s),
rnk2 = ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
;
For your example it would return the following set:
s e t rnk1 rnk2
---------- ---------- -- ---- ----
2013-01-01 2013-01-02 3 1 1
2013-01-02 2013-01-04 1 2 1
2013-01-04 2013-01-05 1 3 2
2013-01-05 2013-01-06 2 4 1
2013-01-06 2013-01-07 2 5 2
2013-01-07 2013-01-08 2 6 3
2013-01-08 2013-01-09 1 7 3
The interesting thing about the rnk1 and rnk2 rankings is that if you subtract one from the other, you will get values that, together with t, uniquely identify every distinct sequence of rows with the same t:
s e t rnk1 rnk2 rnk1 - rnk2
---------- ---------- -- ---- ---- -----------
2013-01-01 2013-01-02 3 1 1 0
2013-01-02 2013-01-04 1 2 1 1
2013-01-04 2013-01-05 1 3 2 1
2013-01-05 2013-01-06 2 4 1 3
2013-01-06 2013-01-07 2 5 2 3
2013-01-07 2013-01-08 2 6 3 3
2013-01-08 2013-01-09 1 7 3 4
Knowing that, you can easily apply grouping and aggregation. This is what the final query might look like:
WITH partitioned AS (
SELECT
*,
g = ROW_NUMBER() OVER ( ORDER BY s)
- ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
FROM @periods
)
SELECT
s = MIN(s),
e = MAX(e),
t
FROM partitioned
GROUP BY
t,
g
;
If you like, you can play with this solution at SQL Fiddle.
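
The same row-number-difference trick can be traced in Python (a sketch for illustration, not T-SQL). The function name `merge_islands` is hypothetical; ISO date strings are used so that plain string comparison orders them correctly.

```python
from collections import defaultdict

periods = [
    ("2013-01-01", "2013-01-02", 3),
    ("2013-01-02", "2013-01-04", 1),
    ("2013-01-04", "2013-01-05", 1),
    ("2013-01-05", "2013-01-06", 2),
    ("2013-01-06", "2013-01-07", 2),
    ("2013-01-07", "2013-01-08", 2),
    ("2013-01-08", "2013-01-09", 1),
]


def merge_islands(rows):
    """Group rows by (t, rnk1 - rnk2) and take MIN(s), MAX(e) per group."""
    per_type_rank = defaultdict(int)
    groups = {}
    for rnk1, (s, e, t) in enumerate(sorted(rows), start=1):
        per_type_rank[t] += 1          # ROW_NUMBER() OVER (PARTITION BY t ORDER BY s)
        rnk2 = per_type_rank[t]
        key = (t, rnk1 - rnk2)         # constant within one unbroken run of t
        if key in groups:
            lo, hi = groups[key]
            groups[key] = (min(lo, s), max(hi, e))
        else:
            groups[key] = (s, e)
    return sorted((s, e, t) for (t, _), (s, e) in groups.items())


merged = merge_islands(periods)
for row in merged:
    print(row)
```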

How about this?
declare @periods table (
s datetime primary key,
e datetime,
t tinyint,
s2 datetime
);
insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
update @periods set s2 = s;
while @@ROWCOUNT > 0
begin
update p2 SET s2=p1.s
from @periods p1
join @periods p2 ON p2.t = p1.t AND p2.s2 = p1.e;
end
select s2 as s, max(e) as e, min(t) as t
from @periods
group by s2
order by s2;

A possible solution that avoids both UPDATE and a cursor is to use common table expressions, like this:
declare @periods table (
s date,
e date,
t tinyint
);
insert into @periods values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-06' , '2013-01-07', 2),
('2013-01-07' , '2013-01-08', 2),
('2013-01-08' , '2013-01-09', 1);
with cte as ( select 0 as n
,p.s as s
,p.e as e
,p.t
,case when p2.s is null then 1 else 0 end fl_s
,case when p3.e is null then 1 else 0 end fl_e
from @periods p
left outer join @periods p2
on p2.e = p.s
and p2.t = p.t
left outer join @periods p3
on p3.s = p.e
and p3.t = p.t
union all
select n+1 as n
, p2.s as s
, p.e as e
,p.t
,case when not exists(select * from @periods p3 where p3.e =p2.s and p3.t=p2.t) then 1 else 0 end as fl_s
,p.fl_e as fl_e
from cte p
inner join @periods p2
on p2.e = p.s
and p2.t = p.t
where p.fl_s = 0
union all
select n+1 as n
, p.s as s
, p2.e as e
,p.t
,p.fl_s as fl_s
,case when not exists(select * from @periods p3 where p3.s =p2.e and p3.t=p2.t) then 1 else 0 end as fl_e
from cte p
inner join @periods p2
on p2.s = p.e
and p2.t = p.t
where p.fl_s = 1
and p.fl_e = 0
)
,result as (select s,e,t,COUNT(*) as count_lines
from cte
where fl_e = 1
and fl_s = 1
group by s,e,t
)
select * from result
option(maxrecursion 0)
resultset achieved...
s e t count_lines
2013-01-01 2013-01-02 3 1
2013-01-02 2013-01-05 1 2
2013-01-05 2013-01-08 2 3
2013-01-08 2013-01-09 1 1

Hooray! I've found the solution that suits me and it is done without iteration
with cte1 as (
select s, t from @periods
union all
select max(e), null from @periods
),
cte2 as (
select rn = row_number() over(order by s), s, t from cte1
),
cte3 as (
select
rn = row_number() over(order by a.rn),
a.s,
a.t
from
cte2 a
left join cte2 b on a.rn = b.rn + 1 and a.t = b.t
where
b.rn is null
)
select
s = a.s,
e = b.s,
a.t
from
cte3 a
inner join cte3 b on b.rn = a.rn + 1;
Thanks everyone for sharing your thoughts and solutions!
Details:
cte1 returns the chain of dates with the types after them:
s t
---------- ----
2013-01-01 3
2013-01-02 1
2013-01-04 1
2013-01-05 2
2013-01-06 2
2013-01-07 2
2013-01-08 1
2013-01-09 NULL -- there is no type *after* the last date
cte2 just adds a row number to the above result:
rn s t
---- ---------- ----
1 2013-01-01 3
2 2013-01-02 1
3 2013-01-04 1
4 2013-01-05 2
5 2013-01-06 2
6 2013-01-07 2
7 2013-01-08 1
8 2013-01-09 NULL
if we output all the fields from the query in cte3 without where condition, we get the following results:
select * from cte2 a left join cte2 b on a.rn = b.rn + 1 and a.t = b.t;
rn s t rn s t
---- ---------- ---- ------ ---------- ----
1 2013-01-01 3 NULL NULL NULL
2 2013-01-02 1 NULL NULL NULL
3 2013-01-04 1 2 2013-01-02 1
4 2013-01-05 2 NULL NULL NULL
5 2013-01-06 2 4 2013-01-05 2
6 2013-01-07 2 5 2013-01-06 2
7 2013-01-08 1 NULL NULL NULL
8 2013-01-09 NULL NULL NULL NULL
For the dates where the type is repeated, there are values on the right side of the results. So we can just remove all the rows where values exist on the right side.
So cte3 returns:
rn s t
----- ---------- ----
1 2013-01-01 3
2 2013-01-02 1
3 2013-01-05 2
4 2013-01-08 1
5 2013-01-09 NULL
Note that removing some rows leaves gaps in the rn sequence, so we have to renumber the rows.
From here only one thing is left: to transform the dates into periods:
select
s = a.s,
e = b.s,
a.t
from
cte3 a
inner join cte3 b on b.rn = a.rn + 1;
and we've got the required result:
s e t
---------- ---------- ----
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-08 2
2013-01-08 2013-01-09 1
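
The three steps walked through above (build the chain of boundaries, keep only the rows where the type changes, pair each kept boundary with the next one) can be sketched in Python as follows. This is an illustration under the same assumption as the question, namely intervals without gaps; the name `merge_by_boundaries` is hypothetical.

```python
periods = [
    ("2013-01-01", "2013-01-02", 3),
    ("2013-01-02", "2013-01-04", 1),
    ("2013-01-04", "2013-01-05", 1),
    ("2013-01-05", "2013-01-06", 2),
    ("2013-01-06", "2013-01-07", 2),
    ("2013-01-07", "2013-01-08", 2),
    ("2013-01-08", "2013-01-09", 1),
]


def merge_by_boundaries(rows):
    ordered = sorted(rows)
    # cte1/cte2: every start date with its type, plus the final end date with no type.
    chain = [(s, t) for s, _, t in ordered]
    chain.append((max(e for _, e, _ in ordered), None))
    # cte3: drop a row when the previous row has the same type.
    kept = [row for i, row in enumerate(chain)
            if i == 0 or row[1] != chain[i - 1][1]]
    # Final select: each kept boundary starts a period that ends at the next one.
    return [(a[0], b[0], a[1]) for a, b in zip(kept, kept[1:])]


merged_periods = merge_by_boundaries(periods)
for row in merged_periods:
    print(row)
```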

This is your solution with different data in the table:
declare @periods table (
s datetime primary key,
e datetime,
t tinyint,
period_number int
);
insert into @periods (s, e, t) values
('2013-01-01' , '2013-01-02', 3),
('2013-01-02' , '2013-01-04', 1),
('2013-01-04' , '2013-01-05', 1),
('2013-01-05' , '2013-01-06', 2),
('2013-01-09' , '2013-01-10', 2),
('2013-01-10' , '2013-01-11', 1);
declare @t tinyint = null;
declare @PeriodNumber int = 0;
update @periods
set period_number = @PeriodNumber,
@PeriodNumber = case
when @t <> t
then @PeriodNumber + 1
else
@PeriodNumber
end,
@t = t;
select
s = min(s),
e = max(e),
t = min(t)
from
@periods
group by
period_number
order by
s;
Here there is a gap between
('2013-01-05' , '2013-01-06', 2),
--and
('2013-01-09' , '2013-01-10', 2),
The result set of your solution is:
s e t
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-10 2
2013-01-10 2013-01-11 1
Wasn't the expected result set like this?
s e t
2013-01-01 2013-01-02 3
2013-01-02 2013-01-05 1
2013-01-05 2013-01-06 2
2013-01-09 2013-01-10 2
2013-01-10 2013-01-11 1
Maybe I misunderstood the rule of your problem.

Find the min and max dates between multiple sets of dates

Given the following set of data, I'm trying to determine how I can select the start and end dates of the combined date ranges, when they intersect with each other.
For instance, for PartNum 115678, I would want my final result set to display the date ranges 2012/01/01 - 2012/01/19 (rows 1, 2 and 4 combined, since their date ranges intersect) and 2012/02/01 - 2012/03/28 (row 3, since this one does not intersect with the range found previously).
For PartNum 213275, I would want to select the only row for that part, 2012/12/01 - 2013/01/01.
Edit:
I'm currently playing around with the following SQL statement, but it's not giving me exactly what I need.
with DistinctRanges as (
select distinct
ha1.PartNum "PartNum",
ha1.StartDt "StartDt",
ha2.EndDt "EndDt"
from dbo.HoldsAll ha1
inner join dbo.HoldsAll ha2
on ha1.PartNum = ha2.PartNum
where
ha1.StartDt <= ha2.EndDt
and ha2.StartDt <= ha1.EndDt
)
select
PartNum,
StartDt,
EndDt
from DistinctRanges
Here are the results of the query shown in the edit:

You're better off having a persisted Calendar table, but if you don't, the CTE below will create one ad hoc. The TOP(36600) part gives you roughly 100 years' worth of dates starting from the pivot ('20100101') on the same line.
SQL Fiddle
MS SQL Server 2008 Schema Setup:
create table data (
partnum int,
startdt datetime,
enddt datetime,
age int
);
insert data select
12345, '20120101', '20120116', 15 union all select
12345, '20120115', '20120116', 1 union all select
12345, '20120201', '20120328', 56 union all select
12345, '20120113', '20120119', 6 union all select
88872, '20120201', '20130113', 43;
Query 1:
with Calendar(thedate) as (
select TOP(36600) dateadd(d,row_number() over (order by 1/0),'20100101')
from sys.columns a
cross join sys.columns b
cross join sys.columns c
), tmp as (
select partnum, thedate,
grouper = datediff(d, dense_rank() over (partition by partnum order by thedate), thedate)
from Calendar c
join data d on d.startdt <= c.thedate and c.thedate <= d.enddt
)
select partnum, min(thedate) startdt, max(thedate) enddt
from tmp
group by partnum, grouper
order by partnum, startdt
Results:
| PARTNUM | STARTDT | ENDDT |
------------------------------------------------------------------------------
| 12345 | January, 01 2012 00:00:00+0000 | January, 19 2012 00:00:00+0000 |
| 12345 | February, 01 2012 00:00:00+0000 | March, 28 2012 00:00:00+0000 |
| 88872 | February, 01 2012 00:00:00+0000 | January, 13 2013 00:00:00+0000 |
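
The calendar-based grouping in Query 1 can be traced in Python (a sketch for illustration, not T-SQL): expand every range into individual days, number the distinct days in order, and group by the difference between the day and its rank, which stays constant within a run of consecutive days. The name `merge_ranges` is hypothetical; the sample rows are the ones for partnum 12345 from the schema setup above.

```python
from datetime import date, timedelta


def merge_ranges(ranges):
    """ranges: (start, end) date pairs for one part; returns merged (start, end) pairs."""
    days = set()
    for start, end in ranges:
        d = start
        while d <= end:
            days.add(d)                 # the set removes duplicate days, like dense_rank
            d += timedelta(days=1)
    groups = {}
    for rank, d in enumerate(sorted(days), start=1):
        key = d.toordinal() - rank      # the "grouper": constant for consecutive days
        lo, hi = groups.get(key, (d, d))
        groups[key] = (min(lo, d), max(hi, d))
    return sorted(groups.values())


part_12345 = [
    (date(2012, 1, 1), date(2012, 1, 16)),
    (date(2012, 1, 15), date(2012, 1, 16)),
    (date(2012, 2, 1), date(2012, 3, 28)),
    (date(2012, 1, 13), date(2012, 1, 19)),
]
merged_ranges = merge_ranges(part_12345)
for start, end in merged_ranges:
    print(start.isoformat(), end.isoformat())
```

Note that with this approach two ranges that touch on adjacent days (one ends the day before the other starts) are merged as well, because their days are consecutive.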
