IF statement in SQL - sql-server

I have a table Changes (Id, start_date, end_date) and I want to add two columns, rank_end_date and new_end_date.
The problem with my data is that there is not always continuity (at the month level; the day of the month does not interest me) between an end_date and the start_date that comes right after it (see example 1), so in some cases I need to "stretch" end_date so that the periods are continuous at the month level.
In example 1, the new_end_date is 1/2/2015 and doesn't have to be 28/2/2015. If the end_date in rank 1 is earlier than 31/12/2015, stretch it to 31/12/9999.
Some Examples:
Ex1:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/01/1970  1/1/1980    2              1/2/2015
111  01/03/2015  31/12/9999  1              31/12/9999
Ex2:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/01/1970  1/1/1980    1              31/12/9999
Ex3:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/01/1970  1/1/1980    2              01/05/1990
111  01/05/1990  31/12/1995  1              31/12/9999
Ex4:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/03/2015  31/12/9999  1              31/12/9999
Ex5:
Id   start_date  end_date    rank_end_date  new_end_date
111  01/02/2015  31/5/2015   2              01/5/2015
111  01/06/2015  31/12/9999  1              31/12/9999
The syntax should be something like this, but I don't know how to write those IF statements in SQL:
if rank_end_date ==2 then new_end_date == 1/Month(start_date(rank_end_date - 1)) - 1 /2015
if rank_end_date ==1 then new_end_date == 31/12/2015
else new_end_date = end_date
Select [Id],[StartDate],[EndDate],
Rank_End_Date, case
when t.Rank_End_Date = (2) then
-- How do I choose the StartDate from the record with rank = 1?
-- This selects the start date from the record with rank = 2, of course.
CAST(CAST(Year([StartDate]) AS varchar) + '-' + CAST(Month([StartDate]) AS varchar) + '-' +
CAST(Day([StartDate]) AS varchar) AS DATE)
when t.Rank_End_Date = (1) then '9999-12-31'
else t.[EndDate] end As New_End_Date
from (
Select [Id],[StartDate],[EndDate],
Rank() OVER (PARTITION BY [Id] order by [EndDate] desc) as Rank_End_Date
from [dbo].[Changes]
) t
Could anybody help in achieving the result?

If I've understood your question right, and you can only have values in rank_end_date of 1 or 2 then something like this query should give you the answer you're looking for. Either way, the LEAD (or LAG function if you sort the records ascending) will allow you to fetch the value from a different record.
SELECT ID
, start_date
, end_date
, rank_end_date
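, CASE WHEN rank_end_date = 1 THEN
-- 'YYYYMMDD' literals are read the same way under every DATEFORMAT setting
CASE WHEN end_date < '20151231' THEN '99991231' ELSE end_date END
-- first of the month before the next period starts; partition so rows of different IDs never mix
WHEN rank_end_date = 2 THEN DATEADD(MONTH, -1, LEAD(start_date,1) OVER(PARTITION BY ID ORDER BY rank_end_date DESC))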
END AS new_end_date
FROM dbo.Changes

The LEAD and LAG functions are not available in SQL Server 2008 (they were introduced in SQL Server 2012), so on that version you can try this solution instead.
with CTE as
(
Select [Id] as ID,[StartDate] as StartDate,[EndDate] as EndDate,
ROW_NUMBER() OVER (PARTITION BY [Id] order by [StartDate] DESC) as rn_Start_Date
from [dbo].[Changes]
)
Select C1.[Id] , C1.[StartDate], C1.[EndDate], C1.rn_Start_Date as Rank_end_date,
ISNULL(DATEADD(MONTH, DATEDIFF(MONTH, 0, C2.[StartDate])-1, 0), cast('9999-12-31' as DATE)) As New_End_Date
From CTE C1
LEFT JOIN CTE C2 ON C1.[ID] = C2.[ID] AND C1.Rn_Start_Date = C2.Rn_Start_Date + 1
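To check the intended rule outside the database, the logic of the self-join above can be modelled in a few lines of Python. This is only a sketch: the function names and the tuple layout are made up for illustration, and it assumes the rule shown in Ex1 and Ex5 (every period except the latest one per Id ends on the first day of the month before the next period starts, and the latest period is stretched to 31/12/9999).

```python
from datetime import date
from itertools import groupby

FAR_FUTURE = date(9999, 12, 31)


def first_of_previous_month(d):
    """Return the first day of the month before d's month."""
    if d.month == 1:
        return date(d.year - 1, 12, 1)
    return date(d.year, d.month - 1, 1)


def stretch_end_dates(rows):
    """rows: iterable of (id, start_date, end_date) tuples (hypothetical layout).

    Returns a list of (id, start_date, end_date, rank, new_end_date),
    where rank 1 is the period with the latest start_date for that id.
    """
    out = []
    ordered = sorted(rows, key=lambda r: (r[0], r[1]))
    for _id, grp in groupby(ordered, key=lambda r: r[0]):
        periods = list(grp)  # ascending by start_date within one id
        n = len(periods)
        for i, (rid, start, end) in enumerate(periods):
            rank = n - i  # the latest period gets rank 1
            if i + 1 < n:
                # stretch (or shrink) to the month before the next start
                new_end = first_of_previous_month(periods[i + 1][1])
            else:
                new_end = FAR_FUTURE
            out.append((rid, start, end, rank, new_end))
    return out
```

Feeding it the two rows of Ex1 gives rank 2 with new_end_date 2015-02-01 for the first row and rank 1 with 9999-12-31 for the second.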

Related

T-SQL find rows with dates in correct order

I want to somehow mark the rows for each Class number in the example below that share a date with another row, or whose dates directly follow each other. I have been trying to accomplish this for way too long, but really have nothing of value to share with you...
Having the following sample data:
Date Class
2016-10-17 00:00:00.000 1
2016-10-20 00:00:00.000 1
2016-10-18 00:00:00.000 1
2016-10-25 00:00:00.000 1
2016-10-19 00:00:00.000 2
2016-10-19 00:00:00.000 2
2016-10-28 00:00:00.000 2
2016-10-25 00:00:00.000 3
With the logic above, it should produce:
This works.
drop table if exists dbo.test_table;
go
create table dbo.test_table(
[Date] date not null,
Class int not null)
insert dbo.test_table([Date], Class) values
('2016-10-17',1),
('2016-10-20',1),
('2016-10-18',1),
('2016-10-25',1),
('2016-10-19',2),
('2016-10-19',2),
('2016-10-28',2),
('2016-10-25',3);
select tt.*,
iif(datediff(day, tt.[Date], lead([Date]) over (partition by Class order by [Date])) in (0,1), 1, 0)+
iif(datediff(day, lag([Date]) over (partition by Class order by [Date]), tt.[Date]) in (0,1), 1, 0) Marked
from dbo.test_table tt;
Hmmm . . . I think you want conditional logic on the results of window functions:
select t.*,
(case when count(*) over (partition by class, date) > 1
then 1
when lag(date) over (partition by class order by date) = dateadd(day, -1, date)
then 1
when lead(date) over (partition by class order by date) = dateadd(day, 1, date)
then 1
else 0
end) as mark
from t;
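As a cross-check of what the window-function queries are meant to flag, here is a small Python model of the rule. It is a sketch under one assumption, since the expected output is missing from the question: a row is marked when another row of the same class has the same date or a date exactly one day away. The function name `mark_rows` is invented for this example.

```python
from collections import defaultdict
from datetime import date


def mark_rows(rows):
    """rows: list of (date, class) tuples. Returns a list of 0/1 flags in input order."""
    by_class = defaultdict(list)
    for idx, (d, cls) in enumerate(rows):
        by_class[cls].append((d, idx))

    marks = [0] * len(rows)
    for members in by_class.values():
        members.sort()  # order by date within the class, like ORDER BY [Date]
        for pos, (d, idx) in enumerate(members):
            neighbours = []
            if pos > 0:
                neighbours.append(members[pos - 1][0])  # plays the role of LAG
            if pos + 1 < len(members):
                neighbours.append(members[pos + 1][0])  # plays the role of LEAD
            # same day (0) or adjacent day (1) counts as a match
            if any(abs((d - other).days) <= 1 for other in neighbours):
                marks[idx] = 1
    return marks
```

Only the immediate neighbours in date order need to be checked, because any other row of the class is at least as far away as they are.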

MSSQL: Create incremental row label per group

In my table, I have a primary key and a date. What I'd like to achieve is to have an incremental label based on whether or not there is a break between the dates - column Goal.
Below is an example. The break column was calculated using the LEAD function (I thought it might help).
I am able to solve it using T-SQL, but this would be last resort. Nothing I tried has worked so far. I am using MSSQL 2014.
PK | Date | break | Goal |
-------------------------------
1 | 03/2017 | 0 | 1 |
1 | 04/2017 | 0 | 1 |
1 | 08/2017 | 1 | 2 |
1 | 09/2017 | 0 | 2 |
1 | 10/2017 | 0 | 2 |
1 | 02/2018 | 1 | 3 |
1 | 03/2018 | 0 | 3 |
Here is code to reproduce this example:
CREATE TABLE #test
(
ConsumerId INT,
FullDate DATE,
Goal INT
)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-03-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-04-01',1)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-08-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-09-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2017-10-01',2)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-02-01',3)
INSERT INTO #test (ConsumerId, FullDate, Goal) VALUES (1,'2018-03-01',3)
SELECT ConsumerId,
FullDate,
CASE WHEN (datediff(month,
isnull(
LEAD (FullDate,1) OVER (PARTITION BY ConsumerId ORDER BY FullDate DESC),
FullDate),
FullDate) > 1)
THEN 1
ELSE 0
END AS break,
Goal
FROM #test
ORDER BY FullDate ASC
EDIT
This is apparently a well-known problem called "gaps and islands", as pointed out in the comments. Google offers many solutions, and there are other questions about it here on SO.
Try this...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
An explanation of the code and how it works...
The 1st query, in cte_TestGap, uses the LAG function along with the ROW_NUMBER() function to mark the location of each gap in the data. We can see that by breaking it out and looking at its results...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
)
SELECT * FROM cte_TestGap;
cte_TestGap results...
ConsumerId FullDate Gap
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 0
1 2017-08-01 3
1 2017-09-01 0
1 2017-10-01 0
1 2018-02-01 6
1 2018-03-01 0
At this point we want the 0 values to take on the value of the preceding non-0 values, allowing them to be grouped together. This is done in the 2nd query (cte_SmearGap) using the MAX function with a "window frame". So if we look at the output of cte_SmearGap, we can see that...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT * FROM cte_SmearGap;
cte_SmearGap results...
ConsumerId FullDate GV
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 3
1 2017-09-01 3
1 2017-10-01 3
1 2018-02-01 6
1 2018-03-01 6
At this point all of the rows are in distinct groups, but we'd like our group numbers to be a contiguous sequence (1,2,3) as opposed to (1,3,6).
That's easy enough to fix using the DENSE_RANK() function, which is what's happening in the final select...
WITH
cte_TestGap AS (
SELECT
t.ConsumerId, t.FullDate,
Gap = CASE
WHEN DATEDIFF(mm, t.FullDate, LAG(t.FullDate, 1) OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)) = -1
THEN 0
ELSE ROW_NUMBER() OVER (PARTITION BY t.ConsumerId ORDER BY t.FullDate)
END
FROM
#test t
),
cte_SmearGap AS (
SELECT
tg.ConsumerId, tg.FullDate,
GV = MAX(tg.Gap) OVER (PARTITION BY tg.ConsumerId ORDER BY tg.FullDate ROWS UNBOUNDED PRECEDING)
FROM
cte_TestGap tg
)
SELECT
sg.ConsumerId, sg.FullDate,
GroupValue = DENSE_RANK() OVER (PARTITION BY sg.ConsumerId ORDER BY sg.GV)
FROM
cte_SmearGap sg;
The end result...
ConsumerId FullDate GroupValue
----------- ---------- --------------------
1 2017-03-01 1
1 2017-04-01 1
1 2017-08-01 2
1 2017-09-01 2
1 2017-10-01 2
1 2018-02-01 3
1 2018-03-01 3
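For readers who want to see the grouping rule without the window-function machinery, the same result can be modelled in plain Python. This is a sketch, not a translation of the CTEs: it assumes, as the query does, that a row continues the current group only when it falls exactly one calendar month after the previous row of the same consumer. The helper names are invented.

```python
from datetime import date


def month_index(d):
    """Number the months consecutively so that adjacent months differ by 1."""
    return d.year * 12 + d.month


def label_islands(rows):
    """rows: list of (consumer_id, full_date). Returns (consumer_id, full_date, group)."""
    out = []
    prev_month = {}  # consumer_id -> month index of the previous row
    group = {}       # consumer_id -> current group number
    for cid, d in sorted(rows):
        m = month_index(d)
        # start a new group on the first row or whenever there is a break
        if cid not in prev_month or m - prev_month[cid] != 1:
            group[cid] = group.get(cid, 0) + 1
        prev_month[cid] = m
        out.append((cid, d, group[cid]))
    return out
```

With the seven sample dates for ConsumerId 1, the group column comes out as 1, 1, 2, 2, 2, 3, 3, which matches the Goal column in the question.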
The comment from David Browne was actually extremely useful. If you google "Islands and Gaps", there are many variations of the solution. Below is the one I liked the most.
In the end, I needed the Goal column to be able to group the dates into MIN/MAX. This solution skips this step and directly creates the aggregated range.
Here is the source.
SELECT MIN(FullDate) AS range_start,
MAX(FullDate) AS range_end
FROM (
SELECT FullDate,
DATEADD(MM, -1 * ROW_NUMBER() OVER(ORDER BY FullDate), FullDate) AS grp
FROM #test
) a
GROUP BY a.grp
And the output:
range_start | range_end |
--------------------------
2017-03-01 | 2017-04-01 |
2017-08-01 | 2017-10-01 |
2018-02-01 | 2018-03-01 |
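The reason this works is that, within a run of consecutive months, the date and the row number grow in step, so "date minus row number" stays constant and can serve as a grouping key. A minimal Python sketch of that idea follows; it assumes one row per month and a single consumer, like the query above, and the function name is made up.

```python
from datetime import date


def month_ranges(dates):
    """dates: distinct month dates. Returns a list of (range_start, range_end)."""
    groups = {}
    for rn, d in enumerate(sorted(dates), start=1):
        # month number minus row number is constant inside one island
        key = d.year * 12 + d.month - rn
        groups.setdefault(key, []).append(d)
    # dicts keep insertion order, so the ranges come out chronologically
    return [(min(g), max(g)) for g in groups.values()]
```

Note that duplicate months would break the key, which is why the sketch (and the SQL) relies on the dates being distinct.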

SQL Query time spent between certain value

I have a database with all temperatures from the last 10 years.
Now I want to find all periods where the temperature was at or above, e.g., 15 degrees.
Simplified example:
...
2015-05-10 12
2015-05-11 15 |
2015-05-12 16 |
2015-05-13 17 |
2015-05-14 16 |
2015-05-15 15 |
2015-05-16 12
2015-05-17 11
2015-05-18 15 |
2015-05-19 12
2015-05-20 18 |
...
So now I want to get all time periods like this:
Min Max
2015-05-11 2015-05-15
2015-05-18 2015-05-18
2015-05-20 2015-05-20
Any suggestions on what this query would look like?
You could use a CTE:
CREATE TABLE #Date (DateT datetime, Value int )
INSERT INTO #Date
VALUES ('2015-05-10',12),
('2015-05-11',15),
('2015-05-12',16),
('2015-05-13',17),
('2015-05-14',16),
('2015-05-15',15),
('2015-05-16',12),
('2015-05-17',11),
('2015-05-18',15),
('2015-05-19',12),
('2015-05-20',18)
;WITH t AS (
SELECT DateT d,ROW_NUMBER() OVER(ORDER BY DateT) i
FROM #Date
WHERE Value >= 15
GROUP BY DateT
)
SELECT MIN(d) as DataStart,MAX(d) as DataFinal, ROW_NUMBER() OVER(ORDER BY DATEDIFF(day,i,d)) as RN
FROM t
GROUP BY DATEDIFF(day,i,d)
The RN column is optional; you could use:
SELECT MIN(d) as DataStart,MAX(d) as DataFinal
FROM t
GROUP BY DATEDIFF(day,i,d)
Here is a solution using a gaps and islands algorithm. It looks kind of bulky, but it runs fast and scales well. It is also modular: you can add a gap-allowed parameter, or rewrite it to partition by some other columns, and it still performs nicely.
Inspired by Peter Larsson's post here: http://www.sqltopia.com/?page_id=83
WITH [theSource](Col1,Col2)
AS
(
SELECT Col1,Col2 FROM (VALUES
('2015-05-10',12),
('2015-05-11',15),
('2015-05-12',16),
('2015-05-13',17),
('2015-05-14',16),
('2015-05-15',15),
('2015-05-16',12),
('2015-05-17',11),
('2015-05-18',15),
('2015-05-19',12),
('2015-05-20',18)
) as x(Col1,Col2)
)
,filteredSource([Value])
AS
(
SELECT Col1 as [Value]
FROM theSource WHERE Col2 >= 15
)
,cteSource(RangeStart, RangeEnd)
AS (
SELECT RangeStart,
CASE WHEN [RangeStart] = [RangeEnd] THEN [RangeEnd] ELSE LEAD([RangeEnd]) OVER (ORDER BY Value) END AS [RangeEnd]
FROM (
SELECT [Value],
CASE
WHEN DATEADD(DAY,1,LAG([Value]) OVER (ORDER BY [Value])) >= [Value] THEN NULL
ELSE [Value]
END AS RangeStart,
CASE
WHEN DATEADD(DAY,-1,LEAD([Value]) OVER (ORDER BY [Value])) <= [Value] THEN NULL
ELSE [Value]
END AS RangeEnd
FROM filteredSource
) AS d
WHERE RangeStart IS NOT NULL
OR RangeEnd IS NOT NULL
)
SELECT RangeStart AS [Min],
RangeEnd AS [Max]
FROM cteSource
WHERE RangeStart IS NOT NULL;
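The boundary idea behind this query (a day starts a range when the previous qualifying day is not adjacent, and ends one when the next qualifying day is not adjacent) can also be written as a single pass in Python. This is a sketch for illustration only; the function name and the default threshold of 15 are assumptions taken from the question's example, not part of either answer.

```python
from datetime import date, timedelta


def warm_periods(readings, threshold=15):
    """readings: list of (day, temperature).

    Returns (first_day, last_day) for every run of consecutive days
    whose temperature is at or above the threshold.
    """
    days = sorted(d for d, temp in readings if temp >= threshold)
    periods = []
    for d in days:
        if periods and d - periods[-1][1] <= timedelta(days=1):
            periods[-1][1] = d       # adjacent to the open run: extend it
        else:
            periods.append([d, d])   # gap found: start a new run
    return [tuple(p) for p in periods]
```

Run against the eleven sample readings, it yields the three periods asked for: 2015-05-11 to 2015-05-15, then 2015-05-18 and 2015-05-20 as single-day periods.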

SQL count where between dates by month

Consider the below data:
ID Reference Manager LeaseFirstStart LeaseStop
1 KLEIN John 2008-04-02 00:00:00.000 2010-04-01 00:00:00.000
2 HAWKER John 2008-12-18 00:00:00.000 2010-09-17 00:00:00.000
3 SLEEP Bob 2008-01-23 00:00:00.000 2009-01-22 00:00:00.000
4 CODD Bob 2009-08-03 00:00:00.000 2010-08-02 00:00:00.000
5 ALLEN Bob 2008-01-30 00:00:00.000 2009-07-31 00:00:00.000
The earliest month is Jan 2008 and the latest month is Sep 2010.
How can I count the number of leases that were current per month? The output should look like this:
Month Number of Leases
2008-01 2
2008-02 2
2008-03 2
2008-04 3
2008-05 3
2008-06 3
2008-07 3
2008-08 4
… …
Ultimately, I want to use the answer to the question to create the dataset below for use in excel by the user so they can see who had how many leases during the data period.
Month Manager Number of Leases
2008-01 Bob 2
2008-01 John 0
2008-02 Bob 2
2008-02 John 0
2008-03 Bob 2
2008-03 John 0
2008-04 Bob 2
2008-04 John 1
2008-05 Bob 2
2008-05 John 1
2008-06 Bob 2
2008-06 John 1
2008-07 Bob 2
2008-07 John 1
2008-08 Bob 3
2008-08 John 1
… … …
I know I've done it before, but it was a long time ago and I remember it being messy. Thanks in advance!
select sum (no) as no,datet from ( SELECT COUNT (*) as no ,(convert(varchar,datepart (yyyy,[ Start] )) + '-' + convert(varchar, MONTH([ Start] ))) as datet
FROM <tbl>
GROUP BY (convert(varchar,datepart (yyyy,[ Start] )) + '-' + convert(varchar, MONTH([ Start] )))
union SELECT COUNT (*) as no ,(convert(varchar,datepart (yyyy,[ End] )) + '-' + convert(varchar, MONTH([ End] ))) as datet
FROM <tbl>
GROUP BY (convert(varchar,datepart (yyyy,[ End] )) + '-' + convert(varchar, MONTH([ End] )) ) ) t
This is a very logical question. I finally created the SQL that gives the desired result; I verified every date and month count and it is all OK.
Declare @t table (ID int, Reference varchar(50), Manager varchar(50),LeaseFirstStart datetime,LeaseStop datetime)
insert into @t
values
(1,'KLEIN','John','2008-04-02 00:00:00.000','2010-04-01 00:00:00.000'),
(2,'HAWKER','John','2008-12-18 00:00:00.000','2010-09-17 00:00:00.000'),
(3,'SLEEP','Bob','2008-01-23 00:00:00.000','2009-01-22 00:00:00.000'),
(4,'CODD','Bob','2009-08-03 00:00:00.000','2010-08-02 00:00:00.000'),
(5,'ALLEN','Bob','2008-02-28 00:00:00.000','2009-07-31 00:00:00.000')
declare @lowerdate datetime , @currentdt datetime
select @lowerdate = min(leasefirststart), @currentdt= max(leasestop) from @t
;with cte as
(
select firstday,DATEADD(d, -1, DATEADD(m, DATEDIFF(m, 0, FirstDay) + 1, 0)) Lastday, mng from
( select dateadd(m,datediff(m,0,@lowerdate)+v.number,0) as FirstDay
From master..spt_values v
Where v.type='P' and v.number between 0 and datediff(m, @lowerdate, @currentdt)
) as a
, (select distinct manager mng from @t ) as b
)
select (convert(varchar,datepart (yyyy,FirstDay )) + '-' + convert(varchar, MONTH(FirstDay ))) MonthAndYear ,mng as mng , count( manager ) cnt
from cte
left join @t on
(
firstday between LeaseFirstStart and LeaseStop
or
Lastday between LeaseFirstStart and LeaseStop
) and cte.mng = Manager
group by firstday, mng
order by FirstDay
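To make the counting rule explicit, here is a small Python model of the month-by-manager grid. It is a sketch, not a port of the query: it assumes a lease is "current" in a month when any part of the lease falls in that month (the query above only tests the first and last day of the month), and the function names are made up. Months with no leases for a manager are kept with a count of 0, as in the desired output.

```python
from datetime import date


def month_starts(first, last):
    """Yield the first day of every month from first's month through last's month."""
    y, m = first.year, first.month
    while (y, m) <= (last.year, last.month):
        yield date(y, m, 1)
        y, m = (y + 1, 1) if m == 12 else (y, m + 1)


def leases_per_month(leases):
    """leases: list of (manager, start, stop). Returns {('YYYY-MM', manager): count}."""
    managers = sorted({mgr for mgr, _, _ in leases})
    first = min(start for _, start, _ in leases)
    last = max(stop for _, _, stop in leases)
    counts = {}
    for month in month_starts(first, last):
        ym = (month.year, month.month)
        for mgr in managers:
            # compare at month granularity: start month <= this month <= stop month
            counts[(month.strftime("%Y-%m"), mgr)] = sum(
                1
                for who, start, stop in leases
                if who == mgr and (start.year, start.month) <= ym <= (stop.year, stop.month)
            )
    return counts
```

Using the five leases from the question, January 2008 gives Bob 2 and John 0, and April 2008 gives John 1.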

SQL query to split records by intervals

Let's assume I have a table which has columns From and To which are dates and a bit type column which identifies whether it is a cancel (1 = cancel). Also an Id which is a PK and CancelId which references what is cancelled.
Let's say I have records which look like:
Id From To IsCancel CancelId
1 2015-01-01 2015-01-31 0 NULL
2 2015-01-03 2015-01-09 1 1
3 2015-01-27 2015-01-31 1 1
I am expecting the result to show which intervals of the non-cancel records are still uncancelled:
Id From To
1 2015-01-01 2015-01-02
1 2015-01-10 2015-01-26
I can make it so it would split each record into dates, then subtract cancelled dates from the records then merge the intervals but since I have quite a lot of records, I find this very inefficient and am pretty sure that I am overlooking something simple.
The task you want to achieve is non-trivial. A possible solution involves placing all From / To dates in an ordered sequence. The following UNPIVOT operation:
SELECT ID, EventDate, StartStop,
ROW_NUMBER() OVER (ORDER BY EventDate, StartStop) AS EventRowNum,
IsCancel
FROM
(SELECT ID, IsCancel, [From], [To]
FROM Event) Src
UNPIVOT (
EventDate FOR StartStop IN ([From], [To])
) AS Unpvt
produces this result set:
ID EventDate StartStop EventRowNum IsCancel
--------------------------------------------------
1 2015-01-01 From 1 0
2 2015-01-03 From 2 1
2 2015-01-09 To 3 1
3 2015-01-27 From 4 1
3 2015-01-31 To 5 1
1 2015-01-31 To 6 0
Using a CTE, you can then simulate the LEAD function (available from SQL Server 2012 onwards) in order to place the current and the next date from the sequence above in a single record:
;WITH StretchEventDates AS
(
-- above query goes here
), CTE AS
(
SELECT s.ID, s.EventDate, s.StartStop, s.IsCancel,
sLead.EventDate As LeadEventDate, sLead.StartStop AS LeadStartStop, sLead.IsCancel AS LeadIsCancel
FROM StretchEventDates AS s
LEFT JOIN StretchEventDates AS sLead ON s.EventRowNum + 1 = sLead.EventRowNum
)
The above produces the following result set:
ID EventDate StartStop IsCancel LeadEventDate LeadStartStop LeadIsCancel
--------------------------------------------------------------------------------------
1 2015-01-01 From 0 2015-01-03 From 1
2 2015-01-03 From 1 2015-01-09 To 1
2 2015-01-09 To 1 2015-01-27 From 1
3 2015-01-27 From 1 2015-01-31 To 1
3 2015-01-31 To 1 2015-01-31 To 0
1 2015-01-31 To 0 NULL NULL NULL
Using CASE statements you can filter these records in order to get the desired output.
Putting it all together:
;WITH StretchEventDates AS
(
SELECT ID, EventDate, StartStop,
ROW_NUMBER() OVER (ORDER BY EventDate, StartStop) AS EventRowNum,
IsCancel
FROM
(SELECT ID, IsCancel, [From], [To]
FROM Event) Src
UNPIVOT (
EventDate FOR StartStop IN ([From], [To])
) AS Unpvt
), CTE AS
(
SELECT s.ID, s.EventDate, s.StartStop, s.IsCancel,
sLead.EventDate As LeadEventDate, sLead.StartStop AS LeadStartStop, sLead.IsCancel AS LeadIsCancel
FROM StretchEventDates AS s
LEFT JOIN StretchEventDates AS sLead ON s.EventRowNum + 1 = sLead.EventRowNum
), CTE_FINAL AS
(SELECT *,
CASE WHEN StartStop = 'From' AND IsCancel = 0 THEN EventDate
WHEN StartStop = 'To' AND IsCancel = 1 THEN DATEADD(d, 1, EventDate)
END AS [From],
CASE WHEN LeadStartStop = 'From' AND LeadIsCancel = 1 THEN DATEADD(d, -1, LeadEventDate)
WHEN LeadStartStop = 'To' AND LeadIsCancel = 0 THEN LeadEventDate
END AS [To]
FROM CTE
)
SELECT ID, [From], [To]
FROM CTE_FINAL
WHERE [From] IS NOT NULL AND [To] IS NOT NULL AND [From] <= [To]
You may have to add additional CASEs in the query above to handle additional combinations of 'cancelations' following 'non-canceled' (and vice-versa) events.
With the data provided in the OP the above yields the following output:
ID From To
---------------------------
1 2015-01-01 2015-01-02
2 2015-01-10 2015-01-26
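For comparison, the underlying operation (subtracting the cancelled intervals from the original one) is short when written procedurally. The Python sketch below is an illustration only, not the answer's method: the function name is hypothetical, and it assumes inclusive day-level intervals and that the cancellations passed in all belong to the same non-cancel record.

```python
from datetime import date, timedelta

ONE_DAY = timedelta(days=1)


def subtract_cancellations(start, end, cancels):
    """Return the parts of the inclusive interval [start, end]
    that are not covered by any inclusive (from, to) interval in cancels."""
    remaining = []
    cursor = start  # first day not yet known to be cancelled or kept
    for c_from, c_to in sorted(cancels):
        if c_to < cursor or c_from > end:
            continue  # cancellation lies entirely outside what is left
        if c_from > cursor:
            remaining.append((cursor, c_from - ONE_DAY))
        cursor = max(cursor, c_to + ONE_DAY)
    if cursor <= end:
        remaining.append((cursor, end))
    return remaining
```

For record 1 (2015-01-01 to 2015-01-31) with the two cancellations from the question, it returns 2015-01-01 to 2015-01-02 and 2015-01-10 to 2015-01-26. Overlapping cancellations are handled by the max() on the cursor.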
