Find mathematical complement of a set of intervals with SQL

I'm pondering the following problem, which needs to be solved in SQL. Let there be an interval [a, b] of natural numbers, and a (finite) set A of intervals that are all subsets of [a, b]. We want to determine the complement of A, that is, a set of intervals B such that A + B = [a, b] and the intervals of A and B are pairwise disjoint.
For example: given [a, b] = the days of 2017 ("all days") and the intervals March-June, April, April-July and November ("possible days"), produce the intervals Jan-Feb, Aug-Oct and Dec ("impossible days"). All intervals are, or should be, defined by a start date and an end date.
I tried the following: produce a calendar of 2017 and check for every day whether it is contained in none of the intervals. From these days, construct the corresponding intervals. So far it seems complicated, and I'm starting to think that this approach is a poor fit for SQL. But maybe it's just my implementation. What do you think? Do you know a better way?
Greetings from Frankfurt,
Johannes

So long as you're working with a known fixed range then you can easily find the candidates to be a start or end that is outside of any current interval. And then just pair those up based on date order:
declare @RangeStart date
declare @RangeEnd date
select @RangeStart = '20170101', @RangeEnd = '20171231'
declare @intervals table (
    StartAt date not null,
    EndAt date not null
)
insert into @intervals (StartAt,EndAt) values
('20170301','20170630'),
('20170401','20170430'),
('20170401','20170731'),
('20171101','20171130')
;With Starts as (
    select
        @RangeStart as StartDT
    where
        not exists (select * from @intervals i where @RangeStart between i.StartAt and i.EndAt) --Start outside an interval
    union all
    select
        DATEADD(day,1,i1.EndAt)
    from
        @intervals i1
            left join
        @intervals i2
            on
                DATEADD(day,1,i1.EndAt) between i2.StartAt and i2.EndAt --No succeeding interval
    where
        i2.EndAt is null
), Ends as (
    select
        @RangeEnd as EndDT
    where
        not exists (select * from @intervals i where @RangeEnd between i.StartAt and i.EndAt) --End outside an interval
    union all
    select
        DATEADD(day,-1,i1.StartAt)
    from
        @intervals i1
            left join
        @intervals i2
            on
                DATEADD(day,-1,i1.StartAt) between i2.StartAt and i2.EndAt --No preceding interval
    where
        i2.StartAt is null
), OrderedStarts as (
    select StartDT,ROW_NUMBER() OVER (ORDER BY StartDT) as rn
    from Starts where StartDT between @RangeStart and @RangeEnd
), OrderedEnds as (
    select EndDT,ROW_NUMBER() OVER (ORDER BY EndDT) as rn
    from Ends where EndDT between @RangeStart and @RangeEnd
)
select
    os.StartDT,oe.EndDT
from
    OrderedStarts os
        inner join
    OrderedEnds oe
        on
            os.rn = oe.rn
Result:
StartDT EndDT
---------- ----------
2017-01-01 2017-02-28
2017-08-01 2017-10-31
2017-12-01 2017-12-31
That is - valid start dates are the start of our range or the day after any other interval, provided that doesn't overlap with another interval. Similarly for valid ends.
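For comparison, here is a minimal sketch of a different way to get the same gaps, using a running maximum of the end dates instead of pairing starts and ends. It is not part of the answer above. It assumes SQL Server 2012 or later (for the window frame clause), that every interval lies inside the range as the question states, and it adds two sentinel rows just outside the range, which is my own device for catching gaps at either end.

```sql
DECLARE @RangeStart date = '20170101', @RangeEnd date = '20171231';
DECLARE @intervals table (StartAt date not null, EndAt date not null);
INSERT INTO @intervals (StartAt, EndAt) VALUES
('20170301','20170630'),
('20170401','20170430'),
('20170401','20170731'),
('20171101','20171130');

WITH Bounded AS (
    SELECT StartAt, EndAt FROM @intervals
    UNION ALL  -- sentinel: a one-day interval just before the range
    SELECT DATEADD(day,-1,@RangeStart), DATEADD(day,-1,@RangeStart)
    UNION ALL  -- sentinel: a one-day interval just after the range
    SELECT DATEADD(day,1,@RangeEnd), DATEADD(day,1,@RangeEnd)
), Running AS (
    SELECT StartAt,
           -- latest end date among all intervals that sort before this one
           MAX(EndAt) OVER (ORDER BY StartAt, EndAt
                            ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS PrevMaxEnd
    FROM Bounded
)
SELECT DATEADD(day,1,PrevMaxEnd) AS StartDT,
       DATEADD(day,-1,StartAt)   AS EndDT
FROM Running
WHERE DATEADD(day,1,PrevMaxEnd) < StartAt   -- at least one uncovered day in between
ORDER BY StartDT;
-- 2017-01-01  2017-02-28
-- 2017-08-01  2017-10-31
-- 2017-12-01  2017-12-31
```

A gap exists exactly where an interval starts more than one day after everything before it has ended, so each interval needs only one comparison against the running maximum.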

Related

Generate Missing date and add zero value and add one to existing date in sql

I am new to SQL and I need to generate the missing dates with a value of 0, and give the existing dates a value of 1.
I have looked at many examples of generating missing dates in SQL. All of them show how to add 0 for a missing date, but none shows how to add 1 for an existing date.
I have a table called Alarm with a column Alarm_Start, which contains the data below.
2019-03-24 11:36:24.000
2019-03-25 07:47:49.000
2019-03-27 09:40:39.000
2019-03-29 10:04:43.000
The result I need contains only the date and a 0 or 1:
2019-03-24 1
2019-03-25 1
2019-03-26 0
2019-03-27 1
2019-03-28 0
2019-03-29 1

Please try the following:
DECLARE @StartDateTime DATETIME
DECLARE @EndDateTime DATETIME
SET @StartDateTime = '20190101'
SET @EndDateTime = '20191231';
-- Recursive CTE that produces one row per day in the range
WITH DateRange(DateData) AS
(
    SELECT @StartDateTime as Date
    UNION ALL
    SELECT DATEADD(d,1,DateData)
    FROM DateRange
    WHERE DateData < @EndDateTime
)
SELECT DateRange.DateData, CASE WHEN Your_table.DateCol IS NULL THEN 0 ELSE 1 END AS NUM
FROM DateRange LEFT OUTER JOIN
    (VALUES ('20190101'),('20190103')) AS Your_table(DateCol)
    ON DateRange.DateData = CAST(Your_table.DateCol AS date)
OPTION (MAXRECURSION 0)

To rephrase your question, you are looking for output that shows whether or not a date within a range has a record in your table (output = 1), or does not have a record (output = 0).
Assumption: You will have a start and end date for your query, e.g. ... BETWEEN '2019-03-24' AND '2019-03-29'
The easiest way to do this is with a "Tally Table", also called a "Numbers Table". This is a table object containing a sequence of numbers, starting with 0 or 1, and ending at whatever number you need. For this example, I will create a dynamic tally table, but you may find that you want to keep a permanent tally table somewhere in your database so you don't have to create it on the fly each time.
DECLARE @startDate date = '2019-03-24'
DECLARE @endDate date = '2019-03-29'
-- Get the number of days between start and end date
DECLARE @days int
SET @days = datediff (day, @startDate, @endDate) + 1 -- Add 1 so you have six days total
-- Build the tally table
-- NOTE: Must use SELECT...INTO to use the IDENTITY function.
-- DECLARE @Tally TABLE (N int not null primary key)
-- INSERT INTO @Tally (N)
SELECT TOP (@days)
    IDENTITY(INT,0,1) AS N
INTO #Tally
FROM master.sys.syscolumns sc1
CROSS JOIN master.sys.syscolumns sc2
-- NOTE: there are other ways to create a tally table. This is just one example
-- NOTE: In SQL 2016, there are more than 15,000 rows in the master.sys.syscolumns table.
-- The cross join may be unnecessary for your needs. If so, you can rewrite this as:
-- SELECT TOP (@days)
--     IDENTITY(INT,0,1) AS N
-- INTO #Tally
-- FROM master.sys.syscolumns sc
-- Since your dynamic tally table has only the number of entries you need, no special
-- filtering on the table is needed. However, if you have too many rows, an index
-- on the N field will help. Simply use CREATE INDEX Idx1 ON #Tally(N)
;WITH Dates as (
    SELECT Dateadd(day, t.N, @startDate) As CheckDate
    FROM #Tally t
)
SELECT CheckDate,
    CASE
        WHEN EXISTS (SELECT * FROM Alarm WHERE Convert(date, Alarm_Start) = CheckDate)
        Then 1
        Else 0
    END As Alarm_Exists
FROM Dates
ORDER BY CheckDate
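The tally-table idea described above can also be sketched without a temp table, by numbering rows inline with ROW_NUMBER. This is an illustrative variant, not the answer's code: the @Alarm table variable stands in for the asker's Alarm table, and sys.all_columns is assumed only as a convenient source of enough rows.

```sql
-- Hypothetical stand-in for the asker's Alarm table
DECLARE @Alarm table (Alarm_Start datetime not null);
INSERT INTO @Alarm (Alarm_Start) VALUES
('2019-03-24T11:36:24'),
('2019-03-25T07:47:49'),
('2019-03-27T09:40:39'),
('2019-03-29T10:04:43');

DECLARE @startDate date = '2019-03-24', @endDate date = '2019-03-29';

WITH Tally AS (
    -- N = 0,1,2,... one row per day in the range
    SELECT TOP (DATEDIFF(day, @startDate, @endDate) + 1)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int) AS N
    FROM sys.all_columns
)
SELECT DATEADD(day, t.N, @startDate) AS CheckDate,
       CASE WHEN EXISTS (SELECT 1
                         FROM @Alarm a
                         -- half-open day range, so the column itself is not converted
                         WHERE a.Alarm_Start >= DATEADD(day, t.N, @startDate)
                           AND a.Alarm_Start <  DATEADD(day, t.N + 1, @startDate))
            THEN 1 ELSE 0 END AS Alarm_Exists
FROM Tally t
ORDER BY CheckDate;
-- 2019-03-24 1
-- 2019-03-25 1
-- 2019-03-26 0
-- 2019-03-27 1
-- 2019-03-28 0
-- 2019-03-29 1
```

Comparing Alarm_Start against a day range, rather than converting it to a date, leaves the column untouched in the predicate, which may let an index on Alarm_Start be used.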

How to generate calendar table having begin month date and end month Date

I need to generate a calendar table between two dates, with the beginning-of-month date and the end-of-month date for each month. If the end of the month is later than today, it should stop at the current date.
Should look like this:
As you can see the last value for column Eomonth has today's date (not the end of the month)
Thank you

If you don't have a calendar table, you can use an ad-hoc tally table
Example
Declare @Date1 date = '2018-01-01'
Declare @Date2 date = GetDate()
Select [Month]   = D
      ,[Eomonth] = case when EOMONTH(D)>@Date2 then convert(date,GetDate()) else EOMONTH(D) end
From (
    Select Top (DateDiff(Month,@Date1,@Date2)+1)
           D=DateAdd(Month,-1+Row_Number() Over (Order By (Select Null)),@Date1)
    From master..spt_values n1
) A
Returns
Month Eomonth
2018-01-01 2018-01-31
2018-02-01 2018-02-28
2018-03-01 2018-03-31
2018-04-01 2018-04-30
2018-05-01 2018-05-31
2018-06-01 2018-06-30
2018-07-01 2018-07-31
2018-08-01 2018-08-31
2018-09-01 2018-09-30
2018-10-01 2018-10-31
2018-11-01 2018-11-30
2018-12-01 2018-12-31
2019-01-01 2019-01-31
2019-02-01 2019-02-13 <-- Today's date

Just to build on John's excellent answer... I made one change to his solution:
Declare @Date1 date = '2018-01-01'
Declare @Date2 date = '2018-03-02';--GETDATE()
-- BEFORE
SELECT [Month]   = D
      ,[Eomonth] = case when EOMONTH(D)>@Date2 then convert(date,GetDate()) else EOMONTH(D) end
FROM (
    Select Top (DateDiff(Month,@Date1,@Date2)+1)
           D=DateAdd(Month,-1+Row_Number() Over (Order By (Select Null)),@Date1)
    From master..spt_values n1
) A
ORDER BY A.D;
-- AFTER
SELECT [Month]   = D
      ,[Eomonth] = case when EOMONTH(D)>@Date2 then convert(date,GetDate()) else EOMONTH(D) end
FROM (
    Select Top (DateDiff(Month,@Date1,@Date2)+1)
           D=DateAdd(Month,-1+Row_Number() Over (Order By (Select Null)),@Date1),
           RN=ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    From master..spt_values n1
) A
ORDER BY A.RN
GO
Now the execution plans:
How did we remove that sort? By leveraging what I refer to as a virtual index. ROW_NUMBER returns an ordered stream of numbers. This is why you can have a column called RN defined as RN = ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) and add an ORDER BY RN clause that does not cause a sort. It appears that all window ranking functions do this (RANK, DENSE_RANK, NTILE and ROW_NUMBER).
With this in mind let's examine my solution which leverages dbo.RangeAB, a function that fully exploits the power of the virtual index.
-- 1. Solution
DECLARE @startDate DATE = '2018-06-01'
DECLARE @endDate DATE = '2019-02-21'; --GETDATE()
SELECT f.Dt, dt.Mx
FROM (VALUES(CAST(GETDATE() AS DATE))) AS x(Dt)
CROSS APPLY (VALUES(IIF(@endDate<x.Dt,@endDate,x.Dt))) AS d(Mx)
CROSS APPLY dates.ageInMonths(@startDate,d.Mx) AS m
CROSS APPLY dbo.RangeAB(0,m.Months,1,0) AS r
CROSS APPLY (VALUES(DATEADD(MONTH,r.RN,@startDate))) AS f(Dt)
CROSS APPLY (VALUES(IIF(EOMONTH(f.Dt)>d.Mx,d.Mx,EOMONTH(f.Dt)))) AS dt(Mx)
ORDER BY r.RN; -- not required; included to demo the virtual index
GO
Returns:
Dt Mx
---------- ----------
2018-06-01 2018-06-30
2018-07-01 2018-07-31
2018-08-01 2018-08-31
2018-09-01 2018-09-30
2018-10-01 2018-10-31
2018-11-01 2018-11-30
2018-12-01 2018-12-31
2019-01-01 2019-01-31
2019-02-01 2019-02-14
If you check the execution plan you will not see a sort despite my ORDER BY. But what about descending sorts? The ROW_NUMBER virtual index does not handle them: if you change the above query to ORDER BY r.RN DESC you will see a sort in the execution plan. To avoid that, we simply change the reference to r.RN into r.OP, where r.OP is ROW_NUMBER's opposite number. Let's compare two queries that return the same results. What I'm doing here is returning the most recent five months:
DECLARE @startDate DATE = '2018-06-01'
DECLARE @endDate DATE = '2019-01-21'; --GETDATE()
-- INCORRECT!!!
SELECT TOP (5) f.Dt, dt.Mx
FROM (VALUES(CAST(GETDATE() AS DATE))) AS x(Dt)
CROSS APPLY (VALUES(IIF(@endDate<x.Dt,@endDate,x.Dt))) AS d(Mx)
CROSS APPLY dates.ageInMonths(@startDate,d.Mx) AS m
CROSS APPLY dbo.RangeAB(0,m.Months,1,0) AS r
CROSS APPLY (VALUES(DATEADD(MONTH,r.RN,@startDate))) AS f(Dt)
CROSS APPLY (VALUES(IIF(EOMONTH(f.Dt)>d.Mx,d.Mx,EOMONTH(f.Dt)))) AS dt(Mx)
ORDER BY r.RN DESC; -- the virtual index cannot handle Descending sorts, this will sort!
-- CORRECT -- ONE TINY CHANGE! CHANGE r.RN to r.OP
SELECT TOP (5) f.Dt, dt.Mx
FROM (VALUES(CAST(GETDATE() AS DATE))) AS x(Dt)
CROSS APPLY (VALUES(IIF(@endDate<x.Dt,@endDate,x.Dt))) AS d(Mx)
CROSS APPLY dates.ageInMonths(@startDate,d.Mx) AS m
CROSS APPLY dbo.RangeAB(0,m.Months,1,0) AS r
CROSS APPLY (VALUES(DATEADD(MONTH,r.OP,@startDate))) AS f(Dt)
CROSS APPLY (VALUES(IIF(EOMONTH(f.Dt)>d.Mx,d.Mx,EOMONTH(f.Dt)))) AS dt(Mx)
ORDER BY r.RN;
And the execution plans:
Leveraging what I call Finite Opposite Numbers, I can use ROW_NUMBER's (RN) opposite number (OP) to return the numbers in reverse order while still sorting by RN in ascending order. With OP, dbo.rangeAB's finite opposite number column, you can make full use of the virtual index for descending as well as ascending sorts.
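To make the opposite-number idea concrete without the full function, here is a minimal standalone sketch for the numbers 1 to 5. It mirrors the OP formula in dbo.rangeAB for @low = 1, @high = 5, @gap = 1, @row1 = 1, which reduces to 6 - rn. The use of sys.all_columns as a row source is my assumption. Whether the plan avoids a Sort operator is something to confirm on your own instance.

```sql
WITH t AS (
    -- rn = 1..5, an ascending stream from ROW_NUMBER
    SELECT TOP (5) rn = ROW_NUMBER() OVER (ORDER BY (SELECT NULL))
    FROM sys.all_columns
)
SELECT t.rn,
       op = 5 + 1 - t.rn    -- opposite number of rn within 1..5
FROM t
ORDER BY t.rn;              -- ascending on rn, so op comes back as 5,4,3,2,1
```

The query orders by rn ascending, yet the op column reads from high to low, which is the effect the answer relies on for descending results.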
The DDL for the functions I used is below.
RangeAB
CREATE FUNCTION dbo.rangeAB
(
  @low  bigint,
  @high bigint,
  @gap  bigint,
  @row1 bit
)
/****************************************************************************************
[Purpose]:
 Creates up to 531,441,000,000 sequential integers beginning with @low and ending
 with @high. Used to replace iterative methods such as loops, cursors and recursive CTEs
 to solve SQL problems. Based on Itzik Ben-Gan's getnums function with some tweaks,
 enhancements and added functionality. The logic for getting rn to begin at 0 or 1
 comes from Jeff Moden's fnTally function.
 The name is "range" because it's similar to Clojure's range function. The name "rangeAB"
 is used because "range" is a reserved SQL keyword.
[Author]: Alan Burstein
[Compatibility]:
 SQL Server 2008+ and Azure SQL Database
[Syntax]:
 SELECT r.RN, r.OP, r.N1, r.N2
 FROM dbo.rangeAB(@low,@high,@gap,@row1) AS r;
[Parameters]:
 @low  = a bigint that represents the lowest value for n1.
 @high = a bigint that represents the highest value for n1.
 @gap  = a bigint that represents how much n1 and n2 will increase each row; @gap also
         represents the difference between n1 and n2.
 @row1 = a bit that represents the first value of rn. When @row1 = 0 then rn begins
         at 0, when @row1 = 1 then rn will begin at 1.
[Returns]:
 Inline Table Valued Function returns:
 rn = bigint; a row number that works just like T-SQL ROW_NUMBER() except that it can
      start at 0 or 1 which is dictated by @row1.
 op = bigint; returns the "opposite number" that relates to rn. When rn begins with 0 and
      ends with 10 then 10 is the opposite of 0, 9 the opposite of 1, etc. When rn begins
      with 1 and ends with 5 then 1 is the opposite of 5, 2 the opposite of 4, etc...
 n1 = bigint; a sequential number starting at the value of @low and incrementing by the
      value of @gap until it is less than or equal to the value of @high.
 n2 = bigint; a sequential number starting at the value of @low+@gap and incrementing
      by the value of @gap.
[Dependencies]:
 N/A
[Developer Notes]:
 1. The lowest and highest possible numbers returned are whatever is allowable by a
    bigint. The function, however, returns no more than 531,441,000,000 rows (8100^3).
 2. @gap does not affect rn, rn will begin at @row1 and increase by 1 until the last row
    unless it's used in a query where a filter is applied to rn.
 3. @gap must be greater than 0 or the function will not return any rows.
 4. Keep in mind that when @row1 is 0 then the highest row-number will be the number of
    rows returned minus 1
 5. If all you need is a sequential set beginning at 0 or 1 then, for best performance
    use the RN column. Use N1 and/or N2 when you need to begin your sequence at any
    number other than 0 or 1 or if you need a gap between your sequence of numbers.
 6. Although @gap is a bigint it must be a positive integer or the function will
    not return any rows.
 7. The function will not return any rows when one of the following conditions is true:
      * any of the input parameters are NULL
      * @high is less than @low
      * @gap is not greater than 0
    To force the function to return all NULLs instead of not returning anything you can
    add the following code to the end of the query:
      UNION ALL
      SELECT NULL, NULL, NULL, NULL
      WHERE NOT (@high&@low&@gap&@row1 IS NOT NULL AND @high >= @low AND @gap > 0)
    This code was excluded as it adds a ~5% performance penalty.
 8. There is no performance penalty for sorting by rn ASC; there is a large performance
    penalty for sorting in descending order WHEN @row1 = 1; WHEN @row1 = 0
    If you need a descending sort then use op in place of rn and sort by rn ASC.
Best Practices:
--===== 1. Using RN (rownumber)
 -- (1.1) The best way to get the numbers 1,2,3...@high (e.g. 1 to 5):
 SELECT RN FROM dbo.rangeAB(1,5,1,1);
 -- (1.2) The best way to get the numbers 0,1,2...@high (e.g. 0 to 5):
 SELECT RN FROM dbo.rangeAB(0,5,1,0);
--===== 2. Using OP for descending sorts without a performance penalty
 -- (2.1) The best way to get the numbers @high...3,2,1 (e.g. 5 to 1):
 SELECT op FROM dbo.rangeAB(1,5,1,1) ORDER BY rn ASC;
 -- (2.2) The best way to get the numbers @high-1...2,1,0 (e.g. 5 to 0):
 SELECT op FROM dbo.rangeAB(1,6,1,0) ORDER BY rn ASC;
--===== 3. Using N1
 -- (3.1) To begin with numbers other than 0 or 1 use N1 (e.g. -3 to 3):
 SELECT N1 FROM dbo.rangeAB(-3,3,1,1);
 -- (3.2) ROW_NUMBER() is built in. If you want a ROW_NUMBER() include RN:
 SELECT RN, N1 FROM dbo.rangeAB(-3,3,1,1);
 -- (3.3) If you wanted a ROW_NUMBER() that started at 0 you would do this:
 SELECT RN, N1 FROM dbo.rangeAB(-3,3,1,0);
--===== 4. Using N2 and @gap
 -- (4.1) To get 0,10,20,30...100, set @low to 0, @high to 100 and @gap to 10:
 SELECT N1 FROM dbo.rangeAB(0,100,10,1);
 -- (4.2) Note that N2=N1+@gap; this allows you to create a sequence of ranges.
 --       For example, to get (0,10),(10,20),(20,30).... (90,100):
 SELECT N1, N2 FROM dbo.rangeAB(0,90,10,1);
 -- (4.3) Remember that a rownumber is included and it can begin at 0 or 1:
 SELECT RN, N1, N2 FROM dbo.rangeAB(0,90,10,1);
[Examples]:
--===== 1. Generating Sample data (using rangeAB to create "dummy rows")
 -- The query below will generate 10,000 ids and random numbers between 50,000 and 500,000
 SELECT
   someId    = r.rn,
   someNumer = ABS(CHECKSUM(NEWID())%450000)+50001
 FROM rangeAB(1,10000,1,1) r;
--===== 2. Create a series of dates; rn is 0 to include the first date in the series
 DECLARE @startdate DATE = '20180101', @enddate DATE = '20180131';
 SELECT r.rn, calDate = DATEADD(dd, r.rn, @startdate)
 FROM dbo.rangeAB(0, DATEDIFF(dd,@startdate,@enddate),1,0) r;
 GO
--===== 3. Splitting (tokenizing) a string with fixed sized items
 -- given a delimited string of identifiers that are always 7 characters long
 DECLARE @string VARCHAR(1000) = 'A601225,B435223,G008081,R678567';
 SELECT
   itemNumber = r.rn, -- item's ordinal position
   itemIndex  = r.n1, -- item's position in the string (its CHARINDEX value)
   item       = SUBSTRING(@string, r.n1, 7) -- item (token)
 FROM dbo.rangeAB(1, LEN(@string), 8,1) r;
 GO
--===== 4. Splitting (tokenizing) a string with random delimiters
 DECLARE @string VARCHAR(1000) = 'ABC123,999F,XX,9994443335';
 SELECT
   itemNumber = ROW_NUMBER() OVER (ORDER BY r.rn), -- item's ordinal position
   itemIndex  = r.n1+1, -- item's position in the string (its CHARINDEX value)
   item       = SUBSTRING
                (
                  @string,
                  r.n1+1,
                  ISNULL(NULLIF(CHARINDEX(',',@string,r.n1+1),0)-r.n1-1, 8000)
                ) -- item (token)
 FROM dbo.rangeAB(0,DATALENGTH(@string),1,1) r
 WHERE SUBSTRING(@string,r.n1,1) = ',' OR r.n1 = 0;
 -- logic borrowed from: http://www.sqlservercentral.com/articles/Tally+Table/72993/
--===== 5. Grouping by weekly intervals
 -- 5.1. how to create a series of start/end dates between @startDate & @endDate
 DECLARE @startDate DATE = '1/1/2015', @endDate DATE = '2/1/2015';
 SELECT
   WeekNbr   = r.RN,
   WeekStart = DATEADD(DAY,r.N1,@startDate),
   WeekEnd   = DATEADD(DAY,r.N2-1,@startDate)
 FROM dbo.rangeAB(0,datediff(DAY,@startDate,@endDate),7,1) r;
 GO
 -- 5.2. LEFT JOIN to the weekly interval table
 BEGIN
  DECLARE @startDate datetime = '1/1/2015', @endDate datetime = '2/1/2015';
  -- sample data
  DECLARE @loans TABLE (loID INT, lockDate DATE);
  INSERT @loans SELECT r.rn, DATEADD(dd, ABS(CHECKSUM(NEWID())%32), @startDate)
  FROM dbo.rangeAB(1,50,1,1) r;
  -- solution
  SELECT
    WeekNbr   = r.RN,
    WeekStart = dt.WeekStart,
    WeekEnd   = dt.WeekEnd,
    total     = COUNT(l.lockDate)
  FROM dbo.rangeAB(0,datediff(DAY,@startDate,@endDate),7,1) r
  CROSS APPLY (VALUES (
    CAST(DATEADD(DAY,r.N1,@startDate) AS DATE),
    CAST(DATEADD(DAY,r.N2-1,@startDate) AS DATE))) dt(WeekStart,WeekEnd)
  LEFT JOIN @loans l ON l.lockDate BETWEEN dt.WeekStart AND dt.WeekEnd
  GROUP BY r.RN, dt.WeekStart, dt.WeekEnd ;
 END;
--===== 6. Identify the first vowel and last vowel in a string along with their positions
 DECLARE @string VARCHAR(200) = 'This string has vowels';
 -- first vowel
 SELECT TOP(1) position = r.rn, letter = SUBSTRING(@string,r.rn,1)
 FROM dbo.rangeAB(1,LEN(@string),1,1) r
 WHERE SUBSTRING(@string,r.rn,1) LIKE '%[aeiou]%'
 ORDER BY r.rn;
 -- last vowel: to avoid a sort in the execution plan we'll use op instead of rn
 SELECT TOP(1) position = r.op, letter = SUBSTRING(@string,r.op,1)
 FROM dbo.rangeAB(1,LEN(@string),1,1) r
 WHERE SUBSTRING(@string,r.op,1) LIKE '%[aeiou]%'
 ORDER BY r.rn;
---------------------------------------------------------------------------------------
[Revision History]:
Rev 00 - 20140518 - Initial Development - Alan Burstein
Rev 01 - 20151029 - Added 65 rows to make L1=465; 465^3=100.5M. Updated comment section
- Alan Burstein
Rev 02 - 20180613 - Complete re-design including opposite number column (op)
Rev 03 - 20180920 - Added additional CROSS JOIN to L2 for 530B rows max - Alan Burstein
****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
WITH L1(N) AS
(
SELECT 1
FROM (VALUES
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),(0),
(0),(0)) T(N) -- 90 values
),
L2(N) AS (SELECT 1 FROM L1 a CROSS JOIN L1 b CROSS JOIN L1 c),
iTally AS (SELECT rn = ROW_NUMBER() OVER (ORDER BY (SELECT 1)) FROM L2 a CROSS JOIN L2 b)
SELECT r.RN, r.OP, r.N1, r.N2
FROM
(
  SELECT
    RN = 0,
    OP = (@high-@low)/@gap,
    N1 = @low,
    N2 = @gap+@low
  WHERE @row1 = 0
  UNION ALL -- COALESCE required in the TOP statement below for error handling purposes
  SELECT TOP (ABS((COALESCE(@high,0)-COALESCE(@low,0))/COALESCE(@gap,0)+COALESCE(@row1,1)))
    RN = i.rn,
    OP = (@high-@low)/@gap+(2*@row1)-i.rn,
    N1 = (i.rn-@row1)*@gap+@low,
    N2 = (i.rn-(@row1-1))*@gap+@low
  FROM iTally AS i
  ORDER BY i.rn
) AS r
WHERE @high&@low&@gap&@row1 IS NOT NULL AND @high >= @low AND @gap > 0;
dates.ageInMonths
CREATE FUNCTION dates.ageInMonths(@startDate DATETIME, @endDate DATETIME)
/*****************************************************************************************
[Purpose]:
 Calculates the number of months between @startDate and @endDate. This is something that
 cannot be done using DATEDIFF. Note how the following query returns a "1":
   SELECT DATEDIFF(MM,'Dec 30 2001', 'Jan 3 2002'); -- Returns 1
[Compatibility]:
 SQL Server 2005+
[Syntax]:
--===== Autonomous
 SELECT f.months
 FROM dates.ageInMonths(@startDate, @endDate) f;
--===== Against a table using APPLY
 SELECT t.*, f.months
 FROM dbo.someTable t
 CROSS APPLY dates.ageInMonths(t.col1, t.col2) f;
[Parameters]:
 @startDate = datetime; first date to compare
 @endDate   = datetime; date to compare @startDate to
[Returns]:
 Inline Table Valued Function returns:
 months = int; number of months between @startDate and @endDate
[Developer Notes]:
 1. Returns NULL when either input parameter is NULL.
 2. This function is what is referred to as an "inline" scalar UDF. Technically it's an
    inline table valued function (iTVF) but performs the same task as a scalar valued user
    defined function (UDF); the difference is that it requires the APPLY table operator
    to accept column values as a parameter. For more about "inline" scalar UDFs see this
    article by SQL MVP Jeff Moden: http://www.sqlservercentral.com/articles/T-SQL/91724/
    and for more about how to use APPLY see this article by SQL MVP Paul White:
    http://www.sqlservercentral.com/articles/APPLY/69953/.
    Note the above syntax example and usage examples below to better understand how to
    use the function. Although the function is slightly more complicated to use than a
    scalar UDF it will yield notably better performance for many reasons. For example,
    unlike scalar UDFs or multi-line table valued functions, the inline scalar UDF does
    not restrict the query optimizer's ability to generate a parallel query execution plan.
 3. ageInMonths requires that @endDate be equal to or later than @startDate. Otherwise a
    NULL is returned.
 4. ageInMonths is deterministic. For more deterministic functions see:
    https://msdn.microsoft.com/en-us/library/ms178091.aspx
[Examples]:
--===== 1. Basic Use
 SELECT a.months
 FROM dates.ageInMonths('20120109', '20180108') a
--===== 2. Against a table
 DECLARE @sometable TABLE (date1 date, date2 date);
 BEGIN
  INSERT @sometable
  VALUES ('20111114','20111209'),('20090401','20110506'),('20091101','20160511');
  SELECT t.date1, t.date2, a.months
  FROM @sometable t
  CROSS APPLY dates.ageInMonths(t.date1, t.date2) a;
 END
-----------------------------------------------------------------------------------------
[Revision History]:
 Rev 00 - 20180624 - Initial Creation - Alan Burstein
*****************************************************************************************/
RETURNS TABLE WITH SCHEMABINDING AS RETURN
SELECT months =
  CASE WHEN SIGN(DATEDIFF(dd,@startDate,@endDate)) > -1
       THEN DATEDIFF(month,@startDate,@endDate) -
         CASE WHEN DATEPART(dd,@startDate) > DATEPART(dd,@endDate) THEN 1 ELSE 0 END
  END;
Note that you will have to adjust for the custom schemas I use (e.g. change dates to dbo).
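As a quick illustration of why DATEDIFF alone is not enough, the following standalone sketch applies the same expression as the body of ageInMonths to the dates from the function's own header comment. The variable names are mine.

```sql
DECLARE @s date = '20011230', @e date = '20020103';

SELECT boundary_count = DATEDIFF(month, @s, @e),   -- 1: one month boundary was crossed
       full_months    = DATEDIFF(month, @s, @e)
                        - CASE WHEN DATEPART(dd, @s) > DATEPART(dd, @e) THEN 1 ELSE 0 END;
                                                   -- 0: no full month has elapsed
```

DATEDIFF counts month boundaries crossed, so four days spanning a year end count as one month. Subtracting 1 when the start day-of-month is later than the end day-of-month corrects that.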

How to sum any credits before debits SQL Server?

I'm trying to sum all the credits that occur before a debit, then sum all the debits after a credit, within a 4-day time period.
Table
ACCT |Date | Amount | Credit or debit
-----+----------+---------+----------------
152 |8/14/2017 | 48 | C
152 |8/12/2017 | 22.5 | D
152 |8/12/2017 | 40 | D
152 |8/11/2017 | 226.03 | C
152 |8/10/2017 | 143 | D
152 |8/10/2017 | 107.23 | C
152 |8/10/2017 | 20 | D
152 |8/10/2017 | 49.41 | C
My query should only sum if there is a credit before the debit. The result will have 3 rows with the data above.
Output needed:
acct DateRange credit_amount debit_amount
--------------------------------------------------------------------------
152 2017-10-14 to 2017-10-18 49.41 20
152 2017-10-14 to 2017-10-18 107.23 143
152 2017-10-14 to 2017-10-18 226.03 62.5
The last one is summing the two debits until there is a credit.
First, find the first credit.
Sum the credits if there is more than 1 before a debit.
Then find the debit and sum the debits together until the next credit.
I only need the case where the credit date is before the debit date. The 48 on 8/14 is ignored because there is no debit after it.
The logic is to see if the account was credited then debited after it.
My attempt
DECLARE @StartDate DATE
DECLARE @EndDate DATE
DECLARE @OverallEndDate DATE
SET @OverallEndDate = '2017-08-14'
SET @StartDate = '2017-08-10'
SET @EndDate = dateadd(dd, 4, @StartDate);
WITH Dates
AS (
    SELECT @StartDate AS sd, @EndDate AS ed, @OverallEndDate AS od
    UNION ALL
    SELECT dateadd(dd, 1, sd), DATEADD(dd, 1, ed), od
    FROM Dates
    WHERE od > sd
), credits
AS (
    SELECT DISTINCT A.Acct, LEFT(CONVERT(VARCHAR, @StartDate, 120), 10) + 'to' + LEFT(CONVERT(VARCHAR, @EndDate, 120), 10) AS DateRange, credit_amount, debit_amount
    FROM (
        SELECT t1.acct, sum(amount) AS credit_amount, MAX(t1.datestart) AS c_datestart
        FROM [Transactions] T1
        WHERE Credit_or_debit = 'C' AND T1.Datestart BETWEEN @StartDate AND @EndDate AND T1.[acct] = '152' AND T1.Datestart <= (
            SELECT MIN(D1.Datestart)
            FROM [Transactions] D1
            WHERE T1.acct = D1.acct AND D1.Credit_or_debit = 'D' AND D1.Datestart BETWEEN @StartDate AND @EndDate
        )
        GROUP BY T1.acct
    ) AS A
    CROSS JOIN (
        SELECT t2.acct, sum(amount) AS debit_amount, MAX(t2.datestart) AS d_datestart
        FROM [Transactions] T2
        WHERE T2.Credit_or_debit = 'D' AND T2.Datestart BETWEEN @StartDate AND @EndDate AND T2.[acct] = '152' AND T2.Datestart <= (
            SELECT MAX(D1.Datestart)
            FROM [Transactions] D1
            WHERE T2.acct = D1.acct AND D1.Credit_or_debit = 'D' AND D1.Datestart BETWEEN @StartDate AND @EndDate
        )
        GROUP BY T2.acct
    ) AS B
    WHERE A.acct = B.acct AND A.c_datestart <= B.d_datestart
)
SELECT *
FROM credits
OPTION (MAXRECURSION 0)
Update:
The stored date is actually a full date and time value. That is how I verify whether the debit is later than the credit.

It should be clear now that you definitely need a column that specifies the sequential order of transactions, because otherwise you can't decide whether a debit is placed before or after a credit when they both have the same datestart. Assuming that you have such a column (in my query I named it ID), a solution could be as follows, without recursion and also without a self-join. The problem can be solved using some of the window functions available since SQL Server 2008.
My solution processes the data in several steps that I implemented as a sequence of 2 CTEs and a final PIVOT query:
DECLARE @StartDate DATE = '20170810';
DECLARE @EndDate DATE = dateadd(dd, 4, @StartDate);
DECLARE @DateRange nvarchar(24);
SET @DateRange =
    CONVERT(nvarchar(10), @StartDate, 120) + ' to '
    + CONVERT(nvarchar(10), @EndDate, 120);
WITH
    blocks (acct, CD, amount, blockno, r_blockno) AS (
        SELECT acct, Credit_or_debit, amount
             , ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID ASC)
             - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID ASC)
             , ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID DESC)
             - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID DESC)
        FROM Transactions
        WHERE datestart BETWEEN @StartDate AND @EndDate
          AND Credit_or_debit IN ('C','D') -- not needed, if always true
    ),
    blockpairs (acct, CD, amount, pairno) AS (
        SELECT acct, CD, amount
             , DENSE_RANK() OVER (PARTITION BY acct, CD ORDER BY blockno)
        FROM blocks
        WHERE (blockno > 0 OR CD = 'C') -- remove leading debits
          AND (r_blockno > 0 OR CD = 'D') -- remove trailing credits
    )
SELECT acct, @DateRange AS DateRange
     , amt.C AS credit_amount, amt.D AS debit_amount
FROM blockpairs PIVOT (SUM(amount) FOR CD IN (C, D)) amt
ORDER BY acct, pairno;
And this is how it works:
blocks
Here, the relevant data is retrieved from the table, meaning that the date range filter is applied, and another filter on the Credit_or_debit column makes sure that only the values C and D are contained in the result (if this is the case by design in your table, then that part of the WHERE clause can be omitted). The essential part in this CTE is the difference of two row numbers (blockno). Credits and debits are numbered separately, and their respective row number is subtracted from the overall row number. Within a consecutive block of debits or credits, these numbers will be the same for each record, and they will be different (higher) in later blocks of the same type. The main use of this numbering is to identify the very first block (number 0) in order to be able to exclude it from further processing in the next step in case it's a debit block. To be able to also identify the very last block (and filter it away in the next step if it's a credit block), a similar block numbering is made in the reverse order (r_blockno). The result (which I ordered just for visualization with your sample data) will look like this:
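The block numbering can be reproduced with the sketch below. It is an illustration only: the table variable holds the question's eight rows, and the ID values are an assumed ordering (oldest first), since the question does not show such a column.

```sql
-- Hypothetical sample: the question's rows with an assumed ID giving their true order
DECLARE @Transactions table (ID int, acct int, amount decimal(10,2), Credit_or_debit char(1));
INSERT INTO @Transactions (ID, acct, amount, Credit_or_debit) VALUES
(1,152, 49.41,'C'),(2,152, 20.00,'D'),(3,152,107.23,'C'),(4,152,143.00,'D'),
(5,152,226.03,'C'),(6,152, 40.00,'D'),(7,152, 22.50,'D'),(8,152, 48.00,'C');

SELECT ID, Credit_or_debit AS CD, amount
     , ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID ASC)
     - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID ASC)  AS blockno
     , ROW_NUMBER() OVER (PARTITION BY acct ORDER BY ID DESC)
     - ROW_NUMBER() OVER (PARTITION BY acct, Credit_or_debit ORDER BY ID DESC) AS r_blockno
FROM @Transactions
ORDER BY ID;
-- ID CD amount  blockno r_blockno
--  1 C   49.41  0       4
--  2 D   20.00  1       3
--  3 C  107.23  1       3
--  4 D  143.00  2       2
--  5 C  226.03  2       2
--  6 D   40.00  3       1
--  7 D   22.50  3       1
--  8 C   48.00  4       0   <- trailing credit, r_blockno = 0
```

Rows 6 and 7 share blockno 3, which marks them as one consecutive debit block. Row 8 is the only credit with r_blockno = 0, so it is the trailing credit that the next step removes.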
blockpairs
In this CTE, as described before, the very first block is filtered away if it's a debit block, and the very last block is filtered away if it's a credit block. After this, the number of remaining blocks must be even, and the logical order of blocks must be a sequence of pairs of credit and debit blocks, each pair starting with a credit block and followed by its associated debit block. Each pair of credit/debit blocks will result in a single row in the end. To associate the credit and debit blocks correctly in the query, I give them the same number by using separate numberings per type (the n-th credit block and the n-th debit block are associated by giving them the same number n). For this numbering, I use the DENSE_RANK function, so that all records in a block obtain the same number (pairno) and the numbering is gapless. For numbering the blocks of the same type, I reuse the blockno field described above for ordering. The result in your example (again sorted for visualization):
The final PIVOT query
Finally, the credit_amount and debit_amount are aggregated over the respective blocks, grouping by acct and pairno, and then displayed side by side using a PIVOT query.
Although the column pairno isn't visible, it is used for sorting the resulting records.

Binary operator OR in TSQL?

What I'm trying to achieve is to count occurrences on a sort of timeline, treating overlapping events as a single one, starting from a field like this and using T-SQL:
Pattern (JSON array of couple of values indicating
the start day and the duration of the event)
----------------------------------------------------
[[0,22],[0,24],[18,10],[30,3]]
----------------------------------------------------
For this example, the expected result is 30.
What I need is a T-SQL function to obtain this number.
Even if I'm not sure it's the right path to follow, I'm trying to simulate a sort of binary OR between the rows of my dataset.
After some attempts I managed to turn my dataset into something like this:
start | length | pattern
----------------------------------------------------
0 | 22 | 1111111111111111111111
0 | 24 | 111111111111111111111111
18 | 10 | 000000000000000001111111111
30 | 3 | 000000000000000000000000000000111
----------------------------------------------------
But now I don't know how to proceed in TSQL =)
A solution, as I said, could be a binary OR between the "pattern" fields to obtain something like this:
1111111111111111111111...........
111111111111111111111111.........
000000000000000001111111111......
000000000000000000000000000000111
--------------------------------------
111111111111111111111111111000111
Is it possible to do it in TSQL?
Maybe I'm just complicating things here. Do you have other ideas?
DO NOT forget I just need the result number!!!
Thank you all

Only the total number of days on which an event occurs needs to be returned.
But I was wondering how hard it would be to actually calculate that binary OR'd pattern.
-- Sample data: start day and duration of each event
declare @T table (start int, length int);
insert into @T values
(0,22),
(0,24),
(18,10),
(30,3);
WITH
-- the digits 0..9
DIGITS as (
select n
from (values (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) D(n)
),
-- the numbers 0..99, cut off below the largest start+length
NUMBERS as (
select (10*d2.n + d1.n) as n
from DIGITS d1, DIGITS d2
where (10*d2.n + d1.n) < (select max(start+length) from @T)
),
-- 1 for every day covered by at least one event, else 0 (a start of 0 is treated as day 1)
CALC as (
select N.n, max(case when N.n between IIF(T.start>0,T.start,1) and IIF(T.start>0,T.start,1)+T.length-1 then 1 else 0 end) as ranged
from @T T
cross apply NUMBERS N
group by N.n
)
select SUM(c.ranged) as total,
-- concatenate the flags in day order; stuff() drops the leading flag for n = 0
stuff(
(
select ranged as 'text()'
from CALC
order by n
for xml path('')
),1,1,'') as pattern
from CALC c;
Result:
total pattern
30 11111111111111111111111111100111
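If only the total is needed, the pattern string can be skipped altogether. The following is a sketch of one alternative, not the query above: it walks the intervals in start order and adds only the days that reach beyond everything seen so far. It keeps the same convention (a start of 0 counts as day 1) and assumes SQL Server 2012 or later for IIF and the window frame; the names Spans, Ordered, s, e and prev_max_e are made up for illustration:

```sql
DECLARE @T TABLE (start int, length int);
INSERT INTO @T VALUES (0,22),(0,24),(18,10),(30,3);

WITH Spans AS (
    -- first and last day of each event, with a start of 0 treated as day 1
    SELECT IIF(start > 0, start, 1) AS s,
           IIF(start > 0, start, 1) + length - 1 AS e
    FROM @T
),
Ordered AS (
    -- latest end day among all events that sort before the current one
    SELECT s, e,
           MAX(e) OVER (ORDER BY s, e
                        ROWS BETWEEN UNBOUNDED PRECEDING AND 1 PRECEDING) AS prev_max_e
    FROM Spans
)
SELECT SUM(CASE
             WHEN prev_max_e IS NULL OR prev_max_e < s THEN e - s + 1  -- no overlap: count the whole event
             WHEN e > prev_max_e THEN e - prev_max_e                   -- partial overlap: count only the new tail
             ELSE 0                                                    -- fully covered already
           END) AS total
FROM Ordered;
```

For the sample rows the four events contribute 22, 2, 3 and 3 days, which gives the same total of 30.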

Depending on your input data, you should be able to do something like the following to calculate your Days With An Event.
The CTE is used to generate a table of dates, the start and end of which are defined by the two date variables. These would ideally be driven by your source data. If you have to use numbered date values, you could simply return incrementing numbers instead of incrementing dates:
-- Sample events: start date and duration in days
declare @Events table (StartDate date
,DaysLength int
)
insert into @Events values
('20160801',22)
,('20160801',24)
,('20160818',10)
,('20160830',3)
-- Range of dates to generate; ideally derive these from your source data
declare @StartDate date = getdate()-30
,@EndDate date = getdate()+30
-- Recursive CTE producing one row per date from @StartDate through @EndDate
;with Dates As
(
select @StartDate as Dates
union all
select DATEADD(day,1, Dates)
from Dates
where Dates < @EndDate
)
-- Count each date once, however many events cover it
select count(distinct d.Dates) as EventingDays
from Dates d
inner join @Events e
on(d.Dates between e.StartDate and dateadd(d,e.DaysLength-1,e.StartDate)
)
option(maxrecursion 0)
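The numbered variant mentioned above could look like the following sketch. It assumes day numbers that mirror the sample dates (day 1 = 2016-08-01) and derives the range from the data rather than from getdate(); the names StartDay, Days and DayNo are illustrative:

```sql
DECLARE @Events TABLE (StartDay int, DaysLength int);
INSERT INTO @Events VALUES (1,22),(1,24),(18,10),(30,3);

-- Derive the range to generate from the events themselves
DECLARE @FirstDay int, @LastDay int;
SELECT @FirstDay = MIN(StartDay),
       @LastDay  = MAX(StartDay + DaysLength - 1)
FROM @Events;

-- Recursive CTE producing one row per day number in the range
WITH Days AS (
    SELECT @FirstDay AS DayNo
    UNION ALL
    SELECT DayNo + 1 FROM Days WHERE DayNo < @LastDay
)
-- Count each day once, however many events cover it
SELECT COUNT(DISTINCT d.DayNo) AS EventingDays
FROM Days d
INNER JOIN @Events e
    ON d.DayNo BETWEEN e.StartDay AND e.StartDay + e.DaysLength - 1
OPTION (MAXRECURSION 0);
```

For these four events the covered days are 1 to 27 and 30 to 32, so EventingDays comes out as 30.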

Set based solution for processing rows in a SQL table

Can someone steer me in the right direction for solving this issue with a set-based solution versus cursor-based?
Given a table with the following rows:
Date Value
2013-11-01 12
2013-11-12 15
2013-11-21 13
2013-12-01 0
I need a query that will give me a row for each date between 2013-11-1 and 2013-12-1, as follows:
2013-11-01 12
2013-11-02 12
2013-11-03 12
...
2013-11-12 15
2013-11-13 15
2013-11-14 15
...
2013-11-21 13
2013-11-22 13
...
2013-11-30 13
2013-12-01 0
Any advice and/or direction will be appreciated.

The first thing that came to my mind was to fill in the missing dates with the help of a list of numbers. You can do this by joining to the spt_values table in the master DB and adding each number to the earliest date in the table.
-- Sample data from the question
DECLARE @Table AS TABLE(ADate Date, ANumber Int);
INSERT INTO @Table
VALUES
('2013-11-01',12),
('2013-11-12',15),
('2013-11-21',13),
('2013-12-01',0);
-- One row per day, counting up from the earliest date in the table
SELECT
DateAdd(D, v.number, MinDate) Date
FROM (SELECT number FROM master.dbo.spt_values WHERE name IS NULL) v
INNER JOIN (
SELECT
Min(ADate) MinDate
,DateDiff(D, Min(ADate), Max(ADate)) DaysInSpan
,Year(Min(ADate)) StartYear
FROM @Table
) dates ON v.number BETWEEN 0 AND DaysInSpan - 1
Next I would wrap that to make a derived table, and add a subquery to get the most recent number. Your end result may look something like:
DECLARE @Table AS TABLE(ADate Date, ANumber Int);
INSERT INTO @Table
VALUES
('2013-11-01',12),
('2013-11-12',15),
('2013-11-21',13),
('2013-12-01',0);
-- Uncomment the following line to see how it behaves when the date range spans a year end
--UPDATE @Table SET ADate = DateAdd(d, 45, ADate)
SELECT
AllDates.Date
-- most recent value on or before the generated date
,(SELECT TOP 1 ANumber FROM @Table t WHERE t.ADate <= AllDates.Date ORDER BY ADate DESC) AS ANumber
FROM (
SELECT
DateAdd(D, v.number, MinDate) Date
FROM
(SELECT number FROM master.dbo.spt_values WHERE name IS NULL) v
INNER JOIN (
SELECT
Min(ADate) MinDate
,DateDiff(D, Min(ADate), Max(ADate)) DaysInSpan
,Year(Min(ADate)) StartYear
FROM @Table
) dates ON v.number BETWEEN 0 AND DaysInSpan - 1
) AllDates

Another solution. I'm not sure how it compares performance-wise to the two already posted, but it's a bit more concise.
It uses a numbers table:
Query:
DECLARE @SDATE DATETIME
DECLARE @EDATE DATETIME
DECLARE @DAYS INT
SET @SDATE = '2013-11-01'
SET @EDATE = '2013-11-29'
-- Number of days to generate after the start date
SET @DAYS = DATEDIFF(DAY,@SDATE, @EDATE)
SELECT Num, DATEADD(DAY,N.Num,@SDATE), SUB.[Value]
FROM Numbers N
LEFT JOIN MyTable M ON DATEADD(DAY,N.Num,@SDATE) = M.[Date]
-- Most recent value on or before the generated date
CROSS APPLY (SELECT TOP 1 [Value]
FROM MyTable M2
WHERE [Date] <= DATEADD(DAY,N.Num,@SDATE)
ORDER BY [Date] DESC) SUB
WHERE N.Num <= @DAYS
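The query above presupposes a Numbers table with a zero-based Num column. How that table was built is not shown, so the following is only one possible setup, given as a sketch; the row count of 1000 and the use of sys.all_objects as a row source are assumptions:

```sql
-- One-off setup: a zero-based Numbers table holding 0..999
CREATE TABLE Numbers (Num int NOT NULL PRIMARY KEY);

-- sys.all_objects is only used as a convenient row source; the cross join
-- makes sure there are more than enough rows to pick from.
-- The cast keeps Num an int, which is what DATEADD expects.
INSERT INTO Numbers (Num)
SELECT TOP (1000)
       CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) - 1 AS int)
FROM sys.all_objects a
CROSS JOIN sys.all_objects b;
```

A thousand rows cover a span of almost three years of daily dates; a longer span would need a larger TOP value.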

It's possible, but neither pretty nor very performant at scale:
In addition to your_table, you'll need to create a second table/view dates containing every date you'd ever like to appear in the output of this query. For your example it would need to contain at least 2013-11-01 through 2013-12-01.
SELECT m.date, y.value
FROM your_table y
INNER JOIN (
SELECT md.date, MAX(my.date) AS max_date
FROM dates md
INNER JOIN your_table my ON md.date >= my.date
GROUP BY md.date
) m
ON y.date = m.max_date
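The dates table is not defined in this answer. One way it might be filled for the example range is sketched below, assuming SQL Server and a single [date] column; the table definition and the recursive CTE are illustrative assumptions:

```sql
-- Hypothetical helper table holding one row per calendar date
CREATE TABLE dates ([date] date NOT NULL PRIMARY KEY);

-- Generate 2013-11-01 through 2013-12-01 with a recursive CTE
WITH d AS (
    SELECT CAST('20131101' AS date) AS [date]
    UNION ALL
    SELECT DATEADD(day, 1, [date]) FROM d WHERE [date] < '20131201'
)
INSERT INTO dates ([date])
SELECT [date] FROM d
OPTION (MAXRECURSION 0);
```

For a permanent calendar table the range would normally be much wider, which is where MAXRECURSION 0 matters, since the default limit is 100 recursion levels.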
