Consecutive Day Query - sql-server

I have a table of contactIDs and datetimes, the time being when a letter was generated for the contact. Each contact can only have one letter generated a day. I want to write a query to select any contact that has had letters generated on more than one consecutive day.
I guess I'd need to increment the datetime as records are found but how would I do this separately for each contact?

select contactid from ContactTable a inner join Contacttable B on a.contctid=b.contactid and datediff(day,a.date,b.date)=1

I decided to utilise a calendar table. Use your favourite search engine to find a script to create a calendar table.
Alternatively, here's one I made earlier
So here's the query I have rolled with in full, I will explain the detail of it afterwards
DECLARE #your_table table (
contact_id int
, created_on datetime
);
INSERT INTO #your_table (contact_id, created_on)
SELECT 9, '2014-01-02 06:00'
UNION ALL SELECT 9, '2014-01-02 18:00'
UNION ALL SELECT 9, '2014-01-05 08:00'
UNION ALL SELECT 9, '2014-01-07 01:00'
UNION ALL SELECT 3, '2014-01-02 00:01'
UNION ALL SELECT 3, '2014-01-03 23:59' -- Over 24 hours but a "day" different
UNION ALL SELECT 7, '2014-01-04 01:00'
UNION ALL SELECT 7, '2014-01-06 01:00'
UNION ALL SELECT 7, '2014-01-08 01:00'
UNION ALL SELECT 7, '2014-01-09 01:00'
UNION ALL SELECT 7, '2014-01-10 01:00'
UNION ALL SELECT 7, '2014-01-11 01:00'
;
; WITH x AS (
SELECT your_table.contact_id
, your_table.created_on
, calendar.the_date
, Row_Number() OVER (PARTITION BY your_table.contact_id ORDER BY calendar.the_date) As sequence
FROM #your_table As your_table
INNER
JOIN dbo.calendar
ON your_table.created_on >= calendar.the_date
AND your_table.created_on < DateAdd(dd, 1, calendar.the_date)
)
, y AS (
SELECT curr.contact_id
, curr.created_on
, curr.the_date As the_date
, prev.the_date As previous_date
, DateDiff(dd, prev.the_date, curr.the_date) As difference_in_days
FROM x As curr
LEFT
JOIN x As prev
ON curr.contact_id = prev.contact_id
AND curr.sequence = prev.sequence + 1
)
SELECT contact_id
, created_on
, the_date
, previous_date
, difference_in_days
FROM y
WHERE difference_in_days = 1
Because you didn't provide any sample data that's where I had to start, so the query is self-contained using a table variable (#your_table) as its source.
Once populated we start out with a couple of Common-Table Expressions (CTE for short). Read up here if you're not familiar with the concept: http://msdn.microsoft.com/en-us/library/ms175972.aspx . There's not a lot of difference between these and subqueries.
Our first CTE (x) joins #your_table to the calendar table. It does this by returning the single row from the calendar on which the created_on date lies, by checking that it is greater than (or equal to) the calendar date and less than the next calendar date (DateAdd()).
Once complete we use the windowed function - Row_Number() to provide some sequencing.
We partition (i.e. reset the sequence) for each contact_id and sort the sequence by the created_on date.
Moving on to the second CTE (y): we perform a self-join on CTE x joining each contact record with its "previous" based on the sequencing.
This allows us to work out the difference in days (DateDiff()) between the current and the previous records.
Finally we reduce our resultset to only those records where the difference in days is 1 i.e. contacts on consecutive days

Related

Find minimum datetime while using FK in two different tables

I have 2 tables:
COURSE
------
Id
Name
TEST
------
Id
CourseId (FK to `COURSE.ID`)
DATETIME
NUMBERS
Suppose COURSE table with ID 1,2 (only 2 columns) and TEST table with 8 numbers of data having different DATETIME and CourseId of 1 (3 columns) and 2 (6 columns).
I want to find the minimum DATETIME,CourseID and Name by joining these 2 tables. The below query is giving a 2 output:
(SELECT min([DATETIME]) as DATETIME ,[TEST].CourseID,Name
FROM [dbo].[TEST]
left JOIN [dbo].[COURSE]
ON [dbo].[TEST].CourseID=[COURSE].ID GROUP BY CourseID,Name)
I want a single column output i.e. a single output column (minimum datetime along with Name and ID)..HOW can i achieve??
With 2 courses you are always going to get 2 rows when joining like this. It will give you the minimum date value for each course. The first way you can get a single row is to use TOP 1 in your query, which will simply give you the course with the earliest test date. The other way is to use a WHERE clause to filter it by a single course.
Please run this sample code with some variations of what you can do, notes included in comments:
CREATE TABLE #course ( id INT, name NVARCHAR(20) );
CREATE TABLE #Test
(
id INT ,
courseId INT ,
testDate DATETIME -- you shouldn't use a keyword for a column name
);
INSERT INTO #course
( id, name )
VALUES ( 1, 'Maths' ),
( 2, 'Science' );
-- note I used DATEADD(HOUR, -1, GETDATE()) to simply get some random datetime values
INSERT INTO #Test
( id, courseId, testDate )
VALUES ( 1, 1, DATEADD(HOUR, -1, GETDATE()) ),
( 2, 1, DATEADD(HOUR, -2, GETDATE()) ),
( 3, 1, DATEADD(HOUR, -3, GETDATE()) ),
( 4, 2, DATEADD(HOUR, -4, GETDATE()) ),
( 5, 2, DATEADD(HOUR, -5, GETDATE()) ),
( 6, 2, DATEADD(HOUR, -6, GETDATE()) ),
( 7, 2, DATEADD(HOUR, -7, GETDATE()) ),
( 8, 2, DATEADD(HOUR, -8, GETDATE()) );
-- returns minumum date for each course - 2 rows
SELECT MIN(t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM #Test t
-- used inner join as could see no reason for left join
INNER JOIN #course c ON t.courseId = c.id
GROUP BY courseId , name;
-- to get course with minimum date - 1 row
SELECT TOP 1
MIN(t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM #Test t
-- used inner join as could see no reason for left join
INNER JOIN #course c ON t.courseId = c.id
GROUP BY t.courseId , c.name
ORDER BY MIN(t.testDate); -- requires order by
-- to get minimum date for a specified course - 1 row
SELECT MIN(t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM #Test t
-- used inner join as could see no reason for left join
INNER JOIN #course c ON t.courseId = c.id
WHERE t.courseId = 1 -- requires you specify a course id
GROUP BY courseId , name;
DROP TABLE #course;
DROP TABLE #Test;
In my understanding, you want to return the minimum date from the entire table with the course details of that day.
Please try the below script
SELECT TOP 1 MIN(t.testDate) OVER (ORDER BY t.testDate) AS TestDate ,
t.courseId ,
c.name
FROM Test t
INNER JOIN course c ON t.courseId = c.id
ORDER BY t.testDate

How To Count Rows Based on Values of Two Variables in SSIS

I am fairly new to SSIS, and now I have this requirement to exclude weekends in order to do a performance management. Now I have created a calendar and marked the weekends; what I am trying to do, using SSIS, is get the start and end date of every status and count how many weekends are there. I am kind of struggling to know which component to use to achieve this task.
So I have mainly two tables:
1- Table Calendar
2- Table History-Log
Calendar has the following columns:
1- ID
2- date
3- year
4- month
5- day of week
6- isweekend
History-Log has the following:
1- ID
2- Status
3- startdate
4- enddate
Your help is really appreciated.
I'm not an SSIS user, so apologies if this answer does not help, but if I wanted to get the result you describe, based on some test data:
DECLARE #Calendar TABLE (
ID INT,
[Date] DATETIME,
[Year] INT,
[Month] INT,
[DayOfWeek] VARCHAR(10),
IsWeekend BIT
)
DECLARE #HistoryLog TABLE (
ID INT,
[Status] INT,
StartDate DATETIME,
EndDate DATETIME
)
DECLARE #StartDate DATE = '20100101', #NumberOfYears INT = 10
DECLARE #CutoffDate DATE = DATEADD(YEAR, #NumberOfYears, #StartDate);
INSERT INTO #Calendar
SELECT ROW_NUMBER() OVER (ORDER BY d) AS ID,
d AS [Date],
DATEPART(YEAR,d) AS [Year],
DATEPART(MONTH,d) AS [Month],
DATENAME(WEEKDAY,d) AS [DayOfWeek],
CASE WHEN DATENAME(WEEKDAY,d) IN ('Saturday','Sunday') THEN 1 ELSE 0 END AS IsWeekend
FROM
(
SELECT d = DATEADD(DAY, rn - 1, #StartDate)
FROM
(
SELECT TOP (DATEDIFF(DAY, #StartDate, #CutoffDate))
rn = ROW_NUMBER() OVER (ORDER BY s1.[object_id])
FROM sys.all_objects AS s1
CROSS JOIN sys.all_objects AS s2
ORDER BY s1.[object_id]
) AS x
) AS y;
INSERT INTO #HistoryLog
SELECT 1, 3, '2016-01-05', '2016-01-20'
UNION
SELECT 2, 7, '2016-01-08', '2016-01-25'
UNION
SELECT 3, 4, '2016-01-01', '2016-02-03'
UNION
SELECT 4, 3, '2016-02-09', '2016-02-10'
I would use a query like this to return all of the HistoryLog records with a count of the number of weekend days between their StartDate and EndDate:
SELECT h.ID,
h.[Status],
h.StartDate,
h.EndDate,
COUNT(c.ID) AS WeekendDays
FROM #HistoryLog h
LEFT JOIN #Calendar c ON c.[Date] >= h.StartDate AND c.[Date] <= h.EndDate AND c.IsWeekend = 1
GROUP BY h.ID, h.[Status], h.StartDate, h.EndDate
ORDER BY 1
If you wanted to know the number of weekends, rather than the number of weekend days, we'd need to slightly amend this logic (and define how a range containing only one weekend day - or one starting on a Sunday and ending on a Saturday inclusive - should be handled). Assuming you just want to know how many distinct weekends are at least partially within the date range, you could do:
SELECT h.ID,
h.[Status],
h.StartDate,
h.EndDate,
COUNT(weekends.ID) AS Weekends
FROM #HistoryLog h
LEFT JOIN
(
SELECT c.ID,
c.[Date] AS SatDate,
DATEADD(DAY,1,c.[Date]) AS SunDate
FROM #Calendar c
WHERE c.[DayOfWeek] = 'Saturday'
) weekends ON h.StartDate BETWEEN weekends.SatDate AND weekends.SunDate
OR h.EndDate BETWEEN weekends.SatDate AND weekends.SunDate
OR (h.StartDate <= weekends.SatDate AND h.EndDate >= weekends.SunDate)
GROUP BY h.ID, h.[Status], h.StartDate, h.EndDate

Count the number of times a date is contained between 2 date columns

I have a table that looks like this
ID start_dt end_dt
--------------------------
1 1951-12-05 1951-12-21
2 1951-12-19 1951-12-31
3 1957-12-05 1957-12-19
4 1995-12-06 1995-12-20
5 1996-06-24 1996-07-08
6 1997-05-12 1997-05-26
7 1997-10-07 1997-10-21
8 1997-12-25 1998-01-08
9 1998-01-19 1998-02-02
10 1998-08-05 1998-08-19
I'd like to know how many times each individual date is contained between start_dt and end_dt.
From my example, the result set should look something like this
date count
------------------
1951-12-05 1
1951-12-06 1
...
1951-12-19 2
1951-12-20 2
1951-12-21 2
...
1998-08-19 1
What would be the best way to do this?
EDIT: To clarify, I need each date that appears at least once in a date range (between start_dt and end_dt) to get a row in my result set and I want the number of ranges that this date fits in next to it
hope this helps
When you need to turn 2 values (a range) into a series of rows you can use a number table (see Aaron Bertrand's The SQL Server Numbers Table article if you aren't familiar with the idea).
I've used shorter and simpler data but you should get the idea.
declare #dates table (id int not null, start_dt date not null, end_dt date not null)
insert #dates values (1, '20160601', '20160603'),
(2, '20160603', '20160605'),
(3, '20160610', '20160612')
;with cte as (
select
row_number() over (order by so1.object_id) - 1 as n
from
sys.objects so1
cross join sys.objects so2
)
select
dateadd(d, c.n, d.start_dt) as [date],
count(*)
from
#dates d
join cte c on dateadd(d, c.n, d.start_dt) <= d.end_dt
group by
dateadd(d, c.n, d.start_dt)
order by
dateadd(d, c.n, d.start_dt)
If there are no more than a few days (< 80 or so, depending in your sys.objects table) between start_dt and end_dt, you can use this approach (inspired on Rhys').
DECLARE #dates TABLE (id int not null, start_dt date not null, end_dt date not null)
INSERT #dates VALUES
(1, '1951-12-05', '1951-12-21'),
(2, '1951-12-19', '1951-12-31'),
(3, '1957-12-05', '1957-12-19'),
(4, '1995-12-06', '1995-12-20'),
(5, '1996-06-24', '1996-07-08'),
(6, '1997-05-12', '1997-05-26'),
(7, '1997-10-07', '1997-10-21'),
(8, '1997-12-25', '1998-01-08'),
(9, '1998-01-19', '1998-02-02'),
(10, '1998-08-05', '1998-08-19');
WITH RawData AS (
SELECT
DATEADD(d, n.n, d.start_dt) AS [date]
FROM #dates d
INNER JOIN (
SELECT ROW_NUMBER() OVER (ORDER BY object_id) - 1 AS n FROM sys.objects
) n ON DATEADD(d, n.n, d.start_dt) <= d.end_dt
)
SELECT [date], COUNT(*) [count]
FROM RawData
GROUP BY [date]
ORDER BY [date]
I don't think this could take long even with 1000 date ranges. Perhaps you are using a table with more fields and even missing some index?
You could use a CTE
WITH CTE AS(SELECT start_dt AS dates FROM Table
UNION ALL
SELECT end_dt AS dates FROM Table)
SELECT CAST(dates as DATE) as Date, COUNT(dates) AS Count
FROM CTE c
GROUP BY c.dates
order by Count desc
Or perhaps you need something broader if your columns are of DATETIME data type. This way will GROUP BY the whole day:
WITH CTE AS(SELECT CAST(start_dt AS DATE) AS dates FROM Table
UNION ALL
SELECT CAST(end_dt AS DATE) AS dates FROM Table)
SELECT Dates as Date, COUNT(Dates) AS Count
FROM CTE c
GROUP BY c.dates
order by Count desc

Getting quantity between a range of months from 2 date parameters

I have a table that stores budget quantities for a company whose fiscal year begins 1st April and ends on 31st March the next year.
I have this query to extract figures for a particular month.
SELECT SUM(T1.U_Quantity) AS 'YTDBOwnMadeTea'
FROM [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA] T0
INNER JOIN [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA_ROW] T1
ON T0.DocEntry = T1.DocEntry
WHERE T1.U_Month = DATENAME(MONTH, '2015-04-01') AND T0.U_Source = 'NTEL'
There is an existing report that takes two parameters, a Start and End Date. (type datetime)
Table below: The month column is of type nvarchar.
How do I modify the query such when a user enters StartDate and EndDate e.g.
1st May 2015 and 31st July 2015, I will get a quantity result of 12640.
You can use couple of ways to do this.
One way would be to use PARSE. Like this.
SELECT SUM(T1.U_Quantity) AS 'YTDBOwnMadeTea'
FROM [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA] T0
INNER JOIN [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA_ROW] T1
ON T0.DocEntry = T1.DocEntry
WHERE PARSE((T1.U_Month + CONVERT(VARCHAR(4),YEAR(CURRENT_TIMESTAMP))) as datetime) BETWEEN #StartDate AND #EndDate
AND T0.U_Source = 'NTEL'
Another way would be to use a numbers table to map your month name to a month number and use it in your query.
;WITH CTE AS (
SELECT 1 as rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
MonthMap AS
(
SELECT ROW_NUMBER()OVER(ORDER BY rn ASC) as monthnumber FROM CTE
)
SELECT monthnumber,DATENAME(MONTH,DATEFROMPARTS(2016,monthnumber,1)) FROM MonthMap;
and then join it with your month table like this.
;WITH CTE AS (
SELECT 1 as rn UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1 UNION ALL SELECT 1
),
MonthMap AS
(
SELECT ROW_NUMBER()OVER(ORDER BY rn ASC) as monthnumber FROM CTE
)
SELECT SUM(T1.U_Quantity) AS 'YTDBOwnMadeTea'
FROM [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA] T0
INNER JOIN [SL_NTEL_DB_LIVE].[dbo].[#U_BUDG_MADETEA_ROW] T1
ON T0.DocEntry = T1.DocEntry
INNER JOIN MonthMap M ON T1.U_Month = DATENAME(MONTH,DATEFROMPARTS(2016,monthnumber,1))
WHERE M.monthnumber BETWEEN DATEPART(MONTH,#StartDate) AND DATEPART(MONTH,#EndDate)
AND T0.U_Source = 'NTEL';
You should compare both the approaches for performance. PARSE is simpler to use but would be difficult to index properly.
On a Separate note, you should avoid storing dates or date parts as month names as these take up more storage(even more since you are using NVARCHAR), and are difficult to use efficiently.

Rolling Prior13 months with Current Month Sales

Within a SQL Server 2012 database, I have a table with two columns customerid and date. I am interested in getting by year-month, a count of customers that have purchased in current month but not in prior 13 months. The table is extremely large so something efficient would be highly appreciated. Results table is shown after the input data. In essence, it is a count of customers that purchased in current month but not in prior 13 months (by year and month).
---input table-----
declare #Sales as Table ( customerid Int, date Date );
insert into #Sales ( customerid, date) values
( 1, '01/01/2012' ),
( 1, '04/01/2013' ),
( 1, '01/01/2014' ),
( 1, '01/01/2014' ),
( 1, '04/06/2014' ),
( 2, '04/01/2014' ),
( 3, '01/03/2012' ),
( 3, '01/03/2014' ),
( 4, '01/04/2012' ),
( 4, '04/04/2013' ),
( 5, '02/01/2010' ),
( 5, '02/01/2013' ),
( 5, '04/01/2014' )
select customerid, date
from #Sales;
---desired results ----
yearmth monthpurchasers monthpurchasernot13m
201002 1 1
201201 3 3
201302 1 1
201304 2 2
201401 2 1
201404 3 2
Thanks very much for looking at this!
Dev
You didn't provide the expected result, but I believe this is pretty close (at least logically):
;with g as (
select customerid, year(date)*100 + month(date) as mon
from #Sales
group by customerid, year(date)*100 + month(date)
),
x as (
select *,
count(*) over(partition by customerid order by mon
rows between 13 preceding and 1 preceding) as cnt
from g
),
y as (
select mon, count(*) as cnt from x
where cnt = 0
group by mon
)
select g.mon,
count(distinct(g.customerid)) as monthpurchasers,
isnull(y.cnt, 0) as cnt
from g
left join y on g.mon = y.mon
group by g.mon, y.cnt
order by g.mon
Tell me if this query helps. It extracts all the rows which meet your condition into a Table variable. Then, I use your query and join to this table.
declare #startDate datetime
declare #todayDate datetime
declare #tbl_Custs as Table(customerid int)
set #startDate = '04/01/2014' -- mm/dd/yyyy
set #todayDate = GETDATE()
insert into #tbl_Custs
-- purchased only this month
select customerid
from Sales
where ([date] >= #startDate and [date] <= #todayDate)
and customerid NOT in
(
-- purchased in past 13 months
select distinct customerid
from Sales
where ([date] >= DATEADD(MONTH,-13,[date])
and [date] < #startDate)
)
-- your query goes here
select year(date) as year
,month(date) as month
,count(distinct(c.customerid)) as monthpurchasers
from #tbl_Custs as c right join
Sales as s
on c.customerid = s.customerid
group by year(date) , month(date)
order by year(date) , month(date)
Below query will produce what you are looking for. I am not sure how performance will be on a big table (how big is your table?) but it is pretty straight forward so I think it will be ok. I simply calculate the 13 months earlier on CTE to find my sale window. Than join to the Sales table within that window / customer id and grouping records based on the unmatched records. You don't actually need 2 CTE's here you can do the DATEADD(mm,-13,date) on the join part of the second CTE but I thought it might be more clear this way.
P.S. If you need to change the time frame from 13 months to something else all you have to change is the DATEADD(mm,-13,date) this simply substracts 13 months from the date value.
Hope this helps or at least leads to a better solution
;WITH PurchaseWindow AS (
select customerid, date, DATEADD(mm,-13,date) minsaledate
FROM #Sales
), JoinBySaleWindow AS (
SELECT a.customerid, a.date,a.minsaledate,b.date earliersaledate
FROM PurchaseWindow a
LEFT JOIN #sales b ON a.customerid =b.customerid
--Find the sales for the customer within the last 13 months of original sale
AND b.date BETWEEN a.date AND a.minsaledate
)
SELECT DATEPART(yy,date) AS [year], DATEPART(mm, date) AS [month], COUNT(DISTINCT customerid) monthpurchases
FROM JoinBySaleWindow
--Exclude records where a sale within last 13 months occured
WHERE earliersaledate IS NULL
GROUP BY DATEPART(mm, date), DATEPART(yy,date)
Sorry about the typos they are fixed now.

Resources