Need help selecting a record between two date ranges? - sql-server

I am trying to select a record between two date ranges, but I keep getting duplicate records when two date ranges overlap, as shown below.
Here is an example.
Policy Info
Policy #  Policy Effective Date  Policy Termination Date  Year
001       2018-10-01             2019-10-01               2018
002       2019-10-01             2020-10-01               2019
003       2020-10-01             2021-10-01               2020
004       2021-10-01             2022-10-01               2022
Policy Limit
LimitID  Effective Date  Termination Date  Limit
1        2018-10-01      2021-10-01        1000
2        2018-10-01      3000-01-01        2500
How can I select Limit ID 1 for policies 001, 002 and 003 (the years 2018, 2019 and 2020), and Limit ID 2 for any policy with an effective date later than 2021-01-01?
I tried the following, but it keeps creating duplicates:
((limit.effective_from_date < policy.effective_to_date
AND limit.effective_to_date > policy.effective_from_date
)
OR
(limit.effective_from_date = policy.effective_from_date
AND limit.effective_to_date = CONVERT(datetime, '01/01/3000', 102)))
but the above condition creates a duplicate. Is there an effective way of selecting a record within overlapping date ranges?
Any help will be appreciated!

Your problem is that the Policy Limit periods overlap, so you need a rule for choosing one. From what I understand of your data (and I'm inferring a lot), you want the first limit whose period covers the whole policy period: [Policy Limit].[Effective Date] is on or before [Policy Info].[Policy Effective Date], and [Policy Limit].[Termination Date] is on or after [Policy Info].[Policy Termination Date].
If my guess is correct, you can do something like this:
drop table if exists #PolicyInfo
drop table if exists #PolicyLimit
CREATE TABLE #PolicyInfo (
Policy INT,
Policy_Effective_Date DATE,
Policy_termination_date DATE,
[Year] int
)
CREATE TABLE #PolicyLimit(
LimitID INT,
Effective_Date DATE,
Termination_Date DATE,
Limit INT
)
INSERT INTO #PolicyInfo (Policy, Policy_Effective_Date, Policy_termination_date, [Year])
VALUES
(001, '2018-10-01', '2019-10-01', 2018),
(002, '2019-10-01', '2020-10-01', 2019),
(003, '2020-10-01', '2021-10-01', 2020),
(004, '2021-10-01', '2022-10-01', 2022)
INSERT INTO #PolicyLimit (LimitID, Effective_Date, Termination_Date, Limit)
VALUES
(1, '2018-10-01','2021-10-01',1000),
(2, '2018-10-01','3000-01-01',2500)
;with cte AS (
-- Join PolicyInfo with PolicyLimit
-- condition: Policy_Effective_Date is between pl.Effective_Date and pl.Termination_Date
-- AND
-- Policy_termination_date is between pl.Effective_Date and pl.Termination_Date
SELECT *,
-- number the matching limits within each Policy partition
ROW_NUMBER() OVER (PARTITION BY [pi].Policy ORDER BY pl.Effective_Date, pl.Termination_Date) rn
FROM #PolicyInfo [pi]
INNER JOIN #PolicyLimit pl ON
[pi].Policy_Effective_Date BETWEEN pl.Effective_Date AND pl.Termination_Date
AND [pi].Policy_termination_date BETWEEN pl.Effective_Date AND pl.Termination_Date
)
SELECT Policy, LimitID
FROM cte
WHERE rn = 1 -- Select the first Limit per partition
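If the limit amount is needed alongside the ID, the same "first covering limit" rule can also be written with CROSS APPLY and TOP (1). This is only a sketch: the two CTEs rebuild the sample data from the question so the query runs on its own, and the tie-break order (earliest Effective_Date, then earliest Termination_Date) is carried over from the ROW_NUMBER() version rather than taken from any stated business rule.

```sql
-- Sample data rebuilt inline (assumption: same values as in the question)
;WITH PolicyInfo AS (
    SELECT Policy,
           CAST(EffDate  AS date) AS Policy_Effective_Date,
           CAST(TermDate AS date) AS Policy_termination_date
    FROM (VALUES (1, '20181001', '20191001'),
                 (2, '20191001', '20201001'),
                 (3, '20201001', '20211001'),
                 (4, '20211001', '20221001')) AS v(Policy, EffDate, TermDate)
),
PolicyLimit AS (
    SELECT LimitID,
           CAST(EffDate  AS date) AS Effective_Date,
           CAST(TermDate AS date) AS Termination_Date,
           LimitAmount
    FROM (VALUES (1, '20181001', '20211001', 1000),
                 (2, '20181001', '30000101', 2500)) AS v(LimitID, EffDate, TermDate, LimitAmount)
)
SELECT p.Policy, x.LimitID, x.LimitAmount
FROM PolicyInfo AS p
CROSS APPLY (
    -- Keep only the first limit whose period fully covers the policy period
    SELECT TOP (1) pl.LimitID, pl.LimitAmount
    FROM PolicyLimit AS pl
    WHERE p.Policy_Effective_Date   >= pl.Effective_Date
      AND p.Policy_termination_date <= pl.Termination_Date
    ORDER BY pl.Effective_Date, pl.Termination_Date
) AS x
ORDER BY p.Policy;
```

With this data, policies 1 to 3 pick LimitID 1 and policy 4 picks LimitID 2, because limit 1 ends before policy 4 does.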

Related

SQL- Calculating average of differences between times

I have an sql table that has transaction history of all the clients. I want to find what is the average difference in time between two transactions.
ClientCode Date
DL2xxx 2016-04-18 00:00:00.000
DL2xxx 2016-04-18 00:00:00.000
E19xxx 2016-04-18 00:00:00.000
E19xxx 2016-04-18 00:00:00.000
E19xxx 2016-04-18 00:00:00.000
JDZxxx 2016-04-18 00:00:00.000
The above are the first few rows of the table; Date is the date the transaction happened. I want the average difference in days between successive transactions. Say a client makes transactions on Day 1, Day 3, Day 10 and Day 15: the differences are {2, 7, 5}, whose average is about 4.67. If a client has only one transaction, the result should be 0.
ClientCode AverageDays
DL2xxx <float_value>
DL2xxx <float_value>
E19xxx <float_value>
This is what the output should look like where each unique client code occurs only once.
You can use a query like the one below, assuming your table is named T and the date column is DateofT:
select
ClientCode,
AvgDays = ISNULL(AVG(CAST(d AS FLOAT)), 0) -- cast first: AVG over an INT returns a truncated INT
from
(
select
*,
d=DATEDIFF(
d,
dateofT,
LEAD(DateofT) over(
partition by ClientCode
order by DateofT asc ))
from t
)t
group by ClientCode
If Windowing functions aren't available to you, here's an alternative
--CREATE SAMPLE DATA
CREATE TABLE #TMP(ClientID INT, EventDate DATE)
GO
INSERT INTO #TMP VALUES
(1,DATEADD(DD,RAND()*365,'20180101'))
,(2,DATEADD(DD,RAND()*365,'20180101'))
,(3,DATEADD(DD,RAND()*365,'20180101'))
,(4,DATEADD(DD,RAND()*365,'20180101'))
,(5,DATEADD(DD,RAND()*365,'20180101'))
GO 50
--PRE SQL 2012 Compatible
SELECT A.ClientID
,AVG(DATEDIFF(DD,C.EventDate,A.Eventdate)) AS ClientAvg
FROM #TMP A
CROSS APPLY (SELECT ClientID, MAX(EventDate) EventDate FROM #TMP B
WHERE A.ClientID = B.ClientID AND A.EventDate > B.EventDate
GROUP BY ClientID) C
GROUP BY A.ClientID
ORDER BY A.ClientID
You can use the LAG() function to compare each date to its previous date per client, then group by client and calculate the average.
IF OBJECT_ID('tempdb..#Transactions') IS NOT NULL
DROP TABLE #Transactions
CREATE TABLE #Transactions (
ClientCode VARCHAR(100),
Date DATE)
INSERT INTO #Transactions (
ClientCode,
Date)
VALUES
('DL2', '2016-04-18'),
('DL2', '2016-04-19'),
('DL2', '2016-04-26'),
('E19', '2016-01-01'),
('E19', '2016-01-11'),
('E19', '2016-01-12')
;WITH DayDifferences AS
(
SELECT
T.ClientCode,
T.Date,
DayDifference = DATEDIFF(
DAY,
LAG(T.Date) OVER (PARTITION BY T.ClientCode ORDER BY T.Date ASC),
T.Date)
FROM
#Transactions AS T
)
SELECT
D.ClientCode,
AverageDayDifference = ISNULL(AVG(CONVERT(FLOAT, D.DayDifference)), 0) -- AVG skips the NULL on each client's first row; ISNULL covers clients with a single transaction
FROM
DayDifferences AS D
GROUP BY
D.ClientCode
Using the observation that the sum of differences within a group is simply the max - min of that group, you can use the simple group by select:
select IIF(COUNT(*) > 1,
(CAST(DATEDIFF(day, MIN(DateofT), MAX(DateofT)) AS FLOAT)) / (COUNT(*) - 1), 0.0)
AS AVGDays, ClientCode
FROM t GROUP BY ClientCode
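To see why max minus min works, the Day 1, 3, 10, 15 example from the question can be checked directly: the gaps (3-1) + (10-3) + (15-10) telescope to 15-1 = 14. The sketch below uses a hypothetical table variable @t with made-up dates that follow that pattern, plus one client with a single transaction.

```sql
-- Hypothetical sample data: one client on Day 1, 3, 10, 15 and one single-transaction client
DECLARE @t TABLE (ClientCode varchar(10), DateofT date);
INSERT @t VALUES
    ('DL2xxx', '20160101'), ('DL2xxx', '20160103'),
    ('DL2xxx', '20160110'), ('DL2xxx', '20160115'),
    ('E19xxx', '20160418');

SELECT ClientCode,
       DATEDIFF(DAY, MIN(DateofT), MAX(DateofT)) AS SumOfGaps,     -- 14 for DL2xxx
       COUNT(*) - 1                              AS NumberOfGaps,  -- 3 for DL2xxx
       IIF(COUNT(*) > 1,
           CAST(DATEDIFF(DAY, MIN(DateofT), MAX(DateofT)) AS float) / (COUNT(*) - 1),
           0.0)                                  AS AvgDays
FROM @t
GROUP BY ClientCode;
```

DL2xxx comes out as 14 / 3, roughly 4.67, and E19xxx as 0 because the IIF guards against dividing by zero gaps.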

Calculating Year to Date Total

I want to generate a Payroll type query whereby the values in Payroll 1 (say for the previous month) should be included in Payroll 2 (for the current month) Year-to-Date Totals.
This can best be explained with an example:
DECLARE @MyTable TABLE(ID INT IDENTITY, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO @MyTable
VALUES (1,'Basic Salary',100),
(1,'Normal Over Time',50),
(1,'Work on Saturday',150),
(1,'Work on Sunday',200),
(2,'Basic Salary',100)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY Description ORDER BY PayrollID) AS [Month to Date]
FROM @MyTable
When I run the above I get
ID  EmployeeID  PayrollID  Description       Current Month  Month to Date
1   1           1          Basic Salary      100            100
2   1           1          Normal Over Time  50             50
3   1           1          Work on Saturday  150            150
4   1           1          Work on Sunday    200            200
5   1           2          Basic Salary      100            200
The Year-to-Date running totals are per each Description meaning Basic Salary Category has its own running total and so does Saturday and Sunday etc, etc. You will notice that for Basic Salary in Payroll 2 the running Year-to-Date total is 200 (i.e. 100 from Payroll 1 + 100 from Payroll 2)
The challenge I have is that Payroll 1 has data for Basic Salary, Work on Saturday and Work on Sunday whereas Payroll 2 only has Basic Salary as the employee did not work on Saturday nor on Sunday in Payroll 2 (the current month).
However, in the cumulative Year-to-Date column the data from Payroll 1 (previous month) should still be selected and included in the Year-to-Date running Total -
something like this:
ID  EmployeeID  PayrollID  Description       Current Month  Month to Date
1   1           1          Basic Salary      100            100
2   1           1          Normal Over Time  50             50
3   1           1          Work on Saturday  150            150
4   1           1          Work on Sunday    200            200
5   1           2          Basic Salary      100            200
2   1           1          Normal Over Time  NULL           50
3   1           1          Work on Saturday  NULL           150
4   1           1          Work on Sunday    NULL           200
Although the employee did not work on Saturday nor Sunday in the current month (Payroll 2) the running (Year-to-Date) totals for working on a Saturday should be 150 that he/she worked in the previous month (Payroll 1). The same should apply to working on Sunday where the running total in the current month (Payroll 2) should be the 200 that he/she worked in the previous month (Payroll 1).
How do I do that with a simple Select Statement without writing a complicated Procedure?
EDIT:
I have cleaned up the code as follows:
DECLARE @MyTable TABLE(ID INT IDENTITY, EmployeeID INT, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO @MyTable
VALUES (1,1,'Basic Salary',100),
(1,1,'Normal Over Time',50),
(1,1,'Work on Saturday',150),
(1,1,'Work on Sunday',200),
(1,2,'Basic Salary',100)
WITH pay_elements AS
(
SELECT Description
FROM @MyTable
GROUP BY Description
)
,pay_slips AS
(
SELECT EmployeeID, PayrollID
FROM @MyTable
GROUP BY EmployeeID, PayrollID
)
,pay_lines AS
(
SELECT
mt.ID
,PS.EmployeeID
,PS.PayrollID
,PE.Description
,ISNULL(mt.[Current Month], 0) AS [Current Month]
FROM
pay_slips AS ps
OUTER APPLY
pay_elements AS pe
LEFT JOIN
@MyTable AS mt
ON (mt.EmployeeID = ps.EmployeeID)
AND (mt.PayrollID = ps.PayrollID)
AND (mt.Description = pe.Description)
)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY EmployeeID, Description ORDER BY PayrollID) AS [Month to Date]
FROM pay_lines
And I get this error:
Msg 319, Level 15, State 1, Line 10
Incorrect syntax near the keyword 'with'. If this statement is a common table expression, an xmlnamespaces clause or a change tracking context clause, the previous statement must be terminated with a semicolon.
Msg 102, Level 15, State 1, Line 17
Incorrect syntax near ','.
Msg 102, Level 15, State 1, Line 23
Incorrect syntax near ','.
You first need to build a "structure" of row headings, and then join that onto the actual data. (The Msg 319 error itself only means that the statement before WITH must end with a semicolon.)
So for example:
-- the leading semicolon terminates the previous statement, which is what Msg 319 complains about
;WITH pay_elements AS
(
SELECT Description
FROM @MyTable
GROUP BY Description
)
,pay_slips AS
(
SELECT EmployeeID, PayrollID
FROM @MyTable
GROUP BY EmployeeID, PayrollID
)
,pay_lines AS
(
SELECT
mt.ID
,ps.EmployeeID
,ps.PayrollID
,pe.Description
,ISNULL(mt.[Current Month], 0) AS [Current Month]
FROM
pay_slips AS ps
-- every payroll gets one row per pay element
CROSS JOIN
pay_elements AS pe
LEFT JOIN
@MyTable AS mt
ON (mt.EmployeeID = ps.EmployeeID)
AND (mt.PayrollID = ps.PayrollID)
AND (mt.Description = pe.Description)
)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY EmployeeID, Description ORDER BY PayrollID) AS [Month to Date]
FROM pay_lines
What we're doing here is getting a list of the different kinds of pay elements in your table. Then we're getting a list of Employees and Payrolls done to date, and manually forcing every Payroll to include a row for each possible pay element.
Once that structure is built, we join onto the base table to get the actual values (replacing NULLs with zeros, for those pay elements that weren't originally included in the base table).
Then we simply query this padded-out table in the same way you did originally.
Note, I've written this on the fly and haven't checked this code so please excuse any minor errors.
I am a little confused by the Year-to-Date column you mention in your description. I assume it is the [Month to Date] column in your query; please correct me if I am wrong.
I think what you are trying to achieve is that the descriptions missing from payroll ID 2, such as Work on Saturday and Work on Sunday, are still listed in the result set.
The problem is:
Payroll 2 has no rows at all for those descriptions, so the window SUM has nothing to attach 50, 150 and 200 to in the [Month to Date] column. (SUM itself skips NULL inputs; it only returns NULL when every input is NULL.)
You can have fixed categories against each payroll id:
Normal Over Time
Work on Saturday
Work on Sunday
Basic Salary
Query:
DECLARE @MyTable TABLE(ID INT IDENTITY, PayrollID INT, Description NVARCHAR(MAX), [Current Month] MONEY)
INSERT INTO @MyTable
VALUES (1,'Basic Salary',100),
(1,'Normal Over Time',50),
(1,'Work on Saturday',150),
(1,'Work on Sunday',200),
(2,'Basic Salary',100),
(2,'Normal Over Time',0),
(2,'Work on Saturday',0),
(2,'Work on Sunday',0)
SELECT * ,SUM([Current Month]) OVER (PARTITION BY Description ORDER BY PayrollID) AS [Month to Date]
FROM @MyTable order by ID,PayrollID
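One caveat on the NULL point: as far as I know, the SUM aggregate skips NULL inputs and only returns NULL when a group has no non-NULL value at all, so inserting NULL instead of 0 for the padded rows would give the same running totals. A minimal sketch with a hypothetical table variable @v:

```sql
-- Hypothetical data: group 1 mixes a value with a NULL, group 2 has only a NULL
DECLARE @v TABLE (grp int, amount money);
INSERT @v VALUES (1, 100), (1, NULL), (2, NULL);

SELECT grp, SUM(amount) AS Total
FROM @v
GROUP BY grp;
-- grp 1: the NULL is skipped, so Total is 100
-- grp 2: no non-NULL input, so Total is NULL
```

What matters for the running total is therefore that a row exists for each description in each payroll, not which placeholder value it carries.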

T-SQL Get Records for this year grouped by month

I have a table of data which looks like this
ID CreatedDate
A123 2015-01-01
B124 2016-01-02
A125 2016-01-03
A126 2016-01-04
What I would like to do is group by month (as text) for this year only. I have come up with the following query, but it returns data from all years, not just this one:
Select Count(ID), DateName(month,createddate) from table
Where (DatePart(year,createddate)=datepart(year,getdate())
Group by DateName(month,createddate)
This returns
Count CreatedDate
4 January
Instead of
Count CreatedDate
3 January
Where have I gone wrong? I'm sure it's something to do with converting the date to month where it goes wrong
Just tested your code:
;WITH [table] AS (
SELECT *
FROM (VALUES
('A123', '2015-01-01'),
('B124', '2016-01-02'),
('A125', '2016-01-03'),
('A126', '2016-01-04')
) as t(ID, CreatedDate)
)
SELECT COUNT(ID),
DATENAME(month,CreatedDate)
FROM [table]
WHERE DATEPART(year,CreatedDate)=DATEPART(year,getdate())
GROUP BY DATENAME(month,CreatedDate)
Output was
3 January
The only change I made was removing the unbalanced ( after WHERE.
select count(id) as Count,
case when month(createddate)=1 THEN 'January' END as CreatedDate
from [table]
--where year(createddate)=2016 optional if you only want the 2016 count
group by month(createddate),year(createdDate)
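As a variation on the year filter, comparing the column against a date range avoids wrapping CreatedDate in a function inside WHERE, which usually lets an index on that column be used. This is a sketch under assumptions: @Records is a hypothetical stand-in for the real table, @year is hard-coded to 2016 so the sample rows match (use YEAR(GETDATE()) for "this year"), and DATEFROMPARTS needs SQL Server 2012 or later.

```sql
-- Hypothetical stand-in for the real table, filled with the rows from the question
DECLARE @Records TABLE (ID varchar(10), CreatedDate date);
INSERT @Records VALUES
    ('A123', '20150101'), ('B124', '20160102'),
    ('A125', '20160103'), ('A126', '20160104');

DECLARE @year int = 2016;  -- replace with YEAR(GETDATE()) for the current year

SELECT DATENAME(month, CreatedDate) AS CreatedMonth,
       COUNT(ID)                    AS [Count]
FROM @Records
WHERE CreatedDate >= DATEFROMPARTS(@year, 1, 1)       -- first day of the year, inclusive
  AND CreatedDate <  DATEFROMPARTS(@year + 1, 1, 1)   -- first day of next year, exclusive
GROUP BY DATENAME(month, CreatedDate), MONTH(CreatedDate)
ORDER BY MONTH(CreatedDate);                          -- calendar order, not alphabetical
```

With the sample rows this leaves only the three 2016 records, all in January.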

Populating a list of dates without a defined end date - SQL server

I have a list of accounts and their cost which changes every few days.
In this list I only have the start date every time the cost updates to a new one, but no column for the end date.
Meaning, I need to populate a list of dates when the end date for a specific account and cost, should be deduced as the start date of the same account with a new cost.
More or less like that:
Account start date cost
one 1/1/2016 100$
two 1/1/2016 150$
one 4/1/2016 200$
two 3/1/2016 200$
And the result I need would be:
Account date cost
one 1/1/2016 100$
one 2/1/2016 100$
one 3/1/2016 100$
one 4/1/2016 200$
two 1/1/2016 150$
two 2/1/2016 150$
two 3/1/2016 200$
For example, if the cost changed in the middle of the month, then the sample data will only hold two records (one for each unique combination of account, start date and cost), while the results will hold 30 records with the cost for each and every day of the month (15 for the first cost and 15 for the second one). The costs are given (inserted manually) and do not need to be calculated.
Note the result contains more records because the sample data shows only a start date and an updated cost for that account, as of that date. While the results show the cost for every day of the month.
Any ideas?
Solution is a bit long.
I added an extra date for test purposes:
DECLARE @t table(account varchar(10), startdate date, cost int)
INSERT @t
values
('one','1/1/2016',100),('two','1/1/2016',150),
('one','1/4/2016',200),('two','1/3/2016',200),
('two','1/6/2016',500) -- extra row
;WITH CTE as
( SELECT
row_number() over (partition by account order by startdate) rn,
*
FROM @t
),N(N)AS
(
SELECT 1 FROM(VALUES(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))M(N)
),
tally(N) AS -- tally is limited to 1000 days
(
SELECT ROW_NUMBER()OVER(ORDER BY N.N) - 1 FROM N,N a,N b
),GROUPED as
(
SELECT
cte.account, cte.startdate, cte.cost, cte2.cost cost2, cte2.startdate enddate
FROM CTE
JOIN CTE CTE2
ON CTE.account = CTE2.account
and CTE.rn = CTE2.rn - 1
)
-- used DISTINCT to avoid overlapping dates
SELECT DISTINCT
CASE WHEN datediff(d, startdate,enddate) = N THEN cost2 ELSE cost END cost,
dateadd(d, N, startdate) startdate,
account
FROM grouped
JOIN tally
ON datediff(d, startdate,enddate) >= N
Result:
cost startdate account
100 2016-01-01 one
100 2016-01-02 one
100 2016-01-03 one
150 2016-01-01 two
150 2016-01-02 two
200 2016-01-03 two
200 2016-01-04 one
200 2016-01-04 two
200 2016-01-05 two
500 2016-01-06 two
Thank you @t-clausen.dk!
It didn't solve the problem completely, but did direct me in the correct way.
Eventually I used the LEAD function to generate an end date for every cost per account, and then I was able to populate a list of dates based on that idea.
Here's how I generate the end dates:
DECLARE @t table(account varchar(10), startdate date, cost int)
INSERT @t
values
('one','1/1/2016',100),('two','1/1/2016',150),
('one','1/4/2016',200),('two','1/3/2016',200),
('two','1/6/2016',500)
select account
,[startdate]
,DATEADD(DAY, -1, LEAD([Startdate], 1,'2100-01-01') OVER (PARTITION BY account ORDER BY [Startdate] ASC)) AS enddate
,cost
from @t
It returned the expected result:
account startdate enddate cost
one 2016-01-01 2016-01-03 100
one 2016-01-04 2099-12-31 200
two 2016-01-01 2016-01-02 150
two 2016-01-03 2016-01-05 200
two 2016-01-06 2099-12-31 500
Please note that I set the end date of current costs to be some date in the far future which means (for me) that they are currently active.
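The second step, turning each start/end range into one row per day, is not shown above. One way it could be done is to join the ranges to a small numbers (tally) CTE, sketched below. The @reportEnd cap is a hypothetical parameter that replaces the far-future end date so open-ended costs do not expand into decades of rows, and the date literals are written as yyyymmdd to avoid any day/month ambiguity.

```sql
-- Same sample rows as above, with unambiguous date literals
DECLARE @t TABLE (account varchar(10), startdate date, cost int);
INSERT @t VALUES
    ('one', '20160101', 100), ('two', '20160101', 150),
    ('one', '20160104', 200), ('two', '20160103', 200),
    ('two', '20160106', 500);

DECLARE @reportEnd date = '20160107';  -- hypothetical last day to report on

;WITH ranges AS (
    SELECT account, startdate, cost,
           -- day before the next start date, or @reportEnd for the current cost
           DATEADD(DAY, -1,
                   LEAD(startdate, 1, DATEADD(DAY, 1, @reportEnd))
                       OVER (PARTITION BY account ORDER BY startdate)) AS enddate
    FROM @t
),
digits(n) AS (
    SELECT n FROM (VALUES (0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) AS d(n)
),
tally(n) AS (  -- 0..999, so a single range is limited to 1000 days
    SELECT a.n + 10 * b.n + 100 * c.n
    FROM digits AS a CROSS JOIN digits AS b CROSS JOIN digits AS c
)
SELECT r.account,
       DATEADD(DAY, t.n, r.startdate) AS [date],
       r.cost
FROM ranges AS r
JOIN tally AS t
  ON t.n <= DATEDIFF(DAY, r.startdate, r.enddate)  -- one row per day in the range
ORDER BY r.account, [date];
```

Because each range ends the day before the next one starts, no date appears twice for an account and DISTINCT is not needed.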

GROUP BY DAY, CUMULATIVE SUM

I have a table in MSSQL with the following structure:
PersonId
StartDate
EndDate
I need to be able to show the number of distinct people in the table within a date range or at a given date.
As an example, I need to show the running totals on a daily basis, e.g. if we have 2 entries on the 1st June, 3 on the 2nd June and 1 on the 3rd June, the system should show the following result:
1st June: 2
2nd June: 5
3rd June: 6
If, however, one of the entries on the 2nd June also has an end date of 2nd June, then the 3rd June result would show just 5.
Would someone be able to assist with this.
Thanks
UPDATE
This is what I have so far, which seems to work. Is there a better solution, though? My solution only gets me the employed figures. I also need the unemployed figures in another column; unemployed means either no entry in the table, or the date is not between the start and end dates and there is no other entry as employed.
CREATE TABLE #Temp(CountTotal int NOT NULL, CountDate datetime NOT NULL);
DECLARE @StartDT DATETIME
SET @StartDT = '2015-01-01 00:00:00'
WHILE @StartDT < '2015-08-31 00:00:00'
BEGIN
INSERT INTO #Temp(CountTotal, CountDate)
SELECT COUNT(DISTINCT PERSON.Id) AS CountTotal, @StartDT AS CountDate FROM PERSON
INNER JOIN DATA_INPUT_CHANGE_LOG ON PERSON.DataInputTypeId = DATA_INPUT_CHANGE_LOG.DataInputTypeId AND PERSON.Id = DATA_INPUT_CHANGE_LOG.DataItemId
LEFT OUTER JOIN PERSON_EMPLOYMENT ON PERSON.Id = PERSON_EMPLOYMENT.PersonId
WHERE PERSON.Id > 0 AND DATA_INPUT_CHANGE_LOG.Hidden = '0' AND DATA_INPUT_CHANGE_LOG.Approved = '1'
AND ((PERSON_EMPLOYMENT.StartDate <= DATEADD(MONTH,1,@StartDT) AND PERSON_EMPLOYMENT.EndDate IS NULL)
OR (@StartDT BETWEEN PERSON_EMPLOYMENT.StartDate AND PERSON_EMPLOYMENT.EndDate) AND PERSON_EMPLOYMENT.EndDate IS NOT NULL)
SET @StartDT = DATEADD(MONTH,1,@StartDT)
END
select * from #Temp
drop TABLE #Temp
You can use the following query. The cte part is to generate a set of serial dates between the start date and end date.
DECLARE @ViewStartDate DATETIME
DECLARE @ViewEndDate DATETIME
SET @ViewStartDate = '2015-01-01 00:00:00.000';
SET @ViewEndDate = '2015-02-25 00:00:00.000';
;WITH Dates([Date])
AS
(
SELECT @ViewStartDate
UNION ALL
SELECT DATEADD(DAY, 1,Date)
FROM Dates
WHERE DATEADD(DAY, 1,Date) <= @ViewEndDate
)
-- COUNT(PersonData.PersonId) rather than COUNT(*), so a day with no matching person counts as 0, not 1
SELECT [Date], COUNT(PersonData.PersonId) AS PersonCount
FROM Dates
LEFT JOIN PersonData ON Dates.Date >= PersonData.StartDate
AND Dates.Date <= PersonData.EndDate
GROUP By [Date]
Replace PersonData with your table name.
If the StartDate and EndDate columns can be NULL, you need to add additional conditions to the join.
The query assumes one person has only one record in the same date range.
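One thing to watch with the recursive date CTE: SQL Server stops a recursive CTE after 100 levels by default, so a range longer than about 100 days fails unless the limit is raised with the MAXRECURSION query hint. A minimal sketch of just the date generator over a full year (the variable names follow the query above; the 2015 range is only an example):

```sql
DECLARE @ViewStartDate date = '20150101';
DECLARE @ViewEndDate   date = '20151231';  -- 365 days, well past the default limit of 100 levels

;WITH Dates([Date]) AS (
    SELECT @ViewStartDate                  -- anchor row: first day of the range
    UNION ALL
    SELECT DATEADD(DAY, 1, [Date])         -- recursive step: add one day at a time
    FROM Dates
    WHERE [Date] < @ViewEndDate
)
SELECT [Date]
FROM Dates
OPTION (MAXRECURSION 366);                 -- raise the cap; 0 would remove it entirely
```

A finite cap such as 366 is the more cautious choice, since MAXRECURSION 0 would let a faulty WHERE clause recurse without limit.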
You could do this by creating data where every start date is a +1 event and end date is -1 and then calculate a running total on top of that.
For example if your data is something like this
PersonId StartDate EndDate
1 20150101 20150201
2 20150102 20150115
3 20150101
You first create a data set that looks like this:
EventDate ChangeValue
20150101 +2
20150102 +1
20150115 -1
20150201 -1
And if you use running total, you'll get this:
EventDate Total
2015-01-01 2
2015-01-02 3
2015-01-15 2
2015-02-01 1
You can get it with something like this:
select
p.eventdate,
sum(p.changevalue) over (order by p.eventdate asc) as total
from
(
select startdate as eventdate, sum(1) as changevalue from personnel group by startdate
union all
select enddate, sum(-1) from personnel where enddate is not null group by enddate
) p
order by p.eventdate asc
Having window function with sum() requires SQL Server 2012. If you're using older version, you can check other options for running totals.
If you have dates that don't have any events and you need to show those too, then the best option is probably to create a separate table of dates for the whole range you'll ever need, for example 1.1.2000 - 31.12.2099.
-- Edit --
To get count for a specific day, it's possible use the same logic, but just sum everything up to that day:
declare @eventdate date
set @eventdate = '20150117'
select
sum(p.changevalue)
from
(
select startdate as eventdate, 1 as changevalue from personnel
where startdate <= @eventdate
union all
select enddate, -1 from personnel
where enddate < @eventdate
) p
Hopefully this is ok, can't test since SQL Fiddle seems to be unavailable.

Resources