Group Values between date time range - sql-server

I have a table that looks like this:
I want to group the data by the days where the day difference is 1. Rows where the day difference is greater than 1 should be folded into the groups for the days with difference = 1. The result should look like this:
So I tried this way:
SELECT
InicialData
,FinalDate
FROM #ValueAnalysis
WHERE WorkDays=1
GROUP BY InicialData, FinalDate
And now I want to join the orders where the daydifference is bigger than 1.
Can someone help me please?

If you have SQL Server 2017 or newer, you can use STRING_AGG. For example like this:
SELECT va.InitialDate,
va.FinalDate,
STRING_AGG(cc.Order_,',') WITHIN GROUP (ORDER BY cc.Order_ ASC) as Order_1
FROM ValueAnalysis va
OUTER APPLY (
SELECT Order_
FROM ValueAnalysis
WHERE va.InitialDate = InitialDate or va.FinalDate = FinalDate
) as cc
WHERE DayDiff = 1
GROUP BY va.InitialDate, va.FinalDate
Output:
| InitialDate | FinalDate | Order_1 |
|-------------|------------|---------|
| 2020-01-02 | 2020-01-02 | 21,23 |
| 2020-01-03 | 2020-01-03 | 22 |
| 2020-01-04 | 2020-01-04 | 23,24 |
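To try the query above end to end, here is a self-contained sketch. The table variable @ValueAnalysis and its four rows are hypothetical sample data, chosen only so that the result lines up with the output shown; the real table's columns and values may differ. It assumes SQL Server 2017 or newer for STRING_AGG.

```sql
-- Hypothetical sample data: order 23 spans three days, the others span one.
DECLARE @ValueAnalysis TABLE (
    Order_      int,
    InitialDate date,
    FinalDate   date,
    DayDiff     int
);

INSERT INTO @ValueAnalysis (Order_, InitialDate, FinalDate, DayDiff) VALUES
    (21, '2020-01-02', '2020-01-02', 1),
    (22, '2020-01-03', '2020-01-03', 1),
    (23, '2020-01-02', '2020-01-04', 3),
    (24, '2020-01-04', '2020-01-04', 1);

-- One output row per single-day range; every order that starts or ends
-- on that day is concatenated into Order_1.
SELECT va.InitialDate,
       va.FinalDate,
       STRING_AGG(cc.Order_, ',') WITHIN GROUP (ORDER BY cc.Order_ ASC) AS Order_1
FROM @ValueAnalysis AS va
OUTER APPLY (
    SELECT v2.Order_
    FROM @ValueAnalysis AS v2
    WHERE v2.InitialDate = va.InitialDate
       OR v2.FinalDate   = va.FinalDate
) AS cc
WHERE va.DayDiff = 1
GROUP BY va.InitialDate, va.FinalDate;
```

With this sample data, order 23 shows up under both 2020-01-02 (its start) and 2020-01-04 (its end), but not under 2020-01-03, because the APPLY only matches on the endpoints.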
EDIT#1
If you need to generate a new table that includes dates missing from the original dataset, you can use a recursive CTE:
;WITH cte AS (
SELECT MIN(InitialDate) InitialDate,
MAX(FinalDate) FinalDate
FROM ValueAnalysis
UNION ALL
SELECT DATEADD(day,1,InitialDate),
FinalDate
FROM cte
WHERE InitialDate < FinalDate
)
SELECT c.InitialDate InitialDate,
c.InitialDate FinalDate,
STRING_AGG(va.Order_,',') WITHIN GROUP (ORDER BY va.Order_ ASC) as Order_1
FROM cte c
LEFT JOIN ValueAnalysis va
ON va.InitialDate = c.InitialDate or va.FinalDate = c.InitialDate
GROUP BY c.InitialDate, c.FinalDate
OPTION (MAXRECURSION 100)

Related

Partition by syntax

I have the following statement which works to get the most recent row of data for a particular DDI. What I now want to do is replace the single DDI in the where statement with a long list of them but still have only the most recent row for each. I'm pretty sure that I need to use OVER and PARTITION BY to get a separate window for each DDI but even reading the microsoft documentation and a more simplified tutorial I still can't get the syntax right. I suspect I just need a nudge in the right direction. Can anyone help?
https://learn.microsoft.com/en-us/sql/t-sql/queries/select-over-clause-transact-sql?view=sql-server-2017
http://www.sqltutorial.org/sql-window-functions/sql-partition-by/
SELECT TOP 1
[Start Time]
,[Agent Name]
,[Reference]
,[charged op. (sec)]
,[Type]
,[Activation ID] as [actid]
FROM [iPR].[dbo].[InboundCallsView]
Where [type] = 'Normal operator call'
AND [DDI] = @DDI
Order By [Start Time] Desc
Not sure how you plan on handling the multiple values for DDI but that may be an issue. The best approach would be to use a table valued parameter. If you pass in a delimited list you have to split the string too which is not a good way of handling this type of thing.
This query will return the most recent for every DDI.
SELECT
[Start Time]
, [Agent Name]
, [Reference]
, [charged op. (sec)]
, [Type]
, [actid]
from
(
SELECT
[Start Time]
, [Agent Name]
, [Reference]
, [charged op. (sec)]
, [Type]
, [Activation ID] as [actid]
, RowNum = ROW_NUMBER() over(partition by DDI order by [Start Time] desc)
FROM [iPR].[dbo].[InboundCallsView]
where [type] = 'Normal operator call'
--and [DDI] = @DDI
) x
where x.RowNum = 1
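To restrict the result to a list of DDIs without splitting a delimited string, one option is to hold the list in a table and join to it. The following is a minimal sketch under stated assumptions: @Calls is a hypothetical stand-in for InboundCallsView with simplified column names, and @DdiList is a hypothetical table variable holding the wanted DDIs. Inside a stored procedure, the list would typically arrive as a READONLY table-valued parameter instead.

```sql
-- Hypothetical stand-in for the view, with simplified column names.
DECLARE @Calls TABLE (
    DDI       varchar(20),
    StartTime datetime2(0),
    AgentName varchar(50)
);

INSERT INTO @Calls (DDI, StartTime, AgentName) VALUES
    ('1', '2019-03-28 08:00', 'agent1'),
    ('1', '2019-03-28 09:00', 'agent2'),
    ('2', '2019-03-27 08:00', 'agent3'),
    ('2', '2019-03-27 09:00', 'agent4'),
    ('3', '2019-03-26 10:00', 'agent5');  -- not in the list, so filtered out

-- Hypothetical list of DDIs to report on.
DECLARE @DdiList TABLE (DDI varchar(20) PRIMARY KEY);
INSERT INTO @DdiList (DDI) VALUES ('1'), ('2');

-- Number the rows per DDI, newest first, and keep only the newest.
SELECT x.DDI, x.StartTime, x.AgentName
FROM (
    SELECT c.DDI,
           c.StartTime,
           c.AgentName,
           RowNum = ROW_NUMBER() OVER (PARTITION BY c.DDI ORDER BY c.StartTime DESC)
    FROM @Calls AS c
    INNER JOIN @DdiList AS d
        ON d.DDI = c.DDI
) AS x
WHERE x.RowNum = 1;
-- Returns agent2 for DDI 1 and agent4 for DDI 2.
```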
So let's assume a table with this data (notice how I cleaned up the column names to remove spaces, special characters, etc.):
+---+------------------+--------+------+----+------+---+
| 1 | 2019-03-28 08:00 | agent1 | foo1 | 60 | foo1 | 1 |
| 1 | 2019-03-28 09:00 | agent2 | foo2 | 70 | foo2 | 2 |
| 2 | 2019-03-27 08:00 | agent3 | foo3 | 80 | foo3 | 3 |
| 2 | 2019-03-27 09:00 | agent4 | foo4 | 90 | foo4 | 4 |
+---+------------------+--------+------+----+------+---+
As you say, you can use a window function to get what you want. However, let me show you a method that doesn't require a window function first.
You want records where the StartTime is the max value for that DDI. You can obtain the max StartTime for each DDI with the following query:
SELECT
ddi,
max_start = MAX(StartTime)
FROM InboundCallsView
GROUP BY ddi
You can then join that query to your base table/view to get the records you want. Using an intermediate CTE, you can do the following:
WITH
ddiWithMaxStart AS
(
SELECT
ddi,
max_start = MAX(StartTime)
FROM InboundCallsView
GROUP BY ddi
)
SELECT InboundCallsView.*
FROM InboundCallsView
INNER JOIN ddiWithMaxStart ON
ddiWithMaxStart.ddi = InboundCallsView.ddi
AND ddiWithMaxStart.max_start = InboundCallsView.StartTime
Now, if you really want to use WINDOW functions, you can use ROW_NUMBER for a similar effect:
WITH
ddiWithRowNumber AS
(
SELECT
InboundCallsView.*,
rn = ROW_NUMBER() OVER
(
PARTITION BY ddi
ORDER BY ddi, StartTime DESC
)
FROM InboundCallsView
)
SELECT *
FROM ddiWithRowNumber
WHERE rn = 1
Notice that with this method, you don't need to join the base view/table to the intermediate CTE.
You can test out performance of each method to see which works best for you.

using all values from one column in another query

I am trying to find a solution for the following issue that I have in sql-server:
I have one table t1 of which I want to use each date for each agency and loop it through the query to find out the avg_rate. Here is my table t1:
Table T1:
+--------+-------------+
| agency | end_date |
+--------+-------------+
| 1 | 2017-10-01 |
| 2 | 2018-01-01 |
| 3 | 2018-05-01 |
| 4 | 2012-01-01 |
| 5 | 2018-04-01 |
| 6 | 2017-12-01 |
+--------+-------------+
I literally want to use all values in the column end_date and plug it into the query here (I marked it with ** **):
with averages as (
select a.id as agency
,c.rate
, avg(c.rate) over (partition by a.id order by a.id ) as avg_cost
from table_a as a
join rates c on a.rate_id = c.id
and c.end_date = **here I use all values from t1.end_date**
and c.Start_date = **here I use all values from above minus half a year** = dateadd(month,-6,end_date)
group by a.id
,c.rate
)
select distinct agency, avg_cost from averages
order by 1
The reason why I need two dynamic dates is that the avg_rates vary if you change the timeframe between these dates.
My problem and my question is now:
How can I take each end_date from table t1, plug it into the query where c.end_date is, and loop it through all values in t1.end_date?
I appreciate your help!
Do you really need a windowed average? Try this out.
;with timeRanges AS
(
SELECT
T.end_date,
start_date = dateadd(month,-6, T.end_date)
FROM
T1 AS T
)
select
a.id as agency,
c.rate,
T.end_date,
T.start_date,
avg_cost = avg(c.rate)
from
table_a as a
join rates c on a.rate_id = c.id
join timeRanges AS T ON A.DateColumn BETWEEN T.start_date AND T.end_date
group by
a.id ,
c.rate,
T.end_date,
T.start_date
You need a date column to join your data against T1 (I called it DateColumn in this example), otherwise all time ranges would return the same averages.
I can think of several ways to do this - Cursor, StoredProcedure, Joins ...
Given the simplicity of your query, a cartesian product (Cross Join) of Table T1 against the averages CTE should do the magic.
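As a rough illustration of replacing the loop with a join, the sketch below joins T1 straight into the query so that each agency's end_date (and the date six months earlier) drives the match. All three table variables and their rows are hypothetical, and it assumes that T1.agency corresponds to table_a.id and that rates has start_date and end_date columns, as the question's query suggests.

```sql
-- Hypothetical sample data mirroring the tables named in the question.
DECLARE @T1 TABLE (agency int, end_date date);
DECLARE @table_a TABLE (id int, rate_id int);
DECLARE @rates TABLE (id int, rate decimal(9,2), start_date date, end_date date);

INSERT INTO @T1 (agency, end_date) VALUES
    (1, '2017-10-01'),
    (2, '2018-01-01');

INSERT INTO @table_a (id, rate_id) VALUES
    (1, 10), (1, 11), (1, 13),
    (2, 12);

INSERT INTO @rates (id, rate, start_date, end_date) VALUES
    (10, 100.00, '2017-04-01', '2017-10-01'),
    (11, 200.00, '2017-04-01', '2017-10-01'),
    (12,  50.00, '2017-07-01', '2018-01-01'),
    (13, 999.00, '2016-01-01', '2017-10-01');  -- wrong start_date, excluded

-- Each T1 row supplies its own end_date; no loop or cursor is needed.
SELECT t.agency,
       t.end_date,
       avg_cost = AVG(c.rate)
FROM @T1 AS t
INNER JOIN @table_a AS a
    ON a.id = t.agency
INNER JOIN @rates AS c
    ON  c.id         = a.rate_id
    AND c.end_date   = t.end_date
    AND c.start_date = DATEADD(month, -6, t.end_date)
GROUP BY t.agency, t.end_date
ORDER BY t.agency;
```

Here agency 1 averages the two matching rates (100 and 200), and the 999 rate is left out because its start_date is not six months before the agency's end_date.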

TSQL - Return duplicate rows with highest value and longest date

I have a list of staff who are contractors, and it includes duplicates because some work on multiple contracts at the same time. I need to find, for each person, the row with the most hours and, if the hours are the same, the one with the end date furthest away. I assume this is the current main contract. I also need to make sure the current date falls between Date From and Date To. How can this be done?
+------------+----------+------+-------+------------+------------+
| ContractID | PersonID | Name | Hours | Date From | Date To |
+------------+----------+------+-------+------------+------------+
| 8 | 1 | John | 30 | 20/02/2018 | 26/02/2018 |
| 8 | 2 | Paul | 5 | 20/02/2018 | 26/02/2018 |
| 7 | 3 | John | 7 | 20/02/2018 | 26/02/2018 |
+------------+----------+------+-------+------------+------------+
In the above example, I would need to bring back the John – 30hours and the Paul 5 Hours row. PS - The PersonID is different for each row but the "Name" is the same for the person if on multiple contracts.
Thanks
One approach is simply to use a correlated subquery with appropriate ordering logic:
select c.*
from contracts c
where c.contractid = (select top 1 c2.contractid
from contracts c2
where c2.name = c.name and
getdate() >= c2.datefrom and
getdate() < c2.dateto
order by c2.hours desc, c2.dateto desc
);
You can put similar logic into a window function:
select c.*
from (select c.*,
row_number() over (partition by c.name order by c.hours desc, c.dateto desc) as seqnum
from contracts c
where getdate() >= c.datefrom and getdate() < c.dateto
) c
where seqnum = 1;
If you need the full row, I'd do something like this:
with
rankedByHours as (
select
ContractID,
PersonID,
Name,
Hours,
[Date From],
[Date To],
row_number() over (partition by Name order by Hours desc) as RowID -- partition by Name: per the question, PersonID differs on every row
from
Contracts
)
select
ContractID,
PersonID,
Name,
Hours,
[Date From],
[Date To],
case
when getdate() between [Date From] and [Date To] then 'Current'
when getdate() < [Date From] then 'Not Started'
else 'Expired'
end as ContractStatus
from
RankedByHours
where
RowID = 1;
Use the CTE to inject a row_number() sorting all rows by your sort criteria, then select out the top one in the main body. It can be easily extended to also capture your farthest-out end date.
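The extension mentioned above (tie-breaking on the furthest end date, plus the current-date check) might look like the sketch below. It is self-contained: @Contracts holds the three rows from the question under hypothetical column names DateFrom and DateTo, and @Today is a fixed, made-up date so the result is repeatable; in practice it would be CAST(GETDATE() AS date).

```sql
-- The question's sample rows, with simplified (hypothetical) column names.
DECLARE @Contracts TABLE (
    ContractID int,
    PersonID   int,
    Name       varchar(50),
    Hours      int,
    DateFrom   date,
    DateTo     date
);

INSERT INTO @Contracts (ContractID, PersonID, Name, Hours, DateFrom, DateTo) VALUES
    (8, 1, 'John', 30, '2018-02-20', '2018-02-26'),
    (8, 2, 'Paul',  5, '2018-02-20', '2018-02-26'),
    (7, 3, 'John',  7, '2018-02-20', '2018-02-26');

-- Fixed date for a repeatable result; use CAST(GETDATE() AS date) in practice.
DECLARE @Today date = '2018-02-22';

WITH Ranked AS (
    SELECT c.ContractID, c.PersonID, c.Name, c.Hours, c.DateFrom, c.DateTo,
           -- Most hours first; ties broken by the furthest end date.
           RowID = ROW_NUMBER() OVER (
                       PARTITION BY c.Name
                       ORDER BY c.Hours DESC, c.DateTo DESC)
    FROM @Contracts AS c
    WHERE @Today BETWEEN c.DateFrom AND c.DateTo   -- only contracts active today
)
SELECT ContractID, PersonID, Name, Hours, DateFrom, DateTo
FROM Ranked
WHERE RowID = 1;
-- Returns the John / 30 hours row and the Paul / 5 hours row.
```

Partitioning by Name rather than PersonID follows the question's note that PersonID differs on every row while Name identifies the person.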

How to generate date range sequence without temp table in SQL

I want to convert a few lines of code from an Oracle query to MSSQL.
WITH DATE_MONTHS AS
(
SELECT TO_CHAR(ADD_MONTHS(TO_DATE(TRUNC(TO_DATE(P_REQUIRED_DATE),'MON')), - LEVEL
+1),'DD-MON-YYYY') MONTHS FROM DUAL
CONNECT BY LEVEL <= P_MONTH_RANG
)
SELECT * from DATE_MONTHS
Parameters:
P_REQUIRED_DATE, e.g. sysdate
P_MONTH_RANG, e.g. 4
Result:
01-05-2017
01-04-2017
01-03-2017
01-02-2017
One simple way is to use a tally table and generate the dates like below:
declare @P_Required_Date date = '2015-05-01'
declare @P_Month_Rang int = 4
Select top (@P_Month_Rang) Dts = DateAdd(month, -(Row_Number() over(order by (Select NULL))-1), @P_Required_Date) from
master..spt_values s1, master..spt_values s2
Output as below:
+------------+
| Dts |
+------------+
| 2015-05-01 |
| 2015-04-01 |
| 2015-03-01 |
| 2015-02-01 |
+------------+
Your CTE approach:
declare @P_Required_Date date = '2015-05-01'
declare @P_Month_Rang int = 4
;with Date_Months as
(
Select @P_Required_Date as Dates, 1 as Levl
Union all
Select DateAdd(MONTH,-1, Dates), Levl+1 as Levl from Date_Months
where Levl < @P_Month_Rang
)
Select convert(varchar(10), dates, 105) from Date_Months
To convert to your dd-mm-yyyy format, one way is to use convert with style 105 (style 103 gives dd/mm/yyyy), or use Format.
Another option is to use stacked CTEs:
declare @fromdate date = '20150501';
declare @months int = 4;
;with n as (select n from (values(0),(1),(2),(3),(4),(5),(6),(7),(8),(9)) t(n))
, dates as (
select top (@months)
[Date]=convert(date,dateadd(month,-(row_number() over(order by (select 1))-1),@fromdate))
from n as deka cross join n as hecto cross join n as kilo cross join n as tenK
order by [Date] desc
)
select [Date] = convert(char(10),[date],105)
from dates;
rextester demo: http://rextester.com/UUW2271
returns:
+------------+
| Date |
+------------+
| 01-05-2015 |
| 01-04-2015 |
| 01-03-2015 |
| 01-02-2015 |
+------------+
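The Oracle query also truncates the input date to the first day of its month (TRUNC(..., 'MON')), which matters when the parameter is something like sysdate rather than a date that is already a month start. A minimal sketch of that step combined with the recursive CTE approach follows; the mid-month value '2017-05-17' is a made-up input, and DATEFROMPARTS assumes SQL Server 2012 or newer.

```sql
DECLARE @P_Required_Date date = '2017-05-17';  -- hypothetical mid-month input
DECLARE @P_Month_Rang    int  = 4;

WITH Date_Months AS (
    -- Anchor: first day of the month containing the required date.
    SELECT MonthStart = DATEFROMPARTS(YEAR(@P_Required_Date), MONTH(@P_Required_Date), 1),
           Levl       = 1
    UNION ALL
    -- Step back one month at a time until the requested count is reached.
    SELECT DATEADD(month, -1, MonthStart),
           Levl + 1
    FROM Date_Months
    WHERE Levl < @P_Month_Rang
)
SELECT Months = CONVERT(char(10), MonthStart, 105)  -- style 105 = dd-mm-yyyy
FROM Date_Months
ORDER BY MonthStart DESC;
-- Returns 01-05-2017, 01-04-2017, 01-03-2017, 01-02-2017.
```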
Benchmarks & Performance testing: Generate a set or sequence without loops - 2 - Aaron Bertrand

Count number of days in a year with a record

I have a SQL Server table named AgentLog in which I store for each agent his daily number of sales.
+-----------+------------+-------------+
| AgentName | Date | SalesNumber |
+-----------+------------+-------------+
| John | 01.01.2014 | 45 |
| Terry | 01.01.2014 | 30 |
| John | 02.01.2014 | 20 |
| Terry | 02.01.2014 | 15 |
| Terry | 03.01.2014 | 52 |
| Terry | 04.01.2014 | 24 |
| Terry | 05.01.2014 | 12 |
| Terry | 06.01.2014 | 10 |
| Terry | 07.01.2014 | 23 |
| John | 08.01.2014 | 48 |
| Terry | 08.01.2014 | 35 |
| John | 09.01.2014 | 37 |
| Terry | 10.01.2014 | 35 |
+-----------+------------+-------------+
If an agent doesn't work on one particular day, there is no record of his sales on that date.
I want to generate a report(query) on a given date interval (ex: 01.01.2014 - 10.01.2014) that counts on how many days an agent wasn't present for work (ex: John - 6 days), was at work (John - 4 days) and also returns the date interval it wasn't present (ex: John 03.01.2014 - 07.01.2014, 10.01.2014) (there can be multiple intervals).
You need to create a custom table and populate it with a record for each date you want in your range (Feel free to go as far back in the past and forward into the future as you feel you may need.). You could do this in Excel very easily and import it.
Select *
from Custom.DateListTable dlt
left outer join agentlog ag
on dlt.Date = ag.Date
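If a permanent date table is not available, the same idea can be sketched with a date list generated on the fly. The example below is self-contained and uses the question's sample rows in a table variable (@AgentLog is a stand-in for the real table); it pairs every agent with every date in the interval and counts which pairs have a log row. The CTE names Dates and Agents are made up for illustration.

```sql
DECLARE @AgentLog TABLE (AgentName varchar(20), [Date] date, SalesNumber int);

INSERT INTO @AgentLog (AgentName, [Date], SalesNumber) VALUES
    ('John',  '2014-01-01', 45), ('Terry', '2014-01-01', 30),
    ('John',  '2014-01-02', 20), ('Terry', '2014-01-02', 15),
    ('Terry', '2014-01-03', 52), ('Terry', '2014-01-04', 24),
    ('Terry', '2014-01-05', 12), ('Terry', '2014-01-06', 10),
    ('Terry', '2014-01-07', 23), ('John',  '2014-01-08', 48),
    ('Terry', '2014-01-08', 35), ('John',  '2014-01-09', 37),
    ('Terry', '2014-01-10', 35);

DECLARE @StartDate date = '2014-01-01',
        @EndDate   date = '2014-01-10';

WITH Dates AS (
    -- One row per calendar day in the interval.
    SELECT TheDate = @StartDate
    UNION ALL
    SELECT DATEADD(day, 1, TheDate)
    FROM Dates
    WHERE TheDate < @EndDate
),
Agents AS (
    SELECT DISTINCT AgentName FROM @AgentLog
)
SELECT a.AgentName,
       DaysWorked = COUNT(l.[Date]),            -- only matched days are non-NULL
       DaysMissed = COUNT(*) - COUNT(l.[Date])  -- all days minus matched days
FROM Agents AS a
CROSS JOIN Dates AS d
LEFT JOIN @AgentLog AS l
    ON  l.AgentName = a.AgentName
    AND l.[Date]    = d.TheDate
GROUP BY a.AgentName
OPTION (MAXRECURSION 366);
-- John: 4 worked, 6 missed. Terry: 9 worked, 1 missed.
```

The CROSS JOIN is what makes the count per agent; joining the date list to the log alone would only show days on which nobody worked.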
I would approach this by getting the number of dates in the interval, as well as the number of dates the agent was at work, and you then have everything you need.
To get the number of days in the interval you can use DATEDIFF (add 1 so that both the first and the last day are counted):
SELECT DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 AS totalDays;
To get the number of days an agent worked, you can use the COUNT(*) aggregate function:
SELECT agentName, COUNT(*) AS daysWorked
FROM myTable
GROUP BY agentName;
Then, you can just add to that query to get the days not worked by subtracting totalDays - daysWorked:
SELECT agentName, COUNT(*) AS daysWorked, (DATEDIFF(day, '2014-01-01', '2014-01-10') + 1 - COUNT(*)) AS daysMissed
FROM myTable
GROUP BY agentName;
Here is an SQL Fiddle example.
The only way I can think of to solve this is to create a temporary table with a single column (datetime) and store in it all the dates from the selected range. You can create a stored procedure that fills that temporary table, using a cursor, with all the dates from the interval. Then do a LEFT JOIN from the temporary table to your table and look for NULL values on your table's side (the days where that person didn't come to work).
Try this...
SET DATEFIRST 1; --Monday
DECLARE @StartDate DATETIME = '2014-01-01',
@EndDate DATETIME = '2014-01-10';
WITH data as (
select 0 as i, DATEADD(DAY, 0, @StartDate) as TheDate
union all
select i + 1, DATEADD(DAY, i + 1, @StartDate) as TheDate
from data
where i < DATEDIFF(DAY, @StartDate, @EndDate)
)
SELECT a.AgentName,
SUM(CASE WHEN c.Date IS NULL THEN 1 ELSE 0 END) AS Missing,
SUM(CASE WHEN c.Date IS NOT NULL THEN 1 ELSE 0 END) AS Working
FROM Agent a
JOIN data b ON NOT EXISTS(SELECT NULL FROM SpecialDate s WHERE s.date = b.TheDate)
LEFT JOIN AgentLog c ON
c.AgentName = a.AgentName
AND c.Date = b.TheDate
WHERE DATEPART(weekday, b.TheDate) <= 5
GROUP BY a.AgentName
OPTION (MAXRECURSION 10000);
It includes a check for weekends, as well as a reference to "SpecialDate" where a list of non working days can be maintained, and excluded from the check.
Reading your question again, I realise that this will only solve half your problem.
NOTE: The following answer mainly addresses the trickiest part of the question, which is how to obtain "absence from work" intervals.
Given these values as Interval Start - End dates:
DECLARE @IntervalStart DATE = '2013-12-30'
DECLARE @IntervalEnd DATE = '2014-01-10'
the following query gives you the "absence from work" intervals:
SELECT AgentName,
DATEADD(d, 1, t.[Date]) As OffWorkStart,
DATEADD(d, -1, t.NextDate) As OffWorkEnd
FROM (
SELECT AgentName, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog) t
WHERE t.NextMinusCurrent > 1
-- Get marginal beginning interval (in case such an interval exists)
UNION ALL
SELECT AgentName, @IntervalStart AS OffWorkStart, DATEADD(DAY, -1, MIN([Date])) AS OffWorkEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MIN([Date]) > @IntervalStart
-- Get marginal ending interval (in case such an interval exists)
UNION ALL
SELECT AgentName, DATEADD(DAY, 1, MAX([Date])) AS OffWorkStart, @IntervalEnd
FROM #AgentLog
GROUP BY AgentName
HAVING MAX([Date]) < @IntervalEnd
ORDER By AgentName, OffWorkStart
With the input data you supplied, the above query gives you the following output:
AgentName OffWorkStart OffWorkEnd
---------------------------------------
John 2013-12-30 2013-12-31
John 2014-01-03 2014-01-07
John 2014-01-10 2014-01-10
Terry 2013-12-30 2013-12-31
Terry 2014-01-09 2014-01-09
The idea behind the basic part of the query is to employ the following nested query:
SELECT AgentName,
[Date],
LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As NextDate,
DATEDIFF(DAY, [Date], LEAD([Date]) OVER (PARTITION BY AgentName ORDER BY [Date] ASC)) As NextMinusCurrent
FROM #AgentLog
in order to get any existing gaps between the days a certain agent is present for work. A value of NextMinusCurrent > 1 indicates such a gap.
Counting days is trivial once you have the above query in place. E.g. by placing the above query in a CTE you can count the total number of absence days with something like:
;WITH cte AS (
... query goes here
)
SELECT AgentName, SUM(DATEDIFF(DAY, OffWorkStart, OffWorkEnd) + 1) AS AbsenceDays
FROM cte
GROUP By AgentName
P.S. The above query makes use of SQL Server LEAD function, which is available from SQL SERVER 2012 onwards.
SQL Fiddle here
EDIT:
CTEs together with ROW_NUMBER() can be used to simulate LEAD function. The first part of the query becomes:
;WITH cte1 AS (
SELECT AgentName,
[Date],
ROW_NUMBER() OVER (PARTITION BY AgentName ORDER BY [Date] ASC) As rn
FROM #AgentLog
),
cte2 AS (
SELECT cte1.AgentName, cte1.[Date],
cteLead.[Date] AS NextDate,
DATEDIFF(DAY, cte1.[Date], cteLead.[Date]) As NextMinusCurrent
FROM cte1
LEFT OUTER JOIN cte1 AS cteLead
ON (cte1.rn = cteLead.rn - 1) AND (cte1.AgentName = cteLead.AgentName)
)
SELECT AgentName,
DATEADD(d, 1, cte2.[Date]) As OffWorkStart,
DATEADD(d, -1, cte2.NextDate) As OffWorkEnd
FROM cte2
WHERE NextMinusCurrent > 1
SQL Fiddle for SQL Server 2008 here. I hope it executes in SQL Server 2005 also!
