I have searched high and low for weeks now trying to find a solution to my problem.
As far as I can ascertain, my SQL Server version (2008 R2) is a limiting factor here, but I am positive there is a solution out there.
My problem is as follows:
I have a table with potentially contiguous date ranges in the form Customer-Status-DateStart-DateEnd-EventID.
I need to merge contiguous date ranges by customer and status; the status field can shift up and down throughout a customer's pathway.
Some example data is as follows:
DECLARE @Tbl TABLE([CustomerID] INT
,[Status] INT
,[DateStart] DATE
,[DateEnd] DATE
,[EventID] INT)
INSERT INTO @Tbl
VALUES (1,1,'20160101','20160104',1)
,(1,1,'20160104','20160108',3)
,(1,2,'20160108','20160110',4)
,(1,1,'20160110','20160113',7)
,(1,3,'20160113','20160113',9)
,(1,3,'20160113',NULL,10)
,(2,1,'20160101',NULL,2)
,(3,2,'20160109','20160110',5)
,(3,1,'20160110','20160112',6)
,(3,1,'20160112','20160114',8)
Desired output:
Customer | Status | DateStart | DateEnd
---------+--------+-----------+-----------
1 | 1 | 2016-01-01| 2016-01-08
1 | 2 | 2016-01-08| 2016-01-10
1 | 1 | 2016-01-10| 2016-01-13
1 | 3 | 2016-01-13| NULL
2 | 1 | 2016-01-01| NULL
3 | 2 | 2016-01-09| 2016-01-10
3 | 1 | 2016-01-10| 2016-01-14
Any ideas / code will be gratefully received.
Thanks,
Dan
Try this
DECLARE @Tbl TABLE([CustomerID] INT
,[Status] INT
,[DateStart] DATE
,[DateEnd] DATE
,[EventID] INT)
INSERT INTO @Tbl
VALUES (1,1,'20160101','20160104',1)
,(1,1,'20160104','20160108',3)
,(1,2,'20160108','20160110',4)
,(1,1,'20160110','20160113',7)
,(1,3,'20160113','20160113',9)
,(1,3,'20160113',NULL,10)
,(2,1,'20160101',NULL,2)
,(3,2,'20160109','20160110',5)
,(3,1,'20160110','20160112',6)
,(3,1,'20160112','20160114',8)
;WITH CTE
AS
(
    SELECT CustomerID,
           Status,
           DateStart,
           -- Replace an open-ended (NULL) DateEnd with a sentinel so MAX() picks it up
           COALESCE(DateEnd, '9999-01-01') AS DateEnd,
           EventID,
           -- Overall sequence versus per-status sequence: their difference is constant
           -- within each unbroken run of the same status for a customer
           ROW_NUMBER() OVER (ORDER BY CustomerID, EventID) RowId,
           ROW_NUMBER() OVER (PARTITION BY CustomerID, Status ORDER BY EventID) StatusRowId
    FROM @Tbl
)
SELECT
    A.CustomerID,
    A.Status,
    A.DateStart,
    -- Turn the sentinel back into NULL
    CASE WHEN A.DateEnd = '9999-01-01' THEN NULL
         ELSE A.DateEnd END AS DateEnd
FROM
(
    SELECT
        CTE.CustomerID,
        CTE.Status,
        MIN(CTE.DateStart) AS DateStart,
        MAX(CTE.DateEnd) AS DateEnd
    FROM
        CTE
    GROUP BY
        CTE.CustomerID,
        CTE.Status,
        CTE.StatusRowId - CTE.RowId
) A
ORDER BY A.CustomerID, A.DateStart
Output
CustomerID Status DateStart DateEnd
----------- ----------- ---------- ----------
1 1 2016-01-01 2016-01-08
1 2 2016-01-08 2016-01-10
1 1 2016-01-10 2016-01-13
1 3 2016-01-13 NULL
2 1 2016-01-01 NULL
3 2 2016-01-09 2016-01-10
3 1 2016-01-10 2016-01-14
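For readers who find the row-number trick easier to follow outside SQL, here is a minimal Python sketch of the same grouping idea, using the sample rows from the question. The function name `merge_islands` and the `OPEN_ENDED` sentinel are made up for this illustration. Like the query, it only checks that rows with the same status are consecutive in EventID order per customer; it does not verify that the dates themselves touch.

```python
from collections import defaultdict
from datetime import date

# Sentinel standing in for a NULL DateEnd, mirroring COALESCE(DateEnd, '9999-01-01')
OPEN_ENDED = date(9999, 1, 1)

# (CustomerID, Status, DateStart, DateEnd, EventID), copied from the question
ROWS = [
    (1, 1, date(2016, 1, 1), date(2016, 1, 4), 1),
    (1, 1, date(2016, 1, 4), date(2016, 1, 8), 3),
    (1, 2, date(2016, 1, 8), date(2016, 1, 10), 4),
    (1, 1, date(2016, 1, 10), date(2016, 1, 13), 7),
    (1, 3, date(2016, 1, 13), date(2016, 1, 13), 9),
    (1, 3, date(2016, 1, 13), None, 10),
    (2, 1, date(2016, 1, 1), None, 2),
    (3, 2, date(2016, 1, 9), date(2016, 1, 10), 5),
    (3, 1, date(2016, 1, 10), date(2016, 1, 12), 6),
    (3, 1, date(2016, 1, 12), date(2016, 1, 14), 8),
]


def merge_islands(rows):
    """Group consecutive rows with the same customer and status."""
    ordered = sorted(rows, key=lambda r: (r[0], r[4]))  # CustomerID, EventID
    status_seq = defaultdict(int)  # per (customer, status) row number
    islands = {}
    for row_id, (cust, status, start, end, _event) in enumerate(ordered, start=1):
        status_seq[(cust, status)] += 1
        # The difference of the two row numbers identifies the island
        key = (cust, status, status_seq[(cust, status)] - row_id)
        end = end or OPEN_ENDED
        lo, hi = islands.get(key, (start, end))
        islands[key] = (min(lo, start), max(hi, end))
    result = [
        (cust, status, lo, None if hi == OPEN_ENDED else hi)
        for (cust, status, _diff), (lo, hi) in islands.items()
    ]
    return sorted(result, key=lambda r: (r[0], r[2]))


merged = merge_islands(ROWS)
for row in merged:
    print(row)
```

The printed rows should correspond to the seven rows of the desired output, with `None` in place of NULL.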
Related
I need to create a report of when each user enters and exits. So far I have only managed to get the min and max time. Here is an example of the table:
ID | Flag_Location (bit) | Time
----------------------------
1001 | 1 | 8:00
1001 | 1 | 9:00
1001 | 1 | 10:00
1001 | 0 | 11:00
1001 | 0 | 12:00
1001 | 1 | 13:00
1001 | 1 | 14:00
The output that I need for the report is like this :
ID | ENTERTIME | EXITTIME
-------------------------
1001 | 8:00 | 10:00
1001 | 13:00 | 14:00
So far I have only managed to get one row as the result:
ID | ENTERTIME | EXITTIME
-------------------------
1001 | 8:00 | 14:00
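You can use a windowed SUM() to create an ad-hoc group (Grp): a running count of the Flag_Location = 0 rows, which stays constant across each unbroken run of 1s.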
Example
Select ID
,TimeIn = min(Time)
,TimeOut = max(Time)
From (
Select *
,Grp = sum(case when flag_location=0 then 1 else 0 end ) over (partition by id order by time)
From YourTable
) A
Where Flag_Location=1
Group By ID,Grp
Returns
ID TimeIn TimeOut
1001 08:00:00.0000000 10:00:00.0000000
1001 13:00:00.0000000 14:00:00.0000000
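If it helps with the visualization, the nested query generates the following for the sample data:
ID    Flag_Location  Time              Grp
1001  1              08:00:00.0000000  0
1001  1              09:00:00.0000000  0
1001  1              10:00:00.0000000  0
1001  0              11:00:00.0000000  1
1001  0              12:00:00.0000000  2
1001  1              13:00:00.0000000  2
1001  1              14:00:00.0000000  2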
You can compute a bucket number that identifies each group and then group by it, as below:
;with cte as (select *, bucket = sum(case when flag_location = 0 then 1 when flag_location = 1 and nextflag = 0 then 2 else 0 end) over (partition by id order by [time]),
[time] as endtime from
(
select *,
lag(flag_location) over(partition by id order by [time]) nextflag
from #table4
) a
)
select id, min([time]), max([time]) from cte
where flag_location = 1
group by id, bucket
Query results:
+------+------------------+------------------+
| id | Entertime | ExitTime |
+------+------------------+------------------+
| 1001 | 08:00:00.0000000 | 10:00:00.0000000 |
| 1001 | 13:00:00.0000000 | 14:00:00.0000000 |
+------+------------------+------------------+
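The grouping idea behind these queries (count how many Flag_Location = 0 rows have been seen so far, and use that count as the group key for the 1 rows) can be sketched in plain Python. This is only an illustration under the assumption that times are zero-padded `HH:MM` strings, so that string order equals time order; the function name `enter_exit` is hypothetical.

```python
# (ID, Flag_Location, Time) with zero-padded times
ROWS = [
    (1001, 1, "08:00"),
    (1001, 1, "09:00"),
    (1001, 1, "10:00"),
    (1001, 0, "11:00"),
    (1001, 0, "12:00"),
    (1001, 1, "13:00"),
    (1001, 1, "14:00"),
]


def enter_exit(rows):
    """Return (ID, enter_time, exit_time) for each unbroken run of flag = 1."""
    zeros_seen = {}  # running count of flag = 0 rows per ID
    runs = {}        # (ID, zeros seen so far) -> (first time, last time)
    for id_, flag, t in sorted(rows, key=lambda r: (r[0], r[2])):
        if flag == 0:
            zeros_seen[id_] = zeros_seen.get(id_, 0) + 1
            continue
        key = (id_, zeros_seen.get(id_, 0))
        first, _last = runs.get(key, (t, t))
        runs[key] = (first, t)  # rows arrive in time order, so t is the latest
    return [(id_, first, last) for (id_, _grp), (first, last) in sorted(runs.items())]


periods = enter_exit(ROWS)
print(periods)
```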
Try the query below (explanations in the code comments):
declare @tbl table (ID int, Flag_Location bit, Time varchar(5));
insert into @tbl values
(1001,1,'8:00'),
(1001,1,'9:00'),
(1001,1,'10:00'),
(1001,0,'11:00'),
(1001,0,'12:00'),
(1001,1,'13:00'),
(1001,1,'14:00');
select ID,
       min(ts) as EnterTime,
       max(ts) as ExitTime
from (
    select ID, ts, Flag_Location,
           -- consecutive rows with the same flag share the same difference of row numbers
           row_number() over (partition by ID order by ts) -
           row_number() over (partition by ID, Flag_Location order by ts) grp
    from (
        select *,
               -- left-pad to hh:mm and cast to time (not timestamp, which is rowversion
               -- in SQL Server) so that ordering is chronological
               cast(right('00000' + Time, 5) as time) ts
        from @tbl
    ) a
) a where Flag_Location = 1
group by ID, grp
I am trying to create a stored proc in SQL Server 2008.
I have a "Timings" Table (which could have thousands of records):
StaffID | MachineID | StartTime | FinishTime
1 | 1 | 01/01/2018 12:00 | 01/01/18 14:30
2 | 1 | 01/01/2018 12:00 | 01/01/18 13:00
3 | 2 | 01/01/2018 12:00 | 01/01/18 13:00
3 | 2 | 01/01/2018 13:00 | 01/01/18 14:00
4 | 3 | 01/01/2018 12:00 | 01/01/18 12:30
5 | 3 | 01/01/2018 11:00 | 01/01/18 13:30
This shows how long each staff member was working on each machine.
I would like to produce a results table as below:
MachineID | StaffQty | TotalMins
1 | 1 | 90
1 | 2 | 60
2 | 1 | 120
3 | 1 | 120
3 | 2 | 30
This would show how many minutes each machine had only one person using it, how many minutes each machine had 2 people using it etc.
Normally, I would post what I have tried so far, but all my attempts seem to be so far away, I don't think there is much point.
Obviously, I would be very grateful for a complete solution, but I would also appreciate even just a little nudge in the right direction.
I think this answers your question:
declare @t table (StaffID int, MachineID int, StartTime datetime2, FinishTime datetime2)
insert into @t(StaffID,MachineID,StartTime,FinishTime) values
(1,1,'2018-01-01T12:00:00','2018-01-01T14:30:00'),
(2,1,'2018-01-01T12:00:00','2018-01-01T13:00:00'),
(3,2,'2018-01-01T12:00:00','2018-01-01T12:30:00')
-- Times: every distinct start or finish instant per machine
;With Times as (
    select MachineID,StartTime as Time from @t
    union
    select MachineID,FinishTime from @t
), Ordered as (
    -- Number the instants per machine in chronological order
    select
        *,
        ROW_NUMBER() OVER (PARTITION BY MachineID ORDER BY Time) rn
    from Times
), Periods as (
    -- Pair each instant with the next one to get elementary periods
    select
        o1.MachineID,o1.Time as StartTime,o2.Time as FinishTime
    from
        Ordered o1
            inner join
        Ordered o2
            on
                o1.MachineID = o2.MachineID and
                o1.rn = o2.rn - 1
)
-- Count how many staff rows overlap each elementary period
select
    p.MachineID,
    p.StartTime,
    MAX(p.FinishTime) as FinishTime,
    COUNT(*) as Cnt,
    DATEDIFF(minute,p.StartTime,MAX(p.FinishTime)) as TotalMinutes
from
    @t t
        inner join
    Periods p
        on
            p.MachineID = t.MachineID and
            p.StartTime < t.FinishTime and
            t.StartTime < p.FinishTime
group by p.MachineID,p.StartTime
Results:
MachineID StartTime FinishTime Cnt TotalMinutes
----------- --------------------------- --------------------------- ----------- ------------
1 2018-01-01 12:00:00.0000000 2018-01-01 13:00:00.0000000 2 60
1 2018-01-01 13:00:00.0000000 2018-01-01 14:30:00.0000000 1 90
2 2018-01-01 12:00:00.0000000 2018-01-01 12:30:00.0000000 1 30
Hopefully you can see what each of the CTEs is doing. The only place where this may not give you exactly the results you're seeking is if one person's FinishTime is precisely equal to another person's StartTime on the same machine: the timeline is then split at that instant, so you get two adjacent rows with the same Cnt rather than one merged row. Hopefully that is rare in real data.
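As a rough cross-check of the approach, the same period-splitting idea can be sketched in Python against the six sample rows from the question. This sketch goes one step further than the query above and sums the minutes per (MachineID, staff count), which is the shape the question asks for. The names `minutes_by_staff_count` and `ts` are invented for this example.

```python
from collections import defaultdict
from datetime import datetime


def ts(text):
    """Parse a 'YYYY-MM-DD HH:MM' string (helper for the sample data)."""
    return datetime.strptime(text, "%Y-%m-%d %H:%M")


# (StaffID, MachineID, StartTime, FinishTime), from the question
TIMINGS = [
    (1, 1, ts("2018-01-01 12:00"), ts("2018-01-01 14:30")),
    (2, 1, ts("2018-01-01 12:00"), ts("2018-01-01 13:00")),
    (3, 2, ts("2018-01-01 12:00"), ts("2018-01-01 13:00")),
    (3, 2, ts("2018-01-01 13:00"), ts("2018-01-01 14:00")),
    (4, 3, ts("2018-01-01 12:00"), ts("2018-01-01 12:30")),
    (5, 3, ts("2018-01-01 11:00"), ts("2018-01-01 13:30")),
]


def minutes_by_staff_count(timings):
    """Return sorted (MachineID, StaffQty, TotalMins) tuples."""
    spans_by_machine = defaultdict(list)
    for _staff, machine, start, finish in timings:
        spans_by_machine[machine].append((start, finish))

    totals = defaultdict(int)
    for machine, spans in spans_by_machine.items():
        # Every start or finish instant splits the machine's timeline
        points = sorted({p for span in spans for p in span})
        for lo, hi in zip(points, points[1:]):
            # Same overlap test as the join condition in the query
            staff = sum(1 for start, finish in spans if start < hi and lo < finish)
            if staff:
                totals[(machine, staff)] += int((hi - lo).total_seconds() // 60)
    return sorted((m, qty, mins) for (m, qty), mins in totals.items())


result = minutes_by_staff_count(TIMINGS)
for row in result:
    print(row)
```

Because minutes are summed per staff count, two adjacent periods with the same count (machine 2 in the sample) end up in a single row.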
For SQL Server 2012+:
Please mention your SQL Server version.
Try my script with other sample data.
Please post that sample data if it does not work.
I think my script can be adjusted for other test scenarios.
create table #temp(StaffID int,MachineID int,StartTime datetime,FinishTime datetime)
insert into #temp VALUES
(1, 1,'01/01/2018 12:00','01/01/18 14:30')
,(2, 1,'01/01/2018 12:00','01/01/18 13:00')
,(3, 2,'01/01/2018 12:00','01/01/18 12:30')
;
WITH CTE
AS (
SELECT t.*
,t1.StaffQty
,datediff(MINUTE, t.StartTime, t.FinishTime) TotalMinutes
FROM #temp t
CROSS APPLY (
SELECT count(*) StaffQty
FROM #temp t1
WHERE t.machineid = t1.machineid
AND (
t.StartTime >= t1.StartTime
AND t.FinishTime <= t1.FinishTime
)
) t1
)
SELECT MachineID
,StaffQty
,TotalMinutes - isnull(LAG(TotalMinutes, 1) OVER (
PARTITION BY t.MachineID ORDER BY t.StartTime
,t.FinishTime
), 0)
FROM cte t
drop table #temp
For SQL Server 2008:
;
WITH CTE
AS (
SELECT t.*
,t1.StaffQty
,datediff(MINUTE, t.StartTime, t.FinishTime) TotalMinutes
,ROW_NUMBER() OVER (
PARTITION BY t.machineid ORDER BY t.StartTime
,t.FinishTime
) rn
FROM #temp t
CROSS APPLY (
SELECT count(*) StaffQty
FROM #temp t1
WHERE t.machineid = t1.machineid
AND (
t.StartTime >= t1.StartTime
AND t.FinishTime <= t1.FinishTime
)
) t1
)
SELECT t.MachineID
,t.StaffQty
,t.TotalMinutes - isnull(t1.TotalMinutes, 0) TotalMinutes
FROM cte t
OUTER APPLY (
SELECT TOP 1 TotalMinutes
FROM cte t1
WHERE t.MachineID = t1.machineid
AND t1.rn < t.rn
ORDER BY t1.rn DESC
) t1
I have this Data set
InvoiceID CDamount companyname
1 2500 NASA
1 -2500 NASA
2 1600 Airjet
3 5000 Boeing
4 -600 EXEarth
5 8000 SpaceX
5 -8000 SpaceX
I want to be able to get that as shown below:
External ID CDamount companyname
1 2500 NASA
1-C -2500 NASA
2 1600 Airjet
3 5000 Boeing
4 -600 EXEarth
5 8000 SpaceX
5-C -8000 SpaceX
I cannot use CASE WHEN CDamount < 0 THEN InvoiceID + '-' + 'C' ELSE InvoiceID END AS "External ID" because some of other companies have negative amount as well that do not fall under this category.
I was wondering how can I say IF InvoiceID is Duplicated AND CDAmount is Negative then Create a new External ID?
Is this something possible?
Below you can create the sample data
Create Table #Incident (
InvoiceID int,
CDamount int,
Companyname Nvarchar(255))
insert into #Incident Values (1,2500,'NASA')
insert into #Incident Values (1,-2500,'NASA')
insert into #Incident Values (2,1600,'Airjet')
insert into #Incident Values (3, 5000, 'Boeing')
insert into #Incident Values (4, -600, 'ExEarth')
insert into #Incident Values (5,8000,'SpaceX')
insert into #Incident Values (5, -8000, 'SpaceX')
Here is what I used, but as I mentioned, since ID number 4 also has a negative value, I get "-C" for it, which I do not want.
Select CASE WHEN T1.CDamount < 0
THEN CAST(T1.InvoiceID AS nvarchar (255)) + '-' + 'C'
ELSE CAST(T1.InvoiceID AS nvarchar (255))
END AS ExternalID,
T1.Companyname
from #Incident AS T1
So I came up with this based on my knowledge of SQL, and it works for my case.
I am not sure if it is a smart way to go, but it can be a good start for someone who is struggling with a scenario like this:
;With CTE1 AS (
SELECT Count(*) AS Duplicate, T1.InvoiceID
From #Incident AS T1
Group by T1.InvoiceID
),
Main AS (
Select CASE WHEN T1.CDamount < 0 AND T2.Duplicate > 1
THEN CAST(T1.InvoiceID AS nvarchar (255)) + '-' + 'C'
ELSE CAST(T1.InvoiceID AS nvarchar (255))
END AS ExternalID,
T1.InvoiceID AS count,
T1.CDamount,
T1.Companyname
from #Incident AS T1
Join CTE1 AS T2 ON T1.InvoiceID = T2.InvoiceID
)
SELECT * FROM Main
Alternative solution without a CTE, using the ROW_NUMBER() function.
SELECT
CASE WHEN CDAmount < 0 AND RowID > 1
THEN InvoiceID + '-C'
ELSE InvoiceID
END AS ExternalID
, CDAmount
, CompanyName
FROM
(
SELECT
CAST(InvoiceID AS NVARCHAR(255)) AS InvoiceID
, CDAmount
, CompanyName
, ROW_NUMBER() OVER (PARTITION BY InvoiceID ORDER BY CDAmount DESC) AS RowID
FROM
#Incident
) AS SourceTable
The trick is using the ROW_NUMBER() function to generate a sequence which resets when InvoiceID changes. Ordering by CDAmount DESC makes the numbering deterministic, with the positive row first (ordering by CompanyName alone would leave ties within an invoice in arbitrary order). The outer CASE expression then appends '-C' only when CDAmount is negative and RowID is greater than 1. Here's the subquery and its result.
SELECT
CAST(InvoiceID AS NVARCHAR(255)) AS InvoiceID
, CDAmount
, CompanyName
, ROW_NUMBER() OVER (PARTITION BY InvoiceID ORDER BY CDAmount DESC) AS RowID
FROM
#Incident
Subquery result:
+-----------+----------+-------------+-------+
| InvoiceID | CDAmount | CompanyName | RowID |
+-----------+----------+-------------+-------+
| 1 | 2500 | NASA | 1 |
| 1 | -2500 | NASA | 2 |
| 2 | 1600 | Airjet | 1 |
| 3 | 5000 | Boeing | 1 |
| 4 | -600 | ExEarth | 1 |
| 5 | 8000 | SpaceX | 1 |
| 5 | -8000 | SpaceX | 2 |
+-----------+----------+-------------+-------+
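The rule being asked for ("InvoiceID is duplicated AND CDamount is negative") can also be written down as a small Python sketch, which may help when checking the SQL against sample data. It follows the count-per-invoice idea from the CTE answer rather than ROW_NUMBER(); the function name `external_ids` is made up for this illustration.

```python
from collections import Counter

# (InvoiceID, CDamount, Companyname), from the question's sample data
INCIDENTS = [
    (1, 2500, "NASA"),
    (1, -2500, "NASA"),
    (2, 1600, "Airjet"),
    (3, 5000, "Boeing"),
    (4, -600, "ExEarth"),
    (5, 8000, "SpaceX"),
    (5, -8000, "SpaceX"),
]


def external_ids(rows):
    """Append '-C' only when the invoice appears more than once and the amount is negative."""
    occurrences = Counter(invoice_id for invoice_id, _amount, _name in rows)
    return [
        (
            f"{invoice_id}-C" if amount < 0 and occurrences[invoice_id] > 1 else str(invoice_id),
            amount,
            name,
        )
        for invoice_id, amount, name in rows
    ]


labelled = external_ids(INCIDENTS)
for row in labelled:
    print(row)
```

Invoice 4 keeps its plain ID here because it occurs only once, even though its amount is negative.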
I currently have the following table:
+-----+-----------------------------+------------------------------+
| ID | StartDate | EndDate |
+-----+-----------------------------+------------------------------|
| 1 | 2017-07-24 08:00:00.000 | 2017-07-29 08:00:00.000 |
| 2 | 2017-07-25 08:00:00.000 | 2017-07-28 08:00:00.000 |
| 3 | 2017-07-25 08:00:00.000 | 2017-07-26 08:00:00.000 |
+-----+-----------------------------+------------------------------+
I would like to know the count of the IDs that were not closed on each date.
So for example, I want to know the count of open IDs on 2017-07-26 00:00:00.000. This would be all 3 in this case.
Another example: I want to know the count of open IDs on 2017-07-29 00:00:00.000, which would result in 1. Only ID=1 is not yet closed at that date.
I have tried using another solution here on StackOverflow, but I can't quite figure out why it is giving me false results.
declare @dt date, @dtEnd date
set @dt = getdate()-7
set @dtEnd = dateadd(day, 100, @dt);
WITH CTEt1 (SupportCallID, StartDate, EndDate, Onhold)
as
(SELECT SupportCallID
,OpenDate
,MAX(CASE WHEN StatusID IN('19381771-8E81-40C5-8E36-62A7DB0A2A99', '95C7A5FB-2389-4D14-9DAE-A08BFCC3B09A', 'D5429790-3B43-4462-9E1E-2466EA29AC74') then CONVERT(DATE, LastChangeDate) end) EndDate
,OnHold
FROM [ClienteleITSM_Prod_Application].[dbo].[SupportCall]
group by SupportCallID, OpenDate, OnHold
)
SELECT dates.myDate,
(SELECT COUNT(*)
FROM CTEt1
WHERE myDate BETWEEN StartDate and EndDate
)
FROM
(select dateadd(day, number, @dt) mydate
from
(select distinct number from master.dbo.spt_values
where name is null
) n
where dateadd(day, number, @dt) < @dtEnd) dates
If you use a cte to create a table of dates that span the range of dates in your source table, you can easily left join from that to your source table and count up the rows returned:
declare @t table(ID int,StartDate datetime,EndDate datetime);
insert into @t values (1,'2017-07-24 08:00:00.000','2017-07-29 08:00:00.000'),(2,'2017-07-25 08:00:00.000','2017-07-28 08:00:00.000'),(3,'2017-07-25 08:00:00.000','2017-07-26 08:00:00.000');
declare @StartDate datetime = (select min(StartDate) from @t);
declare @EndDate datetime = (select max(EndDate) from @t);
-- Table with 10 rows in to be joined together to create a large tally table (10 * 10 * 10 * etc)
with t(t) as (select t from (values(1),(1),(1),(1),(1),(1),(1),(1),(1),(1))t(t))
-- Add the row_number of the tally table to your start date to generate all dates within your data range
,d(d) as (select top(datediff(d,@StartDate,@EndDate)+1) dateadd(d,row_number() over (order by (select null))-1,@StartDate) from t t1,t t2,t t3)
select d.d
      ,count(t.ID) as OpenIDs
from d
    left join @t as t
        on(d.d between cast(t.StartDate as date) and t.EndDate)
group by d.d
order by d.d;
Output:
+-------------------------+---------+
| d | OpenIDs |
+-------------------------+---------+
| 2017-07-24 08:00:00.000 | 1 |
| 2017-07-25 08:00:00.000 | 3 |
| 2017-07-26 08:00:00.000 | 3 |
| 2017-07-27 08:00:00.000 | 2 |
| 2017-07-28 08:00:00.000 | 2 |
| 2017-07-29 08:00:00.000 | 1 |
+-------------------------+---------+
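To make the join condition easier to reason about, here is a small Python sketch of the same idea: generate one instant per day across the data range, then count the rows whose interval contains it. It assumes the same inclusive comparison as the query (start truncated to midnight, end taken as-is); the function name `open_ids_per_day` is hypothetical.

```python
from datetime import datetime, time, timedelta

# (ID, StartDate, EndDate), from the question
CALLS = [
    (1, datetime(2017, 7, 24, 8), datetime(2017, 7, 29, 8)),
    (2, datetime(2017, 7, 25, 8), datetime(2017, 7, 28, 8)),
    (3, datetime(2017, 7, 25, 8), datetime(2017, 7, 26, 8)),
]


def open_ids_per_day(rows):
    """Return (day_instant, open_count) for every day in the data range."""
    first = min(start for _id, start, _end in rows)
    last = max(end for _id, _start, end in rows)
    day_count = (last.date() - first.date()).days + 1
    result = []
    for offset in range(day_count):
        instant = first + timedelta(days=offset)
        open_count = sum(
            1
            for _id, start, end in rows
            # Mirrors: d between cast(StartDate as date) and EndDate
            if datetime.combine(start.date(), time.min) <= instant <= end
        )
        result.append((instant, open_count))
    return result


counts = open_ids_per_day(CALLS)
for instant, open_count in counts:
    print(instant, open_count)
```

Whether the closing day itself should count as open depends on the business rule; changing `<= end` to `< end` would exclude it.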
I have a table containing employees id, year id, client id, and the number of sales. For example:
--------------------------------------
id_emp | id_year | sales | client id
--------------------------------------
4 | 1 | 14 | 1
4 | 1 | 10 | 2
4 | 2 | 11 | 1
4 | 2 | 17 | 2
For an employee, I want to obtain one row per year with the minimum sales for that year and the minimum sales of the previous year.
One of the queries I tried is the following:
select distinct
id_emp,
id_year,
MIN(sales) OVER(partition by id_emp, id_year) AS min_sales,
LAG(min(sales), 1) OVER(PARTITION BY id_emp, id_year
ORDER BY id_emp, id_year) AS previous
from facts
where id_emp = 4
group by id_emp, id_year, sales;
I get the result:
-------------------------------------
id_emp | id_year | sales | previous
-------------------------------------
4 | 1 | 10 | (null)
4 | 1 | 10 | 10
4 | 2 | 11 | (null)
but I expect to get:
-------------------------------------
id_emp | id_year | sales | previous
-------------------------------------
4 | 1 | 10 | (null)
4 | 2 | 11 | 10
SQL Fiddle
Oracle 11g R2 Schema Setup:
CREATE TABLE EMPLOYEE_SALES ( id_emp, id_year, sales, client_id ) AS
SELECT 4, 1, 14, 1 FROM DUAL
UNION ALL SELECT 4, 1, 10, 2 FROM DUAL
UNION ALL SELECT 4, 2, 11, 1 FROM DUAL
UNION ALL SELECT 4, 2, 17, 2 FROM DUAL;
Query 1:
SELECT ID_EMP,
ID_YEAR,
SALES AS SALES,
LAG( SALES ) OVER ( PARTITION BY ID_EMP ORDER BY ID_YEAR ) AS PREVIOUS
FROM (
SELECT e.*,
ROW_NUMBER() OVER ( PARTITION BY id_emp, id_year ORDER BY sales ) AS RN
FROM EMPLOYEE_SALES e
)
WHERE rn = 1
Query 2:
SELECT ID_EMP,
ID_YEAR,
MIN( SALES ) AS SALES,
LAG( MIN( SALES ) ) OVER ( PARTITION BY ID_EMP ORDER BY ID_YEAR ) AS PREVIOUS
FROM EMPLOYEE_SALES
GROUP BY ID_EMP, ID_YEAR
Results - Both give the same output:
| ID_EMP | ID_YEAR | SALES | PREVIOUS |
|--------|---------|-------|----------|
| 4 | 1 | 10 | (null) |
| 4 | 2 | 11 | 10 |
You mean like this?
select id_emp, id_year, min(sales) as min_sales,
lag(min(sales)) over (partition by id_emp order by id_year) as prev_year_min_sales
from facts
where id_emp = 4
group by id_emp, id_year;
I believe it is because you are including the sales column in your GROUP BY clause, so you get one group per distinct sales value instead of one per year.
Try to remove it and just use
GROUP BY id_emp,id_year
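To see what grouping by (id_emp, id_year) and then looking one row back amounts to, here is a minimal Python sketch using the four sample rows. The function name `min_sales_with_previous` is invented for this example, and "previous" means the preceding year that has data for that employee, which is how LAG() ordered by id_year behaves.

```python
# (id_emp, id_year, sales, client_id), from the question
FACTS = [
    (4, 1, 14, 1),
    (4, 1, 10, 2),
    (4, 2, 11, 1),
    (4, 2, 17, 2),
]


def min_sales_with_previous(facts):
    """Return (id_emp, id_year, min_sales, previous_min_sales) per employee and year."""
    # Step 1: one minimum per (employee, year), like GROUP BY id_emp, id_year
    minimums = {}
    for emp, year, sales, _client in facts:
        key = (emp, year)
        minimums[key] = min(sales, minimums.get(key, sales))

    # Step 2: walk the years in order per employee, remembering the previous minimum
    rows = []
    previous_by_emp = {}
    for (emp, year), min_sales in sorted(minimums.items()):
        rows.append((emp, year, min_sales, previous_by_emp.get(emp)))
        previous_by_emp[emp] = min_sales
    return rows


summary = min_sales_with_previous(FACTS)
for row in summary:
    print(row)
```

Adding sales to the grouping key in step 1 would produce one entry per distinct sales value, which is why the original query returned extra rows.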
You could get your desired output using ROW_NUMBER() and LAG() analytic functions.
For example,
Table
SQL> SELECT * FROM t;
ID_EMP ID_YEAR SALES CLIENT_ID
---------- ---------- ---------- ----------
4 1 14 1
4 1 10 2
4 2 11 1
4 2 17 2
Query
SQL> WITH DATA AS
       (SELECT t.*,
               row_number() OVER(PARTITION BY id_emp, id_year ORDER BY sales) rn
        FROM t
       )
     SELECT id_emp,
            id_year,
            sales,
            -- previous year's minimum for the same employee
            lag(sales) over(PARTITION BY id_emp ORDER BY id_year) previous
     FROM DATA
     WHERE rn = 1;
ID_EMP ID_YEAR SALES PREVIOUS
---------- ---------- ---------- ----------
4 1 10
4 2 11 10