SQL Server 2012 - Calculate time in state based on state change - sql-server

I'm trying to calculate how long a machine was in a specific state then sum by hour. The state is only recorded on change so we can assume it was in the same state until changed.
I was trying to use partition, but I don't think that is the correct approach.
My table structure ordered desc:
+----------+-------------------------+
| state_id | t_stamp |
+----------+-------------------------+
| 0 | 2020-06-01 10:44:06.663 |
| 2 | 2020-06-01 10:43:56.660 |
| 0 | 2020-06-01 10:43:06.653 |
| 2 | 2020-06-01 10:42:56.653 |
| 0 | 2020-06-01 10:41:36.643 |
| 3 | 2020-06-01 10:41:26.640 |
| 0 | 2020-06-01 10:41:16.640 |
| 2 | 2020-06-01 10:40:56.637 |
| 0 | 2020-06-01 10:40:06.630 |
| 3 | 2020-06-01 10:39:56.630 |
+----------+-------------------------+
What I'm trying to get to:
+----------+------------------+
| state_id | duration_seconds |
+----------+------------------+
| 2 | 10 |
| 0 | 50 |
+----------+------------------+

You can use window functions, then aggregation:
select
    state_id,
    sum(datediff(second, t_stamp, lead_t_stamp)) duration_second
from (
    select
        t.*,
        lead(t_stamp) over(order by t_stamp) lead_t_stamp
    from mytable t
) t
where lead_t_stamp is not null
group by state_id
order by state_id
This demo on DB Fiddle with your sample data returns:
state_id | duration_second
-------: | --------------:
0 | 190
2 | 40
3 | 20
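The question also asks for the totals per hour, which the query above does not do. Here is a minimal sketch of one way to add that, run through Python's built-in sqlite3 module instead of SQL Server so it can be tried anywhere. The table name mytable comes from the answer; the hour_start column and the durations_by_hour helper are made up for illustration. It assumes each interval is credited to the hour in which it starts, so an interval that crosses an hour boundary is not split. On SQL Server the bucket could probably be written as dateadd(hour, datediff(hour, 0, t_stamp), 0) instead of strftime.

```python
import sqlite3

# Sample rows from the question: (state_id, t_stamp).
ROWS = [
    (0, "2020-06-01 10:44:06.663"),
    (2, "2020-06-01 10:43:56.660"),
    (0, "2020-06-01 10:43:06.653"),
    (2, "2020-06-01 10:42:56.653"),
    (0, "2020-06-01 10:41:36.643"),
    (3, "2020-06-01 10:41:26.640"),
    (0, "2020-06-01 10:41:16.640"),
    (2, "2020-06-01 10:40:56.637"),
    (0, "2020-06-01 10:40:06.630"),
    (3, "2020-06-01 10:39:56.630"),
]

# Same LEAD-then-aggregate shape as the answer, plus an hour bucket.
# strftime('%s', ...) yields whole epoch seconds, so the subtraction counts
# second boundaries, much like datediff(second, ...) does.
QUERY = """
SELECT strftime('%Y-%m-%d %H:00', t_stamp) AS hour_start,
       state_id,
       SUM(CAST(strftime('%s', lead_t_stamp) AS INTEGER)
         - CAST(strftime('%s', t_stamp) AS INTEGER)) AS duration_seconds
FROM (
    SELECT state_id,
           t_stamp,
           LEAD(t_stamp) OVER (ORDER BY t_stamp) AS lead_t_stamp
    FROM mytable
) AS t
WHERE lead_t_stamp IS NOT NULL
GROUP BY hour_start, state_id
ORDER BY hour_start, state_id
"""


def durations_by_hour(rows):
    """Load the rows into an in-memory table and return (hour, state, seconds)."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (state_id INTEGER, t_stamp TEXT)")
    con.executemany("INSERT INTO mytable VALUES (?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in durations_by_hour(ROWS):
        print(row)
```

All sample rows fall in the same hour, so the per-hour totals are the same 190, 40 and 20 seconds as in the answer.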

Related

Sum Consecutive Months Based on Groups with Criteria

I am having trouble narrowing down sales in top regions that occurred in consecutive months. I know I need to use some form of window function with Row_Number or Dense_Rank, but I am having trouble getting the final output
Here is my source data:
+--------+-----------+------------+
| Fruit | SaleDate | Top_Region |
+--------+-----------+------------+
| Apple | 1/1/2017 | 1 |
| Apple | 2/1/2017 | 1 |
| Apple | 3/1/2017 | 1 |
| Apple | 4/1/2017 | 0 |
| Apple | 5/1/2017 | 0 |
| Apple | 6/1/2017 | 0 |
| Apple | 7/1/2017 | 1 |
| Apple | 8/1/2017 | 1 |
| Apple | 9/1/2017 | 1 |
| Apple | 10/1/2017 | 1 |
| Apple | 11/1/2017 | 0 |
| Apple | 12/1/2017 | 0 |
| Banana | 1/1/2017 | 0 |
| Banana | 2/1/2017 | 0 |
| Banana | 3/1/2017 | 1 |
| Banana | 4/1/2017 | 1 |
| Banana | 5/1/2017 | 1 |
| Banana | 6/1/2017 | 1 |
| Banana | 7/1/2017 | 1 |
| Banana | 8/1/2017 | 1 |
| Banana | 9/1/2017 | 0 |
| Banana | 10/1/2017 | 1 |
| Banana | 11/1/2017 | 1 |
| Banana | 12/1/2017 | 0 |
+--------+-----------+------------+
This is the expected output:
+--------+-----------+-----------+-------+
| Fruit | Start | End | Total |
+--------+-----------+-----------+-------+
| Apple | 1/1/2017 | 3/1/2017 | 3 |
| Apple | 7/1/2017 | 10/1/2017 | 4 |
| Banana | 3/1/2017 | 8/1/2017 | 6 |
| Banana | 10/1/2017 | 11/1/2017 | 2 |
+--------+-----------+-----------+-------+
The goal is to find runs of top-region sales in consecutive months, where a month without a top-region sale ends the run.
So far I have tried a few different combinations, with this being the closest.
SELECT fruit,
MIN(saledate) AS spanStart ,
MAX(saledate) AS spanEnd,
COUNT(*) AS spanLength
FROM ( SELECT s.* ,
( ROW_NUMBER() OVER ( ORDER BY month )
- ROW_NUMBER() OVER ( PARTITION BY fruit, topregion ORDER BY month ) ) AS fruits
FROM #salesdata s
) s
GROUP BY fruit,fruits ,
topregion
HAVING topregion = 1
ORDER BY COUNT(*) DESC;
Any help would be greatly appreciated
This is a typical gaps-and-islands problem. One strategy is to identify groups of adjacent rows by computing the difference between two row_number()s. We can then filter on groups having top_region = 1 and use aggregation to get the start date, end date and number of records per group.
Your query is really close, but the first row_number() is missing a partition by fruit in its over() clause. Also, I find that aliasing that column fruits when another column is called fruit is error-prone.
select
    fruit,
    min(sale_date) start_date,
    max(sale_date) end_date,
    count(*) total
from (
    select
        t.*,
        row_number() over(partition by fruit order by sale_date) rn1,
        row_number() over(partition by fruit, top_region order by sale_date) rn2
    from mytable t
) t
where top_region = 1
group by fruit, rn1 - rn2
order by fruit, start_date
You can run the inner query separately to see the result it produces.
Demo on DB Fiddle:
fruit | start_date | end_date | total
:----- | :--------- | :--------- | ----:
Apple | 2017-01-01 | 2017-01-03 | 3
Apple | 2017-01-07 | 2017-01-10 | 4
Banana | 2017-01-03 | 2017-01-08 | 6
Banana | 2017-01-10 | 2017-01-11 | 2
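To see the rn1 - rn2 trick on the question's data, here is a small sketch that runs the same query through Python's built-in sqlite3 module rather than SQL Server. The sample_rows and find_islands helpers are made up for illustration, and the column names sale_date and top_region follow the answer, not the asker's table. The dates are loaded in ISO format here, so the runs come out as months (2017-01-01 to 2017-03-01) and not as days.

```python
import sqlite3

# Top_Region flags for January..December 2017, copied from the question.
APPLE = [1, 1, 1, 0, 0, 0, 1, 1, 1, 1, 0, 0]
BANANA = [0, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0]


def sample_rows():
    """Build (fruit, sale_date, top_region) rows with ISO dates."""
    rows = []
    for fruit, flags in (("Apple", APPLE), ("Banana", BANANA)):
        for month, flag in enumerate(flags, start=1):
            rows.append((fruit, f"2017-{month:02d}-01", flag))
    return rows


# rn1 numbers every row per fruit; rn2 numbers rows per fruit and flag value.
# Inside one unbroken run both counters grow together, so rn1 - rn2 stays
# constant and can serve as the group key.
QUERY = """
SELECT fruit,
       MIN(sale_date) AS start_date,
       MAX(sale_date) AS end_date,
       COUNT(*)       AS total
FROM (
    SELECT fruit, sale_date, top_region,
           ROW_NUMBER() OVER (PARTITION BY fruit ORDER BY sale_date) AS rn1,
           ROW_NUMBER() OVER (PARTITION BY fruit, top_region ORDER BY sale_date) AS rn2
    FROM mytable
) AS t
WHERE top_region = 1
GROUP BY fruit, rn1 - rn2
ORDER BY fruit, start_date
"""


def find_islands(rows):
    """Return (fruit, start_date, end_date, total) for each run of top_region = 1."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE mytable (fruit TEXT, sale_date TEXT, top_region INTEGER)")
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in find_islands(sample_rows()):
        print(row)
```

For Apple, the January to March rows get rn1 = 1, 2, 3 and rn2 = 1, 2, 3 (difference 0), while July to October get rn1 = 7..10 and rn2 = 4..7 (difference 3), which is what separates the two runs.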

SQL server multi-period comparison

I have the following table T1 (sample shown), which shows the category for each client (each with a unique ID) on a specific date and his category on the next date:
+------------+----------------+----------+---------------+
| DATE | ID | STAGE | STAGE_NEXT |
+------------+----------------+----------+---------------+
| 2014-07-01 | 10010101841033 | 1 | 1 |
| 2015-07-01 | 74610108542146 | 1 | 1 |
| 2014-10-01 | 47970108841775 | 3 | 3 |
| 2014-10-01 | 48870108841816 | 2 | 3 |
| 2014-10-01 | 32910097439541 | 1 | 1 |
| 2016-04-01 | 46930097440855 | 2 | 3 |
| 2016-04-01 | 47380097440931 | 2 | 3 |
| 2016-04-01 | 54560097441411 | 3 | 3 |
+------------+----------------+----------+---------------+
Table info:
- Rows: 513,000
- Date range: 2013-01-01 to 2019-10-01
- Stages: 1 - 3
I need to create a new column in T1, which will flag the date a client moved to Stage 1 if at any point he was in Stage 3. For example if we take 1 client from T1 by using this code:
SELECT [DATE], ID, STAGE, STAGE_NEXT
FROM T1
WHERE ID = '74610108542146'
ORDER BY [DATE]
We get the following result:
+------------+----------------+-------+------------+
| DATE | ID | STAGE | STAGE_NEXT |
+------------+----------------+-------+------------+
| 2015-07-01 | 74610108542146 | 1 | 1 |
| 2015-10-01 | 74610108542146 | 1 | 1 |
| 2016-01-01 | 74610108542146 | 1 | 2 |
| 2016-04-01 | 74610108542146 | 2 | 1 |
| 2016-07-01 | 74610108542146 | 1 | 1 |
| 2016-10-01 | 74610108542146 | 1 | 2 |
| 2017-01-01 | 74610108542146 | 2 | 3 |
| 2017-04-01 | 74610108542146 | 3 | 3 |
| 2017-07-01 | 74610108542146 | 3 | 2 |
| 2017-10-01 | 74610108542146 | 2 | 1 |
| 2018-01-01 | 74610108542146 | 1 | 1 |
| 2018-04-01 | 74610108542146 | 1 | NULL |
+------------+----------------+-------+------------+
After the new column with the flag is added to T1 we should be able to get the following result using this code on T1:
SELECT [DATE], ID, STAGE, STAGE_NEXT, FLAG
FROM T1
WHERE ID = '74610108542146'
ORDER BY [DATE]
+------------+----------------+-------+------------+------+
| DATE | ID | STAGE | STAGE_NEXT | FLAG |
+------------+----------------+-------+------------+------+
| 2015-07-01 | 74610108542146 | 1 | 1 | 0 |
| 2015-10-01 | 74610108542146 | 1 | 1 | 0 |
| 2016-01-01 | 74610108542146 | 1 | 2 | 0 |
| 2016-04-01 | 74610108542146 | 2 | 1 | 0 |
| 2016-07-01 | 74610108542146 | 1 | 1 | 0 |
| 2016-10-01 | 74610108542146 | 1 | 2 | 0 |
| 2017-01-01 | 74610108542146 | 2 | 3 | 0 |
| 2017-04-01 | 74610108542146 | 3 | 3 | 0 |
| 2017-07-01 | 74610108542146 | 3 | 2 | 0 |
| 2017-10-01 | 74610108542146 | 2 | 1 | 1 |
| 2018-01-01 | 74610108542146 | 1 | 1 | 0 |
| 2018-04-01 | 74610108542146 | 1 | NULL | 0 |
+------------+----------------+-------+------------+------+
If the client never moved to Stage 3 then the flag for the client is always 0
You could calculate and update the new FLAG column from a CTE.
The update statement uses the LAG function to use the previous STAGE in the calculation of FLAG.
;WITH CTE AS
(
    SELECT ID, [DATE], FLAG,
        CASE
            WHEN STAGE = 2
             AND STAGE_NEXT = 1
             AND LAG(STAGE) OVER (PARTITION BY ID ORDER BY IIF(STAGE=2 AND STAGE_NEXT=2,0,1), [DATE]) = 3
            THEN 1
            ELSE 0
        END AS CalcFlag
    FROM T1
    WHERE ID = '10010101841033' -- optional, to target only 1 ID
)
UPDATE CTE
SET FLAG = CalcFlag
WHERE (FLAG IS NULL OR FLAG != CalcFlag);
The IIF(STAGE=2 AND STAGE_NEXT=2,0,1) in the LAG is used to make the calculation also work when the stage 2 is repeated.
Test it on rextester here
Try this,
-- Table variables are declared with @, not #
DECLARE @T1 table
(
    [DATE] date, ID numeric(18,0), STAGE int, STAGE_NEXT int
)
INSERT INTO @T1 VALUES
('2013-01-01',10010101841033,1,1 ),
('2013-04-01',10010101841033,1,3 ),
('2013-07-01',10010101841033,3,3 ),
('2013-10-01',10010101841033,3,2 ),
('2014-01-01',10010101841033,2,1 ),
('2014-04-01',10010101841033,1,1 ),
('2014-07-01',10010101841033,1,1 ),
('2014-10-01',10010101841033,1,NULL),
('2014-07-01',47820108841771,1,2)
-- B lists every client that moved into stage 3 at some point
SELECT A.[DATE], A.ID, A.STAGE, A.STAGE_NEXT,
    CASE WHEN B.ID IS NOT NULL AND (A.STAGE_NEXT = 1 AND A.STAGE > A.STAGE_NEXT) THEN 1 ELSE 0 END AS FLAG
FROM @T1 A
LEFT JOIN
(
    SELECT DISTINCT ID AS ID
    FROM @T1
    WHERE STAGE_NEXT = 3
) B
ON A.ID = B.ID
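Another way to read the requirement is "flag a row when the client moves into stage 1 and has been in stage 3 on some earlier date". That is an assumption about what the asker meant, and it is a different technique from the LAG approach above: a running MAX over the client's earlier rows. A minimal sketch, run through Python's built-in sqlite3 module instead of SQL Server (the flag_rows helper and the lower-case column names are made up for illustration):

```python
import sqlite3

# The single client from the question: (date, stage, stage_next).
CLIENT = [
    ("2015-07-01", 1, 1), ("2015-10-01", 1, 1), ("2016-01-01", 1, 2),
    ("2016-04-01", 2, 1), ("2016-07-01", 1, 1), ("2016-10-01", 1, 2),
    ("2017-01-01", 2, 3), ("2017-04-01", 3, 3), ("2017-07-01", 3, 2),
    ("2017-10-01", 2, 1), ("2018-01-01", 1, 1), ("2018-04-01", 1, None),
]

# The windowed MAX is 1 once the client has had a stage 3 row up to and
# including the current one. A row is flagged when it is the move into
# stage 1 (stage <> 1, stage_next = 1) and that running value is 1.
QUERY = """
SELECT date, stage, stage_next,
       CASE WHEN stage <> 1
             AND stage_next = 1
             AND MAX(CASE WHEN stage = 3 THEN 1 ELSE 0 END) OVER (
                     PARTITION BY id ORDER BY date
                     ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW) = 1
            THEN 1 ELSE 0 END AS flag
FROM t1
ORDER BY id, date
"""


def flag_rows(client_rows, client_id="74610108542146"):
    """Return (date, stage, stage_next, flag) rows for one client."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE t1 (date TEXT, id TEXT, stage INTEGER, stage_next INTEGER)")
    con.executemany(
        "INSERT INTO t1 VALUES (?, ?, ?, ?)",
        [(d, client_id, s, n) for d, s, n in client_rows],
    )
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in flag_rows(CLIENT):
        print(row)
```

On the question's sample client this flags only the 2017-10-01 row. The 2016-04-01 move from stage 2 to stage 1 stays 0 because no stage 3 row precedes it.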

Adding a count column in SQL Server for groups of records

I am trying to update an existing table with an individual count of the record on each row in a count column.
The table has the following columns that need to be incremented:
MBR_NO, CLAIM_N0, Effective_Dt, incr_count
So a sample might look like this before the run:
MBR_NO | CLAIM_N0 | Effective_Dt | incr_count |
-------+----------+----------------+------------+
1 | 2 | 1/1/2015 | NULL |
1 | 4 | 5/5/2015 | NULL |
1 | 5 | 6/7/2016 | NULL |
1 | 7 | 8/7/2016 | NULL |
2 | 2 | 4/3/2015 | NULL |
2 | 5 | 5/21/2015 | NULL |
3 | 8 | 3/27/2015 | NULL |
I want to count by MBR_NO and update the Incr_count to look like this:
MBR_NO | CLAIM_N0 | Effective_Dt | incr_count |
-------+----------+----------------+------------+
1 | 2 | 1/1/2015 | 1 |
1 | 4 | 5/5/2015 | 2 |
1 | 5 | 6/7/2016 | 3 |
1 | 7 | 8/7/2016 | 4 |
2 | 2 | 4/3/2015 | 1 |
2 | 5 | 5/21/2015 | 2 |
3 | 8 | 3/27/2015 | 1 |
I need to change that field for processing later on.
I know this is not that complex, but it seemed that the other topics offered solutions that don't incrementally update. Any help would be appreciated.
You could just do this in a query with
ROW_NUMBER() OVER (PARTITION BY MBR_NO ORDER BY Effective_DT).
But does it matter if the number changes? For example, suppose you had
MBR_NO EffectiveDate RowNumber
------------------------------------
2 1/1/2017 1
2 5/1/2017 2
If you then inserted a row with an effective date of, say, 3/1/2017, it would change the row number for the 5/1/2017 row:
MBR_NO EffectiveDate RowNumber
------------------------------------
2 1/1/2017 1
2 3/1/2017 2
2 5/1/2017 3
You can query as below:
Select MBR_NO, CLAIM_N0, Effective_Dt,
incr_count = count(MBR_NO) over(Partition by MBR_NO order by Effective_Dt)
from yourtable
Output as below:
+--------+----------+--------------+------------+
| MBR_NO | CLAIM_N0 | Effective_Dt | incr_count |
+--------+----------+--------------+------------+
| 1 | 2 | 2015-01-01 | 1 |
| 1 | 4 | 2015-05-05 | 2 |
| 1 | 5 | 2016-06-07 | 3 |
| 1 | 7 | 2016-08-07 | 4 |
| 2 | 2 | 2015-04-03 | 1 |
| 2 | 5 | 2015-05-21 | 2 |
| 3 | 8 | 2015-03-27 | 1 |
+--------+----------+--------------+------------+
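The question asks for the incr_count column to be updated in place, while the queries above only select the number. Here is a minimal sketch of writing it back, run through Python's built-in sqlite3 module instead of SQL Server. The table name claims and the number_claims helper are made up for illustration, and the dates are stored in ISO format so that they sort correctly as text. On SQL Server the more usual route would probably be an updatable CTE that selects incr_count next to the ROW_NUMBER() and then runs UPDATE against the CTE.

```python
import sqlite3

# Sample rows from the question: (MBR_NO, CLAIM_N0, Effective_Dt).
CLAIMS = [
    (1, 2, "2015-01-01"),
    (1, 4, "2015-05-05"),
    (1, 5, "2016-06-07"),
    (1, 7, "2016-08-07"),
    (2, 2, "2015-04-03"),
    (2, 5, "2015-05-21"),
    (3, 8, "2015-03-27"),
]

# Number the rows per member in a derived table, then copy each number back
# onto the matching row, using SQLite's rowid to match them up.
UPDATE = """
UPDATE claims
SET incr_count = (
    SELECT numbered.rn
    FROM (
        SELECT c.rowid AS rid,
               ROW_NUMBER() OVER (PARTITION BY c.MBR_NO ORDER BY c.Effective_Dt) AS rn
        FROM claims AS c
    ) AS numbered
    WHERE numbered.rid = claims.rowid
)
"""


def number_claims(rows):
    """Fill incr_count and return the table ordered by member and date."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE claims (MBR_NO INTEGER, CLAIM_N0 INTEGER, "
        "Effective_Dt TEXT, incr_count INTEGER)"
    )
    con.executemany(
        "INSERT INTO claims (MBR_NO, CLAIM_N0, Effective_Dt) VALUES (?, ?, ?)", rows
    )
    con.execute(UPDATE)
    result = con.execute(
        "SELECT MBR_NO, CLAIM_N0, Effective_Dt, incr_count "
        "FROM claims ORDER BY MBR_NO, Effective_Dt"
    ).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in number_claims(CLAIMS):
        print(row)
```

As noted above, the stored numbers go stale if a row with an earlier date is inserted later, so the update would have to be re-run after such inserts.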

Select a specific line if i have the same information

I have a table with data as below:
+--------+----------+-------+------------+--------------+
| month | code | type | date | PersonID |
+--------+----------+-------+------------+--------------+
| 201501 | 178954 | 3 | 2014-12-3 | 10 |
| 201501 | 178954 | 3 | 2014-12-3 | 10 |
| 201501 | 178955 | 2 | 2014-12-13 | 10 |
| 201501 | 178955 | 2 | 2014-12-13 | 10 |
| 201501 | 178956 | 2 | 2014-12-11 | 10 |
| 201501 | 178958 | 1 | 2014-12-10 | 10 |
| 201501 | 178959 | 2 | 2014-12-12 | 15 |
| 201501 | 178959 | 2 | 2014-12-12 | 15 |
| 201501 | 178954 | 1 | 2014-12-11 | 13 |
| 201501 | 178954 | 1 | 2014-12-11 | 13 |
+--------+----------+-------+------------+--------------+
The first 6 rows have the same PersonID in the same month. When a PersonID appears more than once in the same month, I want to select the row whose type is 2 and whose date is the most recent. In my case the output would be as below:
+--------+--------+------+------------+----------+
| month | code | type| date | PersonID |
+--------+--------+------+------------+----------+
| 201501 | 178955 | 2 | 2014-12-13 | 10 |
| 201501 | 178959 | 2 | 2014-12-12 | 15 |
| 201501 | 178954 | 2 | 2014-12-11 | 13 |
+--------+--------+------+------------+----------+
Also, if there are duplicate rows, I don't want to display them.
Is there any solution to that?
Simply use GROUP BY:
https://msdn.microsoft.com/de-de/library/ms177673(v=sql.120).aspx
SELECT month, code, ... FROM tablename GROUP BY PersonID, date, ...
Note that you have to specify all columns in the GROUP BY.
SELECT DISTINCT A.month, A.code, A.type, B.date, B.PersonID FROM YourTable A
INNER JOIN (SELECT PersonID, MAX(date) as date FROM YourTable
GROUP BY PersonID) B
ON (A.PersonID = B.PersonID
AND A.date = B.date)
WHERE A.type = 2 ORDER BY B.date DESC, A.PersonID
Just in case you/others are still wondering.
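A different option is to rank the rows per month and PersonID and keep the first one. Here is a minimal sketch, run through Python's built-in sqlite3 module instead of SQL Server. The table name mytable and the pick_rows helper are made up for illustration, and the dates are rewritten in ISO format so that they sort correctly. It assumes type 2 rows are preferred and that the latest date breaks ties. For PersonID 13 the sample data only contains a type 1 row, so this sketch returns that row as it is.

```python
import sqlite3

# Sample rows from the question: (month, code, type, date, PersonID).
PEOPLE = [
    (201501, 178954, 3, "2014-12-03", 10),
    (201501, 178954, 3, "2014-12-03", 10),
    (201501, 178955, 2, "2014-12-13", 10),
    (201501, 178955, 2, "2014-12-13", 10),
    (201501, 178956, 2, "2014-12-11", 10),
    (201501, 178958, 1, "2014-12-10", 10),
    (201501, 178959, 2, "2014-12-12", 15),
    (201501, 178959, 2, "2014-12-12", 15),
    (201501, 178954, 1, "2014-12-11", 13),
    (201501, 178954, 1, "2014-12-11", 13),
]

# DISTINCT drops the exact duplicates first. ROW_NUMBER then ranks what is
# left per (month, PersonID): type 2 rows first, newest date first.
QUERY = """
SELECT month, code, type, date, PersonID
FROM (
    SELECT d.*,
           ROW_NUMBER() OVER (
               PARTITION BY month, PersonID
               ORDER BY CASE WHEN type = 2 THEN 0 ELSE 1 END, date DESC
           ) AS rn
    FROM (SELECT DISTINCT month, code, type, date, PersonID FROM mytable) AS d
) AS ranked
WHERE rn = 1
ORDER BY date DESC
"""


def pick_rows(rows):
    """Return one row per (month, PersonID), preferring type 2 and recent dates."""
    con = sqlite3.connect(":memory:")
    con.execute(
        "CREATE TABLE mytable (month INTEGER, code INTEGER, type INTEGER, "
        "date TEXT, PersonID INTEGER)"
    )
    con.executemany("INSERT INTO mytable VALUES (?, ?, ?, ?, ?)", rows)
    result = con.execute(QUERY).fetchall()
    con.close()
    return result


if __name__ == "__main__":
    for row in pick_rows(PEOPLE):
        print(row)
```

If only type 2 rows should ever be returned, adding WHERE type = 2 to the innermost query would drop PersonID 13 from the result altogether.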

pivot and cascade null columns

I have a table that holds values for particular months:
| MFG | DATE | FACTOR |
-----------------------------
| 1 | 2013-01-01 | 1 |
| 2 | 2013-01-01 | 0.8 |
| 2 | 2013-02-01 | 1 |
| 2 | 2013-12-01 | 1.55 |
| 3 | 2013-01-01 | 1 |
| 3 | 2013-04-01 | 1.3 |
| 3 | 2013-05-01 | 1.2 |
| 3 | 2013-06-01 | 1.1 |
| 3 | 2013-07-01 | 1 |
| 4 | 2013-01-01 | 0.9 |
| 4 | 2013-02-01 | 1 |
| 4 | 2013-12-01 | 1.8 |
| 5 | 2013-01-01 | 1.4 |
| 5 | 2013-02-01 | 1 |
| 5 | 2013-10-01 | 1.3 |
| 5 | 2013-11-01 | 1.2 |
| 5 | 2013-12-01 | 1.5 |
What I would like to do is pivot these using a calendar table (already defined):
And finally, cascade the NULL columns to use the previous value.
What I've got so far is a query that will populate the NULLs with the last value for mfg = 3. Each mfg will always have a value for the first of the year. My question is; how do I pivot this and extend to all mfg?
SELECT c.[date],
f.[factor],
Isnull(f.[factor], (SELECT TOP 1 factor
FROM factors
WHERE [date] < c.[date]
AND [factor] IS NOT NULL
AND mfg = 3
ORDER BY [date] DESC)) AS xFactor
FROM (SELECT [date]
FROM calendar
WHERE Datepart(yy, [date]) = 2013
AND Datepart(d, [date]) = 1) c
LEFT JOIN (SELECT [date],
[factor]
FROM factors
WHERE mfg = 3) f
ON f.[date] = c.[date]
Result
| DATE | FACTOR | XFACTOR |
---------------------------------
| 2013-01-01 | 1 | 1 |
| 2013-02-01 | (null) | 1 |
| 2013-03-01 | (null) | 1 |
| 2013-04-01 | 1.3 | 1.3 |
| 2013-05-01 | 1.2 | 1.2 |
| 2013-06-01 | 1.1 | 1.1 |
| 2013-07-01 | 1 | 1 |
| 2013-08-01 | (null) | 1 |
| 2013-09-01 | (null) | 1 |
| 2013-10-01 | (null) | 1 |
| 2013-11-01 | (null) | 1 |
| 2013-12-01 | (null) | 1 |
SQL Fiddle
Don't know if you need the dates to be dynamic from the calendar table or if mfg can be more than 5, but this should give you some ideas.
select *
from (
    select c.date,
           t.mfg,
           (
               select top 1 f.factor
               from factors as f
               where f.date <= c.date and
                     f.mfg = t.mfg and
                     f.factor is not null
               order by f.date desc
           ) as factor
    from calendar as c
    cross apply(values(1),(2),(3),(4),(5)) as t(mfg)
) as t
pivot (
    max(t.factor) for t.date in ([20130101], [20130201], [20130301],
                                 [20130401], [20130501], [20130601],
                                 [20130701], [20130801], [20130901],
                                 [20131001], [20131101], [20131201])
) as P
SQL Fiddle
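To check the "last known value" lookup on the question's data, here is a minimal sketch run through Python's built-in sqlite3 module instead of SQL Server. It keeps the same correlated subquery idea (LIMIT 1 in place of TOP 1), but SQLite has no PIVOT, so the rows are turned into one list of 12 monthly values per mfg in Python. That reshaping step and the filled_pivot helper are stand-ins for illustration, not the answer's method. The mfg list is taken from the factors table, not hard-coded, and the calendar is assumed to hold only the first of each month in 2013.

```python
import sqlite3

# Sample rows from the question: (mfg, date, factor).
FACTORS = [
    (1, "2013-01-01", 1.0),
    (2, "2013-01-01", 0.8), (2, "2013-02-01", 1.0), (2, "2013-12-01", 1.55),
    (3, "2013-01-01", 1.0), (3, "2013-04-01", 1.3), (3, "2013-05-01", 1.2),
    (3, "2013-06-01", 1.1), (3, "2013-07-01", 1.0),
    (4, "2013-01-01", 0.9), (4, "2013-02-01", 1.0), (4, "2013-12-01", 1.8),
    (5, "2013-01-01", 1.4), (5, "2013-02-01", 1.0), (5, "2013-10-01", 1.3),
    (5, "2013-11-01", 1.2), (5, "2013-12-01", 1.5),
]

# Stand-in for the calendar table: the first day of each month in 2013.
MONTHS = [f"2013-{m:02d}-01" for m in range(1, 13)]

# For every (mfg, month) pair, pick the most recent factor on or before it.
QUERY = """
SELECT m.mfg,
       c.date,
       (SELECT f.factor
        FROM factors AS f
        WHERE f.mfg = m.mfg AND f.date <= c.date
        ORDER BY f.date DESC
        LIMIT 1) AS factor
FROM calendar AS c
CROSS JOIN (SELECT DISTINCT mfg FROM factors) AS m
ORDER BY m.mfg, c.date
"""


def filled_pivot(factors, months):
    """Return {mfg: [factor for each month]} with gaps carried forward."""
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE factors (mfg INTEGER, date TEXT, factor REAL)")
    con.execute("CREATE TABLE calendar (date TEXT)")
    con.executemany("INSERT INTO factors VALUES (?, ?, ?)", factors)
    con.executemany("INSERT INTO calendar VALUES (?)", [(m,) for m in months])
    table = {}
    for mfg, _date, factor in con.execute(QUERY):
        table.setdefault(mfg, []).append(factor)
    con.close()
    return table


if __name__ == "__main__":
    for mfg, values in filled_pivot(FACTORS, MONTHS).items():
        print(mfg, values)
```

For mfg 3 this gives the same 12 values as the XFACTOR column in the question.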

Resources