SQL Server, Merging 2 Rows into 1 and limit row grouping

I am writing a SQL query to access data in a table called "Alarms". The table has the following format:
AlarmNumber | Time | AlarmState
-------------|-------|-----------
1046 | 10:30 | 0
1045 | 10:25 | 1
1044 | 10:24 | 0
1046 | 10:24 | 1
1046 | 10:23 | 0
1046 | 10:22 | 1
What I would like to achieve is to reshape the alarms into the format below.
The goal is to display the alarm start time, the alarm end time and the alarm active time (alarm end time - alarm start time):
AlarmNumber | AlarmStartTime | AlarmEndTime | AlarmActiveTime
-------------|-----------------|--------------|----------------
1046 | 10:24 | 10:30 | 00:06
1045 | 10:25 | - | Current Time - 10:25
1044 | Shift Start Time | 10:24 | 10:24 - Shift Start Time
1046 | 10:22 | 10:23 | 00:01
My current code is the following (Note: _Global_Vars is a table with Timezones):
SELECT
TODATETIMEOFFSET([ALARM_START_TIME],0) AT TIME ZONE (SELECT g.LocalTimeZone FROM _Global_Vars as g) AS [ALARM_START_TIME],
TODATETIMEOFFSET(ALARM_FINISH_TIME,0) AT TIME ZONE (SELECT g.LocalTimeZone FROM _Global_Vars as g) AS [ALARM_FINISH_TIME],
DATEDIFF(SS, [ALARM_START_TIME], [ALARM_FINISH_TIME]),
sub.AlarmNumber
FROM
(
SELECT
(a.[Time]) AS AlarmTime,
(a.[AlarmNumber]+1) as AlarmNumber,
(CASE WHEN a.[AlarmState] = 1 THEN a.[Time] END) [ALARM_START_TIME],
(CASE WHEN a.[AlarmState] = 0 THEN a.[Time] END) [ALARM_FINISH_TIME]
FROM [Alarms] as a
WHERE (a.[Time] > DATEADD(mi, - 60.0 * 12, GETUTCDATE()))
) AS sub
The issue at the moment is that if I put MAX in front of the CASE expressions and GROUP BY AlarmNumber, all of the values for an AlarmNumber are combined into a single row. I would like one row per occurrence, so an AlarmNumber should appear multiple times if the alarm occurs multiple times.
I am a novice at writing SQL queries, so any help would be great.

I will post just part of the solution (containing only the first three columns), since you did not clearly specify the intended content of the last column.
SELECT t.alarmNumber,
isnull(MAX(CASE WHEN t.alarmstate = 1 THEN CAST(t.[time] as varchar(20)) END), 'Shift Start Time') AlarmStartTime,
isnull(MAX(CASE WHEN t.alarmstate = 0 THEN CAST(t.[time] as varchar(20)) END), 'Current Time') AlarmEndTime
FROM
(
SELECT *, row_number() over (partition by alarmnumber, alarmstate order by [time]) al_group
FROM Alarms
) t
GROUP BY t.alarmNumber, t.al_group
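For the missing fourth column, here is a minimal sketch that builds on the same row_number() pairing. It assumes [Time] is a datetime; the table variable @Alarms, the dates, and the variables @ShiftStart and @Now are hypothetical stand-ins, not part of the original schema.

```sql
-- Hypothetical sample data mirroring the question; [Time] is assumed to be a datetime.
DECLARE @Alarms TABLE (AlarmNumber int, [Time] datetime, AlarmState bit);
INSERT INTO @Alarms (AlarmNumber, [Time], AlarmState) VALUES
    (1046, '2020-01-01T10:30:00', 0),
    (1045, '2020-01-01T10:25:00', 1),
    (1044, '2020-01-01T10:24:00', 0),
    (1046, '2020-01-01T10:24:00', 1),
    (1046, '2020-01-01T10:23:00', 0),
    (1046, '2020-01-01T10:22:00', 1);

-- Assumed fallbacks: when the shift started, and "now" for alarms that are still active.
DECLARE @ShiftStart datetime = '2020-01-01T06:00:00';
DECLARE @Now datetime = GETDATE();

WITH numbered AS (
    -- Number the starts (state 1) and stops (state 0) of each alarm separately,
    -- so the n-th start and the n-th stop share the same al_group.
    SELECT AlarmNumber, [Time], AlarmState,
           ROW_NUMBER() OVER (PARTITION BY AlarmNumber, AlarmState ORDER BY [Time]) AS al_group
    FROM @Alarms
),
paired AS (
    -- Collapse each start/stop pair into one row; a missing side stays NULL.
    SELECT AlarmNumber,
           MAX(CASE WHEN AlarmState = 1 THEN [Time] END) AS AlarmStartTime,
           MAX(CASE WHEN AlarmState = 0 THEN [Time] END) AS AlarmEndTime
    FROM numbered
    GROUP BY AlarmNumber, al_group
)
SELECT AlarmNumber,
       AlarmStartTime,
       AlarmEndTime,
       -- A NULL start falls back to the shift start, a NULL end to the current time.
       DATEDIFF(MINUTE,
                ISNULL(AlarmStartTime, @ShiftStart),
                ISNULL(AlarmEndTime, @Now)) AS AlarmActiveMinutes
FROM paired
ORDER BY ISNULL(AlarmEndTime, @Now) DESC;
```

Note that this pairing only lines up when every stop has a matching earlier start within the selected rows. If an alarm that was already active at the start of the window also fires again later, its first stop would be paired with the wrong start, so that case would need extra handling.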

Related

Creating rolling window for time series data in SQL

I have a question about adding rolling window columns in SQL. Table A is a sample of 24 months of time series data. I need to add columns for the difference between each month's balance and the previous month's, and between each month's balance and the month before the previous one. For example, for Mar 2020 I need the difference between Mar and Feb and also between Mar and Jan, for Deposit and Withdraw separately, for each ID (Table B). I tried to use a window function in SQL but I do not know how.
**Table A**
| ID | Date   | A     | B    |
|----|--------|-------|------|
| 1  | Jan 20 | $200  | $100 |
| 1  | Feb 20 | $500  | $250 |
| 1  | Mar 20 | $1000 | $550 |
I want results like this:
**Table B**
| ID | Date   | A     | B    | A(Mar-Feb) | A(Mar-Jan) | B(Mar-Feb) | B(Mar-Jan) |
|----|--------|-------|------|------------|------------|------------|------------|
| 1  | Jan 20 | $200  | $100 |            |            |            |            |
| 1  | Feb 20 | $500  | $250 |            |            |            |            |
| 1  | Mar 20 | $1000 | $550 | $500       | $800       | $300       | $450       |
I'd really appreciate it if someone could help me.

Edited: See edit at bottom for corrected answer based on more information from OP
I "think" this is what you're asking for and it may not perfectly be what you want, because it fills in the other rows as well...
IF OBJECT_ID('tempdb..#TableA','U') IS NOT NULL DROP TABLE #TableA; --SELECT * FROM #TableA
CREATE TABLE #TableA (
ID int NOT NULL,
[Date] date NOT NULL,
A int NOT NULL,
B int NOT NULL,
)
INSERT INTO #TableA (ID, Date, A, B)
VALUES (1, '2020-01-01', 200, 100)
, (1, '2020-02-01', 500, 250)
, (1, '2020-03-01', 1000, 550)
SELECT ta.ID
, [Date] = FORMAT(ta.[Date],'MMM yy')
, ta.A, ta.B
, A_DiffPrev = ta.A - LAG(ta.A) OVER (ORDER BY ta.[Date])
, A_DiffFirst = ta.A - FIRST_VALUE(ta.A) OVER (ORDER BY ta.[Date])
, B_DiffPrev = ta.B - LAG(ta.B) OVER (ORDER BY ta.[Date])
, B_DiffFirst = ta.B - FIRST_VALUE(ta.B) OVER (ORDER BY ta.[Date])
FROM #TableA ta
Returns:
| ID | Date | A | B | A_DiffPrev | A_DiffFirst | B_DiffPrev | B_DiffFirst |
|----|--------|------|-----|------------|-------------|------------|-------------|
| 1 | Jan 20 | 200 | 100 | NULL | 0 | NULL | 0 |
| 1 | Feb 20 | 500 | 250 | 300 | 300 | 150 | 150 |
| 1 | Mar 20 | 1000 | 550 | 500 | 800 | 300 | 450 |
Explanation
LAG(ta.A) OVER (ORDER BY ta.[Date]) - This will give you the previous value as sorted by the provided ORDER BY. So in this case, it's saying, give me the value that occurs prior to the current row, if you sort by [Date] Ascending
FIRST_VALUE(ta.A) OVER (ORDER BY ta.[Date]) - Similar idea to LAG() except it's saying to get the very first item, rather than the previous item.
Edit
In the comments you mentioned that FIRST_VALUE() will not work for you because you don't want to compare with the first month, you want to compare with the previous month and two months back.
In that case, you can use this solution:
SELECT ta.ID
, [Date] = FORMAT(ta.[Date],'MMM yy')
, ta.A, ta.B
, A_DiffPrev1 = ta.A - LAG(ta.A,1) OVER (ORDER BY ta.[Date])
, A_DiffPrev2 = ta.A - LAG(ta.A,2) OVER (ORDER BY ta.[Date])
, B_DiffPrev1 = ta.B - LAG(ta.B,1) OVER (ORDER BY ta.[Date])
, B_DiffPrev2 = ta.B - LAG(ta.B,2) OVER (ORDER BY ta.[Date])
FROM #TableA ta
Explanation:
In this change, I'm using LAG() for everything, and I'm telling LAG() how many rows I want it to look back.
So to get the previous month, I say LAG(A, 1), which means to grab the previous row. An offset of 1 is the default; I'm only providing it here to make it explicit what is happening.
Then I say LAG(A, 2) which means to go back two rows and grab that value.
NOTE: This is all assuming you do not have gaps in your data.
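Since the question asks for the differences "for each ID", the windows probably also need a PARTITION BY so that LAG() never reaches back into another ID's rows. A minimal sketch, using a table variable @TableA and a second ID that are hypothetical and only there to show the effect:

```sql
-- Hypothetical sample data: the question's rows for ID 1 plus an invented ID 2.
DECLARE @TableA TABLE (ID int NOT NULL, [Date] date NOT NULL, A int NOT NULL, B int NOT NULL);
INSERT INTO @TableA (ID, [Date], A, B) VALUES
    (1, '2020-01-01', 200, 100),
    (1, '2020-02-01', 500, 250),
    (1, '2020-03-01', 1000, 550),
    (2, '2020-01-01', 50, 20),
    (2, '2020-02-01', 80, 60);

SELECT ta.ID
     , ta.[Date]
     , ta.A, ta.B
     -- PARTITION BY restarts the look-back for every ID.
     , A_DiffPrev1 = ta.A - LAG(ta.A, 1) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
     , A_DiffPrev2 = ta.A - LAG(ta.A, 2) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
     , B_DiffPrev1 = ta.B - LAG(ta.B, 1) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
     , B_DiffPrev2 = ta.B - LAG(ta.B, 2) OVER (PARTITION BY ta.ID ORDER BY ta.[Date])
FROM @TableA ta
ORDER BY ta.ID, ta.[Date];
```

With the partition in place, the first row of each ID gets NULL for the one-month difference and the first two rows get NULL for the two-month difference, instead of being compared with the last rows of the previous ID.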

Speed up a select in SQL Server

I have a table that has values like these :
Table 1 :
Name | DateTimeFrom | DateTimeTo
A | 2017-02-03 02:00 | 2017-02-10 23:55
B | 2017-01-03 14:00 | 2017-05-10 19:55
And another table that has values like these :
Table 2:
Name | Date | Hour | Value
A | 2017-01-01 | 00:00 | 0.25
A | 2017-01-01 | 00:15 | 0.25
A | 2017-01-01 | 00:30 | 0
A | 2017-01-01 | 00:45 | 0
A | 2017-01-01 | 01:00 | 0.25
[...] Contains values 0 or 0.25 every 15mins
Result :
Name | DateTimeFrom | DateTimeTo | Value
A | 2017-02-03 02:00 | 2017-02-10 23:55 | 345.0
B | 2017-01-03 14:00 | 2017-05-10 19:55 | 1202
I've created a view that contains all the columns from table 1 and the SUM of all the values from the table 2 according to the daterange on the table 1. The problem is that Table 2 contains more than 3 million rows and the SELECT takes about 10 mins...
Is there a way to speed up the process?
I tried to create an index on Table 2, but I don't know which index (clustered? on which columns?) I must create to lower the execution time.
Edit (here is the query) :
SELECT Name, DateTimeFrom, DateTimeTo FROM Table1
LEFT OUTER JOIN Table2 ON Table1.Name = Table2.Name AND Table1.DateTimeFrom <=
CAST(Table2.Date AS DATETIME) + CAST(Table2.Hour AS DATETIME)
AND (CASE WHEN Table1.DateTimeTo IS NULL THEN GETDATE() ELSE
Table1.DateTimeTo END) > CAST(Table2.Date AS DATETIME) + CAST(Table2.Hour AS DATETIME)

Op(Swapper) - Are you trying to only return the past 2 days?
Start with a nonclustered index on Table 2's Date column that includes the Value column.
Then add a filter (in the view definition) for only the data set you need; no one can consume 3 million records. Something like WHERE DateTimeFrom > DATEADD(month, -1, SYSDATETIME()).
A second thought: why compute this data over and over again via a view? Consider materializing it into a table.
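As a rough sketch of the indexing idea: the temp tables, column types and index name below are assumptions, since the question does not show the real definitions. Adding Name and Hour to the index key is my own variation on the suggestion above. The extra range condition on the bare [Date] column is there because the computed CAST(...) + CAST(...) expression on its own cannot be matched against an index.

```sql
-- Hypothetical stand-ins for the question's tables; the column types are assumptions.
CREATE TABLE #Table1 (Name varchar(10) NOT NULL, DateTimeFrom datetime NOT NULL, DateTimeTo datetime NULL);
CREATE TABLE #Table2 (Name varchar(10) NOT NULL, [Date] date NOT NULL, [Hour] time(0) NOT NULL, Value decimal(9,2) NOT NULL);

INSERT INTO #Table1 (Name, DateTimeFrom, DateTimeTo)
VALUES ('A', '2017-02-03T02:00:00', '2017-02-10T23:55:00');
INSERT INTO #Table2 (Name, [Date], [Hour], Value)
VALUES ('A', '2017-02-03', '01:45', 0.25),   -- before the range, must not be counted
       ('A', '2017-02-03', '02:00', 0.25),
       ('A', '2017-02-03', '02:15', 0.25);

-- Hypothetical index name; the key covers the join and range columns, INCLUDE covers the sum.
CREATE NONCLUSTERED INDEX IX_Table2_Name_Date_Hour
    ON #Table2 (Name, [Date], [Hour])
    INCLUDE (Value);

SELECT t1.Name, t1.DateTimeFrom, t1.DateTimeTo, SUM(t2.Value) AS Value
FROM #Table1 AS t1
LEFT JOIN #Table2 AS t2
       ON t2.Name = t1.Name
      -- Coarse filter on the bare column, so the index can be used for a range seek.
      AND t2.[Date] >= CAST(t1.DateTimeFrom AS date)
      AND t2.[Date] <= CAST(ISNULL(t1.DateTimeTo, GETDATE()) AS date)
      -- Exact boundaries, as in the original query.
      AND CAST(t2.[Date] AS datetime) + CAST(t2.[Hour] AS datetime) >= t1.DateTimeFrom
      AND CAST(t2.[Date] AS datetime) + CAST(t2.[Hour] AS datetime) <  ISNULL(t1.DateTimeTo, GETDATE())
GROUP BY t1.Name, t1.DateTimeFrom, t1.DateTimeTo;
```

Whether this helps on the real 3 million rows depends on the actual data types and existing indexes, so it is worth comparing the execution plans before and after.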

SQL Query with Average and Grouping

I just want to ask you guys, especially those with MsSQL knowledge, regarding my query.
My goal is to get the average delivery time and group my data by delivery date and route id daily/weekly/monthly.
Here's my query:
SELECT
RouteID,
CONVERT(date, [DeliveryDate]) AS delivery_date,
AVG(
DATEDIFF(
day,
CONVERT(date, [UnloadDate]),
CONVERT(date, [DeliveryDate])
)
) as Average_Delivery_Time
FROM [CARGODB].[dbo].[Cargo_Transactions]
WHERE
[DeliveryDate] IS NOT NULL AND
[UnloadDate] != 0 AND
[StageID] = 'D' AND
( CONVERT(date, [DeliveryDate]) LIKE '%2016%' or
CONVERT(date, [DeliveryDate]) LIKE '%2017%')
GROUP BY CONVERT(date, [DeliveryDate]), [RouteID]
ORDER BY CONVERT(date, [DeliveryDate]) DESC
I am not confident that the average delivery time is correct, so if you think it's wrong or there are other things in my query that need to be corrected, please let me know.
UPDATE:
I was able to get the right query:
SELECT [RouteID],
CAST(DATEPART(YEAR,[DeliveryDate]) as varchar) + ' Week ' +
CAST(DATEPART(WEEK,[DeliveryDate]) AS varchar) AS week_name,
AVG(DATEDIFF(day, CONVERT(date, [UnloadDate]), CONVERT(date,
[DeliveryDate]))) as Average_Delivery_Days
FROM [CARGODB].[dbo].[Cargo_Transactions]
WHERE [DeliveryDate] IS NOT NULL AND [DeliveryDate] != 0
AND CONVERT(date, [DeliveryDate]) BETWEEN '2016-01-01' AND GETDATE()
AND [UnloadDate] IS NOT NULL AND [UnloadDate] != 0 AND [DeliveryDate] >
[UnloadDate]
AND [Deleted] = 0 and [StageID] = 'D'
GROUP BY DATEPART(YEAR,[DeliveryDate]), DATEPART(WEEK,[DeliveryDate]),
[RouteID]
ORDER BY DATEPART(YEAR,[DeliveryDate]), DATEPART(WEEK,[DeliveryDate]),
Average_Delivery_Days desc
But I have a more complicated query to do now. I have this sample data:
RouteID | week_name | yearnum | weeknum | Average_Delivery_Days
=======================================================================
MK | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
TSM | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
E | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
A | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
D | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
MP | 2016 Week 2 | 2016 | 2 | 1
-----------------------------------------------------------------------
CTN | 2016 Week 3 | 2016 | 3 | 9
-----------------------------------------------------------------------
BIS | 2016 Week 3 | 2016 | 3 | 8
-----------------------------------------------------------------------
C | 2016 Week 3 | 2016 | 3 | 1
-----------------------------------------------------------------------
PN | 2016 Week 4 | 2016 | 4 |10
-----------------------------------------------------------------------
How can I make the above data be like:
MK and TSM are merged into 1 new routeID like Manila1
E, A, and D are merged into another as Manila2
MP, CTN, AND BIS as Visayas
C and PN as Mindanao
and so on..
And the average delivery days will be changed as well.
Your help is highly appreciated. Thank you!
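
One possible approach, sketched under assumptions: map each RouteID to its new group with a small lookup (here an inline VALUES list) and group by that instead of RouteID. The table variable @Cargo and its rows are hypothetical, simplified stand-ins for Cargo_Transactions; only the route-to-group mapping comes from the list above.

```sql
-- Hypothetical, simplified stand-in for Cargo_Transactions.
DECLARE @Cargo TABLE (RouteID varchar(10) NOT NULL, UnloadDate date NOT NULL, DeliveryDate date NOT NULL);
INSERT INTO @Cargo (RouteID, UnloadDate, DeliveryDate) VALUES
    ('MK',  '2016-01-04', '2016-01-05'),
    ('TSM', '2016-01-04', '2016-01-07'),
    ('E',   '2016-01-05', '2016-01-06'),
    ('CTN', '2016-01-02', '2016-01-11'),
    ('PN',  '2016-01-08', '2016-01-18');

SELECT g.RouteGroup,
       DATEPART(YEAR, c.DeliveryDate) AS yearnum,
       DATEPART(WEEK, c.DeliveryDate) AS weeknum,
       -- Averaged over the underlying transactions, not over the per-route averages.
       AVG(DATEDIFF(day, c.UnloadDate, c.DeliveryDate)) AS Average_Delivery_Days
FROM @Cargo AS c
JOIN (VALUES ('MK',  'Manila1'), ('TSM', 'Manila1'),
             ('E',   'Manila2'), ('A',   'Manila2'), ('D',   'Manila2'),
             ('MP',  'Visayas'), ('CTN', 'Visayas'), ('BIS', 'Visayas'),
             ('C',   'Mindanao'), ('PN', 'Mindanao')
     ) AS g (RouteID, RouteGroup)
  ON g.RouteID = c.RouteID
GROUP BY g.RouteGroup, DATEPART(YEAR, c.DeliveryDate), DATEPART(WEEK, c.DeliveryDate)
ORDER BY yearnum, weeknum, Average_Delivery_Days DESC;
```

Grouping the raw transactions keeps the new averages weighted by the number of deliveries per route. As in the original query, AVG over an integer DATEDIFF returns an integer, so fractions are truncated. If the mapping grows, a permanent lookup table would probably be easier to maintain than the inline VALUES list.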

Window function to count occurrences in last 10 minutes

I can use a traditional subquery approach to count the occurrences in the last ten minutes. For example, this:
drop table if exists [dbo].[readings]
go
create table [dbo].[readings](
[server] [int] NOT NULL,
[sampled] [datetime] NOT NULL
)
go
insert into readings
values
(1,'20170101 08:00'),
(1,'20170101 08:02'),
(1,'20170101 08:05'),
(1,'20170101 08:30'),
(1,'20170101 08:31'),
(1,'20170101 08:37'),
(1,'20170101 08:40'),
(1,'20170101 08:41'),
(1,'20170101 09:07'),
(1,'20170101 09:08'),
(1,'20170101 09:09'),
(1,'20170101 09:11')
go
-- Count in the last 10 minutes - example periods 08:31 to 08:40, 09:12 to 09:21
select server,sampled,(select count(*) from readings r2 where r2.server=r1.server and r2.sampled <= r1.sampled and r2.sampled > dateadd(minute,-10,r1.sampled)) as countinlast10minutes
from readings r1
order by server,sampled
go
How can I use a window function to obtain the same result? I've tried this:
select server,sampled,
count(case when sampled <= r1.sampled and sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
-- count(case when currentrow.sampled <= r1.sampled and currentrow.sampled > dateadd(minute,-10,r1.sampled) then 1 else null end) over (partition by server order by sampled rows between unbounded preceding and current row) as countinlast10minutes
from readings r1
order by server,sampled
But the result is just the running count. Is there any system variable that refers to the current row pointer, something like currentrow.sampled?

This isn't a very pleasing answer but one possibility is to first create a helper table with all the minutes
CREATE TABLE #DateTimes(datetime datetime primary key);
WITH E1(N) AS
(
SELECT 1 FROM (VALUES(1),(1),(1),(1),(1),
(1),(1),(1),(1),(1)) V(N)
) -- 1*10^1 or 10 rows
, E2(N) AS (SELECT 1 FROM E1 a, E1 b) -- 1*10^2 or 100 rows
, E4(N) AS (SELECT 1 FROM E2 a, E2 b) -- 1*10^4 or 10,000 rows
, E8(N) AS (SELECT 1 FROM E4 a, E4 b) -- 1*10^8 or 100,000,000 rows
,R(StartRange, EndRange)
AS (SELECT MIN(sampled),
MAX(sampled)
FROM readings)
,N(N)
AS (SELECT ROW_NUMBER()
OVER (
ORDER BY (SELECT NULL)) AS N
FROM E8)
INSERT INTO #DateTimes
SELECT TOP (SELECT 1 + DATEDIFF(MINUTE, StartRange, EndRange) FROM R) DATEADD(MINUTE, N.N - 1, StartRange)
FROM N,
R;
And then with that in place you could use ROWS BETWEEN 9 PRECEDING AND CURRENT ROW
WITH T1 AS
( SELECT Server,
MIN(sampled) AS StartRange,
MAX(sampled) AS EndRange
FROM readings
GROUP BY Server )
SELECT Server,
sampled,
Cnt
FROM T1
CROSS APPLY
( SELECT r.sampled,
COUNT(r.sampled) OVER (ORDER BY N.datetime ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) AS Cnt
FROM #DateTimes N
LEFT JOIN readings r
ON r.sampled = N.datetime
AND r.server = T1.server
WHERE N.datetime BETWEEN StartRange AND EndRange ) CA
WHERE CA.sampled IS NOT NULL
ORDER BY sampled
The above assumes that there is at most one sample per minute and that all the times are exact minutes. If this isn't true it would need another table expression pre-aggregating by datetimes rounded to the minute.
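A minimal sketch of that pre-aggregation step, assuming samples may carry seconds and may share a minute. The table variable @readings and its rows are hypothetical:

```sql
-- Hypothetical readings with seconds and more than one sample in the same minute.
DECLARE @readings TABLE ([server] int NOT NULL, sampled datetime NOT NULL);
INSERT INTO @readings ([server], sampled) VALUES
    (1, '2017-01-01T08:00:10'),
    (1, '2017-01-01T08:00:45'),
    (1, '2017-01-01T08:02:30'),
    (1, '2017-01-01T08:05:00');

-- Truncate each sample to the start of its minute and count the samples per minute.
SELECT [server],
       DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0) AS sampled_minute,
       COUNT(*) AS samples
FROM @readings
GROUP BY [server], DATEADD(MINUTE, DATEDIFF(MINUTE, 0, sampled), 0)
ORDER BY [server], sampled_minute;
```

The per-minute rows could then be joined to the helper table in place of readings, with SUM(samples) OVER (... ROWS BETWEEN 9 PRECEDING AND CURRENT ROW) replacing the COUNT. The result is then one row per minute rather than one row per sample.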

As far as I know, there is not a simple exact replacement for your subquery using window functions.
Window functions operate on a set of rows and allow you to work with them based on partitions and order.
What you are trying to do isn't the type of partitioning that we can work with in window functions.
Generating the partitions we would need in order to use window functions in this instance would just result in overly complicated code.
I would suggest cross apply() as an alternative to your subquery.
I am not sure if you meant to restrict your results to within 9 minutes, but with sampled > dateadd(...) that is what is happening in your original subquery.
Here is what a window function could look like based on partitioning your samples into 10 minute windows, along with a cross apply() version.
select
r.server
, r.sampled
, CrossApply = x.CountRecent
, OriginalSubquery = (
select count(*)
from readings s
where s.server=r.server
and s.sampled <= r.sampled
/* doesn't include 10 minutes ago */
and s.sampled > dateadd(minute,-10,r.sampled)
)
, Slices = count(*) over(
/* partition by server, 10 minute slices, not the same thing*/
partition by server, dateadd(minute,datediff(minute,0,sampled)/10*10,0)
order by sampled
)
from readings r
cross apply (
select CountRecent=count(*)
from readings i
where i.server=r.server
/* changed to >= */
and i.sampled >= dateadd(minute,-10,r.sampled)
and i.sampled <= r.sampled
) as x
order by server,sampled
results: http://rextester.com/BMMF46402
+--------+---------------------+------------+------------------+--------+
| server | sampled | CrossApply | OriginalSubquery | Slices |
+--------+---------------------+------------+------------------+--------+
| 1 | 01.01.2017 08:00:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:02:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:05:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:30:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 08:31:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 08:37:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 08:40:00 | 4 | 3 | 1 |
| 1 | 01.01.2017 08:41:00 | 4 | 3 | 2 |
| 1 | 01.01.2017 09:07:00 | 1 | 1 | 1 |
| 1 | 01.01.2017 09:08:00 | 2 | 2 | 2 |
| 1 | 01.01.2017 09:09:00 | 3 | 3 | 3 |
| 1 | 01.01.2017 09:11:00 | 4 | 4 | 1 |
+--------+---------------------+------------+------------------+--------+

Thanks, Martin and SqlZim, for your answers. I'm going to raise a Connect enhancement request for something like %%currentrow that can be used in window aggregates. I think this would lead to much simpler and more natural SQL:
select count(case when sampled <= %%currentrow.sampled and sampled > dateadd(minute,-10,%%currentrow.sampled) then 1 else null end) over (...whatever the window is...)
We can already use expressions like this:
select count(case when sampled <= getdate() and sampled > dateadd(minute,-10,getdate()) then 1 else null end) over (...whatever the window is...)
so I think it would be great if we could reference a column that's in the current row.

Issue with Running Total Still

I still have an issue with working out the best way to calculate a running balance.
I am going to be using this code in a Rent Statement that I am going to produce in SSRS, but the problem I am having is that I can't seem to work out how to achieve a running balance.
SELECT rt.TransactionId,
rt.TransactionDate,
rt.PostingDate,
rt.AccountId,
rt.TotalValue,
rab.ClosingBalance,
ROW_NUMBER()OVER(PARTITION BY rt.AccountId ORDER BY rt.PostingDate desc) AS row,
CASE WHEN ROW_NUMBER()OVER(PARTITION BY rt.AccountId ORDER BY rt.PostingDate desc) = 1
THEN ISNULL(rab.ClosingBalance,0)
ELSE 0 end
FROM RentTransactions rt
--all accounts for the specific agreement
INNER JOIN (select raa.AccountId
from RentAgreementEpisode rae
inner join RentAgreementAccount raa on raa.AgreementEpisodeId = rae.AgreementEpisodeId
where rae.AgreementId=1981
) ij on ij.AccountId = rt.AccountId
LEFT JOIN RentBalance rab on rab.AccountId = rt.AccountId AND rt.PostingDate BETWEEN rab.BalanceFromDate AND isnull(rab.BalanceToDate,dateadd(day, datediff(day, 0, GETDATE()), 0))
This gives me the results below.
So my code sorts my transactions in the order I want and numbers the rows in the correct order as well.
Where the row number is 1, I need it to pull back the balance on that account at that point in time, which is what I am doing. But I am unsure how to get my code to then start subtracting the rows that follow. In this case, the current figure of 1118.58 would need the Total Value in row 2 (91.65) subtracted from it, so the running balance for row 2 would be 1026.93, and so on.
Any help would be greatly appreciated.

Assuming you have all the transactions being returned in your query you can calculate a running total using the over clause, you just need to start at the beginning of your dataset rather than working backwards from your current balance:
-- Table variables are declared with @ (a # prefix is for temp tables created with CREATE TABLE).
declare @t table(d date,v decimal(10,2));
insert into @t values ('20170101',10),('20170102',20),('20170103',30),('20170104',40),('20170105',50),('20170106',60),('20170107',70),('20170108',80),('20170109',90);
select *
,sum(v) over (order by d
rows between unbounded preceding
and current row
) as RunningTotal
from @t
order by d desc
Output:
+------------+-------+--------------+
| d | v | RunningTotal |
+------------+-------+--------------+
| 2017-01-09 | 90.00 | 450.00 |
| 2017-01-08 | 80.00 | 360.00 |
| 2017-01-07 | 70.00 | 280.00 |
| 2017-01-06 | 60.00 | 210.00 |
| 2017-01-05 | 50.00 | 150.00 |
| 2017-01-04 | 40.00 | 100.00 |
| 2017-01-03 | 30.00 | 60.00 |
| 2017-01-02 | 20.00 | 30.00 |
| 2017-01-01 | 10.00 | 10.00 |
+------------+-------+--------------+
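
If working backwards from the closing balance really is what's needed, the same over clause can be ordered descending. The sketch below follows the arithmetic described in the question (row 1 shows the closing balance, each later row subtracts its own value). The closing balance and the 91.65 come from the question; the other values, the dates and the variable names are hypothetical.

```sql
-- Hypothetical transactions; only 1118.58 and 91.65 are taken from the question.
DECLARE @t TABLE (d date, v decimal(10,2));
INSERT INTO @t (d, v) VALUES ('20170109', 50.00), ('20170108', 91.65), ('20170107', 26.93);

DECLARE @ClosingBalance decimal(10,2) = 1118.58;

SELECT d, v,
       -- Subtract the values of rows 2..n in descending date order:
       -- the running sum so far, minus the newest row's own value.
       @ClosingBalance
         - ( SUM(v)         OVER (ORDER BY d DESC ROWS UNBOUNDED PRECEDING)
           - FIRST_VALUE(v) OVER (ORDER BY d DESC ROWS UNBOUNDED PRECEDING) ) AS RunningBalance
FROM @t
ORDER BY d DESC;
```

With these values the three rows come out as 1118.58, 1026.93 and 1000.00. If several transactions can share the same date, a tie-breaker such as the transaction id would be needed in the ORDER BY to keep the result deterministic.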
