Good day,
I have a sql table with the following setup:
DataPoints{ DateTime timeStampUtc , bit value}
The points are on a minute interval, and store either a 1(on) or a 0(off).
I need to write a stored procedure to find the points of interest from all the data points.
I have a simplified drawing below:
I need to find the corner points only. Please note that there may be many data points between a value change. For example:
{0,0,0,0,0,0,0,1,1,1,1,0,0,0}
This is my thinking atm (high level)
Select timeStampUtc, Value
From Data Points
Where Value before or value after differs by 1 or -1
I am struggling to convert this concept to sql, and I also have a feeling there is an more elegant mathematical solution that I am not aware off. This must be a common problem in electronics?
I have wrapped the table into a CTE. Then, I am joining every row in the CTE to the next row of itself. Also, I've added a condition that the consequent rows should differ in the value.
This would return you all rows where the value changes.
;WITH CTE AS(
SELECT ROW_NUMBER() OVER(ORDER BY TimeStampUTC) AS id, VALUE, TIMESTAMPUTC
FROM DataPoints
)
SELECT CTE.TimeStampUTC as "Time when the value changes", CTE.id, *
FROM CTE
INNER JOIN CTE as CTE2
ON CTE.id = CTE2.id + 1
AND CTE.Value != CTE2.Value
Here's a working fiddle: http://sqlfiddle.com/#!6/a0ddc/3
If I got it correct, you are looking for something like this:
with cte as (
select * from (values (1,0),(2,0),(3,1),(4,1),(5,0),(6,1),(7,0),(8,0),(9,1)) t(a,b)
)
select
min(a), b
from (
select
a, b, sum(c) over (order by a rows unbounded preceding) grp
from (
select
*, iif(b = lag(b) over (order by a), 0, 1) c
from
cte
) t
) t
group by b, grp
Related
I would like to ask you a question. Generally, we know that LAG() function can be expressed us LAG(column expression, Offset, [defaul value]) OVER (ORDER BY ...). The offset can't be a subquery in DB2. Does anyone know a function in db2 that has the same functionality with LAG() and it can take as an offset argument a subquery or a calculated field that is calculated in the previous step in a subquery from the source?
So the case is as follows, I want to use lag function but the offset argument LAG(column expression, offset,[default value]) OVER(ORDER BY . . .)
is not a constant value.
For example I have:
LAG(CAST(DATE_utc AS VARCHAR),
offset,
0)
OVER (PARTITION BY ID,
ID_2 ORDER BY DATE_utc) AS PREV
FROM
(
SELECT , CAST(count(fid)
OVER (PARTITION BY f|| t, ID,
ID_2) AS INTEGER) AS offset
FROM SCHEMA.TABLE_1
WHERE ID = 65020475
)
AS inner_dataset
As you can see the [offset] is calculated from the TABLE_1 and is used as a parameter on the LAG function on the outer level.
However, this results in the following error:
SQL Error [42815]: The statement was not processed because the data type, length or value of the argument for the parameter
in position "2" of routine "LAG" is incorrect. Is there any alternative to this?
Thank you in advance!
You can use row_number() and a self join as in:
with tt (a, rn) as (
select a, row_number() over (order by a) as rn from t
)
select t1.a, t2.a as lag_a, t1.rn, t2.rn
from tt t1
left join tt t2
on t2.rn = t1.rn - (select max(x) from p);
Modified Impalers Fiddle
I found some answers to ways to update using over order by, but not anything that solved my issue. In SQL Server 2014, I have a column of DATES (with inconsistent intervals down to the millisecond) and a column of PRICE, and I would like to update the column of OFFSETPRICE with the value of PRICE from 50 rows hence (ordered by DATES). The solutions I found have the over order by in either the query or the subquery, but I think I need it in both. Or maybe I'm making it more complicated than it is.
In this simplified example, if the offset was 3 rows hence then I need to turn this:
DATES, PRICE, OFFSETPRICE
2018-01-01, 5.01, null
2018-01-03, 8.52, null
2018-02-15, 3.17, null
2018-02-24, 4.67, null
2018-03-18, 2.54, null
2018-04-09, 7.37, null
into this:
DATES, PRICE, OFFSETPRICE
2018-01-01, 5.01, 3.17
2018-01-03, 8.52, 4.67
2018-02-15, 3.17, 2.54
2018-02-24, 4.67, 7.37
2018-03-18, 2.54, null
2018-04-09, 7.37, null
This post was helpful, and so far I have this code which works as far as it goes:
select dates, price, row_number() over (order by dates asc) as row_num
from pricetable;
I haven't yet figured out how to point the update value to the future ordered row. Thanks in advance for any assistance.
LEAD is a useful window function for getting values from subsequent rows. (Also, LAG, which looks at preceding rows,) Here's a direct answer to your question:
;WITH cte AS (
SELECT dates, LEAD(price, 2) OVER (ORDER BY dates) AS offsetprice
FROM pricetable
)
UPDATE pricetable SET offsetprice = cte.offsetprice
FROM pricetable
INNER JOIN cte ON pricetable.dates = cte.dates
Since you asked about ROW_NUMBER, the following does the same thing:
;WITH cte AS (
SELECT dates, price, ROW_NUMBER() OVER (ORDER BY dates ASC) AS row_num
FROM pricetable
),
cte2 AS (
SELECT dates, price, (SELECT price FROM cte AS sq_cte WHERE row_num = cte.row_num + 2) AS offsetprice
FROM cte
)
UPDATE pricetable SET offsetprice = cte2.offsetprice
FROM pricetable
INNER JOIN cte2 ON pricetable.dates = cte2.dates
So, you could use ROW_NUMBER to sort the rows and then use that result to select a value 2 rows ahead. LEAD just does that very thing directly.
I have table with measurement with column SERIAL_NBR, DATE_TIME, VALUE.
There is a lot of data so when I need them to get the last 48 hours for 2000 devices
Select * from MY_TABLE where [TIME]> = DATEADD (hh, -48, #TimeNow)
takes a very long time.
Is there a way not to receive all the rows for each device, but only the latest entry? Would this speed up the query execution time?
Assuming that there is column named deviceId(change as per your needs), you can use top 1 with ties with window function row_number:
Select top 1 with ties *
from MY_TABLE
where [TIME]> = DATEADD (hh, -48, #TimeNow)
Order by row_number() over (
partition by deviceId
order by Time desc
);
You can simply create Common Table Expression that sorts and groups the entries and then pick the latest one from there.
;WITH numbered
AS ( SELECT [SERIAL_NBR], [TIME], [VALUE], row_nr = ROW_NUMBER() OVER (PARTITION BY [SERIAL_NBR] ORDER BY [TIME] DESC)
FROM MY_TABLE
WHERE [TIME]> = DATEADD (hh, -48, #TimeNow) )
SELECT [SERIAL_NBR], [TIME], [VALUE]
FROM numbered
WHERE row_nr = 1 -- we want the latest record only
Depending on the amount of data and the indexes available this might or might not be faster than Anthony Hancock's answer.
Similar to his answer you might also try the following:
(from MSSQL's point of view, the below query and Anthony's query are pretty much identical and they'll probably end up with the same query plan)
SELECT [SERIAL_NBR] , [TIME], [VALUE]
FROM MY_TABLE AS M
JOIN (SELECT [SERIAL_NBR] , max_time = MAX([TIME])
FROM MY_TABLE
GROUP BY [SERIAL_NBR]) AS L -- latest
ON L.[SERIAL_NBR] = M.[SERIAL_NBR]
AND L.max_time = M.[TIME]
WHERE M.DATE_TIME >= DATEADD(hh,-48,#TimeNow)
Your listed column values and your code don't quite match up so you'll probably have to change this code a little, but it sounds like for each SERIAL_NBR you want the record with the highest DATE_TIME in the last 48 hours. This should achieve that result for you.
SELECT SERIAL_NBR,DATE_TIME,VALUE
FROM MY_TABLE AS M
WHERE M.DATE_TIME >= DATEADD(hh,-48,#TimeNow)
AND M.DATE_TIME = (SELECT MAX(_M.DATE_TIME) FROM MY_TABLE AS _M WHERE M.SERIAL_NBR = _M.SERIAL_NBR)
This will get you details of the latest record per serial number:
Select t.SERIAL_NBR, q.FieldsYouWant
from MY_TABLE t
outer apply
(
selct top 1 t2.FieldsYouWant
from MY_TABLE t2
where t2.SERIAL_NBR = t.SERIAL_NBR
order by t2.[TIME] desc
)q
where t.[TIME]> = DATEADD (hh, -48, #TimeNow)
Also, worth sticking DATEADD (hh, -48, #TimeNow) into a variable rather than calculating inline.
I need to calculate a decaying average (cumulative moving?) of a set of values. The last value in the series is 50% weight, with the decayed average of all the prior series as the other 50% weight, recursively.
I came up with a CTE query that produces correct results, but it depends on a sequential row number. I'm wondering if there is a better way to do this in SQL 2012, maybe with the new windowing functions for Over(), or something like that?
In the live data, the rows are ordered by time. I can use an SQL view and ROW_NUMBER() to generate the necessary Row field for my CTE approach, but if there is a more efficient way to do this, I would like to keep this as efficient as possible.
I have a sample table with 2 columns: Row int, and Value Float. I have 6 sample data values of 1,2,3,4,4,4. The correct result should be 3.78125.
My solution is:
;WITH items AS (
SELECT TOP 1
Row, Value, Value AS Decayed
FROM Sample Order By Row
UNION ALL
SELECT v.Row, v.Value, Decayed * .5 + v.Value *.5 AS Decayed
FROM Sample v
INNER JOIN items itms ON itms.Row = v.Row-1
)
SELECT top 1 Decayed FROM items order by Row desc
This correctly produces 3.78125 with the test data. My question is: Is there a more efficient and/or simpler way to do this in SQL 2012, or is this about the only way to do it? Thanks.
One possible alternative would be
WITH T AS
(
SELECT
Value * POWER(5E-1, ROW_NUMBER()
OVER (ORDER BY Row DESC)
/* first row decays less so special cased */
-IIF(LEAD(Value) OVER (ORDER BY Row DESC) IS NULL,1,0))
as x
FROM Sample
)
SELECT SUM(x)
FROM T
SQL Fiddle
Or for the updated question using 60%/40%
WITH T AS
(
SELECT IIF(LEAD(Value) OVER (ORDER BY Row DESC) IS NULL, 1,0.6)
* Value
* POWER(4E-1, ROW_NUMBER() OVER (ORDER BY Row DESC) -1)
as x
FROM Sample
)
SELECT SUM(x)
FROM T
SQL Fiddle
both of the above perform a single pass through the data and can potentially use an index on Row INCLUDE(Value) to avoid a sort.
I've got a table of stock market moving average values, and I'm trying to compare two values within a day, and then compare that value to the same calculation of the prior day. My sql as it stands is below... when I comment out the last select statement that defines the result set, and run the last cte shown as the result set, I get my data back in about 15 minutes. Long, but manageable since it'll run as an insert sproc overnight. When I run it as shown, I'm at 40 minutes before any results even start to come in. Any ideas? It goes from somewhat slow, to blowing up, probably with the addition of ROW_NUMBER() OVER (PARTITION BY) BTW I'm still working through the logic, which is currently impossible with this performance issue. Thanks in advance..
Edit: I fixed my partition as suggested below.
with initialSmas as
(
select TradeDate, Symbol, Period, Value
from tblDailySMA
),
smaComparisonsByPer as
(
select i.TradeDate, i.Symbol, i.Period FastPer, i.Value FastVal,
i2.Period SlowPer, i2.Value SlowVal, (i.Value-i2.Value) FastMinusSlow
from initialSmas i join initialSmas as i2 on i.Symbol = i2.Symbol
and i.TradeDate = i2.TradeDate and i2.Period > i.Period
),
smaComparisonsByPerPartitioned as
(
select ROW_NUMBER() OVER (PARTITION BY sma.Symbol, sma.FastPer, sma.SlowPer
ORDER BY sma.TradeDate) as RowNum, sma.TradeDate, sma.Symbol, sma.FastPer,
sma.FastVal, sma.SlowPer, sma.SlowVal, sma.FastMinusSlow
from smaComparisonsByPer sma
)
select scp.TradeDate as LatestDate, scp.FastPer, scp.FastVal, scp.SlowPer, scp.SlowVal,
scp.FastMinusSlow, scp2.TradeDate as LatestDate, scp2.FastPer, scp2.FastVal, scp2.SlowPer,
scp2.SlowVal, scp2.FastMinusSlow, (scp.FastMinusSlow * scp2.FastMinusSlow) as Comparison
from smaComparisonsByPerPartitioned scp join smaComparisonsByPerPartitioned scp2
on scp.Symbol = scp2.Symbol and scp.RowNum = (scp2.RowNum - 1)
1) You have some fields both in the Partition By and the Order By clauses. That doesn't make sense since you will have one and only one value for each (sma.FastPer, sma.SlowPer). You can safely remove these fields from the Order By part of the window function.
2) Assuming that you already have indexes for adequate performance in "initialSmas i join initialSmas" and that you already have and index for (initialSmas.Symbol, initialSmas.Period, initialSmas.TradeDate) the best you can do is to copy smaComparisonsByPer into a temporary table where you can create an index on (sma.Symbol, sma.FastPer, sma.SlowPer, sma.TradeDate)