SQL Server, first of each time series - sql-server

A table 'readings' has a list of dates
[Date] [Value]
2015-03-19 00:30:00 1.2
2015-03-19 00:40:00 1.2
2015-03-19 00:50:00 0.1
2015-03-19 01:00:00 0.1
2015-03-19 01:10:00 2
2015-03-19 01:20:00 0.5
2015-03-19 01:30:00 0.5
I need to get the most recent instance where the value is below a set point (in this case the value 1.0), but I only want the start (earliest datetime) where the value was below 1 for consecutive times.
So with the above data I want to return 2015-03-19 01:20:00, as the most recent block of times where value < 1, but I want the start of that block.
This SQL just returns the most recent date, rather than the first date whilst the value has been low (so returns 2015-03-19 01:30:00 )
select top 1 *
from readings where value <=1
order by [date] desc
I can't work out how to group the consecutive dates, to therefore only get the first ones
It is SQL Server, the real data isn't at exactly ten min intervals, and the readings table is about 70,000 rows- so fairly large!
Thanks, Charli

SELECT * FROM (
SELECT [Date]
,Value
,ROW_NUMBER() OVER (PARTITION BY cast([Date] AS DATE) ORDER BY [Date] ASC) AS RN FROM #table WHERE value <= 1
) t WHERE t.RN = 1

Select Max( [date] )
From [dbo].[readings]
Where ( [value] <= 1 )

You seem to want the minimum date for each set of consecutive records having a value that is less than 1. The query below returns exactly these dates:
SELECT MIN([Date])
FROM (
SELECT [Date], [Value],
ROW_NUMBER() OVER (ORDER BY [Date]) -
COUNT(CASE WHEN [Value] < 1 THEN 1 END) OVER (ORDER BY [Date]) AS grp
FROM mytable) AS t
WHERE Value < 1
GROUP BY grp
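The calculated field grp identifies runs of consecutive records having Value < 1: within such a run the row number and the running count of low values increase together, so their difference stays constant.
Note: The above query works on SQL Server 2012 and later, since it relies on an aggregate window function with ORDER BY.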
Edit:
To get the date value of the last group you can modify the above query to:
SELECT TOP 1 MIN([Date])
FROM (
SELECT [Date], [Value],
ROW_NUMBER() OVER (ORDER BY [Date]) -
COUNT(CASE WHEN [Value] < 1 THEN 1 END) OVER (ORDER BY [Date]) AS grp
FROM mytable) AS t
WHERE Value < 1
GROUP BY grp
ORDER BY grp DESC
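To see what grp does row by row, here is a minimal Python sketch of the same idea. It is not T-SQL and not part of the answer above; the function name latest_low_run_start and the in-memory list are assumptions made for illustration, and the rows are the sample readings from the question.

```python
# Sample rows from the question: (date, value). ISO-formatted strings sort chronologically.
readings = [
    ("2015-03-19 00:30:00", 1.2),
    ("2015-03-19 00:40:00", 1.2),
    ("2015-03-19 00:50:00", 0.1),
    ("2015-03-19 01:00:00", 0.1),
    ("2015-03-19 01:10:00", 2.0),
    ("2015-03-19 01:20:00", 0.5),
    ("2015-03-19 01:30:00", 0.5),
]


def latest_low_run_start(rows, threshold=1.0):
    """Return the first date of the most recent run of values below threshold."""
    starts = {}      # grp -> earliest date in that run (MIN([Date]) ... GROUP BY grp)
    low_count = 0    # running COUNT(CASE WHEN Value < threshold THEN 1 END)
    for row_number, (date, value) in enumerate(sorted(rows), start=1):
        if value < threshold:
            low_count += 1
            grp = row_number - low_count   # stays constant inside one run of low values
            starts.setdefault(grp, date)   # rows arrive in date order, so first seen = MIN
    if not starts:
        return None
    return starts[max(starts)]             # highest grp = latest run (ORDER BY grp DESC)


print(latest_low_run_start(readings))  # 2015-03-19 01:20:00
```

Every row at or above the threshold increases the row number without increasing the running count, so each new run of low values gets a larger grp than the one before it.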

Related

Reporting Previous Records on SQL Server

I’m struggling a bit here. The data is fabricated, but the query concept is very real.
I need to select the Customer, Current Amount, Previous Amount, Sequence and Date
WHERE DATE < 1190105
AND the DATE/SEQ is the maximum date/seq prior to that date point grouping by customer.
I’ve spent quite a few days trying all sorts of things using HAVING, nested select to try and obtain the max-date/amount and min-date/amount by customer and can’t quite get my head around it. I’m sure it should be quite easy, but any help you can offer would be really appreciated.
Thanks
SEQ DATE CUSTOMER AMOUNT
1 1181225 Bob 400
2 1181226 Fred 300
3 1190101 Bob 100
4 1190104 Fred 500
5 1190104 George 200
6 1190105 Bob 150
7 1190106 Bob 200
8 1190110 Fred 160
9 1190110 Bob 300
10 1190112 Fred 400
Opt 1 use row number and lag functions
SELECT
ROW_NUMBER() OVER (Partition By Customer Order By [Date]) as Sec,
[Date],
Customer,
Amount as CurrentAmount,
LAG(Amount) OVER (Partition By Customer Order By [Date]) as PreviousAmount
FROM
YourTable
WHERE
[DATE] < 1190105
Opt 2 use outer apply
SELECT
ROW_NUMBER() OVER (Partition By Customer Order By [Date]) as Sec,
[Date],
Customer,
Amount as CurrentAmount,
Prev.Amount as PreviousAmount
FROM
YourTable T
OUTER APPLY (
SELECT TOP 1 Amount FROM YourTable
WHERE Customer = T.Customer AND [Date] < T.[Date]
ORDER BY [DATE] DESC
) Prev
WHERE
DATE < 1190105
Opt 3 use a correlated subquery
SELECT
ROW_NUMBER() OVER (Partition By Customer Order By [Date]) as Sec,
[Date],
Customer,
Amount as CurrentAmount,
(
SELECT TOP 1 Amount FROM YourTable
WHERE Customer = T.Customer AND [Date] < T.[Date]
ORDER BY [DATE] DESC
) as PreviousAmount
FROM YourTable T
WHERE
DATE < 1190105
First restrict the rows with the date filter, then search for the max by customer.
Using GROUP BY:
DECLARE @FilterDate INT = 1190105
;WITH MaxDateByCustomer AS
(
SELECT
T.CUSTOMER,
MaxSEQ = MAX(T.SEQ)
FROM
YourTable AS T
WHERE
T.Date < @FilterDate
GROUP BY
T.CUSTOMER
)
SELECT
T.*
FROM
YourTable AS T
INNER JOIN MaxDateByCustomer AS M ON
T.CUSTOMER = M.CUSTOMER AND
T.SEQ = M.MaxSEQ
Using ROW_NUMBER window function:
DECLARE @FilterDate INT = 1190105
;WITH DateRankingByCustomer AS
(
SELECT
T.*,
DateRanking = ROW_NUMBER() OVER (PARTITION BY T.CUSTOMER ORDER BY T.SEQ DESC)
FROM
YourTable AS T
WHERE
T.Date < @FilterDate
)
SELECT
D.*
FROM
DateRankingByCustomer AS D
WHERE
D.DateRanking = 1
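As a rough cross-check of what these queries should return for the sample data, here is a small Python sketch (not T-SQL). It filters on the date first, keeps the highest SEQ per customer, and also carries the previous amount along, the way LAG does in Opt 1. The function name latest_before and the tuple layout are assumptions made for illustration.

```python
# Rows from the question: (SEQ, DATE, CUSTOMER, AMOUNT)
rows = [
    (1, 1181225, "Bob", 400),
    (2, 1181226, "Fred", 300),
    (3, 1190101, "Bob", 100),
    (4, 1190104, "Fred", 500),
    (5, 1190104, "George", 200),
    (6, 1190105, "Bob", 150),
    (7, 1190106, "Bob", 200),
    (8, 1190110, "Fred", 160),
    (9, 1190110, "Bob", 300),
    (10, 1190112, "Fred", 400),
]


def latest_before(rows, filter_date):
    """Map customer -> (seq, date, current_amount, previous_amount)."""
    result = {}
    for seq, date, customer, amount in sorted(rows):   # ascending SEQ
        if date >= filter_date:
            continue                                   # WHERE Date < @FilterDate
        # The amount stored so far for this customer becomes the previous amount.
        previous = result[customer][2] if customer in result else None
        result[customer] = (seq, date, amount, previous)
    return result


for customer, row in sorted(latest_before(rows, 1190105).items()):
    print(customer, row)
# Bob (3, 1190101, 100, 400)
# Fred (4, 1190104, 500, 300)
# George (5, 1190104, 200, None)
```

George has only one row before the cut-off, so his previous amount is None, which corresponds to the NULL that LAG returns for the first row of a partition.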

How to obtain Additions and Deductions from table

I have this table where I store sale orders. A sale order is not final when it is first entered and may be edited later. Each time items are added and the order is saved again, a new row is written with a transaction number higher than the previous one, to keep track of the changes. In the sample data below, a sale order was entered, items were added twice (changing the amount), and in the last row items were cancelled (changing the amount again).
I want to calculate the total amount added each time new items were added to the sale order, and also the total worth of the items that were cancelled.
CREATE TABLE SaleOrder
(
TransactionNo Int,
SaleOrderDate DATE,
Code VARCHAR(25),
Quantity INT,
TotalAmount Numeric(18,2),
Remarks VARCHAR(25)
)
INSERT INTO SaleOrder VALUES (NULL, '2018-10-01', 'SO-001-OCT-18', 6, '2500', 'Hello');
INSERT INTO SaleOrder VALUES (1, '2018-10-01', 'SO-001-OCT-18', 8, '2600', 'Hello');
INSERT INTO SaleOrder VALUES (2, '2018-10-01', 'SO-001-OCT-18', 12, '3400', 'Hello');
INSERT INTO SaleOrder VALUES (3, '2018-10-01', 'SO-001-OCT-18', 9, '2900', 'Hello');
This is the result I expect.
Code SaleOrderDate Quantity InitialAmount Addition Cancellation
SO-001-OCT-18 2018-10-01 9 2500.00 900.00 500.00
I have written this query but it's not helping that much.
;WITH CTE AS (
SELECT
[TransactionNo], [Code], [SaleOrderDate], [Quantity], [TotalAmount],
CAST('Oct 1 2018 10:16AM' AS DATE) AS [DateFrom], CAST('Oct 4 2018 10:16AM' AS DATE) AS [DateTo]
FROM [SaleOrder]
GROUP BY
[TransactionNo], [Code], [SaleOrderDate], [TotalAmount], Quantity
)
SELECT
[D].[TransactionNo], [D].[Code], [D].[SaleOrderDate], [D].[Quantity], [D].TotalAmount,
--CAST('Oct 4 2018 4:06PM' AS DATE) AS [DateFrom],
--CAST('Oct 4 2018 4:06PM' AS DATE) AS [DateTo],
[D].[Balance], [D].[Balance]-ISNULL(NULLIF([D].TotalAmount, 0),0) [Opening]
FROM(
SELECT *,
SUM(TotalAmount) OVER (PARTITION BY [Code] ORDER BY [TransactionNo], [SaleOrderDate]) AS [Balance]
FROM CTE
)D
WHERE [SaleOrderDate] BETWEEN CAST('Oct 1 2018 10:16AM' AS DATE) AND CAST('Oct 4 2018 10:16AM' AS DATE)
ORDER BY [SaleOrderDate]
Use the LAG() window function to get the previous value, then compare it with the current one to determine whether the change is an addition or a cancellation.
; WITH cte as
(
SELECT *,
row_no = ROW_NUMBER() OVER (PARTITION BY Code ORDER BY TransactionNo DESC),
Addition = CASE WHEN TotalAmount > LAG(TotalAmount) OVER (PARTITION BY Code ORDER BY TransactionNo)
THEN TotalAmount - LAG(TotalAmount) OVER (PARTITION BY Code ORDER BY TransactionNo)
ELSE 0
END,
Cancellation = CASE WHEN TotalAmount < LAG(TotalAmount) OVER (PARTITION BY Code ORDER BY TransactionNo)
THEN LAG(TotalAmount) OVER (PARTITION BY Code ORDER BY TransactionNo) - TotalAmount
ELSE 0
END
FROM SaleOrder
)
SELECT Code,
SaleOrderDate,
Quantity = MAX (CASE WHEN row_no = 1 then Quantity END),
InitialAmount = MAX (CASE WHEN TransactionNo IS NULL THEN TotalAmount END),
Addition = SUM (Addition),
Cancellation = SUM (Cancellation)
FROM cte
GROUP BY Code, SaleOrderDate
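The arithmetic behind the LAG() comparison can be checked with a short Python sketch (not T-SQL). It handles a single sale order and assumes the row with a NULL TransactionNo is the initial version, as in the sample data; the function name summarize and the tuple layout are assumptions made for illustration.

```python
# (TransactionNo, Quantity, TotalAmount) for one sale order; None stands in for NULL.
order_rows = [
    (None, 6, 2500.0),
    (1, 8, 2600.0),
    (2, 12, 3400.0),
    (3, 9, 2900.0),
]


def summarize(rows):
    """Total the increases and decreases between consecutive versions of an order."""
    # SQL Server sorts NULL first in ascending order, so the NULL row comes first here too.
    ordered = sorted(rows, key=lambda r: (r[0] is not None, r[0] or 0))
    addition = cancellation = 0.0
    for previous, current in zip(ordered, ordered[1:]):   # current row paired with its LAG
        diff = current[2] - previous[2]
        if diff > 0:
            addition += diff
        else:
            cancellation += -diff
    return {
        "Quantity": ordered[-1][1],        # quantity of the latest version
        "InitialAmount": ordered[0][2],    # amount of the initial (NULL) version
        "Addition": addition,
        "Cancellation": cancellation,
    }


print(summarize(order_rows))
# {'Quantity': 9, 'InitialAmount': 2500.0, 'Addition': 900.0, 'Cancellation': 500.0}
```

The additions are 100 and 800, and the single cancellation is 500, which matches the expected result in the question.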
Are you trying to do this? :
SELECT
Code
, MAX(SaleOrderDate) SaleOrderDate
, MAX(Quantity) Quantity
, MAX(InitialAmount) InitialAmount
, SUM(Addition) Addition
, ABS(SUM(Cancellation)) Cancellation
FROM (
SELECT
Code
, CASE WHEN rn = cnt THEN SaleOrderDate END SaleOrderDate
, CASE WHEN rn = cnt THEN Quantity END Quantity
, InitialAmount
, CASE WHEN Diff > 0 THEN Diff ELSE 0 END Addition
, CASE WHEN Diff < 0 THEN Diff ELSE 0 END Cancellation
FROM (
SELECT *
, CASE WHEN TransactionNo IS NULL THEN TotalAmount END InitialAmount
, LEAD(TotalAmount) OVER(PARTITION BY Code ORDER BY TransactionNo) nxtPrice
, LEAD(TotalAmount) OVER(PARTITION BY Code ORDER BY TransactionNo) - TotalAmount Diff
, COUNT(*) OVER(PARTITION BY Code) cnt
, ROW_NUMBER() OVER(PARTITION BY Code ORDER BY TransactionNo) rn
FROM SaleOrder
) D
) C
GROUP BY
Code

SQL SMS 2008 -Count column ids and count duplicate ids if createddate is greater than 3 months between ids

*Edit (Hopefully to be more clear)
Table below, I would like to count ids and count duplicate ids where the createddate has a gap of 3 months or more for that ID.
Query I have so far...
if object_id('tempdb..#temp') is not null
begin drop table #temp end
select
top 100
a.id, a.CreatedDate
into #temp
from tbl a
where 1=1
--and year(CreatedDate) = '2015'
if object_id('tempdb..#temp2') is not null
begin drop table #temp2 end
select t.id, count(t.id) as Total_Cnt
into #temp2
from #temp t
group by id
select distinct #temp2.Total_Cnt, #temp2.id, #temp.CreatedDate, DENSE_RANK() over (partition by #temp.id order by createddate) RK
from #temp2
inner join #temp on #temp2.id = #temp.id
where 1=1
order by Total_Cnt desc
Results:
Total_cnt id createddate rk
3 1 01-01-2015 1
3 1 03-02-2015 2
3 1 01-02-2015 3
2 2 05-01-2015 1
2 2 05-02-2015 2
1 3 06-01-2015 1
1 4 07-01-2015 1
Count ids and only count duplicate ids when the createddate from the id is greater than 3 months.
Something like this...
Total_cnt id Countwith3monthgap
3 1 2
2 2 1
1 3 1
1 4 1
You can use a cte and ROW_NUMBER to get your order and self join the cte based on the order..
WITH cte AS
( SELECT
*,
ROW_NUMBER() OVER (PARTITION BY ID ORDER BY CreatedDate) Rn
FROM
Test
)
SELECT
c1.ID,
COUNT(CASE WHEN c2.CreatedDate IS NULL THEN 1
WHEN c1.CreatedDate >= DATEADD(month,3,c2.CreatedDate) THEN 1
END)
FROM
cte c1
LEFT JOIN cte c2 ON c1.ID = c2.ID
AND c1.RN = c2.RN + 1
GROUP BY
c1.ID
You also need to use a conditional count where the Previous CreatedDate is null or if the Current CreatedDate is >= the Previous CreatedDate + 3 months
If you happen to be using SQL 2012+ you can also use LAG here to get the same result
SELECT
ID,
COUNT(*)
FROM
(SELECT
ID,
CreatedDate CurrentDate,
LAG(CreatedDate) OVER (PARTITION BY ID ORDER BY CreatedDate) PreviousDate
FROM
Test
) T
WHERE
PreviousDate IS NULL
OR CurrentDate >= DATEADD(month, 3, PreviousDate)
GROUP BY
ID
You can use a lag to get the previous date, Null for the first in the list
SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t
You can use that as a subquery and get the difference in months using DATEDIFF
SELECT sub.id,DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate)
FROM (SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t) sub
WHERE DATEDiff(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL
You can then take your totals
SELECT sub.id,COUNT(sub.id) as cnt
FROM (SELECT
id,
lag(CreatedDate,1) OVER (PARTITION BY Id ORDER BY CreatedDate) AS PreviousCreateDate,
CreatedDate
FROM #t) sub
WHERE DATEDIFF(month, sub.PreviousCreateDate ,sub.CreatedDate) >=3
OR sub.PreviousCreateDate IS NULL
GROUP BY sub.id
Note that DATEDIFF counts the month boundaries crossed rather than elapsed time, so by its reckoning the last day of January is three months before the first day of April. That may be the logic you were after.
If not, you might want to define your three-month gap criterion as
WHERE sub.PreviousCreateDate <= DATEADD(month, -3, sub.CreatedDate)
OR sub.PreviousCreateDate IS NULL
or
WHERE sub.CreatedDate >= DATEADD(month, +3, sub.PreviousCreateDate )
OR sub.PreviousCreateDate IS NULL
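The difference between the two gap definitions is easiest to see on a concrete pair of dates. The Python sketch below mimics the two SQL Server functions for the month datepart only; the helper names datediff_month and dateadd_month are assumptions made for illustration, not real library calls.

```python
import calendar
from datetime import date


def datediff_month(start, end):
    """Mimic DATEDIFF(month, start, end): count the month boundaries crossed."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def dateadd_month(d, months):
    """Mimic DATEADD(month, months, d): clamp the day to the end of the target month."""
    index = d.year * 12 + (d.month - 1) + months
    year, month = divmod(index, 12)
    month += 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(d.day, last_day))


previous, current = date(2015, 1, 31), date(2015, 4, 1)

# DATEDIFF-style test: three month boundaries are crossed, so the gap qualifies.
print(datediff_month(previous, current) >= 3)    # True

# DATEADD-style test: Jan 31 plus three months is Apr 30, so Apr 1 does not qualify.
print(current >= dateadd_month(previous, 3))     # False
```

Which of the two is right depends on whether a "three month gap" means three calendar months have been entered or three full months have elapsed.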
I'm guessing that your desired definition of three-month gap doesn't coincide with datediff()'s. Most of the logic here is to look back at the previous date and decide if the gap is big enough to qualify.
When datediff() counts three months difference we still need to make sure the day of month is later than the first one (per example and ID 5). If difference is more than three months then we're good automatically.
But I'm also assuming that you would want to treat the distance from November 30th to February 28th (or 29th in a leap year) as a full three months because the end date falls on the final day of the month. By adjusting the end date by an extra day this is an easy scenario to snag as it will bump the date into the following month and increase the month difference by one as well. If that's not what you want then just remove the dateadd(day, 1, ...) portion and use only the raw CreatedDate value.
Your sample data is limited, so I'm also making the assumption that the gaps are measured between consecutive dates. If you want to find blocks of runs that don't span more than three months across the set, then that's a different problem and you should clarify with more information.
Since you've indicated that you're probably on SQL Server 2008 you'll have to do without the lag() function. Although the first query could be adjusted for that it's likely easier to go with the second approach at the end.
with diffs as (
select
ID,
row_number() over (partition by ID order by CreatedDate) as RN,
case when
datediff(
month,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
CreatedDate
) = 3
and
datepart(
day,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate)
) <= datepart(day, CreatedDate)
or
datediff(
month,
lag(CreatedDate, 1) over (partition by ID order by CreatedDate),
/* adding one day to handle gaps like Nov30 - Feb28/29 and Jan31 - Apr30 */
dateadd(day, 1, CreatedDate)
) >= 4
then 1
else 0
end as GapFlag
from <T> /* <--- your table name here */
), gaps as (
select
ID, RN,
sum(1 + GapFlag) over (partition by ID order by RN) as Counter
from diffs
)
select ID, count(distinct Counter - RN) as "Count"
from gaps
group by ID
The rest of the logic is a typical gaps and islands scenario looking for holes in the sum(1 + GapFlag) sequence, with the offset of 1 acting pretty much like row_number().
http://sqlfiddle.com/#!6/61b12/3
JamieD77's approach is also valid. I was originally thinking your problem involved more than looking at the rows in sequence. Here's how I would tweak it for the gap definition I've been running with:
with data as (
select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
from T
)
select d1.ID, count(*) as "Count"
from data d1 left outer join data d0
on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
where
datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
and datepart(day, d0.CreatedDate) <= datepart(day, d1.CreatedDate)
or datediff(month, d0.CreatedDate, dateadd(day, 1, d1.CreatedDate)) >= 4
or d0.ID is null
group by d1.ID
Edit: You have changed the question since yesterday.
Change this line in the first query to include the total count:
...
select count(*) as TotalCnt, ID, count(distinct Counter - RN) as GapCount
...
Second would look like:
with data as (
select ID, CreatedDate, row_number() over (partition by ID order by CreatedDate) as RN
from T
)
select
count(*) as TotalCnt, d1.ID,
count(case when
datediff(month, d0.CreatedDate, d1.CreatedDate) = 3
and datepart(day, d0.CreatedDate) <= datepart(day, d1.CreatedDate)
or datediff(month, d0.CreatedDate, dateadd(day, 1, d1.CreatedDate)) >= 4
or d0.ID is null then 1 end
) as GapCount
from data d1 left outer join data d0
on d0.ID = d1.ID and d0.RN = d1.RN - 1 /* connect to the one before */
group by d1.ID

How to get count of consecutive dates

For example there is some table with dates:
2015-01-01
2015-01-02
2015-01-03
2015-01-06
2015-01-07
2015-01-11
I have to write ms sql query, which will return count of consecutive dates starting from every date in the table. So the result will be like:
2015-01-01 1
2015-01-02 2
2015-01-03 3
2015-01-06 1
2015-01-07 2
2015-01-11 1
It seems to me that I should use the LAG and LEAD functions, but I can't even work out how to approach it.
CREATE TABLE #T ( MyDate DATE) ;
INSERT #T VALUES ('2015-01-01'),('2015-01-02'),('2015-01-03'),('2015-01-06'),('2015-01-07'),('2015-01-11')
SELECT
RW=ROW_NUMBER() OVER( PARTITION BY GRP ORDER BY MyDate) ,MyDate
FROM
(
SELECT
MyDate, DATEDIFF(Day, '1900-01-01' , MyDate)- ROW_NUMBER() OVER( ORDER BY MyDate ) AS GRP
FROM #T
) A
DROP TABLE #T;
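The query above relies on the fact that, within a run of consecutive dates, the day number and the row number both grow by one per row, so their difference is the same for the whole run. A minimal Python sketch of that trick follows (not T-SQL); the function name consecutive_counts is an assumption made for illustration, and the dates are assumed to be distinct.

```python
from datetime import date

dates = [
    date(2015, 1, 1), date(2015, 1, 2), date(2015, 1, 3),
    date(2015, 1, 6), date(2015, 1, 7), date(2015, 1, 11),
]


def consecutive_counts(dates):
    """Number each date by its position inside its run of consecutive days."""
    epoch = date(1900, 1, 1)
    seen_in_group = {}   # grp -> rows seen so far (ROW_NUMBER() ... PARTITION BY GRP)
    result = []
    for row_number, d in enumerate(sorted(dates), start=1):
        grp = (d - epoch).days - row_number   # DATEDIFF(Day, '1900-01-01', MyDate) - ROW_NUMBER()
        seen_in_group[grp] = seen_in_group.get(grp, 0) + 1
        result.append((d.isoformat(), seen_in_group[grp]))
    return result


for row in consecutive_counts(dates):
    print(*row)
# 2015-01-01 1
# 2015-01-02 2
# 2015-01-03 3
# 2015-01-06 1
# 2015-01-07 2
# 2015-01-11 1
```

A gap in the dates makes the day number jump by more than one while the row number still advances by one, which moves the following rows into a new group.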
You can use this CTE:
;WITH CTE AS (
SELECT [Date],
ROW_NUMBER() OVER(ORDER BY [Date]) AS rn,
CASE WHEN DATEDIFF(Day, PrevDate, [Date]) IS NULL THEN 0
WHEN DATEDIFF(Day, PrevDate, [Date]) > 1 THEN 0
ELSE 1
END AS flag
FROM (
SELECT [Date], LAG([Date]) OVER (ORDER BY [Date]) AS PrevDate
FROM #Dates ) d
)
to produce the following result:
Date rn flag
===================
2015-01-01 1 0
2015-01-02 2 1
2015-01-03 3 1
2015-01-06 4 0
2015-01-07 5 1
2015-01-11 6 0
All you have to do now is to calculate a running total of flag up to the first occurrence of a preceding zero value:
;WITH CTE AS (
... cte statements here ...
)
SELECT [Date], b.cnt + 1
FROM CTE AS c
OUTER APPLY (
SELECT TOP 1 COALESCE(rn, 1) AS rn
FROM CTE
WHERE flag = 0 AND rn < c.rn
ORDER BY rn DESC
) a
CROSS APPLY (
SELECT COUNT(*) AS cnt
FROM CTE
WHERE c.flag <> 0 AND rn < c.rn AND rn >= a.rn
) b
OUTER APPLY calculates the rn value of the first zero-valued flag that comes before the current row. CROSS APPLY calculates the number of records preceding the current record up to the first occurrence of a preceding zero valued flag.
I'm assuming this table:
SELECT *
INTO #Dates
FROM (VALUES
(CAST('2015-01-01' AS DATE)),
(CAST('2015-01-02' AS DATE)),
(CAST('2015-01-03' AS DATE)),
(CAST('2015-01-06' AS DATE)),
(CAST('2015-01-07' AS DATE)),
(CAST('2015-01-11' AS DATE))) dates(d);
Here's a recursive solution with explanations:
WITH
dates AS (
SELECT
d,
-- This checks if the current row is the start of a new group by using LAG()
-- to see if the previous date is adjacent
CASE datediff(day, d, LAG(d, 1) OVER(ORDER BY d))
WHEN -1 THEN 0
ELSE 1 END new_group,
-- This will be used for recursion
row_number() OVER(ORDER BY d) rn
FROM #Dates
),
-- Here, the recursion happens
groups AS (
-- We initiate recursion with rows that start new groups, and calculate "GRP"
-- numbers
SELECT d, new_group, rn, row_number() OVER(ORDER BY d) grp
FROM dates
WHERE new_group = 1
UNION ALL
-- We then recurse by the previously calculated "RN" until we hit the next group
SELECT dates.d, dates.new_group, dates.rn, groups.grp
FROM dates JOIN groups ON dates.rn = groups.rn + 1
WHERE dates.new_group != 1
)
-- Finally, we enumerate rows within each group
SELECT d, row_number() OVER (PARTITION BY grp ORDER BY d)
FROM groups
ORDER BY d

Get latest record for each day for last n days using MS Sql Server

CurrencyId LeftCurrencyId RightCurrencyId ExchangeRateAt ExchangeRate
1 1 5 2013-06-27 00:51:00.000 39.0123
2 3 5 2013-06-26 01:54:00.000 40.0120
3 1 5 2013-06-26 00:51:00.000 49.0143
4 3 5 2013-06-25 14:51:00.000 33.3123
5 3 5 2013-06-25 06:51:00.000 32.0163
6 1 5 2013-06-25 00:08:00.000 37.0123
I need latest record for each day for last n days based on combination of leftcurrencyid and rightcurrencyid.
Here's one option:
with TopPerDay as
(
select *
, DayRank = row_number() over (partition by LeftCurrencyId, RightCurrencyId, cast(ExchangeRateAt as date)
order by ExchangeRateAt desc)
from ExchangeRate
)
select CurrencyId,
LeftCurrencyId,
RightCurrencyId ,
ExchangeRateDay = cast(ExchangeRateAt as date),
ExchangeRateAt ,
ExchangeRate
from TopPerDay
where DayRank = 1
order by LeftCurrencyId,
RightCurrencyId,
ExchangeRateDay
It groups by LeftCurrencyId, RightCurrencyId, and ExchangeRateAt day without the time component, then takes the latest record in the day for all those groups.
You don't mention whether the N days should be counted back from the present day or from some other date, but you can add this with a WHERE clause when selecting from the ExchangeRate table in the CTE definition.
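For reference, here is what "latest record per currency pair per day" yields for the sample rows, as a small Python sketch (not T-SQL). The function name latest_per_day and the tuple layout are assumptions made for illustration, and the last-N-days filter is left out.

```python
from datetime import datetime

# (CurrencyId, LeftCurrencyId, RightCurrencyId, ExchangeRateAt, ExchangeRate)
rates = [
    (1, 1, 5, datetime(2013, 6, 27, 0, 51), 39.0123),
    (2, 3, 5, datetime(2013, 6, 26, 1, 54), 40.0120),
    (3, 1, 5, datetime(2013, 6, 26, 0, 51), 49.0143),
    (4, 3, 5, datetime(2013, 6, 25, 14, 51), 33.3123),
    (5, 3, 5, datetime(2013, 6, 25, 6, 51), 32.0163),
    (6, 1, 5, datetime(2013, 6, 25, 0, 8), 37.0123),
]


def latest_per_day(rows):
    """Keep the latest row for each (left currency, right currency, calendar day)."""
    best = {}
    for row in rows:
        _, left, right, at, _ = row
        key = (left, right, at.date())            # the PARTITION BY columns
        if key not in best or at > best[key][3]:  # ORDER BY ExchangeRateAt DESC, rank 1
            best[key] = row
    # Same ordering as the final ORDER BY: pair first, then day.
    return sorted(best.values(), key=lambda r: (r[1], r[2], r[3]))


print([row[0] for row in latest_per_day(rates)])  # [6, 3, 1, 4, 2]
```

Only CurrencyId 5 is dropped, because CurrencyId 4 is a later rate for the same pair on the same day.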
Here are my two cents
Select ExchangeRateAt , * from Table1 where ExchangeRateAt in (Select max(ExchangeRateAt) from Table1 Group by cast( ExchangeRateAt as Date))
Order by ExchangeRateAt
Here the 7 at the end is the last-N-days parameter (N = 7 in this example).
with T1 as
(
select t.*,
cast(floor(cast([ExchangeRateAt] as float)) as datetime) as DatePart,
ROW_NUMBER() OVER (
PARTITION BY [LeftCurrencyId],
[RightCurrencyId],
cast(floor(cast([ExchangeRateAt] as float)) as datetime)
ORDER BY [ExchangeRateAt] DESC
) RowNumber
from t
), T2 as
(
select *,
ROW_NUMBER() OVER (PARTITION BY [LeftCurrencyId],
[RightCurrencyId]
ORDER BY DatePart DESC
) as RN
from T1 where RowNumber=1
)
select [CurrencyId],
[LeftCurrencyId],
[RightCurrencyId],
[ExchangeRateAt],
[ExchangeRate],
DatePart
from T2 where RN<=7

Resources