Summing every third record grouped by three fields - sql-server

I have a database where travel times are segmented by epoch into 5-minute bins, which I need to sum up into 15-minute bins. This is tricky because the data is segmented by a TMC value, a date, and an epoch, none of which are unique. For example:
TMC DATE EPOCH Travel_TIME_ALL_VEHICLES
111N20176 7012015 64 63
111N20176 7012015 76 112
111N20176 7012015 80 114
111N20176 7012015 83 127
111N20176 7012015 91 58
111N20176 7012015 93 117
I need the first three travel times to be added together, then the next three, and then the next three, and so on.
select *, sum(Travel_TIME_ALL_VEHICLES) over (order by EPOCH rows between 3 preceding and current row) as rolling_avg
from [dbo].[FHWA_2015_weekend]
WHERE TMC = '113P12373' order by DATE, EPOCH

In MySQL (as the question was originally tagged), you can enumerate the data using variables and then do the aggregation.
For your sample data this should work:
select tmc, date, sum(Travel_TIME_ALL_VEHICLES)
from (select t.*, (@rn := @rn + 1) as rn
from t cross join
(select @rn := 0) params
order by date, epoch
) t
group by tmc, date, floor((rn - 1) / 3);
In almost any other database, you would use row_number() for this purpose:
select tmc, date, sum(Travel_TIME_ALL_VEHICLES)
from (select t.*, row_number() over (order by date, epoch) as rn
from t
) t
group by tmc, date, floor((rn - 1) / 3);
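As a minimal sketch of the row_number() approach, the snippet below runs the sample rows through SQLite via Python's sqlite3 module. SQLite is only a stand-in for SQL Server here (an assumption), and window functions need SQLite 3.25 or later. The numbering is also partitioned by tmc and date, which is an addition to the query above so that a group of three never straddles two segments or days. The table name t and the shortened column name travel_time are illustrative.

```python
import sqlite3

# Sample rows from the question: (tmc, date, epoch, travel_time)
rows = [
    ("111N20176", 7012015, 64, 63),
    ("111N20176", 7012015, 76, 112),
    ("111N20176", 7012015, 80, 114),
    ("111N20176", 7012015, 83, 127),
    ("111N20176", 7012015, 91, 58),
    ("111N20176", 7012015, 93, 117),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (tmc TEXT, date INTEGER, epoch INTEGER, travel_time INTEGER)"
)
conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?)", rows)

# Number the rows within each (tmc, date), then bucket them three at a time.
# (rn - 1) / 3 is integer division in SQLite, so rows 1-3 -> 0, rows 4-6 -> 1.
query = """
SELECT tmc, date, MIN(epoch) AS first_epoch, SUM(travel_time) AS total
FROM (
    SELECT t.*,
           ROW_NUMBER() OVER (PARTITION BY tmc, date ORDER BY epoch) AS rn
    FROM t
) AS numbered
GROUP BY tmc, date, (rn - 1) / 3
ORDER BY tmc, date, first_epoch
"""
sums = conn.execute(query).fetchall()
for row in sums:
    print(row)
```

For the six sample rows this gives two groups: 63 + 112 + 114 = 289 and 127 + 58 + 117 = 302.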

Related

T-SQL (Transact-SQL): Two separate tables. Need to find the date that is >= in Table A based on a date in Table B

I have two separate tables and need to find the first date in Table A that is on or after a date in Table B. Only TransactionCode 59 in Table A should be considered.
From the example tables below, the First_Tran_Date returned in Table B should be "01/22/2022". Table A contains over 35 million records with thousands of AccountNumbers and grows each day.
I need T-SQL that takes the Table B ChangeDate "01/21/2022" and finds the first TransactionDate in Table A on or after that date, counting only TransactionCode 59. Dates for all other TransactionCodes should not be evaluated.
Table A:
AccountNumber TransactionDate TransactionCode
xxxx310 2/3/2022 40
xxxx310 1/19/2022 40
xxxx310 1/22/2022 59
xxxx310 1/10/2022 59
xxxx310 3/15/2022 40
xxxx310 1/25/2022 59
xxxx310 1/30/2022 40
xxxx310 1/31/2022 59
xxxx310 1/31/2022 62
xxxx310 3/8/2022 59
Table B:
Account ChangeDate First_Tran_Date COUNT_OF_DAYS
xxxx310 01/21/2022 **RESULT NEEDED** (Calculated First_Tran_Date - ChangeDate = COUNT_OF_DAYS)
I have tried the following without getting a correct result:
T-SQL example...
Created a VIEW…
WITH added_row_number AS (
SELECT
*,
ROW_NUMBER() OVER(
PARTITION BY AccountNumber
ORDER BY
TransactionDate
) AS row_number
FROM dbo.LoanTransactions
)
SELECT
*
FROM added_row_number
WHERE row_number = 1
AND TransactionDate >= '2022-03-01'
AND TransactionCode IN ('59', '61', '70', '77', '82')
Used a SELECT from that VIEW …
SELECT
DISTINCT Account,
Prod_CD,
OldValue,
NewValue,
Acct_Open_DT,
ChangeDate,
LOSVIEW_All_Transactions_From_CORE1.TransactionDate AS First_Tran_Date,
LastTransactionDate,
CASE
WHEN Prod_CD IN ('L50', 'L51', 'L54', 'L77') THEN DATEDIFF(
DAY,
ChangeDate,
LOSVIEW_All_Transactions_From_CORE1.TransactionDate
)
ELSE DATEDIFF(DAY, Acct_Open_DT, ChangeDate)
END AS COUNT_OF_DAYS
FROM dbo.R_InsuranceCodeChanges
LEFT JOIN dbo.LOSVIEW_All_Transactions_From_CORE AS LOSVIEW_All_Transactions_From_CORE1
ON dbo.R_InsuranceCodeChanges.Account = LOSVIEW_All_Transactions_From_CORE1.AccountNumber
WHERE
dbo.R_InsuranceCodeChanges.ChangeDate >= '2022-01-01'
AND dbo.R_InsuranceCodeChanges.NewValue <> '0'
If I understand correctly you want to look up two dates, and also calculate the number of days between them?
Those dates should be the first and last time a row with TransactionCode 59 appears, for a given account, and only including records on or after a given date?
So for the data in your example, the missing date should be 2022-01-22? Then the number of days would be 52 days?
For that I would use OUTER APPLY, which allows you to effectively run a query once for each input row...
SELECT
*,
DATEDIFF(DAY, table_a.MinTransactionDate, table_a.MaxTransactionDate)
FROM
table_b
OUTER APPLY
(
SELECT
MIN(TransactionDate) AS MinTransactionDate,
MAX(TransactionDate) AS MaxTransactionDate
FROM
table_a
WHERE
AccountNumber = table_b.Account
AND TransactionDate >= table_b.ChangeDate
AND TransactionCode = 59
)
AS table_a
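To check the lookup against the sample data, here is a minimal sketch using SQLite through Python's sqlite3 module. SQLite has no OUTER APPLY, so a correlated subquery stands in for it (a swapped-in technique, not the query above). The snake_case table and column names are illustrative, and the dates are stored as ISO strings so that string comparison orders them correctly. It computes COUNT_OF_DAYS the way the question defines it, First_Tran_Date minus ChangeDate.

```python
import sqlite3
from datetime import date

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table_a "
    "(account_number TEXT, transaction_date TEXT, transaction_code INTEGER)"
)
conn.execute("CREATE TABLE table_b (account TEXT, change_date TEXT)")

# Table A rows from the question, with dates rewritten as ISO strings.
conn.executemany(
    "INSERT INTO table_a VALUES (?, ?, ?)",
    [
        ("xxxx310", "2022-02-03", 40),
        ("xxxx310", "2022-01-19", 40),
        ("xxxx310", "2022-01-22", 59),
        ("xxxx310", "2022-01-10", 59),
        ("xxxx310", "2022-03-15", 40),
        ("xxxx310", "2022-01-25", 59),
        ("xxxx310", "2022-01-30", 40),
        ("xxxx310", "2022-01-31", 59),
        ("xxxx310", "2022-01-31", 62),
        ("xxxx310", "2022-03-08", 59),
    ],
)
conn.execute("INSERT INTO table_b VALUES ('xxxx310', '2022-01-21')")

# For each Table B row, find the earliest code-59 transaction on or after
# the change date. The subquery runs once per outer row, like an APPLY.
query = """
SELECT b.account,
       b.change_date,
       (SELECT MIN(a.transaction_date)
          FROM table_a AS a
         WHERE a.account_number = b.account
           AND a.transaction_date >= b.change_date
           AND a.transaction_code = 59) AS first_tran_date
FROM table_b AS b
"""
account, change_date, first_tran_date = conn.execute(query).fetchone()

# COUNT_OF_DAYS as defined in the question: First_Tran_Date - ChangeDate.
count_of_days = (
    date.fromisoformat(first_tran_date) - date.fromisoformat(change_date)
).days
print(account, first_tran_date, count_of_days)
```

On the sample rows, the first qualifying transaction is 2022-01-22, one day after the change date.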
There should be an index on tableA.TransactionDate. I just translated your words to SQL, and this is what I got:
select min(TransactionDate) as minDate
from tableA
where TransactionCode = 59
and TransactionDate >= (select max(ChangeDate) from tableB)

Joining three tables with OVER, for accumulative monthly figures

I have made the amendments suggested in the comments and am posting them here because comments do not allow enough characters. I think there may be a few issues with the OVER clauses and with the JOIN. Updated query:
SELECT RIGHT('0' + CAST(day(oh_datetime) AS VARCHAR (2)), 2),
SUM(SUM((CASE oh_sot_id WHEN 1 THEN 1 WHEN 4 THEN -1 END) * oht_net)) over (ORDER BY day(oh_datetime) ROWS UNBOUNDED PRECEDING) AS 'Orders In($)',
SUM((CASE oh_cd_id WHEN 11728 THEN 1 END) * oht_net) AS 'Target($)',
SUM(SUM((CASE ih_credit WHEN 'false' THEN 1 WHEN 'true' THEN -1 END) * ih_net)) over (ORDER BY day(ih_datetime) ROWS UNBOUNDED PRECEDING) AS 'Sales($)'
FROM order_header_total
JOIN order_header ON order_header_total.oht_oh_id = order_header.oh_id
JOIN invoice_header ON order_header.oh_id = invoice_header.ih_oh_id
WHERE oh_datetime >= DATEADD(day, 1, EOMONTH(GETDATE(), -1)) AND oh_datetime < DATEADD(day, 1, EOMONTH(GETDATE()))
GROUP BY day(oh_datetime), day(ih_datetime)
UNION SELECT 'YAxis','0','0','0'
Table should display:
Date  Orders In   Target     Sales
01    6402.19     12321.57   128539.13
02    17795.94    24643.14   148258.63
03    50703.09    36964.71   171231.14
04    116034.29   49286.28   188157.69
28    353989.36   345004.00  446808.05
I have not done the full month for sake of space, but included the last day to show how it should accumulate. Instead it looks like this:
Date  Orders In   Target  Sales
01    5507.32             5507.32
01    5833.52             5833.52
01    6402.19             29377.18
02    10312.35            188157.69
02    16592.09            16023.42
02    17795.94            54950.68
03    30439.28            28666.76
03    30581.03            28808.51
04    36029.72            24700.77
04    37082.36            38767.55
23    144191.70           143992.45
Again a selection of data and then showing the last row for the sake of space. The date is not grouping, 'Orders In' and 'Sales' are not accumulating, and there is no data displaying for 'Target'.
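One thing that may explain the ungrouped dates: GROUP BY day(oh_datetime), day(ih_datetime) produces one row per distinct pair of order day and invoice day, so a single order day can appear several times. A minimal sketch of the alternative, aggregating to one row per day first and then accumulating, is below. It uses SQLite through Python's sqlite3 module rather than SQL Server, and the orders table, its day and net columns, and its figures are all hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical data: several orders can fall on the same day of the month.
conn.execute("CREATE TABLE orders (day INTEGER, net REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 100.0), (1, 50.0), (2, 25.0), (3, 10.0)],
)

# Step 1 (inner query): collapse to exactly one row per day.
# Step 2 (outer query): running total over those daily rows.
query = """
SELECT day,
       SUM(daily_net) OVER (
           ORDER BY day
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_net
FROM (
    SELECT day, SUM(net) AS daily_net
    FROM orders
    GROUP BY day
) AS per_day
ORDER BY day
"""
running = conn.execute(query).fetchall()
for row in running:
    print(row)
```

Each day appears once and the total accumulates: 150, then 175, then 185. In the original query, orders and invoices would each likely need their own per-day aggregate before being joined on the day, since joining the detail rows first can repeat order amounts once per matching invoice.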

T-SQL Count of Records in Status for Previous Months

I have a T-SQL Quotes table and need to be able to count how many quotes were in an open status during past months.
The dates I have to work with are an 'Add_Date' timestamp and an 'Update_Date' timestamp. Once a quote has a value of '1' in the 'Won' or 'Loss' column, it can no longer be updated. Therefore, the 'Update_Date' effectively becomes the Closed_Status timestamp.
Here are a few example records:
Quote_No Add_Date Update_Date Open_Quote Win Loss
001 01-01-2016 NULL 1 0 0
002 01-01-2016 3-1-2016 0 1 0
003 01-01-2016 4-1-2016 0 0 1
Here's a link to all the data:
https://drive.google.com/open?id=0B4xdnV0LFZI1T3IxQ2ZKRDhNd1k
I asked this question earlier this year and have been using the following code:
with n as (
select row_number() over (order by (select null)) - 1 as n
from master..spt_values
)
select format(dateadd(month, n.n, q.add_date), 'yyyy-MM') as yyyymm,
count(*) as Open_Quote_Count
from quotes q join
n
on (closed_status = 1 and dateadd(month, n.n, q.add_date) <= q.update_date) or
(closed_status = 0 and dateadd(month, n.n, q.add_date) <= getdate())
group by format(dateadd(month, n.n, q.add_date), 'yyyy-MM')
order by yyyymm;
The problem is that this code returns a cumulative value. So January was fine, but then Feb is really Jan + Feb, March is Jan + Feb + March, and so on. It took me a while to discover this, and the numbers returned are now way, way off, so I'm trying to correct them.
From the full data set the results of this code are:
Year-Month Open_Quote_Count
2017-01 153
2017-02 265
2017-03 375
2017-04 446
2017-05 496
2017-06 560
2017-07 609
The desired result would be how many quotes were in an open status during that particular month, not the cumulative value:
Year-Month Open_Quote_Count
2017-01 153
2017-02 112
2017-03 110
2017-04 71
Thank you in advance for your help!
Unless I am missing something, LAG() would be a good fit here
Example
Declare @YourTable Table ([Year-Month] varchar(50),[Open_Quote_Count] int)
Insert Into @YourTable Values
('2017-01',153)
,('2017-02',265)
,('2017-03',375)
,('2017-04',446)
,('2017-05',496)
,('2017-06',560)
,('2017-07',609)
Select *
,NewValue = [Open_Quote_Count] - lag([Open_Quote_Count],1,0) over (Order by [Year-Month])
From @YourTable --<< Replace with your initial query
Returns
Year-Month Open_Quote_Count NewValue
2017-01 153 153
2017-02 265 112
2017-03 375 110
2017-04 446 71
2017-05 496 50
2017-06 560 64
2017-07 609 49

How to select specific records of groups based on criteria

I'm trying to group a set of data, and for some of the fields I need to select a specific value based on the ttype. For example, I have the following rows:
caseid age iss gcs ttype
00170 64 25 17 Transfer Out
00170 64 27 15 Transfer In
00201 24 14 40 Transfer In
If a caseID has ttype 'Transfer Out' I want to use the ISS and GCS values from this row, otherwise use the values from the 'Transfer In' row.
My desired output based on the above example would be:
caseid age iss gcs
00170 64 25 17
00201 24 14 40
My current select statement is:
select caseid, max(age), max(iss), max(gcs)
from Table1
group by caseid
Which I know is incorrect but how do I specify the values for ISS and GCS from a specific row?
Thanks
Edit - I will not always need to select from Row1, table below with expanded data:
caseid age iss gcs los ttype disdate
170 64 25 17 5 Transfer Out 2014-01-02 00:00:00.000
170 64 27 15 1 Transfer In 2014-01-04 00:00:00.000
201 24 14 40 4 Transfer In 2014-01-04 00:00:00.000
In this case, I want the max age and the ISS and GCS figures from row 1 as before, but I need to sum the LOS and select the disdate from row 2 (i.e. the latest date), so my output would be:
caseid age iss gcs los disdate
170 64 25 17 6 2014-01-04
201 24 14 40 4 2014-01-04
Is this possible?
You can use a CTE with ROW_NUMBER and an OVER clause (edited according to your updated question):
WITH CTE AS
(
SELECT caseid, age, iss, gcs, los, ttype, disdate,
SumLos = SUM(los) OVER (PARTITION BY caseid),
LatestDisDate = MAX(disdate) OVER (PARTITION BY caseid),
rn = ROW_NUMBER() OVER (PARTITION BY caseid
ORDER BY CASE WHEN ttype = 'Transfer Out'
THEN 0 ELSE 1 END ASC, disdate ASC)
FROM dbo.Table1
)
SELECT caseid, age, iss, gcs, los = SumLos, disdate = LatestDisDate
FROM CTE
WHERE rn = 1
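To see what this CTE returns on the expanded sample rows, here is a minimal sketch that runs an equivalent query in SQLite through Python's sqlite3 module. SQLite stands in for SQL Server here (an assumption), so the T-SQL "alias = expression" form is written as "expression AS alias", and disdate is stored as an ISO date string.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE table1 (caseid INTEGER, age INTEGER, iss INTEGER, "
    "gcs INTEGER, los INTEGER, ttype TEXT, disdate TEXT)"
)
conn.executemany(
    "INSERT INTO table1 VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (170, 64, 25, 17, 5, "Transfer Out", "2014-01-02"),
        (170, 64, 27, 15, 1, "Transfer In", "2014-01-04"),
        (201, 24, 14, 40, 4, "Transfer In", "2014-01-04"),
    ],
)

# rn = 1 marks the row whose iss/gcs should be kept: 'Transfer Out' sorts
# first when present, otherwise the earliest 'Transfer In' row is used.
# The windowed SUM and MAX are computed over the whole caseid partition.
query = """
WITH cte AS (
    SELECT caseid, age, iss, gcs,
           SUM(los) OVER (PARTITION BY caseid) AS sum_los,
           MAX(disdate) OVER (PARTITION BY caseid) AS latest_disdate,
           ROW_NUMBER() OVER (
               PARTITION BY caseid
               ORDER BY CASE WHEN ttype = 'Transfer Out' THEN 0 ELSE 1 END,
                        disdate
           ) AS rn
    FROM table1
)
SELECT caseid, age, iss, gcs, sum_los, latest_disdate
FROM cte
WHERE rn = 1
ORDER BY caseid
"""
picked = conn.execute(query).fetchall()
for row in picked:
    print(row)
```

Case 170 keeps iss 25 and gcs 17 from its 'Transfer Out' row, with los summed to 6 and the later disdate. Case 201 has only one row, which is returned as is.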
I think this is what you need -
;WITH CTE AS
(
SELECT caseid, age, iss, gcs, ROW_NUMBER() OVER (PARTITION BY ttype ORDER BY gcs DESC) Rn
from YOUR_TABLE_NAME
)
SELECT caseid, age, iss, gcs
from CTE where Rn = 1

SQL Convert each date range into each day row

I have one simple requirement. Below is my sql table.
ID Cname StartDate EndDate Value
1 x 01/15/2015 01/20/2015 50
2 x 01/17/2015 01/22/2015 60
3 y 02/15/2015 02/20/2015 40
4 y 02/17/2015 02/22/2015 80
I have date ranges, and I want to convert each date range into one row per day. In addition, whenever date ranges overlap, the values should be added together.
Below is the sample output for more clarification.
Cname date value
x 1/15/2015 50
x 1/16/2015 50
x 1/17/2015 110
x 1/18/2015 110
x 1/19/2015 110
x 1/20/2015 110
x 1/21/2015 60
x 1/22/2015 60
y 2/15/2015 40
y 2/16/2015 40
y 2/17/2015 120
y 2/18/2015 120
y 2/19/2015 120
y 2/20/2015 120
y 2/21/2015 80
y 2/22/2015 80
Any help would be appreciated.
You can use a tally (numbers) table to generate a date range for each interval of your table. Then simply group by Cname and date to get the desired result set:
;WITH natural AS
(
SELECT ROW_NUMBER() OVER (ORDER BY [object_id]) - 1 AS val
FROM sys.all_objects
)
SELECT m.Cname, d = DATEADD(DAY, natural.val, m.StartDate),
SUM(value) AS value
FROM mytable AS m
INNER JOIN natural ON natural.val <= DATEDIFF(DAY, m.StartDate, m.EndDate)
GROUP BY Cname, DATEADD(DAY, natural.val, m.StartDate)
ORDER BY Cname, d
The CTE is used to create a tally table. The numbers of this table are then used to add 1,2,3, ... days to StartDate until EndDate is reached.
If you group by Cname, [Date], then SUM will return the required value since it will add any overlapping records within each Cname partition.
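The tally-table expansion can be tried end to end with the minimal sketch below, which uses SQLite through Python's sqlite3 module as a stand-in for SQL Server (an assumption). A recursive CTE replaces sys.all_objects as the number source, and SQLite's date() and julianday() functions replace DATEADD and DATEDIFF. The snake_case column names and the upper bound of 31 on the tally are illustrative choices.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE mytable "
    "(id INTEGER, cname TEXT, start_date TEXT, end_date TEXT, value INTEGER)"
)
# Rows from the question, with dates rewritten as ISO strings.
conn.executemany(
    "INSERT INTO mytable VALUES (?, ?, ?, ?, ?)",
    [
        (1, "x", "2015-01-15", "2015-01-20", 50),
        (2, "x", "2015-01-17", "2015-01-22", 60),
        (3, "y", "2015-02-15", "2015-02-20", 40),
        (4, "y", "2015-02-17", "2015-02-22", 80),
    ],
)

# tally yields 0, 1, 2, ... 31. Joining on n <= (end - start) repeats each
# range once per day it covers, and date(start, '+n days') names that day.
# Grouping by name and day then adds up the values of overlapping ranges.
query = """
WITH RECURSIVE tally(n) AS (
    SELECT 0
    UNION ALL
    SELECT n + 1 FROM tally WHERE n < 31
)
SELECT m.cname,
       date(m.start_date, '+' || tally.n || ' days') AS d,
       SUM(m.value) AS value
FROM mytable AS m
JOIN tally
  ON tally.n <= julianday(m.end_date) - julianday(m.start_date)
GROUP BY m.cname, d
ORDER BY m.cname, d
"""
daily = conn.execute(query).fetchall()
for row in daily:
    print(row)
```

Each Cname ends up with eight daily rows. Days covered by only one range carry that range's value, and days covered by both carry the sum (110 for x, 120 for y).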
