I am trying to find the max, min, avg and last value of a column in a single query.
Platform: SQL Server 2012
Sample Table:
SN Month Acc Bal
------------------------
1 7 101 1,000/-
2 7 101 1,500/-
3 7 101 1,700/-
4 8 101 1,200/-
5 8 101 900/-
6 9 101 2,500/-
Query I wrote:
select
[Month], [Acc],
min(Bal) as MinBal,
avg(Bal) as AvgBal,
max(Bal) as MaxBal
--, ??? for as LastBal
from
MyTable
Group By
[Month], [Acc]
The query with LAST_VALUE returns all records instead of aggregated records:
select
[Month], [Acc],
min(Bal) as MinBal,
avg(Bal) as AvgBal,
max(Bal) as MaxBal,
LAST_VALUE(Bal) OVER (partition by [Acc] order by [Month]) as LastBal
from
MyTable
Group By
[Month], [Acc], Bal
Also, including LAST_VALUE(Bal) raises an error unless Bal is added to the GROUP BY list:
Column 'Bal' is invalid in the select list because it is not contained in either an aggregate function or the GROUP BY clause.
Please try this solution-
DATA Generation
CREATE TABLE Alls
(
SN INT
,[Month] INT
,Acc INT
,Bal INT
)
GO
INSERT INTO Alls VALUES
(1, 7, 101, 1000),
(2, 7, 101, 1500),
(3, 7, 101, 1700),
(4, 8, 101, 1200),
(5, 8, 101, 900),
(6, 9, 101, 2500)
GO
SOLUTION
SELECT sn,Acc,[Month] ,Bal
, MIN(Bal) OVER(PARTITION BY Acc,[Month]) MinBal
, AVG(Bal*1.) OVER(PARTITION BY Acc,[Month]) AvgBal
, MAX(Bal) OVER(PARTITION BY Acc,[Month]) MaxBal
, FIRST_VALUE(Bal) OVER(PARTITION BY Acc,[Month] ORDER BY SN DESC) lastVal
FROM Alls
ORDER By SN
OUTPUT
sn Acc Month Bal MinBal AvgBal MaxBal lastVal
----------- ----------- ----------- ----------- ----------- ---------------- ----------- -----------
1 101 7 1000 1000 1400.000000 1700 1700
2 101 7 1500 1000 1400.000000 1700 1700
3 101 7 1700 1000 1400.000000 1700 1700
4 101 8 1200 900 1050.000000 1200 900
5 101 8 900 900 1050.000000 1200 900
6 101 9 2500 2500 2500.000000 2500 2500
(6 rows affected)
If you only need Acc, Month and the aggregate columns, use the query below:
SOLUTION
SELECT Acc,[Month],MAX(MinBal)MinBal,MAX(AvgBal)AvgBal,MAX(MaxBal)MaxBal,MAX(lastVal)lastVal
FROM
(
SELECT sn,Acc,[Month] ,Bal
, MIN(Bal) OVER(PARTITION BY Acc,[Month]) MinBal
, AVG(Bal*1.) OVER(PARTITION BY Acc,[Month]) AvgBal
, MAX(Bal) OVER(PARTITION BY Acc,[Month]) MaxBal
, FIRST_VALUE(Bal) OVER(PARTITION BY Acc,[Month] ORDER BY SN DESC) lastVal
FROM Alls
)u GROUP BY Acc,[Month]
OUTPUT
Acc Month MinBal AvgBal MaxBal lastVal
----------- ----------- ----------- ---------------- ----------- -----------
101 7 1000 1400.000000 1700 1700
101 8 900 1050.000000 1200 900
101 9 2500 2500.000000 2500 2500
(3 rows affected)
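A side note on why LAST_VALUE looked wrong in the question: when a window has an ORDER BY but no explicit frame, the frame ends at the current row, so the "last" value is just the current row's own value. The sketch below reproduces this with Python's built-in sqlite3 module as a stand-in for SQL Server. That substitution is an assumption made so the example runs anywhere (it needs SQLite 3.25 or later); the default frame behaves the same way in both engines for this case. The table name Alls is taken from the answer above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Alls (SN INT, Month INT, Acc INT, Bal INT)")
conn.executemany(
    "INSERT INTO Alls VALUES (?, ?, ?, ?)",
    [(1, 7, 101, 1000), (2, 7, 101, 1500), (3, 7, 101, 1700),
     (4, 8, 101, 1200), (5, 8, 101, 900), (6, 9, 101, 2500)],
)

rows = conn.execute("""
    SELECT SN,
           -- default frame: stops at the current row
           LAST_VALUE(Bal) OVER (PARTITION BY Acc, Month ORDER BY SN) AS default_frame,
           -- explicit frame: spans the whole partition
           LAST_VALUE(Bal) OVER (PARTITION BY Acc, Month ORDER BY SN
                                 ROWS BETWEEN UNBOUNDED PRECEDING
                                          AND UNBOUNDED FOLLOWING) AS full_frame
    FROM Alls
    ORDER BY SN
""").fetchall()

default_frame = [r[1] for r in rows]  # one value per row, in SN order
full_frame = [r[2] for r in rows]
print(default_frame)
print(full_frame)
```

With the explicit frame, LAST_VALUE gives the same per-partition result as the FIRST_VALUE ... ORDER BY SN DESC trick used above.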
select *
from
( SELECT sn, Acc, [Month], Bal
, MIN(Bal) OVER(PARTITION BY Acc, [Month]) MinBal
, AVG(Bal) OVER(PARTITION BY Acc, [Month]) AvgBal
, MAX(Bal) OVER(PARTITION BY Acc, [Month]) MaxBal
, row_number() OVER(PARTITION BY Acc, [Month] ORDER BY SN DESC) as rn
FROM Alls
) tt
where rn = 1
ORDER By sn
You can achieve this as below:
select
tt.Month
, tt.Acc
, min(Bal) as MinBal
, avg(Bal) as AvgBal
, max(Bal) as MaxBal
, latest.balance
FROM #tbl1 as tt
JOIN (
SELECT
id
,month
,acc
,bal as balance
FROM #tbl1 AS t1
WHERE id = (SELECT MAX(id)
FROM #tbl1 AS t2
WHERE t1.month = t2.month
AND t1.acc = t2.acc
GROUP BY month, acc)
) as latest
on tt.month = latest.month
AND tt.acc = latest.acc
Group By tt.Month, tt.Acc, latest.balance
DROP TABLE #tbl1
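Another way to get one row per group, sketched here as an alternative rather than as any of the posted answers: number the rows newest-first inside a subquery, then pick the row numbered 1 with a conditional aggregate next to the ordinary MIN/AVG/MAX. The example runs on Python's built-in sqlite3 module (an assumption for portability, SQLite 3.25 or later); the table Alls and its columns follow the data generation script earlier in the thread, and the aliases numbered and LastBal are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Alls (SN INT, Month INT, Acc INT, Bal INT)")
conn.executemany(
    "INSERT INTO Alls VALUES (?, ?, ?, ?)",
    [(1, 7, 101, 1000), (2, 7, 101, 1500), (3, 7, 101, 1700),
     (4, 8, 101, 1200), (5, 8, 101, 900), (6, 9, 101, 2500)],
)

summary = conn.execute("""
    SELECT Acc, Month,
           MIN(Bal)       AS MinBal,
           AVG(Bal * 1.0) AS AvgBal,
           MAX(Bal)       AS MaxBal,
           -- only the newest row of each group has rn = 1
           MAX(CASE WHEN rn = 1 THEN Bal END) AS LastBal
    FROM (
        SELECT Acc, Month, Bal,
               ROW_NUMBER() OVER (PARTITION BY Acc, Month ORDER BY SN DESC) AS rn
        FROM Alls
    ) AS numbered
    GROUP BY Acc, Month
    ORDER BY Acc, Month
""").fetchall()

for row in summary:
    print(row)
```

The inner query touches the table once, so there is no self-join on the latest id.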
Issue: How to update rows between two different sets of criteria in SQL Server without using a loop (SQL Server 2014). In other words, for each row in a result set, how to update every row between the first occurrence (with one criterion) and the second occurrence (with different criteria). I think part of the issue is trying to run a TOP N query for every row in the query.
Specifically:
In the example starting table below, how can I update the last 2 columns of dates where:
Update rows between the null Category row and the last consecutive "M" Category row if the null Category row is preceded by an "S" Category row. Category can contain any order of "S", "M", or null.
Set StartDate = IDEndDate + 1 day of the "S" row preceding the null row.
Set EndDate = IDEndDate of the last row with an "M" Category.
Here is a SQLFiddle.
Notes: I have done this in the past with a loop (fetch..), but I am trying to do it with a few queries instead, roughly like this:
step 1: Get work: select all valid null rows (beginning of range)
step 2: for each row above, select the related last "M" row (end of range) and then run a query to update the StartDate, EndDates in each range.
Starting Table:
ID IDStartDate IDEndDate Category
------------------------------------
11 2017-01-01 2017-01-31 S
11 2017-02-02 2017-02-03 null
11 2017-02-03 2017-03-31 M
11 2017-04-01 2017-04-30 M
22 2017-05-01 2017-06-15 S
22 2017-06-16 2017-06-20 null
22 2017-06-21 2017-06-25 M
22 2017-06-26 2017-06-27 null
22 2017-06-28 2017-06-29 S
22 2017-06-30 2017-07-05 M
33 2017-06-30 2017-07-14 M
33 2017-07-15 2017-07-20 S
33 2017-07-21 2017-07-25 null
44 2018-06-30 2018-07-14 S
44 2018-07-15 2018-07-20 M
44 2018-07-21 2018-07-25 null
Desired Ending Table:
ID IDStartDate IDEndDate Category StartDate EndDate
----------------------------------------------------------
11 2017-01-01 2017-01-31 S
11 2017-02-02 2017-02-03 null 2017-02-01 2017-04-30
11 2017-02-03 2017-03-31 M 2017-02-01 2017-04-30
11 2017-04-01 2017-04-30 M 2017-02-01 2017-04-30
22 2017-05-01 2017-06-15 S
22 2017-06-16 2017-06-20 null 2017-06-16 2017-06-25
22 2017-06-21 2017-06-25 M 2017-06-16 2017-06-25
22 2017-06-26 2017-06-27 null
22 2017-06-28 2017-06-29 S
22 2017-06-30 2017-07-05 M
33 2017-06-30 2017-07-14 M
33 2017-07-15 2017-07-20 S
33 2017-07-21 2017-07-25 null
44 2018-06-30 2018-07-14 S
44 2018-07-15 2018-07-20 M
44 2018-07-21 2018-07-25 null
Below is some SQL to create the table and view the query results that I have started. I tried cte, cross apply, outer apply, inner joins... with no luck.
thanks so much!
CREATE TABLE test (
ID INT,
IDStartDate date,
IDEndDate date,
Category VARCHAR (2),
StartDate date,
EndDate date
);
INSERT INTO test (ID, IDStartDate, IDEndDate, Category)
VALUES
(11, '2017-01-01', '2017-01-31', 'S')
,(11, '2017-02-02', '2017-02-03', null)
,(11, '2017-02-03', '2017-03-31', 'M')
,(11, '2017-04-01', '2017-04-30', 'M')
,(22, '2017-05-01', '2017-06-15', 'S')
,(22, '2017-06-16', '2017-06-20', null)
,(22, '2017-06-21', '2017-06-25', 'M')
,(22, '2017-06-26', '2017-06-27', null)
,(22, '2017-06-28', '2017-06-29', 'S')
,(22, '2017-06-30', '2017-07-05', 'M')
,(33, '2017-06-30', '2017-07-14', 'M')
,(33, '2017-07-15', '2017-07-20', 'S')
,(33, '2017-07-21', '2017-07-25', null)
,(44, '2018-06-30', '2018-07-14', 'S')
,(44, '2018-07-15', '2018-07-20', 'M')
,(44, '2018-07-21', '2018-07-25', null);
--**************************
--results: shows first rows of each range
--**************************
;with cte as
(
select *
,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS RowNum
,LAG(IDEndDate) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS lastIDEndDate
,LAG(Category) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS lastCategory
,LEAD(Category) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS nextCategory
from test
)
select * --select first row of each range to update
from cte
where Category is null and lastCategory = 'S' and nextCategory = 'M'
--*******************************
--6 of 8 "new" values are correct (missing NewEndDate for first range)
--*******************************
;with cte as
(
SELECT *
,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS RowNum
,LAG(IDEndDate) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS lastIDEndDate
,LAG(Category) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS lastCategory
,LEAD(Category) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS nextCategory
FROM test
), cte2 as
(
select * --find the first/start row of each range
,LAG(RowNum) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS lastRowNum
,IIF(Category is null and lastCategory = 'S' and nextCategory = 'M', DateAdd(day, 1, lastIDEndDate), null) as NewStartDate
,IIF(Category is null and lastCategory = 'S' and nextCategory = 'M', RowNum, null) as NewStartRowNum
from cte
)
select t1.*, t3.*
from cte2 t1
outer apply
(
select top 1 --find the last/ending row of each range
t2.lastIDEndDate as NewEndDate
,t2.lastRowNum as NewEndRowNum
from cte2 t2
where t1.ID = t2.ID
and t1.NewStartRowNum < t2.RowNum
and t2.nextCategory <> 'M'
order by t2.ID, t2.RowNum
) t3
order by t1.ID, t1.RowNum
Here's an attempt on this SQL puzzle.
Basically, it updates from a CTE.
First it calculates a cumulative sum to create a kind of ranking.
Then it calculates the dates only for the rows whose running sum is 2 or 3.
;WITH CTE AS
(
SELECT ID, IDStartDate, IDEndDate, Category, StartDate, EndDate,
DATEADD(day,1, FIRST_VALUE(IDEndDate) OVER (PARTITION BY ID ORDER BY IDStartDate)) AS NewStartDate,
FIRST_VALUE(IDEndDate) OVER (PARTITION BY ID ORDER BY IDStartDate DESC) AS NewEndDate
FROM
(
SELECT ID, IDStartDate, IDEndDate, Category, StartDate, EndDate,
SUM(CASE WHEN Category = 'S' THEN 2 WHEN Category IS NULL THEN 1 END) OVER (PARTITION BY ID ORDER BY IDStartDate) AS cSum
FROM test t
) q
WHERE cSum IN (2, 3)
)
UPDATE CTE
SET
StartDate = NewStartDate,
EndDate = NewEndDate
WHERE (Category IS NULL OR Category = 'M');
A test on rextester here
I answered my own question. I had two major errors:
1) A Cross Apply (or Outer Apply) is needed for the Top N query to work properly.
Using a cross apply, the Top N query will be run for each row from the inner query.
Using an inner join (or left join), all rows will be returned first from the inner query and the Top N query runs only once.
2) Filtering on "[column] <> 'M'" tripped me up because it also drops the rows where the column is NULL (NULL <> 'M' evaluates to UNKNOWN, not TRUE). I had to use "[column] = 'S' or [column] is null" instead.
Final SQL found in rextester
Working code below:
;with cte as
(
SELECT *
,ROW_NUMBER() OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS RowNum
,LAG(IDEndDate) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS lastIDEndDate
,LAG(Category) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS lastCategory
,LEAD(Category) OVER(PARTITION BY ID ORDER BY ID, IDStartDate, IDEndDate) AS nextCategory
FROM test
), cte2 as
(
select t1.ID, t1.IDStartDate, t1.IDEndDate --find the first/start row of the range
,IIF(Category is null and lastCategory = 'S' and nextCategory = 'M', DateAdd(day, 1, lastIDEndDate), null) as NewStartDate
,IIF(Category is null and lastCategory = 'S' and nextCategory = 'M', RowNum, null) as NewStartRowNum
,t3.*
from cte t1
cross apply
(
select top 1 --find the last/ending row of the range
t2.IDEndDate as NewEndDate
,t2.RowNum as NewEndRowNum
from cte t2
where t1.ID = t2.ID
and t1.RowNum < t2.RowNum
and (t2.nextCategory ='S' or t2.nextCategory is null)
order by t2.RowNum --take the nearest qualifying row after the start row
) t3
where Category is null and lastCategory = 'S' and nextCategory = 'M'
)
update t4
set StartDate = NewStartDate
,EndDate = NewEndDate
from cte t4
inner join cte2 t5
on t4.ID = t5.ID
and t4.RowNum Between NewStartRowNum and NewEndRowNum
select * from test
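The NULL comparison issue from point 2 can be seen in isolation. This is a minimal sketch using Python's built-in sqlite3 module rather than SQL Server (an assumption for portability; the comparison rule is the same under SQL Server's default ANSI_NULLS setting). The table t and its three rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (nextCategory TEXT)")
conn.executemany("INSERT INTO t VALUES (?)", [("M",), ("S",), (None,)])

# NULL <> 'M' is UNKNOWN, so the NULL row is filtered out along with 'M'.
not_equal = conn.execute(
    "SELECT COUNT(*) FROM t WHERE nextCategory <> 'M'"
).fetchone()[0]

# Spelling out the NULL case keeps that row.
explicit = conn.execute(
    "SELECT COUNT(*) FROM t WHERE nextCategory = 'S' OR nextCategory IS NULL"
).fetchone()[0]

print(not_equal, explicit)
```

Only the 'S' row passes the first filter, while the second keeps both the 'S' row and the NULL row.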
The total number of rows is variable, not fixed. Given N rows, I want to split them into groups of 5 rows each and select the max price of each group from the following table in SQL.
Date Price
20170101 100
20170102 110
20170103 90
20170105 80
20170109 76
20170110 50
20170111 55
20170113 80
20170115 100
20170120 99
20170121 88
20170122 98
20170123 120
So in the first group of 5 the max price is 110, in the second group it is 100, and in the last group it is 120.
Use a common table expression to group them.
WITH CTE AS (SELECT RANK() OVER (ORDER BY Date) AS Rank, Price
FROM yourtable)
SELECT (Rank - 1) / 5 AS GroupedDate, MAX(Price) AS MAXPRICE
FROM CTE
GROUP BY ((Rank - 1) / 5);
Output
GroupedDate MAXPRICE
0 110
1 100
2 120
SQL Fiddle: http://sqlfiddle.com/#!6/b5857/3/0
You can use row_number as below
;With cte as (
Select *, Bucket = Sum(RowN) over(Order by [date]) from (
Select *, RowN = case when (row_number() over(order by [date]) - 1) % 5 = 0 then 1 else 0 end from #data1
) a
) Select top (1) with ties [Date], [Price]
from cte
order by row_number() over (partition by Bucket order by Price desc)
You could use:
SELECT grp, MAX(Price) AS price
FROM (SELECT *, (ROW_NUMBER() OVER(ORDER BY DATE) - 1) / 5 AS grp FROM tab) sub
GROUP BY grp;
-- OUTPUT
grp price
0 110
1 100
2 120
Rextester Demo
*assuming that date is unique
EDIT:
To get output like "20170101 - 20170109 110":
SELECT
CONVERT(VARCHAR(8),MIN(DATE),112) + '-' + CONVERT(VARCHAR(8),MAX(date),112)
, MAX(Price) AS price
FROM (SELECT *, (ROW_NUMBER() OVER(ORDER BY DATE) - 1) / 5 AS grp FROM tab) sub
GROUP BY grp;
Output:
20170101-20170109 110
20170110-20170120 100
20170121-20170123 120
Rextester Demo2
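A quick check of the bucket arithmetic, since row numbers start at 1: dividing the row number by 5 directly puts only four rows in the first bucket, while subtracting 1 first gives full groups of five. The sketch below is plain Python mirroring the integer division for the 13 sample rows; the variable names are illustrative only.

```python
from collections import Counter

ROWS = 13        # number of rows in the sample table
GROUP_SIZE = 5

row_numbers = range(1, ROWS + 1)  # ROW_NUMBER() is 1-based

# rn / 5: rows 1-4 land in bucket 0, row 5 already starts bucket 1
direct = Counter(rn // GROUP_SIZE for rn in row_numbers)
# (rn - 1) / 5: rows 1-5 land in bucket 0
shifted = Counter((rn - 1) // GROUP_SIZE for rn in row_numbers)

direct_sizes = [direct[g] for g in sorted(direct)]
shifted_sizes = [shifted[g] for g in sorted(shifted)]
print(direct_sizes, shifted_sizes)
```

On this sample the maximum per bucket happens to come out as 110, 100, 120 either way, so the difference only shows up in which dates fall into each group.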
I am trying to get the number of records for a 16 hour time interval. Below is the code that I am using now.
;With Cte_hours as ( --hours generation
Select top(6) hr = (Row_number() over (order by (Select NULL))-1)*4 from master..spt_values
), cte2 as ( --getting range
Select DateAdd(HH, c.hr, Convert(datetime,d.dts) ) as Dts_Start, DateAdd(MS, -2, DateAdd(HH, c.hr+ 4, Convert(datetime,d.dts) ) ) Dts_end
from (select distinct convert(date, dt) as dts from TEST2 ) d
cross apply Cte_hours c
) --actual query
Select c2.Dts_Start as DT, Sum(case when t.Dt is not null then 1 else 0 end) No_of_records,LD_VOY_N,LD_VSL_M
from cte2 c2
Left Join TEST2 t
on t.Dt between c2.Dts_Start and c2.Dts_end
group by c2.Dts_Start,LD_VOY_N,LD_VSL_M
order by LD_VOY_N, LD_VSL_M, Dts_Start ASC
This code is able to count the number of records I have based on a 4-, 6- or 12-hour interval. However, if I try to count based on a 16-hour interval, it does not work as expected. Below are the code and output for the 16-hour interval.
;With Cte_hours as ( --hours generation
Select top(6) hr = (Row_number() over (order by (Select NULL))-1)*16 from master..spt_values
), cte2 as ( --getting range
Select DateAdd(HH, c.hr, Convert(datetime,d.dts) ) as Dts_Start, DateAdd(MS, -2, DateAdd(HH, c.hr+ 16, Convert(datetime,d.dts) ) ) Dts_end
from (select distinct convert(date, dt) as dts from TEST2 ) d
cross apply Cte_hours c
) --actual query
Select c2.Dts_Start as DT, Sum(case when t.Dt is not null then 1 else 0 end) No_of_records,LD_VOY_N,LD_VSL_M
from cte2 c2
Left Join TEST2 t
on t.Dt between c2.Dts_Start and c2.Dts_end
group by c2.Dts_Start,LD_VOY_N,LD_VSL_M
order by LD_VOY_N, LD_VSL_M, Dts_Start ASC
Result:
DT No_of_records LD_VOY_N LD_VSL_M
2017-05-05 16:00:00.000 14 0002W pqo emzmnwp
2017-05-06 00:00:00.000 14 0002W pqo emzmnwp
2017-05-06 08:00:00.000 12 0002W pqo emzmnwp
2017-05-06 16:00:00.000 12 0002W pqo emzmnwp
2017-05-01 16:00:00.000 1 0007E omq ynzmeoyn
2017-05-02 00:00:00.000 1 0007E omq ynzmeoyn
The result includes 8-hour boundaries as well. Does anyone have any idea why?
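One likely explanation, sketched under the assumption that TEST2 has rows on consecutive dates (the result suggests 2017-05-05 and 2017-05-06, but the question does not say so): the bucket start times are generated separately for every distinct date, and 24 is not a multiple of 16. The buckets started from one midnight therefore do not line up with the buckets started from the next midnight, and the merged list has 8-hour steps. The Python below only mirrors the start-time arithmetic of cte2; the function name and parameters are illustrative.

```python
from datetime import datetime, timedelta

def bucket_starts(dates, interval_hours, buckets_per_date=6):
    """Mirror cte2: for each date, add n * interval hours for n = 0..5."""
    starts = set()
    for day in dates:
        for n in range(buckets_per_date):
            starts.add(day + timedelta(hours=n * interval_hours))
    return sorted(starts)

def gaps(starts):
    """Distinct distances between neighbouring start times."""
    return {later - earlier for earlier, later in zip(starts, starts[1:])}

dates = [datetime(2017, 5, 5), datetime(2017, 5, 6)]

starts_16 = bucket_starts(dates, 16)
gaps_16 = gaps(starts_16)
gaps_4 = gaps(bucket_starts(dates, 4))

for start in starts_16[:5]:
    print(start)
```

With a 4-hour interval the per-date series coincide, which would explain why the smaller intervals looked fine. Generating the 16-hour buckets from a single anchor datetime instead of from each date is one way to avoid the overlap.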
I have my data which may be around 50 records for each address:
Id AddressId Income Expense Revenue
----------------------------------------
1 1 100 200 300
2 1 150 20 200
3 1 160 80 800
4 1 50 90 200
5 1 600 700 500
Now I need my data in the following format:
Ids Count Income Expense Revenue
---------------------------------------
1 1 100 200 300
1,2 2 250 220 500
1,2,3 3 410 300 1300
1,2,3,4 4 460 390 1500
1,2,3,4,5 5 1060 1090 2000
Every row is being added one after another.
For example:
The Ids 1,2 is a sum of Id 1 and 2
The Ids 1,2,3 is a sum of Id 1 and 2 and 3 and so on
I don't need the Ids column, the only thing I need is the sum
You could use STUFF, ROW_NUMBER() OVER(), and SUM() OVER() like this
DECLARE @SampleData AS TABLE
(
Id int,
AddressId int,
Income int,
Expense int,
Revenue int
)
INSERT INTO @SampleData
VALUES
( 1, 1, 100, 200, 300),
( 2, 1, 150, 20 , 200),
( 3, 1, 160, 80 , 800),
( 4, 1, 50 , 90 , 200),
( 5, 1, 600, 700, 500)
SELECT
STUFF(
(
SELECT ',' + CAST(sd1.Id AS varchar(10))
FROM @SampleData sd1
WHERE sd1.AddressId = sd.AddressId AND sd1.Id <= sd.Id
FOR XML PATH('')
),
1,1,'') AS Ids,
Row_number() OVER(PARTITION BY sd.AddressId ORDER BY sd.Id) AS [Count],
sum(sd.Income) OVER(PARTITION BY sd.AddressId ORDER BY sd.Id) AS Income,
sum(sd.Expense) OVER(PARTITION BY sd.AddressId ORDER BY sd.Id) AS Expense,
sum(sd.Revenue) OVER(PARTITION BY sd.AddressId ORDER BY sd.Id) AS Revenue
FROM @SampleData sd
ORDER BY sd.AddressId, sd.Id
Demo link: http://rextester.com/HRIWH92029
Note: The last revenue should be 2000 instead of 1600
If you're using SQL Server 2012 or above,
use the query below to sum up the previous rows:
Select ID,
count(*) OVER (PARTITION by AddressID
ORDER BY ID
ROWS BETWEEN unbounded PRECEDING AND current row) as[Count],
sum(Income) OVER (PARTITION by AddressID
ORDER BY ID
ROWS BETWEEN unbounded PRECEDING AND current row) Income,
sum(Expense) OVER (PARTITION by AddressID
ORDER BY ID
ROWS BETWEEN unbounded PRECEDING AND current row)Expense,
sum(Revenue) OVER (PARTITION by AddressID
ORDER BY ID
ROWS BETWEEN unbounded PRECEDING AND current row) Revenue from TableName
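If you're using SQL Server 2008 or below, use the query below to sum up the previous rows. The subqueries are correlated on AddressId as well, so that each address gets its own running total:
Select ID,
(select count(*) from Tablename A where A.AddressId = Tablename.AddressId and A.Id <= Tablename.ID) [Count],
(select sum(Income) from Tablename A where A.AddressId = Tablename.AddressId and A.Id <= Tablename.ID) Income,
(select sum(Expense) from Tablename A where A.AddressId = Tablename.AddressId and A.Id <= Tablename.ID) Expense,
(select sum(Revenue) from Tablename A where A.AddressId = Tablename.AddressId and A.Id <= Tablename.ID) Revenue
from Tablename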
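To check the running totals against the expected table from the question, here is a small sketch of the windowed version run on Python's built-in sqlite3 module (an assumption for portability, SQLite 3.25 or later; the frame clause is written the same way as in the SQL Server 2012 query). The table name SampleData is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE SampleData "
    "(Id INT, AddressId INT, Income INT, Expense INT, Revenue INT)"
)
conn.executemany(
    "INSERT INTO SampleData VALUES (?, ?, ?, ?, ?)",
    [(1, 1, 100, 200, 300), (2, 1, 150, 20, 200), (3, 1, 160, 80, 800),
     (4, 1, 50, 90, 200), (5, 1, 600, 700, 500)],
)

# Same window for every column: everything from the first row of the
# address up to and including the current row.
frame = ("OVER (PARTITION BY AddressId ORDER BY Id "
         "ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW)")

running = conn.execute(f"""
    SELECT Id,
           COUNT(*)     {frame} AS Cnt,
           SUM(Income)  {frame} AS Income,
           SUM(Expense) {frame} AS Expense,
           SUM(Revenue) {frame} AS Revenue
    FROM SampleData
    ORDER BY AddressId, Id
""").fetchall()

for row in running:
    print(row)
```

The last row comes out as 5, 1060, 1090, 2000, matching the requested output.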
Using MSSQL, I am trying to get some information from a journal where one event happens directly after another event.
So what I am effectively aiming for, is to get a row number partitioned by a TransactionID, and then I need the last 2 rows (last 2 row number) for EACH transactionID (Ordered by a TxnDate field). There could be any number of rows per TransactionID.
So I would get:
JnlId TxnId RowNum
5 10001 65
2 10001 66
10 10002 11
8 10002 12
5 10003 15
98 10003 16
Any ideas how I could achieve this as I am at a loss! The end game after this is to filter out the 'JnlId' field for a select few of IDs.
Bit of a back story. This customer thinks their staff is stealing, so I need to filter out when they are cancelling items directly before finishing off each transaction.
Try this, I added some extra rows to make dense rank more obvious:
Test data:
DECLARE @t table(JnlId int,TxnId int,RowNum int, TxnDate date)
INSERT @t values
(5, 10001,65, '2015-01-01'),
(2, 10001,66, '2015-01-02'),
(2, 10001,66, '2015-01-03'),
(2, 10001,66, '2015-01-04'),
(2, 10001,67, '2015-01-04'),
(2, 10001,67, '2015-01-04'),
(10,10002,11, '2015-01-03'),
(8, 10002,12, '2015-01-04'),
(5, 10003,15, '2015-01-05'),
(98,10003,16, '2015-01-06')
Query:
;WITH CTE AS
(
SELECT
DENSE_RANK() over(partition by txnID order by TxnDate desc) rn,
JnlId, TxnId, RowNum, TxnDate
FROM @t
)
)
SELECT JnlId, TxnId, RowNum, TxnDate FROM CTE
WHERE rn<=2
Result:
JnlId TxnId RowNum TxnDate
2 10001 66 2015-01-04
2 10001 67 2015-01-04
2 10001 67 2015-01-04
2 10001 66 2015-01-03
8 10002 12 2015-01-04
10 10002 11 2015-01-03
98 10003 16 2015-01-06
5 10003 15 2015-01-05
Instead of ordering in ascending order try descending order
select * from
(
select dense_rank() over(partition by transactionID Order by TxnDate Desc) Rn,*
from yourtable
) A
where rn<=2
Just order by RowNum descending and then select the rows whose ROW_NUMBER is less than or equal to 2:
DECLARE @Table TABLE
(
JnlId INT
, TxnId INT
, RowNum INT
);
INSERT INTO @Table
(JnlId, TxnId, RowNum)
VALUES
(5, 10001, 65)
, (2, 10001, 66)
, (10, 10002, 11)
, (8, 10002, 12)
, (5, 10003, 15)
, (98, 10003, 16);
SELECT T.JnlId, T.TxnId, T.RowNum
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY TxnId ORDER BY RowNum DESC) AS RowNo, *
FROM @Table) AS T
WHERE T.RowNo <= 2
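The answers above differ in one detail that matters when several journal rows share the same TxnDate: DENSE_RANK keeps every row tied on the two latest dates, while ROW_NUMBER keeps exactly two rows per transaction (which of the tied rows it keeps is arbitrary unless a tie-breaker is added to the ORDER BY). The sketch below counts the rows each approach returns for the test data with the extra tied rows. It uses Python's built-in sqlite3 module as a stand-in for SQL Server (an assumption, SQLite 3.25 or later), and the helper rows_kept is made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (JnlId INT, TxnId INT, RowNum INT, TxnDate TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(5, 10001, 65, "2015-01-01"), (2, 10001, 66, "2015-01-02"),
     (2, 10001, 66, "2015-01-03"), (2, 10001, 66, "2015-01-04"),
     (2, 10001, 67, "2015-01-04"), (2, 10001, 67, "2015-01-04"),
     (10, 10002, 11, "2015-01-03"), (8, 10002, 12, "2015-01-04"),
     (5, 10003, 15, "2015-01-05"), (98, 10003, 16, "2015-01-06")],
)

def rows_kept(ranking_function):
    """Rows per TxnId that survive 'rn <= 2' for the given ranking function."""
    # ranking_function is a fixed literal below, never user input
    query = f"""
        SELECT TxnId, COUNT(*)
        FROM (
            SELECT TxnId,
                   {ranking_function}() OVER (PARTITION BY TxnId
                                              ORDER BY TxnDate DESC) AS rn
            FROM t
        ) AS ranked
        WHERE rn <= 2
        GROUP BY TxnId
    """
    return dict(conn.execute(query).fetchall())

dense = rows_kept("DENSE_RANK")
numbered = rows_kept("ROW_NUMBER")
print(dense)
print(numbered)
```

If exactly two rows per transaction are needed, ROW_NUMBER with a deterministic tie-breaker is probably the safer choice.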