How to use Row_Number to group a resultset

I'm stuck with a query, and I don't want to use a WHILE loop or another nasty method to solve it.
Here's the situation:
I've got a query that returns some data, and I need to calculate a column based on two other columns.
My results are as follows:
Type | Customer | Cycle | Amount | Expiration | Row_Number (Partition By Customer, Cycle)
So my Row_Number column needs to "group" customers and cycles.
Here's an example:
As you can see, the iteration column is applied correctly, as far as I understand what ROW_NUMBER does.
But I need to do this:
Is there a way to do this with ROW_NUMBER?
Or do I need to store the data in a temp table, loop through it, and update this ITERATION column?
Maybe a CTE?
Any help on this will be highly appreciated. Thanks!

Just run this as a new query and replace what you need with your own tables and columns:
WITH T(StyleID, ID)
AS (SELECT 1,1 UNION ALL
SELECT 1,1 UNION ALL
SELECT 1,1 UNION ALL
SELECT 1,2)
SELECT *,
RANK() OVER(PARTITION BY StyleID ORDER BY ID) AS 'RANK',
ROW_NUMBER() OVER(PARTITION BY StyleID ORDER BY ID) AS 'ROW_NUMBER',
DENSE_RANK() OVER(PARTITION BY StyleID ORDER BY ID) AS 'DENSE_RANK'
FROM T
regards,
Valentin

You could use the DENSE_RANK function instead of ROW_NUMBER.
DECLARE @MyTable TABLE
(
Customer NVARCHAR(100) NOT NULL,
[Cycle] SMALLINT NOT NULL
);
INSERT @MyTable VALUES ('C1', 2010);
INSERT @MyTable VALUES ('C1', 2010);
INSERT @MyTable VALUES ('C1', 2011);
INSERT @MyTable VALUES ('C1', 2012);
INSERT @MyTable VALUES ('C1', 2012);
INSERT @MyTable VALUES ('C1', 2012);
INSERT @MyTable VALUES ('C2', 2010);
INSERT @MyTable VALUES ('C2', 2010);
SELECT t.Customer, t.[Cycle],
DENSE_RANK() OVER(PARTITION BY t.Customer ORDER BY t.[Cycle]) AS Rnk
FROM @MyTable t
ORDER BY Customer, [Cycle];
Results:
Customer Cycle Rnk
-------- ------ ---
C1 2010 1
C1 2010 1
C1 2011 2
C1 2012 3
C1 2012 3
C1 2012 3
C2 2010 1
C2 2010 1
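To see the difference between the ROW_NUMBER the asker had and the DENSE_RANK suggested above, here is a minimal sketch. It is an assumption-laden stand-in, not the original T-SQL: it uses Python's built-in sqlite3 module (SQLite 3.25 or later is needed for window functions), and the table name MyTable and the aliases RowNum and Iteration are only illustrative.

```python
import sqlite3

# In-memory SQLite database used as a stand-in for SQL Server (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE MyTable (Customer TEXT NOT NULL, Cycle INTEGER NOT NULL)")
conn.executemany(
    "INSERT INTO MyTable VALUES (?, ?)",
    [("C1", 2010), ("C1", 2010), ("C1", 2011), ("C1", 2012),
     ("C1", 2012), ("C1", 2012), ("C2", 2010), ("C2", 2010)],
)

rows = conn.execute("""
    SELECT Customer, Cycle,
           -- restarts at 1 for every (Customer, Cycle) pair
           ROW_NUMBER() OVER (PARTITION BY Customer, Cycle ORDER BY Cycle) AS RowNum,
           -- numbers the distinct cycles within each customer
           DENSE_RANK() OVER (PARTITION BY Customer ORDER BY Cycle) AS Iteration
    FROM MyTable
    ORDER BY Customer, Cycle, RowNum
""").fetchall()

for row in rows:
    print(row)
```

The Iteration column stays the same for every row of one customer and cycle, while RowNum counts the rows inside that pair.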

Related

How to select only first ROW_NUMBER combined with SUM

I'd like to group my table by [ID] while using SUM, and also bring back the
[Product_Name] of the top ROW_NUMBER. I'm not sure if I should use ROW_NUMBER, GROUPING SETS, or loop through everything with FETCH. This is what I tried:
DECLARE @SampleTable TABLE
(
[ID] INT,
[Price] MONEY,
[Product_Name] VARCHAR(50)
)
INSERT INTO @SampleTable
VALUES (1, 100, 'Product_1'), (1, 200, 'Product_2'),
(1, 300, 'Product_3'), (2, 500, 'Product_4'),
(2, 200, 'Product_5'), (2, 300, 'Product_6');
SELECT
[ID],
[Product_Name],
[Price],
SUM([Price]) OVER (PARTITION BY [ID]) AS [Price_Total],
ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [ID]) AS [Row_Number]
FROM
@SampleTable T1
My desired results - only two records:
1 Product_1 100.00 600.00 1
2 Product_4 500.00 1000.00 1
Any help or guidance is highly appreciated.
UPDATE:
I ended up using what Prateek Sharma suggested in his comment: simply wrap the query in another SELECT with WHERE [Row_Number] = 1.
SELECT * FROM
(
SELECT
[ID]
,[Product_Name]
,[Price]
,SUM([Price]) OVER (PARTITION BY [ID]) AS [Price_Total]
,ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [ID]) AS [Row_Number]
FROM @SampleTable
) MultipleRows
WHERE [Row_Number] = 1

ROW_NUMBER() needs a column to ORDER BY. If you only want to rely on the table's own index, it's OK to order by the ID column.
Hence your query is correct and you can go with it.
Another option is the WITH TIES clause. But again, if you use WITH TIES with an ORDER BY on the ID column, performance will be very poor. WITH TIES only performs well when there is a well-defined index and you order by that indexed column.
SELECT TOP 1 WITH TIES *
FROM (
SELECT [ID]
,[Product_Name]
,[Price]
,SUM([Price]) OVER (PARTITION BY [ID]) AS [Price_Total]
FROM @SampleTable
) TAB
ORDER BY ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY <IndexedColumn> DESC)
This query may help you a bit, but remember that it will not perform better than the query you wrote. It only reduces the number of lines of code.

One option is to use the WITH TIES clause, which needs no extra RN column.
Hopefully you have a proper sequence number or date that can be used in either the SUM() OVER or the final ROW_NUMBER() OVER.
Example:
SELECT TOP 1 WITH TIES *
FROM (
SELECT [ID]
,[Product_Name]
,[Price]
,SUM([Price]) OVER (PARTITION BY [ID]) AS [Price_Total]
FROM @SampleTable T1
) A
ORDER BY ROW_NUMBER() OVER (PARTITION BY [ID] ORDER BY [Price_Total] DESC)
Returns
ID Product_Name Price Price_Total
1 Product_1 100.00 600.00
2 Product_4 500.00 1000.00

There is no "top ROW_NUMBER" unless you have a column that defines the ordering.
If you just want an arbitrary row per ID, you can use the query below. To pick one deterministically, you would need to order by deterministic, unique criteria.
DECLARE @SampleTable TABLE
(
ID INT,
Price MONEY,
Product_Name VARCHAR(50),
INDEX cix CLUSTERED (ID)
);
INSERT INTO @SampleTable
VALUES (1,100,'Product_1'),
(1,200,'Product_2'),
(1,300,'Product_3'),
(2,500,'Product_4'),
(2,200,'Product_5'),
(2,300,'Product_6');
WITH T AS
(
SELECT *,
OrderingColumn = ROW_NUMBER() OVER (ORDER BY (SELECT 0))
FROM @SampleTable
)
)
SELECT ID,
SUBSTRING(MIN(CONCAT(STR(OrderingColumn), Product_Name)), 11, 50) AS Product_Name,
CAST(SUBSTRING(MIN(CONCAT(STR(OrderingColumn), Price)), 11, 50) AS MONEY) AS Price,
SUM(Price) AS Price_Total
FROM T
GROUP BY ID
The plan for this is pretty efficient as it is able to use the index ordered by id and has no additional sorts, spools, or passes through the table.
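For comparison, here is a minimal sketch of the "order by deterministic unique criteria" idea mentioned above. It runs on SQLite through Python's sqlite3 module rather than SQL Server (an assumption, needing SQLite 3.25+), it stores Price as an integer for simplicity, and it assumes that Product_Name is unique within each ID and is the ordering you want. Any other unique column would do.

```python
import sqlite3

# SQLite stand-in for the @SampleTable table variable (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE SampleTable (ID INTEGER, Price INTEGER, Product_Name TEXT)")
conn.executemany(
    "INSERT INTO SampleTable VALUES (?, ?, ?)",
    [(1, 100, "Product_1"), (1, 200, "Product_2"), (1, 300, "Product_3"),
     (2, 500, "Product_4"), (2, 200, "Product_5"), (2, 300, "Product_6")],
)

rows = conn.execute("""
    SELECT ID, Product_Name, Price, Price_Total
    FROM (
        SELECT ID, Product_Name, Price,
               SUM(Price) OVER (PARTITION BY ID) AS Price_Total,
               -- ordering by a unique column makes "row 1" well defined
               ROW_NUMBER() OVER (PARTITION BY ID ORDER BY Product_Name) AS rn
        FROM SampleTable
    )
    WHERE rn = 1
    ORDER BY ID
""").fetchall()

for row in rows:
    print(row)
```

With ORDER BY [ID] inside the PARTITION BY [ID] window, as in the question, every row in a partition ties, so which row gets number 1 is not guaranteed.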

sql query that gets the difference between 2 recent rows for every row item that occurs more than once in a table

I need a SQL query that gets the difference between the two most recent rows for every value that occurs more than once in a table.
For example:
book  value  date
A     4      2017-07-17 09:16:44.480
A     2      2017-08-15 10:05:58.273
B     3      2017-04-15 10:05:58.273
C     2      2017-08-15 10:05:58.273
B     3      2017-04-13 10:05:58.273
B     3      2017-04-12 10:05:58.273
should return
A 2
B 0

Here is a solution:
SELECT book, MAX(value) - MIN(value) AS difference FROM (
SELECT book, value, ROW_NUMBER() OVER (PARTITION BY book ORDER BY date DESC) AS rownum FROM t
) AS a WHERE rownum <= 2 GROUP BY book HAVING MAX(rownum) >= 2
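To check this query against the sample data from the question, here is a small sketch that runs it on SQLite through Python's sqlite3 module. The SQLite engine (3.25+) and the trailing ORDER BY book are assumptions added only so the output is reproducible. The table name t matches the query above.

```python
import sqlite3

# SQLite stand-in for the SQL Server table (assumption); dates kept as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (book TEXT, value INTEGER, date TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [("A", 4, "2017-07-17 09:16:44.480"),
     ("A", 2, "2017-08-15 10:05:58.273"),
     ("B", 3, "2017-04-15 10:05:58.273"),
     ("C", 2, "2017-08-15 10:05:58.273"),
     ("B", 3, "2017-04-13 10:05:58.273"),
     ("B", 3, "2017-04-12 10:05:58.273")],
)

rows = conn.execute("""
    SELECT book, MAX(value) - MIN(value) AS difference
    FROM (
        SELECT book, value,
               ROW_NUMBER() OVER (PARTITION BY book ORDER BY date DESC) AS rownum
        FROM t
    ) AS a
    WHERE rownum <= 2              -- keep only the two most recent rows per book
    GROUP BY book
    HAVING MAX(rownum) >= 2        -- drop books that have a single row
    ORDER BY book
""").fetchall()

print(rows)
```

Note that MAX minus MIN gives the absolute difference between the two rows, so the sign of the change is lost.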

SELECT id_pk FROM [table] GROUP BY [fields you want to compare by] HAVING COUNT(*) > 1
This select returns the list of primary keys of the elements that are repeated.
Then, in another select, you might wrap it like this:
SELECT * FROM [table] WHERE id_pk IN (
SELECT id_pk FROM [table] GROUP BY [fields you want to compare by] HAVING COUNT(*) > 1) LIMIT 2
This works, but it may not be efficient, as I have not analysed its complexity. Note that LIMIT is MySQL/PostgreSQL syntax; SQL Server uses TOP instead.

Add a rownumber before calculating:
create table #test ([book] char(1), [value] int, [date] datetime)
insert into #test values ('A', 4, '2017-07-17 09:16:44.480')
insert into #test values ('A', 2, '2017-08-15 10:05:58.273')
insert into #test values ('B', 3, '2017-04-15 10:05:58.273')
insert into #test values ('C', 2, '2017-08-15 10:05:58.273')
insert into #test values ('B', 3, '2017-04-13 10:05:58.273')
insert into #test values ('B', 3, '2017-04-12 10:05:58.273')
;with cte as(
Select ROW_NUMBER () OVER (order by [book], [date] ) as rownumber, *
from #test)
select distinct [1].book, abs(first_value([1].[Value]) over (partition by [1].book order by [1].rownumber desc) - [2].val2) as [Difference]
from cte [1]
inner join
(select rownumber, book, first_value([Value]) over (partition by book order by rownumber desc) as val2
from cte) [2] on [1].book = [2].book and [1].rownumber < [2].rownumber

I would use analytic functions:
;with CTE as (
SELECT book
,value
,LAG(value) OVER (PARTITION BY book ORDER BY date) last_value
,ROW_NUMBER() OVER (PARTITION BY book ORDER BY date DESC) rn
FROM MyTable
)
SELECT book
,value - last_value as value_change
FROM CTE
WHERE rn = 1
AND last_value IS NOT NULL
LAG() was added in SQL Server 2012, but even if you're on a later version, your database must have its compatibility level set to 110 or higher for it to be available. Here's an alternative that should work on SQL Server 2005 or later, with a database compatibility level of 90 or higher.
;with CTE as (
SELECT book
,value
,ROW_NUMBER() OVER (PARTITION BY book ORDER BY date DESC) rn
FROM MyTable
)
SELECT c1.book
,c1.value - c2.value as value_change
FROM CTE c1
INNER JOIN CTE c2
ON c1.book = c2.book
WHERE c1.rn = 1
AND c2.rn = 2

Getting the last row from a ROW_NUMBER using SQL

I am thinking there is a better way to grab the last row from a row_number instead of doing multiple nesting using T-SQL.
I need the total number of orders and the last ordered date. Say I have the following:
DECLARE @T TABLE (PERSON_ID INT, ORDER_DATE DATE)
INSERT INTO @T VALUES(1, '2016/01/01')
INSERT INTO @T VALUES(1, '2016/01/02')
INSERT INTO @T VALUES(1, '2016/01/03')
INSERT INTO @T VALUES(2, '2016/01/01')
INSERT INTO @T VALUES(2, '2016/01/02')
INSERT INTO @T VALUES(3, '2016/01/01')
INSERT INTO @T VALUES(3, '2016/01/02')
INSERT INTO @T VALUES(3, '2016/01/03')
INSERT INTO @T VALUES(3, '2016/01/04')
What I want is:
PERSON_ID ORDER_DATE ORDER_CNT
1 2016-01-03 3
2 2016-01-02 2
3 2016-01-04 4
Is there a better way to do this besides the following:
SELECT *
FROM (
SELECT *
, ROW_NUMBER() OVER (PARTITION BY PERSON_ID ORDER BY ORDER_CNT DESC) AS LAST_ROW
FROM (
SELECT *
, ROW_NUMBER () OVER (PARTITION BY PERSON_ID ORDER BY ORDER_DATE) AS ORDER_CNT
FROM @T
) AS A
) AS B
WHERE LAST_ROW = 1

Yes, you can use this:
SELECT
PERSON_ID,
MAX(ORDER_DATE) AS ORDER_DATE,
COUNT(*) AS ORDER_CNT
FROM @T
GROUP BY PERSON_ID
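As a quick check that plain aggregation is enough here, the sketch below runs the same GROUP BY on SQLite through Python's sqlite3 module. The engine, the table name T, and the ISO-formatted text dates are assumptions made so it runs without SQL Server. The ORDER BY is added only for reproducible output.

```python
import sqlite3

# SQLite stand-in for the @T table variable (assumption).
# Dates are stored as ISO 'YYYY-MM-DD' text so MAX() compares them correctly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE T (PERSON_ID INTEGER, ORDER_DATE TEXT)")
conn.executemany(
    "INSERT INTO T VALUES (?, ?)",
    [(1, "2016-01-01"), (1, "2016-01-02"), (1, "2016-01-03"),
     (2, "2016-01-01"), (2, "2016-01-02"),
     (3, "2016-01-01"), (3, "2016-01-02"), (3, "2016-01-03"), (3, "2016-01-04")],
)

rows = conn.execute("""
    SELECT PERSON_ID,
           MAX(ORDER_DATE) AS ORDER_DATE,   -- last ordered date
           COUNT(*)        AS ORDER_CNT     -- total number of orders
    FROM T
    GROUP BY PERSON_ID
    ORDER BY PERSON_ID
""").fetchall()

for row in rows:
    print(row)
```

No ROW_NUMBER is needed because the count and the latest date are both simple aggregates over the person's rows.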

SELECT a.PERSON_ID
, a.ORDER_DATE
, a.ORDER_CNT
FROM
(
SELECT PERSON_ID
, ORDER_DATE
, rn = ROW_NUMBER () OVER (PARTITION BY PERSON_ID ORDER BY ORDER_DATE DESC)
, ORDER_CNT = COUNT(ORDER_DATE) OVER (PARTITION BY PERSON_ID)
FROM @T
) AS a
WHERE rn = 1
ORDER BY a.PERSON_ID;

Get the smallest date from table with unique records in sql server

I have a table, say VendorReport, with three columns: ID, PrefixId, and Download_date.
The data in my table is as follows:
ID  PrefixId  Download_date
1   VIS017    28-09-2012
2   VIS028    29-09-2012
3   VIS035    29-09-2012
4   VIS028    30-09-2012
5   VIS028    29-09-2012
6   VIS028    01-10-2012
7   VIS025    30-09-2012
I want the unique PrefixId records with the smallest date, as shown below:
1   VIS017    28-09-2012
2   VIS028    29-09-2012
3   VIS035    29-09-2012
4   VIS025    30-09-2012
So I have tried this query, but I'm not getting the expected result:
select VendorReport.PrefixId, VendorReport.Download_Date
from VendorReport
join (select PrefixId, MIN(Download_Date) d_date from VendorReport group by PrefixId) t2
on VendorReport.PrefixId = t2.PrefixId
order by VendorReport.Download_Date asc

I'm new to SQL Server.
Please try this:
select prefixId,min(download_date) as download_date from #abc group by prefixId order by prefixId asc

It's not clear what you want to get. Hope this will help:
WITH T AS
(
select
VendorReport.*,
ROW_NUMBER() OVER (PARTITION BY PrefixID
ORDER BY Download_date, ID) as RowNum
from VendorReport
)
SELECT ID,PrefixId, Download_date
FROM T
WHERE RowNum=1
Order by Download_Date DESC
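Here is a small sketch of the same ROW_NUMBER pattern on the question's data, run on SQLite through Python's sqlite3 module. The engine and the ISO-formatted dates are assumptions, and the final ORDER BY is ascending here (unlike the DESC above) to match the listing the asker expects.

```python
import sqlite3

# SQLite stand-in for the VendorReport table (assumption); dates as ISO text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE VendorReport (ID INTEGER, PrefixId TEXT, Download_date TEXT)")
conn.executemany(
    "INSERT INTO VendorReport VALUES (?, ?, ?)",
    [(1, "VIS017", "2012-09-28"), (2, "VIS028", "2012-09-29"),
     (3, "VIS035", "2012-09-29"), (4, "VIS028", "2012-09-30"),
     (5, "VIS028", "2012-09-29"), (6, "VIS028", "2012-10-01"),
     (7, "VIS025", "2012-09-30")],
)

rows = conn.execute("""
    WITH T AS (
        SELECT VendorReport.*,
               -- earliest date first; ID breaks ties between equal dates
               ROW_NUMBER() OVER (PARTITION BY PrefixId
                                  ORDER BY Download_date, ID) AS RowNum
        FROM VendorReport
    )
    SELECT ID, PrefixId, Download_date
    FROM T
    WHERE RowNum = 1
    ORDER BY Download_date, ID
""").fetchall()

for row in rows:
    print(row)
```

Unlike a plain GROUP BY with MIN, this keeps the original ID of the earliest row for each PrefixId, so VIS025 comes back with ID 7 rather than a renumbered 4.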

Here you go
create table #VendorReport(
ID int,
PrefixId nvarchar(50),
Download_date datetime
)
insert into #VendorReport values(1,'IS017','2012-09-28');
insert into #VendorReport values(2,'IS028','2012-09-29');
insert into #VendorReport values(3,'IS035','2012-09-29');
insert into #VendorReport values(4,'IS028','2012-09-30');
insert into #VendorReport values(5,'IS028','2012-09-29');
insert into #VendorReport values(6,'IS028','2012-10-01');
insert into #VendorReport values(7,'IS025','2012-09-30');
select * from #VendorReport
select ROW_NUMBER() OVER(ORDER BY PrefixId) as Id, PrefixId, min(Download_date) as Download_date from #VendorReport group by PrefixId
drop table #VendorReport

Try this:
select Row_number() over ( order by x.Download_date),x.PrefixId,x.Download_date
from
(
select PrefixId,Min(Download_date) Download_date
from
VendorReport
group by Prefixid
) x

How to use RANK() in SQL Server

I have a problem using RANK() in SQL Server.
Here’s my code:
SELECT contendernum,
totals,
RANK() OVER (PARTITION BY ContenderNum ORDER BY totals ASC) AS xRank
FROM (
SELECT ContenderNum,
SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a
The results for that query are:
contendernum totals xRank
1 196 1
2 181 1
3 192 1
4 181 1
5 179 1
What my desired result is:
contendernum totals xRank
1 196 1
2 181 3
3 192 2
4 181 3
5 179 4
I want to rank the results based on totals. If several rows have the same value, such as 181, they should share the same xRank.

Change:
RANK() OVER (PARTITION BY ContenderNum ORDER BY totals ASC) AS xRank
to:
RANK() OVER (ORDER BY totals DESC) AS xRank
You might also want to have a look at the difference between RANK (Transact-SQL) and DENSE_RANK (Transact-SQL):
RANK (Transact-SQL)
If two or more rows tie for a rank, each tied row receives the same
rank. For example, if the two top salespeople have the same SalesYTD
value, they are both ranked one. The salesperson with the next highest
SalesYTD is ranked number three, because there are two rows that are
ranked higher. Therefore, the RANK function does not always return
consecutive integers.
DENSE_RANK (Transact-SQL)
Returns the rank of rows within the partition of a result set, without
any gaps in the ranking. The rank of a row is one plus the number of
distinct ranks that come before the row in question.
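To make the quoted difference concrete, the sketch below ranks the totals from the question with both functions. It runs on SQLite through Python's sqlite3 module instead of SQL Server (an assumption, needing SQLite 3.25+), and the table name scores and the aliases rnk and dense_rnk are illustrative only.

```python
import sqlite3

# SQLite stand-in holding the already-summed totals from the question (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE scores (contendernum INTEGER, totals INTEGER)")
conn.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [(1, 196), (2, 181), (3, 192), (4, 181), (5, 179)],
)

rows = conn.execute("""
    SELECT contendernum, totals,
           RANK()       OVER (ORDER BY totals DESC) AS rnk,        -- gap after a tie
           DENSE_RANK() OVER (ORDER BY totals DESC) AS dense_rnk   -- no gap
    FROM scores
    ORDER BY contendernum
""").fetchall()

for row in rows:
    print(row)
```

Both functions give the two 181 totals rank 3, but RANK then gives 179 rank 5 while DENSE_RANK gives it rank 4, which is what the desired result in the question shows.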

To answer your question title, "How to use Rank() in SQL Server," this is how it works:
I will use this set of data as an example:
create table #tmp
(
column1 varchar(3),
column2 varchar(5),
column3 datetime,
column4 int
)
insert into #tmp values ('AAA', 'SKA', '2013-02-01 00:00:00', 10)
insert into #tmp values ('AAA', 'SKA', '2013-01-31 00:00:00', 15)
insert into #tmp values ('AAA', 'SKB', '2013-01-31 00:00:00', 20)
insert into #tmp values ('AAA', 'SKB', '2013-01-15 00:00:00', 5)
insert into #tmp values ('AAA', 'SKC', '2013-02-01 00:00:00', 25)
You have a partition, which basically specifies the grouping.
In this example, if you partition by column2, the rank function will create ranks for groups of column2 values. There will be different ranks for rows where column2 = 'SKA' than for rows where column2 = 'SKB', and so on.
The ranks are decided like this:
Within each partition, the rank of a row is one plus the number of rows that come before it according to the ORDER BY. The rank only increases when the ORDER BY value differs from that of the preceding rows. Rows with equal ORDER BY values tie and receive the same rank, so rows tied for first place are all ranked one.
Knowing this, if we only wanted to select one value from each group in column2, we could use this query:
with cte as
(
select *,
rank() over (partition by column2
order by column3) rnk
from #tmp
) select * from cte where rnk = 1 order by column3;
Result:
COLUMN1 | COLUMN2 | COLUMN3 |COLUMN4 | RNK
------------------------------------------------------------------------------
AAA | SKB | January, 15 2013 00:00:00+0000 |5 | 1
AAA | SKA | January, 31 2013 00:00:00+0000 |15 | 1
AAA | SKC | February, 01 2013 00:00:00+0000 |25 | 1

You have to use DENSE_RANK rather than RANK. The only difference is that it doesn't leave gaps. You also shouldn't partition by ContenderNum; otherwise you're ranking each contender in a separate group, so each one is ranked first within its own group!
SELECT contendernum,totals, DENSE_RANK() OVER (ORDER BY totals desc) AS xRank FROM
(
SELECT ContenderNum ,SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM dbo.Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a
order by contendernum
A hint for using StackOverflow, please post DDL and sample data so people can help you using less of their own time!
create table Cat1GroupImpersonation (
contendernum int,
criteria1 int,
criteria2 int,
criteria3 int,
criteria4 int);
insert Cat1GroupImpersonation select
1,196,0,0,0 union all select
2,181,0,0,0 union all select
3,192,0,0,0 union all select
4,181,0,0,0 union all select
5,179,0,0,0;

DENSE_RANK() is a rank with no gaps, i.e. it is “dense”.
select Name,EmailId,salary,DENSE_RANK() over(order by salary asc) from [dbo].[Employees]
RANK() leaves gaps in the ranking after ties.
select Name,EmailId,salary,RANK() over(order by salary asc) from [dbo].[Employees]

You have already grouped by ContenderNum, no need to partition again by it.
Use DENSE_RANK() and order by totals DESC.
In short:
SELECT contendernum,totals, DENSE_RANK()
OVER (ORDER BY totals DESC)
AS xRank
FROM
(
SELECT ContenderNum ,SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM dbo.Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a

SELECT contendernum,totals, RANK() OVER (ORDER BY totals ASC) AS xRank FROM
(
SELECT ContenderNum ,SUM(Criteria1+Criteria2+Criteria3+Criteria4) AS totals
FROM dbo.Cat1GroupImpersonation
GROUP BY ContenderNum
) AS a

RANK() is good, but it assigns the same rank to equal values. If you need a unique rank for every row, ROW_NUMBER() solves this problem:
ROW_NUMBER() OVER (ORDER BY totals DESC) AS xRank

Select T.Tamil, T.English, T.Maths, T.Total, Dense_Rank()Over(Order by T.Total Desc) as Std_Rank From (select Tamil,English,Maths,(Tamil+English+Maths) as Total From Student) as T