SQL ORDER BY clause ERROR in the ROW_NUMBER - sql-server

I want to use the ROW_NUMBER() function to get the first and latest values.
I wrote the query below, but I got an error:
The ORDER BY clause is invalid in views, inline functions, derived tables, subqueries, and common table expressions, unless TOP, OFFSET or FOR XML is also specified.
Please help me solve the issue. The SQL query is below:
SELECT *
FROM(
SELECT OPP_ID,PRJ_ID,
ROW_NUMBER() OVER (PARTITION BY OPP_ID ORDER BY MAX(CREATION_DATE) DESC) AS RN
FROM OPPOR
GROUP BY OPP_ID,PRJ_ID
ORDER BY MAX(CREATION_DATE) DESC) OP
WHERE OP.RN = 1

The row_number function does its own partitioning and ordering, so there is no need for group by or order by in your subquery (order by won't work in subqueries, as you've seen). It is a little unclear whether you want to partition by opp_id or by opp_id and prj_id, but this should be what you're looking for:
SELECT *
FROM(
SELECT OPP_ID,PRJ_ID,
ROW_NUMBER() OVER (PARTITION BY OPP_ID ORDER BY CREATION_DATE DESC) AS RN
FROM OPPOR
) OP
WHERE OP.RN = 1
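Since the question asks for both the first and the latest values, here is a minimal sketch of how that could look. It assumes a table variable @OPPOR with made-up rows standing in for the real OPPOR table, and the RN_FIRST and RN_LATEST names are illustrative only. The multi-row VALUES insert needs SQL Server 2008 or later.

```sql
-- Hypothetical sample data; the real OPPOR table is assumed to have these columns.
DECLARE @OPPOR TABLE (OPP_ID int, PRJ_ID int, CREATION_DATE datetime);

INSERT INTO @OPPOR (OPP_ID, PRJ_ID, CREATION_DATE)
VALUES (1, 10, '20200101'),
       (1, 11, '20200301'),
       (1, 12, '20200201'),
       (2, 20, '20200501');

SELECT OPP_ID, PRJ_ID, CREATION_DATE, RN_FIRST, RN_LATEST
FROM (
    SELECT OPP_ID, PRJ_ID, CREATION_DATE,
           -- Number each OPP_ID's rows from the oldest and from the newest.
           ROW_NUMBER() OVER (PARTITION BY OPP_ID ORDER BY CREATION_DATE ASC)  AS RN_FIRST,
           ROW_NUMBER() OVER (PARTITION BY OPP_ID ORDER BY CREATION_DATE DESC) AS RN_LATEST
    FROM @OPPOR
) OP
WHERE OP.RN_FIRST = 1 OR OP.RN_LATEST = 1
ORDER BY OPP_ID, CREATION_DATE;  -- ORDER BY is allowed on the outermost query
```

With this sample data, OPP_ID 1 returns PRJ_ID 10 (oldest) and PRJ_ID 11 (newest), and OPP_ID 2 returns its single row once.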

Related

Get the id of the row with the max value with two grouping

We have a data structure with four columns:
ContractorName, ProjectCode, InvoiceID, OrderID
We want to group the data by both the ContractorName and ProjectCode columns, and then get the InvoiceID of the row with MAX(OrderID) in each group.
You could use ROW_NUMBER:
SELECT ContractorName, ProjectName, OrderId, InvoiceId
FROM (SELECT *, ROW_NUMBER() OVER(PARTITION BY ContractorName, ProjectName
ORDER BY OrderId DESC) AS rn
FROM tab
) AS sub
WHERE rn = 1;
ROW_NUMBER() is what I would call the canonical solution. In many cases, an old-fashioned solution has better performance:
select t.*
from t
where t.orderid = (select max(t2.orderid)
from t t2
where t2.contractorname = t.contractorname and
t2.projectname = t.projectname
);
This is especially true if there is an index on (contractorname, projectname, orderid).
Why is this faster? Basically, SQL Server can scan the table doing a lookup in an index. The lookup is really fast because the index is designed for it, so the scan is just a little faster than a full table scan.
When using row_number(), SQL Server has to scan the table to calculate the row number (and that can use the index, so it might be fast). But then it has to go back to the table to fetch the columns and apply the where clause. So, even if it uses an index, it is doing more work.
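For illustration, the index mentioned above could be set up roughly as follows. This is only a sketch: the temp table #t stands in for the real table t, the column types are guesses, and the index name is made up. Whether the optimizer actually prefers this plan depends on the data.

```sql
-- Hypothetical stand-in for table t, with only the columns used in the queries above.
CREATE TABLE #t
(
    contractorname varchar(100),
    projectname    varchar(100),
    orderid        int,
    invoiceid      int
);

-- Hypothetical index name. The equality columns come first and orderid last,
-- so the largest orderid of each (contractorname, projectname) pair sits at
-- the end of that pair's range in the index.
CREATE INDEX IX_t_contractor_project_order
    ON #t (contractorname, projectname, orderid);

-- The correlated subquery can then look up MAX(orderid) in the index for each outer row.
SELECT t.*
FROM #t t
WHERE t.orderid = (SELECT MAX(t2.orderid)
                   FROM #t t2
                   WHERE t2.contractorname = t.contractorname
                     AND t2.projectname = t.projectname);

DROP TABLE #t;
```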
EDIT:
I should also point out that this can be done without a subquery:
select distinct contractorname, projectname,
max(orderid) over (partition by contractorname, projectname) as latest_order,
first_value(invoiceid) over (partition by contractorname, projectname order by orderid desc) as latest_invoice
from t;
Unfortunately, SQL Server doesn't offer first_value() as an aggregation function, but you can use select distinct and get the same effect.

How do I get the rank of a specific row in SQL?

I tried to use this query to get the rank of each vendor by rating:
SELECT vendorid, rating, RANK() over(ORDER BY rating DESC)ranking
FROM vendors
But I want to get the ranking of a specific vendor, so I added a WHERE clause like this:
SELECT vendorid, rating, RANK() over(ORDER BY rating DESC)ranking
FROM vendors
WHERE vendorid=1
But it returns a ranking of 1 even though that vendor is not ranked first.
How should I fix this?
In this case
SELECT
vendorid, rating,
RANK() OVER (ORDER BY rating DESC) ranking
FROM
vendors
WHERE
vendorid = 1
RANK() is calculated after the WHERE clause is applied, so SQL Server filters first and then assigns ranks to whatever rows are left. With only one row left, the rank is always 1.
How to fix this?
Use a subquery or a CTE, as below:
;With cte as
(
SELECT
vendorid, rating,
RANK() OVER (ORDER BY rating DESC) ranking
FROM
vendors
)
select *
from cte
where vendorid = 1
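To see the difference side by side, here is a small sketch that runs both forms against the same data. The table variable @vendors and its three rows are made up for illustration.

```sql
-- Hypothetical sample data standing in for the vendors table.
DECLARE @vendors TABLE (vendorid int, rating int);

INSERT INTO @vendors (vendorid, rating)
VALUES (1, 70), (2, 90), (3, 80);

-- Filtering first: only vendor 1 is left when RANK() runs, so ranking is 1.
SELECT vendorid, rating,
       RANK() OVER (ORDER BY rating DESC) AS ranking
FROM @vendors
WHERE vendorid = 1;

-- Ranking first, filtering afterwards: vendor 1 keeps its rank among all vendors.
WITH cte AS
(
    SELECT vendorid, rating,
           RANK() OVER (ORDER BY rating DESC) AS ranking
    FROM @vendors
)
SELECT vendorid, rating, ranking
FROM cte
WHERE vendorid = 1;
```

With these rows, the first query reports ranking 1 and the second reports ranking 3, because vendor 1 has the lowest of the three ratings.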

Replace Group By clause with any other clause

In the query below, I am using a GROUP BY clause to get the list of most recently updated records based on the updated date. However, I would like to write the query without a GROUP BY clause for internal reasons. Can anyone please help me solve this?
SELECT Proj_UpdatedDate,
Proj_UpdatedBy
FROM ProjectProgress PP
WHERE Proj_UpdatedDate IN (SELECT MAX(Proj_UpdatedDate)
FROM ProjectProgress
GROUP BY
Proj_ProjectID)
ORDER BY
Proj_ProjectID
Using TOP 1 should give you the same result assuming you meant the MAX(Proj_UpdatedDate):
SELECT Proj_UpdatedDate,
Proj_UpdatedBy
FROM ProjectProgress PP
WHERE Proj_UpdatedDate IN (SELECT TOP 1 Proj_UpdatedDate
FROM ProjectProgress
ORDER BY Proj_UpdatedDate DESC)
ORDER BY
Proj_ProjectID
However, your query actually returns multiple dates, since it is grouped by Proj_ProjectID (the max date for each project). Is that your desired outcome, i.e. a list of the dates on which the projects were updated, and by whom?
If so, try using ROW_NUMBER():
SELECT Proj_UpdatedDate, Proj_UpdatedBy
FROM (
SELECT ROW_NUMBER() OVER (PARTITION BY Proj_ProjectID ORDER BY Proj_UpdatedDate DESC) rn,
Proj_UpdatedDate,
Proj_UpdatedBy
FROM ProjectProgress
) t
WHERE rn = 1
This assumes you are running SQL Server 2005 or greater.
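The difference between the TOP 1 form and the per-project form can be checked on a few rows. The following is a sketch only: the table variable @ProjectProgress and its contents are invented, and the ROW_NUMBER() query orders by Proj_UpdatedDate so that rn = 1 is the most recent update of each project.

```sql
-- Hypothetical sample data standing in for ProjectProgress.
DECLARE @ProjectProgress TABLE
(
    Proj_ProjectID   int,
    Proj_UpdatedDate datetime,
    Proj_UpdatedBy   varchar(50)
);

INSERT INTO @ProjectProgress (Proj_ProjectID, Proj_UpdatedDate, Proj_UpdatedBy)
VALUES (1, '20210101', 'alice'),
       (1, '20210301', 'bob'),
       (2, '20210201', 'carol');

-- TOP 1: a single, global maximum date, so only one project's row matches.
SELECT Proj_ProjectID, Proj_UpdatedDate, Proj_UpdatedBy
FROM @ProjectProgress
WHERE Proj_UpdatedDate IN (SELECT TOP 1 Proj_UpdatedDate
                           FROM @ProjectProgress
                           ORDER BY Proj_UpdatedDate DESC);

-- ROW_NUMBER(): the latest row of every project, without GROUP BY.
SELECT Proj_ProjectID, Proj_UpdatedDate, Proj_UpdatedBy
FROM (
    SELECT ROW_NUMBER() OVER (PARTITION BY Proj_ProjectID
                              ORDER BY Proj_UpdatedDate DESC) AS rn,
           Proj_ProjectID, Proj_UpdatedDate, Proj_UpdatedBy
    FROM @ProjectProgress
) t
WHERE rn = 1
ORDER BY Proj_ProjectID;
```

On this data the first query returns only bob's update for project 1, while the second returns bob's row for project 1 and carol's row for project 2.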
Good luck.

T-SQL Group skip and take

Consider the following tables:
How can I skip and take groups from the table? I tried using ROW_NUMBER(), but it doesn't help. Any ideas?
The query I used:
;WITH cte AS (SELECT Room.Id, Room.RoomName,
ROW_NUMBER() OVER
(ORDER BY Room.Id) AS RN
FROM Room INNER JOIN
RoomDetails ON Room.Id = RoomDetails.RoomId)
SELECT Id, RoomName
FROM cte
WHERE RN = 1
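You need DENSE_RANK() rather than ROW_NUMBER(), so that every row belonging to the same room gets the same number. Note that ranking functions require an ORDER BY in the OVER clause:
dense_rank() over (order by Room.Id) as rn
See the documentation on windowing functions for more examples.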
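One way to skip and take whole groups is to number the groups with DENSE_RANK() ordered by the room id, so that all detail rows of a room share one number, and then filter on that number. The sketch below assumes made-up table variables and data; only Id, RoomName and RoomId come from the question, while the Detail column, the GroupNo alias and the @skip and @take variables are hypothetical.

```sql
-- Hypothetical sample data; only Id, RoomName and RoomId appear in the question.
DECLARE @Room TABLE (Id int, RoomName varchar(50));
DECLARE @RoomDetails TABLE (RoomId int, Detail varchar(50));

INSERT INTO @Room (Id, RoomName)
VALUES (1, 'Red'), (2, 'Green'), (3, 'Blue');

INSERT INTO @RoomDetails (RoomId, Detail)
VALUES (1, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (3, 'e');

-- Hypothetical paging parameters: skip one room, then take one room.
DECLARE @skip int, @take int;
SET @skip = 1;
SET @take = 1;

WITH cte AS
(
    SELECT r.Id, r.RoomName, d.Detail,
           -- Every row of the same room gets the same group number.
           DENSE_RANK() OVER (ORDER BY r.Id) AS GroupNo
    FROM @Room r
    INNER JOIN @RoomDetails d ON r.Id = d.RoomId
)
SELECT Id, RoomName, Detail
FROM cte
WHERE GroupNo > @skip AND GroupNo <= @skip + @take
ORDER BY Id;
```

With these values the query skips room 1 and returns every detail row of room 2, which here is a single row.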

Cannot use alias in ROW_NUMBER() OVER in SQL Server?

I have to create a row_number column ordered by a grouped sum. When I use this SQL:
select Sales.Name, SUM(Sales.Bill) as billsum, ROW_NUMBER() over (order by billsum DESC) as rn
from Sales group by Sales.Name
It reports an error because ROW_NUMBER() OVER cannot resolve the "billsum" alias, so I have to write:
select Sales.Name, SUM(Sales.Bill) as billsum, ROW_NUMBER() over (order by SUM(Sales.Bill) DESC) as rn
from Sales group by Sales.Name
So here I write SUM(Sales.Bill) twice. Is there any way to use the alias here?
The MSDN docs for the T-SQL OVER clause say:
value_expression cannot refer to expressions or aliases in the select list.
As other members have already pointed out, you have to use either a CTE or a subquery.
This is not specific to the Row_Number() function: in T-SQL a column alias cannot be referenced by another expression in the same SELECT list, so you either use one of those approaches or repeat the expression as you did in your post. I hope that makes sense!
Possible workarounds are to use a CTE or a subquery:
SELECT Name, billsum, ROW_NUMBER() OVER (ORDER BY billsum DESC) AS rn
FROM
( SELECT Sales.Name, SUM(Sales.Bill) AS billsum
FROM Sales
GROUP BY Sales.Name
) tmp
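The CTE form of the same workaround might look like the sketch below. The table variable @Sales and its rows are made up for illustration, and the CTE name totals is arbitrary.

```sql
-- Hypothetical sample data standing in for the Sales table.
DECLARE @Sales TABLE (Name varchar(50), Bill money);

INSERT INTO @Sales (Name, Bill)
VALUES ('Ann', 10), ('Bob', 25), ('Ann', 30);

-- The CTE computes the sum once; the outer query can then refer to billsum by name.
WITH totals AS
(
    SELECT Name, SUM(Bill) AS billsum
    FROM @Sales
    GROUP BY Name
)
SELECT Name, billsum,
       ROW_NUMBER() OVER (ORDER BY billsum DESC) AS rn
FROM totals;
```

Here Ann (total 40) gets rn 1 and Bob (total 25) gets rn 2, and SUM(Bill) is written only once.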
-- Reorder after cutting out qty = 0.
SELECT *,ROW_NUMBER() OVER (partition by claimno ORDER BY itemno) as 'alias name'
from dbo.OrderCol
where QTY <> 0
