Need help with a SELECT query in SQL Server - sql-server

I have this query :
SELECT *
FROM
    (SELECT
        *,
        ROW_NUMBER() OVER (ORDER BY sort_by) AS row
    FROM table_name) a
WHERE
    row > start_row AND row <= limit_row
This query selects every column from table_name and returns the rows numbered from start_row (exclusive) to limit_row (inclusive), with the numbering determined by the sort_by column.
But I also need to add the condition WHERE column_name = column_value, and the data ordered by the sort_by column may need to be in either ascending or descending order.
My question is: where should I add the condition column_name = column_value and the ASC/DESC ordering in my query?
If my question isn't clear, please ask. Thanks.

SELECT *
FROM
(
    SELECT *,
        ROW_NUMBER() OVER (ORDER BY sort_by DESC) AS row
    FROM table_name
    WHERE column_name = column_value
) a
WHERE row > start_row
    AND row <= limit_row
ORDER BY a.row DESC
ROW_NUMBER uses the ORDER BY inside its OVER clause to decide how the rows are numbered, so that ordering is important to understand, especially if you are paging data. Typically, when paging, you want row 1 to be the newest record so that your first page holds the most recent data; this generally means the ORDER BY inside OVER is descending.
The outer ORDER BY only changes the order in which the rows are returned to you, so it acts purely as a display ordering. When paging, it would typically be ascending on the row number, because the numbering already runs from newest to oldest.
Also, if you are on SQL Server 2012 or later, the OFFSET/FETCH paging clause is available and performs much better (in my experience) than the row-number paging used in the past.
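To make the split between numbering order and display order concrete, here is a minimal sketch. The table variable @articles, its columns, and the sample rows are all hypothetical stand-ins for table_name, column_name, and sort_by; adjust them to your schema.

```sql
-- Hypothetical sample data, for illustration only.
DECLARE @articles TABLE (id INT PRIMARY KEY, category VARCHAR(20), published DATE);
INSERT INTO @articles (id, category, published) VALUES
    (1, 'news', '2024-01-01'),
    (2, 'news', '2024-01-05'),
    (3, 'blog', '2024-01-07'),
    (4, 'news', '2024-01-09'),
    (5, 'news', '2024-01-12');

-- First page of two rows: row numbers 1 and 2.
DECLARE @start_row INT = 0, @limit_row INT = 2;

SELECT a.id, a.published, a.rn
FROM (
    SELECT id, published,
           -- Numbering order: rn 1 is the newest matching row.
           ROW_NUMBER() OVER (ORDER BY published DESC) AS rn
    FROM @articles
    WHERE category = 'news'   -- filter is applied before the rows are numbered
) AS a
WHERE a.rn > @start_row AND a.rn <= @limit_row
ORDER BY a.rn ASC;            -- display order: follow the numbering
```

With this sample data the page contains ids 5 and 4, newest first. To page in the opposite direction, change DESC to ASC inside OVER and leave the outer ORDER BY on rn alone.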

Related

Row number for same value

The result of my SQL Server query returns 3 columns.
Select Id, InItemId, Qty
from Mytable
order by InItemId
I need to add a column, call it row, that starts from 1 and increases by 1 within each group of rows that share the same InItemId value.
So the result should be:
Thank you !
Use row_number():
select row_number() over (partition by initemid order by initemid) as row,
t.*
from t;
Note: There is no ordering within a given value of initemid. SQL tables represent unordered sets and there is no obvious column to use for ordering.
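If you do want a repeatable numbering inside each InItemId group, you need some column to break ties. The sketch below assumes Id is unique and is an acceptable tiebreaker, which the question does not confirm; the sample rows are made up.

```sql
-- Hypothetical sample rows mirroring the three columns from the question.
DECLARE @Mytable TABLE (Id INT PRIMARY KEY, InItemId INT, Qty INT);
INSERT INTO @Mytable (Id, InItemId, Qty) VALUES
    (1, 10, 5), (2, 10, 3), (3, 20, 7), (4, 10, 1), (5, 20, 2);

SELECT ROW_NUMBER() OVER (PARTITION BY InItemId   -- restart numbering per InItemId
                          ORDER BY Id)            -- assumed tiebreaker within the group
           AS [row],
       Id, InItemId, Qty
FROM @Mytable
ORDER BY InItemId, [row];
```

Here the rows with InItemId 10 (Ids 1, 2, 4) are numbered 1, 2, 3, and the rows with InItemId 20 (Ids 3, 5) are numbered 1, 2.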

SQL Server - Delete Duplicate Rows - how does Partition By affect this query?

I've been using the following inherited query where I'm trying to delete duplicate rows and I'm getting some unexpected results when first running it as a SELECT - I believe it has something to do with my lack of understanding of the Partition part of the statement:
WITH CTE AS (
    SELECT [Id],
        [Url],
        [Identifier],
        [Name],
        [Entity],
        [DOB],
        RN = ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name)
    FROM Data.Statistics
    WHERE Id = 2170
)
DELETE FROM CTE WHERE RN > 1
Can someone help me understand exactly what I'm doing with the Partition BY Name part of this? This doesn't limit the query in any way to only looking for duplicates in the Name field, correct? I need to ensure that it's looking for records where all 5 of the fields inside the CTE definition are the same for a record to be considered a duplicate.
ROW_NUMBER() OVER (PARTITION BY Name ORDER BY Name) doesn't make a lot of sense. You wouldn't ORDER BY the same thing you used in PARTITION BY since it will be the same value for everything in the partition, making the ORDER BY part useless.
Basically the CTE part of this query is saying to split the matching rows (those with [Id] = 2170) temporarily into groups for each distinct name, and within each group of rows with the same name, order those by name (which are obviously all the same value) and then return the row number within that sequence group as RN. Unique names will all have a row number of 1, because there is only one row with that name. Duplicate names will have row numbers 1, 2, 3, and so on. The order of those rows is undefined in this case because of the silly ORDER BY clause, but if you changed the ORDER BY to something meaningful, the row numbers would follow that sequence.
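For the requirement that a row only counts as a duplicate when all of the other fields match, one option is to list every one of those columns in PARTITION BY. This is a sketch under the assumption that Id is a unique key (so it is used only to decide which copy survives) and that the five fields meant are Url, Identifier, Name, Entity, and DOB. The table variable and its rows are hypothetical.

```sql
-- Hypothetical stand-in for Data.Statistics.
DECLARE @Statistics TABLE (
    Id INT PRIMARY KEY, Url VARCHAR(100), Identifier VARCHAR(20),
    Name VARCHAR(50), Entity VARCHAR(50), DOB DATE);
INSERT INTO @Statistics (Id, Url, Identifier, Name, Entity, DOB) VALUES
    (1, 'http://example.com/a', 'X1', 'Ann', 'Person', '1990-01-01'),
    (2, 'http://example.com/a', 'X1', 'Ann', 'Person', '1990-01-01'),  -- full duplicate of Id 1
    (3, 'http://example.com/b', 'X2', 'Ann', 'Person', '1990-01-01');  -- same Name, different Url

WITH CTE AS (
    SELECT Id,
           RN = ROW_NUMBER() OVER (
                    PARTITION BY Url, Identifier, Name, Entity, DOB  -- all five must match
                    ORDER BY Id)                                     -- lowest Id is kept
    FROM @Statistics
)
DELETE FROM CTE WHERE RN > 1;

SELECT Id, Url, Name FROM @Statistics ORDER BY Id;  -- Ids 1 and 3 remain
```

With PARTITION BY Name alone, all three sample rows would land in one group and two of them would be deleted, even though Id 3 differs in Url and Identifier.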

SQL Server Conditional Sort Performance Issue

I have a table with around 5 million rows. When I use a conditional sort on this table, the query takes around 25 seconds, but when I change the conditional sort to a fixed sort column, it takes 1 second. The only difference is shown below:
--takes 1 second
ROW_NUMBER() OVER (ORDER BY OrderId DESC) AS RowNumber
--takes around 25 seconds
CASE @SortColumn WHEN 'OrderId' THEN ROW_NUMBER() OVER (ORDER BY OrderId DESC) END AS RowNumber
Can anyone explain what SQL Server is doing in this scenario?
OrderId must be indexed. Thus in the first instance:
ROW_NUMBER() OVER (ORDER BY OrderId DESC) AS RowNumber
SQL Server does not need to perform a sort, because it has already performed an index scan on the OrderId column. It knows that the index is ordered by the column you want to order by, so it does not need to sort again.
However, in the second example SQL Server has to evaluate CASE @SortColumn WHEN 'OrderId' THEN ROW_NUMBER() OVER (ORDER BY OrderId DESC) END for each row. Thus it performs a Compute Scalar operation on each row to work out the result of the CASE expression. The results of this operation cannot be mapped to an index, as they do not represent a column, so a further sort operation is required. Over 5 million rows this is a very expensive operation.
If you were to run the queries over non-indexed columns:
--takes 25 seconds
ROW_NUMBER() OVER (ORDER BY NonIndexedColumn DESC) AS RowNumber
--takes around 25 seconds
CASE @SortColumn WHEN 'NonIndexedColumn' THEN ROW_NUMBER() OVER (ORDER BY NonIndexedColumn DESC) END AS RowNumber
then both queries would presumably run equally slowly, as SQL Server would have to sort in both instances (and could not just use a sorted index). Thus passing in a column to sort by is always going to end up with slow performance over a large number of rows if someone picks a non-indexed column. You therefore need to ensure your results are filtered down to a manageable number of rows before the ORDER BY is applied.
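One way to avoid the per-row CASE, which is not something the question or this answer tried, is to branch on the parameter outside the query so that each branch contains a plain ORDER BY on a single column. The sketch below only shows the shape; the table variable, the CustomerName column, and the sample rows are hypothetical, and whether it helps on your table depends on your indexes, so check the execution plan.

```sql
-- Hypothetical stand-in for the 5-million-row table.
DECLARE @Orders TABLE (OrderId INT PRIMARY KEY, CustomerName VARCHAR(50));
INSERT INTO @Orders (OrderId, CustomerName) VALUES
    (1, 'Carol'), (2, 'Alice'), (3, 'Bob');

DECLARE @SortColumn VARCHAR(30) = 'OrderId';

-- Each branch is an ordinary query with a fixed sort column,
-- so there is no CASE expression to evaluate per row.
IF @SortColumn = 'OrderId'
    SELECT OrderId, CustomerName,
           ROW_NUMBER() OVER (ORDER BY OrderId DESC) AS RowNumber
    FROM @Orders;
ELSE IF @SortColumn = 'CustomerName'
    SELECT OrderId, CustomerName,
           ROW_NUMBER() OVER (ORDER BY CustomerName DESC) AS RowNumber
    FROM @Orders;
```

The cost is one branch per sortable column, and a branch on a non-indexed column will still have to sort.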

How is ROW_NUMBER used with insertions?

I have multiple UNIONed statements in SQL Server, and it is very hard to find a unique column in the combined result.
I need a unique value for each row, so I used the ROW_NUMBER() function.
This result set is copied to another place (actually a Solr index).
The next time I run the same query, I need to pick up only the newly added rows.
So I need to confirm that newly added rows will be numbered after the last ROW_NUMBER value from the previous run.
In other words, does ROW_NUMBER number the results in insertion order if I don't add an ORDER BY clause?
If not (as I suspect), are there any alternatives?
Thanks.
Without seeing the SQL I can only give a general answer: SQL Server does not guarantee the order of rows returned by a SELECT statement without an ORDER BY clause, so the row number may not follow insertion order.
I guess you can do something like this:
;WITH cte AS
(
    SELECT *, rn = ROW_NUMBER() OVER (ORDER BY SomeColumn)
    FROM
    (
        /* Your Union Queries here */
    ) q
)
INSERT INTO Destination_Table
SELECT cte.*  -- only the source columns, not those of the joined Destination_Table
FROM cte
LEFT JOIN Destination_Table
    ON cte.Refrencing_Column = Destination_Table.Refrencing_Column
WHERE Destination_Table.Refrencing_Column IS NULL
I would suggest you consider 'timestamping' each row with the time it was inserted, or adding an identity column to the table.
But it sounds like what you want to do is get the current max id and then add the row number to it:
SELECT col1, col2, mid + ROW_NUMBER() OVER (ORDER BY smt) id
FROM (
    SELECT col1, col2, (SELECT MAX(id) FROM tbl) mid
    FROM query
) t
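The identity-column suggestion could look roughly like the sketch below. Everything here is hypothetical: the table variable, the Payload column, and the idea that the export job remembers the highest Id it has already copied.

```sql
-- Hypothetical source table with an identity key and an insert timestamp.
DECLARE @Source TABLE (
    Id INT IDENTITY(1,1) PRIMARY KEY,
    Payload VARCHAR(50),
    InsertedAt DATETIME2 NOT NULL DEFAULT SYSUTCDATETIME());

-- First batch, already exported.
INSERT INTO @Source (Payload) VALUES ('a'), ('b');
DECLARE @LastExportedId INT = (SELECT MAX(Id) FROM @Source);

-- Rows added after the previous export.
INSERT INTO @Source (Payload) VALUES ('c'), ('d');

-- Only the rows added since the last export.
SELECT Id, Payload, InsertedAt
FROM @Source
WHERE Id > @LastExportedId
ORDER BY Id;
```

Unlike ROW_NUMBER, the identity value is stored with the row, so it does not change between runs of the query. This only works if the key lives on the underlying tables; a number computed over the UNIONed result has the same problem as before.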

Row_Number Over Where RowNumber between

I'm trying to select certain rows from my table using ROW_NUMBER() OVER. However, SQL Server raises the error "Invalid column name 'ROWNUMBERS'". Can anyone correct me?
SELECT ROW_NUMBER() OVER (ORDER BY Price ASC) AS ROWNUMBERS, *
FROM Product
WHERE ROWNUMBERS BETWEEN @fromCount AND @toCount
Attempting to reference the aliased column in the WHERE clause does not work because of the logical query processing taking place. The WHERE is evaluated before the SELECT clause. Therefore, the column ROWNUMBERS does not exist when WHERE is evaluated.
The correct way to reference the column in this example would be:
SELECT a.*
FROM
    (SELECT ROW_NUMBER() OVER (ORDER BY Price ASC) AS ROWNUMBERS, *
    FROM Product) a
WHERE a.ROWNUMBERS BETWEEN @fromCount AND @toCount
For your reference, the logical order of operations is:
FROM
WHERE
GROUP BY
HAVING
SELECT
ORDER BY
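Here is a small self-contained sketch of that ordering, written with a CTE instead of a derived table. The table variable, the ProductId column, and the sample prices are hypothetical. The alias is defined in the CTE's SELECT, so the outer query's WHERE can see it, and the outer ORDER BY can use it as well.

```sql
-- Hypothetical stand-in for the Product table.
DECLARE @Product TABLE (ProductId INT PRIMARY KEY, Price DECIMAL(10, 2));
INSERT INTO @Product (ProductId, Price) VALUES
    (1, 9.99), (2, 4.50), (3, 19.00), (4, 1.25);

DECLARE @fromCount INT = 2, @toCount INT = 3;

WITH Numbered AS (
    SELECT ProductId, Price,
           ROW_NUMBER() OVER (ORDER BY Price ASC) AS RowNumbers
    FROM @Product
)
SELECT ProductId, Price, RowNumbers
FROM Numbered
WHERE RowNumbers BETWEEN @fromCount AND @toCount  -- alias already exists at this level
ORDER BY RowNumbers;                              -- ORDER BY is evaluated after SELECT
```

With these prices the query returns ProductId 2 (row 2) and ProductId 1 (row 3).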
There is another answer here that solves the specific error reported. However, I also want to address the wider problem. It looks a lot like what you are doing here is paging your results for display. If that is the case, and if you can use SQL Server 2012 or later, there is a better way now. Take a look at OFFSET/FETCH:
SELECT [First Name] + ' ' + [Last Name]
FROM Employees
ORDER BY [First Name]
OFFSET 10 ROWS FETCH NEXT 5 ROWS ONLY;
That would show the third page of a query where the page size is 5.
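If the page number and page size arrive as parameters, the offset can be computed from them. This is a minimal sketch; the table variable, the EmployeeId tiebreaker, and the variable names @PageNumber and @PageSize are assumptions for illustration.

```sql
-- Hypothetical stand-in for the Employees table.
DECLARE @Employees TABLE (
    EmployeeId INT PRIMARY KEY, FirstName VARCHAR(30), LastName VARCHAR(30));
INSERT INTO @Employees (EmployeeId, FirstName, LastName) VALUES
    (1, 'Ann', 'Lee'), (2, 'Bob', 'Ray'), (3, 'Cat', 'Fox'),
    (4, 'Dan', 'Orr'), (5, 'Eve', 'Kim'), (6, 'Fay', 'Ito');

DECLARE @PageNumber INT = 3, @PageSize INT = 2;

SELECT FirstName + ' ' + LastName AS FullName
FROM @Employees
ORDER BY FirstName, EmployeeId                 -- tiebreaker keeps pages stable
OFFSET (@PageNumber - 1) * @PageSize ROWS      -- skip the earlier pages
FETCH NEXT @PageSize ROWS ONLY;                -- take one page
```

With this sample data, page 3 at a page size of 2 returns 'Eve Kim' and 'Fay Ito'. Adding a unique column to the ORDER BY matters when the main sort column has duplicates, because otherwise the same row could show up on two pages.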
