SQL Server 2012 CTE Bug? - sql-server

I think I found a bug in SQL Server 2012. I have the following complex query that uses a CTE with row_number to implement paging, followed by a subquery that returns the total row count in the same query:
with data as ( ...complex query with row_number() ... as rowNumber... )
select *, (select count(*) from data) as totalRows
from data
where rowNumber between 1 and 10
What I'm finding in my specific query is that if the final query returns 5 rows, the totalRows comes back as 8. But there are only 5 rows. How could totalRows be bigger than the number of rows returned? I've tried query hints like disabling parallel execution plans but not only is it slower, it's still not right. Could I be doing something wrong or is this a bug? Is there another way to get the count back in one query?

Without knowing the internals of your query or whether your partitioning statements are well written, I'd suggest you are getting differences because of your WHERE clauses. You need to be consistent and use either:
with data as ( ...complex query with row_number() ... as rowNumber... )
select *, (select count(*) from data where rowNumber between 1 and 10) as totalRows
from data
where rowNumber between 1 and 10
or
with data as ( ...complex query with row_number() ... as rowNumber... )
select *, (select count(*) from data) as totalRows
from data
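On the last part of the question (getting the count back in one query), a window aggregate is one option worth trying. The sketch below assumes a hypothetical table dbo.Orders with columns OrderId and CustomerName standing in for the elided complex query. COUNT(*) OVER () is computed alongside ROW_NUMBER(), so the CTE is not referenced a second time. The unique tie-breaker in the ORDER BY is an added assumption, meant to keep the numbering deterministic.

```sql
-- Hypothetical table and columns; substitute the real complex query.
WITH data AS (
    SELECT o.OrderId,
           o.CustomerName,
           -- Order by a unique key so row numbers are stable between evaluations.
           ROW_NUMBER() OVER (ORDER BY o.CustomerName, o.OrderId) AS rowNumber,
           -- Empty OVER () counts every row of the CTE, before the paging filter.
           COUNT(*) OVER () AS totalRows
    FROM dbo.Orders AS o
)
SELECT OrderId, CustomerName, rowNumber, totalRows
FROM data
WHERE rowNumber BETWEEN 1 AND 10;
```

Here totalRows is the size of the whole CTE result, not the number of rows on the current page.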

Related

One MS SQL server returns wrong order for a query

We have multiple MSSQL servers that have the same copy of a database. The query below returns a valid order on all servers except one. I double-checked the design of the tables and everything looks identical, except that a couple of servers are missing an index.
The query is generated by doctrine
WITH dctrn_cte AS (
SELECT TOP 10 a0_.Priority
FROM PROJECTS a0_
WHERE a0_.ProjectID = 1234
AND (a0_.Check1 > 0
OR
a0_.Check2 > 0)
AND a0_.Active = 1
ORDER BY a0_.Priority DESC)
SELECT *
FROM (
SELECT *, ROW_NUMBER()
OVER (ORDER BY (SELECT 0)) AS doctrine_rownum FROM dctrn_cte
) AS doctrine_tbl
WHERE doctrine_rownum BETWEEN 1 AND 10 ORDER BY doctrine_rownum ASC
Every time the query is executed on that particular server, it gives a random order - it is completely ignoring the ORDER BY part.
Your query has one final ORDER BY clause: doctrine_rownum. This is an alias for a column whose value is not deterministic:
ROW_NUMBER()
OVER (ORDER BY (SELECT 0)) AS doctrine_rownum
Therefore any order of the result is a correct order. All your servers return the correct result. Select isn't broken.
PS. You also have an ORDER BY inside the CTE, that is irrelevant to the final order, as it does not impose any order on the final result nor on the doctrine_rownum value.
The query is generated by doctrine
The query is generated incorrectly by doctrine, whatever this doctrine is.
Adding OPTION (MAXDOP 1) fixed the order on that server.
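MAXDOP 1 only removes parallelism; it does not make ORDER BY (SELECT 0) meaningful, so the order may still change with a different plan. A more robust form would number the rows by the column that actually defines the order. The following is a hand-written sketch of what the query could look like, not Doctrine's actual output:

```sql
WITH dctrn_cte AS (
    SELECT TOP 10 a0_.Priority
    FROM PROJECTS a0_
    WHERE a0_.ProjectID = 1234
      AND (a0_.Check1 > 0 OR a0_.Check2 > 0)
      AND a0_.Active = 1
    ORDER BY a0_.Priority DESC   -- only decides which 10 rows TOP keeps
)
SELECT *
FROM (
    SELECT *,
           -- Number rows by the real sort column instead of a constant.
           ROW_NUMBER() OVER (ORDER BY Priority DESC) AS doctrine_rownum
    FROM dctrn_cte
) AS doctrine_tbl
WHERE doctrine_rownum BETWEEN 1 AND 10
ORDER BY doctrine_rownum ASC;
```

Rows with equal Priority can still be numbered in either order; adding a unique column to both ORDER BY clauses would remove that ambiguity.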

Select distinct records from MS SQL database when querying row numbers

This query returns 5 identical products, because there are 5 keywords associated with the resulting product:
SELECT
products.field1,
products.field2
FROM products,
keywords
WHERE products.itemnum = keywords.itemnum
AND products.itemnum = 123
ORDER BY products.field1, products.field2
If I put a "distinct" after "select", then I get 1 result, which is what I want.
However, when I set up my query like this:
SELECT
*
FROM (SELECT
ROW_NUMBER() OVER (ORDER BY products.field1, products.field2) AS rownum,
products.field1,
products.field2
FROM products,
keywords
WHERE products.itemnum = keywords.itemnum
AND products.itemnum = 123) AS qryresults
WHERE rownum >= 1
AND rownum <= 20
I get 5 identical products again. There doesn't seem to be anywhere I can put a "distinct" statement to limit it to 1 result. I'm sure the reason is that by adding the row numbers, that doesn't make the results "distinct" anymore.
I am using the technique shown in this query to limit potentially large search results to only 20 records at a time, which greatly reduces overhead and speeds up my query. So if there are 100,000 results, I can easily set this up to return records 90,000-90,020, for example.
MySQL has this kind of thing built-in, but with MS SQL this is the workaround.
However, I am having trouble figuring out how to make it work when I am combining the keywords table.
If I replace the * with a list of columns, then I get an error:
The multi-part identifier could not be bound.
I'm not sure what else to try. Is there a way to correct this?
Thank you.
Use a CTE to separate the distinct and the ROW_NUMBER() function:
with cte as (select distinct products.field1
, products.field2
from products, keywords
where products.itemnum=keywords.itemnum and products.itemnum=123),
row_n as (select field1
, field2
, row_number() over (order by field1, field2) as rownum
from cte)
select field1, field2
from row_n
where rownum>=1 and rownum<=20
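Another possible approach, sketched here under the assumption that the keywords table is only used to check that a matching keyword exists, is to replace the join with EXISTS. A semi-join does not multiply product rows, so no DISTINCT is needed:

```sql
SELECT field1, field2
FROM (SELECT
        ROW_NUMBER() OVER (ORDER BY p.field1, p.field2) AS rownum,
        p.field1,
        p.field2
      FROM products AS p
      WHERE p.itemnum = 123
        -- EXISTS keeps each product row once, however many keywords match.
        AND EXISTS (SELECT 1
                    FROM keywords AS k
                    WHERE k.itemnum = p.itemnum)) AS qryresults
WHERE rownum >= 1
  AND rownum <= 20;
```

Unlike the DISTINCT version, this keeps two different products that happen to share the same field1 and field2 values as separate rows.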

How is ROW_NUMBER used with insertions?

I have multiple UNIONed statements in MS SQL Server, and it is very hard to find a unique column in the result.
I need a unique value for each row, so I've used the ROW_NUMBER() function.
This result set is copied to another place (actually a SOLR index).
The next time I run the same query, I need to pick up only the newly added rows.
So I need to confirm that the newly added rows will be numbered after the last row_number value from the previous run.
In other words, does the ROW_NUMBER function order the results by insertion order if I don't add any ORDER BY clause?
If not (as I suspect), are there any alternatives?
Thanks.
Without seeing the SQL I can only give the general answer: MS SQL does not guarantee the order of SELECT statements without an ORDER BY clause, so the row_number may not follow the insertion order.
I guess you can do something like this..
;WITH
cte
AS
(
SELECT * , rn = ROW_NUMBER() OVER (ORDER BY SomeColumn)
FROM
(
/* Your Union Queries here*/
)q
)
INSERT INTO Destination_Table
/* CTE.* includes rn; list the columns explicitly if Destination_Table has no such column */
SELECT CTE.* FROM
CTE LEFT JOIN Destination_Table
ON CTE.Refrencing_Column = Destination_Table.Refrencing_Column
WHERE Destination_Table.Refrencing_Column IS NULL
I would suggest you consider 'timestamping' the row with the time it was inserted. Or adding an identity column to the table.
But what it sounds like you want to do is get current max id and then add the row_number to it.
Select col1, col2, mid + row_number() over(order by smt) id
From (
Select col1, col2, (select max(id) from tbl) mid
From query
) t
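The timestamping suggestion above could look roughly like the following. This is a sketch only: dbo.SourceTable, InsertedAt and the constraint name are hypothetical, and with several UNIONed source tables each one would need its own column.

```sql
-- dbo.SourceTable and InsertedAt are hypothetical names.
-- Existing rows all receive the time at which the column is added.
ALTER TABLE dbo.SourceTable
    ADD InsertedAt datetime2 NOT NULL
        CONSTRAINT DF_SourceTable_InsertedAt DEFAULT SYSUTCDATETIME();
GO  -- batch separator (SSMS/sqlcmd), so the next batch can see the new column

-- Value saved from the previous run (hard-coded here for illustration).
DECLARE @LastRun datetime2 = '2024-01-01T00:00:00';

-- Only rows inserted since the previous run.
SELECT col1, col2, InsertedAt
FROM dbo.SourceTable
WHERE InsertedAt > @LastRun
ORDER BY InsertedAt;
```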

MSSQL 2008 R2 Selecting rows within a certain range - Paging - What is the best way

Currently this SQL query is able to select the range of rows I have specified. Is there a better approach?
select * from (select *, ROW_NUMBER() over (order by Id desc) as RowId
from tblUsersMessages ) dt
where RowId between 10 and 25
Depends on your indexes.
Sometimes this can be better
SELECT *
FROM tblUsersMessages
WHERE Id IN (SELECT Id
FROM (select Id,
ROW_NUMBER() over (order by Id desc) as RowId
from tblUsersMessages) dt
WHERE RowId between 10 and 25)
This helps if a narrower index exists that can be used to quickly find the Id values within the range. See my answer here for an example that demonstrates the type of issue that can arise.
You need to check the execution plans and output of SET STATISTICS IO ON for your specific case.
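One way to run that comparison, as a minimal sketch: execute both variants in the same session with I/O statistics switched on, then compare the "logical reads" figures reported for tblUsersMessages in the messages output.

```sql
SET STATISTICS IO ON;

-- Variant 1: number every full row, then filter.
SELECT *
FROM (SELECT *, ROW_NUMBER() OVER (ORDER BY Id DESC) AS RowId
      FROM tblUsersMessages) dt
WHERE RowId BETWEEN 10 AND 25;

-- Variant 2: number only the Id values, then look up the full rows.
SELECT *
FROM tblUsersMessages
WHERE Id IN (SELECT Id
             FROM (SELECT Id,
                          ROW_NUMBER() OVER (ORDER BY Id DESC) AS RowId
                   FROM tblUsersMessages) dt
             WHERE RowId BETWEEN 10 AND 25);

SET STATISTICS IO OFF;
```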

Variation on Select top n

Is it possible to do a variation of select top n rows that starts at a row other than 0?
My (mobile) app has limited resources and no server-side caching available. The maximum number of rows returned is 100. I get the first 100 with select top 100. I would then like the user to be able to request rows 101-200 and so on. The database data is static and the re-query time is negligible.
Platform SQL Server 2008
Here's an article which demonstrates such queries using the ROW_NUMBER function.
;With CTETable AS
(
SELECT ROW_NUMBER() OVER (ORDER BY Column_Name DESC) AS ROW_NUM, * FROM TABLENAME WHERE <CONDITION>
)
SELECT Column_List FROM CTETable WHERE ROW_NUM BETWEEN <StartNum> AND <EndNum>
Set <StartNum> and <EndNum> to any range you want, for example 123 to 147.
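For the "rows 101-200" case in the question, the bounds can be derived from a page number. A sketch, assuming a hypothetical table dbo.Items with columns ItemId and ItemName; @PageNumber and @PageSize are illustrative names:

```sql
-- dbo.Items, ItemId and ItemName are hypothetical.
DECLARE @PageNumber int = 2;    -- 1-based page index
DECLARE @PageSize   int = 100;

WITH Numbered AS (
    SELECT ROW_NUMBER() OVER (ORDER BY ItemId) AS ROW_NUM,
           ItemId,
           ItemName
    FROM dbo.Items
)
SELECT ItemId, ItemName
FROM Numbered
-- Page 2 with a page size of 100 selects rows 101 through 200.
WHERE ROW_NUM BETWEEN (@PageNumber - 1) * @PageSize + 1
                  AND @PageNumber * @PageSize
ORDER BY ROW_NUM;
```

Ordering by a unique column keeps the pages stable between requests, so a row does not appear on two pages or get skipped.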
