I can't find an easy way to page the results of complex queries in SQL Server. I need to write a function that takes a SQL query as an argument (the query can include subqueries, ORDER BY clauses, grouping, etc.) and retrieves a particular page of results. In Oracle this is easy: you wrap the query in another SELECT statement. For SQL Server I can't find a similar way. I would like to avoid parsing the input SQL statement. I'm using SQL Server 2005.
Paging in SQL Server 2005 and upwards is best done via ranking functions. However, given that an arbitrary SQL query is unsorted, you need to somehow specify what the sort shall be for this to work, which isn't really "compatible" with a generic solution like you're trying to make (*).
The suggested way to do it is like this (assuming the variables @PageSize with the number of items per page, and @Page as the 1-based index of the page you want to retrieve):
WITH NumberedQuery AS (
SELECT ROW_NUMBER() OVER (ORDER BY q.SomeColumn) ix, q.*
FROM QueryToPage q
)
SELECT nq.*
FROM NumberedQuery nq
-- ROW_NUMBER() starts at 1, so page 1 covers ix 1 through @PageSize
WHERE (nq.ix > (@Page-1)*@PageSize) AND (nq.ix <= @Page*@PageSize);
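Since the question asks for a function that takes a query and returns one page, here is a minimal client-side sketch of that idea in Python. The names page_bounds, build_paged_query, @FirstRow and @LastRow are made up for illustration. It assumes the inner query comes from trusted code (never from user input) and has no top-level ORDER BY of its own, because SQL Server rejects ORDER BY inside a derived table unless TOP is also specified. The caller still has to name the sort column, as discussed above.

```python
def page_bounds(page, page_size):
    """Return the inclusive (first, last) ROW_NUMBER() values for a 1-based page."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    first = (page - 1) * page_size + 1
    return first, first + page_size - 1


def build_paged_query(inner_sql, order_by):
    """Wrap a trusted query in a ROW_NUMBER() paging CTE.

    inner_sql and order_by are pasted into the statement as-is, so they must
    come from trusted code. @FirstRow and @LastRow are hypothetical parameter
    names that the caller binds to the values from page_bounds().
    """
    return (
        "WITH NumberedQuery AS (\n"
        f"    SELECT ROW_NUMBER() OVER (ORDER BY {order_by}) AS ix, q.*\n"
        f"    FROM ({inner_sql}) AS q\n"
        ")\n"
        "SELECT nq.*\n"
        "FROM NumberedQuery nq\n"
        "WHERE nq.ix BETWEEN @FirstRow AND @LastRow\n"
        "ORDER BY nq.ix;"
    )


first_row, last_row = page_bounds(page=3, page_size=25)
sql = build_paged_query("SELECT OrderID, CustomerID FROM dbo.Orders", "q.OrderID")
```

Only the page bounds travel as parameters here; the query text itself is still concatenated, so the caveats in the footnote below continue to apply.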
(*): Your approach of concatenating SQL code has several issues: it prevents the use of parameterized queries, it adds the risk of SQL injection, it hurts performance, and it cannot solve the issue at hand if the order is unspecified.
Is it possible to make MSSQL Management Studio produce a query that reproduces a result set you found earlier, but recreates it in the most efficient way possible?
Maybe there is a way to tell the database which rows it should return, instead of having it look for the correct rows through the WHERE conditions? So once you have found the rows, you don't have to search again?
So what I thought is: When you place a condition like
Where col1 = 10
The DB will check row 1 col1 for value 10, then row 2 col1, and so on.
That is a search, which takes time. If you could instead write a statement that asks directly for the specific rows, wouldn't that be faster?
I mean you don't need to search for the columns either: you just say give me col1 or col2 or whatever.
The short answer is: NO
is it possible to make MSSQL Management Studio produce a query that will reproduce a result set, that you found prior
SQL Server Management Studio does not store your queries or the result sets which the queries return. It is simply a client application which passes the queries to the database server and presents the results which the server returns.
On the other hand, SQL Server does keep the queries which you execute in its plan cache (for some time, depending on multiple parameters). Using the following query you can get the last queries which were executed by the server:
SELECT execquery.last_execution_time AS [Date Time], execsql.text AS [Script]
FROM sys.dm_exec_query_stats AS execquery
CROSS APPLY sys.dm_exec_sql_text(execquery.sql_handle) AS execsql
ORDER BY execquery.last_execution_time DESC
GO
... use the best way possible to recreate it?
When you execute a query, the server and SSMS might provide some alerts and recommendations about the query, which can help you build a better one. But neither SQL Server nor SQL Server Management Studio will build a better query for you based on the result set of a previous query.
This is why we have DBAs.
I would like to have a SQL Server function dbo.GetNextNumber(), which would generate sequential numbers for each call. As far as I understand, this is impossible with a native T-SQL function, as SQL Server insists that functions be deterministic. But if you could show me a native T-SQL function that does this, it would really make my day.
I thought perhaps this could be possible to write using a CLR function. As CLR functions are static, the sequence numbers need to be stored in the calling context of the set operation, as storing it as a static variable would result in several connections using the same sequence, resulting in not-so-sequential numbers. I do not know enough about embedded CLR to see if set operation's (select, update, delete, insert) calling context is reachable from the CLR side.
At the end of the day, the following query
select dbo.GetNextNumber() from sysobjects
must return the result
1
2
3
4
5
It is OK if another function call to reset the context is necessary like
exec dbo.ResetSequenceNumbers()
To prevent some misunderstandings and reduce the chances of wasting your time answering the wrong question, please note that I am not looking for an ID generation function for a table, and I am aware of some hacks (albeit using a proc, not a function) that involve temp tables with identity columns. The ROW_NUMBER() function is close, but it also does not cut it.
Thanks a lot for any responses
Kemal
P.S. It is amazing that SQL Server does not have a built-in function for that. Such a function (provided that it cannot be used in joins and where clauses) is really easy to implement and extremely useful, but for some reason it is not included.
As you have implemented the CLR sequence based on my article related to the calculation of Running Totals, you can achieve the same using the ROW_NUMBER() function.
The ROW_NUMBER() function requires an ORDER BY in the OVER clause, but there is a nice workaround that avoids the sort this would cause. You cannot put a constant directly in the ORDER BY, but you can put a subquery such as (SELECT aConstant) there. The numbers are then assigned in whatever order the rows happen to be produced. So you can easily generate numbers using the statement below.
SELECT
ROW_NUMBER() OVER(ORDER BY (SELECT NULL)) AS RowNumber,
*
FROM aTable
In the next version of SQL Server you can use a SEQUENCE to do this. In earlier versions it's easy enough to do by inserting into a separate "sequence table" (a table with only an IDENTITY column) and then retrieving the value with the SCOPE_IDENTITY() function. You won't be able to do that in a function, but you can use a stored procedure.
Can you give more information about why ROW_NUMBER() doesn't cut it? In the specific example you give, ROW_NUMBER() (with appropriate clarification of OVER(ORDER BY)) certainly seems to match your criteria.
But there are certainly cases where ROW_NUMBER() might not be useful, and in those cases, there are usually other techniques.
Perhaps having this general function seems useful to you, but in most cases I find that a solution tailored to the problem at hand is a better idea than a general-purpose function which ends up causing more difficulties, like a leaky abstraction you are constantly working around. You specifically mention the need for a reset function. No such thing is needed with ROW_NUMBER(), given that it has OVER(PARTITION BY ... ORDER BY ...), which allows you to specify the grouping: the numbering restarts at 1 for each partition.
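To make the "no reset needed" point concrete, here is a small Python emulation of what ROW_NUMBER() OVER (PARTITION BY ... ORDER BY ...) computes. It is only an illustration of the semantics, not how SQL Server implements it, and the function name and sample columns are invented.

```python
from itertools import groupby
from operator import itemgetter


def row_number(rows, partition_by, order_by):
    """Emulate ROW_NUMBER() OVER (PARTITION BY partition_by ORDER BY order_by).

    rows is a list of dicts. The result is sorted by partition and then by
    the ordering column, with a RowNumber key that restarts at 1 for every
    partition.
    """
    ordered = sorted(rows, key=lambda r: (r[partition_by], r[order_by]))
    numbered = []
    for _, group in groupby(ordered, key=itemgetter(partition_by)):
        # enumerate() restarts for each group, which plays the role of the
        # explicit "reset" call asked for in the question.
        for n, row in enumerate(group, start=1):
            numbered.append({**row, "RowNumber": n})
    return numbered


sample = [
    {"Dept": "B", "Name": "Zed"},
    {"Dept": "A", "Name": "Bob"},
    {"Dept": "A", "Name": "Al"},
]
result = row_number(sample, partition_by="Dept", order_by="Name")
```

In this sample the two rows of Dept A get numbers 1 and 2, and the single row of Dept B starts again at 1.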
SQL Server Denali has a new sequence structure for developers.
It is easy to manage and maintain sequence numbers in SQL Server from Denali onward.
You can find details in Sequence Numbers in SQL Server.
For versions before Denali, you can use sequence tables.
Here is a sample for Sequence Table in SQL Server.
To read a sequence number from a sequence table, you insert a dummy record into this identity-enabled table and read back the generated identity value.
As of SQL Server 2022 there is a new operator to address this:
GENERATE_SERIES ( start, stop [, step ] )
For full details, see:
https://learn.microsoft.com/en-us/sql/t-sql/functions/generate-series-transact-sql?view=sql-server-ver16
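For readers without a SQL Server 2022 instance at hand, this Python generator mimics the behaviour as I read the linked documentation: both ends are inclusive, the default step is 1 (or -1 when start is greater than stop), a step of zero is an error, and a step pointing away from stop yields nothing. Treat it as a sketch of the semantics rather than a reference.

```python
def generate_series(start, stop, step=None):
    """Yield start, start + step, ... up to and including stop."""
    if step is None:
        # Default direction follows the relative order of start and stop.
        step = 1 if start <= stop else -1
    if step == 0:
        raise ValueError("step must not be zero")
    value = start
    while (step > 0 and value <= stop) or (step < 0 and value >= stop):
        yield value
        value += step


one_to_five = list(generate_series(1, 5))
```

The original question's expected output (1 through 5) corresponds to generate_series(1, 5).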
This question has been asked before -
How we can use CTE in subquery in sql server?
The only answer suggested was "Just define your CTE on top and access it in the subquery?"
This works, but I would really like to be able to use a CTE in the following scenarios -
as a subquery in a SELECT
as a derived table in the FROM clause of a SELECT
Both of these work in PostgreSQL. With Sql Server 2005, I get "Incorrect syntax near the keyword 'with'".
The reason I would like it is that most of my queries are constructed dynamically, and I would like to be able to define a CTE, save it somewhere, and then drop it in to a more complex query on demand.
If Sql Server simply does not support this usage, I will have to accept it, but I have not read anything that states that it is not allowed.
Does anyone know if it is possible to get this to work?
In SQL Server, CTEs must be at the top of the query. If you construct queries dynamically, you could store a list of CTEs in addition to the query. Before you send the query to SQL Server, you can prefix the query with the list of CTEs:
; with Cte1 as (...definition 1...),
Cte2 as (...definition 2...),
Cte3 as (...definition 3...),
...
...constructed query...
This is assuming that you're constructing the SQL outside of SQL Server.
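A rough sketch of that prefixing step, assuming the SQL is built in Python on the client. The function name prefix_ctes and the sample CTE names are invented, and the definitions are assumed to be trusted SQL text already listed in dependency order.

```python
def prefix_ctes(ctes, query):
    """Prefix a constructed query with its stored CTE definitions.

    ctes is a list of (name, definition_sql) pairs. Names must be unique
    (compared case-insensitively here as a conservative check), because
    every CTE shares the single WITH clause at the top of the statement.
    """
    if not ctes:
        return query
    names = [name.lower() for name, _ in ctes]
    if len(set(names)) != len(names):
        raise ValueError("CTE names must be unique")
    parts = ",\n".join(f"{name} as ({body})" for name, body in ctes)
    return f"; with {parts}\n{query}"


statement = prefix_ctes(
    [("Cte1", "select 1 as a"), ("Cte2", "select a from Cte1")],
    "select * from Cte2",
)
```

The uniqueness check reflects the main limitation of this approach: two saved fragments that happen to use the same CTE name cannot be combined without renaming one of them.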
You could also consider creating views. Views can contain CTEs, and they can be used as a subquery or derived table. Views are a good choice if you generate SQL infrequently, say only during an installation or as part of a deployment.
SQL Server does not support this much-required feature. I too have been looking for help on this.
MS SQL Server does not support temporary views either, unlike PostgreSQL. The solution mentioned above is also likely to work only if all the CTE definitions can be generated beforehand and their names do not conflict across sub-queries, whereas the whole point is that these CTE definitions may be different at each level of a sub-query.
Sad but true !!!
Regards,
Kapil
I am a newbie to SQL Server, keeping this question as a reference. My question is:
Why doesn't Microsoft SQL Server have something like LIMIT in MySQL? Now we are forced to write either a stored procedure or an inner query for pagination. I think creating a temporary view/table or using an inner query will be slower than a simple query. I believe there must be a strong reason for deprecating this, and I would like to know what it is.
If anyone knows, please share it.
I never knew SQL Server supported something like TOP 10,20 - are you really totally sure?? Wasn't that some other system maybe??
Anyway: SQL Server 2011 (code-named "Denali") will be adding more support for this when it comes out by the end of 2011 or so.
The ORDER BY clause will get new additional keywords OFFSET and FETCH - read more about them here on MSDN.
You'll be able to write statements like:
-- Specifying variables for OFFSET and FETCH values
DECLARE @StartingRowNumber INT = 150, @FetchRows INT = 50;
SELECT
DepartmentID, Name, GroupName
FROM
HumanResources.Department
ORDER BY
DepartmentID ASC
OFFSET @StartingRowNumber ROWS
FETCH NEXT @FetchRows ROWS ONLY;
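If you think in page numbers rather than row offsets, the conversion is a one-liner. The helper below is a hypothetical Python sketch. Note that OFFSET counts the rows to skip, so it is zero-based, unlike ROW_NUMBER(), which starts at 1.

```python
def offset_fetch(page, page_size):
    """Return the (OFFSET, FETCH NEXT) values for a 1-based page number."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Skip every row that belongs to the pages before the requested one.
    return (page - 1) * page_size, page_size


offset, fetch = offset_fetch(page=4, page_size=50)
```

With these inputs the helper gives an offset of 150 and a fetch count of 50, the same values as in the statement above, which therefore returns page 4 when pages hold 50 rows each.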
SQL Server 2005 Paging – The Holy Grail (requires free registration).
(Although it says SQL Server 2005 it is still applicable to SQL Server 2008)
I agree 100%! MySQL has the LIMIT clause that makes a very easy syntax to return a range of rows.
I don't know for sure that temporary table syntax is slower because SQL Server may be able to make some optimizations. However, a LIMIT clause would be far easier to type. And I would expect there would be more opportunities for optimization too.
I brought this up once before, and the group I was talking to just didn't seem to agree.
As far as I'm concerned, there is no reason not to have a LIMIT clause (or equivalent), and I strongly suspect SQL Server eventually will!
I have a table (that relates to a number of other tables) where I would like to filter ONE of the columns (RequesterID) - that column will be a combobox where only people that are not sales people should be selectable.
Here is the "unfiltered" query, let's call it QUERY 1:
SELECT RequestsID, RequesterID, ProductsID
FROM dbo.Requests
If I use a separate query, let's call it QUERY 2, to filter RequesterID (which is a People-related column, connected to People.PeopleID), it looks like this:
SELECT People.PeopleID
FROM People INNER JOIN
Roles ON People.RolesID = Roles.RolesID INNER JOIN
Requests ON People.PeopleID = Requests.RequesterID
WHERE (Roles.Role <> N'SalesGuy')
ORDER BY Requests.RequestsID
Now, is there a way of "merging" the QUERY 2 into QUERY 1?
(dbo.Requests in QUERY 1 has RequesterID populated as a foreign key from dbo.People, so no problem there... The connections are all right, I just don't know how to write the SQL query!)
UPDATE
Trying to explain what I mean in a bit more detail:
The result set should be a number of REQUESTS, and the number of REQUESTS should not be limited by QUERY 2. QUERY 2's only function is to limit the selectable subset in column Requests.RequesterID. No, it's not that clear, but in the C# VS2008 implementation I use Requests.RequesterID to eventually populate a ComboBox with [Full name], which is another column in the People table, and in that column I don't want SalesGuy to show up as selectable. Here I'm trying to make it EVEN clearer (but with wrong syntax, of course):
SELECT RequestsID, (RequesterID WHERE RequesterID != 8), ProductsID
FROM dbo.Requests
Yes, RequesterID 8 happens to be the SalesGuy :-)
Here is a very comprehensive article on how to handle this topic:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
It covers all the issues and methods of trying to write queries with multiple optional search conditions. The main thing you need to be concerned with is not the duplication of code, but the use of an index. If your query fails to use an index, it will perform poorly. There are several techniques that can be used, which may or may not allow an index to be used.
Here is the table of contents:
Introduction
The Case Study: Searching Orders
The Northgale Database
Dynamic SQL
Introduction
Using sp_executesql
Using the CLR
Using EXEC()
When Caching Is Not Really What You Want
Static SQL
Introduction
x = @x OR @x IS NULL
Using IF statements
Umachandar's Bag of Tricks
Using Temp Tables
x = @x AND @x IS NOT NULL
Handling Complex Conditions
Hybrid Solutions – Using both Static and Dynamic SQL
Using Views
Using Inline Table Functions
Conclusion
Feedback and Acknowledgements
Revision History
If you are on the proper version of SQL Server 2008, there is an additional technique that can be used, see: Dynamic Search Conditions in T-SQL Version for SQL 2008 (SP1 CU5 and later)
If you are on that proper release of SQL Server 2008, you can just add OPTION (RECOMPILE) to the query and the local variable's value at run time is used for the optimizations.
Consider this: OPTION (RECOMPILE) will take this code (where no index can be used with this mess of ORs):
WHERE
(@Search1 IS NULL or Column1=@Search1)
AND (@Search2 IS NULL or Column2=@Search2)
AND (@Search3 IS NULL or Column3=@Search3)
and optimize it at run time to be (provided that only @Search2 was passed in with a value):
WHERE
Column2=@Search2
and an index can be used (if you have one defined on Column2).
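The dynamic SQL route listed in the table of contents gets to the same trimmed WHERE clause by only emitting the conditions that were actually supplied. Below is a minimal client-side sketch in Python. The table name dbo.SomeTable, the column whitelist, and the @p_ parameter naming are all assumptions for illustration, and the values travel as parameters rather than being pasted into the SQL text.

```python
# Hypothetical whitelist: column names are never taken from user input.
ALLOWED_COLUMNS = ("Column1", "Column2", "Column3")


def build_search(filters):
    """Build a parameterized search statement from optional filters.

    filters maps column name -> value, where None means "not supplied".
    Returns (sql, params). params maps parameter names (without the @)
    to values, to be bound by whatever database driver is in use.
    """
    conditions = []
    params = {}
    for column in ALLOWED_COLUMNS:
        value = filters.get(column)
        if value is None:
            continue  # omitted condition: the optimizer never sees it
        conditions.append(f"{column} = @p_{column}")
        params[f"p_{column}"] = value
    unknown = set(filters) - set(ALLOWED_COLUMNS)
    if unknown:
        raise ValueError(f"unknown search columns: {sorted(unknown)}")
    sql = "SELECT * FROM dbo.SomeTable"
    if conditions:
        sql += " WHERE " + " AND ".join(conditions)
    return sql, params


sql, params = build_search({"Column1": None, "Column2": 42, "Column3": None})
```

When only Column2 is supplied, the generated statement contains just that one predicate, so an index on Column2 can be used without relying on a recompile.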
How about this? Since the query already joins on the Requests table, you can simply add the columns to the select list like so:
SELECT Requests.RequestsID, Requests.RequesterID, Requests.ProductsID
FROM People INNER JOIN
Roles ON People.RolesID = Roles.RolesID INNER JOIN
Requests ON People.PeopleID = Requests.RequesterID
WHERE (Roles.Role <> N'SalesGuy')
ORDER BY Requests.RequestsID
You can in fact select any column from any of the joined tables (Roles, Requests, People, etc.)
It becomes clear if you just replace People.PeopleId with * and it will show you everything retrieved from the tables.