I have the following stored procedure, which is quite expensive because of the dynamic @Name parameter and the subquery.
Is there a better, more efficient way to do this?
CREATE PROCEDURE [dbo].[spGetClientNameList]
@Name varchar(100)
AS
BEGIN
SET NOCOUNT ON;
SELECT
*
FROM
(
SELECT
ClientID,
FirstName + ' ' + LastName as Name
FROM
Client
) a
where a.Name like '%' + @Name + '%'
END
Shamelessly stealing from two recent articles by Aaron Bertrand:
Follow-up #1 on leading wildcard seeks - Aaron Bertrand
One way to get an index seek for a leading %wildcard - Aaron Bertrand
The gist is to create something that we can use that resembles a trigram (or trigraph) in PostgreSQL.
Aaron Bertrand also includes a disclaimer as follows:
"Before I start to show how my proposed solution would work, let me be absolutely clear that this solution should not be used in every single case where LIKE '%wildcard%' searches are slow. Because of the way we're going to "explode" the source data into fragments, it is likely limited in practicality to smaller strings, such as addresses or names, as opposed to larger strings, like product descriptions or session abstracts."
test setup: http://rextester.com/IIMT54026
Client table
create table dbo.Client (
ClientId int not null primary key clustered
, FirstName varchar(50) not null
, LastName varchar(50) not null
);
insert into dbo.Client (ClientId, FirstName, LastName) values
(1, 'James','')
, (2, 'Aaron','Bertrand')
go
Function used by Aaron Bertrand to explode string fragments (modified for input size):
create function dbo.CreateStringFragments(@input varchar(101))
returns table with schemabinding
as return
(
with x(x) as (
select 1 union all select x+1 from x where x < (len(@input))
)
select Fragment = substring(@input, x, len(@input)) from x
);
go
Table to store fragments for FirstName + ' ' + LastName:
create table dbo.Client_NameFragments (
ClientId int not null
, Fragment varchar(101) not null
, constraint fk_ClientsNameFragments_Client
foreign key(ClientId) references dbo.Client
on delete cascade
);
create clustered index s_cat on dbo.Client_NameFragments(Fragment, ClientId);
go
Loading the table with fragments:
insert into dbo.Client_NameFragments (ClientId, Fragment)
select c.ClientId, f.Fragment
from dbo.Client as c
cross apply dbo.CreateStringFragments(FirstName + ' ' + LastName) as f;
go
Creating trigger to maintain fragments:
create trigger dbo.Client_MaintainFragments
on dbo.Client
for insert, update as
begin
set nocount on;
delete f from dbo.Client_NameFragments as f
inner join deleted as d
on f.ClientId = d.ClientId;
insert dbo.Client_NameFragments(ClientId, Fragment)
select i.ClientId, fn.Fragment
from inserted as i
cross apply dbo.CreateStringFragments(i.FirstName + ' ' + i.LastName) as fn;
end
go
Quick trigger tests:
/* trigger tests --*/
insert into dbo.Client (ClientId, FirstName, LastName) values
(3, 'Sql', 'Zim')
update dbo.Client set LastName = 'unknown' where LastName = '';
delete dbo.Client where ClientId = 3;
--select * from dbo.Client_NameFragments order by ClientId, len(Fragment) desc
/* -- */
go
New Procedure:
create procedure [dbo].[Client_getNameList] @Name varchar(100) as
begin
set nocount on;
select
ClientId
, Name = FirstName + ' ' + LastName
from Client c
where exists (
select 1
from dbo.Client_NameFragments f
where f.ClientId = c.ClientId
and f.Fragment like @Name + '%'
)
end
go
exec [dbo].[Client_getNameList] @Name = 'On Bert'
returns:
+----------+----------------+
| ClientId | Name |
+----------+----------------+
| 2 | Aaron Bertrand |
+----------+----------------+
Search operations on a concatenated column sometimes can't use indexes. I ran into a situation like the one above and replaced the concatenated search with OR predicates, which gave me better performance most of the time.
Create nonclustered indexes on FirstName and LastName if they are not already present.
Check the performance after modifying the above procedure like below:
CREATE PROCEDURE [dbo].[spGetClientNameList]
@Name varchar(100)
AS
BEGIN
SET NOCOUNT ON;
SELECT
ClientID,
FirstName + ' ' + LastName as Name
FROM
Client
WHERE FirstName LIKE '%' + @Name + '%'
OR LastName LIKE '%' + @Name + '%'
END
And do check the execution plans to verify whether those indexes are used.
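If it helps, one quick way to compare the two versions is to turn on I/O and timing statistics before running each one (a minimal sketch; the parameter value is just an example):

SET STATISTICS IO, TIME ON;

EXEC [dbo].[spGetClientNameList] @Name = 'Bert';

SET STATISTICS IO, TIME OFF;
-- The Messages tab shows logical reads and CPU/elapsed time, and the actual
-- execution plan shows whether the indexes on FirstName/LastName produced
-- seeks or scans.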
The problem really comes down to having to compute the column (concatenating the first name and last name), which pretty much forces SQL Server into doing a full scan of the table to determine what is a match and what isn't. If you're not allowed to add indexes or alter the table, you'll have to change the query around (supply FirstName and LastName separately). If you are, you could add a computed column and index it:
Create Table client (
ClientId INT NOT NULL PRIMARY KEY IDENTITY(1,1)
,FirstName VARCHAR(100)
,LastName VARCHAR(100)
,FullName AS FirstName + ' ' + LastName
)
Create index FullName ON Client(FullName)
This will at least speed your query up by doing index seeks instead of full table scans. Is it worth it? It's difficult to say without looking at how much data there is, etc.
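As a rough sketch of how the indexed computed column would be used (the leading-wildcard caveat still applies: only a trailing wildcard can seek):

DECLARE @Name VARCHAR(100) = 'Aaron';

-- Can seek the FullName index because the wildcard is trailing:
SELECT ClientId, FullName
FROM Client
WHERE FullName LIKE @Name + '%';

-- A search like '%' + @Name + '%' will still scan, even with the index.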
where a.Name like '%' + @Name + '%'
This statement can never use an index. In this situation it's better to use Full-Text Search.
If you can restrict your LIKE to
where a.Name like @Name + '%'
it will use an index automatically. Moreover, you can use the REVERSE() function to index a statement like:
where a.Name like '%' + @Name
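A minimal sketch of the REVERSE() idea, assuming a computed column and index that are not part of the original table (the names here are made up):

-- Hypothetical reversed computed column plus index
ALTER TABLE dbo.Client ADD NameReversed AS REVERSE(FirstName + ' ' + LastName);
CREATE INDEX IX_Client_NameReversed ON dbo.Client (NameReversed);

DECLARE @Name varchar(100) = 'Bertrand';

-- "ends with @Name" (a.Name LIKE '%' + @Name) becomes a seekable trailing wildcard:
SELECT ClientId, FirstName + ' ' + LastName AS Name
FROM dbo.Client
WHERE NameReversed LIKE REVERSE(@Name) + '%';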
Related
I have a data set of about 33 million rows and 20 columns. One of the columns is a raw data tab I'm using to extract relevant data from, including IDs and account numbers.
I extracted a column for User ID's into a temporary table to trim the User ID's of spaces. I'm now trying to add the trimmed User ID column back into the original data set using this code:
SELECT *
FROM [dbo].[DATA] AS A
INNER JOIN #TempTable AS B ON A.[RawColumn] = B.[RawColumn]
Extracting the User ID's and trimming the spaces took about a minute for each query. However, running this last query I'm at the 2 hour mark and I'm only 2% of the way through the dataset.
Is there a better way to run the query?
I'm running the query in SQL Server 2014 Management Studio
Thanks
Update:
I continued to let it run through the night. When I got back into work, only 6 million rows had been completed of the 33 million rows. I cancelled the execution and I'm trying to add a smaller primary key (The only other key I could see on the table was the [RawColumn], which was a very long string of text) using:
ALTER TABLE [dbo].[DATA]
ADD ID INT IDENTITY(1,1)
Right now I'm an hour into the execution.
Next, I'm planning to make it the primary key using
ALTER TABLE dbo.[DATA]
ADD CONSTRAINT PK_DATA PRIMARY KEY (ID)
I'm not familiar with using indexes. I've tried looking up on Stack Overflow how to create one, but from what I'm reading it sounds like it would take just as long to create an index as it would to run this query. Am I wrong about that?
For context on the RawColumn data, it looks something like this:
FirstName: John LastName: Smith UserID: JohnS Account#: 000-000-0000
Update #2:
I'm now learning that using "ALTER TABLE" is a bad idea. I should have done a little bit more research into how to add a primary key to a table.
Update #3
Here's the code I used to extract the "UserID" code out of the "RawColumn" data.
DROP TABLE #TempTable1
GO
SELECT [RAWColumn],
SUBSTRING([RAWColumn], CHARINDEX('USERID:', [RAWColumn])+LEN('USERID:'), CHARINDEX('Account#:', [RAWColumn])-Charindex('Username:', [RAWColumn]) - LEN('Account#:') - LEN('USERID:')) AS 'USERID_NEW'
INTO #TempTable1
FROM [dbo].[DATA]
Next I trimmed the data from the temporary tables
DROP TABLE #TempTable2
GO
SELECT [RawColumn],
LTRIM([USERID_NEW]) AS 'USERID_NEW'
INTO #TempTable2
FROM #TempTable1
So now I'm trying to get the data from #TEMPTABLE2 back into my original [DATA] table. Hopefully this is more clear now.
So I think your parsing code is a little bit wrong. Here's an approach that doesn't assume the values appear in any particular order. It does assume that the header/tag name has a space after the colon character, and that the value ends at the subsequent space character. Here's a snippet that manipulates a single value.
declare @dat varchar(128) = 'FirstName: John LastName: Smith UserID: JohnS Account#: 000-000-0000';
declare @tag varchar(16) = 'UserID: ';
/* datalength() counts the trailing space character unlike len() */
declare @idx int = charindex(@tag, @dat) + datalength(@tag);
select substring(@dat, @idx, charindex(' ', @dat + ' ', @idx + 1) - @idx) as UserID
To use it in a single query without the temporary variable, the most straightforward approach is to just replace each instance of "@idx" with the original expression:
declare @tag varchar(16) = 'UserID: ';
select RawColumn,
substring(
RawColumn,
charindex(@tag, RawColumn) + datalength(@tag),
charindex(
' ', RawColumn + ' ',
charindex(@tag, RawColumn) + datalength(@tag) + 1
) - (charindex(@tag, RawColumn) + datalength(@tag))
) as UserID
from dbo.DATA;
As an update it looks something like this:
declare @tag varchar(16) = 'UserID: ';
update dbo.DATA
set UserID =
substring(
RawColumn,
charindex(@tag, RawColumn) + datalength(@tag),
charindex(
' ', RawColumn + ' ',
charindex(@tag, RawColumn) + datalength(@tag) + 1
) - (charindex(@tag, RawColumn) + datalength(@tag))
);
You also appear to be ignoring upper/lower case in your string matches. It's not clear to me whether you need to consider that more carefully.
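If case does matter, one option (not something the original code does) is to force a collation on the comparison, for example:

DECLARE @tag varchar(16) = 'UserID: ';

-- Case-sensitive check for the tag; most default collations are case-insensitive,
-- so only do this if 'UserID:' and 'userid:' really must be treated differently.
SELECT RawColumn
FROM dbo.DATA
WHERE CHARINDEX(@tag COLLATE Latin1_General_CS_AS,
                RawColumn COLLATE Latin1_General_CS_AS) > 0;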
I have a requirement from a client to have a search-field where he wants to input any text and search for every word in that text field in multiple full-text indexed columns which contain customer information, from a customer information table.
So, for example, if he inputs FL Diana Brooks Miami 90210, he wants all of these terms (FL, Diana, Brooks, Miami, 90210) to each be searched into the State, FirstName, LastName, City and Zip columns.
Now, this seems like a totally bad idea to begin with, and as an alternative I suggested using multiple fields where this information is entered separately. Nonetheless, where I am now is having to make a proof of concept showing why this won't work from a performance perspective, and why it's better to have multiple fields where you input the terms you want to search for.
So, getting to my query, I'm trying to write a Full-Text query to do what the client has asked for in order to get a benchmark for performance.
What I have so far doesn't seem to work, so I guess I am asking if it's even possible to do this?
declare
@zip varchar(10) = 90210
, @lastName varchar(50) = 'Brooks'
, @firstName varchar(50) = 'Diana'
, @city varchar(50) = 'Miami'
, @state char(2) = 'FL'
, @searchTerm varchar(250) = ''
, @s varchar(1) = ' '
set @searchTerm = @state + ' ' + @firstName + ' ' + @lastName + ' ' + @city
select *
from freetexttable(contacts, (zip, lastName, FirstName, city, state), @searchTerm) ftTbl
inner join contacts c on ftTbl.[key] = c.ContactID
The query I have above seems to work, but is not restrictive enough to find only the single record I'm looking for and returns a whole lot more (which I'm guessing is because I'm using FREETEXTTABLE).
I've also tried replacing it with CONTAINSTABLE, but I get an error saying:
Msg 7630, Level 15, State 3, Line 26
Syntax error near 'Diana' in the full-text search condition 'FL Diana Brooks Miami'.
With using regular indexes I have been able to solve this, but I'm curious if it's even possible to do the same thing with Full-Text.
Using regular indexes I have a query with a adaptable WHERE clause, like below:
WHERE C.FirstName like coalesce(@FirstName + '%' , C.FirstName)
AND C.LastName like coalesce(@LastName + '%' , C.LastName)
etc.
You can create a view WITH SCHEMABINDING with the id and the concatenated columns:
CREATE VIEW dbo.SearchView WITH SCHEMABINDING
AS
SELECT id,
[State] + ' ' +
[FirstName] + ' ' +
[LastName] + ' ' +
[City] + ' ' +
[Zip] AS search_string
FROM dbo.YourTable
Create index
CREATE UNIQUE CLUSTERED INDEX UCI_SearchView ON dbo.SearchView (id ASC)
Then create full-text index on search_string field.
USE YourDB
GO
--Enable Full-text search on the DB
IF (SELECT DATABASEPROPERTY(DB_NAME(), N'IsFullTextEnabled')) <> 1
EXEC sp_fulltext_database N'enable'
GO
--Create a full-text catalog
IF NOT EXISTS (SELECT * FROM dbo.sysfulltextcatalogs WHERE [name] = N'CatalogName')
EXEC sp_fulltext_catalog N'CatalogName', N'create'
GO
EXEC sp_fulltext_table N'dbo.SearchView', N'create', N'CatalogName', N'IndexName'
GO
--Add a column to catalog
EXEC sp_fulltext_column N'dbo.SearchView', N'search_string', N'add', 0 /* neutral */
GO
--Activate full-text for table/view
EXEC sp_fulltext_table N'dbo.SearchView', N'activate'
GO
--Full-text index update
exec sp_fulltext_catalog 'CatalogName', 'start_full'
GO
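On current versions of SQL Server the deprecated sp_fulltext_* procedures can be replaced with the DDL equivalents; a rough sketch (the catalog name and change-tracking option are placeholders):

CREATE FULLTEXT CATALOG CatalogName;
GO
CREATE FULLTEXT INDEX ON dbo.SearchView (search_string LANGUAGE 0)
    KEY INDEX UCI_SearchView
    ON CatalogName
    WITH CHANGE_TRACKING AUTO;
GO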
After that you need to write some function to construct a search condition. For example:
FL Diana Brooks Miami 90210
Became:
"FL*" AND "Diana*" AND "Brooks*" AND "Miami*" AND "90210*"
And use it in FREETEXT or CONTAINS searches:
DECLARE @search nvarchar(4000) = '"FL*" AND "Diana*" AND "Brooks*" AND "Miami*" AND "90210*"'
SELECT sv.*
FROM dbo.SearchView sv
INNER JOIN CONTAINSTABLE (dbo.SearchView, search_string, #search) as c
ON c.[KEY] = sv.id
I have a table showing locations with a BIT column for each tool in use at each location:
CREATE TABLE dbo.[ToolsSelected] (
[LocationID] NVARCHAR(40) NOT NULL,
[Tool1] INTEGER DEFAULT 0 NOT NULL,
[Tool2] INTEGER DEFAULT 0 NOT NULL,
[Tool3] INTEGER DEFAULT 0 NOT NULL,
[Tool4] INTEGER DEFAULT 0 NOT NULL,
PRIMARY KEY ([LocationID])
);
LocID Tool1 Tool2 Tool3 Tool4
----- ----- ----- ----- -----
AZ 0 1 1 0
NY 1 0 1 1
I need to convert this to a table by LocationID indicating which tools at which locations:
CREATE TABLE dbo.[ByLocation] (
[LocationID] NVARCHAR(40) NOT NULL,
[Tool] NVARCHAR(40) NOT NULL, -- Column title of ToolsSelected table
PRIMARY KEY ([LocationID], [Tool])
);
LocID Tool
----- -----
AZ Tool2
AZ Tool3
NY Tool1
NY Tool3
NY Tool4
The idea is that each location can select the tools they need, I then need to query the tools table to get details (versions, etc) for each tool selected. Each location is unique; each tool is unique. Is there a way to do this or a much better implementation?
Here is the answer to the immediate question, given only 4 tools columns:
SELECT LocID = LocationID, Tool
FROM
(
SELECT LocationID, Tool = 'Tool1' FROM dbo.ToolsSelected WHERE Tool1 = 1
UNION ALL
SELECT LocationID, Tool = 'Tool2' FROM dbo.ToolsSelected WHERE Tool2 = 1
UNION ALL
SELECT LocationID, Tool = 'Tool3' FROM dbo.ToolsSelected WHERE Tool3 = 1
UNION ALL
SELECT LocationID, Tool = 'Tool4' FROM dbo.ToolsSelected WHERE Tool4 = 1
) AS x
ORDER BY LocID, Tool;
With 40 columns, you could do the same thing, but along with the desire to generate this dynamically:
DECLARE @sql NVARCHAR(MAX);
SET @sql = N'';
SELECT @sql += '
UNION ALL
SELECT LocationID, Tool = ''' + name + '''
FROM dbo.ToolsSelected WHERE ' + name + ' = 1'
FROM sys.columns WHERE [object_id] = OBJECT_ID('dbo.ToolsSelected')
AND name LIKE 'Tool[0-9]%';
SELECT @sql = N'SELECT LocID = LocationID, Tool
FROM
(' + STUFF(@sql, 1, 17, '') + '
) AS x ORDER BY LocID, Tool;';
PRINT @sql;
-- EXEC sp_executesql @sql;
*BUT*
Storing these as separate columns is a recipe for disaster. So when you add Tool41, Tool42 etc. you have to change the schema then change all your code that passes the column names and 1/0 via parameters etc. Why not represent these as simple numbers, e.g.
CREATE TABLE dbo.LocationTools
(
LocID NVARCHAR(40),
ToolID INT
);
So in the above case you would store:
LocID Tool
----- ----
AZ 2
AZ 3
NY 1
NY 3
NY 4
Now when you pass in the checkboxes they've selected, presumably from the front end you are receiving two values, such as:
LocID: "NY"
Tools: "Tool1, Tool5, Tool26"
If that's about right, then you can populate the table when a user creates or changes their choice, first using a split function to break up the comma-separated list dictated by the checkboxes:
CREATE FUNCTION dbo.SplitTools
(
@ToolList NVARCHAR(MAX)
)
RETURNS TABLE
WITH SCHEMABINDING
AS
RETURN
(
SELECT ToolID = y.i.value('(./text())[1]', 'int')
FROM
(
SELECT x = CONVERT(XML,
'<i>' + REPLACE(REPLACE(@ToolList, ',', '</i><i>'), 'Tool', '')
+ '</i>').query('.')
) AS a CROSS APPLY x.nodes('i') AS y(i)
);
GO
(You forgot to tell us which version of SQL Server you are using - if 2008 or above you could use a table-valued parameter as an alternative to a split function.)
Then a procedure to handle it:
CREATE PROCEDURE dbo.UpdateLocationTools
@LocID NVARCHAR(40),
@Tools NVARCHAR(MAX)
AS
BEGIN
SET NOCOUNT ON;
-- in case they had previously selected tools
-- that are no longer selected, clear first:
DELETE dbo.LocationTools WHERE LocID = @LocID;
INSERT dbo.LocationTools(LocID, ToolID)
SELECT @LocID, ToolID
FROM dbo.SplitTools(@Tools);
END
GO
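(As an aside, if you are on 2008 or newer, a table-valued parameter version might look roughly like this; the type name and procedure name are invented for the sketch:)

-- Hypothetical table type for the list of selected tools
CREATE TYPE dbo.ToolIdList AS TABLE (ToolID INT NOT NULL PRIMARY KEY);
GO
CREATE PROCEDURE dbo.UpdateLocationTools_TVP
    @LocID NVARCHAR(40),
    @Tools dbo.ToolIdList READONLY
AS
BEGIN
    SET NOCOUNT ON;

    DELETE dbo.LocationTools WHERE LocID = @LocID;

    INSERT dbo.LocationTools(LocID, ToolID)
    SELECT @LocID, ToolID
    FROM @Tools;
END
GO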
Now you can add new tool #s without changing schema or code, since your list of checkboxes could also be generated from your data - assuming you have a dbo.Tools table or want to add one. This table could also be used for data integrity purposes (you could put a foreign key on dbo.LocationTools.ToolID).
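For instance, a small reference table and foreign key along those lines might look like this (sketched here, not part of the original answer):

CREATE TABLE dbo.Tools
(
    ToolID INT NOT NULL PRIMARY KEY,
    ToolName NVARCHAR(40) NOT NULL UNIQUE
);

ALTER TABLE dbo.LocationTools
    ADD CONSTRAINT FK_LocationTools_Tools
    FOREIGN KEY (ToolID) REFERENCES dbo.Tools(ToolID);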
And you can generate your desired query very simply:
SELECT LocID, Tool = 'Tool' + CONVERT(VARCHAR(12), ToolID)
FROM dbo.LocationTools
ORDER BY LocID, ToolID;
No redundant data, no wide tables with unmanageable columns, and a proper index can even help you search for, say, all locations using Tool3 efficiently...
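For example, with an index on ToolID, the "which locations use Tool3" question becomes a simple seek (again, just a sketch):

CREATE INDEX IX_LocationTools_ToolID ON dbo.LocationTools(ToolID, LocID);

-- All locations using Tool3:
SELECT LocID
FROM dbo.LocationTools
WHERE ToolID = 3;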
I was looking at different ways of writing a stored procedure to return a "page" of data. This was for use with the ASP ObjectDataSource, but it could be considered a more general problem.
The requirement is to return a subset of the data based on the usual paging parameters; startPageIndex and maximumRows, but also a sortBy parameter to allow the data to be sorted. Also there are some parameters passed in to filter the data on various conditions.
One common way to do this seems to be something like this:
[Method 1]
;WITH stuff AS (
SELECT
CASE
WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
WHEN @SortBy = ...
ELSE ROW_NUMBER() OVER (ORDER BY whatever)
END AS Row,
.,
.,
.,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)
)
SELECT *
FROM stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
One problem with this is that it doesn't give the total count and generally we need another stored procedure for that. This second stored procedure has to replicate the parameter list and the complex WHERE clause. Not nice.
One solution is to append an extra column to the final select list, (SELECT COUNT(*) FROM stuff) AS TotalRows. This gives us the total but repeats it for every row in the result set, which is not ideal.
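A sketch of what that looks like, reusing the CTE in a subquery (column list abbreviated):

;WITH stuff AS (
    SELECT ROW_NUMBER() OVER (ORDER BY Name) AS Row, Name
    FROM Table1
)
SELECT *,
       (SELECT COUNT(*) FROM stuff) AS TotalRows
FROM stuff
WHERE (Row > @startRowIndex)
  AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row;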
[Method 2]
An interesting alternative is given here (https://web.archive.org/web/20211020111700/https://www.4guysfromrolla.com/articles/032206-1.aspx) using dynamic SQL. He reckons that the performance is better because the CASE statement in the first solution drags things down. Fair enough, and this solution makes it easy to get the totalRows and slap it into an output parameter. But I hate coding dynamic SQL. All that 'bit of SQL ' + STR(@parm1) +' bit more SQL' gubbins.
[Method 3]
The only way I can find to get what I want, without repeating code which would have to be synchronized, and keeping things reasonably readable is to go back to the "old way" of using a table variable:
DECLARE @stuff TABLE (Row INT, ...)
INSERT INTO @stuff
SELECT
CASE
WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
WHEN @SortBy = ...
ELSE ROW_NUMBER() OVER (ORDER BY whatever)
END AS Row,
.,
.,
.,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)
SELECT *
FROM @stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
(Or a similar method using an IDENTITY column on the table variable).
Here I can just add a SELECT COUNT on the table variable to get the totalRows and put it into an output parameter.
I did some tests, and with a fairly simple version of the query (no sortBy and no filter), method 1 seems to come out on top (almost twice as quick as the other 2). Then I tested with roughly the complexity I actually need, and with the SQL in stored procedures. With this I get method 1 taking nearly twice as long as the other 2 methods, which seems strange.
Is there any good reason why I shouldn't spurn CTEs and stick with method 3?
UPDATE - 15 March 2012
I tried adapting Method 1 to dump the page from the CTE into a temporary table so that I could extract the TotalRows and then select just the relevant columns for the resultset. This seemed to add significantly to the time (more than I expected). I should add that I'm running this on a laptop with SQL Server Express 2008 (all that I have available) but still the comparison should be valid.
I looked again at the dynamic SQL method. It turns out I wasn't really doing it properly (just concatenating strings together). I set it up as in the documentation for sp_executesql (with a parameter description string and parameter list) and it's much more readable. Also this method runs fastest in my environment. Why that should be still baffles me, but I guess the answer is hinted at in Hogan's comment.
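For reference, the sp_executesql shape I ended up with looks roughly like this (a sketch; the column list and WHERE clause are simplified, and @SortBy would be validated against a known list of column names before being concatenated):

DECLARE @sql nvarchar(max), @params nvarchar(200);

SET @sql = N';WITH stuff AS (
    SELECT ROW_NUMBER() OVER (ORDER BY ' + @SortBy + N') AS Row, Name
    FROM Table1
)
SELECT * FROM stuff
WHERE (Row > @startRowIndex)
  AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row;';

SET @params = N'@startRowIndex int, @maximumRows int';

EXEC sp_executesql @sql, @params, @startRowIndex, @maximumRows;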
I would most likely split the @SortBy argument into two, @SortColumn and @SortDirection, and use them like this:
…
ROW_NUMBER() OVER (
ORDER BY CASE @SortColumn
WHEN 'Name' THEN Name
WHEN 'OtherName' THEN OtherName
…
END *
CASE @SortDirection
WHEN 'DESC' THEN -1
ELSE 1
END
) AS Row
…
And this is how the TotalRows column could be defined (in the main select):
…
COUNT(*) OVER () AS TotalRows
…
I would definitely want to do a combination of a temp table and NTILE for this sort of approach.
The temp table will allow you to do your complicated series of conditions just once. Because you're only storing the pieces you care about, it also means that when you start doing selects against it further in the procedure, it should have a smaller overall memory usage than if you ran the condition multiple times.
I like NTILE() for this better than ROW_NUMBER() because it's doing the work you're trying to accomplish for you, rather than having additional where conditions to worry about.
The example below is one based off a similar query I'm using as part of a research query; I have an ID I can use that I know will be unique in the results. Using an ID that was an identity column would also be appropriate here, though.
--DECLARES here would be stored procedure parameters
declare @pagesize int, @sortby varchar(25), @page int = 1;
--Create temp with all relevant columns; ID here could be an identity PK to help with paging query below
create table #temp (id int not null primary key clustered, status varchar(50), lastname varchar(100), startdate datetime);
--Insert into #temp based off of your complex conditions, but with no attempt at paging
insert into #temp
(id, status, lastname, startdate)
select id, status, lastname, startdate
from Table1 ...etc.
where ...complicated conditions
SET @pagesize = 50;
SET @page = 5; --OR CAST(@startRowIndex/@pagesize as int)+1
SET @sortby = 'name';
--Only use the id and count to use NTILE
;with paging(id, pagenum, totalrows) as
(
select id,
NTILE((SELECT COUNT(*) cnt FROM #temp)/@pagesize) OVER(ORDER BY CASE WHEN @sortby = 'NAME' THEN lastname ELSE convert(varchar(10), startdate, 112) END),
cnt
FROM #temp
cross apply (SELECT COUNT(*) cnt FROM #temp) total
)
--Use the id to join back to main select
SELECT *
FROM paging
JOIN #temp ON paging.id = #temp.id
WHERE paging.pagenum = @page
--Don't need the drop in the procedure, included here for rerunnability
drop table #temp;
I generally prefer temp tables over table variables in this scenario, largely so that there are definite statistics on the result set you have. (Search for temp table vs table variable and you'll find plenty of examples as to why)
Dynamic SQL would be most useful for handling the sorting method. Using my example, you could do the main query in dynamic SQL and only pull the sort method you want to pull into the OVER().
The example above also returns the total in each row of the result set, which as you mentioned is not ideal. You could instead have a @totalrows output variable in your procedure and pull it as well as the result set. That would save you the CROSS APPLY I'm doing above in the paging CTE.
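That variation might look roughly like this inside the procedure, assuming a @totalrows int OUTPUT parameter (sketch only):

SELECT @totalrows = COUNT(*) FROM #temp;

;with paging(id, pagenum) as
(
    select id,
           NTILE(@totalrows / @pagesize)   -- guard against @totalrows < @pagesize in real code
               OVER(ORDER BY CASE WHEN @sortby = 'NAME' THEN lastname
                                  ELSE convert(varchar(10), startdate, 112) END)
    FROM #temp
)
SELECT *
FROM paging
JOIN #temp ON paging.id = #temp.id
WHERE paging.pagenum = @page;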
I would create one procedure to stage, sort, and paginate (using NTILE()) a staging table; and a second procedure to retrieve by page. This way you don't have to run the entire main query for each page.
This example queries AdventureWorks.HumanResources.Employee:
--------------------------------------------------------------------------
create procedure dbo.EmployeesByMartialStatus
@MaritalStatus nchar(1)
, @sort varchar(20)
as
-- Init staging table
if exists(
select 1 from sys.objects o
inner join sys.schemas s on s.schema_id=o.schema_id
and s.name='Staging'
and o.name='EmployeesByMartialStatus'
where type='U'
)
drop table Staging.EmployeesByMartialStatus;
-- Populate staging table with sort value
with s as (
select *
, sr=ROW_NUMBER()over(order by case @sort
when 'NationalIDNumber' then NationalIDNumber
when 'ManagerID' then ManagerID
-- plus any other sort conditions
else EmployeeID end)
from AdventureWorks.HumanResources.Employee
where MaritalStatus=@MaritalStatus
)
select *
into #temp
from s;
-- And now pages
declare @RowCount int; select @RowCount=COUNT(*) from #temp;
declare @PageCount int=ceiling(@RowCount/20.0); --assuming 20 lines/page
select *
, Page=NTILE(@PageCount)over(order by sr)
into Staging.EmployeesByMartialStatus
from #temp;
go
--------------------------------------------------------------------------
-- procedure to retrieve selected pages
create procedure EmployeesByMartialStatus_GetPage
@page int
as
declare @MaxPage int;
select @MaxPage=MAX(Page) from Staging.EmployeesByMartialStatus;
set @page=case when @page not between 1 and @MaxPage then 1 else @page end;
select EmployeeID,NationalIDNumber,ContactID,LoginID,ManagerID
, Title,BirthDate,MaritalStatus,Gender,HireDate,SalariedFlag,VacationHours,SickLeaveHours
, CurrentFlag,rowguid,ModifiedDate
from Staging.EmployeesByMartialStatus
where Page=@page
GO
--------------------------------------------------------------------------
-- Usage
-- Load staging
exec dbo.EmployeesByMartialStatus 'M','NationalIDNumber';
-- Get pages 1 through n
exec dbo.EmployeesByMartialStatus_GetPage 1;
exec dbo.EmployeesByMartialStatus_GetPage 2;
-- ...etc (this would actually be a foreach loop, but that detail is omitted for brevity)
GO
I use this method of using EXEC():
-- SP parameters:
-- @query: Your query as an input parameter
-- @maximumRows: As number of rows per page
-- @startPageIndex: As number of page to filter
-- @sortBy: As a field name or field names with supporting DESC keyword
DECLARE @query nvarchar(max) = 'SELECT * FROM sys.Objects',
@maximumRows int = 8,
@startPageIndex int = 3,
@sortBy as nvarchar(100) = 'name Desc'
SET @query = ';WITH CTE AS (' + @query + ')' +
'SELECT *, (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingPageNo' +
', pagingCountRow / ' + CAST(@maximumRows as nvarchar(10)) + ' As pagingCountPage ' +
', (dt.pagingRowNo - 1) % ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingRowInPage ' +
'FROM ( SELECT *, ROW_NUMBER() OVER (ORDER BY ' + @sortBy + ') As pagingRowNo, COUNT(*) OVER () AS pagingCountRow ' +
'FROM CTE) dt ' +
'WHERE (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 = ' + CAST(@startPageIndex as nvarchar(10))
EXEC(@query)
In the result set, the following paging columns appear after the query's own result columns.
Note:
I add some extra columns that you can remove:
pagingRowNo : The row number
pagingCountRow : The total number of rows
pagingPageNo : The current page number
pagingCountPage : The total number of pages
pagingRowInPage : The row number that started with 1 in this page
I currently have the following select statement, but I wish to move to full text search on the Keywords column. How would I re-write this to use CONTAINS?
SELECT MediaID, 50 AS Weighting
FROM Media m JOIN @words w ON m.Keywords LIKE '%' + w.Word + '%'
@words is a table variable filled with words I wish to look for:
DECLARE @words TABLE(Word NVARCHAR(512) NOT NULL);
If you are not against using a temp table, and EXEC (and I realize that is a big if), you could do the following:
DECLARE @KeywordList VARCHAR(MAX), @KeywordQuery VARCHAR(MAX)
SELECT @KeywordList = STUFF ((
SELECT '"' + Keyword + '" OR '
FROM FTS_Keywords
FOR XML PATH('')
), 1, 0, '')
SELECT @KeywordList = SUBSTRING(@KeywordList, 0, LEN(@KeywordList) - 2)
SELECT @KeywordQuery = 'SELECT RecordID, Document FROM FTS_Demo_2 WHERE CONTAINS(Document, ''' + @KeywordList +''')'
--SELECT @KeywordList, @KeywordQuery
CREATE TABLE #Results (RecordID INT, Document NVARCHAR(MAX))
INSERT INTO #Results (RecordID, Document)
EXEC(@KeywordQuery)
SELECT * FROM #Results
DROP TABLE #Results
This would generate a query like:
SELECT RecordID
,Document
FROM FTS_Demo_2
WHERE CONTAINS(Document, '"red" OR "green" OR "blue"')
And results like this:
RecordID Document
1 one two blue
2 three red five
If CONTAINS allowed a variable or column for the search condition, you could have used something like this:
SELECT MediaID, 50 AS Weighting
FROM Media m
JOIN @words w ON CONTAINS(m.Keywords, w.word)
However, according to Books Online for SQL Server CONTAINS, that is not supported. Therefore, no, there is no way to do it.
Ref: (column_name appears only in the first param to CONTAINS)
CONTAINS
( { column_name | ( column_list ) | * }
,'<contains_search_condition>'
[ , LANGUAGE language_term ]
)