I've created a stored procedure, but I can't alter it now, because the darn thing is still in use.
alter proc GetDuplicates @search1 varchar(40) = '', @search2 varchar(40) = 'blahblehblargh' as

select distinct i4202 "Man", i4203 "Mod"
into #ManModTemp
from inventory
where i4211 like ('2%')
order by i4202 desc

select distinct i4202 "Man", i4203 "Mod", count(*) "Count"
into #ManModCount
from inventory
group by i4202, i4203

select Man, Mod
into #ManModList
from #ManModTemp
where Mod in (select Mod from #ManModTemp group by Mod having Count(*) > 1)
and (Man like '%' + IsNull(@search1, '') + '%'
or Man like '%' + IsNull(@search2, 'blahblehblargh') + '%')

select * from #ManModList
join #ManModCount
on (#ManModList.Man = #ManModCount.Man and #ManModList.Mod = #ManModCount.Mod)
I think the mistake I made, maybe, is that I didn't wrap it in BEGIN-END statements and now it's just not ending?
Please forgive the hacky code (as my wife, the developer, calls it). I'm new at this.
Here is the error:
Could not execute statement.
Procedure in use by 'user'
SQLCODE=-215, ODBC 3 State="40001"
Line 1, column 1
I am struggling with this: I have two SQL statements with two different sets of keywords. These are stored in temporary tables, since I cannot update, delete or insert into a table.
How do I write a third SQL statement (limited in the number of characters per statement) that says: "if 'pingu' and 'noot' match then true, otherwise if 'sponge' and 'bob' match then display results" (this works)? But then how do I say: "if 'pingu' and 'sponge' are selected then true, or if 'bob' and 'noot' are selected then true", while keeping 'pingu' and 'noot' as true if selected?
Example of keyword list 1: 'Pingu' and 'Noot'
DECLARE @teststring varchar(512) = '{KEYWORD}'
SELECT TOP 1 k.type
FROM (VALUES ('pingu', '66'), ('noot', '66')) k(word, type)
WHERE @teststring LIKE '%' + k.word + '%'
GROUP BY k.type
HAVING COUNT(1) >= 2
ORDER BY COUNT(1) DESC;
Example of keyword list 2: 'Sponge' and 'Bob'
DECLARE @teststring varchar(512) = '{KEYWORD}'
SELECT TOP 1 k.type
FROM (VALUES ('sponge', '66'), ('bob', '66')) k(word, type)
WHERE @teststring LIKE '%' + k.word + '%'
GROUP BY k.type
HAVING COUNT(1) >= 2
ORDER BY COUNT(1) DESC;
What about combining the two source queries with a UNION ALL?
For example (adapting your original queries):
DECLARE @teststring varchar(512) = '{KEYWORD}';
WITH Keywords AS (
    SELECT *
    FROM (VALUES ('pingu', '66'), ('noot', '66')) k(word, type)
    UNION ALL
    SELECT *
    FROM (VALUES ('sponge', '66'), ('bob', '66')) k(word, type)
)
SELECT TOP 1 k.type
FROM Keywords k
WHERE @teststring LIKE '%' + k.word + '%'
GROUP BY k.type
HAVING COUNT(1) >= 2
ORDER BY COUNT(1) DESC;
This returns a result row if at least two keywords, regardless of which source query the keywords come from, are found in @teststring.
Note: If your keywords lists are large, it may be worth reworking the query so that an index can be used to make processing the WHERE clause more efficient.
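For instance, one possible rework, splitting the input into words and joining on equality so an index seek becomes possible, is sketched below. It assumes SQL Server 2016+ for STRING_SPLIT and a hypothetical permanent dbo.Keywords(word, type) table with an index on word; note it matches whole words rather than substrings, so it is not an exact drop-in replacement:
DECLARE @teststring varchar(512) = '{KEYWORD}';
SELECT TOP 1 k.type
FROM STRING_SPLIT(@teststring, ' ') s       -- one row per word in the input
JOIN dbo.Keywords k ON k.word = s.value     -- an index on Keywords(word) allows a seek
GROUP BY k.type
HAVING COUNT(1) >= 2
ORDER BY COUNT(1) DESC;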
I have a requirement from a client to have a search-field where he wants to input any text and search for every word in that text field in multiple full-text indexed columns which contain customer information, from a customer information table.
So, for example, if he inputs FL Diana Brooks Miami 90210, he wants all of these terms (FL, Diana, Brooks, Miami, 90210) to each be searched into the State, FirstName, LastName, City and Zip columns.
Now, this seems like a totally bad idea to begin with, and as an alternative I suggested using multiple fields where this information is input separately. Nonetheless, the point I am at now is having to build a proof of concept showing why this won't work from a performance perspective, and why it's better to have multiple fields where you input each term you want to search for.
So, getting to my query, I'm trying to write a Full-Text query to do what the client has asked for in order to get a benchmark for performance.
What I have so far doesn't seem to work, so I guess I am asking if it's even possible to do this?
declare
    @zip varchar(10) = '90210'
    , @lastName varchar(50) = 'Brooks'
    , @firstName varchar(50) = 'Diana'
    , @city varchar(50) = 'Miami'
    , @state char(2) = 'FL'
    , @searchTerm varchar(250) = ''
    , @s varchar(1) = ' '

set @searchTerm = @state + ' ' + @firstName + ' ' + @lastName + ' ' + @city

select *
from freetexttable(contacts, (zip, lastName, FirstName, city, state), @searchTerm) ftTbl
inner join contacts c on ftTbl.[key] = c.ContactID
The query I have above seems to work, but is not restrictive enough in order to find only the single record I'm looking for and is returning a whole lot more (which I'm guessing that it's because I'm using FREETEXTTABLE).
I've also tried replacing it with CONTAINSTABLE, but I get an error saying:
Msg 7630, Level 15, State 3, Line 26
Syntax error near 'Diana' in the full-text search condition 'FL Diana Brooks Miami'.
With using regular indexes I have been able to solve this, but I'm curious if it's even possible to do the same thing with Full-Text.
Using regular indexes, I have a query with an adaptable WHERE clause, like below:
WHERE C.FirstName like coalesce(@FirstName + '%' , C.FirstName)
AND C.LastName like coalesce(@LastName + '%' , C.LastName)
etc.
You can create a view WITH SCHEMABINDING with the id and the concatenated columns:
CREATE VIEW dbo.SearchView WITH SCHEMABINDING
AS
SELECT id,
       [State] + ' ' +
       [FirstName] + ' ' +
       [LastName] + ' ' +
       [City] + ' ' +
       [Zip] as search_string
FROM dbo.YourTable
Create a unique clustered index (a full-text index on a view requires one):
CREATE UNIQUE CLUSTERED INDEX UCI_SearchView ON dbo.SearchView (id ASC)
Then create full-text index on search_string field.
USE YourDB
GO
--Enable Full-text search on the DB
IF (SELECT DATABASEPROPERTY(DB_NAME(), N'IsFullTextEnabled')) <> 1
EXEC sp_fulltext_database N'enable'
GO
--Create a full-text catalog
IF NOT EXISTS (SELECT * FROM dbo.sysfulltextcatalogs WHERE [name] = N'CatalogName')
EXEC sp_fulltext_catalog N'CatalogName', N'create'
GO
EXEC sp_fulltext_table N'dbo.SearchView', N'create', N'CatalogName', N'IndexName'
GO
--Add a column to catalog
EXEC sp_fulltext_column N'dbo.SearchView', N'search_string', N'add', 0 /* neutral */
GO
--Activate full-text for table/view
EXEC sp_fulltext_table N'dbo.SearchView', N'activate'
GO
--Full-text index update
exec sp_fulltext_catalog 'CatalogName', 'start_full'
GO
After that you need to write some function to construct a search condition. For example, the input
FL Diana Brooks Miami 90210
becomes:
"FL*" AND "Diana*" AND "Brooks*" AND "Miami*" AND "90210*"
And use it in FREETEXT or CONTAINS searches:
DECLARE @search nvarchar(4000) = '"FL*" AND "Diana*" AND "Brooks*" AND "Miami*" AND "90210*"'

SELECT sv.*
FROM dbo.SearchView sv
INNER JOIN CONTAINSTABLE (dbo.SearchView, search_string, @search) as c
    ON c.[KEY] = sv.id
I'm trying to clean up some stored procedures, and was curious about the following. I did a search, but couldn't find anything that really talked about performance.
Explanation
Imagine a stored procedure that has the following parameters defined:
@EntryId uniqueidentifier,
@UserId int = NULL
I have the following table:
tbl_Entry
-------------------------------------------------------------------------------------
| EntryId PK, uniqueidentifier | Name nvarchar(140) | Created datetime | UserId int |
-------------------------------------------------------------------------------------
All columns are NOT NULL.
The idea behind this stored procedure is that you can get an Entry by its uniqueidentifier PK and, optionally, you can validate that it has the given UserId assigned by passing that as the second parameter. Imagine administrators who can view all entries versus a user who can only view their own entries.
Option 1 (current)
DECLARE @sql nvarchar(3000);
SET @sql = N'
    SELECT
        a.EntryId,
        a.Name,
        a.UserId,
        b.UserName
    FROM
        tbl_Entry a,
        tbl_User b
    WHERE
        a.EntryId = @EntryId
        AND b.UserId = a.UserId';
IF @UserId IS NOT NULL
    SET @sql = @sql + N' AND a.UserId = @UserId';
EXECUTE sp_executesql @sql,
    N'@EntryId uniqueidentifier, @UserId int',
    @EntryId, @UserId;
Option 2 (what I thought would be better)
SELECT
    a.EntryId,
    a.Name,
    a.UserId,
    b.UserName
FROM
    tbl_Entry a,
    tbl_User b
WHERE
    a.EntryId = @EntryId
    AND a.UserId = COALESCE(@UserId, a.UserId)
    AND b.UserId = a.UserId;
I realize this case is fairly simple and could likely be optimized by a single IF statement that separates two queries. I wrote a simple case to try to explain the issue concisely. The actual stored procedure has 6 nullable parameters, and there are others that have even more; using IF blocks would be very complicated.
Question
Will SQL Server still check a.UserId = a.UserId on every row even though that condition will always be true, or will that condition be optimized out when it sees that @UserId is NULL?
If it would check a.UserId = a.UserId on every row, would it be more efficient to build a string like in option 1, or would it still be faster to do the a.UserId = a.UserId condition? Is that something that would depend on how many rows are in the tables?
Is there another option here that I should be considering? I wouldn't call myself a database expert by any means.
You will get the best performance (and the lowest query cost) if you replace the COALESCE with a compound predicate as follows:
(@UserId IS NULL OR a.UserId = @UserId)
I would also suggest, when writing T-SQL, that you use the explicit JOIN syntax rather than the antiquated ANSI-89 comma style. The revised query will look something like this:
SELECT a.EntryId, a.Name, a.UserId, b.UserName
FROM tbl_Entry a
INNER JOIN tbl_User b ON a.UserId = b.UserId
WHERE a.EntryId = @EntryId
AND (@UserId IS NULL OR a.UserId = @UserId);
I am executing the query below. It takes 80 seconds for just 17 records.
Can anybody tell me the reason, if you know it? I have already tried adding indexes.
SELECT DISTINCT t.i_UserID,
u.vch_LoginName,
t.vch_PreviousEmailAddress AS 'vch_EmailAddress',
u.vch_DisplayName,
t.d_TransactionDate AS 'd_DateAdded',
'Old' AS 'vch_RecordStatus'
FROM tblEmailTransaction t
INNER JOIN tblUser u
ON t.i_UserID = u.i_UserID
WHERE t.vch_PreviousEmailAddress LIKE '%kala%'
Change the collation of the vch_PreviousEmailAddress column to Latin1_General_100_BIN2.
Create a covering index:
CREATE NONCLUSTERED INDEX ix
ON dbo.tblEmailTransaction (vch_PreviousEmailAddress)
INCLUDE (i_UserID, d_TransactionDate)
GO
And have fun with this query:
SELECT t.i_UserID,
u.vch_LoginName,
t.vch_PreviousEmailAddress AS vch_EmailAddress,
u.vch_DisplayName,
t.d_TransactionDate AS d_DateAdded,
'Old' AS vch_RecordStatus
FROM (
SELECT DISTINCT i_UserID,
vch_PreviousEmailAddress,
d_TransactionDate
FROM dbo.tblEmailTransaction
WHERE vch_PreviousEmailAddress LIKE '%kala%' COLLATE Latin1_General_100_BIN2
) t
JOIN dbo.tblUser u ON t.i_UserID = u.i_UserID
One other thing, which I find useful in solving problems like this:
Try running the following script. It will tell you which indexes your SQL Server database is asking for, i.e. which ones would make the most (positive) improvement.
SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED
SELECT TOP 100
ROUND(s.avg_total_user_cost * s.avg_user_impact * (s.user_seeks + s.user_scans),0) AS 'Total Cost',
s.avg_user_impact,
d.statement AS 'Table name',
d.equality_columns,
d.inequality_columns,
d.included_columns,
'CREATE INDEX [IndexName] ON ' + d.statement + ' ( '
+ case when (d.equality_columns IS NULL OR d.inequality_columns IS NULL)
then ISNULL(d.equality_columns, '') + ISNULL(d.inequality_columns, '')
else ISNULL(d.equality_columns, '') + ', ' + ISNULL(d.inequality_columns, '')
end + ' ) '
+ CASE WHEN d.included_columns IS NULL THEN '' ELSE 'INCLUDE ( ' + d.included_columns + ' )' end AS 'CREATE INDEX command'
FROM sys.dm_db_missing_index_groups g,
sys.dm_db_missing_index_group_stats s,
sys.dm_db_missing_index_details d
WHERE d.database_id = DB_ID()
AND s.group_handle = g.index_group_handle
AND d.index_handle = g.index_handle
ORDER BY [Total Cost] DESC
The right-hand column displays the CREATE INDEX command which you'd need to run, to create that index.
This is one of those lifesaver scripts, which I run on our in-house databases every so often.
But yes, in your example, this is just likely to tell you that you need an index on the vch_PreviousEmailAddress field in your tblEmailTransaction table.
There are a few probable bottlenecks:
Missing index on tblEmailTransaction.i_UserID: check whether the table has the index.
Missing index on tblUser.i_UserID: check whether the table has the index.
The LIKE clause: LIKE with a leading wildcard is known to perform poorly. As Devart suggested, try specifying the collation this way:
WHERE vch_PreviousEmailAddress LIKE '%kala%' COLLATE Latin1_General_100_BIN2
To get a better view of what your query is doing, run this command together with it:
SET STATISTICS IO ON
It will report all the I/O access the query performs, so we can see what is happening.
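For example, with the query from the question (run the whole batch together, since the setting is per session):
SET STATISTICS IO ON;

SELECT DISTINCT t.i_UserID, u.vch_LoginName
FROM tblEmailTransaction t
INNER JOIN tblUser u ON t.i_UserID = u.i_UserID
WHERE t.vch_PreviousEmailAddress LIKE '%kala%';

SET STATISTICS IO OFF;
The Messages tab then reports logical and physical reads per table.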
Just a final question:
How many rows do the two tables contain?
Ciao
I was looking at different ways of writing a stored procedure to return a "page" of data. This was for use with the ASP ObjectDataSource, but it could be considered a more general problem.
The requirement is to return a subset of the data based on the usual paging parameters, startPageIndex and maximumRows, plus a sortBy parameter to allow the data to be sorted. There are also some parameters passed in to filter the data on various conditions.
One common way to do this seems to be something like this:
[Method 1]
;WITH stuff AS (
    SELECT
        CASE
            WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
            WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
            WHEN @SortBy = ...
            ELSE ROW_NUMBER() OVER (ORDER BY whatever)
        END AS Row,
        .,
        .,
        .,
    FROM Table1
    INNER JOIN Table2 ...
    LEFT JOIN Table3 ...
    WHERE ... (lots of things to check)
)
SELECT *
FROM stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
One problem with this is that it doesn't give the total count and generally we need another stored procedure for that. This second stored procedure has to replicate the parameter list and the complex WHERE clause. Not nice.
One solution is to append an extra column to the final select list, (SELECT COUNT(*) FROM stuff) AS TotalRows. This gives us the total but repeats it for every row in the result set, which is not ideal.
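For instance, the final SELECT of Method 1 would then become:
SELECT *, (SELECT COUNT(*) FROM stuff) AS TotalRows
FROM stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row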
[Method 2]
An interesting alternative is given here (https://web.archive.org/web/20211020111700/https://www.4guysfromrolla.com/articles/032206-1.aspx) using dynamic SQL. He reckons that the performance is better because the CASE statement in the first solution drags things down. Fair enough, and this solution makes it easy to get the totalRows and slap it into an output parameter. But I hate coding dynamic SQL. All that 'bit of SQL ' + STR(@parm1) + ' bit more SQL' gubbins.
[Method 3]
The only way I can find to get what I want, without repeating code which would have to be synchronized, and keeping things reasonably readable is to go back to the "old way" of using a table variable:
DECLARE @stuff TABLE (Row INT, ...)

INSERT INTO @stuff
SELECT
    CASE
        WHEN @SortBy = 'Name' THEN ROW_NUMBER() OVER (ORDER BY Name)
        WHEN @SortBy = 'Name DESC' THEN ROW_NUMBER() OVER (ORDER BY Name DESC)
        WHEN @SortBy = ...
        ELSE ROW_NUMBER() OVER (ORDER BY whatever)
    END AS Row,
    .,
    .,
    .,
FROM Table1
INNER JOIN Table2 ...
LEFT JOIN Table3 ...
WHERE ... (lots of things to check)

SELECT *
FROM @stuff
WHERE (Row > @startRowIndex)
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row
(Or a similar method using an IDENTITY column on the table variable).
Here I can just add a SELECT COUNT on the table variable to get the totalRows and put it into an output parameter.
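For example, assuming the procedure declares @totalRows int OUTPUT:
SELECT @totalRows = COUNT(*) FROM @stuff;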
I did some tests, and with a fairly simple version of the query (no sortBy and no filter), method 1 seems to come out on top (almost twice as quick as the other 2). Then I tested with something closer to the complexity I actually needed, with the SQL in stored procedures. With this, method 1 takes nearly twice as long as the other 2 methods. Which seems strange.
Is there any good reason why I shouldn't spurn CTEs and stick with method 3?
UPDATE - 15 March 2012
I tried adapting Method 1 to dump the page from the CTE into a temporary table so that I could extract the TotalRows and then select just the relevant columns for the resultset. This seemed to add significantly to the time (more than I expected). I should add that I'm running this on a laptop with SQL Server Express 2008 (all that I have available) but still the comparison should be valid.
I looked again at the dynamic SQL method. It turns out I wasn't really doing it properly (just concatenating strings together). I set it up as in the documentation for sp_executesql (with a parameter description string and parameter list) and it's much more readable. Also this method runs fastest in my environment. Why that should be still baffles me, but I guess the answer is hinted at in Hogan's comment.
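For reference, here is a minimal sketch of that parameterized pattern, using sys.objects as a stand-in for the real tables (QUOTENAME protects the injected sort column name):
DECLARE @sortBy sysname = N'name',
    @startRowIndex int = 0,
    @maximumRows int = 10;

DECLARE @sql nvarchar(max) = N'
;WITH stuff AS (
    SELECT ROW_NUMBER() OVER (ORDER BY ' + QUOTENAME(@sortBy) + N') AS Row, *
    FROM sys.objects
)
SELECT * FROM stuff
WHERE Row > @startRowIndex
AND (Row <= @startRowIndex + @maximumRows OR @maximumRows <= 0)
ORDER BY Row;';

EXEC sp_executesql @sql,
    N'@startRowIndex int, @maximumRows int',
    @startRowIndex = @startRowIndex,
    @maximumRows = @maximumRows;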
I would most likely split the @SortBy argument into two, @SortColumn and @SortDirection, and use them like this:
…
ROW_NUMBER() OVER (
    ORDER BY CASE @SortColumn
                 WHEN 'Name' THEN Name
                 WHEN 'OtherName' THEN OtherName
                 …
             END *
             CASE @SortDirection
                 WHEN 'DESC' THEN -1
                 ELSE 1
             END
) AS Row
…
(Note that the sign-flipping trick only works for numeric sort columns; character columns would need separate CASE expressions for ascending and descending order.) And this is how the TotalRows column could be defined (in the main select):
…
COUNT(*) OVER () AS TotalRows
…
I would definitely want to do a combination of a temp table and NTILE for this sort of approach.
The temp table will allow you to do your complicated series of conditions just once. Because you're only storing the pieces you care about, it also means that when you start doing selects against it further in the procedure, it should have a smaller overall memory usage than if you ran the condition multiple times.
I like NTILE() for this better than ROW_NUMBER() because it's doing the work you're trying to accomplish for you, rather than having additional where conditions to worry about.
The example below is based on a similar query I'm using as part of a research project; I have an ID I can use that I know will be unique in the results. Using an ID that was an identity column would also be appropriate here, though.
--DECLARES here would be stored procedure parameters
declare @pagesize int, @sortby varchar(25), @page int = 1;

--Create temp with all relevant columns; ID here could be an identity PK to help with paging query below
create table #temp (id int not null primary key clustered, status varchar(50), lastname varchar(100), startdate datetime);

--Insert into #temp based off of your complex conditions, but with no attempt at paging
insert into #temp
(id, status, lastname, startdate)
select id, status, lastname, startdate
from Table1 ...etc.
where ...complicated conditions

SET @pagesize = 50;
SET @page = 5; --OR CAST(@startRowIndex/@pagesize as int)+1
SET @sortby = 'name';

--Only use the id and count to use NTILE
;with paging(id, pagenum, totalrows) as
(
    select id,
        NTILE((SELECT COUNT(*) FROM #temp)/@pagesize) OVER(ORDER BY CASE WHEN @sortby = 'NAME' THEN lastname ELSE convert(varchar(10), startdate, 112) END),
        cnt
    FROM #temp
    cross apply (SELECT COUNT(*) cnt FROM #temp) total
)
--Use the id to join back to main select
SELECT *
FROM paging
JOIN #temp ON paging.id = #temp.id
WHERE paging.pagenum = @page

--Don't need the drop in the procedure, included here for rerunnability
drop table #temp;
I generally prefer temp tables over table variables in this scenario, largely so that there are definite statistics on the result set you have. (Search for temp table vs table variable and you'll find plenty of examples as to why)
Dynamic SQL would be most useful for handling the sorting method. Using my example, you could do the main query in dynamic SQL and only pull the sort method you want to pull into the OVER().
The example above also returns the total in each row of the result set, which as you mentioned is not ideal. You could, instead, have a @totalrows output variable in your procedure and return it alongside the result set. That would save you the CROSS APPLY that I'm doing above in the paging CTE.
I would create one procedure to stage, sort, and paginate (using NTILE()) a staging table; and a second procedure to retrieve by page. This way you don't have to run the entire main query for each page.
This example queries AdventureWorks.HumanResources.Employee:
--------------------------------------------------------------------------
create procedure dbo.EmployeesByMaritalStatus
    @MaritalStatus nchar(1)
    , @sort varchar(20)
as
-- Init staging table
if exists(
    select 1 from sys.objects o
    inner join sys.schemas s on s.schema_id=o.schema_id
        and s.name='Staging'
        and o.name='EmployeesByMaritalStatus'
    where o.type='U'
)
    drop table Staging.EmployeesByMaritalStatus;
-- Populate staging table with sort value
with s as (
    select *
        , sr=ROW_NUMBER()over(order by case @sort
            when 'NationalIDNumber' then NationalIDNumber
            when 'ManagerID' then convert(nvarchar(15), ManagerID)
            -- plus any other sort conditions; convert so all branches share a type
            else convert(nvarchar(15), EmployeeID) end)
    from AdventureWorks.HumanResources.Employee
    where MaritalStatus=@MaritalStatus
)
select *
into #temp
from s;
-- And now pages
declare @RowCount int; select @RowCount=COUNT(*) from #temp;
declare @PageCount int=ceiling(@RowCount/20.0); --assuming 20 lines/page; 20.0 avoids integer division
select *
    , Page=NTILE(@PageCount)over(order by sr)
into Staging.EmployeesByMaritalStatus
from #temp;
go
--------------------------------------------------------------------------
-- procedure to retrieve selected pages
create procedure EmployeesByMaritalStatus_GetPage
    @page int
as
declare @MaxPage int;
select @MaxPage=MAX(Page) from Staging.EmployeesByMaritalStatus;
set @page=case when @page not between 1 and @MaxPage then 1 else @page end;
select EmployeeID,NationalIDNumber,ContactID,LoginID,ManagerID
    , Title,BirthDate,MaritalStatus,Gender,HireDate,SalariedFlag,VacationHours,SickLeaveHours
    , CurrentFlag,rowguid,ModifiedDate
from Staging.EmployeesByMaritalStatus
where Page=@page
GO
--------------------------------------------------------------------------
-- Usage
-- Load staging
exec dbo.EmployeesByMaritalStatus 'M','NationalIDNumber';
-- Get pages 1 through n
exec dbo.EmployeesByMaritalStatus_GetPage 1;
exec dbo.EmployeesByMaritalStatus_GetPage 2;
-- ...etc (this would actually be a foreach loop, but that detail is omitted for brevity)
GO
I use this method with EXEC():
-- SP parameters:
-- @query: Your query as an input parameter
-- @maximumRows: As number of rows per page
-- @startPageIndex: As number of page to filter
-- @sortBy: As a field name or field names with supporting DESC keyword
DECLARE @query nvarchar(max) = 'SELECT * FROM sys.Objects',
    @maximumRows int = 8,
    @startPageIndex int = 3,
    @sortBy as nvarchar(100) = 'name Desc'

SET @query = ';WITH CTE AS (' + @query + ')' +
    'SELECT *, (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingPageNo' +
    ', pagingCountRow / ' + CAST(@maximumRows as nvarchar(10)) + ' As pagingCountPage ' +
    ', (dt.pagingRowNo - 1) % ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 As pagingRowInPage ' +
    'FROM ( SELECT *, ROW_NUMBER() OVER (ORDER BY ' + @sortBy + ') As pagingRowNo, COUNT(*) OVER () AS pagingCountRow ' +
    'FROM CTE) dt ' +
    'WHERE (dt.pagingRowNo - 1) / ' + CAST(@maximumRows as nvarchar(10)) + ' + 1 = ' + CAST(@startPageIndex as nvarchar(10))

EXEC(@query)
The result set then contains your query's result columns followed by the paging columns.
Note: I added some extra columns, which you can remove:
pagingRowNo : The row number
pagingCountRow : The total number of rows
pagingPageNo : The current page number
pagingCountPage : The total number of pages
pagingRowInPage : The row number within the current page, starting at 1