MS-SQL 2005 search: conditional where clause with freetext - sql-server

I'm writing a fairly complex stored procedure to search an image library.
I was going to use a view and write dynamic SQL to query the view, but I need to use a full-text index, and my view needs outer joins (see "MS-SQL 2005 full-text index on a view with outer joins").
So, I'm back to a stored procedure.
I need to search on (all optional):
a general search query that uses the full text index (or no search terms)
one or more categories (or none)
a single tag (or none)
Is there a way to do a conditional FREETEXT in the 'WHERE' clause? The query may be empty, in which case I want to ignore this, or just return all FTI matches.
...AND FREETEXT(dbo.MediaLibraryCultures.*, '"* "') doesn't seem to work. Not sure how a case statement would work here.
Am I better off inserting the category/tag filter results into a temp table/table variable, then joining the FTI search results? That way I can only do the join if the search term is supplied.
Thoughts?

I know it's a year later and a newer version of SQL but FYI...
I am using SQL Server 2008 and have tried to short circuit using
AND ( @searchText = '' OR freetext(Name, @searchText))
and I receive the message "Null or empty full-text predicate" when setting @searchText = ''. I guess something in 2008 has changed that keeps short circuiting from working in this case.

You could add a check for the empty search string like
where ...
AND (FREETEXT(dbo.MediaLibraryCultures.*, @FreeTextSearchFor) OR @FreeTextSearchFor = '')
(I have a feeling that freetext searches can't have null passed into them, so I'm comparing to an empty string)
If the term to search for is empty, the whole clause will evaluate to true, so no restrictions will be applied (by this clause) to the rows returned. And since it's a constant being compared to a variable, I would think the optimizer would come into play and not perform that comparison for each row.

Hmm, I thought there was no short-circuiting in sql server?
AND (@q = '' OR FREETEXT(dbo.MediaLibraryCultures.*, @q))
seems to work just fine!
Strangely, the full text scan is still part of the execution plan.

Doesn't work on SQL Server 2014. I tried the suggested short circuit in one of my stored procedures, but it keeps evaluating the FREETEXT expression. The only solution I have found is the following:
IF ISNULL(@Text, N'') = N'' SET @Text = N'""'
SELECT ...
WHERE ...
AND (@Text = '""' OR FREETEXT([Data], @Text))
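Since the answers above disagree on whether the OR trick short-circuits reliably across versions, another option is the one the question itself hints at: leave the FREETEXT predicate out of the statement entirely when no search text is supplied. The following is a minimal Python sketch of that idea, assuming the statement is assembled in client code and run through a driver that uses ? placeholders. The table dbo.MediaLibrary, the columns MediaId, CategoryId and Tag, and the function name are hypothetical; only dbo.MediaLibraryCultures comes from the question.

```python
def build_media_search(search_text=None, category_ids=(), tag=None):
    """Return (sql, params) containing only the predicates that were requested.

    dbo.MediaLibrary, MediaId, CategoryId and Tag are hypothetical names.
    """
    clauses = []
    params = []

    # Emit FREETEXT only when there is something to search for, so the
    # engine never sees a null or empty full-text predicate.
    text = (search_text or "").strip()
    if text:
        clauses.append("FREETEXT(dbo.MediaLibraryCultures.*, ?)")
        params.append(text)

    # One placeholder per category id; skipped when no category is given.
    if category_ids:
        placeholders = ", ".join("?" for _ in category_ids)
        clauses.append(f"dbo.MediaLibrary.CategoryId IN ({placeholders})")
        params.extend(category_ids)

    if tag:
        clauses.append("dbo.MediaLibrary.Tag = ?")
        params.append(tag)

    sql = (
        "SELECT dbo.MediaLibrary.MediaId FROM dbo.MediaLibrary "
        "JOIN dbo.MediaLibraryCultures "
        "ON dbo.MediaLibraryCultures.MediaId = dbo.MediaLibrary.MediaId"
    )
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

The same branching can be done inside a stored procedure with IF statements or sp_executesql; the point of the sketch is only that the full-text predicate is absent, not merely bypassed, when the search text is empty.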

Related

Compare NULL Parameters with columns that may contain NULL values

I'm trying to work on SQL Server with some parameters that can be NULL, where NULL means "ignore this parameter".
Then I have the column where the middle name is stored and can contain nulls.
I have the following conditions that work really fast:
T.tr_ben_name = ISNULL(@BenFirstName, T.tr_send_name) AND
T.tr_ben_middle = ISNULL(@BenMiddleName, T.tr_send_middle) AND
T.tr_ben_last = ISNULL(@BenLastName, T.tr_send_last) AND
T.tr_ben_last2 = ISNULL(@BenSecondLastName, T.tr_send_last2 )
But for some reason, if the middle name value and the corresponding parameter are both NULL, the record will be skipped, even if I turn off ANSI NULLS.
Then I came up with this other version that works well but is 4 times slower:
(T.tr_ben_name = @BenFirstName OR @BenFirstName IS NULL) AND
(T.tr_ben_middle = @BenMiddleName OR @BenMiddleName IS NULL) AND
(T.tr_ben_last = @BenLastName OR @BenLastName IS NULL) AND
(T.tr_ben_last2 = @BenSecondLastName OR @BenSecondLastName IS NULL)
Can anyone explain what is the difference between these 2 approaches?
This is a short summary of why the queries perform differently, and what you can do to help the performance. For more details, see Catch-All Queries and Revisiting Catch-all Queries by Gail Shaw. For an exhaustive analysis, see Dynamic Search Conditions in T‑SQL By Erland Sommarskog.
Basically, the query with conditional parameters tries to create a query plan that works for all possible combinations of parameters passed in. This means that it doesn't default to lookup by index seek if you pass in the seekable parameters only. Instead, it uses some query plan that is usable for any combination of parameters.
Basic fixes for this issue are
Add OPTION (RECOMPILE) on the end of your query (SQL 2008 R2 SP1, 2008 SP3, or higher only)
Use Dynamic SQL. Only add the conditions being checked with non-null parameters.
If you want the full detailed whys and wherefores, the articles above are very good.
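The answer above covers the performance difference; the skipped rows in the question have a separate cause, which is that the two forms are not logically equivalent. When the parameter is NULL, the first form falls back to comparing one column with another, and a comparison involving a NULL column value is never true. (As far as I know, SET ANSI_NULLS OFF only changes comparisons against a NULL variable or literal, not column-to-column comparisons, which would explain why turning it off did not help.) A small sketch of the difference, using Python's built-in sqlite3 as a stand-in engine: COALESCE plays the role of ISNULL, the table and column names are made up, and the column is compared with itself rather than with a second column to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (first_name TEXT, middle_name TEXT)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?)",
    [("Ana", "Maria"), ("Ana", None)],  # second row has a NULL middle name
)


def count_rows(where, params):
    """Count the rows of t that satisfy the given WHERE fragment."""
    sql = f"SELECT COUNT(*) FROM t WHERE {where}"
    return conn.execute(sql, params).fetchone()[0]


middle = None  # NULL parameter, meaning "ignore this filter"

# Form 1: column = COALESCE(param, column). For the row whose middle_name is
# NULL this becomes NULL = NULL, which is not true, so that row is dropped.
coalesce_rows = count_rows("middle_name = COALESCE(?, middle_name)", (middle,))

# Form 2: (column = param OR param IS NULL). The second branch is true for
# every row when the parameter is NULL, so both rows are kept.
or_rows = count_rows("(middle_name = ? OR ? IS NULL)", (middle, middle))

print(coalesce_rows, or_rows)
```

So the faster version is only safe for columns that can never contain NULL.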

MS SQL Excel Query wildcards

I'm trying to introduce a LIKE clause with wildcards in a SQL query that runs within Excel 2007, where parameters are taken from specific Excel cells:
SELECT Elen_SalesData_View.ItemCode, Elen_SalesData_View.ItemDescription,
Elen_SalesData_View.ItemValue, Elen_SalesData_View.Quantity,
Elen_SalesData_View.CustomerId, Elen_SalesData_View.CustomerName,
Elen_SalesData_View.SalesInvoiceId, Elen_SalesData_View.EffectiveDate,
Elen_SalesData_View.CountryId
FROM SM_Live.dbo.Elen_SalesData_View Elen_SalesData_View
WHERE (Elen_SalesData_View.EffectiveDate>=? And Elen_SalesData_View.EffectiveDate<=?)
AND (Elen_SalesData_View.CustomerName<>'PROMO')
AND (Elen_SalesData_View.ItemDescription LIKE '%'+?+'%')
The EffectiveDate parameters work fine and bring back data as expected. But since I introduced LIKE, the query runs but returns nothing.
It doesn't return any results without wildcards either (full description entered):
(Elen_SalesData_View.ItemDescription LIKE ?)
Is there a restriction to wildcards or LIKE clause? If so, is there a way around it? (I cannot use CONTAINS, as the ItemDescription field is not FULLTEXT)
Have a look at this reference which suggests that % itself is the wildcard character, although it may depend on the dialect of SQL you are using. If this is the case then your LIKE clause will simply be LIKE '%' but untested.
I've just got this to work by using the (Elen_SalesData_View.ItemDescription LIKE ?) syntax then having the cell that contains the parameter value include the wildcard characters. If you don't/can't include the wildcards then create a formula in a separate cell to wrap the value in % characters and use this cell for the parameter value.
Rhys
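In Excel itself, the wrapping described above can be a formula such as ="%" & A1 & "%" (A1 being a hypothetical cell holding the raw value). If the parameter value is prepared in code instead, it is worth also neutralising characters that T-SQL LIKE treats as wildcards, so that a description containing % or _ is matched literally. A minimal Python sketch of that, with a made-up function name:

```python
def like_contains_pattern(value):
    """Wrap a raw value in % wildcards for use as a LIKE parameter.

    Characters that T-SQL LIKE treats specially are put in brackets so
    that they match literally.
    """
    escaped = (
        value.replace("[", "[[]")  # must come first: later steps add brackets
        .replace("%", "[%]")
        .replace("_", "[_]")
    )
    return f"%{escaped}%"
```

The result is passed as the single parameter of ItemDescription LIKE ?, so the query text itself needs no string concatenation.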
My query was correct. There was something wrong with the actual spreadsheet. After redoing all from scratch - it worked!
SELECT Elen_SalesData_View.ItemCode, Elen_SalesData_View.ItemDescription,
Elen_SalesData_View.ItemValue, Elen_SalesData_View.Quantity,
Elen_SalesData_View.CustomerId, Elen_SalesData_View.CustomerName,
Elen_SalesData_View.SalesInvoiceId, Elen_SalesData_View.EffectiveDate,
Elen_SalesData_View.CountryId
FROM SM_Live.dbo.Elen_SalesData_View Elen_SalesData_View
WHERE (Elen_SalesData_View.ItemDescription Like '%'+?+'%')
AND (Elen_SalesData_View.EffectiveDate>=?) AND (Elen_SalesData_View.EffectiveDate<=?)
AND (Elen_SalesData_View.CustomerName<>'PROMO')

Stored procedure for Search Products with words in any order

I am stuck on a nopCommerce stored procedure for product search, which is quite big, so I cannot post all of the code. Part of the stored procedure is:
where
--Some conditions
AND (
@SearchKeywords = 0 or
(
patindex(@Keywords, p.name) > 0
)
)
Here I have converted my keywords into a pattern with delimiters, e.g. 'gemini oil' becomes '%gemini%oil%'. If p.name is 'Gemini Refined Sunflower Oil', it works correctly.
But if my keyword pattern is '%oil%gemini%', it does not work. Basically, I want to return a result whenever the words in the search keywords match p.name, with the words in any order. CONTAINS slows down the stored procedure, so that option cannot work.
Any help would be appreciated.
SQL Server Full Text Search should help you out. You will basically create indexes on the columns you want to search. In the WHERE clause of your query you will use the CONTAINS predicate and pass it your search input.
You can start at http://msdn.microsoft.com/en-us/library/ms142571.aspx or
http://beingoyen.blogspot.in/2008/09/full-text-search-step-by-step-tutorial.html
to learn more.
(Already answered in a different post by stephen776; posted here again to increase visibility.)
You should create a full-text index. That's what they are made for. You can then search in your column using the CONTAINS or FREETEXT predicates.
Warning: full-text search is very powerful, and you should get familiar with its possibilities and caveats (stopword lists!).
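With a full-text index in place, "all words, in any order" can be expressed by joining the individual words with AND inside the CONTAINS search condition. A minimal Python sketch of building that condition from the raw keyword string follows; the function name is made up, and the sketch assumes the words are separated by whitespace.

```python
def contains_all_words(keywords):
    """Build a CONTAINS search condition requiring every word, in any order."""
    # Drop double quotes so user input cannot break out of the quoted terms.
    words = keywords.replace('"', " ").split()
    return " AND ".join(f'"{word}"' for word in words)
```

For 'gemini oil' this produces "gemini" AND "oil", which would be passed as the second argument of CONTAINS(p.name, ...). An empty keyword string yields an empty condition, so the caller should skip the CONTAINS predicate in that case rather than pass it through.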

T-SQL Where Clause Case Statement Optimization (optional parameters to StoredProc)

I've been battling this one for a while now. I have a stored proc that takes in 3 parameters that are used to filter. If a specific value is passed in, I want to filter on that. If -1 is passed in, give me all.
I've tried it the following two ways:
First way:
SELECT field1, field2...etc
FROM my_view
WHERE
parm1 = CASE WHEN @PARM1 = -1 THEN parm1 ELSE @PARM1 END
AND parm2 = CASE WHEN @PARM2 = -1 THEN parm2 ELSE @PARM2 END
AND parm3 = CASE WHEN @PARM3 = -1 THEN parm3 ELSE @PARM3 END
Second Way:
SELECT field1, field2...etc
FROM my_view
WHERE
(@PARM1 = -1 OR parm1 = @PARM1)
AND (@PARM2 = -1 OR parm2 = @PARM2)
AND (@PARM3 = -1 OR parm3 = @PARM3)
I read somewhere that the second way will short circuit and never eval the second part if true. My DBA said it forces a table scan. I have not verified this, but it seems to run slower on some cases.
The main table that this view selects from has somewhere around 1.5 million records, and the view proceeds to join on about 15 other tables to gather a bunch of other information.
Both of these methods are slow...taking me from instant to anywhere from 2-40 seconds, which in my situation is completely unacceptable.
Is there a better way that doesn't involve breaking it down into each separate case of specific vs -1 ?
Any help is appreciated. Thanks.
I read somewhere that the second way will short circuit and never eval the second part if true. My DBA said it forces a table scan.
You read wrong; it will not short circuit. Your DBA is right; it will not play well with the query optimizer and likely force a table scan.
The first option is about as good as it gets. Your options to improve things are dynamic sql or a long stored procedure with every possible combination of filter columns so you get independent query plans. You might also try using the "WITH RECOMPILE" option, but I don't think it will help you.
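To make the dynamic SQL option concrete, here is a minimal Python sketch that adds a predicate only for parameters that carry a real value, so each combination of filters gets its own statement text and therefore its own plan. It assumes the statement is built in client code with ? placeholders; the function name and the ALLOWED_COLUMNS tuple are made up, while my_view, field1, field2 and parm1 to parm3 come from the question.

```python
ALLOWED_COLUMNS = ("parm1", "parm2", "parm3")


def build_filter_query(filters, all_value=-1):
    """Return (sql, params), adding a predicate only for real filter values."""
    clauses = []
    params = []
    for column in ALLOWED_COLUMNS:
        value = filters.get(column, all_value)
        if value != all_value:
            # Column names come from the fixed tuple above, never from input;
            # only the values travel as parameters.
            clauses.append(f"{column} = ?")
            params.append(value)
    sql = "SELECT field1, field2 FROM my_view"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params
```

Inside a stored procedure the equivalent would be to assemble the string in T-SQL and run it with sp_executesql, still passing the values as parameters.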
If you are running SQL Server 2005 or above, you can use IF statements to make multiple versions of the query, each with the proper WHERE clause, so that an index can be used. Each query plan will be placed in the query cache.
Also, here is a very comprehensive article on this topic:
Dynamic Search Conditions in T-SQL by Erland Sommarskog
It covers all the issues and methods of trying to write queries with multiple optional search conditions.
Here is the table of contents:
Introduction
The Case Study: Searching Orders
The Northgale Database
Dynamic SQL
Introduction
Using sp_executesql
Using the CLR
Using EXEC()
When Caching Is Not Really What You Want
Static SQL
Introduction
x = @x OR @x IS NULL
Using IF statements
Umachandar's Bag of Tricks
Using Temp Tables
x = @x AND @x IS NOT NULL
Handling Complex Conditions
Hybrid Solutions – Using both Static and Dynamic SQL
Using Views
Using Inline Table Functions
Conclusion
Feedback and Acknowledgements
Revision History
If you pass in a null value when you want everything, then you can write your where clause as
Where colName = IsNull(@Parameter, colName)
This is basically same as your first method... it will work as long as the column itself is not nullable... Null values IN the column will mess it up slightly.
The only approach to speed it up is to add an index on the column being filtered on in the Where clause. Is there one already? If not, that will result in a dramatic improvement.
No other way I can think of than doing:
WHERE
(MyCase IS NULL OR MyCase = @MyCaseParameter)
AND ....
The second one is simpler and more readable to other developers, if you ask me.
SQL 2008 and later make some improvements to optimization for things like (MyCase IS NULL OR MyCase = @MyCaseParameter) AND ....
If you can upgrade, and if you add an OPTION (RECOMPILE) to get decent perf for all possible param combinations (this is a situation where there is no single plan that is good for all possible param combinations), you may find that this performs well.
http://blogs.msdn.com/b/bartd/archive/2009/05/03/sometimes-the-simplest-solution-isn-t-the-best-solution-the-all-in-one-search-query.aspx

How do you get leading wildcard full-text searches to work in SQL Server?

Note: I am using SQL's Full-text search capabilities, CONTAINS clauses and all - the * is the wildcard in full-text, % is for LIKE clauses only.
I've read in several places now that "leading wildcard" searches (e.g. using "*overflow" to match "stackoverflow") are not supported in MS SQL. I'm considering using a CLR function to add regex matching, but I'm curious to see what other solutions people might have.
More Info: You can add the asterisk only at the end of the word or phrase. - along with my empirical experience: When matching "myvalue", "my*" works, but "(asterisk)value" returns no match, when doing a query as simple as:
SELECT * FROM TABLENAME WHERE CONTAINS(TextColumn, '"*searchterm"');
Thus, my need for a workaround. I'm only using search in my site on an actual search page, so it needs to work basically the same way that Google works (in the eyes of a Joe Sixpack-type user). Not nearly as complicated, but this sort of match really shouldn't fail.
Workaround only for leading wildcard:
store the text reversed in a different field (or in materialised view)
create a full text index on this column
find the reversed text with an *
SELECT *
FROM TABLENAME
WHERE CONTAINS(TextColumnREV, '"mrethcraes*"');
Of course there are many drawbacks, just for quick workaround...
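The search-term side of this workaround can be sketched as follows: reverse the suffix the user typed and append the trailing wildcard, so that a "ends with" search becomes a prefix search against the reversed column. This is a minimal Python sketch with a made-up function name; it assumes the reversed column (TextColumnREV above) is populated and indexed separately.

```python
def reversed_prefix_term(suffix):
    """Turn an 'ends with' search into a prefix term for the reversed column."""
    # Remove double quotes so the input cannot close the quoted term early.
    cleaned = suffix.replace('"', "").strip()
    return f'"{cleaned[::-1]}*"'
```

For example, reversed_prefix_term("searchterm") gives "mrethcraes*", the term used in the query above.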
Not to mention CONTAINSTABLE...
The problem with leading Wildcards: They cannot be indexed, hence you're doing a full table scan.
It is possible to use the wildcard "*" at the end of the word or phrase (prefix search).
For example, this query will find all "datab", "database", "databases" ...
SELECT * FROM SomeTable WHERE CONTAINS(ColumnName, '"datab*"')
But, unfortunately, it is not possible to search with a leading wildcard.
For example, this query will not find "database"
SELECT * FROM SomeTable WHERE CONTAINS(ColumnName, '"*abase"')
To perhaps add clarity to this thread, from my testing on 2008 R2, Franjo is correct above. When dealing with full-text searching, at least when using the CONTAINS predicate, you cannot use a leading *, only a trailing *. The wildcard in full text is *, not %.
Some have suggested that * is ignored. That does not seem to be the case, my results seem to show that the trailing * functionality does work. I think leading * are ignored by the engine.
My added problem however is that the same query, with a trailing *, that uses full text with wildcards worked relatively fast on 2005(20 seconds), and slowed to 12 minutes after migrating the db to 2008 R2. It seems at least one other user had similar results and he started a forum post which I added to... FREETEXT works fast still, but something "seems" to have changed with the way 2008 processes trailing * in CONTAINS. They give all sorts of warnings in the Upgrade Advisor that they "improved" FULL TEXT so your code may break, but unfortunately they do not give you any specific warnings about certain deprecated code etc. ...just a disclaimer that they changed it, use at your own risk.
http://social.msdn.microsoft.com/Forums/ar-SA/sqlsearch/thread/7e45b7e4-2061-4c89-af68-febd668f346c
Maybe, this is the closest MS hit related to these issues... http://msdn.microsoft.com/en-us/library/ms143709.aspx
One thing worth keeping in mind is that leading wildcard queries come at a significant performance premium, compared to other wildcard usages.
Note: this was the answer I submitted for the original version #1 of the question before the CONTAINS keyword was introduced in revision #2. It's still factually accurate.
The wildcard character in SQL Server is the % sign and it works just fine, leading, trailing or otherwise.
That said, if you're going to be doing any kind of serious full text searching then I'd consider utilising the Full Text Index capabilities. Using % and _ wild cards will cause your database to take a serious performance hit.
Just FYI, Google does not do any substring searches or truncation, right or left. They have a wildcard character * to find unknown words in a phrase, but not a word.
Google, along with most full-text search engines, sets up an inverted index based on the alphabetical order of words, with links to their source documents. Binary search is wicked fast, even for huge indexes. But it's really really hard to do a left-truncation in this case, because it loses the advantage of the index.
As a parameter in a stored procedure you can use it as:
ALTER procedure [dbo].[uspLkp_DrugProductSelectAllByName]
(
@PROPRIETARY_NAME varchar(10)
)
as
set nocount on
-- varchar(13) leaves room for the 10-character name plus the added quotes and asterisk
declare @PROPRIETARY_NAME2 varchar(13) = '"' + @PROPRIETARY_NAME + '*"'
select ldp.*, lkp.DRUG_PKG_ID
from Lkp_DrugProduct ldp
left outer join Lkp_DrugPackage lkp on ldp.DRUG_PROD_ID = lkp.DRUG_PROD_ID
where contains(ldp.PROPRIETARY_NAME, @PROPRIETARY_NAME2)
When it comes to full-text searching, for my money nothing beats Lucene. There is a .Net port available that is compatible with indexes created with the Java version.
There's a little work involved in that you have to create/maintain the indexes, but the search speed is fantastic and you can create all sorts of interesting queries. Even indexing speed is pretty good - we just completely rebuild our indexes once a day and don't worry about updating them.
As an example, this search functionality is powered by Lucene.Net.
Perhaps the following link will provide the final answer to this use of wildcards: Performing FTS Wildcard Searches.
Note the passage that states: "However, if you specify "*Chain" or "*Chain*", you will not get the expected result. The asterisk will be considered as a normal punctuation mark not a wildcard character."
If you have access to the list of words of the full text search engine, you could do a 'like' search on this list and match the database with the words found, e.g. a table 'words' with following words:
pie
applepie
spies
cherrypie
dog
cat
To match all words containing 'pie' in this database on a fts table 'full_text' with field 'text':
to-match <- SELECT word FROM words WHERE word LIKE '%pie%'
matcher = ""
a = ""
foreach(m, to-match) {
matcher += a
matcher += m
a = " OR "
}
SELECT text FROM full_text WHERE text MATCH matcher
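The loop in the pseudocode above can be written as a short runnable Python sketch. It assumes the word list has already been read into memory (the LIKE '%pie%' step is replaced by an in-memory substring test) and only builds the OR expression; the function name is made up.

```python
def build_match_expression(words, fragment):
    """OR together every indexed word that contains the fragment."""
    fragment = fragment.lower()
    matches = [word for word in words if fragment in word.lower()]
    return " OR ".join(matches)


words = ["pie", "applepie", "spies", "cherrypie", "dog", "cat"]
matcher = build_match_expression(words, "pie")
print(matcher)  # pie OR applepie OR spies OR cherrypie
```

The resulting string is what would be handed to the full-text query in the last line of the pseudocode.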
% Matches any number of characters
_ Matches a single character
I've never used Full-Text indexing, but you can accomplish rather complex and fast search queries simply by using the built-in T-SQL string functions.
From SQL Server Books Online:
To write full-text queries in Microsoft SQL Server 2005, you must learn how to use the CONTAINS and FREETEXT Transact-SQL predicates, and the CONTAINSTABLE and FREETEXTTABLE rowset-valued functions.
That means all of the queries written above with the % and _ are not valid full text queries.
Here is a sample of what a query looks like when calling the CONTAINSTABLE function.
SELECT RANK, *
FROM TableName,
CONTAINSTABLE(TableName, *, '"*WildCard"') searchTable
WHERE [KEY] = TableName.pk
ORDER BY searchTable.RANK DESC
In order for the CONTAINSTABLE function to know that I'm using a wildcard search, I have to wrap it in double quotes. I can use the wildcard character * at the beginning or ending. There are a lot of other things you can do when you're building the search string for the CONTAINSTABLE function. You can search for a word near another word, search for inflectional words (drive = drives, drove, driving, and driven), and search for synonym of another word (metal can have synonyms such as aluminum and steel).
I just created a table, put a full text index on the table and did a couple of test searches and didn't have a problem, so wildcard searching works as intended.
[Update]
I see that you've updated your question and know that you need to use one of the functions.
You can still search with the wildcard at the beginning, but if the word is not a full word following the wildcard, you have to add another wildcard at the end.
Example: "*ildcar" will look for a single word as long as it ends with "ildcar".
Example: "*ildcar*" will look for a single word with "ildcar" in the middle, which means it will match "wildcard". [Just noticed that Markdown removed the wildcard characters from the beginning and ending of my quoted string here.]
[Update #2]
Dave Ward - Using a wildcard with one of the functions shouldn't be a huge perf hit. If I created a search string with just "*", it will not return all rows, in my test case, it returned 0 records.

Resources