Full text search asterisk returns wrong result - sql-server

I have a Ship table with FTS index, which was created as:
CREATE FULLTEXT INDEX ON Ship
(
Name
)
KEY INDEX PK_Ship_Id
ON MyCatalog
WITH CHANGE_TRACKING AUTO, STOPLIST OFF;
And when I run the query below:
select Name From Ship where CONTAINS(Name, N'"n*"');
I get wrong results, for instance "Vitamin D3 1000 Iu".
But I want to get only rows where the Name field has a word that starts with the character 'n'.

The FTS engine has a strange 'feature': when you search for something like CONTAINS(Name, N'"n*"'), it also matches all numbers, because it stores numbers in the index with an NN prefix.
The best solution I found is to use a LIKE search in these two cases (CONTAINS(Name, N'"n*"') and CONTAINS(Name, N'"nn*"')).
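A minimal sketch of that LIKE fallback, assuming words in Name are separated only by single spaces (punctuation or tabs in front of a word would need extra patterns) and that the column uses a case-insensitive collation:

```sql
-- Match rows where Name starts with 'n' or contains a later word starting with 'n'.
-- Assumption: a space is the only word delimiter in the data.
SELECT Name
FROM Ship
WHERE Name LIKE N'n%'      -- the first word starts with n
   OR Name LIKE N'% n%';   -- a later word starts with n
```

Unlike CONTAINS, this predicate cannot use the full-text index, so it may scan the table.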

Related

Fulltext Contains does not return rows

I have a row in a table that contains "DS012345" in a column called description
When I use this query:
Select * from Tablename where Contains(Description, ' "*012345*" ')
This query returns no result.
I have created the unique index and the full-text catalog, and I have turned off stop words using Object Explorer. I still do not know why it does not return that row.
Any suggestions, or an idea of the cause?
Thanks.
Why not just use LIKE to do the search?
Select * from Tablename where Description LIKE '%012345%'
This simply finds rows where 012345 appears anywhere within the Description column.
Stop words are common words that full-text search ignores, so turning them off does not help with this query.
Full-text search matches whole words or word prefixes; it does not support a leading wildcard. If you want to match part of a word, use LIKE '%...%'.
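For comparison, a prefix term is the one wildcard form that CONTAINS does accept. The sketch below assumes the word breaker keeps "DS012345" as a single token, which depends on the index language and was not verified here:

```sql
-- Prefix search: the asterisk goes at the end, inside double quotes.
-- Assumption: "DS012345" is indexed as one token.
SELECT *
FROM Tablename
WHERE CONTAINS(Description, '"DS0123*"');
```

If the search text can appear in the middle of a word, as with 012345 here, the LIKE query above remains the simpler option.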

Ignore Dash (-) from Full Text Search (FREETEXTTABLE) search column in SQL Server

I use CONTAINSTABLE for my search algorithm. I want to search a column while ignoring dashes in its values. For example, if the column contains '12345-67', it should be found when searching for '1234567', as in the query below.
SELECT *
FROM table1 AS FT_Table
INNER JOIN CONTAINSTABLE(table2, columnname, '1234567') AS Key_Table ON FT_Table.ID = Key_Table.[Key]
Is there any way to ignore dash (-) while searching with string that doesn't contain a dash (-)?
I did some digging and spent a few hours on this :)
Unfortunately, there is no way to do it. It looks like SQL Server FTS populates the index by breaking words not only at whitespace but also at special characters (-, {, ( etc.).
But it does not index the word with the dash removed, and as far as I understand there is no way to give the population service rules for that (I mean, telling it: if a word contains "-", replace it with "").
Here is an example to clarify the situation.
First, create the table, the FTS catalog and the full-text index, and insert a sample row.
CREATE TABLE [dbo].[SampleTextData]
(
[Id] int identity(1,1) not null,
[Text] varchar(max) not null,
CONSTRAINT [PK_SampleTextData] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
);
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;
CREATE FULLTEXT INDEX ON SampleTextData
([Text])
KEY INDEX PK_SampleTextData
ON ftCatalog;
INSERT INTO [SampleTextData] values ('samp-le text');
Then run some sample queries:
select * from containstable(SampleTextData,Text,'samp-le') --Success
select * from containstable(SampleTextData,Text,'samp') --Success
select * from containstable(SampleTextData,Text,'le') --Success
select * from containstable(SampleTextData,Text,'sample') -- Fail
All of these queries succeed except the last one, 'sample'. To investigate, run this query:
SELECT display_term, column_id, document_count
FROM sys.dm_fts_index_keywords (DB_ID('YourDatabase'), OBJECT_ID('SampleTextData'))
Output:
display_term  column_id  document_count
le            2          1
samp          2          1
samp-le       2          1
text          2          1
END OF FILE   2          1
The query returns the terms that the FTS population service stored in the index. As you can see, the index contains 'le', 'samp' and 'samp-le' but not 'sample'. That is why the 'sample' query fails.
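If a non-FTS fallback is acceptable, the dash can be stripped at query time instead. This is only a sketch against the SampleTextData table above, not a full-text solution:

```sql
-- Remove dashes from the stored value before matching,
-- so 'samp-le text' is compared as 'sample text'.
SELECT Id, [Text]
FROM dbo.SampleTextData
WHERE REPLACE([Text], '-', '') LIKE '%sample%';
```

Because REPLACE is applied to every row and the pattern starts with a wildcard, this cannot use an index and may be slow on large tables.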

SQL Server Fulltext search not finding my rows

I have a SQL Server table and I'm trying to make sense of fulltext searching :-)
I have set up a fulltext catalog and a fulltext index on a table Entry, which contains among other columns a VARCHAR(20) column called VPN-ID.
There are about 200'000 rows in that table, and the VPN-ID column has values such as:
VPN-000-359-90
VPN-000-363-85
VPN-000-362-07
VPN-000-362-91
VPN-000-355-55
VPN-000-368-36
VPN-000-356-90
Now I'm trying to find rows in that table with a fulltext enabled search.
When I do
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, 'VPN-000-362-07')
everything's fine and dandy and my rows are returned.
When I start searching with a wildcard like this:
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, 'VPN-000-362-%')
I am getting results and everything seems fine.
HOWEVER: when I search like this:
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, 'VPN-000-36%')
suddenly I get no results back at all..... even though there are clearly rows that match that search criteria...
Any ideas why?? What other "surprises" might fulltext search have in store for me? :-)
Update: to create my fulltext catalog I used:
CREATE FULLTEXT CATALOG MyCatalog WITH ACCENT_SENSITIVITY = OFF
and to create the fulltext index on my table, I used
CREATE FULLTEXT INDEX
ON dbo.Entry(list of columns)
KEY INDEX PK_Entry
I tried to avoid any "oddball" options as much as I could.
Update #2: after a bit more investigation, it appears as if SQL Server Fulltext search somehow interprets my dashes inside the strings as separators....
While this query returns nothing:
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, '"VPN-000-362*"')
this one does (splitting up the search term on the dashes):
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, ' "VPN" AND "000" AND "362*"')
OK - seems a bit odd that a dash appears to result in a splitting up that somehow doesn't work.....
Which language do you use for the word breaker? Have you tried Neutral?
EDIT:
In addition, you should use WHERE CONTAINS([Column], '"text*"'). See MSDN for more information on prefix searches:
C. Using CONTAINS with <prefix_term>
The following example returns all product names with at least one word starting with the prefix chain in the Name column.
USE AdventureWorks2008R2;
GO
SELECT Name
FROM Production.Product
WHERE CONTAINS(Name, ' "Chain*" ');
GO
btw ... similar question here and here
Just wondering, but why don't you just do this:
SELECT (list of columns)
FROM dbo.Entry
WHERE [VPN-ID] LIKE 'VPN-000-36%'
It seems to me that fulltext search is not the right tool for the job. Just use a normal index on that column.
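A sketch of that approach, assuming no index exists on the column yet; IX_Entry_VPN_ID is a hypothetical index name:

```sql
-- Ordinary nonclustered index on the VPN-ID column (hypothetical name).
CREATE NONCLUSTERED INDEX IX_Entry_VPN_ID
ON dbo.Entry ([VPN-ID]);

-- A LIKE pattern with a fixed prefix and a trailing % can seek on that index.
SELECT [VPN-ID]
FROM dbo.Entry
WHERE [VPN-ID] LIKE 'VPN-000-36%';
```

This only helps when the pattern starts with fixed text; a pattern with a leading % would still scan.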

How to search Keywords like IN, OR, AND in a column with FullText Index in MSSQL Server 2008

I want to search keywords like IN, OR, etc from a table with Fulltext index like this:
SELECT * from Table1 where
CONTAINS(countrycode, 'IN OR DE OR GB')
But this query returns only the rows with "DE" or "GB", not "IN". How can this be solved?
Probably because IN and OR are keywords in SQL.
According to http://msdn.microsoft.com/en-us/library/ms187787.aspx:
B. Using CONTAINS and phrase in <simple_term>
The following example returns all products that contain either the phrase "Mountain" or "Road".
USE AdventureWorks2008R2;
GO
SELECT Name
FROM Production.Product
WHERE CONTAINS(Name, ' "Mountain" OR "Road" ')

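Applied to the country-code query, the quoting from the MSDN example might look like the sketch below. Whether "IN" is then found also depends on the stoplist: "in" is a common word and may be dropped from the index as a stop word, so the second statement is an assumption about what may additionally be needed.

```sql
-- Quote each term so IN is treated as a search term, not as part of the syntax.
SELECT *
FROM Table1
WHERE CONTAINS(countrycode, '"IN" OR "DE" OR "GB"');

-- Assumption: if "in" is still ignored, it is being removed as a stop word.
-- Turning the stoplist off for this index keeps such words in the index.
ALTER FULLTEXT INDEX ON Table1 SET STOPLIST OFF;
```

Changing the stoplist may start a repopulation of the index, so it is worth trying on a test copy first.
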
MS SQL FTI - searching on "n*" returns numbers

This seems like odd behaviour from SQL's full-text-index.
FTI stores number in its index with an "NN" prefix, so "123" is saved as "NN123".
Now when a user searches for words beginning with N (i.e. contains "n*" ) they also get all numbers.
So:
select [TextField]
from [MyTable]
where contains([TextField], '"n*"')
Returns:
MyTable.TextField
--------------------------------------------------
This text contains the word navigator
This text is nice
This text only has 123, and shouldn't be returned
Is there a good way to exclude that last row? Is there a consistent workaround for this?
Those extra "" are needed to make the wildcard token work:
select [TextField] from [MyTable] where contains([TextField], 'n*')
Would search for literal n* - and there aren't any.
--return rows with the word text
select [TextField] from [MyTable] where contains([TextField], 'text')
--return rows with the word tex*
select [TextField] from [MyTable] where contains([TextField], 'tex*')
--return rows with words that begin tex...
select [TextField] from [MyTable] where contains([TextField], '"tex*"')
There are a couple of ways to handle this, though neither is really all that great.
First, add a column to your table that says that TextField is really a number. If you could do that and filter, you would have the most performant version.
If that's not an option, then you will need to add a further filter. While I haven't extensively tested it, you could add the filter AND TextField NOT LIKE 'NN%[0-9]%'
The downside is that this would filter out 'NN12NOO' but that may be an edge case not represented by your data.
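Another possible filter, shown here only as a sketch, is to keep the full-text predicate and add a LIKE check that a word in the raw text really starts with n. It assumes words are separated by single spaces and that the collation is case-insensitive:

```sql
-- CONTAINS narrows the rows via the full-text index;
-- the LIKE checks then drop rows that matched only because of the NN number prefix.
SELECT [TextField]
FROM [MyTable]
WHERE CONTAINS([TextField], '"n*"')
  AND ([TextField] LIKE 'n%' OR [TextField] LIKE '% n%');
```

Words preceded by punctuation, such as "(navigator", would be missed by these two patterns and would need their own LIKE clauses.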
