I'm having some trouble with the full-text CONTAINS operator. Here's a quick script to show what I'm doing. Note that the WAITFOR line simply gives the full-text index a moment to finish populating.
create table test1 ( id int constraint pk primary key, string nvarchar(100) not null );
insert into test1 values (1, 'dog')
insert into test1 values (2, 'dogbreed')
insert into test1 values (3, 'dogbreedinfo')
insert into test1 values (4, 'dogs')
insert into test1 values (5, 'breeds')
insert into test1 values (6, 'breed')
insert into test1 values (7, 'breeddogs')
go
create fulltext catalog cat1
create fulltext index on test1 (string) key index pk on cat1
waitfor delay '00:00:03'
go
select * from test1 where contains (string, '"*dog*"')
go
drop table test1
drop fulltext catalog cat1
The result set returned is:
1 dog
2 dogbreed
3 dogbreedinfo
4 dogs
Why is record #7 'breeddogs' not returned?
EDIT
Is there another way I should be searching for strings that are contained in other strings? A way that is faster than LIKE '%searchword%' ?
That is simply because MS Full-Text Search does not support suffix search, only prefix search; the '*' in front of '*dog*' is ignored. This is clearly stated in Books Online, by the way.
CONTAINS can search for:
A word or phrase.
The prefix of a word or phrase.
A word near another word.
A word inflectionally generated from another (for example, the word drive is the inflectional stem of drives, drove, driving, and driven).
A word that is a synonym of another word using a thesaurus (for example, the word metal can have synonyms such as aluminum and steel).
Where a prefix term is defined like this:
<prefix_term> ::= { "word*" | "phrase*" }
So, unfortunately, there's no way to perform a LIKE-style substring search with full-text search.
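To see why row 7 is missing, here is a minimal Python sketch of the matching rule. It is a simplified model, not SQL Server's actual word breaker: each string is treated as whitespace-separated words, and a prefix term matches when some word starts with the search text. The function names are made up for this illustration.

```python
# Sample rows from the question's test1 table.
ROWS = {
    1: "dog",
    2: "dogbreed",
    3: "dogbreedinfo",
    4: "dogs",
    5: "breeds",
    6: "breed",
    7: "breeddogs",
}


def prefix_term_ids(rows, prefix):
    """Model of CONTAINS(col, '"prefix*"'): some word must START with prefix."""
    prefix = prefix.lower()
    return [
        row_id
        for row_id, text in rows.items()
        if any(word.startswith(prefix) for word in text.lower().split())
    ]


def substring_ids(rows, needle):
    """Model of LIKE '%needle%': needle may appear anywhere in the string."""
    needle = needle.lower()
    return [row_id for row_id, text in rows.items() if needle in text.lower()]


print(prefix_term_ids(ROWS, "dog"))  # ids of rows with a word beginning with "dog"
print(substring_ids(ROWS, "dog"))    # substring matching also finds "breeddogs"
```

Under this model the prefix search returns rows 1 to 4, matching the result set in the question, while only the substring search reaches row 7.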
CREATE TABLE tsearch.pgweb(id int, body text, title text, last_mod_date date);
CREATE TABLE
omm=# INSERT INTO tsearch.pgweb VALUES(1, 'China, officially the People''s Republic of China(PRC), located in Asia, is the world''s most populous state.', 'China', '2010-1-1');
INSERT 0 1
omm=# INSERT INTO tsearch.pgweb VALUES(2, 'America is a rock band, formed in England in 1970 by multi-instrumentalists Dewey Bunnell, Dan Peek, and Gerry Beckley.', 'America', '2010-1-1');
INSERT 0 1
omm=# INSERT INTO tsearch.pgweb VALUES(3, 'England is a country that is part of the United Kingdom. It shares land borders with Scotland to the north and Wales to the west.', 'England','2010-1-1');
-- To speed up text searches, GIN indexes can be created (specify the english configuration to parse and normalize strings)
omm=# CREATE INDEX pgweb_idx_1 ON tsearch.pgweb USING gin(to_tsvector('english', body));
CREATE INDEX
-- concatenated columns index
omm=# CREATE INDEX pgweb_idx_3 ON tsearch.pgweb USING gin(to_tsvector('english', title || ' ' || body));
CREATE INDEX
At this point, running EXPLAIN SELECT body FROM tsearch.pgweb WHERE to_tsvector(body) @@ to_tsquery('america'); shows that the GIN index is not used.
Under what circumstances will such an index be used? (It was still not used when I tested after inserting 10,000 rows.)
I have a problem using SQL Server full-text search with a parameter.
Alter Procedure [dbo].[SelectFullName]
@fullname nvarchar(45)
As
Select * from [dbo].[NamePersonTB]
Where CONTAINS (fullname,'"*@fullname*"')
I want it to behave the same as LIKE on fullname.
You are using @Fullname as a literal string by wrapping it in single quotes. You need to pass the variable directly into the CONTAINS function if you want to use the actual value of @Fullname as your search criteria.
Also note that you cannot do %SearchTerm% in exactly the same way if you want to leverage your full-text index. You can search for words that have a matching prefix, but not a matching suffix or middle. For more info on wildcard (*) usage with CONTAINS, see the <prefix_term> section in the MS doc.
Below I've created two ways I might set up the full-text search, a simple version and a more advanced one. I'm not sure of your business needs, but the more advanced one fully leverages the full-text index and has a "smart" ranking option that is pretty neat.
Table Setup
CREATE TABLE NamePersonTB (ID INT IDENTITY(1,1) CONSTRAINT PK_NamePersonTB Primary Key,FullName NVARCHAR(100))
INSERT INTO NamePersonTB
VALUES ('John Smith')
,('Jane Smith')
,('Bill Gates')
,('Satya Nadella')
CREATE FULLTEXT CATALOG ct_test AS DEFAULT;
CREATE FULLTEXT INDEX ON NamePersonTB(FullName) KEY INDEX PK_NamePersonTB;
Fulltext Search Script
DECLARE @FullName NVARCHAR(45);
/*Sample searches*/
SET @FullName = 'John Smith' /*Notice John Smith appears first in ranked search*/
--SET @FullName = 'Smith'
--SET @FullName = 'Sm'
--SET @FullName = 'Bill'
DECLARE @SimpleContainsSearchCriteria NVARCHAR(1000)
,@RankedContainsSearchCriteria NVARCHAR(1000)
/*
Below will
1. Parse the words into rows
2. Add a wildcard to the end of each word (cannot add a wildcard to the start according to MS doc on CONTAINS)
3. Combine all words back into a single row with a separator to create the CONTAINS search criteria
*/
SELECT @SimpleContainsSearchCriteria = STRING_AGG(CONCAT('"',A.[Value],'*"'),' AND ')
FROM STRING_SPLIT(REPLACE(@Fullname,'"',''),' ') AS A /*REPLACE() removes any double quotes as they will break your search*/
/*Same as above, but uses OR to include more results and will utilize [Rank] so better matches appear first*/
SELECT @RankedContainsSearchCriteria = STRING_AGG(CONCAT('"',A.[Value],'*"'),' OR ')
FROM STRING_SPLIT(REPLACE(@Fullname,'"',''),' ') AS A
/*Included so you can see the search criteria. Should remove in final proc*/
SELECT @Fullname AS FullNameInput
,@SimpleContainsSearchCriteria AS SimpleSearchCriteria
,@RankedContainsSearchCriteria AS RankedContainsSearchCriteria
/*Simple AND match*/
SELECT *
FROM NamePersonTB AS A
WHERE CONTAINS(FullName,@SimpleContainsSearchCriteria)
/*CONTAINSTABLE match alternative. Uses OR criteria and then ranks so best matches appear at the top*/
SELECT *
FROM CONTAINSTABLE(NamePersonTB,FullName,@RankedContainsSearchCriteria) AS A
INNER JOIN NamePersonTB AS B
ON A.[Key] = B.ID
ORDER BY A.[Rank] DESC
Sample Search Criteria
FullNameInput   SimpleSearchCriteria    RankedContainsSearchCriteria
John Smith      "John*" AND "Smith*"    "John*" OR "Smith*"
Output of Simple Search
ID   FullName
1    John Smith
Output of Ranked Search
KEY   RANK   ID   FullName
1     48     1    John Smith
2     32     2    Jane Smith
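If the search criteria are built in application code rather than in T-SQL, the same steps (strip double quotes, split into words, append a wildcard, join with AND or OR) can be sketched in Python as below. The function name build_contains_criteria is hypothetical and not part of any library. One difference from the STRING_SPLIT version: Python's split() drops the empty strings that repeated spaces would otherwise produce.

```python
def build_contains_criteria(full_name, operator="AND"):
    """Turn 'John Smith' into '"John*" AND "Smith*"' (or the OR variant)."""
    if operator not in ("AND", "OR"):
        raise ValueError("operator must be 'AND' or 'OR'")
    # Strip double quotes, which would break the quoted prefix terms.
    cleaned = full_name.replace('"', "")
    # split() with no argument also discards empty strings from repeated spaces.
    words = cleaned.split()
    # Wrap each word as a quoted prefix term and join with the operator.
    return f" {operator} ".join(f'"{word}*"' for word in words)


print(build_contains_criteria("John Smith"))        # simple AND criteria
print(build_contains_criteria("John Smith", "OR"))  # criteria for ranked search
```

The resulting string would then be passed to CONTAINS or CONTAINSTABLE as a parameter, never concatenated into the SQL text.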
I use CONTAINSTABLE for my search algorithm. I want to search a column while ignoring dashes in its values. For example, if the column contains '12345-67', a search for '1234567' should match it, as in the query below.
SELECT *
FROM table1 AS FT_Table
INNER JOIN CONTAINSTABLE(table2, columnname, '1234567') AS Key_Table ON FT_Table.ID = Key_Table.[Key]
Is there any way to ignore dash (-) while searching with string that doesn't contain a dash (-)?
I did some digging and spent a few hours on this :)
Unfortunately, there is no way to do it. It looks like SQL Server FTS populates the index by breaking words not only on whitespace but also on special characters ( -, {, ( etc.)
But it doesn't index the word with the dash removed, and as far as I understand there is no way to give the population service rules to cover this need. (I mean telling the population service: if a word contains "-", replace it with "".)
Here is an example to clarify the situation.
First, create the table, the FTS catalog and the full-text index, and insert a sample row into the table.
CREATE TABLE [dbo].[SampleTextData]
(
[Id] int identity(1,1) not null,
[Text] varchar(max) not null,
CONSTRAINT [PK_SampleTextData] PRIMARY KEY CLUSTERED
(
[Id] ASC
)
);
CREATE FULLTEXT CATALOG ftCatalog AS DEFAULT;
CREATE FULLTEXT INDEX ON SampleTextData
(Text)
KEY INDEX PK_SampleTextData
ON ftCatalog;
INSERT INTO [SampleTextData] values ('samp-le text')
Then run some sample queries:
select * from containstable(SampleTextData,Text,'samp-le') --Success
select * from containstable(SampleTextData,Text,'samp') --Success
select * from containstable(SampleTextData,Text,'le') --Success
select * from containstable(SampleTextData,Text,'sample') -- Fail
All of these queries succeed except the one for 'sample'. To investigate the situation, execute this query:
SELECT display_term, column_id, document_count
FROM sys.dm_fts_index_keywords (DB_ID('YourDatabase'), OBJECT_ID('SampleTextData'))
Output :
le 2 1
samp 2 1
samp-le 2 1
text 2 1
END OF FILE 2 1
The query shows the terms that the FTS population service has indexed. As you can see, the indexed terms contain 'le', 'samp' and 'samp-le' but not 'sample'. This is why the query for 'sample' fails.
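The indexed terms above can be reproduced with a small Python sketch. It is a simplified model of the behaviour observed here, not the real SQL Server word breaker (which is language-specific): each whitespace-separated token is kept as-is and is also split on non-alphanumeric characters. The function name index_terms is made up for this illustration.

```python
import re


def index_terms(text):
    """Return the set of terms a word breaker like the one above would store."""
    terms = set()
    for token in text.lower().split():
        # Keep the token exactly as written, e.g. "samp-le".
        terms.add(token)
        # Also keep the pieces on either side of special characters.
        terms.update(part for part in re.split(r"[^0-9a-z]+", token) if part)
    return terms


terms = index_terms("samp-le text")
print(sorted(terms))
print("sample" in terms)  # the dash-free form is never produced
```

Because no rule joins the pieces back together, 'sample' never appears among the terms, which matches the failed lookup.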
I have a full-text index on a column called SearchTerm in a table called Lease.
I tried doing something like:
SELECT * FROM CONTAINSTABLE(Lease, SearchTerm, 'Canterbury*')
No records are returned. I do see a record with a SearchTerm equal to '3 Canterbury Green||Imaging Technology Group', yet the record doesn't show up. Can anyone tell me what is going on?
CONTAINSTABLE performs a SQL Server full-text search on full-text indexed columns containing character-based data types.
CREATE TABLE Flags (Country nvarchar(30) NOT NULL, FlagColors varchar(200));
CREATE UNIQUE CLUSTERED INDEX FlagKey ON Flags(Country);
INSERT Flags VALUES ('France', 'Blue and White and Red');
INSERT Flags VALUES ('Italy', 'Green and White and Red');
INSERT Flags VALUES ('Tanzania', 'Green and Yellow and Black and Yellow and Blue');
INSERT Flags VALUES ('US', '3 Canterbury Green||Imaging Technology Group');
SELECT * FROM Flags;
GO
CREATE FULLTEXT CATALOG TestFTCat;
CREATE FULLTEXT INDEX ON Flags(FlagColors) KEY INDEX FlagKey ON TestFTCat;
GO
SELECT * FROM Flags;
SELECT * FROM CONTAINSTABLE (Flags, FlagColors, 'Canterbury')
SELECT * FROM CONTAINSTABLE (Flags, FlagColors, '"Canterbury*"')
You may have to enclose the search term in double quotes, as in '"Canterbury*"', as @Richard specified.
It's not necessary to specify the * though; as you can see, the two selects above give the same results.
I am faced with a database (sqlite specifically) query that I am not sure how to approach.
I am looking for all tuples whose name attribute is a substring of some provided constant.
For example it is a database containing food items. If the constant is "Maranatha Natural Almond Butter 26oz Lightly Roasted" I would like any tuple in the database that contains the words "Almond Butter", "Maranatha Natural", etc to be returned as matches.
I really am at a loss for how to approach this problem efficiently and any help would be greatly appreciated.
Use LIKE, but the other way around:
SELECT *
FROM mytable
WHERE 'Maranatha Natural Almond Butter 26oz Lightly Roasted' LIKE '%' || name || '%'
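As a quick check, the reversed LIKE can be run from Python's built-in sqlite3 module. This is a minimal sketch: the three sample rows and the helper name names_contained_in are made up for the illustration, and the constant is passed as a bound parameter rather than pasted into the SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE mytable (name TEXT)")
# Hypothetical sample data; the last row should NOT match.
conn.executemany(
    "INSERT INTO mytable (name) VALUES (?)",
    [("Almond Butter",), ("Maranatha Natural",), ("Peanut Butter",)],
)


def names_contained_in(connection, constant):
    """Return every name that occurs somewhere inside the given constant."""
    rows = connection.execute(
        "SELECT name FROM mytable WHERE ? LIKE '%' || name || '%' ORDER BY name",
        (constant,),
    )
    return [name for (name,) in rows]


matches = names_contained_in(
    conn, "Maranatha Natural Almond Butter 26oz Lightly Roasted"
)
print(matches)
```

Note that a name containing % or _ would itself act as a wildcard pattern here, and that the query has to scan every row because the pattern is built from the column.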
I recommend you give full-text searching a try. In your examples you wouldn't necessarily need it as LIKE might be sufficient. However, if you want to match only exact words (say you search for set you may not want setting matched) and if you want to match multiple words wherever they appear in your descriptions, FTS can be very helpful. Before using it, verify that your implementation is compiled with it:
sqlite> pragma compile_options;
CURDIR
ENABLE_FTS3 <---- this one has to appear
ENABLE_RTREE
TEMP_STORE=1
THREADSAFE=0
Say you have the FTS table FoodItemsFTS populated with the food item you mentioned earlier:
sqlite> CREATE VIRTUAL TABLE FoodItemsFTS USING fts3();
sqlite>
sqlite> INSERT INTO FoodItemsFTS (docid, content)
VALUES (1, "Maranatha Natural Almond Butter 26oz Lightly Roasted");
sqlite> INSERT INTO FoodItemsFTS (docid, content)
VALUES (2, "Maranatha Natural Almond Butter 26oz");
sqlite>
sqlite> SELECT docid FROM FoodItemsFTS WHERE FoodItemsFTS MATCH 'Almond Roasted';
1
sqlite>
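The same session can be reproduced from Python's sqlite3 module, assuming the SQLite library that Python links against was compiled with FTS3 (otherwise the CREATE VIRTUAL TABLE statement raises sqlite3.OperationalError). The helper name matching_docids is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# With no column list, fts3 creates a single column named "content".
conn.execute("CREATE VIRTUAL TABLE FoodItemsFTS USING fts3()")
conn.executemany(
    "INSERT INTO FoodItemsFTS (docid, content) VALUES (?, ?)",
    [
        (1, "Maranatha Natural Almond Butter 26oz Lightly Roasted"),
        (2, "Maranatha Natural Almond Butter 26oz"),
    ],
)


def matching_docids(connection, query):
    """Return docids of rows that contain every word of the query."""
    rows = connection.execute(
        "SELECT docid FROM FoodItemsFTS WHERE FoodItemsFTS MATCH ? ORDER BY docid",
        (query,),
    )
    return [docid for (docid,) in rows]


print(matching_docids(conn, "Almond Roasted"))  # both words must be present
print(matching_docids(conn, "Almond Butter"))   # word order and position are free
print(matching_docids(conn, "Roast"))           # whole words only, no partial match
```

With the default tokenizer, a query word matches only whole words, so 'Roast' does not find 'Roasted' unless it is written as the prefix query 'Roast*'.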