I'm new to free-text search, so pardon the newbie question. Suppose I have the following full-text index:
Create FullText Index on Contacts(
FirstName,
LastName,
Organization
)
Key Index PK_Contacts_ContactID
Go
I want to do a freetext search against all three columns concatenated
FirstName + ' ' + LastName + ' ' + Organization
So that for example
Searching for jim smith returns all contacts named Jim Smith
Searching for smith ibm returns all contacts named Smith who work at IBM
This seems like it would be a fairly common scenario. I would have expected this to work:
Select c.FirstName, c.LastName, c.Organization, ft.Rank
from FreeTextTable(Contacts, *, 'smith ibm') ft
Left Join Contacts c on ft.[Key]=c.ContactID
Order by ft.Rank Desc
but this is apparently doing smith OR ibm; it returns a lot of Smiths who don't work at IBM and vice versa. Surprisingly, searching for smith AND ibm yields identical results.
This does what I want...
Select c.FirstName, c.LastName, c.Organization
from Contacts c
where Contains(*, 'smith') and Contains(*, 'ibm')
...but then I can't parameterize queries coming from the user -- I would have to break up the search string into words myself and assemble the SQL on the fly, which is ugly and unsafe.
The usual approach I take is to create a search view or calculated column (using a trigger) that puts all of those values into a single field.
The other thing I do is use a dedicated full-text search engine, such as Lucene/Solr.
Boolean operators are only supported for CONTAINS, not FREETEXT.
Try your AND query with CONTAINSTABLE.
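A rough sketch of that suggestion, not tested against your schema: the user's text arrives as a parameter and is turned into a CONTAINS-style condition before being handed to CONTAINSTABLE. The variable names @Search and @Condition are made up for illustration, and the sketch assumes the input is words separated by single spaces. One caveat, as far as I know: when several columns are searched, AND is only satisfied if a single column contains all the terms, so 'smith' in LastName plus 'ibm' in Organization may still not match unless the values are combined into one indexed column.

```sql
-- @Search would normally be a stored-procedure or query parameter.
DECLARE @Search nvarchar(200) = N'smith ibm';

-- Drop any double quotes typed by the user, then turn
-- 'smith ibm' into '"smith" AND "ibm"'.
DECLARE @Condition nvarchar(4000) =
    N'"' + REPLACE(REPLACE(@Search, N'"', N''), N' ', N'" AND "') + N'"';

SELECT c.FirstName, c.LastName, c.Organization, ft.[Rank]
FROM ContainsTable(Contacts, *, @Condition) AS ft
JOIN Contacts AS c ON ft.[Key] = c.ContactID
ORDER BY ft.[Rank] DESC;
```

Because the condition is passed as a variable rather than concatenated into the statement text, the query itself stays parameterized.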
I have an sql server database that has multiple tables that are all related to each other.
I am wondering is it possible to use FreeTextTable or ContainsTable for this?
I have only seen examples that look at one table and search all of its columns. I have not seen a case like mine, where I have, say, a form. Let's call it "student application form".
On this form it may have information like
Student First Name
Student Last Name
Address
Campus they wish to study at
Tell us about yourself
Now I want to build 1 search box that will search all these tables and find this "application"
the user might type in
John Smith
Main Campus
motivated
So all tables and columns would need to be checked, but the end result would be bringing back the application(s) that the full-text search thinks match what was typed in.
The table structure might be like this:
Application
    id
    firstName
    lastName
    campusId
    AddressId
Campus
    id
    name
Address
    id
    name
In my real database I got like 5 or 6 tables that join with the "application" table and would all need to be accounted for.
Can this be done with full-text search? Or do I have to search each table individually and somehow tie it all together again?
You can use an indexed view that concatenates all the values you want to search:
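CREATE VIEW dbo.V_FT
WITH SCHEMABINDING
AS
-- CONCAT needs SQL Server 2012 or later; on older versions use + with ISNULL.
SELECT A.id,
CONCAT(A.firstName, '#', A.lastName, '#', C.name, '#', D.name) AS DATA
FROM dbo.Application AS A
JOIN dbo.Campus AS C ON C.id = A.campusId
JOIN dbo.Address AS D ON D.id = A.AddressId; -- join Address on its own id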
GO
CREATE UNIQUE CLUSTERED INDEX X_V_APP ON dbo.V_FT (id);
GO
Then create a full-text index on the DATA column of the indexed view.
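That last step could look roughly like the following. This is a sketch under assumptions: the catalog name FT_Applications is hypothetical, no default full-text catalog exists yet, and Application.id is a non-nullable single-column key.

```sql
-- Hypothetical catalog name; skip this if a default catalog already exists.
CREATE FULLTEXT CATALOG FT_Applications AS DEFAULT;
GO

-- The key index is the unique clustered index created on the view above.
CREATE FULLTEXT INDEX ON dbo.V_FT (DATA)
    KEY INDEX X_V_APP
    WITH CHANGE_TRACKING AUTO;
GO

-- One search box, one query: rank applications by how well DATA matches.
DECLARE @Search nvarchar(200) = N'John Smith Main Campus';

SELECT A.id, A.firstName, A.lastName, ft.[RANK]
FROM FREETEXTTABLE(dbo.V_FT, DATA, @Search) AS ft
JOIN dbo.Application AS A ON A.id = ft.[KEY]
ORDER BY ft.[RANK] DESC;
```

Any further tables that join to Application would be added to the view's SELECT in the same way, so the search query itself does not change.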
I have created the following view in SQL Server 2008 to create mailing lists for land owners:
SELECT
dbo.parcel.featid,
CAST(mms_db.dbo.TR_Roll_Master.FMT_ROLL_NO AS decimal(11, 3)) AS Roll,
dbo.parcel.survey, mms_db.dbo.Central_Name_Database.NAME AS Owner,
mms_db.dbo.Central_Name_Database.NAME_2 AS Owner2,
mms_db.dbo.Central_Name_Database.BOX_NUM,
mms_db.dbo.Central_Name_Database.APT_NUM,
mms_db.dbo.Central_Name_Database.FMT_STREET AS House_num,
mms_db.dbo.Central_Name_Database.CITY AS Town,
mms_db.dbo.Central_Name_Database.PROV_CD AS Prov,
mms_db.dbo.Central_Name_Database.POST_CD AS Post_code,
mms_db.dbo.TR_Roll_Number_Owners.NAME_CODE
FROM
mms_db.dbo.TR_Roll_Master
INNER JOIN
dbo.parcel ON mms_db.dbo.TR_Roll_Master.ROLL_NO = dbo.parcel.roll_no COLLATE SQL_Latin1_General_CP1_CI_AS
INNER JOIN
mms_db.dbo.TR_Roll_Number_Owners ON mms_db.dbo.TR_Roll_Master.ROLL_NO = mms_db.dbo.TR_Roll_Number_Owners.ROLL_NO
INNER JOIN
mms_db.dbo.Central_Name_Database ON mms_db.dbo.TR_Roll_Number_Owners.NAME_CODE = mms_db.dbo.Central_Name_Database.NAME_CODE
WHERE
(mms_db.dbo.TR_Roll_Master.DEL_ROLL NOT LIKE '%Y%') AND
(mms_db.dbo.TR_Roll_Master.ROLL_NO NOT LIKE 'P%') OR
(mms_db.dbo.TR_Roll_Master.DEL_ROLL IS NULL) AND (mms_db.dbo.TR_Roll_Master.ROLL_NO NOT LIKE 'P%') OR
(mms_db.dbo.TR_Roll_Master.DEL_ROLL NOT LIKE '%I%') AND
(mms_db.dbo.TR_Roll_Master.ROLL_NO NOT LIKE 'P%')
The view works fine however there are often duplicates as many people own more than one piece of land. I would like to group by Name_Code to eliminate the duplicates.
When I add:
Group by mms_db.dbo.TR_Roll_Number_Owners.NAME_CODE
to the end of the query I am returned with the following response:
SQL Execution Error.
Executed SQL statement: SELECT dbo.parcel.featid, CAST(mms_db.dbo.TR_Roll_Master.FMT_ROLL_NO AS decimal(11,3)) AS Roll,
dbo.parcel.survey,
mms_db.dbo.Central_NameDatabase.Name AS Owner,
mms_db.dbo.Central_Name_Database.NAME_2 AS Owner2,
mms_db.dbo.Central_Name_Database.B...
Error Source: .Net SQLClient Data Provider
Error Message: Column 'dbo.parcel.featid' is invalid in the select list
because it is not contained in either an aggregate function or the
GROUP BY clause.
I'm not sure what I need to change to make this work.
--Edit--
As a sample data, here is a condensed sample of what I would like to achieve
Roll  Owner        Box_Num  Town        Prov  Post_code  Name_Code
100   John Smith   50       Somewhere   MB    R3W 9T7    00478
200   John Smith   50       Somewhere   MB    R3W 9T7    00478
300   Peter Smith  72       Somewhere   MB    R3W 9T9    00592
400   John Smith   90       OtherPlace  MB    R2R 8V7    00682
John Smith has the name code of 00478. He owns both Roll 100 & 200, Peter Smith owns 300 and another person with the name of John Smith owns 400. Based on the different Name_Code values I know that the two John Smiths are different people. I would like an output that lists John Smith with Name_Code 00478 only once, while also listing Peter Smith and the other John Smith. Name_Code is the only value I can use for grouping, as the rest could represent different people with the same name.
If you just want to eliminate duplicates, just use DISTINCT and exclude the columns representing other "people on more than one piece of land" from your query viz:
SELECT DISTINCT
NAME_CODE,
{column2},
{column3}
FROM
[MyView]
However, if you wish to perform aggregation of some sort, or show an arbitrary one of the "people on more than one piece of land", then you will need GROUP BY. All non-aggregated columns in the select need to appear in the GROUP BY:
SELECT
NAME_CODE,
-- ... other non-aggregated fields here
COUNT(featid) AS NumFeatIds,
MIN(Owner2) AS FirstOwner
-- ... etc. (other aggregated columns)
FROM
[MyView]
GROUP BY
NAME_CODE
-- ... plus all other non-aggregated columns in the select
Edit
To get the table listed in your edit, you would just need to ORDER BY Name_Code.
However, to get just one row for John Smith #00478, you need to compromise on the non-unique columns by either eliminating them entirely, using GROUP BY and aggregates on the rows, doing a GROUP_CONCAT type hack to e.g. comma separate them, or pivoting the duplicate row columns as extra columns on the one row.
Since you've mentioned GROUP repeatedly, it seems the aggregation route is necessary. John Smith #00478 has 2 properties, hence 2 discrete Roll values, so Roll can't appear in the aggregated result as-is. Instead, you can return e.g. a count of the Rolls, or the MIN or MAX Roll, but not both Rows*. The other columns (address related) are probably constant for all properties (assuming John Smith 00478 has one address), but unfortunately SQL Server will require you to include them in the GROUP BY.
I would suggest you try:
SELECT
COUNT(Roll) AS NumPropertiesOwned,
Owner,
Box_Num,
Town,
Prov,
Post_code,
Name_Code
FROM [MyNewView]
GROUP BY
Owner, Box_Num, Town, Prov, Post_code, Name_Code
ORDER BY Name_Code;
i.e. all the non-aggregated columns must be repeated in the GROUP BY
* unless you use the GROUP_CONCAT hack or the pivot route
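To see the effect on the sample rows from the question, here is a self-contained sketch that loads them into a table variable and runs the same GROUP BY. The table variable @MailingList, its column types, and the extra FirstRoll/LastRoll columns are assumptions made for the demo, not part of the original view.

```sql
-- Sample rows from the question, held in a table variable for the demo.
DECLARE @MailingList TABLE (
    Roll      int,
    Owner     varchar(50),
    Box_Num   varchar(10),
    Town      varchar(50),
    Prov      char(2),
    Post_code varchar(7),
    Name_Code char(5)
);

INSERT INTO @MailingList (Roll, Owner, Box_Num, Town, Prov, Post_code, Name_Code)
VALUES (100, 'John Smith',  '50', 'Somewhere',  'MB', 'R3W 9T7', '00478'),
       (200, 'John Smith',  '50', 'Somewhere',  'MB', 'R3W 9T7', '00478'),
       (300, 'Peter Smith', '72', 'Somewhere',  'MB', 'R3W 9T9', '00592'),
       (400, 'John Smith',  '90', 'OtherPlace', 'MB', 'R2R 8V7', '00682');

-- Roll is aggregated; every other selected column is in the GROUP BY.
SELECT
    COUNT(Roll) AS NumPropertiesOwned,
    MIN(Roll)   AS FirstRoll,
    MAX(Roll)   AS LastRoll,
    Owner, Box_Num, Town, Prov, Post_code, Name_Code
FROM @MailingList
GROUP BY Owner, Box_Num, Town, Prov, Post_code, Name_Code
ORDER BY Name_Code;
```

This returns three rows: Name_Code 00478 once (2 properties, rolls 100 to 200), then 00592 and 00682 with one property each.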
It's telling you what to do:
"Error Message: Column 'dbo.parcel.featid' is invalid in the select list
because it is not contained in either an aggregate function or the
GROUP BY clause."
This means you have to group the other (non-aggregated) fields too.
I've got a customer table like this:
ID*,
Title,
FirstName,
MiddleNames,
LastName,
CompanyName
All string fields are nullable.
I need to be able to offer the user a fuzzy search. So, for example, they could enter the following searches and it would bring back ranked results:
"BOB"
"BOB JONES"
"BOB JON*"
"MR JONES"
"BOB DAVE JONES"
"B D JONES"
"BOB JONES ACME CORP"
"ACME CORP"
"ACME BOB"
etc.
My problem is that there doesn't seem to be a way to do wildcard/LIKE% matches. So if there is a surname "JONESY", searching "JONES" doesn't match it.
In an ideal world, I'd like to CONCATENATE all the string columns in to a single column and then do my fuzzy search on that, because the ranking would be better.
Can anybody tell me how to either do wildcard searches OR search on CONCATENATED fields?
Thanks,
Simon.
You can define a SQL Server full-text index on multiple columns in a table.
Full-Text queries against a table like this can either specify the column for querying or query against all columns at once.
Full-Text search does not support true wildcard matching but it does support prefix matching. This means that you can search for "JONES*" and it will match "JONESON" or "JONESY".
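Using FREETEXTTABLE or CONTAINSTABLE will provide a Rank for your results. Prefix terms such as "JONES*" are only understood by CONTAINS and CONTAINSTABLE.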
A prefix match for "JONES" would look like this:
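SELECT
t.QueryContent
, ft.[Key]
, ft.[Rank]
FROM
[Table] t -- placeholder name; TABLE is a reserved word, so it needs brackets
INNER JOIN CONTAINSTABLE ( [Table] , * , '"JONES*"' ) ft ON ( t.TableID = ft.[Key] ) -- INNER JOIN keeps only the matching rows
ORDER BY
ft.[Rank] DESC
, t.QueryContent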
I was also doing that kind of search. Here is the link that helped me:
http://www.codeproject.com/KB/database/SQLServer2K8FullTextSearh.aspx
I have a hard time finding a good question title - let me just show you what I have and what the desired outcome is. I hope this can be done in SQL (I have SQL Server 2008).
1) I have a table called Contacts and in that table I have fields like these:
FirstName, LastName, CompanyName
2) Some demo data:
FirstName  LastName  CompanyName
John       Smith     Smith Corp
Paul       Wade
Marc       Andrews   Microsoft
Bill       Gates     Microsoft
Steve      Gibbs     Smith Corp
Diane      Rowe      ABC Inc.
3) I want to get an intersecting list of people and companies, but companies only once. This would look like this:
Name
ABC Inc.
Bill Gates
Diane Rowe
John Smith
Marc Andrews
Microsoft
Smith Corp
Steve Gibbs
Paul Wade
Can I do this with SQL? How?
You take all the person names, and then also add all the companies:
SELECT FirstName + ' ' + LastName AS Name FROM Contacts
UNION ALL
SELECT DISTINCT CompanyName FROM Contacts
WHERE CompanyName IS NOT NULL
The DISTINCT keyword ensures that companies are output only once, and the WHERE clause removes rows where no company info is known.
If a person has the same name as a company, then this will output a duplicate. If you don't want that, then change UNION ALL to UNION, and any name will be output only once.
I'm not sure what you mean by "intersecting," but you can easily get the results you describe as the union of two queries against that same table.
select
t.firstname + ' ' + t.lastname
from
mytable t
union
select
t.company
from
mytable t
Edit: UNION should make each SELECT distinct by default.
Does this do what you need?
SELECT FirstName + ' ' + LastName AS Name
FROM Contacts
UNION
SELECT CompanyName
FROM Contacts
(The UNION rather than UNION ALL will ensure distinctness of both top and bottom parts. mdma's answer will work if you do need the possibility of duplicate people names. You might need to add an ORDER BY Name depending on your needs)
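For anyone who wants to try this on the demo data from the question, here is a self-contained sketch. The table variable @Contacts and its column types are assumptions, and Paul Wade's blank company is assumed to be stored as NULL.

```sql
-- Demo data from the question, held in a table variable.
DECLARE @Contacts TABLE (
    FirstName   varchar(50),
    LastName    varchar(50),
    CompanyName varchar(100) NULL
);

INSERT INTO @Contacts (FirstName, LastName, CompanyName)
VALUES ('John',  'Smith',   'Smith Corp'),
       ('Paul',  'Wade',    NULL),
       ('Marc',  'Andrews', 'Microsoft'),
       ('Bill',  'Gates',   'Microsoft'),
       ('Steve', 'Gibbs',   'Smith Corp'),
       ('Diane', 'Rowe',    'ABC Inc.');

-- UNION removes duplicates across both halves, so each company appears once.
SELECT FirstName + ' ' + LastName AS Name
FROM @Contacts
UNION
SELECT CompanyName
FROM @Contacts
WHERE CompanyName IS NOT NULL   -- skip contacts without a company
ORDER BY Name;
```

This yields nine rows: the six people plus ABC Inc., Microsoft and Smith Corp once each.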
I'm currently working on an application where we have a SQL-Server database and I need to get a full text search working that allows us to search people's names.
Currently the user can enter text into a name field that searches three different varchar columns: first, last, and middle names.
So say I have 3 rows with the following info.
1 - Phillip - J - Fry
2 - Amy - NULL - Wong
3 - Leo - NULL - Wong
If the user enters a name such as 'Fry' it will return row 1. However, if they enter Phillip Fry, or Fr, or Phil they get nothing, and I don't understand why it's doing this. If they search for Wong they get rows 2 and 3; if they search for Amy Wong they again get nothing.
Currently the query is using CONTAINSTABLE, but I have switched that with FREETEXTTABLE, CONTAINS, and FREETEXT without any noticeable differences in the results. The table methods are preferred because they return the same results but with ranking.
Here is the query.
....
@Name nvarchar(100),
....
--""s added to prevent crash if searching on more than one word.
DECLARE @SearchString varchar(100)
SET @SearchString = '"'+@Name+'"'
SELECT Per.Lastname, Per.Firstname, Per.MiddleName
FROM Person as Per
INNER JOIN CONTAINSTABLE(Person, (LastName, Firstname, MiddleName), @SearchString)
AS KEYTBL
ON Per.Person_ID = KEYTBL.[KEY]
WHERE KEYTBL.RANK > 2
ORDER BY KEYTBL.RANK DESC;
....
Any ideas why this full-text search is not working correctly?
If you're just searching people's names, it might be in your best interest to not even use the full text index. Full text index makes sense when you have large text fields, but if you're mostly dealing with one word per field, I'm not sure how much extra you would get out of full text indexes. Waiting for the full text index to reindex itself before you can search for new records can be one of the many problems.
You could just make a query such as the following. Split your searchstring on spaces, and create a list of the search terms.
Select FirstName,MiddleName,LastName
From person
WHERE
Firstname like @searchterm1 + '%'
or MiddleName like @searchterm1 + '%'
or LastName like @searchterm1 + '%'
or Firstname like @searchterm2 + '%'
etc....
FreeTextTable should work.
INNER JOIN FREETEXTTABLE(Person, (LastName, Firstname, MiddleName), @SearchString)
@SearchString should contain values like 'Phillip Fry' (one long string containing all of the lookup strings separated by spaces).
If you would like to search for Fr or Phil, you should use an asterisk: Phil* and Fr* (prefix terms like these go in double quotes and need CONTAINS or CONTAINSTABLE).
'Phil' looks for exactly the word 'Phil'. 'Phil*' looks for every word that starts with 'Phil'.
Thanks for the responses, guys. I was finally able to get it to work using parts of both Biri's and Kibbee's answers. I needed to add * to the string and break it up on spaces for it to work. So in the end I got:
....
@Name nvarchar(100),
....
--""s added to prevent crash if searching on more than one word.
DECLARE @SearchString varchar(100)
--Added this line
SET @SearchString = REPLACE(@Name, ' ', '*" OR "*')
SET @SearchString = '"*'+@SearchString+'*"'
SELECT Per.Lastname, Per.Firstname, Per.MiddleName
FROM Person as Per
INNER JOIN CONTAINSTABLE(Person, (LastName, Firstname, MiddleName), @SearchString)
AS KEYTBL
ON Per.Person_ID = KEYTBL.[KEY]
WHERE KEYTBL.RANK > 2
ORDER BY KEYTBL.RANK DESC;
....
There are more fields being searched; I just simplified it for the question. Sorry about that, I didn't think it would affect the answer. It actually searches a column that has a CSV of nicknames and a notes column as well.
Thanks for the help.
Another approach could be to abstract the searching away from the individual fields.
In other words, create a view on your data that turns the split fields, such as firstname and lastname, into a concatenated field, i.e. full_name.
Then search on the view. This would likely make the search query simpler.
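A minimal sketch of that idea, assuming the Person table from the question lives in the dbo schema. The view name dbo.PersonSearch and the variables @Term1 and @Term2 are hypothetical. It uses plain LIKE rather than full-text, so on a large table the leading wildcard may be slow.

```sql
-- Hypothetical view that exposes one concatenated name column.
CREATE VIEW dbo.PersonSearch
AS
SELECT Person_ID,
       -- ISNULL keeps a NULL part from turning the whole name into NULL.
       ISNULL(Firstname, '') + ' '
         + ISNULL(MiddleName + ' ', '')
         + ISNULL(LastName, '') AS full_name
FROM dbo.Person;
GO

-- Each word typed by the user becomes one LIKE test on full_name.
DECLARE @Term1 nvarchar(50) = N'Phil',
        @Term2 nvarchar(50) = N'Fry';

SELECT Person_ID, full_name
FROM dbo.PersonSearch
WHERE full_name LIKE N'%' + @Term1 + N'%'
  AND full_name LIKE N'%' + @Term2 + N'%';
```

With the three sample rows from the question, 'Phil' plus 'Fry' would match the row whose full_name is 'Phillip J Fry'.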
You might want to check out Lucene.net as an alternative to Full Text.