SQL Server; index on TEXT column

I have a database table with several columns; most of them are VARCHAR(x) type columns, and some of these columns have an index on them so that I can search quickly for data inside them.
However, one of the columns is a TEXT column, because it contains a very large amount of data (23 KB of plain ASCII text, etc.). I want to be able to search in that column (... WHERE col1 LIKE '%search string%'... ), but currently it's taking forever to perform the query. I know that the query is slow because of this column search: when I remove that criterion from the WHERE clause, the query completes in what I would consider an instant.
I can't add an index on this column because that option is grayed out for that column in the index builder / wizard in SQL Server Management Studio.
What are my options here, to speed up the query search in that column?
Thanks for your time...
Update
Ok, so I looked into the full text search and did all that stuff, and now I would like to run queries. However, when using "contains", it only accepts one word; what if I need an exact phrase? ... WHERE CONTAINS (col1, 'search phrase') ... throws an error.
Sorry, I'm new to SQL Server
Update 2
Sorry, just figured it out: use multiple "contains" clauses instead of one clause with multiple words. Actually, this still doesn't give me what I want (the exact phrase); it only makes sure that all words in the phrase are present.
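For the exact-phrase case in the updates above, the usual approach is to wrap the phrase in double quotes inside the single-quoted search condition; an unquoted multi-word string is what makes CONTAINS raise a syntax error. A minimal sketch, assuming a hypothetical table dbo.MyTable whose column col1 is already full-text indexed:

```sql
-- Exact phrase: the double quotes inside the string mark a phrase.
SELECT *
FROM dbo.MyTable          -- hypothetical table name
WHERE CONTAINS(col1, '"search phrase"');

-- For comparison: both words must be present, in any position.
SELECT *
FROM dbo.MyTable
WHERE CONTAINS(col1, 'search AND phrase');
```

Phrase matching is word-based, so it ignores differences in case, punctuation and whitespace between the words.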

Searching TEXT fields is always pretty slow. Give Full Text Search a try and see if that works better for you.

If your queries look like LIKE '%string%' (i.e. you search for a string inside a TEXT field), then you'll need a FULLTEXT index.
If you search for a substring in the beginning of the field (LIKE 'string%') and use SQL Server 2005 or higher, then you can convert your TEXT into a VARCHAR(MAX), create a computed column and index this column.
See this article in my blog for performance details:
Indexing VARCHAR(MAX)
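The prefix-search approach from this answer could look roughly like the following. This is a sketch, not the code from the linked article: the table dbo.Documents, the column Body, the index name and the 450-character prefix length are all assumptions. The prefix length was chosen to stay under the 900-byte index key limit of SQL Server 2005.

```sql
-- Assumed names: dbo.Documents(Body TEXT). Requires SQL Server 2005 or later.
-- GO is the batch separator used by SSMS and sqlcmd.

-- 1. Convert the legacy TEXT column to VARCHAR(MAX).
--    State NULL / NOT NULL explicitly so it matches the existing column.
ALTER TABLE dbo.Documents ALTER COLUMN Body VARCHAR(MAX) NULL;
GO

-- 2. Add a computed column holding only the first 450 characters.
ALTER TABLE dbo.Documents
    ADD BodyPrefix AS CAST(LEFT(Body, 450) AS VARCHAR(450)) PERSISTED;
GO

-- 3. Index the computed column; VARCHAR(MAX) itself cannot be an index key.
CREATE INDEX IX_Documents_BodyPrefix ON dbo.Documents (BodyPrefix);
GO

-- 4. Search the indexed prefix. This is equivalent to Body LIKE 'string%'
--    as long as the search string is shorter than the 450-character prefix.
SELECT *
FROM dbo.Documents
WHERE BodyPrefix LIKE 'string%';
```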

You should be looking at using Full Text Indexing on the column.

You can do complex boolean querying in FTS, like:
contains(yourcol,'"My first string" or "my second string" and "my third string"')
Depending on your query, CONTAINSTABLE or FREETEXTTABLE might give better results.
If you are connecting through .NET you might want to look at A google full text search
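As a hedged illustration of the two points in this answer: AND binds more tightly than OR in a CONTAINS condition, so parentheses make the intended grouping explicit, and CONTAINSTABLE returns a RANK column that can be used for ordering. The table dbo.MyTable and its key column Id are hypothetical; yourcol is the column name used in the answer.

```sql
-- Explicit grouping; without the parentheses AND is still evaluated before OR.
SELECT *
FROM dbo.MyTable
WHERE CONTAINS(yourcol,
    '"my first string" OR ("my second string" AND "my third string")');

-- CONTAINSTABLE returns [KEY] and [RANK], so results can be ordered by relevance.
SELECT t.*, ft.[RANK]
FROM dbo.MyTable AS t
INNER JOIN CONTAINSTABLE(dbo.MyTable, yourcol,
    '"my first string" OR "my second string"') AS ft
    ON t.Id = ft.[KEY]          -- Id: the column behind the full-text KEY INDEX
ORDER BY ft.[RANK] DESC;
```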

And since nobody has said it yet (maybe because it's obvious): querying with LIKE '%string%' bypasses your existing indexes, so it will run slowly.
That is why you need to use full-text indexing (which is what Quassnoi said).
Correction: I'm sure I learnt this, and always believed it, but after some investigation, using a wildcard at the start seems OK? My old regex queries run better with LIKE!

Related

fast search-within-text in SQL Server (fulltext not good enough)

I am using fulltext indexing in SQL Server 2000 and 2012. It is working great, except that users need to be able to search for ' * term * ' and not just 'term * ' (that is, the search needs to return results that contain the search term, not just begin with the search term).
From what I read and researched, this is not possible in SQL Server.
Since this is a requirement, and using the LIKE operator instead of full text is just too slow, I am thinking about breaking up each value into words, and creating a special table that contains each word twice - once normal, once reversed - and a foreign key to the relevant item.
This is the only way I can see to accomplish decent speeds.
Has anyone done this? Does anyone know of any other solution? Is it maybe possible to control the index itself, adding the reverse values to it, without actually creating a column that contains them?
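The word table described in this question might be sketched as follows. All names are hypothetical, and the sketch assumes SQL Server 2008 or later (INCLUDE and DECLARE with an initializer are not available on SQL Server 2000). The reversed value is a persisted computed column, so the engine maintains it and the application only inserts each word once.

```sql
-- Hypothetical schema: one row per word of each searchable item.
CREATE TABLE dbo.ItemWord (
    ItemId       INT           NOT NULL,     -- foreign key to the searched item
    Word         NVARCHAR(100) NOT NULL,
    WordReversed AS REVERSE(Word) PERSISTED  -- maintained by the engine
);
CREATE INDEX IX_ItemWord_Word
    ON dbo.ItemWord (Word) INCLUDE (ItemId);
CREATE INDEX IX_ItemWord_WordReversed
    ON dbo.ItemWord (WordReversed) INCLUDE (ItemId);
GO

DECLARE @term NVARCHAR(100) = N'term';

-- Words that begin with the term (seek on Word) ...
SELECT ItemId FROM dbo.ItemWord WHERE Word LIKE @term + N'%'
UNION
-- ... or end with it (seek on the reversed value).
SELECT ItemId FROM dbo.ItemWord WHERE WordReversed LIKE REVERSE(@term) + N'%';
```

Note that this only finds words that start or end with the term; a term in the middle of a word still needs a scan. LIKE wildcard characters (%, _, [) in the user's input would also need escaping.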

Does a Full-Text Index work well for columns with embedded code values

Using SQL Server 2012, I've got a table that currently has several hundred thousand rows, and will grow.
In this table, I've got a nvarchar(30) field that contains Medical Record Number (MRN) values. These values can be just about any alphanumeric value, but are not words.
For Example,
DR-345687
34568523
*45612345;T
My application allows the end user to enter a value, say '456' in the search field. The application would need to return all three of the example records.
Currently, I'm using Entity Framework 5.0, and asking for a field.Contains('456') type of search.
This always takes 3-5 seconds to return since it appears to do a table search.
My question is: Would creating a Full Text Index on this column help performance? I haven't tried it yet because the only copy of the database that I have with lots of data in it is currently in QA trials.
Looking at the documentation for the Full Text Indexes it appears that it is optimized around separate words in the field value, so I am hesitant to take the performance hit to create the index without knowing how it is likely to affect my query performance.
EF won't generate the T-SQL keywords needed to query a SQL Server full-text index (http://msdn.microsoft.com/en-us/library/ms142571.aspx#queries), so your solution won't fly without more work.
I think you would have to create a stored procedure that gets the data using the full-text index and then have EF call it. I have a similar issue and would be interested to know your results.
Andy
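A stored procedure along the lines suggested in this answer might look like the sketch below. The table dbo.Patient, its columns PatientId and Mrn, and the procedure name are hypothetical, and the Mrn column is assumed to be full-text indexed already.

```sql
CREATE PROCEDURE dbo.SearchPatientsByMrn
    @term NVARCHAR(30)
AS
BEGIN
    SET NOCOUNT ON;

    -- Build a prefix term such as "456*". Embedded double quotes are removed
    -- so that user input cannot break out of the quoted term.
    -- Validation of empty input is omitted in this sketch.
    DECLARE @condition NVARCHAR(40) =
        N'"' + REPLACE(@term, N'"', N'') + N'*"';

    SELECT p.PatientId, p.Mrn
    FROM dbo.Patient AS p
    WHERE CONTAINS(p.Mrn, @condition);
END
```

One caveat for the requirement in the question: a full-text prefix term only matches tokens that begin with the given characters. A search for '456' would therefore not find 34568523, where 456 sits in the middle, so this would not return all three example records.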

SQL server search

I'm going to perform a search in my SQL Server DB (ASP.NET, VS2010, C#). The user types a phrase and I should search for this phrase in several fields. How is this possible? Do we have functions such as CONTAINS() in SQL Server? Can I perform my search using normal queries, or should I do the work in C# functions?
For instance, I have 3 fields in my table which can contain the user's search phrase. Is it OK to write the following SQL command? (For instance, the user's search phrase is GAME.)
select * from myTable where columnA='GAME' or columnB='GAME' or columnC='GAME'
I have used AND between different conditions, but can I use OR? How can I search inside my table fields? If one of my fields contains the phrase GAME, how can I find it? columnA='GAME' finds only those fields that are exactly 'GAME', is that right?
I'm a bit confused about my search approach. Please help me, thanks guys.
OR works fine if you want at least one of the conditions to be true.
If you want to search inside your text strings you can use LIKE
select * from myTable where columnA like '%GAME%' or columnB like '%GAME%' or columnC like '%GAME%'
Note that % is the wildcard.
To find everything that begins with 'GAME', use LIKE 'GAME%'; if 'GAME' may appear in the middle, you need % at both ends.
You can use LIKE instead of equals and then it can contain wildcard characters, so your example could be:
select * from myTable where columnA LIKE '%GAME%' or columnB LIKE '%GAME%' or columnC LIKE '%GAME%'
Further information may be found in MSDN
This is going to do some pretty heavy lifting in terms of what the database has to do though - I would suggest you consider something like full text search as I think it would more likely be suited to your scenario and provide faster results (of course, if you never have many records to search LIKE would probably do fine). Information on this is also in MSDN
Don't use LIKE, as suggested by other answers. It won't work with indexes, and therefore will be slow to return and expensive to run. Instead, you have two options:
Option 1: Full-Text Indexes
"do we have functions such as CONTAINS() in SQL server?"
Yes! You can use the CONTAINS() function in SQL Server. You just have to set up a full-text index on the table that covers each of the columns you need to search.
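A minimal sketch of Option 1, using the table and column names from the question. The catalog name SearchCatalog and the key index name PK_myTable are assumptions; the key index must be a unique, single-column, non-nullable index on the table.

```sql
-- Hypothetical catalog name.
CREATE FULLTEXT CATALOG SearchCatalog;
GO

-- A table has a single full-text index, which can cover several columns.
CREATE FULLTEXT INDEX ON dbo.myTable (columnA, columnB, columnC)
    KEY INDEX PK_myTable          -- assumed name of the unique key index
    ON SearchCatalog
    WITH CHANGE_TRACKING AUTO;
GO

-- One CONTAINS call can search a parenthesized list of indexed columns.
-- The index is populated in the background, so rows may take a moment to appear.
SELECT *
FROM dbo.myTable
WHERE CONTAINS((columnA, columnB, columnC), N'GAME');
```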
Option 2: Lucene.Net
Lucene.Net is a popular .NET library for indexing and searching text data. It maintains its own index outside SQL Server, which you build from your table's data, and you can use it to make implementing your search a little easier.

SQL Server Full-Text Search - No hit even though the word is present

I'm having problems with Full-Text Search on SQL Server 2005. In my setup, I am indexing a single column, let's call it Content. The indexing is done in the neutral culture since the column might contain text in different languages. The fulltext index is created as follows:
CREATE FULLTEXT INDEX
ON [dbo].[Table1]([Content])
KEY INDEX [UI_Table1_Id] ON [Catalog]
WITH CHANGE_TRACKING AUTO
This table is then filled. The users can then query against the index. The queries look somewhat like this:
SELECT * FROM Table1 AS table1
INNER JOIN CONTAINSTABLE (Table1, Content, #0 , LANGUAGE 1033) AS KEY_TBL
ON table1.Id = KEY_TBL.[KEY]
WHERE table1.locale = 'en-US'
As I said, the Content column contains different languages, so the LANGUAGE in the CONTAINSTABLE (and table1.locale = 'en-US') may change, e.g. to Danish, English or Swedish LCIDs.
I'm having one problem, though. If the column is filled with a text containing the word "koncepttitel" and I query for it, I get no hits if using the Swedish language (LANGUAGE 1053). I will get a hit if I use English (LANGUAGE 1033) for the same word.
Previously I got the "Informational: The full-text search condition contained noise word(s)." error message. I then cleared the Swedish stop word list. Now I get no error message, but still I can't seem to get a hit for my query.
Is there any way for me to configure SQL Server Full-Text Search to output more diagnostic information than this? Is there e.g. a way for me to see which noise word the full-text search condition contained?
The thing is, I don't care so much for my users searching for this specific word. However, I'm worried that this error may be across more relevant search terms, which I won't be able to foresee, which means that my users won't be able to find what they are looking for.
Update: I'm wondering if I may have misinterpreted the set-up for full-text search. Could this issue be due to me indexing the content in the neutral culture and querying in a specific culture? Should I always use the neutral culture when querying?
In my experience with FTS and noise word files, you need to restart the FTS service and then run an update population.
Depending on the size of the index, you may just want to consider dropping it and recreating it.
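On the diagnostics part of the question: SQL Server 2008 and later (so not the 2005 instance described above) provide sys.dm_fts_parser, which shows how a search string is tokenized for a given language and which tokens are treated as noise words. A sketch, assuming access to such an instance:

```sql
-- Requires SQL Server 2008 or later.
-- Arguments: query string, LCID, stoplist id (0 = system stoplist),
--            accent sensitivity (0 = insensitive).
SELECT keyword, special_term, display_term, source_term
FROM sys.dm_fts_parser(N'"koncepttitel"', 1053, 0, 0);   -- Swedish

SELECT keyword, special_term, display_term, source_term
FROM sys.dm_fts_parser(N'"koncepttitel"', 1033, 0, 0);   -- English

SELECT keyword, special_term, display_term, source_term
FROM sys.dm_fts_parser(N'"koncepttitel"', 0, 0, 0);      -- Neutral
```

The special_term column marks noise words. Comparing the display_term values across the three calls shows whether the language-specific word breaker splits the word differently from the neutral one used at indexing time, which would be consistent with the suspicion in the question's update.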

Best way to literal phrase search a text column in SQL Server

I thought full-text search would let me do exact phrase searching in a more optimized way than a LIKE predicate, but I'm reading that it doesn't do that exactly.
Is "LIKE" the most efficient way to search thousands of rows of TEXT fields in a table for a literal string?
It's got to be exact matching...
LIKE 'string%' will work faster if you have a proper index on the column and you are looking for "string" only at the beginning of the value. You have to use LIKE '%string%' if "string" might be in the middle of the value; this triggers a table scan, which is slow (mostly slower than full-text search).
You can use the CONTAINS() function of full-text search for exact match.
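Because a full-text phrase match is word-based (it ignores case and punctuation), one possible way to get a strictly literal match is to let CONTAINS narrow the candidates and have LIKE verify them. A sketch, assuming a hypothetical table dbo.Notes with a full-text indexed column Body:

```sql
SELECT *
FROM dbo.Notes
WHERE CONTAINS(Body, '"exact phrase"')   -- index-backed, word-based narrowing
  AND Body LIKE '%exact phrase%';        -- literal check on the candidate rows
```

This only works when the literal string starts and ends on word boundaries; a literal that begins or ends in the middle of a word would be missed by the CONTAINS filter. The LIKE comparison still follows the column's collation, and the optimizer is not guaranteed to evaluate the predicates in the order written.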
Apparently, CONTAINS is faster than a LIKE query...
http://www.docstoc.com/docs/2280727/Microsoft-SQL-Server-70-Full-Text-Search-What-is-full-text-search
(Profiling can be found on Page 19 of that presentation)
What version of SQL Server are you on? I would recommend replacing TEXT with VARCHAR(MAX), if you ever can (SQL Server 2005 and up).
What makes you say that full text won't work? How have you set up fulltext, and what do your fulltext queries look like?
Marc

Resources