SQL freetext() not working on some columns - sql-server

I have a table with 7 columns, and I have created a full-text index on it. However, I noticed that a search using freetext() does not return any rows for 2 of the columns.
It returns rows for the other columns.
Here is my query:
select * from dbo.ModelCategoryValues
where freetext(economyvalues,'24,29')
and freetext(featurevalues,'10')
and freetext(pricerangevalues,'15')
and freetext(performancevalues,'18,20')
and freetext(economyvalues,'22,24')
and freetext(usevalues,'28')
This returns the expected results.
However, when I run the query below, no rows are returned:
select * from dbo.ModelCategoryValues
where freetext(cartypevalues,'1')
I can see rows in the table that match the above criteria. I have tried everything from repopulating the index to re-creating it, but without success.

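By default, a full-text index uses the system stoplist, so stop words (noise words) are not indexed and FREETEXT cannot match rows on them.
To resolve the problem, turn the stoplist off for the index with the following statement (DealerSearch is the table name here; for the query above that would be dbo.ModelCategoryValues):
ALTER FULLTEXT INDEX ON DealerSearch SET STOPLIST = OFF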

Related

Optimizing SQL Server Table Valued function with Full Text Search tables

I have been trying to analyse and optimize a runtime spike in a stored procedure in SQL Server 2014.
Background of the DB objects
I found that the stored procedure's runtime spikes when there is a larger number of records in a source table to process.
The part of the stored procedure that lags is a SELECT INTO temp table (say #tempTable2) statement that joins another temp table (say #tempTable1) to a multi-statement table-valued function (say testFunction).
The code has the following shape:
SELECT t.col1, f.col2
INTO #tempTable2
FROM #tempTable1 t
OUTER APPLY testFunction(t.id) f
This function returns results very quickly when I test it with a single value, as below:
SELECT *
FROM testFunction(100)
But when the actual stored procedure statement runs, it stays stuck in the function for hours (e.g. if #tempTable1 has 12K records, the above statement takes 9 hours to complete).
This function calls several other multi-statement table-valued functions.
It also queries a table (say table1) which has full-text search enabled, using full-text predicates and functions such as CONTAINS and CONTAINSTABLE.
The table currently has 6 million records.
The full-text index on the table has change tracking set to OFF and is configured for 3 text columns in the table.
The table also has a clustered index on the Id (int) column.
This table is truncated and reloaded daily, and then the statement below is executed:
ALTER FULLTEXT INDEX ON indx_table1 START FULL POPULATION
No other maintenance seems to be done on the full-text index.
Checking sys.fulltext_index_fragments returns only 1 record, with the details below:
status  data_size  row_count
4       465517531  1408231
The max full-text crawl range option is set to 4.
--
I suspect the poor performance of the query is due to the unmaintained full-text index, but I do not have any proof of it.
If anyone has an idea about this, could you please share your thoughts?
Edit:
I have added the scripts for the TVF (testFunction) and a function called from it (fnt_testfunction3) at
https://www.codepile.net/pile/6BDJNPA0
The table with full text search used in testFunction is tbl_FullTextSearch

Postgres: Do non-selected rows affect performance?

My main question is: in a single table, does the number of records NOT matched by a WHERE clause affect the performance of SELECT, INSERT, and UPDATE queries?
Say I have a table with 20 million rows, and this table has an indexed error string column.
Pretend 19,950,000 of those records have 0 set for this column, and 50,000 have it set to NULL.
My query does SELECT * FROM pending_emails WHERE error IS NULL.
After some logic in my app, I then need to update those same records by ID to set their error:
UPDATE "pending_emails" SET "error" = '0' WHERE "pending_emails"."id" = 46
UPDATE "pending_emails" SET "error" = '0' WHERE "pending_emails"."id" = 50
I'm trying to determine if I can leave 'completed' records in the database without affecting performance of the active records I'm working with, or if I should delete them (not preferred).
Typically no. That's the purpose of indexing. You might want to consider a partial index (what SQL Server calls a filtered index) for this column: https://www.postgresql.org/docs/current/static/indexes-partial.html Then your index isn't even indexing the '0' rows at all.
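As a rough illustration of the idea, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in for Postgres (an assumption made only so the example runs without a server). The index name pending_emails_active_idx and the sample data are made up; the CREATE INDEX ... WHERE statement has the same shape in PostgreSQL.

```python
import sqlite3

# In-memory SQLite database standing in for the Postgres table (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pending_emails (id INTEGER PRIMARY KEY, error TEXT)")

# 995 completed rows (error = '0') and 5 active rows (error IS NULL).
conn.executemany(
    "INSERT INTO pending_emails (id, error) VALUES (?, ?)",
    [(i, None if i % 200 == 0 else "0") for i in range(1, 1001)],
)

# Partial index: only rows where error IS NULL get an index entry,
# so the completed rows add nothing to the size of this index.
# The index name is hypothetical.
conn.execute(
    "CREATE INDEX pending_emails_active_idx "
    "ON pending_emails (id) WHERE error IS NULL"
)


def active_ids(connection):
    """Return the ids of rows that still have error IS NULL."""
    rows = connection.execute(
        "SELECT id FROM pending_emails WHERE error IS NULL ORDER BY id"
    )
    return [row[0] for row in rows]


pending = active_ids(conn)

# Mark each active row as completed by primary key, as in the question.
conn.executemany(
    "UPDATE pending_emails SET error = '0' WHERE id = ?",
    [(row_id,) for row_id in pending],
)
remaining = active_ids(conn)
```

Rows leave the partial index as soon as their error column is set, so the index stays proportional to the active rows rather than to the whole table.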

Get a list of columns and widths for a specific record

I want a list of properties about a given table and for a specific record of data from that table - in one result
Something like this:
Column Name , DataLength, SchemaLengthMax
...and for only one record (based on a where filter)
So what I'm thinking is something like this:
- Get a list of columns from sys.columns and also the schema-based maxlength value
- populate column names into a temp table that includes (column_name, data_length, schema_size_max)
- now loop over that temp table and for each column name, fetch the data for that column based on a specific record, then update the temp table with the length of this data
- finally, select from the temp table
Sound reasonable?
Yup. That way works. Not sure if it's the best, since it involves one iteration per column along with the where condition on the source table.
Consider this, instead :
1. Get the candidate records into a temporary table after applying the where condition. Make sure to get a primary key. If there is no primary key, get a rowid (assuming SQL Server 2005 or above).
2. Create a temporary table (say, #RecValueLens) that has three columns: Primary_Key_Value, MyColumnName, MyValueLen.
3. Loop through the list of column names (after taking only the column names into another temporary table) and build the SQL statement shown in Step 4 for each one.
4. Insert Into #RecValueLens (Primary_Key_Value, MyColumnName, MyValueLen)
Select Max(Primary_Key_Goes_Here), Max('Column_Name_Goes_Here') as ColumnName, Len(Max(Column_Name_Goes_Here)) as MyValueLen From Source_Table_Goes_Here
Group By Primary_Key_Goes_Here
So, if there are 10 columns, you will have 10 insert statements. You could either insert them into a temporary table and run them in a loop or, if there are only a few columns, concatenate all the statements into a single batch.
5. Run the SQL statement(s) from above. You now have the value lengths record-wise and column-wise. What is left is to get the column definition.
6. Get the column definition from sys.columns into a temporary table and join it with #RecValueLens to get the output.
Do you want me to write it for you ?
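For a single record, the simpler plan from the question (one small query per column) can be sketched as below. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module, with PRAGMA table_info standing in for sys.columns, and the customer table, its columns, and the column_widths helper are hypothetical. On SQL Server the same loop would read max_length from sys.columns and use LEN or DATALENGTH.

```python
import re
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute(
    "CREATE TABLE customer ("
    "id INTEGER PRIMARY KEY, name VARCHAR(50), city VARCHAR(30))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada Lovelace', 'London')")


def column_widths(connection, table, key_column, key_value):
    """Return (column_name, data_length, schema_max_length) for one record.

    Table and column names are interpolated into the SQL text, so they
    must come from a trusted source.
    """
    # PRAGMA table_info returns one row per column:
    # (cid, name, declared_type, notnull, default_value, pk).
    columns = connection.execute(f"PRAGMA table_info({table})").fetchall()
    result = []
    for cid, name, declared_type, *rest in columns:
        # Pull the declared maximum out of e.g. 'VARCHAR(50)', if present.
        match = re.search(r"\((\d+)\)", declared_type or "")
        schema_max = int(match.group(1)) if match else None
        # One small query per column, filtered to the single record.
        (data_length,) = connection.execute(
            f'SELECT LENGTH("{name}") FROM {table} WHERE {key_column} = ?',
            (key_value,),
        ).fetchone()
        result.append((name, data_length, schema_max))
    return result


widths = column_widths(conn, "customer", "id", 1)
for row in widths:
    print(row)
```

With one record and a handful of columns, the per-column round trips are cheap; the temp-table approach above pays off when many records are inspected at once.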

LIKE vs CONTAINS on SQL Server

Which one of the following queries is faster (LIKE vs CONTAINS)?
SELECT * FROM table WHERE Column LIKE '%test%';
or
SELECT * FROM table WHERE Contains(Column, "test");
The second (assuming you mean CONTAINS, and actually put it in a valid query) should be faster, because it can use some form of index (in this case, a full-text index). Of course, this form of query is only available if the column is in a full-text index. If it isn't, then only the first form is available.
The first query, using LIKE, will be unable to use an index, since it starts with a wildcard, so will always require a full table scan.
The CONTAINS query should be:
SELECT * FROM table WHERE CONTAINS(Column, 'test');
Having run both queries on a SQL Server 2012 instance, I can confirm the first query was fastest in my case.
The query with the LIKE keyword showed a clustered index scan.
The CONTAINS also had a clustered index scan with additional operators for the full text match and a merge join.
I think that CONTAINS took longer and used a merge join because you had a dash ("-") in your search term, adventure-works.com.
The dash is treated as a word break, so CONTAINS searched the full-text index for adventure, then searched for works.com, and merged the results.
Also try changing from this:
SELECT * FROM table WHERE CONTAINS(Column, 'test');
To this:
SELECT * FROM table WHERE CONTAINS(Column, '"*test*"');
The former will find records with values like "this is a test" and "a test-case is the plan".
The latter is a prefix search, so it will also find records with values like "i am testing this". SQL Server full-text search has no suffix or infix wildcard, so the leading asterisk does not make it match values like "this is the greatest".
I don't really understand what is going on with the CONTAINS keyword. I set a full-text index on a column and ran some queries on the table.
LIKE returns 450.518 rows, which is the correct result, but CONTAINS returns far fewer:
SELECT COL FROM TBL WHERE COL LIKE '%41%' --450.518 rows
SELECT COL FROM TBL WHERE CONTAINS(COL,N'41') ---40 rows
SELECT COL FROM TBL WHERE CONTAINS(COL,N'"*41*"') -- 220.364 rows
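One plausible reading of these three counts is that the predicates match at different granularities: LIKE '%41%' matches a substring anywhere, CONTAINS with a plain term matches whole words, and a quoted term with an asterisk matches word prefixes. The sketch below is a rough Python simulation of that difference, not SQL Server's actual word breaker (which is language-specific); the helper names and sample values are made up, and it assumes the leading asterisk in '"*41*"' has no effect.

```python
import re


def words(value):
    """Split a value into lowercase words on any non-alphanumeric character."""
    return [w for w in re.split(r"[^0-9a-z]+", value.lower()) if w]


def like_substring(value, term):
    """Rough model of LIKE '%term%': term may sit anywhere, even inside a word."""
    return term.lower() in value.lower()


def contains_word(value, term):
    """Rough model of CONTAINS(col, 'term'): some whole word equals term."""
    return term.lower() in words(value)


def contains_prefix(value, term):
    """Rough model of CONTAINS(col, '"term*"'): some word starts with term."""
    return any(w.startswith(term.lower()) for w in words(value))


# Hypothetical column values.
samples = ["41", "box 41 north", "4100", "item-41x", "1410", "a2413"]

like_hits = [s for s in samples if like_substring(s, "41")]
word_hits = [s for s in samples if contains_word(s, "41")]
prefix_hits = [s for s in samples if contains_prefix(s, "41")]

print(len(like_hits), len(word_hits), len(prefix_hits))
```

In this model the whole-word matches are a subset of the prefix matches, which are in turn a subset of the substring matches, the same ordering as the 40, 220.364 and 450.518 row counts above.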

SQL Server Fulltext search not finding my rows

I have a SQL Server table and I'm trying to make sense of fulltext searching :-)
I have set up a fulltext catalog and a fulltext index on a table Entry, which contains among other columns a VARCHAR(20) column called VPN-ID.
There are about 200'000 rows in that table, and the VPN-ID column has values such as:
VPN-000-359-90
VPN-000-363-85
VPN-000-362-07
VPN-000-362-91
VPN-000-355-55
VPN-000-368-36
VPN-000-356-90
Now I'm trying to find rows in that table with a fulltext enabled search.
When I do
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, 'VPN-000-362-07')
everything's fine and dandy and my rows are returned.
When I start searching with a wildcard like this:
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, 'VPN-000-362-%')
I am getting results and everything seems fine.
HOWEVER: when I search like this:
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, 'VPN-000-36%')
suddenly I get no results back at all..... even though there are clearly rows that match that search criteria...
Any ideas why?? What other "surprises" might fulltext search have in store for me? :-)
Update: to create my fulltext catalog I used:
CREATE FULLTEXT CATALOG MyCatalog WITH ACCENT_SENSITIVITY = OFF
and to create the fulltext index on my table, I used
CREATE FULLTEXT INDEX
ON dbo.Entry(list of columns)
KEY INDEX PK_Entry
I tried to avoid any "oddball" options as much as I could.
Update #2: after a bit more investigation, it appears as if SQL Server Fulltext search somehow interprets my dashes inside the strings as separators....
While this query returns nothing:
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, '"VPN-000-362*"')
this one does (splitting up the search term on the dashes):
SELECT (list of columns)
FROM dbo.Entry
WHERE CONTAINS(*, ' "VPN" AND "000" AND "362*"')
OK - seems a bit odd that a dash appears to result in a splitting up that somehow doesn't work.....
Which language do you use for the word breaker? Have you tried Neutral?
EDIT:
In addition, you should use WHERE CONTAINS([Column], '"text*"'). See MSDN for more information on prefix searches:
C. Using CONTAINS with <prefix_term>
The following example returns all product names with at least one word starting with the prefix chain in the Name column.
USE AdventureWorks2008R2;
GO
SELECT Name
FROM Production.Product
WHERE CONTAINS(Name, ' "Chain*" ');
GO
btw ... similar question here and here
Just wondering, but why don't you just do this:
SELECT (list of columns)
FROM dbo.Entry
WHERE [VPN-ID] LIKE 'VPN-000-36%'
It seems to me that fulltext search is not the right tool for the job. Just use a normal index on that column.
