I'm working with full-text indexing in SQL Server and have run into the following problem:
In my indexed column, there are words and numeric codes that I need to find. For example: 069-8987.15
Users will search for a code in one of three forms: the literal code (069-8987.15), a hybrid form (069-898715), or with all special characters removed (069898715).
Only the first form works:
SELECT [Key], [Rank]
FROM CONTAINSTABLE(dbo.History, Report, '*069-8987.15*')
With the other two forms, nothing is returned.
How can I fix this? What do I need to do so that all three forms of the search return the data?
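One possible workaround, offered as an assumption rather than anything confirmed in this thread: keep a punctuation-free copy of each code in an extra (hypothetical) column and normalize the user's input the same way before searching. A minimal Python sketch of the normalization step; the function name is illustrative only:

```python
import re

def normalize_code(text: str) -> str:
    """Drop every character that is not a letter or digit."""
    return re.sub(r"[^0-9A-Za-z]", "", text)

# The three forms users are expected to type.
variants = ["069-8987.15", "069-898715", "069898715"]

# All three collapse to the same normalized key.
keys = {normalize_code(v) for v in variants}
print(keys)
```

Whether the normalized value is then matched through full-text search or a plain equality or LIKE comparison is a separate design choice that this sketch does not settle.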
I have this query and it refuses to use an index. I don't know whether that's because of the "Expand" stage in the pipeline or something else, but I can't get it to use an index in this form. In particular, the ORDER BY clause still produces a "Sort" stage in the plan, which I'd like to avoid.
The index is on the createdAt property.
PROFILE
MATCH (u:User {user_id: '61c84762da4e457d55656efa'})-[follows:FOLLOWS]->(following:User)-[relatedTo:POSTED|SHARED]->(everything)
WHERE relatedTo.createdAt > datetime("2000-02-12T15:42:10.866+00:00")
RETURN u, relatedTo, everything
ORDER BY relatedTo.createdAt DESC
Here is a picture of the planner
The only way I can get it to do what I want is to remove everything before the last relationship, which obviously defeats the point of the query, but it was just a test.
PROFILE
MATCH (following:User)-[relatedTo:POSTED|SHARED]->(everything)
WHERE relatedTo.createdAt > datetime("2000-02-12T15:42:10.866+00:00")
RETURN relatedTo, everything
ORDER BY relatedTo.createdAt DESC
Now it uses the index.
Any ideas on how to get it to use an index for both the query and the sort?
I'm not entirely clear on why you want to use an index here.
In your first query, an index is used to find the :User node, and then relationship pointers are followed to find the other nodes of interest. In Neo4j, following relationship pointers is always faster than trying to use an index to find nodes (unlike in a relational database). Typically, you only want to use an index to find the start nodes of a path, which is what your first query is doing.
If you really want the index search to start in a different part of the path, you could split the query into multiple parts using WITH.
I have a SQL Server Full-Text Search indexing two columns in one of my tables.
I am pulling suggested keywords into a web front end based on the user's input, such that entering a phrase like 'ban' yields words such as banana, banish, urban, husband, etc. The user then clicks on one of these words to confirm their choice, or adds further letters to narrow down the search.
The total number of keywords is given by the following query:
SELECT COUNT(*) FROM sys.dm_fts_index_keywords ( DB_ID(), OBJECT_ID('Search'))
217,998
To query the keywords, I use a query like the one below:
SELECT TOP 10 *, display_term, document_count
FROM sys.dm_fts_index_keywords ( DB_ID(), OBJECT_ID('Search'))
WHERE column_id=5
AND keyword != 0xFF
AND display_term like '%ban%'
AND display_term NOT LIKE 'nn%'
However, this currently takes circa 30 seconds to run! Clearly this is far too slow to be of any use.
So, as a workaround, I have created my own keywords table. Whenever I add content to my full-text search table, I run the query below to find out which keywords will be indexed:
SELECT display_term AS Term, COUNT(display_term) AS [Count]
FROM sys.dm_fts_parser('"There are many types of fruit, including apples, bananas and cherries." ', 1033, 0, 0)
WHERE display_term NOT LIKE 'nn%'
AND special_term NOT IN ('Noise Word', 'End of Sentence')
GROUP BY display_term
I then take these words and store them in my own keywords table for later use by the web front end described above. This is much quicker.
However, I can't help feeling that I shouldn't need to create a workaround and that finding keywords is something that many people would need to be doing.
I have searched for other methods, tables, or other functionality contained within SQL Server, but all to no avail.
I have also looked into indexing the sys.dm_fts_index_keywords table. However, searching for the word "indexing" is problematic due to the nature of the subject matter.
Does anyone have another method that is quick to execute, and hopefully also requires less programmatic intervention?
I've been searching for a solution for a while now, but I can't come up with a way to return the recordset I want.
I have a table full of different texts, a collection of all the texts used in an HMI software.
Now, when a user creates a new text, I want to check whether a similar text already exists in the table.
So far I've found that Full-Text Search on MS SQL Server should be the best way to do this. My problem is the following:
When I use FREETEXT on a new text that should be checked for similar values, I get way too many results. Every record that contains even one of the relevant words in my search string is listed.
Example:
Search text:
Deceleration Linear Motor Transfer to Top
Values that should be found:
'Deceleration linear motor transfer top'
'Deceleration linear motor handover to top'
Values that should not be found:
'Accelearion linear motor handover to top'
'linear motor handover to top'
So I want it to work just like FREETEXT does (with INFLECTIONAL and THESAURUS comparison), but to return only records that contain all the words in the search string, except those on the stopword list (so filler words are also ignored).
I thought about using CONTAINS in combination with FORMSOF for every single word in my search string, but then the words on the stopword list are not ignored.
I hope I was able to specify my problem properly and hope someone can help me with it.
Thanks in advance.
For anyone who might run into the same kind of problem: I've since solved it myself with the sledgehammer approach.
I just join all the words in my search expression like this:
(FORMSOF(... THESAURUS, *Word1*) OR FORMSOF(... INFLECTIONAL, *Word1*)) AND
(FORMSOF(... THESAURUS, *Word2*) OR FORMSOF(... INFLECTIONAL, *Word2*))
I skip the stopwords manually: before adding a word to my WHERE string, I check whether it is on the stoplist.
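The approach described above could be sketched roughly as follows. This is only an illustration in Python, not the author's actual code: the function name is hypothetical, and the stopword set is hard-coded here as an assumption, whereas in practice it would be read from the stoplist that applies to the indexed column.

```python
from typing import Set

def build_contains_expression(search_text: str, stopwords: Set[str]) -> str:
    """AND together one FORMSOF group per word that is not a stopword."""
    groups = []
    for word in search_text.split():
        if word.lower() in stopwords:
            continue  # skip stopwords manually, as described above
        # Quote the term and drop embedded double quotes so it stays one term.
        term = '"' + word.replace('"', '') + '"'
        groups.append(
            f"(FORMSOF(THESAURUS, {term}) OR FORMSOF(INFLECTIONAL, {term}))"
        )
    return " AND ".join(groups)

# Assumption: "to" is the only stopword in this example.
expr = build_contains_expression("Deceleration Linear Motor Transfer to Top", {"to"})
print(expr)
```

The resulting string is meant to be passed as a parameter to CONTAINS rather than concatenated into the SQL text.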
This article helped me a lot with getting the correct language id for the selected column in the code.
Some Useful Full Text Index Stoplist Related Queries
I am using CONTAINS and FREETEXT in a SQL query to search for text in big text fields.
I noticed that the search returns results when the exact word matches, but what if I want to search for similar words?
For example, when I type Carlo, nothing is displayed if what I have is Carlos (with an S).
Below is a simple query similar to the one I use:
SELECT P.*
FROM MyTable AS P
WHERE(CONTAINS(P.*, 'Carlo') OR freetext(P.*, 'Carlo'))
How can I make the search return words similar to Carlo, such as Carlos, Carla, etc., without affecting performance?
Try this:
SELECT P.*
FROM MyTable AS P
WHERE CONTAINS(P.*, 'FORMSOF(INFLECTIONAL, "Carlo")')
For reference, you can check the documentation.
I'm using SQL Server 2012 and have created a full-text index for the NAME column in the COMPANY table. All the searches I've tested have the following format (with a variable number of words to search), matching by the beginnings of words in any order:
select id, name from company where contains(name, '"ka*" AND "de*"')
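For illustration, a search condition of this shape could be assembled from raw user input roughly as follows. This is a sketch only; the function name is hypothetical and the thread does not say how the string is actually built.

```python
def build_prefix_query(user_input: str) -> str:
    """Turn whitespace-separated words into AND-ed prefix terms."""
    # Drop characters that have special meaning inside the search condition.
    words = [w.replace('"', '').replace('*', '') for w in user_input.split()]
    return " AND ".join(f'"{w}*"' for w in words if w)

print(build_prefix_query("ka de"))  # "ka*" AND "de*"
```

The returned string would be passed as the second argument of CONTAINS, ideally as a query parameter.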
The problem is that in some cases the CONTAINS query returns no results even though the name should be a perfect match. For example, when the company name is "ka de we oy", the example above returns a match, but '"ka*" AND "de*" AND "we*"' does not, and neither does searching with all four 'words'.
There are also other cases where, strangely enough, the search does not return results even with exact words. This seems related to very short (two-letter) words. There are also some issues when searching with many (6+) words.
Is there some explicit restriction on the number of words in a single query, or on how short they can be? How can I fix or work around this?
Edit: it seems that certain common English words are entirely excluded from the index (like 'we' in the example). This is an issue, since it's a requirement that a few of these common words be searchable. Is there any way to change which words are not indexed, or e.g. to change the 'language' of the indexing so that a different set of common words is left out?
Apparently this is simply a case of defining the correct stopwords / stoplist:
https://msdn.microsoft.com/en-us/library/ms142551.aspx
https://msdn.microsoft.com/en-us/library/cc280405.aspx
Alternatively, set the full-text index language for the column to the actual language so that English words don't cause issues.
Edit: actually it was easiest to simply disable the stoplist for the table entirely:
ALTER FULLTEXT INDEX ON company SET STOPLIST = OFF
Hopefully this helps someone else.