How to search for permuted words or numbers in a database

I have created a search form that searches my database and shows the results.
Now I am trying to make the form also match permuted words.
For example:
I have a text field in the form for typing the search keyword. What should I do if I want to search for "tea" and have the database return not only "tea" but also "tae", "eat", "ate", "eta", and "aet"?
I know how to generate the list of permutations of "tea", but I don't know how to make a search for "tea" automatically search for the permuted words as well.
Can I use a wildcard for this?
I thought about generating the permuted words first and then searching with each word in that list, but I could not get it to work.
Any ideas or help would be appreciated.
Thanks

Generate all permutations of the keyword, send the list to the database, and match the column against the whole list with an IN clause.
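A minimal sketch of that approach in Python, assuming an SQLite database and a hypothetical table `words` with a single `word` column (the question does not name the real schema or database engine). The helper names `permuted_forms` and `search_permutations` are made up for illustration.

```python
import sqlite3
from itertools import permutations


def permuted_forms(keyword):
    """Return every distinct ordering of the keyword's characters, sorted."""
    return sorted({"".join(p) for p in permutations(keyword)})


def search_permutations(conn, keyword):
    """Return stored words that are any permutation of the keyword."""
    forms = permuted_forms(keyword)
    # One bound parameter per permutation, so no user input is
    # concatenated into the SQL text.
    placeholders = ", ".join("?" for _ in forms)
    sql = f"SELECT word FROM words WHERE word IN ({placeholders}) ORDER BY word"
    return [row[0] for row in conn.execute(sql, forms)]


# Hypothetical in-memory table, for demonstration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE words (word TEXT)")
conn.executemany(
    "INSERT INTO words VALUES (?)",
    [("tea",), ("eat",), ("ate",), ("team",), ("toe",)],
)
results = search_permutations(conn, "tea")
print(results)
```

The number of permutations grows factorially with the keyword length, so this is only practical for short keywords. For longer ones, storing a sorted-letters key next to each word and comparing keys may be a better fit.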

Related

FullText Search with FreeText but only return records that contain all words of an expression, except those on stopword list

I've been searching for a solution for a while now, but I cannot come up with a way to get the record set I want.
I have a table full of different texts, a collection of all texts used in an HMI software.
When a user creates a new text, I want to check whether a similar text already exists in the table.
So far I have found that Full-Text Search on MS SQL Server should be the best way to do this. My problem is the following:
When I use FreeText on a new text that should be checked for similar values, I get way too many results. Every record is listed that contains even just one of the relevant words in my search string.
Example:
Search text:
Deceleration Linear Motor Transfer to Top
Values that should be found:
'Deceleration linear motor transfer top'
'Deceleration linear motor handover to top'
Values that should not be found:
'Acceleration linear motor handover to top'
'linear motor handover to top'
So I want it to work just like FreeText does (with INFLECTIONAL and THESAURUS comparison), but return only records that contain all words in the search string, except those that are on the stopword list (so filler words are ignored as well).
I thought about using Contains in combination with Formsof for every single word in my search string. But then it does not ignore the words on the stopword list.
I hope I was able to describe my problem properly, and I hope someone can help me with it.
Thanks in advance.
For anyone who might run into the same kind of problem: I have solved it myself with the sledgehammer approach.
I simply join all words in my search expression with
(Formsof(... Thesaurus, *Word1* ) OR Formsof(... INFLECTIONAL *Word1*)) AND
(Formsof(... Thesaurus, *Word2* ) OR Formsof(... INFLECTIONAL *Word2*))
For the stopwords, I check each word against the stopword list and skip it if it is listed, before adding it to my WHERE string.
This article helped me a lot with getting the correct language ID for the selected column in the code:
Some Useful Full Text Index Stoplist Related Queries
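The string building described above could be sketched like this in Python. This is only an illustration under assumptions: the function name `build_contains_condition` and the `STOPWORDS` set are hypothetical, and in practice the stopwords would be read from the server's stoplist for the column's language rather than hard-coded.

```python
def build_contains_condition(search_text, stopwords):
    """Build a CONTAINS search condition that requires every non-stopword.

    Each remaining word must match through either its thesaurus
    expansion or its inflectional forms.
    """
    terms = []
    for word in search_text.split():
        if word.lower() in stopwords:
            continue  # stopwords are skipped manually
        if '"' in word:
            raise ValueError("double quotes are not supported in this sketch")
        terms.append(
            f'(FORMSOF(THESAURUS, "{word}") OR FORMSOF(INFLECTIONAL, "{word}"))'
        )
    return " AND ".join(terms)


# Hypothetical stopword set, for demonstration only.
STOPWORDS = {"to", "the", "a"}

condition = build_contains_condition(
    "Deceleration Linear Motor Transfer to Top", STOPWORDS
)
print(condition)
```

The resulting string is meant to be passed as a bound parameter to the search condition argument of CONTAINS, not concatenated into the SQL statement itself.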

Is there a way to execute a query on SOLR where I have a list of words that need to be in different fields?

Hi everybody. I'm trying to write a query that does the following:
Find a set of words that appear in a group of fields. For example, I want to find the documents that have the words soccer, ball and goalkeeper in one or both of the fields 'sport_name' and 'description'.
The problem I'm having is that I need to treat both fields as a single field, to get results like:
{
"sport_name":"soccer",
"description": "...played with a ball... positions are goalkeeper"
}
The words may appear in either field, but all of them need to appear in the "concatenated bigger field".
Is there a way to do this during query time?
Thanks!!
You can do this by using the edismax handler (defType=edismax), setting q.op=AND (since all the terms have to be present) and using qf=sport_name description to tell Solr to search for the given terms in both fields.
You can also use qf=sport_name^2 description to say that you want to weigh hits in the sport_name field twice as much as hits in the description field. So if there was a sport named something with ball, that hit would contribute more to the score than if the same content were present in the description field.
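As a rough illustration, the request parameters above could be assembled like this in Python. The base URL and the core name `sports` are assumptions for the example, and `build_edismax_query` is a hypothetical helper, not part of any Solr client library.

```python
from urllib.parse import urlencode


def build_edismax_query(words, fields,
                        base_url="http://localhost:8983/solr/sports/select"):
    """Build a Solr select URL that requires every word across the given fields."""
    params = {
        "defType": "edismax",    # use the extended DisMax query parser
        "q": " ".join(words),    # the terms to search for
        "q.op": "AND",           # every term has to be present
        "qf": " ".join(fields),  # fields to search, with optional ^boost
    }
    return f"{base_url}?{urlencode(params)}"


url = build_edismax_query(
    ["soccer", "ball", "goalkeeper"],
    ["sport_name^2", "description"],
)
print(url)
```

`urlencode` takes care of escaping the spaces and the `^` boost character, so the field list can be written the same way as in the answer.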

Solr search on a field with ReverseStringFilterFactory return 0 records for reverse input

I have a requirement that a user should get the same result whether a string is searched in reversed or straight form.
For example, q="F44" and q="44F" should return the same result.
I have created a new field "text_rev", which is assigned to the field type below.
I also added a copy field from the original field "retailId":
<copyField source="retailId" dest="text_rev"/>
<fieldType name="text_rvsstr" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.ReverseStringFilterFactory"/>
  </analyzer>
</fieldType>
When I search with q=text_rev:F44 I get the result, but when I search with q=text_rev:44F I get 0 results.
Please advise.
Those searches are on the same field. Searching the reverse direction is only going to work on the reversed field, and searching the forward direction is only going to work on the original field.
By searching both fields for the same information, you can check both directions in one query.
q=retailId:F44 OR text_rev:F44
You need to search both fields. Also, if you actually expect to search in reverse, you need asymmetric index-time and query-time analyzer definitions. Otherwise your term gets reversed both during indexing and during querying, and you effectively lose any reason to reverse it.
You can test that with the Analysis screen of the Admin UI by providing content in both boxes. It will then show how the terms are processed and matched during indexing and querying.
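The effect of symmetric versus asymmetric analysis can be illustrated with a small Python simulation. This is not Solr code: `reverse_filter` is only a stand-in for what ReverseStringFilterFactory does to a token, and tokenization is ignored.

```python
def reverse_filter(token):
    """Stand-in for ReverseStringFilterFactory: reverse the token's characters."""
    return token[::-1]


def identity(token):
    """Stand-in for an analyzer that leaves the token unchanged."""
    return token


def matches(indexed_value, query, index_analyzer, query_analyzer):
    """A query matches when its analyzed form equals the analyzed indexed term."""
    return index_analyzer(indexed_value) == query_analyzer(query)


# Symmetric analysis: the reversal is applied on both sides and cancels out,
# so the reversed field behaves like the original one.
symmetric_forward = matches("F44", "F44", reverse_filter, reverse_filter)
symmetric_reversed = matches("F44", "44F", reverse_filter, reverse_filter)

# Asymmetric analysis: reverse only at index time, so a reversed query matches.
asymmetric_reversed = matches("F44", "44F", reverse_filter, identity)


def either_direction(indexed_value, query):
    """Mimic q=retailId:X OR text_rev:X with an index-time-only reversal."""
    return (matches(indexed_value, query, identity, identity)
            or matches(indexed_value, query, reverse_filter, identity))


print(symmetric_forward, symmetric_reversed, asymmetric_reversed)
```

With the symmetric setup the forward query matches and the reversed one does not, which is the behavior reported in the question. Reversing only at index time and searching both fields covers both directions.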

Multiple Full Text Search SQL Queries Merged and Scored (Ranked Search Results)

I have a bunch of articles in one table that I'd like to query for search results. Using Full Text Search I can return a list of items that have the search keywords "near" each other.
Full text search does not seem to allow thesaurus (FORMSOF) with the NEAR delimiter.
What I'd like to do, in SQL, is create a query, or a number of queries, which search the same data, in different ways, and return a score (or RANK if using Full Text Search), then I would like to merge these results so there are no duplicates, and total up the ranks/scores, so that I can ORDER BY those scores.
Add in that I would also like to search a separate link table of "tags" that the documents have been assigned, and also assign extra score for those with corresponding tags.
What is the best practice way of fulfilling these requirements?
Full-text search can run a proximity search such as ('"word*" near "another*"') in a CONTAINSTABLE statement. The asterisk matches any words starting with 'word' and 'another', so the query finds such words near each other and returns a rank.
Separately, you can run a FORMSOF(Thesaurus, word) AND FORMSOF(Thesaurus, another) search in a second CONTAINSTABLE statement.
Then merge the two result sets and use ORDER BY to sort by the combined RANK values.
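The merge-and-total step can be sketched outside the database as follows. The `(document id, rank)` pairs are made-up sample data standing in for the rows of each CONTAINSTABLE query and of the tag lookup, and `merge_ranked` is a hypothetical helper.

```python
from collections import defaultdict


def merge_ranked(*result_sets):
    """Merge (doc_id, rank) result sets, summing the ranks of duplicates.

    Returns (doc_id, total_rank) pairs, highest total first, with ties
    broken by document id.
    """
    totals = defaultdict(int)
    for results in result_sets:
        for doc_id, rank in results:
            totals[doc_id] += rank
    return sorted(totals.items(), key=lambda item: (-item[1], item[0]))


# Made-up sample results, for demonstration only.
near_hits = [(1, 40), (2, 25)]       # proximity search
thesaurus_hits = [(2, 30), (3, 10)]  # FORMSOF thesaurus search
tag_hits = [(3, 50)]                 # extra score for matching tags

merged = merge_ranked(near_hits, thesaurus_hits, tag_hits)
print(merged)
```

Full-text RANK values are only meaningful relative to the other rows of the same query, so summing ranks from different queries is a heuristic. Weighting each source before adding may give more sensible ordering.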

Does google search API eliminate stop words?

Suppose your search query in the Google search API is "I Love you".
In this query, "I" and "you" are stop words, and they occur in almost every document. The keyword in this search is "Love", which is what should be searched for. So there must be a process that detects the stop words and eliminates them from the query we feed to the API. Does Google do this automatically in their search API, or do we have to process the search query before firing it? If Google already uses an IDF (Inverse Document Frequency) table to eliminate (or de-prioritise) the stop words, how do they do it? If not, how can we eliminate those stop words? Does the algorithm (if any) work for other (vernacular) languages too?
Link to Google search API here
The Google full-text search API does not eliminate stop words.
If you perform a global search with the query "I Love you", you will only get documents that contain all three words, not documents that contain just the stop words:
The white space between words, quoted strings, numbers, and dates is
treated as an implicit AND operator.
If you want the same behavior while searching within a field, here is one approach:
If you enclose your query in parentheses, the search will only return documents that contain all the words in the query.
For the case "I Love you", search query should be:
field_name = "(I Love You)"
or
field_name = "(I AND Love AND You)"
This way you will only get documents that contain all the words and not just stop words.
Alternatively, you can search the index for just the word "Love".
If you want to search for the word anywhere in the text, you can use the wildcard operator *:
field_name = "Love*"

Resources