Autocomplete term suggestion as per popularity - solr

I have implemented autocomplete term suggestion in my MVC application. Let me explain you how I have done this. I have created one table in DB and table columns is like:
Id SearchTerm CatID ResultCount Clicks Latency TermSearchTime
Now, whenever user search a term we store it in this table. Next time it same word match we display term suggestion. Moreover, we display term suggestion as term popularity. Which word is more searched is displayed first in suggestion.
But now I also want to provide term suggestion for misspell term. For example Samsung is already there in my table. If someone search for samsng in that case Samsung should be there in term suggestion.
As I do not know how to spell check in SQL server, I decided to do it using Solr.
How can I do it using Solr with my default behaviour which I have done with SQL Db? Moreover, please note Search result I fetch from the Solr. I have already index all products. Do I need to index Search Term as well?
Any help is appreciation. Thanks.

check this in your solrconfig.xml file to use spellcheck handler.
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="df">text</str>
<!-- Solr will use suggestions from both the 'default' spellchecker
and from the 'wordbreak' spellchecker and combine them.
collations (re-written queries) can include a combination of
corrections from both spellcheckers -->
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.dictionary">wordbreak</str>
<str name="spellcheck">on</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.alternativeTermCount">5</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">10</str>
<str name="spellcheck.maxCollations">5</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr></requestHandler>
if not present then copy paste in your file. restart solr. try /spell?q=ipad

Related

AnalyzingInfixLookupFactory implementation in Solr Suggestor not returning suggestion results

My requirement is to provide automatic suggestions to users on asset names as per their project.
I have tried using AnalyzingInfixLookupFactory and BlendedInfixLookupFactory, as these are the only ones that support context filtering.
But no suggestion results are being returned.
Below is extract from solrconfig.xml:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">assetname_str</str>
<str name="indexPath">/home/suggest_index</str>
<str name="contextField">projectid</str>
<str name="weightField">weight</str>
<str name="suggestAnalyzerFieldType">string</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
<str name="suggest.dictionary">mySuggester</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
However if I try using FuzzyLookupFactory as lookup Impl, then suggestion results are returned as expected.(but problem is Fuzzylookupfactory does not support context filtering)
url used:
http://ipaddress:port/solr/collection_name/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&wt=json&suggest.q=Com&suggest.cfq=
1234
(I know this is an old issue, but in case others stumble across it with the same problem...)
I spent a couple of days dealing with the same empty results. You don't say what the type of the field is that you're using as material for suggestions. You've got suggestAnalyzerFieldType set to string.
By default, string is a fieldType with no analysis many out-of-the-box schema.xml examples. A key concept, which is only vaguely hinted at in the Solr manual's Suggester doc, is that lookupImpls like AnalyzingInfixLookupFactory and BlendedInfixLookupFactory can take a suggestAnalyzerFieldType that is not the type of the field from which you are generating suggestions, but rather need a type that contains the appropriate analyzer elements, such as solr.WhiteSpaceTokenizer needed for suggestions.
In my case, I was trying to suggest from a multivalued string field--I wanted the field to have no tokenization. But until I changed the suggestAnalyzerFieldType from string to text_ws (a fieldType whose analyzer is only sole.WhiteSpaceTokenizer, I got empty results.
For what it's worth, if you use multivalued string field for suggestions, and many documents that contain the same string values in that field, then the BlendedInfixLookupFactory seems to produce a better result with no duplicate suggestions.

How to tune apache SOLR spellcheck for desired suggestion?

Enviroment: SAP Hybris 6.7.0.0, Apache Solr 7.7.2
I am using solr to power a indie eCommerce platform. In that context we have product data in the Solr dB. For example: productName_text, BrandName_string, etc.
I've created a spellcheck component with this current configuration below:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="name">en</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="field">spellcheck_en</str>
<str name="distanceMeasure">internal</str>
<float name="accuracy">0.7</float>
<int name="maxEdits">2</int>
<int name="minPrefix">0</int>
<int name="maxInspections">5</int>
<int name="minQueryLength">2</int>
</lst>
</searchComponent>
And turned on spellcheck on /select request handler
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">true</str>
and spellcheck is configured dynamically for the a single field. Suppose:
productName_text
which consists of product names from a typical electronic gadgets or it's cases. For example:
"Apple Watch Series 2 38mm Stainless Steel Case with Midnight Blue Modern Buckle Medium"
"A.O. Smith X4 RO Water Purifier (White)"
If we misspell "wath" for "watch" we get suggestion "water". Or spelling "suop maker" for "soup maker" we get "shop maker". How to tune spellchecker according to my data? Is there any other solution to implement for misbehaving queries.
Tried playing with all the spellcheck configuration from [1]: https://cwiki.apache.org/confluence/display/SOLR/SpellCheckComponent but couldn't find any solid solution yet.
Tried implementing WordBreakSolrSpellChecker, which doesn't seem to change any outcome
Played around with "spellcheck.collate" and other attributes, but it returns suggestion which has no search result.
I've observed, spellcheck is deeply affected by multivalued fields(?)
In general, How to go about the terms which should give wrong suggestion, or suggestions that are that must not come based on user preferences? Is it possible to handle two different spellcheck components, if "DirectSolrSpellChecker" does'nt give desired suggestion , I can switch to "FilebasedSpellChecker"? Can I maintain a .txt file to track all the terms which needs tuning, or the same in SAP hybris?

Solr SuggestComponent - Building dictionaries based on certain filters?

I am currently using the solr suggest component for an autocomplete feature. Now, according to user permissions and which area of the site I am on, I want to offer the user different suggestions. Now I assumed it would easily be possible to only consider certain explicit entries for building my dictionary (i.e.: dict1 is built only from entries where type=t1 and locale=en, dict 2 where type=t1 and locale=de, dict3 where type=t2 and locale=en, etc...). But I can't figure out where I would do such a thing. The system is running solr 4.6.
Do you know of any solution or have a possible workaround?
I am not currently able to update solr on the system or change the way documents are indexed apart from the solr configuration, so unfortunately context filtering is not available to me. This would only be a last resort if nothing else works.
Since you’re using old Solr 4.6 which doesn’t have context filtering or even multiple dictionaries, you would need to specify your dictionaries aka SearchComponent for each of the entry
<searchComponent class="solr.SpellCheckComponent" name="suggest-en">
<lst name="spellchecker">
-->
<!--
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">name</str> <!-- the indexed field to derive suggestions from
<float name="threshold">0.005</float>
<str name="buildOnCommit">true</str>
<str name="sourceLocation">american-english</str>
--> </lst>
</searchComponent>
And then define request handlers, like this:
<requestHandler class="org.apache.solr.handler.component.SearchHandler"
name="/suggest-en">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest-en</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
The workaround here would be, that you would need to specify suggest-en, suggest-de, and all other types of the request handlers, and later point out clients depends on their profile to the correct request handler.

DSE CQL Query for Solr Suggestor

I am using DSE 5.0.1 version. Earlier we used facet query to show search suggestions. For performance reasons , looking for other alternatives to get suggestions and found solr search suggester component. But I couldn't find examples where suggester component is used from a CQL query. Its possible right?Can anyone help me on this.
Thanks in advance.
Yes, it's possible and relatively easy - you just need to understand how to map XML that you want to put into generated solrconfig.xml into JSON that is used for configuration.
For example, we want to configure suggestor to suggest on the data from field title, and use additional weights from the rating field. As per Solr documentation the XML piece should look following way:
<searchComponent class="solr.SuggestComponent" name="suggest">
<lst name="suggester">
<str name="name">titleSuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="suggestAnalyzerFieldType">TextField</str>
<str name="field">title</str>
<str name="weightField">rating</str>
<str name="buildOnCommit">false</str>
<str name="exactMatchFirst">true</str>
<str name="contextField">country</str>
</lst>
</searchComponent>
<requestHandler class="solr.SearchHandler" name="/suggest">
<arr name="components">
<str>suggest</str>
</arr>
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
</requestHandler>
In CQL, it will be converted
ALTER SEARCH INDEX CONFIG ON table ADD
searchComponent[#name='suggest',#class='solr.SuggestComponent']
WITH $$ {"suggester":[{"name":"titleSuggester"},
{"lookupImpl":"AnalyzingInfixLookupFactory"},
{"dictionaryImpl":"DocumentDictionaryFactory"},
{"suggestAnalyzerFieldType":"TextField"},
{"field":"title"}, {"weightField":"rating"},
{"buildOnCommit":"false"}, {"exactMatchFirst":"true"},
{"contextField":"country"}]} $$;
ALTER SEARCH INDEX CONFIG ON table ADD
requestHandler[#name='/suggest',#class='solr.SearchHandler']
WITH $$ {"defaults":[{"suggest":"true"},
{"suggest.count":"10"}],"components":["suggest"]} $$;
After that you need not to forget to execute:
RELOAD SEARCH INDEX ON table;
And your suggestor will work. In my example, the index for suggestor should be build explicitly because inventory doesn't change very often. This is done via HTTP call like this:
curl 'http://localhost:8983/solr/keyspace.table/suggest?suggest=true&suggest.dictionary=titleSuggester&suggest.q=Wat&suggest.cfq=US&wt=json&suggest.build=true&suggest.reload=true'
But you can control this by setting buildOnCommit to true. Or you can configure it to build suggestor index on start, etc. - see Solr's documentation.
Full example is here - this is an example of the e-commerce application.

Solr - Boosting result if query is found in a special field

I am wondering if it is possible with Solr 3.4 to boost a search result, if the query is found in a special field without using the "fieldname:query"-syntax.
Let me explain:
I have several fields in my index. One of it is named "abbreviation" and is filled with text like AVZ, JSP, DECT, ...
To be able to find results when searching purely for "AVZ" I added a
<copyField source="abbreviation" dest="text"/>
in my schema.xml. The field text is my defaultSearchField.
This is not the best solution in my opinion. So I am trying to find out, if it is possible to search for "AVZ" in all fields and if the String is found in the field abbreviation, the result entry should be boosted (increasing the score) so that it will be listed at first entry in the result list. Would be the same as using abbreviation:AVZ AVZ as query.
The other possibility I can think of is to analyze the query. And if a substring like "AVZ" is found, the query will be appended with abbreviation:AVZ. But in this case I must be able to find out, which abbreviations are indexed. Is it possible to retrieve all possible terms of a field from the Solr index using SolrJ?
Best Regards
Tobias
Without the fieldname:term syntax use can define a request handler -
<requestHandler name="search" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">dismax</str>
<str name="qf">
abbreviation^2 text
</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
</lst>
</requestHandler>
This uses the dismax query parser. You can use edismax as well.
This will boost the results and query would be a simple query as q=AVZ.
If only through url, you can boost match on specific field like mentioned # link
e.g.
q=abbreviation:AVZ^2 text:AVZ
This would boost the results with a match on abbreviation, which would result the documents to appear on top.
It is not possible to get all results with dismax using the *:* query.
However, for all docs just do not pass any q param. q.alt=*:* will return all the docs.
Else, update the defType to edismax.
<requestHandler name="search" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<str name="qf">
abbreviation^2 text
</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
</lst>
</requestHandler>
Apache Solr 6.4.2:
Boosting Exact phrase search not working:
Solrconfig.xml:
explicit
<int name="rows">10</int>
<str name="defType">edismax</str>
<str name="qf">names^50</str>
<!-- <str name="df">text</str> -->
</lst>
Solr query used to test: q=(names:alex%20pandian)&wt=json&debugQuery=on
In debug mode it shows
"parsedquery_toString":"+((names:alex ((names:pandian)^50.0))) ()"
It is boosting the terms from second word only. In this case only Pandian is boosted but Alex is not.

Resources