Solr dictionary-based suggester won't suggest on whole phrase

When I send a query containing multiple words to my Suggester component, I get separate suggestions for each word. The problem is well explained here: How to have Solr autocomplete on whole phrase when query contains multiple terms?
The only difference is that my suggester is based on a dictionary file, not an index field. The solution explained in that question, and many others, didn't work.
Here is the configuration:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory</str>
<str name="buildOnCommit">true</str>
<str name="suggestAnalyzerFieldType">text_suggest</str>
<str name="sourceLocation">suggestionsFull.txt</str>
</lst>
<str name="queryAnalyzerFieldType">text_suggest</str>
<!-- <queryConverter name="queryConverter" class="org.apache.solr.spelling.SuggestQueryConverter"/> -->
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">false</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
schema.xml:
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.TurkishLowerCaseFilterFactory"/>
<filter class="solr.TrimFilterFactory"/>
</analyzer>
</fieldType>
I also use the spellcheck.q parameter instead of q:
http://localhost:8983/solr/collection1/suggest?spellcheck.q=bu+bir&wt=json&indent=true
What am I doing wrong?

Finally I found the solution:
It looks like even if you build the suggestion dictionary from a file rather than from an index field, you still have to specify an index field in solrconfig.xml. So in schema.xml, create a dummy field using the text_suggest field type that we already defined:
<field name="text_suggest" type="text_suggest" indexed="false" stored="false" />
Then in solrconfig.xml, add the line <str name="field">text_suggest</str> to the searchComponent:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.AnalyzingInfixLookupFactory</str>
<str name="buildOnCommit">true</str>
<str name="suggestAnalyzerFieldType">text_suggest</str>
<str name="field">text_suggest</str>
<str name="sourceLocation">suggestionsFull.txt</str>
</lst>
</searchComponent>
Restart Solr and you're done!

Related

Solr Suggester: AnalyzingInfixLookupFactory - Store Lookup build failed

I have this configuration (with Solr 5.3.1):
<searchComponent class="solr.SuggestComponent" name="suggest">
<lst name="suggester">
<str name="name">suggest</str>
<str name="storeDir">dict_suggest</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="highlight">false</str>
<str name="field">suggestion</str>
<str name="suggestAnalyzerFieldType">suggest</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="payloadField">id</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="suggest">true</str>
<str name="suggest.dictionary">suggest</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
The field in schema.xml is defined as <field name="suggestion" type="suggest" indexed="true" stored="true" required="true" multiValued="true" />.
The field type definition is this:
<fieldType name="suggest" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.WhitespaceTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
Each time I try to build the index, Solr shows "Store Lookup build failed".
There's no dump or description in the logs.
Am I missing something in the config? The suggester itself seems to work fine, so the in-memory index is working.
Thanks

Returning an entire Document on Solr Suggestion

I implemented a basic Solr suggester, and I am able to get the suggested terms.
But is there a way to return the entire Solr document based on the suggestion?
Here are the searchComponent and requestHandler in solr_config.xml:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookupFactory</str>
<str name="field">complete_search</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
The field and fieldType definitions in schema.xml are as follows:
<field name="complete_search" type="text_auto" indexed="true" stored="true" multiValued="true"/>
<fieldType class="solr.TextField" name="text_auto">
<analyzer>
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
</fieldType>
The result I am getting is as follows:
<arr name="suggestion">
<str>global academy for learning</str>
<str>global art</str>
<str>global institute of fine arts</str>
<str>global kids</str>
<str>global music academy</str>
<str>global residential school</str>
<str>globetrippers</str>
<str>globetrotters</str>
<str>glorious kids</str>
<str>glow tennis academy</str>
</arr>
My query is: http://localhost:8983/solr/core_name/suggest?q=glo
So is there a way to get the output in the form of a Solr document, as in:
<doc>
<str name="id">35716</str>
<str name="PID">35716</str>
<str name="service_name">Cherubs Montessori</str>
<arr name="complete_search">
<str>Cherubs Montessori</str>
<str>Arts and Crafts</str>
<str>No 173, 9th Main Road, 7th Sector, HSR Layout</str>
<str>Bangalore</str>
<str>HSR Layout</str>
</arr>
<str name="permalink">http://zp.local/extracurricular-activities/cherubs-montessori-at-hsr-layout-in-bangalore</str>
<arr name="categories">
<str>Arts and Crafts</str>
</arr>
<float name="average_ratings">0.0</float>
<str name="lat_lng">12.9102859,77.6450215</str>
<str name="listing_thumbnail">/uploads/2015/09/Cherubs-Montessori-300x122.jpg</str>
<float name="maximum_age">14.0</float>
<float name="minimum_age">5.0</float>
<str name="address">No 173, 9th Main Road, 7th Sector, HSR Layout</str>
<str name="city">Bangalore</str>
<str name="locality">HSR Layout</str>
<long name="_version_">1514279153996660736</long></doc>
<doc>
It is not possible at the moment. You can return only one extra field, via the payload attribute, along with your suggestions. You can find more information here.
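As a rough illustration of the payload approach, here is a minimal sketch. It assumes a Solr version that ships solr.SuggestComponent (the payload option belongs to that component, not to the SpellCheckComponent-based Suggester used in the question), and that the id field is stored. The suggester name docSuggester is hypothetical; complete_search, text_auto and id come from the question.

```xml
<!-- Sketch only: "docSuggester" is a made-up name; adjust fields to your schema -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">docSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <!-- Build suggestions from stored field values of indexed documents -->
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">complete_search</str>
    <!-- The single extra field returned with each suggestion -->
    <str name="payloadField">id</str>
    <str name="suggestAnalyzerFieldType">text_auto</str>
    <str name="buildOnCommit">true</str>
  </lst>
</searchComponent>

<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">docSuggester</str>
    <str name="suggest.count">10</str>
  </lst>
  <arr name="components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

With a setup like this, a request such as /suggest?suggest.q=glo should return each suggestion together with the id as its payload, and the full document can then be fetched with an ordinary query on that id.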

Solr: how to sort suggestions by sales

We are trying to use Solr as the search engine on our website, but we have a problem: we cannot sort the suggestions by number of sales.
I tried the Facet and Terms components, FreeTextLookupFactory, and the spellcheck component, but with none of them was I able to get the results I want.
The most important thing I would like to understand is whether we can sort the suggestions by a weight that we choose.
We are using Solr 5.0.
schema.xml:
<field name="name_complete" type="text_shingle" indexed="true" stored="true" required="false" multiValued="false" omitTermFreqAndPositions="true"/>
<fieldType name="text_shingle" class="solr.TextField" positionIncrementGap="100">
<analyzer>
<tokenizer class="solr.StandardTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.ShingleFilterFactory" maxShingleSize="4" outputUnigrams="true"/>
</analyzer>
</fieldType>
solrconfig.xml
FreeTextLookupFactory:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">suggest_product_free</str>
<str name="lookupImpl">FreeTextLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">name_complete</str>
<str name="indexPath">suggest_product_free</str>
<str name="weightField">n_sales</str>
<str name="buildOnCommit">true</str>
<str name="suggestFreeTextAnalyzerFieldType">text_shingle</str>
<int name="ngrams">3</int>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="wt">json</str>
<str name="indent">true</str>
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
spellcheck:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="queryAnalyzerFieldType">text_shingle</str>
<str name="name">autocomplete</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.WFSTLookupFactory</str>
<str name="field">name_complete</str>
<str name="buildOnCommit">true</str>
<float name="threshold">0.005</float>
<str name="spellcheckIndexDir">./suggester_autocomplete</str>
</lst>
</searchComponent>
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<!-- Solr will use suggestions from both the 'default' spellchecker
and from the 'wordbreak' spellchecker and combine them.
collations (re-written queries) can include a combination of
corrections from both spellcheckers -->
<str name="spellcheck">true</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.dictionary">autocomplete</str>
<!--str name="spellcheck.dictionary">wordbreak</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.alternativeTermCount">5</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">10</str>
<str name="spellcheck.maxCollations">5</str-->
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
Any ideas on how I can order the suggestions by weight?
Thanks to all and have a nice day.
Example:
Assume that I have these values stored in Solr:
1: Reflex canon eos 7d (weight: 1)
2: Reflex canon eos 6d (weight: 2)
3: Reflex canon eos 70d (weight: 3)
and I request the string 'can'.
The result that I want:
canon eos
canon eos 70d
canon eos 6d
canon eos 7d
In practice, I want an autocomplete whose suggestions I can order by weight.

Returning arbitrary fields using the Solr Suggester component

I'm looking to use the Solr Suggester component for serving up search auto-complete suggestions.
I have created a field in my schema:
<fieldType name="text_autocomplete" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.LowerCaseTokenizerFactory"/>
</analyzer>
</fieldType>
While my solrconfig.xml looks like:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">title_autocomplete</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
I am getting sensible results back, which is great. However, I would like to return the id field of the matching document rather than the field I am matching against.
The suggester component is designed to return the words that best match the partial input you have typed. If you then need the documents that contain a specific word from the list of suggestions, send another query to Solr (this time to a different search handler) and get the document ids back.
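The second query could go to a dedicated handler along these lines. This is only a sketch: the handler name /suggest_ids is hypothetical, and it assumes that title_autocomplete is indexed and that id is a stored field.

```xml
<!-- Sketch only: "/suggest_ids" is a made-up handler name -->
<requestHandler name="/suggest_ids" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Search the same field the suggestions were built from -->
    <str name="df">title_autocomplete</str>
    <!-- Return only the document ids -->
    <str name="fl">id</str>
    <str name="rows">10</str>
  </lst>
</requestHandler>
```

The client would first call /suggest, let the user pick a suggestion, and then pass the chosen term as q to this handler to get the ids of the matching documents.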

Solr: full sentence spellcheck

I'm trying to configure a spellchecker to autocomplete full sentences from my query.
I've already been able to get these results:
"american israel" :
-> "american something"
-> "israel something"
But I want:
"american israel" :
-> "american israel something"
This is my solrconfig.xml :
<searchComponent name="suggest_full" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">suggestTextFull</str>
<lst name="spellchecker">
<str name="name">suggest_full</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">text_suggest_full</str>
<str name="fieldType">suggestTextFull</str>
</lst>
</searchComponent>
<requestHandler name="/suggest_full" class="org.apache.solr.handler.component.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest_full</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.onlyMorePopular">true</str>
</lst>
<arr name="last-components">
<str>suggest_full</str>
</arr>
</requestHandler>
And this is my schema.xml:
<fieldType name="suggestTextFull" class="solr.TextField">
<analyzer type="index">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
<filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
</analyzer>
<analyzer type="query">
<tokenizer class="solr.KeywordTokenizerFactory"/>
<filter class="solr.LowerCaseFilterFactory"/>
</analyzer>
</fieldType>
...
<field name="text_suggest_full" type="suggestTextFull" indexed="true" stored="false" multiValued="true"/>
I've read somewhere that I have to use spellcheck.q because q uses the WhitespaceAnalyzer, but when I use spellcheck.q I get a java.lang.NullPointerException.
Any ideas?
If your spellcheck field (text_suggest_full) contains american something and israel something, make sure that there is also a document/entry with the value american israel something.
Solr will not merge american something and israel something into one term, and will not apply the result to your spellchecking for american israel.
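For illustration, such an entry could be added with a Solr XML update message like the one below. This is a sketch that assumes the schema has an id unique key; the value example-1 is made up.

```xml
<!-- Sketch only: the id value is hypothetical -->
<add>
  <doc>
    <field name="id">example-1</field>
    <!-- The whole phrase is one entry, because the field type uses KeywordTokenizerFactory -->
    <field name="text_suggest_full">american israel something</field>
  </doc>
</add>
```

After committing, the suggester dictionary still has to be rebuilt (for example with spellcheck.build=true) before the new phrase can be suggested.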
Wouldn't an autocomplete approach be more suitable? See this article, for example.
You can use the Suggester, a flexible "autocomplete" component;
you must have Solr version 3.x.
solrconfig.xml:
<searchComponent name="suggest" class="solr.SpellCheckComponent">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">name_autocomplete</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
schema.xml:
<field name="name_autocomplete" type="text" indexed="true" stored="true" multiValued="false" />
Add a copyField:
<copyField source="name" dest="name_autocomplete" />
Reload Solr, reindex everything, and test:
http://localhost:8983/solr/suggest?q=ameri&spellcheck=true&spellcheck.collate=true&spellcheck.build=true
You should get something like:
<?xml version="1.0" encoding="UTF-8"?>
<response>
<lst name="spellcheck">
<lst name="suggestions">
<lst name="ameri">
<int name="numFound">2</int>
<int name="startOffset">0</int>
<int name="endOffset">2</int>
<arr name="suggestion">
<str>american morocco</str>
<str>american morocco something</str>
</arr>
</lst>
<str name="collation">american morocco something</str>
</lst>
</lst>
</response>
Hope that helps
Cheers
IMHO, a problem with the spellcheck component is that each word is spell-checked against the full index.
The "collation" of the spell-checked words does not necessarily match a single document within the index; its words might come from separately indexed documents.
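One way to reduce this problem, on Solr versions that support the parameters, is to let Solr test each candidate collation as a query before returning it. The sketch below is an assumption-laden example rather than a drop-in fix: the handler name /spell_verified is hypothetical, and the spellcheck component is attached as a last-component so that the normal query component is available to run the verification queries.

```xml
<!-- Sketch only: "/spell_verified" is a made-up handler name -->
<requestHandler name="/spell_verified" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <str name="spellcheck.dictionary">suggest</str>
    <str name="spellcheck.collate">true</str>
    <!-- Try up to 10 candidate collations and keep only those that return hits -->
    <str name="spellcheck.maxCollationTries">10</str>
    <str name="spellcheck.maxCollations">5</str>
    <!-- Require all terms of the collation to match when it is tested -->
    <str name="spellcheck.collateParam.q.op">AND</str>
  </lst>
  <arr name="last-components">
    <str>suggest</str>
  </arr>
</requestHandler>
```

With q.op set to AND for the test queries, a collation should only be returned if at least one document contains all of its terms.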
