Solr - spellcheck causing Core Reload to hang

I am having an issue with my Solr settings.
After a lot of investigation today, I found that it's the spellcheck component that causes Core Reload to hang.
If it's turned off, everything runs well and the core reloads easily. However, when spellcheck is on, the core won't reload and instead hangs forever. The only way to bring the project back to life is then to stop Solr, delete the data folder, and start Solr again.
Here are the Solr config settings for spellcheck:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<!-- Spell checking defaults -->
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck">on</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.alternativeTermCount">2</str>
<str name="spellcheck.extendedResults">false</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.maxCollations">3</str>
<str name="spellcheck.maxCollationTries">3</str>
<str name="spellcheck.collateExtendedResults">true</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">text_en_splitting</str>
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">location_details</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="buildOnCommit">true</str>
<float name="accuracy">0.5</float>
<float name="thresholdTokenFrequency">.01</float>
<int name="maxEdits">1</int>
<int name="minPrefix">3</int>
<int name="maxInspections">3</int>
<int name="minQueryLength">4</int>
<float name="maxQueryFrequency">0.001</float>
</lst>
</searchComponent>
Here is the field from schema:
<field name="location_details" type="text_en_splitting" indexed="true" stored="false" required="false" />

Basically, this is a bug in Solr. Comment out or remove the following parameter from the defaults of your requestHandler:
<!--<str name="spellcheck.maxCollationTries">3</str> here is a bug, put this parameter in the actual query string instead -->
If you really need maxCollationTries, pass it as a query parameter in the request URL instead.
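As a rough sketch, the /select handler from the question might look like this after the change, assuming all other defaults stay as they were. The request path in the comment is only an illustration: the core name and query term are hypothetical placeholders.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <!-- Spell checking defaults -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.count">5</str>
    <str name="spellcheck.onlyMorePopular">false</str>
    <str name="spellcheck.maxResultsForSuggest">5</str>
    <str name="spellcheck.alternativeTermCount">2</str>
    <str name="spellcheck.extendedResults">false</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">3</str>
    <!-- spellcheck.maxCollationTries is deliberately not set here.
         Pass it per request instead, for example (hypothetical core and query):
         /solr/mycore/select?q=londn&spellcheck.maxCollationTries=3 -->
    <str name="spellcheck.collateExtendedResults">true</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```

After editing solrconfig.xml, try the core reload again to check whether the hang is gone in your version of Solr.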

Related

Solr Suggester taking too long to provide response

I am using the Solr Suggester to provide suggestions on the search page of our application, but every suggestion request to Solr takes too long to return a response. I have tried several lookupImpl values, such as AnalyzingLookupFactory, AnalyzingInfixLookupFactory and FuzzyLookupFactory.
Below is my configuration:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">spell_suggest</str>
<str name="weightField">spell_suggest</str>
<str name="suggestAnalyzerFieldType">text_general</str>
<str name="buildOnStartup">false</str>
</lst>
<lst name="suggester">
<str name="name">altSuggester</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="field">spell_suggest</str>
<str name="weightField">spell_suggest</str>
<str name="suggestAnalyzerFieldType">text_general</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<!--<str name="suggest.dictionary">mySuggester</str> -->
<str name="suggest.dictionary">altSuggester</str>
<str name="suggest">true</str>
<str name="suggest.count">6</str>
<str name="spellcheck">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
With just 42000 indexed documents, each response takes close to 5 to 7 seconds. This badly affects the functionality of the application.
The following is my request: http://<myIP>:8983/solr/mycollection/suggest?df=spell_suggest&suggest=true&suggest.build=true&q=Vendor
Please suggest whether I need to provide additional configuration or modify the existing configuration to improve performance.
Thanks!
By sending suggest.build=true with every request, you're effectively asking for the suggestion index to be rebuilt from scratch each time you query the suggester.
It should only be rebuilt after the underlying data changes, and only if necessary (depending on which dictionaryImpl you're using).
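One possible way to do this, sketched under the assumption that rebuilding after commits is cheap enough for an index of this size, is to remove suggest.build=true from the query URL and let Solr trigger the build itself through the suggester's buildOnCommit option. The other values are copied from the question's altSuggester.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">altSuggester</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="field">spell_suggest</str>
    <str name="weightField">spell_suggest</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <!-- Rebuild after commits instead of on every query.
         Assumption: commits are infrequent enough for this to be affordable. -->
    <str name="buildOnCommit">true</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
```

If commits are frequent, it may be better to keep buildOnCommit off and send a single request with suggest.build=true after each indexing run, while ordinary suggestion requests omit that parameter.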

Querying the suggester while the dictionary is being built

My suggester conf:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">titleSuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="field">name</str>
<str name="suggestAnalyzerFieldType">text_pt</str>
<str name="payloadField">type</str>
<str name="weightField">weightField</str>
<str name="buildOnCommit">false</str>
<str name="buildOnStartup">false</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="indexPath">/home/dev/suggestions</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy" >
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
<str name="suggest.dictionary">titleSuggester</str>
<str name="suggest.onlyMorePopular">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
It works! But I need to build my dictionary every hour, and the build takes 2 minutes.
Every hour I run:
localhost:8983/solr/AutoComplete/suggest?suggest.q=term&suggest.build=true
During this time I still need to get results, but when I run a query such as:
localhost:8983/solr/AutoComplete/suggest?suggest.q=term
I get this response (because the build is running):
<response>
<lst name="responseHeader">
<int name="status">500</int>
<int name="QTime">5</int>
</lst>
<lst name="error">
<str name="msg">suggester was not built</str>
What can I do to get results while the build is running?
This question is quite old, but I have the same problem (my rebuild may run an hour) and I came to this solution:
Configure two components, e.g. suggest_A and suggest_B with different indexPath values.
Configure two request handlers, e.g. suggest and suggest_Rebuild.
Assign suggest_A to suggest and suggest_B to suggest_Rebuild.
Do the rebuild on the suggest_Rebuild handler. After the rebuild is finished, swap the component assignments of the two handlers via the Config API (update-requesthandler).
The drawback of this solution is that you need twice the disk space.
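The layout described above could be sketched roughly as follows. The suggester settings are taken from the question. The two indexPath directories and the handler paths are hypothetical names chosen for illustration. The swap itself (via the Config API) is not shown.

```xml
<!-- Component A: currently serves live suggestion traffic -->
<searchComponent name="suggest_A" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">titleSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">name</str>
    <str name="suggestAnalyzerFieldType">text_pt</str>
    <str name="buildOnCommit">false</str>
    <str name="buildOnStartup">false</str>
    <!-- hypothetical path; must differ from component B -->
    <str name="indexPath">/home/dev/suggestions_A</str>
  </lst>
</searchComponent>

<!-- Component B: rebuilt in the background -->
<searchComponent name="suggest_B" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">titleSuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">name</str>
    <str name="suggestAnalyzerFieldType">text_pt</str>
    <str name="buildOnCommit">false</str>
    <str name="buildOnStartup">false</str>
    <!-- hypothetical path; must differ from component A -->
    <str name="indexPath">/home/dev/suggestions_B</str>
  </lst>
</searchComponent>

<!-- Live handler: answers user queries from component A -->
<requestHandler name="/suggest" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.count">10</str>
    <str name="suggest.dictionary">titleSuggester</str>
  </lst>
  <arr name="components">
    <str>suggest_A</str>
  </arr>
</requestHandler>

<!-- Rebuild handler: only receives the hourly suggest.build=true request -->
<requestHandler name="/suggest_Rebuild" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="suggest">true</str>
    <str name="suggest.dictionary">titleSuggester</str>
  </lst>
  <arr name="components">
    <str>suggest_B</str>
  </arr>
</requestHandler>
```

With this layout the hourly build request goes to /suggest_Rebuild, so /suggest keeps answering from the previously built index until the assignments are swapped.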

Solr Spellcheck request returns nothing

I am currently using Solr 4.8.1 and have set up spellcheck. After indexing, the request doesn't return any suggestions.
Following the advice of #n0tting, I modified my files slightly.
Here are the steps:
1. solrconfig.xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">phraseText</str>
<lst name="spellchecker">
<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="spellcheckIndexDir">./spellchecker</str>
<str name="name">default</str>
<str name="field">title_spellcheck</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
Add some configuration to the standard requestHandler:
<requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
<!-- default values for query parameters -->
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<!-- Optional, must match spell checker's name as defined above, defaults to "default" -->
<str name="spellcheck.dictionary">default</str>
<!-- omp = Only More Popular -->
<str name="spellcheck.onlyMorePopular">false</str>
<!-- exr = Extended Results -->
<str name="spellcheck.extendedResults">false</str>
<!-- The number of suggestions to return -->
<str name="spellcheck.count">1</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
2. schema.xml
Define a field for spell check:
<field name="title_spellcheck" type="phraseText" indexed="true" stored="false" multiValued="true" />
<copyField source="title" dest="title_spellcheck"/>
3. Request:
.../select?q=recommend&defType=edismax&qf=title&spellcheck=true&spellcheck.build=true&spellcheck.q=recommend&spellcheck.collate=true
I don't get any suggestions in the result, nor even a <lst name="spellcheck"> element. Can anybody give me some advice? Thanks a lot.
References:
https://cwiki.apache.org/confluence/display/solr/Spell+Checking
http://solr.pl/en/2011/05/23/%E2%80%9Ccar-sale-application%E2%80%9D-%E2%80%93-spellcheckcomponent-%E2%80%93-did-you-really-mean-that-part-5/

solr suggester not working with shard for multiple core

I'm trying to use the suggest component (Solr 4.6) with multiple cores. I have added a search component and a request handler to my solrconfig. That works fine for one core, but querying my Solr instance with the shards parameter does not work.
However, 'did you mean' (spell check) works fine with multiple cores using shards.
Here is the relevant part of the solrconfig file:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggestDictionary</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookupFactory</str>
<str name="field">suggest</str>
<float name="threshold">0.0005</float>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
<lst name="defaults">
<str name="echoParams">none</str>
<str name="wt">xml</str>
<str name="indent">false</str>
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggestDictionary</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">false</str>
<str name="qt">/suggest</str>
<str name="shards.qt">/suggest</str>
<str name="shards">localhost:8080/cores/core1,localhost:8080/cores/core2</str>
<bool name="distrib">false</bool>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
<shardHandlerFactory class="HttpShardHandlerFactory">
<int name="socketTimeOut">1000</int>
<int name="connTimeOut">5000</int>
</shardHandlerFactory>
</requestHandler>
It works for me.
You can get the suggestions using this REST URL:
http://localhost:8983/solr/demo/spell?q=howoo&wt=json&indent=true&qt=spell&shards.qt=/spell&shards=localhost:8983/solr/demo_shard2_replica1,localhost:8983/solr/demo_shard1_replica2
Or simply use this:
http://localhost:8983/solr/demo/spell?q=hoo&wt=json&indent=true&shards.qt=/spell
shards.qt=/spell: this parameter is required. It tells Solr which request handler to use for the sub-requests sent to each shard, which is what makes suggestions work across shards.
In the URLs above, these are the values from my setup:
Collection = demo
Shards = demo_shard2_replica1, demo_shard1_replica2
Replace collection and shards names with your names of collection and shards.
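If you would rather not pass shards.qt on every request, it can also be set as a default on the handler. This is only a sketch: it assumes a spellcheck search component named spellcheck with a dictionary named default already exists in your solrconfig.xml, and that the handler is registered as /spell as in the URLs above.

```xml
<requestHandler name="/spell" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- assumed dictionary name; must match your spellchecker definition -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.count">5</str>
    <!-- Route the per-shard sub-requests to this same handler -->
    <str name="shards.qt">/spell</str>
  </lst>
  <arr name="last-components">
    <!-- assumed component name -->
    <str>spellcheck</str>
  </arr>
</requestHandler>
```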

Request handle solrconfig.xml Spellchecker

I am trying to set up the spellchecker according to the Solr documentation, but when I test it I don't get any suggestions. My configuration follows:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<lst name="spellchecker">
<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="name">default</str>
<str name="field">name</str>
<str name="spellcheckIndexDir">./spellchecker</str>
</lst>
<str name="queryAnalyzerFieldType">textSpell</str>
</searchComponent>
<requestHandler name="/spellcheck" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<!-- Optional, must match spell checker's name as defined above, defaults to "default" -->
<str name="spellcheck.dictionary">default</str>
<!-- omp = Only More Popular -->
<str name="spellcheck.onlyMorePopular">false</str>
<!-- exr = Extended Results -->
<str name="spellcheck.extendedResults">false</str>
<!-- The number of suggestions to return -->
<str name="spellcheck.count">1</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
The query I send to Solr:
q=%2B%28text%3A%28gasal%29%29&suggestField=contentOriginal&ontologySeed=gasal&spellcheck.build=true&spellcheck.q=gasal&spellcheck=true&spellcheck.collate=true&hl=true&hl.snippets=5&hl.fl=text&hl.fl=text&rows=12&start=0&qt=%2Fsuggestprobabilistic
Does anybody know why? Thanks in advance.
First, don't declare queryAnalyzerFieldType twice in the component configuration.
It is recommended not to use a /spellcheck handler but instead to bind the spellcheck component to the standard query handler (or dismax if it is what you use) like this:
<requestHandler name="standard" class="solr.SearchHandler" default="true">
<lst name="defaults">
...
</lst>
<arr name="last-components">
<str>spellcheck</str>
...
</arr>
</requestHandler>
You can then call it like this:
http://localhost:8983/solr/select?q=komputer&spellcheck=true
Also don't forget to build the spellcheck dictionary before you use it:
http://localhost:8983/solr/select/?q=*:*&spellcheck=true&spellcheck.build=true
You can force the dictionary to build at each commit by configuring it in the component:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<lst name="spellchecker">
<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="name">default</str>
<str name="field">name</str>
<str name="spellcheckIndexDir">./spellchecker1</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
Finally, make sure that your name field is really an indexed field of type textSpell and that it contains enough content to build a good dictionary. In my case, I have a field named spellchecker that is populated from a couple of fields of my index (using copyField instructions in the schema).
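As an illustration of that last point, a schema fragment along these lines would populate a dedicated spellcheck field from other fields. The source field names title and description are hypothetical, and textSpell is assumed to be defined as a field type in your schema.

```xml
<!-- Dedicated field that feeds the spellcheck dictionary -->
<field name="spellchecker" type="textSpell" indexed="true" stored="false" multiValued="true"/>

<!-- Hypothetical source fields; replace with the fields of your own index -->
<copyField source="title" dest="spellchecker"/>
<copyField source="description" dest="spellchecker"/>
```

The field entry in the spellchecker configuration would then point at spellchecker instead of name, and the documents need to be reindexed before the dictionary is rebuilt.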

Resources