I am trying to set up the spellchecker according to the Solr documentation, but when I test it I get no suggestions. My configuration follows:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<lst name="spellchecker">
<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="name">default</str>
<str name="field">name</str>
<str name="spellcheckIndexDir">./spellchecker</str>
</lst>
<str name="queryAnalyzerFieldType">textSpell</str>
</searchComponent>
<requestHandler name="/spellcheck" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<!-- Optional, must match spell checker's name as defined above, defaults to "default" -->
<str name="spellcheck.dictionary">default</str>
<!-- omp = Only More Popular -->
<str name="spellcheck.onlyMorePopular">false</str>
<!-- exr = Extended Results -->
<str name="spellcheck.extendedResults">false</str>
<!-- The number of suggestions to return -->
<str name="spellcheck.count">1</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
The query I send to Solr:
q=%2B%28text%3A%28gasal%29%29&suggestField=contentOriginal&ontologySeed=gasal&spellcheck.build=true&spellcheck.q=gasal&spellcheck=true&spellcheck.collate=true&hl=true&hl.snippets=5&hl.fl=text&hl.fl=text&rows=12&start=0&qt=%2Fsuggestprobabilistic
Does anybody know why? Thanks in advance.
First, don't declare queryAnalyzerFieldType twice in the component configuration.
It is recommended not to use a separate /spellcheck handler but instead to bind the spellcheck component to the standard query handler (or dismax, if that is what you use), like this:
<requestHandler name="standard" class="solr.SearchHandler" default="true">
<lst name="defaults">
...
</lst>
<arr name="last-components">
<str>spellcheck</str>
...
</arr>
</requestHandler>
You can then call it like this:
http://localhost:8983/solr/select?q=komputer&spellcheck=true
Also don't forget to build the spellcheck dictionary before you use it:
http://localhost:8983/solr/select/?q=*:*&spellcheck=true&spellcheck.build=true
You can force the dictionary to build at each commit by configuring it in the component:
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">textSpell</str>
<lst name="spellchecker">
<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="name">default</str>
<str name="field">name</str>
<str name="spellcheckIndexDir">./spellchecker1</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
Finally, make sure that your name field is really an indexed field of type textSpell and that it contains enough content to build a good dictionary. In my case, I have a field named spellchecker that is populated from a couple of fields of my index (using copyField instructions in the schema).
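The copyField setup described above might look roughly like the following schema.xml fragment. This is only a sketch: the spellchecker field name comes from the answer, but the description source field is a hypothetical placeholder, and it assumes a textSpell field type is already defined in your schema.

```xml
<!-- Dedicated field used only to build the spellcheck dictionary.
     Assumes a fieldType named "textSpell" already exists in schema.xml. -->
<field name="spellchecker" type="textSpell" indexed="true" stored="false" multiValued="true"/>

<!-- Populate it from existing fields. "name" is the field from the question;
     "description" is a hypothetical second source field. -->
<copyField source="name" dest="spellchecker"/>
<copyField source="description" dest="spellchecker"/>
```

With a field like this in place, the `field` setting of the spellchecker in solrconfig.xml would point at spellchecker instead of name, and the dictionary would need to be rebuilt after reindexing.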
Related
I am using the Solr Suggester to provide suggestions on the search page of our application, but every suggestion request takes too long to return. I have tried several lookup implementations, such as AnalyzingLookupFactory, AnalyzingInfixLookupFactory and FuzzyLookupFactory.
Below is my configuration:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">spell_suggest</str>
<str name="weightField">spell_suggest</str>
<str name="suggestAnalyzerFieldType">text_general</str>
<str name="buildOnStartup">false</str>
</lst>
<lst name="suggester">
<str name="name">altSuggester</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="field">spell_suggest</str>
<str name="weightField">spell_suggest</str>
<str name="suggestAnalyzerFieldType">text_general</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<!--<str name="suggest.dictionary">mySuggester</str> -->
<str name="suggest.dictionary">altSuggester</str>
<str name="suggest">true</str>
<str name="suggest.count">6</str>
<str name="spellcheck">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
With just 42,000 indexed documents, each request takes 5 to 7 seconds to return, which badly affects the application.
This is my request: http://<myIP>:8983/solr/mycollection/suggest?df=spell_suggest&suggest=true&suggest.build=true&q=Vendor
Please let me know whether I need to add or change any configuration to improve performance.
Thanks!
Because you pass suggest.build=true on every request, you are asking Solr to rebuild the suggestion index from scratch each time you query the suggester.
It only needs to be rebuilt after the underlying data changes, and whether a rebuild is needed at all depends on which dictionaryImpl you're using.
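One way to avoid rebuilding on every request is to let Solr rebuild the suggester when the index changes. The fragment below is a sketch based on the altSuggester from the question; the only addition is buildOnCommit, and whether rebuilding on every commit is acceptable depends on your commit frequency and index size, so treat that as an assumption to verify.

```xml
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">altSuggester</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="lookupImpl">FuzzyLookupFactory</str>
    <str name="field">spell_suggest</str>
    <str name="weightField">spell_suggest</str>
    <str name="suggestAnalyzerFieldType">text_general</str>
    <!-- Assumption: rebuild the lookup structure after commits,
         so that queries never need suggest.build=true -->
    <str name="buildOnCommit">true</str>
    <str name="buildOnStartup">false</str>
  </lst>
</searchComponent>
```

The query itself then drops the build parameter, for example http://<myIP>:8983/solr/mycollection/suggest?df=spell_suggest&suggest=true&q=Vendor. If commits are very frequent, leaving buildOnCommit off and issuing a single suggest.build=true request after each bulk update may be the cheaper option.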
I am trying to implement Solr's auto-suggest. These are the changes I made in solrconfig.xml:
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookupFactory</str>
<str name="field">displayName</str> <!-- the indexed field to derive suggestions from -->
<float name="threshold">0.005</float>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
When I query with the sample input 'p':
http://localhost:8983/solr/food/suggest?q=p&wt=json&indent=true
it returns 5 words:
"pizza", "potato", "pasta", "protein", "premium"
But the displayName field also contains words like paneer and palak, which do not show up. Why is that?
Can you add the following to your configuration and run the query below? Don't forget to reload the Solr core after making these changes.
<str name="suggestAnalyzerFieldType">string</str>
<str name="storeDir">suggester_fuzzy_dir</str>
http://localhost:8983/solr/food/suggest?suggest=true&suggest.build=true&suggest.dictionary=suggest&wt=json&suggest.q=p&suggest.count=10
I'm trying to use the suggest component (Solr 4.6) with multiple cores. I have added a search component and a request handler to my solrconfig.xml. That works fine for one core, but querying my Solr instance with the shards parameter does not work.
'Did you mean' (spell check), however, works fine with multiple cores using shards.
Here is the relevant part of the solrconfig.xml file:
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggestDictionary</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.fst.FSTLookupFactory</str>
<str name="field">suggest</str>
<float name="threshold">0.0005</float>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="org.apache.solr.handler.component.SearchHandler">
<lst name="defaults">
<str name="echoParams">none</str>
<str name="wt">xml</str>
<str name="indent">false</str>
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggestDictionary</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">false</str>
<str name="qt">/suggest</str>
<str name="shards.qt">/suggest</str>
<str name="shards">localhost:8080/cores/core1,localhost:8080/cores/core2</str>
<bool name="distrib">false</bool>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
<shardHandlerFactory class="HttpShardHandlerFactory">
<int name="socketTimeOut">1000</int>
<int name="connTimeOut">5000</int>
</shardHandlerFactory>
</requestHandler>
It works for me.
You can get the suggestions using this URL:
http://localhost:8983/solr/demo/spell?q=howoo&wt=json&indent=true&qt=spell&shards.qt=/spell&shards=localhost:8983/solr/demo_shard2_replica1,localhost:8983/solr/demo_shard1_replica2
Or simply use this:
http://localhost:8983/solr/demo/spell?q=hoo&wt=json&indent=true&shards.qt=/spell
shards.qt=/spell: this parameter is needed so that the sub-requests sent to each shard also go to the /spell handler and return suggestions.
The values used in this example are:
Collection = demo
Shards = demo_shard2_replica1, demo_shard1_replica2
Replace the collection and shard names with your own.
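If you would rather not pass shards.qt on every request, it can also be set as a handler default. The fragment below is a sketch, not the answerer's actual configuration: it assumes a /spell handler backed by a search component named spellcheck with a dictionary named default.

```xml
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <str name="spellcheck">true</str>
    <!-- Assumption: dictionary name as defined in your spellcheck component -->
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck.count">5</str>
    <!-- Send the per-shard sub-requests to this same handler -->
    <str name="shards.qt">/spell</str>
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
```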
The following query works well for me
http://...:8983/solr/vault/select?q=White&defType=edismax&qf=VersionComments+VersionName
It returns all the documents whose version comments include White.
I would like to omit the qf parameter containing the field names.
In solrconfig.xml I wrote:
<requestHandler name="/select" class="solr.SearchHandler">
<!-- default values for query parameters can be specified, these
will be overridden by parameters in the request
-->
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">PackageName</str>
<str name="df">Tag</str>
<str name="df">VersionComments</str>
<str name="df">VersionTag</str>
<str name="df">VersionName</str>
<str name="df">SKU</str>
<str name="df">SKUDesc</str>
</lst>
</requestHandler>
I restarted Solr and ran a full import.
Then I tried:
http://...:8983/solr/vault/select?q=White&defType=edismax
But I don't get any documents back.
What am I doing wrong?
df is the default field. It takes a single field name, so it must be defined only once in the configuration, and it only takes effect when qf is not defined.
You can try the configuration below with the qt=edismax parameter:
<requestHandler name="edismax" class="solr.SearchHandler" >
<lst name="defaults">
<str name="defType">edismax</str>
<str name="echoParams">explicit</str>
<str name="qf">PackageName Tag VersionComments ....</str>
</lst>
</requestHandler>
You can use qf (query fields) with a boost for each field.
<requestHandler name="/select" class="solr.SearchHandler">
<!-- default values for query parameters can be specified, these
will be overridden by parameters in the request
-->
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<!--
[....]
-->
<str name="qf">PackageName^40.0 Tag^10.0 VersionComments^5.0 VersionTag^4.0</str>
<!--
[....]
-->
</lst>
</requestHandler>
On Solr 4.8.1, we can set the defaults as follows by editing solrconfig.xml:
<requestHandler name="/clustering" startup="lazy" enable="${solr.clustering.enabled:false}" class="solr.SearchHandler">
<lst name="defaults">
<!-- Configure the remaining request handler parameters. -->
<str name="defType">edismax</str>
<str name="qf">
text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4
</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
</lst>
<arr name="last-components">
<str>clustering</str>
</arr>
</requestHandler>
I am trying to implement auto-suggest over a large set of indexed paragraphs, but I want to filter certain unwanted words out of the suggestions. For example, words like "and", "how" and "when" should be avoided. How do I go about it?
This is my auto-suggest configuration in solrconfig.xml:
<requestHandler class="org.apache.solr.handler.component.SearchHandler" name="/suggest">
<lst name="defaults">
<str name="spellcheck">true</str>
<str name="spellcheck.dictionary">suggest</str>
<str name="spellcheck.onlyMorePopular">true</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.collate">true</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
<searchComponent class="solr.SpellCheckComponent" name="suggest">
<lst name="spellchecker">
<str name="name">suggest</str>
<str name="classname">org.apache.solr.spelling.suggest.Suggester</str>
<str name="lookupImpl">org.apache.solr.spelling.suggest.tst.TSTLookup</str>
<str name="field">keywords</str>
<float name="threshold">0.005</float>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
I would recommend adding the StopFilterFactory to the backing fieldType definition for your keywords field in your schema.xml file. If you need those words ("and", "how", "when") in your keywords field for other searching requirements, I would suggest creating a custom field in your schema.xml just for the suggester and you can use the copyField directive to populate this new field.
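The approach above could be sketched in schema.xml roughly as follows. The names text_suggest and keywords_suggest are hypothetical, and the fragment assumes a stopwords.txt file in the core's conf directory that lists the words to exclude ("and", "how", "when", and so on).

```xml
<!-- Hypothetical field type used only for suggestions -->
<fieldType name="text_suggest" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Drops every token listed in conf/stopwords.txt -->
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
  </analyzer>
</fieldType>

<!-- Hypothetical suggester-only field, filled from the existing keywords field -->
<field name="keywords_suggest" type="text_suggest" indexed="true" stored="false" multiValued="true"/>
<copyField source="keywords" dest="keywords_suggest"/>
```

The suggester's `field` setting in solrconfig.xml would then point at keywords_suggest, leaving the original keywords field untouched for regular searches. A reindex and a rebuild of the suggester would be needed before the change takes effect.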