Solr suggester in SolrCloud mode - solr

I am running Solr 4.10 in SolrCloud mode with three shards. The data is already indexed. I have now configured the Solr suggester in solrconfig.xml; this is the relevant configuration:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mysuggest</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="storeDir">suggester_fuzzy_dir</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">businessName</str>
<str name="payloadField">profileId</str>
<str name="weightField">businessName</str>
<str name="suggestAnalyzerFieldType">text_general</str>
<str name="buildOnStartup">false</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
Here is the command I am using to fetch the result:
http://shard1:8900/solr/core/suggest?suggest=true&suggest.build=true&suggest.reload&suggest.dictionary=mysuggest&wt=json&indent=true&suggest.q=sale
This is the output of the command:
{
"responseHeader":{
"status":0,
"QTime":1490},
"command":"build",
"suggest":{}
}
The suggest result is empty, even though I have 10K records indexed in Solr.
I am seeing the following in the log file:
org.apache.solr.handler.component.SuggestComponent; http://shard1:8983/solr/core/ : null
org.apache.solr.handler.component.SuggestComponent; http://shard2:8900/solr/core/ : null
org.apache.solr.handler.component.SuggestComponent; http://shard3:7574/solr/core/ : null
I cannot work out what is missing here. Thanks.

It was not working because Solr was running in SolrCloud mode. There are two ways to get suggestions in SolrCloud mode:
Use the distrib=false parameter. This fetches data only from the shard you address in the request. Because distrib is a request parameter, you can also set it once in the defaults of the /suggest request handler instead of passing it on every request:
<bool name="distrib">false</bool>
Use the shards and shards.qt parameters to search all the shards. The shards parameter contains a comma-separated list of all the shards you want to include in the query. The shards.qt parameter names the request handler that each shard request should be sent to.
shards.qt: Signals Solr that requests to shards should be sent to the request handler given by this parameter. Use shards.qt=/spell when making the request if your request handler is "/spell".
shards: A comma-separated list of shards, for example shards=solr-shard1:8983/solr,solr-shard2:8983/solr
See the Solr documentation on Distributed Search for more details.
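The two options above can be sketched as a small URL builder. This is only an illustration: the helper name suggest_url, the host names, and the ports are assumptions, and only the parameters named in the answer (distrib, shards, shards.qt) are used.

```python
from urllib.parse import urlencode


def suggest_url(core_url, params, shards=None, handler="/suggest"):
    """Build a suggest request URL (hypothetical helper, not part of Solr).

    With `shards`, the request fans out to every listed shard and tells each
    shard to use the same handler via shards.qt. Without it, distrib=false
    keeps the request on the node that receives it.
    """
    query = {"suggest": "true", "wt": "json"}
    query.update(params)
    if shards:
        query["shards"] = ",".join(shards)
        query["shards.qt"] = handler
    else:
        query["distrib"] = "false"
    # Keep "/", ":" and "," readable instead of percent-encoding them.
    return f"{core_url}{handler}?{urlencode(query, safe='/:,')}"


params = {"suggest.dictionary": "mysuggest", "suggest.q": "sale"}

# Option 1: query only the shard being addressed.
print(suggest_url("http://shard1:8900/solr/core", params))

# Option 2: query all shards (host names and ports are placeholders).
all_shards = ["shard1:8900/solr/core", "shard2:8900/solr/core", "shard3:8900/solr/core"]
print(suggest_url("http://shard1:8900/solr/core", params, shards=all_shards))
```

Building the suggester index (suggest.build=true) is a separate step; the sketch only covers how the lookup request is routed.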

Related

Solr Error: QueryComponent.mergeIds(QueryComponent.java:895) in custom request handler

I'm using Solr 8.4.0 and I tried to create a search request handler that returns only a specific set of fields from a collection, so that nobody can change which fields are displayed.
Here is what the request handler looks like:
<requestHandler class="solr.SearchHandler" name="/search">
<arr name="components">
<str>query</str>
<str>facet</str>
</arr>
<lst name="defaults">
<int name="rows">10</int>
<str name="wt">json</str>
<str name="q.alt">*:*</str>
</lst>
<lst name="invariants">
<str name="facet">true</str>
<str name="facet.mincount">1</str>
<str name="fl">_uniqueid</str>
<str name="fl">document_title_t</str>
<str name="fl">document_title_string_s</str>
<str name="fl">document_shortsummary_t</str>
<str name="fl">page_url_s</str>
<str name="fl">topic_path</str>
<str name="fl">itemid_s</str>
<str name="echoParams">none</str>
<str name="omitHeader">true</str>
</lst>
</requestHandler>
After creating the collection and trying the request handler, I received an error raised in QueryComponent.mergeIds (QueryComponent.java:895).
It seems this issue only happens when we use multiple shards. Changing the collection to a single shard removes the error, but we need multiple shards for this collection in production later on. We are using 2 shards and 3 replicas.
I have managed to solve this issue. Going through Solr's code in the GitHub repository, I found that QueryComponent.java line 895 tries to access a certain header. After removing the omitHeader invariant, the request handler works perfectly.

How to setup solr 8.11 replication slave?

I am trying to set up Solr master/slave replication, but I have trouble understanding how to set up the slave. Every piece of documentation or how-to only describes the different solrconfig.xml for the slave, not how the slave itself should be set up.
Should I also create a core on the slave? When I do, the slave Solr does not recognize that it should be a slave. When I call /replication?command=details on the slave, the output is:
{
"responseHeader":{
"status":0,
"QTime":1},
"status":"OK",
"details":{
"indexSize":"69 bytes",
"indexPath":"/var/solr/data/vdiParts/data/index/",
"commits":[[
"indexVersion",0,
"generation",1,
"filelist",["segments_1"]]],
"isMaster":"true",
"isSlave":"false",
"indexVersion":0,
"generation":1,
"master":{
"replicateAfter":["commit"],
"replicationEnabled":"true"}}}
So it thinks it is a master. In the slave's solrconfig.xml I created the correct requestHandler:
<requestHandler name="/replication" class="solr.ReplicationHandler">
<lst name="follower">
<str name="leaderUrl">http://[host]:8983/solr/[core]/replication</str>
<str name="pollInterval">00:00:20</str>
<str name="httpConnTimeout">5000</str>
<str name="httpReadTimeout">10000</str>
</lst>
</requestHandler>
Thx!
Should I also create a core on the slave? When I do, the slave Solr does not recognize that it should be a slave.
Yes, you need to create the second core and, once you have done that, update its solrconfig.xml to indicate where the master Solr is.
The solrconfig.xml in the master core will have a section like this:
<requestHandler name="/replication" class="solr.ReplicationHandler">
<lst name="master">
<str name="replicateAfter">optimize</str>
<str name="backupAfter">optimize</str>
<str name="confFiles">schema.xml,stopwords.txt</str>
</lst>
</requestHandler>
whereas the solrconfig.xml in the slave will have a section that looks more or less like this:
<requestHandler name="/replication" class="solr.ReplicationHandler">
<lst name="slave">
<str name="masterUrl">http://localhost:8983/solr/bibdata/replication</str>
<str name="pollInterval">00:00:20</str>
<str name="compression">internal</str>
<str name="httpConnTimeout">5000</str>
<str name="httpReadTimeout">10000</str>
<str name="httpBasicAuthUser">username</str>
<str name="httpBasicAuthPassword">password</str>
</lst>
</requestHandler>
More details: https://github.com/hectorcorrea/solr-for-newbies/blob/code4lib_2018/tutorial.md#solr-replication
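To check whether a core has actually picked up its role after a restart, the /replication?command=details output shown in the question can be inspected. A minimal sketch, assuming the response has the same shape as the output above (the isMaster and isSlave flags reported as strings); the function name replication_role is made up for illustration.

```python
import json


def replication_role(details_json: str) -> str:
    """Classify a core from the JSON returned by /replication?command=details."""
    details = json.loads(details_json)["details"]
    # The handler reports these flags as the strings "true" / "false".
    is_master = details.get("isMaster") == "true"
    is_slave = details.get("isSlave") == "true"
    if is_master and is_slave:
        return "repeater"  # a core that both pulls from a master and serves slaves
    if is_slave:
        return "slave"
    if is_master:
        return "master"
    return "unconfigured"


# Trimmed version of the response from the question.
sample = '{"status": "OK", "details": {"isMaster": "true", "isSlave": "false"}}'
print(replication_role(sample))  # prints "master"
```

If the core that is meant to replicate still reports "master", its slave section in solrconfig.xml was not picked up.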

DSE CQL Query for Solr Suggestor

I am using DSE 5.0.1. Previously we used facet queries to show search suggestions. For performance reasons, we are looking for alternatives and found the Solr suggester component, but I couldn't find any examples where the suggester component is used from a CQL query. Is this possible? Can anyone help me with this?
Thanks in advance.
Yes, it's possible and relatively easy. You just need to understand how to map the XML that you want in the generated solrconfig.xml into the JSON that is used for configuration.
For example, suppose we want to configure a suggester that suggests on data from the title field and uses additional weights from the rating field. As per the Solr documentation, the XML should look like this:
<searchComponent class="solr.SuggestComponent" name="suggest">
<lst name="suggester">
<str name="name">titleSuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="suggestAnalyzerFieldType">TextField</str>
<str name="field">title</str>
<str name="weightField">rating</str>
<str name="buildOnCommit">false</str>
<str name="exactMatchFirst">true</str>
<str name="contextField">country</str>
</lst>
</searchComponent>
<requestHandler class="solr.SearchHandler" name="/suggest">
<arr name="components">
<str>suggest</str>
</arr>
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
</requestHandler>
In CQL, this is expressed as:
ALTER SEARCH INDEX CONFIG ON table ADD
searchComponent[@name='suggest',@class='solr.SuggestComponent']
WITH $$ {"suggester":[{"name":"titleSuggester"},
{"lookupImpl":"AnalyzingInfixLookupFactory"},
{"dictionaryImpl":"DocumentDictionaryFactory"},
{"suggestAnalyzerFieldType":"TextField"},
{"field":"title"}, {"weightField":"rating"},
{"buildOnCommit":"false"}, {"exactMatchFirst":"true"},
{"contextField":"country"}]} $$;
ALTER SEARCH INDEX CONFIG ON table ADD
requestHandler[@name='/suggest',@class='solr.SearchHandler']
WITH $$ {"defaults":[{"suggest":"true"},
{"suggest.count":"10"}],"components":["suggest"]} $$;
After that, don't forget to execute:
RELOAD SEARCH INDEX ON table;
After this, your suggester will work. In my example, the suggester index must be built explicitly, because the inventory doesn't change very often. This is done via an HTTP call like this:
curl 'http://localhost:8983/solr/keyspace.table/suggest?suggest=true&suggest.dictionary=titleSuggester&suggest.q=Wat&suggest.cfq=US&wt=json&suggest.build=true&suggest.reload=true'
Alternatively, you can set buildOnCommit to true so the index is rebuilt on every commit, or configure the suggester index to be built on startup; see Solr's documentation for the available options.
A full example is available in an example e-commerce application.
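The XML-to-JSON mapping used in the ALTER SEARCH INDEX CONFIG statements can be sketched as follows. This is inferred only from the example in this answer, not from a DSE specification: a `<lst>` becomes an array of single-key objects, and an `<arr>` becomes a plain array. The function name config_xml_to_json is hypothetical, and element types other than those shown here are not handled.

```python
import json
import xml.etree.ElementTree as ET


def config_xml_to_json(xml_fragment: str) -> str:
    """Convert the children of a searchComponent/requestHandler element to JSON."""
    root = ET.fromstring(xml_fragment)
    body = {}
    for child in root:
        name = child.get("name")
        if child.tag == "lst":
            # A named list becomes an array of single-key objects, in order.
            body[name] = [{item.get("name"): item.text} for item in child]
        elif child.tag == "arr":
            # An array becomes a plain JSON array of its string values.
            body[name] = [item.text for item in child]
    return json.dumps(body)


component_xml = """
<searchComponent class="solr.SuggestComponent" name="suggest">
  <lst name="suggester">
    <str name="name">titleSuggester</str>
    <str name="field">title</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
"""

# The printed JSON is what goes between the $$ ... $$ markers.
print(config_xml_to_json(component_xml))
```

The attributes of the outer element (name and class) are not part of the JSON body; in the CQL statement they go into the bracketed element selector instead.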

Plugin for Solr Suggester Component

I created a plugin for Solr (version 6.3) that adds a permission layer to filter the retrieved documents using a database query (e.g., the user with ID 2 does not have permission to see the document with ID 1). Because the logic that decides whether a user has permission needs fields that are not indexed in Solr, I need to check the database.
To achieve that, I created a query parser (DocumentsByUserParser, which extends the class QParserPlugin) and registered it in solrconfig.xml as follows:
<queryParser name="filterDocument" class="mypackage.solr.plugin.DocumentsByUserParser" />
To call this plugin, I only have to set the fq parameter to {!filterDocument userId='<user_id>'} along with the other query parameters. Note that this works well with the Search Component and an edismax query.
My question is: can I create a similar plugin that works with the Suggest Component? When I index a document, a given user may or may not have permission to see it, so the suggester should not return suggestions that the user is not permitted to see.
I defined my Suggest Component and its request handler as follows:
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="indexPath">suggester_infix_dir</str>
<str name="highlight">true</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">AUTO_COMPLETE_FIELD</str>
<str name="suggestAnalyzerFieldType">text_general</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>
</searchComponent>
P.S. The context filter query described in https://lucene.apache.org/solr/guide/6_6/suggester.html only works with a single indexed field, so it will not work for my use case.

How to query Solr shard

I followed this guide to set up shards in Solr. Following the section "Testing Index Sharding on Two Local Servers", I was able to query the shards and get results (somehost:port1/solr/select?shards=somehost:port1/solr,somehost:port2/solr&indent=true&q=helloworld).
That page also mentions that "Rather than require users to include the shards parameter explicitly, it is usually preferred to configure this parameter as a default in the RequestHandler section of solrconfig.xml."
So I made the following changes in the solrconfig.xml of the Solr instance running on port1:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="df">text</str>
</lst>
<lst name="shards.info">
<lst name="localhost:port2/solr">
<long name="numFound">1333</long>
<float name="maxScore">1.0</float>
<str name="shardAddress">http://localhost:port2/solr</str>
<long name="time">686</long>
</lst>
<lst name="localhost:port1/solr">
<long name="numFound">342</long>
<float name="maxScore">1.0</float>
<str name="shardAddress">http://localhost:port1/solr</str>
<long name="time">602</long>
</lst>
</lst>
Now I'm trying to hit somehost:port1/solr/collection1/select?q=helloworld&wt=json&indent=true
but I'm not getting the desired response. What am I missing here?
You can't just copy content from the response into your configuration file; the two formats are completely different. What the documentation refers to is that each entry in the defaults section is added to the query string unless it is already provided there (there are also options for forcing a value that can't be overridden).
<requestHandler name="/selectdistributed" class="solr.SearchHandler">
<lst name="defaults">
[...]
<str name="shards">somehost:port1/solr,somehost:port2/solr</str>
</lst>
</requestHandler>
This should do what you want: it adds shards=somehost:port1/solr,somehost:port2/solr to the query string of every request that goes through that handler.
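The way handler defaults combine with the incoming query string can be modelled in a few lines. This is a simplified sketch, not Solr's implementation: parameters are treated as single-valued, and the function name effective_params is made up. It assumes the usual precedence, where defaults only fill in missing parameters and invariants (the "force a value" option mentioned above) always win.

```python
def effective_params(request, defaults, invariants=None):
    """Merge request parameters with a handler's defaults and invariants."""
    merged = dict(defaults)           # start from the handler defaults
    merged.update(request)            # anything in the query string overrides them
    merged.update(invariants or {})   # invariants cannot be overridden
    return merged


defaults = {
    "rows": "10",
    "shards": "somehost:port1/solr,somehost:port2/solr",
}

# The client sends only q; shards and rows come from the defaults.
print(effective_params({"q": "helloworld"}, defaults))

# The client overrides rows; shards still comes from the defaults.
print(effective_params({"q": "helloworld", "rows": "50"}, defaults))
```

Moving shards from defaults to invariants in this model would stop clients from pointing the handler at other shards.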

Resources