Solr Request Handlers and Query Parsers - solr

I am new to Solr. Please help me with the following questions:
What is the difference between a request handler and a query parser?
My understanding is that when a query is sent to Solr through a URL, the query parser parses it first. The request handler then takes the parsed query, runs the search, and presents the response according to the request handler parameters. Is this correct?
What are the default query parser and the default request handler in Solr?
The deftype parameter specifies the parser and qt specifies the request handler, right?
I wrote this query
select?q=features:power%20features:latency&deftype=dismax
which works, but select?q=features:power%20features:latency&qt=dismax does not.
Here is my requestHandler
<requestHandler name="dismax" class="solr.SearchHandler">
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">text^0.5 features^1.0 name^1.2 sku^1.5 id^10.0 manu^1.1 cat^1.4</str>
<str name="pf">text^0.2 features^1.1 name^1.5 manu^1.4 manu_exact^1.9</str>
<str name="bf">popularity^0.5 recip(price,1,1000,1000)^0.3</str>
<str name="fl">id,name,price,score</str>
<str name="mm">2&lt;-1 5&lt;-2 6&lt;90%</str>
<int name="ps">100</int>
<str name="q.alt">*:*</str>
<!-- example highlighter config, enable per-query with hl=true -->
<str name="hl.fl">text features name</str>
<!-- for this field, we want no fragmenting, just highlighting -->
<str name="f.name.hl.fragsize">0</str>
<!-- instructs Solr to return the field itself if no query terms are found -->
<str name="f.name.hl.alternateField">name</str>
<str name="f.text.hl.fragmenter">regex</str> <!-- defined below -->
</lst>
</requestHandler>

The default request handler is the one marked default="true" in solrconfig.xml (a SearchHandler, unless you have changed that).
A request handler is the entry point for every request. As its first step it calls the query parser, either the one specified in the URL or the default one.
What do you want to get?
1. Documents with "power latency" as a phrase?
2. Documents with both terms anywhere in the document?
3. Documents with either of those terms?
Try these:
1. select?q=features:"power latency"&qt=dismax
2. select?q=features:power+features:latency&qt=dismax&mm=2
3. select?q=features:power+features:latency&qt=dismax&mm=1
More info on DisMaxQParserPlugin.
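The three suggestions above differ only in their query-string parameters, so it can help to build them programmatically and let the library handle the URL encoding. The following is a minimal sketch in Python using only the standard library. The base URL and the helper name select_url are assumptions made for illustration and are not part of the original setup. Note that Solr parameter names are case-sensitive, so the parser parameter is spelled defType, as in the handler configuration in the question.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical base URL; replace with your own host and core.
BASE = "http://localhost:8983/solr/collection1"


def select_url(q, **params):
    """Build a /select URL; urlencode takes care of quotes and spaces."""
    query = {"q": q, **params}
    return f"{BASE}/select?{urlencode(query)}"


# 1. Phrase match.
phrase = select_url('features:"power latency"', qt="dismax")
# 2. Both terms required (mm=2).
both = select_url("features:power features:latency", qt="dismax", mm=2)
# 3. Either term is enough (mm=1).
either = select_url("features:power features:latency", qt="dismax", mm=1)

for url in (phrase, both, either):
    print(url)
```

Decoding a generated URL with parse_qs is a quick way to confirm that the parameters Solr receives are the ones you intended.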

Related

Solr More Like This Says "numFound" doesn't equal number of docs in match

I have a Solr More Like This Handler, configured as follows:
Request Handler Configuration
<requestHandler name="/themlturl" class="solr.MoreLikeThisHandler">
<lst name="defaults">
<str name="wt">json</str>
<int name="rows">5</int>
<str name="mlt.fl">name, category_stack</str>
<str name="mlt.qf">name^3 category_stack^5</str>
<str name="fl">id, name</str>
<str name="mlt">true</str>
<str name="mlt.mintf">1</str>
</lst>
</requestHandler>
Simple Query
Queries that match a single document work fine
results in
Query With More Than One Document
I am trying to get documents similar to more than one document using an OR in the q field.
This results in the following response
It is clear that Solr found the three documents, since match > numFound is 3, but match > docs contains only one document, and the results in the response are documents similar to that one document.
Does the MLT handler support multiple documents? If not, is there a solution other than querying the handler once for each document?
What I am trying to build is a simple content-based recommendation engine which is supposed to show documents similar to the ones a user saves.

Solr Error: QueryComponent.mergeIds(QueryComponent.java:940) when using invariants in a request handler

I need a search request handler that returns only a specific set of fields from a collection and, for security reasons, does not let anyone change which fields are displayed. (There are some indexed sensitive fields that I don't want anyone to access.)
I tried to use invariants in the request handler and define the list of fields there, so I made a handler like this:
<requestHandler class="solr.SearchHandler" name="/search">
<lst name="defaults">
<str name="q">*:*</str>
</lst>
<lst name="invariants">
<str name="fl">content</str>
<str name="fl">description</str>
</lst>
</requestHandler>
However, I get this error when I call the request handler:
{
"responseHeader":{
"zkConnected":true,
"status":500,
"QTime":4},
"error":{
"trace":"java.lang.NullPointerException\r\n\tat org.apache.solr.handler.component.QueryComponent.mergeIds(QueryComponent.java:940)\r\n\tat org.apache.solr.handler.component.QueryComponent.handleRegularResponses(QueryComponent.java:585)\r\n\tat org.apache.solr.handler.component.QueryComponent.handleResponses(QueryComponent.java:564)\r\n\tat org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:423)\r\n\tat org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:177)\r\n\tat org.apache.solr.core.SolrCore.execute(SolrCore.java:2503)\r\n\tat org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)\r\n\tat org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:382)\r\n\tat org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:326)\r\n\tat org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1751)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:582)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:143)\r\n\tat org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:226)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1180)\r\n\tat org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:512)\r\n\tat org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:185)\r\n\tat org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1112)\r\n\tat org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:141)\r\n\tat org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:213)\r\n\tat 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:119)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\r\n\tat org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)\r\n\tat org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:134)\r\n\tat org.eclipse.jetty.server.Server.handle(Server.java:534)\r\n\tat org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:320)\r\n\tat org.eclipse.jetty.server.HttpConnection.onFillable(HttpConnection.java:251)\r\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)\r\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)\r\n\tat org.eclipse.jetty.io.ssl.SslConnection.onFillable(SslConnection.java:251)\r\n\tat org.eclipse.jetty.io.AbstractConnection$ReadCallback.succeeded(AbstractConnection.java:283)\r\n\tat org.eclipse.jetty.io.FillInterest.fillable(FillInterest.java:108)\r\n\tat org.eclipse.jetty.io.SelectChannelEndPoint$2.run(SelectChannelEndPoint.java:93)\r\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.executeProduceConsume(ExecuteProduceConsume.java:303)\r\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.produceConsume(ExecuteProduceConsume.java:148)\r\n\tat org.eclipse.jetty.util.thread.strategy.ExecuteProduceConsume.run(ExecuteProduceConsume.java:136)\r\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool.runJob(QueuedThreadPool.java:671)\r\n\tat org.eclipse.jetty.util.thread.QueuedThreadPool$2.run(QueuedThreadPool.java:589)\r\n\tat java.lang.Thread.run(Unknown Source)\r\n",
"code":500}}
Currently, as a workaround, I put the fields inside defaults instead of invariants, so the handler looks like this:
<requestHandler class="solr.SearchHandler" name="/search">
<lst name="defaults">
<str name="q">*:*</str>
<str name="fl">content</str>
<str name="fl">description</str>
</lst>
</requestHandler>
This workaround works, but I don't want it because it's not secure enough. Can anyone help?
For additional information, I'm using Solr version 7.2.1, and the collection has 2 shards and 2 replicas.
When you set the fl list as an invariant, the internal requests between shards can't override it either. The line triggering your error is:
resultIds.put(shardDoc.id.toString(), shardDoc);
I'm guessing you'd want to include id in your field list, and probably _version_ and _root_ as well. _version_ has special meaning when it comes to replication and optimistic concurrency on updates, while _root_ affects child documents.
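To make the precedence issue concrete, here is a simplified Python model of how a handler combines its parameter sets: defaults are overridden by request parameters, and invariants override both. This is only an illustration, not Solr's actual code. The function name effective_params and the exact shape of the internal shard request (asking for the unique key and score) are assumptions.

```python
def effective_params(defaults, request, invariants):
    """Simplified model of handler parameter precedence:
    defaults < request parameters < invariants."""
    merged = dict(defaults)
    merged.update(request)
    merged.update(invariants)  # invariants always win
    return merged


defaults = {"q": ["*:*"]}
invariants = {"fl": ["content", "description"]}

# Assumed shape of an internal shard request that needs the unique key.
shard_request = {"q": ["*:*"], "fl": ["id", "score"]}

# The invariant replaces the requested fl, so id never comes back.
before = effective_params(defaults, shard_request, invariants)
print(before["fl"])

# Following the suggestion above: add id to the invariant field list.
fixed_invariants = {"fl": ["id", "content", "description"]}
after = effective_params(defaults, shard_request, fixed_invariants)
print(after["fl"])
```

In this model the first call yields a field list without id, which matches the situation where shardDoc.id is null when the shard responses are merged.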

Solr suggester in SolrCloud mode

I am running Solr in SolrCloud mode with three shards. The data is already indexed into Solr. I have now configured the Solr suggester in solrconfig.xml; this is the configuration from that file. I am using Solr 4.10.
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mysuggest</str>
<str name="lookupImpl">FuzzyLookupFactory</str>
<str name="storeDir">suggester_fuzzy_dir</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">businessName</str>
<str name="payloadField">profileId</str>
<str name="weightField">businessName</str>
<str name="suggestAnalyzerFieldType">text_general</str>
<str name="buildOnStartup">false</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
Here is the command I am using to fetch the result:
http://shard1:8900/solr/core/suggest?suggest=true&suggest.build=true&suggest.reload&suggest.dictionary=mysuggest&wt=json&indent=true&suggest.q=sale
This is the output of the command:
{
"responseHeader":{
"status":0,
"QTime":1490},
"command":"build",
"suggest":{}
}
Nothing comes back in the suggest result. I have 10K records indexed in Solr.
I see the following in the log file:
org.apache.solr.handler.component.SuggestComponent; http://shard1:8983/solr/core/ : null
org.apache.solr.handler.component.SuggestComponent; http://shard2:8900/solr/core/ : null
org.apache.solr.handler.component.SuggestComponent; http://shard3:7574/solr/core/ : null
I am not able to understand what is missing here. Thanks.
It was not working because Solr was running in SolrCloud mode. There are two ways to get suggestions in SolrCloud mode:
1. Use the distrib=false parameter. This fetches data only from the shard you address in the request. You can add the following to the component definition itself.
<bool name="distrib">false</bool>
2. Use the shards and shards.qt parameters to search all the shards. The shards parameter contains a comma-separated list of all the shards you want to include in the query. The shards.qt parameter names the request handler that the per-shard requests are sent to.
shards.qt: Signals Solr that requests to shards should be sent to a request handler given by this parameter. Use shards.qt=/spell when making the request if your request handler is "/spell".
shards: shards=solr-shard1:8983/solr,solr-shard2:8983/solr Distributed Search
Please check Here for more details.
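The two options can be sketched as request URLs. The Python snippet below uses only the standard library. The helper name suggest_url is hypothetical, the base URL is taken from the question, and the shard addresses are the example values from the answer, so adjust all of them to your cluster.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def suggest_url(base, q, dictionary, shards=None, handler="/suggest"):
    """Build a suggester request, either single-shard or across listed shards."""
    params = {
        "suggest": "true",
        "suggest.dictionary": dictionary,
        "suggest.q": q,
        "wt": "json",
    }
    if shards:
        # Option 2: query the listed shards through the same handler.
        params["shards"] = ",".join(shards)
        params["shards.qt"] = handler
    else:
        # Option 1: only ask the shard this request is sent to.
        params["distrib"] = "false"
    return f"{base}{handler}?{urlencode(params)}"


base = "http://shard1:8900/solr/core"
local_only = suggest_url(base, "sale", "mysuggest")
all_shards = suggest_url(
    base,
    "sale",
    "mysuggest",
    shards=["solr-shard1:8983/solr", "solr-shard2:8983/solr"],
)
print(local_only)
print(all_shards)
```

The sketch leaves out suggest.build=true on purpose: it only covers the lookup request, and building the dictionary is a separate step.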

Solr edismax qf and pf defaults not working to boost fields

I am attempting to set up a request handler that will boost certain fields by different amounts. I have the following request handler.
<requestHandler name="/select" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="start">0</str>
<int name="rows">10</int>
<str name="defType">edismax</str>
<str name="qf">
title^50.0 searchTitle^7.0 keywords^5.0 content^1.0 text^1.0
</str>
<str name="pf">
title^50.0 searchTitle^7.0 keywords^5.0 content^1.0 text^1.0
</str>
<str name="df">text</str>
</lst>
</requestHandler>
However, the fields aren't being boosted correctly, if at all. I noticed that documents with the search term in the title field aren't appearing any higher than documents with the search term in the text field. Arbitrarily re-arranging the weights produces the same document order each time.
When I go into the solr web interface/admin UI and do a search I get the same results. However, if I explicitly check the edismax checkbox and enter the field-boost data in the qf and pf boxes I get the results and the weighting I would expect.
In fact, I also just tried changing the rows value to 5 and still received the same result. It looks like my queries aren't being handled by the /select handler, even though that is what I choose both in the solr Admin UI and when I create the HttpSolrServer object to do the queries from the server.
I am using solr v4.8.0.
Any help would be appreciated.
Check the requestDispatcher setting in solrconfig.xml:
<requestDispatcher handleSelect="false" >
If you want to use select as a request handler, this needs to be
<requestDispatcher handleSelect="true" >
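To check which value a core's configuration currently has, a short script can read the attribute instead of searching the file by eye. This is a minimal sketch in Python. The embedded XML is a trimmed stand-in for a real solrconfig.xml, and the function names are made up for this example.

```python
import xml.etree.ElementTree as ET

# Stand-in for the contents of a real solrconfig.xml.
SAMPLE_SOLRCONFIG = """
<config>
  <requestDispatcher handleSelect="false">
    <httpCaching never304="true"/>
  </requestDispatcher>
  <requestHandler name="/select" class="solr.SearchHandler" default="true"/>
</config>
"""


def handle_select_setting(solrconfig_xml):
    """Return the handleSelect attribute, or None if it is not set."""
    root = ET.fromstring(solrconfig_xml)
    dispatcher = root.find("requestDispatcher")
    if dispatcher is None:
        return None
    return dispatcher.get("handleSelect")


def registered_handlers(solrconfig_xml):
    """List the names of all request handlers declared in the config."""
    root = ET.fromstring(solrconfig_xml)
    return [handler.get("name") for handler in root.iter("requestHandler")]


print(handle_select_setting(SAMPLE_SOLRCONFIG))
print(registered_handlers(SAMPLE_SOLRCONFIG))
```

To run it against a real core, read the file from the core's conf directory and pass its text to the same functions.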

why i can read only 10 Documents out of 665 results into beans in solr

I have indexed my database tables into Solr using DataImportHandler. When I query the server, it reports 665 results found. But when I try to read them into beans with List itemList = rsp.getBeans(Item.class), I get only 10 results.
Can someone help me with this?
Thanks in advance.
When you don't define the amount of rows (documents) to fetch, Solr defaults to fetching 10 documents, as explained in the docs.
By default Solr returns only 10 documents. If you want to fetch all documents, you will need to update the core's solrconfig.xml file (path: /solr/server/solr/core_name/conf/solrconfig.xml):
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10000000</int> <!--you can update it to some large value that is higher than the possible number of rows that are expected.-->
</lst>
</requestHandler>
You might have to edit your solrconfig.xml.
Change the "/select" request handler there like this:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">1000</int> <!-- Change this as you want -->
<str name="df">text</str>
</lst>
</requestHandler>
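As an alternative to raising the handler default, rows can also be set per request together with start, which allows fetching a large result set in pages. In SolrJ the corresponding calls are SolrQuery.setStart and SolrQuery.setRows. The Python sketch below only computes the (start, rows) pairs for the 665 results from the question. The page size of 100 and the function name page_windows are arbitrary choices for illustration.

```python
def page_windows(num_found, page_size):
    """Yield (start, rows) pairs that together cover num_found documents."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    for start in range(0, num_found, page_size):
        # The last page may be shorter than page_size.
        yield start, min(page_size, num_found - start)


# 665 results, as reported in the question; 100 per page is an assumption.
windows = list(page_windows(665, 100))
for start, rows in windows:
    print(f"start={start}&rows={rows}")
```

Each pair maps directly onto the start and rows parameters of one request, so the beans from every page can be appended to a single list on the client.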
