More like Solr Query filtering - solr

I was trying to use the MLT (more like this) feature of SOLR but was stuck on how to use the filtering of related content.
For example, my documents in Solr fall into different categories:
sports, entertainment, funny, business, etc.
I want related content (based on the user query) for each category, so I would like to filter Solr's MLT results by category type.
Can I somehow filter the results?
If that is not possible, can I use a Solr function query to make sure the related content is grouped by category?
Thanks.

You need to define a /mlt request handler and then use fq= to filter the MLT query.

In the solrconfig.xml definition, you should define a request handler as follows:
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
</requestHandler>
Then you need to make a request to this handler where you specify the "fq" parameter. This will filter your MoreLikeThis (MLT) results. See the following example where documentid is the ID of the document on which you want to find similar results and searchtext is the field that you use to compare whether two documents are similar or not. "rows" is the number of MLT results that the request should return.
http://solrserver.local:4562/solr/mlt?rows=2&fl=id&mlt.fl=searchtext&fq=category:<category>&q=id:<documentid>
See the Filter Query documentation:
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefq(FilterQuery)Parameter
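As a rough sketch of the request described above, the following Python builds the MLT URL with a category filter. The function name `build_mlt_url`, the document ID `42`, and the category `sports` are illustrative assumptions; only the parameter names (`q`, `mlt.fl`, `fq`, `fl`, `rows`) and the host come from the answer. Nothing is sent over the network here.

```python
from urllib.parse import urlencode

def build_mlt_url(base_url, doc_id, category, mlt_field="searchtext", rows=2):
    """Build a MoreLikeThis request URL restricted to one category."""
    params = [
        ("q", f"id:{doc_id}"),            # document to find similar items for
        ("mlt.fl", mlt_field),            # field used to compare documents
        ("fq", f"category:{category}"),   # filter applied to the MLT results
        ("fl", "id"),                     # fields to return
        ("rows", rows),                   # number of MLT results
    ]
    # urlencode percent-encodes reserved characters such as ':'
    return f"{base_url}/mlt?{urlencode(params)}"

# Hypothetical document ID and category, for illustration only
url = build_mlt_url("http://solrserver.local:4562/solr", "42", "sports")
print(url)
```

To get related documents per category, one option is to call the function once for each category value and issue the requests separately.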

Related

Solr dismax Query Over Multiple Fields

I am trying to do a solr dismax query over multiple fields, and am a little confused with the syntax.
My core contains a whole load of podcast episodes. The fields in the index are EPISODE_ID, EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS.
Now, when I do a query I would like to search for the query term in the EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS fields, with different boosts for the different fields.
So when I search for 'jedi', the query I've built looks like this:
http://localhost:8983/solr/episode_core/select?
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
However, this doesn't seem to work - it returns zero records.
When I put a default field like below, it now works, but this is kind of crap because it means I'm not getting results from searching all of the 3 fields:
http://localhost:8983/solr/episode_core/select?&df=EPISODE_DESC
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
Is there something I am missing here? I thought that you could search over multiple fields, and I thought that the 'qf' parameter would mean you didn't need to supply the default field parameter?
All help much appreciated...
Your idea is correct. If you've defined qf (query fields) for Dismax, there shouldn't be any need to specify a df (default field).
Can you be more specific about what isn't working?
Also, read up on Configuration Invariants in solrconfig.xml as it is possible your configuration could be sending some different parameters than you've specified in the URL.
(E.g. if you're seeing a specific error message asking you to provide a df)
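One way to rule out URL-encoding problems is to build the request programmatically. The sketch below assembles the same dismax query in Python; `build_dismax_url` is a hypothetical helper, and adding `debugQuery=on` is an assumption on my part to make the parsed query visible in the response. It only builds the URL and does not contact Solr.

```python
from urllib.parse import urlencode

def build_dismax_url(select_url, term, boosts, extra_fields=()):
    """Build a dismax query URL that searches several boosted fields."""
    # qf is a space-separated list of field^boost entries; a boost of None
    # leaves the field at its default weight.
    qf = " ".join(
        name if boost is None else f"{name}^{boost}"
        for name, boost in boosts.items()
    )
    params = {
        "defType": "dismax",
        "q": term,
        "qf": qf,
        "fl": ",".join([*extra_fields, *boosts]),
        "debugQuery": "on",  # asks Solr to include the parsed query in the response
    }
    # urlencode turns spaces into '+' and '^' into %5E
    return f"{select_url}?{urlencode(params)}"

url = build_dismax_url(
    "http://localhost:8983/solr/episode_core/select",
    "jedi",
    {"EPISODE_TITLE": 3.0, "EPISODE_DESC": 2.0, "EPISODE_KEYWORDS": None},
    extra_fields=("EPISODE_ID",),
)
print(url)
```

If the parsed query in the debug output does not list all three fields, that would point toward configuration (for example invariants) overriding the URL parameters.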

Solr filter query for suggester

I'm using the Solr Suggester Component, and was wondering, if the results can be filtered by the fq parameter. I have a query like this:
http://localhost:8982/solr/core1/suggest?q=shirts&fq=category_id%3A321&wt=json&indent=true&spellcheck=true&spellcheck.build=true
Here, I try to get some suggestions for q=shirts. I want to filter this by fq=category_id:321, so that I don't get suggestions from other categories. Because category with the category_id:321 doesn't have any products related to shirts, it shouldn't return any suggestions. But it does. And when trying to search for that suggestion, it doesn't find anything, because the "original" search is filtered with the fq=... parameters.
I found something with collate here http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate. It collates my results, but also returns suggestions for shirts.
So my question is: is the Suggester (or basically the SpellCheckComponent) aware of the fq parameter, and how can I use this parameter to filter the suggestions (or, at a later stage, the spelling corrections)?
EDIT
I found out, that the "normal" spellcheck component (e.g. with the class solr.IndexBasedSpellChecker for example), does indeed take the fq parameter into account. I can set
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">false</str>
and a suggestion for shitr is not returned when filtering by a certain category id, where the keyword shirt is not present.
I'm wondering, why this doesn't work with the suggest component. Any ideas?
I do not think so. You need to look at this a bit differently to understand why. As you may already know, the SpellChecker operates on a dictionary built from the field you specified in the config.
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">text</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
...
</lst>
During indexing, the copy fields that should make up your dictionary fill the "text" field, and the dictionary is built from it.
Example :
At this point the spell checker does not know where a suggestion came from.
With collation you can do a little better. Did you try &spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true? This will make sure the suggestion has some results.
The spellcheck.extendedResults parameter provides additional information about each suggestion, such as its frequency (hits) in the index, which may help in your logic.
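For reference, a minimal sketch of a request carrying the parameters suggested above, together with the filter from the question. The helper name `build_spellcheck_url` is hypothetical; the handler URL, the query term, and the filter are taken from the question. Whether the filter is honored still depends on the configured component, as discussed.

```python
from urllib.parse import urlencode

def build_spellcheck_url(handler_url, term, filter_query=None):
    """Build a spellcheck request with collation and extended results on."""
    params = [
        ("q", term),
        ("spellcheck", "true"),
        ("spellcheck.extendedResults", "true"),  # include frequency information
        ("spellcheck.collate", "true"),          # return a rewritten query
        ("wt", "json"),
    ]
    if filter_query:
        # Optional filter, e.g. "category_id:321"
        params.append(("fq", filter_query))
    return f"{handler_url}?{urlencode(params)}"

url = build_spellcheck_url(
    "http://localhost:8982/solr/core1/suggest", "shirts", "category_id:321"
)
print(url)
```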

Solr query syntax facet issues

I have a Solr server set up and am attempting to get facets working correctly. I run the following query:
/select?q=*:*&wt=xml&indent=true?&facet=true&facet.field=style&facet.field=variety&facet.field=packsize&fq=packsize:6&fq=CABERNET
There are three facet fields: "style", "variety" and "packsize". The query above returns a number of correct results; however, when I execute this
/select?q=*:*&wt=xml&indent=true?&facet=true&facet.field=style&facet.field=variety&facet.field=packsize&fq=packsize:6&fq=variety:CABERNET
Suddenly I receive zero results. Why does prefixing "variety" to the fq break this filter but not the packsize one?
Also when trying to add &fq=style:red or &fq=red neither of these work even though there are many results with "style = red". Any ideas??
If a filter query does not specify a field, it runs against the default field.
You can check the filter query that was actually executed by adding debugQuery=on:
<arr name="parsed_filter_queries">
<str>text:solr100</str>
</arr>
So check whether the default field contains the term CABERNET.
Whether a match occurs also depends on the field type, the analysis performed, and whether the field is indexed.
Only indexed fields can be used to filter the results.
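A small sketch of how the faceted request with explicit field names and debugQuery=on could be assembled. The base URL `http://localhost:8983/solr/wines` and the helper name `build_facet_url` are assumptions, since the question only shows the /select path. Repeated parameters such as facet.field and fq are passed as separate key/value pairs.

```python
from urllib.parse import urlencode

def build_facet_url(base_url, facet_fields, filters):
    """Build a faceted select URL with one fielded fq per filter."""
    params = [("q", "*:*"), ("facet", "true"), ("debugQuery", "on")]
    # One facet.field parameter per facet
    params += [("facet.field", name) for name in facet_fields]
    # Always name the field explicitly so the default field is not used
    params += [("fq", f"{field}:{value}") for field, value in filters.items()]
    return f"{base_url}/select?{urlencode(params)}"

# Hypothetical core name "wines", for illustration only
url = build_facet_url(
    "http://localhost:8983/solr/wines",
    ["style", "variety", "packsize"],
    {"packsize": "6", "variety": "CABERNET"},
)
print(url)
```

The parsed_filter_queries section of the debug output should then show which field each filter was run against.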

Solr facet search returning different result and count

I am trying to implement Solr facet search functionality and testing my query on the server via the url. When I run this query
http://localhost:8080/solr3/core0/select?indent=on&version=2.2&q=ipad&facet.field=brand&facet=on
I get something like
...<lst name="facet_counts"><lst name="facet_queries"/><lst name="facet_fields"><lst name="brand"><int name="Apple">37</int>
But when I use apple as facet query like
http://localhost:8080/solr3/core0/select?indent=on&version=2.2&q=ipad&facet.field=brand&facet=on&fq=apple
I expect to get 37 results, but query returns <result name="response" numFound="402" start="0">
Am I missing something here?
Thanks
This is how you apply the filter: q=ipad&fq=brand:apple
Don't repeat the facet unless you want multi-select facets (and even then, it's more complex than that).

Limiting the output from MoreLikeThis in Solr

I'm trying to use MoreLikeThis to get all similar documents but not documents with a specific contenttype.
So the first query needs to find the one document that I want to get "More Like This" of - and the second query needs to limit the similar documents to not be pdf's (-contenttype:pdf)
Does anyone know if this is possible?
Thanks
When using the MoreLikeThisHandler, all the common parameters apply to the MLT result set. So you can use the fq parameter to exclude your PDF documents from the MLT results:
http://localhost:8983/solr/mlt?q=test&mlt.fl=text&fq=-contenttype:pdf
The q parameter selects the document used to generate the MLT results (specifically, the first document matching the initial query).
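The same request can be sketched in Python with the exclusion expressed as a negative filter query. The helper name `build_mlt_exclusion_url` is hypothetical; the host, query, and field names are those used in the answer. The code only builds the URL.

```python
from urllib.parse import urlencode

def build_mlt_exclusion_url(base_url, query, mlt_field, excluded):
    """Build an MLT URL that excludes documents matching field:value pairs."""
    params = [("q", query), ("mlt.fl", mlt_field)]
    # A leading '-' negates the filter, so matching documents are dropped
    params += [("fq", f"-{field}:{value}") for field, value in excluded]
    return f"{base_url}/mlt?{urlencode(params)}"

url = build_mlt_exclusion_url(
    "http://localhost:8983/solr", "test", "text", [("contenttype", "pdf")]
)
print(url)
```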

Resources