Solr filter query for suggester - solr

I'm using the Solr Suggester component and was wondering whether the results can be filtered by the fq parameter. I have a query like this:
http://localhost:8982/solr/core1/suggest?q=shirts&fq=category_id%3A321&wt=json&indent=true&spellcheck=true&spellcheck.build=true
Here, I try to get some suggestions for q=shirts. I want to filter them by fq=category_id:321 so that I don't get suggestions from other categories. Because the category with category_id:321 doesn't have any products related to shirts, the query shouldn't return any suggestions, but it does. And when I then search for that suggestion, nothing is found, because the "original" search is filtered with the fq=... parameters.
I found something with collate here http://wiki.apache.org/solr/SpellCheckComponent#spellcheck.collate. It collates my results, but also returns suggestions for shirts.
So my question is: is the Suggester (or basically the SpellCheckComponent) aware of the fq parameter, and how can I use this parameter to filter the suggestions (or, at a later stage, the spelling corrections)?
EDIT
I found out that the "normal" spellcheck component (e.g. with the class solr.IndexBasedSpellChecker) does indeed take the fq parameter into account. I can set
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">false</str>
and a suggestion for shitr is not returned when filtering by a certain category id, where the keyword shirt is not present.
I'm wondering why this doesn't work with the suggest component. Any ideas?

I do not think so. You need to look at this a bit differently to understand why. As you may already know, the SpellChecker operates on a dictionary built from the field you specified in the config.
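<lst name="spellchecker">
<str name="name">default</str>
<str name="field">text</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
...
</lst>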
During indexing, the copy fields that should make up your dictionary fill the "text" field, and hence the dictionary.
Example :
At this point the spell checker does not know which documents a suggestion came from.
With collation you can do a little better. Did you try &spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true? This will make sure the suggestion has some results.
The spellcheck.extendedResults parameter provides additional information about each suggestion, such as its frequency (hits) in the index, which may help you in your logic.
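The parameters suggested above can be combined with the fq from the question. The following is a minimal Python sketch that only builds the request URL; the /spell handler path is an assumption (use whichever handler has the spellcheck component attached), and shitr is the misspelled term from the question.

```python
from urllib.parse import urlencode

def build_spellcheck_url(base_url, query, filter_query):
    """Build a spellcheck request URL that asks for collations.

    base_url is a hypothetical handler URL; adjust it to your setup.
    """
    params = [
        ("q", query),
        ("fq", filter_query),
        ("wt", "json"),
        ("spellcheck", "true"),
        ("spellcheck.extendedResults", "true"),
        ("spellcheck.collate", "true"),
    ]
    # urlencode percent-escapes reserved characters such as ":" in the fq value.
    return base_url + "?" + urlencode(params)

url = build_spellcheck_url(
    "http://localhost:8982/solr/core1/spell",  # assumed handler path
    "shitr",
    "category_id:321",
)
print(url)
```

Building the query string with urlencode avoids hand-escaping mistakes such as an unescaped colon in the filter value.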

Related

Solr dismax Query Over Multiple Fields

I am trying to do a solr dismax query over multiple fields, and am a little confused with the syntax.
My core contains a whole load of podcast episodes. The fields in the index are EPISODE_ID, EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS.
Now, when I do a query I would like to search for the query term in the EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS fields, with different boosts for the different fields.
So when I search for 'jedi', the query I've built looks like this:
http://localhost:8983/solr/episode_core/select?
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
However, this doesn't seem to work - it returns zero records.
When I put a default field like below, it now works, but this is kind of crap because it means I'm not getting results from searching all of the 3 fields:
http://localhost:8983/solr/episode_core/select?&df=EPISODE_DESC
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
Is there something I am missing here? I thought that you could search over multiple fields, and I thought that the 'qf' parameter would mean you didn't need to supply the default field parameter?
All help much appreciated...
Your idea is correct. If you've defined qf (query fields) for Dismax, there shouldn't be any need to specify a df (default field).
Can you be more specific about what isn't working (e.g. are you seeing a specific error message asking you to provide a df)?
Also, read up on configuration invariants in solrconfig.xml, as it is possible that your configuration is sending different parameters than the ones you've specified in the URL.
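To rule out URL-encoding problems in the qf parameter, the request can be built programmatically. This is a minimal Python sketch, not a fix for the zero-results problem itself; the helper name build_dismax_url is hypothetical, and the field names and boosts are taken from the question.

```python
from urllib.parse import urlencode

def build_dismax_url(base_url, term, field_boosts, return_fields):
    """Build a dismax select URL.

    field_boosts is a list of (field, boost) pairs; boost may be None.
    """
    # qf is one space-separated parameter, e.g. "A^3.0 B^2.0 C".
    qf = " ".join(
        field if boost is None else f"{field}^{boost}"
        for field, boost in field_boosts
    )
    params = [
        ("defType", "dismax"),
        ("q", term),
        ("qf", qf),
        ("fl", ",".join(return_fields)),
    ]
    # urlencode escapes "^" as %5E and the spaces inside qf as "+".
    return base_url + "?" + urlencode(params)

url = build_dismax_url(
    "http://localhost:8983/solr/episode_core/select",
    "jedi",
    [("EPISODE_TITLE", 3.0), ("EPISODE_DESC", 2.0), ("EPISODE_KEYWORDS", None)],
    ["EPISODE_ID", "EPISODE_TITLE", "EPISODE_DESC", "EPISODE_KEYWORDS"],
)
print(url)
```

Note that no df parameter is sent: with qf defined, dismax should not need one.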

/select with 'q' parameter does not work

Whenever I query with q=*:* it shows all the documents, but when I query with q=programmer, 0 docs are found (contents is the default search field).
My schema has these fields: id (unique), author, title, contents.
The query also works fine for:
q=author:"Value" or q=title:"my book" etc.; only the contents field returns no results.
Also, when I query using the spell checker (/spell?q=programmer), the output shows spelling suggestions for this word, even though 'programmer' is spelled correctly and present in many documents.
I followed the example docs for the configuration.
This started happening all of a sudden; initially it worked fine.
I guess there is a problem only in the contents field, but I cannot figure it out.
Is it because the index was not created properly for the contents field?
(I am using Solr 4.2 on Windows 7 with Tomcat as the web server.)
Please help. Thanks a lot in advance.
Are you sure you set the default search field? The reason you have this problem might be because you didn't set the <defaultSearchField> field in your schema.xml file. This is why "q=author:value" works while q=WHATEVER doesn't.
The <defaultSearchField> is used by Solr when parsing queries to
identify which field name should be searched in queries where an
explicit field name has not been used.
But also consider this:
The <defaultSearchField> is used by Solr when parsing queries to
identify which field name should be searched in queries where an
explicit field name has not been used. It is preferable to not use or
rely on this setting; instead the request handler or query LocalParams
for a search should specify the default field(s) to search on. This
setting here can be omitted and it is being considered for
deprecation.
Do you have any data in your instance? Try q=*:* and see what it returns. "for" is a stop word, so it may have been filtered out. Use a different value to test.
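The quoted advice, specifying the default field per request instead of relying on schema.xml, could look like the sketch below. It is a minimal Python example that sets df through LocalParams on the standard lucene query parser; the select URL is an assumption, and contents is the field from the question.

```python
from urllib.parse import urlencode

def with_default_field(term, field):
    """Prefix a query with LocalParams that set the default field."""
    return "{!lucene df=%s}%s" % (field, term)

params = {
    "q": with_default_field("programmer", "contents"),
    "wt": "json",
}
# Hypothetical handler URL; adjust host, port, and core to your setup.
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

If this request returns documents while a plain q=programmer does not, the default field is the likely culprit rather than the index itself.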

More like Solr Query filtering

I was trying to use the MLT (MoreLikeThis) feature of Solr but got stuck on how to filter the related content.
For example, my documents in Solr have the following categories:
sports, entertainment, funny, business etc.
I want related stuff (based on the user query) for each category. Thus I would like to filter the MLT results of Solr based on category type.
Can I somehow filter results?
If not possible, can I somehow use solr function query to make sure related stuff are grouped by category?
Thanks.
You need to define a /mlt request handler and then use fq= to filter the MLT query.
In the solrconfig.xml definition, you should define a request handler as follows:
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler">
</requestHandler>
Then you need to make a request to this handler where you specify the "fq" parameter. This will filter your MoreLikeThis (MLT) results. See the following example where documentid is the ID of the document on which you want to find similar results and searchtext is the field that you use to compare whether two documents are similar or not. "rows" is the number of MLT results that the request should return.
http://solrserver.local:4562/solr/mlt?rows=2&fl=id&mlt.fl=searchtext&fq=category:<category>&q=id:<documentid>
See the Filter Query documentation:
https://cwiki.apache.org/confluence/display/solr/Common+Query+Parameters#CommonQueryParameters-Thefq(FilterQuery)Parameter
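Since the question asks for related documents per category, one request per category is a straightforward option. The following Python sketch builds those URLs from the example above; the helper name, the document ID 42, and the category values are hypothetical, while the handler URL and the searchtext field come from the answer.

```python
from urllib.parse import urlencode

# Handler URL taken from the example request above.
MLT_URL = "http://solrserver.local:4562/solr/mlt"

def mlt_urls_by_category(document_id, categories, rows=2):
    """Return one MLT request URL per category, each filtered with fq."""
    urls = {}
    for category in categories:
        params = [
            ("q", f"id:{document_id}"),       # the source document
            ("fq", f"category:{category}"),   # restrict similar docs to one category
            ("mlt.fl", "searchtext"),         # field used for similarity
            ("fl", "id"),
            ("rows", rows),
        ]
        urls[category] = MLT_URL + "?" + urlencode(params)
    return urls

for category, url in mlt_urls_by_category("42", ["sports", "funny"]).items():
    print(category, url)
```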

Solr Spell Check result based filter query

I implemented the Solr SpellCheckComponent based on the documentation at http://wiki.apache.org/solr/SpellCheckComponent , and it works well. But I am trying to filter the spell check results based on another filter. Consider the following schema:
product_name
product_text
product_category
product_spell -> copy string from product_name and product_text . And tokenized using white space analyzer
For the above schema, I am trying to filter the spell check results by a given category. I tried querying like http://127.0.0.1:8080/solr/colr1/myspellcheck/?q=product_category:160%20appl&spellcheck=true&spellcheck.extendedResults=true&spellcheck.collate=true . The spellcheck results do not take product_category:160 into account.
Is it because the dictionary was built for all the categories? If so, is it a good idea to create a dictionary for every category?
Is it not possible to have another filter condition in spellcheck component?
I am using solr 3.5
I previously understood from the SOLR-2010 issue that filtering through the fq parameter should be possible using collation, but it isn't; I think I misunderstood.
In fact, the SpellCheckComponent most likely has a separate index, except for the DirectSolrSpellChecker implementation. This means the field you select is indexed in a different index, which contains only the information about the specific field you chose for spelling corrections.
If you're curious, you can also look at that additional index using Luke, since it is of course a Lucene index. Unfortunately, filtering on other fields isn't an option there, simply because there is only one field in it: the one you use for spelling corrections.

Solr Query with LIKE Clause

I'm working with Solr and I'd like to know if it is possible to have a LIKE clause in the query. For example, I want to know all organizations with "New York" in the title. In SQL, this would be written like Name LIKE 'New York%'.
My question - how do you write a LIKE query in Solr?
I'm using the SolrNet library, if that makes a difference.
You just search for "New York", but first you need to properly configure your field's analyzer. For example you might want to start with a field type like text_general as defined in the default Solr schema. This field type will tokenize on whitespace and other common word separators, then apply a filter of stopwords, then lowercase the terms in order to make searches case-insensitive.
More information about analyzers in the Solr wiki.
If you're using Solr 3.1 or newer, have a look at the Extended DisMax Query Parser, which supports wildcard queries. You can enable it using <str name="defType">edismax</str> in the request handler configuration.
Then you can use a query like title:(New York*), with behaviour similar to a LIKE clause. The parentheses matter: in title:New York*, only New is bound to the title field. The main difference between my answer and the accepted one is that you can even search for fragments of words using wildcards. For example, New Yorkers would match in this case.
Unfortunately you could have problems with case-sensitive queries even if you're using a LowerCaseFilterFactory. Have a look here to know more. Most of those problems will be fixed with the solr 3.6 release since the SOLR-2438 issue has been solved.
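A small helper can build such a prefix query and sidestep the case-sensitivity issue by lowercasing on the client. This is a minimal Python sketch under two assumptions: the field is lowercased at index time (e.g. by a LowerCaseFilterFactory), and the input contains no query-syntax characters that would need escaping. The function name is hypothetical.

```python
def like_prefix_query(field, text):
    """Build a query roughly equivalent to SQL's  field LIKE 'text%'.

    Terms are lowercased because wildcard terms may bypass analysis,
    and the last term gets a trailing "*" to match word fragments.
    """
    terms = text.lower().split()
    if not terms:
        raise ValueError("text must contain at least one term")
    terms[-1] += "*"
    # Parentheses bind every term to the field, not only the first one.
    return f"{field}:({' '.join(terms)})"

print(like_prefix_query("title", "New York"))
```

Unlike SQL's LIKE, this matches on terms rather than on the raw string, so term order and adjacency are not enforced.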
