Solr range query in text field

Solr range query in text field - solr

I have a multi valued field. The content looks this way
multi_field:"type:type1; YEAR:2008"
I want to be able to make range requests based on YEAR substring. I cannot understand if I can perform this kind of range queries. I want to have something like this. q=multi_field:"type:type1;*" AND multi_field:"*YEARS:[2005 TO 2010]*"
Is it possible? I know it looks horrible. But is there any way I can get it?

No, it is not possible (at least without a hell-a-lot of coding). The easiest way should be to fix the indexing code to split the field into two separate fields. If you need to keep the original multi_field available (e.g. it is used in processing the search results), create two new fields (e.g. multi_field_part1 and multi_field_part2), do the search over the new fields (q=multi_field_part1:type1 AND multi_field_part2:[2005 TO 2010]), but use the old one in results.

Related

Match keywords from a list and filter out the rest using regexp or other?

Ok so I'm familiar with the filter in Google Data Studio where strings can be filtered using AND or OR. The maximum however is 75 and I need 100. It's quite annoying though having to put one filter up at a time. Ideal would have been to be able to copy all keywords and paste them in.
Any solutions to this? Can this be done with for instance regexp?

(?i)^(Keyword 1)$|^(Keyword 2)$|^(Keyword 3)$|^(Keyword 4)
So that is what I came up with above and it seems to work.
During copy from Google Sheets, is there a way to format a column of keywords into the regex format above?

Display Solr results based on custom selection

I have a filed 'qualification' which is having multiple values (something like MCA, MBA, MSC, PhD, ...).
My requirement is to display results in the order MSC, MCA, PhD, MBA. So, I am using the below query to boost the field values.
&bq=(qualification: "MSC"^5 "MCA"^4 "PhD"^3 "MBA"^2)
The above query is working only when I use q=*:*
But when search with any text like q=course, I am not getting the results with specified order.
Please help what I did wrong.
Thanks & Regards
Venu

You're probably not doing anything "wrong", but when you actually search for something, the score isn't flat (i.e. it's no longer just 1) any more.
If you don't want your query to affect the score, use a filter query (fq instead). This does however not give you any actual relevance inside the results - if you still want that, you'll probably have to adjust your boosts to be far higher, so that the actual scores are only used internally within each boost level.
&bq=qualification:"MSC"^50000
&bq=qualification:"MCA"^40000
&bq=qualification:"PhD"^30000
&bq=qualification:"MBA"^20000
If you append debugQuery=true to your query string, you can see how the score is calculated for each document, and adjust your boosts accordingly.

How to save value with wildcard in Solr?

all.
I have the following trouble with Solr. I need to implement "reverse" search with wildcards. I mean I want to keep value like "auto*" and this item should be found with request like "autocar", "autoplan" or "automate". Could someone help me with this, please? Thanks.

If you want to match shorter indexed value (auto) against longer searched value (autobus), you want a custom analysis chain that includes EdgeNGramFilter on the query side only. Then, the incoming search word will get split into possible prefixes and matched against the indexed term.

Sunspot/Solr: word concatenation

I'm using Solr with the Sunspot Ruby gem. It works great, but I'm noticing that sometimes users will get poor search results because they have concatenated their search terms (e.g. 'foolproof') where the document text was 'fool proof'. Or vice-versa.
I was going to try and address this by creating a set of alternate match fields by manually concatenating the words from the source documents together. This seems kind of hackish, and implementing the other side (breaking up user concatenations into words) is not obvious.
Is there a way to do this properly in Solr/Sunspot?

Did yo have a look at SOLR spellcheck (or spell check) component?
http://wiki.apache.org/solr/SpellCheckComponent
For example, there is a WordBreakSolrSpellChecker, which may provide valid suggestions in such case.

Is it possible to have SOLR MoreLikeThis use different fields for model and matches?

Let's say I have documents with two fields, A and B.
I'd like to use SOLR's MoreLikeThis, but with a twist: I'm most interested in boosting documents whose A field is like my model document's B field. (That is, extract MLT's 'interesting terms' from the model B field, but only collect MLT results based on the A field.)
I don't see a way to use the mlt.fl fields or mlt.qf boosts to achieve this effect in a single query. (It seems mlt.fl specifies fields used for both discovery of 'interesting terms' and matching to those terms.) Am I missing some option?
Or will I have to extract the 'interesting terms' myself and swap the 'field:term' details?
(Other ideas in this same vein appreciated as well.)

Two options I see are:
Use a copyField - index your original document with a copy of field A named B, and then query using B.
Extend MoreLikeThisHandler and change the fields you query.
The first option costs a bit of programming (mostly configuration changes) and some memory consumption. The second involves more programming but no memory footprint increase. Hope one of them suits your needs.

I now think there are two ways to achieve the desired effect (without customizing the MLT source code).
First option: Do an initial MLT query with the MLT handler, adding the parameter &mlt.interestingTerms=details. This includes the list of terms that were deemed interesting, ranked with their relative boosts. The usual behavior uses those discovered terms against the same mlt.fl fields to find similar documents. For example, the response will include something like:
"interestingTerms":
["field_b:foo",5.0,"field_b:bar",2.9085307,"field_b:baz",1.67070794]
(Since the only thing about this initial query that's interesting is the interestingTerms, throwing in an fq that rules out all docs could help it skip unnecessary scoring work.)
Explicitly re-composing that interestingTerms info into a new OR query field_a:foo^5.0 field_a:bar^2.9085307 field_a:baz^1.67070794 amounts to using the B field example text to find documents that are similar in field A, and may be mimicking exactly the kind of query default MLT does on its usual model field.
Second option: Grab the model document's actual field B text, and feed it directly as a ContentStream body, to be used in lieu of a query, for specifying the model document. Then target mlt.fl at field A for the sake of collecting similar results. For example, a fragment of the parameters might be …&stream.body=foo bar baz&mlt.fl=field_a&…. Again, the net effect being that model text originally from field_b is finding documents similar only in field_a.