Solr Search for term with under certain price - solr

I am working on an e-commerce app whose search is powered by Solr. How can I accomplish something like: 'DSLR under $600' gives me products that are found with DSLR and price is under $600?

The Lucene and Solr query parsers express "less than or equal to" (<=) as a range query:
?q=name:DSLR AND price:[* TO 600]
Note: this assumes the field you search is called "name". The TO keyword must be uppercase.
This returns the products priced at 600 dollars or less.
Square brackets [ & ] denote an inclusive range query that matches values including the upper and lower bound.
Curly brackets { & } denote an exclusive range query that matches values between the upper and lower bounds, but excluding the upper and lower bounds themselves.
For more details, please refer to Range Searches.
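To make the bracket rules concrete, here is a minimal Python sketch that assembles a range clause. The helper name range_clause is hypothetical, and the field name price is taken from the question. As far as I know, Solr 4.0 and later accept a different bracket type on each end, which is what a strict "under $600" needs.

```python
def range_clause(field, lower="*", upper="*", include_lower=True, include_upper=True):
    """Build a Solr range clause; '*' means the range is open on that side."""
    # Square brackets include the bound, curly brackets exclude it.
    open_bracket = "[" if include_lower else "{"
    close_bracket = "]" if include_upper else "}"
    # The TO keyword has to be uppercase.
    return f"{field}:{open_bracket}{lower} TO {upper}{close_bracket}"


# "$600 or less": the upper bound is included.
print(range_clause("price", upper=600))
# Strictly "under $600": the upper bound is excluded.
print(range_clause("price", upper=600, include_upper=False))
```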

You can use
protocol://{solr-url}/solr/{collection-name}/select?
fq=name:DSLR
&fq=price:[0 TO 600]
&q=*:*
&wt=json
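In my understanding, using fq rather than q is recommended here: filter queries are cached independently of the main query and do not affect relevance scores.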
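Neither answer shows how the free-text input 'DSLR under $600' becomes request parameters, so here is a rough Python sketch of that step. The regular expression, the function name build_params and the field names name and price are all assumptions made for illustration. It keeps the keyword in q so relevance scoring still applies and moves only the price limit into fq, which is a design choice rather than something the answers prescribe.

```python
import re
from urllib.parse import urlencode

# Hypothetical pattern: "<search terms> under $<number>".
PRICE_PATTERN = re.compile(
    r"^(?P<term>.+?)\s+under\s+\$?(?P<price>\d+(?:\.\d+)?)\s*$", re.IGNORECASE
)


def build_params(user_input, text_field="name", price_field="price"):
    """Split the input into a keyword query (q) and a price filter (fq)."""
    text = user_input.strip()
    match = PRICE_PATTERN.match(text)
    if match is None:
        # No price phrase found: plain keyword search.
        return [("q", f"{text_field}:({text})"), ("wt", "json")]
    return [
        ("q", f"{text_field}:({match.group('term')})"),
        # Inclusive upper bound, as in the answers above.
        ("fq", f"{price_field}:[* TO {match.group('price')}]"),
        ("wt", "json"),
    ]


params = build_params("DSLR under $600")
print(params)
# urlencode takes care of escaping the brackets, colon and spaces.
print(urlencode(params))
```

The sketch does not escape Solr special characters in the user's terms; a real application would need to do that before building q.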

Related

Preserving word order in Vespa in non-English

I am creating a schema for Vespa mainly for English, but with two fields in Wylie transliteration of Tibetan, which looks like this
'jam dpal smra ba'i seng ge la bstod pa ut+pal dmar po'i do shal
Typically users want to match every token and preserve the word order, preferably at the beginning of the field.
For example, to find the field above, a user might enter "'jam dpal smra ba'i seng ge". They would not appreciate results where these tokens appear in a different order, even if those results rank high with BM25. BM25 would still be needed as a fallback.
Could you give me an example of a schema field and ranking expression that ranks in this order:
exact match in the beginning of field
exact match anywhere
bm25
Naturally, I'll turn off stemming. Also, apostrophes and, less importantly, plus signs should be preserved.
I have read especially the Schema Reference of Vespa docs, but I did not find a solution.
I got the best results with:
field wylie type string {
    indexing: index | summary
    index: enable-bm25
    stemming: none
}
rank-profile native_rank_and_wylie {
    first-phase {
        expression: nativeRank(title, body) + fieldMatch(wylie).earliness + fieldMatch(wylie).longestSequence * 0.4
    }
}
Note that longestSequence is not normalized and can affect scores a lot.
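To make the requested ordering concrete, here is a toy Python sketch of the three tiers from the question. It is plain Python and not Vespa code; the function match_tier and its numeric tiers are hypothetical. It splits on whitespace only, so apostrophes and plus signs stay inside the tokens.

```python
def match_tier(query, field):
    """Return 2 for an in-order match at the start of the field,
    1 for an in-order match elsewhere, 0 otherwise (fall back to BM25)."""
    query_tokens = query.split()
    field_tokens = field.split()
    if not query_tokens:
        return 0
    width = len(query_tokens)
    if field_tokens[:width] == query_tokens:
        return 2
    # Slide a window over the rest of the field, looking for the same token run.
    for start in range(1, len(field_tokens) - width + 1):
        if field_tokens[start:start + width] == query_tokens:
            return 1
    return 0


wylie = "'jam dpal smra ba'i seng ge la bstod pa ut+pal dmar po'i do shal"
print(match_tier("'jam dpal smra ba'i seng ge", wylie))
print(match_tier("bstod pa ut+pal", wylie))
print(match_tier("ge seng", wylie))
```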

Can Solr rank its highlight snippets?

Solr will give as many highlights as I specify in hl.snippets, giving a list of highlights. What I want is the set of (2 or 3) highlight snippets that best match the query. Is there an innate Solr feature that does this?
The Unified Highlighter allows you to tell it how it should score the returned highlights. It should already do some scoring by default, so the first task would be to switch to the unified highlighter if you're not already using it.
You can then tweak how it uses BM25 to score the returned highlights:
hl.score.k1 [Optional] [Default: 1.2]
Specifies BM25 term frequency normalization parameter 'k1'. For example, it can be set to 0 to rank passages solely based on the number of query terms that match.
hl.score.b [Optional] [Default: 0.75]
Specifies BM25 length normalization parameter 'b'. For example, it can be set to "0" to ignore the length of passages entirely when ranking.
hl.score.pivot [Optional] [Default: 87]
Specifies BM25 average passage length in characters.
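As a rough illustration, the parameters above can be put together like this in Python. The helper name highlight_params, the query and the field name content are assumptions. The hl.* names are the ones quoted above, plus hl.method and hl.fl, which select the highlighter and the field to highlight.

```python
from urllib.parse import urlencode


def highlight_params(query, field, snippets=3, k1=1.2, b=0.75, pivot=87):
    """Request parameters for the unified highlighter with BM25 tuning."""
    return {
        "q": query,
        "hl": "true",
        "hl.method": "unified",   # switch to the unified highlighter
        "hl.fl": field,           # field to highlight
        "hl.snippets": snippets,  # how many passages to return
        "hl.score.k1": k1,
        "hl.score.b": b,
        "hl.score.pivot": pivot,
    }


# k1=0 ranks passages only by how many query terms they contain.
params = highlight_params("content:solr", "content", snippets=3, k1=0)
print(urlencode(params))
```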

Sort or filter results by function query defined in a field

I have a Solr 6.2 instance running, and I'm exploring its advantages and limitations. One limitation I've run into seems to be that you can't sort or filter the data based off of a field function query.
.../solr/collection/select?q=*:*&fl=*,total:sum(v1,v2)&fq=total:[10 TO *]
Solr responds with an error stating that the total field does not exist. Indeed, the field is not defined in my schema because it's not a stored part of the dataset - it's calculated at query time. They call it a pseudo field. I haven't been able to find an example in the documentation or a solution online. So, is there a way around this?
I had the very same problem: I wanted to filter on the quotient of two fields and, like you, first tried a range such as [0.3 TO *].
What works is the frange (function range) query parser, which filters on the value of a function:
.../solr/collection/select?q=*:*&fl=*,total:sum(v1,v2)&fq={!frange l=10}sum(v1,v2)
You can also give the range an upper bound if you need one.
http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-4.6.pdf
"l" is for lower bound.
"u" is for upper bound.
fq={!frange l=0 u=2.2} sum(user_ranking,editor_ranking)
Maybe this works for you?
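If you build these filters in application code, a small helper keeps the local-params syntax in one place. This is only a sketch: frange_filter is a hypothetical name, and it covers just the l and u bounds mentioned above.

```python
def frange_filter(function, lower=None, upper=None):
    """Build an fq value that keeps documents whose function value is in range."""
    bounds = []
    if lower is not None:
        bounds.append(f"l={lower}")  # "l" is the lower bound
    if upper is not None:
        bounds.append(f"u={upper}")  # "u" is the upper bound
    if not bounds:
        raise ValueError("give at least one of lower or upper")
    # The function follows the closing brace directly.
    return "{!frange " + " ".join(bounds) + "}" + function


print(frange_filter("sum(user_ranking,editor_ranking)", lower=0, upper=2.2))
print(frange_filter("sum(v1,v2)", lower=10))
```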
You can do this: instead of referring to the pseudo-field total, use the function sum(v1,v2) itself.
You can find more info here: https://wiki.apache.org/solr/FunctionQuery#What_is_a_Function.3F
An example from the Solr wiki:
Example Function Queries
To give you a better understanding of how function queries can be used in Solr, suppose an index stores the dimensions in meters x,y,z of some hypothetical boxes with arbitrary names stored in field boxname. Suppose we want to search for box matching name findbox but ranked according to volumes of boxes. The query parameters would be:
q=boxname:findbox _val_:"product(x,y,z)"
This query will rank the results based on volumes. In order to get the computed volume, you will need to request the score, which will contain the resultant volume:
&fl=*, score
Suppose that you also have a field storing the weight of the box as weight. To sort by the density of the box and return the value of the density in score, you would submit the following query:
http://localhost:8983/solr/collection_name/select?q=boxname:findbox _val_:"div(weight,product(x,y,z))"&fl=boxname x y z weight score
You can read more about it here: https://cwiki.apache.org/confluence/display/solr/Function+Queries
Try this:
solr/collection/select?q=*:* _val_:"sum(v1,v2)"&fl=* score&fq={!frange l=10}sum(v1,v2)
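The answers cover filtering; for the sorting half of the question, Solr also accepts a function directly in the sort parameter. The Python sketch below reuses one function string for the returned pseudo-field, the filter and the sort. The helper name function_query_params is hypothetical, and v1 and v2 are the fields from the question.

```python
from urllib.parse import urlencode


def function_query_params(function, alias, lower):
    """Return, filter and sort by the same function of document fields."""
    return {
        "q": "*:*",
        # Pseudo-field: shows the computed value in each returned document.
        "fl": f"*,{alias}:{function}",
        # The alias cannot be used here, so the function is repeated.
        "fq": f"{{!frange l={lower}}}{function}",
        # Sort by the function itself, highest value first.
        "sort": f"{function} desc",
    }


params = function_query_params("sum(v1,v2)", alias="total", lower=10)
print(params)
print(urlencode(params))
```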

Solr - how to plan field boosting

I query using
qf=Name+Tag
Now I want documents that have the phrase in Tag to come first, so I use
qf=Name+Tag^2
and they do appear first.
What should the rule of thumb be for the number that comes after the field?
How do I know what number to set?
The number is purely a matter of preference and is found mainly by trial and error.
It expresses how much one field weighs in comparison to the other fields.
Scoring takes various factors into account, and some of them can be considered and tested,
e.g. term frequency: if a word appears twice in Name, should that override a single occurrence in Tag?
Also, if you are checking for a phrase match, you should use pf with the edismax parser.
qf matches individual words, whereas pf matches whole phrases.
For example, if you have the fields name and tag and you search for ruby rails:
qf would cause scoring on name:ruby tag:ruby and name:rails tag:rails
pf would cause scoring on name:"ruby rails" tag:"ruby rails"
So it is better to use qf to match the results and boost single-term matches, and to give pf higher boost values.
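As a starting point for the trial and error, the request could be assembled like this in Python. The helper name edismax_params is hypothetical, and the boost values are arbitrary placeholders to tune against your own data, not recommendations.

```python
from urllib.parse import urlencode


def edismax_params(user_query, qf, pf):
    """Build edismax parameters from {field: boost} mappings."""
    def weights(boosts):
        # Render {"Name": 1, "Tag": 2} as "Name^1 Tag^2".
        return " ".join(f"{field}^{boost}" for field, boost in boosts.items())

    return {
        "defType": "edismax",
        "q": user_query,
        "qf": weights(qf),  # per-word matches
        "pf": weights(pf),  # whole-phrase matches, boosted higher
    }


params = edismax_params(
    "ruby rails",
    qf={"Name": 1, "Tag": 2},
    pf={"Name": 4, "Tag": 8},
)
print(params)
print(urlencode(params))
```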

Solr Fuzzy search in multiValued field with max distance between terms

Hello stackOverflowers
I have a Solr document collection with a field called
names_txt - this is a multiValued="true" field.
This field contains the names of all the persons associated with a document.
I want to do a fuzzy search and at the same time limit the number of terms between the two matching terms.
The query
names_txt:("markus foss"~2)
This returns all documents that contain the terms markus and foss with at most 2 terms between them.
But when I search in a fuzzy way AND also want to specify the maximum number of terms between the matches, I can't get the syntax right.
The query:
names_txt:(markus~0.7 foss~0.7)
This does work, but it returns false positives, since it will return a document with "markus something" in one value and "foss somethingElse" in another.
What I would like to write is:
(markus~0.7 foss~0.7)~2
but this syntax is illegal in Solr.
Does anyone out there have a solution for my problem?
Since a single query clause in Solr can carry either a word-distance constraint or a fuzzy-search constraint, but not both, we need two clauses:
names_txt:("markus foss"~2) AND names_txt:(markus~0.7 foss~0.7)
Note that quantifying fuzziness with a float is deprecated. Internally, Lucene converts the float to an int between 0 and 2 anyway, so we should use this integer (Damerau-Levenshtein) edit distance in our search terms from the start. So my final proposal is:
names_txt:("markus foss"~2) AND names_txt:(markus~1 foss~1)
(For those who are interested: The deprecated, somewhat quirky function that converts the similarity float to an edit distance int can be found at the end of this code file.)
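For a feel of how such a conversion behaves, here is a Python sketch modeled on my recollection of that deprecated Lucene helper. Treat the exact rule as an assumption and check the Lucene source for your version. The names float_to_edits and MAX_EDITS are my own.

```python
MAX_EDITS = 2  # fuzzy matching supports at most two edits


def float_to_edits(min_similarity, term_length):
    """Map a legacy similarity float to an integer edit distance (assumed rule)."""
    if min_similarity >= 1.0:
        # Values of 1 or more are already edit distances.
        return min(int(min_similarity), MAX_EDITS)
    if min_similarity == 0.0:
        # Zero means an exact match, not unlimited edits.
        return 0
    # Otherwise the allowed edits scale with the term length.
    return min(int((1.0 - min_similarity) * term_length), MAX_EDITS)


for term in ("markus", "foss"):
    print(term, float_to_edits(0.7, len(term)))
```

Under this rule both markus~0.7 and foss~0.7 come out as one edit, which agrees with the ~1 used in the final query above.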
I think you could do that using SpanQuery. The issue is that the usual query parsers in Solr don't support span queries. Look at this article, which mentions the parsers that do: Surround, Xml-Query-Parser and Qsol. But check the status of each in the current Solr version.
