Use different Solr Similarity algo for every search - solr

Is it possible in Solr 1.4 to specify which similarity class to use for each search within a single index?
Let's say I have 2 types of search (keyword and brand). For keyword search, I want to use the DefaultSimilarity class. But for brand search, I want to use my CustomSimilarity class.
So far I've been modifying schema.xml to specify a single similarity class. But I now have a requirement to use 2 different similarity classes.
I'll be glad to hear your thoughts on this.
Thanks in advance.

AFAIK the Similarity can only be defined at the schema/index level and can't be overridden per fieldType or per query (see this and this).
However you can customize your result ordering using other methods: boosting, function queries, a custom analyzer per field, or even sorting.
The Solr Relevancy Cookbook wiki is a good reference.
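As a rough illustration of the boosting alternative, the two search types could share one index and one Similarity but send different field boosts per request. This is only a sketch: the field names (`name`, `description`, `brand`) and the boost values are hypothetical and not taken from the question.

```python
from urllib.parse import urlencode

# Hypothetical per-search-type field boosts; tune these for your own schema.
FIELD_BOOSTS = {
    "keyword": "name^2 description",
    "brand": "brand^10 name",
}

def build_search_params(user_query, search_type):
    """Return dismax request parameters for the given search type."""
    return {
        "defType": "dismax",
        "q": user_query,
        "qf": FIELD_BOOSTS[search_type],
        "fl": "*,score",
    }

keyword_qs = urlencode(build_search_params("running shoes", "keyword"))
brand_qs = urlencode(build_search_params("acme", "brand"))
print(keyword_qs)
print(brand_qs)
```

The resulting query strings can be appended to the select handler URL of your core. The scoring formula stays the same for both searches; only the relative weight of each field changes.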

Related

Tweaking Solr scoring function

In Solr 6.0.0, I want to tweak the scoring based on which field matches (when indexing and searching XML files). Do I have to write my own similarity class?
Although you can use a custom Similarity, that's more useful for changing things like how term popularity affects the score. Simply ranking different fields differently is typically what you use score boosts for.
q=foo:something^100 OR bar:something^10
I recommend reading https://wiki.apache.org/solr/SolrRelevancyFAQ and maybe take a look at the dismax query parser, which has a convenient notation for that kind of thing. See https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser (and the "qf" parameter)
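To make the two notations concrete, here is a small sketch that builds both the explicit boosted query and a dismax request with a `qf` parameter from the same boost table. The dismax form expresses roughly the same intent, although its scoring is not identical to a plain OR of boosted clauses. The helper names are made up for this example.

```python
from urllib.parse import urlencode

def explicit_boost_query(term, boosts):
    """Build a query like 'foo:something^100 OR bar:something^10'."""
    return " OR ".join(f"{field}:{term}^{boost}" for field, boost in boosts.items())

def dismax_params(term, boosts):
    """Build dismax parameters that put the per-field boosts in qf."""
    qf = " ".join(f"{field}^{boost}" for field, boost in boosts.items())
    return {"defType": "dismax", "q": term, "qf": qf}

boosts = {"foo": 100, "bar": 10}
print(explicit_boost_query("something", boosts))  # foo:something^100 OR bar:something^10
print(urlencode(dismax_params("something", boosts)))
```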

word proximity not working in apache solr

I am using dismax parser to boost phrase queries like following
qf=story_title^5.0+tax_payer_name+judgement_text^1.0+story_description^1.0+tax_payer_name+nature_of_the_issues+decision_summary+additional_comments+facts_of_the_case+section_number';
pf=story_title^5.0+&pf=judgement_text+story_description^1+nature_of_the_issues+decision_summary+additional_comments+facts_of_the_case+section_number';
qs=3';
ps=3';
But whenever I search for something like 54F beed registration, some results come up in which the word registration recurs many times, rather than documents containing 54F beed registration.
I read somewhere that the Solr score depends on how often a word repeats in a document.
How can we override this behavior to get the desired results in Solr?
Thanks in advance.
I don't think there's an omitTermFreq setting yet, even if it has been mentioned many times.
A possible solution is to create your own similarity class by subclassing DefaultSimilarity, and returning 1.0f as the tf value.
See Solr Custom Similarity for an example of how to implement a custom similarity class. Recent versions of Solr (4.0+) support a custom similarity class per field.
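The effect of returning a constant tf can be seen with a toy calculation. The sketch below is not Solr code: it only compares the square-root term frequency used by Lucene's classic TF-IDF similarity with a flat tf of 1.0, and it deliberately ignores idf, length norms and phrase boosts. The two sample documents are invented for illustration.

```python
import math

def classic_tf(freq):
    """Term-frequency factor of Lucene's classic similarity: sqrt(freq)."""
    return math.sqrt(freq)

def flat_tf(freq):
    """Flattened tf: any number of occurrences counts once."""
    return 1.0 if freq > 0 else 0.0

def toy_score(doc_term_counts, query_terms, tf):
    """Sum the tf factor over the query terms (idf and norms are ignored)."""
    return sum(tf(doc_term_counts.get(term, 0)) for term in query_terms)

query = ["54f", "beed", "registration"]
repetitive_doc = {"registration": 25}                        # one term, repeated often
matching_doc = {"54f": 1, "beed": 1, "registration": 1}      # all terms, once each

for name, tf in (("classic", classic_tf), ("flat", flat_tf)):
    print(name,
          toy_score(repetitive_doc, query, tf),
          toy_score(matching_doc, query, tf))
```

With the square-root tf the repetitive document outscores the one that contains all three terms; with the flat tf it no longer does.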

How to calculate score of a doc in solr?

I have a solr core named Search Stats with fields
query
search_count
number_of_clicks
added_to_cart
order_placed
I want a scoring function so that I can boost the docs according to the data.
I actually want to use this data for showing suggestions.
I have already tried sorting. Please suggest a boost function I could use instead of sorting.
You can ask Solr to return the score of every document by adding score to the fl parameter, e.g. q=colour Plus&fl=*,score
Read more on this at the link below:
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_see_the_relevancy_scores_for_search_results
Lucene/Solr provides a TF-IDF scoring strategy based on the vector space model (VSM).
The link below explains TF and the other terms in more detail:
http://www.solrtutorial.com/solr-search-relevancy.html
There are three boost methods you can use: boost, bf, and bq. Nicely laid out here. The boost function can be used generally, and the bf and bq are specific to edismax and dismax. See the solr wiki here and here. There are also index-time boosts, as described in the first wiki link above.
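As one possible starting point, an additive `bf` function could combine the counters from the question. This is a sketch under assumptions: the weights 3 and 2 are arbitrary, `query` is assumed to be the searchable field, and the local `additive_boost` helper only mirrors the arithmetic of the function query so the effect can be checked by hand.

```python
from urllib.parse import urlencode

# Arbitrary example weights: orders count most, then cart additions, then clicks.
BOOST_FUNCTION = "sum(product(order_placed,3),product(added_to_cart,2),number_of_clicks)"

def suggestion_params(prefix):
    """edismax request that adds the boost function to the text score."""
    return {
        "defType": "edismax",
        "q": prefix,
        "qf": "query",
        "bf": BOOST_FUNCTION,
        "fl": "*,score",
    }

def additive_boost(doc):
    """Plain-Python mirror of BOOST_FUNCTION for a single document."""
    return 3 * doc["order_placed"] + 2 * doc["added_to_cart"] + doc["number_of_clicks"]

print(urlencode(suggestion_params("colour")))
print(additive_boost({"order_placed": 2, "added_to_cart": 1, "number_of_clicks": 5}))  # 13
```

Because raw counts can dwarf the text relevance score, it may be worth dampening them (for example with a logarithm) once you have looked at real score values via fl=*,score.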

Adding Boost to Score According to Payload of Multivalued Field at Solr

Here is my case:
I have a field in my schema named elmo_field. I want elmo_field to hold values with payloads, e.g.
dorothy|0.46 sesame|0.37 big bird|0.19 bird|0.22
When a user searches for a keyword, e.g. dorothy, I want to add 0.46 to the usual score. If the user searches for big bird, 0.19 should be added, and if the user searches for bird, 0.22 should be added (either the payload itself or the payload multiplied by a normalization coefficient).
In other words, I will run a search against the other fields of my Solr schema and, at the same time, an exact-match search against elmo_field; if the latter matches, I will increase the score by the payload.
Any ideas?
I've implemented a custom similarity wrapper. For usual things I've used DefaultSimilarity. If a field is a payloaded field another similarity that is implemented by me is used. That similarity class just ignores payload value. I've also implemented a query parser that is a customized version of edismax. With that approach I could add payload value into the document score.
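The scoring behaviour the question asks for can be sketched outside Solr to make the target explicit. This is plain Python, not the similarity or query parser described above; the function names and the `coefficient` parameter are hypothetical.

```python
import re

# A term (possibly containing spaces), a pipe, then a numeric payload.
PAYLOAD_PATTERN = re.compile(r"\s*(.+?)\|(\d+(?:\.\d+)?)")

def parse_payloads(field_value):
    """Turn 'dorothy|0.46 big bird|0.19' into {'dorothy': 0.46, 'big bird': 0.19}."""
    return {term: float(weight) for term, weight in PAYLOAD_PATTERN.findall(field_value)}

def boosted_score(base_score, keyword, payloads, coefficient=1.0):
    """Add the payload (times a normalization coefficient) on an exact match."""
    return base_score + coefficient * payloads.get(keyword, 0.0)

payloads = parse_payloads("dorothy|0.46 sesame|0.37 big bird|0.19 bird|0.22")
print(payloads)
print(boosted_score(1.0, "big bird", payloads))
print(boosted_score(1.0, "elmo", payloads))  # no payload entry, score unchanged
```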
Have you looked at CustomScoreQuery?
There's an example with some explanation how to do this at http://dev.fernandobrito.com/2012/10/building-your-own-lucene-scorer/
You could do a boost on a query as this question suggests: How to assign a weight to a term query in Lucene/Solr
Or you could try using payloads as described here:
http://searchhub.org/2009/08/05/getting-started-with-payloads/

Developing custom facet calculations in SOLR

I'm looking into using Solr for a project where we have some specific faceting requirements. From what I've learned, Solr provides range-based facets over value ranges or date ranges, i.e. field values are "grouped" and aggregated into different bins.
I would like to do something similar, but I want to create a custom function that maps field values to my specific facets, so that each field value is evaluated using a function to see which facet it belongs to.
myFacet = myFacetMapper(fieldValue)
It's sort of a more advanced version of range facets, where values are mapped using a custom function rather than just placed into different bins.
Does anyone know if this is possible and where to start?
I would look into using SimpleFacets to implement your logic. Then you embed it inside a SearchComponent, that you can register into your solrconfig. Look at the code of FacetComponent for an example.
Create another field with value = myFacetMapper(field) , then do normal faceting on that field.
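A minimal sketch of that second approach, assuming the mapping is applied on the client before documents are sent to Solr. The mapping rules, the `price` source field and the `price_facet` field name are all hypothetical stand-ins for your own logic and schema.

```python
def my_facet_mapper(field_value):
    """Hypothetical mapping from a raw value to a facet label."""
    if field_value < 0:
        return "negative"
    if field_value % 2 == 0:
        return "even"
    return "odd"

def add_facet_field(doc, source_field, facet_field):
    """Return a copy of the document with the precomputed facet value added."""
    enriched = dict(doc)
    enriched[facet_field] = my_facet_mapper(doc[source_field])
    return enriched

doc = add_facet_field({"id": "1", "price": 42}, "price", "price_facet")
print(doc)

# Ordinary field faceting on the precomputed field at query time.
facet_params = {"q": "*:*", "facet": "true", "facet.field": "price_facet"}
print(facet_params)
```

The trade-off is that changing the mapping function requires reindexing, whereas the SearchComponent approach computes facets at query time.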
