I have a few questions regarding the use of AnalyzingInfixLookupFactory as the lookupImpl in the Solr Suggester component.
ref: https://cwiki.apache.org/confluence/display/solr/Suggester
1.) Since the AnalyzingInfixLookupFactory lookupImpl builds a Lucene index for suggestions by generating n-grams for the suggestion field, is it still the right choice if the main index used for search already has n-grams for the same field?
2.) Also, AnalyzingInfixLookupFactory returns duplicate results if multiple documents have the same value in the suggestion field. How can this be handled? Choosing an FST-based suggester would prevent the duplicates, but it can't support infix search.
Related
I'm using Solr's SpellCheckComponent with IndexBasedSpellChecker. I'm wondering if there's a way to get an output of all the words in the dictionary.
It might help us catch some of the misspellings on our site.
Yes, there is. According to the documentation: "The IndexBasedSpellChecker uses a Solr index as the basis for a parallel index used for spell checking. It requires defining a field as the basis for the index terms."
So it just uses one field that you choose from the index. To enumerate all terms in a field, use the Terms component and set terms.fl to that field. If you have lots of terms, you can page through them with terms.lower, terms.limit and terms.upper to get the data in multiple calls.
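The paging idea above can be sketched as a small URL builder. This is a minimal sketch, not a definitive client: it assumes a /terms request handler is configured on the core, and the base URL, the core name `core1` and the field name `spell` are hypothetical placeholders. Each call after the first passes the last term of the previous page as terms.lower, with terms.lower.incl=false so that term is not returned twice.

```python
from urllib.parse import urlencode


def terms_page_url(base_url, field, lower=None, limit=100):
    """Build a Terms component URL for one page of terms from `field`.

    `base_url` is assumed to point at a core that exposes a /terms handler.
    `lower` is the last term seen on the previous page (None for the first page).
    """
    params = {
        "terms": "true",
        "terms.fl": field,        # field whose terms are enumerated
        "terms.limit": limit,     # page size
        "terms.sort": "index",    # lexicographic order, so paging is stable
        "wt": "json",
    }
    if lower is not None:
        params["terms.lower"] = lower
        params["terms.lower.incl"] = "false"  # skip the term we already have
    return f"{base_url}/terms?{urlencode(params)}"


# Hypothetical core and field names, for illustration only.
first_page = terms_page_url("http://localhost:8983/solr/core1", "spell")
next_page = terms_page_url("http://localhost:8983/solr/core1", "spell", lower="apple")
```

Repeating the call until a page comes back with fewer than `limit` terms walks the whole dictionary field.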
I am using the dismax parser to boost phrase queries, with the following parameters:
qf=story_title^5.0+tax_payer_name+judgement_text^1.0+story_description^1.0+tax_payer_name+nature_of_the_issues+decision_summary+additional_comments+facts_of_the_case+section_number
pf=story_title^5.0+&pf=judgement_text+story_description^1+nature_of_the_issues+decision_summary+additional_comments+facts_of_the_case+section_number
qs=3
ps=3
But whenever I search for something like 54F beed registration, some results come up in which the word registration recurs many times, rather than results that contain 54F beed registration.
I read somewhere that the Solr score depends on how often a word repeats in a document.
How can we override this behavior to get the desired results in Solr?
Thanks in advance.
I don't think there's an omitTermFreq setting yet, even if it has been mentioned many times.
A possible solution is to create your own similarity class by subclassing DefaultSimilarity, and returning 1.0f as the tf value.
See Solr Custom Similarity for an example of how to implement a custom similarity class. Recent versions of Solr (4.0+) support a custom similarity class per field.
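To see why a constant tf helps here, the following toy calculation compares the two behaviours. It is only an arithmetic illustration in Python, not Solr code: the real change is a Java subclass of DefaultSimilarity, and this sketch deliberately ignores idf, field norms, coord and field boosts. The one fact it relies on is that DefaultSimilarity computes tf as the square root of the term frequency; the documents and counts are made up.

```python
import math


def sqrt_tf(freq):
    # Shape of DefaultSimilarity's tf: grows with repetition.
    return math.sqrt(freq) if freq > 0 else 0.0


def flat_tf(freq):
    # Custom similarity idea from the answer: tf is 1.0 once the term occurs.
    return 1.0 if freq > 0 else 0.0


def toy_score(query_terms, doc_term_freqs, tf):
    """Sum of per-term tf values; every other scoring factor is left out."""
    return sum(tf(doc_term_freqs.get(term, 0)) for term in query_terms)


query = ["54f", "beed", "registration"]
repetitive_doc = {"registration": 16}                        # one term, repeated
matching_doc = {"54f": 1, "beed": 1, "registration": 1}      # all terms, once each

# With sqrt tf the repetitive document outscores the full match (4.0 vs 3.0).
sqrt_scores = (toy_score(query, repetitive_doc, sqrt_tf),
               toy_score(query, matching_doc, sqrt_tf))

# With flat tf the full match wins (1.0 vs 3.0).
flat_scores = (toy_score(query, repetitive_doc, flat_tf),
               toy_score(query, matching_doc, flat_tf))
```

In a real index the other factors also influence the ranking, so this only shows the direction of the effect, not actual Solr scores.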
I have a Solr core named Search Stats with the following fields:
query
search_count
number_of_clicks
added_to_cart
order_placed
I want a scoring function so that I can boost the docs according to this data.
I actually want to use this data for showing suggestions.
I have already tried sorting. Please suggest a boost function I could use instead of sorting.
You can ask Solr to return the score of every document by adding score to the field list, for example q=colour Plus&fl=*,score
Read more on this at the link below:
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_see_the_relevancy_scores_for_search_results
Lucene/Solr provides a TF-IDF (VSM-based) scoring strategy.
The link below explains more about TF and the other terms:
http://www.solrtutorial.com/solr-search-relevancy.html
There are three boost methods you can use: boost, bf, and bq. Nicely laid out here. The boost function can be used generally, and the bf and bq are specific to edismax and dismax. See the solr wiki here and here. There are also index-time boosts, as described in the first wiki link above.
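As one possible starting point for the Search Stats core, the sketch below builds edismax request parameters that use the multiplicative boost parameter with a function over the stat fields. It is a sketch under assumptions: the weights 2 and 3, the choice of log damping, the base URL and the core name `search_stats` are hypothetical, and the field names are taken from the question. Each field is wrapped in sum(1, field) so that a count of zero does not produce log(0), and the outer sum(1, ...) keeps the multiplier at 1 or above.

```python
from urllib.parse import urlencode

# Function query: later funnel stages get larger (hypothetical) weights.
BOOST_FUNCTION = (
    "sum(1,"
    "log(sum(1,search_count)),"
    "log(sum(1,number_of_clicks)),"
    "product(2,log(sum(1,added_to_cart))),"
    "product(3,log(sum(1,order_placed))))"
)


def suggestion_params(user_text):
    """Request parameters for an edismax query boosted by the stat fields."""
    return {
        "defType": "edismax",
        "q": user_text,
        "qf": "query",            # match against the stored query text
        "boost": BOOST_FUNCTION,  # multiplied into the relevancy score
        "fl": "*,score",          # return the score to inspect the effect
    }


def suggestion_url(base_url, user_text):
    return f"{base_url}/select?{urlencode(suggestion_params(user_text))}"


# Hypothetical base URL and core name.
example_url = suggestion_url("http://localhost:8983/solr/search_stats", "colour Plus")
```

Passing the same function through bf instead would add it to the score rather than multiply, which usually needs more tuning against the text score.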
I'm working with Solr 4.1 and have it working correctly. I have documents that contain the term "school", and searching for "school" returns them correctly. However, I want Solr to include these results when searching for "schools" as well.
So basically I want Solr to include singular terms in its search results when searching for plural terms.
Any idea how to do it?
You need to apply stemming to your indexed fields in order to achieve this behavior. Please see the Stemming Wiki Page for a complete explanation and an example fieldType to support stemming.
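The reason stemming solves this is that the same reduction is applied both when documents are indexed and when the query is analyzed, so "school" and "schools" end up as the same term. The toy below illustrates only that principle: `toy_stem` is a deliberately naive plural stripper made up for this example, not the algorithm of any Solr stemming filter, and the sample documents are invented.

```python
def toy_stem(token):
    """Naive plural stripper, for illustration only (not a real stemmer)."""
    token = token.lower()
    if len(token) > 3 and token.endswith("s") and not token.endswith("ss"):
        return token[:-1]
    return token


def analyze(text):
    # The same analysis chain is used at index time and at query time.
    return [toy_stem(token) for token in text.split()]


def build_index(docs):
    """Map each stemmed term to the set of document ids containing it."""
    index = {}
    for doc_id, text in docs.items():
        for term in analyze(text):
            index.setdefault(term, set()).add(doc_id)
    return index


def search(index, query):
    hits = set()
    for term in analyze(query):
        hits |= index.get(term, set())
    return hits


docs = {1: "The school opens Monday", 2: "Two schools closed"}
index = build_index(docs)
plural_hits = search(index, "schools")    # matches both documents
singular_hits = search(index, "school")   # matches both documents
```

In Solr the equivalent is a stemming filter in the fieldType's analyzer, applied to both the index and query analyzers, followed by a reindex.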
Is it possible in Solr 1.4 to specify which similarity class to use for each search within a single index?
Let's say I have two types of search (keyword and brand). For keyword search, I want to use the DefaultSimilarity class. But for brand search, I want to use my CustomSimilarity class.
I've been modifying schema.xml to specify a single similarity class, but I now have a requirement to use two different similarity classes.
I'll be glad to hear your thoughts on this.
Thanks in advance.
AFAIK the Similarity can only be defined at the schema/index level and can't be overridden per fieldType or per query (see this and this).
However you can customize your result ordering using other methods: boosting, function queries, a custom analyzer per field, or even sorting.
The Solr Relevancy Cookbook wiki is a good reference.