Solr - Boosting result if query is found in a special field

I am wondering whether Solr 3.4 can boost a search result when the query is found in a particular field, without using the "fieldname:query" syntax.
Let me explain:
I have several fields in my index. One of them is named "abbreviation" and contains text like AVZ, JSP, DECT, ...
To be able to find results when searching for just "AVZ", I added a
<copyField source="abbreviation" dest="text"/>
in my schema.xml. The field text is my defaultSearchField.
This is not the best solution in my opinion. I would like to search for "AVZ" in all fields and, if the string is found in the field abbreviation, have that result boosted (its score increased) so that it is listed first in the result list. That would be equivalent to using abbreviation:AVZ AVZ as the query.
The other possibility I can think of is to analyze the query: if a substring like "AVZ" is found, abbreviation:AVZ is appended to the query. But in that case I need to know which abbreviations are indexed. Is it possible to retrieve all terms of a field from the Solr index using SolrJ?
Best Regards
Tobias

Without the fieldname:term syntax, you can define a request handler:
<requestHandler name="search" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">dismax</str>
<str name="qf">
abbreviation^2 text
</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
</lst>
</requestHandler>
This uses the dismax query parser; edismax works as well.
Matches in abbreviation are boosted, and the query stays a simple q=AVZ.
If you can only change the URL, you can boost a match on a specific field in the query itself,
e.g.
q=abbreviation:AVZ^2 text:AVZ
This boosts results with a match on abbreviation, so those documents appear at the top.
With dismax, the *:* query cannot be used in q to get all results.
However, to get all docs, simply do not pass any q param; q.alt=*:* will then return all the docs.
Otherwise, change the defType to edismax:
<requestHandler name="search" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="defType">edismax</str>
<str name="qf">
abbreviation^2 text
</str>
<str name="q.alt">*:*</str>
<str name="rows">10</str>
<str name="fl">*,score</str>
</lst>
</requestHandler>
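On the side question of finding out which abbreviations are indexed: one option is Solr's TermsComponent, which lists the indexed terms of a field. The following is a minimal sketch, not a tested configuration. The handler name /terms and the unlimited terms.limit are assumptions; abbreviation is the field from the question.

```xml
<!-- solrconfig.xml: expose the indexed terms of a field -->
<searchComponent name="terms" class="solr.TermsComponent"/>

<requestHandler name="/terms" class="solr.SearchHandler" startup="lazy">
  <lst name="defaults">
    <bool name="terms">true</bool>
    <!-- field whose terms should be listed (from the question) -->
    <str name="terms.fl">abbreviation</str>
    <!-- -1 asks for all terms instead of the default top 10 -->
    <int name="terms.limit">-1</int>
  </lst>
  <arr name="components">
    <str>terms</str>
  </arr>
</requestHandler>
```

A request to /terms should then return each indexed term with its document frequency, and SolrJ can call the same handler. Note that the terms come back as they were indexed, so if the field's analyzer lowercases text, the abbreviations are returned in lowercase.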

Apache Solr 6.4.2:
Boosting Exact phrase search not working:
solrconfig.xml:
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="defType">edismax</str>
<str name="qf">names^50</str>
<!-- <str name="df">text</str> -->
</lst>
Solr query used to test: q=(names:alex%20pandian)&wt=json&debugQuery=on
In debug mode it shows
"parsedquery_toString":"+((names:alex ((names:pandian)^50.0))) ()"
Only the terms from the second word onward are boosted. In this case only pandian is boosted, but alex is not.
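The parsed query suggests what is happening: in q=(names:alex pandian) the names: prefix applies only to alex, so pandian is a bare term that edismax searches through qf and therefore picks up the ^50 boost. If the goal is to rank exact phrase matches higher, one option is to send the plain terms (q=alex pandian) and let edismax add a phrase boost through pf. A minimal sketch, assuming the same names field; the pf boost of 100 is an arbitrary illustrative value:

```xml
<lst name="defaults">
  <int name="rows">10</int>
  <str name="defType">edismax</str>
  <!-- every query term is searched in names with this boost -->
  <str name="qf">names^50</str>
  <!-- documents where the terms occur together as a phrase in names get an extra boost -->
  <str name="pf">names^100</str>
</lst>
```

To require the exact phrase rather than merely boost it, quoting the phrase in the query, as in q=names:"alex pandian", is the usual alternative.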

AnalyzingInfixLookupFactory implementation in Solr Suggestor not returning suggestion results

My requirement is to provide automatic suggestions to users on asset names as per their project.
I have tried using AnalyzingInfixLookupFactory and BlendedInfixLookupFactory, as these are the only ones that support context filtering.
But no suggestion results are being returned.
Below is an extract from solrconfig.xml:
<searchComponent name="suggest" class="solr.SuggestComponent">
<lst name="suggester">
<str name="name">mySuggester</str>
<str name="lookupImpl">AnalyzingInfixLookupFactory</str>
<str name="dictionaryImpl">DocumentDictionaryFactory</str>
<str name="field">assetname_str</str>
<str name="indexPath">/home/suggest_index</str>
<str name="contextField">projectid</str>
<str name="weightField">weight</str>
<str name="suggestAnalyzerFieldType">string</str>
<str name="buildOnStartup">false</str>
<str name="buildOnCommit">false</str>
</lst>
</searchComponent>
<requestHandler name="/suggest" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="suggest">true</str>
<str name="suggest.count">10</str>
<str name="suggest.dictionary">mySuggester</str>
</lst>
<arr name="components">
<str>suggest</str>
</arr>
</requestHandler>
However, if I use FuzzyLookupFactory as the lookupImpl, suggestion results are returned as expected (but the problem is that FuzzyLookupFactory does not support context filtering).
URL used:
http://ipaddress:port/solr/collection_name/suggest?suggest=true&suggest.build=true&suggest.dictionary=mySuggester&wt=json&suggest.q=Com&suggest.cfq=1234
(I know this is an old issue, but in case others stumble across it with the same problem...)
I spent a couple of days dealing with the same empty results. You don't say what the type of the field is that you're using as material for suggestions. You've got suggestAnalyzerFieldType set to string.
In many out-of-the-box schema.xml examples, string is a fieldType with no analysis. A key concept, which is only vaguely hinted at in the Solr manual's Suggester doc, is that for lookupImpls like AnalyzingInfixLookupFactory and BlendedInfixLookupFactory, suggestAnalyzerFieldType does not have to be the type of the field from which you are generating suggestions. Rather, it needs to be a type that contains the analyzer elements needed for suggestions, such as solr.WhitespaceTokenizerFactory.
In my case, I was trying to suggest from a multivalued string field, and I wanted the field to have no tokenization. But until I changed the suggestAnalyzerFieldType from string to text_ws (a fieldType whose analyzer is only solr.WhitespaceTokenizerFactory), I got empty results.
For what it's worth, if you use a multivalued string field for suggestions and many documents contain the same string values in that field, then BlendedInfixLookupFactory seems to produce a better result with no duplicate suggestions.
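The change described above can be sketched as follows. The text_ws definition mirrors the one found in many example schemas; the suggester is the one from the question with only suggestAnalyzerFieldType changed. Whether this alone fixes the asker's setup is an assumption, not something verified here.

```xml
<!-- schema.xml: a field type whose analyzer only splits on whitespace -->
<fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  </analyzer>
</fieldType>

<!-- solrconfig.xml: suggester from the question, analyzer type switched to text_ws -->
<searchComponent name="suggest" class="solr.SuggestComponent">
  <lst name="suggester">
    <str name="name">mySuggester</str>
    <str name="lookupImpl">AnalyzingInfixLookupFactory</str>
    <str name="dictionaryImpl">DocumentDictionaryFactory</str>
    <str name="field">assetname_str</str>
    <str name="indexPath">/home/suggest_index</str>
    <str name="contextField">projectid</str>
    <str name="weightField">weight</str>
    <str name="suggestAnalyzerFieldType">text_ws</str>
    <str name="buildOnStartup">false</str>
    <str name="buildOnCommit">false</str>
  </lst>
</searchComponent>
```

Since buildOnStartup and buildOnCommit are both false, the suggester has to be rebuilt after the change, for example with suggest.build=true as in the URL from the question.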

Autocomplete term suggestion as per popularity

I have implemented autocomplete term suggestion in my MVC application. Let me explain how I have done this. I created a table in the database with the following columns:
Id SearchTerm CatID ResultCount Clicks Latency TermSearchTime
Whenever a user searches for a term, we store it in this table. The next time the same word is matched, we display it as a term suggestion. Moreover, we order the suggestions by term popularity: the word that has been searched most often is displayed first.
Now I also want to provide term suggestions for misspelled terms. For example, Samsung is already in my table. If someone searches for samsng, Samsung should appear among the suggestions.
As I do not know how to do spell checking in SQL Server, I decided to do it using Solr.
How can I do this using Solr while keeping the behaviour I have implemented with the SQL database? Please note that I fetch the search results from Solr, and I have already indexed all products. Do I need to index the search terms as well?
Any help is appreciated. Thanks.
Check whether the following is in your solrconfig.xml file; it enables the spellcheck handler.
<requestHandler name="/spell" class="solr.SearchHandler" startup="lazy">
<lst name="defaults">
<str name="df">text</str>
<!-- Solr will use suggestions from both the 'default' spellchecker
and from the 'wordbreak' spellchecker and combine them.
collations (re-written queries) can include a combination of
corrections from both spellcheckers -->
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck.dictionary">wordbreak</str>
<str name="spellcheck">on</str>
<str name="spellcheck.extendedResults">true</str>
<str name="spellcheck.count">10</str>
<str name="spellcheck.alternativeTermCount">5</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.collateExtendedResults">true</str>
<str name="spellcheck.maxCollationTries">10</str>
<str name="spellcheck.maxCollations">5</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
If it is not present, copy and paste it into your file, restart Solr, and try /spell?q=ipad
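The /spell handler above only works if a spellcheck search component with dictionaries named default and wordbreak is defined as well. Here is a minimal sketch modelled on the stock example configuration, assuming Solr 4 or later. The field name text and the field type text_general are assumptions and should be replaced by a field that holds the terms to suggest from (for example the product names).

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <!-- field type whose analyzer is applied to the query before checking (assumed name) -->
  <str name="queryAnalyzerFieldType">text_general</str>

  <!-- suggests corrections directly from the terms indexed in the given field -->
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="field">text</str>
    <str name="classname">solr.DirectSolrSpellChecker</str>
    <int name="maxEdits">2</int>
    <int name="minPrefix">1</int>
  </lst>

  <!-- suggests corrections that join or split words -->
  <lst name="spellchecker">
    <str name="name">wordbreak</str>
    <str name="classname">solr.WordBreakSolrSpellChecker</str>
    <str name="field">text</str>
    <str name="combineWords">true</str>
    <str name="breakWords">true</str>
    <int name="maxChanges">10</int>
  </lst>
</searchComponent>
```

Because the default checker draws its corrections from the indexed terms, a query such as samsng can only be corrected to Samsung if that word occurs in the chosen field. If the popularity ordering from the SQL table matters, indexing the search terms into their own field is one way to make them available to Solr.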

Solr edismax qf and pf defaults not working to boost fields

I am attempting to set up a request handler that will boost certain fields by different amounts. I have the following request handler.
<requestHandler name="/select" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="echoParams">explicit</str>
<str name="start">0</str>
<int name="rows">10</int>
<str name="defType">edismax</str>
<str name="qf">
title^50.0 searchTitle^7.0 keywords^5.0 content^1.0 text^1.0
</str>
<str name="pf">
title^50.0 searchTitle^7.0 keywords^5.0 content^1.0 text^1.0
</str>
<str name="df">text</str>
</lst>
</requestHandler>
However, the fields aren't being boosted correctly, if at all. I noticed that documents with the search term in the title field aren't appearing any higher than documents with the search term in the text field. Arbitrarily re-arranging the weights produces the same document order each time.
When I go into the solr web interface/admin UI and do a search I get the same results. However, if I explicitly check the edismax checkbox and enter the field-boost data in the qf and pf boxes I get the results and the weighting I would expect.
In fact, I also just tried changing the rows value to 5 and still received the same result. It looks like my queries aren't being handled by the /select handler, even though that is what I choose both in the solr Admin UI and when I create the HttpSolrServer object to do the queries from the server.
I am using solr v4.8.0.
Any help would be appreciated.
Check the requestDispatcher setting in solrconfig.xml:
<requestDispatcher handleSelect="false" >
If you want to use select as a request handler, this needs to be
<requestDispatcher handleSelect="true" >

Boost score from schema

I have a fieldType named double_score. The values here are all precomputed and fit in a double format. I would like to use this score to boost the associated values so that Solr returns values in this order. Moreover, I'd like to do this from just the schema. This last clause seems to be the one that is tripping up my searching / configuring fu.
Thanks.
EDIT: (dismax)
<requestHandler name="default" class="solr.SearchHandler" default="true">
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<str name="qf">name</str>
<str name="bq">double_score</str>
<str name="debug">true</str>
<str name="q.alt">*:*</str>
</lst>
</requestHandler>
Use the sort parameter if you would like your results to be sorted according to your double_score field.
You can see here how to sort by your field: http://wiki.apache.org/solr/CommonQueryParameters#sort
If you want this to be set in your configuration rather than per request, add sort as a default parameter of the request handler in solrconfig.xml:
<requestHandler name="default" class="solr.StandardRequestHandler" default="true">
<lst name="defaults">
<!-- sort needs a direction; desc puts the highest double_score first -->
<str name="sort">double_score desc</str>
</lst>
</requestHandler>
If "returns values by this order" means a simple sort, go with Dorin's answer.
But to boost results based on fields (you may take several fields into consideration) , see this: http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_make_.22superman.22_in_the_title_field_score_higher_than_in_the_subject_field
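For boosting rather than sorting, note that bq expects a query, so a bare field name there is unlikely to do what is intended. The dismax parser's bf parameter takes a function instead, and field(...) is a function that yields the field's value. A minimal sketch based on the handler from the question, assuming double_score is an indexed numeric field with one value per document:

```xml
<requestHandler name="default" class="solr.SearchHandler" default="true">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="echoParams">explicit</str>
    <int name="rows">10</int>
    <str name="qf">name</str>
    <!-- bf adds the value of this function to the relevance score -->
    <str name="bf">field(double_score)</str>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>
```

Because bf is additive, its effect depends on how large double_score is compared with the text relevance score, so some scaling may be needed. This lives in solrconfig.xml rather than in the schema.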

Solr Minimum Match not working?

In my solrconfig.xml I specify an mm of 100%, yet searches with multiple terms still show results that match only some of the search terms. If I explicitly put a + in front of each term, the desired behavior is achieved, but for obvious reasons I don't want the user to have to enter the +'s.
Also, I have tried several variations of the mm parameter, and none of them seem to achieve what I am after. Below is the entire request handler:
<requestHandler name="dismax" class="solr.SearchHandler" >
<lst name="defaults">
<str name="defType">dismax</str>
<str name="echoParams">explicit</str>
<float name="tie">0.01</float>
<str name="qf">
body^0.5 subject^3.0 from^10.0 to^7.0
</str>
<str name="mm">
100%
</str>
<int name="ps">100</int>
<str name="q.alt">*:*</str>
</lst>
</requestHandler>
What am I doing wrong?
I've answered my own question. The xml config above is fine. I was passing a boost parameter to the query that looked something like
{!boost b=<some boost>}
And that was causing the dismax handler to parse the query differently, thereby ignoring the mm.
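One way to keep both the boost and mm is to let the query parser apply the boost itself instead of wrapping the query in {!boost ...}. The edismax parser has a boost parameter whose function value is multiplied into the score. A minimal sketch, assuming your Solr version provides edismax and that switching from dismax is acceptable. The field popularity is a hypothetical numeric field standing in for the boost function that is not shown above.

```xml
<requestHandler name="dismax" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">edismax</str>
    <str name="echoParams">explicit</str>
    <float name="tie">0.01</float>
    <str name="qf">body^0.5 subject^3.0 from^10.0 to^7.0</str>
    <str name="mm">100%</str>
    <!-- multiplicative boost; "popularity" is a placeholder field name -->
    <str name="boost">popularity</str>
    <str name="q.alt">*:*</str>
  </lst>
</requestHandler>
```

The user query is then sent as a plain q=term1 term2, so mm is applied by the same parser that applies the boost. Since the boost is multiplicative, a function that can return 0 would zero out the score of matching documents.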
