Solr Boosting specific field values - solr

I'm trying to boost the score for documents returned from a search in solr.
The boost I want to achieve is something along the lines of:
field1:(value1)^5 OR field2:(value2)^2
If a document has field1 matching value1, boost it by 5.
If a document has field2 matching value2, boost it by 2.
The documents have many fields, let's call them field1, field2... and may be missing certain fields.
The documents do not need to have field1 or field2 matching value1, value2 respectively.
I have other filter queries such as:
fq: field1:[* TO *] <- checking for presence of field1
fq: field3: ("something" "somethingelse")
fq: field4: 1
I am grouping my results by a certain field not being used in any of the queries.
Raw query parameters:
group=true&group.facet=true&group.field=anIndependentField
I am using the same fq's with each of the query parsers I have tried.
There are enough documents in solr with field1:value1 and/or field2:value2 as well as other values for those fields.
So far I've tried using the query parsers:
Standard Query Parser
method a) q: field1:(value1)^5 OR field2:(value2)^2 // no results
method b) q: *:* OR field1:(value1)^5 OR field2:(value2)^2 // no results
method c) q: (value1)^5 OR (value2)^2 // incorrect. looks for complete match.
method d) q: (value1)^5 (value2)^2 // incorrect. looks for complete match
EDisMax Query Parser
(defType=edismax)
q: *:*
bq: field1:(value1)^5 OR field2:(value2)^2
Problem with this one is that results are not in expected order.
A document that has field1:somethingElse and field2:somethingElse2 got a higher score than a document that has field1: somethingElse and field2:value2.
Can anyone see what I'm doing wrong or has a suggestion to improve the relevancy of my search queries?

You can use the bf parameter of eDismax queryParser in the following way:
bf=if(termfreq(field1,"value1"),5,if(termfreq(field2,"value2"),2,1))
Please find below the complete query.
https://<MY_SERVER_NAME>:9443/solr/<MY_COLLECTION>/select?q=*%3A*&wt=json&indent=true&defType=edismax&bf=if(termfreq(field1%2C%22value1%22)%2C5%2Cif(termfreq(field2%2C%22value2%22)%2C2%2C1))
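As a rough sketch of how a request with the bf expression above could be assembled from code, the Python below builds the eDisMax URL with the standard library and models the nested if() in plain Python. The host, the collection name, and the helper names build_boost_url and modeled_boost are placeholders invented for illustration. The model only mirrors the if/termfreq logic; it does not reproduce Solr's actual scoring.

```python
from urllib.parse import urlencode

# Function query from the answer: 5 if field1 contains value1,
# otherwise 2 if field2 contains value2, otherwise 1.
BOOST_FUNCTION = 'if(termfreq(field1,"value1"),5,if(termfreq(field2,"value2"),2,1))'


def build_boost_url(base_url, boost_function=BOOST_FUNCTION):
    """Return a select URL that matches all documents and adds a bf boost.

    base_url is a hypothetical collection URL, for example
    "https://example.invalid:9443/solr/my_collection".
    """
    params = {
        "q": "*:*",
        "wt": "json",
        "indent": "true",
        "defType": "edismax",
        "bf": boost_function,
    }
    # urlencode percent-encodes the quotes, commas and parentheses.
    return base_url.rstrip("/") + "/select?" + urlencode(params)


def modeled_boost(doc):
    """Plain-Python model of the nested if(); doc maps a field name to a list of values."""
    if "value1" in doc.get("field1", []):
        return 5
    if "value2" in doc.get("field2", []):
        return 2
    return 1


url = build_boost_url("https://example.invalid:9443/solr/my_collection")
print(url)
print(modeled_boost({"field1": ["somethingElse"], "field2": ["value2"]}))
```

Because the if() calls are nested, a document that matches both fields receives 5 rather than 7. If cumulative boosts are wanted, adding two separate if() terms with sum() would be one option to try.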

Related

Why does Dismax's bq (Boost Query) parameter filter results instead of just boosting them?

I'm using Solr 8.11.2. In order to boost documents with certain field values, I'm using Dismax's bq (Boost Query) parameter.
From what I've read, this should only influence the score of the search results returned by the rest of the query. What I see happening is that it filters out all search results that don't have the field I'm boosting.
I'm using the following query, which returns all documents containing both words procedure and maintenance:
q=((+procedure+maintenance))&rows=10&start=0&wt=xml&q.op=AND&fl=id,score,alias,author,hash,collection,label,url,lastModified,path,extension,objectId,objectDtType,title,DocumentPK_s,Taal_s,Site_s,SharePointId_s&hl=true&hl.qt=highlightRH&hl.fl=content,description,label&hl.snippets=5&defType=dismax&bf=recip(max(0,ms(NOW-3MONTH,creationDate)),3.16e-11,1,1)&pf=content&sort=score DESC
But as soon as I append &bq=language:english^10000, which is supposed to boost documents where the field language is set to english, all documents where the field language doesn't exist are no longer part of the results.
Am I misunderstanding how this parameter is supposed to work? Is it a side effect?

Match all documents excluding some terms using full Lucene syntax

Our service's default search web page uses the * full Lucene query to match all documents. This is before the user has provided any search terms. There is some data (test data, in our case) that we want to exclude from the search result.
Is it possible to match all documents but exclude a subset of all documents?
For example, suppose we have an "owners" field and we want to exclude documents with the "testA" and "testB" owner. The following query does not seem to work with the match all approach:
Query: search=* -owners:testA -owners:testB&queryType=full&$orderby=created desc
Error: "Failed to parse query string. See https://aka.ms/azure-search-full-query for supported syntax."
When searching for anything but *, this approach works fine. For example:
Query: search=foo -owners:testA -owners:testB&queryType=full&$orderby=created desc
Result: (many documents matched)
I have considered a $filter for this and using $filter=filterableOwners/all(p: p ne 'testa' and p ne 'testb') but this has the following drawbacks:
the index must be rebuilt with a filterable field
analyzers can't be used, so case-insensitivity must be implemented by lowercasing both the values and the filter expression
Ideally this could be done using only the search query parameter with a Lucene query text.
I found a workaround for the issue. If you have a field in your documents that always has a value, you can use a .* regex to match all values in the field and therefore match all documents.
For example, suppose the packageId field has a value for all documents.
Incorrect (as posted in the original question):
Query: search=* -owners:testA -owners:testB&queryType=full&$orderby=created desc
Correct:
Query: search=packageId:/.*/ -owners:testA -owners:testB&queryType=full&$orderby=created desc
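A minimal sketch of the workaround in Python, assuming a field (packageId here) that has a value in every document. The helper name match_all_except is invented for illustration, and the sketch does not escape special Lucene characters in the excluded terms.

```python
def match_all_except(always_present_field, excluded):
    """Build a full-Lucene query that matches every document except the excluded terms.

    always_present_field: a field assumed to hold a value in every document.
    excluded: iterable of (field, term) pairs to exclude; terms are not escaped here.
    """
    # The /.*/ regex on an always-populated field stands in for "match all".
    clauses = [f"{always_present_field}:/.*/"]
    clauses += [f"-{field}:{term}" for field, term in excluded]
    return " ".join(clauses)


search = match_all_except("packageId", [("owners", "testA"), ("owners", "testB")])

# Query parameters as they appear in the question; how they are sent is left out.
params = {"search": search, "queryType": "full", "$orderby": "created desc"}
print(params["search"])
```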

How does Solr process the query string when using edismax qf parameter and specify field in query

All:
[UPDATE]
After reading the debug explain, it seems that the qf will expand only
the keywords without specifying field.
===================================================================
While learning to use the edismax query parser, I read that the qf parameter is:
Query Fields: specifies the fields in the index on which to perform
the query. If absent, defaults to df.
Its purpose is to combine the query terms with every field listed.
However, if we already specify a field in the query (the q parameter), what happens when I specify a different field in qf?
For example:
q=title:epic
defType=edismax
qf=content
Could anyone explain how Solr interprets this query?
Thanks
When you specify qf it means you want solr to search for whatever is in the "q" field in these "qf" fields. So, your first and third line contradict each other:
q=title:epic
defType=edismax
qf=content
If you want to search for any document where the content field contains anything matching your search terms, put these search terms as tokens in "q", separated by +OR+,
like this:
q=I+OR+like+OR+books+OR+and+OR+games
defType=edismax
qf=content
With q=title:epic you have already set the query field to title, so qf should not be set to "content"; in that case you will get no results. Either leave the qf parameter empty or set it to "title".
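The unfielded form of the query can be sketched in Python as follows. The helper name build_qf_params is hypothetical; the snippet only builds and encodes the parameters and does not contact Solr.

```python
from urllib.parse import urlencode


def build_qf_params(terms, query_field):
    """Join unfielded terms with OR and let qf name the field to search."""
    return {
        "q": " OR ".join(terms),
        "defType": "edismax",
        "qf": query_field,
    }


params = build_qf_params(["I", "like", "books", "and", "games"], "content")
# urlencode turns each space into "+", giving the +OR+ form used in the answer.
query_string = urlencode(params)
print(query_string)
```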

Boosting documents when the numeric search value lands between two fields of the document

Is there a way to boost documents for which a search value A lands between field X and field Y? I only want to boost them; I don't want to filter out the non-matches.
I know that if I would like to filter I could go the route:
fieldX:[* TO A] AND fieldY:[A TO *]
But I don't want to filter; I just want to boost these documents. Documents where A is not between X and Y should still be considered, just ranked lower.
Is a custom function the way to go?
If you are using Solr 4.9 or later, you can use re-ranking with a query like:
q=yourRequest&rq={!rerank reRankQuery=$rqq reRankDocs=1000 reRankWeight=3}&rqq=(fieldX:[* TO A] AND fieldY:[A TO *])
This will reRank your first 1000 documents using your subquery.
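A small Python sketch of how these re-rank parameters could be generated for a given value A. The function name rerank_params and its defaults are illustrative assumptions; only the rq and rqq strings follow the query shown in the answer.

```python
def rerank_params(main_query, value, low_field="fieldX", high_field="fieldY",
                  rerank_docs=1000, rerank_weight=3):
    """Build q, rq and rqq so that documents with low_field <= value <= high_field
    are re-ranked higher within the top rerank_docs results."""
    rerank_query = f"({low_field}:[* TO {value}] AND {high_field}:[{value} TO *])"
    return {
        "q": main_query,
        # Doubled braces produce literal { and } around the local params.
        "rq": (f"{{!rerank reRankQuery=$rqq reRankDocs={rerank_docs} "
               f"reRankWeight={rerank_weight}}}"),
        "rqq": rerank_query,
    }


params = rerank_params("yourRequest", 42)
for name, value in params.items():
    print(f"{name}={value}")
```

Only the top reRankDocs documents of the main query are re-scored, so a matching document ranked below that cutoff keeps its original position.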

Solr filter queries and boosting

Is it possible to boost fields that appear in filter queries (fq=) in Solr?
I have a faceted query that has a tagged filter query something like this:
...&q=*:*&fq={!tag=X}brand:(+"4911")+OR+body:(abc)&facet.field={!ex=X}brand&..
(I facet on brand and the facet is set to ignore the filter query tagged X, so I need to use a filter query.)
I would like to make matches on the brand field score higher than matches on body field in the filter query.
The fields brand and body are multivalued.
I've tried adding bf=/bq= arguments, and I can get brand matches to score higher if I change the filter query to be the main 'q=' query, but I don't seem to be able to influence the score of anything in the filter query. I think I may be going about it the wrong way.
Thanks.
Solr "fq"'s do not affect score -- see the wiki. So, you should add your queries to "q" that you actually want to boost. If need be, you can always duplicate a query restriction in both "q" and "fq", as "fq" only acts as a restriction on the results set.
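One way to picture the suggestion is the Python sketch below, which writes the same restriction twice: once in q with a boost on the brand clause, and once in fq so the tagged filter still drives faceting. The helper name boosted_restriction and the boost value 5 are assumptions made for illustration, and the tag and exclusion are written in the key=value local-params form ({!tag=X}, {!ex=X}).

```python
def boosted_restriction(clauses, tag="X", facet_field="brand"):
    """clauses: list of (field, expression, boost) tuples; boost may be None.

    Returns parameters where q carries the boosts (and so affects the score)
    and fq repeats the same restriction without boosts, tagged for faceting.
    """
    q_parts, fq_parts = [], []
    for field, expression, boost in clauses:
        base = f"{field}:({expression})"
        fq_parts.append(base)
        q_parts.append(f"{base}^{boost}" if boost is not None else base)
    return {
        "q": " OR ".join(q_parts),
        "fq": f"{{!tag={tag}}}" + " OR ".join(fq_parts),
        "facet": "true",
        "facet.field": f"{{!ex={tag}}}{facet_field}",
    }


params = boosted_restriction([("brand", '"4911"', 5), ("body", "abc", None)])
for name, value in params.items():
    print(f"{name}={value}")
```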

Resources