Why does Dismax's bq (Boost Query) parameter filter results instead of just boosting them? - solr

I'm using Solr 8.11.2. In order to boost documents with certain field values, I'm using Dismax's bq (Boost Query) parameter.
From what I've read, this should only influence the score of the search results returned by the rest of the query. What I see happening is that it filters all search results that don't have the field I'm boosting.
I'm using the following query, which returns all documents containing both words procedure and maintenance:
q=((+procedure+maintenance))&rows=10&start=0&wt=xml&q.op=AND&fl=id,score,alias,author,hash,collection,label,url,lastModified,path,extension,objectId,objectDtType,title,DocumentPK_s,Taal_s,Site_s,SharePointId_s&hl=true&hl.qt=highlightRH&hl.fl=content,description,label&hl.snippets=5&defType=dismax&bf=recip(max(0,ms(NOW-3MONTH,creationDate)),3.16e-11,1,1)&pf=content&sort=score DESC
But as soon as I append &bq=language:english^10000, which is supposed to boost documents where the field language is set to english, all documents where the field language doesn't exist are no longer part of the results.
Am I misunderstanding how this parameter is supposed to work? Is it a side effect?

Related

How to see Solr explain for a document not returned by Solr query

I am using Solr's explain to debug my Solr query. I can see explain results for everything that Solr query returns, but not for the documents the query has not returned.
There are documents which I think should be returned by a query but are not.
I want to see how the Solr score is calculated for those documents to be able to compare with other documents.
I was able to find the answer to this question.
There's a query parameter called explainOther. You can specify a query in this parameter and on top of the explain you will get for the matched queries, now Solr will show you the explain for any record that matches this new explainOther query as well.
Here is the explanation of that parameter from Solr reference guide:
The explainOther Parameter
(From: https://lucene.apache.org/solr/guide/6_6/common-query-parameters.html#CommonQueryParameters-TheexplainOtherParameter)
The explainOther parameter specifies a Lucene query in order to identify a set of documents. If this parameter is included and is set to a non-blank value, the query will return debugging information, along with the "explain info" of each document that matches the Lucene query, relative to the main query (which is specified by the q parameter). For example:
q=supervillians&debugQuery=on&explainOther=id:juggernaut
The query above allows you to examine the scoring explain info of the top matching documents, compare it to the explain info for documents matching id:juggernaut, and determine why the rankings are not as you expect.
The default value of this parameter is blank, which causes no extra "explain info" to be returned.

Match all documents excluding some terms using full Lucene syntax

Our service's default search web page uses the * full Lucene query to match all documents. This is before the user has provided any search terms. There is some data (test data, in our case) that we want to exclude from the search result.
Is it possible to match all documents but exclude a subset of all documents?
For example, suppose we have an "owners" field and we want to exclude documents with the "testA" and "testB" owner. The following query does not seem to work with the match all approach:
Query: search=* -owners:testA -owners:testB&queryType=full&$orderby=created desc
Error: "Failed to parse query string. See https://aka.ms/azure-search-full-query for supported syntax."
When searching for anything but *, this approach works fine. For example:
Query: search=foo -owners:testA -owners:testB&queryType=full&$orderby=created desc
Result: (many documents matched)
I have considered a $filter for this and using $filter=filterableOwners/all(p: p ne 'testa' and p ne 'testb') but this has the following drawbacks:
the index must be rebuild with a filterable field
analyzers can't be used so case-insensitivity must be implemented by lowercasing the values and filter expression
Ideally this could be done using only the search query parameter with a Lucene query text.
I found a workaround for the issue. If you have a field in your documents that always has a value, you can use a .* regex to match all values in the field and therefore match all documents.
For example, suppose the packageId field has a value for all documents.
Incorrect (as posted in the original question):
Query: search=* -owners:testA -owners:testB&queryType=full&$orderby=created desc
Correct:
Query: search=packageId:/.*/ -owners:testA -owners:testB&queryType=full&$orderby=created desc

No matches when mixing keywords

I am trying to do a product search setup using Solr. It does return results for keywords that follow the same order in the product name. However, when the keywords are mixed up, no results are returned. I would like to get results with scores that closely match the given keywords in any order.
My question on scoring has the schema, data configuration and query. Any help will be greatly appreciated.
As long as you enter your query as a regular query, instead of using wildcards, any hits in a text_general field as you've defined should be returned.
You can use the mm parameter to adjust how many of the terms supplied that need to match from a query. I suggest using the edismax query parser, as that allows you do to more "natural" queries instead of having to add the fieldnames in the query itself:
defType=edismax&qf=catchall&q=nikon dslr
defType=edismax&qf=catchall&q=dslr nikon
should both give the same set of documents (but possibly different scores when using phrase boosts).

How to boost fields in solr

I already have the boost determined before hand. I have a field in the solr index called boost1 . This boost field will have a value from 1 to 10 similar to google PR rank. This is the boost that should be applied to every query ran in solr. here are the fields in my index
Id
Title
Text
Boost1
The boost field should be apply to every query. I am trying to implement functionality similar to Google PR rank. Is there a way to do this using solr?
you can add the boost during query e.g.
q={!boost b=boost1}
How_can_I_boost_the_score_of_newer_documents
However, this may need to be added explicitly by you.
If you are using dismax or edismax with the request handler, The bf (Boost Functions) parameter could be used to boost the documents.
http://wiki.apache.org/solr/DisMaxQParserPlugin#bf_.28Boost_Functions.29
bf=boost1^0.5
This can be added to defaults with the request handler definition, so that they are applied to all the search queries.
you can use function queries to vary the amount of boost FunctionQuery
I think you need to use index time document boosts. See this if you are indexing XML or this if using DataImportHandler.

Solr filter queries and boosting

Is it possible to boost fields that appear in filter queries (fq=) in Solr?
I have a faceted query that has a tagged filter query something like this:
...&q=*:*&fq={!tag:X}brand:(+"4911")+OR+body:(abc)&facet.field={!ex:X}brand&..
(I facet on brand and the facet is set to ignore the filter query tagged X, so I need to use a filter query.)
I would like to make matches on the brand field score higher than matches on body field in the filter query.
The fields brand and body are multivalued.
I've tried adding bf=/bq= arguments, and I can get brand matches to score higher if I change the filter query to be the main 'q=' query, but I don't seem to be able to influence the score of anything in the filter query. I think I maybe going about it in the wrong way..
Thanks.
Solr "fq"'s do not affect score -- see the wiki. So, you should add your queries to "q" that you actually want to boost. If need be, you can always duplicate a query restriction in both "q" and "fq", as "fq" only acts as a restriction on the results set.

Resources