DisMax query parser is not running - solr

Environment- solr-8.9.0
Movies data in a form of a .csv file has been indexed in apache solr.
Movies data
name,directed_by,genre,type,id,initial_release_date
.45,Gary Lennon,Black comedy|Thriller|Psychological thriller|Indie film|Action Film|Crime Thriller|Crime Fiction|Drama,,/en/45_2006,2006-11-30
9,Shane Acker,Computer Animation|Animation|Apocalyptic and post-apocalyptic fiction|Science Fiction|Short Film|Thriller|Fantasy,,/en/9_2005,2005-04-21
Bomb the System,Adam Bhala Lough,Crime Fiction|Indie film|Coming of age|Drama,,/en/bomb_the_system,
movie_name,Adam Bhala,Animation|Indie film|Coming of age|Drama,,/en/bomb_the_system,
The DisMax Query Parser has been run over the field 'directed_by'. 'directed_by' field is mapped as a 'text_general' field-type in managed-schema.
I have run the following query over solr
curl -G http://localhost:8983/solr/testCore1/select --data-urlencode "q=directed_by:'Adam Bhala Lough~'" --data-urlencode "defType=dismax" --data-urlencode "mm=2"
but the above query is giving 0 'numFound' in response.
I expect the following field to match:
Adam Bhala
Adam Bhala Lough
I expect the following field not to match:
Adam Gary Lennon
Although data is indexed in solr. I have cross-checked this by running the following query without disMax query parser and it is giving responses.
curl -G http://localhost:8983/solr/testCore1/select --data-urlencode "q=directed_by:'Adam Bhala Lough~'"
Why dismax query parser is not running as it should be? I understand that mm(Minimum Should Match) Parameter===> defines the minimum number of clauses that must match.
I have spent hours to solve this. However, I cannot seem to find anything that holds my hand.
Could someone help me find the missing piece?
Reference
https://solr.apache.org/guide/8_9/the-dismax-query-parser.html
How to use disMax query parser in solr

You need to use the qf (query fields) parameter to specify in which field(s) the search should be performed, along with (optionally) a boost factor used to increase or decrease that particular field’s importance in the query (defaults to 1) :
q=Adam Bhala Lough~
qf=directed_by
Or, use edismax query parser instead of dismax, so that the standard query syntax you were using (directed_by:'Adam Bhala Lough~') can work as intended.

Related

Why does Dismax's bq (Boost Query) parameter filter results instead of just boosting them?

I'm using Solr 8.11.2. In order to boost documents with certain field values, I'm using Dismax's bq (Boost Query) parameter.
From what I've read, this should only influence the score of the search results returned by the rest of the query. What I see happening is that it filters all search results that don't have the field I'm boosting.
I'm using the following query, which returns all documents containing both words procedure and maintenance:
q=((+procedure+maintenance))&rows=10&start=0&wt=xml&q.op=AND&fl=id,score,alias,author,hash,collection,label,url,lastModified,path,extension,objectId,objectDtType,title,DocumentPK_s,Taal_s,Site_s,SharePointId_s&hl=true&hl.qt=highlightRH&hl.fl=content,description,label&hl.snippets=5&defType=dismax&bf=recip(max(0,ms(NOW-3MONTH,creationDate)),3.16e-11,1,1)&pf=content&sort=score DESC
But as soon as I append &bq=language:english^10000, which is supposed to boost documents where the field language is set to english, all documents where the field language doesn't exist are no longer part of the results.
Am I misunderstanding how this parameter is supposed to work? Is it a side effect?

SOLR Solarium can we use filter-queries with dismax-queries?

i just built a search form backed by solr, we are using the solarium library to construct our requests.
we built a "huge" collection of filterqueries like that one:
$query = $client->createQuery($client::QUERY_SELECT);
$query->setStart(0)->setRows(1000);
$query->addFilterQuery($query->createFilterQuery("foo")->setQuery("bar:true"));
$query->addFilterQuery($query->createFilterQuery("fo")->setQuery("ba:false"));
....
but we realized that the search just hits all the single fields we specify in the filterqueries, but we have to actually query multiple fields. while reading the docs i realized we could have been wrong, right? the correct approach would be to use disMax queries (in combination with facets?)? im wondering, can we use DisMax in combination with filterqueries to "expand" our search to multiple fields (with boosts) ? or do we have to actually rework everything?
im kinda missing the big picture to decide what the best/working solution would be
help is much appreciated
edit:
solr:
solr-spec 7.6.0
solarium:
solarium/solarium 6.0.1 PHP Solr client
You can give a query parser when giving the fq argument:
fq={!dismax qf="firstfield secondfield^5"}this is my query
The syntax is known as Local Parameters. Since dismax (or edismax which you should normally use now) doesn't have a identifier in front of it, it is implicitly parsed as the type.
If a local parameter value appears without a name, it is given the implicit name of "type". This allows short-form representation for the type of query parser to use when parsing a query string.
You'll have to make sure that Solarium doesn't escape the value you give to setQuery, but seeing as you're already giving a field:value combination, it doesn't seem to get escaped. Double check the Solr log to see exactly what query is being sent to Solr (or ask Solarium to give you the exact query string being sent if possible).

Synonyms in Solr Query Results

Whenever I query a string in solr, i want to get the synonyms of field value if there exists any as a part of query result, is it possible to do that in solr
There is no direct way to fetch the synonyms used in the search results. You can get close by looking at how Solr parsed your query via the debugQuery=true parameter and looking at the parsedQuery value in the response but it would not be straightforward. For example, if you search for "tv" on a text field that uses synonyms you will get something like this:
$ curl "localhost:8983/solr/your-core/select?q=tv&debugQuery=true"
{
...
"parsedquery":"SynonymQuery(Synonym(_text_:television _text_:televisions _text_:tv _text_:tvs))",
...
Another approach would be to load in your application the synonyms.txt file that Solr uses and do the mapping yourself. Again, not straightforward,

Fielded searches with Solr ExtendedDisMax Query Parser

I'm having a problem using the Solr ExtendedDisMax Query Parser with query that contains fielded searches inside not-plain queries.
The case is the following.
If I send to SOLR an edismax request (defType=edismax) with parameters
qf=field1^10
q=field2:ciao
debugQuery=on (for debug purposes)
solr parses the query as I expect, in fact the debug part of the response tells me that
[parsedquery_toString] => +field2:ciao
But if I make the expression only a bit more complex, like putting the condition into brackets:
1. qf=field1^10
2. q=(field2:ciao)
I get
[parsedquery_toString] => +(((field1:field2:^2.0) (field1:ciao^2.0))~2)
where Solr seems not recognize the field syntax.
I've not found any mention to this behavior in the documentation, where instead they say that
This parser supports full Lucene QueryParser syntax including boolean operators 'AND', 'OR', 'NOT', '+' and '-', fielded search, term boosting, fuzzy...
This problem is really annoying me because I would like to do compelx boolean and fielded queries even with the edismax parser.
Do you know a way to workaround this?
EDIT: The Solr version is 3.6
If you are using Solr 3.6, there is a current issue with eDisMax and Fielded searches that was introduced with Solr 3.6. The workaround is to precede the field name with a space.
So change your query to the following:
qf=field1^10
q=( field2:ciao)
Please see eDismax: A fielded query wrapped by parens is not recognized for the more details.

Solr Index appears to be valid - but returns no results

Solr newbie here.
I have created a Solr index and write a whole bunch of docs into it. I can see
from the Solr admin page that the docs exist and the schema is fine as well.
But when I perform a search using a test keyword I do not get any results back.
On entering * : *
into the query (in Solr admin page) I get all the results.
However, when I enter any other query (e.g. a term or phrase) I get no results.
I have verified that the field being queried is Indexed and contains the values I am searching for.
So I am confused what I am doing wrong.
Probably you don't have a <defaultSearchField> correctly set up. See this question.
Another possibility: your field is of type string instead of text. String fields, in contrast to text fields, are not analyzed, but stored and indexed verbatim.
I had the same issue with a new setup of Solr 8. The accepted answer is not valid anymore, because the <defaultSearchField> configuration will be deprecated.
As I found no answer to why Solr does not return results from any fields despite being indexed, I consulted the query documentation. What I found is the DisMax query parser:
The DisMax query parser is designed to process simple phrases (without complex syntax) entered by users and to search for individual terms across several fields using different weighting (boosts) based on the significance of each field. Additional options enable users to influence the score based on rules specific to each use case (independent of user input).
In contrast, the default Lucene parser only speaks about searching one field. So I gave DisMax a try and it worked very well!
Query example:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video
You can also specify which fields to search exactly to prevent unwanted side effects. Multiple fields are separated by spaces which translate to + in URLs:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features+text
Last but not least, give the fields a weight:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features^20.0+text^0.3
If you are using pysolr like I do, you can add those parameters to your search request like this:
results = solr.search('search term', **{
'defType': 'dismax',
'qf': 'features text'
})
In my case the problem was the format of the query. It seems that my setup, by default, was looking and an exact match to the entire value of the field. So, in order to get results if I was searching for the sit I had to query *sit*, i.e. use wildcards to get the expected result.
With solr 4, I had to solve this as per Mauricio's answer by defining type="text_en" to the field.
With solr 6, use text_general.

Resources