Using Dismax as parser, why doesn't Solr always expand synonyms? - solr

Our application does some parsing of its own before the query is sent to Solr, which means that searching for the word court produces the following query:
((+court))
This works fine using the Standard Query Parser. But with the Dismax parser, synonyms no longer get expanded, yielding fewer search results. Using Solr Admin, I see that if I remove the parentheses the synonyms do get expanded.
Is there a reason that the parentheses cause different behaviour using Dismax? How do I get them to behave consistently?

Related

SOLR Solarium can we use filter-queries with dismax-queries?

I just built a search form backed by Solr; we are using the Solarium library to construct our requests.
We built a "huge" collection of filter queries like this one:
$query = $client->createQuery($client::QUERY_SELECT);
$query->setStart(0)->setRows(1000);
$query->addFilterQuery($query->createFilterQuery("foo")->setQuery("bar:true"));
$query->addFilterQuery($query->createFilterQuery("fo")->setQuery("ba:false"));
....
But we realized that the search only hits the single fields we specify in the filter queries, whereas we actually need to query multiple fields. While reading the docs I realized we may have taken the wrong approach. Would the correct approach be to use DisMax queries (in combination with facets)? Can we use DisMax in combination with filter queries to "expand" our search to multiple fields (with boosts), or do we have to rework everything?
I'm missing the big picture needed to decide what the best working solution would be.
Help is much appreciated.
edit:
solr:
solr-spec 7.6.0
solarium:
solarium/solarium 6.0.1 PHP Solr client
You can give a query parser when giving the fq argument:
fq={!dismax qf="firstfield secondfield^5"}this is my query
The syntax is known as Local Parameters. Since dismax (or edismax, which you should normally use now) doesn't have a name in front of it, it is implicitly treated as the value of type:
If a local parameter value appears without a name, it is given the implicit name of "type". This allows short-form representation for the type of query parser to use when parsing a query string.
You'll have to make sure that Solarium doesn't escape the value you give to setQuery, but seeing as you're already giving a field:value combination, it doesn't seem to get escaped. Double check the Solr log to see exactly what query is being sent to Solr (or ask Solarium to give you the exact query string being sent if possible).
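As a rough sketch of what the resulting request could look like, the following Python builds the raw select parameters by hand (outside Solarium), combining the local-params filter with an ordinary field filter. The field names are the placeholders used above, and `build_select_params` is a hypothetical helper, not part of Solr or Solarium.

```python
from urllib.parse import urlencode, parse_qs

def build_select_params(user_text):
    """Return URL-encoded select parameters (hypothetical helper)."""
    # Local params pick the parser for this one fq; qf lists fields and boosts.
    multi_field_filter = '{!dismax qf="firstfield secondfield^5"}' + user_text
    params = [
        ("q", "*:*"),                # match everything, then narrow with filters
        ("fq", multi_field_filter),  # multi-field filter parsed by dismax
        ("fq", "bar:true"),          # plain single-field filter, as before
        ("rows", "1000"),
    ]
    return urlencode(params)

encoded = build_select_params("this is my query")
print(encoded)                   # what would go after /select?
print(parse_qs(encoded)["fq"])   # decoded filter queries, for comparison with the Solr log
```

Filter queries restrict the result set but do not contribute to the score, so the `^5` boost only affects ranking if the same clause is sent as `q` instead of `fq`.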

Solr: dismax query parser doesn't support AND/OR, so why am I getting results?

According to the Solr documentation, the Dismax Query Parser doesn't support AND or OR in queries. However, if I run one of the following queries:
http://xx.xx.Xx.xx:yyyy/solr/select?q=Pakistan%20OR%20India&wt=json&indent=true&defType=dismax
http://xx.xx.Xx.xx:yyyy/solr/select?q=Pakistan&wt=json&start=0&rows=20&indent=true&fl=content,url,title&fq=(title:[''+TO+*]+AND+url:[''+TO+*]+AND+content:[''+TO+*])&fq=group:ur_blogs&defType=dismax
I get results.
My question is: does dismax not support AND and OR only in the 'q' parameter, or in the entire query?
The description provided at this link addresses it: https://cwiki.apache.org/confluence/display/solr/The+DisMax+Query+Parser
"The DisMax query parser supports an extremely simplified subset of the Lucene QueryParser syntax. As in Lucene, quotes can be used to group phrases, and +/- can be used to denote mandatory and optional clauses. All other Lucene query parser special characters (except AND and OR) are escaped to simplify the user experience."

Solr search query for advanced search

I'm using Solr 4.2 and have implemented an advanced search in my application. One option in this search is 'include all these words'. Whenever I enter a word or phrase without stop words it works fine, but with stop words it returns no results.
For example:
Query: purchase tune
This query works fine, but if I enter a query like 'i would like to purchase tune'
then it gives no results.
Why is that, and what do I need to do?
For this kind of search I was previously using the 'mm' parameter in Solr, but I am now using 'q.op'.
In Solr I'm adding the q.op param as follows:
params.add("q.op","AND");
and query as
params.add("q",query);
Do you encounter the problem with the Dismax Handler and the mm parameter?
The DisMax handler has an issue with stopwords:
https://issues.apache.org/jira/browse/SOLR-3085
In this ticket you find some workarounds:
Typical solution is to not use stopwords or harmonize stopword lists
across all fields in your QF, or relax the MM to a lower percentage.
Sometimes these are not acceptable workarounds, and we should find a
better solution.
Perhaps "harmonize stopword lists" is a suitable solution for your problem.
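For the other workaround mentioned in the ticket, relaxing mm, a minimal sketch of the request parameters might look like the following. The field names `title` and `content`, the value `50%`, and the helper `build_dismax_params` are assumptions for illustration only; a suitable mm value depends on your data.

```python
from urllib.parse import urlencode, parse_qs

def build_dismax_params(user_text, query_fields, min_match):
    """Return URL-encoded dismax parameters (hypothetical helper)."""
    params = {
        "defType": "dismax",
        "q": user_text,
        "qf": " ".join(query_fields),  # fields searched; ideally sharing one stopword list
        "mm": min_match,               # share of clauses that must match
    }
    return urlencode(params)

fields = ["title", "content"]  # assumed field names
strict = build_dismax_params("i would like to purchase tune", fields, "100%")
relaxed = build_dismax_params("i would like to purchase tune", fields, "50%")
print(strict)
print(relaxed)
```

With a lower mm, a stopword that only produces a clause in some of the qf fields no longer has to match, at the cost of less strict "all these words" behaviour.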

What is the difference between dismax and EdisMax?

I would like to know the difference between DisMax and EDisMax.
Is there a useful reference on this? I would also like to know which queries DisMax fails to produce results for while EDisMax succeeds.
EDisMax has some extra query parameters such as boost, ps, and pf2. Apart from these parameters, how is EDisMax better than DisMax? How does query processing differ between the two, and what makes EDisMax do better?
Some queries fail to return results in DisMax, but EDisMax returns results for them.
I googled the difference between DisMax and EDisMax and found only that EDisMax has additional parameters, but I am looking for a more technical explanation that I can give to others in a presentation.
http://ip:8983/solr/C73/select/?defType=edismax&q=ipod OR video&fl=filename, score&hl=true&hl.fl=content contentenstem filename&hl.zetaContentField=content
For the above query EDisMax produces about 238 results, but DisMax produces 0 results.
So what is the difference in how the two parsers handle this query? What makes EDisMax produce results? That is what I would like to know.
Because DisMax had a lot of limitations, the EDisMax query parser was added.
Check out SOLR-1553.
To start with (as in the documentation):
The extended dismax parser was based on the original Solr dismax parser.
Supports full lucene query syntax in the absence of syntax errors
supports "and"/"or" to mean "AND"/"OR" in lucene syntax mode
When there are syntax errors, improved smart partial escaping of special characters is done to prevent them... in this mode, fielded queries, +/-, and phrase queries are still supported.
Improved proximity boosting via word bigrams... this prevents the problem of needing 100% of the words in the document to get any boost, as well as having all of the words in a single field.
advanced stopword handling... stopwords are not required in the mandatory part of the query but are still used (if indexed) in the proximity boosting part. If a query consists of all stopwords (e.g. to be or not to be) then all will be required.
Supports the "boost" parameter.. like the dismax bf param, but multiplies the function query instead of adding it in
Supports pure negative nested queries... so a query like +foo (-foo) will match all documents
Beyond that, you will find a lot of associated JIRA issues that improve the query parsing capability and add support for more features.
Reading through the JIRA issues can be really insightful :)
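To make the "multiplies instead of adding" point about the boost parameter concrete, here is a toy Python sketch. It is not Solr's actual scoring code: it ignores normalization and every other scoring detail, and the document names and numbers are made up.

```python
def score_with_bf(base_score, function_value):
    # dismax-style bf: the function value is added to the relevance score
    return base_score + function_value

def score_with_boost(base_score, function_value):
    # edismax-style boost: the function value multiplies the relevance score
    return base_score * function_value

# (relevance score, function value) per document; illustrative numbers only
docs = {"doc_a": (4.0, 0.5), "doc_b": (1.0, 3.0)}

for name, (base, fn) in docs.items():
    print(name, score_with_bf(base, fn), score_with_boost(base, fn))
```

In this toy data the additive form ranks doc_a first (4.5 vs 4.0), while the multiplicative form ranks doc_b first (3.0 vs 2.0), which shows that the two parameters can order the same results differently.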
In general, EDisMax is an extended version of DisMax. You can find a good description of both parsers and their differences at the following links:
DisMax Query Parser
Extended DisMax Query Parser

Fielded searches with Solr ExtendedDisMax Query Parser

I'm having a problem using the Solr ExtendedDisMax Query Parser with queries that contain fielded searches inside more complex (non-plain) expressions.
The case is the following.
If I send to SOLR an edismax request (defType=edismax) with parameters
qf=field1^10
q=field2:ciao
debugQuery=on (for debug purposes)
Solr parses the query as I expect; the debug part of the response tells me that
[parsedquery_toString] => +field2:ciao
But if I make the expression only a bit more complex, like putting the condition into brackets:
qf=field1^10
q=(field2:ciao)
I get
[parsedquery_toString] => +(((field1:field2:^2.0) (field1:ciao^2.0))~2)
where Solr does not seem to recognize the field syntax.
I have not found any mention of this behavior in the documentation, which instead says that
This parser supports full Lucene QueryParser syntax including boolean operators 'AND', 'OR', 'NOT', '+' and '-', fielded search, term boosting, fuzzy...
This problem is really annoying me because I would like to run complex boolean and fielded queries even with the edismax parser.
Do you know a way to work around this?
EDIT: The Solr version is 3.6
If you are using Solr 3.6, there is a current issue with eDisMax and Fielded searches that was introduced with Solr 3.6. The workaround is to precede the field name with a space.
So change your query to the following:
qf=field1^10
q=( field2:ciao)
Please see "eDismax: A fielded query wrapped by parens is not recognized" for more details.
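If the queries are generated by an application, the space could be added automatically before the query is sent. The following Python is a minimal sketch of that idea; `add_space_after_paren` is a hypothetical helper, and it assumes simple field names and does not try to skip parentheses inside quoted phrases.

```python
import re

# An opening paren immediately followed by something that looks like "fieldname:"
_PAREN_BEFORE_FIELD = re.compile(r"\((?=[A-Za-z_][\w.-]*:)")

def add_space_after_paren(query):
    """Insert a space between '(' and a fielded term (hypothetical helper)."""
    return _PAREN_BEFORE_FIELD.sub("( ", query)

print(add_space_after_paren("(field2:ciao)"))
print(add_space_after_paren("(field2:ciao) AND (ciao)"))
```

Queries that already contain the space, or parentheses around unfielded terms, are left unchanged.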
