Solr Solarium: can we use filter queries with dismax queries?

I just built a search form backed by Solr; we are using the Solarium library to construct our requests.
We built a "huge" collection of filter queries like this one:
$query = $client->createQuery($client::QUERY_SELECT);
$query->setStart(0)->setRows(1000);
$query->addFilterQuery($query->createFilterQuery("foo")->setQuery("bar:true"));
$query->addFilterQuery($query->createFilterQuery("fo")->setQuery("ba:false"));
....
But we realized that this search only hits the individual fields we name in the filter queries, whereas we actually need to query multiple fields. While reading the docs I realized our approach may have been wrong. Would the correct approach be to use DisMax queries (in combination with facets?)? I'm wondering: can we use DisMax in combination with filter queries to "expand" our search to multiple fields (with boosts), or do we have to rework everything?
I'm missing the big picture needed to decide what the best working solution would be.
Help is much appreciated.
edit:
solr:
solr-spec 7.6.0
solarium:
solarium/solarium 6.0.1 PHP Solr client

You can give a query parser when giving the fq argument:
fq={!dismax qf="firstfield secondfield^5"}this is my query
The syntax is known as Local Parameters. Since dismax (or edismax, which you should normally use now) doesn't have an identifier in front of it, it is implicitly parsed as the type.
If a local parameter value appears without a name, it is given the implicit name of "type". This allows short-form representation for the type of query parser to use when parsing a query string.
You'll have to make sure that Solarium doesn't escape the value you give to setQuery, but seeing as you're already giving a field:value combination, it doesn't seem to get escaped. Double check the Solr log to see exactly what query is being sent to Solr (or ask Solarium to give you the exact query string being sent if possible).
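As a rough illustration of the local-parameter form above, the following sketch builds the raw request parameters with Python's standard library rather than Solarium. The helper name dismax_filter and the sample field names are assumptions made for the example, not part of Solr or Solarium. Note that a filter query only restricts the result set and does not contribute to the score, so boosts inside an fq do not change ranking; if the boosts should matter, the same edismax settings would go on the main query instead.

```python
from urllib.parse import urlencode, parse_qs


def dismax_filter(user_text, fields):
    """Build an fq value that parses user_text with edismax over the given fields.

    fields is a list of (field_name, boost) pairs; a boost of 1 is left implicit.
    Hypothetical helper: user_text is passed through unescaped.
    """
    qf = " ".join(name if boost == 1 else f"{name}^{boost}" for name, boost in fields)
    # qf is quoted because it contains spaces.
    return f'{{!edismax qf="{qf}"}}{user_text}'


params = [
    ("q", "*:*"),
    ("fq", "bar:true"),  # plain field filter, as in the question
    ("fq", dismax_filter("this is my query", [("firstfield", 1), ("secondfield", 5)])),
    ("rows", 10),
]
query_string = urlencode(params)

# Decode again to see the filter queries as Solr would receive them.
for fq in parse_qs(query_string)["fq"]:
    print(fq)
```

Printing the decoded values is a cheap way to compare against what shows up in the Solr log.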

Related

Why dismax q.alt doesn't return any result

I'm new to solr.
After following tutorial exercise 1 (https://solr.apache.org/guide/8_9/solr-tutorial.html), I'm able to run some Solr queries on my local machine.
If I want to get results without any condition, I run a query like
http://127.0.0.1:8983/solr/#/techproducts/query?q=*:*&q.op=OR
This works pretty fine.
But when I switch to "dismax" and try to get a similar result, I need to use "q.alt".
The query is like
http://127.0.0.1:8983/solr/#/techproducts/query?q.op=OR&defType=dismax&q.alt=*:*
However, this query returned no results, which is pretty weird.
Even though I specified the row, it still won't work.
http://127.0.0.1:8983/solr/#/techproducts/query?q.op=OR&defType=dismax&q.alt=*:*&row=0
Has anyone faced the same problem before?
These parameters are not meant to be used with the user interface URLs; they're for sending directly to Solr. The user interface is a JavaScript application that talks to the Solr API behind the scenes. You can see that your URLs have a local anchor in them (#); that fragment is just a reference the JavaScript-based user interface uses to load the correct page.
The rows parameter is also named rows, not row, and when it is set to 0 no documents will be returned (in the example it's used to show facets with complete counts; you have to ask for facets for that to make sense).
The actual URL to query Solr for matching documents would be:
http://127.0.0.1:8983/solr/techproducts/select?defType=edismax&q.alt=*:*
This URL is shown in the user interface over the query results when using the query page.
There is also usually no reason to use dismax and not edismax these days, as edismax does everything that the old dismax handler did and with more functionality.
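To make the difference between the admin UI address and the API address concrete, here is a minimal sketch that assembles the direct /select URL with Python's standard library. The helper name select_url is an assumption for illustration; the host, core, and parameters are the ones from the question.

```python
from urllib.parse import urlencode


def select_url(base, core, params):
    """Build a URL for the core's /select handler (no '#' fragment involved)."""
    return f"{base}/{core}/select?{urlencode(params)}"


url = select_url(
    "http://127.0.0.1:8983/solr",
    "techproducts",
    {"defType": "edismax", "q.alt": "*:*", "rows": 10},  # 'rows', not 'row'
)
print(url)
```

Everything after a # never reaches the server, which is why parameters appended to the admin UI URL have no effect on the query.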

Spring Data Solr: Queries with "AND", "NOT" and "OR" not escaped or handled

We are using spring-data-solr, mainly using exact match/equals filter queries.
We have found that the values NOT, OR, and AND can be supplied, which are passed directly onto solr (without any pre-processing). This causes solr to error. For example, building a Criteria object like
Criteria.where("fuelType").is("AND")
Results in the following solr query
fq=fuelType:AND
We have found that if we call Solr directly with
fq=fuelType:"AND"
This works fine; however, I can see that quotes are only added when there is whitespace in the value.
Is there something I am missing?
I still want to use the Standard Solr query parser if possible
The pull request for this has been merged:
https://jira.spring.io/browse/DATASOLR-437
https://github.com/spring-projects/spring-data-solr/pull/74
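For readers on a version without that fix, the workaround described in the question (always quoting the value) can be sketched independently of Spring Data Solr. The snippet below is plain Python with hypothetical helper names, and it assumes that inside a quoted phrase only the double quote and the backslash need escaping.

```python
def quote_term(value):
    """Wrap value in double quotes so AND, OR and NOT are read as literals."""
    # Assumption: within a quoted phrase only '\' and '"' need escaping.
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def exact_filter(field, value):
    """Build an exact-match filter query such as fuelType:"AND"."""
    return f"{field}:{quote_term(value)}"


print(exact_filter("fuelType", "AND"))
print(exact_filter("fuelType", "Petrol"))
```

Quoting every value, not just the ones containing whitespace, keeps the rule simple and avoids special-casing the operator keywords.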

Solr dismax Query Over Multiple Fields

I am trying to do a solr dismax query over multiple fields, and am a little confused with the syntax.
My core contains a whole load of podcast episodes. The fields in the index are EPISODE_ID, EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS.
Now, when I do a query I would like to search for the query term in the EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS fields, with different boosts for the different fields.
So when I search for 'jedi', the query I've built looks like this:
http://localhost:8983/solr/episode_core/select?
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
However, this doesn't seem to work - it returns zero records.
When I put a default field like below, it now works, but this is kind of crap because it means I'm not getting results from searching all of the 3 fields:
http://localhost:8983/solr/episode_core/select?&df=EPISODE_DESC
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
Is there something I am missing here? I thought that you could search over multiple fields, and I thought that the 'qf' parameter would mean you didn't need to supply the default field parameter?
All help much appreciated...
Your idea is correct. If you've defined qf (query fields) for Dismax, there shouldn't be any need to specify a df (default field).
Can you be more specific about what isn't working?
Also, read up on configuration invariants in solrconfig.xml, as it is possible your configuration is sending different parameters than those you've specified in the URL.
(E.g. if you're seeing a specific error message asking you to provide a df)
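One way to check which parameters Solr actually ends up using is to add echoParams=all and debugQuery=true to the request. The sketch below rebuilds the URL from the question with Python's standard library; the host and core name are taken from the question, and nothing here is specific to a particular client library.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

params = {
    "defType": "dismax",
    "q": "jedi",
    "fl": "EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS",
    # Fields are separated by spaces; urlencode turns them into '+'.
    "qf": "EPISODE_TITLE^3.0 EPISODE_DESC^2.0 EPISODE_KEYWORDS",
    "echoParams": "all",   # echo defaults and invariants from solrconfig.xml too
    "debugQuery": "true",  # include the parsed query in the response
}
url = "http://localhost:8983/solr/episode_core/select?" + urlencode(params)
print(url)

# Round-trip check: what Solr would see as qf after URL decoding.
print(parse_qs(urlsplit(url).query)["qf"][0])
```

If the echoed parameters differ from the ones sent, the request handler configuration is the likely place to look.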

why in solr 1.4 passing a letter to a 'int' field results in exception?

I'm running lucene solr 1.4 on top of tomcat.
I have a field id defined with type int which is mapped to solr.TrieIntField.
When I do a solr query like ?q=id:a I get a NumberFormatException.
Is it possible to configure Solr so that it returns an empty result set for the above scenario instead of throwing the exception?
Why do you have to have this as TrieIntField? Can you not use string? They are all subclasses of the non-tokenized field type (org.apache.solr.schema.FieldType).
Update: Your original question is about id, so I suggested string, as it makes no difference in that case. But if other fields use the TrieIntField type, the downside of using string for those fields is that sorts and range queries become string-based, which may not be desirable. In that case you need to prevent the original problem in your client API, or handle it better by writing your own handler. Solr is doing the correct thing by raising an error, as most applications would capture this error and respond to users with a better error message. If Solr returned no results instead of an error, it would be misleading.
Solr is written in Java so it is expected. You have to either filter out non-integer values on the client side (or in your API layer) or use the String type as Arun suggested.
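The client-side filtering suggested above could look roughly like the following Python sketch. The function name is hypothetical, and it assumes ids are non-negative and must fit into a 32-bit signed integer; anything else is rejected before a request is ever sent.

```python
INT32_MAX = 2**31 - 1  # largest value a 32-bit signed int field can hold


def int_field_query(field, raw_value):
    """Return 'field:value' for a valid non-negative 32-bit integer, else None."""
    text = raw_value.strip()
    # Reject anything that is not made of plain ASCII digits, e.g. "a" or "".
    if not text.isascii() or not text.isdigit():
        return None
    number = int(text)
    if number > INT32_MAX:
        return None
    return f"{field}:{number}"


for raw in ["42", "a", "99999999999"]:
    query = int_field_query("id", raw)
    if query is None:
        print(raw, "-> skip the Solr call and return an empty result")
    else:
        print(raw, "-> q=" + query)
```

With a guard like this the application decides what an invalid id means (an empty result or an error message) instead of relying on the server's exception.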

Solr Index appears to be valid - but returns no results

Solr newbie here.
I have created a Solr index and written a whole bunch of docs into it. I can see
from the Solr admin page that the docs exist and the schema is fine as well.
But when I perform a search using a test keyword I do not get any results back.
On entering *:* into the query (in the Solr admin page) I get all the results.
However, when I enter any other query (e.g. a term or phrase) I get no results.
I have verified that the field being queried is Indexed and contains the values I am searching for.
So I am confused about what I am doing wrong.
Probably you don't have a <defaultSearchField> correctly set up. See this question.
Another possibility: your field is of type string instead of text. String fields, in contrast to text fields, are not analyzed, but stored and indexed verbatim.
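The difference between the two field types can be pictured with a toy model. This is not Solr's analysis chain, only a simplified sketch in Python under the assumption that a text field lowercases and splits on whitespace while a string field keeps the value as one verbatim token.

```python
def matches_string_field(stored_value, term):
    """Toy model of a string field: only the complete, verbatim value matches."""
    return stored_value == term


def matches_text_field(stored_value, term):
    """Toy model of a text field: lowercase, split on whitespace, match any token."""
    return term.lower() in stored_value.lower().split()


value = "Lorem ipsum dolor sit amet"
print(matches_string_field(value, "sit"))   # whole value required, so no match
print(matches_text_field(value, "sit"))     # 'sit' is one of the tokens
print(matches_string_field(value, value))   # exact value matches
```

In this picture a single-word search against a string field only succeeds when the word happens to be the entire stored value.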
I had the same issue with a new setup of Solr 8. The accepted answer is not valid anymore, because the <defaultSearchField> configuration will be deprecated.
As I found no answer to why Solr does not return results from any fields despite being indexed, I consulted the query documentation. What I found is the DisMax query parser:
The DisMax query parser is designed to process simple phrases (without complex syntax) entered by users and to search for individual terms across several fields using different weighting (boosts) based on the significance of each field. Additional options enable users to influence the score based on rules specific to each use case (independent of user input).
In contrast, the default Lucene parser only speaks about searching one field. So I gave DisMax a try and it worked very well!
Query example:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video
You can also specify which fields to search exactly to prevent unwanted side effects. Multiple fields are separated by spaces which translate to + in URLs:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features+text
Last but not least, give the fields a weight:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features^20.0+text^0.3
If you are using pysolr like I do, you can add those parameters to your search request like this:
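import pysolr

# Adjust the URL to your own core
solr = pysolr.Solr('http://localhost:8983/solr/techproducts')
results = solr.search('search term', **{
    'defType': 'dismax',
    'qf': 'features text'
})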
In my case the problem was the format of the query. It seems that my setup, by default, was looking for an exact match against the entire value of the field. So, in order to get results when searching for sit, I had to query *sit*, i.e. use wildcards to get the expected result.
With solr 4, I had to solve this as per Mauricio's answer by defining type="text_en" to the field.
With solr 6, use text_general.

Resources