Lucene OR query not working - solr

I am trying to query Solr with following requirement:
_ I would like to get all documents which not have a particular field
-exclusivity:[* TO *]
I would like to get all document which have this field and got the specific value
exclusivity:(None)
so when I am trying to query Solr 4 with:
fq=(-exclusivity:[* TO *]) OR exclusivity:(None)
I have only got results if the field exists in document and the value is None but results not contain results from first query !!
I cannot understand why it is not working

To explain your results, the query (-exclusivity:[* TO *]) will always get no results, because you haven't specified any result to retrieve. By default, Lucene doesn't retrieve any results, unless you tell it to get them. exclusivity:(None) isn't a limitation placed on the full result set, it is the key used to find the documents to retrieve. This differs from a database, which by default returns all records in a table, and allows you to limit the set.
(-exclusivity:[* TO *]) only specifies what NOT to get, but doesn't tell it to GET anything at all.
Solr has logic to handle Pure negative queries (I believe, in much the same way as below, by implicitly retrieving all documents first), but from what I gather, only as the top level query, and it does not handle queries like term1 OR -term2 documented here.
I believe with solr you should be able to use the query *:* to get all docs (though that would not be available in raw lucene), so you could use the query:
(*:* -exclusivity:[* TO *]) exclusivity:(None)
which would mean, get (all docs except those with a value in exclusivity) or docs where exclusivity = "None"

I have founded answer to this problem. I have made bad assumption how "-" works in solr.I though that
-exclusivity:[* TO *]
add everything without exclusivity field to the data set but it is not the case. The '-' could only exclude things from data set. BTW femtoRgon you are right but I am using it as fq (filter query) not as a master query I have forgotten to mention that.
So the solution is like
-exclusivity:([* TO *] AND -(None))
and full query looks like
/?q=*:*&fq=-exclusivity:([* TO *] AND -(None))
so that means I will get everything does not have field exclusivity or has this field and it is populated with value None.

Related

Nested document searches in Solr complex parentFilter syntax

We are adding nested documents to our Solr index. For this purpose, we've added a solr_record_type field to each record, but there will be an interval while we are updating the index where the original documents will have null in this field. We would like to treat all of the original documents as root documents.
In our Solr index, solr_record_type equals 1 and the child types are represented by 2-4. So, in order to get backwards compatibility with what is currently returned by queries, I added this fq parameter:
-solr_record_type:[2 TO 4]
However, I am having trouble composing the parentFilter in the child transformer. For the fl field I've tried:
*,[child parentFilter="-solr_record_type:[2 TO 4]"]
This doesn't work because it then omits the _childDocuments_ section from the results for some reason. I don't know why. I need some way to specify that the parent filter is either "null or 1" or "anything but 2, 3, and 4". How can I do this?
I was unable to find a definitive reference for syntax for the parentFilter, only very simple examples.
A negative query needs to be prefixed with what it's going to remove the documents from. Think of it as the intersection between the two sets, and if you only have the set which are "these documents should be removed", you have nothing to remove them from.
The regular query parser (and the edismax handlers) append the set of all documents, *:* automagically in front of negative queries for you, so it appears to work - until you start with longer AND and OR statements involving negative queries, where you suddenly need to prefix *:* as well.
The same is the case in the parentFilter syntax - there is no inherent set of all documents automagically prefixed internally, so if you have a negative query, you'll have to add it yourself.
*,[child parentFilter="*:* -solr_record_type:[2 TO 4]"]

Solr dismax Query Over Multiple Fields

I am trying to do a solr dismax query over multiple fields, and am a little confused with the syntax.
My core contains a whole load of podcast episodes. The fields in the index are EPISODE_ID, EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS.
Now, when I do a query I would like to search for the query term in the EPISODE_TITLE, EPISODE_DESC, and EPISODE_KEYWORDS fields, with different boosts for the different fields.
So when I search for 'jedi', the query I've built looks like this:
http://localhost:8983/solr/episode_core/select?
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
However, this doesn't seem to work - it returns zero records.
When I put a default field like below, it now works, but this is kind of crap because it means I'm not getting results from searching all of the 3 fields:
http://localhost:8983/solr/episode_core/select?&df=EPISODE_DESC
&defType=dismax&q=jedi&fl=EPISODE_ID,EPISODE_TITLE,EPISODE_DESC,EPISODE_KEYWORDS
&qf=EPISODE_TITLE^3.0+EPISODE_DESC^2.0+EPISODE_KEYWORDS
Is there something I am missing here? I thought that you could search over multiple fields, and I thought that the 'qf' parameter would mean you didn't need to supply the default field parameter?
All help much appreciated...
Your idea is correct. If you've defined qf (query fields) for Dismax, there shouldn't be any need to specify a df (default field).
Can you be more specific about what isn't working?
Also, read up on Configuration Invariants in solrconfig.xml as it is possible your configuration could be sending some different parameters than you've specified in the URL.
(E.g. if you're seeing a specific error message asking you to provide a df)

Solr: Strange behavior with quoting and escaping

I am using filter queries with Solr 4.10.0 / Lucene 4.10.0 and have the strange situation that while
fq=areas:Finanz- & Rechnungswesen and
fq=areas:"Finanz- & Rechnungswesen"
yield the same set of documents,
fq=areas:E-Commerce & Neue Medien and
fq=areas:"E-Commerce & Neue Medien"
don't – in the latter case, the set of results is empty.
I executed the queries in the Solr admin UI and checked in the Solr log that the filters correctly translate to the query params
fq=areas:Finanz-+%26+Rechnungswesen
fq=areas:"Finanz-+%26+Rechnungswesen"
fq=areas:E-Commerce+%26+Neue+Medien
fq=areas:"E-Commerce+%26+Neue+Medien"
respectively. Only in the last case, the result set is empty. Can anyone explain why this is the case? Unfortunately, Spring Data Solr quotes multi-word filters, so it gives a wrong result in that case.
Without seeing the data within your index it's hard to diagnose exactly why you have different numbers of results, however your queries may not be behaving how you expect due to the field syntax.
the filter query areas:Finanz- & Rechnungswesen will be parsed as:
areas:Finanz- {default_field}:rechnungswesen where {default_field} is whatever has been configured as your default field when one has not been supplied.
In order to debug these queries more easily, have a look at the results with debugQuery=true, this can also be done in the
Solr Admin UI's query interface.
To make sure that all terms are limited to your areas field, use parentheses, e.g:
areas:(Finanz- & Rechnungswesen)
For more details, have a look at the Solr query parser syntax: https://wiki.apache.org/solr/SolrQuerySyntax#Default_QParserPlugin:_LuceneQParserPlugin

Solr Boolean query on multi-valued field in filter query

We have a multi-valued indexed field named tags. We want to find all documents that meet one of the following conditions via a filter query:
if tag flagged is present, then tag safe should also be present.
tag flagged is not present.
I tried fq=(tags:(flagged AND safe) OR -tags:flagged) but it is not returning the desired results. Instead it is returning documents taggedsafe and not tag flagged i.e. the result is same as this query: fq=(tags:safe AND -tags:flagged). How do I fix my query?
Also both fq=(tags:safe AND -tags:flagged) and fq=(tags:safe OR -tags:flagged) are returning the same results. Why is this?
Solr version: 3.6.2
The following works correctly.
From Erik Hatcher (solr-user mailing group):
Inner purely negative clauses aren't allowed by Lucene. (Solr supports top-level negative clauses, though, so q=NOT foo works as expected.)
To get a nested negative clause to work, try this:
q=tags:(flagged AND safe) OR (*:* AND NOT tags:flagged)

Solr Index appears to be valid - but returns no results

Solr newbie here.
I have created a Solr index and write a whole bunch of docs into it. I can see
from the Solr admin page that the docs exist and the schema is fine as well.
But when I perform a search using a test keyword I do not get any results back.
On entering * : *
into the query (in Solr admin page) I get all the results.
However, when I enter any other query (e.g. a term or phrase) I get no results.
I have verified that the field being queried is Indexed and contains the values I am searching for.
So I am confused what I am doing wrong.
Probably you don't have a <defaultSearchField> correctly set up. See this question.
Another possibility: your field is of type string instead of text. String fields, in contrast to text fields, are not analyzed, but stored and indexed verbatim.
I had the same issue with a new setup of Solr 8. The accepted answer is not valid anymore, because the <defaultSearchField> configuration will be deprecated.
As I found no answer to why Solr does not return results from any fields despite being indexed, I consulted the query documentation. What I found is the DisMax query parser:
The DisMax query parser is designed to process simple phrases (without complex syntax) entered by users and to search for individual terms across several fields using different weighting (boosts) based on the significance of each field. Additional options enable users to influence the score based on rules specific to each use case (independent of user input).
In contrast, the default Lucene parser only speaks about searching one field. So I gave DisMax a try and it worked very well!
Query example:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video
You can also specify which fields to search exactly to prevent unwanted side effects. Multiple fields are separated by spaces which translate to + in URLs:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features+text
Last but not least, give the fields a weight:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features^20.0+text^0.3
If you are using pysolr like I do, you can add those parameters to your search request like this:
results = solr.search('search term', **{
'defType': 'dismax',
'qf': 'features text'
})
In my case the problem was the format of the query. It seems that my setup, by default, was looking and an exact match to the entire value of the field. So, in order to get results if I was searching for the sit I had to query *sit*, i.e. use wildcards to get the expected result.
With solr 4, I had to solve this as per Mauricio's answer by defining type="text_en" to the field.
With solr 6, use text_general.

Resources