Solr questions regarding handler resolution and escaping

I have a couple of questions regarding Solr usage:
Certain requests can be sent to different paths (handlers?). For example, a MoreLikeThis request can be sent to either /select or /mlt.
I have found these two links in the Solr wiki:
http://localhost:8983/solr/mlt?q=id:UTF8TEST&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&mlt.match.include=false
http://localhost:8983/solr/select?q=apache&mlt=true&mlt.fl=manu,cat&mlt.mindf=1&mlt.mintf=1&fl=id,score
What is the reasoning behind this setup? If I decide to send my MoreLikeThis requests to /mlt, does this mean I cannot use any /select-specific features (if there even is such a thing), such as facets? If not, can a /select path be configured to handle all requests, from Spellcheck to Clustering?
How do you escape double-character special strings (&&, ||) in Lucene?
http://lucene.apache.org/java/2_9_1/queryparsersyntax.html#Escaping+Special+Characters
Do I escape the first character only (\&&) or do I escape both? And when do I need to escape them? A couple of tests that I performed on the example server provided in the Solr package were inconclusive:
http://localhost:8983/solr/select/?q=manu:%22apple%20%26%26%22%20AND%20manu:%22computer%22
This still returns results.

1) The rationale behind MoreLikeThisHandler is explained in the Solr wiki:
When you specifically want information about similar documents, you can use the MoreLikeThisHandler.
If you want to filter the similar results given by MoreLikeThis you have to use the MoreLikeThisHandler. It will consider the similar document result set as the main one so will apply the specified filters (fq) on it. If you use the MoreLikeThisComponent and apply query filters it will be applied to the result set returned by the main query (QueryComponent) and not to the one returned by the MoreLikeThisComponent.
2) You need to escape every single character, so && becomes \&\& and || becomes \|\|.
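As a rough illustration of escaping each special character individually, here is a minimal Python sketch. The helper name escape_lucene is hypothetical, and the character list is taken from the Lucene query parser syntax page linked in the question, with / added on the assumption that a newer parser version is in use. It is a sketch under those assumptions, not the escaping routine of any particular client library.

```python
import re

# Characters the Lucene query syntax treats as special. "&&" and "||" are
# handled by escaping each "&" and "|" on its own.
# Assumption: "/" is included for newer parser versions.
_LUCENE_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_lucene(text):
    """Put a backslash in front of every special character in text."""
    return _LUCENE_SPECIAL.sub(r'\\\1', text)


print(escape_lucene('apple && pear'))
print(escape_lucene('a || b'))
```

The escaped string still has to be URL-encoded before it goes into a request URL, since & also separates query-string parameters.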

Related

Exact phrase search in solr (no substring)

I am using plain Solr queries via HTTP from the rsolr Ruby gem, so it's practically the same as running a query from the Solr admin interface. I need to search for exact phrases like "SFO 10+" in Solr, but the query is treated as a substring search, and the "+" keeps its special meaning even when escaped with \\ as noted in the docs. The query must not match substrings, and the plus should be treated literally.
q: col0_t:"SFO 10+" returns 1 correct hit - 1 wrong
q: col0_t:"SFO 10\\+" same result
q: col0_t:"SFO 10" same result
The unexpected hit is:
"col0_t":"SFO 10-15"
It's a third-party database, so I would rather not modify anything in the database unless it's strictly necessary, because that could damage the original system. We have built our own module on top of this system. I hope to solve this directly in the query; otherwise I must clean the data afterwards in the app.

SOLR Solarium can we use filter-queries with dismax-queries?

I just built a search form backed by Solr; we are using the Solarium library to construct our requests.
We built a "huge" collection of filter queries like this one:
$query = $client->createQuery($client::QUERY_SELECT);
$query->setStart(0)->setRows(1000);
$query->addFilterQuery($query->createFilterQuery("foo")->setQuery("bar:true"));
$query->addFilterQuery($query->createFilterQuery("fo")->setQuery("ba:false"));
....
But we realized that the search only hits the single fields we specify in the filter queries, while we actually have to query multiple fields. While reading the docs I realized we may have taken the wrong approach. Would the correct approach be to use DisMax queries (in combination with facets?)? Can we use DisMax in combination with filter queries to "expand" our search to multiple fields (with boosts), or do we have to rework everything?
I'm missing the big picture needed to decide what the best working solution would be.
Help is much appreciated.
edit:
solr:
solr-spec 7.6.0
solarium:
solarium/solarium 6.0.1 PHP Solr client

You can give a query parser when giving the fq argument:
fq={!dismax qf="firstfield secondfield^5"}this is my query
The syntax is known as Local Parameters. Since dismax (or edismax, which you should normally use now) doesn't have an identifier in front of it, it is implicitly parsed as the type.
If a local parameter value appears without a name, it is given the implicit name of "type". This allows short-form representation for the type of query parser to use when parsing a query string.
You'll have to make sure that Solarium doesn't escape the value you give to setQuery, but seeing as you're already giving a field:value combination, it doesn't seem to get escaped. Double check the Solr log to see exactly what query is being sent to Solr (or ask Solarium to give you the exact query string being sent if possible).
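To check what should arrive at Solr, it can help to build the same request by hand. The sketch below uses only Python's standard library; the helper name build_query_string is hypothetical, and the field names come from the examples above. It shows a plain filter query and a local-parameters filter query sent as two fq parameters, under the assumption that the client does no additional escaping.

```python
from urllib.parse import urlencode, parse_qs


def build_query_string(main_query, filter_queries):
    """Encode q plus one fq parameter per filter query."""
    params = [("q", main_query)]
    params.extend(("fq", fq) for fq in filter_queries)
    return urlencode(params)


filters = [
    'bar:true',
    # Local parameters select the dismax parser for this filter only.
    '{!dismax qf="firstfield secondfield^5"}this is my query',
]
query_string = build_query_string("*:*", filters)
print(query_string)

# Decoding the string again shows what the server side would see.
print(parse_qs(query_string)["fq"])
```

Comparing this output with the request line in the Solr log is one way to spot unwanted escaping added by a client library.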

Spring Data Solr: Queries with "AND", "NOT" and "OR" not escaped or handled

We are using spring-data-solr, mainly using exact match/equals filter queries.
We have found that the values NOT, OR, and AND can be supplied, which are passed directly onto solr (without any pre-processing). This causes solr to error. For example, building a Criteria object like
Criteria.where("fuelType").is("AND")
Results in the following solr query
fq=fuelType:AND
We have found that if we call Solr directly with
fq=fuelType:"AND"
This would be fine, however, I can see that quotes are only added when there is whitespace in the value.
Is there something I am missing?
I still want to use the Standard Solr query parser if possible

The pull request for this has been merged:
https://jira.spring.io/browse/DATASOLR-437
https://github.com/spring-projects/spring-data-solr/pull/74
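For clients without that fix, the workaround from the question (always quoting the value) can be sketched on the client side. The Python below is only an illustration of the idea; quote_term and filter_query are hypothetical helper names and do not reflect how spring-data-solr implements it.

```python
def quote_term(value):
    """Wrap value in double quotes so AND, OR and NOT are read as plain terms."""
    # Escape backslashes first, then embedded double quotes.
    escaped = value.replace('\\', '\\\\').replace('"', '\\"')
    return f'"{escaped}"'


def filter_query(field, value):
    """Build an exact-match filter query such as fuelType:"AND"."""
    return f'{field}:{quote_term(value)}'


print(filter_query('fuelType', 'AND'))
print(filter_query('fuelType', 'unleaded petrol'))
```

Quoting every value, not just those containing whitespace, avoids having to maintain a list of reserved words.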

Solrnet q parameter issue with compound query

We are using SolrNet to issue a compound query to Solr based on a set of options that the user can select, e.g. phrase, exact phrase, exclusion, proximity, etc. We create the individual queries for the selected options using the SolrQueryByField API and combine them using a SolrMultipleCriteriaQuery with the AND operator. But when we submit the query to Solr, the q parameter that gets submitted has a + sign added across all the terms:
q=(ContentSearch:(roman)+AND+ContentSearch:("test+case")+AND+-ContentSearch:(wine)+AND+(ContentSearch:(A)+OR+ContentSearch:(B))+AND+ContentSearch:("catacombs+wine"~5)+AND+ContentSearch:([10+TO+20]))}
The +AND+ or "test+case" or +AND+- or 10+TO+20 is messing up the query parser. Has anybody encountered this before? Is it something to do with the url encoding when solrnet is sending the request to solr?

If you are using SolrNet 0.4.0, you can set the optional parameter Quoted=false on SolrQueryByField. This stops the default behavior of SolrQueryByField, which is to escape special characters.
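On the URL-encoding part of the question: in a form-encoded query string, + stands for a space, so the + signs in the logged q parameter may simply be encoded spaces rather than extra operators. A quick check on a fragment of the query above, using Python's standard library purely for illustration:

```python
from urllib.parse import unquote_plus

# Fragment of the q parameter as it appears in the request.
logged = 'ContentSearch:("test+case")+AND+-ContentSearch:(wine)'

# unquote_plus reverses form encoding: "+" becomes a space.
decoded = unquote_plus(logged)
print(decoded)
```

If the decoded form is the intended query, the encoding itself is probably not what is confusing the parser.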

Solr Index appears to be valid - but returns no results

Solr newbie here.
I have created a Solr index and written a whole bunch of docs into it. I can see from the Solr admin page that the docs exist and the schema is fine as well.
But when I perform a search using a test keyword I do not get any results back.
On entering *:* into the query (in the Solr admin page) I get all the results.
However, when I enter any other query (e.g. a term or phrase) I get no results.
I have verified that the field being queried is Indexed and contains the values I am searching for.
So I am confused about what I am doing wrong.

Probably you don't have a <defaultSearchField> correctly set up. See this question.
Another possibility: your field is of type string instead of text. String fields, in contrast to text fields, are not analyzed, but stored and indexed verbatim.

I had the same issue with a new setup of Solr 8. The accepted answer is not valid anymore, because the <defaultSearchField> configuration will be deprecated.
As I found no answer to why Solr does not return results from any fields despite being indexed, I consulted the query documentation. What I found is the DisMax query parser:
The DisMax query parser is designed to process simple phrases (without complex syntax) entered by users and to search for individual terms across several fields using different weighting (boosts) based on the significance of each field. Additional options enable users to influence the score based on rules specific to each use case (independent of user input).
In contrast, the default Lucene parser only speaks about searching one field. So I gave DisMax a try and it worked very well!
Query example:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video
You can also specify which fields to search exactly to prevent unwanted side effects. Multiple fields are separated by spaces which translate to + in URLs:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features+text
Last but not least, give the fields a weight:
http://localhost:8983/solr/techproducts/select?defType=dismax&q=video&qf=features^20.0+text^0.3
If you are using pysolr like I do, you can add those parameters to your search request like this:
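import pysolr

# Core URL assumed from the techproducts examples above
solr = pysolr.Solr('http://localhost:8983/solr/techproducts')
results = solr.search('search term', **{
    'defType': 'dismax',
    'qf': 'features text',
})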
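The request URLs above can also be assembled programmatically, which takes care of the encoding of spaces and boosts. This is a minimal sketch using only the standard library; the helper name dismax_url is hypothetical, and the base URL is the techproducts example used above.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:8983/solr/techproducts/select"


def dismax_url(term, field_boosts):
    """Build a DisMax select URL with one boost per query field."""
    # qf is a space-separated list of field^boost entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return BASE_URL + "?" + urlencode({"defType": "dismax", "q": term, "qf": qf})


url = dismax_url("video", {"features": 20.0, "text": 0.3})
print(url)
```

urlencode turns the space between fields into + and percent-encodes ^ as %5E, which decodes back to the same qf value on the server side.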

In my case the problem was the format of the query. It seems that my setup, by default, was looking for an exact match against the entire value of the field. So, in order to get results when searching for the term sit, I had to query *sit*, i.e. use wildcards to get the expected result.

With Solr 4, I had to solve this as per Mauricio's answer by setting type="text_en" on the field.

With solr 6, use text_general.
