Inconsistent search results using fuzzy search across multiple collections

I'm building a search query using the edismax parser and specifying the query fields. Sometimes I need to search across multiple collections and sometimes I am just searching against a single collection. In either case, I am generating a single query by specifying the collections parameter in addition to my query fields.
This means that my qf parameter may list fields that do not exist in one or more collections. Normally this isn't a problem and I get back the results I expect (provided I am using the edismax parser). However, I have noticed that if I perform a fuzzy search this way, I am getting inconsistent results.
For example:
http://localhost:8983/solr/activity/select?q=jva040~2&defType=edismax&qf=Code
gives me results with Codes like
RVA010, JAA048, RVA041
but if I issue a query with a non-existent field in the activity collection like
http://localhost:8983/solr/activity/select?q=jva040~2&defType=edismax&qf=Code+Poop
I get results with Codes like
53721ILTHRS-CHFSPMT-2, 53721ILTHRS-CHFSCOS-2, 53721ILTHRS-CHFSNEO--11/2/15
Is this a bug within Solr or am I constructing this query wrong? I am using Solr version 5.2.1
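To compare the two requests side by side, they can be built programmatically. This is a minimal sketch, assuming the host, collection, and field names from the URLs above. The helper name build_select_url is hypothetical. The optional debugQuery=true flag asks Solr to include the parsed query in the response, which may help show how the fuzzy term is applied to each field in qf.

```python
from urllib.parse import urlencode


def build_select_url(base, query, fields, debug=False):
    """Build an edismax /select URL (hypothetical helper, not part of Solr)."""
    params = {
        "q": query,
        "defType": "edismax",
        "qf": " ".join(fields),  # qf is a space-separated list of fields
    }
    if debug:
        # debugQuery=true makes Solr return the parsed query for inspection
        params["debugQuery"] = "true"
    return base + "/select?" + urlencode(params)


base = "http://localhost:8983/solr/activity"
# Same fuzzy query, with and without the non-existent field
print(build_select_url(base, "jva040~2", ["Code"], debug=True))
print(build_select_url(base, "jva040~2", ["Code", "Poop"], debug=True))
```

Comparing the parsed query in the two responses should show whether the extra field changes how the fuzzy clause is built.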

Related

Apache Solr use entire string for search within collection

I have managed to create a dataset using Apache Solr. I have also managed to make queries, such as in this example:
content:(test1 OR test2) OR title: test2
I would now like to search the dataset using an entire string, in a similar fashion to searching on Google. Is the correct approach to keep using OR clauses on the title and content for each word in the query, or is there a better way to achieve this? (I am not looking for exact matches, just the most relevant ones.)

You can use the dismax or edismax query parser for this, and pass any phrases you have along with boosts.
The DisMax query parser is designed to process simple phrases (without
complex syntax) entered by users and to search for individual terms
across several fields using different weighting (boosts) based on the
significance of each field. Additional options enable users to
influence the score based on rules specific to each use case
(independent of user input).
The parameters are described in detail in the Solr documentation for the DisMax query parser.
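As an illustration, the whole user string can be sent as q while qf lists the fields to search, each with an optional boost in the field^boost form. This is a sketch only: the field names title and content come from the question, while the boost values and the helper name edismax_params are assumptions.

```python
from urllib.parse import urlencode


def edismax_params(user_text, field_boosts):
    """Build edismax parameters from raw user text (hypothetical helper)."""
    # qf takes space-separated entries of the form field^boost
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    return {"q": user_text, "defType": "edismax", "qf": qf}


# Assumed boosts: matches in title count twice as much as matches in content
params = edismax_params("test1 test2", {"title": 2, "content": 1})
print(urlencode(params))
```

The user's text is passed through unchanged, so there is no need to split it into words and join them with OR by hand.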

Is it possible to exclude specific values from being included in Solr facets?

I'm using Solr facets to get the most common values for specific fields. It has occurred to me that (for business logic purposes) it would be preferable to exclude certain values. I cannot seem to find a way to do this, however.
I'm not looking to exclude the filter query, as seems to be commonly discussed.
If I'm getting the top 3 facets for a field, and seeing that "ValueA", "ValueB", and "ValueC", I'd like to say, essentially, "Get facets that aren't ValueB". So my facet instead returns data for "ValueA", "ValueC", and "ValueD".

Use the facet.excludeTerms parameter. According to the source the format seems to be "term1,term2" to exclude those two terms.
The feature was introduced with Solr 6.5.
If you need the same behavior before Solr 6.5, there are two options. If the terms to exclude differ from query to query, you will have to filter them out in your controller or Solr interfacing code. If the same term or terms should be excluded across the whole index for all queries, add a separate field and filter out those terms while indexing.
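Both options can be sketched as follows. The first function builds request parameters using facet.excludeTerms. The second is a client-side fallback for older versions and assumes the facet counts arrive in Solr's flat JSON layout of alternating terms and counts. The function names and the example counts are hypothetical.

```python
def facet_params(field, limit, exclude_terms):
    """Request parameters using facet.excludeTerms (Solr 6.5 and later)."""
    return {
        "q": "*:*",
        "rows": 0,
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,
        "facet.excludeTerms": ",".join(exclude_terms),  # "term1,term2"
    }


def drop_terms(flat_counts, exclude_terms, limit):
    """Client-side fallback: filter a flat [term, count, term, count, ...] list."""
    excluded = set(exclude_terms)
    pairs = zip(flat_counts[0::2], flat_counts[1::2])
    kept = [(term, count) for term, count in pairs if term not in excluded]
    return kept[:limit]


# Made-up counts, using the value names from the question
counts = ["ValueA", 10, "ValueB", 8, "ValueC", 5, "ValueD", 3]
print(drop_terms(counts, ["ValueB"], 3))
```

With the fallback, the request should ask for at least limit plus the number of excluded terms, so that enough entries remain after filtering.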

Solr does not accept unparsed query

I have added some documents to my Solr index using a requestHandler, and now I am trying to query them from the web UI. I get the correct result when my query parameter is in the format
[id]:[search-item]
but I want to search without writing the query in this format. For example, if I have to search for cat, I want to just type "cat" and get the result, rather than typing "animal:cat".
I am new to Solr, so I am not sure where I am going wrong.

Use the DisMax query parser.
Extract from the DisMax documentation:
The DisMax query parser is designed to process simple phrases (without
complex syntax) entered by users and to search for individual terms
across several fields using different weighting (boosts) based on the
significance of each field. Additional options enable users to
influence the score based on rules specific to each use case
(independent of user input).
In general, the DisMax query parser's interface is more like that of
Google than the interface of the 'standard' Solr request handler. This
similarity makes DisMax the appropriate query parser for many consumer
applications. It accepts a simple syntax, and it rarely produces error
messages.
Also see the full documentation of the DisMax query parser.

SOLR: ordering results alphabetically by field

SOLR results are normally ordered by "best match" of your search criteria. Is it possible to order the results alphabetically by a given SOLR field?
I realize that this is not a typical use case, but here's my motivation. We have quite a lot of code written around SOLR that performs queries based on user searches against the various fields of our data. Most of the time, we want a relevancy ordering (i.e. best matches first).
But one anomalous use-case requires that we return data ordered alphabetically by field. I could perform this query using our SQL database (avoiding SOLR altogether), but I'd have to replicate an awful lot of code that's tailored around consuming SOLR results (facets in particular). I'm hoping to use the same code path, if it's possible to get such an ordering from SOLR.

Yes, you just have to set the sort parameter to the field name followed by a direction, in the form sort=<field-name> asc (or desc).
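A minimal sketch of the request parameters, assuming a sortable field. The field name title_sort and the helper name are hypothetical.

```python
from urllib.parse import urlencode


def sorted_query_params(query, sort_field, descending=False):
    """Add a sort clause of the form '<field> asc' or '<field> desc'."""
    direction = "desc" if descending else "asc"
    return {"q": query, "sort": f"{sort_field} {direction}"}


# title_sort is an assumed field name; substitute a field from your schema
print(urlencode(sorted_query_params("*:*", "title_sort")))
```

Since only the sort parameter changes, the rest of the request (facets included) can stay on the same code path.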

Solr - retrieving facet counts for unfiltered version of query

I'm using Solr for searching, and recently started using faceting to allow users to narrow their search. However, once the user filters by one of the facets, the other filter options are no longer returned in the facet results. This is expected, but not what I'd like.
Is there some way to return the facet fields and counts for the unfiltered query, without doing an extra search? For instance, if the user filters by category (by selecting a specific category), I'd like them to still be able to pick one of the other categories without having to explicitly remove the filter first. (That is, all of the categories—and their counts—should still be returned by Solr, so that I can include them on the page along with the filtered query set.)
I suspect this may not be possible. If it isn't I can just do an extra query per search, which would leave out the filter (and return 0 rows), as described in a previous StackOverflow question. But I thought I'd ask: does anyone know a way to do this without multiple queries?

This is called multi-select faceting and it is possible using specific LocalParams to exclude filters when faceting. See "Tagging and excluding Filters" for details.

This Stack Overflow answer explains the same thing with an example:
SolrNet : Keep Facet count when filtering query,
and here is a current Solr documentation URL, since the URLs in both this answer and the linked one are now outdated:
https://solr.apache.org/guide/8_11/faceting.html#tagging-and-excluding-filters
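The tag and exclude mechanism can be sketched as follows: the filter query is given a tag through the {!tag=...} local parameter, and the facet field excludes that tag through {!ex=...}, so its counts are computed as if the filter were absent. The field name category, the tag name sel, and the helper name are assumptions for illustration.

```python
from urllib.parse import urlencode


def multiselect_facet_params(query, field, selected_value, tag="sel"):
    """Filter on one facet value while still counting all values of that field."""
    return [
        ("q", query),
        # Tag the filter so that faceting can refer to it
        ("fq", "{!tag=" + tag + "}" + field + ":" + selected_value),
        ("facet", "true"),
        # Exclude the tagged filter when counting this facet field
        ("facet.field", "{!ex=" + tag + "}" + field),
    ]


params = multiselect_facet_params("shoes", "category", "boots")
print(urlencode(params))
```

The result set is still restricted by the filter, while the category facet returns counts for every category in a single request.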
