Limiting / Filtering multivalue fields in Solr - solr

Is there a way to limit, or filter, the returned text of a multivalued field in Solr? Given the following document structure in Solr:
...
<doc>
<str name="title">example</str>
<arr name="foo">
<str>bar1</str>
<str>bar2</str>
<str>bar3</str>
<str>bar4</str>
<str>bar5</str>
<str>bar6</str>
</arr>
</doc>
...
I'd like to limit the response to only show 1 of the "foo" values based on a Filter Query request. So for example, the query:
select/?q=example&fq=foo:bar2
I would want a response of:
...
<doc>
<str name="title">example</str>
<arr name="foo">
<str>bar2</str>
</arr>
</doc>
...

Nope. There is no way to filter the values of a multivalued field that are returned with the response.
You can easily do it on the client side, though.
If faceting is an acceptable way to get the list, you can use facet.prefix to limit the values of the field foo that are returned as facets.
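The client-side approach could look roughly like the sketch below. It assumes the response has already been parsed into plain Python dictionaries (for example from a JSON response); the helper name filter_multivalued is made up for illustration and is not part of Solr or any client library.

```python
def filter_multivalued(doc, field, wanted):
    """Return a copy of doc whose multivalued `field` keeps only values in `wanted`."""
    filtered = dict(doc)  # shallow copy so the original document is left untouched
    filtered[field] = [value for value in doc.get(field, []) if value in wanted]
    return filtered


# Document shaped like the one in the question (hypothetical parsed form).
doc = {"title": "example", "foo": ["bar1", "bar2", "bar3", "bar4", "bar5", "bar6"]}

# Keep only the value that was used in the filter query fq=foo:bar2.
print(filter_multivalued(doc, "foo", {"bar2"}))
# prints {'title': 'example', 'foo': ['bar2']}
```

Solr still sends every value over the wire; the trimming happens only after the response arrives.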

Did you try using dynamic fields if you know the sample space of values for 'foo'? For example:
and then filter on bar_x:true. You would end up using a large number of dynamic fields.

A filter query should work. Try the following:
&fq=+foo:"bar2"

Related

Solr - returning counts for facets specified in query

I would like to create a query that will return facet counts for some field values even if that value count is 0. But on the other hand I don't want counts for the values that were not in the original query.
For example if I use in my query:
<arr name="fq">
<str>Field:(4 OR 5)</str>
</arr>
and there exists no document with the value 5 in this field, I would like to get back:
<lst name="facet_fields">
<lst name="VersionStatus">
<int name="4">2</int>
<int name="5">0</int>
</lst>
</lst>
But there shouldn't be counts for other values (e.g. 1, 2, 3, ...), because they weren't specified in the query.
Is that even possible? I was trying to achieve that with missing=true parameter, but that didn't work.
Instead of faceting on the field with facet.field, you can use N facet.query params, one for each term in your fq. So, in your example:
&facet.query=Field:4&facet.query=Field:5
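As a rough sketch of how those parameters might be assembled, the snippet below builds one facet.query per value from the same list that feeds the fq. The function name facet_query_params and the default q=*:* are assumptions for illustration; only the standard library is used, and no request is sent.

```python
from urllib.parse import urlencode, parse_qs


def facet_query_params(field, values, q="*:*"):
    """Build a query string with one facet.query per value that appears in the fq."""
    values = [str(v) for v in values]
    params = [
        ("q", q),
        ("facet", "true"),
        # Same restriction as in the question: Field:(4 OR 5)
        ("fq", "{}:({})".format(field, " OR ".join(values))),
    ]
    # One facet.query per value, so only these values are counted.
    params += [("facet.query", "{}:{}".format(field, v)) for v in values]
    return urlencode(params)  # percent-encodes each name and value


query_string = facet_query_params("Field", [4, 5])
print(parse_qs(query_string)["facet.query"])
```

Each facet.query is counted on its own, so a value with no matching documents should still come back with a count of 0, and values outside the list are never counted.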

How to search over documents with 2 or more entries in multivalued field in Solr?

I have a schema that allows a multivalued field. How do I construct a search that only returns documents that have 2 or more entries in that field? For example, in this subset of data:
<doc>
<str name="id">A</str>
<arr name="multivaluedField">
<str>One</str>
<str>Two</str>
</arr></doc>
<doc>
<str name="id">B</str>
<arr name="multivaluedField">
<str>One</str>
</arr></doc>
<doc>
<str name="id">C</str>
<arr name="multivaluedField">
<str>Three</str>
<str>Four</str>
</arr></doc>
The search would return documents A and C only, since they each have 2 entries in multivaluedField, even though the entries differ.
The easiest (and most effective) way would be to index an integer value that contains the count of values alongside the existing values, so you have a multiValued_count field. This field can be indexed, and you can do both efficient range queries and exact value lookups on it.
You can do this directly in your indexing code or in an update processor if needed.
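A minimal sketch of the "in your indexing code" variant, assuming documents are plain Python dictionaries before they are sent to Solr. The helper with_value_count and the default field name (the source field name plus _count) are illustrative choices, and the count field would still need to be defined in the schema.

```python
def with_value_count(doc, field, count_field=None):
    """Return a copy of doc with an extra integer field holding len(doc[field])."""
    count_field = count_field or field + "_count"  # hypothetical naming convention
    values = doc.get(field, [])
    if not isinstance(values, (list, tuple)):
        values = [values]  # a single value counts as one entry
    enriched = dict(doc)
    enriched[count_field] = len(values)
    return enriched


docs = [
    {"id": "A", "multivaluedField": ["One", "Two"]},
    {"id": "B", "multivaluedField": ["One"]},
    {"id": "C", "multivaluedField": ["Three", "Four"]},
]
for doc in docs:
    print(with_value_count(doc, "multivaluedField"))
```

With such a field in place, a range query along the lines of multivaluedField_count:[2 TO *] would select documents A and C.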

Understanding Solr nested queries

I'm trying to understand Solr nested queries, but I'm having a problem understanding the syntax.
I have the following two indexed documents (among others):
<doc>
<str name="city">Guarulhos</str>
<str name="name">Fulano Silva</str>
</doc>
<doc>
<str name="city">Fortaleza</str>
<str name="name">Fulano Cardoso Silva</str>
</doc>
If I query for q="Fulano Silva"~2&defType=edismax&qf=name&fl=score I have:
<doc>
<float name="score">28.038431</float>
<str name="city">Guarulhos</str>
<str name="name">Fulano Silva</str>
</doc>
<doc>
<float name="score">19.826164</float>
<str name="city">Fortaleza</str>
<str name="name">Fulano Cardoso Silva</str>
</doc>
So I thought that if I queried for:
q="Fulano Silva"~2 AND __query__="{!edismax qf=city}fortaleza" &defType=edismax&qf=name&fl=score
I'd boost the score of the second document a bit, but actually I get an empty result set with numFound=0.
What am I doing wrong here?
You need to replace the "=" with ":" to use the nested query syntax:
q="Fulano Silva"~2 AND _query_:"{!edismax qf=city}fortaleza" &defType=edismax&qf=name&fl=score
*Use _query_: instead of _query_=
Hope this works...
EDIT: When you say q=, are you specifying the query in a URL, or is the text after the q= being put in an application or the Solr dashboard? If we're talking about a URL, you may need to use percent-encoding to get it to work. I mentioned that below, but since I haven't heard from you, I thought I'd reiterate.
Why don't you do q=name:"Fulano Silva" AND city:"fortaleza"?
Another possibility: q=_query_:"{!edismax qf='name'}Fulano Silva" AND city:"fortaleza"
If you're set on a nested query, select?defType=edismax&q="Fulano Silva" AND _query_:"{!edismax qf='city' v='fortaleza'}" should work, but the results and the way it matches will depend on what analyzers you are using to query and index name and city. Also, if these queries are in your query string, make sure you are encoding them properly.
In order to help you any more, I need to know what you're trying to accomplish with your query. Then perhaps we can be sure you have the right indexing set up, that edismax is the right query handler, etc.
On top of the previous comments, the asker has misspelled _query_ as __query__ (note the double underscores in the second, misspelled, version). Solr expects _query_ to be spelled with only one underscore (_) before and one after the word query, not two.
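To illustrate the points about the _query_: spelling and percent-encoding together, here is a small sketch that builds the request URL with the standard library. The base URL and the function name nested_query_url are placeholders, not values from the question, and nothing is sent over the network.

```python
from urllib.parse import urlencode, parse_qs


def nested_query_url(base, phrase, nested_field, nested_value):
    """Build a select URL whose q combines a phrase query with a nested edismax query."""
    # Single underscores and a colon: _query_:"{!edismax qf=<field>}<value>"
    q = '"{}"~2 AND _query_:"{{!edismax qf={}}}{}"'.format(phrase, nested_field, nested_value)
    params = {"defType": "edismax", "qf": "name", "fl": "score", "q": q}
    return base + "?" + urlencode(params)  # quotes, braces and spaces are escaped here


# Hypothetical host and collection name.
url = nested_query_url("http://localhost:8983/solr/collection1/select",
                       "Fulano Silva", "city", "fortaleza")
print(url)
print(parse_qs(url.split("?", 1)[1])["q"][0])
```

Decoding the URL again, as the last line does, is a quick way to check that the server would see the query text exactly as intended.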

Replacing SOLR output field value

I have the Solr query below, which works fine.
query:"COMPLEX CONDITION 1" OR query:"COMPLEX CONDITION 2"
I get 4 documents in the result: 2 from condition 1 and 2 from condition 2. I need to know which condition each document belongs to.
I cannot figure it out from the result because the conditions are too complex.
What I want to do is change the value of the "status" field in the output.
Let's say status=Active for condition 1 and status=Expired for condition 2.
The current value of status is not accurate, because the status is decided by the conditions I use.
Is there a way to overwrite the output value of any field(s) in Solr?
Have you tried using highlighting to determine which documents matched which condition? If you turn on highlighting (&hl=on&hl.fl=<fields_you're_trying_to_match>), Solr returns a structure called "highlighting" at the end of the results (whether you're returning results in JSON or XML). It contains one entry per matching document, named according to the unique key of your index (if there is one), and each entry lists the fields and snippets that matched.
<lst name="highlighting">
<lst name="1">
<arr name="title">
<str>Bob <em>Jones</em></str>
</arr>
<arr name="category">
<str><em>Jones</em> Family</str>
</arr>
<arr name="description">
<str>This is a book about Bob <em>Jones</em>, the patriarch of the <em>Jones</em> Family.</str>
</arr>
</lst>
</lst>
More here:
How to return column that matched the query in Solr..?
I apologize that this doesn't answer the latter part of your question, but it should give you some help with the first part.
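As a sketch of how the highlighting structure could be read on the client, the snippet below assumes a JSON response that has been parsed into a dictionary, with "highlighting" keyed by document id and then by field name. The helper matched_fields is hypothetical, and mapping the matched fields back to "condition 1" or "condition 2" would remain application-specific.

```python
def matched_fields(response):
    """Map each document id to the sorted list of fields that produced highlight snippets."""
    highlighting = response.get("highlighting", {})
    return {
        doc_id: sorted(field for field, snippets in fields.items() if snippets)
        for doc_id, fields in highlighting.items()
    }


# Parsed form of the highlighting example above (document with unique key "1").
response = {
    "highlighting": {
        "1": {
            "title": ["Bob <em>Jones</em>"],
            "category": ["<em>Jones</em> Family"],
            "description": ["This is a book about Bob <em>Jones</em>, "
                            "the patriarch of the <em>Jones</em> Family."],
        }
    }
}
print(matched_fields(response))
# prints {'1': ['category', 'description', 'title']}
```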

Filter doc if a specified multivalued field contains only one value

We have a query case where we need to filter out a doc if a specified multivalued field contains only one value.
For instance:
We have an index of suits, each including clothes, trousers, or other things. If only one product is left in a suit because the others are out of stock, we can't show the suit to the user, because it's no longer a 'suit'.
Here is our data:
<doc>
<int name="suitId">001</int>
<arr name="productName">
<str>T-shirt</str>
<str>jeans</str>
</arr>
</doc>
<doc>
<int name="suitId">002</int>
<arr name="productName">
<str>T-shirt</str>
</arr>
</doc>
We want to exclude the suit with suitId=002.
It would be better to have a separate field that maintains the count of products in a suit, and to use it to filter the suits.
I don't think you can use range queries to count the values of a multivalued text field.
You can probably use productName:[* TO *] to select suits that have at least one product, but that does not give you the count.
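The two options can be sketched as follows: a filter query string for a count field, and a client-side fallback for when the index cannot be changed. The field name productCount and both helper names are assumptions for illustration; the count field would have to be populated at index time.

```python
def min_count_fq(count_field, minimum):
    """Build a range filter query such as productCount:[2 TO *]."""
    return "{}:[{} TO *]".format(count_field, minimum)


def suits_with_min_products(docs, field="productName", minimum=2):
    """Client-side fallback: keep docs whose multivalued field has enough entries."""
    return [doc for doc in docs if len(doc.get(field, [])) >= minimum]


suits = [
    {"suitId": "001", "productName": ["T-shirt", "jeans"]},
    {"suitId": "002", "productName": ["T-shirt"]},
]
print(min_count_fq("productCount", 2))
print([doc["suitId"] for doc in suits_with_min_products(suits)])
```

The filter query keeps the work inside Solr and keeps paging and numFound consistent, whereas the fallback only trims results that were already fetched.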
