Solr Facet Filtering

I have these fields in Solr
"IsFair": "True",
"IsHeight": "True",
"IsFat": "false",
"IsManly": "False"
But when filtering data I want them displayed as Fair, Height, Fat, Manly in a single field on the front end.
Something like a filter named "Appearance Type" which contains "Fair", "Height", "Fat", "Manly" as filter values. Someone suggested I use a Hybrid Filter, but I didn't understand how to achieve this.

I think the best implementation would be to create a multivalued string field appearance_type and generate a facet on it; later, when applying the filter, you can use the same field.
So your example document will have,
{
"id":"doc1",
"appearance_type":["fair","height"]
}
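As a sketch of how the facet and the filter would be wired together, here is a minimal Python snippet that builds the select URL; the base URL and the collection name `people` are assumptions for illustration:

```python
from urllib.parse import urlencode

def build_facet_url(base_url, collection, selected=None):
    """Build a Solr select URL that facets on appearance_type and
    optionally filters on a selected facet value."""
    params = [
        ("q", "*:*"),
        ("facet", "true"),
        ("facet.field", "appearance_type"),
    ]
    if selected:
        # Clicking e.g. "Fair" in the UI adds a filter query on the same field.
        params.append(("fq", 'appearance_type:"%s"' % selected))
    return "%s/%s/select?%s" % (base_url, collection, urlencode(params))

url = build_facet_url("http://localhost:8983/solr", "people", selected="fair")
```

The same field thus serves both the facet display and the filter query.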

How can I filter results by custom field in Solr query?

I need a custom field filter for my Solr data like:
{
"id":"1",
"name":"Test title",
"language":"en"
},
{
"id":"2",
"name":"Test title",
"language":"fr",
"parent": "1"
}
I need to get just the first item with the query
/select?q=name:test
So I need to filter results by the parent field in such a way that only one of the items is present in the result.
Thanks for any ideas.
When I need to query Solr I use SearchQuery() and set filterQueries on it; that is where filters for the search can be set.
final String FIELD_NAME = "name_text_mv"; // name of my field in Solr
SearchQuery searchQuery = init(facetSearchConfig); // init configs
searchQuery.setFreeTextQueryBuilder(text); // set the text of my search
setFiltersFreeTextSearch(searchQuery.getFilterQueries(), text, FIELD_NAME);
The function that does the magic (adding my filters to the search):
private void setFiltersFreeTextSearch(List<QueryField> filters, String text, String... fields) {
    text = StringUtils.stripAccents(text).toLowerCase();
    String textCapitalized = capitalizeEachWord(text); // custom helper; text is already lowercase
    for (String field : fields) {
        QueryField queryField = new QueryField(field, SearchQuery.Operator.OR, SearchQuery.QueryOperator.CONTAINS,
                text, text.toUpperCase(), textCapitalized);
        filters.add(queryField);
    }
}
As you can see, in this QueryField you can add the 'wheres' of your search in Solr. I was using CONTAINS, which acts as my 'LIKE', and OR to match any item.
So basically you can use QueryField() to add filters for your specific field.
Well, this was the solution for my case; anyway, it is just an idea. :)
(The project uses Java.)

Solr faceting based on function result

I'm trying to perform faceting based on a dynamic value. Basically I want identical behavior to the def function, but that doesn't seem to be available with faceting.
Consider these two "products":
{
"id":"product1",
"defaultPrice":19.99,
"overridePrice":14.99
},
{
"id":"product2",
"defaultPrice":49.99
}
I want to add that overridePrice is just an example. The actual field is a dynamic value that will depend on what context a search is performed in, and there may be many overridden prices, so I can't just derive price at index time.
For the response, I'm doing something like this for fl:
fl=price:def(overridePrice, defaultPrice) and using the same def function to perform sorting on price. This works fine.
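For illustration, the request parameters for that approach might be assembled like this (a Python sketch; the `*:*` query is just a placeholder):

```python
from urllib.parse import urlencode

# Derive an effective price from overridePrice when present, falling back to
# defaultPrice, both in the field list and in the sort.
params = urlencode({
    "q": "*:*",
    "fl": "id,price:def(overridePrice,defaultPrice)",
    "sort": "def(overridePrice,defaultPrice) asc",
})
```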
So now I want to apply the same logic to facets. I've tried using json.facet, which seemed like it would work:
json.facet={
price: "def(overridePrice, defaultPrice)"
}
I've tried other variations as well, such as field:def(overridePrice, defaultPrice) and field:price, but def doesn't seem to be available as a faceting function, and the derived price field is not available when faceting.
So the question: How can I perform faceting based on a default field like I'm doing for fl and sorting? Will this require a custom aggregation function, or is there a clever way I can do this without a custom function? It would be much more preferable to be able to do this with built-in Solr functionality.
I was able to do a hacky solution based on a tip in another question.
We can use two facets with a query to filter documents depending on whether a field exists.
Example:
{
price_override: {
type: query,
q: "overridePrice:[* TO *]",
facet: {
price_override:{
type:terms,
field: overridePrice
}
}
},
price_standard: {
type: query,
q: "-overridePrice:[* TO *] AND defaultPrice:[* TO *]",
facet: {
price_standard: {
type: terms,
field: defaultPrice
}
}
}
}
Explanation:
price_override: {
type: query,
q: "overridePrice:[* TO *]"
This range query only selects documents that have an overridePrice field.
price_standard: {
type: query,
q: "-overridePrice:[* TO *] AND defaultPrice:[* TO *]"
-overridePrice:[* TO *] omits documents with the overridePrice field, and selects documents with a defaultPrice field.
And the facet response:
"facets":{
"count":2,
"price_override":{
"count":1,
"price_override":{
"buckets":[{
"val":14.99,
"count":1}]}},
"price_standard":{
"count":1,
"price_standard":{
"buckets":[{
"val":49.99,
"count":1}]}}}
This does require manually grouping price_override and price_standard into a single facet group, but the results are as expected. This could also pretty easily be tweaked into a range query, which is my use case.
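To illustrate that manual grouping step, here is a small Python sketch that merges the two nested bucket lists from the sample response above into one list (sorting by value is just one possible presentation choice):

```python
def merge_price_buckets(facets):
    """Combine the price_override and price_standard bucket lists from the
    json.facet response into a single facet group for the UI."""
    buckets = []
    for group in ("price_override", "price_standard"):
        nested = facets.get(group, {}).get(group, {})
        buckets.extend(nested.get("buckets", []))
    return sorted(buckets, key=lambda b: b["val"])

facets = {
    "count": 2,
    "price_override": {"count": 1,
                       "price_override": {"buckets": [{"val": 14.99, "count": 1}]}},
    "price_standard": {"count": 1,
                       "price_standard": {"buckets": [{"val": 49.99, "count": 1}]}},
}
merged = merge_price_buckets(facets)
```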

Index a dictionary property in azure search

I have a DTO with a property of type Dictionary<string, string>. It's not annotated. When I upload my DTO and call indexClient.Documents.Index(batch), I get this error back from the service:
The request is invalid. Details: parameters : A node of type 'StartObject' was read from the JSON reader when trying to read the contents of the property 'Data'; however, a 'StartArray' node was expected.
The only way I've found to avoid it is by setting it to null. This is how I created my index:
var fields = FieldBuilder.BuildForType<DTO>();
client.Indexes.Create(new Index
{
Name = indexName,
Fields = fields
});
How can I index my dictionary?
Azure Cognitive Search doesn't support fields that behave like loosely-typed property bags like dictionaries. All fields in the index must have a well-defined EDM type.
If you don't know the set of possible fields at design-time, you have a couple options, but they come with big caveats:
In your application code, add new fields to the index definition as you discover them while indexing documents. Updating the index will add latency to your overall write path, so depending on how frequently new fields are added, this may or may not be practical.
Model your "dynamic" fields as a set of name/value collection fields, one for each desired data type. For example, if a new string field "color" is discovered with value "blue", the document you upload might look like this:
{
"id": "123",
"someOtherField": 3.5,
"dynamicStringFields": [
{
"name": "color",
"value": "blue"
}
]
}
Approach #1 risks bumping into the limit on the maximum number of fields per index.
Approach #2 risks bumping into the limit on the maximum number of elements across all complex collections per document. It also complicates the query model, especially for cases where you might want correlated semantics in queries.
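As a sketch of approach #2, here is a small Python helper (hypothetical, not part of any Azure SDK) that flattens a dictionary into that name/value shape before uploading:

```python
def to_dynamic_fields(data):
    """Flatten a dictionary into the name/value collection shape from
    approach #2, so it fits a well-typed complex collection field."""
    return [{"name": k, "value": v} for k, v in sorted(data.items())]

doc = {
    "id": "123",
    "someOtherField": 3.5,
    "dynamicStringFields": to_dynamic_fields({"color": "blue"}),
}
```

A separate collection per data type (strings, numbers, dates) keeps every field well-typed at the cost of a more awkward query model, as noted above.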

Solr Conditional Highlighting: How to highlight with conditions?

In a Solr implementation, I am trying to do conditional highlighting depending on fields other than the one we search on.
I want the matching "content" field highlighted only if Solr indicates that this field can be exposed for this element.
Given a Solr index populated with:
[{ "firstname": "Roman",
   "content": "A quick response is the best",
   "access": "" },
 { "firstname": "Roman",
   "content": "Responsive is important",
   "access": "contentAuthorized" }
]
I would like to get both documents in my answer, with the highlight on the "content" field only for the one with "access":"contentAuthorized", so I am executing the query:
q:(firstname:r* OR (+access:contentAuthorized AND +content:r*))
The expected answer would be:
...
"response":{"docs":[
    { "firstname":"Roman" },
    { "firstname":"Roman" }
]},
"highlighting":{
    "0f278cb5-7150-42f9-8dca-81bfa68a9c6e":{
        "firstname":["<em>Roman</em>"]},
    "105c6464-0350-4873-9936-b46c39c88647":{
        "firstname":["<em>Roman</em>"],
        "content":["<em>Responsive</em> is important"]}
}
But I actually get:
...
"response":{"docs":[
    { "firstname":"Roman" },
    { "firstname":"Roman" }
]},
"highlighting":{
    "0f278cb5-7150-42f9-8dca-81bfa68a9c6e":{
        "firstname":["<em>Roman</em>"],
        "content":["A quick <em>response</em> is the best"]},
    "105c6464-0350-4873-9936-b46c39c88647":{
        "firstname":["<em>Roman</em>"],
        "content":["<em>Responsive</em> is important"]}
}
So I get "content" in the highlighting even for the element where (+access:contentAuthorized AND +content:r*) is false.
Does anyone have an idea of how I could do conditional highlighting with Solr?
Thank you for reading this and for taking the time to think about it :D
If you want highlighting to be applied on certain fields only, then you need to set the query parameter hl.fl to those fields. In your case hl.fl=content. You should then set hl.requireFieldMatch=true.
Refer to the Solr highlighting documentation on hl.requireFieldMatch:
By default, false, all query terms will be highlighted for each field to be highlighted (hl.fl) no matter what fields the parsed query refer to. If set to true, only query terms aligning with the field being highlighted will in turn be highlighted.
For further info on how to use the query parameters: https://solr.apache.org/guide/8_6/highlighting.html
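For illustration, the relevant highlighting parameters could be assembled like this (a Python sketch that only builds the query string):

```python
from urllib.parse import urlencode

# Restrict highlighting to the content field, and only highlight terms from
# query clauses that actually target that field.
params = urlencode({
    "hl": "true",
    "hl.fl": "content",
    "hl.requireFieldMatch": "true",
})
```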

Query specificly indexed value in multivalued field

I have a multivalued field which is filled with an array of strings. Now I want to find all documents that have, e.g., foo as the second (!) string in this field. Is this possible?
If it is not, what would you recommend to achieve this?
For Solr, you can use an UpdateRequestProcessor to copy and modify the field to add a position prefix, so you'll end up with 2_91 or similar. You can use StatelessScriptUpdateProcessorFactory for that.
Alternatively, you could send this information as multiple fields and have dynamic field definition to map them.
Basically, for both Solr and ES, the underlying Lucene stores multivalued strings as just one long string, with a large position gap between the last token of one value and the first token of the next. So absolute positions require some sort of hack. Runtime hacks (e.g. the Elasticsearch example in the other answer) are expensive at query time. Content-modifying hacks (e.g. the URP in this example) cost additional disk space or a more complex schema.
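As a sketch of what the URP would do, here is the position-prefix transform in Python; the target field name `phone_no_prefixed` in the comment is a hypothetical example:

```python
def prefix_positions(values):
    """Prefix each value with its 1-based position, mimicking what a
    stateless script update processor could do at index time."""
    return ["%d_%s" % (i, v) for i, v in enumerate(values, start=1)]

prefix_positions(["92210", "91"])  # -> ["1_92210", "2_91"]
# Querying for "91 in the second position" then becomes a simple term
# query on the prefixed field, e.g. phone_no_prefixed:2_91
```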
In Elasticsearch, you can achieve this using a script filter. Here is a sample.
Consider a mapping for phone_no:
{
    "index": {
        "mappings": {
            "type": {
                "properties": {
                    "phone_no": {
                        "type": "string"
                    }
                }
            }
        }
    }
}
Put a first document:
POST index/type
{
"phone_no" :["91","92210"]
}
and a second one:
POST index/type
{
"phone_no" :["92210","91"]
}
So, if you want to find documents whose second value equals 91, here is the query:
POST index/type/_search
{
"filter" :{
"script": {
"script": "_source.phone_no[1].equals(val)",
"params": {
"val" :"91"
}
}
}
}
where val can be user-defined.
Note that the script handles no edge cases (for example, a field with fewer than two values may throw an exception), so modify the script for your needs.
Hope this helps!
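For reference, the filter body above can also be built programmatically. This Python sketch reproduces the legacy (Elasticsearch 1.x-era) script-filter shape shown here; newer Elasticsearch versions replace it with the `script` query:

```python
import json

def second_value_filter(field, value):
    """Build a legacy script-filter body that matches documents whose
    second value in `field` equals `value`."""
    return {
        "filter": {
            "script": {
                "script": "_source.%s[1].equals(val)" % field,
                "params": {"val": value},
            }
        }
    }

body = json.dumps(second_value_filter("phone_no", "91"))
```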
