Hybris: Combine different solr facet under one - solr

I have applied Solr facets to properties of products.
E.g. a product can be Medicine (0/1), Drug (0/1) or Poison (0/1).
0 means NO, 1 means YES.
These are different features of a product and hence appear as different facets. Is it possible to display them under one facet instead, e.g. "Type", under which the three Solr facets "Medicine", "Drug" and "Poison" would be displayed like this:
Type
-----
Medicine (50)
Drug (100)
Poison (75)

Not sure about Hybris, but you should be able to do this with facet queries. You would have one facet query for each of your three conditions. In the UI, you can organize the counts any way you want.
However, I am not sure why you can't just have a single multi-valued category field that contains the values Medicine and/or Drug and/or Poison. Faceting on that field would then give you the breakdown. If your values do not arrive in that form, you can probably merge them into one field, either with copyField or with a custom Update Request Processor chain.
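As a rough sketch of the facet-query approach, the request parameters could be built along these lines. The field names medicine, drug and poison, and the value 1 for YES, are assumptions taken from the question; the field names Hybris actually generates in the index will differ.

```python
from urllib.parse import urlencode

# Hypothetical field names: the real Hybris-generated Solr field names will differ.
TYPE_FLAGS = {"Medicine": "medicine", "Drug": "drug", "Poison": "poison"}


def build_type_facet_params(user_query):
    """Return Solr request params with one facet.query per boolean flag."""
    params = [("q", user_query), ("facet", "true")]
    for label, field in TYPE_FLAGS.items():
        # The {!key=...} local param labels the count in the response,
        # so the UI can list all three counts under a single "Type" heading.
        params.append(("facet.query", "{!key=%s}%s:1" % (label, field)))
    return params


query_string = urlencode(build_type_facet_params("*:*"))
print(query_string)
```

The three counts come back under facet_queries in the response, keyed by label, and the front end is free to render them as one group.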

This is super easy. Just make an IndexedProperty "Type" and a new custom ValueProvider for it. Then extract these values based on the boolean flags - just hard code if necessary. No need for anything more complex.
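The extraction logic such a value provider would need is small. The sketch below is plain Python, not the Hybris ValueProvider API; the flag names and sample products are made up for illustration.

```python
from collections import Counter

# Hypothetical flag names mapped to the labels shown under the "Type" facet.
FLAG_TO_TYPE = {"medicine": "Medicine", "drug": "Drug", "poison": "Poison"}


def type_values(product):
    """Return the multi-valued "Type" field for one product."""
    return [label for flag, label in FLAG_TO_TYPE.items() if product.get(flag) == 1]


# Made-up sample data.
products = [
    {"code": "p1", "medicine": 1, "drug": 0, "poison": 0},
    {"code": "p2", "medicine": 1, "drug": 1, "poison": 0},
    {"code": "p3", "medicine": 0, "drug": 0, "poison": 1},
]

# Faceting on the merged field amounts to counting its values.
type_counts = Counter(value for p in products for value in type_values(p))
print(dict(type_counts))
```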

I tried the solutions posted here, but they did not fit my requirement. I made changes in the facet navigation tag files to bring all the classification attribute facets (Medicine, Drug, Poison) under a single facet (Type).

Related

Query Field Boosting issue in solr when boosted term is present in more than 1 field

I have 4 fields in Solr,
e.g. Field1, Field2, Field3 and Field4.
My boost sequence is Field1^10, Field2^8, Field3^7, Field4^6.
Now say I search for the keyword marketing with q=(Field1:("marketing")^10 OR Field2:("marketing")^8 OR Field3:("marketing")^7 OR Field4:("marketing")^6).
Requirement:
Records with marketing in Field1 should appear first, then those with it in Field2, and so on. This is working fine.
Problem:
There is one record where marketing appears in both Field3 and Field4, and it ranks 2nd in the results, while the record containing marketing in Field2 ranks 3rd. This is probably because of the scoring mechanism.
Solution I need:
I want records ordered by the boost of the matching field, regardless of whether the term is found in multiple fields, i.e. the record with marketing in Field2 should always appear 2nd in the results.
The response given by @MatsLindh in the comments is the proper solution:
You can however try to increase your boosts to have a much larger difference between the different levels - field1^100000, field2^10000, field3^1000 field4^100 - that way, given the same content, two later fields will not add up to a larger boost than the ones before it.
Note: Be aware that the scores will be affected by more than just the boost (such as the number of occurrences, etc.).
I can think of two ways to solve this:
Use the qf query parameter (this requires the dismax or edismax query parser): if you pass the fields and boosts in qf instead of q, so that your query looks something like q=marketing&qf=field1^10 field2^8 field3^7 field4^6, then the parsed query will be something like max(field1:marketing^10, field2:marketing^8, field3:marketing^7, field4:marketing^6), so it doesn't matter in how many fields the term appears; only the maximum is taken.
Change the boosts so that each boost value is higher than the sum of all the lower boosts, e.g. field4^1, field3^2, field2^4, field1^8. That way, given the same per-field score, no combination of lower fields can outrank a higher field.
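A small worked example of why both suggestions help. It makes a strong simplifying assumption: each matching field contributes exactly its boost to the score, which ignores term frequency, field length and everything else Solr factors in. The function names are illustrative only.

```python
def sum_score(matched_fields, boosts):
    """Score when the per-field clauses are ORed: contributions add up."""
    return sum(boosts[f] for f in matched_fields)


def max_score(matched_fields, boosts):
    """Score when only the best field counts (dismax-style, tie = 0)."""
    return max(boosts[f] for f in matched_fields)


def is_superincreasing(boosts):
    """True if every boost exceeds the sum of all smaller boosts."""
    running_total = 0
    for value in sorted(boosts.values()):
        if value <= running_total:
            return False
        running_total += value
    return True


original = {"field1": 10, "field2": 8, "field3": 7, "field4": 6}
doubling = {"field1": 8, "field2": 4, "field3": 2, "field4": 1}

doc_field2_only = ["field2"]
doc_field3_and_4 = ["field3", "field4"]

# Original boosts with OR: 7 + 6 = 13 beats 8, so the field3+field4 record wins.
print(sum_score(doc_field3_and_4, original), sum_score(doc_field2_only, original))
# Taking the max instead: 7 loses to 8.
print(max_score(doc_field3_and_4, original), max_score(doc_field2_only, original))
# Doubling boosts with OR: 2 + 1 = 3 loses to 4.
print(sum_score(doc_field3_and_4, doubling), sum_score(doc_field2_only, doubling))
```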

Select multiple values of same facet using IBM WCS v7 and Apache Solr

We use IBM WCS v7 with embedded Apache Solr. Solr is used as the search engine for our e-commerce application.
As per a recent requirement, we want to use multi select facet functionality, where the user can check multiple facet attributes, and the corresponding values will be OR'ed to the search result.
Ex- I wish to check Color:RED, Color:BLUE and Color:BLACK in my default Search Results, so that each attribute value will be OR'ed in the resulting search results display.
We use the out-of-the-box SearchDisplayCmd for our search functionality, where the "metaData=" field takes care of the history of the applied facets, and "facet=" takes care of applying a facet field. For the query param "metaData", it encodes the multiple facets in base64. It uses a special delimiter to AND the different facet fields and restrict the search results.
brand:"POLO" color:"RED" shape:"Oval"
I want to know if there is any such delimiter, or any alternative, with which I can perform an OR operation on different values of the same facet attribute and still use the "metaData" parameter to maintain a history of the applied facets.
Any help on this is highly appreciated. Any other approaches to applying multiple values of the same facet attribute are also welcome.
Great Thanks in advance.
Regards,
Jitendriya Dash
I recently worked on this (selecting multiple values of the same facet)
and I was able to get it working.
Try to find where it hits the tag. The expression builder I used, getCatalogNavigationView, comes OOB. Make sure you use the appropriate searchProfile.
Pass the facet param in this way.
<c:forEach var="facetSelect" items="${paramValues.facet}">
<wcf:param name="facet" value="${facetSelect}" />
</c:forEach>
But with this method you will not be able to select values from any other attribute. If someone knows how to select values from both the same facet and different facets, please share.
Update SELECTION column of FACET table to 1 to mark the facetable attribute as multi selectable.
In WCS7+, for enabling multi select facet functionality go to FACET table and set 'SELECTION' column value to 1 instead of 0.
If an attribute is to be made multi select facet, you can make the changes from CMC. Go to the attribute dictionary select the attribute and in facetable properties, check 'Allow multiple facet value'.
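For reference, at the plain Solr level (outside WCS) multi-select faceting is usually expressed with tagged filter queries: values of the same field are ORed inside one fq, separate fq parameters are ANDed, and the facet field excludes its own tag so its counts do not collapse. The sketch below only builds those parameter strings; how the WCS "metaData" encoding maps onto them is not covered here, and the helper names are made up.

```python
def build_filter_queries(selected):
    """selected maps a facet field to the list of checked values.

    Values of one field are ORed; each field gets its own fq, and Solr
    ANDs separate fq parameters together.
    """
    filter_queries = []
    for field, values in selected.items():
        if not values:
            continue
        quoted = " OR ".join('"%s"' % v.replace('"', '\\"') for v in values)
        # {!tag=...} names the filter so facet counts can exclude it.
        filter_queries.append("{!tag=%s}%s:(%s)" % (field, field, quoted))
    return filter_queries


def build_facet_fields(fields):
    """Facet on each field while ignoring the filter tagged with its own name."""
    return ["{!ex=%s}%s" % (field, field) for field in fields]


selection = {"color": ["RED", "BLUE", "BLACK"], "brand": ["POLO"]}
for fq in build_filter_queries(selection):
    print("fq=" + fq)
for facet_field in build_facet_fields(["color", "brand"]):
    print("facet.field=" + facet_field)
```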

Difference between Solr Facet Fields and Filter Queries

I am using SolrMeter to test the Apache Solr search engine. The difference between facet fields and filter queries is not clear to me. The SolrMeter tutorial lists this as an example of facet fields:
content
category
fileExtension
and this as an example of filter queries:
category:animal
category:vegetable
category:vegetable price:[0 TO 10]
category:vegetable price:[10 TO *]
I am having a hard time wrapping my head around it. Could somebody explain by example? Can I use SolrMeter without specifying either facets or filters?
Facet fields are used to get statistics about the returned documents: specifically, for each value of that field, how many returned documents have that value. So for example, if you have 10 products matching a query for "soft rug" and you facet on "origin", you might get 6 documents for "Oklahoma" and 4 for "Texas". The facet field query will give you the numbers 6 and 4.
Filter queries on the other hand are used to filter the returned results by adding another constraint. The thing to remember is that the query when used in filtering results doesn't affect the scoring or relevancy of the documents. So for example, you might search your index for a product, but you only want to return results constrained by a geographic area or something.
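The distinction can be mimicked on an in-memory list, using the numbers from the example above. This is only an illustration of the two concepts, not of how Solr implements them, and the product data is made up.

```python
from collections import Counter

# Ten made-up products that all match the query "soft rug".
products = (
    [{"name": "soft rug %d" % i, "origin": "Oklahoma"} for i in range(6)]
    + [{"name": "soft rug %d" % i, "origin": "Texas"} for i in range(6, 10)]
)


def facet_counts(docs, field):
    """Facet field: count the returned documents per value of the field."""
    return Counter(doc[field] for doc in docs)


def apply_filter(docs, field, value):
    """Filter query: narrow the result set without touching relevance scores."""
    return [doc for doc in docs if doc[field] == value]


counts = facet_counts(products, "origin")
texas_only = apply_filter(products, "origin", "Texas")
print(dict(counts), len(texas_only))
```

Faceting leaves the result set alone and reports counts about it; filtering changes which documents are in the result set.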
A facet is a field (type) of the document, so category is the field. As Ansari said, facets are used to get statistics and provide grouping capabilities. You could apply grouping on the category field to show everything vegetable as one group.
Edit: The parts about searching inside of a specific field are wrong. It will not search inside of the field only. It should be 'adding a constraint to the search' instead.
Performing a filter query of category:vegetable will search for vegetable in the category field and no other fields of the document. It is used to search just specific fields rather than every field. Sometimes you know that the term you want only is in one field so you can search just that one field.

What are the advantages of the multivalued option in Solr

What are the advantages of the multivalued field option in Solr?
I have a field with comma-separated keywords.
I can do 2 things:
make a non-multivalued text field
make a multivalued text field which contains each keyword
I can still query in both cases. So what are the advantages of multivalued over non-multivalued?
Advantages of multivalued: you don't need to change the document design. If a document contains multiple values in one field, Solr/Lucene can handle this field.
Another advantage: multiple values can describe a document more exactly (think of the tags of a blog post, for example).
Advantages of non-multivalued: you can use specific features that require a single term (word) in one field, like spell checking. It is also a benefit for clustering (Carrot2) or grouping, which mostly work better on non-multivalued fields.
Querying the multivalued field will return what you want.
Example: doc1 has a keyword 'abc', and doc2 has a keyword 'abcd'. A query for the keyword 'abc' should match only doc1.
With the non-multivalued approach both documents will match, because you'll have to use LIKE-style (substring) syntax.
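The 'abc' versus 'abcd' case can be reproduced in a few lines. This models the single field as one unanalyzed comma-separated string searched by substring; a tokenized text field would behave differently. The extra keywords are made up.

```python
# doc1 has the keyword 'abc', doc2 has 'abcd'; the second keywords are made up.
docs = {"doc1": ["abc", "red"], "doc2": ["abcd", "blue"]}


def match_multivalued(documents, keyword):
    """Each keyword is its own value, so matching is exact per value."""
    return sorted(doc_id for doc_id, kws in documents.items() if keyword in kws)


def match_single_string(documents, keyword):
    """Keywords stored as one comma-separated string and searched by substring."""
    return sorted(
        doc_id for doc_id, kws in documents.items() if keyword in ",".join(kws)
    )


print(match_multivalued(docs, "abc"))
print(match_single_string(docs, "abc"))
```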
Multivalued fields can be very handy. Let's say you have many fields and you wish to search several of them, but not all. You can create a multivalued field that includes all the fields you want to search, and search in that field.
For example, let's say you have fields whose values may be strings or numbers, and you wish to search all the string values found in the document. You can create a multivalued field for all the string values and search in it.

Is it possible to have SOLR MoreLikeThis use different fields for model and matches?

Let's say I have documents with two fields, A and B.
I'd like to use SOLR's MoreLikeThis, but with a twist: I'm most interested in boosting documents whose A field is like my model document's B field. (That is, extract MLT's 'interesting terms' from the model B field, but only collect MLT results based on the A field.)
I don't see a way to use the mlt.fl fields or mlt.qf boosts to achieve this effect in a single query. (It seems mlt.fl specifies fields used for both discovery of 'interesting terms' and matching to those terms.) Am I missing some option?
Or will I have to extract the 'interesting terms' myself and swap the 'field:term' details?
(Other ideas in this same vein appreciated as well.)
Two options I see are:
Use a copyField - index your original document with a copy of field A named B, and then query using B.
Extend MoreLikeThisHandler and change the fields you query.
The first option costs a bit of programming (mostly configuration changes) and some memory consumption. The second involves more programming but no memory footprint increase. Hope one of them suits your needs.
I now think there are two ways to achieve the desired effect (without customizing the MLT source code).
First option: Do an initial MLT query with the MLT handler, adding the parameter &mlt.interestingTerms=details. This includes the list of terms that were deemed interesting, ranked with their relative boosts. The usual behavior uses those discovered terms against the same mlt.fl fields to find similar documents. For example, the response will include something like:
"interestingTerms":
["field_b:foo",5.0,"field_b:bar",2.9085307,"field_b:baz",1.67070794]
(Since the only thing about this initial query that's interesting is the interestingTerms, throwing in an fq that rules out all docs could help it skip unnecessary scoring work.)
Explicitly re-composing that interestingTerms info into a new OR query field_a:foo^5.0 field_a:bar^2.9085307 field_a:baz^1.67070794 amounts to using the B field example text to find documents that are similar in field A, and may be mimicking exactly the kind of query default MLT does on its usual model field.
Second option: Grab the model document's actual field B text, and feed it directly as a ContentStream body, to be used in lieu of a query, for specifying the model document. Then target mlt.fl at field A for the sake of collecting similar results. For example, a fragment of the parameters might be …&stream.body=foo bar baz&mlt.fl=field_a&…. Again, the net effect being that model text originally from field_b is finding documents similar only in field_a.
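The re-composition step of the first option can be sketched as follows. It assumes the interestingTerms list alternates "field:term" strings and boosts as in the sample response above; the function name is made up, and terms containing query-syntax characters would need escaping, which is not handled here.

```python
def retarget_interesting_terms(interesting_terms, target_field):
    """Turn ["field_b:foo", 5.0, ...] into "field_a:foo^5.0 ..."."""
    # Even positions hold "field:term", odd positions hold the boost.
    pairs = zip(interesting_terms[0::2], interesting_terms[1::2])
    clauses = []
    for field_term, boost in pairs:
        # Drop the original field name and keep only the term.
        _, term = field_term.split(":", 1)
        clauses.append("%s:%s^%s" % (target_field, term, boost))
    return " ".join(clauses)


terms = ["field_b:foo", 5.0, "field_b:bar", 2.9085307, "field_b:baz", 1.67070794]
query = retarget_interesting_terms(terms, "field_a")
print(query)
```

The resulting string can be sent as the q parameter of an ordinary query, where space-separated clauses are ORed under the default operator.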
