Hybris custom facet sort provider not working - solr

I created a custom Facet Value Sort Provider and a custom Facet Top Values Provider and assigned them to one of my Solr indexed properties. I also changed the facet sort type to Custom.
It worked fine in my local environment and in one of our test environments. But in our QA environment only the top values provider works; the facet sort that actually gets applied is based on the facet result count.
After this implementation I noticed that, no matter which facet sort I select there, it insists on sorting by count.
Does anyone have an idea how to make my custom sort work there? Is there maybe a Solr XML file that I must change?

After selecting "custom" for SolrIndexedPropertyFacetSort and setting the field customFacetSortProvider to your custom bean, you need to make sure your bean implements FacetSortProvider and overrides the comparator method:
@Override
public Comparator<FacetValue> getComparatorForTypeAndProperty(IndexedType arg0, IndexedProperty arg1)
{
// XXX Auto-generated method stub
return null;
}
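The stub above returns null, so here is a minimal, self-contained sketch of what a non-null comparator could look like. The FacetValue class below is a hypothetical stand-in for the platform type (only a name and a count), and the preferred order list is an assumed business rule, not something taken from the question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class CustomFacetSortSketch {

    // Hypothetical stand-in for the platform's FacetValue, so the sketch runs on its own.
    static final class FacetValue {
        private final String name;
        private final long count;

        FacetValue(String name, long count) {
            this.name = name;
            this.count = count;
        }

        String getName() { return name; }
        long getCount() { return count; }
    }

    // Assumed business order; values not listed here sort last, alphabetically.
    static final List<String> PREFERRED_ORDER = Arrays.asList("S", "M", "L", "XL");

    // Orders by position in PREFERRED_ORDER first, then by name. The count is ignored.
    static Comparator<FacetValue> customComparator() {
        return Comparator
                .comparingInt((FacetValue v) -> rank(v.getName()))
                .thenComparing(FacetValue::getName);
    }

    private static int rank(String name) {
        int index = PREFERRED_ORDER.indexOf(name);
        return index >= 0 ? index : Integer.MAX_VALUE;
    }

    // Sorts a copy of the input and returns only the names, for easy inspection.
    static List<String> sortedNames(List<FacetValue> values) {
        List<FacetValue> copy = new ArrayList<>(values);
        copy.sort(customComparator());
        List<String> names = new ArrayList<>();
        for (FacetValue v : copy) {
            names.add(v.getName());
        }
        return names;
    }

    public static void main(String[] args) {
        List<FacetValue> values = Arrays.asList(
                new FacetValue("XL", 40), new FacetValue("S", 3),
                new FacetValue("Onesize", 99), new FacetValue("M", 12));
        System.out.println(sortedNames(values)); // prints [S, M, XL, Onesize]
    }
}
```

In a real provider the same comparator would be returned from getComparatorForTypeAndProperty. If the result is still ordered by count, the comparator is probably not being consulted at all.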

It worked after changing legacyMode to true in SolrSearchConfig.
That was the only difference between the environments.

Related

Add user-specified internal version to Solr core?

I have a script that loads information about medications, like you would find in RxNorm, into a Solr core. There's a relatively constant schema for all of the documents. See below.
I would also like to add a document to the core with two properties:
the date on which the core was populated
the version of the software that did the population
Are there established ways to do that? I'm using R's solrium package.
Could this be considered a bad idea? Is there some way to lock the core so changes can't be made after the version document is added? I do have a customized schema.xml, but otherwise this is a pretty vanilla Solr setup.
Schema illustration
select?q=medlabel%3Aacetaminophen
gets
"responseHeader":{
"status":0,
"QTime":0,
"params":{
"q":"medlabel:acetaminophen"}},
"response":{"numFound":4269,"start":0,"docs":[
{
"id":"http://purl.bioontology.org/ontology/RXNORM/161",
"medlabel":["acetaminophen"],
"tokens":["acetaminophen"],
"definedin":["http://purl.bioontology.org/ontology/RXNORM/"],
"employment":["IN"],
"_version_":1674388636888465414},
{
"id":"http://purl.obolibrary.org/obo/CHEBI_46195",
"medlabel":["acetaminophen"],
"tokens":["4-acetamidophenol",
"acetaminophen",
"apap",
"panadol",
"paracetamol",
"tylenol"],
"definedin":["http://purl.obolibrary.org/obo/chebi.owl"],
"employment":["active_ingredient"],
"_version_":1674388639675580445},
{
"id":"http://purl.bioontology.org/ontology/RXNORM/1006970",
"medlabel":["acetaminophen / dimenhydrinate"],
"tokens":["/",
"acetaminophen",
"dimenhydrinate"],
"definedin":["http://purl.bioontology.org/ontology/RXNORM/"],
"employment":["MIN"],
"_version_":1674388635062894610}
etc.
You can set a collection in read only mode after indexing your content into it using MODIFYCOLLECTION. That will effectively give you a read-only collection which does not allow any updates.
My recommendation for your other case would be to have that field present on each document instead of as a separate document (which sure, that'd work as well). But if your number of documents is very large, add a separate document with the metadata you need.
However, you can also use MODIFYCOLLECTION for this to attach properties to the collection itself:
The attributes that can be modified are:
other custom properties that use a property. prefix
So you can add property.client_version and property.populated_datetime properties to the collection itself, which would then be replicated properly across your cluster if needed. The collection also has a last index update time available, but this might be node specific (since commits can happen at different times on each node), and it won't let you attach the client version anyway.
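As a rough illustration of the MODIFYCOLLECTION approach, the sketch below only builds the request URL and sends nothing. The base URL and the collection name "medications" are assumptions. The readOnly attribute name is taken from my reading of the Collections API documentation and should be checked against your Solr version.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

public class ModifyCollectionUrlSketch {

    // Builds a Collections API URL that attaches custom "property."-prefixed values
    // to a collection and optionally switches it to read-only mode.
    static String build(String solrBaseUrl, String collection,
                        Map<String, String> customProperties, boolean readOnly) {
        StringBuilder url = new StringBuilder(solrBaseUrl)
                .append("/admin/collections?action=MODIFYCOLLECTION")
                .append("&collection=").append(encode(collection));
        for (Map.Entry<String, String> entry : customProperties.entrySet()) {
            url.append("&property.").append(encode(entry.getKey()))
               .append('=').append(encode(entry.getValue()));
        }
        if (readOnly) {
            // Assumed attribute name; verify it for the Solr version in use.
            url.append("&readOnly=true");
        }
        return url.toString();
    }

    // URLEncoder.encode(String, Charset) requires Java 10 or newer.
    private static String encode(String value) {
        return URLEncoder.encode(value, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        Map<String, String> properties = new LinkedHashMap<>();
        properties.put("client_version", "1.4.2");                    // hypothetical value
        properties.put("populated_datetime", "2020-08-07T12:00:00Z"); // hypothetical value
        System.out.println(build("http://localhost:8983/solr", "medications", properties, true));
    }
}
```

Issue the request only after indexing has finished, because a read-only collection rejects further updates.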

Spring Data MongoDB default type for inheritance

My model consisted of the following example:
class Aggregate {
private SomeClassWithFields property;
}
Now I decided to introduce inheritance to SomeClassWithFields. This results in:
class Aggregate {
private AbstractBaseClass property;
}
The collection already contains a lot of documents. These documents do not contain a _class property inside the DB since they were stored before the inheritance was present.
Is there a way to tell Spring Data MongoDB to use SomeClassWithFields as the default implementation of AbstractBaseClass if no _class property is present?
The other solution would be to add the _class to all the existing documents with a script but this would take some time since we have a lot of documents.
I solved it by using an AbstractMongoEventListener.
The AbstractMongoEventListener has an onAfterLoad method, which I used to set the default _class value if none was present :) Spring calls this method before it maps the DBObject to my domain model, so the default is in place by the time the mapping happens.
Do note that I also needed to let Spring Data MongoDB know the mappingBasePackage so that it is able to read an Aggregate before writing one. This can be done by implementing the getMappingBasePackage method of the PreconfiguredAbstractMongoConfiguration class.
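The defaulting step itself can be sketched without Spring. The example below works on a plain Map, which is a reasonable stand-in because the raw MongoDB document types (BasicDBObject, org.bson.Document) are, as far as I know, Map<String, Object> implementations. The nested field name "property" follows the Aggregate example above, and the fully qualified class name is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DefaultClassSketch {

    static final String TYPE_KEY = "_class";

    // Adds the default type hint to the nested "property" sub-document if it has none.
    // Documents that already carry a _class value are left untouched.
    @SuppressWarnings("unchecked")
    static void applyDefaultType(Map<String, Object> aggregateDocument, String defaultClassName) {
        Object nested = aggregateDocument.get("property");
        if (nested instanceof Map) {
            ((Map<String, Object>) nested).putIfAbsent(TYPE_KEY, defaultClassName);
        }
    }

    public static void main(String[] args) {
        // A legacy document, stored before inheritance was introduced: no _class inside "property".
        Map<String, Object> property = new LinkedHashMap<>();
        property.put("someField", "value");
        Map<String, Object> aggregate = new LinkedHashMap<>();
        aggregate.put("property", property);

        applyDefaultType(aggregate, "com.example.SomeClassWithFields"); // hypothetical class name
        System.out.println(aggregate);
    }
}
```

In the listener, a helper like this would be called from onAfterLoad with the raw document carried by the event, so the type hint is present before the converter runs.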

How to do per field facet min count in solr query using java?

The Solr documentation says:
The facet.mincount parameter specifies the minimum counts required for a facet field to be included in the response. If a field's counts are below the minimum, the field's facet is not returned.
The default value is 0.
This parameter can be specified on a per-field basis with the syntax of f.fieldname.facet.mincount.
How can I do this in Java? I only see a query.setMinCount for the overall query, not one for a single field.
http://docs.spring.io/spring-data/solr/docs/current/api/org/springframework/data/solr/core/query/FacetOptions.html
In Java I usually use SolrJ directly, without Spring or other frameworks.
I suggest SolrQuery, which is the class commonly used to prepare query parameters.
As you have seen, there are a great many parameters when you prepare a Solr query, and even in SolrJ not every parameter has an equivalent method.
In particular, there is no method for the f.fieldname.facet.mincount parameter.
But SolrQuery has an add method inherited from its parent, ModifiableSolrParams. You can use this method to cover every case not handled by the standard SolrQuery interface.
For example, this query adds a per-field minimum count for the country field:
SolrQuery q = new SolrQuery();
q.setQuery("*:*");
q.setFacet(true);
q.addFacetField("country");
q.add("f.country.facet.mincount", "1");
If you prefer, you can even use the add method alone:
SolrQuery q = new SolrQuery();
q.add("q", "*:*");
q.add("facet", "true");
q.add("facet.field", "country");
q.add("f.country.facet.mincount", "1");
On the other hand, if you want to try Spring: looking at the FacetOptions class, I see there is a nested static class FacetOptions.FacetParameter that has a constructor FacetOptions.FacetParameter(String parameter, Object value), which seems to accept any kind of parameter/value pair. In that respect FacetParameter resembles the behaviour we have just seen with the SolrJ add method.

How to specify Solr Cloud collection dynamically in Spring Data Solr 2.0.1?

We are trying to implement a two-dimensional Solr Cloud cluster where the first dimension is a collection and the second is a shard. The collection should be determined at runtime based on a document's properties.
I can see that this functionality is supported by SolrJ: CloudSolrClient has appropriate methods which accept a collection name, like add(String collection, SolrInputDocument doc), so I registered @Bean CloudSolrClient("zookeeper.host"). But apparently that isn't enough, because the methods in SolrTemplate, which is used by Spring Data Solr, don't accept a collection name.
Since SolrTemplate uses SolrClient under the hood, I tried to work around this problem by extending SolrTemplate and overriding the saveBean and saveBeans methods, delegating to CloudSolrClient#add(String collection, SolrInputDocument doc) and CloudSolrClient#add(String collection, Collection<SolrInputDocument> docs). It worked fine until I needed to do the same for queries. SolrTemplate#executeSolrQuery is package-private and final, so I can't override it. And here I am stuck!
To summarise my question: is there a way to specify a collection name in Spring Data Solr at runtime?
I would greatly appreciate any help!
Regards,
Eugeny
My problem was a bit different, but I also had a problem with the collection name in queries, and in my case adding @SolrDocument(solrCoreName = "core_to_which_model_class_belong") to the model class solved it.

Sitecore 7.2 and SOLR: exclude clones from web index

I'm trying to exclude all clones from Sitecore's web index. I've created a custom crawler inheriting from Sitecore.ContentSearch.SitecoreItemCrawler overriding the IsExcludedFromIndex method with the following code:
protected override bool IsExcludedFromIndex(SitecoreIndexableItem indexable, bool checkLocation)
{
if (indexable.Item["Hide from Search"] == "1")
return true;
if (indexable.Item.IsClone)
return true;
return base.IsExcludedFromIndex(indexable, checkLocation);
}
My "Hide from Search" field works: any items with that field set are not included in the web index. However, the indexable.Item.IsClone is never true, and all "clones" remain in the web index.
When I run the master index against this crawler, the IsClone is true for each clone and they are not included in the index. I suspect it works for master and not for the web index because clones are expanded on publishing targets (as noted by John West).
Apologies if this question is considered a duplicate of Globally exclude cloned items from index? - the solution there did not work for me, and I'm using SOLR (vs Lucene) and on a newer version of Sitecore, so I believe this may be a separate issue.
So, how can I exclude all clones from a SOLR index of a Sitecore 7.2 web (publish target) database?
As you wrote in your question, the IsClone property is not relevant for published items, because Sitecore clears the value of the __Source field.
That's why there is no out-of-the-box method to determine whether an item from the web database was a clone or not.
What you can use is the solution proposed by John West in his blog post Identify Cloned Items Sitecore ASPNET CMS Publishing Target Databases. In a nutshell, you need to add your own processor to the publishing pipeline and save the value of the __Source field in another custom field, or at least store a boolean value in a custom Is Cloned field.
Then you can use your approach; instead of checking IsClone, check whether the new custom field is not empty.
