What is EditorialMarkerFactory? - Solr

What is EditorialMarkerFactory in Solr?
How is it used in a DocumentTransformer?
In which scenarios should we use EditorialMarkerFactory in our application?

I tried using EditorialMarkerFactory with Solr 4.0 final and it did not work; the error was "class not found".
I changed the class to ElevatedMarkerFactory in solrconfig.xml and it started working. The Solr wiki and the default solrconfig.xml file need to be updated.

The wiki explains why and how it is used. It says:
EditorialMarkerFactory is used to mark items that have been editorially boosted by the QueryElevationComponent so that an application has the option of treating them specially.
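As a rough illustration of how an application might request the marker, the sketch below builds a query URL that adds the transformer to the field list. It assumes the factory is registered in solrconfig.xml under the transformer name "elevated" and that the QueryElevationComponent is attached to a handler at /elevate; the host, the core name collection1, and the helper name elevated_query_url are placeholders, not taken from the thread.

```python
from urllib.parse import urlencode


def elevated_query_url(handler_url, query):
    """Build a Solr request URL that asks for the elevation marker on each document."""
    params = {
        "q": query,
        # [elevated] invokes the document transformer registered as "elevated"
        # (assumed name; it must match the <transformer name="..."> entry).
        "fl": "id,[elevated]",
        "wt": "json",
    }
    return handler_url + "?" + urlencode(params)


# Hypothetical host, core, and handler path.
url = elevated_query_url("http://localhost:8983/solr/collection1/elevate", "ipod")
print(url)
```

Under those assumptions, each returned document should carry an extra [elevated] field that the application can use to style or separate editorially boosted results.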

Related

Sitecore RichText field not indexing correctly in Solr in Production only

I have a field whose type is set to Rich text in Sitecore.
On lower environments, the values get indexed correctly and HTML text is stored in Solr correctly.
On Production, for all items the HTML gets stripped off completely.
UPDATE: One difference is that the lower environments use on-prem Solr, while Production uses SolrCloud.
UPDATE: I have checked the CM and CD servers and all have the field reader for the Body Copy field
UPDATE: This is now happening for all items. Earlier, possibly the other items weren't updated and published and so they showed the HTML correctly?
What could the issue be?
It is only happening on Production
I have validated the config is as expected. The field is Body Copy
<fieldReaders type="Sitecore.ContentSearch.FieldReaders.FieldReaderMap, Sitecore.ContentSearch">
  <param desc="id">defaultFieldReaderMap</param>
  <mapFieldByTypeName hint="raw:AddFieldReaderByFieldTypeName">
    <fieldReader fieldTypeName="html|rich text" fieldReaderType="Sitecore.ContentSearch.FieldReaders.RichTextFieldReader, Sitecore.ContentSearch" />
  </mapFieldByTypeName>
  <mapFieldByFieldName hint="raw:AddFieldReaderByFieldName">
    <fieldReader fieldName="Body Copy" fieldReaderType="Sitecore.ContentSearch.FieldReaders.DefaultFieldReader, Sitecore.ContentSearch" />
  </mapFieldByFieldName>
</fieldReaders>
It is now happening for all the content
I have resolved the HTML errors in the fields that reported HTML errors but that didn't fix it either.
If it's happening uniformly to all fields, my guess is that the cause is your Solr configuration.
Please review the managed-schema.xml document for the relevant Solr core on both your test and your production systems.
You can do this via the file system (on-prem):
e.g. C:\Solr\server\solr\web_index\conf\managed-schema.xml
Or via the Solr dashboard (cloud and on-prem):
e.g. https://solr-domain:8983/solr/#/web_index/files?file=managed-schema
I suspect you will find a difference relating to filters.
See HTMLStripCharFilterFactory at this link for an example of a filter that would cause the issue you are describing:
https://solr.apache.org/guide/8_1/charfilterfactories.html#solr-htmlstripcharfilterfactory
Let me know if that helps at all.
Regards
Dean
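One way to make that comparison less manual is to scan each schema for field types that declare the HTML-stripping char filter. The following is only a sketch: the helper name field_types_stripping_html and the sample schema are made up for illustration, and in practice you would pass in the contents of the managed-schema file from each environment.

```python
import xml.etree.ElementTree as ET


def field_types_stripping_html(schema_xml):
    """Return the names of fieldType elements that declare an HTML-stripping charFilter."""
    root = ET.fromstring(schema_xml)
    names = []
    for field_type in root.iter("fieldType"):
        for char_filter in field_type.iter("charFilter"):
            # Matches both "solr.HTMLStripCharFilterFactory" and the fully qualified class name.
            if char_filter.get("class", "").endswith("HTMLStripCharFilterFactory"):
                names.append(field_type.get("name"))
                break
    return names


# Made-up sample schema; replace with the text of each environment's managed-schema.
sample = """<schema name="sample" version="1.6">
  <fieldType name="text_general" class="solr.TextField">
    <analyzer>
      <charFilter class="solr.HTMLStripCharFilterFactory"/>
      <tokenizer class="solr.StandardTokenizerFactory"/>
    </analyzer>
  </fieldType>
  <fieldType name="string" class="solr.StrField"/>
</schema>"""

print(field_types_stripping_html(sample))
```

Running the same function on the test and production schemas and comparing the two lists should show whether the filter is present in only one environment.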

Can Solr find all of the terms of a field of a document?

Solr uses an inverted index to find documents from the indexed "terms".
What I wonder is:
is there any way to find all of the terms that refer to a specific document?
Thanks
You can use the Luke tool, which lets you explore a Lucene index.
Depending on your Solr version, you can also use the Luke request handler over HTTP. Here is the documentation for this handler:
To use the LukeRequestHandler, make sure it is defined in your
solrconfig.xml:
<requestHandler name="/admin/luke" class="org.apache.solr.handler.admin.LukeRequestHandler"/>
Assuming you have this handler mapped to "/admin/luke" in
solrconfig.xml and are running the example on port localhost:8983,
visit:
/admin/luke
/admin/luke?fl=cat
/admin/luke?fl=id&numTerms=50
/admin/luke?id=SOLR1000
/admin/luke?docId=2
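The same requests can be assembled programmatically. This is a small sketch that only builds the URLs; the helper name luke_url, the host, and the core name collection1 are assumptions for illustration, and the handler must be mapped to /admin/luke as described above.

```python
from urllib.parse import urlencode


def luke_url(core_url, **params):
    """Build a LukeRequestHandler URL for the given core and query parameters."""
    # Ask for JSON unless the caller chose another response writer.
    params.setdefault("wt", "json")
    return core_url.rstrip("/") + "/admin/luke?" + urlencode(params)


# Hypothetical core URL; the parameter names come from the examples above.
core = "http://localhost:8983/solr/collection1"
print(luke_url(core, id="SOLR1000"))         # look up one document by its unique key
print(luke_url(core, fl="cat", numTerms=50))  # top terms for the "cat" field
```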
What you are asking about is a forward index: a mapping from each document to the terms it contains.

Solr MoreLikeThis doesn't return score when `mlt.count` is specified

I'm using Solr's MoreLikeThis to find the most similar documents. When I specify the mlt.count argument with any value other than 15, the score is not shown. The MoreLikeThis arguments are mlt=true&mlt.fl=text&mlt.count=12, where text is the field that has term vectors, and the fl argument is *,score. I queried this URL:
http://localhost:8983/solr/collection1/select?q=id%3A1967956383&wt=json&indent=true&mlt=true&mlt.fl=text&mlt.count=12.
When I specify mlt.count=15, the score shows up. If I then query with mlt.count=12 again, it shows up, too.
My Solr version is 4.0.
Does anybody have any idea? Thanks!
This has been documented as Solr bug SOLR-5042, and a patch has been posted against Solr 4.3. I backported that patch to 4.2.1 and confirmed that it fixes this behavior there.
You can work around the issue by querying the /mlt handler directly instead of using the MLT component under the /select handler, because the handler takes its count as rows=12 instead of mlt.count=12.
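A sketch of what the workaround request might look like is shown below. It only builds the URL and assumes a MoreLikeThisHandler is registered at /mlt in solrconfig.xml; the helper name mlt_handler_url is hypothetical, and the host, core, and document id are reused from the question.

```python
from urllib.parse import urlencode


def mlt_handler_url(core_url, doc_query, mlt_field, count):
    """Build a request to the /mlt handler, passing the result count as rows."""
    params = {
        "q": doc_query,       # query that selects the source document
        "mlt.fl": mlt_field,  # field used for similarity (needs term vectors or stored text)
        "rows": count,        # the handler uses rows, not mlt.count
        "fl": "*,score",
        "wt": "json",
    }
    return core_url.rstrip("/") + "/mlt?" + urlencode(params)


url = mlt_handler_url("http://localhost:8983/solr/collection1", "id:1967956383", "text", 12)
print(url)
```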

dynamicField in django-haystack SOLR config

How do I configure search_indexes.py to index dynamicFields in django-haystack? I'm using Solr as the search engine for Haystack.
If you are using Haystack's build_solr_schema management command to create your schema.xml, notice that it automatically includes various dynamicFields for popular field types. For example, check out the schema template for Haystack v2.1. (This looks like it's been there since Haystack v1.)
This allows you to create dynamically-named fields in your search index's prepare method. For example, if you were indexing notes that could have an id string for your ever-changing group of partners, you could do this:
def prepare(self, obj):
    self.prepared_data = super(NoteIndex, self).prepare(obj)
    for (partner_name, partner_id) in get_partners():
        self.prepared_data['%s_s' % partner_name] = partner_id
    return self.prepared_data
The key thing here is that the field name ends with "_s", which according to the schema is a dynamic name for string types.
Unfortunately, these dynamic partner fields are not explicitly defined at the top of your SearchIndex class. You may want to mention this in a comment.
As far as I can see in the source code of django-haystack 1.2.*, you can't do this. You can write your own schema instead of generating it with the management command, and use that.
As @nofinator says, you can do this in the .prepare method of the SearchIndex class by concatenating the field name with the prefix or suffix defined in the Solr schema.xml.
By default, Haystack (the current version is 2.1.1) ships with some default dynamicFields such as *_s. But if you want, you can define your own dynamicField.
In my project I defined an attr_* field and it works fine.
All you need to do is add this field, with the same syntax, to schema.xml.
You can do this manually or by overriding the standard Haystack build_solr_schema management command. (By the way, it uses Django's standard template rendering, so it's pretty easy.)
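To illustrate the prefix approach, here is a minimal, framework-free sketch of the naming step that would run inside prepare. It assumes schema.xml declares a dynamicField matching attr_*; the helper name add_dynamic_attrs and the sample attributes are hypothetical.

```python
def add_dynamic_attrs(prepared_data, attrs, prefix="attr_"):
    """Copy attrs into prepared_data under names that match a prefix-style dynamicField."""
    for name, value in attrs.items():
        # "attr_" + "color" -> "attr_color", which an attr_* dynamicField
        # (assumed to exist in schema.xml) would accept.
        prepared_data[prefix + name] = value
    return prepared_data


# Hypothetical data standing in for what SearchIndex.prepare() would return.
data = add_dynamic_attrs({"text": "note body"}, {"color": "red", "size": "XL"})
print(data)
```

Inside a SearchIndex, the same call could be applied to self.prepared_data before returning it from prepare.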

Index-time boosting using DIH with JdbcDataSource

Is it possible to add boosts to docs and fields in Solr 1.4 DIH when using a JdbcDataSource? The documentation seems to suggest it's possible, but I can't find any examples.
There are a few examples of how to add the boost="2.0" attribute to your docs/fields in XML imports, but how do you do the same with the JdbcDataSource?
The closest I could get to an answer was http://www.nabble.com/data-import-handler---going-deeper...-td20731715.html
Add a special value $fieldBoost. to the row map
Has this been implemented yet?
$fieldBoost is not implemented, but $docBoost is.
Source code.
Special commands docs.
This is not a direct answer, but if you want to change the score of a field or document you have already added, see:
http://wiki.apache.org/solr/SolrRelevancyFAQ#How_can_I_boost_the_score_of_newer_documents
