I need to count how many values a multivalued field holds, so that I can sort results by that count.
solrconfig.xml
<updateRequestProcessorChain name="add-numbers-count">
<processor class="solr.CloneFieldUpdateProcessorFactory">
<str name="source">cat_ids</str>
<str name="dest">cat_ids_count</str>
</processor>
<processor class="solr.CountFieldValuesUpdateProcessorFactory">
<str name="fieldName">cat_ids_count</str>
</processor>
<processor class="solr.DefaultValueUpdateProcessorFactory">
<str name="fieldName">cat_ids_count</str>
<int name="value">0</int>
</processor>
<!-- <processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" /> -->
</updateRequestProcessorChain>
<initParams path="/update/**,/query,/select,/spell">
<lst name="defaults">
<str name="df">_text_</str>
<str name="update.chain">add-numbers-count</str>
</lst>
</initParams>
managed-schema.xml
<field name="cat_ids" type="plongs"/>
<field name="cat_ids_count" type="pint"/>
Note that RunUpdateProcessorFactory and LogUpdateProcessorFactory are commented out.
If I enable them, the update fails with an error that makes no sense to me:
ERROR: [doc=44996] Error adding field 'data_readout'='2025-06-01' msg=Invalid Date String:'2025-06-01'int(44996)
Solr is not populating the cat_ids_count field, I guess because RunUpdateProcessorFactory is missing from the chain.
Do I have to delete and recreate the collection, or is there an error I can't see?
Related
I am using SolrCloud 6.0 with the following setup:
3 physical VMs, 5 collections with 6 shards each and a replication factor of 2
ZooKeeper runs on each VM
I see the following exception very frequently in the Solr logs:
o.a.s.h.RequestHandlerBase org.apache.solr.common.SolrException: RunUpdateProcessor has received an AddUpdateCommand containing a document that appears to still contain Atomic document update operations, most likely because DistributedUpdateProcessorFactory was explicitly disabled from this updateRequestProcessorChain
at org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:63)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
at org.apache.solr.update.processor.AddSchemaFieldsUpdateProcessorFactory$AddSchemaFieldsUpdateProcessor.processAdd(AddSchemaFieldsUpdateProcessorFactory.java:335)
at org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:48)
Can you please help me find the root cause? Below is the relevant part of solrconfig.xml:
<updateRequestProcessorChain name="add-unknown-fields-to-the-schema">
<!-- UUIDUpdateProcessorFactory will generate an id if none is present in the incoming document -->
<processor class="solr.UUIDUpdateProcessorFactory" />
<processor class="solr.LogUpdateProcessorFactory"/>
<processor class="solr.DistributedUpdateProcessorFactory"/>
<processor class="solr.RemoveBlankFieldUpdateProcessorFactory"/>
<processor class="solr.FieldNameMutatingUpdateProcessorFactory">
<str name="pattern">[^\w-\.]</str>
<str name="replacement">_</str>
</processor>
<processor class="solr.ParseBooleanFieldUpdateProcessorFactory"/>
<processor class="solr.ParseLongFieldUpdateProcessorFactory"/>
<processor class="solr.ParseDoubleFieldUpdateProcessorFactory"/>
<processor class="solr.ParseDateFieldUpdateProcessorFactory">
<arr name="format">
<str>yyyy-MM-dd'T'HH:mm:ss.SSSZ</str>
<str>yyyy-MM-dd'T'HH:mm:ss,SSSZ</str>
<str>yyyy-MM-dd'T'HH:mm:ss.SSS</str>
<str>yyyy-MM-dd'T'HH:mm:ss,SSS</str>
<str>yyyy-MM-dd'T'HH:mm:ssZ</str>
<str>yyyy-MM-dd'T'HH:mm:ss</str>
<str>yyyy-MM-dd'T'HH:mmZ</str>
<str>yyyy-MM-dd'T'HH:mm</str>
<str>yyyy-MM-dd HH:mm:ss.SSSZ</str>
<str>yyyy-MM-dd HH:mm:ss,SSSZ</str>
<str>yyyy-MM-dd HH:mm:ss.SSS</str>
<str>yyyy-MM-dd HH:mm:ss,SSS</str>
<str>yyyy-MM-dd HH:mm:ssZ</str>
<str>yyyy-MM-dd HH:mm:ss</str>
<str>yyyy-MM-dd HH:mmZ</str>
<str>yyyy-MM-dd HH:mm</str>
<str>yyyy-MM-dd</str>
</arr>
</processor>
<processor class="solr.AddSchemaFieldsUpdateProcessorFactory">
<str name="defaultFieldType">strings</str>
<lst name="typeMapping">
<str name="valueClass">java.lang.Boolean</str>
<str name="fieldType">booleans</str>
</lst>
<lst name="typeMapping">
<str name="valueClass">java.util.Date</str>
<str name="fieldType">tdates</str>
</lst>
<lst name="typeMapping">
<str name="valueClass">java.lang.Long</str>
<str name="valueClass">java.lang.Integer</str>
<str name="fieldType">tlongs</str>
</lst>
<lst name="typeMapping">
<str name="valueClass">java.lang.Number</str>
<str name="fieldType">tdoubles</str>
</lst>
</processor>
<processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>
Regards,
Pratik Thaker
I want to count the values of a multi-valued field in Solr.
I have two multi-valued fields, store_id and filter_id,
and I want to count their values like this:
store_id = {0,3,7}
count_store_id = 3
filter_id = {12,13,20,22,59,61,62,145}
count_filter_id = 8
Is it also possible for count_store_id to be updated automatically whenever store_id is updated?
@Ashraful Islam - I made the change you suggested, but nothing happens; see the attached image.
Yes, as Alexandre Rafalovitch suggested, you can get the value count of a multivalued field by defining a custom updateRequestProcessorChain.
Add the lines below to your solrconfig.xml:
<updateRequestProcessorChain name="multivaluecountnum" default="true">
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">store_id</str>
    <str name="dest">store_id_count</str>
  </processor>
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">filter_id</str>
    <str name="dest">filter_id_count</str>
  </processor>
  <processor class="solr.CountFieldValuesUpdateProcessorFactory">
    <str name="fieldName">store_id_count</str>
  </processor>
  <processor class="solr.CountFieldValuesUpdateProcessorFactory">
    <str name="fieldName">filter_id_count</str>
  </processor>
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">store_id_count</str>
    <int name="value">0</int>
  </processor>
  <processor class="solr.DefaultValueUpdateProcessorFactory">
    <str name="fieldName">filter_id_count</str>
    <int name="value">0</int>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
Do not forget to add RunUpdateProcessorFactory at the end of any chain you define in solrconfig.xml; without it, updates that pass through the chain are never actually written to the index.
Add the store_id_count and filter_id_count fields to the schema file:
<field name="store_id_count" type="int" stored="true"/>
<field name="filter_id_count" type="int" stored="true"/>
Reindex the documents and query again; the results will contain the two new fields store_id_count and filter_id_count.
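To make the effect concrete, here is a rough sketch of an update message in Solr's XML update format, using the store_id values from the question. The id field name and its value are assumptions (the question does not show the uniqueKey), and the count fields are deliberately not sent by the client.

```xml
<!-- Sketch only: posted to an update handler that runs the multivaluecountnum chain. -->
<!-- The "id" field name and value are assumed; adjust them to your schema's uniqueKey. -->
<add>
  <doc>
    <field name="id">doc-1</field>
    <!-- Three values for the multi-valued store_id field -->
    <field name="store_id">0</field>
    <field name="store_id">3</field>
    <field name="store_id">7</field>
    <!-- filter_id is omitted on purpose -->
  </doc>
</add>
<!-- Expected values after the chain runs:
     store_id_count = 3 (cloned from store_id, then replaced by its value count)
     filter_id_count = 0 (nothing to clone, so DefaultValueUpdateProcessorFactory supplies 0) -->
```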
Hope this Helps,
Vinod.
You can do this with a custom UpdateRequestProcessor chain that uses CountFieldValuesUpdateProcessorFactory.
I am currently using Solr 4.8.1 and have set up spellcheck. After indexing, requests don't return any suggestions.
Following the advice of @n0tting, I modified my files slightly.
Here are the steps:
1. solrconfig.xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">phraseText</str>
<lst name="spellchecker">
<str name="classname">solr.IndexBasedSpellChecker</str>
<str name="spellcheckIndexDir">./spellchecker</str>
<str name="name">default</str>
<str name="field">title_spellcheck</str>
<str name="buildOnCommit">true</str>
</lst>
</searchComponent>
Add some configuration to the standard requestHandler:
<requestHandler name="standard" class="solr.StandardRequestHandler" default="true">
<!-- default values for query parameters -->
<lst name="defaults">
<str name="echoParams">explicit</str>
<int name="rows">10</int>
<!-- Optional, must match spell checker's name as defined above, defaults to "default" -->
<str name="spellcheck.dictionary">default</str>
<!-- omp = Only More Popular -->
<str name="spellcheck.onlyMorePopular">false</str>
<!-- exr = Extended Results -->
<str name="spellcheck.extendedResults">false</str>
<!-- The number of suggestions to return -->
<str name="spellcheck.count">1</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
2. schema.xml
Define a field for spell check:
<field name="title_spellcheck" type="phraseText" indexed="true" stored="false" multiValued="true" />
<copyField source="title" dest="title_spellcheck"/>
3. Request:
.../select?q=recommend&defType=edismax&qf=title&spellcheck=true&spellcheck.build=true&spellcheck.q=recommend&spellcheck.collate=true
I don't get any suggestions in the result, not even a <lst name="spellcheck"> element. Can anybody give me some advice? Thanks a lot.
References:
https://cwiki.apache.org/confluence/display/solr/Spell+Checking
http://solr.pl/en/2011/05/23/%E2%80%9Ccar-sale-application%E2%80%9D-%E2%80%93-spellcheckcomponent-%E2%80%93-did-you-really-mean-that-part-5/
I have written a custom UpdateRequestProcessorFactory to parse my data before it is indexed, but the data is not being committed: when I restart the server, all the data is gone. As far as I can tell, the config is correct:
<updateRequestProcessorChain name="mytestupdatehandler" default="true">
<processor class="com.solr.handler.interceptor"></processor>
<processor class="solr.LogUpdateProcessorFactory" />
<processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
<requestHandler name="/MypdateHandler" class="solr.UpdateRequestHandler" >
<lst name="defaults">
<str name="update.chain">mytestupdatehandler</str>
</lst>
</requestHandler>
<requestHandler name="/update" class="solr.UpdateRequestHandler">
<!-- See below for information on defining
updateRequestProcessorChains that can be used by name
on each Update Request
-->
<lst name="defaults">
<str name="maxThreads">50</str>
<str name="handlerType">asyncXML</str>
<str name="sharedError">false</str>
</lst>
</requestHandler>
Also, the default update handler uses my update chain as well. How can I prevent that?
You have default="true" on the chain, which makes it the chain used by every update handler. Remove that attribute.
You also seem to be missing the class name in your processor definition, unless the class really is called interceptor in the com.solr.handler package: <processor class="com.solr.handler.interceptor.CLASSNAME?"></processor>
Are you getting any errors in the console log if you start Solr from the command line? That might give you a hint.
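As a rough sketch of the first point, the chain from the question could be declared without default="true" and selected only by the dedicated handler. The processor class name is copied as-is from the question and may still need the correction mentioned above; the other /update defaults from the question are left out for brevity.

```xml
<!-- No default="true": the chain is used only where update.chain names it -->
<updateRequestProcessorChain name="mytestupdatehandler">
  <processor class="com.solr.handler.interceptor"/>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>

<!-- Only this handler routes updates through the custom chain -->
<requestHandler name="/MypdateHandler" class="solr.UpdateRequestHandler">
  <lst name="defaults">
    <str name="update.chain">mytestupdatehandler</str>
  </lst>
</requestHandler>

<!-- /update sets no update.chain, so it falls back to Solr's default chain -->
<requestHandler name="/update" class="solr.UpdateRequestHandler"/>
```

A client can still choose the custom chain for a single request by passing update.chain=mytestupdatehandler as a request parameter.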
I am having an issue with my Solr settings.
After a lot of investigation today, I found that it is the spellcheck component that causes core reload to hang.
If it's turned off, everything runs well and the core reloads easily. When spellcheck is on, however, the core won't reload and hangs forever. The only way to bring the project back is then to stop Solr, delete the data folder, and start Solr again.
Here are the solrconfig settings for spellcheck:
<requestHandler name="/select" class="solr.SearchHandler">
<lst name="defaults">
<!-- Spell checking defaults -->
<str name="spellcheck.dictionary">default</str>
<str name="spellcheck">on</str>
<str name="spellcheck.count">5</str>
<str name="spellcheck.onlyMorePopular">false</str>
<str name="spellcheck.maxResultsForSuggest">5</str>
<str name="spellcheck.alternativeTermCount">2</str>
<str name="spellcheck.extendedResults">false</str>
<str name="spellcheck.collate">true</str>
<str name="spellcheck.maxCollations">3</str>
<str name="spellcheck.maxCollationTries">3</str>
<str name="spellcheck.collateExtendedResults">true</str>
</lst>
<arr name="last-components">
<str>spellcheck</str>
</arr>
</requestHandler>
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
<str name="queryAnalyzerFieldType">text_en_splitting</str>
<lst name="spellchecker">
<str name="name">default</str>
<str name="field">location_details</str>
<str name="classname">solr.DirectSolrSpellChecker</str>
<str name="buildOnCommit">true</str>
<float name="accuracy">0.5</float>
<float name="thresholdTokenFrequency">.01</float>
<int name="maxEdits">1</int>
<int name="minPrefix">3</int>
<int name="maxInspections">3</int>
<int name="minQueryLength">4</int>
<float name="maxQueryFrequency">0.001</float>
</lst>
</searchComponent>
Here is the field from schema:
<field name="location_details" type="text_en_splitting" indexed="true" stored="false" required="false" />
Basically, this is a bug in Solr. You just need to comment out or remove the following from your requestHandler:
<!--<str name="spellcheck.maxCollationTries">3</str> here is a bug, put this parameter in the actual query string instead -->
If you really need maxCollationTries, pass it as a query parameter in the request URL instead.
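A sketch of what that might look like, assuming the /select handler from the question: the parameter is dropped from the defaults and supplied per request. The host, core name, and query term in the trailing comment are placeholders, not values from the question.

```xml
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="spellcheck.dictionary">default</str>
    <str name="spellcheck">on</str>
    <str name="spellcheck.collate">true</str>
    <str name="spellcheck.maxCollations">3</str>
    <!-- spellcheck.maxCollationTries is intentionally not set here -->
    <!-- the remaining spellcheck defaults from the question stay unchanged -->
  </lst>
  <arr name="last-components">
    <str>spellcheck</str>
  </arr>
</requestHandler>
<!-- Per-request alternative (placeholder host, core, and query):
     http://localhost:8983/solr/mycore/select?q=london&spellcheck.maxCollationTries=3 -->
```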