Update (not replace) Solr data with solrj library

Suppose, I have a Solr index with current structure:
<field name="id" type="string" indexed="true" stored="true" required="true"/>
<field name="field_1" type="string" indexed="true" stored="true"/>
<field name="field_2" type="string" indexed="true" stored="true"/>
which already has some data. I want to replace the data in field "field_1", but the data in field "field_2" has to stay untouched.
For a while I have been using curl with a JSON file for this task. An example of the JSON file is
[
{"id":1,"field_1":{"set":"some value"}}
]
The data in this file replaces the value only in field "field_1".
Now I have to do the same with the solrj library.
Here are some code snippets that show my attempt.
SolrInputDocument doc = new SolrInputDocument();
doc.addField("field_1", "some value");
documents.add(doc);
server = new ConcurrentUpdateSolrClient(solrServerUrl, solrQueueSize, solrThreadCount);
UpdateResponse resp = server.add(documents, solrCommitTimeOut);
When I run this code, the value of "field_1" becomes "some value", but the value of "field_2" becomes null.
How can I avoid replacing the value in field "field_2"?

You are doing a full update, which overwrites the entire previous document with a new one that does not have field_2.
You need to do a partial (atomic) update as explained here (scroll down to the SolrJ comment):
https://cwiki.apache.org/confluence/display/solr/Updating+Parts+of+Documents
SolrJ code for Atomic Update
String solrBaseurl = "http://hostname:port/solr";
String collection = "mydocs";
SolrClient client = new HttpSolrClient(solrBaseurl);
SolrInputDocument doc = new SolrInputDocument();
doc.addField("id", "test");
Map<String, String> cmd1 = new HashMap<>();
Map<String, String> cmd2 = new HashMap<>();
cmd1.put("set", "newvalue");
cmd2.put("add", "additionalvalue");
doc.addField("field1", cmd1);
doc.addField("field2", cmd2);
client.add(collection, doc);
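Applied to the schema in the question, the update only needs the unique key plus a "set" modifier for field_1. The following is a minimal plain-Java sketch of that structure, not SolrJ itself: the class and method names (AtomicSetSketch, atomicSet) are hypothetical, and it assumes each map entry would then be handed to SolrInputDocument.addField(name, value) as in the snippet above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class AtomicSetSketch {
    // Builds the field map for an atomic "set" update: the unique key plus
    // one modifier map for the field being changed. Fields that are not
    // mentioned (field_2 in the question) are simply absent from the map.
    static Map<String, Object> atomicSet(String id, String field, Object value) {
        Map<String, Object> modifier = new LinkedHashMap<>();
        modifier.put("set", value);

        Map<String, Object> doc = new LinkedHashMap<>();
        doc.put("id", id);
        doc.put(field, modifier);
        return doc;
    }

    public static void main(String[] args) {
        Map<String, Object> doc = atomicSet("1", "field_1", "some value");
        System.out.println(doc); // prints {id=1, field_1={set=some value}}
    }
}
```

The important difference from the code in the question is that field_1 carries a map with a "set" key instead of a plain string, and that the document includes its id.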

Related

Sitecore 9 Indexing : Solr Pattern Tokenizer not Working

I'm new to the combination of Sitecore and Solr, and I have an issue with the pattern tokenizer, which is not working. I'm following this documentation:
Solr :
https://lucene.apache.org/solr/guide/6_6/tokenizers.html#Tokenizers-RegularExpressionPatternTokenizer
Sitecore 9 Solr :
https://doc.sitecore.net/sitecore_experience_platform/setting_up_and_maintaining/search_and_indexing/using_solr_field_name_resolution
When I index, my field value is a,b,c. I expected Solr to contain ["a","b","c"], but it contains ["a,b,c"].
This is my Sitecore config:
<fieldMap>
<typeMatches hint="raw:AddTypeMatch">
<typeMatch type="System.Collections.Generic.List`1[System.String]" typeName="commaDelimitedCollection" fieldNameFormat="{0}_cd"
multiValued="true" settingType="Sitecore.ContentSearch.SolrProvider.SolrSearchFieldConfiguration, Sitecore.ContentSearch.SolrProvider"/>
</typeMatches>
<fieldNames hint="raw:AddFieldByFieldName">
<field fieldName="Keywords" returnType="commaDelimitedCollection"/>
</fieldNames>
</fieldMap>
This is my Solr Schema
<fieldType name="commaDelimited" class="solr.TextField" multiValued="true">
<analyzer>
<tokenizer class="solr.PatternTokenizerFactory" pattern="\s*,\s*"/>
</analyzer>
</fieldType>
<dynamicField name="*_cd" type="commaDelimited" multiValued="true" indexed="true" stored="true"/>
Any idea what's wrong with my configuration above?
Thanks
Not sure if I get the full picture here. Maybe your approach is perfectly valid, but I don't think I've seen that one before. Instead of defining a new type, you could reuse the *_sm (multiValued string) type and split the string at index time on the Sitecore side. Usually you don't need more field types than the ones provided by Sitecore, and it's typically easier to maintain all the code in your VS solution instead of depending on additional Solr config. (In Sitecore 9 you can deploy your Solr managed schema from the control panel though.)
A simple computed field can look like this:
<fields hint="raw:AddComputedIndexField">
<field fieldName="keywords" returnType="stringCollection">
Your.Name.Space.YourComputedFieldClass, YourAssembly
</field>
</fields>
And a class implementation could look something like this:
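using Sitecore.ContentSearch;
using Sitecore.ContentSearch.ComputedFields;

public class YourComputedFieldClass : IComputedIndexField
{
    public object ComputeFieldValue(IIndexable indexable)
    {
        var item = indexable as SitecoreIndexableItem;
        // Read the raw comma-separated value from the "Keywords" field.
        var fieldValue = item?.Item?["Keywords"];
        if (string.IsNullOrWhiteSpace(fieldValue))
        {
            return null;
        }
        // Each element becomes one value of the multiValued field.
        return fieldValue.Split(',');
    }

    public string FieldName { get; set; }
    public string ReturnType { get; set; }
}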
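One detail worth keeping in mind: a tokenizer only affects the indexed terms, while a stored field returns the original input, which is consistent with seeing ["a,b,c"] in the query results. Splitting before the value reaches Solr avoids that. The sketch below shows the split in Java (used here only to match the SolrJ examples elsewhere on this page; the answer's own code is C#). It reuses the delimiter pattern from the schema, so surrounding whitespace is dropped too, which a plain Split(',') does not do. The class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.regex.Pattern;

public class CommaSplitSketch {
    // Same delimiter pattern as the PatternTokenizerFactory in the schema above.
    private static final Pattern DELIMITER = Pattern.compile("\\s*,\\s*");

    // Splits a raw comma-separated value into individual keywords.
    static List<String> splitKeywords(String raw) {
        if (raw == null || raw.trim().isEmpty()) {
            return Collections.emptyList();
        }
        return Arrays.asList(DELIMITER.split(raw.trim()));
    }

    public static void main(String[] args) {
        System.out.println(splitKeywords("a, b ,c")); // prints [a, b, c]
    }
}
```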

Geoloc is incorrect after update parts document in Solr

I want to update a particular field of a document in Solr, but after the update the *_coordinate field changed from a single tdouble value to an array. How can I fix it? I use Apache Solr version 6.2.1.
This is my dynamicField in the schema file:
<!-- Type used to index the lat and lon components for the "location" FieldType -->
<dynamicField name="*_coordinate" type="tdouble" indexed="true" stored="true" useDocValuesAsStored="false" />
This is the code that I use to update a field:
String solrID = (String) currentDoc.getFieldValue("id");
SolrInputDocument solrDocToIndex = new SolrInputDocument();
solrDocToIndex.addField("id", solrID);
Map<String, String> partialUpdate = new HashMap<>();
partialUpdate.put("add", "Solr Demo");
solrDocToIndex.addField("tags", partialUpdate);
I have a field geoloc_0_coordinate. Before the update it has the value 12.123456, but after running the update code it changes to [12.123456,12.123456].

Sitecore _path field returns NULL in Solr index

I am using a Solr index for Sitecore.
However, the search result always gives back null for the _path field.
It was working with Lucene. Does Solr need special treatment?
Below is the Glass Mapper property:
[IndexField("_path"), TypeConverter(typeof(IndexFieldEnumerableConverter))]
[SitecoreIgnore]
public virtual System.Collections.Generic.IEnumerable<ID> EntityPath { get; set; }
And the SOLR schema has entry below:
<field name="_path" type="string" indexed="true" stored="false" multiValued="true" />
Change your "stored" setting to true:
<field name="_path" type="string" indexed="true" stored="true" multiValued="true" />
The stored attribute will make sure that your original value is kept in the index for retrieval. Otherwise you can search in the field, but not fetch it.

solrj atomic update - how to set field to null?

I'm using atomic updates with SolrJ. It works perfectly, but I don't know how to delete a field within an existing document.
In the Solr tutorial (http://wiki.apache.org/solr/UpdateXmlMessages) they explain how to do it with XML:
<add>
<doc>
<field name="employeeId">05991</field>
<field name="skills" update="set" null="true" />
</doc>
</add>
Does anyone know how to do it from SolrJ?
Thanks!
Ok. So apparently this is the way to do so -
SolrInputDocument inputDoc = new SolrInputDocument();
Map<String, String> partialUpdateNull = new HashMap<String, String>();
partialUpdateNull.put("set", null);
inputDoc.setField("FIELD_YOU_WANT_TO_DELETE", partialUpdateNull);
Thanks anyways!
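A small plain-Java note on building that modifier, as a sketch with hypothetical class and method names: HashMap and Collections.singletonMap accept a null value, whereas Map.of (Java 9+) rejects it with a NullPointerException, so the shorter factory cannot be used for this case.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class NullSetSketch {
    // Modifier map meaning "set this field to null", i.e. remove it.
    static Map<String, Object> removeFieldModifier() {
        Map<String, Object> modifier = new HashMap<>();
        modifier.put("set", null); // HashMap permits null values
        return modifier;
    }

    // Returns true if Map.of refuses a null value (Java 9+ behaviour).
    static boolean mapOfRejectsNull() {
        try {
            Map.of("set", null);
            return false;
        } catch (NullPointerException expected) {
            return true;
        }
    }

    public static void main(String[] args) {
        Map<String, Object> modifier = removeFieldModifier();
        System.out.println(modifier.containsKey("set")); // prints true
        System.out.println(modifier.get("set"));         // prints null

        // An immutable alternative that also allows a null value.
        Map<String, Object> single = Collections.singletonMap("set", null);
        System.out.println(single.containsKey("set"));   // prints true

        System.out.println(mapOfRejectsNull());           // prints true
    }
}
```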

Updatable Fields In Solr

I am using Solr to search my corpus of web page data. My Solr indexer creates several fields and their values. I want to update some of these fields more often than others, for example the number of clicks on a page. These fields do not need to be indexed, and I don't need to search on their values. However, I do want to fetch them and update them often.
I am a newbie in Solr, so a more descriptive answer with a running example or some code would help me more.
If you are on Solr 4+, yes, you can push a partial update to the Solr index.
For partial updates, all fields in your schema.xml need to be stored.
This is how your fields section should look:
<fields>
<field name="id" type="string" indexed="true" stored="true" required="true" />
<field name="title" type="text_general" indexed="true" stored="true"/>
<field name="description" type="text_general" indexed="true" stored="true" />
<field name="body" type="text_general" indexed="true" stored="true"/>
<field name="clicks" type="integer" indexed="true" stored="true" />
</fields>
Now when you send a partial update for one of the fields (in your case "clicks"), Solr fetches the stored values of all other fields of that document in the background (title, description, body), deletes the old document, and pushes the new, updated document to the index.
curl 'localhost:8080/solr/update?commit=true' -H 'Content-type:application/json' -d '[{"id":"1","clicks":{"set":100}}]'
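The fetch, merge, and rewrite behaviour described above can be pictured with a small in-memory simulation. This is only an illustrative sketch in plain Java, not Solr's implementation: the class and method names are hypothetical, only the "set", "add", and "inc" modifiers are modelled, and treating a missing field as 0 for "inc" is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class PartialUpdateSketch {
    // Returns a new document: all stored fields are copied first,
    // then the modifiers from the update are applied on top.
    static Map<String, Object> apply(Map<String, Object> stored, Map<String, Object> update) {
        Map<String, Object> result = new LinkedHashMap<>(stored);
        for (Map.Entry<String, Object> e : update.entrySet()) {
            String field = e.getKey();
            if (!(e.getValue() instanceof Map)) {
                result.put(field, e.getValue()); // plain value, e.g. the id
                continue;
            }
            for (Map.Entry<?, ?> m : ((Map<?, ?>) e.getValue()).entrySet()) {
                String op = String.valueOf(m.getKey());
                Object value = m.getValue();
                Object old = result.get(field);
                switch (op) {
                    case "set": // replace the value; null removes the field
                        if (value == null) result.remove(field);
                        else result.put(field, value);
                        break;
                    case "add": { // append to a multi-valued field
                        List<Object> values = new ArrayList<>();
                        if (old instanceof List) values.addAll((List<?>) old);
                        else if (old != null) values.add(old);
                        values.add(value);
                        result.put(field, values);
                        break;
                    }
                    case "inc": { // increment a numeric field
                        long base = old instanceof Number ? ((Number) old).longValue() : 0L;
                        result.put(field, base + ((Number) value).longValue());
                        break;
                    }
                    default:
                        throw new IllegalArgumentException("unsupported modifier: " + op);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, Object> stored = new LinkedHashMap<>();
        stored.put("id", "1");
        stored.put("title", "Home");
        stored.put("clicks", 99);

        Map<String, Object> inc = new LinkedHashMap<>();
        inc.put("inc", 1);
        Map<String, Object> update = new LinkedHashMap<>();
        update.put("id", "1");
        update.put("clicks", inc);

        System.out.println(apply(stored, update)); // prints {id=1, title=Home, clicks=100}
    }
}
```

The sketch also shows why unstored fields are a problem: anything missing from the stored copy cannot be carried over into the rewritten document.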
Here is a good documentation on partial updates: http://solr.pl/en/2012/07/09/solr-4-0-partial-documents-update/
Sample Solr partial update code:
Prerequisites: the fields need to be stored.
You also need to configure the update log path under the direct update handler:
<updateHandler class="solr.DirectUpdateHandler2">
<!-- Enables a transaction log, used for real-time get, durability,
and solr cloud replica recovery. The log can grow as big as
uncommitted changes to the index, so use of a hard autoCommit
is recommended (see below).
"dir" - the target directory for transaction logs, defaults to the
solr data directory. -->
<updateLog>
<str name="dir">${solr.ulog.dir:}</str>
</updateLog>
</updateHandler>
Code:
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class PartialUpdate {
    public static void main(String[] args) throws SolrServerException, IOException {
        SolrServer server = new HttpSolrServer("http://localhost:8080/solr");
        SolrInputDocument doc = new SolrInputDocument();
        Map<String, String> partialUpdate = new HashMap<String, String>();
        // set - to set a field.
        // add - to add to a multi-valued field.
        // inc - to increment a field.
        partialUpdate.put("set", "peter"); // value that needs to be set
        doc.addField("id", "122344545"); // unique id
        doc.addField("fname", partialUpdate); // fname of the document with id 122344545 will be set to 'peter'
        server.add(doc);
    }
}
