Copy Solr Field Values via Script

Copy Solr Field Values via Script - solr

I would like to copy the data from one field to another field for all documents in Solr.
A title field that is already populated needs to be copied into another field I just created. I'd like to do them all at once if possible via Putty or the Solr Admin console.
Thank you for any help.

If you have pre-ingested data then the only option is to re-ingest the data after adding the second field. You can set only the new field in the docs instead of inserting all the fields using Solr atomic updates. https://solr.apache.org/guide/8_6/updating-parts-of-documents.html#atomic-updates
solr.add({'id':1, 'newField': {'set': 'sample value'}})
For future insertions, if you want the second field to be auto filled, you can use Solr copy field with the source set to the first field. https://solr.apache.org/guide/8_6/copying-fields.html

Related

How to update Field Name in Solr Collection?

I have a solr indexed data as below.
My requirement is to update the field name MATERIAL_DOCUMENT_YEAR which is actually a date to MATERIAL_DOCUMENT_DATE.
The data is in Millions, which will take more time to re-index.
Is there any way from Solr UI to update the field name, without re-indexing the whole data?
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":39,
"params":{
"q":"SOLR_DATA",
"_":"1607925693065"}},
"response":{"numFound":129500000,"start":0,"maxScore":5.632038,"docs":[
{
"PLANT":["HYD"],
"STOCK_TYPE":[""],
"Table_Name":["TBL_MATERIAL_DOC_DISPLAY"],
"MATERIAL_DOCUMENT_YEAR":["20140312"],
"MATERIAL_DESCRIPTION":["T-SHIRT-XXL"],
"MATERIAL_DOCUMENT_NUMBER":["12345678"],
"MOVEMENT_TYPE":["123"],
"COST_CENTER":[""],

There is no way to rename the field without re-indexing and modifying the the schema.xml
May be you can add another field with correct name.
once it is added for all the fields then you can remove the earlier incorrect field.
Second option would be create another collection with correct field names.
Once all the data is up to date in new collection then you can create an alias to it with earlier collection name.
Once done with all the above you can then remove the older index...

Solr full text search for dynamically added data?

I'm trying index the data without defining schema.xml, is the any way to apply full text search without adding schema.xml or updating the manged shema?

The default operation mode of Solr is to use the Schemaless mode. In this mode Solr will guess what the field type is based on what pattern the data matches the first time a field is included. If it is numeric the first time, Solr will guess that it's going to be a numeric field every time.
If the field contains text it'll be indexed as a text field with processing applied as defined in the default schema.
As long as you're using the default configuration you can submit documents with just the field name and the associated text, then search against the field name as necessary.practice

Solr fields mapping?

I am indexing documents into solr from a source. At source, for each document, i have some associated properties which i am indexing & fetching into solr.
What i am doing is i am mapping some fields from source properties with solr schema fields. But i could see couple of extra fields in solr logs which i am not mapping. While querying in solr admin UI, i could see only mapped fields.
E.g. In below logs, i am using only content_name & content content_modifier but i could see Template fields also.
INFO - 2014-09-18 12:07:47.185; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update/extract params={literal.content_name=1_.000&literal.content_modifier=System&literal.Template={8ad4d8f0-93a7-4941-9657-cf3706f00409} {add=[1_.000 (1479581071766978560)]} 0 0
So whats happening here? Will solr index only mapped fields and skip rest of unmapped ones? Or will solr index all fields including mapped & non-mapped but on admin UI , it will show only mapped fields?
Please suggest.

Your question is defined by what your solrconfig and schema say because you can configure it any way you want. Here is how it works for the example schema for Solr 4.10:
1) In solrconfig.xml, the handler use "uprefix" parameter to map all fields NOT in schema to a dynamic field ignored_*
2) In schema.xml, that dynamic field has type ignored
3) Type ignored (in the same file) is defined as stored=false and indexed=false. Which means do not complain if you get one of fields with matching pattern, but do nothing with, literally ignore.
So, if you don't like that, you can modify any part of that pipeline. The easiest test would be to change the dynamic field to use type string and reindex. Then, you should see the rest of the fields.

Adding and Updating Solr and lucene field

I am new to solr. can someone address below questions.
1. Currently I have an index with 1.5 mill records. I am having a need to update value of a field to a new value. How do I do it. Will it be a re-indexing? Sample code will be helpful.
I have another need where I want to add a index field but don't want to reindex the entire content. I have document ids with me. For this requirement I can use lucene if that helps.

Currently I have an index with 1.5 mill records. I am having a need to update value of a field to a new value. How do I do it. Will it be a re-indexing? Sample code will be helpful.
Well, the good news is that the latest versions of Solr (starting with 4.3 or 4.4, I think) allows you to do what they call Atomic Updates. See here:
http://wiki.apache.org/solr/Atomic_Updates
From the coding point of view, it as if you were only updating the desired field. Using the Java SolrJ API it's something like this:
Let's say you have a document with a multi value field called "stuffedAnimals". The field already contains "teddy bear" and "stuffed turtle" as values. You want to update it and add a new value like "pink fluffy flamingo". What you can do is:
SolrInputDocument updateDocument = new SolrInputDocument();
//here you must add the id field with the desired value, corresponding to the doc you want to update:
updateDocument.addField("id", 2312312);
//tell it to add the new value to the existing ones, rather then replace them with it:
updateDocument.addField("stuffedAnimals", new HashMap(){{put("add","pink fluffy flamingo");}});
Problem with this is performance: what actually happens when you do this is that the document is removed and re-added entirely (not just the field). This is something you need to take into consideration if you plan on doing a lot of such operations.
I have another need where I want to add a index field but don't want to reindex the entire content. I have document ids with me. For this requirement I can use lucene if that helps.
Well, as I was saying above: when you update a field, the document is actually re-written entirely, so that means it's re-indexed with the new field as well. If you're using Solr 4.4 or earlier you need to declare the new fields in the schema.xml file. If you're using Solr 4.5 or newer you don't need to worry about the schema.xml any more.
Finally, as a remark for both questions: if you want to update a Solr document, make sure all its fields are marked as "stored" (stored=true in schema.xml). Since a partial update on a field translates into Solr removing and re-adding the document (with the update applied), if certain fields are not stored, Solr won't know what value to put in them after the update.

Take a look at atomic update feature added in 4.0.
It allows You to change value of particular field without reindexing whole document.
Remember that all fields in your schema have to be stored(without copyFields). If You need further assistance please write more detailed description.

Add new field to SOLR with default value. Populate existing documents

I have added a new field to a SOLR 3.6.1 schema.xml with a default value. Is it possible to populate / index existing documents in the SOLR repository with this default value without having to re-load all the data? I have been looking at re-indexing and re-optimizing but haven't been able to get this to work?

Any changes in schema.xml related to addition or change in fields would need re-indexing of the data.
So you have to reload your data.
If you know the document, you can do a Partial update of all those document with just that field.

Check Solr: Add new fields with Default Value for Existing Documents
If we only need search and display the new fields, we can do the following steps.
add the new field definition in schema.xml:
We need update search query: when search default value for this newFiled, also search null value:
-(-newFiled:defaultValue AND newFiled:[* TO *])
Use DocTransformer to add default value when there is no value in that field for old data.
Some functions may not work such as sort, stats.

Develop Reference

c reactjs sql-server angularjs arrays wpf database batch-file google-app-engine silverlight