Indexing with Solr

I'm using SolrJ to index data. I created some new fields in schema.xml, and now when I index data from Java I have to supply all of the new fields. If I leave one of them out, I get an exception: org.apache.solr.common.SolrException: Bad Request.
Can I index data using only the fields I choose?

This happens because the fields you defined in schema.xml have the attribute "required" set to "true".
Set the attribute to "false" (or remove it, since false is the default) and the field is no longer mandatory for each document.
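To find out which fields are currently mandatory, you could scan schema.xml for required="true". This is only a rough sketch, assuming a classic schema.xml with <field> elements; the sample schema, its field names and the helper functions are made up for illustration and are not part of Solr or SolrJ.

```python
import xml.etree.ElementTree as ET

# Hypothetical schema fragment; the field names are illustrative only.
SAMPLE_SCHEMA = """
<schema name="example" version="1.5">
  <fields>
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="title" type="text_general" indexed="true" stored="true" required="true"/>
    <field name="author" type="string" indexed="true" stored="true"/>
    <field name="price" type="float" indexed="true" stored="true" required="false"/>
  </fields>
</schema>
"""

def required_fields(schema_xml):
    """Return the names of all <field> elements declared with required="true"."""
    root = ET.fromstring(schema_xml)
    # iter() finds <field> whether or not it is wrapped in a <fields> element.
    return [
        f.get("name")
        for f in root.iter("field")
        if f.get("required", "false").lower() == "true"
    ]

def missing_required(doc, schema_xml):
    """List the required fields that a document (a plain dict) does not supply."""
    return [name for name in required_fields(schema_xml) if name not in doc]

print(required_fields(SAMPLE_SCHEMA))
print(missing_required({"id": "1", "author": "x"}, SAMPLE_SCHEMA))
```

Any field this reports as missing is a candidate cause for the Bad Request, unless the schema also gives it a default value.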

Related

Copy Solr Field Values via Script

I would like to copy the data from one field to another field for all documents in Solr.
A title field that is already populated needs to be copied into another field I just created. I'd like to do them all at once if possible via Putty or the Solr Admin console.
Thank you for any help.
If the data is already ingested, the only option is to re-ingest it after adding the second field. You do not have to resend every field, though: with Solr atomic updates you can set only the new field on each document. https://solr.apache.org/guide/8_6/updating-parts-of-documents.html#atomic-updates
solr.add({'id':1, 'newField': {'set': 'sample value'}})
For future insertions, if you want the second field to be filled in automatically, you can use a Solr copy field with the source set to the first field. https://solr.apache.org/guide/8_6/copying-fields.html
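For the existing documents, the atomic-update route could look roughly like the sketch below. It only builds the JSON payload and does not talk to a server. The target field name title_copy and the helper build_copy_updates are assumptions for illustration, and the input list stands in for whatever a query such as q=*:*&fl=id,title returns in your setup.

```python
import json

def build_copy_updates(docs, source="title", target="title_copy", id_field="id"):
    """Build atomic-update documents that set `target` to each doc's `source` value."""
    updates = []
    for doc in docs:
        if source not in doc:
            continue  # nothing to copy for this document
        updates.append({id_field: doc[id_field], target: {"set": doc[source]}})
    return updates

# Stand-in for documents fetched from Solr (ids and titles only).
fetched = [
    {"id": "1", "title": "First document"},
    {"id": "2", "title": "Second document"},
    {"id": "3"},  # no title, so it is skipped
]

payload = json.dumps(build_copy_updates(fetched))
print(payload)
```

The resulting JSON array would be POSTed to the collection's /update handler with Content-Type application/json, followed by a commit. Keep in mind that atomic updates rebuild the document from its stored (or docValues) fields, so they only work safely if the other fields are retrievable.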

How to update Field Name in Solr Collection?

I have data indexed in Solr as shown below.
I need to rename the field MATERIAL_DOCUMENT_YEAR, which actually holds a date, to MATERIAL_DOCUMENT_DATE.
There are millions of documents, so re-indexing would take a long time.
Is there any way to rename the field from the Solr UI without re-indexing all the data?
{
"responseHeader":{
"zkConnected":true,
"status":0,
"QTime":39,
"params":{
"q":"SOLR_DATA",
"_":"1607925693065"}},
"response":{"numFound":129500000,"start":0,"maxScore":5.632038,"docs":[
{
"PLANT":["HYD"],
"STOCK_TYPE":[""],
"Table_Name":["TBL_MATERIAL_DOC_DISPLAY"],
"MATERIAL_DOCUMENT_YEAR":["20140312"],
"MATERIAL_DESCRIPTION":["T-SHIRT-XXL"],
"MATERIAL_DOCUMENT_NUMBER":["12345678"],
"MOVEMENT_TYPE":["123"],
"COST_CENTER":[""],
There is no way to rename a field without modifying schema.xml and re-indexing.
One option is to add another field with the correct name.
Once it has been populated for all documents, you can remove the earlier, incorrectly named field.
A second option is to create another collection with the correct field names.
Once all the data is up to date in the new collection, you can create an alias that points to it under the earlier collection name.
After that, you can remove the older index.
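The second option could be sketched like this. It only shows the two pieces that are easy to get wrong: renaming the key on each document before sending it to the new collection, and building the Collections API CREATEALIAS request. The base URL, the alias name materials and the collection name materials_v2 are hypothetical, and nothing is actually sent to Solr here.

```python
from urllib.parse import urlencode

def rename_field(doc, old_name, new_name):
    """Return a copy of doc with old_name renamed to new_name; other fields are untouched."""
    renamed = {k: v for k, v in doc.items() if k != old_name}
    if old_name in doc:
        renamed[new_name] = doc[old_name]
    return renamed

def create_alias_url(base_url, alias, collection):
    """Build a Collections API CREATEALIAS request URL."""
    query = urlencode({"action": "CREATEALIAS", "name": alias, "collections": collection})
    return f"{base_url}/admin/collections?{query}"

doc = {"PLANT": ["HYD"], "MATERIAL_DOCUMENT_YEAR": ["20140312"]}
print(rename_field(doc, "MATERIAL_DOCUMENT_YEAR", "MATERIAL_DOCUMENT_DATE"))

# Assumed base URL and names; adjust to your cluster.
print(create_alias_url("http://localhost:8983/solr", "materials", "materials_v2"))
```

Clients that query through the alias do not need to know that the underlying collection changed.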

Solr fields mapping?

I am indexing documents into Solr from a source. At the source, each document has some associated properties, which I index and fetch into Solr.
I map some of the source properties to Solr schema fields. However, I can see a couple of extra fields in the Solr logs that I am not mapping. When querying in the Solr admin UI, I see only the mapped fields.
E.g. in the log below, I am using only content_name and content_modifier, but I can see a Template field as well.
INFO - 2014-09-18 12:07:47.185; org.apache.solr.update.processor.LogUpdateProcessor; [collection1] webapp=/solr path=/update/extract params={literal.content_name=1_.000&literal.content_modifier=System&literal.Template={8ad4d8f0-93a7-4941-9657-cf3706f00409} {add=[1_.000 (1479581071766978560)]} 0 0
So what's happening here? Does Solr index only the mapped fields and skip the unmapped ones? Or does it index all fields, mapped and unmapped, but show only the mapped ones in the admin UI?
Please suggest.
The answer depends on what your solrconfig.xml and schema.xml say, because you can configure this any way you want. Here is how it works for the example schema in Solr 4.10:
1) In solrconfig.xml, the handler uses the "uprefix" parameter to map all fields NOT in the schema to a dynamic field ignored_*
2) In schema.xml, that dynamic field has type ignored
3) Type ignored (in the same file) is defined with stored=false and indexed=false. This means Solr does not complain when it gets a field matching the pattern, but does nothing with it; the field is literally ignored.
So, if you don't like that, you can modify any part of that pipeline. The easiest test would be to change the dynamic field to use type string and reindex. Then you should see the rest of the fields.
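To make the pipeline above concrete, here is a toy model of it in Python. This is not Solr code, just a simulation of the behaviour described in the three steps; the function names are invented, and the schema field set is an assumption based on the fields mentioned in the question.

```python
def route_fields(incoming, schema_fields, uprefix="ignored_"):
    """Model of uprefix: fields unknown to the schema get the prefix prepended."""
    routed = {}
    for name, value in incoming.items():
        target = name if name in schema_fields else uprefix + name
        routed[target] = value
    return routed

def kept_fields(routed, ignored_prefix="ignored_"):
    """Model of the ignored type (stored=false, indexed=false): matching fields are dropped."""
    return {k: v for k, v in routed.items() if not k.startswith(ignored_prefix)}

# Literal fields from the log line in the question.
incoming = {
    "content_name": "1_.000",
    "content_modifier": "System",
    "Template": "{8ad4d8f0-93a7-4941-9657-cf3706f00409}",
}
# Assumed: only these two are defined in the schema.
schema_fields = {"content_name", "content_modifier"}

routed = route_fields(incoming, schema_fields)
print(sorted(routed))               # Template becomes ignored_Template
print(sorted(kept_fields(routed)))  # only the mapped fields remain
```

That is why Template shows up in the update log (it is sent with the request) but never in query results.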

Using default value for Solr field boosting when field does not exist

My existing Solr 4.x instance has about 650k documents indexed. I just added a new field to the schema that will hold a number of votes given to the document that will be used in boosting the score. Until the first user up votes (or down votes) a given document, said document will not have that field defined. You can see this when viewing the document using the Solr Admin tool.
The field was defined with a default value but I think this only applies to new documents (or maybe reindexed documents) that do not have said field specified.
When I try to test out different boosting functions, I get the following exception back
"error": {
"msg": "can not use FieldCache on a field which is neither indexed nor has doc values: votes",
"code": 400
}
Is it possible to specify a default value to be used for boosting when the field does not (yet) exist in the document? My logic would be
field exists -- use field value
field does not exist -- use default value
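This seems to be related to your earlier question. Perhaps you can try a FunctionQuery as well:
q={!boost b=map(field,0,0,default_value)} your_query
This boosts based on the field value and uses default_value when the field has no value. A missing value reads as 0 in a function query, and map(x,min,max,target) replaces values between min and max with target while leaving other values unchanged, so a genuine 0 is also replaced by default_value.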
Reference here
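If you want to check what a given map(...) expression does to real vote counts before trying it in Solr, a small Python model of the function helps. It follows the documented semantics of map(x,min,max,target) and map(x,min,max,target,default); the helper names are invented, and treating a missing field as 0 is an assumption about how function queries read absent numeric values.

```python
def solr_map(x, lo, hi, target, default=None):
    """Model of Solr's map(): values in [lo, hi] become target.

    With four arguments, other values pass through unchanged.
    With five arguments, other values become `default`.
    """
    if lo <= x <= hi:
        return target
    return x if default is None else default

def field_value(doc, field):
    """Assumption: a document without the field yields 0 inside a function query."""
    return doc.get(field, 0)

docs = [{"id": "a"}, {"id": "b", "votes": 7}, {"id": "c", "votes": 0}]

for doc in docs:
    v = field_value(doc, "votes")
    # Four-argument form: keep real vote counts, use 5 where the value is 0 or missing.
    four = solr_map(v, 0, 0, 5)
    # Five-argument form: 0 stays 0 and every other value is replaced by 5.
    five = solr_map(v, 0, 0, 0, 5)
    print(doc["id"], four, five)
```

The four-argument form is the one that matches "field exists, use field value; field does not exist, use default value", with the caveat that a stored 0 is indistinguishable from a missing value.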

Add new field to SOLR with default value. Populate existing documents

I have added a new field with a default value to a SOLR 3.6.1 schema.xml. Is it possible to populate / index existing documents in the SOLR repository with this default value without having to re-load all the data? I have been looking at re-indexing and re-optimizing but haven't been able to get this to work.
Any change in schema.xml that adds or modifies fields requires re-indexing the data.
So you have to reload your data.
If you know which documents are affected, you can do a partial update of all those documents with just that field (note that atomic updates require Solr 4.0 or later).
Check Solr: Add new fields with Default Value for Existing Documents
If we only need to search and display the new field, we can do the following.
Add the new field definition to schema.xml.
Update the search query: when searching for the default value of newFiled, also match documents that have no value:
-(-newFiled:defaultValue AND newFiled:[* TO *])
Use a DocTransformer to add the default value when old data has no value in that field.
Some functions, such as sort and stats, may not work.
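The double negation in the query -(-newFiled:defaultValue AND newFiled:[* TO *]) is easy to misread, so here is the same logic as a Python predicate over plain dicts. This is only a model of the boolean logic, not of how Solr evaluates the query; the function name is made up.

```python
def matches_default_query(doc, field="newFiled", default="defaultValue"):
    """Model of -(-field:default AND field:[* TO *])."""
    has_value = field in doc                # field:[* TO *]
    is_default = doc.get(field) == default  # field:default
    # Inner clause: documents that have a value which is NOT the default.
    inner = (not is_default) and has_value
    # Outer negation: everything else, i.e. missing value or the default value.
    return not inner

print(matches_default_query({}))                            # old doc without the field
print(matches_default_query({"newFiled": "defaultValue"}))  # new doc with the default
print(matches_default_query({"newFiled": "other"}))         # doc with a different value
```

So old documents that never received the field are treated as if they held the default, which is the behaviour the answer is after.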
