Add a new field to an existing document - Solr

Is it possible to add a new field to an existing document?
For example:
There is a document with several fields, e.g.
ID=99999
Field1:text
Field2:text
This document is already in the index. Now I want to add a new field to it WITHOUT resending the old data:
ID=99999
Field3:text
Currently, the old document is deleted and a new document with the same ID is created. So if I now search for ID 99999, the result is:
ID=99999
Field3:text
I read this on the Solr Wiki:
How can I update a specific field of an existing document?
I want to update a specific field in a document, is that possible? I only need to index one field for a specific document. Do I have to index all the documents for this?
No, just the one document. Let's say you have a CMS and you edit one document. You will need to re-index this document only by using the add solr statement for the whole document (not one field only).
In Lucene, to update a document the operation is really a delete followed by an add. You will need to add the complete document as there is no such "update only a field" semantics in Lucene.
So is there any solution for this? Will this feature be implemented in a future version (I currently use 3.6.0)? As a workaround, I thought about writing a script or an application that collects the existing fields, adds the new field and updates the whole document. But I think this will hurt performance. Do you have any other ideas?
Best regards

I have 2 answers for you (both more or less bad):
To update a field within a document in Solr, you have to reindex the whole document (to update Field3 within document ID:99999 you have to reindex that document with values for all fields).
In Solr 4 they implemented a feature like that, but it comes with a condition: all fields have to be stored, not just indexed. What happens is that Solr uses the stored values and reindexes the document in the background. If you are interested, there is a nice article about it: http://solr.pl/en/2012/07/09/solr-4-0-partial-documents-update/ This solution has an obvious flaw: the index grows when you store all fields.
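The first option (reindexing the whole document) can be sketched roughly like this. It is only an illustration: the helper names merge_for_reindex and build_add_command are made up here, and it assumes you have already fetched the existing document and that all of its fields were stored, since only stored values can be read back. The command it prints uses Solr's JSON "add" syntax and would still have to be posted to your update handler.

```python
import json


def merge_for_reindex(existing_doc, new_fields):
    """Return a complete document: the existing stored fields plus the new ones."""
    merged = dict(existing_doc)   # copy so the fetched document is not modified
    merged.update(new_fields)     # new fields win if a name already exists
    return merged


def build_add_command(doc):
    """Serialize a whole-document add command in Solr's JSON update syntax."""
    return json.dumps({"add": {"doc": doc}})


# Hypothetical data taken from the question; fetching it from Solr is not shown.
existing = {"id": "99999", "Field1": "text", "Field2": "text"}
full_doc = merge_for_reindex(existing, {"Field3": "text"})
print(build_add_command(full_doc))
```

Because the whole document is resent, any field that was indexed but not stored would be lost with this approach.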
I hope this helps with your problem. If you have more questions, please ask.

This is possible in Solr 4. For example, consider the following document:
{
"id": "book123",
"name" : "Solr Rocks"
}
To add an author field to the document, send the field value as a JSON object with a "set" attribute holding the new value:
$ curl http://localhost:8983/solr/update -H 'Content-type:application/json' -d '
[
{"id" : "book123",
"author" : {"set":"The Community"}
}
]'
Fetching the document again
$ curl 'http://localhost:8983/solr/get?id=book123'
returns
{
"doc" : {
"id" : "book123",
"name" : "Solr Rocks",
"author": "The Community"
}
}
"set" will add or replace the author field. Besides "set", you can also increment a value ("inc") or add a value to a multi-valued field ("add").
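If you build the request body in code rather than by hand, a small helper keeps the modifier syntax in one place. This is a sketch only; atomic_update is a hypothetical helper name, not part of any Solr client, and sending the payload to the /update handler is left out.

```python
import json


def atomic_update(doc_id, **changes):
    """Build one atomic-update document.

    Each keyword argument maps a field name to a (modifier, value) pair,
    e.g. author=("set", "The Community").
    """
    doc = {"id": doc_id}
    for field, (modifier, value) in changes.items():
        doc[field] = {modifier: value}
    return doc


# Same update as the curl example above, as a JSON list of documents.
payload = json.dumps([atomic_update("book123", author=("set", "The Community"))])
print(payload)
```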

From Solr 4 onwards you can update a single field of a document without resending the entire document. Several modifiers are supported:
set – set or replace a particular value, or remove the value if null is specified as the new value
add – adds an additional value to a list
remove – removes a value (or a list of values) from a list
removeregex – removes all values from a list that match the given Java regular expression
inc – increments a numeric value by a specific amount (use a negative value to decrement)
Example document:
{
"id": "1",
"name" : "Solr",
"views" : 2
}
Now update it with:
$ curl http://localhost:8983/solr/demo/update -H 'Content-type:application/json' -d '
[
{"id" : "1",
"author" : {"set":"Neal Stephenson"},
"views" : {"inc":3}
}
]'
This results in:
{
"id": "1",
"name" : "Solr",
"views" : 5,
"author" : "Neal Stephenson"
}
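To make the modifier semantics concrete, here is a small local simulation of how set, add, remove and inc change a document. It is not how Solr implements atomic updates (Solr rebuilds the document from stored fields and reindexes it); apply_atomic_update and as_list are made-up names, and removeregex is left out.

```python
def as_list(value):
    """Normalize a missing, single, or multi-valued field to a list."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def apply_atomic_update(doc, update):
    """Return a copy of doc with the modifiers in update applied (local simulation)."""
    result = dict(doc)
    for field, change in update.items():
        if not isinstance(change, dict):
            result[field] = change  # plain value, e.g. the id
            continue
        for modifier, value in change.items():
            if modifier == "set":
                if value is None:
                    result.pop(field, None)  # null removes the field
                else:
                    result[field] = value
            elif modifier == "add":
                result[field] = as_list(result.get(field)) + as_list(value)
            elif modifier == "remove":
                targets = as_list(value)
                result[field] = [v for v in as_list(result.get(field)) if v not in targets]
            elif modifier == "inc":
                result[field] = result.get(field, 0) + value
            else:
                raise ValueError(f"unsupported modifier: {modifier}")
    return result


doc = {"id": "1", "name": "Solr", "views": 2}
update = {"id": "1", "author": {"set": "Neal Stephenson"}, "views": {"inc": 3}}
print(apply_atomic_update(doc, update))
```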

Related

How do I stop Solr from creating "_str" fields?

I use Apache Solr to index a database. The problem is that fields named "*_str" are created. Someone told me to change update.autoCreateFields from true to false, but the fields are still created! Please help me and my memory :(
In Java, for example, I use SolrInputDocument.addField("A", valueOfA) and SolrInputDocument.addField("B", valueOfB).
Then Solr shows:
"A" : "valueOfA"
"B" : "valueOfB"
"A_str" : "valueOfA"
"B_str" : "valueOfB"
In a standard Solr 7 installation when Solr automatically adds a field (e.g. when update.autoCreateFields is set to true) you will get these _str fields also added by default. For example if you add the following document to Solr:
[
{ "id": "test01", "somefield": "hello world" }
]
You will see two fields in your schema: somefield and somefield_str. I believe the configuration for the additional _str field is defined in the solrconfig.xml file (look for AddSchemaFieldsUpdateProcessorFactory), but I am not sure about this.
If you set autoCreateFields to false after you have imported the document that created those fields, those fields will remain in your schema (and on the documents that already have them.) You will need to recreate your schema in order to get rid of them.

Store abbreviation using Solr in-built feature

I want to abbreviate words using Solr. I am using Solr 7.1. In the schema, I have a field named "author" of type string. I want to create a copy field from it that stores an abbreviation of the string stored in the "author" field. For example, when "William Shakespeare" is stored in the "author" field, "W. Shakespeare" should be added to the copy field. I am very new to Solr and have not been able to configure this. Please help.

How to filter an array in Azure Search

I have the following data in my index:
{
"name" : "The 100",
"lists" : [
"2c8540ee-85df-4f1a-b35f-00124e1d3c4a;Bellamy",
"2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike",
"2c8540ee-85df-4f1a-b35f-00155c02e581;Clark"
]
}
I have to get all the documents where lists contains Pike.
A query on the full value works with any, but I couldn't get a contains-style match to work.
$filter=lists/any(t: t eq '2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike')
However, I am not sure how to search with only Pike.
$filter=lists/any(t: t eq 'Pike')
I guess eq matches against the full text of the value. Is there any way to make this query work with the given data structure?
Currently the field lists is only filterable, not searchable.
The eq operator looks for exact, case-sensitive matches. That's why it doesn't match 'Pike'. You need to structure your index such that terms like 'Pike' can be easily found. You can accomplish this in one of two ways:
Separate the GUIDs from the names when you index documents. So instead of indexing "2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike" as a single string, you could index them as separate strings in the same array, or perhaps in two different collection fields (one for GUIDs and one for names) if you need to correlate them by position.
If the field is searchable, you can use the new search.ismatch function in your filter. Assuming the field is using the standard analyzer, full-text search will word-break on the semicolons, so you should be able to search just for "Pike" and get a match. The syntax would look like this: $filter=search.ismatch('Pike', 'lists') (If looking for "Pike" is all your filter does, you can just use the search and searchFields parameters to the Search API instead of $filter.) If the "lists" field is not already searchable, you will need to either add a new field and re-index the "lists" values, or re-create your index from scratch with the new field definition.
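The first option (separating the GUIDs from the names at indexing time) might look like the sketch below. The function name split_lists and the target field names listIds and listNames are hypothetical; the only assumption about the data is the "guid;name" layout shown in the question.

```python
def split_lists(entries):
    """Split 'guid;name' strings into two parallel lists (correlated by position)."""
    ids, names = [], []
    for entry in entries:
        guid, _, name = entry.partition(";")  # split on the first semicolon only
        ids.append(guid)
        names.append(name)
    return ids, names


source = {
    "name": "The 100",
    "lists": [
        "2c8540ee-85df-4f1a-b35f-00124e1d3c4a;Bellamy",
        "2c8540ee-85df-4f1a-b35f-00155c40f11c;Pike",
        "2c8540ee-85df-4f1a-b35f-00155c02e581;Clark",
    ],
}

ids, names = split_lists(source["lists"])
# listIds and listNames are made-up field names for the two collection fields.
indexed = {"name": source["name"], "listIds": ids, "listNames": names}
print(indexed["listNames"])
```

With the names in their own filterable collection, a filter such as listNames/any(n: n eq 'Pike') can match exactly.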
Update
There is a new approach to solve this type of problem that's available in API versions 2019-05-06 and above. You can now use complex types to represent structured data, including in collections. For the original example, you could structure the data like this:
{
"name" : "The 100",
"lists" : [
{ "id": "2c8540ee-85df-4f1a-b35f-00124e1d3c4a", "name": "Bellamy" },
{ "id": "2c8540ee-85df-4f1a-b35f-00155c40f11c", "name": "Pike" },
{ "id": "2c8540ee-85df-4f1a-b35f-00155c02e581", "name": "Clark" }
]
}
And then directly query for the name sub-field like this:
$filter=lists/any(l: l/name eq 'Pike')
See the Azure Search documentation on complex types for details.
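If the name comes from user input, the filter string should be built with proper quoting, since OData string literals escape a single quote by doubling it. A minimal sketch, where odata_string and any_subfield_equals are hypothetical helper names:

```python
def odata_string(value):
    """Quote a value as an OData string literal (single quotes are doubled)."""
    return "'" + value.replace("'", "''") + "'"


def any_subfield_equals(collection, subfield, value):
    """Build a filter matching documents where any element's sub-field equals value."""
    return f"{collection}/any(x: x/{subfield} eq {odata_string(value)})"


print(any_subfield_equals("lists", "name", "Pike"))
```

The resulting string is what you would pass as the $filter parameter.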

Cloudant search documents that appear after certain id

There is a Cloudant database that stores some documents.
There is also a mobile app that fetches those documents using search indexes.
The question is:
Is it possible to make a query like "get me all documents that appeared after this one"?
For example:
I start the app and get the documents with ids 'aaa', 'aab' and 'aac' from the database.
I want to store the last id, 'aac', in my app's memory.
Then, the next time I start the app, I want to get the documents that appeared after 'aac' from the database.
I think the main problem will be that _ids are assigned as random strings, but I want to be sure.
When searching the index, try including the selector field in the JSON object of the request body:
{
"selector": {
"_id": {
"$gt": "the_previous_id"
}
},
"sort": [
{
"_id": "asc"
}
]
}
In addition, from https://docs.cloudant.com/document.html:
"The _id field is either created by you, or generated automatically as a UUID by Cloudant."
Therefore, you can provide your own _ids when creating a document if the Cloudant-generated _ids are not working for you.
Condition operators:
https://docs.cloudant.com/cloudant_query.html#condition-operators
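In the app, the request body could be generated from the last id you stored. This is only a sketch: docs_after is a hypothetical helper, the limit value is arbitrary, and posting the body to the database's query endpoint is not shown. Note that "$gt" compares _id values as strings, so the result only means "newer documents" if you assign _ids that sort in creation order.

```python
import json


def docs_after(last_id, limit=25):
    """Build a Cloudant Query body selecting documents whose _id sorts after last_id."""
    return {
        "selector": {"_id": {"$gt": last_id}},
        "sort": [{"_id": "asc"}],
        "limit": limit,
    }


body = docs_after("aac")
print(json.dumps(body, indent=2))
```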

SOLR Tika: add text of file to existing record (ExtractingRequestHandler)

I am indexing posts in SOLR with "name", "title", and "description" fields. I'd like to later be able to add a file (like a Word doc or a PDF) using Tika / the ExtractingRequestHandler.
I know I can add documents like so: (or through other interfaces)
curl 'http://localhost:8983/solr/update/extract?literal.id=post1&commit=true' -F "myfile=@tutorial.html"
But this replaces the existing post (post1 above). Is there a parameter I can pass so that it only adds to the record?
In Solr (ver < 4.0) you can't modify fields in a document. You can only delete or add/replace whole documents. Therefore, when "appending" a file to the Solr document you have to rebuild your document from its current values (using literal), i.e. query for the document and then:
http://localhost:8983/solr/update/extract?literal.id=post1&literal.name=myName&literal.title=myTitle&literal.description=myDescription&commit=true
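Building that URL by hand gets error-prone once field values contain spaces or special characters, so it may be easier to generate it from the document you queried. A sketch, assuming the current values are already in a dict; extract_url is a hypothetical helper and the file upload itself is not shown:

```python
from urllib.parse import urlencode


def extract_url(base, doc_id, fields, commit=True):
    """Build an /update/extract URL that resends existing values as literal.* params."""
    params = [("literal.id", doc_id)]
    params += [(f"literal.{name}", value) for name, value in fields.items()]
    if commit:
        params.append(("commit", "true"))
    return f"{base}/update/extract?{urlencode(params)}"


# Values as returned by a prior query for post1 (placeholders from the answer above).
current = {"name": "myName", "title": "myTitle", "description": "myDescription"}
print(extract_url("http://localhost:8983/solr", "post1", current))
```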

Resources