Elasticsearch must some values and must not all others - arrays

I have an index with the following mapping:
{
  "tests": {
    "mappings": {
      "test": {
        "properties": {
          "arr": {
            "properties": {
              "name": {
                "type": "long"
              }
            }
          }
        }
      }
    }
  }
}
I have 3 documents with the following values for the field "arr":
arr: [{name: 1}, {name: 2}, {name: 3}]
arr: [{name: 1}, {name: 2}, {name: 4}]
arr: [{name: 1}, {name: 2}]
I would like to find the documents in which every name value is contained in a given array of names. The document doesn't need to contain all the values of the array, but if it has even one value that is not in the array, it should not match.
For example:
If my array of names is [1,2], then I only want document 3, but I get all of them
If my array of names is [1,2,3], then I want documents 1 and 3, but I only get document 1 with must and all of them with should
If my array of names is [1,2,4], then I want documents 2 and 3, but I only get document 2 with must and all of them with should
If my array of names is [1], then I want no document, but I get all of them
Here the arrays are small; in my project, both the arrays in the documents and the array to compare against are much bigger.
So, to be specific, I need to search in a way that:
All the names in the document are in the array of names
The document does not need to have ALL the values in the array of names
Thank you in advance

Using Scripting (No Mapping Change)
No mapping changes are required to what you already have.
I've come up with the script below. The core logic lives in the script itself, which should be mostly self-explanatory.
POST test_index_arr/_search
{
  "query": {
    "bool": {
      "filter": {
        "script": {
          "script": {
            "source": """
              // Values of arr.name stored in this document.
              List myList = doc['arr.name'];
              // Number of document values not yet found in params.ids.
              int remaining = myList.size();
              for (int i = 0; i < params.ids.size(); i++) {
                long id = params.ids.get(i);
                if (myList.contains(id)) {
                  remaining = remaining - 1;
                }
                if (remaining == 0) {
                  break;
                }
              }
              // Match only when every document value was found in params.ids.
              return remaining == 0;
            """,
            "lang": "painless",
            "params": {
              "ids": [1, 2, 4]
            }
          }
        }
      }
    }
  }
}
The script checks whether every number in a document's arr is present in the input; if so, the document is returned.
In the above query, "ids": [1,2,4] acts as the input. Add, remove, or update the values there depending on your requirement.
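To make the intended semantics concrete, here is a minimal plain-Python sketch of the same subset check, run against the three documents from the question. It does not talk to Elasticsearch; the names `all_names_allowed`, `documents`, and `matching_documents` are made up for illustration.

```python
def all_names_allowed(doc_names, allowed_names):
    """Return True when every name in the document appears in allowed_names."""
    return set(doc_names) <= set(allowed_names)


# The three documents from the question, keyed by document number.
documents = {
    1: [1, 2, 3],
    2: [1, 2, 4],
    3: [1, 2],
}


def matching_documents(allowed_names):
    """List the document numbers whose names are all in allowed_names."""
    return [
        number
        for number, names in documents.items()
        if all_names_allowed(names, allowed_names)
    ]


print(matching_documents([1, 2]))     # [3]
print(matching_documents([1, 2, 3]))  # [1, 3]
print(matching_documents([1, 2, 4]))  # [2, 3]
print(matching_documents([1]))        # []
```

The four calls reproduce the four expected outcomes listed in the question.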
Hope it helps!

Related

Which is the efficient way of updating an element in MongoDB?

I have a collection like below
{
  "doc_id": "1234",
  "items": [
    {
      "item_no": 1,
      "item": "car"
    },
    {
      "item_no": 2,
      "item": "bus"
    },
    {
      "item_no": 3,
      "item": "truck"
    }
  ]
}
I need to update an element inside the items list based on a search criterion: if "item_no" is 3, "item" should be updated to "aeroplane".
I have written the following two approaches in Python to solve this.
Approach 1:
from copy import deepcopy

# doc_id is stored as a string in the sample document
cursor = list(collection.find({"doc_id": "1234"}))
for doc in cursor:
    if "items" in doc:
        temp = deepcopy(doc["items"])
        for element in doc["items"]:
            # ("item_no" and "item") in element would only test for "item"
            if "item_no" in element and "item" in element:
                if element["item_no"] == 3:
                    temp[temp.index(element)]["item"] = "aeroplane"
        collection.update_one({"doc_id": "1234"},
                              {"$set": {"items": temp}})
Approach 2:
cursor = list(collection.find({"doc_id": "1234"}))
for doc in cursor:
    if "items" in doc:
        collection.find_one_and_update(
            {"doc_id": "1234"},
            {"$set": {"items.$[elem]": {"item_no": 3, "item": "aeroplane"}}},
            array_filters=[{"elem.item_no": {"$eq": 3}}])
Among the above two approaches, which one is better in terms of time complexity?
Use only a query and avoid loops:
db.collection.update({
  "doc_id": "1234",
  "items.item_no": 3
},
{
  "$set": {
    "items.$.item": "aeroplane"
  }
})
Note that because "items.item_no": 3 is in the find stage, you can use $ in the update stage to refer to the matched object in the array.
So, given this filter:
{
"doc_id": "1234",
"items.item_no": 3
}
When you use $, you are telling Mongo: "OK, apply the action to the object where the condition matches" (i.e., the document in the collection with doc_id: "1234" and the array element with items.item_no: 3).
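As a rough illustration, the filter and update documents can be built in Python and handed to PyMongo's update_one, and the effect of the positional $ can be emulated on a plain dict. The helper names `build_item_update` and `apply_positional_set` are hypothetical, and the emulation only mimics the "first matching element" behaviour; it is not how the server implements it.

```python
def build_item_update(doc_id, item_no, new_item):
    """Build the filter and update documents for a positional `$` update."""
    query = {"doc_id": doc_id, "items.item_no": item_no}
    update = {"$set": {"items.$.item": new_item}}
    return query, update


def apply_positional_set(doc, item_no, new_item):
    """Plain-Python emulation: change only the first matching array element."""
    for element in doc["items"]:
        if element.get("item_no") == item_no:
            element["item"] = new_item
            break
    return doc


query, update = build_item_update("1234", 3, "aeroplane")
# With PyMongo this would be sent as: collection.update_one(query, update)
# (`collection` is assumed to be an existing pymongo Collection.)

doc = {
    "doc_id": "1234",
    "items": [
        {"item_no": 1, "item": "car"},
        {"item_no": 2, "item": "bus"},
        {"item_no": 3, "item": "truck"},
    ],
}
apply_positional_set(doc, 3, "aeroplane")
print(doc["items"][2])  # {'item_no': 3, 'item': 'aeroplane'}
```

No loop over documents is needed on the client side: the server finds the element and updates it in a single call.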
Also, if you want to update more than one document, you can use multi: true.
Edit: It seems you are using pymongo, so you can use multi=True (instead of multi: true) or, more cleanly, update_many.
collection.update_many(query, update)  # your own filter and update documents go here

Reference value from positional element in array in update

Suppose I have a document that looks like this:
{
"id": 1,
"entries": [
{
"id": 100,
"urls": {
"a": "url-a",
"b": "url-b",
"c": "url-c"
},
"revisions": []
}
]
}
I am trying to add a new object to the revisions array that contains its own urls field. Two of the fields should be copied from the entry's urls, while the last one will be new. The result should look like this:
{
"id": 1,
"entries": [
{
"id": 100,
"urls": {
"a": "url-a",
"b": "url-b",
"c": "url-c"
},
"revisions": [
{
"id": 1000,
"urls": {
"a": "url-a", <-- copied
"b": "url-b", <-- copied
"c": "some-new-url" <-- new
}
}
]
}
]
}
I am on MongoDB 4.2+, so I know I can use $property on the update query to reference values. However, this does not seem to be working as I expect:
collection.updateOne(
{
id: 1,
"entries.id": 100
},
{
$push: {
"entries.$.revisions": {
id: 1000,
urls: {
"a": "$entries.$.urls.a",
"b": "$entries.$.urls.b",
"c": "some-new-url"
}
}
}
}
);
The element gets added to the array, but all I see for the url values is the literal $entries.$.urls.a value. I suspect the issue is with combining the reference with selecting a specific positional array element. I have also tried using $($entries.$.urls.a), with the same result.
How can I make this work?
Starting with MongoDB 4.2, you can use an aggregation pipeline in updates: the update part of the query is wrapped in [], which lets you run aggregation stages in the update and reference existing field values.
Issue:
Since you've not wrapped the update part in [] to mark it as an aggregation pipeline, .updateOne() treats "$entries.$.urls.a" as a plain string. I also believe you cannot use the $ positional operator in updates that use an aggregation pipeline.
Try the query below, which uses an aggregation pipeline:
collection.updateOne(
  {
    id: 1,
    // "entries.id" is optional, but it avoids running the pipeline on a doc with `id: 1` that has no entry with `id: 100`.
    "entries.id": 100
  },
  [
    {
      $set: {
        entries: {
          // `$map` iterates over the array and builds a new array from the returned values.
          $map: {
            input: "$entries",
            in: {
              $cond: [
                // `$$this` is the current element; if the condition holds, return the merged object, otherwise return the element unchanged.
                { $eq: ["$$this.id", 100] },
                {
                  $mergeObjects: [
                    "$$this",
                    {
                      revisions: { $concatArrays: ["$$this.revisions", [{ id: 1000, urls: { a: "$$this.urls.a", b: "$$this.urls.b", c: "some-new-url" } }]] }
                    }
                  ]
                },
                "$$this" // Returned as is when the condition is not met.
              ]
            }
          }
        }
      }
    }
  ]
);
$mergeObjects will replace the existing revisions field in the $$this (current) object with the value of { $concatArrays: [ "$$this.revisions", [ { id: 1000, urls: { a: "$$this.urls.a", b: "$$this.urls.b", c: "some-new-url" } } ] ] }.
Given the field name revisions and the fact that it is an array, I've assumed it will hold multiple objects, so we use the $concatArrays operator to push new objects into the revisions array of the particular entries object.
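If it helps to see what the pipeline computes, here is a plain-Python sketch of the same transformation applied to the sample document. It only emulates $map, $cond, $mergeObjects, and $concatArrays on a dict; the function name `add_revision` and its parameters are made up for this illustration.

```python
def add_revision(doc, entry_id, revision_id, new_c_url):
    """Return a copy of doc with a revision appended to the matching entry."""
    new_entries = []
    for entry in doc["entries"]:  # plays the role of $map over "$entries"
        if entry["id"] == entry_id:  # the $cond / $eq check on "$$this.id"
            revision = {
                "id": revision_id,
                "urls": {
                    "a": entry["urls"]["a"],  # copied, like "$$this.urls.a"
                    "b": entry["urls"]["b"],  # copied, like "$$this.urls.b"
                    "c": new_c_url,           # the new value
                },
            }
            # $mergeObjects + $concatArrays: same entry, longer revisions list.
            entry = {**entry, "revisions": entry["revisions"] + [revision]}
        new_entries.append(entry)
    return {**doc, "entries": new_entries}


doc = {
    "id": 1,
    "entries": [
        {
            "id": 100,
            "urls": {"a": "url-a", "b": "url-b", "c": "url-c"},
            "revisions": [],
        }
    ],
}
updated = add_revision(doc, 100, 1000, "some-new-url")
print(updated["entries"][0]["revisions"])
```

Entries whose id does not match are passed through unchanged, which is what the final "$$this" branch of $cond does in the pipeline.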
If your revisions array will only ever contain one object, either make it an object instead of an array, or keep it as an array and use the query below. It drops $concatArrays because there is no need to merge the new object into the existing revisions array when it only ever holds one object.
collection.update(
  {
    id: 1,
    "entries.id": 100
  },
  [
    {
      $set: {
        entries: {
          $map: {
            input: "$entries",
            in: {
              $cond: [
                { $eq: ["$$this.id", 100] },
                {
                  $mergeObjects: [
                    "$$this",
                    {
                      revisions: [{ id: 1000, urls: { a: "$$this.urls.a", b: "$$this.urls.b", c: "some-new-url" } }]
                    }
                  ]
                },
                "$$this"
              ]
            }
          }
        }
      }
    }
  ]
);
Test : Test your aggregation pipeline here : mongoplayground
Ref : .updateOne()
Note: If .updateOne() throws an error because of an incompatible client or shell, try this query with .update(). Running an aggregation pipeline in an update saves multiple DB calls and is most useful on arrays with a small number of elements.

Move an element from one array to another within same document MongoDB

I have data that looks like this:
{
"_id": ObjectId("4d525ab2924f0000000022ad"),
"array": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 0, other: 235 },
{ id: 3, other: 765 }
],
"zeroes": []
}
I would like to $pull an element from one array and $push it to a second array within the same document, resulting in something that looks like this:
{
"_id": ObjectId("id"),
"array": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 3, other: 765 }
],
"zeroes": [
{ id: 0, other: 235 }
]
}
I realize that I can do this by doing a find and then an update, i.e.
db.foo.findOne({"_id": param._id})
  .then((doc) => {
    db.foo.update(
      {
        "_id": param._id
      },
      {
        "$pull": {"array": {id: 0}},
        "$push": {"zeroes": doc.array[2]}
      }
    )
  })
I was wondering if there's an atomic function that I can do this with.
Something like,
db.foo.update({"_id": param._id}, {"$move": [{"array": {id: 0}}, {"zeroes": 1}]})
I found the post "Move elements from $pull to another array", which generously provided the data I used, but that question remains unsolved after 4 years. Has a solution to this been crafted in the past 4 years?
There is no $move in MongoDB. That being said, the easiest solution is a two-phase approach:
1. Query the document
2. Craft the update with a $pull and $push/$addToSet
The important part here, to make sure everything is idempotent, is to include the original array document in the query for the update.
Given a document of the following form:
{
_id: "foo",
arrayField: [
{
a: 1,
b: 1
},
{
a: 2,
b: 1
}
]
}
Let's say you want to move { a: 1, b: 1 } to a different field, maybe called someOtherArrayField; you would do something like this:
var doc = db.col.findOne({_id: "foo"});
var arrayDocToMove = doc.arrayField[0];
db.col.update(
  { _id: "foo", arrayField: { $elemMatch: arrayDocToMove } },
  { $pull: { arrayField: arrayDocToMove }, $addToSet: { someOtherArrayField: arrayDocToMove } }
)
The reason we use $elemMatch is to be sure the element we are about to remove from the array hasn't changed since we first queried the document. When coupled with a $pull it isn't strictly necessary, but I am typically overly cautious in these situations. The same holds if there is no parallelism in your application and you only have one application instance.
Now when we check the resulting document, we get:
db.col.findOne()
{
"_id" : "foo",
"arrayField" : [
{
"a" : 2,
"b" : 1
}
],
"someOtherArrayField" : [
{
"a" : 1,
"b" : 1
}
]
}
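For anyone on Python, the two-phase move described above can be sketched as a helper that builds the guarded filter and the $pull/$addToSet update from the element fetched in phase one. The names `build_move_update` and `move_element` are hypothetical, and `move_element` is only a simplified in-memory emulation that uses exact equality rather than MongoDB's matching rules.

```python
def build_move_update(doc_id, element, source_field, target_field):
    """Build the filter and update documents for moving one array element."""
    # $elemMatch guards against the element having changed since it was read.
    query = {"_id": doc_id, source_field: {"$elemMatch": element}}
    update = {
        "$pull": {source_field: element},
        "$addToSet": {target_field: element},
    }
    return query, update


def move_element(doc, element, source_field, target_field):
    """In-memory emulation: move element only if it is still in the source."""
    if element not in doc[source_field]:
        return False  # the guard failed, nothing is modified
    doc[source_field] = [item for item in doc[source_field] if item != element]
    target = doc.setdefault(target_field, [])
    if element not in target:  # $addToSet semantics: no duplicates
        target.append(element)
    return True


doc = {"_id": "foo", "arrayField": [{"a": 1, "b": 1}, {"a": 2, "b": 1}]}
element = doc["arrayField"][0]  # phase 1: read the element to move
query, update = build_move_update("foo", element, "arrayField", "someOtherArrayField")
# With PyMongo, phase 2 would be: collection.update_one(query, update)
moved = move_element(doc, element, "arrayField", "someOtherArrayField")
print(moved, doc)
```

Because the guard sits in the filter, a concurrent change to the element makes the update match nothing instead of moving stale data.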

JSON schema for an array

I have a JSON Schema and a sample input. I need to write a generic schema that can handle the array regardless of its length. Currently, I need to write a schema for each index in the array.
JSON Schema
{
"title":"Example",
"$schema":"http://json-schema.org/draft-04/schema#",
"type":"array",
"items":[
{
"oneOf":[
{
"multipleOf": 3
}
]
},
{
"oneOf":[
{
"multipleOf": 3
},
{
"multipleOf": 5
}
]
}
]
}
Sample Input
[
3,
5
]
I need a schema which can validate [1,3,5,6,3,5,4,......] (regardless of the length)
If you put a schema directly in items, instead of using an array, then it applies to all array items:
{
"type": "array",
"items": {
"oneOf": [
{"multipleOf": 3},
{"multipleOf": 5}
]
}
}
If you want to describe an initial set of items with specific schemas, and all the following ones with a generic one, then use an array with items, and a schema in additionalItems:
{
"type": "array",
"items": [
{"multipleOf": 3},
...
],
"additionalItems": {
"oneOf": [
{"multipleOf": 3},
{"multipleOf": 5}
]
}
}
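To show how the two forms differ, here is a small hand-written Python emulation of the checks these schemas describe. It is not a JSON Schema validator and handles only integers with multipleOf; the function names are made up, and the leading items are simplified to a single divisor each. Note that oneOf requires exactly one subschema to match, so a value divisible by both 3 and 5 is rejected.

```python
def matches_exactly_one(value, divisors):
    """Emulate `oneOf` over `multipleOf` subschemas: exactly one must match."""
    return sum(1 for divisor in divisors if value % divisor == 0) == 1


def validate_uniform(array, divisors):
    """`items` as a single schema: every element is checked the same way."""
    return all(matches_exactly_one(value, divisors) for value in array)


def validate_tuple(array, leading_divisors, additional_divisors):
    """`items` as a list, plus `additionalItems` for the remaining elements."""
    for index, value in enumerate(array):
        if index < len(leading_divisors):
            # Positional schema: element i is checked against items[i].
            if value % leading_divisors[index] != 0:
                return False
        elif not matches_exactly_one(value, additional_divisors):
            return False
    return True


print(validate_uniform([3, 5, 6, 10], [3, 5]))  # True
print(validate_uniform([3, 4], [3, 5]))         # False: 4 matches neither
print(validate_uniform([15], [3, 5]))           # False: 15 matches both
print(validate_tuple([3, 5, 9], [3], [3, 5]))   # True
```

The uniform form works for any array length, which is what the question asks for.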

How to find number of distinct values from a collection

Suppose I have a collection like:
{
"id": 1,
"name": "jonas",
},
{
"id": 2,
"name": "jonas",
},
{
"id":3,
"name": "smirk",
}
How do I get :
Number of distinct names, like in this case, 2
The distinct names, in this case, jonas and smirk ?
With some Backbone and Underscore magic, combining collection.pluck and _.uniq:
pluck: collection.pluck(attribute)
Pluck an attribute from each model in the collection. Equivalent to calling map, and returning a single attribute from the iterator.
uniq: _.uniq(array, [isSorted], [iterator])
Produces a duplicate-free version of the array, using === to test object equality.
[...]
var c = new Backbone.Collection([
{id: 1, name: "jonas"},
{id: 2, name: "jonas"},
{id: 3, name: "smirk"}
]);
var names = _.uniq(c.pluck('name'));
console.log(names.length);
console.log(names);
And a demo http://jsfiddle.net/nikoshr/PSFXg/
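For comparison, the same pluck-then-dedupe idea can be sketched in plain Python, outside Backbone. The `distinct_values` helper is a made-up name; it keeps the first-seen order, much like _.uniq does.

```python
records = [
    {"id": 1, "name": "jonas"},
    {"id": 2, "name": "jonas"},
    {"id": 3, "name": "smirk"},
]


def distinct_values(rows, attribute):
    """Pluck one attribute from each row, then drop duplicates in first-seen order."""
    plucked = [row[attribute] for row in rows]
    return list(dict.fromkeys(plucked))


names = distinct_values(records, "name")
print(len(names))  # 2
print(names)       # ['jonas', 'smirk']
```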
