MongoDB Find Exact Array Match but order doesn't matter - arrays

I can query for an exact array match and retrieve the document successfully, but when I query for the same array with its values in a different order, the query fails.
Example
db.coll.insert({"user":"harsh","hobbies":["1","2","3"]})
db.coll.insert({"user":"kaushik","hobbies":["1","2"]})
db.coll.find({"hobbies":["1","2"]})
2nd Document Retrieved Successfully
db.coll.find({"hobbies":["2","1"]})
Showing Nothing
Please help

The currently accepted answer does NOT ensure an exact match on your array, just that the size is identical and that the array shares at least one item with the query array.
For example, the query
db.coll.find({ "hobbies": { "$size" : 2, "$in": [ "2", "1", "5", "hamburger" ] } });
would still return the user kaushik in that case.
What you need to do for an exact match is to combine $size with $all, like so:
db.coll.find({ "hobbies": { "$size" : 2, "$all": [ "2", "1" ] } });
But be aware that this can be a very expensive operation, depending on your amount and structure of data.
Since MongoDB keeps the order of inserted arrays stable, you might fare better with ensuring arrays to be in a sorted order when inserting to the DB, so that you may rely on a static order when querying.
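A minimal sketch of the sorted-on-insert idea above, in plain JavaScript. The helper name canonicalize is hypothetical, and the sketch assumes every element is a string (the default sort compares strings, so numbers would need their own comparator).

```javascript
// Hypothetical helper: return a sorted copy so stored and queried arrays
// share one canonical order. Assumes all elements are strings.
function canonicalize(values) {
  return [...values].sort();
}

// Normalize once when writing and once when querying.
const doc = { user: "kaushik", hobbies: canonicalize(["2", "1"]) };
const query = { hobbies: canonicalize(["2", "1"]) };

console.log(JSON.stringify(doc));   // {"user":"kaushik","hobbies":["1","2"]}
console.log(JSON.stringify(query)); // {"hobbies":["1","2"]}
```

In mongosh these objects would be passed to db.coll.insertOne(doc) and db.coll.find(query), so the plain ordered array match can be used regardless of the order the caller supplied.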

To match the array field exactly, MongoDB provides the $eq operator, which can be applied to an array just as it can to a single value.
db.collection.find({ "hobbies": {$eq: [ "singing", "Music" ] }});
Note that $eq also checks the order in which you specify the elements.
If you instead use the query below:
db.coll.find({ "hobbies": { "$size" : 2, "$all": [ "2", "1" ] } });
Then an exact match is not guaranteed, because $all only checks that each listed value is present and does not count duplicates. Suppose you query:
db.coll.find({ "hobbies": { "$size" : 2, "$all": [ "2", "2" ] } });
This query returns every document whose hobbies array contains the element "2" and has size 2 (e.g. it also returns the document with hobbies: ["2","1"]).
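To make the duplicate problem concrete, here is a plain-JavaScript emulation of the two checks. It is only a sketch: the function names are hypothetical, it does not call MongoDB, and it assumes arrays of strings.

```javascript
// Emulates { $size: target.length, $all: target } for arrays of scalars:
// same length, and every target value occurs somewhere in the stored array.
function matchesSizeAndAll(stored, target) {
  const sameSize = stored.length === target.length;
  const containsAll = target.every((value) => stored.includes(value));
  return sameSize && containsAll;
}

// Order-insensitive comparison that also respects duplicates:
// sort copies of both arrays and compare them position by position.
function sameElements(stored, target) {
  if (stored.length !== target.length) return false;
  const a = [...stored].sort();
  const b = [...target].sort();
  return a.every((value, i) => value === b[i]);
}

console.log(matchesSizeAndAll(["2", "1"], ["2", "2"])); // true (false positive)
console.log(sameElements(["2", "1"], ["2", "2"]));      // false
console.log(sameElements(["2", "1"], ["1", "2"]));      // true
```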

Mongodb filter by exactly array elements without regard to order or specified order.
Source: https://savecode.net/code/javascript/mongodb+filter+by+exactly+array+elements+without+regard+to+order+or+specified+order
// Insert data
db.inventory.insertMany([
{ item: "journal", qty: 25, tags: ["blank", "red"], dim_cm: [ 14, 21 ] },
{ item: "notebook", qty: 50, tags: ["red", "blank"], dim_cm: [ 14, 21 ] },
{ item: "paper", qty: 100, tags: ["red", "blank", "plain"], dim_cm: [ 14, 21 ] },
{ item: "planner", qty: 75, tags: ["blank", "red"], dim_cm: [ 22.85, 30 ] },
{ item: "postcard", qty: 45, tags: ["blue"], dim_cm: [ 10, 15.25 ] }
]);
// Query 1: filter by exactly array elements without regard to order
db.inventory.find({ "tags": { "$size" : 2, "$all": [ "red", "blank" ] } });
// result:
[
{
_id: ObjectId("6179333c97a0f2eeb98a6e02"),
item: 'journal',
qty: 25,
tags: [ 'blank', 'red' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e03"),
item: 'notebook',
qty: 50,
tags: [ 'red', 'blank' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e05"),
item: 'planner',
qty: 75,
tags: [ 'blank', 'red' ],
dim_cm: [ 22.85, 30 ]
}
]
// Query 2: filter by exactly array elements in the specified order
db.inventory.find( { tags: ["blank", "red"] } )
// result:
[
{
_id: ObjectId("6179333c97a0f2eeb98a6e02"),
item: 'journal',
qty: 25,
tags: [ 'blank', 'red' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e05"),
item: 'planner',
qty: 75,
tags: [ 'blank', 'red' ],
dim_cm: [ 22.85, 30 ]
}
]
// Query 3: filter by an array that contains both the elements without regard to order or other elements in the array
db.inventory.find( { tags: { $all: ["red", "blank"] } } )
// result:
[
{
_id: ObjectId("6179333c97a0f2eeb98a6e02"),
item: 'journal',
qty: 25,
tags: [ 'blank', 'red' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e03"),
item: 'notebook',
qty: 50,
tags: [ 'red', 'blank' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e05"),
item: 'planner',
qty: 75,
tags: [ 'blank', 'red' ],
dim_cm: [ 22.85, 30 ]
}
]

This query finds an exact array match in either order by listing each ordering explicitly.
let query = {$or: [
{hobbies:{$eq:["1","2"]}},
{hobbies:{$eq:["2","1"]}}
]};
db.coll.find(query)
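For more than two elements, the orderings would have to be generated. The sketch below (hypothetical helper names, plain JavaScript) builds the same kind of $or query for any input and shows why this approach scales poorly: n distinct values produce n! clauses.

```javascript
// Return every ordering of the given values (n! arrays for n distinct values).
function permutations(values) {
  if (values.length <= 1) return [values.slice()];
  const result = [];
  values.forEach((value, i) => {
    const rest = [...values.slice(0, i), ...values.slice(i + 1)];
    for (const tail of permutations(rest)) {
      result.push([value, ...tail]);
    }
  });
  return result;
}

// Build { $or: [ { field: { $eq: ordering } }, ... ] } with one clause per ordering.
function buildAnyOrderQuery(field, values) {
  return {
    $or: permutations(values).map((ordering) => ({ [field]: { $eq: ordering } })),
  };
}

const query = buildAnyOrderQuery("hobbies", ["1", "2"]);
console.log(JSON.stringify(query));
// {"$or":[{"hobbies":{"$eq":["1","2"]}},{"hobbies":{"$eq":["2","1"]}}]}
console.log(buildAnyOrderQuery("hobbies", ["1", "2", "3"]).$or.length); // 6
```

The result can be passed to db.coll.find(query). With repeated values the helper emits duplicate clauses, which is harmless but wasteful.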

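With $all we can match arrays that contain all of the given elements in any order. Note that the array may also contain other elements, as the output below shows, so this is not an exact match.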
Query : {cast:{$all:["James J. Corbett","George Bickel"]}}
Output : cast : ["George Bickel","Emma Carus","George M. Cohan","James J. Corbett"]

Using aggregate, this is how I made mine efficient and faster:
db.collection.aggregate([
{$unwind: "$array"},
{
$match: {
"array.field" : "value"
}
}
])
You can then unwind it again to flatten the array and then group on it.

This question is rather old, but I was pinged because another answer shows that the accepted answer isn't sufficient for arrays containing duplicate values, so let's fix that.
Since we have a fundamental underlying limitation with what queries are capable of doing, we need to avoid these hacky, error-prone array intersections. The best way to check if two arrays contain an identical set of values without performing an explicit count of each value is to sort both of the arrays we want to compare and then compare the sorted versions of those arrays. Since MongoDB does not support an array sort to the best of my knowledge, we will need to rely on aggregation to emulate the behavior we want:
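// Note: make sure the target_hobbies array is sorted, and that its element types match the stored data (the question's documents store strings such as "1")!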
var target_hobbies = [1, 2];
db.coll.aggregate([
{ // Limits the initial pipeline size to only possible candidates.
$match: {
hobbies: {
$size: target_hobbies.length,
$all: target_hobbies
}
}
},
{ // Split the hobbies array into individual array elements.
$unwind: "$hobbies"
},
{ // Sort the elements into ascending order (do 'hobbies: -1' for descending).
$sort: {
_id: 1,
hobbies: 1
}
},
{ // Insert all of the elements back into their respective arrays.
$group: {
_id: "$_id",
__MY_ROOT: { $first: "$$ROOT" }, // Aids in preserving the other fields.
hobbies: {
$push: "$hobbies"
}
}
},
{ // Replaces the root document in the pipeline with the original stored in __MY_ROOT, with the sorted hobbies array applied on top of it.
// Not strictly necessary, but helpful to have available if desired and much easier than a bunch of 'fieldName: {$first: "$fieldName"}' entries in our $group operation.
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$__MY_ROOT",
{
hobbies: "$hobbies"
}
]
}
}
},
{ // Now that the pipeline contains documents with hobbies arrays in ascending sort order, we can simply perform an exact match using the sorted target_hobbies.
$match: {
hobbies: target_hobbies
}
}
]);
I cannot speak for the performance of this query, and it may very well cause the pipeline to become too large if there are too many initial candidate documents. If you're working with large data sets, then once again, do as the currently accepted answer states and insert array elements in sorted order. By doing so you can perform static array matches, which will be far more efficient since they can be properly indexed and will not be limited by the pipeline size limitation of the aggregation framework. But for a stopgap, this should ensure a greater level of accuracy.
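On newer servers the sort-then-compare idea can be written without $unwind and $group. The sketch below assumes MongoDB 5.2 or later, where the $sortArray aggregation operator is available; the helper name buildUnorderedExactMatch is hypothetical, and the pipeline has not been benchmarked.

```javascript
// Build a pipeline that keeps documents whose array field holds exactly the
// target values, in any order, with duplicates counted.
// Assumption: MongoDB 5.2+ ($sortArray). Both sides are sorted by the server,
// so client-side and server-side ordering rules cannot disagree.
function buildUnorderedExactMatch(field, target) {
  const sorted = (input) => ({ $sortArray: { input, sortBy: 1 } });
  return [
    // Cheap pre-filter: only arrays of the right length survive
    // (this also guarantees the field is an array before it is sorted).
    { $match: { [field]: { $size: target.length } } },
    // Compare the sorted stored array with the sorted target array.
    {
      $match: {
        $expr: { $eq: [sorted("$" + field), sorted({ $literal: target })] },
      },
    },
  ];
}

const pipeline = buildUnorderedExactMatch("hobbies", ["2", "1"]);
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh this would be run as db.coll.aggregate(pipeline). The $expr comparison is not expected to use an index on the array field, so the advice about storing arrays in sorted order still applies to large collections.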

Related

How to use $getfield to get a field from ROOT Document with condition in Aggregation Mongodb

I'm starting to learn aggregation in MongoDB. I have simple documents like the ones below, with two fields, name and examScores; examScores is an array that contains multiple documents:
{ _id: ObjectId("633199db009be219a43ae426"),
name: 'Max',
examScores:
[ { difficulty: 4, score: 57.9 },
{ difficulty: 6, score: 62.1 },
{ difficulty: 3, score: 88.5 } ] }
{ _id: ObjectId("633199db009be219a43ae427"),
name: 'Manu',
examScores:
[ { difficulty: 7, score: 52.1 },
{ difficulty: 2, score: 74.3 },
{ difficulty: 5, score: 53.1 } ] }
Now I query the maximum score of each person using $unwind and $group/$max as below:
db.test.aggregate([
{$unwind: "$examScores"},
{$group: {_id: {name: "$name"}, maxScore: {$max: "$examScores.score"}}}
])
{ _id: { name: 'Max' }, maxScore: 88.5 }
{ _id: { name: 'Manu' }, maxScore: 74.3 }
But I want the result also contains the examScores.difficulty field corresponding to name and examScores.score, like below:
{ _id: { name: 'Max' }, difficulty: 3, maxScore: 88.5 }
{ _id: { name: 'Manu' }, difficulty: 2, maxScore: 74.3 }
I know that I can use $sort + $group and $first to achieve this goal. But I want to use $getField or any other methods to get data from ROOT Doc.
My idea is use $project and $getField to get the difficulty field from ROOT doc (or $unwind version of ROOT doc) with the condition like ROOT.name = Aggregate.name and Root.examScores.score = Aggregate.maxScore.
It will look something like this:
{$project:
{name: 1,
maxScore: 1,
difficulty:
{$getField: {
field: "$examScores.difficulty"
input: "$$ROOT.$unwind() with condition/filter"}
}
}
}
I wonder if this is possible in MongoDB?
Solution 1
$unwind
$group - Group by name. You need $push to add the $$ROOT document into data array.
$project - Set the difficulty field by getting the value of examScores.difficulty from the first item of the filtered data array by matching the examScores.score with maxScore.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$group: {
_id: {
name: "$name"
},
maxScore: {
$max: "$examScores.score"
},
data: {
$push: "$$ROOT"
}
}
},
{
$project: {
_id: 0,
name: "$_id.name",
maxScore: 1,
difficulty: {
$getField: {
field: "difficulty",
input: {
$getField: {
field: "examScores",
input: {
$first: {
$filter: {
input: "$data",
cond: {
$eq: [
"$$this.examScores.score",
"$maxScore"
]
}
}
}
}
}
}
}
}
}
}
])
Demo Solution 1 # Mongo Playground
Solution 2: $rank
$unwind
$rank - Ranking by partition name and sort examScores.score descending.
$match - Filter the document with { rank: 1 }.
$unset - Remove rank field.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$setWindowFields: {
partitionBy: "$name",
sortBy: {
"examScores.score": -1
},
output: {
rank: {
$rank: {}
}
}
}
},
{
$match: {
rank: 1
}
},
{
$unset: "rank"
}
])
Demo Solution 2 # Mongo Playground
Opinion: I would say this approach:
$sort by examScores.score descending
$group by name, take the first document
would be much easier.
There's no need to $unwind and then rebuild the documents again via $group to achieve your desired results. I'd recommend avoiding that altogether.
Instead, consider processing the arrays inline using array expression operators. Depending on the version and exact results you are looking for, here are two starting points that may be worth considering. In particular the $maxN operator and the $sortArray operator may be of interest for this particular question.
You can get a sense for what these two operators do by running an $addFields aggregation to see their output, playground here.
With those as a starting point, it's really up to you to make the pipeline output the desired result. Here is one such example that matches the output you described in the question pretty well (playground):
db.collection.aggregate([
{
"$addFields": {
"relevantEntry": {
$first: {
$sortArray: {
input: "$examScores",
sortBy: {
"score": -1
}
}
}
}
},
},
{
"$project": {
_id: 0,
name: 1,
difficulty: "$relevantEntry.difficulty",
maxScore: "$relevantEntry.score"
}
}
])
Which yields:
[
{
"difficulty": 3,
"maxScore": 88.5,
"name": "Max"
},
{
"difficulty": 2,
"maxScore": 74.3,
"name": "Manu"
}
]
Also worth noting that this particular approach doesn't do anything special if there are duplicates. You could look into using $filter if something more was needed in that regard.

Mongo DB find value in array of multiple nested arrays

I need to check if an ObjectId exists in a non nested array and in multiple nested arrays, I've managed to get very close using the aggregation framework, but got stuck in the very last step.
My documents have this structure:
{
"_id" : ObjectId("605ce5f063b1c2eb384c2b7f"),
"name" : "Test",
"attrs" : [
ObjectId("6058e94c3994d04d28639616"),
ObjectId("6058e94c3994d04d28639627"),
ObjectId("6058e94c3994d04d28639622"),
ObjectId("6058e94c3994d04d2863962e")
],
"variations" : [
{
"varName" : "Var1",
"attrs" : [
ObjectId("6058e94c3994d04d28639616"),
ObjectId("6058e94c3994d04d28639627"),
ObjectId("6058e94c3994d04d28639622"),
ObjectId("60591791d4d41d0a6817d23f")
],
},
{
"varName" : "Var2",
"attrs" : [
ObjectId("60591791d4d41d0a6817d22a"),
ObjectId("60591791d4d41d0a6817d255"),
ObjectId("6058e94c3994d04d28639622"),
ObjectId("60591791d4d41d0a6817d23f")
],
},
],
"storeId" : "9acdq9zgke49pw85"
}
Let's say I need to check whether this _id, "6058e94c3994d04d28639616", exists in all arrays named attrs.
My aggregation query goes like this:
db.product.aggregate([
{
$match: {
storeId,
},
},
{
$project: {
_id: 0,
attrs: 1,
'variations.attrs': 1,
},
},
{
$project: {
attrs: 1,
vars: '$variations.attrs',
},
},
{
$unwind: '$vars',
},
{
$project: {
attr: {
$concatArrays: ['$vars', '$attrs'],
},
},
},
]);
which results in this:
[
{
attr: [
6058e94c3994d04d28639616,
6058e94c3994d04d28639627,
6058e94c3994d04d28639622,
6058e94c3994d04d2863962e,
6058e94c3994d04d28639616,
6058e94c3994d04d28639627,
6058e94c3994d04d28639622,
60591791d4d41d0a6817d23f,
60591791d4d41d0a6817d22a,
60591791d4d41d0a6817d255,
6058e94c3994d04d28639622,
60591791d4d41d0a6817d23f
]
},
{
attr: [
60591791d4d41d0a6817d22a,
60591791d4d41d0a6817d255,
6058e94c3994d04d28639622,
60591791d4d41d0a6817d23f,
6058e94c3994d04d28639624,
6058e94c3994d04d28639627,
6058e94c3994d04d28639628,
6058e94c3994d04d2863963e
]
}
]
Assuming I have two products in my DB, I get this result. Each element in the outermost array is a different product.
The last bit, which is checking for this key, "6058e94c3994d04d28639616", I could not find a way to do with $group, since I don't have keys to group on.
Or with $match, adding this to the end of the aggregation:
{
$match: {
attr: "6058e94c3994d04d28639616",
},
},
But that results in an empty array. I know that $match does not query arrays like this, but could not find a way to do it with $in as well.
Is this too complicated of a Schema? I cannot have the original data embedded, since it is mutable and I would not be happy to change all products if something changed.
Will this be very expensive if I had like 10000 products?
Thanks in advance
You are comparing the string "6058e94c3994d04d28639616" with ObjectId values. Convert the string to an ObjectId with the $toObjectId operator when performing the $match, like this:
{
$match: {
$expr: {
$in: [{ $toObjectId: "6058e94c3994d04d28639616" }, "$attr"]
}
}
}

$lookup Array of Objects in parent Array and append the results to each item of said Array

Considering the following document "Backpack", each slots is a piece of said backpack, and each slot has a contents describing various items and a count of them.
{
_id: "backpack",
slots: [
{
slot: "left-pocket",
contents: [
{
item: "pen",
count: 3
},
{
item: "pencil",
count: 2
},
]
},
{
slot: "right-pocket",
contents: [
{
item: "bottle",
count: 1
},
{
item: "eraser",
count: 1
},
]
}
]
}
The item field is the _id of an item of another collection, e.g.:
{
_id: "pen",
color: "red"
(...)
},
Same for pen, pencil, bottle, eraser, etc.
I want to make a $lookup so I can fill in the item's data, but I'm not finding a way of having the lookup's as be the same place as the item. That is:
db.collection.aggregate([
{
$lookup: {
from: 'items',
localField: 'slots.contents.item',
foreignField: '_id',
as: 'convertedItems', // <=== ISSUE
},
},
])
Problem is that as being named convertedItems means the document gets an array of items in the root of the document called 'convertedItems', like this:
{
_id: "backpack",
slots: [ (...) ],
convertedItems: [ (...) ]
}
How can I tell $lookup to actually use the localField as the place to append the data?
That is, make document become:
{
_id: "backpack",
slots: [
{
slot: "left-pocket",
contents: [
{
item: "pen", // <== NOTE
count: 3, // <== NOTE
_id: "pen",
color: "red"
(...)
},
{
item: "pencil", // <== NOTE
count: 2, // <== NOTE
_id: "pencil",
color: "blue"
(...)
},
]
},
(...)
Note: At this point, if have entire data of item, doesn't matter if item property is kept, but count must remain.
I can't manually do $addFields with arrayElemAt because the number of items in slots is not fixed.
Extra Info: I'm using MongoAtlas Free so assume MongoDB 4.2+ (no need to unwind arrays for $lookup).
PS: I have now thought of just leaving the result as a root field (e.g. "convertedItems") and, in the code that receives the API response, doing an Array.find on "convertedItems" by _id for each item while looping through the items. I'll keep the question open as I'm curious how to do this on the MongoDB side.
When you use $lookup, there is a single query in the related collection for each document in the source pipeline, not a query per value in the source document.
If you want each item looked up separately, you'll need to unwind the arrays so each document in the pipeline contains a single item, do the lookup, and then group to rebuild the arrays.
db.collection.aggregate([
{$unwind: "$slots"},
{$unwind: "$slots.contents"},
{$lookup: {
from: "items",
localField: "slots.contents.item",
foreignField: "_id",
as: "convertedItems"
}},
{$group: {
_id: "$slots.slot",
root: {$first: "$$ROOT"},
items: {
$push: {
$mergeObjects: [
"$slots.contents",
{$arrayElemAt: ["$convertedItems", 0]}
]
}},
}},
{$addFields: {"root.slots.contents": "$items"}},
{$replaceRoot: {newRoot: "$root"}},
{$group: {
_id: "$_id",
root: {$first: "$$ROOT"},
slots: {$push: "$slots"}
}},
{$addFields: {"root.slots": "$slots"}},
{$replaceRoot: {newRoot: "$root"}},
{$project: { convertedItems: 0}}
])
Playground
$unwind makes the number of documents in the pipeline explode. Also, you can't make 'as' write the results in place, so you need additional stages such as $addFields or $filter to get the required output.
As I've commented, matching the main document's elements with the $lookup result takes some work. It may be easier to do in application code, but if it has to be done in the query, the query below works on the same number of documents as the collection holds, quite unlike $unwind, which multiplies the documents when you have nested arrays like yours. Since this is fairly complex in general, try to use $match as the first stage to filter documents, if possible, for better performance. Additionally, you can use explain() to learn about your query's performance.
Query :
db.Backpack.aggregate([
/** lookup on items collection & get matched docs to items array */
{
$lookup: {
from: "items",
localField: "slots.contents.item",
foreignField: "_id",
as: "items"
}
},
/** Iterate on slots & contents & internally filter on items array to get matched doc for a content object &
* merge the objects back to respective objects to form the same structure */
{
$project: {
slots: {
$map: {
input: "$slots",
in: {
$mergeObjects: [
"$$this",
{
contents: {
$map: {
input: "$$this.contents",
as: "c",
in: {
$mergeObjects: [
"$$c",
{
$let: {
vars: {
matchedItem: {
$arrayElemAt: [
{
$filter: {
input: "$items",
as: "i",
cond: {
$eq: [
"$$c.item",
"$$i._id"
]
}
}
},
0
]
}
},
in: {
color: "$$matchedItem.color"
}
}
}
]
}
}
}
}
]
}
}
}
}
}
])
Test : MongoDB-Playground
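For comparison, the application-side merge mentioned in the question's PS could look roughly like the sketch below. It is plain JavaScript with a hypothetical function name, and it assumes the $lookup result is left in a root-level convertedItems array and that item _id values are strings (ObjectId keys would first need converting to strings for the Map lookup).

```javascript
// Merge each looked-up item into the matching contents entry, then drop the
// root-level convertedItems array. Returns a new object; the input is untouched.
function mergeConvertedItems(doc) {
  const { convertedItems, ...rest } = doc;
  // Index the looked-up items by _id so each content entry is resolved in O(1).
  const byId = new Map(convertedItems.map((item) => [item._id, item]));
  const slots = doc.slots.map((slot) => ({
    ...slot,
    contents: slot.contents.map((content) => ({
      ...content,
      ...(byId.get(content.item) || {}), // leave the entry as-is if nothing matched
    })),
  }));
  return { ...rest, slots };
}

const sample = {
  _id: "backpack",
  slots: [{ slot: "left-pocket", contents: [{ item: "pen", count: 3 }] }],
  convertedItems: [{ _id: "pen", color: "red" }],
};
console.log(JSON.stringify(mergeConvertedItems(sample)));
// {"_id":"backpack","slots":[{"slot":"left-pocket","contents":[{"item":"pen","count":3,"_id":"pen","color":"red"}]}]}
```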

Move an element from one array to another within same document MongoDB

I have data that looks like this:
{
"_id": ObjectId("4d525ab2924f0000000022ad"),
"array": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 0, other: 235 },
{ id: 3, other: 765 }
],
"zeroes": []
}
I would like to $pull an element from one array and $push it to a second array within the same document, to end up with something that looks like this:
{
"_id": ObjectId("id"),
"array": [
{ id: 1, other: 23 },
{ id: 2, other: 21 },
{ id: 3, other: 765 }
],
"zeroes": [
{ id: 0, other: 235 }
]
}
I realize that I can do this by doing a find and then an update, i.e.
db.foo.findOne({"_id": param._id})
.then((doc)=>{
db.foo.update(
{
"_id": param._id
},
{
"$pull": {"array": {id: 0}},
"$push": {"zeroes": doc.array[2]}
}
)
})
I was wondering if there's an atomic function that I can do this with.
Something like,
db.foo.update({"_id": param._id}, {"$move": [{"array": {id: 0}}, {"zeroes": 1}]})
Found this post that generously provided the data I used, but the question remains unsolved after 4 years. Has a solution to this been crafted in the past 4 years?
Move elements from $pull to another array
There is no $move in MongoDB. That being said, the easiest solution is a 2 phase approach:
Query the document
Craft the update with a $pull and $push/$addToSet
The important part here, to make sure everything is idempotent, is to include the original array document in the query for the update.
Given a document of the following form:
{
_id: "foo",
arrayField: [
{
a: 1,
b: 1
},
{
a: 2,
b: 1
}
]
}
Let's say you want to move { a: 1, b: 1 } to a different field, maybe called someOtherArrayField. You would want to do something like this:
var doc = db.col.findOne({_id: "foo"});
var arrayDocToMove = doc.arrayField[0];
db.col.update({_id: "foo", arrayField: { $elemMatch: arrayDocToMove} }, { $pull: { arrayField: arrayDocToMove }, $addToSet: { someOtherArrayField: arrayDocToMove } })
The reason we use the $elemMatch is to be sure the field we are about to remove from the array hasn't changed since we first queried the document. When coupled with a $pull it also isn't strictly necessary, but I am typically overly cautious in these situations. If there is no parallelism in your application, and you only have one application instance, it isn't strictly necessary.
Now when we check the resulting document, we get:
db.col.findOne()
{
"_id" : "foo",
"arrayField" : [
{
"a" : 2,
"b" : 1
}
],
"someOtherArrayField" : [
{
"a" : 1,
"b" : 1
}
]
}
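As for the single atomic operation the question asks about: on MongoDB 4.2 or later, update commands also accept an aggregation pipeline, so the move can be expressed as one update on one document. The following is only a sketch under that assumption. The helper name is hypothetical, it uses the question's document shape (elements identified by an id sub-field), and it has not been run against a live server.

```javascript
// Build an update pipeline that moves every element whose `id` equals the
// given value from one array field to another within the same document.
// Assumption: MongoDB 4.2+ (pipeline-style updates with $set stages).
function buildMovePipeline(fromField, toField, id) {
  const from = "$" + fromField;
  return [
    // Stage 1: append the matching elements to the destination array
    // (treating a missing destination as an empty array).
    {
      $set: {
        [toField]: {
          $concatArrays: [
            { $ifNull: ["$" + toField, []] },
            { $filter: { input: from, cond: { $eq: ["$$this.id", id] } } },
          ],
        },
      },
    },
    // Stage 2: keep only the non-matching elements in the source array.
    {
      $set: {
        [fromField]: { $filter: { input: from, cond: { $ne: ["$$this.id", id] } } },
      },
    },
  ];
}

const update = buildMovePipeline("array", "zeroes", 0);
console.log(JSON.stringify(update, null, 2));
```

In mongosh this would be applied as db.foo.updateOne({ _id: param._id }, update). Because the whole pipeline runs as a single-document update, no other writer can observe the element missing from both arrays.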

MongoDB - Select multiple sub-documents from array using $elemMatch

I have a collection like the following:-
{
_id: 5,
"org_name": "abc",
"items": [
{
"item_id": "10",
"data": [
// Values goes here
]
},
{
"item_id": "11",
"data": [
// Values goes here
]
}
]
},
// Another sub document
{
_id: 6,
"org_name": "sony",
"items": [
{
"item_id": "10",
"data": [
// Values goes here
]
},
{
"item_id": "11",
"data": [
// Values goes here
]
}
]
}
Each sub document corresponds to individual organizations and each organization has an array of items in them.
What I need is to select individual elements from the items array by providing their item_id values.
I already tried this:-
db.organizations.find({"_id": 5}, {items: {$elemMatch: {"item_id": {$in: ["10", "11"]}}}})
But it returns either the item with item_id "10" OR the item with item_id "11".
What I need is to get the values for both item_id 10 and 11 for the organization "abc". Please help.
update2:
db.organizations.aggregate([
// you can remove this to return all your data
{$match:{_id:5}},
// unwind array of items
{$unwind:"$items"},
// filter out all items not in 10, 11
{$match:{"items.item_id":{"$in":["10", "11"]}}},
// aggregate again into array
{$group:{_id:"$_id", "items":{$push:"$items"}}}
])
update:
db.organizations.find({
"_id": 5,
items: {$elemMatch: {"item_id": {$in: ["10", "11"]}}}
})
old:
Looks like you need the aggregation framework, particularly the $unwind operator:
db.organizations.aggregate([
{$match:{_id:5}}, // you can remove this to return all your data
{$unwind:"$items"}
])
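The $elemMatch projection in the question returns only the first matching array element, which is why just one item comes back. As an alternative to the $unwind and $group round trip, the array can be filtered in place with the $filter aggregation operator. This is a sketch with a hypothetical helper name, reusing the collection and field names from the question.

```javascript
// Build a pipeline that returns one organization with only the requested
// items left in its items array, without unwinding and regrouping.
function buildItemsFilterPipeline(orgId, itemIds) {
  return [
    { $match: { _id: orgId } },
    {
      $project: {
        org_name: 1,
        items: {
          $filter: {
            input: "$items",
            as: "item",
            // Keep an element when its item_id is one of the requested ids.
            cond: { $in: ["$$item.item_id", itemIds] },
          },
        },
      },
    },
  ];
}

const pipeline = buildItemsFilterPipeline(5, ["10", "11"]);
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh this would be run as db.organizations.aggregate(pipeline).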
