I'm very new to MongoDB and I need help figuring out how to perform aggregation on a key in MongoDB and use that result to return matches.
For example, if I have a collection called Fruits with the following documents:
{
"id": 1,
"name": "apple",
"type": [
"Granny smith",
"Fuji"
]
}, {
"id": 2,
"name": "grape",
"type": [
"green",
"black"
]
}, {
"id": 3,
"name": "orange",
"type": [
"navel"
]
}
How do I write a query that will return the names of the fruits with 2 types, i.e. apple and grape?
Thanks!
Demo - https://mongoplayground.net/p/ke3VJIErhvb
Use $size to get the documents whose type array has exactly 2 elements.
https://docs.mongodb.com/manual/reference/method/db.collection.find/#mongodb-method-db.collection.find
The $size operator matches any array with the number of elements specified by the argument. For example:
db.collection.find({
type: { "$size": 2 } // match document with type having size 2
},
{ name: 1 } // projection to get name and _id only
)
To get the length of the array, you should use the $size operator in a $project pipeline stage.
So the $project stage of the pipeline should look like this:
{
"$project": {
"name": "$name",
type: {
"$size": "$type"
}
}
}
Here is a working example of the same ⇒ https://mongoplayground.net/p/BmS9BGhqsFg
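For completeness, the two ideas above can be combined in one aggregation so that only the matching names come back. This is a minimal sketch, not taken from either answer: the names `fruits`, `twoTypesPipeline`, and `namesWithTwoTypes` are made up for illustration, and the in-memory filter only mirrors what the `$match` stage is expected to do on the sample data from the question.

```javascript
// Sample documents from the question, held in memory for illustration.
const fruits = [
  { id: 1, name: "apple", type: ["Granny smith", "Fuji"] },
  { id: 2, name: "grape", type: ["green", "black"] },
  { id: 3, name: "orange", type: ["navel"] },
];

// Pipeline you could pass to db.Fruits.aggregate(twoTypesPipeline).
// $expr lets $match use the aggregation form of $size.
const twoTypesPipeline = [
  { $match: { $expr: { $eq: [{ $size: "$type" }, 2] } } },
  { $project: { _id: 0, name: 1 } },
];

// Plain-JavaScript mirror of the pipeline, to check the logic without a server.
const namesWithTwoTypes = fruits
  .filter((doc) => Array.isArray(doc.type) && doc.type.length === 2)
  .map((doc) => doc.name);

console.log(JSON.stringify(twoTypesPipeline));
console.log(namesWithTwoTypes); // [ 'apple', 'grape' ]
```

One thing to keep in mind: the aggregation form of $size raises an error when the field is missing or is not an array, so if some documents lack `type`, the simpler `find` with `{ type: { $size: 2 } }` shown earlier is probably the safer choice.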
Let's say I have three documents structured like so:
{
"_id": 1,
"conditions": [
["Apple", "Orange"],
["Lemon"],
["Strawberry"]
]
},
{
"_id": 2,
"conditions": [
["Apple"],
["Strawberry"]
]
},
{
"_id": 3,
"conditions": [
["Apple", "Lime"]
]
}
And I have an array, I'll call it ARC for this example :
ARC = [
"Apple",
"Lime",
"Banana",
"Avocado",
"Cherry"
]
I would like to return all documents for which none of the conditions subarrays is fully contained in the ARC array.
For example, with the data above, the first document should be returned because :
The Apple AND Orange combination is not in the ARC array
Lemon is not in the ARC array
Strawberry is not in the ARC array
The second document shouldn't be returned because :
Apple is in the ARC array
And the third document shouldn't be returned because :
The Apple AND Lime combination is in the ARC array
I've tried
db.example.find({"conditions": {$not: {$elemMatch: {$all: [ARC]}}}})
But it seems way too simple. So, as expected, it doesn't work.
I know MongoDB is pretty powerful with all the aggregation and stuff, but I'm a bit lost.
Do you know if it's possible with a query alone and, if so, what I should look for?
The query below should solve your problem.
var ARC = [
"Apple",
"Lime",
"Banana",
"Avocado",
"Cherry"
];
db.test.find(
{ $expr: {
$eq: [
{ $filter: { input: "$conditions", as: "c", cond: { $setIsSubset: [ "$$c", ARC] } } },
[ ]
]
}
}
)
It's made up of several parts, so I'll try to break it down a bit. The first part is $expr within a find (it can also be used within a $match in an aggregation), which allows us to use aggregation expressions within the query. That in turn lets us use $filter.
The $filter expression filters the subarrays in the conditions field down to those that are a subset of the ARC array passed in.
We can actually take that filter and execute it on its own using an aggregation query:
db.test.aggregate([
{ $project: {
"example" : { $filter: { input: "$conditions", as: "c", cond: { $setIsSubset: [ "$$c", ARC] } } }
} }])
{ "_id" : 1, "example" : [ ] }
{ "_id" : 2, "example" : [ [ "Apple" ] ] }
{ "_id" : 3, "example" : [ [ "Apple", "Lime" ] ] }
The last part of the query is the $eq which is taking the value that is created with the filter and then matching it against an empty array [ ].
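To make the three steps concrete, here is a plain-JavaScript mirror of the same logic run against the documents from the question. It is only a sketch for reasoning about the query, not how MongoDB evaluates it internally; the helper names `isSubset`, `satisfied`, and `matchingIds` are invented for this illustration.

```javascript
const ARC = ["Apple", "Lime", "Banana", "Avocado", "Cherry"];

const docs = [
  { _id: 1, conditions: [["Apple", "Orange"], ["Lemon"], ["Strawberry"]] },
  { _id: 2, conditions: [["Apple"], ["Strawberry"]] },
  { _id: 3, conditions: [["Apple", "Lime"]] },
];

// Mirrors $setIsSubset: true when every element of `sub` appears in `sup`.
const isSubset = (sub, sup) => sub.every((x) => sup.includes(x));

// Mirrors $filter: keep only the subarrays that are fully contained in ARC.
const satisfied = (doc) => doc.conditions.filter((c) => isSubset(c, ARC));

// Mirrors $eq against [ ]: keep documents where nothing survived the filter.
const matchingIds = docs
  .filter((doc) => satisfied(doc).length === 0)
  .map((doc) => doc._id);

console.log(matchingIds); // [ 1 ]
```

Only document 1 survives, which agrees with the `example` arrays printed by the aggregation above: it is the only one whose filtered array is empty.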
This is an aggregation approach. You should use $setIsSubset.
Below should be helpful:
db.collection.aggregate([
{
$match: {
$expr: {
$eq: [
true,
{
$allElementsTrue: {
$map: {
input: "$conditions",
as: "c",
in: {
$not: {
$setIsSubset: [
"$$c",
[
"Apple",
"Lime",
"Banana",
"Avocado",
"Cherry"
]
]
}
}
}
}
}
]
}
}
}
])
I'm using Nodejs with Mongoose package.
Given I've something like this:-
let people = [
{
"_id": 1,
"name": "Person 1",
"pets": [
{
"_id": 1,
"name": "Tom",
"category": "cat"
},
{
"_id": 2,
"name": "Jerry",
"category": "mouse"
}
]
}
]
I want to get only the data of Jerry in the pets array using its _id (result shown below):
{
"_id": 2,
"name": "Jerry",
"category": "mouse"
}
Can I get it without needing to specify the _id of person 1 when using $elemMatch? Right now I code like this:-
const pet = People.find(
{ "_id": "1"}, // specifying 'person 1 _id' first
{ pets: { $elemMatch: { _id: 2 } } } // using 'elemMatch' to get 'pet' with '_id' of '2'
)
And it gave me what I want, as shown above. But is there any other way I can do this without needing to specify the _id of its parent first (in this case, the _id of the entry in the people array)?
Assuming the nested array's _id values are unique, you can filter by the nested array elements directly:
const pet = People.find(
{ "pets._id": 2 },
{ pets: { $elemMatch: { _id: 2 } } }
)
Here is my data structure.
I want to get the documents where there is an entry in moderators that is not in members:
{
"_id" : "10",
"members" : [
"10",
"20",
"30"
],
"moderators" : [
"50",
"60",
"70"
]
}
You can use $setDifference to compute the relative complement, i.e. the entries in the moderators array that are not in the members array, followed by a $match that keeps only the documents where foundInModerator is non-empty.
db.collection.aggregate(
[
{ $project: { members: 1, moderators: 1, foundInModerator: { $setDifference: [ "$moderators", "$members" ] }, _id: 0 } },
{ $match:{foundInModerator:{$ne:[] } } }
]
)
For returning the result where the value is in "one array" but "not in the other" simply use the $ne operation:
db.collection.find({ "moderators": "50", "members": { "$ne": "50" } })
So the match condition only returns positive where "50" is present in the "moderators" array but not in the "members" array.
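The two answers above address slightly different needs: the first finds any moderator missing from members, while the second checks one specific value. The difference can be sketched in plain JavaScript against the sample document. The helper `setDifference` below is a hand-written stand-in for the $setDifference operator (which treats its inputs as sets and does not guarantee output order), not MongoDB's own code.

```javascript
const doc = {
  _id: "10",
  members: ["10", "20", "30"],
  moderators: ["50", "60", "70"],
};

// Stand-in for $setDifference: distinct elements of `a` that do not appear in `b`.
const setDifference = (a, b) => [...new Set(a)].filter((x) => !b.includes(x));

// First answer: which moderators are not members at all?
const foundInModerator = setDifference(doc.moderators, doc.members);
console.log(foundInModerator); // [ '50', '60', '70' ]

// Second answer: mirrors { moderators: "50", members: { $ne: "50" } }.
const matchesFifty =
  doc.moderators.includes("50") && !doc.members.includes("50");
console.log(matchesFifty); // true
```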
I am querying for an exact array match and it works, but when I query for the same array with its values in a different order, it fails.
Example
db.coll.insert({"user":"harsh","hobbies":["1","2","3"]})
db.coll.insert({"user":"kaushik","hobbies":["1","2"]})
db.coll.find({"hobbies":["1","2"]})
2nd Document Retrieved Successfully
db.coll.find({"hobbies":["2","1"]})
Showing Nothing
Please help
The currently accepted answer does NOT ensure an exact match on your array, just that the size is identical and that the array shares at least one item with the query array.
For example, the query
db.coll.find({ "hobbies": { "$size" : 2, "$in": [ "2", "1", "5", "hamburger" ] } });
would still return the user kaushik in that case.
What you need to do for an exact match is to combine $size with $all, like so:
db.coll.find({ "hobbies": { "$size" : 2, "$all": [ "2", "1" ] } });
But be aware that this can be a very expensive operation, depending on your amount and structure of data.
Since MongoDB keeps the order of inserted arrays stable, you might fare better with ensuring arrays to be in a sorted order when inserting to the DB, so that you may rely on a static order when querying.
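The "sort on write" suggestion could look roughly like the sketch below. The helper name `canonical` is hypothetical, and the commented-out shell calls only indicate where the values would be used. It assumes string elements, as in the question; arrays of numbers would need an explicit comparator such as `(a, b) => a - b`.

```javascript
// Hypothetical helper: return a sorted copy so stored arrays share one canonical order.
const canonical = (values) => [...values].sort();

// Normalize before writing.
const toInsert = { user: "kaushik", hobbies: canonical(["2", "1"]) };
// e.g. db.coll.insertOne(toInsert)

// Normalize the search values the same way, so a plain exact-array match works.
const query = { hobbies: canonical(["2", "1"]) };
// e.g. db.coll.find(query)

console.log(toInsert.hobbies); // [ '1', '2' ]
console.log(JSON.stringify(query)); // {"hobbies":["1","2"]}
```

The trade-off is that every writer has to apply the same normalization; one code path that stores an unsorted array quietly breaks the exact match.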
To match the array field exactly, MongoDB provides the $eq operator, which can be applied to an array just as it is to a scalar value.
db.collection.find({ "hobbies": {$eq: [ "singing", "Music" ] }});
Note that $eq also compares the order in which you specify the elements.
If you use the query below:
db.coll.find({ "hobbies": { "$size" : 2, "$all": [ "2", "1" ] } });
Then an exact match is not guaranteed. Suppose you query:
db.coll.find({ "hobbies": { "$size" : 2, "$all": [ "2", "2" ] } });
This query will return every document whose hobbies array has size 2 and contains the element "2" (e.g. it will also return the document with hobbies: ["2", "1"]).
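The duplicate-value pitfall is easy to reproduce outside the database. The sketch below mirrors the $size plus $all condition in plain JavaScript and contrasts it with a comparison of sorted copies; both helper names (`sizeAndAll`, `sameMultiset`) are made up for this illustration, and the string-based comparison assumes string elements like those in the question.

```javascript
// Mirrors { $size: n, $all: wanted } for a single stored array.
const sizeAndAll = (arr, wanted) =>
  arr.length === wanted.length && wanted.every((w) => arr.includes(w));

// Order-insensitive comparison that still treats duplicates as significant.
const sameMultiset = (a, b) =>
  a.length === b.length &&
  JSON.stringify([...a].sort()) === JSON.stringify([...b].sort());

const stored = ["2", "1"];

console.log(sizeAndAll(stored, ["2", "2"]));   // true  (a false positive)
console.log(sameMultiset(stored, ["2", "2"])); // false
console.log(sameMultiset(stored, ["1", "2"])); // true
```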
MongoDB: filter by exact array elements, either without regard to order or in a specified order.
Source: https://savecode.net/code/javascript/mongodb+filter+by+exactly+array+elements+without+regard+to+order+or+specified+order
// Insert data
db.inventory.insertMany([
{ item: "journal", qty: 25, tags: ["blank", "red"], dim_cm: [ 14, 21 ] },
{ item: "notebook", qty: 50, tags: ["red", "blank"], dim_cm: [ 14, 21 ] },
{ item: "paper", qty: 100, tags: ["red", "blank", "plain"], dim_cm: [ 14, 21 ] },
{ item: "planner", qty: 75, tags: ["blank", "red"], dim_cm: [ 22.85, 30 ] },
{ item: "postcard", qty: 45, tags: ["blue"], dim_cm: [ 10, 15.25 ] }
]);
// Query 1: filter by exactly array elements without regard to order
db.inventory.find({ "tags": { "$size" : 2, "$all": [ "red", "blank" ] } });
// result:
[
{
_id: ObjectId("6179333c97a0f2eeb98a6e02"),
item: 'journal',
qty: 25,
tags: [ 'blank', 'red' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e03"),
item: 'notebook',
qty: 50,
tags: [ 'red', 'blank' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e05"),
item: 'planner',
qty: 75,
tags: [ 'blank', 'red' ],
dim_cm: [ 22.85, 30 ]
}
]
// Query 2: filter by exactly array elements in the specified order
db.inventory.find( { tags: ["blank", "red"] } )
// result:
[
{
_id: ObjectId("6179333c97a0f2eeb98a6e02"),
item: 'journal',
qty: 25,
tags: [ 'blank', 'red' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e05"),
item: 'planner',
qty: 75,
tags: [ 'blank', 'red' ],
dim_cm: [ 22.85, 30 ]
}
]
// Query 3: filter by an array that contains both the elements without regard to order or other elements in the array
db.inventory.find( { tags: { $all: ["red", "blank"] } } )
// result:
[
{
_id: ObjectId("6179333c97a0f2eeb98a6e02"),
item: 'journal',
qty: 25,
tags: [ 'blank', 'red' ],
dim_cm: [ 14, 21 ]
},
{
_id: ObjectId("6179333c97a0f2eeb98a6e03"),
item: 'notebook',
qty: 50,
tags: [ 'red', 'blank' ],
dim_cm: [ 14, 21 ]
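},
{
_id: ObjectId("6179333c97a0f2eeb98a6e04"),
item: 'paper',
qty: 100,
tags: [ 'red', 'blank', 'plain' ],
dim_cm: [ 14, 21 ]
},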
{
_id: ObjectId("6179333c97a0f2eeb98a6e05"),
item: 'planner',
qty: 75,
tags: [ 'blank', 'red' ],
dim_cm: [ 22.85, 30 ]
}
]
This query will find the exact two-element array in either order, by listing each permutation explicitly.
let query = {$or: [
{hobbies:{$eq:["1","2"]}},
{hobbies:{$eq:["2","1"]}}
]};
db.coll.find(query)
We can achieve this with $all.
Query : {cast:{$all:["James J. Corbett","George Bickel"]}}
Output : cast : ["George Bickel","Emma Carus","George M. Cohan","James J. Corbett"]
Using aggregate, this is how I got mine working efficiently and faster:
db.collection.aggregate([
{ $unwind: "$array" },
{
$match: {
"array.field" : "value"
}
}
])
You can then unwind it again for making it flat array and then do grouping on it.
This question is rather old, but I was pinged because another answer shows that the accepted answer isn't sufficient for arrays containing duplicate values, so let's fix that.
Since we have a fundamental underlying limitation with what queries are capable of doing, we need to avoid these hacky, error-prone array intersections. The best way to check if two arrays contain an identical set of values without performing an explicit count of each value is to sort both of the arrays we want to compare and then compare the sorted versions of those arrays. Since MongoDB does not support an array sort to the best of my knowledge, we will need to rely on aggregation to emulate the behavior we want:
// Note: make sure the target_hobbies array is sorted, and that its element types match the stored data (strings in the question's example)!
var target_hobbies = ["1", "2"];
db.coll.aggregate([
{ // Limits the initial pipeline size to only possible candidates.
$match: {
hobbies: {
$size: target_hobbies.length,
$all: target_hobbies
}
}
},
{ // Split the hobbies array into individual array elements.
$unwind: "$hobbies"
},
{ // Sort the elements into ascending order (do 'hobbies: -1' for descending).
$sort: {
_id: 1,
hobbies: 1
}
},
{ // Insert all of the elements back into their respective arrays.
$group: {
_id: "$_id",
__MY_ROOT: { $first: "$$ROOT" }, // Aids in preserving the other fields.
hobbies: {
$push: "$hobbies"
}
}
},
{ // Replaces the root document in the pipeline with the original stored in __MY_ROOT, with the sorted hobbies array applied on top of it.
// Not strictly necessary, but helpful to have available if desired and much easier than a bunch of 'fieldName: {$first: "$fieldName"}' entries in our $group operation.
$replaceRoot: {
newRoot: {
$mergeObjects: [
"$__MY_ROOT",
{
hobbies: "$hobbies"
}
]
}
}
},
{ // Now that the pipeline contains documents with hobbies arrays in ascending sort order, we can simply perform an exact match using the sorted target_hobbies.
$match: {
hobbies: target_hobbies
}
}
]);
I cannot speak for the performance of this query, and it may very well cause the pipeline to become too large if there are too many initial candidate documents. If you're working with large data sets, then once again, do as the currently accepted answer states and insert array elements in sorted order. By doing so you can perform static array matches, which will be far more efficient since they can be properly indexed and will not be limited by the pipeline size limitation of the aggregation framework. But for a stopgap, this should ensure a greater level of accuracy.
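On newer servers the unwind/sort/group round trip may be avoidable. MongoDB 5.2 added a $sortArray aggregation operator, so, assuming your deployment is at least that version, a shorter pipeline along the following lines might work. This is an untested sketch rather than a drop-in replacement: the function name `buildSortedMatchPipeline` is hypothetical, it assumes plain string elements that do not start with "$", and it assumes the client-side sort order agrees with the server's ordering for your values.

```javascript
// Hypothetical helper: builds a pipeline that matches `hobbies` against `target`
// regardless of order, while still treating duplicates as significant.
function buildSortedMatchPipeline(target) {
  const sortedTarget = [...target].sort(); // assumes string elements
  return [
    // Cheap pre-filter, same idea as the first $match in the pipeline above.
    { $match: { hobbies: { $size: sortedTarget.length, $all: sortedTarget } } },
    // Compare the server-side sorted array with the client-side sorted target.
    {
      $match: {
        $expr: {
          $eq: [{ $sortArray: { input: "$hobbies", sortBy: 1 } }, sortedTarget],
        },
      },
    },
  ];
}

const sortedMatchPipeline = buildSortedMatchPipeline(["2", "1"]);
console.log(JSON.stringify(sortedMatchPipeline, null, 2));
// e.g. db.coll.aggregate(sortedMatchPipeline)
```

Because the second stage uses $expr, it is unlikely to benefit from an index on hobbies, so the first $match is still what keeps the candidate set small.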