Combine Mongo Documents after multiple lookups in single aggregation - arrays

I'm stuck trying to combine my document results. Here is my data:
{"_id":"5c21ab13d03013b384f0de26",
"roles":["5c21ab31d497a61195ce224c","5c21ab4ad497a6f348ce224d","5c21ab5cd497a644b6ce224e"],
"agency":"5b4ab7afd6ca361cb38d6a60","agents":["5b4ab5e897b24f1c4c8e3de3"]}
Here is the query
return db.collection('projects').aggregate([
{
$match: {
agents: ObjectId(agent)
}
},
{
$unwind: "$agents"
},
{
$lookup: {
from: "agents",
localField: "agents",
foreignField: "_id",
as: "agents"
}
},
{
$unwind: {
path: "$roles",
preserveNullAndEmptyArrays: true
}
},
{
$lookup: {
from: "roles",
localField: "roles",
foreignField: "_id",
as: "roles"
}
},
{
$lookup: {
from: "agencies",
localField: "agency",
foreignField: "_id",
as: "agency"
}
}
])
As you can see, an entry in the project collection has two arrays that are unwound before a lookup on each entry is performed and then a final lookup is performed on the "agency" field.
However, when I get the results from this query, the document count equals the number of roles. For example, the project I am aggregating has 3 roles and 1 agent, so I get back an array of 3 objects, one for each role, rather than a single document with the roles array containing all three roles. There is also a chance the agents array can have more than one value.
So lost...

You don't have to run $unwind before $lookup. The documentation for localField states:
If your localField is an array, you may want to add an $unwind stage to your pipeline. Otherwise, the equality condition between the localField and foreignField is foreignField: { $in: [ localField.elem1, localField.elem2, ... ] }
So if you don't run $unwind on roles, for instance, then instead of one document per role you get a single document whose roles array of ObjectIds is replaced by an array of the matching documents from the roles collection.
So you can try the following aggregation:
db.collection('projects').aggregate([
{
$match: {
agents: ObjectId(agent)
}
},
{
$lookup: {
from: "agents",
localField: "agents",
foreignField: "_id",
as: "agents"
}
},
{
$lookup: {
from: "roles",
localField: "roles",
foreignField: "_id",
as: "roles"
}
},
{
$lookup: {
from: "agencies",
localField: "agency",
foreignField: "_id",
as: "agency"
}
}
])
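To illustrate why dropping $unwind keeps one document per project, here is a minimal plain-JavaScript model of the $in behaviour quoted above. It is not MongoDB driver code: lookupArray, the short string ids, and the role names are hypothetical stand-ins for ObjectIds and real collections.

```javascript
// Plain-JavaScript model of $lookup when localField holds an array.
// lookupArray and the sample data are hypothetical, for illustration only.
function lookupArray(doc, localField, foreignDocs, as) {
  const ids = doc[localField] || [];
  // Mirrors the documented condition: foreignField: { $in: [elem1, elem2, ...] }
  const joined = foreignDocs.filter((f) => ids.includes(f._id));
  // The source document stays a single document; only the "as" field changes.
  return { ...doc, [as]: joined };
}

const project = { _id: "p1", roles: ["r1", "r2", "r3"], agents: ["a1"] };
const roles = [
  { _id: "r1", name: "Lead" },
  { _id: "r2", name: "Editor" },
  { _id: "r3", name: "Viewer" },
  { _id: "r4", name: "Unused" },
];

const joinedProject = lookupArray(project, "roles", roles, "roles");
console.log(joinedProject.roles.map((r) => r.name));
```

In this model the joined roles come back in the order of the roles array being searched. As far as I know, a real $lookup does not promise to preserve the order of the localField array either, so sort afterwards if order matters.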

Related

Mongodb: How to filter out array objects from $lookup result by matching lists of ObjectIds

I'm building a view that compiles data from a few different collections so that I don't have to make multiple queries during an API call. I'd like to do some filtering on top of the basic "localField" "foreignField" filtering but can't quite seem to get the right result no matter what I do. The problem seems to be that I need to check which objects of the array contain an array that has an ObjectId included in another array.
I'm using the aggregate function while trying to tweak my results. It looks like this:
db.users.aggregate([{ $lookup: { from: "organizations", localField: "context", foreignField: "_id", as: "context" } }, { $lookup: { from: "roles", localField: "roles", foreignField: "_id", as: "roles_with_details" } }, { $lookup: { from: "service_modules", localField: "context.enabled_service_modules", foreignField: "_id", as: "service_modules" } }, { $project: { "context.enabled_service_modules": 0, "password": 0 } }])
This returns data in the following format:
[
{
_id: ObjectId("63907d27ba21557a3455a24b"),
first_name: 'Tep',
last_name: 'Tes',
display_name: 'Tep Tes',
created: ISODate("2022-12-07T11:46:47.230Z"),
last_seen: 1670618355349,
username: 'example@example.com',
email: 'example@example.com',
enable_local_login: 'true',
connected_logins: [],
roles: [
ObjectId("63907c7fba21557a3455a247"),
ObjectId("63907c7fba21557a3455a248")
],
favourite_sm: [ ObjectId("6390832cba21557a3455a250") ],
context: [
{
_id: ObjectId("639074e7ba21557a3455a23f"),
organization_name: 'vip',
display_name: 'Vip',
enabled_login_methods: [ 'ldap' ],
login_method_configurations: [],
created: ISODate("2022-12-07T11:11:35.568Z")
}
],
roles_with_details: [
{
_id: ObjectId("63907c7fba21557a3455a247"),
role: 'jt',
description: 'Has access to membership data'
},
{
_id: ObjectId("63907c7fba21557a3455a248"),
role: 'at',
description: 'Has access to tenant information'
}
],
service_modules: [
{
_id: ObjectId("6390832cba21557a3455a250"),
service_name: 'Jt',
permissions: [
ObjectId("63907c7fba21557a3455a245"),
ObjectId("63907c7fba21557a3455a246"),
ObjectId("63907c7fba21557a3455a247")
],
route: 'jt'
},
{
_id: ObjectId("6390832cba21557a3455a24c"),
service_name: 'Mc',
permissions: [
ObjectId("63907c7fba21557a3455a245"),
ObjectId("63907c7fba21557a3455a246")
],
route: 'mc'
},
{
_id: ObjectId("6390832cba21557a3455a24e"),
service_name: 'Ms',
permissions: [
ObjectId("63907c7fba21557a3455a245"),
ObjectId("63907c7fba21557a3455a246")
],
route: 'ms'
},
{
_id: ObjectId("6390832cba21557a3455a251"),
service_name: 'At',
permissions: [
ObjectId("63907c7fba21557a3455a245"),
ObjectId("63907c7fba21557a3455a246"),
ObjectId("63907c7fba21557a3455a248")
],
route: 'at'
}
]
}
]
I would like to filter this result so that "service_modules" array would only include objects which contain at least one ObjectId in their "service_modules.permissions" array, that is also found in the "roles" array. This would mean that the "service_modules" should look like this:
service_modules: [
{
_id: ObjectId("6390832cba21557a3455a250"),
service_name: 'Jt',
permissions: [
ObjectId("63907c7fba21557a3455a245"),
ObjectId("63907c7fba21557a3455a246"),
ObjectId("63907c7fba21557a3455a247")
],
route: 'jt'
},
{
_id: ObjectId("6390832cba21557a3455a251"),
service_name: 'At',
permissions: [
ObjectId("63907c7fba21557a3455a245"),
ObjectId("63907c7fba21557a3455a246"),
ObjectId("63907c7fba21557a3455a248")
],
route: 'at'
}
]
I have tried changing the $lookup that joins the service_modules collection to the following, using a pipeline to match the arrays:
{ $lookup: { from: "service_modules", localField: "context.enabled_service_modules", foreignField: "_id","let":{"sm":"$service_modules","roles":"$roles"},"pipeline":[{$match:{$expr:{$in:["$$sm.permissions","$$roles"]}}}], as: "service_modules" } }
But it gives the following error:
MongoServerError: PlanExecutor error during aggregation :: caused by :: $in requires an array as a second argument, found: missing
I also tried using $unwind and $group, but unwinding with:
{$unwind:"$service_modules.permissions"}
results in an empty set, so I wasn't able to proceed to the grouping stage.
How should I filter the $lookup in order to get the result I'd want?
One way you could do it is by using a "$match" in a "pipeline" of the service_modules "$lookup".
db.users.aggregate([
{
$lookup: {
from: "organizations",
localField: "context",
foreignField: "_id",
as: "context"
}
},
{
$lookup: {
from: "roles",
localField: "roles",
foreignField: "_id",
as: "roles_with_details"
}
},
{
$lookup: {
from: "service_modules",
localField: "context.enabled_service_modules",
foreignField: "_id",
as: "service_modules",
"let": {
"roles": "$roles"
},
"pipeline": [
{
"$match": {
"$expr": {
"$gt": [
{"$size": {"$setIntersection": ["$$roles", "$permissions"]}},
0
]
}
}
}
]
}
},
{
$project: {
"context.enabled_service_modules": 0,
"password": 0
}
}
])
Try it on mongoplayground.net.
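As a quick check of the logic behind the $setIntersection test, here is a plain-JavaScript sketch that applies the same "non-empty intersection" rule to the sample data from the question. It is a model only: sharesAnyId is a hypothetical helper, and the ObjectIds are written as plain hex strings.

```javascript
// Hypothetical helper mirroring
// $gt: [ { $size: { $setIntersection: ["$$roles", "$permissions"] } }, 0 ]
function sharesAnyId(roles, permissions) {
  const roleSet = new Set(roles);
  return permissions.some((id) => roleSet.has(id));
}

// ObjectIds from the question, written as plain strings for this sketch.
const userRoles = ["63907c7fba21557a3455a247", "63907c7fba21557a3455a248"];
const serviceModules = [
  { service_name: "Jt", permissions: ["63907c7fba21557a3455a245", "63907c7fba21557a3455a246", "63907c7fba21557a3455a247"] },
  { service_name: "Mc", permissions: ["63907c7fba21557a3455a245", "63907c7fba21557a3455a246"] },
  { service_name: "Ms", permissions: ["63907c7fba21557a3455a245", "63907c7fba21557a3455a246"] },
  { service_name: "At", permissions: ["63907c7fba21557a3455a245", "63907c7fba21557a3455a246", "63907c7fba21557a3455a248"] },
];

const allowedModules = serviceModules
  .filter((m) => sharesAnyId(userRoles, m.permissions))
  .map((m) => m.service_name);

console.log(allowedModules); // [ 'Jt', 'At' ]
```

Only Jt and At survive, which matches the expected service_modules array in the question.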

MongoDB $match in an aggregate lookup not working as expected

When I run this query:
db.friendRequests.aggregate([
  {
    $lookup: {
      from: "users",
      localField: "author",
      foreignField: "_id",
      pipeline: [
        {
          $match: {
            $expr: {
              friend_id: new mongoose.Types.ObjectId(userid),
            },
          },
        },
      ],
      as: "userdata",
    },
  },
])
It returns every entry in the collection, even though there's a pipeline in it. Why is it not working?
Can you help me? Thanks!
Playground:
https://mongoplayground.net/p/Eh2j8lU4IQl
The friend_id field is present in the friendRequests collection (source for the aggregation) not the users collection which is the target for the $lookup. Therefore that predicate should come in a $match stage that precedes the $lookup:
db.friendRequests.aggregate([
{
$match: {
"friend_id": ObjectId("636a88de3e45346191cf4257")
}
},
{
$lookup: {
from: "users",
localField: "author",
foreignField: "_id",
as: "userdata"
}
}
])
See how it works in this playground example. Note that I changed inventory to users assuming that was just a typo in the collection name in the provided playground link.
Original answer
This syntax is incorrect:
$match: {
$expr: {
friend_id: new mongoose.Types.ObjectId(userid),
},
}
You should change it to either
$match: {
friend_id: new mongoose.Types.ObjectId(userid),
}
Or
$match: {
$expr: {
$eq: [
"$friend_id", new mongoose.Types.ObjectId(userid)
]
},
}
For MongoDB versions below 5.0 (thanks for the remark, @user20042973):
Before 5.0, $lookup does not support combining localField and foreignField with a pipeline. Remove them and add a let key so that the pipeline itself performs the join.
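For readers on an older server, the let/pipeline form mentioned above could look roughly like the sketch below. It only builds the pipeline as a plain JavaScript array. The variable name authorId is my own choice, and friendId is a plain string here where a real query would pass an ObjectId.

```javascript
// Sketch only: in a real query friendId would be an ObjectId, not a string.
const friendId = "636a88de3e45346191cf4257";

const pipeline = [
  // Filter the source collection first; friend_id lives in friendRequests.
  { $match: { friend_id: friendId } },
  {
    $lookup: {
      from: "users",
      // Expose the local author field to the sub-pipeline as $$authorId.
      let: { authorId: "$author" },
      pipeline: [
        // Inside $expr, "$_id" refers to the users document being tested.
        { $match: { $expr: { $eq: ["$_id", "$$authorId"] } } },
      ],
      as: "userdata",
    },
  },
];

console.log(JSON.stringify(pipeline, null, 2));
```

The array would then be passed to db.friendRequests.aggregate(pipeline). The $eq comparison assumes author holds a single id rather than an array.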

How to calculate the size of an aggregation when the 16 MB limit is exceeded

I want to calculate the size of a few interrelated documents in MongoDB for a particular application user.
I am performing an aggregation and, at the end, projecting $bsonSize, but this approach stops working when the document size exceeds the 16 MB limit.
I think there must be a better way to solve this problem, and I would appreciate it if an experienced developer could share one.
This is what my aggregation array looks like:
[
  {
    $match: {
      user: userId
    }
  }, {
    $lookup: {
      from: 'anyvalidcollection1',
      localField: 'validlocalfield1',
      foreignField: 'validforeignfield1',
      as: 'alias1'
    }
  }, {
    $lookup: {
      from: 'anyvalidcollection2',
      localField: 'validlocalfield2',
      foreignField: 'validforeignfield2',
      as: 'alias2'
    }
  }, {
    $lookup: {
      from: 'anyvalidcollection3',
      localField: 'validlocalfield3',
      foreignField: 'validforeignfield3',
      as: 'alias3'
    }
  }, {
    $lookup: {
      from: 'anyvalidcollection4',
      localField: 'validlocalfield4',
      foreignField: 'validforeignfield4',
      as: 'alias4'
    }
  }, {
    $lookup: {
      from: 'anyvalidcollection5',
      localField: 'validlocalfield5',
      foreignField: 'validforeignfield5',
      as: 'alias5'
    }
  }, {
    $project: {
      size: {
        $bsonSize: '$$ROOT'
      },
      fileSize: '$file_data.size'
    }
  }, {
    $unwind: {
      path: '$fileSize',
      includeArrayIndex: '0',
      preserveNullAndEmptyArrays: false
    }
  }, {
    $group: {
      _id: 'sum',
      totalSize: {
        $sum: '$size'
      },
      totalFileSize: {
        $sum: '$fileSize'
      }
    }
  }
]
If a document in the pipeline is larger than 16 MB, $bsonSize will complain about the document size, even if the returned documents are smaller than 16 MB.
A simple way around this is to compute several BSON sizes and add them up. For example, if you have several fields with lots of data (field1, field2, field3),
you can do something like the below:
aggregate(
[{"$set":
{"size":
{"$add":
[{"$bsonSize": {"field1": "$field1"}},
{"$bsonSize": {"field2": "$field2"}},
{"$bsonSize": {"field3": "$field3"}}]}}}])
You also have 5 lookups, which looks like a lot; maybe you can reduce them if you change your schema.
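With five lookup aliases, writing the $add list by hand gets repetitive, so here is a small sketch that generates the same kind of $set stage from a list of field names. bsonSizeSumStage is a hypothetical helper of mine, and the alias names are taken from the question's pipeline.

```javascript
// Hypothetical helper: builds a $set stage that sums $bsonSize per field,
// wrapping each field in its own small object as in the answer above.
function bsonSizeSumStage(fields) {
  return {
    $set: {
      size: {
        $add: fields.map((name) => ({
          $bsonSize: { [name]: "$" + name },
        })),
      },
    },
  };
}

const sizeStage = bsonSizeSumStage(["alias1", "alias2", "alias3", "alias4", "alias5"]);
console.log(JSON.stringify(sizeStage, null, 2));
```

Note that the sum is an approximation of the whole-document size: each wrapper object carries a few bytes of its own BSON overhead, and fields not listed (such as _id) are not counted.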

MongoDB aggregation: Counting results of the lookup without joining

I'm working with this query:
customers.aggregate: [
{$lookup: {
from: "users",
localField: "_id",
foreignField: "customerId",
as: "users"
}},
{$lookup: {
from: "templates",
let: {localField: "$_id"},
pipeline: [{
$match: { $and: [{
$expr: { $eq: ["$customerId", "$$localField"]}},
{module: false}]
}}],
as: "templates"
}},
{$lookup: {
from: "publications",
localField: "_id",
foreignField: "customerId",
as: "publications"
}},
{$lookup: {
from: "documents",
let: {localField: "$_id"},
pipeline: [{
$match: { $and: [{
$expr: { $eq: ["$customerId", "$$localField"]}},
{createdAt: {$gte: {$date: "<someDate>"}}}]
}}],
as: "recentDocuments"
}}
]
In the last lookup stage I'm filtering the documents whose customerId matches the customer's _id and that are newer than <someDate>, and then joining those documents to the respective "customer" object.
After this step, or if possible in this same step, I would also like to add a new field to each resulting "customer" document holding the count of all documents (not only those that pass the time filter) in the "documents" collection whose customerId matches the customer document's _id. I don't want to join those documents to the customer object, as I only need the total number of documents with the respective customerId. I can only use extended JSON v1 strict mode syntax.
The result would look like:
customers: [
0: {
users: [...],
templates: [...],
publications: [...],
recentDocuments: [...],
totalDocuments: <theCountedNumber>
},
1: {...},
2: {...},
...
]
Use $set and $size
db.customers.aggregate([
{
$lookup: {
from: "documents",
let: { localField: "$_id" },
pipeline: [
{
$match: {
$and: [
{ $expr: { $eq: [ "$customerId", "$$localField" ] } }
]
}
}
],
as: "recentDocuments"
}
},
{
$set: {
totalDocuments: { $size: "$recentDocuments" }
}
}
])
mongoplayground
So on Thursday I found the proper syntax to solve my problem. It goes like this:
db.customers.aggregate([
{
$lookup: {
from: "users",
localField: "_id",
foreignField: "customerId",
as: "users"
}},
{$lookup: {
from: "templates",
let: {localField: "$_id"},
pipeline: [{
$match: { $and: [{
$expr: { $eq: ["$customerId", "$$localField"]}},
{module: false}]
}}],
as: "templates"
}},
{$lookup: {
from: "publications",
localField: "_id",
foreignField: "customerId",
as: "publications"
}},
{$lookup: {
from: "documents",
let: {localField: "$_id"},
pipeline: [{
$match: { $and: [{
$expr: { $eq: ["$customerId", "$$localField"]}},
{createdAt: {$gte: {$date: "<someDate>"}}}]
}}],
as: "recentDocuments"
}},
{$lookup: {
from: "documents",
let: {localField: "$_id"},
pipeline: [
{$match: {$expr: {$eq: ["$customerId", "$$localField"]}}},
{$count: "count"}
],
as: "documentsNumber"
}}
])
In its last stage, this pipeline goes over the documents collection again, this time counting all the documents instead of filtering by time period, and adds to every "customer" document a documentsNumber array whose single item holds the total number of documents. The array could later be flattened with $unwind, but that proved to decrease performance drastically, so I omitted it. I really hope this helps someone solve a similar problem.
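If a plain number is preferred over the one-item documentsNumber array, one option that avoids $unwind is to read the first element directly. The sketch below is an assumption on my part rather than something tested against this data set; the field name totalDocuments comes from the desired output in the question.

```javascript
// Sketch: turn documentsNumber ([{ count: n }] or []) into a plain number.
// "$documentsNumber.count" resolves to an array of count values,
// $arrayElemAt picks the first one, and $ifNull supplies 0 when the
// array is empty ($count emits nothing if no documents matched).
const flattenCountStage = {
  $set: {
    totalDocuments: {
      $ifNull: [{ $arrayElemAt: ["$documentsNumber.count", 0] }, 0],
    },
  },
};

console.log(JSON.stringify(flattenCountStage, null, 2));
```

The stage contains only strings, numbers, arrays and objects, so its JSON form should be usable where strict-mode extended JSON is required. On servers older than 4.2, $addFields takes the place of $set.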

mongo lookup where array fields don't match

I have two collections: Users and Roles.
Both use ObjectIds as their index.
Users has a field called Roles which holds an array of Role ObjectIds.
Users:
{
"_id" : ObjectId("590253b50985e614aaa90098"),
"roles" : [
ObjectId("57d624612808daf641fafae3"),
ObjectId("5a2da7e37f1c84d172161273"),
ObjectId("5a2ede157f1c84d172161d33"),
ObjectId("5a2ede927f1c84d172161d34")
]
}
Roles:
{
"_id" : ObjectId("57c6371cf541a6c9457f1319"),
"name" : "Admin"
}
I'm trying to identify the Role objectIds in the roles array of the User collection that DO NOT have a reference in the Roles Collection.
Any ideas? I've tried Aggregates, lookups, foreach, nin.. and have not found the right combination. Clearly I'm new to mongo :p
Thanks in advance for any help!
You can use $unwind, $lookup, $match and $group:
db.Users.aggregate([
{
$unwind: "$roles"
},
{
$lookup:
{
from: "Roles",
localField: "roles",
foreignField: "_id",
as: "Role_info"
}
},
{
$match : {
"Role_info.0": {$exists:false}
}
},
{
$group: {
_id: "$roles"
}
}]);
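To make the four stages easier to follow, here is a plain-JavaScript model of what the pipeline computes: every role id referenced by some user that has no matching document in Roles, listed once. danglingRoleIds and the short string ids are hypothetical and only stand in for the real collections.

```javascript
// Plain-JavaScript model of the pipeline above (not MongoDB driver code).
function danglingRoleIds(users, roles) {
  const known = new Set(roles.map((r) => r._id));
  const missing = new Set(); // a Set plays the part of $group: one entry per id
  for (const user of users) {
    // $unwind: visit each role id of each user separately
    for (const id of user.roles || []) {
      // $lookup + $match: keep ids with no matching Roles document
      if (!known.has(id)) missing.add(id);
    }
  }
  return [...missing];
}

const sampleUsers = [
  { _id: "u1", roles: ["r1", "r2", "r9"] },
  { _id: "u2", roles: ["r2", "r9", "r7"] },
];
const sampleRoles = [
  { _id: "r1", name: "Admin" },
  { _id: "r2", name: "Editor" },
];

const orphanIds = danglingRoleIds(sampleUsers, sampleRoles);
console.log(orphanIds); // [ 'r9', 'r7' ]
```

Here r9 is referenced by both users but reported once, which is the effect of the final $group on "$roles".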

Resources