I would like to extract from the collection the IDs of documents in which the "drives" array nested inside "streetModel" contains objects with duplicate IDs.
This is a typical document:
{
"_id": {
"$oid": "61375bec4fa522001b608568"
},
"name": "Streetz",
"statusDetail": {},
"streetModel": {
"_id": "3.7389-51.0566",
"name": "Kosheen - Darude - Swedish - Trynidad - Maui",
"countryCode": "DEN",
"drives": [{
"_id": -903500698,
"direction": "WEST"
}, {
"_id": 1915399546,
"direction": "EAST"
}, {
"_id": 1294835467,
"direction": "NORTH"
}, {
"_id": 1248969937,
"direction": "EAST"
}, {
"_id": 1248969937,
"direction": "EAST"
}, {
"_id": 1492411786,
"direction": "SOUTH"
}]
},
"createdAt": {
"$date": "2021-09-07T12:32:44.238Z"
}
}
In this particular document, with the ID 61375bec4fa522001b608568, the "drives" array inside "streetModel" contains two drive objects with the same ID, 1248969937.
I would like to create a query to the database that will return the ID of all documents with such a problem (duplicate "drives").
Right now I have got this:
db.streets.aggregate([
{
$unwind: "$streetModel"
},
{
$unwind: "$drives"
},
{
$group: {
_id: {
id: "$_id"
},
sum: {
$sum: 1
},
}
},
{
$match: {
sum: {
$gt: 1
}
}
},
{
$project: {
_id: "$_id._id",
duplicates: {
drives: "$_id"
}
}
}
])
but it does not return what I need.
I have tried rewriting this query in many ways, but unfortunately none of them work.
Query
unwind streetModel.drives
group by document ID + drive ID
keep only the groups in which the same drive ID occurs more than once
$replaceRoot is only there to make the result documents look nicer; you could use $project instead
if you need more stages you can add them; for example, to get just the documents that have this problem, project only the docid values
db.collection.aggregate([
{
"$unwind": {
"path": "$streetModel.drives"
}
},
{
"$group": {
"_id": {
"docid": "$_id",
"driveid": "$streetModel.drives._id"
},
"duplicates": {
"$push": "$streetModel.drives.direction"
}
}
},
{
"$match": {
"$expr": {
"$gt": [
{
"$size": "$duplicates"
},
1
]
}
}
},
{
"$replaceRoot": {
"newRoot": {
"$mergeObjects": [
"$_id",
"$$ROOT"
]
}
}
},
{
"$project": {
"_id": 0
}
}
])
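To sanity-check what the pipeline should return, the same unwind/group/match steps can be mirrored in plain JavaScript on an in-memory copy of the sample document. This is only a sketch: findDuplicateDrives is a hypothetical helper name (not a MongoDB API), and the sample keeps only the fields that matter here.

```javascript
// Hypothetical in-memory mirror of the pipeline above (not a MongoDB API).
function findDuplicateDrives(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // "$unwind": visit every element of streetModel.drives
    for (const drive of doc.streetModel.drives) {
      // "$group": the key is document id + drive id
      const key = JSON.stringify([doc._id, drive._id]);
      counts.set(key, (counts.get(key) || 0) + 1);
    }
  }
  // "$match": keep only the groups that were seen more than once
  const result = [];
  for (const [key, count] of counts) {
    if (count > 1) {
      const [docid, driveid] = JSON.parse(key);
      result.push({ docid, driveid, count });
    }
  }
  return result;
}

// Trimmed copy of the sample document from the question.
const sample = [{
  _id: "61375bec4fa522001b608568",
  streetModel: { drives: [
    { _id: -903500698 }, { _id: 1915399546 }, { _id: 1294835467 },
    { _id: 1248969937 }, { _id: 1248969937 }, { _id: 1492411786 },
  ] },
}];

console.log(findDuplicateDrives(sample));
```

If the pipeline and this mirror disagree on a document, the difference is usually in the path passed to $unwind or in the grouping key.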
Need help with a MongoDB query
MongoDB query: search for parents with state good and children with state bad or missing. The output should be an array of all the children with state bad or missing whose parents have state good.
Below is the JSON list:
[
{
"name": "parent-a",
"status": {
"state": "good"
},
"children": [
"child-1",
"child-2"
]
},
{
"name": "child-1",
"state": "good",
"parent": "parent-a"
},
{
"name": "child-2",
"state": {},
"parent": "parent-a"
},
{
"name": "parent-b",
"status": {
"state": "good"
},
"children": [
"child-3",
"child-4"
]
},
{
"name": "child-3",
"state": "good",
"parent": "parent-b"
},
{
"name": "child-4",
"state": "bad",
"parent": "parent-b"
},
{
"name": "parent-c",
"status": {
"state": "bad"
},
"children": [
"child-5",
"child-6"
]
},
{
"name": "child-5",
"state": "good",
"parent": "parent-c"
},
{
"name": "child-6",
"state": "bad",
"parent": "parent-c"
}
]
Expected output
"children": [
{
"name": "child-2",
"state": {}
},
{
"name": "child-4",
"state": "bad"
}
]
Any inputs would be appreciated. Thanks in advance :)
One option is to use $lookup* for this:
db.collection.aggregate([
{$match: {state: {$in: ["bad", {}]}}},
{$lookup: {
from: "collection",
localField: "parent",
foreignField: "name",
pipeline: [
{$match: {"status.state": "good"}}
],
as: "hasGoodParent"
}},
{$match: {"hasGoodParent.0": {$exists: true}}},
{$project: {name: 1, state: 1, _id: 0}}
])
*If your MongoDB version is lower than 5.0, you need to change the syntax a bit: drop the localField and foreignField of the $lookup and replace them with let and an equality match inside the pipeline.
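The pre-5.0 form described in the footnote might look like the sketch below. The pipeline is written as a plain JavaScript array so it can be passed to aggregate; the variable name parentName is an illustrative choice, not something from the original answer.

```javascript
// Sketch of the same query for servers older than 5.0, where
// localField/foreignField cannot be combined with pipeline.
const pipeline = [
  // Children whose state is "bad" or an empty object.
  { $match: { state: { $in: ["bad", {}] } } },
  { $lookup: {
      from: "collection",
      // Expose the child's parent name to the sub-pipeline.
      let: { parentName: "$parent" }, // parentName is an illustrative name
      pipeline: [
        { $match: { $expr: { $and: [
          { $eq: ["$name", "$$parentName"] },   // replaces localField/foreignField
          { $eq: ["$status.state", "good"] },   // parent must be in a good state
        ] } } },
      ],
      as: "hasGoodParent",
  } },
  // Keep only children for which a good parent was found.
  { $match: { "hasGoodParent.0": { $exists: true } } },
  { $project: { name: 1, state: 1, _id: 0 } },
];

// In mongosh: db.collection.aggregate(pipeline)
```

Inside the sub-pipeline, the outer document's fields are only reachable through the variables declared in let, which is why the equality test has to live inside $expr.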
Here is an approach that does this without a "$lookup" stage, since performance usually suffers when one is involved. Basically, we match all relevant children and parents and group by the child ID. If a group has a parent (which means the parent has a "good" state) and a child (which means the child has a "bad" or {} state), then it is matched.
You should make sure you have the appropriate indexes to support the initial query.
Additionally, I would personally recommend adding a boolean field to each document to mark whether it is a parent or a child. Right now we have to rely on the field structure of your input to infer the type, which I would consider bad practice.
Another thing we did not discuss, and which does not seem possible with the current structure, is recursion: can a child have children of its own? Just some things to consider.
db.collection.aggregate([
{
$match: {
$or: [
{
$and: [
{
"status.state": "good"
},
{
parent: {
$exists: false
}
},
{
"children.0": {
$exists: true
}
}
]
},
{
$and: [
{
"state": {
$in: [
"bad",
null,
{}
]
}
},
{
parent: {
$exists: true
}
}
]
}
]
}
},
{
$unwind: {
path: "$children",
preserveNullAndEmptyArrays: true
}
},
{
$addFields: {
isParent: {
$cond: [
{
$eq: [
null,
{
$ifNull: [
"$parent",
null
]
}
]
},
1,
0
]
}
}
},
{
$group: {
_id: {
$cond: [
"$isParent",
"$children",
"$name"
]
},
hasParnet: {
$sum: "$isParent"
},
hasChild: {
$sum: {
$subtract: [
1,
"$isParent"
]
}
},
state: {
"$mergeObjects": {
$cond: [
"$isParent",
{},
{
state: "$state"
}
]
}
}
}
},
{
$match: {
hasChild: {
$gt: 0
},
hasParnet: {
$gt: 0
}
}
},
{
$group: {
_id: null,
children: {
$push: {
name: "$_id",
state: "$state.state"
}
}
}
}
])
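As a cross-check of the expected output, the rule "child is bad or missing a state, and its parent is good" can be written in plain JavaScript against the JSON list from the question. This is a hedged in-memory sketch; badChildrenOfGoodParents and isBadOrMissing are hypothetical helper names, and the parent/child distinction relies on the same field structure as the pipeline above.

```javascript
// Hypothetical in-memory check (not a MongoDB API).
function isBadOrMissing(state) {
  return state === "bad" || state == null ||
    (typeof state === "object" && Object.keys(state).length === 0);
}

function badChildrenOfGoodParents(docs) {
  // Names of all parents whose status.state is "good".
  const goodParents = new Set(
    docs.filter(d => d.status && d.status.state === "good").map(d => d.name)
  );
  // Children are the documents that carry a "parent" field.
  return docs
    .filter(d => d.parent !== undefined && goodParents.has(d.parent) && isBadOrMissing(d.state))
    .map(d => ({ name: d.name, state: d.state }));
}

// The JSON list from the question.
const docs = [
  { name: "parent-a", status: { state: "good" }, children: ["child-1", "child-2"] },
  { name: "child-1", state: "good", parent: "parent-a" },
  { name: "child-2", state: {}, parent: "parent-a" },
  { name: "parent-b", status: { state: "good" }, children: ["child-3", "child-4"] },
  { name: "child-3", state: "good", parent: "parent-b" },
  { name: "child-4", state: "bad", parent: "parent-b" },
  { name: "parent-c", status: { state: "bad" }, children: ["child-5", "child-6"] },
  { name: "child-5", state: "good", parent: "parent-c" },
  { name: "child-6", state: "bad", parent: "parent-c" },
];

console.log(badChildrenOfGoodParents(docs));
```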
My current MongoDB query takes up to 5 minutes to search through 2 documents when each document has 10,000 contacts. Please suggest ways to improve this significantly.
I am trying to search for a phone number in hundreds of documents.
Each document belongs to a user, and each user has a contacts array (as you can see in the code below) with 10,000 objects; each object can have 2 to 3 phone numbers (see the document structure below).
If a phone number is found in multiple documents, I need the MongoDB query to return an array with the userNumber values found in those documents.
Below is the structure of the documents in my MongoDB collection. For simplicity, I show only one object in the contacts array; in fact there are thousands of objects.
{
"_id": { "$oid": "61d1f04266289f003452d705" },
"userID": { "$oid": "61d1efea2c0fab00340f47c8" },
"contacts": [
{
"emailAddresses": [
{ "id": "6884", "label": "email1", "email": "addedemail#gmail.com" }
],
"phoneNumbers": [
{
"label": "other",
"id": "4594",
"number": "+918984292930"
},
{
"label": "other",
"id": "4595",
"number": "+911234567890"
}
],
"_id": { "$oid": "61d1f04266289f003452d744" },
"ContactName": "Sample User 1 Name Changed",
"ContactNumber": "+918984292930",
"recordID": "833"
}
],
"userNumber": "+911234567890",
"__v": 7
}
Current MongoDB Query:
await ContactModel.aggregate([
{
$match: {
userNumber: userNumber,
},
},
{
$unwind: "$contacts",
},
{
$lookup: {
from: "phonenumbers",
let: {
contactNumberVar: "$contacts.ContactNumber",
},
pipeline: [
{ $unwind: "$contacts" },
{
$project: {
userNumber: 1,
"contacts.ContactNumber": 1,
},
},
{
$match: {
$and: [
{ $expr: { $eq: ["$$contactNumberVar", "$userNumber"] } },
{
$expr: {
$eq: [contactNumber, "$contacts.ContactNumber"],
},
},
],
},
},
],
as: "mutualContacts",
},
},
{
$project: {
userID: 1,
"mutualContacts.userNumber": 1,
},
},
{
$group: {
_id: "$userID",
mutualContacts: {
$push: {
$cond: [
{ $gt: [{ $size: "$mutualContacts" }, 0] },
{ $arrayElemAt: ["$mutualContacts.userNumber", 0] },
"$$REMOVE",
],
},
},
},
},
]).exec()
First of all, ensure you have indexes that support the query on both collections. An index on
{userNumber:1}
should be a good candidate, but please test other options.
Next - query optimisation. In the lookup pipeline:
pipeline: [
{ $unwind: "$contacts" },
{
$project: {
userNumber: 1,
"contacts.ContactNumber": 1,
},
},
{
$match: {
$and: [
{ $expr: { $eq: ["$$contactNumberVar", "$userNumber"] } },
{
$expr: {
$eq: [contactNumber, "$contacts.ContactNumber"],
},
},
],
},
},
],
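This unwinds the whole phonenumbers collection before anything is filtered.
Match first, and unwind/project only the matching documents instead. Note that before $unwind, "$contacts.ContactNumber" resolves to an array, so an $expr/$eq comparison against a single number would never be true there; the early $match therefore uses a plain query condition on that field:
pipeline: [
{
$match: {
// plain query condition: true if any contact has this number
"contacts.ContactNumber": contactNumber,
$expr: { $eq: ["$$contactNumberVar", "$userNumber"] },
},
},
{ $unwind: "$contacts" },
{
$project: {
userNumber: 1,
"contacts.ContactNumber": 1,
},
},
{
// after $unwind each document holds a single contact, so $eq works
$match: {
$expr: {
$eq: [contactNumber, "$contacts.ContactNumber"],
},
},
},
],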
I need to return the documents with the nested array (lessons) filtered down to the items whose subLesson array contains any ID from an input array.
I have a document like this:
[{
"_id": {
"$oid": "6081fedbee5d133dbffb42eb"
},
"name": "my quiz",
"lessons": [
{
"_id": "460c42e1-b0b7-437e-ab63-c59cce8ced0d",
"name": "section",
"subLesson": [
{
"$oid": "6081fed9ee5d133dbffb3cba"
},
{
"$oid": "6081fed9ee5d133dbffb3cc0"
}
]
},
{
"_id": "f7b5c95f-1a68-42ca-880c-22ef3831ff03",
"name": "ffff",
"subLesson": [
{
"$oid": "6081fed9ee5d133dbffb3cbb"
}
]
}
]
}]
I wrote the following query, but it does not work. I do not know how to use $elemMatch inside $filter.
db.collection.aggregate([
{
"$project": {
_id: 1,
lessons: {
$filter: {
"input": "$lessons",
"as": "lesson",
"cond": {
"$$lesson.subLesson": {
"$elemMatch": {
"$in": [
ObjectId("6081fed9ee5d133dbffb3cba")
]
}
}
}
}
}
}
}
])
I am trying to find the record such that the result looks like the following.
[{
"_id": {
"$oid": "6081fedbee5d133dbffb42eb"
},
"lessons": [
{
"_id": "460c42e1-b0b7-437e-ab63-c59cce8ced0d",
"name": "zzzz",
"subLesson": [
{
"$oid": "6081fed9ee5d133dbffb3cba"
},
{
"$oid": "6081fed9ee5d133dbffb3cc0"
}
]
}
]
}]
Can anyone please help me understand how I can make this work?
Thanks.
You can use $in directly as the $filter condition:
db.collection.aggregate([
{
$project: {
lessons: {
$filter: {
input: "$lessons",
cond: {
$in: [ ObjectId("6081fed9ee5d133dbffb3cba"), "$$this.subLesson" ]
}
}
}
}
}
])
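Update 1: to match any of several IDs, unwind lessons, match on subLesson, and group the lessons back together: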
db.collection.aggregate([
{ "$unwind": "$lessons" },
{
"$match": {
"lessons.subLesson": {
$in: [ ObjectId("6081fed9ee5d133dbffb3cba"), ObjectId("6081fed9ee5d133dbffb3cbb") ]
}
}
},
{
$group: {
_id: "$_id",
name: { $first: "$name" },
lessons: { $push: "$lessons" }
}
}
])
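For several input IDs it may also be possible to stay with $filter and skip the $unwind/$group round trip by testing the set intersection of each subLesson array with the input list. The sketch below is an alternative to the answer's pipelines, not part of them; lessonsMatchingAny is a hypothetical helper name, and it assumes every lesson has a subLesson array (a missing one would make $size fail).

```javascript
// Builds an aggregation pipeline that keeps only the lessons whose
// subLesson array shares at least one element with `ids`.
// Assumption: every lesson has a subLesson array.
function lessonsMatchingAny(ids) {
  return [
    { $project: {
        name: 1,
        lessons: { $filter: {
          input: "$lessons",
          // "$$this" is the default variable name when "as" is omitted.
          cond: { $gt: [
            { $size: { $setIntersection: ["$$this.subLesson", ids] } },
            0,
          ] },
        } },
    } },
  ];
}

// In mongosh, for example:
// db.collection.aggregate(lessonsMatchingAny([
//   ObjectId("6081fed9ee5d133dbffb3cba"),
//   ObjectId("6081fed9ee5d133dbffb3cbb"),
// ]))
```

Unlike the $unwind version, this keeps documents whose lessons array ends up empty; add a $match on "lessons.0" afterwards if those should be dropped.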
I have an object like this:
{
"_id": {
"$oid": "5f0047f02fd3fc048aab9ee9"
},
"array": [
{
"_id": {
"$oid": "5f00dcc23e12b8721e4f3672"
},
"name": "NAME",
"sub_array": [
{
"sub_array2": [
{
"$oid": "5f00e367f7b8747beddc6d31"
},
{
"$oid": "5f00f26c1facd18c5158d1d3"
}
],
"_id": {
"$oid": "5f00de99a8802e767885e72b"
},
"week_day": 1
},
{
"sub_array2": [
{
"$oid": "5f00e367f7b8747beddc6d31"
}
],
"_id": {
"$oid": "5f00f2501facd18c5158d1d2"
},
"week_day": 3
}
]
},
{
"_id": {
"$oid": "5f00f2401facd18c5158d1d1"
},
"name": "NAME1",
"sub_array": []
}
]
}
I want to replace the sub_array2 IDs with objects from another collection, but my $lookup converts array and sub_array into objects and loses all of the other data, such as week_day.
Lookup:
'$lookup': {
'from': 'sati',
'localField': 'array.sub_array.sub_array2',
'foreignField': '_id',
'as': 'array.sub_array.sub_array2'
}
Result:
{
"_id": {
"$oid": "5f0047f02fd3fc048aab9ee9"
},
"array": {
"sub_array": {
"sub_array2": [
{
"_id": {
"$oid": "5f00e367f7b8747beddc6d31"
},
"endTime": "2020-07-03T12:06:50+0000",
"startTime": "2020-07-03T12:05:50+0000",
"data1": {
"$oid": "5f005e63ab1cbf2374d5163f"
}
},
{
"_id": {
"$oid": "5f00e367f7b8747beddc6d31"
},
"endTime": "2020-07-03T12:06:50+0000",
"startTime": "2020-07-03T12:05:50+0000",
"data1": {
"$oid": "5f005e63ab1cbf2374d5163f"
}
},
{
"_id": {
"$oid": "5f00e367f7b8747beddc6d31"
},
"endTime": "2020-07-03T12:06:50+0000",
"startTime": "2020-07-03T12:05:50+0000",
"data1": {
"$oid": "5f005e63ab1cbf2374d5163f"
}
}
]
}
}
}
Is there a way to "replace" the individual IDs without converting entire arrays to objects and removing the other fields? I know mongoose can do that, but I'm not permitted to use it. None of the other questions helped.
Writing the $lookup result to array.sub_array.sub_array2 overwrites that entire path with the lookup result. Instead, store the lookup result in a separate sati field and add an extra stage, as shown below.
$map lets you iterate over an array and transform each item.
db.collection.aggregate([
{
"$lookup": {
"from": "sati",
"localField": "array.sub_array.sub_array2",
"foreignField": "_id",
"as": "sati"
}
},
{
$project: {
array: {
$map: {
input: "$array",
as: "array",
in: {
_id: "$$array._id",
name: "$$array.name",
sub_array: {
$map: {
input: "$$array.sub_array",
as: "sub_array",
in: {
_id: "$$sub_array._id",
week_day: "$$sub_array.week_day",
sub_array2: {
$filter: {
input: "$sati",
as: "sati_item",
cond: {
$in: [
"$$sati_item._id",
"$$sub_array.sub_array2"
]
}
}
}
}
}
}
}
}
}
}
}
])
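The nested $map / $filter structure can be easier to follow when mirrored with Array.prototype.map and filter in plain JavaScript. This is a hedged in-memory sketch only: populateSubArray2 is a hypothetical helper name, the IDs are shortened to plain strings, and the startTime values are made-up placeholders.

```javascript
// Hypothetical in-memory mirror of the $project stage above.
function populateSubArray2(doc, satiDocs) {
  return {
    _id: doc._id,
    // outer $map over "array"
    array: doc.array.map(item => ({
      _id: item._id,
      name: item.name,
      // inner $map over "sub_array"
      sub_array: item.sub_array.map(sub => ({
        _id: sub._id,
        week_day: sub.week_day,
        // $filter: keep the looked-up documents whose _id is referenced here
        sub_array2: satiDocs.filter(s => sub.sub_array2.includes(s._id)),
      })),
    })),
  };
}

// Shortened stand-ins for the documents in the question.
const doc = {
  _id: "d1",
  array: [
    { _id: "a1", name: "NAME", sub_array: [
      { _id: "s1", week_day: 1, sub_array2: ["x", "y"] },
      { _id: "s2", week_day: 3, sub_array2: ["x"] },
    ] },
    { _id: "a2", name: "NAME1", sub_array: [] },
  ],
};
const satiDocs = [
  { _id: "x", startTime: "placeholder-1" },
  { _id: "y", startTime: "placeholder-2" },
];

const populated = populateSubArray2(doc, satiDocs);
console.log(JSON.stringify(populated, null, 2));
```

As in the pipeline, every field that should survive has to be listed explicitly at each level, and the order inside sub_array2 follows the looked-up array rather than the original ID order.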
I have two collections in the following format -
collection 1
{
"_id": "col1id1",
"name": "col1doc1",
"properties": [ "<_id1>", "<_id2>", "<_id3>"]
}
collection 2
{
"_id": "<_id1>",
"name": "doc1",
"boolean_field": false
}
{
"_id": "<_id2>",
"name": "doc2",
"boolean_field": true
}
{
"_id": "<_id3>",
"name": "doc3",
"boolean_field" : false
}
the desired output is -
{
"_id": "col1id1",
"name": "col1doc1",
"property_names": ["doc1", "doc3"]
}
The properties field of the document in collection 1 holds the IDs of three documents in collection 2, but the output of the join should contain only those whose boolean_field value is false. How can I apply this filter together with the join in MongoDB?
$lookup can be used along with $unwind to achieve this.
db.col1.aggregate([
{
"$unwind": "$properties"
},
{
"$lookup": {
from: "col2",
localField: "properties",
"foreignField": "_id",
"as": "property_names"
}
},
{
"$unwind": "$property_names"
},
{
// the field is called boolean_field in collection 2
"$match": {
"property_names.boolean_field": false
}
},
{
"$group": {
"_id": "$_id",
// carry name through the group so it appears in the output
"name": {
"$first": "$name"
},
// collect only the names of the matching documents
"property_names": {
"$push": "$property_names.name"
}
}
}
]);