I would like to know how to query one collection in MongoDB based on two other collections.
I have 3 schemas in MongoDB:
1. Site : {_id, name}
2. Components : { _id, siteId, details }
3. Maintenance : { _id, siteId }
I want to get all the components with their site information while ensuring that their sites are not in maintenance.
I am able to fetch components with site information with the following query:
componentCollection
.aggregate([
{
$lookup: {
from: 'sites',
localField: 'siteId',
foreignField: '_id',
as: 'sites',
},
},
])
How can I update this query so that the sites of the selected components are not in the maintenance collection?
Update with sample data, expected and current output:
`Site`
---------------------------
_id | name
---------------------------
1 | site1
---------------------------
2 | site2
---------------------------
`Components`
---------------------------
_id | siteId | details
---------------------------
3 | 1 | help & support
---------------------------
4 | 2 | footer links
---------------------------
`Maintenance`
---------------------------
_id | siteId
---------------------------
5 | 1
---------------------------
With the above sample query I am getting the following result:
[
{
_id: 3,
siteId: 1,
details: 'help & support',
sites: [
{
_id: 1,
name: 'site1'
}
]
},
{
_id: 4,
siteId: 2,
details: 'footer links',
sites: [
{
_id: 2,
name: 'site2'
}
]
}
]
But I want only the result below, as site1 is in maintenance mode:
[
{
_id: 4,
siteId: 2,
details: 'footer links',
sites: [
{
_id: 2,
name: 'site2'
}
]
}
]
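This might help you.
Join the Components collection with the Site collection.
Inside that join, nest a second join from Site to Maintenance. If Maintenance has an entry for the site, joinMaintenance is non-empty and the site can be eliminated.
Use $filter to keep only the objects in the sites array whose joinMaintenance is an empty array.
So a site stays in sites only if joinMaintenance has no objects. Components whose sites array ends up empty are then dropped, and the helper field joinMaintenance is removed from the output.
Here is the code: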
[
{
"$lookup": {
"from": "Site",
"let": {
"sId": "$siteId"
},
"pipeline": [
{
$match: {
$expr: {
$eq: [
"$_id",
"$$sId"
]
}
}
},
{
$lookup: {
from: "Maintenance",
let: {
"smId": "$_id"
},
pipeline: [
{
"$match": {
$expr: {
$eq: [
"$siteId",
"$$smId"
]
}
}
}
],
as: "joinMaintenance"
}
}
],
as: "sites"
}
},
{
$project: {
details: 1,
siteId: 1,
sites: {
$filter: {
input: "$sites",
cond: {
$eq: [
"$$this.joinMaintenance",
[]
]
}
}
}
}
},
{
$match: {
$expr: {
$ne: [
"$sites",
[]
]
}
}
},
{
$project: {
"sites.joinMaintenance": 0
}
}
]
Working Mongo playground
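To make the logic of the pipeline easier to follow, here is a minimal sketch in plain JavaScript that applies the same steps (join, filter, drop empties) to the sample data from the question. It runs in memory and does not talk to MongoDB; the helper name componentsNotInMaintenance is hypothetical.

```javascript
// In-memory illustration only; no MongoDB driver is involved.
// Sample data copied from the question.
const sites = [
  { _id: 1, name: 'site1' },
  { _id: 2, name: 'site2' },
];
const components = [
  { _id: 3, siteId: 1, details: 'help & support' },
  { _id: 4, siteId: 2, details: 'footer links' },
];
const maintenance = [{ _id: 5, siteId: 1 }];

// Hypothetical helper: mirrors the lookup -> filter -> match steps of the pipeline.
function componentsNotInMaintenance(components, sites, maintenance) {
  // Site ids that have at least one maintenance document.
  const inMaintenance = new Set(maintenance.map((m) => m.siteId));
  return components
    .map((c) => ({
      ...c,
      // Join the site, keeping it only when it has no maintenance entry.
      sites: sites.filter((s) => s._id === c.siteId && !inMaintenance.has(s._id)),
    }))
    // Drop components whose sites array ended up empty.
    .filter((c) => c.sites.length > 0);
}

const result = componentsNotInMaintenance(components, sites, maintenance);
console.log(JSON.stringify(result, null, 2));
```

With the sample data, only the component with _id 4 (site2) remains, which matches the expected output in the question.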
I want to track changes on MongoDB Documents. The big Challenge is that MongoDB has nested Documents.
Example
[
{
"_id": "60f7a86c0e979362a25245eb",
"email": "walltownsend#delphide.com",
"friends": [
{
"name": "Hancock Nelson"
},
{
"name": "Owen Dotson"
},
{
"name": "Cathy Jarvis"
}
]
}
]
after the update/change
[
{
"_id": "60f7a86c0e979362a25245eb",
"email": "walltownsend#delphide.com",
"friends": [
{
"name": "Daphne Kline" //<------
},
{
"name": "Owen Dotson"
},
{
"name": "Cathy Jarvis"
}
]
}
]
This is a very basic example of a highly expandable real-world use case.
On a SQL-based database, I would suggest a solution along these lines.
The SQL way
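users
---------------------------
_id | email
---------------------------
60f7a8b28db7c78b57bbc217 | cathyjarvis#delphide.com
---------------------------
friends
---------------------------
_id | user_id | name
---------------------------
0 | 60f7a8b28db7c78b57bbc217 | Hancock Nelson
---------------------------
1 | 60f7a8b28db7c78b57bbc217 | Suarez Burt
---------------------------
2 | 60f7a8b28db7c78b57bbc217 | Mejia Elliott
---------------------------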
after the update/change
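users
---------------------------
_id | email
---------------------------
60f7a8b28db7c78b57bbc217 | cathyjarvis#delphide.com
---------------------------
friends
---------------------------
_id | user_id | name
---------------------------
0 | 60f7a8b28db7c78b57bbc217 | Daphne Kline
---------------------------
1 | 60f7a8b28db7c78b57bbc217 | Suarez Burt
---------------------------
2 | 60f7a8b28db7c78b57bbc217 | Mejia Elliott
---------------------------
history
---------------------------
_id | friends_id | field | preUpdate | postUpdate
---------------------------
0 | 0 | name | Hancock Nelson | Daphne Kline
---------------------------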
If there is an update and the change has to be tracked before the next update, this would work for NoSQL as well. If there is a second update, we get a second row in the SQL history table, and it is very clear. On NoSQL, you can keep a list/array of full document versions and compare consecutive entries to find the changes, but that stores a lot of redundant information that has not changed.
Have a look at Set Expression Operators
$setDifference
$setEquals
$setIntersection
Beware: these operators perform set operations on arrays, treating the arrays as sets. Duplicate entries in an array are ignored, and so is the order of the elements.
In your example the update would result in
removed: [ {name: "Hancock Nelson" } ],
added: [ {name: "Daphne Kline" } ]
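As a rough illustration of how the removed and added lists come about, here is a plain JavaScript imitation of the $setDifference semantics described above (duplicates and order ignored). It is a sketch and not a MongoDB API: the function setDifference is hypothetical, and it compares elements by their JSON text, which assumes equal objects list their keys in the same order.

```javascript
// Plain JavaScript imitation of $setDifference; not a MongoDB API.
// Elements are compared by their JSON text, which assumes the keys of
// equal objects are written in the same order.
function setDifference(a, b) {
  const exclude = new Set(b.map((x) => JSON.stringify(x)));
  const seen = new Set();
  const out = [];
  for (const item of a) {
    const key = JSON.stringify(item);
    if (!exclude.has(key) && !seen.has(key)) {
      seen.add(key); // duplicates are ignored, as with sets
      out.push(item);
    }
  }
  return out;
}

const before = [{ name: 'Hancock Nelson' }, { name: 'Owen Dotson' }, { name: 'Cathy Jarvis' }];
const after = [{ name: 'Daphne Kline' }, { name: 'Owen Dotson' }, { name: 'Cathy Jarvis' }];

// Elements only in the old array were removed; elements only in the new array were added.
const removed = setDifference(before, after);
const added = setDifference(after, before);
console.log({ removed, added });
```

In the aggregation itself, the corresponding expressions would be $setDifference applied to the old and new arrays in both directions.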
If the number of elements is always the same before and after the update, then you could use this one:
db.collection.insertOne({
friends: [
{ "name": "Hancock Nelson" },
{ "name": "Owen Dotson" },
{ "name": "Cathy Jarvis" }
],
updated_friends: [
{ "name": "Daphne Kline" },
{ "name": "Owen Dotson" },
{ "name": "Cathy Jarvis" }
]
})
db.collection.aggregate([
{
$set: {
difference: {
$map: {
input: { $range: [0, { $size: "$friends" }] },
as: "i",
in: {
$cond: {
if: {
$eq: [
{ $arrayElemAt: ["$friends", "$$i"] },
{ $arrayElemAt: ["$updated_friends", "$$i"] }
]
},
then: null,
else: {
old: { $arrayElemAt: ["$friends", "$$i"] },
new: { $arrayElemAt: ["$updated_friends", "$$i"] }
}
}
}
}
}
}
},
{
$set: {
difference: {
$filter: {
input: "$difference",
cond: { $ne: ["$$this", null] }
}
}
}
}
])
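For readers who want to see what the two $set stages compute, here is a minimal in-memory sketch of the same position-by-position comparison in plain JavaScript. The helper positionalDiff is hypothetical, and the index field in each entry is an addition that the aggregation above does not produce.

```javascript
// Hypothetical helper mirroring the $map / $filter stages in plain JavaScript.
function positionalDiff(oldArr, newArr) {
  const changes = [];
  for (let i = 0; i < oldArr.length; i++) {
    // JSON text comparison assumes equal objects list their keys in the same order.
    if (JSON.stringify(oldArr[i]) !== JSON.stringify(newArr[i])) {
      // "index" is extra information, not part of the aggregation output.
      changes.push({ index: i, old: oldArr[i], new: newArr[i] });
    }
  }
  return changes;
}

const friends = [{ name: 'Hancock Nelson' }, { name: 'Owen Dotson' }, { name: 'Cathy Jarvis' }];
const updatedFriends = [{ name: 'Daphne Kline' }, { name: 'Owen Dotson' }, { name: 'Cathy Jarvis' }];

const difference = positionalDiff(friends, updatedFriends);
console.log(JSON.stringify(difference));
```

Like the aggregation, this only makes sense when both arrays have the same length and elements keep their positions.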
I have a USER table with documents:
{
_id: 1,
name: 'funny-guy43',
image: '../../../img1.jpg',
friends: [2, 3]
},
{
_id: 2,
name: 'SurfinGirl3',
image: '../../../img2.jpg',
friends: []
},
{
_id: 3,
name: 'FooBarMan',
image: '../../../img3.jpg',
friends: [2]
}
friends is an array of USER _ids. (1) I want to get user by _id, (2) look at his friends and (3) query the USER table with the friend ids to return all friends.
For example, find user 1, query the table based on his friends 2 and 3, and return 2 and 3.
Can I do that in one transaction? Or do I query the table to get the user's array of friends, then query the table again with the array of friend ids?
I'm using .Net Core if that matters.
I am very open to alternative approaches as well.
It is, in fact, possible to do this in one transaction. Or, to be more exact, in one aggregation.
I would first split the users into 2 different subsets, one called searched_user and the other other_users, where searched_user will have only the user we are searching for and other_users will have everyone else. We can do that using $facet. Here is the idea:
{
"$facet": {
"searched_user": [
{
$match: {
_id: 1
}
}
],
"other_users": [
{
$match: {
_id: {
$ne: 1
}
}
}
]
}
}
Once they are separated like this, we can search the other_users subset using the friend ids from the searched_user. So here is the full aggregation:
db.collection.aggregate([
{
"$facet": {
"searched_user": [
{
$match: {
_id: 1
}
}
],
"other_users": [
{
$match: {
_id: {
$ne: 1
}
}
}
]
}
},
{
"$unwind": "$searched_user"
},
{
$project: {
user_friends: {
$filter: {
input: "$other_users",
as: "other_users",
cond: {
$in: [
"$$other_users._id",
"$searched_user.friends"
]
}
}
}
}
}
])
Here we are looking for user 1 and the result will be user 1's friends.
[
{
"user_friends": [
{
"_id": 2,
"friends": [],
"image": "../../../img2.jpg",
"name": "SurfinGirl3"
},
{
"_id": 3,
"friends": [
2
],
"image": "../../../img3.jpg",
"name": "FooBarMan"
}
]
}
]
Playground: https://mongoplayground.net/p/-8pNnQXg8r6
You can achieve this by using $lookup in an aggregation. I tried it with MongoDB v4.2.11.
db.users.aggregate([
{
'$match': {
'_id': 1,
}
},
{
'$lookup': {
'from' : 'users',
'let' : {
'friendIds': '$friends',
},
'pipeline': [
{
'$match':{
'$expr': {'$in': [ '$_id', '$$friendIds']}
}
}
],
'as': 'friendsArr'
}
}
])
Result:
[
{
"_id" : 1,
"name" : "funny-guy43",
"image" : "../../../img1.jpg",
"friends" : [
2,
3
],
"friendsArr" : [
{
"_id" : 2,
"name" : "SurfinGirl3",
"image" : "../../../img2.jpg",
"friends" : [ ]
},
{
"_id" : 3,
"name" : "FooBarMan",
"image" : "../../../img3.jpg",
"friends" : [
2
]
}
]
}
]
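The two-step alternative mentioned in the question (fetch the user, then fetch the users whose _id is in the friends array) can be pictured with the following in-memory sketch. It is plain JavaScript over the sample documents, not driver code, and the helper name friendsOf is hypothetical.

```javascript
// Sample documents copied from the question.
const users = [
  { _id: 1, name: 'funny-guy43', image: '../../../img1.jpg', friends: [2, 3] },
  { _id: 2, name: 'SurfinGirl3', image: '../../../img2.jpg', friends: [] },
  { _id: 3, name: 'FooBarMan', image: '../../../img3.jpg', friends: [2] },
];

// Hypothetical helper: step 1 finds the user, step 2 selects every user
// whose _id appears in that user's friends array (the $in condition).
function friendsOf(users, userId) {
  const user = users.find((u) => u._id === userId);
  if (!user) return []; // unknown user: no friends to return
  const ids = new Set(user.friends);
  return users.filter((u) => ids.has(u._id));
}

const friendsOfUser1 = friendsOf(users, 1);
console.log(friendsOfUser1.map((f) => f.name));
```

Against a real collection, the second step would correspond to a query with an $in condition on _id, at the cost of a second round trip compared with the single $lookup above.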
I have a DB with news articles, and I am trying to do a little DB cleaning. I want to find all duplicate documents, and I think the best way to accomplish this is by using the url field. My documents are structured as follows:
{
_id:
author:
title:
description:
url:
urlToImage:
publishedAt:
content:
summarization:
source_id:
}
Any help is greatly appreciated
Assume a collection of documents with a name field (using name instead of url) that contains duplicate values. I have two aggregations that return output which can be used for further processing. I hope you will find this useful.
{ _id: 1, name: "jack" },
{ _id: 2, name: "john" },
{ _id: 3, name: "jim" },
{ _id: 4, name: "john" },
{ _id: 5, name: "john" },
{ _id: 6, name: "jim" }
Note that "john" has 3 occurrences and "jim" has 2.
(1) This aggregation returns the names which have duplicates (more than one occurrence):
db.collection.aggregate( [
{
$group: {
_id: "$name",
count: { $sum: 1 }
}
},
{
$group: {
_id: "duplicate_names",
names: { $push: { $cond: [ { $gt: [ "$count", 1 ] }, "$_id", "$DUMMY" ] } }
}
}
] )
The output:
{ "_id" : "duplicate_names", "names" : [ "john", "jim" ] }
(2) The following aggregation returns just the _id field values of the duplicate documents. For example, the name "jim" has _id values 3 and 6; the output contains only the ids of the documents after the first one in each group, i.e., 6.
db.collection.aggregate( [
{
$group: {
_id: "$name",
count: { $sum: 1 },
ids: { $push: "$_id" }
}
},
{
$group: {
_id: "duplicate_ids",
ids: { $push: { $slice: [ "$ids", 1, 9999 ] } }
}
},
{
$project: {
ids: {
$reduce: {
input: "$ids",
initialValue: [ ],
in: { $concatArrays: [ "$$this", "$$value" ] }
}
}
}
}
] )
The output:
{ "_id" : "duplicate_ids", "ids" : [ 6, 4, 5 ] }
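To show what both aggregations compute, here is a small in-memory sketch in plain JavaScript over the same sample documents. It groups ids by the key, then reports the keys seen more than once and every id after the first in each group. The helper findDuplicates is hypothetical and runs without MongoDB.

```javascript
// Sample documents copied from the answer.
const docs = [
  { _id: 1, name: 'jack' },
  { _id: 2, name: 'john' },
  { _id: 3, name: 'jim' },
  { _id: 4, name: 'john' },
  { _id: 5, name: 'john' },
  { _id: 6, name: 'jim' },
];

// Hypothetical helper: group ids by key, then report keys seen more than once
// and every id after the first one in each group.
function findDuplicates(docs, key) {
  const groups = new Map();
  for (const doc of docs) {
    const ids = groups.get(doc[key]) || [];
    ids.push(doc._id);
    groups.set(doc[key], ids);
  }
  const names = [];
  const extraIds = [];
  for (const [value, ids] of groups) {
    if (ids.length > 1) {
      names.push(value);
      extraIds.push(...ids.slice(1)); // keep the first document, flag the rest
    }
  }
  return { names, extraIds };
}

const { names, extraIds } = findDuplicates(docs, 'name');
console.log(names, extraIds);
```

The flagged ids are the same set as in the aggregation output (4, 5 and 6), although the order may differ, since $group does not guarantee an output order.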
I have the following collection, for example:
// vehicles collection
[
{
"_id": 321,
manufactor: SOME-OBJECT-ID
},
{
"_id": 123,
manufactor: ANOTHER-OBJECT-ID
},
]
And I have a collection named tables:
// tables collection
[
{
"_id": SOME-OBJECT-ID,
title: "Skoda"
},
{
"_id": ANOTHER-OBJECT-ID,
title: "Mercedes"
},
]
As you can see, the documents in the vehicles collection pull data from the documents in the tables collection: the first document in the vehicles collection has a manufactor id that refers to the tables document titled Skoda.
That is great.
When I query the DB using aggregate, I can easily pull the related data from the other collections without any problem.
I can also easily apply stages like $project, $sort, $skip, $limit and others.
But I want to display to the user only those vehicles that are manufactured by Mercedes.
Since the vehicles collection contains only the ID of Mercedes and not its name, a $text $search would not return the right results.
This is the aggregate pipeline that I provide:
[
{
$match: {
$text: {
$search: "Mercedes"
}
}
},
{
$lookup: {
from: "tables",
let: {
manufactor: "$manufactor"
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$_id", "$$manufactor"
]
}
}
},
{
$project: {
title: 1
}
}
],
as: "manufactor"
},
},
{
$unwind: "$manufactor"
},
{
$lookup: {
from: "tables",
let: {
model: "$model"
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$_id", "$$model"
]
}
}
},
{
$project: {
title: 1
}
}
],
as: "model"
},
},
{
$unwind: "$model"
},
{
$lookup: {
from: "users",
let: {
joined_by: "$_joined_by"
},
pipeline: [
{
$match: {
$expr: {
$eq: [
"$_id", "$$joined_by"
]
}
}
},
{
$project: {
personal_info: 1
}
}
],
as: "joined_by"
},
},
{
$unwind: "$joined_by"
}
]
As you can see, I use the $text $search $match as the first stage in the pipeline; otherwise MongoDB throws an error.
But this $text $search only searches the origin collection, the vehicles collection.
Is there a way to tell MongoDB to run the $text $search against the remote collection and then keep in the aggregate only the results that match both?
UPDATE
When I am doing this instead:
{
$lookup: {
from: "tables",
pipeline: [
{
$match: {
$text: {
$search: "Mercedes"
}
}
},
{
$project: {
title: 1
}
}
],
as: "manufactor"
},
},
This is what I receive:
MongoError: pipeline requires text score metadata, but there is no text score available
If you are using one of the affected versions mentioned in this thread, you need to update your MongoDB server.
As you can see, the issue was fixed in version 4.1.8.
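Independently of the server version, the goal in the question (keep only vehicles whose referenced tables document matches a search term) can be pictured as "search the referenced collection first, then filter by the matching ids". The sketch below does this in memory in plain JavaScript. The helper vehiclesByManufacturerTitle is hypothetical, the ids are the placeholders from the question, and the case-insensitive substring test is only a simple stand-in for $text search, not an equivalent.

```javascript
// Placeholder ids stand in for the ObjectIds elided in the question.
const tables = [
  { _id: 'SOME-OBJECT-ID', title: 'Skoda' },
  { _id: 'ANOTHER-OBJECT-ID', title: 'Mercedes' },
];
const vehicles = [
  { _id: 321, manufactor: 'SOME-OBJECT-ID' },
  { _id: 123, manufactor: 'ANOTHER-OBJECT-ID' },
];

// Hypothetical helper: search the referenced collection first, then keep
// only vehicles pointing at a matching document.
function vehiclesByManufacturerTitle(vehicles, tables, term) {
  const needle = term.toLowerCase();
  // Substring matching is an assumption; $text search tokenizes and stems instead.
  const matchingIds = new Set(
    tables.filter((t) => t.title.toLowerCase().includes(needle)).map((t) => t._id)
  );
  return vehicles.filter((v) => matchingIds.has(v.manufactor));
}

const mercedesVehicles = vehiclesByManufacturerTitle(vehicles, tables, 'Mercedes');
console.log(mercedesVehicles);
```

With the sample data this keeps only the vehicle with _id 123.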
In MongoDB, I am trying to write a query with two input arrays, Bills and Names, where the first contains bill ids and the second contains the names of persons. Bills at index i and Names at index i together describe the actual document that we want to search for in MongoDB.
I want to write a query such that Bills[i] = db_billid && Names[i] = db_name, which means I want to return the documents where both billid and name match at the same index.
I thought of using $in, but although I can apply $in to Bills, I don't know at which index a given billid was found.
{ $and: [{ billid: { $in: Bills } }, {name: Names[**index at which this bill is found]}] }
Can anyone please help me solve this?
MongoDB Schema
var transactionsschema = new Schema({
transactionid: {type: String},
billid: {type: String},
name: {type: String}
});
Sample documents in MongoDB
{ _id: XXXXXXXXXXX, transactionid: 1, billid: "bnb1234", name: "sudhanshu" },
{ _id: XXXXXXXXXXX, transactionid: 2, billid: "bnb1235", name: "michael" },
{ _id: XXXXXXXXXXX, transactionid: 3, billid: "bnb1236", name: "Morgot" }
Sample arrays
Bills = ["bill1", "bill2", "bill3"], Names = ["name1", "name2", "name3"]
Edit - If $in can work in array of objects then I can have array of object with keys as billid and name
var arr = [{ billid: "bill1", "name": "name1"}, {billid: "bill2", "name": "name2"}, {billid: "bill3", "name": "name3"}]
But then how can I write the query below?
{ $and: [{ billid: { $in: arr.Bills } }, {name: arr.Names}] }
Assuming that bills are unique but names may contain duplicates, you can use $indexOfArray to get the index of the matching bill, then use $arrayElemAt to read the names array at that index and compare it with the document's name. You also have to check that the value returned by $indexOfArray is not equal to -1 (no match).
var bills = ["bnb1234", "bnb1235"];
var names = ["sudhanshu", "michael"];
db.col.aggregate([
{
$match: {
$expr: {
$and: [
{ $ne: [{ $indexOfArray: [ bills, "$billid" ] }, -1] },
{ $eq: ["$name", { $arrayElemAt: [ names, { $indexOfArray: [ bills, "$billid" ] } ] }] },
]
}
}
}
])
Alternatively, $let can be used to avoid the duplication:
db.col.aggregate([
{
$match: {
$expr: {
$let: {
vars: { index: { $indexOfArray: [ bills, "$billid" ] } },
in: { $and: [ { $ne: [ "$$index", -1 ] }, { $eq: [ "$name", { $arrayElemAt: [ names, "$$index" ] }] }]}
}
}
}
}
])
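Another way to express "billid and name must match at the same index", not taken from the answer above, is to pair the two arrays up front and build a filter with one { billid, name } condition per pair, combined with $or. The sketch below only constructs that filter object in plain JavaScript. The helper buildPairFilter is hypothetical, and the empty-input guard is there because an $or with an empty array is not expected to be accepted by the server.

```javascript
// Hypothetical helper: pairs bills[i] with names[i] and builds a filter of the
// form { $or: [ { billid, name }, ... ] } that could be passed to find().
function buildPairFilter(bills, names) {
  if (bills.length !== names.length) {
    throw new Error('bills and names must have the same length');
  }
  if (bills.length === 0) {
    throw new Error('at least one pair is required');
  }
  return { $or: bills.map((billid, i) => ({ billid, name: names[i] })) };
}

const bills = ['bnb1234', 'bnb1235'];
const names = ['sudhanshu', 'michael'];

const filter = buildPairFilter(bills, names);
console.log(JSON.stringify(filter));
// Possible usage against the collection from the answer: db.col.find(filter)
```

Unlike the $expr version, each branch here is a plain equality match on billid and name, and it does not rely on bills being unique.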