How can I omit elements from arrays inside nested documents with mongo?

I have the following structure for a collection in MongoDB
{
'_id': 45
'tags': [ 'tag 1', 'tag 3' ]
'active': true
'fields': [
{ 'name': 'common field 1', 'type': 'text', 'value': 'some text', ... },
{ 'name': 'common field 2', ... },
{ 'name': 'multivalued field 1',
'type': 'multifield',
'valueCount': 5,
'value': [
{ 'name': 'subfield1', ..., 'value': [1, 2, 3, 4, 5]},
{ 'name': 'subfield2', ..., 'value': ["one", "two", "three", "four", "five"]},
{ 'name': 'subfield3', ..., 'value': ["here", "there", "", "", ""]}
], ... }
]
}
and I am trying to implement projection in my API: for example, if the user requests
api/collection/?fields=id,fields{common field 2, multifield{subfield1}}
The result should be
{
'_id': 45
'fields': [
{ 'name': 'common field 1', 'type': 'text', 'value': 'some text', ... },
{ 'name': 'multivalued field 1',
'type': 'multifield',
'valueCount': 5,
'value': [
{ 'name': 'subfield1', ..., 'value': [1, 2, 3, 4, 5]},
], ... }
]
}
Since the 'fields' names are not actual keys, I cannot use mongo projection, say
db.collection.find({},{_id: 1, tags: 1, fields.'common field 1': 1})
So I must instead search within the array for the fields whose "name" property matches my projection parameter. I achieved that for the first level array with aggregation and $redact, as suggested in this answer https://stackoverflow.com/a/24032549/5418731
db.points.aggregate([
{ $match: {}},
{
$project: {_id :1, fields: 1}
},
{ $redact : {
$cond: {
if: { $or : [{ $not : "$name" }, { $eq: ["$name", "common field 1"] }]},
then: "$$DESCEND",
else: "$$PRUNE"
}
}}])
However, I cannot use $redact to select subfields from the inner arrays in multivalued fields. The $or parameters would have to be something like
[{ $not : "$name" }, { $eq: ["$name", "common field 1"] }, { $eq: ["$name", "subfield1"] }]
which means a first-level field with the same name as the specified subfield would also pass.
After upgrading MongoDB to 3.2, I attempted the $filter solution in this answer https://stackoverflow.com/a/12241930/5418731, which also works fine for the first level array:
db.points.aggregate([
{$project: {
fields: {$filter: {
input: '$fields',
as: 'field',
cond: {$eq: ['$$field.name', 'multivalued field 1']}
}}
}}
])
but I couldn't find a way to use it "nested" and filter the second level array items. Adding {$eq: ['$$field.value.name', 'subfield1']} doesn't work.
Last, I tried the $map solution presented here https://stackoverflow.com/a/24156418/5418731:
db.points.aggregate([
{ "$project": {
"_id": 1,
"fields": {
"$map": {
"input": "$fields",
"as": "f",
"in": {
"$ifNull": [
{
"name": "$multivalued field 1",
"type": "$multifield", //attempt to restrict search to fields with arrays as values
"value": {
"$map": {
"input": "$$f.value",
"as": "v",
"in": {
"$ifNull": [
{ "name": "$subfield1"},
false
]
}
}
}
},
false
]
}
}
}
}}
])
But this one won't work because the "value" property of each "fields" item is not necessarily an array, and when it isn't the whole query fails.
I'm about to give up and mask the results in JS. Is there a good solution for that with Mongo?

Is there a typo in your sample request? You're querying for documents with
fields where name is "common field 2", but you're expecting "common field 1" in your response.
Also, instead of making this so complex, you can simply use the aggregation pipeline and proceed in the following manner:
First $unwind on the fields array.
Then $match fields where fields.name = "multivalued field 1" and fields.type = "multifield".
Then $unwind on the value array.
Finally $match the fields where fields.value.name = "subfield1".
Note that each result document then holds a single unwound field and subfield rather than arrays.
Something like this:
db.points.aggregate([
// One document per entry of the fields array
{ $unwind: "$fields" },
// Keep only the multivalued field
{ $match: { "fields.name": "multivalued field 1", "fields.type": "multifield" } },
// One document per subfield of that field
{ $unwind: "$fields.value" },
// Keep only the requested subfield
{ $match: { "fields.value.name": "subfield1" } }
]);
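If the original document shape has to be preserved, a nested $filter is another option. The following is only a sketch under assumptions: it assumes MongoDB 3.6 or later (for $mergeObjects, and $in as an aggregation expression), the helper name buildFieldsProjection is made up for illustration, and it applies the subfield list to every array-valued "value" rather than to one specific multifield. The $isArray guard is there because "value" is not always an array.

```javascript
// Hypothetical helper: builds a $project stage that keeps only the requested
// entries of "fields" and, where "value" is an array, only the requested subfields.
function buildFieldsProjection(fieldNames, subfieldNames) {
  return {
    $project: {
      fields: {
        $map: {
          // First level: keep only entries whose name was requested.
          input: {
            $filter: {
              input: "$fields",
              as: "f",
              cond: { $in: ["$$f.name", fieldNames] },
            },
          },
          as: "f",
          in: {
            // Copy the entry as-is and override only its "value" property.
            $mergeObjects: [
              "$$f",
              {
                value: {
                  $cond: [
                    { $isArray: "$$f.value" },
                    // Second level: filter the inner array by subfield name.
                    {
                      $filter: {
                        input: "$$f.value",
                        as: "v",
                        cond: { $in: ["$$v.name", subfieldNames] },
                      },
                    },
                    // Non-array values (text fields and so on) pass through unchanged.
                    "$$f.value",
                  ],
                },
              },
            ],
          },
        },
      },
    },
  };
}

const stage = buildFieldsProjection(
  ["common field 2", "multivalued field 1"],
  ["subfield1"]
);
```

In the shell this would be used as db.points.aggregate([stage]); _id is included by $project by default.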

Related

How to use $getfield to get a field from ROOT Document with condition in Aggregation Mongodb

I'm starting to learn aggregation in MongoDB. I have a simple document as below, which has 2 fields, name and examScores; examScores is an array that contains multiple documents:
{ _id: ObjectId("633199db009be219a43ae426"),
name: 'Max',
examScores:
[ { difficulty: 4, score: 57.9 },
{ difficulty: 6, score: 62.1 },
{ difficulty: 3, score: 88.5 } ] }
{ _id: ObjectId("633199db009be219a43ae427"),
name: 'Manu',
examScores:
[ { difficulty: 7, score: 52.1 },
{ difficulty: 2, score: 74.3 },
{ difficulty: 5, score: 53.1 } ] }
Now I query the maximum score of each person using $unwind and $group/$max as below:
db.test.aggregate([
{$unwind: "$examScores"},
{$group: {_id: {name: "$name"}, maxScore: {$max: "$examScores.score"}}}
])
{ _id: { name: 'Max' }, maxScore: 88.5 }
{ _id: { name: 'Manu' }, maxScore: 74.3 }
But I want the result to also contain the examScores.difficulty field corresponding to name and examScores.score, like below:
{ _id: { name: 'Max' }, difficulty: 3, maxScore: 88.5 }
{ _id: { name: 'Manu' }, difficulty: 2, maxScore: 74.3 }
I know that I can use $sort + $group and $first to achieve this goal. But I want to use $getField or any other methods to get data from ROOT Doc.
My idea is to use $project and $getField to get the difficulty field from the ROOT doc (or the $unwind version of the ROOT doc) with a condition like ROOT.name = Aggregate.name and ROOT.examScores.score = Aggregate.maxScore.
It will look something like this:
{$project:
{name: 1,
maxScore: 1,
difficulty:
{$getField: {
field: "$examScores.difficulty"
input: "$$ROOT.$unwind() with condition/filter"}
}
}
}
I wonder if this is possible in MongoDB?
Solution 1
$unwind
$group - Group by name. You need $push to add the $$ROOT document into data array.
$project - Set the difficulty field by getting the value of examScores.difficulty from the first item of the filtered data array by matching the examScores.score with maxScore.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$group: {
_id: {
name: "$name"
},
maxScore: {
$max: "$examScores.score"
},
data: {
$push: "$$ROOT"
}
}
},
{
$project: {
_id: 0,
name: "$_id.name",
maxScore: 1,
difficulty: {
$getField: {
field: "difficulty",
input: {
$getField: {
field: "examScores",
input: {
$first: {
$filter: {
input: "$data",
cond: {
$eq: [
"$$this.examScores.score",
"$maxScore"
]
}
}
}
}
}
}
}
}
}
}
])
Demo Solution 1 @ Mongo Playground
Solution 2: $rank
$unwind
$setWindowFields - Rank the documents with $rank, partitioned by name and sorted by examScores.score descending.
$match - Filter the document with { rank: 1 }.
$unset - Remove rank field.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$setWindowFields: {
partitionBy: "$name",
sortBy: {
"examScores.score": -1
},
output: {
rank: {
$rank: {}
}
}
}
},
{
$match: {
rank: 1
}
},
{
$unset: "rank"
}
])
Demo Solution 2 @ Mongo Playground
Opinion: I would say this approach:
$sort by examScores.score descending
$group by name, take the first document
would be much easier.
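As a rough sketch of that simpler approach: the pipeline below unwinds the scores, sorts them by score descending, and takes the first entry per name. The variable name pipeline is only illustrative, and if two entries tie for the top score, whichever sorts first wins.

```javascript
// Sketch of the $sort + $group/$first approach.
// Usage in the shell: db.collection.aggregate(pipeline)
const pipeline = [
  // One document per exam score entry.
  { $unwind: "$examScores" },
  // Highest score first, so $first below picks the maximum.
  { $sort: { "examScores.score": -1 } },
  {
    $group: {
      _id: { name: "$name" },
      difficulty: { $first: "$examScores.difficulty" },
      maxScore: { $first: "$examScores.score" },
    },
  },
];
```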
There's no need to $unwind and then rebuild the documents again via $group to achieve your desired results. I'd recommend avoiding that altogether.
Instead, consider processing the arrays inline using array expression operators. Depending on the version and exact results you are looking for, here are two starting points that may be worth considering. In particular the $maxN operator and the $sortArray operator may be of interest for this particular question.
You can get a sense for what these two operators do by running an $addFields aggregation to see their output, playground here.
With those as a starting point, it's really up to you to make the pipeline output the desired result. Here is one such example that matches the output you described in the question pretty well (playground):
db.collection.aggregate([
{
"$addFields": {
"relevantEntry": {
$first: {
$sortArray: {
input: "$examScores",
sortBy: {
"score": -1
}
}
}
}
},
},
{
"$project": {
_id: 0,
name: 1,
difficulty: "$relevantEntry.difficulty",
maxScore: "$relevantEntry.score"
}
}
])
Which yields:
[
{
"difficulty": 3,
"maxScore": 88.5,
"name": "Max"
},
{
"difficulty": 2,
"maxScore": 74.3,
"name": "Manu"
}
]
Also worth noting that this particular approach doesn't do anything special if there are duplicates. You could look into using $filter if something more was needed in that regard.
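If ties do matter, one possible variation (a sketch, not run against the asker's data) keeps every entry whose score equals the array maximum. The names tiesPipeline and topEntries are made up for illustration.

```javascript
// Sketch: keep all examScores entries that share the maximum score.
// Usage in the shell: db.collection.aggregate(tiesPipeline)
const tiesPipeline = [
  {
    $project: {
      _id: 0,
      name: 1,
      topEntries: {
        $filter: {
          input: "$examScores",
          as: "entry",
          // $max over "$examScores.score" yields the highest score in the array.
          cond: { $eq: ["$$entry.score", { $max: "$examScores.score" }] },
        },
      },
    },
  },
];
```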

How to project only matched array item in mongodb aggregation?

I have a collection called shows, with documents as:
{
"url": "http://www.tvmaze.com/shows/167/24",
"name": "24",
"genres": [
"Drama",
"Action"
],
"runtime": 60
},
{
"url": "http://www.tvmaze.com/shows/4/arrow",
"name": "Arrow",
"genres": [
"Drama",
"Action",
"Science-Fiction"
],
"runtime": 60
}
I wanted to search shows with genre 'Action' and project the result array as
{
"url": "http://www.tvmaze.com/shows/167/24",
"name": "24",
"genres": [
"Action" // I want only the matched item in
//my result array
],
"runtime": 60
} , //same for the second doc as well
If I use
db.shows.find({genres:'Action'}, {'genres.$': 1});
It works but the same does not work in aggregate method with $project
Shows.aggregate([
{
$match: { 'genres': 'Action'}
},
{
$project: {
_id: 0,
url: 1,
name: 1,
runtime: 1,
'genres.$': 1
}
}
]);
this is the error I get on this aggregate query
Invalid $project :: caused by :: FieldPath field names may not start with '$'."
db.collection.aggregate([
{
$match: {
"genres": {
// the pattern is passed as a plain string, without the surrounding slashes
$regex: "^action",
$options: "im"
}
}
},
{
$project: {
_id: 0,
url: 1,
name: 1,
runtime: 1,
// keep only the array items that match the same pattern
genres: {
$filter: {
input: "$genres",
as: "genre",
cond: {
$regexMatch: {
input: "$$genre",
regex: "^action",
options: "im"
}
}
}
}
}
}
])
Here is how I solved it. Thanks @turivishal for the help.

Searching for another way of displaying object of object in Mongodb

I have data that looks like:
[
{
'_id': ObjectId('589ba2fb2742a35b47dad21c'),
'name': 'Iphone7',
'price': 14500,
'category': 'Phone',
'vendor': 'Apple',
'stock': [
10,
40,
],
'quantity': 10,
},
{
'_id': ObjectId('589ba2fb2742a35b47dad21d'),
'name': 'Samaung TV',
'price': 6500,
'category': 'TV',
'vendor': {
'name': 'Samaung',
'phone': '01061202200',
},
'stock': [
5,
70,
80,
34,
],
'quantity': 5,
},
];
I could get the second document, whose "vendor" contains a phone, like this:
db.products.find({"vendor.phone": {"$exists": true}}).pretty()
I'm looking for another way to get only the documents whose vendor contains a "phone" value. I'm new to Mongo. Thanks in advance.
I would argue $exists is the best. However, if for some reason you insist on not using it, you could use $type to find only documents where vendor.phone is of a certain type.
Under the assumption that all phone numbers are type string you could use this query:
db.collection.find({
"vendor.phone": {
$type: 2
}
})
If vendor.phone can be one of multiple types, you'll have to use an $or query to cover all those types, like so (in this example types 1 and 2 represent double and string):
db.collection.find({
$or: [
{
"vendor.phone": {
$type: 1
}
},
{
"vendor.phone": {
$type: 2
}
}
]
})
Mongo Playground
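On MongoDB 3.6 or later, $type also accepts an array of types, so the $or above can likely be collapsed into a single condition. A minimal sketch, where the variable name query is illustrative and the string aliases "double" and "string" stand for types 1 and 2:

```javascript
// Assumes MongoDB 3.6+, where $type accepts an array of BSON types.
// Matches documents whose vendor.phone is either a double or a string.
const query = { "vendor.phone": { $type: ["double", "string"] } };
// Usage in the shell: db.collection.find(query)
```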

Using $size and $addToSet to compare multiple arrays from the same cluster

Here is an example of the kind of documents I'm querying:
INPUT:
{
"_id": ObjectId("2786872873872"),
"data_shop" : {
"records_data" : [
{
"artist_name" : [
{
"val" : "BEYONCE",
},
],
"album_name" : [
{
"val" : "COUNTDOWN",
}
],
"qty" : [
0,
1,
2,
3
]
},
{
"artist_name" : [
{
"val" : "MUSE",
},
],
"album_name" : [
{
"val" : "THE RESISTANCE",
}
],
"qty" : [
0,
1,
2,
3,
3
]
}
],
},
}
{
"_id": ObjectId("2786872855555"),
"data_shop" : {
"records_data" : [
{
"artist_name" : [
{
"val" : "MAC MILLER",
},
],
"album_name" : [
{
"val" : "SWIMMING",
}
],
"qty" : [
0,
1,
2,
3,
]
},
{
"artist_name" : [
{
"val" : "DAFT PUNK",
},
],
"album_name" : [
{
"val" : "RANDOM ACCESS MEMORIES",
}
],
"qty" : [
0,
1,
2,
3,
4,
]
}
],
},
}
What I've done so far:
I'm trying to use both $size and $addToSet in order to return the ObjectIds that have repeated numbers in the qty field. As you can see, only the first ObjectId has a repeated number (3) in the qty field.
This is what I've done so far:
db.mycollection.aggregate(
[
{$match: {"data_shop.records_data.qty.1": {$lte: 1}}},
{
$project: {Album_Cluster:"$data_shop.records_data"}
},
{
$unwind: "$Album_Cluster"
},
{
$project: {qty: "$Album_Cluster.qty"},
},
{
$project: {qty_size: {$size: "$qty"}, qty:1}
},
{ $match: {"qty_size.1": {$exists: false}, qty_size: {$gt: 1} }},
{$group:
{_id: "$_id",
totalSize: {$push: "$qty_size"},
realSize: {$addToSet: "$qty"},
}
},
],
{allowDiskUse: true}
)
And this is the result of the query above, in order to check the functionality of the query:
{"_id":ObjectId("2786872873872"), "totalSize": [4, 5], "realSize":[[0, 1, 2, 3]]}
{"_id":ObjectId("2786872855555"), "totalSize": [4, 5], "realSize":[[0, 1, 2, 3], [0, 1, 2, 3, 4]]}
I'm a little bit stuck at this part since I want to compare the total size of each array versus the real size of the array (by real size I mean non-repeating numbers)
OUTPUT
This is how the output of the query should look like:
{"_id":ObjectId("2786872873872"), "isRepeating": true}
{"_id":ObjectId("2786872855555"), "isRepeating": false}
EDIT:
I've improved my query in order to get this output schema:
db.mycollection.aggregate(
[
{$match: {"data_shop.records_data.qty.1": {$lte: 1}}},
{
$project: {Album_Cluster:"$data_shop.records_data"}
},
{
$unwind: "$Album_Cluster"
},
{
$project: {qty: "$Album_Cluster.qty"},
},
{
$project: {qty_size: {$size: "$qty"}, qty:1}
},
{$group:
{
_id: "$_id",
totalSize: {$addToSet: "$qty_size"},
realSize: {$addToSet: "$qty"},
}
},
{$unwind: "$realSize"},
{
$project:
{
totalSize:1,
real_count: {$size: "$realSize"}
}
},
{$unwind: "$totalSize"},
{
$group: {
_id: "$_id",
total_size: {$addToSet: "$totalSize"},
real_size: {$addToSet: "$real_count"}
}
},
],
{allowDiskUse: true}
)
And now I'm getting this as my output:
{"_id":ObjectId("2786872873872"), "total_size": [4, 5], "real_size":[4]}
{"_id":ObjectId("2786872855555"), "total_size": [4, 5], "real_size":[5, 4]}
Now my question is, does $in allow me to validate that [4, 5] is valid in [5, 4] so my output will be isRepeating = false?
Although you can achieve this with a stack of $unwind / $group stages, it can be very expensive in terms of resource consumption.
Unfortunately, the $addToSet operator is an accumulator, available in stages such as $group but not as an array expression inside $project.
But there's a trick with the $setUnion operator.
$setUnion performs set operation on arrays, treating arrays as sets. If an array contains duplicate entries, $setUnion ignores the duplicate entries.
Knowing this, performing a $setUnion on your qty array, without any other array, will simply remove the duplicates.
Here's an implementation of this approach, using only two $project stages:
db.collection.aggregate([
{
$project: {
"data_shop.records_data": {
$map: {
input: "$data_shop.records_data",
as: "data",
in: {
qty_size: {
$size: "$$data.qty"
},
qty_size_unique: {
$size: {
$setUnion: [
"$$data.qty"
]
}
}
}
}
}
}
},
{
$project: {
isRepeating: {
$cond: {
if: {
$eq: [
"$data_shop.records_data.qty_size",
"$data_shop.records_data.qty_size_unique"
]
},
then: false,
else: true
}
}
}
}
])
It will return the expected output :
[
{
"_id": 1,
"isRepeating": true
},
{
"_id": 2,
"isRepeating": false
}
]
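The idea behind the $setUnion trick (compare an array's length with its number of distinct values) can be illustrated in plain JavaScript. This is only a sketch of the logic, not something that runs inside MongoDB; the helper names hasRepeats and isRepeating are made up, and the documents are trimmed versions of the samples in the question.

```javascript
// Hypothetical helper: true when an array holds at least one repeated value.
function hasRepeats(values) {
  // A Set keeps only distinct values, so a smaller size means duplicates.
  return new Set(values).size !== values.length;
}

// Hypothetical helper: true when any qty array in the document repeats a value.
function isRepeating(doc) {
  return doc.data_shop.records_data.some((record) => hasRepeats(record.qty));
}

// Trimmed versions of the two sample documents from the question.
const first = {
  data_shop: { records_data: [{ qty: [0, 1, 2, 3] }, { qty: [0, 1, 2, 3, 3] }] },
};
const second = {
  data_shop: { records_data: [{ qty: [0, 1, 2, 3] }, { qty: [0, 1, 2, 3, 4] }] },
};

console.log(isRepeating(first), isRepeating(second)); // true false
```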

mongodb - using join on a local variable

I'm using node.js and mongodb, I have an array of objects which holds the names of an id. Let's say below is my array
let names = [
{ value: 1, text: 'One' },
{ value: 2, text: 'Two' },
{ value: 3, text: 'Three' },
{ value: 4, text: 'Gour' }
]
And this is my query result of a collection using $group which gives me the distinct values.
[
{ _id: { code: '1', number: 5 } },
{ _id: { code: '2', number: 5 } },
{ _id: { code: '3', number: 2 } },
{ _id: { code: '4', number: 22 } },
]
$lookup lets us join the data from a different collection, but in my case I have an array which holds the text value for each of the codes which I got from the query.
Is there a way we can map the text from the array to the results from mongodb?
Any help will be much appreciated.
EDIT
MongoDB query which I was trying
db.collection.aggregate([
{
$match: {
_Id: id
}
},
{
$lookup: {
localField: "code",
from: names,
foreignField: "value",
as: "renderedNames"
}
},
{
"$group" : {
"_id": {
code: "$code",
number: "$number"
}
}
}
]);
The local variable lives in the Node.js app, and MongoDB knows nothing about it.
It looks like it belongs to the presentation layer, where you want to show codes as meaningful names. The mapping should be done there. I believe find is the most suitable here (note that value is a number in names while code is a string in the query result, hence the conversion):
names.find(name => String(name.value) === doc._id.code).text
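For more than a handful of results, the same mapping could be done with a lookup table built once. A minimal sketch, assuming the names array and query results shown in the question; textByCode and rendered are illustrative names, and unknown codes are mapped to null here as an arbitrary choice.

```javascript
const names = [
  { value: 1, text: 'One' },
  { value: 2, text: 'Two' },
  { value: 3, text: 'Three' },
  { value: 4, text: 'Gour' },
];

// Shape of the $group output from the question.
const results = [
  { _id: { code: '1', number: 5 } },
  { _id: { code: '2', number: 5 } },
  { _id: { code: '3', number: 2 } },
  { _id: { code: '4', number: 22 } },
];

// Index names by stringified value, because codes are strings in the results.
const textByCode = new Map(names.map((n) => [String(n.value), n.text]));

// Attach the matching text to each result; null when no name is known.
const rendered = results.map((doc) => ({
  ...doc,
  text: textByCode.get(doc._id.code) ?? null,
}));

console.log(rendered);
```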
If the names are not truly variable but fairly constant, you can move them to their own collection, e.g. codeNames:
db.codeNames.insert([
{ _id: "1", text: 'One' },
{ _id: "2", text: 'Two' },
{ _id: "3", text: 'Three' },
{ _id: "4", text: 'Gour' }
]);
and use $lookup as follows:
db.collection.aggregate([
{
$match: {
_Id: id
}
},
{
"$group" : {
"_id": {
code: "$code",
number: "$number"
}
}
},
{
$lookup: {
localField: "_id.code",
from: "codeNames",
foreignField: "_id",
as: "renderedNames"
}
}
]);
If none of the above suits your use case, you can pass the names to the database in each request and map them on the database side, but you should be really sure you cannot use the two previous options:
db.collection.aggregate([
{
$match: {
_Id: id
}
},
{
"$group" : {
"_id": {
code: "$code",
number: "$number"
}
}
},
{
$project: {
renderedNames: { $filter: {
input: [
{ value: "1", text: 'One' },
{ value: "2", text: 'Two' },
{ value: "3", text: 'Three' },
{ value: "4", text: 'Gour' }
],
as: "name",
cond: { $eq: [ "$$name.value", "$_id.code" ] }
}
}
}
},
]);
As a side note, I find $match: {_Id: id} quite confusing, especially followed by $group. If _Id is _id, it is unique. You can have no more than 1 document after this stage, so there is not too much to group really.
