Parsing a difficult MongoDB field (with multi-level array)

Hello Experts,
I am trying to parse a document from a MongoDB collection. After using $unwind,
one of the remaining fields looks like this:
[
{
"account_id": "1234",
"cities": {
"cityname1": {
"param1": 1,
"param2": 2
}
}
},
{
"account_id": "2345",
"cities": {
"cityname2": {
"param1": 3,
"param2": 3
}
}
},
{
"account_id": "3456",
"cities": {
"cityname3": {
"param1": 8,
"param2": 6
}
}
}
]
Now, I would like to continue parsing this field so that I can extract the field names and values for account_id, param1 and param2, and then sum the param1 and param2 values.
However, when I use a second $unwind, those fields come back with a "null" value.
How should I parse this field correctly?

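cities is an embedded document keyed by city name, not an array, so $unwind has nothing to expand. Convert it to an array first:
$set: convert cities into an array of { k, v } pairs with $objectToArray
$unwind: unwind the resulting cities array
$group: group by account_id and cityname, then sum param1 and param2
(to group by only account_id or only cityname, remove the other key from _id)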
db.collection.aggregate([
{
"$set": {
cities: {
"$objectToArray": "$cities"
}
}
},
{
"$unwind": "$cities"
},
{
"$group": {
"_id": {
account_id: "$account_id",
"cityname": "$cities.k"
},
"sumOfParam1": {
"$sum": "$cities.v.param1"
},
"sumOfParam2": {
"$sum": "$cities.v.param2"
}
}
}
])
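To make the three stages easier to follow, here is a minimal plain-JavaScript sketch of the same logic, run against the sample documents from the question. The helper name sumParamsByAccountAndCity is hypothetical and not part of MongoDB; it only mirrors what $objectToArray, $unwind and $group do, under the assumption that every param value is numeric.

```javascript
// Sample documents copied from the question.
const docs = [
  { account_id: "1234", cities: { cityname1: { param1: 1, param2: 2 } } },
  { account_id: "2345", cities: { cityname2: { param1: 3, param2: 3 } } },
  { account_id: "3456", cities: { cityname3: { param1: 8, param2: 6 } } },
];

// Hypothetical helper: mirrors $objectToArray + $unwind + $group in plain JS.
function sumParamsByAccountAndCity(documents) {
  const totals = new Map();
  for (const doc of documents) {
    // Object.entries plays the role of $objectToArray (key/value pairs);
    // looping over the pairs plays the role of $unwind.
    for (const [cityname, v] of Object.entries(doc.cities ?? {})) {
      // Composite group key, like the _id document in $group.
      const key = JSON.stringify([doc.account_id, cityname]);
      const group = totals.get(key) ?? {
        _id: { account_id: doc.account_id, cityname },
        sumOfParam1: 0,
        sumOfParam2: 0,
      };
      group.sumOfParam1 += v.param1 ?? 0;
      group.sumOfParam2 += v.param2 ?? 0;
      totals.set(key, group);
    }
  }
  return [...totals.values()];
}

const grouped = sumParamsByAccountAndCity(docs);
console.log(JSON.stringify(grouped, null, 2));
```

With this sample data every account has a single city, so each group simply carries that city's values. The sums only become interesting once the same account_id and cityname pair appears in more than one document.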

Related

MongoDB Aggregation: How to return only the values that don't exist in all documents

Let's say I have an array ['123', '456', '789'].
I want to aggregate over every document that has a books field and return only the values that are NOT in any document. For example, if '123' is in a document and '456' is too, but '789' is not, the result should be the array ['789'], since that value is not included in the books field of any document.
.aggregate( [
{
$match: {
books: {
$in: ['123', '456', '789']
}
}
},
I don't want the documents returned, but just the actual values that are not in any documents.
Here's one way to scan the entire collection to look for missing book values.
db.collection.aggregate([
{ // "explode" books array to docs with individual book values
"$unwind": "$books"
},
{ // scan entire collection creating set of book values
"$group": {
"_id": null,
"allBooksSet": {
"$addToSet": "$books" // <-- generate set of book values
}
}
},
{
"$project": {
"_id": 0, // don't need this anymore
"missing": { // use $setDifference to find missing values
"$setDifference": [
[ "123", "456", "789" ], // <-- your values go here
"$allBooksSet" // <-- the entire collection's set of book values
]
}
}
}
])
Example output:
[
{
"missing": [ "789" ]
}
]
Try it on mongoplayground.net.
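For readers who want to check the logic without a database, the following plain-JavaScript sketch mirrors the pipeline above. The sample collection and the helper name findMissingBooks are assumptions made for illustration; only the books field and the three wanted values come from the question.

```javascript
// Hypothetical sample collection; only the books field matters here.
const collection = [
  { _id: 1, books: ["123", "111"] },
  { _id: 2, books: ["456"] },
  { _id: 3 }, // no books field, so it contributes no values
];

// Hypothetical helper mirroring $unwind + $addToSet + $setDifference.
function findMissingBooks(documents, wanted) {
  // Flattening every books array into a Set corresponds to
  // $unwind followed by $group with $addToSet.
  const allBooksSet = new Set(documents.flatMap((doc) => doc.books ?? []));
  // Keeping the wanted values that are absent from the set corresponds
  // to $setDifference (duplicates in wanted are removed first).
  return [...new Set(wanted)].filter((book) => !allBooksSet.has(book));
}

const missing = findMissingBooks(collection, ["123", "456", "789"]);
console.log(missing); // [ '789' ]
```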
Based on @rickhg12hs's answer, here is another variation that replaces $unwind with $reduce, which is considered less costly. Two of the three steps are the same:
db.collection.aggregate([
{
$group: {
_id: null,
allBooks: {$push: "$books"}
}
},
{
$project: {
_id: 0,
allBooksSet: {
$reduce: {
input: "$allBooks",
initialValue: [],
in: {$setUnion: ["$$value", "$$this"]}
}
}
}
},
{
$project: {
missing: {
$setDifference: [["123","456", "789"], "$allBooksSet"]
}
}
}
])
Try it on mongoplayground.net.
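The $push and $reduce combination can be sketched in plain JavaScript as follows. The sample collection is hypothetical, and the sketch assumes every document has a books array; it does not try to reproduce how MongoDB treats missing or null books fields.

```javascript
// Hypothetical sample collection in which every document has a books array.
const collection = [
  { _id: 1, books: ["123", "111"] },
  { _id: 2, books: ["456", "123"] },
];

// $group with $push: one entry per document, giving an array of arrays.
const allBooks = collection.map((doc) => doc.books);

// $reduce with $setUnion: fold the arrays into one duplicate-free list,
// starting from the empty array (initialValue: []).
const allBooksSet = allBooks.reduce(
  (value, current) => [...new Set([...value, ...current])],
  []
);

// $setDifference: wanted values that never occur in the collection.
const missing = ["123", "456", "789"].filter(
  (book) => !allBooksSet.includes(book)
);

console.log(allBooksSet); // [ '123', '111', '456' ]
console.log(missing); // [ '789' ]
```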

MongoDB query to return docs based on an array size after filtering array of JSON objects?

I have MongoDB documents structured in this way:
[
{
"id": "car_1",
"arrayProperty": [
{
"model": "sedan",
"turbo": "nil"
},
{
"model": "sedan",
"turbo": "60cc"
}
]
},
{
"id": "car_2",
"arrayProperty": [
{
"model": "coupe",
"turbo": "50cc"
},
{
"model": "coupe",
"turbo": "60cc"
}
]
}
]
I want to be able to make a find query that translates into basic English as "Ignoring all models that have 'nil' value for 'turbo', return all documents with arrayProperty of length X." That is to say, the "arrayProperty" of car 1 would be interpreted as having a size of 1, while the array of car 2 would have a size of 2. The goal is to be able to make a query for all cars with arrayProperty size of 2 and only see car 2 returned in the results.
Without ignoring the nil values, the query is very simple as:
{ arrayProperty: { $size: 2} }
And this would return both cars 1 and 2. Moreover, if our array was just a simple array such as:
[1,2,3,'nil']
Then our query is simply:
{
arrayProperty: {
$size: X,
$ne: "nil"
}
}
However, when we introduce an array of JSON objects, things get tricky. I have tried numerous things to no avail including:
"arrayProperty": {
$size: 2,
$ne: {"turbo": "nil"}
}
"arrayProperty": {
$size: 2,
$ne: ["arrayProperty.turbo": "nil"]
}
Even without the $size operator in there, I can't seem to filter by the nil value. Does anyone know how I would properly do this in those last two queries?
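Option 1: use $and in $match. This matches documents whose arrayProperty has exactly 2 elements, none of which has turbo equal to "nil":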
db.collection.aggregate([
{
$match: {
"$and": [
{
arrayProperty: {
$size: 2
}
},
{
"arrayProperty.turbo": {
$ne: "nil"
}
}
]
}
}
])
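Option 2: use $set with $filter to drop the "nil" entries first, then match on the size of the filtered array (1 here, which returns car_1; use 2 to return car_2):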
db.collection.aggregate([
{
"$set": {
"arrayProperty": {
"$filter": {
"input": "$arrayProperty",
"as": "a",
"cond": {
$ne: [
"$$a.turbo",
"nil"
]
}
}
}
}
},
{
$match: {
arrayProperty: {
$size: 1
}
}
}
])
Option 3: store the size of the filtered array in a new field and match on that field, leaving arrayProperty untouched:
db.collection.aggregate([
{
"$set": {
"size": {
$size: {
"$filter": {
"input": "$arrayProperty",
"as": "a",
"cond": {
$ne: [
"$$a.turbo",
"nil"
]
}
}
}
}
}
},
{
$match: {
size: 1
}
}
])
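The filter-then-count idea behind the $filter and $size stages can be checked with a short plain-JavaScript sketch. It uses the two car documents from the question; the helper name carsWithNonNilCount is hypothetical.

```javascript
// The two documents from the question.
const cars = [
  {
    id: "car_1",
    arrayProperty: [
      { model: "sedan", turbo: "nil" },
      { model: "sedan", turbo: "60cc" },
    ],
  },
  {
    id: "car_2",
    arrayProperty: [
      { model: "coupe", turbo: "50cc" },
      { model: "coupe", turbo: "60cc" },
    ],
  },
];

// Hypothetical helper: count only the entries whose turbo is not "nil"
// ($filter + $size), then keep documents with the requested count ($match).
function carsWithNonNilCount(documents, size) {
  return documents.filter((doc) => {
    const kept = (doc.arrayProperty ?? []).filter((a) => a.turbo !== "nil");
    return kept.length === size;
  });
}

console.log(carsWithNonNilCount(cars, 2).map((c) => c.id)); // [ 'car_2' ]
console.log(carsWithNonNilCount(cars, 1).map((c) => c.id)); // [ 'car_1' ]
```

Unlike a $set that overwrites arrayProperty, this sketch returns the original documents unchanged, which matches the variant that stores the count in a separate field.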

Sum Quantity in Mongo Subdocument Based on Filter

I have a "shipment" document in MongoDB that has the following basic structure:
shipment {
"id": "asdfasdfasdf",
"shipDate": "2021-04-02",
"packages": [
{
"id": "adfasdfasdfasdf",
"contents": [
{
"product": {
"id": "asdfasdfasdfasd"
},
"quantity": 10
}
]
}
]
}
Please note that "product" is stored as a DBRef.
I want to find the total quantity of a specific product (based on the product ID) that has been shipped since a given date. I believe this is the appropriate logic that should be followed:
Match shipments with "shipDate" greater than the given date.
Find entries where "contents" contains a product with an "id" matching the given product ID
Sum the "quantity" value for each matching entry
Return the sum
This is what I've come up with for the Mongo query so far:
db.shipment.aggregate([
{$match: {"shipDate": {$gt: ISODate("2019-01-01")}}},
{$unwind: "$packages"},
{$unwind: "$packages.contents"},
{$unwind: "$packages.contents.product"},
{
$project: {
matchedProduct: {
$filter: {
input: "$packages.contents.products",
as: "products",
cond: {
"$eq": ["$products.id", ObjectId("5fb55eae3fb1bf783a4fa97f")]
}
}
}
}
}
])
The query works, but appears to just return all entries that meet the $match criteria with a "products" value of null.
I'm pretty new with Mongo queries, so it may be a simple solution. However, I've been unable to figure out just how to return the $sum of the "contents" quantity fields for a matching product ID.
Any help would be much appreciated, thank you.
Query Which Solved The Problem
db.shipment.aggregate([
{
$match: {
"shipDate": {$gte: ISODate("2019-01-01")},
"packages.contents.product.$id": ObjectId("5fb55eae3fb1bf783a4fa98e")
}
},
{ $unwind: "$packages" },
{ $unwind: "$packages.contents" },
{ $unwind: "$packages.contents.product" },
{
$match: {
"packages.contents.product.$id": ObjectId("5fb55eae3fb1bf783a4fa98e")
}
},
{
$group: {
"_id": null,
"total": {
"$sum": "$packages.contents.quantity"
}
}
}
])
Demo - https://mongoplayground.net/p/c3Ia9L47cJS
Use { $match: {"packages.contents.product.id": 1 } } to filter records by product id.
After that, group them back together and compute the total with { $group: {"_id": null,"total": { "$sum": "$packages.contents.quantity" } } }
db.collection.aggregate([
{ $match: {"shipDate": "2021-04-02","packages.contents.product.id": 1 } },
{ $unwind: "$packages" },
{ $unwind: "$packages.contents" },
{ $match: { "packages.contents.product.id": 1 } },
{ $group: { "_id": null,"total": { "$sum": "$packages.contents.quantity" } } }
])
Adding the extra product id check to the first stage, { $match: {"shipDate": "2021-04-02","packages.contents.product.id": 1 } }, keeps only the documents that contain the product id we need, so the later stages process fewer documents and the query is faster.
Option-2
Demo - https://mongoplayground.net/p/eo521luylsG
db.collection.aggregate([
{ $match: { "shipDate": "2021-04-02", "packages.contents.product.id": 1 }},
{ $unwind: "$packages" },
{ $project: { contents: { $filter: { input: "$packages.contents", as: "contents", cond: {"$eq": [ "$$contents.product.id", 1] }}}}},
{ $unwind: "$contents" },
{ $group: { "_id": null, "total": { "$sum": "$contents.quantity" }}}
])
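As a cross-check of what these pipelines compute, here is a small plain-JavaScript sketch of the same unwind, filter and sum steps. The sample shipments and the helper name totalShipped are assumptions; the numeric product id 1 and the exact shipDate match follow the demo queries above, not the DBRef from the question.

```javascript
// Hypothetical sample data shaped like the shipment document in the question.
const shipments = [
  {
    id: "s1",
    shipDate: "2021-04-02",
    packages: [
      {
        id: "p1",
        contents: [
          { product: { id: 1 }, quantity: 10 },
          { product: { id: 2 }, quantity: 5 },
        ],
      },
      { id: "p2", contents: [{ product: { id: 1 }, quantity: 3 }] },
    ],
  },
  {
    id: "s2",
    shipDate: "2021-04-03",
    packages: [{ id: "p3", contents: [{ product: { id: 1 }, quantity: 7 }] }],
  },
];

// Hypothetical helper mirroring $match, two $unwind stages, $match and $group.
function totalShipped(documents, shipDate, productId) {
  return documents
    .filter((s) => s.shipDate === shipDate) // first $match
    .flatMap((s) => s.packages ?? []) // $unwind: "$packages"
    .flatMap((p) => p.contents ?? []) // $unwind: "$packages.contents"
    .filter((c) => c.product?.id === productId) // second $match
    .reduce((total, c) => total + c.quantity, 0); // $group with $sum
}

console.log(totalShipped(shipments, "2021-04-02", 1)); // 13
```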

Alias for each "_id" in array

How can I set an alias for each "_id" in the students array?
I want to alias _id to id in this JSON output.
{
"items": [
{
"id": "5d7aa7c1cba3435ebcb069c6",
"start": "2019-01-01T10:00:00.000Z",
"end": "2019-01-01T10:00:00.000Z",
"description": "test",
"students": [
{
"_id": "5d7aa779cba3435ebcb069c5", // <- alias _id to id
"name": "Jon",
"surname": "Snow"
}
]
}
]
}
How can I do that with aggregation operations?
Use the native $map operator in your aggregate pipeline to transform the array. You would need to nest two $map operations: one for the items array and an inner $map for the students array:
const studentsMap = {
'$map': {
'input': '$$this.students',
'as': 'student',
'in': {
'id': '$$student._id',
'name': '$$student.name',
'surname': '$$student.surname'
}
}
}
db.collection.aggregate([
{ '$addFields': {
'items': {
'$map': {
'input': '$items',
'in': {
'id': '$$this.id',
'start': '$$this.start',
'end': '$$this.end',
'description': '$$this.description',
'students': studentsMap
}
}
}
} }
])
Alternatively, transform the result on the client side with JavaScript's Array.prototype.map:
items = items.map((item)=>{
item.students = item.students.map((student)=>{
student.id = student._id;
delete student._id;
return student;
})
return item;
})
A better approach is to avoid hard-coding the fields in the $map stage, since such queries become difficult to maintain as the documents grow. The following query produces the expected output and is independent of the other fields present in the document, because it only replaces the _id inside items.students.
db.collection.aggregate([
{
$addFields:{
"items":{
$map:{
"input":"$items",
"as":"item",
"in":{
$mergeObjects:[
"$$item",
{
"students":{
$map:{
"input":"$$item.students",
"as":"student",
"in":{
$mergeObjects:[
"$$student",
{
"id":"$$student._id"
}
]
}
}
}
}
]
}
}
}
}
},
{
$project:{
"items.students._id":0
}
}
]).pretty()
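The effect of $mergeObjects followed by the $project that drops _id can be sketched in plain JavaScript with object spread. The helper name aliasStudentIds is hypothetical; the sample document is the one from the question. Unlike the client-side map shown earlier, this version does not mutate its input.

```javascript
// Sample document from the question.
const doc = {
  items: [
    {
      id: "5d7aa7c1cba3435ebcb069c6",
      start: "2019-01-01T10:00:00.000Z",
      end: "2019-01-01T10:00:00.000Z",
      description: "test",
      students: [
        { _id: "5d7aa779cba3435ebcb069c5", name: "Jon", surname: "Snow" },
      ],
    },
  ],
};

// Hypothetical helper: copies every field as-is and only renames _id to id
// inside items.students, without listing the other fields by name.
function aliasStudentIds(document) {
  return {
    ...document,
    items: document.items.map((item) => ({
      ...item, // like $mergeObjects with "$$item"
      students: item.students.map(({ _id, ...rest }) => ({
        ...rest, // every student field except _id
        id: _id, // the alias
      })),
    })),
  };
}

const aliased = aliasStudentIds(doc);
console.log(JSON.stringify(aliased, null, 2));
```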

Query only for numbers in nested array

I am trying to get the average value of a key in a nested array inside a document, but I am not sure how to accomplish this.
Here is what my document looks like:
{
"_id": {
"$oid": "XXXXXXXXXXXXXXXXX"
},
"data": {
"type": "PlayerRoundData",
"playerId": "XXXXXXXXXXXXX",
"groupId": "XXXXXXXXXXXXXX",
"holeScores": [
{
"type": "RoundHoleData",
"points": 2
},
{
"type": "RoundHoleData",
"points": 13
},
{
"type": "RoundHoleData",
"points": 3
},
{
"type": "RoundHoleData",
"points": 1
},
{
"type": "RoundHoleData",
"points": 21
}
]
}
}
Now, the tricky part is that I only want the average of points for holeScores[0] across all documents with this playerId and this groupId.
Actually, the best solution would be to collect all documents with that playerId and groupId and create a new array with the averages of holeScores[0], holeScores[1], holeScores[2]... But if I can only get one array index at a time, that would be OK too :-)
Here is what I am thinking, but I am not quite sure how to put it together:
var allScores = dbCollection('scores').aggregate(
{$match: {"data.groupId": groupId, "playerId": playerId}},
{$group: {
_id: playerId,
rounds: { $sum: 1 }
result: { $sum: "$data.scoreTotals.points" }
}}
);
Really hoping for help with this issue and thanks in advance :-)
You can use $unwind with includeArrayIndex to get the index of each element, and then use $group to group by that index:
dbCollection('scores').aggregate(
{
$match: { "data.playerId": "XXXXXXXXXXXXX", "data.groupId": "XXXXXXXXXXXXXX" }
},
{
$unwind: {
path: "$data.holeScores",
includeArrayIndex: "index"
}
},
{
$group: {
_id: "$index",
playerId: { $first: "$data.playerId" },
avg: { $avg: "$data.holeScores.points" }
}
}
)
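Grouping by the array index yields one average per hole. A minimal plain-JavaScript sketch of that idea follows; the sample rounds, the ids "p1" and "g1", and the helper name averagePointsPerHole are all assumptions made for illustration.

```javascript
// Hypothetical rounds shaped like the document in the question.
const rounds = [
  { data: { playerId: "p1", groupId: "g1", holeScores: [{ points: 2 }, { points: 13 }, { points: 3 }] } },
  { data: { playerId: "p1", groupId: "g1", holeScores: [{ points: 4 }, { points: 7 }, { points: 5 }] } },
  { data: { playerId: "p2", groupId: "g1", holeScores: [{ points: 9 }, { points: 9 }, { points: 9 }] } },
];

// Hypothetical helper mirroring $match, $unwind with includeArrayIndex,
// and $group by index with $avg.
function averagePointsPerHole(documents, playerId, groupId) {
  const sums = [];
  const counts = [];
  for (const doc of documents) {
    // $match on data.playerId and data.groupId
    if (doc.data.playerId !== playerId || doc.data.groupId !== groupId) continue;
    // forEach exposes the element index, like includeArrayIndex
    doc.data.holeScores.forEach((hole, index) => {
      sums[index] = (sums[index] ?? 0) + hole.points;
      counts[index] = (counts[index] ?? 0) + 1;
    });
  }
  // One average per hole index, like $group with $avg
  return sums.map((sum, index) => sum / counts[index]);
}

console.log(averagePointsPerHole(rounds, "p1", "g1")); // [ 3, 10, 4 ]
```

The first element of the returned array is the holeScores[0] average the question asks for, and the remaining elements cover the other holes.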
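You can try the aggregation below, which averages the points of the first hole (index 0 via $arrayElemAt) across the matching documents:
db.collection.aggregate(
{ "$match": { "data.groupId": groupId, "data.playerId": playerId }},
{ "$group": {
"_id": null,
"result": {
"$avg": {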
"$arrayElemAt": [
"$data.holeScores.points",
0
]
}
}
}}
)
