MongoDB sum of array lengths and missing fields [duplicate]

This question already has answers here:
MongoDB - The argument to $size must be an Array, but was of type: EOO / missing
(3 answers)
Closed 3 years ago.
I have a simple document that stores arrays of objects for a user, and I am looking to get a sum of these arrays and a total of all; the challenge is that some documents are missing fields that other documents have, and this causes my aggregate query to fail.
Document looks like this.
{
name: '',
type: '',
cars: [],
boats: [],
planes: []
}
Some people do not have boats or planes...and those documents might look like
{
name: '',
type: '',
cars: []
}
So when I run my aggregate
[
{
'$match': {
'type': 'big_spender'
}
}, {
'$project': {
'name': '$name',
'cars': {
'$size': '$cars'
},
'boats': {
'$size': '$boats'
},
'planes': {
'$size': '$planes'
}
}
}, {
'$addFields': {
'total_vehicles': {
'$add': [
'$cars', '$boats', '$planes'
]
}
}
}
]
I get the error: "The argument to $size must be an array, but was of type: missing"
I am pretty sure I can use $exists to avoid this problem and return a 0, but I have no idea what that syntax might look like.
I need to return a 0 for arrays that don't exist, so that when I add them up I get a correct total and no errors.
Any help appreciated.

Use the $ifNull aggregation operator. It substitutes an empty array when the field does not exist, so $size always receives an array.
[
{ '$match': { 'type': 'big_spender' }},
{ '$project': {
'name': '$name',
'cars': { '$size': '$cars' },
'boats': { '$size': { '$ifNull': ['$boats', []] }},
'planes': { '$size': { '$ifNull': ['$planes', []] }}
}},
{ '$addFields': {
'total_vehicles': {
'$add': [
'$cars', '$boats', '$planes'
]
}
}}
]
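The same guard can be applied to every array field, including cars, so the pipeline does not depend on which fields happen to be present. Below is a minimal sketch in JavaScript. The helper name buildCountPipeline is hypothetical (it is not part of MongoDB) and it only builds the pipeline array; the field names come from the question.

```javascript
// Hypothetical helper: builds the aggregation pipeline as a plain array.
// It does not talk to MongoDB; pass the result to db.collection.aggregate().
function buildCountPipeline(type, arrayFields) {
  const project = { name: '$name' };
  for (const field of arrayFields) {
    // $ifNull substitutes [] when the field is missing or null,
    // so $size always receives an array.
    project[field] = { $size: { $ifNull: ['$' + field, []] } };
  }
  return [
    { $match: { type: type } },
    { $project: project },
    // After $project the fields hold plain numbers, so $add can sum them.
    { $addFields: { total_vehicles: { $add: arrayFields.map((f) => '$' + f) } } },
  ];
}

const pipeline = buildCountPipeline('big_spender', ['cars', 'boats', 'planes']);
console.log(JSON.stringify(pipeline, null, 2));
```

Building the stages from a list of field names keeps the guard in one place if more array fields are added later.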


Mongodb: check that all the fields of the elements of an array of objects respect a condition

I have a database of the employees of a company that looks like this:
{
_id: 7698,
name: 'Blake',
job: 'manager',
manager: 7839,
hired: ISODate("1981-05-01T00:00:00.000Z"),
salary: 2850,
department: {name: 'Sales', location: 'Chicago'},
missions: [
{company: 'Mac Donald', location: 'Chicago'},
{company: 'IBM', location: 'Chicago'}
]
}
I have an exercise in which I need to write the MongoDB command that returns all the employees who did all their missions in Chicago. I struggle with the "all" part because I cannot find a way to check that all the locations in the missions array are equal to 'Chicago'.
I was thinking about doing it in two steps: first find the total number of missions the employee has, then compare it to the number of missions they did in Chicago (that is how I would do it in SQL, I guess). But I cannot find the number of missions the employee did in Chicago. Here is what I tried:
db.employees.aggregate([
{
$match: { "missions": { $exists: true } }
},
{
$project: {
name: 1,
nbMissionsChicago: {
$sum: {
$cond: [
{
$eq: [{
$getField: {
field: { $literal: "$location" },
input: "$missions"
}
}, "Chicago"]
}, 1, 0
]
}
}
}
}
])
Here is the result :
{ _id: 7698, name: 'Blake', nbMissionsChicago: 0 }
{ _id: 7782, name: 'Clark', nbMissionsChicago: 0 }
{ _id: 8000, name: 'Smith', nbMissionsChicago: 0 }
{ _id: 7902, name: 'Ford', nbMissionsChicago: 0 }
{ _id: 7499, name: 'Allen', nbMissionsChicago: 0 }
{ _id: 7654, name: 'Martin', nbMissionsChicago: 0 }
{ _id: 7900, name: 'James', nbMissionsChicago: 0 }
{ _id: 7369, name: 'Smith', nbMissionsChicago: 0 }
First of all, is there a better method to check that all the locations in the missions array satisfy the condition? And why does this command return only 0?
Thanks!
If all you need is the employees who had all their missions in "Chicago", then you don't need an aggregation pipeline for it. In particular, filtering the array as part of the aggregation can't use an index, which makes performance even worse.
A simple query should suffice here:
db.collection.find({
$and: [
{
"missions": {
$exists: true
}
},
{
"missions.location": {
$not: {
$gt: "Chicago"
}
}
},
{
"missions.location": {
$not: {
$lt: "Chicago"
}
}
}
]
})
This way we can build an index on the missions field and use it properly. Any document with a value other than "Chicago" will not match, because it will fail the $gt or $lt comparison.
Note that a document with an empty missions array also matches the condition. To require at least one mission, change the generic "missions" exists condition to "missions.0": {$exists: true}.
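Putting the note above into the query, here is a sketch of the filter that also requires at least one mission. The helper name allMissionsIn is hypothetical and only builds the filter object; the collection name in the usage comment is the one from the question.

```javascript
// Hypothetical helper: builds the find() filter as a plain object.
function allMissionsIn(location) {
  return {
    $and: [
      // "missions.0" exists only when the array has at least one element.
      { 'missions.0': { $exists: true } },
      // No mission location may compare greater than the target value...
      { 'missions.location': { $not: { $gt: location } } },
      // ...and none may compare less than it.
      { 'missions.location': { $not: { $lt: location } } },
    ],
  };
}

const filter = allMissionsIn('Chicago');
console.log(JSON.stringify(filter, null, 2));
// Usage in the shell:
// db.employees.find(filter)
```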
You are not getting the correct result because that is not the right way to iterate over the elements of an array field.
Instead, use the $filter operator to keep the matching elements and the $size operator to count them.
Updated: You can directly compare the filtered array with the original array.
db.employees.aggregate([
{
$match: {
"missions": {
$exists: true
}
}
},
{
$project: {
name: 1,
nbMissionsChicago: {
$eq: [
{
$filter: {
input: "$missions",
cond: {
$eq: [
"$$this.location",
"Chicago"
]
}
}
},
"$missions"
]
}
}
}
])
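The comparison can be pictured in plain JavaScript, outside MongoDB: keep the missions located in Chicago and check that nothing was dropped. This is only an illustration of the logic. The function name is hypothetical, and the second employee is made-up sample data modelled on the question's document.

```javascript
function allMissionsInChicago(employee) {
  const missions = employee.missions || [];
  // Same idea as $filter: keep only the Chicago missions.
  const inChicago = missions.filter((m) => m.location === 'Chicago');
  // Same idea as comparing the filtered array with the original:
  // if no element was dropped, every mission was in Chicago.
  return inChicago.length === missions.length;
}

const blake = {
  name: 'Blake',
  missions: [
    { company: 'Mac Donald', location: 'Chicago' },
    { company: 'IBM', location: 'Chicago' },
  ],
};
// Made-up employee with one mission outside Chicago.
const example = {
  name: 'Example',
  missions: [
    { company: 'IBM', location: 'Chicago' },
    { company: 'Acme', location: 'Dallas' },
  ],
};

console.log(allMissionsInChicago(blake)); // true
console.log(allMissionsInChicago(example)); // false
```

As with the pipeline, an employee with an empty missions array counts as a match here, because the filtered array and the original are both empty.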

Mongo updateMany statement with an inner array of objects to manipulate

I'm struggling to write a Mongo UpdateMany statement that can reference and update an object within an array.
Here I create 3 documents. Each document has an array called innerArray always containing a single object, with a single date field.
use test;
db.innerArrayExample.insertOne({ _id: 1, "innerArray": [ { "originalDateTime" : ISODate("2022-01-01T01:01:01Z") } ]});
db.innerArrayExample.insertOne({ _id: 2, "innerArray": [ { "originalDateTime" : ISODate("2022-01-02T01:01:01Z") } ]});
db.innerArrayExample.insertOne({ _id: 3, "innerArray": [ { "originalDateTime" : ISODate("2022-01-03T01:01:01Z") } ]});
I want to add a new date field, based on the original date field, to end up with this:
{ _id: 1, "innerArray": [ { "originalDateTime" : ISODate("2022-01-01T01:01:01Z"), "copiedDateTime" : ISODate("2022-01-01T12:01:01Z") } ]}
{ _id: 2, "innerArray": [ { "originalDateTime" : ISODate("2022-01-02T01:01:01Z"), "copiedDateTime" : ISODate("2022-01-02T12:01:01Z") } ]}
{ _id: 3, "innerArray": [ { "originalDateTime" : ISODate("2022-01-03T01:01:01Z"), "copiedDateTime" : ISODate("2022-01-03T12:01:01Z") } ]}
In pseudo code I am saying take the originalDateTime, run it through a function and add a related copiedDateTime value.
For my specific use-case, the function I want to run strips the timezone from originalDateTime, then overwrites it with a new one, equivalent to the Java ZonedDateTime function withZoneSameLocal. Aka 9pm UTC becomes 9pm Brussels (therefore effectively 7pm UTC). The technical justification and methodology were answered in another Stack Overflow question here.
The part of the query I'm struggling with is the part that selects and updates data from an element inside an array. For my simplified example I have crafted this query, but unfortunately it doesn't work.
It puts copiedDateTime in the correct place, but doesn't evaluate the operators that manipulate the date:
db.innerArrayExample.updateMany({ "innerArray.0.originalDateTime" : { $exists : true }}, { $set: { "innerArray.0.copiedDateTime" : { $dateFromString: { dateString: { $dateToString: { "date" : "$innerArray.0.originalDateTime", format: "%Y-%m-%dT%H:%M:%S.%L" }}, format: "%Y-%m-%dT%H:%M:%S.%L", timezone: "Europe/Paris" }}}});
// output
{
_id: 1,
innerArray: [
{
originalDateTime: ISODate("2022-01-01T01:01:01.000Z"),
copiedDateTime: {
'$dateFromString': {
dateString: { '$dateToString': [Object] },
format: '%Y-%m-%dT%H:%M:%S.%L',
timezone: 'Europe/Paris'
}
}
}
]
}
This simplified query, also has the same issue:
db.innerArrayExample.updateMany({ "innerArray.0.originalDateTime" : { $exists : true }}, { $set: { "innerArray.0.copiedDateTime" : "$innerArray.0.originalDateTime" }});
//output
{
_id: 1,
innerArray: [
{
originalDateTime: ISODate("2022-01-01T01:01:01.000Z"),
copiedDateTime: '$innerArray.0.originalDateTime'
}
]
}
As you can see, this issue is separate from the other Stack Overflow question. Instead of being about changing timezones, it's about getting things inside arrays to update.
I plan to take this query, create 70,000 variations of it with different location/timezone combinations and run it against a database with millions of records, so I would prefer something that uses updateMany instead of using Javascript to iterate over each row in the database... unless that's the only viable solution.
I have tried putting $set in square brackets. This changes the way it interprets everything, evaluating the right side, but causing other problems:
test> db.innerArrayExample.updateMany({ "_id" : 1 }, [{ $set: { "innerArray.0.copiedDateTime" : "$innerArray.0.originalDateTime" }}]);
//output
{
_id: 1,
innerArray: [
{
'0': { copiedDateTime: [] },
originalDateTime: ISODate("2022-01-01T01:01:01.000Z")
}
]
}
Above it seems to interpret .0. as a literal rather than an array element. (For my needs I know the array only has 1 item at all times). I'm at a loss finding an example that meets my needs.
I have also tried experimenting with arrayFilters, documented in the MongoDB updateMany documentation, but I cannot fathom how it works with objects:
test> db.innerArrayExample.updateMany(
... { },
... { $set: { "innerArray.$[element].copiedDateTime" : "$innerArray.$[element].originalDateTime" } },
... { arrayFilters: [ { "originalDateTime": { $exists: true } } ] }
... );
MongoServerError: No array filter found for identifier 'element' in path 'innerArray.$[element].copiedDateTime'
test> db.innerArrayExample.updateMany(
... { },
... { $set: { "innerArray.$[0].copiedDateTime" : "$innerArray.$[element].originalDateTime" } },
... { arrayFilters: [ { "0.originalDateTime": { $exists: true } } ] }
... );
MongoServerError: Error parsing array filter :: caused by :: The top-level field name must be an alphanumeric string beginning with a lowercase letter, found '0'
If someone can help me understand the subtleties of the Mongo syntax and help me back on to the right path I'd be very grateful.
You want to use pipelined updates. The problem with the syntax you're using is that it does not allow aggregation operators or references to document field values.
Here is a quick example of how to do it:
db.collection.updateMany({},
[
{
"$set": {
"innerArray": {
$map: {
input: "$innerArray",
in: {
$mergeObjects: [
"$$this",
{
copiedDateTime: "$$this.originalDateTime"
}
]
}
}
}
}
}
])
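Since the question mentions generating many variations with different timezones, the pipelined update can be produced by a small function. This is a sketch only: the helper name buildCopyDateUpdate is hypothetical, the date expression is the one from the question moved inside $map (so it reads $$this.originalDateTime), and I have not run it against a live server.

```javascript
// Hypothetical helper: returns an update pipeline for updateMany().
// It only builds a plain array; nothing is sent to MongoDB here.
function buildCopyDateUpdate(timezone) {
  const format = '%Y-%m-%dT%H:%M:%S.%L';
  return [
    {
      $set: {
        innerArray: {
          $map: {
            input: '$innerArray',
            in: {
              // Keep every existing field of the element and add one more.
              $mergeObjects: [
                '$$this',
                {
                  copiedDateTime: {
                    $dateFromString: {
                      // Render the stored date as a string without a zone...
                      dateString: {
                        $dateToString: { date: '$$this.originalDateTime', format: format },
                      },
                      format: format,
                      // ...then parse it again as local time in the target zone.
                      timezone: timezone,
                    },
                  },
                },
              ],
            },
          },
        },
      },
    },
  ];
}

const update = buildCopyDateUpdate('Europe/Paris');
console.log(JSON.stringify(update, null, 2));
// Usage in the shell:
// db.innerArrayExample.updateMany({ 'innerArray.0.originalDateTime': { $exists: true } }, update)
```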

Looping through array to count in mongodb/mongoose

I have a user schema that contains a value called amputationInfo:
amputationInfo: [
{
type: String,
},
],
Here is an example of what that might look like in the database:
amputationInfo: [
"Double Symes/Boyd",
"Single Above-Elbow"
]
I have a review Schema that allows a user to leave a review, it contains a reference to the user who left it:
user: {
type: mongoose.Schema.ObjectId,
ref: 'User',
required: [true, 'Each review must have an associated user!'],
},
When a user leaves a review, I want to create an aggregate function that looks up the user on the review, finds their amputationInfo, loops through the array and adds up the total amount of users that contain "Double Symes/Boyd", "Single Above-Elbow"
So if we have 3 users and their amputationInfo is as follows:
amputationInfo: [
"Double Symes/Boyd",
"Single Above-Elbow"
]
amputationInfo: [
"Single Above-Elbow"
]
amputationInfo: []
The return from the aggregate function will count each term and add one to the corresponding value and look something like this:
[
{
doubleSymesBoyd: 1,
singleAboveElbow: 2
}
]
Here is what I have tried, but I just don't know enough about mongoDB to solve the issue:
[
{
'$match': {
'prosthetistID': new ObjectId('6126ca6148f34c00189f86f5')
}
}, {
'$lookup': {
'from': 'users',
'localField': 'user',
'foreignField': '_id',
'as': 'userInfo'
}
}, {
'$unwind': {
'path': '$userInfo'
}
}
]
After the $unwind, the resulting object has a userInfo key that contains a nested amputationInfo array.
You can use the following stages:
$unwind to deconstruct the array
first $group to get the sum of each category
second $group to push into one document and make it as key value pair
$arrayToObject to get the desired output
$replaceRoot to make the data output into root
Here is the code
db.collection.aggregate([
{ "$unwind": "$userInfo.amputationInfo" },
{
"$group": {
"_id": "$userInfo.amputationInfo",
"count": { "$sum": 1 }
}
},
{
$group: {
_id: null,
data: { $push: {
k: "$_id",
v: "$count"
}
}
}
},
{ $project: { data: { "$arrayToObject": "$data" } } },
{ "$replaceRoot": { "newRoot": "$data" } }
])
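To see what the $unwind and $group stages compute, the same counting can be written in plain JavaScript over the three sample users from the question. The function name countCategories is hypothetical and the code runs outside MongoDB.

```javascript
// Counts how many users list each amputation category.
function countCategories(users) {
  const counts = {};
  for (const user of users) {
    // Equivalent of $unwind: visit each array element on its own.
    for (const category of user.amputationInfo || []) {
      // Equivalent of $group with { $sum: 1 }.
      counts[category] = (counts[category] || 0) + 1;
    }
  }
  return counts;
}

const users = [
  { amputationInfo: ['Double Symes/Boyd', 'Single Above-Elbow'] },
  { amputationInfo: ['Single Above-Elbow'] },
  { amputationInfo: [] },
];

console.log(countCategories(users));
// { 'Double Symes/Boyd': 1, 'Single Above-Elbow': 2 }
```

Note that the keys are the original category strings, not the camelCase names shown in the question's desired output. The pipeline behaves the same way, since it uses the grouped value as the key.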

How to fix MongoDB array concatenation error?

I have a collection in MongoDB with a few million documents. There is an attribute (categories) that is an array containing all the categories a document belongs to. I am using the following query to convert the array into a comma-separated string so I can add it to SQL Server through a Spoon transformation.
For example,
the document has ["a","b","c",...] and I need a,b,c,.... so I can put it in a column.
categories: {
$cond: [
{ $eq: [{ $type: "$categories" }, "array"] },
{
$trim: {
input: {
$reduce: {
input: "$categories",
initialValue: "",
in: { $concat: ["$$value", ",", "$$this"] }
}
}
}
},
"$categories"
]
}
When I run the query I get the following error, and I cannot figure out what the problem is.
com.mongodb.MongoQueryException: Query failed with error code 16702 and error message '$concat only supports strings, not array' on server
A few documents had this attribute as a string rather than an array, so I added a type check, but the issue is still there. Any help on how to narrow down the issue would be much appreciated.
A few other attributes were the same in the same collection and this query is working fine for the rest of them.
I don't see anything in your aggregation that should cause this error. Can you try updating your MongoDB version?
However, the $reduce step does not work as intended (it adds a separator before the first element), so I changed it to this:
db.collection.aggregate([
{
"$project": {
categories: {
$cond: [
{
$eq: [{ $type: "$categories" }, "array"]
},
{
'$reduce': {
'input': '$categories',
'initialValue': '',
'in': {
'$concat': [
'$$value',
{ '$cond': [{ '$eq': ['$$value', ''] }, '', ', '] },
'$$this'
]
}
}
},
"$categories"
]
}
}
}
])
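The conditional separator in the $reduce above can be mirrored in plain JavaScript to check the result on sample values. The function name joinCategories is hypothetical. As far as I know, $trim without a chars argument strips only whitespace, which would explain why the original version kept its leading comma.

```javascript
// Mirrors the $reduce expression: the separator is added only once
// the accumulated value is non-empty.
function joinCategories(categories) {
  if (!Array.isArray(categories)) {
    // Non-array values pass through, like the else branch of $cond.
    return categories;
  }
  return categories.reduce(
    (value, item) => value + (value === '' ? '' : ', ') + item,
    ''
  );
}

console.log(joinCategories(['a', 'b', 'c'])); // a, b, c
console.log(joinCategories('already a string')); // already a string
```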
Edit:
So, if the categories field contains nested arrays, we can flatten them with $unwind stages. Add these three stages above the $project stage and the aggregation will work.
{
"$unwind": "$categories"
},
{
"$unwind": "$categories"
},
{
"$group": {
_id: "$_id", // group per document so categories from different documents are not merged
categories: {
$push: "$categories"
}
}
},
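The error message ("$concat only supports strings, not array") suggests that at least one element of categories is itself an array. A plain JavaScript illustration of that situation, using made-up sample data and running outside MongoDB, shows why flattening one level first makes the join possible:

```javascript
// Made-up sample: a categories value whose elements are arrays.
const nested = [['a', 'b'], ['c']];

// Detect the problem: is any element itself an array?
const hasNestedArray = nested.some((item) => Array.isArray(item));

// Flatten one level, which is what the second $unwind achieves.
const flattened = nested.flat();

console.log(hasNestedArray); // true
console.log(flattened.join(',')); // a,b,c
```

A similar check run over a sample of real documents may help narrow down which ones have the unexpected shape.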

Count an array inside an array and then sum them MongoDB NodeJS

I have an object that has an array of page objects and each page object has an array of questions.
Ex object:
{
Id: 1,
UserId: 14,
Deleted: false,
Collaborators: [],
Title: "Awesome",
Pages: [{
Id: 1,
Title: 'Jank',
Questions: [
{ Id: 1, Content: 'Ask me about it' },
{ Id: 2, Content: 'Ask me about it again' }
]
}, {
Id: 2,
Title: 'Janker',
Questions: [
{ Id: 1, Content: 'Tell me about it' },
{ Id: 2, Content: 'Tell me about it again' }
]
}]
}
What I am trying to do is get a count of all the questions for the entire base object. I am not sure how to do that. I have tried to use aggregate and $sum the total questions and then do another function to $sum those all together to get a total for the entire object. Unfortunately my $sum is not working like I thought it would.
Ex code (nodejs):
var getQuestionCount = function(id) {
var cursor = mongo.collection('surveys').aggregate([{
$match: {
$or: [{
"UserId": id
}, {
"Collaborators": {
$in: [id]
}
}]
}
}, {
$match: {
"Deleted": false
}
}, {
$unwind: "$Pages"
},
{ $group: { _id: null, number: { $sum: "$Pages.Questions" } } }
], function(err, result) {
//This log just gives me [object Object], [object Object]
console.log('q count ' + result);
});
}
Any idea how to do this? My end result from the example object above would ideally return 4 as the question count for the whole object.
I'd try the following shell query.
db.collection.aggregate([
// filter out unwanted documents.
{$match:{Id: 1}},
// Unwind the Pages array to access each Questions array
{$unwind:"$Pages"},
// Count items in Questions array
{$project:{count: {$size:"$Pages.Questions"}}},
// Finally sum items previously counted.
{$group:{_id:"$_id", total: {$sum: "$count"}}}
])
Based on your sample document, it should return correct count of Questions.
{
"_id" : ObjectId("57723bb8c10c41c41ff4897c"),
"total" : NumberInt(4)
}
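For comparison, the same count in plain JavaScript on the sample object from the question, together with a fix for the "[object Object]" log line. The function name countQuestions is hypothetical, and the result array is built by hand here to imitate the shape the aggregation returns.

```javascript
function countQuestions(survey) {
  // Add up the length of each page's Questions array,
  // as $unwind + $size + $sum do in the pipeline.
  return (survey.Pages || []).reduce(
    (total, page) => total + (page.Questions || []).length,
    0
  );
}

const survey = {
  Id: 1,
  Pages: [
    { Id: 1, Title: 'Jank', Questions: [{ Id: 1 }, { Id: 2 }] },
    { Id: 2, Title: 'Janker', Questions: [{ Id: 1 }, { Id: 2 }] },
  ],
};

// Hand-built stand-in for the aggregation result.
const result = [{ _id: 1, total: countQuestions(survey) }];

// Concatenating an object to a string prints "[object Object]";
// serialising it first shows the actual content.
console.log('q count ' + JSON.stringify(result));
// q count [{"_id":1,"total":4}]
```

The original $group used $sum on the Questions array itself. $sum ignores non-numeric values, which is why it did not produce the count.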
