Let's say I have a Customer document like the following:
db.collection.insertOne( {
"customerName": "John Doe",
"orders": [
{
"type": "regular",
"items": [
{
"name": "itemA",
"price": 11.1
},
{
"name": "itemB",
"price": 22.2
}
]
},
{
"type": "express",
"items": [
{
"name": "itemC",
"price": 33.3
},
{
"name": "itemD",
"price": 44.4
}
]
}
]
})
How can I calculate the total price of all orders (111 in this example)?
You can $unwind twice (once per nesting level, because items is an array nested inside orders) and then $group with $sum, like this:
db.collection.aggregate([
{
"$unwind": "$orders"
},
{
"$unwind": "$orders.items"
},
{
"$group": {
"_id": "$customerName",
"total": {
"$sum": "$orders.items.price"
}
}
}
])
Example here
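If only a per-document total is needed, the two $unwind stages can arguably be avoided by summing inside each document. The following is a minimal sketch, not the method of the answer above: `pipelineNoUnwind` and `totalPrice` are names made up for illustration, and the plain JavaScript function only mirrors what the pipeline is expected to compute, so the arithmetic can be checked without a database.

```javascript
// Sketch: sum nested prices per document without $unwind.
// In the shell this would be passed as db.collection.aggregate(pipelineNoUnwind).
const pipelineNoUnwind = [
  {
    $project: {
      _id: 0,
      customerName: 1,
      total: {
        // The outer $sum adds up the per-order subtotals produced by $map.
        $sum: {
          $map: {
            input: "$orders",
            as: "order",
            // "$$order.items.price" resolves to the array of prices of one order.
            in: { $sum: "$$order.items.price" }
          }
        }
      }
    }
  }
];

// Plain JavaScript mirror of the same computation (hypothetical helper).
function totalPrice(customer) {
  return customer.orders
    .flatMap((order) => order.items)
    .reduce((sum, item) => sum + item.price, 0);
}
```

Unlike the $group version, this produces one total per document, so two documents that share a customerName would not be merged.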
I need to run a query that joins documents from two collections. I wrote an aggregation query, but it takes too long on the production database, which holds many documents. Is there a more efficient way to write this query?
Query in Mongo playground: https://mongoplayground.net/p/dLb3hsJHNYt
There are two collections, users and activities. I need to run a query that gets some users (from the users collection) together with their last activity (from the activities collection).
Database:
db={
"users": [
{
"_id": 1,
"email": "user1#gmail.com",
"username": "user1",
"country": "BR",
"creation_date": 1646873628
},
{
"_id": 2,
"email": "user2#gmail.com",
"username": "user2",
"country": "US",
"creation_date": 1646006402
}
],
"activities": [
{
"_id": 1,
"email": "user1#gmail.com",
"activity": "like",
"timestamp": 1647564787
},
{
"_id": 2,
"email": "user1#gmail.com",
"activity": "comment",
"timestamp": 1647564834
},
{
"_id": 3,
"email": "user2#gmail.com",
"activity": "like",
"timestamp": 1647564831
}
]
}
Inefficient Query:
db.users.aggregate([
{
// Get users using some filters
"$match": {
"$expr": {
"$and": [
{ "$not": { "$in": [ "$country", [ "AR", "CA" ] ] } },
{ "$gte": [ "$creation_date", 1646006400 ] },
{ "$lte": [ "$creation_date", 1648684800 ] }
]
}
}
},
{
// Get the last activity within the time range
"$lookup": {
"from": "activities",
"as": "last_activity",
"let": { "cur_email": "$email" },
"pipeline": [
{
"$match": {
"$expr": {
"$and": [
{ "$eq": [ "$email", "$$cur_email" ] },
{ "$gte": [ "$timestamp", 1647564787 ] },
{ "$lte": [ "$timestamp", 1647564834 ] }
]
}
}
},
{ "$sort": { "timestamp": -1 } },
{ "$limit": 1 }
]
}
},
{
// Remove users with no activity
"$match": {
"$expr": {
"$gt": [ { "$size": "$last_activity" }, 0 ] }
}
}
])
Result:
[
{
"_id": 1,
"country": "BR",
"creation_date": 1.646873628e+09,
"email": "user1#gmail.com",
"last_activity": [
{
"_id": 2,
"activity": "comment",
"email": "user1#gmail.com",
"timestamp": 1.647564834e+09
}
],
"username": "user1"
},
{
"_id": 2,
"country": "US",
"creation_date": 1.646006402e+09,
"email": "user2#gmail.com",
"last_activity": [
{
"_id": 3,
"activity": "like",
"email": "user2#gmail.com",
"timestamp": 1.647564831e+09
}
],
"username": "user2"
}
]
I'm more familiar with relational databases, so I'm struggling a little to run this query efficiently.
Thanks!
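One direction that may help, offered as a sketch rather than a measured fix: express the filters with ordinary query operators where possible, since range comparisons inside $expr may not be able to use an index on some server versions, and back the join with a compound index. The index definitions in the comments and the names `fasterPipeline` and `lastActivityPerUser` are assumptions for illustration; the JavaScript function only mirrors the intended result on in-memory arrays.

```javascript
// Assumed supporting indexes (run once in the shell; not taken from the question):
//   db.users.createIndex({ creation_date: 1 })
//   db.activities.createIndex({ email: 1, timestamp: -1 })
const fasterPipeline = [
  {
    // Same user filter, written with regular query operators.
    $match: {
      country: { $nin: ["AR", "CA"] },
      creation_date: { $gte: 1646006400, $lte: 1648684800 }
    }
  },
  {
    $lookup: {
      from: "activities",
      as: "last_activity",
      let: { cur_email: "$email" },
      pipeline: [
        {
          $match: {
            // Only the join condition needs $expr, because it uses a variable.
            $expr: { $eq: ["$email", "$$cur_email"] },
            timestamp: { $gte: 1647564787, $lte: 1647564834 }
          }
        },
        { $sort: { timestamp: -1 } },
        { $limit: 1 }
      ]
    }
  },
  // Remove users with no activity in the range.
  { $match: { last_activity: { $ne: [] } } }
];

// Plain JavaScript mirror of the intended result (hypothetical helper).
function lastActivityPerUser(users, activities) {
  return users
    .filter((u) => !["AR", "CA"].includes(u.country))
    .filter((u) => u.creation_date >= 1646006400 && u.creation_date <= 1648684800)
    .map((u) => {
      const inRange = activities
        .filter((a) => a.email === u.email)
        .filter((a) => a.timestamp >= 1647564787 && a.timestamp <= 1647564834)
        .sort((a, b) => b.timestamp - a.timestamp);
      return { ...u, last_activity: inRange.slice(0, 1) };
    })
    .filter((u) => u.last_activity.length > 0);
}
```

Comparing the explain output of both versions on the production data is the only reliable way to tell whether this actually helps.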
I have a restaurant collection with its documents formed like this one:
{
"address": {
"building": "1007",
"coord": [
-73.856077,
40.848447
],
"street": "Morris Park Ave",
"zipcode": "10462"
},
"borough": "Bronx",
"cuisine": "Bakery",
"grades": [
{
"date": {
"$date": 1393804800000
},
"grade": "A",
"score": "81"
},
{
"date": {
"$date": 1378857600000
},
"grade": "A",
"score": "6"
},
{
"date": {
"$date": 1358985600000
},
"grade": "A",
"score": "99"
},
{
"date": {
"$date": 1322006400000
},
"grade": "B",
"score": "14"
},
{
"date": {
"$date": 1288715200000
},
"grade": "B",
"score": "14"
}
],
"name": "Morris Park Bake Shop"
}
My homework asks me to find all restaurants that have a score between 80 and 100, so I tried this:
db.restaurants.find({ $expr: {$and: [{$gt: [ { $toInt: "$grades.score" }, 80 ]}, {$lt: [ { $toInt: "$grades.score" }, 100 ]}] } }).pretty()
This failed with "Executor error during find command :: caused by :: Unsupported conversion from array to int in $convert with no onError value".
Next I tried:
db.restaurants.find({ $expr: {$and: [{$gt: [ { $toInt: "$grades.$score" }, 80 ]}, {$lt: [ { $toInt: "$grades.$score" }, 100 ]}] } }).pretty()
This returned: "FieldPath field names may not start with '$'. Consider using $getField or $setField."
Then I tried:
db.restaurants.find({$and:[{'grades.score': {$gt: 80}}, {'grades.score':{$lt:100}}]}).collation({locale:'en_US' ,numericOrdering: true})
That returned nothing. It should return at least the document shown above, right?
Perhaps this homework is about learning proper field value types or collation. With a collation, numericOrdering can be used, as @prasad_ commented. If score is truly numeric, it's best to store it as a number, so that comparisons, sorting, and arithmetic behave as expected.
Unfortunately there doesn't seem to be a way at this time to specify a collation with mongoplayground.net. Without using a collation, there are many ways to achieve your desired output. Here's one way.
db.collection.aggregate([
{
// make grades.score numeric
"$set": {
"grades": {
"$map": {
"input": "$grades",
"as": "grade",
"in": {
"$mergeObjects": [
"$$grade",
{ "score": { "$toDecimal": "$$grade.score" } }
]
}
}
}
}
},
{
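// $elemMatch makes both bounds apply to the same grade;
// without it, different array elements could each satisfy one bound
"$match": {
"grades": {
"$elemMatch": {
"score": {
"$gt": 80,
"$lt": 100
}
}
}
}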
},
{
"$project": {
"_id": 0,
"name": 1
}
}
])
Try it on mongoplayground.net.
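For the collation route mentioned at the start of this answer, the query could look roughly like the commented shell command in the block below. It assumes the scores stay stored as strings, so the bounds are passed as strings too, and it uses $elemMatch so that both bounds apply to the same array element. Since a collation cannot be tried on the playground, the runnable part uses JavaScript's `Intl.Collator` with `numeric: true` purely as an analogy for what numericOrdering does to string comparison; it is not MongoDB's implementation, and the helper names are made up.

```javascript
// Shell query (assumption: scores remain strings, so the bounds are strings):
//   db.restaurants.find({
//     grades: { $elemMatch: { score: { $gt: "80", $lt: "100" } } }
//   }).collation({ locale: "en_US", numericOrdering: true })

// Analogy in plain JavaScript: numeric-aware vs. plain string comparison.
const numericCollator = new Intl.Collator("en-US", { numeric: true });

// Digit runs are compared by numeric value, so "81" falls between "80" and "100".
function inRangeNumeric(score, low, high) {
  return (
    numericCollator.compare(score, low) > 0 &&
    numericCollator.compare(score, high) < 0
  );
}

// Plain string comparison: "81" sorts after "100" because "8" > "1".
function inRangePlain(score, low, high) {
  return score > low && score < high;
}

const scores = ["81", "6", "99", "14", "14"];
console.log(scores.filter((s) => inRangeNumeric(s, "80", "100")));
console.log(scores.filter((s) => inRangePlain(s, "80", "100")));
```

The second filter finds nothing for the sample scores, which illustrates why plain string comparison is a poor fit for numeric ranges.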
How to calculate the sum of confident_score for every individual vendor?
Data stored in the DB:
[
{
"_id": "61cab38891152daf9387c0c7",
"name": "dummy",
"company_email": "abc#mailinator.com",
"brief_msg": "Cillum sed est prae",
"similar_case_ids": [],
"answer_id": [
"61cab38891152daf9387c0c9"
],
"pros_cons": [
{
"vendor_name": "xyzlab",
"score": [
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor F",
"confident_score": 80,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf0f"
},
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor FFF",
"confident_score": 40,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf10"
}
]
},
{
"vendor_name": "abclab",
"score": [
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor B",
"confident_score": 50,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf16"
},
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor BB",
"confident_score": 60,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf17"
}
]
}
]
}
]
The query for matching the id and grouping the objects by vendor_name:
db.collection.aggregate([
{
$match: { _id: id }
},
{
$unwind: {
path: '$pros_cons'
}
},
{
$group: {
_id: '$pros_cons'
}
},
])
After running the query I get this:
[
{
"_id": {
"vendor_name": "abclab",
"score": [
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor B",
"confident_score": 50,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf16"
},
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor BB",
"confident_score": 60,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf17"
}
]
}
},
{
"_id": {
"vendor_name": "xyzlab",
"score": [
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor F",
"confident_score": 80,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf0f"
},
{
"question_id": "61c5b47198b2c5bbf9f6471c",
"title": "Vendor FFF",
"confident_score": 40,
"text": "text1",
"_id": "61cac505caeeeb3cec78bf10"
}
]
}
}
]
I need to calculate the sum for each vendor individually: total = 110 for vendor_name abclab and total = 120 for vendor_name xyzlab.
Required output:
[
{
"vendor_name": "abclab",
"totalScore": 110,
"count" : 2
},
{
"vendor_name": "xyzlab",
"totalScore": 120,
"count" : 2
}
]
$match - Filter documents by id.
$unwind - Deconstruct pros_cons array to multiple documents.
$project - Shape the output documents, using $reduce to create the totalScore field by summing confident_score over the elements of the pros_cons.score array.
db.collection.aggregate([
{
$match: {
_id: "61cab38891152daf9387c0c7"
}
},
{
$unwind: {
path: "$pros_cons"
}
},
{
$project: {
_id: 0,
vendor_name: "$pros_cons.vendor_name",
totalScore: {
$reduce: {
input: "$pros_cons.score",
initialValue: 0,
in: {
$sum: [
"$$value",
"$$this.confident_score"
]
}
}
}
}
}
])
Sample Demo on Mongo Playground
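The required output also lists a count per vendor, which the pipeline above does not produce. A possible extension, sketched under the assumption that count means the number of entries in pros_cons.score, adds a $size expression and uses $sum on the array of scores as a shorter alternative to $reduce. `vendorPipeline` and `vendorTotals` are illustrative names, and the JavaScript function only mirrors the expected result.

```javascript
// In the shell this would be passed as db.collection.aggregate(vendorPipeline).
const vendorPipeline = [
  { $match: { _id: "61cab38891152daf9387c0c7" } },
  { $unwind: "$pros_cons" },
  {
    $project: {
      _id: 0,
      vendor_name: "$pros_cons.vendor_name",
      // "$pros_cons.score.confident_score" resolves to an array of numbers.
      totalScore: { $sum: "$pros_cons.score.confident_score" },
      // Assumption: count is the number of score entries for the vendor.
      count: { $size: "$pros_cons.score" }
    }
  }
];

// Plain JavaScript mirror of the expected result (hypothetical helper).
function vendorTotals(doc) {
  return doc.pros_cons.map((vendor) => ({
    vendor_name: vendor.vendor_name,
    totalScore: vendor.score.reduce((sum, s) => sum + s.confident_score, 0),
    count: vendor.score.length
  }));
}
```

Because every vendor appears once per document here, no $group stage is needed; if the same vendor_name could occur in several documents, a $group on vendor_name would have to follow.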
let data = [
{
"_id": '101',
"name": 'category_1',
"subcategory": [
{
"_id": '201',
"name": 'subCategory_1'
},
{
"_id": '202',
"name": 'subCategory_2',
"subsubcategory": [
{
"_id": '301',
"name": 'subsubcategory_1'
},
{
"_id": '302',
"name": 'subsubcategory_2'
}
]
}
]
}
]
How can I change the name of a subsubcategory?
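The post does not say whether data lives in memory or in a database. Assuming it is the plain JavaScript array shown above, one way is to locate the nested entry by its ids and assign the new name. `renameSubSubcategory` is a hypothetical helper written for this sketch.

```javascript
// Hypothetical helper: rename one sub-subcategory, located by the three ids.
// Returns true when an entry was found and renamed, false otherwise.
function renameSubSubcategory(categories, categoryId, subId, subSubId, newName) {
  const category = categories.find((c) => c._id === categoryId);
  // Optional chaining covers subcategories that have no subsubcategory array.
  const sub = category?.subcategory?.find((s) => s._id === subId);
  const subSub = sub?.subsubcategory?.find((ss) => ss._id === subSubId);
  if (!subSub) return false;
  subSub.name = newName;
  return true;
}
```

For the sample, `renameSubSubcategory(data, '101', '202', '301', 'renamed')` would change subsubcategory_1 in place. If the documents are stored in MongoDB instead, an update using the filtered positional operator `$[<identifier>]` together with `arrayFilters` is the usual tool for nested arrays.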
I need to aggregate an array as follows
Two document examples:
{
"_index": "log",
"_type": "travels",
"_id": "tnQsGy4lS0K6uT3Hwzzo-g",
"_score": 1,
"_source": {
"state": "saopaulo",
"date": "2014-10-30T17",
"traveler": "patrick",
"registry": "123123",
"cities": {
"saopaulo": 1,
"riodejaneiro": 2,
"total": 2
},
"reasons": [
"Entrega de encomenda"
],
"from": [
"CompraRapida"
]
}
},
{
"_index": "log",
"_type": "travels",
"_id": "tnQsGy4lS0K6uT3Hwzzo-g",
"_score": 1,
"_source": {
"state": "saopaulo",
"date": "2014-10-31T17",
"traveler": "patrick",
"registry": "123123",
"cities": {
"saopaulo": 1,
"curitiba": 1,
"total": 2
},
"reasons": [
"Entrega de encomenda"
],
"from": [
"CompraRapida"
]
}
},
I want to aggregate the cities array, to find out all the cities the traveler has gone to. I want something like this:
{
"traveler":{
"name":"patrick"
},
"cities":{
"saopaulo":2,
"riodejaneiro":2,
"curitiba":1,
"total":3
}
}
Here the total is the length of the cities array minus 1. I tried the terms aggregation and the sum aggregation, but couldn't produce the desired output.
Changes in the document structure can be made, so if anything like that would help me, I'd be pleased to know.
In the document posted above, "cities" is not a JSON array; it is a JSON object.
If changing the document structure is a possibility, I would change cities in the document to be an array of objects.
Example document:
"cities": [
{
"city": "saopaulo",
"count": 2
},
{
"city": "riodejaneiro",
"count": 1
}
]
You would then need to set cities to be of type nested in the index mapping:
"mappings": {
"<type_name>": {
"properties": {
"cities": {
"type": "nested",
"properties": {
"city": {
"type": "string"
},
"count": {
"type": "integer"
},
"value": {
"type": "long"
}
}
},
"date": {
"type": "date",
"format": "dateOptionalTime"
},
"registry": {
"type": "string"
},
"state": {
"type": "string"
},
"traveler": {
"type": "string"
}
}
}
}
After that, you could use a nested aggregation to get the number of distinct cities per traveler.
The query would look something along these lines:
{
"query": {
"match": {
"traveler": "patrick"
}
},
"aggregations": {
"city_travelled": {
"nested": {
"path": "cities"
},
"aggs": {
"citycount": {
"cardinality": {
"field": "cities.city"
}
}
}
}
}
}
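The cardinality aggregation above returns only the number of distinct cities (the total in the desired output). For the per-city visit counts, a terms aggregation with a sum sub-aggregation inside the same nested scope is one option. The sketch below assumes the field names cities.city and cities.count from the mapping; the request body is written as a JavaScript object, and `perCityRequest` and `sumCities` are hypothetical names, the latter mirroring the expected numbers on in-memory documents.

```javascript
// Request body sketch: distinct-city count plus one bucket per city.
const perCityRequest = {
  query: { match: { traveler: "patrick" } },
  aggs: {
    city_travelled: {
      nested: { path: "cities" },
      aggs: {
        // Number of distinct cities.
        distinct_cities: { cardinality: { field: "cities.city" } },
        // One bucket per city, with the visit counts added up.
        per_city: {
          terms: { field: "cities.city" },
          aggs: { visits: { sum: { field: "cities.count" } } }
        }
      }
    }
  }
};

// Plain JavaScript mirror of the expected numbers (hypothetical helper).
function sumCities(docs) {
  const visits = {};
  for (const doc of docs) {
    for (const { city, count } of doc.cities) {
      visits[city] = (visits[city] || 0) + count;
    }
  }
  return { ...visits, total: Object.keys(visits).length };
}
```

Applied to the two sample documents, restructured as arrays, the helper gives saopaulo 2, riodejaneiro 2, curitiba 1 and a total of 3, which matches the output asked for in the question.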