Difference between two consecutive MongoDB document fields

I have a time-series collection in MongoDB, which looks like this:
{ _id: 1, time: 2021-01-03T06:26:20.000+00:00 }
{ _id: 2, time: 2021-01-03T06:26:21.000+00:00 }
{ _id: 3, time: 2021-01-03T06:26:22.000+00:00 }
All documents are sorted by the time field, and I want to accumulate the differences between consecutive documents. The output should look like (t3-t2) + (t2-t1), so for this data the output will be 2 seconds.
In PostgreSQL we can use window functions or joins to calculate this. How can I calculate it in MongoDB?

You can use this one:
db.collection.aggregate([
{ $match: { _id: { $ne: 3 } } },
{ $group: { _id: null, max_time: { $max: "$time" }, min_time: { $min: "$time" } } },
{ $set: { difference: { $divide: [{ $subtract: ["$max_time", "$min_time"] }, 1000] } } }
])
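As a rough illustration of why a max/min approach can stand in for a window function here: when the documents are sorted by time, the consecutive differences telescope, so their sum equals the latest time minus the earliest. The sketch below is plain JavaScript rather than an aggregation pipeline, uses all three sample documents from the question, and the function names are made up for the example.

```javascript
// Sample timestamps copied from the question.
const docs = [
  { _id: 1, time: new Date("2021-01-03T06:26:20.000+00:00") },
  { _id: 2, time: new Date("2021-01-03T06:26:21.000+00:00") },
  { _id: 3, time: new Date("2021-01-03T06:26:22.000+00:00") },
];

// Sum of consecutive differences in seconds (input assumed sorted by time).
function sumOfConsecutiveDiffs(sorted) {
  let total = 0;
  for (let i = 1; i < sorted.length; i++) {
    total += (sorted[i].time - sorted[i - 1].time) / 1000;
  }
  return total;
}

// Shortcut: latest time minus earliest time, in seconds.
function maxMinusMin(items) {
  const times = items.map((d) => d.time.getTime());
  return (Math.max(...times) - Math.min(...times)) / 1000;
}

console.log(sumOfConsecutiveDiffs(docs), maxMinusMin(docs));
```

Both functions return the same value for sorted input, which is the property the $max / $min grouping relies on.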

Related

How to get the total document count and the count of a specific type of document?

I'm trying to find the total number of occurrences of each specific category and the sum over all selected categories. Right now I can get the total count of the selected categories, but I also want the occurrences of each specific category, with documents grouped on an hourly basis.
P.S. I'm new to MongoDB.
[
{
$match: {
categories: {
$in: categories
}
}
},
{
$addFields: {
uniqueHour: {
$dateToString: {
format: "%H",
date: "$publishedAt"
}
}
}
},
{
$addFields: {
categories: categories
}
},
{
$group: {
_id: {
hour: "$uniqueHour"
},
count: {
$sum: 1
}
}
},
{
$sort: {
_id: -1
}
}
]

How to use $getField to get a field from the ROOT document with a condition in a MongoDB aggregation

I'm starting to learn aggregation in MongoDB. I have simple documents like the ones below, each with 2 fields, name and examScores; examScores is an array containing multiple documents:
{ _id: ObjectId("633199db009be219a43ae426"),
name: 'Max',
examScores:
[ { difficulty: 4, score: 57.9 },
{ difficulty: 6, score: 62.1 },
{ difficulty: 3, score: 88.5 } ] }
{ _id: ObjectId("633199db009be219a43ae427"),
name: 'Manu',
examScores:
[ { difficulty: 7, score: 52.1 },
{ difficulty: 2, score: 74.3 },
{ difficulty: 5, score: 53.1 } ] }
Now I query the maximum score of each person using $unwind and $group/$max as below:
db.test.aggregate([
{$unwind: "$examScores"},
{$group: {_id: {name: "$name"}, maxScore: {$max: "$examScores.score"}}}
])
{ _id: { name: 'Max' }, maxScore: 88.5 }
{ _id: { name: 'Manu' }, maxScore: 74.3 }
But I want the result to also contain the examScores.difficulty field corresponding to name and examScores.score, like below:
{ _id: { name: 'Max' }, difficulty: 3, maxScore: 88.5 }
{ _id: { name: 'Manu' }, difficulty: 2, maxScore: 74.3 }
I know that I can use $sort + $group and $first to achieve this goal. But I want to use $getField or any other method to get the data from the ROOT doc.
My idea is to use $project and $getField to get the difficulty field from the ROOT doc (or the $unwind version of the ROOT doc) with a condition like ROOT.name = Aggregate.name and ROOT.examScores.score = Aggregate.maxScore.
It will look something like this:
{$project:
{name: 1,
maxScore: 1,
difficulty:
{$getField: {
field: "$examScores.difficulty"
input: "$$ROOT.$unwind() with condition/filter"}
}
}
}
I wonder if this is possible in MongoDB?
Solution 1
$unwind
$group - Group by name. You need $push to add the $$ROOT document into data array.
$project - Set the difficulty field by getting the value of examScores.difficulty from the first item of the filtered data array by matching the examScores.score with maxScore.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$group: {
_id: {
name: "$name"
},
maxScore: {
$max: "$examScores.score"
},
data: {
$push: "$$ROOT"
}
}
},
{
$project: {
_id: 0,
name: "$_id.name",
maxScore: 1,
difficulty: {
$getField: {
field: "difficulty",
input: {
$getField: {
field: "examScores",
input: {
$first: {
$filter: {
input: "$data",
cond: {
$eq: [
"$$this.examScores.score",
"$maxScore"
]
}
}
}
}
}
}
}
}
}
}
])
Demo Solution 1 @ Mongo Playground
Solution 2: $rank
$unwind
$setWindowFields with $rank - Partition by name and rank by examScores.score descending.
$match - Filter the document with { rank: 1 }.
$unset - Remove rank field.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$setWindowFields: {
partitionBy: "$name",
sortBy: {
"examScores.score": -1
},
output: {
rank: {
$rank: {}
}
}
}
},
{
$match: {
rank: 1
}
},
{
$unset: "rank"
}
])
Demo Solution 2 @ Mongo Playground
Opinion: I would say this approach:
$sort by examScores.score descending
$group by name, take the first document
would be much easier.
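A minimal sketch of that $sort + $group idea, written as a pipeline array. It assumes the documents are unwound first, as in the question's own query, and that the collection is db.test as in the question; the output field names difficulty and maxScore are taken from the desired result shown there.

```javascript
// Sketch only: sort unwound entries by score, then keep the first one per name.
const pipeline = [
  { $unwind: "$examScores" },
  { $sort: { "examScores.score": -1 } },
  {
    $group: {
      _id: { name: "$name" },
      // After the descending sort, the first entry in each group has the highest score.
      difficulty: { $first: "$examScores.difficulty" },
      maxScore: { $first: "$examScores.score" },
    },
  },
];

// In mongosh this would be run as: db.test.aggregate(pipeline)
```

If several entries share the top score, $first keeps whichever one the sort happens to place first.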
There's no need to $unwind and then rebuild the documents again via $group to achieve your desired results. I'd recommend avoiding that altogether.
Instead, consider processing the arrays inline using array expression operators. Depending on the version and exact results you are looking for, here are two starting points that may be worth considering. In particular the $maxN operator and the $sortArray operator may be of interest for this particular question.
You can get a sense for what these two operators do by running an $addFields aggregation to see their output, playground here.
With those as a starting point, it's really up to you to make the pipeline output the desired result. Here is one such example that matches the output you described in the question pretty well (playground):
db.collection.aggregate([
{
"$addFields": {
"relevantEntry": {
$first: {
$sortArray: {
input: "$examScores",
sortBy: {
"score": -1
}
}
}
}
},
},
{
"$project": {
_id: 0,
name: 1,
difficulty: "$relevantEntry.difficulty",
maxScore: "$relevantEntry.score"
}
}
])
Which yields:
[
{
"difficulty": 3,
"maxScore": 88.5,
"name": "Max"
},
{
"difficulty": 2,
"maxScore": 74.3,
"name": "Manu"
}
]
Also worth noting that this particular approach doesn't do anything special if there are duplicates. You could look into using $filter if something more was needed in that regard.
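To make the duplicates point concrete, here is a plain JavaScript sketch (not a pipeline) of what a filter-based variant would return: every entry that ties for the highest score instead of just the first one. The function name and the tie data are hypothetical; the first array is Max's data from the question.

```javascript
// Returns all entries whose score equals the maximum score in the array.
function topEntries(examScores) {
  if (examScores.length === 0) return [];
  const maxScore = Math.max(...examScores.map((e) => e.score));
  return examScores.filter((e) => e.score === maxScore);
}

// Max's scores from the question: a single top entry.
const maxScores = [
  { difficulty: 4, score: 57.9 },
  { difficulty: 6, score: 62.1 },
  { difficulty: 3, score: 88.5 },
];

// Hypothetical data with a tie for the highest score.
const tiedScores = [
  { difficulty: 1, score: 90 },
  { difficulty: 5, score: 90 },
  { difficulty: 2, score: 40 },
];

console.log(topEntries(maxScores), topEntries(tiedScores));
```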

A pipeline stage specification object must contain exactly one field

db.P2447653_reviews_c.aggregate([{
$group: {_id: {"reviewerID" : "reviewerID", count: {$sum: 1 }}},
$match:{"reviewTime":{$gt:1}},
$project : { "reviewerID":1, "reviewerName":1, "reviewTime":1}}
])
I don't understand the problem, I'm very new to MongoDB
Error: MongoServerError: A pipeline stage specification object must contain exactly one field.
I have no idea what else to try. I'm completely stuck.
Doing some formatting, your query is this:
db.P2447653_reviews_c.aggregate([
{
$group: { _id: { "reviewerID": "reviewerID", count: { $sum: 1 } } },
$match: { "reviewTime": { $gt: 1 } },
$project: { "reviewerID": 1, "reviewerName": 1, "reviewTime": 1 }
}
])
Each stage must be its own object in the pipeline array, and count belongs next to _id rather than inside it. It must be this:
db.P2447653_reviews_c.aggregate([
{
$group: {
_id: { "reviewerID": "$reviewerID" },
count: { $sum: 1 }
}
},
{ $match: { "reviewTime": { $gt: 1 } } },
{ $project: { "reviewerID": 1, "reviewerName": 1, "reviewTime": 1 } }
])
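The error message can be reproduced outside the server with a small check: every element of the pipeline array should be an object with exactly one top-level key. The helper below is a hypothetical plain JavaScript sketch, not part of any MongoDB driver, applied to the broken and corrected pipelines above.

```javascript
// Returns the indexes of stages that do not have exactly one top-level key.
function findMalformedStages(pipeline) {
  return pipeline
    .map((stage, index) => ({ index, keys: Object.keys(stage) }))
    .filter(({ keys }) => keys.length !== 1)
    .map(({ index }) => index);
}

// The original query: one object holding three stage operators.
const broken = [
  {
    $group: { _id: { reviewerID: "reviewerID", count: { $sum: 1 } } },
    $match: { reviewTime: { $gt: 1 } },
    $project: { reviewerID: 1, reviewerName: 1, reviewTime: 1 },
  },
];

// The corrected query: one object per stage.
const fixed = [
  { $group: { _id: { reviewerID: "$reviewerID" }, count: { $sum: 1 } } },
  { $match: { reviewTime: { $gt: 1 } } },
  { $project: { reviewerID: 1, reviewerName: 1, reviewTime: 1 } },
];

console.log(findMalformedStages(broken), findMalformedStages(fixed));
```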

Aggregate: group by array and divide quantity by array length

Now I want an aggregation that groups by the users in the array and divides the items field by the array length to create an average.
This is some sample JSON data ->
[{"users": ["5ea40086fc4b145b489da93d","5e8cb9a4462e45178c4d3405"],"isBuilt": true, "_id": "5eadd43b30f97f342cf663fc", "items": 3, ...},
{"users": ["5e8cb9a4462e45178c4d3405"], "isBuilt": true, "_id": "5ead419081eec52258b67f70", "items": 5, ...}]
And after aggregating with ->
Building.aggregate([
{
$match: {
updatedAt: {
$gte: startDate,
$lte: endDate
},
isBuilt: true
}
},
{
$unwind: "$users"
},
{
$group: {
_id: "$users",
items: {
$sum: '$items'
}
}
},
{
$project: {
user: '$_id',
items: 1,
_id: 0
}
}
])
I got this json ->
[{"items": 3, "user": "5ea40086fc4b145b489da93d"}, {"items": 8, "user": "5e8cb9a4462e45178c4d3405"}]
As you can see, I got the sum of items. In the initial data, users "5ea40086fc4b145b489da93d" and "5e8cb9a4462e45178c4d3405" share 3 items, and user "5e8cb9a4462e45178c4d3405" has 5 items. After aggregating, the items are summed per user: user "5e8cb9a4462e45178c4d3405" -> 8 items and user "5ea40086fc4b145b489da93d" -> 3 items. Now I want to average the items over the users: if the users array has length 2 or more, items should be divided by that length before summing. The final JSON will look like ->
[{"items": 1.5, "user": "5ea40086fc4b145b489da93d"}, {"items": 6.5, "user": "5e8cb9a4462e45178c4d3405"}]
PS: if the resulting items value is not an integer, it should be rounded to one decimal place.
I've solved my problem with aggregation ->
Building.aggregate([
{
$match: {
updatedAt: {
$gte: startDate,
$lte: endDate
},
isBuilt: true
}
},
{
$addFields: {
itemsAvg: {
$divide: ["$items", {$size: "$users"}]
}
}
},
{
$addFields: {
roundedItemsAvg: {
$round: ["$itemsAvg", 1]
}
}
},
{
$unwind: "$users"
},
{
$group: {
_id: "$users",
items: {
$sum: '$roundedItemsAvg'
}
}
},
{
$project: {
user: '$_id',
items: 1,
_id: 0
}
}
])
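The arithmetic behind this pipeline can be checked by hand with a plain JavaScript sketch over the two sample documents from the question (only the users and items fields are kept, and the function name is made up). One caveat, stated as an assumption: Math.round may treat exact halves differently from $round, so the sketch only mirrors the pipeline for values like the ones here.

```javascript
// Two sample documents from the question, reduced to the relevant fields.
const buildings = [
  { users: ["5ea40086fc4b145b489da93d", "5e8cb9a4462e45178c4d3405"], items: 3 },
  { users: ["5e8cb9a4462e45178c4d3405"], items: 5 },
];

// Divide items by the number of users, round to one decimal, then sum per user.
function itemsPerUser(docs) {
  const totals = new Map();
  for (const doc of docs) {
    const share = Math.round((doc.items / doc.users.length) * 10) / 10;
    for (const user of doc.users) {
      totals.set(user, (totals.get(user) || 0) + share);
    }
  }
  return [...totals].map(([user, items]) => ({ user, items }));
}

console.log(itemsPerUser(buildings));
```

The first document contributes 1.5 to each of its two users and the second contributes 5 to its single user, which gives the 1.5 and 6.5 from the desired output.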

Find an average day count in the array in mongo

Suppose we have an array in the aggregation pipeline:
{
dates: [
"2019-01-29",
"2019-01-29",
"2019-01-29",
"2019-01-29",
"2019-02-06",
"2019-02-06",
"2019-02-06",
"2019-02-08",
"2019-06-04",
"2019-06-25",
"2019-07-26",
"2019-08-15",
"2019-08-15",
]
}
How to find an average count of the days in such an array?
The next stage of the pipeline is supposed to look like this:
dates : {
"2019-01-29": 4,
"2019-02-06": 3,
"2019-02-08": 1,
"2019-06-04": 1,
"2019-06-25": 1,
"2019-07-26": 1,
"2019-08-15": 2
}
But the final result is supposed to look like this:
avg_day_count: 1.85714285714
I.e. the average count of the days.
The sum of all days divided by the count of unique days.
You can achieve this without any $group logic, with a single $project:
db.collection.aggregate([
{
"$project": {
"result": {
$divide: [
{ $size: "$dates" },
{ $size: { $setUnion: [ "$dates" ] } }
]
}
}
}
])
This gives:
[
{
"_id": ...,
"result": 1.8571428571428572
}
]
Check the code interactively on Mongo Playground.
You need to run $group twice using $avg in the second one:
db.collection.aggregate([
{
$unwind: "$dates"
},
{
$group: {
_id: "$dates",
count: { $sum: 1 }
}
},
{
$group: {
_id: null,
avg_day_count: { $avg: "$count" }
}
}
])
Mongo Playground
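Both answers come down to the same ratio: the total number of entries divided by the number of distinct days. A plain JavaScript sketch of that calculation on the array from the question (the function name is made up for the example):

```javascript
// The dates array from the question.
const dates = [
  "2019-01-29", "2019-01-29", "2019-01-29", "2019-01-29",
  "2019-02-06", "2019-02-06", "2019-02-06",
  "2019-02-08",
  "2019-06-04",
  "2019-06-25",
  "2019-07-26",
  "2019-08-15", "2019-08-15",
];

// Total number of entries divided by the number of distinct days.
function averageDayCount(values) {
  const uniqueDays = new Set(values).size;
  return values.length / uniqueDays;
}

console.log(averageDayCount(dates));
```

Here there are 13 entries over 7 distinct days, so the result is 13 / 7.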
