Translate MongoDB aggregation query to spring-data-mongodb

Is it possible to translate this mongo shell aggregation query to spring-data?
db.getCollection("X").aggregate([
{
$group: {
_id: {
year: {
$year: "$happenedAt"
},
month: {
$month: "$happenedAt"
},
day: {
$dayOfMonth: "$happenedAt"
}
},
count: {
$sum: 1
}
}
}
])
Thanks

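// Assumes static imports of newAggregation, match and group from Aggregation, documents that already store year, month and day fields, and variables y, m, d holding the values to match.
Criteria c = Criteria.where("year").is(y).and("month").is(m).and("day").is(d);
Aggregation a = newAggregation(match(c), group("year", "month", "day").count().as("total"));
That is it. To derive year, month and day from happenedAt as the shell query does, a project stage would be needed before the group.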

Related

How to use $getfield to get a field from ROOT Document with condition in Aggregation Mongodb

I'm starting to learn aggregation in MongoDB. I have simple documents like the ones below, with 2 fields, name and examScores; examScores is an array containing multiple documents:
{ _id: ObjectId("633199db009be219a43ae426"),
name: 'Max',
examScores:
[ { difficulty: 4, score: 57.9 },
{ difficulty: 6, score: 62.1 },
{ difficulty: 3, score: 88.5 } ] }
{ _id: ObjectId("633199db009be219a43ae427"),
name: 'Manu',
examScores:
[ { difficulty: 7, score: 52.1 },
{ difficulty: 2, score: 74.3 },
{ difficulty: 5, score: 53.1 } ] }
Now I query the maximum score of each person using $unwind and $group/$max as below:
db.test.aggregate([
{$unwind: "$examScores"},
{$group: {_id: {name: "$name"}, maxScore: {$max: "$examScores.score"}}}
])
{ _id: { name: 'Max' }, maxScore: 88.5 }
{ _id: { name: 'Manu' }, maxScore: 74.3 }
But I want the result to also contain the examScores.difficulty field corresponding to name and examScores.score, like below:
{ _id: { name: 'Max' }, difficulty: 3, maxScore: 88.5 }
{ _id: { name: 'Manu' }, difficulty: 2, maxScore: 74.3 }
I know that I can use $sort + $group and $first to achieve this goal, but I want to use $getField or some other method to get data from the ROOT doc.
My idea is to use $project and $getField to get the difficulty field from the ROOT doc (or the $unwind version of the ROOT doc) with a condition like ROOT.name = Aggregate.name and ROOT.examScores.score = Aggregate.maxScore.
It will look something like this:
{$project:
{name: 1,
maxScore: 1,
difficulty:
{$getField: {
field: "$examScores.difficulty"
input: "$$ROOT.$unwind() with condition/filter"}
}
}
}
I wonder if this is possible in MongoDB?
Solution 1
$unwind
$group - Group by name. You need $push to add the $$ROOT document into data array.
$project - Set the difficulty field by getting the value of examScores.difficulty from the first item of the filtered data array by matching the examScores.score with maxScore.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$group: {
_id: {
name: "$name"
},
maxScore: {
$max: "$examScores.score"
},
data: {
$push: "$$ROOT"
}
}
},
{
$project: {
_id: 0,
name: "$_id.name",
maxScore: 1,
difficulty: {
$getField: {
field: "difficulty",
input: {
$getField: {
field: "examScores",
input: {
$first: {
$filter: {
input: "$data",
cond: {
$eq: [
"$$this.examScores.score",
"$maxScore"
]
}
}
}
}
}
}
}
}
}
}
])
Solution 2: $rank
$unwind
$setWindowFields - Rank the documents with $rank, partitioned by name and sorted by examScores.score descending.
$match - Filter the document with { rank: 1 }.
$unset - Remove rank field.
db.collection.aggregate([
{
$unwind: "$examScores"
},
{
$setWindowFields: {
partitionBy: "$name",
sortBy: {
"examScores.score": -1
},
output: {
rank: {
$rank: {}
}
}
}
},
{
$match: {
rank: 1
}
},
{
$unset: "rank"
}
])
Opinion: I would say this approach:
$sort by examScores.score descending
$group by name, take the first document
would be much easier.
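The $sort + $group + $first approach described above can be sketched as follows. The pipeline uses the field names from the question; the topScorePerName function is a hypothetical plain-JavaScript mirror of the same three stages, included only so the logic can be checked without a server.

```javascript
// Shell pipeline for the $sort + $group + $first approach (field names from the question).
const sortGroupPipeline = [
  { $unwind: "$examScores" },
  { $sort: { "examScores.score": -1 } },
  {
    $group: {
      _id: { name: "$name" },
      difficulty: { $first: "$examScores.difficulty" },
      maxScore: { $first: "$examScores.score" }
    }
  }
];
// In the shell: db.test.aggregate(sortGroupPipeline)

// Hypothetical plain-JavaScript mirror of the three stages, for offline checking only.
function topScorePerName(docs) {
  // $unwind: one row per array element
  const unwound = docs.flatMap(d => d.examScores.map(e => ({ name: d.name, ...e })));
  // $sort: descending by score
  unwound.sort((a, b) => b.score - a.score);
  // $group with $first: keep the first row seen for each name
  const firstSeen = new Map();
  for (const row of unwound) {
    if (!firstSeen.has(row.name)) {
      firstSeen.set(row.name, {
        _id: { name: row.name },
        difficulty: row.difficulty,
        maxScore: row.score
      });
    }
  }
  return [...firstSeen.values()];
}
```

Because the rows are sorted before grouping, $first picks the highest-scoring entry of each group, so difficulty and maxScore come from the same array element.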
There's no need to $unwind and then rebuild the documents again via $group to achieve your desired results. I'd recommend avoiding that altogether.
Instead, consider processing the arrays inline using array expression operators. Depending on the version and exact results you are looking for, here are two starting points that may be worth considering. In particular the $maxN operator and the $sortArray operator may be of interest for this particular question.
You can get a sense for what these two operators do by running an $addFields aggregation to see their output.
With those as a starting point, it's really up to you to make the pipeline output the desired result. Here is one such example that matches the output you described in the question pretty well:
db.collection.aggregate([
{
"$addFields": {
"relevantEntry": {
$first: {
$sortArray: {
input: "$examScores",
sortBy: {
"score": -1
}
}
}
}
},
},
{
"$project": {
_id: 0,
name: 1,
difficulty: "$relevantEntry.difficulty",
maxScore: "$relevantEntry.score"
}
}
])
Which yields:
[
{
"difficulty": 3,
"maxScore": 88.5,
"name": "Max"
},
{
"difficulty": 2,
"maxScore": 74.3,
"name": "Manu"
}
]
Also worth noting that this particular approach doesn't do anything special if there are duplicates. You could look into using $filter if something more was needed in that regard.
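As a rough sketch of the $filter idea for duplicates, the pipeline below keeps every array entry whose score equals the maximum, rather than only the first one. The output field name topEntries and the helper function of the same name are assumptions made for illustration; the function is a plain-JavaScript mirror for checking the logic offline.

```javascript
// Sketch: keep all entries tied for the maximum score (topEntries is a made-up field name).
const tiesPipeline = [
  { $addFields: { maxScore: { $max: "$examScores.score" } } },
  {
    $project: {
      _id: 0,
      name: 1,
      maxScore: 1,
      topEntries: {
        $filter: {
          input: "$examScores",
          cond: { $eq: ["$$this.score", "$maxScore"] }
        }
      }
    }
  }
];
// In the shell: db.collection.aggregate(tiesPipeline)

// Hypothetical plain-JavaScript mirror of the two stages for a single document.
function topEntries(doc) {
  const maxScore = Math.max(...doc.examScores.map(e => e.score));
  return {
    name: doc.name,
    maxScore,
    topEntries: doc.examScores.filter(e => e.score === maxScore)
  };
}
```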

How to query a date that have 20 or more documents in mongoDB in the same day and hour

I have a mongoDB database structure like this
{
_id: ObjectId,
name: string,
scheduledDate: ISOString
}
I want to return all scheduledDate days that occur 2 times or more across the whole database.
Example:
{
_id: ObjectId,
name: 'example1',
scheduledDate: "2022-04-15T05:44:00.000Z"
},
{
_id: ObjectId,
name: 'example1',
scheduledDate: "2022-04-15T07:44:00.000Z"
},
{
_id: ObjectId,
name: 'example1',
scheduledDate: "2022-04-18T02:44:00.000Z"
},
{
_id: ObjectId,
name: 'example1',
scheduledDate: "2022-04-18T02:20:00.000Z"
},
{
_id: ObjectId,
name: 'example1',
scheduledDate: "2022-04-18T02:44:00.000Z"
},
{
_id: ObjectId,
name: 'example1',
scheduledDate: "2022-04-10T05:44:00.000Z"
}
In this example 2022-04-15 repeats 2 times and 2022-04-18 repeats 3 times, so both match the criteria (2 times or more), and I want to return both days.
Is this possible?
Like this:
{
scheduledDate:"2022-04-15T00:00:00.000Z"
},
{
scheduledDate:"2022-04-18T00:00:00.000Z"
}
And one more question: is it possible to do the same with hours? That is, a list of specific hours of scheduledDate that repeat X times across the whole database.
Use $group with $toDate and $dateTrunc. For hours, set unit to "hour"; to require X repetitions, match on count with $gte: X.
db.collection.aggregate([
{
$group: {
_id: {
$dateTrunc: {
date: { $toDate: "$scheduledDate" },
unit: "day"
}
},
count: { $sum: 1 }
}
},
{
$match: {
count: { $gt: 1 }
}
},
{
$project: {
_id: 0,
scheduledDate: "$_id"
}
}
])
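To cover both the day and the hour case with one pipeline, the unit and the threshold can be made parameters. The sketch below assumes a MongoDB version that supports $dateTrunc; repeatedSlotsPipeline and repeatedSlots are hypothetical helper names, and the second function is only a plain-JavaScript mirror that works on UTC ISO strings like those in the question.

```javascript
// Hypothetical helper: builds the pipeline for a given $dateTrunc unit and minimum count.
function repeatedSlotsPipeline(unit, minCount) {
  return [
    {
      $group: {
        _id: { $dateTrunc: { date: { $toDate: "$scheduledDate" }, unit } },
        count: { $sum: 1 }
      }
    },
    { $match: { count: { $gte: minCount } } },
    { $project: { _id: 0, scheduledDate: "$_id" } }
  ];
}
// In the shell: db.collection.aggregate(repeatedSlotsPipeline("hour", 20))

// Plain-JavaScript mirror for UTC ISO strings: truncation is just a prefix cut.
function repeatedSlots(isoStrings, unit, minCount) {
  const prefixLength = unit === "hour" ? 13 : 10; // "YYYY-MM-DDTHH" or "YYYY-MM-DD"
  const counts = new Map();
  for (const s of isoStrings) {
    const key = s.slice(0, prefixLength);
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  return [...counts]
    .filter(([, n]) => n >= minCount)
    .map(([key]) => key)
    .sort();
}
```

With the sample data from the question, the day unit returns 2022-04-15 and 2022-04-18, while the hour unit returns only the 02:00 hour of 2022-04-18, since the two 2022-04-15 entries fall in different hours.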

mongodb group each fields along with total count

I have a collection of documents in Cosmos DB. How can I group the distinct values of each field using MongoDB?
Here is my sample data:
{
"_id" : ObjectId("61ba65af74cf385ee93ad2c8"),
"Car_brand":"A",
"Plate_number":"5",
"Model_year":"2015",
"Company":"Tesla Motors"
},
{
"_id" : ObjectId("61ba65af74cf385ee93ad2c9"),
"Car_brand":"B",
"Plate_number":"2",
"Model_year":"2021",
"Company":"Tesla Motors",
},
{
"_id" : ObjectId("61ba65af74cf385ee93ad2ca"),
"Car_brand":"B",
"Plate_number":"2",
"Model_year":"2011",
"Company":"Lamborghini",
}
expected:
{
"Car_brand":["A","B"]
"Plate_number":["5","2"]
"Model_year":["2015","2021","2011"]
"Company":["Lamborghini","Tesla Motors"]
}
Option 1: Here is how to do it in MongoDB; I guess it is similar in Cosmos DB:
db.collection.aggregate([
{
$group: {
_id: null,
Car_brand: {
$push: "$Car_brand"
},
Plate_number: {
$push: "$Plate_number"
},
Model_year: {
$push: "$Model_year"
},
Company: {
$push: "$Company"
}
}
}
])
Option 2: Later I realized you need the distinct values. Here is an example:
db.collection.aggregate([
{
$group: {
_id: null,
Car_brand: {
$addToSet: "$Car_brand"
},
Plate_number: {
$addToSet: "$Plate_number"
},
Model_year: {
$addToSet: "$Model_year"
},
Company: {
$addToSet: "$Company"
}
}
}
])
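The difference between the two options can be illustrated in plain JavaScript: $push collects every value, while $addToSet keeps each value once. The distinctPerField function below is a hypothetical mirror of Option 2, not part of any MongoDB API. It keeps values in first-seen order, whereas the order of elements produced by $addToSet is not guaranteed.

```javascript
// Hypothetical mirror of Option 2: collect the distinct values of each listed field.
function distinctPerField(docs, fields) {
  const result = {};
  for (const field of fields) {
    // A Set drops duplicates, as $addToSet does (the server does not guarantee order).
    result[field] = [...new Set(docs.map(d => d[field]))];
  }
  return result;
}

const cars = [
  { Car_brand: "A", Plate_number: "5", Model_year: "2015", Company: "Tesla Motors" },
  { Car_brand: "B", Plate_number: "2", Model_year: "2021", Company: "Tesla Motors" },
  { Car_brand: "B", Plate_number: "2", Model_year: "2011", Company: "Lamborghini" }
];
const distinctCars = distinctPerField(cars, ["Car_brand", "Plate_number", "Model_year", "Company"]);
```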

Difference between two consecutive mongodb documents fields

I have a time-series collection in mongodb. Which looks like this:
{ _id: 1, time: 2021-01-03T06:26:20.000+00:00 }
{ _id: 2, time: 2021-01-03T06:26:21.000+00:00 }
{ _id: 3, time: 2021-01-03T06:26:22.000+00:00 }
I want to accumulate over all documents based on the time field, with the documents sorted by time. The output should look like (t3-t2) + (t2-t1), so for this data the output will be 2 seconds.
In PostgreSQL we can use a window function or joins to calculate this. How can it be calculated in MongoDB?
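You can use this one. The consecutive differences telescope, (t3-t2) + (t2-t1) = t3-t1, so the total equals the maximum time minus the minimum time:
db.collection.aggregate([
// Optional filter: this excludes document 3, giving 1 second for the sample data; remove the stage to include every document and get 2 seconds.
{ $match: { _id: { $ne: 3 } } },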
{ $group: { _id: null, max_time: { $max: "$time" }, min_time: { $min: "$time" } } },
{ $set: { difference: { $divide: [{ $subtract: ["$max_time", "$min_time"] }, 1000] } } }
])
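The reason a $max/$min pair is enough can be checked with a small plain-JavaScript sketch: summing the differences between consecutive sorted timestamps gives the same result as subtracting the earliest from the latest. Both function names are hypothetical and only illustrate the arithmetic, not a MongoDB API.

```javascript
// Sum of (t[i] - t[i-1]) over the sorted timestamps, in seconds.
function sumOfConsecutiveDifferencesSeconds(times) {
  const sorted = times.map(t => new Date(t).getTime()).sort((a, b) => a - b);
  let totalMs = 0;
  for (let i = 1; i < sorted.length; i++) {
    totalMs += sorted[i] - sorted[i - 1];
  }
  return totalMs / 1000;
}

// Latest minus earliest timestamp, in seconds (what the $group + $subtract pipeline computes).
function maxMinusMinSeconds(times) {
  const ms = times.map(t => new Date(t).getTime());
  return (Math.max(...ms) - Math.min(...ms)) / 1000;
}

const sampleTimes = [
  "2021-01-03T06:26:20.000+00:00",
  "2021-01-03T06:26:21.000+00:00",
  "2021-01-03T06:26:22.000+00:00"
];
```

For the three sample documents both functions return 2, matching the expected output in the question.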

How to delete documents returned by an aggregation query in mongodb

I am attempting to delete all the documents returned by an aggregation in Mongodb.
The query I have is as follows:
db.getCollection("Collection")
.aggregate([
{
$match: { status: { $in: ["inserted", "done", "duplicated", "error"] } }
},
{
$project: {
yearMonthDay: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }
}
},
{ $match: { yearMonthDay: { $eq: "2019-08-06" } } }
])
.forEach(function(doc) {
db.getCollection("Collection").remove({});
});
I tried this query but it removes all the data in the database, any suggestions please?
Since the remove call doesn't have a query condition, it matches every document in the collection and deletes them all, irrespective of the aggregation result.
Solution (match the ids of the current cursor doc):
db.getCollection("Collection")
.aggregate([
{
$match: { status: { $in: ["inserted", "done", "duplicated", "error"] } }
},
{
$project: {
yearMonthDay: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }
}
},
{ $match: { yearMonthDay: { $eq: "2019-08-06" } } }
])
.forEach(function(doc) {
db.getCollection("Collection").remove({ "_id": doc._id });
});
A better solution makes a single round trip for the deletion: collect the list of ids from the aggregation cursor via cursor.map(), then delete them with one call.
var idsList = db
.getCollection("Collection")
.aggregate([
{
$match: { status: { $in: ["inserted", "done", "duplicated", "error"] } }
},
{
$project: {
yearMonthDay: { $dateToString: { format: "%Y-%m-%d", date: "$date" } }
}
},
{ $match: { yearMonthDay: { $eq: "2019-08-06" } } }
])
.map(function(d) {
return d._id;
});
//now delete those documents via $in operator
db.getCollection("Collection").remove({ _id: { $in: idsList } });
For your query, there is no need to filter with an aggregation and then remove with a separate call. You can apply the same filters directly in the remove() method.
The $expr operator allows the use of aggregation expressions within the query language:
db.getCollection("Collection").remove({
$and: [
{ status: { $in: ["inserted", "done", "duplicated", "error"] } },
{
$expr: {
$eq: [
{ $dateToString: { format: "%Y-%m-%d", date: "$date" } },
"2019-08-06"
]
}
}
]
});
This also works with the deleteOne and deleteMany methods.
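As an alternative sketch, the same day can be expressed as a half-open range on the date field itself, which avoids formatting every document's date as a string and may allow an index on date to be used. This assumes date is stored as a BSON date and that the UTC calendar day is what is wanted (the $dateToString call above does not set a timezone). The dayRangeFilter helper is hypothetical.

```javascript
// Hypothetical helper: filter for documents whose `date` falls on the given UTC day.
function dayRangeFilter(day, statuses) {
  const start = new Date(`${day}T00:00:00.000Z`);
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000); // start of the next UTC day
  return {
    status: { $in: statuses },
    date: { $gte: start, $lt: end }
  };
}

const filter = dayRangeFilter("2019-08-06", ["inserted", "done", "duplicated", "error"]);
// In the shell: db.getCollection("Collection").deleteMany(filter)
```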
