MongoDB: using skip and distinct in a query based on values inside an array

So I have a document that is structured like this:
{
  _id: ObjectId('62bbe17d8fececa06b91873d'),
  clubName: 'test',
  staff: ['624f4b56ab4f5170570cdba3'] // IDs of staff members
}
A single staff member can be assigned to multiple clubs. What I'm trying to achieve is to get all staff who have been assigned to at least one club and display them in a table on the front end. I followed this solution, since distinct and skip can't be used in a single query, but it just returned this:
{ _id: [ '624f5054ab4f5170570cdd16', '624f5054ab4f5170570cdd16' ] } //staff from club 1,
{ _id: [ '624f5054ab4f5170570cdd16', '624f9194ab4f5170570cded1' ] } //staff from club 2,
{ _id: [ '624f4b56ab4f5170570cdba3' ]} //staff from club 3
my desired outcome would be like this:
[ _id : ['624f5054ab4f5170570cdd16', '624f9194ab4f5170570cded1', '624f4b56ab4f5170570cdba3'] ]
here's my query:
const query = this.clubModel.aggregate(
[{ $group: { _id: '$staff' } }, { $skip: 0}, { $limit: 10}],
(err, results) => {
The values returned are not distinct at all. Is there an operation that can evaluate the values inside an array and make them distinct?
Here's my new query after adding the 'createdAt' field to my document structure:
const query = this.clubModel.aggregate([
  { $sort: { createdAt: -1 } },
  { $unwind: '$drivers' },
  { $project: { isActive: true } },
  { $group: { _id: 'null', ids: { $addToSet: '$drivers' } } },
  { $project: { _id: 0 } },
  { $skip: skip },
  { $limit: limit },
]);

Does this work for you? First $unwind the staff array, then group with "_id" set to "null" and collect the staff values using $addToSet:
[
  { "$unwind": "$staff" },
  { "$group": { "_id": "null", "ids": { "$addToSet": "$staff" } } },
  { "$project": { "_id": 0 } },
  { "$skip": 0 },
  { "$limit": 10 }
]
Here's the working link.
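One thing to keep in mind: grouping on a single constant key emits exactly one document, so $skip and $limit placed after it act on that one document rather than on the individual IDs. If the goal is to page through the distinct staff IDs themselves, one possible variation is to group on the unwound value, so that each distinct ID becomes its own document. The sketch below assumes staff IDs are stored as strings in a staff array; distinctStaffPipeline and distinctStaffPage are hypothetical names, and the second function only emulates the stages in memory so the idea can be checked without a database.

```javascript
// Hypothetical helper: builds a pipeline that pages through distinct staff IDs.
function distinctStaffPipeline(skip, limit) {
  return [
    { $unwind: "$staff" },         // one document per (club, staff ID) pair
    { $group: { _id: "$staff" } }, // one document per distinct staff ID
    { $sort: { _id: 1 } },         // fixed order so pages do not overlap
    { $skip: skip },
    { $limit: limit },
  ];
}

// In-memory emulation of the same steps, for illustration only.
function distinctStaffPage(clubs, skip, limit) {
  const ids = [...new Set(clubs.flatMap((club) => club.staff))].sort();
  return ids.slice(skip, skip + limit);
}

// Sample data shaped like the output shown in the question.
const clubs = [
  { clubName: "club 1", staff: ["624f5054ab4f5170570cdd16", "624f5054ab4f5170570cdd16"] },
  { clubName: "club 2", staff: ["624f5054ab4f5170570cdd16", "624f9194ab4f5170570cded1"] },
  { clubName: "club 3", staff: ["624f4b56ab4f5170570cdba3"] },
];

const firstPage = distinctStaffPage(clubs, 0, 10);
console.log(firstPage); // three distinct IDs, sorted as strings
```

With Mongoose this would presumably be passed as this.clubModel.aggregate(distinctStaffPipeline(skip, limit)), with each result document carrying one staff ID in _id.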


In MongoDB, why would this aggregate() query and this find() query return different results?

They each return a different set of 20 results, with only a few results in common.
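// aggregate() query
db.collection.aggregate([
  { $geoNear: {
      near: {
        type: "Point",
        coordinates: [ /* longitude, latitude (values omitted in the original) */ ]
      },
      distanceField: "distance",
      minDistance: 0,
      maxDistance: 100000,
      spherical: true
  } },
  { $match: { date: { $gt: 1529825292207, $lt: 1659425292207 } } },
  { $project: { id: 1, distance: 1, _id: 0 } },
  { $sort: { date: -1 } },
  { $limit: 20 }
])

// find() query
db.collection.find(
  {
    "loc": {
      "$nearSphere": {
        "$geometry": {
          "type": "Point",
          "coordinates": [ /* longitude, latitude (values omitted in the original) */ ]
        },
        "$minDistance": 0,
        "$maxDistance": 100000
      }
    },
    "date": { $gt: 1529825292207, $lt: 1659425292207 }
  },
  { id: 1, _id: 0, distance: 1 }
).sort({ date: -1 })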
In the aggregation pipeline, you are sorting after projecting out the field you are trying to sort on. By the time the $sort stage runs, none of the documents have a date field, so that stage doesn't change the order of the documents in the stream.
If you reorder the pipeline to put the $sort stage ahead of the $project stage, it should return the same result as the find.
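To make the effect concrete, here is a small in-memory sketch (plain JavaScript, not a MongoDB call) that compares "project, then sort" with "sort, then project". The sample documents and the helper names project and byDateDesc are made up for illustration; missing dates are treated as equal, which mirrors the situation where every document has lost its date field.

```javascript
// Made-up sample documents; only the relative dates matter.
const docs = [
  { id: "a", date: 1, distance: 30 },
  { id: "b", date: 3, distance: 10 },
  { id: "c", date: 2, distance: 20 },
];

// Emulates { $project: { id: 1, distance: 1, _id: 0 } }: date is dropped.
const project = (d) => ({ id: d.id, distance: d.distance });

// Emulates { $sort: { date: -1 } }; documents without a date all share one key.
const key = (d) => (d.date === undefined ? 0 : d.date);
const byDateDesc = (x, y) => key(y) - key(x);

// Original stage order: the sort has nothing to compare, so order is unchanged.
const projectThenSort = docs.map(project).sort(byDateDesc);

// Reordered stages: the sort sees the dates, then the projection trims fields.
const sortThenProject = [...docs].sort(byDateDesc).map(project);

console.log(projectThenSort.map((d) => d.id)); // order of the input
console.log(sortThenProject.map((d) => d.id)); // newest date first
```

In the real pipeline the same fix amounts to moving { $sort: { date: -1 } } so it sits between $match and $project.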

Mongo aggregation framework match a given _id

My model:
const scheduleTaskSchema = new Schema({
  activity: { type: Object, required: true },
  date: { type: Date, required: true },
  crew: Object,
  vehicle: Object,
  pickups: Array,
  details: String,
});
const ScheduleTaskModel = mongoose.model("schedule_task", scheduleTaskSchema);
and this aggregation pipeline:
let aggregation = [
  { $sort: { "pickups.0.time": 1 } },
  { $group: { _id: "$date", tasks: { $push: "$$ROOT" } } },
  { $sort: { _id: -1 } },
];
if (hasDateQuery) {
  // assumed: the $match is prepended, since date is no longer a top-level field after $group
  aggregation.unshift({
    $match: {
      date: { $gte: new Date(start_date), $lte: new Date(end_date) },
    },
  });
} else {
  aggregation.push({ $limit: 2 });
}
const scheduledTasksGroups = await ScheduleTaskModel.aggregate(aggregation);
The crew object can have an arbitrary number of keys, with this structure:
crew: {
  drivers: [
    { _id: "656b1e9cf5b894a4f2v643bc", name: "john" },
    { _id: "567b1e9cf5b954a4f2c643bhh", name: "bill" }
  ],
  officers: [
    { _id: "655b1e9cf5b6632a4f2c643jk", name: "mark" },
    { _id: "876b1e9af5b664a4f2c234bb", name: "jane" }
  ],
  //...any number of keys that contain an array of objects that all have an _id
}
I'm looking for a way to return all documents (before sorting/grouping) that contain a given _id anywhere within the crew object, without knowing which key to search. There can be many different keys, each containing an array of objects that all have an _id.
Any ideas?
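You can use $objectToArray for this:
[
  // turn crew into an array of {k, v} pairs, one per key
  {$addFields: {crewFilter: {$objectToArray: "$crew"}}},
  // count the members, across all keys, whose _id equals the searched _id
  {$set: {
    crewFilter: {$size: {
      $reduce: {
        input: "$crewFilter",
        initialValue: [],
        in: {$concatArrays: [
          "$$value",
          {$filter: {
            input: "$$this.v",
            as: "member",
            cond: {$eq: ["$$member._id", _id]} // _id is the value being searched for
          }}
        ]}
      }
    }}
  }},
  // keep only documents with at least one matching member
  {$match: {crewFilter: {$gt: 0}}}
]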
See how it works on the playground example
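The matching rule itself can be checked outside the database. The sketch below is plain JavaScript, not an aggregation stage: it looks at every array stored under any key of crew and reports whether some member has the given _id. The function name crewHasMember and the sample tasks are hypothetical; the IDs are borrowed from the question.

```javascript
// Returns true when any array under any key of crew holds a member with memberId.
function crewHasMember(crew, memberId) {
  return Object.values(crew ?? {})  // the values behind every key, whatever it is called
    .filter(Array.isArray)           // ignore anything that is not an array
    .flat()                          // one flat list of crew members
    .some((member) => member?._id === memberId);
}

// Made-up tasks; only the crew field matters here.
const tasks = [
  {
    details: "task A",
    crew: {
      drivers: [{ _id: "656b1e9cf5b894a4f2v643bc", name: "john" }],
      officers: [{ _id: "655b1e9cf5b6632a4f2c643jk", name: "mark" }],
    },
  },
  { details: "task B", crew: { officers: [{ _id: "876b1e9af5b664a4f2c234bb", name: "jane" }] } },
  { details: "task C" }, // no crew at all
];

const matches = tasks.filter((t) => crewHasMember(t.crew, "876b1e9af5b664a4f2c234bb"));
console.log(matches.map((t) => t.details));
```

This mirrors what the $objectToArray, $reduce and $filter stages do together: flatten every key's array into one list and test each member's _id.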

How to get specific fields of a document in MongoDB & Mongoose and aggregate some of the fields?

My data looks like this:
"name":"Meat Samosa",
"name":"Pilau Rice",
"name":"Pilau Rice",
I am trying to find a query that will get me all the products (no duplicates) with their quantities added up. Notice that the product ids are different even when the products are the same (same name). Ideally my response would look like this:
[
  { name: "Meat Samosa", price: 3.95, quantity: 1 },
  { name: "Pilau Rice", price: 2.95, quantity: 2 }
]
$project to show required fields
$unwind deconstruct the products array
$group by name and get the first price and count the quantity sum
$project to show required fields
[
  { $project: { _id: 0, products: 1 } },
  { $unwind: "$products" },
  { $group: {
      _id: "$products.name",
      price: { $first: "$products.price" },
      quantity: { $sum: "$products.quantity" }
  } },
  { $project: { _id: 0, name: "$_id", price: 1, quantity: 1 } }
]
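As a sanity check of the unwind-then-group idea, here is a plain JavaScript emulation that runs without a database. The orders array is invented sample data (the ids and the split across two documents are assumptions; names and prices follow the desired output in the question), and totalsByName is a hypothetical helper name.

```javascript
// Invented sample data: products with different ids but repeated names.
const orders = [
  {
    products: [
      { _id: "p1", name: "Meat Samosa", price: 3.95, quantity: 1 },
      { _id: "p2", name: "Pilau Rice", price: 2.95, quantity: 1 },
    ],
  },
  { products: [{ _id: "p3", name: "Pilau Rice", price: 2.95, quantity: 1 }] },
];

// Emulates $unwind on products followed by $group on the product name.
function totalsByName(orders) {
  const totals = new Map();
  for (const product of orders.flatMap((order) => order.products)) {
    // Like $first: keep the price of the first product seen for this name.
    const entry = totals.get(product.name) ??
      { name: product.name, price: product.price, quantity: 0 };
    entry.quantity += product.quantity; // like $sum over products.quantity
    totals.set(product.name, entry);
  }
  return [...totals.values()];
}

const totals = totalsByName(orders);
console.log(totals);
```

Grouping on the name rather than the product id is what collapses the duplicates, since the ids differ even for identical products.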

Aggregate group by array and divide quantity to array length

Now I want to aggregate the schema to group by the users in the array and divide the items field by the array length to create an average.
This is the sample JSON data ->
[{"users": ["5ea40086fc4b145b489da93d","5e8cb9a4462e45178c4d3405"],"isBuilt": true, "_id": "5eadd43b30f97f342cf663fc", "items": 3, ...},
{"users": ["5e8cb9a4462e45178c4d3405"], "isBuilt": true, "_id": "5ead419081eec52258b67f70", "items": 5, ...}]
And after aggregating with ->
[
  { $match: { updatedAt: { $gte: startDate, $lte: endDate }, isBuilt: true } },
  { $unwind: "$users" },
  { $group: { _id: "$users", items: { $sum: '$items' } } },
  { $project: { user: '$_id', items: 1, _id: 0 } }
]
I got this json ->
[{"items": 3, "user": "5ea40086fc4b145b489da93d"}, {"items": 8, "user": "5e8cb9a4462e45178c4d3405"}]
As you can see, I got the sum of items. In the initial data, users "5ea40086fc4b145b489da93d" and "5e8cb9a4462e45178c4d3405" share 3 items, and user "5e8cb9a4462e45178c4d3405" has 5 items. After aggregating, the items are summed per user, so user "5e8cb9a4462e45178c4d3405" gets 8 items and user "5ea40086fc4b145b489da93d" gets 3 items. Now I want to average the items across users: if the users array has 2 or more entries, the items should be divided by the array length before being summed. The final JSON would look like ->
[{"items": 1.5, "user": "5ea40086fc4b145b489da93d"}, {"items": 6.5, "user": "5e8cb9a4462e45178c4d3405"}]
PS: if the result for items is not an integer, it should be rounded to the nearest tenth.
I've solved my problem with this aggregation ->
[
  { $match: { updatedAt: { $gte: startDate, $lte: endDate }, isBuilt: true } },
  // each document's items divided evenly among its users
  { $addFields: { itemsAvg: { $divide: ["$items", { $size: "$users" }] } } },
  // rounded to one decimal place
  { $addFields: { roundedItemsAvg: { $round: ["$itemsAvg", 1] } } },
  { $unwind: "$users" },
  { $group: { _id: "$users", items: { $sum: '$roundedItemsAvg' } } },
  { $project: { user: '$_id', items: 1, _id: 0 } }
]
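For anyone who wants to verify the arithmetic, here is a small in-memory version of the same steps in plain JavaScript, using the two sample documents from the question (only the users and items fields are kept). The function name itemsPerUser is made up, and Math.round is used as a stand-in for $round; the two can differ on exact .5 ties, which $round sends to the nearest even value.

```javascript
// The two sample documents from the question, trimmed to the relevant fields.
const builds = [
  { users: ["5ea40086fc4b145b489da93d", "5e8cb9a4462e45178c4d3405"], items: 3 },
  { users: ["5e8cb9a4462e45178c4d3405"], items: 5 },
];

// Emulates: divide items by the number of users, round to one decimal place,
// unwind users, then sum the rounded shares per user.
function itemsPerUser(builds) {
  const totals = new Map();
  for (const { users, items } of builds) {
    const share = Math.round((items / users.length) * 10) / 10;
    for (const user of users) {
      totals.set(user, (totals.get(user) ?? 0) + share);
    }
  }
  return [...totals].map(([user, items]) => ({ items, user }));
}

const perUser = itemsPerUser(builds);
console.log(perUser);
```

The first document contributes 1.5 to each of its two users and the second contributes 5 to its single user, which gives the 1.5 and 6.5 from the desired output.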

Lookup vs. Lookup with pipeline in MongoDB (Performance & How it works internally)

I'm building a blog and have a question about which would give me better performance: a simple lookup or a lookup with a pipeline. Sometimes the simple lookup returns results faster, and sometimes the pipeline lookup does, so I'm a bit confused about which one to use and where. Suppose I have two collections, a user collection and a comment collection.
// Users Collection
userName: "Web Alchemist"
// Comments Collection
isActive: "YES", // YES or NO
comment: "xyz"
Now I want to look up comments from the users collection. Which approach would be better for this? I wrote two queries that give me the same result.
// Query 1: simple $lookup
[
  { $match: { _id: ObjectId("5d68c019c7d56410cc33b01a") } },
  { $lookup: {
      from: "comments",
      as: "comments",
      localField: "_id",
      foreignField: "userId"
  } },
  { $unwind: "$comments" },
  { $match: { "comments.isActive": "YES" } },
  { $limit: 5 },
  { $project: {
      _id: 1, userName: 1, comments: { _id: "$comments._id", comment: "$comments.comment" }
  } },
  { $group: {
      _id: "$_id",
      userName: { '$first': '$userName' },
      comments: { $addToSet: "$comments" }
  } }
]
// Query 2: $lookup with pipeline
[
  { $match: { _id: ObjectId("5d68c019c7d56410cc33b01a") } },
  { $lookup: {
      from: "comments",
      as: "comments",
      let: { userId: "$_id" },
      pipeline: [
        { $match: {
            $expr: {
              $and: [
                { $eq: ['$userId', '$$userId'] },
                { $eq: ['$isActive', 'YES'] }
              ]
            }
        } },
        { $limit: 5 },
        { $project: { _id: 1, comment: 1 } }
      ]
  } }
]
