I have a requirement where I have a collection with the following documents:
{
"_id" : 1,
"name" : "sam",
"Array" : [
{ "K" : "A", "V" : 8 },
{ "K" : "B", "V" : 5 },
{ "K" : "C", "V" : 13 }
]
},
{
"_id" : 2,
"name" : "tom",
"Array" : [
{ "K" : "D", "V" : 12 },
{ "K" : "E", "V" : 14 },
{ "K" : "F", "V" : 2 }
]
},
{
"_id" : 3,
"name" : "jim",
"Array" : [
{ "K" : "G", "V" : 9 },
{ "K" : "H", "V" : 4 },
{ "K" : "I", "V" : 2 }
]
}
I would like to run a query that returns, for each _id, the sub-document with the highest "V"; in this case I would get:
{ "_id" : 1, "name" : "sam", "Array" : [ { "K" : "C", "V" : 13 } ] }
{ "_id" : 2, "name" : "tom", "Array" : [ { "K" : "E", "V" : 14 } ] }
{ "_id" : 3, "name" : "jim", "Array" : [ { "K" : "G", "V" : 9 } ] }
You can select only the sub-documents whose V field equals the maximum value in the array using the $filter and $max operators.
The $addFields pipeline stage is used here so that all other fields in the document pass through unchanged.
db.collection.aggregate([
  {
    "$addFields": {
      "Array": {
        "$filter": {
          "input": "$Array",
          "cond": {
            "$eq": ["$$this.V", { "$max": "$Array.V" }]
          }
        }
      }
    }
  }
])
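As a cross-check, here is a rough plain-JavaScript sketch of what the $filter/$max stage computes (illustrative only, not what the server executes):

```javascript
// Keep only the array elements whose V equals the array's maximum V,
// mirroring $addFields + $filter with $eq against $max: "$Array.V".
function keepMaxV(doc) {
  const maxV = Math.max(...doc.Array.map(e => e.V)); // like $max: "$Array.V"
  return { ...doc, Array: doc.Array.filter(e => e.V === maxV) }; // like $filter
}

const doc = { _id: 1, name: "sam", Array: [{ K: "A", V: 8 }, { K: "B", V: 5 }, { K: "C", V: 13 }] };
console.log(keepMaxV(doc).Array); // [ { K: 'C', V: 13 } ]
```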
I have a collection whose documents consist of name and data.
data is an array with two elements; each element is an object with code and qty.
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"code" : "a",
"qty" : 3
},
{
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"code" : "c",
"qty" : 5
},
{
"code" : "d",
"qty" : 6
}
]
}
I want to update the code of the first element to the name of its document. The result I want is:
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"code" : "Anna",
"qty" : 3
},
{
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"code" : "James",
"qty" : 5
},
{
"code" : "d",
"qty" : 6
}
]
}
I googled how to:
update an array at a specific index (https://stackoverflow.com/a/34177929/11738185)
db.Collection.updateMany(
{ },
{
$set:{
'data.0.code': '$name'
}
}
)
But this sets the code of the first element in the data array to the literal string '$name', not the value (Anna, James):
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"code" : "$name",
"qty" : 3
},
{
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"code" : "$name",
"qty" : 5
},
{
"code" : "d",
"qty" : 6
}
]
}
update a field using the value of another field. That led me to an update with an aggregation pipeline (https://stackoverflow.com/a/37280419/11738185): the second parameter of updateMany is an array (a pipeline)
db.Collection.updateMany(
{ },
[{
$set:{
'data.0.code': '$name'
}
}]
)
but it adds a field "0" to each element in the data array:
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"0" : {
"code" : "Anna"
},
"code" : "a",
"qty" : 3
},
{
"0" : {
"code" : "Anna"
},
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"0" : {
"code" : "James"
},
"code" : "c",
"qty" : 5
},
{
"0" : {
"code" : "James"
},
"code" : "d",
"qty" : 6
}
]
}
I can't find a solution for this case. Could anyone help me? How can I update an array at a fixed index using another field? Thanks for reading!
1. update array at a specific index
You can't reference another field of the document as the value in a regular update; that syntax works only with an external value, like { $set: { "data.0.code": "Anna" } }.
2. update a field by the value of another field
An update with an aggregation pipeline does not allow the data.0.code dot-index syntax.
You can try using $reduce in an update with an aggregation pipeline:
$reduce iterates over the data array, starting from an empty array as initialValue. If the accumulated array is still empty (i.e., this is the first element), replace code with name by merging into the current object with $mergeObjects; otherwise return the current object unchanged.
$concatArrays appends the (possibly modified) current object to the accumulated array.
db.collection.update({},
[{
$set: {
data: {
$reduce: {
input: "$data",
initialValue: [],
in: {
$concatArrays: [
"$$value",
[
{
$cond: [
{ $eq: [{ $size: "$$value" }, 0] },
{ $mergeObjects: ["$$this", { code: "$name" }] },
"$$this"
]
}
]
]
}
}
}
}
}],
{ multi: true }
)
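As a sanity check, the $reduce stage above corresponds to this rough plain-JavaScript sketch (illustrative only, not what the server executes):

```javascript
// Rebuild the data array, overwriting "code" with the document's "name"
// on the first element only, mirroring the $reduce/$concatArrays pipeline.
function setFirstCode(doc) {
  const data = doc.data.reduce((value, elem) =>
    value.concat(
      value.length === 0
        ? { ...elem, code: doc.name } // like $mergeObjects on the first element
        : elem                        // later elements pass through unchanged
    ),
    []);
  return { ...doc, data };
}

const out = setFirstCode({ name: "Anna", data: [{ code: "a", qty: 3 }, { code: "b", qty: 4 }] });
console.log(out.data); // [ { code: 'Anna', qty: 3 }, { code: 'b', qty: 4 } ]
```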
I think another way would be easier.
Just read the document first, then use its value in the update:
var annaModel = nameModel.findOne({_id: "605c666a15d2612ed0afedd2" })
nameModel.findOneAndUpdate({_id: "605c666a15d2612ed0afedd2"},{$set:{'data.0.code': annaModel.name}})
I have a MeterReadings collection that looks like the following.
{
"_id" : ObjectId("5fc768b33561870a262813c6"),
"installedAppId" : "A",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984662,
"u" : "W",
"v" : 100
}
]
}
{
"_id" : ObjectId("5fc768c73561870a262813c7"),
"installedAppId" : "B",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984662,
"u" : "W",
"v" : 200
}
]
}
{
"_id" : ObjectId("5fc768e43561870a262813c8"),
"installedAppId" : "A",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 300
}
]
}
My desired output is to group by installedAppId and collect the readings from all matching documents into a single entry. Shown below is what I am aiming for:
{
"_id" : ObjectId("5fc768b33561870a262813c6"),
"installedAppId" : "A",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984662,
"u" : "W",
"v" : 100
},{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 300
}
]
}
{
"_id" : ObjectId("5fc768c73561870a262813c7"),
"installedAppId" : "B",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984662,
"u" : "W",
"v" : 200
}
]
}
Grouping by installedAppId does return two groups using the data above.
> db.MeterReadings.aggregate([ {$group: {_id: {installedAppId: "$installedAppId"}}} ])
{ "_id" : { "installedAppId" : "B" } }
{ "_id" : { "installedAppId" : "A" } }
However, since each reading is different, adding readings to the $group _id just returns every document, the same as querying the entire collection:
> db.MeterReadings.aggregate([ {$group: {_id: {installedAppId: "$installedAppId", readings: "$readings"}}} ]).pretty()
{
"_id" : {
"installedAppId" : "A",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 10.2
}
]
}
}
{
"_id" : {
"installedAppId" : "B",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984662,
"u" : "W",
"v" : 10.2
}
]
}
}
{
"_id" : {
"installedAppId" : "A",
"readings" : [
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984662,
"u" : "W",
"v" : 10.2
}
]
}
}
Any help is welcome!
Either $push or $addToSet seems to do the trick, but each one pushes a whole array per document, producing nested arrays. It would be nicer to push into a single flat array.
addToSet
> db.MeterReadings.aggregate([{$group : {_id : "$installedAppId", readings: {$addToSet : "$readings"}}}]).pretty()
{
"_id" : "B",
"readings" : [
[
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 200
}
]
]
}
{
"_id" : "A",
"readings" : [
[
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 300
}
],
[
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 100
}
]
]
}
push
> db.MeterReadings.aggregate([{$group : {_id : "$installedAppId", readings: {$push: "$readings"}}}]).pretty()
{
"_id" : "B",
"readings" : [
[
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 200
}
]
]
}
{
"_id" : "A",
"readings" : [
[
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 100
}
],
[
{
"n" : "daf43d66-6c3b-4553-80af-6a0b1cf97418:power",
"t" : 1606902984672,
"u" : "W",
"v" : 300
}
]
]
}
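The extra nesting happens because each document's readings field is itself an array, so $push appends arrays rather than elements. Here is a rough plain-JavaScript sketch of the shape and the fix (field values abbreviated; on the server side, an assumed alternative is to $unwind readings before the $group so individual elements are pushed):

```javascript
// What $push produced above: one inner array per grouped document.
const pushed = [
  [{ n: "power", t: 1606902984662, u: "W", v: 100 }],
  [{ n: "power", t: 1606902984672, u: "W", v: 300 }]
];

// Flattening one level yields the single readings array the question asks for.
const flat = pushed.flat();
console.log(flat.map(r => r.v)); // [ 100, 300 ]
```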
I am trying to remove duplicates from MongoDB, but all the solutions I find fail.
My JSON structure:
{
"_id" : ObjectId("5d94ad15667591cf569e6aa4"),
"a" : "aaa",
"b" : "bbb",
"c" : "ccc",
"d" : "ddd",
"key" : "057cea2fc37aabd4a59462d3fd28c93b"
}
The key value is md5(a+b+c+d).
I already have a database with over 1 billion records, and I want to remove all the duplicates by key and then add a unique index, so that a record with an existing key won't be inserted again.
I already tried
db.data.ensureIndex( { key:1 }, { unique:true, dropDups:true } )
But from what I understand, dropDups was removed in MongoDB 3.0 and later.
I also tried several JavaScript snippets like:
var duplicates = [];
db.data.aggregate([
{ $match: {
key: { "$ne": '' } // discard selection criteria
}},
{ $group: {
_id: { key: "$key"}, // can be grouped on multiple properties
dups: { "$addToSet": "$_id" },
count: { "$sum": 1 }
}},
{ $match: {
count: { "$gt": 1 } // Duplicates considered as count greater than one
}}
],
{allowDiskUse: true} // For faster processing if set is larger
).forEach(function(doc) {
doc.dups.shift(); // First element skipped for deleting
doc.dups.forEach( function(dupId){
duplicates.push(dupId); // Getting all duplicate ids
}
)
})
and it fails with:
QUERY [js] uncaught exception: Error: command failed: {
"ok" : 0,
"errmsg" : "assertion src/mongo/db/pipeline/value.cpp:1365",
"code" : 8,
"codeName" : "UnknownError"
} : aggregate failed
I haven't changed any MongoDB settings; I'm working with the defaults.
This is my input collection dups, with some duplicate data (k with values 11 and 22):
{ "_id" : 1, "k" : 11 }
{ "_id" : 2, "k" : 22 }
{ "_id" : 3, "k" : 11 }
{ "_id" : 4, "k" : 44 }
{ "_id" : 5, "k" : 55 }
{ "_id" : 6, "k" : 66 }
{ "_id" : 7, "k" : 22 }
{ "_id" : 8, "k" : 88 }
{ "_id" : 9, "k" : 11 }
The following query returns one document per key, dropping the duplicates from the result (note that it does not delete anything from the collection by itself):
db.dups.aggregate([
{ $group: {
_id: "$k",
dups: { "$addToSet": "$_id" },
count: { "$sum": 1 }
}},
{ $project: { k: "$_id", _id: { $arrayElemAt: [ "$dups", 0 ] } } }
] )
=>
{ "k" : 88, "_id" : 8 }
{ "k" : 22, "_id" : 7 }
{ "k" : 44, "_id" : 4 }
{ "k" : 55, "_id" : 5 }
{ "k" : 66, "_id" : 6 }
{ "k" : 11, "_id" : 9 }
As you can see, the following duplicate documents are dropped from the result:
{ "_id" : 1, "k" : 11 }
{ "_id" : 2, "k" : 22 }
{ "_id" : 3, "k" : 11 }
Get the results in an array:
var arr = db.dups.aggregate([ ...] ).toArray()
arr then holds the array of documents:
[
{
"k" : 88,
"_id" : 8
},
{
"k" : 22,
"_id" : 7
},
{
"k" : 44,
"_id" : 4
},
{
"k" : 55,
"_id" : 5
},
{
"k" : 66,
"_id" : 6
},
{
"k" : 11,
"_id" : 9
}
]
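The grouping logic can be sanity-checked with a rough plain-JavaScript sketch (illustrative only; which _id survives per key is not guaranteed in the real pipeline, since $addToSet with $arrayElemAt gives no ordering guarantee, and this sketch keeps the last one seen):

```javascript
// Group documents by k and keep a single _id per key,
// mirroring the $group + $project selection above.
function keepOnePerKey(docs) {
  const kept = new Map();
  for (const d of docs) kept.set(d.k, d._id); // later docs overwrite earlier ones
  return [...kept.entries()].map(([k, _id]) => ({ k, _id }));
}

const docs = [
  { _id: 1, k: 11 }, { _id: 3, k: 11 }, { _id: 9, k: 11 },
  { _id: 2, k: 22 }, { _id: 7, k: 22 }
];
console.log(keepOnePerKey(docs)); // [ { k: 11, _id: 9 }, { k: 22, _id: 7 } ]
```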
{
"_id" : 123,
"a" : [
{
"b" : 1,
"bb" : 2
},
{
"c" : 2,
"cc" : 3
}
],
"ab" : [
{
"d" : 4,
"dd" : 5
},
{
"e" : 5,
"ee" : 6
}
]
}
I need to remove a specific nested document from an array within a document.
Given the inputs _id: 123 and ab.d = 4, the output should be:
{
"_id" : 123,
"a" : [
{
"b" : 1,
"bb" : 2
},
{
"c" : 2,
"cc" : 3
}
],
"ab" : [
{
"e" : 5,
"ee" : 6
}
]
}
You are looking for an update with the $pull operator (https://docs.mongodb.com/manual/reference/operator/update/pull/).
In your case:
db.mycollection.update({"_id":123}, {$pull: {"ab":{"d":4}}})
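In effect, $pull keeps only the array elements that do not match the condition. A rough plain-JavaScript sketch of what this update does to the document:

```javascript
// Effect of { $pull: { ab: { d: 4 } } }: drop array elements where d === 4.
const doc = {
  _id: 123,
  ab: [{ d: 4, dd: 5 }, { e: 5, ee: 6 }]
};
doc.ab = doc.ab.filter(elem => elem.d !== 4);
console.log(doc.ab); // [ { e: 5, ee: 6 } ]
```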
I was looking into the different ways of querying an array of embedded documents in the MongoDB aggregation pipeline. It looks like MongoDB's support for this is limited.
Let's say we have the following documents in a test collection:
/* 1 */
{
"_id" : ObjectId("59df2c39fbd406137d4290b3"),
"a" : 1.0,
"arr" : [
{
"key": 1,
"sn" : "a",
"org": "A"
}
]
}
/* 2 */
{
"_id" : ObjectId("59df2c47fbd406137d4290b4"),
"a" : 2.0,
"arr" : [
{
"sn" : "b",
"key": 2,
"org": "B"
}
]
}
/* 3 */
{
"_id" : ObjectId("59df2c50fbd406137d4290b5"),
"a" : 3.0,
"arr" : [
{
"key": 3,
"sn" : "c",
"org": "C"
}
]
}
/* 4 */
{
"_id" : ObjectId("59df2c85fbd406137d4290b6"),
"a" : 1.0,
"arr" : [
{
"key": 1,
"sn" : "a",
"org": " A"
}
]
}
/* 5 */
{
"_id" : ObjectId("59df2c9bfbd406137d4290b7"),
"a" : 3.0,
"arr" : [
{
"sn" : "b",
"key": 2,
}
]
}
/* 6 */
{
"_id" : ObjectId("59df2e41fbd406137d4290b8"),
"a" : 4.0,
"arr" : [
{
"sn" : "b",
"key" : 2
}
]
}
/* 7 */
{
"_id" : ObjectId("59df2e5ffbd406137d4290b9"),
"a" : 5.0,
"arr" : [
{
"key" : 2,
"sn" : "b"
},
{
"sn" : "a",
"key" : 1
}
]
}
I wanted to categorize the above documents based on the "arr.sn" field value using the query below:
db.test.aggregate([{"$addFields": {"Category" : { $switch: {
branches : [
{ case : { $eq : [ "$arr.sn", "a" ] }, then : "Category 1"}
],
default : "No Category"
}}}}])
but the $eq operator is not giving the correct result, whereas the equivalent condition in the find method works:
db.test.find({"arr.sn" : "a"})
I am looking for a way to do this with a single field, in this case "arr.sn". Is there any way to project that field from the embedded documents in the array?
Any help would be appreciated.
$eq (aggregation) compares its operands by value and type, and "$arr.sn" resolves to an array of values, so comparing it with the string "a" never matches. The query eq operator, by contrast, matches a scalar against any element of an array.
You need $in (aggregation) to check whether a value is present in an array.
Something like
[
{
"$addFields": {
"Category": {
"$switch": {
"branches": [
{
"case": {
"$in": [
"a",
"$arr.sn"
]
},
"then": "Category 1"
}
],
"default": "No Category"
}
}
}
}
]
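The behavioral difference can be sketched in plain JavaScript: "$arr.sn" resolves to the array of sn values, so $eq compares an array with a string (never equal), while $in checks membership:

```javascript
const arr = [{ key: 2, sn: "b" }, { sn: "a", key: 1 }];
const sns = arr.map(e => e.sn); // what "$arr.sn" resolves to: [ 'b', 'a' ]

console.log(sns === "a");       // false: an array is never equal to a string, like the $eq branch
console.log(sns.includes("a")); // true: membership test, like { $in: ["a", "$arr.sn"] }
```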