I'm trying to remove all duplicates in a collection with ensureIndex and dropDups, but this method doesn't seem to work with arrays.
For example, if I have a collection that looks like this:
{ "_id" : ObjectId("54d8f889e3fdfe0cd8b769ed"), "field1" : "a", "field2" : [ "a", "b" ] }
{ "_id" : ObjectId("54d8f89be3fdfe0cd8b769ee"), "field1" : "a", "field2" : [ "a", "b" ] }
{ "_id" : ObjectId("54d8f8a3e3fdfe0cd8b769ef"), "field1" : "a", "field2" : [ "a", "c" ] }
{ "_id" : ObjectId("54d8f8abe3fdfe0cd8b769f0"), "field1" : "a", "field2" : [ "b", "a" ] }
{ "_id" : ObjectId("54d8f8c5e3fdfe0cd8b769f1"), "field1" : "b", "field2" : [ "a", "b" ] }
and use ensureIndex like this:
> db.test.ensureIndex({field1: 1, field2: 1}, {unique: true, dropDups: true})
the result would be:
> db.test.find()
{ "_id" : ObjectId("54d8f89be3fdfe0cd8b769ee"), "field1" : "a", "field2" : [ "a", "b" ] }
{ "_id" : ObjectId("54d8f8c5e3fdfe0cd8b769f1"), "field1" : "b", "field2" : [ "a", "b" ] }
Is there a way to do this so that only exact duplicates (in my example collection, only the first or second entry) get deleted?
As far as I know, this feature doesn't work with arrays. Any particular reason you can't just use $addToSet when you insert the data?
Check this question; it may help you: MongoDB: Unique index on array element's property
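Since dropDups can't do this for array fields, a common workaround is to compute an order-sensitive key per document client-side and delete all but the first document per key. Here is a minimal sketch of that grouping logic in plain JavaScript (no server; the docs array mirrors the sample collection, and the simplified _id values are illustrative stand-ins):

```javascript
// Sketch: keep only the first document per exact (field1, field2) pair,
// treating arrays as order-sensitive, so ["a","b"] and ["b","a"] differ.
const docs = [
  { _id: 1, field1: "a", field2: ["a", "b"] },
  { _id: 2, field1: "a", field2: ["a", "b"] }, // exact duplicate of _id 1
  { _id: 3, field1: "a", field2: ["a", "c"] },
  { _id: 4, field1: "a", field2: ["b", "a"] }, // same letters, different order: kept
  { _id: 5, field1: "b", field2: ["a", "b"] },
];

const seen = new Set();
const survivors = docs.filter((doc) => {
  // JSON serialization preserves array order, giving an exact-match key
  const key = JSON.stringify([doc.field1, doc.field2]);
  if (seen.has(key)) return false; // in a real script: remove this doc by _id
  seen.add(key);
  return true;
});
console.log(survivors.map((d) => d._id)); // [1, 3, 4, 5]
```

In a real cleanup you would iterate a cursor instead of an in-memory array and issue a remove/deleteOne for each non-first _id.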
I have a collection consisting of name and data.
data is an array with 2 elements; each element is an object with code and qty.
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"code" : "a",
"qty" : 3
},
{
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"code" : "c",
"qty" : 5
},
{
"code" : "d",
"qty" : 6
}
]
}
I want to update the code of the first element to the name of its document. The result I want is:
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"code" : "Anna",
"qty" : 3
},
{
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"code" : "James",
"qty" : 5
},
{
"code" : "d",
"qty" : 6
}
]
}
I googled how to:
update an array at a specific index (https://stackoverflow.com/a/34177929/11738185)
db.Collection.updateMany(
{ },
{
$set:{
'data.0.code': '$name'
}
}
)
But this sets the code of the first element in the data array to the literal string '$name', not the value (Anna, James):
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"code" : "$name",
"qty" : 3
},
{
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"code" : "$name",
"qty" : 5
},
{
"code" : "d",
"qty" : 6
}
]
}
update a field by the value of another field. This led me to pipeline updates (https://stackoverflow.com/a/37280419/11738185), where the second parameter of updateMany is an array (a pipeline):
db.Collection.updateMany(
{ },
[{
$set:{
'data.0.code': '$name'
}
}]
)
but it adds a field 0 to each element in the data array:
{
"_id" : ObjectId("605c666a15d2612ed0afedd2"),
"name" : "Anna",
"data" : [
{
"0" : {
"code" : "Anna"
},
"code" : "a",
"qty" : 3
},
{
"0" : {
"code" : "Anna"
},
"code" : "b",
"qty" : 4
}
]
},
{
"_id" : ObjectId("605c666a15d2612ed0afedd3"),
"name" : "James",
"data" : [
{
"0" : {
"code" : "James"
},
"code" : "c",
"qty" : 5
},
{
"0" : {
"code" : "James"
},
"code" : "d",
"qty" : 6
}
]
}
I can't find a solution for this case. Could anyone help me? How can I update an array at a fixed index using another field? Thanks for reading!
1. update array at a specific index
You can't use a document's own fields as the value for another field; this form only works with an external value, e.g. { $set: { "data.0.code": "Anna" } }.
2. update a field by the value of another field
An update with an aggregation pipeline doesn't support the data.0.code syntax.
You can try using $reduce in an update with an aggregation pipeline:
$reduce iterates over the data array, starting from an empty array as its initialValue. If the accumulated array's size is zero (i.e. this is the first element), replace code with name by merging into the current object with $mergeObjects; otherwise return the current object unchanged,
$concatArrays appends the current object to the accumulated array
db.collection.update({},
[{
$set: {
data: {
$reduce: {
input: "$data",
initialValue: [],
in: {
$concatArrays: [
"$$value",
[
{
$cond: [
{ $eq: [{ $size: "$$value" }, 0] },
{ $mergeObjects: ["$$this", { code: "$name" }] },
"$$this"
]
}
]
]
}
}
}
}
}],
{ multi: true }
)
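To see what the pipeline does, the $reduce logic can be mirrored in plain JavaScript (a hedged sketch, using data copied from the question's first document):

```javascript
// Sketch: rebuild `data`, replacing only the first element's code with the
// document's name, exactly as the $reduce/$concatArrays pipeline above does.
const doc = {
  name: "Anna",
  data: [
    { code: "a", qty: 3 },
    { code: "b", qty: 4 },
  ],
};

const rebuilt = doc.data.reduce(
  (value, cur) =>
    // First element (accumulator still empty): merge in { code: name }
    value.concat(value.length === 0 ? { ...cur, code: doc.name } : cur),
  [] // initialValue, like the pipeline's initialValue: []
);
console.log(rebuilt); // [ { code: "Anna", qty: 3 }, { code: "b", qty: 4 } ]
```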
I think another way would be easier.
Just fetch the document first, then use its value in the update afterwards (assuming Mongoose here; note that findOne must be awaited to get the document rather than a query object):
const annaModel = await nameModel.findOne({ _id: "605c666a15d2612ed0afedd2" });
await nameModel.findOneAndUpdate({ _id: "605c666a15d2612ed0afedd2" }, { $set: { "data.0.code": annaModel.name } });
I have nested array database records styled as such:
{
"_id" : "A",
"foo" : [
{
"_id" : "a",
"date" : ISODate("2017-07-13T23:27:13.522Z")
},
{
"_id" : "b",
"date" : ISODate("2017-08-04T22:36:36.381Z")
},
{
"_id" : "c",
"date" : ISODate("2017-08-23T23:59:40.202Z")
}
]
},
{
"_id" : "B",
"foo" : [
{
"_id" : "d",
"date" : ISODate("2017-07-17T23:27:13.522Z")
},
{
"_id" : "e",
"date" : ISODate("2017-01-06T22:36:36.381Z")
},
{
"_id" : "f",
"date" : ISODate("2017-09-14T23:59:40.202Z")
}
]
},
{
"_id" : "C",
"foo" : [
{
"_id" : "g",
"date" : ISODate("2017-11-17T23:27:13.522Z")
},
{
"_id" : "h",
"date" : ISODate("2017-06-06T22:36:36.381Z")
},
{
"_id" : "i",
"date" : ISODate("2017-10-14T23:59:40.202Z")
}
]
}
When I run the query:
db.bar.find(
{
$and: [
{"foo.date": {$lte: new Date(2017,8,1)}},
{"foo.date": {$gte: new Date(2017,7,1)}}
]
},
{
"_id":1
}
)
I'm returned
{
_id: "A"
},
{
_id: "B"
},
{
_id: "C"
}
Logically I'm asking for only the records where at least one date is between Aug 1 and Sept 1 (record A), but I'm getting all records.
I'm thinking it might be referencing different dates on the subdocuments, i.e. where foo.1.date > Aug 1 and foo.0.date < Sept 1.
Has anyone else had this issue and found a resolution?
Your filters are evaluated separately against each subdocument in your array, and that's why you're getting all results. For instance, for C:
the element with _id g has a date on or after August 1st
the element with _id h has a date on or before September 1st
You should use $elemMatch so that a single element must fall within the specified date range:
db.bar.find(
{ "foo":
{
"$elemMatch":
{ "date":
{
"$gte": new Date(2017,7,1),
"$lte": new Date(2017,8,1)
}
}
}
})
Only A will be returned for this query.
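Why the two queries differ can be sketched in plain JavaScript: without $elemMatch, each condition may be satisfied by a *different* array element, while $elemMatch requires one element to satisfy both (dates copied from document C):

```javascript
// Sketch: document C's foo dates (only the date field matters here).
const fooC = [
  new Date(2017, 10, 17), // "g": after Aug 1, but also after Sept 1
  new Date(2017, 5, 6),   // "h": before Sept 1, but also before Aug 1
];
const gte = (d) => d >= new Date(2017, 7, 1); // on/after Aug 1
const lte = (d) => d <= new Date(2017, 8, 1); // on/before Sept 1

// Original query: each condition checked independently against the array,
// so *some* element passing each one is enough.
const withoutElemMatch = fooC.some(gte) && fooC.some(lte); // true -> C matches

// $elemMatch: a single element must satisfy both conditions at once.
const withElemMatch = fooC.some((d) => gte(d) && lte(d)); // false -> C excluded
console.log(withoutElemMatch, withElemMatch);
```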
The way you are doing it doesn't work as you want for arrays.
You have to unwind the array and then compare the values.
db.bar.aggregate([
// Unwind the foo array into one document per element
{ $unwind : "$foo" },
// Keep only the elements whose date falls within the given range
{ $match : {
"foo.date": { $gte: new Date(2017,7,1), $lt: new Date(2017,8,1) }
}}
])
This should also work fine
db.bar.find({"foo.date": { $gte: new Date(2017,7,1),$lt: new Date(2017,8,1) }})
I have a collection called 'test' containing a document like:
{
"_id" : 1
"letters" : [
[ "A", "B" ],
[ "C", "D" ],
[ "A", "E", "B", "F" ]
]
}
If I update the document using $addToSet like this:
db.getCollection('test').update({"_id" : 1}, {$addToSet:{"letters": ["A", "B"] }})
it will not insert another value; the document still looks like:
{
"_id" : 1
"letters" : [
[ "A", "B" ],
[ "C", "D" ],
[ "A", "E", "B", "F" ]
]
}
But if I update like this:
db.getCollection('test').update({"_id" : 1}, {$addToSet:{"letters": ["B", "A"] }})
Now it will update the document like:
{
"_id" : 1
"letters" : [
[ "A", "B" ],
[ "C", "D" ],
[ "A", "E", "B", "F" ],
[ "B", "A" ]
]
}
My requirement is that if I pass (["B", "A"]) as well, it should not update the document, because the same letters are already present in the array.
Could anyone please give a solution?
@Shubham has the right answer. You should always sort your letters before saving them into the document. So your original document should have been (I changed the third array):
{
"_id" : 1,
"letters" : [
[ "A", "B" ],
[ "C", "D" ],
[ "A", "B", "C", "F" ]
]
}
Then in your application do the sort. I'm including a Mongo Shell example here.
var input = ["B", "A"];
input.sort();
db.getCollection('test').update({"_id" : 1}, {$addToSet:{"letters": input}});
Try this answer; it works.
Use $push to insert any item into the array in your case.
db.getCollection('stack').update(
{ _id: 1 },
{ $push: { "letters": ["B", "A"] } }
)
For reference about $push you can view this link -
https://docs.mongodb.com/manual/reference/operator/update/push/
I have this requirement, where I have a collection as below:
{
"_id" : 1,
"name" : "sam",
"Array" : [
{ "K" : "A", "V" : 8 },
{ "K" : "B", "V" : 5 },
{ "K" : "C", "V" : 13 }
]
},
{
"_id" : 2,
"name" : "tom",
"Array" : [
{ "K" : "D", "V" : 12 },
{ "K" : "E", "V" : 14 },
{ "K" : "F", "V" : 2 }
]
},
{
"_id" : 3,
"name" : "jim",
"Array" : [
{ "K" : "G", "V" : 9 },
{ "K" : "H", "V" : 4 },
{ "K" : "I", "V" : 2 }
]
}
I would like to run a query that returns the sub-document of each _id with the highest "V", so in that case I would get:
{ "_id" : 1, "name" : "sam", "Array" : [ { "K" : "C", "V" : 13 } ] }
{ "_id" : 2, "name" : "tom", "Array" : [ { "K" : "E", "V" : 14 } ] }
{ "_id" : 3, "name" : "jim", "Array" : [ { "K" : "G", "V" : 9 } ] }
You can select only the sub-documents where the V field's value is equal to the maximum value in the array, using the $filter and $max operators.
The $addFields pipeline stage is used here so that all other fields in the document are retained.
db.collection.aggregate([
{
"$addFields":{
"Array":{
"$filter":{
"input":"$Array",
"cond":{
"$eq":[
"$$this.V",
{
"$max":"$Array.V"
}
]
}
}
}
}
}
])
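The keep-only-the-max logic of that pipeline can be mirrored in plain JavaScript (a sketch using the question's first document):

```javascript
// Sketch: keep only the elements whose V equals the array's maximum,
// as the $filter/$max pipeline above does. Data from the first document.
const doc = {
  _id: 1,
  name: "sam",
  Array: [
    { K: "A", V: 8 },
    { K: "B", V: 5 },
    { K: "C", V: 13 },
  ],
};

const maxV = Math.max(...doc.Array.map((e) => e.V)); // like $max: "$Array.V"
const result = { ...doc, Array: doc.Array.filter((e) => e.V === maxV) };
console.log(result); // { _id: 1, name: "sam", Array: [ { K: "C", V: 13 } ] }
```

Note that, just like $filter, this keeps every element tied for the maximum, not only the first one.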
I was looking into the different ways of querying an array of embedded documents in the MongoDB aggregation pipeline. It looks like MongoDB has limited support for this.
Let's say we have following documents in test collection:
/* 1 */
{
"_id" : ObjectId("59df2c39fbd406137d4290b3"),
"a" : 1.0,
"arr" : [
{
"key": 1,
"sn" : "a",
"org": "A"
}
]
}
/* 2 */
{
"_id" : ObjectId("59df2c47fbd406137d4290b4"),
"a" : 2.0,
"arr" : [
{
"sn" : "b",
"key": 2,
"org": "B"
}
]
}
/* 3 */
{
"_id" : ObjectId("59df2c50fbd406137d4290b5"),
"a" : 3.0,
"arr" : [
{
"key": 3,
"sn" : "c",
"org": "C"
}
]
}
/* 4 */
{
"_id" : ObjectId("59df2c85fbd406137d4290b6"),
"a" : 1.0,
"arr" : [
{
"key": 1,
"sn" : "a",
"org": " A"
}
]
}
/* 5 */
{
"_id" : ObjectId("59df2c9bfbd406137d4290b7"),
"a" : 3.0,
"arr" : [
{
"sn" : "b",
"key": 2
}
]
}
/* 6 */
{
"_id" : ObjectId("59df2e41fbd406137d4290b8"),
"a" : 4.0,
"arr" : [
{
"sn" : "b",
"key" : 2
}
]
}
/* 7 */
{
"_id" : ObjectId("59df2e5ffbd406137d4290b9"),
"a" : 5.0,
"arr" : [
{
"key" : 2,
"sn" : "b"
},
{
"sn" : "a",
"key" : 1
}
]
}
And I wanted to categorize the above documents based on the "arr.sn" field value using the query below:
db.test.aggregate([{"$addFields": {"Category" : { $switch: {
branches : [
{ case : { $eq : [ "$arr.sn", "a" ] }, then : "Category 1"}
],
default : "No Category"
}}}}])
but the $eq operator is not giving the correct result; if I use the same condition in the find method, it works:
db.test.find({"arr.sn" : "a"})
I am looking for a way to do it with only a single field, in this case the "arr.sn" field. Is there any way to project the field from the embedded documents in the array?
Any help would be appreciated.
$eq (aggregation) compares both value and type, unlike the query $eq operator, which also matches individual elements inside an array.
You need $in (aggregation) to check whether a value is in an array.
Something like:
[
{
"$addFields": {
"Category": {
"$switch": {
"branches": [
{
"case": {
"$in": [
"a",
"$arr.sn"
]
},
"then": "Category 1"
}
],
"default": "No Category"
}
}
}
}
]
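The root cause can be sketched in plain JavaScript: in aggregation expressions, the path "$arr.sn" resolves to an array of values (one per array element), so $eq ends up comparing an array against the string "a". This sketch uses string serialization as a rough stand-in for the aggregation comparison (an assumption for illustration, not the engine's actual algorithm):

```javascript
// Sketch: what "$arr.sn" evaluates to for document 1 in the question.
const arrSn = ["a"]; // one sn value per element of arr

// Aggregation $eq compares the two operands as a whole: array vs string.
// Serializing both sides is a crude but faithful illustration of why it fails.
const eqResult = JSON.stringify(arrSn) === JSON.stringify("a"); // false

// Aggregation $in checks membership, which is what the categorization needs.
const inResult = arrSn.includes("a"); // true
console.log(eqResult, inResult);
```

This is also why the find() query works: the query-language $eq implicitly matches any element of an array field, while the aggregation $eq does not.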