Mongodb: upsert multiple documents at the same time (with key like $in) - arrays

I stumbled upon a funny behavior in MongoDB:
When I run:
db.getCollection("words").update({ word: { $in: ["nico11"] } }, { $inc: { nbHits: 1 } }, { multi: 1, upsert: 1 })
it will create "nico11" if it doesn't exist, and increase nbHits by 1 (as expected).
However, when I run:
db.getCollection("words").update({ word: { $in: ["nico10", "nico11", "nico12"] } }, { $inc: { nbHits: 1 } }, { multi: 1, upsert: 1 })
it will correctly update the keys that are already in the DB, but not insert the missing ones.
Is that the expected behavior, and is there any way I can provide an array to mongoDB, for it to update the existing elements, and create the ones that need to be created?

That is expected behaviour according to the documentation:
The update creates a base document from the equality clauses in the
parameter, and then applies the update expressions from the
parameter. Comparison operations from the will not be
included in the new document.
And, no, there is no way to achieve what you are attempting to do here using a simple upsert. The reason for that is probably that the expected outcome would be impossible to define. In your specific case it might be possible to argue along the lines of: "oh well, it is kind of obvious what we should be doing here". But imagine a more complex query like this:
db.getCollection("words").update({
a: { $in: ["b", "c" ] },
x: { $in: [ "y", "z" ]}
},
{ $inc: { nbHits: 1 } },
{ multi: 1, upsert: 1 })
What should MongoDB do in this case?
There is, however, the concept of bulk write operations in MongoDB where you would need to define three separate updateOne operations and package them up in a single request to the server.

Related

MongoDB update array of documents and replace by an array of replacement documents

Is it possible to bulk update (upsert) an array of documents with MongoDB by an array of replacement fields (documents)?
Basically to get rid of the for loop in this pseudo code example:
for user in users {
db.users.replaceOne(
{ "name" : user.name },
user,
{ "upsert": true }
}
The updateMany documentation only documents the following case where all documents are being updated in the same fashion:
db.collection.updateMany(
<query>,
{ $set: { status: "D" }, $inc: { quantity: 2 } },
...
)
I am trying to update (upsert) an array of documents where each document has it's own set of replacement fields:
updateOptions := options.UpdateOptions{}
updateOptions.SetUpsert(true)
updateOptions.SetBypassDocumentValidation(false)
_, error := collection.Col.UpdateMany(ctx, bson.M{"name": bson.M{"$in": names}}, bson.M{"$set": users}, &updateOptions)
Where users is an array of documents:
[
{ "name": "A", ...further fields},
{ "name": "B", ...further fields},
...
]
Apparently, $set cannot be used for this case since I receive the following error: Error while bulk writing *v1.UserCollection (FailedToParse) Modifiers operate on fields but we found type array instead.
Any help is highly appreciated!
You may use Collection.BulkWrite().
Since you want to update each document differently, you have to prepare a different mongo.WriteModel for each document update.
You may use mongo.ReplaceOneModel for individual document replaces. You may construct them like this:
wm := make([]mongo.WriteModel, len(users)
for i, user := range users {
wm[i] = mongo.NewReplaceOneModel().
SetUpsert(true).
SetFilter(bson.M{"name": user.name}).
SetReplacement(user)
}
And you may execute all the replaces with one call like this:
res, err := coll.BulkWrite(ctx, wm)
Yes, here we too have a loop, but that is to prepare the write tasks we want to carry out. All of them is sent to the database with a single call, and the database is "free" to carry them out in parallel if possible. This is likely to be significantly faster that calling Collection.ReplaceOne() for each document individually.

MongoDB filter a sub-array of Objects

The document structure has a round collection, which has an array of holes Objects embedded within it, with each hole played/scored entered.
The structure looks like this (there are more fields, but this summarises):
{
"_id": {
"$oid": "60701a691c071256e4f0d0d6"
},
"schema": {
"$numberDecimal": "1.0"
},
"playerName": "T Woods",
"comp": {
"id": {
"$oid": "607019361c071256e4f0d0d5"
},
"name": "US Open",
"tees": "Pro Tees",
"roundNo": {
"$numberInt": "1"
},
"scoringMethod": "Stableford"
},
"holes": [
{
"holeNo": {
"$numberInt": "1"
},
"holePar": {
"$numberInt": "4"
},
"holeSI": {
"$numberInt": "3"
},
"holeGross": {
"$numberInt": "4"
},
"holeStrokes": {
"$numberInt": "1"
},
"holeNett": {
"$numberInt": "3"
},
"holeGrossPoints": {
"$numberInt": "2"
},
"holeNettPoints": {
"$numberInt": "3"
}
}
]
}
In the Atlas web UI, it shows as (note there are 9 holes in this particular round of golf - limited to 3 for brevity):
I would like to find the players who have a holeGross of 2, or less, somewhere in their round of golf (i.e. a birdie on par 3 or better).
Being new to MongoDB, and NoSQL constructs, I am stuck with this. Reading around the aggregation pipeline framework, I have tried to break down the stages I will need as:
Filter by the comp.id and comp.roundNo
Filter this result with any hole within the holes array of Objects
Maybe I have approached this wrong, and should filter or structure this pipeline differently?
So far, using the Atlas web UI, I can apply these filters individually as:
{
"comp.id": ObjectId("607019361c071256e4f0d0d5"),
"comp.roundNo": 2
}
And:
{ "holes.0.holeGross": 2 }
But I have 2 problems:
The second filter query, I have hard-coded the array index to get this value. I would need to search across all the sub-elements of every document that matches this comp.id && comp.roundNo
How do I combine these? I presuming this is where the aggregation comes in, as well as enumerating across the whole array (as above).
I note in particular it is the extra ".0." part of the second query that I am not seeing from various other online postings trying to do the same thing. Is my data structure incorrect? Do I need the [0]...[17] Objects for an 18-hole round of golf?
I would like to find the players who have a holeGross of 2, or less, somewhere in their round of golf
if that is the goal, a simple $lte search inside the holes array like the following would do:
db.collection.find({ "holes.holeGross": { $lte: 2 } })
you simply have to not specify an array index such as 0 in the property path in order to search each element of the array.
https://mongoplayground.net/p/KhZLnj9mJe5

fetch the array element from nested mongo object [duplicate]

Is it possible to wildcard the key in a query? For instance, given the following record, I'd like to do a .find({'a.*': 4})
This was discussed here https://jira.mongodb.org/browse/SERVER-267 but it looks like it's not been resolved.
{
'a': {
'b': [1, 2],
'c': [3, 4]
}
}
As asked, this is not possible. The server issue you linked to is still under "issues we're not sure of".
MongoDB has some intelligence surrounding the use of arrays, and I think that's part of the complexity surrounding such a feature.
Take the following query db.foo.find({ 'a.b' : 4 } ). This query will match the following documents.
{ a: { b: 4 } }
{ a: [ { b: 4 } ] }
So what does "wildcard" do here? db.foo.find( { a.* : 4 } ) Does it match the first document? What about the second?
Moreover, what does this mean semantically? As you've described, the query is effectively "find documents where any field in that document has a value of 4". That's a little unusual.
Is there a specific semantic that you're trying to achieve? Maybe a change in the document structure will get you the query you want.
I've came across this question because I faced the same issue. The accepted answer provider here does explains why this is not supported but not really solves the issue itself.
I've ended up with a solution that makes the wildcard usage purposed here redundant and share here just in case someone will find this post some day
Why I wanted to use wildcards in my MongoDB queries?
In my case, I needed this "feature" in order to be able to find a match inside a dictionary (just as the question's code demonstrates).
What's the alternatives?
Use a reversed map (very similar to how DNS works) and simply use it. So, in our case we can use something similar to this:
{
"a": {
"map": {
"b": [1, 2, 3],
"c": [3, 4]
},
"reverse-map": {
"1": [ "b" ],
"2": [ "b" ],
"3": [ "b", "c" ],
"4": [ "c" ]
}
}
}
I know, it takes more memory and insert / update operations should validate this set is always symmetric and yet - it solves the problem. Now, instead of making an imaginary query like
db.foo.find( { a.map.* : 4 } )
I can make an actual query
db.foo.find( { a.reverse-map.4 : {$exists: true} } )
Which will return all items that have a specific value (in our example 4)
I know - this approach takes more memory and you need to manage indexes properly if you want to gain good performance (read the docs) and still - it's good for my use-case. Hope this helps someone else someday as well
Starting from MongoDB v3.4+, you can use $objectToArray to convert a into an array of k-v tuples for querying.
db.collection.aggregate([
{
"$addFields": {
"a": {
"$objectToArray": "$a"
}
}
},
{
$match: {
"a.v": 4
}
},
{
"$addFields": {
// cosmetics to revert back to original structure
"a": {
"$arrayToObject": "$a"
}
}
}
])
Here is the Mongo playground for your reference.

Using $rename in MongoDB for an item inside an array of objects

Consider the following MongoDB collection of a few thousand Objects:
{
_id: ObjectId("xxx")
FM_ID: "123"
Meter_Readings: Array
0: Object
Date: 2011-10-07
Begin_Read: true
Reading: 652
1: Object
Date: 2018-10-01
Begin_Reading: true
Reading: 851
}
The wrong key was entered for 2018 into the array and needs to be renamed to "Begin_Read". I have a list using another aggregate of all the objects that have the incorrect key. The objects within the array don't have an _id value, so are hard to select. I was thinking I could iterate through the collection and find the array index of the errored Readings and using the _id of the object to perform the $rename on the key.
I am trying to get the index of the array, but cannot seem to select it correctly. The following aggregate is what I have:
[
{
'$match': {
'_id': ObjectId('xxx')
}
}, {
'$project': {
'index': {
'$indexOfArray': [
'$Meter_Readings', {
'$eq': [
'$Meter_Readings.Begin_Reading', True
]
}
]
}
}
}
]
Its result is always -1 which I think means my expression must be wrong as the expected result would be 1.
I'm using Python for this script (can use javascript as well), if there is a better way to do this (maybe a filter?), I'm open to alternatives, just what I've come up with.
I fixed this myself. I was close with the aggregate but needed to look at a different field for some reason that one did not work:
{
'$project': {
'index': {
'$indexOfArray': [
'$Meter_Readings.Water_Year', 2018
]
}
}
}
What I did learn was the to find an object within an array you can just reference it in the array identifier in the $indexOfArray method. I hope that might help someone else.

Remove all items from an array NOT in array in MongoDB

I have two collections in Mongo. For simplification I´m providing a minified example
Template Collection:
{
templateId:0,
values:[
{data:"a"},
{data:"b"},
{data:"c"}
{data:"e"}
]
}
Data Collection:
{
dataId:0,
templateId:0,
values:[
{data:"a",
value: 10},
{data:"b",
value: 120},
{data:"c",
value: 3220},
{data:"d",
value: 0}
]
}
I want to make a sync from Template Collection -> Data Collection, between the template 0 and all documents using that template. In the case of the example, that would mean:
Copy {data:"e"} into the arrays of all documents with the templateId: 0
Remove {data:"d"} from the arrays of all documents with the templateId: 0
BUT do not touch the rest of the items. I can´t simply replace the array, because those values have to be kept
I´ve found a solution for 1.
db.getCollection('data').update({templateId:0},
{$addToSet: {values: {
$each:[
{data:"a"},
{data:"b"},
{data:"c"}
{data:"e"}
]
}}}, {
multi: true}
)
And a partial solution for 2.
I got it. First tried with $pullAll, but the normal $pull seems to work together with the $nin operator
db.getCollection('data').update({templateId:"QNXC4bPAF9J6r9FQu"},
{$pull:{values: { $nin:[
{data:"a"},
{data:"b"},
{data:"c"}
{data:"e"}]
}}}, {
multi: true}
)
This will remove {data:"d"} from all document arrays, but it seems to overwrite the complete array, and this is not what I want, as those value entries need to be persisted
But how can I perform a query like Remove everything from an array EXCEPT/NOT IN [a,b,c,d,...] ?

Resources