MongoDB: What's the best approach to modify a model field from an array of strings to an array of ids that would refer to another model?

I have a Profile model, that contains this field:
interests: {
type: [String],
},
My app has been running for a while. So this means for several documents, this field has already been filled with an array of strings.
In order to achieve certain goals, I need to create a model Interest with a field name and then refer to it in the Profile like this:
interests: [{
type: Schema.Types.ObjectId,
ref: "interests",
}],
The field name should contain the already existing string interests in Profile.interests.
This is the approach that I think I will follow:
Create Interest model.
Fill name field with the existing Profile.interests strings.
a. When doing this replace Profile.interests with the _ids of the newly created Interest documents.
b. Make sure Interest.name is unique.
c. Remove spaces.
Wherever interests in the app are used in the backend, use populate to fill them.
This doesn't feel like a safe operation. So I would like to hear your thoughts on it. Is there a better approach? Should I avoid doing this?
Thank you.

Step 1:
Create a model for interests:
specify the fields you want in the interests schema and set the properties for each field
specify the collection name in the options, as required
create the model under the name you want
const InterestsSchema = new mongoose.Schema(
{ name: { type: String } },
{ collection: "interests" }
);
const Interests = mongoose.model("Interests", InterestsSchema);
Instead of replacing the interests field, add a new field interest (any name will do). This keeps the old data safe; once you are confident the update works properly, you can remove the old field.
Update the profile schema so that it contains both fields; the newly added field is interest:
interests: {
type: [String]
},
interest: [{
type: Schema.Types.ObjectId,
ref: "Interests" // must match the model name passed to mongoose.model()
}],
Step 2:
Wherever interests in the app are used in the backend, use interest and populate to fill them.
Step 3: (just execute the query)
Create the interests collection and store every unique interest string from the profile collection in it. The aggregation query below selects the unique strings and writes them to the interests collection. You can run it in the mongo shell or any editor you use, after substituting your actual profile collection name:
$project keeps only the interests field, because that is the field we are going to deconstruct
$unwind deconstructs the interests array
$group groups by the trimmed interests string, so each unique value appears once
$project exposes the value as the name field; add any other fields you want here
$out creates the new interests collection (replacing it if it already exists) and writes every interest to it with a newly generated _id field
db.getCollection('profile').aggregate([
{ $project: { interests: 1 } },
{ $unwind: "$interests" },
{ $group: { _id: { $trim: { input: "$interests" } } } },
{ $project: { _id: 0, name: "$_id" } },
{ $out: "interests" }
])
You have example input:
[
{
"_id": 1,
"interests": ["sports","sing","read"]
},
{
"_id": 2,
"interests": ["tracking","travel"]
}
]
After executing above query the output/result in interests / new collection would be something like:
[
{
"_id": ObjectId("5a934e000102030405000000"),
"name": "travel"
},
{
"_id": ObjectId("5a934e000102030405000001"),
"name": "sports"
},
{
"_id": ObjectId("5a934e000102030405000002"),
"name": "read"
},
{
"_id": ObjectId("5a934e000102030405000003"),
"name": "tracking"
},
{
"_id": ObjectId("5a934e000102030405000004"),
"name": "sing"
}
]
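Before running the pipeline against real data, the same deduplication can be sketched in plain JavaScript as a dry run. This is only an illustrative mirror of the $unwind / $group / $trim stages; the function name uniqueInterestNames and the sample array are assumptions, and the sort is added here purely to make the output deterministic (the aggregation itself does not guarantee an order).

```javascript
// Plain-JS mirror of the $unwind / $group / $trim stages, for dry runs only.
// `profiles` is an array of documents shaped like the example input above.
function uniqueInterestNames(profiles) {
  const names = new Set();
  for (const profile of profiles) {
    // Profiles without an interests array contribute nothing ($unwind drops them too).
    for (const raw of profile.interests || []) {
      names.add(raw.trim());
    }
  }
  // Sorted only so that repeated runs are easy to compare.
  return Array.from(names).sort();
}

// Hypothetical sample, with stray whitespace added to one value.
const sampleProfiles = [
  { _id: 1, interests: ["sports", "sing", "read"] },
  { _id: 2, interests: ["tracking", " travel ", "sports"] },
];

console.log(uniqueInterestNames(sampleProfiles));
```

Note that the pipeline only deduplicates the existing strings. To keep name unique for future inserts, a unique index such as db.interests.createIndex({ name: 1 }, { unique: true }) would probably be needed as well.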
Step 4: (just execute the query)
Add the new interest field, holding the reference _ids from the interests collection, to each document in the profile collection. The queries run in this sequence:
find the profiles that do not yet have the interest (new) field, projecting only the required _id and interests fields, and iterate over them with forEach
trim each interests string with map
find the reference _ids in the newly created interests collection by their name field
run an update query that adds the interest field, containing those _ids, to the profile document
db.getCollection('profile').find(
{ interest: { $exists: false } },
{ _id: 1, interests: 1 }).forEach(function(profileDoc) {
// TRIM INTEREST STRINGS (profiles without an interests array yield an empty list)
var interests = (profileDoc.interests || []).map(function(i){
return i.trim();
});
// FIND INTERESTS IDs
var interest = [];
db.getCollection('interests').find(
{ name: { $in: interests } },
{ _id: 1 }).forEach(function(interestDoc){
interest.push(interestDoc._id);
});
// UPDATE IDS IN PROFILE DOC
db.getCollection('profile').updateOne(
{ _id: profileDoc._id },
{ $set: { interest: interest } }
);
});
You have example input:
[
{
"_id": 1,
"interests": ["sports","sing","read"]
},
{
"_id": 2,
"interests": ["tracking","travel"]
}
]
After executing above query the result in your profile collection would be something like:
[
{
"_id": 1,
"interests": ["sports","sing","read"],
"interest": [
ObjectId("5a934e000102030405000001"),
ObjectId("5a934e000102030405000002"),
ObjectId("5a934e000102030405000004")
]
},
{
"_id": 2,
"interests": ["tracking","travel"],
"interest": [
ObjectId("5a934e000102030405000000"),
ObjectId("5a934e000102030405000003")
]
}
]
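One thing to watch in Step 4: a find with $in does not necessarily return the ids in the same order as the original strings, and a string with no matching interests document is silently skipped. A minimal sketch of an order-preserving lookup that also reports unmatched names follows; resolveInterestIds and the string ids are hypothetical stand-ins (real ids would be ObjectIds read from the interests collection).

```javascript
// Resolve one profile's interest strings to ids, keeping the original order
// and collecting any names that have no matching interests document.
function resolveInterestIds(interestNames, interestDocs) {
  const idByName = new Map(interestDocs.map((doc) => [doc.name, doc._id]));
  const ids = [];
  const missing = [];
  for (const raw of interestNames || []) {
    const name = raw.trim();
    if (idByName.has(name)) {
      ids.push(idByName.get(name));
    } else {
      missing.push(name);
    }
  }
  return { ids, missing };
}

// Placeholder documents; the string ids stand in for ObjectIds.
const interestDocs = [
  { _id: "id-travel", name: "travel" },
  { _id: "id-sports", name: "sports" },
  { _id: "id-read", name: "read" },
];

const resolved = resolveInterestIds(["sports", " read", "chess"], interestDocs);
console.log(resolved);
```

A non-empty missing list for any profile would be a signal to stop and investigate before removing the old field in Step 5.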
Step 5:
You have now completed all the steps: the new interest field is in place and the old interests field is still intact. Once you have made sure everything works properly, you can delete the old interests field.
Remove the old interests field from all profiles:
db.getCollection('profile').updateMany(
{ interests: { $exists: true } },
{ $unset: { "interests": 1 } }
);
Warning:
Test these steps on your local/development server before executing them on the production server.
Take a backup of your database collections before executing the queries.
Field and schema names are guesses; replace them with your actual names.

Related

How to check if there is a key in collection that has more than one value?

My collection looks something like this:
[
{
_id: 'some value',
'product/productId': 'some value',
'product/title': 'some value',
'product/price': 'unknown'
},
{
_id: 'some value',
'product/productId': 'some value',
'product/title': 'some value',
'product/price': '12.57'
}
]
My goal is to find if there are any products that have more than one price. Values of the key "product/price" can be "unknown" or numerical (e.g. "12.75"). Is there a way to write an aggregation pipeline for that or do I need to use a map-reduce algorithm? I tried both options but didn't find the solution.
If I've understood correctly you can try this aggregation pipeline:
First of all, the _id field is (or should be) unique, so I think you mean another field like id.
So the trick is to group by that id and collect all prices into an array, then filter with $match to keep only the groups where the number of prices is greater than 1.
db.collection.aggregate([
{
"$group": {
"_id": "$id",
"price": {
"$push": "$product/price"
}
}
},
{
"$match": {
"$expr": {
"$gt": [ { "$size": "$price" }, 1 ]
}
}
}
])
As noted in the comments in reply to Joe: if you want identical values to be treated as the same price, use $addToSet instead of $push.
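A sketch of the $addToSet variant could look like the following. It keeps the placeholder grouping field "$id" from the pipeline above; the names distinctPricePipeline and prices are assumptions made for illustration.

```javascript
// Variant of the pipeline above that collapses identical prices.
// "$id" is the same placeholder grouping field used in the answer.
const distinctPricePipeline = [
  {
    $group: {
      _id: "$id",
      // $addToSet stores each distinct price once, unlike $push.
      prices: { $addToSet: "$product/price" },
    },
  },
  // Keep only groups that ended up with more than one distinct price.
  { $match: { $expr: { $gt: [{ $size: "$prices" }, 1] } } },
];

// In the shell this would be run as: db.collection.aggregate(distinctPricePipeline)
console.log(JSON.stringify(distinctPricePipeline, null, 2));
```

With this variant a product listed twice at "12.57" is not reported, while one listed at both "12.57" and "unknown" still is.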

Mongo - add field if object in array of sub docs has value

Details
I am developing a survey application with Express and am struggling to retrieve some data.
The case:
you can get all surveys with "GET /surveys". Every survey doc has to contain hasVoted:mongoose.Bool and optionsVote:mongoose.Map if the user has voted for the survey. (SurveySchema is below)
you can vote for a survey with "POST /surveys/vote"
you can see the results of a survey only if you have voted for it
new Schema({
question: {
type: mongoose.Schema.Types.String,
required: true,
},
options: {
type: [{
type: mongoose.Schema.Types.String,
required: true,
}]
},
optionsVote: {
type: mongoose.Schema.Types.Map,
of: mongoose.Schema.Types.Number,
},
votesCount: {
type: mongoose.Schema.Types.Number,
},
votes: {
type: [{
user: {
type: mongoose.Schema.Types.ObjectId,
ref: 'User',
},
option: mongoose.Schema.Types.Number,
}]
},
})
Target:
So the question is: how can I add the fields hasVoted and optionsVote when the votes array contains a "Vote" subdocument where user===req.user.id?
I believe you get the idea, so if you know how to change the schema to achieve the desired result, I'm open to it!
Example:
Data:
[{
id:"surveyId1",
question:"Question",
options:["op1","op2"],
votes:[{user:"userId1", option:0}],
votesCount:1,
optionsVote:{"0":1,"1":0}
},{
id:"surveyId2",
question:"Question",
options:["op1","op2"],
votes:[{user:"userId2", option:0}],
votesCount:1,
optionsVote:{"0":1,"1":0}
}]
Route handler:
Where req.user.id='userId1' and then make the desired query.
The result
[{ // Voted for this survey
id:"surveyId1",
question:"Question",
options:["op1","op2"],
votes:[{user:"userId1", option:0}],
votesCount:1,
optionsVote:{"0":1,"1":0},
hasVoted:true
},{ // Has not voted for this survey
id:"surveyId2",
question:"Question",
options:["op1","op2"],
votesCount:1
}]
In MongoDB, you can search for sub document as follows
//Mongodb query to search for survey filled by a user
db.survey.find({ 'votes.user': myUserId })
Since this query returns only the surveys the user has voted in, do you really need a hasVoted field?
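If the flag is still wanted on every survey returned by GET /surveys, one possible approach is to compute it per request in an aggregation instead of storing it. The sketch below is untested against a live database; buildHasVotedPipeline is a hypothetical helper, and userId is assumed to have the same type as votes.user (an ObjectId in the schema above).

```javascript
// Builds an aggregation pipeline that adds a per-request hasVoted flag.
// userId must be the same type as votes.user for the comparison to match.
function buildHasVotedPipeline(userId) {
  return [
    {
      $addFields: {
        hasVoted: {
          // "$votes.user" resolves to the array of voter ids;
          // $ifNull covers surveys that have no votes field at all.
          $in: [userId, { $ifNull: ["$votes.user", []] }],
        },
      },
    },
  ];
}

// A plain string is used here only so the sketch prints something readable.
console.log(JSON.stringify(buildHasVotedPipeline("userId1"), null, 2));
```

In the route handler the result would be passed to the model's aggregate call, and the vote details could then be removed from surveys where hasVoted is false.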
For the optionsVote field, I would first change its schema to a list of {option: "a", count: 1} entries. You can then choose either of the following approaches.
A. Update optionsVote at write time: increment the count of the voted option whenever you handle POST /surveys/vote.
B. Calculate optionsVote from the votes entries at read time, in GET /surveys. You can do this with an aggregation:
//Mongodb query to get optionsVote:{option: "a", count:1} from votes: { user:"x", option:"a"}
db.survey.aggregate([
// one document per vote
{ $unwind: "$votes" },
// count the votes per survey and option
{ $group: {
_id: { id: "$_id", option: "$votes.option" },
optionCount: { $sum: 1 }
}
},
// collect the per-option counts back into one document per survey
{ $group: {
_id: "$_id.id",
optionsVote: { $push: { option: "$_id.option", count: "$optionCount" } }
}
}
])
//WARNING: I haven't tested this query, this is just to show the approach -> group based on votes.option and count all votes for that option for each document and then create optionsVote field by pushing all option with their count using $push into the field `optionsVote`
I recommend approach A because I assume there will be far fewer POST operations than GET operations. It is also easier to implement. Having said that, keeping the query from B handy will help you with sanity checks.
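Approach A can be sketched as a single update that records the vote and bumps the counters together. The helper name buildVoteUpdate is hypothetical, and the sketch assumes optionsVote stays keyed by option index as in the question's schema (a Mongoose Map is stored as a nested document, so a dotted path such as optionsVote.0 addresses one key).

```javascript
// Builds the filter and update for recording one vote.
function buildVoteUpdate(surveyId, userId, optionIndex) {
  return {
    // The $ne guard makes the update a no-op if this user already voted.
    filter: { _id: surveyId, "votes.user": { $ne: userId } },
    update: {
      $push: { votes: { user: userId, option: optionIndex } },
      // Increment the total and the counter of the chosen option together.
      $inc: { votesCount: 1, ["optionsVote." + optionIndex]: 1 },
    },
  };
}

// Placeholder ids; real values would be ObjectIds.
console.log(JSON.stringify(buildVoteUpdate("surveyId1", "userId1", 0), null, 2));
```

The two parts would be passed to something like Survey.updateOne(filter, update); a matched count of zero then means either that the survey does not exist or that the user has already voted.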

mongo update : upsert an element in array

I would like to upsert an element in an array, based on the doc _id and the element _id. Currently it works only if the element is already in the array (the update works, the insert does not).
So, given this collection:
[{
"_id": "5a65fcf363e2a32531ed9f9b",
"ressources": [
{
"_id": "5a65fd0363e2a32531ed9f9c"
}
]
}]
Receiving this request:
query = { _id: '5a65fcf363e2a32531ed9f9b', 'ressources._id': '5a65fd0363e2a32531ed9f9c' };
update = { '$set': { 'ressources.$': { '_id': '5a65fd0363e2a32531ed9f9c', qt: '153', unit: 'kg' } } };
options = {upsert:true};
collection.update(query,update,options);
Will give this ok result:
[{
"_id": "5a65fcf363e2a32531ed9f9b",
"ressources": [
{
"_id": "5a65fd0363e2a32531ed9f9c",
"qt": 153,
"unit": "kg"
}
]
}]
How to make the same request work with these initial collections:
[{
"_id": "5a65fcf363e2a32531ed9f9b"
}]
OR
[{
"_id": "5a65fcf363e2a32531ed9f9b",
"ressources": []
}]
How can I make the upsert work?
Does upsert work with entire documents only?
Currently, I face this error:
The positional operator did not find the match needed from the query.
Thanks
I also tried to figure out how to do it. I found only one way:
fetch model by id
update array manually (via javascript)
save the model
It is sad that in 2018 you still have to do it this way.
UPDATE:
This will update a particular element in the viewData array:
db.collection.update({
"_id": args._id,
"viewData._id": widgetId
},
{
$set: {
"viewData.$.widgetData": widgetDoc.widgetData
}
})
$push command will add new items
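Combining the two commands gives a two-step "upsert" for an array element: try the positional update first, and push only if nothing matched. The sketch below just builds the two operations for the ressources array from the question; buildArrayUpsertOps is a hypothetical helper name, and running the operations is left to the driver.

```javascript
// Builds the two operations of a manual array-element upsert.
function buildArrayUpsertOps(docId, element) {
  return {
    // Step 1: overwrite the element if one with this _id is already in the array.
    updateExisting: {
      filter: { _id: docId, "ressources._id": element._id },
      update: { $set: { "ressources.$": element } },
    },
    // Step 2: run only when step 1 matched nothing. The $ne guard also matches
    // documents where ressources is empty or missing, and avoids duplicates.
    pushIfMissing: {
      filter: { _id: docId, "ressources._id": { $ne: element._id } },
      update: { $push: { ressources: element } },
    },
  };
}

const ops = buildArrayUpsertOps("5a65fcf363e2a32531ed9f9b", {
  _id: "5a65fd0363e2a32531ed9f9c",
  qt: 153,
  unit: "kg",
});
console.log(JSON.stringify(ops, null, 2));
```

The upsert option is deliberately not used here: with the positional operator it is what triggers the error quoted in the question when no array element matches.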

Update array of subdocuments in MongoDB

I have a collection of students that have a name and an array of email addresses. A student document looks something like this:
{
"_id": {"$oid": "56d06bb6d9f75035956fa7ba"},
"name": "John Doe",
"emails": [
{
"label": "private",
"value": "private#johndoe.com"
},
{
"label": "work",
"value": "work#johndoe.com"
}
]
}
The label in the email subdocument is set to be unique per document, so there can't be two entries with the same label.
My problem is that, when updating a student document, I want to achieve the following:
adding an email with a new label should simply add a new subdocument with the given label and value to the array
if adding an email with a label that already exists, the value of the existing should be set to the data of the update
For example when updating with the following data:
{
"_id": {"$oid": "56d06bb6d9f75035956fa7ba"},
"emails": [
{
"label": "private",
"value": "me#johndoe.com"
},
{
"label": "school",
"value": "school#johndoe.com"
}
]
}
I would like the result of the emails array to be:
"emails": [
{
"label": "private",
"value": "me#johndoe.com"
},
{
"label": "work",
"value": "work#johndoe.com"
},
{
"label": "school",
"value": "school#johndoe.com"
}
]
How can I achieve this in MongoDB (optionally using mongoose)? Is this at all possible or do I have to check the array myself in the application code?
You could try this update, but it is only efficient for small datasets:
mongo shell:
var data = {
"_id": ObjectId("56d06bb6d9f75035956fa7ba"),
"emails": [
{
"label": "private",
"value": "me#johndoe.com"
},
{
"label": "school",
"value": "school#johndoe.com"
}
]
};
data.emails.forEach(function(email) {
// First try to overwrite an existing entry that has the same label
var result = db.students.updateOne(
{ "_id": data._id, "emails.label": email.label },
{ "$set": { "emails.$.value": email.value } }
);
// No entry matched that label, so append the email as a new entry
if (result.matchedCount === 0) {
db.students.updateOne(
{ "_id": data._id, "emails.label": { "$ne": email.label } },
{ "$push": { "emails": email } }
);
}
});
Suggestion: refactor your data to use the "label" as an actual field name.
There is one straightforward way in which MongoDB can guarantee unique values for a given email label - by making the label a single separate field in itself, in an email sub-document. Your data needs to exist in this structure:
{
"_id": ObjectId("56d06bb6d9f75035956fa7ba"),
"name": "John Doe",
"emails": {
"private": "private#johndoe.com",
"work" : "work#johndoe.com"
}
}
Now, when you want to update a student's emails you can do an update like this:
db.students.update(
{"_id": ObjectId("56d06bb6d9f75035956fa7ba")},
{$set: {
"emails.private" : "me#johndoe.com",
"emails.school" : "school#johndoe.com"
}}
);
And that will change the data to this:
{
"_id": ObjectId("56d06bb6d9f75035956fa7ba"),
"name": "John Doe",
"emails": {
"private": "me#johndoe.com",
"work" : "work#johndoe.com",
"school" : "school#johndoe.com"
}
}
Admittedly there is a disadvantage to this approach: you will need to change the structure of the input data, from the emails being in an array of sub-documents to the emails being a single sub-document of single fields. But the advantage is that your data requirements are automatically met by the way that JSON objects work.
After investigating the different options posted, I decided to go with my own approach of doing the update manually in the code using lodash's unionBy() function. Using express and mongoose's findById() that basically looks like this:
Student.findById(req.params.id, function(err, student) {
if(err) return next(err); // surface lookup errors before touching student
if(req.body.name) student.name = req.body.name;
if(req.body.emails && req.body.emails.length > 0) {
student.emails = _.unionBy(req.body.emails, student.emails, 'label');
}
student.save(function(err, result) {
if(err) return next(err);
res.status(200).json(result);
});
});
This way I get the full flexibility of partial updates for all fields. Of course you could also use findByIdAndUpdate() or other options.
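For readers who do not want the lodash dependency, the merge itself can be sketched in a few lines of plain JavaScript. This is an illustrative stand-in, not lodash's implementation: mergeByLabel is a hypothetical name, and unlike the unionBy call above (which lists the incoming emails first) it keeps the existing entries in their original order and appends new labels at the end.

```javascript
// Merge incoming emails into the existing list, keyed by label.
// Existing labels get their value replaced; new labels are appended.
function mergeByLabel(existing, incoming) {
  // Copy the entries so the caller's arrays are left untouched.
  const merged = existing.map((email) => ({ ...email }));
  for (const email of incoming) {
    const match = merged.find((e) => e.label === email.label);
    if (match) {
      match.value = email.value;
    } else {
      merged.push({ ...email });
    }
  }
  return merged;
}

// Sample data taken from the question.
const stored = [
  { label: "private", value: "private#johndoe.com" },
  { label: "work", value: "work#johndoe.com" },
];
const update = [
  { label: "private", value: "me#johndoe.com" },
  { label: "school", value: "school#johndoe.com" },
];

console.log(mergeByLabel(stored, update));
```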
Alternate approach:
However, changing the schema as Vince Bowdren suggested, making the label a single separate field in an email subdocument, is also a viable option. In the end it just depends on your personal preferences and on whether you need strict validation of your data.
If you are using mongoose like I do, you would have to define a separate schema like so:
var EmailSchema = new mongoose.Schema({
work: { type: String, validate: validateEmail },
private: { type: String, validate: validateEmail }
}, {
strict: false,
_id: false
});
In the schema you can define properties for the labels you already want to support and add validation. By setting the strict: false option, you would allow the user to also post emails with custom labels. Note however, that these would not be validated. You would have to apply the validation manually in your application similar to the way I did it in my approach above for the merging.

Filtering an embedded array in MongoDB

I have a MongoDB document that contains an array that is deeply embedded inside the document. In one of my actions, I would like to return the entire document but filter out the elements of that array that don't match a given criterion.
Here is some simplified data:
{
id: 123 ,
vehicles : [
{name: 'Mercedes', listed: true},
{name: 'Nissan', listed: false},
...
]
}
So, in this example I want the entire document but I want the vehicles array to only have objects that have the listed property set to true.
Solutions
Ideally, I'm looking for a solution using mongo's queries (e.g. $unwind, $elemMatch, etc.), but I'm also using mongoose, so a solution that uses Mongoose is OK.
You could use aggregation framework like this:
db.test312.aggregate([
{$unwind:"$vehicles"},
{$match:{"vehicles.name":"Nissan"}},
{$group:{_id:"$_id",vehicles:{$push:"$vehicles"}}}
])
You can use $addToSet on the group after unwinding and matching by listed equals true.
Sample shell query:
db.collection.aggregate([
{
$unwind: "$vehicles"
},
{
$match: {
"vehicles.listed": {
$eq: true
}
}
},
{
$group: {
_id: "$id",
vehicles: {
"$addToSet": {
name: "$vehicles.name",
listed: "$vehicles.listed"
}
}
}
},
{
$project: {
_id: 0,
id: "$_id",
vehicles: 1
}
}
]).pretty();
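Both pipelines above rebuild the document from its group key, so any other top-level fields are lost. If the whole document is needed, a possible alternative is the $filter operator (available from MongoDB 3.2) inside $addFields (available from 3.4), which rewrites only the vehicles array. The constant name listedVehiclesPipeline is an assumption made for this sketch.

```javascript
// Keeps every top-level field and replaces vehicles with its listed entries.
const listedVehiclesPipeline = [
  {
    $addFields: {
      vehicles: {
        $filter: {
          input: "$vehicles",
          as: "vehicle",
          // Keep only the array elements whose listed flag is true.
          cond: { $eq: ["$$vehicle.listed", true] },
        },
      },
    },
  },
];

// In the shell this would be run as: db.collection.aggregate(listedVehiclesPipeline)
console.log(JSON.stringify(listedVehiclesPipeline, null, 2));
```

Because there is no $unwind or $group, documents whose vehicles are all unlisted are still returned, with an empty array.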
