MongoDB Aggregate Array with Two Fields

I have a vehicles collection with the following schema. All the items in products are general products (no child products are included):
{
"_id" : ObjectId("554995ac3d77c8320f2f1d2e"),
"model" : "ILX",
"year" : 2015,
"make" : "Acura",
"motor" : {
"cylinder" : 4,
"liters" : "1.5"
},
"products" : [
ObjectId("554f92433d77c803836fefe3"),
...
]
}
I also have a products collection. Some documents are general products linked to warehouse SKUs. Others are child ("son") products that fit multiple general products; these are also linked to warehouse SKUs:
General product:
{
"_id" : ObjectId("554b9f223d77c810e8915539"),
"brand" : "Airtex",
"product" : "E7113M",
"type" : "Fuel Pump",
"warehouse_sku" : [
"1_5551536f3d77c870fc388a04",
"2_55515e163d77c870fc38b00a"
]
}
Child product:
{
"_id" : ObjectId("55524d0c3d77c8ba9cb2d9fd"),
"brand" : "Performance",
"product" : "P41K",
"type" : "Repuesto Bomba Gasolina",
"general_products" : [
ObjectId("554b9f223d77c810e8915539"),
ObjectId("554b9f123d77c810e891552f")
],
"warehouse_sku" : [
"1_555411043d77c8066b3b6720",
"2_555411073d77c8066b3b6728"
]
}
My question is how to obtain a list of general products (the _id of each general product, plus the general_products referenced by child products) for every warehouse_sku that starts with the pattern 1_
I have written an aggregate query with the following structure:
list_products = db.getCollection('products').aggregate([
{$match: {warehouse_sku: /^1\_/}},
{$group: { "_id": "$_id" } }
])
That query successfully returns a result:
{ "_id" : ObjectId("55524d0c3d77c8ba9cb2d9fd") }
{ "_id" : ObjectId("554b9f223d77c810e8915539") }
but I need a plain array of general product ids so that I can use it with $in against the vehicles collection:
list_products = [ ObjectId("55524d0c3d77c8ba9cb2d9fd"), ObjectId("554b9f223d77c810e8915539") ]
For example: db.vehicles.find({products: {$in: list_products}})
I could not work out how to build that array.

Use the aggregation cursor's map() method to return an array of ObjectIds as follows:
var pipeline = [
{$match: {warehouse_sku: /^1\_/}},
{$group: { "_id": "$_id" } }
],
list_products = db.getCollection('products')
.aggregate(pipeline)
.map(function(doc){ return doc._id });
The find() cursor's map() would work here as well:
var query = {'warehouse_sku': /^1\_/},
list_products = db.getCollection('products')
.find(query)
.map(function(doc){ return doc._id });
UPDATE
In pymongo, you can use a lambda with the built-in map function to extract each _id from the cursor. The pattern is written as a raw string so that Python does not treat the backslash as an escape:
import re
regx = re.compile(r"^1_", re.IGNORECASE)
products_cursor = db.products.find({"warehouse_sku": regx})
list_products = list(map(lambda doc: doc["_id"], products_cursor))
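The question also asks for the general_products that child products point to, not only the _id of each matched document. A minimal in-memory sketch of that step, assuming the matched documents have already been fetched into an array and using plain strings in place of ObjectIds (collectGeneralProductIds is a hypothetical helper, not part of any driver):

```javascript
// Hypothetical helper: gather general product ids from the matched documents.
// A child product contributes its general_products; a general product contributes its own _id.
function collectGeneralProductIds(products) {
  const ids = new Set(); // a Set removes duplicates (works here because ids are strings)
  for (const doc of products) {
    if (Array.isArray(doc.general_products)) {
      doc.general_products.forEach((id) => ids.add(id));
    } else {
      ids.add(doc._id);
    }
  }
  return Array.from(ids);
}

// Sample data mirrors the documents in the question, with ObjectIds shown as strings.
const matched = [
  { _id: "554b9f223d77c810e8915539", warehouse_sku: ["1_5551536f3d77c870fc388a04"] },
  {
    _id: "55524d0c3d77c8ba9cb2d9fd",
    general_products: ["554b9f223d77c810e8915539", "554b9f123d77c810e891552f"],
    warehouse_sku: ["1_555411043d77c8066b3b6720"],
  },
];

const listProducts = collectGeneralProductIds(matched);
console.log(listProducts);
```

With real ObjectId values, a Set would compare object references rather than values, so you would likely deduplicate on their string form instead.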

Related

Mongodb query array

I need to get all documents that match an array of objects or an object with many fields.
Example 1 (array of objects)
If the document matches the country_code, then it must also have one of that country's postal_codes
var locations = [
{
country_code : 'IT',
postal_codes : [21052, 21053, 21054, 21055]
},
{
country_code : 'GER',
postal_codes : [41052, 41053, 41054, 41055]
}
]
Example 2 (object with fields)
If the document matches the key, then it must have one of the values of that key
var location = {
'IT' : [21052, 21053, 21054, 21055],
'GER' : [41052, 41053, 41054, 41055]
}
I prefer the first form (array of objects), but how can I use it to get all the documents that match?
The documents to find have this structure:
{
"_id" : ObjectId("587f6f57ed6b9df409db7370"),
"description" : "Test description",
"address" : {
"postal_code" : "21052",
"country_code" : "IT"
}
}
You can use $in to find such documents. The dotted field name must be quoted:
db.collection_name.find(
{ "address.postal_code": { $in: [ /* your values */ ] } }
)
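Another way is to use the $or operator with one clause per country, so that each country_code is only paired with its own postal codes. Both fields are nested under address in your documents, so use dot notation. The sample document stores postal_code as a string, so the values are given as strings here.
Your query should look something like this:
db.locations.find({
$or: [{
"address.country_code": "IT",
"address.postal_code": {
$in: ["21052", "21053", "21054", "21055"]
}
}, {
"address.country_code": "GER",
"address.postal_code": {
$in: ["41052", "41053", "41054", "41055"]
}
}]
})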
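The $or filter can be generated from the "array of objects" form in the question instead of being written by hand. A minimal sketch, assuming the fields live under address as in the sample document and that postal codes are stored as strings (buildLocationQuery is a hypothetical helper name):

```javascript
// Hypothetical helper: turn the "array of objects" form into a single $or filter.
// Postal codes are converted to strings because the sample document stores them as strings.
function buildLocationQuery(locations) {
  return {
    $or: locations.map((loc) => ({
      "address.country_code": loc.country_code,
      "address.postal_code": { $in: loc.postal_codes.map(String) },
    })),
  };
}

const locations = [
  { country_code: "IT", postal_codes: [21052, 21053] },
  { country_code: "GER", postal_codes: [41052, 41053] },
];

const query = buildLocationQuery(locations);
console.log(JSON.stringify(query, null, 2));
```

The resulting object can then be passed to find() on whichever collection holds the documents.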

MongoDB: Check for missing documents using a model tree structure with an array of ancestors

I'm using a model tree structure with an array of ancestors, and I need to check whether any document is missing.
{
"_id" : "GbxvxMdQ9rv8p6b8M",
"type" : "article",
"ancestors" : [ ]
}
{
"_id" : "mtmTBW8nA4YoCevf4",
"parent" : "GbxvxMdQ9rv8p6b8M",
"ancestors" : [
"GbxvxMdQ9rv8p6b8M"
]
}
{
"_id" : "J5Dg4fB5Kmdbi8mwj",
"parent" : "mtmTBW8nA4YoCevf4",
"ancestors" : [
"GbxvxMdQ9rv8p6b8M",
"mtmTBW8nA4YoCevf4"
]
}
{
"_id" : "tYmH8fQeTLpe4wxi7",
"refType" : "reference",
"parent" : "J5Dg4fB5Kmdbi8mwj",
"ancestors" : [
"GbxvxMdQ9rv8p6b8M",
"mtmTBW8nA4YoCevf4",
"J5Dg4fB5Kmdbi8mwj"
]
}
My attempt is to check whether each ancestor id exists. If a lookup fails, that document is missing and the data structure is corrupted.
const missing = [];
Collection.find().forEach(r => {
if (r.ancestors) {
r.ancestors.forEach(a => {
if (!Collection.findOne(a))
missing.push(r._id);
});
}
});
But doing it like this requires many db calls. Is it possible to optimize this?
Maybe I could first get an array of all unique ancestor ids and then check whether those documents exist with a single db call?
First collect all distinct ancestor ids from your collection.
var allAncestorIds = db.<collectionName>.distinct("ancestors");
Then find out which of those ids actually exist as documents.
var existingIds = db.<collectionName>.distinct("_id", {_id : {$in : allAncestorIds}});
Any ancestor id that is not among the existing ids belongs to a missing document. Iterate over the ancestor ids and record the missing ones in a separate collection.
allAncestorIds.forEach(function (ancestorId) {
if (existingIds.indexOf(ancestorId) === -1) {
db.missing.insert({_id : ancestorId});
}
});
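The check boils down to a set difference: every id that appears in some ancestors array, minus the ids that exist as documents. A minimal in-memory sketch of that logic, assuming the documents are already loaded into an array (findMissingAncestors is a hypothetical helper):

```javascript
// Hypothetical helper: return ancestor ids that no document in `docs` has as its _id.
function findMissingAncestors(docs) {
  const existing = new Set(docs.map((d) => d._id));
  const missing = new Set();
  for (const doc of docs) {
    for (const ancestorId of doc.ancestors || []) {
      if (!existing.has(ancestorId)) missing.add(ancestorId);
    }
  }
  return Array.from(missing);
}

// Same tree as in the question, with "mtmTBW8nA4YoCevf4" deliberately left out.
const docs = [
  { _id: "GbxvxMdQ9rv8p6b8M", ancestors: [] },
  {
    _id: "J5Dg4fB5Kmdbi8mwj",
    parent: "mtmTBW8nA4YoCevf4",
    ancestors: ["GbxvxMdQ9rv8p6b8M", "mtmTBW8nA4YoCevf4"],
  },
  {
    _id: "tYmH8fQeTLpe4wxi7",
    parent: "J5Dg4fB5Kmdbi8mwj",
    ancestors: ["GbxvxMdQ9rv8p6b8M", "mtmTBW8nA4YoCevf4", "J5Dg4fB5Kmdbi8mwj"],
  },
];

const missingAncestors = findMissingAncestors(docs);
console.log(missingAncestors);
```

Doing the difference in application code like this costs memory proportional to the number of ids, which is usually acceptable for a one-off consistency check.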

mongo add to nested array if entry does not contain two fields that match

I have a mongo document that contains an array called history:
{
"_id" : ObjectId("575fe85bfe98c1fba0a6e535"),
"email" : "email#address",
"__v" : 0,
"history" : [
{
"name" : "Test123",
"organisation" : "Rat",
"field" : 4,
"another": 3
}
]
}
I want to add or update fields on a history object IF its name AND organisation match. If no object matches, I want to add a new object to the array with the queried name and organisation, and add/update the other fields on that object when necessary.
So:
This query, finds one that matches:
db.users.find({
email:"email#address",
$and: [
{ "history.name": "Test123", "history.organisation": "Rat"}
]
})
However, I'm struggling to get the update/upsert to work IF that combination of history.name and history.organisation doesn't exist in the array.
What I think I need is:
"If no history entry has the name 'Test123' AND the organisation 'Rat', then add an object to the array with those fields and any other fields provided in the update query."
I tried this:
db.users.update({
email:"email#address",
$and: [
{ "history.name": "Test123", "history.organisation": "Rat"}
]
}, {
history: { name: "Test123"},
history: { organisation: "Rat"}
}, {upsert:true})
But that gave me E11000 duplicate key error index: db.users.$email_1 dup key: { : null }
Any help greatly appreciated.
Thanks community!
This is not possible with a single atomic update, I'm afraid; you need two update operations, one for each condition.
Break the update logic into two distinct operations. The first uses the positional $ operator to identify the matching element of the history array and $set to update its fields. This covers the case "update fields IF the name AND organisation match".
Use the findAndModify() method for this first operation, since it returns the matched document, or null when nothing matched. By default, the returned document does not include the modifications made by the update.
That return value tells you whether the second case applies, i.e. the combination of "history.name" and "history.organisation" doesn't exist in the array. In that case, the second update operation uses the $push operator to add the element.
The following example demonstrates the concept. It assumes the query and the document to be updated are kept as separate objects.
When the document matches an element of the existing history array, a single update operation is enough. When it does not, findAndModify() returns null, and you use that result to trigger the second update operation, which pushes the document onto the array:
var doc = {
"name": "Test123",
"organisation": "Rat"
}, // document to update. Note: the doc here matches the existing array
query = { "email": "email#address" }; // query document
query["history.name"] = doc.name; // create the update query
query["history.organisation"] = doc.organisation;
var update = db.users.findAndModify({
"query": query,
"update": {
"$set": {
"history.$.name": doc.name,
"history.$.organisation": doc.organisation
}
}
}); // return the document modified, if there's no matched document update = null
if (!update) {
db.users.update(
{ "email": query.email },
{ "$push": { "history": doc } }
);
}
After this operation, for a document that matches, querying the collection yields the same document as before:
db.users.find({ "email": "email#address" });
Output:
{
"_id" : ObjectId("575fe85bfe98c1fba0a6e535"),
"email" : "email#address",
"__v" : 0,
"history" : [
{
"name" : "Test123",
"organisation" : "Rat",
"field" : 4,
"another" : 3
}
]
}
Now consider documents that won't match:
var doc = {
"name": "foo",
"organisation": "bar"
}, // document to update. Note: the doc here does not match the current array
query = { "email": "email#address" }; // query document
query["history.name"] = doc.name; // create the update query
query["history.organisation"] = doc.organisation;
var update = db.users.findAndModify({
"query": query,
"update": {
"$set": {
"history.$.name": doc.name,
"history.$.organisation": doc.organisation
}
}
}); // return the document modified, if there's no matched document update = null
if (!update) {
db.users.update(
{ "email": query.email },
{ "$push": { "history": doc } }
);
}
Querying this collection for this document
db.users.find({ "email": "email#address" });
would yield
Output:
{
"_id" : ObjectId("575fe85bfe98c1fba0a6e535"),
"email" : "email#address",
"__v" : 0,
"history" : [
{
"name" : "Test123",
"organisation" : "Rat",
"field" : 4,
"another" : 3
},
{
"name" : "foo",
"organisation" : "bar"
}
]
}
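The update-then-push flow above can be imitated on a plain object to make the branching explicit. This is only an in-memory sketch under stated assumptions, not driver code: upsertHistoryEntry is a hypothetical helper, and it requires name and organisation to match on the same array element, which is what $elemMatch would enforce (a query on the two dotted fields separately can be satisfied by two different elements).

```javascript
// Hypothetical in-memory stand-in for the two update operations.
// Returns "updated" when an element with the same name AND organisation exists, else "pushed".
function upsertHistoryEntry(user, entry) {
  const existing = user.history.find(
    (h) => h.name === entry.name && h.organisation === entry.organisation
  );
  if (existing) {
    Object.assign(existing, entry); // plays the role of $set on "history.$.<field>"
    return "updated";
  }
  user.history.push({ ...entry }); // plays the role of $push
  return "pushed";
}

const user = {
  email: "email#address",
  history: [{ name: "Test123", organisation: "Rat", field: 4, another: 3 }],
};

const first = upsertHistoryEntry(user, { name: "Test123", organisation: "Rat", field: 5 });
const second = upsertHistoryEntry(user, { name: "foo", organisation: "bar" });
console.log(first, second, user.history.length);
```

The first call changes field on the existing entry and leaves another untouched; the second call appends a new entry.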

How to query a single embedded document in an array in MongoDB?

I am trying to query a single embedded document in an array in MongoDB. I don't know what I am doing wrong. Programmatically, I will query this document and insert new embedded documents into the currently empty trips arrays.
{
"_id" : ObjectId("564b3300953d9d51429163c3"),
"agency_key" : "DDOT",
"routes" : [
{
"route_id" : "6165",
"route_type" : "3",
"trips" : [ ]
},
{
"route_id" : "6170",
"route_type" : "3",
"trips" : [ ]
},
...
]
}
The following queries, which I run in the mongo shell, return nothing:
db.tm_routes.find( { routes : {$elemMatch: { route_id:6165 } } } ).pretty();
db.tm_routes.find( { routes : {$elemMatch: { route_id:6165,route_type:3 } } } ).pretty();
db.tm_routes.find({'routes.route_id':6165}).pretty()
Also, db.tm_routes.find({'routes.route_id':6165}).count() is 0.
The following query returns every document in the array
db.tm_routes.find({'routes.route_id':'6165'}).pretty();
{
"_id" : ObjectId("564b3300953d9d51429163c3"),
"agency_key" : "DDOT",
"routes" : [
{
"route_id" : "6165",
"route_type" : "3",
"trips" : [ ]
},
{
"route_id" : "6170",
"route_type" : "3",
"trips" : [ ]
},
...
]}
but db.tm_routes.find({'routes.route_id':'6165'}).count() returns 1.
And finally, here is how I inserted the data in the first place, in Node.js:
async.waterfall([
...
//RETRIEVE ALL ROUTEIDS FOR EVERY AGENCY
function(agencyKeys, callback) {
var routeIds = [];
var routesArr = [];
var routes = db.collection('routes');
//CALL GETROUTES FUNCTION FOR EVERY AGENCY
async.map(agencyKeys, getRoutes, function(err, results){
if (err) throw err;
else {
callback(null, results);
}
});
//GET ROUTE IDS
function getRoutes(agencyKey, callback){
var cursor = routes.find({agency_key:agencyKey});
cursor.toArray(function(err, docs){
if(err) throw err;
for(i in docs){
routeIds.push(docs[i].route_id);
var routeObj = {
route_id:docs[i].route_id,
route_type:docs[i].route_type,
trips:[]
};
routesArr.push(routeObj);
/* I TRIED 3 DIFFERENT WAYS TO PUSH DATA
//1->
collection.update({agency_key:agencyKey}, {$push:{"routes":{
'route_id':docs[i].route_id,
'route_type':docs[i].route_type,
'trips':[]
}}});
//2->
collection.update({agency_key:agencyKey}, {$push:{"routes":routeObj}});
*/
}
// 3->
collection.update({agency_key:agencyKey}, {$push:{routes:{$each:routesArr}}});
callback(null, routeIds);
});
};
},
...
var collection = newCollection(db, 'tm_routes',[]);
function newCollection(db, name, options){
var collection = db.collection(name);
if (collection){
collection.drop();
}
db.createCollection(name, options);
return db.collection(name);
}
Note: I am not using Mongoose and would prefer not to, if possible.
Melis,
I see what you are asking for, and what you need is help understanding how things are stored in MongoDB. Things to understand:
A document is the basic unit of data in MongoDB and can be roughly compared to a row in a relational database.
A collection can be thought of as a table with a dynamic schema.
So documents are stored in collections. Every document has a special _id that is unique within a collection. What you showed us above, in the following format, is one document.
{
"_id" : ObjectId("564b3300953d9d51429163c3"),
"agency_key" : "DDOT",
"routes" : [
{
"route_id" : "6165",
"route_type" : "3",
"trips" : [ ]
},
{
"route_id" : "6170",
"route_type" : "3",
"trips" : [ ]
},
...
]}
When you run a query against your tm_routes collection, find() returns every document in the collection that matches it. So db.tm_routes.find({'routes.route_id':'6165'}).pretty(); returns the entire document that contains a matching route, not just that route. The queries that pass the number 6165 return nothing because route_id is stored as the string "6165", and MongoDB does not match a number against a string. Therefore this statement is wrong:
The following query returns every document in the array
If you need to find a specific route in that document and return only that route, then, because routes is an array, you may have to use the positional $ operator or the aggregation framework, depending on your use case.
For Node and MongoDB users using Mongoose, this is one way to write the update for the problem above:
db.tm_routes.updateOne(
{
routes: {
$elemMatch: {
route_id: "6165" // stored as a string; in a route handler this could be req.params.routeid
}
}
},
{
$push: {
"routes.$.trips":{
//the content you want to push into the trips array goes here
}
}
}
)
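The difference between querying with 6165 and with "6165" can be reproduced on a plain object. A minimal sketch, assuming the document shape from the question (findRoute is a hypothetical helper; strict equality is used to mirror the fact that a number does not match a string):

```javascript
// Hypothetical helper: pick one route out of a document's routes array.
// Strict equality means the number 6165 does not match the string "6165".
function findRoute(doc, routeId) {
  return doc.routes.find((route) => route.route_id === routeId) || null;
}

const agency = {
  agency_key: "DDOT",
  routes: [
    { route_id: "6165", route_type: "3", trips: [] },
    { route_id: "6170", route_type: "3", trips: [] },
  ],
};

const byNumber = findRoute(agency, 6165);   // no match: wrong type
const byString = findRoute(agency, "6165"); // matches the first route
console.log(byNumber, byString);
```

If the ids are meant to be numeric, converting them when the data is inserted avoids the mismatch altogether.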

How to retrieve a specific field from a subdocument array with mongoose

I'm trying to get a specific field from a subdocument array.
I don't want to include any of the fields from the parent doc.
Here is the sample document
{
"_id" : ObjectId("5409dd36b71997726532012d"),
"hierarchies" : [
{
"rank" : 1,
"_id" : ObjectId("5409df85b719977265320137"),
"name" : "CTO",
"userId" : [
ObjectId("53a47a639c52c9d83a2d71db")
]
}
]
}
I would like to return the rank of the hierarchy if a given userId is in the userId array.
Here's what I have so far in my query:
collectionName.find(
{hierarchies: {$elemMatch: {userId: ObjectId("53a47a639c52c9d83a2d71db")}}},
"hierarchies.$.rank",
function(err, data){}
);
So far it returns the entire matching object from the hierarchies array, but I would like to limit it to just the rank property of that object.
The projection available to .find() queries in MongoDB generally does not do this sort of projection for fields inside array elements. All you can generally do is return the whole "matched" element of the array.
For what you want, use the aggregation framework instead, which gives you more control over matching and projection:
Model.aggregate([
{ "$match": {
"hierarchies.userId": ObjectId("53a47a639c52c9d83a2d71db")
}},
{ "$unwind": "$hierarchies" },
{ "$match": {
"hierarchies.userId": ObjectId("53a47a639c52c9d83a2d71db")
}},
{ "$project": {
"rank": "$hierarchies.rank"
}}
],function(err,result) {
})
That first matches the documents, then unwinds the array and filters its elements down to just the match, and finally projects only the required field.
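To see what each stage contributes, the same four steps can be imitated on an in-memory array. This is a sketch under stated assumptions rather than the aggregation framework itself: ranksForUser is a hypothetical helper, ObjectIds are shown as strings, and the second hierarchy entry is made up for illustration.

```javascript
// In-memory imitation of $match -> $unwind -> $match -> $project.
function ranksForUser(docs, userId) {
  return docs
    // $match: keep documents where some hierarchy contains the user
    .filter((doc) => doc.hierarchies.some((h) => h.userId.includes(userId)))
    // $unwind: emit one row per hierarchy element
    .flatMap((doc) => doc.hierarchies.map((h) => ({ _id: doc._id, hierarchies: h })))
    // $match: keep only the rows whose single hierarchy contains the user
    .filter((row) => row.hierarchies.userId.includes(userId))
    // $project: return just the rank
    .map((row) => ({ _id: row._id, rank: row.hierarchies.rank }));
}

const sampleDocs = [
  {
    _id: "5409dd36b71997726532012d",
    hierarchies: [
      { rank: 1, name: "CTO", userId: ["53a47a639c52c9d83a2d71db"] },
      { rank: 2, name: "Hypothetical", userId: ["000000000000000000000000"] },
    ],
  },
];

const ranks = ranksForUser(sampleDocs, "53a47a639c52c9d83a2d71db");
console.log(ranks);
```

The first filter is what lets a real pipeline use an index before unwinding; the second is what removes the non-matching array elements.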
