How to count occurrences of each value in an array? - arrays

I have a database of ISSUES in MongoDB. Some of the issues have comments, stored as an array, and each comment has a writer. How can I count the number of comments each writer has written?
I've tried
db.test.issues.group(
{
key = "comments.username":true;
initial: {sum:0},
reduce: function(doc, prev) {prev.sum +=1},
}
);
but no luck :(
A Sample:
{
"_id" : ObjectId("50f48c179b04562c3ce2ce73"),
"project" : "Ruby Driver",
"key" : "RUBY-505",
"title" : "GETMORE is sent to wrong server if an intervening query unpins the connection",
"description" : "I've opened a pull request with a failing test case demonstrating the bug here: https://github.com/mongodb/mongo-ruby-driver/pull/134\nExcerpting that commit message, the issue is: If we do a secondary read that is large enough to require sending a GETMORE, and then do another query before the GETMORE, the secondary connection gets unpinned, and the GETMORE gets sent to the wrong server, resulting in CURSOR_NOT_FOUND, even though the cursor still exists on the server that was initially queried.",
"status" : "Open",
"components" : [
"Replica Set"
],
"affected_versions" : [
"1.7.0"
],
"type" : "Bug",
"reporter" : "Nelson Elhage",
"priority" : "major",
"assignee" : "Tyler Brock",
"resolution" : "Unresolved",
"reported_on" : ISODate("2012-11-17T20:30:00Z"),
"votes" : 3,
"comments" : [
{
"username" : "Nelson Elhage",
"date" : ISODate("2012-11-17T20:30:00Z"),
"body" : "Thinking some more"
},
{
"username" : "Brandon Black",
"date" : ISODate("2012-11-18T20:30:00Z"),
"body" : "Adding some findings of mine to this ticket."
},
{
"username" : "Nelson Elhage",
"date" : ISODate("2012-11-18T20:30:00Z"),
"body" : "I think I tracked down the 1.9 dependency."
},
{
"username" : "Nelson Elhage",
"date" : ISODate("2012-11-18T20:30:00Z"),
"body" : "Forgot to include a link"
}
]
}

You forgot the curly braces around the key value, key needs a : rather than an =, and you need to terminate that line with a , instead of a ;.
db.issues.group({
key: {"comments.username":true},
initial: {sum:0},
reduce: function(doc, prev) {prev.sum +=1},
});
UPDATE
Since comments is an array, group cannot key on its elements. You need to use aggregate instead, so that you can $unwind comments and then group on the username:
db.issues.aggregate(
{$unwind: '$comments'},
{$group: {_id: '$comments.username', sum: {$sum: 1}}}
);
For the sample doc in the question, this outputs:
{
"result": [
{
"_id": "Brandon Black",
"sum": 1
},
{
"_id": "Nelson Elhage",
"sum": 3
}
],
"ok": 1
}
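
For anyone who wants to picture what the two pipeline stages do, here is a minimal sketch in plain JavaScript, run outside MongoDB. The issues array, the "NO-COMMENTS" document and the countCommentsByUser helper are made up for illustration and are not part of any MongoDB API; only the usernames come from the sample in the question.

```javascript
// Hypothetical in-memory stand-in for the issues collection (only the fields used here).
const issues = [
  {
    key: "RUBY-505",
    comments: [
      { username: "Nelson Elhage" },
      { username: "Brandon Black" },
      { username: "Nelson Elhage" },
      { username: "Nelson Elhage" },
    ],
  },
  // Made-up issue without comments: it contributes nothing, just as $unwind
  // drops documents whose array field is missing by default.
  { key: "NO-COMMENTS" },
];

// countCommentsByUser is an illustrative helper name, not a MongoDB function.
function countCommentsByUser(docs) {
  const counts = new Map();
  for (const doc of docs) {
    // "$unwind": visit each comment as if it were its own document.
    for (const comment of doc.comments || []) {
      // "$group" with {$sum: 1}: add one per comment to that user's total.
      counts.set(comment.username, (counts.get(comment.username) || 0) + 1);
    }
  }
  // Shape the result like the aggregate output: one {_id, sum} per username.
  return Array.from(counts, ([_id, sum]) => ({ _id, sum }));
}

console.log(countCommentsByUser(issues));
```

The counts match the aggregate output above (3 for Nelson Elhage, 1 for Brandon Black). The order of the entries is not guaranteed to match what the server returns.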

Just a snide answer here to complement @JohnnyHK's answer: it sounds like you're new to MongoDB, so you are possibly working on a recent version of MongoDB (if not, I would upgrade). Either way, the old group count is kind of bad. For one, it won't work with sharding.
Instead, in MongoDB 2.2 you can just do:
db.col.aggregate({$unwind: "$comments"}, {$group: {_id: "$comments.username", count: {$sum: 1}}})
Or something similar (the $unwind is needed because comments is an array; without it, each document is grouped by its whole list of usernames). You can read more about it here: http://docs.mongodb.org/manual/applications/aggregation/

Related

Find only the documents which have two embedded documents

I'm using MongoDB to analyze a Nobel prizes dataset whose documents look like this:
> db.laureate.find().pretty().limit(1)
{
"_id" : ObjectId("604bc8c847d640142f02b3b1"),
"id" : "1",
"firstname" : "Wilhelm Conrad",
"surname" : "Röntgen",
"born" : "1845-03-27",
"died" : "1923-02-10",
"bornCountry" : "Prussia (now Germany)",
"bornCountryCode" : "DE",
"bornCity" : "Lennep (now Remscheid)",
"diedCountry" : "Germany",
"diedCountryCode" : "DE",
"diedCity" : "Munich",
"gender" : "male",
"prizes" : [
{
"year" : "1901",
"category" : "physics",
"share" : "1",
"motivation" : "\"in recognition of the extraordinary services he has rendered by the discovery of the remarkable rays subsequently named after him\"",
"affiliations" : [
{
"name" : "Munich University",
"city" : "Munich",
"country" : "Germany"
}
]
}
]
}
As you can see, the "prizes" field contains embedded documents. The query I am trying to write should find only those laureates who won two prizes (which I already know to be Marie Curie and Linus Pauling). Can you help me with that?
Thanks in advance!
The $size operator should work fine for this. You can read about it here: https://docs.mongodb.com/manual/reference/operator/query/size/
Your new query (without limit(1), so that every matching laureate is returned):
db.laureate.find({prizes: {$size: 2}}).pretty()
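
To make the matching rule concrete: $size matches only arrays with exactly the given number of elements, and it does not accept ranges. A rough plain-JavaScript picture of that rule follows. The laureates array is a trimmed-down, illustrative stand-in for the collection, and withExactlyNPrizes is a made-up helper name, not a MongoDB function.

```javascript
// Illustrative, trimmed-down laureate documents (not the real dataset).
const laureates = [
  { surname: "Röntgen", prizes: [{ year: "1901", category: "physics" }] },
  {
    surname: "Curie",
    prizes: [
      { year: "1903", category: "physics" },
      { year: "1911", category: "chemistry" },
    ],
  },
  // Made-up document without a prizes array: it never matches.
  { surname: "NoPrizesField" },
];

// Keep only documents whose prizes field is an array of exactly n elements,
// which is the condition {prizes: {$size: n}} expresses.
function withExactlyNPrizes(docs, n) {
  return docs.filter((doc) => Array.isArray(doc.prizes) && doc.prizes.length === n);
}

console.log(withExactlyNPrizes(laureates, 2).map((doc) => doc.surname));
```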

mongodb - extract a particular value in an embedded array within an array

I'm fairly new to MongoDB, but I've managed to archive a load of documents into a new collection called documents_archived in the following format using an aggregation pipeline:
{
"_id" : ObjectId("5a0046ef2039404645a42f52"),
"archive" : [
{
"_id" : ObjectId("54e60f49e4b097cc823afe8c"),
"_class" : "xxxxxxxxxxxxx",
"fields" : [
{
"key" : "Agreement Number",
"value" : "1002465507"
},
{
"key" : "Name",
"value" : "xxxxxxxx"
},
{
"key" : "Reg No",
"value" : "xxxxxxx"
},
{
"key" : "Surname",
"value" : "xxxxxxxx"
},
{
"key" : "Workflow Id",
"value" : "xxxxxxxx"
}
],
"fileName" : "Audit_C1002465507.txt",
"type" : "Workflow Audit",
"fileSize" : NumberLong(404),
"document" : BinData(0, "xxxxx"),
"creationDate" : ISODate("2009-09-25T00:00:00.000+0000"),
"lastModificationDate" : ISODate("2015-02-19T16:28:57.016+0000"),
"expiryDate" : ISODate("2018-09-25T00:00:00.000+0000")
}
]
}
Now I'm trying to extract just the value of the Agreement Number field. I have tried everything that my limited knowledge, searching, and the documentation allow. I wonder if the MongoDB experts out there can help. Thank you.
Here's a solution that uses the aggregation framework (db.foo stands in for your collection). I am assuming that each doc can have more than one entry in the archive field but only one Agreement Number in the fields array, because your design appears to be key/value. If multiple Agreement Numbers show up in the fields array we'll have to add an additional $unwind, but for the moment this should work:
db.foo.aggregate([
{$unwind: "$archive"}
,{$project: {x: {$filter: {
input: "$archive.fields",
as: "z",
cond: {$eq: [ "$$z.key", "Agreement Number" ]}
}}
}}
,{$project: {_id:false, val: {$arrayElemAt: ["$x.value",0]} }}
]);
{ "val" : "1002465507" }
You can use the following in the mongo shell to print only the values (note that it looks only at the first entry of archive):
db.documents_archived.find().forEach(function(doc) {
doc.archive[0].fields.forEach(function(field) {
if (field.key == "Agreement Number") {
print(field.value)
}
})
})
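
If a document can hold more than one entry in archive, the same loop can walk every entry instead of only archive[0]. Here is a small sketch of that idea in plain JavaScript. The archivedDocs array is a cut-down, in-memory stand-in for the collection, and valuesForKey is a hypothetical helper name; in the shell the inner loops would sit inside the forEach callback shown above.

```javascript
// Cut-down stand-in for documents_archived (only the fields used here).
const archivedDocs = [
  {
    archive: [
      {
        fileName: "Audit_C1002465507.txt",
        fields: [
          { key: "Agreement Number", value: "1002465507" },
          { key: "Name", value: "xxxxxxxx" },
        ],
      },
    ],
  },
];

// Collect the value of every field whose key matches, across all archive entries.
function valuesForKey(docs, wantedKey) {
  const values = [];
  for (const doc of docs) {
    for (const entry of doc.archive || []) {
      for (const field of entry.fields || []) {
        if (field.key === wantedKey) {
          values.push(field.value);
        }
      }
    }
  }
  return values;
}

console.log(valuesForKey(archivedDocs, "Agreement Number")); // prints [ '1002465507' ]
```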

How can I remove duplicates from Firebase database when nodes share a common child?

"-Kj9Penv_LMRUIPSet0b" : {
"categories" : [ "food", "fashion"],
"contact" : "profile/contact/eieiiieie888x7ww28288_x22",
"location" : "New York, United States",
"name" : "Billybob Smith",
"social" : {
"twitter" : {
"followers" : "1,002",
"nickname" : "#billybob"
}
},
"state" : "0"
},
"eieiiieie888x7ww28288_x22" : {
"categories" : [ "food", "fashion" ],
"contact" : "profile/contact/eieiiieie888x7ww28288_x22",
"location" : "New York, United States",
"name" : "Billybob Smith.",
"social" : {
"twitter" : {
"followers" : "1,002",
"nickname" : "#billybob"
}
},
"socialID" : "twitter_id|558969977",
"state" : "0",
"uniqueID" : "eieiiieie888x7ww2828"
},
This is one JSON example of a duplicate in my database, and I have a lot of them. The only piece of data I capture that uniquely identifies each user is their contact link. What is my best course of action to find and remove the duplicates? I'm totally stuck. The second entry in the example is the more accurate and complete one. Ideally, I could remove the first one and leave the second one behind.
Could really use some help here! Thank you so much!

Create a document with an array from filtered elements in an existing document array

I've asked this question before but not in the clearest way since I had no responses :( so I thought I would try again.
I have a document as shown below, I want to create a new document which only picks the names in the array where language = "English".
{
"_id" : ObjectId("564d35d5150699558156942b"),
"objectCategory" : "Food",
"objectType" : "Fruit",
"objectName" : [
{
"language" : "English",
"name" : "Apple"
},
{
"language" : "French",
"name" : "Pomme"
},
{
"language" : "English",
"name" : "Strawberry"
},
{
"language" : "French",
"name" : "Fraise"
}
]
}
I want the $out document to look like this below. I know I can filter a document by content but, I want to filter within a single document not across a collection. For getting the right document in the first place, I would have a query to $find objectCategory = "Food" and objectType = "Fruit"
{
"_id" : ObjectId("564d35d5150699558156942b"),
"objectCategory" : "Food",
"objectType" : "Fruit",
"objectName" : [
"name" : "Apple",
"name" : "Strawberry"
]
}
Thanks, Matt
Wow, I really thought I had found it with:
db.serviceCatalogue.find({objectName: {"$elemMatch": {language: "English"}}}, {"objectName.name": 1})
Thanks to: Retrieve only the queried element in an object array in MongoDB collection
However, it did nothing; I must have dreamt it worked. How do you get just the array elements where the value of the field called language is 'English'?
This is only an example of what I want to do. It seems like this is just painful, especially with no one answering other than me :)
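
A note on why that find appears to do nothing: the $elemMatch in the query part only selects which documents match, and the "objectName.name" projection then keeps name for every array element, English or not. Filtering inside one document's array is what the aggregation operators $filter (used in another answer on this page) and $map are meant for. As a rough sketch of the intended result, here is the same transformation in plain JavaScript on the sample document. namesInLanguage is a made-up helper name, and the output is assumed to be an array of plain name strings.

```javascript
// The sample document from the question, minus _id.
const fruit = {
  objectCategory: "Food",
  objectType: "Fruit",
  objectName: [
    { language: "English", name: "Apple" },
    { language: "French", name: "Pomme" },
    { language: "English", name: "Strawberry" },
    { language: "French", name: "Fraise" },
  ],
};

// Return a copy of the document whose objectName holds only the names
// of the entries in the requested language.
function namesInLanguage(doc, language) {
  return {
    ...doc,
    objectName: doc.objectName
      .filter((entry) => entry.language === language) // like $filter
      .map((entry) => entry.name), // like $map
  };
}

console.log(namesInLanguage(fruit, "English").objectName); // prints [ 'Apple', 'Strawberry' ]
```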

Mongoose Query: Find an element inside an array

Mongoose/Mongo noob here:
My Data
Here is my simplified data, each user has his own document
{ "__v" : 1,
"_id" : ObjectId( "53440e94c02b3cae81eb0065" ),
"email" : "test#test.com",
"firstName" : "testFirstName",
"inventories" : [
{ "_id" : "active",
"tags" : [
"inventory",
"active",
"vehicles" ],
"title" : "activeInventory",
"vehicles" : [
{ "_id" : ObjectId( "53440e94c02b3cae81eb0069" ),
"tags" : [
"vehicle" ],
"details" : [
{ "_id" : ObjectId( "53440e94c02b3cae81eb0066" ),
"year" : 2007,
"transmission" : "Manual",
"price" : 1000,
"model" : "Firecar",
"mileageReading" : 50000,
"make" : "Bentley",
"interiorColor" : "blue",
"history" : "CarProof",
"exteriorColor" : "blue",
"driveTrain" : "SWD",
"description" : "test vehicle",
"cylinders" : 4,
"mileageType" : "kms" } ] } ] },
{ "title" : "soldInventory",
"_id" : "sold",
"vehicles" : [],
"tags" : [
"inventory",
"sold",
"vehicles" ] },
{ "title" : "deletedInventory",
"_id" : "deleted",
"vehicles" : [],
"tags" : [
"inventory",
"sold",
"vehicles" ] } ] }
As you can see, each user has an inventories property that is an array that contains 3 inventories (activeInventory, soldInventory and deletedInventory)
My Query
Given a user's email and a vehicle ID, I would like my query to find the user's activeInventory and return just the vehicle that matches the ID. Here is what I have so far:
user = api.mongodb.userModel;
ObjectId = require('mongoose').Types.ObjectId;
return user
.findOne({email : params.username})
.select('inventories')
.find({'title': 'activeInventory'})
//also tried
//.where('title')
//.equals('activeInventory')
.exec(function(err, result){
console.log(err);
console.log(result);
});
With this, result comes out as an empty array. I've also tried .find('inventories.title': 'activeInventory') which strangely returns the entire inventories array. If possible, I'd like to keep the chaining query format as I find it much more readable.
My Ideal Query
return user
.findOne({email : params.username})
.select('inventories')
.where('title')
.equals('activeInventory')
.select('vehicles')
.id(vehicleID)
.exec(cb)
Obviously it does not work, but it gives you an idea of what I'm trying to do.
Using the $ positional operator, you can get the results. However, if you have multiple elements in the vehicles array all of them will be returned in the result, as you can only use one positional operator in the projection and you are working with 2 arrays (one inside another).
I would suggest you take a look at the aggregation framework, as you'll get a lot more flexibility. Here's an example query for your question that runs in the shell. I'm not familiar with mongoose, but I guess this will still help you and you'd be able to translate it:
db.collection.aggregate([
// Get only the documents where "email" equals "test#test.com" -- REPLACE with params.username
{"$match" : {email : "test#test.com"}},
// Unwind the "inventories" array
{"$unwind" : "$inventories"},
// Get only elements where "inventories.title" equals "activeInventory"
{"$match" : {"inventories.title":"activeInventory"}},
// Unwind the "vehicles" array
{"$unwind" : "$inventories.vehicles"},
// Filter by vehicle ID -- REPLACE with vehicleID
{"$match" : {"inventories.vehicles._id":ObjectId("53440e94c02b3cae81eb0069")}},
// Tidy up the output
{"$project" : {_id:0, vehicle:"$inventories.vehicles"}}
])
This is the output you'll get:
{
"result" : [
{
"vehicle" : {
"_id" : ObjectId("53440e94c02b3cae81eb0069"),
"tags" : [
"vehicle"
],
"details" : [
{
"_id" : ObjectId("53440e94c02b3cae81eb0066"),
"year" : 2007,
"transmission" : "Manual",
"price" : 1000,
"model" : "Firecar",
"mileageReading" : 50000,
"make" : "Bentley",
"interiorColor" : "blue",
"history" : "CarProof",
"exteriorColor" : "blue",
"driveTrain" : "SWD",
"description" : "test vehicle",
"cylinders" : 4,
"mileageType" : "kms"
}
]
}
}
],
"ok" : 1
}
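
If the user document has already been loaded into application code, the same two-level lookup can be done without the server. A minimal sketch in plain JavaScript follows. The userDoc object is a shortened copy of the sample data with the ids written as plain strings, and findVehicle is a hypothetical helper name, not a Mongoose method.

```javascript
// Shortened copy of the sample user document; ids are plain strings here.
const userDoc = {
  email: "test#test.com",
  inventories: [
    {
      _id: "active",
      title: "activeInventory",
      vehicles: [{ _id: "53440e94c02b3cae81eb0069", tags: ["vehicle"] }],
    },
    { _id: "sold", title: "soldInventory", vehicles: [] },
    { _id: "deleted", title: "deletedInventory", vehicles: [] },
  ],
};

// Walk inventories, then vehicles, mirroring the pipeline's
// $unwind/$match pairs; return the first matching vehicle or null.
function findVehicle(doc, inventoryTitle, vehicleId) {
  for (const inventory of doc.inventories || []) {
    if (inventory.title !== inventoryTitle) continue;
    for (const vehicle of inventory.vehicles || []) {
      // String() so the comparison also works for id objects
      // whose toString() yields the hex string.
      if (String(vehicle._id) === String(vehicleId)) {
        return vehicle;
      }
    }
  }
  return null;
}

console.log(findVehicle(userDoc, "activeInventory", "53440e94c02b3cae81eb0069"));
```

This trades an extra server round of filtering for simplicity. It only makes sense when the whole user document is small enough to fetch.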
I don't know how to get the chaining query format, but what you are looking for is projection. Take a look at http://docs.mongodb.org/manual/reference/operator/projection/
It would probably look like this (the matching vehicle still has to be picked out of result.inventories[0].vehicles afterwards):
user.findOne(
{email: params.username},
// $elemMatch in a projection returns only the first inventory whose title matches
{inventories: {$elemMatch: {title: 'activeInventory'}}},
function(err, result) {
console.log(err);
console.log(result);
})

Resources