For example, I have a collection "test" with an index on the array field "numbers", and it contains two documents:
db.test.createIndex({"numbers": 1})
db.test.insert({"title": "A", "numbers": [1,4,9]})
db.test.insert({"title": "B", "numbers": [2,3,7]})
1) How can I get all results sorted by "numbers" (using index), so for each value from an array I get a full document? Like this:
{"_id": "...", "title": "A", "numbers": [1,4,9]}
{"_id": "...", "title": "B", "numbers": [2,3,7]}
{"_id": "...", "title": "B", "numbers": [2,3,7]}
{"_id": "...", "title": "A", "numbers": [1,4,9]}
{"_id": "...", "title": "B", "numbers": [2,3,7]}
{"_id": "...", "title": "A", "numbers": [1,4,9]}
2) How can I get such results (sorry for no explanation, but I think it's clear what I'm trying to achieve here):
{"_id": "...", "title": "A", "numbers": 1}
{"_id": "...", "title": "B", "numbers": 2}
{"_id": "...", "title": "B", "numbers": 3}
{"_id": "...", "title": "A", "numbers": 4}
{"_id": "...", "title": "B", "numbers": 7}
{"_id": "...", "title": "A", "numbers": 9}
3) How can I get similar results, but ordered by the second element of each array?
{"_id": "...", "title": "B", "numbers": 3}
{"_id": "...", "title": "A", "numbers": 4}
Also I care about the performance, so it'd be great if you explain which technique is faster / slower (if there is more than one way to do it, of course). Thanks.
UPD: Let me clarify. We have an index on the "numbers" array. I want to iterate over this index from the min to the max value and, for each value, get the document it belongs to. So a document will appear in the results N times, where N = the number of elements in its array ("numbers" in this case).
Simply use the index in the sort by "dot notation":
db.collection.find().sort({ "numbers.0": 1 })
This is the fastest way if you know the position you want: just use the array index (starting at 0, of course). The same applies to any position in the array, e.g. "numbers.1" for the second element.
If you want the "smallest" value in an array to sort by, then that takes more work, using .aggregate() to work that out:
db.collection.aggregate([
{ "$unwind": "$numbers" },
{ "$group": {
"_id": "$_id",
"numbers": { "$push": "$numbers" },
"min": { "$min": "$numbers" }
}},
{ "$sort": { "min": 1 } }
])
Naturally, that takes longer to execute than the earlier form because of the extra work. It needs $unwind to de-normalize the array elements into individual documents, then $group with $min to find the smallest value per document, and finally a plain $sort.
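To see what that pipeline computes without a running server, here is a minimal plain JavaScript sketch that emulates it in memory on the two sample documents. It is not MongoDB API: sortByMin is a hypothetical helper, and unlike the $group stage above it keeps every field of the document for readability.

```javascript
// Plain JavaScript emulation of "$unwind + $group with $min + $sort".
// sortByMin is a hypothetical helper, not part of any MongoDB driver.
const minSampleDocs = [
  { title: "B", numbers: [2, 3, 7] },
  { title: "A", numbers: [1, 4, 9] },
];

function sortByMin(documents) {
  return documents
    // Attach the smallest array value, as $group with $min would.
    .map((doc) => ({ ...doc, min: Math.min(...doc.numbers) }))
    // Order ascending by that value, as { "$sort": { "min": 1 } } would.
    .sort((a, b) => a.min - b.min);
}

const sortedByMin = sortByMin(minSampleDocs);
console.log(sortedByMin.map((doc) => `${doc.title}:${doc.min}`));
```

The point of the sketch is only that the minimum has to be computed per document before any ordering can happen, which is the extra work the aggregation pays for.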
For the full result, with one output document per array element, you can do this:
db.test.aggregate([
{ "$project": {
"title": 1,
"numbers": 1,
"copy": "$numbers"
}},
{ "$unwind": "$copy" },
{ "$group": {
"_id": {
"_id": "$_id",
"number": "$copy"
},
"numbers": { "$first": "$numbers" }
}},
{ "$sort": { "_id.number": 1 } }
])
Which produces:
{
"_id" : {
"_id" : ObjectId("560545d64d64216d6de78edb"),
"number" : 1
},
"numbers" : [ 1, 4, 9 ]
}
{
"_id" : {
"_id" : ObjectId("560545d74d64216d6de78edc"),
"number" : 2
},
"numbers" : [ 2, 3, 7 ]
}
{
"_id" : {
"_id" : ObjectId("560545d74d64216d6de78edc"),
"number" : 3
},
"numbers" : [ 2, 3, 7 ]
}
{
"_id" : {
"_id" : ObjectId("560545d64d64216d6de78edb"),
"number" : 4
},
"numbers" : [ 1, 4, 9 ]
}
{
"_id" : {
"_id" : ObjectId("560545d74d64216d6de78edc"),
"number" : 7
},
"numbers" : [ 2, 3, 7 ]
}
{
"_id" : {
"_id" : ObjectId("560545d64d64216d6de78edb"),
"number" : 9
},
"numbers" : [ 1, 4, 9 ]
}
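For the shape asked for in question 2 (the title plus a single number per row), the same unwind-then-sort idea can be sketched in plain JavaScript. This only emulates the semantics in memory. The helper name unwindAndSort and the placeholder _id values are assumptions made for illustration.

```javascript
// In-memory emulation of unwinding an array field and sorting by its values.
// unwindAndSort is a hypothetical helper; the _id values are placeholders.
const sampleDocs = [
  { _id: "a1", title: "A", numbers: [1, 4, 9] },
  { _id: "b1", title: "B", numbers: [2, 3, 7] },
];

function unwindAndSort(documents, field) {
  return documents
    // One copy of the document per array element, with the array
    // replaced by that single element (what $unwind does).
    .flatMap((doc) => doc[field].map((value) => ({ ...doc, [field]: value })))
    // Ascending order on the unwound value (what $sort does).
    .sort((x, y) => x[field] - y[field]);
}

const rows = unwindAndSort(sampleDocs, "numbers");
rows.forEach((row) => console.log(row.title, row.numbers));
```

Each document shows up once per element of its array, which matches the N-times behaviour described in the question's update.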
Related
I have 3 documents:
{
"id": 1,
"user": "Brian1",
"configs": [
"a",
"b",
"c",
"d"
]
}
----
{
"id": 2,
"user": "max_en",
"configs": [
"a",
"h",
"i",
"j"
]
}
----
{
"id": 3,
"user": "userX",
"configs": [
"t",
"u",
"s",
"b"
]
}
I want to merge all the "configs" arrays into one array without duplicates, like this:
{
"configs": [
"a",
"b",
"c",
"d",
"h",
"i",
"j",
"t",
"u",
"s"
]
}
I've tried the following:
Aggregation.group("").addToSet("configs").as("configs") and { _id: "", 'configs': { $addToSet: '$configs' } }
The first one gives an error because I've left the fieldname empty (I don't know what to put there).
The second one returns a merged array but with duplicates.
When you want to group all the documents together, use _id: null in the $group stage.
It puts every document into a single group.
You probably need this:
db.collection.aggregate([
{
"$unwind": "$configs"
},
{
$group: {
_id: null,
configs: {
"$addToSet": "$configs"
}
}
}
])
But be cautious when running this on a larger collection without a $match stage first, because $unwind then has to process every document.
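As a rough illustration of why the $unwind matters, here is a plain JavaScript sketch that emulates the pipeline in memory using the three documents from the question. mergeConfigs is a hypothetical helper, not a MongoDB or Spring Data call. Note also that MongoDB does not guarantee the order of the array produced by $addToSet, whereas this sketch happens to keep insertion order.

```javascript
// In-memory emulation of "$unwind configs, then $group with $addToSet".
// mergeConfigs is a hypothetical helper used only for illustration.
const configDocs = [
  { id: 1, user: "Brian1", configs: ["a", "b", "c", "d"] },
  { id: 2, user: "max_en", configs: ["a", "h", "i", "j"] },
  { id: 3, user: "userX", configs: ["t", "u", "s", "b"] },
];

function mergeConfigs(documents) {
  const merged = new Set();
  for (const doc of documents) {
    // The inner loop plays the role of $unwind: values are added one by one.
    for (const value of doc.configs) {
      // Set.add ignores values already present, like $addToSet.
      merged.add(value);
    }
  }
  return { configs: [...merged] };
}

const mergedConfigs = mergeConfigs(configDocs);
console.log(mergedConfigs);
```

Without the inner loop, whole arrays would be added as single elements, so the result would be an array of arrays in which "a" and "b" still repeat.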
Let's say I have three documents in a collection, like so:
[
{"_id": "101", parts: ["a", "b"]},
{"_id": "102", parts: ["a", "c"]},
{"_id": "103", parts: ["a", "z"]},
]
What query do I have to write so that, given the input ["a","b","c"]
(i.e. every item in the parts field of a matching doc must be present in ["a","b","c"]), the output is:
[
{"_id": "101", parts: ["a", "b"]},
{"_id": "102", parts: ["a", "c"]}
]
Is this even possible? Any ideas?
The solution below may not be the best, but it works. The idea is to find all documents that have no item in parts outside the input array. This can be done with a combination of $not, $elemMatch and $nin:
db.collection.find({
parts: {
$not: {
"$elemMatch": {
$nin: ["a", "b", "c"]
}
}
}
})
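The double negation can be hard to read, so here is a small plain JavaScript sketch of the same logic applied in memory to the three sample documents. onlyAllowedParts is a hypothetical helper, not a MongoDB API. As far as I can tell, the query shares one property with this sketch: a document whose parts array is empty also passes, since it has no element outside the list.

```javascript
// In-memory emulation of { parts: { $not: { $elemMatch: { $nin: allowed } } } }.
// onlyAllowedParts is a hypothetical helper used only for illustration.
const partsDocs = [
  { _id: "101", parts: ["a", "b"] },
  { _id: "102", parts: ["a", "c"] },
  { _id: "103", parts: ["a", "z"] },
];

function onlyAllowedParts(documents, allowed) {
  const allowedSet = new Set(allowed);
  return documents.filter(
    // Keep the document when NOT (some element is NOT in the allowed list).
    (doc) => !doc.parts.some((part) => !allowedSet.has(part))
  );
}

const matchedParts = onlyAllowedParts(partsDocs, ["a", "b", "c"]);
console.log(matchedParts.map((doc) => doc._id));
```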
Thanks to #prasad_. I have tried to come up with a solution which is similar to what I wanted. I have used $setDifference here.
db.collection.aggregate([
{
$project: {
diff: {
$setDifference: [
"$parts",
[
"a",
"b",
"c"
]
]
},
document: "$$ROOT"
}
},
{
$match: {
"diff": {
$eq: []
}
}
},
{
$project: {
"diff": 0
}
},
])
output:
[
{
"_id": "101",
"document": {
"_id": "101",
"parts": [
"a",
"b"
]
}
},
{
"_id": "102",
"document": {
"_id": "102",
"parts": [
"a",
"c"
]
}
}
]
If I have the following array in a MongoDB doc:
"example": [
  {
    "number": 5,
    "someValue": "V"
  },
  {
    "number": 7,
    "someValue": "H"
  }
]
How would I add the element below to the top of the array above:
{
  "number": 3,
  "someValue": "S"
}
So that the original array becomes:
"example": [
  {
    "number": 3,
    "someValue": "S"
  },
  {
    "number": 5,
    "someValue": "V"
  },
  {
    "number": 7,
    "someValue": "H"
  }
]
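You can achieve this with the modifiers of the $push operator, like this:
db.collection.update({},
{
  $push: {
    // "example" is the array field from the question
    "example": {
      $each: [
        {
          "number": 3,
          "someValue": "S"
        }
      ],
      $position: 0
    }
  }
})
The $position modifier specifies the index at which the elements are inserted (0 is the start of the array). It has to be used together with $each.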
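To illustrate what the update does to the array itself, here is a plain JavaScript sketch that emulates $push with $each and $position in memory. pushAtPosition is a hypothetical helper, not a driver call, and it returns a new array instead of modifying a stored document.

```javascript
// In-memory emulation of { $push: { field: { $each: [...], $position: n } } }.
// pushAtPosition is a hypothetical helper used only for illustration.
function pushAtPosition(array, each, position) {
  const copy = [...array];
  // Insert all new elements at the given index without removing anything.
  copy.splice(position, 0, ...each);
  return copy;
}

const exampleArray = [
  { number: 5, someValue: "V" },
  { number: 7, someValue: "H" },
];

const updatedArray = pushAtPosition(
  exampleArray,
  [{ number: 3, someValue: "S" }],
  0
);
console.log(updatedArray.map((item) => item.number));
```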
The "users" collection has documents with an array field.
Example documents:
{
"_id" :1001,
"properties" : ["A", "B", "C", "D", "E", "F", "G", "H", "I"]
}
{
"_id" : 1002,
"properties" : ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
}
How can I build a query to get the documents that satisfy the following condition?
Get only the documents that have the properties:
[ "3" AND ("A" OR "1") AND ("B" OR "2") ]
or, put another way:
"3" AND "A" AND "B"
OR
"3" AND "A" AND "2"
OR
"3" AND "1" AND "B"
OR
"3" AND "1" AND "2"
In the previous example, the query must return only this document:
{
"_id" : 1002,
"properties" : ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"]
}
The collection has 4 million documents. The "properties" array field has an average length of 15 elements. The query I am looking for must perform well on this fairly big collection.
Stephan's answer is ok. Other ways to achieve the result using $in and $all operators:
db.users.find(
{
$and:[
{"properties":"3"},
{"properties" : {$in: ["A", "1"]}},
{"properties" : {$in: ["B", "2"]}}
]
}
);
(translation of your first description of the subset)
And
db.users.find(
{
$or: [
{"properties" : {$all: ["3", "A", "B"]}},
{"properties" : {$all: ["3", "A", "2"]}},
{"properties" : {$all: ["3", "1", "B"]}},
{"properties" : {$all: ["3", "1", "2"]}}
]
}
);
(translation of your second description of the subset)
I'm afraid I can't tell which one will give the best performance. I hope that you have an index on properties.
You can try the queries on a smaller collection with explain() to see the execution plan.
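To check that the two formulations really select the same documents, here is a plain JavaScript sketch that emulates both filters in memory on the two sample documents. The helpers hasAny and hasAll are hypothetical stand-ins for $in and $all on an array field. The sketch says nothing about performance on the server.

```javascript
// In-memory emulation of the $in-based and $all-based filters.
// hasAny / hasAll are hypothetical stand-ins for $in / $all on an array field.
const userDocs = [
  { _id: 1001, properties: ["A", "B", "C", "D", "E", "F", "G", "H", "I"] },
  { _id: 1002, properties: ["1", "2", "3", "4", "5", "6", "7", "8", "9", "10"] },
];

const hasAny = (array, values) => values.some((v) => array.includes(v));
const hasAll = (array, values) => values.every((v) => array.includes(v));

// "3" AND ("A" OR "1") AND ("B" OR "2")
function matchWithIn(doc) {
  const p = doc.properties;
  return p.includes("3") && hasAny(p, ["A", "1"]) && hasAny(p, ["B", "2"]);
}

// The same condition expanded into four AND-combinations joined by OR.
function matchWithAll(doc) {
  const combos = [
    ["3", "A", "B"],
    ["3", "A", "2"],
    ["3", "1", "B"],
    ["3", "1", "2"],
  ];
  return combos.some((combo) => hasAll(doc.properties, combo));
}

const idsViaIn = userDocs.filter(matchWithIn).map((doc) => doc._id);
const idsViaAll = userDocs.filter(matchWithAll).map((doc) => doc._id);
console.log(idsViaIn, idsViaAll);
```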
Try this:
db.users.find(
{
$or: [
{$and: [{ "properties": "3" }, { "properties": "A" }, { "properties": "B" }]},
{$and: [{ "properties": "3" }, { "properties": "A" }, { "properties": "2" }]},
{$and: [{ "properties": "3" }, { "properties": "1" }, { "properties": "B" }]},
{$and: [{ "properties": "3" }, { "properties": "1" }, { "properties": "2" }]}
]
}
);
or
db.users.find(
{
$and: [
{"properties": "3" },
{$or: [ { "properties": "A" }, { "properties": "1" } ]},
{$or: [ { "properties": "B" }, { "properties": "2" } ]}
]
}
);
Is it possible to get every element that is saved in a nested array with a find()? I need a list of all elements saved in the cat field of the documents.
{
"_id" : "1",
"title" : "title 1",
"cat" : [
{
"element" : "element 1"
},
{
"element" : "element 2"
}
]
},
{
"_id" : "2",
"title" : "title 2",
"cat" : [
{
"element" : "element 3"
},
{
"element" : "element 4"
}
]
}
Result of this example should be - as I also need the id of the document:
1, element 1
1, element 2
2, element 3
2, element 4
You can also try the following query with distinct:
db.collection.distinct("cat.element")
EDIT:
Then you can try $map, as in the question this was marked a duplicate of. You can simply use it like this:
db.collection.aggregate([
  {
    "$project": {
      "cat": {
        "$map": {
          "input": "$cat",
          "as": "el",
          "in": "$$el.element"
        }
      },
      "title": 1
    }
  }
])
OR
db.collection.aggregate([
  {
    "$project": {
      "cat": {
        "$map": {
          "input": "$cat",
          "as": "el",
          "in": "$$el.element"
        }
      },
      "title": 1
    }
  },
  {
    "$group": {
      "_id": "$_id",
      "title": {
        "$first": "$title"
      },
      "cat": {
        "$first": "$cat"
      }
    }
  }
])
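The $map pipelines above still return one document per _id with the elements in an array. For the flat "id, element" list from the question, the rows have to be produced one per array element, which in an aggregation would typically mean an $unwind on "$cat". Here is a plain JavaScript sketch of that flattening, run in memory on the sample documents. listElements is a hypothetical helper, not a MongoDB API.

```javascript
// In-memory emulation of flattening a nested array into (_id, element) rows.
// listElements is a hypothetical helper used only for illustration.
const catDocs = [
  { _id: "1", title: "title 1", cat: [{ element: "element 1" }, { element: "element 2" }] },
  { _id: "2", title: "title 2", cat: [{ element: "element 3" }, { element: "element 4" }] },
];

function listElements(documents) {
  // One output row per entry of the cat array, carrying the parent _id.
  return documents.flatMap((doc) =>
    doc.cat.map((entry) => ({ _id: doc._id, element: entry.element }))
  );
}

const elementRows = listElements(catDocs);
elementRows.forEach((row) => console.log(`${row._id}, ${row.element}`));
```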