I have a MongoDB collection with a few million documents. One attribute (categories) is an array containing all the categories a document belongs to. I am using the following query to convert the array into a comma-separated string so I can load it into SQL Server through a Spoon transformation.
For example, the document has ["a","b","c",...] and I need a,b,c,... so I can put it in a column:
categories: {
$cond: [
{ $eq: [{ $type: "$categories" }, "array"] },
{
$trim: {
input: {
$reduce: {
input: "$categories",
initialValue: "",
in: { $concat: ["$$value", ",", "$$this"] }
}
}
}
},
"$categories"
]
}
When I run the query I get the following error, and I cannot figure out what the problem is.
com.mongodb.MongoQueryException: Query failed with error code 16702 and error message '$concat only supports strings, not array' on server
A few documents had this attribute as a string rather than an array, so I added a type check, but the issue is still there. Any help on how to narrow down the issue would be much appreciated.
A few other attributes in the same collection have the same shape, and this query works fine for them.
I don't see a problem in your aggregation that would produce this error. Can you try updating your MongoDB version?
However, your $reduce does not produce the output you want: it leaves a leading comma, and $trim without a chars argument only strips whitespace. I changed it to add the separator conditionally:
db.collection.aggregate([
{
"$project": {
categories: {
$cond: [
{
$eq: [{ $type: "$categories" }, "array"]
},
{
'$reduce': {
'input': '$categories',
'initialValue': '',
'in': {
'$concat': [
'$$value',
{ '$cond': [{ '$eq': ['$$value', ''] }, '', ', '] },
'$$this'
]
}
}
},
"$categories"
]
}
}
}
])
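To see why the separator is added conditionally, the two $reduce expressions can be emulated outside the database. This is a minimal sketch in plain JavaScript, not MongoDB code; the function names joinNaive, joinWithSeparator and toColumnValue are made up for illustration. Note that JavaScript silently stringifies a nested array, whereas $concat raises error 16702 for it.

```javascript
// Plain-JavaScript emulation of the two $reduce expressions (no MongoDB needed).

// Mirrors { $concat: ["$$value", ",", "$$this"] } starting from "".
function joinNaive(categories) {
  return categories.reduce((value, item) => value + "," + item, "");
}

// Mirrors the rewritten version: the separator is skipped while $$value is still "".
function joinWithSeparator(categories) {
  return categories.reduce(
    (value, item) => value + (value === "" ? "" : ", ") + item,
    ""
  );
}

// Mirrors the outer $cond: only arrays are joined, anything else passes through.
function toColumnValue(categories) {
  return Array.isArray(categories) ? joinWithSeparator(categories) : categories;
}

console.log(joinNaive(["a", "b", "c"]));     // ,a,b,c
console.log(toColumnValue(["a", "b", "c"])); // a, b, c
console.log(toColumnValue("a"));             // a
```

The rewritten pipeline uses ", " (comma plus space) as the separator; use "," instead if the SQL column must not contain spaces.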
Edit:
So, if the categories field contains nested arrays, $concat receives an array as $$this, which explains the error. We can flatten the arrays with $unwind: add these three stages before the $project stage and the aggregation will work.
{
"$unwind": "$categories"
},
{
"$unwind": "$categories"
},
{
"$group": {
_id: "$_id", // regroup per document; _id: null would merge every document into one
categories: {
$push: "$categories"
}
}
},
Playground
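To narrow down which values trigger the error, it can help to look for array elements that are not strings. The following is a rough sketch in plain JavaScript (no MongoDB connection); nonStringElements and flattenOneLevel are hypothetical helper names. For a single document, flattenOneLevel mirrors the flattening that the two $unwind stages perform.

```javascript
// Hypothetical helpers, plain JavaScript (no MongoDB connection required).

// Returns the elements of `categories` that $concat would reject (anything not a string).
function nonStringElements(categories) {
  if (!Array.isArray(categories)) return [];
  return categories.filter((item) => typeof item !== "string");
}

// Flattens one level of nesting, leaving plain elements untouched.
function flattenOneLevel(categories) {
  return categories.reduce(
    (acc, item) => acc.concat(Array.isArray(item) ? item : [item]),
    []
  );
}

const sample = ["a", ["b", "c"], "d"];
console.log(nonStringElements(sample)); // [ [ 'b', 'c' ] ]
console.log(flattenOneLevel(sample));   // [ 'a', 'b', 'c', 'd' ]
```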
Array field in three sample documents of the collection:
"fruits": [
{"fruit1": "banana"},
{"fruit2": "apple"},
{"fruit3": "pear"},
{"fruit4": "orange"}
]
"fruits": [
{"fruit2": "apple"},
{"fruit4": "orange"},
{"fruit1": "banana"},
{"fruit3": "pear"}
]
"fruits": [
{"fruit3": "pear"},
{"fruit2": "apple"},
{"fruit4": "orange"},
{"fruit1": "banana"}
]
I need to find the documents in the collection where "banana" appears before "apple". Does MongoDB allow comparing element positions in an array, like this:
if (fruits.indexOf('banana') < fruits.indexOf('apple')) return true;
Or is there another way to get the result I need?
MongoDB's array query operations do not support any positional search as you want.
You can, however, write a $where query to do what you want:
db.yourCollection.find({
$where: function() {
return (this.fruits.indexOf('banana') < this.fruits.indexOf('apple'))
}
})
Be advised, though, that this query cannot use indexes, so performance will be a problem.
Another approach is to rethink the database design. If you can specify what you're trying to build, someone can give you specific advice.
One more approach: pre-calculate the boolean value as a field before persisting the document, then query on true / false.
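With the sample data from the question, the array elements are objects such as {"fruit1": "banana"}, so calling indexOf('banana') on the array itself returns -1. A pre-calculated flag could be computed along these lines before saving; this is only a sketch in plain JavaScript, and indexOfFruit and comesBefore are hypothetical helper names.

```javascript
// Position of the first element whose value equals `name`, or -1 if absent.
function indexOfFruit(fruits, name) {
  return fruits.findIndex((entry) => Object.values(entry).includes(name));
}

// True only when both fruits are present and `first` comes before `second`.
function comesBefore(fruits, first, second) {
  const i = indexOfFruit(fruits, first);
  const j = indexOfFruit(fruits, second);
  return i !== -1 && j !== -1 && i < j;
}

// The three sample arrays from the question.
const docs = [
  { fruits: [{ fruit1: "banana" }, { fruit2: "apple" }, { fruit3: "pear" }, { fruit4: "orange" }] },
  { fruits: [{ fruit2: "apple" }, { fruit4: "orange" }, { fruit1: "banana" }, { fruit3: "pear" }] },
  { fruits: [{ fruit3: "pear" }, { fruit2: "apple" }, { fruit4: "orange" }, { fruit1: "banana" }] },
];

// The flag that would be stored alongside each document.
const flags = docs.map((doc) => comesBefore(doc.fruits, "banana", "apple"));
console.log(flags); // [ true, false, false ]
```

Requiring both indexes to be different from -1 avoids reporting a match when "banana" is missing, which a bare `<` comparison would do.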
Consider refactoring your schema if possible. The dynamic field names (i.e. fruit1, fruit2, ...) make it unnecessarily complicated to construct a query. Also, if you frequently query by array index, you should probably store the array entries as individual documents with a sort key so that sorting can use an index.
Nevertheless, it is achievable by using $unwind and then $group to reassemble the documents. With the includeArrayIndex option, $unwind exposes each element's index within the array.
db.collection.aggregate([
{
"$unwind": {
path: "$fruits",
includeArrayIndex: "idx"
}
},
{
"$addFields": {
fruits: {
"$objectToArray": "$fruits"
}
}
},
{
"$addFields": {
"bananaIdx": {
"$cond": {
"if": {
$eq: [
"banana",
{
$first: "$fruits.v"
}
]
},
"then": "$idx",
"else": "$$REMOVE"
}
},
"appleIdx": {
"$cond": {
"if": {
$eq: [
"apple",
{
$first: "$fruits.v"
}
]
},
"then": "$idx",
"else": "$$REMOVE"
}
}
}
},
{
$group: {
_id: "$_id",
fruits: {
$push: {
"$arrayToObject": "$fruits"
}
},
bananaIdx: {
$max: "$bananaIdx"
},
appleIdx: {
$max: "$appleIdx"
}
}
},
{
$match: {
$expr: {
$lt: [
"$bananaIdx",
"$appleIdx"
]
}
}
},
{
$unset: [
"bananaIdx",
"appleIdx"
]
}
])
Mongo Playground
Let's say I have an array ['123', '456', '789'].
I want to aggregate over every document with the field books and return only the values that are NOT in any document. For example, if '123' is in a document and '456' is too, but '789' is not, it would return ['789'], as it's not included in the books field of any document.
.aggregate( [
{
$match: {
books: {
$in: ['123', '456', '789']
}
}
},
I don't want the documents returned, but just the actual values that are not in any documents.
Here's one way to scan the entire collection to look for missing book values.
db.collection.aggregate([
{ // "explode" books array to docs with individual book values
"$unwind": "$books"
},
{ // scan entire collection creating set of book values
"$group": {
"_id": null,
"allBooksSet": {
"$addToSet": "$books" // <-- generate set of book values
}
}
},
{
"$project": {
"_id": 0, // don't need this anymore
"missing": { // use $setDifference to find missing values
"$setDifference": [
[ "123", "456", "789" ], // <-- your values go here
"$allBooksSet" // <-- the entire collection's set of book values
]
}
}
}
])
Example output:
[
{
"missing": [ "789" ]
}
]
Try it on mongoplayground.net.
Based on rickhg12hs's answer, here is a variation that replaces $unwind with $reduce, which is considered less costly. Two of the three steps are the same:
db.collection.aggregate([
{
$group: {
_id: null,
allBooks: {$push: "$books"}
}
},
{
$project: {
_id: 0,
allBooksSet: {
$reduce: {
input: "$allBooks",
initialValue: [],
in: {$setUnion: ["$$value", "$$this"]}
}
}
}
},
{
$project: {
missing: {
$setDifference: [["123","456", "789"], "$allBooksSet"]
}
}
}
])
Try it on mongoplayground.net.
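Both pipelines compute a set difference between the wanted values and the union of all books arrays. The same logic, sketched in plain JavaScript for illustration only (missingBooks and the sample documents are made up, and a document without a books field is simply skipped here):

```javascript
// Returns the values from `wanted` that appear in no document's `books` array.
function missingBooks(docs, wanted) {
  const seen = new Set();
  for (const doc of docs) {
    // Documents without a books field contribute nothing.
    for (const book of doc.books || []) {
      seen.add(book);
    }
  }
  return wanted.filter((book) => !seen.has(book));
}

const sampleDocs = [
  { books: ["123", "999"] },
  { books: ["456"] },
  { title: "no books field" },
];

console.log(missingBooks(sampleDocs, ["123", "456", "789"])); // [ '789' ]
```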
I need to check if an ObjectId exists in a non-nested array and in multiple nested arrays. I've managed to get very close using the aggregation framework, but got stuck at the very last step.
My documents have this structure:
{
"_id" : ObjectId("605ce5f063b1c2eb384c2b7f"),
"name" : "Test",
"attrs" : [
ObjectId("6058e94c3994d04d28639616"),
ObjectId("6058e94c3994d04d28639627"),
ObjectId("6058e94c3994d04d28639622"),
ObjectId("6058e94c3994d04d2863962e")
],
"variations" : [
{
"varName" : "Var1",
"attrs" : [
ObjectId("6058e94c3994d04d28639616"),
ObjectId("6058e94c3994d04d28639627"),
ObjectId("6058e94c3994d04d28639622"),
ObjectId("60591791d4d41d0a6817d23f")
],
},
{
"varName" : "Var2",
"attrs" : [
ObjectId("60591791d4d41d0a6817d22a"),
ObjectId("60591791d4d41d0a6817d255"),
ObjectId("6058e94c3994d04d28639622"),
ObjectId("60591791d4d41d0a6817d23f")
],
},
],
"storeId" : "9acdq9zgke49pw85"
}
Let's say I need to check whether the _id "6058e94c3994d04d28639616" exists in all arrays named attrs.
My aggregation query goes like this:
db.product.aggregate([
{
$match: {
storeId,
},
},
{
$project: {
_id: 0,
attrs: 1,
'variations.attrs': 1,
},
},
{
$project: {
attrs: 1,
vars: '$variations.attrs',
},
},
{
$unwind: '$vars',
},
{
$project: {
attr: {
$concatArrays: ['$vars', '$attrs'],
},
},
},
]);
which results in this:
[
{
attr: [
6058e94c3994d04d28639616,
6058e94c3994d04d28639627,
6058e94c3994d04d28639622,
6058e94c3994d04d2863962e,
6058e94c3994d04d28639616,
6058e94c3994d04d28639627,
6058e94c3994d04d28639622,
60591791d4d41d0a6817d23f,
60591791d4d41d0a6817d22a,
60591791d4d41d0a6817d255,
6058e94c3994d04d28639622,
60591791d4d41d0a6817d23f
]
},
{
attr: [
60591791d4d41d0a6817d22a,
60591791d4d41d0a6817d255,
6058e94c3994d04d28639622,
60591791d4d41d0a6817d23f,
6058e94c3994d04d28639624,
6058e94c3994d04d28639627,
6058e94c3994d04d28639628,
6058e94c3994d04d2863963e
]
}
]
Assuming I have two products in my DB, I get this result. Each element in the outermost array is a different product.
The last bit, checking for the key "6058e94c3994d04d28639616", is where I'm stuck. I could not find a way to do it with $group, since I don't have keys to group on.
Nor with $match, adding this to the end of the aggregation:
{
$match: {
attr: "6058e94c3994d04d28639616",
},
},
But that results in an empty array. I know that $match does not query arrays like this, but I could not find a way to do it with $in either.
Is this schema too complicated? I cannot embed the original data, since it is mutable and I would not want to update every product whenever something changes.
Will this be very expensive if I have around 10,000 products?
Thanks in advance
You are comparing the string 6058e94c3994d04d28639616 with ObjectId values. Convert the string to an ObjectId with the $toObjectId operator in the $match stage, like this:
{
$match: {
$expr: {
$in: [{ $toObjectId: "6058e94c3994d04d28639616" }, "$attr"]
}
}
}
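The underlying issue is a type mismatch: a string never equals an ObjectId, even when the hex digits are identical. The sketch below illustrates this in plain JavaScript; FakeObjectId is a hypothetical stand-in written for this example, not the driver's ObjectId class, and containsId is likewise made up.

```javascript
// Hypothetical stand-in for an ObjectId: a distinct type wrapping a hex string.
class FakeObjectId {
  constructor(hex) {
    this.hex = hex;
  }
  // Equal only to another FakeObjectId with the same hex digits.
  equals(other) {
    return other instanceof FakeObjectId && other.hex === this.hex;
  }
}

// True when `wanted` is found in `attr` with matching type and value.
function containsId(attr, wanted) {
  return attr.some((id) => id.equals(wanted));
}

const attr = [
  new FakeObjectId("6058e94c3994d04d28639616"),
  new FakeObjectId("6058e94c3994d04d28639627"),
];

// Same digits, wrong type: no match.
console.log(containsId(attr, "6058e94c3994d04d28639616")); // false
// Converted first (what $toObjectId does in the pipeline): match.
console.log(containsId(attr, new FakeObjectId("6058e94c3994d04d28639616"))); // true
```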
I'm currently facing quite a strange issue. I'm trying to pull some data from my database based on a $text search, taking into account whatever permissions my user has. My data looks like the following:
{
"_id" : ObjectId("5fd0e0c3233c72895e6655c9"),
"Entity" :
{
"Groups" : null,
"Name" : "Terasse"
}
}
I'm running an aggregation query that takes both the user's search input and their permission values. Fully formatted, the final query looks like this:
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$match: {
$or: [
{
"Entity.Groups": {
"$exists": false
}
},
{
"Entity.Groups": {
"$eq": null
}
},
{
"Entity.Groups": {
"$eq": []
}
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
}
])
For a quick explanation, it first does the $text match to find the word "Terasse" in my database,
then runs a second match stage to verify that my user can access this data.
My second match stage has an $or, which first checks whether the data is correctly formatted before doing a special check to see if my user can access it.
As you can see, this $or statement checks whether the Groups field of my data is missing, null, or empty.
In those cases, I would like to return the data regardless of the user's authorization, and thus not execute the final $expr part at all.
This aggregation works perfectly fine if my data has "Groups": [ "/" ], for example, but fails with this error otherwise:
uncaught exception: Error: command failed: {
"ok" : 0,
"errmsg" : "$anyElementTrue's argument must be an array, but is null",
"code" : 17041,
"codeName" : "Location17041"
} : aggregate failed :
From my understanding, this error can only happen if the query reaches the $expr part while my Groups field is missing, null, or empty, while it should be impossible because the $or statement should return the data as soon as it detects one of the mentioned cases.
Finally, the most troubling part is that this second match stage works perfectly, with no errors at all, if the first stage is NOT a $match stage with a $text search.
I am completely clueless now. Is there a Mongo expert who could give me a hand understanding what's happening?
Thank you.
EDIT : as requested in comments:
this document will not work with the mentioned query
{
"_id": {
"$oid": "5fd0e0c3233c72895e6655c9"
},
"Entity": {
"Groups": null,
"Name": "Terasse"
}
}
this document will work with the mentioned query
{
"_id": {
"$oid": "5fd0e0c3233c72895e6655c9"
},
"Entity": {
"Groups": [ "/" ],
"Name": "Terasse"
}
}
Also note that you cannot use mongoplayground to test this, as it requires creating a $text index beforehand (AFAIK, there is no way to do this in mongoplayground).
EDIT 2:
I am starting to believe that the Mongo query system is quite broken when a $text stage is included. I've reworked the query like this to make sure the problem was not the $or somehow not working, and yet it still produces the same error:
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$match: {
$or: [
{
"Entity.Groups": {
"$exists": false
}
},
{
"Entity.Groups": {
"$eq": null
}
},
{
"Entity.Groups": {
"$eq": []
}
},
{
$and: [
{
"Entity.Groups": {
"$type": "array"
}
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/test"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
]
}
}
])
As you can see in this new query, I'm adding an $and check to make sure that "Entity.Groups" is indeed an array before moving on to the $anyElementTrue section, and yet the same error occurs.
FINAL EDIT
Thanks to Ray's answer: I've changed my query to the following:
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$addFields: {
"groupsMissing": {
$eq: [
[],
{
$ifNull: [
"$Entity.Groups",
[]
]
}
]
}
}
},
{
$match: {
$or: [
{
"groupsMissing": true
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/test"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
}
])
I've used another stage with $addFields, as Ray mentioned, but removed some of the previous, now obsolete parts. It is now working smoothly; I will report back if any side effects occur.
As a final note, I'm still unsure why my previous query didn't work and why this solution does, but it seems that adding another stage that performs the sanity checks, and then having the following stage only check the resulting boolean, works.
It is probably related to the way Mongo executes the query.
I believe different stages have to be run sequentially by Mongo, which is what I initially expected (though $and should also do that, according to the documentation).
Having everything in a single stage probably makes Mongo run the query quite differently from how it is written, in an effort to optimize it?
That's all I can guess.
You may want to use $addFields to project some helper fields to make your life easier.
Here is the code; I tried to modify your version as little as possible.
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$addFields: {
// flag to indicate Entity.Groups is null/missing/empty array
"groupsMissing": {
$eq: [
[],
{
$ifNull: [
"$Entity.Groups",
[]
]
}
]
},
// make Entity.Groups an empty array to avoid $anyElementTrue error
"Entity.Groups": {
$ifNull: [
"$Entity.Groups",
[]
]
}
}
},
{
$match: {
$or: [
// this part of the code can be shortened
{
"groupsMissing": true
},
// the code should be the same as your version for the rest
{
$and: [
{
"Entity.Groups": {
"$type": "array"
}
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/test"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
]
}
}
])
First, MongoDB provides arrays for storing lists of things. Splitting strings on separators in queries is 1) less performant and 2) more difficult than it needs to be.
With that said, I do not see anything in https://docs.mongodb.com/manual/reference/operator/query/or/ saying the clauses will be evaluated in the order given. Therefore,
"while it should be impossible because the $or statement should return the data as soon as it detects one of the mentioned cases."
... appears to be an incorrect assumption as to how MongoDB works.
Note that https://docs.mongodb.com/manual/reference/operator/query/and/ does reference short-circuit evaluation.
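Since evaluation order inside $or is not something to rely on, one option is to make the risky expression safe on its own, for example by defaulting a null or missing value to an empty array, which is the job $ifNull does in Ray's answer. The intended access rule can be sketched in plain JavaScript as follows; canAccess is a hypothetical name and this is an illustration of the logic, not MongoDB code.

```javascript
// Mirrors the intended rule:
// - Groups missing, null, or empty: the data is visible to everyone.
// - Otherwise: visible when some group starts with one of the user's groups
//   (the $indexOfBytes result being 0 means "is a prefix").
function canAccess(entityGroups, userGroups) {
  // Defaulting to [] plays the role of $ifNull: ["$Entity.Groups", []].
  const groups = Array.isArray(entityGroups) ? entityGroups : [];
  if (groups.length === 0) return true;
  return groups.some((group) =>
    userGroups.some((userGroup) => group.indexOf(userGroup) === 0)
  );
}

console.log(canAccess(null, ["/test"]));        // true
console.log(canAccess(["/"], ["/"]));           // true
console.log(canAccess(["/admin"], ["/test"]));  // false
console.log(canAccess(["/test/a"], ["/test"])); // true
```

Because the default is applied before the array is iterated, the result no longer depends on which branch is evaluated first.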
I am new to MongoDB and trying to execute a query. I have a company collection and an array of company IDs (cdata). I would like to get the documents whose _id is in that array, where attributes.0.ccode exists and is not empty.
var query = Company.find({ _id: { $in: cdata } },{ "attributes.0.ccode": { $exists: true }, $and: [ { "attributes.0.ccode": { $ne: "" } } ] }).select({"attributes": 1}).sort({});
The error I am getting is
"$err": "Can't canonicalize query: BadValue Unsupported projection option: attributes.0.ccode: { $exists: true }",
"code": 17287
I think it's a bracketing issue, but I can't figure out where.
Any help is highly appreciated.
In your code, { _id: { $in: cdata } } is interpreted as the query, and everything else, starting from ,{ "attributes.0.ccode": { $e.., as a projection (which fields to return). Refactor your code so that _id: {$in ...} and the rest of the conditions belong to the same top-level object. Something like this:
var query = Company.find({
_id: {
$in: cdata
},
"attributes.0.ccode": {
$exists: true
},
$and: [
{
"attributes.0.ccode": {
$ne: ""
}
}
]
}).select({"attributes": 1}).sort({});
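As a possible simplification, the two conditions on attributes.0.ccode can share one operator object, which removes the need for $and. The sketch below only builds the filter object in plain JavaScript; buildCompanyFilter is a hypothetical helper and the ids are placeholders.

```javascript
// Builds a single filter object: every condition lives in the first argument
// to find(), so nothing is mistaken for a projection.
function buildCompanyFilter(ids) {
  return {
    _id: { $in: ids },
    // $exists and $ne apply to the same field, so they can share one object.
    "attributes.0.ccode": { $exists: true, $ne: "" },
  };
}

const filter = buildCompanyFilter(["id1", "id2"]);
console.log(JSON.stringify(filter));

// Assumed usage with the model from the question:
//   Company.find(filter).select({ attributes: 1 });
```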