MongoDB Aggregation strange behaviour with $or and $text - database

I'm facing a strange issue. I'm trying to pull data from my database based on a $text search while taking into account whatever permissions my user has. My data looks like the following:
{
"_id" : ObjectId("5fd0e0c3233c72895e6655c9"),
"Entity" :
{
"Groups" : null,
"Name" : "Terasse"
}
}
I'm running an aggregation that takes both the user's search input and their permission values. Fully formatted, the final query looks like this:
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$match: {
$or: [
{
"Entity.Groups": {
"$exists": false
}
},
{
"Entity.Groups": {
"$eq": null
}
},
{
"Entity.Groups": {
"$eq": []
}
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
}
])
For a quick explanation, it first runs the $text match to find the word "Terasse" in my database,
then runs a second match stage to verify that my user can access this data.
My second match stage has an $or, which first checks whether the data is correctly formatted before doing a special check to see whether my user can access it.
As you can see, this $or statement checks whether the Groups field of my data is non-existent, null, or empty.
In any of these cases, I would like to return the data no matter what authorization my user has, and thus not execute the very last $expr part at all.
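The intent of this second stage can be sketched in plain JavaScript, outside MongoDB. This is only an illustration of the rule described above, not the pipeline itself: `canAccess` and `userGroups` are hypothetical names, and `startsWith` stands in for the `$indexOfBytes` equals 0 prefix test.

```javascript
// Returns true when the document should be visible to the user.
// groups:     value of Entity.Groups (may be undefined, null, or an array of strings)
// userGroups: group prefixes granted to the user, e.g. ["/"]
function canAccess(groups, userGroups) {
  // Missing, null, or empty Groups: the document is public.
  if (groups === undefined || groups === null) return true;
  if (Array.isArray(groups) && groups.length === 0) return true;
  // Any other non-array value is treated as "no access" (an assumption of this sketch).
  if (!Array.isArray(groups)) return false;
  // At least one group must start with one of the user's prefixes.
  return groups.some(group =>
    userGroups.some(userGroup => typeof group === "string" && group.startsWith(userGroup))
  );
}

console.log(canAccess(null, ["/"]));           // true
console.log(canAccess(["/"], ["/"]));          // true
console.log(canAccess(["/admin"], ["/test"])); // false
```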
This aggregation works perfectly fine if my data has "Groups": [ "/" ], for example, but fails with this error otherwise:
uncaught exception: Error: command failed: {
"ok" : 0,
"errmsg" : "$anyElementTrue's argument must be an array, but is null",
"code" : 17041,
"codeName" : "Location17041"
} : aggregate failed :
From my understanding, this error can only happen if the query reaches the $expr part while my Groups field is missing, null, or empty. That should be impossible, because the $or statement should return the data as soon as it detects one of the mentioned cases.
Finally, the most troubling part is that this second match stage works perfectly, with no errors at all, if the first stage is not a $match stage with a $text search.
I am completely clueless now. Is there a Mongo expert who could give me a hand understanding what's happening?
Thank you.
EDIT : as requested in comments:
this document will not work with the mentioned query
{
"_id": {
"$oid": "5fd0e0c3233c72895e6655c9"
},
"Entity": {
"Groups": null,
"Name": "Terasse"
}
}
this document will work with the mentioned query
{
"_id": {
"$oid": "5fd0e0c3233c72895e6655c9"
},
"Entity": {
"Groups": [ "/" ],
"Name": "Terasse"
}
}
Also note that you cannot use mongoplayground to test this, as it requires creating a $text index beforehand (afaik, there is no way to do this in mongoplayground).
EDIT 2:
I am starting to believe that the Mongo query system is quite broken when a $text stage is included. I've reworked the query like this to make sure that it was not due to the $or somehow not working, and yet it still produces the same error:
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$match: {
$or: [
{
"Entity.Groups": {
"$exists": false
}
},
{
"Entity.Groups": {
"$eq": null
}
},
{
"Entity.Groups": {
"$eq": []
}
},
{
$and: [
{
"Entity.Groups": {
"$type": "array"
}
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/test"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
]
}
}
])
As you can see in this new query, I'm adding an $and check to make sure that "Entity.Groups" is indeed an array before moving on to the $anyElementTrue section, and yet the same error applies.
FINAL EDIT
Thanks to Ray's answer: I've changed my query to the following:
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$addFields: {
"groupsMissing": {
$eq: [
[],
{
$ifNull: [
"$Entity.Groups",
[]
]
}
]
}
}
},
{
$match: {
$or: [
{
"groupsMissing": true
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/test"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
}
])
I've used another stage with $addFields as Ray mentioned, but removed some of the previous, now obsolete, parts. It is now working smoothly; I will report back if any side effects occur.
As a final note, I'm still unsure why my previous query didn't work and why this solution does, but adding another stage that does the sanitizing checks, and then having the following stage check only the sanitized boolean, is working!
It is probably related to the way Mongo executes the query.
I believe different stages have to be run sequentially by Mongo, which is what I initially expected (though $and should also do that, according to the documentation).
Having everything in a single stage probably makes Mongo run the query quite differently than written, in an effort to optimize it?
That's all I can guess.

You may want to use $addFields to project some helper fields to make your life easier.
Here is the code; I tried to modify your version as little as possible.
db.collection.aggregate([
{
$match: {
$text: {
$search: "Terasse",
$caseSensitive: false,
$diacriticSensitive: false
}
}
},
{
$addFields: {
// flag to indicate Entity.Groups is null/missing/empty array
"groupsMissing": {
$eq: [
[],
{
$ifNull: [
"$Entity.Groups",
[]
]
}
]
},
// make Entity.Groups an empty array to avoid $anyElementTrue error
"Entity.Groups": {
$ifNull: [
"$Entity.Groups",
[]
]
}
}
},
{
$match: {
$or: [
// this part of the code can be shortened
{
"groupsMissing": true
},
// the code should be the same as your version for the rest
{
$and: [
{
"Entity.Groups": {
"$type": "array"
}
},
{
$expr: {
$anyElementTrue: {
$map: {
input: "$Entity.Groups",
as: "group",
in: {
$anyElementTrue: {
$map: {
input: [
"/test"
],
as: "userGroup",
in: {
$eq: [
0,
{
$indexOfBytes: [
"$$group",
"$$userGroup"
]
}
]
}
}
}
}
}
}
}
}
]
}
]
}
}
])
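As a possible alternative to the helper field, the null handling can live inside the expression itself, so that `$anyElementTrue` never receives a non-array no matter which clause is evaluated first. The following is a minimal sketch, not tested against a live server: `buildAccessMatch` and `userGroups` are hypothetical names, and it assumes every element of Entity.Groups is a string (`$indexOfBytes` rejects other non-null types).

```javascript
// Builds a $match stage that normalises Entity.Groups inside the expression itself.
// userGroups is a hypothetical parameter holding the user's group prefixes.
function buildAccessMatch(userGroups) {
  return {
    $match: {
      $expr: {
        $let: {
          // Treat anything that is not an array (missing, null, ...) as an empty array.
          vars: {
            groups: {
              $cond: [{ $isArray: ["$Entity.Groups"] }, "$Entity.Groups", []]
            }
          },
          in: {
            $or: [
              // No groups at all: the document is visible to everyone.
              { $eq: [{ $size: "$$groups" }, 0] },
              // Otherwise at least one group must start with one of the user's prefixes.
              {
                $anyElementTrue: {
                  $map: {
                    input: "$$groups",
                    as: "group",
                    in: {
                      $anyElementTrue: {
                        $map: {
                          input: userGroups,
                          as: "userGroup",
                          in: { $eq: [0, { $indexOfBytes: ["$$group", "$$userGroup"] }] }
                        }
                      }
                    }
                  }
                }
              }
            ]
          }
        }
      }
    }
  };
}

// Print the stage so it can be pasted after the $text $match stage.
console.log(JSON.stringify(buildAccessMatch(["/test"]), null, 2));
```

Because both branches of the `$or` operate on the normalised `$$groups` variable, the result should not depend on evaluation order: an empty array simply makes the second branch false instead of raising an error.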

First, MongoDB provides arrays for storing lists of things. Splitting strings on separators in queries is 1) less performant and 2) more difficult than it needs to be.
With that said, I do not see anything in https://docs.mongodb.com/manual/reference/operator/query/or/ saying the clauses will be evaluated in the order given. Therefore,
"while it should be impossible because the $or statement should return the data as soon as it detects one of the mentioned cases"
... appears to be an incorrect assumption about how MongoDB works.
Note that https://docs.mongodb.com/manual/reference/operator/query/and/ does reference short-circuit evaluation.
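To see why relying on clause order is fragile, here is a small analogy in plain JavaScript. It does not reproduce how MongoDB plans or executes a query; it only shows that a clause which is safe when evaluated second can fail as soon as something evaluates it first. All names are hypothetical.

```javascript
// Two clauses of an "or": the second one is only safe when Groups is an array.
const clauses = [
  doc => doc.Entity.Groups == null,
  doc => doc.Entity.Groups.some(group => group.startsWith("/"))
];

// Evaluates the clauses in the order written and stops at the first match.
function matchesInOrder(doc) {
  return clauses.some(clause => clause(doc));
}

// Evaluates the same clauses in the opposite order, as an optimizer would be free to do
// if no order is guaranteed.
function matchesReordered(doc) {
  return [...clauses].reverse().some(clause => clause(doc));
}

const sample = { Entity: { Groups: null, Name: "Terasse" } };
console.log(matchesInOrder(sample)); // true
try {
  matchesReordered(sample);
} catch (err) {
  console.log("reordered evaluation failed:", err.name); // TypeError
}
```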

Related

Finding documents in mongodb collection by order of elements index of array field

Array field in collection:
"fruits": [ {"fruit1": "banana"}, {"fruit2": "apple"}, {"fruit3": "pear"}, {"fruit4": "orange"} ]
"fruits": [ {"fruit2": "apple"}, {"fruit4": "orange"}, {"fruit1": "banana"}, {"fruit3": "pear"} ]
"fruits": [ {"fruit3": "pear"}, {"fruit2": "apple"}, {"fruit4": "orange"}, {"fruit1": "banana"} ]
I need to find the documents in the collection where "banana" appears before "apple". Does MongoDB allow comparing elements in an array like this:
if (fruits.indexOf('banana') < fruits.indexOf('apple')) return true;
Or is there any other method to get the result I need?
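MongoDB's array query operators do not support the kind of positional search you want.
You can, however, write a $where query to do it (note that indexOf on plain strings, as below, assumes the array holds strings rather than objects such as {"fruit1": "banana"}):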
db.yourCollection.find({
$where: function() {
return (this.fruits.indexOf('banana') < this.fruits.indexOf('apple'))
}
})
Be advised, though, that you won't be able to use indexes here, and performance will be a problem.
Another approach is to rethink the database design. If you can specify what you're trying to build, someone can give you specific advice.
One more approach: pre-calculate the boolean value before persisting to the DB, store it as a field, and query on true / false.
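The pre-calculation idea could look roughly like this in application code. It is a sketch under assumptions: `positionOf`, `withBananaFlag`, and the `bananaBeforeApple` field are hypothetical names, and the array elements are single-key objects as in the question.

```javascript
// Index of the first array element whose value equals `name`, or -1 if absent.
// Elements look like {"fruit1": "banana"}, so the values are compared, not the keys.
function positionOf(fruits, name) {
  return fruits.findIndex(entry => Object.values(entry).includes(name));
}

// Returns a copy of the document with a precomputed flag (hypothetical field name).
function withBananaFlag(doc) {
  const banana = positionOf(doc.fruits, "banana");
  const apple = positionOf(doc.fruits, "apple");
  // Only true when both fruits are present and banana comes first.
  return { ...doc, bananaBeforeApple: banana !== -1 && apple !== -1 && banana < apple };
}

const first = {
  fruits: [{ fruit1: "banana" }, { fruit2: "apple" }, { fruit3: "pear" }, { fruit4: "orange" }]
};
const second = {
  fruits: [{ fruit2: "apple" }, { fruit4: "orange" }, { fruit1: "banana" }, { fruit3: "pear" }]
};
console.log(withBananaFlag(first).bananaBeforeApple);  // true
console.log(withBananaFlag(second).bananaBeforeApple); // false
```

Once the flag is stored with each document, the lookup becomes an ordinary equality match such as `db.yourCollection.find({ bananaBeforeApple: true })`, which can use an index on that field.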
Consider refactoring your schema if possible. The dynamic field names (i.e. fruit1, fruit2...) make it unnecessarily complicated to construct a query. Also, if you require frequent queries by array index, you should probably store your array entries in individual documents with sort keys, so that sorting can use an index.
Nevertheless, it is achievable by using $unwind and then $group to reassemble the documents. With the includeArrayIndex option, you can get each element's index inside the array.
db.collection.aggregate([
{
"$unwind": {
path: "$fruits",
includeArrayIndex: "idx"
}
},
{
"$addFields": {
fruits: {
"$objectToArray": "$fruits"
}
}
},
{
"$addFields": {
"bananaIdx": {
"$cond": {
"if": {
$eq: [
"banana",
{
$first: "$fruits.v"
}
]
},
"then": "$idx",
"else": "$$REMOVE"
}
},
"appleIdx": {
"$cond": {
"if": {
$eq: [
"apple",
{
$first: "$fruits.v"
}
]
},
"then": "$idx",
"else": "$$REMOVE"
}
}
}
},
{
$group: {
_id: "$_id",
fruits: {
$push: {
"$arrayToObject": "$fruits"
}
},
bananaIdx: {
$max: "$bananaIdx"
},
appleIdx: {
$max: "$appleIdx"
}
}
},
{
$match: {
$expr: {
$lt: [
"$bananaIdx",
"$appleIdx"
]
}
}
},
{
$unset: [
"bananaIdx",
"appleIdx"
]
}
])

MongoDB Remove Empty Object from Array after aggregate function

The db.name.aggregate() function gives this output:
[{}, {"abc": "zyx"}, {}, "opk": "tyr"]
Desired output:
[{"abc": "zyx"}, "opk": "tyr"]
Firstly, your output is not a valid array. It should be like this:
[{}, {"abc": "zyx"}, {}, {"opk": "tyr"}]
Now, to obtain your desired output, you can add the following $match stage to your pipeline:
db.collection.aggregate([
{
"$match": {
$expr: {
"$gt": [
{
"$size": {
"$objectToArray": "$$ROOT"
}
},
0
]
}
}
}
])
Here, we are converting the document to an array using $objectToArray, and then we check whether the size of that array is greater than 0. Only those documents are kept in the output.
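The same test can be expressed in plain JavaScript, which may make the idea easier to follow: count the keys of each result and keep only those with at least one. This is an illustrative sketch with a hypothetical `dropEmptyObjects` helper. Note that documents stored in a collection always carry an `_id`, so `$$ROOT` can only be empty if an earlier stage projected `_id` out.

```javascript
// Keeps only the objects that have at least one key,
// mirroring { $gt: [ { $size: { $objectToArray: "$$ROOT" } }, 0 ] }.
function dropEmptyObjects(results) {
  return results.filter(doc => Object.keys(doc).length > 0);
}

const output = [{}, { abc: "zyx" }, {}, { opk: "tyr" }];
console.log(dropEmptyObjects(output)); // [ { abc: 'zyx' }, { opk: 'tyr' } ]
```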
What if your data looks like this?
[
{
"arr": [
{},
{
"abc": "zyx"
},
{},
{
"opk": "tyr"
}
]
}
]
The aggregation to remove the empty objects would look like this:
db.collection.aggregate([
{
"$unwind": {
"path": "$arr",
}
},
{
"$match": {
arr: {
"$ne": {}
}
}
},
{
"$group": {
_id: "$_id",
arr: {
$push: "$arr"
}
}
}
])
Outputs
[
{
"_id": ObjectId("5a934e000102030405000000"),
"arr": [
{
"abc": "zyx"
},
{
"opk": "tyr"
}
]
}
]
Demo on Mongo Playground:
https://mongoplayground.net/p/BjGxzlrlj6s
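When the empty objects sit inside an array field, a single `$filter` stage may be a simpler option than `$unwind` followed by `$group`, since it keeps each document intact. The sketch below is untested against a live server; `removeEmptyStage` and `removeEmpty` are hypothetical names, and `$literal` is used so that the empty object is taken as a value rather than parsed as an expression.

```javascript
// Aggregation stage: rewrite `arr` without its empty objects.
const removeEmptyStage = {
  $addFields: {
    arr: {
      $filter: {
        input: "$arr",
        as: "item",
        cond: { $ne: ["$$item", { $literal: {} }] }
      }
    }
  }
};

// Plain JavaScript equivalent of what the stage is meant to do to one document.
function removeEmpty(doc) {
  return { ...doc, arr: doc.arr.filter(item => Object.keys(item).length > 0) };
}

console.log(JSON.stringify(removeEmpty({ arr: [{}, { abc: "zyx" }, {}, { opk: "tyr" }] })));
// {"arr":[{"abc":"zyx"},{"opk":"tyr"}]}
```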

How to fix MongoDB array concatenation error?

I have a collection in MongoDB with a few million documents. There is an attribute (categories) that is an array containing all the categories a document belongs to. I am using the following query to convert the array into a comma-separated string, so I can add it to SQL Server through a Spoon transformation.
For example,
the document has ["a","b","c",...] and I need a,b,c,... so I can put it in a column.
categories: {
$cond: [
{ $eq: [{ $type: "$categories" }, "array"] },
{
$trim: {
input: {
$reduce: {
input: "$categories",
initialValue: "",
in: { $concat: ["$$value", ",", "$$this"] }
}
}
}
},
"$categories"
]
}
When I run the query I get the following error, and I cannot figure out what the problem is.
com.mongodb.MongoQueryException: Query failed with error code 16702 and error message '$concat only supports strings, not array' on server
A few documents had this attribute as a string and not an array, so I added a type check, but the issue is still there. Any help on how to narrow down the issue would be much appreciated.
A few other attributes in the same collection have the same shape, and this query works fine for them.
I don't see any problem in your aggregation; it shouldn't give this error. Can you try updating your MongoDB version?
However, your aggregation does not produce the expected output: the $reduce leaves a leading comma, and $trim only strips whitespace by default. I converted it to this:
db.collection.aggregate([
{
"$project": {
categories: {
$cond: [
{
$eq: [{ $type: "$categories" }, "array"]
},
{
'$reduce': {
'input': '$categories',
'initialValue': '',
'in': {
'$concat': [
'$$value',
{ '$cond': [{ '$eq': ['$$value', ''] }, '', ', '] },
'$$this'
]
}
}
},
"$categories"
]
}
}
}
])
Edit:
So, if you have nested arrays in the categories field, we can flatten them with $unwind stages. If you add these 3 stages above the $project stage, the aggregation will work.
{
"$unwind": "$categories"
},
{
"$unwind": "$categories"
},
{
"$group": {
_id: "$_id", // group per document; _id: null would merge every document into one
categories: {
$push: "$categories"
}
}
},
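For reference, the transformation the pipeline is aiming for can be written in a few lines of plain JavaScript, which is handy for checking expected values on a handful of sample documents. This is a sketch with a hypothetical `joinCategories` helper; unlike the two `$unwind` stages, `flat(Infinity)` flattens any nesting depth.

```javascript
// Turns a categories value into a comma-separated string.
// Non-array values (e.g. documents that already store a string) pass through unchanged.
function joinCategories(categories) {
  if (!Array.isArray(categories)) return categories;
  // Flatten nested arrays first: an array element is what makes $concat fail
  // with "$concat only supports strings, not array".
  return categories.flat(Infinity).join(", ");
}

console.log(joinCategories(["a", "b", "c"]));   // a, b, c
console.log(joinCategories([["a", "b"], "c"])); // a, b, c
console.log(joinCategories("already a string")); // already a string
```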

How to perform Greater than Operator in MongoDb Compass Querying inside inner object collection

Here is my JSON in MongoDB Compass. I am trying to query only the products whose rating is greater than a given value in each collection.
Note: filtering on PageCount works fine, because that field is not inside an array.
{PageCount: {$gte: 2}} -- works.
The problem is with the inner arrays: if any one element matches, the whole document is displayed.
With the query below, if any one of the elements has a rating of at least 99, all the values are shown.
{"ProductField.ProductDetailFields.ProductDetailInfo.ProductScore.Rating": {$exists:true, $ne: null , $gte: 99}}
----- If I run the above query, I get this output.
How can I iterate over the elements, like a foreach, and check the condition on each one in a MongoDB query?
{
"_id":{
"$oid":"5fc73a7b3fb52d00166554b9"
},
"ProductField":{
"PageCount":2,
"ProductDetailFields":[
{
"PageNumber":1,
"ProductDetailInfo":[
{
"RowIndex":0,
"ProductScore":{
"Name":"Samsung",
"Rating":99
}
},
{
"RowIndex":1,
"ProductScore":{
"Name":"Nokia",
"Rating":96
}
},
{
"RowIndex":2,
"ProductScore":{
"Name":"Apple",
"Rating":80
}
}
]
}
]
}
},
{
"_id":{
"$oid":"5fc73a7b3fb52d0016655450"
},
"ProductField":{
"PageCount":2,
"ProductDetailFields":[
{
"PageNumber":1,
"ProductDetailInfo":[
{
"RowIndex":0,
"ProductScore":{
"Name":"Sony",
"Rating":93
}
},
{
"RowIndex":1,
"ProductScore":{
"Name":"OnePlus",
"Rating":93
}
},
{
"RowIndex":2,
"ProductScore":{
"Name":"BlackBerry",
"Rating":20
}
}
]
}
]
}
}
@Misky How do I run this query?
Running it in the Mongo shell of Nosqlclient throws an error. We are using 3.4.9: https://www.nosqlclient.com/demo/
Is this somewhat close to your idea?
db.collection.aggregate({
$addFields: {
"ProductField.ProductDetailFields": {
$map: {
"input": "$ProductField.ProductDetailFields",
as: "pdf",
in: {
$filter: {
input: {
$map: {
"input": "$$pdf.ProductDetailInfo",
as: "e",
in: {
$cond: [
{
$gte: [
"$$e.ProductScore.Rating",
99
]
},
{
$mergeObjects: [
"$$e",
{
PageNumber: "$$pdf.PageNumber"
}
]
},
null
]
}
}
},
as: "i",
cond: {
$ne: [
"$$i",
null
]
}
}
}
}
}
}
},
{
$addFields: {
"ProductField.ProductDetailFields": {
"$arrayElemAt": [
"$ProductField.ProductDetailFields",
0
]
}
}
})
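The per-element check the question asks for ("like a foreach") is roughly the following in plain JavaScript. It is a sketch for reasoning about the expected result, with a hypothetical `highRated` helper; unlike the pipeline above, which keeps only the first page via `$arrayElemAt`, it collects matches from every page.

```javascript
// Returns the ProductDetailInfo entries whose rating is at least minRating,
// each tagged with the PageNumber of the page it came from.
function highRated(doc, minRating) {
  return doc.ProductField.ProductDetailFields.flatMap(page =>
    page.ProductDetailInfo
      .filter(info => info.ProductScore.Rating >= minRating)
      .map(info => ({ ...info, PageNumber: page.PageNumber }))
  );
}

const product = {
  ProductField: {
    PageCount: 2,
    ProductDetailFields: [
      {
        PageNumber: 1,
        ProductDetailInfo: [
          { RowIndex: 0, ProductScore: { Name: "Samsung", Rating: 99 } },
          { RowIndex: 1, ProductScore: { Name: "Nokia", Rating: 96 } },
          { RowIndex: 2, ProductScore: { Name: "Apple", Rating: 80 } }
        ]
      }
    ]
  }
};

console.log(highRated(product, 99));
// [ { RowIndex: 0, ProductScore: { Name: 'Samsung', Rating: 99 }, PageNumber: 1 } ]
```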

$in requires an array as a second argument, found: missing

Can anybody please tell me what I am doing wrong?
db document structure:
{
"_id" : "module_settings",
"moduleChildren" : [
{
"_id" : "module_settings_general",
"name" : "General",
},
{
"_id" : "module_settings_users",
"name" : "Users",
},
{
"_id" : "module_settings_emails",
"name" : "Emails",
}
],
"permissions" : [
"module_settings_general",
"module_settings_emails"
]
}
pipeline stage:
{ $project: {
filteredChildren: {
$filter: {
input: "$moduleChildren",
as: "moduleChild",
cond: { $in : ["$$moduleChild._id", "$permissions"] }
}
},
}}
I need to filter the "moduleChildren" array to show only the modules whose ids are in the "permissions" array. I've tried "$$ROOT.permissions" and "$$CURRENT.permissions", but neither works. I always get an error saying that $in is missing an array as its second argument. It works when I hardcode the array like this: cond: { $in : ["$$moduleChild._id", ["module_settings_general", "module_settings_emails"]] } so the problem seems to be in how the array is passed.
Thanks for any advice!
First option --> Use aggregation
You are getting this error because some of the documents in your collection either do not contain the permissions field or have it with a type other than array.
You can check the $type of the field and, if it is not an array or does not exist in your document, replace it with an empty array using $addFields and $cond:
db.collection.aggregate([
{ "$addFields": {
"permissions": {
"$cond": {
"if": {
"$ne": [ { "$type": "$permissions" }, "array" ]
},
"then": [],
"else": "$permissions"
}
}
}},
{ "$project": {
"filteredChildren": {
"$filter": {
"input": "$moduleChildren",
"as": "moduleChild",
"cond": {
"$in": [ "$$moduleChild._id", "$permissions" ]
}
}
}
}}
])
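The effect of that guard can be checked on sample documents with a plain JavaScript equivalent. This is only a sketch of the logic, with a hypothetical `filterChildren` helper: a missing or non-array permissions value is treated as an empty list, so the filter returns nothing instead of failing.

```javascript
// Keeps the moduleChildren whose _id appears in permissions.
// A missing or non-array permissions field is treated as an empty array.
function filterChildren(doc) {
  const permissions = Array.isArray(doc.permissions) ? doc.permissions : [];
  const children = Array.isArray(doc.moduleChildren) ? doc.moduleChildren : [];
  return children.filter(child => permissions.includes(child._id));
}

const settings = {
  _id: "module_settings",
  moduleChildren: [
    { _id: "module_settings_general", name: "General" },
    { _id: "module_settings_users", name: "Users" },
    { _id: "module_settings_emails", name: "Emails" }
  ],
  permissions: ["module_settings_general", "module_settings_emails"]
};

console.log(filterChildren(settings).map(child => child.name)); // [ 'General', 'Emails' ]
console.log(filterChildren({ moduleChildren: settings.moduleChildren })); // []
```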
Second option -->
Go to your mongo shell, Robomongo, or whichever GUI you are using, and run
this command:
db.collection.update(
{ "permissions": { "$not": { "$type": "array" } } },
{ "$set": { "permissions": [] } },
{ "multi": true }
)
