MongoDB query to check if a document array field element is present in more than one document

I have been searching through the MongoDB query syntax with various combinations of terms to see if I can find the right syntax for the type of query I want to create.
We have a collection containing documents with an array field. This array field contains ids of items associated with the document.
I want to be able to check if an item has been associated more than once. If it has then more than one document will have the id element present in its array field.
I don't know in advance the id(s) to check for as I don't know which items are associated more than once. I am trying to detect this. It would be comparatively straightforward to query for all documents with a specific value in their array field.
What I need is some query that can return all the documents where one of the elements of its array field is also present in the array field of a different document.
I don't know how to do this. In SQL it might have been possible with subqueries. In Mongo Query Language I don't know how to do this or even if it can be done.

You can use $lookup to self-join the collection on the array field. After $unwind, every element always matches at least its own document, so the joined array is never empty. Keep only the rows whose joined array has a second entry, which means another document contains the same element, then $group back to one row per document. All of these stages are available in MongoDB 3.6.
db.col.aggregate([
  {"$unwind":"$array"},
  {"$lookup":{
    "from":"col",
    "localField":"array",
    "foreignField":"array",
    "as":"jarray"
  }},
  // jarray always contains the document itself; a second entry means the element is shared
  {"$match":{"jarray.1":{"$exists":true}}},
  // collapse the unwound rows back to one row per document
  {"$group":{
    "_id":"$_id",
    "fieldOne":{"$first":"$fieldOne"}
    // ... other fields
  }}
])
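To pin down what such a query is expected to return, here is a minimal plain JavaScript sketch that needs no MongoDB. The sample documents, the field name `array`, and the helper name `findDocsSharingElements` are assumptions made up for illustration; it models the intended result, not the server's execution.

```javascript
// Plain JavaScript model of "documents that share an array element with another document".
function findDocsSharingElements(docs) {
  // Map each array element to the set of document _ids that contain it.
  const owners = new Map();
  for (const doc of docs) {
    for (const item of doc.array) {
      if (!owners.has(item)) owners.set(item, new Set());
      owners.get(item).add(doc._id);
    }
  }
  // Keep a document if any of its elements is owned by more than one document.
  return docs.filter(doc => doc.array.some(item => owners.get(item).size > 1));
}

// Hypothetical sample data: documents 1 and 2 share the element "b".
const sampleDocs = [
  { _id: 1, array: ["a", "b"] },
  { _id: 2, array: ["b", "c"] },
  { _id: 3, array: ["d"] },
];
const sharedIds = findDocsSharingElements(sampleDocs).map(doc => doc._id);
console.log(sharedIds);
```

Running the same data through the aggregation and comparing the returned _id values against this sketch is one way to check the pipeline.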

Related

What is the better way to query a MongoDB array field?

I am new to MongoDB, so I don't know the better way to query an array field. In our courses collection, each document has a program field. Originally it was a single reference to the program collection and held one program id. We have now changed the schema from a single reference to an array of references, so that a course can be part of multiple programs. In our codebase all the queries are written like this:
course.find({program});
Do I have to change this query to account for the schema change, like this:
course.find({program: {$in: program}});
I have tested in MongoDB Compass, and this query
course.find({program});
works on the array field.
So what could be the consequences if I don't use the $in operator when searching the array field?
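As a rough illustration of the two query shapes, here is a simplified in-memory model in plain JavaScript. It assumes the query value is a single id (ids are plain strings here, not ObjectIds), and the helper names `matchesEquality` and `matchesIn` are hypothetical. An equality match against an array field succeeds when the array contains the value, while $in takes an array of candidate values and fails if given anything else.

```javascript
// Simplified model of two MongoDB query shapes against one field.
// It ignores the case where the equality value is itself an array.

// { program: value }: matches a scalar equal to value, or an array containing it.
function matchesEquality(fieldValue, value) {
  return Array.isArray(fieldValue) ? fieldValue.includes(value) : fieldValue === value;
}

// { program: { $in: values } }: matches if the field, or any array element, is in values.
function matchesIn(fieldValue, values) {
  if (!Array.isArray(values)) throw new TypeError("$in needs an array");
  const candidates = Array.isArray(fieldValue) ? fieldValue : [fieldValue];
  return candidates.some(candidate => values.includes(candidate));
}

// Hypothetical courses using the new array-of-references schema.
const courses = [
  { name: "algebra", program: ["p1", "p2"] },
  { name: "biology", program: ["p3"] },
];

const byEquality = courses.filter(c => matchesEquality(c.program, "p1")).map(c => c.name);
const byIn = courses.filter(c => matchesIn(c.program, ["p1", "p3"])).map(c => c.name);
console.log(byEquality, byIn);
```

Under this model, the existing equality queries keep working for a single program id, and $in is only needed when searching for any of several program ids at once.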

MongoDB Aggregate List of Object IDs

I'm trying to create an interface which gives our team the ability to build relatively simple queries to segment customers and aggregate data about those customers. In addition to this, I want to build a "List" feature that allows users to upload a CSV that can be separately generated for more complex one-off queries. In this scenario, I'm trying to figure out the best way to query the database.
My main question is how the $in operator works. The example below has an aggregate which tries to check if a primary key (object ID) is in an array of object IDs.
How well does the $in operator perform? I'm mostly wondering how this query will run – does it loop over the array and look for documents that match each value in the array for N lookups, or will it loop over all of the documents in the database and for each one, loop over the array and check if the field matches?
db.getCollection('customers').aggregate([
{
$match: {
'_id': { $in: [ObjectId("idstring1"), ObjectId("idstring2"), ..., ObjectId("idstring5000")] }
}
}
])
If that's not how it works, what's the most efficient way of aggregating a set of documents given a bunch of object IDs? Should I just do the N lookups manually and pipe the documents into the aggregation?
"My main question is how the $in operator works. The example below has an aggregate which tries to check if a primary key (object ID) is in an array of object IDs."
Consider the code:
var OBJ_ARR = [ ObjectId("5df9b50e7b7941c4273a5369"), ObjectId("5df9b50f7b7941c4273a5525"), ObjectId("5df9b50e7b7941c4273a515f"), ObjectId("5df9b50e7b7941c4273a51ba") ]
db.test.aggregate( [
{
$match: {
_id: { $in: OBJ_ARR }
}
}
])
The query tries to match each element of the array against the documents in the collection. Since there are four elements in OBJ_ARR, up to four documents are returned, depending on how many match.
If you have N _id values in the array, the operation will try to find a match for every one of them. The more values the array holds, the longer the query takes; the number of ObjectIds matters for query performance. With a single element in the array, the query is treated as one equality match.
From the documentation: $in works like an $or operator with equality checks.
"... what's the most efficient way of aggregating a set of documents given a bunch of object IDs?"
The _id field of a collection has a unique index by default, and the query you are trying will use this index to match the documents. Running explain() on the query (with a small set of test data) confirms an index scan (IXSCAN) for the $match stage that uses $in. The index usage makes this a well-performing query as it stands. However, the later stages of the aggregation pipeline, the size of the data set, the size of the input array, and other factors will influence the overall performance and efficiency.
Also, see:
Pipeline Operators and Indexes and Aggregation Pipeline Optimization.
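The two access patterns asked about in the question can be contrasted with a toy model in plain JavaScript. A Map stands in for the unique index on _id. All names and data here are made up for illustration, and this shows only the shape of the work, not how MongoDB is implemented internally.

```javascript
// Toy collection; the Map below plays the role of the _id index.
const collection = [
  { _id: "a1", name: "Ann" },
  { _id: "b2", name: "Bob" },
  { _id: "c3", name: "Cy" },
];
const idIndex = new Map(collection.map(doc => [doc._id, doc]));

// With an index: one index probe per requested id, no pass over every document.
function findByIdsIndexed(ids) {
  return ids.map(id => idIndex.get(id)).filter(doc => doc !== undefined);
}

// Without an index: visit every document and test it like an $or of equality checks.
function findByIdsScan(ids) {
  return collection.filter(doc => ids.some(id => doc._id === id));
}

// "zz" matches nothing, so fewer documents than ids come back.
const wanted = ["c3", "a1", "zz"];
const indexed = findByIdsIndexed(wanted).map(doc => doc.name).sort();
const scanned = findByIdsScan(wanted).map(doc => doc.name).sort();
console.log(indexed, scanned);
```

Both functions return the same documents; the difference is that the indexed version does work proportional to the number of ids rather than to the size of the collection.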

Skip and Limit on nested array element

I want to apply skip and limit for paging within a nested array of a document. How can I do this efficiently?
My documents look like this:
{
"_id":"",
"name":"",
"ObjectArray":[{
"url":"",
"value":""
}]
}
I want to retrieve multiple documents, with each document containing only 'n' elements of the array.
I am using $in in a find query to retrieve multiple documents by _id, but how can I get a certain number of ObjectArray elements in every document?
You can try this:
db.collection.find({}, {ObjectArray:{$slice:[0, 3]}})
This returns the first three elements of ObjectArray (indexes 0 to 2). The general form is:
$slice:[SKIP_VALUE, LIMIT_VALUE]
For your example:
db.collection.find({"_id":""}, {ObjectArray:{$slice:[0, 3]}})
Here is the reference for the MongoDB $slice projection operator:
http://docs.mongodb.org/manual/reference/operator/projection/slice/
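The paging arithmetic behind $slice:[SKIP_VALUE, LIMIT_VALUE] can be sketched in plain JavaScript as follows. This is an in-memory stand-in for the projection with a non-negative skip; `pageObjectArray`, the page variables, and the sample documents are hypothetical names and data chosen for illustration.

```javascript
// Model of the projection { ObjectArray: { $slice: [skip, limit] } } for skip >= 0.
function pageObjectArray(doc, skip, limit) {
  return { ...doc, ObjectArray: doc.ObjectArray.slice(skip, skip + limit) };
}

// Hypothetical documents with differently sized nested arrays.
const docs = [
  { _id: "x", name: "first", ObjectArray: [{ value: 1 }, { value: 2 }, { value: 3 }, { value: 4 }, { value: 5 }] },
  { _id: "y", name: "second", ObjectArray: [{ value: 6 }] },
];

// Page 2 with a page size of 2: skip 2 elements, return at most 2.
const pageSize = 2;
const pageNumber = 2;
const paged = docs.map(doc => pageObjectArray(doc, (pageNumber - 1) * pageSize, pageSize));
const pagedValues = paged.map(doc => doc.ObjectArray.map(item => item.value));
console.log(pagedValues);
```

In the real query the same two numbers would go into the projection, with SKIP_VALUE computed as (pageNumber - 1) * pageSize and LIMIT_VALUE set to pageSize. A document whose array is shorter than the skip value simply comes back with an empty array.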

ArangoDB Query Attribute Embedded in List

I have a document embedded in a list, and I need to query for documents matching the value of a series of Strings.
My document:
As you can see, I have a regular document. Inside that document is "categories", an array of unknown length per document. Each entry in categories has "confidence" and "title". I need to find documents that have a title matching any title in a list I hold in an ArrayList. The query I thought would work is:
FOR document IN documents FILTER document.categories.title IN @categories RETURN article
@categories is a bind parameter holding the ArrayList of titles. If any of the titles in the ArrayList appear in the document, I would like it to be returned.
This query seems to be returning null. I don't think it is getting down to the level of comparing the ArrayList to the "title" field in my document. I know I can access the "categories" list using [#] but I don't know how to search for the "title"s in "categories".
The query
FOR document IN documents
  FILTER document.categories.title IN @categories
  RETURN article
would work if document.categories were a single object, but it does not work when document.categories is an array. The reason is that the array on the left-hand side of the IN operator is not auto-expanded.
To find the documents, the query could be rewritten with the [*] expansion operator and the ANY IN array comparison operator (assuming an ArangoDB version that supports array comparison operators):
FOR document IN documents
  FILTER document.categories[*].title ANY IN @categories
  RETURN document
or
FOR document IN documents
  LET titles = (
    FOR category IN document.categories
      RETURN category.title
  )
  FILTER titles ANY IN @categories
  RETURN document
Apart from that, article will be unknown unless there is a collection named article. It should probably read document or document.article.
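The difference between testing an array as a whole and testing its elements one by one can be seen in a plain JavaScript model. This is a sketch under stated assumptions, not ArangoDB code. A plain IN with an array on the left is modelled as whole-value membership, and AQL's ANY IN array comparison operator is modelled as a per-element check. The sample documents and helper names are made up for illustration.

```javascript
// Invented sample data shaped like the documents in the question.
const documents = [
  { _key: "1", categories: [{ title: "sports", confidence: 0.9 }, { title: "news", confidence: 0.4 }] },
  { _key: "2", categories: [{ title: "travel", confidence: 0.7 }] },
];
const categories = ["news", "tech"];

// Like `titles IN @categories`: is the whole titles array one member of the list?
function wholeArrayIn(titles, list) {
  return list.some(member => JSON.stringify(member) === JSON.stringify(titles));
}

// Like `titles ANY IN @categories`: is at least one title a member of the list?
function anyIn(titles, list) {
  return titles.some(title => list.includes(title));
}

// Equivalent of document.categories[*].title: collect the titles of one document.
const titlesOf = doc => doc.categories.map(category => category.title);

const wholeMatches = documents.filter(doc => wholeArrayIn(titlesOf(doc), categories)).map(doc => doc._key);
const anyMatches = documents.filter(doc => anyIn(titlesOf(doc), categories)).map(doc => doc._key);
console.log(wholeMatches, anyMatches);
```

In this model only the per-element check finds document "1", because no entry of the list is itself an array of titles.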

MongoDB - get all documents with a property that is saved in a list in a document of another collection

I have documents in a collection that have an array of properties (1-400 different numeric values).
Now I want to get all documents of another collection that have one of these properties (these documents have only one property each).
How can I do that, preferably in one call?
As MongoDB is not a relational DBMS, a find() query cannot do this in a single call (a single-call alternative is the $lookup aggregation stage, available since MongoDB 3.2).
What you need to do is first retrieve the document you want to use for your search. Once you have retrieved it, use the array stored in that document to run a $in query on the field in the other collection. In the mongo shell this could look something like this:
var ar = db.coll1.findOne().numArray
db.coll2.find({b: { $in : ar }})
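The two-step approach can be traced on in-memory data with a small plain JavaScript sketch. The arrays below stand in for coll1 and coll2, and the sample values are made up for illustration. The field names numArray and b follow the shell example above.

```javascript
// In-memory stand-ins for the two collections.
const coll1 = [{ _id: 1, numArray: [10, 20, 30] }];
const coll2 = [
  { _id: "a", b: 10 },
  { _id: "b", b: 25 },
  { _id: "c", b: 30 },
];

// Step 1: fetch the document that holds the list (like db.coll1.findOne().numArray).
const ar = coll1[0].numArray;

// Step 2: keep documents whose single property is in that list (like { b: { $in: ar } }).
const allowed = new Set(ar);
const matches = coll2.filter(doc => allowed.has(doc.b)).map(doc => doc._id);
console.log(matches);
```

With at most 400 values in the array, passing the whole list to a single $in query in the second step keeps the total at two round trips.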
