Is there a built-in function to get all unique values in an array field, across all records?

My schema looks like this:
var ArticleSchema = new Schema({
...
category: [{
type: String,
default: ['general']
}],
...
});
I want to go through all records and find all unique values for this field. The result will be sent to the front end, via a service call, for look-ahead search when tagging articles.
We could iterate through every single record, go through each array value, and check whether it has already been seen, but this would be O(n^2).
Is there an existing function or another way that has better performance?

You can use the distinct function to get the unique values across all category array fields of all documents:
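// Callback style, as in older Mongoose versions. Mongoose 7+ dropped callbacks,
// so there you would use: const categories = await Article.distinct('category');
Article.distinct('category', function(err, categories) {
  // categories is an array of the unique category values
});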
Put an index on category for best performance.
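For intuition about what distinct returns, and why a manual approach does not have to be O(n^2), here is a minimal plain-JavaScript sketch that collects unique values with a Set in a single pass. The sample articles array and the uniqueCategories helper are made up for illustration; this is not a description of how MongoDB implements distinct internally.

```javascript
// Hypothetical sample data standing in for Article documents.
const articles = [
  { title: 'A', category: ['general', 'news'] },
  { title: 'B', category: ['news', 'sports'] },
  { title: 'C', category: ['general'] },
];

// Collect every distinct category in one pass over all array elements.
// Set membership checks are O(1) on average, so the total work is
// proportional to the number of array elements, not its square.
function uniqueCategories(docs) {
  const seen = new Set();
  for (const doc of docs) {
    for (const value of doc.category || []) {
      seen.add(value);
    }
  }
  return [...seen];
}

console.log(uniqueCategories(articles));
```

On a real collection, letting the database do this via distinct is still preferable, since it avoids transferring every document to the application.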

Related

PocketBase: filter data by multiple relation set

I have a collection where the field "users" contains the IDs of two users.
How can I search for the record that contains both of these user IDs?
I tried
users = ["28gjcow5t1mkn7q", "frvl86sutarujot"]
but it still doesn't work.
This is a relation, so there must be a collection behind it that allows multiple, non-unique results; the collection you are looking at is the dataset. You can query the whole dataset in a script with:
// you can also fetch all records at once via getFullList
const records = await pb.collection('tlds').getFullList(200 /* batch size */, {
sort: '-created',
});
I suggest looking into js-sdk/pocketbase on GitHub.
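To match on both IDs directly rather than fetching everything, one option is a filter expression that requires each ID to be present in the relation field. The sketch below only builds the filter string. It assumes that PocketBase's `~` (contains) operator combined with `&&` can be applied to a multiple relation field in this way, which is worth verifying against the filter documentation for your PocketBase version. The buildAllUsersFilter helper is hypothetical.

```javascript
// Hypothetical helper: builds a filter that requires every given ID to
// appear in the multi-relation field (assumes the field supports `~`).
// IDs are assumed to be plain alphanumeric strings, so no escaping is done.
function buildAllUsersFilter(field, ids) {
  return ids.map((id) => `${field} ~ "${id}"`).join(' && ');
}

const filter = buildAllUsersFilter('users', ['28gjcow5t1mkn7q', 'frvl86sutarujot']);
console.log(filter);
// users ~ "28gjcow5t1mkn7q" && users ~ "frvl86sutarujot"
```

The resulting string would then be passed as the filter option, next to sort, in the options object of the list call.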

Query using ObjectId in MongoDB

I have a notes collection as:
{
note: {
type: String,
},
createdBy: {
type: String,
required: true
},
}
where "createdBy" contains the _id of a user from the users collection.
First Question: Should I define it as String or ObjectId?
Second Question:
When querying the data with db.users.find({ createdBy: ObjectId(userid) },'notes'), is it an O(1) operation?
Or do I have to create an index for it to be O(1)?
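If your users collection uses ObjectId for _id, you had better use ObjectId in the notes collection as well, since you may want to $lookup them.
Only the _id field gets an index automatically when the collection is created. You need to create an index on createdBy yourself if you want fast lookups. Note that a lookup through a B-tree index is O(log n) rather than strictly O(1), but it avoids scanning the whole collection.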

Implementing sort by date added in mongo

I have a customer model containing an array of favorites that reference product IDs in another collection:
{
name: string,
favorites: [ObjectId("123"), ObjectId("456"), ObjectId("789")]
}
Now I'm looking to let a customer sort their favorites list by the date they added each favorite. What would be the best way to store favorites with this new requirement? Does Mongo allow built-in timestamps at the array element level?

Querying mongoDB document based on Array element

This is one user's notes. I want to query and get only the notes of this user with "activeFlag:1". My query object code is
findAccountObj =
{ _id: objectID(req.body.accountId),
ownerId: req.body.userId,
bookId: req.body.bookId,
"notes.activeFlag": 1 };
But this query returns all the notes, including the ones with "activeFlag:0".
How do I fix this?
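If you are on v2.2, use the $elemMatch projection operator; note that it returns only the first matching array element. v3.2 and above support the $filter operator in aggregation, which returns a subset of an array within a document.
Here is an example: Retrieve only the queried element in an object array in MongoDB collection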
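As a rough illustration of the aggregation and filter approach, the sketch below defines a pipeline as a plain object and mimics its effect on an in-memory document so that it can be run without a database. The sample account document, the text field on each note, and the keepActiveNotes helper are invented for the example; the $match stage is a placeholder for the asker's own _id, ownerId, and bookId criteria.

```javascript
// Pipeline one could pass to collection.aggregate(). The $match stage
// stands in for the full criteria from the question.
const pipeline = [
  { $match: { 'notes.activeFlag': 1 } },
  {
    $project: {
      notes: {
        $filter: {
          input: '$notes',
          as: 'note',
          cond: { $eq: ['$$note.activeFlag', 1] },
        },
      },
    },
  },
];

// Plain-JavaScript imitation of the $filter expression, for illustration only.
function keepActiveNotes(doc) {
  return { ...doc, notes: doc.notes.filter((note) => note.activeFlag === 1) };
}

// Hypothetical sample document.
const account = {
  _id: 'a1',
  notes: [
    { text: 'first', activeFlag: 1 },
    { text: 'second', activeFlag: 0 },
    { text: 'third', activeFlag: 1 },
  ],
};

console.log(keepActiveNotes(account).notes.map((n) => n.text));
```

The original query matches whole documents, which is why every note comes back; the array has to be trimmed in a projection or aggregation stage like the one above.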

RethinkDB - Find documents with missing field

I'm trying to write the most optimal query to find all of the documents that do not have a specific field. Is there any better way to do this than the examples I have listed below?
// Get the ids of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).pluck("id")
// Get a count of all documents missing "location"
r.db("mydb").table("mytable").filter({location: null},{default: true}).count()
Right now, these queries take about 300-400ms on a table with ~40k documents, which seems rather slow. Furthermore, in this specific case, the "location" attribute contains latitude/longitude and has a geospatial index.
Is there any way to accomplish this? Thanks!
A naive suggestion
You could use the hasFields method along with the not method to filter out unwanted documents:
r.db("mydb").table("mytable")
.filter(function (row) {
return row.hasFields({ location: true }).not()
})
This might or might not be faster, but it's worth trying.
Using a secondary index
Ideally, you'd want a way to make location a secondary index and then use getAll or between, since queries using indexes are generally faster. A way you could work around that is giving every row in your table that doesn't have a location the value false for its location. Then, you would create a secondary index for location. Finally, you can query the table using getAll as much as you want!
Adding a location property to all documents without a location
For that, you'd need to first insert location: false into all rows without a location. You could do this as follows:
r.db("mydb").table("mytable")
.filter(function (row) {
return row.hasFields({ location: true }).not()
})
.update({
location: false
})
After this, you would need to find a way to insert location: false every time you add a document without a location.
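One way to handle that last step is to normalize documents in application code before inserting them. The sketch below is plain JavaScript with a hypothetical withDefaultLocation helper; the sample documents are invented, and the actual insert call is left out.

```javascript
// Hypothetical helper: returns a copy of the document with location: false
// when no location is set, so that every row carries a location value.
function withDefaultLocation(doc) {
  if (doc.location === undefined || doc.location === null) {
    return { ...doc, location: false };
  }
  return doc;
}

// Invented sample documents.
const docs = [
  { id: 1, name: 'with location', location: { lat: 52.5, lng: 13.4 } },
  { id: 2, name: 'without location' },
];

const normalized = docs.map(withDefaultLocation);
console.log(normalized[1].location); // false
```

The normalized documents would then be passed to the table's insert call in place of the raw ones.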
Create secondary index for the table
Now that all documents have a location field, we can create a secondary index for location.
r.db("mydb").table("mytable")
.indexCreate('location')
Keep in mind that you have to add { location: false } to the existing documents and create the index only once.
Use getAll
Now we can just use getAll to query documents using the location index.
r.db("mydb").table("mytable")
.getAll(false, { index: 'location' })
This will probably be faster than the query above.
Using a secondary index (function)
You can also create a secondary index as a function. Basically, you create a function and then query the results of that function using getAll. This is probably easier and more straightforward than what I proposed before.
Create the index
Here it is:
r.db("mydb").table("mytable")
  .indexCreate('has_location', function(x) {
    return x.hasFields('location');
  })
Use getAll
Here it is:
r.db("mydb").table("mytable")
.getAll(false, { index: 'has_location' })
