I have a table called devicesegments, each row of which contains a large array called devices. Owing to the size of the device array, I've been asked not to include it in my query for a page that lists all devicesegments, but only include their count. Is this possible?
What I was doing before :
A simple db.devicesegments.find()
What I'm doing now :
db.devicesegments.find({}, { devices : 0 })
What I want to achieve :
db.devicesegments.find({}, { devices : 0, devices.length : 1 })
Something like a COUNT(devices) AS device_count!
Ashkay, there's no way to do this with Mongo currently. As #rompetroll says, your application should keep a "count" field on each document, and carefully $inc it whenever you change the number of entries in the array. Then when you query for the document, exclude the array like:
db.collection.find({}, {devices:0})
If you're willing to run MongoDB 2.1, which is a development release, the aggregation framework provides a means to calculate the array size within a query:
http://www.mongodb.org/display/DOCS/Aggregation+Framework
Since there is no way to currently do this, without including a new device_count in my table, the temporary fix that I applied was to fetch all the data from the database, along with the devices array, and for each row, add a field for devices.length and then remove the devices array before sending the data across.
Related
I am trying pagination feature in my project, I am using react with firebase.
My problem statement: if I have around 50 records in my database, what is the query if we want to select records 3 - 8?
If you want to use specific fields in front-end side then use slice and i think you can return your all list from backend and just slice from ui side.
let arr = [
{id:1,name:'One'},
{id:2,name:'Two'},
{id:3,name:'Three'},
{id:4,name:'Four'},
{id:5,name:'Five'},
{id:6,name:'Six'},
{id:7,name:'Seven'},
{id:8,name:'Eight'},
{id:9,name:'Nine'},
{id:10,name:'Ten'}
]
let selected = arr.slice(2,8);
console.log(selected)
Firebase Realtime Database doesn't have numeric index pagination. The API allows you to start at the beginning and page through the query results using startAt(), limitToFirst(), and other related methods, as described in the documentation.
If you absolutely must use indexes, you will have to read the entire set of results, populate an array, and pick out the desired results from that array using the indexes you want on the page.
I have a set of test results in my mongodb database. Each document in the database contains version information, test data, date, test run information etc...
The version is broken up in the document and stored as individual values. For example: { VER_MAJOR : "0", VER_MINOR : "2", VER_REVISION : "3", VER_PATCH : "20}
My application wants the ability to specify a specific version and grab the document as well as the previous N documents based on the version.
For example:
If version = 0.2.3.20 and n = 5 then the result would return documents with version 0.2.3.20, 0.2.3.19, 0.2.3.18, 0.2.3.17, 0.2.3.16, 0.2.3.15
The solutions that come to my mind is:
Create a new database that contains documents with version information and is sorted. Which can be used to obtain the previous N version's which can be used to obtain the corresponding N documents in the test results database.
Perform the sorting in the test results database itself like in number 1. Though if the test results database is large, this will take a very long time. Also consider inserting in order every time.
Creating another database like in option 1 doesn't seem like the right way. But sorting the test results database seems like there will be lots of overhead, am I mistaken that I should be worried about option 2 producing lots of overhead? I have the impression I'd have to query the entire database then sort it on application side. Querying the entire database seems like overkill...
db.collection_name.find().sort([Paramaters for sorting])
You are quite correct that querying and sorting the entire data set would be very excessive. I probably went overboard on this, but I tried to break everything down in detail below.
Terminology
First thing first, a couple terminology nitpicks. I think you're using the term Database when you mean to use the word Collection. Differentiating between these two concepts will help with navigating documentation and allow for a better understanding of MongoDB.
Collections and Sorting
Second, it is important to understand that documents in a Collection have no inherent ordering. The order in which documents are returned to your app is only applied when retrieving documents from the Collection, such as when specifying .sort() on a query. This means we won't need to copy all of the documents to some other collection; we just need to query the data so that only the desired data is returned in the order we want.
Query
Now to the fun part. The query will look like the following:
db.test_results.find({
"VER_MAJOR" : "0",
"VER_MINOR" : "2",
"VER_REVISION" : "3",
"VER_PATCH" : { "$lte" : 20 }
}).sort({
"VER_PATCH" : -1
}).limit(N)
Our query has a direct match on the three leading version fields to limit results to only those values, i.e. the specific version "0.2.3". A range $lte filter is applied on VER_PATCH since we will want more than a single patch revision.
We then sort results by VER_PATCH to return results descending by the patch version. Finally, the limit operator is used to restrict the number of documents being returned.
Index
We're not done yet! Remember how you said that querying the entire collection and sorting it on the app side felt like overkill? Well, the database would doing exactly that if an index did not exist for this query.
You should follow the equality-sort-match rule when determining the order of fields in an index. In this case, this would give us the index:
{ "VER_MAJOR" : 1, "VER_MINOR" : 1, "VER_REVISION" : 1, "VER_PATCH" : 1 }
Creating this index will allow the query to complete by scanning only the results it would return, while avoiding an in-memory sort. More information can be found here.
I'm using postgresql database which allows having an array datatype, in addition django provides PostgreSQL specific model fields for that.
My question is how can I filter objects based on the last element of the array?
class Example(models.Model):
tags = ArrayField(models.CharField(...))
example = Example.objects.create(tags=['tag1', 'tag2', 'tag3']
example_tag3 = Example.objects.filter(tags__2='tag3')
I want to filter but don't know what is the size of the tags. Is there any dynamic filtering something like:
example_tag3 = Example.objects.filter(tags__last='tag3')
I don't think there is a way to do that without "killing the performance" other than using raw SQL (see this). But you should avoid doing things like this, from the doc:
Tip: Arrays are not sets; searching for specific array elements can be
a sign of database misdesign. Consider using a separate table with a
row for each item that would be an array element. This will be easier
to search, and is likely to scale better for a large number of
elements.
Adding to the above answer and comment, if changing the table structure isn't an option, you may filter your query based on the first element in an array by using field__0:
example_tag3 = Example.objects.filter(tags__0='tag1')
However, I don't see a way to access the last element directly in the documentation.
The following document records a conversation between Milhouse and Bart. I would like to insert a new message with the right num (the next in the example would be 3) in a unique operation. Is that possible ?
{ user_a:"Bart",
user_b:"Milhouse",
conversation:{
last_msg:2,
messages:[
{ from:"Bart",
msg:"Hello"
num:1
},
{ from:"Milhouse",
msg:"Wanna go out ?"
num:2
}
]
}
}
In MongoDB, arrays keep their order, so by adding a num attribute, you're only creating more data for something that you could accomplish without the additional field. Just use the position in the array to accomplish the same thing. Grabbing the X message in an array will provide faster searches than searching for { num: X }.
To keep the order, I don't think there's an easy way to add the num category besides does a find() on conversation.last_msg before you insert the new subdocument and increment last_msg.
Depending on what you need to keep the ordering for, you might consider including a time stamp in your subdocument, which is commonly kept in conversation records anyway and may provide other useful information.
Also, I haven't used it, but there's a Mongoose plugin that may or may not be able to do what you want: https://npmjs.org/package/mongoose-auto-increment
You can't create an auto increment field but you can use functions to generate and administrate sequence :
http://docs.mongodb.org/manual/tutorial/create-an-auto-incrementing-field/
I would recommend using a timestamp rather than a numerical value. By using a timestamp, you can keep the ordering of the subdocument and make other use of it.
I have a mongo collection with documents that have a schema structured like the following:
{ _id : bla,
fname : foo,
lname : bar,
subdocs [ { subdocname : doc1
field1 : one
field2 : two
potentially_huge_array : [...]
}, ...
]
}
I'm using the ruby mongo driver that currently does not support elemMatch. I use an aggregation when extracting from subdocs via a project, unwind and match pipeline.
What I would now like to do is to page results from the potentially_huge_array array contained in the subdocument. I have not been able to figure out how to grab just a subset of the array without dragging the entire subdoc, huge array and all, out of the db into my app.
Is there some way to do this?
Would a different schema be a better way to handle this?
Depending on how huge is huge, you definitely don't want it embedded into another document.
The main reason is that unless you always want the array returned with the document, you probably don't want to store it as part of the document. How you can store it in another collection would depend on exactly how you want to access it.
Reviewing the types of queries you most often perform on your data will usually suggest the best schema - one that will allow you to be efficient about number of queries, the amount of data returned and ease of indexing the data.
If you field really huge and changes often, just placed it in separate collection.