Create MongoDB index based on $gt and $lt

I have a collection called "products" that contains many documents,
and I want to improve performance by creating an index for it.
The problem is that I don't know how indexes work, so I don't know whether an index will be helpful or not.
My most frequently used query is on the fields "storeId" and "salesDate".
storeId is just a string, so I think it is a good candidate for an index,
but the tricky one is salesDate: it is an object with two fields, from and to, like this:
product {
  ...someFields,
  storeId: string,
  salesDate: {
    from: number, // date-time as a numeric timestamp
    to: number    // date-time as a numeric timestamp
  }
}
My queries use $gt and $lt, for example:
product.find({
  storeId: "blah",
  "salesDate.from": { $gt: 1423151234, $lt: 15123123123 }
})
OR
product.find({
  storeId: "blah",
  "salesDate.from": { $gt: 1423151234 },
  "salesDate.to": { $lt: 15123123123 }
})
What is the proper index for this case?

For your specific use case, I would recommend creating an index on the from key only (together with storeId) and using $gt/$lt (or $gte/$lte) in your find query.
The reason is that the fewer keys you index (in cases where multi-key queries can be avoided), the better.
Make sure that you follow the order below for both the index and the find operation.
Index Command and Order:
db.product.createIndex({
  "storeId": 1,
  "salesDate.from": -1 // change `-1` to `1` if you want the date key indexed in ascending order
})
Find Command:
db.product.find({
  "storeId": "blah",
  "salesDate.from": { $gt: 1423151234, $lt: 15123123123 }
})
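To see why the equality field (storeId) should come before the range field (salesDate.from) in the compound index, here is a toy Python sketch (plain Python, not MongoDB code) that models the index as a sorted list of keys: equality plus range becomes one contiguous slice found by binary search. The data values are made up for illustration, and the sketch uses ascending order, though the same idea holds for a descending index.

```python
from bisect import bisect_left, bisect_right

# Toy model of a compound index on (storeId, salesDate.from):
# a sorted list of (storeId, from) key pairs, as an index would store them.
index = sorted([
    ("acme", 1423151000), ("acme", 1423151234),
    ("blah", 1423151300), ("blah", 1423152000), ("blah", 15123123124),
    ("zeta", 1423151500),
])

def range_query(store_id, gt, lt):
    """Equality on storeId plus a range on salesDate.from is one
    contiguous slice of the index, located with two binary searches."""
    lo = bisect_right(index, (store_id, gt))  # first key strictly > gt
    hi = bisect_left(index, (store_id, lt))   # first key >= lt
    return index[lo:hi]

print(range_query("blah", 1423151234, 15123123123))
# Only the "blah" keys inside the range are touched; everything else is skipped.
```

If the range field came first, the matching keys for one store would be scattered across the whole index instead of forming a single slice.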

Related

Sort solr from least to highest based on facet count

This is my facet query:
http://localhost:8983/solr/test_words/select?q=*:*&facet=true&facet.field=keyword
On searching I get results ordered based on facet count from highest to lowest
Example : {"and" :10, "to": 9, "also" : 8}
But instead I want the results ordered based on facet count from lowest to highest
Example : {"tamil" :1, "english" :2, "french":3}
I also tried
http://localhost:8983/solr/test_words/select?q=*:*&facet=true&facet.field=keyword&facet.sort=count
but it did not give the expected results. Please help me with this!
The "old" facet interface doesn't support sorting by asc as far I know - it's always sorted from most common term to the least common one.
The JSON facet API does however support asc and desc for sorting:
sort - Specifies how to sort the buckets produced.
count specifies document count; index sorts by the index (natural) order of the bucket value. One can also sort by any facet function/statistic that occurs in the bucket. The default is count desc. This parameter may also be specified in JSON like sort:{count:desc}. The sort order may be either asc or desc.
"facet": {
keywords: {
"type": "terms",
"field": "keyword",
"sort": "count asc",
"limit": 5
}
}
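For illustration, the effect of "sort": "count asc" with "limit": 5 on the term buckets can be sketched in plain Python (the counts are the made-up examples from the question, not real Solr output):

```python
# Hypothetical term counts, as the facet component might tally them.
counts = {"and": 10, "to": 9, "also": 8, "tamil": 1, "english": 2, "french": 3}

# "sort": "count asc" with "limit": 5 — smallest buckets first, keep five.
buckets = sorted(counts.items(), key=lambda kv: kv[1])[:5]
print(buckets)
# [('tamil', 1), ('english', 2), ('french', 3), ('also', 8), ('to', 9)]
```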

Querying composite primary key in mongodb

I have a mongodb collection in which document is stored in the below format
{
"_id": {
"host_ip": "192.41.15.161",
"date": "2020-02-02T08:18:09.207Z"
},
"path": "/apache_pb.gif",
"request": "GET /apache_pb.gif HTTP/1.0",
"status": 200
}
where "host_ip" and "date" should be composite primary key i.e. (unique together) and there exists an _id index, which I think is created based on these two fields
So, how can I query based on host_ip and date together so that the "_id" index can be utilized?
Tried using
db.collection.find({_id: {host_ip: "192.41.15.161", date: {$gte: ISODate('2020-02-02T08:00:00.000Z')}}}), but it does not work - it does not even return the record that should match. Is this not the correct way to query?
Query like
db.collection.find({"_id.host_ip": "192.41.15.161", "_id.date": {$gte: ISODate('2020-02-02T08:00:00:00.000Z')}}), worked but it does not use the index created on "_id"
When querying on an _id composite primary key, MongoDB only performs exact matches against the whole subdocument (essentially treating the embedded document as a single opaque value), so when you query for _id: {a: x, b: {$gte: y}} it looks for a document whose _id is literally that object and doesn't allow matching on any portion of the object.
> db.test.insert({_id: {a: 1, b: 2}});
WriteResult({ "nInserted" : 1 })
> db.test.find({_id: {a: 1}}); // no results
> db.test.find({_id: {a: 1, b: {$eq: 2}}}); // no results
As far as solving your problem goes, I'd probably set up a new unique compound index on the fields you care about & ignore the _id field entirely.
MongoDB and composite primary keys has a few more details.
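A rough Python model of the difference (a toy sketch of the behavior described above, not MongoDB itself): matching on _id as a whole compares the entire subdocument, field order included, while dot notation matches individual fields but cannot use the _id index.

```python
# Toy collection with a composite _id subdocument.
docs = [
    {"_id": {"host_ip": "192.41.15.161", "date": "2020-02-02T08:18:09.207Z"},
     "status": 200},
]

def find_exact(query_id):
    """_id: {...} — the whole subdocument must match, in field order
    (modeled here by comparing the ordered list of items)."""
    return [d for d in docs if list(d["_id"].items()) == list(query_id.items())]

def find_dotted(host_ip, min_date):
    """"_id.host_ip" / "_id.date" — per-field matching; this finds the
    document, but corresponds to the query that cannot use the _id index."""
    return [d for d in docs
            if d["_id"]["host_ip"] == host_ip and d["_id"]["date"] >= min_date]

# Partial subdocument: no exact match, so no results — like the shell demo above.
assert find_exact({"host_ip": "192.41.15.161"}) == []
# Dot notation finds the document.
assert len(find_dotted("192.41.15.161", "2020-02-02T08:00:00.000Z")) == 1
```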

MongoDB OR with Regex not using compound index

Really at wits end here; I'm using the following query to search a collection with about 300K documents
query = { $or: [
{description: { $regex: ".*app.*"}},
{username: { $regex: ".*app.*"}},
]};
and simply putting that in a .find() function. It is tremendously slow. Like every single query takes at least 20 seconds.
I have tried individual indices on both username and description, and now have a compound index on {description: 1, username: 1}, but it does not seem to make a difference at all. If I check the MongoDB live metrics, it does not use the index at all.
Any pointers would be greatly appreciated.
Regexes that do partial (unanchored) string matching never use an index because, as the name implies, with a partial match there is no place in the index to start looking, so every string has to be checked. (Only a case-sensitive regex anchored to the start of the string, e.g. /^app/, can use an index as a range scan, and .*app.* is not anchored.)
As a solution, you can hook your database up to something like Lucene, which specializes in such queries.
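The difference between an anchored prefix (which an ordered index can serve) and .*app.* (which cannot) can be sketched in Python, modeling the index as a sorted list of values; the usernames are made up for illustration:

```python
from bisect import bisect_left

# Toy "index": sorted values, as a B-tree index would keep them.
usernames = sorted(["appuser", "application", "bob", "zapp", "happy", "carol"])

def prefix_scan(prefix):
    """Like /^app/: binary-search to the first candidate, then read
    forward only while the prefix still matches — a bounded range scan."""
    i = bisect_left(usernames, prefix)
    out = []
    while i < len(usernames) and usernames[i].startswith(prefix):
        out.append(usernames[i])
        i += 1
    return out

def substring_scan(sub):
    """Like /.*app.*/: no useful starting point, so every value is checked."""
    return [u for u in usernames if sub in u]

print(prefix_scan("app"))     # touches only the matching range
print(substring_scan("app"))  # touches every entry
```

The substring scan still returns correct results; it just cannot skip anything, which is why the 300K-document collection takes so long.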

Grouped records with aggregate fields

I'm running an instance of Solr 6.2. One of the use cases I'm exploring is to return records grouped by a field, including summed columns (facets) and sorted by those columns. I realize Solr is not meant to be utilized as a relational database, but is this possible?
Using the JSON API, I send the following data payload to the query endpoint of my Solr instance:
{
  "query": "*:*",
  "filter": ["status:1", "date:[2016-10-11T00:00:00Z-7DAYS/DAY TO 2016-10-11T00:00:00Z]"],
  "limit": 10,
  "params": {
    "group": true,
    "group.field": "name",
    "group.facet": true
  },
  "facet": {
    "funcs": {
      "type": "terms",
      "field": "name",
      "sort": { "sum_v1": "desc" },
      "limit": 10,
      "facet": {
        "sum_v1": "sum(v1)",
        "sum_v2": "sum(v2)",
        "sum_v3": "sum(v3)"
      }
    }
  }
}
This returns 10 records at a time in both the groups key and facets key of the response JSON. However, the sorted facet buckets do not match up with the grouped records. How can I get the facet counts with the relevant groups?
The only workaround I can come up with is to do a query for the grouped records first, then do another query using the id's from that query to get the facet counts. However, the downside is that I'd lose the ability to sort or filter by any of the facet counts.
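The two-query workaround described above can be sketched in plain Python over in-memory records (not actual Solr calls; the field names and values are made up): first pick the groups, then compute the sums restricted to those groups.

```python
# Toy records standing in for Solr documents.
records = [
    {"name": "a", "v1": 5}, {"name": "a", "v1": 7},
    {"name": "b", "v1": 1},
    {"name": "c", "v1": 9}, {"name": "c", "v1": 2},
]

# Query 1: grouped records — keep the first record per name,
# as group.field=name would.
groups = {}
for r in records:
    groups.setdefault(r["name"], r)
selected = list(groups)  # the group keys fed into the second query

# Query 2: facet sums computed only for the selected groups.
sums = {name: sum(r["v1"] for r in records if r["name"] == name)
        for name in selected}
print(sums)
```

As the answer notes, the limitation is that the sums arrive after the groups were chosen, so they cannot drive the sorting or filtering of the first query.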

Cloudant using $nin: "There is no index available for this selector"

I created a JSON index in cloudant on _id like so:
{
"index": {
"fields": [ "_id"]
},
"ddoc": "mydesigndoc",
"type": "json",
"name": "myindex"
}
First off, unless I specified the index name, Cloudant somehow could not differentiate between the index I created and the default text-based index for _id (if that is truly the case, then I believe it is a bug).
I ran the following query against the _find endpoint of my db:
{
"selector": {
"_id": {
"$nin":["v1","v2"]
}
},
"fields":["_id", "field1", "field2"],
"use_index": "mydesigndoc/myindex"
}
The result was this error:
{"error":"no_usable_index","reason":"There is no index available for this selector."}
If I change "$nin":["v1","v2"] to "$eq":"v1" then it works fine, but that is not the query I am after.
So in order to get what I want, I had to add "_id": {"$gt":null} to my selector, which now looks like:
{
"selector": {
"_id": {
"$nin":["v1","v2"],
"$gt":null
}
},
"fields":["_id", "field1", "field2"],
"use_index": "mydesigndoc/myindex"
}
Why does this happen? It seems to happen only when I use the _id field in the selector.
What are the ramifications of adding "_id": {"$gt":null} to my selector? Is it going to scan the entire database rather than use the index?
I would appreciate any help, thank you.
Cloudant Query can use Cloudant's pre-existing primary index for selection and range querying without you having to create your own index on the _id field.
Unfortunately, the index doesn't really help when using the $nin operator - Cloudant would have to scan the entire database to check for documents which are not in your list - the index doesn't really get it any further forward.
By changing the operator to $eq you are playing to the strengths of the index which can be used to locate the record you need quickly and efficiently.
In short, the query you are attempting is inefficient. If your query was more complex e.g. the equivalent of WHERE colour='red' AND _id NOT IN ['a','b'] then a Cloudant index on colour could be used to reduce the data set to a reasonable level before doing the $nin operation on the remaining data.
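A small Python sketch of the point (modeling the primary index as a sorted list of ids; the values are made up): $eq is a single seek into the index, while $nin - even with "$gt": null bolted on to satisfy the planner - still has to walk every entry and filter.

```python
from bisect import bisect_left

# Toy primary index: document ids in sorted order.
ids = sorted(["v1", "v2", "v3", "v4", "v5"])

def find_eq(value):
    """$eq: one binary search into the index, then done."""
    i = bisect_left(ids, value)
    return [ids[i]] if i < len(ids) and ids[i] == value else []

def find_nin(excluded):
    """$nin (with "$gt": null making every id a candidate): there is
    no place to seek to, so every index entry is visited and filtered."""
    return [i for i in ids if i not in excluded]

print(find_eq("v1"))            # one lookup
print(find_nin(["v1", "v2"]))   # full scan of the index
```

This matches the answer above: adding "$gt": null lets the index be selected, but the work done is still proportional to the whole database.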
