Sorry, maybe the question is stupid, but I can't figure it out. Are all queries in MongoDB ad hoc? Or can ad-hoc queries only be executed in special cases?
It is my understanding that, at the most basic level, an ad-hoc query allows the developer to provide variables to the query, meaning the full query is only known at the time of execution.
So not all queries are ad hoc, but MongoDB does support ad-hoc queries.
An example of an ad-hoc query in Mongo would be something like:
// this example uses node.js
const results = await db.collection.find({ name: req.query.name });
In the above example, req.query.name is only known at the time of execution, thus making our query an ad-hoc query.
Please let me know if you have any questions.
I need to find a document in MongoDB using its ID. This operation should be as fast as possible, and it needs to return exactly the document with the given ID. Is there any way to do that? I am a beginner here, so I would be thankful if you could give an in-depth answer.
Okay, so you really are a beginner.
The first thing you should know is that retrieving any kind of record from a database is done by querying the database, and this is called a search.
It simply means that whenever you want any data from your database, the database engine has to search for it using the query you provide.
So whenever you ask the database (using a query) to give you some records, it will perform a search based on the conditions you provided. It does not matter whether
you provide a condition on a single unique key or a complex combination of columns or joins of multiple tables, or
your database contains no records or billions of records:
it still has to search the database.
As far as I know, the above explanation holds true for pretty much every database.
Now coming to MongoDB.
Referring to the above explanation, the MongoDB engine queries the database to get a result.
Now the main question is: how do you get the result fast?
And I think that should be your main concern.
So query speed (search speed) mainly depends on two things:
1. The query.
2. The number of records in your database.
1. Query
The factors affecting speed here are:
a. Nature of the parameters used in the query (indexed or unindexed)
If you use indexed parameters in your query, the search operation will always be faster.
For example, the _id field is indexed by default in MongoDB, so searching a collection by the _id field alone will always be a fast search.
b. Combination of parameters and operators
This refers to the number of parameters used in the query (more parameters generally mean a slower search) and the kind of query operators you use (simple query operators give results faster than aggregation operators with pipelines).
c. Read preference
Read preference describes how MongoDB routes read operations to the members of a replica set. It effectively describes the level of confidence you require in the data you are getting.
Those are the main factors, but there are other things that matter as well, such as:
the schema of your collection,
your understanding of the schema (specifically the data types of the documents),
your understanding of the query operators you use, for example when to use the $or and $and operators and when to use the $in and $nin operators.
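The effect of indexing (point a above) can be sketched in plain Python. This is only an illustration, not MongoDB internals: the "collection" is a list of dicts, the "index" is a dict mapping a field value to its documents, and all names are invented.

```python
# Minimal sketch: why an indexed lookup beats a collection scan.
documents = [
    {"_id": 1, "name": "Alice"},
    {"_id": 2, "name": "John"},
    {"_id": 3, "name": "John"},
]

# Unindexed: examine every document (O(n), a full collection scan).
def find_by_scan(docs, field, value):
    return [d for d in docs if d.get(field) == value]

# Indexed: build the index once, then each lookup jumps straight
# to the matching documents instead of scanning everything.
def build_index(docs, field):
    index = {}
    for d in docs:
        index.setdefault(d.get(field), []).append(d)
    return index

name_index = build_index(documents, "name")
assert find_by_scan(documents, "name", "John") == name_index["John"]
```

The real engine uses B-tree indexes rather than a hash map, but the trade-off is the same: a small cost at write time buys much cheaper reads on the indexed field.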
2. Number of records in your database.
This comes into play when you have an enormous amount of data: with a single database server, more records mean slower queries.
In such cases, sharding (clustering) your data across multiple database servers will give you faster search performance.
MongoDB has the mongos component, which routes your query to the right database server in the cluster. To perform such routing it uses the config servers, which store metadata about your collections, including indexes and the shard key.
Hence, in a sharded environment, choosing a proper shard key plays an important role in fast query response.
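As a rough illustration of the routing idea, here is a hash-based sketch in Python. This is an assumption-laden toy, not what mongos actually does (real routing consults chunk metadata on the config servers, and MongoDB supports both ranged and hashed shard keys); the server names are invented.

```python
import hashlib

shards = ["shard-a", "shard-b", "shard-c"]

def route(shard_key_value, shards):
    """Toy router: hash the shard key to pick a responsible server.
    A query that includes the shard key can be sent to one shard;
    a query without it must be scattered to every shard."""
    digest = hashlib.md5(str(shard_key_value).encode()).hexdigest()
    return shards[int(digest, 16) % len(shards)]

# The same key always routes to the same shard, which is what
# makes targeted (single-shard) queries possible.
assert route("user42", shards) == route("user42", shards)
```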
I hope this gives you a decent idea of how a search is actually affected by various parameters.
It's pretty straightforward; you can try the following:
var id = "89e6dd2eb4494ed008d595bd";
Model.findById(id, function (err, user) { ... });
with mongoose:
router.get("/:id", (req, res) => {
  // check that the id is a valid ObjectId before querying
  if (!mongoose.Types.ObjectId.isValid(req.params.id)) {
    return res.send("Please provide a valid id");
  }
  // findById takes the id value itself, not a filter object
  Item.findById(req.params.id)
    .then(item => {
      res.json(item);
    })
    .catch(err => res.status(404).json({ success: false }));
});
I have been browsing the internet for quite a few hours now and haven't come to a satisfactory answer for why one is better than the other. If this is situation-dependent, then what are the situations to use one over the other? It would be great if you could provide a solution to this, with an example if there can be one. I understand that since the aggregation operators came later they are probably the better option, but I have still seen people using the find() + sort() method.
You shouldn't think of this as an issue of "which method is better?", but "what kind of query do I need to perform?"
The MongoDB aggregation pipeline exists to handle a different set of problems than a simple .find() query. Specifically, aggregation is meant to allow processing of data on the database end in order to reduce the workload on the application server. For example, you can use aggregation to generate a numerical analysis on all of the documents in a collection.
If all you want to do is retrieve some documents in sorted order, use find() and sort(). If you want to perform a lot of processing on the data before retrieving the results, then use aggregation with a $sort stage.
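To make the contrast concrete, here is a hedged sketch. The comments show roughly what the two MongoDB calls might look like (collection and field names invented), and the Python function mimics the reduction a $group/$avg stage performs on the server side, so you can see what kind of work aggregation moves out of your application code:

```python
# Sorted retrieval is roughly: db.orders.find().sort({"total": -1})
# Server-side analysis is roughly:
#   db.orders.aggregate([{"$group": {"_id": "$customer",
#                                    "avg": {"$avg": "$total"}}}])
# With aggregation, the loop below runs inside the database instead
# of in your application server.

orders = [
    {"customer": "a", "total": 10},
    {"customer": "a", "total": 30},
    {"customer": "b", "total": 5},
]

def group_avg(docs, key, field):
    """Pure-Python equivalent of a $group stage with an $avg accumulator."""
    sums, counts = {}, {}
    for d in docs:
        sums[d[key]] = sums.get(d[key], 0) + d[field]
        counts[d[key]] = counts.get(d[key], 0) + 1
    return {k: sums[k] / counts[k] for k in sums}

assert group_avg(orders, "customer", "total") == {"a": 20.0, "b": 5.0}
```

If you only ever needed the sorted documents themselves, none of this machinery is required, which is exactly why find() + sort() remains the right tool for plain retrieval.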
I have been using some filters in multi-faceting queries in Solr. Right now the filters use only one value, but now I have to expand them to multiple values, and I think I have to use OR for that. I haven't done any performance testing, but I am wondering whether there is a way to stop my filter queries from being stored in the filterCache. I don't want to cache results from filter queries with more than two values. Ideally I guess I have to rely on the caching algorithm doing a good job, but I am just wondering.
Taken from here.
To tell Solr not to cache a filter, we use the same powerful local params DSL that adds metadata to query parameters and is used to specify different types of query syntaxes and query parsers. For a normal query that does not have any localParam metadata, simply prepend a local param of cache=false.
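As a small sketch of that, here is a helper that builds such an fq parameter value in Python. The field and values are invented, and this only constructs the string; it does not talk to Solr:

```python
def filter_query(field, values, cache=True):
    """Build a Solr fq parameter value, optionally prepending the
    {!cache=false} local param so the filter skips the filterCache."""
    clause = " OR ".join(f"{field}:{v}" for v in values)
    prefix = "" if cache else "{!cache=false}"
    return f"{prefix}{clause}"

# Single-value filters stay cacheable; multi-value OR filters opt out.
assert filter_query("color", ["red"]) == "color:red"
assert filter_query("color", ["red", "blue"], cache=False) == \
    "{!cache=false}color:red OR color:blue"
```

You could use the `cache` flag to apply your own rule, e.g. disable caching only when `len(values) > 2`, matching the situation described in the question.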
Cassandra's CQL has nothing like MySQL's LIKE clause for searching for more specific data in the database.
I have looked through some material and came up with some ideas:
1. Using Hadoop
2. Using a MySQL server as another database server
But are there any easier ways I can improve my Cassandra DB performance?
Improving your Cassandra DB performance can be done in many ways, but I feel like you need to query the data efficiently which has nothing to do with performance tweaks on the db itself.
As you know, Cassandra is a NoSQL database, which means that when dealing with it you are sacrificing query flexibility for fast reads/writes, scalability, and fault tolerance. That means querying the data is slightly harder. There are several patterns that can help you query the data:
Know what you need in advance. As querying with CQL is slightly less flexible than what you would find in an RDBMS engine, you can take advantage of the fast reads/writes and save the data you want to query in the proper format by duplicating it. Too complex?
Imagine you have a user entity that looks like that:
{
"pk" : "someTimeUUID",
"name": "someName",
"address": "address",
"birthDate": "someBirthDate"
}
If you persist users like that, you will get a list of users sorted in the order they joined your db (the order you persisted them). Let's assume you want the same list of users, but only those named "John". It is possible to do that with CQL, but it is slightly inefficient. What you can do to amend this problem is de-normalize your data by duplicating it so that it fits the query you are going to execute over it. You can read more about this here:
http://arin.me/blog/wtf-is-a-supercolumn-cassandra-data-model
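A minimal Python sketch of that de-normalization idea: on every write, duplicate the user into a second "table" keyed by name, so the "users named John" query becomes a direct lookup. The structure names and sample values here are invented for illustration:

```python
# Two de-normalized views of the same data, updated together on write.
users_by_time = []   # insertion order, like the original table
users_by_name = {}   # name -> users, the duplicated query table

def save_user(user):
    """Write the user to both views, paying at write time so the
    name query is cheap at read time."""
    users_by_time.append(user)
    users_by_name.setdefault(user["name"], []).append(user)

save_user({"pk": "t1", "name": "John", "address": "a1"})
save_user({"pk": "t2", "name": "Jane", "address": "a2"})
save_user({"pk": "t3", "name": "John", "address": "a3"})

# "Users named John, in join order" is now a direct lookup:
assert [u["pk"] for u in users_by_name["John"]] == ["t1", "t3"]
```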
However, this approach seems ok for simple queries, but for complex queries it is somewhat hard to achieve and also, if you are unsure what you are going to query in advance, there is no way you store the data in the proper manner beforehand.
Hadoop comes to the rescue. As you know, you can use Hadoop's MapReduce to solve tasks involving a large amount of data, and Cassandra data, in my experience, can become very, very large. With Hadoop, to solve the above example, you would iterate over the data as it is and, in each map call, check whether the user is named John; if so, you write it to the context.
Here is how the pseudocode would look:
map<data> {
    if ("John".equals(data.getColumn("name"))) {
        context.write(data);
    }
}
At the end of the map phase, you end up with a list of all users named John. You could put a time range (range slice) on the data you feed to Hadoop, which would give you all the users who joined your database over a certain period and are named John. As you can see, here you are left with a lot more flexibility and can do virtually anything. If the resulting data is small enough, you could put it in some RDBMS as summary data or cache it somewhere so further queries for the same data can easily retrieve it. You can read more about Hadoop here:
http://hadoop.apache.org/
It seems there is no way to construct a query with an OR condition. Has anyone hit this issue, and does anyone know when this will be supported, or know of any workaround?
What I want to achieve is something like this, with OR:
query = datastore.query(kind='Article',
filters=[('url', '=', 'url1'),
('url', '=', 'url2')]
)
But this filter works as AND not OR.
OR is not a supported query construct in Google Cloud Datastore.
The current way to achieve this is to construct multiple queries client-side and combine the result sets.
For reference, you should read through the Datastore Queries documentation:
The Datastore currently only supports combining filters with the AND operator. However it's relatively straightforward to create your own OR query by issuing multiple queries and combining the results:
The Python runtime supports the "IN" query filter.
Note, however, that this is just a convenience: under the hood, an "IN" query is translated into a series of independent queries, each looking for one value from the list.
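Here is a sketch of that client-side merge in pure Python. The `query_by_url` function is a stand-in for one Datastore query with a single equality filter (the entity shape and key names are invented); `or_query` issues one query per value and de-duplicates by key, which is what an emulated OR has to do:

```python
def query_by_url(entities, url):
    """Stand-in for one Datastore query with a single '=' filter."""
    return [e for e in entities if e["url"] == url]

def or_query(entities, urls):
    """Emulate OR: one query per value, results merged by entity key."""
    seen, merged = set(), []
    for url in urls:
        for e in query_by_url(entities, url):
            if e["key"] not in seen:   # de-duplicate across queries
                seen.add(e["key"])
                merged.append(e)
    return merged

articles = [
    {"key": 1, "url": "url1"},
    {"key": 2, "url": "url2"},
    {"key": 3, "url": "url3"},
]
assert [e["key"] for e in or_query(articles, ["url1", "url2"])] == [1, 2]
```

Note that, just like the built-in "IN" filter, this costs one round trip per value, and any global ordering across the combined results has to be re-applied client-side after the merge.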