Consider a query that I know will return no more than one result. Is there any performance penalty if, instead of this:
r.table('users').filter({facebookUserId: facebookUserId})
  .run(connection, function(err, cursor) {
    if (err) throw err;
    cursor.toArray(function(err, result) {
      if (err) throw err;
      // return the value
    });
  });
I use this:
const res = await r.table('users')
.filter({facebookUserId:facebookUserId})
.coerceTo("array")
.run(connection);
I am specifically referring to the coerceTo() command vs the cursor.
I would do the following:
create an index on facebookUserId so that I can use r.table('users').getAll(myInput, {index: 'facebookUserId'}), avoiding a scan of every document (waaaaay faster)
not call sequence.coerceTo() because it's one more step in a probably distributed, scalable DB
use cursor.next() as it directly provides the next element as soon as it's ready, instead of cursor.toArray(), which in this case will do the same plus check that the stream has ended (so it must be slower)
call cursor.close() as soon as I enter cursor.next()'s callback, to free resources right away
adapt my code so that I don't even need to access an array's first element, since cursor.next() doesn't uselessly wrap my document in an array
In the end:
r.table('users').getAll(facebookUserId, {index: 'facebookUserId'})
  .run(connection, function(err, cursor) {
    if (err) throw err;
    cursor.next(function(err, item) {
      cursor.close();
      if (err) throw err;
      // do something with item instead of result[0];
    });
  });
One more thing: cursor.next()'s behaviour can be reproduced using cursor.each() with return false. However, should the two methods differ in performance (I don't know, but a documentation note tends to prefer each), we would still need to wait for cursor.each() to return before closing the cursor, which may not be the best performance profile, all things considered.
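For reference, a minimal sketch of that cursor.each() alternative, assuming each's optional onFinished callback (called once iteration stops, i.e. when the callback returns false or the cursor is exhausted):

r.table('users').getAll(facebookUserId, {index: 'facebookUserId'})
  .run(connection, function(err, cursor) {
    if (err) throw err;
    cursor.each(function(err, item) {
      if (err) throw err;
      // do something with item
      return false; // stop iterating after the first document
    }, function() {
      // iteration has stopped, safe to free resources
      cursor.close();
    });
  });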
Hope this helps!
I have a simple route where I am using toArray() to return data to an external application, and it works, but I just want to return JSON objects rather than an array.
Here is my code. Is there a way to return the same data, but just not as an array?
app.get("/test/", (request, response) => {
collection.find( {"DataSet":"somevalue"}).limit(3).toArray((error, result) => {
if(error) {
return response.status(500).send(error);
}
response.send(result);
});
});
By default, MongoDB hands query results back via a "cursor" object. The toArray() method simply takes that cursor and retrieves all of the results into an array.
It looks like you're using the Node.js driver, right? In that case, you can simply iterate over the cursor instead via each() or grab objects individually via nextObject().
Both of those links direct you to different sections of the same documentation page for the cursor object. More information on operations that can be performed on cursors can be found there as well. I highly recommend digging into that documentation a bit more.
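As a minimal sketch (assuming the classic Node.js driver's each() semantics, where a null document signals that the cursor is exhausted), you could write each document to the response individually instead of buffering everything with toArray():

app.get("/test/", (request, response) => {
  const cursor = collection.find({"DataSet": "somevalue"}).limit(3);
  cursor.each((error, doc) => {
    if (error) {
      return response.status(500).send(error);
    }
    if (doc === null) {
      // each() signals exhaustion with a null document
      return response.end();
    }
    // one JSON object per line, not wrapped in an array
    response.write(JSON.stringify(doc) + "\n");
  });
});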
If you simply want to return the query result (MongoDB?), it should be as follows:
app.get("/test/", (request, response) => {
response.send(collection.find( {"DataSet":"somevalue"} ).limit(3))
});
https://docs.mongodb.com/stitch/mongodb/actions/collection.find/
I had a look at the Bluebird promise FAQ, in which it mentions that .then(success, fail) is an antipattern. I don't quite understand its explanation regarding try and catch.
What's wrong with the following?
some_promise_call()
.then(function(res) { logger.log(res) }, function(err) { logger.log(err) })
It seems that the example is suggesting the following to be the correct way.
some_promise_call()
.then(function(res) { logger.log(res) })
.catch(function(err) { logger.log(err) })
What's the difference?
What's the difference?
The .then() call will return a promise that will be rejected if the callback throws an error. This means that when your success logger fails, the error is passed to the following .catch() callback, but not to the fail callback that sits alongside success.
Here's a control flow diagram:
To express it in synchronous code:
// some_promise_call().then(logger.log, logger.log)
then: {
try {
var results = some_call();
} catch(e) {
logger.log(e);
break then;
} // else
logger.log(results);
}
The second log (which is like the first argument to .then()) will only be executed if no exception happened. The labelled block and the break statement feel a bit odd; this is actually what Python has try-except-else for (recommended reading!).
// some_promise_call().then(logger.log).catch(logger.log)
try {
var results = some_call();
logger.log(results);
} catch(e) {
logger.log(e);
}
The catch logger will also handle exceptions from the success logger call.
So much for the difference.
I don't quite understand its explanation as for the try and catch
The argument is that usually, you want to catch errors in every step of the processing and that you shouldn't use it in chains. The expectation is that you only have one final handler which handles all errors - while, when you use the "antipattern", errors in some of the then-callbacks are not handled.
However, this pattern is actually very useful: when you want to handle errors that happened in exactly this step, and you want to do something entirely different when no error happened - i.e. when the error is unrecoverable. Be aware that this branches your control flow. Of course, this is sometimes desired.
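For illustration, a minimal sketch of that deliberate branching (fetchUser and the two render functions are hypothetical stand-ins):

// Hypothetical helpers, defined only so the sketch runs.
function fetchUser(id) {
  return Promise.resolve({ id: id, name: 'Ada' });
}
function renderProfile(user) { console.log('profile:', user.name); }
function renderOfflineFallback(err) { console.log('offline fallback:', err); }

fetchUser(42).then(
  user => renderProfile(user),       // runs only if fetchUser succeeded
  err => renderOfflineFallback(err)  // runs only if fetchUser itself failed
);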
What's wrong with the following?
some_promise_call()
.then(function(res) { logger.log(res) }, function(err) { logger.log(err) })
That you had to repeat your callback. You'd rather want
some_promise_call()
.catch(function(e) {
return e; // it's OK, we'll just log it
})
.done(function(res) {
logger.log(res);
});
You also might consider using .finally() for this.
The two aren't quite identical. The difference is that the first example won't catch an exception that's thrown in your success handler. So if your method should only ever return resolved promises, as is often the case, you need a trailing catch handler (or yet another then with an empty success parameter). Sure, it may be that your then handler doesn't do anything that might potentially fail, in which case using one 2-parameter then could be fine.
But I believe the point of the text you linked to is that then is mostly useful versus callbacks in its ability to chain a bunch of asynchronous steps, and when you actually do this, the 2-parameter form of then subtly doesn't behave quite as expected, for the above reason. It's particularly counterintuitive when used mid-chain.
As someone who's done a lot of complex async stuff and bumped into corners like this more than I care to admit, I really recommend avoiding this anti-pattern and going with the separate handler approach.
By looking at the advantages and disadvantages of both, we can make a calculated guess as to which is appropriate for the situation.
These are the two main approaches to implementing promises. Both have their pluses and minuses.
Catch Approach
some_promise_call()
.then(function(res) { logger.log(res) })
.catch(function(err) { logger.log(err) })
Advantages
All errors are handled by one catch block.
Even catches any exception in the then block.
Chaining of multiple success callbacks
Disadvantages
In case of chaining it becomes difficult to show different error messages.
Success/Error Approach
some_promise_call()
.then(function success(res) { logger.log(res) },
function error(err) { logger.log(err) })
Advantages
You get fine-grained error control.
You can have a common error-handling function for various categories of errors, like DB errors, 500 errors, etc.
Disadvantages
You will still need another catch if you wish to handle errors thrown by the success callback, as sketched below.
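A minimal sketch of that remaining gap, reusing the question's some_promise_call and logger: the two-argument form's error callback never sees an exception thrown by its sibling success callback.

some_promise_call()
  .then(
    function success(res) { throw new Error('thrown inside success'); },
    function error(err) { logger.log(err); } // not called for the error above
  )
  .catch(function (e) { logger.log(e); }); // the throw lands here instead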
A simple explanation:
In ES2018
When the catch method is called with argument onRejected, the
following steps are taken:
Let promise be the this value.
Return ? Invoke(promise, "then", « undefined, onRejected »).
that means:
promise.then(f1).catch(f2)
equals
promise.then(f1).then(undefined, f2)
Using .then().catch() enables promise chaining, which is often required to fulfil a workflow. You may need to read some information from a database, then pass it to an async API, then manipulate the response, then push it back into the database. Handling all of these steps with per-step handlers is doable but very hard to manage. The better solution is then().then().then().then().catch(), which receives all errors in just one catch and lets you keep the code maintainable.
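A minimal sketch of such a chain (readFromDb, callApi, transform, and saveToDb are hypothetical helpers standing in for the steps above; logger is borrowed from the earlier snippets):

readFromDb()
  .then(record => callApi(record))       // hand the DB data to an async API
  .then(response => transform(response)) // manipulate the response
  .then(result => saveToDb(result))      // push it back into the database
  .catch(err => logger.log(err));        // one catch for a failure in any step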
Using then() and catch() helps chain success and failure handlers on the promise. catch() works on the promise returned by then(). It handles:
If the promise was rejected. See #3 in the picture.
If an error occurred in the success handler of then(), between lines 4 and 7 below. See #2.a in the picture.
(The failure callback on then() does not handle this.)
If an error occurred in the failure handler of then(), line 8 below. See #3.b in the picture.
1. let promiseRef: Promise = this.aTimetakingTask(false);
2. promiseRef
3.   .then(
4.     (result) => {
5.       /* successfully resolved promise.
6.          Work on data here */
7.     },
8.     (error) => console.log(error)
9.   )
10. .catch( (e) => {
11.   /* handle a rejection, or an error
12.      thrown by the handlers above */
13. });
Note: Many times, a failure handler might not be defined if catch() is written already.
EDIT: reject() results in invoking catch() only if the error handler in then() is not defined. Notice #3 in the picture goes to the catch(). It is invoked when the handlers on lines 8 and 9 are not defined.
It makes sense because promise returned by then() does not have an error if a callback is taking care of it.
Instead of words, a good example. The following code (if the first promise is resolved):
Promise.resolve()
  .then(
    () => { throw new Error('Error occurs'); },
    err => console.log('This error is caught:', err)
  );
is identical to:
Promise.resolve()
  .catch(
    err => console.log('This error is caught:', err)
  )
  .then(
    () => { throw new Error('Error occurs'); }
  );
But with a rejected first promise, this is not identical:
Promise.reject()
  .then(
    () => { throw new Error('Error occurs'); },
    err => console.log('This error is caught:', err)
  );

Promise.reject()
  .catch(
    err => console.log('This error is caught:', err)
  )
  .then(
    () => { throw new Error('Error occurs'); }
  );
I have been looking around for suitable ways to 'clean up' created records after tests are run using Protractor.
As an example, I have a test suite that runs tests on create and update screens. There is currently no delete feature; however, there is a delete endpoint I can hit on the backend API.
So the approach I have taken is to record the id of the created record so that in an afterAll I can then issue a request to perform a delete operation on the record.
For example:
beforeAll(function() {
  loginView.login();
  page.customerNav.click();
  page.customerAddBtn.click();
  page.createCustomer();
});

afterAll(function() {
  helper.buildRequestOptions('DELETE', 'customers/' + createdCustomerId).then(function(options) {
    request(options, function(err, response) {
      if (response.statusCode === 200) {
        console.log('Successfully deleted customer ID: ' + createdCustomerId);
        loginView.logout();
      } else {
        console.log('A problem occurred when attempting to delete customer ID: ' + createdCustomerId);
        console.log('status code - ' + response.statusCode);
        console.log(err);
      }
    });
  });
});
//it statements below...
Whilst this works, I am unsure whether this is a good or bad approach, and if the latter, what are the alternatives.
I'm doing this in order to prevent a whole load of dummy test records being added over time. I know you could just clear down the database between test runs, e.g. through a script or similar on, say, a CI server, but it's not something I/we have looked into further. Plus this approach seems, on the face of it, simpler, but again I am unsure about the practicalities of such an approach directly inside the test spec files.
Can anyone out there provide further comments/solutions?
Thanks
Well, for what it's worth, I basically use that exact same approach. We have an endpoint that can reset data for a specific user based on ID, and I hit that in a beforeAll() block to reset the data to an expected state before every run (I could have done it in afterAll as well, but sometimes people mess with the test accounts, so I do it in beforeAll). So I simply grab the user's ID and send the HTTP request.
I can't really speak to the practicality of it, as it was simply a task that I accomplished and it worked perfectly for me so I saw no need for an alternative. Just wanted to let you know you are not alone in that approach :)
I'm curious if other people have alternative solutions.
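For concreteness, a minimal sketch of that beforeAll reset. The 'testUsers/.../reset' endpoint and testUserId are hypothetical; helper.buildRequestOptions is borrowed from the question's own code:

beforeAll(function(done) {
  // Hypothetical reset endpoint for the test user.
  helper.buildRequestOptions('POST', 'testUsers/' + testUserId + '/reset').then(function(options) {
    request(options, function(err, response) {
      if (err || response.statusCode !== 200) {
        console.log('Could not reset test data for user ID: ' + testUserId);
      }
      done(); // let the suite continue either way
    });
  });
});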
The more robust solution is to mock your server with $httpBackend so you don't have to make actual calls to your API.
You can then configure server responses from your e2e test specs.
Here's a fake server example:
angular.module('myModule')
  .config(function($provide, $logProvider) {
    $logProvider.debugEnabled(true);
  })
  .run(function($httpBackend, $log) {
    var request = new RegExp('\/api\/route\\?some_query_param=([^&]*)');
    $httpBackend.whenGET(request).respond(function(method, url, data) {
      $log.debug(url);
      // see http://stackoverflow.com/questions/24542465/angularjs-how-uri-components-are-encoded/29728653#29728653
      function decode_param(param) {
        return decodeURIComponent(param.
          replace('#', '%40').
          replace(':', '%3A').
          replace('$', '%24').
          replace(',', '%2C').
          replace(';', '%3B').
          replace('+', '%20'));
      }
      var params = url.match(request);
      var some_query_param = decodeURIComponent(params[1]);
      return [200,
        {
          // someResponse...
        }, {}];
    });
  });
Then load this script in your test environment and you're done.
I have a Node.js application where I connect to my CouchDB using nano with the following script:
const { connectionString } = require('../config');
const nano = require('nano')(connectionString);
// creates the database, or fails silently if it exists
nano.db.create('foo');
module.exports = {
foo: nano.db.use('foo')
}
This script runs on every server start, so it tries to create the database 'foo' every time the server (re)starts and just fails silently if the database already exists.
I like this idea a lot because this way I'm actually maintaining the database at the application level and don't have to create databases manually when I decide to add a new database.
Taking this approach one step further I also tried to maintain my design docs from application level.
...
nano.db.create('foo');
const foo = nano.db.use('foo');
const design = {
_id: "_design/foo",
views: {
by_name: {
map: function(doc) {
emit(doc.name, null);
}
}
}
}
foo.insert(design, (err) => {
if(err)
console.log('design insert failed');
})
module.exports = {
foo
}
Obviously this will only insert the design doc if it doesn't exist yet. But what if I've updated my design doc and want to push the update?
I tried:
foo.get("_design/foo", (err, doc) => {
if(err)
return foo.insert(design);
design._rev = doc._rev
foo.insert(design);
})
The problem now is that the design document is updated every time the server restarts (i.e. it gets a new _rev on every restart).
Now... my question(s) :)
1: Is this a bad approach for bootstrapping my CouchDB with databases and designs? Should I consider some migration steps as part of my deployment process?
2: Is it a problem that my design doc gets many _revs, basically one for every deployment and server restart, even if the document itself hasn't changed? And if so, is there a way to only update the document if it changed? (I thought of manually setting the _rev to some value in my application, but I'm very unsure that would be a good idea.)
Your approach seems quite reasonable. If the checks happen only at restarts, this won't even be a performance issue.
Too many _revs can become a problem. The history of _revs is kept as _revs_info and stored with the document itself (see the CouchDB docs for details). Depending on your setup, it might be a bad decision to create unnecessary revisions.
We had a similar challenge with some server-side scripts that required certain views. Our solution was to calculate a hash over the old and new design document and compare them. You can use any hashing function for this job, such as sha1 or md5.
Just remember to remove the _rev from the old document before hashing it, or you will get different hash values every time.
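A minimal sketch of that comparison, assuming Node's built-in crypto module and reusing the question's foo and design variables. Note that the local map functions have to be serialized to strings the same way CouchDB stores them, a point the next answer runs into:

const crypto = require('crypto');

// Hash a design doc, ignoring _rev and serializing functions as strings.
function hashDesign(doc) {
  const normalized = JSON.parse(JSON.stringify(doc, (key, value) =>
    typeof value === 'function' ? value.toString() : value));
  delete normalized._rev;
  return crypto.createHash('md5').update(JSON.stringify(normalized)).digest('hex');
}

foo.get('_design/foo', (err, existing) => {
  if (err) return foo.insert(design); // not there yet: create it
  if (hashDesign(existing) !== hashDesign(design)) {
    design._rev = existing._rev;
    foo.insert(design); // content changed: update in place
  }
});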
I tried the md5 comparison like @Bernhard Gschwantner suggested. But I ran into some difficulties, because in my case I'd like to write the map/reduce functions in the design documents in pure JavaScript in my code.
const design = {
_id: "_design/foo",
views: {
by_name: {
map: function(doc) {
emit(doc.name, null);
}
}
}
}
while getting the design doc from CouchDB returns the map/reduce functions converted to strings:
...
"by_name": {
"map": "function (doc) {\n emit(doc.name, null);\n }"
},
...
Obviously md5 comparison does not really work here.
I ended up with a very simple solution: just putting a version number on the design doc:
const design = {
_id: "_design/foo",
version: 1,
views: {
by_name: {
map: function(doc) {
emit(doc.name, null);
}
}
}
}
When I update the design doc, I simply increment the version number and compare it with the version number in the database:
const fooDesign = {...}

foo.get('_design/foo', (err, design) => {
  if (err)
    return foo.insert(fooDesign);
  console.log('comparing foo design version', design.version, fooDesign.version);
  if (design.version !== fooDesign.version) {
    fooDesign._rev = design._rev;
    foo.insert(fooDesign, (err) => {
      if (err)
        return console.log('error updating foo design', err);
      console.log('foo design updated to version', fooDesign.version);
    });
  }
});
Revisiting your question again: in a recent project I used the great couchdb-push module by Johannes Schmidt. You get conditional updates for free, along with many other benefits inherited from its dependency, couchdb-compile.
That library turned out to be a hidden gem for me. HIGHLY recommended!
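A hedged usage sketch, assuming the push(db, source, callback) signature from my reading of the module's README (verify against its docs before relying on this); 'couch/foo' is a hypothetical source directory holding the design documents:

const push = require('couchdb-push');

// Compiles the source directory into a design doc and pushes it,
// only writing when the document actually changed.
push('http://localhost:5984/foo', 'couch/foo', function(err, response) {
  if (err) return console.log('push failed', err);
  console.log('design docs up to date', response);
});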
I am trying to do a simple query against a Cassandra cluster using the Node.js Cassandra driver. I am following the examples, but it seems like my callback functions are getting called even if there are no results in the returned set.
var q = function (param) {
  var query = 'SELECT * FROM my_table WHERE my_col=?';
  var params = [param];
  console.log("Cassandra query is being called with parameters " + params);
  client.execute(query, params, function (err, result) {
    // check err before touching result, which may be undefined on error
    if (err != null) {
      console.log("Got error");
      return;
    }
    console.log("Cassandra callback is being called. Row count is " + result.rows.length);
    if (result.rows.length <= 0) {
      console.log("Sending empty response.");
    } else {
      console.log("Sending non-empty response.");
    }
  });
};
q('value');
q('value');
Outputs
Cassandra query is being called with parameters value
Cassandra query is being called with parameters value
Cassandra callback is being called. Row count is 0
Sending empty response.
Cassandra callback is being called. Row count is 1
Sending non-empty response.
This happens fairly consistently, but sometimes both calls come up empty, and sometimes both calls return values.
I guess I'm doing something wrong with the async calls here, but I'm not really sure what it is.
I think this was due to a bad node in my cluster. One of the nodes was missing a bunch of data, and so one third of the time I would get back no results.
When you open the connection, do you send down a consistency command?
I had this issue as well. When I open a connection, I call execute and send use <keyspace>; followed by consistency local_quorum;. This tells Cassandra to look a little harder for the answer and agree with other nodes that you are getting the latest version.
Note: I keep my connections around for a while, so this doesn't add any meaningful overhead (for me).
Note 2: The above doesn't seem to work from my Python app anymore (using Cassandra 3.5). However, using SimpleStatement and setting the consistency before calling execute does: https://datastax.github.io/python-driver/getting_started.html#setting-a-consistency-level
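For the Node.js driver from the question, a minimal sketch of the same idea, passing a consistency level in the query options (the keyspace and data center names are placeholders; whether this masks or fixes your cluster's underlying inconsistency is a separate question):

const cassandra = require('cassandra-driver');

const client = new cassandra.Client({
  contactPoints: ['127.0.0.1'],
  localDataCenter: 'datacenter1', // required by newer driver versions
  keyspace: 'my_keyspace'         // placeholder keyspace name
});

client.execute(
  'SELECT * FROM my_table WHERE my_col=?',
  ['value'],
  { prepare: true, consistency: cassandra.types.consistencies.localQuorum },
  function (err, result) {
    if (err) throw err;
    console.log('Row count is ' + result.rows.length);
  }
);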
I hope this helps.