I wanted to ask this question since our application is really close to the production stage and we do not want to mess up any data.
I'm using Redis as the data layer between Node and the DB, and I am doing operations that involve multiple data insertions into different tables. An example:
redisClientForStudent.select(1);
redisClientForTeacher.select(2);
redisClientForStudent.get('1', function(err, studentResult) {
    if (err) {
        // ERROR
    }
    else if (studentResult != null) {
        redisClientForTeacher.set('2', '{name:Jane}', function(err) {
            // SOME LOGIC
        });
    }
    else {
        // DATA NOT FOUND
    }
});
I want to point out that the code above is not the actual code. The reason I am going to use multiple clients is that when I tried to insert multiple pieces of data into multiple tables, the data got mixed up. I am going to try this method; however, my concern is: do I have to close each Redis client in case of an error? Another question: if using multiple clients does not prevent the data from getting mixed up, what additional precautions could be applied?
I am in the middle of building a web application for showing vehicle positions in a map view, with AngularJS as the front end and Node.js as the server for real-time updating of the map. The events come from a middleware application to the Node server, and the server then has to apply some logic to broadcast the data to the relevant connected clients.
Coming to the question: the issue I am currently facing is that around 20K vehicle records arrive at the Node server at a single time, and the Node server then has to decide which connected clients each record should be sent to. This is achieved by looping over each record against each connected client's map bounds. If an incoming record falls within a connected client's bounds, that record is emitted to that client. So this entire process takes a long time when there are 1K clients and 20K records.
Are there any ways to reduce this server overload using any Node techniques?
What I have tried: I read through Node clusters, but I think they deal with distributing connections across multiple workers. Is this a way to resolve my issue?
The sample code snippet is as follows:
Node server side logic
users  // array of connected users' socket IDs, e.g. [userSocketId1, userSocketId2]
bounds // each user's bounds keyed by socket ID, e.g. {userSocketId1: boundsValue, userSocketId2: boundsValue2}
app.post('/addObject', (req, res) => {
    for (var k = 0; k < Object.keys(req.body).length; k++) {
        var point = [{
            'lat': req.body[k].lat,
            'lng': req.body[k].lng,
            'message': req.body[k].message,
            'id': req.body[k].id
        }];
        for (var i = 0; i < users.length; i++) {
            var userBounds = bounds[users[i]];
            if (typeof userBounds !== 'undefined') {
                var inbounds = inBounds(point[0], userBounds); // check whether the current vehicle's position is within the user's bounds
                var user = users[i];
                if (inbounds) {
                    io.to(user).emit('updateMap', point); // send that vehicle data to this client
                }
            }
        }
    }
    res.send('Event received in Node Server');
});
Client-side logic for plotting vehicle info to map
socket.on('updateMap', function(msg){
    L.marker([msg.lat, msg.lng]).addTo(map);
});
The first thing you can try is to make the code asynchronous, for example by using Promises.
Without any extra library, this should work better:
app.post('/addObject', (req, res) => {
    Promise.all(Object.keys(req.body).map((k) => {
        let point = [{
            'lat': req.body[k].lat,
            'lng': req.body[k].lng,
            'message': req.body[k].message,
            'id': req.body[k].id
        }];
        // return the inner Promise.all so the outer Promise.all actually waits for it
        return Promise.all(users.map((user) => {
            let userBounds = bounds[user];
            if (typeof userBounds !== 'undefined') {
                let inbounds = inBounds(point[0], userBounds); // check whether the current vehicle's position is within the user's bounds
                if (inbounds) {
                    io.to(user).emit('updateMap', point); // send that vehicle data to this client
                }
            }
        }));
    })).then(() => {
        res.send('Event received in Node Server');
    }).catch((error) => {
        res.send(error);
    });
});
Other advantages include not having to deal with indexes and easier error handling.
It may not be enough, but you will no longer block on every request you receive, and that is already a big improvement.
For the existing architecture you need to do the following things:
Use cluster (see the sketch just below).
Implement your logic with Promises.
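To illustrate the cluster point (a generic sketch, not code from the question; './app' is assumed to be a module that starts the Express/socket.io server shown above):
const cluster = require('cluster');
const os = require('os');

if (cluster.isMaster) {
    // fork one worker per CPU core; each worker runs the same server code
    for (let i = 0; i < os.cpus().length; i++) {
        cluster.fork();
    }
    cluster.on('exit', (worker) => {
        console.log('worker ' + worker.process.pid + ' died, starting a new one');
        cluster.fork();
    });
} else {
    require('./app'); // assumed entry point that sets up Express and socket.io
}
Note that with socket.io behind cluster you also need sticky sessions and a shared adapter (for example socket.io-redis) so that any worker can emit to clients connected through other workers.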
Or you need to update your architecture: you would store each user's position along with the socket ID and user ID, and then fetch all the socket IDs that fall within your criteria.
Here the key performance factor is Mongo.
How?
If you are using multiple separate objects like
Customer => {...} (l times)
Clients => {...} (m times)
data => {client_id: 'something', ...} (n times)
then you need to loop over each data item and check it against the others, which is on the order of l * m * n.
Here is the trick that saves a lot of work:
Customer: {
    _id: Object('someid'),
    client: {
        _id: Object('someid'),
        data: {
            ...
        }
    }
}
It will decrease the looping factor and give a result on the order of l * (number of clients inside) * (number of data items inside).
In the Microsoft examples I saw two ways to check whether a DocumentDB object such as a Database, DocumentCollection, Document, etc. exists:
The first is by creating a query:
Database db = client.CreateDatabaseQuery().Where(x => x.Id == DatabaseId).AsEnumerable().FirstOrDefault();
if (db == null)
{
    await client.CreateDatabaseAsync(new Database { Id = DatabaseId });
}
The second one is by using a "try/catch" block:
try
{
    await this.client.ReadDatabaseAsync(UriFactory.CreateDatabaseUri(databaseName));
}
catch (DocumentClientException de)
{
    if (de.StatusCode == HttpStatusCode.NotFound)
    {
        await this.client.CreateDatabaseAsync(new Database { Id = databaseName });
    }
    else
    {
        throw;
    }
}
What is the correct way to do this procedure in terms of performance?
You should use the new CreateDatabaseIfNotExistsAsync in the DocumentDB SDK instead of both these approaches, if that's what you're trying to do.
In terms of server resources (request units), a ReadDocumentAsync is slightly more lightweight than CreateDatabaseQuery, so you should use that when possible.
I've just seen the try/catch example in one of the Microsoft-provided sample projects and it got me baffled, as it is plain wrong: you don't use try/catch for control flow.
Never.
This is just bad code. The new SDK provides CreateDatabaseIfNotExistsAsync, which I can only hope doesn't just hide this pattern internally. With the older library, just use the query approach, unless you want to get shouted at by whoever reviews the code.
I have been looking around for suitable ways to 'clean up' created records after tests are run using Protractor.
As an example, I have a test suite that currently runs tests on create and update screens. There is currently no delete feature, but there is a delete endpoint I can hit on the backend API.
So the approach I have taken is to record the ID of the created record so that in an afterAll I can issue a request to perform a delete operation on that record.
For example:
beforeAll(function() {
    loginView.login();
    page.customerNav.click();
    page.customerAddBtn.click();
    page.createCustomer();
});
afterAll(function() {
    helper.buildRequestOptions('DELETE', 'customers/' + createdCustomerId).then(function(options) {
        request(options, function(err, response) {
            if (response.statusCode === 200) {
                console.log('Successfully deleted customer ID: ' + createdCustomerId);
                loginView.logout();
            } else {
                console.log('A problem occurred when attempting to delete customer ID: ' + createdCustomerId);
                console.log('status code - ' + response.statusCode);
                console.log(err);
            }
        });
    });
});
//it statements below...
Whilst this works, I am unsure whether this is a good or bad approach, and if it is the latter, what the alternatives are.
I'm doing this in order to prevent a whole load of dummy test records being added over time. I know you could just clear down the database between test runs, e.g. through a script on a CI server, but it's not something we have looked into further. Plus this approach seems simpler on the face of it, but again I am unsure about the practicalities of doing this directly inside the test spec files.
Can anyone out there provide further comments or solutions?
Thanks
Well, for what it's worth, I basically use that exact same approach. We have an endpoint that can reset data for a specific user based on ID, and I hit that in a beforeAll() block to reset the data to an expected state before every run (I could have done it in afterAll as well, but sometimes people mess with the test accounts, so I do it in beforeAll). So I simply grab the user's ID and send the HTTP request, as sketched below.
I can't really speak to the practicality of it, as it was simply a task that I accomplished and it worked perfectly for me, so I saw no need for an alternative. Just wanted to let you know you are not alone in that approach :)
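For illustration, a rough sketch of what that beforeAll() reset can look like, reusing the request/helper pattern from the question; the endpoint path and testUserId here are assumptions, not my actual code:
beforeAll(function(done) {
    // hypothetical reset endpoint; substitute whatever your backend exposes
    helper.buildRequestOptions('POST', 'testUsers/' + testUserId + '/reset').then(function(options) {
        request(options, function(err, response) {
            if (err || response.statusCode !== 200) {
                console.log('Could not reset data for user ' + testUserId, err);
            }
            done(); // tell Jasmine the async setup has finished
        });
    });
});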
I'm curious if other people have alternative solutions.
The more robust solution is to mock your server with $httpBackend so you don't have to make actual calls to your API.
You can then configure server responses from your e2e test specs.
Here's a fake server example:
angular.module('myModule')
    .config(function($provide, $logProvider) {
        $logProvider.debugEnabled(true);
    })
    .run(function($httpBackend, $log) {
        var request = new RegExp('\/api\/route\\?some_query_param=([^&]*)');
        $httpBackend.whenGET(request).respond(function(method, url, data) {
            $log.debug(url);
            // see http://stackoverflow.com/questions/24542465/angularjs-how-uri-components-are-encoded/29728653#29728653
            function decode_param(param) {
                return decodeURIComponent(param.
                    replace('#', '%40').
                    replace(':', '%3A').
                    replace('$', '%24').
                    replace(',', '%2C').
                    replace(';', '%3B').
                    replace('+', '%20'));
            }
            var params = url.match(request);
            var some_query_param = decodeURIComponent(params[1]);
            return [200,
                {
                    // someResponse...
                },
                {}];
        });
    });
Then load this script in your test environment and you're done.
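One wiring detail the snippet above relies on: outside of unit tests, the mockable $httpBackend comes from angular-mocks' ngMockE2E module, so the fake server is usually registered through a dev-only wrapper module along these lines (the module names here are assumptions):
// dev-only entry module that wraps the real app with the e2e mock backend
angular.module('myModuleE2E', ['myModule', 'ngMockE2E'])
    .run(function($httpBackend) {
        // let every request that is not explicitly mocked hit the real backend
        $httpBackend.whenGET(/.*/).passThrough();
        $httpBackend.whenPOST(/.*/).passThrough();
    });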
I have a Node.js application where I connect to my CouchDB with nano using the following script:
const { connectionString } = require('../config');
const nano = require('nano')(connectionString);
// creates the database, or fails silently if it already exists
nano.db.create('foo');

module.exports = {
    foo: nano.db.use('foo')
};
This script runs on every server start, so it tries to create the database 'foo' every time the server (re)starts and just fails silently if the database already exists.
I like this idea a lot, because this way I'm maintaining the database at the application level and don't have to create databases manually when I decide to add a new one.
Taking this approach one step further, I also tried to maintain my design docs at the application level.
...
nano.db.create('foo');
const foo = nano.db.use('foo');
const design = {
    _id: "_design/foo",
    views: {
        by_name: {
            map: function(doc) {
                emit(doc.name, null);
            }
        }
    }
};

foo.insert(design, (err) => {
    if (err)
        console.log('design insert failed');
});

module.exports = {
    foo
};
Obviously this will only insert the design doc if it doesn't exist yet. But what if I have updated my design doc and want to push the new version?
I tried:
foo.get("_design/foo", (err, doc) => {
if(err)
return foo.insert(design);
design._rev = doc._rev
foo.insert(design);
})
The problem now is that the design document is updated on every server restart (i.e. it gets a new _rev every time).
Now... my question(s) :)
1: Is this a bad approach to bootstrapping my CouchDB with databases and design docs? Should I consider some migration steps as part of my deployment process instead?
2: Is it a problem that my design doc gets many _revs, basically one for every deployment and server restart, even if the document itself hasn't changed? And if so, is there a way to only update the document when it has actually changed? (I thought of manually setting the _rev to some value in my application, but I am very unsure whether that would be a good idea.)
Your approach seems quite reasonable. If the checks happen only at restarts, this won't even be a performance issue.
Too many _revs can become a problem. The history of _revs is kept as _revs_info and stored with the document itself (see the CouchDB docs for details). Depending on your setup, it might be a bad decision to create unnecessary revisions.
We had a similar challenge with some server-side scripts that required certain views. Our solution was to calculate a hash over the old and new design document and compare them. You can use any hashing function for this job, such as sha1 or md5.
Just remember to remove the _rev from the old document before hashing it, or otherwise you will get different hash values every time.
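To make that concrete, here is a minimal sketch using Node's built-in crypto module; it is not the code from our scripts, and it reuses the foo database handle and design object from the question:
const crypto = require('crypto');

// hash a design doc after stripping the field that always differs
function hashDesign(designDoc) {
    const copy = Object.assign({}, designDoc);
    delete copy._rev;
    return crypto.createHash('md5').update(JSON.stringify(copy)).digest('hex');
}

foo.get('_design/foo', (err, existing) => {
    if (err)
        return foo.insert(design);
    if (hashDesign(existing) !== hashDesign(design)) {
        design._rev = existing._rev;
        foo.insert(design); // only update when the content actually changed
    }
});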
I tried the md5 comparison as @Bernhard Gschwantner suggested. But I ran into some difficulties, because in my case I'd like to write the map/reduce functions in the design documents as pure JavaScript functions in my code:
const design = {
    _id: "_design/foo",
    views: {
        by_name: {
            map: function(doc) {
                emit(doc.name, null);
            }
        }
    }
};
while getting the design doc back from CouchDB returns the map/reduce functions converted to strings:
...
"by_name": {
"map": "function (doc) {\n emit(doc.name, null);\n }"
},
...
Obviously the md5 comparison does not really work here.
I ended up with a very simple solution: just putting a version number on the design doc:
const design = {
    _id: "_design/foo",
    version: 1,
    views: {
        by_name: {
            map: function(doc) {
                emit(doc.name, null);
            }
        }
    }
};
When I update the design doc, I simply increment the version number and compare it with the version number in the database:
const fooDesign = { ... };

foo.get('_design/foo', (err, design) => {
    if (err)
        return foo.insert(fooDesign);
    console.log('comparing foo design version', design.version, fooDesign.version);
    if (design.version !== fooDesign.version) {
        fooDesign._rev = design._rev;
        foo.insert(fooDesign, (err) => {
            if (err)
                return console.log('error updating foo design', err);
            console.log('foo design updated to version', fooDesign.version);
        });
    }
});
Revisiting your question again: in a recent project I used the great couchdb-push module by Johannes Schmidt. You get conditional updates for free, alongside many other benefits inherited from its dependency, couchdb-compile.
That library turned out to be a hidden gem for me. HIGHLY recommended!
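For reference, usage is roughly along these lines; I am quoting the shape of the call from memory of the README, so treat the exact signature and paths as assumptions and check the module's documentation:
// compiles the documents under ./couch/foo (via couchdb-compile) and pushes
// them into the foo database, only writing when something actually changed
const push = require('couchdb-push');

push('http://localhost:5984/foo', 'couch/foo', function(err, response) {
    if (err)
        return console.log('design push failed', err);
    console.log('design docs are up to date', response);
});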
I am trying to solve a problem that has been blocking me for a month.
I am building the backend of an application using Node.js and Redis, and due to our structure we have to transfer data from one Redis table to another (what I mean by "table" is the databases we switch between with "select", i.e. "select 2").
We receive a lot of requests and push a lot of responses per second, and no matter how much I tried, I could not stop the data from getting mixed up. Assume we have a "teacherID" that has to be stored in Redis table #2 and a "studentID" that has to be stored in Redis table #4. No matter what I tried (I've checked my code multiple times), I could not stop teacherID from ending up where studentID belongs. The last trick I tried was placing a callback on each select:
redisClient.select(4, function(err) {
    if (err)
        console.log("You could not select the table. Function will be aborted");
    else {
        // Proceed with the logic
    }
});
What could be the reason that I cannot simply stop this mess? One detail that drives me crazy is that it works really well locally and also online; however, whenever multiple requests reach the server, the data gets mixed up. Any suggestions to prevent this error? (Even though I cannot share the code due to an NDA, I can assure you the logic has been coded correctly.)
I'm not sure about your statement about having to "transfer data from one redis table to another". Reading through your example, it seems like you could simply have two Redis clients that write to different databases (what you called "tables").
It would look similar to this:
var redis = require("redis");
var client1 = redis.createClient(6379, '127.0.0.1');
var client2 = redis.createClient(6379, '127.0.0.1');
client1.select(2, function(err) {
    console.log('client 1 is using database 2');
});
client2.select(4, function(err) {
    console.log('client 2 is using database 4');
});
Then, wherever your read/write logic is, you just use the appropriate client:
client1.set("someKey", "teacherID", function(err){
// ...
});
client2.set("someKey", "studentID", function(err){
// ...
});
You can obviously encapsulate the above into functions with callbacks, nest the operations, use some async library, etc. to make it do whatever you need it to do. If you need to transfer values from database 2 to database 4, you could do a simple client1.get() followed by a client2.set().
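Roughly like this (a minimal sketch; the key names are just placeholders):
// copy a value from database 2 (client1) to database 4 (client2)
client1.get("someKey", function(err, value) {
    if (err || value === null) {
        return console.log("nothing to transfer", err);
    }
    client2.set("someKey", value, function(err) {
        if (err) {
            return console.log("transfer failed", err);
        }
        console.log("value copied to database 4");
    });
});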