How transparent should losing access to a Postgres-XL datanode be?

I have set up a testing Postgres-XL cluster with the following architecture:
gtm - vm00
coord1+datanode1 - vm01
coord2+datanode2 - vm02
I created a new database, which contains a table that is distributed by replication. This means that I should have an exact copy of that table on every single datanode.
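For reference, a minimal sketch of what creating such a table can look like from a coordinator, assuming psycopg2 and a database named testdb (the table and column names are made up; only the DISTRIBUTE BY REPLICATION clause matters here):

    # Sketch: create a table that Postgres-XL replicates to every datanode.
    # Host, database, table and column names are illustrative placeholders.
    import psycopg2

    conn = psycopg2.connect(host="vm01", dbname="testdb", user="postgres")
    conn.autocommit = True
    cur = conn.cursor()

    # DISTRIBUTE BY REPLICATION keeps a full copy of the table on each datanode.
    cur.execute("""
        CREATE TABLE items (
            id   integer PRIMARY KEY,
            name text
        ) DISTRIBUTE BY REPLICATION
    """)

    cur.close()
    conn.close()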
Doing operations on the table works great, I can see the changes replicated when connecting to all coordinator nodes.
However, when I simulate one of the datanodes going down, while I can still read the data in the table just fine, I cannot add or modify anything, and I receive the following error:
ERROR: Failed to get pooled connections
I am considering deploying Postgres-XL as a highly available database backend for a fair number of applications, and I cannot control how those applications interact with the database (it might be a big problem if those applications couldn't write to the database while one datanode is down).
To my understanding, Postgres-XL should achieve high availability for replicated tables in a very transparent way and should be able to tolerate losing one or more datanodes (as long as at least one is still available - again, this is just for replicated tables), but this does not seem to be the case.
Is this the intended behaviour? What can be done in order to be able to withstand having one or more datanodes down?

So, as it turns out, not transparent at all. To my jaw-dropping surprise, Postgres-XL has no built-in high availability support or recovery. Meaning if you lose one node, the database fails. And if you are using the round-robin or hash DISTRIBUTE BY options, losing a disk in one node means you have lost the entire database. I could not believe it, but that is the case.
They do have a "standby" server option, which is just a mirrored node for each node you have, but even this requires manually triggering recovery and doubles the number of nodes you need. For data protection you will have to use the DISTRIBUTE BY REPLICATION option, which is MUCH slower and again has no failover support, so you will have to manually restart the cluster and reconfigure it not to use the failing node.
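For anyone hitting the same wall, the "manually ... reconfigure it not to use the failing node" part roughly amounts to dropping the failed datanode from each surviving coordinator's node catalogue and reloading its connection pool. A hedged sketch, reusing the node/host names from the question (check the exact steps against your Postgres-XL version):

    # Sketch: tell the surviving coordinators to stop using the failed datanode
    # so writes to replicated tables can resume. Names are from the question;
    # DROP NODE and pgxc_pool_reload() are the relevant Postgres-XL commands.
    import psycopg2

    coordinators = ["vm01", "vm02"]   # coordinators that are still reachable
    failed_node = "datanode2"         # datanode that went down

    for host in coordinators:
        conn = psycopg2.connect(host=host, dbname="testdb", user="postgres")
        conn.autocommit = True
        cur = conn.cursor()
        cur.execute(f"DROP NODE {failed_node}")   # forget the failed datanode
        cur.execute("SELECT pgxc_pool_reload()")  # refresh the pooler's node list
        cur.close()
        conn.close()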
https://sourceforge.net/p/postgres-xl/mailman/message/32776225/
https://sourceforge.net/p/postgres-xl/mailman/message/35456205/

Related

Best practice or design to scale out/horizontal scale database for microservices

The main benefit of microservices is that one service "type" can be scaled out by using multiple container instances and load balancing to improve throughput.
But one thing is that multiple instances (i.e. containers) of a "service type" share the same database instance, and this could lead to a performance bottleneck when multiple instances write/read on that database instance.
Traditionally, we would scale up the processing power of that database instance to meet high demand.
The main question for me is: what is the current best practice/design/solution for scaling out/horizontally scaling, so that we can have multiple instances of that database and gain a performance improvement?
In particular, what I want to achieve is:
One instance is down, another instance can handle the load -> high availability
Can load-balance reads, or maybe even writes, across multiple database instances
Maintain the persistence and consistency of data in case I want to create more database instances
To my knowledge, one solution is Microsoft SQL Server, which provides high availability for SQL Server containers and can cover most of the requirements above (https://learn.microsoft.com/en-us/sql/linux/sql-server-linux-container-ha-overview?view=sql-server-2017). But I wonder, is there a better solution, to avoid technology lock-in?
Another solution I'm thinking of is to replicate to multiple instances by streaming CDC data from a master database instance to multiple replicas. This would allow reads from the replicas.
But I'm still not convinced, because to guarantee consistency every service instance should write to the master database instance, and this could also lead to a bottleneck on the master database instance.
There are three possible architectures for a database at a broad level:
Single leader (e.g. RDBMS)
Multi-leader (e.g. RDBMS in multiple DCs)
Leaderless (e.g. Riak, Cassandra)
As you go from top to bottom in the above list, horizontal scalability potential increases, but consistency becomes weaker.
Scalability potential increases because more nodes can accept writes as you go down the list. Consistency becomes weaker because writes take time to propagate or replicate to all nodes responsible for the data. Conflicts arise when the same record is written on two different nodes at almost the same time, so at replication time the system does not know which one is correct.
There are various conflict resolution strategies, and different databases use different ones. You need to study these strategies to understand which one suits your use case, and based on that you pick your DB.
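As a toy illustration of one such strategy, last-write-wins simply keeps whichever conflicting version carries the newest timestamp (the record layout below is made up):

    # Toy last-write-wins (LWW) conflict resolution: when two replicas accepted
    # different writes for the same key, keep the one with the later timestamp.

    def resolve_lww(version_a, version_b):
        """Return the winning version of a conflicting record."""
        return version_a if version_a["ts"] >= version_b["ts"] else version_b

    # Two replicas accepted writes for the same key at almost the same time.
    replica_a = {"key": "user:42", "value": "alice@example.com",  "ts": 1700000001.2}
    replica_b = {"key": "user:42", "value": "alice@corp.example", "ts": 1700000001.9}

    print(resolve_lww(replica_a, replica_b))  # the later write (replica_b) wins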
There is always a trade-off when making choices. A database has its limitations, and even without scaling the database we can avoid performance hits by following simple best practices. You can't leave it to the database to handle a high request rate; mind that scaling the database is an expensive option, and you will hit database limits eventually if it is not done right, so plan the whole system rather than just the database.
Coming to your point: having one master and a slave, handling writes and reads separately, is a very common approach, but you have to rely on eventual consistency; SQL Server Always On is something you can have a look at. You can cache the most frequently accessed data. If you have a very high request rate, you may need to consider queues, where you enqueue requests and dequeue them later to avoid a database performance hit.
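A rough sketch of that master/slave split plus caching, assuming a Postgres-style primary and read replica reachable via psycopg2 (host, table and column names are placeholders, and the in-memory dict stands in for a real cache):

    # Sketch of the pattern above: writes go to the master (primary), reads go
    # to the replica, and frequently read data is served from a cache first.
    import psycopg2

    primary = psycopg2.connect(host="db-primary", dbname="app", user="app")
    replica = psycopg2.connect(host="db-replica", dbname="app", user="app")
    cache = {}  # stand-in for Redis/memcached

    def write_order(order_id, payload):
        with primary, primary.cursor() as cur:
            cur.execute("INSERT INTO orders (id, payload) VALUES (%s, %s)",
                        (order_id, payload))
        cache.pop(order_id, None)  # drop any stale cached copy

    def read_order(order_id):
        if order_id in cache:      # serve hot reads from the cache
            return cache[order_id]
        with replica, replica.cursor() as cur:
            # The replica may lag slightly behind the master (eventual consistency).
            cur.execute("SELECT payload FROM orders WHERE id = %s", (order_id,))
            row = cur.fetchone()
        if row:
            cache[order_id] = row[0]
        return cache.get(order_id)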

Improving the application performance by load balancing SQL queries on Always on availability group nodes?

We have an intranet browser-based application written in ASP.NET with MS SQL Server as the database backend. One of our clients has an Always On availability group set up with two nodes. Our application requests are routed (via an availability group listener) to the primary R/W node, and our client uses the R/O node for their custom reporting (Crystal Reports).
As the number of users is growing, we're getting into performance problems - mostly CPU related.
We would like the customer to add more CPUs, while they want us to start routing read-only queries to the R/O node.
We are really hesitant because these would be application changes and really non-trivial ones:
We understand that reporting is an ideal case to be sent to the R/O node (reduces load, blocking, …). Is it a recommended practice to load balance by sending the read-only queries to the read-only node(s)?
It seems to me we would need to be very careful in terms of what we can afford. It takes some time before the R/O node is synchronized, so we would need to always understand that we could be reading old data. For example, the user clicks the “Save” button and after the record is saved, we re-read the list of records to be displayed. I assume we would have to go to the R/W node to guarantee that the new records will be there. Is that correct?
If we send queries to the R/O node, don't we degrade the robustness of the system? If one node crashes, the other node needs to be able to sustain the whole load by itself. Are there recommended scenarios when it makes sense to send requests to the R/O node and when it does not?
It is preferable to send reporting-only queries to the secondary nodes so that CPU-intensive reports do not degrade the performance of your online database, and your transactions are not affected by non-transactional usage.
However, this does not mean that you should run all R/O queries on the secondary node. Let's say you have a transactional operation that first needs a SELECT with a row lock: you shouldn't do the read on the passive node and the DML operation on the active node.
We can say that all operational queries can be run against the active node, whereas the passive node(s) are more appropriate for long-running reports.
For your second question: if the second node is configured as asynchronous, then yes, there might be some delay, and in the case of a log-shipping failure it is possible to see old data.
For the third question, it really depends on the current/future/peak-hours system load; it is hard to say one way or the other. It also depends on the budget: if you can afford it, you can add one more node. It all depends. But keep in mind that RDBMS systems are not very amenable to horizontal scaling.
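To make the routing concrete, here is a sketch using pyodbc against the availability group listener: reporting work adds ApplicationIntent=ReadOnly so read-only routing sends it to the secondary, while anything that must see just-saved data keeps the default read-write connection (listener name, driver, credentials and table names are placeholders):

    # Sketch: route by intent through the Always On listener. Connections with
    # ApplicationIntent=ReadOnly are sent to the readable secondary; the rest
    # (including the re-read after a save) go to the primary.
    import pyodbc

    BASE = ("DRIVER={ODBC Driver 17 for SQL Server};"
            "SERVER=tcp:ag-listener,1433;DATABASE=AppDb;UID=app;PWD=secret;")

    def connect(read_only=False):
        extra = "ApplicationIntent=ReadOnly;" if read_only else ""
        return pyodbc.connect(BASE + extra)

    # Reporting query: slightly stale data from the secondary is acceptable.
    with connect(read_only=True) as conn:
        rows = conn.execute("SELECT TOP 10 * FROM SalesReportView").fetchall()

    # After a save, re-read through the primary so the new record is guaranteed visible.
    with connect(read_only=False) as conn:
        conn.execute("INSERT INTO Orders (Id, Amount) VALUES (?, ?)", 1, 99.0)
        latest = conn.execute("SELECT * FROM Orders WHERE Id = ?", 1).fetchone()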

Solr master-master replication alternatives?

Currently we have two servers with a load balancer in front of them. We want to be able to turn one machine off and back on later without the user noticing it.
Our application also uses Solr, and now I want to install and configure Solr on both servers; the question is, how do I configure master-master replication?
After my initial research I found out that it's not possible :(
But what are my options here? I want both indices to stay in sync, and when a document is committed on one server it should also go to the other.
Thanks for your help!
I'm not certain of your specific use case (why turn one server off and on?), but there is no specific "master-master" replication. Solr does, however, support distributed indexing and querying via SolrCloud. From the documentation for SolrCloud:
Replication ensures redundancy for your data, and enables you to send an update request to any node in the shard. If that node is a replica, it will forward the request to the leader, which then forwards it to all existing replicas, using versioning to make sure every replica has the most up-to-date version. This architecture enables you to be certain that your data can be recovered in the event of a disaster, even if you are using Near Real Time searching.
It's a bit complex, so I'd suggest you spend some time going through the documentation, as it's not quite as simple as setting up a couple of masters and load balancing between them. It is a big step up from the previous master/slave replication that Solr used, so even if it's not a perfect fit it will be a lot closer to what you need.
https://cwiki.apache.org/confluence/display/solr/SolrCloud
https://cwiki.apache.org/confluence/display/solr/Getting+Started+with+SolrCloud
You can just create a simple master-slave replication as described here:
https://cwiki.apache.org/confluence/display/solr/Index+Replication
But be sure to send your inserts, deletes, and updates directly to the master; selects can go through the load balancer.
The other alternative is to create a third server as a master with two slaves, and the load balancer can sit in front of the two slaves.
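A small sketch of the write-to-master / read-through-the-balancer rule above, over Solr's HTTP API using the requests library (host names and the core name are placeholders):

    # Sketch: index changes go straight to the master core, searches go through
    # the load balancer in front of the slaves. Hosts/core name are placeholders.
    import requests

    MASTER = "http://solr-master:8983/solr/mycore"
    BALANCER = "http://solr-lb:8983/solr/mycore"

    def index_doc(doc):
        # Adds/updates/deletes must hit the master so replication stays one-way.
        r = requests.post(f"{MASTER}/update?commit=true", json=[doc])
        r.raise_for_status()

    def search(query):
        # Selects can be spread across the slaves via the load balancer.
        r = requests.get(f"{BALANCER}/select", params={"q": query, "wt": "json"})
        r.raise_for_status()
        return r.json()

    index_doc({"id": "1", "title": "hello"})
    print(search("title:hello"))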

couchdb replication on a lot of servers

I am currently looking at CouchDB, and I understand that I have to specify all the replications by hand. If I want to use it on 100 nodes, how would I do the replication?
Doing 99 "replicate to" and 99 "replicate from" operations on each node?
It feels like overkill, since replicating from one node already includes everything the other nodes have replicated to it.
Doing one "replicate to" the next node to form a circle (like A -> B -> C -> A)?
This would work until one node crashes; then everything waits until it comes back.
The latency would be large for a change to replicate from the first node to the last.
Isn't there a way to say: "here are 3 IPs on the full network, connect to them and share with everyone as you see fit, like an independent P2P network"?
Thanks for your insight.
BigCouch won't provide the cross-data-center stuff out of the box. Cloudant's DBaaS (based on BigCouch) does have this set up already across several data centers.
BigCouch is a sharded, "Dynamo-style" fork of Apache CouchDB; it is to be merged into the "mainline" Apache CouchDB in the future, fwiw. The shards live across nodes (servers) in the same data center. "Classic" CouchDB-style replication is used (afaik) to keep the BigCouches in the various data centers in sync.
CouchDB-style replication (n-master) is change-based, so replication only includes the latest changes.
You would need to set up to/from pairs of replications for each node/database combination. However, if all of your servers are intended to be identical, replication won't actually happen that often; it will only happen when needed.
If A gets a change, replication ships it to B and C (etc.). However, if B, having just got that change, replicates it to C before A gets the chance to (due to network latency, etc.), then when A does finally try, it will realize the data is already there and not bother sending the change again.
If this is a standard part of your setup (i.e., every time you create a db you want it replicated everywhere else), then I'd highly recommend automating the setup.
Also, check out the _replicator database. It makes it much easier to manage what's going on:
https://gist.github.com/fdmanana/832610
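For the automation part, a sketch of creating continuous replications through the _replicator database, one document per source/target pair (node addresses and the database name are made up; add authentication as needed):

    # Sketch: write one doc per (source, target) pair into the source node's
    # _replicator database to start continuous replication of a given database.
    import requests

    nodes = ["http://couch-a:5984", "http://couch-b:5984", "http://couch-c:5984"]
    db = "mydb"

    for i, source in enumerate(nodes):
        for j, target in enumerate(nodes):
            if source == target:
                continue
            doc = {
                "source": f"{source}/{db}",
                "target": f"{target}/{db}",
                "continuous": True,
            }
            # Creating the document is what starts (and tracks) the replication job.
            r = requests.put(f"{source}/_replicator/repl-{db}-{i}-{j}", json=doc)
            r.raise_for_status()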
Hope something in there is useful. :)

simple Solr deployment with two servers for redundancy

I'm deploying the Apache Solr web app on two redundant Tomcat 6 servers to provide redundancy and improved availability. At this point, scalability is not an issue.
I have a load balancer that can dynamically route traffic to one server or the other or both.
I know that Solr supports master/slave configuration, but that requires manual recovery if the slave receives updates during the master outage (which it will in my use case).
I'm considering a simpler approach using the ability to reload a core:
- only one of the two servers is receiving traffic at any time (the "active" instance), but both are running,
- both instances share the same index data and
- before re-routing traffic due to an outage, the now active instance is told to reload the index core(s)
Limited testing of failovers with both index reads and writes has been successful. What implications/issues am I missing?
Your thoughts and opinions welcomed.
The simple approach to redundancy you're considering seems reasonable, but you will not be able to use it for disaster recovery unless you can share the data/index to/from a different physical location using your NAS/SAN.
Here are some suggestions:
Make backups for disaster recovery, and test that those backups work: an index could conceivably have been corrupted, as there is no internal checksumming in Solr/Lucene. An index could get wiped, or some records could get deleted and merged away without you knowing it, and backups can be useful for recovering those records/docs later if you need to perform an investigation.
Before you re-route traffic to the second instance, I would run some queries to load the caches and also to test and confirm that the current index works before it goes online.
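A sketch of that pre-cutover step, assuming the standard Core Admin RELOAD action and plain HTTP access (host, port and core name are placeholders):

    # Sketch: reload the core on the standby instance, then run a couple of
    # warm-up/sanity queries before the load balancer routes traffic to it.
    import requests

    STANDBY = "http://solr-standby:8080/solr"
    CORE = "collection1"

    # Ask the standby instance to re-open the shared index from disk.
    r = requests.get(f"{STANDBY}/admin/cores",
                     params={"action": "RELOAD", "core": CORE})
    r.raise_for_status()

    # Warm the caches and confirm the index is usable before going online.
    for q in ("*:*", "title:smoke_test"):
        resp = requests.get(f"{STANDBY}/{CORE}/select", params={"q": q, "wt": "json"})
        resp.raise_for_status()
        print(q, resp.json()["response"]["numFound"])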
Isolate the updates to one location, process, and thread to ensure transactional integrity in the event of a cutover, as it could be difficult to manage consistency: Solr does not use a vector clock to synchronize updates like some databases do. I personally would keep an ordered copy of all updates separately from Solr in some other store, just in case a small time window needs to be replayed.
In general, my experience with Solr has been excellent as long as you are not using cutting-edge features and plugins. I have one instance that currently has 40 million docs and an uptime of well over a year with no issues. That doesn't mean you won't have issues, but it gives you an idea of how stable it can be.
I hardly know anything about Solr, so I don't know the answers to some of the questions that need to be considered with this sort of setup, but I can provide some things for consideration. You will have to consider what sorts of failures you want to protect against and why and make your decision based on that. There is, after all, no perfect system.
Both instances are using the same files. If the files become corrupt or unavailable for some reason (hardware fault, software bug), the second instance is going to fail the same as the first.
On a similar note, are the files stored and accessed in such a way that they are always valid when the inactive instance reads them? Will the inactive instance try to read the files when the active instance is writing them? What would happen if it does? If the active instance is interrupted while writing the index files (power failure, network outage, disk full), what will happen when the inactive instance tries to load them? The same questions apply in reverse if the 'inactive' instance is going to be writing to the files (which isn't particularly unlikely if it wasn't designed with this use in mind; it might for example update some sort of idle statistic).
Also, reloading the indices sounds like it could be a rather time-consuming operation, and service will not be available while it is happening.
If the active instance needs to complete an orderly shutdown before the inactive instance loads the indices (perhaps due to file validity problems mentioned above), this could also be time-consuming and cause unavailability. If the active instance can't complete an orderly shutdown, you're gonna have a bad time.
