Synchronising different versions of a database - django - database

I have built a Django web app (PostgreSQL backend) for internal use by one of my clients in New Zealand.
They have told me they would also like one of their branches in Malaysia to use it (it will need to be connected to the same database). The problem is that the internet connection in Malaysia is apparently very unreliable, with a lot of downtime.
So here is the question: what would be the best way to keep the Malaysian branch running when their internet is down, and to keep their version of the database synchronised with the main database back here in NZ?

What you are trying to do is synchronize your data and schema across multiple (two, in your case) PostgreSQL databases.
There are a variety of solutions, depending on exactly what you want to achieve. This is a good place to start: http://www.postgresql.org/docs/devel/static/high-availability.html
A summary of the different solutions, with the pros and cons of each, is listed here:
http://www.postgresql.org/docs/devel/static/different-replication-solutions.html#HIGH-AVAILABILITY-MATRIX
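On the Django side, one possible piece of such a setup is a database router. The sketch below assumes (this is not stated in the question) that one of the replication options above gives the Malaysian office a local read-only standby, registered in settings under the hypothetical alias "replica", with the NZ primary as "default". The class and alias names are illustrative only.

```python
class ReplicaRouter:
    """Hypothetical router: reads use the local standby, writes use the primary."""

    primary_alias = "default"   # assumed alias of the NZ primary in DATABASES
    replica_alias = "replica"   # assumed alias of the local read-only standby

    def db_for_read(self, model, **hints):
        # Reads are served locally, so they keep working while the link is down.
        return self.replica_alias

    def db_for_write(self, model, **hints):
        # Writes still go to the primary and therefore need the link to be up.
        return self.primary_alias

    def allow_relation(self, obj1, obj2, **hints):
        # Both aliases hold the same data, so relations between them are fine.
        return True

    def allow_migrate(self, db, app_label, model_name=None, **hints):
        # Schema changes are applied on the primary only; replication carries
        # them over to the standby.
        return db == self.primary_alias
```

Such a router would be enabled through the DATABASE_ROUTERS setting, for example with a hypothetical path like "myapp.routers.ReplicaRouter". Note that this only keeps reads available during an outage; accepting writes offline needs one of the multi-master or application-level approaches.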

Related

keeping databases in sync (after write/update) across regions/zones

I have to write a web service in PHP to serve three different zones (cities or countries). Each zone will have its own machine running an instance of this web service. Behind every instance is a database that is an exact copy of the databases in the other regions, and the web service serves clients with data from that database. The main reason for running multiple instances is to distribute client load.
The clients can make read and write calls via web service APIs.
A write call modifies the database for that instance, but because the databases in every zone are exact copies of one another, the change has to be applied to the databases in all other zones as soon as possible.
I presume the write calls must go to some kind of master server which coordinates among all the web services etc. But I am sure this pattern is quite common and some solution is already out there.
Please advise whether there is any database-level or application-level technique that would keep the databases in sync on write calls, so that every modification or addition is reflected in all instances of the database. I am free to choose the database; my first choice would be MySQL or PostgreSQL, but I can switch to another database if it solves this issue.
You're right, this pattern is quite common and there is a name for it: synchronous master-master replication. Most modern RDBMSs support it, natively or through add-ons:
PostgreSQL supports it through PgCluster: https://wiki.postgresql.org/wiki/PgCluster
MySQL: https://www.howtoforge.com/mysql_master_master_replication
But before implementing it, I'd recommend reading more about the different types of replication and their pros and cons:
https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling
https://dev.mysql.com/doc/refman/8.0/en/replication.html
Synchronous master-master replication will be quite slow, especially in a multi-zone scenario, because every write has to wait for the other zones to confirm it, so you might consider other techniques:
Asynchronous replication
Sharding/Partitioning
A mix of sharding and replication
There is a very good book on different distributed techniques (including sharding and replication): "Designing Data-Intensive Applications" by Martin Kleppmann.
Replication techniques are definitely worth looking at, but replication can carry a certain amount of technical overhead and cost. I work for a company called Redactics (https://www.redactics.com), and we came up with a simpler solution: a kind of near-real-time replication based on delta updates, using a pure SQL approach.
There are certainly pros and cons to both approaches, and I'm not trying to push Redactics hard if it is not the most appropriate solution for your needs. Redactics simply tracks the most recent primary keys and uses modification timestamps to find new and changed records, then copies them over. You can run the sync quite often without much load, since it is just a delta update. Any workflow can break, but repairing broken replication can be tricky, so we like this approach, along with running these sync workflows within your own infrastructure.
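The general idea of a delta update driven by primary keys and modification timestamps can be sketched as follows. This is a minimal illustration, not Redactics' actual implementation: the orders table, its columns, the integer timestamps, and the use of SQLite in place of MySQL or PostgreSQL are all assumptions made so that the example is self-contained.

```python
import sqlite3


def create_table(conn):
    # Hypothetical table; updated_at is an integer timestamp set on every write.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders ("
        "id INTEGER PRIMARY KEY, item TEXT NOT NULL, updated_at INTEGER NOT NULL)"
    )


def sync_delta(source, target, last_id, last_ts):
    """Copy rows that are new (id > last_id) or changed (updated_at > last_ts).

    Returns the new (last_id, last_ts) watermarks and the number of rows copied.
    """
    rows = source.execute(
        "SELECT id, item, updated_at FROM orders "
        "WHERE id > ? OR updated_at > ? ORDER BY id",
        (last_id, last_ts),
    ).fetchall()
    for row in rows:
        # Insert new rows and overwrite rows that already exist on the target.
        target.execute(
            "INSERT OR REPLACE INTO orders (id, item, updated_at) VALUES (?, ?, ?)",
            row,
        )
    target.commit()
    new_last_id = max([last_id] + [r[0] for r in rows])
    new_last_ts = max([last_ts] + [r[2] for r in rows])
    return new_last_id, new_last_ts, len(rows)


source = sqlite3.connect(":memory:")
target = sqlite3.connect(":memory:")
create_table(source)
create_table(target)
source.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)", [(1, "pen", 100), (2, "ink", 110)]
)
source.commit()
last_id, last_ts, copied = sync_delta(source, target, 0, 0)
```

Each later run passes in the watermarks returned by the previous one, so only the delta is read. A sketch like this does not propagate deletes; those need a separate mechanism such as a soft-delete flag.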

Where should I access my database

I'm curious how you would handle the following database access scenario.
Suppose you have a computer that hosts your database as part of its server work, and multiple client PCs running client-side software that needs to get information from this database.
AFAIK there are 2 ways to do this:
Each client-side program connects directly to the database.
Each client-side program connects to server-side software, which connects to the database as some sort of data access layer.
So what I'd like to know is:
What are the pros and cons of each solution?
And are there other solutions out there that may be "better" for this work?
I would DEFINITELY go with suggestion number 2. No client application should talk to a datastore without a broker, i.e.:
ClientApp -> WebApi -> DatabaseBroker.class -> MySQL
This is the sound way to do it as you separate concerns and define an organized throughput to the datastore.
Some benefits are:
it decouples the client from the database
you can centralize all upgrades, additions and operability in one location (DatabaseBroker.class) for all clients
it's very scalable
it's safe with regard to business logic
Think of it in terms of this layman's example:
Marines are not allowed to bring their own weapons to battle (client apps talking directly to the DB). Instead, they check out the weapon from the armory (API). The armory has control over all weapons, repairs and upgrades (data from the database) and determines who gets what.
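The ClientApp -> WebApi -> DatabaseBroker chain above can be sketched roughly as follows. This is only an illustration under assumptions: SQLite stands in for MySQL so the example runs on its own, and the customers table, the method names, and the handle_request function (a stand-in for the WebApi layer) are all hypothetical.

```python
import sqlite3


class DatabaseBroker:
    """Single gateway to the datastore; clients never see SQL or connections."""

    def __init__(self, path=":memory:"):
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS customers ("
            "id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
        )

    def add_customer(self, name):
        # Business rules live here, in one place, for every client.
        name = name.strip()
        if not name:
            raise ValueError("customer name must not be empty")
        cur = self._conn.execute("INSERT INTO customers (name) VALUES (?)", (name,))
        self._conn.commit()
        return cur.lastrowid

    def get_customer(self, customer_id):
        row = self._conn.execute(
            "SELECT id, name FROM customers WHERE id = ?", (customer_id,)
        ).fetchone()
        return None if row is None else {"id": row[0], "name": row[1]}


def handle_request(broker, action, payload):
    """Stand-in for the WebApi layer: maps client requests to broker calls."""
    if action == "add_customer":
        return {"id": broker.add_customer(payload["name"])}
    if action == "get_customer":
        return broker.get_customer(payload["id"])
    raise ValueError(f"unknown action: {action}")
```

A client would only ever call something like handle_request(broker, "add_customer", {"name": "Ada"}); swapping the storage engine or changing the schema then touches the broker alone.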
What you have described sounds like two different kinds of multitier architecture.
The first option corresponds to a two-tier architecture and the second could be a three-tier one.
"AFAIK there are 2 ways to do this"
You can divide your application into several physical tiers, so there are more variants of this (n-tier) architecture than the two described above.
"What are the pros and cons of each solution?"
Usually the motivation for splitting your application into tiers is to meet some non-functional requirement (maintainability, availability, security, etc.). The problem is that when you add extra tiers you also add complexity, e.g. your app components need to communicate with each other, and this is more difficult when they are distributed among several machines.
"And are there other solutions out there that may be "better" for this work?"
I'm not sure what you mean by "work" here, but note that you don't need to add extra tiers to access a database. If you have a desktop application installed on a few machines, a classical client/server (two-tier) model should be enough. However, a web-based application needs an extra tier for interacting with the browser. In this case the database access is not the motivation for adding this extra tier.

Multiple Raven Databases with different Replication strategies

RavenDB: creating multiple databases to support different replication strategies.
Recently I was tasked with creating an additional Raven database to store information pertaining to users. So the solution I am working on would have some information in one Raven database and user information in another. The reason for the request is so that we can support different replication strategies for the two databases. As I understand it, Raven only supports a single replication strategy per database.
First, I would like to know if anyone has created an application with two Raven databases.
Second, I would like to know what problems you might have encountered, and to get a general sense of what issues I can plan for or mitigate early on.
Thank you in advance.
Having multiple Raven databases is possible, but only advisable in certain situations.
If each database could potentially be on a different server (as one would assume since you're talking about replicating differently) then each must have its own DocumentStore, which is fairly expensive to set up, but this only should happen once at application startup anyway, and you're talking about 2, not 50.
As Matt mentions in the comments, if you have two databases on the same server, then you can use the same DocumentStore and specify the database name when you open the session.
Each database should be for logically very different things. You won't easily be able to commingle data between the two databases. If a document in one database contained a reference to a document id in the other database, you wouldn't be able to use the Include features to get both of those documents in one round trip - there would essentially be a wall between the databases. Indexes could not span between the databases, for example.
Accessing both databases would require spinning up an IDocumentSession for each, both of which would need to be managed separately. If you're managing your document sessions at an infrastructure level (i.e. one session per HTTP request) then having two complicates things quite a bit.
However, if you have a segmented type of application, this can work quite well. For example, if you had a Users database that provided single sign-on across multiple websites (or areas of a website) then this could be a good fit. On most pages the user info would be essentially read-only (like to display the black bar at the top of Stack Overflow), except for the user management pages.
This could also be common if you're going for a Udi Dahan-style SOA application structure, where you define service boundaries and each service has its own independent database.

Application database/instance decomposition

I'm designing a service that will serve some business entities. Logically it will be divided into two parts:
Frontend - bells and whistles like Wiki, Pricing, Landing Page, maybe account information (billing, account status, and so on).
The service itself, where the business entity's employees will do their work.
It is a Play 2.x framework application, and I am planning to host it on Heroku.
It is not yet clear how to decompose the instances and the database.
Should I split the database per client, i.e. one database per business entity? Or should I store all data in one database and add to every table the id of the business entity that owns each row? What issues (performance, administrative, scaling) may come up with this decision?
If I choose to split the databases, how can I do this? I would need to launch an app instance with the database for the client that instance belongs to. That gives us non-uniform instances, which can be an obstacle to scaling. And as far as I know, Heroku doesn't support non-uniform (web) instances.
Please help, I'm totally stuck here.
Expected stack:
Scala
Play 2.0
Anorm
JDBC
PostgreSQL
Heroku
All of these (except Scala, and maybe Play 2.0) are interchangeable.
This is a pretty classic problem. You have many clients and you wonder if you should create separate databases for each client - or if they should share a database.
I would recommend starting with one shared database and using that until you outgrow it. Think of some of the disadvantages of giving each client their own database instance:
As you mention, schema management can be tough. You'd need to write tools to maintain all databases across all servers.
If you tell clients you have structured your system this way, some of them might push you to fork the database. In other words they might argue, "I have my own database! I want a new table just for me."
It's a bit harder to run queries across servers/databases. If you wanted to count how many items all clients have, you'd have to think about that a bit.
But if you want to start by sharding based on client (http://en.wikipedia.org/wiki/Shard_(database_architecture)), you might consider:
As mentioned previously, you'll need some tools/scripts to launch a new database instance for a client. Often those tools will need to "seed" the database with configuration information - like populating a states table for addresses.
You'll want to have a tool to monitor/maintain the databases. Start one, stop another, see if one has high CPU usage etc.
You'll need some kind of system to aggregate statistics across all clients.
You'll need a tool to roll out schema changes and a plan on how you can gracefully upgrade the database while their web application is running.
Overall I would advise starting small and simple, and only worrying about scale when you get there.
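The shared-database option, where every table carries the id of the owning business entity, can be sketched as below. The question's stack is Scala with Anorm; Python with SQLite is used here only to keep the illustration self-contained, and the items table, the tenant_id column, and the TenantScope helper are hypothetical names.

```python
import sqlite3


def open_shared_db():
    conn = sqlite3.connect(":memory:")
    # One table for all clients; tenant_id says which business entity owns a row.
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY, tenant_id INTEGER NOT NULL, name TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX idx_items_tenant ON items (tenant_id)")
    return conn


class TenantScope:
    """Data access bound to one tenant, so every query is filtered by tenant_id."""

    def __init__(self, conn, tenant_id):
        self._conn = conn
        self._tenant_id = tenant_id

    def add_item(self, name):
        self._conn.execute(
            "INSERT INTO items (tenant_id, name) VALUES (?, ?)",
            (self._tenant_id, name),
        )
        self._conn.commit()

    def list_items(self):
        rows = self._conn.execute(
            "SELECT name FROM items WHERE tenant_id = ? ORDER BY id",
            (self._tenant_id,),
        ).fetchall()
        return [r[0] for r in rows]


def count_items_per_tenant(conn):
    # Cross-client statistics are a single query when all data is in one database.
    rows = conn.execute(
        "SELECT tenant_id, COUNT(*) FROM items GROUP BY tenant_id"
    ).fetchall()
    return dict(rows)
```

Because every app instance talks to the same database and picks the tenant per request, the instances stay uniform, which sidesteps the non-uniform instance concern raised in the question.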

Sync SQLite on iPad with remote SQL Server

I am new to iPad development. I have to develop an app for a client whose employees use iPads. The app would take the data they collect and store it in the main SQL Server database on their server. From my research, people usually do this by keeping the data on the iPad and syncing it with the server later. I have used SQLite on Android before, but only for a school project: basically CRUD operations. Since I have a little knowledge of SQLite, I would like to build the app this way. My question is: can I write an app that stores data temporarily in SQLite and syncs it with the server later? I have more questions.
Thanks.
It is certainly possible to synchronize data between multiple databases.
Generally speaking, you have to record all changes made since the last synchronization (usually done with serial numbers or timestamps), and apply those changes to the other database.
If the same data has been modified by multiple users, you have to resolve this conflict somehow.
If multiple users can add data, you have to prevent duplicates of primary keys.
See these Wikipedia articles for explanations of some related concepts:
Data synchronization
Replication
Change data capture
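The three points above (record changes since the last synchronization, resolve conflicts, avoid duplicate primary keys) can be sketched as follows. This is an illustration under assumptions, not a recipe for SQLite or SQL Server specifically: plain dictionaries stand in for the two databases, timestamps are simple integers, UUIDs are one possible way to avoid key collisions, and "last write wins" is just one possible conflict rule. All function names are hypothetical.

```python
import uuid


def new_record(data, now):
    """Create a record with a globally unique key, so two devices never collide."""
    return {"id": str(uuid.uuid4()), "data": data, "updated_at": now}


def changes_since(store, last_sync):
    """Return the records modified after the last synchronization."""
    return [r for r in store.values() if r["updated_at"] > last_sync]


def apply_changes(store, changes):
    """Last write wins: an incoming record is kept only if it is newer."""
    applied = 0
    for rec in changes:
        current = store.get(rec["id"])
        if current is None or rec["updated_at"] > current["updated_at"]:
            store[rec["id"]] = dict(rec)
            applied += 1
    return applied


def synchronize(device, server, last_sync):
    """Exchange the changes made on each side since last_sync."""
    outgoing = changes_since(device, last_sync)
    incoming = changes_since(server, last_sync)
    apply_changes(server, outgoing)
    apply_changes(device, incoming)
```

In a real app the stores would be the SQLite file on the device and the server database behind a web service, and the caller would persist the time of each successful synchronization to pass in as last_sync next time.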
This project may solve the problem, but it only supports Xamarin (iOS or Android):
http://forums.xamarin.com/discussion/5719/sync-sqlite-with-sql-server-merge-replication

Resources