Data Synchronisation in Blue-Green Deployment

In a blue-green deployment environment, I'm trying to use a separate database for each environment.
In this case, how could real-time data synchronisation be done between the two databases in the two deployments (assume the databases are managed by one of the cloud service providers)?
Or should I instead use a shared database for both the blue and green deployment environments?
Thanks in advance!

Related

Mobile database with client-side synchronisation of local databases required

I am building a mobile app with the following business requirements:
Db to be stored locally on the device, for use when disconnected from the cloud.
A NoSQL-type store is required, to provide for future changes without requiring a complex db rebuild and data migration.
Utilises a SQL query language for simple programming.
Runs on all target platforms: Windows, Android, iOS.
No central database server; data is to be synchronised by matching two local copies of the db file.
I have examined a lot of dbs for mobile and none provide all these features except Couchbase Lite 2.1 Enterprise Edition. The downside is that the EE license might be price-prohibitive in my use case.
[EDIT: yes, the EE license is USD $35K for <= 1000 devices, so that option is sadly out for me.]
Are there any other such products out there that someone could point me to?
The client-side synchronization of local databases done by Couchbase Lite is a way to replicate data from one mobile device to another. It is a limited feature, though, because it works over P2P. Take BitTorrent as an example, one of the fastest and most effective P2P protocols: it still has flaws, with a risk of data corruption and partial data loss. A P2P synchronization would only be safe when running between two distinct applications on the same mobile device.
If both databases are on the same mobile device and managed by the same application, it is much simpler: you could do the synchronization yourself by reading data from one and saving it in the other, dealing with conflicts as needed.
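For illustration, here is a minimal sketch of that do-it-yourself sync in Python, assuming every document carries an id and an updated_at stamp. The LocalStore class is a made-up stand-in for a local document database, not Couchbase Lite's API:

    class LocalStore:
        """Hypothetical stand-in for a local document database."""
        def __init__(self):
            self.docs = {}

        def get(self, doc_id):
            return self.docs.get(doc_id)

        def get_all(self):
            return list(self.docs.values())

        def save(self, doc):
            self.docs[doc["id"]] = doc

    def sync_one_way(source, target):
        # Naive conflict rule: the copy with the later updated_at stamp wins.
        for doc in source.get_all():
            existing = target.get(doc["id"])
            if existing is None or doc["updated_at"] > existing["updated_at"]:
                target.save(doc)

    def sync_both_ways(a, b):
        sync_one_way(a, b)
        sync_one_way(b, a)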
I'm curious: why is it a requirement not to have a central database server? With a central server you can fine-tune what data is shared and between which users it is shared. Here is how it works:
In the server-side user registry, each user is assigned a list of channel names. Likewise, each JSON document added or updated is linked to a list of channel names. For every user/document pair with at least one channel name in common, the server allows push/pull replications to occur.
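As a rough illustration of that rule (the names below are invented for the example, not an actual server API), the access check boils down to a set intersection:

    # Each user's assigned channels, from the server-side registry.
    user_channels = {
        "alice": {"sales", "public"},
        "bob": {"engineering"},
    }

    def may_replicate(user, doc_channels):
        # Push/pull replication is allowed iff the user's channels and the
        # document's channels have at least one name in common.
        return bool(user_channels.get(user, set()) & set(doc_channels))

    print(may_replicate("alice", ["sales"]))          # True: channel in common
    print(may_replicate("bob", ["sales", "public"]))  # False: no overlap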
Good luck!

Kubernetes and Cloud Databases

Could someone explain the benefits/issues with hosting a database in Kubernetes via a persistent volume claim combined with a storage volume over using an actual cloud database resource?
It's essentially a trade-off: convenience vs control. Take a concrete example: let's say you pay Amazon money to use Athena, which is really just a nicely packaged version of Facebook's Presto that AWS kindly operates for you in exchange for $$$. You could run Presto on EKS yourself, but why would you?
Now, let's say you want or need to use Apache Drill or Apache Impala. Amazon doesn't offer them, and nor does any of the other big public cloud providers at the time of writing, as far as I know.
Another thought: what if you want to migrate off of AWS? Your data has gravity as well.
Could someone explain the benefits/issues with hosting a database in Kubernetes ... over using an actual cloud database resource?
As the previous excellent answer noted:
It's essentially a trade-off: convenience vs control
In addition to the previous example (Athena), take a look at RDS as well and see what you would need to handle yourself (why would you, as already said):
Automatic backups
Multizone deployments
Snapshots
Engine upgrades
Read replicas
and the other bells and whistles that come with a managed service as opposed to a self-hosted/managed one.
But there is more to it than just convenience/control, which this post is trying to shed light on:
Kubernetes adds another layer of abstraction (pods, services, ...), and depending on the way you handle storage (persistent volumes) you have two additional considerations:
Access speed (depending on your use case this can be negligible or a show-stopper).
The storage you have at hand might not be optimized for relational-database I/O, or it might restrict you from scheduling pods efficiently - the very same reasons you are advised not to run a db on NFS, for example (see the sketch after this list for how such storage is requested).
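Choosing that storage deliberately happens through the storage class on the persistent volume claim. A minimal sketch with the official Kubernetes Python client, assuming an SSD-backed storage class named fast-ssd exists in your cluster (the names and sizes are illustrative):

    from kubernetes import client, config

    config.load_kube_config()  # authenticate using your local kubeconfig

    # Ask for database-friendly, SSD-backed storage explicitly;
    # the storage class name 'fast-ssd' is an assumption.
    pvc = client.V1PersistentVolumeClaim(
        metadata=client.V1ObjectMeta(name="postgres-data"),
        spec=client.V1PersistentVolumeClaimSpec(
            access_modes=["ReadWriteOnce"],  # single-node access, as a db needs
            storage_class_name="fast-ssd",
            resources=client.V1ResourceRequirements(requests={"storage": "100Gi"}),
        ),
    )
    client.CoreV1Api().create_namespaced_persistent_volume_claim(
        namespace="default", body=pvc
    )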
There are several recent conference talks on Kubernetes pointing out that databases are a big no-no on Kubernetes (although this is highly opinionated; we do run average-load MySQL and PostgreSQL databases in k8s), and heavy load/fast I/O is somewhat of a challenge to get right on k8s, as opposed to a managed cloud solution where somebody has already fine-tuned everything for you.
In conclusion:
It is all about convenience, control and capabilities.

Is it possible to run Postgres in Google App Engine Flexible?

Is it possible to run postgres (essentially, a non-HTTP service) in a custom Google App Engine Flexible container? Or will I be forced to use Google's Cloud SQL solution?
TL;DR: You could do that, but don’t. It’s better to externalize the persistent data storage.
Yes, it is possible to run a PostgreSQL database as a microservice (named simply a 'service' in Google Cloud Platform) in a custom Google App Engine Flexible container. However, that raises another important question, namely why you would want to run an SQL database inside a container. This is a risky solution unless you are perfectly sure about what you are doing and how to manage it.
Typical container orchestration is based on stateless services, which means they are not intended to store persistent data. Such containers do sometimes have some form of storage, like NoSQL databases for cache or user-session information, but this data is not persistent: it can be lost during restarts or destruction of instances in an agile containerized application environment. PostgreSQL databases, on the other hand, are used as stateful services and do not suit that model. By putting such a database into a container, one can run into problems like data corruption or concurrent access to a shared data directory. Also, in Google App Engine Flexible it's not possible to add a shared persistent disk; the volumes are attached to instances and destroyed together with them. A much safer solution is keeping the SQL database in external, durable storage, such as the Cloud SQL you mentioned. There are numerous blog posts and articles that elaborate on this stateless/stateful service issue, like this one.
It should be mentioned that if you are using the container in a local environment or for test/development (and you are not looking for a durable database state), putting PostgreSQL inside a container should be perfectly fine. Also, if you design a special way of splitting your data across instances, this can work, as the guys did with their MySQL servers in this article. So, once again, the idea of putting a PostgreSQL database in a container should be carefully thought out, especially as there are so many options for safely externalizing such a service.
And just as a side note, you are not forced to use Cloud SQL. The database can be hosted on Compute Engine, with another cloud provider, on premises, or it can be managed by a third-party vendor. If you host it on Compute Engine, the application can communicate with the database inside the same project using the internal IP of the Compute Engine instance. Using Cloud Launcher you can quickly deploy PostgreSQL and other popular databases to Compute Engine. Check these Google docs for more information about using third-party databases.
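For the Compute Engine option, the app just makes an ordinary PostgreSQL client connection to the instance's internal IP. A minimal sketch with psycopg2, where the IP and credentials are placeholders for your own values:

    import psycopg2

    # 10.128.0.5 stands in for the Compute Engine instance's internal IP,
    # reachable from the app inside the same project/network.
    conn = psycopg2.connect(
        host="10.128.0.5",
        port=5432,
        dbname="appdb",
        user="appuser",
        password="change-me",
    )
    with conn, conn.cursor() as cur:
        cur.execute("SELECT version();")
        print(cur.fetchone()[0])
    conn.close()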

Synchronizing multiple servers and local machines with the same data / source code

Background:
Building a web app with a team of 2 developers. RESTful backend via Flask. Using Linux, Apache, Redis, and Postgres.
3 servers:
1 for production
1 for development
1 for UAT
4 databases:
1 for PROD/UAT server
1 for DEV server
1 for developer A on his local machine
1 for developer B on his local machine
2 local machines / developers
In addition to the 4 databases, the developers each have one additional database that serves testing. This testing database needs to be exactly the same at all times between the two developers.
Developer B has his own fork of the repo, sending pull requests to the master repo, which is worked on by developer A.
Problem:
We have no real protocols to easily transfer data between the databases. For example, the developers' test databases are often different, which causes chaos. Moving data from DEV to UAT/PROD is done manually.
Developers work in different environments and on different forks. We use pull requests on GitHub to transfer code to developer A's main repo.
Question
What do you recommend as a solution to our database woes? Is there a better way to share data? Is there a better way for developer A and developer B to share their environment and source code?
I have been working on this problem for an environment of similar size this year. The technology is different, but in short:
2-3 developers
Each developer has a complete environment
Integration, UAT and production server pairs (all VMs).
Linux, MongoDB, Django and Angular
Code
After a few false starts we have settled on the feature-branch approach as a good working proposition. See http://nvie.com/posts/a-successful-git-branching-model/. At the moment we have master for production and a development branch. Only one person works on a 'feature' (or we pair). Features should be short-lived. We might flip between multiple features when necessary, and we can merge from one feature to another when we need to. It all comes out quite cleanly when we merge back into development. The overhead is minimal and we rarely step on each other's toes; when we do, the diff makes it clear what we need to fix manually. A good Git UI tool helps.
Database
You could use a shared database for development. We don't: every developer has an independent environment. We may need collections, but usually from UAT or prod rather than from each other. I have created a comprehensive admin page. In any environment we can export a collection set, and there are mechanisms to send this export where needed. This is used to take prod data and place it in UAT for problem replication; we can then drop back to integration and dev for repair. Similarly, devs can share data using the same application admin page.
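A hedged sketch of what such an export/import pair can look like with pymongo (the URIs, names, and the decision to replace target collections wholesale are assumptions, not our actual admin-page code):

    from pymongo import MongoClient
    from bson.json_util import dumps, loads

    def export_collections(uri, db_name, names, out_path):
        # Dump the named collections into a single JSON file.
        db = MongoClient(uri)[db_name]
        dump = {name: list(db[name].find()) for name in names}
        with open(out_path, "w") as f:
            f.write(dumps(dump))

    def import_collections(uri, db_name, in_path):
        # Replace each target collection wholesale with the exported set.
        db = MongoClient(uri)[db_name]
        with open(in_path) as f:
            dump = loads(f.read())
        for name, docs in dump.items():
            db[name].drop()
            if docs:
                db[name].insert_many(docs)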
Release
Going up the chain from dev -> integration -> UAT -> prod is also handled by the application admin page. Any system can export, but only up to the next stage, and the import is not automatic: the admin on the target environment is told that a release is available and can import it from the admin page. We do the same for code and database collections.
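As an illustration of that "only up to the next stage" rule (a sketch, not our actual admin code):

    STAGES = ["dev", "integration", "uat", "prod"]

    def allowed_target(source_stage):
        # An environment may export a release only to the stage directly above it.
        i = STAGES.index(source_stage)
        if i == len(STAGES) - 1:
            raise ValueError("prod is the last stage; nothing to promote to")
        return STAGES[i + 1]

    print(allowed_target("uat"))  # 'prod'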
It is useful to have this integrated rather than having to know which script to run. It could be a separate application if that worked better in your environment.

Moving application to the cloud - how to design bi-directional sync between clients - table design

I currently have a WPF consumer application, and my users want to view and update information on mobile devices and tablets. I am planning to support iPad, iPhone and Windows 8 Metro.
I want to build a new application with cloud-syncing abilities. I am planning to use the Azure platform to store the database and host the services.
Given that a user may have multiple devices which may be connected or disconnected while they perform edits, deletes, etc., what changes should I make to my tables to handle bi-directional syncing of data?
If the user has 2 devices, deletes a record on one device, then edits the same record on a second device instead, and then syncs both devices to the cloud, is the record deleted or updated? How do you keep track of these changes? Would adding the columns 'created' and 'last updated' be sufficient to track them?
What is the best approach in syncing data in hybrid applications?
Have a look at the Sync Framework Toolkit.
This is a toolkit built on top of Sync Framework with OData, and if I am not mistaken it has samples for iPhone and HTML5.
You can take a look at the open-source project OpenMobster's Sync service. You can do the following sync operations:
two-way
one-way client
one-way device
bootup
Besides that, all modifications are automatically tracked and synced with the cloud. Your app can work offline when the network connection is down; it will track any changes and automatically synchronize them with the cloud in the background when the connection returns.
It also supports sync across multiple devices like iCloud does.
As per your question of whether it is deleted or updated: in my engine it would be deleted, the reasoning being that someone wants the record deleted regardless of its state. I understand the other argument holds true as well, but with sync engines and conflict resolution you have to pick some behavior and stay consistent.
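A hedged sketch of that delete-wins rule, assuming each record carries a last-updated stamp plus a tombstone flag (so a delete is recorded rather than physically removing the row); the field names are illustrative, not OpenMobster's schema:

    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class RecordVersion:
        id: str
        last_updated: float      # timestamp assigned when the change was made
        deleted: bool            # tombstone flag instead of a physical delete
        payload: Optional[dict]  # None once the record is deleted

    def resolve(current: RecordVersion, incoming: RecordVersion) -> RecordVersion:
        # Delete-wins: a tombstone beats a concurrent edit, whenever it happened.
        if current.deleted or incoming.deleted:
            stamp = max(current.last_updated, incoming.last_updated)
            return RecordVersion(current.id, stamp, True, None)
        # Otherwise fall back to last-writer-wins on the timestamp.
        return incoming if incoming.last_updated > current.last_updated else current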
Currently, only native development is supported on Android and iOS. However, the next release, 2.2-M8, will support end-to-end integration with PhoneGap on Android, and 2.2-M9 will add iOS.
Here is a link to the open source project: http://openmobster.googlecode.com
Here is a tutorial to understand some of its workings: http://code.google.com/p/openmobster/wiki/AndroidSyncApp
