We have a multi-tenant system with 1000 databases. We want to migrate all the databases instantly when deploying a new release, as part of the deployment process.
The migration tooling provided by frameworks such as Laravel can loop through the databases and apply the migrations one by one, but this is slow given the number of databases it needs to run on.
Is there any solution where this can be done instantly?
One solution could be to split the migration run into chunks and run them in parallel, but I was wondering if there is a better solution.
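For what it's worth, the parallel idea can be sketched roughly like this. It is a minimal sketch, not a definitive implementation: `migrate_all` and the `migrate_one` callback are hypothetical names, and `migrate_one` stands in for whatever your framework does to migrate a single tenant database. It will not make the run instant, it only bounds the wall-clock time by the worker count and by what the database servers can take.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def migrate_all(tenants, migrate_one, max_workers=16):
    """Run migrate_one(tenant) for every tenant, max_workers at a time.

    migrate_one is a hypothetical callback that migrates one tenant
    database and raises on failure. Returns (succeeded, failed), where
    failed maps each failing tenant to the exception it raised.
    """
    succeeded, failed = [], {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {pool.submit(migrate_one, t): t for t in tenants}
        for future in as_completed(futures):
            tenant = futures[future]
            try:
                future.result()
                succeeded.append(tenant)
            except Exception as exc:
                # Keep going: one broken tenant should not block the rest.
                failed[tenant] = exc
    return sorted(succeeded), failed
```

Collecting the failures instead of stopping at the first one means a single bad tenant can be retried afterwards without rerunning the other 999.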
Related
I have a .NET Core app which I'm running on AWS Elastic Container Service (ECS).
- The app runs on two different instances.
- The database is SQL Server.
The app runs the database migrations on startup, which has worked really well. But then I had to migrate a lot of data, which meant that the migration took much longer. This resulted in the data being moved twice, leaving duplicates.
This happens because both instances first check the database to see whether the migration has been executed, both find that it hasn't, and both then start running the migration, which takes time. Only after it is done does each instance record the migration in the database.
How do people solve this?
Possible solutions I and others have thought of
- Start with only one instance of the app, then scale up. This would work, but then I would have to manually scale down and up each time there is a migration. (It could be automated, but that would take time.)
- Wrap long-running migrations in a transaction and mark the migration as done in the database at the start. Check whether it is already in the database before committing the change. If the transaction fails, remove the migration from the database.
- Lock the database? EF Core lock the database during migration. Seems weird.
- Make the migration part of the deployment process. This seems to be best practice, but it would mean that the build server would need to know the database secrets. I'm not too afraid to hand them over, but it would mean I would have to maintain a duplicate set.
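To make the "mark it as done first, inside the transaction" option concrete, here is a minimal sketch. It uses Python's built-in `sqlite3` purely so that it runs anywhere; the table name `applied_migrations` and the functions `ensure_history_table` and `run_once` are made up for illustration and are not EF Core's API. The assumption is that the primary key on the history table does the arbitration: only one instance can insert the row, and if the migration fails the rollback removes the claim again.

```python
import sqlite3


def ensure_history_table(conn):
    # Hypothetical history table; the primary key is what prevents
    # two instances from claiming the same migration.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_migrations (id TEXT PRIMARY KEY)"
    )


def run_once(conn, migration_id, migrate):
    """Claim migration_id and run migrate(conn) in one transaction.

    Returns True if this caller ran the migration, False if it was
    already claimed. If migrate raises, the transaction is rolled back,
    which also removes the claim, and the exception propagates.
    """
    with conn:  # commits on success, rolls back on exception
        try:
            conn.execute(
                "INSERT INTO applied_migrations (id) VALUES (?)",
                (migration_id,),
            )
        except sqlite3.IntegrityError:
            return False  # someone else already claimed it
        migrate(conn)
    return True
```

On a server database I would expect the second instance's insert to wait on the first instance's uncommitted row and then fail with a duplicate-key error once the first commits, but that depends on the engine and isolation level, so test it against your own setup.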
What do people out there do? Am I missing some obvious solution?
Thanks
We also used to have our applications perform the migration, but even Microsoft recommends avoiding this in a multi-instance environment:
We recommend production apps should not call Database.Migrate at application startup. Migrate shouldn't be called from an app in server farm. For example, if the app has been cloud deployed with scale-out (multiple instances of the app are running).
Database migration should be done as part of deployment, and in a controlled way.
Like everything there are different ways to go about solving the problem. Our team is small and so we generate migration scripts through the EF CLI tooling and then run them manually as part of a deployment/maintenance routine. This could of course be automated if your process warrants it.
When I debug against a (central) database, Flyway can update the database schema. This happens when my local application runs on a development branch which is further ahead than the deployed application.
Running the local application will then invoke migration scripts on the central database. In the worst case, this might update a production database of course.
Another scenario is one where 2 developers work against 1 development database with test-data. Both developers are working on different features and both are modifying the schema. When one developer updates the db, the other is (probably) confronted with a checksum issue, and otherwise surprised by the changes made.
I'm thinking about a solution to such problems.
Of course, there are some solutions outside of Flyway, like only connecting to throw-away databases, or preventing access to important db's.
I'm interested in which options Flyway offers.
There are some options which seem to be useful in your cases.
For the first one, you can set the target property to the version your DB already has, to prevent Flyway from updating it.
For the second case, with 2 developers working simultaneously against the same DB instance, you can try turning off the validateOnMigrate property to avoid validation failures, or use ignoreMissingMigrations to ignore the migrations that some of the developers don't have yet.
Here you can find all the properties available via the console for the migrate task. You didn't specify how exactly you run Flyway; if it's done via Spring Boot, then not all of these properties are available, just some of them. You can find them here under the Flyway section.
But for most cases, I think the best solution is simply to turn Flyway migrations off during development and debugging if possible, and to use them only for the delivery of finished features.
There is a testing server that uses the testing database. We test the website on the testing server. If it is okay, we update the website and the database schema from the testing server to the production server. But this method is very painful and risky.
First, we have to redirect the users to a maintenance page, so the website is paused for a while.
Second, if something goes wrong during the update, we have to roll back to the old website, because we can't keep the website in maintenance mode for a long time.
So I'm looking for a solid way to update an IIS website and a SQL Server database without data loss and without using maintenance mode. Is there any way to do this? How do the big websites do this without data loss or downtime?
We've thought about using a release candidate website as a temporary site. First we update the RC site, then we swap the bindings between the RC and production websites. But then the database is the problem, because we may change the database schema, and the old site can't work with the new database. So if we use a temp site with a temp DB, there will be data loss. Likewise, the updated website won't work with the old database if the temp site uses the old production database. So I need a solid and practical solution for this problem.
This is orders of magnitude more complicated than you imagine. This specifically is not about HA nor about continuous integration. Neither of those will provide what you need; they're only pieces of a much more complex puzzle.
It simply isn't possible to write code changes in a manner that is transparent/oblivious to schema changes as they occur. At best you can write the code in a manner that supports the schema at v. N and at v. N+1, which in itself is a big challenge. But it is impossible to write the code in a manner that supports the schema as it transitions from v. N to v. N+1. The schema change induced by a deployment has to be atomic for the code operating on the schema. Since the schema change itself cannot be atomic, it follows that the upgrade has two possible avenues:
- Take the code offline during the schema change. This is what you're doing now and is the safest approach. Of course, it implies service downtime and runs the risks you have already experienced (rollback of a failed upgrade, lengthy upgrade, etc.). A variant of this approach is to redirect the service to a read-only copy of the data and offer a degraded service experience (no changes are possible during the downtime), which may or may not be acceptable, depending on the business specifics.
- Standby upgrade. This implies that you take a snapshot of the service data (various HA solutions may provide a standby snapshot out of the box, e.g. log shipping). Upgrade the snapshot, then apply all the transactions that occurred on the real service data to the upgraded snapshot. This is always tricky, because it requires a technology to detect, capture and apply the changes (e.g. change tracking, replication, a custom solution, etc.) and requires transforming each change to the new, upgraded schema. Once the upgraded schema is up to date with the changes from the main service, the service can be redirected to the upgraded schema. This redirection is also much more complex than it sounds. For one, choosing the moment at which to cut off the old schema and stop accepting new changes, while making sure all changes were applied to the new upgraded schema DB, is a challenge in itself. Another challenge is to resolve the conflict of the code understanding pre-upgrade and post-upgrade schema versions. Developing code that handles both is, as I said, problematic and error prone, so one solution is to, again, take the service offline for a short period and replace the code. Another solution is to have a standby service, running code that handles the post-upgrade DB schema and is connected to the post-upgrade DB, and redirect the live requests to your standby, upgraded service.
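To give a feel for the "transform each change to the new schema" part of the standby upgrade, here is a deliberately tiny sketch. Everything in it is an assumption made for illustration: the change format (a dict with `op` and `row`), the schema change itself (a single `name` column split into `first_name` and `last_name`), and the upgraded copy being a plain dict keyed by `id`. Real change capture (change tracking, replication, etc.) looks nothing like this; the point is only that every captured change has to pass through a schema-aware transform before it can be applied to the upgraded snapshot.

```python
def transform(change):
    """Map a change captured in the old schema to the new schema.

    Assumed schema change: 'name' is split into 'first_name'/'last_name'.
    """
    row = dict(change["row"])
    if "name" in row:
        first, _, last = row.pop("name").partition(" ")
        row["first_name"], row["last_name"] = first, last
    return {"op": change["op"], "row": row}


def replay(changes, upgraded):
    """Apply captured changes, in order, to the upgraded snapshot.

    'upgraded' is a dict keyed by row id, standing in for the upgraded DB.
    """
    for change in changes:
        new = transform(change)
        key = new["row"]["id"]
        if new["op"] == "delete":
            upgraded.pop(key, None)
        else:  # insert or update: merge the new column values
            upgraded[key] = {**upgraded.get(key, {}), **new["row"]}
    return upgraded
```

Even in this toy form the ordering matters: the changes must be replayed in the order they were committed on the live database, which is one of the reasons the cut-over moment is so hard to pick.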
And I did not even touch the thorny subject of service interaction, when a particular service of a much larger deployed solution has to be upgraded. This is when backward compatibility of the service API protocol plays the major role in allowing the post-upgrade service to play along with its peer services.
Ultimately there just isn't any silver bullet. I've witnessed single-machine large DB deployments that took weeks to roll out version N+1, with transactional replication continuously feeding the post-upgrade DB schema with changes from the pre-upgrade DB. And I've witnessed deployments of thousands of machines deploying version N+1 in stages, as a complicated dance of enabling code and data changes over the course of several days to reach the full functionality of the post-upgrade version. This problem is just plain hard.
This is what Azure is fantastic for. The Azure Cloud platform allows for staging servers and production servers. You can set it up so once you commit your changes to Git or TFS it is pushed to either a staging or production server automatically. You can also set up to manually push changes. Most of the ORM libraries like Entity Framework have migration support.
There is a lot more information out there on this topic like:
Azure seamless upgrade when database schema changes
Staging or Production Instance?
You're describing a high availability (HA) solution. HA solutions are expensive and can quickly become overkill. You would need redundancy for your app and DB servers, which means setting up a DB cluster. All this increases the amount of time you would spend deploying changes, but the tradeoff is that your app is always available.
The main thing with deployment is having a repeatable process. So my best recommendation is to script or automate as much as possible.
What are relational database (and schema) migration patterns on production in continuous delivery?
In many traditional developments the DBA arranges a big migration script out of the many smaller scripts created in the current release cycle. But in CD the developer may want to push the change now to production, not wait to compile them with other scripts.
I know of Rails migrations, but to me it looks more reasonable to use raw SQL scripts.
I've also seen tools like Flyway for managing migrations, but I have not read of many people using them in production. This is why I wonder what the common practices are here.
Flyway works great for continuous delivery/deployment. Many clients use it across all environments, including production.
The single most important thing for cascading DB migrations across environments is to have a 3 step process:
Step 1
Old application code works together with old DB.
Step 2
New application code gets deployed, and migrates the DB on startup. This migration must be backwards-compatible so that old application code still works with the new DB. This is essential because:
- you can then do rolling upgrades, upgrading one node at a time until all nodes have the new application code
- you can roll back immediately to the old application code if the new one is broken
This step may involve compatibility views and triggers to do the job.
Step 3
After the changes have been proven to work, the next version of the application code gets deployed together with the necessary DB migrations to discard any remaining outdated (from step 1) and compatibility (from step 2) structures.
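As an illustration of the three steps, here is a minimal sketch using Python's built-in `sqlite3` so that it is runnable as-is. The schema is invented for the example: step 2 renames a `customer` table to `client` and leaves behind a compatibility view plus an `INSTEAD OF` trigger so that old code reading from and inserting into `customer` keeps working; step 3 removes those compatibility structures. The function names are hypothetical and this is not what Flyway itself generates; in practice each step would be a versioned migration script.

```python
import sqlite3


def step1_old_schema(conn):
    # Step 1: old application code works with the old DB.
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
    )


def step2_backwards_compatible(conn):
    # Step 2: new structure plus compatibility view and trigger,
    # so old code can still SELECT from and INSERT into 'customer'.
    conn.execute("ALTER TABLE customer RENAME TO client")
    conn.execute("CREATE VIEW customer AS SELECT id, name FROM client")
    conn.execute(
        """CREATE TRIGGER customer_insert INSTEAD OF INSERT ON customer
           BEGIN
               INSERT INTO client (id, name) VALUES (NEW.id, NEW.name);
           END"""
    )


def step3_cleanup(conn):
    # Step 3: once the new code is proven, drop the compatibility structures.
    conn.execute("DROP TRIGGER customer_insert")
    conn.execute("DROP VIEW customer")


def old_code_insert(conn, name):
    conn.execute("INSERT INTO customer (name) VALUES (?)", (name,))


def old_code_list(conn):
    return [r[0] for r in conn.execute("SELECT name FROM customer ORDER BY id")]


def new_code_insert(conn, name):
    conn.execute("INSERT INTO client (name) VALUES (?)", (name,))


def new_code_list(conn):
    return [r[0] for r in conn.execute("SELECT name FROM client ORDER BY id")]
```

Between step 2 and step 3 both the old and the new functions work against the same data, which is what makes the rolling upgrade and the immediate rollback possible.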
Implement changes to your database as single (raw) SQL files, then use sqlpatch to build a migration script.
I usually have a single git repository for the database alone, with a CD environment attached to it. I usually have a production and a development database that are automatically migrated when I push to the corresponding branches.
This setup makes it very easy to set up another database for a feature branch and to experiment with it. Sqlpatch takes care of all the dependencies between the separate SQL files, so that I can easily merge a feature branch into another branch.
Project code and the project database usually grow together. Often some unit tests need test databases and test database entries too.
What is the perfect way to keep the database and its contents in sync with release/version management?
And does it support branching and reverting code hand-in-hand with the database data and DB structure?
I currently commit my changes with the SQL code in the commit comment for my svn trigger, but what about reverting my code changes?
Are there any perfect solutions?
The best way of dealing with this that I know is "Continuous Database Integration" - Google will lead you to tools and articles; this is a good start.