Flyway integration in a production database

We have a database in production that already has a good number of rows in the "user" table. Consider the following statement from the flyway website:
If you have an existing database that has not been filled by Flyway
this is the way to go:
Create an initial migration script that will recreate your current
state and give it a low version number.
Use flyway:init to create the metadata table and set this script as the current version.
I'd like to use flyway to manage my schema and various constants in the database, but I don't want V1__Base_version.sql to contain the account information for our current production users, especially considering it's stored in SCM. If I understand these instructions correctly though, I would need the ability to "recreate [my] current state" with V1__Base_version.sql.
So would creating an initial migration with just the schema and the constants work okay? Or do the databases on our workstations need to match those in production 100%?

You are correct. The init command is there to mark the production database with a version.
The initial migration you create (with the structure of your PROD db) is for the other environments. It will never run on PROD as its version will be below the init version. It will however align all environments so that subsequent migrations can be applied equally across all of them.
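As a rough illustration of why the baseline script never runs on PROD, here is a minimal Python sketch using SQLite. It is not Flyway's implementation: the `schema_history` table, the `MIGRATIONS` dict, the `app_user` table and the helper names are all assumptions made up for this example. Version 1 stands in for a schema-only V1__Base_version.sql.

```python
import sqlite3

# Hypothetical migrations keyed by version; version 1 is the schema-only baseline script.
MIGRATIONS = {
    1: "CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT)",
    2: "ALTER TABLE app_user ADD COLUMN email TEXT",
}


def current_version(conn):
    """Return the highest version recorded in the (hypothetical) history table, or 0."""
    conn.execute("CREATE TABLE IF NOT EXISTS schema_history (version INTEGER PRIMARY KEY)")
    row = conn.execute("SELECT MAX(version) FROM schema_history").fetchone()
    return row[0] or 0


def mark_baseline(conn, version):
    """Record an existing database as already being at `version` without running any script."""
    current_version(conn)  # makes sure the history table exists
    conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
    conn.commit()


def migrate(conn):
    """Apply, in order, every migration above the recorded version; return the versions applied."""
    applied = []
    start = current_version(conn)
    for version in sorted(MIGRATIONS):
        if version > start:
            conn.execute(MIGRATIONS[version])
            conn.execute("INSERT INTO schema_history (version) VALUES (?)", (version,))
            applied.append(version)
    conn.commit()
    return applied


# "Production": the table and its rows already exist, so it is only marked as version 1.
prod = sqlite3.connect(":memory:")
prod.execute("CREATE TABLE app_user (id INTEGER PRIMARY KEY, name TEXT)")
prod.execute("INSERT INTO app_user (name) VALUES ('existing customer')")
mark_baseline(prod, 1)

# "Workstation": an empty database, so the baseline script runs first.
dev = sqlite3.connect(":memory:")

prod_applied = migrate(prod)
dev_applied = migrate(dev)
```

In this sketch production only receives version 2, while the empty workstation database receives versions 1 and 2, and both end up with the same structure. The production rows are never part of any script.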

Related

Given I have to write the migration scripts myself, what value does Flyway provide?

In my situation I use a tool that generates SQL statements to contain all database init/create statements. How does Flyway provide value beyond what my tool provides? Why should I care to write hand-coded migration scripts to use Flyway?
The question above mixes two concepts that should be kept separate: database creation and migration.
database creation
Given a complete database and an empty database, you can use many tools to generate the scripts needed to recreate the complete database where nothing exists. In Flyway terms, you are just creating a baseline. This isn't migration at all. Of course, given a V2.0 database, you could take any V1.0 database, blow it away, and install the V2.0 database, but then you've lost your data.
migration
Given a complete V2.0 database and an older V1.0 database, you want to "upgrade" the V1.0 database to V2.0. In the database world, this is called a migration because the existing 1.0 data needs to be rearranged so that it works on V2.0. Now you need a script that not only creates/alters tables but also does some ETL (extract the data, transform it so it can be loaded into the new table structures, alter the old database to the new table structures, then load the data into the database). This may or may not be trivial, depending on the change. You build the script to do it; Flyway manages executing that script.
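To make the ETL part concrete, here is a small, hedged Python/SQLite sketch of what such a migration script might do. The `person` table, its columns and the `migrate_v1_to_v2` function are invented for illustration; in a real Flyway project this would typically be a versioned SQL or Java migration rather than this standalone script.

```python
import sqlite3


def migrate_v1_to_v2(conn):
    """Hypothetical V1 -> V2 migration: split person.full_name into first_name / last_name."""
    # Extract the existing V1 rows.
    rows = conn.execute("SELECT id, full_name FROM person").fetchall()
    # Create the V2 structure next to the old table.
    conn.execute(
        "CREATE TABLE person_v2 (id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
    )
    # Transform each row and load it into the new structure.
    for person_id, full_name in rows:
        first, _, last = full_name.partition(" ")
        conn.execute(
            "INSERT INTO person_v2 (id, first_name, last_name) VALUES (?, ?, ?)",
            (person_id, first, last),
        )
    # Swap the new table in place of the old one.
    conn.execute("DROP TABLE person")
    conn.execute("ALTER TABLE person_v2 RENAME TO person")
    conn.commit()


# A made-up V1 database with existing data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO person (full_name) VALUES ('Ada Lovelace')")
conn.execute("INSERT INTO person (full_name) VALUES ('Plato')")

migrate_v1_to_v2(conn)
migrated = conn.execute("SELECT first_name, last_name FROM person ORDER BY id").fetchall()
```

The point of the sketch is that the existing rows survive the structural change, which is what distinguishes a migration from simply recreating the V2.0 database.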
Flyway
Flyway enables the following:
Migration scripts become part of the software asset. They are versioned so that baseline/migration scripts can be maintained in source control in a way that migration becomes a repeatable feature as opposed to "one off" scripting work.
Flyway maintains a meta table in each database it works with so it knows which scripts have been applied.
Flyway can apply migrations in a completely automated way that removes manual execution errors.
Flyway enables the creation of migration scripts as part of development (like Test Driven Development makes unit test creation an integral part of development) so that all your database development is captured in the form of migration scripts (rather than building migration scripts as needed as part of "one off" migrations).
It's common when using Flyway to update any previous version of your application in seconds via a single command. It becomes so easy that the stress of migration from an old DB to a new version goes away and now, evolution of the DB becomes easy and usual.
Using Flyway well requires changing your workflow: every time you develop a change in your developer DB, put the change into a migration script so you can execute those changes against all the older DB versions that exist in the world. Those scripts are checked into your application's source code, making migration a first-class citizen of your software asset just like any other functionality.
It depends very much on your use case.
If you plan to write a simple application with a database structure that will remain static over the lifetime of the application, it will add very little value.
If the project is expected to have a dynamic design over its lifetime, with changes taking place on the schema, Flyway provides a formal structure in which the changes may be expressed and viewed. This formal structure can also be very helpful if you end up with a larger team working on the project, as Flyway can then become part of the framework to handle things like multi-schema CI work.
One key thing is that you do not have to start with Flyway. You can add it at a later point, normally with limited retooling, as the schema at that point in time simply becomes the baseline to which all future changes are added.

Continuous Deployment in Cloud

I am assigned the task of continuous deployment from the development server to the production server.
On my development server, all the database objects are created under the 'dbo' schema. But on the production server there will be different schemas, based on each tenant's company list.
For example, on my development server the tables are created like this:
dbo.ABC
dbo.XYZ
And when I create a tenant (Omkar is the database; Sarkur and Mathur are the schemas), the database objects will be like:
Sarkur.ABC, Sarkur.XYZ
Mathur.ABC, Mathur.XYZ
Now I have to compare these two databases to check whether there are any changes in the structure of the database objects, or any added or deleted database objects. If so, those changes have to be synchronized to the production database.
If anyone knows how to compare the objects in these two different schemas, please let me know.
One option that I know of looks suitable:
Flyway:
It is easy to set up and simple to master. Flyway lets you regain control of your database migrations with pleasure and plain SQL.
Solves only one problem and solves it well. Flyway migrates your database, so you don't have to worry about it anymore.
Made for continuous delivery. Let Flyway migrate your database on application startup. Releases have never been this easy.
Big plus: it's an open source framework!
http://flywaydb.org/
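Independently of which migration tool is chosen, the comparison the question asks about (same objects, different schema names) can be sketched in a few lines of Python. This is only an illustration under assumptions: the rows are made-up data in the shape (TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE), as one might fetch them from SQL Server's INFORMATION_SCHEMA.COLUMNS view, and the `structure` and `diff` helpers are hypothetical names.

```python
def structure(rows, schema):
    """Map (table, column) -> data type for one schema, ignoring the schema name itself."""
    return {
        (table.lower(), column.lower()): data_type.lower()
        for row_schema, table, column, data_type in rows
        if row_schema.lower() == schema.lower()
    }


def diff(reference, target):
    """Compare two structure maps: what the target is missing, has extra, or has changed."""
    missing = sorted(k for k in reference if k not in target)
    extra = sorted(k for k in target if k not in reference)
    changed = sorted(k for k in reference if k in target and reference[k] != target[k])
    return {"missing": missing, "extra": extra, "changed": changed}


# Made-up rows in the shape (TABLE_SCHEMA, TABLE_NAME, COLUMN_NAME, DATA_TYPE).
dev_rows = [
    ("dbo", "ABC", "id", "int"),
    ("dbo", "ABC", "name", "nvarchar"),
    ("dbo", "XYZ", "id", "int"),
]
prod_rows = [
    ("Sarkur", "ABC", "id", "int"),
    ("Sarkur", "ABC", "name", "varchar"),
    ("Mathur", "ABC", "id", "int"),
    ("Mathur", "ABC", "name", "nvarchar"),
    ("Mathur", "XYZ", "id", "int"),
]

# Compare each tenant schema against the development 'dbo' schema.
reference = structure(dev_rows, "dbo")
report = {
    tenant: diff(reference, structure(prod_rows, tenant))
    for tenant in ("Sarkur", "Mathur")
}
```

Here the Sarkur schema would be reported as missing table XYZ and having a changed type on ABC.name, while Mathur matches the development structure. A real comparison would also need to cover views, procedures, indexes and constraints.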

Syncing Magento database from development to production

I use git for version control. I have a development, staging and production environment. When I finish in development I push to staging for review by the client. When approved, I push changes from staging to production. That works fine as long as there are no database changes. What happens if I install modules via Magento Connect on local development and they make database modifications?
How would I push those changes up to the production server since the production server is always changing?
Edit:
I wrote two shell scripts. One pulls the production database down to my development server, replaces the base URL with the development URL and updates my development db accordingly. It also leaves the production SQL dump behind to be added to my git repo. I'm not really sure if it's beneficial to keep the raw dumps in source control but I'm going to try it out. The second script moves the development database up to staging and essentially performs the same operations as the first.
Now when it comes time to move to production I pull the updated production repo into the production server and allow Magento to do its thing. I also started using SQLYog recently and it has a database comparison wizard which will give me the differences between my development and production databases and allow me to merge the changes in selectively. It always creates a migration script, which I add to source control as well. If anything goes wrong I can run the comparison to see if anything was missed.
Does this sound like a decent workflow to you guys?
This is a common situation for developers. It's much easier to modify code and schema and be assured that all is well when there is a small codebase which is thoroughly understood and doesn't have too much flexibility for UI. Of course, this is not the case with Magento, which can be quite difficult to work into automated testing and continuous integration schemes. That said, there are some knowable, testable behaviors on which you can rely.
An Overview
When dealing with local development which is merged to production, one must be assured that the schema and data changes relevant to new or changed functionality are also applied when the filesystem is updated. This is actually how Magento itself works. Module configuration files can supply a version number and can configure setup resources. This information is used to enter into a schema & data modification workflow which results in version information being added to the database. It is from the consistency between the file-designated version number and the database-registered version number that one (or the system) can infer that the database is in the appropriate state given the files present.
This means that when the new/updated module files are merged to production and the necessary conditions are met (e.g. the config cache is invalid, etc.), the database upgrade should take place. Your (proper) concern is that this process might break based on remote server-level differences, remote data differences, etc. Without a tightly-regulated integration testing process, there is some overhead.
Plan of Attack: Pick the Right Strategy
The essential activity in this area is assessing a module's impact on the database. This should be straightforward with any module which is worthy of being installed; check for any of the following:
1. A system.xml file
2. Existence of install/upgrade scripts in sql or data folders
3. Existence of custom setup resource class (configured under global/resources xpath)
4. Appropriate configuration XML (version number in module config node & a setup resource under global/resources xpath)
For 1, simply review the structure and know that its effects on the database will be limited to the core_config_data table, and generally only once an admin has saved values via the GUI (noting that item 1 of the list below applies as well).
For 2 & 3, review the scripts which are set to be run. These can be divided into four general areas:
1. Configuration settings - look for setConfigData() and deleteConfigData() calls
2. Table additions and edits (new tables, adding columns, etc.)
3. EAV-related changes and additions; look for EAV setup resources
4. Non-EAV data changes: installation of new data or modification of existing data
It's a matter of feel & intuition, but gauging the level of impact on the db will allow you to determine if you should clone production data down to local dev and test the setup workflow locally, verifying it works okay, then pushing to production and re-checking (backing up always!). If the changes are wide-ranging, it may be best to take the site offline so that you can ensure that you won't lose order or customer data if you need to revert after a botched upgrade.
You generally don't ever want to push data contained in a db from dev > prod. Your schema defs should be contained in Magento sql install scripts. If you do have actual new data you want to push up to prod, you'll have to do so on a case-by-case basis. You will most likely pull down from prod > dev to test out data and configuration before running the actual case on prod.
Case - 1:
If your production server has the same data (DB) as your local environment, then just copy the database and files to the production server and do the following:
1) Delete the content of the folder /var
2) Change the values of the file /app/etc/local.xml
There you can find your connection string data (database user, host and name).
3) Once you have uploaded your database, you need to make some changes.
Run this query:
SELECT * FROM core_config_data WHERE path = 'web/unsecure/base_url' OR path = 'web/secure/base_url';
You will get 2 rows. Update these rows by running this query:
UPDATE core_config_data SET value = 'YOUR_NEW_LIVE_URL' WHERE path LIKE 'web/%/base_url';
That’s all.
Case - 2:
If you don't want to change the DB data in production, then you need to install the modules via Magento Connect directly on the production server. You can then update the files which you have changed locally.

Flyway/Liquibase for Database Structure and DBUnit for Database Inserts?

I have the following scenario for my application:
1 Production Server
1 Test Server
n Development Computers
For database migration we use Hibernate Schema Update for the schema and DBUnit for filling in all the production data (on all servers/computers). When the schema update is done I generate a new DTD file for the new schema, so I can do a fresh import of the DBUnit XML. The application updates the database at startup with the XML file (only on development and test servers/computers!)
Of course this approach is fragile and far from optimal. So I looked at Liquibase and Flyway. Both seem to be great tools, but what I do not get is: How do I migrate the data? In my case, I dump the data of the production system once a week and add it to the application's source control as a DBUnit XML file, so all developers have "fresh" data and the test server has current production data, too.
The problem I see with Liquibase and Flyway is that neither offers a way to diff the database data automatically and generate the migration changes from it.
So my idea is the following:
Set Hibernate to validate instead of update.
When a STRUCTURAL database change is needed, I add it to the migration script for the major version
No database inserts are in the migration script.
Generate a new DTD for DBUnit based on the new database structure
Generate the DBUnit XML from the production database.
Another idea would be to utilize Flyway's JavaMigration and provide an initial database dump based on DBUnit. All other changes to database data would be handled in migration scripts. But there is still the problem: how do I diff the current migration script state against the production database state?
It would be awesome if anyone could provide me hints how to handle my scenario :)
If your goal is to use dumps of the PROD database in DEV and TEST environments, I would:
Configure the DB migration tool to run on application startup (both Flyway and Liquibase support this through their respective APIs)
Package all the DB structure migrations together with the app
Dump both data and structure from PROD
This way, when the PROD database is restored to DEV or TEST, the old metadata table of the migration tool is restored as well.
When the app starts, the migration tool will discover that the db structure is outdated and upgrade it to the newest version. Done.
No need to use DBUnit for this.
The short answer is that all your changes would be done through Liquibase or Flyway.
We use Flyway, with the same prod/test/development setup.
We make all db changes (structure or metadata) using Flyway migration scripts, stored in source control. Each time we do a new deployment to an environment, we first run the migration scripts there (using either the command line tool or the maven plugin). The code first goes to development environment, gets integration tested there and keeps going to test and production.
The main thing to watch out for is that Flyway requires a linear versioning to the files, so if two developers check in migrations at the same time, one of them will have to rename theirs.
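A collision like the one described can be caught before it reaches a shared environment, for example with a small check in the build. The following Python sketch is only an assumption about how such a check might look: it parses file names in the `V<version>__<description>.sql` style used earlier (V1__Base_version.sql), and the `find_version_collisions` helper and the sample file names are hypothetical.

```python
import re
from collections import defaultdict

# Matches names such as V1__Base_version.sql or V2_1__Add_index.sql.
PATTERN = re.compile(r"^V(\d+(?:[._]\d+)*)__(.+)\.sql$")


def find_version_collisions(filenames):
    """Return {version tuple: [file names]} for every version used by more than one file."""
    by_version = defaultdict(list)
    for name in filenames:
        match = PATTERN.match(name)
        if match:
            # Split "2_1" or "2.1" into (2, 1) so versions compare numerically.
            version = tuple(int(part) for part in re.split(r"[._]", match.group(1)))
            by_version[version].append(name)
    return {v: sorted(names) for v, names in by_version.items() if len(names) > 1}


# Two developers both picked version 2 for their migration.
files = [
    "V1__Base_version.sql",
    "V2__Add_email.sql",
    "V2__Add_phone.sql",
    "V3__Index.sql",
]
collisions = find_version_collisions(files)
```

Run against the list above, the check reports that version 2 is claimed by two files, so one of them has to be renamed before merging.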

Verify database changes (version-control)

I have read lots of posts about the importance of database version control. However, I could not find a simple way to check whether a database is in the state that it should be in.
For example, I have a database with a table called "Version" (the version number is stored there). But the database can be accessed and edited by developers without changing the version number. If, for example, a developer updates a stored procedure and does not update Version, the database state is no longer in sync with the version value.
How can I track those changes? I do not need to track what has changed; I only need to check whether the database tables, views, procedures, etc. are in sync with the database version saved in the Version table.
Why do I need this? When doing a deployment I need to check that the database is "correct". Also, not all tables or other database objects should be tracked. Is it possible to check this without using triggers? Can it be done without 3rd party tools? Do databases have checksums?
Let's say that we use SQL Server 2005.
Edited:
I think I should provide a bit more information about our current environment - we have a "baseline" with all scripts needed to create base version (includes data objects and "metadata" for our app). However, there are many installations of this "base" version with some additional database objects (additional tables, views, procedures, etc.). When we make some change in "base" version we also have to update some installations (not all) - at that time we have to check that "base" is in correct state.
Thanks
You seem to be breaking the first and second rule of "Three rules for database work". Using one database per developer and a single authoritative source for your schema would already help a lot. Then, I'm not sure that you have a Baseline for your database and, even more important, that you are using change scripts. Finally, you might find some other answers in Views, Stored Procedures and the Like and in Branching and Merging.
Actually, all these links are mentioned in this great article from Jeff Atwood: Get Your Database Under Version Control. A must read IMHO.
We use DBGhost to version control the database. The scripts to create the current database are stored in TFS (along with the source code) and then DBGhost is used to generate a delta script to upgrade an environment to the current version. DBGhost can also create delta scripts for any static/reference/code data.
It requires a mind shift from the traditional method but is a fantastic solution which I cannot recommend enough. Whilst it is a 3rd party product it fits seamlessly into our automated build and deployment process.
I'm using a simple VBScript file based on this codeproject article to generate drop/create scripts for all database objects. I then put these scripts under version control.
So to check whether a database is up-to-date or has changes which were not yet put into version control, I do this:
get the latest version of the drop/create scripts from version control (subversion in our case)
execute the SqlExtract script for the database to be checked, overwriting the scripts from version control
now I can check with my subversion client (TortoiseSVN) which files don't match with the version under version control
now either update the database or put the modified scripts under version control
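The comparison step above is done with a Subversion client, but the same idea can be sketched in Python: fingerprint each object's definition and compare what is under version control with what was extracted from the database. Everything here is illustrative: the `fingerprint` and `find_drift` helpers and the sample definitions are made up, and a real setup would read the two sets of scripts from disk.

```python
import hashlib


def fingerprint(definition):
    """Hash a definition after collapsing whitespace, so formatting-only edits do not count."""
    normalized = " ".join(definition.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def find_drift(versioned, extracted):
    """Report objects that were modified, are not under version control, or are absent from the DB."""
    modified = sorted(
        name
        for name in versioned
        if name in extracted and fingerprint(versioned[name]) != fingerprint(extracted[name])
    )
    unversioned = sorted(name for name in extracted if name not in versioned)
    absent = sorted(name for name in versioned if name not in extracted)
    return {"modified": modified, "unversioned": unversioned, "absent": absent}


# Scripts as stored in version control (made-up examples).
versioned = {
    "dbo.GetUser": "CREATE PROCEDURE dbo.GetUser AS SELECT id, name FROM Users",
    "dbo.Users": "CREATE TABLE Users (id INT, name NVARCHAR(50))",
}
# Scripts freshly extracted from the database being checked.
extracted = {
    "dbo.GetUser": "CREATE PROCEDURE dbo.GetUser AS SELECT id, name, email FROM Users",
    "dbo.Users": "CREATE TABLE Users (id INT,   name NVARCHAR(50))",
    "dbo.Audit": "CREATE TABLE Audit (id INT)",
}
drift = find_drift(versioned, extracted)
```

In this example the stored procedure is flagged as modified and the Audit table as not yet under version control, while the whitespace-only difference in the Users table is ignored.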
You have to restrict access to all databases and only give developers access to a local database (where they develop) and to the dev server where they can do integration. The best thing would be for them to only have access to their dev area locally and perform integration tasks with an automated build. You can use tools like Red Gate's SQL Compare to do diffs on databases. I suggest that you keep all of your changes under source control (.sql files) so that you will have a running history of who did what when and so that you can revert db changes when needed.
I also like to have the devs run a local build script to reinitialize their local dev box. This way they can always roll back. More importantly, they can create integration tests that test the plumbing of their app (repository and data access) and the logic stashed away in stored procedures in an automated way. Initialization is run (resetting the db), integration tests are run (creating fluff in the db), reinitialization puts the db back into a clean state, etc.
If you are an SVN/nant style user (or similar) with a single branch concept in your repository then you can read my articles on this topic over at DotNetSlackers: http://dotnetslackers.com/articles/aspnet/Building-a-StackOverflow-inspired-Knowledge-Exchange-Build-automation-with-NAnt.aspx and http://dotnetslackers.com/articles/aspnet/Building-a-StackOverflow-inspired-Knowledge-Exchange-Continuous-integration-with-CruiseControl-NET.aspx.
If you are a perforce multi branch sort of build master then you will have to wait till I write something about that sort of automation and configuration management.
UPDATE
@Sazug: "Yep, we use some sort of multi branch builds when we use base script + additional scripts :) Any basic tips for that sort of automation without full article?" There are most commonly two forms of databases:
you control the db in a new non-production type environment (active dev only)
a production environment where you have live data accumulating as you develop
The first setup is much easier and can be fully automated from dev to prod, including rolling back prod if need be. For this you simply need a scripts folder where every modification to your database is maintained in a .sql file. I don't suggest that you keep a tablename.sql file and version it like you would a .cs file, where updates to that SQL artifact are made in the same file over time: SQL objects are so heavily dependent on each other that when you build up your database from scratch your scripts may encounter a breaking change. For this reason I suggest that you keep a separate, new file for each modification, with a sequence number at the front of the file name, for example something like 000024-ModifiedAccountsTable.sql. Then you can use a custom task, something out of NAntContrib, or a direct execution of one of the many ??SQL.exe command line tools to run all of your scripts against an empty database, from 000001-fileName.sql through to the last file in the updateScripts folder. All of these scripts are then checked in to your version control. And since you always start from a clean db, you can always roll back if someone's new SQL breaks the build.
In the second environment automation is not always the best route, given that you might impact production. If you are actively developing against/for a production environment then you really need a multi-branch/environment so that you can test your automation well before you actually push against a prod environment. You can use the same concepts as stated above. However, you can't really start from scratch on a prod db, and rolling back is more difficult. For this reason I suggest using RedGate SQL Compare or similar in your build process. The .sql scripts are checked in for updating purposes, but you need to automate a diff between your staging db and prod db prior to running the updates. You can then attempt to sync changes and roll back prod if problems occur. Also, some form of backup should be taken prior to an automated push of SQL changes. Be careful when doing anything without a watchful human eye in production! If you do true continuous integration in all of your dev/qual/staging/performance environments and then have a few manual steps when pushing to production...that really isn't that bad!
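The numbered-scripts approach can be sketched as follows. This is a hedged Python/SQLite stand-in for the NAnt task or command line tool mentioned above, not a description of either: the `run_scripts` helper, the temporary folder and the two sample scripts are assumptions for illustration.

```python
import sqlite3
import tempfile
from pathlib import Path


def run_scripts(conn, folder):
    """Execute every NNNNNN-name.sql file in `folder` in ascending sequence order."""
    scripts = sorted(
        Path(folder).glob("*.sql"),
        key=lambda path: int(path.name.split("-", 1)[0]),  # numeric prefix decides the order
    )
    for script in scripts:
        conn.executescript(script.read_text())
    return [script.name for script in scripts]


with tempfile.TemporaryDirectory() as folder:
    # Written out of order on purpose; the sequence number decides the execution order.
    Path(folder, "000002-ModifiedAccountsTable.sql").write_text(
        "ALTER TABLE accounts ADD COLUMN email TEXT;"
    )
    Path(folder, "000001-CreateAccountsTable.sql").write_text(
        "CREATE TABLE accounts (id INTEGER PRIMARY KEY);"
    )

    # Always start from an empty database, as described above.
    conn = sqlite3.connect(":memory:")
    executed = run_scripts(conn, folder)
    columns = [row[1] for row in conn.execute("PRAGMA table_info(accounts)")]
```

Because the build always starts from an empty database, a script that breaks simply fails the build, and removing or fixing that one file restores a working state.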
First point: it's hard to keep things in order without "regulations".
Or for your example - developers changing anything without a notice will bring you to serious problems.
Anyhow - you say "without using triggers".
Any specific reason for this?
If not - check out DDL Triggers. Such triggers are the easiest way to check if something happened.
And you can even log WHAT was going on.
Hopefully someone has a better solution than this, but I do this using a couple methods:
Have a "trunk" database, which is the current development version. All work is done here as it is being prepared to be included in a release.
Every time a release is done:
The last release's "clean" database is copied to the new one, eg, "DB_1.0.4_clean"
SQL-Compare is used to copy the changes from trunk to the 1.0.4_clean - this also allows checking exactly what gets included.
SQL Compare is used again to find the differences between the previous and new releases (changes from DB_1.0.4_clean to DB_1.0.3_clean), which creates a change script "1.0.3 to 1.0.4.sql".
We are still building the tool to automate this part, but the goal is that there is a table to track every version the database has been at, and if the change script was applied. The upgrade tool looks for the latest entry, then applies each upgrade script one-by-one and finally the DB is at the latest version.
I don't have this problem, but it would be trivial to protect the _clean databases from modification by other team members. Additionally, because I use SQL Compare after the fact to generate the change scripts, there is no need for developers to keep track of them as they go.
We actually did this for a while, and it was a HUGE pain. It was easy to forget, and at the same time, there were changes being done that didn't necessarily make it - so the full upgrade script created using the individually-created change scripts would sometimes add a field, then remove it, all in one release. This can obviously be pretty painful if there are index changes, etc.
The nice thing about SQL Compare is that the script it generates runs in a transaction, and if it fails, it rolls the whole thing back. So if the production DB has been modified in some way, the upgrade will fail, and then the deployment team can actually use SQL Compare on the production DB against the _clean db, and manually fix the changes. We've only had to do this once or twice (damn customers).
The .SQL change scripts (generated by SQL Compare) get stored in our version control system (subversion).
If you have Visual Studio (specifically the Database edition), there is a Database Project that you can create and point it to a SQL Server database. The project will load the schema and basically offer you a lot of other features. It behaves just like a code project. It also offers you the advantage to script the entire table and contents so you can keep it under Subversion.
When you build the project, it validates that the database has integrity. It's quite smart.
On one of our projects we stored the database version inside the database.
Each change to the database structure was scripted into a separate SQL file which, besides making the change itself, incremented the database version. This was done by the developer who changed the db structure.
The deployment script compared the current db version with the latest change script and applied the outstanding SQL scripts if necessary.
Firstly, your production database should either not be accessible to developers, or the developers (and everyone else) should be under strict instructions that no changes of any kind are made to production systems outside of a change-control system.
Change-control is vital in any system that you expect to work (Where there is >1 engineer involved in the entire system).
Each developer should have their own test system; if they want to make changes to that, they can, but system testing should be done on a more controlled system-test environment which has the same changes applied as production. If you don't do this, you can't rely on releases working, because they're being tested in an incompatible environment.
When a change is made, the appropriate scripts should be created and tested to ensure that they apply cleanly on top of the current version, and that the rollback works*
*you are writing rollback scripts, right?
I agree with other posts that developers should not have permissions to change the production database. Either the developers should be sharing a common development database (and risk treading on each others' toes) or they should have their own individual databases. In the former case you can use a tool like SQL Compare to deploy to production. In the latter case, you need to periodically sync up the developer databases during the development lifecycle before promoting to production.
Here at Red Gate we are shortly going to release a new tool, SQL Source Control, designed to make this process a lot easier. We will integrate into SSMS and enable the adding and retrieving objects to and from source control at the click of a button. If you're interested in finding out more or signing up to our Early Access Program, please visit this page:
http://www.red-gate.com/Products/SQL_Source_Control/index.htm
I have to agree with the rest of the posts. Database access restrictions would solve the issue on production. Then using a versioning tool like DBGhost or DVC would help you and the rest of the team to maintain the database versioning.
