I'm using Migrator.NET to write database migrations for the application. Marc-André Cournoyer wrote:
Like any code in your application, you must test your migrations, the ups and the downs code. Make it part of your continuous build process and test it on as many different databases and environments as you can.
How do I do that? Say I have an Up() method that creates a table and a Down() method that drops the same table, and I'm using SQL Server. What would a test look like? Should I be running SQL queries against the system tables, like select * from sys.columns, to check that the table was created and that it has the proper structure? What if we're using NHibernate?
EDIT
I mean migrations in the Rails ActiveRecord Migrations sense (creating, modifying and tearing down databases in small steps based on C# code).
EDIT 2
And here's where I read that we should test migrations. The blog post is actually linked from Migrator's wiki.
Do you test your DAL - some sort of integration test?
You need more than a migration script, you also need a baseline script. When you want to test a database upgrade, you should run all the scripts from the baseline on a testing/staging server to create the newest version of the database. Then test your DAL against the up-to-date test database. If all the DAL tests succeed then your migration should have been successful (otherwise your DAL tests are not complete enough).
It's an expensive test to run, but it's pretty much rock solid. I'll personally admit to doing a lot of this manually at the moment; we have an in-house migration tool that will apply all scripts (including the baseline), so the test database setup and DAL tests are separate steps. It works though. If you want to make sure that a table was created, there's no better method than to actually try to insert data into it!
You can try to verify the results by looking at system catalogs and INFORMATION_SCHEMA views and so on, but ultimately the only way to be sure it's actually working is to try to use the new objects. Just because the objects are there doesn't mean that they're functional.
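For example, here is a minimal sketch of such a test using NUnit and plain ADO.NET. RunMigration is a hypothetical helper standing in for however you drive your migrator (Migrator.NET, a script runner, etc.), the version numbers are placeholders, and the connection string points at a disposable test database:

// Sketch of a migration round-trip test: migrate up, check the structure,
// prove the table is usable by inserting into it, then migrate back down.
using System.Data.SqlClient;
using NUnit.Framework;

[TestFixture]
public class CreateCustomersTableMigrationTests
{
    private const string ConnectionString =
        @"Server=(local);Database=MyAppMigrationTests;Integrated Security=true";

    [Test]
    public void Up_creates_a_usable_Customers_table()
    {
        RunMigration(2);  // hypothetical helper: migrate up to the version that adds the table

        // Check structure via INFORMATION_SCHEMA...
        Assert.IsTrue(ColumnExists("Customers", "Name"));

        // ...but also prove the table actually works by inserting data into it.
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(
            "INSERT INTO Customers (Name) VALUES ('test')", connection))
        {
            connection.Open();
            Assert.AreEqual(1, command.ExecuteNonQuery());
        }

        RunMigration(1);  // migrate back down; the table should be gone again
        Assert.IsFalse(ColumnExists("Customers", "Name"));
    }

    private static bool ColumnExists(string table, string column)
    {
        using (var connection = new SqlConnection(ConnectionString))
        using (var command = new SqlCommand(
            "SELECT COUNT(*) FROM INFORMATION_SCHEMA.COLUMNS " +
            "WHERE TABLE_NAME = @table AND COLUMN_NAME = @column", connection))
        {
            command.Parameters.AddWithValue("@table", table);
            command.Parameters.AddWithValue("@column", column);
            connection.Open();
            return (int)command.ExecuteScalar() > 0;
        }
    }

    private static void RunMigration(long version)
    {
        // Placeholder: call your migration runner here (Migrator.NET, a script runner, etc.).
    }
}

If either the insert or the schema check fails, the migration (or the test database) is not in the state the code expects.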
Maybe this script can help you:
http://www.benzzon.se/forum/uploads/benzzon/2006-03-27_134824_sp_CompareDB.txt
This script compares two databases (structure and data).
Source control is for taking a snapshot of your current code base. Migration is for moving your database from one version to the next, so that at some future point you can take an old database, apply migrations, and work with the latest code base.
I've never seen the actual migrations tested. I have seen the results tested, and they have caught/reminded me to run the latest migrations.
describe User do
  it { should have_column :name, :type => :string }
  it { should validate_presence_of :name }
end
So someone changes the model. Adds a test to reflect the model. Adds the migration. Then commits the source.
You grab the latest, run tests. Tests fail because the database doesn't correspond. You remember to run migrations, then rerun tests. Success.
Treat migration testing as part of your overall persistence testing strategy if you are using NHibernate: if you can create and save all of your entities without any errors, your database and your mappings should be correct.
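For example, a rough sketch of such a persistence smoke test with NUnit; the Customer entity is invented for the example, and the session factory would be built from the same NHibernate configuration your application uses:

// If every mapped entity can be saved and re-loaded without errors, the schema and
// the NHibernate mappings are very likely in sync.
using NHibernate;
using NUnit.Framework;

public class Customer
{
    public virtual int Id { get; set; }
    public virtual string Name { get; set; }
}

[TestFixture]
public class PersistenceSmokeTests
{
    // Built from your NHibernate Configuration in a [SetUp] method (omitted here).
    private ISessionFactory _sessionFactory;

    [Test]
    public void Customer_can_be_saved_and_reloaded()
    {
        object id;
        using (var session = _sessionFactory.OpenSession())
        using (var tx = session.BeginTransaction())
        {
            id = session.Save(new Customer { Name = "test" });
            tx.Commit();
        }

        using (var session = _sessionFactory.OpenSession())
        {
            var reloaded = session.Get<Customer>(id);
            Assert.IsNotNull(reloaded);
            Assert.AreEqual("test", reloaded.Name);
        }
    }
}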
You COULD do a comparison of database system objects, but you would need a target against which to compare; otherwise, how would you know whether it passed or failed?
I think you may be better off creating a set of edge-case CRUD test cases that exercise the entities or operations in the data layer. If any of these fail, the database is not in sync with what is required, e.g. if the insert of a char(20) field fails because the column is only char(15) in the database. Then a DB structure comparison can be done to see what is off.
You may be able to short circuit this by focusing only on the recently changed items, and assuming prior changes have been applied.
I'm looking for an answer to this as well. I think this should be tested in an integration environment rather than in unit tests: for my unit tests (DAL) I drop the database and re-create it.
However, ideally I'd like to have an integration environment where my DB is replicated from production and DB migration scripts run both ways:
upwards to ensure a smooth upgrade of production, and downwards to ensure rollbacks are possible.
Related
In my situation I use a tool that generates SQL statements to contain all database init/create statements. How does Flyway provide value beyond what my tool provides? Why should I care to write hand-coded migration scripts to use Flyway?
The question above mixes two things that should be kept separate: the concept of database creation and the concept of migration.
database creation
Given a complete database and an empty database, you can use many tools to generate the scripts needed to recreate the complete database where nothing exists. In Flyway terms, you are just creating a baseline. This isn't the concept of migration at all. Of course, given a V2.0 database, you could take any V1.0 database, blow it away, and install the V2.0 database, but now you've lost your data.
migration
Given a complete V2.0 database and an older V1.0 database, you want the V1.0 database to be "upgraded" to V2.0. In the database world this is called a migration, because the existing V1.0 data needs to be rearranged so that it works on V2.0. Now you need a script that not only creates/alters tables but also does some ETL (extract the data, transform it so it can be loaded into the new table structures, alter the old database to the new table structures, then load the data into the database). This may or may not be trivial, depending on the change. You build the script to do it; Flyway manages executing that script.
Flyway
Flyway enables the following:
Migration scripts become part of the software asset. They are versioned so that baseline/migration scripts can be maintained in source control in a way that migration becomes a repeatable feature as opposed to "one off" scripting work.
Flyway maintains a meta table in each database it works with, so it knows which scripts have already been applied.
Flyway can apply migrations in a completely automated way that removes manual execution errors.
Flyway enables the creation of migration scripts as part of development (much as Test Driven Development makes unit test creation an integral part of development), so that all your database development is captured in the form of migration scripts rather than in "one off" scripts written as migrations are needed.
It's common when using Flyway to update any previous version of your application in seconds via a single command. It becomes so easy that the stress of migrating from an old DB to a new version goes away, and evolution of the DB becomes easy and routine.
To use Flyway well, it requires changing your workflow: every time you develop a change in your development DB, put the change into a migration script so you can execute those changes against all the older DB versions that exist in the world. And those scripts are checked into your application's source code, making migration a first-class citizen of your software asset just like any other functionality.
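For illustration, here is a rough sketch of wiring that workflow into a .NET build or deploy step by shelling out to the Flyway command line. The connection details, script folder and executable location are placeholders for your environment; the scripts themselves follow Flyway's V<version>__<description>.sql naming convention:

// Rough sketch: invoking the Flyway CLI from a C# deploy step.
using System;
using System.Diagnostics;

public static class FlywayRunner
{
    public static void Migrate()
    {
        // Scripts live in source control, e.g. sql/V1__create_customers.sql,
        // sql/V2__add_customer_email.sql (Flyway's V<version>__<description>.sql naming).
        var startInfo = new ProcessStartInfo
        {
            FileName = "flyway",  // placeholder: path to the Flyway executable
            Arguments = "-url=jdbc:sqlserver://localhost;databaseName=MyApp " +
                        "-user=deploy -password=secret " +
                        "-locations=filesystem:sql migrate",
            UseShellExecute = false
        };

        using (var process = Process.Start(startInfo))
        {
            process.WaitForExit();
            if (process.ExitCode != 0)
                throw new Exception("Flyway migration failed");
        }
    }
}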
It depends very much on your use case.
If you plan to write a simple application with a database structure that will remain static over the lifetime of the application, Flyway will add very little value.
If the project is expected to have a dynamic design over its lifetime, with changes taking place on the schema, Flyway provides a formal structure in which the changes can be expressed and viewed. This formal structure can also be very helpful if you end up with a larger team working on the project, as Flyway can then become part of the framework for handling things like multi-schema CI work.
One key thing is that you do not have to start with Flyway; you can add it at a later point, normally with limited retooling, as the schema at that point in time will simply become your baseline to which all future changes can be added.
I want each of my unit (integration) test methods to use a clean and consistent database and specific test data for each test.
Could you please provide some code samples/snippets and tell me what the best practices are for the following questions, for both the EF 5 database-first and model-first scenarios.
How to create the database for each test method?
How to set up the test data for each test method?
How to delete the database for each test method?
An SSDT project is used to handle the database schema; how do I use the current SSDT schema for each test run, so that the tests are always executed against the current development version of the database?
Please consider the following assumptions for the above questions:
The unit tests shall be executed locally on dev machine and on server CI builds.
Each test may have different test data.
Manually defined .mdf test files should be avoided because several developers are working on the product, and there is a potential risk that one developer overwrites the changes to the .mdf file which another developer may have checked in previously; the development process should be as simple as possible.
SSDT is used, so maybe this is an option to create the database (probably not a good one because I want the database to be created for each test), but I have no deep knowledge yet about SSDT's possibilities.
Good test execution performance would be nice to have.
VS/TFS 2012 is used.
SQL Server 2012 is used.
Application is a C# desktop application.
Mocking EF context etc. is not an option.
I hope you can guide me in the right direction on how to solve the four questions above. I don't know if EF provides some functionality (I think only for code first) for my challenges, or if this all must be solved by executing SQL scripts or something like that.
Thanks!
I'm trying to work out a proper database development process for my applications. I've tried Visual Studio database projects with post/pre-deployment scripts (a very nice feature), the Entity Framework Database First approach (with a separate script for each database change placed under source control), and now I'm dealing with the Entity Framework Code First approach. I have to say that I'm really impressed with the possibilities it gives, but I'm trying to figure out how to manage the changes to the models during development. Assuming that I have the following environments in my company:
LOCALHOST - for each single developer,
TEST - single machine with SQL Server database for testing purposes,
PRODUCTION - single machine with SQL Server database used by clients
Now, each time I'm working on the application and the code changes, it's OK for me to drop and recreate the database whenever I'm testing (so for the LOCALHOST and TEST environments). I've created proper database initializers that seed the database with test data, and I'm pretty happy with them.
However, with each new build when the model changes, I want to handle the PRODUCTION database changes in such a way that I won't lose the existing data. So, in Visual Studio 2012 there is the "SQL Schema Compare" tool, and I'm just wondering whether it isn't enough to manage all database changes for PRODUCTION. Can't I compare my {local} database schema with the PRODUCTION schema and simply apply all changes?
Now, I want to ask what the point of Code First Migrations is here. Why should I manage all changes in the database through it? The only reason I can find is that it allows me to perform all sorts of "INSERT" and "UPDATE" commands. However, I think that if the database is correctly designed there shouldn't be a need to perform these commands. (It's a topic for another discussion, so I don't want to go into details.) Anyway, I want to ask: what are the real advantages of Code First Migrations over the Code First + Schema Compare pattern?
It simplifies deployment. If you didn't manage the migrations in code, then you would have to run the appropriate delta scripts manually on your production environment. With EF migrations, you can configure your application to migrate the database automatically to the latest version on start up.
Typically, before EF migrations, if you wanted to automate this you would either have to run the appropriate delta scripts during a custom installation routine, or write some infrastructure into your application which runs the delta scripts in code. This would need to know the current database version, so that it knows which of the scripts to run, which you would normally have in a DbVersion table or something similar. With EF migrations, this plumbing is already in place for you.
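For example, here is a minimal sketch of that wiring with EF Code First migrations; MyDbContext and Configuration stand for the context and the migrations configuration class that Enable-Migrations generates in your own project:

// Two common ways to bring the database up to date with EF Code First migrations.
using System.Data.Entity;
using System.Data.Entity.Migrations;

public static class DatabaseBootstrapper
{
    // Option 1: initializer that migrates the database the first time the context is used.
    public static void UseInitializer()
    {
        Database.SetInitializer(
            new MigrateDatabaseToLatestVersion<MyDbContext, Configuration>());
    }

    // Option 2: run the migrator explicitly, e.g. at application start-up or from a deploy step.
    public static void MigrateNow()
    {
        var migrator = new DbMigrator(new Configuration());
        migrator.Update();   // applies any pending migrations, tracked in __MigrationHistory
    }
}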
Using migrations means the alignment of model and database changes is automated and therefore easier to manage.
I have set up a system where I have taken the model-first approach, as it made more logical sense for me. Now, whenever I have some changes in the model, what I currently do is:
Use the "Generate Database from Model" feature of Entity Framework. I create a dummy database and apply those scripts, which first deletes all my data and tables and then updates the database with the latest SQL file generated by Entity Framework.
Then I use Visual Studio's schema compare feature and generate migration scripts for my local database and also for the one in production.
I go through the scripts manually and verify them. Once that is done I run the migration scripts on the production instances.
Question: The main problem is that this is really tedious, and since I do it from my local system, connecting to my prod databases is very slow and sometimes my Visual Studio also crashes. Is there a cleaner, more automated approach, such that my laptop is not responsible for the database migrations on the production instances?
You can try the Database Migration Power Pack; it allows creating change scripts instead of full database scripts, but behind the scenes it performs the same procedure you did by hand. The problem is that the mentioned tool will not work with EF 5.
Unfortunately EF migrations currently don't support models created through EDMX. Migrations support only code first approach at the moment.
In a Schema First design I use ApexSQL Diff (quite likely very similar to RedGate's product, perhaps a bit cheaper) - a good 3rd party tool is much easier to use than a VS Database Project and is easy to apply with a script-application tool like RoundHousE.
Using it in a Model First approach can follow the Schema First approach using a cycle of Model‑Schema‑Diff‑Schema‑Model as described in the post; consider these guidelines/notes below to make for a streamlined process. The schema-diff approach does not need to be tedious, slow, or excessively manual.
The current version of the database schema is obtained by applying a sequence of database patches (or DDL/DML scripts).
A tool (we use RoundHousE) automatically applies the scripts as needed. It records information to know which scripts have been applied, and applying the same scripts is idempotent (a stripped-down sketch of this pattern follows these notes).
Diffs are done against a local database; this local database can be built up from all the previous change scripts in an automated fashion. This latest-local is always the diff target for the latest model changes.
The remote/live database is never used as a diff target. The same scripts can be applied later to the test (and then live) databases. Since everything is done the same way, the process is repeatable on all databases.
The only "issue" is that an update that is not well thought out may lead to data that is invalid under new restrictions/constraints. Of course, this was easy to identify, fix, and re-diff before pushing to the live database.
Once a diff is committed to source control it must be applied on the branch. To "undo" a previously committed change script requires creating a new diff applying an inverse action; there is no implicit down-version.
We have a [Hg] model branch that effectively acts as a schema lock that must be unified against; this could be viewed as a weak point, but it has worked well with small-team development.
A tool like Huagati DBML/EDMX is used to synchronize the Schema back to the Model which is really useful when developing. This little gem really pays for itself and is part of the cycle. When this is employed it's easy to also "update to a model" or make Schema changes in SSMS (or whatever) and then bring them back over.
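Here is the stripped-down sketch of the apply-and-record pattern mentioned above, as used by tools like RoundHousE and Flyway. The AppliedScripts table and the folder layout are invented for the example; a real tool also handles GO batch separators, transactions, logging and so on:

// Apply change scripts in order and record each one in a tracking table so the whole
// set can be re-run safely against any database.
using System.Data.SqlClient;
using System.IO;
using System.Linq;

public static class ChangeScriptRunner
{
    public static void ApplyPendingScripts(string connectionString, string scriptFolder)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // Tracking table: one row per script that has already been applied.
            Execute(connection,
                "IF OBJECT_ID('AppliedScripts') IS NULL " +
                "CREATE TABLE AppliedScripts (ScriptName nvarchar(260) PRIMARY KEY, " +
                "AppliedOn datetime NOT NULL DEFAULT GETDATE())");

            // Apply scripts in name order; skip anything already recorded (idempotent re-runs).
            foreach (var path in Directory.GetFiles(scriptFolder, "*.sql").OrderBy(p => p))
            {
                var name = Path.GetFileName(path);
                if (AlreadyApplied(connection, name))
                    continue;

                Execute(connection, File.ReadAllText(path));

                using (var record = new SqlCommand(
                    "INSERT INTO AppliedScripts (ScriptName) VALUES (@name)", connection))
                {
                    record.Parameters.AddWithValue("@name", name);
                    record.ExecuteNonQuery();
                }
            }
        }
    }

    private static bool AlreadyApplied(SqlConnection connection, string name)
    {
        using (var command = new SqlCommand(
            "SELECT COUNT(*) FROM AppliedScripts WHERE ScriptName = @name", connection))
        {
            command.Parameters.AddWithValue("@name", name);
            return (int)command.ExecuteScalar() > 0;
        }
    }

    private static void Execute(SqlConnection connection, string sql)
    {
        using (var command = new SqlCommand(sql, connection))
        {
            command.ExecuteNonQuery();
        }
    }
}

Because each script is applied once and recorded, re-running the full set against any database simply fast-forwards it to the latest version.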
The Code First migrations are "OK" (and definitely better than naught!), but I'm only using them because Azure SQL (aka SQL Database) is not supported by advanced diff tooling due to not exposing various sys information. (The diffs can be done locally as per normal, but ApexSQL Diff generates DDL/DML that is not always friendly with Azure SQL - plus, it's a chance for me to learn a slightly different approach :-)
Some advantages of Code First migrations via the Power Pack: can perform update tasks in C# instead of being limited to the DDL/DML (can be convenient), automatic downgrades (although I question their use), do not need to purchase a 3rd party tool (can be expensive), easier integration/deployment to Azure SQL, less tied to a specific database vendor (in theory), etc.
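As an illustration of the first point (update tasks in C#), here is a Code First migration that mixes a schema change with a data fix-up written in the migration itself; the table and column names are invented for the example:

// EF Code First migration combining a DDL change with an arbitrary data update.
using System.Data.Entity.Migrations;

public partial class AddCustomerDisplayName : DbMigration
{
    public override void Up()
    {
        AddColumn("dbo.Customers", "DisplayName", c => c.String(maxLength: 100));
        // Data fix-up alongside the schema change:
        Sql("UPDATE dbo.Customers SET DisplayName = Name WHERE DisplayName IS NULL");
    }

    public override void Down()
    {
        DropColumn("dbo.Customers", "DisplayName");
    }
}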
While Code First migrations (and automation of such) are a good step forward vs. the absolutely horrid Drop-and-Recreate approach, I much prefer dedicated SQL tooling when developing.
As you develop an application, database changes inevitably pop up. The trick, I find, is keeping your database build in step with your code. In the past I have added a build step that executed SQL scripts against the target database, but that is dangerous insomuch as you could inadvertently add bogus data or worse.
My question is: what are the tips and tricks for keeping the database in step with the code? What about when you roll back the code? Branching?
Version numbers embedded in the database are helpful. You have two choices: embedding values into a table that can be queried (which allows versioning multiple items), or having an explicitly named object (such as a table or somesuch) that you can test for.
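As a small illustration of the first option, a version value kept in a table that build and deployment steps can query; SchemaVersion is just an example name:

// Read the highest applied schema version from a version table (assumes at least one row).
using System.Data.SqlClient;

public static class SchemaVersionCheck
{
    public static int GetSchemaVersion(string connectionString)
    {
        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT MAX(Version) FROM SchemaVersion", connection))
        {
            connection.Open();
            return (int)command.ExecuteScalar();
        }
    }
}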
When you release to production, do you have a rollback plan in the event of an unexpected catastrophe? If you do, is it the application of a schema rollback script? Use your rollback script to roll back the database to a previous code version.
You should be able to create your database from scratch into a known state.
While being able to do so is helpful (especially in the early stages of a new project), many (most?) databases will quickly become far too large for that to be possible. Also, if you have any BLOBs then you're going to have problems generating SQL scripts for your entire database.
I've definitely been interested in some sort of DB versioning system, but I haven't found anything yet. So, instead of a solution, you'll get my vote. :-P
You really do want to be able to take a clean machine, get the latest version from source control, build in one step, and run all tests in one step. Making this fast makes you produce good software faster.
Just like external libraries, database configuration must also be in source control.
Note that I'm not saying that all your live database content should be in the same source control, just enough to get to a clean state. (Do back up your database content, though!)
Define your schema objects and your reference data in version-controlled text files. For example, you can define the schema in Torque format, and the data in DBUnit format (both use XML). You can then use tools (we wrote our own) to generate the DDL and DML that take you from one version of your app to another. Our tool can take as input either (a) the previous version's schema & data XML files or (b) an existing database, so you are always able to get a database of any state into the correct state.
I like the way that Django does it. You build models, and when you run syncdb it applies the models that you have created. If you add a model you just need to run syncdb again. This would be easy to have your build script do every time you make a push.
The problem comes when you need to alter a table that has already been created. I do not think that syncdb handles that. That would require you to go in and manually make the change to the table and also add a property to the model. You would probably want to version that alter statement. The models would always be under version control, though, so if you needed to you could get a DB schema up and running on a new box without running the SQL scripts. Another problem with this is keeping track of static data that you always want in the DB.
Rails migration scripts are pretty nice too.
A DB versioning system would be great, but I don't really know of such a thing.
While being able to do so is helpful (especially in the early stages of a new project), many (most?) databases will quickly become far too large for that to be possible. Also, if you have any BLOBs then you're going to have problems generating SQL scripts for your entire database.
Backups and compression can help you there. Sorry, there's no excuse not to be able to get a good set of data to develop against, even if it's just a subset.
Put your database development under version control. I recommend having a look at neXtep designer:
http://www.nextep-softwares.com/wiki
It is a free GPL product which offers a brand new approach to database development and deployment by connecting version information with a SQL generation engine that can automatically compute any upgrade script you need to take one version of your database to another. Any existing database can be brought under version control via reverse synchronization.
It currently supports Oracle, MySQL and PostgreSQL; DB2 support is under development. It is a full-featured database development environment where you always work on version-controlled elements from a repository. You can publish your updates by simple synchronization during development, and you can generate exportable database deliveries which you can execute on any targeted database through a standalone installer that validates the versions, performs structural checks and applies the upgrade scripts.
The IDE also offers you SQL editors, dependency management, support for modular database model components, data model diagrams, SQL clients and much more.
All the documentation and concepts can be found in the wiki.