Code first migrations - is it really necessary?

I'm trying to work out a proper database development process for my applications. I've tried Visual Studio Database projects with Post/Pre deployment scripts (very nice feature), the Entity Framework Database First approach (with a separate script for each database change placed under source control), and now I'm dealing with the Entity Framework Code First approach. I have to say that I'm really impressed with the possibilities it gives, but I'm trying to figure out how to manage changes to the models during development. Assume that I have the following environments in my company:
LOCALHOST - for each single developer,
TEST - single machine with SQL Server database for testing purposes,
PRODUCTION - single machine with SQL Server database used by clients
Now, whenever the code changes while I'm working on an application, it's OK for me to drop and recreate the database each time I test it (so for the LOCALHOST and TEST environments). I've created proper database initializers that seed the database with test data and I'm pretty happy with them.
However, with each new build where the model changes, I want to handle the PRODUCTION database changes in such a way that I won't lose all the data. In Visual Studio 2012 there is the "SQL Schema Compare" tool, and I'm wondering whether it is enough to manage all changes to the PRODUCTION database. Can I compare my local database schema with the PRODUCTION schema and simply apply all changes?
So what's the point of Code First Migrations here? Why should I manage all database changes through it? The only reason I can find is that it allows all sorts of "INSERT" and "UPDATE" commands to be performed. However, I think that if the database is correctly designed there shouldn't be any need to run these commands. (That's a topic for another discussion, so I don't want to go into details.) Anyway, I want to ask: what are the real advantages of Code First Migrations over the Code First + Schema Compare pattern?

It simplifies deployment. If you didn't manage the migrations in code, then you would have to run the appropriate delta scripts manually on your production environment. With EF migrations, you can configure your application to migrate the database automatically to the latest version on start up.
Typically, before EF migrations, if you wanted to automate this you would either have to run the appropriate delta scripts during a custom installation routine, or write some infrastructure into your application which runs the delta scripts in code. This would need to know the current database version, so that it knows which of the scripts to run, which you would normally have in a DbVersion table or something similar. With EF migrations, this plumbing is already in place for you.
Using migrations means the alignment of model and database changes is automated and therefore easier to manage.
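To make the "plumbing" concrete, here is a minimal sketch of what a hand-rolled version would look like: a version table plus ordered delta scripts applied at start-up. It is not how EF migrations are implemented. It uses Python's sqlite3 module as a stand-in for SQL Server, and the `db_version` table, the `MIGRATIONS` dictionary and the function names are all made up for illustration.

```python
import sqlite3

# Hypothetical delta scripts, keyed by the version they upgrade the database TO.
MIGRATIONS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL)",
    2: "ALTER TABLE customer ADD COLUMN email TEXT",
}

def current_version(conn):
    """Read the highest recorded version; 0 means a brand-new database."""
    conn.execute("CREATE TABLE IF NOT EXISTS db_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM db_version").fetchone()
    return row[0] or 0

def migrate_to_latest(conn):
    """Apply every delta script newer than the recorded version, in order."""
    applied = []
    start = current_version(conn)
    for version in sorted(MIGRATIONS):
        if version > start:
            conn.execute(MIGRATIONS[version])
            # Record the new version once the script has run.
            conn.execute("INSERT INTO db_version (version) VALUES (?)", (version,))
            conn.commit()
            applied.append(version)
    return applied

conn = sqlite3.connect(":memory:")
first_run = migrate_to_latest(conn)   # fresh database: both scripts run
second_run = migrate_to_latest(conn)  # already current: nothing to do
```

Calling `migrate_to_latest` when the application starts is the part that migrations give you for free; without them you write and maintain this bookkeeping yourself.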

Related

Migration using model first approach in entity framework

I have set up a system where I took the model first approach, as it made more logical sense to me. Now, whenever I have changes in the model, this is what I currently do:
Use the Generate Database from Model feature of Entity Framework. I create a dummy database and apply the generated scripts, which first delete all my data and tables and then rebuild the database from the latest SQL file generated by Entity Framework.
Then I use Visual Studio's schema compare feature to generate migration scripts for my local database and for the one in production.
I go through the scripts manually and verify them. Once that is done I run the migration scripts on the production instances.
Question: The main problem is that this is really tedious, and since I do it from my local system, connecting to my prod databases is very slow and sometimes Visual Studio crashes. Is there a cleaner, more automated approach, so that my laptop is not responsible for the database migrations on the production instances?
You can try the Database Migration Power Pack - it allows creating change scripts instead of full database scripts, but behind the scenes it follows the same procedure you did by hand. The problem is that the tool will not work with EF5.
Unfortunately EF migrations currently don't support models created through EDMX. Migrations support only code first approach at the moment.
In a Schema First design I use ApexSQL Diff (quite likely very similar to RedGate's product, perhaps a bit cheaper) - a good 3rd party tool is much easier to use than a VS Database Project and is easy to apply with a script-application tool like RoundHousE.
Using it in a Model First approach can follow the Schema First approach using a cycle of Model‑Schema‑Diff‑Schema‑Model as described in the post; consider these guidelines/notes below to make for a streamlined process. The schema-diff approach does not need to be tedious, slow, or excessively manual.
The current version of the database schema is obtained by applying a sequence of database patches (or DDL/DML scripts).
A tool (we use RoundHousE) automatically applies the scripts, as needed. It records information to know which scripts have been applied. Applying the same scripts is idempotent.
Diff done against a local database; this local database can be built up from all the previous change scripts in an automated fashion. This latest-local is always the diff target for the latest model changes.
The remote/live database is never used as a diff target. The same scripts can be applied later to the test (and then live) databases. Since everything is done the same way then the process is repeatable on all databases.
The only "issue" is that an update that is not well thought out may lead to data that is invalid under new restrictions/constraints. Of course, this was easy to identify, fix, and re-diff before pushing to the live database.
Once a diff is committed to source control it must be applied on the branch. To "undo" a previously committed change script you have to create a new diff that applies the inverse action. There is no implicit down-version.
We have a [Hg] model branch that effectively acts as a schema lock that must be unified against; this could be viewed as a weak point, but it has worked well with small-team development.
A tool like Huagati DBML/EDMX is used to synchronize the Schema back to the Model which is really useful when developing. This little gem really pays for itself and is part of the cycle. When this is employed it's easy to also "update to a model" or make Schema changes in SSMS (or whatever) and then bring them back over.
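The script-tracking part of the guidelines above can be sketched in a few lines. This is only an illustration of the idea (apply each change script once, record that it ran, and undo by adding a new forward script), not how RoundhousE itself works. It uses Python's sqlite3 as a stand-in database, and the script names and the `scripts_run` table are invented for the example.

```python
import sqlite3

# Hypothetical change scripts in commit order. 0003 "undoes" 0002 by applying
# the inverse action as a new forward script; nothing is ever rolled back.
SCRIPTS = [
    ("0001_create_orders.sql", "CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL)"),
    ("0002_add_total_index.sql", "CREATE INDEX ix_orders_total ON orders (total)"),
    ("0003_drop_total_index.sql", "DROP INDEX ix_orders_total"),
]

def apply_scripts(conn, scripts):
    """Run each script that has not been recorded yet and record it by name."""
    conn.execute("CREATE TABLE IF NOT EXISTS scripts_run (name TEXT PRIMARY KEY)")
    already = {row[0] for row in conn.execute("SELECT name FROM scripts_run")}
    ran = []
    for name, sql in scripts:
        if name in already:
            continue  # applying the same script list twice is a no-op
        conn.executescript(sql)
        conn.execute("INSERT INTO scripts_run (name) VALUES (?)", (name,))
        conn.commit()
        ran.append(name)
    return ran

def index_names(conn):
    """List the user-created indexes (the ones following the ix_ convention)."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'index' AND name LIKE 'ix_%'")
    return [r[0] for r in rows]

local = sqlite3.connect(":memory:")
first = apply_scripts(local, SCRIPTS[:2])  # state before the "undo" was committed
second = apply_scripts(local, SCRIPTS)     # only 0003 is new
third = apply_scripts(local, SCRIPTS)      # nothing left to do
```

Because the same ordered list is replayed everywhere, the local, test and live databases all arrive at the same schema regardless of which version they started from.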
The Code First migrations are "OK" (and definitely better than naught!), but I'm only using them because Azure SQL (aka SQL Database) is not supported by advanced diff tooling due to not exposing various sys information. (The diffs can be done locally as per normal, but ApexSQL Diff generates DDL/DML that is not always friendly with Azure SQL - plus, it's a chance for me to learn a slightly different approach :-)
Some advantages of Code First migrations via the Power Pack: can perform update tasks in C# instead of being limited to the DDL/DML (can be convenient), automatic downgrades (although I question their use), do not need to purchase a 3rd party tool (can be expensive), easier integration/deployment to Azure SQL, less tied to a specific database vendor (in theory), etc.
While Code First migrations (and automation of such) are a good step forward vs. the absolutely horrid Drop-and-Recreate approach, I much prefer dedicated SQL tooling when developing.

How do you deal with multiple developers and database changes?

I would like to know how you guys deal with development database changes in groups of 2 or more devs. Do you have a global DB everyone accesses, or maybe local copies with script changes applied manually? It would be nice to see the pros and cons you've noticed for each approach, and the number of devs in your team.
Start with "Evolutionary Database Design" by Martin Fowler. This sums it up nicely
There have been other questions about DB development that may be useful too, for example "Is RedGate SQL Source Control for me?"
Our approach is that everyone has their own DB, the complete DB can be created from create scripts with base data if required. All the scripts required for this are in source control.
All scripts are CREATE scripts and they reflect the current state of the database schema. Upgrades are in separate SQL files which can upgrade existing DBs from a specific version to a newer one (run sequentially). After all the updates have been applied, the schema must be identical to what you would get from running the setup scripts.
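The rule that the upgrade path must end up identical to the create scripts is easy to check automatically. Below is a rough sketch of such a check. It uses Python's sqlite3 rather than SQL Server, the `product` table and the script contents are invented, and the snapshot only compares column names, types, nullability and primary keys (not defaults, indexes or constraints).

```python
import sqlite3

# Hypothetical scripts: the old create script, the upgrade from it, and the
# current create script that should describe the same end state.
CREATE_V1 = "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL);"
UPGRADES = ["ALTER TABLE product ADD COLUMN price REAL;"]
CREATE_CURRENT = (
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL);"
)

def build(*scripts):
    """Create a throwaway database and run the given scripts in order."""
    conn = sqlite3.connect(":memory:")
    for script in scripts:
        conn.executescript(script)
    return conn

def schema_snapshot(conn):
    """Return {table: [(column, type, notnull, pk), ...]} for every table."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    snapshot = {}
    for table in tables:
        cols = conn.execute(f"PRAGMA table_info('{table}')").fetchall()
        snapshot[table] = [
            (name, ctype, notnull, pk)
            for _cid, name, ctype, notnull, _default, pk in cols
        ]
    return snapshot

fresh = schema_snapshot(build(CREATE_CURRENT))
upgraded = schema_snapshot(build(CREATE_V1, *UPGRADES))
stale = schema_snapshot(build(CREATE_V1))  # an upgrade script was forgotten
```

Comparing `fresh` with `upgraded` in a test catches the case where someone edits the create script but forgets the matching upgrade, or the other way round.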
We have some tools to do this (we use SQL Server and .NET):
Scripting is done with a tool which also applies a standard formatting so that the changes are well traceable with text diff tools (and by the SCM)
A runtime module takes care of comparing the existing DB objects, running updates if required, automatically applying "non-destructive" changes, and then checking the DB objects again to ensure a correct migration before committing the changes
The toolset is available as open-source project (licensed under LGPL), it's called the bsn ModuleStore (note that it is limited to SQL Server 2005/2008/Azure and to .NET for the runtime part).
We use what was code named "Data Dude" - the database features in TFS and Visual Studio - to deal with this. When you "get latest" and bring in code that relies on a schema change, you also bring in the revised schemas, stored procedures etc. You right-click the database project and Deploy; that gets your local schema and sp in sync but doesn't overwrite your data. The job of working out the script to get you from your old schema to the new one falls to Visual Studio, not to you or your DBA. We also have "populate" scripts for things like lists of provinces and a deploy runs them for you.
So much better than the old way which always fell apart at high stress times, with people checking in code then going home and nobody knowing what columns to add to make the code work etc.

How to Develop TSQL in Visual Studio 2010 Database Projects

Silly sounding question, I know... Let me lay some groundwork first.
I have successfully created a database project comprising the hundreds of tables, stored procedures, indexes, et al. that make up our production database.
I have successfully added the solution to source control (TFS).
I have made a change (as a test) to some of the objects and generated a deployment script, and the whole system is very impressive, I must say. But it seems the strength of VS 2010, from a DB perspective is deployment, and not necessarily development.
I am totally baffled on the day-to-day workflow involved in database/TSQL development using Visual Studio. Let's suppose I need to add a few columns to a table, and modify related stored procedures to return/update this data for these columns.
While it's easy enough to modify all the scripts in my database model, I'd like to be able to isolate them against a dev database where I can do some testing... But I run into things as simple as not being able to update a proc if it exists without manually changing the script to an ALTER (or adding DROP code prior to the CREATE). Having to do this once or twice is a non-issue, but in a real dev environment, we do this all day long.
Perhaps the answer is to perform frequent deployments to the dev server, as I debug and make changes to procs, for instance? Quite a bit of overhead; I could execute the necessary scripts manually in a few seconds, building and deploying takes a few minutes. Plus, if three of us are deploying different changes to a dev DB, wouldn't we overwrite each other's modifications?
Sorry to be so longwinded, but I can't help but think I am missing something simple here.
Are there any books/tutorials/webinars that showcase this type of approach to actual development?
I think you've hit the nail on the head. In order to test your modified stored procedures, you have to go through the deployment step to update your database. That's the drawback of the offline development model.
Here at Red Gate we've had numerous requests to make SQL Source Control support the Database Project, which would allow developers to benefit from the 'online' development model whilst still benefiting from the Database Project features.
[EDIT] We've added 'Beta' support for the database project in SQL Source Control, which allows connected SSMS development against the database project format. Simply link to the folder with the .sqlproj file from SQL Source Control and start developing! [/EDIT]
In the meantime, you'll have to keep deploying to dev on a regular basis!
An alternative is to develop on a real database, and use the Schema Compare feature to synchronize back to your Database Project. Schema Compare is available in the Premium and Ultimate editions of Visual Studio.
David Atkinson
Product Manager
Red Gate Software

Testing and Managing database versions against code versions

As you develop an application, database changes inevitably pop up. The trick I find is keeping your database build in step with your code. In the past I have added a build step that executed SQL scripts against the target database, but that is dangerous in that you could inadvertently add bogus data or worse.
My question is what are the tips and tricks to keep the database in step with the code? What about when you roll back the code? Branching?
Version numbers embedded in the database are helpful. You have two choices: embedding values in a table that can be queried (which allows versioning multiple items), or having an explicitly named object (such as a table or somesuch) you can test for.
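Both choices are cheap to check from code. The following is a small sketch using Python's sqlite3 as a stand-in database; the `schema_version` table, its `component` column and the `schema_v7` marker table are assumptions made for the example, not a standard convention.

```python
import sqlite3

def version_from_table(conn, component):
    """Choice 1: look up a version value stored per component in a table."""
    row = conn.execute(
        "SELECT version FROM schema_version WHERE component = ?", (component,)
    ).fetchone()
    return row[0] if row else None

def has_marker(conn, name):
    """Choice 2: test whether an explicitly named marker object exists."""
    row = conn.execute(
        "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = ?", (name,)
    ).fetchone()
    return row is not None

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE schema_version (component TEXT PRIMARY KEY, version INTEGER)")
conn.executemany(
    "INSERT INTO schema_version VALUES (?, ?)", [("core", 7), ("reporting", 3)]
)
conn.execute("CREATE TABLE schema_v7 (applied_on TEXT)")  # empty marker table
conn.commit()

core_version = version_from_table(conn, "core")
reporting_version = version_from_table(conn, "reporting")
is_v7 = has_marker(conn, "schema_v7")
is_v8 = has_marker(conn, "schema_v8")
```

The table approach lets the application compare the stored number with the version it was built against, while the marker approach only answers yes or no for one specific version.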
When you release to production, do you have a rollback plan in the event of unexpected catastrophe? If you do, is it the application of a schema rollback script? Use your rollback script to rollback the database to a previous code version.
You should be able to create your database from scratch into a known state.
While being able to do so is helpful (especially in the early stages of a new project), many (most?) databases will quickly become far too large for that to be possible. Also, if you have any BLOBs then you're going to have problems generating SQL scripts for your entire database.
I've definitely been interested in some sort of DB versioning system, but I haven't found anything yet. So, instead of a solution, you'll get my vote. :-P
You really do want to be able to take a clean machine, get the latest version from source control, build in one step, and run all tests in one step. Making this fast makes you produce good software faster.
Just like external libraries, database configuration must also be in source control.
Note that I'm not saying that all your live database content should be in the same source control, just enough to get to a clean state. (Do back up your database content, though!)
Define your schema objects and your reference data in version-controlled text files. For example, you can define the schema in Torque format, and the data in DBUnit format (both use XML). You can then use tools (we wrote our own) to generate the DDL and DML that take you from one version of your app to another. Our tool can take as input either (a) the previous version's schema & data XML files or (b) an existing database, so you are always able to get a database of any state into the correct state.
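As a rough idea of what such a generator does, here is a toy version that diffs two schema descriptions and emits DDL. It is not the tool described above: plain Python dictionaries stand in for the Torque XML files, all table and column names are invented, and it only handles new tables, new columns and dropped tables (no type changes, renames or data).

```python
import sqlite3

def diff_schema(old, new):
    """Return DDL statements that take a database from `old` to `new`."""
    statements = []
    for table in sorted(new):
        if table not in old:
            cols = ", ".join(f"{col} {ctype}" for col, ctype in new[table].items())
            statements.append(f"CREATE TABLE {table} ({cols})")
            continue
        for col, ctype in new[table].items():
            if col not in old[table]:
                statements.append(f"ALTER TABLE {table} ADD COLUMN {col} {ctype}")
    for table in sorted(old):
        if table not in new:
            statements.append(f"DROP TABLE {table}")
    return statements

# Hypothetical schema definitions for two versions of an application.
V1 = {
    "author": {"id": "INTEGER PRIMARY KEY", "name": "TEXT"},
    "legacy_log": {"id": "INTEGER PRIMARY KEY"},
}
V2 = {
    "author": {"id": "INTEGER PRIMARY KEY", "name": "TEXT", "email": "TEXT"},
    "book": {"id": "INTEGER PRIMARY KEY", "title": "TEXT"},
}

upgrade = diff_schema(V1, V2)

# Apply the generated statements to a database that is currently at V1.
conn = sqlite3.connect(":memory:")
for statement in diff_schema({}, V1) + upgrade:
    conn.execute(statement)
tables = sorted(r[0] for r in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'"))
```

Diffing against an empty schema (`diff_schema({}, V1)`) doubles as the "create from scratch" path, which is what lets one definition serve both new installs and upgrades.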
I like the way that Django does it. You build models and then when you run syncdb it applies the models that you have created. If you add a model you just need to run syncdb again. It would be easy to have your build script do this every time you make a push.
The problem comes when you need to alter a table that is already made. I do not think that syncdb handles that. That would require you to go in and manually add the table and also add a property to the model. You would probably want to version that alter statement. The models would always be under version control though, so if you needed to you could get a db schema up and running on a new box without running the sql scripts. Another problem with this is keeping track of static data that you always want in the db.
Rails migration scripts are pretty nice too.
A DB versioning system would be great, but I don't really know of such a thing.
"While being able to do so is helpful (especially in the early stages of a new project), many (most?) databases will quickly become far too large for that to be possible. Also, if you have any BLOBs then you're going to have problems generating SQL scripts for your entire database."
Backups and compression can help you there. Sorry - there's no excuse not to be able to get a good set of data to develop against. Even if it's just a subset.
Put your database development under version control. I recommend having a look at neXtep designer: http://www.nextep-softwares.com/wiki
It is a free GPL product which offers a brand new approach to database development and deployment by connecting version information with a SQL generation engine which could automatically compute any upgrade script you need to upgrade any version of your database into another. Any existing database could be version controlled by a reverse synchronization.
It currently supports Oracle, MySql and PostgreSql. DB2 support is under development. It is a full-featured database development environment where you always work on version-controlled elements from a repository. You can publish your updates by simple synchronization during development and you can generate exportable database deliveries which you will be able to execute on any targeted database through a standalone installer which validates the versions, performs structural checks and applies the upgrade scripts.
The IDE also offers you SQL editors, dependency management, support for modular database model components, data model diagrams, SQL clients and much more.
All the documentation and concepts could be found in the wiki.

Deploying SQL Server Databases from Test to Live

I wonder how you guys manage deployment of a database between 2 SQL Servers, specifically SQL Server 2005.
Right now there is a development server and a live one. As this should be part of a build script (a standard Windows batch file, even though, given the current complexity of those scripts, I might switch to PowerShell or similar later), Enterprise Manager/Management Studio Express do not count.
Would you just copy the .mdf file and attach it? I am always a bit careful when working with binary data, as this seems to be a compatibility issue (even though development and live should run the same version of the server at all times).
Or - given the lack of "EXPLAIN CREATE TABLE" in T-SQL - do you do something that exports an existing database into SQL-Scripts which you can run on the target server? If yes, is there a tool that can automatically dump a given Database into SQL Queries and that runs off the command line? (Again, Enterprise Manager/Management Studio Express do not count).
And lastly - given the fact that the live database already contains data, the deployment may not involve creating all tables but rather checking the difference in structure and ALTER TABLE the live ones instead, which may also need data verification/conversion when existing fields change.
Now, I hear a lot of great stuff about the Red Gate products, but for hobby projects, the price is a bit steep.
So, what are you using to automatically deploy SQL Server Databases from Test to Live?
I've taken to hand-coding all of my DDL (creates/alter/delete) statements, adding them to my .sln as text files, and using normal versioning (using subversion, but any revision control should work). This way, I not only get the benefit of versioning, but updating live from dev/stage is the same process for code and database - tags, branches and so on work all the same.
Otherwise, I agree redgate is expensive if you don't have a company buying it for you. If you can get a company to buy it for you though, it really is worth it!
For my projects I alternate between SQL Compare from Red Gate and the Database Publishing Wizard from Microsoft, which you can download for free.
The Wizard isn't as slick as SQL Compare or SQL Data Compare but it does the trick. One issue is that the scripts it generates may need some rearranging and/or editing to flow in one shot.
On the up side, it can move your schema and data which isn't bad for a free tool.
Don't forget Microsoft's solution to the problem: Visual Studio 2008 Database Edition. Includes tools for deploying changes to databases, producing a diff between databases for schema and/or data changes, unit tests, test data generation.
It's pretty expensive but I used the trial edition for a while and thought it was brilliant. It makes the database as easy to work with as any other piece of code.
Like Rob Allen, I use SQL Compare / Data Compare by Redgate. I also use the Database publishing wizard by Microsoft. I also have a console app I wrote in C# that takes a sql script and runs it on a server. This way you can run large scripts with 'GO' commands in it from a command line or in a batch script.
I use Microsoft.SqlServer.BatchParser.dll and Microsoft.SqlServer.ConnectionInfo.dll libraries in the console application.
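For anyone wondering why a script with 'GO' needs special handling: GO is a batch separator understood by the client tools, not a T-SQL statement, so a runner has to split the script and send each batch separately. The sketch below shows the splitting idea only. It is not the C# app described above and does not use the Microsoft libraries; it is Python with sqlite3 as a stand-in server, and it deliberately ignores `GO <count>` and GO lines inside comments or string literals.

```python
import re
import sqlite3

# A line containing only GO (any case, optional surrounding whitespace).
GO_LINE = re.compile(r"^\s*GO\s*$", re.IGNORECASE)

def split_batches(script):
    """Split a script into batches at lines that contain only GO."""
    batches, current = [], []
    for line in script.splitlines():
        if GO_LINE.match(line):
            text = "\n".join(current).strip()
            if text:
                batches.append(text)
            current = []
        else:
            current.append(line)
    text = "\n".join(current).strip()
    if text:  # trailing batch without a final GO
        batches.append(text)
    return batches

SCRIPT = """
CREATE TABLE t (id INTEGER PRIMARY KEY, label TEXT)
GO
INSERT INTO t (label) VALUES ('a')
go
INSERT INTO t (label) VALUES ('b')
GO
"""

batches = split_batches(SCRIPT)
conn = sqlite3.connect(":memory:")
for batch in batches:
    conn.execute(batch)  # each batch here is a single statement
conn.commit()
row_count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
```

A production runner would also want to report which batch failed and stop on the first error.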
I work the same way Karl does, by keeping all of my SQL scripts for creating and altering tables in a text file that I keep in source control. In fact, to avoid the problem of having to have a script examine the live database to determine what ALTERs to run, I usually work like this:
On the first version, I place everything during testing into one SQL script, and treat all tables as a CREATE. This means I end up dropping and readding tables a lot during testing, but that's not a big deal early into the project (since I'm usually hacking the data I'm using at that point anyway).
On all subsequent versions, I do two things: I make a new text file to hold the upgrade SQL scripts, which contain just the ALTERs for that version. And I make the same changes to the original create-a-fresh-database script as well. This way an upgrade just runs the upgrade script, but if we have to recreate the DB we don't need to run 100 scripts to get there.
Depending on how I'm deploying the DB changes, I'll also usually put a version table in the DB that holds the version of the DB. Then, rather than make any human decisions about which scripts to run, whatever code I have running the create/upgrade scripts uses the version to determine what to run.
The one thing this will not do is help if part of what you're moving from test to production is data, but if you want to manage structure and not pay for a nice but expensive DB management package, it is really not very difficult. I've also found it's a pretty good way of keeping mental track of your DB.
If you have a company buying it, Toad from Quest Software has this kind of management functionality built in. It's basically a two-click operation to compare two schemas and generate a sync script from one to the other.
They have editions for most of the popular databases, including of course Sql Server.
I agree that scripting everything is the best way to go and is what I advocate at work. You should script everything from DB and object creation to populating your lookup tables.
Anything you do in the UI only won't translate (especially for changes... not so much for first deployments) and will end up requiring a tool like what Redgate offers.
Using SMO/DMO, it isn't too difficult to generate a script of your schema. Data is a little more fun, but still doable.
In general, I take the "Script It" approach, but you might want to consider something along these lines:
Distinguish between Development and Staging, such that you can develop with a subset of data ... for this I would create a tool to simply pull down some production data, or generate fake data where security is a concern.
For team development, each change to the database will have to be coordinated amongst your team members. Schema and data changes can be intermingled, but a single script should enable a given feature. Once all your features are ready, you bundle these up in a single SQL file and run that against a restore of production.
Once your staging has cleared acceptance, you run the single SQL file again on the production machine.
I have used the Red Gate tools and they are great tools, but if you can't afford it, building the tools and working this way isn't too far from the ideal.
I'm using Subsonic's migrations mechanism, so I just have a dll with classes in sequential order that have 2 methods, up and down. There is a continuous integration/build script hook into nant, so that I can automate the upgrading of my database.
It's not the best thing in the world, but it beats writing DDL.
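The up/down pattern itself is simple enough to sketch. The code below is not Subsonic's API; it is a generic Python illustration with sqlite3 as the database, and the class names, the `MIGRATIONS` list and the `migrate` helper are all invented for the example.

```python
import sqlite3

class AddCustomerTable:
    def up(self, conn):
        conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

    def down(self, conn):
        conn.execute("DROP TABLE customer")

class AddCustomerNameIndex:
    def up(self, conn):
        conn.execute("CREATE INDEX ix_customer_name ON customer (name)")

    def down(self, conn):
        conn.execute("DROP INDEX ix_customer_name")

# Position in the list defines the order: entry N takes the schema to version N + 1.
MIGRATIONS = [AddCustomerTable(), AddCustomerNameIndex()]

def migrate(conn, current, target):
    """Move the schema from version `current` to version `target`."""
    if target > current:
        for step in MIGRATIONS[current:target]:
            step.up(conn)
    else:
        # Downgrades run the down methods in reverse order.
        for step in reversed(MIGRATIONS[target:current]):
            step.down(conn)
    conn.commit()
    return target

def objects(conn):
    """Names of all user-created tables and indexes."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE name NOT LIKE 'sqlite_%' ORDER BY name")
    return [r[0] for r in rows]

conn = sqlite3.connect(":memory:")
version = migrate(conn, 0, 2)
after_up = objects(conn)
version = migrate(conn, version, 1)
after_down = objects(conn)
```

A build hook then only needs to know the current and target version numbers, which is what makes this style easy to drive from a CI script.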
RedGate SqlCompare is a way to go in my opinion. We do DB deployment on a regular basis and since I started using that tool I have never looked back.
Very intuitive interface and saves a lot of time in the end.
The Pro version will take care of scripting for the source control integration as well.
I also maintain scripts for all my objects and data. For deploying I wrote this free utility - http://www.sqldart.com. It'll let you reorder your script files and will run the whole lot within a transaction.
I agree with keeping everything in source control and manually scripting all changes. Changes to the schema for a single release go into a script file created specifically for that release. All stored procs, views, etc. should go into individual files and be treated just like .cs or .aspx files as far as source control goes. I use a PowerShell script to generate one big .sql file for updating the programmability stuff.
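The bundling step is nothing fancy. Here is a rough equivalent of that kind of script, written in Python rather than PowerShell; the folder layout, file names and the `bundle_sql` function are made up for the example, and the demo builds its input in a temporary directory so it can run anywhere.

```python
import tempfile
from pathlib import Path

def bundle_sql(source_dir, output_file):
    """Concatenate every .sql file under source_dir into one script."""
    source = Path(source_dir)
    files = sorted(source.rglob("*.sql"))  # stable, path-sorted order
    parts = []
    for path in files:
        body = path.read_text(encoding="utf-8").strip()
        # Label each section with its source file and end it with a GO separator.
        parts.append(f"-- {path.relative_to(source).as_posix()}\n{body}\nGO\n")
    Path(output_file).write_text("\n".join(parts), encoding="utf-8")
    return [path.relative_to(source).as_posix() for path in files]

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "programmability"
    (src / "views").mkdir(parents=True)
    (src / "procs").mkdir()
    (src / "views" / "active_customers.sql").write_text(
        "CREATE VIEW active_customers AS SELECT 1 AS id", encoding="utf-8")
    (src / "procs" / "get_customer.sql").write_text(
        "CREATE PROCEDURE get_customer AS SELECT 1", encoding="utf-8")
    out = Path(tmp) / "release.sql"  # kept outside src so it is never re-bundled
    bundled = bundle_sql(src, out)
    combined = out.read_text(encoding="utf-8")
```

If objects depend on each other (a view that calls a function, say), plain alphabetical order may not be enough and the file names or folders need to encode the order.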
I don't like automating the application of schema changes, like new tables, new columns, etc. When doing a production release, I like to go through the change script command by command to make sure each one works as expected. There's nothing worse than running a big change script on production and getting errors because you forgot some little detail that didn't present itself in development.
I have also learned that indexes need to be treated just like code files and put into source control.
And you should definitely have more than 2 databases - dev and live. You should have a dev database that everybody uses for daily dev tasks. Then a staging database that mimics production and is used to do your integration testing. Then maybe a complete recent copy of production (restored from a full backup), if that is feasible, so your last round of installation testing goes against something that is as close to the real thing as possible.
I do all my database creation as DDL and then wrap that DDL into a schema maintenance class. I may do various things to create the DDL in the first place but fundamentally I do all the schema maintenance in code. This also means that if one needs to do non-DDL things that don't map well to SQL you can write procedural logic and run it between lumps of DDL/DML.
My dbs then have a table which defines the current version so one can code a relatively straightforward set of tests:
Does the DB exist? If not create it.
Is the DB the current version? If not then run the methods, in sequence, that bring the schema up to date (you may want to prompt the user to confirm and - ideally - do backups at this point).
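Those two checks might look roughly like this in code. This is only a sketch of the idea, not the actual class: it is written in Python against sqlite3, the `schema_info` table, the `person` table and the step names are invented, and the prompting and backup steps are left out.

```python
import sqlite3

class SchemaMaintainer:
    """Brings a database up to date by running version steps in sequence."""

    def __init__(self, conn):
        self.conn = conn
        # Step N (1-based) takes the schema from version N - 1 to version N.
        self.steps = [self._v1_create_person, self._v2_split_name]

    def _exists(self):
        row = self.conn.execute(
            "SELECT 1 FROM sqlite_master WHERE type = 'table' AND name = 'schema_info'"
        ).fetchone()
        return row is not None

    def _version(self):
        return self.conn.execute("SELECT version FROM schema_info").fetchone()[0]

    def _v1_create_person(self):
        self.conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT)")

    def _v2_split_name(self):
        # DDL, then procedural logic that would be awkward in plain SQL.
        self.conn.execute("ALTER TABLE person ADD COLUMN first_name TEXT")
        self.conn.execute("ALTER TABLE person ADD COLUMN last_name TEXT")
        rows = self.conn.execute("SELECT id, full_name FROM person").fetchall()
        for person_id, full_name in rows:
            first, _, last = (full_name or "").partition(" ")
            self.conn.execute(
                "UPDATE person SET first_name = ?, last_name = ? WHERE id = ?",
                (first, last, person_id))

    def ensure_current(self):
        if not self._exists():  # "Does the DB exist? If not create it."
            self.conn.execute("CREATE TABLE schema_info (version INTEGER NOT NULL)")
            self.conn.execute("INSERT INTO schema_info (version) VALUES (0)")
        for number, step in enumerate(self.steps, start=1):
            if number > self._version():  # "Is the DB the current version?"
                step()
                self.conn.execute("UPDATE schema_info SET version = ?", (number,))
                self.conn.commit()
        return self._version()

conn = sqlite3.connect(":memory:")
old_app = SchemaMaintainer(conn)
old_app.steps = old_app.steps[:1]  # simulate an older release that only knew v1
v_old = old_app.ensure_current()
conn.execute("INSERT INTO person (full_name) VALUES ('Ada Lovelace')")
conn.commit()
v_new = SchemaMaintainer(conn).ensure_current()  # newer release upgrades in place
names = conn.execute("SELECT first_name, last_name FROM person").fetchone()
```

The `_v2_split_name` step is the kind of thing meant by running procedural logic between lumps of DDL/DML.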
For a single user app I just run this in place; for a web app we currently lock the user out if the versions don't match and have a stand-alone schema maintenance app we run. For multi-user it will depend on the particular environment.
The advantage? Well I have a very high level of confidence that the schema for the apps that use this methodology is consistent across all instances of those applications. It's not perfect, there are issues, but it works...
There are some issues when developing in a team environment but that's more or less a given anyway!
Murph
I'm currently working on the same thing as you: not only deploying SQL Server databases from test to live, but the whole process from Local -> Integration -> Test -> Production. What makes this easy day to day is a NAnt task with Red Gate SQL Compare. I don't work for Red Gate, but I have to say it is a good choice.

Resources