Comparing database structures: how to create an SQL patch? - sql-server

I have two SQL Server (2000) databases. Both are used for the same project, but different versions. Basically, the old database is from our TEST environment. The new database is from the DEVELOPMENT environment. We also have an ACCEPTANCE, PRODUCTION and MAINTENANCE environment and they all contain the same project. (It's our development street, moving versions from D to T to A to P and finally M.)
Now, the development database structure has changed. A few tables have been added, indices are added or removed, fields have changed in type and nullable fields have become non-nullable, things like that. The test database needs to be upgraded with the new structure but without any loss of data. Right now, I'm doing this with a lot of manual labor. I keep a list of structural changes and once everything is ready, I write an update script to patch the old test database.
But as a software engineer, I'm just lazy by design. So, is there some easy tool somewhere which will compare the two database structures and generate an update script by itself?
(Only to change the structure, btw. No data manipulation!)

Yes, the best one in my opinion is by Red Gate.
The SQL Toolbelt they offer, which bundles several tools, is very good, or you can just get SQL Compare on its own. It's not free (except for a 14-day trial, I think), but well worth the money.

Take a look at Red Gate's SQL Compare... a complete life saver.

At work we've used a few tools; AdeptSql Diff and we were trialing Redgate-something-something.
AdeptSql was significantly cheaper than Redgate, and while Redgate had a lot of very nice bells and whistles, since we already owned Adept we decided to stay with it.
http://www.adeptsql.com/
I'm sorry that I don't know any free tools offhand; I know that Redgate at least has a trial period.

I have used a *nix command-line tool called sqlt-diff, which uses the Perl module SQL::Translator (SQLFairy) to generate diffs between SQL database schemas.
https://metacpan.org/pod/distribution/SQL-Translator/script/sqlt-diff
This is a free open-source tool. I did manually edit the diffs generated to customize them slightly.
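The kind of structural comparison these tools perform can be roughed out in a few lines. This is only a minimal sketch, not how SQL Compare, AdeptSQL Diff or sqlt-diff actually work: it assumes SQLite (via Python's standard `sqlite3` module) as a stand-in for SQL Server so that it runs anywhere, the function names `table_columns` and `diff_schema` are made up for illustration, and it only detects added tables and added columns. Dropped objects, type changes, nullability and indexes are ignored, and the emitted DDL is in SQLite's dialect.

```python
import sqlite3


def table_columns(conn):
    """Return {table: {column: declared_type}} for every user table."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%'")]
    schema = {}
    for table in tables:
        # PRAGMA table_info rows: (cid, name, type, notnull, dflt_value, pk)
        info = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        schema[table] = {row[1]: row[2] for row in info}
    return schema


def diff_schema(old, new):
    """Return DDL statements adding what `new` has and `old` lacks."""
    statements = []
    for table, cols in sorted(new.items()):
        if table not in old:
            col_defs = ", ".join(f"{c} {t}" for c, t in cols.items())
            statements.append(f"CREATE TABLE {table} ({col_defs});")
            continue
        for col, ctype in cols.items():
            if col not in old[table]:
                statements.append(
                    f"ALTER TABLE {table} ADD COLUMN {col} {ctype};")
    return statements
```

Calling `diff_schema(table_columns(test_conn), table_columns(dev_conn))` yields a patch script for the test database. As the answer above notes, expect to edit the generated statements by hand.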

Related

Managing DB migration: scripts vs tools

Our project has about 20 developers, but our application makes relatively light use of databases. We have a collection of about 5 databases, all of which are very small and would have less than 20 tables each, none of which have millions of rows or anything large.
We have two options on the table for how to manage the evolution of the databases over time:
Some kind of tool. Currently we're using Visual Studio database projects, which contain the current definition of the schema, and look at a reference database to generate a diff script. We then use this diff script to bring the reference database up to date.
Use version scripts to build the database from a baseline. The scripts are manually placed in source control. Any data migration to move data from old columns/tables to new would be part of these scripts. There would be a version recorded in the DB somewhere and upgrading would run all scripts between DB version and the current version.
The second option seems to be widely used and I have found an in-depth discussion here: http://odetocode.com/blogs/scott/archive/2008/01/31/versioning-databases-the-baseline.aspx
The problem with what we've got at the moment is that we don't have access to our Production databases. This means that to create a release package, we have to restore a backup of Production into another location, generate a diff against that reference DB and give the script to the production DB team. So our release to production is different from our release to other environments.
This makes the idea of running versioned scripts appealing because we use the same scripts in all environments, and there's no ad-hoc work in deployment (eg manual restore of prod to reference DB). But given that we have such a small scale DB situation, I feel like we can hardly be a difficult case for the DB tools out there. What we want is something as simple as possible which is easy to understand.
Do the tools such as RedGate's suite make sense for this kind of scenario, or should we go with versioned scripts? Cost isn't so much of an issue, it's more about creating a Pit of Success where maintaining and deploying the DB is as basic and automated as possible.
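For comparison, the versioned-scripts option described above needs very little machinery. The following is a minimal sketch under stated assumptions: SQLite (Python's standard `sqlite3` module) stands in for SQL Server, the `schema_version` table name and the contents of `MIGRATIONS` are hypothetical, and in a real project each script would live in its own file in source control rather than in a dictionary.

```python
import sqlite3

# Hypothetical upgrade scripts, keyed by the version they produce.
MIGRATIONS = {
    1: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);",
    2: "ALTER TABLE customer ADD COLUMN email TEXT;",
    3: "CREATE INDEX ix_customer_email ON customer (email);",
}


def current_version(conn):
    """Read the recorded version; a database without history is version 0."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS schema_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT MAX(version) FROM schema_version").fetchone()
    return row[0] or 0


def upgrade(conn, target=None):
    """Run every script between the DB version and `target`, in order."""
    target = max(MIGRATIONS) if target is None else target
    applied = []
    for version in range(current_version(conn) + 1, target + 1):
        conn.execute(MIGRATIONS[version])
        # Record the version only after its script has succeeded.
        conn.execute(
            "INSERT INTO schema_version (version) VALUES (?)", (version,))
        conn.commit()
        applied.append(version)
    return applied
```

Because the same `upgrade` call works against any environment regardless of its starting version, nothing has to be restored from Production to build a release package.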
I'm the product manager at Red Gate for SQL Compare, which generates diff scripts between two databases. I'd like you to take a look at our SQL Source Control tool, which will allow you to track schema changes as and when they're made in development. When it comes to deployment, if you know which schema version is in production, you can generate a deployment script from your source controlled versions. Of course you should always be testing this out in a staging environment before running on production.
Scott's article makes an excellent point in regards to migration script, and Denis alludes to more complex changes that can't realistically be second guessed by comparison tools, and would therefore require custom migration scripts to be managed and used appropriately. The next version of SQL Compare in conjunction with SQL Source Control will therefore manage both your schema versions and your migration scripts, allowing you to get the best of both worlds. If you'd like to see early screenshots of this, please email me at David dot Atkinson at red-gate dot com. I'd really love to discuss your requirements so we can better design the tool.
In my experience there always is more to it than mere schema changes. If you split a column in two, or shift a column to a separate table, or other such things, you need to migrate both the schema and the data.
No tool or script will allow you to migrate the actual data automatically. At the very most you'll get a diff for the schema which your devs may find useful as a reminder/check list for DB version migration scripts (sequences of create/alter/drop and insert/update/delete done in a single transaction).
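To make the column-split case concrete, here is a minimal sketch of a migration that changes schema and data in a single transaction. It assumes SQLite through Python's `sqlite3` module rather than SQL Server, a hypothetical `customer` table with a `full_name` column, and a connection opened with `isolation_level=None` so that the explicit `BEGIN`/`COMMIT` statements are the only transaction control. The old column is left in place; dropping it would be a later step.

```python
import sqlite3


def split_full_name(conn):
    """Split customer.full_name into first_name and last_name atomically."""
    conn.execute("BEGIN")
    try:
        # Schema change.
        conn.execute("ALTER TABLE customer ADD COLUMN first_name TEXT")
        conn.execute("ALTER TABLE customer ADD COLUMN last_name TEXT")
        # Data migration that a schema diff tool cannot guess.
        rows = conn.execute("SELECT id, full_name FROM customer").fetchall()
        for row_id, full_name in rows:
            first, _, last = full_name.partition(" ")
            conn.execute(
                "UPDATE customer SET first_name = ?, last_name = ? "
                "WHERE id = ?", (first, last, row_id))
        conn.execute("COMMIT")
    except Exception:
        # Any failure undoes the new columns and the partial updates.
        conn.execute("ROLLBACK")
        raise
```

The rule for splitting the name (first space wins) is exactly the kind of decision a comparison tool has no way to infer, which is why it has to be written by hand.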

How do you deal with multiple developers and database changes?

I would like to know how you guys deal with development database changes in groups of 2 or more devs. Do you have a global DB everyone accesses, or maybe a local copy each, with script changes applied manually? It would be nice to see pros and cons that you've noticed for each approach and the number of devs in your team.
Start with "Evolutionary Database Design" by Martin Fowler. This sums it up nicely.
There have been other questions about DB development that may be useful too, for example "Is RedGate SQL Source Control for me?"
Our approach is that everyone has their own DB, the complete DB can be created from create scripts with base data if required. All the scripts required for this are in source control.
All scripts are CREATE scripts and they reflect the current state of the database schema. Upgrades are in separate SQL files which can upgrade existing DBs from a specific version to a newer one (run sequentially). After all the updates have been applied, the schema must be identical to what you would get from running the setup scripts.
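The invariant in the last sentence (upgrade path and create scripts must end at the same schema) is easy to check automatically. Below is a minimal sketch, not the bsn ModuleStore implementation: it assumes SQLite via Python's `sqlite3` module, and `CREATE_SCRIPT`, `V1_SCRIPT` and `UPGRADES` are hypothetical stand-ins for the scripts kept in source control.

```python
import sqlite3

# Hypothetical scripts: current create script, old baseline, and upgrades.
CREATE_SCRIPT = """
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT);
CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);
"""
V1_SCRIPT = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);"
UPGRADES = [
    "ALTER TABLE customer ADD COLUMN email TEXT;",
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER);",
]


def build(*scripts):
    """Create a scratch in-memory database by running scripts in order."""
    conn = sqlite3.connect(":memory:")
    for script in scripts:
        conn.executescript(script)
    return conn


def schema_snapshot(conn):
    """Return {table: sorted column facts}, independent of column order."""
    snapshot = {}
    tables = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
    for (table,) in tables:
        cols = conn.execute(f'PRAGMA table_info("{table}")').fetchall()
        snapshot[table] = sorted(
            (name, ctype, notnull, pk)
            for _, name, ctype, notnull, _, pk in cols)
    return snapshot


def upgrade_matches_create():
    """True if baseline plus upgrades equals a fresh create."""
    fresh = build(CREATE_SCRIPT)
    upgraded = build(V1_SCRIPT, *UPGRADES)
    return schema_snapshot(fresh) == schema_snapshot(upgraded)
```

Run as part of the build, a check like this catches the case where someone edits the create script but forgets the matching upgrade script, or the reverse.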
We have some tools to do this (we use SQL Server and .NET):
Scripting is done with a tool which also applies a standard formatting so that the changes are well traceable with text diff tools (and by the SCM)
A runtime module takes care of comparing the existing DB objects, running updates if required, automatically applying "non-destructive" changes, and then checking the DB objects again to ensure a correct migration before committing the changes
The toolset is available as open-source project (licensed under LGPL), it's called the bsn ModuleStore (note that it is limited to SQL Server 2005/2008/Azure and to .NET for the runtime part).
We use what was code-named "Data Dude" - the database features in TFS and Visual Studio - to deal with this. When you "get latest" and bring in code that relies on a schema change, you also bring in the revised schemas, stored procedures etc. You right-click the database project and Deploy; that gets your local schema and stored procedures in sync but doesn't overwrite your data. The job of working out the script to get you from your old schema to the new one falls to Visual Studio, not to you or your DBA. We also have "populate" scripts for things like lists of provinces, and a deploy runs them for you.
So much better than the old way which always fell apart at high stress times, with people checking in code then going home and nobody knowing what columns to add to make the code work etc.

What is the State of the Art for deploying database updates to production databases?

Every shop at which I've worked has had their own cobbled-together, haphazard, poorly understood and poorly maintained method for updating production databases.
I've never seen a consistent method for doing this.
So, in the most recent versions of SQL Server, what is the best practice for updating schema changes and migrating data from a development or test server to a production server?
Is there a 3rd party tool which handles this painlessly?
I'd imagine the ultimate tool would be able to
detect schema changes between two DBs and generate DDL to update one to the other.
include the ability to have custom code which performs custom data migration steps
allow versioning so a v1 db could be updated all the way to a v99 database, running all scripts and migration steps in order.
The three things I've used are:
For schemas
Visual Studio Database Projects. Meh. They are okay but you still have to do a lot of the work yourself.
Red Gate's SQL Compare and the entire SQL Toolbelt. They've worked pretty hard to make this something you can version control. In practice I've found with databases you are usually trying to get from point A in the version timeline to point B. With binaries, you often just clobber whatever is there with point B (an oversimplification I know, but often true).
http://www.red-gate.com/
xSQL is a good place to start if your system is small and perhaps will remain small:
http://www.xsqlsoftware.com/LiteEdition.aspx
I don't work for or know anyone who works for or get any money from these people. Just telling you what I've done in the past.
For data
Red Gate has SQL Data Compare.
However, if you want something "free" (or included with SQL Server)
I've actually had a lot of success just using BCP and writing a small system that injects and extracts data. Generally when I find myself doing this I ask myself, "Why? If I am changing data, does that mean I am really changing something that is configuration? Can I use a different method here?" But sometimes you can't (maybe it's a legacy system where the original devs thought databases are for everything).
The problem with BCP extracts is they don't version control very well. There are tricks I've used like extracting in character mode and stuffing an order by in the extract query to try and pull rows out in an order that makes them somewhat more palatable for version control.
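The ordering trick can be illustrated without BCP itself. This is a minimal sketch that assumes SQLite through Python's `sqlite3` module and a hypothetical helper `export_table`; the point it shows is only that a character-mode extract with a fixed `ORDER BY` produces the same text for the same data, so version control diffs show real changes rather than row shuffling.

```python
import csv
import io
import sqlite3


def export_table(conn, table, key):
    """Dump a table as tab-separated text, ordered by `key`."""
    # Identifiers are interpolated directly, so pass trusted names only.
    cursor = conn.execute(f'SELECT * FROM "{table}" ORDER BY "{key}"')
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    # Header row with the column names, then the rows in key order.
    writer.writerow([col[0] for col in cursor.description])
    writer.writerows(cursor)
    return out.getvalue()
```

Writing the result to one file per table and committing those files gives a readable history of lookup data, within the limits the answer describes.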
For small projects I have used RedGate to manage schema and data migrations with a lot of success. It is very easy to use and works for most cases.
For larger enterprise systems, for schema and data changes you normally save all the SQL scripts as text files and run them. We also include a rollback script to run in case something goes wrong during the migration. Run this on the UAT server, then the test/staging/pre-prod server, then on Production. Saving a copy of all these files plus their rollback scripts should allow you to move between multiple versions of a DB.
There is also http://code.google.com/p/migratordotnet/ if you're using .NET; it allows you to define these scripts in code. Very useful if you want to deploy across multiple DBs in an automated way. It makes it easy to say "set my DB to version 23" or "revert my DB to version 5", etc. It works for schema and data, but I would only really use it for a few lines of data.
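The "set my DB to version N" idea, with a rollback step paired to every upgrade step, can be sketched as follows. This is not migratordotnet's API: it is a minimal illustration in Python that assumes SQLite (`sqlite3`), stores the version in SQLite's built-in `PRAGMA user_version` counter, and uses a hypothetical `MIGRATIONS` table of up/down statements.

```python
import sqlite3

# Hypothetical migrations: version -> (upgrade SQL, rollback SQL).
MIGRATIONS = {
    1: ("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
        "DROP TABLE customer"),
    2: ("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)",
        "DROP TABLE orders"),
    3: ("CREATE INDEX ix_orders_customer ON orders (customer_id)",
        "DROP INDEX ix_orders_customer"),
}


def migrate_to(conn, target):
    """Move the schema up or down until it is at version `target`."""
    current = conn.execute("PRAGMA user_version").fetchone()[0]
    while current < target:
        current += 1
        conn.execute(MIGRATIONS[current][0])  # apply the upgrade step
        conn.execute(f"PRAGMA user_version = {current}")
    while current > target:
        conn.execute(MIGRATIONS[current][1])  # apply the rollback step
        current -= 1
        conn.execute(f"PRAGMA user_version = {current}")
    return current
```

Note that the rollback steps here simply drop objects, which destroys any data in them; that is one reason to keep this style for schema and only small amounts of data.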
First you have to think that the requirements between scenarios vary a lot:
Customers purchase v1 of the product at Costco and install it in their home office or small business. When v2 comes out, the customer purchases a box of the product and installs it on a new computer. It exports the data from the v1 installation and imports it into the v2 installation. Even though behind the scenes both v1 and v2 use a SQL Express instance, there is no supported upgrade. Schema changes on the deployed databases are not expected (hidden database, non-technical user) and definitely not supported. The only 'upgrade' path supported is an explicit export/import, which probably uses an XML file or something similar.
A business purchases v1 of the product with a support contract. It installs it on its department SQL Server instance, from where the data is accessed by the purchased product and by many more integration services, reports etc. When v2 is released, the customer runs the prescribed upgrade procedure, if it runs into problems it calls the product vendor customer support line which walks the customer through some specific steps for his deployment. Database schema customizations are expected and often supported, including upgrade scenarios, but the schema changes are done by the customer (not known at v2 design time).
A web startup has a database that backs the site. Developers make changes on their personal instances and check in changes. An automated build deployment with continuous integration picks up the changes, deploys them against a test instance, and runs build validation tests. The main branch build can be, at any moment, deployed into production. Production is the one database that backs the site. The structure of the production database is documented and understood 100%; every single change to the production database schema occurs through the build system and QA process. On a side note, this is the scenario most SO users who ask your question have in mind, minus the part about '100% documented and understood'. I give the example of a database backing a web site, but the deployment can really be anything. The gist of it is that there is only one production database (it may include HA/DR copies, and it may consist of multiple actual SQL Server databases), and it is the only database that has to be upgraded.
A successful web startup. Same as above, but the production database has 5TB of data and 5 minutes of downtime make the CNN headlines. Schema changes may involve setting up replicas and copying data into new schemas with continuous updates, followed by an online switch of operations to the replica. Schema changes are designed by MCM experts, and deploying a schema change can be a multi-week process.
I can go on with more scenarios. The point is that the requirements of each of these cases are so vastly different that no 'state of the art' can answer all of them. Some scenarios will be perfectly OK with a schema diff deployment tool like vsdbcmd or SQL Compare. Other scenarios will be much better served by explicit versioning scripts. Others might have such specific requirements (e.g. zero downtime) that each upgrade is a project on its own and has to be specifically custom-tailored.
One thing is clear though across all scenarios: if your shop treats the development database MDF file as 'source' and makes changes to it using the management tools, that is always a major #fail. All changes should be captured explicitly as some sort of source control artifact, and this is why I most favor the explicit version scripts, as in Version Control and your Database. But I reckon that the VSDB project support for compile-time schema validation and its ease of refactoring schema objects make a pretty powerful proposition, and VSDB schema compare deployment may be OK.
Another important approach that has to be addressed is code-first schema modeling from tools like EF or LinqToSql. It works brilliantly to deploy v1, but fails miserably at any subsequent version. I strongly discourage these approaches.
But to sum up and answer in brief: as of today, the state of the art sucks.
At Red Gate we'd recommend one of two approaches depending on your requirements and how formal you need your processes to be. If you have a development database and simply want to push changes to production, SQL Compare is the tool for the job. A level of versioning can be achieved by using the schema snapshots.
However, if you want full source control benefits, such as team collaboration, sandboxed environments, audit trail, compliance, history, rollback, etc., you should consider SQL Source Control. This links development databases to Team Foundation Server or Subversion.

SQL Server diff tool

Working on a team where people are prone to amending dev SQL Server tables and forgetting about it, or preparing a change for deployment and having to wait for that deployment. This leaves our dev and live tables inconsistent, causing problems when SPROCs are pushed live.
Is there a tool whereby I can enter a SPROC name and have it check all tables referenced in it in the dev and live DBs, and notify of any differences?
I know two excellent tools for diffing SQL database structures - they don't specifically look inside stored procedures at their text, but they'll show you structural differences in your databases:
RedGate SQL Compare
ApexSQL's SQL Diff
Redgate also has a SQL Dependency Tracker which visualizes object dependencies and could be quite useful here.
Marc
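Since the tools above do not look inside procedure text, the check asked for in the question can be roughed out by hand. The sketch below is only an approximation under loud assumptions: the regular expression is a naive way to find table names (it handles an optional `dbo.` prefix and square brackets, but will misfire on other schemas, aliases in unusual positions, comments, dynamic SQL and temp tables), and `dev_schema` and `live_schema` are assumed to be dictionaries of `{table: {column: type}}` that you have already loaded, for example from `INFORMATION_SCHEMA.COLUMNS` on each server. The function names are hypothetical.

```python
import re

# Naive pattern: a table name following FROM, JOIN, INTO or UPDATE.
TABLE_REF = re.compile(
    r"\b(?:FROM|JOIN|INTO|UPDATE)\s+(?:\[?dbo\]?\.)?\[?(\w+)\]?",
    re.IGNORECASE)


def referenced_tables(proc_sql):
    """Return the sorted, de-duplicated table names found in the text."""
    return sorted({m.group(1) for m in TABLE_REF.finditer(proc_sql)})


def compare_tables(proc_sql, dev_schema, live_schema):
    """Return {table: (dev_columns, live_columns)} where the two differ."""
    diffs = {}
    for table in referenced_tables(proc_sql):
        dev_cols = dev_schema.get(table)    # None if missing in dev
        live_cols = live_schema.get(table)  # None if missing in live
        if dev_cols != live_cols:
            diffs[table] = (dev_cols, live_cols)
    return diffs
```

An empty result means every table the procedure appears to reference has the same columns in both environments; anything else is worth a look before the SPROC goes live.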
For SQL Server 2005/2008, Open DBDiff works pretty well. The great part about this is that it's free. Also note that I am writing this answer for version 0.9 which currently works for SQL 2005/2008.
It'll show you the schema differences between a source database you specify and the destination database you specify. There are also buttons you can click that can update or create the table in question.
I would recommend SQL Compare and SQL Data Compare from Redgate Software. I worked with these tools for several projects and they did a great job. Documenting changes is also a good thing to do, but some changes are too complex to write your own SQL code for (including juggling data around between tables).
The redgate tools create scripts in a matter of seconds and those scripts are almost always correct (some older versions had a hard time with table dependencies in big databases, but when playing around with the statements (in a begin transaction / rollback) I was able to quickly fix those problems).
Another strong point in the redgate suites is that you can save your comparison project. This is especially useful when you don't want to convert a certain table (or data), you can exclude them. When loading the project the next time the software will automatically ignore those tables.
One disadvantage is the cost of the software (smaller companies I worked with did not want to buy the software). SQL compare and SQL data compare together will cost you about 800 dollars, but if you look at the time you will save when releasing you will save a lot of money. There is also a trial you can play around with (30 days I believe).
SQLDBDiff is a nice, user-friendly, lightweight tool.
SQLDBDiff supports SQL Server 2000 to 2016 and also SQL Azure.
SQLDBDiff is available both as a free version with limited use and as a full version with a trial.
Try Microsoft Visual Studio Database Edition aka Data Dude (formerly for Database Professionals). It'll do a complete schema comparison and generate the necessary scripts to upgrade the target schema.
Of course, this shouldn't replace a proper build process ;-)
If you need a quick schema comparison tool for SQL Server, you should take a look at dbForge Schema Compare for SQL Server.
I've made a MssqlMerge utility that lets you compare (and merge) MSSQL database data and programming objects. It also lets you search for a particular word or phrase across table definitions and programming objects.

Deploying SQL Server Databases from Test to Live

I wonder how you guys manage deployment of a database between 2 SQL Servers, specifically SQL Server 2005.
Now, there is a development and a live one. As this should be part of a build script (a standard Windows batch file, even though, with the current complexity of those scripts, I might switch to PowerShell or so later), Enterprise Manager/Management Studio Express do not count.
Would you just copy the .mdf file and attach it? I am always a bit careful when working with binary data, as this seems to be a compatibility issue (even though development and live should run the same version of the server at all times).
Or - given the lack of "EXPLAIN CREATE TABLE" in T-SQL - do you do something that exports an existing database into SQL-Scripts which you can run on the target server? If yes, is there a tool that can automatically dump a given Database into SQL Queries and that runs off the command line? (Again, Enterprise Manager/Management Studio Express do not count).
And lastly - given the fact that the live database already contains data, the deployment may not involve creating all tables but rather checking the difference in structure and ALTER TABLE the live ones instead, which may also need data verification/conversion when existing fields change.
Now, I hear a lot of great stuff about the Red Gate products, but for hobby projects, the price is a bit steep.
So, what are you using to automatically deploy SQL Server Databases from Test to Live?
I've taken to hand-coding all of my DDL (creates/alter/delete) statements, adding them to my .sln as text files, and using normal versioning (using subversion, but any revision control should work). This way, I not only get the benefit of versioning, but updating live from dev/stage is the same process for code and database - tags, branches and so on work all the same.
Otherwise, I agree redgate is expensive if you don't have a company buying it for you. If you can get a company to buy it for you though, it really is worth it!
For my projects I alternate between SQL Compare from Red Gate and the Database Publishing Wizard from Microsoft, which you can download for free.
The Wizard isn't as slick as SQL Compare or SQL Data Compare but it does the trick. One issue is that the scripts it generates may need some rearranging and/or editing to flow in one shot.
On the up side, it can move your schema and data which isn't bad for a free tool.
Don't forget Microsoft's solution to the problem: Visual Studio 2008 Database Edition. Includes tools for deploying changes to databases, producing a diff between databases for schema and/or data changes, unit tests, test data generation.
It's pretty expensive but I used the trial edition for a while and thought it was brilliant. It makes the database as easy to work with as any other piece of code.
Like Rob Allen, I use SQL Compare / Data Compare by Redgate. I also use the Database publishing wizard by Microsoft. I also have a console app I wrote in C# that takes a sql script and runs it on a server. This way you can run large scripts with 'GO' commands in it from a command line or in a batch script.
I use Microsoft.SqlServer.BatchParser.dll and Microsoft.SqlServer.ConnectionInfo.dll libraries in the console application.
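The reason a dedicated runner is needed is that `GO` is a batch separator understood by the client tools, not a T-SQL statement, so a script containing it has to be cut into batches before it is sent to the server. The following is a minimal sketch of that splitting step in Python, not the BatchParser library used above. It assumes `GO` appears alone on its own line, does not handle `GO` inside string literals or comments, and ignores the `GO <count>` form. SQLite (`sqlite3`) is used only so the example runs; `run_script` is a hypothetical helper.

```python
import re
import sqlite3

# A line that contains nothing but GO (any case), with optional padding.
GO_LINE = re.compile(r"^[ \t]*GO[ \t]*$", re.IGNORECASE | re.MULTILINE)


def split_batches(script):
    """Split a script on GO lines, dropping empty batches."""
    return [part.strip() for part in GO_LINE.split(script) if part.strip()]


def run_script(conn, script):
    """Execute each batch in order and return how many were run."""
    batches = split_batches(script)
    for batch in batches:
        conn.executescript(batch)
    return len(batches)
```

Wrapped in a small command-line entry point, this is enough to call from a batch file or build script.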
I work the same way Karl does, by keeping all of my SQL scripts for creating and altering tables in a text file that I keep in source control. In fact, to avoid the problem of having to have a script examine the live database to determine what ALTERs to run, I usually work like this:
On the first version, I place everything during testing into one SQL script, and treat all tables as a CREATE. This means I end up dropping and readding tables a lot during testing, but that's not a big deal early into the project (since I'm usually hacking the data I'm using at that point anyway).
On all subsequent versions, I do two things: I make a new text file to hold the upgrade SQL scripts, containing just the ALTERs for that version, and I make the same changes to the original "create a fresh database" script as well. This way an upgrade just runs the upgrade script, but if we have to recreate the DB we don't need to run 100 scripts to get there.
Depending on how I'm deploying the DB changes, I'll also usually put a version table in the DB that holds the version of the DB. Then, rather than make any human decisions about which scripts to run, whatever code I have running the create/upgrade scripts uses the version to determine what to run.
The one thing this will not do is help if part of what you're moving from test to production is data, but if you want to manage structure and not pay for a nice but expensive DB management package, it's really not very difficult. I've also found it's a pretty good way of keeping mental track of your DB.
If you have a company buying it, Toad from Quest Software has this kind of management functionality built in. It's basically a two-click operation to compare two schemas and generate a sync script from one to the other.
They have editions for most of the popular databases, including of course Sql Server.
I agree that scripting everything is the best way to go and is what I advocate at work. You should script everything from DB and object creation to populating your lookup tables.
Anything you do in the UI only won't translate (especially for changes... not so much for first deployments) and will end up requiring a tool like what Redgate offers.
Using SMO/DMO, it isn't too difficult to generate a script of your schema. Data is a little more fun, but still doable.
In general, I take "Script It" approach, but you might want to consider something along these lines:
Distinguish between Development and Staging, such that you can develop with a subset of data. For this I would create a tool that simply pulls down some production data, or generates fake data where security is a concern.
For team development, each change to the database will have to be coordinated amongst your team members. Schema and data changes can be intermingled, but a single script should enable a given feature. Once all your features are ready, you bundle these up in a single SQL file and run that against a restore of production.
Once your staging has cleared acceptance, you run the single SQL file again on the production machine.
I have used the Red Gate tools and they are great tools, but if you can't afford it, building the tools and working this way isn't too far from the ideal.
I'm using Subsonic's migrations mechanism, so I just have a DLL with classes in sequential order that each have 2 methods, up and down. There is a continuous integration/build script hook into NAnt, so that I can automate the upgrading of my database.
It's not the best thing in the world, but it beats writing DDL.
RedGate SqlCompare is a way to go in my opinion. We do DB deployment on a regular basis and since I started using that tool I have never looked back.
Very intuitive interface and saves a lot of time in the end.
The Pro version will take care of scripting for the source control integration as well.
I also maintain scripts for all my objects and data. For deploying I wrote this free utility - http://www.sqldart.com. It'll let you reorder your script files and will run the whole lot within a transaction.
I agree with keeping everything in source control and manually scripting all changes. Changes to the schema for a single release go into a script file created specifically for that release. All stored procs, views, etc. should go into individual files and be treated just like .cs or .aspx files as far as source control goes. I use a PowerShell script to generate one big .sql file for updating the programmability stuff.
I don't like automating the application of schema changes, like new tables, new columns, etc. When doing a production release, I like to go through the change script command by command to make sure each one works as expected. There's nothing worse than running a big change script on production and getting errors because you forgot some little detail that didn't present itself in development.
I have also learned that indexes need to be treated just like code files and put into source control.
And you should definitely have more than 2 databases - dev and live. You should have a dev database that everybody uses for daily dev tasks. Then a staging database that mimics production and is used to do your integration testing. Then maybe a complete recent copy of production (restored from a full backup), if that is feasible, so your last round of installation testing goes against something that is as close to the real thing as possible.
I do all my database creation as DDL and then wrap that DDL into a schema maintenance class. I may do various things to create the DDL in the first place, but fundamentally I do all the schema maintenance in code. This also means that if one needs to do non-DDL things that don't map well to SQL, you can write procedural logic and run it between lumps of DDL/DML.
My dbs then have a table which defines the current version so one can code a relatively straightforward set of tests:
Does the DB exist? If not create it.
Is the DB the current version? If not then run the methods, in sequence, that bring the schema up to date (you may want to prompt the user to confirm and - ideally - do backups at this point).
For a single-user app I just run this in place; for a web app we currently lock the user out if the versions don't match and have a standalone schema maintenance app that we run. For multi-user it will depend on the particular environment.
The advantage? Well, I have a very high level of confidence that the schema for the apps that use this methodology is consistent across all instances of those applications. It's not perfect, there are issues, but it works...
There are some issues when developing in a team environment but that's more or less a given anyway!
Murph
I'm currently working on the same thing as you: not only deploying SQL Server databases from test to live, but the whole process from Local -> Integration -> Test -> Production. What makes my life easier every day is a NAnt task that runs Red Gate SQL Compare. I don't work for Red Gate, but I have to say it is a good choice.

Resources