How to keep code base and database schema in synch? - database

So recently on a project I'm working on, we've been struggling to keep a solution's code base and the associated database schema in synch (Database = SQL Server 2008).
Database changes occur fairly regularly (adding columns, constraints, relationships, etc) and as a result it's not uncommon for people to do a 'Get Latest' from source control and
find that they also need to rebuild the database as well (and sometimes they forget to do the latter).
We're not using VSTS: Database Edition (DataDude) but the standard Visual Studio database project with a script (batch file) which tears down and recreates the database from T-SQL scripts. The solution is a .Net & ASP.net solution with LINQ to SQL underlying as the ORM.
Anyone have ideas on an approach to take (automated or not) which would keep everyone up to date with the latest database schema?
Continuous integration with MSBuild is an option, but only helps pick up any breaking changes committed, it doesn't really help in the scenario I highlighted above.
We are using Team Foundation Server, if that helps.

We try to work forward from the creation scripts.
That is, a change to the database is not authorised unless its script has been tested and checked into source control.
But this assumes that the database team is integrated with your app team which is usually not the case in a large project...
(I was tempted to answer this "with great difficulty")
EDIT: Tools won't help you if your process isn't right.

OK, although it's not the entire solution, you should include an assertion in the application code that connects to the database, to assert that the correct schema is being used. That way a mismatch at least becomes obvious, and you avoid silent bugs and people complaining that stuff went crazy all of a sudden.
As for the schema version, you could use some database-specific functionality if available, but I personally prefer to declare a schema version table and keep the version number in there. That way it's portable and can be checked with a simple SELECT statement.
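A minimal sketch of that startup assertion, in Python with an in-memory SQLite database standing in for SQL Server. The table name `schema_version`, the constant `EXPECTED_SCHEMA_VERSION` and the exception class are hypothetical names chosen for illustration, not something from the original setup.

```python
import sqlite3

EXPECTED_SCHEMA_VERSION = 3  # hypothetical: the version this build was written against


class SchemaVersionMismatch(RuntimeError):
    """Raised when the database schema does not match what the code expects."""


def assert_schema_version(conn, expected=EXPECTED_SCHEMA_VERSION):
    # A single-row table holds the version, so one plain SELECT is enough.
    row = conn.execute("SELECT version FROM schema_version").fetchone()
    actual = row[0] if row else None
    if actual != expected:
        raise SchemaVersionMismatch(
            f"database schema is version {actual}, code expects {expected}; "
            "rebuild or upgrade your database"
        )


conn = sqlite3.connect(":memory:")  # in-memory SQLite stands in for the real server
conn.execute("CREATE TABLE schema_version (version INTEGER NOT NULL)")
conn.execute("INSERT INTO schema_version (version) VALUES (3)")
assert_schema_version(conn)  # passes silently when the versions agree
```

Calling the check once, right after the application opens its first connection, turns a forgotten database rebuild into an immediate and readable error instead of a confusing failure later on.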

Have a look at DB Ghost: you can create a dbp using the scripter in seconds and then manage all your database code with the change manager. www.dbghost.com
This is exactly what DB Ghost was designed to handle.

We basically do things the way you are, with the generation script checked into source control as well. I'm the designated database master so all changes to the script itself are done through me. People send me scripts of the changes they have made, I update my master copy of the schema, run a generate scripts (SSMS) to produce the new DB script, and then check it in. I keep my copy of the code current with any changes that are being made elsewhere. We're a small shop so this works pretty well for us. I realize that it probably doesn't scale.

If you are not using Visual Studio Database Professional Edition, then you will need another tool that can break the database down into its elemental pieces so that they are manageable and easier to change.
I'd recommend seriously considering Redgate's SQL tools if you want to maintain sanity over all your database changes and updates.
SQL Packager
SQL Multi Script
SQL Refactor

Use a tool like RedGate SQL Compare to generate the change script between any two given versions of the database. You can then check that file into source control.

Have a look at this question: dynamic patching of databases. I think it's similar enough to your problem to be helpful.

My solution to this problem is simple. Define everything as XML, and make sure that both the database, the ORM and the UI are generated from this XML, no exceptions. That way, you can use code generation tools to quickly regenerate the database creation script, which will alter your schema while (hopefully) preserving some data. It takes some effort to do, but the net result is well worth it.

Related

How do you deal with multiple developers and database changes?

I would like to know how you guys deal with development database changes in groups of 2 or more devs? Do you have a global db everyone access, maybe a local copy and manually apply script changes? It would be nice to see pros and cons that you've noticed for each approach and the number of devs in your team.
Start with "Evolutionary Database Design" by Martin Fowler. This sums it up nicely.
There have been other questions about DB development that may be useful too, for example Is RedGate SQL Source Control for me?
Our approach is that everyone has their own DB, the complete DB can be created from create scripts with base data if required. All the scripts required for this are in source control.
All scripts are CREATE scripts and they reflect the current state of the database schema. Upgrades are in separate SQL files which can upgrade existing DBs from a specific version to a newer one (run sequentially). After all the updates have been applied, the schema must be identical to what you would get from running the setup scripts.
We have some tools to do this (we use SQL Server and .NET):
Scripting is done with a tool which also applies a standard formatting so that the changes are well traceable with text diff tools (and by the SCM)
A runtime module takes care of comparing the existing DB objects, running updates if required, automatically applying "non-destructive" changes, and then checking the DB objects again to ensure a correct migration before committing the changes
The toolset is available as open-source project (licensed under LGPL), it's called the bsn ModuleStore (note that it is limited to SQL Server 2005/2008/Azure and to .NET for the runtime part).
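The rule that the upgrade path must end in the same schema as the create scripts can be checked automatically. The following is only a sketch of that idea, using SQLite in Python rather than SQL Server, and it is not how bsn ModuleStore does it. The scripts and the `column_snapshot` helper are hypothetical.

```python
import sqlite3

# Hypothetical scripts: CREATE_CURRENT reflects today's schema,
# CREATE_V1 plus UPGRADES is the path an existing database takes.
CREATE_CURRENT = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, email TEXT);"
CREATE_V1 = "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);"
UPGRADES = ["ALTER TABLE customer ADD COLUMN email TEXT;"]  # run sequentially


def column_snapshot(conn):
    """Return {table: [(column, type), ...]} for every table in the database."""
    tables = [r[0] for r in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    return {
        t: [(c[1], c[2]) for c in conn.execute(f'PRAGMA table_info("{t}")')]
        for t in tables
    }


def build(scripts):
    """Create a scratch database and run the given scripts against it in order."""
    conn = sqlite3.connect(":memory:")
    for script in scripts:
        conn.executescript(script)
    return conn


fresh = column_snapshot(build([CREATE_CURRENT]))
upgraded = column_snapshot(build([CREATE_V1] + UPGRADES))
assert fresh == upgraded, "upgrade scripts and create scripts have drifted apart"
```

Comparing column lists rather than the stored CREATE text is deliberate here, because an upgraded table rarely has byte-identical DDL even when the resulting structure is the same.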
We use what was code named "Data Dude" - the database features in TFS and Visual Studio - to deal with this. When you "get latest" and bring in code that relies on a schema change, you also bring in the revised schemas, stored procedures etc. You right-click the database project and Deploy; that gets your local schema and stored procedures in sync but doesn't overwrite your data. The job of working out the script to get you from your old schema to the new one falls to Visual Studio, not to you or your DBA. We also have "populate" scripts for things like lists of provinces and a deploy runs them for you.
So much better than the old way which always fell apart at high stress times, with people checking in code then going home and nobody knowing what columns to add to make the code work etc.

Versioning SQL Server DDL code

I'd like to have all DB DDL code under CVS.
We are using Subversion for our .NET code but all database code remains still unversioned.
We all know how important DB logic can be. I've googled, but I've found only a few (expensive) tools. I believe there are other (cheaper) solutions.
What approach do you advise to follow? What tools are most appropriate?
SQL Server 2005, VS 2008 TS, TSVN
UPDATE
Our scenario is that developers cannot access the PROD DB directly. It is changed only by scripts (so this is not a problem).
I'm mostly interested in the DEV environment, where all developers have full access.
So it happens that one developer overwrites a USP previously changed by another.
I'd like to be able to restore a lost version, compare USP revisions, etc.
UPDATE-2
To create deployment scripts we are using Red-Gate SQL Compare.
It works perfectly, so deployment scripts are not the issue.
If you haven't already read it, Martin Fowler's article Evolutionary Database Design is a great place to start.
The article is hard to summarize, but it describes how his team dealt with database versioning in a rapidly changing development process. They created their own tools to facilitate things: scripts to bring users up to the current master, to copy any version of the schema so users could debug one another's working copies, etc.
For a solid low-tech solution, I've found it helpful to keep two kinds of DDL scripts in source control:
A master version that can create the database objects from scratch.
'Version upgrade' scripts for each development iteration.
They're redundant to a degree, but extremely useful (particularly when it comes to deployment).
If you haven't already looked at the Visual Studio Database Edition GDR (a.k.a. "Data Dude"), you should definitely download it and try it out:
http://www.microsoft.com/downloads/details.aspx?FamilyID=bb3ad767-5f69-4db9-b1c9-8f55759846ed&displaylang=en
Among other things, the GDR will facilitate team development by making it easy for each developer to maintain their own local copy of a database, version scripts, create deployment scripts to move a database schema to a new version, and even support database rollback.
It's free if you are using team system developer edition. Check it out.
If you are using Visual Studio Team Suite or Visual Studio Developer Edition, you are entitled to a copy of Visual Studio Database Professional. This is designed to do exactly what you describe, and much more. We use it to manage our database schema (code).
Randy
We use Subversion for all our database code as well. Since nothing is allowed to go to Prod unless it is in a script, there seems to be no problem with getting people to put all the scripts into Subversion. We tend to write ALTER TABLE scripts to change tables with existing data and then recreate the whole table structure in case we need to create a new database from scratch (we often have the same database structure on multiple servers, as some of our clients are very large and do not want their data accidentally available to the competition, so they pay for separate servers and we may therefore need to create the whole database again with no data). For objects that don't directly store data we drop the original object and recreate it with a CREATE statement. Each project has its own home in the repository and each database does too, so the script may be in more than one place to facilitate deployment.
But the real key is that no one can load to Prod without a script. We don't give our devs direct rights to prod, so they have no problem doing things in scripts as opposed to using SSMS.
I wrote SMOscript which generates a CREATE script for each object in a database.
Use this tool to generate into a directory covered by CVS, and update your repository.
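To illustrate the "one CREATE script per object" idea without SMOscript itself, here is a rough Python sketch against SQLite, which keeps object definitions in `sqlite_master`. SQL Server would need SMO or its catalog views instead. The function name `script_objects` and the `type.name.sql` file naming are assumptions made for this example.

```python
import sqlite3
import tempfile
from pathlib import Path


def script_objects(conn, out_dir):
    """Write the CREATE statement of each table, view, index and trigger to its own file."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    rows = conn.execute(
        "SELECT type, name, sql FROM sqlite_master "
        "WHERE sql IS NOT NULL ORDER BY type, name"
    )
    for obj_type, name, sql in rows:
        path = out_dir / f"{obj_type}.{name}.sql"
        # Stable ordering and a trailing newline keep text diffs small.
        path.write_text(sql.strip() + ";\n", encoding="utf-8")
        written.append(path.name)
    return written


conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, total REAL);
    CREATE VIEW big_orders AS SELECT * FROM orders WHERE total > 100;
""")
files = script_objects(conn, tempfile.mkdtemp())
```

Pointing `out_dir` at a working copy and committing after each run gives per-object history, so an overwritten procedure can be recovered with an ordinary diff or revert.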
Finally, I found this tool and approach extremely useful and very easy to introduce
(at least at the beginning, when no versioning solution is in place):
http://www.codeproject.com/KB/database/SQLScripter.aspx
You can run it out of the box.
For a final solution I'd lean towards the GDR.
This also sounds interesting:
Freeware:
http://dbsourcetools.codeplex.com/
http://www.codeproject.com/KB/database/ScriptDB4Svn.aspx
http://www.codeproject.com/KB/database/SQLScripter.aspx
http://blog.boxedbits.com/archives/133
Commercial:
http://www.nobhillsoft.com/Randolph.aspx
You should use Management Studio (SSMS) and place the .sql under source control, possibly separate schema objects under folders.
Hope this helps
See if Wizardby fits your needs.

How do you put an large existing database (schema) under source control?

My DBA just lost some development work that he did on our development database. Poor fella. So naturally our manager asked him, at our status meeting, how this could happen and how we could avoid this happening in the future. "Source control could alleviate the problem" I suggested... The dba's response; "No, we just backup the server more often". Now I would like to help my DBA understand what source control is and how it fits together with a database schema and development on that schema.
Previously I've tried to explain to him that there's nothing special about the source code behind tables and stored procedures and that it should be in a source control system (TFS in this case). But he just didn't bite. Now, while this mishap is in recent memory, I would like to take another stab at it.
So my question is, do you know of any good advice I could pass on to my DBA and maybe even a couple of resources explaining how you would go about migrating a DB schema to be under source control and find its proper place in the build and deployment processes?
A couple of facts about the environment:
Source Control on a TFS 2008 Server.
Database is a MS SQL server 2008 with >300 tables and >300 other objects (sprocs, triggers, functions etc.).
Clarification:
We have been using DB Ghost and other change management solutions on other projects with other DBAs, in the past. We even have the license for VS DB edition! The problem is getting the DBA to even think about this way of developing for the database. He's really old school (i.e. migrating changes manually from environment to environment), and unfortunately he's the only one who knows anything about this particular DB.
See how to version control sql server databases and Do you source control your databases, among many others. Or use the search page. Basically, your approach seems correct. Good luck persuading the DBA...
If you are using Visual Studio Team System, I recommend having a stab at their Database Edition (I think these days it comes with the Developer Edition if you are an MSDN Subscriber). What this will allow you to do is to script out all your schema, stored procs, views, triggers, etc. and source control these. This should also make the DBA more comfortable, since he will be working with a "Database" version of the tool rather than the "Developer" version (naming can go a long way with people). As you make changes from Visual Studio, you can manage script changes as you work, and source control them.
If your company has an MSDN license, they can use the Visual Studio Database edition. There's a video tutorial of it here.
I have no power of purchase, so I don't know what the cost breakdowns are. But it has the capability of source controlling all the parts of a DB schema, and includes creating change-scripts as well as auto-deploying straight from VS if you want (I wouldn't recommend that).
In general though, it's pretty solid as a database source control option.
Source control for databases can be quite contentious. It's different from using source control for something that produces a binary, because you can't lock the source: a stored proc is a row in a table, and there is no single table to read to get a table definition.
Also, version to version is mostly a set of ALTER statements. You script out CREATEs and add them to source control. This makes it harder to use in cases like this.
To me, this is more a procedural error.
Why was the change not done from a script? Forget where the script lives, but why was there no reproducible and re-runnable script? Perhaps linked to the change tracking number? If the database is reset (loaded from prod), then how would the change have been re-applied to prepare for production? And other questions.
I believe in source control and we use it: but it has limits for database work.
First you are approaching this incorrectly. If the dba won't bite on Source Control and he is making errors that affect the system, the person you need to persuade is his boss.
If it helps, I'm from the old school too and I love having our database objects in source control. How nice to be able to revert one table without having to restore the whole database backup to a different location and then move the table. How much faster and simpler. How nice to be able to compare two different versions and see what changed. How nice to deploy a change and know exactly which database changes (say, for instance only twelve of the 23 possible ones) go with the part you are deploying and not some other unfinished project. How nice to know exactly which scripts were involved in a particular change you had to rollback. How nice that nobody is making on-the-fly changes on production since we now require all production changes to be from source control scripts. There are so many fewer errors and issues to worry about.
Yes it was a change in how we did business, but we did it through a policy change from on high so there was no argument, and the dbas went through a couple of times and reverted any objects different from source control to the source control version, so now nobody will even think of doing a database change without it being in source control.
As the product manager for SQL Compare I've spoken to many 'traditional' DBAs who are uncomfortable with third party tools mainly because they have a system that works for them and sometimes changing can be difficult. There are many situations where I am convinced that they would benefit from our tools if only they gave them a chance. Frustrating.
One thing you might consider trying is Red Gate's upcoming tool, SQL Source Control. This is designed to build source control into SSMS, in other words it doesn't require DBAs to leave the comfort zone of their management environment. The bad news is that the tool hasn't been released yet. The good news is that we have an Early Access Program. Please visit the following link to find out more about the tool:
http://www.red-gate.com/Products/SQL_Source_Control/index.htm
You can't really put a large database under source control, so your DBA is right.
What you can do practically is to put your schema under source control, and maybe a few smallish 'configuration' tables.
One way to source-control a database is to store the data in it and the data about it separately.
You can keep all the table, procedure and function scripts as SQL files and add them to source control.
Export the database data as insert statements into SQL files, each with a fixed size. This is a cumbersome process as it would involve a lot of files that are to be tracked and controlled.
I am not sure if the VSS/SVN are able to read and keep history of changes to dump files created by the database backup options.
It's not clear from your question if you want to protect the data in the Db or the schemas in the Db. If the latter, then you could identify all the important schemas and run a cron job that pulls the schema definitions from the Db and inserts them automatically into a source control system (perhaps even via triggers on the schemas??).
But this still just amounts to backing the system up more often. For what you envision you would need source control integrated with the Db tools and I don't know of any product that does that.
(and I shudder to think of VSS integrated into SQL management studio :-(( )
My answer to this same problem was to export all DB objects to text form (more than 136,000 of them) and then create the SourceSafe projects to hold them. Any new or changed objects in the DB now go to the SourceSafe structure, while unchanged ones are left alone.

SQL Server Compact - Schema Management

I've been searching for some time for a good solution to implement the idea of managing schema on an SQL Server Compact 3.5 database.
I know of several ways of managing schema on SQL Server Express, SQL Server Standard, SQL Server Enterprise, but the Compact Edition doesn't support the necessary tools required to use the same methodology.
Any suggestions/tips?
I should expand this to say that it is for 100+ clients with wrapperware software. As the system changes, I need to publish update scripts alongside the new binaries to the client. I was looking for a decent method by which to publish this without having to just hand the client a script file and say "Run this in SSMSE". Most clients are not capable of doing such a beast.
A buddy of mine disclosed a partial script on how to handle the SQL Server piece of my task, but never worked on Compact Edition. It looks like I'll be on my own for this.
What I think I've decided to do, and it's going to need a "geek week" to accomplish, is to write some sort of tool much like how WiX and NAnt work, so that I can just write an overzealous XML document to handle the work.
If I think that it is worthwhile, I'll publish it on CodePlex and/or The Code Project because I've used both sites a bit to gain better understanding of concepts for jobs I've done in the past, and I think it is probably worthwhile to give back a little.
Edit on 5/3/2010:
If someone is willing to "name" the project, I'll upload the dirty/nasty version that I've written for MS SQL to CodePlex so that maybe we can start hacking out a version of SQL Compact. Although, I think with the next revision of the initial application that I was planning, I'm going to be abandoning SQL Compact and just use XML Files for storage, as the software is being converted from an Installable package to being a Silverlight application. Silverlight just gives a better access strategy.
I am currently looking into Migrator.Net.
This allows you to write changes to your database, called migrations, directly in C#.
These migrations can contain everything from simple table additions/drops, column modifications, to complicated data update code.
When your application boots, it can verify what version the database is currently in and apply any migrations that are required to bring it up to date. All this is handled automatically. The code to run this update is as simple as:
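// Load the assembly that contains the migration classes.
Assembly asm = Assembly.Load("LocalModels.migration");
// Target the SQL Server CE provider and the local .sdf database file.
Migrator m = new Migrator("SqlServerCe", "Data Source=LocalModels.sdf", asm, false);
m.MigrateToLastVersion();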
I am having a couple minor issues with the Compact support (it assumes the default schema is dbo). But I don't think it will be too difficult to fix them.
some random thoughts (not sure I can fully answer though)
the Microsoft Sync Framework is one option. I haven't had a chance to fully appreciate what it can do once you've deployed it after the initial first time (which seems to work fine). There's a MSDN site for it here
You can execute scripts on a mobile device, but not through something like SQL Management Studio, so in theory you could manage/maintain T-SQL scripts. The downside is that the T-SQL would be convoluted (restricted to CE's supported statements) and I don't know a way to "automate" execution, but the Sync Framework might hold some answers.
If one of your key criteria is going to be working efficiently over a small pipe, the only real choice you have is to store a DB Schema Version (maybe somehow tied to the scripts checked into your CMS) and when an update is needed, the change scripts are sent over the wire and applied in order. You would probably want to keep a log in your DB as well of these scripts being applied so you can gracefully handle disconnects, reboots and other potentially nasty problems.
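The "apply in order and keep a log" idea from the last point could look roughly like this. It is a sketch in Python against SQLite, not SQL Server Compact, and the `applied_scripts` table, the script names and `apply_pending` are all hypothetical.

```python
import sqlite3

# Hypothetical change scripts, keyed by a sortable name; each holds a single statement.
CHANGE_SCRIPTS = {
    "001_create_customer": "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    "002_add_email": "ALTER TABLE customer ADD COLUMN email TEXT",
}


def apply_pending(conn, scripts):
    """Apply scripts not yet recorded in the log, in name order; return what was run."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS applied_scripts ("
        "name TEXT PRIMARY KEY, applied_at TEXT DEFAULT CURRENT_TIMESTAMP)"
    )
    done = {r[0] for r in conn.execute("SELECT name FROM applied_scripts")}
    ran = []
    for name in sorted(scripts):
        if name in done:
            continue
        # The change and its log entry share one transaction, so an interrupted
        # run leaves either both or neither behind.
        conn.execute("BEGIN")
        try:
            conn.execute(scripts[name])
            conn.execute("INSERT INTO applied_scripts (name) VALUES (?)", (name,))
            conn.execute("COMMIT")
        except Exception:
            conn.execute("ROLLBACK")
            raise
        ran.append(name)
    return ran


conn = sqlite3.connect(":memory:", isolation_level=None)  # explicit transactions
first = apply_pending(conn, CHANGE_SCRIPTS)
second = apply_pending(conn, CHANGE_SCRIPTS)  # nothing left to do on a re-run
```

Because the log is consulted on every run, a client that lost its connection halfway through simply picks up at the first script that was never recorded.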
Is SQL Server Management Studio any use for you?
http://technet.microsoft.com/en-us/library/ms172933.aspx

Deploying SQL Server Databases from Test to Live

I wonder how you guys manage deployment of a database between 2 SQL Servers, specifically SQL Server 2005.
Now, there is a development and a live one. As this should be part of a buildscript (standard Windows batch, even though, with the current complexity of those scripts, I might switch to PowerShell or similar later), Enterprise Manager/Management Studio Express do not count.
Would you just copy the .mdf file and attach it? I am always a bit careful when working with binary data, as this seems to be a compatibility issue (even though development and live should run the same version of the server at all times).
Or - given the lack of "EXPLAIN CREATE TABLE" in T-SQL - do you do something that exports an existing database into SQL-Scripts which you can run on the target server? If yes, is there a tool that can automatically dump a given Database into SQL Queries and that runs off the command line? (Again, Enterprise Manager/Management Studio Express do not count).
And lastly - given the fact that the live database already contains data, the deployment may not involve creating all tables but rather checking the difference in structure and ALTER TABLE the live ones instead, which may also need data verification/conversion when existing fields change.
Now, I hear a lot of great stuff about the Red Gate products, but for hobby projects, the price is a bit steep.
So, what are you using to automatically deploy SQL Server Databases from Test to Live?
I've taken to hand-coding all of my DDL (creates/alter/delete) statements, adding them to my .sln as text files, and using normal versioning (using subversion, but any revision control should work). This way, I not only get the benefit of versioning, but updating live from dev/stage is the same process for code and database - tags, branches and so on work all the same.
Otherwise, I agree redgate is expensive if you don't have a company buying it for you. If you can get a company to buy it for you though, it really is worth it!
For my projects I alternate between SQL Compare from Red Gate and the Database Publishing Wizard from Microsoft, which you can download for free here.
The Wizard isn't as slick as SQL Compare or SQL Data Compare but it does the trick. One issue is that the scripts it generates may need some rearranging and/or editing to flow in one shot.
On the up side, it can move your schema and data which isn't bad for a free tool.
Don't forget Microsoft's solution to the problem: Visual Studio 2008 Database Edition. Includes tools for deploying changes to databases, producing a diff between databases for schema and/or data changes, unit tests, test data generation.
It's pretty expensive but I used the trial edition for a while and thought it was brilliant. It makes the database as easy to work with as any other piece of code.
Like Rob Allen, I use SQL Compare / Data Compare by Redgate. I also use the Database publishing wizard by Microsoft. I also have a console app I wrote in C# that takes a sql script and runs it on a server. This way you can run large scripts with 'GO' commands in it from a command line or in a batch script.
I use Microsoft.SqlServer.BatchParser.dll and Microsoft.SqlServer.ConnectionInfo.dll libraries in the console application.
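The reason such a runner is needed is that GO is a batch separator understood by the client tools, not a T-SQL statement, so a script has to be cut into batches before each one is sent to the server. The sketch below is a deliberately naive Python stand-in for that splitting step and is not the BatchParser library: it ignores the optional repeat count and would be fooled by a GO line inside a multi-line string or comment.

```python
import re

# GO must sit alone on its line (optionally followed by a repeat count).
_GO_LINE = re.compile(r"^\s*GO(?:\s+\d+)?\s*$", re.IGNORECASE)


def split_batches(script):
    """Split a T-SQL script into the batches delimited by GO lines."""
    batches, current = [], []
    for line in script.splitlines():
        if _GO_LINE.match(line):
            text = "\n".join(current).strip()
            if text:
                batches.append(text)
            current = []
        else:
            current.append(line)
    tail = "\n".join(current).strip()
    if tail:
        batches.append(tail)
    return batches


sample = """CREATE TABLE t (id INT)
GO
INSERT INTO t VALUES (1)
go
SELECT * FROM t"""
batches = split_batches(sample)
```

Each element of `batches` would then be executed as its own command, which is what makes large multi-batch scripts usable from a command line or batch file.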
I work the same way Karl does, by keeping all of my SQL scripts for creating and altering tables in a text file that I keep in source control. In fact, to avoid the problem of having to have a script examine the live database to determine what ALTERs to run, I usually work like this:
On the first version, I place everything during testing into one SQL script, and treat all tables as a CREATE. This means I end up dropping and readding tables a lot during testing, but that's not a big deal early into the project (since I'm usually hacking the data I'm using at that point anyway).
On all subsequent versions, I do two things: I make a new text file to hold the upgrade SQL scripts, which contains just the ALTERs for that version. And I make the same changes to the original create-a-fresh-database script as well. This way an upgrade just runs the upgrade script, but if we have to recreate the DB we don't need to run 100 scripts to get there.
Depending on how I'm deploying the DB changes, I'll also usually put a version table in the DB that holds the version of the DB. Then, rather than make any human decisions about which scripts to run, whatever code I have running the create/upgrade scripts uses the version to determine what to run.
The one thing this will not do is help if part of what you're moving from test to production is data, but if you want to manage structure and not pay for a nice but expensive DB management package, it is really not very difficult. I've also found it's a pretty good way of keeping mental track of your DB.
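The version-table approach described in this answer might be sketched like this, in Python with SQLite standing in for SQL Server. The `db_version` table and the `UPGRADES` mapping are hypothetical names; the point is only that the stored number, not a person, decides which scripts run.

```python
import sqlite3

# Hypothetical upgrade scripts: UPGRADES[n] moves a database from version n to n + 1.
UPGRADES = {
    0: "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)",
    1: "ALTER TABLE customer ADD COLUMN email TEXT",
}


def current_version(conn):
    """Read the stored version, creating the version table at 0 if it is missing."""
    conn.execute("CREATE TABLE IF NOT EXISTS db_version (version INTEGER NOT NULL)")
    row = conn.execute("SELECT version FROM db_version").fetchone()
    if row is None:
        conn.execute("INSERT INTO db_version (version) VALUES (0)")
        return 0
    return row[0]


def upgrade(conn):
    """Run every upgrade script from the stored version onwards; return the new version."""
    version = current_version(conn)
    while version in UPGRADES:
        conn.execute(UPGRADES[version])
        version += 1
        conn.execute("UPDATE db_version SET version = ?", (version,))
    conn.commit()
    return version


conn = sqlite3.connect(":memory:")
new_version = upgrade(conn)  # a brand-new database runs both scripts
```

Running `upgrade` against a database that is already current is a no-op, so the same call can sit in a build script or an installer without any conditional logic around it.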
If you have a company buying it, Toad from Quest Software has this kind of management functionality built in. It's basically a two-click operation to compare two schemas and generate a sync script from one to the other.
They have editions for most of the popular databases, including of course Sql Server.
I agree that scripting everything is the best way to go and is what I advocate at work. You should script everything from DB and object creation to populating your lookup tables.
Anything you do in the UI only won't translate (especially for changes... not so much for first deployments) and will end up requiring a tool like what Redgate offers.
Using SMO/DMO, it isn't too difficult to generate a script of your schema. Data is a little more fun, but still doable.
In general, I take "Script It" approach, but you might want to consider something along these lines:
Distinguish between Development and Staging, such that you can develop with a subset of data. For this I would create a tool to simply pull down some production data, or generate fake data where security is a concern.
For team development, each change to the database will have to be coordinated amongst your team members. Schema and data changes can be intermingled, but a single script should enable a given feature. Once all your features are ready, you bundle these up in a single SQL file and run that against a restore of production.
Once your staging has cleared acceptance, you run the single SQL file again on the production machine.
I have used the Red Gate tools and they are great tools, but if you can't afford it, building the tools and working this way isn't too far from the ideal.
I'm using Subsonic's migrations mechanism, so I just have a dll with classes in sequential order that have 2 methods, up and down. There is a continuous integration/build script hook into nant, so that I can automate the upgrading of my database.
It's not the best thing in the world, but it beats writing DDL.
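For readers who have not seen the up/down pattern, here is a small Python sketch of the general shape. It is not SubSonic's API: the class names, the `MIGRATIONS` list and the `migrate` function are made up for illustration, and SQLite stands in for SQL Server.

```python
import sqlite3


class Migration001CreateCustomer:
    def up(self, conn):
        conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")

    def down(self, conn):
        conn.execute("DROP TABLE customer")


class Migration002AddOrders:
    def up(self, conn):
        conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER)")

    def down(self, conn):
        conn.execute("DROP TABLE orders")


MIGRATIONS = [Migration001CreateCustomer(), Migration002AddOrders()]  # sequential order


def migrate(conn, current, target):
    """Move the schema from version `current` to `target`; return the version reached."""
    while current < target:
        MIGRATIONS[current].up(conn)
        current += 1
    while current > target:
        current -= 1
        MIGRATIONS[current].down(conn)
    return current


def table_names(conn):
    rows = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")
    return [r[0] for r in rows]


conn = sqlite3.connect(":memory:")
version = migrate(conn, 0, 2)        # apply both migrations
version = migrate(conn, version, 1)  # roll the last one back
```

A build script only has to call the equivalent of `migrate` with the latest version number, which is what makes the upgrade easy to automate.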
RedGate SqlCompare is a way to go in my opinion. We do DB deployment on a regular basis and since I started using that tool I have never looked back.
Very intuitive interface and saves a lot of time in the end.
The Pro version will take care of scripting for the source control integration as well.
I also maintain scripts for all my objects and data. For deploying I wrote this free utility - http://www.sqldart.com. It'll let you reorder your script files and will run the whole lot within a transaction.
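Running the whole lot within a transaction means a failure halfway through leaves the database as it was. As a rough illustration of that behaviour only (this is not how the utility above is implemented), here is a Python sketch against SQLite, with a hypothetical `run_all_or_nothing` helper and a deliberately broken last statement.

```python
import sqlite3


def run_all_or_nothing(conn, statements):
    """Run the statements in the given order inside one transaction."""
    conn.execute("BEGIN")
    try:
        for statement in statements:
            conn.execute(statement)
        conn.execute("COMMIT")
    except Exception:
        conn.execute("ROLLBACK")  # undo everything, including earlier DDL
        raise


conn = sqlite3.connect(":memory:", isolation_level=None)  # we manage transactions ourselves
release = [
    "CREATE TABLE product (id INTEGER PRIMARY KEY, name TEXT)",
    "INSERT INTO product (name) VALUES ('widget')",
    "INSERT INTO no_such_table VALUES (1)",  # deliberate failure
]
try:
    run_all_or_nothing(conn, release)
except sqlite3.OperationalError:
    pass  # the failed release left no half-created objects behind

tables = conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'").fetchall()
```

This relies on the engine supporting transactional DDL, which SQLite does; whether every statement in a real release script can be rolled back on your server is something to verify rather than assume.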
I agree with keeping everything in source control and manually scripting all changes. Changes to the schema for a single release go into a script file created specifically for that release. All stored procs, views, etc should go into individual files and treated just like .cs or .aspx as far as source control goes. I use a powershell script to generate one big .sql file for updating the programmability stuff.
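The bundling step mentioned above (one big .sql file built from the individual files) is simple enough to sketch. The original uses PowerShell; the version below is Python, and the folder layout, file names and `bundle_sql` function are assumptions for illustration.

```python
import tempfile
from pathlib import Path


def bundle_sql(source_dir, output_file):
    """Concatenate every .sql file under source_dir, in path order, into one script."""
    source_dir, output_file = Path(source_dir), Path(output_file)
    parts = []
    for path in sorted(source_dir.rglob("*.sql")):
        if path.resolve() == output_file.resolve():
            continue  # do not bundle a previous bundle into itself
        rel = path.relative_to(source_dir).as_posix()
        body = path.read_text(encoding="utf-8").strip()
        # A GO line after each file keeps every object in its own batch.
        parts.append(f"-- begin {rel}\n{body}\nGO\n")
    output_file.write_text("\n".join(parts), encoding="utf-8")
    return len(parts)


root = Path(tempfile.mkdtemp())
(root / "procs").mkdir()
(root / "procs" / "usp_get_customer.sql").write_text(
    "CREATE PROCEDURE usp_get_customer AS SELECT 1", encoding="utf-8")
(root / "views").mkdir()
(root / "views" / "v_active.sql").write_text(
    "CREATE VIEW v_active AS SELECT 1 AS x", encoding="utf-8")
count = bundle_sql(root, root / "release_programmability.sql")
```

Keeping the per-object files as the source of truth and treating the bundle as a build output means the bundle itself never needs to be checked in.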
I don't like automating the application of schema changes, like new tables, new columns, etc. When doing a production release, I like to go through the change script command by command to make sure each one works as expected. There's nothing worse than running a big change script on production and getting errors because you forgot some little detail that didn't present itself in development.
I have also learned that indexes need to be treated just like code files and put into source control.
And you should definitely have more than 2 databases - dev and live. You should have a dev database that everybody uses for daily dev tasks. Then a staging database that mimics production and is used to do your integration testing. Then maybe a complete recent copy of production (restored from a full backup), if that is feasible, so your last round of installation testing goes against something that is as close to the real thing as possible.
I do all my database creation as DDL and then wrap that DDL into a schema maintenance class. I may do various things to create the DDL in the first place, but fundamentally I do all the schema maintenance in code. This also means that if one needs to do non-DDL things that don't map well to SQL, one can write procedural logic and run it between lumps of DDL/DML.
My dbs then have a table which defines the current version so one can code a relatively straightforward set of tests:
Does the DB exist? If not create it.
Is the DB the current version? If not then run the methods, in sequence, that bring the schema up to date (you may want to prompt the user to confirm and - ideally - do backups at this point).
For a single-user app I just run this in place; for a web app we currently lock the user out if the versions don't match and have a standalone schema maintenance app that we run. For multi-user it will depend on the particular environment.
The advantage? Well, I have a very high level of confidence that the schema for the apps that use this methodology is consistent across all instances of those applications. It's not perfect, there are issues, but it works...
There are some issues when developing in a team environment but that's more or less a given anyway!
Murph
I'm currently working on the same thing as you: not only deploying SQL Server databases from test to live, but the whole process from Local -> Integration -> Test -> Production. What makes this easy for me day to day is a NAnt task that runs Red-Gate SQL Compare. I don't work for Red Gate, but I have to say it is a good choice.

Resources