Database under source control again! - sql-server

We use MS SQL Server and C#. Our database is under sourse control and I will tell you some details of our implementation. We had implemented two operations:
Export database to plain-text files. Database schema files:
tables.sql
relationships.sql
views.sql
...
and table contents files:
Data/table1.txt
Data/table2.txt
...
It is easy to review database changes using source control logs because all these files has plain-text format.
Imlementation is based on classes from namespace Microsoft.SqlServer.Management.Smo.
Import database from this plain-text files. Implementation is strightforward - just execute sql statements from *.sql files, and then execute a bunch of inserts.
So we have two bat-files: create-test-databse.bat and export-test-database.bat. When a developer needs a new test database he just executes the bat-file and waits for a minute.
Every functional test, which needs a database creates a new database, uses it, and then kills it. But I should say that it is not very fast operation. :(
So what instruments do YOU use to put your database under source control?
I mean how do you implement operations "create test database" and "export test database" for example?

I think you are asking two questions here. The first is how to get your database under source control. Your solution is interesting, and I've also used Visual Studio Team Edition for Database Professionals (here's a tutorial I wrote on TDD of stored procedures using it)
The second is how to you set up your database for integration tests. Setting up and tearing down the entire database seems like it might be a bit overkill. There are several solutions out there. I've used DBFit. Roy Osherove published a tool called XtUnit a while back that I haven't played with. And, of course, you could always setup your tests to do a transaction start in the SetUp, and a Rollback during teardown.

We use Visual Studio for Database Professionals. Don't leave home without it.

I use VSTS for DB Pros. You point it at your SQL server and it analyzes your database and creates the individual files for you. You can even have it generate test data for you. The next release will include support for third party providers (think Oracle, MySQL, DB2).
The really great feature in here is the validation. We found that parts of our database were totally broken (they were vestigial, not used by the code anymore). It basically makes it possible to deploy your database on demand.

Related

How do you deal with multiple developers and database changes?

I would like to know how you guys deal with development database changes in groups of 2 or more devs? Do you have a global db everyone access, maybe a local copy and manually apply script changes? It would be nice to see pros and cons that you've noticed for each approach and the number of devs in your team.
Start with "Evolutionary Database Design" by Martin Fowler. This sums it up nicely
There are have been other questions about DB development that may be useful too, for example Is RedGate SQL Source Control for me?
Our approach is that everyone has their own DB, the complete DB can be created from create scripts with base data if required. All the scripts required for this are in source control.
All scripts are CREATE scripts and they reflect the current state of the database schema. Upgrades are in separate SQL files which can upgrade existing DBs from a specific version to a newer one (run sequentially). After all the updates have been applied, the schema must be identical to what you would get from running the setup scripts.
We have some tools to do this (we use SQL Server and .NET):
Scripting is done with a tool which also applies a standard formatting so that the changes are well traceable with text diff tools (and by the SCM)
A runtime module takes care of comparing the existing DB objects, run updates if required, automatically apply "non-destructive" changes, then check the DB objects again to ensure a correct migration before committing the changes
The toolset is available as open-source project (licensed under LGPL), it's called the bsn ModuleStore (note that it is limited to SQL Server 2005/2008/Azure and to .NET for the runtime part).
We use what was code named "Data Dude" - the database features in TFS and Visual Studio - to deal with this. When you "get latest" and bring in code that relies on a schema change, you also bring in the revised schemas, stored procedures etc. You rigght-click the database project and Deploy; that gets your local schema and sp in sync but doesn't overwrite your data. The job of working out the script to get you from your old schema to the new one falls to Visual Studio, not to you or your DBA. We also have "populate" scripts for things like lists of provinces and a deploy runs them for you.
So much better than the old way which always fell apart at high stress times, with people checking in code then going home and nobody knowing what columns to add to make the code work etc.

Database source control vs. schema change scripts

Building and maintaining a database that is then deplyed/developed further by many devs is something that goes on in software development all the time. We create a build script, and maintain further update scripts that get applied as the database grows over time. There are many ways to manage this, from manual updates to console apps/build scripts that help automate these processes.
Has anyone who has built/managed these processes moved over to a Source Control solution for database schema management? If so, what have they found the best solution to be? Are there any pitfalls that should be avoided?
Red Gate seems to be a big player in the MSSQL world and their DB source control looks very interesting:
http://www.red-gate.com/products/solutions_for_sql/database_version_control.htm
Although it does not look like it replaces the (default) data* management process, so it only replaces half the change management process from my pov.
(when I'm talking about data, I mean lookup values and that sort of thing, data that needs to be deployed by default or in a DR scenario)
We work in a .Net/MSSQL environment, but I'm sure the premise is the same across all languages.
Similar Questions
One or more of these existing questions might be helpful:
The best way to manage database changes
MySQL database change tracking
SQL Server database change workflow best practices
Verify database changes (version-control)
Transferring changes from a dev DB to a production DB
tracking changes made in database structure
Or a search for Database Change
I look after a data warehouse developed in-house by the bank where I work. This requires constant updating, and we have a team of 2-4 devs working on it.
We are fortunate because there is only the one instance of our "product", so we do not have to cater for deploying to multiple instances which may be at different versions.
We keep a creation script file for each object (table, view, index, stored procedure, trigger) in the database.
We avoid the use of ALTER TABLE whenever possible, preferring to rename a table, create the new one and migrate the data over. This means that we don't have to look through a history of ALTER scripts - we can always see the up to date version of every table by looking at its create script. The migration is performed by a separate migration script - this can be partly auto-generated.
Each time we do a release, we have a script which runs the create scripts / migration scripts in the appropriate order.
FYI: We use Visual SourceSafe (yuck!) for source code control.
I've been looking for a SQL Server source control tool - and came across a lot of premium versions that do the job - using SQL Server Management Studio as a plugin.
LiquiBase is a free one but i never quite got it working for my needs.
There is another free product out there though that works stand along from SSMS and scripts out objects and data to flat file.
These objects can then be pumped into a new SQL Server instance which will then re-create the database objects.
See gitSQL
Maybe you're asking for LiquiBase?

Transferring changes from a dev DB to a production DB

Say I have a website and a database of that website hosted locally on my computer (for development) and another database hosted (for production)...ie first I do the changes on the dev db and then I do the changes to the prod DB.
What is the best way to transfer the changes that I did on the local database to the hosted database?
If it matters, I am using MS Sql Server (2008)
The correct way to do this with Visual Studio and SQL Server is to add a Database Project to the web app solution. The database project should have SQL files that can recreate the entire database completely on a new server along with all the necessary tables, procedures users and roles.
That way, they are included in the source control for all the rest of the code as well.
There is a Changes sub-folder in the Database Project where I put the SQL files that apply any new alterations or additions to the database for subsequent versions.
The SQL in the files should be written with proper "if exists" blocks such that it can be run safely multiple times on an already updated database without error.
As a rule, you should never make your changes directly in the database - instead modify the SQL script in the project and apply it to the database to make sure your source code (the SQL files) is always up to date.
We do this in the (Ruby on) Rails world by writing "migrations," which capture the changes you make to the DB structure at each point. These are run with a migration tool (a task for rake), which also writes to a DB table so it knows whether a particular migration has been run or not.
You could make a structure like this for your dev platform (.Net?), but I think that in other answers to this question people will suggest available tools for handling database versioning in your development platform, or perhaps for your specific DB.
I don't know any of these, but check out this list. I see a lot of pay things out there, but there must be something free. Also check this out.
I migrate changes via change scripts written by developers when they have tested/verified their changes. (The exception being moving large data.) All scripts are stored in a Source control system. and can be verified by DBAs.
It is manual, sometime time consuming but effective, safe and controled process.
Databases are too vital to copy from dev.
There are tools to help create/verify these scripts.
See http://www.red-gate.com/
I have used their tools to compare 2 databases to create scripts.
Brian
If the changes are small, I sometimes make them by hand. For larger changes, I use Red Gate's SQL Compare to generate change scripts. These are hand-verified and run in the QA environment first to make sure they don't break anything. For large changes, we run a special backup prior to making the change both in QA and in production.
We used to use the approach provided by Ron. It makes sense for a big project with dedicated team of DBAs. But if you do not have a dedicated developers who write code only for DB this approach is time and resource expensive.
The approach to use RedGate DB compare is also not good. You still have a do a lot of manual work you can skip some step by mistake.
It needs something better. This is was the reason why we built the "Agile DB Recreation/Import/Reverse/Export tool"
The tool is free.
Advantages: your developers use any prefered tools to develop DEV DB. Then they run the DB RIRE and it makes reverseengeniring DB (tables, views, stor proc, etc) and export data into XML files. XML files you can keep in the any code repository system.
And the second step is to run DB RIRE one more time to generate difference scripts between structure and data in XML files and in Production DB.
Of course you can make as much iterations as you need.

How to keep code base and database schema in synch?

So recently on a project I'm working on, we've been struggling to keep a solution's code base and the associated database schema in synch (Database = SQL Server 2008).
Database changes occur fairly regularly (adding columns, constraints, relationships, etc) and as a result it's not uncommon for people to do a 'Get Latest' from source control and
find that they also need to rebuild the database as well (and sometimes they forget to do the latter).
We're not using VSTS: Database Edition (DataDude) but the standard Visual Studio database project with a script (batch file) which tears down and recreates the database from T-SQL scripts. The solution is a .Net & ASP.net solution with LINQ to SQL underlying as the ORM.
Anyone have ideas on an approach to take (automated or not) which would keep everyone up to date with the latest database schema?
Continuous integration with MSBuild is an option, but only helps pick up any breaking changes committed, it doesn't really help in the scenario I highlighted above.
We are using Team Foundation Server, if that helps..
We try to work forward from the creation scripts.
i.e a change to the database is not authorised unless the script has been tested and checked into source control.
But this assumes that the database team is integrated with your app team which is usually not the case in a large project...
(I was tempted to answer this "with great difficulty")
EDIT: Tools won't help you if your process isn't right.
Ok although its not the entire solution, you should include an assertion in the Application code that links up to the database to assert the correct schema is being used, that way at least it becomes obvious, and you avoid silent bugs and people complaining that stuff went crazy all of the sudden.
As for the schema version, you could use some database specific functionality if available, but i personally prefer to declare a schema version table and keep the version number in there, that way its portable and can be checked with a simple select statement
have a look at DB Ghost - you can create a dbp using the scripter in seconds and then manage all your database code with the change manager. www.dbghost.com
This is exactly what DB Ghost was designed to handle.
We basically do things the way you are, with the generation script checked into source control as well. I'm the designated database master so all changes to the script itself are done through me. People send me scripts of the changes they have made, I update my master copy of the schema, run a generate scripts (SSMS) to produce the new DB script, and then check it in. I keep my copy of the code current with any changes that are being made elsewhere. We're a small shop so this works pretty well for us. I realize that it probably doesn't scale.
If you are not using Visual Studio Database Professional Edition, then you will need another tool that can break the database down into its elemental pieces so that they are managable and changeable in an easier manner.
I'd recommend seriously considering Redgate's SQL tools if you want to maintain sanity over all your database changes and updates.
SQL Packager
SQL Multi Script
SQL Refactor
Use a tool like RedGate SQL Compare to generate the change schema between any given version of the database. You can then check that file into source code control
Have a look at this question: dynamic patching of databases. I think it's similar enough to your problem to be helpful.
My solution to this problem is simple. Define everything as XML, and make sure that both the database, the ORM and the UI are generated from this XML, no exceptions. That way, you can use code generation tools to quickly regenerate the database creation script, which will alter your schema while (hopefully) preserving some data. It takes some effort to do, but the net result is well worth it.

Deploying SQL Server Databases from Test to Live

I wonder how you guys manage deployment of a database between 2 SQL Servers, specifically SQL Server 2005.
Now, there is a development and a live one. As this should be part of a buildscript (standard windows batch, even do with current complexity of those scripts, i might switch to PowerShell or so later), Enterprise Manager/Management Studio Express do not count.
Would you just copy the .mdf File and attach it? I am always a bit careful when working with binary data, as this seems to be a compatiblity issue (even though development and live should run the same version of the server at all time).
Or - given the lack of "EXPLAIN CREATE TABLE" in T-SQL - do you do something that exports an existing database into SQL-Scripts which you can run on the target server? If yes, is there a tool that can automatically dump a given Database into SQL Queries and that runs off the command line? (Again, Enterprise Manager/Management Studio Express do not count).
And lastly - given the fact that the live database already contains data, the deployment may not involve creating all tables but rather checking the difference in structure and ALTER TABLE the live ones instead, which may also need data verification/conversion when existing fields change.
Now, i hear a lot of great stuff about the Red Gate products, but for hobby projects, the price is a bit steep.
So, what are you using to automatically deploy SQL Server Databases from Test to Live?
I've taken to hand-coding all of my DDL (creates/alter/delete) statements, adding them to my .sln as text files, and using normal versioning (using subversion, but any revision control should work). This way, I not only get the benefit of versioning, but updating live from dev/stage is the same process for code and database - tags, branches and so on work all the same.
Otherwise, I agree redgate is expensive if you don't have a company buying it for you. If you can get a company to buy it for you though, it really is worth it!
For my projects I alternate between SQL Compare from REd Gate and the Database Publishing Wizard from Microsoft which you can download free
here.
The Wizard isn't as slick as SQL Compare or SQL Data Compare but it does the trick. One issue is that the scripts it generates may need some rearranging and/or editing to flow in one shot.
On the up side, it can move your schema and data which isn't bad for a free tool.
Don't forget Microsoft's solution to the problem: Visual Studio 2008 Database Edition. Includes tools for deploying changes to databases, producing a diff between databases for schema and/or data changes, unit tests, test data generation.
It's pretty expensive but I used the trial edition for a while and thought it was brilliant. It makes the database as easy to work with as any other piece of code.
Like Rob Allen, I use SQL Compare / Data Compare by Redgate. I also use the Database publishing wizard by Microsoft. I also have a console app I wrote in C# that takes a sql script and runs it on a server. This way you can run large scripts with 'GO' commands in it from a command line or in a batch script.
I use Microsoft.SqlServer.BatchParser.dll and Microsoft.SqlServer.ConnectionInfo.dll libraries in the console application.
I work the same way Karl does, by keeping all of my SQL scripts for creating and altering tables in a text file that I keep in source control. In fact, to avoid the problem of having to have a script examine the live database to determine what ALTERs to run, I usually work like this:
On the first version, I place everything during testing into one SQL script, and treat all tables as a CREATE. This means I end up dropping and readding tables a lot during testing, but that's not a big deal early into the project (since I'm usually hacking the data I'm using at that point anyway).
On all subsequent versions, I do two things: I make a new text file to hold the upgrade SQL scripts, that contain just the ALTERs for that version. And I make the changes to the original, create a fresh database script as well. This way an upgrade just runs the upgrade script, but if we have to recreate the DB we don't need to run 100 scripts to get there.
Depending on how I'm deploying the DB changes, I'll also usually put a version table in the DB that holds the version of the DB. Then, rather than make any human decisions about which scripts to run, whatever code I have running the create/upgrade scripts uses the version to determine what to run.
The one thing this will not do is help if part of what you're moving from test to production is data, but if you want to manage structure and not pay for a nice, but expensive DB management package, is really not very difficult. I've also found it's a pretty good way of keeping mental track of your DB.
If you have a company buying it, Toad from Quest Software has this kind of management functionality built in. It's basically a two-click operation to compare two schemas and generate a sync script from one to the other.
They have editions for most of the popular databases, including of course Sql Server.
I agree that scripting everything is the best way to go and is what I advocate at work. You should script everything from DB and object creation to populating your lookup tables.
Anything you do in UI only won't translate (especially for changes... not so much for first deployments) and will end up requiring a tools like what Redgate offers.
Using SMO/DMO, it isn't too difficult to generate a script of your schema. Data is a little more fun, but still doable.
In general, I take "Script It" approach, but you might want to consider something along these lines:
Distinguish between Development and Staging, such that you can Develop with a subset of data ... this I would create a tool to simply pull down some production data, or generate fake data where security is concerned.
For team development, each change to the database will have to be coordinated amongst your team members. Schema and data changes can be intermingled, but a single script should enable a given feature. Once all your features are ready, you bundle these up in a single SQL file and run that against a restore of production.
Once your staging has cleared acceptance, you run the single SQL file again on the production machine.
I have used the Red Gate tools and they are great tools, but if you can't afford it, building the tools and working this way isn't too far from the ideal.
I'm using Subsonic's migrations mechanism so I just have a dll with classes in squential order that have 2 methods, up and down. There is a continuous integration/build script hook into nant, so that I can automate the upgrading of my database.
Its not the best thign in the world, but it beats writing DDL.
RedGate SqlCompare is a way to go in my opinion. We do DB deployment on a regular basis and since I started using that tool I have never looked back.
Very intuitive interface and saves a lot of time in the end.
The Pro version will take care of scripting for the source control integration as well.
I also maintain scripts for all my objects and data. For deploying I wrote this free utility - http://www.sqldart.com. It'll let you reorder your script files and will run the whole lot within a transaction.
I agree with keeping everything in source control and manually scripting all changes. Changes to the schema for a single release go into a script file created specifically for that release. All stored procs, views, etc should go into individual files and treated just like .cs or .aspx as far as source control goes. I use a powershell script to generate one big .sql file for updating the programmability stuff.
I don't like automating the application of schema changes, like new tables, new columns, etc. When doing a production release, I like to go through the change script command by command to make sure each one works as expected. There's nothing worse than running a big change script on production and getting errors because you forgot some little detail that didn't present itself in development.
I have also learned that indexes need to be treated just like code files and put into source control.
And you should definitely have more than 2 databases - dev and live. You should have a dev database that everybody uses for daily dev tasks. Then a staging database that mimics production and is used to do your integration testing. Then maybe a complete recent copy of production (restored from a full backup), if that is feasible, so your last round of installation testing goes against something that is as close to the real thing as possible.
I do all my database creation as DDL and then wrap that DDL into a schema maintainence class. I may do various things to create the DDL in the first place but fundamentally I do all the schema maint in code. This also means that if one needs to do non DDL things that don't map well to SQL you can write procedural logic and run it between lumps of DDL/DML.
My dbs then have a table which defines the current version so one can code a relatively straightforward set of tests:
Does the DB exist? If not create it.
Is the DB the current version? If not then run the methods, in sequence, that bring the schema up to date (you may want to prompt the user to confirm and - ideally - do backups at this point).
For a single user app I just run this in place, for a web app we currently to lock the user out if the versions don't match and have a stand alone schema maint app we run. For multi-user it will depend on the particular environment.
The advantage? Well I have a very high level of confidence that the schema for the apps that use this methodology is consistent across all instances of those applications. Its not perfect, there are issues, but it works...
There are some issues when developing in a team environment but that's more or less a given anyway!
Murph
I'm currently working the same thing to you. Not only deploying SQL Server databases from test to live but also include the whole process from Local -> Integration -> Test -> Production. So what can make me easily everyday is I do NAnt task with Red-Gate SQL Compare. I'm not working for RedGate but I have to say it is good choice.

Resources