Synchronizing Database Schemas among Developers

I'm working on a project with a couple of others. We all have local copies of the project, which is kept up to date via an SVN repository.
Because we are in the early stages of development, we often change the schema of our database. This leads to a lot of problems when we sync our code, because we don't have a good way to synchronize our database schemas.
What are some intuitive and easy ways to sync a frequently changing database schema?
We are working with CakePHP (not sure if this helps narrow down the solutions).
EDIT
Found some tools to do this type of work in CakePHP:
http://book.cakephp.org/view/734/Schema-management-and-migrations
And here is an additional website:
http://bakery.cakephp.org/articles/view/cake-db-migrations-v2-1

Database migrations are an easy way to keep your working databases in sync. Essentially, migrations are scripts that update a database to the latest schema and populate new tables with the correct data, so the database is kept in a valid state.
Migrations typically provide a few features:
Tools to automate the creation/update of tables. The tools keep track of the schema version and which scripts need to be run.
Some migration tools provide the ability to run code (C#, Ruby, etc.) instead of SQL scripts. The code libraries provided by the migration tool are usually better able to abstract the database-dependent parts and make your database scripts more database-independent.
There are tools available for Ruby (migrations are an important part of Rails), C# and Java, and surely for other languages as well.
There are a number of questions here on migrations, and I would suggest searching for a migration tool that fits into your tool chain.
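To make the pattern concrete, here is a minimal sketch of what most migration tools do under the hood: a version table plus one numbered, transactional script per schema change. All table and column names are invented for illustration, and transactional DDL (e.g. PostgreSQL) is assumed.

    -- Version-tracking table the tool consults before running anything.
    CREATE TABLE schema_version (
        version    INT       NOT NULL PRIMARY KEY,
        applied_at TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
    );

    -- 003_add_user_email.sql: one schema change per numbered script,
    -- wrapped in a transaction so it applies completely or not at all.
    BEGIN;
    ALTER TABLE users ADD COLUMN email VARCHAR(255);
    UPDATE users SET email = 'unknown@example.com';  -- backfill existing rows
    INSERT INTO schema_version (version) VALUES (3);
    COMMIT;

The tool (or a small runner script) then simply applies, in order, every script whose number is higher than the version recorded in the table.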

CakePHP Database Migrations by Joel Moss is by far the best solution at the moment.
Project description from GitHub:
Database Migrations for CakePHP 1.2 is a shell script, supported by the CakePHP console, that lets you manage your database schema without touching one little bit of SQL. It is based on the Ruby on Rails implementation of Migrations, and uses PEAR's MDB2 package, so it supports all the databases that MDB2 supports.
You could think of Migrations as a version control system for your database. Its power lends itself perfectly to developing as part of a team, as each member can keep their own independent copy of the application's database and use Migrations to make changes to its schema. All the other members then have to do is run a simple two-word shell command, and their database copy is up to date with everyone else's.
The Migrations shell will generate a migration file for each DB change you want to make. This file can include any number of DB changes.

Cake 2.x-compatible plugin on Github:
https://github.com/CakeDC/migrations

Here's a great example using Git, but the same approach applies to SVN: http://thewebandthings.synodicsolutions.com/2009/06/13/cakephp-versioning-database-changes-with-git/

I have started a small project that we use to sync the database between developers and deploy to production. It's still at an early stage, but it has proven to work; it just doesn't have a lot of documentation yet.
http://code.google.com/p/php-mysql-version-control/

Related

Managing DB migration: scripts vs tools

Our project has about 20 developers, but our application makes relatively light use of databases. We have a collection of about 5 databases, all of which are very small, with fewer than 20 tables each, and none of which have millions of rows or anything large.
We have two options on the table for how to manage the evolution of the databases over time:
Some kind of tool. Currently we're using Visual Studio database projects, which contain the current definition of the schema, and look at a reference database to generate a diff script. We then use this diff script to bring the reference database up to date.
Use versioned scripts to build the database up from a baseline. The scripts are manually placed in source control. Any data migration to move data from old columns/tables to new ones would be part of these scripts. A version would be recorded in the DB somewhere, and upgrading would run all scripts between the DB's version and the current version.
The second option seems to be widely used, and I have found an in-depth discussion here: http://odetocode.com/blogs/scott/archive/2008/01/31/versioning-databases-the-baseline.aspx
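As a minimal sketch of that versioned-scripts approach (SQL Server syntax; the VersionInfo table and all object names are invented for illustration), each script checks the recorded version, applies its change, and bumps the version:

    -- Hypothetical upgrade script taking the database from version 12 to 13.
    IF (SELECT MAX(Version) FROM dbo.VersionInfo) = 12
    BEGIN
        ALTER TABLE dbo.Customer ADD MiddleName NVARCHAR(50) NULL;
        INSERT INTO dbo.VersionInfo (Version, AppliedAt) VALUES (13, GETDATE());
    END;

Deployment then consists of running, in order, every script numbered above the version recorded in the target database, which is the same procedure in every environment.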
The problem we have with what we've got at the moment is that we don't have access to our Production databases. This means that to create a release package, we have to restore a backup of Production into another location, generate a diff against that reference DB, and give the script to the production DB team. So our release to production is different from our other environments.
This makes the idea of running versioned scripts appealing, because we'd use the same scripts in all environments and there's no ad-hoc work in deployment (e.g. the manual restore of prod to a reference DB). But given that we have such a small-scale DB situation, I feel like we can hardly be a difficult case for the DB tools out there. What we want is something as simple as possible that is easy to understand.
Do tools such as RedGate's suite make sense for this kind of scenario, or should we go with versioned scripts? Cost isn't so much of an issue; it's more about creating a Pit of Success where maintaining and deploying the DB is as basic and automated as possible.
I'm the product manager at Red Gate for SQL Compare, which generates diff scripts between two databases. I'd like you to take a look at our SQL Source Control tool, which will allow you to track schema changes as and when they're made in development. When it comes to deployment, if you know which schema version is in production, you can generate a deployment script from your source controlled versions. Of course you should always be testing this out in a staging environment before running on production.
Scott's article makes an excellent point with regard to migration scripts, and Denis alludes to more complex changes that can't realistically be second-guessed by comparison tools and would therefore require custom migration scripts to be managed and used appropriately. The next version of SQL Compare, in conjunction with SQL Source Control, will therefore manage both your schema versions and your migration scripts, allowing you to get the best of both worlds. If you'd like to see early screenshots of this, please email me at David dot Atkinson at red-gate dot com. I'd really love to discuss your requirements so we can better design the tool.
In my experience there is always more to it than mere schema changes. If you split a column in two, or move a column to a separate table, or other such things, you need to migrate both the schema and the data.
No tool or script will migrate the actual data for you automatically. At the very most you'll get a diff of the schema, which your devs may find useful as a reminder/checklist for DB version migration scripts (sequences of create/alter/drop and insert/update/delete done in a single transaction).
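For example, a hand-written migration script for the column-split case might look like the following sketch. All names are invented, the split logic is deliberately naive, and transactional DDL (e.g. PostgreSQL) is assumed:

    -- Combined schema + data migration: split person.name into
    -- first_name/last_name, then drop the old column.
    BEGIN;
    ALTER TABLE person ADD COLUMN first_name VARCHAR(100);
    ALTER TABLE person ADD COLUMN last_name  VARCHAR(100);

    -- split_part is PostgreSQL-specific; other databases need their
    -- own string functions here.
    UPDATE person
       SET first_name = split_part(name, ' ', 1),
           last_name  = split_part(name, ' ', 2);

    ALTER TABLE person DROP COLUMN name;
    COMMIT;

This is exactly the kind of step a diff tool cannot infer, because it cannot know where the old column's data is supposed to go.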

Database Schema Change and Entity Framework 4 in Release Production Environment

OK, here we go. After reading a couple of related questions, I still haven't arrived at a really productive solution for my thoughts.
Thoughts:
We all develop applications that then go into production, and after a couple of days the client demands additional features.
Great! You open your database through Server Explorer, create tables, add columns, maybe change datatypes, and then update your model from the database. Nice, everything works OK!
Now you release the project in your usual way (InstallShield, InstallAware, a VS Setup Project...).
You could, for example, use a schema compare tool, get the script, try it (it works) and then add it to InstallShield or another installer that supports this job!
I have been searching, though, for a way for Entity Framework to recognize the changes, or an out-of-the-box way to update the schema based on your model.
In general, is there schema change support in Entity Framework 4?
Thank you.
You never open Server Explorer to modify your schema; that's where it all falls apart. You always write an upgrade script and then apply the upgrade script at the client site. See Version Control and your Database. Or you store the project as a VS DB project and apply a vsdbcmd-based upgrade of the on-site data based on your .schema file, but with this approach you give up a lot of control, and it can ruin your day if you have large tables.
As for the schema upgrade capabilities of modeling tools: they are quite far behind the VSDB upgrade capabilities, and personally I find the explicit upgrade-script-based approach far superior to, and more flexible than, any of the diff-based tools (EF, VSDB, SQL Compare, etc.).
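A minimal sketch of such an explicit upgrade script (SQL Server syntax; all object names are invented for illustration), with a guard so it is safe to run against a site that already has the change:

    -- Re-runnable upgrade step: only add the column if it is missing.
    IF NOT EXISTS (SELECT 1 FROM sys.columns
                   WHERE object_id = OBJECT_ID('dbo.Orders')
                     AND name = 'TrackingNumber')
    BEGIN
        ALTER TABLE dbo.Orders ADD TrackingNumber NVARCHAR(64) NULL;
    END;

Guards like this are what make hand-written scripts practical across many client sites that may be at slightly different states.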

How do you manage your sqlserver database projects for new builds and migrations?

How do you manage your sql server database build/deploy/migrate for visual studio projects?
We have a product that includes a reasonable database part (~100 tables, ~500 procs/functions/views), so we need to be able to deploy new databases of the current version as well as upgrade older databases up to the current version. Currently we maintain separate scripts for creation of new databases and migration between versions. Clearly not ideal, but how is anyone else dealing with this?
This is complicated for us by having many customers who each have their own db instance, rather than say just having dev/test/live instances on our own web servers, but the processes around managing dev/test/live for others must be similar.
UPDATE: I'd prefer not to use any proprietary products like RedGate's (although I have always heard they're really good and will look into that as a solution).
We use Red-Gate SQLCompare and SQLDataCompare to handle this. The idea is simple. Both compare products let you maintain a complete image of the schema or data from selected tables (e.g. configuration tables) as scripts. You can then compare any database to the scripts and get a change script. We keep the scripts in our Mercurial source control and tag (label) each release. Support can then go get the script for any version and use the Redgate tools to either create from scratch or upgrade.
Redgate also has an API product that allows you to do the compare function from your code. For example, this would allow you to have an automatic upgrade function in your installer or in the product itself. We often use this for our hosted web apps as it allows us to more fully automate the rollout process. In our case, we have an MSBuild task that support can execute to do an automatic rollout and upgrade. If you distribute to third-parties, you have to pay a small additional license fee for each distribution that includes the API.
Redgate also has a tool that automatically packages a database install or upgrade. We don't use that one as we have found that the compare against scripts for a version gives us more flexibility.
The Redgate tools also help us in development because they make it trivial to source control the schema and configuration data in a very granular way (each database object can be placed in its own file).
The question was asked before SSDT projects appeared, but that's definitely the way I'd go nowadays, along with hand-crafting migration scripts for structural db changes where there is data that would be affected.
There's also the MS VSTS method (2008 description here); has anyone got a good article on doing this with 2010, and on the pros/cons of using these tools?

Databases and DVCS

I've recently asked a question about how suitable a DVCS is for the corporate environment, and that has sparked another question for me.
One of the plus sides to a DVCS seems to be that you can easily branch and try out new things. My problem starts when I begin to think about database changes. I've always found it tricky to get a DB into a VCS and it just sounds like it's going to be even harder with a DVCS.
So, what's the best way to work with databases and a DVCS?
EDIT: I've started looking into Migrator.NET. What do people think of projects like this for moving easily between versions, specifically with experimental branches in your DVCS?
I think the best way to deal with this issue is to work with DB Schemas, not the databases themselves. In this case, each developer would have their own database to develop against.
Here are some of the options available:
Migrations framework within Ruby on Rails.
South for Django, in addition to the schema being defined in the model classes themselves.
Visual Studio 2008 Team System Database Edition for .NET: You define the schema and the tool can do a diff on schema and data to generate scripts to go between different versions of the database.
These may give you some inspiration on how to deal with putting a database in version control. Another benefit of working with schemas is that you can more readily implement TDD and Continuous Integration (CI): your TDD/CI environment can build up a new version of the database and then run tests against the newly generated environment.
Version all the scripts you're using to manage your database. If you need to have "in-development" changes to a DB, make them on your personal DB until such time as you "publish" your changes.
Database version control is always the most difficult thing in a multi-developer environment.
Typically each user will have their own DB, which is a chimera of some but not all of the DB changes. When they make changes, they'll need to commit their change scripts, and this gets really awkward. The core problems seem to stem from database changes affecting many aspects of the system, from multiple table changes being dependent on each other, and from how to migrate from the old schema to the new one.
Migrating data to a new schema is typically non-trivial. Often you want to default a column when data is copied to the new schema, but NOT default that column for INSERTs in general (see the sketch below).
These are already difficult production-deployment issues, and having to manage the database during development, when the design could be in major flux, in the same way as a major deployment is a lot more work than you usually need to be doing in development. That time could be better spent ensuring that your database is well designed: constraints, foreign keys, etc.
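A sketch of that "default on copy, but not in general" point (MySQL/PostgreSQL syntax; all names are invented for illustration):

    -- New column, deliberately added without a DEFAULT constraint.
    ALTER TABLE account ADD COLUMN status VARCHAR(16);

    -- One-time backfill for the rows carried over into the new schema.
    UPDATE account SET status = 'active' WHERE status IS NULL;

    -- Because no DEFAULT was declared, ordinary INSERTs must still
    -- supply a status value explicitly.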
Because the developers are more likely to step on each other with database changes, we always had a database chokepoint - the developers all developed against the SAME development database and made their changes "live". Then the dev database was version controlled independently. This is not really easy when people are offsite or whatever. Another alternative is to have designated database developers who coordinate changes several developers need to the same table - that doesn't need to be their entire job, but gives you better DB design consistency. Or you can coordinate database revisions so that people become more aware of the DB revs other people are doing and time their changes to wait until a DB rev is available from another developer.
The best approach is to not put the database into VCS in binary form. Period.
If you have a text representation of your database, and you have a special merge tool to resolve conflicts when your database is changed in different branches, then you can start thinking about versioning databases. Otherwise it will be a constant pain in the ass.

Testing and Managing database versions against code versions

As you develop an application, database changes inevitably pop up. The trick, I find, is keeping your database build in step with your code. In the past I have added a build step that executed SQL scripts against the target database, but that is dangerous insofar as you could inadvertently add bogus data, or worse.
My question is what are the tips and tricks to keep the database in step with the code? What about when you roll back the code? Branching?
Version numbers embedded in the database are helpful. You have two choices: embedding values into a table (which allows versioning multiple items) that can be queried, or having an explicitly named object (such as a table or some such) whose existence you can test for.
When you release to production, do you have a rollback plan in the event of an unexpected catastrophe? If you do, is it the application of a schema rollback script? Use your rollback script to roll back the database to a previous code version.
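A sketch of such a forward/rollback pair, kept side by side in source control (MySQL/PostgreSQL syntax; the db_version table and all other names are invented for illustration):

    -- upgrade_041.sql: the forward change plus the version stamp.
    ALTER TABLE invoice ADD COLUMN due_date DATE;
    UPDATE db_version SET version = 41;

    -- rollback_041.sql: reverses both the change and the version stamp.
    ALTER TABLE invoice DROP COLUMN due_date;
    UPDATE db_version SET version = 40;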
You should be able to create your database from scratch into a known state.
While being able to do so is helpful (especially in the early stages of a new project), many (most?) databases will quickly become far too large for that to be possible. Also, if you have any BLOBs then you're going to have problems generating SQL scripts for your entire database.
I've definitely been interested in some sort of DB versioning system, but I haven't found anything yet. So, instead of a solution, you'll get my vote. :-P
You really do want to be able to take a clean machine, get the latest version from source control, build in one step, and run all tests in one step. Making this fast makes you produce good software faster.
Just like external libraries, database configuration must also be in source control.
Note that I'm not saying that all your live database content should be in the same source control, just enough to get to a clean state. (Do back up your database content, though!)
Define your schema objects and your reference data in version-controlled text files. For example, you can define the schema in Torque format, and the data in DBUnit format (both use XML). You can then use tools (we wrote our own) to generate the DDL and DML that take you from one version of your app to another. Our tool can take as input either (a) the previous version's schema & data XML files or (b) an existing database, so you are always able to get a database of any state into the correct state.
I like the way that Django does it. You build models, and when you run syncdb it applies the models that you have created. If you add a model you just need to run syncdb again. This would be easy to have your build script do every time you make a push.
The problem comes when you need to alter a table that has already been made. I do not think that syncdb handles that. It would require you to go in and manually alter the table, and also add the property to the model. You would probably want to version that ALTER statement (see the sketch below). The models would always be under version control, though, so if you needed to you could get a DB schema up and running on a new box without running the SQL scripts. Another problem with this is keeping track of static data that you always want in the DB.
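For instance, the hand-versioned statement might be as simple as this sketch (MySQL syntax; names are invented for illustration), checked in alongside the model change it mirrors:

    -- Hand-versioned ALTER mirroring a model change that syncdb will
    -- not apply on its own.
    ALTER TABLE blog_post ADD COLUMN published_at DATETIME NULL;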
Rails migration scripts are pretty nice too.
A DB versioning system would be great, but I don't really know of such a thing.
"While being able to do so is helpful (especially in the early stages of a new project), many (most?) databases will quickly become far too large for that to be possible. Also, if you have any BLOBs then you're going to have problems generating SQL scripts for your entire database."
Backups and compression can help you there. Sorry, there's no excuse not to be able to get a good set of data to develop against, even if it's just a subset.
Put your database development under version control. I recommend having a look at neXtep Designer:
http://www.nextep-softwares.com/wiki
It is a free GPL product that offers a brand-new approach to database development and deployment, by connecting version information with a SQL generation engine that can automatically compute any upgrade script you need to bring one version of your database up to another. Any existing database can be put under version control by a reverse synchronization.
It currently supports Oracle, MySQL and PostgreSQL; DB2 support is under development. It is a full-featured database development environment where you always work on version-controlled elements from a repository. You can publish your updates by simple synchronization during development, and you can generate exportable database deliveries which you can execute on any targeted database through a standalone installer that validates the versions, performs structural checks, and applies the upgrade scripts.
The IDE also offers you SQL editors, dependency management, support for modular database model components, data model diagrams, SQL clients and much more.
All the documentation and concepts can be found in the wiki.