SQL Server Database Project to alter existing database

I have a large legacy database I'm working with and the company would like to make some modifications to it. Most of the modifications would be something like modifying a stored procedure that's used for a report. I'd like to figure out the best way to standardize and document any modifications we make.
One way I'm exploring is using a SQL Server Database Project to organize any update scripts, but I'm not really getting the results I'm looking for. I've tried both importing the entire database and referencing the .dacpac. It turns out that the original database has some validation issues that VS doesn't like, so it doesn't make it easy to just change what I want to change and deploy that without having to fix a lot of unrelated reference issues.
What I know will work fine is just controlling everything in some deployment scripts, where I can just run an alter command on an existing procedure and leave it at that. This will at least let me use version control for all changes I make, but it's still a pretty manual process.
Are there any other approaches anybody can recommend?

Related

How to create a script for SQL Server database create / upgrade from any state

I need to create scripts for creating or updating a database. The scripts are created from my test Database or from my source control.
The script needs to upgrade a database from any version of my application to the current version so it needs to be agnostic to what already exists in the database.
I do not have access to the databases that will be upgraded.
e.g.
If a table does not exist the script should create it.
If the table exists the script should check if all the columns exist (And check their types).
I wrote a lot of this checking code in C#: I have a SQL create-table script, and the C# code checks whether the table (and its columns) exist before running the script.
My code is not production-ready, and I wanted to know what ready-made solutions are out there.
I have no experience with frameworks that can do this.
Such an inquiry is off-topic for SO anyway.
But depending on your demands, it may not be too hard to implement something yourself.
One straightforward approach would be to work with incremental schema changes; basically just a chronological list of SQL scripts.
Never change or delete an existing script (unless something really bad is in there).
Instead, just keep adding upgrade scripts for every new version.
Yes, 15 years later you will have accumulated 5,000 scripts.
Trust me, it will be the least of your problems.
To create a new database, just execute the full chain of scripts in chronological order.
For upgrades, there are two possibilities.
Keep a progress list in every database.
That is basically just a table containing the names of all scripts that have already been executed there.
To upgrade, just execute every script that is not in that list already. Add them to the list as you go.
Note: if necessary, this can be done with one or more auto-generated, deployable, static T-SQL scripts.
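As a minimal sketch of option #1 (the table and script names here are just placeholders, not an established convention):

    -- One-time setup: a table recording which scripts have already been run here.
    IF OBJECT_ID(N'dbo.ScriptHistory', N'U') IS NULL
    BEGIN
        CREATE TABLE dbo.ScriptHistory
        (
            ScriptName nvarchar(260) NOT NULL PRIMARY KEY,
            ExecutedAt datetime2(0)  NOT NULL DEFAULT SYSUTCDATETIME()
        );
    END;
    GO

    -- Wrapper used in every upgrade script, e.g. '0042_add_invoice_table.sql':
    IF NOT EXISTS (SELECT 1 FROM dbo.ScriptHistory
                   WHERE ScriptName = N'0042_add_invoice_table.sql')
    BEGIN
        -- ...the actual change goes here...
        INSERT INTO dbo.ScriptHistory (ScriptName)
        VALUES (N'0042_add_invoice_table.sql');
    END;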
Make every script itself responsible for recognizing whether or not it needs to do anything.
For example, a 'create table' script checks if the table already exists.
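A minimal sketch of such a self-checking script, with an illustrative table name:

    -- Option #2 style: the script checks whether its work has already been done.
    IF OBJECT_ID(N'dbo.Customer', N'U') IS NULL
    BEGIN
        CREATE TABLE dbo.Customer
        (
            CustomerId int IDENTITY(1,1) NOT NULL PRIMARY KEY,
            Name       nvarchar(200) NOT NULL
        );
    END;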
I would recommend a combination of the two:
option #1 for new versions (as it scales a lot better than #2)
option #2 for existing versions (as it may be hard to introduce #1 retroactively on legacy production databases)
Depending on how much effort you will put in your upgrade scripts, the 'option #2' part may be able to fix some schema issues in any given database.
In other words, make sure you start off with scripts that are capable of bringing messy legacy databases back in line with the schema dictated by your application.
Future scripts (the 'option #1' part) have less to worry about; they should trust the work done by those early scripts.
No, this approach is not resistant against outside interference, like a rogue sysadmin.
It will not magically fix a messed-up schema.
It's an illusion to think you can do that automatically, without somebody analyzing the problem.
Even if you have a tool that will recreate every missing column and table, that will not bring back the data that used to be in there.
And if you are not interested in recovering data, then you might as well discard (part of) the database and start from scratch.
On the other hand, I would recommend making the upgrade scripts 'idempotent'.
Running a script once or running it twice should make no difference.
For example, use DROP TABLE IF EXISTS rather than DROP TABLE; the latter will throw an exception when executed again.
That way, in desperate times you may still be able to repair a database semi-automatically, simply by re-running everything.
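For instance (the table name is illustrative; note that DROP TABLE IF EXISTS requires SQL Server 2016 or later, so the OBJECT_ID check is the fallback on older versions):

    -- Idempotent drop on SQL Server 2016+:
    DROP TABLE IF EXISTS dbo.StagingImport;

    -- Equivalent pattern for older versions:
    IF OBJECT_ID(N'dbo.StagingImport', N'U') IS NOT NULL
        DROP TABLE dbo.StagingImport;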
If you are talking about schema state, you can look at state-based deployment tools instead of change-based ones (not the official terminology).
You should look at these two tools:
SQL Server Data Tools (dacpac / data-tier applications), which is practically free
Redgate has an entire toolset for this (https://www.red-gate.com/solutions/need/automate), which is licensed
The one thing to keep in mind with state-based deployments is that you don't control how the database gets from one state to another.
With SSDT, for example, a column rename becomes a drop and recreate of that column; the same goes for a table rename.
In their defence, they do have some protections and do tell you what is about to happen.
EDIT (Updating to address comment below)
It should not be a problem that you can't access the TargetDb during development. You can still use the above tools, provided you can run the Dacpac/Redgate tooling when you are deploying to the TargetDb.
If you are hoping to have a dynamic T-SQL script that can update a target database in an unknown state, then that is a recipe for failure/disaster. I do have some suggestions at the end for dealing with this.
The way I see it working is:
Do your development using Dacpac/Redgate
Build your artefacts (Dacpac / Redgate package)
Copy the artefact to the deployment server with the tools
When doing deployments, use the tools (Dacpac PowerShell) or Redgate manually
If your only choice is a T-SQL script, then the only option is extensive, defensive coding that covers all possibilities.
Every object must have an existence check
Every property must have a state check
Every object/property must have a roll forward / roll backward script.
For example, to sync a table (a sketch follows this list):
A script to check the table exists and, if not, create it
A script to check each property of the table is in the correct state
check all columns and their data types, and script updates to bring them into line
check defaults
check indexes, partitioning, etc.
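A rough sketch of what one such defensive check can look like (the table, column, and types are placeholders; a real script would also have to handle constraints, indexes, and data preservation):

    -- Make sure the table exists at all.
    IF OBJECT_ID(N'dbo.Orders', N'U') IS NULL
    BEGIN
        CREATE TABLE dbo.Orders
        (
            OrderId   int          NOT NULL PRIMARY KEY,
            OrderDate datetime2(0) NOT NULL
        );
    END;
    GO

    -- Make sure a column exists with the expected type; add or alter it if not.
    IF COL_LENGTH(N'dbo.Orders', N'CustomerRef') IS NULL
        ALTER TABLE dbo.Orders ADD CustomerRef nvarchar(50) NULL
    ELSE IF EXISTS (SELECT 1
                    FROM sys.columns
                    WHERE object_id = OBJECT_ID(N'dbo.Orders')
                      AND name = N'CustomerRef'
                      AND (system_type_id <> TYPE_ID(N'nvarchar') OR max_length < 100))
        ALTER TABLE dbo.Orders ALTER COLUMN CustomerRef nvarchar(50) NULL;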
Even with this, you might not be able to handle every scenario.
The work you are trying to do requires that you start using a standard change-control process, given the risk of data loss, the issues related to creating columns in a specific sequence, and the potential for column definitions to change.
I recommend you look at defining a baseline version to which you will manually have to upgrade each system.
You can roll your own code and use a schema version table, or use any one of the tools available, such as Redgate SQL Source Control, Visual Studio database projects, DbUp, or others.
I do not believe any tool will bring you from 0 to 1; however, once you baseline, any of these tools will greatly facilitate your workflow.
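If you roll your own, a minimal schema-version table might look something like this (the name and versioning scheme are only an illustration):

    -- Records which schema version a database is at, starting from the agreed baseline.
    IF OBJECT_ID(N'dbo.SchemaVersion', N'U') IS NULL
    BEGIN
        CREATE TABLE dbo.SchemaVersion
        (
            VersionNumber int           NOT NULL PRIMARY KEY,
            AppliedAt     datetime2(0)  NOT NULL DEFAULT SYSUTCDATETIME(),
            Description   nvarchar(200) NULL
        );
        -- Mark the manually established baseline.
        INSERT INTO dbo.SchemaVersion (VersionNumber, Description)
        VALUES (1, N'Baseline');
    END;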
Start with this article Get Your Database Under Version Control
Here are some tools that can help you:
Octopus Schema Migrations
Flyway By Redgate
Idera Database Change Management
SQL Server Data Tools

Managing SQL Files in Source Control

I work on a piece of software that has many tables, views, and stored procedures. Currently, to make it easy for developers to run all of the latest updates on their local databases, and for ease of deployment of the software, we have a large Update.sql file. It creates tables and stored procedures that don't exist and adds/updates/removes data that needs to change. It is designed to be run over and over again without messing up someone's database, applying only the changes that are needed. This is very convenient for the developers and for deployment.
However, I would really love to be able to split all of the database objects (tables, functions, stored procedures, back-fills/data updates) into separate scripts in source control. This would allow us to track changes to individual database objects instead of just one large SQL file.
Is there a good way to get the best of both worlds? Perhaps a free tool that can run all SQL files in a folder and all of its sub-folders? Or some batch script that can merge all of the individual files together into a single file after every check-in?
EDIT 10/27/2017: After reviewing some of the links that the answers have shared, I think this question comes down to finding a way to combine the best parts of State-based and Migration-based database update management. Here is an article that I think breaks down the differences and pros/cons pretty well, but I'll summarize the parts that I am focused on below:
STATE BASED: This is what is used by Visual Studio SQL Server Projects. It is a snapshot of what the database should look like at the current version. Updates to servers are created by comparing the database to this snapshot and auto-generating scripts that will alter tables/views/SPs/etc. to be what they need to be.
Pros:
Version Control: Each database object (table, stored procedure, etc.) is a separate script file. This makes tracking changes made to those objects over time very manageable because you can just view the source control history.
Compilation: If you are using Visual Studio SQL Server Projects, you can actually compile them and they will tell you if your references are all good. For instance, if you drop a column from a table and there was a stored procedure that references that column, this will tell you that the SP references a column that no longer exists, so you can fix it.
Simple Deployment: A project with hundreds of individual database object scripts can update a database either from within Visual Studio using Publish, or by compiling it and deploying the DacPac it produces to SQL Server. So even though there are a bunch of individual files, after compilation it comes down to one file that you work with in the end.
Cons:
Updating data: In the real world, State-based updates often aren't viable. For example, let's say your Contacts table used to have a Full Name column. In version 2, you decide to split this into First Name and Last Name and drop the Full Name column. Normally you would write scripts to add the new columns, convert the data, and then drop the old column. However, state-based doesn't work that way: it will just drop the old column and add the new ones, without doing anything to convert the data.
MIGRATION BASED: This is pretty much what we are currently doing, except in one really big file instead of several small files. You start with a baseline (which might be an empty database), and then you write one or more files that alter that baseline to get it to the current version. For instance, Version1.sql might create the Contacts table with the Full Name column, then Version2.sql could create the First Name/Last Name columns, move the data, and then drop the old column. You can either use tools that run each script exactly once, in the right order, or do what we've been doing: have a big script with logic in it that knows which things have been run and which haven't, and only does what needs to be done.
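As a hedged sketch, Version2.sql from that example might look roughly like this (the name-splitting logic is deliberately naive, and it assumes the migration runner executes each script exactly once):

    -- Version2.sql: split Contacts.FullName into FirstName / LastName.
    ALTER TABLE dbo.Contacts ADD FirstName nvarchar(100) NULL,
                                 LastName  nvarchar(100) NULL;
    GO

    -- Naive conversion: everything before the first space becomes the first name.
    UPDATE dbo.Contacts
    SET FirstName = LEFT(FullName, CHARINDEX(N' ', FullName + N' ') - 1),
        LastName  = LTRIM(SUBSTRING(FullName, CHARINDEX(N' ', FullName + N' ') + 1, LEN(FullName)))
    WHERE FullName IS NOT NULL;
    GO

    ALTER TABLE dbo.Contacts DROP COLUMN FullName;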
Pros and Cons: This is basically the reverse of State-based. It gives you a lot of flexibility in how you create your scripts and the power to use real-world logic to update your database the way it needs to be, instead of letting the tool automatically create drop/alter/insert/etc. scripts itself. Much like State-based, as long as you have the right tools, it is easy to deploy. However, it usually isn't very easy to track changes made to database objects over time. If I want to see the full history of changes to a particular table, who made them, and when, there's no easy way to do this, because there is no single file representing that database object with a source control history. Also, I haven't seen any tools that can take a Migration-based strategy and compile it to show you whether the changes made have any reference issues.
SO, MY QUESTION IS: How can I keep the power, flexibility, and ease of use of Migration-based that we are currently using, but also get the best parts of State-based (version control and compilation to check dependencies)? I'm up for a hybrid solution as long as it doesn't mean that my developers have to manage two things (like writing a Migration script but also having to remember to update the SQL project so we can track the history). If I could automate a SQL project to update the database object scripts based on the migration, that would be cool, but it would need to know who made the changes that caused the update and preferably what changeset it happened in.
Thoughts?
With SQL Server Management Studio you can generate scripts to recreate the DB - you could do that and put those in your source versioning system.
Use "Tasks" > "Generate Scripts" and click through the options. You can choose to script each object to its own file.
As for Data ... I think there is some kind of checkbox to export the data as well - not sure though.
e.g. here: Want to create a script to export Data and tables and views to a sql script
I'm not sure of a free tool, but the solution to the below seems interesting...
Run all SQL files in a directory
What I WILL say about that is there are no transactions, so if one of your .sql scripts breaks, it is not going to roll back all of your creations. Other than that though, this should work fine.
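One low-tech way to get "many small files, one deployable script" is a hand-maintained master script run in SQLCMD mode (the file names below are placeholders). It still won't give you a single transaction, but sqlcmd's -b switch at least stops at the first error:

    -- Deploy.sql: run with  sqlcmd -S <server> -d <database> -b -i Deploy.sql
    -- (or in SSMS with SQLCMD Mode enabled).
    -- Each :r pulls in one object script, so the individual files stay in source control.
    :r .\Tables\Contacts.sql
    :r .\Tables\Orders.sql
    :r .\StoredProcedures\usp_GetContacts.sql
    :r .\Data\Backfill_Contacts.sql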

Possible to keep both DB Schema and *data* from "Application Config Tables" in synch between work and home PC's via source control?

I'm working with:
VS2013 Professional, Microsoft SQL Server 2012 - 11.0.5058.0 (X64)
I have kind of a two-part question. What I'm wanting to achieve is: I want, as seamlessly as possible, to be able to work on the same project on my work PC and my home PC. As of right now, I am using online hosted Subversion for source control, which is working fine for application code. The part I have no control over at the moment is the database. I would like to get "all" database changes made at either work or home to synch to my other machine.
By database changes, I mean:
Schema Changes
Data within specific "Application" tables (I obviously do not intend to synch data in all tables)
I followed this just to test getting a DB schema into my project and under source control:
https://msdn.microsoft.com/en-us/library/aa833194%28v=vs.100%29.aspx
It seems to work fine. However, that covers schema changes when working on one machine. If I then go home and want to:
either build from new or update changes to the schema on my home machine, or
update data in base "Application" tables
...I have no clue how to do that, or if it is even possible?
I would think there should be a simple (ha!) way for making the schema changes flow through easily?
But changes to app tables might be harder - I'm happy to write a sql script to manage that, but I'd like to have that script run automatically when I do a "refresh" of my local copy of the database.
For schema changes, there are good blogs out there on using SSDT/DataDude/VS DB Projects. Jamie Thomson has written quite a few times on his experiences. I've written up my experiences here: http://schottsql.blogspot.com/2013/10/all-ssdt-articles.html
For data - you can use the native "Data Compare" option under the "SQL" menu in SSDT. It's not perfect, but it can help. Overall, though, what you'd want is one of a couple of things:
1. Extract data from the shared system, write a task to populate that - batch files w/ BCP, SSIS, or some apps that can actually generate T-SQL for you.
2. Write it yourself, being sure to guard against attempts to insert duplicate data and ensuring the key values remain unchanged.
3. Buy a copy of Red-Gate's SQL Data Compare Pro. You can save the compare options and can then execute those through the command line.
If you need this for multiple developers, option 1 or 2 is probably the best way to go, though you can use SQL Data Compare to get you started with a pretty good script. You should also be able to use something like Mladen Prajdic's SSMS Tools Pack to script result sets to T-SQL inserts that you could re-use.
If you use one of those options and combine it with a post-deploy script (maybe even one that only runs if this is a "new" build), you should be off to a good start.
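For the "Application" table data specifically, an idempotent post-deploy script often boils down to a MERGE per reference table, along these lines (the dbo.AppSetting table and its values are made up for illustration):

    -- Post-deploy: keep a small reference/config table in the desired state on every deployment.
    MERGE dbo.AppSetting AS target
    USING (VALUES
              (N'SmtpServer', N'smtp.example.local'),
              (N'PageSize',   N'50')
          ) AS source (SettingName, SettingValue)
       ON target.SettingName = source.SettingName
    WHEN MATCHED AND target.SettingValue <> source.SettingValue THEN
        UPDATE SET SettingValue = source.SettingValue
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (SettingName, SettingValue)
        VALUES (source.SettingName, source.SettingValue);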

VS2010 Database Project Doesn't Properly Rename Columns

The more I use the DB Project, the less useful I find it. I'm trying to use this project type to manage my db schema and be able to use it to generate differences for test/production schema updates.
Right now I'm stuck trying to rename a column. I am using the object refactor tool, which updates the refactorlog, but that log seems to have no impact on the deployment of the sql file. Every time I deploy or diff, it generates the sql as a column drop and add, which purges all the existing data. You'd think the schema diff tool would have an option to map two columns as a rename, but that feature is conveniently missing.
Also, the Always re-create database option doesn't appear to do anything. Regardless of the state of this checkbox, my deployed sql is exactly the same, which means each time I run it my database is re-created - contrary to what the documentation tells me should happen when I uncheck it to run updates.
If the db project can't do a simple rename, then it's pretty much useless since I can't trust it to render the proper update sql (if and when I figure out how to prevent it from re-creating my database).
At this point I'm about to punt and just manage everything by hand, which I would prefer not to do, because contrary to my "useless" statement, the VS DB tools do some nice things, but 90% of the way there isn't good enough.
Has anyone else had experience dealing with these issues with a VS2010 DB Project who can talk me off the ledge?
VS 2010 Schema Compare does not use the refactor log. That is used only when doing a project deployment. Here is a definitive statement to that effect from the product manager:
http://social.msdn.microsoft.com/Forums/en/vstsdb/thread/fd2c3d02-8792-4d58-b1cb-0c804a1142de
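For what it's worth (this workaround is not from the linked thread), one way to avoid the data-purging drop/add when all you need is a diff is to apply the rename to the target database yourself before running Schema Compare, so the compare sees matching columns; the object names below are made up:

    -- Hypothetical example: rename the column on the target first, then diff.
    EXEC sp_rename 'dbo.Customer.Surname', 'LastName', 'COLUMN';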

How to keep code base and database schema in synch?

So recently on a project I'm working on, we've been struggling to keep a solution's code base and the associated database schema in synch (Database = SQL Server 2008).
Database changes occur fairly regularly (adding columns, constraints, relationships, etc.), and as a result it's not uncommon for people to do a 'Get Latest' from source control and find that they also need to rebuild the database (and sometimes they forget to do the latter).
We're not using VSTS: Database Edition (DataDude) but the standard Visual Studio database project with a script (batch file) which tears down and recreates the database from T-SQL scripts. The solution is a .Net & ASP.net solution with LINQ to SQL underlying as the ORM.
Anyone have ideas on an approach to take (automated or not) which would keep everyone up to date with the latest database schema?
Continuous integration with MSBuild is an option, but it only helps pick up breaking changes that have been committed; it doesn't really help in the scenario I highlighted above.
We are using Team Foundation Server, if that helps.
We try to work forward from the creation scripts; i.e. a change to the database is not authorised unless the script has been tested and checked into source control.
But this assumes that the database team is integrated with your app team, which is usually not the case in a large project...
(I was tempted to answer this "with great difficulty")
EDIT: Tools won't help you if your process isn't right.
OK, although it's not the entire solution, you should include an assertion in the application code that connects to the database, asserting that the correct schema is being used. That way it at least becomes obvious, and you avoid silent bugs and people complaining that stuff went crazy all of a sudden.
As for the schema version, you could use some database-specific functionality if available, but I personally prefer to declare a schema version table and keep the version number in there; that way it's portable and can be checked with a simple select statement.
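A minimal sketch of that idea (the table name and version number are illustrative): the database carries a one-row version table, and the application's start-up assertion is just a SELECT against it.

    -- Lives in the database; bumped by each schema change script.
    CREATE TABLE dbo.SchemaInfo
    (
        SchemaVersion int NOT NULL
    );
    INSERT INTO dbo.SchemaInfo (SchemaVersion) VALUES (42);

    -- What the application's assertion effectively runs at start-up;
    -- it compares the result to the version it expects and fails fast on a mismatch.
    SELECT SchemaVersion FROM dbo.SchemaInfo;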
Have a look at DB Ghost - you can create a dbp using the scripter in seconds and then manage all your database code with the change manager: www.dbghost.com
This is exactly what DB Ghost was designed to handle.
We basically do things the way you are, with the generation script checked into source control as well. I'm the designated database master, so all changes to the script itself go through me. People send me scripts of the changes they have made, I update my master copy of the schema, run Generate Scripts (SSMS) to produce the new DB script, and then check it in. I keep my copy of the code current with any changes that are being made elsewhere. We're a small shop, so this works pretty well for us. I realize that it probably doesn't scale.
If you are not using Visual Studio Database Professional Edition, then you will need another tool that can break the database down into its elemental pieces so that they are manageable and easier to change.
I'd recommend seriously considering Redgate's SQL tools if you want to maintain sanity over all your database changes and updates.
SQL Packager
SQL Multi Script
SQL Refactor
Use a tool like Redgate SQL Compare to generate the schema change script between any two given versions of the database. You can then check that file into source control.
Have a look at this question: dynamic patching of databases. I think it's similar enough to your problem to be helpful.
My solution to this problem is simple. Define everything as XML, and make sure that the database, the ORM, and the UI are all generated from this XML, no exceptions. That way, you can use code generation tools to quickly regenerate the database creation script, which will alter your schema while (hopefully) preserving some data. It takes some effort to do, but the net result is well worth it.
