Can you ignore columns when doing an SSDT data compare? - sql-server

I frequently use the SSDT data comparison tool to sync database data from our integration environment to our production environment. However, I typically run into scenarios where certain columns should be ignored and never synced. Even if I review the data differences that SSDT finds, the sync operation happens at the row level, and unfortunately I need to control syncing at the cell level.
Anyone have any good solutions?

I just came across this requirement in my project, and in VS 2019 I found the solution.
Create a new data comparison (Tools/SQL Server/New Data Comparison...) and,
in the wizard, after selecting the data sources, click Next to have the wizard enumerate the tables.
Select the desired table, expand it, and deselect the columns you do not want to include in the comparison.
That's it. It is easy to overlook the little expand arrow in front of the table name...

The solution is to use an alternative tool; SSDT doesn't support this at the moment. It would certainly be nice to have.

Related

Version control for SSAS tabular cube data

My concern is about saving tabular cube data before processing, so that I can go back to an earlier version of the data when needed and then compare different versions in a Power BI report.
What is the best way to do this?
Thank you!
I thought about adding a column to the fact table where the version name is saved, but that would increase memory usage in the cube, which is not efficient as the number of versions grows.
I am hoping for a more efficient way to do this.
You can back up and restore data from SSMS. This is a good article about it: https://www.mssqltips.com/sqlservertutorial/3614/sql-server-analysis-services-backup-and-restore/
If you are just comparing a data refresh to the previous version, you might process the model in Visual Studio and compare that version to the deployed database. You can get the server name for the VS development copy from Solution Explorer: click on Model.bim and scroll to the bottom of the Properties pane to find the development server name. Then connect to both databases for your compare. If you want to compare to an older version, I'd restore a backup to a dev server.
How to do the compare? It depends on the amount of data. For tables under 1 million rows, I'd probably dump them to Excel and compare them there. I'm not aware of any tools to help with this. For our models, we just use our normal validation reports, point one at the deployed copy and one at our development copy, and eyeball the changes.
If you want to compare metadata, see http://alm-toolkit.com/. It does everything you could want.
I thought about adding a column to the fact table where the version name is saved, but that would increase memory usage in the cube, not efficient as the number of versions grows... expecting a more efficient way to do this.
Well that's how you do it. If you want the users to be able to write reports that compare two versions, then the versions have to be part of the same model.

How to create a script for SQL Server database create / upgrade from any state

I need to create scripts for creating or updating a database. The scripts are created from my test database or from source control.
The script needs to upgrade a database from any version of my application to the current version so it needs to be agnostic to what already exists in the database.
I do not have access to the databases that will be upgraded.
e.g.
If a table does not exist, the script should create it.
If the table exists, the script should check that all the columns exist (and check their types).
I wrote a lot of this checking code in C#: I have a SQL CREATE TABLE script, and the C# code checks whether the table (and its columns) exists before running the script.
My code is not production-ready, and I wanted to know what ready-made solutions are out there.
I have no experience with frameworks that can do this.
Such an inquiry is off-topic for SO anyway.
But depending on your demands, it may not be too hard to implement something yourself.
One straightforward approach would be to work with incremental schema changes; basically just a chronological list of SQL scripts.
Never change or delete an existing script (unless something really bad is in there).
Instead, just keep adding upgrade scripts for every new version.
Yes, 15 years later you will have accumulated 5,000 scripts.
Trust me, it will be the least of your problems.
To create a new database, just execute the full chain of scripts in chronological order.
For upgrades, there are two possibilities.
Keep a progress list in every database.
That is basically just a table containing the names of all scripts that have already been executed there.
To upgrade, just execute every script that is not in that list already. Add them to the list as you go.
Note: if necessary, this can be done with one or more auto-generated, deployable, static T-SQL scripts.
Make every script itself responsible for recognizing whether or not it needs to do anything.
For example, a 'create table' script checks if the table already exists.
I would recommend a combination of the two:
option #1 for new versions (as it scales a lot better than #2)
option #2 for existing versions (as it may be hard to introduce #1 retroactively on legacy production databases)
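A minimal T-SQL sketch of the two ideas combined; the table, column, and script names below are placeholders invented for illustration, not anything from the question:

    -- Option #1: a progress list, one row per upgrade script that has already run here.
    IF OBJECT_ID('dbo.SchemaChangeLog', 'U') IS NULL
    BEGIN
        CREATE TABLE dbo.SchemaChangeLog
        (
            ScriptName nvarchar(255) NOT NULL PRIMARY KEY,
            AppliedAt  datetime2     NOT NULL DEFAULT SYSUTCDATETIME()
        );
    END;

    -- Option #2: the script itself checks whether it still needs to do anything.
    IF NOT EXISTS (SELECT 1 FROM dbo.SchemaChangeLog WHERE ScriptName = N'0042_create_customer_table')
    BEGIN
        IF OBJECT_ID('dbo.Customer', 'U') IS NULL
        BEGIN
            CREATE TABLE dbo.Customer
            (
                CustomerId int           NOT NULL PRIMARY KEY,
                Name       nvarchar(200) NOT NULL
            );
        END;

        -- Record that this script has been applied to this database.
        INSERT INTO dbo.SchemaChangeLog (ScriptName) VALUES (N'0042_create_customer_table');
    END;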
Depending on how much effort you will put in your upgrade scripts, the 'option #2' part may be able to fix some schema issues in any given database.
In other words, make sure you start off with scripts that are capable of bringing messy legacy databases back in line with the schema dictated by your application.
Future scripts (the 'option #1' part) have less to worry about; they should trust the work done by those early scripts.
No, this approach is not resistant against outside interference, like a rogue sysadmin.
It will not magically fix a messed-up schema.
It's an illusion to think you can do that automatically, without somebody analyzing the problem.
Even if you have a tool that will recreate every missing column and table, that will not bring back the data that used to be in there.
And if you are not interested in recovering data, then you might as well discard (part of) the database and start from scratch.
On the other hand, I would recommend making the upgrade scripts 'idempotent'.
Running a script once or running it twice should make no difference.
For example, use DROP TABLE IF EXISTS rather than DROP TABLE; the latter will throw an error when executed a second time, because the table no longer exists.
That way, in desperate times you may still be able to repair a database semi-automatically, simply by re-running everything.
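For illustration (DROP TABLE IF EXISTS requires SQL Server 2016 or later; on older versions the OBJECT_ID guard below achieves the same thing; the table name is a placeholder):

    -- Idempotent: safe to run any number of times.
    DROP TABLE IF EXISTS dbo.StagingImport;

    -- Equivalent guard for SQL Server versions before 2016.
    IF OBJECT_ID('dbo.StagingImport', 'U') IS NOT NULL
        DROP TABLE dbo.StagingImport;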
If you are talking about schema state, you can look at state-based deployment tools instead of change-based ones (not the official terminology).
You should look at these two tools
SQL Server Data Tools (dacpac / data-tier applications), which is practically free
Redgate has an entire toolset for this (https://www.red-gate.com/solutions/need/automate), which is licensed
The one thing to keep in mind with state-based deployments is that you don't control how the database gets from one state to another. With SSDT,
for example, a column rename becomes a drop-and-recreate of that column, and the same goes for a table rename.
In their defence, the tools do have some protections and do tell you what is about to happen.
EDIT (Updating to address comment below)
It should not be a problem that you can't access the target database while in development. You can still use the above tools, provided you can use the Dacpac/Redgate tooling when you are deploying to the target database.
If you are hoping to have a single dynamic T-SQL script that can update a target database in an unknown state, then that is a recipe for failure/disaster. I do have some suggestions at the end for dealing with this.
The way I see it working is:
Do your development using Dacpac/Redgate.
Build your artefacts (Dacpac / Redgate package).
Copy the artefact to the deployment server along with the tools.
When doing deployments, use the tools (Dacpac via PowerShell) or Redgate manually.
If your only choice is a T-SQL script, then the only option is extensive defensive coding covering all possibilities.
Every object must have an existence check
Every property must have a state check
Every object/property must have a roll forward / roll backward script.
For example, to sync a table (a sketch follows this list):
A script to check that the table exists; if not, create it
A script to check that each property of the table is in the correct state
Check all columns and their data types, and script updates to bring them into line
Check defaults
Check indexes, partitioning, etc.
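A rough T-SQL sketch of that defensive pattern; dbo.Orders and its columns are invented names, and a real script would have to cover many more properties:

    -- Existence check: create the table only if it is missing.
    IF OBJECT_ID('dbo.Orders', 'U') IS NULL
    BEGIN
        CREATE TABLE dbo.Orders
        (
            OrderId   int          NOT NULL PRIMARY KEY,
            Reference nvarchar(50) NULL
        );
    END;

    -- Roll forward a missing column.
    IF COL_LENGTH('dbo.Orders', 'Reference') IS NULL
        ALTER TABLE dbo.Orders ADD Reference nvarchar(50) NULL;

    -- State check: correct the column's data type if it does not match.
    IF EXISTS (
        SELECT 1
        FROM sys.columns AS c
        JOIN sys.types   AS t ON t.user_type_id = c.user_type_id
        WHERE c.object_id = OBJECT_ID('dbo.Orders')
          AND c.name = 'Reference'
          AND NOT (t.name = 'nvarchar' AND c.max_length = 100)  -- nvarchar(50) stores 100 bytes
    )
        ALTER TABLE dbo.Orders ALTER COLUMN Reference nvarchar(50) NULL;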
Even with this, you might not be able to handle every scenario.
The work you are trying to do requires that you start using a standard change control process.
Given the risk of data loss, the issues related to creating columns in a specific sequence, and the potential for column definitions to change,
I recommend you look at defining a baseline version to which you will manually have to upgrade each system.
You can roll your own code and use a schema version table, or use any one of the tools available, such as Redgate SQL Source Control, Visual Studio database projects, DbUp, or others.
I do not believe any tool will bring you from 0 to 1; however, once you baseline, any one of these tools will greatly facilitate your workflow.
Start with this article: Get Your Database Under Version Control.
Here are some tools that can help you:
Octopus Schema Migrations
Flyway By Redgate
Idera Database Change Management
SQL Server Data Tools

Subversion for database (I want something for data values in the database, not for the schema)

I am using GitHub for maintaining versions and code synchronization.
We are a team of two and we are located in different places.
How can we make sure that our databases are synchronized?
Update:
I am a Rails developer, but these days I am working on Drupal projects (where the database is the center of variation). So I want to make sure that the team has a synchronized database, including the values in the various tables.
I need something which keeps our data values synchronized.
A centralized database is a good solution, but things get disturbed when someone works offline.
If you use Visual Studio, you can script your database tables, views, stored procedures and functions as .sql files from a database solution and then check those into version control as well; it's what I currently do at my workplace.
If you don't use Visual Studio, you can still script your SQL as .sql files (but with more work) and then version control them as necessary.
Have a look at Red Gate SQL Source Control - http://www.red-gate.com/products/SQL_Source_Control/
To be honest I've never used it, but their other software is fantastic. And if all you want to do is keep the DB schema in sync (rather than full source control), then I have used their SQL Compare product very successfully in the past.
(ps. I don't work for them!)
You can use SQL Source Control together with SQL Data Compare to source control both schema and data. Here is an article from Redgate: Source controlling data.
These are some of the possibilities.
Use the same database. Set up a central database that everybody can connect to. This way you are sure everybody uses the same database all the time.
After every change, export the database and commit it to the VCS. This option requires discipline and manual labor.
Use some other kind of definition of the schema. For example, Doctrine for PHP can build the database from a YAML definition which can be stored in the VCS. This can be automated more easily than point 2.
Use some other software/script which updates the database.
I feel your pain. I had terrible trouble getting SQL Server to play nice with SVN. In the end I opted for a shared database solution. Every day I run an extensive script to backup all our schema definitions (specifically stored procedures) for version control into text files. Due to the limited number of changes this works well.
I now use this technique for our major project and personal projects too. The only negative is that it relies on being connected all the time. The other answers suggest that full database versioning is very time consuming and I tend to agree. For "live" upgrades we use the Red Gate tools, they do both schema and data compare and it works very well.
http://www.red-gate.com/products/SQL_Data_Compare/. We were using this tool for keeping databases in sync in our company. Later we had some specific demands, so we had to write our own code for synchronization. It depends on how complex your database is and how many changes are happening. It is much simpler if you have a time when no one is working and you can lock the database for synchronization.
Check out OffScale DataGrove.
This product tracks changes to the entire DB - schema and data. You can tag versions at any point in time and return to older states of the DB with a simple command. It also allows you to create virtual, separate copies of the same database so each team member can have his own separate DB. All the virtual copies are tracked in the same repository, so it's super easy to revert your DB to someone else's version (you simply check out their version, just like you do with your source control). This means all your DBs can always be synchronized.
Regarding a centralized DB - just like you don't want to work on the same source code, you don't want to be working on the same DB. It means you'll constantly break each other's code and builds each time someone changes something in the DB.
I suggest that you go with a separate DB for each developer, and sync them using DataGrove.
Disclaimer - I work at OffScale :-)
Try Wizardby. This is my personal project, but I've used it in several previous jobs with a great deal of success.
Basically, it's a tool which lets you specify all changes to your database schema in a database-independent manner and then apply these changes to all your databases.

How do you put a large existing database (schema) under source control?

My DBA just lost some development work that he did on our development database. Poor fella. So naturally our manager asked him, at our status meeting, how this could happen and how we could avoid it happening in the future. "Source control could alleviate the problem," I suggested... The DBA's response: "No, we just back up the server more often." Now I would like to help my DBA understand what source control is and how it fits together with a database schema and development on that schema.
Previously I've tried to explain to him that there's nothing special about the source code behind tables and stored procedures and that it should be in a source control system (TFS in this case). But he just didn't bite. Now, while this mishap is in recent memory, I would like to take another stab at it.
So my question is, do you know of any good advice I could pass on to my DBA and maybe even a couple of resources explaining how you would go about migrating a DB schema to be under source control and find its proper place in the build and deployment processes?
A couple of facts about the environment:
Source Control on a TFS 2008 Server.
Database is a MS SQL server 2008 with >300 tables and >300 other objects (sprocs, triggers, functions etc.).
Clarification:
We have been using DB Ghost and other change management solutions on other projects with other DBAs in the past. We even have the license for VS DB edition! The problem is getting the DBA to even think about this way of developing for the database. He's really old school (i.e. migrating changes manually from environment to environment), and unfortunately he's the only one who knows anything about this particular DB.
See how to version control sql server databases and Do you source control your databases, among many others. Or use the search page. Basically, your approach seems correct. Good luck persuading the DBA...
If you are using Visual Studio Team System, I recommend having a stab at their Database Edition (I think these days it comes with the Developer Edition if you are an MSDN subscriber). What this will allow you to do is script out all your schema, stored procs, views, triggers, etc. and source control these. This should also make the DBA more comfortable, since he will be working with a "Database" version of the tool rather than the "Developer" version (naming can go a long way with people). As you make changes from Visual Studio, you can manage script changes as you work, and source control them.
If your company has an MSDN license, they can use the Visual Studio Database edition. There's a video tutorial of it here.
I have no power of purchase, so I don't know what the cost breakdowns are. But it has the capability of source controlling all the parts of a DB schema, and includes creating change-scripts as well as auto-deploying straight from VS if you want (I wouldn't recommend that).
In general though, it's pretty solid as a database source control option.
Source control for databases can be quite contentious. It's different from using source control for something that produces a binary, because you can't lock the source: a stored proc is a row in a table, and there is no single table to read to get a table definition.
Also, going from version to version is mostly a set of ALTER statements, whereas you script out CREATEs and add them to source control. This makes it harder to use in cases like this.
To me, this is more a procedural error.
Why was the change not done from a script? Forget where the script lives; why was there no reproducible and re-runnable script, perhaps linked to the change tracking number? If the database is reset (loaded from prod), then how would the change have been re-applied to prepare for production? And other questions.
I believe in source control and we use it: but it has limits for database work.
First, you are approaching this incorrectly. If the DBA won't bite on source control and he is making errors that affect the system, the person you need to persuade is his boss.
If it helps, I'm from the old school too and I love having our database objects in source control. How nice to be able to revert one table without having to restore the whole database backup to a different location and then move the table. How much faster and simpler. How nice to be able to compare two different versions and see what changed. How nice to deploy a change and know exactly which database changes (say, for instance only twelve of the 23 possible ones) go with the part you are deploying and not some other unfinished project. How nice to know exactly which scripts were involved in a particular change you had to rollback. How nice that nobody is making on-the-fly changes on production since we now require all production changes to be from source control scripts. There are so many fewer errors and issues to worry about.
Yes, it was a change in how we did business, but we did it through a policy change from on high so there was no argument, and the DBAs went through a couple of times and reverted any objects that differed from source control to the source control version. Now nobody will even think of doing a database change without it being in source control.
As the product manager for SQL Compare I've spoken to many 'traditional' DBAs who are uncomfortable with third party tools mainly because they have a system that works for them and sometimes changing can be difficult. There are many situations where I am convinced that they would benefit from our tools if only they gave them a chance. Frustrating.
One thing you might consider trying is Red Gate's upcoming tool, SQL Source Control. This is designed to build source control into SSMS, in other words it doesn't require DBAs to leave the comfort zone of their management environment. The bad news is that the tool hasn't been released yet. The good news is that we have an Early Access Program. Please visit the following link to find out more about the tool:
http://www.red-gate.com/Products/SQL_Source_Control/index.htm
You can't really put a large database under source control, so your DBA is right.
What you can do practically is put your schema under source control, and maybe a few smallish 'configuration' tables.
One way to source control a database is to store the data in the database and the definition of the database separately.
You can have all the table, procedure and function scripts as SQL files and add them to source control.
Export the database data as INSERT statements into SQL files, each with a fixed size. This is a cumbersome process, as it involves a lot of files that have to be tracked and controlled.
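A rough sketch of what such an export could look like; dbo.Status and its columns are placeholder names invented for illustration:

    -- Build INSERT statements for a small lookup table, ready to save as a .sql file.
    SELECT 'INSERT INTO dbo.Status (StatusId, Name) VALUES ('
           + CAST(StatusId AS varchar(10)) + ', N'''
           + REPLACE(Name, '''', '''''') + ''');'
    FROM dbo.Status
    ORDER BY StatusId;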
I am not sure if the VSS/SVN are able to read and keep history of changes to dump files created by the database backup options.
It's not clear from your question whether you want to protect the data in the DB or the schemas in the DB. If the latter, then you could identify all the important schemas and run a cron job that pulls the schema definitions from the DB and checks them automatically into a source control system (perhaps even via triggers on the schemas?).
But this still just amounts to backing the system up more often. For what you envision you would need source control integrated with the Db tools and I don't know of any product that does that.
(and I shudder to think of VSS integrated into SQL management studio :-(( )
My answer to this same problem was to export all DB objects to text form (more than 136,000 of them) and then create the SourceSafe projects to hold them. Any new or changed objects in the DB now go into the SourceSafe structure, while unchanged ones are left alone.

How to keep code base and database schema in synch?

So recently on a project I'm working on, we've been struggling to keep a solution's code base and the associated database schema in synch (Database = SQL Server 2008).
Database changes occur fairly regularly (adding columns, constraints, relationships, etc.), and as a result it's not uncommon for people to do a 'Get Latest' from source control and find that they also need to rebuild the database (and sometimes they forget to do the latter).
We're not using VSTS: Database Edition (DataDude) but the standard Visual Studio database project with a script (batch file) which tears down and recreates the database from T-SQL scripts. The solution is a .Net & ASP.net solution with LINQ to SQL underlying as the ORM.
Anyone have ideas on an approach to take (automated or not) which would keep everyone up to date with the latest database schema?
Continuous integration with MSBuild is an option, but it only helps pick up breaking changes that have been committed; it doesn't really help in the scenario I highlighted above.
We are using Team Foundation Server, if that helps..
We try to work forward from the creation scripts,
i.e. a change to the database is not authorised unless the script has been tested and checked into source control.
But this assumes that the database team is integrated with your app team, which is usually not the case in a large project...
(I was tempted to answer this "with great difficulty")
EDIT: Tools won't help you if your process isn't right.
OK, although it's not the entire solution, you should include an assertion in the application code that connects to the database to assert that the correct schema is being used. That way it at least becomes obvious, and you avoid silent bugs and people complaining that stuff went crazy all of a sudden.
As for the schema version, you could use some database-specific functionality if available, but I personally prefer to declare a schema version table and keep the version number in there; that way it's portable and can be checked with a simple SELECT statement.
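A minimal sketch of that idea; the table name and the version number are placeholders:

    -- Created once per database; holds the schema version the upgrade scripts last brought it to.
    CREATE TABLE dbo.SchemaVersion
    (
        Version int NOT NULL
    );
    INSERT INTO dbo.SchemaVersion (Version) VALUES (42);

    -- The application runs this at startup and refuses to start
    -- if the value differs from the version its code expects.
    SELECT Version FROM dbo.SchemaVersion;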
Have a look at DB Ghost - you can create a dbp using the scripter in seconds and then manage all your database code with the change manager: www.dbghost.com
This is exactly what DB Ghost was designed to handle.
We basically do things the way you are, with the generation script checked into source control as well. I'm the designated database master, so all changes to the script itself are done through me. People send me scripts of the changes they have made, I update my master copy of the schema, run Generate Scripts (SSMS) to produce the new DB script, and then check it in. I keep my copy of the code current with any changes that are being made elsewhere. We're a small shop, so this works pretty well for us. I realize that it probably doesn't scale.
If you are not using Visual Studio Database Professional Edition, then you will need another tool that can break the database down into its elemental pieces so that they are manageable and changeable in an easier manner.
I'd recommend seriously considering Redgate's SQL tools if you want to maintain sanity over all your database changes and updates.
SQL Packager
SQL Multi Script
SQL Refactor
Use a tool like Redgate SQL Compare to generate the change script between any two versions of the database. You can then check that file into source control.
Have a look at this question: dynamic patching of databases. I think it's similar enough to your problem to be helpful.
My solution to this problem is simple. Define everything as XML, and make sure that the database, the ORM and the UI are all generated from this XML, no exceptions. That way, you can use code generation tools to quickly regenerate the database creation script, which will alter your schema while (hopefully) preserving some data. It takes some effort to do, but the net result is well worth it.
