Version control for SSAS Tabular cube data - sql-server

My concern is about saving tabular cube data before processing, so that I can go back to a previous version of the data when needed and then compare different versions in a Power BI report.
What is the best way to do this?
Thank you!
I thought about adding a column to the fact table where the version name is saved, but that would increase memory usage in the cube, which is not efficient as the number of versions grows.
I'm expecting a more efficient way to do this.

You can back up and restore the database from SSMS. This is a good article about it: https://www.mssqltips.com/sqlservertutorial/3614/sql-server-analysis-services-backup-and-restore/
If you are just comparing a data refresh to the previous version, you might process the model in Visual Studio and compare that version to the deployed database. You can get the server name for the VS development copy from Solution Explorer: click on Model.bim and scroll to the bottom of the Properties pane to find the development server name. Then connect to both databases for your compare. If you want to compare to an older version, I'd restore a backup to a dev server.
How to do the compare? It depends on the amount of data. For tables < 1 million rows, I’d probably dump them to Excel and compare them there. I’m not aware of any tools to help with this. For our models, we just use our normal validation reports and point one at the deployed copy and one at our development copy and eyeball the changes.
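If both copies are reachable from a SQL Server relational instance, one lightweight option is to define a linked server to each Analysis Services database and diff query results in T-SQL. A minimal sketch, assuming linked servers named SSAS_PROD and SSAS_DEV and a model with a 'Date'[Calendar Year] column and a [Total Sales] measure (all hypothetical names):

```sql
-- Rows returned here exist (or differ) in the deployed copy but not in the dev copy;
-- swap the two queries to see differences in the other direction.
SELECT * FROM OPENQUERY(SSAS_PROD,
    'EVALUATE SUMMARIZECOLUMNS(''Date''[Calendar Year], "Sales", [Total Sales])')
EXCEPT
SELECT * FROM OPENQUERY(SSAS_DEV,
    'EVALUATE SUMMARIZECOLUMNS(''Date''[Calendar Year], "Sales", [Total Sales])');
```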
If you want to compare metadata, see http://alm-toolkit.com/. It does everything you could want.

I thought about adding a column to the fact table where the version name is saved, but that would increase memory usage in the cube, which is not efficient as the number of versions grows. I'm expecting a more efficient way to do this.
Well, that's how you do it. If you want the users to be able to write reports that compare two versions, then the versions have to be part of the same model.
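A minimal sketch of that approach on the warehouse side, assuming hypothetical tables dbo.FactSales (the live fact) and dbo.FactSalesHistory (the versioned copy the model actually imports):

```sql
-- Snapshot the current fact rows under a version label before reprocessing.
INSERT INTO dbo.FactSalesHistory (VersionName, DateKey, ProductKey, SalesAmount)
SELECT N'2024-06 pre-process' AS VersionName, DateKey, ProductKey, SalesAmount
FROM dbo.FactSales;
```

In the model you can then put each VersionName in its own partition, so a regular refresh only has to process the partition for the current version rather than the whole history.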

Related

Can you ignore columns when doing an SSDT data compare?

I frequently use the SSDT data comparison tool to sync up database data from our integration environment to our production environment. However, I typically run into scenarios where columns should be ignored and never synced up. Even if I review the data differences that SSDT finds, the sync operation happens on the row level and unfortunately I need to control syncing on the cell level.
Anyone have any good solutions?
I just came across this requirement in my project, and in VS 2019 I found the solution.
Create a new data comparison (Tools/SQL Server/New Data Comparison...) and,
in the wizard, after selecting the data sources, click Next to have the wizard enumerate the tables.
Select the desired table, expand it and deselect the fields you do not want to include in the comparison.
That's it. It was easy to overlook the little expand arrow in front of the table name...
The solution is to use an alternative tool; SSDT doesn't support this at the moment. It would certainly be nice to have.

Subversion for a database (I want something for the data values in the database, not for the schema)

I am using GitHub for maintaining versions and code synchronization.
We are a team of two and we are located in different places.
How can we make sure that our databases are synchronized?
Update:
I am a Rails developer, but these days I am working on Drupal projects (where the database is the center of variation). So I want to make sure that the team has a synchronized database, including the values in the various tables.
I need something which keeps our data values synchronized.
A centralized database is a good solution, but things get disrupted when someone works offline.
If you use Visual Studio, you can script your database tables, views, stored procedures and functions as .sql files from a database solution and then check those into version control as well - it's what I currently do at my workplace.
If you don't use Visual Studio, you can still script your SQL as .sql files (but with more work) and then version control them as necessary.
Have a look at Red Gate SQL Source Control - http://www.red-gate.com/products/SQL_Source_Control/
To be honest I've never used it, but their other software is fantastic. And if all you want to do is keep the DB schema in sync (rather than full source control), then I have used their SQL Compare product very successfully in the past.
(ps. I don't work for them!)
You can use SQL Source Control together with SQL Data Compare to source control both schema and data. Here is an article from Red Gate: Source controlling data.
These are some of the possibilities:
1. Use the same database. Set up a central database that everybody connects to. This way you are sure everybody uses the same database all the time.
2. After every change, export the database and commit it to the VCS. This option requires discipline and manual labor.
3. Use some other kind of definition of the schema. For example, Doctrine for PHP can build the database from a YAML definition which can be stored in the VCS. This is easier to automate than option 2.
4. Use some other software/script which updates the database.
I feel your pain. I had terrible trouble getting SQL Server to play nice with SVN. In the end I opted for a shared database solution. Every day I run an extensive script to backup all our schema definitions (specifically stored procedures) for version control into text files. Due to the limited number of changes this works well.
I now use this technique for our major project and personal projects too. The only negative is that it relies on being connected all the time. The other answers suggest that full database versioning is very time consuming and I tend to agree. For "live" upgrades we use the Red Gate tools, they do both schema and data compare and it works very well.
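As an illustration of where such a dump script can pull the definitions from, here is a minimal T-SQL sketch (the surrounding job that writes each row out to its own .sql file is left out):

```sql
-- Every programmable module (procedures, views, functions, triggers) with its
-- full definition, ready for an external job to write one .sql file per object.
SELECT
    OBJECT_SCHEMA_NAME(m.object_id) AS SchemaName,
    OBJECT_NAME(m.object_id)        AS ObjectName,
    o.type_desc                     AS ObjectType,
    m.definition                    AS Definition
FROM sys.sql_modules AS m
JOIN sys.objects     AS o ON o.object_id = m.object_id
ORDER BY SchemaName, ObjectName;
```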
http://www.red-gate.com/products/SQL_Data_Compare/. We were using this tool for keeping databases in sync in our company. Later we had some specific demands, so we had to write our own code for synchronization. It depends on how complex your database is and how much change is happening. It is much simpler if you have a window when no one is working and you can lock the database for synchronization.
Check out OffScale DataGrove.
This product tracks changes to the entire DB - schema and data. You can tag versions at any point in time and return to older states of the DB with a simple command. It also allows you to create virtual, separate copies of the same database so each team member can have his own separate DB. All the virtual copies are tracked in the same repository, so it's super easy to revert your DB to someone else's version (you simply check out their version, just like you do with your source control). This means all your DBs can always be synchronized.
Regarding a centralized DB - just like you don't want to work on the same source code, you don't want to be working on the same DB. It means you'll constantly break each other's code and builds each time someone changes something in the DB.
I suggest that you go with a separate DB for each developer, and sync them using DataGrove.
Disclaimer - I work at OffScale :-)
Try Wizardby. This is my personal project, but I've used it in several previous jobs with a great deal of success.
Basically, it's a tool which lets you specify all changes to your database schema in a database-independent manner and then apply these changes to all your databases.

Database source control vs. schema change scripts

Building and maintaining a database that is then deployed/developed further by many devs is something that goes on in software development all the time. We create a build script, and maintain further update scripts that get applied as the database grows over time. There are many ways to manage this, from manual updates to console apps/build scripts that help automate these processes.
Has anyone who has built/managed these processes moved over to a Source Control solution for database schema management? If so, what have they found the best solution to be? Are there any pitfalls that should be avoided?
Red Gate seems to be a big player in the MSSQL world and their DB source control looks very interesting:
http://www.red-gate.com/products/solutions_for_sql/database_version_control.htm
Although it does not look like it replaces the (default) data* management process, so it only replaces half the change management process from my point of view.
(when I'm talking about data, I mean lookup values and that sort of thing, data that needs to be deployed by default or in a DR scenario)
We work in a .Net/MSSQL environment, but I'm sure the premise is the same across all languages.
Similar Questions
One or more of these existing questions might be helpful:
The best way to manage database changes
MySQL database change tracking
SQL Server database change workflow best practices
Verify database changes (version-control)
Transferring changes from a dev DB to a production DB
tracking changes made in database structure
Or a search for Database Change
I look after a data warehouse developed in-house by the bank where I work. This requires constant updating, and we have a team of 2-4 devs working on it.
We are fortunate because there is only the one instance of our "product", so we do not have to cater for deploying to multiple instances which may be at different versions.
We keep a creation script file for each object (table, view, index, stored procedure, trigger) in the database.
We avoid the use of ALTER TABLE whenever possible, preferring to rename a table, create the new one and migrate the data over. This means that we don't have to look through a history of ALTER scripts - we can always see the up-to-date version of every table by looking at its create script. The migration is performed by a separate migration script - this can be partly auto-generated.
Each time we do a release, we have a script which runs the create scripts / migration scripts in the appropriate order.
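A minimal sketch of that rename/create/migrate pattern for a hypothetical dbo.Customer table (names and columns are illustrative only):

```sql
-- Keep the old data under a new name, create the new shape, then copy across.
EXEC sp_rename 'dbo.Customer', 'Customer_old';

CREATE TABLE dbo.Customer
(
    CustomerId   int           NOT NULL PRIMARY KEY,
    CustomerName nvarchar(200) NOT NULL,
    CountryCode  char(2)       NULL   -- new column introduced in this release
);

INSERT INTO dbo.Customer (CustomerId, CustomerName, CountryCode)
SELECT CustomerId, CustomerName, NULL
FROM dbo.Customer_old;

-- DROP TABLE dbo.Customer_old;  -- only once the migration has been verified
```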
FYI: We use Visual SourceSafe (yuck!) for source code control.
I've been looking for a SQL Server source control tool - and came across a lot of premium versions that do the job - using SQL Server Management Studio as a plugin.
LiquiBase is a free one, but I never quite got it working for my needs.
There is another free product out there, though, that works standalone from SSMS and scripts out objects and data to flat files.
These objects can then be pumped into a new SQL Server instance which will then re-create the database objects.
See gitSQL
Maybe you're asking for LiquiBase?

How to update all tenants' schemas in a multi-tenant app?

I am developing a multi-tenant app. I chose the "Shared Database/Separate Schemas" approach.
My idea is to have a default schema (dbo) and when deploying this schema, to do an update on the tenants' schemas (tenantA, tenantB, tenantC); in other words, to make synchronized schemas.
How can I synchronize the schemas of tenants with the default schema?
I am using SQL Server 2008.
The first thing you will need is a table or other mechanism to store the version information of the schema, if nothing else so that you can bind your application and schema together. There is nothing more painful than running a version of the application against the wrong schema: failures, corrupted data, etc.
The application should reject the connection or shut down if it is not the right version. You might get some blowback when it is not right, but it protects you from the really bad day when the database corrupts the valuable data.
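A minimal sketch of such a version table, using hypothetical names (works on SQL Server 2008 and later):

```sql
-- The application reads the latest VersionNumber at startup and refuses to run
-- if it does not match the schema version it was built against.
CREATE TABLE dbo.SchemaVersion
(
    VersionNumber int           NOT NULL,
    AppliedOn     datetime2(0)  NOT NULL
        CONSTRAINT DF_SchemaVersion_AppliedOn DEFAULT (SYSUTCDATETIME()),
    Description   nvarchar(200) NULL
);

INSERT INTO dbo.SchemaVersion (VersionNumber, Description)
VALUES (1, N'Initial schema');
```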
You'll need a way to track changes, such as Subversion or something else; from SQL Server you can export the initial schema. From there you will need a mechanism to track changes, using a nice tool like SQL Compare, and match each set of schema changes to an update of the version number in the target database.
We keep each delta in a separate folder beneath the upgrade utility we built. This utility signs on to the server, reads the version info and then applies the transform scripts for the next version in the database until it can find no more upgrade scripts in its sub-folder. This gives us the ability to upgrade a database, no matter how old it is, to the current version. If there are data transforms unique to the tenant, these are going to get tricky.
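As a sketch of how one of those transform scripts might guard itself and record its work (table and column names are hypothetical, and the upgrade utility simply runs the scripts in order):

```sql
-- e.g. script 0002: only valid when the database is at schema version 1.
IF (SELECT MAX(VersionNumber) FROM dbo.SchemaVersion) <> 1
BEGIN
    RAISERROR('Expected schema version 1; aborting upgrade script 0002.', 16, 1);
    RETURN;
END;

ALTER TABLE dbo.Customer ADD CountryCode char(2) NULL;

INSERT INTO dbo.SchemaVersion (VersionNumber, Description)
VALUES (2, N'Added Customer.CountryCode');
```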
Of course you should always make a backup of the database to an external file, preferably with a human-identifiable version number in the name, so you can find it and restore it when the script(s) go bad. And eventually they will, so plan on figuring out how to recover and restore.
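A minimal sketch of a versioned backup taken right before the upgrade scripts run (database name and path are placeholders):

```sql
-- Copy-only backup named after the current schema version and date.
DECLARE @version int, @file nvarchar(260);

SELECT @version = MAX(VersionNumber) FROM dbo.SchemaVersion;

SET @file = N'D:\Backups\TenantDb_v' + CAST(@version AS nvarchar(10)) + N'_'
          + CONVERT(nvarchar(8), GETDATE(), 112) + N'.bak';

BACKUP DATABASE TenantDb TO DISK = @file WITH COPY_ONLY, INIT;
```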
I saw there is some sort of schema upgrader tool in the new VS 2010 but I haven't used it. That might also be useful to you.
There is no magic command to synchronize the schemas as far as I know. You would need to use a tool - either built in house or bought (Check out Red Gate's SQL Compare and SQL Examiner - you need to tweak them to compare different schemas).
Just synchronizing can often be tricky business though. If you added a column, do you need to also fill that column with data? If you split a column into two new columns there has to be conversion code for something like that.
My suggestion would be to very carefully track any scripts that you run against the dbo schema and make sure that they also get run against the other schemas when appropriate. You can then use a tool like SQL Compare as an occasional sanity check to look for any unexpected differences.
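One hedged way to do that in plain T-SQL is to generate the same statement for every tenant schema with dynamic SQL, assuming the tenant schemas follow a naming convention such as tenantA, tenantB, ... (the table and column here are purely illustrative):

```sql
-- Run the change already applied to dbo against every tenant schema.
DECLARE @schema sysname, @sql nvarchar(max);

DECLARE schema_cursor CURSOR LOCAL FAST_FORWARD FOR
    SELECT name FROM sys.schemas WHERE name LIKE N'tenant%';

OPEN schema_cursor;
FETCH NEXT FROM schema_cursor INTO @schema;
WHILE @@FETCH_STATUS = 0
BEGIN
    SET @sql = N'ALTER TABLE ' + QUOTENAME(@schema) + N'.Customer ADD CountryCode char(2) NULL;';
    EXEC sys.sp_executesql @sql;
    FETCH NEXT FROM schema_cursor INTO @schema;
END;

CLOSE schema_cursor;
DEALLOCATE schema_cursor;
```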

SQL Server Object Organisation

I'm not sure if this is valid, however I have a bugbear with SQL Server, and that is that I cannot organise objects into groups.
Imagine I'm working on a new section of work in a large database and I have perhaps 15 objects that I will be regularly using. What I want to do is sort of "favourite" them into a folder so that I don't have to trawl through all the objects in my databases.
I know I could organise objects by schema, however these objects aren't necessarily schema specific; they cross boundaries.
Has anyone come across a method for organising objects into a favourites group? I know SQL Server projects organise scripts, but I can't see that they can organise tables?
Thanks
You can't do that with the native tools (SQL Server Management Studio) but there's a workaround: create a new empty database with those 15 tables - just the schema, not the data. Then when you're writing T-SQL code, you can quickly drag and drop elements out of those tables into your code.
The downside is that changes made in the real database won't be reflected in your working database, but you can automate that with a script to pull out the objects you need and recreate them in your working database. You can run that as often as you like (like every X hours, or as a SQL Agent job that runs when your local dev server starts up) without losing data, since you won't be modifying the structure in your "favorites" database.
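A minimal sketch of that refresh script, assuming a hypothetical scratch database named Favorites and a production database named BigProductionDb (SELECT INTO with WHERE 1 = 0 copies only the column structure, not indexes, constraints or defaults):

```sql
USE Favorites;

-- Recreate schema-only copies of the handful of tables you care about.
IF OBJECT_ID(N'dbo.Customer', N'U') IS NOT NULL DROP TABLE dbo.Customer;
SELECT * INTO dbo.Customer FROM BigProductionDb.dbo.Customer WHERE 1 = 0;

IF OBJECT_ID(N'dbo.OrderHeader', N'U') IS NOT NULL DROP TABLE dbo.OrderHeader;
SELECT * INTO dbo.OrderHeader FROM BigProductionDb.dbo.OrderHeader WHERE 1 = 0;
```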
I know I'm really late to the party, but the question showed up on the right under "Related" and I was curious enough to look.
There is a free add-in for Management Studio that seems to do exactly what you're asking:
http://www.sqltreeo.com/wp/dowload-free-ssms-add-in-to-create-own-folder-for-database-objects/
There is also a $65 commercial add-in which you may want to try as well. I haven't tried either so I'm not sure how well they work or what the paid version offers over the free add-in (if anything).
http://www.skilledsoftware.com/
Also can't hurt to vote for this Connect item and add a comment describing your business use case. While you may find it discouraging that it's been closed as Won't Fix, that is not necessarily a permanent decision:
http://connect.microsoft.com/SQLServer/feedback/details/209340

Resources