I have a SQL Server database where we have created some views based on dim and fact tables. I need to build an SSAS tabular model based on my tables and views, but one of the views takes 1.5 hours to run as a SQL query (in SSMS). I need to use this same view to build my SSAS tabular model, and 1.5 hours is not acceptable. The view is made up of more than 10 table joins and a lot of WHERE conditions.
1) Can I bring all the tables used in this view into my SSAS tabular model? I am not sure how I would then join them all and apply the WHERE clauses inside SSAS to build something similar to my view. Is that possible? If so, how?
or
2) If I build the SSAS model from that view once and then want to load the data incrementally every day, what is the best way to do that?
The best option is to set up a proper ETL process. That is:
Extract the tables from your source SQL database into a new SQL database that you control.
Transform the data into a star schema.
Load the data from the star schema into SSAS.
On SQL Server, the most common approach is to use SSIS packages for data extraction, movement, and orchestration, and SQL Server Agent jobs for scheduling.
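As a rough illustration of the transform step (all table and column names below are invented), a nightly load might upsert a dimension and then append new fact rows, resolving surrogate keys along the way:
    -- Hypothetical nightly transform: staging tables -> star schema.
    -- Upsert the product dimension.
    MERGE dbo.DimProduct AS d
    USING stg.Product AS s
        ON d.ProductCode = s.ProductCode
    WHEN MATCHED THEN
        UPDATE SET d.ProductName = s.ProductName, d.Category = s.Category
    WHEN NOT MATCHED THEN
        INSERT (ProductCode, ProductName, Category)
        VALUES (s.ProductCode, s.ProductName, s.Category);

    -- Append new fact rows, translating business keys into surrogate keys.
    DECLARE @LastLoadDate datetime = '20240101';  -- in practice, read from an ETL control table
    INSERT INTO dbo.FactSales (ProductKey, OrderDateKey, Quantity, Amount)
    SELECT d.ProductKey,
           CONVERT(int, CONVERT(char(8), s.OrderDate, 112)),  -- yyyymmdd date key
           s.Quantity,
           s.Amount
    FROM stg.Sales AS s
    JOIN dbo.DimProduct AS d ON d.ProductCode = s.ProductCode
    WHERE s.OrderDate >= @LastLoadDate;
An SSIS package would typically wrap steps like these in data flows or Execute SQL tasks, and a SQL Server Agent job would run the package on a schedule.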
To answer your questions:
Yes, it is certainly possible to bring in all of the tables directly from your source system into your tabular model, but please don't do this! You will only create problems for yourself later on when creating DAX calculations. More information here.
Incrementally loading data is something you decide for each table that is imported into your tabular model. Again, this is much easier if you have a proper star schema: you would typically do full processing on all your dimension tables and incremental processing only on the largest fact tables.
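For example, a large fact table is usually split into partitions (often by month), and only the partition covering recent data is reprocessed each night; a hypothetical source query for the "current month" partition could look like this:
    -- Hypothetical source query for the "current month" partition of FactSales.
    -- Older partitions keep their processed data and are left alone.
    SELECT *
    FROM dbo.FactSales
    WHERE OrderDate >= DATEADD(month, DATEDIFF(month, 0, GETDATE()), 0);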
I need to sync data from several tables in a legacy SQL Server db (source) to a single table in a Postgres db (target). The schema of the source db is absurd, so the query to select the data takes a very long time to run. I'm planning to create an indexed view in the source db, and then somehow sync that indexed view to the Postgres table.
Right now, I simply have a scheduled task that drops the Postgres table (target) and then recreates it from scratch by running the complex query in the source db. This was quick to set up, and it ensures that changes in the source db always eventually make it to the target db, but recreating the table every few hours is (understandably) very slow and expensive. I need a way to replicate ongoing changes (only the new/updated data) from the source view to the target table. Is there a (relatively) simple way to do this?
I'm somewhat familiar with CDC, but I understand that CDC cannot be used on a view, so I don't believe that's an option. Adding "updated at" timestamps to the source tables is not an option, so I can't use that approach. I could add a hash column to the source tables, or maybe add a hash column to the view, so that's an option if that would work. Is there an existing tool/service that does what I need?
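To illustrate the hash-column idea from the question (object names are invented; HASHBYTES with SHA2_256 and CONCAT assume SQL Server 2012 or later), the view could expose a per-row hash that the sync job compares against what is already stored in Postgres:
    -- Hypothetical export view with a per-row hash for change detection.
    CREATE VIEW dbo.vw_LegacyExport
    AS
    SELECT o.OrderID,
           c.CustomerName,
           o.OrderDate,
           o.Amount,
           HASHBYTES('SHA2_256',
                     CONCAT(c.CustomerName, '|', o.OrderDate, '|', o.Amount)) AS RowHash
    FROM dbo.Orders AS o
    JOIN dbo.Customers AS c ON c.CustomerID = o.CustomerID;
Rows whose RowHash differs from the value already stored on the Postgres side would be upserted; unchanged rows would be skipped.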
If you want to view SQL Server DB data in PostgreSQL, you can also use tds_fdw:
https://github.com/tds-fdw/tds_fdw
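A minimal setup on the Postgres side, going by the tds_fdw README (server, credentials, and column list below are placeholders; check the README for the exact options), looks roughly like this:
    -- Expose a SQL Server view as a foreign table in Postgres via tds_fdw.
    CREATE EXTENSION tds_fdw;

    CREATE SERVER mssql_svr
        FOREIGN DATA WRAPPER tds_fdw
        OPTIONS (servername 'sqlserver.example.com', port '1433', database 'LegacyDb');

    CREATE USER MAPPING FOR CURRENT_USER
        SERVER mssql_svr
        OPTIONS (username 'sync_user', password 'secret');

    CREATE FOREIGN TABLE legacy_export (
        order_id      integer,
        customer_name text,
        order_date    date,
        amount        numeric
    )
        SERVER mssql_svr
        OPTIONS (table_name 'dbo.vw_LegacyExport');
You can then SELECT from legacy_export (or INSERT ... SELECT into the local table) directly from Postgres.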
Also, there are some third-party tools that could help you achieve your goal, for example SymmetricDS:
http://www.symmetricds.org/about/overview
Is there a native solution/application/script for creating documentation in Power BI? I am especially interested in documenting all relationships.
Power BI models (and the new tabular models) have DMVs that are separate from the MDSCHEMA rowsets for SSAS multidimensional. While some of the SSAS MD DMVs mostly work, the new TMSchema DMVs work well since they are made specifically for this type of model. The trick is that you must know the connection info: the port and database name change each time you open Power BI Desktop. But generating documentation can be done.
There are a couple of ways to go about it. You can use DAX Studio to get your connection info (a la Chris Webb). Or you can get that info dynamically from Power BI (a la The BIccountant). Using DAX Studio works as a one-time way to get documentation, or if you are ok updating the connection and database info each time you want to run it. The BIccountant way is more dynamic. I haven't tried it, but it looks promising.
To get relationships, you can get your connection info for your Power BI model and then run queries against the following DMVs:
$System.TMSCHEMA_RELATIONSHIPS
$System.TMSCHEMA_TABLES
$System.TMSCHEMA_COLUMNS
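For example, once connected to the model's local instance (from DAX Studio or SSMS), the queries are simply the following, run one at a time:
    SELECT * FROM $SYSTEM.TMSCHEMA_RELATIONSHIPS
    SELECT * FROM $SYSTEM.TMSCHEMA_TABLES
    SELECT * FROM $SYSTEM.TMSCHEMA_COLUMNS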
Pull those down into Power BI (either in the same model you are documenting, or in a different model). Then either
A) Use the Edit Queries functionality to merge the queries to add the Name column from the Tables DMV and the Explicit Name column from the Columns DMV based upon FromTableID, FromColumnID, ToTableID, ToColumnID.
B) Create relationships between these columns using the modeling functionality of Power BI to achieve the same effect.
Once you've done this, cleaned up column names, and hidden or deleted unused fields, you can use Power BI to create your documentation. You can create plain tables and/or use something like a force-directed graph to show relationships. Here's a screenshot of one that I made.
One question, though: let's say the publisher database has 100 tables and I use transactional replication to move the data from those 100 tables to the subscriber database; that would be fine.
But let's say I don't want all 100 tables; instead, I want to create 3-4 views that contain the key information I need from those 100 tables. How would I achieve this?
1) Firstly, I guess the views need to be created on the publisher database.
2) Secondly, do I need to create 3-4 tables in the subscriber database that have the same columns as the views from the publisher database?
3) What should I use to move the data from the publisher views to the subscriber database: some sort of replication, or maybe SSIS or something else?
Replication probably wouldn't be as viable or performant an option as creating an SSIS package for transferring data from those views into the small set of tables in the remote database. SSIS's strongest feature is its ability to transfer large volumes of data quickly from a source into a destination. With a little upkeep, you could potentially transfer just the differences between the two databases at some scheduled interval and have a fairly flexible solution.
SSIS will be the better solution. You would create the tables in your target database and then create the SSIS package(s) to populate them.
SSIS can use queries on tables or views, and it can also execute a stored procedure to retrieve the data.
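As a sketch (names invented), the subscriber-side table and the SSIS source query against the publisher view might be as simple as:
    -- Subscriber: target table shaped like the publisher view.
    CREATE TABLE dbo.KeyCustomerInfo
    (
        CustomerID   int           NOT NULL PRIMARY KEY,
        CustomerName nvarchar(200) NOT NULL,
        Region       nvarchar(50)  NULL,
        TotalSales   money         NULL
    );

    -- Publisher: query used by the SSIS data flow's OLE DB source.
    SELECT CustomerID, CustomerName, Region, TotalSales
    FROM dbo.vw_KeyCustomerInfo;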
I am trying to come up with a way to pull the tables out of an Access database, automate the creation of those same tables in a SQL Server 2008 DB, and move the data to the new tables. This process will happen on a regular basis, and there may be different tables each time.
I would like to do this totally in SSIS.
C# SQL CLR objects are an option.
The main issue I have been running into is how to get the Access table's schema and then convert that to a SQL script that I can run via SSIS.
Any ideas?
TIA
J
SSIS cannot adapt to new tables at runtime. (You can change connections and point a source at a table with a different name, but it must have the same schema.) So it's not really easy to do what I think you are asking: upsize an arbitrary set of tables in an Access DB to SQL Server (mirroring their structure, data, naming, etc.) so that you can then write some straight SQL to transform the data into another SQL database, or elsewhere in the same database.
You can access the SSIS object model from C#, build a package (or modify a template package) programmatically, and then execute it. This might offer the best bang for your buck, but the SSIS object model is kind of deep. The SSIS team blog has finally started putting up examples (a year after I had to figure a lot of this out for myself).
There is always the upsizing wizard, and I'm sure there are some third-party tools.
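If full SSIS automation proves too heavy, one possible shortcut for the schema-plus-data copy of a single table is SELECT ... INTO over OPENROWSET. This is only a hedged sketch: it assumes the Access OLE DB provider (ACE) is installed on the SQL Server box and that ad hoc distributed queries are enabled, and the path and table names are placeholders:
    -- Enable ad hoc distributed queries (one-time server setting).
    EXEC sp_configure 'show advanced options', 1; RECONFIGURE;
    EXEC sp_configure 'Ad Hoc Distributed Queries', 1; RECONFIGURE;

    -- SELECT INTO creates the SQL Server table with an inferred schema
    -- and copies the Access rows in one statement.
    SELECT *
    INTO dbo.ImportedTable
    FROM OPENROWSET('Microsoft.ACE.OLEDB.12.0',
                    'C:\Data\Legacy.accdb';'Admin';'',
                    'SELECT * FROM SomeAccessTable');
The inferred data types are rough, so treat this as a staging step rather than a finished design.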
What is the best method to transfer data from a sales table to a sales history table in SQL Server 2005? The sales history table will be used for reporting.
Take a look at SSAS. OLAP is built for reporting and is easy to query with tools like Excel pivot tables.
Bulk copy is fast, and under the simple or bulk-logged recovery model it can be minimally logged. Run it as one batch at the end of the day.
Deleting the copied records from your production server is a different matter that needs to be planned as part of that server's maintenance approach. Your reporting server solution should not interfere with or affect the production server.
Keep in mind that your reporting server is not meant to be a backup of the data but rather a copy made exclusively for reporting purposes.
Also check the settings of your reporting server; it should be on the simple recovery model.
Most solutions will require two steps:
- copy the records from the source to the target;
- delete the records from the source.
It is essential that your source table have a primary key.
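A hedged sketch of those two steps, keyed on the primary key (table and column names are invented):
    -- Step 1: copy rows that are not yet in the history table.
    INSERT INTO dbo.SalesHistory (SaleID, SaleDate, CustomerID, Amount)
    SELECT s.SaleID, s.SaleDate, s.CustomerID, s.Amount
    FROM dbo.Sales AS s
    WHERE NOT EXISTS (SELECT 1 FROM dbo.SalesHistory AS h WHERE h.SaleID = s.SaleID);

    -- Step 2: delete only rows confirmed to exist in the history table.
    DELETE s
    FROM dbo.Sales AS s
    WHERE EXISTS (SELECT 1 FROM dbo.SalesHistory AS h WHERE h.SaleID = s.SaleID);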
The "best" method depends on a lot of things.
How many records?
Is this a production environment?
What tools do you have?
Unless you are moving a large amount of data, a simple stored procedure should do the trick.
A SQL Server Agent job can manage the timing of when to call the proc.
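A hedged example of the scheduling piece, assuming the archive logic lives in a stored procedure named dbo.usp_ArchiveSales (a name invented here, as are the job and database names):
    -- Create a nightly SQL Server Agent job that calls the archive procedure.
    EXEC msdb.dbo.sp_add_job         @job_name = N'Archive sales';
    EXEC msdb.dbo.sp_add_jobstep     @job_name = N'Archive sales',
                                     @step_name = N'Run archive proc',
                                     @subsystem = N'TSQL',
                                     @database_name = N'SalesDb',
                                     @command = N'EXEC dbo.usp_ArchiveSales;';
    EXEC msdb.dbo.sp_add_schedule    @schedule_name = N'Nightly 1am',
                                     @freq_type = 4,        -- daily
                                     @freq_interval = 1,
                                     @active_start_time = 010000;
    EXEC msdb.dbo.sp_attach_schedule @job_name = N'Archive sales',
                                     @schedule_name = N'Nightly 1am';
    EXEC msdb.dbo.sp_add_jobserver   @job_name = N'Archive sales';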
If you just want to move the data to another table, use bulk copy / BULK INSERT. If you want to build reporting, I would suggest a BI solution such as MS Analysis Services (OLAP).
It is difficult, and in my opinion ugly, to maintain two or more history/archive tables in the same database. For a reporting solution you will be considering all the tables for that piece of information anyway. History/archive tables should only be used if you are going to put the data away and not touch it for a long period of time, i.e., archive it away outside the operational DB.