Represent Oracle SQL cubes with MicroStrategy

Hi, I have several cube tables in an Oracle 12c database. How can I represent them in MicroStrategy? A MicroStrategy Intelligent Cube object doesn't represent these cubes correctly, and it saves the SQL results in memory. I need to execute SQL against the cube tables in real time.

A MicroStrategy cube is an in-memory copy of the results of an SQL query executed against your data warehouse. It's not intended to be a representation of the Oracle cubes.
I assume both these "cubes" organize data in a way that is easy and fast to use for dimensional queries, but I don't think you can directly import an Oracle cube into MicroStrategy IServer memory.
I'm not an expert on Oracle cubes, but I think you need to map dimensions and facts as you would with any other Oracle table. In the end, an Oracle cube is a tool that Oracle provides to organize your data (once dimensions and metrics are defined) and speed up your queries, but you still need to query it: MicroStrategy will write those queries, but MicroStrategy also needs to be aware of your dimensions and metrics (MicroStrategy facts).
Ultimately, a cube speeds up your queries by organizing and aggregating your data, and it seems to me that you have already achieved this with your Oracle cube. A MicroStrategy cube is an in-memory structure that additionally saves the time required by a query against the database.
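For reference, Oracle OLAP exposes a cube to SQL through relational views and the CUBE_TABLE table function, so once dimensions and facts are mapped, MicroStrategy (or anything else) can query it like an ordinary table. A minimal sketch, assuming the python-oracledb driver and the hypothetical GLOBAL.UNITS_CUBE cube name borrowed from Oracle's OLAP sample schema:

```python
# Minimal sketch: querying an Oracle OLAP cube through plain SQL, the same
# way MicroStrategy would once dimensions and facts are mapped.
# Assumptions: python-oracledb driver; GLOBAL.UNITS_CUBE is a hypothetical
# cube name taken from Oracle's OLAP sample schema.
import oracledb

conn = oracledb.connect(user="global", password="...", dsn="dbhost/orclpdb")
cur = conn.cursor()

# CUBE_TABLE presents the cube's cells as relational rows.
cur.execute("""
    SELECT *
    FROM   TABLE(CUBE_TABLE('GLOBAL.UNITS_CUBE'))
    FETCH FIRST 10 ROWS ONLY
""")
for row in cur:
    print(row)
```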

If your requirement is to execute SQL against your database every time, then you need to disable caching on the MicroStrategy side (this can be done on a report-by-report basis, or at the project level).
MicroStrategy Intelligent Cubes aren't going to be a good fit for you here, because they explicitly cache data in order to decrease response time and reduce load on your source database.

Related

Best solution for generating flat reports

I have an SSAS Multidimensional cube (SQL Server 2019) of 350 GB with a retention of 10 years of data.
I noticed that users often use the cube to extract data at the leaf level (Excel tables with multiple columns).
I think that SSAS is not suited to producing these types of reports.
What is the best tool/solution to let users generate flat reports? I know that SQL is good for that, but the users aren't SQL developers.
Could a Power BI model with DirectQuery be more efficient than the actual SSAS cube?
SSAS Multidimensional is exceptionally bad at generating large flattened results; almost anything will be better. A Power BI or SSAS Tabular DirectQuery model is much better, though not ideal for very large extracts; be sure to extract through DAX, not MDX. A Paginated Report exported to CSV or Excel is a good choice too.
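A minimal sketch of a leaf-level extract through DAX rather than MDX, assuming the pyadomd package (a thin Python wrapper over ADOMD.NET, Windows-only) and a hypothetical Sales table:

```python
# Minimal sketch: flat, leaf-level extract via DAX instead of MDX.
# Assumptions: pyadomd (wraps ADOMD.NET, Windows-only); the 'Sales' table
# and its columns are hypothetical.
from pyadomd import Pyadomd

conn_str = "Provider=MSOLAP;Data Source=localhost;Catalog=SalesModel;"

# EVALUATE returns a plain table, which flattens far more cheaply than an
# MDX crossjoin over the same leaf-level attributes.
dax = """
EVALUATE
SELECTCOLUMNS(
    Sales,
    "OrderDate", Sales[Order Date],
    "Amount",    Sales[Amount]
)
"""

with Pyadomd(conn_str) as conn:
    with conn.cursor().execute(dax) as cur:
        for row in cur.fetchall():
            print(row)
```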

Long-running view in SSAS Tabular

I have a SQL Server database where we have created some views based on dim and fact tables. I need to build an SSAS Tabular model based on my tables and views, but one of the views runs for 1.5 hours as a SQL query (in SSMS). I need to use this same view to build my SSAS Tabular model, and 1.5 hours is not acceptable. The view is made up of more than 10 table joins and a lot of WHERE conditions.
1) Can I bring all the tables used in this view into my SSAS Tabular model? If so, I am not sure how to join them all and apply the WHERE clauses inside SSAS to build something similar to my view. Is that possible? If yes, how?
or
2) I build the SSAS model from that view once; if I then want to incrementally load the data daily, what is the best way to do that?
The best option is to set up a proper ETL process. That is:
Extract the tables from your source SQL database into a new SQL database that you control.
Transform the data into a star schema.
Load the data from the star schema into SSAS.
On SQL Server, the most common approach is to use SSIS packages for data extraction, movement, and orchestration, and SQL Server Agent jobs for scheduling.
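As an illustration of the Transform step, the slow view's join logic can be persisted once per load instead of being re-executed at query time. A sketch under assumed names (pyodbc; everything in the src/dw schemas is hypothetical):

```python
# Minimal sketch: persisting the slow view's joins as star-schema tables
# once per ETL run, so SSAS later reads flat tables instead of paying for
# the 1.5-hour query. All table and column names are hypothetical.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=dwhost;"
    "DATABASE=DW;Trusted_Connection=yes;"
)
cur = conn.cursor()

# Dimensions: small, so fully rebuilt each run.
cur.execute("TRUNCATE TABLE dw.DimProduct")
cur.execute("""
    INSERT INTO dw.DimProduct (ProductKey, ProductName, Category)
    SELECT p.ProductID, p.Name, c.CategoryName
    FROM   src.Product AS p
    JOIN   src.Category AS c ON c.CategoryID = p.CategoryID
""")

# Fact: the expensive joins and WHERE conditions from the view run here,
# once, at load time.
cur.execute("TRUNCATE TABLE dw.FactSales")
cur.execute("""
    INSERT INTO dw.FactSales (DateKey, ProductKey, Amount)
    SELECT d.DateKey, p.ProductKey, s.Amount
    FROM   src.Sales AS s
    JOIN   dw.DimDate AS d    ON d.[Date] = s.SaleDate
    JOIN   dw.DimProduct AS p ON p.ProductKey = s.ProductID
    WHERE  s.Status = 'Complete'
""")
conn.commit()
```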
To answer your questions:
Yes, it is certainly possible to bring all of the tables directly from your source system into your tabular model, but please don't do this! You will only create problems for yourself later on when creating DAX calculations.
Incremental loading is something you decide per table imported into your tabular model. Again, this is much easier if you have a proper star schema, as you would typically run full processing on all your dimension tables and incremental processing only on the largest fact tables.
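A minimal sketch of the incremental part at the relational layer, assuming a hypothetical LoadDate watermark column (pyodbc; table and column names are illustrative):

```python
# Minimal sketch: watermark-based incremental load of the largest fact
# table, run daily before reprocessing the tabular model.
# Assumptions: pyodbc; staging.Sales and dw.FactSales are hypothetical
# tables with a LoadDate column used as the watermark.
import pyodbc

conn = pyodbc.connect(
    "DRIVER={ODBC Driver 17 for SQL Server};SERVER=dwhost;"
    "DATABASE=DW;Trusted_Connection=yes;"
)
cur = conn.cursor()

# Copy only rows newer than the last load; dimensions get a full reload
# elsewhere in the ETL, as described above.
cur.execute("""
    INSERT INTO dw.FactSales (DateKey, ProductKey, Amount, LoadDate)
    SELECT s.DateKey, s.ProductKey, s.Amount, s.LoadDate
    FROM   staging.Sales AS s
    WHERE  s.LoadDate > (SELECT COALESCE(MAX(LoadDate), '19000101')
                         FROM   dw.FactSales)
""")
conn.commit()
```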

How is querying a data warehouse different than querying a database?

Say I have a data warehouse like BigQuery or Redshift, where I store data fit for online analytical processing (OLAP). Similarly, suppose I have a database like MySQL or Microsoft SQL Server holding data fit for online transaction processing (OLTP).
On what parameters would querying a data warehouse differ from querying a database?
This is a very general question; nevertheless, I think the following can help you make your decision:
1. How much data you have vs. the relational features you need
2. Cloud solution vs. on-premises
3. Payment models (derived from 2): for example, BigQuery's model is per scan, while others are per storage
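A back-of-the-envelope comparison of the two payment models; all figures below are illustrative assumptions, not current list prices:

```python
# Back-of-the-envelope comparison of per-scan vs. per-storage pricing.
# All figures are illustrative assumptions, not current list prices.
tb_stored = 5              # data held in the warehouse, in TB
tb_scanned_per_month = 20  # total TB read by queries each month

per_scan_price = 5.0       # $ per TB scanned (BigQuery-style, assumed)
per_storage_price = 25.0   # $ per TB stored per month (assumed)

print(f"per-scan model:    ${tb_scanned_per_month * per_scan_price:.0f}/month")
print(f"per-storage model: ${tb_stored * per_storage_price:.0f}/month")
# Heavy ad-hoc scanning favors per-storage pricing; light querying over a
# large stored volume favors per-scan pricing.
```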

AWS Glue: SQL Server multiple partitioned databases ETL into Redshift

Our team is trying to create an ETL pipeline into Redshift, which will be our data warehouse for reporting. We are using Microsoft SQL Server and have partitioned our database into 40+ data sources. We are looking for a way to pipe the data from all of these identical data sources into one Redshift DB.
Looking at AWS Glue, it doesn't seem possible to achieve this. Since Glue opens up the job script for developers to edit, I was wondering whether anyone else has experience with looping through multiple databases and transferring the same table into a single data warehouse. We are trying to avoid having to create a job for each database... unless we can programmatically loop through and create multiple jobs, one per database.
We've also taken a look at DMS, which is helpful for getting the schema and current data over to Redshift, but it doesn't seem like it would handle the multiple partitioned data source issue either.
This sounds like an excellent use case for Matillion ETL for Redshift.
(Full disclosure: I am the product manager for Matillion ETL for Redshift)
Matillion is an ELT tool: it will Extract data from your (numerous) SQL Server databases and Load it, via an efficient Redshift COPY, into staging tables (which can be stored inside Redshift in the usual way, or held on S3 and accessed from Redshift via Spectrum). From there you can add Transformation jobs to clean/filter/join (and much more!) into nice queryable star schemas for your reporting users.
If the table schemas in your 40+ databases are very similar (your question doesn't clarify how you are breaking your data down across those servers: horizontally or vertically), you can parameterise the connection details in your jobs and use iteration to run them over each source database, either serially or with a degree of parallelism.
Pushing transformations down to Redshift works nicely because all of those transformation queries can use the power of a massively parallel, scalable compute architecture. Workload Management configuration can be used to ensure ETL and user queries run concurrently.
Also, you may have other sources of data you want to mash up inside your Redshift cluster; Matillion supports many more - see https://www.matillion.com/etl-for-redshift/integrations/.
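Whatever tool drives it, the load step ultimately issues a Redshift COPY from S3 into a staging table. A minimal sketch, assuming psycopg2 and hypothetical bucket, table, column, and role names:

```python
# Minimal sketch: staging one extract into Redshift via COPY, the load step
# any ELT tool ultimately issues. Bucket, table, column, and role names are
# hypothetical; assumes psycopg2 and an IAM role attached to the cluster.
import psycopg2

conn = psycopg2.connect(
    host="mycluster.abc123.us-east-1.redshift.amazonaws.com",
    port=5439, dbname="dw", user="etl_user", password="...",
)
cur = conn.cursor()

# A source_db column keeps the 40+ otherwise-identical sources
# distinguishable after they are combined into one staging table.
cur.execute("""
    COPY staging.orders (source_db, order_id, customer_id, amount)
    FROM 's3://my-etl-bucket/extracts/orders/'
    IAM_ROLE 'arn:aws:iam::123456789012:role/RedshiftCopyRole'
    FORMAT AS CSV
""")
conn.commit()
```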
You can use AWS DMS for this.
Steps:
set up and configure a DMS instance
set up a target endpoint for Redshift
set up a source endpoint for each SQL Server instance, see
https://docs.aws.amazon.com/dms/latest/userguide/CHAP_Source.SQLServer.html
set up a task for each SQL Server source; you can specify the tables to copy/synchronise, and you can use a transformation to specify which schema name(s) on Redshift you want to write to
You will then have all of the data in identical schemas on Redshift.
If you want to query all of that together, you can either run some transformation code inside Redshift to combine it and make new tables, or you may be able to use views.
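Creating one task per source is straightforward to script; a minimal sketch with boto3, where every ARN, name, and the schema-rename rule is a placeholder:

```python
# Minimal sketch: one DMS replication task per SQL Server source, each
# landing in its own Redshift schema. All ARNs and names are placeholders.
import json
import boto3

dms = boto3.client("dms")

sources = [
    {"name": "shard01", "endpoint_arn": "arn:aws:dms:...:endpoint/src-shard01"},
    {"name": "shard02", "endpoint_arn": "arn:aws:dms:...:endpoint/src-shard02"},
    # ... one entry per partitioned database
]

for src in sources:
    table_mappings = {
        "rules": [
            {   # copy every table in dbo
                "rule-type": "selection", "rule-id": "1", "rule-name": "1",
                "object-locator": {"schema-name": "dbo", "table-name": "%"},
                "rule-action": "include",
            },
            {   # rename the schema so each source lands separately
                "rule-type": "transformation", "rule-id": "2", "rule-name": "2",
                "rule-target": "schema",
                "object-locator": {"schema-name": "dbo"},
                "rule-action": "rename", "value": src["name"],
            },
        ]
    }
    dms.create_replication_task(
        ReplicationTaskIdentifier=f"sync-{src['name']}",
        SourceEndpointArn=src["endpoint_arn"],
        TargetEndpointArn="arn:aws:dms:...:endpoint/tgt-redshift",
        ReplicationInstanceArn="arn:aws:dms:...:rep/my-instance",
        MigrationType="full-load-and-cdc",
        TableMappings=json.dumps(table_mappings),
    )
```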

Is a single table a bad starting point for OLAP cubes (SQL Server Analysis Services)?

I'm going to use a single table to aggregate historical data about our (very big) virtual infrastructure. The table will be composed of 15 to 30 fields, and I estimate 500 to 1000 records a day.
Why a single table? A couple of reasons:
Data is extracted to CSV using PowerShell scripts, so a bulk load into a single table is very easy and fast.
I will use the table to connect Excel for reporting through pivot tables, and for that a single table is perfect (otherwise I would have to create views).
Now my question:
If I'm planning to build cubes on top of this table in the future, is the single-table choice a bad solution?
Do cubes rely on relational databases, or can they easily be built on single-table databases?
Thanks for any suggestions
I can't tell you specifically about SQL Server Analysis Services, but for OLAP you typically use denormalized and aggregated data. That means fewer tables than in a normal relational scenario. And since your data volume is not really big (365k rows/year is small even for OLAP), I don't see any problem with using a single table for your data.
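For what it's worth, a single denormalized table also pivots directly, which is exactly what the Excel use case above relies on. A small sketch, assuming pandas, a SQLAlchemy engine, and hypothetical table/column names:

```python
# Small sketch: the kind of pivot Excel produces, driven straight from the
# single denormalized table. Table and column names are hypothetical;
# assumes pandas and a SQLAlchemy engine for SQL Server.
import pandas as pd
from sqlalchemy import create_engine

engine = create_engine(
    "mssql+pyodbc://dwhost/Infra?driver=ODBC+Driver+17+for+SQL+Server"
)

df = pd.read_sql(
    "SELECT CollectionDate, Cluster, CpuUsedGhz FROM dbo.VmHistory", engine
)

# One flat table pivots with no joins or views in the way.
pivot = df.pivot_table(
    index="CollectionDate", columns="Cluster",
    values="CpuUsedGhz", aggfunc="sum",
)
print(pivot.head())
```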
