Should OLAP cubes always be built upon a data warehouse? - sql-server

I have an OLTP database whose data I want to analyze in depth. I recently learned about OLAP cubes and SQL Server Analysis Services. Building a cube for analyzing the data seems like the right way to go.
However, when looking through the Microsoft SSAS tutorial, I wasn't able to clarify whether the cube is only meant to be built upon data warehouses or OLTP databases. I realize that a data warehouse could be as simple as a database (like I have). If I want to build a cube, will I have to create a warehouse of what I currently have? Should I even be thinking about data warehousing? Both seem like must-haves for data analysis.

Related

Which one is ideal for reporting, multidimensional or relational data warehouse?

We have a transactional database which acts as the operational database for our application. To facilitate report generation, we have been considering creating a separate reporting database. We will use ETL procedures to migrate the data from the transactional database to the reporting database. The question we have is which architecture we should use for the reporting database: a multidimensional database or a relational data warehouse (with the transactional database tables denormalized to a certain extent).
I learned that a multidimensional database is ideal for hierarchical analysis of data at multiple levels and from different perspectives, but we won't be doing that.
Our reports are simply tabular in nature, like typical conventional ones.
In this situation, which one is ideal: a multidimensional or a relational data warehouse?
Please share your thoughts.

Can a SQL Server OLAP cube be loaded from csv files

I have never used or implemented an OLAP cube before, so I'm basically a beginner with this technology.
I am analyzing a project that is converting scanned documents to an on-line MS SQL Server database. From the research I have done, OLAP cubes appear to offer substantially faster queries than OLTP databases. So, using an OLAP cube appears to be the better choice performance-wise.
But, I have only found examples that show how to load an OLAP cube with data from database tables. I have not been able to find any examples of loading data from csv files using tools like BCP or Bulk Insert for OLAP cubes.
Setting up an OLTP database first is possible, but it would only be used to load the OLAP cube. This can certainly be done, but I just wanted to make certain that there isn't an easier way to load an OLAP cube directly with csv files first.
So, does SQL Server provide a way to load an OLAP cube with csv files or does an OLAP cube have to be loaded from an existing OLTP database?
Traditionally, OLAP cubes sit on top of relational "data warehouses". There are a number of advantages to loading your data into a relational database before loading it into an OLAP cube. The process of getting the data there is known as "ETL" (Extract, Transform, Load). For more information, search for "Ralph Kimball" or "Bill Inmon". There is a lot of literature on how to design and build data warehouses and dimensional models ("star schemas").
If you want to use SQL Server Analysis Services for your OLAP cube, you can choose between SSAS multidimensional and SSAS tabular. Currently, SSAS multidimensional cannot load flat files directly; it reads from relational sources such as SQL Server database tables, whereas SSAS tabular supports a wider range of sources (including flat files). Even so, the recommended approach is to load the cube from a relational database and to use a separate tool (for example, SQL Server Integration Services, SSIS) to perform the "ETL" that gets the data from the source into that database.
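As a rough illustration of the "load into a relational database first" step, here is a minimal Python sketch that reads a CSV extract into a staging table. It uses the standard-library sqlite3 module purely as a stand-in for SQL Server, and the table name, column names, and sample rows are all hypothetical; in practice SSIS, BCP, or BULK INSERT would typically do this job.

```python
import csv
import io
import sqlite3

# Hypothetical CSV extract; a real one would come from a file on disk.
CSV_TEXT = """doc_id,doc_type,scanned_on,page_count
1,invoice,2014-01-05,3
2,contract,2014-01-06,12
3,invoice,2014-01-06,1
"""


def load_csv_into_staging(conn, csv_file):
    """Load CSV rows into a relational staging table and return the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS staging_documents ("
        "doc_id INTEGER PRIMARY KEY, doc_type TEXT, "
        "scanned_on TEXT, page_count INTEGER)"
    )
    reader = csv.DictReader(csv_file)
    # Convert the numeric columns while reading (the "transform" step).
    rows = [
        (int(r["doc_id"]), r["doc_type"], r["scanned_on"], int(r["page_count"]))
        for r in reader
    ]
    conn.executemany("INSERT INTO staging_documents VALUES (?, ?, ?, ?)", rows)
    conn.commit()
    return len(rows)


# sqlite3 in-memory database stands in for the relational staging database.
conn = sqlite3.connect(":memory:")
loaded = load_csv_into_staging(conn, io.StringIO(CSV_TEXT))

# Once the data is relational, a cube (or a plain query) can read from it.
pages_by_type = dict(
    conn.execute(
        "SELECT doc_type, SUM(page_count) FROM staging_documents GROUP BY doc_type"
    )
)
print(loaded, pages_by_type)
```

The point of the staging table is that cleaning and type conversion happen before the cube ever sees the data.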

Do I need a cube?

We have a content ingestion system which receives (mobile) digital contents of different types (Music, Ringtone, Video, Game, Wallpaper etc) from various providers (Sony, Universal Music, EA Games etc) and then dispatches them across several online stores (e.g. Store1, Store2 etc).
The managers want to know how many items of each content type came in from each supplier in a given time window, and which stores they went to.
To me it seems like a report that needs an OLAP cube. Am I correct? The problem is that I am a .NET developer without much experience in BI and SQL Server Analysis Services, so I want to keep this simple yet flexible and meaningful. Is there an easier way of having a reporting cube and a data mart to produce reports like this? (I am not sure if we can purchase SSAS and SSIS licenses at all.)
And for such data mart and cube, what structure is suggested?
From your description, a cube isn't necessary. Assuming this data is in a database, you can just write a query to get that result. If you've bought a licence of SQL Server (i.e. not the free edition) then you already have SSAS, SSIS and SSRS.
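For example, the kind of query meant here could look like the sketch below. It is written in Python against an in-memory sqlite3 database only so that it is self-contained; the dispatch_log table, its columns, and the sample rows are assumptions rather than the asker's real schema, and the same GROUP BY would work in SQL Server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dispatch_log (
    content_type TEXT, provider TEXT, store TEXT, dispatched_on TEXT
);
INSERT INTO dispatch_log VALUES
    ('Music', 'Sony', 'Store1', '2013-03-01'),
    ('Music', 'Sony', 'Store1', '2013-03-02'),
    ('Music', 'Sony', 'Store2', '2013-03-02'),
    ('Game', 'EA Games', 'Store1', '2013-03-03'),
    ('Video', 'Universal Music', 'Store2', '2013-04-10');
""")


def dispatch_counts(conn, start, end):
    """Count items per content type, provider and store in [start, end)."""
    return conn.execute(
        """
        SELECT content_type, provider, store, COUNT(*) AS items
        FROM dispatch_log
        WHERE dispatched_on >= ? AND dispatched_on < ?
        GROUP BY content_type, provider, store
        ORDER BY content_type, provider, store
        """,
        (start, end),
    ).fetchall()


# Everything dispatched in March 2013 (the April row is excluded).
march = dispatch_counts(conn, "2013-03-01", "2013-04-01")
for row in march:
    print(row)
```

A view or stored procedure wrapping this query would give the managers their report without any cube.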
Some of a cube's main advantages are:
It's easier for end users to do ad hoc reporting
Performance is often better than a relational (SQL Query) source
Some disadvantages are:
You need to spend processing time 'building' the cube
The query language (MDX) can be a challenge to learn
You don't have an ad hoc user analysis requirement here
An SSAS cube presented in Excel Pivot Tables is probably still the most powerful and flexible end-user query tool out there, with a very low learning curve (most managers/analysts can already use Excel). Once they have a cube they can satisfy many requirements themselves, without you needing to constantly tweak queries. Even when they do want something more complex, you have a perfect source for report/query design and testing.
However, designing and building an SSAS cube is difficult, and cubes are hard to debug.
I suggest starting with Power Pivot - it's a free Excel Add-In that builds an in-memory cube, and presents the results as Excel Pivot Tables. It scales well through advanced compression and the resulting Model can be published to an SSAS Tabular server. The calculation language is DAX which is an improvement on the horrible MDX - DAX reads more like Excel functions.
This site is probably the best starting point for Power Pivot:
http://www.powerpivotpro.com/
You can solve this with just standard queries or views in SQL Server. Tools such as PowerPivot for Excel also allow you to create local cubes with very little effort.
Of course, purchasing an SSAS license and moving to a cube environment has several advantages, despite the extra cost:
Cubes are faster and allow for more complex calculations than SQL queries
With the introduction of the SSAS Tabular Model, making cubes really isn't hard anymore
Creating cubes often forces you to clean up your data model, which has a positive effect on your architecture overall in most cases
Creating a cube might be overkill for your scenario, as your data is neither very complicated nor very big. But Excel might not be enough, as it is hard to pivot data directly from your database.
You can try embedding WebPivotTable into your website or application. It provides all the functions of an Excel pivot table and can connect to CSV/Excel files, or to a database through a web service interface. It is web-based, and the front-end user interface is quite intuitive, so users can easily get what they want with simple drag and drop. A demo and documentation are available.
Of course, if you still want to create a cube, this tool can also be very helpful as it can connect to SSAS cubes directly.

Does Process Cube eliminate the need for SSIS?

I am trying to understand how SQL Server Analysis Services fits into the Business Intelligence field.
I have used SSIS to create copy databases and then SSRS to produce reports, which are accessed by the users.
I know that SSAS is a database engine, which allows you to create Cubes. There is an option in SSAS to process cube (http://technet.microsoft.com/en-us/library/aa216366(v=sql.80).aspx). Is SSAS a replacement for SSIS as it seems to do the ETL for you (using process cube)?
SSIS is an ETL tool providing you with the ability to move, manipulate and consolidate (from multiple sources) data. SSIS tends to be a developer tool used to get the data in the correct shape either for an application or a reporting tool.
SSAS is a cube-building tool that gives the business the ability to slice and dice the data on an ad hoc basis. Developers build the cubes, but the consumers tend to be business users.
I have seen instances of SSAS cubes built pulling data directly from source, but these tend not to work very well, due to the load on the source systems and the complexity involved in structuring the data correctly.
A more typical approach is to utilise SSIS to pull (possibly only daily differences) and stage the data into a dimensional model that can then be cleanly consumed by SSAS. This way both tools are playing to their strengths - SSIS moves the data around and SSAS presents the data in an efficient and user friendly way.
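The staging step described above can be sketched roughly as follows. This is Python with sqlite3 standing in for both the source system and the warehouse, not an actual SSIS package; the src_orders, dim_customer and fact_orders tables and the modified_on column are hypothetical names chosen for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical source (OLTP) table.
CREATE TABLE src_orders (order_id INTEGER PRIMARY KEY, customer TEXT,
                         amount REAL, modified_on TEXT);
-- Minimal dimensional model: one dimension table and one fact table.
CREATE TABLE dim_customer (customer_key INTEGER PRIMARY KEY AUTOINCREMENT,
                           customer TEXT UNIQUE);
CREATE TABLE fact_orders (order_id INTEGER PRIMARY KEY, customer_key INTEGER,
                          amount REAL);
INSERT INTO src_orders VALUES
    (1, 'Acme', 100.0, '2014-05-01'),
    (2, 'Globex', 250.0, '2014-05-01'),
    (3, 'Acme', 75.0, '2014-05-02');
""")


def load_differences(conn, since):
    """Stage only the source rows modified on or after `since`."""
    changed = conn.execute(
        "SELECT order_id, customer, amount FROM src_orders WHERE modified_on >= ?",
        (since,),
    ).fetchall()
    for order_id, customer, amount in changed:
        # Add the dimension member if it is new, then look up its surrogate key.
        conn.execute(
            "INSERT OR IGNORE INTO dim_customer (customer) VALUES (?)", (customer,)
        )
        (key,) = conn.execute(
            "SELECT customer_key FROM dim_customer WHERE customer = ?", (customer,)
        ).fetchone()
        # Re-loading an order replaces its fact row instead of duplicating it.
        conn.execute(
            "INSERT OR REPLACE INTO fact_orders VALUES (?, ?, ?)",
            (order_id, key, amount),
        )
    conn.commit()
    return len(changed)


first = load_differences(conn, "2014-05-01")   # initial load: all rows
second = load_differences(conn, "2014-05-02")  # next day: only the changed row
fact_rows = conn.execute("SELECT COUNT(*) FROM fact_orders").fetchone()[0]
dim_rows = conn.execute("SELECT COUNT(*) FROM dim_customer").fetchone()[0]
print(first, second, fact_rows, dim_rows)
```

The cube is then processed from dim_customer and fact_orders, so the source system is only touched by the small daily extract.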

Cube In Tableau

I have few questions for the experts:
Q1- Can we develop an OLAP cube in Tableau? [I know we can develop reports by connecting to a relational database and also to OLAP cubes (e.g. Cognos or SSAS). But I would like to know whether we can actually develop a cube in Tableau.]
Q2- Is there a difference between creating a dynamic dimension in Tableau and having a standalone dimension table? [Somebody suggested that I create a denormalized table and have Tableau create the dimension on the fly. But what about records that are missing from the child/fact table? For instance, if the customer dimension has 10 records while only 8 exist in the fact table, wouldn't I be missing the other 2 if I connect to the child/fact table directly?]
Q3- What about the performance characteristics of Tableau? [I know Tableau executes SQL statements behind the scenes when it displays data in the reporting tool. If I have millions of records in the denormalized/child/fact table, will it perform well?]
Thanks,
Moiz
Q1. No. Tableau is a visual analytics front-end, not a tool to build a multi-dimensional OLAP store. While Tableau does have its own in-memory engine, it does not work the same way a cube does (pre-aggregating by dimension and hierarchy).
Q2. Sorry, this question makes no sense to me.
Q3. In the scenario you mention, Tableau's performance is defined by your database's ability to respond quickly. If your database responds quickly, Tableau will be fast. If not, Tableau will be slow. No magic here. In instances where your db is slow, try Tableau's in-memory engine.

Resources