Create hierarchies in SSAS Cube from multiple dimensions - sql-server

I have created an SSAS cube for sales. It has a lot of dimensions, and I want to create a hierarchy for the warehouses. My data source has two tables for warehouses: one for the classifications, and another joined to it for the last level, the warehouses themselves. My Classification dimension has ClassificationId and ParentId, and the Warehouse dimension has WarehouseId and ClassificationId. I want to create a hierarchy with all levels. Can I do that?
These are my two dimensions:
1) Classification Dim.
2) Warehouse Dim.
As an example from the pictures: I want to get the levels as [Oteena Warehouses] > [Cairo] > [Main Website Stor] in one dimension.

You cannot create a hierarchy with attributes from two different dimensions.
What you can do, if the two tables share a key (ClassificationId?), is merge the two tables, or add to one of them any relevant columns you want to use as attributes in that hierarchy.
You can do that in the Data Source View, with a named query to join the tables or a named calculation to add individual columns, and create a single dimension from there.
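As a rough illustration of what the merged table would contain, here is a minimal Python sketch that walks the ParentId links and attaches the full classification path to each warehouse. The `Name` columns, the sample Ids, and the helper names are assumptions for illustration; the question only names the Id columns, and in SSAS the merge itself would be done in the Data Source View rather than in Python.

```python
# Hypothetical sample rows; only the Id column names come from the question.
classifications = {
    1: {"ParentId": None, "Name": "Oteena Warehouses"},
    2: {"ParentId": 1, "Name": "Cairo"},
}
warehouses = [
    {"WarehouseId": 10, "ClassificationId": 2, "Name": "Main Website Stor"},
]


def classification_path(classification_id):
    """Walk ParentId links up to the root and return names from root to leaf."""
    path = []
    seen = set()
    current = classification_id
    while current is not None:
        if current in seen:
            raise ValueError(f"cycle at ClassificationId {current}")
        seen.add(current)
        row = classifications[current]
        path.append(row["Name"])
        current = row["ParentId"]
    return list(reversed(path))


def flatten_warehouses():
    """Return one row per warehouse, carrying its classification levels."""
    return [
        {
            "WarehouseId": w["WarehouseId"],
            "Levels": classification_path(w["ClassificationId"]) + [w["Name"]],
        }
        for w in warehouses
    ]


merged_rows = flatten_warehouses()
```

Each entry in `Levels` would become one attribute column of the single merged dimension, which is what allows a hierarchy to span all levels.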

Related

Star Schema from multiple source tables

I am struggling to figure out how to create a star schema from multiple source tables. I work at a trading firm, so the data is related to user trading activity. The issue I am having is that our datasets do not have primary ids for every field that could be a dimension. Instead, we usually relate our data together using the combination of date and account number. Here is an example of 3 source tables...
I would like to turn this into a star schema, something that looks like ...
Is my only option to denormalize my source tables into one wide table (joining trades to position on account number and date, and joining the users table on account number), create keys for each dimension, then renormalize it into the star schema? Are star schemas ever built from multiple source tables?
Star schemas are almost always created from multiple source tables.
The normal process is:
1) Populate your dimension tables
2) Create a temporary/virtual fact record using your source data
3) Using this fact record, look up the relevant dimension keys
4) Write the actual fact record to your target fact table
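The steps above can be sketched in a few lines of Python. This is only an illustration under assumptions: the `trades` rows, the column names, and the `build_dimension` helper are hypothetical, and a real load would run against database tables rather than in-memory lists.

```python
# Hypothetical source rows, related by date and account number as in the question.
trades = [
    {"date": "2020-01-02", "account": "A1", "quantity": 100},
    {"date": "2020-01-02", "account": "B7", "quantity": 40},
    {"date": "2020-01-03", "account": "A1", "quantity": 25},
]


def build_dimension(values):
    """Assign a surrogate key (1, 2, 3, ...) to each distinct natural key."""
    return {natural: key for key, natural in enumerate(sorted(set(values)), start=1)}


# Populate the dimension tables first.
dim_date = build_dimension(t["date"] for t in trades)
dim_account = build_dimension(t["account"] for t in trades)

# For each source row (the temporary fact record), look up the dimension
# keys and write the actual fact record to the target fact table.
fact_trade = []
for source_row in trades:
    fact_trade.append(
        {
            "date_key": dim_date[source_row["date"]],
            "account_key": dim_account[source_row["account"]],
            "quantity": source_row["quantity"],
        }
    )
```

The fact rows end up holding only surrogate keys and measures, so the date and account number combination is needed only during the lookup.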
Data-warehousing is about query speed. The data-warehouse should not be concerned with data integrity. IT SHOULD NOT CLEAN OR CORRECT BAD DATA. It only needs to gather all the data together into a single record to present to the model for analysis. Denormalizing the data is how this is done.
In a star schema, dimensions do not know about each other and have no relationships with other dimensions. In a snowflake, dimensions are related to other dimensions. That is the primary difference between star and snowflake.
All the metadata options for events are rolled up into dimensions and used for slicing/filtering. All the measurable/calculation data for an event are in the event fact, along with a reference to the dimension(s) containing the relevant metadata. The Metadata/Dimension is reused across multiple fact records.
Based on the limited example you've provided, I'd suggest you research degenerate dimensions and junk dimensions. Your Trade and Position data may need to be turned into a fact and a dimension (degenerate), and some of your flag attributes may be best placed into a junk dimension.
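For the junk dimension idea, a minimal sketch might look like the following. The flag names `is_margin` and `side` are invented for illustration, since the question's actual columns are not shown; the point is only that every combination of low-cardinality flags gets one surrogate key, and the fact row stores that key instead of the individual flags.

```python
from itertools import product

# Hypothetical low-cardinality flags; not taken from the question's tables.
flag_values = {
    "is_margin": [False, True],
    "side": ["BUY", "SELL"],
}
flag_names = list(flag_values)

# One junk-dimension row per combination of flag values, keyed 1..N.
junk_dim = {
    combo: key
    for key, combo in enumerate(
        product(*(flag_values[name] for name in flag_names)), start=1
    )
}


def junk_key(row):
    """Look up the junk-dimension key for a source row's flag combination."""
    return junk_dim[tuple(row[name] for name in flag_names)]
```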
You should also make sure your dimension keys are clear. You should not have multiple paths to a dimension (for account number: trade -> position -> user and trade -> user), as that will cause inconsistent results when querying, depending on which relationship you traverse.

Link fact tables at different granularity levels of a dimension

New to data warehouse design. I have a denormalised dimension table representing geographies (e.g. suburb, city, state). This is a slowly changing dimension.
Also have multiple fact tables, each at different grain levels.
Is it possible to model this so the fact tables use surrogate keys, whilst maintaining a denormalised dimension table?
If you have effectively the same dimensional data but at different grains then you handle this by creating "aggregate" dimensions. In your example, copy the dim_geo table definition (not the data), name the dim to something like dim_geo_city and drop all the columns at a lower granularity than city (e.g. suburb_id, suburb). If you have facts at the state level then you would create dim_geo_state in the same way - and so on for any further levels of aggregation.
Fact_population will continue to reference dim_geo but fact_housing should reference dim_geo_city.
The easiest way to populate aggregate dims is to run a SELECT DISTINCT on the base dim (dim_geo) and only include the columns that exist in the target dim (dim_geo_city) - you then take the resulting data and apply the appropriate SCD logic to insert/update it into the target dim.
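A small sketch of the SELECT DISTINCT step, using Python's built-in `sqlite3` module so it can be run as-is. The key column names (`geo_key`, `geo_city_key`) and the sample rows are assumptions, and the SCD insert/update logic mentioned above is deliberately left out; this only shows how the city-grain rows are derived from the base dimension.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Base dimension at suburb grain (hypothetical sample data).
conn.execute(
    "CREATE TABLE dim_geo ("
    "geo_key INTEGER PRIMARY KEY, suburb TEXT, city TEXT, state TEXT)"
)
conn.executemany(
    "INSERT INTO dim_geo (suburb, city, state) VALUES (?, ?, ?)",
    [
        ("Bondi", "Sydney", "NSW"),
        ("Manly", "Sydney", "NSW"),
        ("Carlton", "Melbourne", "VIC"),
    ],
)

# Aggregate dimension: same definition minus the columns below city grain.
conn.execute(
    "CREATE TABLE dim_geo_city ("
    "geo_city_key INTEGER PRIMARY KEY, city TEXT, state TEXT)"
)
conn.execute(
    "INSERT INTO dim_geo_city (city, state) "
    "SELECT DISTINCT city, state FROM dim_geo ORDER BY state, city"
)

city_rows = conn.execute(
    "SELECT city, state FROM dim_geo_city ORDER BY state, city"
).fetchall()
```

The two Sydney suburbs collapse into a single city row, which gets its own surrogate key for the city-grain fact table to reference.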

Basic questions regarding Data Warehousing

I'm wanting to use OLAP cubes and have to first design a data warehouse. I am going for the star-schema. I'm a little confused about how to convert from a normal database to a data warehouse, especially with regards to foreign keys between dimension tables. I know a fact table has foreign keys to dimensions, but do dimensions have foreign keys between them? For example, what do I need to do with the following 2 examples:
TABLE: Airports
COLUMNS: Id, Name, Code, CityId
When I make the Airports dimension, do I remove CityId and put the City Name instead? Or what?
TABLE: Regions
COLUMNS: Id, Name, RegionType, ParentId
The question for this one is mostly the same, but a bit more complex, because here ParentId refers to the same table (Regions). For example, a City can refer to a parent Country record. How do I translate these over to a data warehouse star schema?
Lastly, regarding measures, those go on the fact table, right? I think I will likely need multiple fact tables. Is that normal? Does one fact table translate to one OLAP cube? Or what?
You want to include the city within your airport dimension. You are intentionally flattening out your normalised schema to aid the speed of the dimensional model, which can seem counterintuitive if you are coming from transactional development.
With regards to the parent-child relationship, you want the ParentId to be translated into the surrogate key of the parent region record. SSAS provides the functionality to relate parent-child records when you are designing your cube.
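Both points can be sketched with the two tables from the question. The sample rows and the `AirportKey`, `RegionKey`, and `ParentRegionKey` column names are assumptions for illustration, not a prescribed layout.

```python
# Hypothetical source rows shaped like the Airports and Regions tables.
cities = {1: "London", 2: "Paris"}
airports = [
    {"Id": 10, "Name": "Heathrow", "Code": "LHR", "CityId": 1},
    {"Id": 11, "Name": "Charles de Gaulle", "Code": "CDG", "CityId": 2},
]
regions = [
    {"Id": 100, "Name": "France", "RegionType": "Country", "ParentId": None},
    {"Id": 101, "Name": "Paris", "RegionType": "City", "ParentId": 100},
]

# Airports: flatten the city into the dimension row instead of keeping CityId.
dim_airport = [
    {
        "AirportKey": key,  # surrogate key
        "AirportId": a["Id"],  # original source id
        "Name": a["Name"],
        "Code": a["Code"],
        "CityName": cities[a["CityId"]],
    }
    for key, a in enumerate(airports, start=1)
]

# Regions: assign surrogate keys first, then translate ParentId to the
# surrogate key of the parent row (None for a root such as a country).
region_key = {r["Id"]: key for key, r in enumerate(regions, start=1)}
dim_region = [
    {
        "RegionKey": region_key[r["Id"]],
        "Name": r["Name"],
        "RegionType": r["RegionType"],
        "ParentRegionKey": region_key.get(r["ParentId"]),
    }
    for r in regions
]
```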
Multiple facts are not unusual, but unless the fact data is completely unrelated, there is no need to separate them into different cubes. The requirement for multiple facts will be driven by having data at a different grain. Keep all of your related metrics (e.g. flights) together, but separate flight metrics from food sale metrics.
You are not converting to a data warehouse; you are creating a new data warehouse with a few dimensions and at least one fact table. Dimension tables are loaded first, and you DO NOT want to replace the id with the name.
You need an additional key (a surrogate key) for each dimension table. Once the dimensions are loaded, I usually use an SSIS package to load the fact table (either an incremental load, or truncating the fact table each time before loading the new data, depending on what you need) ...

How to design a database model for a large data warehouse?

Let's say I design a database model for an online seller such as Amazon:
Next, to create the database model for the larger data warehouse, I flatten the Order and OrderDetails tables, and flatten the Product and Vendor tables:
I do this to apply the concept of designing models for parent-child applications, as described here http://bit.ly/1bOuOXQ
The data in the data warehouse tables becomes repetitive:
Several values, such as $76.30 for OrderTotal, repeat on each row. Is this correct? Is the model correct?
Each row represents one product on an order, so the corresponding order subtotal and total will be identical on every row with the same OrderId.
The result is as expected for your model definition.
Yes.
OrderTotal is the same for all records, so it is duplicated when you put everything in a single spreadsheet.
(Of course, in a relational database you keep the data in several tables, so it isn't duplicated.)
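To make the repetition concrete, here is a small sketch that flattens an order onto its detail rows. Only the $76.30 OrderTotal comes from the question; the products and line totals are made up for illustration.

```python
from decimal import Decimal

# Hypothetical rows; only the 76.30 OrderTotal is taken from the question.
orders = {1: {"OrderTotal": Decimal("76.30")}}
order_details = [
    {"OrderId": 1, "Product": "Book", "LineTotal": Decimal("50.00")},
    {"OrderId": 1, "Product": "Pen", "LineTotal": Decimal("26.30")},
]

# Flatten: copy the order-level columns onto every detail row.
flat_rows = [
    {**detail, "OrderTotal": orders[detail["OrderId"]]["OrderTotal"]}
    for detail in order_details
]

# The order-level value now appears once per detail row.
repeated_totals = [row["OrderTotal"] for row in flat_rows]
```

Every flattened row for OrderId 1 carries the same OrderTotal, which is the duplication described in the answers above.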

Handling duplicate values in Business Intelligence Studio Cube dimensions

I have a Cube dimension which has a relationship between tier and market.
They are contained in the same table, with the structure id, tier, market. However, when I create the dimension, it shows a row in member properties for EACH row in the table. Is there any way to construct this again so that it becomes:
Tier1
Tier2
Tier3
with the markets listed underneath each relevant tier when I expand it?
Ok, found the issue. The above table had 3 columns: id, tier, market. When BIS automatically created the dimension, it added a one-to-one relationship.
Delete the dimension, create a new dimension using that existing table, set id as the key column, then select the other columns as dimension attributes. Rebuild, redeploy, and process, and the dimension will have a tree structure as required.
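For reference, the tree the rebuilt dimension should display is simply the table grouped by tier. A short sketch with made-up market names (the question does not list the actual markets):

```python
from collections import defaultdict

# Hypothetical rows with the id, tier, market structure from the question.
rows = [
    {"id": 1, "tier": "Tier1", "market": "Market A"},
    {"id": 2, "tier": "Tier1", "market": "Market B"},
    {"id": 3, "tier": "Tier2", "market": "Market C"},
    {"id": 4, "tier": "Tier3", "market": "Market D"},
]

# Group markets under their tier: tier is the parent level, market the child.
tree = defaultdict(list)
for row in rows:
    tree[row["tier"]].append(row["market"])

for tier, markets in tree.items():
    print(tier)
    for market in markets:
        print("  " + market)
```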
