I'm having trouble configuring a new cube that takes a snapshot of my company's open orders each day. Every night a snapshot is taken and stored in the data warehouse with a date key for the date the snapshot was taken. This snapshot date is the one we want to be semi-additive. However, we also have other dates in this data set, such as scheduled ship date, order date, etc., that are fully additive, just like the other non-date dimensions.
Does anyone have any advice on how I can create a cube for this data so that the order totals can be summed across the other dates, but use LastNonEmpty across the capture date?
The first connected Date dimension in the Dimension Usage tab is the semi-additive Date dimension. The rest are additive. I describe the exact behavior here.
This answer applies to Multidimensional cubes (not Tabular) which is what I assume you have since you mentioned LastNonEmpty.
We have a legacy ERP system that stores data in flat files. We have replicated these flat files in a SQL Server database pretty much as-is.
Some of the sales tables store historical data in multiple columns without storing any dates with them. The name of the column tells us which month the sales data belongs to: Sales01 is the current month, Sales02 is the previous month, Sales03 is the month before that, and so on. The same goes for sales quantity and margin, i.e. Qty01, Qty02, Margin01, Margin02, and so on. This is repeated for each customer and each item sold.
Now, I am working on a small project where I have to design a small DB for reporting with some tables that will be fed by this main database.
I want to load this data so that the value for each month is stored in its own row, with a month-year (or the first day of that month) in another column, so I can use a WHERE clause with dates.
I'm not sure what would be the best way to go. I have written a stored proc in the past to load data this way, but I wonder if there is a better option.
Could I somehow use the SSIS Pivot transformation?
Or just use an UNPIVOT or similar statement in a stored procedure (something like the sketch below)?
I will most likely be using this approach to build a data warehouse in the future.
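For illustration, the T-SQL UNPIVOT route might look like this; the table and column names are hypothetical, based on the description above:
-- Unpivot the month-numbered columns into rows, deriving the first day
-- of each month from the column-name offset (Sales01 = current month,
-- Sales02 = previous month, ...).
SELECT
    u.CustomerID,
    u.ItemID,
    DATEADD(MONTH,
            1 - CAST(RIGHT(u.SalesColumn, 2) AS int),
            DATEFROMPARTS(YEAR(GETDATE()), MONTH(GETDATE()), 1)) AS MonthStart,
    u.SalesAmount
FROM dbo.SalesFlat
UNPIVOT (SalesAmount FOR SalesColumn IN (Sales01, Sales02, Sales03)) AS u;
The Qty01/Qty02 and Margin01/Margin02 columns can be handled with the same pattern.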
I am using Report Builder to create a report showing a budget for a project. The dataset includes line items for both budget and projected; see below for example rows. I am using a matrix with a column group to display budget and projected side by side, as well as a row group to show section, category, etc. I need a variance column that subtracts projected from budget.
I have scoured the interwebs for solutions, but nothing has worked so far. I feel like there has to be a simple solution, given that it could be done in a SQL query with zero effort. Most solutions assume I have two separate fields, but these are dynamic fields pulled out by the column group.
Dataset Row Samples
Type Section Category Phase Task Total
Budget Building Kitchen Pre-Construction Cabinet Hardware $100
Projected Building Kitchen Pre-Construction Cabinet Hardware $220
Report sample
COL GROUP This is the column I want
Budget Projected Variance
+Building $100 $220 -$120
+Kitchen
+Pre-Con
EDIT: I tried the solution below without success and have already visited every link provided in the second answer. Maybe there is something I am missing, but I ended up just doing everything in the SQL query and not using column groups (sketched below). This is by far the simplest solution. I am very surprised there is no easy way to reference individual columns in a column group. The answers below may work for others, but I just could not get them to work for me; I'm not sure why.
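For reference, the query-side approach ended up looking roughly like this; the table name dbo.BudgetLines is hypothetical:
-- Pivot Budget/Projected into fixed columns so the report needs no
-- column group and the variance is just another column.
SELECT
    Section, Category, Phase, Task,
    SUM(CASE WHEN [Type] = 'Budget'    THEN Total ELSE 0 END) AS Budget,
    SUM(CASE WHEN [Type] = 'Projected' THEN Total ELSE 0 END) AS Projected,
    SUM(CASE WHEN [Type] = 'Budget'    THEN Total ELSE 0 END)
      - SUM(CASE WHEN [Type] = 'Projected' THEN Total ELSE 0 END) AS Variance
FROM dbo.BudgetLines
GROUP BY Section, Category, Phase, Task;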
You could add an additional column inside the "Type" group (provided that this is the name of your column group). Set the Column Visibility to hide the column with an expression like
=IsNothing(Previous(Fields!Type.Value, "Type"))
Calculate the values for that column as
=Previous(Sum(Fields!Total.Value), "Type") - Sum(Fields!Total.Value)
That should calculate the difference between the values of the previous type and the current type, and
only show that column for the "Projected" type (when there is a previous type).
On the matrix, you can use the group subtotals to achieve this; you only have to overwrite the SUM operation with an expression that subtracts the two values. There are many links showing how to do that which may help you:
How to add calculated column from dynamic columns to a matrix
Adding subtotals to SSRS report tablix
How to write Expression to subtract row Group SubTotals
Reporting in SQL Server – Using calculated Expressions within reports
I am trying to plot a value over a time series in Power BI Report Builder, getting the data from a relational MSSQL database. This value (UnitCapacity) has a StartDate and an EndDate, so I created a date/time dimension inside Power BI using an M query to generate the days between one year and another. What I am trying to do is plot the unit capacities over a time series chart. I then created filters so that I can choose which refinery unit to plot.
The way I tried to tackle it is by creating a relationship between IIROutagesDenormalised and DateTimeDim over the handle, where the handle is in this format: {YYYY}-{MM}-{DD}. Is this the right way to do it?
When I tried to create the DAX query to get the calendar date dimension, it gave me the error below:
You don't need to take care of the date format, because Power BI handles it as long as the data type is correct. I'm not sure about the business logic, but there is a simpler way using DAX.
You can create a calendar table using DAX:
DateTimeDim = CALENDAR(MIN(IIROutagesDenormalised[OutageStartDate]), MAX(IIROutagesDenormalised[OutageEndDate]))
This returns a table with a single column named Date.
If you create a relationship between the Date column and OutageStartDate, you can then add a simple measure (depending on the business logic), like
Total = SUM(IIROutagesDenormalised[UnitCapacity])
With that in place you can plot UnitCapacity over the Date axis, and it also works with the refinery-unit filter.
I receive new data files every day. Right now, I'm building the database with all the required tables to import the data and perform the required calculations.
Should I just append each new day's data to my current tables? Each file contains a date column, which would allow for a "WHERE" query in the future if I need to analyze data for one particular day. Or should I be creating a new set of tables for every day?
I'm new to database design (coming from Excel). I will be using SQL Server for this.
Assuming that the structure of the data being received is the same, you should only need one set of tables rather than creating new tables each day.
I'd recommend storing the value of the date column from your incoming data in your database, and also having a 'CreateDate' column in your tables, with a default value of 'GetDate()' so that it automatically gets populated with the current date when the row is inserted.
You may also want to have another column to store the data filename that the row was imported from, but if you're already storing the value of the date column and the date that the row was inserted, this shouldn't really be necessary.
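A minimal sketch of that kind of table (all names here are hypothetical):
-- One table for all days; DataDate comes from the incoming file,
-- CreateDate defaults to the moment the row is inserted.
CREATE TABLE dbo.DailyData
(
    DailyDataID int IDENTITY(1, 1) PRIMARY KEY,
    DataDate    date           NOT NULL,  -- date column from the incoming file
    SourceFile  nvarchar(260)  NULL,      -- optional: which file the row came from
    CreateDate  datetime       NOT NULL
        CONSTRAINT DF_DailyData_CreateDate DEFAULT (GETDATE()),
    Amount      decimal(18, 2) NULL       -- placeholder for the real data columns
);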
In the past, when doing this type of activity using a custom data loader application, I've also found it useful to create log files to log success/error/warning messages, including some type of unique key of the source data and target database - ie. if coming from an Excel file and going into a database column, you could store the row index from Excel and the primary key of the inserted row. This helps tracking down any problems later on.
You might want to consider having a look at SSIS (SQL Server Integration Services). It's the SQL Server tool for doing ETL activities.
Yes, append each day's data to the tables; one set of tables for all data.
Yes, use a date column to identify the day that the data was loaded.
Maybe have another table with a date column and a CLOB column: the date to contain the load date, and the CLOB to contain the file that you imported, e.g.:
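In SQL Server terms that could be something like the following, with nvarchar(max) playing the role of the CLOB (names hypothetical):
-- One row per imported file: the load date plus the raw file contents.
CREATE TABLE dbo.ImportFileLog
(
    LoadDate     date          NOT NULL,
    FileContents nvarchar(max) NOT NULL
);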
Good question. You most definitely should have a single set of tables and append the data daily. Consider this: if you create a new set of tables each day, what would, say, a monthly report query look like? A quarterly report query? It would be a mess, with UNIONs and JOINs all over the place.
A single set of tables with a WHERE clause makes the querying and reporting manageable.
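For example, a monthly report against a single table is just a date-range filter (hypothetical names, reusing the DailyData sketch above):
-- Monthly report: no UNIONs, just a WHERE on the date column.
SELECT DataDate, SUM(Amount) AS TotalAmount
FROM dbo.DailyData
WHERE DataDate >= '20240101'
  AND DataDate <  '20240201'
GROUP BY DataDate
ORDER BY DataDate;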
You might do a little reading on relational database theory. Wikipedia is a good place to start. The basics are pretty straightforward if you have the knack for it.
I would have the data load into a staging table regardless, and append to the main tables after. Once a week I would then refresh all data in the main table to ensure that the data remains correct as per the source.
Marcus
Hi, I'm struggling with adding a time dimension to an OLAP cube. I can get everything in the cube to work except the date. In my data source view I have a datetime column.
I go by using Dimensions -> New Dimension -> Generate time dimension on the server. I end up with a nice hierarchical time dimension (Date-Month-Quarter-Year). Later I add this dimension to the cube and define a regular relationship with the datetime column from the data source view (the same table which has the fact data). When I try to deploy the cube, I get this error:
Errors in the OLAP storage engine: The attribute key cannot be found when processing:Table: 'table_name', Column: 'registration_date', Value: '3/29/2007 3:00:00 PM'. The attribute is 'Date'
Maybe I don't get something? Every manual I can find talks about a calendar table already created in the source database. There are plenty of scripts which will create a calendar table for you. But why should I? Isn't 'Generate time dimension on the server' meant for exactly this?
I would guess that the date values in your fact table need to exist in the time dimension. The generated dimension's Date keys are midnight values, so a datetime like '3/29/2007 3:00:00 PM' won't find a matching member. Perhaps remove the time portion, or create a calculated field in the SSAS designer. More experienced people may have better answers; I've only made one cube.
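If it helps, here is a sketch of removing the time portion on the SQL side, reusing the table and column names from the error message (the measure column is hypothetical):
-- View for the data source view: truncate registration_date to midnight
-- so each fact row matches a Date attribute key in the generated time
-- dimension. The DATEADD/DATEDIFF idiom also works on SQL Server
-- versions that predate the DATE type.
CREATE VIEW dbo.v_table_name AS
SELECT
    DATEADD(DAY, DATEDIFF(DAY, 0, t.registration_date), 0) AS registration_date,
    t.order_amount  -- hypothetical; list the real fact columns here
FROM dbo.table_name AS t;
Then base the fact table in the cube on this view instead of the raw table.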