convert a history table to temporal table

convert a history table to temporal table - sql-server

i have a history table that i'd like to convert to a sql server temporal history table. Every day, i take a snapshot of the data and add it to the table, regardless of changes. I'm able to use this to get a point in time for each day of the year. and since i have 3 years of data, i can get a 3 year comparison of the day for each year. Anyway, these tables are getting rather large. We'd like to start using temporal tables. I've been trying to figure out if i can convert these history tables to a temporal history table with the proper from and to dates based on the date added in the current table.
The examples i've found don't seem to allow for this. Any ideas? Thanks.

Related

Can you join two SSAS tables at the time of query based on end user filter selection?

I'm working with a "slowly changing fact table" that keeps track of changes to a reservation by stamping the effective start and effective end dates on when a change is effective for. The latest change has an effective end date of 12/31/9999 23:59:59.999 to keep it effective forever until if/when a new change comes in. I have a second table called an "As-Of" date table that has every single date since 8/1/2008 to current with a timestamp of 23:59:59.999 that I can use to join back to the slowly changing fact where "as-of" dates are between the effective start and end dates of in the fact. The purpose of this is to be able to go back to any specific date in time and see what the reservation looked like on that date. As you can imagine, each new as-of date has exponentially more data.
I've been tasked with creating an SSAS tabular model that has every "as-of" date available in a drop down for end users to select and be able to see data for that particular date. I'm concerned about storage and performance issues of having all as-of dates joined back to the fact table to provide end users the freedom to select any as-of date they want at any given time.
If I create a view that has the fact and as-of date table joined together, is it possible to pass the as-of dates they select in the drop down in SSAS back to a dynamic where clause in the view so that I am only joining the fact and as-of date tables dynamically for the as-of dates they actually need to see? Is it possible to create some type of "live" connection that only joins the fact and as-of date tables on the fly so I don't have to blow out the underlying data for each as-of date?
It seems as though the two tables will have to be joined in a SQL view before I bring it into SSAS since it doesn't seem possible to join two tables on more than one column in SSAS.
Can someone please tell me if this is something that is even technically possible? Or if you have any other ideas on the best way to represent this data in SSAS?

For some table, you can use Direct Query (live query) instead of Import.
DirectQuery Documentation
You can use a virtual relationship in your measure (where you can specifying more than one column).
VirtualRelationship
example:
CALCULATE (
<target_measure>,
TREATAS (
SUMMARIZE (
<lookup_table>
<lookup_granularity_column_1>
<lookup_granularity_column_2>
),
<target_granularity_column_1>,
<target_granularity_column_2>
)
)

How to check the data in sql server if it is there for everyminute

I have table with
fields: Id, ChId, ChValue, ChLoggingDate
Now the data will be saved for everyminute in to the database. I need a query to check if the data exists for every minute in the table through out the year for a particular weekday. That is, all Monday's in 2013 if it has complete data for that day calculate the arithmetic mean for the year of Monday's.

YOu will need a table - or a table value function - to create a timestamp for every minute, then you can join with that and check where the original table has no data.
Only way - any query otherwise can only work with the data it has.
This is a common approach for any reporting.
http://www.kodyaz.com/t-sql/create-date-time-intervals-table-in-sql-server.aspx
explains how, you just must expand it to a times table. If you separate date and time you can get away with a time table (00:00:00 to 24:59:59) and a date table separately.

Can a Accumulating snapshot table has multiple dates in it?

I am trying to make sense of dimension modeling. While reading a dimension modeling book, I have created a star schema.
The fact table is a Accumulating snapshot table and it has multiple date columns which are linked to a date dimension using a surrogate key.
FactApplicants
{
Interview_No_Show_Date_Key (FK)
Cancel_Date_Key (FK)
Interviewed_Date_Key (FK)
. ....
Applicant_Key(FK)
InquiryCount int
}
DimDate
{
Date_Key (PK, int),
FullDateUSA (char(10))
Date (datetime)
}
I do have a well defined process for which i am trying to make this star schema for. I have a date field in the fact table for each of this step as I need to prepare funnel like report and activity reports. So the question really is
Is this correct? can a fact table refer to same date dimension table multiple times?
The examples I am seeing all over the internet seems to indicate this is correct but i am having hard time making it work with Pentaho reporting. so I am not sure if its a design problem or its something i am not doing correctly in Pentaho

Yes it is correct to refer to the date dimension multiple times

Yes, a fact can refer to the same dimension multiple times. However, given only what I see in your example, I am not sure why you need the date dimension. The date in applicants is just a date and can be used as an attribute without referring to a separate date dimension. It's just the attribute "date". You would need a separate date dimension if, for example, (1) you want to ensure that only valid dates are used, or (2) you want to elevate date to a full calendar in which other attributes are used to describe a date, such as day of the week, weekday/weekend, holiday, etc. or (3) you want to rollup date to other levels, such as week, month, year.

Implementing Date Range in OLAP systems

Please bear with me if this is a trivial question,I am a new bee
I am in the design phase of a OLAP system where i need to show cost for a date range.
I have three other dimension like product,vendor and language.
Should I add date as one more dimension??
My queries are mostly cost on a date range like from 5-11-1997 to 01-09-2-13
Which is the best way to do it.

You do need to add a Time Dimension. If all the Date/Time facts are just Dates (no Time part as in the example range) then you need to create a table/view which consists of a row for each Date in the domain range.
This table can also have extra fields for things like week, month, quarter, season, year that your users may be interested in querying. (If there are none of these, then just have one column with the date.)
You would need to tell the OLAP data model that this date column in the Time table is the PK, and that the dates in other tables are FK's to it. The OLAP engine will then allow this new Time Demension to be used is queries just like any other dimensions.

Database design for a yearly updated database (once a year)

I have a large database which will only be updated once a year. Every year of data will use the same schema (the data will not be adding any new variables). There's a 'main' table where most of the customer information lives. To keep track of what happens from year to year, is it better design to put a field in the main customer table that says what year it is, or have a 'year' table that relates to the main customer table?

I recommend having a year field in the customer table, that way it is all together. You could even use a timestamp to automatically input the date of user sign up.

To really answer, we'd need to see your schema, but it is almost never the right choice to make a new table for a new year. You probably want to relate years to customers.

Usually you would split off your archive data because you are doing OLTP stuff on your current data, because you want to mostly work on current data, and sometimes look at old stuff. But you have very few updates it seems. I guess the main driver is your queries, and what they 'usually' do, and what performance you need to get out of them. Its probably easier for you to have everything in one table - with a year column. But if most of your queries are for the current year, and they are tight on performance you may want to look at splitting the current data out - either using physical tables, or partitioning of the table (depending on the DB some can do this for you, whilst still being a single table)