SSAS - Facts that happened over a time range - database

I don't really know how to handle facts that happened over a period of time. I usually deal with facts that happened on a specific date.
To be clear, my facts have a start_date and an end_date. Say my start_date is 01/01/2008 and my end_date is 01/01/2011. I need the number of facts that happened in 2009 and the number that happened this year. The same fact can count in both years. A fact is part of 2009 if it was in effect on 12/31/2009.
I was thinking about separate StartDate and EndDate dimensions, using date ranges (from the first date of my StartDate dimension to 12/31/2009, and from 12/31/2009 to the last date in my EndDate dimension). I would cross join those.
I tried it and it works, but it's REALLY slow.
Any thoughts?

I found the solution to what I wanted. Thanks to David and Chris for the answers, though! Here's what I wanted to achieve, but I was lacking the MDX syntax:
SELECT [Measures].[NumberX] ON COLUMNS
FROM [MyCube]
WHERE ([START DATE].[REFDATE].FirstMember:[START DATE].[REFDATE].&[2009-12-31T00:00:00],
[END DATE].[REFDATE].&[2010-01-01T00:00:00]:[END DATE].[REFDATE].LastMember)
Pretty simple. I think my question was not clear, that's why I got different answers ;-)
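
For readers who think in SQL rather than MDX, the slicer above boils down to "started on or before 12/31/2009 and ended on or after 01/01/2010". Here is a minimal sketch of that predicate using Python's built-in sqlite3; the `fact` table, its columns and the sample rows are made up for illustration and are not part of the cube.

```python
import sqlite3

# Hypothetical sample facts: (fact_id, start_date, end_date) as ISO date strings.
FACTS = [
    (1, "2008-01-01", "2011-01-01"),  # in effect on 2009-12-31
    (2, "2009-03-01", "2009-11-30"),  # ended before 2009-12-31
    (3, "2010-02-01", "2010-06-30"),  # started after 2009-12-31
]


def count_active_on(as_of: str) -> int:
    """Count facts that started on or before as_of and ended after it."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE fact (fact_id INTEGER, start_date TEXT, end_date TEXT)"
    )
    conn.executemany("INSERT INTO fact VALUES (?, ?, ?)", FACTS)
    # ISO-formatted strings compare in date order, so plain comparisons work.
    (n,) = conn.execute(
        "SELECT COUNT(*) FROM fact WHERE start_date <= ? AND end_date > ?",
        (as_of, as_of),
    ).fetchone()
    conn.close()
    return n


print(count_active_on("2009-12-31"))
```

Only fact 1 satisfies both halves of the range, which mirrors what the two sets in the WHERE clause select.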

You can always use a date range with the two time dimensions like:
Select [start date].[year].[2009]:[end date].[year].[2010] on 0
from cube
That is, if I'm understanding the question correctly. Two time dimensions should work together fine. I have two in a project I'm doing at work and they perform well together. Make sure that you set them up in the Dimension Usage section of the cube so you can differentiate the two dates.

You just need one Date dimension DimDate. Your fact table can have 2 foreign keys to the DimDate, one for startdate and one for enddate.
FactTable
{
FactID int,
StartDate int,
EndDate int
-- Other fields
}
DimDate
{
DimDateID int,
Year int,
Month int,
Day int,
-- Other fields if needed
}
To get all facts that fall in the year 2009, use:
SELECT f.FactID FROM FactTable f
INNER JOIN DimDate dStart ON dStart.DimDateID = f.StartDate
INNER JOIN DimDate dEnd ON dEnd.DimDateID = f.EndDate
WHERE dStart.Year <= 2009
AND dEnd.Year >= 2009
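
To try the overlap logic without a SQL Server instance, here is a small sketch that loads the same two tables into an in-memory SQLite database and runs the query above. The sample rows and the yyyymmdd-style keys are assumptions for illustration only.

```python
import sqlite3


def facts_in_year(year: int) -> list:
    """Return FactIDs whose start/end range overlaps the given year."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE DimDate (DimDateID INTEGER PRIMARY KEY,
                              Year INTEGER, Month INTEGER, Day INTEGER);
        CREATE TABLE FactTable (FactID INTEGER PRIMARY KEY,
                                StartDate INTEGER, EndDate INTEGER);
        -- Hypothetical sample data
        INSERT INTO DimDate VALUES
            (20080101, 2008, 1, 1), (20090615, 2009, 6, 15),
            (20100301, 2010, 3, 1), (20110101, 2011, 1, 1);
        INSERT INTO FactTable VALUES
            (1, 20080101, 20110101),
            (2, 20090615, 20100301),
            (3, 20100301, 20110101);
        """
    )
    # DimDate is joined twice, once per role (start and end).
    rows = conn.execute(
        """
        SELECT f.FactID
        FROM FactTable f
        INNER JOIN DimDate dStart ON dStart.DimDateID = f.StartDate
        INNER JOIN DimDate dEnd ON dEnd.DimDateID = f.EndDate
        WHERE dStart.Year <= ? AND dEnd.Year >= ?
        ORDER BY f.FactID
        """,
        (year, year),
    ).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(facts_in_year(2009))
```

Note that this counts any fact touching 2009 at all, which is slightly broader than "in effect on 12/31/2009" from the question.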

Workaround on Sliding Window Function in Snowflake

I've stumbled upon a problem that is giving me huge headaches, which is the following:
I have a table Deals, that contains information about this entity from our Sales CRM. I also have a table Company, that contains information about the companies pegged to those deals.
I was asked to compute a metric called Pipeline Conversion Rate, which is calculated as:
Won deals / Created Deals
So far, everything is quite clear. However, I was asked to compute this metric in a sliding-window fashion, that is, looking only at the prior 90 days. The catch is that the two counts use different dates: the prior 90 days of the denominator (created deals) are based on the created date, while the prior 90 days of the numerator (won deals) are based on the closed date (both columns are part of the Deals table).
There wouldn't be any problem if we could use this kind of window function in Snowflake, like the following (I know the syntax may not be exactly right, but you get the idea):
count(deal_id) over (
partition by is_inbound, sales_agent, sales_tier, country
order by created_date range between 90 days preceding and current row
) as created_deals_last_90_days,
count(case when is_deal_won then deal_id end) over (
partition by is_inbound, sales_agent, sales_tier, country
order by created_date range between 90 days preceding and current row
) as won_deals_last_90_days
But we can't as far as I know. So my current workaround is the following (taken from this post):
select
    calendar_date,
    is_inbound,
    sales_tier,
    sales_agent,
    country,
    (
        select count(deal_id)
        from deals
        where d.is_inbound = is_inbound
          and d.sales_tier = sales_tier
          and d.sales_agent = sales_agent
          and d.country = country
          and created_date between cal.calendar_date - 90 and cal.calendar_date
    ) as created_deals_last_90_days,
    (
        select count(case when is_deal_won then deal_id end)
        from deals
        where d.is_inbound = is_inbound
          and d.sales_tier = sales_tier
          and d.sales_agent = sales_agent
          and d.country = country
          and closed_date between cal.calendar_date - 90 and cal.calendar_date
    ) as won_deals_last_90_days
from calendar as cal
left join deals as d on cal.calendar_date between d.created_date and d.closed_date
*Note that I am using a calendar table as the base table in order to have visibility on all calendar dates; without it I would miss the dates on which no new deals were created (which can happen on weekends).
The problem is that the figures don't match when I cross-check the raw data against the output of this query, and I have no idea how to make this (ugly) workaround, well... work.
Any ideas are more than welcome!
Well, it turns out it was way easier than I expected. After some trial-and-error, I figured out the only thing that could be failing was the JOIN condition in the outer query:
on cal.calendar_date between d.created_date and d.closed_date
This assumed that the calendar date had to fall between the created and closed dates, and that assumption is wrong. I changed that part of the outer query to:
on cal.calendar_date >= d.created_date
This captures every deal created on or before the calendar_date, which means all of them, since created_date is a mandatory field.
Maintaining the rest of the query as is, and assuming that there will be no nulls in any of the partitions, the results are the ones I expected.
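
For anyone who wants to sanity-check the two 90-day counts in isolation, here is a stripped-down sketch in Python with the built-in sqlite3. It keeps only the calendar spine and the two correlated subqueries and leaves out the is_inbound / sales_tier / sales_agent / country grouping and the outer join; the sample deals and calendar dates are invented, and SQLite's date(x, '-90 days') stands in for Snowflake's date arithmetic.

```python
import sqlite3

ROLLING_SQL = """
SELECT
    cal.calendar_date,
    -- created deals are windowed on created_date
    (SELECT COUNT(*) FROM deals d
      WHERE d.created_date BETWEEN date(cal.calendar_date, '-90 days')
                               AND cal.calendar_date
    ) AS created_deals_last_90_days,
    -- won deals are windowed on closed_date
    (SELECT COUNT(*) FROM deals d
      WHERE d.is_deal_won = 1
        AND d.closed_date BETWEEN date(cal.calendar_date, '-90 days')
                              AND cal.calendar_date
    ) AS won_deals_last_90_days
FROM calendar AS cal
ORDER BY cal.calendar_date
"""


def rolling_counts() -> list:
    """Return (calendar_date, created_last_90, won_last_90) per calendar row."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE calendar (calendar_date TEXT);
        CREATE TABLE deals (deal_id INTEGER, created_date TEXT,
                            closed_date TEXT, is_deal_won INTEGER);
        -- Hypothetical sample data; deal 4 is still open (NULL closed_date)
        INSERT INTO calendar VALUES ('2023-01-31'), ('2023-04-30'), ('2023-08-31');
        INSERT INTO deals VALUES
            (1, '2023-01-10', '2023-02-15', 1),
            (2, '2023-01-20', '2023-04-01', 0),
            (3, '2023-03-05', '2023-04-20', 1),
            (4, '2023-04-25', NULL, 0);
        """
    )
    rows = conn.execute(ROLLING_SQL).fetchall()
    conn.close()
    return rows


for row in rolling_counts():
    print(row)
```

Because each count comes from its own subquery, the calendar date with no activity (2023-08-31 here) still shows up with zeros, which is the reason for using the calendar table as the base.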

Splitting business date into Day Month and year into separate columns

My business intelligence tool is causing a lot of issues when calculating a YOY variance. Instead, I am contemplating creating a view in my database that will let me subtract two columns to get the variance.
I am trying to wrap my head around the best way to go about this. I have been testing datepart, convert and cast on the date, but I am sure I am going about this the wrong way.
select top 1
Business_date,
CONCAT(DATEPART(MM, Business_Date),'-', DATEPART(DD, Business_Date)) as
DayMonth,
case
when DATEPART(YYYY, Business_Date) = '2019' then 2019
end
from Occupancy_Forecast;
I know the code above does not get me anywhere near where I need to be; I am still trying to find the best way to do this. What I am looking for is something like the attached screenshot:
I have also included a screenshot of the current table I am reading from so you understand the current format.
Using @Larnu's suggestion regarding the pivot, I have been able to create a view containing the required data, using the query below to get the desired output:
select Resort as Resort, Business_Date as Date, [2016], [2017], [2018], [2019],
[2020]
from
(select Resort, business_date, DATEPART(YYYY, Business_Date) as Year, ADR
from Occupancy_Forecast
where Business_Date > '2015-12-31') as SourceTable
PIVOT
(
AVG(ADR)
FOR YEAR IN ([2016], [2017], [2018], [2019], [2020] )
) as PivotTable
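
To show what the pivot is doing, and how the variance could be taken once the years sit side by side, here is a rough Python sketch. The sample rows and the helper names are made up. One assumption differs from the view above: rows are keyed on month-day rather than on the full Business_Date, so that the same calendar day in different years lands on one row.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows: (resort, business_date, adr)
ROWS = [
    ("A", date(2018, 1, 1), 100.0),
    ("A", date(2019, 1, 1), 120.0),
    ("A", date(2019, 1, 2), 90.0),
]


def pivot_by_year(rows):
    """Average ADR per (resort, 'MM-DD'), with one entry per year."""
    sums = defaultdict(lambda: defaultdict(lambda: [0.0, 0]))
    for resort, d, adr in rows:
        cell = sums[(resort, d.strftime("%m-%d"))][d.year]
        cell[0] += adr  # running total
        cell[1] += 1    # row count
    return {
        key: {year: total / n for year, (total, n) in years.items()}
        for key, years in sums.items()
    }


def yoy_variance(pivoted, key, year):
    """ADR(year) minus ADR(year - 1), or None if either year is missing."""
    years = pivoted.get(key, {})
    if year in years and year - 1 in years:
        return years[year] - years[year - 1]
    return None


pivoted = pivot_by_year(ROWS)
print(yoy_variance(pivoted, ("A", "01-01"), 2019))
```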

SQL Server 2014 Management Studio

I'm a nonprofit lawyer trying to set up a SQL Server database for my agency. The issue I'm having is query based: I need a simple query that will aggregate the total number of rows in a table, not the sum of the cell contents.
I'm working with 4 columns of info: attorney's name, client name, trial date and remedy (the last 2 are a date and a dollar amount, so integers).
/****** Script for SelectTopNRows command from SSMS ******/
SELECT TOP 100
[attorney]
,[client]
,[trial_date]
,[remedy]
FROM [MyLegalDB]
WHERE [trial_date] between '20160101' and '20160531'
I'm trying to find a way (script, batch file, etc.) to produce the total number of cases by month (according to trial date), the total number of clients, and the sum of the remedy column.
Sorry for the vagueness. There are privilege rules in place. Hope that helps clarify.
Thanks
Assuming that your case history spans years, not just months, try this:
SELECT
YEAR([trial_date]) AS [Year]
,MONTH([trial_date]) AS [Month]
,COUNT(1) AS [Trial_Count]
FROM [MyLegalDB]
WHERE [trial_date] between '20160101' and '20160531'
GROUP BY YEAR([trial_date]), MONTH([trial_date])
If you want to separate this by attorney, you would need to add that column to the SELECT list, as well as the GROUP BY clause, as such:
SELECT
[attorney]
,YEAR([trial_date]) AS [Year]
,MONTH([trial_date]) AS [Month]
,COUNT(1) AS [Trial_Count]
FROM [MyLegalDB]
WHERE [trial_date] between '20160101' and '20160531'
GROUP BY [attorney], YEAR([trial_date]), MONTH([trial_date])
This is a very general answer to a very general question. If you want me to be more specific, I'm going to have to understand your goal a little better. Hope it helps.
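
Since the question also asked for the number of clients and the sum of the remedy column, here is a sketch of how all three figures could come out of one GROUP BY. It runs against an in-memory SQLite table from Python, so strftime replaces T-SQL's YEAR() and MONTH(); the table name `cases` and the sample rows are assumptions, not the real data.

```python
import sqlite3

MONTHLY_SQL = """
SELECT
    CAST(strftime('%Y', trial_date) AS INTEGER) AS yr,
    CAST(strftime('%m', trial_date) AS INTEGER) AS mon,
    COUNT(*)               AS trial_count,   -- rows, i.e. cases
    COUNT(DISTINCT client) AS client_count,  -- each client counted once
    SUM(remedy)            AS total_remedy
FROM cases
WHERE trial_date BETWEEN '2016-01-01' AND '2016-05-31'
GROUP BY yr, mon
ORDER BY yr, mon
"""


def monthly_summary() -> list:
    """Return (year, month, cases, distinct clients, remedy total) per month."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cases (attorney TEXT, client TEXT, "
        "trial_date TEXT, remedy INTEGER)"
    )
    # Hypothetical sample rows; the last one falls outside the date filter.
    conn.executemany(
        "INSERT INTO cases VALUES (?, ?, ?, ?)",
        [
            ("Smith", "Client A", "2016-01-15", 1000),
            ("Smith", "Client B", "2016-01-20", 500),
            ("Jones", "Client A", "2016-02-03", 250),
            ("Jones", "Client C", "2016-06-10", 900),
        ],
    )
    rows = conn.execute(MONTHLY_SQL).fetchall()
    conn.close()
    return rows


for row in monthly_summary():
    print(row)
```

In T-SQL the same idea would be COUNT(DISTINCT [client]) and SUM([remedy]) added to the SELECT list of the queries above.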

T-SQL search for returning customer

I've got a question that I'm struggling with at the moment.
I really don't know how to solve this problem, even though it seems so simple.
I've got a Customer ID and an Order Date.
I only want to show customers that ordered something before 2015 AND bought something in, let's say, the last 10 days.
I created a little test table for that. Let's say it's January 2016:
Customer 1 made a purchase in January and another in 2010. That fits my need, so I want to show him.
But customer 2 made a purchase in December last year, so he is not a "returning" customer, but a customer that often buys my things. I don't want to show him.
I tried something like this, but it didn't work:
SELECT [Kunden_ID],Bestellung
FROM [Immo].[dbo].[TEST] AS A
WHERE (Bestellung >=DATEADD (day,-10,getdate())
AND Bestellung <= DATEADD (month,-12,getdate()))
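Your WHERE clause tests both conditions against the same row, and no single order date can be both within the last 10 days and more than 12 months old. You need two separate queries. The first finds those customers that bought something in the last 10 days. The second uses an EXISTS subquery to find those same customers (matched on ID) that also bought more than 12 months ago.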
Try this:
SELECT [Kunden_ID],Bestellung
FROM [Immo].[dbo].[TEST] AS A
WHERE (Bestellung >=DATEADD (day,-10,getdate()))
and exists (
select 1
from [Immo].[dbo].[TEST] AS B
where a.[Kunden_ID] = b.[Kunden_ID]
AND b.Bestellung <= DATEADD (month,-12,getdate())
)
Another way to do this uses a Common Table Expression (CTE). It's a little easier to see the different queries.
With Get10Days as (
SELECT [Kunden_ID],Bestellung
FROM [Immo].[dbo].[TEST] AS A
WHERE (Bestellung >=DATEADD (day,-10,getdate()))
)
select b.Kunden_ID
from [Immo].[dbo].[TEST] AS B
join Get10Days as A on a.Kunden_ID = b.Kunden_ID
where b.Bestellung <= DATEADD (month,-12,getdate())
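
If you want to check the logic against the example from the question, here is a small sketch using Python's sqlite3. The "today" date is passed in as a parameter instead of calling getdate(), so the result is reproducible; the sample orders, the function name and the DISTINCT are my additions for illustration.

```python
import sqlite3

RETURNING_SQL = """
SELECT DISTINCT a.Kunden_ID
FROM TEST AS a
WHERE a.Bestellung >= date(:today, '-10 days')
  AND EXISTS (
      SELECT 1
      FROM TEST AS b
      WHERE b.Kunden_ID = a.Kunden_ID
        AND b.Bestellung <= date(:today, '-12 months')
  )
ORDER BY a.Kunden_ID
"""


def returning_customers(today: str) -> list:
    """Customers with an order in the last 10 days and one 12+ months ago."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE TEST (Kunden_ID INTEGER, Bestellung TEXT)")
    # Hypothetical sample orders, loosely following the question's example.
    conn.executemany(
        "INSERT INTO TEST VALUES (?, ?)",
        [
            (1, "2010-05-01"), (1, "2016-01-10"),  # old order plus recent one
            (2, "2015-12-15"), (2, "2016-01-12"),  # frequent buyer
            (3, "2010-07-01"),                     # old order only
        ],
    )
    rows = conn.execute(RETURNING_SQL, {"today": today}).fetchall()
    conn.close()
    return [r[0] for r in rows]


print(returning_customers("2016-01-15"))
```

Customer 2 drops out because none of his orders is older than 12 months, and customer 3 because he has no recent order.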

Fill one Table with information from another

I have 2 Tables with the following fields:
Table_1:
Date, Month, Dayname
2013-01-01, January, Tuesday
This goes from 2013-01-01 to 2013-12-31
Table_2:
Workers_ID, Monday, Monday_Hours, Tuesday, Tuesday_Hours and so forth
1, TRUE, 8, TRUE, 8 ...
2, False, 0, FALSE, 4
What I need is a table:
Date, Month, Dayname, Hours_to_work, Workers_ID
This table has (number of workers) x (days in a year) rows, and each row shows how many hours a specific worker works on that date.
My problem is that I have no clue how to accomplish that. It would be great if someone could help me with it.
Ian P gives good advice as far as revising your schema.
If you're absolutely stuck with the current design, here's a strategy you could use:
SELECT Date, Month, Dayname, Workers_ID,
CASE Dayname
WHEN 'Monday' THEN Monday_Hours
WHEN 'Tuesday' Then Tuesday_Hours
-- etc...
END AS Hours_to_work
FROM Table_1 CROSS JOIN Table_2
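
Here is a quick way to see the CROSS JOIN plus CASE in action, as a sketch in Python with an in-memory SQLite database. Only the Monday and Tuesday columns are modelled and the two sample dates are made up from the question's data; the remaining weekdays would follow the same pattern.

```python
import sqlite3


def hours_per_worker_per_day() -> list:
    """One row per (date, worker) with that weekday's hours."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE Table_1 (Date TEXT, Month TEXT, Dayname TEXT);
        CREATE TABLE Table_2 (Workers_ID INTEGER,
                              Monday_Hours INTEGER, Tuesday_Hours INTEGER);
        -- Hypothetical sample data
        INSERT INTO Table_1 VALUES
            ('2013-01-01', 'January', 'Tuesday'),
            ('2013-01-07', 'January', 'Monday');
        INSERT INTO Table_2 VALUES (1, 8, 8), (2, 0, 4);
        """
    )
    rows = conn.execute(
        """
        SELECT Date, Dayname, Workers_ID,
               CASE Dayname
                   WHEN 'Monday' THEN Monday_Hours
                   WHEN 'Tuesday' THEN Tuesday_Hours
               END AS Hours_to_work
        FROM Table_1 CROSS JOIN Table_2
        ORDER BY Date, Workers_ID
        """
    ).fetchall()
    conn.close()
    return rows


for row in hours_per_worker_per_day():
    print(row)
```

With 2 dates and 2 workers this returns 4 rows, which is the "workers x days" shape the question asks for.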
Well, there are a few things here, but it's all part of the learning curve! First, there are functions within MS SQL that provide Month and Dayname; look up DateName and DatePart. If you haven't already, make these calculated columns on a DateTime or smalldatetime column. Second, Table_2 is not an ideal design. In general, tables should be narrow and indexes wide. So Table_2 should be of the form WorkersID, Date, Hours, and may well have a covering index, or at worst a composite clustered index on Date, EmployeeID. It is then (relatively) easy to produce well-performing queries on your data.
If you keep your current structure, I would also say a pivot would not help you; I think you want a union against date and employeeId instead. MS SQL also now has functionality to produce a sequence of values in a range (such as a contiguous range of dates): http://www.sqlperformance.com/2013/01/t-sql-queries/generate-a-set-1
