Data Warehouse: Modelling a future schedule

I'm creating a DW that will contain data on financial securities such as bonds and loans. These securities are associated with payment schedules. For example, a bond could pay quarterly, while a mortgage would usually pay monthly (sometimes biweekly). The payment schedule is created when the security is traded and, in the majority of cases, will remain unchanged. However, the design would need to accommodate those cases where it does change.
I'm currently attempting to model this data and I'm having difficulty coming up with a workable design. One of the most commonly queried fields is "next payment date". Users often want to know when a security will pay next. Therefore, I want to make it as easy as possible for them to get the next payment date and amount for each security.
Also, users often run historical queries in which case they'd want the next payment date and amount as of a specific point in time. For example, they may want to look back at 1/31/09 and query the next payment dates (which would usually be in February 2009 for mortgages). It's also common that they want to query a security's entire payment schedule, which might consist of 360 records (30 year mortgage x 12 payments/year).
Since the next payment date and amount would be changing each month or even biweekly, these fields wouldn't seem to fit into a slow-changing dimension very well. It would probably make more sense to use a fact table, but I'm unsure of how to model it. Any ideas would be greatly appreciated.

Next payment date is an example of a "factless fact table" (sometimes called a "fact-free" fact table). There's no measure, just FKs between at least two dimensions: the security and time.
You can denormalize the security to have a type-1 SCD (overwritten with each load) that has a few important "next payment dates".
I think it's probably better, however, to carry a few important payment dates along with the facts. If you have a "current balance" fact table for loans, then you have an applicable date for this balance, and you can carry previous and next payment dates along with the balance, also.
For the whole payment schedule, you have a special fact-free fact table that just has applicable date and the sequence of payment dates on into the future. That way, when the schedule changes, you can pick the payment sequence as of some particular date.
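As a rough illustration of picking the payment sequence "as of" a date, here is a minimal Python sketch. The row layout (security_id, applicable_date, payment_date), the sample data, and the function name are all assumptions made for the example, not part of any particular warehouse design.

```python
from datetime import date

# Hypothetical rows of the schedule table:
# (security_id, applicable_date, payment_date).
SCHEDULE = [
    (1, date(2009, 1, 1), date(2009, 1, 15)),
    (1, date(2009, 1, 1), date(2009, 2, 15)),
    (1, date(2009, 1, 1), date(2009, 3, 15)),
    (1, date(2009, 1, 1), date(2009, 4, 15)),
    # Schedule revised on 2009-03-01: payments move to the 20th.
    (1, date(2009, 3, 1), date(2009, 3, 20)),
    (1, date(2009, 3, 1), date(2009, 4, 20)),
]


def next_payment_as_of(rows, security_id, as_of):
    """Return the next payment date for a security as seen on `as_of`."""
    # Schedule versions that were already known on the as-of date.
    versions = [
        applicable
        for sid, applicable, _ in rows
        if sid == security_id and applicable <= as_of
    ]
    if not versions:
        return None
    current = max(versions)  # the latest version in force on that date
    # Assumption: "next" means strictly after the as-of date.
    upcoming = [
        pay
        for sid, applicable, pay in rows
        if sid == security_id and applicable == current and pay > as_of
    ]
    return min(upcoming, default=None)


print(next_payment_as_of(SCHEDULE, 1, date(2009, 1, 31)))
print(next_payment_as_of(SCHEDULE, 1, date(2009, 3, 5)))
```

In a real warehouse the same two steps (pick the latest applicable version, then the earliest payment after the as-of date) would be done in SQL against the fact table.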

I would use a table (securityid, startdate, paymentevery, period). It could also include enddate and paymentpershare.
period would be 1 for days, 2 for weeks, 3 for months, 4 for years.
So for security 1, which started paying weekly on 3/1/2009, changed to every 20 days on 4/2/2009, went back to weekly on 5/1/2009, and then changed to monthly on 7/1/2009, it would contain:
1,'3/1/2009',1,2
1,'4/2/2009',20,1
1,'5/1/2009',1,2
1,'7/1/2009',1,3
To get the actual dates, I'd use an algorithm like this:
To know the payment dates on security 1 from 3/5/2009 to 5/17/2009:
Find first entry before 3/5 = 3/1
Loop:
Get next date that's after 3/5 and before the next entry (4/2 - weekly) = 3/8
Get next date that's before the next entry (4/2) = 3/15
Get next date that's before the next entry (4/2) = 3/22
Get next date that's before the next entry (4/2) = 3/29
Next date >4/2 switch to next entry:
Loop:
Get next date that's after 4/2 and before the next entry (5/1 - every 20 days) = 4/22
Next date 5/12 is AFTER next entry 5/1, switch to next entry
Loop:
Get next date that's after 5/1 and before the lastdate (5/17 - weekly) = 5/8
Get next date that's before the lastdate = 5/15
Next date > 5/17
The dates between 3/5/2009 and 5/17/2009 would be 3/8, 3/15, 3/22, 3/29, 4/22, 5/8, 5/15.
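The walk-through above could be sketched in Python roughly as follows. This is only one reading of the algorithm. It assumes that an entry's startdate is not itself a payment date (as in the 4/2 and 5/1 steps), that both range bounds are inclusive, and that month and year steps clamp to the last day of shorter months. The function names are made up for the example.

```python
import calendar
from datetime import date, timedelta

DAYS, WEEKS, MONTHS, YEARS = 1, 2, 3, 4  # period codes from the table above


def add_periods(start, count, period):
    """Return `start` moved forward by `count` units of `period`."""
    if period == DAYS:
        return start + timedelta(days=count)
    if period == WEEKS:
        return start + timedelta(weeks=count)
    months = count if period == MONTHS else 12 * count
    total = start.month - 1 + months
    year, month = start.year + total // 12, total % 12 + 1
    # Assumption: clamp to the month's last day (e.g. Jan 31 -> Feb 28).
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def payment_dates(schedule, first, last):
    """List payment dates in [first, last] for one security.

    `schedule` holds (startdate, paymentevery, period) tuples.
    """
    rows = sorted(schedule)
    result = []
    for i, (start, every, period) in enumerate(rows):
        # An entry applies until the next entry's startdate.
        segment_end = rows[i + 1][0] if i + 1 < len(rows) else date.max
        k = 1  # assumption: the startdate itself is not a payment date
        while True:
            due = add_periods(start, k * every, period)
            if due >= segment_end or due > last:
                break
            if due >= first:
                result.append(due)
            k += 1
    return result


security_1 = [
    (date(2009, 3, 1), 1, WEEKS),
    (date(2009, 4, 2), 20, DAYS),
    (date(2009, 5, 1), 1, WEEKS),
    (date(2009, 7, 1), 1, MONTHS),
]
for due in payment_dates(security_1, date(2009, 3, 5), date(2009, 5, 17)):
    print(due)
```

With the four sample entries this reproduces the seven dates listed above.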

Why not store the next payment date as the amount of days from the date of the current payment?
Further clarification:
There would be a fact for every past payment, linked to some date dimension. Each of these facts will have a field, "next payment in", which will be an integer number of days. The idea is that the date of the current payment plus "next payment in" gives the date of the next payment fact. This should be able to cater for everything.
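A small Python sketch of that idea, assuming each payment fact is a (payment_date, next_payment_in) pair with the gap stored in days. The sample facts and the function name are hypothetical.

```python
from datetime import date, timedelta

# Hypothetical payment facts: (payment_date, next_payment_in_days).
PAYMENTS = [
    (date(2009, 1, 15), 31),
    (date(2009, 2, 15), 28),
    (date(2009, 3, 15), 31),
]


def next_payment_from_gap(payments, as_of):
    """Next payment date as of `as_of`, derived from the latest past payment."""
    past = [(paid, gap) for paid, gap in payments if paid <= as_of]
    if not past:
        return None  # no payment has happened yet, so nothing to derive from
    paid, gap = max(past)  # most recent payment on or before the as-of date
    return paid + timedelta(days=gap)


print(next_payment_from_gap(PAYMENTS, date(2009, 1, 31)))
```

One consequence of this design is that the next payment date is only available once at least one payment fact exists for the security.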

Related

Number of monthly occurrences of bill date between pay dates

For information; today is 24/01/2018.
I have formulas that put my previous pay date in a cell of a worksheet named in the format "YYYYMM". In this case cell 201712!B1, which is based on the month of the last pay date that has occurred.
The next pay date is in template!b1.
The date a bill comes out of my account is in [#[start date]]
The best formulas I've managed to come up with at the moment are the array formulas:
To work out how many times payments have occurred:
{=SUMPRODUCT(--(TEXT(ROW(INDIRECT(INDIRECT(TEXT(DATE(YEAR(TEMPLATE!$B$1),MONTH(TEMPLATE!$B$1)-1,DAY(TEMPLATE!$B$1)),"yyyymm")&"!$b$1")&":"&DATE(YEAR(TODAY()),MONTH(TODAY()),DAY(TODAY())))),"dd")=TEXT([#[START DATE]],"dd")))}
The above seems to be functioning okay... but I'm now questioning it, as
To work out how many times payments will still occur I have:
{=SUMPRODUCT(--(TEXT(ROW(INDIRECT(DATE(YEAR(TODAY()),MONTH(TODAY()),DAY(TODAY())+1)&":"&TEMPLATE!$B$1-1)),"dd")=TEXT([#[START DATE]],"dd")))}
My last pay date was 22/12/2017 and my next pay date is 25/01/2018.
The second formula is showing that I still have one payment left to make for a payment that occurs on the 25th of every month within this pay period of which today should be the last day.
I think I may be overcomplicating this... any help would be much appreciated.
Your second formula malfunctions because of how the ROW function works. For example, ROW(3:1) gives you the same result as ROW(1:3), so when today+1 is after next payday-1 your formula still gets both of those dates, and so it counts a payment date that's out of range.
Why not just check if bill dates are after today, but before the next payday, like this:
=COUNTIFS([#[START DATE]],">"&TODAY(),[#[START DATE]],"<"&TEMPLATE!$B$1)
For bills paid between last payday and today inclusive you can do a similar thing, i.e.
=COUNTIFS([#[START DATE]],">="&INDIRECT(TEXT(EDATE(TEMPLATE!$B$1,-1),"yyyymm")&"!B1"),[#[START DATE]],"<="&TODAY())
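To make the range problem concrete outside of Excel, here is a minimal Python sketch that counts how often a bill's day of the month falls inside a date range and treats a reversed range as empty. The function name and the inclusive bounds are assumptions for illustration.

```python
from datetime import date, timedelta


def count_bill_days(bill_day, start, end):
    """Count dates in [start, end] whose day of month equals `bill_day`."""
    if start > end:
        return 0  # a reversed range is empty, unlike ROW(3:1) in Excel
    count = 0
    current = start
    while current <= end:
        if current.day == bill_day:
            count += 1
        current += timedelta(days=1)
    return count


today = date(2018, 1, 24)
last_pay = date(2017, 12, 22)
next_pay = date(2018, 1, 25)
one_day = timedelta(days=1)

# Payments still to come: tomorrow up to the day before the next payday.
print(count_bill_days(25, today + one_day, next_pay - one_day))
# Payments already made: last payday up to today.
print(count_bill_days(25, last_pay, today))
```

With the dates from the question, the "still to come" range runs from the 25th back to the 24th, so it is empty and no payment on the 25th is counted.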

Google Data Studio date aggregation - average number of daily users over time

This should be simple so I think I am missing it. I have a simple line chart that shows Users per day over 28 days (X axis is date, Y axis is number of users). I am using hard-coded 28 days here just to get it to work.
I want to add a scorecard for average daily users over the 28 day time frame. I tried to use a calculated field AVG(Users) but this shows an error for re-aggregating an aggregated value. Then I tried Users/28, but the result oddly is the value of Users for today. The division seems to be completely ignored.
What is the best way to show the average number of daily users over a time frame? For example, average daily users over 10 days, 20 days, etc.
Try creating a new metric that counts the dates, e.g.
Count of Date = COUNT(Date) or
Count of Date = COUNT_DISTINCT(Date) in case you have duplicated dates
Then create another metric for average users
Users AVG = (Users / Count of Date)
The average depends on the timeframe you have selected. If you are selecting the last 28 days the average is for those 28 days (dates), if you filter 20 days the average is for those 20 days etc.
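The same arithmetic, written as a minimal Python sketch: total users divided by the number of distinct dates in the selected range. The (date, users) row shape is an assumption, and this only illustrates the calculation, not how Data Studio evaluates it internally.

```python
from datetime import date


def average_daily_users(rows):
    """Average users per day for (date, users) rows.

    Dates may repeat, so the divisor is the count of distinct dates.
    """
    distinct_dates = {day for day, _ in rows}
    if not distinct_dates:
        return 0.0  # nothing selected, so avoid dividing by zero
    total_users = sum(users for _, users in rows)
    return total_users / len(distinct_dates)


rows = [
    (date(2018, 1, 1), 10),
    (date(2018, 1, 1), 5),   # duplicated date, as in the COUNT_DISTINCT case
    (date(2018, 1, 2), 15),
]
print(average_daily_users(rows))
```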
Hope that helps.
I have been able to do this in an extremely crude and ugly manner using Google Sheets as a means to do the calculation and serve as a data source for Data studio.
This may be useful for other people trying to do the same thing. This assumes you know how to work with GA data in Sheets and are starting with a Report Configuration. There must be a better way.
Example for Average Number of Daily Users over the last 7 days:
Edit the Report Configuration fields:
Report Name: create one report per day, in this case 7 reports. Name them (for example) Users-1 through Users-7. These are your Row 2 values. You'll have 7 columns, with the first report name in column B.
Start Date and End Date: use TODAY()-X where X is the number of days previous to define the start and end dates for each report. Each report will contain the user count for one day. Report Users-1 will use TODAY()-1 for start and end, etc.
Metrics: enter the metrics, e.g. ga:users and ga:newUsers
Create the reports
Use 'Run reports' to have the result sheets created and populated.
Create a sheet for an interim data set you will use as the basis for the average calculation. The first column is date, the remaining columns are for the metrics, in this case Users and New Users.
Populate the interim data set with the dates and values. You will reference the Report Configuration to get the dates, and you will pull the metrics from each of the individual reports. At this stage you have a sheet with the date in the first column and the values in subsequent columns, with one row for each day's values. Be sure to use a header.
Finally, create a sheet that averages the values in the interim data set. This sheet will have a column for each metric, with one value per column. The one value is calculated from the series in the interim data set, for example =AVERAGE(interim_sheet_reference:range) or any other calculation you'd like to do.
At last, you can use Data Studio to connect to this data source and use the values. For counts of users such as this example, you would use Sum as the aggregation field type when you are creating the data source.
It's super ugly but it works.

Keep PivotTable report filter after data refresh

I have a PivotTable (actually it is five PivotTables, each on its own separate sheet) that is created from a query of an outside database. Each of the PivotTables represents a day (i.e. Today, Tomorrow, Today+2, Today+3, and Today+4). For the report filter for the first two, we use a date range filter of today and tomorrow which automatically filters the data and allows it to roll over. We created custom date ranges for the other three days, but upon every external data refresh we have to go into each sheet and reselect the report filter from all to the specified time frame. This data rolls over every day so we can see the lineup for the next 96 hours out.
Is there a way to either keep the PivotTable report filter criteria (VBA and macros are both acceptable, although we are also fairly new to both)?
Or is there some super secret way to extend the report filter from just today and tomorrow to a time range (48 hours, 96 hours) instead of next month?
I need the days to be separated, so next week will not work because all the days will populate on one page.
Without seeing a real example it's hard to tell, but how about changing the query to a relative date index, i.e. something like
SELECT DATEDIFF('day', GETDATE(), report_dt) AS days_from_today FROM reporting_table
And then set your report filters on this relative date index (days_from_today = 1 for tomorrow, etc)? You can always create another Excel column in the report =TODAY() + days_from_today to get your absolute date back. (Assuming you are just dealing with one time zone for reporting purposes.)
I.e., instead of rolling filters, keep the filters on constant indices, and let the indices cover a rolling date range. I'm not sure Excel is smart enough to do the rolling filters thing.
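The relative-index idea can be sketched in Python as follows, assuming report dates are plain dates in a single time zone. The function names and the 0 to 4 offset range (today through today+4) are illustrative assumptions.

```python
from datetime import date


def days_from_today(report_dt, today):
    """Relative date index: 0 for today, 1 for tomorrow, and so on."""
    return (report_dt - today).days


def bucket_by_offset(report_dates, today, max_offset=4):
    """Group report dates by offset, one bucket per PivotTable sheet."""
    buckets = {offset: [] for offset in range(max_offset + 1)}
    for report_dt in report_dates:
        offset = days_from_today(report_dt, today)
        if offset in buckets:  # ignore dates outside the rolling window
            buckets[offset].append(report_dt)
    return buckets


today = date(2018, 1, 24)
dates = [date(2018, 1, 24), date(2018, 1, 26), date(2018, 1, 26), date(2018, 2, 5)]
print(bucket_by_offset(dates, today))
```

Because each sheet filters on a constant offset, nothing needs to be reselected when the data refreshes; only the dates behind each offset change.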

MS SQL - Calculating plan payments for a month

I need to calculate how much a plan has cost the customer in a specific month.
Plans have floating billing cycles of a month's length - for example a billing cycle can run from '2014-04-16' to '2014-05-16'.
I know the start date of a plan, and the end date can either be a specific date or NULL if the plan is still running.
If the end date is not null, then the customer is charged for a whole month - not pro rated. Example: The billing cycle is going from the 4th to 4th each month, but the customer ends his plan on the 10th, he will still be charged until the 4th next month.
Can anyone help me? I feel like I've been going over this a million times, and just can't figure it out.
Variables I have:
#planStartDate [Plan's start date]
#planEndDate [Plan's end date - can be null]
#billStartDate [The bill's start date - example: 2015-02-01]
#billEndDate [One month after bill's start date - 2015-03-01]
#price [the plan's price per billing cycle]
Here's the best answer I can give based on the very limited information you have given so far (by the way, in the future it would really help people answer your question faster and more efficiently if you could specify a lot more info: tables involved, all columns, etc.):
"I need to calculate how much a plan has cost the customer in a specific month."
I assume you have a column of some sort in this table to distinguish between customers, called customerID here:
SELECT customerID, SUM(price) FROM table_foo
WHERE planStartDate BETWEEN 'start of the period you specify' AND 'end of the period you specify'
GROUP BY customerID
It's a bit of a rough query, but that's the best I can give until you specify your variables more clearly (i.e. tables involved, ALL columns in the table, etc.).
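For what it's worth, the billing rule described in the question could be sketched in Python like this. It rests on several assumptions that the question does not confirm: a full charge falls due on the start date of each billing cycle, a cycle is charged only if it starts before the plan's end date, the bill period is half-open (bill start inclusive, bill end exclusive), and cycle dates clamp to the last day of shorter months. All names are hypothetical.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Move `start` forward by whole months, clamping to the month's end."""
    total = start.month - 1 + months
    year, month = start.year + total // 12, total % 12 + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def plan_cost_in_bill_period(plan_start, plan_end, bill_start, bill_end, price):
    """Sum the full-cycle charges whose cycle start falls in the bill period."""
    cost = 0
    k = 0
    while True:
        cycle_start = add_months(plan_start, k)
        if cycle_start >= bill_end:
            break  # later cycles belong to later bills
        if plan_end is not None and cycle_start >= plan_end:
            break  # the plan ended before this cycle began
        if cycle_start >= bill_start:
            cost += price  # never pro-rated: always a whole cycle
        k += 1
    return cost


print(plan_cost_in_bill_period(
    date(2014, 4, 16), None, date(2015, 2, 1), date(2015, 3, 1), 100))
```

If charges are instead recognised at the end of a cycle, or spread across the days of the cycle, the comparison against the bill period would need to change accordingly.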

Forecast on payments for 5 Years

I've been given the task of creating a data extract that shows how much money an account will make over the next five years (monthly), based on its current rate of payment.
So using 1 example account that pays £1000 a month, which owes £12,000 overall, I can predict we will get £1000 a month and that in 12 months the payment will be settled and every month from then would carry a zero.
I can't work out how to create an SQL query that gives me that. It needs to look at the balance and the last payment received, then create a payment for each future month until that balance is zero.
I think I may be overcomplicating it. All I have managed to come up with so far is some code that tells me how many months it would take to clear the balance based on the last payment, which is basically just the balance divided by the payment. So as it stands, I have a query that gets the last payment and the number of months until the balance is paid. What I need is a way to use that count of months to generate the future months.
Desired output:
Month|Payment
Jun-14|£1000.00
Jul-14|£1000.00
Aug-14|£1000.00
Sep-14|£1000.00
Oct-14|£1000.00
Nov-14|£1000.00
Dec-14|£1000.00
Jan-15|£1000.00
Feb-15|£1000.00
Mar-15|£1000.00
Apr-15|£1000.00
May-15|£1000.00
Jun-15|£0.00
Jul-15|£0.00
Aug-15|£0.00
Sep-15|£0.00
Oct-15|£0.00
Nov-15|£0.00
Dec-15|£0.00
Jan-16|£0.00
Feb-16|£0.00
Mar-16|£0.00
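The logic behind the desired output can be sketched in Python as follows, as a way to check the numbers before writing the SQL. It assumes the payment stays constant, that a final partial payment is allowed when the balance is not an exact multiple of the payment, and that the forecast covers 60 months; the function name is made up.

```python
from datetime import date


def forecast(balance, monthly_payment, first_month, months=60):
    """Return (month_label, payment) rows until the balance is cleared.

    Months after the balance reaches zero carry a payment of 0.
    """
    rows = []
    year, month = first_month.year, first_month.month
    remaining = balance
    for _ in range(months):
        # Never pay more than what is still owed.
        payment = min(monthly_payment, remaining)
        remaining -= payment
        label = date(year, month, 1).strftime("%b-%y")  # e.g. Jun-14
        rows.append((label, payment))
        year, month = (year + 1, 1) if month == 12 else (year, month + 1)
    return rows


for label, payment in forecast(12000, 1000, date(2014, 6, 1))[:14]:
    print(f"{label}|£{payment:.2f}")
```

In SQL the equivalent is usually a join against a calendar or numbers table that supplies one row per future month, with the payment set to zero once the running total reaches the balance.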
