Can we get forecasts on training data using the GluonTS DeepAR method? - forecasting

I want to compute the residuals on the training dataset, i.e. actual value minus fitted value.
I have tried the GluonTS DeepAR estimator. It works fine when I train on the training data and predict on the validation dataset, but I am unable to get forecast values for the training data itself.
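One possible way to approximate in-sample residuals, sketched under assumptions: a forecaster such as DeepAR predicts the steps that follow the end of the series it is given, so the training series can be truncated at successive cut points, each following window forecast, and the forecast subtracted from the held-back actuals. The helper names (rolling_windows, in_sample_residuals) and the naive last_value_forecast stand-in are hypothetical; in practice the stand-in would be replaced by a call to the trained predictor, which is not shown here.

```python
from typing import Callable, List, Sequence, Tuple


def rolling_windows(
    series: Sequence[float], context_length: int, prediction_length: int
) -> List[Tuple[List[float], List[float]]]:
    """Split a training series into (history, actual) pairs.

    Each history ends at a cut point inside the training range, and the
    matching actual values are the prediction_length steps right after it.
    """
    windows = []
    cut = context_length
    while cut + prediction_length <= len(series):
        history = list(series[:cut])
        actual = list(series[cut:cut + prediction_length])
        windows.append((history, actual))
        cut += prediction_length
    return windows


def last_value_forecast(history: Sequence[float], prediction_length: int) -> List[float]:
    """Hypothetical stand-in for a trained model: repeat the last observed value."""
    return [history[-1]] * prediction_length


def in_sample_residuals(
    series: Sequence[float],
    context_length: int,
    prediction_length: int,
    forecast_fn: Callable[[Sequence[float], int], List[float]],
) -> List[float]:
    """Residuals (actual minus fitted) over every window after the first context."""
    residuals: List[float] = []
    for history, actual in rolling_windows(series, context_length, prediction_length):
        fitted = forecast_fn(history, prediction_length)
        residuals.extend(a - f for a, f in zip(actual, fitted))
    return residuals


# Made-up training series, for illustration only.
train = [10.0, 12.0, 11.0, 13.0, 15.0, 14.0, 16.0, 18.0]
residuals = in_sample_residuals(train, 4, 2, last_value_forecast)
```

Note that the first context_length points get no fitted value under this scheme, because there is no earlier history to forecast them from.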

Related

Add predictor to model quantile function in Tableau

Does anyone know how to get a model quantile function to project into the future using a predictor variable/measure? I have support ticket volume that I'm trying to predict using previous months and another measure.
MODEL_QUANTILE(0.5, SUM(Volume), ATTR(DATETRUNC('month', [fiscal date])), SUM([Time / User (min)]))
If I take out the last predictor (Time/User), it produces a smooth curve into the future, but with it included the forecast only extends to the current month. Both measures stop at this month, and I've tried "Infer Properties from Missing Values" and "Extend Date Range".

Time series forecast with TensorFlow Probability - seasonality parameters

I was using SARIMAX for time series forecasting and getting decent results. I am currently exploring TensorFlow Probability and came across "tfp.sts.forecast".
About the data: I have a sample time series with hourly observations. It can have hour-of-day as well as day-of-week seasonality.
For building the model, as shown in the samples, I tried setting up the effects as below:
    hour_of_day_effect = tfp.sts.Seasonal(
        num_seasons=24,
        num_steps_per_season=1,
        observed_time_series=observed_time_series,
        allow_drift=True,
        name='hour_of_day_effect')
    day_of_week_effect = tfp.sts.Seasonal(
        num_seasons=168,
        num_steps_per_season=1,
        observed_time_series=observed_time_series,
        allow_drift=True,
        name='day_of_week_effect')
    autoregressive = tfp.sts.Autoregressive(
        order=1,
        observed_time_series=observed_time_series,
        name='autoregressive')
    model = tfp.sts.Sum([hour_of_day_effect,
                         day_of_week_effect,
                         autoregressive],
                        observed_time_series=observed_time_series)
My question is about "day_of_week_effect". For my hourly data, if I set it up as
    num_seasons=7,
    num_steps_per_season=24,
I do not get good results.
For example, if I see a spike every Friday, it is not captured correctly with the values above.
But if I set them up as
    num_seasons=168,
    num_steps_per_season=1,
the spike is captured correctly. (I arrived at 168 as 24 * 7.)
Could you please explain this behavior?
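A small sketch of how the two parameterizations differ, assuming the documented meaning of num_steps_per_season (the number of consecutive time steps that share one seasonal effect). It uses plain Python rather than TFP, the helper name season_index is hypothetical, and it assumes the series starts on a Monday at 00:00. With 7 seasons of 24 steps, all 24 hours of a Friday share a single effect, so this component alone cannot single out a spike at one particular hour; with 168 seasons of 1 step, every hour of the week gets its own effect. This may be one reason for the difference described above.

```python
def season_index(t: int, num_seasons: int, num_steps_per_season: int) -> int:
    """Return which seasonal effect applies at hourly time step t.

    Consecutive blocks of num_steps_per_season steps share one effect,
    and the effects repeat after num_seasons blocks.
    """
    return (t // num_steps_per_season) % num_seasons


HOURS_PER_DAY = 24
FRIDAY = 4  # assumption: step 0 is Monday 00:00, so Friday is day index 4

friday_hours = range(FRIDAY * HOURS_PER_DAY, (FRIDAY + 1) * HOURS_PER_DAY)

# num_seasons=7, num_steps_per_season=24: one shared effect for the whole day.
effects_daily = {season_index(t, 7, 24) for t in friday_hours}

# num_seasons=168, num_steps_per_season=1: a separate effect for every hour.
effects_hourly = {season_index(t, 168, 1) for t in friday_hours}

print(len(effects_daily), len(effects_hourly))
```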

How Can I get Tableau to just Pull a Specific Data Point

I am new to Tableau and am trying to figure out how to isolate a row of data in a calculated field.
For example, to get a certain piece of data for a certain year only and then use that for calculations.
I want to use a calculated field to find the percentage change in the undocumented population between 2000, 2010, and 2018.
I have tried the following to isolate the value for 2018:
(IF STR([Year])='2018' THEN [Undocmented Population] ELSE NULL end)
and also this, but Tableau complains that it needs to be an aggregate measure:
{ FIXED [State]: IF STR([Year])='2018' THEN [Undocmented Population] ELSE NULL end}
I just want to isolate the data points for the various years per state so that I can perform a simple percentage-change calculation, but I cannot figure out how to do it.
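The logic being asked for (pick one year's value per state, then compare years) can be sketched outside Tableau as follows. This is a Python illustration of the arithmetic only, not Tableau syntax; the sample figures, the "Population" key, and the helper names are made up for the example.

```python
from typing import Dict, List, Union

Row = Dict[str, Union[str, int, float]]

# Hypothetical sample rows; the real data would come from the workbook.
rows: List[Row] = [
    {"State": "A", "Year": 2000, "Population": 100.0},
    {"State": "A", "Year": 2010, "Population": 150.0},
    {"State": "A", "Year": 2018, "Population": 120.0},
    {"State": "B", "Year": 2000, "Population": 200.0},
    {"State": "B", "Year": 2010, "Population": 250.0},
    {"State": "B", "Year": 2018, "Population": 300.0},
]


def value_for_year(data: List[Row], state: str, year: int) -> float:
    """Aggregate (sum) the values of one state, keeping only the given year."""
    return sum(
        float(r["Population"])
        for r in data
        if r["State"] == state and r["Year"] == year
    )


def percent_change(old: float, new: float) -> float:
    """Percentage change from old to new."""
    return (new - old) * 100.0 / old


a_2000 = value_for_year(rows, "A", 2000)
a_2010 = value_for_year(rows, "A", 2010)
a_2018 = value_for_year(rows, "A", 2018)
change_2000_2010 = percent_change(a_2000, a_2010)
change_2010_2018 = percent_change(a_2010, a_2018)
```

The key step is that the year filter sits inside the aggregation, so each year's value is already an aggregate before the two years are compared.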

Google Data Studio with BigQuery Data Source Issue in Calculated Fields and Aggregation

I have a Google Data Studio dashboard that loads really slowly because it uses Google Sheets as a data source. I migrated the same data to BigQuery and used it as my new data source; however, I came across an issue:
When I create a calculated field, it is not tagged as Auto in the Default Aggregation, so I still have to select Sum as the Default Aggregation. This causes problems in my report. Also, the field is not blue, whereas normal fields are shown in green and calculated fields are shown in blue.
When I was using Google Sheets, I could do direct computations in the calculated fields.
Example:
Handle Time = Talk Time / Number of calls
I just create a calculated field called Handle Time, then put the formula Talk Time / Number of calls
Now, I need to create 3 separate Calculated Fields:
Calculated Field 1: SUM(Talk Time)
Calculated Field 2: SUM(Number of calls)
Calculated Field 3: Calculated Field 1 / Calculated Field 2
This is even though I already tagged them as Sum in the Default Aggregation. Can anyone help me understand what I'm doing wrong?
Solution:
A single calculated field will do the trick; the aggregation of each respective field needs to be stated explicitly in the calculated field:
SUM(Talk Time) / SUM(Number of calls)
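To illustrate why the aggregation has to be stated explicitly, here is a small Python sketch with made-up numbers (the rows and variable names are hypothetical, not taken from the report). It contrasts the ratio of the sums, which is what SUM(Talk Time) / SUM(Number of calls) computes, with the sum of per-row ratios, which is generally a different number.

```python
# Each tuple is (talk_time_minutes, number_of_calls) for one row of data.
# These values are invented purely for illustration.
rows = [(300.0, 10), (200.0, 5), (100.0, 5)]

total_talk_time = sum(talk for talk, _ in rows)
total_calls = sum(calls for _, calls in rows)

# Aggregate first, then divide: the overall handle time per call.
ratio_of_sums = total_talk_time / total_calls

# Divide row by row, then aggregate: a different quantity.
sum_of_ratios = sum(talk / calls for talk, calls in rows)

print(ratio_of_sums, sum_of_ratios)
```

Writing the aggregation inside the formula removes the ambiguity about which of the two is meant.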
Why the Change?
To elaborate, the change was part of the Data Modeling update on 31st October 2020. One benefit of explicitly stating the aggregation is greater flexibility: fields can be aggregated as required when creating a calculated field, for example:
MAX(Talk Time) - MIN(Talk Time) / COUNT(Handle Time) * AVG(Handle Time) / COUNT_DISTINCT(Text_Field1) * COUNT(Text_Field2)
Speed
Regarding speed, if the data set is large and static (daily updates are fine and real-time data is not required), a Data Extract would be a good option.
Dimensions are shown in green and metrics in blue. Data imported from other sources, particularly from Google Sheets, tends to show metrics in green, but when you add them to a chart or table they are recognised as metrics and change to blue.

How do I create a default everyday date dimension?

I am trying to create a line chart counting all the opt-ins per date; however, the only dimensions it will allow me to choose from have to be date columns in my source. The problem is that it only uses dates that are populated in those fields with an opt-in date.
For example: I have 5 opt-ins on 1/1/2019, 0 on 1/2/2019, and 3 on 1/3/2019.
If I use this series and want to include another metric, 1/2/2019 will not show anything for that other metric.
I just want a standard everyday series that counts every metric on a given day. The Google Analytics connection source has a generic Date dimension, but I cannot figure out how it was done.
I've tried creating a new column with every date in it and using that as a dimension, without any luck.
You should be able to use a Time Series graph (of which there are 3 types) instead of a Line graph.
A Time Series will keep the days where no data is available, unlike the Line graph, which only shows labels for dates that have values in the data.
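If the gaps need to be filled in the data itself before it reaches the chart, the idea can be sketched as below. This is a minimal Python illustration using the example counts from the question; the function name fill_daily is hypothetical and this is not a Data Studio feature.

```python
from datetime import date, timedelta
from typing import Dict


def fill_daily(counts: Dict[date, int], start: date, end: date) -> Dict[date, int]:
    """Return one entry per calendar day from start to end (inclusive).

    Days that are missing from counts are filled with 0.
    """
    filled: Dict[date, int] = {}
    day = start
    while day <= end:
        filled[day] = counts.get(day, 0)
        day += timedelta(days=1)
    return filled


# Only days with opt-ins exist in the source, as in the question.
optins = {date(2019, 1, 1): 5, date(2019, 1, 3): 3}
daily = fill_daily(optins, date(2019, 1, 1), date(2019, 1, 3))

for day, count in daily.items():
    print(day.isoformat(), count)
```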
