How can I calculate between two dates per User Pseudo ID for specific events? - google-data-studio

I linked Firebase to BigQuery and start using Google Data Studio to create a table to list users by "User Pseudo ID".
My goal is to calculate the difference between two dates, the date of first_open and the date of app_remove to come up with an average retention time.
How can I write the right query in Data Studio?

It can be achieved using the three step process below:
1) HH:MM:SS
The Calculated Field below uses the DATETIME_DIFF function to find the difference between app_remove and first_open, and displays the difference in SECOND (for future reference, set the third input DATETIME_DIFF as required, for example, to view the difference in days, set the input to DAY):
DATETIME_DIFF(app_remove, first_open, SECOND)
2) Type (HH:MM:SS)
Number > Duration (Sec.)
3) Aggregation (HH:MM:SS)
AVG
Google Data Studio Report and a GIF to elaborate:

DATE_DIFF may be what you are looking for.
That is if first_open and app_remove are date fields or date expressions

Related

Comparation Periods in Google Data Studio

I'm creating a in google data studio that has a table behind with data by day.
I need it to be comparable with the month before, but there's a catch that I'm currently stuck!
The period should be something like:
DAY(date)/MONTH(date)-1/YEAR(date)
This allows the comparation between periods with different number of days, example:
Date of analysis: 28/06/2021 - 27/07/2021
Date of comparation: 28/05/2021 - 27/06/2021
When trying to create something like this in DataStudio (with date range controls) none of the options does this, and for what I've explored, there isn't an option to do a formula like the one above.
The closest I get is "Previous Period" but that makes the Date of Comparation 29/05/2021 - 27/06/2021, missing the 28/05/2021.
I'm really stuck and running out of ideas, I've even considered changing the SQL query behind to convert the days somehow.
I've done this by mocking the dates for the comparator.
The other way is to create two series, and don't use the comparator.
Series #1 will have label 28/06/2021 and data from 28/06/2021 (current period).
Series #2 will have label 28/06/2021 (the same) and data from 28/05/2021 (previous period).
They plot nicely:

Can time series settings in Google Data Studio change depending on date format?

I have a table with column "date" in YYYY-MM-DD format HH:MM:SS:MMM (2015-01-27 11:22:03:742). I'm trying to make a time series with the dimension of month/year grouping, to display the total number of records by period.
Settings:
period dimension: date (type: date and time)
period: date (type: year and month)
metric: record count
My time graph doesn't display anything. Can someone help me identify what's going on?
formatDate is the column created with the expression:
PARSE_DATETIME("%Y-%m-%d %H:%M:%S",REGEXP_EXTRACT( create_date,"(.*):[0-9]*"))
Using the date in its standard format, as mentioned at the beginning of the question, the same happens.
When entering dates (original and formatted), both appear with null values.
The milliseconds have to be separated by a . not a :. An option is to import your date a as string/text and add a calculated field, which parse the string in Data Studio:
PARSE_DATETIME("%Y-%m-%d %H:%M:%S",REGEXP_EXTRACT( data_field,"(.*):[0-9]*"))
If the dates are several years in the past, please adjust the Default date range in your graph:
I leave the solution to my problem to the community.
The problem is in the date format. Failed to get Google Data Studio to receive a date with milliseconds. By removing the milliseconds it was possible to work with the dates normally, managing to apply the available functions.
Note: It may be a knowledge limitation, but none of the date formatting functions work if the datetime field contains milliseconds (FORMAT_DATETIME, PARSE_DATETIME,...)

Data Studio - Grouping by Week

I have a simple Data Studio table consisting of two columns. The first column is the week (ISO Year Week) and the second column is the total registrations we've received for that week.
However, my Week column repeats 7 times (7 Rows) for each week as it's counting by day within that week. See below:
Is there any way to get this to group by the listed week? Below are my settings:
Dimension = Conversion Date set as "ISO Year Week" for the type.
Metric = Equals the count of Conversion Date (Same Conversion Date field used for dimension)
Any help would be much appreciated.
There might be an issue with the date format of the source. Without knowing the source (e.g. Google Analytics or Sheets) it’s hard to tell.
Blended Data
I recently had this issue with blended data. The response of a similar question helped me to find a way.
Basically you have to add a new custom field to the data source with the formula WEEK(date_field_link). Data studio will recognise this as a date in compatibility mode, but for me it still works. Then you can use this new date field as a join key to blend the data while grouping it in weeks.
Normal Data
If this problem appears in a regular non-blended dataset you might want to check if Data Studio correctly catches the date as a date. This help article from Google might be worth checking out: https://support.google.com/datastudio/answer/6401549?hl=en#zippy=%2Cin-this-article
I made a similar case work using blended data.
Your column "Conversion Date" repeats the same week 7 times because it's just a display value. Every row has a date value (year, month and a day) but you're just showing the corresponding week. So, data-studio treats them as different data and doesn't group them.
To be able to group them by Week you need to create a new field with a value containing only the week and the year. So, you can use the formula
YEARWEEK(your_date)
The resulting Date will be groupable.
NB1: If your date isn't of the type Date, you can parse it from text to date using
the method:
PARSE_DATE("%Y-%m-%d", your_date_text)
NB2: If the created field has the type number and doesn't show the possibility to change type to Date, you can do this trick: (it's weird but it works):
First type as a formula for the created field and apply:
MONTH(your_date)
This will unlock the compatibility Date types. You can choose from them the ISO Year Week type.
and then change the formula from MONTH(your_date) to YEARWEEK(your_date) [your formula] as I explained above. The chosen date type won't go away even if it wasn't available the first time.

Google Data Studio date aggregation - average number of daily users over time

This should be simple so I think I am missing it. I have a simple line chart that shows Users per day over 28 days (X axis is date, Y axis is number of users). I am using hard-coded 28 days here just to get it to work.
I want to add a scorecard for average daily users over the 28 day time frame. I tried to use a calculated field AVG(Users) but this shows an error for re-aggregating an aggregated value. Then I tried Users/28, but the result oddly is the value of Users for today. The division seems to be completely ignored.
What is the best way to show average number of daily users over a time frame? Average daily users over 10 days, 20 day, etc.
Try to create a new metric that counts the dates eg
Count of Date = COUNT(Date) or
Count of Date = COUNT_DISTINCT(Date) in case you have duplicated dates
Then create another metric for average users
Users AVG = (Users / Count of Date)
The average depends on the timeframe you have selected. If you are selecting the last 28 days the average is for those 28 days (dates), if you filter 20 days the average is for those 20 days etc.
Hope that helps.
I have been able to do this in an extremely crude and ugly manner using Google Sheets as a means to do the calculation and serve as a data source for Data studio.
This may be useful for other people trying to do the same thing. This assumes you know how to work with GA data in Sheets and are starting with a Report Configuration. There must be a better way.
Example for Average Number of Daily Users over the last 7 days:
Edit the Report Configuration fields:
Report Name: create one report per day, in this case 7 reports. Name them (for example) Users-1 through Users-7. These are your Row 2 values. You'll have 7 columns, with the first report name in column B.
Start Date and End Date: use TODAY()-X where X is the number of days previous to define the start and end dates for each report. Each report will contain the user count for one day. Report Users-1 will use TODAY()-1 for start and end, etc.
Metrics: enter the metrics e.g. ga:users and ga:new users
Create the reports
Use 'Run reports' to have the result sheets created and populated.
Create a sheet for an interim data set you will use as the basis for the average calculation. The first column is date, the remaining columns are for the metrics, in this case Users and New Users.
Populate the interim data set with the dates and values. You will reference the Report Configuration to get the dates, and you will pull the metrics from each of the individual reports. At this stage you have a sheet with date in first columns and values in subsequent columns with a row for each day's values. Be sure to use a header.
Finally, create a sheet that averages the values in the interim data set. This sheet will have a column for each metric, with one value per column. The one value is calculated from the series in the interim data set, for example =AVG(interim_sheet_reference:range) or any other calculation you'd like to do.
At last, you can use Data Studio to connect to this data source and use the values. For counts of users such as this example, you would use Sum as the aggregation field type when you are creating the data source.
It's super ugly but it works.

Tableau – Using Nested Aggregations to Establish a Weekday/Hour Baseline

Background Information: We have an incident time tracker that tracks how long each user spends with a representative before the issue can be closed. We want to determine the average volume of incidents that are being handled for each hour. To say this in another way: We want to get an hourly baseline for each day of the week that will show us the average total call length within the specific time period. Eg: We want to average the total length of every call on Monday from 9AM-10AM for all the weeks in the database, and the same for other hourly intervals.
The simplest way to think of this is that I want AVG(SUM) for the specific time periods, but Tableau does not allow me to do this.
Tableau Output:
This is the desired, target visualization that I am looking for from Tableau.
SQL Query:
I have written a SQL query that returns the answer:
We are looking at two columns: start_time (time stamp) and interval_seconds(float)
In the inner query I use the hour_start function which truncates the date/time value to the hour start, so I can group by the hour and day of the week in the outer query.
SQL Results:
Question:
Is there a way to solve this problem ENTIRELY in Tableau that would get me the result that I am looking for without having to write any SQL code?
Files Stored on Drive
CSV File:
https://drive.google.com/open?id=0B4nMLxIVTDc7NEtqWlpHdVozRXc
Tableau Worksheet:
https://drive.google.com/open?id=0B4nMLxIVTDc7M3A4Q0JxbGdlTE0
You can use Level of Detail expressions to compute the SUM(interval_seconds) at the hour level and then use AVG to calculate the number you are looking for.
I created a couple of calculations:
hour which is defined as: DATETRUNC('hour',[start_time])
this should be equivalent to your hour_start(start_time).
and interval_hours which is defined as {FIXED [hour] : SUM([interval_seconds])/3600 }
This calculates the aggregate for each start_time truncated to the hour.
After this, you simply calculate AVG(interval_hours) and use it in your view.
I put a workbook in dropbox: https://www.dropbox.com/s/3hfvz8w529g9f46/Interval%20Time%20Baseline.twbx?dl=0
Although the chart looks similar to yours, the numbers I came up with are somewhat different from the "SQL Results" you show. Was the data you provided slightly different?

Resources