DataStudio: Wrong numbers after data blending - google-data-studio

I have 2 tables built from the same data source with the same columns; one of them is filtered. The total count uses the metric COUNT_DISTINCT(another_field). The percentage is the same COUNT_DISTINCT(another_field) metric with the Percent of total comparison function applied. Title is the dimension.
Title | Total count | Percentage
title1 | 1,734 | 47.36%
Title | Total count(filtered) | Percentage(filtered)
title1 | 1,639 | 45.69%
Those numbers are correct. After blending these 2 tables, joined on Title, I get:
Title | Total count | Percentage | Total count(filtered) | Percentage(filtered)
title1 | 1,734 | 6.18% | 1,639 | 9.97%
What happened to the percentages? Why did they change? And how can I build this joined table with the same numbers as in the separate ones?

The cause of my problem is that the data is already aggregated when it is blended, so the percentages lose their reference to the base data. The workaround that worked was to create 2 charts with no aggregated dimensions or metrics (one filtered, the other not), add to the blend any fields that will be used as dimensions, and only then aggregate the data in the chart built on the blend.
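The order of operations in that workaround can be sketched in Python. This is only an illustration under assumptions: the rows, the `passes_filter` flag, and all function names are hypothetical, and the percent-of-total denominator is assumed to be the sum of the per-title counts. Each side is aggregated from its own raw rows first, and the two results are joined by title only afterwards, so each percentage keeps its own total.

```python
from collections import defaultdict

# Hypothetical raw rows: (title, another_field, passes_filter).
ROWS = [
    ("title1", "a", True),
    ("title1", "b", True),
    ("title1", "c", False),
    ("title2", "d", True),
    ("title2", "e", False),
    ("title2", "f", False),
    ("title2", "g", False),
]


def distinct_counts(rows):
    """COUNT_DISTINCT(another_field) per title, computed from raw rows."""
    seen = defaultdict(set)
    for title, value, _ in rows:
        seen[title].add(value)
    return {title: len(values) for title, values in seen.items()}


def percent_of_total(counts):
    """Share of each title in the sum of all per-title counts (assumed total)."""
    total = sum(counts.values())
    return {title: 100.0 * n / total for title, n in counts.items()}


def joined_table(rows):
    """Aggregate each side from raw rows, then join the results on title."""
    all_counts = distinct_counts(rows)
    filtered_counts = distinct_counts([r for r in rows if r[2]])
    all_pct = percent_of_total(all_counts)
    filtered_pct = percent_of_total(filtered_counts)
    return {
        title: (
            all_counts[title],
            all_pct[title],
            filtered_counts.get(title, 0),
            filtered_pct.get(title, 0.0),
        )
        for title in all_counts
    }


TABLE = joined_table(ROWS)
```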

Related

Google Data Studio convert metric to dimension not working

I have imported my GA4 data into Google Data Studio and am trying to see how many giftcards have been sold by their value.
The item revenue metric in GA4 is equal to the giftcard value (i.e. revenue = $200 therefore $200 giftcard was sold).
I want to break down sales by giftcard value like so:
Giftcard (revenue) | Count
$200 | 4
$250 | 3
$300 | 6
To do this, I need to set a copy of item revenue as a dimension rather than a metric.
In Google Data Studio, I can create a calculated field with the following formula that should convert the item revenue into text:
CAST(Item Revenue AS TEXT)
The problem I'm having is that while the formula sets the field type as text, it is still regarded by GDS as a metric and can't be used as a dimension.
Even when I try to add text, GDS still recognises the field as a number:
CONCAT(CAST(Item Revenue AS TEXT), " giftcard")
To use a metric as a dimension, you can create a data blend. When configuring the chart (a table, for example) and its data source, create a blend, but do not join it with any other source; define the blend on the original data alone. You end up with the same data structure, only exposed through a blend.
In a blend, Data Studio treats all calculated fields (metrics) as dimensions, which makes the conversion possible.
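What the conversion achieves can be illustrated outside Data Studio with a short Python sketch. The revenue list is hypothetical, chosen only to reproduce the counts in the question, and `count_by_value` is an illustrative name: each revenue value is turned into a text label and used as a grouping key, which is the role a dimension plays in the table.

```python
from collections import Counter

# Hypothetical item revenue values, one per giftcard sold.
ITEM_REVENUE = [200] * 4 + [250] * 3 + [300] * 6


def count_by_value(revenues):
    """Treat each revenue value as a text label (dimension) and count rows."""
    labels = [f"${value}" for value in revenues]
    return Counter(labels)


COUNTS = count_by_value(ITEM_REVENUE)
```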

Google Data Studio, how to get a sum of all Max or Min values

I am working with a data set where I have to get the Min or Max for different text fields. My dataset can have thousands of rows, so below is a simpler example. I have 3 categories with multiple values each, and I can put this dataset in GDS to build a table with Category as the dimension and Max(Value) as the metric.
Now I need to see the sum of all those values too. But, like a pivot table in Excel, the subtotal in GDS shows the Max of all the max values listed above. So instead of 65, it shows 30 in GDS. Is there a way I can get it to show the sum?
To reach the desired result:
Create a data blend. A second source is not needed; the point is only to have the current source defined as a blend.
In the blend, use the Category dimension and define the Max Value metric. The blend is needed only so that the metric can be used in the table as a dimension (a property that results from blending).
Configure the table with the Category dimension and add the metric with the Sum aggregation on Value. Remember that Value now holds the maximum (as defined in the blend).
Finally, display the Summary row, and the desired result is obtained.
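The two-stage aggregation in these steps can be sketched in Python. The rows are hypothetical, since the original example data is not shown; they are chosen so that the per-category maxima add up to 65 and the largest is 30, as in the question. The first stage plays the role of the blend (Max per Category), the second that of the Summary row (Sum over those maxima).

```python
# Hypothetical (category, value) rows.
ROWS = [
    ("A", 10), ("A", 30),
    ("B", 5), ("B", 20),
    ("C", 15), ("C", 7),
]


def max_per_category(rows):
    """Stage 1: Max(Value) grouped by Category."""
    result = {}
    for category, value in rows:
        if category not in result or value > result[category]:
            result[category] = value
    return result


MAXES = max_per_category(ROWS)
SUM_OF_MAXES = sum(MAXES.values())  # stage 2: what the Summary row should show
MAX_OF_MAXES = max(MAXES.values())  # what a single Max aggregation reports
```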

How to get the value once only from the blended data, when the left dataset makes it repeat in Google Data Studio?

I have blended 2 datasets, joining them by a couple of keys. The left dataset contains data for most of the dates, while the second one has monthly sales goals for each salesman.
I have the daily sales, which give me the correct total when summed, but when I sum the sales goals from the right dataset, each goal is repeated for every occurrence of that salesperson in the left one, giving me the wrong result.
If I put it on a table visual and set its calculation to Average, it gives me the correct sales goal for each person, but the total is wrong and if I put it on a KPI visual, the total is also wrong.
Any help is appreciated.
Thank you!
Sorry, it is not possible to aggregate the same field as an average first and then sum over those averages in a second step.
The total of the "sales goal" can be extracted from the right dataset.
If your data comes from BigQuery, you could take the following steps:
add an "empty" record for each month and salesperson to the left dataset
join the two datasets in BigQuery
add a calculated column "goal per order", which is the "monthly sales goals" divided by the number of orders that salesperson had in that month. This count is count(orders) over (partition by salesmen, month_column)
In Data Studio, the aggregated sum of "goal per order" is then the value for "goal per salesman"
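The "goal per order" idea can be checked with a small Python sketch. The joined rows and the function names are hypothetical, and the sketch assumes every salesperson has at least one row per month (which is what the "empty" record step guarantees). Dividing the monthly goal by the number of rows in the same salesperson and month partition means a plain sum recovers each goal exactly once.

```python
from collections import Counter, defaultdict
from fractions import Fraction

# Hypothetical joined rows: (salesman, month, order_id, monthly_goal).
ROWS = [
    ("ann", "2023-01", "o1", 900),
    ("ann", "2023-01", "o2", 900),
    ("ann", "2023-01", "o3", 900),
    ("bob", "2023-01", "o4", 500),
    ("bob", "2023-01", "o5", 500),
]


def goal_per_order(rows):
    """Monthly goal divided by the row count of its (salesman, month) partition."""
    orders = Counter((salesman, month) for salesman, month, _, _ in rows)
    return [
        (salesman, month, order_id, Fraction(goal, orders[(salesman, month)]))
        for salesman, month, order_id, goal in rows
    ]


def goal_per_salesman(rows):
    """Sum of goal per order, grouped by (salesman, month)."""
    totals = defaultdict(Fraction)
    for salesman, month, _, share in goal_per_order(rows):
        totals[(salesman, month)] += share
    return dict(totals)


TOTALS = goal_per_salesman(ROWS)
NAIVE_TOTAL = sum(goal for _, _, _, goal in ROWS)  # repeated goals, wrong total
```

Exact fractions are used here only to keep the arithmetic free of rounding in the illustration.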

Calculating dynamic pricing on Google Sheets

I have imported data from a trading exchange listing sellers of a particular cryptocurrency.
From this data, I want to create dynamic pricing to display an average cost on an order based on given order size.
I will give an example of what I am looking for:
Example dataset
In this example, we would be purchasing the cryptocurrency 'SINS'. As per the data shown in this table, if 29.06 SINS was purchased, that would fill the first order, and the total BTC paid would be 0.00459 BTC.
If an order was placed for 145 SINS, it would fill the orders up to row 12 and partially fill the order in row 13. By calculating that manually, I know that would cost 0.02293365 BTC (calculated using col D) at an average price of 0.00015816 per SIN.
What I would like to achieve is that, when a number is entered in a cell, the sheet returns the average price of an order of that size, based on the orders imported from the trading exchange.
=INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0),7,4))+(
INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0)+1,4,4))*(O2-
INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0),6,4)))/
INDIRECT(ADDRESS(MATCH(VLOOKUP(O2,F2:F,1),F:F,0)+1,3,4)))
spreadsheet demo
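The calculation the question asks for can also be written as a plain loop. The Python sketch below is an illustration only: the order book is hypothetical (not the data from the example sheet), the function name is made up, and the orders are assumed to be sorted from cheapest to most expensive. It fills orders in sequence, partially fills the last one, and divides the total cost by the requested size.

```python
# Hypothetical sell orders, cheapest first: (price per coin, coins available).
ORDERS = [(1.0, 10.0), (2.0, 20.0), (4.0, 40.0)]


def average_price(orders, size):
    """Average price paid when buying `size` coins from the order book."""
    if size <= 0:
        raise ValueError("order size must be positive")
    remaining = size
    cost = 0.0
    for price, amount in orders:
        filled = min(remaining, amount)  # full fill, or partial fill at the end
        cost += filled * price
        remaining -= filled
        if remaining <= 0:
            break
    if remaining > 0:
        raise ValueError("order book too small for requested size")
    return cost / size
```

For example, `average_price(ORDERS, 15.0)` fills the first order completely and takes 5 coins from the second.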

Efficient Excel formula for returning multiple matches from a large number of rows

I'm stumped by a major issue. I have a data set consisting of about 16000 rows (could be more in future). This list is basically a price list containing products and their corresponding installation fees. Now the products are classified by the following hierarchy: City -> Category -> Rating/Type. Before I was using named ranges to refer to each set by concatenating City & Category & Rating (_XYZ_SPC_9.5). This resulted in about 1500 named ranges which inflated the size of the Excel file. So I decided to calculate the products on-the-fly using inputs from the user. I have tried array formulas and simple formulas but they take some time to calculate (16000 rows!!) which is not acceptable from a usability perspective; our sales people are very particular about how much time they have to spend on the tool.
I have uploaded a sample file at:
Price List Sample
Formulas that I have used so far are:
=IFERROR(INDEX($H$6:$H$15000, SMALL(INDEX(($AE$9=$R$6:$R$15000)*(MATCH(ROW($R$6:$R$15000), ROW($R$6:$R$15000)))+($AE$9<>$R$6:$R$15000)*15000, 0, 0), AC3)),"Not Available")
{=IFERROR(INDEX(ref_PRICE_LIST!$H$6:$H$16074,MATCH(INDEX(ref_PRICE_LIST!$H$6:$H$16074,(SMALL(IF(IF(RIGHT($AE$3,3)="All",ref_PRICE_LIST!$Z$6:$Z$16074,ref_PRICE_LIST!$R$6:$R$16074)=$AE$3,ROW(ref_PRICE_LIST!$H$6:$H$16074)-ROW(ref_PRICE_LIST!$H$6)+1),$AC3))),ref_PRICE_LIST!$H$6:$H$16074,0),1),"Not Available")}
I would really appreciate it if someone could help me out.
Thank you so much!
I think the best way to speed this up is to split the formula into a helper column K and a result column L.
Helper Column (copy down for all 16,000 data rows)
=IF($D:$D=$O$2,ROW(),"")
Result column (starting at L2, copy down as many as you need)
=IFERROR(INDEX($F:$F,SMALL($K:$K,ROW()-1)),"Not available")
I've tested this with about 150,000 rows and it updates in under 1 second.
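The logic behind the two columns can be mirrored in Python to show why it stays cheap. The data and names below are hypothetical, and row numbers are 1-based positions in the list rather than worksheet rows. The helper column does one comparison per row, and each result only picks the k-th smallest stored row number, which is what SMALL followed by INDEX does.

```python
# Hypothetical data: a key column to match on and a value column to return.
KEYS = ["x", "y", "x", "z", "x"]
VALUES = ["p1", "p2", "p3", "p4", "p5"]


def helper_column(keys, wanted):
    """Row number where the key matches, else None (the blank helper cell)."""
    return [row if key == wanted else None
            for row, key in enumerate(keys, start=1)]


def kth_match(values, helper, k, default="Not available"):
    """Value at the k-th smallest matching row, like INDEX(..., SMALL(..., k))."""
    rows = sorted(row for row in helper if row is not None)
    if k < 1 or k > len(rows):
        return default
    return values[rows[k - 1] - 1]


HELPER = helper_column(KEYS, "x")
```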
