SUM of 2 columns from VLOOKUP in Google Sheets when the search key is not in both columns - arrays

I have 2 columns of File # data, representing different weeks in a payroll cycle. I also have 2 columns of Regular Hours data. I am using VLOOKUP and SUM to add the Regular Hours together to receive hours for the pay period.
=SUM(VLOOKUP($AI2,RICS_TimeClocks!O$2:S,4, FALSE) , VLOOKUP($AI2,RICS_TimeClocks!T$2:X,4, FALSE))
I have the File #s and Names flattened into one column each with
=UNIQUE(FLATTEN(
The issue I have now though, is that there are employees that only worked on one of the weeks, resulting in a
"Did not find value '____' in VLOOKUP evaluation"...
Any suggestions so that the formula can function when there is information in only one of 2 data columns? Example, File # 43021 only works in the second week, and File # 43034 only works in the first week, but I still want to be able to compute and display their total hours.
...or a better way to match and add the information into another flattened column of information?

try IFNA set to zero:
=SUM(IFNA(VLOOKUP($AI2, RICS_TimeClocks!O$2:S, 4, 0), 0),
IFNA(VLOOKUP($AI2, RICS_TimeClocks!T$2:X, 4, 0), 0))

Related

Long calculation times with XLOOKUP vs INDEX-MIN-COLUMN

I'm using this formula =IF(B24="","",IFERROR(INDEX(Sheet3!$C$3:$EE$3,,MIN(IF(Sheet3!$C$4:$EE$23=(Sheet2!C24&$K$18),COLUMN(Sheet3!$C:$EE)))-2),"NF")) to return a cell value in the top row of an array - a date in this case.
The search criteria is a combination of a unique project number and a 2 digit status alphanumerical code for the project. The array consists of 23 rows where combinations of the unique numbers are found, each with different status codes.
So essentially, I'm building a FILTERED project status dashboard that returns dates linked to the relevant project status.
The code above is inspired from ( LINK ) that uses a very similar layout, but it uses town suburbs linked to postal codes instead of project numbers and status codes. The formula works well (though, not entered as an array formula), but I don't have a single formula in the sheet, I have 3 300 occurrences of this formula.
The problem comes in when the user changes the FILTER - Excel recalculates the entire dashboard and that takes anywhere from 2 to 5 minutes to run. You hit the escape button and cancel the calculation after setting the filter, but Excel just starts calculating again after a few seconds. After that, Excel's response is sluggish and almost unusable. Yes - our hardware is pretty weak ...
I tried XLOOKUP as well, but can't set the "lookup_array" to an array ( Sheet3!$C$4:$EE$23 ) because it doesn't match the "return-array" ( Sheet3!$C$3:$EE$3 ) Concatenating the lookup arrays with & works, but then you'd have to do that for all 23 rows, and again, multiply that by 3 300.
I thought of creating a UDF, but the function will still be called every time Excel recalculates after filtering... 3 300 calls ...
Any ideas on how to make the INDEX version run faster, or make the XLOOKUP accept the lookup_array as Sheet3!$C$4:$EE$23 in the hopes that it'll run faster?
Thank you!
Not really an elegant solution, but it works.
I imported the dataset into a helper sheet, where I combined the cell value with the corresponding value in Column A for each row ( a name in this case ) and the date from row 1 for each column, using underscore as a delimiter.
This new data range was then given a unique name, EE in this case.
On a second helper sheet, using this formula =INDEX(Filtered,1+INT((ROW('Sheet1'!C3)-1)/COLUMNS(Filtered)),MOD(ROW('Sheet1'!C3)-1+COLUMNS(Filtered),COLUMNS(Filtered))+1) and drag it down till it returns an REF! error and going back one row before the error.
This transposes all the data into a single column G. Using =UNIQUE(SORT(FILTER(B3:B3240,B3:B3240<> "",""))) then gives me a filtered list of unique values in column H that I then run
=IF(H3="","",LEFT(H3, SEARCH("_",H3,1)-1)) for the first data value in I, and
=IF(H3="","",MID(H3, SEARCH("_",H3) + 1, SEARCH("_",H3,SEARCH("_",H3)+1) - SEARCH("_",H3) - 1)) for the middle data value in J, and
=IF(H3="","",IFERROR(TEXT(RIGHT(H3,5),"yyyy-mm-dd"),"NF")) for the last data value in K.
Then just run XLOOPUP across columns I, J and K.
Runs quick and easy and solves a few of the other issue I had as well.
The second data set has just over 35 000 rows - still works well and fast.

EXCEL: How to count if data matches in specific columns where header is within the date range?

I would like to get a count on the total number of lessons of a specific length given within a specific date range.
I can figure out how to get the number of a specific type of lesson on a specific day with something like in:
=countifs(INDEX($E:$V,,MATCH($A8,$E$1:$V$1,1)),"=30")
But I can't figure out how to to find all of the values for say Dates <=A8 (for row 8), or dates >A8 & <=A9 (for row 9).
I am looking to get the data to output like the yellow section.
In Excel:
=countifs(INDEX($E:$V,,MATCH($A8,$E$1:$V$1,1)):INDEX($E:$V,,IFERROR(MATCH($A9,$E$1:$V$1,1)-1,COLUMN($V$1))),"=30")
But note, this will not work in Google Sheets.
I'm not sure it is the most elegant. My solution was to create a hidden column (Now E:E) for the to calculate the days between the A column dates.
Example for cell E9
=SUMPRODUCT(($F$1:$W$1>A8)*($F$1:$W$1<A9))
This returned a number of days between days A7 and A8.
I then modified Scott Craner's formula to this (Example for cell B9):
=COUNTIFS(INDEX($F:$W,,MATCH($A9-$E9,$F$1:$W$1,1)):INDEX($F:$W,,IFERROR(MATCH($A9,$F$1:$W$1,),COLUMN($W$1))),"=30")
Remember now, that after adding a column my dates shifted from E1:V1 to F1:W1
These two steps solved my issue with double counting on certain dates.
Lastly, I went back and tested this with Google Sheets and it does now appear that it will also work with Google Sheets.

MS Excel 2010 Count unique values with multiple criteria and EDATE

I am trying to get a count of all Unique values listed in Col A, by state and within a date range, for example all records up to the end of April 2018.
I am able to get the count of Unique values by state (result is 2) with the below formula:
{=SUMPRODUCT(1*(FREQUENCY(IF($C$2:$C$14=F10,MATCH($A$2:$A$14,$A$2:$A$14,0)),ROW($A$2:$A$14)-ROW($A$2))>0))}
but I am unable to get the IF function to work with EDATE. I tried the following but I'm getting 0 as the result. The result should be 1.
{=SUMPRODUCT(1*(FREQUENCY(IF(D2:D14="<"&EDATE(G1,1),IF($C$2:$C$14=F10,MATCH($A$2:$A$14,$A$2:$A$14,0))),ROW($A$2:$A$14)-ROW($A$2))>0))}
I am unable to use Pivot as I need to include date range filter. Could someone please look at my code and tell me what I'm doing wrong? I am using CSE with my formulas. Thankyou!
I managed to work it out. EDATE wasn't working so I entered the Month date in cell:G1 then referenced it in the IF formula using "<=" eg: IF(D2:D14<=G1).
Whole array formula is:
`{=SUMPRODUCT(1*(FREQUENCY(IF(D2:D14<=G1,IF($C$2:$C$14=F10,MATCH($A$2:$A$14,$A$2:$A$14,0))),ROW($A$2:$A$14)-ROW($A$2))>0))}
I now receive the correct count of 1, though I have to ensure I have entered the last day of the Month in G1. State is referenced in F10 and count of unique values is in Column A.
My full data set is sourced from multiple documents over 5000 rows each and growing daily so my workbook is quite slow to calculate over 1000 array formulas... but it works!
If anyone knows of a faster (possibly non-array) formula, I would appreciate the advice! Thanks!

Google Data Studio date aggregation - average number of daily users over time

This should be simple so I think I am missing it. I have a simple line chart that shows Users per day over 28 days (X axis is date, Y axis is number of users). I am using hard-coded 28 days here just to get it to work.
I want to add a scorecard for average daily users over the 28 day time frame. I tried to use a calculated field AVG(Users) but this shows an error for re-aggregating an aggregated value. Then I tried Users/28, but the result oddly is the value of Users for today. The division seems to be completely ignored.
What is the best way to show average number of daily users over a time frame? Average daily users over 10 days, 20 day, etc.
Try to create a new metric that counts the dates eg
Count of Date = COUNT(Date) or
Count of Date = COUNT_DISTINCT(Date) in case you have duplicated dates
Then create another metric for average users
Users AVG = (Users / Count of Date)
The average depends on the timeframe you have selected. If you are selecting the last 28 days the average is for those 28 days (dates), if you filter 20 days the average is for those 20 days etc.
Hope that helps.
I have been able to do this in an extremely crude and ugly manner using Google Sheets as a means to do the calculation and serve as a data source for Data studio.
This may be useful for other people trying to do the same thing. This assumes you know how to work with GA data in Sheets and are starting with a Report Configuration. There must be a better way.
Example for Average Number of Daily Users over the last 7 days:
Edit the Report Configuration fields:
Report Name: create one report per day, in this case 7 reports. Name them (for example) Users-1 through Users-7. These are your Row 2 values. You'll have 7 columns, with the first report name in column B.
Start Date and End Date: use TODAY()-X where X is the number of days previous to define the start and end dates for each report. Each report will contain the user count for one day. Report Users-1 will use TODAY()-1 for start and end, etc.
Metrics: enter the metrics e.g. ga:users and ga:new users
Create the reports
Use 'Run reports' to have the result sheets created and populated.
Create a sheet for an interim data set you will use as the basis for the average calculation. The first column is date, the remaining columns are for the metrics, in this case Users and New Users.
Populate the interim data set with the dates and values. You will reference the Report Configuration to get the dates, and you will pull the metrics from each of the individual reports. At this stage you have a sheet with date in first columns and values in subsequent columns with a row for each day's values. Be sure to use a header.
Finally, create a sheet that averages the values in the interim data set. This sheet will have a column for each metric, with one value per column. The one value is calculated from the series in the interim data set, for example =AVG(interim_sheet_reference:range) or any other calculation you'd like to do.
At last, you can use Data Studio to connect to this data source and use the values. For counts of users such as this example, you would use Sum as the aggregation field type when you are creating the data source.
It's super ugly but it works.

Efficient Excel formula for returning multiple matches from a large number of rows

I'm stumped by a major issue. I have a data set consisting of about 16000 rows (could be more in future). This list is basically a price list containing products and their corresponding installation fees. Now the products are classified by the following hierarchy: City -> Category -> Rating/Type. Before I was using named ranges to refer to each set by concatenating City & Category & Rating (_XYZ_SPC_9.5). This resulted in about 1500 named ranges which inflated the size of the Excel file. So I decided to calculate the products on-the-fly using inputs from the user. I have tried array formulas and simple formulas but they take some time to calculate (16000 rows!!) which is not acceptable from a usability perspective; our sales people are very particular about how much time they have to spend on the tool.
I have uploaded a sample file at:
Price List Sample
Formulas that I have used so far are:
=IFERROR(INDEX($H$6:$H$15000, SMALL(INDEX(($AE$9=$R$6:$R$15000)*(MATCH(ROW($R$6:$R$15000), ROW($R$6:$R$15000)))+($AE$9<>$R$6:$R$15000)*15000, 0, 0), AC3)),"Not Available")
{=IFERROR(INDEX(ref_PRICE_LIST!$H$6:$H$16074,MATCH(INDEX(ref_PRICE_LIST!$H$6:$H$16074,(SMALL(IF(IF(RIGHT($AE$3,3)="All",ref_PRICE_LIST!$Z$6:$Z$16074,ref_PRICE_LIST!$R$6:$R$16074)=$AE$3,ROW(ref_PRICE_LIST!$H$6:$H$16074)-ROW(ref_PRICE_LIST!$H$6)+1),$AC3))),ref_PRICE_LIST!$H$6:$H$16074,0),1),"Not Available")}
I would really appreciate if someone can help me out.
Thank you so much!
I think the best way to speed this up is to split the formula into a helper column K and a reult column L
Helper Column (copy down for all 16,000 data rows)
=IF($D:$D=$O$2,ROW(),"")
Result column (starting at L2, copy down as many as you need)
=IFERROR(INDEX($F:$F,SMALL($K:$K,ROW()-1)),"Not available")
I've tested this with about 150,000 rows and it updates in < 1s

Resources