Excel - my formula needs to incorporate the last filled cell in column - arrays

I am building a calculator that detects streamflows that fall below a certain flow rate. I have an array formula that takes a column of data and identifies the longest run of consecutive data points that fall below a particular threshold (threshold value in C12):
{=MAX(FREQUENCY(IF(E23:E12275<C12,ROW(E23:E12275)),IF(1-(E23:E12275<C12),ROW(E23:E12275))))}
This formula works, but I want to be able to build in the ability to detect the last row with entered data. In this example, the data set finishes at E12275, but datasets can extend many more rows. If I extend the formula to this:
{=MAX(FREQUENCY(IF(E23:E1000000<C12,ROW(E23:E1000000)),IF(1-(E23:E1000000<C12),ROW(E23:E1000000))))}
the formula interprets the blank cells (after the last filled cell) as zero and counts them as falling below the threshold. This gives me a result of 987725 (1000000-12275).
I have built another formula that detects the bottom row cell address:
=ADDRESS(LOOKUP(2,1/(E23:E1000000<>""),ROW(E23:E1000000)),5,1)
However, I am having trouble incorporating this result into the existing formula. Does anyone have any thoughts on how to do this?
NB: I have also toyed with the idea of building a formula that excludes blank cells. However, the actual datasets include blank cells within the data, and these should be interpreted as "below threshold".
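To make the intended behaviour concrete, here is a minimal Python sketch of the logic rather than an Excel formula. It assumes blanks are represented as None, that blanks inside the data count as below threshold, and that everything after the last filled entry is ignored. The function names and sample data are hypothetical.

```python
def last_filled_index(values):
    """Index of the last non-blank entry, or -1 if every entry is blank."""
    for i in range(len(values) - 1, -1, -1):
        if values[i] is not None:
            return i
    return -1


def longest_run_below(values, threshold):
    """Longest run of consecutive entries below the threshold.

    Blanks (None) inside the data count as below threshold;
    blanks after the last filled entry are not counted at all.
    """
    end = last_filled_index(values)
    longest = current = 0
    for v in values[: end + 1]:
        below = v is None or v < threshold
        current = current + 1 if below else 0
        longest = max(longest, current)
    return longest


# Hypothetical flows followed by a long tail of empty cells.
flows = [5.0, 1.2, None, 0.8, 6.1, 0.5] + [None] * 1000
print(longest_run_below(flows, 2.0))  # 3
```

Without the trim to the last filled entry, the 1000 trailing blanks would dominate the result, which is the same effect as the 987725 figure above.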

Related

Unavoidable merged Cells - Fill blank cells with previous non-blank cell in same column

I have a roster table for a sports facility that has been formatted and has a column of merged cells (for human readability). Unfortunately I cannot change the formatting to eliminate the merged cells - too many people use it and in any case I'd need to overhaul all the formulas everywhere.
The cells contain names and merge 4 rows of a single column.
Formatted roster table w/ sample data
In a separate range I am trying to take this formatted info and put it into 1st normal form for analysis & graphing purposes. Since merged cells only contain the top-leftmost value, when trying to copy the column contents by formula (e.g. "=B14") it only shows the name in the top cell followed by 3 empty ones below.
I need to fill in the blank rows by copying the athlete names down. The other column formulas are working just fine.
For the life of me I can't figure it out. It has to be a formula and not apps script due to mobile use, and I've always been really bad with certain formulas and good with others. Usually I can make a guess at it, but this time I'm just lost.
Can someone point me in the right direction?
Use:
=ARRAYFORMULA(IF(B2:B="",, VLOOKUP(ROW(A2:A), IF(A2:A<>"", {ROW(A2:A), A2:A}), 2, 1)))
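The formula amounts to a "fill down" operation: each row takes the most recent non-blank name at or above it, and rows where column B is blank stay blank. A rough Python sketch of that logic follows. It assumes blank cells are empty strings, and the names and sample values are hypothetical.

```python
def fill_down(names, other):
    """Carry the last non-blank name down the column.

    `names` models column A (merged cells: only the top cell has a value);
    `other` models column B, whose blanks leave the output blank as well.
    """
    result = []
    last = ""
    for name, marker in zip(names, other):
        if name != "":
            last = name  # a new merged block starts here
        result.append(last if marker != "" else "")
    return result


names = ["Ann", "", "", "", "Bob", "", "", ""]
print(fill_down(names, ["x"] * 8))
```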

How would I reference cells in order while moving down several cells in a formula on a separate sheet?

I have been at this for hours and it's kicking me. I'm trying to build a log for someone, and I have a sheet with standard data in table format. I need the next sheet to look a certain way so that it can be exported to PDF and continue looking like the log always has - which means that it will not be a standard table.
In the Log sheet, each record is on one row; in the PrintSheet, the cell references will be placed in three rows, with a blank fourth row as a gap. Obviously, when you paste formulas in Excel, the references follow the row you're in rather than the next row down in the referenced sheet. I've included the formulas that "work" in blue in the image for reference, but that would involve manually subtracting 3 (or 4, depending on which one I'm doing) in each formula. Formula for reference: =Log!$A$1&": "&INDIRECT("'log'!A"&ROW()-3)
Is there a way to dynamically write this formula so it can just be copy/pasted every 4th row when they need more in the PrintSheet? Is it possible I need to be using an array formula (that is an area of Excel that I am deeply lacking in)?
Input Sheet (log) vs Output Sheet (PrintSheet) with formulas in blue
Use a bit of maths on the row number and pull the formula down as required:
=CHOOSE(MOD(ROW()-2,4)+1,Log!$A$1&":"&INDEX(Log!A:A,QUOTIENT(ROW()-2,4)+2),Log!$D$1&":"&INDEX(Log!D:D,QUOTIENT(ROW()-2,4)+2),"","")
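The row arithmetic in that formula can be checked with a small Python sketch. It assumes, as the ROW()-2 term suggests, that the first formula sits in row 2 of the PrintSheet and that each log entry occupies a block of four rows. The function name is hypothetical.

```python
def print_row_source(print_row, first_row=2, block=4):
    """Map a PrintSheet row to (CHOOSE branch, Log row)."""
    offset = print_row - first_row
    slot = offset % block + 1      # MOD(ROW()-2,4)+1
    log_row = offset // block + 2  # QUOTIENT(ROW()-2,4)+2
    return slot, log_row


for r in range(2, 10):
    print(r, print_row_source(r))
```

Rows 2 to 5 all point at Log row 2 with branches 1 to 4, and rows 6 to 9 point at Log row 3. Branch 1 reads column A, branch 2 reads column D, and branches 3 and 4 are the blank rows.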
If you have Excel 365, you can do it using the same method but as a spill formula:
=LET(logRows,COUNTA(Log!A:A)-1,
seq,SEQUENCE(logRows*4,1,0),
CHOOSE(MOD(seq,4)+1,Log!$A$1&":"&INDEX(Log!A:A,QUOTIENT(seq,4)+2),
Log!$D$1&":"&INDEX(Log!D:D,QUOTIENT(seq,4)+2),"",""))
In this case it counts the number of rows in the log so will expand as you add more rows.
The date and time can be done in a similar way.
Instead of row()-3 etc. have a look here whether this offset calculation helps:
Column A is the row in the output list and column B is the target row from the input list

Excel Array: Reducing the Number of Calculation Steps

As per my on-going journey through the world of Excel arrays, I was wondering if someone might be able to give me a pointer or two.
On the Excel sheet attached, I currently have a four-step process to get from a segregated lookup to a gapless list:
Step 1 (yellow): For the 50-word long list in sheet 'Data', a 50-cell lookup is performed to see whether the input in row 1 (red) appears somewhere in the corresponding cell. In this case, the lookup is performed three times for three different inputs, i.e. in columns C-E.
Step 2 (orange): An array then relists the contents of the 50-cell lookup above it but removes all empty cells (i.e. where there is no match to the input in row 1)
Step 3 (green): The results from step 2 are listed out in a single column.
Step 4 (blue): The results from step 3 are listed out using the same technique as in step 2 in order to remove the blank cells.
Collectively, this enables a gapless listing of all data objects which contain the given inputs somewhere in their string.
However, my real list of data objects is 5000 entries long and I would like to look up the results for 100 or more inputs. As step 1 requires each combination to be looked up separately, this requires 500,000 calculations for step 1 alone, which takes a heavy toll on the processor.
Therefore, I was wondering if anyone had an idea as to how I could shortcut this process to reduce the number of cells / calculations involved. I assume that steps 1 and 2 could somehow be merged, but my knowledge of arrays is not sufficient to think of how this could be done.
It would be brilliant to hear from somebody who may have some advice on the matter!
Kind regards,
Rob
File Link: https://drive.google.com/open?id=10O91QDD78RkbWtQx2iWfax17Dt5TPw1G
Since you're not removing duplicated entries from the final list, this is quite straightforward.
Based on the workbook you provided, to be entered within the Lookup sheet:
In cell A1:
=SUMPRODUCT(0+ISNUMBER(FIND(C1:E1,Data!A1:A50)))
In any cell of your choice, to begin the list of returns, array formula**:
=IF(ROWS($1:1)>A$1,"",INDIRECT("'Data'!"&TEXT(SMALL(IF(ISNUMBER(FIND(C$1:E$1,Data!A$1:A$50)),10^5*ROW(Data!A$1:A$50)+COLUMN(Data!A$1:A$50)),ROWS($1:1)),"R0C00000"),0))
and copied down until you start to get blanks for the results.
Notes:
Instructions for entering an array formula are at the foot of this post.
The sheet name ('Data' within the second formula) should be amended as required.
It is important that the range containing the values being searched for (C1:E1 here) and that containing the entries to be searched within (A1:A50) be orthogonal, i.e. one is a horizontal range, the other a vertical range.
If you are not using an English-language version of Excel then the part "R0C00000" within the second formula may need amending.
Regards
**Array formulas are not entered in the same way as 'standard' formulas. Instead of pressing just ENTER, you first hold down CTRL and SHIFT, and only then press ENTER. If you've done it correctly, you'll notice Excel puts curly brackets {} around the formula (though do not attempt to manually insert these yourself).
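As a rough illustration of what the two formulas compute, the Python sketch below counts every (entry, search term) hit, encodes each hit as 10^5 * row + column, sorts the keys, and decodes them back to entries. This is only a model of the logic, not something to paste into Excel, and the function name and sample data are hypothetical.

```python
def matches_in_order(data, terms):
    """Entries of `data` containing any term, one result per (entry, term) hit."""
    keys = []
    for row, entry in enumerate(data, start=1):
        for term in terms:
            if term in entry:  # case-sensitive, like FIND
                # The data sits in a single column, so the column part is always 1.
                keys.append(10**5 * row + 1)
    keys.sort()  # SMALL(..., k) walks the keys in ascending order
    results = []
    for key in keys:
        row, _col = divmod(key, 10**5)  # undo the "R0C00000" encoding
        results.append(data[row - 1])
    return results


data = ["apple pie", "banana", "pineapple", "grape"]
terms = ["apple", "an", "pie"]
print(matches_in_order(data, terms))
```

An entry matched by two terms appears twice, which is consistent with the remark that duplicates are not removed. The length of the returned list corresponds to the count held in cell A1.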

Excel array Formula that copies only cells containing a string

I would prefer to use an Excel array formula (but if it can only be done in VBA, so be it) that copies ALL cells from a column array that contain specific text. The picture below shows what I am after and what I have tried. I'm getting close (thanks to similar, but different questions) but can't quite get to where I want. At the moment, I am getting only the first cell instead of all the cells. In my actual application, I am searching through about 20,000 cells and will have a few hundred search terms. I expect most search terms to give me about 8 - 12 cells with that value.
formula I am using:
=INDEX($A$4:$A$10,MATCH(FALSE,ISERROR(SEARCH($C$1,$A$4:$A$10)),0))
Spreadsheet image
To make this work efficiently, I recommend having a separate cell holding the results count (I used cell C2) which has this formula:
=COUNTIF(A:A,"*"&C1&"*")
Then in cell C4 and copied down use this array formula (The -3 is just because the header row is row 3. If the header row was row 1, it would be -1):
=IF(ROW(A1)>$C$2,"",INDEX($A$4:$A$21000,SMALL(IF(ISNUMBER(SEARCH($C$1,$A$4:$A$21000)),ROW($A$4:$A$21000)-3),ROW(C1))))
I tested this with 21000 rows of data in column A with an average of 30 results per search string and the formula is copied down for 60 cells in column C. With that much data, this takes about 1-2 seconds to finish recalculating. Recalculation time can vary widely depending on other factors in your workbook (additional formulas, nested dependencies, use of volatile functions, etc) as well as your hardware.
Alternatively, you could just use the built-in Filter functionality, but I hope this helps.
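For readers who want to see the idea outside Excel, here is a hedged Python sketch of the count-then-list approach: count the matches once, then fill a fixed number of output slots with the k-th match or an empty string. It assumes plain substring matching that ignores case, as SEARCH does, and it does not model SEARCH wildcards. The names and data are hypothetical.

```python
def list_matches(cells, term, slots):
    """Return `slots` outputs: the matching cells in order, then blanks."""
    needle = term.lower()
    hits = [c for c in cells if needle in c.lower()]  # case-insensitive
    count = len(hits)  # plays the role of the separate results-count cell
    return [hits[k] if k < count else "" for k in range(slots)]


cells = ["Red apple", "green pear", "APPLE tart", "plum"]
print(list_matches(cells, "apple", 4))
```

Computing the count once and comparing against it is what lets each output cell return a blank cheaply instead of evaluating the full search and trapping an error.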
You need to get the ROWS. Put this in C4 and copy down.
=IFERROR(AGGREGATE(15,6, IF(SEARCH($C$1, $A$4:$A$10)>0, ROW($A$4:$A$10)), ROW($C4)-ROW($A$4)+1), "")
This is an array formula, so enter it with Ctrl+Shift+Enter.

Code equivalent to Array Formula

I am currently using an array formula in my data to find a row where columns O, Y, and AA match the current row, and where column A value does not match, and return column C for the matching row.
Here is my formula:
=INDEX(C:C,MATCH(1,(O:O=O2)*(Y:Y=Y2)*(AA:AA=AA2)*(A:A<>A2),0))
Using named ranges I have been able to input this formula using VBA, but what I really want to do is use VBA to perform a similar function and write the resulting value to column D.
I am thinking that possibly a loop, for each i from 2 to last row, find the other row within the range that matches and write cell(row that was found, 3).value to cell(i, 4), but I don't know the syntax for a VBA array to find that matching row.
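The loop described above could look roughly like the following. This is a Python sketch of the logic only, not VBA. It assumes each row is a dict keyed by column letter, and it returns None where the formula would give #N/A. The function name and sample rows are hypothetical.

```python
def find_partner_values(rows):
    """For each row, return column C of the first other row whose O, Y and AA
    values match and whose A value differs; None if there is no such row."""
    out = []
    for current in rows:
        wanted = (current["O"], current["Y"], current["AA"])
        match = None
        for other in rows:  # scan from the top, like MATCH(..., 0)
            if (other["O"], other["Y"], other["AA"]) == wanted and other["A"] != current["A"]:
                match = other["C"]
                break
        out.append(match)  # this value would be written to column D
    return out


rows = [
    {"A": "id1", "C": "c1", "O": "x", "Y": 1, "AA": "k"},
    {"A": "id2", "C": "c2", "O": "x", "Y": 1, "AA": "k"},
    {"A": "id3", "C": "c3", "O": "z", "Y": 2, "AA": "k"},
]
print(find_partner_values(rows))
```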
While not explicitly stated, it could easily be inferred that you are seeking to use VBA to increase the efficiency of the calculation/recalculation of your array formula. You haven't provided the scope (i.e. number of rows) of your data but it is unlikely that you require the full column references you are using. The following calculation cycle times were based on ~1000 rows of static data.
Your array formula:
=INDEX(C:C, MATCH(1, (O:O=O2)*(Y:Y=Y2)*(AA:AA=AA2)*(A:A<>A2), 0))
Elapsed time to fill down and calculate: 24.828 seconds
Your array formula with column references truncated to actual extents of the data:
=INDEX($C$2:$C$999, MATCH(1, ($O$2:$O$999=O2)*($Y$2:$Y$999=Y2)*($AA$2:$AA$999=AA2)*($A$2:$A$999<>A2), 0))
Elapsed time to fill down and calculate: 0.203 seconds
Comparable standard formula with column references truncated to actual extents of the data:
=INDEX($C$2:$C$999, MIN(INDEX(ROW($1:$998)+(($A$2:$A$999=A2)+($O$2:$O$999<>O2)+($Y$2:$Y$999<>Y2)+($AA$2:$AA$999<>AA2))*1E+99, , )))
Elapsed time to fill down and calculate: 0.257 seconds
As you can see, cutting the column references down to what is actually being used improves efficiency immensely. Each array formula evaluates every row it references, so when the formula is filled down the calculation load grows roughly with the square of the number of rows, and full-column references force every cell to process all 1,048,576 rows of each column.
If your data is constantly changing and you do not know how many rows you will be dealing with, use named ranges with a Refers to: that is defined dynamically. Example:
Pick a column that usually defines the extents of the data and note the nature of the data. This method differs slightly depending upon whether you are dealing with true numbers or text. For demonstration purposes, column C has numbers.
Choose Formulas ► Defined Names ► Name Manager. When the Name Manager dialog opens, click New.
Type a friendly name for the range in the Name: text box; e.g. myColAA. Leave the Scope: as Workbook.
Use the following for Refers to:     =$AA$2:INDEX($AA:$AA, MATCH(1e99, $C:$C))
This will define myColAA as AA2 to the row matching the last number in column C. If column C was full of text values, you would use the following instead:     =$AA$2:INDEX($AA:$AA, MATCH("zzz", $C:$C))
Repeat for columns A:A, C:C, O:O and Y:Y. Keep the last used reference as column C:C so they will always have the same number of rows but change the other column references and give each a new name.
When you have created all of the named ranges and are back at the worksheet, test one by tapping F5, typing myColAA into the Reference: text box and clicking OK.
Your array formula will now look similar to the following.
=INDEX(myColC, MATCH(1, (myColO=O2)*(myColY=Y2)*(myColAA=AA2)*(myColA<>A2), 0))
The named ranges will grow and shrink with the amount of data available.
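The effect of such a dynamic range can be modelled in a few lines of Python. The sketch assumes each column is a list whose first element is row 1, that the anchor column holds true numbers, and that the range starts in row 2. The function name and sample columns are hypothetical.

```python
def dynamic_range(target, anchor, first_row=2):
    """Slice `target` from `first_row` through the row of the last number in `anchor`."""
    last = 0
    for row, value in enumerate(anchor, start=1):
        # Only true numbers count, mirroring the MATCH(1e99, ...) lookup.
        if isinstance(value, (int, float)) and not isinstance(value, bool):
            last = row
    return target[first_row - 1 : last]


col_c = ["header", 10, 20, 30, None, None]
col_aa = ["header", "a", "b", "c", None, None]
print(dynamic_range(col_aa, col_c))
```

Because every named range uses the same anchor column to find its last row, all of them end on the same row and stay the same length as data is added or removed.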
