Combining Ranges in Excel Functions without helper cells - arrays

I am having difficulty getting the correct data type evaluated as a single array and passed to the first argument of SMALL() in this function. My overall goal is (completely without the use of helper cells) to combine two ranges passed to SMALL() as arrays into a 2-dimensional array output. The formulas work correctly when placed separately in ranges, but when combined in a LET() I get #VALUE! errors, which appear to be caused by a type inconsistency.
Here is the LET() function:
=LET(
A1v, SEQUENCE(1,10,1,0),
A2v, SEQUENCE(1,10,2,0),
SMALL((A1v,A2v),SEQUENCE(2,COLUMNS(A1v)))
)
When the LET() formula is broken into pieces and placed into separate ranges as described below, it produces the desired output in A3:J4 (A3#).
Range A1 formula:
=SEQUENCE(1,10,1,0)
Range A2 formula:
=SEQUENCE(1,10,2,0)
Range A3 formula:
=SMALL((A1#,A2#),SEQUENCE(2,COLUMNS(A1#)))
I am aware that there are other function constructs that can be used to combine ranges. I am not looking for alternatives to using Small(). I am looking for answers that will help further my understanding of constructing arrays from and to be used as inputs to the new array functions in Excel. Thanks to everyone in advance!

The reason your LET function fails is the (A1v,A2v) argument to SMALL. That construct is the union operator, which applies only to ranges, and A1v and A2v are arrays, not ranges.
This can be seen in the Evaluate Formula dialog.
In contrast, (A1#,A2#) works because A1# and A2# are ranges.
FWIW, SMALL itself can accept either ranges or arrays.
To solve this you'll need a general purpose solution to getting the Union of two data sets, whether they be Ranges or Arrays. This can be achieved using a Lambda function.
Add a workbook-scoped Name to the Name Manager; let's call it Union.
In the Refers To section insert
=LAMBDA(tabl1, tabl2,
LET(rowindex, SEQUENCE(ROWS(tabl1)+ROWS(tabl2)),
colindex, SEQUENCE(1,COLUMNS(tabl1)),
IF(rowindex<=ROWS(tabl1),
INDEX(tabl1,rowindex,colindex),
INDEX(tabl2,rowindex-ROWS(tabl1),colindex)
)
)
)
Union is now available to use as a standalone function, or embedded in another function.
Your Formula now becomes
=LET(
A1v, SEQUENCE(1,10,1,0),
A2v, SEQUENCE(1,10,2,0),
SMALL(Union(A1v,A2v),SEQUENCE(2,COLUMNS(A1v)))
)
Or without Lambda
=LET(
A1v, SEQUENCE(1,10,1,0),
A2v, SEQUENCE(1,10,2,0),
rowindex, SEQUENCE(ROWS(A1v)+ROWS(A2v)),
colindex, SEQUENCE(1,COLUMNS(A1v)),
Av, IF(rowindex<=ROWS(A1v),
INDEX(A1v,rowindex,colindex),
INDEX(A2v,rowindex-ROWS(A1v),colindex)),
SMALL(Av,SEQUENCE(2,COLUMNS(A1v)))
)
Untested, so you might have to tweak it a bit
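Since the formulas above cannot be stepped through line by line, here is a minimal Python sketch of the same row-index logic. It is only an illustration: the helper names sequence, union and small are stand-ins written for this example, not Excel's implementations, and arrays are modelled as lists of rows.

```python
def sequence(rows, cols=1, start=1, step=1):
    """Rough stand-in for SEQUENCE: a rows x cols grid filled row by row."""
    return [[start + step * (r * cols + c) for c in range(cols)]
            for r in range(rows)]


def union(tabl1, tabl2):
    """Stack tabl2 under tabl1, choosing the source by row index (the IF/INDEX idea)."""
    rows1 = len(tabl1)
    cols = len(tabl1[0])
    stacked = []
    for rowindex in range(1, rows1 + len(tabl2) + 1):  # 1-based, as in Excel
        if rowindex <= rows1:
            source, r = tabl1, rowindex
        else:
            source, r = tabl2, rowindex - rows1
        stacked.append([source[r - 1][c] for c in range(cols)])
    return stacked


def small(array, k_grid):
    """For every k in k_grid, return the k-th smallest value of the flattened array."""
    ordered = sorted(v for row in array for v in row)
    return [[ordered[k - 1] for k in row] for row in k_grid]


a1v = sequence(1, 10, 1, 0)   # ten 1s in one row
a2v = sequence(1, 10, 2, 0)   # ten 2s in one row
result = small(union(a1v, a2v), sequence(2, len(a1v[0])))
```

Under these assumptions, result holds a row of ten 1s followed by a row of ten 2s, which is the 2 x 10 layout the question asks for.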

Related

How to Rank an array directly or a group of arrays without creating more cell references?

How to RANK an array directly? I would like to avoid creating more intermediate data in cells just to reference them.
Excel RANK.AVG formula states it accepts both array and reference:
Syntax
RANK.AVG(number,ref,[order])
The RANK.AVG function syntax has the following arguments:
Number Required. The number whose rank you want to find.
Ref Required. **An array of, or a reference to**, a list of numbers. Nonnumeric values in Ref are ignored.
Order Optional. A number specifying how to rank number.
But Excel keeps rejecting the below formula.
=RANK.AVG(5, {3,1,7,10,5})
If the numbers are put in cells, say B1:B5, Excel accepts
=RANK.AVG(5, B1:B5)
Ultimately, I would like to rank a dynamic array
=RANK.AVG(value, TOCOL(VSTACK(array1, array2)))
e.g. =RANK.AVG(5, TOCOL(VSTACK(B1:B5,C1:C10)))
It seems that the official documentation on the various RANK functions is simply wrong with respect to the fact that they permit arrays for the ref argument (see here, for example).
You will have to come up with creative alternatives which mimic the RANK.AVG function, for example:
=LET(ζ,SORT(MyArray,,-1),AVERAGE(FILTER(SEQUENCE(COUNT(ζ)),ζ=MyValue)))
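The idea behind that formula (sort descending, then average the positions at which the value occurs) can be sketched in Python. This is an illustrative model only; the function name rank_avg_descending is made up for this example, and raising an exception for a missing value is an assumption about how to treat that case.

```python
def rank_avg_descending(value, numbers):
    """Average of the 1-based positions of value after sorting numbers descending."""
    ordered = sorted(numbers, reverse=True)
    positions = [i for i, n in enumerate(ordered, start=1) if n == value]
    if not positions:
        raise ValueError("value does not occur in numbers")
    return sum(positions) / len(positions)


rank_single = rank_avg_descending(5, [3, 1, 7, 10, 5])   # 5 is third largest
rank_tied = rank_avg_descending(5, [5, 5, 1])            # ties share positions 1 and 2
```

Tied values receive the mean of the positions they occupy, which is the averaging behaviour the formula is meant to mimic.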

Excel SUM of SUMIF/SUMIFS with dynamic multiple criteria

I need to pass a multiple criteria list (a constant array) via cell reference rather than hard-typing it into my formula.
So, instead of this:
=SUM(SUMIFS(sum_range,criteria_range,{"red","blue"}))
But I would need to use this:
=SUM(SUMIFS(sum_range,criteria_range,$A1)) where $A1 is {"red","blue"}
I understand that one can use a range of cells to pass an array but I really need my condition to come from a single cell.
It seems that passing a constant array via cell reference only passes the first element to the formula (i.e. only "red" is used as a condition) and all the working examples I could find of this (here or here) are hard-typing the condition into the formula.
Any luck anybody ?
EDIT: I should add that my data set includes blank rows so it is not contiguous and in general, I'm looking for a not too convoluted solution that will work most of the time and with as little restrictions and caveats as possible.
Change the "Array" in A1 to a comma-delimited list:
blue,purple
No quotes or {}
Change the SUM to SUMPRODUCT and use this as the criteria:
TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",99)),(ROW(INDEX(AAA:AAA,1):INDEX(AAA:AAA,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1))-1)*99+1,99))
The $20 should be placed at the max number of choices possible. I just used it here as a placeholder, it can be more without problem but not less or it will skip any more than that.
Based on the formula you provided.
=SUMPRODUCT(SUMIFS(W$12:W$448,$I$12:$I$448,$I474,$J$12:$J$448,$J474,$K$12:$K$448,TRIM(MID(SUBSTITUTE(A1,",",REPT(" ",99)),(ROW(INDEX(AAA:AAA,1):INDEX(AAA:AAA,LEN(A1)-LEN(SUBSTITUTE(A1,",",""))+1))-1)*99+1,99))))
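To see why the SUBSTITUTE/REPT/MID/TRIM construct splits the list, here is a rough Python model of it. The function names are hypothetical, the matching is exact rather than Excel's case-insensitive comparison, and strip() only approximates TRIM (which also collapses inner spaces).

```python
def split_criteria(text, width=99):
    """Pad every comma out to `width` spaces, cut fixed-width chunks, trim each one."""
    padded = text.replace(",", " " * width)
    count = text.count(",") + 1                 # LEN(A1)-LEN(SUBSTITUTE(...))+1
    pieces = []
    for i in range(1, count + 1):
        start = (i - 1) * width + 1             # 1-based start, as passed to MID
        chunk = padded[start - 1:start - 1 + width]
        pieces.append(chunk.strip())
    return pieces


def sum_for_criteria(values, labels, criteria):
    """Add up one conditional sum per criterion, like SUMPRODUCT over SUMIFS."""
    return sum(v for c in criteria for v, lab in zip(values, labels) if lab == c)


criteria = split_criteria("blue,purple")
total = sum_for_criteria([1, 2, 4, 8], ["red", "blue", "purple", "blue"], criteria)
```

The fixed width is what makes each chunk contain exactly one item, so the trick only holds while every item is shorter than that width.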
With cell A1 containing {"red","blue"} I then set up a named range Condition to which I assigned =EVALUATE($A1) and now I can pass my condition like so:
=SUM(SUMIFS(W$12:W$448,$I$12:$I$448,$I474,$J$12:$J$448,$J474,$K$12:$K$448,Condition))

Excel array countif formula

I want to use the COUNTIF function to evaluate how many items out of 2,0,0,5 are greater than 2. In COUNTIF, the first argument is the range and the second is the criteria. I have tried the formula below, and even tried entering it with Ctrl+Shift+Enter, but it doesn't seem to work.
=COUNTIF({"2","0","0","5"},">2")
COUNTIF doesn't accept array constants (as far as I know). Try this:
=SUMPRODUCT(--({2,0,0,5}>2))
You could also create a countif-style formula like this (the combination ctrl+shift+enter):
=COUNT(IF({2,0,0,5}>2,1,""))
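Both formulas follow the same pattern: turn the array into TRUE/FALSE flags, then count the TRUE ones. A small Python sketch of that pattern, with function names invented for this illustration:

```python
def count_via_flags(values, threshold):
    """Like SUMPRODUCT(--(array>threshold)): coerce each flag to 0/1 and add them up."""
    flags = [v > threshold for v in values]
    return sum(int(flag) for flag in flags)


def count_via_if(values, threshold):
    """Like COUNT(IF(array>threshold,1,"")): keep a 1 per match, then count the 1s."""
    kept = [1 for v in values if v > threshold]
    return len(kept)


values = [2, 0, 0, 5]
matches = count_via_flags(values, 2)
```

Note that the values here are numbers, not the quoted strings used in the question's array constant.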
Recommended reading:
Array vs Range
Some functions like Offset, SumIf, CountIf, SumIfs, and CountIfs are designed to operate only on (multi-cell) range objects. Sum, SumProduct, Frequency, Linest, lookup functions, etc. take both range and array objects.
Array means: {2,0,0,5}
Range means: a reference to cells on the worksheet, for example A1:A4.
To use COUNTIF, you have to put the values in a range of cells; defining the array inline in the formula will not work.
=COUNTIF(A1:A4,">"&2)
I know this thread is a few years old, but I ended up here with a similar problem (how to use arrays, not ranges, with countif).
Although my end goal was a little different (I wanted to find items common to two arrays), I figure the workaround I came up with might be useful for others: I ended up using the "match" function coupled with "isnumber". The formula looked like this:
=isnumber(match({a},{b},0))
this will return an array of true/false which corresponds to the values in {a} that are also in {b}. In case it wasn't clear, {a} and {b} are arrays...

Implicit array using sumif in Excel

I'd like to know if it's possible to use the SUMIF function with implicit or "nested" arrays. By "implicit" array I mean a matrix whose data isn't stored in its final form in any range of the spreadsheet, but is a function of some other array. For example, let's say that we have data for an independent variable (whose values, all integers, range from 0 to 5) in the range A1:A100, and data for a dependent variable in B1:B100. With the SUMIF function we can easily calculate, for example, the sum of the dependent variable when the independent variable is 4. But if we want the sum of the SQUARES of the dependent variable, it's not that easy: SUMIF gives an error if we write SUMIF(A1:A100;4;B1:B100^2), no matter how we enter it (as an array formula or as a simple formula).
Is there any way to do this without having to waste an entire column for the squares of the values of column B?
I know that for this very example the function SUMPRODUCT((A1:A100=4)*B1:B100^2) would do the job, what I don't know is how to "nest" arrays (which is very useful) in general.
The answer is no, I'm afraid. The ranges used in COUNTIF(S)/SUMIF(S)/AVERAGEIF(S) must be either:
1) References to worksheet ranges
2) Constructions which resolve to references to worksheet ranges
One example of the former:
=SUMIF(A1:A10,"A",B1:B10)
And two of the latter (which just happen to be identical to the above):
=SUMIF(A1:INDEX(A:A,10),"A",B1:INDEX(B:B,10))
=SUM(SUMIF(OFFSET(A1,{0,1,2,3,4,5,6,7,8,9},),"A",OFFSET(B1,{0,1,2,3,4,5,6,7,8,9},)))
Here SUMPRODUCT has the advantage over this group of functions, in that constructions may be passed which do not necessarily resolve to worksheet ranges.
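As an illustration of what a SUMPRODUCT construction such as the question's SUMPRODUCT((A1:A100=4)*B1:B100^2) computes on in-memory arrays, here is a short Python sketch. The function name and the sample data are made up for this example.

```python
def sum_squares_where(keys, values, wanted):
    """Multiply a 0/1 match flag by the squared value, then add everything up."""
    return sum((k == wanted) * v ** 2 for k, v in zip(keys, values))


keys = [4, 1, 4, 0]      # stands in for the independent variable
values = [2, 3, 5, 7]    # stands in for the dependent variable
total = sum_squares_where(keys, values, 4)   # 2**2 + 5**2
```

The squared values never need to exist anywhere except inside the calculation, which is what the question means by an implicit array.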
However, it might well be the case that a more efficient set-up is achieved by, as you suggested, first using an additional column within the worksheet to compute the squares and then referencing that column within a SUMIF, not least since one of the major advantages that COUNTIF(S), SUMIF(S), etc. can claim over SUMPRODUCT is that arbitrarily large references can be passed with no detriment to calculation performance. For example, the difference in performance between:
=SUMIF(A:A,"A",B:B)
and:
=SUMPRODUCT(0+(A:A="A"),B:B)
is enormous, the latter, having to process all 1,048,576 cells within that range (whether they are technically beyond the last-used cells or not), being not at all recommendable.
Regards

Change array dimensions, using spreadsheet functions, when used inside SUMPRODUCT

I am interested in spreadsheet functions, not VBA solutions, to be included in a single cell formula.
[A1:A15 contain numeric values from 1 to 127, B1:B15 contain integers from 1 to 7 that set a divisor.]
Given the function:
=SUMPRODUCT(MOD(FREQUENCY(A1:A15;A1:A15);B1:B15))
FREQUENCY(A1:A15;A1:A15) gives a 1-column array of 15+1 rows, whereas the second part (B1:B15) is a 1-column array of 15 rows.
I would like to change the resulting array given by FREQUENCY (only in memory -not explicit in sheet-) from a 1-column 16 rows array to a 1-column 15 rows array with the first 15 cell values of that array.
[FREQUENCY documentation: https://support.office.com/en-us/article/FREQUENCY-function-44e3be2b-eca0-42cd-a3f7-fd9ea898fdb9 NB: for Excel, second remark states number of elements that depend on bins_array. ]
I would appreciate suggestions.
Thus, both arrays within MOD will have the same dimensions and SUMPRODUCT will not find cells with error values. I can disregard error values using IF and ISERROR within SUMPRODUCT, but I'd rather disregard the non-relevant part of the FREQUENCY resulting array if it is possible.
It has been thought that making it more specific might be more helpful, so it has been heavily reduced and simplified.
With external help, I have been able to fine-tune a way to solve my problem using INDEX in array formula mode. I am posting the answer in case it helps others.
One way: Put FREQUENCY(A1:A15;A1:A15), or any formula that produces a multi-cell array, within INDEX and pass as the 2nd and/or 3rd arguments an array of consecutive values representing the rows/columns to keep.
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(FREQUENCY(A1:A15;A1:A15))-1));1)
The first argument of INDEX is the array produced by the formula to be shrunk (from 16x1 to 15x1), which would be a multi-cell array formula if explicitly entered. The second argument is the array 1..15, built from the row numbers from 1 to the total number of rows of that array MINUS 1, which selects the first 15 (out of 16) values. The third argument is the column of the shrunken array (if need be, more than one could be selected using an analogue of the second argument).
In the particular case of FREQUENCY, because it is known that we are interested in the "bins" part of the function, the formula can be simplified by including the total rows of the "bins"/"intervals" array inside FREQUENCY (its second argument). We will have
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1)
and the complete formula would become
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1);B1:B15))
Now, both dividend and divisor of MOD have exactly the same dimensions (15x1) and because B1:B15 includes integers greater than 0 there are no errors.
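A minimal Python sketch of this trimming step, assuming the FREQUENCY result is already available as a list with one more element than the divisors. The function name and the sample numbers are hypothetical and are not taken from the question's data.

```python
def sum_of_mods(freq, divisors):
    """Keep the first len(divisors) entries of freq by 1-based position, then sum the MODs."""
    row_numbers = range(1, len(divisors) + 1)          # like ROW(INDIRECT("1:" & ROWS(...)))
    trimmed = [freq[row - 1] for row in row_numbers]   # like INDEX(freq; rows; 1)
    return sum(f % d for f, d in zip(trimmed, divisors))


freq = [3, 0, 1, 2, 0]     # n + 1 values, as FREQUENCY returns
divisors = [2, 1, 3, 4]    # n positive integers
total = sum_of_mods(freq, divisors)
```

For positive divisors, Python's % gives the same remainder as MOD, so the sketch matches the formula's arithmetic in that case.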
Thanks all for helping me in making question more concise and better formatted.
ADDITIONAL INFORMATION: As pointed out correctly in comments by XOR LX, this does not seem to work in the widely popular spreadsheet software Excel. It has been developed for an INDEX function inside SUMPRODUCT as used in Open Office Calc which I had mistakenly thought 100% equivalent to Excel's version. A more complete answer perhaps using other functions would be appreciated.
In the previous answer, XOR LX points out very correctly that this formula cannot work in Excel, due to row_num/column_num argument behaviour. Very kindly XOR LX has shown me how that approach can work, and also thanks and credit for supplying a good answer: "INDEX can be used to redimension array (even dynamically created ones) if the row_num/column_num array is coerced to take an arbitrary array with the right dimensions, as shown on this blog entry " The following formula has been checked in Excel 2010 and has the expected results:
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15,A1:A15),N(INDEX(ROW(INDIRECT("1:" & ROWS(A1:A15))),,)),1),B1:B15))
NB: the row_num argument of the first INDEX, an auxiliary array generated with ROW, has been nested inside N(INDEX([...],,)); at least one comma is necessary to supply the minimum of two arguments to the nested INDEX. The general discussion of INDEX's arguments, and other functions' arguments, that need to be coerced to take arrays is interesting in itself (see here and here at XOR LX's blog). For Open Office users it might be worth stressing the point made at the blog:
"Unlike OFFSET, (...) for which the first parameter must be a reference (...) in the worksheet, INDEX can also accept – and manipulate – for its reference arrays which consist of values generated e.g. via other subfunctions within the formula." (XOR LX's blog)
That is indeed the case when changing the dimensions of an array as in this question, but it is also useful for reversing or shifting the values in an array, for example. Open Office accepts arrays as row_num/column_num, so the coercion is not needed and some formulas rely on this, but without it, these formulas are unlikely to work when the files are opened in Excel.
Regrettably, this type of coercion is not handled correctly by Open Office, and formulas need to be "decoerced" to work, at least in my casual tests.
For a formula that shortens arrays and works in both spreadsheet programs, the only thing I have managed is the following (required: the arrays must be single-column):
SUMPRODUCT(
(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
-ROW(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
=0)
*MOD(TRANSPOSE(FREQUENCY(vals_for_freq,vals_for_freq)),vals_to_mod)
)
(it "shortens" one array to the shortest of the pair, by creating an auxiliary array with TRUE/1s on the diagonal starting top-left and FALSE/0s elsewhere, therefore disregarding all defined values outside the square section of the array. Thus, SUMPRODUCT adds values within the diagonal of the square section which are the product of the corresponding values up to the last value of the shorter array.)
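The diagonal-mask idea can be modelled in Python as follows. This is a sketch under the same assumptions as the description above (single-column inputs, positive divisors); the function name and the sample numbers are invented for illustration.

```python
def diagonal_sum_of_mods(freq, divisors):
    """Build the full divisors x freq grid of MODs and keep only the cells on the diagonal."""
    total = 0
    for row, d in enumerate(divisors, start=1):        # rows follow vals_to_mod
        for col, f in enumerate(freq, start=1):        # columns follow the transposed FREQUENCY
            on_diagonal = (col - row == 0)             # the COLUMN(...)-ROW(...)=0 mask
            total += on_diagonal * (f % d)
    return total


freq = [3, 0, 1, 2, 0]     # one element longer than divisors
divisors = [2, 1, 3, 4]
total = diagonal_sum_of_mods(freq, divisors)
```

The extra last element of freq never lands on the diagonal, so it is ignored without the longer array ever being trimmed explicitly.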

Resources