I am struggling with an array formula, that logically seems sound, however it doesn't appear to be working correctly. I have been working on a complex sheet, that is not to include VBA, which has made it formula heavy, and using arrays instead.
In the image below, the first part is the problem, for the data shown in columns A-F, I wish to get a sum of the values that match the values in I1:K1.
The formula I have used to begin with can be seen in the first image also, this evaluates, pressing F9, to give me the desired output 20,40 & 50. However when I add the SUM around the formula, I only get the first result out.
I think this is an issue with me not seeing the wood for the trees on this one.
Thanks in advance.
This array formula seems to work:
=SUM((IFERROR(MATCH(A1:F1,I1:K1,0),0)>0)*A2:F2)
There are probably multiple better formulas to achieve the same thing.
But to talk about why this fails:
It is because of the OFFSET function returns a reference rather than a value. And so used in this array formula it returns an array of references {B2,D2,E2} instead of an array of values {20,40,50} which leads to the problem.
If you are using:
=SUMPRODUCT(OFFSET(A2,0,MATCH($I$1:$K$1,$A$1:$F$1,0)-1))
then using Evaluate Formula, you will get:
SUMPRODUCT({#VALUE,#VALUE,#VALUE})
in next to last step and 0 as the result. So the OFFSET leads to error values because of it returns an array of references which will not be dereferenced automatically and so will become #VALUE error each.
If you are using
=SUMPRODUCT(N(OFFSET(A2,0,MATCH($I$1:$K$1,$A$1:$F$1,0)-1)))
then it works and returns 110. So the N dereferences the references of each OFFSET and so the whole formula leads to an array of values {20,40,50} in sum.
{=SUM(N(OFFSET(A2,0,MATCH($I$1:$K$1,$A$1:$F$1,0)-1)))}
works too.
These problems occur using funktions like OFFSET and INDIRECT, which returns references rather than values, in array formulas. And having a dereferencing function around the OFFSET or INDIRECT stops the problems.
It is the same with:
=SUMPRODUCT(INDIRECT("R2C"&MATCH(I1:K1,A1:F1,0),FALSE))
versus
=SUMPRODUCT(N(INDIRECT("R2C"&MATCH(I1:K1,A1:F1,0),FALSE)))
Related
I have been trying to use =row(indirect("0-5000")) to work like I have read on all literature I could find and yet I keep getting the #REF error, which occurs when it processes the indirect(string) part of the formula. Is there a setting somewhere that is making my function to act improperly?
I am trying to produce an array of numbers inside a formula at time of calcution (so no stored array), based on a range of two numbers from 2 cells like the example 1/4 of the way down on https://exceljet.net/formula/create-array-of-numbers
Try,
=row(indirect("1:5000"))
There is no row 0 and a colon is used to delimit the upper and lower limits of a range.
So I'm having difficulties understanding fully how arrays works and when they are used by excel and specifically what happens in the background.
From reading the past few hours I understand that one of the reasons my Index Match doesn't work without array is simply because its a multicriteria Match that I use as below:
{=INDEX(D30:E36,MATCH(F33&G33,B30:B36&C30:C36),2)}
From what I understand the reason is that Match returns a {x,y} result which classifies it as an array formula. But considering the point is to get a row number, if the row I'm looking for is 5 then Match will return a {5,5} for row number for Index. And then Index interprets this as just 5? or what exactly happens in the background here?
Then I found an article which showed how to circumvent the array formula and not need ctrl+shift+enter as shown below. How does the below change things and what happens in the background?
=INDEX(D30:E36,MATCH(F33&G33,INDEX(B30:B36&C30:C36)),2)
The below is a an array SUM/COUNTIF formula which counts unique cells only which does not work without array brackets. Why is that and how does it work? It involves maths so I'm not sure.
{=SUM(1/(COUNTIF(A1:A5,A1:A5)))}
Thank you!
I am interested in spreadsheet functions, not VBA solutions, to be included in a single cell formula.
[A1:A15 contain numeric values from 1 to 127, B1:B15 contain integers from 1 to 7 that set a divisor.]
Given the function:
=SUMPRODUCT(MOD(FREQUENCY(A1:A15;A1:A15);B1:B15))
FREQUENCY(A1:A15;A1:A15) gives a 1-column array of 15+1 rows, whereas the second part (B1:B15) is a 1-column array of 15 rows.
I would like to change the resulting array given by FREQUENCY (only in memory -not explicit in sheet-) from a 1-column 16 rows array to a 1-column 15 rows array with the first 15 cell values of that array.
[FREQUENCY documentation: https://support.office.com/en-us/article/FREQUENCY-function-44e3be2b-eca0-42cd-a3f7-fd9ea898fdb9 NB: for Excel, second remark states number of elements that depend on bins_array. ]
I would appreciate suggestions.
Thus, both arrays within MOD will have the same dimensions and SUMPRODUCT will not find cells with error values. I can disregard error values using IF and ISERROR within SUMPRODUCT, but I'd rather disregard the non-relevant part of the FREQUENCY resulting array if it is possible.
It has been thought that making it more specific might be more helpful, so it has been heavily reduced and simplified.
With external help, I have been able to fine-tune a way to solve my problem using INDEX in array formula mode. I am posting the answer in case it helps others.
One way: Put FREQUENCY(A1:A15;A1:A15), or any formula that produces an multi-cell array, within INDEX and have 2nd and/or 3rd arguments as array of consecutive values which will represent rows/columns.
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(FREQUENCY(A1:A15;A1:A15)-1));1)
First argument within INDEX is the resulting array coming from a formula to shrink (from 16x1 to 15x1), which would be a multi-cell array formula if explicitly entered; second argument is the array 1..15 given by row numbers from 1 to the number of total rows of the "array from formula to shrink" MINUS 1: the first 15 (out of 16) values in the array from a formula; 3rd argument is the column of the shrank array (if need be, more than one could be selected using an analogue to the second argument).
In the particular case of FREQUENCY, because it is known that we are interested in the "bins" part of the function, the formula can be simplified by including the total rows of the "bins"/"intervals" array inside FREQUENCY (its second argument). We will have
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1)
and the complete formula would become
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1);B1:B15))
Now, both dividend and divisor of MOD have exactly the same dimensions (15x1) and because B1:B15 includes integers greater than 0 there are no errors.
Thanks all for helping me in making question more concise and better formatted.
ADDITIONAL INFORMATION: As pointed out correctly in comments by XOR LX, this does not seem to work in the widely popular spreadsheet software Excel. It has been developed for an INDEX function inside SUMPRODUCT as used in Open Office Calc which I had mistakenly thought 100% equivalent to Excel's version. A more complete answer perhaps using other functions would be appreciated.
In the previous answer, XOR LX points out very correctly that this formula cannot work in Excel, due to row_num/column_num argument behaviour. Very kindly XOR LX has shown me how that approach can work, and also thanks and credit for supplying a good answer: "INDEX can be used to redimension array (even dynamically created ones) if the row_num/column_num array is coerced to take an arbitrary array with the right dimensions, as shown on this blog entry " The following formula has been checked in Excel 2010 and has the expected results:
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15,A1:A15),N(INDEX(ROW(INDIRECT("1:" & ROWS(A1:A15))),,)),1),B1:B15))
NB: row_num argument of first INDEX, a ROW generated auxiliary array, has been nested inside N(INDEX([...],,)); at least one comma is necessary to account for the two arguments minimum of the nested INDEX. It is in itself interesting the discussion that applies generally to INDEX's arguments, and other functions', that need to be coerced to take arrays (see, here and here at XOR LX's blog). For Open Office users it might be worth stressing the point made at the blog
Unlike OFFSET, (...) for which the first parameter must be a
reference (...) in the worksheet, INDEX can also accept –
and manipulate – for its reference arrays which consist of values
generated e.g. via other subfunctions within the formula. XOR LX's blog
That would be indeed the case in changing the dimension in an array as in this question, but also useful in reversing or displacing the values in an array, for example. Open Office accepts arrays as row_num/column_num, so the coercion is not needed and some formulas rely on this, but without it, these formulas are unlikely to work when files are open in Excel.
Regrettably, this type of coercion is not passed correctly to Open Office, and formula need to be "decoerced" to work, at least in my casual tests.
In order to use a formula that would work in both spreadsheet programs regarding shortening arrays, the only thing I have managed is the following (required: arrays must be single-column)
SUMPRODUCT(
(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
-ROW(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
=0)
*MOD(TRANSPOSE(FREQUENCY(vals_for_freq,vals_for_freq)),vals_to_mod)
)
(it "shortens" one array to the shortest of the pair, by creating an auxiliary array with TRUE/1s on the diagonal starting top-left and FALSE/0s elsewhere, therefore disregarding all defined values outside the square section of the array. Thus, SUMPRODUCT adds values within the diagonal of the square section which are the product of the corresponding values up to the last value of the shorter array.)
I apologise if this is a very simple question, but I am at a bit of a loss here.
A bespoke formula I want to use returns an array of values, as seen here:
But I cannot find a way to present this output in a cell separated format, only the first cell (39478) is returned.
There is a note included in the documentation: Hint: This function is a multiple result function. You MUST set an array for the output.
Whilst I understand I am going to need an array to display multiple results, I cannot find the method of doing so. Any tips?
If the bespoke formula wants to return an array of values, there are a couple of ways to get the results into multiple cells.
Put the formula into a cell and hit Enter↵. Next, select that cell along with several cells below it. Tap F2 then hit Ctrl+Shift+Enter↵. The successive values should fill the cells selected until an error (no more returns) is reached.
Put that formula into a cell and hit Ctrl+Shift+Enter↵. The formula should be wrapped in braces (e.g. { and }). If the correct relative and absolute cell addresses were used (e.g. $**A$1 or $**A1, etc) then you should be able to fill, copy or drag down the formula into successive rows.
Use an INDEX function to contain the array of returned values from the bespoke formula and peel off successive values using the row_num parameter. =INDEX(<bespoke formula>, ROW(1:1)) Filled down.
Sooner or later, you will run out of rows to fill. An IFERROR function used as a wrapper can help avoid he display of errors.
If you want to put all of the values into a single cell, then a User defined Function (aka UDF) could concatenate the array into a single string. This last method is generally not recommended as it renders the values useless for anything other than display purposes.
Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula.
Array formulas chew up calculation cycles logarithmically so it is good practise to narrow the referenced ranges to a minimum.
See Guidelines and examples of array formulas for more information.
Simple question that I wasn't able to figure out. Trying to calculate counts and medians of arrays with a conditional that depends on numbers in two columns in Excel, as seen below.
=MEDIAN(IF(AND($D$3:$D$1216=1,OR($B$3:$B$1216=3,$B$3:$B$1216=6,$B$3:$B$1216=9,$B$3:$B$1216=12)),$M$3:$M$1216))
Please see above for the function, which I input as an array. All I'd like to do is scan Column D for all 1s and then 3/6/9/12 in Column B.
The function works but only returns zeroes for all relevant values in Column D, which I find strange.
Thank you for your help!
Is 0 in fact the most common element in $M$3:$M$1216? i.e., are a lot of those cells just blank?
When I keep the range large, but only enter a few values, I get the same result as you. When I shrink the range to something more manageable, and fill in the values, the function returns what I'd expect.
Also, here's a somewhat simpler syntax for the same result:
=MEDIAN(($D$3:$D$5=1)*OR($B$3:$B$5=3,$B$3:$B$5=6,$B$3:$B$5=9,$B$3:$B$5=12)*($M$3:$M$5))