I am interested in spreadsheet functions, not VBA solutions, to be included in a single cell formula.
[A1:A15 contain numeric values from 1 to 127, B1:B15 contain integers from 1 to 7 that set a divisor.]
Given the function:
=SUMPRODUCT(MOD(FREQUENCY(A1:A15;A1:A15);B1:B15))
FREQUENCY(A1:A15;A1:A15) gives a 1-column array of 15+1 rows, whereas the second part (B1:B15) is a 1-column array of 15 rows.
I would like to change the resulting array given by FREQUENCY (only in memory -not explicit in sheet-) from a 1-column 16 rows array to a 1-column 15 rows array with the first 15 cell values of that array.
[FREQUENCY documentation: https://support.office.com/en-us/article/FREQUENCY-function-44e3be2b-eca0-42cd-a3f7-fd9ea898fdb9 NB: for Excel, second remark states number of elements that depend on bins_array. ]
I would appreciate suggestions.
Thus, both arrays within MOD will have the same dimensions and SUMPRODUCT will not find cells with error values. I can disregard error values using IF and ISERROR within SUMPRODUCT, but I'd rather disregard the non-relevant part of the FREQUENCY resulting array if it is possible.
It has been thought that making it more specific might be more helpful, so it has been heavily reduced and simplified.
With external help, I have been able to fine-tune a way to solve my problem using INDEX in array formula mode. I am posting the answer in case it helps others.
One way: Put FREQUENCY(A1:A15;A1:A15), or any formula that produces an multi-cell array, within INDEX and have 2nd and/or 3rd arguments as array of consecutive values which will represent rows/columns.
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(FREQUENCY(A1:A15;A1:A15)-1));1)
First argument within INDEX is the resulting array coming from a formula to shrink (from 16x1 to 15x1), which would be a multi-cell array formula if explicitly entered; second argument is the array 1..15 given by row numbers from 1 to the number of total rows of the "array from formula to shrink" MINUS 1: the first 15 (out of 16) values in the array from a formula; 3rd argument is the column of the shrank array (if need be, more than one could be selected using an analogue to the second argument).
In the particular case of FREQUENCY, because it is known that we are interested in the "bins" part of the function, the formula can be simplified by including the total rows of the "bins"/"intervals" array inside FREQUENCY (its second argument). We will have
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1)
and the complete formula would become
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1);B1:B15))
Now, both dividend and divisor of MOD have exactly the same dimensions (15x1) and because B1:B15 includes integers greater than 0 there are no errors.
Thanks all for helping me in making question more concise and better formatted.
ADDITIONAL INFORMATION: As pointed out correctly in comments by XOR LX, this does not seem to work in the widely popular spreadsheet software Excel. It has been developed for an INDEX function inside SUMPRODUCT as used in Open Office Calc which I had mistakenly thought 100% equivalent to Excel's version. A more complete answer perhaps using other functions would be appreciated.
In the previous answer, XOR LX points out very correctly that this formula cannot work in Excel, due to row_num/column_num argument behaviour. Very kindly XOR LX has shown me how that approach can work, and also thanks and credit for supplying a good answer: "INDEX can be used to redimension array (even dynamically created ones) if the row_num/column_num array is coerced to take an arbitrary array with the right dimensions, as shown on this blog entry " The following formula has been checked in Excel 2010 and has the expected results:
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15,A1:A15),N(INDEX(ROW(INDIRECT("1:" & ROWS(A1:A15))),,)),1),B1:B15))
NB: row_num argument of first INDEX, a ROW generated auxiliary array, has been nested inside N(INDEX([...],,)); at least one comma is necessary to account for the two arguments minimum of the nested INDEX. It is in itself interesting the discussion that applies generally to INDEX's arguments, and other functions', that need to be coerced to take arrays (see, here and here at XOR LX's blog). For Open Office users it might be worth stressing the point made at the blog
Unlike OFFSET, (...) for which the first parameter must be a
reference (...) in the worksheet, INDEX can also accept –
and manipulate – for its reference arrays which consist of values
generated e.g. via other subfunctions within the formula. XOR LX's blog
That would be indeed the case in changing the dimension in an array as in this question, but also useful in reversing or displacing the values in an array, for example. Open Office accepts arrays as row_num/column_num, so the coercion is not needed and some formulas rely on this, but without it, these formulas are unlikely to work when files are open in Excel.
Regrettably, this type of coercion is not passed correctly to Open Office, and formula need to be "decoerced" to work, at least in my casual tests.
In order to use a formula that would work in both spreadsheet programs regarding shortening arrays, the only thing I have managed is the following (required: arrays must be single-column)
SUMPRODUCT(
(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
-ROW(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
=0)
*MOD(TRANSPOSE(FREQUENCY(vals_for_freq,vals_for_freq)),vals_to_mod)
)
(it "shortens" one array to the shortest of the pair, by creating an auxiliary array with TRUE/1s on the diagonal starting top-left and FALSE/0s elsewhere, therefore disregarding all defined values outside the square section of the array. Thus, SUMPRODUCT adds values within the diagonal of the square section which are the product of the corresponding values up to the last value of the shorter array.)
Related
I have been looking around for a while but unable to find an answer to my question.
In Excel, what compact formula can I use to create an array made up of a single element repeated n times, where n is an input (potentially hard-coded)?
For example, something that would look like this (the formula below does not work but gives an idea of what I am looking for):
{={"Constant"}*3}
Note: I am not looking for a VBA-based solution.
EDIT Reading #AxelRichter answer, I see I should also indicate that the formulas below assume Constant is a number. If Constant is text, then this solution will not work.
Volatile:
=ROW(INDIRECT("1:" & Repts))/ROW(INDIRECT("1" & ":" & Repts)) * Constant
non-Volatile:
=ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,Repts,1))/ROW(INDEX($1:$65535,1,1):INDEX($1:$65535,Repts,1))*Constant
If
Constant = 14
Repts = 3
then
Result = {14;14;14}
The first part of the formulas create an array of 1's repeated Repts times. Then we multiply that array by Constant to get the desired result.
And after reading #MacroMarc's comment, the following non-volatile formula shouyld also work for numbers:
=(ROW($A$1:INDEX($A:$A,Repts))>0)*Constant
One could concatenate 1:n empty cells to the "Constant" to create a string array having n items "Constant":
"Constant"&INDEX(XFD:XFD,1):INDEX(XFD:XFD,3)
There 3 is n.
Used in Formula
=INDEX("Constant"&INDEX(XFD:XFD,1):INDEX(XFD:XFD,3),0)
Evaluate Formula shows that it works:
Here column XFD is used because in most cases this column will be empty and a column which is guaranteed to be empty is needed for this solution.
If used
"Constant"&T(ROW($A$1:INDEX($A:$A,3)))
=INDEX("Constant"&T(ROW($A$1:INDEX($A:$A,3))),0)
the need of an empty column disappears. The function ROW returns numbers but the T returns an empty string if its parameter is not text. So empty strings will be concatenated for each 1:3 (n).
Thanks to #MacroMarc for the hint.
Try:
REPT("Constant", SEQUENCE(3,1,1,0))
Or, if the reference is to a dynamic array:
REPT("Constant", SEQUENCE(A1#,1,1,0))
The dynamic array spills, and has your constant repeated one time.
Using SEQUENCE with a step of 0 is a much cleaner way to make an array of constants. You can choose whether you want rows or columns (or both!) as well.
=SEQUENCE(Repts,1,Constant,0)
I will generally use a sequence (like Claire (above) said). But if you want to provide an output of text objects, I would do it this way:
=IF(SEQUENCE(A1,A2,1,0),A3)
Where:
A1 has the number of rows
A2 has the number of columns
A3 has the thing you want repeated into an array
The sequence will create a matrix of 1's, which the IF statement will default to the TRUE expression (being the contents of A3).
So, if you wanted a vertical list of 3 items that says "Constant", this would do it:
=IF(SEQUENCE(3,,1,0),"Constant")
If you would prefer it be arranged horizontally instead of vertically, just amend the SEQUENCE function:
=IF(SEQUENCE(,3,1,0),"Constant")
I would like to know whether it's possible to return an array from a single cell formula, which is filtered to remove duplicates, and which is built purely on Excel formulas.
I'm aware of approaches to return a list of values where the duplicates are removed (see this question), where the list is spread over multiple cells. However I specifically want to return an array intermediate.
E.g. For the list in A1:A5, I can get an array of values {0.1,0.2,0.2,0.7,0.3}, from which I want a second array {0.1,0.2,0.7,0.3}, as an intermediate in an array formula. Current approaches use single-end anchored ranges (like C$1:C1) to iterate through the items in the array geometrically (by dragging down column C). I would like to leave the array un-iterated, within the formula. I can then manipulate this as I would any other array.
All this should take place in a single cell if possible.
NB
Both MacroMarc's and Barry Houdini's answers are perfectly valid, and I ran a speed check on each - there was negligible difference (any difference was smaller than the variation between test runs). Both scored ~ 1.0±0.2 ms
I have used a defined name for the Range (A1:A5) and called it myList. You can do the same, or substitute in the Address $A$1:$A$5 if you wish:
{=INDEX(myList, N(IF({1}, MODE.MULT(IF(MATCH(myList, myList, 0) = ROW(myList), ROW(myList)*{1,1})))), 1)}
EDIT: Above wasn't robust to handle if the column list is further down the sheet, and a shorter minrow routine courtesy of OP:
{=INDEX(myList, N(IF({1}, MODE.MULT(IF(MATCH(myList, myList, 0)=ROW(myList)-MIN(ROW(myList))+1, (ROW(myList)-MIN(ROW(myList))+1)*{1,1})))), 1)}
This should be ok for you. Needless to say, these are array formulas..
This formula will return a sorted array without duplicates, e.g. for your example
{0.1;0.2;0.3;0.7}
=SMALL(IF(MATCH(A1:A5,A1:A5,0)=ROW(A1:A5)-ROW(A1)+1,A1:A5),ROW(INDIRECT("1:"&SUM(0+(0<(FREQUENCY(A1:A5,A1:A5)))))))
confirmed with CTRL+SHIFT+ENTER
......or this version will keep the order
=INDEX(A1:A5,N(IF({1},SMALL(IF(MATCH(A1:A5,A1:A5,0)=ROW(A1:A5)-ROW(A1)+1,ROW(A1:A5)-ROW(A1)+1),ROW(INDIRECT("1:"&SUM(0+(0<(FREQUENCY(A1:A5,A1:A5))))))))))
Use this array formula to sum the unique, it requires the use of TEXTJOIN() which is only available with Office 365 Excel:
=SUM( IF(ISERROR(FIND(TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999+1,999)),MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),1,(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999))),--TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999+1,999))))
Being an array formula it needs to be confirmed with Ctrl-Shift-Enter instead of Enter when exiting edit mode.
You can replace SUM with many different formula.
If one does not have Office 365 Excel then vba and or helper columns are the only method available.
Edit:
To remove the False from the array and return the 3rd non duplicate in the array we can wrap it in another TEXTJOIN:
=--TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,IF(ISERROR(FIND(TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999+1,999)),MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),1,(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999))),--TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999+1,999)),"")),(3-1)*999,999))
The 3 in the (3-1)*999 can be replaced by anything that returns the Index number desired.
Still an array formula that needs Ctrl-Shift-Enter.
If you want to return the relative position in then use this array:
=AGGREGATE(15,6,ROW(INDIRECT("1:" & COUNTA(A:A)))/(--TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,IF(ISERROR(FIND(TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999+1,999)),MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),1,(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999))),--TRIM(MID(TEXTJOIN(REPT(" ",999),TRUE,A:A),(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999+1,999)),"")),(ROW(INDIRECT("1:" & COUNTA(A:A)))-1)*999,999))=B1),1)
I gave you the long formula for creating the array but for the sum:
=SUMPRODUCT(A1:A5/COUNTIF(A1:A5,A1:A5))
would be sufficient.
And for the Average:
=SUMPRODUCT(A1:A5/COUNTIF(A1:A5,A1:A5))/SUMPRODUCT(1/COUNTIF(A1:A5,A1:A5))
There are many work arounds that are shorter and better if you know what you want.
=SUM(SUMIFS('Output'!$H$50:$H$69,'Output'!$C$50:$C$69,{"*PLoan","Deficit Loan"},'Output'!$G$50:$G$69,X97:AC97))
I have the above array formula. X97:AC97 is where I have my array literal and the values in this array literal are conditioned on some other parameters (let's say the values are a,b,c,d,e if the conditions are met). Because the values are conditional, it might show up as a "" "" "" e "". So in the sumifs array formula it takes the array literal as values {"a","","","","e",""}, which causes an error. How do I make it so that the array is {"a","e"}?
edit: I realized after much effort that sumifs might not be the correct method as it requires alignment. For example, I might have a (1,0) criteria met from {PLoan, Deficit Loan} while X97:AC97 = (0,1,0,0,0,0). I do want the sum for that number but it won't sum due to the mismatch.
I got the below to work. If anyone can think of a way to incorporate wildcards as part of my string search, let me know...
=SUMPRODUCT(IFERROR(ISNUMBER(MATCH('Output'!$G$50:$G$69,$X$97:$AC$97,0)*MATCH('Output'!$C$50:$C$69,{"PLoan","Operating Deficit Loan"},0))*('Output'!$H$50:$H$69),0))
COUNTIF(S)/SUMIF(S) do not necessarily "require alignment" of the criteria being passed. It all depends upon what it is you wish to count.
If you require a count which comprises all combinations of the two sets of criteria, then it is sufficient that the two arrays containing those criteria be orthogonal.
At the moment, both of the criteria arrays which you are passing are of the same vector-type, i.e. row-vectors, since:
X97:AC97
is clearly a row- (horizontal) vector, as also is:
{"*PLoan","Deficit Loan"}
since, for English-language versions of Excel, the comma and semicolon represent, respectively, separators within row (horizontal) and column (vertical) array constants.
As such, in your case you simply need to transpose one of these two arrays, viz:
=SUM(SUMIFS(Output!$H$50:$H$69,Output!$C$50:$C$69,{"*PLoan";"Deficit Loan"},Output!$G$50:$G$69,X97:AC97))
(We could also transpose the other, viz:
=SUM(SUMIFS(Output!$H$50:$H$69,Output!$C$50:$C$69,{"*PLoan","Deficit Loan"},Output!$G$50:$G$69,TRANSPOSE(X97:AC97)))
or else move the entries within X97:AC97 to some other, vertical, range.)
See here for more information.
Regards
I'm trying to setup a formula that will return the contents of an related cell (my related cell is on another sheet) from the smallest 2 results in an array. This is what I'm using right now.
=INDEX('Sheet1'!$A$40:'Sheet1'!$A$167,MATCH(SMALL(F1:F128,1),F1:F128,0),1)
And
=INDEX('Sheet1'!$A$40:'Sheet1:!$A$167,MATCH(SMALL(F1:F128,2),F1:F128,0),1)
The problem I've run into is twofold.
First, if there are multiple lowest results I get whichever one appears first in the array for both entries.
Second, if the second lowest result is duplicated but the first is not I get whichever one shows up on the list first, but any subsequent duplicates are ignored. I would like to be able to display the names associated with the duplicated scores.
You will have to adjust the k parameter of the SMALL function to raise the k according to duplicates. The COUNTIF function should be sufficient for this. Once all occurrences of the top two scores are retrieved, standard 'lookup multiple values' formulas can be applied. Retrieving successive row positions with the AGGREGATE¹ function and passing those into an INDEX of the names works well.
The formulas in H2:I2 are,
=IF(SMALL(F$40:F$167, ROW(1:1))<=SMALL(F$40:F$167, 1+COUNTIF(F$40:F$167, MIN(F$40:F$167))), SMALL(F$40:F$167, ROW(1:1)), "") '◄ H2
=IF(LEN(H40), INDEX(A$40:A$167, AGGREGATE(15, 6, ROW($1:$128)/(F$40:F$167=H40), COUNTIF(H$40:H40, H40))), "") '◄ I2
Fill down as necessary. The scores are designed to terminate after the last second place so it would be a good idea to fill down several rows more than is immediately necessary for future duplicates.
¹ The AGGREGATE function was introduced with Excel 2010². It is not available in earlier versions.
² Related article for pre-xl2010 functions - see Multiple Ranked Returns from INDEX().
The following formula will do what I think you want:
=IF(OR(ROW(1:1)=1,COUNTIF($E$1:$E1,INDEX(Sheet1!$A$40:$A$167,MATCH(SMALL($F$1:$F$128,ROW(1:1)),$F$1:$F$128,0)))>0,ROW(1:1)=2),INDEX(Sheet1!$A$40:$A$167,MATCH(1,INDEX(($F$1:$F$128=SMALL($F$1:$F$128,ROW(1:1)))*(COUNTIF($E$1:$E1,Sheet1!$A$40:$A$167)=0),),0)),"")
NOTE:
This is an array formula and must be confirmed with Ctrl-Shift-Enter.
There are two references $E$1:$E1. This formula assumes that it will be entered in E2 and copied down. If it is going in a different column Change these two references. It must go in the second row or it will through a circular reference.
What it will do
If there is a tie for first place it will only list those teams that are tied for first.
If there is only one first place but multiple tied for second places it will list all those in second.
So make sure you copy the formula down far enough to cover all possible ties. It will put "" in any that do not fill, so err on the high side.
To get the Scores use this simple formula, I put mine in Column F:
=IF(E2<>"",SMALL($F$1:$F$128,ROW(1:1)),"")
Again change the E reference to the column you use for the output.
I did a small test:
This question is about how Excel's COUNTIF function treats different data types when used as an array formula.
There are lots of good posts out there detailing how to use COUNTIF for tasks such as extracting unique values from a list, for example this post. I've managed to use examples from this and other posts to solve specific problems, but I'm trying to get a deeper understanding of array formulas in order to adapt my formulas to new needs.
I came across a peculiar behavior of COUNTIF. In general, Excel seems to treat strings as "larger than" numbers, so that the following examples are valid:
Cell Formula Returns
=1<2 TRUE
="a"<"b" TRUE
="a">"b" FALSE
=1<"b" TRUE
Now, suppose range A1:A6 contains the following data set:
1
2
3
A
B
C
For each cell in this set, I want to check how many of all the cells in the set that are smaller than or equal to that cell (a useful technique in more complex formulas). I enter the following array formula in range B1:B6:
{=COUNTIF($A$1:$A$6,"<="&$A$1:$A$6)} (CTRL + SHIFT + ENTER)
Based on the examples above comparing numbers and strings (also illustrated in Column D below), I would expect the output shown below to look like Column C. However, the array formula returns the result shown in Column B, which suggests that strings and number elements are counted separately by arraywise COUNTIF.
Column A Column B Column C Column D
1 1 1 A1<"C" = TRUE
2 2 2 A2<"C" = TRUE
3 3 3 A3<"C" = TRUE
A 1 4 A4<"C" = TRUE
B 2 5 A5<"C" = TRUE
C 3 6 A6<"C" = FALSE
So the question is how to produce the output in Column C? (EDIT: Just to clarify, I'm specifically looking for solutions that make use of COUNTIF's array properties.)
Any insight into why arraywise COUNTIF apparently behaves differently than the single-cell examples would also be much appreciated.
NOTE: I've translated the examples from a non-English version of Excel, so I apologize in advance for any typos.
PS. For a background, I ran into this problem when I tried to build a formula that would both extract unique values from a list with possible duplicates, and sort the unique values in numerical/alphabetical order. My current solution is to do this in two steps. One solution for how to do it in one step is proposed here.
First of all, excellently laid-out question, and on an interesting topic to boot.
I also raised an eyebrow when I first came across this behaviour of the COUNTIF(S)/SUMIF(S) functions. In their defence, I suppose we could construct situations in which we actually want strings and numerics to be considered separately.
In order to construct your required in-formula array, you will need something like:
MMULT(0+(TRANSPOSE($A$1:$A$6)<=$A$1:$A$6),ROW($A$1:$A$6)^0)
though note that the necessary transposition will mean that any set-up which includes this construction will require committing with CSE.
Regards
The different behavior can easily shown if you compare
=COUNTIF($A$1:$A$6,"<=A")
with
{=COUNT(IF($A$1:$A$6<="A",1))}
The first will only get text values from $A$1:$A$6 because it is clearly text to compare and it is faster ignoring other values then. =COUNTIF($A$1:$A$6,"<=3") will only get numeric values from $A$1:$A$6 because of the same reasons. Even if the criterion would be a concatination with a cell reference, then the concatination would be the first process and would lead either to "<=A" or "<=3". So it is ever clear what to compare, text or numbers.
The second first needs an array of the comparisons, then performs the IF, gets so an array of 1 or FALSE and counts then. But the "A" could also be a cell reference. So it is not clear what to compare at the beginning and the first array has to compare all values in $A$1:$A$6.
So COUNTIF(S) and SUMIF(S) cant be used comparing mixed text and numeric data.
The solution is shown already by XOR LX.
Btw.: with your PS. For a background you should consider the following solution from an German Excel site: http://www.excelformeln.de/formeln.html?welcher=236.
In your linked example:
Formula in B2 downwards
{=INDEX($A$2:$A$99,MATCH(LARGE(COUNTIF(A$2:A$99,">="&A$2:A$99)+99*ISNUMBER(A$2:A$99),ROWS($1:1)),COUNTIF(A$2:A$99,">="&A$2:A$99)+99*ISNUMBER(A$2:A$99),0))&""}
In this solution the COUNTIF compares with >= so the biggest text or number will count lowest and so get the lowest position. All number positions are added with 99. So they are ever greater than all possible text positions. So we have a descended sorted array. Then, using LARGE, the list is created from the highest to the lowest position.
I doubt that countif is the right function for what you want to achieve here.
try this (ctrl+shift+enter):
={SUM(IF(A1>=$A$1:$A$6,1,0))}
You will get
1
2
3
4
5
6
PS: CountIf is an basically an array function internally. Using it in another array function results into multiple array functions and their behaviour becomes complex. Array functions are best used with clear logical path.
As tested in Excel 2013, you will only get 1 in all results instead of what was proposed in Column B.
Currently, in the function provided by you, countif cannot figure out which cell to compare to which cell. Array functions expand ranges and then perform the provided action. Therefore, it is comparing each cell to same cell and resulting into 1.
Try this FormulaArray in B1 then copy till B6:
=SUM(($A$1:$A$6<=$A1)*1)