There's a handy Excel function called SMALL that lets you find the n-th smallest value from an array. For example: SMALL({35;10;5000;6},2) = 10, the second smallest number in the set.
You could use this function by referencing an array of cells (SMALL(A1:A10,2)) or you can write an array of constant values in the formula directly (SMALL({1;2;3},2)).
Is there a way to write an array of computed values directly in the formula? It should look something like this, if using RAND to generate the values:
SMALL({RAND();RAND();RAND()},2)
but Excel doesn't allow that.
How can you use a function (like RAND) inside another function that demands an array (like SMALL)?
Yes, I'm aware that the usual solution would be to put the computed values in their own individual cells, then just use that array as the input of SMALL. It would be great if I could do this all inside a single cell.
The general answer as you may know is that you would use an array formula.
This is a bit difficult to illustrate with RAND(), but if you take a more normal function like SQRT, the usage would be
=SMALL(SQRT({9,4,1}),1)
Instead of a single argument, SQRT receives a list of arguments. It works through them one at a time and returns a three-element array {3,2,1}, which is then passed to SMALL to evaluate.
If the same list of numbers was in (say) A1:A3, you would need to enter this as an array formula using Ctrl+Shift+Enter
=SMALL(SQRT(A1:A3),1)
But as pointed out by @Jeeped, it's often more convenient to use AGGREGATE
=AGGREGATE(15,6,SQRT(A1:A3),1)
As far as I know, you can't reproduce this behaviour with RAND because it is one of the few Excel functions that takes no arguments.
If you really did want to generate an array of random numbers r where 0<=r<1 , you would have to do something like this
=SMALL((RANDBETWEEN({0,0,0},10^20-1)/10^20),1)
i.e. use RANDBETWEEN with an arbitrarily large upper limit.
How to RANK an array directly? I would like to avoid creating more intermediate data in cells just to reference them.
Excel RANK.AVG formula states it accepts both array and reference:
Syntax
RANK.AVG(number,ref,[order])
The RANK.AVG function syntax has the following arguments:
Number Required. The number whose rank you want to find.
Ref Required. **An array of, or a reference to**, a list of numbers. Nonnumeric values in Ref are ignored.
Order Optional. A number specifying how to rank number.
But Excel keeps rejecting the below formula.
=RANK.AVG(5, {3,1,7,10,5})
If the numbers are put in cells, say B1:B5, Excel accepts
=RANK.AVG(5, B1:B5)
Ultimately, I would like to rank a dynamic array
=RANK.AVG(value, TOCOL(VSTACK(array1, array2)))
e.g. =RANK.AVG(5, TOCOL(VSTACK(B1:B5,C1:C10)))
It seems that the official documentation for the various RANK functions is simply wrong in claiming that they accept arrays for the ref argument (see here, for example).
You will have to come up with creative alternatives which mimic the RANK.AVG function, for example:
=LET(ζ,SORT(MyArray,,-1),AVERAGE(FILTER(SEQUENCE(COUNT(ζ)),ζ=MyValue)))
In C if I have:
int grades[100][200];
and want to pass the first row, I write grades[0]. But what if I want to pass the first column? Writing grades[][0] doesn't work.
You can't pass columns in C. You pass pointers to the beginning of some contiguous data.
Rows in C are stored contiguously in memory, so you pass a pointer to the first element of a row (you do it implicitly by using its name: matrix[row]; an explicit version would be &matrix[row][0]), and you can use the row by iterating over the contiguous memory.
To use a column, you need to pass the whole array (actually a pointer to the first element of the 2D array) together with the row length; the function then steps by that length to move from one element of the column to the next. This is only one of many possible solutions; you could instead copy the column into a temporary array, as a comment pointed out. The stride approach is commonly used, for example in the CBLAS functions.
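A minimal sketch of the stride approach described above, assuming a row-major array of int. The function name sum_column and its parameters are hypothetical, chosen only for illustration; summing is just one example of work done on a column.

```c
#include <stddef.h>

/* Sum one column of a row-major 2D array of int.
 *   base    - pointer to the first element of the whole array
 *   rows    - number of rows
 *   row_len - number of elements per row (the stride between column entries)
 *   col     - index of the column to sum
 * The name and signature are illustrative, not from any library. */
long sum_column(const int *base, size_t rows, size_t row_len, size_t col)
{
    long total = 0;
    for (size_t r = 0; r < rows; r++) {
        /* Jump one full row length to reach the next element of the column. */
        total += base[r * row_len + col];
    }
    return total;
}
```

For the array in the question, the call would look like sum_column(&grades[0][0], 100, 200, 0) for the first column.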
If it helps to visualize, a 2-dimensional array is an array of arrays, it's not formulated as a matrix. Thereby, we can pass a sub-array (i.e., a row), but there's no direct way of passing a column.
One way to achieve this is to loop over the outer array, pick the element at the fixed position in each row (mimicking the "column"), and use those values to build a separate array, or pass them to a function that needs to process the data.
Matrices do not exist in C (check by reading the C11 standard n1570). There are only arrays, and in your example it is an array of arrays of int. So columns don't exist either.
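The loop described above could be sketched like this. The macros ROWS and COLS and the function copy_column are hypothetical names, sized to match the grades[100][200] array from the question.

```c
#include <stddef.h>

#define ROWS 100   /* assumed to match int grades[100][200] */
#define COLS 200

/* Copy column `col` of a ROWS x COLS array into `out`,
 * which must have room for ROWS ints. Illustrative helper only. */
void copy_column(int grades[ROWS][COLS], size_t col, int out[ROWS])
{
    for (size_t r = 0; r < ROWS; r++) {
        /* Pick the element at the fixed column position in each row. */
        out[r] = grades[r][col];
    }
}
```

The resulting out array is contiguous, so it can be passed to any function that expects an ordinary one-dimensional array, at the cost of a copy.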
A good approach is to view a matrix like some abstract data type (using flexible array members ....) See this answer for details.
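As a rough sketch of that idea, a matrix type built on a flexible array member might look like the following. The struct layout and the names matrix_create, matrix_get and matrix_set are assumptions made for illustration, not the code from the linked answer.

```c
#include <stdint.h>
#include <stdlib.h>

/* A matrix as an abstract data type: its dimensions followed by a
 * flexible array member holding rows * cols elements, row by row. */
struct matrix {
    size_t rows;
    size_t cols;
    int data[];
};

/* Allocate a zero-initialized matrix; returns NULL on failure or overflow. */
struct matrix *matrix_create(size_t rows, size_t cols)
{
    /* Guard against overflow of the allocation size. */
    if (cols != 0 &&
        rows > (SIZE_MAX - sizeof(struct matrix)) / sizeof(int) / cols) {
        return NULL;
    }
    struct matrix *m = calloc(1, sizeof *m + rows * cols * sizeof(int));
    if (m != NULL) {
        m->rows = rows;
        m->cols = cols;
    }
    return m;
}

/* Element access; the caller is responsible for keeping r and c in range. */
int matrix_get(const struct matrix *m, size_t r, size_t c)
{
    return m->data[r * m->cols + c];
}

void matrix_set(struct matrix *m, size_t r, size_t c, int value)
{
    m->data[r * m->cols + c] = value;
}
```

Because the dimensions travel with the data, a function that walks a column only needs the matrix pointer and a column index, and the whole object is released with a single free.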
Consider also using (and perhaps looking inside its source code) the GNU scientific library GSL, and other libraries like OpenCV, see the list here.
In some cases, arbitrary precision arithmetic (with gmplib) could be needed.
I am interested in spreadsheet functions, not VBA solutions, to be included in a single cell formula.
[A1:A15 contain numeric values from 1 to 127, B1:B15 contain integers from 1 to 7 that set a divisor.]
Given the function:
=SUMPRODUCT(MOD(FREQUENCY(A1:A15;A1:A15);B1:B15))
FREQUENCY(A1:A15;A1:A15) gives a 1-column array of 15+1 rows, whereas the second part (B1:B15) is a 1-column array of 15 rows.
I would like to change the resulting array given by FREQUENCY (only in memory -not explicit in sheet-) from a 1-column 16 rows array to a 1-column 15 rows array with the first 15 cell values of that array.
[FREQUENCY documentation: https://support.office.com/en-us/article/FREQUENCY-function-44e3be2b-eca0-42cd-a3f7-fd9ea898fdb9 NB: for Excel, second remark states number of elements that depend on bins_array. ]
I would appreciate suggestions.
Thus, both arrays within MOD will have the same dimensions and SUMPRODUCT will not find cells with error values. I can disregard error values using IF and ISERROR within SUMPRODUCT, but I'd rather disregard the non-relevant part of the FREQUENCY resulting array if it is possible.
It has been thought that making it more specific might be more helpful, so it has been heavily reduced and simplified.
With external help, I have been able to fine-tune a way to solve my problem using INDEX in array formula mode. I am posting the answer in case it helps others.
One way: Put FREQUENCY(A1:A15;A1:A15), or any formula that produces a multi-cell array, within INDEX and use arrays of consecutive values as the 2nd and/or 3rd arguments to represent the rows/columns to keep.
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(FREQUENCY(A1:A15;A1:A15))-1));1)
The first argument within INDEX is the array produced by the formula to shrink (from 16x1 to 15x1), which would be a multi-cell array formula if explicitly entered. The second argument is the array 1..15, given by the row numbers from 1 to the total number of rows of the "array from formula to shrink" MINUS 1: the first 15 (out of 16) values in the array. The third argument is the column of the shrunk array (if need be, more than one could be selected using an analogue of the second argument).
In the particular case of FREQUENCY, because it is known that we are interested in the "bins" part of the function, the formula can be simplified by including the total rows of the "bins"/"intervals" array inside FREQUENCY (its second argument). We will have
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1)
and the complete formula would become
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1);B1:B15))
Now, both dividend and divisor of MOD have exactly the same dimensions (15x1) and because B1:B15 includes integers greater than 0 there are no errors.
Thanks all for helping me in making question more concise and better formatted.
ADDITIONAL INFORMATION: As pointed out correctly in comments by XOR LX, this does not seem to work in the widely popular spreadsheet software Excel. It has been developed for an INDEX function inside SUMPRODUCT as used in Open Office Calc which I had mistakenly thought 100% equivalent to Excel's version. A more complete answer perhaps using other functions would be appreciated.
In the previous answer, XOR LX points out very correctly that this formula cannot work in Excel, due to the behaviour of the row_num/column_num arguments. XOR LX has very kindly shown me how that approach can work, and deserves thanks and credit for supplying a good answer: "INDEX can be used to redimension array (even dynamically created ones) if the row_num/column_num array is coerced to take an arbitrary array with the right dimensions, as shown on this blog entry". The following formula has been checked in Excel 2010 and gives the expected results:
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15,A1:A15),N(INDEX(ROW(INDIRECT("1:" & ROWS(A1:A15))),,)),1),B1:B15))
NB: the row_num argument of the first INDEX, an auxiliary array generated by ROW, has been nested inside N(INDEX([...],,)); at least one comma is necessary to account for the two-argument minimum of the nested INDEX. The general discussion of arguments to INDEX, and to other functions, that need to be coerced to take arrays is interesting in itself (see here and here at XOR LX's blog). For Open Office users it might be worth stressing the point made at the blog
Unlike OFFSET, (...) for which the first parameter must be a
reference (...) in the worksheet, INDEX can also accept –
and manipulate – for its reference arrays which consist of values
generated e.g. via other subfunctions within the formula. XOR LX's blog
That would be indeed the case in changing the dimension in an array as in this question, but also useful in reversing or displacing the values in an array, for example. Open Office accepts arrays as row_num/column_num, so the coercion is not needed and some formulas rely on this, but without it, these formulas are unlikely to work when files are open in Excel.
Regrettably, this type of coercion is not passed correctly to Open Office, and formulas need to be "decoerced" to work, at least in my casual tests.
In order to use a formula that would work in both spreadsheet programs regarding shortening arrays, the only thing I have managed is the following (required: arrays must be single-column)
SUMPRODUCT(
(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
-ROW(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
=0)
*MOD(TRANSPOSE(FREQUENCY(vals_for_freq,vals_for_freq)),vals_to_mod)
)
(it "shortens" one array to the shortest of the pair, by creating an auxiliary array with TRUE/1s on the diagonal starting top-left and FALSE/0s elsewhere, therefore disregarding all defined values outside the square section of the array. Thus, SUMPRODUCT adds values within the diagonal of the square section which are the product of the corresponding values up to the last value of the shorter array.)
I apologise if this is a very simple question, but I am at a bit of a loss here.
A bespoke formula I want to use returns an array of values, as seen here:
But I cannot find a way to present this output in a cell separated format, only the first cell (39478) is returned.
There is a note included in the documentation: Hint: This function is a multiple result function. You MUST set an array for the output.
Whilst I understand I am going to need an array to display multiple results, I cannot find the method of doing so. Any tips?
If the bespoke formula wants to return an array of values, there are a couple of ways to get the results into multiple cells.
Put the formula into a cell and hit Enter↵. Next, select that cell along with several cells below it. Tap F2 then hit Ctrl+Shift+Enter↵. The successive values should fill the cells selected until an error (no more returns) is reached.
Put that formula into a cell and hit Ctrl+Shift+Enter↵. The formula should be wrapped in braces (i.e. { and }). If the correct relative and absolute cell addresses were used (e.g. $A$1 or $A1, etc.) then you should be able to fill, copy or drag down the formula into successive rows.
Use an INDEX function to contain the array of returned values from the bespoke formula and peel off successive values using the row_num parameter. =INDEX(<bespoke formula>, ROW(1:1)) Filled down.
Sooner or later, you will run out of rows to fill. An IFERROR function used as a wrapper can help avoid the display of errors.
If you want to put all of the values into a single cell, then a User Defined Function (aka UDF) could concatenate the array into a single string. This last method is generally not recommended as it renders the values useless for anything other than display purposes.
Array formulas need to be finalized with Ctrl+Shift+Enter↵. Once entered into the first cell correctly, they can be filled or copied down or right just like any other formula.
Array formulas chew up calculation cycles quickly as the referenced ranges grow, so it is good practice to narrow the referenced ranges to a minimum.
See Guidelines and examples of array formulas for more information.
I have a VBA function that returns an array to be displayed in Excel. The array's first two columns contain IDs that don't need to be displayed.
Is there any way to modify the Excel formula to skip the first two columns, without going back to create a VBA helper to strip off the columns?
The formula looks like this, where the braces (added by array entry) let the array be displayed across a span of cells:
{=GetCustomers($a$1)}
The closest thing Excel has to built-in array manipulation is the 'INDEX' function. In your case, if the array returned by your 'GetCustomers' routine is a single row, and if you know how long it is (which I guess you do since you're putting it into the sheet), you can get what you want by doing something like this:
=INDEX(GetCustomers($A$1),{3,4,5})
So say GetCustomers() returned the array {1,2,"a","b","c"}, the above would just give back {"a","b","c"}.
There are various ways to save yourself having to type out your array of indices, though. For example,
=COLUMN(C1:E1)
will return {3,4,5}, and you can use that instead:
=INDEX(GetCustomers($A$1),COLUMN(C1:E1))
This trick doesn't work with a true 2-D array, though. 'INDEX' will return a whole row or column if you pass in a zero in the right place, but I don't know how to make it return a 2-D subset. EDIT: You can do it, but it's cumbersome. Say your array is 2x5, and you want the last three columns. You could use:
=INDEX(GetCustomers($A$1), {1,1,1;2,2,2}, {3,4,5;3,4,5})
(FURTHER EDIT: chris neilsen provides a nice way to compute those arrays in his answer.)
Charles Williams has a link on his website that explains more about using arrays like this here:
http://www.decisionmodels.com/optspeedj.htm
He posted that in response to this question I asked:
Is there any documentation of the behavior of built-in Excel functions called with array arguments?
Personally, I use VBA helper functions for things like this. With the right library routines, you can do something like:
=subseq(GetCustomers($A$1),3,3)
in the 1-D case, or:
=hstack(subseq(asCols(GetCustomers($A$1)),3,3))
in the 2-D case, and it's much more clear.
The simplest solution is to just hide the first two columns.
Another may be to use OFFSET to resize the returned array.
The syntax is
OFFSET(reference,rows,cols,height,width)
I suggest modifying the 'GetCustomers' function to include an optional Boolean argument that tells the function whether to return all the columns or only the ones you need. That would be the cleanest solution, instead of trying to handle it on the formula side.
Public Function GetCustomers(rng As Range, Optional stripColumns As Boolean = False) As Variant()
    If stripColumns Then
        ' Resize the array to meet your needs
    Else
        ' Return the full-sized array
    End If
End Function
You can use the INDEX function to extract items from the returned array
- the formula is entered in a range starting at cell B2
{=INDEX(getcustomers($A$1),ROW()-ROW($B$2)+1,COLUMN()-COLUMN($B$2)+3)}