MATLAB sum over all elements of array valued expression - arrays

So I've been wondering about this for a while now. Summing up over some array variable A is as easy as
sum(A(:))
% or
sum(...sum(sum(A,n),n-2)...,1) % where n is the dimension of A
However once it gets to expressions the (:) doesn't work anymore, like
sum((A-2*A)(:))
is no valid matlab syntax, instead we need to write
foo = A-2*A;
sum(foo(:))
%or the one liner
sum(sum(...sum(A-2*A,n)...,2),1) % n is the dimension of A
The one liner above will only work, if the dimension of A is fixed which, depending on what you are doing, may not necessary be the case. The downside of the two lines is, that foo will be kept in memory until you run clear foo or may not even be possible depending on the size of A and what else is in your workspace.
Is there a general way to circumvent this issue and sum up all elements of an array valued expression in a single line / without creating temporal variables? Something like sum(A-2*A,'-all')?
Edit: It differes from How can I index a MATLAB array returned by a function without first assigning it to a local variable?, as it doesn't concern general (nor specific) indexing of array valued expressions or return values, but rather the summation over each possible index.
While it is possible to solve my problem with the answer given in the link, gnovice says himself that using subref is a rather ugly solution. Further Andras Deak posted a much cleaner way of doing this in the comments below.

While the answers to the linked duplicate can indeed be applied to your problem, the narrower scope of your question allows us to give a much simpler solution than the answers provided there.
You can sum all the elements in an expression (including the return value of a function) by reshaping your array first to 1d:
sum(reshape(A-2*A,1,[]))
%or even sum(reshape(magic(3),1,[]))
This will reshape your array-valued expression to size [1, N] where N is inferred from the size of the array, i.e. numel(A-2*A) (but the above syntax of reshape will compute the missing dimension for you, no need to evaluate your expression twice). Then a single call to sum will sum all the elements, as needed.
The actual case where you have to resort to something like this is when a function returns an array with an unknown number of dimensions, and you want to use its sum in an anonymous function (making temporary variables unavailable):
fun = #() rand(2*ones(1,randi(10))); %function returning random 2 x 2 x ... x 2 array with randi(10) dimensions
sumfun = #(A) sum(reshape(A,1,[]));
sumfun(fun()) %use it

Related

Matlab access array element in one line

a = 0:99
s = size(a)
disp(s(2))
Can the last two lines be written as one? In other languages I am able to do f(x)[i], but Matlab seems to complain.
In this specific case where you are using the size function, you can add an additional argument to specify the dimension you want the size of, allowing you to easily do this in one line:
disp(size(a, 2)); % Displays the size of the second dimension
In the more general case of accessing an array element without having to store it in a local variable first, things get a little more complicated, since MATLAB doesn't have the same kind of indexing shorthand you would find in other languages. Octave, for example, would allow you to do disp(size(a)(2)).
It is possible to collapse those two lines in one single statement and achieve a sort of f(x)[i] thanks to the functional form of the indexing operator: subsref.
disp(subsref(size(a), struct('type', '()', 'subs', {{2}})))

Why [1:2] != Array[1:2]

I am learning Julia following the Wikibook, but I don't understand why the following two commands give different results:
julia> [1:2]
1-element Array{UnitRange{Int64},1}:
1:2
julia> Array[1:2]
1-element Array{Array,1}:
[1,2]
Apologies if there is an explanation I haven't seen in the Wikibook, I have looked briefly but didn't find one.
Type[a] runs convert on the elements, and there is a simple conversion between a Range to an Array (collect). So Array[1:2] converts 1:2 to an array, and then makes an array of objects like that. This is the same thing as why Float64[1;2;3] is an array of Float64.
These previous parts answer answered the wrong thing. Oops...
a:b is not an array, it's a UnitRange. Why would you create an array for A = a:b? It only takes two numbers to store it, and you can calculate A[i] basically for free for any i. Using an array would take an amount of memory which is proportional to the b-a, and thus for larger arrays would take a lot of time to allocate, whereas allocation for UnitRange is essentially free.
These kinds of types in Julia are known as lazy iterators. LinSpace is another. Another interesting set of types are the special matrix types: why use more than an array to store a Diagonal? The UniformScaling operator acts as the identity matrix while only storing one value (it's scale) to make A-kI efficient.
Since Julia has a robust type system, there is no reason to make all of these things arrays. Instead, you can make them a specialized type which will act (*, +, etc.) and index like an array, but actually aren't. This will make them take less memory and be faster. If you ever need the array, just call collect(A) or full(A).
I realized that you posted something a little more specific. The reason here is that Array[1:2] calls the getindex function for an array. This getindex function has a special dispatch on a Range so that way it "acts like it's indexed by an array" (see the discussion from earlier). So that's "special-cased", but in actuality it just has dispatches to act like an array just like it does with every other function. [A] gives an array of typeof(A) no matter what A is, so there's no magic here.

Apply slicing to array valued function in Fortran [duplicate]

How does one access an element of an array that is returned from a function? For example, shape() returns an array of integers. How does one compare an element of that array to an integer? The following does not compile:
integer :: a
integer, dimension(5) :: b
a = 5
if (a .eq. shape(b)) then
print *, 'equal'
end if
The error is:
if (a .eq. shape(c)) then
1
Error: IF clause at (1) requires a scalar LOGICAL expression
I understand that this is because shape(c) returns an array. However, accessing an element of the array does not appear to be possible like so: shape(c)(1)
Now if I add these two lines:
integer, dimension(1) :: c
c = shape(b)
...and change the if clause to this:
if (a .eq. c(1)) then
... then it works. But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?
Further to the answers that deal with SHAPE and logical expressions etc, the general answer to your question "How does one access an element of an array that is returned from a function?" is
you assign the expression that has the function reference to an array variable, and then index that array variable.
you use the expression that has the function reference as an actual argument to a procedure that takes a dummy array argument, and does the indexing for you.
Consequently, the general answer to your last questions "But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?" is "Yes, you do need to declare another array variable" and hence "No, there is no other way".
(Note that reasonable optimising compilers will avoid the need for any additional memory operations/allocations etc once they have the result of the array function, it's really just a syntax issue.)
The rationale for this particular aspect of language design is sometimes ascribed to a need to avoid syntax ambiguity and confusion for array function results that are of character type (they could potentially be indexed and/or substringed - how do you tell what was intended?). Others think it was done this way just to annoy C programmers.
Instead of using shape(array), I would use size(array).
Note that this will return an integer indicating how many elements there are in ALL dimensions, unless you specify the DIM attribute, in which case it will return only the number of elements in the specified dimension.
Take a look at the gfortran documentation:
http://gcc.gnu.org/onlinedocs/gfortran/SIZE.html.
Also, look up lbound and ubound.
Note that the expression
a == shape(b)
returns a rank-1 array of logicals and the if statement requires that the condition reduce to a scalar logical expression. You could reduce the rank-1 array to a scalar like this:
if (all(a == shape(b)))
This is certainly not a general replacement for the syntactically-invalid application of array indexing to an array-returning function such as shape(b)(1).
It is possible even without the intermediate variable using ASSOCIATE:
real c(3,3)
associate (x=>shape(c))
print *,x(1),x(2)
end associate
end

Change array dimensions, using spreadsheet functions, when used inside SUMPRODUCT

I am interested in spreadsheet functions, not VBA solutions, to be included in a single cell formula.
[A1:A15 contain numeric values from 1 to 127, B1:B15 contain integers from 1 to 7 that set a divisor.]
Given the function:
=SUMPRODUCT(MOD(FREQUENCY(A1:A15;A1:A15);B1:B15))
FREQUENCY(A1:A15;A1:A15) gives a 1-column array of 15+1 rows, whereas the second part (B1:B15) is a 1-column array of 15 rows.
I would like to change the resulting array given by FREQUENCY (only in memory -not explicit in sheet-) from a 1-column 16 rows array to a 1-column 15 rows array with the first 15 cell values of that array.
[FREQUENCY documentation: https://support.office.com/en-us/article/FREQUENCY-function-44e3be2b-eca0-42cd-a3f7-fd9ea898fdb9 NB: for Excel, second remark states number of elements that depend on bins_array. ]
I would appreciate suggestions.
Thus, both arrays within MOD will have the same dimensions and SUMPRODUCT will not find cells with error values. I can disregard error values using IF and ISERROR within SUMPRODUCT, but I'd rather disregard the non-relevant part of the FREQUENCY resulting array if it is possible.
It has been thought that making it more specific might be more helpful, so it has been heavily reduced and simplified.
With external help, I have been able to fine-tune a way to solve my problem using INDEX in array formula mode. I am posting the answer in case it helps others.
One way: Put FREQUENCY(A1:A15;A1:A15), or any formula that produces an multi-cell array, within INDEX and have 2nd and/or 3rd arguments as array of consecutive values which will represent rows/columns.
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(FREQUENCY(A1:A15;A1:A15)-1));1)
First argument within INDEX is the resulting array coming from a formula to shrink (from 16x1 to 15x1), which would be a multi-cell array formula if explicitly entered; second argument is the array 1..15 given by row numbers from 1 to the number of total rows of the "array from formula to shrink" MINUS 1: the first 15 (out of 16) values in the array from a formula; 3rd argument is the column of the shrank array (if need be, more than one could be selected using an analogue to the second argument).
In the particular case of FREQUENCY, because it is known that we are interested in the "bins" part of the function, the formula can be simplified by including the total rows of the "bins"/"intervals" array inside FREQUENCY (its second argument). We will have
INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1)
and the complete formula would become
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15;A1:A15);ROW(INDIRECT("1:" & ROWS(A1:A15)));1);B1:B15))
Now, both dividend and divisor of MOD have exactly the same dimensions (15x1) and because B1:B15 includes integers greater than 0 there are no errors.
Thanks all for helping me in making question more concise and better formatted.
ADDITIONAL INFORMATION: As pointed out correctly in comments by XOR LX, this does not seem to work in the widely popular spreadsheet software Excel. It has been developed for an INDEX function inside SUMPRODUCT as used in Open Office Calc which I had mistakenly thought 100% equivalent to Excel's version. A more complete answer perhaps using other functions would be appreciated.
In the previous answer, XOR LX points out very correctly that this formula cannot work in Excel, due to row_num/column_num argument behaviour. Very kindly XOR LX has shown me how that approach can work, and also thanks and credit for supplying a good answer: "INDEX can be used to redimension array (even dynamically created ones) if the row_num/column_num array is coerced to take an arbitrary array with the right dimensions, as shown on this blog entry " The following formula has been checked in Excel 2010 and has the expected results:
SUMPRODUCT(MOD(INDEX(FREQUENCY(A1:A15,A1:A15),N(INDEX(ROW(INDIRECT("1:" & ROWS(A1:A15))),,)),1),B1:B15))
NB: row_num argument of first INDEX, a ROW generated auxiliary array, has been nested inside N(INDEX([...],,)); at least one comma is necessary to account for the two arguments minimum of the nested INDEX. It is in itself interesting the discussion that applies generally to INDEX's arguments, and other functions', that need to be coerced to take arrays (see, here and here at XOR LX's blog). For Open Office users it might be worth stressing the point made at the blog
Unlike OFFSET, (...) for which the first parameter must be a
reference (...) in the worksheet, INDEX can also accept –
and manipulate – for its reference arrays which consist of values
generated e.g. via other subfunctions within the formula. XOR LX's blog
That would be indeed the case in changing the dimension in an array as in this question, but also useful in reversing or displacing the values in an array, for example. Open Office accepts arrays as row_num/column_num, so the coercion is not needed and some formulas rely on this, but without it, these formulas are unlikely to work when files are open in Excel.
Regrettably, this type of coercion is not passed correctly to Open Office, and formula need to be "decoerced" to work, at least in my casual tests.
In order to use a formula that would work in both spreadsheet programs regarding shortening arrays, the only thing I have managed is the following (required: arrays must be single-column)
SUMPRODUCT(
(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
-ROW(COLUMN(INDIRECT("R1C1:R"& ROWS(vals_to_mod) &"C"& ROWS(FREQUENCY(vals_for_freq,vals_for_freq)),FALSE))
=0)
*MOD(TRANSPOSE(FREQUENCY(vals_for_freq,vals_for_freq)),vals_to_mod)
)
(it "shortens" one array to the shortest of the pair, by creating an auxiliary array with TRUE/1s on the diagonal starting top-left and FALSE/0s elsewhere, therefore disregarding all defined values outside the square section of the array. Thus, SUMPRODUCT adds values within the diagonal of the square section which are the product of the corresponding values up to the last value of the shorter array.)

How can I use any() on a multidimensional array?

I'm testing an arbitrarily-large, arbitrarily-dimensioned array of logicals, and I'd like to find out if any one or more of them are true. any() only works on a single dimension at a time, as does sum(). I know that I could test the number of dimensions and repeat any() until I get a single answer, but I'd like a quicker, and frankly, more-elegant, approach.
Ideas?
I'm running 2009a (R17, in the old parlance, I think).
If your data is in a matrix A, try this:
anyAreTrue = any(A(:));
EDIT: To explain a bit more for anyone not familiar with the syntax, A(:) uses the colon operator to take the entire contents of the array A, no matter what the dimensions, and reshape them into a single column vector (of size numel(A)-by-1). Only one call to ANY is needed to operate on the resulting column vector.
As pointed out, the correct solution is to reshape the result into a vector. Then any will give the desired result. Thus,
any(A(:))
gives the global result, true if any of numel(A) elements were true. You could also have used
any(reshape(A,[],1))
which uses the reshape operator explicitly. If you don't wish to do the extra step of converting your matrices into vectors to apply any, then another approach is to write a function of your own. For example, here is a function that would do it for you:
======================
function result = myany(A)
% determines if any element at all in A was non-zero
result = any(A(:));
======================
Save this as an m-file on your search path. The beauty of MATLAB (true for any programming language) is it is fully extensible. If there is some capability that you wish it had, just write a little idiom that does it. If you do this often enough, you will have customized the environment to fit your needs.

Resources