I am trying to use numpy.where to find the index of a particular value. Though I have searched quite a bit on the web, including Stack Overflow, I did not find a simple 1D example.
>>> ar=[3,1,4,8,2,1,0]
>>> np.where(ar==8)
(array([], dtype=int64),)
I expected np.where(ar==8) to return the index/location of 8 in the array.
What am I doing wrong? Is it something in my array?
Thanks
This is a really good example of how the range of variable types in Python and numpy can be confusing for a beginner. What's happening is that [3,1,4,8,2,1,0] creates a list, not an ndarray. So the expression ar == 8 evaluates to the single value False, because an equality comparison between a list and an integer is always False rather than being applied element by element. Thus, np.where(False) returns an empty array. The way to fix this is:
arr = np.array([3,1,4,8,2,1,0])
np.where(arr == 8)
This returns (array([3]),). There's opportunity for further confusion, because where returns a tuple. If you write a script that intends to access the index position (3, in this case), you need np.where(arr == 8)[0] to pull the first (and only) result out of the tuple. To actually get the value 3, you need np.where(arr == 8)[0][0] (although this will raise an IndexError if there are no 8's in the array).
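A minimal sketch of the fix described above, assuming NumPy is imported as np. The helper name first_index is hypothetical (it is not a NumPy function); it only wraps the [0][0] lookup so that a missing value yields None instead of raising IndexError.

```python
import numpy as np

ar = [3, 1, 4, 8, 2, 1, 0]

# A plain list compared with an int gives one bool, not an element-wise result.
list_comparison = (ar == 8)

# Converting to an ndarray makes == element-wise.
arr = np.array(ar)
mask = (arr == 8)                # boolean array, True only at position 3
indices = np.where(mask)[0]      # first (and only) entry of the returned tuple


def first_index(values, target):
    """Hypothetical helper: index of the first match, or None if absent."""
    hits = np.where(np.asarray(values) == target)[0]
    return int(hits[0]) if hits.size else None


print(list_comparison)       # False
print(indices)               # [3]
print(first_index(ar, 8))    # 3
print(first_index(ar, 9))    # None
```

Checking hits.size before indexing is one way to avoid the IndexError mentioned above; whether None is the right fallback depends on the calling code.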
This is an example where numeric-specialized languages like Matlab or Octave are simpler to use for newbies, because the language is less general and so has fewer return types to understand.
I am learning Julia following the Wikibook, but I don't understand why the following two commands give different results:
julia> [1:2]
1-element Array{UnitRange{Int64},1}:
1:2
julia> Array[1:2]
1-element Array{Array,1}:
[1,2]
Apologies if there is an explanation I haven't seen in the Wikibook, I have looked briefly but didn't find one.
Type[a] runs convert on the elements, and there is a simple conversion from a Range to an Array (collect). So Array[1:2] converts 1:2 to an array, and then makes an array of objects like that. This is the same reason why Float64[1;2;3] is an array of Float64.
These previous parts answered the wrong thing. Oops...
a:b is not an array, it's a UnitRange. Why would you create an array for A = a:b? It only takes two numbers to store it, and you can calculate A[i] basically for free for any i. Using an array would take an amount of memory proportional to b-a, and thus for larger ranges would take a lot of time to allocate, whereas allocation for a UnitRange is essentially free.
These kinds of types in Julia are known as lazy iterators. LinSpace is another. Another interesting set of types are the special matrix types: why use more than an array to store a Diagonal? The UniformScaling operator acts as the identity matrix while only storing one value (its scale) to make A-kI efficient.
Since Julia has a robust type system, there is no reason to make all of these things arrays. Instead, you can make them specialized types which act (*, +, etc.) and index like arrays, but actually aren't arrays. This will make them take less memory and be faster. If you ever need the array, just call collect(A) or full(A).
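As a rough analogy in Python rather than Julia (an illustration only, not how Julia implements UnitRange): Python's built-in range is likewise a lazy object that stores only start, stop and step, computes elements on demand, and is materialized only when explicitly converted with list, much as collect does for a Julia range. The variable names below are made up for the sketch.

```python
import sys

# A lazy range over a billion integers: only start, stop and step are stored.
lazy = range(1, 10**9 + 1)

# Indexing and length are computed arithmetically, without materializing anything.
millionth = lazy[10**6 - 1]      # start + index * step
length = len(lazy)

# Materializing a (much smaller) range allocates memory proportional to its length.
small_lazy = range(1, 10**5 + 1)
small_list = list(small_lazy)    # loosely analogous to collect(1:100000) in Julia

print(millionth)                 # 1000000
print(length)                    # 1000000000
print(sys.getsizeof(small_lazy) < sys.getsizeof(small_list))  # True
```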
I realized that you posted something a little more specific. The reason here is that Array[1:2] calls the getindex function for an array. This getindex function has a special dispatch on a Range so that way it "acts like it's indexed by an array" (see the discussion from earlier). So that's "special-cased", but in actuality it just has dispatches to act like an array just like it does with every other function. [A] gives an array of typeof(A) no matter what A is, so there's no magic here.
How does one access an element of an array that is returned from a function? For example, shape() returns an array of integers. How does one compare an element of that array to an integer? The following does not compile:
integer :: a
integer, dimension(5) :: b
a = 5
if (a .eq. shape(b)) then
print *, 'equal'
end if
The error is:
if (a .eq. shape(b)) then
1
Error: IF clause at (1) requires a scalar LOGICAL expression
I understand that this is because shape(b) returns an array. However, accessing an element of the array does not appear to be possible like so: shape(b)(1)
Now if I add these two lines:
integer, dimension(1) :: c
c = shape(b)
...and change the if clause to this:
if (a .eq. c(1)) then
... then it works. But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?
Further to the answers that deal with SHAPE and logical expressions etc., the general answer to your question "How does one access an element of an array that is returned from a function?" is one of the following:
you assign the expression that has the function reference to an array variable, and then index that array variable; or
you use the expression that has the function reference as an actual argument to a procedure that takes a dummy array argument and does the indexing for you.
Consequently, the general answer to your last question "But do I really have to declare an extra array variable to hold the return value of shape(), or is there some other way to do it?" is "Yes, you do need to declare another array variable" and hence "No, there is no other way".
(Note that reasonable optimising compilers will avoid the need for any additional memory operations/allocations etc once they have the result of the array function, it's really just a syntax issue.)
The rationale for this particular aspect of language design is sometimes ascribed to a need to avoid syntax ambiguity and confusion for array function results that are of character type (they could potentially be indexed and/or substringed - how do you tell what was intended?). Others think it was done this way just to annoy C programmers.
Instead of using shape(array), I would use size(array).
Note that this will return an integer giving the total number of elements across ALL dimensions, unless you specify the DIM argument, in which case it will return only the number of elements along the specified dimension.
Take a look at the gfortran documentation:
http://gcc.gnu.org/onlinedocs/gfortran/SIZE.html
Also, look up lbound and ubound.
Note that the expression
a == shape(b)
returns a rank-1 array of logicals and the if statement requires that the condition reduce to a scalar logical expression. You could reduce the rank-1 array to a scalar like this:
if (all(a == shape(b)))
This is certainly not a general replacement for the syntactically-invalid application of array indexing to an array-returning function such as shape(b)(1).
It is possible even without the intermediate variable using ASSOCIATE:
real c(3,3)
associate (x=>shape(c))
print *,x(1),x(2)
end associate
end
So I've been wondering about this for a while now. Summing up over some array variable A is as easy as
sum(A(:))
% or
sum(...sum(sum(A,n),n-1)...,1) % where n is the number of dimensions of A
However once it gets to expressions the (:) doesn't work anymore, like
sum((A-2*A)(:))
is not valid MATLAB syntax; instead we need to write
foo = A-2*A;
sum(foo(:))
%or the one liner
sum(sum(...sum(A-2*A,n)...,2),1) % n is the number of dimensions of A
The one-liner above will only work if the number of dimensions of A is fixed, which, depending on what you are doing, may not necessarily be the case. The downside of the two-line version is that foo will be kept in memory until you run clear foo, and it may not even be possible, depending on the size of A and what else is in your workspace.
Is there a general way to circumvent this issue and sum up all elements of an array-valued expression in a single line / without creating temporary variables? Something like sum(A-2*A,'-all')?
Edit: It differs from How can I index a MATLAB array returned by a function without first assigning it to a local variable?, as it doesn't concern general (nor specific) indexing of array-valued expressions or return values, but rather the summation over each possible index.
While it is possible to solve my problem with the answer given in the link, gnovice says himself that using subsref is a rather ugly solution. Furthermore, Andras Deak posted a much cleaner way of doing this in the comments below.
While the answers to the linked duplicate can indeed be applied to your problem, the narrower scope of your question allows us to give a much simpler solution than the answers provided there.
You can sum all the elements in an expression (including the return value of a function) by reshaping your array first to 1d:
sum(reshape(A-2*A,1,[]))
%or even sum(reshape(magic(3),1,[]))
This will reshape your array-valued expression to size [1, N] where N is inferred from the size of the array, i.e. numel(A-2*A) (but the above syntax of reshape will compute the missing dimension for you, no need to evaluate your expression twice). Then a single call to sum will sum all the elements, as needed.
The actual case where you have to resort to something like this is when a function returns an array with an unknown number of dimensions, and you want to use its sum in an anonymous function (making temporary variables unavailable):
fun = @() rand(2*ones(1,randi(10))); %function returning random 2 x 2 x ... x 2 array with randi(10) dimensions
sumfun = @(A) sum(reshape(A,1,[]));
sumfun(fun()) %use it
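For comparison, a sketch of the same flatten-then-reduce idea in NumPy rather than MATLAB. The names sum_all and make_array are hypothetical and exist only for this illustration. Here reshape(..., -1) plays the role of reshape(..., 1, []): the library infers the length, so the expression never needs a temporary name.

```python
import numpy as np


def sum_all(expr):
    """Hypothetical helper: flatten an array-valued expression, then sum once."""
    return np.reshape(expr, -1).sum()


def make_array(ndim):
    """Hypothetical stand-in for a function returning a 2 x 2 x ... x 2 array."""
    return np.ones((2,) * ndim)


A = np.arange(24).reshape(2, 3, 4)

total = sum_all(A - 2 * A)             # the expression is summed without naming it
ones_total = sum_all(make_array(5))    # works for any number of dimensions

print(total)        # -276
print(ones_total)   # 32.0
```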
I have a numpy array. The best way I can describe it is an array of arrays. I have N arrays that are all the same size (L x M). What I need to do is obtain the value for each (L,M) combination and assemble these combinations into a list of N values.
Example:
I have 400 arrays that are 8 x 8. I need to obtain the value of (2,5) for all 400 arrays and put them in a list.
I have looked into numpy.dsplit() and numpy.array_split(), but either I'm not applying them correctly or they aren't what I'm needing.
Can anyone advise me? And, no, at this point, I don't have any code to show beyond obtaining the original array, and as that is research data, I'm not comfortable posting it here.
This is basic indexing.
If, for instance, myArray.shape is (400, 8, 8), you'd pull those values out with:
myArray[:, 2, 5]
(the colon means "everything in this dimension")
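A small sketch of that indexing, using made-up stand-in data (np.arange reshaped to 400 x 8 x 8) because the real array was not posted. The transpose at the end is one possible way to get a length-N vector for every (L, M) position at once; it is a suggestion, not something the answer above prescribes.

```python
import numpy as np

# Stand-in data (assumed): 400 arrays of shape 8 x 8 stacked along axis 0.
my_array = np.arange(400 * 8 * 8).reshape(400, 8, 8)

# One value per stacked array, all taken at row 2, column 5.
values = my_array[:, 2, 5]
values_list = values.tolist()

# Reordering the axes gives every (L, M) position its own length-N vector.
per_position = my_array.transpose(1, 2, 0)   # shape (8, 8, 400)

print(values.shape)                 # (400,)
print(values_list[:3])              # [21, 85, 149]
print(per_position[2, 5].shape)     # (400,)
```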
I'm testing an arbitrarily-large, arbitrarily-dimensioned array of logicals, and I'd like to find out if any one or more of them are true. any() only works on a single dimension at a time, as does sum(). I know that I could test the number of dimensions and repeat any() until I get a single answer, but I'd like a quicker, and frankly, more-elegant, approach.
Ideas?
I'm running 2009a (R17, in the old parlance, I think).
If your data is in a matrix A, try this:
anyAreTrue = any(A(:));
EDIT: To explain a bit more for anyone not familiar with the syntax, A(:) uses the colon operator to take the entire contents of the array A, no matter what the dimensions, and reshape them into a single column vector (of size numel(A)-by-1). Only one call to ANY is needed to operate on the resulting column vector.
As pointed out, the correct solution is to reshape the result into a vector. Then any will give the desired result. Thus,
any(A(:))
gives the global result, true if any of numel(A) elements were true. You could also have used
any(reshape(A,[],1))
which uses the reshape operator explicitly. If you don't wish to do the extra step of converting your matrices into vectors to apply any, then another approach is to write a function of your own. For example, here is a function that would do it for you:
======================
function result = myany(A)
% determines if any element at all in A was non-zero
result = any(A(:));
======================
Save this as an m-file on your search path. The beauty of MATLAB (true for any programming language) is it is fully extensible. If there is some capability that you wish it had, just write a little idiom that does it. If you do this often enough, you will have customized the environment to fit your needs.
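For comparison, a sketch of the same flatten-then-test idea in NumPy rather than MATLAB; the array shape and variable names are arbitrary choices for the example. reshape(-1) mirrors A(:), and np.any with no axis argument already reduces over every dimension.

```python
import numpy as np

# An all-False logical array with an arbitrary number of dimensions.
flags = np.zeros((3, 4, 2, 5), dtype=bool)

none_set = flags.reshape(-1).any()     # flatten, then a single any()

flags[1, 2, 0, 3] = True               # set exactly one element
one_set = flags.reshape(-1).any()

# With no axis argument, np.any reduces over all dimensions at once.
same_answer = np.any(flags)

print(none_set, one_set, same_answer)  # False True True
```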