Check all the values in a Julia array? - arrays

How can I check all the values in a Julia array at once? Let's say I have an array like a=[3,4,6,10,55,31,9,10] How can I check if the array has any values greater than 10? Or how can I check if there are repeating values (like the 10 that is contained twice in the sample? I know I can write loops to check this, but I assume Julia has a faster way to check all the values at once.

The functions any and count do this:
julia> a = [3,4,6,10,55,31,9,10]
8-element Array{Int64,1}:
3
4
6
10
55
31
9
10
julia> any(x->x==3, a)
true
julia> count(x->x==10, a)
2
However the performance will probably be about the same as a loop, since loops in julia are fast (and these functions are themselves implemented in julia in the standard library).
If the problem has more structure you can get big speedups. For example if the vector is sorted you can use searchsorted to find matching values with binary search.

Not sure if this had been implemented at the time of previous answers, but the most concise way now would be:
all(a .> 10)
As Chris Rackauckas mentioned, a .> 10 returns an array of booleans, and then all simply checks that all values are true. Equivalent of Python's any and all.

You can also use broadcasted operations. In some cases it's nicer syntax than any and count, in other cases it can be less obvious what it's doing:
boola = a.>10 # Returns an Array{Bool}, true at any value >10
minimum(boola) # Returns false if any are <10
sum(a-10 .== 0) # Finds all values equal to 10, sums to get a count

Related

Repeat array rows specified number of times

New to julia, so this is probably very easy.
I have an n-by-m array and a vector of length n and want to repeat each row of the array the number of times in the corresponding element of the vector. For example:
mat = rand(3,6)
v = vec([2 3 1])
The result should be a 6-by-6 array. I tried the repeat function but
repeat(mat, inner = v)
yields a 6×18×1 Array{Float64,3}: array instead so it takes v to be the dimensions along which to repeat the elements. In matlab I would use repelem(mat, v, 1) and I hope julia offers something similar. My actual matrix is a lot bigger and I will have to call the function many times, so this operation needs to be as fast as possible.
It has been discussed to add a similar thing to Julia Base, but currently it is not implemented yet AFAIK. You can achieve what you want using the inverse_rle function from StatsBase.jl:
julia> row_idx = inverse_rle(axes(v, 1), v)
6-element Array{Int64,1}:
1
1
2
2
2
3
and now you can write:
mat[row_idx, :]
or
#view mat[row_idx, :]
(the second option creates a view which might be relevant in your use case if you say that your mat is large and you need to do such indexing many times - which option is faster will depend on your exact use case).

Updating a static array, in a nested function without making a temporary array? <Julia>

I have been banging my head against a wall trying to use static arrays in julia.
https://github.com/JuliaArrays/StaticArrays.jl
They are fast but updating them is a pain. This is no surprise, they are meant to be immutable!
But it is continuously recommended to me that I use static arrays even though I have to update them. In my case, the static arrays are small, just length 3, and i have a vector of them, but I only update 1 length three SVector at a time.
Option 1
There is a really neat package called Setfield that allows you to do inplace updates of SVectors in Julia.
https://github.com/jw3126/Setfield.jl
The catch... it updates the local copy. So if you are in a nested function, it updates the local copy. So it comes with some book keeping since you have to inplace update the local copy and then return that copy and update the actual array of interest. You can't pass in your desired array and update it in place, at least, not that I can figure out! Now, I do not mind bookeeping, but I feel like updating a local copy, then returning the value, updating another local copy, and then returning the values and finally updating the actual array must come with a speed penalty. I could be wrong.
Option 2
It bugs me that in order to do an update a static array I must
exampleSVector::SVector{3,Float64} <-- just to make clear its type and size
exampleSVector = [value1, value2, value3]
This will update the desired array even if it is inside a function, which is nice and the goal, but if you do this inside a function it creates a temporary array. And this kills me because my function is in a loop that gets called 4+ million times, so this creates a ton of allocations and slows things down.
How do I update an SVector for the Option 2 scenario without creating a temporary array?
For the Option 1 scenario, can I update the actual array of interest rather than the local copy?
If this requires a simple example code, please say so in the comments, and I will make one. My thinking is that it is answerable without one, but I will make one if it is needed.
EDIT:
MCVE code - Option 1 works, option 2 does not.
using Setfield
using StaticArrays
struct Keep
dreaming::Vector{SVector{3,Float64}}
end
function INNER!(vec::SVector{3,Float64},pre::SVector{3,Float64})
# pretend series of calculations
for i = 1:3 # illustrate use of Setfield (used in real code for this)
pre = #set pre[i] = rand() * i * 1000
end
# more pretend calculations
x = 25.0 # assume more calculations equals x
################## OPTION 1 ########################
vec = #set vec = x * [ pre[1], pre[2], pre[3] ] # UNCOMMENT FOR FOR OPTION 1
return vec # UNCOMMENT FOR FOR OPTION 1
################## OPTION 2 ########################
#vec = x * [ pre[1], pre[2], pre[3] ] # UNCOMMENT FOR FOR OPTION 2
#nothing # UNCOMMENT FOR FOR OPTION 2
end
function OUTER!(always::Keep)
preAllocate = SVector{3}(0.0,0.0,0.0)
for i=1:length(always.dreaming)
always.dreaming[i] = INNER!(always.dreaming[i], preAllocate) # UNCOMMENT FOR FOR OPTION 1
#INNER!(always.dreaming[i], preAllocate) # UNCOMMENT FOR FOR OPTION 2
end
end
code = Keep([zero(SVector{3}) for i=1:5])
OUTER!(code)
println(code.dreaming)
I hope that I have understood your question correctly. It's a bit hard with a MWE like this, that does a lot of things that are mostly redundant and a bit confusing.
There seems to be two alternative interpretations here: Either you really need to update ('mutate') an SVector, but your MWE fails to demonstrate why. Or, you have convinced yourself that you need to mutate, but you actually don't.
I have decided to focus on alternative 2: You don't really need to 'mutate'. Rewriting your code from that point of view simplifies it greatly.
I couldn't find any reason for you to mutate any static vectors here, so I just removed that. The behaviour of the INNER! function with the inputs was very confusing. You provide two inputs but don't use either of them, so I removed those inputs.
function inner()
pre = #SVector [rand() * 1000i for i in 1:3]
x = 25
return pre .* x
end
function outer!(always::Keep)
always.dreaming .= inner.() # notice the dot in inner.()
end
code = Keep([zero(SVector{3}) for i in 1:5])
outer!(code)
display(code.dreaming)
This runs fast and with zero allocations. In general with StaticArrays, don't try to mutate things, just create new instances.
Even though it's not clear from your MWE, there may be some legitimate reason why you may want to 'mutate' an SVector. In that case you can use the setindex method of StaticArrays, you don't need Setfield.jl:
julia> v = rand(SVector{3})
3-element SArray{Tuple{3},Float64,1,3}:
0.4730258499237898
0.23658547518737905
0.9140206579322541
julia> v = setindex(v, -3.1, 2)
3-element SArray{Tuple{3},Float64,1,3}:
0.4730258499237898
-3.1
0.9140206579322541
To clarify: setindex (without a !) does not mutate its input, but creates a new instance with one index value changed.
If you really do need to 'mutate', perhaps you can make a new MWE that shows this. I would recommend that you try to simplify it a bit, because it is quite confusing now. For example, the inclusion of the type Keep seems entirely unnecessary and distracting. Just make a Vector of SVectors and show what you want to do with that.
Edit: Here's an attempt based on the comments below. As far as I understand it now, the question is about modifying a vector of SVectors. You cannot really mutate the SVectors, but you can replace them using a convenient syntax, setindex, where you can keep some of the elements and change some of the others:
oldvec = [zero(SVector{3}) for _ in 1:5]
replacevec = [rand(SVector{3}) for _ in 1:5]
Now we replace the second element of each element of oldvec with the corresponding one in replacevec. First a one-liner:
oldvec .= setindex.(oldvec, getindex.(replacevec, 2), 2)
Then an even faster one with a loop:
for i in eachindex(oldvec, replacevec)
#inbounds oldvec[i] = setindex(oldvec[i], replacevec[i][2], 2)
end
There are two types of static arrays - mutable (starting with M in type name) and immutable ones (starting with S) - just use the mutable ones! have a look at the example below:
julia> mut = MVector{3,Int64}(1:3);
julia> mut[1]=55
55
julia> mut
3-element MArray{Tuple{3},Int64,1,3}:
55
2
3
julia> immut = SVector{3,Int64}(1:3);
julia> inmut[1]=55
ERROR: setindex!(::SArray{Tuple{3},Int64,1,3}, value, ::Int) is not defined.
Let us see some simple benchmark (ordinary array, vs mutable static vs immutable static):
using BenchmarkTools
julia> ord = [1,2,3];
julia> #btime $ord.*$ord;
39.680 ns (1 allocation: 112 bytes)
3-element Array{Int64,1}:
1
4
9
julia> #btime $mut.*$mut
8.533 ns (1 allocation: 32 bytes)
3-element MArray{Tuple{3},Int64,1,3}:
3025
4
9
julia> #btime $immut.*$immut
2.133 ns (0 allocations: 0 bytes)
3-element SArray{Tuple{3},Int64,1,3}:
1
4
9

arrayfun to bsxfun possible

I know that bsxfun(which works fast!) and arrayfun(as far as I could understand, uses loops internally which is expected to be slow) are intended for different uses, at least, at the most basic level.
Having said this, I am trying
to sum up all the numbers in a given array, say y, before a certain index
add the number at the specific location(which is the number at the above index location) to the above sum.
I could perform this with the below piece of example code easily:
% index array
x = [ 1:6 ]; % value array
y = [ 3 3 4 4 1 1 ];
% arrayfun version
o2 = arrayfun(#(a) ...
sum(y(1:(a-1)))+...
y(a), ...
x)
But it seems to be slow on large inputs.
I was wondering what would be a good way to convert this to a version that works with bsxfun, if possible.
P.S. the numbers in y do not repeat as given above, this was just an example, it could also be [3 4 3 1 4 ...]
Is x always of the form 1 : n? Assuming the answer is yes, then you can get the same result with the much faster code:
o2 = cumsum(y);
Side note: you don't need the brackets in the definition of x.
if you have a supported GPU device, you can define your variables as gpuArray type since arrayfun, bsxfun and pagefun are compatible with GPUs.
GPU computing is supposed to be faster for large data.

Problems defining an array in SAS

I'm trying to define an array using the following code:
data all_dates;
array arr[*] _temporary_ (1 2 3 4 5);
run;
It gives me this message:
ERROR: The non-variable based array arr has been defined with zero elements.
Looking at the documentation examples I can't see why this wouldn't work. Am I doing something wrong or is this just not allowed? If it's not allowed what is an alternative equivalent method?
Ah nevermind - the documentation actually does state:
You cannot use the asterisk with _TEMPORARY_ arrays or when you define a multidimensional array.
I guess I'll have to count the number of elements I'll need beforehand and then use:
data all_dates;
array arr[*] a1-a5 (1 2 3 4 5);
drop a1-a5;
run;
The general solution with temporary arrays is to over-define the temporary array. Since it's not in the PDV, and not written out (obviously), you can simply define it for 50 or 100 or whatever is a safe number above your maximum - it won't cost almost any performance to do so.
Alternately, if the 'safe number' is too large for your desired memory allocation (say, tens of thousands), use a macro variable and precount the number. Given that you wrote out
(1 2 3 4 5)
You should be able to count the number of needed variables when you create this list.

Finding whether a value is equal to the value of any array element in MATLAB

Can anyone tell me if there is a way (in MATLAB) to check whether a certain value is equal to any of the values stored within another array?
The way I intend to use it is to check whether an element index in one matrix is equal to the values stored in another array (where the stored values are the indices of the elements which meet a certain criteria).
So, if the indices of the elements which meet the criteria are stored in the matrix below:
criteriacheck = [3 5 6 8 20];
Going through the main array (called array) and checking if the index matches:
for i = 1:numel(array)
if i == 'Any value stored in criteriacheck'
%# "Do this"
end
end
Does anyone have an idea of how I might go about this?
The excellent answer previously given by #woodchips applies here as well:
Many ways to do this. ismember is the first that comes to mind, since it is a set membership action you wish to take. Thus
X = primes(20);
ismember([15 17],X)
ans =
0 1
Since 15 is not prime, but 17 is, ismember has done its job well here.
Of course, find (or any) will also work. But these are not vectorized in the sense that ismember was. We can test to see if 15 is in the set represented by X, but to test both of those numbers will take a loop, or successive tests.
~isempty(find(X == 15))
~isempty(find(X == 17))
or,
any(X == 15)
any(X == 17)
Finally, I would point out that tests for exact values are dangerous if the numbers may be true floats. Tests against integer values as I have shown are easy. But tests against floating point numbers should usually employ a tolerance.
tol = 10*eps;
any(abs(X - 3.1415926535897932384) <= tol)
you could use the find command
if (~isempty(find(criteriacheck == i)))
% do something
end
Note: Although this answer doesn't address the question in the title, it does address a more fundamental issue with how you are designing your for loop (the solution of which negates having to do what you are asking in the title). ;)
Based on the for loop you've written, your array criteriacheck appears to be a set of indices into array, and for each of these indexed elements you want to do some computation. If this is so, here's an alternative way for you to design your for loop:
for i = criteriacheck
%# Do something with array(i)
end
This will loop over all the values in criteriacheck, setting i to each subsequent value (i.e. 3, 5, 6, 8, and 20 in your example). This is more compact and efficient than looping over each element of array and checking if the index is in criteriacheck.
NOTE: As Jonas points out, you want to make sure criteriacheck is a row vector for the for loop to function properly. You can form any matrix into a row vector by following it with the (:)' syntax, which reshapes it into a column vector and then transposes it into a row vector:
for i = criteriacheck(:)'
...
The original question "Can anyone tell me if there is a way (in MATLAB) to check whether a certain value is equal to any of the values stored within another array?" can be solved without any loop.
Just use the setdiff function.
I think the INTERSECT function is what you are looking for.
C = intersect(A,B) returns the values common to both A and B. The
values of C are in sorted order.
http://www.mathworks.de/de/help/matlab/ref/intersect.html
The question if i == 'Any value stored in criteriacheck can also be answered this way if you consider i a trivial matrix. However, you are proably better off with any(i==criteriacheck)

Resources