How do you create an SArray/MArray with changed units?

I have a vector u and a number t with Unitful units, and I want du to have the units of typeof(oneunit(u)/oneunit(t)). I want to find a single generic line of code which constructs an SArray or MArray output which matches the input. There are a few cases which I have tried:
Obviously copy(u) doesn't match the units.
u/oneunit(t) and u./oneunit(t) both create SArrays, even when u is an MArray.
similar always creates a mutable type, so it always creates an MArray.
Do I need to directly use the constructor (which would be a pain because it would add an odd branch to an otherwise generic code, but it's fine if that's the answer)?
Example showing that a simple convert does not work with MArrays:
u = @MArray [1u"g",2u"g",3u"g"]
t = 1u"s"
DimensionError: g and 1.0 g s^-1 are not dimensionally compatible.
While similar is hopeless:
u = @SArray [1u"g",2u"g",3u"g"]
3-element MVector{3,Quantity{Int64, Dimensions:{𝐌}, Units:{g}}}:
72559480 g
581132080 g
29791 g

How about:
static_similar(s, v) =
( isimmutable(s) ? StaticArrays.mutable_similar_type :
Size(s), StaticArrays.length_val(s) )(v)
julia> u = @MArray [1u"g",2u"g",3u"g"];
julia> s = @SArray [1u"g",2u"g",3u"g"];
julia> static_similar(u,u./oneunit(t))
3-element SVector{3,Quantity{Float64, Dimensions:{𝐌 𝐓^-1}, Units:{g s^-1}}}:
1.0 g s^-1
2.0 g s^-1
3.0 g s^-1
julia> static_similar(s,s./oneunit(t))
3-element MVector{3,Quantity{Float64, Dimensions:{𝐌 𝐓^-1}, Units:{g s^-1}}}:
1.0 g s^-1
2.0 g s^-1
3.0 g s^-1
The relevant functions are defined in StaticArrays/src/abstractarray.jl. Especially, note comment:

Great question!
Basically, as a static arrays user I always use SArray and a functional programming approach to them. (When I need to manage memory I can use Array{SArray{...}} or whatever, and replace elements of the outer Array).
Probably not the answer you are looking for but I'd tend to chill out about the fact that operations return SArray and just learn to replace SArrays in their entirety. In most cases this is faster than fiddling with MArray because LLVM naturally invokes SIMD instructions for stack variables while the heap-allocated MArray operations do not.
Was it your expectation that operations like division would preserve the ability (or not) to mutate?
EDIT: yes, using the constructor or convert is totally a viable approach.
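One way to use the constructor without adding a branch is to let similar_type choose the container from the input. The following is only a minimal sketch, assuming StaticArrays.jl is installed: rescale is a hypothetical helper name, and plain numbers stand in for Unitful quantities so that the snippet needs no other package. Only the element type changes, so the same line should carry over to quantities with units.

```julia
using StaticArrays

# Hypothetical helper: divide by one unit of `t`, then rebuild the result in a
# container with the same size and mutability as `u`.
function rescale(u::StaticArray, t)
    v = u ./ oneunit(t)                    # the element type may change here
    T = similar_type(typeof(u), eltype(v)) # MArray type in -> MArray type out
    return T(v)
end

m = @MVector [1, 2, 3]
s = @SVector [1, 2, 3]

rescale(m, 2.0) isa MVector{3,Float64}   # mutable input, mutable output
rescale(s, 2.0) isa SVector{3,Float64}   # immutable input, immutable output
```

similar_type maps an MArray type to another MArray type and an SArray type to another SArray type, which is why no explicit isimmutable check is needed in this sketch.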


Custom searchsortedfirst method

I'm kinda new to Julia, so I'm still struggling with reading the Julia documentation. Here is a piece of it, and I am looking for an explanation, specifically of the bolded part.
Base.Sort.searchsortedfirst — Function.
searchsortedfirst(a, x, [by=,] [lt=,]
Returns the index of the first value in a greater than or equal to x,
according to the specified order. Returns length(a)+1 if x is greater
than all values in a. a is assumed to be sorted.
My array looks like this:
A = Vector{Record}()
type Record
Now here is my problem. I would like to call the above-mentioned method on my array and obtain the Record where a given x equals y in this Record (Record.y == x). I guess I have to write a 'by' transform or an 'lt' comparator? Or both?
Any help would be appreciated :)
@crstnbr has provided a perfectly good answer for the case of one-off uses of searchsortedfirst. I thought it worth adding that there is also a more permanent solution. If your type Record exhibits a natural ordering, then just extend Base.isless and Base.isequal to your new type. The following example code shows how this works for some new type you might define:
struct MyType ; x::Float64 ; end #Define some type of my own
yvec = MyType.(sort!(randn(10))) #Build a random vector of my type
yval = MyType(0.0) #Build a value of my type
searchsortedfirst(yvec, yval) #ERROR: this use of searchsortedfirst will throw a MethodError since julia doesn't know how to order MyType
Base.isless(y1::MyType, y2::MyType)::Bool = y1.x < y2.x #Extend (aka overload) isless so it is defined for the new type
Base.isequal(y1::MyType, y2::MyType)::Bool = y1.x == y2.x #Ditto for isequal
searchsortedfirst(yvec, yval) #Now this line works
Some points worth noting:
1) In the step where I overload isless and isequal, I preface the method definition with Base.. This is because the isless and isequal functions are originally defined in Base, where Base refers to the core julia package that is automatically loaded every time you start julia. By prefacing with Base., I ensure that my new methods are added to the current set of methods for these two functions, rather than replacing them. Note, I could also achieve this by omitting Base. but including a line beforehand of import Base: isless, isequal. Personally, I prefer the way I've done it above (for the overly pedantic, you can also do both).
2) I can define isless and isequal however I want. It is my type and my method extensions. So you can choose whatever you think is the natural ordering for your new type.
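3) The operators < and > fall back to isless for new types, while <= and >= combine it with ==, so all of these operators will now work with your new type, e.g. MyType(1.0) > MyType(2.0) returns false. Note that == is a separate function: by default isequal falls back to ==, not the other way around, so extend Base.:(==) as well if you want custom equality.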
4) Any julia function that uses the comparative operators above will now work with your new type, as long as the function is defined parametrically (which almost everything in Base is).
You can just define a custom less-than operation and give it to searchsortedfirst via lt keyword argument:
julia> type Record
julia> A = Vector{Record}()
0-element Array{Record,1}
julia> push!(A, Record(3,3.0))
1-element Array{Record,1}:
Record(3, 3.0)
julia> push!(A, Record(4,3.0))
2-element Array{Record,1}:
Record(3, 3.0)
Record(4, 3.0)
julia> push!(A, Record(5,3.0))
3-element Array{Record,1}:
Record(3, 3.0)
Record(4, 3.0)
Record(5, 3.0)
julia> searchsortedfirst(A, 4, lt=(r,x)->r.y<x)
Here, (r,x)->r.y<x is an anonymous function defining your custom less-than. It takes two arguments (the elements to be compared). The first will be the elements from A, the second is the fixed element to compare to.
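For current Julia versions (where type has become struct), the two keyword arguments can be compared side by side. This is only a sketch: the Record definition is truncated above, so the field names x and y and the sample values here are assumptions. The lt variant relies on searchsortedfirst passing the array element as the first argument, as described above.

```julia
# Hypothetical record type; the field names x and y are assumptions.
struct Record
    x::Int
    y::Float64
end

# The array must already be sorted by the field being searched (here: y).
A = [Record(3, 1.0), Record(4, 2.0), Record(5, 3.0)]

# Option 1: `lt` compares an element of A (left) with the raw target (right).
i = searchsortedfirst(A, 2.0; lt = (r, y) -> r.y < y)

# Option 2: `by` is applied to both sides, so wrap the target in a Record.
j = searchsortedfirst(A, Record(0, 2.0); by = r -> r.y)

# searchsortedfirst returns the first index with y >= target, so check for an
# exact hit (and for the length(A)+1 "not found" case) before using it.
found = (i <= length(A) && A[i].y == 2.0) ? A[i] : nothing
```

The by form is convenient when a dummy Record is cheap to build; the lt form avoids constructing one.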

Converting Array{Array{Float64,1},1} to Array{Float64,2} in Julia

My problem is similar to the problem described earlier,
with the difference that I don't input numbers manually. Thus the accepted answer there does not work for me.
I want to convert the vector of cartesian coordinates to polars:
function cart2pol(x0, x1)
    rho = sqrt(x0^2 + x1^2)
    phi = atan2(x1, x0)
    return [rho, phi]
end
@vectorize_2arg Number cart2pol
function cart2pol(x)
    x1 = view(x,:,1)
    x2 = view(x,:,2)
    return cart2pol(x1, x2)
end
x = rand(5,2)
cart2pol(x)
The last command does not collect Arrays for some reason, returning the output of type 5-element Array{Array{Float64,1},1}. Any idea how to cast it to Array{Float64,2}?
If you look at the definition of cat (which is the underlying function for hcat and vcat), you see that you can collect several arrays into one single array of dimension 2:
cat(2, [1,2], [3,4], [5,6])
2×3 Array{Int64,2}:
1 3 5
2 4 6
This is basically what you want. The problem is that you have all your output polar points in an array itself. cat expects you to provide them as several arguments. This is where ... comes in.
... is used to cause a single function argument to be split apart into many different arguments when used in the context of a function call.
Therefore, you can write
cat(2, [[1,2], [3,4], [5,6]]...)
2×3 Array{Int64,2}:
1 3 5
2 4 6
In your situation, it works exactly in the same way (I changed your x to have the points in columns):
cat(2, cart2pol.(view(x,1,:),view(x,2,:))...)
2×5 Array{Float64,2}:
0.587301 0.622 0.928159 0.579749 0.227605
1.30672 1.52956 0.352177 0.710973 0.909746
The function mapslices can also do this, essentially transforming the rows of the input:
julia> x = rand(5,2)
5×2 Array{Float64,2}:
0.458583 0.205246
0.285189 0.992547
0.947025 0.0853141
0.79599 0.67265
0.0273176 0.381066
julia> mapslices(row->cart2pol(row[1],row[2]), x, [2])
5×2 Array{Float64,2}:
0.502419 0.420827
1.03271 1.291
0.95086 0.0898439
1.04214 0.701612
0.382044 1.49923
The last argument specifies dimensions to operate over; e.g. passing [1] would transform columns.
As an aside, I would encourage one or two stylistic changes. First, it's good to map like to like, so if we stick with the row representation then cart2pol should accept a 2-element array (since that's what it returns). Then this call would just be mapslices(cart2pol, x, [2]). Or, if what we're really trying to represent is an array of coordinates, then the data could be an array of tuples [(x1,y1), (x2,y2), ...], and cart2pol could accept and return a tuple. In either case cart2pol would not need to be able to operate on arrays, and it's partly for this reason that we've deprecated the @vectorize_ macros.
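The calls in this thread use pre-1.0 syntax. In current Julia, atan2(y, x) is spelled atan(y, x), and cat and mapslices take the dimension as a dims keyword. A minimal sketch of both approaches in that syntax, with a small hand-picked x chosen here purely for illustration:

```julia
# Point-wise conversion: takes a 2-element collection, returns a 2-element vector.
cart2pol(p) = [hypot(p[1], p[2]), atan(p[2], p[1])]

x = [1.0 0.0; 0.0 2.0; -3.0 0.0]     # three points, one per row

# Row in, row out: the result is again a 3×2 matrix.
polar_rows = mapslices(cart2pol, x; dims = 2)

# Vector of vectors -> matrix with one point per column (2×3).
vecs = [cart2pol(row) for row in eachrow(x)]
polar_cols = reduce(hcat, vecs)      # same result as hcat(vecs...)
```

reduce(hcat, vecs) avoids splatting a long vector into thousands of arguments, which can matter for large inputs.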

Function handle applied to array in MATLAB

I've searched and there are many answers to this kind of question, suggesting functions like arrayfun, bsxfun, and so on. I haven't been able to resolve the issue due to dimension mismatches (and probably a fundamental misunderstanding as to how MATLAB treats anonymous function handles).
I have a generic function handle of more than one variable:
f = @(x,y) (some function of x, y)
Heuristically, I would like to define a new function handle like
g = @(x) sum(f(x,1:3))
More precisely, the following does exactly what I need, but is tedious to write out for larger arrays (say, 1:10 instead of 1:3):
g = @(x) f(x,1)+f(x,2)+f(x,3)
I tried something like
g = @(x) sum(arrayfun(@(y) f(x,y), 1:3))
but this does not work as soon as the size of x exceeds 1x1.
Thanks in advance.
Assuming you cannot change the definition of f to be more vector-friendly, you can use your last solution by specifying a non-uniform output and converting the output cell array to a matrix:
g = @(x) sum(cell2mat(arrayfun(@(y) f(x,y), 1:3,'UniformOutput',false)),2);
This should work well if f(x,y) outputs column vectors and you wish to sum them together. For row vectors, you can use
g = @(x) sum(cell2mat(arrayfun(@(y) f(x,y), 1:3,'UniformOutput',false).'));
If the arrays are higher in dimension, I actually think a function accumulator would be quicker and easier. For example, consider the (extremely simple) function:
function acc = accumfun(f,y)
    acc = f(y(1));
    for k = 2:numel(y)
        acc = acc + f(y(k));
    end
end
Then, with y holding the values to sum over (for example y = 1:3), you can make the one-liner
g = @(x) accumfun(@(y) f(x,y),y);

Haskell Constant Propagation on Data Structures?

I want to know how deeply Haskell evaluates data structures at compile time.
Consider the following list:
simpleTableMultsList :: [Int]
simpleTableMultsList = [n*m | n <- [1 ..9],m <- [1 ..9]]
This list gives a representation of the multiplication table for 1 through 9. Now, suppose we want to change it so that we represent the product of two one digit numbers as a pair of numbers (first digit, second digit). Then we may consider
simpleTableMultsList :: [(Int,Int)]
simpleTableMultsList = [(k `div` 10, k `rem` 10) | n <- [1 ..9],m <- [1 ..9],let k = n*m]
Now we can implement multiplication on one digit numbers as a table lookup. YAY!! However, we want to be more efficient than this! So we want to make this structure an unboxed array. Haskell gives a really great way to do this using
import qualified Data.Array.Unboxed as A
Then we can do:
simpleTableMults :: A.Array (Int,Int) (Int,Int)
simpleTableMults = A.listArray ((1,1),(9,9)) simpleTableMultsList
Now if I want a constant time multiplication of two one digit numbers n and m, I can do:
simpleTableMults ! (n,m)
This is great! Now suppose I compile this module we've been working on. Does the simpleTableMults get fully evaluated so that when I run the computation simpleTableMults ! (n,m), the program literally makes a lookup in memory ... or does it have to build the data structure in memory first. Since it is an unboxed array, my understanding is that the Array must be created at once and is completely strict in its elements -- so that all the elements of the array are fully evaluated.
So really my question is: when does this evaluation occur, and can I force it to occur at compile time?
------- Edit ---------------
I tried to dig further on this! I tried compiling and examining information about the core. It seems GHC is performing a lot of reductions on the code at compile time. I wish I knew more about core to be able to tell. If we compile with
ghc -O2 -ddump-simpl-stats Main.hs
We can see that 98 beta reductions are performed, an unpack-list operation is carried out, many things are unfolded, and a bunch of inlines are performed (around 150). It even tells you where the beta reductions occur, ... since the word IxArray is coming up, I am more curious whether some sort of simplification is occurring. Now the interesting thing from my point of view is that adding
simpleTableMults = D.deepseq t t
where t = A.listArray ((1,1),(9,9)) simpleTableMultsList
increases the number of beta reductions, inlines, and simplifications quite substantially at compile time. It would be really great if I could load the compiled program into a debugger of some sort and "view" the data structure! I am, as it stands, more mystified than before.
------ Edit 2 -------------
I still don't know what beta reductions are being performed. However, I did find out some interesting things based on sassa-nf's response. For the following experiment, I used the ghc-heap-view package. I changed the way Array was represented in the source according to the Sassa-NF answer. I loaded the program into GHCi, and immediately called
:printHeap simpleTableMults
And as expected got an index too large exception. But under the suggested unpacked datatype, I got a let expression with a toArray and a bunch of _thunks, and some _funs. Not really sure yet what these mean ... The other interesting thing is that by using seq, or some other strictness forcing in the source code, I ended up with all _thunks inside of the let. I can upload the exact emission if that helps.
Also, if I perform a single indexing, the array gets completely evaluated in all cases.
Also, there is no way to call ghci with optimizations, so I might not be getting the same results as when compiled with GHC -O2.
Let's exaggerate:
import qualified Data.Array.Unboxed as A
simpleTableMults :: A.Array (Int,Int) (Int,Int)
simpleTableMults = A.listArray ((1,1),(10000,2000))
[(k `div` 10, k `rem` 10) | n <- [1 ..10000],m <- [1 ..2000],let k = n*m]
main = print $ simpleTableMults A.! (10000,1000)
ghc -O2 -prof b.hs
b +RTS -hy
......Out of memory
hp2ps b.exe.hp
What happened?! You can see the heap consumption graph go above 1GB, and then it died.
Well, the pair is computed eagerly, but the projections of the pair are lazy, so we end up with tons of thunks to compute k `div` 10 and k `rem` 10.
import qualified Data.Array.Unboxed as A
data P = P {-# UNPACK #-} !Int {-# UNPACK #-} !Int deriving (Show)
simpleTableMults :: A.Array (Int,Int) P
simpleTableMults = A.listArray ((1,1),(10000,2000))
[P (k `div` 10) (k `rem` 10) |
n <- [1 ..10000],m <- [1 ..2000],let k = n*m]
main = print $ simpleTableMults A.! (10000,1000)
This one is fine, because we eagerly computed the pair.

Parallel mapM on Repa arrays

In my recent work with Gibbs sampling, I've been making great use of the RVar which, in my view, provides a near ideal interface to random number generation. Sadly, I've been unable to make use of Repa due to the inability to use monadic actions in maps.
While clearly monadic maps can't be parallelized in general, it seems to me that RVar may be at least one example of a monad where effects can be safely parallelized (at least in principle; I'm not terribly familiar with the inner workings of RVar). Namely, I want to write something like the following,
drawClass :: Sample -> RVar Class
drawClass = ...
drawClasses :: Array U DIM1 Sample -> RVar (Array U DIM1 Class)
drawClasses samples = A.mapM drawClass samples
where A.mapM would look something like,
mapM :: ParallelMonad m => (a -> m b) -> Array r sh a -> m (Array r sh b)
While clearly how this would work depends crucially on the implementation of RVar and its underlying RandomSource, in principle one would think that this would involve drawing a new random seed for each thread spawned and proceeding as usual.
Intuitively, it seems that this same idea might generalize to some other monads.
So, my question is: Could one construct a class ParallelMonad of monads for which effects can be safely parallelized (presumably inhabited by, at the least, RVar)?
What might it look like? What other monads might inhabit this class? Have others considered the possibility of how this might work in Repa?
Finally, if this notion of parallel monadic actions can't be generalized, does anyone see any nice way to make this work in the specific case of RVar (where it would be very useful)? Giving up RVar for parallelism is a very difficult trade-off.
It's been 7 years since this question was asked, and it still seems like no one has come up with a good solution to this problem. Repa doesn't have a mapM/traverse-like function, not even one that could run without parallelization. Moreover, considering the amount of progress over the last few years, it seems unlikely that it will happen.
Because of the stale state of many array libraries in Haskell and my overall dissatisfaction with their feature sets, I've put a couple of years of work into an array library, massiv, which borrows some concepts from Repa but takes them to a completely different level. Enough with the intro.
Prior to today, there were three monadic map-like functions in massiv (not counting the synonym-like functions: imapM, forM et al.):
mapM - the usual mapping in an arbitrary Monad. Not parallelizable for obvious reasons and is also a bit slow (along the lines of usual mapM over a list slow)
traversePrim - here we are restricted to PrimMonad, which is significantly faster than mapM, but the reason for this is not important for this discussion.
mapIO - this one, as name suggests, is restricted to IO (or rather MonadUnliftIO, but that is irrelevant). Because we are in IO we can automatically split array in as many chunks as there are cores and use separate worker threads to map the IO action over each element in those chunks. Unlike pure fmap, which is also parallelizable, we have to be in IO here because of non-determinism of scheduling combined with side effects of our mapping action.
So, once I read this question, I thought to myself that the problem is practically solved in massiv, but not so fast. Random number generators, such as those in mwc-random and others in random-fu, can't use the same generator across many threads. This means that the only piece of the puzzle I was missing was: "drawing a new random seed for each thread spawned and proceeding as usual". In other words, I needed two things:
A function that would initialize as many generators as there are going to be worker threads
and an abstraction that would seamlessly give the correct generator to the mapping function depending on which thread the action is running in.
So that is exactly what I did.
First I will give examples using the specially crafted randomArrayWS and initWorkerStates functions, as they are more relevant to the question and later move to the more general monadic map. Here are their type signatures:
randomArrayWS ::
(Mutable r ix e, MonadUnliftIO m, PrimMonad m)
=> WorkerStates g -- ^ Use `initWorkerStates` to initialize your per thread generators
-> Sz ix -- ^ Resulting size of the array
-> (g -> m e) -- ^ Generate the value using the per thread generator.
-> m (Array r ix e)
initWorkerStates :: MonadIO m => Comp -> (WorkerId -> m s) -> m (WorkerStates s)
For those who are not familiar with massiv, the Comp argument is a computation strategy to use, notable constructors are:
Seq - run computation sequentially, without forking any threads
Par - spin up as many threads as there are capabilities and use those to do the work.
I'll use mwc-random package as an example initially and later move to RVarT:
λ> import Data.Massiv.Array
λ> import System.Random.MWC (createSystemRandom, uniformR)
λ> import System.Random.MWC.Distributions (standard)
λ> gens <- initWorkerStates Par (\_ -> createSystemRandom)
Above we initialized a separate generator per thread using system randomness, but we could have just as well used a unique per thread seed by deriving it from the WorkerId argument, which is a mere Int index of the worker. And now we can use those generators to create an array with random values:
λ> randomArrayWS gens (Sz2 2 3) standard :: IO (Array P Ix2 Double)
Array P Par (Sz (2 :. 3))
[ [ -0.9066144845415213, 0.5264323240310042, -1.320943607597422 ]
, [ -0.6837929005619592, -0.3041255565826211, 6.53353089112833e-2 ]
]
By using the Par strategy, the scheduler library will split the work of generation evenly among the available workers, and each worker will use its own generator, thus making it thread safe. Nothing prevents us from reusing the same WorkerStates an arbitrary number of times as long as it is not done concurrently, which otherwise would result in an exception:
λ> randomArrayWS gens (Sz1 10) (uniformR (0, 9)) :: IO (Array P Ix1 Int)
Array P Par (Sz1 10)
[ 3, 6, 1, 2, 1, 7, 6, 0, 8, 8 ]
Now, putting mwc-random to the side, we can reuse the same concept for other possible use cases by using functions like generateArrayWS:
generateArrayWS ::
(Mutable r ix e, MonadUnliftIO m, PrimMonad m)
=> WorkerStates s
-> Sz ix -- ^ size of new array
-> (ix -> s -> m e) -- ^ element generating action
-> m (Array r ix e)
and mapWS:
mapWS ::
(Source r' ix a, Mutable r ix b, MonadUnliftIO m, PrimMonad m)
=> WorkerStates s
-> (a -> s -> m b) -- ^ Mapping action
-> Array r' ix a -- ^ Source array
-> m (Array r ix b)
Here is the promised example on how to use this functionality with rvar, random-fu and mersenne-random-pure64 libraries. We could have used randomArrayWS here as well, but for the sake of example let's say we already have an array with different RVarTs, in which case we need a mapWS:
λ> import Data.Massiv.Array
λ> import Control.Scheduler (WorkerId(..), initWorkerStates)
λ> import Data.IORef
λ> import System.Random.Mersenne.Pure64 as MT
λ> import Data.RVar as RVar
λ> import Data.Random as Fu
λ> rvarArray = makeArrayR D Par (Sz2 3 9) (\ (i :. j) -> Fu.uniformT i j)
λ> mtState <- initWorkerStates Par (newIORef . MT.pureMT . fromIntegral . getWorkerId)
λ> mapWS mtState RVar.runRVarT rvarArray :: IO (Array P Ix2 Int)
Array P Par (Sz (3 :. 9))
[ [ 0, 1, 2, 2, 2, 4, 5, 0, 3 ]
, [ 1, 1, 1, 2, 3, 2, 6, 6, 2 ]
, [ 0, 1, 2, 3, 4, 4, 6, 7, 7 ]
]
It is important to note, that despite the fact that pure implementation of Mersenne Twister is being used in the above example, we cannot escape the IO. This is because of the non-deterministic scheduling, which means that we never know which one of the workers will be handling which chunk of the array and consequently which generator will be used for which part of the array. On the up side, if the generator is pure and splittable, such as splitmix, then we can use the pure, deterministic and parallelizable generation function: randomArray, but that is already a separate story.
It's probably not a good idea to do this due to the inherently sequential nature of PRNGs. Instead, you might want to transition your code as follows:
Declare an IO function (main, or what have you).
Read as many random numbers as you need.
Pass the (now pure) numbers onto your repa functions.
