Yarr slice usage - arrays

I've been exploring the Data.Yarr array library as a possible replacement for some code I have in Repa. It seems fully featured, and the benchmarks, if correct, suggest that a performance improvement may be had.
I'm interested in the correct use of the slices function.
Say I had a 2D ForeignPtr backed matrix of Complex Floats, in row-major format
matrix2D :: UArray F L DIM2 (Complex Float)
How would I go about extracting a vector of slices of columns, and / or rows?
A motivating example? Let's say I wish to permute the columns, multiply each element-wise with another set of slices, then perform a 1D FFT on each slice.
This seems a very common thing to want to do (in my world of signal processing). What is the idiomatic way of doing this?
Edited: to reduce scope of question.

I maintain yarr but sadly I only have intermittent access to the Internet for the next few weeks. I did write this comparison of yarr and repa some time ago: https://idontgetoutmuch.wordpress.com/2013/08/06/planetary-simulation-with-excursions-in-symplectic-manifolds-6/. I am surprised that you can't do slices with yarr without type coercion. I will try and take a look over the next few days.


StaticArray efficient max size in Julia?

What is the maximum efficient size of a StaticArray?
I mean, is there a size beyond which a StaticArray is less efficient than an ordinary Array?
And one more similar question.
Should I use a StaticArray every time my array is not supposed to change its size? Or are there any performance caveats?
Thx
As I found later, the documentation says:
A very rough rule of thumb is that you should consider using a normal
Array for arrays larger than 100 elements. For example, the
performance crossover point for a matrix multiply microbenchmark seems
to be about 11x11 in julia 0.5 with default optimizations.
As chris-rackauckas mentioned, there are some issues with this "rule" in later Julia versions. See https://github.com/JuliaArrays/StaticArrays.jl/issues/506

Why [1:2] != Array[1:2]

I am learning Julia following the Wikibook, but I don't understand why the following two commands give different results:
julia> [1:2]
1-element Array{UnitRange{Int64},1}:
1:2
julia> Array[1:2]
1-element Array{Array,1}:
[1,2]
Apologies if there is an explanation I haven't seen in the Wikibook, I have looked briefly but didn't find one.
Type[a] runs convert on the elements, and there is a simple conversion from a Range to an Array (collect). So Array[1:2] converts 1:2 to an array, and then makes an array of objects like that. This is the same reason why Float64[1;2;3] is an array of Float64.
These previous parts answered the wrong thing. Oops...
a:b is not an array, it's a UnitRange. Why would you create an array for A = a:b? It only takes two numbers to store it, and you can calculate A[i] basically for free for any i. Using an array would take an amount of memory proportional to b-a, and thus for larger ranges would take a lot of time to allocate, whereas allocation for a UnitRange is essentially free.
These kinds of types in Julia are known as lazy iterators. LinSpace is another. Another interesting set of types is the special matrix types: why use more than an array to store a Diagonal? The UniformScaling operator acts as the identity matrix while storing only one value (its scale) to make A-kI efficient.
Since Julia has a robust type system, there is no reason to make all of these things arrays. Instead, you can make them specialized types that act (*, +, etc.) and index like arrays but aren't actually arrays. This makes them take less memory and be faster. If you ever need the array, just call collect(A) or full(A).
I realized that you posted something a little more specific. The reason here is that Array[1:2] calls the getindex function for an array. This getindex function has a special dispatch on a Range so that way it "acts like it's indexed by an array" (see the discussion from earlier). So that's "special-cased", but in actuality it just has dispatches to act like an array just like it does with every other function. [A] gives an array of typeof(A) no matter what A is, so there's no magic here.

What are the advantages and disadvantages of 3d array in Mathematica

Edited...
Thanks to everyone for trying to help me!
I am trying to do a finite element analysis in Mathematica. I can obtain all the local stiffness matrices, each of dimension 8x8. There are 2000 of them; they are similar but not identical. Each local stiffness matrix is represented like a function named KK. For example, KK[1] is the local stiffness matrix of the first element.
I am trying to assemble all the local matrices into the global stiffness matrix. To make it easy:
Do[K[e][i][j]=KK[[e]][[i]][[j]],{e,2000},{i,8},{j,8}]....edited
Here is my question: can this assignment affect the analysis time? If so, what can I do to improve it?
In matlab this is called a 3D array, but I don't know what it is called in Mathematica.
What are the advantages and disadvantages of this kind of representation in Mathematica? Is it faster, or is it just the easier way?
Thanks for your help...
It is difficult to understand what your question is, so you might want to reformulate it.
As others have mentioned, there is no advantage to be expected from a switch from a 3D array to DownValues or SubValues. In fact you will then move from accessing data-structures to pattern matching, which is powerful and the real strength of Mathematica but not very efficient for what you plan to do, so I would strongly suggest to stay in the realm of ordinary arrays.
There is another thing that might not be clear to someone more familiar with matlab than with Mathematica: in Mathematica the "default" arrays behave a lot like cell arrays in matlab. Each entry can contain arbitrary content and they don't need to be rectangular (as High Performance Mark has mentioned, they are just expressions with the head List). But if such a nested list is a rectangular array and every element is of the same type, it can be converted to a so-called PackedArray. PackedArrays are much more memory efficient and will also speed up many calculations; they behave in many respects like regular ("non-cell") arrays in matlab. This conversion is often done implicitly by functions like Table, which will often return a packed array automatically. But if you are interested in efficiency, it is a good idea to check with Developer`PackedArrayQ and convert explicitly with Developer`ToPackedArray if necessary. If you are working with PackedArrays, the speed and memory efficiency of many operations are much better and usually comparable to vectorized operations on normal matlab arrays. Unfortunately, packed arrays can get "unpacked" by some operations, so if calculations become slow it is usually a good idea to check whether that has happened.
Neither "normal" arrays nor PackedArrays are restricted in the rank (given by ArrayDepth in Mathematica) they can have, so you can of course create and use "3D arrays" just as you can in matlab. I have never experienced, nor do I know of, any efficiency penalties when doing so.
It is probably of interest that newer versions of Mathematica (>= 10) bring the finite element method as one of the solver methods for NDSolve, so if you are not doing this as an exercise you might want to have a look at what is available already; there is quite extensive documentation about it.
A final remark is that you can instead of kk[[e]][[i]][[j]] use the much more readable form kk[[e,i,j]] which is also easier and less error prone to type...
extended comment i guess, but
KK[e][[i]][[j]]
is not the (e,i,j) element of a "3d array". Note the single brackets on the e. When you use the single brackets you are not denoting an array or list element but a DownValue, which is quite different from a list element.
If you do for example,
f[1]=0
f[2]=2
...
the resulting f appears similar to an array, but is actually more akin to an overloaded function in some other language. It is convenient because the indices need not be contiguous or even integers, but there is a significant performance drawback if you ever want to operate on the structure as a list.
Your Do loop example would almost certainly be better written as:
kk = Table[ k[e][i][j] ,{e,2000},{i,8},{j,8} ]
(Your loop won't even work as-is unless you previously "initialized" each of the kk[e] as an 8x8 array.)
Note now the list elements are all double bracketed, ie kk[[e]][[i]][[j]] or kk[[e,i,j]]

Array ordering in Julia

Is there a way to work with C-ordered or non-contiguous arrays natively in Julia?
For example, when using NumPy, C-ordered arrays are the default, but I can initialize a Fortran ordered array and do computations with that as well.
One easy way to do this was to take the Transpose of a matrix.
I can also work with non-contiguous arrays that are made via slicing.
I have looked through the documentation, etc. and can't find a way to make, declare, or work with a C-ordered array in Julia.
The transpose appears to return a copy.
Does Julia allow a user to work with C-ordered and non-contiguous arrays?
Is there currently any way to get a transpose or a slice without taking a copy?
Edit: I have found how to do slicing.
Currently it is available as a different type called a SubArray.
As an example, I could do the following to get the first row of a 100x100 array A
sub(A, 1, 1:100)
It looks like there are plans to improve this, as can be seen in https://github.com/JuliaLang/julia/issues/5513
This still leaves open the question of C-ordered arrays.
Is there an interface for C-ordered arrays?
Is there a way to do a transpose via a view instead of a copy?
Naturally, there's nothing that prevents you from working with row-major arrays as a chunk of memory, and certain packages (like Images.jl) support arbitrary ordering of arbitrary-dimensional arrays.
Presumably the main issue you're wondering about is linear algebra. Currently I don't know of anything out-of-the-box, but note that matrix multiplication in Julia is implemented through a series of functions with names like A_mul_B, At_mul_B, Ac_mul_Bc, etc, where t means transpose and c means conjugate. The parser replaces expressions like A'*b with Ac_mul_B(A, b) without actually taking the transpose.
Consequently, you could implement a RowMajorMatrix <: AbstractArray type yourself, and set up special multiplication rules:
A_mul_B(A::RowMajorMatrix, B::RowMajorMatrix) = At_mul_Bt(A, B)
A_mul_B(A::RowMajorMatrix, B::AbstractArray) = At_mul_B(A, B)
A_mul_B(A::AbstractArray, B::RowMajorMatrix) = A_mul_Bt(A, B)
etc. In addition to these two-argument versions, there are 3-argument versions (like A_mul_B!) that store the result in a pre-allocated output; you'd need to implement those, too. Finally, you'd also have to set up appropriate show methods (to display them appropriately), size methods, etc.
Finally, Julia's transpose function has been implemented in a cache-friendly manner, so it's quite a lot faster than the naive
for j = 1:n, i = 1:m
    At[j,i] = A[i,j]
end
Consequently there are occasions where it's not worth worrying about creating custom implementations of algorithms, and you can just call transpose.
If you implement something like this, I'd encourage you to contribute it as a package, as it's likely that others may be interested.

How can I efficiently copy 2-dimensional arrays of bytes into a larger 2D array?

I have a structure called Patch that represents a 2D array of data.
type Size = (Int, Int)
data Patch = Patch Size Strict.ByteString
I want to construct a larger Patch from a set of smaller Patches and their assigned positions. (The Patches do not overlap.) The function looks like this:
type Position = (Int, Int)
combinePatches :: [(Position, Patch)] -> Patch
combinePatches plan = undefined
I see two sub-problems. First, I must define a function to translate 2D array copies into a set of 1D array copies. Second, I must construct the final Patch from all those copies.
Note that the final Patch will be around 4 MB of data. This is why I want to avoid a naive approach.
I'm fairly confident that I could do this horribly inefficiently, but I would like some advice on how to efficiently manipulate large 2D arrays in Haskell. I have been looking at the "vector" library, but I have never used it before.
Thanks for your time.
If the spec is really just a one-time creation of a new Patch from a set of previous ones and their positions, then this is a straightforward single-pass algorithm. Conceptually, I'd think of it as two steps: first, combine the existing patches into a data structure with reasonable lookup for any given position. Next, write your new structure lazily by querying the compound structure. This should be roughly O(n log(m)), n being the size of the new array you're writing and m being the number of patches.
This is conceptually much simpler if you use the Vector library instead of a raw ByteString. But it is simpler still if you simply use Data.Array.Unboxed. If you need arrays that can interop with C, then use Data.Array.Storable instead.
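As a rough sketch of the Data.Array.Unboxed route (one possible reading of the suggestion, not the only way to do it): the version below assumes Size is (width, height), Position is the (x, y) of a patch's top-left corner, each patch stores its bytes row-major, and the size of the final canvas is passed in explicitly. The names combineToArray and example are made up for this illustration. Cells not covered by any patch are left as 0.

```haskell
import qualified Data.ByteString as Strict
import Data.Array.Unboxed (UArray, accumArray, elems)
import Data.Word (Word8)

type Size     = (Int, Int)   -- assumed: (width, height)
type Position = (Int, Int)   -- assumed: (x, y) of the top-left corner

data Patch = Patch Size Strict.ByteString

-- Hypothetical helper: place every patch on a zero-filled canvas.
-- The array is indexed by (row, column).
combineToArray :: Size -> [(Position, Patch)] -> UArray (Int, Int) Word8
combineToArray (w, h) plan =
  accumArray (\_ new -> new) 0 ((0, 0), (h - 1, w - 1))
    [ ((py + r, px + c), Strict.index bytes (r * pw + c))
    | ((px, py), Patch (pw, ph) bytes) <- plan
    , r <- [0 .. ph - 1]
    , c <- [0 .. pw - 1]
    ]

-- Two 2x1 patches placed side by side on a 4x1 canvas.
example :: [Word8]
example = elems (combineToArray (4, 1)
  [ ((0, 0), Patch (2, 1) (Strict.pack [1, 2]))
  , ((2, 0), Patch (2, 1) (Strict.pack [3, 4]))
  ])
```

Here example lists the canvas in row-major order. Going back to a Patch afterwards would just be a matter of packing elems into a ByteString again.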
If you ditch purity, at least locally, and work with an ST array, you should be able to trivially do this in O(n) time. Of course, the constant factors will still be worse than using fast copying of chunks of memory at a time, but there's no way to keep that code from looking low-level.
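For the ST variant, a minimal sketch under the same assumptions as in the question's types (Size is taken to be (width, height), Position the (x, y) of the top-left corner, bytes stored row-major, and the canvas size supplied by the caller). combineST and exampleST are hypothetical names. It writes element by element into a flat, row-major canvas, so each patch row turns into a run of consecutive 1D writes; it does not attempt block copies of memory.

```haskell
import Control.Monad (forM_)
import Data.Array.ST (newArray, runSTUArray, writeArray)
import Data.Array.Unboxed (UArray, elems)
import qualified Data.ByteString as Strict
import Data.Word (Word8)

type Size     = (Int, Int)   -- assumed: (width, height)
type Position = (Int, Int)   -- assumed: (x, y) of the top-left corner

data Patch = Patch Size Strict.ByteString

-- Hypothetical helper: fill a flat row-major canvas inside ST,
-- then freeze it into an immutable unboxed array.
combineST :: Size -> [(Position, Patch)] -> UArray Int Word8
combineST (w, h) plan = runSTUArray $ do
  canvas <- newArray (0, w * h - 1) 0
  forM_ plan $ \((px, py), Patch (pw, ph) bytes) ->
    forM_ [0 .. ph - 1] $ \r ->
      forM_ [0 .. pw - 1] $ \c ->
        -- 2D position (py + r, px + c) mapped to a 1D offset
        writeArray canvas ((py + r) * w + px + c)
                          (Strict.index bytes (r * pw + c))
  return canvas

-- Two 2x2 patches placed side by side on a 4x2 canvas.
exampleST :: [Word8]
exampleST = elems (combineST (4, 2)
  [ ((0, 0), Patch (2, 2) (Strict.pack [1, 2, 5, 6]))
  , ((2, 0), Patch (2, 2) (Strict.pack [3, 4, 7, 8]))
  ])
```

Each source byte is read once and written once, which is where the O(n) bound comes from, provided the patches do not overlap.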
