Suppose there are two 1-D arrays of the same length:
let x = fromListUnboxed (ix1 4) [1, 2, 3, 4]
let y = fromListUnboxed (ix1 4) [5, 6, 7, 8]
Now I would like to stack these two arrays into one 2-D array so that these arrays form the rows. How can I do it in repa?
Basically, I'm looking for an equivalent of numpy's row_stack:
>>> x = np.array([1, 2, 3, 4])
>>> y = np.array([5, 6, 7, 8])
>>> np.row_stack((x, y))
array([[1, 2, 3, 4],
[5, 6, 7, 8]])
Note. The two arrays, x and y, come from outside, i.e. I cannot create the 2-D array from scratch.
As I mentioned in the initial comment, all you need is to reshape and then append (both are in Data.Array.Repa).
ghci> let x' = reshape (ix2 4 1) x
ghci> let y' = reshape (ix2 4 1) y
ghci> z <- computeP $ x' `append` y' :: IO (Array U DIM2 Int)
ghci> z
AUnboxed ((Z :. 4) :. 2) [1,5,2,6,3,7,4,8]
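As for pretty-printing, repa isn't very good at it (likely because there is no good way to pretty-print higher dimensions). Note that z has shape 4 x 2, with x and y as its columns, so the one-line hack below iterates over the second index in the outer loop to display x and y as rows: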
ghci> putStr $ unlines [ unwords [ show $ z ! ix2 i j | i<-[0..3] ] | j<-[0..1] ]
1 2 3 4
5 6 7 8
In Python we can index a numpy.ndarray with tuples, like
cube = numpy.zeros((3,3,3,))
print(cube[(0,1,2,)])
However, in Haskell, indexing a multi-layered list can only be done with multiple !!'s, which seems pretty ad hoc.
I tried foldl:
foldl
(!!)
[[[1, 2, ....], [1, 2, ....], [1, 2, ....]],
[[1, 2, ....], [1, 2, ....], [1, 2, ....]],
[[1, 2, ....], [1, 2, ....], [1, 2, ....]]]
[0, 1, 2]
However, foldl only applies to functions like a -> b -> a, not [a] -> b -> a. Some other sources suggest that hmatrix can do things like numpy in Python, but it only applies to matrices and vectors, where the number of dimensions is not adjustable.
This can always be done with C-style indexing, i.e. putting all the data in a 1-D list and indexing it with multiplications, 0 + 1*3 + 2*9, but that seems crude: it loses the dimension information, so the compiler cannot check that the indices are used in the proper order.
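To make the flat-indexing arithmetic concrete, here is a minimal Python sketch. It assumes row-major (C-order) storage, where the last index varies fastest; the expression 0 + 1*3 + 2*9 above corresponds to the opposite, column-major ordering. The name flat_index is hypothetical and only serves the illustration.

```python
def flat_index(index, shape):
    """Row-major (C-order) offset of a multi-dimensional index."""
    offset = 0
    for i, n in zip(index, shape):
        if not 0 <= i < n:
            raise IndexError(f"index {index} out of bounds for shape {shape}")
        # Horner-style accumulation: shift by the axis length, then add.
        offset = offset * n + i
    return offset


shape = (3, 3, 3)
data = list(range(27))  # flat storage standing in for a 3x3x3 cube

pos = flat_index((0, 1, 2), shape)  # 0*9 + 1*3 + 2
print(pos, data[pos])
```

The shape has to be carried around separately from the data, which is exactly the loss of dimension information described above.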
How can I do this in a more abstract way?
It is not quite clear to me from the question what you are trying to achieve, but if your question is only about indexing multidimensional arrays in Haskell then I'll try to answer it to the best of my ability. Thanks to @leftaroundabout for suggesting massiv in the comments section; being the author of that library, I am inclined to agree with his comment.
One thing is certain: for multiple reasons, you do not want to use nested lists as arrays. Linear indexing complexity and abysmal performance are only some of those reasons.
Constructing an array
Let's see how we can get it done with massiv. First I'll translate your numpy example:
cube :: Array P Ix3 Float
cube = A.replicate Seq (Sz (3 :> 3 :. 3)) 0
Note that because we actually have types in Haskell, we need to annotate what type of array we are trying to construct, e.g. boxed vs unboxed, mutable vs immutable, etc. I recommend reading through the library's documentation to get more info on those topics. Here I'll focus on indices, since that is what the question is about. In order to get an element from the above 3D array at page 0, row 1 and column 2, counting from zero (the cube[(0,1,2,)] from the numpy example), we can use the O(1) operator ! with the index supplied on its right side:
λ> cube ! (0 :> 1 :. 2)
0.0
Note that indexing operator ! is partial and will result in a runtime exception on out of bounds:
λ> cube ! (10 :> 1 :. 2)
*** Exception: IndexOutOfBoundsException: (10 :> 1 :. 2) is not safe for (Sz (3 :> 3 :. 3))
CallStack (from HasCallStack):
throwEither, called at src/Data/Massiv/Core/Common.hs:807:11 in massiv-1.0.1.0-...
This can easily be avoided with its safer variant !?:
λ> cube !? (0 :> 1 :. 2) :: Maybe Float
Just 0.0
λ> cube !? (10 :> 1 :. 2) :: Maybe Float
Nothing
Index syntax
As with numpy, it is possible to use tuples for indexing massiv arrays. However, because tuples are polymorphic, it is sometimes trickier for the type checker to infer the right thing, and tuples are supported in massiv only up to 5 dimensions. That's why I will show examples for the Ix n type instead, where n is the number of dimensions, which can be arbitrary.
When working with flat vectors then regular Int is used for indexing (corresponds to Ix 1):
λ> let vec = makeVectorR P Seq (Sz 10) id
λ> vec
Array P Seq (Sz1 10)
[ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
λ> vec ! 7
7
For two dimensions there is a special operator :. (corresponds to Ix 2):
λ> let mat = makeArrayR P Seq (Sz (2 :. 10)) $ \(i :. j) -> i + j
λ> mat
Array P Seq (Sz (2 :. 10))
[ [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 ]
, [ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ]
]
λ> mat ! (1 :. 3)
4
An index for any dimension larger than 2 is built with the :> operator (corresponds to Ix n):
λ> let arr3D = makeArrayR P Seq (Sz (3 :> 2 :. 1)) $ \(i :> j :. k) -> i + j + k
λ> arr3D ! (2 :> 1 :. 0)
3
λ> let arr4D = makeArrayR P Seq (Sz (4 :> 3 :> 2 :. 1)) $ \(h :> i :> j :. k) -> h + i + j + k
λ> arr4D ! (3 :> 2 :> 1 :. 0)
6
More info on indices with examples can be found in the README's #index section.
How do I zip a two-dimensional array with a "vector" row-wise in Julia?
This
X = [1 2; 3 4]
ndims(X)
Y = [-1 -2]
ndims(Y)
first(zip(X,Y))
gives (1, -1) while I want to get ([1 2], -1).
If you're ok with using column-vectors for the input and output, then you can use the eachrow function, which iterates over the rows of a matrix and returns the rows as column-vectors:
julia> X = [1 2; 3 4];
julia> Y = [-1, -2];
julia> collect(zip(eachrow(X), Y))
2-element Array{Tuple{Array{Int64,1},Int64},1}:
([1, 2], -1)
([3, 4], -2)
On the other hand, if you need the first elements of your zipped tuples to be row-vectors (as is shown in your question), then you could convert your matrix into a vector of rows and then use zip:
julia> X = [1 2; 3 4];
julia> Y = [-1 -2];
julia> rows = [X[[i], :] for i in 1:size(X, 1)]
2-element Array{Array{Int64,2},1}:
[1 2]
[3 4]
julia> collect(zip(rows, Y))
2-element Array{Tuple{Array{Int64,2},Int64},1}:
([1 2], -1)
([3 4], -2)
Note that I've used X[[i], :] inside the comprehension instead of X[i, :], so that we get an array of rows rather than an array of column-vectors.
Finally, just to be clear, note that Y = [-1 -2] produces a row-vector. We usually represent vectors as column vectors:
julia> Y = [-1, -2]
2-element Array{Int64,1}:
-1
-2
There are iterator builders in Julia: eachrow and eachcol, which work for arrays and are concise (at least in this case):
X = [1 2; 3 4]
Y = [-1 -2]
z = zip(eachrow(X), eachcol(Y))
Then
for el in z
print(el)
end
gives
([1, 2], [-1])
([3, 4], [-2])
I'm getting a surprising result when selecting a 2D sub-slice of a slice.
Consider the following 2D int array
a := [][]int{
{0, 1, 2, 3},
{1, 2, 3, 4},
{2, 3, 4, 5},
{3, 4, 5, 6},
}
To select the top left 3x3 2D slice using ranges I would use
b := a[0:2][0:2]
I would expect the result to be
[[0 1 2] [1 2 3] [2 3 4]]
however the second index range doesn't seem to have any effect, and returns the following instead:
[[0 1 2 3] [1 2 3 4] [2 3 4 5]]
What am I missing? Can you simply not select a sub-slice like this where the dimension > 1 ?
You can't do what you want in a single step. Slices and arrays are not 2-dimensional; they are just composed to form a multi-dimensional object. See the related question "How is two dimensional array's memory representation".
So with a slice expression you just get a slice that holds a subset of the "full" rows, and its type is the same: [][]int. If you slice it again, you are just slicing the slice of rows again.
Also note that the higher index in a slice expression is exclusive, so a[0:2] will only have 2 rows, so you should use a[0:3] or simply a[:3] instead.
To get what you want, you have to slice the rows individually like this:
b := a[0:3]
for i := range b {
b[i] = b[i][0:3]
}
fmt.Println(b)
This will output (try it on the Go Playground):
[[0 1 2] [1 2 3] [2 3 4]]
Or shorter:
b := a[:3]
for i, bi := range b {
b[i] = bi[:3]
}
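For readers who know Python better than Go, plain nested lists show the same behaviour, which may make the explanation above easier to see. This is only an analogy sketch: the variable names are illustrative, and Python list slices copy, whereas Go slices share the backing array (so the Go loop above also changes the rows seen through a).

```python
a = [
    [0, 1, 2, 3],
    [1, 2, 3, 4],
    [2, 3, 4, 5],
    [3, 4, 5, 6],
]

# Slicing twice only slices the outer list of rows twice:
# the rows themselves are left at full length.
rows_only = a[0:3][0:3]

# Slicing each row separately yields the 3x3 block.
block = [row[0:3] for row in a[0:3]]

print(rows_only)
print(block)
```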
I have a numpy array
X = [[1,2], [3,4], [5,6], [1,2], [5,6]]
I want a numpy array Y = [1, 2, 3, 1, 3], where [1,2] is replaced by 1, [3,4] replaced by 2 and so on. This is for a very large (think millions) X.
Intuition is Y[X == [1,2]] = 1. But this doesn't work.
Here is how to make it work:
Y = np.empty(len(X), dtype=int)  # np.int was removed in recent NumPy; plain int works
Y[np.all(X == [1, 2], 1)] = 1
To process all the possible values:
s = set(map(tuple, X))
r = np.arange(1, len(s) + 1) # or assign whatever values you want
cond = [np.all(X == v, 1) for v in s]
Y = np.dot(r, cond)
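For a very large X, building one boolean mask per distinct row can become expensive. As an alternative sketch (not part of the approach above), np.unique with axis=0 and return_inverse=True labels every row in one call. It assumes that labels following the sorted order of the distinct rows are acceptable; for the example data that happens to coincide with order of first appearance.

```python
import numpy as np

X = np.array([[1, 2], [3, 4], [5, 6], [1, 2], [5, 6]])

# unique_rows holds the distinct rows in sorted order; inverse[i] is the
# position of X[i] within unique_rows.
unique_rows, inverse = np.unique(X, axis=0, return_inverse=True)

# ravel() guards against NumPy versions that return a 2-D inverse here.
Y = inverse.ravel() + 1  # shift to 1-based labels
print(Y)
```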
I need to copy a part of a 3D array.
I have the indexes of start and end of the copy.
For example 2D array:
[[2 2 3 4 5]
[2 3 3 4 5]
[2 3 4 4 5]
[2 3 4 5 5]
[2 3 4 5 6]]
starting index, end index are:
mini = [2, 1]
maxi = [4, 3]
So the result should be:
[[3 4 4]
[3 4 5]]
I can write:
result = matrix[mini[0]:maxi[0], mini[1]:maxi[1]]
Is there a way to do this generally, for 3-D or N-D arrays?
The trick here is realizing what the indexing syntax is under the hood. This:
result = matrix[mini[0]:maxi[0], mini[1]:maxi[1]]
Is shorthand in python (not just numpy) for:
indices = slice(mini[0], maxi[0]), slice(mini[1], maxi[1])
result = matrix[indices]
So we just need to generate indices dynamically:
lower = [2, 1, ...]
upper = [4, 3, ...]
indices = tuple(np.s_[l:u] for l, u in zip(lower, upper))
result = matrix_nd[indices]
np.s_[a:b] is a shorthand for slice(a, b). Here we build a tuple containing as many slices as you have values in lower and upper.
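Put together, the idea can be wrapped in a small helper and tried on a 3-D array. This is a minimal sketch: the name crop and the example bounds are made up for illustration, and upper bounds are treated as exclusive, as in ordinary slicing.

```python
import numpy as np


def crop(array, lower, upper):
    """Return array[lower[0]:upper[0], lower[1]:upper[1], ...] as a view."""
    if len(lower) != len(upper):
        raise ValueError("lower and upper must have the same length")
    # The index must be a tuple of slices, one per leading axis.
    return array[tuple(slice(lo, hi) for lo, hi in zip(lower, upper))]


cube = np.arange(4 * 5 * 6).reshape(4, 5, 6)
sub = crop(cube, [1, 2, 0], [3, 4, 2])
print(sub.shape)
```

If fewer bounds than dimensions are given, the remaining trailing axes are kept whole, which matches how numpy treats a short index tuple.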
What you are looking for is the slice object; see this example:
matrix = np.random.rand(4,5)
mini = [2, 1]
maxi = [4, 3]
slices = tuple(slice(b, e) for b, e in zip(mini, maxi))  # must be a tuple; recent NumPy rejects a list of slices
print(slices)
print(matrix[slices])
print(matrix[mini[0]:maxi[0], mini[1]:maxi[1]])