Repeat array rows a specified number of times

I'm new to Julia, so this is probably very easy.
I have an n-by-m array and a vector of length n and want to repeat each row of the array the number of times in the corresponding element of the vector. For example:
mat = rand(3,6)
v = vec([2 3 1])
The result should be a 6-by-6 array. I tried the repeat function but
repeat(mat, inner = v)
yields a 6×18×1 Array{Float64,3} instead, so it treats the elements of v as repeat counts for each dimension rather than for each row. In MATLAB I would use repelem(mat, v, 1), and I hope Julia offers something similar. My actual matrix is a lot bigger and I will have to call the function many times, so this operation needs to be as fast as possible.

Adding something similar to Julia Base has been discussed, but as far as I know it is not implemented yet. You can achieve what you want with the inverse_rle function from StatsBase.jl:
julia> using StatsBase
julia> row_idx = inverse_rle(axes(v, 1), v)
6-element Array{Int64,1}:
1
1
2
2
2
3
and now you can write:
mat[row_idx, :]
or
@view mat[row_idx, :]
(the second option creates a view instead of a copy, which may matter since your mat is large and you need to do this indexing many times; which option is faster will depend on your exact use case).
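If adding StatsBase.jl as a dependency is not an option, the same index vector can be built with Base alone. The following is a minimal sketch, assuming v holds nonnegative integer counts; the helper name repeat_rows is hypothetical and not part of any library.

```julia
# Hypothetical helper: repeat row i of `mat` exactly counts[i] times.
function repeat_rows(mat::AbstractMatrix, counts::AbstractVector{<:Integer})
    length(counts) == size(mat, 1) ||
        throw(DimensionMismatch("need one count per row"))
    # Expand the counts into row indices, e.g. [2, 3, 1] -> [1, 1, 2, 2, 2, 3].
    row_idx = [i for (i, c) in zip(axes(mat, 1), counts) for _ in 1:c]
    return mat[row_idx, :]
end

mat = rand(3, 6)
v = [2, 3, 1]
out = repeat_rows(mat, v)   # 6-by-6 matrix
```

Whether this is faster than the inverse_rle route would need to be benchmarked on the actual data.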

Related

How to get each index of one or more of an array's dimensions?

If I want eachindex but only of a specific dimension, what's a good way to accomplish this?
E.g. x is a 3x5x7 Array
x = rand(3,5,7)
And I'd like to get the 2nd dimension's indexes of 1:5, ideally in a way that doesn't assume that the indexing starts at 1
The axes function is a generic way to get that.
axes(A, d)
Return the valid range of indices for array A along dimension d.
julia> A = fill(1, (5,6,7));
julia> axes(A, 2)
Base.OneTo(6)
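Applied to the x from the question, this could look as follows. It is a small sketch; the variable names are illustrative only.

```julia
x = rand(3, 5, 7)

# Indices of a single dimension; valid even if indexing does not start at 1.
second = axes(x, 2)

# Indices of several chosen dimensions, returned as a tuple of ranges.
first_and_third = map(d -> axes(x, d), (1, 3))

# Typical use: loop over one dimension while slicing the others in full.
col_sums = [sum(x[:, j, :]) for j in axes(x, 2)]
```

Calling axes(x) without a dimension returns the index ranges of all dimensions at once.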

Better algorithm to construct a matrix using vcat() or hcat() in a loop in Julia?

To use vcat(a,b) and hcat(a,b), one must match the number of columns or number of rows in the matrices a and b.
When constructing a matrix using vcat(a, b) or hcat(a, b) in a loop, one needs an initial matrix a to start from. Although all the sub-matrices are created in the same manner, I might need to construct this initial matrix a outside of the loop.
For example, if the loop condition is for i in 1:w, then I would need to pre-create a using i = 1, then start the loop with for i in 2:w.
If there is a nested loop, this approach becomes very awkward. I have thought of the following workarounds, but neither really works:
Use a dummy a and delete it after the loop. According to this question, we cannot delete a row from a matrix. If we use another variable to refer to the useful rows and columns, we might waste memory.
Use reshape() to make an empty dummy a. It works when growing along one dimension (a 2×0 array), but not along both:
julia> a = reshape([], 2, 0)
2×0 Array{Any,2}
julia> b = hcat(a, [3, 3])
2×1 Array{Any,2}:
3
3
julia> a = reshape([], 2, 2)
ERROR: DimensionMismatch("new dimensions (2,2) must be consistent with array size 0")
in reshape(::Array{Any,1}, ::Tuple{Int64,Int64}) at ./array.jl:113
in reshape(::Array{Any,1}, ::Int64, ::Int64, ::Vararg{Int64,N}) at ./reshapedarray.jl:39
So my question is: how can I work around this when using vcat() and hcat() in a loop?
Edit:
Here is the problem I got stuck in:
There are many grayscale images. Each one is represented as a 20 by 20 Float64 array. A function foo(n) randomly picks n of those matrices and combines them into a big square.
If n has an integer square root, then foo(n) returns a sqrt(n) * 20 by sqrt(n) * 20 matrix.
If n does not have an integer square root, then foo(n) returns a ceil(sqrt(n)) * 20 by ceil(sqrt(n)) * 20 matrix. At the end of the big square image, foo(n) fills in ceil(sqrt(n)) ^ 2 - n extra black images (each one is represented as zeros(20,20)).
My current algorithm for foo(n) uses a nested loop. In the inner loop, hcat() builds a layer (consisting of ceil(sqrt(n)) images). In the outer loop, vcat() combines those layers.
Then dealing with hcat() and vcat() in a loop becomes complicated.
So would:
pickimage() = randn(20,20)
n = 16
m = ceil(Int, sqrt(n))
out = Matrix{Float64}(20m, 20m)
k = 0
for i in (1:m)-1
    for j in (1:m)-1
        out[20i + (1:20), 20j + (1:20)] .= ((k += 1) <= n) ? pickimage() : zeros(20,20)
    end
end
be a relevant solution?
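The same idea can be packaged as the foo(n) described in the question. The snippet above uses pre-1.0 forms such as Matrix{Float64}(20m, 20m) and (1:m)-1, so this sketch assumes Julia 1.x syntax instead. Here pickimage is a stand-in for the real image source, and the tile keyword is an illustrative addition. Allocating with zeros means any unused tiles are already black, so no separate padding step is needed.

```julia
# Stand-in for the real image source, which is not shown in the question.
pickimage() = rand(20, 20)

function foo(n; tile = 20)
    m = ceil(Int, sqrt(n))
    out = zeros(Float64, tile * m, tile * m)   # unfilled tiles stay black
    for k in 1:n
        i, j = divrem(k - 1, m)                # zero-based tile row and column
        rows = tile * i .+ (1:tile)
        cols = tile * j .+ (1:tile)
        out[rows, cols] .= pickimage()
    end
    return out
end

big = foo(10)   # 80-by-80; the last 6 tiles are left as zeros
```

Because the output is preallocated, no initial matrix is needed and neither hcat nor vcat is called inside the loop.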

Julia: Converting Vector of Arrays to Array for Arbitrary Dimensions

Using timing tests, I found that it's much more performant to grow Vector{Array{Float64}} objects using push! than it is to simply use an Array{Float64} object and either hcat or vcat. However, after the computation is completed, I need to change the resulting object to an Array{Float64} for further analysis. Is there a way that works regardless of the dimensions? For example, if I generate the Vector of Arrays via
u = [1 2 3 4
     1 3 3 4
     1 5 6 3
     5 2 3 1]
uFull = Vector{Array{Int}}(0)
push!(uFull,u)
for i = 1:10000
    push!(uFull,u)
end
I can do the conversion like this:
fill = Array{Int}(size(uFull)...,size(u)...)
for i in eachindex(uFull)
    fill[i,:,:] = uFull[i]
end
but notice this requires that I know the arrays are matrices (2-dimensional). If it's 3-dimensional, I would need another :, and so this doesn't work for arbitrary dimensions.
Note that I also need a form of the "inverse transform" (except first indexed by the last index of the full array) in arbitrary dimensions, and I currently have
filla = Vector{Array{Int}}(size(fill)[end])
for i in 1:size(fill)[end]
    filla[i] = fill[:,:,i]'
end
I assume the method for the first conversion will likely solve the second as well.
This is the sort of thing that Julia's custom array infrastructure excels at. I think the simplest solution here is to actually make a special array type that does this transformation for you:
immutable StackedArray{T,N,A} <: AbstractArray{T,N}
    data::A # A <: AbstractVector{<:AbstractArray{T,N-1}}
    dims::NTuple{N,Int}
end
function StackedArray(vec::AbstractVector)
    @assert all(size(vec[1]) == size(v) for v in vec)
    StackedArray(vec, (length(vec), size(vec[1])...))
end
StackedArray{T, N}(vec::AbstractVector{T}, dims::NTuple{N}) = StackedArray{eltype(T),N,typeof(vec)}(vec, dims)
Base.size(S::StackedArray) = S.dims
@inline function Base.getindex{T,N}(S::StackedArray{T,N}, I::Vararg{Int,N})
    @boundscheck checkbounds(S, I...)
    S.data[I[1]][Base.tail(I)...]
end
Now just wrap your vector in a StackedArray and it'll behave like an N+1 dimensional array. This could be expanded and made more featureful (it could similarly support setindex! or even push!ing arrays to concatenate natively), but I think that it's sufficient to solve your problem. By simply wrapping uFull in a StackedArray you get an object that acts like an Array{T, N+1}. Make a copy, and you get exactly a dense Array{T, N+1} without ever needing to write a for loop yourself.
julia> S = StackedArray(uFull)
10001x4x4 StackedArray{Int64,3,Array{Array{Int64,2},1}}:
[:, :, 1] =
1 1 1 5
1 1 1 5
1 1 1 5
…
julia> squeeze(S[1:1, :, :], 1) == u
true
julia> copy(S) # returns a dense Array{T,N}
10001x4x4 Array{Int64,3}:
[:, :, 1] =
1 1 1 5
1 1 1 5
…
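The definition above is written in pre-1.0 syntax (immutable, getindex{T,N}, squeeze). A sketch of the same wrapper for Julia 1.x follows. It is an adaptation under that assumption rather than the answerer's code, and it uses collect to obtain the dense copy.

```julia
struct StackedArray{T,N,A<:AbstractVector} <: AbstractArray{T,N}
    data::A                 # vector of equally sized (N-1)-dimensional arrays
    dims::NTuple{N,Int}
end

function StackedArray(vec::AbstractVector{<:AbstractArray})
    @assert all(size(first(vec)) == size(v) for v in vec)
    dims = (length(vec), size(first(vec))...)
    return StackedArray{eltype(eltype(vec)),length(dims),typeof(vec)}(vec, dims)
end

Base.size(S::StackedArray) = S.dims

@inline function Base.getindex(S::StackedArray{T,N}, I::Vararg{Int,N}) where {T,N}
    @boundscheck checkbounds(S, I...)
    # The first index selects the stored array, the rest index into it.
    return S.data[I[1]][Base.tail(I)...]
end

u = [1 2; 3 4]
S = StackedArray([u, 2u, 3u])   # behaves like a 3×2×2 array
dense = collect(S)              # dense Array{Int,3} with the same contents
```

Only size and getindex are defined, so the wrapper is read-only; supporting setindex! would be a separate addition.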
Finally, I'll just note that there's another solution here: you could introduce the custom array type sooner, and make a GrowableArray that internally stores its elements as a linear Vector{T}, but allows pushing entire columns or arrays directly.
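The linear-storage idea can be tried without defining a new type at all. The sketch below is a simplification of it, not the GrowableArray the answer has in mind: slices are appended to a flat vector and reshaped once at the end, with the slice index last. It assumes Julia 1.x and that all slices have the same size.

```julia
u = [1 2 3 4
     1 3 3 4
     1 5 6 3
     5 2 3 1]

flat = Int[]                      # linear storage that grows cheaply
for k in 1:5
    append!(flat, vec(k .* u))    # each slice is appended in column-major order
end

# Reinterpret the flat storage as an array with one slice per appended block.
full = reshape(flat, size(u)..., :)
```

The reshaped array shares memory with flat, so the reshape is best done once, after all slices have been appended.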
Matt B.'s answer is great, because it "simulates" an array without actually having to create or store it. When you can use this solution, it's likely to be your best choice.
However, there might be circumstances where you need to create a concatenated array (e.g., if you're passing this to some C code which requires contiguous memory). In that case you can just call cat, which is generic (it can handle arbitrary dimensions).
For example:
u = [1 2 3 4
     1 3 3 4
     1 5 6 3
     5 2 3 1]
uFull = Vector{typeof(u)}(0)
push!(uFull,u)
for i = 1:10000
    push!(uFull,u)
end
ucat = cat(ndims(eltype(uFull))+1, uFull...)
I took the liberty of making one important change to your code: uFull = Vector{typeof(u)}(0) because it ensures that the objects stored in the Vector container have concrete type. Array{Int} is actually an abstract type, because you'd need to specify the dimensionality too (Array{Int,2}).
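On recent Julia versions the concatenation and its inverse can be written without splatting thousands of arguments. The sketch below assumes Julia 1.9 or later, where stack is available in Base; the small uFull is illustrative data only.

```julia
u = [1 2 3 4
     1 3 3 4
     1 5 6 3
     5 2 3 1]
uFull = [k .* u for k in 1:5]     # Vector{Matrix{Int}}, a concrete element type

# New dimension last: A[:, :, k] == uFull[k]
A = stack(uFull)

# New dimension first, matching the fill[i, :, :] layout used in the question
B = stack(uFull; dims = 1)

# Inverse transform for any dimensionality: one array per slice of the last dimension
back = [copy(s) for s in eachslice(A; dims = ndims(A))]
```

Neither call hard-codes the number of dimensions, so the same lines work for stored arrays of any dimensionality.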

Is there a way to quickly extract the parts from a vector without looping?

Consider that I have a vector/array made up of consecutive parts, where each part is a sub-array of some fixed and known size (the parts can only be accessed through indexing, i.e. it's not a tensor or a higher-order array). So for example:
x1 = x(1:d);
if d is the size of each sub-array. All sub-arrays have the same size, but that size might vary depending on the current x we are considering. However, we do know n (the number of sub-arrays) and d (the size of each sub-array).
I know there are usually really strange but useful tricks in MATLAB to do things in a more optimized way. Is there a way to extract those parts, maybe using indexing, and make a matrix where the rows (or columns) are those parts? As in:
X = [x_1, ..., x_n]
The caveat is that n is a variable and we don't know a priori what it is. We can find out what n is, but it's not fixed.
To add some more context: I want to minimize the number of for loops I actually write in MATLAB, in the hope that it's faster.
First I would consider simple reshaping to keep the output as a simple double matrix
x = (1:15).' %'
d = 3;
out = reshape(x,d,[])
and further on just use indexing to access the columns out(:,idx);
There is no need to know n in advance, as reshape is calculating it based on d and the number of elements in x.
out =
1 4 7 10 13
2 5 8 11 14
3 6 9 12 15
If you'd insist on something like cell arrays, use accumarray with ceil to get the subs:
out = accumarray( ceil( (1:numel(x))/d ).', x(:), [], @(x) {x})

Dynamic slicing of Matlab array

I have an n-dimensional array A and want to slice it dynamically, i.e., given a list of array dimensions, like [2 4], and a list of values, like [6 8], I want
B = A(:,6,:,8,:,:,:,:,...)
List lengths are unknown. Using eval would work but is not an option. This question is a generalization of a previous post to multiple indices and dimensions without a for-loop.
You can still use the previous post I linked to in order to answer your question. That post only slices along one dimension. I originally flagged this question as a duplicate and closed it, because all you need to do is replace one line of code in that post's accepted answer to achieve what you want. However, because the change isn't that obvious, I have decided to reopen the question and answer it for you.
Referring to the previous post, this is what Andrew Janke (the person with the accepted answer on the linked post) did (very clever I might add):
function out = slice(A, ix, dim)
    subses = repmat({':'}, [1 ndims(A)]);
    subses{dim} = ix;
    out = A(subses{:});
Given a matrix A, an index number ix and the dimension you want to access dim, the above function would equivalently perform:
       out = A(:, :, ..., ix, :, :, ..., :);
               ^  ^       ^   ^
dimensions --> 1  2       dim dim+1
You would access your desired dimension in dim, and place what value you want to use to slice into that dimension. As such, you'd call it like this:
out = slice(A, ix, dim);
How the function works is that subses would generate a cell array of ':' strings (that will eventually be converted into ':' operators) that is as long as the total number of dimensions of A. Next, you would access the element at dim, which corresponds to the dimension you want and you would replace this with ix. You would then unroll this cell array so that we would access A in the manner that you see in the above equivalent statement.
Who would have thought that you can use strings to index into an array!?
Now, to generalize this, all you have to do is make one small but very crucial change. ix would now be a vector of indices, and dim would be a vector of dimensions you want to access. As such, it would look something like this:
function out = slice(A, ix, dim)
    subses = repmat({':'}, [1 ndims(A)]);
    subses(dim) = num2cell(ix);
    out = A(subses{:});
The only difference we see here is the second line of the code. We have to use num2cell to convert ix into a cell array with one index per cell, and we slice into subses to replace the ':' entries at the chosen dimensions with your desired indices. Note that we are using () parentheses and not {} braces. Parentheses are used to slice through cell arrays, while braces are used to access cell array contents. Because we are going to assign multiple cells to subses, () is needed. We then perform our slicing in A accordingly.
As such, given your problem and with the above modifications, you would do:
out = slice(A, [6 8], [2 4]);
Be advised that ix and dim must contain the same number of elements and they must be 1D. Also, ix and dim should be sensible inputs (i.e. positive integers, not floating-point or negative values). I don't do this error checking because I'm assuming you know what you're doing and you're smart enough to know how to use this properly.
Good luck!
