Julia: Questions about array where the dimension is not determined - arrays

I have two beginner's questions:
(1) I want to reshape an array, but dimensions come from a vector which can be a variable. For example,
A = ones(120,1)
b = [2,3,4,5]
I can write
C = reshape(A,2,3,4,5)
But in case b can vary, I want something like
C = reshape(A,b)
This code works in Matlab. Is there an analog in Julia?
(2) I want to slice a high-dimensional array, while keeping the dimensions flexible. In the example above, I fix the last dimension:
C[:,:,:,1]
C[:,:,:,2]
etc. The problem is to find an efficient way: For an array of any dimensions, I can always fix the last dimension and extract values.
Any help will be highly appreciated!

(1) C = reshape(A,b...)
(2) EllipsisNotation.jl provides a .. operator, so C[..,1] does what you want.

And there is C[ntuple(x->:, ndims(C)-1)..., 1] for (2) if you don't want to install a package.

Related

Julia: How to efficiently sort subarrays of 2 large arrays in parallel?

I have large 1D arrays a and b, and an array of pointers I that separates them into subarrays. My a and b barely fit into RAM and are of different dtypes (one contains UInt32s, the other Rational{Int64}s), so I don’t want to join them into a 2D array, to avoid changing dtypes.
For each i in I[2:end], I wish to sort the subarray a[I[i-1],I[i]-1] and apply the same permutation to the corresponding subarray b[I[i-1],I[i]-1]. My attempt at this is:
function sort!(a,b)
p=sortperm(a);
a[:], b[:] = a[p], b[p]
end
Threads.#threads for i in I[2:end]
sort!( a[I[i-1], I[i]-1], b[I[i-1], I[i]-1] )
end
However, already on a small example, I see that sort! does not alter the view of a subarray:
a, b = rand(1:10,10), rand(-1000:1000,10) .//1
sort!(a,b); println(a,"\n",b) # works like it should
a, b = rand(1:10,10), rand(-1000:1000,10) .//1
sort!(a[1:5],b[1:5]); println(a,"\n",b) # does nothing!!!
Any help on how to create such function sort! (as efficient as possible) are welcome.
Background: I am dealing with data coming from sparse arrays:
using SparseArrays
n=10^6; x=sprand(n,n,1000/n); #random matrix with 1000 entries per column on average
x = SparseMatrixCSC(n,n,x.colptr,x.rowval,rand(-99:99,nnz(x)).//1); #chnging entries to rationals
U = randperm(n) #permutation of rows of matrix x
a, b, I = U[x.rowval], x.nzval, x.colptr;
Thus these a,b,I serve as good examples to my posted problem. What I am trying to do is sort the row indices (and corresponding matrix values) of entries in each column.
Note: I already asked this question on Julia discourse here, but received no replies nor comments. If I can improve on the quality of the question, don't hesitate to tell me.
The problem is that a[1:5] is not a view, it's just a copy. instead make the view like
function sort!(a,b)
p=sortperm(a);
a[:], b[:] = a[p], b[p]
end
Threads.#threads for i in I[2:end]
sort!(view(a, I[i-1]:I[i]-1), view(b, I[i-1]:I[i]-1))
end
is what you are looking for
ps.
the #view a[2:3], #view(a[2:3]) or the #views macro can help making thins more readable.
First of all, you shouldn't redefine Base.sort! like this. Now, sort! will shadow Base.sort! and you'll get errors if you call sort!(a).
Also, a[I[i-1], I[i]-1] and b[I[i-1], I[i]-1] are not slices, they are just single elements, so nothing should happen if you sort them either with views or not. And sorting arrays in a moving-window way like this is not correct.
What you want to do here, since your vectors are huge, is call p = partialsortperm(a[i:end], i:i+block_size-1) repeatedly in a loop, choosing a block_size that fits into memory, and modify both a and b according to p, then continue to the remaining part of a and find next p and repeat until nothing remains in a to be sorted. I'll leave the implementation as an exercise for you, but you can come back if you get stuck on something.

Split array into smaller unequal-sized arrays dependend on array-column values

I'm quite new to MatLab and this problem really drives me insane:
I have a huge array of 2 column and about 31,000 rows. One of the two columns depicts a spatial coordinate on a grid the other one a dependent parameter. What I want to do is the following:
I. I need to split the array into smaller parts defined by the spatial column; let's say the spatial coordinate are ranging from 0 to 500 - I now want arrays that give me the two column values for spatial coordinate 0-10, then 10-20 and so on. This would result in 50 arrays of unequal size that cover a spatial range from 0 to 500.
II. Secondly, I would need to calculate the average values of the resulting columns of every single array so that I obtain per array one 2-dimensional point.
III. Thirdly, I could plot these points and I would be super happy.
Sadly, I'm super confused since I miserably fail at step I. - Maybe there is even an easier way than to split the giant array in so many small arrays - who knows..
I would be really really happy for any suggestion.
Thank you,
Arne
First of all, since you wish a data structure of array of different size you will need to place them in a cell array so you could try something like this:
res = arrayfun(#(x)arr(arr(:,1)==x,:), unique(arr(:,1)), 'UniformOutput', 0);
The previous code return a cell array with the array splitted according its first column with #(x)arr(arr(:,1)==x,:) you are doing a function on x and arrayfun(function, ..., 'UniformOutput', 0) applies function to each element in the following arguments (taken a single value of each argument to evaluate the function) but you must notice that arr must be numeric so if not you should map your values to numeric values or use another way to select this values.
In the same way you could do
uo = 'UniformOutput';
res = arrayfun(#(x){arr(arr(:,1)==x,:), mean(arr(arr(:,1)==x,2))), unique(arr(:,1)), uo, 0);
You will probably want to flat the returning value, check the function cat, you could do:
res = cat(1,res{:})
Plot your data depends on their format, so I can't help if i don't know how the data are, but you could try to plot inside a loop over your 'res' variable or something similar.
Step I indeed comes with some difficulties. Once these are solved, I guess steps II and III can easily be solved. Let me make some suggestions for step I:
You first define the maximum value (maxValue = 500;) and the step size (stepSize = 10;). Now it is possible to iterate through all steps and create your new vectors.
for k=1:maxValue/stepSize
...
end
As every resulting array will have different dimensions, I suggest you save the vectors in a cell array:
Y = cell(maxValue/stepSize,1);
Use the find function to find the rows of the entries for each matrix. At each step k, the range of values of interest will be (k-1)*stepSize to k*stepSize.
row = find( (k-1)*stepSize <= X(:,1) & X(:,1) < k*stepSize );
You can now create the matrix for a stepk by
Y{k,1} = X(row,:);
Putting everything together you should be able to create the cell array Y containing your matrices and continue with the other tasks. You could also save the average of each value range in a second column of the cell array Y:
Y{k,2} = mean( Y{k,1}(:,2) );
I hope this helps you with your task. Note that these are only suggestions and there may be different (maybe more appropriate) ways to handle this.

Extract one Logical variable from several Elementwise-Conditions in multidimensional arrays in Matlab

What is the most elegant way to reduce elementwise-conditions in multidimensional-arrays to one logical variable in Matlab? I need this for a big project with a lot of if conditions and assertions. In the Matlab documentation about logical arrays and find array elements there is no satifying solution for this problem.
For example, a logical variable myBool is true iff there are two ones at the same position in the matrices A and B:
A = [0,1;0,0]
B = [0,1;1,0]
My preferred solution so far is:
myBool = any(A(:)==1 & B(:)==1)
But it doesn't look like the shortest solution and it doesn't work with array indexing.
A shorter but not very readable solution:
myBool = any(A(B==1))
The biggest problem is that for higher dimensional arrays the functions like nnz() only reduce the order by one dimension without the colon (:), but with the colon it is not possible to index a part of the array...
Firstly, if you use matrices of class logical, then you don't need to test for equality to 1.
Indexing aside, the best approach would be:
bFlag = any(A(:) & B(:));
If you need indexing, you have two options. You can use a small vectorising anonymous function:
fhVec = #(T)(T(:));
bFlag = any(fhVec(A(rowIndices, colIndices) & B(rowIndices, colIndices)));
alternatively, you can use linear indexing:
vnLinearIndices = sub2ind(size(A), rowIndices, colIndices);
bFlag = any(A(vnLinearIndices) & B(vnLinearIndices));

How can I combine these two arrays into a matrix?

In MATLAB, if I define 2 matrices like:
A = [1:10];
B = [1:11];
How do I make matrix C with column 1 equal to A and column 2 equal to B? I cannot find any answers online. Sorry if I used the wrong MATLAB terminology for this scenario.
Well, to accomplish this you first need to make sure that A and B are the same length. In your example, A has 10 elements and B has 11, so that won't work.
However, assuming A and B have the same number of elements, this will do the trick:
C = [A(:) B(:)];
This first reshapes A and B into column vectors using single-colon indexing, then concatenates them horizontally.
if A,B same length, then can just type
C=[A' B']

Distributing a function over a single dimension of an array in MATLAB?

I often find myself wanting to collapse an n-dimensional matrix across one dimension using a custom function, and can't figure out if there is a concise incantation I can use to do this.
For example, when parsing an image, I often want to do something like this. (Note! Illustrative example only. I know about rgb2gray for this specific case.)
img = imread('whatever.jpg');
s = size(img);
for i=1:s(1)
for j=1:s(2)
bw_img(i,j) = mean(img(i,j,:));
end
end
I would love to express this as something like:
bw = on(color, 3, #mean);
or
bw(:,:,1) = mean(color);
Is there a short way to do this?
EDIT: Apparently mean already does this; I want to be able to do this for any function I've written. E.g.,
...
filtered_img(i,j) = reddish_tint(img(i,j,:));
...
where
function out = reddish_tint(in)
out = in(1) * 0.5 + in(2) * 0.25 + in(3) * 0.25;
end
Many basic MATLAB functions, like MEAN, MAX, MIN, SUM, etc., are designed to operate across a specific dimension:
bw = mean(img,3); %# Mean across dimension 3
You can also take advantage of the fact that MATLAB arithmetic operators are designed to operate in an element-wise fashion on matrices. For example, the operation in your function reddish_tint can be applied to all pixels of your image with this single line:
filtered_img = 0.5.*img(:,:,1)+0.25.*img(:,:,2)+0.25.*img(:,:,3);
To handle a more general case where you want to apply a function to an arbitrary dimension of an N-dimensional matrix, you will probably want to write your function such that it accepts an additional input argument for which dimension to operate over (like the above-mentioned MATLAB functions do) and then uses some simple logic (i.e. if-else statements) and element-wise matrix operations to apply its computations to the proper dimension of the matrix.
Although I would not suggest using it, there is a quick-and-dirty solution, but it's rather ugly and computationally more expensive. You can use the function NUM2CELL to collect values along a dimension of your array into cells of a cell array, then apply your function to each cell using the function CELLFUN:
cellArray = num2cell(img,3); %# Collect values in dimension 3 into cells
filtered_img = cellfun(#reddish_tint,cellArray); %# Apply function to each cell
I wrote a helper function called 'vecfun' that might be useful for this, if it's what you're trying to achieve?
link
You could use BSXFUN for at least some of your tasks. It performs an element-wise operation among two arrays by expanding the size 1 - dimensions to match the size in the other array. The 'reddish tint' function would become
reddish_image = bsxfun(#times,img,cat(3,0.5,0.25,0.25));
filtered_img = sum(reddish_image,3);
All the above statement requires in order to work is that the third dimension of img has size 1 or 3. Number and size of the other dimensions can be chosen freely.
If you are consistently trying to apply a function to a vector comprised by the 3 dimension in a block of images, I recommend using a pair reshapes, for instance:
Img = rand(480,640,3);
sz = size(Img);
output = reshape(myFavoriteFunction(reshape(Img,[prod(sz(1:2)),sz(3)])'),sz);
This way you can swap in any function that operates on matrices along their first dimension.
edit.
The above code will crash if you input an image which has only one layer: The function below can fix it.
function o = nLayerImage2MatrixOfPixels(i)
%function o = nLayerImage2MatrixOfPixels(i)
s = size(i);
if(length(s) == 2)
s3 = 1;
else
s3 = s(3);
end
o = reshape(i,[s(1)*s(2),s(3)])';
Well, if you are only concerned with multiplying vectors together you could just use the dot product, like this:
bw(:,:,1)*[0.3;0.2;0.5]
taking care that the shapes of your vectors conform.

Resources