Julia: Sort the columns of a matrix by the values in another vector (in place...)? - arrays

I am interested in sorting the columns of a matrix in terms of the values in 2 other vectors. As an example, suppose the matrix and vectors look like this:
M = [ 1 2 3 4 5 6 ;
7 8 9 10 11 12 ;
13 14 15 16 17 18 ]
v1 = [ 2 , 6 , 6 , 1 , 3 , 2 ]
v2 = [ 3 , 1 , 2 , 7 , 9 , 1 ]
I want to sort the columns of M by their corresponding values in v1 and v2, with v1 taking precedence over v2. I would also like to sort the matrix in place, as the matrices I am working with are very large. Currently, my crude solution looks like this:
MM = [ v1' ; v2' ; M ] ; ## concatenate the vectors with the matrix
MM[:,:] = sortcols(MM , by=x->(x[1],x[2]))
M[:,:] = MM[3:end,:]
which gives the desired result:
3x6 Array{Int64,2}:
4 6 1 5 2 3
10 12 7 11 8 9
16 18 13 17 14 15
Clearly my approach is not ideal, as it requires computing and storing intermediate matrices. Is there a more efficient/elegant approach for sorting the columns of a matrix by 2 other vectors? And can it be done in place to save memory?
Previously I have used sortperm for sorting an array in terms of the values stored in another vector. Is it possible to use sortperm with 2 vectors (and in-place)?

I would probably do it this way:
julia> cols = sort!([1:size(M,2);], by=i->(v1[i],v2[i]));
julia> M[:,cols]
3×6 Array{Int64,2}:
4 6 1 5 2 3
10 12 7 11 8 9
16 18 13 17 14 15
This should be pretty fast and uses only one temporary vector and one copy of the matrix. It's not fully in-place, but doing this operation completely in-place is not easy. You would need a sorting function that moves columns as it works, or alternatively a version of permute! that works on columns. You could start with the code for permute!! in combinatorics.jl and modify it to permute columns, reusing a single column-size temporary buffer.
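To make the "permute columns with a single column-size buffer" idea concrete, here is a minimal sketch in Python rather than Julia. It is not the code from combinatorics.jl; the function name sort_columns_in_place and the list-of-rows matrix layout are assumptions made for illustration. It computes the column order from (v1, v2) and then follows the permutation cycles, saving only one column at a time. Note that it still needs the index vector and a small "done" flag vector, both of length equal to the number of columns.

```python
def sort_columns_in_place(M, v1, v2):
    """Reorder the columns of M (a list of rows) by (v1, v2), in place.

    Hypothetical helper for illustration only.
    """
    ncols = len(v1)
    # order[k] is the index of the original column that belongs at position k.
    order = sorted(range(ncols), key=lambda i: (v1[i], v2[i]))
    done = [False] * ncols
    for start in range(ncols):
        if done[start]:
            continue
        # Save the first column of this cycle: the only column-sized buffer.
        saved = [row[start] for row in M]
        k = start
        while order[k] != start:
            src = order[k]
            for row in M:
                row[k] = row[src]  # move column src into position k
            done[k] = True
            k = src
        # Close the cycle with the saved column.
        for r, row in enumerate(M):
            row[k] = saved[r]
        done[k] = True
    return order


M = [[1, 2, 3, 4, 5, 6],
     [7, 8, 9, 10, 11, 12],
     [13, 14, 15, 16, 17, 18]]
v1 = [2, 6, 6, 1, 3, 2]
v2 = [3, 1, 2, 7, 9, 1]
order = sort_columns_in_place(M, v1, v2)
```

With the data from the question, M ends up with the same column order as the M[:,cols] result above.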

Related

How to perform transpose to each page of nd array? [duplicate]

This question already has an answer here:
Matlab - Transpose a 3D matrix only in the third dimension
(1 answer)
Closed 5 years ago.
I am trying to figure out how to import a large array of data into a 3D matrix in a specific order. I have already asked two questions but have not received a reliable answer yet, and was downvoted too. Since then I have done some work and was able to import the data into a 3D matrix using the reshape function. Rather than posting the actual problem, here is a simplified version of it.
k=1:27 % create an array of 27 values
r=reshape(k,[3,3,3]) % convert the array into a 3 x 3 x 3 matrix
As the first and second pages of the matrix show, the data is placed along the columns, but I want it placed along the rows. The transpose function does not work with N-D arrays. I tried permute but did not get the desired result. One solution would be to transpose each page separately, but that would break the 3D matrix into many 2D matrices.
r(:,:,1) =
1 4 7
2 5 8
3 6 9
r(:,:,2) =
10 13 16
11 14 17
12 15 18
the expected outcome should be,
r(:,:,1) =
1 2 3
4 5 6
7 8 9
Link to the actual problem is,
Thanks
Is this what you want?
result = permute(r, [2 1 3]);
This permutes the first two dimensions. For your example r,
>> k = 1:27;
>> r = reshape(k, [3,3,3]);
>> result = permute(r, [2 1 3]);
>> result
result(:,:,1) =
1 2 3
4 5 6
7 8 9
result(:,:,2) =
10 11 12
13 14 15
16 17 18
result(:,:,3) =
19 20 21
22 23 24
25 26 27
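For readers who want to see what swapping the first two dimensions does element by element, here is a small sketch in plain Python. The helper names (reshape_column_major, swap_first_two_dims, page) are made up for this illustration and are not MATLAB functions; the nested lists are indexed as a[i][j][k], with k selecting the page.

```python
def reshape_column_major(values, shape):
    """Fill an n1 x n2 x n3 nested list in column-major order, like MATLAB's reshape."""
    n1, n2, n3 = shape
    return [[[values[i + n1 * j + n1 * n2 * k] for k in range(n3)]
             for j in range(n2)]
            for i in range(n1)]


def swap_first_two_dims(a):
    """Return b with b[j][i][k] == a[i][j][k], i.e. every page transposed."""
    n1, n2, n3 = len(a), len(a[0]), len(a[0][0])
    return [[[a[i][j][k] for k in range(n3)]
             for i in range(n1)]
            for j in range(n2)]


def page(a, k):
    """Extract page k as a 2D list of rows."""
    return [[cell[k] for cell in row] for row in a]


r = reshape_column_major(list(range(1, 28)), (3, 3, 3))
result = swap_first_two_dims(r)
first_page_before = page(r, 0)
first_page_after = page(result, 0)
```

The third index is left alone, so the array stays a single 3D structure while each page is transposed.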

xor after applying filters on an array

We have an original array and a list of filters, where each filter consists of the indices that are allowed through the filter. The filters are rather nice, i.e. there is one for each power of 2, structured in the following way (the filters are shown up to n = 20).
1 (2^0) = 1 3 5 7 9 11 13 15 17 19
2 (2^1) = 1 2 5 6 9 10 13 14 17 18
4 (2^2) = 1 2 3 4 9 10 11 12 17 18 19 20
8 (2^3) = 1 2 3 4 5 6 7 8 17 18 19 20
I hope you get the idea. Now we apply some or all of these filters (the user dictates which filters to apply) to the original array, and the xor of the elements of the transformed array is the answer. For example, if the original array is [3 7 8 1 2 9 6 4 11], i.e. n = 9, and we need to apply the filters 4, 2 and 1, the transformations look like this.
After applying filter of 4 - [3 7 8 1 x x x x 11]
After applying filter of 2 - [3 7 x x x x x x 11]
After applying filter of 1 - [3 x x x x x x x 11]
Now the xor of 3 and 11, i.e. 8, is the answer. I can solve this in O(n * no. of filters) time, but I need a better solution that gives the answer in O(no. of filters) time. Is there any way to take advantage of the properties of xor and/or precompute some results and then answer each query from them? There are many queries with filters, so I need to answer each query in O(no. of filters) time. Any kind of help will be appreciated.
It can be done in O(M) where M is the number of items that pass all filters (independent of the number of filters) by iterating over the array in a particular way, generating only the indexes that pass all the filters.
This is easier to see if you write down the examples starting at zero:
1: 0 2 4 6 8 10 12 14 16 18 (numbers that don't contain 1)
2: 0 1 4 5 8 9 12 13 16 17 (numbers that don't contain 2, etc)
4: 0 1 2 3 8 9 10 11 16 17 18 19
8: 0 1 2 3 4 5 6 7 16 17 18 19
The filters are really just a constraint on the bits of the indexes in the array. That constraint is of the form index & filters = 0, where filters is the sum of all the individual filters (e.g. 1 + 2 + 4 = 7). Given a valid index i, the next valid index i' can be computed with only primitive operations: i' = ((i | filters) + 1) & ~filters. The idea here is to set the filtered bits to one so that the +1 carries through them; the filtered bits are then cleared again to make the index valid. The net effect is that the unfiltered bits are incremented and the filtered bits stay zero.
This gives a simple algorithm to iterate directly over all valid indexes. Start at 0 (which is always valid) and increment using the rule above until the end of the array is reached:
for (int i = 0; i < N; i = ((i | filters) + 1) & ~filters) {
    // do something with array[i], like XOR them all together
}
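As a sanity check, the same iteration can be sketched in Python and run on the example from the question. The function name xor_after_filters is an assumption for this illustration. Indexes are zero-based, so index 0 is the question's position 1 and index 8 is its position 9.

```python
def xor_after_filters(array, filters):
    """XOR of the elements whose zero-based index has none of the filter bits set."""
    mask = 0
    for f in filters:
        mask |= f  # combine the individual power-of-two filters
    result = 0
    i = 0  # index 0 always passes every filter
    while i < len(array):
        result ^= array[i]
        # Set the filtered bits so the +1 carries through them, then clear them.
        i = ((i | mask) + 1) & ~mask
    return result


answer = xor_after_filters([3, 7, 8, 1, 2, 9, 6, 4, 11], [4, 2, 1])
```

Here the loop visits only indexes 0 and 8, so the work is proportional to the number of surviving elements rather than to the array length.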

Extracting array values based on values in different dimension

I've got a problem with subsetting values of an array.
raw.table <- array(data = c(1:12,13:24,rep(1:6, each=2)),
dim=c(3,4,3),
dimnames=list(LETTERS[1:3],1:4,c("target","ctrl","samples")))
The first two dimensions of my array represent some values that I want to do statistics on, and the higher dimensions contain different attributes I want to use to access specific subsets. In this case I have only sample numbers, where two values are always assigned to the same sample number (measurement replicates).
, , target
1 2 3 4
A 1 4 7 10
B 2 5 8 11
C 3 6 9 12
, , ctrl
1 2 3 4
A 13 16 19 22
B 14 17 20 23
C 15 18 21 24
, , samples
1 2 3 4
A 1 2 4 5
B 1 3 4 6
C 2 3 5 6
How do I access the values in the first slice (= target) that have the same sample number as given in the third slice (= samples)? I tried different approaches using unique(), duplicated() and match(), but without success. I just cannot wrap my head around the indexing of arrays -.-
Cheers,
zuup
Form a logical index with a logical test (across dimensions):
> raw.table[,,1] == raw.table[,,3]
1 2 3 4
A TRUE FALSE FALSE FALSE
B FALSE FALSE FALSE FALSE
C FALSE FALSE FALSE FALSE
And use it to select items from the first slice (since both slices have the same dimensions, there is no recycling):
> raw.table[, , 1 ][ raw.table[,,1] == raw.table[,,3] ]
[1] 1
Chaining calls to the Extract-operator is perfectly acceptable in R
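The same select-by-mask idea can be sketched in plain Python for comparison. The helper select_where_equal is a made-up name, and the two nested lists simply restate the target and samples slices printed in the question. Elements are visited column by column to mirror R's ordering.

```python
# The "target" and "samples" slices from the question, as lists of rows.
target = [[1, 4, 7, 10],
          [2, 5, 8, 11],
          [3, 6, 9, 12]]
samples = [[1, 2, 4, 5],
           [1, 3, 4, 6],
           [2, 3, 5, 6]]


def select_where_equal(values, other):
    """Return the entries of values that equal the entry of other at the same position."""
    nrows, ncols = len(values), len(values[0])
    return [values[r][c]
            for c in range(ncols)   # column-major traversal, like R
            for r in range(nrows)
            if values[r][c] == other[r][c]]


matches = select_where_equal(target, samples)
```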

Matlab: Reshaping grid points from ndgrid into N x m matrix [duplicate]

This question pops up quite often in one form or another (see for example here or here). So I thought I'd present it in a general form, and provide an answer which might serve for future reference.
Given an arbitrary number n of vectors of possibly different sizes, generate an n-column matrix whose rows describe all combinations of elements taken from those vectors (Cartesian product).
For example,
vectors = { [1 2], [3 6 9], [10 20] }
should give
combs = [ 1 3 10
1 3 20
1 6 10
1 6 20
1 9 10
1 9 20
2 3 10
2 3 20
2 6 10
2 6 20
2 9 10
2 9 20 ]
The ndgrid function almost gives the answer, but has one caveat: n output variables must be explicitly defined to call it. Since n is arbitrary, the best way is to use a comma-separated list (generated from a cell array with n cells) to serve as output. The resulting n matrices are then concatenated into the desired n-column matrix:
vectors = { [1 2], [3 6 9], [10 20] }; %// input data: cell array of vectors
n = numel(vectors); %// number of vectors
combs = cell(1,n); %// pre-define to generate comma-separated list
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order
combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1
combs = reshape(combs,[],n); %// reshape to obtain desired matrix
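For comparison outside MATLAB, the expected output can be cross-checked with a short Python sketch. This is not a translation of the ndgrid approach; it relies on itertools.product from the standard library, which yields the combinations in the same lexicographical order as the matrix in the question.

```python
from itertools import product

vectors = [[1, 2], [3, 6, 9], [10, 20]]

# Each row is one combination; the last vector varies fastest.
combs = [list(row) for row in product(*vectors)]
```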
A little bit simpler ... if you have the Neural Network toolbox you can simply use combvec:
vectors = {[1 2], [3 6 9], [10 20]};
combs = combvec(vectors{:}).' % Use cells as arguments
which returns a matrix in a slightly different order:
combs =
1 3 10
2 3 10
1 6 10
2 6 10
1 9 10
2 9 10
1 3 20
2 3 20
1 6 20
2 6 20
1 9 20
2 9 20
If you want the matrix that is in the question, you can use sortrows:
combs = sortrows(combvec(vectors{:}).')
% Or equivalently as per @LuisMendo in the comments:
% combs = fliplr(combvec(vectors{end:-1:1}).')
which gives
combs =
1 3 10
1 3 20
1 6 10
1 6 20
1 9 10
1 9 20
2 3 10
2 3 20
2 6 10
2 6 20
2 9 10
2 9 20
If you look at the internals of combvec (type edit combvec in the command window), you'll see that it uses different code than @LuisMendo's answer. I can't say which is more efficient overall.
If you happen to have a matrix whose rows are akin to the earlier cell array you can use:
vectors = [1 2;3 6;10 20];
vectors = num2cell(vectors,2);
combs = sortrows(combvec(vectors{:}).')
I've done some benchmarking on the two proposed solutions. The benchmarking code is based on the timeit function, and is included at the end of this post.
I consider two cases: three vectors of size n, and three vectors of sizes n/10, n and n*10 respectively (both cases give the same number of combinations). n is varied up to a maximum of 240 (I chose this value to avoid the use of virtual memory on my laptop).
The results are given in the following figure. The ndgrid-based solution is seen to consistently take less time than combvec. It's also interesting to note that the time taken by combvec varies a little less regularly in the different-size case.
Benchmarking code
Function for ndgrid-based solution:
function combs = f1(vectors)
n = numel(vectors); %// number of vectors
combs = cell(1,n); %// pre-define to generate comma-separated list
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order
combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1
combs = reshape(combs,[],n);
Function for combvec solution:
function combs = f2(vectors)
combs = combvec(vectors{:}).';
Script to measure time by calling timeit on these functions:
nn = 20:20:240;
t1 = [];
t2 = [];
for n = nn;
%//vectors = {1:n, 1:n, 1:n};
vectors = {1:n/10, 1:n, 1:n*10};
t = timeit(@() f1(vectors));
t1 = [t1; t];
t = timeit(@() f2(vectors));
t2 = [t2; t];
end
Here's a do-it-yourself method that made me giggle with delight, using nchoosek, although it's not better than @Luis Mendo's accepted solution.
For the example given, after 1,000 runs this solution took my machine on average 0.00065935 s, versus 0.00012877 s for the accepted solution. For larger vectors, following @Luis Mendo's benchmarking post, this solution is consistently slower than the accepted answer. Nevertheless, I decided to post it in hopes that maybe you'll find something useful about it:
Code:
tic;
v = {[1 2], [3 6 9], [10 20]};
L = [0 cumsum(cellfun(@length,v))];
V = cell2mat(v);
J = nchoosek(1:L(end),length(v));
J(any(J>repmat(L(2:end),[size(J,1) 1]),2) | ...
any(J<=repmat(L(1:end-1),[size(J,1) 1]),2),:) = [];
V(J)
toc
gives
ans =
1 3 10
1 3 20
1 6 10
1 6 20
1 9 10
1 9 20
2 3 10
2 3 20
2 6 10
2 6 20
2 9 10
2 9 20
Elapsed time is 0.018434 seconds.
Explanation:
L gets the lengths of each vector using cellfun. Although cellfun is basically a loop, it's efficient here considering your number of vectors will have to be relatively low for this problem to even be practical.
V concatenates all the vectors for easy access later (this assumes you entered all your vectors as rows. v' would work for column vectors.)
nchoosek gets all the ways to pick n=length(v) elements from the total number of elements L(end). There will be more combinations here than what we need.
J =
1 2 3
1 2 4
1 2 5
1 2 6
1 2 7
1 3 4
1 3 5
1 3 6
1 3 7
1 4 5
1 4 6
1 4 7
1 5 6
1 5 7
1 6 7
2 3 4
2 3 5
2 3 6
2 3 7
2 4 5
2 4 6
2 4 7
2 5 6
2 5 7
2 6 7
3 4 5
3 4 6
3 4 7
3 5 6
3 5 7
3 6 7
4 5 6
4 5 7
4 6 7
5 6 7
Since there are only two elements in v{1}, we need to throw out any rows where J(:,1)>2. Similarly, where J(:,2)<3, J(:,2)>5, etc... Using L and repmat we can determine whether each element of J is in its appropriate range, and then use any to discard rows that have any bad element.
Finally, these aren't the actual values from v, just indices. V(J) will return the desired matrix.
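The filtering step described above can be sketched in Python as follows. This is an illustration of the same idea, not a port of the MATLAB code: the function name cartesian_via_combinations is hypothetical, indexes are zero-based, and itertools.combinations plays the role of nchoosek.

```python
from itertools import accumulate, combinations


def cartesian_via_combinations(vectors):
    """Build the Cartesian product by filtering index combinations (illustrative only)."""
    # bounds plays the role of L: cumulative lengths, starting at 0.
    bounds = [0] + list(accumulate(len(v) for v in vectors))
    # flat plays the role of V: all vectors concatenated.
    flat = [x for v in vectors for x in v]
    rows = []
    for picks in combinations(range(len(flat)), len(vectors)):
        # Keep a combination only if its c-th index falls inside the c-th vector.
        if all(bounds[c] <= j < bounds[c + 1] for c, j in enumerate(picks)):
            rows.append([flat[j] for j in picks])
    return rows


rows = cartesian_via_combinations([[1, 2], [3, 6, 9], [10, 20]])
```

As in the MATLAB version, most of the generated combinations are discarded (35 candidates for 12 kept rows in this example), which is consistent with this approach being slower than the ndgrid one.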

Rearranging an array using for loop in Matlab

I have a 1 x 15 array of values:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
I need to rearrange them into a 3 x 5 matrix using a for loop:
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
How would I do that?
I'm going to show you three methods. One where you need to have a for loop, and two others when you don't:
Method #1 - for loop
First, create a matrix that is 3 x 5, and keep track of an index that steps through your array. Then create a double for loop to populate the matrix.
index = 1;
array = 1 : 15; %// Array we wish to access
matrix = zeros(3,5); %// Initialize
for m = 1 : 3
for n = 1 : 5
matrix(m,n) = array(index);
index = index + 1;
end
end
matrix =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
Method #2 - Without a for loop
Simply put, use reshape:
matrix = reshape(1:15, 5, 3).';
matrix =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
reshape takes a vector and restructures it into a matrix, populating the matrix column by column. As such, we put 1 to 5 in the first column, 6 to 10 in the second and 11 to 15 in the third, so the intermediate matrix is in fact 5 x 3. This is the transpose of the matrix we want, which is why .' is applied to transpose it back.
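To see why the column-first fill followed by a transpose gives the row-wise layout, here is a small Python sketch. The helpers reshape_column_major and transpose are written just for this illustration and imitate MATLAB's column-major reshape on a list of rows.

```python
def reshape_column_major(values, nrows, ncols):
    """Fill an nrows x ncols matrix column by column, as MATLAB's reshape does."""
    return [[values[r + nrows * c] for c in range(ncols)] for r in range(nrows)]


def transpose(m):
    """Swap rows and columns of a list-of-rows matrix."""
    return [list(col) for col in zip(*m)]


values = list(range(1, 16))
tall = reshape_column_major(values, 5, 3)   # 5 x 3, columns hold 1..5, 6..10, 11..15
matrix = transpose(tall)                    # 3 x 5, rows hold 1..5, 6..10, 11..15
```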
Method #3 - Another method without a for loop (tip of the hat goes to Luis Mendo)
You can use vec2mat, and specify that the matrix should have 5 columns:
matrix = vec2mat(1:15, 5);
matrix =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
vec2mat takes a vector and reshapes it into a matrix of as many columns as you specify in the second parameter. In this case, we need 5 columns.
For the sake of (bsx)fun, here is another option...
bsxfun(@plus,1:5,[0:5:10]')
ans =
1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
Less readable, maybe faster, but who cares for such a small array...
A = [ 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 ] ;
A = reshape( A , 5 , 3 ).' ;
A = 1 2 3 4 5
6 7 8 9 10
11 12 13 14 15
