MATLAB: Creating a blockwise permutation

I have a vector from 1 to 40 and want to shuffle it in such a way that each block of four integers (ten blocks in total) is shuffled only within itself.
For example: 3 4 2 1 | 7 6 5 8 | 9 11 10 12 | ...
My original idea was to append ten permutation vectors to each other and then add a 1-to-40 vector to the big permutation vector, but it didn't work as expected and was logically wrong.
Does anyone have an idea how to solve this?

data = 10:10:120; % input: values to be permuted
group_size = 4; % input: group size
D = reshape(data, group_size, []); % step 1
[~, ind] = sort(rand(size(D)), 1); % step 2
result = D(bsxfun(@plus, ind, (0:size(D,2)-1)*group_size)); % step 3
result = result(:).'; % step 4
Example result:
result =
20 10 30 40 60 50 70 80 110 100 120 90
How it works
1. Reshape the data vector into a matrix D, such that each group is a column. This is done with reshape.
2. Generate a matrix, ind, where each column contains the indices of a permutation of the corresponding column of D. This is done by generating independent, uniform random values (rand), sorting each column, and taking the indices of the sorting (second output of sort).
3. Apply ind as row indices within each column of D. This requires converting to linear indices, which can be done with bsxfun (or with sub2ind, but that's usually slower).
4. Reshape back into a vector.
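Applied to the 1-to-40 vector from the question, the four steps can be sketched as follows. This is only a minimal sketch: the names offsets, shuffled, blocks and recovered are illustrative, and the last lines are a sanity check that every block still contains its own four values.

```matlab
n = 40;                 % length of the vector in the question
group_size = 4;         % block length
data = 1:n;

D = reshape(data, group_size, []);          % one block per column
[~, ind] = sort(rand(size(D)), 1);          % independent permutation per column
offsets = (0:size(D,2)-1) * group_size;     % linear-index offset of each column
shuffled = D(bsxfun(@plus, ind, offsets));  % permute within each column
shuffled = shuffled(:).';                   % back to a row vector

% Sorting each block again must recover the original vector
blocks = reshape(shuffled, group_size, []);
recovered = sort(blocks, 1);
all_blocks_intact = isequal(recovered(:).', data);  % true if no value left its block
```

Sorting inside each block undoes any within-block permutation, so the check fails only if a value has moved to a different block.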

You can use A = A(randperm(length(A))) to shuffle an array.
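Example in Octave (the chained indexing v(i:i+3)(...) is not valid MATLAB syntax):
v = 1:40; % vector to shuffle
for i = 1:4:40
v(i:i+3) = v(i:i+3)(randperm(4)); % shuffle the current block of four
end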
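Octave accepts the chained indexing in that loop, but MATLAB does not. A version that should run in both could look like the sketch below, where the block length is kept in an illustrative variable named block and the shuffled positions are built by adding an offset to randperm.

```matlab
v = 1:40;                           % vector to shuffle
block = 4;                          % block length
for i = 1:block:numel(v)
    idx = i - 1 + randperm(block);  % shuffled positions inside the current block
    v(i:i+block-1) = v(idx);        % right-hand side is read before the assignment
end
```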

Related

MATLAB: extract values from 3d matrix at given row and column indices using sub2ind 3d

I have a 3d matrix A that holds my data. At multiple locations defined by row and column indices, given in the matrix row_col_idx, I want to extract all data along the third dimension, as shown below:
A = cat(3,[1:3;4:6], [7:9;10:12],[13:15;16:18],[19:21;22:24]) %matrix(2,3,4)
row_col_idx=[1 1;1 2; 2 3];
idx = sub2ind(size(A(:,:,1)), row_col_idx(:,1),row_col_idx(:,2));
out=nan(size(A,3),size(row_col_idx,1));
for k=1:size(A,3)
temp=A(:,:,k);
out(k,:)=temp(idx);
end
out
The output of this code is as follows:
A(:,:,1) =
1 2 3
4 5 6
A(:,:,2) =
7 8 9
10 11 12
A(:,:,3) =
13 14 15
16 17 18
A(:,:,4) =
19 20 21
22 23 24
out =
1 2 6
7 8 12
13 14 18
19 20 24
The output is as expected. However, the actual A and row_col_idx are huge, so this code is computationally expensive. Is there a way to vectorize this code to avoid the loop and the temp matrix?
This can be vectorized using linear indexing and implicit expansion:
out = A( row_col_idx(:,1) + ...
(row_col_idx(:,2)-1)*size(A,1) + ...
(0:size(A,1)*size(A,2):numel(A)-1) ).';
The above builds an indexing matrix as large as the output. If this is unacceptable due to memory limitations, it can be avoided by reshaping A:
sz = size(A); % store size A
A = reshape(A, [], sz(3)); % collapse first two dimensions
out = A(row_col_idx(:,1) + (row_col_idx(:,2)-1)*sz(1),:).'; % linear indexing along
% first two dims of A
A = reshape(A, sz); % reshape back A, if needed
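Both vectorized variants can be checked against the output shown in the question. The sketch below uses bsxfun in place of implicit expansion so that it also runs on releases older than R2016b; the names idx, page_offsets, out1, out2 and expected are illustrative.

```matlab
A = cat(3, [1:3;4:6], [7:9;10:12], [13:15;16:18], [19:21;22:24]);
row_col_idx = [1 1; 1 2; 2 3];
sz = size(A);

% Linear index of each (row, column) pair within one 2-D slice
idx = row_col_idx(:,1) + (row_col_idx(:,2)-1)*sz(1);

% Variant 1: add one offset per slice along the third dimension
page_offsets = 0:sz(1)*sz(2):numel(A)-1;
out1 = A(bsxfun(@plus, idx, page_offsets)).';

% Variant 2: collapse the first two dimensions and index rows
A2 = reshape(A, [], sz(3));
out2 = A2(idx, :).';

expected = [1 2 6; 7 8 12; 13 14 18; 19 20 24];  % output shown in the question
both_match = isequal(out1, expected) && isequal(out2, expected);
```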
A more efficient method is to loop over the rows of the row_col_idx matrix and select the elements from A directly. I have compared the two loop-based methods for a large matrix, and as the timings below show, the second is much faster.
For the A given in the question, it gives the same output.
A = rand([2,3,10000000]);
row_col_idx=[1 1;1 2; 2 3];
idx = sub2ind(size(A(:,:,1)), row_col_idx(:,1),row_col_idx(:,2));
out=nan(size(A,3),size(row_col_idx,1));
tic;
for k=1:size(A,3)
temp=A(:,:,k);
out(k,:)=temp(idx);
end
time1 = toc;
%% More efficient method:
out2 = nan(size(A,3),size(row_col_idx,1));
tic;
for jj = 1:size(row_col_idx,1)
out2(:,jj) = [A(row_col_idx(jj,1),row_col_idx(jj,2),:)];
end
time2 = toc;
fprintf('Time calculation 1: %e\n',time1);
fprintf('Time calculation 2: %e\n',time2);
Gives as output:
Time calculation 1: 1.954714e+01
Time calculation 2: 2.998120e-01

reversing shuffling of array by indexing

I have a matrix whose columns were shuffled according to some index. I now want to find the index that 'unshuffles' the array back into its original state.
For example:
myArray = [10 20 30 40 50 60]';
myShuffledArray = nan(6,3)
myShufflingIndex = nan(6,3)
for x = 1:3
myShufflingIndex(:,x) = randperm(length(myArray))';
myShuffledArray(:,x) = myArray(myShufflingIndex(:,x));
end
Now I want to find a matrix myUnshufflingIndex, which reverses the shuffling to get an array myUnshuffledArray = [10 20 30 40 50 60; 10 20 30 40 50 60; 10 20 30 40 50 60]'
I expect to use myUnshufflingIndex in the following way:
for x = 1:3
myUnShuffledArray(:,x) = myShuffledArray(myUnshufflingIndex(:,x), x);
end
For example, if one column in myShufflingIndex = [2 4 6 3 5 1]', then the corresponding column in myUnshufflingIndex is [6 1 4 2 5 3]'
Any ideas on how to get myUnshufflingIndex in a neat vectorised way? Also, is there a better way to unshuffle the array columnwise than in a loop?
You can get myUnshufflingIndex with a single call to sort:
[~, myUnshufflingIndex] = sort(myShufflingIndex, 1);
Alternatively, you don't even need to compute myUnshufflingIndex, since you can just use myShufflingIndex on the left hand side of the assignment to unshuffle the data:
for x = 1:3
myUnShuffledArray(myShufflingIndex(:, x), x) = myShuffledArray(:, x);
end
And if you'd like to avoid a for loop while unshuffling, you can vectorize it by adding an offset to each column of your index, turning it into a matrix of linear indices instead of just row indices:
[nRows, nCols] = size(myShufflingIndex);
myUnshufflingIndex = myShufflingIndex+repmat(0:nRows:(nRows*(nCols-1)), nRows, 1);
myUnShuffledArray = nan(nRows, nCols); % Preallocate
myUnShuffledArray(myUnshufflingIndex) = myShuffledArray;
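As a quick sanity check, the sort-based inverse can be combined with the column offsets to unshuffle by indexing on the right-hand side. This is a sketch under the question's setup; the names offsets and unshuffled are illustrative.

```matlab
myArray = [10 20 30 40 50 60]';
nCols = 3;
nRows = numel(myArray);
myShufflingIndex = zeros(nRows, nCols);
for x = 1:nCols
    myShufflingIndex(:,x) = randperm(nRows)';
end
myShuffledArray = myArray(myShufflingIndex);   % shuffle every column at once

% Inverse permutation of each column via sort
[~, myUnshufflingIndex] = sort(myShufflingIndex, 1);

% Convert row indices to linear indices and unshuffle without a loop
offsets = 0:nRows:(nRows*(nCols-1));
unshuffled = myShuffledArray(bsxfun(@plus, myUnshufflingIndex, offsets));

restored = isequal(unshuffled, repmat(myArray, 1, nCols));  % true if every column is back in order
```

Sorting a permutation returns, as its second output, the position of each value 1..nRows, which is exactly the inverse permutation.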

How to find maximum value and location of each slice of 3D array in MATLAB?

What is the fastest way of calculating the maximum value, with its corresponding index, of each 'slice' of a 3D array?
Say you have A with n slices (here I just made each slice 10 by 10, but this can be changed to any size):
A = rand(10,10,n);
You can reshape it to an n-column matrix, then take the maximum of each column:
[val,ind] = max(reshape(A,[],n),[],1);
The first output val will be an n-element vector with all the maximum values, and the second output ind will be their row index in the reshaped A.
Then you get the size of the slices:
sz = size(A);
and use it to find the row (r) and column (c) of each maximum element in each slice:
[r,c] = ind2sub(sz(1:2),ind)
So in this example (using rand and 10x10x6 array for A) you would get something like this at the end (but with different values):
val =
0.99861 0.98895 0.98681 0.99991 0.96057 0.99176
r =
9 7 3 8 2 9
c =
1 1 8 10 10 5
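To confirm that r and c really point at the maxima, the subscripts can be converted back to linear indices into the full 3D array. This is a small sketch with n = 6; lin and positions_ok are illustrative names.

```matlab
n = 6;
A = rand(10, 10, n);
sz = size(A);

[val, ind] = max(reshape(A, [], n), [], 1);   % maximum of each slice
[r, c] = ind2sub(sz(1:2), ind);               % row and column inside the slice

% Look up A(r(k), c(k), k) for every slice k and compare with val
lin = sub2ind(sz, r, c, 1:n);
positions_ok = isequal(A(lin), val);          % true if r and c locate the maxima
```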
If you have a matrix A with n layers, you can apply the max function in two steps to get a 1 x 1 x n matrix with the maximum of each layer (this gives the values only, not their locations):
A = rand(10,10,n);
layer_max = max(max(A,[],1),[],2); % 1 x 1 x n matrix, use squeeze to remove extra dims
layer_max = squeeze(layer_max);

summing over a matrix in different parts of that matrix in matlab

In a matrix, how can we sum the elements part by part? Suppose the original matrix can be divided into smaller m-by-n blocks. I want to sum all the elements of each m-by-n block and put that number in place of the block.
For example, consider the following matrix. I want to sum every four elements and create another matrix:
A = [1 2 3 4
5 6 7 8
9 10 11 12
13 14 15 16];
And after summing i want to have:
B = [14 22
46 54];
In this example I summed the 4 elements of each 2-by-2 block; for example, the sum of 1, 2, 5 and 6 sits in the first element of the new matrix.
Let
m = 2; %// number of rows per block
n = 2; %// number of columns per block
You can do the sum with blockproc (from the Image Processing Toolbox), which is very suited for this task:
B = blockproc(A, [m n], @(x) sum(x.data(:)));
Or, if you build the appropriate indices, you can use accumarray:
[ii jj] = ndgrid(1:size(A,1), 1:size(A,2));
B = accumarray([ceil(ii(:)/m) ceil(jj(:)/n)], A(:))
One approach -
B = squeeze(sum(reshape(sum(reshape(A,m,[])),size(A,1)/m,n,[]),2))
Another approach if you would like to avoid squeeze, which is sometimes slower -
B = reshape(sum(reshape(reshape(sum(reshape(A,m,[])),size(A,1)/m,[])',n,[])),[],size(A,1)/m)'
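With m = n = 2 it is hard to see which dimension each block size applies to, so the sketch below compares the approaches on a non-square case. It assumes that size(A,1) is divisible by m and size(A,2) by n; the test matrix and the names B_ref, B_acc and B_rs are illustrative. Row indices are grouped by m and column indices by n.

```matlab
A = reshape(1:24, 4, 6);   % 4-by-6 test matrix
m = 2;                     % rows per block
n = 3;                     % columns per block

% Reference result from an explicit double loop
B_ref = zeros(size(A,1)/m, size(A,2)/n);
for bi = 1:size(B_ref,1)
    for bj = 1:size(B_ref,2)
        blk = A((bi-1)*m+(1:m), (bj-1)*n+(1:n));
        B_ref(bi,bj) = sum(blk(:));
    end
end

% accumarray: row index grouped by m, column index grouped by n
[ii, jj] = ndgrid(1:size(A,1), 1:size(A,2));
B_acc = accumarray([ceil(ii(:)/m) ceil(jj(:)/n)], A(:));

% reshape/sum approach
B_rs = squeeze(sum(reshape(sum(reshape(A,m,[])), size(A,1)/m, n, []), 2));

all_agree = isequal(B_acc, B_ref) && isequal(B_rs, B_ref);
```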

matlab: eliminate elements from array

I have quite a big array. To keep things simple, let's reduce it to:
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];
So there is a group of 1's (4 elements), 2's (2 elements), 3's (4 elements), 4's (2 elements) and 5's (8 elements). Now I want to keep only the columns that belong to a group of 3 or more elements. The result would be:
B = [1 1 1 1 3 3 3 3 5 5 5 5 5 5 5 5];
I was doing it with a for loop, scanning the 1's, 2's, 3's and so on separately, but it's extremely slow with big arrays...
Thanks for any suggestions how to do it in more efficient way :)
Art.
A general approach
If your vector is not necessarily sorted, then you need to count the number of occurrences of each element in the vector. histc does just that:
elem = unique(A);
counts = histc(A, elem);
B = A;
B(ismember(A, elem(counts < 3))) = []
The last line picks the elements that have fewer than 3 occurrences and deletes them.
An approach for a grouped vector
If your vector is "semi-sorted", that is if similar elements in the vector are grouped together (as in your example), you can speed things up a little by doing the following:
start_idx = find(diff([0, A]))
counts = diff([start_idx, numel(A) + 1]);
B = A;
B(ismember(A, A(start_idx(counts < 3)))) = []
Again, note that the vector need not be entirely sorted; it is enough that equal elements are adjacent to each other.
Here is my two-liner
counts = accumarray(A', 1);
B = A(ismember(A, find(counts>=3)));
accumarray is used to count the individual members of A. find extracts the ones that meet your '3 or more elements' criterion. Finally, ismember tells you where they are in A. Note that A need not be sorted. Of course, using A directly as the subscript of accumarray only works for positive integer values in A.
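If A contains values that are not positive integers, one possible workaround is to count group numbers instead of the values themselves, using the third output of unique. This is a sketch with made-up data; ic, counts and keep are illustrative names.

```matlab
A = [0.5 0.5 0.5 -2 -2 7.1 7.1 7.1 7.1 0.5];   % illustrative, unsorted, non-integer

[~, ~, ic] = unique(A);              % ic maps each element to its group number
counts = accumarray(ic(:), 1);       % occurrences per group
keep = reshape(counts(ic(:)) >= 3, size(A));   % mask with the same shape as A
B = A(keep);                         % values that occur at least 3 times
```

Here 0.5 and 7.1 each occur four times and are kept, while -2 occurs only twice and is removed.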
What you are describing is called run-length encoding.
There is software for this in Matlab on the FileExchange. Or you can do it directly as follows:
len = diff([ 0 find(A(1:end-1) ~= A(2:end)) length(A) ]);
val = A(logical([ A(1:end-1) ~= A(2:end) 1 ]));
Once you have your run-length encoding you can remove elements based on the length. i.e.
idx = (len>=3)
len = len(idx);
val = val(idx);
And then decode to get the array you want:
i = cumsum(len);
j = zeros(1, i(end));
j(i(1:end-1)+1) = 1;
j(1) = 1;
B = val(cumsum(j));
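On releases that provide repelem (MATLAB R2015a or later), the decoding step can be written more compactly. The sketch below repeats the encoding with an intermediate variable named ends (illustrative) and then expands only the runs that are long enough.

```matlab
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];

ends = [find(A(1:end-1) ~= A(2:end)) numel(A)];  % last position of every run
len  = diff([0 ends]);                           % run lengths
val  = A(ends);                                  % run values

keep = len >= 3;                                 % runs with 3 or more elements
B = repelem(val(keep), len(keep));               % decode the remaining runs
```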
Here's another way to do it using matlab built-ins.
% Set up
A=[1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5];
threshold=2;
% Get the unique elements of the array
uniqueElements=unique(A);
% Count how many times each unique element occurs
counts=histc(A,uniqueElements);
% Write which elements should be kept
toKeep=uniqueElements(counts>threshold);
% Make a logical index
indexer=false(size(A));
for i=1:length(toKeep)
% For every unique element we want to keep select the indices in A that
% are equal
indexer=indexer|(toKeep(i)==A);
end
% Apply index
B=A(indexer);

Resources