Removing zeros and then vertically collapse the matrix

Removing zeros and then vertically collapse the matrix - arrays

In MATLAB, say I have a set of square matrices, say A, with trace(A)=0 as follows:
For example,
A = [0 1 2; 3 0 4; 5 6 0]
How can I remove the zeros and then vertically collapse the matrix to become as follow:
A_reduced = [1 2; 3 4; 5 6]
More generally, what if the zeroes can appear anywhere in the column (i.e., not necessarily at the long diagonal)? Assuming, of course, that the total number of zeros for all columns are the same.
The matrix can be quite big (hundreds x hundreds in dimension). So, an efficient way will be appreciated.

To compress the matrix vertically (assuming every column has the same number of zeros):
A_reduced_v = reshape(nonzeros(A), nnz(A(:,1)), []);
To compress the matrix horizontally (assuming every row has the same number of zeros):
A_reduced_h = reshape(nonzeros(A.'), nnz(A(1,:)), []).';

Case #1
Assuming that A has equal number of zeros across all rows, you can compress it horizontally (i.e. per row) with this -
At = A' %//'# transpose input array
out = reshape(At(At~=0),size(A,2)-sum(A(1,:)==0),[]).' %//'# final output
Sample code run -
>> A
A =
0 3 0 2
3 0 0 1
7 0 6 0
1 0 6 0
0 16 0 9
>> out
out =
3 2
3 1
7 6
1 6
16 9
Case #2
If A has equal number of zeros across all columns, you can compress it vertically (i.e. per column) with something like this -
out = reshape(A(A~=0),size(A,1)-sum(A(:,1)==0),[]) %//'# final output
Sample code run -
>> A
A =
0 3 7 1 0
3 0 0 0 16
0 0 6 6 0
2 1 0 0 9
>> out
out =
3 3 7 1 16
2 1 6 6 9

This seems to work, quite fiddly to get the behaviour right with transposing:
>> B = A';
>> C = B(:);
>> reshape(C(~C==0), size(A) - [1, 0])'
ans =
1 2
3 4
5 6

As your zeros are always in the main diagonal you can do the following:
l = tril(A, -1);
u = triu(A, 1);
out = l(:, 1:end-1) + u(:, 2:end)

A correct and very simple way to do what you want is:
A = [0 1 2; 3 0 4; 5 6 0]
A =
0 1 2
3 0 4
5 6 0
A = sort((A(find(A))))
A =
1
2
3
4
5
6
A = reshape(A, 2, 3)
A =
1 3 5
2 4 6

I came up with almost the same solution as Mr E's though with another reshape command. This solution is more universal, as it uses the number of rows in A to create the final matrix, instead of counting the number of zeros or assuming a fixed number of zeros..
B = A.';
B = B(:);
C = reshape(B(B~=0),[],size(A,1)).'

Related

Number 0's and 1's blocks in a binary vector

In MATLAB, there is the bwlabel function, that given a binary vector, for instance x=[1 1 0 0 0 1 1 0 0 1 1 1 0] gives (bwlabel(x)):
[1 1 0 0 0 2 2 0 0 3 3 3 0]
but what I want to obtain is
[1 1 2 2 2 3 3 4 4 5 5 5 6]
I know I can negate x to obtain (bwlabel(~x))
[0 0 1 1 1 0 0 2 2 0 0 0 3]
But how can I combine them?

All in one line:
y = cumsum([1,abs(diff(x))])
Namely, abs(diff(x)) spots changes in the binary vector, and you gain the output with the cumulative sum.

You can still do it using bwlabel by vertically concatenating x and ~x, using 4-connected components for the labeling, then taking the maximum down each column:
>> max(bwlabel([x; ~x], 4))
ans =
1 1 2 2 2 3 3 4 4 5 5 5 6
However, the solution from Bentoy13 is probably a bit faster.

x=[1 1 0 0 0 1 1 0 0 1 1 1 0];
A = bwlabel(x);
B = bwlabel(~x);
if x(1)==1
tmp = A>0;
A(tmp) = 2*A(tmp)-1;
tmp = B>0;
B(tmp) = 2*B(tmp);
C = A+B
elseif x(1)==0
tmp = A>0;
A(tmp) = 2*A(tmp);
tmp = B>1;
B(tmp) = 2*B(tmp)-1;
C = A+B
end
C =
1 1 2 2 2 3 3 4 4 5 5 5 6
You know the first index should remain 1, but the second index should go from 1 to 2, the third from 2 to 3 etc; thus even indices should be doubled and odd indices should double minus one. This is given by A+A-1 for odd entries, and B+B for even entries. So a simple check for whether A or B contains the even points is sufficient, and then simply add the two arrays.

I found this function that does exactly what i wanted:
https://github.com/davidstutz/matlab-multi-label-connected-components
So, clone the repository and compile in matlab using mex :
mex sp_fast_connected_relabel.cpp
Then,
labels = sp_fast_connected_relabel(x);

Finding the distribution of length of islands in a 2D array?

I will explain my question using an example. Imagine you have a 2D matrix like below:
5 4 3 8 0 0
5 4 2 9 1 0
5 6 2 7 2 0
5 4 7 9 0 0
5 6 7 1 2 0
By islands I mean column groups of same elements (except zeros).
I would like to find the histogram of length of islands except those consisting of zero elements.
This matrix has
island-length occurrence
5 1
2 3
1 12
How can I realize this task using Matlab ?

Maybe there are shorter possibilities, but this will do - and it is fully vectorized:
A = [5 4 3 8 0 0
5 4 2 9 1 0
5 6 2 7 2 0
5 4 7 9 0 0
5 6 7 1 2 0]
%// pad zeros to first line of A
X(2:size(A,1)+1,:) = A;
%// differences of X
dX = diff(X)
%// cumulative sum of "logicalized" differences
cs = cumsum(logical(dX(:)))
%// filter out zeros
cs = cs(logical(A(:)))
%// count occurances
aa = accumarray(cs,1)
%// unique occurances
uaa = unique(aa)
%// count unique occurances
occ = hist(aa,uaa).'
%// accumarray may introduce new zeros, filter out
mask = logical(uaa)
%// output
out = [occ(mask) uaa(mask)]
out =
12 1
3 2
1 5

Needed a slight modification to one of my old snippets to filter the zeros. Here you go:
% Your Matrix
A = [ 5 4 3 8 0 0;
5 4 2 9 1 0;
5 6 2 7 2 0;
5 4 7 9 0 0;
5 6 7 1 2 0];
% Find Edges (Ends of Islands)
B = diff(A);
B = [ones(1,size(A,2));B~=0;ones(1,size(A,2))];
% At each column, find distances between island edges, filter out zero islands.
R = cell(size(A,2),1);
for i = 1:size(A,2)
[C ~] = find(B(:,i));
Ac = A(C(1:end-1),i);
D = diff(C);
D(Ac==0)=[];
R{i} = D;
end
% Find histogram of island lengths
R = R(find(~cellfun(#isempty,R)),1);
R = cell2mat(R);
[a,~,c] = unique(R);
out = [a, accumarray(c,ones(size(R)))];

MATLAB: detect and remove mirror imaged pairs in 2 column matrix

I have a matrix
[1 2
3 6
7 1
2 1]
and would like to remove mirror imaged pairs..i.e. output would be either:
[1 2
3 6
7 1]
or
[3 6
7 1
2 1]
Is there a simple way to do this? I can imagine a complicated for loop, something like (or a version which wouldn't delete the original pair..only the duplicates):
for i=1:y
var1=(i,1);
var2=(i,2);
for i=1:y
if array(i,1)==var1 && array(i,2)==var2 | array(i,1)==var2 && array(i,2)==var1
array(i,1:2)=[];
end
end
end
thanks

How's this for simplicity -
A(~any(tril(squeeze(all(bsxfun(#eq,A,permute(fliplr(A),[3 2 1])),2))),2),:)
Playing code-golf? Well, here we go -
A(~any(tril(pdist2(A,fliplr(A))==0),2),:)
If dealing with two column matrices only, here's a simpler version of bsxfun -
M = bsxfun(#eq,A(:,1).',A(:,2)); %//'
out = A(~any(tril(M & M.'),2),:)
Sample run -
A =
1 2
3 6
7 1
6 5
6 3
2 1
3 4
>> A(~any(tril(squeeze(all(bsxfun(#eq,A,permute(fliplr(A),[3 2 1])),2))),2),:)
ans =
1 2
3 6
7 1
6 5
3 4
>> A(~any(tril(pdist2(A,fliplr(A))==0),2),:)
ans =
1 2
3 6
7 1
6 5
3 4

Here a not so fancy, but hopefully understandable and easy way.
% Example matrix
m = [1 2; 3 6 ; 7 1; 2 1; 0 3 ; 3 0];
Comparing m with its flipped version, the function ismember returns mirror_idx, a 1D-vector with each row containing the index of the mirror-row, or 0 if there's none.
[~, mirror_idx] = ismember(m,fliplr(m),'rows');
Go through the indices of the mirror-rows. If you find one "mirrored" row (mirror_idx > 0), set its counter-part to "not mirrored".
for ii = 1:length(mirror_idx)
if (mirror_idx(ii) > 0)
mirror_idx(mirror_idx(ii)) = 0;
end
end
Take only the rows that are marked as not having a mirror.
m_new = m(~mirror_idx,:);
Greetings

How can I find all the cells that have the same values in a multi-dimensional array in octave / matlab

How can I find all the cells that have the same values in a multi-dimensional array?
I can get it partially to work with result=A(:,:,1)==A(:,:,2) but I'm not sure how to also include A(:,:,3)
I tried result=A(:,:,1)==A(:,:,2)==A(:,:,3) but the results come back as all 0 when there should be 1 correct answer
which is where the number 8 is located in the same cell on all the pages of the array. Note: this is just a test the repeating number could be found multiple times and as different numbers.
PS: I'm using octave 3.8.1 which is like matlab
See code below:
clear all, tic
%graphics_toolkit gnuplot %use this for now it's older but allows zoom
A(:,:,1)=[1 2 3; 4 5 6; 7 9 8]; A(:,:,2)=[9 1 7; 6 5 4; 7 2 8]; A(:,:,3)=[2 4 6; 8 9 1; 3 5 8]
[i j k]=size(A)
for ii=1:k
maxamp(ii)=max(max(A(:,:,ii)))
Ainv(:,:,ii)=abs(A(:,:,ii)-maxamp(ii));%the extra max will get the max value of all values in array
end
%result=A(:,:,1)==A(:,:,2)==A(:,:,3)
result=A(:,:,1)==A(:,:,2)
result=double(result); %turns logical index into double to do find
[row col page] = find(result) %gives me the col, row, page
This is the output it gives me:
>>>A =
ans(:,:,1) =
1 2 3
4 5 6
7 9 8
ans(:,:,2) =
9 1 7
6 5 4
7 2 8
ans(:,:,3) =
2 4 6
8 9 1
3 5 8
i = 3
j = 3
k = 3
maxamp = 9
maxamp =
9 9
maxamp =
9 9 9
result =
0 0 0
0 1 0
1 0 1
row =
3
2
3
col =
1
2
3
page =
1
1
1

Use bsxfun(MATLAB doc, Octave doc) and check to see if broadcasting the first slice is equal across all slices with a call to all(MATLAB doc, Octave doc):
B = bsxfun(#eq, A, A(:,:,1));
result = all(B, 3);
If we're playing code golf, a one liner could be:
result = all(bsxfun(#eq, A, A(:,:,1)), 3);
The beauty of the above approach is that you can have as many slices as you want in the third dimension, other than just three.
Example
%// Your data
A(:,:,1)=[1 2 3; 4 5 6; 7 9 8];
A(:,:,2)=[9 1 7; 6 5 4; 7 2 8];
A(:,:,3)=[2 4 6; 8 9 1; 3 5 8];
B = bsxfun(#eq, A, A(:,:,1));
result = all(B, 3);
... gives us:
>> result
result =
0 0 0
0 0 0
0 0 1
The above makes sense since the third row and third column for all slices is the only value where every slice shares this same value (i.e. 8).

Here's another approach: compute differences along third dimension and detect when all those differences are zero:
result = ~any(diff(A,[],3),3);

You can do
result = A(:,:,1) == A(:,:,2) & A(:,:,1) == A(:,:,3);

sum the elements along the third dimension and divide it with the number of dimensions. We get back the original value if the values are the same in all dimension. Otherwise a different (e.g. a decimal) value. Then find the location where A and the summation are equal over the third dimension.
all( A == sum(A,3)./size(A,3),3)
ans =
0 0 0
0 0 0
0 0 1
or
You could also do
all(A==repmat(sum(A,3)./size(A,3),[1 1 size(A,3)]),3)
where repmat(sum(A,3)./size(A,3),[1 1 size(A,3)]) would highlight the implicit broadcasting of this when compared with A.
or
you skip the broadcasting altogether and just compare it with the first slice of A
A(:,:,1) == sum(A,3)./size(A,3)
Explanation
3 represents the third dimension .
sum(A,3) means that we are taking the sum over the third dimension.
Then we divide that sum by the number of dimensions.
It's basically the average value for that position in the third dimension.
If you add three values and then divide it by three then you get the original value back.
For example, A(3,3,:) is [8 8 8]. (8+8+8)/3 = 8.
If you take another example, i.e. the value above, A(2,3,:) = [6 4 1].
Then (6+4+1)/3=3.667. This is not equal to A(2,3,:).
sum(A,3)./size(A,3)
ans =
4.0000 2.3333 5.3333
6.0000 6.3333 3.6667
5.6667 5.3333 8.0000
Therefore, we know that the elements are not the same
throughout the third dimension. This is just a trick I use
to determine that. You also have to remember that
sum(A,3)./size(A,3) is originally a 3x3x1 matrix
that will be automatically expanded (i.e. broadcasted) to a
3x3x3 matrix when we do the comparison with A (A == sum(A,3)./size(A,3)).
The result of that comparison will be a logical array with 1 for the positions that are the same throughout the third dimension.
A == sum(A,3)./size(A,3)
ans =
ans(:,:,1) =
0 0 0
0 0 0
0 0 1
ans(:,:,2) =
0 0 0
1 0 0
0 0 1
ans(:,:,3) =
0 0 0
0 0 0
0 0 1
Then use all(....,3) to get those. The result is a 3x3x1
matrix where a 1 indicates that the value is the same in the
third dimension.
all( A == sum(A,3)./size(A,3),3)
ans =
0 0 0
0 0 0
0 0 1

Element-wise array replication according to a count [duplicate]

This question already has answers here:
Repeat copies of array elements: Run-length decoding in MATLAB
(5 answers)
Closed 8 years ago.
My question is similar to this one, but I would like to replicate each element according to a count specified in a second array of the same size.
An example of this, say I had an array v = [3 1 9 4], I want to use rep = [2 3 1 5] to replicate the first element 2 times, the second three times, and so on to get [3 3 1 1 1 9 4 4 4 4 4].
So far I'm using a simple loop to get the job done. This is what I started with:
vv = [];
for i=1:numel(v)
vv = [vv repmat(v(i),1,rep(i))];
end
I managed to improve by preallocating space:
vv = zeros(1,sum(rep));
c = cumsum([1 rep]);
for i=1:numel(v)
vv(c(i):c(i)+rep(i)-1) = repmat(v(i),1,rep(i));
end
However I still feel there has to be a more clever way to do this... Thanks

Here's one way I like to accomplish this:
>> index = zeros(1,sum(rep));
>> index(cumsum([1 rep(1:end-1)])) = 1;
index =
1 0 1 0 0 1 1 0 0 0 0
>> index = cumsum(index)
index =
1 1 2 2 2 3 4 4 4 4 4
>> vv = v(index)
vv =
3 3 1 1 1 9 4 4 4 4 4
This works by first creating an index vector of zeroes the same length as the final count of all the values. By performing a cumulative sum of the rep vector with the last element removed and a 1 placed at the start, I get a vector of indices into index showing where the groups of replicated values will begin. These points are marked with ones. When a cumulative sum is performed on index, I get a final index vector that I can use to index into v to create the vector of heterogeneously-replicated values.

To add to the list of possible solutions, consider this one:
vv = cellfun(#(a,b)repmat(a,1,b), num2cell(v), num2cell(rep), 'UniformOutput',0);
vv = [vv{:}];
This is much slower than the one by gnovice..

What you are trying to do is to run-length decode. A high level reliable/vectorized utility is the FEX submission rude():
% example inputs
counts = [2, 3, 1];
values = [24,3,30];
the result
rude(counts, values)
ans =
24 24 3 3 3 30
Note that this function performs the opposite operation as well, i.e. run-length encodes a vector or in other words returns values and the corresponding counts.

accumarray function can be used to make the code work if zeros exit in rep array
function vv = repeatElements(v, rep)
index = accumarray(cumsum(rep)'+1, 1);
vv = v(cumsum(index(1:end-1))+1);
end
This works similar to solution of gnovice, except that indices are accumulated instead being assigned to 1. This allows to skip some indices (3 and 6 in the example below) and remove corresponding elements from the output.
>> v = [3 1 42 9 4 42];
>> rep = [2 3 0 1 5 0];
>> index = accumarray(cumsum(rep)'+1, 1)'
index =
0 0 1 0 0 2 1 0 0 0 0 2
>> cumsum(index(1:end-1))+1
ans =
1 1 2 2 2 4 5 5 5 5 5
>> vv = v(cumsum(index(1:end-1))+1)
vv =
3 3 1 1 1 9 4 4 4 4 4