Get the first 2 non-zero elements from every row of matrix

I have a matrix A like this:
A = [ 1 0 2 4; 2 3 1 0; 0 0 3 4 ]
Apart from zeros, the elements in each row of A are unique, and each row has at least 2 non-zero elements.
I want to create a new matrix B from A, where each row in B contains the first two non-zero elements of the corresponding row in A.
B = [ 1 2 ; 2 3 ; 3 4 ]
It is easy with loops, but I need a vectorized solution.

Here's a vectorized approach:
A = [1 0 2 4; 2 3 1 0; 0 0 3 4]; % example input
N = 2; % number of wanted nonzeros per row
[~, ind] = sort(~A, 2); % sort each row of A by the logical negation of its values.
% Get the indices of the sorting
ind = ind(:, 1:N); % keep first N columns
B = A((1:size(A,1)).' + (ind-1)*size(A,1)); % generate linear index and use into A
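To see why this works: ~A is 0 at the nonzero entries and 1 at the zeros, and sort keeps equal elements in their original order, so the column indices of the nonzero entries come first, left to right. A minimal sketch of the same idea, using sub2ind instead of the manual linear-index arithmetic (the variable names rows and lin are illustrative, not from the answer):

```matlab
A = [1 0 2 4; 2 3 1 0; 0 0 3 4];       % example input from the question
N = 2;                                  % number of wanted nonzeros per row
[~, ind] = sort(~A, 2);                 % nonzero columns first, original order kept
ind = ind(:, 1:N);                      % column indices of the first N nonzeros
rows = repmat((1:size(A,1)).', 1, N);   % matching row index for every entry of ind
lin = sub2ind(size(A), rows, ind);      % same linear indices as the manual formula
B = A(lin);                             % B is [1 2; 2 3; 3 4]
```

If a row has fewer than N nonzero entries, the remaining positions are filled with zeros from that row rather than raising an error.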

Here is another vectorised approach.
A_bool = A ~= 0; A_size = size(A); A_rows = A_size(1); % A ~= 0 (rather than A > 0) also catches negative entries
A_boolsum = cumsum( A_bool, 2 ) .* A_bool; % for each row, and at each column,
% count how many nonzero instances
% have occurred up to that column
% (inclusive), and then 'zero' back
% all original zero locations.
[~, ColumnsOfFirsts ] = max( A_boolsum == 1, [], 2 );
[~, ColumnsOfSeconds ] = max( A_boolsum == 2, [], 2 );
LinearIndicesOfFirsts = sub2ind( A_size, [1 : A_rows].', ColumnsOfFirsts );
LinearIndicesOfSeconds = sub2ind( A_size, [1 : A_rows].', ColumnsOfSeconds );
Firsts = A(LinearIndicesOfFirsts );
Seconds = A(LinearIndicesOfSeconds);
Result = horzcat( Firsts, Seconds )
% Result =
% 1 2
% 2 3
% 3 4
PS. This code uses only the common subset of MATLAB and Octave.

Related

Finding number(s) that is(are) repeated consecutively most often

Given this array for example:
a = [1 2 2 2 1 3 2 1 4 4 4 5 1]
I want to find a way to check which numbers are repeated consecutively most often. In this example, the output should be [2 4] since both 2 and 4 are repeated three times consecutively.
Another example:
a = [1 1 2 3 1 1 5]
This should return [1 1] because there are separate instances of 1 being repeated twice.
This is my simple code. I know there is a better way to do this:
function val=longrun(a)
    b = a(:)';
    b = [b, max(b)+1];
    val = [];
    sum = 1;
    max_occ = 0;
    for i = 1:max(size(b))
        q = b(i);
        for j = i:size(b,2)
            if (q == b(j))
                sum = sum + 1;
            else
                if (sum > max_occ)
                    max_occ = sum;
                    val = [];
                    val = [val, q];
                elseif (max_occ == sum)
                    val = [val, q];
                end
                sum = 1;
                break;
            end
        end
    end
    if (size(a,2) == 1)
        val = val'
    end
end

Here's a vectorized way:
a = [1 2 2 2 1 3 2 1 4 4 4 5 1]; % input data
t = cumsum([true logical(diff(a))]); % assign a label to each run of equal values
[~, n, z] = mode(t); % maximum run length and corresponding labels
result = a(ismember(t,z{1})); % build result with repeated values
result = result(1:n:end); % remove repetitions
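As a check, here is the same code traced on the second example from the question, where two separate runs of 1 tie for the longest run. The comments show the intermediate values worked out by hand:

```matlab
a = [1 1 2 3 1 1 5];                    % second example from the question
t = cumsum([true logical(diff(a))]);    % run labels: [1 1 2 3 4 4 5]
[~, n, z] = mode(t);                    % n = 2; z{1} holds the tied labels 1 and 4
result = a(ismember(t, z{1}));          % all elements of the longest runs: [1 1 1 1]
result = result(1:n:end);               % one element per run: [1 1]
```

Because each run gets its own label, the two runs of 1 are counted separately, which is what the question asks for.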

One solution could be:
%Dummy data
a = [1 2 2 2 1 3 2 1 4 4 4 5 5]
%Preallocation
x = ones(1,numel(a));
%Loop
for ii = 2:numel(a)
    if a(ii-1) == a(ii)
        x(ii) = x(ii-1)+1;
    end
end
%Get the result
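a(x==max(x))
This uses a simple for loop. x(ii) counts how many times in a row the value a(ii) has occurred so far, so the entries where x is maximal mark the ends of the longest runs.
Or you could also vectorize the process:
x = a(find(a-circshift(a,1,2)==0)); %compare a with a copy of itself shifted by 1 and keep only the repeated elements
u = unique(x); %get the unique values of x
h = histc(x,u); %count how often each value is repeated
res = u(h==max(h)) %get the result
Note that circshift wraps around, so a(1) is also compared with a(end), and that h counts repeats per value over the whole vector, so separate runs of the same value are added together.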
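A possible loop-free alternative, sketched here under the assumption that one output entry per longest run is wanted: compute the run starts and run lengths directly with diff and find. The variable names starts and lens are illustrative.

```matlab
a = [1 2 2 2 1 3 2 1 4 4 4 5 1];        % first example from the question
starts = find([true, diff(a) ~= 0]);    % index where each run begins
lens = diff([starts, numel(a) + 1]);    % length of each run
res = a(starts(lens == max(lens)));     % value of every run of maximal length: [2 4]
```

Since every run is measured on its own, a value that appears in two separate longest runs is reported twice, as in the question's second example.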

Sum up vector values till threshold, then start again

I have a vector a = [1 3 4 2 1 5 6 3 2]. Now I want to create a new vector 'b' with the cumsum of a, but after reaching a threshold, let's say 5, cumsum should reset and start again till it reaches the threshold again, so the new vector should look like this:
b = [1 4 4 2 3 5 6 3 5]
Any ideas?

You could build a sparse matrix that, when multiplied by the original vector, returns the cumulative sums. I haven't timed this solution against the others, but I suspect it will be the fastest for large vectors a.
% Original data
a = [1 3 4 2 1 5 6 3 2];
% Threshold
th = 5;
% Cumulative sum corrected by threshold
b = cumsum(a)/th;
% Group indices to be summed by checking for equality,
% rounded down, between each cumsum value and its previous value. We add one to
% prevent NaNs from occurring in the next step.
c = cumsum(floor(b) ~= floor([0,b(1:end-1)]))+1;
% Build the sparse matrix, remove all values that are in the upper
% triangle.
S = tril(sparse(c.'./c == 1));
% In case you use MATLAB R2016a or older:
% S = tril(sparse(bsxfun(@rdivide,c.',c) == 1));
% Matrix multiplication to create o.
o = S*a.';
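To make the grouping step more concrete, here is a small sketch of the same computation with the intermediate values written out. It builds the matrix densely with bsxfun(@eq, ...) purely for readability, which is a substitution of mine and not what the answer does:

```matlab
a  = [1 3 4 2 1 5 6 3 2];
th = 5;
b  = cumsum(a) / th;                                   % [0.2 0.8 1.6 2 2.2 3.2 4.4 5 5.4]
c  = cumsum(floor(b) ~= floor([0, b(1:end-1)])) + 1;   % group labels: [1 1 2 3 3 4 5 6 6]
S  = tril(double(bsxfun(@eq, c.', c)));                % S(i,j) = 1 if j <= i and same group
o  = S * a.';                                          % per-group cumulative sums (column)
% o.' is [1 4 4 2 3 5 6 3 5]
```

Row i of S selects the elements of a that share a group with a(i) and come no later than it, so the product is a cumulative sum within each group.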

By normalizing the argument of cumsum with the threshold and flooring, you can get grouping indices for accumarray, which can then compute the cumulative sum group by group:
t = 5;
a = [1 3 4 2 1 5 6 3 2];
%// cumulative sum of normalized vector a
n = cumsum(a/t);
%// subs for accumarray
subs = floor( n ) + 1;
%// cumsum of every group
aout = accumarray( subs(:), (1:numel(subs)).', [], @(x) {cumsum(a(x))});
%// gather results;
b = [aout{:}]

One way is to use a loop. You create the first cumulative sum cs, and then, as long as elements in cs are larger than your threshold th, you replace them with elements from the cumulative sum of the remaining elements in a.
Because some elements in a might themselves be larger than th, this loop would be infinite unless we also exclude these elements.
Here is a simple solution with a while loop:
a = [1 3 4 2 1 5 6 3 2];
th = 5;
cs = cumsum(a);
while any(cs>th & cs~=a) % if 'cs' has values larger than 'th',
    % and there are any values smaller than 'th' left in 'a',
    % sum all the values in 'a' that come after 'cs' reached 'th',
    % excluding values that are larger than 'th'
    cs(cs>th & cs~=a) = cumsum(a(cs>th & cs~=a));
end

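Calculate the cumulative sum and replace the values at the indices that satisfy your condition.
a = [1 3 4 2 1 5 6 3 2] ;
b = [1 4 4 2 3 5 6 3 5] ; % expected result, for reference
iwant = a ;
a_sum = cumsum(a) ;
iwant(a_sum<5) = a_sum(a_sum<5) ; % fills in only the leading run, before the first reset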
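For cross-checking any of the approaches, a plain loop can serve as a reference. The reset rule used here is inferred from the example in the question and is an assumption: the running sum restarts at a(k) whenever adding a(k) would push it above the threshold. The variable name running is illustrative.

```matlab
a  = [1 3 4 2 1 5 6 3 2];
th = 5;
b  = zeros(size(a));
running = 0;                       % current running sum
for k = 1:numel(a)
    if running + a(k) > th
        running = a(k);            % threshold would be exceeded: start again
    else
        running = running + a(k);  % keep accumulating
    end
    b(k) = running;
end
% b is [1 4 4 2 3 5 6 3 5]
```

On the question's data this agrees with the expected vector. For other inputs, such as a = [3 3 3], this loop gives [3 3 3], whereas grouping by floor(cumsum(a)/th) gives [3 3 6], so it is worth deciding which behaviour is actually wanted.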

Block diagonal matrix from columns

Suppose I have an m x n matrix A.
Is there a way to create B, an (m*n) x n matrix whose "diagonal" is formed by A's columns?
Example:
A = [1 2;
3 4]
B = [1 0;
3 0;
0 2;
0 4]

Here is a way:
1. Convert A to a cell array of its columns, using mat2cell;
2. From that cell array generate a comma-separated list, and use it as an input to blkdiag.
Code:
A = [1 2; 3 4]; %// example data
C = mat2cell(A, size(A,1), ones(1,size(A,2))); %// step 1
B = blkdiag(C{:}); %// step 2
This produces
B =
1 0
3 0
0 2
0 4

Here is a short script to accomplish this. It works for any dimensions of A.
A=[1 2; 3 4];
[R, C] = size(A);
B = zeros(R*C, C); % preallocate, so B does not grow inside the loop
for i=1:C
    B( 1+R*(i-1) : R*i , i ) = A(:,i);
end
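The same placement can also be written without a loop or cell array, by computing the target row and column of every element of A(:). This is only a sketch of one possible alternative, shown on a non-square A; the names rows and cols are illustrative.

```matlab
A = [1 2 3; 4 5 6];                       % non-square example, m = 2, n = 3
[m, n] = size(A);
rows = (1:m*n).';                         % A(:) fills the rows of B in order
cols = kron((1:n).', ones(m, 1));         % column j of A goes to column j of B
B = zeros(m*n, n);
B(sub2ind(size(B), rows, cols)) = A(:);
% B is [1 0 0; 4 0 0; 0 2 0; 0 5 0; 0 0 3; 0 0 6]
```

For this input the result matches what blkdiag produces from the columns of A.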

Find consecutive values in 3D array

Say I have an array of size 100x150x30 (a 100x150 geographical grid with 30 values for each grid point) and want to find runs of consecutive values along the third dimension with a length of at least 3.
I would like to find the maximum length of these runs of consecutive values, as well as the number of occurrences.
I have tried this on a simple vector:
var=[20 21 50 70 90 91 92 93];
a=diff(var);
q = diff([0 a 0] == 1);
v = find(q == -1) - find(q == 1);
v = v+1;
v2 = v(v>3);
v3 = max(v2); % maximum length: 4
z = numel(v2); % number: 1
Now I'd like to apply this to the 3rd dimension of my array.
With A being my 100x150x30 array, I've come this far:
aa = diff(A, 1, 3);
b1 = diff((aa == 1),1,3);
b2 = zeros(100,150,1);
qq = cat(3,b2,b1,b2);
But I'm stuck on the next step, which would be: find(qq == -1) - find(qq == 1);. I can't make it work.
Is there a way to put it in a loop, or do I have to find the consecutive values another way?
Thanks for any help!

A = randi(25,100,150,30); %// generate random array
tmpsize = size(A); %// get its size
B = diff(A,1,3); %// difference
v3 = zeros(tmpsize([1 2])); %//initialise
z = zeros(tmpsize([1 2]));
for ii = 1:tmpsize(1) %// double loop over all grid points
    for jj = 1:tmpsize(2)
        q = diff([0 squeeze(B(ii,jj,:)).' 0] == 1);
        v = find(q == -1) - find(q == 1);
        v=v+1;
        v2=v(v>3);
        try %// if v2 is empty, set to nan
            v3(ii,jj)=max(v2);
        catch
            v3(ii,jj)=nan;
        end
        z(ii,jj)=numel(v2);
    end
end
The above seems to work. It simply loops over the first two dimensions and processes the vector along the third dimension at each grid point.
The part where I think you were stuck was using squeeze to get the vector to put in your variable q.
The try/catch is there solely to prevent an empty v2 (no runs found) from throwing an error in the assignment to v3. In that case the entry is simply set to nan, though you can of course switch that to 0.

Here's one vectorized approach -
%// Parameters
[m,n,r] = size(var);
max_occ_thresh = 2 %// Threshold for consecutive occurrences
% Get indices of start and stop of consecutive number islands
df = diff(var,[],3)==1;
A = reshape(df,[],size(df,3));
dfA = diff([zeros(size(A,1),1) A zeros(size(A,1),1)],[],2).';
[R1,C1] = find(dfA==1);
[R2,C2] = find(dfA==-1);
%// Get interval lengths
interval_lens = R2 - R1+1;
%// Get max consecutive occurrences across dim-3
max_len = zeros(m,n);
maxIDs = accumarray(C1,interval_lens,[],@max);
max_len(1:numel(maxIDs)) = maxIDs
%// Get number of consecutive occurrences that are above max_occ_thresh
num_occ = zeros(m,n);
counts = accumarray(C1,interval_lens>max_occ_thresh);
num_occ(1:numel(counts)) = counts
Sample run -
var(:,:,1) =
2 3 1 4 1
1 4 1 5 2
var(:,:,2) =
2 2 3 1 2
1 3 5 1 4
var(:,:,3) =
5 2 4 1 2
1 5 1 5 1
var(:,:,4) =
3 5 5 1 5
5 1 3 4 3
var(:,:,5) =
5 5 4 4 4
3 4 5 2 2
var(:,:,6) =
3 4 4 5 3
2 5 4 2 2
max_occ_thresh =
2
max_len =
0 0 3 2 2
0 2 0 0 0
num_occ =
0 0 1 0 0
0 0 0 0 0
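For a result that can be checked by hand, the same start/stop idea can be run on a tiny 1x2 grid whose first grid point holds the vector from the question. This is a sketch under assumptions: the second grid point, the names p1, p2, data, A2, pad and d, and the choice min_len = 3 are all illustrative, and the output size is passed to accumarray so that grid points without any run still get an entry.

```matlab
p1 = [20 21 50 70 90 91 92 93];       % runs: 20-21 (length 2), 90-93 (length 4)
p2 = [ 1  2  3  7  8  9 10 11];       % runs: 1-3 (length 3), 7-11 (length 5)
data = cat(2, reshape(p1, 1, 1, []), reshape(p2, 1, 1, []));   % 1x2x8 array
[m, n, ~] = size(data);
min_len = 3;                           % assumed minimum run length

df  = diff(data, 1, 3) == 1;           % true where the next value is one larger
A2  = reshape(df, [], size(df, 3));    % one row per grid point
pad = zeros(size(A2, 1), 1);
d   = diff([pad, A2, pad], 1, 2).';    % +1 marks a run start, -1 the end of a run
[R1, C1] = find(d == 1);               % C1: grid point (linear index) of each run
[R2, ~]  = find(d == -1);
lens = R2 - R1 + 1;                    % run lengths, counted in elements

max_len = zeros(m, n);
max_len(:) = accumarray(C1, lens, [m*n, 1], @max);              % [4 5]
num_occ = zeros(m, n);
num_occ(:) = accumarray(C1, double(lens >= min_len), [m*n, 1]); % [1 2]
```

The first grid point reproduces the values from the question's vector example (maximum length 4, one qualifying run).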

Get elements of a matrix that are greater than sum of their two indices in row major order

I'm writing a function called large_elements that takes as input an array named X, which is a matrix or a vector. The function identifies those elements of X that are greater than the sum of their two indexes.
For example, if the element X(2,3) is 6, then that element would be identified because 6 > (2 + 3). The output of the function gives the indexes (row and column subscripts) of such elements, found in row-major order. It is a matrix with exactly two columns. The first column contains the row indexes, while the second column contains the corresponding column indexes.
Here is an example, the statement
indexes = large_elements([1 4; 5 2; 6 0])
should give the output like this:
[1 2; 2 1; 3 1]
If no such element exists, the function returns an empty array.
I have come up with the following code:
function indexes = large_elements(A)
    [r c] = size(A);
    ind = 1;
    for ii = 1:r
        for jj = 1:c
            if A(ii,jj) > ii + jj
                indexes(ind,:) = [ii jj];
                ind = ind + 1;
            else
                indexes = [];
            end
        end
    end
end
But the results are not as expected. Any help would be appreciated.

One vectorised approach using bsxfun and find:
A = randi(8,5); %// Your matrix
%// finding sum of the indexes for all elements
indSum = bsxfun(@plus, (1:size(A,1)).', 1:size(A,2));
%// generating a mask of which elements satisfy the given condition (i.e. A > indSum)
%// Transposing the mask and finding corresponding indexes
[c,r] = find(bsxfun(@gt, A, indSum).') ;
%// getting the matrix by appending row subs and col subs
out = [r,c]
Results:
Input A:
>> A
A =
4 4 7 2 2
1 3 4 8 3
8 8 2 8 7
8 3 4 5 1
4 1 1 1 1
Output in row-major order:
out =
1 1
1 2
1 3
2 4
3 1
3 2
3 4
4 1
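Note: Getting subs in row-major order is tricky here, because find scans in column-major order; that is why the mask is transposed and the two outputs of find are swapped.
Also, here is a corrected version of your loop-based approach:
[r, c] = size(A);
ind = 0;
indexes = [];
for ii = 1:r
    for jj = 1:c
        if A(ii,jj) > ii + jj
            ind = ind + 1;
            indexes(ind,:) = [ii jj];
        end
    end
end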
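The transpose trick can also be checked on the example from the question. The sketch below uses ndgrid in place of bsxfun to build the index sums, which is a substitution of mine; the names rowIdx, colIdx and mask are illustrative.

```matlab
X = [1 4; 5 2; 6 0];                                     % example from the question
[rowIdx, colIdx] = ndgrid(1:size(X,1), 1:size(X,2));     % row and column index of every element
mask = X > rowIdx + colIdx;                              % elements larger than their index sum
[c, r] = find(mask.');                                   % transposed scan gives row-major order
indexes = [r(:), c(:)];                                  % [1 2; 2 1; 3 1]
```

Using r(:) and c(:) keeps the output two columns wide even when find returns row vectors, which can happen when X is a column vector.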

That is because whenever you encounter an element that is not greater than the sum of its indices, you reset indexes to an empty array, so all results found so far are discarded. You should not reset it in the else branch; initialize it once before the loops instead.