I would like to construct a function
[B, ind] = extract_ones(A)
which removes some sub-arrays from a binary array A in arbitrary dimensions, such that the remaining array B is the largest possible array with only 1's, and I also would like to record in ind that where each of the 1's in B comes from.
Example 1
Assume A is a 2-D array as shown
A =
1 1 0 0 0 1
1 1 1 0 1 1
0 0 0 1 0 1
1 1 0 1 0 1
1 1 0 1 0 1
1 1 1 1 1 1
After removing A(3,:) and A(:,3:5), we have the output B
B =
1 1 1
1 1 1
1 1 1
1 1 1
1 1 1
which is the largest array with only ones by removing rows and columns of A.
As the fifteen 1's of B corresponds to
A(1,1) A(1,2) A(1,6)
A(2,1) A(2,2) A(2,6)
A(4,1) A(4,2) A(4,6)
A(5,1) A(5,2) A(5,6)
A(6,1) A(6,2) A(6,6)
respectively, or equivalently
A(1) A(7) A(31)
A(2) A(8) A(32)
A(4) A(10) A(34)
A(5) A(11) A(35)
A(6) A(12) A(36)
so, the output ind looks like (of course ind's shape does not matter):
ind = [1 2 4 5 6 7 8 10 11 12 31 32 34 35 36]
Example 2
If the input A is constructed by
A = ones(6,3,4,3);
A(2,2,2,2) = 0;
A(4,1,3,3) = 0;
A(1,1,4,2) = 0;
A(1,1,4,1) = 0;
Then, by deleting the minimum cuboids containing A(2,2,2,2), A(4,1,3,3), A(1,1,4,3) and A(1,1,4,1), i.e. after deleting these entries
A(2,:,:,:)
A(:,1,:,:)
Then the remaining array B will be composed by 1's only. And the ones in B corresponds to
A([1,3:6],2:3,1:4,1:3)
So, the output ind lists the subscripts transformed into indices, i.e.
ind = [7 9 10 11 12 13 15 16 17 18 25 27 28 29 30 31 33 34 35 36 43 45 46 47 48 49 51 52 53 54 61 63 64 65 66 67 69 70 71 72 79 81 82 83 84 85 87 88 89 90 97 99 100 101 102 103 105 106 107 108 115 117 118 119 120 121 123 124 125 126 133 135 136 137 138 139 141 142 143 144 151 153 154 155 156 157 159 160 161 162 169 171 172 173 174 175 177 178 179 180 187 189 190 191 192 193 195 196 197 198 205 207 208 209 210 211 213 214 215 216]
As the array needed to be processed as above is in 8-D, and it should be processed more than once, so can anyone give me opinions on how to composing the program doing this task fast?
My work so far [Added at 2 am (GMT-4), 2nd Aug 2017]
My idea was that I delete the sub-arrays with the largest proportion of zero one by one. And here is my work so far:
Inds = reshape(1:numel(A),size(A)); % Keep track on which 1's survive.
cont = true;
while cont
sz = size(A);
zero_percentage = 0;
Test_location = [];
% This nested for loops are for determining which sub-array of A has the
% maximum proportion of zeros.
for J = 1 : ndims(A)
for K = 1 : sz(J)
% Location is in the form of (_,_,_,...,_)
% where the J-th blank is K, the other blanks are colons.
Location = strcat('(',repmat(':,',1,(J-1)),int2str(K),repmat(',:',1,(ndims(A)-J)),')');
Test_array = eval(strcat('A',Location,';'));
N = numel(Test_array);
while numel(Test_array) ~= 1
Test_array = sum(Test_array);
end
test_zero_percentage = 1 - (Test_array/N);
if test_zero_percentage > zero_percentage
zero_percentage = test_zero_percentage;
Test_location = Location;
end
end
end
% Delete the array with maximum proportion of zeros
eval(strcat('A',Test_location,'= [];'))
eval(strcat('Inds',Test_location,'= [];'))
% Determine if there are still zeros in A. If there are, continue the while loop.
cont = A;
while numel(cont) ~= 1
cont = prod(cont);
end
cont = ~logical(cont);
end
But I encountered two problems:
1) It may be not efficient to check all arrays in all sub-dimensions one-by-one.
2) The result does not contain the most number of rectangular ones. for example, I tested my work using a 2-dimensional binary array A
A =
0 0 0 1 1 0
0 1 1 0 1 1
1 0 1 1 1 1
1 0 0 1 1 1
0 1 1 0 1 1
0 1 0 0 1 1
1 0 0 0 1 1
1 0 0 0 0 0
It should return me the result as
B =
1 1
1 1
1 1
1 1
1 1
1 1
Inds =
34 42
35 43
36 44
37 45
38 46
39 47
But, instead, the code returned me this:
B =
1 1 1
1 1 1
1 1 1
Inds =
10 34 42
13 37 45
14 38 46
*My work so far 2 [Added at 12noon (GMT-4), 2nd Aug 2017]
Here is my current amendment. This may not provide the best result.
This may give a fairly OK approximation to the problem, and this does not give empty Inds. But I am still hoping that there is a better solution.
function [B, Inds] = Finding_ones(A)
Inds = reshape(1:numel(A),size(A)); % Keep track on which 1's survive.
sz0 = size(A);
cont = true;
while cont
sz = size(A);
zero_percentage = 0;
Test_location = [];
% This nested for loops are for determining which sub-array of A has the
% maximum proportion of zeros.
for J = 1 : ndims(A)
for K = 1 : sz(J)
% Location is in the form of (_,_,_,...,_)
% where the J-th blank is K, the other blanks are colons.
Location = strcat('(',repmat(':,',1,(J-1)),int2str(K),repmat(',:',1,(ndims(A)-J)),')');
Test_array = eval(strcat('A',Location,';'));
N = numel(Test_array);
Test_array = sum(Test_array(:));
test_zero_percentage = 1 - (Test_array/N);
if test_zero_percentage > zero_percentage
eval(strcat('Testfornumel = numel(A',Location,');'))
if Testfornumel < numel(A) % Preventing the A from being empty
zero_percentage = test_zero_percentage;
Test_location = Location;
end
end
end
end
% Delete the array with maximum proportion of zeros
eval(strcat('A',Test_location,'= [];'))
eval(strcat('Inds',Test_location,'= [];'))
% Determine if there are still zeros in A. If there are, continue the while loop.
cont = A;
while numel(cont) ~= 1
cont = prod(cont);
end
cont = ~logical(cont);
end
B = A;
% command = 'i1, i2, ... ,in'
% here, n is the number of dimansion of A.
command = 'i1';
for J = 2 : length(sz0)
command = strcat(command,',i',int2str(J));
end
Inds = reshape(Inds,numel(Inds),1); %#ok<NASGU>
eval(strcat('[',command,'] = ind2sub(sz0,Inds);'))
% Reform Inds into a 2-D matrix, which each column indicate the location of
% the 1 originated from A.
Inds = squeeze(eval(strcat('[',command,']')));
Inds = reshape(Inds',length(sz0),numel(Inds)/length(sz0));
end
It seems a difficult problem to solve, since the order of deletion can change a lot in the final result. If in your first example you start with deleting all the columns that contain a 0, you don't end up with the desired result.
The code below removes the row or column with the most zeros and keeps going until it's only ones. It keeps track of the rows and columns that are deleted to find the indexes of the remaining ones.
function [B,ind] = extract_ones( A )
if ~islogical(A),A=(A==1);end
if ~any(A(:)),B=[];ind=[];return,end
B=A;cdel=[];rdel=[];
while ~all(B(:))
[I,J] = ind2sub(size(B),find(B==0));
ih=histcounts(I,[0.5:1:size(B,1)+0.5]); %zero's in rows
jh=histcounts(J,[0.5:1:size(B,2)+0.5]); %zero's in columns
if max(ih)>max(jh)
idxr=find(ih==max(ih),1,'first');
B(idxr,:)=[];
%store deletion
rdel(end+1)=idxr+sum(rdel<=idxr);
elseif max(ih)==max(jh)
idxr=find(ih==max(ih),1,'first');
idxc=find(jh==max(jh),1,'first');
B(idxr,:)=[];
B(:,idxc)=[];
%store deletions
rdel(end+1)=idxr+sum(rdel<=idxr);
cdel(end+1)=idxc+sum(cdel<=idxc);
else
idxc=find(jh==max(jh),1,'first');
B(:,idxc)=[];
%store deletions
cdel(end+1)=idxc+sum(cdel<=idxc);
end
end
A(rdel,:)=0;
A(:,cdel)=0;
ind=find(A);
Second try: Start with a seed point and try to grow the matrix in all dimensions. The result is the start and finish point in the matrix.
function [ res ] = seed_grow( A )
if ~islogical(A),A=(A==1);end
if ~any(A(:)),res={};end
go = true;
dims=size(A);
ind = cell([1 length(dims)]); %cell to store find results
seeds=A;maxmat=0;
while go %main loop to remove all posible seeds
[ind{:}]=find(seeds,1,'first');
S = [ind{:}]; %the seed
St = [ind{:}]; %the end of the seed
go2=true;
val_dims=1:length(dims);
while go2 %loop to grow each dimension
D=1;
while D<=length(val_dims) %add one to each dimension
St(val_dims(D))=St(val_dims(D))+1;
I={};
for ct = 1:length(S),I{ct}=S(ct):St(ct);end %generate indices
if St(val_dims(D))>dims(val_dims(D))
res=false;%outside matrix
else
res=A(I{:});
end
if ~all(res(:)) %invalid addition to dimension
St(val_dims(D))=St(val_dims(D))-1; %undo
val_dims(D)=[]; D=D-1; %do not try again
if isempty(val_dims),go2=false;end %end of growth
end
D=D+1;
end
end
%evaluate the result
mat = prod((St+1)-S); %size of matrix
if mat>maxmat
res={S,St};
maxmat=mat;
end
%tried to expand, now remove seed option
for ct = 1:length(S),I{ct}=S(ct):St(ct);end %generate indices
seeds(I{:})=0;
if ~any(seeds),go=0;end
end
end
I tested it using your matrix:
A = [0 0 0 1 1 0
0 1 1 0 1 1
1 0 1 1 1 1
1 0 0 1 1 1
0 1 1 0 1 1
0 1 0 0 1 1
1 0 0 0 1 1
1 0 0 0 0 0];
[ res ] = seed_grow( A );
for ct = 1:length(res),I{ct}=res{1}(ct):res{2}(ct);end %generate indices
B=A(I{:});
idx = reshape(1:numel(A),size(A));
idx = idx(I{:});
And got the desired result:
B =
1 1
1 1
1 1
1 1
1 1
1 1
idx =
34 42
35 43
36 44
37 45
38 46
39 47
A = [5 10 16 22 28 32 36 44 49 56]
B = [2 1 1 2 1 2 1 2 2 2]
How to get this?
C1 = [10 16 28 36]
C2 = [5 22 32 44 49 56]
C1 needs to get the values from A, only in the positions in which B is 1
C2 needs to get the values from A, only in the positions in which B is 2
You can do this this way :
C1 = A(B==1);
C2 = A(B==2);
B==1 gives a logical array : [ 0 1 1 0 1 0 1 0 0 0 ].
A(logicalArray) returns elements for which the value of logicalArray is true (it is termed logical indexing).
A and logicalArray must of course have the same size.
It is probably the fastest way of doing this operation in matlab.
For more information on indexing, see matlab documentation.
To achieve this with an arbitrary number of groups (not just two as in your example), use accumarray with an a anoynmous function to collect the values in each group into a cell. To preserve order, B needs to be sorted first (and the same order needs to be applied to A):
[B_sort, ind_sort] = sort(B);
C = accumarray(B_sort.', A(ind_sort).', [], #(x){x.'});
This gives the result in a cell array:
>> C{1}
ans =
10 16 28 36
>> C{2}
ans =
5 22 32 44 49 56
I need to loop through coloumn 1 of a matrix and return (i) when I have come across ALL of the elements of another vector which i can predefine.
check_vector = [1:43] %% I dont actually need to predefine this - i know I am looking for the numbers 1 to 43.
matrix_a coloumn 1 (which is the only coloumn i am interested in looks like this for example
1
4
3
5
6
7
8
9
10
11
12
13
14
16
15
18
17
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
1
3
4
2
6
7
8
We want to loop through matrix_a and return the value of (i) when we have hit all of the numbers in the range 1 to 43.
In the above example we are looking for all the numbers from 1 to 43 and the iteration will end round about position 47 in matrix_a because it is at this point that we hit number '2' which is the last number to complete all numbers in the sequence 1 to 43.
It doesnt matter if we hit several of one number on the way, we count all those - we just want to know when we have reached all the numbers from the check vector or in this example in the sequence 1 to 43.
Ive tried something like:
completed = []
for i = 1:43
complete(i) = find(matrix_a(:,1) == i,1,'first')
end
but not working.
Assuming A as the input column vector, two approaches could be suggested here.
Approach #1
With arrayfun -
check_vector = [1:43]
idx = find(arrayfun(#(n) all(ismember(check_vector,A(1:n))),1:numel(A)),1)+1
gives -
idx =
47
Approach #2
With customary bsxfun -
check_vector = [1:43]
idx = find(all(cumsum(bsxfun(#eq,A(:),check_vector),1)~=0,2),1)+1
To find the first entry at which all unique values of matrix_a have already appeared (that is, if check_vector consists of all unique values of matrix_a): the unique function almost gives the answer:
[~, ind] = unique(matrix_a, 'first');
result = max(ind);
Someone might have a more compact answer but is this what your after?
maxIndex = 0;
for ii=1:length(a)
[f,index] = ismember(ii,a);
maxIndex=max(maxIndex,max(index));
end
maxIndex
Here is one solution without a loop and without any conditions on the vectors to be compared. Given two vectors a and b, this code will find the smallest index idx where a(1:idx) contains all elements of b. idx will be 0 when b is not contained in a.
a = [ 1 4 3 5 6 7 8 9 10 11 12 13 14 16 15 18 17 19 20 21 22 23 24 25 26 ...
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 1 3 4 2 6 7 8 50];
b = 1:43;
[~, Loca] = ismember(b,a);
idx = max(Loca) * all(Loca);
Some details:
ismember(b,a) checks if all elements of b can be found in a and the output Loca lists the indices of these elements within a. The index will be 0, if the element cannot be found in a.
idx = max(Loca) then is the highest index in this list of indices, so the smallest one where all elements of b are found within a(1:idx).
all(Loca) finally checks if all indices in Loca are nonzero, i.e. if all elements of b have been found in a.