I have a matrix A of size m x n and another matrix b of size 1 x n (in Matlab).
The matrix b is such that it consists of sequences of 1s, then sequences of 2s, then sequences of 3s, etc. up to some value k.
(For example b = [1 1 1 2 2 2 3 4 4], n = 9)
I want to take A, and for each row in A, choose the max in each segment, zeroing everything else in that subsequence.
So, for example, for a row A = [0 -1 2 3 4 1 3 4 5], I would get
[0 0 2 0 4 0 3 0 5]
If there are multiple rows in A (m > 1), this should happen for each row.
I can do it easily using for loops, but it works very slowly, because I loop both over m and n.
Is there a "oneliner" to do it in Matlab, or something simple that works fast?
If A is a single row, accumarray can do the job using an ad hoc function:
result = accumarray(b(:), A(:), [], @(x) {x==max(x)});
result = vertcat(result{:}).' .* A;
Not sure how fast this will be, since it uses cells.
If A has several rows, you can use a loop over the rows.
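For several rows, one alternative to looping over the rows is to loop over the k segments instead, which is usually the shorter loop. The following is only a sketch, assuming b is laid out as in the question; the two-row A and the names result, cols, segMax and keep are made up for illustration. Ties within a segment keep every maximal entry.

```matlab
A = [0 -1 2 3 4 1 3 4 5;
     5  4 3 2 1 0 -1 -2 -3];   % hypothetical two-row example
b = [1 1 1 2 2 2 3 4 4];

result = zeros(size(A));
for g = 1:max(b)
    cols   = (b == g);                        % columns belonging to segment g
    segMax = max(A(:, cols), [], 2);          % per-row maximum within the segment
    keep   = bsxfun(@eq, A(:, cols), segMax); % true where an entry equals its row's maximum
    result(:, cols) = A(:, cols) .* keep;     % zero everything else in the segment
end
% result(1,:) is [0 0 2 0 4 0 3 0 5], as in the question
```

bsxfun is used so the sketch also runs on releases without implicit expansion.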
Related
I have a vector a = [1 3 4 2 1 5 6 3 2]. Now I want to create a new vector 'b' with the cumsum of a, but after reaching a threshold, let's say 5, cumsum should reset and start again till it reaches the threshold again, so the new vector should look like this:
b = [1 4 4 2 3 5 6 3 5]
Any ideas?
You could build a sparse matrix that, when multiplied by the original vector, returns the cumulative sums. I haven't timed this solution versus others, but I strongly suspect this will be the fastest for large arrays of a.
% Original data
a = [1 3 4 2 1 5 6 3 2];
% Threshold
th = 5;
% Cumulative sum corrected by threshold
b = cumsum(a)/th;
% Group indices to be summed by checking for equality,
% rounded down, between each cumsum value and its next value. We add one to
% prevent NaNs from occurring in the next step.
c = cumsum(floor(b) ~= floor([0,b(1:end-1)]))+1;
% Build the sparse matrix, remove all values that are in the upper
% triangle.
S = tril(sparse(c.'./c == 1));
% In case you use MATLAB R2016a or older:
% S = tril(sparse(bsxfun(@rdivide,c.',c) == 1));
% Matrix multiplication to create o.
o = S*a.';
By normalizing the argument of cumsum with the threshold and flooring the result, you get grouping indices for accumarray, which can then compute the cumulative sum group by group:
t = 5;
a = [1 3 4 2 1 5 6 3 2];
%// cumulative sum of normalized vector a
n = cumsum(a/t);
%// subs for accumarray
subs = floor( n ) + 1;
%// cumsum of every group
aout = accumarray( subs(:), (1:numel(subs)).', [], @(x) {cumsum(a(x))});
%// gather results;
b = [aout{:}]
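The same floor-based grouping can also be applied without cells. The sketch below is an assumption-laden variant, not the answer's own code: it computes the group indices from the integer cumulative sum (so the division happens only once) and subtracts from each running total whatever had accumulated before its group started. The names total, firstOfGroup, offset, offsetAtStart and groupId are hypothetical.

```matlab
a = [1 3 4 2 1 5 6 3 2];
t = 5;

total = cumsum(a);                       % plain running total
subs  = floor(total / t) + 1;            % group index of every element

firstOfGroup  = [true, diff(subs) ~= 0]; % marks the first element of each group
offset        = [0, total(1:end-1)];     % total accumulated before each element
offsetAtStart = offset(firstOfGroup);    % one offset per group that occurs
groupId       = cumsum(firstOfGroup);    % consecutive numbering of the groups

b = total - offsetAtStart(groupId);      % cumulative sum restarted in every group
```

For the example data this gives subs = [1 1 2 3 3 4 5 6 6] and b = [1 4 4 2 3 5 6 3 5].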
One way is to use a loop. You create the first cumulative sum cs, and then, as long as elements in cs are larger than your threshold th, you replace them with the cumulative sum of the remaining elements of a.
Because some elements of a might themselves be larger than th, this loop would never terminate unless we also exclude those elements.
Here is a simple solution with a while loop:
a = [1 3 4 2 1 5 6 3 2];
th = 5;
cs = cumsum(a);
while any(cs>th & cs~=a) % while 'cs' has values larger than 'th',
    % and there are any values smaller than 'th' left in 'a',
    % sum all the values in 'a' that come after 'cs' reached 'th',
    % excluding values that are larger than 'th'
    cs(cs>th & cs~=a) = cumsum(a(cs>th & cs~=a));
end
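For comparison, the rule from the question can also be written as a plain for loop. This is a minimal sketch that assumes the rule is "restart from the current element whenever adding it would push the running sum above the threshold"; the variable name running is made up for illustration.

```matlab
a  = [1 3 4 2 1 5 6 3 2];
th = 5;

b = zeros(size(a));
running = 0;
for k = 1:numel(a)
    if running + a(k) > th
        running = a(k);            % adding a(k) would exceed th: restart from a(k)
    else
        running = running + a(k);  % otherwise keep accumulating
    end
    b(k) = running;
end
% b is [1 4 4 2 3 5 6 3 5] for this example
```

A single pass like this is a convenient reference when checking vectorized versions on other inputs.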
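Calculate the cumulative sum and replace the values at the indices that satisfy your condition. Note that, for the example data, this only fills in the entries before the cumulative sum first reaches the threshold; the later entries keep their original values from a.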
a = [1 3 4 2 1 5 6 3 2] ;
b = [1 4 4 2 3 5 6 3 5] ;
iwant = a ;
a_sum = cumsum(a) ;
iwant(a_sum<5) = a_sum(a_sum<5) ;
Let's say I have two (large) vectors a=[0 0 0 0 0] and b=[1 2 3 4 5] of the same size and one index vector ind=[1 5 2 1] with values in {1,...,length(a)}. I would like to compute
for k = 1:length(ind)
a(ind(k)) = a(ind(k)) + b(ind(k));
end
% a = [2 2 0 0 5]
That is, I want to add the entries of b listed in ind to a, counting an index once for every time it appears.
a(ind)=a(ind)+b(ind);
% a = [1 2 0 0 5]
is much faster, of course, but ignores indices which appear multiple times.
How can I speed up the above code?
We can use unique to identify the unique index values and use the third output to determine which elements of ind share the same index. We can then use accumarray to sum all the elements of b which share the same index. We then add these to the original value of a at these locations.
[uniqueinds, ~, inds] = unique(ind);
a(uniqueinds) = a(uniqueinds) + accumarray(inds, b(ind)).';
If max(ind) == numel(a), this can be simplified to the following, since accumarray simply returns 0 for any index that does not appear in ind.
a(:) = a(:) + accumarray(ind(:), b(ind));
Another approach based on accumarray:
a(:) = a(:) + accumarray(ind(:), b(ind(:)), [numel(a) 1]);
How it works
accumarray with two column vectors as inputs aggregates the values of the second input corresponding to the same index in the first. The third input is used here to force the result to be the same size as a, padding with zeros if needed.
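As a small illustration of that explanation, the sketch below runs the data from the question through accumarray and keeps the intermediate result in a separate variable. The names vals and sums are hypothetical; the values are forced into a column only to keep the inputs' orientation explicit.

```matlab
a   = [0 0 0 0 0];
b   = [1 2 3 4 5];
ind = [1 5 2 1];

vals = reshape(b(ind), [], 1);                  % values to add, one per entry of ind
sums = accumarray(ind(:), vals, [numel(a) 1]);  % per-index totals, zero-padded to numel(a)

a(:) = a(:) + sums;
% sums is [2; 2; 0; 0; 5], so a becomes [2 2 0 0 5]
```

Index 1 appears twice in ind, so b(1) is counted twice, which is exactly what the plain assignment a(ind) = a(ind) + b(ind) misses.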
I have a 5 by 3 matrix, e.g the following:
A=[1 1 1; 2 2 2; 3 3 3; 4 4 4; 5 5 5]
I run a for loop:
for i = 1:5
AA = A(i)'*A(i);
end
My question is how to store each of the 5 (3 by 3) AA matrices?
Thanks.
You could pre-allocate enough memory to the AA matrix to hold all the results:
[r,c] = size(A); % get the rows and columns of A (r and c respectively)
AA = zeros(c,c,r); % pre-allocate memory to AA for all 5 products
% (so we have 5 3x3 arrays)
Now run almost the same loop as above, but note that A(i) in the original code returns only a single element, whereas you want the full row. Row i with all of its columns can be written as A(i,1:3), or simply with the colon as A(i,:):
for i=1:r
AA(:,:,i) = A(i,:)' * A(i,:);
end
In the above, A(i,:) is the ith row of A and we are setting all rows and columns in the third dimension (i) of AA to the result of the product.
Assuming, as in Geoff's answer, that you mean A(i,:)'*A(i,:) (to get 5 matrices of size 3x3 in your example), you can do it in one line with bsxfun and permute:
AA = bsxfun(@times, permute(A, [3 2 1]), permute(A, [2 3 1]));
(I'm also assuming that your matrices only contain real numbers, as in your example. If by ' you really mean conjugate transpose, you need to add a conj in the above).
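To convince yourself that the loop and the bsxfun one-liner agree, both can be run side by side on the matrix from the question. This is only a quick check; the names AA_loop and AA_vec are made up here, and .' is used because the data is real.

```matlab
A = [1 1 1; 2 2 2; 3 3 3; 4 4 4; 5 5 5];
[r, c] = size(A);

% Loop version: one c-by-c outer product per row of A
AA_loop = zeros(c, c, r);
for i = 1:r
    AA_loop(:,:,i) = A(i,:).' * A(i,:);
end

% Vectorized version: rows of A are moved to the third dimension
AA_vec = bsxfun(@times, permute(A, [3 2 1]), permute(A, [2 3 1]));

same = isequal(AA_loop, AA_vec);   % true if both give the same 3x3x5 array
```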
I have two cell arrays R and C (two vectors, R with n elements and C with m elements), and my task is to compare each element of R with every element of R, and each element of C with every element of C. Here, comparing means finding the intersection of two cells. As a result I want two matrices: an n x n matrix Q for R, where cell Q(i,j) is the intersection of R(i) and R(j), and an m x m matrix P for C, where cell P(i,j) is the intersection of C(i) and C(j).
Generally I can do this using two for-loops, but my data is quite big and I wonder if there is any method to speed up the computation?
My first idea was to replace each cell array with a binary matrix. Each cell holds the indices of the rows (vector R) or columns (vector C) of the binary input matrix BM that I want to compare. So if R(1) = {2 3 4} and BM is 5x5, then R(1,:) = [0 1 1 1 0]. With this binary matrix R, I could compare each row with every other row using only one loop. But then I still need to get back to the row indices, e.g.
R(1,:) = [0 1 1 1 0];
R(2,:) = [0 1 1 0 0]; %then
Q(1,2) = [0 1 1 0 0]; %(intersection of element R(1) and R(2)) and
C(1,:) = [1 1 0 0 0];
C(2,:) = [1 0 0 1 0]; %then
P(1,2) = [1 0 0 0 0]; % Now I want to obtain
Results(i,j) = sum(BM(Q(1,2),P(1,2)))=sum(BM([2 3],[1]));
Do you have any idea how to handle this and compare the elements of a cell array without two loops?
Since Q( k, l ) is a vector with numCols elements (5 in your example), it cannot be stored in a 2D matrix Q: Q should be either a 2D cell array or a 3D matrix.
Using the binary matrix directly to obtain Q (row intersections):
>> Q = bsxfun( @times, permute( BM, [1 3 2] ), permute( BM, [3 1 2] ) );
Now, Q( k, l, : ) holds the intersection between the k-th and l-th rows of BM.
Same goes for P:
>> P = bsxfun( @times, permute( BM, [3 2 1] ), permute( BM, [2 3 1] ) );
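A tiny example may make the 3D layout easier to read. The sketch below assumes a two-row BM built from the two binary rows given in the question; the name q12 is hypothetical.

```matlab
BM = [0 1 1 1 0;
      0 1 1 0 0];   % the two binary rows from the question

% Q(k,l,:) is the elementwise product (logical AND) of rows k and l of BM
Q = bsxfun(@times, permute(BM, [1 3 2]), permute(BM, [3 1 2]));

q12 = squeeze(Q(1, 2, :)).';   % intersection of rows 1 and 2 as a row vector
% q12 is [0 1 1 0 0]; find(q12) gives the shared indices [2 3]
```

Q here has size 2x2x5, so memory grows with the square of the number of rows, which is worth keeping in mind for large inputs.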
I have quite a big array. To keep things simple, let's reduce it to:
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];
So, there is a group of 1's (4 elements), 2's (2 elements), 3's (4 elements), 4's (2 elements) and 5's (8 elements). Now, I want to keep only the columns that belong to a group of 3 or more elements. So the result will be:
B = [1 1 1 1 3 3 3 3 5 5 5 5 5 5 5 5];
I was doing it using a for loop, scanning the 1's, 2's, 3's and so on separately, but it's extremely slow with big arrays...
Thanks for any suggestions how to do it in more efficient way :)
Art.
A general approach
If your vector is not necessarily sorted, then you need to count the number of occurrences of each element in the vector. You have histc just for that:
elem = unique(A);
counts = histc(A, elem);
B = A;
B(ismember(A, elem(counts < 3))) = []
The last line picks the elements that have fewer than 3 occurrences and deletes them.
An approach for a grouped vector
If your vector is "semi-sorted", that is if similar elements in the vector are grouped together (as in your example), you can speed things up a little by doing the following:
start_idx = find(diff([0, A]))
counts = diff([start_idx, numel(A) + 1]);
B = A;
B(ismember(A, A(start_idx(counts < 3)))) = []
Again, note that the vector need not be entirely sorted, just that similar elements are adjacent to each other.
Here is my two-liner
counts = accumarray(A', 1);
B = A(ismember(A, find(counts>=3)));
accumarray is used to count the individual members of A. find extracts the ones that meet your '3 or more elements' criterion. Finally, ismember tells you where they are in A. Note that A need not be sorted. Of course, accumarray only works for positive integer values in A.
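Spelled out with its intermediate values, the two-liner behaves as follows on the data from the question. This is just an annotated rerun; keepValues is a made-up name, and A(:) is used instead of A' only to make the column orientation explicit.

```matlab
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];

counts     = accumarray(A(:), 1);     % counts(v) = number of times v occurs in A
keepValues = find(counts >= 3);       % values that occur at least 3 times
B          = A(ismember(A, keepValues));
% counts is [4; 2; 4; 2; 8] and keepValues is [1; 3; 5]
```

Because A is used directly as the subscript, counts has max(A) entries, which can be large if the values in A are large.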
What you are describing is called run-length encoding.
There are implementations of this for MATLAB on the File Exchange. Or you can do it directly as follows:
len = diff([ 0 find(A(1:end-1) ~= A(2:end)) length(A) ]);
val = A(logical([ A(1:end-1) ~= A(2:end) 1 ]));
Once you have your run-length encoding you can remove elements based on the length. i.e.
idx = (len>=3)
len = len(idx);
val = val(idx);
And then decode to get the array you want:
i = cumsum(len);
j = zeros(1, i(end));
j(i(1:end-1)+1) = 1;
j(1) = 1;
B = val(cumsum(j));
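On releases that provide repelem (R2015a and later, as far as I know), the decoding step can be shortened. The following sketch repeats the encoding from above and then decodes with repelem; the names change and keep are hypothetical.

```matlab
A = [1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5 5 5 5];

% Run-length encode: positions where the value changes mark the end of a run
change = A(1:end-1) ~= A(2:end);
len = diff([0 find(change) numel(A)]);   % length of every run
val = A([change true]);                  % value of every run

% Keep only runs of 3 or more elements and decode them again
keep = len >= 3;
B = repelem(val(keep), len(keep));       % repeat each kept value len times
```

Unlike the value-based approaches, this filters run by run, so a value that appears in one long run and one short run keeps only the long one.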
Here's another way to do it using matlab built-ins.
% Set up
A=[1 1 1 1 2 2 3 3 3 3 4 4 5 5 5 5 5];
threshold=2;
% Get the unique elements of the array
uniqueElements=unique(A);
% Count how many times each unique element occurs
counts=histc(A,uniqueElements);
% Write which elements should be kept
toKeep=uniqueElements(counts>threshold);
% Make a logical index
indexer=false(size(A));
for i=1:length(toKeep)
    % For every unique element we want to keep, select the indices in A that
    % are equal to it
    indexer=indexer|(toKeep(i)==A);
end
% Apply index
B=A(indexer);