Using percentage function with accumarray - arrays

I have two arrays:
OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65,...]
AprefCOR = [1,3,1,1,1,1,1,1,1,1,2,3,3,2,...]
For each element in OTPCORorder there is a corresponding element in AprefCOR.
I want to know the percent of the number 1 for each set of unique OTPCORorder as follows:
OTPCORorder1 = [61,62,65,...]
AprefCOR1 = [1,0.72,0,...]
I already have this:
[OTPCORorder1,~,idx] = unique(OTPCORorder,'stable');
ANS = OTPCORorder1 = [61,62,65,...];
and I used to work with "accumarray" but I used the "mean" or "sum" function such as this:
AprefCOR1 = accumarray(idx,AprefCOR,[],@mean).';
I was just wondering if there exists a way to use this but with "prctile" function or any other function that gives me the percent of a specific element for example "1" in this case.
Thank you very much.

This could be one approach:
%// logical mask: true where AprefCOR equals 1, false elsewhere
AprefCORmask = AprefCOR == 1;
%// you have done this
[OTPCORorder1,~,idx] = unique(OTPCORorder,'stable');
%// Find number of each unique values
counts = accumarray(idx,1);
%// Find number of ones for each unique value
sumVal = accumarray(idx,AprefCORmask);
%// find percentage of ones to get the results
perc = sumVal./counts
Results:
Inputs:
OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65];
AprefCOR = [1,3,1,1,1,1,1,1,1,1,2,3,3,2];
Output:
perc =
1.0000
0.7273
0

Here's another approach without using accumarray. I think it's more readable:
>> list = unique(OTPCORorder);
>> counts_master = histc(OTPCORorder, list);
>> counts = histc(OTPCORorder(AprefCOR == 1), list);
>> perc = counts ./ counts_master
perc =
1.0000 0.7273 0
The code first finds the unique values in OTPCORorder. It then counts how many elements belong to each unique value via histc, using that exact list as the bins. In newer versions of MATLAB, histcounts is recommended over histc; note that histcounts takes bin edges, so the call needs one extra trailing edge, for example histcounts(OTPCORorder, [list Inf]). Next, it counts how many elements of OTPCORorder have AprefCOR == 1 for each unique value. Dividing this second list of counts by the first, entry by entry, gives the percentage.
It'll give you the same results as accumarray but with less overhead.
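As a rough sketch of the histcounts variant, using the sample data from the question: the trailing Inf edge is an addition made here so that the last unique value gets its own bin, and the variable name edges is chosen for illustration only.

```matlab
OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65];
AprefCOR    = [1,3,1,1,1,1,1,1,1,1,2,3,3,2];

list  = unique(OTPCORorder);          % sorted unique values: 61 62 65
edges = [list, Inf];                  % histcounts expects bin edges, so close the last bin

counts_master = histcounts(OTPCORorder, edges);                 % elements per unique value
counts        = histcounts(OTPCORorder(AprefCOR == 1), edges);  % of those, how many are 1
perc          = counts ./ counts_master;                        % fraction of ones per value
```

With this data, counts_master should come out as [1 11 2] and counts as [1 8 0].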

Your approach works, you only need to define an appropriate anonymous function to be used by accumarray. Let value = 1 be the value whose percentage you want to compute. Then
[~, ~, u] = unique(OTPCORorder); %// labels for unique values in OTPCORorder
result = accumarray(u(:), AprefCOR(:), [], @(x) mean(x==value)).';
As an alternative, you can use sparse as follows. Generate a two-row matrix such that each column corresponds to one of the possible values in OTPCORorder. The first row tallies how many times each value in OTPCORorder did not have the desired value in AprefCOR; the second row tallies how many times it did.
[~, ~, u] = unique(OTPCORorder);
s = full(sparse((AprefCOR==value)+1, u, 1));
result = s(2,:)./sum(s,1);
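To make the two-row tally concrete, here is a small sketch on the question's sample data. The (:) reshapes are an addition made here so that both index vectors have the same orientation; everything else follows the snippet above.

```matlab
OTPCORorder = [61,62,62,62,62,62,62,62,62,62,62,62,65,65];
AprefCOR    = [1,3,1,1,1,1,1,1,1,1,2,3,3,2];
value = 1;

[~, ~, u] = unique(OTPCORorder);              % group label (1, 2, 3) for each element
rowIdx = (AprefCOR(:) == value) + 1;          % 1 = no match, 2 = match
s = full(sparse(rowIdx, u(:), 1));            % sparse sums the ones that share an index
result = s(2,:) ./ sum(s, 1);                 % matches divided by group size
```

Here s should be [0 3 2; 1 8 0]: the second row holds the number of ones in each group and the column sums hold the group sizes.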

Related

Find nearest smaller value in Matlab

I have two vectors of different size. Just as an example:
Triggs = [38.1680, 38.1720, 38.1760, 38.1800, 38.1840, 38.1880, 38.1920, 38.1960, 38.2000, 38.2040, 38.2080, 38.2120, 38.2160, 38.2200, 38.2240, 38.2280, 38.2320, 38.2360, 38.2400, 38.2440, 38.2480, 38.2520, 38.2560, 38.2600, 38.2640, 38.2680]
Peaks = [27.7920, 28.4600, 29.1360, 29.8280, 30.5200, 31.2000, 31.8920, 32.5640, 33.2600, 33.9480, 34.6520, 35.3680, 36.0840, 36.7680, 37.5000, 38.2440, 38.9920, 39.7120, 40.4160, 41.1480, 41.8840, 42.5960, 43.3040, 44.0240, 44.7160, 45.3840, 46.1240, 46.8720, 47.6240, 48.3720, 49.1040, 49.8080, 50.5200, 51.2600]
For each element in Triggs I need to find the nearest smaller element in Peaks.
That is, if Triggs(1) == 38.1680, I need to find the column number equal to Peaks(15) (the 15th element of Peaks).
Just to be 100% clear, the closest element of course could be the next one, that is 38.2440. That would not be ok for me. I will always need the one to the left of the array.
So far I have this:
for i = 1:length(Triggs)
[~,valuePosition] = (min(abs(Peaks-Triggs(i))))
end
However, this could give me the incorrect value, that is, one bigger than Triggs(i), right?
As a solution I was thinking I could do this:
for i = 1:length(Triggs)
[~,valuePosition] = (min(abs(Peaks-Triggs(i))))
if Peaks(valuePosition) >= Triggs(i)
valuePosition = valuePosition-1
end
end
Is there a better way of doing this?
This can be done in a vectorized way as follows (note that the intermediate matrix d can be large). If there is no number satisfying the condition the output is set to NaN.
d = Triggs(:).'-Peaks(:); % matrix of pair-wise differences. Uses implicit expansion
d(d<=0) = NaN; % set non-positive differences to NaN, so they will be disregarded
[val, result] = min(d, [], 1); % for each column, get minimum value and its row index
result(isnan(val)) = NaN; % if minimum was NaN the index is not valid
If it is assured that there will always be a number satisfying the condition, the last line and the variable val can be removed:
d = Triggs(:).'-Peaks(:); % matrix of pair-wise differences. Uses implicit expansion
d(d<=0) = NaN; % set non-positive differences to NaN, so they will be disregarded
[~, result] = min(d, [], 1); % for each column, get row index of minimum value
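A small worked run of the vectorized approach may help. The shortened Peaks and Triggs vectors below are made up for illustration (the last trigger is deliberately smaller than every peak), and implicit expansion requires R2016b or later.

```matlab
Peaks  = [27.7920, 37.5000, 38.2440, 38.9920];
Triggs = [38.1680, 38.2440, 38.2480, 20.0000];

d = Triggs(:).' - Peaks(:);       % d(i,j) = Triggs(j) - Peaks(i)
d(d <= 0) = NaN;                  % discard peaks that are not strictly smaller
[val, result] = min(d, [], 1);    % smallest positive difference per trigger
result(isnan(val)) = NaN;         % no smaller peak exists for this trigger
```

Under these assumptions result should be [2 2 3 NaN]: a trigger equal to a peak (38.2440) still maps to the peak on its left, and the trigger with no smaller peak yields NaN.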
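I think this should help you, for a single trigger value (here Triggs(1)):
candidates = find(Peaks < Triggs(1)); % positions of all peaks smaller than the trigger
[~, k] = max(Peaks(candidates)); % the largest of them is the nearest one
lowest = candidates(k)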

construct a matrix by removing different elements from an array without loops in matlab

Given a vector X of discrete positive integers of size 160*1, and a table Tb1 of size 40*200 that contains a list of indices to be deleted from X. Each of the 200 columns in Tb1 points to 40 elements to be deleted from the original X.
I create a new matrix of the remaining 120*200 elements by using a for loop with 200 iterations, that at round i deletes 40 elements from a copy of the original X according to the indices listed in Tb1(:,i), but it takes too much time and memory.
How can I get the result without using loops and with a minimum number of operations?
Here are different methods:
Method1:
idx = ~hist(tbl, 1:160);
[f,~]=find(idx);
result1 = reshape(M(f),120,200);
Method2:
idx = ~hist(tbl, 1:160);
M2=repmat(M,200,1);
result2 = reshape(M2(idx),120,200);
Method 3 & 4:
% idx can be generated using accumarray
idx = ~accumarray([tbl(:) reshape(repmat(1:200,40,1),[],1)],true,[160,200],@any);
%... use method 1 and 2
Method5:
M5=repmat(M,200,1);
M5(bsxfun(@plus,tbl,0:160:160*199))=[];
result5 = reshape(M5,120,200);
This assumes that M is the array of integers (X in the question) and tbl is the table of indices (Tb1 in the question).
It can be tested with the following data:
M = rand(160,1);
[~,tbl] = sort(rand(160,200));
tbl = tbl(1:40,:);
However, it is more efficient to generate the indices of the elements to keep instead of the indices of the elements to remove.
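The column-offset idea from Method 5 can also be expressed as a logical "keep" mask. The sketch below uses a tiny made-up case (6 elements, 3 columns, 2 deletions per column) so that the result can be checked by eye; the names keep, rows, n and k are illustrative, and the implicit expansion needs R2016b or later.

```matlab
M   = (10:10:60).';               % 6-by-1 data vector
tbl = [1 2 3; 4 5 6];             % each column lists the rows to delete

n = numel(M);
k = size(tbl, 2);

keep = true(n, k);                % one mask column per column of tbl
keep(tbl + n*(0:k-1)) = false;    % convert (row, column) pairs to linear indices
[rows, ~] = find(keep);           % surviving row numbers, column by column
result = reshape(M(rows), [], k); % (n - 2)-by-k matrix of remaining elements
```

For this input the result should be [20 10 10; 30 30 20; 50 40 40; 60 60 50].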

Compute the product of the next n elements in array

I would like to compute the product of the next n adjacent elements of a matrix. The number n of elements to be multiplied should be given in function's input.
For example for this input I should compute the product of every 3 consecutive elements, starting from the first.
[p, ind] = max_product([1 2 2 1 3 1],3);
This gives [1*2*2, 2*2*1, 2*1*3, 1*3*1] = [4,4,6,3].
Is there any practical way to do it? Now I do this using:
for ii = 1:(length(v)-2)
p = prod(v(ii:ii+n-1));
end
where v is the input vector and n is the number of elements to be multiplied.
in this example n=3 but can take any positive integer value.
Depending whether n is odd or even or length(v) is odd or even, I get sometimes right answers but sometimes an error.
For example for arguments:
v = [1.35912281237829 -0.958120385352704 -0.553335935098461 1.44601450110386 1.43760259196739 0.0266423803393867 0.417039432979809 1.14033971399183 -0.418125096873537 -1.99362640306847 -0.589833539347417 -0.218969651537063 1.49863539349242 0.338844452879616 1.34169199365703 0.181185490389383 0.102817336496793 0.104835620599133 -2.70026800170358 1.46129128974515 0.64413523430416 0.921962619821458 0.568712984110933]
n = 7
I get the error:
Index exceeds matrix dimensions.
Error in max_product (line 6)
p = prod(v(ii:ii+n-1));
Is there any correct general way to do it?
Based on the solution in Fast numpy rolling_product, I'd like to suggest a MATLAB version of it, which leverages the movsum function introduced in R2016a.
The mathematical reasoning is that a product of numbers is equal to the exponential of the sum of their logarithms:
x_1 * x_2 * ... * x_n = exp(log(x_1) + log(x_2) + ... + log(x_n))
A possible MATLAB implementation of the above may look like this:
function P = movprod(vec,window_sz)
P = exp(movsum(log(vec),[0 window_sz-1],'Endpoints','discard'));
if isreal(vec) % Ensures correct outputs when the input contains negative and/or
P = real(P); % complex entries.
end
end
Several notes:
I haven't benchmarked this solution, and do not know how it compares in terms of performance to the other suggestions.
It should work correctly with vectors containing zero and/or negative and/or complex elements.
It can be easily expanded to accept a dimension to operate along (for array inputs), and any other customization afforded by movsum.
The 1st input is assumed to be either a double or a complex double row vector.
Outputs may require rounding.
Update
Inspired by the nicely thought answer of Dev-iL comes this handy solution, which does not require Matlab R2016a or above:
out = real( exp(conv(log(a),ones(1,n),'valid')) )
The basic idea is to transform the multiplication into a sum, so that a moving sum can be used, which in turn can be realised by convolution.
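As a quick sanity check of the log-domain trick, the sketch below compares the convolution version against an explicit loop on the example vector from the question. The names ref and maxErr are illustrative only.

```matlab
a = [1 2 2 1 3 1];
n = 3;

% Reference result: explicit loop over every full window.
ref = zeros(1, numel(a) - n + 1);
for ii = 1:numel(ref)
    ref(ii) = prod(a(ii:ii+n-1));
end

% Log-domain version: sum the logarithms over each window, then exponentiate.
out = real(exp(conv(log(a), ones(1, n), 'valid')));

maxErr = max(abs(out - ref));     % expected to be at round-off level
```

Because the result passes through log and exp, out may differ from the exact products by floating-point round-off, which is why rounding can be needed for integer inputs.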
Old answers
This is one way using gallery to get a circulant matrix and indexing the relevant part of the resulting matrix before multiplying the elements:
a = [1 2 2 1 3 1]
n = 3
%// circulant matrix
tmp = gallery('circul', a(:))
%// product of relevant parts of matrix
out = prod(tmp(end-n+1:-1:1, end-n+1:end), 2)
out =
4
4
6
3
More memory efficient alternative in case there are no zeros in the input:
a = [10 9 8 7 6 5 4 3 2 1]
n = 2
%// cumulative product
x = [1 cumprod(a)]
%// shifted by n and divided by itself
y = circshift( x,[0 -n] )./x
%// remove last elements
out = y(1:end-n)
out =
90 72 56 42 30 20 12 6 2
Your approach is correct. You should just change the for loop to for ii = 1:(length(v)-n+1) and then it will work fine.
If you are not going to deal with large inputs, another approach is using gallery as explained in @thewaywewalk's answer.
I think the problem may be based on your indexing. The line that states for ii = 1:(length(v)-2) does not provide the correct range of ii.
Try this:
function out = max_product(in,n)
n = n-1; % offset from the first to the last element of a window
out = zeros(length(in)-n,1); % one product per full window, as a column vector
for i = 1:length(in)-n
out(i) = prod(in(i:i+n));
end
end
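Your code works when restated like so (storing each product in p(ii) so that earlier results are not overwritten):
p = zeros(1, length(v)-(n-1));
for ii = 1:(length(v)-(n-1))
p(ii) = prod(v(ii:ii+(n-1)));
end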
That should take care of the indexing problem.
Using bsxfun, you can create an index matrix in which each row holds n consecutive positions, and then take prod along the second dimension. I think this is the most efficient way:
max_product = @(v, n) prod(v(bsxfun(@plus, (1 : n), (0 : numel(v)-n)')), 2);
p = max_product([1 2 2 1 3 1],3)
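To see what the one-liner does, it can be unrolled into its intermediate steps. This is only an illustrative breakdown on the example input; the names idx and windows are not part of the answer.

```matlab
v = [1 2 2 1 3 1];
n = 3;

% Row k of idx holds the positions k, k+1, ..., k+n-1 of the k-th window.
idx = bsxfun(@plus, 1:n, (0:numel(v)-n).');

windows = v(idx);          % 4-by-3 matrix, one window per row
p = prod(windows, 2);      % product along each row
```

For this input idx is [1 2 3; 2 3 4; 3 4 5; 4 5 6] and p is [4; 4; 6; 3]. The index matrix has (numel(v)-n+1)*n entries, so memory use grows with both the vector length and the window size.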
Update:
Some other solutions have since been updated, and some, such as @Dev-iL's answer, outperform the others. I can also suggest fftconv, which in Octave outperforms conv.
If you can upgrade to R2017a, you can use the new movprod function to compute a windowed product.

Multiply elements in second column according to labels in the first

I'm working in Matlab.
I have a two-dimensional matrix with two columns. Lets consider elements in the first column as labels. Labels may be repeated.
How to multiply all elements in the second column for every label?
Example:
matrix = [1,3,3,1,5; 2,3,7,8,3]'
I need to get:
a = [1,3,5; 16,21,3]'
Can you help me with doing it without for-while cycles?
I would use accumarray. The preprocessing with unique assigns integer indices 1:n to the values in the first column, which allows accumarray to work without creating unnecessary bins for 2 and 4. It also enables support for negative numbers and floats.
[ulable,~,uindex]=unique(matrix(:,1))
r=accumarray(uindex,matrix(:,2),[],@prod)
r=[ulable,r]
You can also use splitapply:
[ulable,~,uindex]=unique(matrix(:,1))
r=splitapply(@prod,matrix(:,2),uindex)
r=[ulable,r]
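The point about negative and non-integer labels can be illustrated with a small made-up matrix. The data and the name ulabel below are assumptions for the sake of the example; passing such labels directly to accumarray as subscripts would fail, because subscripts must be positive integers.

```matlab
matrix = [-1.5, 2, -1.5; 2, 3, 4]';            % labels -1.5 and 2 in column 1

[ulabel, ~, uindex] = unique(matrix(:,1));     % uindex maps each label to 1..n
r = accumarray(uindex, matrix(:,2), [], @prod);
a = [ulabel, r];                               % one row per label: label, product
```

Here uindex is [1; 2; 1], so the values 2 and 4 are multiplied for label -1.5 and a should be [-1.5 8; 2 3].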
You can do it without loops using accumarray and the prod function:
clear
clc
matrix = [1,3,3,1,5; 2,3,7,8,3]';
A = unique(matrix,'rows');
group = A(:,1);
data = A(:,2);
indices = [group ones(size(group))];
prods = accumarray(indices, data,[],@prod); %// As mentioned by @Daniel. My previous answer had a function handle but there is no need for that here since prod is already defined in Matlab.
a = nonzeros(prods)
Out = [unique(group) a]
Out =
1 16
3 21
5 3
Check Loren's blog post here; accumarray is quite interesting and powerful!
Try something like this, I'm sure it can be improved...
unValues = unique(matrix(:,1));
bb = ones(size(unValues));
for ii = 1:length(unValues)
bb(ii) = bb(ii)*prod(matrix(matrix(:, 1) == unValues(ii), 2));
end
a = [unValues bb];

How to obtain a vector with the indexes of the elements given in a combination?

I have a vector that stores unique area values. I am using a for loop to generate an array with the sum of every possible combination of these areas, as shown below:
A_values=[155 143 193.5 233.25 419.7 351.9 256.8 1054.9 997.5 997.5 726.2 73.5 66.8 62 82.5]
comb_sums=[];
indexes=[];
for x=1:length(A_values)
comb_sums=[comb_sums;
sum(combntns(A_values,x),2)];
end
Now I would like to obtain the indexes of the elements given in every combination. For example, if some of the possible given combinations had been [143], [726.2 66.8] and [155 419.7 256.8], the code would give me an array like this:
indexes=[ 2 0 0 0;
11 13 0 0;
1 5 7 0];
The array that I get from the for loop is obviously much bigger than the example given in the indexes variable above, so indexes would give me a much bigger array too.
You can create an array of indices and use combntns on it, just like you did on A_values -
nA = numel(A_values)
for k1 = 1:nA
comb_out = combntns([1:nA],k1);
indexes = [comb_out zeros(size(comb_out,1),nA - size(comb_out,2))]
end
If you would like to store up indexes from each iteration into a huge array named indexes_all instead, you can pre-calculate the sizes of indexes for each iteration and use them to pre-allocate for indexes_all. The code would be -
%// Number of A_values to be used at various places in the code
nA = numel(A_values);
%// Get number of rows to be produced at each iteration
nrows = arrayfun(@(x) factorial(nA)/(factorial(x)*factorial(nA-x)),1:nA);
%// Preallocate with zeros as also needed in the desired output
indexes_all = zeros(sum(nrows),nA);
off1 = 1; %// row-offset
for k1 = 1:nA
comb_out = combntns(1:nA,k1); %// combntns on array of indices
indexes_all(off1:off1+nrows(k1)-1,1:size(comb_out,2)) = comb_out; %// Store
off1 = off1+nrows(k1); %// Update row-offset for next set of combinations
end
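The following sketch runs the preallocated version on the first three values of A_values only, so that the output is small enough to inspect. It swaps combntns for the base MATLAB function nchoosek, which is an assumption made here to avoid the toolbox dependency; the comb_sums bookkeeping and the name rows are also additions for illustration.

```matlab
A_values = [155 143 193.5];                     % shortened example input
nA = numel(A_values);

nrows = arrayfun(@(k) nchoosek(nA, k), 1:nA);   % combinations per size: [3 3 1]
indexes_all = zeros(sum(nrows), nA);            % zero-padded index table
comb_sums   = zeros(sum(nrows), 1);

off1 = 1;                                       % row offset
for k1 = 1:nA
    comb_out = nchoosek(1:nA, k1);              % combinations of indices
    rows = off1:off1+nrows(k1)-1;
    indexes_all(rows, 1:k1) = comb_out;
    % reshape keeps one combination per row even when comb_out is a vector
    comb_sums(rows) = sum(reshape(A_values(comb_out), size(comb_out)), 2);
    off1 = off1 + nrows(k1);
end
```

Row i of indexes_all then lists the positions in A_values that produced comb_sums(i), padded with zeros on the right.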

Resources