Matlab, averaging y values based on x =1 - arrays

I am having trouble averaging y values based on their x counter parts.
For example
1 5
3 4
1 6
How do I get 5 and 6 to average based on being paired with an x value of 1? For my specific issue I will have 98 values between repeating 1's, and there will be a total of 99 1's in the array.
This is not extremely complicated, but it has been over a year since I have used matlab so being rusty has me scratching my head.

Here's what I got:
x = [1, 5;
3, 4;
1, 6]
col1 = x(:, 1) % extract first row
col1 =
1
3
1
ri = find(col1 == 1) % get row indices where 1 appears
ri =
1
3
mean(x(ri, 2)) % index into the second column of rows with a 1, and take average
ans = 5.5000

Related

Sum up vector values till threshold, then start again

I have a vector a = [1 3 4 2 1 5 6 3 2]. Now I want to create a new vector 'b' with the cumsum of a, but after reaching a threshold, let's say 5, cumsum should reset and start again till it reaches the threshold again, so the new vector should look like this:
b = [1 4 4 2 3 5 6 3 5]
Any ideas?
You could build a sparse matrix that, when multiplied by the original vector, returns the cumulative sums. I haven't timed this solution versus others, but I strongly suspect this will be the fastest for large arrays of a.
% Original data
a = [1 3 4 2 1 5 6 3 2];
% Threshold
th = 5;
% Cumulative sum corrected by threshold
b = cumsum(a)/th;
% Group indices to be summed by checking for equality,
% rounded down, between each cumsum value and its next value. We add one to
% prevent NaNs from occuring in the next step.
c = cumsum(floor(b) ~= floor([0,b(1:end-1)]))+1;
% Build the sparse matrix, remove all values that are in the upper
% triangle.
S = tril(sparse(c.'./c == 1));
% In case you use matlab 2016a or older:
% S = tril(sparse(bsxfun(#rdivide,c.',c) == 1));
% Matrix multiplication to create o.
o = S*a.';
By normalizing the arguments of cumsum with the threshold and flooring you can get grouping indizes for accumarray, which then can do the cumsumming groupwise:
t = 5;
a = [1 3 4 2 1 5 6 3 2];
%// cumulative sum of normalized vector a
n = cumsum(a/t);
%// subs for accumarray
subs = floor( n ) + 1;
%// cumsum of every group
aout = accumarray( subs(:), (1:numel(subs)).', [], #(x) {cumsum(a(x))});
%// gather results;
b = [aout{:}]
One way is to use a loop. You create the first cumulative sum cs, and then as long as elements in cs are larger than your threshold th, you replace them with elements from the cumulative sum on the rest of the elements in a.
Because some elements in a might be larger than th, this loop will be infinite unless we also eliminate these elements too.
Here is a simple solution with a while loop:
a = [1 3 4 2 1 5 6 3 2];
th = 5;
cs = cumsum(a);
while any(cs>th & cs~=a) % if 'cs' has values larger that 'th',
% and there are any values smaller than th left in 'a'
% sum all the values in 'a' that are after 'cs' reached 'th',
% excluding values that are larger then 'th'
cs(cs>th & cs~=a) = cumsum(a(cs>th & cs~=a));
end
Calculate the cumulative sum and replace the indices value obeying your condition.
a = [1 3 4 2 1 5 6 3 2] ;
b = [1 4 4 2 3 5 6 3 5] ;
iwant = a ;
a_sum = cumsum(a) ;
iwant(a_sum<5) = a_sum(a_sum<5) ;

How to check if all the entries in columns of a matrix are equal (in MATLAB)?

I have a matrix of growing length for example a 4-by-x matrix A where x is increasing in a loop. I want to find the smallest column c where all columns before that, each, carry one single number. The matrix A can look like:
A = [1 2 3 4;
1 2 3 5;
1 2 3 1;
1 2 3 0];
where c=3, and x=4.
At each iteration of the loop where A grows in length, the value of index c grows as well. Therefore, at each iteration, I want to update the value of c. How efficiently can I code this in Matlab?
Let's say you had the matrix A and you wanted to check a particular column iito see if all its elements are the same. The code would be:
all(A(:, ii)==A(1, ii)) % checks if all elements in column are same as first element
Also, keep in mind that once the condition is broken, x cannot be updated anymore. Therefore, your code should be:
x = 0;
while true
%% expand A by one column
if ~all(A(:, x)==A(1, x)) % true if all elements in column are not the same as first element
break;
end
x = x+1;
end
You could use this:
c = find(arrayfun(#(ind)all(A(1, ind)==A(:, ind)), 1:x), 1, 'first');
This finds the first column where not all values are the same. If you run this in a loop, you can detect when entries in a column start to differ:
for x = 1:maxX
% grow A
c = find(arrayfun(#(ind)~all(A(1, ind)==A(:, ind)), 1:x), 1, 'first');
% If c is empty, all columns have values equal to first row.
% Otherwise, we have to subtract 1 to get the number of columns with equal values
if isempty(c)
c = x;
else
c = c - 1;
end
end
Let me give a try as well:
% Find the columns which's elements are same and sum the logical array up
c = sum(A(1,:) == power(prod(A,1), 1/size(A,1)))
d=size(A,2)
To find the last column such that each column up to that one consists of equal values:
c = find(any(diff(A,1,1),1),1)-1;
or
c = find(any(bsxfun(#ne, A, A(1,:)),1),1)-1;
For example:
>> A = [1 2 3 4 5 6;
1 2 3 5 5 7;
1 2 3 1 5 0;
1 2 3 0 5 8];
>> c = find(any(diff(A,1,1),1),1)-1
c =
3
You can try this (easy and fast):
Equal_test = A(1,:)==A(2,:)& A(2,:)==A(3,:)&A(3,:)==A(4,:);
c=find(Equal_test==false,1,'first')-1;
You can also check the result of find if you want.

Matlab: how to find an enclosing grid cell index for multiple points

I am trying to allocate (x, y) points to the cells of a non-uniform rectangular grid. Simply speaking, I have a grid defined as a sorted non-equidistant array
xGrid = [x1, x2, x3, x4];
and an array of numbers x lying between x1 and x4. For each x, I want to find its position in xGrid, i.e. such i that
xGrid(i) <= xi <= xGrid(i+1)
Is there a better (faster/simpler) way to do it than arrayfun(#(x) find(xGrid <= x, 1, 'last'), x)?
You are looking for the second output of histc:
[~,where] = histc(x, xGrid)
This returns the array where such that xGrid(where(i)) <= x(i) < xGrid(where(i)+1) holds.
Example:
xGrid = [2,4,6,8,10];
x = [3,5,6,9,11];
[~,where] = histc(x, xGrid)
Yields the following output:
where =
1 2 3 4 0
If you want xGrid(where(i)) < x(i) <= xGrid(where(i)+1), you need to do some trickery of negating the values:
[~,where] = histc(-x,-flip(xGrid));
where(where~=0) = numel(xGrid)-where(where~=0)
This yields:
where =
1 2 2 4 0
Because x(3)==6 is now counted for the second interval (4,6] instead of [6,8) as before.
Using bsxfun for the comparisons and exploiting find-like capabilities of max's second output:
xGrid = [2 4 6 8]; %// example data
x = [3 6 5.5 10 -10]; %// example data
comp = bsxfun(#gt, xGrid(:), x(:).'); %'// see if "x" > "xGrid"
[~, result] = max(comp, [], 1); %// index of first "xGrid" that exceeds each "x"
result = result-1; %// subtract 1 to find the last "xGrid" that is <= "x"
This approach gives 0 for values of x that lie outside xGrid. With the above example values,
result =
1 3 2 0 0
See if this works for you -
matches = bsxfun(#le,xGrid(1:end-1),x(:)) & bsxfun(#ge,xGrid(2:end),x(:))
[valid,pos] = max(cumsum(matches,2),[],2)
pos = pos.*(valid~=0)
Sample run -
xGrid =
5 2 1 6 8 9 2 1 6
x =
3 7 14
pos =
8
4
0
Explanation on the sample run -
First element of x, 3 occurs last between ...1 6 with the criteria of xGrid(i) <= xi <= xGrid(i+1) at the backend of xGrid and that 1 is at the eight position, so the first element of the output pos is 8. This continues for the second element 7, which is found between 6 and 8 and that 6 is at the fourth place in xGrid, so the second element of the output is 4. For the third element 14 which doesn't find any neighbours to satisfy the criteria xGrid(i) <= xi <= xGrid(i+1) and is therefore outputted as 0.
If x is a column this might help
xg1=meshgrid(xGrid,1:length(x));
xg2=ndgrid(x,1:length(xGrid));
[~,I]=min(floor(abs(xg1-xg2)),[],2);
or a single line implementation
[~,I]=min(floor(abs(meshgrid(xGrid,1:length(x))-ndgrid(x,1:length(xGrid)))),[],2);
Example: xGrid=[1 2 3 4 5], x=[2.5; 1.3; 1.7; 4.8; 3.3]
Result:
I =
2
1
1
4
3

Vectorization- Matlab

Given a vector
X = [1 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3]
I would like to generate a vector such
Y = [1 2 3 4 5 1 2 3 4 5 6 1 2 3 4 5]
So far what I have got is
idx = find(diff(X))
Y = [1:idx(1) 1:idx(2)-idx(1) 1:length(X)-idx(2)]
But I was wondering if there is a more elegant(robust) solution?
One approach with diff, find & cumsum for a generic case -
%// Initialize array of 1s with the same size as input array and an
%// intention of using cumsum on it after placing "appropriate" values
%// at "strategic" places for getting the final output.
out = ones(size(X))
%// Find starting indices of each "group", except the first group, and
%// by group here we mean run of identical numbers.
idx = find(diff(X))+1
%// Place differentiated and subtracted values of indices at starting locations
out(idx) = 1-diff([1 idx])
%// Perform cumulative summation for the final output
Y = cumsum(out)
Sample run -
X =
1 1 1 1 2 2 3 3 3 3 3 4 4 5
Y =
1 2 3 4 1 2 1 2 3 4 5 1 2 1
Just for fun, but customary bsxfun based alternative solution -
%// Logical mask with each column of ones for presence of each group elements
mask = bsxfun(#eq,X(:),unique(X(:).')) %//'
%// Cumulative summation along columns and use masked values for final output
vals = cumsum(mask,1)
Y = vals(mask)
Here's another approach:
Y = sum(triu(bsxfun(#eq, X, X.')), 1);
This works as follows:
Compare each element with all others (bsxfun(...)).
Keep only comparisons with current or previous elements (triu(...)).
Count, for each element, how many comparisons are true (sum(..., 1)); that is, how many elements, up to and including the current one, are equal to the current one.
Another method is using the function unique
like this:
[unqX ind Xout] = unique(X)
Y = [ind(1):ind(2) 1:ind(3)-ind(2) 1:length(X)-ind(3)]
Whether this is more elegant is up to you.
A more robust method will be:
[unqX ind Xout] = unique(X)
for ii = 1:length(unqX)-1
Y(ind(ii):ind(ii+1)-1) = 1:(ind(ii+1)-ind(ii));
end

finding the first value of one matrix after satisfying condition of another matrix

Please help me solve this problem...
a = [1 2 3 4 5 6 7 8 9 10]
b = [12 4 13 7 5 7 8 10 3 12]
c = [4 5 3 2 6 7 5 3 4 5]
I have to find the first value on a, if the value on b is less than 10 for more than 3 consecutive places and index for the starting of satisfying the condition. Also the value of c after finding the value of b for same index.
Ans should be index for b=4, index for a=4 and value for a =4 and c=2
Thank you in advance
You may use strfind as one approach -
str1 = num2str(b <10,'%1d') %%// String of binary numbers
indx = strfind(['0' str1],'0111') %%// Indices where the condition is met
ind = indx(1) %%// Choose the first occurance
a_out = a(ind) %%// Index into a
c_out = c(ind) %%// Index into c
Output -
ind =
4
a_out =
4
c_out =
2
To find a given number of consecutive values lower than a threshold, you can apply conv to a vector of 0-1 values resulting from the comparison:
threshold = 10; %// values "should" be smaller than this
number = 4; %// at least 4 consecutive occurrences
ind = find(conv(double(b<threshold), ones(1,number), 'valid')==number, 1);
%// double(b<threshold) gives 0-1.
%// conv(...)==... gives 1 when the sought number of consecutive 1's is reached
%// find(... ,1) gives the first index where that happens
a_out = a(ind);
c_out = c(ind);

Resources