Suppose I have data like this:
X y
1 5
2 6
3 1
4 7
5 3
6 8
I want to remove the rows 3 1 and 5 3 because their y value differs from the previous row's by more than 3. I actually want to plot this data and want the plot to be smooth.
I tried
for qq = 1:size(data,1)
if data(qq,2) - data(qq-1,2) > 3
data(qq,:)=[];
end
end
However, it gives:
Subscript indices must either be real positive integers or logicals.
Moreover, I guess the size of the array changes as I remove elements.
In the end, no two consecutive elements should differ by more than the threshold.
In practice, I want to smooth the following picture, which fluctuates strongly.
One very simple filter from mathematical morphology that you could try is the closing with a structuring element of size 2. It raises any sample that is lower than both of its neighbors to the lower of the two neighbors; other values are unchanged. Thus, it doesn't use a threshold to decide which samples are wrong; it only checks whether the sample is lower than both neighbors:
y = [5, 6, 1, 7, 3, 8]; % OP's second column
y1 = y;
y1(end+1) = -inf; % enforce boundary condition
y1 = max(y1,circshift(y1,1)); % dilation
y1 = min(y1,circshift(y1,-1)); % erosion
y1 = y1(1:end-1); % undo boundary condition change
This returns y1 = [5 6 6 7 7 8].
If you want to prevent changing your signal for small deviations, you can apply your threshold as a second step:
I = y1 - y < 3;
y1(I) = y(I);
This finds the places where we changed the signal, but the change was less than the threshold of 3. At those places we write back the original value.
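For readers who want to trace the two steps outside MATLAB, here is a minimal Python sketch of the same idea, assuming a plain list of numbers. The function names closing_size2 and closing_with_threshold are made up for this illustration; the circular shifts are emulated with index arithmetic.

```python
def closing_size2(y):
    """Closing with a 2-sample structuring element (dilation, then erosion)."""
    padded = list(y) + [float("-inf")]  # same boundary trick as y1(end+1) = -inf
    n = len(padded)
    # Dilation: max of each sample and its left neighbour (circular shift by +1).
    dilated = [max(padded[i], padded[i - 1]) for i in range(n)]
    # Erosion: min of each sample and its right neighbour (circular shift by -1).
    eroded = [min(dilated[i], dilated[(i + 1) % n]) for i in range(n)]
    return eroded[:-1]  # drop the padding sample again


def closing_with_threshold(y, threshold):
    """Keep the original value wherever the closing changed it by less than threshold."""
    closed = closing_size2(y)
    return [orig if c - orig < threshold else c for orig, c in zip(y, closed)]


y = [5, 6, 1, 7, 3, 8]
print(closing_size2(y))              # [5, 6, 6, 7, 7, 8]
print(closing_with_threshold(y, 3))  # [5, 6, 6, 7, 7, 8]
```

With y = [5, 6, 4, 7, 3, 8] the dip at 4 is only 2 below the closed value, so the thresholded version leaves it alone and only repairs the 3.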
You have a few errors:
Your index needs to start from 2, so that you aren't trying to index 0 for a previous index.
You need to check that the absolute value of the difference is greater than 3.
Since your data matrix is changing sizes, you can't use a for loop with a fixed number of iterations. Use a while loop instead.
This should give you the results you want:
qq = 2;
while qq <= size(data, 1)
if abs(data(qq, 2) - data(qq-1, 2)) > 3,
data(qq, :) = [];
else
qq = qq+1;
end
end
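The same filtering can also be written without deleting from the array while iterating over it. The following Python sketch is only an illustration (drop_jumps is a hypothetical name): it builds a new list and compares each row with the last row that was kept, which is what the while loop above does after a deletion.

```python
def drop_jumps(rows, threshold):
    """Keep a row only if its y differs from the last kept row's y by at most threshold."""
    kept = []
    for row in rows:
        # The first row is always kept; later rows are compared with the last kept row.
        if not kept or abs(row[1] - kept[-1][1]) <= threshold:
            kept.append(row)
    return kept


data = [(1, 5), (2, 6), (3, 1), (4, 7), (5, 3), (6, 8)]
print(drop_jumps(data, 3))  # [(1, 5), (2, 6), (4, 7), (6, 8)]
```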
I have a vector v of size (m,1) whose elements are integers picked from 1:n. I want to create a matrix M of size (m,n) whose elements M(i,j) are 1 if v(i) = j, and are 0 otherwise. I do not want to use loops, and would like to implement this as a simple vector-matrix manipulation only.
So I thought I would first create a matrix with repeated elements
M = v * ones(1,n) % this is a (m,n) matrix of repeated v
For example v=[1,1,3,2]'
m = 4 and n = 3
M =
1 1 1
1 1 1
3 3 3
2 2 2
then I need to create a comparison vector c of size (1,n)
c = 1:n
1 2 3
Then I need to perform a series of logical comparisons
M(1,:)==c % this results in [1,0,0]
.
M(4,:)==c % this results in [0,1,0]
However, I thought it should be possible to perform the last steps of going through each single row in compact matrix notation, but I'm stumped and not knowledgeable enough about indexing.
The end result should be
M =
1 0 0
1 0 0
0 0 1
0 1 0
A very simple call to bsxfun will do the trick:
>> n = 3;
>> v = [1,1,3,2].';
>> M = bsxfun(@eq, v, 1:n)
M =
1 0 0
1 0 0
0 0 1
0 1 0
How the code works is actually quite simple. bsxfun stands for Binary Singleton EXpansion function. You give it two arrays of any size, as long as they are broadcastable, meaning that each can be expanded along its singleton dimensions until both have the same size. Here, the first argument is your vector v (note that it is transposed into a column) and the second is the row vector from 1 up to n. The column vector v gets replicated across n columns, and the row vector gets replicated down as many rows as there are in v. The expanded second array therefore has all 1s in its first column, all 2s in its second column, and so on up to n. Applying the eq (equals) operator element-wise to the two expanded arrays determines which values in v are equal to the respective column index.
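To make the expansion concrete, here is a small Python sketch that spells out the comparison element by element, which is what the broadcasted eq evaluates. The helper name one_hot_rows is made up for this illustration and is not part of any library.

```python
def one_hot_rows(v, n):
    """Row i has a 1 in column v[i] (1-based) and 0 elsewhere."""
    # Every value in v (one per row) is compared with every column index 1..n.
    return [[1 if value == j else 0 for j in range(1, n + 1)] for value in v]


for row in one_hot_rows([1, 1, 3, 2], 3):
    print(row)
# [1, 0, 0]
# [1, 0, 0]
# [0, 0, 1]
# [0, 1, 0]
```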
Here is a detailed time test and breakdown of each function. I placed each implementation into a separate function and I also let n=max(v) so that Luis's first code will work. I used timeit to time each function:
function timing_binary
n = 10000;
v = randi(1000,n,1);
m = numel(v);
function luis_func()
M1 = full(sparse(1:m,v,1));
end
function luis_func2()
%m = numel(v);
%n = 3; %// or compute n automatically as n = max(v);
M2 = zeros(m, n);
M2((1:m).' + (v-1)*m) = 1;
end
function ray_func()
M3 = bsxfun(@eq, v, 1:n);
end
function op_func()
M4= ones(1,m)'*[1:n] == v * ones(1,n);
end
t1 = timeit(@luis_func);
t2 = timeit(@luis_func2);
t3 = timeit(@ray_func);
t4 = timeit(@op_func);
fprintf('Luis Mendo - Sparse: %f\n', t1);
fprintf('Luis Mendo - Indexing: %f\n', t2);
fprintf('rayryeng - bsxfun: %f\n', t3);
fprintf('OP: %f\n', t4);
end
This test assumes n = 10000 and the vector v is a 10000 x 1 vector of randomly distributed integers from 1 up to 1000. BTW, I had to modify Luis's second function so that the indexing will work as the addition requires vectors of compatible dimensions.
Running this code, we get:
>> timing_binary
Luis Mendo - Sparse: 0.015086
Luis Mendo - Indexing: 0.327993
rayryeng - bsxfun: 0.040672
OP: 0.841827
Luis Mendo's sparse code wins (as I expected), followed by bsxfun, followed by indexing and followed by your proposed approach using matrix operations. The timings are in seconds.
Assuming n equals max(v), you can use sparse:
v = [1,1,3,2];
M = full(sparse(1:numel(v),v,1));
What sparse does is build a sparse matrix using the first argument as row indices, the second as column indices, and the third as matrix values. This is then converted into a full matrix with full.
Another approach is to define the matrix containing initially zeros and then use linear indexing to fill in the ones:
v = [1,1,3,2];
m = numel(v);
n = 3; %// or compute n automatically as n = max(v);
M = zeros(m, n);
M((1:m) + (v-1)*m) = 1;
I think I've also found a way to do it, and it would be nice if somebody could tell me which of the methods shown is faster for very large vectors and matrices. The additional method I thought of is the following
M= ones(1,m)'*[1:n] == v * ones(1,n)
Problem from:
https://www.hackerrank.com/contests/epiccode/challenges/white-falcon-and-sequence.
Visit the link for reference.
I have a sequence of integers (-10^6 to 10^6) A. I need to choose two contiguous disjoint subsequences of A, let's say x and y, of the same size, n.
After that, I calculate the sum given by ∑x(i)y(n−i+1) for i = 1..n (1-indexed),
and I have to choose x and y such that this sum is maximised.
Eg:
Input:
12
1 7 4 0 9 4 0 1 8 8 2 4
Output: 120
Where x = {4,0,9,4}
y = {8,8,2,4}
∑x(i)y(n−i+1)=4×4+0×2+9×8+4×8=120
Now, the approach I was thinking of is along the lines of O(n^2), as follows:
Initialise two variables l = 0 and r = N-1. Here, N is the size of the array.
Now, for l=0, I will calculate the sum while (l<r) which basically refers to the subsequences that will start from the 0th position in the array. Then, I will increment l and decrement r in order to come up with subsequences that start from the above position + 1 and on the right hand side, start from right-1.
Is there any better approach that I can use? Anything more efficient? I thought of sorting but we cannot sort numbers since that will change the order of the numbers.
To answer the question, we first define S(i, j) as the maximum sum of products of the two sub-sequences' items for the sub-array A[i...j], where sub-sequence x starts at position i and sub-sequence y ends at position j.
For example, if A=[1 7 4 0 9 4 0 1 8 8 2 4], then S(1, 2)=1*7=7 and S(2, 5)=7*9+4*0=63.
The recursive rule to compute S is: S(i, j)=max(0, S(i+1, j-1)+A[i]*A[j]), and the end condition is S(i, j)=0 if i>=j.
The requested final answer is simply the maximum value of S(i, j) over all combinations of i=1..N, j=1..N, since one of the S(i, j) values will correspond to the best x, y sub-sequences and thus will equal the maximum value for the whole array. Computing all such S(i, j) values takes O(N^2) time with dynamic programming: while computing S(i, j) we also compute up to N other S(i', j') values, but ultimately each combination is computed only once.
def max_sum(l):
    def _max_sub_sum(i, j):
        # Memoized S(i, j); m[i][j] is None until the value has been computed.
        if m[i][j] is None:
            v = 0
            if i < j:
                v = max(0, _max_sub_sum(i+1, j-1) + l[i]*l[j])
            m[i][j] = v
        return m[i][j]

    n = len(l)
    m = [[None for i in range(n)] for j in range(n)]
    v = 0
    for i in range(n):
        for j in range(i, n):
            v = max(v, _max_sub_sum(i, j))
    return v
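As a possible variation, the same recurrence can be evaluated without recursion or an N-by-N table, because S(i, j) depends only on S(i+1, j-1), which has the same value of i + j. The sketch below is an assumption-laden rewrite, not the answer's original code: max_sum_iterative is a made-up name and indices are 0-based. For each value of i + j it starts at the innermost pair and expands outwards, keeping a running S value.

```python
def max_sum_iterative(a):
    """Bottom-up evaluation of S(i, j) = max(0, S(i+1, j-1) + a[i]*a[j])."""
    n = len(a)
    best = 0
    for s in range(1, 2 * n - 2):  # s = i + j for 0-based pairs with i < j
        i = (s - 1) // 2           # innermost pair with i < j and i + j == s
        j = s - i
        running = 0                # S of the (empty) pair inside the innermost one
        while i >= 0 and j < n:
            running = max(0, running + a[i] * a[j])
            best = max(best, running)
            i -= 1
            j += 1
    return best


print(max_sum_iterative([1, 7, 4, 0, 9, 4, 0, 1, 8, 8, 2, 4]))  # 120
```

This keeps the O(N^2) running time but needs only constant extra memory and cannot hit Python's recursion limit on long inputs.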
WARNING:
This method assumes the numbers are non-negative, so it does not answer the poster's actual problem now that it has been clarified that negative input values are allowed.
Trick 1
Assuming the numbers are always non-negative, it is always best to make the sequences as wide as possible given the location where they meet.
Trick 2
We can change the sum into a standard convolution by summing over all values of i. This produces twice the desired result (as we get both the product of x with y, and y with x), but we can divide by 2 at the end to get the original answer.
Trick 3
You are now attempting to find the maximum of a convolution of a signal with itself. There is a standard method for doing this, which is to use the fast Fourier transform. Some libraries have this built in, e.g. SciPy has fftconvolve.
Python code
Note that you don't allow the central value to be reused (e.g. for a sequence 1,3,2 we can't make x 1,3 and y 3,1), so we need to examine alternate values of the convolved output.
We can now compute the answer in Python via:
import scipy.signal
A = [1, 7, 4, 0, 9, 4, 0, 1, 8, 8, 2, 4]
print(max(scipy.signal.fftconvolve(A, A)[1::2]) / 2)
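To see what the FFT-based call computes, here is a direct O(N^2) self-convolution in plain Python. It is only a sketch for checking small inputs, not a replacement for the FFT; self_convolution is a made-up helper name. Entry k of the output sums A[i]*A[j] over all i + j = k, so every pair with i < j is counted twice, which is why the result is halved.

```python
def self_convolution(a):
    """Direct convolution of a with itself: out[k] = sum of a[i]*a[j] with i + j == k."""
    out = [0] * (2 * len(a) - 1)
    for i, ai in enumerate(a):
        for j, aj in enumerate(a):
            out[i + j] += ai * aj
    return out


A = [1, 7, 4, 0, 9, 4, 0, 1, 8, 8, 2, 4]
conv = self_convolution(A)
# Odd positions contain no squared central term A[i]*A[i].
print(max(conv[1::2]) // 2)  # 120
```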
Given a matrix A, how do I get the elements (and their indices) larger than x in a specific range?
e.g.
A = [1:5; 2:6; 3:7; 4:8; 5:9]
A =
1 2 3 4 5
2 3 4 5 6
3 4 5 6 7
4 5 6 7 8
5 6 7 8 9
For instance, I want all elements that are larger than 5 and lie in the range A(2:4,3:5). I should get:
elements:
6 , 6 , 7 , 6 , 7 , 8
indices:
14, 18, 19, 22, 23, 24
A(A>5) would give me all entries which are larger than 5.
A(2:4,3:5) would give all elements in the range 2:4,3:5.
I want some combination of the two. Is that possible, or is the only way to copy the needed range into another array B and only then perform B(B>5)? That has two obvious problems: I'd lose the original indices, and it would be slower. I'm doing this on a large number of matrices.
Code. I'm trying to avoid matrix multiplication, so this may look a bit odd:
A = [1:5; 2:6; 3:7; 4:8; 5:9];
[r,c] = meshgrid(2:4,3:5);
n = sub2ind(size(A), r(:), c(:));
indices = sort(n(A(n) > 5)); %'skip sorting if not needed'
values = A(indices);
Explanation. The code converts the Cartesian product of the subscripts to linear indices in the A matrix. Then it selects the indices that respect the condition, then it selects the values.
However, it is slow.
Optimization. Following LuisMendo's suggestion, the code may be sped up by replacing the sub2ind-based linear index calculation with a handcrafted linear index calculation:
A = [1:5; 2:6; 3:7; 4:8; 5:9];
%'For column-first, 1-based-index array memory '
%'layout, as in MATLAB/FORTRAN, the linear index '
%'formula is: '
%'L = R + (C-1)*NR '
n = bsxfun(@plus, (2:4), (transpose(3:5) - 1)*size(A,1));
indices = n(A(n) > 5);
values = A(indices);
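The linear index formula L = R + (C-1)*NR can be checked by hand with a short Python sketch. It assumes the matrix is stored as a list of rows and reports MATLAB-style 1-based, column-major indices; select_in_range is a hypothetical helper written only for this illustration.

```python
def select_in_range(a, rows, cols, threshold):
    """Return (indices, values) of entries > threshold within the given 1-based rows/cols."""
    n_rows = len(a)
    indices, values = [], []
    for c in cols:      # outer loop over columns gives column-major order
        for r in rows:
            value = a[r - 1][c - 1]
            if value > threshold:
                indices.append(r + (c - 1) * n_rows)  # L = R + (C-1)*NR
                values.append(value)
    return indices, values


A = [[i + j for j in range(1, 6)] for i in range(5)]  # same matrix as in the question
indices, values = select_in_range(A, range(2, 5), range(3, 6), 5)
print(indices)  # [14, 18, 19, 22, 23, 24]
print(values)   # [6, 6, 7, 6, 7, 8]
```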
If you only need the values (not the indices), it can be done using the third output of find and matrix multiplication. I don't know if it will be faster than using a temporary array, though:
[~, ~, values] = find((A(2:4,3:5)>5).*A(2:4,3:5));
Assuming you need the linear indices and the values, then if the threshold is positive you could define a mask. This may be a good idea if the mask can be defined once and reused for all matrices (that is, if the desired range is the same for all matrices):
mask = false(size(A));
mask(2:4,3:5) = true;
indices = find(A.*mask>5);
values = A(indices);
It's a little clunky, but:
R = 2:4;
C = 3:5;
I = reshape(1:numel(A),size(A)); % linear index of every element (find(A) would skip zeros)
indices = nonzeros(I(R,C).*(A(R,C)>5))
values = A(indices)
I have a 3x3 cell array in which each element stores an (x,y) point.
The points are generated from random numbers in [0,1].
What I want to do is sort the cell array so that it looks like the following:
Example: 9 points
Each circle is one 2D point.
Index (1,1) is the top-left corner and (3,3) is the bottom-right corner, as in the usual array indexing.
That is to ensure the topological order.
How do I do it?
Thanks in advance.
For example, given
pairs = [4 9 2 6 5 1 7 8 3; 9 6 2 1 3 8 7 4 5] (row 1 = x-values, row 2 = y-values)
what I want to do is put them in the cell array so that they can be connected by red lines as in the image's topology.
The number of permutations is factorial(9), which is not terribly large. So a brute-force approach is feasible: test all permutations for your desired conditions, and pick the first that is valid.
In the following I'm using a 2x3x3 array, instead of a 3x3 cell array containing length-2 vectors, because it's much easier that way.
N = 3;
data = rand(2,N,N);
permutations = perms(1:N^2); %// generate all permutations
for k = 1:size(permutations,1) %// one permutation per row
dx = reshape(data(1,permutations(k,:)),N,N); %// permuted x data
dy = reshape(data(2,permutations(k,:)),N,N); %// permuted y data
if all(all(diff(dy,[],1)<0)) && all(all(diff(dx,[],2)>0)) %// solution found
disp(dx) %// display solution: x values
disp(dy) %// y values
break %// we only want one solution
end
end
Note that for some choices of data there may not be a solution.
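For comparison, the same brute-force search can be sketched in Python with itertools.permutations. This is only an illustration under assumptions: find_grid_order is a made-up name, the points are (x, y) tuples, and the grid is filled row by row rather than in MATLAB's column-major order. It returns None when no arrangement satisfies the conditions.

```python
from itertools import permutations


def find_grid_order(points, n):
    """Arrange n*n points so x increases along rows and y decreases down columns."""
    for perm in permutations(points):
        grid = [list(perm[r * n:(r + 1) * n]) for r in range(n)]
        # x must strictly increase from left to right in every row.
        x_ok = all(grid[r][c][0] < grid[r][c + 1][0]
                   for r in range(n) for c in range(n - 1))
        # y must strictly decrease from top to bottom in every column.
        y_ok = all(grid[r][c][1] > grid[r + 1][c][1]
                   for r in range(n - 1) for c in range(n))
        if x_ok and y_ok:
            return grid  # we only want one solution
    return None


points = [(0.1, 0.9), (0.8, 0.7), (0.2, 0.1), (0.9, 0.2)]
print(find_grid_order(points, 2))
```

For 9 points this loops over up to 362880 permutations, which is slow in pure Python but still feasible.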