Find median position points of duration events - arrays

I have the following vector A:
A = [34 35 36 5 6 7 78 79 7 9 10 80 81 82 84 85 86 102 3 4 6 103 104 105 106 8 11 107 201 12 202 203 204];
For n = 2, I counted the elements larger or equal to 15 within A:
D = cellfun(#numel, regexp(char((A>=15)+'0'), [repmat('0',1,n) '+'], 'split'));
The above expression gives the following output as duration values:
D = [3 2 7 4 6] = [A(1:3) **stop** A(7:8) **stop** A(12:18) **stop** A(22:25) **stop** A(28:33)];
The above algorithm computes the duration values by counting the elements larger or equal to 15. The counting also allows less than 2 consecutive elements smaller than 15 (n = 2). The counter stops when there are 2 or more consecutive elements smaller than 15 and starts over at the next substring within A.
Eventually, I want a way to find the median position points of the duration events A(1:3), A(7:8), A(12:18), A(22:25) and A(28:33), which are correctly computed. The result should look like this:
a1 = round(median(A(1:3))) = 2;
a2 = round(median(A(7:8))) = 8;
a3 = round(median(A(12:18))) = 15;
a4 = round(median(A(22:25))) = 24;
a5 = round(median(A(28:33))) = 31;
I edited the question to make it more clear, because the solution that was provided here assigns the last number within the row of 2 or more consecutive numbers smaller than 15 (3 in this case) after A(1:3) to the next substring A(7:8)and the same with the other substrings, therefore generating wrong duration values and in consequence wrong median position points of the duration events when n = 2 or for any given even n.
Anyone has any idea how to achieve this?

Related

Delete values between specific ranges of indices in an array

I have an array :
Z = [1 24 3 4 52 66 77 8 21 100 101 120 155];
I have another array:
deletevaluesatindex=[1 3; 6 7;10 12]
I want to delete the values in array Z at indices (1 to 3, 6 to 7, 10 to 12) represented in the array deletevaluesatindex
So the result of Z is:
Z=[4 52 8 21 155];
I tried to use the expression below, but it does not work:
X([deletevaluesatindex])=[]
Another solution using bsxfun and cumsum:
%// create index matrix
idx = bsxfun(#plus , deletevaluesatindex.', [0; 1])
%// create mask
mask = zeros(numel(Z),1);
mask(idx(:)) = (-1).^(0:numel(idx)-1)
%// extract unmasked elements
out = Z(~cumsum(mask))
out = 4 52 8 21 155
This will do it:
rdvi= size(deletevaluesatindex,1); %finding rows of 'deletevaluesatindex'
temp = cell(1,rdvi); %Pre-allocation
for i=1:rdvi
%making a cell array of elements to be removed
temp(i)={deletevaluesatindex(i,1):deletevaluesatindex(i,2)};
end
temp = cell2mat(temp); %Now temp array contains the elements to be removed
Z(temp)=[] % Removing the elements
If you control how deletevaluesatindex is generated, you can instead directly generate the ranges using MATLAB's colon operator and concatenate them together using
deletevaluesatindex=[1:3 6:7 10:12]
then use the expression you suggested
Z([deletevaluesatindex])=[]
If you have to use deletevaluesatindex as it is given, you can generate the concatenated range using a loop or something like this
lo = deletevaluseatindex(:,1)
up = deletevaluseatindex(:,2)
x = cumsum(accumarray(cumsum([1;up(:)-lo(:)+1]),[lo(:);0]-[0;up(:)]-1)+1);
deleteat = x(1:end-1)
Edit: as in comments noted this solution only works in GNU Octave
with bsxfun this is possible:
Z=[1 24 3 4 52 66 77 8 21 100 101 120 155];
deletevaluesatindex = [1 3; 6 7;10 12];
idx = 1:size(deletevaluesatindex ,1);
idx_rm=bsxfun(#(A,B) (A(B):deletevaluesatindex (B,2))',deletevaluesatindex (:,1),idx);
Z(idx_rm(idx_rm ~= 0))=[]

Find timeline for duration values in Matlab

I have the following time-series:
b = [2 5 110 113 55 115 80 90 120 35 123];
Each number in b is one data point at a time instant. I computed the duration values from b. Duration is represented by all numbers within b larger or equal to 100 and arranged consecutively (all other numbers are discarded). A maximum gap of one number smaller than 100 is allowed. This is how the code for duration looks like:
N = 2; % maximum allowed gap
duration = cellfun(#numel, regexp(char((b>=100)+'0'), [repmat('0',1,N) '+'], 'split'));
giving the following duration values for b:
duration = [4 3];
I want to find the positions (time-lines) within b for each value in duration. Next, I want to replace the other positions located outside duration with zeros. The result would look like this:
result = [0 0 3 4 5 6 0 0 9 10 11];
If anyone could help, it would be great.
Answer to original question: pattern with at most one value below 100
Here's an approach using a regular expression to detect the desired pattern. I'm assuming that one value <100 is allowed only between (not after) values >=100. So the pattern is: one or more values >=100 with a possible value <100 in between .
b = [2 5 110 113 55 115 80 90 120 35 123]; %// data
B = char((b>=100)+'0'); %// convert to string of '0' and '1'
[s, e] = regexp(B, '1+(.1+|)', 'start', 'end'); %// find pattern
y = 1:numel(B);
c = any(bsxfun(#ge, y, s(:)) & bsxfun(#le, y, e(:))); %// filter by locations of pattern
y = y.*c; %// result
This gives
y =
0 0 3 4 5 6 0 0 9 10 11
Answer to edited question: pattern with at most n values in a row below 100
The regexp needs to be modified, and it has to be dynamically built as a function of n:
b = [2 5 110 113 55 115 80 90 120 35 123]; %// data
n = 2;
B = char((b>=100)+'0'); %// convert to string of '0' and '1'
r = sprintf('1+(.{1,%i}1+)*', n); %// build the regular expression from n
[s, e] = regexp(B, r, 'start', 'end'); %// find pattern
y = 1:numel(B);
c = any(bsxfun(#ge, y, s(:)) & bsxfun(#le, y, e(:))); %// filter by locations of pattern
y = y.*c; %// result
Here is another solution, not using regexp. It naturally generalizes to arbitrary gap sizes and thresholds. Not sure whether there is a better way to fill the gaps. Explanation in comments:
% maximum step size and threshold
N = 2;
threshold = 100;
% data
b = [2 5 110 113 55 115 80 90 120 35 123];
% find valid data
B = b >= threshold;
B_ind = find(B);
% find lengths of gaps
step_size = diff(B_ind);
% find acceptable steps (and ignore step size 1)
permissible_steps = 1 < step_size & step_size <= N;
% find beginning and end of runs
good_begin = B_ind([permissible_steps, false]);
good_end = good_begin + step_size(permissible_steps);
% fill gaps in B
for ii = 1:numel(good_begin)
B(good_begin(ii):good_end(ii)) = true;
end
% find durations of runs in B. This finds points where we switch from 0 to
% 1 and vice versa. Due to padding the first match is always a start of a
% run, the last one always an end. There will be an even number of matches,
% so we can reshape and diff and thus fidn the durations
durations = diff(reshape(find(diff([false, B, false])), 2, []));
% get positions of 'good' data
outpos = zeros(size(b));
outpos(B) = find(B);

Retrieving specific elements in matlab from arrays in MATLAB

I have an array in MATLAB
For example
a = 1:100;
I want to select the first 4 element in every successive 10 elements.
In this example I want to b will be
b = [1,2,3,4,11,12,13,14, ...]
can I do it without for loop?
I read in the internet that i can select the element for each step:
b = a(1:10:end);
but this is not working for me.
Can you help me?
With reshape
%// reshaping your matrix to nx10 so that it has successive 10 elements in each row
temp = reshape(a,10,[]).'; %//'
%// taking first 4 columns and reshaping them back to a row vector
b = reshape(temp(:,1:4).',1,[]); %//'
Sample Run for smaller size (although this works for your actual dimensions)
a = 1:20;
>> b
b =
1 2 3 4 11 12 13 14
To vectorize the operation you must generate the indices you wish to extract:
a = 1:100;
b = a(reshape(bsxfun(#plus,(1:4)',0:10:length(a)-1),[],1));
Let's break down how this works. First, the bsxfun function. This performs a function, here it is addition (#plus) on each element of a vector. Since you want elements 1:4 we make this one dimension and the other dimension increases by tens. this will lead a Nx4 matrix where N is the number of groups of 4 we wish to extract.
The reshape function simply vectorizes this matrix so that we can use it to index the vector a. To better understand this line, try taking a look at the output of each function.
Sample Output:
>> b = a(reshape(bsxfun(#plus,(1:4)',0:10:length(a)-1),[],1))
b =
Columns 1 through 19
1 2 3 4 11 12 13 14 21 22 23 24 31 32 33 34 41 42 43
Columns 20 through 38
44 51 52 53 54 61 62 63 64 71 72 73 74 81 82 83 84 91 92
Columns 39 through 40
93 94

check if ALL elements of a vector are in another vector

I need to loop through coloumn 1 of a matrix and return (i) when I have come across ALL of the elements of another vector which i can predefine.
check_vector = [1:43] %% I dont actually need to predefine this - i know I am looking for the numbers 1 to 43.
matrix_a coloumn 1 (which is the only coloumn i am interested in looks like this for example
1
4
3
5
6
7
8
9
10
11
12
13
14
16
15
18
17
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
1
3
4
2
6
7
8
We want to loop through matrix_a and return the value of (i) when we have hit all of the numbers in the range 1 to 43.
In the above example we are looking for all the numbers from 1 to 43 and the iteration will end round about position 47 in matrix_a because it is at this point that we hit number '2' which is the last number to complete all numbers in the sequence 1 to 43.
It doesnt matter if we hit several of one number on the way, we count all those - we just want to know when we have reached all the numbers from the check vector or in this example in the sequence 1 to 43.
Ive tried something like:
completed = []
for i = 1:43
complete(i) = find(matrix_a(:,1) == i,1,'first')
end
but not working.
Assuming A as the input column vector, two approaches could be suggested here.
Approach #1
With arrayfun -
check_vector = [1:43]
idx = find(arrayfun(#(n) all(ismember(check_vector,A(1:n))),1:numel(A)),1)+1
gives -
idx =
47
Approach #2
With customary bsxfun -
check_vector = [1:43]
idx = find(all(cumsum(bsxfun(#eq,A(:),check_vector),1)~=0,2),1)+1
To find the first entry at which all unique values of matrix_a have already appeared (that is, if check_vector consists of all unique values of matrix_a): the unique function almost gives the answer:
[~, ind] = unique(matrix_a, 'first');
result = max(ind);
Someone might have a more compact answer but is this what your after?
maxIndex = 0;
for ii=1:length(a)
[f,index] = ismember(ii,a);
maxIndex=max(maxIndex,max(index));
end
maxIndex
Here is one solution without a loop and without any conditions on the vectors to be compared. Given two vectors a and b, this code will find the smallest index idx where a(1:idx) contains all elements of b. idx will be 0 when b is not contained in a.
a = [ 1 4 3 5 6 7 8 9 10 11 12 13 14 16 15 18 17 19 20 21 22 23 24 25 26 ...
27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 1 3 4 2 6 7 8 50];
b = 1:43;
[~, Loca] = ismember(b,a);
idx = max(Loca) * all(Loca);
Some details:
ismember(b,a) checks if all elements of b can be found in a and the output Loca lists the indices of these elements within a. The index will be 0, if the element cannot be found in a.
idx = max(Loca) then is the highest index in this list of indices, so the smallest one where all elements of b are found within a(1:idx).
all(Loca) finally checks if all indices in Loca are nonzero, i.e. if all elements of b have been found in a.

Vectorized range checking in Matlab

In trying to port an algorithm from C# to Matlab I found that Matlab is inefficient at running for loops. As such I want to vectorize the algorithm.
I have following inputs:
lowrange:
[ 00 10 20 30 40 50 ... ]
highrange:
[ 10 20 30 40 50 60 ... ]
These arrays are equal in length.
I now have a third array Values (which could be any length) and for this array I want to count the occurrences of Values elements between lowerange(i) and highrange(i) (You can see I'm coming from a for loop).
The output should be an array of length lowrange/highrange.
So with the above arrays and input LineData:
[ 1 2 3 4 6 11 12 16 31 34 45 ]
I expect to get:
[ 05 03 00 02 01 00 ... ]
I tried the (for me) obvious thing:
LineData(LineData < PixelEnd & LineData > PixelStart)
But that doesn't work because it just checks LineData on an element by element way. It does not try to apply the comparison over all values in LineData.
Unfortunately, I cannot come up with anything else since I'm not yet used to think in a Matlab 'vector' way, let alone knowing all applicable instructions from memory.
As you are looking to do a basic histogram with given edges, you can use Matlabs built-in function histc:
values = [ 1 2 3 4 6 11 12 16 31 34 45 ];
edges = 0:10:60;
histc(values, edges)
ans =
5 3 0 2 1 0 0
For ranges with identical intervals and starting from 0, here's a bsxfun based counting approach -
LineData = [ 1 2 3 4 6 11 12 16 31 34 45 ] %// Input
interval = 10; %// interval width
num_itervals = 6; %// number of intervals
%// Get matches for each interval and sum them within each interval for the counts
out = sum(bsxfun(#eq,ceil(LineData(:)/interval),1:num_itervals))
Output -
LineData =
1 2 3 4 6 11 12 16 31 34 45
out =
5 3 0 2 1 0
Assuming that the last interval would be the one holding the max of input data, you can try out a diff + indexing based approach too -
LineData = [ 1 2 3 4 6 11 12 16 31 34 45 ] %// Input
interval = 10; %// interval width
labels = ceil(LineData(:)/interval); %// set labels to each input entry
df_labels = diff(labels)~=0; %// mark the change of labels
df_labels_pos = find([df_labels; 1]); %// get the positions of label change
intv_pos= labels([true;df_labels]);%// position of each interval with nonzero counts
%// get counts from interval between label position change and put at right places
out(intv_pos) = [ df_labels_pos(1) ; diff(df_labels_pos)];

Resources