How can I calculate the length of groups of consecutive ones in a binary vector? [duplicate] - arrays

I have a series of binary vectors (time x 1) in which 1s represent a connection between two variables at a given point in time. The connections between the two variables are sporadic, and I would like to know how 'long' each connection between the two variables exists for.
e.g. if the vector for a given set of variables is:
[0 0 1 1 1 0 0 0 1 0 0 1 1 1 1 1 1 1 ]
Then I would like to create a new variable which contains the length of contiguous 1s in each instance. From the above example, the new variable would look like this:
[3,1,7]
As the first time that a 1 arose, it was there for 3 consecutive time points, whereas the next time it was only there for 1 time point and finally, the connection was in the data for 7 consecutive time points.
If there is a good way to solve this, I'd love some help.
Cheers
Mac

diff and cumsum make a good pair!
a = [0 0 1 1 1 0 0 0 1 0 0 1 1 1 1 1 1 1];
b = cumsum([a 0]);   %// running count of ones seen so far
c = diff( [0 b(diff([a 0]) == -1) ] )   %// counts at each 1-to-0 drop, then differenced
%// or
c = diff( [0 b(~(diff([a 0]) + 1)) ] )
c =
3 1 7
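As a cross-check, here is a minimal sketch of a different route to the same result: pad the vector with zeros, locate where each run of ones starts and stops, and subtract. The names starts, stops and runLengths are illustrative and not part of the answer above.

```matlab
a = [0 0 1 1 1 0 0 0 1 0 0 1 1 1 1 1 1 1];

% Pad with a zero on each side so runs touching either edge are detected.
d = diff([0 a 0]);

starts = find(d == 1);    % index in a of the first 1 of each run
stops  = find(d == -1);   % index in a just past the last 1 of each run

runLengths = stops - starts;
disp(runLengths)
```

For the example vector this gives the same three lengths as the cumsum version, and starts also tells you when each connection began.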

Related

Number 0's and 1's blocks in a binary vector

In MATLAB, there is the bwlabel function, that given a binary vector, for instance x=[1 1 0 0 0 1 1 0 0 1 1 1 0] gives (bwlabel(x)):
[1 1 0 0 0 2 2 0 0 3 3 3 0]
but what I want to obtain is
[1 1 2 2 2 3 3 4 4 5 5 5 6]
I know I can negate x to obtain (bwlabel(~x))
[0 0 1 1 1 0 0 2 2 0 0 0 3]
But how can I combine them?
All in one line:
y = cumsum([1,abs(diff(x))])
Namely, abs(diff(x)) marks every position where the binary vector changes value, and the cumulative sum of those marks (seeded with a leading 1) gives the block number of each element.
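A small sketch of the intermediate values may make the one-liner easier to follow. It uses a shorter vector that starts with 0 (an assumption for illustration, not the vector from the question) to show that the first element does not need to be 1.

```matlab
x = [0 0 1 1 0];

changes = abs(diff(x));      % 1 wherever consecutive elements differ
y = cumsum([1, changes]);    % the first block is 1; each change starts a new block

disp(changes)
disp(y)
```

Each 1 in changes bumps the running block counter, which is why the labels increase by one at every switch between zeros and ones.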
You can still do it using bwlabel by vertically concatenating x and ~x, using 4-connected components for the labeling, then taking the maximum down each column:
>> max(bwlabel([x; ~x], 4))
ans =
1 1 2 2 2 3 3 4 4 5 5 5 6
However, the solution from Bentoy13 is probably a bit faster.
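x=[1 1 0 0 0 1 1 0 0 1 1 1 0];
A = bwlabel(x);    % labels the blocks of ones
B = bwlabel(~x);   % labels the blocks of zeros
if x(1)==1
    % a block of ones comes first: ones get odd labels, zeros get even labels
    tmp = A>0;
    A(tmp) = 2*A(tmp)-1;
    tmp = B>0;
    B(tmp) = 2*B(tmp);
    C = A+B
elseif x(1)==0
    % a block of zeros comes first: zeros get odd labels, ones get even labels
    tmp = A>0;
    A(tmp) = 2*A(tmp);
    tmp = B>0;
    B(tmp) = 2*B(tmp)-1;
    C = A+B
end
C =
1 1 2 2 2 3 3 4 4 5 5 5 6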
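The idea is to interleave the two labelings. Whichever array labels the first block must end up with the odd labels 1, 3, 5, and so on, so its k-th label becomes 2k-1; the other array takes the even labels 2, 4, 6, and so on, so its k-th label becomes 2k. If x starts with 1, this is A+A-1 for the nonzero entries of A and B+B for the nonzero entries of B; if x starts with 0, the roles swap. So a simple check for whether A or B holds the even labels is sufficient, and because A and B are never nonzero at the same position, adding the two arrays gives the combined labeling.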
I found this function, which does exactly what I wanted:
https://github.com/davidstutz/matlab-multi-label-connected-components
Clone the repository and compile it in MATLAB using mex:
mex sp_fast_connected_relabel.cpp
Then:
labels = sp_fast_connected_relabel(x);

Strange behavior when modifying logical arrays [duplicate]

I have an array that contains a bunch of logical values. It looks like:
test = [1 1 1 1 1 0 0 1 1 0 0 ...].
If I want to change a normal array of scalar values, let's say
a = [1 2 3 4]
I could do:
a(a == 1) = 5
and the result would be
[5 2 3 4]
As expected.
However if I do:
test(test == 0) = 5
I get back something unexpected:
[1 1 1 1 1 1 1 1 1 1 1 1 1 1....
All of the 0s have been changed to 1!
I suspect this is because the array is filled with logicals, and because of typechecking MATLAB coerces any value that is not 1 or 0 to the closest logical value - but I want to confirm. This is surely strange.
This is because your array is of class logical, and 5 converts to true when it is assigned into a logical array, which displays as 1. In English, your code test(test == 0) = 5 translates to "set all false values to true". The result is an all-true array, i.e. all ones.
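A minimal sketch of the behaviour and one possible workaround, assuming the goal is to store the value 5 itself: convert the array to double before assigning. The variable names t1 and t2 are illustrative, and the elided part of the original test vector is left out.

```matlab
test = logical([1 1 1 1 1 0 0 1 1 0 0]);

% Assigning into a logical array keeps its class, so 5 is stored as true.
t1 = test;
t1(t1 == 0) = 5;
disp(class(t1))

% Converting to double first lets the array hold arbitrary numeric values.
t2 = double(test);
t2(t2 == 0) = 5;
disp(t2)
```

Here t1 ends up all true, while t2 keeps its ones and has 5 wherever the original held false.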

Find indices of blocks of 0s that are continuous [duplicate]

I have a vector and I want to find the indices of blocks of consecutive 0s that are at least 3 elements long.
y = [1 1 1 0 1 1 0 0 0 1 1 1 0 1 0 1 0 0 1 0 0 0 0 1 1];
So in this case, the blocks should be [0 0 0] from 7-9 and [0 0 0 0] from 20-23. The output should give me the indices, something like [7, 9] and [20, 23], or even better, change each of these blocks of 0s to a single NaN to become:
[1 1 1 0 1 1 NaN 1 1 1 0 1 0 1 0 0 1 NaN 1 1]
Thanks!
What you can do is:
1. Pad the vector with 1 on each side.
2. Use find and diff to find where the vector changes from 1 to 0 (diff = -1).
3. Use find and diff to find where the vector changes from 0 to 1 (diff = 1).
4. Find the duration of each interval by subtracting the values in 2 from the values in 3 (and add 1).
5. Create a logical vector with true where the duration is >= 3, and use that vector to find the start indices (from the values found in point 2).
6. Set the value of each of the start indices to NaN.
7. Set the value of start indices + 1 : end indices to [].
And you're set to go!
It actually took a lot more time writing the explanation than it took to write the code. It's quite a nice exercise to learn some basic MATLAB so I'll leave it to you. Good luck!
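For readers who want to compare notes afterwards, the steps above can be sketched as follows. This is one possible reading of the recipe, not the answerer's own code; the names starts, stops, blocks and z are illustrative, and the index arithmetic assumes starts and stops are expressed as positions in the unpadded y.

```matlab
y = [1 1 1 0 1 1 0 0 0 1 1 1 0 1 0 1 0 0 1 0 0 0 0 1 1];

d = diff([1 y 1]);            % pad with ones so zero blocks at the edges are seen
starts = find(d == -1);       % position in y of the first 0 of each block
stops  = find(d == 1) - 1;    % position in y of the last 0 of each block

keep = (stops - starts + 1) >= 3;
starts = starts(keep);
stops  = stops(keep);
blocks = [starts(:) stops(:)];   % one row per block: [first last]

% Collapse each block to a single NaN, working backwards so that
% deleting elements does not shift the indices of earlier blocks.
z = y;
for k = numel(starts):-1:1
    z(starts(k)) = NaN;
    z(starts(k)+1:stops(k)) = [];
end
```

With this indexing convention the duration is stops - starts + 1; if you take the raw output of find(d == 1) instead, the extra 1 is already accounted for.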

Average of dynamic row range

I have a table of rows which consist of zeros and numbers like this:
A B C D E F G H I J K L M N
0 0 0 4 3 1 0 1 0 2 0 0 0 0
0 1 0 1 4 0 0 0 0 0 1 0 0 0
9 5 7 9 10 7 2 3 6 4 4 0 1 0
I want to calculate the average of each row, including zeros but starting from the first nonzero value, and put it into the column after the table's end. E.g. for the first row the first nonzero value is 4, so the average is 11/11; for the second row it is 7/13; for the last one it is 67/14.
How could I do this using Excel formulas? Probably OFFSET with nested IF?
This still needs to be entered as an array formula (ctrl-shift-enter) but it isn't volatile:
=AVERAGE(INDEX(($A2:$O2),MATCH(TRUE,$A2:$O2<>0,0)):$O2)
or, depending on your locale (semicolon as the argument separator):
=AVERAGE(INDEX(($A2:$O2);MATCH(TRUE;$A2:$O2<>0;0)):$O2)
The sum is the same no matter how many 0's you include, so all you need to worry about is what to divide it by, which you could determine using nested IFs, or take a cue from this: https://superuser.com/questions/671435/excel-formula-to-get-first-non-zero-value-in-row-and-return-column-header
Thank you, Scott Hunter, for good reference.
I solved the problem using a huge formula, and I think it's a bit awkward.
Here it is:
=AVERAGE(INDIRECT(CELL("address";INDEX(A2:O2;MATCH(TRUE;INDEX(A2:O2<>0;;);0)));TRUE):O2)

Element-wise array replication according to a count [duplicate]

My question is similar to this one, but I would like to replicate each element according to a count specified in a second array of the same size.
For example, given an array v = [3 1 9 4], I want to use rep = [2 3 1 5] to replicate the first element 2 times, the second 3 times, and so on, to get [3 3 1 1 1 9 4 4 4 4 4].
So far I'm using a simple loop to get the job done. This is what I started with:
vv = [];
for i=1:numel(v)
    vv = [vv repmat(v(i),1,rep(i))];
end
I managed to improve by preallocating space:
vv = zeros(1,sum(rep));
c = cumsum([1 rep]);
for i=1:numel(v)
    vv(c(i):c(i)+rep(i)-1) = repmat(v(i),1,rep(i));
end
However I still feel there has to be a more clever way to do this... Thanks
Here's one way I like to accomplish this:
>> index = zeros(1,sum(rep));
>> index(cumsum([1 rep(1:end-1)])) = 1
index =
1 0 1 0 0 1 1 0 0 0 0
>> index = cumsum(index)
index =
1 1 2 2 2 3 4 4 4 4 4
>> vv = v(index)
vv =
3 3 1 1 1 9 4 4 4 4 4
This works by first creating an index vector of zeroes the same length as the final count of all the values. By performing a cumulative sum of the rep vector with the last element removed and a 1 placed at the start, I get a vector of indices into index showing where the groups of replicated values will begin. These points are marked with ones. When a cumulative sum is performed on index, I get a final index vector that I can use to index into v to create the vector of heterogeneously-replicated values.
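The same steps, collected into a short script that can be run as-is. It assumes every entry of rep is at least 1, since a zero count would make two group starts collide. The comparison against repelem is an addition for checking only: repelem is a built-in available from R2015a onwards and was not part of this answer.

```matlab
v   = [3 1 9 4];
rep = [2 3 1 5];

index = zeros(1, sum(rep));
index(cumsum([1 rep(1:end-1)])) = 1;   % mark where each group starts
vv = v(cumsum(index));                 % running group number, used as an index into v

% Cross-check against the built-in (R2015a or later).
vv2 = repelem(v, rep);
disp(isequal(vv, vv2))
```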
To add to the list of possible solutions, consider this one:
vv = cellfun(@(a,b)repmat(a,1,b), num2cell(v), num2cell(rep), 'UniformOutput',0);
vv = [vv{:}];
This is much slower than the one by gnovice.
What you are trying to do is to run-length decode. A high level reliable/vectorized utility is the FEX submission rude():
% example inputs
counts = [2, 3, 1];
values = [24,3,30];
% the result
rude(counts, values)
ans =
24 24 3 3 3 30
Note that this function performs the opposite operation as well, i.e. run-length encodes a vector or in other words returns values and the corresponding counts.
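If installing a File Exchange submission is not an option, the encoding direction can be sketched in base MATLAB with diff and find. This is not how rude() is implemented; it is only a minimal illustration of run-length encoding for a non-empty row vector, and the names edges, values and counts are illustrative.

```matlab
x = [24 24 3 3 3 30];

edges  = [true, diff(x) ~= 0];                  % true at the first element of each run
values = x(edges);                              % one value per run
counts = diff([find(edges), numel(x) + 1]);     % distance between consecutive run starts

disp(values)
disp(counts)
```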
The accumarray function can be used to make the code work if zeros exist in the rep array:
function vv = repeatElements(v, rep)
    index = accumarray(cumsum(rep)'+1, 1);
    vv = v(cumsum(index(1:end-1))+1);
end
This works similarly to the solution of gnovice, except that the indices are accumulated instead of being assigned to 1. This allows some indices to be skipped (3 and 6 in the example below), which removes the corresponding elements from the output.
>> v = [3 1 42 9 4 42];
>> rep = [2 3 0 1 5 0];
>> index = accumarray(cumsum(rep)'+1, 1)'
index =
0 0 1 0 0 2 1 0 0 0 0 2
>> cumsum(index(1:end-1))+1
ans =
1 1 2 2 2 4 5 5 5 5 5
>> vv = v(cumsum(index(1:end-1))+1)
vv =
3 3 1 1 1 9 4 4 4 4 4
