Find high & low peak points in cell array MATLAB - arrays

I want to find "significant" changes in a cell array in MATLAB for when I have a movement.
E.g. I have YT which represents movements in a yaw presentation for a face interaction. YT can change based on an interaction from anywhere upwards of 80x1 to 400x1. The first few lines might be
YT = {-7 -8 -8 -8 -8 -9 -9 -9 -6 ...}
I would like to record the following
Over the entire cell array;
1) Count the number of high and low peaks
I can do this with findpeaks, but how do I find low peaks?
2) Measure the difference between each peak -
For this example, peaks -9 and -6 so difference of +3 between those. So report 1 peak change of +3. At the moment I am only interested in changes of +/- 3, but this might change, so I will need a threshold?
and then over X number of cells (repeating for the cell array)
3) count number of changes - for this example, 3 changes
4) count number of significant changes - for this example, 1 change of -/+3
5) describe the change - 1 change of -1, 1 change of -1, 1 change of +3
Any help would be appreciated, bit of a MATLAB noob.
Thanks!

1) Finding negative peaks is the same as finding positive ones - all you need to do is multiply the sequence by -1 and then findpeaks again
2) If you simply want the differences, then you could subtract the vectors of the positive and negative peaks (possibly offset by one if you want differences in both directions). Something like pospeaks-negpeaks would do one side. You'd need to identify whether the positive or negative peak was first (use the loc return from findpeaks to determine this), and then do pospeaks(1:end-1)-negpeaks(2:end) or vice versa as appropriate.
[edit]As pointed out in your comment, the above assumes that pospeaks and negpeaks are the same length. I shouldn't have been so lazy! The code might be better written as:
if (length(pospeaks)>length(negpeaks))
% Starts and ends with a positive peak
neg_diffs=pospeaks(1:end-1)-negpeaks;
pos_diffs=negpeaks-pospeaks(2:end);
elseif (length(pospeaks)<length(negpeaks))
% Starts and ends with a negative peak
pos_diffs=negpeaks(1:end-1)-pospeaks;
neg_diffs=pospeaks-negpeaks(2:end);
elseif posloc(1)<negloc(1)
% Starts with a positive peak, and ends with a negative one
neg_diffs=pospeaks-negpeaks;
pos_diffs=pospeaks(2:end)-negpeaks(1:end-1);
else
% Starts with a negative peak, and ends with a positive one
pos_diffs=negpeaks-pospeaks;
neg_diffs=negpeaks(2:end)-pospeaks(1:end-1);
end
I'm sure that could be coded more effectively, but I can't think just now how to write it more compactly. posloc and negloc are the location returns from findpeaks.[/edit]
For (3) to (5) it is easier to record the differences between samples: changes=[YT{2:end}]-[YT{1:end-1}];
3) To count changes, count the number of non-zeros in the difference between adjacent elements: sum(changes~=0)
4) You don't define what you mean by "significant changes", but the test is almost identical to 3) sum(abs(changes)>=3)
5) It is simply changes(changes~=0)
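For readers who want to try the ideas above outside MATLAB, here is a minimal Python sketch under a few assumptions: the cell array has already been flattened to a plain list, `local_maxima`, `local_minima` and `summarize_changes` are hypothetical helper names, and the peak finder is a simplified stand-in (it collapses plateaus and ignores the endpoints), not a reimplementation of findpeaks. It shows the negate-and-search trick from 1) and the adjacent-difference counts from 3) to 5).

```python
def local_maxima(values):
    """Return (index, value) pairs of interior local maxima; a plateau counts once."""
    # Collapse runs of equal values, keeping the index where each run starts.
    runs = [(i, v) for i, v in enumerate(values) if i == 0 or v != values[i - 1]]
    peaks = []
    for k in range(1, len(runs) - 1):
        if runs[k - 1][1] < runs[k][1] > runs[k + 1][1]:
            peaks.append(runs[k])
    return peaks


def local_minima(values):
    """Low peaks: negate the data, find the maxima, then negate the values back."""
    return [(i, -v) for i, v in local_maxima([-v for v in values])]


def summarize_changes(values, threshold):
    """Differences between adjacent samples, plus the non-zero and significant ones."""
    changes = [b - a for a, b in zip(values, values[1:])]
    nonzero = [c for c in changes if c != 0]
    significant = [c for c in changes if abs(c) >= threshold]
    return changes, nonzero, significant


yt = [-7, -8, -8, -8, -8, -9, -9, -9, -6]  # sample data from the question
changes, nonzero, significant = summarize_changes(yt, threshold=3)
print(len(nonzero), len(significant), nonzero)
print(local_maxima(yt), local_minima(yt))
```

With the sample data from the question this reports three non-zero changes (-1, -1, +3), one of which meets the threshold of 3, and a single low peak of -9.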

I would suggest diff as the command that can provide the basis of a solution to all your problems (after first converting the cell to an array with cell2mat). It outputs the difference between adjacent values along an array:
1) You'd have to define what a 'peak' is but at a guess:
YT = cell2mat(YT); % convert cell to array
change = diff(YT); % get diffs
highp = sum(change >= 3); % high peak threshold
lowp = sum(change <= -3); % low peak threshold
2) diff(cell2mat(YT)) provides this.
3)
YT = cell2mat(YT); % convert cell to array
change = diff(YT); % get diffs
count = sum(change~=0);
4) Seems to be answered in the other points?

Related

Efficiently finding an element in an array where consecutive elements differ by +1/0/-1

I have this problem, that I feel I am vastly overcomplicating. I feel like this should be incredibly basic, but I am stumbling on a mental block.
The question reads as follows:
Given an array of integers A[1..n], such that A[1] ≤ A[n] and for all
i, 1 ≤ i < n, we have |A[i] − A[i+ 1]| ≤ 1. Devise a semi-efficient
algorithm (better in the worst case than the naive case of looking at
every cell in the array) to find any j such that A[j] = z for a given
value of z, A[1] ≤ z ≤ A[n].
My understanding of the given array is as follows: You have an array that is 1-indexed where the first element of the array is smaller than or equal to the last element of the array. Each element of the array is within 1 of the previous one (so A[2] could be -1, 0, or +1 of A[1]'s value).
I have had several solutions to this question, all of which have had their issues. Here is an example of one to show my thought process.
i = 2
while i <= n {
if (A[i] == x) then
break // This can be changed into a less messy case where
// I don't use break, but this is a rough concept
else if (abs(A[i] - j) <= 1) then
i--
else
i += 2
}
This however fails when most of the values inside the array are repeating.
An array of [1 1 1 1 1 1 1 1 1 1 2] where searching for 2 for example, it would run forever.
Most of my attempted algorithms follow a similar concept of incrementing by 2, as that seems like the most logical approach when dealing with an array that is increasing by a maximum of 1. However, I am struggling to find any that would work in a case such as [1 1 1 1 1 1 1 1 1 1 2], as they all either fail or match the naive worst case of n.
I am unsure if I am struggling because I don't understand what the question is asking, or if I am simply struggling to put together an algorithm.
What would an algorithm look like that fits the requirements?
This can be solved via a form of modified binary search. The most important premises:
the input array always contains the element
distance between adjacent elements is always at most 1
there's always an increasingly ordered subarray containing the searched value
Taking it from there we can apply two strategies:
divide and conquer: we can reduce the range searched by half, since we always know which subarray will definitely contain the specified value as a part of an increasing sequence.
limiting the search-range: suppose the searched value is 3 and the limiting value on the right half of the range is 6, we can then shift the right limit to the left by 3 cells.
As code (pythonesque, but untested):
def search_semi_binary(arr, val):
    low, up = 0, len(arr) - 1
    while low != up:
        # reduce search space
        low += abs(val - arr[low])
        up -= abs(val - arr[up])
        # binary search
        mid = (low + up) // 2
        if arr[mid] == val:
            return mid
        elif val < arr[mid]:
            # value is definitely in the lower part of the array
            up = mid - 1
        else:
            # value is definitely in the upper part of the array
            low = mid + 1
    return low
The basic idea consists of two parts:
First we can reduce the search space. This uses the fact that adjacent cells of the array may only differ by one. I.e. if the lower bound of our search space has an absolute difference of 3 to val, we can shift the lower bound to the right by at least three without shifting the value out of the search window. Same applies to the upper bound.
The next step follows the basic principle of binary search using the following loop-invariant:
At the start of each iteration there exists an array-element in arr[low:up + 1] that is equal to val and arr[low] <= val <= arr[up]. This is also guaranteed after applying the search-space reduction. Depending on how mid is chosen, one of three cases can happen:
arr[mid] == val: in this case, the searched index is found
arr[mid] < val: In this case arr[mid] < val <= arr[up] must hold due to the assumption of an initial valid state
arr[mid] > val: analogous for arr[mid] > val >= arr[low]
For the latter two cases, we can pick low = mid + 1 (or up = mid - 1 respectively) and start the next iteration.
In the worst case, you'll have to look at all array elements.
Assume all elements are zero, except that a[k] = 1 for one single k, 1 ≤ k ≤ n. k isn't known, obviously. And you look for the value 1. Until you visit a[k], whatever you visit has a value of 0. Any element that you haven't visited could be equal to 1.
Let's say we are looking for the number 5. If the array starts with A[1]=1, the best-case scenario is having the 5 in A[5], as the value needs to be incremented at least 4 times. If A[5] = 3, then let's check A[7], as it's the closest possible position. How do we decide it's A[7]? From the number we are looking for, call it R for result, we subtract what we currently have, call it C for current, and add the difference to i, as in A[i+(R-C)].
Unfortunately, this improves on the naive scan in every scenario except the worst case (when we still iterate through the whole array).
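The jump-ahead idea in this last answer can be sketched in a few lines of Python. This is only an illustration under the question's assumptions (adjacent elements differ by at most 1); `jump_search` is a hypothetical name, and the list is 0-indexed rather than 1-indexed as in the question.

```python
def jump_search(arr, z):
    """Return an index i with arr[i] == z, or -1 if the scan runs off the end."""
    i = 0
    while i < len(arr):
        gap = abs(z - arr[i])
        if gap == 0:
            return i
        # Neighbours differ by at most 1, so z cannot appear within
        # the next gap - 1 positions; skip straight past them.
        i += gap
    return -1


print(jump_search([1, 1, 2, 2, 3, 4, 5], 5))
print(jump_search([1] * 10 + [2], 2))
```

The first call probes positions 0, 4 and 6 only (A[1], A[5], A[7] in the question's 1-based numbering); the second call still has to step through the list one cell at a time, which is the worst case mentioned above.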

Daily Coding Problem 260 : Reconstruct a jumbled array - Intuition?

I'm going through the question below.
The sequence [0, 1, ..., N] has been jumbled, and the only clue you have for its order is an array representing whether each number is larger or smaller than the last. Given this information, reconstruct an array that is consistent with it.
For example, given [None, +, +, -, +], you could return [1, 2, 3, 0, 4].
I went through the solution on this post but still unable to understand it as to why this solution works. I don't think I would be able to come up with the solution if I had this in front of me during an interview. Can anyone explain the intuition behind it? Thanks in advance!
This answer tries to give a general strategy for finding an algorithm to tackle this type of problem. It is not trying to prove why the given solution is correct, but laying out a route towards such a solution.
A tried and tested way to tackle this kind of problem (actually a wide range of problems) is to start with small examples and work your way up. This works for puzzles, and equally for problems encountered in reality.
First, note that the question is formulated deliberately to not point you in the right direction too easily. It makes you think there is some magic involved. How can you reconstruct a list of N numbers given only the list of plusses and minuses?
Well, you can't. For 10 numbers, there are 10! = 3628800 possible permutations, but only 2⁹ = 512 possible lists of signs. That is a huge difference, so most original lists will come out completely different after reconstruction.
Here's an overview of how to approach the problem:
Start with very simple examples
Try to work your way up, adding a bit of complexity
If you see something that seems a dead end, try increasing complexity in another way; don't spend too much time with situations where you don't see progress
While exploring alternatives, revisit old dead ends, as you might have gained new insights
Try whether recursion could work:
given a solution for N, can we easily construct a solution for N+1?
or even better: given a solution for N, can we easily construct a solution for 2N?
Given a recursive solution, can it be converted to an iterative solution?
Does the algorithm do some repetitive work that can be postponed to the end?
....
So, let's start simple (writing 0 for the None at the start):
very short lists are easy to guess:
'0++' → 0 1 2 → clearly only one solution
'0--' → 2 1 0 → only one solution
'0-+' → 1 0 2 or 2 0 1 → hey, there is no unique outcome, though the question only asks for one of the possible outcomes
lists with only plusses:
'0++++++' → 0 1 2 3 4 5 6 → only possibility
lists with only minuses:
'0-------'→ 7 6 5 4 3 2 1 0 → only possibility
lists with one minus, the rest plusses:
'0-++++' → 1 0 2 3 4 5 or 5 0 1 2 3 4 or ...
'0+-+++' → 0 2 1 3 4 5 or 0 5 1 2 3 4 or ...
→ no very obvious pattern seems to emerge
maybe some recursion could help?
given a solution for N, appending one sign more?
appending a plus is easy: just repeat the solution and append the largest plus 1
appending a minus, after some thought: increase all the numbers by 1 and append a zero
→ hey, we have a working solution, but maybe not the most efficient one
the algorithm just appends to an existing list, no need to really write it recursively (although the idea is expressed recursively)
appending a plus can be improved, by storing the largest number in a variable so it doesn't need to be searched at every step; no further improvements seem necessary
appending a minus is more troublesome: the list needs to be traversed with each append
what if instead of appending a zero, we append -1, and do the adding at the end?
this clearly works when there is only one minus
when two minus signs are encountered, the first time append -1, the second time -2
→ hey, this works for any number of minuses encountered, just store its counter in a variable and sum with it at the end of the algorithm
This is in bird's eye view one possible route towards coming up with a solution. Many routes lead to Rome. Introducing negative numbers might seem tricky, but it is a logical conclusion after contemplating the recursive algorithm for a while.
It works because all changes are sequential, either adding one or subtracting one, starting both the increasing and the decreasing sequences from the same place. That guarantees we have a sequential list overall. For example, given the arbitrary
[None, +, -, +, +, -]
turned vertically for convenience, we can see
None 0
+ 1
- -1
+ 2
+ 3
- -2
Now just shift them up by two (to account for -2):
2 3 1 4 5 0
+ - + + -
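The counter-and-shift idea worked through above can be written as a short Python sketch. The function name `reconstruct` and the use of '+' / '-' strings with a leading None are assumptions made for illustration.

```python
def reconstruct(signs):
    """Build a permutation of 0..N consistent with signs (signs[0] is None)."""
    low = high = 0          # lowest and highest values handed out so far
    result = [0]            # start both runs from the same place
    for s in signs[1:]:
        if s == '+':
            high += 1
            result.append(high)
        else:
            low -= 1
            result.append(low)
    # low is <= 0, so subtracting it shifts every value into 0..N
    return [v - low for v in result]


print(reconstruct([None, '+', '-', '+', '+', '-']))
print(reconstruct([None, '+', '+', '-', '+']))
```

For the two patterns shown, this gives 2 3 1 4 5 0 (the worked example above) and 1 2 3 0 4 (the example from the question).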
Let's first look at a solution which (I think) is easier to understand, formalize and prove correct (though I will only explain it and not prove it formally):
We name A[0..N] our input array (where A[k] is None if k = 0 and is + or - otherwise) and B[0..N] our output array (where B[k] is in the range [0, N] and all values are unique)
At first we see that our problem (find B such that B[k] > B[k-1] if A[k] == + and B[k] < B[k-1] if A[k] == -) is only a special case of another problem:
Find B such that B[k] == max(B[0..k]) if A[k] == + and B[k] == min(B[0..k]) if A[k] == -.
This generalizes from "a value must be larger or smaller than the last" to "a value must be larger or smaller than every value before it".
So a solution to this problem is a solution to the original one as well.
Now how do we approach this problem?
A greedy solution is sufficient. It is easy to show that the value associated with the last + will be the largest number (which is N), the one associated with the second-to-last + will be the second largest (which is N-1), etc.
Likewise, the value associated with the last - will be the smallest number (which is 0), the one associated with the second-to-last - will be the second smallest (which is 1), etc.
So we can fill B from right to left, keeping track of how many + we have seen (call this X) and how many - we have seen (call this Y). If the current symbol is a +, we put N-X in B and increase X by 1; if it is a -, we put 0+Y in B and increase Y by 1.
In the end we fill B[0] with the only remaining value, which is equal to both Y and N-X (since X+Y = N once every symbol has been processed).
An interesting property of this solution is that the values associated with a - are exactly the values from 0 to Y-1 (where Y is now the total number of -), in decreasing order; the values associated with a + are exactly the values from N-X+1 to N (where X is now the total number of +), in increasing order; and B[0] is always Y, which equals N-X.
So the - will have all the values strictly smaller than B[0] and reverse sorted and the + will have all the values strictly bigger than B[0] and sorted.
This property is the key to understand why the solution proposed here works:
It takes B[0] equal to 0 and then fills B following this property. The result is not yet a solution because the values are not in the range [0, N], but a simple translation moves the range to [0, N].
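The right-to-left greedy fill described in this answer might look like the following Python sketch. The name `reconstruct_greedy` is hypothetical, and the sketch assumes that once every sign has been processed the single unused value equals the number of - signs seen.

```python
def reconstruct_greedy(signs):
    """Fill B from right to left: '+' takes the largest free value, '-' the smallest."""
    n = len(signs) - 1
    result = [None] * (n + 1)
    plus_seen = minus_seen = 0      # X and Y in the text
    for k in range(n, 0, -1):
        if signs[k] == '+':
            result[k] = n - plus_seen
            plus_seen += 1
        else:
            result[k] = minus_seen
            minus_seen += 1
    # The only value left over sits between the '-' block and the '+' block.
    result[0] = minus_seen
    return result


print(reconstruct_greedy([None, '+', '+', '-', '+']))
```

For the pattern from the question this produces 1 2 3 0 4.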
The idea is to produce a permutation of [0,1...N] which follows the pattern of [+,-...]. There are many permutations that fit; it isn't a single one. For instance, look at the example provided:
[None, +, +, -, +], you could return [1, 2, 3, 0, 4].
But you could also have returned other solutions, just as valid: [2,3,4,0,1] and [0,3,4,1,2] are also solutions. The only concern is that the first number needs at least two numbers above it for positions [1] and [2], and that you leave one number for position [3] which is lower than both the one before and the one after it.
So the question isn't finding the one and only pattern which is scrambled, but to produce any permutation which will work with these rules.
This algorithm answers two questions for the next member of the list: get a number that is higher/lower than the previous one, and get a number that hasn't been used yet. It takes a starting number and essentially creates two lists: an ascending list for the '+' and a descending list for the '-'. This way we guarantee that the next member is higher/lower than the previous one (because it is in fact higher/lower than all previous members, a stricter condition than the one required), and for the same reason we know this number wasn't used before.
So the intuition of the referenced algorithm is to start with a reference number and work your way through. Let's assume we start from 0. In the first place we put 0+1, which is 1. We keep 0 as our lowest and 1 as our highest.
l[0] h[1] list[1]
the next symbol is '+' so we take the highest number and raise it by one to 2, and update both the list with a new member and the highest number.
l[0] h[2] list [1,2]
The next symbol is '+' again, and so:
l[0] h[3] list [1,2,3]
The next symbol is '-', so we have to put in our 0. Note that if the following symbol were also '-' we would have to stop, since we have no lower number left to produce.
l[0] h[3] list [1,2,3,0]
Luckily for us, we've chosen well and the last symbol is '+', so we can put in our 4 and call it a day.
l[0] h[4] list [1,2,3,0,4]
This is not necessarily the smartest solution, as it can never know whether the starting number will solve the sequence, and it always progresses by 1. That means that for some patterns [+,-...] it will not be able to find a solution. But for the pattern provided it works well with 0 as the initial starting point. If we chose the number 1 it would also work and produce [2,3,4,0,1], but for 2 and above it will fail. It will never produce the solution [0,3,4,1,2].
I hope this helps understanding the approach.
This is not an explanation for the question put forward by OP.
Just want to share a possible approach.
Given: N = 7
Index:   0 1 2 3 4 5 6 7
Pattern: X + - + - + - +    // X = None
Go from 0 to N
[1] Fill all '-' positions, starting from the right and going left.
Index:   0 1 2 3 4 5 6 7
Pattern: X + - + - + - +    // X = None
Answer:      2   1   0
[2] Fill all the vacant places, i.e. [X & +], starting from the left and going right.
Index:   0 1 2 3 4 5 6 7
Pattern: X + - + - + - +    // X = None
Answer:  3 4   5   6   7
Final:
Pattern: X + - + - + - +    // X = None
Answer:  3 4 2 5 1 6 0 7
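A possible Python rendering of these two passes, as a sketch only (`reconstruct_two_pass` is a hypothetical name, and the pattern is assumed to be a list starting with None):

```python
def reconstruct_two_pass(signs):
    """Pass 1 fills '-' slots right to left; pass 2 fills the rest left to right."""
    n = len(signs) - 1
    result = [None] * (n + 1)
    next_value = 0
    # [1] '-' positions, from the right going left, get 0, 1, 2, ...
    for k in range(n, 0, -1):
        if signs[k] == '-':
            result[k] = next_value
            next_value += 1
    # [2] vacant positions (None and '+'), from the left going right, get the rest
    for k in range(n + 1):
        if result[k] is None:
            result[k] = next_value
            next_value += 1
    return result


print(reconstruct_two_pass([None, '+', '-', '+', '-', '+', '-', '+']))
```

For the N = 7 pattern above this yields 3 4 2 5 1 6 0 7, matching the final row.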
My answer is probably too late for your problem, but if you want a simple argument for why it works, read on:
min_so_far (or min_last) is a decreasing value starting from 0.
max_so_far (or max_last) is an increasing value starting from 0.
Each input entry after the initial None is either "+" or "-"; a "+" increases max_so_far by one and a "-" decreases min_so_far by one. There are N such entries, so max_so_far - min_so_far is exactly equal to N. The values produced therefore cover [min_so_far, max_so_far], which overlaps the required range [0, N] only on [0, max_so_far]. Since min_so_far <= 0, subtracting min_so_far from every value of the current answer (equivalently, adding abs(min_so_far)) shifts the result onto [0, N].

Binary search modification

I have been attempting to solve the following problem. I have a sequence of positive
integer numbers which can be very long (several millions of elements). This
sequence can contain "jumps" in the element values. A jump
means that two consecutive elements differ from each other by more than 1.
Example 01:
1 2 3 4 5 6 7 0
In the above mentioned example the jump occurs between 7 and 0.
I have been looking for an efficient algorithm (in terms of running time) for
finding of the position where this jump occurs. This issue is complicated by the
fact that there can be a situation when two jumps are present and one of them
is the jump which I am looking for and the other one is a wrap-around which I
am not looking for.
Example 02:
9 1 2 3 4 6 7 8
Here the first jump between 9 and 1 is a wrap-around. The second jump between
4 and 6 is the jump which I am looking for.
My idea is to somehow modify the binary search algorithm but I am not sure whether it is possible due to the wrap-around presence. It is worthwhile to say that only two jumps can occur in maximum and between these jumps the elements are sorted. Does anybody have any idea? Thanks in advance for any suggestions.
You cannot find an efficient solution (efficient meaning better than O(n), i.e. not looking at all numbers), since you cannot conclude anything about your numbers by looking at fewer than all of them. For example, if you only look at every second number (still O(n), but with a better constant factor) you would miss double jumps like this one: 1 5 3. You can and must look at every single number and compare it to its neighbours. You could split your workload and use a multicore approach, but that's about it.
Update
If you have the special case that there is only 1 jump in your list and the rest is sorted (e.g. 1 2 3 7 8 9), you can find this jump rather efficiently. You cannot use vanilla binary search, since the list might not be fully sorted and you don't know what number you are searching for, but you could use an adaptation of exponential search, which bears some resemblance.
We need the following assumptions for this algorithm to work:
There is only 1 jump (I ignore the "wrap around jump" since it is not technically between any following elements)
The list is otherwise sorted and it is strictly monotonically increasing
With these assumptions we are basically searching for an interruption in the monotonicity. That means we are looking for two elements a and b that are n positions apart but do not fulfil b = a + n. This equality must hold if there is no jump between the two elements. Now you only need to find elements that violate it in a nonlinear manner, hence the exponential approach. This pseudocode could be such an algorithm:
let numbers be an array of length n fulfilling our assumptions
start = 0
stepsize = 1
while (start < n-1)
    if (start + stepsize > n-1)
        stepsize = n-1 - start   // clamp so that stop stays inside the array
    stop = start + stepsize
    while (numbers[stop] != numbers[start] + stepsize)
        // the jump must be between start and stop
        if (stepsize == 1)
            // congratulations, the jump is between start and start + 1
            return start
        else
            stepsize = floor(stepsize / 2)
            stop = start + stepsize
    start += stepsize
    stepsize *= 2
no jump found
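As a runnable illustration of the pseudocode, here is a Python sketch under the same assumptions (a single jump, and otherwise each element is exactly one more than its predecessor). The name `find_jump` is hypothetical, and the grow/shrink logic is folded into one loop rather than following the pseudocode line by line.

```python
def find_jump(numbers):
    """Return i such that the jump lies between i and i + 1, or None if there is none."""
    n = len(numbers)
    start, step = 0, 1
    while start < n - 1:
        step = min(step, n - 1 - start)      # keep the probe inside the list
        if numbers[start + step] == numbers[start] + step:
            start += step                    # no jump in this stretch: move on
            step *= 2                        # and probe further next time
        elif step == 1:
            return start                     # neighbours disagree: jump found
        else:
            step //= 2                       # jump is inside the stretch: shrink it
    return None


print(find_jump([1, 2, 3, 7, 8, 9]))
print(find_jump([1, 2, 3, 4, 5, 6, 7, 0]))
```

The first call returns 2 (the jump between 3 and 7) and the second returns 6 (the jump between 7 and 0 from Example 01). The wrap-around case from Example 02 is not handled here.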

Define a vector with random steps

I want to create an array that has incremental random steps, I've used this simple code.
t_inici=(0:10*rand:100);
The problem is that the random number stays the same between steps. Is there any simple way to change the seed of the random number within each step?
If you have a set number of points, say nPts, then you could do the following
nPts = 10; % Could use 'randi' here for random number of points
lims = [0, 10] % Start and end points
x = rand(1, nPts); % Create random numbers
% Sort and scale x to fit your limits and be ordered
x = diff(lims) * ( sort(x) - min(x) ) / ( max(x) - min(x) ) + lims(1)
This approach always includes your end point, which a 0:dx:10 approach would not necessarily.
If you had some maximum number of points, say nPtsMax, then you could do the following
nPtsMax = 1000; % Max number of points
lims = [0,10]; % Start and end points
% Could do 10* or any other multiplier as in your example in front of 'rand'
x = lims(1) + [0 cumsum(rand(1, nPtsMax))];
x(x > lims(2)) = []; % remove values above maximum limit
This approach may be slower, but is still fairly quick and better represents the behaviour in your question.
My first approach to this would be to generate N-2 samples, where N is the desired amount of samples randomly, sort them, and add the extrema:
N=50;
endpoint=100;
initpoint=0;
randsamples=sort(rand(1, N-2)*(endpoint-initpoint)+initpoint);
t_inici=[initpoint randsamples endpoint];
However, I'm not sure how "uniformly random" this is, as you are "faking" the last 2 data points to have the extrema included. This will somewhat distort pure randomness (I think). If you are not necessarily interested in including the extrema, then just remove the last line and generate N points. That will make sure that they are indeed random (or as random as MATLAB can create them).
Here is an alternative solution with "uniformly random"
[initpoint,endpoint,coef]=deal(0,100,10);
t_inici(1)=initpoint;
while(t_inici(end)<endpoint)
    t_inici(end+1)=t_inici(end)+rand()*coef;
end
t_inici(end)=[];
In my view, it matches your attempt well: the steps are unknown, and it starts from 0 but does not necessarily end at 100.
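For comparison, the same grow-until-past-the-end loop can be sketched in Python. The function name `random_steps` and its parameters are assumptions for illustration; each step is drawn uniformly from [0, max_step), mirroring rand()*coef.

```python
import random


def random_steps(start, stop, max_step, rng=None):
    """Accumulate uniformly random steps from start, keeping values that do not exceed stop."""
    rng = rng or random.Random()
    values = [start]
    while True:
        nxt = values[-1] + rng.random() * max_step   # a fresh random step every time
        if nxt > stop:
            break                                    # drop the value that overshoots
        values.append(nxt)
    return values


t_inici = random_steps(0, 100, 10, random.Random(0))
print(len(t_inici), t_inici[:3])
```

Passing a seeded `random.Random` makes the run repeatable; leaving `rng` out gives a different vector on each call.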
From your code it seems you want a uniformly random step that varies between each two entries. This implies that the number of entries that the vector will have is unknown in advance.
A way to do that is as follows. This is similar to Hunter Jiang's answer but adds entries in batches instead of one by one, in order to reduce the number of loop iterations.
1. Guess a number of required entries, n. Any value will do, but a large value will result in fewer iterations and will probably be more efficient.
2. Initialize result to the first value.
3. Generate n entries and concatenate them to the (temporary) result.
4. See if the current entries are already too many.
5. If they are, cut as needed and output the (final) result. Else go back to step 3.
Code:
lower_value = 0;
upper_value = 100;
step_scale = 10;
n = ceil(5*(upper_value-lower_value)/step_scale*2); % STEP 1. The number 5 here is arbitrary.
% It's probably more efficient to err with too many than with too few
result = lower_value; % STEP 2
done = false;
while ~done
    result = [result result(end)+cumsum(step_scale*rand(1,n))]; % STEP 3. Include
    % n new entries
    ind_final = find(result>upper_value,1); % STEP 4. Index of first entry exceeding
    % upper_value, if any
    if ~isempty(ind_final) % STEP 5. If non-empty, we're done
        result = result(1:ind_final-1); % keep only the entries up to upper_value
        done = true;
    end
end

Split Entire Hash Range Into n Equal Ranges

I am looking to take a hash range (md5 or sha1) and split it into n equal ranges.
For example, if m (num nodes) = 5, the entire hash range would be split by 5 so that there would be a uniform distribution of key ranges. I would like n=1 (node 1) to be from the beginning of the hash range to 1/5, 2 from 1/5 to 2/5, etc all the way to the end.
Basically, I need to have key ranges mapped to each n such that when I hash a value, it knows which n is going to take care of that range.
I am new to hashing and a little bit unsure of where I could start on solving this for a project. Any help you could give would be great.
If you are looking to place a hash value into a number of "buckets" evenly, then some simple math will do the trick. Watch out for rounding edge cases... You would be better to use a power of 2 for the BUCKETS value.
This is python code, by the way, which supports large integers...
BUCKETS = 5
BITS = 160
BUCKETSIZE = 2**BITS // BUCKETS  # integer division (Python 3)
int('ad01c5b3de58a02a42367e33f5bdb182d5e7e164', 16) // BUCKETSIZE == 3
int('553ae7da92f5505a92bbb8c9d47be76ab9f65bc2', 16) // BUCKETSIZE == 1
int('001c7c8c5ff152f1cc8ed30421e02a898cfcfb23', 16) // BUCKETSIZE == 0
If you can tolerate a small bias that is very hard to remove (no power of two can be divided evenly by 5, so there has to be some bias), then modulo (% in C and many other languages with C-like syntax) is the way to divide the full range into 5 almost identically sized partitions.
Any message m with md5(m)%5==0 is in the first partition, etc.
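To make the contiguous-range mapping concrete, here is a small Python sketch. The helper names `bucket_of_hash` and `bucket_for` are hypothetical. It multiplies before dividing, which is one way to sidestep the rounding edge case mentioned above: with a fixed bucket size of 2**160 // 5, the very largest hash value would land in a sixth bucket.

```python
import hashlib


def bucket_of_hash(h, buckets, bits=160):
    """Map an integer hash in [0, 2**bits) to a bucket index in [0, buckets)."""
    # Scale first, then divide: the result can never reach `buckets`.
    return h * buckets // (1 << bits)


def bucket_for(key, buckets):
    """Hash a bytes key with SHA-1 (160 bits) and return its bucket index."""
    h = int.from_bytes(hashlib.sha1(key).digest(), "big")
    return bucket_of_hash(h, buckets)


print(bucket_of_hash(int('ad01c5b3de58a02a42367e33f5bdb182d5e7e164', 16), 5))
print(bucket_for(b"some key", 5))
```

Bucket k then owns the contiguous range of hashes from k/5 to (k+1)/5 of the full space, so node 1 in the question corresponds to bucket 0, node 2 to bucket 1, and so on.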
