Given an array, we need to find the length of the longest subsequence with alternating increasing and decreasing values.
For example, if the array is
7 4 8 9 3 5 2 1 then L = 6 for 7,4,8,3,5,2 or 7,4,9,3,5,1, etc.
The subsequence may also start with a small element followed by a bigger one.
What would be the most efficient solution for this? I had a DP solution in mind. And if we were to do it using brute force, how would we do it (O(n^3)?)?
And it's not a homework problem.
You can indeed use a dynamic programming approach here. For the sake of simplicity, assume we only need to find the maximal length of such a subsequence of seq (it is easy to tweak the solution to recover the subsequence itself).
For each index we will store 2 values:
maximal length of alternating sequence ending at that element where last step was increasing (say, incr[i])
maximal length of alternating sequence ending at that element where last step was decreasing (say, decr[i])
also by definition we assume incr[0] = decr[0] = 1
Then each incr[i] and decr[i] can be found recursively (each stays at 1 when no suitable j exists):
incr[i] = max(decr[j])+1, where j < i and seq[j] < seq[i]
decr[i] = max(incr[j])+1, where j < i and seq[j] > seq[i]
The required length is the maximum value across both arrays. The complexity of this approach is O(N*N), and it requires 2N extra memory (where N is the length of the initial sequence).
A simple example in C:
int seq[N]; // initial sequence
int incr[N], decr[N];
... // Init sequences, fill incr and decr with 1's as initial values
for (int i = 1; i < N; ++i){
for (int j = 0; j < i; ++j){
if (seq[j] < seq[i])
{
// handle "increasing" step - need to check previous "decreasing" value
if (decr[j]+1 > incr[i]) incr[i] = decr[j] + 1;
}
if (seq[j] > seq[i])
{
// handle "decreasing" step - need to check previous "increasing" value
if (incr[j]+1 > decr[i]) decr[i] = incr[j] + 1;
}
}
}
... // Now all arrays are filled, iterate over them and find maximum value
How the algorithm works:
step 0 (initial values):
seq = 7 4 8 9 3 5 2 1
incr = 1 1 1 1 1 1 1 1
decr = 1 1 1 1 1 1 1 1
step 1. take the value at index 1 ('4') and check the previous values. 7 > 4, so we make a "decreasing" step from index 0 to index 1: decr[1] = MAX(decr[1], incr[0]+1). New values:
incr = 1 1 1 1 1 1 1 1
decr = 1 2 1 1 1 1 1 1
step 2. take the value 8 and iterate over the previous values:
7 < 8, make increasing step: incr[2] = MAX(incr[2], decr[0]+1):
incr = 1 1 2 1 1 1 1 1
decr = 1 2 1 1 1 1 1 1
4 < 8, make increasing step: incr[2] = MAX(incr[2], decr[1]+1):
incr = 1 1 3 1 1 1 1 1
decr = 1 2 1 1 1 1 1 1
etc...
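For comparison, here is a minimal Python sketch of the same recurrence (Python is used only for brevity; the function name longest_alternating is made up for this illustration). It fills incr and decr exactly as described above and returns the overall maximum.

```python
def longest_alternating(seq):
    """Length of the longest subsequence whose steps alternate up/down."""
    n = len(seq)
    if n == 0:
        return 0
    incr = [1] * n  # best length ending at i where the last step was increasing
    decr = [1] * n  # best length ending at i where the last step was decreasing
    for i in range(1, n):
        for j in range(i):
            if seq[j] < seq[i]:
                # increasing step: extend a sequence that ended on a decrease
                incr[i] = max(incr[i], decr[j] + 1)
            elif seq[j] > seq[i]:
                # decreasing step: extend a sequence that ended on an increase
                decr[i] = max(decr[i], incr[j] + 1)
    return max(max(incr), max(decr))


print(longest_alternating([7, 4, 8, 9, 3, 5, 2, 1]))  # 6
```

For the example array this gives 6, matching the value stated in the question.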
I'm trying to write a function that shuffles an array, which contains repeating elements, but ensures that repeating elements are not too close to one another.
This code works but seems inefficient to me:
function shuffledArr = distShuffle(myArr, myDist)
% this function takes an array myArr and shuffles it, while ensuring that repeating
% elements are more than myDist elements away from one another
% flag to indicate whether there are repetitions within myDist
reps = 1;
while reps
% set to 0 to break while-loop, will be set to 1 if it doesn't meet condition
reps = 0;
% randomly shuffle array
shuffledArr = Shuffle(myArr);
% loop through each unique value, find its positions, and calculate the distance to the next occurrence
for x = 1:length(unique(myArr))
% check if there are any repetitions that are separated by myDist or less
if any(diff(find(shuffledArr == x)) <= myDist)
reps = 1;
break;
end
end
end
This seems suboptimal to me for three reasons:
1) It may not be necessary to repeatedly shuffle until a solution has been found.
2) This while loop will go on forever if there is no possible solution (i.e. setting myDist to be too high to find a configuration that fits). Any ideas on how to catch this in advance?
3) There must be an easier way to determine the distance between repeating elements in an array than what I did by looping through each unique value.
I would be grateful for answers to points 2 and 3, even if point 1 is correct and it is possible to do this in a single shuffle.
I think it is sufficient to check the following condition to prevent infinite loops:
[~, num, C] = mode(myArr); % num: count of the mode; C: cell holding all modes
N = numel(C{1}); % number of distinct modes
assert( (myDist < N) || (myDist-N+1) * (num-1) + N*num <= numel(myArr), ...
'Shuffling impossible!');
Assume that myDist is 2 and we have the following data:
[4 6 5 1 6 7 4 6]
We find the mode, 6, and its number of occurrences, 3. We arrange the 6s, separating them by 2 = myDist blanks:
6 _ _ 6 _ _ 6
There must be (3-1) * myDist = 4 numbers to fill the blanks. We have five other numbers, so the array can be shuffled.
The problem becomes more complicated if we have multiple modes. For example for this array [4 6 5 1 6 7 4 6 4] we have N=2 modes: 6 and 4. They can be arranged as:
6 4 _ 6 4 _ 6 4
We have 2 blanks and three more numbers [5 1 7] that can be used to fill them. If, for example, we had only one other number [5], it would be impossible to fill the blanks and the array could not be shuffled.
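The counting argument above can be sketched in Python as follows. This is only an illustration: the name shuffle_is_feasible is made up, and the test is the same inequality as in the assert, rewritten as (num-1)*(myDist+1) + N <= numel(myArr). The answer itself only says it "thinks" the condition is sufficient, so treat a True result as "not ruled out by counting" rather than as a proof.

```python
from collections import Counter


def shuffle_is_feasible(arr, my_dist):
    """Counting check: can equal values be kept more than my_dist apart?"""
    if not arr:
        return True
    counts = Counter(arr)
    num = max(counts.values())  # occurrences of the mode
    n_modes = sum(1 for c in counts.values() if c == num)  # number of modes
    # (num - 1) full blocks of length my_dist + 1, plus one final group of modes
    needed = (num - 1) * (my_dist + 1) + n_modes
    return needed <= len(arr)


print(shuffle_is_feasible([4, 6, 5, 1, 6, 7, 4, 6], 2))     # True
print(shuffle_is_feasible([4, 6, 5, 1, 6, 7, 4, 6, 4], 2))  # True
print(shuffle_is_feasible([6, 4, 6, 4, 6, 4, 5], 2))        # False
```

The last call is the "only one filler [5]" case from the text: two blanks would be needed but only one other number is available.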
For the third point you can use sparse matrix to accelerate the computation (My initial testing in Octave shows that it is more efficient):
function shuffledArr = distShuffleSparse(myArr, myDist)
[U,~,idx] = unique(myArr);
reps = true;
while reps
S = Shuffle(idx);
shuffledBin = sparse ( 1:numel(idx), S, true, numel(idx) + myDist, numel(U) );
reps = any (diff(find(shuffledBin)) <= myDist);
end
shuffledArr = U(S);
end
Alternatively, you can use sub2ind and sort instead of a sparse matrix:
function shuffledArr = distShuffleSub2ind(myArr, myDist)
[U,~,idx] = unique(myArr);
reps = true;
while reps
S = Shuffle(idx);
f = sub2ind ( [numel(idx) + myDist, numel(U)] , 1:numel(idx), S );
reps = any (diff(sort(f)) <= myDist);
end
shuffledArr = U(S);
end
If you just want to find one possible solution, you could use something like this:
x = [1 1 1 2 2 2 3 3 3 3 3 4 5 5 6 7 8 9];
n = numel(x);
dist = 3; %minimal distance
uni = unique(x); %get the unique values
his = histc(x,uni); %count the occurrences of each element
s = [sortrows([uni;his].',2,'descend'), zeros(length(uni),1)];
xr = []; %the vector that will contain the solution
%the for loop that will maximize the distance between equal elements
for ii = 1:n
s(s(:,3)<0,3) = s(s(:,3)<0,3)+1;
s(1,3) = s(1,3)-dist;
s(1,2) = s(1,2)-1;
xr = [xr s(1,1)];
s = sortrows(s,[3,2],{'descend','descend'});
end
if any(s(:,2)~=0)
fprintf('failed, dist is too big')
end
Result:
xr = [3 1 2 5 3 1 2 4 3 6 7 8 3 9 5 1 2 3]
Explanation:
I create a matrix s, and at the beginning s is equal to:
s =
3 5 0
1 3 0
2 3 0
5 2 0
4 1 0
6 1 0
7 1 0
8 1 0
9 1 0
%col1 = unique element; col2 = occurrences of each element; col3 = penalties
At each iteration of our for-loop we choose the element with the maximum number of occurrences, since this element will be the hardest to place in our array.
Then after the first iteration s is equal to:
s =
1 3 0 %1 is the next element that will be placed in our array.
2 3 0
5 2 0
4 1 0
6 1 0
7 1 0
8 1 0
9 1 0
3 4 -3 %3 now has 5-1 = 4 occurrences and a penalty of -3, so it won't show up in the next 3 iterations.
At the end, every number in the second column should be equal to 0; if it is not, the minimal distance was too big.
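The same greedy idea (always place the most frequent element that is not currently penalised) can be sketched in Python with a heap and a cooldown queue. This is a variant written for illustration, not a translation of the MATLAB code: the name spaced_arrangement is hypothetical, it requires equal values to be more than dist positions apart, and it returns None as soon as no element is available instead of checking the counts at the end.

```python
import heapq
from collections import Counter, deque


def spaced_arrangement(values, dist):
    """Greedy arrangement with equal values more than `dist` positions apart."""
    heap = [(-count, value) for value, count in Counter(values).items()]
    heapq.heapify(heap)  # most frequent remaining value on top
    cooling = deque()    # (position when usable again, -remaining, value)
    out = []
    for pos in range(len(values)):
        # release values whose penalty has expired
        while cooling and cooling[0][0] <= pos:
            _, neg_count, value = cooling.popleft()
            heapq.heappush(heap, (neg_count, value))
        if not heap:
            return None  # nothing can be placed here: dist is too big
        neg_count, value = heapq.heappop(heap)
        out.append(value)
        if neg_count + 1 < 0:  # occurrences left, so start a new penalty
            cooling.append((pos + dist + 1, neg_count + 1, value))
    return out


x = [1, 1, 1, 2, 2, 2, 3, 3, 3, 3, 3, 4, 5, 5, 6, 7, 8, 9]
print(spaced_arrangement(x, 3))
```

The result is deterministic, so if a random arrangement is wanted it would still need a randomised tie-break or a shuffle of the filler elements.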
I have an array of the following values:
X=[1 1 1 2 3 4 1 1 1 1 5 4 2 1 1 2 3 4 1 1 1 1 1 2 2 1]
I want to get the positions (the indices) of all the consecutive ones in the array, and construct an array that holds the start and end positions of each set of consecutive ones:
idx= [1 3; 7 10; 14 15; 19 23; 26 26];
I tried to use the following functions, but I am not sure how to implement it:
positionofoness= find(X==1);
find(diff(X==1));
How can I construct the idx array?
You were almost there with your find and diff solution. To find all the positions where X changes from 1, pad X with a NaN in the beginning and the end:
tmp = find(diff([NaN X NaN] == 1)) % NaN to identify 1st and last elements as start and end
tmp =
1 4 7 11 14 16 19 24 26 27
%start|end start|end
Notice that every even-indexed element of tmp is the index + 1 of where a run of consecutive 1s ends.
idx = [reshape(tmp,2,[])]'; % reshape in desired form
idx = [idx(:,1) idx(:,2)-1]; % subtract 1 from second column
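For readers who want to check the result outside MATLAB, here is a small Python sketch that produces the same start/end pairs with a single pass. The function name runs_of is made up for this example, and positions are reported 1-based to match the MATLAB output.

```python
def runs_of(values, target=1):
    """Return (start, end) pairs, 1-based and inclusive, for each run of target."""
    runs = []
    start = None
    for pos, v in enumerate(values, start=1):
        if v == target:
            if start is None:
                start = pos  # a new run begins here
        elif start is not None:
            runs.append((start, pos - 1))  # the run ended at the previous position
            start = None
    if start is not None:
        runs.append((start, len(values)))  # run reaching the end of the array
    return runs


X = [1, 1, 1, 2, 3, 4, 1, 1, 1, 1, 5, 4, 2, 1, 1, 2, 3, 4, 1, 1, 1, 1, 1, 2, 2, 1]
print(runs_of(X))  # [(1, 3), (7, 10), (14, 15), (19, 23), (26, 26)]
```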
I have an array (say of 1s and 0s) and I want to find the index, i, for the first location where 1 appears n times in a row.
For example,
x = [0 0 1 0 1 1 1 0 0 0] ;
i = 5, for n = 3, as this is the first time '1' appears three times in a row.
Note: I want to find where 1 appears n times in a row so
i = find(x,n,'first');
is incorrect as this would give me the index of the first n 1s.
Is it essentially a string search, e.g. findstr but with a vector?
You can do it with convolution as follows:
x = [0 0 1 0 1 1 1 0 0 0];
N = 3;
result = find(conv(x, ones(1,N), 'valid')==N, 1)
How it works
Convolve x with a vector of N ones and find the first time the result equals N. Convolution is computed with the 'valid' flag to avoid edge effects and thus obtain the correct value for the index.
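Convolving with a vector of N ones is just a sliding-window sum, so the idea can be sketched in plain Python without a convolution routine. The helper name first_run_start is hypothetical, x is assumed to contain only 0s and 1s, and the returned index is 1-based to match the MATLAB result.

```python
def first_run_start(x, n):
    """1-based index of the first window of n ones in a 0/1 list, or None."""
    window = 0  # sum of the last n elements seen so far
    for pos, v in enumerate(x, start=1):
        window += v
        if pos > n:
            window -= x[pos - n - 1]  # drop the element that left the window
        if pos >= n and window == n:
            return pos - n + 1
    return None


print(first_run_start([0, 0, 1, 0, 1, 1, 1, 0, 0, 0], 3))  # 5
```

Unlike the convolution, this stops at the first match instead of computing every window.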
Another answer that I have is to generate a buffer matrix where each row of this matrix is a neighbourhood of overlapping n elements of the array. Once you create this, index into your array and find the first row that has all 1s:
x = [0 0 1 0 1 1 1 0 0 0]; %// Example data
n = 3; %// How many times we look for duplication
%// Solution
ind = bsxfun(@plus, (1:numel(x)-n+1).', 0:n-1); %'
out = find(all(x(ind),2), 1);
The first line is a bit tricky. We use bsxfun to generate a matrix of size m x n where m is the total number of overlapping neighbourhoods while n is the size of the window you are searching for. This generates a matrix where the first row is enumerated from 1 to n, the second row is enumerated from 2 to n+1, up until the very end which is from numel(x)-n+1 to numel(x). Given n = 3, we have:
>> ind
ind =
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 10
These are indices which we will use to index into our array x, and for your example it generates the following buffer matrix when we directly index into x:
>> x = [0 0 1 0 1 1 1 0 0 0];
>> x(ind)
ans =
0 0 1
0 1 0
1 0 1
0 1 1
1 1 1
1 1 0
1 0 0
0 0 0
Each row is an overlapping neighbourhood of n elements. We finally end by searching for the first row that gives us all 1s. This is done by using all and searching over every row independently with the 2 as the second parameter. all produces true if every element in a row is non-zero, or 1 in our case. We then combine with find to determine the first non-zero location that satisfies this constraint... and so:
>> out = find(all(x(ind), 2), 1)
out =
5
This tells us that the fifth location of x is where the beginning of this duplication occurs n times.
Based on Rayryeng's approach you can loop this as well. This will definitely be slower for short array sizes, but for very large array sizes this doesn't calculate every possibility, but stops as soon as the first match is found and thus will be faster. You could even use an if statement based on the initial array length to choose whether to use the bsxfun or the for loop. Note also that for loops are rather fast since the latest MATLAB engine update.
x = [0 0 1 0 1 1 1 0 0 0]; %// Example data
n = 3; %// How many times we look for duplication
for idx = 1:numel(x)-n+1 %// include the last possible window
if all(x(idx:idx+n-1))
break
end
end
Additionally, this can be used to find the first a occurrences:
x = [0 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0]; %// Example data
n = 3; %// How many times we look for duplication
a = 2; %// number of desired matches
collect(1,a)=0; %// initialise output
kk = 1; %// initialise counter
for idx = 1:numel(x)-n+1 %// include the last possible window
if all(x(idx:idx+n-1))
collect(kk) = idx;
if kk == a
break
end
kk = kk+1;
end
end
Which does the same but shuts down after a matches have been found. Again, this approach is only useful if your array is large.
Seeing you commented whether you can find the last occurrence: yes. Same trick as before, just run the loop backwards:
for idx = numel(x)-n+1:-1:1
if all(x(idx:idx+n-1))
break
end
end
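A Python sketch of the same early-exit loops, assuming 1-based results to match MATLAB; the names run_starts and last_run_start are made up here. A generator makes "first match", "first a matches" and "all matches" the same code, and unlike the bare MATLAB loops it distinguishes "no match" from "match at the last position".

```python
from itertools import islice


def run_starts(x, n):
    """Yield each 1-based index where n consecutive non-zero values start."""
    for start in range(len(x) - n + 1):
        if all(x[start:start + n]):
            yield start + 1


def last_run_start(x, n):
    """Scan backwards and return the last such index, or None if there is none."""
    for start in range(len(x) - n, -1, -1):
        if all(x[start:start + n]):
            return start + 1
    return None


x = [0, 0, 1, 0, 1, 1, 1, 0, 0, 0] * 3
print(next(run_starts(x, 3), None))       # first match: 5
print(list(islice(run_starts(x, 3), 2)))  # first a = 2 matches: [5, 15]
print(last_run_start(x, 3))               # last match: 25
```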
One possibility with looping:
i = 0;
n = 3;
for idx = n : length(x)
idx_true = 1;
for sub_idx = (idx - n + 1) : idx
idx_true = idx_true & (x(sub_idx));
end
if(idx_true)
i = idx - n + 1;
break
end
end
if (i == 0)
disp('No index found.')
else
disp(i)
end
I'm looking for efficient algorithm (or any at all..) for this tricky thing. I'll simplify my problem. In my application, this array is about 10000 times bigger :)
I have an 2D array like this:
0 2 1 3 4
1 2 0 4 3
0 2 1 3 4
4 1 2 3 0
Yes, in every row there are values range from 0 to 4 but in different order. The order matters! I can't just sort it and solve this in easy way :)
Then, I shuffle it by choosing a random indexes and swapping them - couple of times. Example result:
0 1 1 1 4
1 2 2 4 3
0 2 3 3 4
4 2 0 3 0
I see duplicates in the rows, and that's not good. The algorithm should find these duplicates and replace each of them with a value that will not be another duplicate in that particular row, for example:
0 1 2 3 4
1 2 0 4 3
0 2 3 1 4
4 2 0 3 1
Can you share your ideas? Maybe there is already very famous algorithm for this problem? I'd be grateful for any hint.
EDIT
Clarification for T_G: After the shuffle, a particular row can't exchange values with other rows. It needs to find duplicates and replace each one with any available value that is left and is not another duplicate.
After shuffling:
0 1 1 1 4
1 2 2 4 3
0 2 3 3 4
4 2 0 3 0
Steps:
I have 0; I don't see another zeros. Next.
I have 1; I see another 1; I should change it (the second one); there is no 2 in this row, so lets change this duplicate 1 to 2.
I have 1; I see another 1. I should change it (the second one); there is no 3 in this row, so lets change this duplicate 1 to 3. etc...
So if you input this row:
0 0 0 0 0 0 0 0 0
You should get:
0 1 2 3 4 5 6 7 8
Try something like this:
// Iterate matrix lines, line by line
for(uint32_t line_no = 0; line_no < max_line_num; line_no++) {
// counters for each symbol 0-4; index is symbol, val is counter
uint8_t counters[5];
// Clear counters before usage
memset(counters, 0, sizeof(counters));
// Compute counters (each line has 5 entries)
for(int i = 0; i < 5; i++)
counters[matrix[line_no][i]]++;
// Index of maybe unused symbol; by default is 4
int j = 4;
// Iterate line in reversed order
for(int i = 4; i >= 0; i--)
if(counters[matrix[line_no][i]] > 1) { // found dup
while(counters[j] != 0) // find unused symbol "j"
j--;
counters[matrix[line_no][i]]--; // Decrease dup counter
matrix[line_no][i] = j; // substitute dup to symbol j
counters[j]++; // this symbol j is used
} // for + if
} // for lines
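A compact Python sketch of the same idea, for rows of any length. It assumes every row of length n only contains values from 0 to n-1 (so the number of duplicates equals the number of missing values); fix_row is a made-up name. It keeps the first occurrence of each value and replaces later duplicates with the smallest missing values, which is the order used in the question's step-by-step description.

```python
def fix_row(row):
    """Replace repeated values with the values missing from 0..len(row)-1."""
    present = set(row)
    # values that do not occur in the row at all, in increasing order
    missing = iter(v for v in range(len(row)) if v not in present)
    seen = set()
    fixed = []
    for v in row:
        if v in seen:
            fixed.append(next(missing))  # duplicate: take the next unused value
        else:
            seen.add(v)
            fixed.append(v)              # first occurrence stays in place
    return fixed


print(fix_row([0, 1, 1, 1, 4]))  # [0, 1, 2, 3, 4]
print(fix_row([4, 2, 0, 3, 0]))  # [4, 2, 0, 3, 1]
```

Each row is handled in O(n) time, so a matrix 10000 times bigger is still a single linear pass per row.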
Here is the problem:
Given an array of N elements (1 to N), sort the array with one constraint: you can only move an element to begin of the array or end of the array. How many moves do you at least need to sort the array?
For example: 2 5 3 4 1 => 1 2 5 3 4 => 1 2 3 4 5, so I need at least 2 moves.
I figured out one solution: N - length of the longest increasing subsequence. In the above example the answer is 5 - 3 = 2.
I know an O(NlogN) algorithm to find the longest increasing subsequence (LIS). But with the elements of the array being in [1, N], I wonder whether there is an O(N) solution to find the LIS of the array?
Or is there an O(N) solution to the initial problem, given that we know the elements are from 1 to N?
What you are looking for is the longest increasing sequence where the difference between any two consecutive elements is 1.
Just finding the longest increasing subsequence is not enough. For example, with 1 5 3 4 2 the longest increasing subsequence has length 3, but as far as I can tell the problem can only be solved in 3 moves, not 2.
To find the longest increasing subsequence where the difference is 1 in O(N) time and O(N) space, allocate a helper array of size N+1 (indices 0 to N), initialized to all 0. At position i this array stores the length of the longest such subsequence ending at value i; if i hasn't been seen yet, it is 0.
Then you go through the unsorted array, and when you find an element x you set helper[x] = helper[x-1] + 1 and update a max variable.
Finally, the cost to sort is input_array.length - max.
Example:
array: 3 1 2
0 1 2 3
helper: 0 0 0 0
max = 0
step 1:
check element at position 1 which is 3. helper[3] = helper[3 - 1] + 1 == 1:
0 1 2 3
helper: 0 0 0 1
max = 1
step 2:
check element at position 2 which is 1. helper[1] = helper[1 - 1] + 1 == 1:
0 1 2 3
helper: 0 1 0 1
max = 1
step 3:
check element at position 3 which is 2. helper[2] = helper[2 - 1] + 1 == 2:
0 1 2 3
helper: 0 1 2 1
max = 2
cost = 3 - 2 = 1
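The walkthrough above can be sketched in a few lines of Python. The function name min_moves_to_sort is made up for this illustration, and the input is assumed to be a permutation of 1..N as stated in the question.

```python
def min_moves_to_sort(arr):
    """Minimum moves to front/back needed to sort a permutation of 1..N."""
    n = len(arr)
    helper = [0] * (n + 1)  # helper[v]: longest run of consecutive values ending at v
    best = 0
    for x in arr:
        # helper[x - 1] is non-zero only if x - 1 appeared earlier in the array
        helper[x] = helper[x - 1] + 1
        best = max(best, helper[x])
    return n - best


print(min_moves_to_sort([3, 1, 2]))        # 1
print(min_moves_to_sort([2, 5, 3, 4, 1]))  # 2
print(min_moves_to_sort([1, 5, 3, 4, 2]))  # 3
```

The last call is the counterexample from the answer, where a plain LIS would wrongly suggest 2 moves.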