Increment value of every element in an array in a particular range

There's an array A[] having n elements. There's another array B[] of the same size n with every element initialized to zero. For every i in range 1 to n, elements of B[] in the range i-A_i to i+A_i (inclusive) need to be increased by 1.
I've already tried an O(n^2) solution using nested loops. I cannot figure out an O(n) solution, if one exists.
i=1;
while(i<=n)
{
    start=(i-A[i]<1)?1:i-A[i];
    end=(i+A[i]>n)?n:i+A[i];
    while(start<=end)
    {
        B[start]+=1;
        start+=1;
    }
    i+=1;
}

A naive implementation increments every element of each range for every item in A, but you do not need to do that. You can first "prepare" your array by adding 1 where an increment should start, and -1 just past where it should stop. Next you calculate the cumulative sum of the array. Like:
def fill_list(la):
    lb = [0]*len(la)
    n1 = len(la)-1
    for i, a in enumerate(la, 1):
        xf, xt = i-a, i+a+1
        lb[max(0, xf)] += 1
        if xt <= n1:
            lb[xt] -= 1
    c = 0
    for i, b in enumerate(lb):
        c += b
        lb[i] = c
    return lb
or if you want to return the range from 1 to n:
def fill_list1(la):
    n1 = len(la)
    lb = [0]*(n1+1)
    for i, a in enumerate(la, 1):
        xf, xt = i-a, i+a+1
        lb[max(0, xf)] += 1
        if xt <= n1:
            lb[xt] -= 1
    c = 0
    for i, b in enumerate(lb):
        c += b
        lb[i] = c
    return lb[1:]
We can then, for example, generate a list with (fill_list returns the values for indices 0 to n-1):
>>> fill_list([1,4,2,5,1,3,0,2])
[3, 4, 4, 4, 5, 5, 5, 4]
>>> fill_list1([1,2,3,4,5])
[5, 5, 4, 4, 3]
For the input [1,4,2,5,1,3,0,2] the ranges look like this:
 -3 -2 -1  0  1  2  3  4  5  6  7  8  9 10 11
--*--*--*--*--*--*--*--*--*--*--*--*--*--*--*--
           |-----|
     |-----------------------|
              |-----------|
        |-----------------------------|
                       |-----|
                    |-----------------|
                                |
                             |-----------|
--*--*--*--*--*--*--*--*--*--*--*--*--*--*--*--
  0  1  1  1  1  0  0  1  0  0 -1 -1  0 -2 -1
--*--*--*--*--*--*--*--*--*--*--*--*--*--*--*--
  0  1  2  3  4  4  4  5  5  5  4  3  3  1  0
Increments that fall before the start of the output (so with an index less than 0) are placed at index 0, so that they are still taken into account. Decrements that fall past the end of the output (so with an index beyond the last element of lb) are simply ignored.
In the diagram, the first row shows the indices, next come the ranges that arise from the input, then the increments and decrements as they would be put on an infinite tape, and finally the cumulative sum.
The algorithm works in O(n): first we iterate over la in linear time, incrementing and decrementing the corresponding elements of lb. Next we iterate over lb, again in O(n), to calculate the cumulative sum.
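As a cross-check, here is a minimal sketch of the same difference-array idea written directly against the question's 1-based ranges, clamped to 1..n as in the nested-loop code. The names range_increments and range_increments_naive are made up for this illustration, and the entries of the input are assumed to be non-negative.

```python
from itertools import accumulate


def range_increments(a):
    """Difference-array version: O(n), assuming non-negative entries."""
    n = len(a)
    diff = [0] * (n + 2)  # slots 1..n are used, slot n+1 absorbs late decrements
    for i, radius in enumerate(a, start=1):
        start = max(1, i - radius)  # clamp to the first valid position
        end = min(n, i + radius)    # clamp to the last valid position
        diff[start] += 1            # the range starts contributing here
        diff[end + 1] -= 1          # and stops contributing just past its end
    # The running sum turns the markers into per-position counts.
    return list(accumulate(diff))[1:n + 1]


def range_increments_naive(a):
    """Nested-loop reference, O(n^2) in the worst case."""
    n = len(a)
    b = [0] * (n + 1)
    for i, radius in enumerate(a, start=1):
        for j in range(max(1, i - radius), min(n, i + radius) + 1):
            b[j] += 1
    return b[1:]
```

Comparing both functions on a few random inputs is a cheap way to convince yourself that the clamping is right.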

Related

Shuffle array while spacing repeating elements

I'm trying to write a function that shuffles an array, which contains repeating elements, but ensures that repeating elements are not too close to one another.
This code works but seems inefficient to me:
function shuffledArr = distShuffle(myArr, myDist)
% this function takes an array myArr and shuffles it, while ensuring that repeating
% elements are at least myDist elements away from one another
% flag to indicate whether there are repetitions within myDist
reps = 1;
while reps
    % set to 0 to break while-loop, will be set to 1 if it doesn't meet condition
    reps = 0;
    % randomly shuffle array
    shuffledArr = Shuffle(myArr);
    % loop through each unique value, find its position, and calculate the distance to the next occurrence
    for x = 1:length(unique(myArr))
        % check if there are any repetitions that are separated by myDist or less
        if any(diff(find(shuffledArr == x)) <= myDist)
            reps = 1;
            break;
        end
    end
end
This seems suboptimal to me for three reasons:
1) It may not be necessary to repeatedly shuffle until a solution has been found.
2) This while loop will go on forever if there is no possible solution (i.e. setting myDist to be too high to find a configuration that fits). Any ideas on how to catch this in advance?
3) There must be an easier way to determine the distance between repeating elements in an array than what I did by looping through each unique value.
I would be grateful for answers to points 2 and 3, even if point 1 is correct and it is possible to do this in a single shuffle.
I think it is sufficient to check the following condition to prevent infinite loops:
[~, num, C] = mode(myArr);
N = numel(C{1}); % mode returns all the modes inside a cell array
assert( (myDist<=N) || (myDist-N+1) * (num-1) + N*num <= numel(myArr), ...
    'Shuffling impossible!');
Assume that myDist is 2 and we have the following data:
[4 6 5 1 6 7 4 6]
We can find the mode, 6, and its number of occurrences, 3. We arrange the 6s, separating them by 2 = myDist blanks:
6 _ _ 6 _ _ 6
There must be (3-1) * myDist = 4 numbers to fill the blanks. We have five more numbers, so the array can be shuffled.
The problem becomes more complicated if we have multiple modes. For example, for the array [4 6 5 1 6 7 4 6 4] we have N=2 modes: 6 and 4. They can be arranged as:
6 4 _ 6 4 _ 6 4
We have 2 blanks and three more numbers [5 1 7] that can be used to fill the blanks. If, for example, we had only one number [5], it would be impossible to fill the blanks and we couldn't shuffle the array.
For the third point you can use a sparse matrix to speed up the computation (my initial testing in Octave shows that it is more efficient):
function shuffledArr = distShuffleSparse(myArr, myDist)
    [U,~,idx] = unique(myArr);
    reps = true;
    while reps
        S = Shuffle(idx);
        shuffledBin = sparse ( 1:numel(idx), S, true, numel(idx) + myDist, numel(U) );
        reps = any (diff(find(shuffledBin)) <= myDist);
    end
    shuffledArr = U(S);
end
Alternatively, you can use sub2ind and sort instead of a sparse matrix:
function shuffledArr = distShuffleSparse(myArr, myDist)
    [U,~,idx] = unique(myArr);
    reps = true;
    while reps
        S = Shuffle(idx);
        f = sub2ind ( [numel(idx) + myDist, numel(U)] , 1:numel(idx), S );
        reps = any (diff(sort(f)) <= myDist);
    end
    shuffledArr = U(S);
end
If you just want to find one possible solution, you could use something like this:
x = [1 1 1 2 2 2 3 3 3 3 3 4 5 5 6 7 8 9];
n = numel(x);
dist = 3; %minimal distance
uni = unique(x); %get the unique values
his = histc(x,uni); %count the occurrences of each element
s = [sortrows([uni;his].',2,'descend'), zeros(length(uni),1)];
xr = []; %the vector that will contain the solution
%the for loop that will maximize the distance of each element
for ii = 1:n
    s(s(:,3)<0,3) = s(s(:,3)<0,3)+1; %relax the penalties by one step
    s(1,3) = s(1,3)-dist; %penalize the element that is about to be placed
    s(1,2) = s(1,2)-1; %one occurrence fewer left to place
    xr = [xr s(1,1)];
    s = sortrows(s,[3,2],{'descend','descend'});
end
if any(s(:,2)~=0)
    fprintf('failed, dist is too big')
end
Result:
xr = [3 1 2 5 3 1 2 4 3 6 7 8 3 9 5 1 2 3]
Explanation:
I create a matrix s, and at the beginning s is equal to:
s =
3 5 0
1 3 0
2 3 0
5 2 0
4 1 0
6 1 0
7 1 0
8 1 0
9 1 0
%col1 = unique element; col2 = occurrences of each element; col3 = penalties
At each iteration of our for-loop we choose the element with the largest number of remaining occurrences, since this element will be the hardest to place in our array.
Then after the first iteration s is equal to:
s =
1 3 0 %1 is the next element that will be placed in our array.
2 3 0
5 2 0
4 1 0
6 1 0
7 1 0
8 1 0
9 1 0
3 4 -3 %3 now has 5-1 = 4 occurrences and a penalty of -3, so it won't show up for the next 3 iterations.
At the end, every number in the second column should be equal to 0. If that is not the case, the minimal distance was too big.
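For comparison, the same greedy idea (always place the value with the most remaining occurrences that is not currently blocked) can be sketched in Python with a heap and a cooldown queue. This is not a translation of the MATLAB code above: the name spaced_arrangement is made up, and min_gap is assumed to mean that equal values must sit at least min_gap positions apart.

```python
import heapq
from collections import Counter, deque


def spaced_arrangement(items, min_gap):
    """Return an ordering with equal items at least min_gap positions apart, or None."""
    # Max-heap on the remaining count (negated, because heapq is a min-heap).
    # Items must be mutually comparable, since ties are broken on the value.
    heap = [(-count, value) for value, count in Counter(items).items()]
    heapq.heapify(heap)
    cooling = deque()  # entries: (first position where the value is allowed again, -remaining, value)
    result = []
    while heap or cooling:
        pos = len(result)
        # Release every value whose cooldown has expired.
        while cooling and cooling[0][0] <= pos:
            _, neg_count, value = cooling.popleft()
            heapq.heappush(heap, (neg_count, value))
        if not heap:
            return None  # every remaining value is still blocked
        neg_count, value = heapq.heappop(heap)
        result.append(value)
        if neg_count + 1 < 0:  # occurrences left: block the value for a while
            cooling.append((pos + min_gap, neg_count + 1, value))
    return result
```

A None result plays the role of the 'failed, dist is too big' message above.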

How to improve an algorithm to check if there is an element in the array that is equal to the difference between any other two elements in the array?

I know that this is apparently a simple question, but I can't come up with a more efficient approach. Here's what I'm trying. It is very naive, but I still can't get it correct.
1) Sort the array. (Divide and Conquer)
2) a) Select one element at a time.
   b) Loop through all the remaining elements of the array (in pairs) to get the difference between them and match it against the selected element.
3) Repeat step 2 until all the elements are found.
4) Store all the elements that match the condition.
5) Print the stored elements.
The condition A[i] - A[j] = A[k] is equivalent to A[i] = A[j] + A[k], so we can look for a sum instead.
1) Sort the array.
2) For every element, check whether it is the sum of two others using the two-pointer approach (increment the lower index when the sum is too small, decrement the upper index when the sum is too big).
The resulting complexity is quadratic.
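A minimal sketch of that two-pointer search, assuming the three elements must come from three different positions. The function name find_difference_triple is made up for this example.

```python
def find_difference_triple(values):
    """Return (x, y, z) with x - y == z from three different positions, or None."""
    a = sorted(values)
    n = len(a)
    for k in range(n):          # candidate for the sum a[k] = a[lo] + a[hi]
        lo, hi = 0, n - 1
        while lo < hi:
            if lo == k:         # skip the candidate itself
                lo += 1
            elif hi == k:
                hi -= 1
            elif a[lo] + a[hi] < a[k]:
                lo += 1         # sum too small: move the lower pointer up
            elif a[lo] + a[hi] > a[k]:
                hi -= 1         # sum too big: move the upper pointer down
            else:
                return a[k], a[lo], a[hi]
    return None
```

Sorting costs O(n log n) and each candidate needs one linear scan, which gives the quadratic total mentioned above.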
Just out of interest, we can solve this problem in O(n log n + m log m) time, where m is the range, using a Fast Fourier Transform.
First sort the input. Now consider that each of the attainable distances between numbers can be achieved by subtracting one difference-prefix-sum from another. For example:
input: 1 3 7
diff-prefix-sums: 2 6
difference between 7 and 3 is 6 - 2
Now let's add the total (the rightmost prefix sum) to each side of the equation:
ps[r] - ps[l] = D
ps[r] + (T - ps[l]) = D + T
Let's list the differences:
1 3 7
2 4
and the prefix sums:
p => 0 2 6
T - p => 6 4 0 // 6-0, 6-2, 6-6
We need to efficiently determine the counts of all the different achievable differences. This is akin to multiplying the polynomial with coefficients [1, 0, 0, 0, 1, 0, 1] by the polynomial with coefficients, [1, 0, 1, 0, 0, 0, 0] (we don't need the zero coefficient in the second set since it only generates degrees less than or equal to T), which we can accomplish in m log m time, where m is the degree, with a Fast Fourier Transform.
The resultant coefficients would be:
1 0 0 0 1 0 1
*
1 0 1 0 0 0 0
=>
x^6 + x^2 + 1
*
x^6 + x^4
= x^12 + x^10 + x^8 + 2x^6 + x^4
=> 1 0 1 0 1 0 2 0 1 0 0 0 0
We discard counts of degrees lower than or equal to T, and display our ordered results:
1 * 12 = 1 * (T + 6) => 1 diffs of 6
1 * 10 = 1 * (T + 4) => 1 diffs of 4
1 * 8 = 1 * (T + 2) => 1 diffs of 2
If any of the coefficients, their negatives, or T are in our set of array elements, we have a match.
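To make the bookkeeping concrete, here is a rough sketch that builds the two polynomials from the prefix sums and reads the differences off the product. It multiplies the polynomials naively, so it does not have the FFT running time; a real implementation would swap in an FFT-based multiplication at that step. The name difference_counts is made up, and the input is assumed to be non-empty integers.

```python
def difference_counts(values):
    """Map each positive difference between two input values to how often it occurs."""
    a = sorted(values)
    prefix = [x - a[0] for x in a]   # difference prefix sums: 0 2 6 for input 1 3 7
    total = prefix[-1]               # T, the rightmost prefix sum
    p = [0] * (total + 1)            # p[d] = coefficient of x^d for the sums ps
    q = [0] * (total + 1)            # q[d] = coefficient of x^d for the sums T - ps
    for s in prefix:
        p[s] += 1
        q[total - s] += 1
    # Naive polynomial product (this is the step an FFT would speed up).
    product = [0] * (2 * total + 1)
    for i, ci in enumerate(p):
        if ci:
            for j, cj in enumerate(q):
                if cj:
                    product[i + j] += ci * cj
    # Degrees above T encode a difference D as T + D; lower degrees are discarded.
    return {deg - total: c for deg, c in enumerate(product) if deg > total and c}
```

For the input 1 3 7 this returns one difference each of 2, 4 and 6, matching the listing above.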

Find where condition is true n times consecutively

I have an array (say of 1s and 0s) and I want to find the index, i, for the first location where 1 appears n times in a row.
For example,
x = [0 0 1 0 1 1 1 0 0 0] ;
i = 5, for n = 3, as this is the first time '1' appears three times in a row.
Note: I want to find where 1 appears n times in a row so
i = find(x,n,'first');
is incorrect as this would give me the index of the first n 1s.
It is essentially a string search? eg findstr but with a vector.
You can do it with convolution as follows:
x = [0 0 1 0 1 1 1 0 0 0];
N = 3;
result = find(conv(x, ones(1,N), 'valid')==N, 1)
How it works
Convolve x with a vector of N ones and find the first time the result equals N. Convolution is computed with the 'valid' flag to avoid edge effects and thus obtain the correct value for the index.
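The same sliding-sum idea can be sketched without a convolution routine. This is a plain Python illustration, not MATLAB: the name first_run_start is made up, the input is assumed to contain only 0s and 1s, and the returned index is 0-based (so MATLAB's 5 becomes 4).

```python
def first_run_start(x, n):
    """Return the 0-based start of the first run of n ones in x, or -1 if there is none."""
    if n <= 0 or len(x) < n:
        return -1
    window = sum(x[:n])  # sum over the first window of n elements
    for start in range(len(x) - n + 1):
        if window == n:  # all n entries in the window are 1
            return start
        if start + n < len(x):
            # Slide the window one step: add the new entry, drop the old one.
            window += x[start + n] - x[start]
    return -1
```

Each window sum here equals one entry of the 'valid' convolution with a vector of n ones.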
Another answer that I have is to generate a buffer matrix where each row of this matrix is a neighbourhood of overlapping n elements of the array. Once you create this, index into your array and find the first row that has all 1s:
x = [0 0 1 0 1 1 1 0 0 0]; %// Example data
n = 3; %// How many times we look for duplication
%// Solution
ind = bsxfun(@plus, (1:numel(x)-n+1).', 0:n-1); %'
out = find(all(x(ind),2), 1);
The first line is a bit tricky. We use bsxfun to generate a matrix of size m x n where m is the total number of overlapping neighbourhoods while n is the size of the window you are searching for. This generates a matrix where the first row is enumerated from 1 to n, the second row is enumerated from 2 to n+1, up until the very end which is from numel(x)-n+1 to numel(x). Given n = 3, we have:
>> ind
ind =
1 2 3
2 3 4
3 4 5
4 5 6
5 6 7
6 7 8
7 8 9
8 9 10
These are indices which we will use to index into our array x, and for your example it generates the following buffer matrix when we directly index into x:
>> x = [0 0 1 0 1 1 1 0 0 0];
>> x(ind)
ans =
0 0 1
0 1 0
1 0 1
0 1 1
1 1 1
1 1 0
1 0 0
0 0 0
Each row is an overlapping neighbourhood of n elements. We finally end by searching for the first row that gives us all 1s. This is done by using all and searching over every row independently with the 2 as the second parameter. all produces true if every element in a row is non-zero, or 1 in our case. We then combine with find to determine the first non-zero location that satisfies this constraint... and so:
>> out = find(all(x(ind), 2), 1)
out =
5
This tells us that the fifth location of x is where the beginning of this duplication occurs n times.
Based on Rayryeng's approach you can loop this as well. This will definitely be slower for short array sizes, but for very large array sizes this doesn't calculate every possibility, but stops as soon as the first match is found and thus will be faster. You could even use an if statement based on the initial array length to choose whether to use the bsxfun or the for loop. Note also that for loops are rather fast since the latest MATLAB engine update.
x = [0 0 1 0 1 1 1 0 0 0]; %// Example data
n = 3; %// How many times we look for duplication
for idx = 1:numel(x)-n+1 %// +1 so that the last window is checked as well
    if all(x(idx:idx+n-1))
        break
    end
end
Additionally, this can be used to find the first a occurrences:
x = [0 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0 0 0 1 0 1 1 1 0 0 0]; %// Example data
n = 3; %// How many times we look for duplication
a = 2; %// number of desired matches
collect(1,a)=0; %// initialise output
kk = 1; %// initialise counter
for idx = 1:numel(x)-n+1 %// +1 so that the last window is checked as well
    if all(x(idx:idx+n-1))
        collect(kk) = idx;
        if kk == a
            break
        end
        kk = kk+1;
    end
end
Which does the same but shuts down after a matches have been found. Again, this approach is only useful if your array is large.
Seeing that you asked in a comment whether you can find the last occurrence: yes. Same trick as before, just run the loop backwards:
for idx = numel(x)-n+1:-1:1
    if all(x(idx:idx+n-1))
        break
    end
end
One possibility with looping:
i = 0;
n = 3;
for idx = n : length(x)
    idx_true = 1;
    for sub_idx = (idx - n + 1) : idx
        idx_true = idx_true & (x(sub_idx));
    end
    if(idx_true)
        i = idx - n + 1;
        break
    end
end
if (i == 0)
    disp('No index found.')
else
    disp(i)
end

How can I get no. of Swap operations to form the 2nd Array

I have two arrays with the same elements, but in a different order,
e.g. 1 2 12 9 7 15 22 30
and 1 2 7 12 9 20 15 22
How many swap operations are needed to form the second array from the first?
I have tried counting the indices at which the elements differ and dividing the result by 2, but that doesn't give me the right answer.
One classic approach uses permutation cycles (https://en.m.wikipedia.org/wiki/Cycle_notation#Cycle_notation). The number of swaps needed equals the total number of elements minus the number of cycles.
For example:
1 2 3 4 5
2 5 4 3 1
Start with 1 and follow the cycle:
1 down to 2, 2 down to 5, 5 down to 1.
1 -> 2 -> 5 -> 1
3 -> 4 -> 3
We would need to swap index 1 with 5, then index 5 with 2, as well as index 3 with index 4: altogether 3 swaps, or n - 2. We subtract the number of cycles from n because the cycle lengths add up to n, and each cycle needs one swap fewer than the number of elements in it.
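Counting the cycles can be sketched as follows, assuming both arrays hold the same distinct, hashable elements. The function name min_swaps is made up for this illustration.

```python
def min_swaps(source, target):
    """Minimum number of swaps turning source into target (distinct elements assumed)."""
    n = len(source)
    position = {value: i for i, value in enumerate(target)}
    # perm[i] is the index in target where the element source[i] has to end up.
    perm = [position[value] for value in source]
    seen = [False] * n
    cycles = 0
    for i in range(n):
        if not seen[i]:
            cycles += 1
            j = i
            while not seen[j]:  # walk around the cycle that contains i
                seen[j] = True
                j = perm[j]
    return n - cycles
```

For the example above, min_swaps([1, 2, 3, 4, 5], [2, 5, 4, 3, 1]) finds 2 cycles and therefore returns 3.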
1) re-index elements from 0 to n-1. In your example, arrayA becomes 0..7 and arrayB becomes 0 1 4 2 3 7 5 6.
2) sort the second array using your swapping algorithm and count the number of operations.
A bit naive, but I think you can use recursion as follows (pseudo code):
function count_swaps(arr1, arr2):
    unless both arrays contain the same objects return false
    if arr1.len <= 1 return 0
    else
        if arr1[0] == arr2[0] return count_swaps(arr1.tail, arr2.tail)
        else
            arr2_tail = arr2.tail
            i = index_of arr1[0] in arr2_tail
            arr2_tail[i] = arr2[0]
            return 1+count_swaps(arr1.tail, arr2_tail)
Here's a ruby implementation:
require 'set'
def count_swaps(a1, a2)
  raise "Arrays do not have the same objects: #{a1} #{a2}" unless a1.length == a2.length && Set[*a1]==Set[*a2]
  return count_swap_rec(a1, a2)
end
# Recursive helper: the inputs were already validated by count_swaps
def count_swap_rec(a1, a2)
  return 0 if a1.length <= 1
  return count_swap_rec(a1[1..-1], a2[1..-1]) if a1[0] == a2[0]
  a2_tail = a2[1..-1]
  # put a2[0] where a1[0] currently sits, which amounts to one swap
  a2_tail[a2_tail.find_index(a1[0])] = a2[0]
  return 1 + count_swap_rec(a1[1..-1], a2_tail)
end

Longest subsequence with alternating increasing and decreasing values

Given an array, we need to find the length of the longest subsequence with alternating increasing and decreasing values.
For example, if the array is
7 4 8 9 3 5 2 1 then L = 6 for 7,4,8,3,5,2 or 7,4,9,3,5,1, etc.
It could also be the case that we first have a small and then a big element.
What would be the most efficient solution for this? I had a DP solution in mind. And if we were to do it using brute force, how would we do it (O(n^3)?)?
And it's not a homework problem.
You can indeed use a dynamic programming approach here. For the sake of simplicity, assume we only need to find the maximal length of such a sequence seq (it is easy to tweak the solution to find the sequence itself).
For each index we will store 2 values:
maximal length of alternating sequence ending at that element where last step was increasing (say, incr[i])
maximal length of alternating sequence ending at that element where last step was decreasing (say, decr[i])
also by definition we assume incr[0] = decr[0] = 1
then each incr[i] can be found recursively:
incr[i] = max(decr[j])+1, where j < i and seq[j] < seq[i]
decr[i] = max(incr[j])+1, where j < i and seq[j] > seq[i]
Required length of the sequence will be the maximum value in both arrays, complexity of this approach is O(N*N) and it requires 2N of extra memory (where N is the length of initial sequence)
simple example in c:
int seq[N]; // initial sequence
int incr[N], decr[N];
... // Init sequences, fill incr and decr with 1's as initial values
for (int i = 1; i < N; ++i){
    for (int j = 0; j < i; ++j){
        if (seq[j] < seq[i])
        {
            // handle "increasing" step - need to check previous "decreasing" value
            if (decr[j]+1 > incr[i]) incr[i] = decr[j] + 1;
        }
        if (seq[j] > seq[i])
        {
            // handle "decreasing" step - need to check previous "increasing" value
            if (incr[j]+1 > decr[i]) decr[i] = incr[j] + 1;
        }
    }
}
... // Now all arrays are filled, iterate over them and find maximum value
How the algorithm works:
step 0 (initial values):
seq  = 7 4 8 9 3 5 2 1
incr = 1 1 1 1 1 1 1 1
decr = 1 1 1 1 1 1 1 1
step 1. take the value at index 1 ('4') and check the previous values. 7 > 4, so we make a "decreasing" step from index 0 to index 1: decr[1] = MAX(decr[1], incr[0]+1). New values:
incr = 1 1 1 1 1 1 1 1
decr = 1 2 1 1 1 1 1 1
step 2. take the value 8 and iterate over the previous values:
7 < 8, make an increasing step: incr[2] = MAX(incr[2], decr[0]+1):
incr = 1 1 2 1 1 1 1 1
decr = 1 2 1 1 1 1 1 1
4 < 8, make an increasing step: incr[2] = MAX(incr[2], decr[1]+1):
incr = 1 1 3 1 1 1 1 1
decr = 1 2 1 1 1 1 1 1
etc...
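For completeness, the recurrence can be written out end to end as a small Python sketch, including the final maximum that the C snippet leaves as a comment. The function name longest_alternating is made up here, and equal neighbouring values are treated as neither an increasing nor a decreasing step.

```python
def longest_alternating(seq):
    """Length of the longest subsequence that alternates between rising and falling."""
    n = len(seq)
    if n == 0:
        return 0
    incr = [1] * n  # incr[i]: best length ending at i whose last step went up
    decr = [1] * n  # decr[i]: best length ending at i whose last step went down
    for i in range(1, n):
        for j in range(i):
            if seq[j] < seq[i]:
                # An increasing step must follow a decreasing one.
                incr[i] = max(incr[i], decr[j] + 1)
            elif seq[j] > seq[i]:
                # A decreasing step must follow an increasing one.
                decr[i] = max(decr[i], incr[j] + 1)
    return max(max(incr), max(decr))
```

On the example from the question, longest_alternating([7, 4, 8, 9, 3, 5, 2, 1]) gives 6.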
