Minimum-Maximum recursive algorithm with a non-even partition, complexity - arrays

So I have been trying to find the recurrence relation of the following algorithm in order to compute its complexity. The algorithm finds the minimum and maximum elements of an array recursively, but instead of partitioning the array into two equal halves, it splits the array into one subarray containing the first two elements (low, low+1) and another containing the remaining elements (low+2, high). Here is the pseudo-code of this algorithm:
MAXMIN (A, low, high)
    if (high − low + 1 = 2) then
        if (A[low] < A[high]) then
            max = A[high]; min = A[low].
            return((max, min)).
        else
            max = A[low]; min = A[high].
            return((max, min)).
        end if
    else
        (max_l , min_l ) = MAXMIN(A, low, low+1).
        (max_r , min_r ) = MAXMIN(A, low+2, high).
    end if
    Set max to the larger of max_l and max_r ;
    Set min to the smaller of min_l and min_r ;
    return((max, min))
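For reference, a minimal Python sketch of the uneven split described above. It assumes the range always holds an even number of elements (at least two), because the pseudo-code has no base case for a single element; the function name maxmin is only illustrative.

```python
def maxmin(a, low, high):
    """Return (max, min) of a[low..high], assuming an even number of elements."""
    if high - low + 1 == 2:
        # Base case: one comparison decides both max and min.
        if a[low] < a[high]:
            return a[high], a[low]
        return a[low], a[high]
    # Uneven split: the first two elements versus everything else.
    max_l, min_l = maxmin(a, low, low + 1)
    max_r, min_r = maxmin(a, low + 2, high)
    return max(max_l, max_r), min(min_l, min_r)


print(maxmin([5, 1, -1, 2, 9, -4], 0, 5))  # (9, -4)
```

Note that the recursion depth grows linearly with the array length here, unlike the halving version.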
The classic divide-and-conquer algorithm, shown in pseudocode below, has the following recurrence relation (as given in my textbook):
T(n) = 2 for n = 1 or n = 2, and T(n) = 2T(n/2) + 3 for n > 2
and the pseudocode:
MAXMIN (A, low, high)
    if (high − low + 1 = 2) then
        if (A[low] < A[high]) then
            max = A[high]; min = A[low].
            return((max, min)).
        else
            max = A[low]; min = A[high].
            return((max, min)).
        end if
    else
        mid = floor((low + high) / 2)
        (max_l , min_l ) = MAXMIN(A, low, mid).
        (max_r , min_r ) = MAXMIN(A, mid + 1, high).
    end if
    Set max to the larger of max_l and max_r ;
    Set min to the smaller of min_l and min_r ;
    return((max, min))
So I came to the conclusion that the recurrence relation of the first algorithm should look something like this:
T(n) = 2 for n = 1 or n = 2, and T(n) = T(2) + T(n-2) + 2 for n > 2
which, since T(2) = 2, can also be written as:
T(n) = 2 + T(n-2) + 2 <=> T(n) = T(n-2) + 4.
Is my approach correct or did I miss something? I would be glad if someone could help me out!
P.S.: Sorry for my English!
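Taking the recurrence above at face value, it can be unrolled for even n: each step removes two elements and adds 4, so T(n) = T(2) + 4(n-2)/2 = 2n - 2, which is linear in n. A small sketch that checks this closed form numerically (the function name t is only for illustration, and the cost constants are the ones assumed in the question):

```python
def t(n):
    """Evaluate T(n) = T(n-2) + 4 with T(2) = 2, for even n >= 2."""
    if n == 2:
        return 2
    return t(n - 2) + 4


# Compare the recurrence with the closed form 2n - 2 for small even n.
for n in range(2, 21, 2):
    assert t(n) == 2 * n - 2
```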

Related

Minimize (firstA_max - firstA_min) + (secondB_max - secondB_min)

Given n pairs of integers, split them into two subsets A and B so as to minimize the sum of (maximum difference among the first values of A) and (maximum difference among the second values of B).
Example: n = 4
{0, 0}; {5, 5}; {1, 1}; {3, 4}
A = {{0, 0}; {1, 1}}
B = {{5, 5}; {3, 4}}
(maximum difference among first values of A) = fA_max - fA_min = 1 - 0 = 1
(maximum difference among second values of B) = sB_max - sB_min = 5 - 4 = 1
Therefore, the answer is 1 + 1 = 2, and this is the best possible split.
Obviously, the maximum difference among a set of values equals (maximum value - minimum value). Hence, what we need to do is find the minimum of (fA_max - fA_min) + (sB_max - sB_min).
Suppose the given array is arr[]; the first value is arr[].first and the second value is arr[].second.
I think it is quite easy to solve this in quadratic complexity. You just need to sort the array by the first value. Then all the elements in subset A should be picked consecutively in the sorted array. So you can loop over all ranges [L; R] of the sorted array. For each range, put all elements in that range into subset A and all the remaining elements into subset B.
For more detail, this is my C++ code:
#include <algorithm>
#include <utility>
using namespace std;

// a[1..n] must already be sorted by first value (1-based indexing).
int calc(pair<int, int> a[], int n){
    // m and M are the min and max of the second values of a[1..l-1], which all go to subset B
    int m = 1e9, M = -1e9, res = 2e9;
    for (int l = 1; l <= n; l++){
        int g = m, G = M; // g and G are the min and max of all the second values in subset B
        for (int r = n; r >= l; r--) {
            if (r - l + 1 < n){ // subset B must not be empty
                res = min(res, a[r].first - a[l].first + G - g);
            }
            g = min(g, a[r].second);
            G = max(G, a[r].second);
        }
        m = min(m, a[l].second);
        M = max(M, a[l].second);
    }
    return res;
}
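To validate any faster algorithm on small inputs, an exhaustive checker can help. The following is a minimal sketch, assuming both subsets must be non-empty (as the quadratic code above also requires); brute_force is a hypothetical helper name, and the running time is exponential in n.

```python
from itertools import combinations


def brute_force(points):
    """Try every split into non-empty A and B and return the smallest objective."""
    n = len(points)
    best = None
    for size in range(1, n):
        for chosen in combinations(range(n), size):
            in_a = set(chosen)
            firsts = [points[i][0] for i in in_a]
            seconds = [points[i][1] for i in range(n) if i not in in_a]
            cost = (max(firsts) - min(firsts)) + (max(seconds) - min(seconds))
            if best is None or cost < best:
                best = cost
    return best


print(brute_force([(0, 0), (5, 5), (1, 1), (3, 4)]))  # 2
```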
Now I want to improve my algorithm to loglinear complexity. Of course, I first sort the array by the first value. After that, if I fix fA_min = a[i].first, then as the right end of A moves further right, fA_max increases while (sB_max - sB_min) decreases.
But I am still stuck here. Is there any way to solve this problem in loglinear complexity?
The following approach is an attempt to escape the n^2 bound, using an argmin list for the second element of the tuples (let's say the y-part), where the points are sorted by x.
One observation is that there is an optimal solution where A includes index argmin[0] or argmin[n-1] or both.
In get_best_interval_min_max we first focus on including argmin[0], then the next smallest element in y, and so on. Then we do the same starting from the max element.
We get two dictionaries {(i,j): (profit, idx)}, telling us how much we gain in y when including points[i:j+1] in A, towards the min or the max of y. idx is the index in the argmin array.
Calculate the objective for each dictionary on its own, assuming the other extreme of y (the max for dmin, the min for dmax) is not in A.
Combine the results of both dictionaries, (i1,j1): (v1, idx1) and (i2,j2): (v2, idx2). Result: x[max(j1,j2)] - x[min(i1,i2)] + max_y - min_y - v1 - v2.
Constraint: idx1 < idx2, because the index ranges in the argmin array cannot intersect; otherwise some profit in y might be counted twice.
On average the dictionaries (dmin, dmax) have fewer than n entries, but in the worst case, when x and y correlate, e.g. [(i,i) for i in range(n)], they have about n entries and we gain nothing. Still, on random instances this approach is much faster. Maybe someone can improve upon this.
import numpy as np
from random import randrange
import time

def get_best_interval_min_max(points):  # sorted input according to x dim
    L = len(points)
    argmin_b = np.argsort([p[1] for p in points])
    b_min,b_max = points[argmin_b[0]][1], points[argmin_b[L-1]][1]
    arg = [argmin_b[0],argmin_b[0]]
    res_min = dict()
    for i in range(1,L):
        res_min[tuple(arg)] = points[argmin_b[i]][1] - points[argmin_b[0]][1],i  # the profit in b towards min
        if arg[0] > argmin_b[i]: arg[0]=argmin_b[i]
        elif arg[1] < argmin_b[i]: arg[1]=argmin_b[i]
    arg = [argmin_b[L-1],argmin_b[L-1]]
    res_max = dict()
    for i in range(L-2,-1,-1):
        res_max[tuple(arg)] = points[argmin_b[L-1]][1]-points[argmin_b[i]][1],i  # the profit in b towards max
        if arg[0]>argmin_b[i]: arg[0]=argmin_b[i]
        elif arg[1]<argmin_b[i]: arg[1]=argmin_b[i]
    # return the two dicts, difference along y,
    return res_min, res_max, b_max-b_min

def argmin_algo(points):
    # return the objective value, sets A and B, and the interval for A in points.
    points.sort()
    # get the profits for different intervals on the sorted array for max and min
    dmin, dmax, y_diff = get_best_interval_min_max(points)
    key = [None,None]
    res_min = 2e9
    # the best result when only the min/max b value is included in A
    for d in [dmin,dmax]:
        for k,(v,i) in d.items():
            res = points[k[1]][0]-points[k[0]][0] + y_diff - v
            if res < res_min:
                key = k
                res_min = res
    # combine the results for max and min.
    for k1,(v1,i) in dmin.items():
        for k2,(v2,j) in dmax.items():
            if i > j: break  # their argmin_b indices can not intersect!
            idx_l, idx_h = min(k1[0], k2[0]), max(k1[1],k2[1])  # get index low and index high for the combination
            res = points[idx_h][0]-points[idx_l][0] -v1 -v2 + y_diff
            if res < res_min:
                key = (idx_l, idx_h)  # new merged interval
                res_min = res
    return res_min, points[key[0]:key[1]+1], points[:key[0]]+points[key[1]+1:], key

def quadratic_algorithm(points):
    points.sort()
    m, M, res = 1e9, -1e9, 2e9
    idx = (0,0)
    for l in range(len(points)):
        g, G = m, M
        for r in range(len(points)-1,l-1,-1):
            if r-l+1 < len(points):
                res_n = points[r][0] - points[l][0] + G - g
                if res_n < res:
                    res = res_n
                    idx = (l,r)
            g = min(g, points[r][1])
            G = max(G, points[r][1])
        m = min(m, points[l][1])
        M = max(M, points[l][1])
    return res, points[idx[0]:idx[1]+1], points[:idx[0]]+points[idx[1]+1:], idx

# let's try it and compare running times to the quadratic_algorithm
# get some "random" points
c1=0
c2=0
for i in range(100):
    points = [(randrange(100), randrange(100)) for i in range(1,200)]
    points.sort()  # sorted for x dimension
    s = time.time()
    r1 = argmin_algo(points)
    e1 = time.time()
    r2 = quadratic_algorithm(points)
    e2 = time.time()
    c1 += (e1-s)
    c2 += (e2-e1)
    if not r1[0] == r2[0]:
        print(r1,r2)
        raise Exception("Error, results are not equal")
print("time of argmin_algo", c1, "time of quadratic_algorithm",c2)
UPDATE: @Luka proved that the algorithm described in this answer is not exact. But I will keep it here because it is a good heuristic in terms of performance and opens the way to many probabilistic methods.
I will describe a loglinear algorithm. I couldn't find a counter example. But I also couldn't find a proof :/
Let set A be ordered by first element and set B be ordered by second element. They are initially empty. Take floor(n/2) random points of your set of points and put in set A. Put the remaining points in set B. Define this as a partition.
Let's call a partition stable if you can't take an element of set A, put it in B and decrease the objective function and if you can't take an element of set B, put it in A and decrease the objective function. Otherwise, let's call the partition unstable.
For an unstable partition, the only moves that are interesting are the ones that take the first or the last element of A and move to B or take the first or the last element of B and move to A. So, we can find all interesting moves for a given unstable partition in O(1). If an interesting move decreases the objective function, do it. Go like that until the partition becomes stable. I conjecture that it takes at most O(n) moves for the partition to become stable. I also conjecture that at the moment the partition becomes stable, you will have a solution.
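The local search described above can be sketched as follows. This is a simplified illustration, not the loglinear version: it recomputes the objective from scratch after each trial move (O(n) per move) instead of keeping ordered sets, it keeps both subsets non-empty, and the names objective and local_search are hypothetical. As noted in the update, the result is a heuristic and is not guaranteed to be optimal.

```python
import random


def objective(a, b):
    """(range of first values in a) + (range of second values in b)."""
    firsts = [p[0] for p in a]
    seconds = [p[1] for p in b]
    return (max(firsts) - min(firsts)) + (max(seconds) - min(seconds))


def local_search(points, seed=0):
    """Heuristic only: move extreme elements between A and B while that helps.

    Assumes at least two points.
    """
    pts = list(points)
    random.Random(seed).shuffle(pts)
    half = len(pts) // 2
    a, b = pts[:half], pts[half:]
    improved = True
    while improved:
        improved = False
        current = objective(a, b)
        moves = []
        if len(a) > 1:  # keep A non-empty
            moves += [(a, b, min(a, key=lambda p: p[0])),
                      (a, b, max(a, key=lambda p: p[0]))]
        if len(b) > 1:  # keep B non-empty
            moves += [(b, a, min(b, key=lambda p: p[1])),
                      (b, a, max(b, key=lambda p: p[1]))]
        for src, dst, p in moves:
            src.remove(p)
            dst.append(p)
            if objective(a, b) < current:
                improved = True  # accept the move and look for the next one
                break
            dst.pop()            # undo the move
            src.append(p)
    return objective(a, b), a, b
```

Every accepted move strictly decreases the objective, so the loop always terminates in a stable partition.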

How to translate a solution into divide-and-conquer (finding a sub array with the largest, smallest value)

I am trying to get better at divide-and-conquer algorithms and am using the one below as an example. Given an array _in and some length l, it finds the start point of a subarray _in[_min_start:_min_start+l] such that the lowest value in that subarray is as high as it could possibly be. I have come up with a non-divide-and-conquer solution and am wondering how I could translate it into one that divides the array into smaller parts (divide-and-conquer).
def main(_in, l):
    _min_start = 0
    min_trough = None
    for i in range(len(_in)+1-l):
        if min_trough is None:
            min_trough = min(_in[i:i+l])
        if min(_in[i:i+l]) > min_trough:
            _min_start = i
            min_trough = min(_in[i:i+l])
    return _min_start, _in[_min_start:_min_start+l]
E.g. for the array [5, 1, -1, 2, 5, -4, 3, 9, 8, -2, 0, 6] and a subarray of length 3 it would return start position 6 (resulting in the array [3,9,8]).
Three O(n) solutions and a benchmark
Note I'm renaming _in and l to clearer-looking names A and k.
Solution 1: Divide and conquer
Split the array in half. Solve left half and right half recursively. The subarrays not yet considered cross the middle, i.e., they're a suffix of the left part plus a prefix of the right part. Compute k-1 suffix-minima of the left half and k-1 prefix-minima of the right half. That allows you to compute the minimum for each middle-crossing subarray of length k in O(1) time each. The best subarray for the whole array is the best of left-best, right-best and crossing-best.
Runtime is O(n), I believe. As Ellis pointed out, in the recursion the subarray can become smaller than k. Such cases take O(1) time to return the equivalent of "there aren't any k-length subarrays in here". So the time is:
T(n) = { 2 * T(n/2) + O(k)   if n >= k
       { O(1)                otherwise
For any 0 <= k <= n we have k = n^c with 0 <= c <= 1. Then the number of calls is Θ(n^(1-c)) and each call's own work takes Θ(n^c) time, for a total of Θ(n) time.
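The counting argument can be spelled out as follows. This is a sketch that assumes, for simplicity, that n/k is a power of two and that each call on a segment of size at least k does Θ(k) work of its own:

```latex
\begin{align*}
\text{calls on segments of size} \ge k
  &: \sum_{i=0}^{\log_2 (n/k)} 2^{i} = \Theta(n/k) = \Theta(n^{1-c}) \\
\text{own work of each such call}
  &: \Theta(k) = \Theta(n^{c}) \\
\text{calls on segments of size} < k
  &: O(n/k) \text{ calls, } O(1) \text{ each} \\
\text{total}
  &: \Theta(n^{1-c}) \cdot \Theta(n^{c}) + O(n/k) = \Theta(n)
\end{align*}
```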
Posted a question about the complexity to be sure.
Python implementation:
from itertools import accumulate, count
from math import inf

def solve_divide_and_conquer(A, k):
    def solve(start, stop):
        if stop - start < k:
            return -inf,
        mid = (start + stop) // 2
        left = solve(start, mid)
        right = solve(mid, stop)
        i0 = mid - k + 1
        prefixes = accumulate(A[mid:mid+k-1], min)
        if i0 < 0:
            prefixes = [*prefixes][-i0:]
            i0 = 0
        suffixes = list(accumulate(A[i0:mid][::-1], min))[::-1]
        crossing = max(zip(map(min, suffixes, prefixes), count(i0)))
        return max(left, right, crossing)
    return solve(0, len(A))[1]
Solution 2: k-Blocks
As commented by @benrg, the above dividing-and-conquering is needlessly complicated. We can simply work on blocks of length k. Compute the suffix minima of the first block and the prefix minima of the second block. That allows finding the minimum of each k-length subarray within these two blocks in O(1) time. Do the same with the second and third block, the third and fourth block, etc. Time is O(n) as well.
Python implementation:
def solve_blocks(A, k):
    return max(max(zip(map(min, prefixes, suffixes), count(mid-k)))
               for mid in range(k, len(A)+1, k)
               for prefixes in [accumulate(A[mid:mid+k], min, initial=inf)]
               for suffixes in [list(accumulate(A[mid-k:mid][::-1], min, initial=inf))[::-1]]
              )[1]
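Since the one-expression version is dense, here is the same block idea written with explicit loops. It is a sketch under two assumptions: the name best_window_start_blocks is hypothetical, and ties are resolved towards the earliest start index, which may differ from what solve_blocks returns when several windows share the best minimum.

```python
from itertools import accumulate
from math import inf


def best_window_start_blocks(A, k):
    """Return a start index of a length-k window whose minimum is largest.

    Returns None when len(A) < k.
    """
    best_min, best_start = -inf, None
    for mid in range(k, len(A) + 1, k):
        left = A[mid - k:mid]      # a full block of length k
        right = A[mid:mid + k]     # the next block, possibly shorter or empty
        # suffix_min[j] = min(left[j:]), with inf standing for the empty suffix
        suffix_min = list(accumulate(reversed(left), min, initial=inf))[::-1]
        # prefix_min[j] = min(right[:j]), with inf standing for the empty prefix
        prefix_min = list(accumulate(right, min, initial=inf))
        for j in range(len(prefix_min)):
            # The window starting at mid-k+j is left[j:] followed by right[:j].
            window_min = min(suffix_min[j], prefix_min[j])
            if window_min > best_min:
                best_min, best_start = window_min, mid - k + j
    return best_start


print(best_window_start_blocks([5, 1, -1, 2, 5, -4, 3, 9, 8, -2, 0, 6], 3))  # 6
```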
Solution 3: Monoqueue
Not divide and conquer, but the first one I came up with (and knew was O(n)).
Sliding window: represent the window with a deque of (sorted) indexes of strictly increasing array values in the window. When sliding the window to include a new value A[i]:
1. Remove the first index from the deque if the sliding makes it fall out of the window.
2. Remove indexes whose array values are larger than or equal to A[i]. (They can never be the minimum of the window anymore.)
3. Include the new index i.
4. The first index still in the deque is the index of the current window's minimum value. Use that to update the overall result.
Python implementation:
from collections import deque

A = [5, 1, -1, 2, 5, -4, 3, 9, 8, -2, 0, 6]
k = 3
I = deque()
for i in range(len(A)):
    if I and I[0] == i - k:
        I.popleft()
    while I and A[I[-1]] >= A[i]:
        I.pop()
    I.append(i)
    curr_min = A[I[0]]
    if i == k-1 or i > k-1 and curr_min > max_min:
        result = i - k + 1
        max_min = curr_min
print(result)
Benchmark
With 4000 numbers from the range 0 to 9999, and k=2000:
80.4 ms 81.4 ms 81.8 ms solve_brute_force
80.2 ms 80.5 ms 80.7 ms solve_original
2.4 ms 2.4 ms 2.4 ms solve_monoqueue
2.4 ms 2.4 ms 2.4 ms solve_divide_and_conquer
1.3 ms 1.4 ms 1.4 ms solve_blocks
Benchmark code (Try it online!):
from timeit import repeat
from random import choices
from itertools import accumulate
from math import inf
from itertools import count
from collections import deque

def solve_monoqueue(A, k):
    I = deque()
    for i in range(len(A)):
        if I and I[0] == i - k:
            I.popleft()
        while I and A[I[-1]] >= A[i]:
            I.pop()
        I.append(i)
        curr_min = A[I[0]]
        if i == k-1 or i > k-1 and curr_min > max_min:
            result = i - k + 1
            max_min = curr_min
    return result

def solve_divide_and_conquer(A, k):
    def solve(start, stop):
        if stop - start < k:
            return -inf,
        mid = (start + stop) // 2
        left = solve(start, mid)
        right = solve(mid, stop)
        i0 = mid - k + 1
        prefixes = accumulate(A[mid:mid+k-1], min)
        if i0 < 0:
            prefixes = [*prefixes][-i0:]
            i0 = 0
        suffixes = list(accumulate(A[i0:mid][::-1], min))[::-1]
        crossing = max(zip(map(min, suffixes, prefixes), count(i0)))
        return max(left, right, crossing)
    return solve(0, len(A))[1]

def solve_blocks(A, k):
    return max(max(zip(map(min, prefixes, suffixes), count(mid-k)))
               for mid in range(k, len(A)+1, k)
               for prefixes in [accumulate(A[mid:mid+k], min, initial=inf)]
               for suffixes in [list(accumulate(A[mid-k:mid][::-1], min, initial=inf))[::-1]]
              )[1]

def solve_brute_force(A, k):
    return max(range(len(A)+1-k),
               key=lambda start: min(A[start : start+k]))

def solve_original(_in, l):
    _min_start = 0
    min_trough = None
    for i in range(len(_in)+1-l):
        if min_trough is None:
            min_trough = min(_in[i:i+l])
        if min(_in[i:i+l]) > min_trough:
            _min_start = i
            min_trough = min(_in[i:i+l])
    return _min_start # , _in[_min_start:_min_start+l]

solutions = [
    solve_brute_force,
    solve_original,
    solve_monoqueue,
    solve_divide_and_conquer,
    solve_blocks,
]

for _ in range(3):
    A = choices(range(10000), k=4000)
    k = 2000

    # Check correctness
    expect = None
    for solution in solutions:
        index = solution(A.copy(), k)
        assert 0 <= index and index + k-1 < len(A)
        min_there = min(A[index : index+k])
        if expect is None:
            expect = min_there
            print(expect)
        else:
            print(min_there == expect, solution.__name__)
    print()

    # Speed
    for solution in solutions:
        copy = A.copy()
        ts = sorted(repeat(lambda: solution(copy, k), number=1))[:3]
        print(*('%5.1f ms ' % (t * 1e3) for t in ts), solution.__name__)
    print()

Efficiently vectorize an element-wise operation in matlab

I have an nx4 matrix A representing n spheres, and an mx3 matrix B representing m points. I need to test whether these m points are inside any of the spheres. I can do this using a for loop, but with large n and m this method is very inefficient. How can I vectorize this operation? My current method is
A = [0.8622 1.1594 0.7457 0.6925;
     1.4325 0.2559 0.0520 0.4687;
     1.8465 0.3979 0.2850 0.4259;
     1.4387 0.8713 1.6585 0.4616;
     0.2383 1.5208 0.5415 0.9417;
     1.6812 0.2045 0.1290 0.1972];
B = [0.5689 0.9696 0.8196;
     0.5211 0.4462 0.6254;
     0.9000 0.4894 0.2202;
     0.4192 0.9229 0.4639];
for i=1:size(B,1)
    mask = vecnorm(A(:, 1:3) - B(i,:), 2, 2) < A(:, 4);
    if sum(mask) > 0
        C(i) = true;
    else
        C(i) = false;
    end %if
end %for
I tested the method suggested by @LuisMendo, and it seems to speed up the calculation only for quite small m and n; for large m and n, say around 10000 as in my problem, the improvement is very limited. But @NickyMattsson gave me a hint. Because logical operations in MATLAB are faster than vecnorm, I first use a rough check to find the spheres near the point, and then do a fine check:
A = [0.8622 1.1594 0.7457 0.6925;
     1.4325 0.2559 0.0520 0.4687;
     1.8465 0.3979 0.2850 0.4259;
     1.4387 0.8713 1.6585 0.4616;
     0.2383 1.5208 0.5415 0.9417;
     1.6812 0.2045 0.1290 0.1972];
B = [0.5689 0.9696 0.8196;
     0.5211 0.4462 0.6254;
     0.9000 0.4894 0.2202;
     0.4192 0.9229 0.4639];
ids = 1:size(A, 1);
for i=1:size(B,1)
    % first a rough check
    xbound = abs(A(:, 1) - B(i, 1)) < A(:, 4);
    ybound = abs(A(:, 2) - B(i, 2)) < A(:, 4);
    zbound = abs(A(:, 3) - B(i, 3)) < A(:, 4);
    nears = ids(xbound & ybound & zbound);
    if isempty(nears)
        C(i) = false;
    else
        % then a fine check
        mask = vecnorm(A(nears, 1:3) - B(i,:), 2, 2) < A(nears, 4);
        if sum(mask) > 0
            C(i) = true;
        else
            C(i) = false;
        end
    end
end
This may reduce the time to 1/2 or 1/3, which is acceptable, and if I divide m and n into batches it may be even faster without too heavy a memory burden. @CrisLuengo mentioned the R*-tree method, but it seems that the implementation is quite complicated XD
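The rough-then-fine idea is independent of MATLAB. As a language-neutral illustration, here is a minimal pure-Python sketch for a single point: a sphere is skipped as soon as the point falls outside its bounding cube, and only the survivors get the exact distance test. The function name in_any_sphere and the tuple layout (cx, cy, cz, r) are assumptions for this example.

```python
def in_any_sphere(spheres, point):
    """Return True if point lies strictly inside at least one sphere.

    spheres: iterable of (cx, cy, cz, r); point: (x, y, z).
    """
    px, py, pz = point
    for cx, cy, cz, r in spheres:
        dx, dy, dz = px - cx, py - cy, pz - cz
        # Rough check: the point must lie inside the sphere's bounding cube.
        if abs(dx) >= r or abs(dy) >= r or abs(dz) >= r:
            continue
        # Fine check: compare squared distance with squared radius.
        if dx * dx + dy * dy + dz * dz < r * r:
            return True
    return False


unit = [(0.0, 0.0, 0.0, 1.0)]
print(in_any_sphere(unit, (0.5, 0.0, 0.0)))  # True
print(in_any_sphere(unit, (0.9, 0.9, 0.0)))  # False: inside the cube, outside the sphere
```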
This uses implicit expansion to compute all distances between points and sphere centers, and then to compare those with the sphere radii:
C = any(vecnorm(permute(B, [1 3 2]) - permute(A(:,1:3), [3 1 2]), 2, 3) < A(:,4).', 2);
This is probably faster than the loop approach, but also more memory-intensive, because an intermediate m×n×3 array is computed.

A query regarding the algorithm of binary search using C

In the binary search algorithm, in general:
if mid_value > search_element we set high = mid_pos - 1;
else if mid_value < search_element we set low = mid_pos + 1;
But I've just modified the algorithm like this:
if mid_value > search_element we set high = mid_pos;
else if mid_value < search_element we set low = mid_pos;
But my teacher told me that the standard algorithm for binary search is the first one, and that what I have written is also a search algorithm but not a binary search algorithm.
Is he correct?
Your algorithm is not correct.
Case:
list = [1, 2], searchElem = 2, low = 0, high = 1
mid = (low + high) / 2 = (0 + 1) / 2 = 0
list[mid] = 1 < searchElem, so set low = mid = 0
updated low = 0, high = 1 [the range didn't change]
so you will end up with an infinite loop.
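The stalled range can be reproduced with a short sketch. The loop structure of the modified algorithm is not shown in the question, so this version is an assumption; it adds a step cap (the hypothetical max_steps parameter) purely so that the demonstration terminates.

```python
def modified_search(values, target, max_steps=10):
    """The modified update rule (high = mid / low = mid) with a step cap.

    Returns the index of target, or None if the cap is reached.
    """
    low, high = 0, len(values) - 1
    for _ in range(max_steps):
        mid = (low + high) // 2
        if values[mid] == target:
            return mid
        if values[mid] > target:
            high = mid
        else:
            low = mid  # for low = 0, high = 1 this leaves the range unchanged
    return None


print(modified_search([1, 2], 2))  # None: low stays 0 and high stays 1
print(modified_search([1, 2], 1))  # 0
```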
I think you picked it up wrongly.
The basic binary search algorithm's workflow:
Procedure binary_search
    A ← sorted array
    n ← size of array
    x ← value to be searched
    Set lowerBound = 1
    Set upperBound = n
    while x not found
        if upperBound < lowerBound
            EXIT: x does not exist.
        set midPoint = lowerBound + ( upperBound - lowerBound ) / 2
        if A[midPoint] < x
            set lowerBound = midPoint + 1
        if A[midPoint] > x
            set upperBound = midPoint - 1
        if A[midPoint] = x
            EXIT: x found at location midPoint
    end while
end procedure
Here you can see what midPoint = lowerBound + ( upperBound - lowerBound ) / 2, lowerBound = midPoint + 1 and upperBound = midPoint - 1 actually do.
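The procedure above translates almost line by line into Python. This is a minimal sketch using 0-based indexes instead of the 1-based ones in the pseudocode, and returning -1 for "not found" (a convention chosen for this example).

```python
def binary_search(values, target):
    """Return an index of target in the sorted list values, or -1 if absent."""
    low, high = 0, len(values) - 1
    while low <= high:
        mid = low + (high - low) // 2
        if values[mid] < target:
            low = mid + 1    # target can only be to the right of mid
        elif values[mid] > target:
            high = mid - 1   # target can only be to the left of mid
        else:
            return mid
    return -1


print(binary_search([1, 2], 2))         # 1
print(binary_search([1, 3, 5, 7], 4))   # -1
```

Because mid is always excluded from the next range, the range shrinks on every iteration and the loop must terminate.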

Trying to understand how to iterate over more-dim Arrays

I am trying to learn how to iterate over arrays and therefore made up my own scenarios to practise on.
Let's say my given matrix is two-dimensional, i.e. a two-dimensional Array.
mat =[[1,2,300,-400],[0,3,-1,9],[3,4,-5,1]]
Task 1) Return the Array (row) with the highest sum of its values.
Task 2) Given that this Array represents an n×m matrix, return the row and column of the element for which the sum of the surrounding numbers is the highest.
To make it easier to understand let us use a different matrix here.
mat= [[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4]]
So it would look like this:
1111
2222
3333
4444
And the result would be [2,1] or [2,2]
since the sum for those numbers (2+2+2+3+3+4+4+4) = 24 would be the highest.
Here are my implementations so far:
Task 1)
I can only solve this by adding a sum method to the class Array.
def max_row(mat)
  return mat.max{|a,b| a.sum <=> b.sum }
end
class Array
  def sum
    sum = 0
    self.each(){|x|
      sum += x
    }
    return sum
  end
end
I want to solve it without using an extra method, but I do not know how.
My idea so far:
def max_row(mat)
  sum_ary = []
  mat.each(){|ary|
    sum = 0
    ary.each(){|x|
      sum += x
    }
    sum_ary << [sum]
  }
end
I tried find_index on my sum_ary, but as implemented it returns the first value which is not false, therefore I cannot use it to search for the biggest value.
Implementation Task 2):
mat = [[1,1,1,1],[2,2,2,2],[3,3,3,3],[4,4,4,4]]
def max_neighbor_sum(mat)
  sum_result = []
  for n in 0...mat.size()
    for m in 0...mat.size()
      sum = 0
      for a in (n-1)..(n+1)
        for b in (m-1)..(m+1)
          if m != nil && n !=nil && a>=0 && b>=0 && a<= (mat.size()-1)
            # print "n:#{n} m:#{m} a:#{a} b:#{b} \n"
            # p mat[a][b]
            if mat[a][b] !=nil && !(n==a && m==b)
              sum += mat[a][b]
            end
          end
        end
      end
      sum_result << sum
      # p sum_result
    end
  end
  return sum_result
end
I calculated all the sums correctly, but have no idea how I get the index for the row and column now.
I hope you can understand where I need some help.
Problem 1:
arrays.map(&:sum).max
Calls sum for each of the arrays, then chooses the biggest of them
Problem 2 can't be solved so easily, but this should work:
max_sum = 0
max_index = []
for n in 0...mat.size
  for m in 0...mat[n].size
    sum = 0
    for a in (n-1)..(n+1)
      for b in (m-1)..(m+1)
        next if a < 0 || b < 0      # negative indexes would wrap around to the end in Ruby
        next if a == n && b == m    # do not count the element itself
        sum += mat[a][b] unless mat[a].nil? || mat[a][b].nil?
      end
    end
    if sum > max_sum
      max_sum = sum
      max_index = [n,m]
    end
  end
end
max_sum # => maximum sum of all neighbours
max_index # => a pair of indexes which have the max sum
If you want to keep all of the max indexes, replace max_index with an array of pairs and push a pair whenever its sum is equal to max_sum (resetting the array whenever a larger sum is found).
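Keeping every position that reaches the maximum could look like the sketch below. It is written in Python rather than Ruby, purely to illustrate the bookkeeping; the function name max_neighbour_positions is hypothetical, and it assumes a non-empty matrix and that the element itself is excluded from its neighbour sum.

```python
def max_neighbour_positions(mat):
    """Return (best_sum, positions) over all cells, counting up to 8 neighbours."""
    best_sum, best_positions = None, []
    for n, row in enumerate(mat):
        for m in range(len(row)):
            total = 0
            # Clamp the 3x3 neighbourhood to the matrix borders.
            for a in range(max(n - 1, 0), min(n + 2, len(mat))):
                for b in range(max(m - 1, 0), min(m + 2, len(mat[a]))):
                    if (a, b) != (n, m):
                        total += mat[a][b]
            if best_sum is None or total > best_sum:
                best_sum, best_positions = total, [(n, m)]  # new maximum: reset
            elif total == best_sum:
                best_positions.append((n, m))               # tie: keep both
    return best_sum, best_positions


mat = [[1, 1, 1, 1], [2, 2, 2, 2], [3, 3, 3, 3], [4, 4, 4, 4]]
print(max_neighbour_positions(mat))  # (24, [(2, 1), (2, 2)])
```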
Here is my solution to task 2 which I came up with thanks to Piotr Kruczek.
Thanks for the kind help!
def max_neighbour_sum(mat)
  sum_result = []
  max_sum = 0
  for n in 0...mat.size()
    for m in 0...mat.size()
      sum = 0
      for a in (n-1)..(n+1)
        for b in (m-1)..(m+1)
          if m != nil && n !=nil && a>=0 && b>=0 && a<= (mat.size()-1)
            # print "n:#{n} m:#{m} a:#{a} b:#{b} \n"
            # p mat[a][b]
            if mat[a][b] !=nil && !(n==a && m==b)
              sum += mat[a][b]
            end
          end
        end
      end
      if sum > max_sum
        max_sum = sum
        sum_result = [n,m]
      end
      # p sum_result
    end
  end
  return sum_result
end
