Median of two sorted arrays

I was not able to understand base cases 5 and 6 below for calculating the median of two sorted arrays. N and M are the lengths of the two arrays.
Base cases:
The smaller array has only one element
Case 0: N = 0, M = 2
Case 1: N = 1, M = 1.
Case 2: N = 1, M is odd
Case 3: N = 1, M is even
The smaller array has only two elements
Case 4: N = 2, M = 2
Case 5: N = 2, M is odd
Case 6: N = 2, M is even
Case 0: There are no elements in first array, return median of second array. If second array is also empty, return -1.
Case 1: There is only one element in both arrays, so output the average of A[0] and B[0].
Case 2: N = 1, M is odd
Let B[5] = {5, 10, 12, 15, 20}
First find the middle element of B[], which is 12 for above array. There are following 4 sub-cases.
2.1 If A[0] is smaller than 10, the median is average of 10 and 12.
2.2 If A[0] lies between 10 and 12, the median is average of A[0] and 12.
2.3 If A[0] lies between 12 and 15, the median is average of 12 and A[0].
2.4 If A[0] is greater than 15, the median is average of 12 and 15.
In all the sub-cases, we find that 12 is fixed. So, we need to find the median of B[ M / 2 – 1 ], B[ M / 2 + 1], A[ 0 ] and take its average with B[ M / 2 ].
Case 3: N = 1, M is even
Let B[4] = {5, 10, 12, 15}
First find the middle items in B[], which are 10 and 12 in above example. There are following 3 sub-cases.
3.1 If A[0] is smaller than 10, the median is 10.
3.2 If A[0] lies between 10 and 12, the median is A[0].
3.3 If A[0] is greater than 12, the median is 12.
So, in this case, find the median of three elements B[ M / 2 – 1 ], B[ M / 2] and A[ 0 ].
Case 4: N = 2, M = 2
There are four elements in total. So we find the median of 4 elements.
Case 5: N = 2, M is odd
Let B[5] = {5, 10, 12, 15, 20}
The median is given by median of following three elements: B[M/2], max(A[0], B[M/2 – 1]), min(A[1], B[M/2 + 1]).
Case 6: N = 2, M is even
Let B[4] = {5, 10, 12, 15}
The median is given by median of following four elements: B[M/2], B[M/2 – 1], max(A[0], B[M/2 – 2]), min(A[1], B[M/2 + 1])
I was referring to the URL below to understand the median of two sorted arrays.
http://www.geeksforgeeks.org/median-of-two-sorted-arrays-of-different-sizes/

The simplest way to see how this works is ... with a pencil and paper. Draw two lists of numbers (both short ...), one with an odd number of elements, the other even. Now, look very carefully at the slight-but-important differences between the solutions for Case 5 vs. Case 6.
Since the logic as presented uses MAX() and MIN(), construct pencil-and-paper situations in which first one, then the other, of the two arguments is "biggest" (or smallest).
In fifteen minutes or so with that piece of paper and that number-two pencil, you'll be able to convince yourself that the algorithm does (or, doesn't?) work.
... (and, yeah, this is a perfectly serious recommendation. This is exactly how I work such things out!)
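The pencil-and-paper check can also be run mechanically. The sketch below is an illustration, not the linked article's code: it writes Case 5 and Case 6 exactly as quoted in the question and compares them with the median of the merged arrays. The function names are made up for this example, and it assumes both arrays are sorted, A has exactly two elements, and B has at least three (Case 5) or four (Case 6) elements.

```python
from statistics import median

def median_case5(A, B):
    """Case 5 as quoted: len(A) == 2 and len(B) odd (at least 3)."""
    m = len(B)
    return median([B[m // 2],
                   max(A[0], B[m // 2 - 1]),
                   min(A[1], B[m // 2 + 1])])

def median_case6(A, B):
    """Case 6 as quoted: len(A) == 2 and len(B) even (at least 4)."""
    m = len(B)
    return median([B[m // 2],
                   B[m // 2 - 1],
                   max(A[0], B[m // 2 - 2]),
                   min(A[1], B[m // 2 + 1])])

def median_by_merging(A, B):
    """Reference answer: merge everything and take the median directly."""
    return median(sorted(A + B))

if __name__ == "__main__":
    odd_B = [5, 10, 12, 15, 20]
    even_B = [5, 10, 12, 15]
    for A in ([1, 2], [11, 13], [1, 30], [30, 40]):
        print(A, median_case5(A, odd_B), median_by_merging(A, odd_B))
        print(A, median_case6(A, even_B), median_by_merging(A, even_B))
```

The intuition behind both cases: the combined median has to sit near the middle of B, so only B's middle elements can matter, together with whichever element of A can land among them. max(A[0], ...) drops the candidate that is too far left, and min(A[1], ...) drops the one that is too far right.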


Algorithm for array permutation

We have an integer array A[] of size N (1 ≤ N ≤ 10^4), which is originally sorted with entries 1...N. For a permutation P of size N, the array is shuffled so that the i-th entry from the left before the shuffle is at the P_i-th position after the shuffle. The shuffle is repeated until the array is sorted again.
For example, for A[] = {1, 2, 3, 4}, if P = {1, 2, 3, 4}, it would only take one move for the array to be sorted (the entries would move to their original positions). If P = {4, 3, 1, 2}, then it would take 4 moves for the array to be sorted again:
Move 0 | [1, 2, 3, 4]
Move 1 | [3, 4, 2, 1]
Move 2 | [2, 1, 4, 3]
Move 3 | [4, 3, 1, 2]
Move 4 | [1, 2, 3, 4]
The problem is to find the sum of all positive integers J for which you can generate a permutation that requires J moves to get the array sorted again.
Example:
For A[] = {1, 2, 3, 4}, you can generate permutations that require 1, 2, 3, and 4 steps:
Requires 1 move: P = {1, 2, 3, 4}
Requires 2 moves: P = {1, 3, 2, 4}
Requires 3 moves: P = {1, 4, 2, 3}
Requires 4 moves: P = {4, 3, 1, 2}
So you would output 1 + 2 + 3 + 4 = 10.
One observation I have made is that you can always generate a permutation that requires J moves for 1 ≤ J < N: within a range of size J, shift every entry by one position (a single cycle of length J) and leave the remaining entries in place. However, for permutations that require J moves where J ≥ N, you would need another algorithm.
The brute-force solution would be checking every permutation, or N! permutations which definitely wouldn't fit in run time. I'm looking for an algorithm with run time at most O(N^2).
EDIT 1: A permutation that requires N moves will always be guaranteed as well, as you can create a permutation where every entry is misplaced, and not just swapped with another entry. The question becomes how to find permutations where J > N.
EDIT 2: @ljeabmreosn made the observation that there exists a permutation that takes J steps if and only if there are natural numbers a_1, ..., a_k with a_1 + ... + a_k = N and LCM(a_1, ..., a_k) = J. Using that observation, the problem comes down to enumerating the partitions of the integer N (the possible cycle lengths). However, this won't be a quadratic algorithm. How can I find them efficiently?
Sum of distinct orders of degree-n permutations.
https://oeis.org/A060179
This is the number you are looking for, with a formula, and some maple code.
As often when trying to compute an integer sequence, compute the first few values (here 1, 1, 3, 6, 10, 21) and look for it in the great "On-line Encyclopedia of Integer Sequences".
Here is some python code inspired by it, I think it fits your complexity goals.
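def primes_upto(limit):
    # Sieve of Eratosthenes: is_prime[i] tells whether i is prime, for 0 <= i <= limit
    is_prime = [False] * 2 + [True] * (limit - 1)
    for n in range(int(limit**0.5 + 1.5)):
        if is_prime[n]:
            for i in range(n*n, limit+1, n):
                is_prime[i] = False
    return [i for i, prime in enumerate(is_prime) if prime]

def sum_of_distinct_order_of_Sn(N):
    primes = primes_upto(N)
    # res[n] = sum of distinct orders reachable with total cycle length <= n,
    # using only the primes processed so far (initially just the order 1)
    res = [1]*(N+1)
    for p in primes:
        # Go downwards so that res[n-pj] does not yet include the prime p
        for n in range(N,p-1,-1):
            pj = p
            while pj <= n:
                res[n] += res[n-pj] * pj
                pj *= p
    return res[N]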
on my machine:
>%time sum_of_distinct_order_of_Sn(10000)
CPU times: user 2.2 s, sys: 7.54 ms, total: 2.21 s
Wall time: 2.21 s
51341741532026057701809813988399192987996798390239678614311608467285998981748581403905219380703280665170264840434783302693471342230109536512960230
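For small N, the result can be cross-checked against the characterization from EDIT 2 of the question: enumerate every partition of N, take the LCM of its parts, and sum the distinct values. This is only a sanity-check sketch (the number of partitions grows far too quickly for N = 10^4), and the function names are made up for this illustration.

```python
from math import gcd

def partitions(n, max_part=None):
    """Yield every partition of n as a non-increasing list of positive parts."""
    if max_part is None:
        max_part = n
    if n == 0:
        yield []
        return
    for part in range(min(n, max_part), 0, -1):
        for rest in partitions(n - part, part):
            yield [part] + rest

def lcm_of(parts):
    """Least common multiple of a list of positive integers (1 for an empty list)."""
    result = 1
    for a in parts:
        result = result * a // gcd(result, a)
    return result

def sum_of_orders_bruteforce(n):
    """Sum of the distinct LCMs over all partitions of n."""
    return sum({lcm_of(p) for p in partitions(n)})

if __name__ == "__main__":
    print([sum_of_orders_bruteforce(n) for n in range(6)])
```

For n = 0..5 this gives the same first values quoted above (1, 1, 3, 6, 10, 21).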

Find K numbers whose product is N , keeping the maximum of K numbers to be minimum

Basically, we are given numbers N and K, and we need to find an array of size K such that the product of the array elements is N and the maximum of the elements is minimized.
For example:
420 3
ans: 6 7 10
Explanation: 420 can be written as the product of 6, 10 and 7. It can also be written as 5 * 7 * 12, but 10 (the maximum of 6, 10 and 7) is smaller than 12 (the maximum of 5, 7 and 12).
Constraints: numbers>0; 0 <= N < 10^6; 1<=k<=100
What I did so far was to first find the prime factors but after that I can't think of an efficient way to get the sequence.
Basically, amritanshu had a pretty good idea: You have a list of the prime factors and split this list into a list containing the K biggest factors and another containing the other prime factors:
[2, 2], [3, 5, 7]
Then you multiply the biggest element of the first list with the smallest element of the second list and overwrite the element of the second list with the result. Remove the biggest element of the first list. Repeat these steps until your first list is empty:
[2, 2], [3, 5, 7]
[2], [6, 5, 7] // 5 is now the smallest element
[], [6, 10, 7]
Here is another example:
N = 2310 = 2 * 3 * 5 * 7 * 11
K = 3
[2, 3], [5, 7, 11]
[2], [15, 7, 11]
[], [15, 14, 11]
However, this algorithm is still not optimal in some cases, such as N = 2310, K = 2:
[2, 3, 5], [7, 11]
[2, 3], [35, 11]
[2], [35, 33]
[], [35, 66] // better: [], [42, 55]
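A minimal sketch of the merging heuristic walked through above, assuming trial division is fast enough for N < 10^6. The names prime_factors and greedy_merge are invented for this illustration, and padding with 1s when N has fewer than K prime factors is an added assumption that the description does not cover.

```python
import heapq

def prime_factors(n):
    """Prime factors of n (n >= 1) in ascending order, by trial division."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors

def greedy_merge(n, k):
    """Merge the small prime factors of n into the k biggest ones, largest first."""
    factors = prime_factors(n)
    # Assumption (not in the description): pad with 1s if there are fewer than k primes.
    factors = [1] * (k - len(factors)) + factors
    small, big = factors[:-k], factors[-k:]
    heapq.heapify(big)  # min-heap: big[0] is always the smallest current product
    for p in reversed(small):  # biggest remaining small factor first
        smallest = heapq.heappop(big)
        heapq.heappush(big, smallest * p)
    return sorted(big)

if __name__ == "__main__":
    print(greedy_merge(420, 3))
    print(greedy_merge(2310, 3))
    print(greedy_merge(2310, 2))
```

On the three worked examples this returns [6, 7, 10], [11, 14, 15] and [35, 66], so it also reproduces the suboptimal result for N = 2310, K = 2.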
So I thought you actually want to split the factors such that they are as close as possible to the Kth root of N. I came up with this algorithm:
1. Calculate R, the smallest integer greater than or equal to the Kth root of N.
2. Calculate the gcd of R and N.
3. If the gcd is equal to R, add R to the list, call the algorithm recursively with N / R and K-1, append the result to the list and return the list.
4. If the gcd is not equal to R, add it to R and go to step 2.
here is a little bit of python code:
import math

def gcd(a, b):
    while b:
        a, b = b, a % b
    return a

def root(N, K):
    R = int(math.exp(math.log(N) / K))
    if R ** K < N:
        R += 1
    return R

def find_factors(N, K):
    if K == 1:
        return [N]
    R = root(N, K)
    while True:
        GCD = gcd(N, R)
        if GCD == R:
            return [R] + find_factors(N // R, K-1)
        R += GCD
EDIT:
I just noticed that this algorithm is still giving incorrect results in many cases. The correct way is incrementing R until it divides N:
def find_factors(N, K):
    if K == 1:
        return [N]
    R = root(N, K)
    while True:
        if N % R == 0:
            return [R] + find_factors(N // R, K-1)
        R += 1
This way you don't need gcd.
Overall, I guess you need to factorize N and then essentially make some brute-force approach trying to combine the prime factors into combined factors of roughly equal size. Generally, that should not be too bad, because factorizing is already the most expensive part in many cases.
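One possible shape for such a brute-force search, as a sketch only: it works on divisors directly rather than on the prime factors, tries every non-decreasing sequence of K divisors whose product is N, and keeps the one with the smallest last (largest) element. The name best_split is hypothetical and N >= 1 is assumed. It is meant as a reference to test faster heuristics against, not as a tuned solution.

```python
def best_split(n, k):
    """Return k positive integers (non-decreasing) with product n and minimal maximum."""
    def search(n, k, lo):
        # Find k factors of n, each >= lo, in non-decreasing order.
        if k == 1:
            return [n] if n >= lo else None
        best = None
        d = lo
        # d is the smallest of k factors, so d**k cannot exceed n.
        while d ** k <= n:
            if n % d == 0:
                rest = search(n // d, k - 1, d)
                # The last element is the maximum because factors are non-decreasing.
                if rest is not None and (best is None or rest[-1] < best[-1]):
                    best = [d] + rest
            d += 1
        return best
    return search(n, k, 1)

if __name__ == "__main__":
    print(best_split(420, 3))
    print(best_split(2310, 2))
```

For the examples in this thread it returns [6, 7, 10] for (420, 3) and [42, 55] for (2310, 2).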
Original answer (wrong) (see comment by @gus):
Without proof of correctness, assuming N>0, K>0, in pseudo code:
Factorize N into prime factors, store into array F
find smallest integer m>=0 such that length(F) <= 2^m*K
Fill F by 1s to get size 2^m*K.
For i=m down to 1
    sort F
    for j=1 to 2^(i-1)*K
        F[j] = F[j] * F[2^i*K+1-j] (multiply smallest with largest, and so on)
    F=F[1:2^(i-1)*K] (delete upper half of F)
F contains result.
Example 420 3:
F={2,2,3,5,7}
m=1
F={1,2,2,3,5,7}
F={7,10,6} DONE
Example 2310 2:
F={2,3,5,7,11}
m=2
F={1,1,1,2,3,5,7,11} (fill to 2^m*K and sort)
F={11,7,5,6} (reduce to half)
F={5,6,7,11} (sort)
F={55, 42} DONE
Example N=17^3*72, K=3
F={2,2,2,3,3,17,17,17}
m=2
F={1,1,1,1,2,2,2,3,3,17,17,17}
F={17,17,17,3,6,4}
F={3,4,6,17,17,17}
F={3,4,6,17,17,17}
F={51,68,102}

Permute array to make it alternate between increasing and decreasing

An array X[1..n] of distinct integers is wobbly if it alternates between increasing and decreasing: X[i] < X[i+1] for every odd index i, and X[i] > X[i+1] for every even index i. For example, the following 16-element array is wobbly:
12, 13, 0, 16, 13, 31, 5, 7, -1, 23, 8, 10, -4, 37, 17, 42
Describe and analyze an algorithm that permutes the elements of a given array to make the array wobbly.
My attempt:
The obvious solution that comes to mind is to sort the original array, split it in half, and then alternate between the two halves, taking the next element from each in turn to build the wobbly array. This would take O(n log n). (Edit: Just realized this would only work if all of the integers are distinct.) I can't help but think there is a more efficient way to achieve this.
How could this be done?
(This is not a homework problem)
[After 3 years... :-)]
Your problem definition states that all the array elements are distinct. So, you can do better than sorting -- sorting does too much.
Consider that you have a wobbly sequence constructed out of the first k elements. There are two cases for the last two elements in the sequence:
1. A[k-1] < A[k]
2. A[k-1] > A[k]
Case 1: if A[k+1] < A[k], you don't have to do anything because wobbliness is already maintained. However, if A[k+1] > A[k], swapping A[k] and A[k+1] restores wobbliness: the larger value moves to position k, so A[k-1] < A[k] still holds and now A[k] > A[k+1].
Case 2: if A[k+1] > A[k], you don't have to do anything because wobbliness is already maintained. However, if A[k+1] < A[k], swapping A[k] and A[k+1] restores wobbliness by the symmetric argument.
This gives you an O(n) time and O(1) space algorithm (because you are swapping in place). Your base case is when k = 2, which is trivially wobbly.
Following is an implementation in Python3:
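def rearrange_wobbly(A):
    # Rearranges A in place; lists shorter than 3 are already alternating.
    if len(A) < 3:
        return A
    for i in range(2, len(A)):
        # Three consecutive elements in monotone order break the alternation.
        if A[i - 2] < A[i - 1] < A[i] or A[i - 2] > A[i - 1] > A[i]:
            # Swap A[i] and A[i - 1]
            A[i - 1], A[i] = A[i], A[i - 1]
>>> import random
>>> A = [x for x in range(10)]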
>>> A
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
>>> random.shuffle(A)
>>> A
[3, 2, 1, 0, 7, 6, 9, 8, 4, 5]
>>> rearrange_wobbly(A)
>>> A
[3, 1, 2, 0, 7, 6, 9, 4, 8, 5]
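Note that the question's definition also fixes the direction of the first step (X[1] < X[2]), whereas the run above starts with 3 > 1. A minimal sketch of a variant that enforces the required direction at every index, assuming distinct values; make_wobbly and is_wobbly are names invented for this example and are not part of the answer's code:

```python
def make_wobbly(values):
    """Return a copy with a[0] < a[1] > a[2] < a[3] ... (assumes distinct values)."""
    a = list(values)
    for i in range(len(a) - 1):
        # 0-based even i corresponds to the question's odd 1-based index.
        want_increase = (i % 2 == 0)
        if (a[i] < a[i + 1]) != want_increase:
            # Swapping keeps the previous pair valid: the value moved into
            # position i is even more extreme in the direction pair i-1 needs.
            a[i], a[i + 1] = a[i + 1], a[i]
    return a

def is_wobbly(a):
    """Check the question's definition using 0-based indices."""
    return all((a[i] < a[i + 1]) == (i % 2 == 0) for i in range(len(a) - 1))

if __name__ == "__main__":
    shuffled = [3, 2, 1, 0, 7, 6, 9, 8, 4, 5]
    result = make_wobbly(shuffled)
    print(result, is_wobbly(result))
```

This is still a single O(n) pass with at most one swap per position; it returns a new list instead of modifying the input.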
The most straightforward approach I can think of is to sort the array and then alternate between taking the lowest and the highest remaining element.
E.g. with your example list, sorted:
-4 -1 0 5 7 8 10 12 13 13 16 17 23 31 37 42
The result then becomes
-4 42 -1 37 0 31 5 23 7 17 8 16 10 13 12 13
However, I think this breaks down if you have identical elements toward the middle, so in that scenario you might have to do a bit of manual value substitution towards the end of the sequence to restore the "wobbly" constraint.

Find the length of the longest contiguous sub-array in a sorted array in which the difference between the end and start values is at most k

I have a sorted array, for example
[0, 0, 3, 6, 7, 8, 8, 8, 10, 11, 13]
Here, let's say k = 1 so the longest sub-array is [7, 8, 8, 8] with length = 4.
As another example, consider [0, 0, 0, 3, 6, 9, 12, 12, 12, 12] with k = 3. Here the longest sub-array is [9, 12, 12, 12, 12] with length = 5.
So far, I have used a binary search algorithm O(n log n) which iterates from index 0 .. n - 1 and tries to find the rightmost index that satisfies our condition.
Is there a linear time algorithm to do this?
Yes, there is a linear-time algorithm. You can use the two-pointers technique. Here is some pseudocode:
R = 0
res = 0
for L = 0 .. N - 1:
    while R < N and a[R] - a[L] <= k:
        R += 1
    res = max(res, R - L)
It has O(n) time complexity because L and R never decrease and each of them can be incremented at most n times.
Why is this algorithm correct? For a fixed L, R is the index of the first element of the array such that a[R] - a[L] > k. That's why R - 1 is the index of the last element that fits. The length of [L, R - 1] subarray is exactly R - L. The resulting subarray is obtained by iterating over all possible values of L, that is, all possibilities are checked. That's why it always finds correct answer.
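The pseudocode translates almost line for line into Python. A minimal sketch, assuming the input list is sorted in ascending order and k >= 0; the function name is made up for this example:

```python
def longest_window(a, k):
    """Length of the longest contiguous run in sorted list a with a[end] - a[start] <= k."""
    best = 0
    right = 0  # first index that no longer fits the window starting at left
    for left in range(len(a)):
        while right < len(a) and a[right] - a[left] <= k:
            right += 1
        best = max(best, right - left)
    return best

if __name__ == "__main__":
    print(longest_window([0, 0, 3, 6, 7, 8, 8, 8, 10, 11, 13], 1))
    print(longest_window([0, 0, 0, 3, 6, 9, 12, 12, 12, 12], 3))
```

On the two arrays from the question this returns 4 and 5 respectively.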

Partition an array of numbers into sets by proximity

Let's say we have an array like
[37, 20, 16, 8, 5, 5, 3, 0]
What algorithm can I use so that I can specify the number of partitions and have the array broken into that many parts?
For 2 partitions, it should be
[37] and [20, 16, 8, 5, 5, 3, 0]
For 3, it should be
[37],[20, 16] and [8, 5, 5, 3, 0]
I am able to break them down by proximity by simply subtracting the element with right and left numbers but that doesn't ensure the correct number of partitions.
Any ideas?
My code is in ruby but any language/algo/pseudo-code will suffice.
Here's Ruby code for Vikram's algorithm:
def partition(arr, clusters)
  # Return the array unchanged if clusters is negative or not smaller than the array size
  return arr if (clusters >= arr.size) || (clusters < 0)
  edges = {}
  # Weight of edge i is the gap between arr[i] and arr[i+1]
  arr.each_with_index do |a, i|
    break if i == (arr.length - 1)
    edges[i] = a - arr[i + 1]
  end
  # Edge indices sorted by ascending weight
  sorted_edges = edges.sort_by { |k, v| v }.collect { |k| k.first }
  # Original indices of the edges joined so far
  joined = []
  sorted_edges.each do |edge|
    # Stop once the requested number of clusters remains
    break if arr.size == clusters
    # Each earlier join to the left of this edge shifted it one slot left
    pos = edge - joined.count { |e| e < edge }
    joined << edge
    # Join the elements on both sides of the edge
    arr[pos] = arr[pos, 2].flatten
    arr.delete_at(pos + 1)
  end
  arr
end
(assuming the array is sorted in descending order)
37, 20, 16, 8, 5, 5, 3, 0
Calculate the differences between adjacent numbers:
17, 4, 8, 3, 0, 2, 3
Then sort them in descending order:
17, 8, 4, 3, 3, 2, 0
Then take the first few numbers. For example, for 4 partitions, take 3 numbers:
17, 8, 4
Now look at the original array and find the elements with these differences (the easiest way is to attach its index in the original array to each element of the difference array).
17 - difference between 37 and 20
8 - difference between 16 and 8
4 - difference between 20 and 16
Now print the stuff:
37 | 20 | 16 | 8, 5, 5, 3, 0
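The steps above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the answerer's code: the input is assumed to be sorted, ties between equal gaps are broken in favour of the earlier position, and the function name is made up.

```python
def split_by_largest_gaps(values, parts):
    """Cut a sorted sequence at its (parts - 1) largest adjacent gaps."""
    if parts <= 1 or len(values) <= 1:
        return [list(values)]
    # Pair every gap with the index of its left-hand element.
    gaps = [(abs(values[i] - values[i + 1]), i) for i in range(len(values) - 1)]
    # Largest gaps first; on ties, the earlier position wins (an arbitrary choice).
    largest = sorted(gaps, key=lambda g: (-g[0], g[1]))[:parts - 1]
    cut_after = sorted(i for _, i in largest)
    result, start = [], 0
    for i in cut_after:
        result.append(list(values[start:i + 1]))
        start = i + 1
    result.append(list(values[start:]))
    return result

if __name__ == "__main__":
    data = [37, 20, 16, 8, 5, 5, 3, 0]
    for parts in (2, 3, 4):
        print(parts, split_by_largest_gaps(data, parts))
```

For 2, 3 and 4 parts this reproduces the splits shown in the question and in the example above. The cost is dominated by sorting the gaps, so it runs in O(n log n).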
I think your problem can be solved with k-clustering using Kruskal's algorithm. Kruskal's algorithm can be used to find clusters such that the spacing between them is maximal.
Algorithm:
Construct a path graph from your data set as follows:
[37, 20, 16, 8, 5, 5, 3, 0]
path graph: - 0 -> 1 -> 2 -> 3 -> 4 -> 5 -> 6 -> 7
then weight for each edge will be difference between their values
edge(0,1) = abs(37-20) = 17
edge(1,2) = abs(20-16) = 4
edge(2,3) = abs(16-8) = 8
edge(3,4) = abs(8-5) = 3
edge(4,5) = abs(5-5) = 0
edge(5,6) = abs(5-3) = 2
edge(6,7) = abs(3-0) = 3
Run Kruskal on this graph until only k clusters remain:
Sort the edges by weight in ascending order:
(4,5),(5,6),(6,7),(3,4),(1,2),(2,3),(0,1)
Run Kruskal on it to find exactly k = 3 clusters:
iteration 1 : join (4,5) clusters = 7 clusters: [37,20,16,8,(5,5),3,0]
iteration 2 : join (5,6) clusters = 6 clusters: [37,20,16,8,(5,5,3),0]
iteration 3 : join (6,7) clusters = 5 clusters: [37,20,16,8,(5,5,3,0)]
iteration 4 : join (3,4) clusters = 4 clusters: [37,20,16,(8,5,5,3,0)]
iteration 5 : join (1,2) clusters = 3 clusters: [37,(20,16),(8,5,5,3,0)]
stop as clusters = 3
Reconstructed solution: [(37), (20, 16), (8, 5, 5, 3, 0)], which is what you wanted.
While @anatolyg's solution may be fine, you should also look at k-means clustering. It's usually done in higher dimensions, but ought to work fine in 1d.
You pick k; your examples are k=2 and k=3. The algorithm seeks to put the inputs into k sets that minimize the sum of distances squared from the set's elements to the centroid (mean position) of the set. This adds a bit of rigor to your rather fuzzy definition of the right result.
While finding an optimal k-means clustering is NP-hard in general, there is a simple greedy heuristic.
It's an iteration. Take a guess to get started. Either pick k elements at random to be the initial means or put all the elements randomly into k sets and compute their means. Some care is needed here because each of the k sets must have at least one element.
Additionally, because your integer sets can have repeats, you'll have to ensure the initial k means are distinct. This is easy enough. Just pick from a copy of the input that has been "uniquified" (duplicates removed).
Now iterate. For each element find its closest mean. If it's already in the set corresponding to that mean, leave it there. Else move it. After all elements have been considered, recompute the means. Repeat until no elements need to move.
The Wikipedia page on this is pretty good.
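A minimal 1-D sketch of the iteration described above. To keep the result reproducible it does not start from a random guess: as an assumption of this example, the initial means are k distinct input values spread evenly over the sorted, de-duplicated input. The function name and the max_iter safety cap are likewise made up for illustration.

```python
def kmeans_1d(values, k, max_iter=100):
    """Cluster numbers into at most k groups by iterating assign / recompute-means."""
    distinct = sorted(set(values))
    if not 1 <= k <= len(distinct):
        raise ValueError("k must be between 1 and the number of distinct values")
    # Assumed start: k distinct values spread evenly over the sorted distinct inputs.
    last = len(distinct) - 1
    means = [float(distinct[i * last // max(k - 1, 1)]) for i in range(k)]
    assignment = None
    for _ in range(max_iter):
        # Assign every element to its closest mean (ties go to the lower index).
        new_assignment = [min(range(k), key=lambda j: abs(v - means[j]))
                          for v in values]
        if new_assignment == assignment:
            break  # no element moved, so the iteration has converged
        assignment = new_assignment
        # Recompute each mean from its current members.
        for j in range(k):
            members = [v for v, c in zip(values, assignment) if c == j]
            if members:
                means[j] = sum(members) / len(members)
    clusters = [[v for v, c in zip(values, assignment) if c == j] for j in range(k)]
    return [c for c in clusters if c]

if __name__ == "__main__":
    data = [37, 20, 16, 8, 5, 5, 3, 0]
    print(kmeans_1d(data, 2))
    print(kmeans_1d(data, 3))
```

With this starting guess, k = 3 yields the same groups as in the question, while k = 2 groups 37 with 20 instead of isolating 37. The two approaches optimize different criteria (squared distance to the mean versus the largest gap), so they need not agree.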
