Find the longest subarray that contains a majority element - arrays

I am trying to solve this algorithmic problem:
https://dunjudge.me/analysis/problems/469/
For convenience, I have summarized the problem statement below.
Given an array of length (<= 2,000,000) containing integers in the range [0, 1,000,000], find the
longest subarray that contains a majority element.
A majority element is defined as an element that occurs > floor(n/2) times in a list of length n.
Time limit: 1.5s
For example:
If the given array is [1, 2, 1, 2, 3, 2],
The answer is 5 because the subarray [2, 1, 2, 3, 2] of length 5 from position 1 to 5 (0-indexed) has the number 2 which appears 3 > floor(5/2) times. Note that we cannot take the entire array because 3 = floor(6/2).
My attempt:
The first thing that comes to mind is an obvious brute-force (but correct) solution: fix the start and end indexes of a subarray and loop through it to check whether it contains a majority element, then take the length of the longest subarray that does. With a small optimization this runs in O(n^2). Clearly, this will not pass the time limit.
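A minimal Java sketch of this O(n^2) approach, assuming the small optimization means keeping running counts while extending the end index (which may not be exactly the optimization intended):

import java.util.HashMap;
import java.util.Map;

public class BruteForce {
    // For each start index, extend the end index while maintaining counts and
    // the largest count seen; a window is valid when 2 * maxCount > its length.
    static int longestMajoritySubarray(int[] a) {
        int n = a.length;
        int best = (n == 0) ? 0 : 1;
        for (int start = 0; start < n; start++) {
            Map<Integer, Integer> count = new HashMap<>();
            int maxCount = 0;
            for (int end = start; end < n; end++) {
                maxCount = Math.max(maxCount, count.merge(a[end], 1, Integer::sum));
                if (2 * maxCount > end - start + 1) {
                    best = Math.max(best, end - start + 1);
                }
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(longestMajoritySubarray(new int[]{1, 2, 1, 2, 3, 2})); // 5
    }
}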
I was also thinking of dividing the elements into buckets that contain their indexes in sorted order.
Using the example above, these buckets would be:
1: 0, 2
2: 1, 3, 5
3: 4
Then for each bucket, I would make an attempt to merge the indexes together to find the longest subarray that contains k as the majority element where k is the integer label of that bucket.
We could then take the maximum length over all values of k. I didn't try out this solution as I didn't know how to perform the merging step.
Could someone please advise me on a better approach to solve this problem?
Edit:
I solved this problem thanks to the answers of PhamTrung and hk6279. Although I accepted the answer from PhamTrung because he first suggested the idea, I highly recommend looking at the answer by hk6279 because his answer elaborates the idea of PhamTrung and is much more detailed (and also comes with a nice formal proof!).

Note: attempt 1 is wrong, as #hk6279 has given a counterexample. Thanks for pointing it out.
Attempt 1:
The answer is quite complex, so I will only discuss a brief idea.
Let's process each unique number one by one.
Processing each occurrence of number x from left to right, at index i, add a segment (i, i) indicating the start and end of the current subarray. After that, look at the left neighbour of this segment and try to merge it into (i, i) (so, if the left neighbour is (st, ed), try to make the segment become (st, i)) if the condition is satisfied, and continue merging until no more merges are possible or there is no left neighbour.
We keep all those segments in a stack for faster lookup/add/remove.
Finally, for each segment, we try to enlarge it as much as possible and keep the biggest result.
Time complexity should be O(n), as each element can only be merged once.
Attempt 2:
Let's process each unique number one by one.
For each unique number x, we maintain a counter array. Going from index 0 to the end of the array, we increase the count when we encounter x and decrease it otherwise. So for this array
[0,1,2,0,0,3,4,5,0,0] and number 0, we have this counter array
[1,0,-1,0,1,0,-1,-2,-1,0]
So, for a valid subarray which ends at a specific index i, the value counter[i] - counter[start - 1] must be greater than 0. (This is easily explained if you view the array as made of +1 and -1 entries, with +1 where x occurs and -1 otherwise; the problem then becomes finding the longest subarray with a positive sum.)
With the help of binary search, the above algorithm still has a complexity of O(n^2 log n) (in case we have n/2 unique numbers, we need to do the above process n/2 times, each taking O(n log n)).
To improve it, observe that we don't actually need to store counter values for all positions, only for the positions where x occurs. For the counter array above we can store:
[1,#,#,0,1,#,#,#,-1,0]
This leads to an O(n log n) solution, which goes through each element only once.
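To make the counter array concrete, here is a small Java sketch that builds it for a given x (names are mine; for the example above it reproduces [1,0,-1,0,1,0,-1,-2,-1,0]):

import java.util.Arrays;

public class CounterArray {
    // +1 at every occurrence of x, -1 everywhere else, accumulated left to right.
    static int[] buildCounter(int[] a, int x) {
        int[] counter = new int[a.length];
        int sum = 0;
        for (int i = 0; i < a.length; i++) {
            sum += (a[i] == x) ? 1 : -1;
            counter[i] = sum;
        }
        return counter;
    }

    public static void main(String[] args) {
        int[] a = {0, 1, 2, 0, 0, 3, 4, 5, 0, 0};
        System.out.println(Arrays.toString(buildCounter(a, 0))); // [1, 0, -1, 0, 1, 0, -1, -2, -1, 0]
    }
}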

This elaborates and explains how attempt 2 in #PhamTrung's solution works.
To get the length of the longest subarray, we should:
Find the maximum number of occurrences of the majority element in a valid subarray, denoted m.
(This is done by attempt 2 in #PhamTrung's solution.)
Return min(2*m - 1, length of the given array).
Concept
The attempt stems from a method for solving the longest positive-sum subarray problem.
We maintain a counter array for each unique number x: we do a +1 when we encounter x and a -1 otherwise.
Take the array [0,1,2,0,0,3,4,5,0,0,1,0] and unique number 0: we get the counter array [1,0,-1,0,1,0,-1,-2,-1,0,-1,0]. If we blank out the positions that do not hold the target unique number, we get [1,#,#,0,1,#,#,#,-1,0,#,0].
We can get a valid subarray from the blanked counter array whenever there exist two counters such that the value of the right counter is greater than or equal to the left one. See the Proof part.
To improve this further, we can ignore all # entries, as they are useless, and we get [1(0),0(3),1(4),-1(8),0(9),0(11)] in count(index) format.
We can improve this even further by not recording a counter that is greater than its previous effective counter. Take the counters at indexes 8 and 9 as an example: if you can form a subarray with index 9, then you must be able to form a subarray with index 8. So we only need [1(0),0(3),-1(8)] for the computation.
You can check whether a valid subarray can be formed between the current index and any previous index using binary search on the counter array, looking for the closest value that is less than or equal to the current counter value (if one exists).
Proof
When the right counter is greater than the left counter by r (r >= 0) for a particular x, there must be k+r occurrences of x and k non-x elements after the left counter, for some k >= 0. Thus:
The two counters are at index positions i and i+2k+r.
The subarray formed over [i, i+2k+r] has exactly k+r+1 occurrences of x.
The subarray length is 2k+r+1.
The subarray is valid since (2k+r+1) <= 2*(k+r+1) - 1.
Procedure
Let m = 1.
Loop over the array from left to right. For each index p_i:
If the number is encountered for the first time:
Create a new counter array [1(p_i)]
Create a new index record storing the current index value (p_i) and counter value (1)
Otherwise, reuse the counter array and index record of the number and perform:
1. Calculate the current counter value c_i by c_prev + 2 - (p_i - p_prev), where c_prev, p_prev are the counter value and index value in the index record
2. Perform a binary search to find the longest subarray that can be formed with the current index position and all previous index positions, i.e. find the closest c, c_closest, in the counter array where c <= c_i. If not found, jump to step 5
3. Calculate the number of x in the subarray found in step 2:
   r = c_i - c_closest
   k = (p_i - p_closest - r) / 2
   number of x = k + r + 1
4. Update counter m with this number of x if it is greater than m
5. Update the counter array by appending the current counter if its value is less than the last recorded counter value
6. Update the index record with the current index (p_i) and counter value (c_i)
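For concreteness, here is a runnable Java sketch of the whole procedure as I read it (class and variable names are my own; the record list plays the role of the counter array above, and the final answer is min(2*m - 1, n)). On the question's example [1, 2, 1, 2, 3, 2] it prints 5.

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LongestMajoritySubarray {

    // For each value x we keep the index and counter value of its previous
    // occurrence, plus a strictly decreasing list of (counter, index) records
    // (the "counter array" from the procedure above).
    public static int longestMajoritySubarray(int[] a) {
        int n = a.length;
        if (n == 0) return 0;

        Map<Integer, Integer> lastIndex = new HashMap<>();
        Map<Integer, Integer> lastCounter = new HashMap<>();
        Map<Integer, List<int[]>> records = new HashMap<>(); // value -> list of {counter, index}

        int m = 1; // max occurrences of some value inside a valid subarray
        for (int i = 0; i < n; i++) {
            int x = a[i];
            if (!lastIndex.containsKey(x)) {
                // first encounter: counter starts at 1
                List<int[]> rec = new ArrayList<>();
                rec.add(new int[]{1, i});
                records.put(x, rec);
                lastIndex.put(x, i);
                lastCounter.put(x, 1);
                continue;
            }
            // step 1: current counter value (+1 for this x, -1 for every element in between)
            int ci = lastCounter.get(x) + 2 - (i - lastIndex.get(x));
            List<int[]> rec = records.get(x);

            // step 2: binary search for the earliest recorded counter whose value <= ci
            // (the records are strictly decreasing in value and increasing in index)
            int lo = 0, hi = rec.size() - 1, found = -1;
            while (lo <= hi) {
                int mid = (lo + hi) >>> 1;
                if (rec.get(mid)[0] <= ci) { found = mid; hi = mid - 1; } else { lo = mid + 1; }
            }
            if (found >= 0) {
                // steps 3-4: number of x in the subarray between the two counters
                int r = ci - rec.get(found)[0];
                int k = (i - rec.get(found)[1] - r) / 2;
                m = Math.max(m, k + r + 1);
            }
            // step 5: record ci only if it is a new minimum for x
            if (ci < rec.get(rec.size() - 1)[0]) rec.add(new int[]{ci, i});
            // step 6: update the index record
            lastIndex.put(x, i);
            lastCounter.put(x, ci);
        }
        return Math.min(2 * m - 1, n);
    }

    public static void main(String[] args) {
        System.out.println(longestMajoritySubarray(new int[]{1, 2, 1, 2, 3, 2})); // 5
    }
}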

For completeness, here's an outline of an O(n) theory. Consider the following, where * are characters different from c:
   *  c  *  *  c  *  *  c  c  c
i: 0  1  2  3  4  5  6  7  8  9
A plot for adding 1 for c and subtracting 1 for a character other than c could look like:
sum_sequence
 0    c                       c
-1 *     *     c           c
-2          *     *     c
-3                   *
A plot for the minimum of the above sum sequence, seen for c, could look like:
min_sum
 0    c  *  *
-1 *           c  *  *
-2                      c  c  c
Clearly, for each occurrence of c, we are looking for the leftmost occurrence of c with sum_sequence lower than or equal to the current sum_sequence. A non-negative difference would mean c is a majority, and leftmost guarantees the interval is the longest up to our position. (We can extrapolate a maximal length that is bounded by characters other than c from the inner bounds of c as the former can be flexible without affecting the majority.)
Observe that from one occurrence of c to the next, its sum_sequence can decrease by an arbitrary amount. However, it can only ever increase by 1 between two consecutive occurrences of c. Rather than recording each value of min_sum for c, we can record linear segments, marked by c's occurrences. A visual example:
[start_min
   \
    \
     \
      \
       end_min, start_min
                 \
                  \
                   end_min]
We iterate over occurrences of c and maintain a pointer to the optimal segment of min_sum. Clearly we can derive the next sum_sequence value for c from the previous one since it is exactly diminished by the number of characters in between.
An increase in sum_sequence for c corresponds with a shift of 1 back or no change in the pointer to the optimal min_sum segment. If there is no change in the pointer, we hash the current sum_sequence value as a key to the current pointer value. There can be O(num_occurrences_of_c) such hash keys.
With an arbitrary decrease in c's sum_sequence value, either (1) sum_sequence is lower than the lowest min_sum segment recorded so we add a new, lower segment and update the pointer, or (2) we've seen this exact sum_sequence value before (since all increases are by 1 only) and can use our hash to retrieve the optimal min_sum segment in O(1).
As Matt Timmermans pointed out in the question comments, if we were just to continually update the pointer to the optimal min_sum by iterating over the list, we would still only perform O(1) amortized-time iterations per character occurrence. We see that for each increasing segment of sum_sequence, we can update the pointer in O(1). If we used binary search only for the descents, we would add at most (log k) iterations for every k occurrences (this assumes we jump down all the way), which keeps our overall time at O(n).
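As a rough illustration of that pointer idea, here is a Java sketch that keeps the same per-value bookkeeping as the O(n log n) sketch after the Procedure section, but replaces the binary search with an amortized pointer walk (structure and names are my own; the HashMap lookups are expected O(1), not worst-case):

import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class LongestMajoritySubarrayLinear {

    public static int longestMajoritySubarray(int[] a) {
        int n = a.length;
        if (n == 0) return 0;

        Map<Integer, Integer> lastIndex = new HashMap<>();
        Map<Integer, Integer> lastSum = new HashMap<>();
        Map<Integer, List<int[]>> records = new HashMap<>(); // value -> {sum, index}, sums strictly decreasing
        Map<Integer, Integer> ptr = new HashMap<>();          // value -> earliest record with sum <= current sum

        int m = 1;
        for (int i = 0; i < n; i++) {
            int x = a[i];
            if (!lastIndex.containsKey(x)) {
                List<int[]> rec = new ArrayList<>();
                rec.add(new int[]{1, i});
                records.put(x, rec);
                ptr.put(x, 0);
                lastIndex.put(x, i);
                lastSum.put(x, 1);
                continue;
            }
            int s = lastSum.get(x) + 2 - (i - lastIndex.get(x)); // sum_sequence value at this occurrence
            List<int[]> rec = records.get(x);
            int p = ptr.get(x);
            if (s < rec.get(rec.size() - 1)[0]) {
                rec.add(new int[]{s, i});      // new overall minimum: start a new segment
                p = rec.size() - 1;
            } else {
                // s can rise by at most 1 between consecutive occurrences of x,
                // so the pointer steps back at most once here ...
                while (p > 0 && rec.get(p - 1)[0] <= s) p--;
                // ... and forward steps are paid for by earlier backward steps (amortized O(1))
                while (rec.get(p)[0] > s) p++;
            }
            int r = s - rec.get(p)[0];
            int k = (i - rec.get(p)[1] - r) / 2;
            m = Math.max(m, k + r + 1);
            ptr.put(x, p);
            lastIndex.put(x, i);
            lastSum.put(x, s);
        }
        return Math.min(2 * m - 1, n);
    }

    public static void main(String[] args) {
        System.out.println(longestMajoritySubarray(new int[]{1, 2, 1, 2, 3, 2})); // 5
    }
}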

Algorithm:
Essentially, what Boyer-Moore does is look for a suffix suf of nums where suf[0] is the majority element in that suffix. To do this, we maintain a count, which is incremented whenever we see an instance of our current candidate for majority element and decremented whenever we see anything else. Whenever count equals 0, we effectively forget about everything in nums up to the current index and consider the current number as the candidate for majority element. It is not immediately obvious why we can get away with forgetting prefixes of nums - consider the following examples (pipes are inserted to separate runs of nonzero count).
[7, 7, 5, 7, 5, 1 | 5, 7 | 5, 5, 7, 7 | 7, 7, 7, 7]
Here, the 7 at index 0 is selected to be the first candidate for majority element. count will eventually reach 0 after index 5 is processed, so the 5 at index 6 will be the next candidate. In this case, 7 is the true majority element, so by disregarding this prefix, we are ignoring an equal number of majority and minority elements - therefore, 7 will still be the majority element in the suffix formed by throwing away the first prefix.
[7, 7, 5, 7, 5, 1 | 5, 7 | 5, 5, 7, 7 | 5, 5, 5, 5]
Now, the majority element is 5 (we changed the last run of the array from 7s to 5s), but our first candidate is still 7. In this case, our candidate is not the true majority element, but we still cannot discard more majority elements than minority elements (this would imply that count could reach -1 before we reassign candidate, which is obviously false).
Therefore, given that it is impossible (in both cases) to discard more majority elements than minority elements, we are safe in discarding the prefix and attempting to recursively solve the majority element problem for the suffix. Eventually, a suffix will be found for which count does not hit 0, and the majority element of that suffix will necessarily be the same as the majority element of the overall array.
Here's a Java solution:
Time complexity: O(n)
Space complexity: O(1)
public int majorityElement(int[] nums) {
    int count = 0;
    Integer candidate = null;
    for (int num : nums) {
        if (count == 0) {
            candidate = num;
        }
        count += (num == candidate) ? 1 : -1;
    }
    return candidate;
}

Related

Does Binary Search guarantees that one of the THREE variables used in the algorithm will hold the right position of the key?

The title may not clear what I want to ask because it is not complete (SO restricted me to 150 words for the title).
My question is, does binary search guarantee that one of the THREE variables used in the algorithm will hold the right position of the key, EVEN IF IT WAS NOT FOUND IN THE GIVEN SORTED SEQUENCE?
I have an example to clarify the question.
Consider a sorted array A with length 5;
int a[] = {2, 8, 9, 11, 14};
Clearly, the array doesn't contain 7. By looking at the sequence, we can say that the element 7 would have been given the index 1 if in the array.
Performing a binary search on the above sequence with key 7 will return -1 (that depends on the implementation, of course). But will any of the three variables, say p (the loop breaks if it becomes greater than r), q (which stores (p + r) / 2) and r (the loop breaks if it becomes less than p), hold the correct position (which is 1) for the value 7 in the above sequence when the loop breaks?
Or, can some mathematical computation help us find the right position of 7?
As you can see, binary search returns -1 when the high and low indexes become the same. These indexes end up at the position where the number was assumed to be (here, index 1).
Yes! Check a[low] and a[high] first, otherwise perform the binary search. The mid variable will hold the exact location where the element should be placed, either before or after a[mid], with
low = 0 and high = array_length - 1.
Check this code for reference: http://ideone.com/OYv3ih
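To make the claim concrete, here is a small Java sketch (my own, not the linked ideone code): when the search loop exits without finding the key, low holds exactly the index where the key would be inserted.

public class BinarySearchInsertionPoint {
    // Classic binary search; when the key is absent, the loop exits with
    // low == high + 1, and low is exactly the index where the key belongs.
    static int search(int[] a, int key) {
        int low = 0, high = a.length - 1;
        while (low <= high) {
            int mid = (low + high) >>> 1;
            if (a[mid] < key)       low = mid + 1;
            else if (a[mid] > key)  high = mid - 1;
            else                    return mid;      // found
        }
        return -(low + 1);  // encodes the insertion point, as java.util.Arrays.binarySearch does
    }

    public static void main(String[] args) {
        int[] a = {2, 8, 9, 11, 14};
        int r = search(a, 7);
        System.out.println(r);        // -2
        System.out.println(-r - 1);   // 1, the position 7 would occupy
    }
}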

Find a unique integer in an array

I am looking for an algorithm to solve the following problem: We are given an integer array of size n which contains k (0 < k < n) many elements exactly once. Every other integer occurs an even number of times in the array. The output should be any of the k unique numbers. k is a fixed number and not part of the input.
An example would be the input [1, 2, 2, 4, 4, 2, 2, 3] with both 1 and 3 being a correct output.
Most importantly, the algorithm should run in O(n) time and require only O(1) additional space.
edit: There has been some confusion regarding whether there is only one unique integer or multiple. I apologize for this. The correct problem is that there is an arbitrary but fixed amount. I have updated the original question above.
"Dante." gave a good answer for the case that there are at most two such numbers. This link also provides a solution for three. "David Eisenstat" commented that it is also possible to do for any fixed k. I would be grateful for a solution.
There is a standard algorithm to solve such problems using XOR operator:
Time Complexity = O(n)
Space Complexity = O(1)
Suppose your input array contains only one element that occurs an odd number of times while the rest occur an even number of times. We take advantage of the following fact:
Any expression with an even number of 0's and an even number of 1's, in any order, will always equal 0 when XOR is applied.
That is,
0^1^... = 0 as long as the number of 0's is even and the number of 1's is even,
and the 0's and 1's can occur in any order.
All numbers that occur an even number of times will have their corresponding bits form an even number of 1's and 0's, and only the number which occurs once will have its bits left over when we take the XOR of all elements of the array, because
0 (from numbers occurring an even number of times) ^ 1 (from the number occurring once) = 1
0 (from numbers occurring an even number of times) ^ 0 (from the number occurring once) = 0
As you can see, only the bits of the number occurring once are preserved.
This means that when you are given such an array and take the XOR of all the elements, the result is the number which occurs only once.
So the algorithm for an array of length n is:
result = array[0]^array[1]^...^array[n-1]
Different Scenario
As the OP mentioned, the input can also be an array which has two numbers occurring only once while the rest occur an even number of times.
This is solved using the same logic as above, but with a small difference.
Idea of algorithm:
If you take the XOR of all the elements, then all the bits of elements occurring an even number of times will cancel to 0, which means:
The result will have a 1 only at those bit positions where the two numbers occurring only once differ.
We will use the above idea.
Now we focus on one bit of the resulting XOR which is 1 (any bit which is 1) and make the rest 0. The result is a number which allows us to differentiate between the two required numbers.
Because the bit is 1, the two numbers differ at this position: one has a 0 there and the other a 1. This means that ANDing one of them with it gives 0 and the other does not.
Since it is very easy to isolate the rightmost set bit, we take it from the result xor as
A = result & ~(result-1)
Now traverse the array once; if array[i] & A is 0, fold the element into variable number_1 as
number_1 = number_1^array[i]
otherwise
number_2 = number_2^array[i]
Because the remaining numbers occur an even number of times, their bits automatically cancel out.
So the algorithm is
1. Take the XOR of all elements, call it xor.
2. Isolate the rightmost set bit of xor and store it in B.
3. Do the following:
number_1 = 0, number_2 = 0;
for (i = 0; i < n; i++)
{
    if (array[i] & B)
        number_1 = number_1 ^ array[i];
    else
        number_2 = number_2 ^ array[i];
}
number_1 and number_2 are the required numbers.
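For reference, here is a self-contained Java version of the two-numbers case described above (names are my own):

public class TwoOddOccurrences {
    // Returns the two values that occur an odd number of times;
    // all other values are assumed to occur an even number of times.
    static int[] findTwo(int[] array) {
        int xorAll = 0;
        for (int v : array) xorAll ^= v;           // = x ^ y, the two odd-count values
        int bit = xorAll & ~(xorAll - 1);          // isolate the rightmost set bit (x and y differ here)
        int a = 0, b = 0;
        for (int v : array) {
            if ((v & bit) != 0) a ^= v;            // group with that bit set
            else b ^= v;                           // group with that bit clear
        }
        return new int[]{a, b};
    }

    public static void main(String[] args) {
        int[] r = findTwo(new int[]{1, 2, 2, 4, 4, 2, 2, 3});
        System.out.println(r[0] + " " + r[1]);     // 3 and 1 (in some order)
    }
}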
Here's a Las Vegas algorithm that, given k, the exact number of elements that occur an odd number of times, reports all of them in expected time O(n k) (read: linear-time when k is O(1)) and space O(1) words, assuming that "give me a uniform random word" and "give me the number of 1 bits set in this word (popcount)" are constant-time operations. I'm pretty sure that I'm not the first person to come up with this algorithm (and I'm not even sure that I'm remembering all of the refinements), but I've reached the limits of my patience trying to find it.
The central technique is called random restrictions. Essentially what we do is to filter the input randomly by value, in the hope that we retain exactly one odd-count element. We apply the classic XOR algorithm to the filtered array and check the result; if it succeeded, then we pretend to add it to the array, to make it even-count. Repeat until all k elements are found.
The filtration process goes like this. Treat each input word x as a binary vector of length w (doesn't matter what w is). Compute a random binary matrix A of size w by ceil(1 + lg k) and a random binary vector b of length ceil(1 + lg k). We filter the input by retaining those x such that Ax = b, where the left-hand side is a matrix multiplication mod 2. In implementation, A is represented as ceil(1 + lg k) vectors a1, a2, .... We compute the bits of Ax as popcount(a1 ^ x), popcount(a2 ^ x), .... (This is convenient because we can short-circuit the comparison with b, which shaves a factor lg k from the running time.)
The analysis is to show that, in a given pass, we manage with constant probability to single out one of the odd-count elements. First note that, for some fixed x, the probability that Ax = b is 2^(-ceil(1 + lg k)) = Θ(1/k). Given that Ax = b, for all y ≠ x, the probability that Ay = b is less than 2^(-ceil(1 + lg k)). Thus, the expected number of elements that accompany x is less than 1/2, so with probability more than 1/2, x is unique in the filtered input. Sum over all k odd-count elements (these events are disjoint), and the probability is Θ(1).
Here's a deterministic linear-time algorithm for k = 3. Let the odd-count elements be a, b, c. Accumulate the XOR of the array, which is s = a ^ b ^ c. For each bit i, observe that, if a[i] == b[i] == c[i], then s[i] == a[i] == b[i] == c[i]. Make another pass through the array, accumulate the XOR of the lowest bit set in s ^ x. The even-count elements contribute nothing again. Two of the odd-count elements contribute the same bit and cancel each other out. Thus, the lowest bit set in the XOR is where exactly one of the odd-count elements differs from s. We can use the restriction method above to find it, then the k = 2 method to find the others.
The question title says "the unique integer", but the question body says there can be more than one unique element.
If there is in fact only one non-duplicate: XOR all the elements together. The duplicates all cancel, because they come in pairs (or higher multiples of 2), so the result is the unique integer.
See Dante's answer for an extension of this idea that can handle two unique elements. It can't be generalized to more than that.
Perhaps for k unique elements, we could use k accumulators to track sum(a[i]^k), i.e. a[i], a[i]^2, etc. This probably only works for Faster algorithm to find unique element between two arrays?, not this case where the duplicates are all in one array. IDK if an xor of squares, cubes, etc. would be any use for resolving things.
Track the counts for each element and only return the elements with a count of 1. This can be done with a hash map. The example below tracks the result using a hash set while it's still building the counts map. It's still O(n), just less efficient, but I think it's slightly more instructive.
Javascript with jsfiddle http://jsfiddle.net/nmckchsa/
function findUnique(arr) {
    var uniq = new Map();
    var result = new Set();
    // iterate through array
    for (var i = 0; i < arr.length; i++) {
        var v = arr[i];
        // add value to map that contains counts
        if (uniq.has(v)) {
            uniq.set(v, uniq.get(v) + 1);
            // count is greater than 1, remove from set
            result.delete(v);
        } else {
            uniq.set(v, 1);
            // add a possibly uniq value to the set
            result.add(v);
        }
    }
    // set to array O(n)
    var a = [], x = 0;
    result.forEach(function(v) { a[x++] = v; });
    return a;
}
alert(findUnique([1,2,3,0,1,2,3,1,2,3,5,4,4]));
EDIT: Since the non-unique numbers appear an even number of times, #PeterCordes suggested a more elegant set toggle.
Here's how that would look.
function findUnique(arr) {
    var result = new Set();
    // iterate through array
    for (var i = 0; i < arr.length; i++) {
        var v = arr[i];
        if (result.has(v)) { // even occurrences so far
            result.delete(v);
        } else {             // odd occurrences so far
            result.add(v);
        }
    }
    // set to array O(n)
    var a = [], x = 0;
    result.forEach(function(v) { a[x++] = v; });
    return a;
}
JSFiddle http://jsfiddle.net/hepsyqyw/
Assuming you have an input array: [2,3,4,2,4]
Output: 3
In Ruby, you can do something as simple as this:
[2,3,4,2,4].inject(0) {|xor, v| xor ^ v}
Create an array counts that has INT_MAX slots, with each element initialized to zero.
For each element in the input list, increment counts[element] by one. (Edit: actually, you will need to do counts[element] = (counts[element]+1)%2, or else you might overflow the value for really ridiculously large values of N. It's acceptable to do this kind of modulus counting because all duplicate items appear an even number of times.)
Iterate through counts until you find a slot that contains "1". Return the index of that slot.
Step 2 is O(N) time. Steps 1 and 3 take up a lot of memory and a lot of time, but neither one is proportional to the size of the input list, so they're still technically O(1).
(note: this assumes that integers have a minimum and maximum value, as is the case for many programming languages.)

Minimum value of numbers in char array

I recently ran into this problem in an interview, and I was curious what the best way to solve it would be. The question: given a char array containing the ASCII characters '0' to '9', make one swap such that the resulting array forms the lowest possible value. The input array will not have leading 0s, and neither should the resultant array.
So here is an example: char a[] = {'1','0', '9','7','6'}
The solution: char b[] = { '1','0', '6', '7', '9'}
Another example: char a[] = {'9','0', '7','6','1'}
The solution: char b[] = {'1','0', '7','6','9'}
I am looking for the best solution in terms of performance. Since only one swap is allowed I assumed that sorting is not allowed. I did not clarify that though. So we are looking for the lowest possible value that can be obtained through just using one swap. It would help if you could provide the complexity of the solution as well.
Algorithm:
Note that since there can't be a leading 0, 0's should be catered for separately.
Go from the right, keeping track of the minimum non-zero number. Also record the first 0.
Whenever you find a number larger than the recorded non-zero number, record those 2 as the best possible swap so far.
Once you've found a zero, record any non-zero non-leading character from here as the best possible swap for the zero.
Note that you're not doing any comparisons between the quality of either of the swaps above; we simply replace the current best with the new one, as it's always better to swap at a more-left position.
When done, compare the target positions of the best swap for the zero and the best swap for the non-zero and pick the left-most position, or the zero if they're the same.
If no possible swap was found, the array is already the minimum permutation of the given numbers, thus don't do anything. Or, if we have to, swap the two right-most characters.
Running time:
O(n).
Example:
Input: 10976
Processing    6     7     9     0     1
Minimum       6     6     6     6     1
Best swap     -     6+7   6+9   6+9   6+9
Zero?         No    No    No    Yes   Yes
Best 0 swap   -     -     -     -     -
So the best swap is 6/9, giving us 10679
Input: 3601
Processing    1     0     6     3
Minimum       1     1     1     3
Best swap     -     -     1+6   1+3
Zero?         No    Yes   Yes   Yes
Best 0 swap   -     -     0+6   0+6
Here possible swaps are 1/3 and 0/6.
For the 1/3 swap, the target position is 0 (0-indexed).
For the 0/6 swap, the target position is 1.
So we pick the 1/3 swap giving us 1603.
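Here is a Java sketch of the right-to-left pass described above (my own code and names; it returns the array unchanged when no improving swap exists, rather than forcing a swap of the two right-most characters):

import java.util.Arrays;

public class MinValueOneSwap {
    // Assumes a non-empty digit array with no leading '0'.
    public static char[] minimizeWithOneSwap(char[] a) {
        int n = a.length;
        int minIdx = -1;           // rightmost occurrence of the smallest non-zero digit seen so far
        int zeroIdx = -1;          // first zero met while scanning from the right (the rightmost '0')
        int[] bestSwap = null;     // {target, source} for the best swap with the non-zero minimum
        int[] bestZeroSwap = null; // {target, source} for the best swap with the zero

        for (int i = n - 1; i >= 0; i--) {
            char c = a[i];
            if (c == '0') {
                if (zeroIdx == -1) zeroIdx = i;                      // record the first zero
                continue;
            }
            if (minIdx != -1 && c > a[minIdx]) bestSwap = new int[]{i, minIdx};
            if (zeroIdx != -1 && i > 0) bestZeroSwap = new int[]{i, zeroIdx}; // non-zero, non-leading
            if (minIdx == -1 || c < a[minIdx]) minIdx = i;           // keep the rightmost occurrence on ties
        }

        char[] b = Arrays.copyOf(a, n);
        if (bestSwap == null && bestZeroSwap == null) return b;      // already the minimum permutation
        int[] chosen;
        if (bestSwap == null) chosen = bestZeroSwap;
        else if (bestZeroSwap == null) chosen = bestSwap;
        else if (bestZeroSwap[0] <= bestSwap[0]) chosen = bestZeroSwap; // prefer the zero on equal targets
        else chosen = bestSwap;

        char tmp = b[chosen[0]]; b[chosen[0]] = b[chosen[1]]; b[chosen[1]] = tmp;
        return b;
    }

    public static void main(String[] args) {
        System.out.println(minimizeWithOneSwap("10976".toCharArray())); // 10679
        System.out.println(minimizeWithOneSwap("3601".toCharArray()));  // 1603
    }
}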
Depends on what you mean by best.
Take out the lowest number except '0' as the first item in b, then sort the rest into ascending order.
Start at the leftmost digit.
While the current digit is valued the lowest (non-zero for the first digits equal to the number of zeroes in the set) of the remaining set of numbers, advance to the next number.
Swap the current value with the lowest remaining value.
You can keep a sorted copy of the array to help you in your decision making process. You'd need it for knowing how many there are of your current lowest number(s). You can make it even more efficient if you store their indices as well.
It's not necessary to have any additional storage, but it would likely make things faster. In a single pass of the first array, you can get the number of 0s as well as the index to the next lowest number, but in an array like 1, 0, 6, 9, 7, you would then have to go through the array 4 different times.
EDIT - a slightly more fleshed-out algorithm
Copy the array into a separate one, called c, and sort c. (You'll use this to make your decision faster, though you could simply repeatedly analyze the array.)
if c has zeros, find the value of its first non-zero value, called x
if a[0] != x, swap a[0] with x.
set index to 1
while a[index] == c[index], ++index
swap a[index] with c[index]
If you do it this way, it costs you another array, but is done with one sort and one pass through the array.
If you don't do it that way, you'll have to pass through the remainder of the array each time to find the minimums. I'm not great at complexity, but I believe that's n log n, since you'll be starting from higher indices each iteration.
Without using the array, you'd have to do something like
Find the lowest non-zero value
if a[0] isn't this value, swap with this value
index = 1
find lowest value starting at a[index]
if they're not equal, swap the values, done. Otherwise, increment index
That's still n log n. I think sorting can make it more efficient.
I think that you clearly have to go through the array at least once. You need to do this to find the smallest value.
You also need to find the value which will satisfy the conditions. The poster says that the input array will not have '0' as the leading elements.
We loop over the array one position at a time. We are looking for another position which has the minimum value (non zero for the first spot, anything for the other spots) and is smaller than any other seen.
for (pos = 0; pos < array_size - 1; pos++) {
    low_index = pos;
    min_value = (pos == 0) ? '1' : '0';
    for (i = pos + 1; i < array_size; i++) {
        if ((min_value <= array[i]) && (array[i] < array[low_index])) {
            low_index = i;
        }
    }
    if (low_index != pos) {
        low_value = array[pos];
        array[pos] = array[low_index];
        array[low_index] = low_value;
        break;
    }
}

Algorithm Olympiad : conditional minimum in array

I have an array A = [a1, a2, a3, a4, a5...] and I want to find two elements of the array, say A[i] and A[j] such that i is less than j and A[j]-A[i] is minimal and positive.
The runtime has to be O(nlog(n)).
Would this code do the job:
First sort the array and keep track of the original index of each element (i.e., the index of the element in the ORIGINAL (unsorted) array).
Go through the sorted array and calculate the differences between any two successive elements that verify the initial condition that the Original Index of the bigger element is bigger than the original index of the smaller element.
The answer would be the minimum value of all these differences.
Here is how this would work on an example:
A = [0, -5, 10, 1]
In this case the result should be 1 coming from the difference between A[3] and A[0].
sort A : newA=[-5,0,1,10]
since OriginalIndex(-5)>OriginalIndex(0), do not compute the difference
since OriginalIndex(1)>OriginalIndex(0), we compute the difference = 1
since OriginalIndex(10)>OriginalIndex(1), we compute the difference = 9
The result is the minimal difference, which is 1.
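For reference, here is a direct Java sketch of the procedure described above (it simply implements the steps as stated; see the answers below for a discussion of whether those steps are sufficient):

import java.util.Arrays;
import java.util.Comparator;

public class MinPositiveDifference {
    // Sort the indexes by value, then scan successive pairs whose original
    // order satisfies i < j, keeping the smallest difference found.
    static int minDifference(int[] a) {
        Integer[] order = new Integer[a.length];
        for (int i = 0; i < a.length; i++) order[i] = i;
        Arrays.sort(order, Comparator.comparingInt(idx -> a[idx]));  // indexes sorted by value

        int best = Integer.MAX_VALUE;
        for (int t = 1; t < order.length; t++) {
            int left = order[t - 1], right = order[t];               // successive elements in sorted order
            if (left < right) {                                      // original-index condition
                best = Math.min(best, a[right] - a[left]);
            }
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(minDifference(new int[]{0, -5, 10, 1})); // 1, as in the example above
    }
}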
Contrary to the claim made in the other post, there wouldn't be any problem regarding the runtime of your algorithm. Using heapsort, for example, the array can be sorted in O(n log n), the upper bound given in your question. An additional O(n) pass along the sorted array doesn't harm this, so you would still stay within a runtime of O(n log n).
Unfortunately, your answer still doesn't seem to be correct, as it doesn't give the correct result.
Taking a closer look at the example given, you should be able to verify that yourself. The array given in your example was A = [0, -5, 10, 1].
Counting from 0, choosing indices i=2 and j=3 meets the given requirement i < j, as 2 < 3. The difference A[j] - A[i], which with the chosen values is A[3] - A[2], comes to 1 - 10 = -9, which is surely less than the minimal value of 1 calculated in the example application of your algorithm.
Since you're minimising the distance between elements, they must be next to each other in the sorted list (if they weren't, then the element in between would be a shorter distance from one of them -> contradiction). Your algorithm runs in O(n log n) as specified, so it looks fine to me.

Minimum value of maximum values in sub-segments ... in O(n) complexity

I interviewed with Amazon a few days ago. I could not answer one of the questions they asked me to their satisfaction. I have tried to work out the answer after the interview, but I have not been successful so far. Here is the question:
You have an array of integers of size n. You are given parameter k where k < n. For each segment of consecutive elements of size k in the array you need to calculate the maximum value. You only need to return the minimum value of these maximum values.
For instance given 1 2 3 1 1 2 1 1 1 and k = 3 the answer is 1.
The segments would be 1 2 3, 2 3 1, 3 1 1, 1 1 2, 1 2 1, 2 1 1, 1 1 1.
The maximum values in each segment are 3, 3, 3, 2, 2, 2, 1.
The minimum of these values is 1, thus the answer is 1.
The best answer I came up with is of complexity O(n log k). What I do is create a binary search tree with the first k elements, get the maximum value in the tree and save it in a variable minOfMax, then loop one element at a time over the remaining elements in the array: remove the first element of the previous segment from the binary search tree, insert the last element of the new segment into the tree, get the maximum element in the tree and compare it with minOfMax, leaving the smaller of the two in minOfMax.
The ideal answer needs to be of complexity O(n).
Thank you.
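For reference, here is a Java sketch of the question's O(n log k) idea, using a TreeMap as a multiset in place of a hand-rolled binary search tree (my own code, assuming 0 < k < n):

import java.util.TreeMap;

public class MinOfWindowMaximaNLogK {
    static int minOfWindowMaxima(int[] a, int k) {
        TreeMap<Integer, Integer> window = new TreeMap<>();        // value -> count of value in the window
        for (int i = 0; i < k; i++) window.merge(a[i], 1, Integer::sum);
        int minOfMax = window.lastKey();                           // max of the first window
        for (int i = k; i < a.length; i++) {
            int leaving = a[i - k];                                // slide the window: drop a[i-k], add a[i]
            if (window.merge(leaving, -1, Integer::sum) == 0) window.remove(leaving);
            window.merge(a[i], 1, Integer::sum);
            minOfMax = Math.min(minOfMax, window.lastKey());
        }
        return minOfMax;
    }

    public static void main(String[] args) {
        System.out.println(minOfWindowMaxima(new int[]{1, 2, 3, 1, 1, 2, 1, 1, 1}, 3)); // 1
    }
}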
There is a very clever way to do this that's related to this earlier question. The idea is that it's possible to build a queue data structure that supports enqueue, dequeue, and find-max in amortized O(1) time (there are many ways to do this; two are explained in the original question). Once you have this data structure, begin by adding the first k elements from the array into the queue in O(k) time. Since the queue supports O(1) find-max, you can find the maximum of these k elements in O(1) time. Then, continuously dequeue an element from the queue and enqueue (in O(1) time) the next array element. You can then query in O(1) what the maximum of each of these k-element subarrays are. If you track the minimum of these values that you see over the course of the array, then you have an O(n)-time, O(k)-space algorithm for finding the minimum maximum of the k-element subarrays.
Hope this helps!
#templatetypedef's answer works, but I think I have a more direct approach.
Start by computing the max for the following (closed) intervals:
[k-1, k-1]
[k-2, k-1]
[k-3, k-1]
...
[0, k-1]
Note that each of these can be computed in constant time from the preceding one.
Next, compute the max for these intervals:
[k, k]
[k, k+1]
[k, k+2]
...
[k, 2k-1]
Now these intervals:
[2k-1, 2k-1]
[2k-2, 2k-1]
[2k-3, 2k-1]
...
[k+1, 2k-1]
Next you do the intervals from 2k to 3k-1 ("forwards intervals"), then from 3k-1 down to 2k+1 ("backwards intervals"). And so on until you reach the end of the array.
Put all of these into a big table. Note that each entry in this table took constant time to compute. Observe that there are at most 2*n intervals in the table (because each element appears once on the right side of a "forwards interval" and once on the left side of a "backwards interval").
Now, if [a,b] is any interval of width k, it must contain exactly one of 0, k, 2k, ...
Say it contains m*k.
Observe that the intervals [a, m*k-1] and [m*k, b] are both somewhere in our table. So we can simply look up the max for each, and the max of those two values is the max of the interval [a,b].
So for any interval of width k, we can use our table to get its maximum in constant time. We can generate the table in O(n) time. Result follows.
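Here is a Java sketch of that table idea (my own names): every window of width k crosses exactly one multiple of k, so its max is the max of a "backwards" run ending at that boundary and a "forwards" run starting at it.

public class MinOfWindowMaxima {
    static int minOfWindowMaxima(int[] a, int k) {
        int n = a.length;
        int[] head = new int[n]; // head[i] = max of a[i .. end of i's block] ("backwards intervals")
        int[] tail = new int[n]; // tail[i] = max of a[start of i's block .. i] ("forwards intervals")
        for (int i = 0; i < n; i++) {
            tail[i] = (i % k == 0) ? a[i] : Math.max(tail[i - 1], a[i]);
        }
        for (int i = n - 1; i >= 0; i--) {
            head[i] = (i % k == k - 1 || i == n - 1) ? a[i] : Math.max(head[i + 1], a[i]);
        }
        int best = Integer.MAX_VALUE;
        for (int start = 0; start + k <= n; start++) {
            int end = start + k - 1;
            int windowMax = Math.max(head[start], tail[end]); // the two table lookups
            best = Math.min(best, windowMax);
        }
        return best;
    }

    public static void main(String[] args) {
        System.out.println(minOfWindowMaxima(new int[]{1, 2, 3, 1, 1, 2, 1, 1, 1}, 3)); // 1
    }
}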
I implemented (and commented) templatetypedef's answer in C#.
n is array length, k is window size.
public static void printKMax(int[] arr, int n, int k)
{
    Deque<int> qi = new Deque<int>();
    int i;
    for (i = 0; i < k; i++) // The first window of the array
    {
        while ((qi.Count > 0) && (arr[i] >= arr[qi.PeekBack()]))
        {
            qi.PopBack();
        }
        qi.PushBack(i);
    }
    for (i = k; i < n; ++i)
    {
        Console.WriteLine(arr[qi.PeekFront()]); // the first item is the largest element in previous window. The second item is its index.
        while (qi.Count > 0 && qi.PeekFront() <= i - k)
        {
            qi.PopFront(); // When it's out of its window k
        }
        while (qi.Count > 0 && arr[i] >= arr[qi.PeekBack()])
        {
            qi.PopBack();
        }
        qi.PushBack(i);
    }
    Console.WriteLine(arr[qi.PeekFront()]);
}
