Minimum Number of Operations to make an array sorted - arrays

I have been trying this problem on SPOJ but am not able to come up with the correct approach.
What is the correct algorithm to solve the problem?

You should find the longest consecutive increasing subsequence, which can be done in O(n log n) (by sorting the array); after that, the number of changes needed is N minus the length of that subsequence. Note that by consecutive I mean consecutive in their order in the sorted array.
e.g.:
1 7 6 2 5 4 3 => 1-2-3 is the longest consecutive increasing subsequence,
number of moves needed is 4.
1 6 4 3 5 2 7 => 1-2, 4-5 and 6-7 are the longest consecutive increasing
subsequences; note that 1-4-5-7 is the longest increasing subsequence, but
the number of moves needed is 5, not 3.
Why this works:
An optimal solution leaves some items untouched; call the largest such subsequence X. Since the items of X never move relative to each other, they must already be in increasing order. Moreover, because an operation can only move an item to the front or the back of the array, you can never place an item between two items of X (we assumed X stays untouched throughout), so there can be no gap between the values of X: they must be consecutive in the sorted order.
So the number of changes needed can't be smaller than N minus the length of X, and it is also not hard to finish the job with exactly N minus the length of X operations.
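Here is a minimal sketch of this idea in Python, assuming the input is a non-empty permutation of 1..n (record each value's position, then find the longest chain of consecutive values appearing at increasing positions):

def min_moves_to_sort(arr):
    """Minimum number of moves (each move sends one element to the front
    or the back of the array) needed to sort a permutation of 1..n."""
    pos = {value: i for i, value in enumerate(arr)}
    best = run = 1
    for v in range(2, len(arr) + 1):
        # The chain of consecutive values extends only if v appears
        # after v - 1 in the array.
        run = run + 1 if pos[v] > pos[v - 1] else 1
        best = max(best, run)
    return len(arr) - best

For the examples above, min_moves_to_sort([1, 7, 6, 2, 5, 4, 3]) returns 4 and min_moves_to_sort([1, 6, 4, 3, 5, 2, 7]) returns 5.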

Related

find minimum set of increasing subsequences in permutation

I want to find one minimum set of increasing subsequences (as List<int[]>) covering a permutation of 1..n. So I don't just need the number, but the concrete subsequences.
Currently I search for the longest increasing subsequence, delete its elements and search again, until no element is left. But this isn't optimal.
Example:
1,4,3,2,5 -> 3 subsequences: (1,2,5) (4) (3)
1,4,5,2,3 -> 2 subsequences: (1,4,5) (2,3)
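A minimal sketch of one standard approach (the names are mine): a patience-sorting-style greedy that, by Dilworth's theorem, uses exactly as many subsequences as the longest decreasing subsequence, which is the minimum possible. Each element is appended to the subsequence whose last value is the largest one still smaller than it:

import bisect

def min_increasing_cover(perm):
    """Cover a permutation with the fewest increasing subsequences."""
    tails = []          # tails[i] = last value of subsequences[i], ascending
    subsequences = []
    for x in perm:
        i = bisect.bisect_left(tails, x)   # first tail >= x
        if i == 0:      # no tail is smaller than x: start a new subsequence
            tails.insert(0, x)
            subsequences.insert(0, [x])
        else:           # extend the subsequence with the largest tail < x
            tails[i - 1] = x               # x <= tails[i], so order is kept
            subsequences[i - 1].append(x)
    return subsequences

For [1,4,3,2,5] this returns three subsequences and for [1,4,5,2,3] it returns two, matching the counts above (the concrete decomposition may differ, e.g. (2) (3) (1,4,5) instead of (1,2,5) (4) (3)). The list inserts are O(n), so this sketch favors clarity over speed.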

For each element of an unordered array output the number of greater elements

I guess the question is quite straightforward, so let me explain with an example:
Input Array = {3 1 8 2 5 3 6 7};
Output Required = {4,7,0,6,3,4,2,1};
Greater than 3 are 4 elements in array (5,6,7,8)
Greater than 1 are 7 elements in array (2,3,3,5,6,7,8)
Greater than 8 are 0 elements in array ()
Greater than 2 are 6 elements in array (3,3,5,6,7,8)
Greater than 5 are 3 elements in array (6,7,8)
Greater than 3 are 4 elements in array (5,6,7,8)
Greater than 6 are 2 elements in array (7,8)
Greater than 7 are 1 elements in array (8)
So one approach would be to just run two nested for loops and be done with it:
time complexity O(N^2), space complexity O(1).
How can this be further optimized?
If you create a copy of the list and sort it, then (assuming unique elements), the 'greater element count' for a value is just
(total number of elements - 1 - position of value in sorted_list),
where we subtract 1 since indices start at 0 and we only want strictly greater elements.
When elements can be repeated, we should now find the unique elements of the original list and sort them, but also keep track of how many times each element appeared. Then, we need the 'weighted position' of each value in the sorted list, which is the sum of counts of all values at or before that index.
After creating a mapping from each unique value to the count of strictly greater elements, iterate over the original list, replacing each element with the count it's been mapped to.
Since 'greater element counts' and the fully sorted list can be converted into each other in linear time, any comparison-based method needs Ω(n log n) in the worst case, so finding the counts in O(n log n) time is asymptotically optimal.
Here's a short Python implementation of that idea.
import collections
from typing import List

def greater_element_counts(arr: List[int]) -> List[int]:
    """Return a list with the number of strictly larger elements
    in arr for each position in arr."""
    element_to_counts = collections.Counter(arr)
    unique_sorted_elements = sorted(element_to_counts.keys())
    # Count of elements strictly greater than the current unique value:
    # start with all of them and subtract each multiplicity going up.
    greater_element_count = len(arr)
    answer_by_element = {}
    for unique_element in unique_sorted_elements:
        greater_element_count -= element_to_counts[unique_element]
        answer_by_element[unique_element] = greater_element_count
    return [answer_by_element[element] for element in arr]
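
For the input above, this reproduces the required output:

print(greater_element_counts([3, 1, 8, 2, 5, 3, 6, 7]))
# [4, 7, 0, 6, 3, 4, 2, 1]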

Algorithm - Find maximum sum of two numbers in unsorted array that have a minimum distance

I'm trying to find an algorithm that finds the maximum sum of two numbers that have a minimum distance D between them.
For example, let's say we have this array of 8 numbers, and the minimum distance for a sum is 2 (at least two elements must lie between the two numbers of a pair):
9 4 6 2 8 7 5 6
9 can be paired with 2, 8, 7, 5, 6
4 can be paired with 8, 7, 5, 6
6 can be paired with 7, 5, 6
2 can be paired with 9 (from the left side), 5, 6
8 can be paired with 9, 4 and 6
etc..
From this array it is obvious that the maximum possible sum is 9+8 = 17
Does anyone know of an efficient algorithm to do this, or has any idea?
I have tried an algorithm which finds the max number, compares it with every possible partner, and then checks every value within the minimum distance of the max against every possible partner... but this is very slow when the array consists of many numbers, especially when the minimum distance is large.
Note: All numbers are positive
Thanks.
For each position in the array, find and store the maximum of the elements up to that position. This can be done in O(n) for all positions by updating the maximum in each step.
By scanning right-to-left in the same manner, find and store for each position the maximum of all elements from that position to the end.
Now for each element, array[pos] + max(max_up_to[pos-D], max_from[pos+D]) will give you the highest sum that can be generated with that element. So another O(n) pass gives you the maximum over all elements.
Total: O(n) time, O(n) space.
In fact, you don't even need the additional space: The max_from array isn't needed because it's enough to evaluate array[pos] + max_up_to[pos-D] (since each sum would otherwise be generated twice). And the max_up_to values can be generated on-the-fly as you're iterating over the array.
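A minimal sketch of the single-pass version (names are mine), assuming "distance D" means the indices differ by at least D; for the example above, where two elements must lie between the pair, call it with d = 3:

def max_pair_sum(arr, d):
    """Largest arr[i] + arr[j] over pairs with j - i >= d, in one pass."""
    best_sum = float("-inf")
    best_left = float("-inf")       # running max of arr[0 .. pos - d]
    for pos in range(d, len(arr)):
        best_left = max(best_left, arr[pos - d])
        best_sum = max(best_sum, arr[pos] + best_left)
    return best_sum                 # -inf if no valid pair exists

max_pair_sum([9, 4, 6, 2, 8, 7, 5, 6], 3) returns 17 (9 + 8).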

Can we use binary search to find the most frequently occurring integer in a sorted array? [closed]

Problem:
Given a sorted array of integers find the most frequently occurring integer. If there are multiple integers that satisfy this condition, return any one of them.
My basic solution:
Scan through the array and keep track of how many times you've seen each integer. Since it's sorted, you know that once you see a different integer, you've gotten the frequency of the previous integer. Keep track of which integer had the highest frequency.
This is O(N) time, O(1) space solution.
I am wondering if there's a more efficient algorithm that uses some form of binary search. It will still be O(N) time, but it should be faster for the average case.
Asymptotically (big-oh wise), you cannot use binary search to improve the worst case, for the reasons the other answers have presented. However, here are some ideas that may or may not help you in practice.
For each integer, binary search for its last occurrence. Once you find it, you know how many times it appears in the array, and can update your counts accordingly. Then, continue your search from the position you found.
This is advantageous if you have only a few elements that repeat a lot of times, for example:
1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
Because you will only do 3 binary searches. If, however, you have many distinct elements:
1 2 3 4 5 6
Then you will do O(n) binary searches, resulting in O(n log n) complexity, so worse.
This gives you a better best case and a worse worst case than your initial algorithm.
Can we do better? We could improve the worst case by finding the last occurrence of the number at position i like this: look at 2i, then at 4i, etc., as long as the value at those positions is the same. Once it differs, binary search back, e.g. look at (i + 2i) / 2, etc.
For example, consider the array (positions are 1-based; we start at i = 1):
Position: 1 2 3 4 5 6 7 8 9 ...
Value:    1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
We look at 2i = 2, it has the same value. We look at 4i = 4, same value. We look at 8i = 8, different value. We backtrack to (4 + 8) / 2 = 6. Different value. Backtrack to (4 + 6) / 2 = 5. Same value. Try (5 + 6) / 2 = 5, same value. We search no more, because our window has width 1, so we're done. Continue the search from position 6.
This should improve the best case, while keeping the worst case as fast as possible.
Asymptotically, nothing is improved. To see if it actually works better on average in practice, you'll have to test it.
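A minimal sketch of this galloping idea (the function name is mine); each run of length k costs O(log k) probes instead of k:

import bisect

def most_frequent_sorted(arr):
    """Most frequent value in a sorted list, skipping over long runs."""
    best_value, best_count = None, 0
    i, n = 0, len(arr)
    while i < n:
        step = 1
        while i + step < n and arr[i + step] == arr[i]:
            step *= 2   # gallop: double the step while the value repeats
        # The run's end lies in (i + step // 2, i + step]; binary search it.
        end = bisect.bisect_right(arr, arr[i], i + step // 2, min(i + step, n))
        if end - i > best_count:
            best_value, best_count = arr[i], end - i
        i = end         # continue the scan right after the run
    return best_value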
Binary search, which eliminates half of the remaining candidates, probably wouldn't work. There are some techniques you could use to avoid reading every element in the array. Unless your array is extremely long or you're solving a problem for curiosity, the naive (linear scan) solution is probably good enough.
Here's why I think binary search wouldn't work: given the value of the middle item of an array, you do not have enough information to eliminate the lower or the upper half from the search.
However, we can scan the array in multiple passes, each time checking twice as many elements. When we find two checked elements that are the same, make one final pass. If no other element repeats among the checked positions, you've found the longest run (without even knowing exactly how many copies of that element the sorted list contains).
Otherwise, investigate the two (or more) longer sequences to determine which is longest.
Consider a sorted list.
Index 0 1 2 3 4 5 6 7 8 9 a b c d e f
List 1 2 3 3 3 3 3 3 3 4 5 5 6 6 6 7
Pass1 1 . . . . . . 3 . . . . . . . 7
Pass2 1 . . 3 . . . 3 . . . 5 . . . 7
Pass3 1 2 . 3 . x . 3 . 4 . 5 . 6 . 7
After pass 3, we know that the run of 3's must be at least 5, while the longest run of any other number is at most 3. Therefore, 3 is the most frequently occurring number in the list.
Using the right data structures and algorithms (use binary-tree-style indexing), you can avoid reading values more than once. You can also avoid reading the 3 (marked as an x in pass 3) since you already know its value.
For a list of n elements whose longest run has k elements, this solution has running time O(n/k), which degrades to O(n) for k = 1. For small k, the naive solution will perform better due to simpler logic, simpler data structures, and higher RAM cache hit rates.
If you need to determine the frequency of the most common number, it would take O((n/k) log k) as indicated by David to find the first and last position of the longest run of numbers using binary search on up to n/k groups of size k.
The worst case cannot be better than O(n) time. Consider the case where each element exists once, except for one element which exists twice. In order to find that element, you'd need to look at every element in the array until you find it. This is because knowing the value of any array element does not give you any information regarding the location of the duplicate element, until it's actually found. This is in contrast to binary search, where the value of an array element allows you to rule out many other elements.
No, in the worst case we have to scan at least n - 2 elements, but see
below for an algorithm that exploits inputs with many duplicates.
Consider an adversary that, for the first n - 3 distinct probes into the
n-element array, returns m for the value at index m. Now the algorithm
knows that the array looks like
1 2 3 ... i-1 ??? i+1 ... j-1 ??? j+1 ... k-1 ??? k+1 ... n-2 n-1 n.
Depending on what the ???s are, the sole correct answer could be j-1
or j+1, so the algorithm isn’t done yet.
This example involved an array where there were very few duplicates. In
fact, we can design an algorithm that, if the most frequent element
occurs k times out of n, uses O((n/k) log k) probes into the array. For
j from ceil(log2(n)) - 1 down to 0, examine the subarray consisting of
every (2**j)th element. Stop if we find a duplicate. The cost so far
is O(n/k). Now, for each element in the subarray, use binary search to
find its extent (O(n/k) searches in subarrays of size O(k), for a total
of O((n/k) log k)).
It can be shown that all algorithms have a worst case of Omega((n/k) log
k), making this one optimal in the worst case up to constant factors.
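A minimal sketch of that probing scheme (names are mine, and a non-empty sorted list is assumed): halve the stride until two probes collide, then binary-search each run's boundaries inside the one-stride windows around its first and last probe. Runs that fall entirely between probes are skipped, which is safe because the colliding run is longer than anything that fits between two probes:

import bisect

def most_frequent_by_probing(arr):
    """Find the most frequent value in a sorted list in O((n/k) log k)."""
    n = len(arr)
    stride = max(n // 2, 1)
    while stride > 1 and len(set(arr[::stride])) == len(arr[::stride]):
        stride //= 2    # all probes distinct so far: sample twice as densely
    best_value, best_count = arr[0], 0
    i = 0
    while i < n:
        j = i
        while j + stride < n and arr[j + stride] == arr[i]:
            j += stride                    # walk probes holding the same value
        # The run starts in (i - stride, i] and ends in (j, j + stride].
        start = bisect.bisect_left(arr, arr[i], max(i - stride + 1, 0), i + 1)
        end = bisect.bisect_right(arr, arr[i], j, min(j + stride, n))
        if end - start > best_count:
            best_value, best_count = arr[i], end - start
        i = j + stride
    return best_value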

2sum with duplicate values

The classic 2sum question is simple and well-known:
You have an unsorted array, and you are given a value S. Find all pairs of elements in the array that add up to value S.
And it's always been said that this can be solved with a hash table in O(N) time and space, or in O(N log N) time and O(1) extra space by first sorting and then moving in from the left and right.
Well, these two solutions are obviously correct, BUT I guess not for the following array:
{1,1,1,1,1,1,1,1}
Is it possible to print ALL pairs which add up to 2 in this array in O(N) or O(N log N) time?
No, printing out all pairs (including duplicates) takes O(N^2). The reason is that the output size is O(N^2), and the running time cannot be less than the output size (since it takes some constant amount of time to print each element, printing the output alone takes c*N^2 = O(N^2) time).
If all the elements are the same, e.g. {1,1,1,1,1}, every possible pair would be in the output:
1. 1 1
2. 1 1
3. 1 1
4. 1 1
5. 1 1
6. 1 1
7. 1 1
8. 1 1
9. 1 1
10. 1 1
This is (N-1) + (N-2) + ... + 2 + 1 (pairing each element with all elements to its right), which is
N(N-1)/2 = O(N^2), and that is more than O(N) or O(N log N).
However, you should be able to simply count the pairs in expected O(N) by:
Creating a hash-map map mapping each element to the count of how often it appears.
Looping through the hash-map and summing, for each element x up to S/2 (if we went all the way up to S we would count each pair {x, S-x} twice; take map[x] == 0 if x doesn't exist in the map):
map[x]*map[S-x] if x != S-x (which is the number of ways to pick one x and one S-x),
map[x]*(map[x]-1)/2 if x == S-x (from N(N-1)/2 above).
Of course you can also print the distinct pairs in O(N) by creating a similar hash-map, looping through it, and outputting the pair x, S-x only if map[S-x] exists.
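A minimal sketch of the counting idea (the function name is mine):

from collections import Counter

def count_pairs_with_sum(arr, s):
    """Count index pairs (i < j) with arr[i] + arr[j] == s in expected O(N)."""
    counts = Counter(arr)
    total = 0
    for x in counts:
        y = s - x
        if x < y:
            total += counts[x] * counts.get(y, 0)      # one x and one y
        elif x == y:
            total += counts[x] * (counts[x] - 1) // 2  # choose 2 of the x's
    return total

count_pairs_with_sum([1]*8, 2) returns 28 = 8*7/2, and count_pairs_with_sum([2, 4, 3, 2, 9, 3, 3], 6) returns 5 (two {2,4} pairs and three {3,3} pairs).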
Displaying or storing the results is O(N^2) on its own. The worst case, as you highlighted, clearly has O(N^2) pairs, and writing them to the screen or storing them in a result array requires at least that much time. In short, you are right!
No.
You can pre-compute them in O(N log N) using sorting, but printing them may need more than O(N log N); in the worst case it can be O(N^2).
Let's modify the algorithm to find all duplicate pairs.
As an example:
a[ ]={ 2 , 4 , 3 , 2 , 9 , 3 , 3 } and sum =6
After sorting:
a[ ] = { 2 , 2 , 3 , 3 , 3 , 4 , 9 }
Suppose you found the pair {2,4}; now you have to find the counts of 2 and 4 and multiply them to get the number of duplicate pairs. Here 2 occurs 2 times and 4 occurs 1 time, so {2,4} will appear 2*1 = 2 times in the output. Now consider the special case when both numbers are the same: count the number of occurrences c, and the pair appears c(c-1)/2 times. Here {3,3} sums to 6, and 3 occurs 3 times in the array, so {3,3} will appear 3*2/2 = 3 times in the output.
In your array {1,1,1,1,1}, only the pair {1,1} sums to 2, and the count of 1 is 5, so there are going to be 5*4/2 = 10 pairs of {1,1} in the output.
