Convert to Strictly increasing integer array with minimum changes


Given an array {a0, a1, a2, ..., an}, I need to find the minimum number of element replacements needed to make the array strictly increasing.
I know that dynamic programming solutions exist for this problem:
https://www.geeksforgeeks.org/convert-to-strictly-increasing-integer-array-with-minimum-changes/
I have been trying out a greedy solution instead.
This is the pseudocode (greedy):
Start iterating over the array from a[1] (0-based indexing).
    if a[i] > a[i-1] then continue
    else
        if (a[i-2] + 1 < a[i] && a[i-1] != a[i-2] + 1), then make a[i-1] = a[i-2] + 1
        else make a[i] = a[i-1] + 1
Continue till the end of the array.
The basic approach is that if an out-of-order element a[i] is found, I first check whether the previous element
a[i-1] can be decreased or not.
If it can be decreased, I check whether decreasing it to the minimum possible value (a[i-2] + 1) removes the violation a[i] <= a[i-1].
If it does not, I leave a[i-1] alone and instead increase a[i] to a[i-1] + 1.
e.g. 3 2 5 3 8 4 4
2 < 3, so decrease 3 to 0. Array: 0 2 5 3 8 4 4
5 > 2, so continue.
3 < 5; decreasing 5 to 3 or 4 is possible but not effective, so increase 3 to 6. Array: 0 2 5 6 8 4 4
8 > 6, so continue.
4 < 8; decreasing 8 to 7 is possible but not effective, so increase 4 to 9. Array: 0 2 5 6 8 9 4
4 < 9; decreasing 9 is not possible (it is already a[i-2] + 1), so increase 4 to 10. Array: 0 2 5 6 8 9 10
Total changes = 4, found in O(n).
I have been trying to prove the greedy strategy wrong but have been unable to do so. It leaves the LIS elements alone the same way the DP approach does (i.e. notice how 2, 5, 8 stay the same and the others change).
Is the solution correct? If not, can you please provide a counter-example?
Please also let me know the thought process that led you to the counter-example, if possible. It would be of great help in future questions.
Which leads me to the question I wanted to ask most:
How exactly do you figure out whether your greedy solution is correct? How do you realize that greedy won't work and you'll have to use dynamic programming?
If the pseudocode is unclear to anyone, please let me know. I would be more than happy to submit the complete code.
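One way to hunt for a counter-example is to compare the greedy on small random arrays against a reference that is known to be correct. Below is a minimal Python sketch of such a reference, assuming replacement values may be any integers (negative ones included). Under that assumption, the elements that are kept must satisfy a[j] - a[i] >= j - i, so the answer is n minus the length of the longest non-decreasing subsequence of a[i] - i. The name min_changes is made up for this illustration.

```python
from bisect import bisect_right


def min_changes(a):
    """Minimum replacements to make `a` strictly increasing.

    Assumption: replacement values may be any integers, negatives included.
    """
    tails = []  # tails[k] = smallest possible tail of a kept sequence of length k + 1
    for i, x in enumerate(a):
        key = x - i  # kept elements must have non-decreasing keys
        pos = bisect_right(tails, key)
        if pos == len(tails):
            tails.append(key)
        else:
            tails[pos] = key
    return len(a) - len(tails)


print(min_changes([3, 2, 5, 3, 8, 4, 4]))  # 4, the same count as the worked example
```

Any input where the greedy reports more changes than this reference would be a counter-example.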

Related

Does an O(1) solution exist for this problem?

We are given an array of size N, 1 <= N <= 1e5, of positive integers Ai such that
1 <= Ai <= 1e9.
We are given Q queries, 1 <= Q <= 1e5.
Each query consists of two space-separated integers b c, 1 <= b,c <= N.
For every query we need to decide whether moving from index b of the array to index c is possible, and if it is, we have to find a special sum, which I explain below.
We can't simply move from index i to i+1; there is a restriction. If we want to move from i to j, then A[j] must be strictly greater than A[i], i.e. A[j] > A[i].
Note one thing: while moving, we always have to take the very next element that is greater than the current one.
The sum we need to find is the sum of the elements on the path taken to reach the destination.
For Example
array : 3 2 5 4 6 6 7
query : 1 7
So, according to the query, we need to move from the 1st element to the last element if possible.
As we can see, we can take the path 3 --> 5 --> 6 --> 7 to reach the destination, and the sum is 3+5+6+7 = 21.
But if the last element in the array were 2:
array : 3 2 5 4 6 6 2
query : 1 7
For this query we can't reach the destination, because after 6 the destination element 2 is smaller than it. So for this query no answer exists.
My approach
I know I can find the answer in O(n) by simply traversing the array from A[b] to A[c], finding out whether an answer exists as well as the sum.
But the problem is that there are a lot of queries, so with an O(n) solution per query the time complexity will be O(QN).
The time limit is only 1 sec, so I need a constant-time O(1) solution per query.
One more thing: the problem becomes even tougher when queries of a second type appear.
Query type 2: in this query we need to update the value at an index with a given K.
query : b k , then A[b] = K.
Can anyone help me with this?

The question involves a large number of queries, so the solution is most probably to preprocess the array once and then answer each query in O(1) time.
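For the static version of the problem (no type 2 updates), one possible preprocessing is sketched below in Python. It assumes that moves only go to the right and that every step jumps to the nearest strictly greater element. The names preprocess, query, prev_ge and chain_sum are made up for this illustration. Index c lies on the path from b exactly when no element in A[b..c-1] is >= A[c], and the path sum follows from suffix sums along the "next greater" chain.

```python
def preprocess(a):
    """Build two tables with monotonic stacks in O(n)."""
    n = len(a)
    prev_ge = [-1] * n  # nearest index to the left whose value is >= a[i], or -1
    stack = []
    for i in range(n):
        while stack and a[stack[-1]] < a[i]:
            stack.pop()
        prev_ge[i] = stack[-1] if stack else -1
        stack.append(i)

    chain_sum = [0] * n  # sum of a[i] and everything after it on its next-greater chain
    stack = []
    for i in range(n - 1, -1, -1):
        while stack and a[stack[-1]] <= a[i]:
            stack.pop()
        chain_sum[i] = a[i] + (chain_sum[stack[-1]] if stack else 0)
        stack.append(i)
    return prev_ge, chain_sum


def query(a, prev_ge, chain_sum, b, c):
    """Answer one query with 1-based b, c in O(1); None means c is unreachable."""
    b, c = b - 1, c - 1
    if c < b or prev_ge[c] >= b:
        return None
    return chain_sum[b] - chain_sum[c] + a[c]


a = [3, 2, 5, 4, 6, 6, 7]
tables = preprocess(a)
print(query(a, *tables, 1, 7))  # 21, i.e. 3 + 5 + 6 + 7
```

With type 2 updates these tables would have to be rebuilt after every change, so this sketch does not cover that case.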

Convert sorted array into low high array

Interview question:
Given a sorted array of this form :
1,2,3,4,5,6,7,8,9
( A better example would be 10,20,35,42,51,66,71,84,99 but let's use above one)
Convert it to the following low high form without using extra memory or a standard library
1,9,2,8,3,7,4,6,5
A low-high form means that we use the smallest followed by highest. Then we use the second smallest and second-highest.
Initially, when he asked, I used a secondary array and the two-pointer approach. I kept one pointer at the front and a second pointer at the end, then one by one copied the left and right values to my new array, moving them with left++ and --right until they crossed or met.
After this, he asked me to do it without extra memory.
My approach to solving it without extra memory was along the following lines, but it was confusing and did not work:
1) At the odd index (index 1, value 2), swap the 2nd element with the last:
1,2,3,4,5,6,7,8,9 becomes
1,9,3,4,5,6,7,8,2
Then we reach an even index.
2) At the even index (index 2, value 3), swap the 3rd element with the last:
1,9,3,4,5,6,7,8,2 becomes (swapped 3 and 2)
1,9,2,4,5,6,7,8,3
and then swap 8 and 3:
1,9,2,4,5,6,7,8,3 becomes
1,9,2,4,5,6,7,3,8
3) We reach an odd index (index 3, value 4):
1,9,2,4,5,6,7,3,8
becomes
1,9,2,8,5,6,7,3,4
4) At the even index, swap 5 with the last,
and here it goes wrong.

Let me start by pointing out that even registers are a kind of memory. Without any 'extra' memory (other than that occupied by the sorted array, that is) we don't even have counters! That said, here goes:
Let a be an array of n > 2 positive integers sorted in ascending order, with the positions indexed from 0 to n-1.
From i = 1 to n-2, bubble-sort the sub-array ranging from position i to position n-1 (inclusive), alternately in descending and ascending order. (Meaning that you bubble-sort in descending order if i is odd and in ascending order if i is even.)
Since to bubble-sort you only need to compare, and possibly swap, adjacent elements, you won't need 'extra' memory.
(Mind you, if you start at i = 0 and first sort in ascending order, you don't even need a to be pre-sorted.)
And one more thing: as there was no talk of it in your question, I will keep very silent on the performance of the above algorithm...
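A minimal Python sketch of the idea above, with one simplification that is not in the answer: because the input is pre-sorted, each sub-array is already sorted in the opposite order when its turn comes, so the bubble sort is replaced here by an in-place reversal. The function name is made up for this illustration.

```python
def low_high_by_reversal(a):
    """Rearrange a sorted list into low-high form in place (O(n^2) swaps)."""
    n = len(a)
    for i in range(1, n - 1):
        # a[i:] is sorted the "wrong way round" for this step, so reverse it
        lo, hi = i, n - 1
        while lo < hi:
            a[lo], a[hi] = a[hi], a[lo]
            lo += 1
            hi -= 1
    return a


print(low_high_by_reversal([1, 2, 3, 4, 5, 6, 7, 8, 9]))
# [1, 9, 2, 8, 3, 7, 4, 6, 5]
```

Only two index counters are used besides the array itself.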

We will make about n/2 passes. During pass k we swap each element in turn, from left to right starting with the element at position 2k-1, with the last element. Example:
pass 1
  V
1,2,3,4,5,6,7,8,9
1,9,3,4,5,6,7,8,2
1,9,2,4,5,6,7,8,3
1,9,2,3,5,6,7,8,4
1,9,2,3,4,6,7,8,5
1,9,2,3,4,5,7,8,6
1,9,2,3,4,5,6,8,7
1,9,2,3,4,5,6,7,8
pass 2
      V
1,9,2,3,4,5,6,7,8
1,9,2,8,4,5,6,7,3
1,9,2,8,3,5,6,7,4
1,9,2,8,3,4,6,7,5
1,9,2,8,3,4,5,7,6
1,9,2,8,3,4,5,6,7
pass 3
          V
1,9,2,8,3,4,5,6,7
1,9,2,8,3,7,5,6,4
1,9,2,8,3,7,4,6,5
1,9,2,8,3,7,4,5,6
pass 4
              V
1,9,2,8,3,7,4,5,6
1,9,2,8,3,7,4,6,5
This should take O(n^2) swaps and uses no extra memory beyond the counters involved.
The loop invariant to prove is that the first 2k+1 positions are correct after iteration k of the loop.
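The passes traced above can be sketched in Python as follows. This is only an illustration of the trace, with a made-up function name; `start` plays the role of position 2k-1.

```python
def low_high_by_passes(a):
    """Rearrange a sorted list into low-high form in place, one pass per high value."""
    n = len(a)
    start = 1  # position 2k-1 for pass k = 1, 2, 3, ...
    while start < n - 1:
        # swap every element from `start` up to the second-to-last with the last one
        for i in range(start, n - 1):
            a[i], a[n - 1] = a[n - 1], a[i]
        start += 2
    return a


print(low_high_by_passes([1, 2, 3, 4, 5, 6, 7, 8, 9]))
# [1, 9, 2, 8, 3, 7, 4, 6, 5]
```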

Alright, assuming that with constant space complexity, we need to lose some of our time complexity, the following algorithm possibly works in O(n^2) time complexity.
I wrote this in python. I wrote it as quickly as possible so apologies for any syntactical errors.
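# s is the list passed in; it is rearranged in place.
def hi_low(s):
    last = len(s) - 1  # index of the last element
    for i in range(0, last, 2):
        index_to_be_swapped = i + 1  # slot that must receive the current maximum
        index_to_swap = last         # the current maximum always sits at the end
        # bubble the last element leftwards with adjacent swaps until it reaches the slot
        while index_to_swap > index_to_be_swapped:
            s[index_to_swap], s[index_to_swap - 1] = s[index_to_swap - 1], s[index_to_swap]
            index_to_swap -= 1
    return s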
Quick explanation:
Suppose the initial list given to us is:
1 2 3 4 5 6 7 8 9
So in our program, initially,
index_to_swap = last
meaning that it is pointing to 9, and
index_to_be_swapped = i+1
is i+1, i.e one step ahead of our current loop pointer. [Also remember we're looping with a difference of 2].
So initially,
i = 0
index_to_be_swapped = 1
index_to_swap = 8 (the last index, which holds the value 9)
and in the inner loop we keep swapping the element at index_to_swap with its left neighbour until it reaches index_to_be_swapped:
swap(s[index_to_swap], s[index_to_swap-1])
so it'll look like:
# initially:
1 2 3 4 5 6 7 8 9
  ^             ^---index_to_swap
  ^-----index_to_be_swapped
# after 1 loop
1 2 3 4 5 6 7 9 8
  ^           ^-----index_to_swap
  ^----- index_to_be_swapped
... goes on until
1 9 2 3 4 5 6 7 8
  ^-----index_to_swap
  ^-----index_to_be_swapped
Now, the inner loop's job is done, and the main loop is run again with
1 9 2 3 4 5 6 7 8
      ^         ^---- index_to_swap
      ^------index_to_be_swapped
This runs until 8 sits right behind 2.
So the outer loop runs about n/2 times, and for each outer iteration the inner loop runs about n/2 times on average, so the total is about n/2 * n/2 = n^2/4 swaps, which is of order n^2, i.e. O(n^2).
If there are any mistakes, please feel free to point them out.
Hope this helps!

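This works for a sorted array of consecutive integers (like the example), because it regenerates the values instead of moving them:
let arr = [1, 2, 3, 4, 5, 6, 7, 8, 9];
let i = arr[0];               // smallest value
let j = arr[arr.length - 1];  // largest value
let k = 0;
while (k < arr.length) {
    arr[k] = i;                              // low value at the even index
    if (k + 1 < arr.length) arr[k + 1] = j;  // high value at the odd index, if it exists
    i++;
    k += 2;
    j--;
}
console.log(arr);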
Explanation: because it's a sorted array, you need to know 3 things to produce your expected output.
Starting value: let i = arr[0]
Ending value (you can also derive it from the length of the array, by the way): let j = arr[arr.length - 1]
Length of the array: arr.length
Loop through the array and set the values like this:
arr[firstIndex] = firstValue, arr[thirdIndex] = firstValue + 1 and so on.
arr[secondIndex] = lastValue, arr[fourthIndex] = lastValue - 1 and so on.
Obviously you can do the same thing in a different way, but I think that's the simplest way.

Fastest way to find twice number in C [duplicate]

Could anyone help me solve this problem in C? I think the solution has to be judged in big O terms, but I have no idea how.
The question: there is an array T of size N+1 that contains the numbers from 1 to N in random order. One number x appears twice (its positions are also random).
What is the fastest way to find the value of x?
For example:
N = 7
[6 3 5 1 3 7 4 2]
x=3

The sum of numbers 1..N is N*(N+1)/2.
So, the extra number is:
extra_number = sum(all N+1 numbers) - N*(N+1)/2
Everything is O(1) except the sum. The sum can be computed in O(N) time.
The overall algorithm is O(N).
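As a quick illustration of the formula, here is a small Python sketch (the function name is made up; Python integers do not overflow, whereas a C version would need a wide enough integer type for the sum):

```python
def find_repeated_by_sum(t):
    """t holds the numbers 1..N plus one repeated value, so len(t) == N + 1."""
    n = len(t) - 1
    return sum(t) - n * (n + 1) // 2


print(find_repeated_by_sum([6, 3, 5, 1, 3, 7, 4, 2]))  # 3, since 31 - 28 = 3
```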

Walk the array using the value as the next array index (minus 1), marking the ones visited with a special value (like 0 or the negation). O(n)
On average, only half the elements are visited.
v
6 3 5 1 3 7 4 2
            v
. 3 5 1 3 7 4 2
        v
. 3 5 1 3 7 . 2
      v
. 3 5 1 . 7 . 2
  v
. 3 5 . . 7 . 2
      v !! already visited. Previous 3 is repeated.
. 3 5 . . 7 . 2
No overflow problem caused by adding up the sum. Of course the array needs to be modified (or a sibling bool array of flags is needed.)
This method works even if more than 1 value is repeated.
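A minimal Python sketch of this walk, written to match the trace: it starts at index 0 and uses each value directly as the next 0-based index, which is an assumption read off the trace. It keeps a separate list of flags instead of modifying the array, and the function name is made up.

```python
def find_repeated_by_walk(t):
    """t holds the numbers 1..N plus one repeated value, so len(t) == N + 1."""
    visited = [False] * len(t)
    pos = 0            # no value points at index 0, so the walk starts there
    last_value = None
    while not visited[pos]:
        visited[pos] = True
        last_value = t[pos]
        pos = t[pos]   # the value read is the next index to visit
    # pos was reached a second time, so two elements hold the value last_value
    return last_value


print(find_repeated_by_walk([6, 3, 5, 1, 3, 7, 4, 2]))  # 3
```

The walk stops the first time it lands on an already visited index; the value that led there is the one stored twice.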

The algorithm given by Klaus has O(1) memory requirements, but it has to sum all the elements of the given array, which may be large, so it always iterates over the whole array.
Another approach is to iterate over the array and increment an occurrence counter for each value seen, so the algorithm can stop as soon as it finds the duplicate, though in the worst case it still scans all the elements. It needs O(N) extra memory for the counters. For example:
#include <stdio.h>

#define N 8  /* number of elements stored in T */

int main(void) {
    int T[N] = {6, 3, 5, 1, 3, 7, 4, 2};
    int occurences[N + 1] = {0};  /* one counter per possible value */
    int duplicate = -1;
    for (int i = 0; i < N; i++) {
        occurences[T[i]]++;
        if (occurences[T[i]] == 2) {  /* second sighting: this is the duplicate */
            duplicate = T[i];
            break;
        }
    }
    printf("%d\n", duplicate);  /* prints 3 for this input */
    return 0;
}
Note that this method is also immune to integer overflow: N*(N+1)/2 might be larger than the integer data type can hold.

Can we use binary search to find most frequently occuring integer in sorted array? [closed]

Problem:
Given a sorted array of integers find the most frequently occurring integer. If there are multiple integers that satisfy this condition, return any one of them.
My basic solution:
Scan through the array and keep track of how many times you've seen each integer. Since it's sorted, you know that once you see a different integer, you've gotten the frequency of the previous integer. Keep track of which integer had the highest frequency.
This is O(N) time, O(1) space solution.
I am wondering if there's a more efficient algorithm that uses some form of binary search. It will still be O(N) time, but it should be faster for the average case.

Asymptotically (big-oh wise), you cannot use binary search to improve the worst case, for the reasons the answers above mine have presented. However, here are some ideas that may or may not help you in practice.
For each integer, binary search for its last occurrence. Once you find it, you know how many times it appears in the array, and can update your counts accordingly. Then, continue your search from the position you found.
This is advantageous if you have only a few elements that repeat a lot of times, for example:
1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
Because you will only do 3 binary searches. If, however, you have many distinct elements:
1 2 3 4 5 6
Then you will do O(n) binary searches, resulting in O(n log n) complexity, so worse.
This gives you a better best case and a worse worst case than your initial algorithm.
Can we do better? We could improve the worst case by finding the last occurrence of the number at position i like this: look at 2i, then at 4i etc. as long as the value at those positions is the same. If it is not, look at (i + 2i) / 2 etc.
For example, consider the array:
i
1 2 3 4 5 6 7 ...
1 1 1 1 1 2 2 2 2 3 3 3 3 3 3 3 3 3 3
We look at 2i = 2, it has the same value. We look at 4i = 4, same value. We look at 8i = 8, different value. We backtrack to (4 + 8) / 2 = 6. Different value. Backtrack to (4 + 6) / 2 = 5. Same value. Try (5 + 6) / 2 = 5, same value. We search no more, because our window has width 1, so we're done. Continue the search from position 6.
This should improve the best case, while keeping the worst case as fast as possible.
Asymptotically, nothing is improved. To see if it actually works better on average in practice, you'll have to test it.
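A minimal Python sketch of the doubling idea, under the assumption that the input is a sorted list. It gallops forward from the start of each run with growing steps and then finishes with a binary search (`bisect_right` restricted to the last window). The function name is made up.

```python
from bisect import bisect_right


def most_frequent_by_galloping(a):
    """Return (value, count) for a most frequent value of the sorted list a."""
    n = len(a)
    best_value, best_count = None, 0
    start = 0
    while start < n:
        value = a[start]
        lo, step = start, 1
        # gallop: jump 1, 2, 4, ... positions ahead while the value stays the same
        while lo + step < n and a[lo + step] == value:
            lo += step
            step *= 2
        # the run ends somewhere in (lo, lo + step]; binary search inside that window
        end = bisect_right(a, value, lo, min(lo + step, n))
        if end - start > best_count:
            best_value, best_count = value, end - start
        start = end
    return best_value, best_count


data = [1] * 5 + [2] * 4 + [3] * 10
print(most_frequent_by_galloping(data))  # (3, 10)
```

Whether this beats the plain linear scan in practice depends on the run lengths, so it is worth measuring on real data.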

Binary search, which eliminates half of the remaining candidates, probably wouldn't work. There are some techniques you could use to avoid reading every element in the array. Unless your array is extremely long or you're solving a problem for curiosity, the naive (linear scan) solution is probably good enough.
Here's why I think binary search wouldn't work: given only the value of the middle item, you do not have enough information to eliminate either the lower or the upper half from the search.
However, we can scan the array in multiple passes, each time checking twice as many elements. When we find two sampled elements that are the same, we make one final pass. If no other elements were repeated, we have found the longest run (without even knowing exactly how many copies of that element are in the sorted list).
Otherwise, investigate the two (or more) longer sequences to determine which is longest.
Consider a sorted list.
Index 0 1 2 3 4 5 6 7 8 9 a b c d e f
List  1 2 3 3 3 3 3 3 3 4 5 5 6 6 6 7
Pass1 1 . . . . . . 3 . . . . . . . 7
Pass2 1 . . 3 . . . 3 . . . 5 . . . 7
Pass3 1 2 . 3 . x . 3 . 4 . 5 . 6 . 7
After pass 3, we know that the run of 3's must be at least 5, while the longest run of any other number is at most 3. Therefore, 3 is the most frequently occurring number in the list.
Using the right data structures and algorithms (use binary-tree-style indexing), you can avoid reading values more than once. You can also avoid reading the 3 (marked as an x in pass 3) since you already know its value.
This solution has running time O(n/k) which degrades to O(n) for k=1 for a list with n elements and a longest run of k elements. For small k, the naive solution will perform better due to simpler logic, data structures, and higher RAM cache hits.
If you need to determine the frequency of the most common number, it would take O((n/k) log k) as indicated by David to find the first and last position of the longest run of numbers using binary search on up to n/k groups of size k.

The worst case cannot be better than O(n) time. Consider the case where each element exists once, except for one element which exists twice. In order to find that element, you'd need to look at every element in the array until you find it. This is because knowing the value of any array element does not give you any information regarding the location of the duplicate element, until it's actually found. This is in contrast to binary search, where the value of an array element allows you to rule out many other elements.

No, in the worst case we have to scan at least n - 2 elements, but see
below for an algorithm that exploits inputs with many duplicates.
Consider an adversary that, for the first n - 3 distinct probes into the
n-element array, returns m for the value at index m. Now the algorithm
knows that the array looks like
1 2 3 ... i-1 ??? i+1 ... j-1 ??? j+1 ... k-1 ??? k+1 ... n-2 n-1 n.
Depending on what the ???s are, the sole correct answer could be j-1
or j+1, so the algorithm isn’t done yet.
This example involved an array where there were very few duplicates. In
fact, we can design an algorithm that, if the most frequent element
occurs k times out of n, uses O((n/k) log k) probes into the array. For
j from ceil(log2(n)) - 1 down to 0, examine the subarray consisting of
every (2**j)th element. Stop if we find a duplicate. The cost so far
is O(n/k). Now, for each element in the subarray, use binary search to
find its extent (O(n/k) searches in subarrays of size O(k), for a total
of O((n/k) log k)).
It can be shown that all algorithms have a worst case of Omega((n/k) log
k), making this one optimal in the worst case up to constant factors.
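A rough Python sketch of the sampling idea, with two simplifications that are mine and not the answer's: the samples are taken with list slicing, and each candidate's extent is found by binary search over the whole array (O(log n) per candidate) rather than over a window of size O(k). The function name is made up.

```python
from bisect import bisect_left, bisect_right


def most_frequent_by_sampling(a):
    """Return a most frequent value of the sorted list a (None if a is empty)."""
    n = len(a)
    if n == 0:
        return None
    # largest power of two below n: the coarsest stride worth trying
    step = 1
    while step * 2 < n:
        step *= 2
    # go from coarse to fine strides; stop once two neighbouring samples are equal
    while step > 1:
        sample = a[::step]
        if any(x == y for x, y in zip(sample, sample[1:])):
            break
        step //= 2
    # any run of at least `step` elements contains a sampled index,
    # so the most frequent value is among the samples
    best_value, best_count = None, 0
    for value in a[::step]:
        count = bisect_right(a, value) - bisect_left(a, value)
        if count > best_count:
            best_value, best_count = value, count
    return best_value


print(most_frequent_by_sampling([1, 2, 3, 3, 3, 3, 3, 3, 3, 4, 5, 5, 6, 6, 6, 7]))  # 3
```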

Is there a more elegant way of doing this?

Given an array of positive integers a, I want to output an array of integers b so that b[i] is the closest number to a[i] that is smaller than a[i] and is in {a[0], ... a[i-1]}. If no such number exists, then b[i] = -1.
Example:
a = 2 1 7 5 7 9
b = -1 -1 2 2 5 7
b[0] = -1 since there is no number that is smaller than 2
b[1] = -1 since there is no number that is smaller than 1 from {2}
b[2] = 2, closest number to 7 that is smaller than 7 from {2,1} is 2
b[3] = 2, closest number to 5 that is smaller than 5 from {2,1,7} is 2
b[4] = 5, closest number to 7 that is smaller than 7 from {2,1,7,5} is 5
I was thinking about implementing balanced binary tree, however it will require a lot of work. Is there an easier way of doing this?

Here is one approach:
b[0] = -1 // nothing precedes the first element
for i ← 1 to (length(A)-1) {
    // A[i] is inserted into the sorted sequence A[0 .. i-1]; save A[i] to make a hole at index j
    item = A[i]
    j = i
    // keep moving the hole to the next smaller index until A[j - 1] is <= item
    while j > 0 and A[j - 1] > item {
        A[j] = A[j - 1] // move hole to next smaller index
        j = j - 1
    }
    A[j] = item // put item in the hole
    // the answer is the first element to the left of A[j] that differs from item
    // (skipping equal elements keeps duplicate entries from hampering the result)
    k = j - 1
    while k >= 0 and A[k] == item {
        k = k - 1
    }
    if k >= 0
        b[i] = A[k]
    else
        b[i] = -1
}
Dry run (using 1-based positions):
a = 2 1 7 5 7 9
a[1] = 2
It's straightforward: set b[1] to -1
a[2] = 1
insert into subarray : [1 ,2]
any elements before 1 in sorted array ? no.
So set b[2] to -1 . b: [-1, -1]
a[3] = 7
insert into subarray : [1 ,2, 7]
any elements before 7 in sorted array ? yes. its 2
So set b[3] to 2. b: [-1, -1, 2]
a[4] = 5
insert into subarray : [1 ,2, 5, 7]
any elements before 5 in sorted array ? yes. its 2
So set b[4] to 2. b: [-1, -1, 2, 2]
and so on..

Here's a sketch of a (nearly) O(n log n) algorithm that's somewhere in between the difficulty of implementing an insertion sort and balanced binary tree: Do the problem backwards, use merge/quick sort, and use binary search.
Pseudocode:
let c be a copy of a
let b be an array sized the same as a
sort c using an O(n log n) algorithm
for i from a.length-1 to 1
    binary search over c for key a[i] // O(log n) time
    remove the item found // Could take O(n) time
    if there exists an item to the left of that position, b[i] = that item
    otherwise, b[i] = -1
b[0] = -1
return b
There's a few implementation details that can make this have poor runtime.
For instance, since you have to remove items, doing this on a regular array and shifting things around will make this algorithm still take O(n^2) time. So, you could store key-value pairs instead: the first item of the pair would be the key, and the second would be the number of occurrences of that key (kind of like a multiset implemented on an array). "Removing" one would then just mean decrementing the second item of the pair, and so on.
Eventually you will be left with a bunch of 0-value keys. This would eventually make the if there exists an item to the left take roughly O(n) time, and therefore, the entire algorithm would degrade to a O(n^2) for that reason. So another optimization might be to batch remove all of them periodically. For instance, when 1/2 of them are 0-values, perform a pruning.
The ideal option might be to implement another data structure that has a much more favorable remove time. Something along the lines of a modified unrolled linked list with indices could work, but it would certainly increase the implementation complexity of this approach.
I've actually implemented this. I used the first two optimizations above (storing key-value pairs for compression, and pruning when 1/2 of them are 0s). Here's some benchmarks to compare using an insertion sort derivative to this one:
a.length    This method    Insertion sort method
100         0.0262 ms      0.0204 ms
1000        0.2300 ms      0.8793 ms
10000       2.7303 ms      75.7155 ms
100000      32.6601 ms     7740.36 ms
300000      98.9956 ms     69523.6 ms
1000000     333.501 ms     ????? (not patient enough)
So, as you can see, this algorithm grows much, much slower than the insertion sort method I posted before. However, it took 73 lines of code vs 26 lines of code for the insertion sort method. So in terms of simplicity, the insertion sort method might still be the way to go if you don't have time requirements/the input is small.

You could treat it like an insertion sort.
Pseudocode:
let arr be one array with enough space for every item in a
let b be another array with, again, enough space for all elements in a
For each item in a:
    perform insertion sort on item into arr
    After performing the insertion, if there exists a number to the left, append that to b.
    Otherwise, append -1 to b
return b
The main thing you have to worry about is making sure that you don't make the mistake of reallocating arrays (because it would reallocate n times, which would be extremely costly). This will be an implementation detail of whatever language you use (std::vector's reserve for C++ ... arr.reserve(n) for D ... ArrayList's ensureCapacity in Java...)
A potential downfall with this approach compared to using a binary tree is that it's O(n^2) time. However, the constant factors using this method vs binary tree would make this faster for smaller sizes. If your n is smaller than 1000, this would be an appropriate solution. However, O(n log n) grows much slower than O(n^2), so if you expect a's size to be significantly higher and if there's a time limit that you are likely to breach, you might consider a more complicated O(n log n) algorithm.
There are ways to slightly improve the performance (such as using a binary insertion sort: using binary search to find the position to insert into), but generally they won't improve performance enough to matter in most cases since it's still O(n^2) time to shift elements to fit.
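A minimal Python sketch of the binary insertion variant mentioned above, using the standard `bisect` module. Looking up the position with `bisect_left` before inserting means equal values are skipped automatically, so duplicates need no special handling. The function name is made up, and the list insertion still shifts elements, so this stays O(n^2) in the worst case.

```python
from bisect import bisect_left, insort


def closest_smaller_before(a):
    """b[i] = largest earlier value strictly smaller than a[i], or -1 if none."""
    seen = []  # the earlier values, kept sorted
    b = []
    for x in a:
        pos = bisect_left(seen, x)  # number of earlier values strictly less than x
        b.append(seen[pos - 1] if pos > 0 else -1)
        insort(seen, x)             # O(log n) search, O(n) shift
    return b


print(closest_smaller_before([2, 1, 7, 5, 7, 9]))  # [-1, -1, 2, 2, 5, 7]
```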

Consider this:
a = 2 1 7 5 7 9
b = -1 -1 2 2 5 7
c   0   1   2   3   4   5   6   7   8   9
0   -   -   -   -   -   -   -   -   -   -
Here the index of c is the value of a[i], so that indices 0, 3, 4, 6 and 8 keep null values,
and the 1st dimension of c contains the highest closest value found to date for a[i].
So by step a[3] we have the following:
c   0   1   2   3   4   5   6   7   8   9
0   -  -1  -1   -   -   2   -   2   -   -
and by step a[5] we have the following:
c   0   1   2   3   4   5   6   7   8   9
0   -  -1  -1   -   -   2   -   5   -   7
This way, when we get to the 2nd 7 at a[4], we know that 2 is the largest value to date, and all we need to do is loop back from a[i-1] until we encounter a 7 again, comparing each value we pass to the one in c[7] and replacing c[7] if it is bigger. Once we reach the earlier 7, we put c[7] into b[i] and move on to the next a[i].
The main downfalls to this approach that I can see are:
the footprint size, depending on how big c[] needs to be dimensioned;
the fact that you have to revisit elements of a[] that you've already touched. If the distribution of the data is such that there are significant gaps between the two 7's, then keeping track of the highest value as you go would presumably be faster. Alternatively, it might be better to gather statistics on the a[i] up front to know what distributions exist, and then use a hybrid method that maintains the max until no more instances of that number are left in the statistics.