In the familiar problem where every element in an array is at most k positions away from its correct location, either to the left or to the right, I don't completely understand how the insertion sort algorithm works.
I drew it on paper and debugged it step by step. It does seem to work, and the time complexity is O(n·k) as well.
But my question is this: in this problem, the element could be k positions away either to the left or to the right. However, insertion sort only checks to the left. How does it still manage to get it right? Could you please explain, in a way I can convince myself of, how this algorithm works even though we only look to the left?
PS : Irrelevant here : If I weren't aware of this algorithm, I would've thought of something like selection sort, where for a given element i, you look k positions to the left and right to choose the smallest element.
PS : Even more irrelevant : I'm aware of the min-heap based method for this. Again, the same question: why do we only look left?
Items to the left of the correct location get pushed right while the algorithm is processing smaller items. (Assuming that we're sorting in ascending order.) For example, if k is 3 and the initial array is
D A B E H C F G
Let's examine how D gets to location 3 in the array. (Using zero-based indexing, D starts at index 0, and needs to move to index 3 in the array.)
The first pass starts at E, and finds that it can swap A and D resulting in
A D B E H C F G
Second pass starts at H, and swaps B and D
A B D E H C F G
Third pass starts at C, and swaps C with H, E, and D
A B C D E H F G
And now you see that D is already where it's supposed to be.
That's always going to be the case. Any element that starts to the left of its final position will be pushed right (to its final position) as smaller elements are processed. Otherwise, the smaller elements wouldn't be in their correct locations. And an element (like D) won't get pushed past its correct location, because the algorithm won't swap that element with elements that are larger.
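If it helps to see it in code, here is plain insertion sort in Java (a sketch; nothing in it is specific to the k-sorted case, the inner-loop bound just follows from the input):

static void insertionSort(int[] a)
{
    for (int i = 1; i < a.length; i++)
    {
        int cur = a[i];
        int j = i - 1;
        // Shift larger elements one slot to the right; this is exactly the
        // "pushing right" described above. On a k-sorted input this loop
        // runs O(k) times per element, giving O(n*k) overall.
        while (j >= 0 && a[j] > cur)
        {
            a[j + 1] = a[j];
            j--;
        }
        a[j + 1] = cur;
    }
}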
In insertion sort, what actually happens is that we compare an element with every element to its left. The list to the left of that element ends up sorted, while the list to its right does not. Then we move on to the next element and repeat the same process. This is done until we reach the last element in the list, at which point we have the sorted list.
It's a wrong assumption that you only need to look left. You can also start from the beginning index and look right.
But there's something called a loop invariant (read more about it). The invariant of insertion sort is that it keeps a sorted subarray (growing as the algorithm runs) to its left or its right.
Here's the link to a read which will clear it up. https://www.hackerrank.com/challenges/correctness-invariant
Let V be a vector of n elements, where each cell can contain one of k possible colors, that is
V[i] ∈ {c1, ..., ck}
Design an algorithm that, given V, constructs an "oracle" (a data structure) able to answer in O(1) queries of the following type:
Given an index i and a color c, what is the index of the cell closest to i that contains the color c?
The oracle construction algorithm must have complexity in O(kn), and the query algorithm in O(1).
EDIT
O(kn) refers to the time complexity; there is no limit on the additional memory.
My reasoning
Given i and c, the query should return an index j with
V[j] = c
which minimizes |i - j|. If no cell contains the color c, it must return -1. So I guess that the two function prototypes should be as follows:
ORACLE(array V, int k)
QUERY(array O, int i, int c)
The array O is created by the oracle function in order to "save" the preprocessed values that will later be retrieved in O(1) by the query function. I'm stuck at this step, because I can't work out how to place the values so as to get the right result. Any hints?
As you stated, your oracle should probably be an N×K array, with the answer for every index and every color stored as an integer: the index closest to the query index that has the query color. Initialize your oracle array to all -1.

Then go through your array V, first forward and then backwards. When you go through it forward, keep track of the last index in V where you have seen each color (with -1 if you haven't seen the color yet); as you proceed through V in forward order, the oracle's answer at index i for color j is the last index where you saw color j.

Then go through the array V backwards, again keeping track of the last index at which you saw each color. When you are at position i, compare, for each color, the index recorded during the forward pass with the last index at which you saw that color going backwards; if the backwards index is closer, overwrite the oracle cell with it.

After you go through the array both forwards and backwards, the oracle is fully constructed and ready to answer queries in O(1) time.
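A sketch of that construction in Java (assuming colors are encoded as integers 0..k-1; the names are mine):

static int[][] buildOracle(int[] v, int k)
{
    int n = v.length;
    int[][] oracle = new int[n][k];
    int[] last = new int[k];                 // last index where each color was seen
    java.util.Arrays.fill(last, -1);

    // Forward pass: for every i, record the nearest occurrence at or before i.
    for (int i = 0; i < n; i++)
    {
        last[v[i]] = i;
        System.arraycopy(last, 0, oracle[i], 0, k);
    }

    // Backward pass: overwrite whenever an occurrence at or after i is closer.
    java.util.Arrays.fill(last, -1);
    for (int i = n - 1; i >= 0; i--)
    {
        last[v[i]] = i;
        for (int c = 0; c < k; c++)
        {
            if (last[c] != -1 && (oracle[i][c] == -1 || last[c] - i < i - oracle[i][c]))
                oracle[i][c] = last[c];
        }
    }
    return oracle;                           // O(kn) construction
}

static int query(int[][] oracle, int i, int c)
{
    return oracle[i][c];                     // O(1); -1 if color c never occurs
}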
Given an array where each element is one more or one less than its preceding element, find a given element in it (with an approach better than O(n)).
I have a solution for this but I have no way to tell formally if it is the correct solution:
Let us assume we have to find n.
From the current index i, find the distance to n: d = |a[i] - n|.
The desired element will be at least d positions away, so jump d positions ahead.
Repeat the above until d = 0.
Yes, your approach will work.
Since the value can only increase or decrease by one at each following index, no value at an index closer than d positions can differ from the current value by d. So there's no way you can skip over the target value. And, unless the value is found, the distance will always be greater than 0, so you'll keep moving right. Thus, if the value exists, you'll find it.
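A minimal Java sketch of that jumping search (assuming adjacent elements differ by exactly 1):

static int find(int[] a, int target)
{
    int i = 0;
    while (i < a.length)
    {
        int d = Math.abs(a[i] - target);     // distance to the target value
        if (d == 0)
            return i;                        // found it
        i += d;                              // nothing closer than d steps away can match
    }
    return -1;                               // not present
}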
No, you can't do better than O(n) in the worst case.
Consider an array 1,2,1,2,1,2,1,2,1,2,1,2 and you're looking for 0. Any of the 2's can be changed to a 0 without having to change any of the other values, thus we have to look at all the 2's and there are n/2 = O(n) 2's.
Preprocessing can help here.
Find the minimum and maximum elements of the array in O(n) time.
Since consecutive elements differ by exactly one, every value between the minimum and the maximum occurs somewhere in the array. So if the queried element lies between the minimum and the maximum, it is present in the array; otherwise it is not. Any query then takes O(1) time, and if the array is queried multiple times, the amortized time per query will be far less than O(n).
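In Java, the whole thing fits in a few lines (a sketch; note it answers presence only, not an index, and relies on neighbors differing by exactly one):

static int[] preprocess(int[] a)                      // O(n), run once
{
    int min = a[0], max = a[0];
    for (int x : a)
    {
        min = Math.min(min, x);
        max = Math.max(max, x);
    }
    return new int[] { min, max };
}

static boolean contains(int[] minMax, int target)     // O(1) per query
{
    return minMax[0] <= target && target <= minMax[1];
}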
I want to sort a list of entries and then select a subset (a page) of that sorted list. For example: I have 10,000 items and want items 101 through 200.
A naive approach would be to first sort all 10,000 items and then select the page; items 1 - 100 and 201 - 10,000 would all be sorted unnecessarily.
Is there an existing algorithm that will only fully sort the items in the page and stops sorting an entry once it is clear it is not in the page? Source code in C would be great, but descriptions would also be OK.
Suppose you want items p through q out of n. While sorting would cost O(n·log n) time, the operation you mention can be done in O(n) time (so long as q−p ≪ n) as follows: apply an O(n)-time selection method to find the pᵗʰ and qᵗʰ values. Then keep only the items whose values lie between those two, in O(n+k) time where k = q−p, and sort those k items in O(k·log k) time, for a net time of O(n) when k is O(1).
Suppose the page you want starts with the nth "smallest" element (or largest or whatever ordinal scale you prefer). Then you need to divide your partial sorting algorithm into two steps:
Find the nth element
Sort elements {n, n+1, ..., n+s} (where s is the page size)
Quicksort is a sorting algorithm that can be conveniently modified to suit your needs. Basically, it works as follows:
Given: a list L of ordinally related elements.
If L contains exactly one element, return L.
Choose a pivot element p from L at random.
Divide L into two sets: A and B such that A contains all the elements from L which are smaller than p and B contains all the elements from L which are larger.
Apply the algorithm recursively to A and B to obtain the sorted sublists A' and B'.
Return the list A' || p || B', where || denotes appending lists or elements.
What you want to do in step #1, is run Quicksort until you've found the nth element. So step #1 will look like this:
Given: a list L of ordinally related elements, a page offset n and a page size s.
Choose a pivot element p from L at random.
Divide L into A and B.
If the size of A, #A = n-1, then return p || B.
If #A < n-1, then apply the algorithm recursively for L' = B and n' = n - #A - 1 (the pivot accounts for one more element smaller than the target)
If #A > n-1, then apply the algorithm recursively for L' = A and n' = n
This algorithm returns an unsorted list of elements starting with the nth element. Next, run Quicksort on this list, but keep ignoring the B sublists unless #A < s (in which case the tail of the page still lies inside B). At the end, you should have the s sorted elements whose ranks in the original list are n through n+s-1, i.e. exactly the requested page.
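Here is a Java sketch of that two-step approach (the method names are mine; n is a zero-based rank and the page is clipped at the end of the array):

static int[] pageSort(int[] a, int n, int s)
{
    int[] copy = java.util.Arrays.copyOf(a, a.length);
    int to = Math.min(copy.length, n + s);
    if (n < 0 || n >= to)
        return new int[0];                        // page starts past the end
    select(copy, 0, copy.length - 1, n);          // step 1: put the n-th element in place
    select(copy, n, copy.length - 1, to - 1);     // gather ranks n..to-1 into copy[n..to-1]
    int[] page = java.util.Arrays.copyOfRange(copy, n, to);
    java.util.Arrays.sort(page);                  // step 2: sort just the page, O(s log s)
    return page;
}

// Quickselect: afterwards a[k] holds the k-th smallest of a[lo..hi],
// with smaller elements to its left and larger ones to its right.
static void select(int[] a, int lo, int hi, int k)
{
    while (lo < hi)
    {
        int pivot = a[lo + (hi - lo) / 2];
        int i = lo, j = hi;
        while (i <= j)                            // Hoare-style partition
        {
            while (a[i] < pivot) i++;
            while (a[j] > pivot) j--;
            if (i <= j)
            {
                int t = a[i]; a[i] = a[j]; a[j] = t;
                i++; j--;
            }
        }
        if (k <= j)      hi = j;                  // target rank is in the left part
        else if (k >= i) lo = i;                  // target rank is in the right part
        else             return;                  // j < k < i: a[k] is already in place
    }
}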
The term you want to research is partial sorting. There is likely to be an implementation of it in C or any sufficiently popular language.
How can we reverse a subarray (say, from the i-th index to the j-th index) of an array (or of some other data structure, like a singly linked list) in less than O(n) time? The O(n) approach is trivial. I want to perform this reversal many times on the array (starting from the beginning and reversing it n times, each time moving forward one index and then reversing again), so there should be a way whose amortized analysis gives a time below O(n). Any ideas?
Thanks in advance :)
I think you want to solve this with the wrong approach. I guess you want to improve the algorithm as a whole, and not the O(n) reversing step, because that's not possible: you always need O(n) if you have to consider each of the n elements.
As I said, what you can do is improve the O(n^2) algorithm. You can solve that in O(n):
Let's say we have this list:
a b c d e
You then modify this list using your algorithm:
e d c b a
e a b c d
and so on. In the end you have this:
e a d b c
You can get this list if you have two pointers coming from both ends of the array and alternate between the pointers (increment/decrement/get value). Which gives you O(n) for the whole procedure.
More detailed explanation of this algorithm:
Using the previous list, we want the elements in the follow order:
a b c d e
2 4 5 3 1
So you create two pointers. One pointing at the beginning of the list, the other one at the end:
a b c d e
^ ^
p1 p2
Then the algorithms works as follows:
1. Take the value of p2
2. Take the value of p1
3. Move the pointer p2 one index back
4. Move the pointer p1 one index further
5. If they point to the same location, take the value of p1 and stop.
or if p1 has passed p2 then stop.
or else go to 1.
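In Java, that walk might look like this (a sketch; it produces the final order directly instead of performing the n reversals):

static char[] interleaveFromEnds(char[] a)
{
    char[] out = new char[a.length];
    int p1 = 0, p2 = a.length - 1, k = 0;
    while (p1 < p2)
    {
        out[k++] = a[p2--];          // step 1: take the value of p2, move it back
        out[k++] = a[p1++];          // step 2: take the value of p1, move it forward
    }
    if (p1 == p2)
        out[k] = a[p1];              // pointers met: take p1 and stop
    return out;                      // {a,b,c,d,e} -> {e,a,d,b,c}
}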
You can do it in O(n) time for a given array. Here l is the starting index and r the ending index, and we reverse the subarray from l to r.
public void reverse(int[] arr, int l, int r)
{
    int d = (r - l + 1) / 2;          // number of pairs to swap
    for (int i = 0; i < d; i++)
    {
        int t = arr[l + i];           // swap the i-th elements from each end
        arr[l + i] = arr[r - i];
        arr[r - i] = t;
    }
    // print array here
}
As duedl0r mentioned, O(n) is your minimum: you will have to move n items to their new positions.
Since you mentioned a linked list, here's an O(n) solution for that.
If you move through all nodes and reverse their direction, then tie the ends to the rest of the list, the sublist is reversed. So:
1->2->3->4->5->6->7->8->9
reversing 4 through 7 would change:
4->5->6->7
into:
4<-5<-6<-7
Then just let 3 point to 7 and let 4 point to 8.
Somewhat copying duedl0r's format for consistency:
1. Move to the item before the first item to reorder (n steps)
2. remember it as a (1 step)
3. Move to the next item (1 step)
4. Remember it as b (1 step)
while not at the last item to reorder: (m times)
5. Remember current item as c (1 step)
6. Go to next item (1 step)
7. Let the next item point to the previous item (1 step)
having reached the last item to reorder:
8. let item a point to item c (1 step)
if there is a next item:
9. move to next item (1 step)
10. let item b point to current item (1 step)
That's O(n+1+1+1+m*(1+1+1)+1+1+1).
Without all the numbers that aren't allowed in Big O, that's O(n+m), which may be called O(n+n), which may be called O(2n).
And that's O(n).
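A Java sketch of that pointer walk (Node is an assumed minimal type; i and j are zero-based, with 0 <= i <= j < length):

static class Node
{
    int value;
    Node next;
}

static Node reverseSublist(Node head, int i, int j)
{
    Node dummy = new Node();                  // handles i == 0 uniformly
    dummy.next = head;
    Node a = dummy;
    for (int s = 0; s < i; s++)
        a = a.next;                           // a = item before the sublist ("3")
    Node b = a.next;                          // b = first item of the sublist ("4")
    Node prev = null, cur = b;
    for (int s = i; s <= j; s++)              // reverse the pointers inside the sublist
    {
        Node next = cur.next;
        cur.next = prev;
        prev = cur;
        cur = next;
    }
    a.next = prev;                            // let 3 point to 7
    b.next = cur;                             // let 4 point to 8
    return dummy.next;
}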
We are given a string of the form: RBBR, where R - red and B - blue.
We need to find the minimum number of swaps required in order to club the colors together. In the above case that answer would be 1 to get RRBB or BBRR.
I feel like an algorithm to sort a partially sorted array would be useful here since a simple sort would give us the number of swaps, but we want the minimum number of swaps.
Any ideas?
This is allegedly a Microsoft interview question according to this.
Take one pass over the string and count the number of reds (#R) and the number of blues (#B). Then take a second pass, counting the number of reds among the first #R characters (#r) and the number of blues among the first #B characters (#b). The lesser of (#R - #r) and (#B - #b) is the minimum number of swaps needed.
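A quick Java sketch of that counting (the names are mine):

static int minSwaps(String s)
{
    int reds = 0;
    for (char c : s.toCharArray())
        if (c == 'R') reds++;
    int blues = s.length() - reds;

    int r = 0, b = 0;
    for (int i = 0; i < reds; i++)
        if (s.charAt(i) == 'R') r++;          // #r: reds already in the red prefix
    for (int i = 0; i < blues; i++)
        if (s.charAt(i) == 'B') b++;          // #b: blues already in the blue prefix

    // (#R - #r) swaps reach RR..RBB..B; (#B - #b) swaps reach BB..BRR..R.
    return Math.min(reds - r, blues - b);
}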
We are given the string S that we have to convert to the final string F = R^a B^b or B^b R^a. The number of differences between S and F should be even because for every misplaced R there will be a complementary misplaced B. So why not find the minimum number of differences between S and both possible F's and divide that by 2?
For example, you're given S = RBRRBRBR which should convert to
RRRRRBBB
or
BBBRRRRR
Comparing S and F character by character for each possibility, there are 4 differences for each possible final string, so either way the minimum is 2 swaps.
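Sketched in Java (assuming the string contains only 'R' and 'B'):

static int minSwapsByDiff(String s)
{
    int reds = 0;
    for (char c : s.toCharArray())
        if (c == 'R') reds++;
    int blues = s.length() - reds;

    int diffRB = 0, diffBR = 0;               // mismatches vs R^a B^b and vs B^b R^a
    for (int i = 0; i < s.length(); i++)
    {
        if (s.charAt(i) != (i < reds ? 'R' : 'B')) diffRB++;
        if (s.charAt(i) != (i < blues ? 'B' : 'R')) diffBR++;
    }
    return Math.min(diffRB, diffBR) / 2;      // each swap repairs two mismatches
}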
Let's look at your example. You know that the end state will be RRBB or BBRR. In other words, the end state is always nRmB or mBnR, where n is the number of R's and m is the number of B's in your string.
Since the end state is defined, maybe some sort of path-finding algorithm would be a good approach for this? How about considering each swap as a state change and thinking of a heuristic function to approximate the number of leftover swaps needed.
I'm just throwing an idea in the air, but I hope this helps.
Start with two indices simultaneously from the right and left end of the string. Advance the left index until you find an R. Advance the right index backwards until you find a B. Swap them. Repeat until the left index meets the right index, and count the swaps. Then, do the same, but look for B on the left and R on the right. The minimum is the lower of both swap counts.
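A rough Java version of that scan (countSwaps is my own helper name):

static int minSwapsTwoPointer(String s)
{
    return Math.min(countSwaps(s, 'R', 'B'),      // pushes R's right: BB..RR
                    countSwaps(s, 'B', 'R'));     // pushes B's right: RR..BB
}

static int countSwaps(String s, char left, char right)
{
    char[] a = s.toCharArray();
    int lo = 0, hi = a.length - 1, swaps = 0;
    while (true)
    {
        while (lo < hi && a[lo] != left)  lo++;   // misplaced character on the left
        while (lo < hi && a[hi] != right) hi--;   // misplaced character on the right
        if (lo >= hi)
            return swaps;
        char t = a[lo]; a[lo] = a[hi]; a[hi] = t; // swap the pair
        swaps++;
    }
}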
I think the number of swaps can be derived from the number of inversions required to sort the vector. This is an example of doing the same with a permutation vector.
This isn't a technical answer, but I looked at this more intuitively.
RRBBBBR can be reduced to RBR, since a group of identical colors can be moved as a single block. This means that the array is really just N sets of RB.
The only thing that matters is the number N of RB blocks (counting an incomplete block at the end as one).
RBR -> 1 swap to get to RRB (2 sets of RB blocks: RB and R)
RBRB-> 1 swap to get to RRBB (2 full sets of RB blocks)
RBRBRB-> 2 swaps to get to RRRBBB (3 full sets of RB blocks)
RBRBRBRB -> 4 sets of RB = 3 swaps
So to generalize this: the number of swaps needed is the number of RB blocks (including incomplete blocks), minus 1.