I encountered an interview question.
Given two sorted arrays.
Initially both have the same set of elements, but one element has been removed from one of the arrays.
Find the removed element.
The constraint is that we have to do it in place in O(log n).
For example:
arr1[]={1,2,3,8,12,16};
arr2[]={1,2,8,12,16};
element removed is 3
I am typing from mobile, so this is pseudocode, but you will get the idea:
Take arr1.len / 2, which is 3. Check arr1[3] and arr2[3]. If they are equal, the missing value is at an index greater than 3; otherwise it is at index 3 or earlier. Here we get 8 and 12, so the missing value is at or before index 3. We take index 3/2 = 1 and compare arr1[1] and arr2[1]. They are equal, so the missing value is after index 1 and at or before index 3. So it is arr1[2] = 3.
This is the idea: you are doing a binary search, dividing the search area in half every time. You take the left or right part of the array depending on the comparison. You just need to implement this and add some boundary checks, but I think the idea is clear.
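Here is a minimal sketch of that idea in Java (the class and method names are mine), assuming arr2 is arr1 with exactly one element removed:

public class RemovedElement {
    public static int find(int[] arr1, int[] arr2) {
        // assumes arr2 equals arr1 with exactly one element removed
        int lo = 0, hi = arr1.length - 1;     // removed element is arr1[i] for some i in [lo, hi]
        while (lo < hi) {
            int mid = (lo + hi) / 2;
            if (mid < arr2.length && arr1[mid] == arr2[mid]) {
                lo = mid + 1;                 // prefixes still match, so it is further right
            } else {
                hi = mid;                     // arrays already disagree, so it is at mid or left
            }
        }
        return arr1[lo];
    }
}

For the example above, find(new int[]{1,2,3,8,12,16}, new int[]{1,2,8,12,16}) returns 3.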
Related
Assume we have an array arr with all values initially 0. Now we are given n operations; each operation consists of two numbers a and b, and means that we add +1 to the value of arr[a] and -1 to the value of arr[b].
Moreover, we can swap the numbers in some operations, which means that we instead add -1 to arr[a] and +1 to arr[b].
We want to end up in a situation where all values of arr are still 0 after all the operations have been applied. We are wondering whether that is possible and, if so, which operations we should swap to achieve it.
Any thoughts on that?
Some example input:
3
1 2
3 2
3 1
should result in
YES
R
N
R
where R means to reverse that operation, and N means to leave it as is.
input:
3
1 2
2 3
3 2
results in answer NO.
Let each array index be a vertex in a graph, and let the operation (a, b) be an edge between vertices a and b (there may be multiple edges between the same pair of vertices). Traversing an edge from a to b corresponds to performing that operation in that direction: +1 to arr[a] and -1 to arr[b], i.e. every vertex you leave is incremented and every vertex you enter is decremented. Now if every vertex has an even number of edges and you find a cyclic path that visits each edge exactly once, every array element ends at zero, because at each vertex the number of times you leave equals the number of times you enter.
Such a path is called an Eulerian cycle (Wikipedia). From Wikipedia: an undirected graph has an Eulerian cycle if and only if every vertex has even degree, and all of its vertices with nonzero degree belong to a single connected component. Since in your case each connected component only needs its own Eulerian cycle, it is enough to count how many times each array index appears: if every count is even, there is always a way to make the whole array zero.
If you want to find out which operations to reverse, you need to find one such cycle per component and check in which direction you travel each edge.
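A sketch of that in Java, using Hierholzer's algorithm to walk an Eulerian circuit in each component and recording the direction in which each edge is first traversed (class and variable names are mine; it returns null when some vertex has odd degree, i.e. the answer is NO). Note it may produce a different but equally valid R/N assignment than the sample above:

import java.util.*;

public class ZeroSumOperations {
    public static char[] solve(int[][] ops) {
        int n = ops.length, maxV = 0;
        for (int[] op : ops) maxV = Math.max(maxV, Math.max(op[0], op[1]));
        List<List<int[]>> adj = new ArrayList<>();      // vertex -> list of {edge id, other endpoint}
        for (int v = 0; v <= maxV; v++) adj.add(new ArrayList<>());
        for (int i = 0; i < n; i++) {
            adj.get(ops[i][0]).add(new int[]{i, ops[i][1]});
            adj.get(ops[i][1]).add(new int[]{i, ops[i][0]});
        }
        for (int v = 1; v <= maxV; v++)                 // feasible iff every degree is even
            if (adj.get(v).size() % 2 != 0) return null;
        boolean[] used = new boolean[n];
        char[] result = new char[n];
        int[] next = new int[maxV + 1];                 // per-vertex cursor into its adjacency list
        for (int start = 1; start <= maxV; start++) {
            Deque<Integer> stack = new ArrayDeque<>();
            stack.push(start);
            while (!stack.isEmpty()) {
                int v = stack.peek();
                while (next[v] < adj.get(v).size() && used[adj.get(v).get(next[v])[0]])
                    next[v]++;                          // skip edges traversed earlier
                if (next[v] == adj.get(v).size()) { stack.pop(); continue; }
                int[] e = adj.get(v).get(next[v]);
                used[e[0]] = true;
                // we traverse v -> e[1]; keep the operation (N) if that matches
                // its written direction a -> b, otherwise reverse it (R)
                result[e[0]] = (ops[e[0]][0] == v) ? 'N' : 'R';
                stack.push(e[1]);
            }
        }
        return result;
    }
}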
Just count the number of times the indices appear. If all indices appear in even numbers, then the answer is YES.
You can prove it by construction. You will need to build a pair list from the original pair list. The goal is to build the list such that you can match every index that appears on the left with an index that appears on the right.
Go from the first pair to the last. For each pair, try to use it to balance an index that has so far appeared an odd number of times, swapping the pair if needed.
For example, in your first example each index appears twice, so the answer is YES. To build the list, you start with (1,2). Then you look at the pair (3,2): you know that 2 has appeared once, on the right, so you swap the pair to put 2 on the left: (2,3). For the last pair you have (3,1), which matches 1 and 3, each of which has appeared only once so far.
Note that at the end you can always find a match, because each number appears an even number of times, so every occurrence has a partner.
In the second example, 2 appears three times. So the answer is NO.
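The YES/NO test from this answer is only a few lines; a sketch in Java (names are mine):

import java.util.*;

public class ParityCheck {
    public static boolean feasible(int[][] ops) {
        Map<Integer, Integer> count = new HashMap<>();
        for (int[] op : ops) {
            count.merge(op[0], 1, Integer::sum);
            count.merge(op[1], 1, Integer::sum);
        }
        for (int c : count.values())
            if (c % 2 != 0) return false;   // some index appears an odd number of times
        return true;                        // every index appears an even number of times
    }
}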
I have an m x m two-dimensional array and I want to randomly select a sequence of n elements. The elements have to be adjacent (not diagonally). What is a good approach here? I thought about a depth-first search from a random starting point, but that seemed a little bit of an overkill for such a simple problem.
If I get this right, you are looking for a sequence of consecutive numbers?
Let me simplify with an example:
9 4 3
0 7 2
5 6 1
So when 1 is selected, you'd like to have a path from 1 to 4, right? I personally think that depth-first search would be the best choice. It's not that hard; it's actually pretty simple. Imagine you select the number 2. You remember the position of 2, and then you look for lower numbers for as long as there are any. When you are done with that part, you do the same for higher numbers.
You have two stacks: one for possible ways, and another for the final path.
While going through the array, you pop from the possibilities and push the right ones onto the final stack.
The best approach would be to find the lowest reachable number without saving anything, and then just look for higher numbers, storing them, so at the end you get a stack from the highest number down to the lowest.
If I got that wrong and you mean just selecting elements that are "touching", like 9 0 7 6 from my table (i.e. the content doesn't matter), then you can do it simply: pick one element, store all the possibilities (every element around it), and then pick a random index from 0 to the number of stored values. When you select one, remove it from the stored values but keep the rest. Then repeat this with the new element, adding its neighbours to the stored values, so the random pick always draws from the cells surrounding the ones already selected.
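A sketch of that second interpretation in Java (names are mine): grow a random connected patch of n cells by keeping a pool of cells adjacent to what has been picked so far. One caveat: this is simple and guarantees connectivity, but the resulting shape is not uniformly distributed over all possible connected shapes.

import java.util.*;

public class RandomPatch {
    // returns n adjacency-connected cells as {row, col} pairs, or null if n > m*m
    public static List<int[]> pick(int m, int n, Random rng) {
        if (n > m * m) return null;
        boolean[][] chosen = new boolean[m][m];
        List<int[]> patch = new ArrayList<>();
        List<int[]> pool = new ArrayList<>();                // candidate cells bordering the patch
        pool.add(new int[]{rng.nextInt(m), rng.nextInt(m)}); // random starting point
        int[][] dirs = {{1, 0}, {-1, 0}, {0, 1}, {0, -1}};
        while (patch.size() < n) {
            int[] cell = pool.remove(rng.nextInt(pool.size()));
            if (chosen[cell[0]][cell[1]]) continue;          // a cell can be queued more than once
            chosen[cell[0]][cell[1]] = true;
            patch.add(cell);
            for (int[] d : dirs) {                           // add the four (non-diagonal) neighbours
                int r = cell[0] + d[0], c = cell[1] + d[1];
                if (r >= 0 && r < m && c >= 0 && c < m && !chosen[r][c])
                    pool.add(new int[]{r, c});
            }
        }
        return patch;
    }
}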
Suppose you have an array of integers (e.g. [1 5 3 4 6]). The elements are rearranged according to the following rule: every element can hop forward (towards the left) and slide the elements at the indices over which it hopped. The process starts with the element at the second index (i.e. 5). It has a choice to hop over element 1 or to stay in its own position. If it does choose to hop, element 1 slides down to index 2. Let us assume it does choose to hop, so our resultant array is then [5 1 3 4 6]. Element 3 can now hop over one or two positions, and the process repeats. If 3 hops over one position the array becomes [5 3 1 4 6], and if it hops over two positions it becomes [3 5 1 4 6].
It is very easy to show that every permutation of the elements can be reached this way. Also, any final configuration can be reached by a unique sequence of hops.
The question is: given a final array and a source array, find the total number of hops required to arrive at the final array from the source. An O(N^2) implementation is easy to find, but I believe this can be done in O(N) or O(N log N). If it is not possible to do better than O(N^2), it would also be good to know that.
For example if the final array is [3,5,1,4,6] and the source array [1,5,3,4,6], the answer will be 3.
My O(N^2) algorithm is like this: loop over the positions of the source array from the end, since we know that the last element is the last to move. Here it is 6; we check its position in the final array, calculate the number of hops necessary, and rearrange the final array to put that element back at its original position in the source array. The rearranging step goes over all the elements in the array, and the process loops over all the elements, hence O(N^2). A hashmap can help with the searching, but the map needs to be updated after every step, which keeps it at O(N^2).
P.S. I am trying to model correlation between two permutations in a Bayesian way and this is a sub-problem of that. Any ideas on modifying the generative process to make the problem simpler is also helpful.
Edit: I have found my answer. This is exactly what Kendall Tau distance does. There is an easy merge sort based algorithm to find this out in O(NlogN).
Consider the target array as an ordering. A target array [2 5 4 1 3] can be seen as [1 2 3 4 5], just by relabeling. You only have to know the mapping to be able to compare elements in constant time. On this instance, to compare 4 and 5 you check: index[4]=2 > index[5]=1 (in the target array) and so 4 > 5 (meaning: 4 must be to the right of 5 at the end).
So what you really have is just a vanilla sorting problem. The ordering is just different from the usual numerical ordering. The only thing that changes is your comparison function. The rest is basically the same. Sorting can be achieved in O(nlgn), or even O(n) (radix sort). That said, you have some additional constraints: you must sort in-place, and you can only swap two adjacent elements.
A strong and simple candidate would be selection sort, which will do just that in O(n^2) time. On each iteration, you identify the "leftmost" remaining element in the "unplaced" portion and swap it until it lands at the end of the "placed" portion. It can be improved to O(nlgn) with the use of an appropriate data structure (a priority queue for identifying the "leftmost" remaining element in O(lgn) time). Since nlgn is a lower bound for comparison-based sorting, I really don't think you can do better than that.
Edit: So you're not interested in the sequence of swaps at all, only the minimum number of swaps required. This is exactly the number of inversions in the array (adapted to your particular needs: "non natural ordering" comparison function, but it doesn't change the maths). See this answer for a proof of that assertion.
One way to find the number of inversions is to adapt the Merge Sort algorithm. Since you have to actually sort the array to compute it, it turns out to be still O(nlgn) time. For an implementation, see this answer or this (again, remember that you'll have to adapt).
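For concreteness, here is a sketch of that in Java (names are mine): relabel each source element by its index in the target, then count inversions with a merge sort. It assumes both arrays contain the same distinct elements.

import java.util.*;

public class HopCount {
    public static long hops(int[] source, int[] target) {
        Map<Integer, Integer> pos = new HashMap<>();
        for (int i = 0; i < target.length; i++) pos.put(target[i], i);
        int[] ranks = new int[source.length];                // source relabeled in target order
        for (int i = 0; i < source.length; i++) ranks[i] = pos.get(source[i]);
        return countInversions(ranks, 0, ranks.length);
    }

    // counts pairs i < j with a[i] > a[j] in a[lo..hi) while sorting that range
    private static long countInversions(int[] a, int lo, int hi) {
        if (hi - lo < 2) return 0;
        int mid = (lo + hi) / 2;
        long count = countInversions(a, lo, mid) + countInversions(a, mid, hi);
        int[] merged = new int[hi - lo];
        int i = lo, j = mid, k = 0;
        while (i < mid && j < hi) {
            if (a[i] <= a[j]) merged[k++] = a[i++];
            else { count += mid - i; merged[k++] = a[j++]; } // a[i..mid) all invert with a[j]
        }
        while (i < mid) merged[k++] = a[i++];
        while (j < hi) merged[k++] = a[j++];
        System.arraycopy(merged, 0, a, lo, merged.length);
        return count;
    }
}

hops(new int[]{1,5,3,4,6}, new int[]{3,5,1,4,6}) returns 3, matching the example above.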
From your answer I assume the number of hops is the total number of swaps of adjacent elements needed to transform the original array into the final array.
I suggest using something like insertion sort, but without the insertion part; the data in the arrays will not be altered.
You can implement the queue t of stalled hoppers as a balanced binary search tree with counters (the number of elements in each subtree).
You can add an element to t, remove an element from t, rebalance t, and find an element's position in t in O(log C) time, where C is the number of elements in t.
A few words on finding the position of an element: it is a descent from the root that accumulates the counts of the left subtrees you skip (plus 1 for each middle element, if you keep elements on branches).
A few words on balancing, addition, and removal: you have to walk upward from the removed/added element/subtree and update the counters. The overall number of operations still stays at O(log C) for insert+balance and remove+balance.
Let t be that (balanced search tree) queue, let p be the current index in the original array a, and let q be the current index in the final array f.
Now we run one loop starting from the left side (say, p = 0, q = 0):
If a[p] == f[q], then the original array element hops over the whole queue. Add t.count to the answer, increment p, increment q.
If a[p] != f[q] and f[q] is not in t, then insert a[p] into t and increment p.
If a[p] != f[q] and f[q] is in t, then add f[q]'s position in queue to answer, remove f[q] from t and increment q.
I like the magic that ensures this process moves p and q to the ends of the arrays at the same time, provided the arrays really are permutations of one another. Nevertheless, you should probably check p and q for running past the ends to detect incorrect data, as we have no substantially faster way to prove the data is correct.
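A sketch of this loop in Java (names are mine). In place of a balanced BST with subtree counters, it uses a Fenwick tree over insertion timestamps, which supports the same insert / remove / position queries in O(log n); it assumes the arrays are permutations of distinct values.

import java.util.*;

public class QueueHops {
    public static long hops(int[] a, int[] f) {
        int n = a.length;
        long answer = 0;
        int[] bit = new int[n + 1];                    // Fenwick tree: +1 per element still queued
        Map<Integer, Integer> stamp = new HashMap<>(); // queued value -> its insertion timestamp
        int p = 0, q = 0, time = 0, inQueue = 0;
        while (q < n) {
            if (p < n && a[p] == f[q]) {
                answer += inQueue;                     // hops over the whole queue
                p++; q++;
            } else if (p < n && !stamp.containsKey(f[q])) {
                stamp.put(a[p], ++time);               // stall a[p] in the queue
                update(bit, time, +1); inQueue++; p++;
            } else {
                int t = stamp.remove(f[q]);            // f[q] leaves the queue
                answer += prefix(bit, t - 1);          // its position = elements queued ahead of it
                update(bit, t, -1); inQueue--; q++;
            }
        }
        return answer;
    }
    private static void update(int[] bit, int i, int d) {
        for (; i < bit.length; i += i & -i) bit[i] += d;
    }
    private static int prefix(int[] bit, int i) {
        int s = 0;
        for (; i > 0; i -= i & -i) s += bit[i];
        return s;
    }
}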
This is related to problem 5.7 (under Bit Manipulation) in the 5th edition of Cracking the Coding Interview (if this question is not appropriate for SO, please let me know the correct site and I'll move it):
An array A[1..n] contains all the integers from 0 to n except for one number which is missing. In this problem, we cannot access an entire integer in A with a single operation. The elements of A are represented in binary, and the only operation we can use to access them is “fetch the jth bit of A[i]”, which takes constant time. Write code to find the missing integer. Can you do it in O(n) time?
The algorithm applied is this:
1. Check the LSBs of all the numbers in the list and count the occurrences of 1s and 0s among them.
2. If count(0) <= count(1), the LSB of the missing number is 0; else it is 1.
3. Remove all numbers whose LSB does not match the result found in step 2.
4. Repeat steps 1 to 3, progressively checking the next LSB in each iteration.
Can someone explain the logic behind step 3? It basically removes all odd/even numbers from the current list (depending on the bit found for the missing number) and uses the modified list in the next iteration. Why do we do this?
Step 3 is meant to (vastly) improve the runtime of the algorithm. With step 3 included, the overall algorithm is a binary search using the LSB as the branching decider. If step 3 is omitted, it is still a binary search, but one whose per-pass work never shrinks, so the total time exceeds the O(n) bound.
Incidentally, as written it seems like there's a bit shift missing, or the term LSB is being used in a rather liberal way.
The algorithm does not work. If the number of elements in the array is even, then step 2 may find an equal number of 0s and 1s. This can happen, for example, when the missing number is at one end or the other of the range.
If the number of elements in the array is not even, it will be even on the next iteration, after step 3.
ADDENDUM
Here is an example.
Set: 0, 1, 3
After steps 1 & 2, we have one number with LSB=0 and two with LSB=1. So, according to step 2, the LSB of the missing number must be 0. So far, so good.
After step 3, removing the values with LSB=1 from the set, we have:
Set: 0
After steps 1 & 2, we have one number with LSB'=0 and none with LSB'=1. So, according to step 2, the LSB' of the missing number must be 1.
At this point we're done (the set is empty after removing elements with LSB'=0), and we have identified 2 as the answer (LSB' = 1, LSB = 0).
How does this work?
After every iteration, think about shifting the values in the array right by one bit to discard the bit determined in the previous iteration. This will create duplicates in the array for all values except the missing one. The algorithm is simply throwing away these duplicates.
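Here is a sketch of the whole algorithm in Java (names are mine). The problem's "fetch the jth bit of A[i]" primitive is simulated with an ordinary shift and mask. Because the candidate list is (roughly) halved on each pass, the total work is n + n/2 + n/4 + ... = O(n).

import java.util.*;

public class MissingNumber {
    static int fetchBit(int[] a, int i, int j) {  // stands in for the constant-time primitive
        return (a[i] >> j) & 1;
    }

    // a contains 0..n with exactly one value missing (a.length == n)
    public static int find(int[] a) {
        List<Integer> candidates = new ArrayList<>();
        for (int i = 0; i < a.length; i++) candidates.add(i);
        int missing = 0;
        for (int j = 0; !candidates.isEmpty(); j++) {
            List<Integer> zeros = new ArrayList<>(), ones = new ArrayList<>();
            for (int i : candidates) {
                if (fetchBit(a, i, j) == 0) zeros.add(i); else ones.add(i);
            }
            if (zeros.size() <= ones.size()) {
                candidates = zeros;               // missing number has a 0 in bit j (step 3 keeps the 0s)
            } else {
                missing |= 1 << j;                // missing number has a 1 in bit j
                candidates = ones;
            }
        }
        return missing;
    }
}

On the addendum's set {0, 1, 3}, the first pass keeps {0} and records bit 0 = 0; the second pass keeps nothing and records bit 1 = 1, giving 2.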
I have the problem where given an array, A[1...n] of n (not necessarily distinct) integers, find an algorithm to determine whether any item occurs more than ceiling of n/4 times in A.
It seems that the best-possible worst case time is O(n). I am aware of the majority element algorithm and am wondering if this may be applied to this situation. Please let me know if you have any suggestions for approaching this problem.
This is only an idea of an algorithm but I believe it should be possible to make it work.
The main trick is as follows. If you look at any four elements and they are all different, you may throw all four away. (If any of the discarded elements had more than 1/4 frequency in the old array, it still will in the new array; if none had, none will.)
So you go over an array, throwing away tuples of four, and rearranging the rest. For instance, if you have AABC and then DDEF, you rearrange to AADDBCEF and throw BCEF away. I will let you work out the details, it's not hard. In the end you should be left with pairs of identical elements. Then you throw odd-numbered elements away and repeat.
After each run you may be left with 1, 2 or 3 unpaired elements that you cannot throw away. No worry: you can combine the leftovers of two runs so that there are never more than 3 elements in the leftover pile. E.g. if after run 1 you have A, B, C and after run 2 you have A, D, E, you keep just A. Remember that elements from the second run count twice, so in effect you have 3 copies of A, which is more than 1/4 of the total of 9. Keep a count for each leftover element to track which of them can be thrown away. (You might be able to just always keep the last leftovers; I have not checked that.)
In the end you will have just the leftovers. Check each one against the original array.
There are three possibilities for such an element: either it is the median of the array, or it is the median of the n/2 smallest elements, or it is the median of the n/2 largest elements.
In all cases, first find the median of the array (with a linear-time selection algorithm). After that, check whether it occurs often enough; if not, divide the array into two parts of almost the same size (they differ in size by at most one element):
elements smaller than or equal to the median
elements bigger than or equal to the median
Then find the median of each of these two subarrays and check the number of occurrences of each of them. This is O(n) in all. Also, in this way you can find all elements with at least n/4 occurrences.
By the way, you can extend this technique to find any element whose number of occurrences exceeds any fixed fraction of n (e.g. n/10000), again in O(n) time.
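A sketch in Java (names are mine). Equivalently to the three medians above, any value occurring more than ceil(n/4) times must sit at rank n/4, n/2, or 3n/4 in sorted order, so those three order statistics are the only candidates. Randomized quickselect gives expected O(n); substitute median-of-medians for a worst-case O(n) guarantee.

import java.util.*;

public class QuarterMajority {
    // returns a value occurring more than ceil(n/4) times, or null if none exists
    public static Integer find(int[] a) {
        int n = a.length;
        if (n == 0) return null;
        int need = (n + 3) / 4;                          // ceil(n/4)
        for (int rank : new int[]{n / 4, n / 2, (3 * n) / 4}) {
            int candidate = select(a.clone(), rank);     // clone: selection reorders its input
            int count = 0;
            for (int x : a) if (x == candidate) count++;
            if (count > need) return candidate;
        }
        return null;
    }

    // k-th smallest (0-based) by randomized quickselect, expected O(n)
    private static int select(int[] a, int k) {
        Random rng = new Random();
        int lo = 0, hi = a.length - 1;
        while (lo < hi) {
            int pivot = a[lo + rng.nextInt(hi - lo + 1)];
            int i = lo, j = hi;
            while (i <= j) {                             // Hoare-style partition around pivot
                while (a[i] < pivot) i++;
                while (a[j] > pivot) j--;
                if (i <= j) { int t = a[i]; a[i] = a[j]; a[j] = t; i++; j--; }
            }
            if (k <= j) hi = j;
            else if (k >= i) lo = i;
            else return a[k];                            // a[k] == pivot here
        }
        return a[lo];
    }
}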