Could you please help me with this one? :
"Let A and B are an incrementally ordered arrays of natural numbers and K be some arbitrary natural number. Find an effective algorithm which determines all possible pairs of indexes (i,j) such that A[i]+B[j]=K. Prove algorithm's correctness and estimate its complexity."
Should I just iterate over the first array and do a binary search on the other one?
Thanks :)
No!
Both arrays are ordered, so you do the following:
Place an iterator itA at the beginning of A.
Place an iterator itB at the end of B.
Move the iterators toward each other, testing *itA + *itB at each step. If the sum equals K, report both indexes. If it is smaller than K, increment itA; otherwise, decrement itB.
When either iterator runs off its array, you are done, in linear time.
Since, for every A[i], there can only be one matching B[j], you can find the solution with O(n+m) complexity. You can rely on the fact that if (A[i1], B[j1]) and (A[i2], B[j2]) are both correct pairs, and i1 is less than i2, then j1 must be greater than j2. Hope this helps.
I don't know if it helps; it's just an idea. Loop over A linearly and binary search B, but traverse A backwards. This might give you a better best case by letting you exclude some portion of B at each step through A.
If you know that A[i] needed, say, B[42] to sum to K, then you know A[i-1] will need at least B[43] or higher.
EDIT: I would also like to add that if B has fewer elements than A, turn it around and loop over B linearly instead.
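A sketch of that idea, assuming strictly increasing arrays (so each A[i] matches at most one B[j]); the name find_pairs_backwards and the lo bookkeeping are mine, not part of the original suggestion:

#include <algorithm>
#include <iostream>
#include <vector>

// Scan A backwards; the value we need from B only grows, so the left
// edge lo of the live part of B never moves left, and each binary
// search runs over a shrinking range.
void find_pairs_backwards(const std::vector<int>& A,
                          const std::vector<int>& B, int K)
{
    std::size_t lo = 0;                   // left edge of the live part of B
    for (std::size_t i = A.size(); i-- > 0; )
    {
        int need = K - A[i];              // grows as A[i] shrinks
        auto it = std::lower_bound(B.begin() + lo, B.end(), need);
        if (it == B.end()) break;         // smaller A[i] would need even more
        lo = it - B.begin();              // exclude this portion of B from now on
        if (*it == need)
            std::cout << i << "," << lo << std::endl;
    }
}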
A possible implementation of the two-pointer approach in C++ can look as follows:
#include <iostream>

int main()
{
    int A[] = {1, 2, 3, 6, 7, 8, 9};
    int B[] = {0, 2, 4, 5, 6, 7, 8, 12};
    int K = 9;
    int sizeA = sizeof A / sizeof(int);
    int sizeB = sizeof B / sizeof(int);
    int i = 0;
    int j = sizeB - 1;
    while (i < sizeA && j >= 0)
    {
        if ((A[i] + B[j]) == K) {
            std::cout << i << "," << j << std::endl;
            i++;
            j--;
        }
        else if ((A[i] + B[j]) < K) {
            i++;
        }
        else {
            j--;
        }
    }
    return 0;
}
Related
I have been attempting to solve the following problem:
You are given an array of n+1 integers, all of whose elements lie in [1,n]. You are also told that one of the elements is duplicated a certain number of times, whilst the others are distinct. Develop an algorithm to find both the duplicated number and the number of times it is duplicated.
Here is my solution where I let k = number of duplications:
#include <vector>

struct LatticePoint { // to hold duplicate and k
    int a;
    int b;
    LatticePoint(int a_, int b_) : a(a_), b(b_) {}
};

LatticePoint findDuplicateAndK(const std::vector<int>& A) {
    int n = A.size() - 1;
    std::vector<int> Numbers(n);
    for (int i = 0; i < n + 1; ++i) {
        ++Numbers[A[i] - 1]; // A[i] in range [1,n], so no out-of-bounds access
    }
    int i = 0;
    while (i < n) {
        if (Numbers[i] > 1) {
            int duplicate = i + 1;
            int k = Numbers[i] - 1;
            return LatticePoint(duplicate, k);
        }
        ++i;
    }
    return LatticePoint(-1, 0); // never reached if the input is as promised
}
So, the basic idea is this: we go along the array, and each time we see the number A[i] we increment the value of Numbers[A[i] - 1]. Since only the duplicate appears more than once, the index of the entry of Numbers with value greater than 1 gives the duplicate number, and that entry's value minus one is the number of duplications. This algorithm is O(n) in time and O(n) in space.
I was wondering if someone had a solution that is better in time and/or space? (or indeed if there are any errors in my solution...)
You can reduce the scratch space to n bits instead of n ints, provided you either have or are willing to write a bitset with run-time specified size (see boost::dynamic_bitset).
You don't need to collect duplicate counts until you know which element is duplicated, and then you only need to keep that count. So all you need to track is whether you have previously seen the value (hence, n bits). Once you find the duplicated value, set count to 2 and run through the rest of the vector, incrementing count each time you hit an instance of the value. (You initialise count to 2, since by the time you get there, you will have seen exactly two of them.)
That's still O(n) space, but the constant factor is a lot smaller.
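A minimal sketch of this, using std::vector<bool> in place of boost::dynamic_bitset and assuming, as the problem states, that the values lie in [1, n]; the function name is mine:

#include <utility>
#include <vector>

// Track only "seen before?" (n bits); once the duplicate shows up, count
// the remaining instances. The problem guarantees a duplicate exists, so
// the first loop terminates before running off the end.
std::pair<int, int> find_dup_bits(const std::vector<int>& a)
{
    const std::size_t n = a.size() - 1;
    std::vector<bool> seen(n + 1, false); // n bits of scratch space
    std::size_t i = 0;
    while (!seen[a[i]]) {                 // stop at the first repeat
        seen[a[i]] = true;
        ++i;
    }
    int dup = a[i];
    int count = 2;                        // we have now seen exactly two
    for (++i; i < a.size(); ++i)
        if (a[i] == dup) ++count;
    return std::make_pair(dup, count);
}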
The idea of your code works.
But, thanks to the n+1 elements, we can achieve other tradeoffs of time and space.
If we divide the numbers between some buckets by value range, then putting n+1 numbers into ranges that together cover only n distinct values means some bucket has to wind up with more numbers than distinct values it covers. This is a variant of the well-known pigeonhole principle.
So we use 2 buckets, one for the range 1..floor(n/2) and one for floor(n/2)+1..n. After one pass through the array, we know which half the answer is in. We then divide that half into halves, make another pass, and so on. This leads to a binary search which will get the answer with O(1) data, and with ceil(log_2(n)) passes, each taking time O(n). Therefore we get the answer in time O(n log(n)).
Now we don't need to use 2 buckets. If we used 3, we'd take ceil(log_3(n)) passes. So as we increased the fixed number of buckets, we take more space and save time. Are there other tradeoffs?
Well, you showed how to do it in 1 pass with n buckets. How many buckets do you need to do it in 2 passes? The answer turns out to be at least sqrt(n) buckets. And 3 passes is possible with the cube root. And so on.
So you get a whole family of tradeoffs where the more buckets you have, the more space you need, but the fewer passes you make. And your solution is merely at the extreme end, taking the most space and the least time.
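Here is a sketch of the 2-bucket version, which amounts to a binary search over the value range; the function name is mine:

#include <utility>
#include <vector>

// O(1) extra space, ceil(log2(n)) counting passes of O(n) each, then one
// final O(n) pass to count the duplicate itself.
std::pair<int, int> find_dup_buckets(const std::vector<int>& a)
{
    int lo = 1, hi = static_cast<int>(a.size()) - 1; // values lie in [1, n]
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        int low_count = 0;                // bucket for the range [lo, mid]
        for (int x : a)
            if (lo <= x && x <= mid) ++low_count;
        if (low_count > mid - lo + 1)     // pigeonhole: too many for that range
            hi = mid;
        else
            lo = mid + 1;
    }
    int count = 0;
    for (int x : a)
        if (x == lo) ++count;
    return std::make_pair(lo, count);
}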
Here's a cheekier algorithm, which requires only constant space but rearranges the input vector. (It only reorders; all the original elements are still present at the end.)
It's still O(n) time, although that might not be completely obvious.
The idea is to try to rearrange the array so that A[i] is i, until we find the duplicate. The duplicate will show up when we try to put an element at the right index and it turns out that that index already holds that element. With that, we've found the duplicate; we have a value we want to move to A[j] but the same value is already at A[j]. We then scan through the rest of the array, incrementing the count every time we find another instance.
#include <utility>
#include <vector>
std::pair<int, int> count_dup(std::vector<int> A) {
/* Try to put each element in its "home" position (that is,
* where the value is the same as the index). Since the
* values start at 1, A[0] isn't home to anyone, so we start
* the loop at 1.
*/
int n = A.size();
for (int i = 1; i < n; ++i) {
while (A[i] != i) {
int j = A[i];
if (A[j] == j) {
/* j is the duplicate. Now we need to count them.
* We have one at i. There's one at j, too, but we only
* need to add it if we're not going to run into it in
* the scan. And there might be one at position 0. After that,
* we just scan through the rest of the array.
*/
int count = 1;
if (A[0] == j) ++count;
if (j < i) ++count;
for (++i; i < n; ++i) {
if (A[i] == j) ++count;
}
return std::make_pair(j, count);
}
/* This swap can only happen once per element. */
std::swap(A[i], A[j]);
}
}
/* If we get here, every element from 1 to n is at home.
* So the duplicate must be A[0], and the duplicate count
* must be 2.
*/
return std::make_pair(A[0], 2);
}
A parallel solution with O(1) complexity is possible.
Introduce an array of atomic booleans and two atomic integers called duplicate and count. First set count to 1. Then access the array in parallel at the index positions of the numbers and perform a test-and-set operation on the boolean. If a boolean is set already, assign the number to duplicate and increment count.
This solution may not always perform better than the suggested sequential alternatives. Certainly not if all numbers are duplicates. Still, it has constant complexity in theory. Or maybe linear complexity in the number of duplicates. I am not quite sure. However, it should perform well when using many cores and especially if the test-and-set and increment operations are lock-free.
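A minimal sketch of that idea with std::atomic and a chunk-per-thread split; every name here (find_dup_parallel, worker, pool) is illustrative rather than part of the answer:

#include <algorithm>
#include <atomic>
#include <thread>
#include <utility>
#include <vector>

// Each thread test-and-sets the flag for every value in its chunk; the
// thread that loses the race on the duplicated value records it and bumps
// the shared count.
std::pair<int, int> find_dup_parallel(const std::vector<int>& a)
{
    const std::size_t n = a.size() - 1;
    std::vector<std::atomic<bool>> seen(n + 1);
    for (auto& b : seen) b.store(false);      // explicit init, pre-C++20 safe
    std::atomic<int> duplicate{0};
    std::atomic<int> count{1};                // the first instance counts as 1

    auto worker = [&](std::size_t lo, std::size_t hi) {
        for (std::size_t i = lo; i < hi; ++i)
            if (seen[a[i]].exchange(true)) {  // test-and-set on the flag
                duplicate.store(a[i]);
                count.fetch_add(1);           // one more instance found
            }
    };

    unsigned nthreads = std::max(1u, std::thread::hardware_concurrency());
    std::size_t chunk = (a.size() + nthreads - 1) / nthreads;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t) {
        std::size_t lo = t * chunk;
        if (lo < a.size())
            pool.emplace_back(worker, lo, std::min(a.size(), lo + chunk));
    }
    for (auto& th : pool) th.join();
    return std::make_pair(duplicate.load(), count.load());
}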
The problem is to check two arrays for matching integer values and put the matching values in a new array.
Let's say I have two arrays
a[n] = {2,5,2,7,8,4,2}
b[m] = {1,2,6,2,7,9,4,2,5,7,3}
Each array can be a different size.
I need to check if the arrays have matching elements and put them in a new array. The result in this case should be:
array[] = {2,2,2,5,7,4}
And I need to do it in O(n.log(n) + m.log(m)).
I know there is a way to do it with merge sort, or by putting one of the arrays in a hash table, but I really don't know how to implement it.
I would really appreciate your help, thanks!
As you have already figured out, you can use merge sort (implementing it is beyond the scope of this answer; you can find a solution on Wikipedia or by searching on Stack Overflow) to get nlogn + mlogm complexity, where n is the size of the first array and m the size of the other.
Let's call the first array a (with size n) and the second one b (with size m). First sort these arrays (merge sort gives us nlogn + mlogm complexity). Now we have:
a[n] // {2,2,2,4,5,7,8} and b[m] // {1,2,2,2,3,4,5,6,7,7,9}
Supposing n <= m, we can simply iterate simultaneously, comparing corresponding values.
But first let's allocate an array int c[n]; to store the results (you can print to the console instead of storing them if you need). And now the loop itself:
int k = 0; // store the new size of c array!
for (int i = 0, j = 0; i < n && j < m; )
{
    if (a[i] == b[j])
    {
        // match found, store it
        c[k] = a[i];
        ++i; ++j; ++k;
    }
    else if (a[i] > b[j])
    {
        // current value in a is leading, go to next in b
        ++j;
    }
    else
    {
        // the last possibility is a[i] < b[j] - b is leading
        ++i;
    }
}
Note: the loop itself is n+m complexity at worst (remember the n <= m assumption), which is less than the sorting, so the overall complexity is nlogn + mlogm. Now you can iterate over the c array (its allocated size is n, but the number of elements actually in it is k) and do what you need with those numbers.
From the way that you explain it, the way to do this would be to loop over the shorter array and check each element against the longer array. Let us assume that A is the shorter array and B the longer one. Create a results array C.
Loop over each element in A; call it I.
If I is found in B, remove it from B, put it in C, and break out of the test loop.
Now go to the next element in A.
This means that if a number I is found twice in A and three times in B, then I will only appear twice in C. Once you finish, then every number found in both arrays will appear in C the number of times that it actually appears in both.
I am carefully not putting in suggested code as your question is about a method that you can use. You should figure out the code yourself.
I would be inclined to take the following approach:
1) Sort array B. There are many well published sort algorithms to do this, as well as several implementations in various generally available libraries.
2) Loop through array A and for each element do a binary search (or other suitable algorithm) on array B for a match. If a match is found, remove the element from array B (to avoid future matches) and add it to the output array.
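A sketch of this approach; here a std::multiset (a balanced tree, so lookup and removal are O(log m)) stands in for the manually sorted copy of B, and the function name is mine:

#include <set>
#include <vector>

// Sort-and-search: build an ordered multiset from B (m log m), then look
// each element of A up in it (n log m), removing matches so an element of
// B is used at most once.
std::vector<int> common_elements(const std::vector<int>& a,
                                 const std::vector<int>& b)
{
    std::multiset<int> rest(b.begin(), b.end());
    std::vector<int> out;
    for (int x : a) {
        auto it = rest.find(x);
        if (it != rest.end()) {
            out.push_back(x);
            rest.erase(it); // remove so it cannot match again
        }
    }
    return out;
}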
I have quicksort code that is supposed to run on the text "B A T T A J U S" (ignoring blanks), but I don't seem to understand the code that well.
void quicksort (itemType a[], int l, int r)
{
int i, j; itemType v;
if (r>l)
{
v = a[r]; i = l-1; j = r;
for (;;)
{
while (a[++i] < v);
while (a[--j] >= v);
if (i >= j) break;
swap(a,i,j);
}
swap(a,i,r);
quicksort(a,l,i-1);
quicksort(a,i+1,r);
}
}
I can explain what I understand: the first if checks whether l < r, which in this case is true, since 'S' is to the right of 'B'. Then I get a little confused: v is set equal to a[r]; does this mean 'S', since 'S' is all the way to the right? Then i is set to just outside the array, since it's l-1 (so it's undefined, I assume). Then j is set equal to r, but is that the position r, as in 'S'?
I don't really understand which values are set to what, or whether a[r] is the letter at that position or something else. Hopefully someone can explain how the first swap works, so I can learn this.
It is probably better to start with an understanding of the QuickSort algorithm, and then see how the code corresponds to it, than to study the code to try to figure out how QuickSort works. Basic QuickSort (which is what you have) is in fact a pretty simple algorithm. To sort an array A:
If the length of A is less than 2 then the array is already sorted. Otherwise,
Select any element of A to be a "pivot element".
Rearrange the other elements as needed so that all those that are less than the pivot are at the beginning of A, and those that are greater than or equal to the pivot are at the end. (This particular version also puts the pivot itself between the two, which is common but not strictly necessary; it could simply be included in the upper subarray, and the algorithm would still work.)
Apply the QuickSort procedure to each of the two sub-arrays produced by (3).
Your particular code chooses the right-most element of each (sub)array as the pivot element, and at step (4) it excludes the pivot from the sub-arrays to be recursively sorted.
Quicksort works by separating your array into a "left" subarray which contains only values strictly less than an arbitrarily chosen pivot value, and a "right" subarray that contains only elements that are greater than or equal to the pivot. Once the array has been divided like this, each of the two subarrays is sorted using the same algorithm. Here is how this applies to your code:
v = a[r] sets the pivot value to the last element in the array. This works well since the array is presumably unsorted to begin with, so a[r] is as good a value as any.
while(a[++i] < v) ; keeps stopping at the first element of the left sub-array that is greater than or equal to the pivot, v. When this loop ends, i is the index of an element that should be in the right sub-array rather than the left.
while(a[--j] >= v) ; does the same thing, except that it stops at the last element of the right sub-array that is strictly less than the pivot, v. When this loop ends, j is the index of an element that should be in the left sub-array rather than the right.
Whenever we find a pair of elements that are in the wrong sub-arrays, we swap them.
When all of the elements in the array have been partitioned (i meets j), we swap the pivot with the element at index i (which is now guaranteed to belong to the right sub-array).
Since the pivot is guaranteed to be in the right position (left sub-array is strictly less and right sub-array is greater than or equal), we need to sort the sub-arrays but not the pivot. That is why the recursive calls use indices l,i-1 and i+1,r, leaving the pivot at index i.
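To make the first pass concrete, here is the partition running on the asker's input (blanks ignored). The pivot is v = a[7] = 'S'. The i loop stops at a[2] = 'T' (the first element >= 'S') and the j loop stops at a[5] = 'J' (the last element < 'S'); since i < j, they are swapped, giving B A J T A T U S. The next round stops at a[3] = 'T' and a[4] = 'A' and swaps them: B A J A T T U S. Now i reaches 4 while j falls back to 3, so the loop breaks, and the final swap(a, i, r) puts the pivot into place: B A J A S T U T. Everything left of 'S' is smaller than it, everything right of it is greater than or equal, and the two halves are then sorted recursively.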
I can't offer a solution in that exact form. That code is overly complicated to my thinking.
I'm also not sure whether what I'm proposing is a bubble sort or a modified bubble sort, but to me it is just easier. My added comment is that quicksort() calls itself, so it is recursive, which in my book is not good for something as simple as a sort. It all depends on what you need in terms of size and efficiency. If you're sorting many items, then my proposed sort is not the best.
for(i = 0; i < (n - 1); i++) {
    for(j = (i + 1); j < n; j++) {
        if(value[i] > value[j]) {
            tmp = value[i];
            value[i] = value[j];
            value[j] = tmp;
        }
    }
}
Where
n is the number of total elements.
i, j, and tmp are integers
value[] is an array of integers to sort
Based on this logic, given as an answer on SO to a different (similar) question, I implemented the approach in C as shown below, to remove repeated numbers from an array in O(N) time. But my code does not return only unique numbers. I tried debugging, but could not work out the logic behind it well enough to fix this.
#include <stdio.h>

int remove_repeat(int *a, int n)
{
    int i, k;
    k = 0;
    for (i = 1; i < n; i++)
    {
        if (a[k] != a[i])
        {
            a[k+1] = a[i];
            k++;
        }
    }
    return (k+1);
}
int main()
{
    int a[] = {1, 4, 1, 2, 3, 3, 3, 1, 5};
    int n;
    int i;

    n = remove_repeat(a, 9);
    for (i = 0; i < n; i++)
        printf("a[%d] = %d\n", i, a[i]);
    return 0;
}
1] What is incorrect in the above code for removing duplicates?
2] Is there any other O(N) or O(NlogN) solution for this problem? What is its logic?
Heap sort in O(n log n) time.
Iterate through in O(n) time, replacing repeated elements with a sentinel value (such as INT_MAX).
Heap sort again in O(n log n) to distil out the repeated elements.
Still bounded by O(n log n).
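A sketch of that recipe, using std::make_heap / std::sort_heap as the heap sort and assuming INT_MAX never occurs in the data; the function name is mine:

#include <algorithm>
#include <climits>
#include <vector>

// Heap sort, sentinel the repeats, heap sort again; the unique values end
// up at the front and the sentinels at the back. Returns the unique count.
int remove_repeat_heapsort(std::vector<int>& a)
{
    if (a.empty()) return 0;
    std::make_heap(a.begin(), a.end());
    std::sort_heap(a.begin(), a.end());           // first O(n log n) pass
    int prev = a[0];
    for (std::size_t i = 1; i < a.size(); ++i) {  // O(n): mark the repeats
        if (a[i] == prev) a[i] = INT_MAX;
        else prev = a[i];
    }
    std::make_heap(a.begin(), a.end());
    std::sort_heap(a.begin(), a.end());           // second O(n log n) pass
    std::size_t k = 0;
    while (k < a.size() && a[k] != INT_MAX) ++k;  // sentinels sit at the end
    return (int)k;
}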
Your code only checks whether an item in the array is the same as its immediate predecessor.
If your array starts out sorted, that will work, because all instances of a particular number will be contiguous.
If your array isn't sorted to start with, that won't work because instances of a particular number may not be contiguous, so you have to look through all the preceding numbers to determine whether one has been seen yet.
To do the job in O(N log N) time, you can sort the array, then use the logic you already have to remove duplicates from the sorted array. Obviously enough, this is only useful if you're all right with rearranging the numbers.
If you want to retain the original order, you can use something like a hash table or bit set to track whether a number has been seen yet or not, and only copy each number to the output when/if it has not yet been seen. To do this, we change your current:
if (a[k] != a[i])
a[k+1] = a[i];
to something like:
if (!hash_find(hash_table, a[i])) {
hash_insert(hash_table, a[i]);
a[k+1] = a[i];
}
If your numbers all fall within fairly narrow bounds or you expect the values to be dense (i.e., most values are present) you might want to use a bit-set instead of a hash table. This would be just an array of bits, set to zero or one to indicate whether a particular number has been seen yet.
On the other hand, if you're more concerned with the upper bound on complexity than the average case, you could use a balanced tree-based collection instead of a hash table. This will typically use more memory and run more slowly, but its expected complexity and worst case complexity are essentially identical (O(N log N)). A typical hash table degenerates from constant complexity to linear complexity in the worst case, which will change your overall complexity from O(N) to O(N^2).
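In C++ terms, the hash-based, order-preserving version might look like this sketch, with std::unordered_set standing in for the hypothetical hash_find/hash_insert:

#include <unordered_set>
#include <vector>

// Keep each value the first time it is seen, preserving original order.
// Returns the number of unique elements, now at the front of the vector.
int remove_repeat_hashed(std::vector<int>& a)
{
    std::unordered_set<int> seen;
    int k = 0;
    for (int x : a)
        if (seen.insert(x).second) // true only the first time x appears
            a[k++] = x;
    return k;
}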
Your code would appear to require that the input is sorted. With unsorted inputs as you are testing with, your code will not remove all duplicates (only adjacent ones).
You are able to get an O(N) solution if the range of the integers is known up front and is smaller than the amount of memory you have :). Make one pass to determine the unique integers using auxiliary storage, then another to output the unique values.
Code below is in Java, but hopefully you get the idea.
int[] removeRepeats(int[] a) {
// Assume these are the integers between 0 and 1000
Boolean[] v = new Boolean[1001]; // A lazy way of getting a tri-state var (false, true, null)
for (int i=0;i<a.length;++i) {
v[a[i]] = Boolean.TRUE;
}
// v[i] = null => number not seen
// v[i] = true => number seen
int[] out = new int[a.length];
int ptr = 0;
for (int i=0;i<a.length;++i) {
if (v[a[i]] != null && v[a[i]].equals(Boolean.TRUE)) {
out[ptr++] = a[i];
v[a[i]] = Boolean.FALSE;
}
}
// Out now doesn't contain duplicates, order is preserved and ptr represents how
// many elements are set.
return out;
}
You are going to need two loops, one to go through the source and one to check each item in the destination array.
You are not going to get O(N).
[EDIT]
The article you linked to suggests a sorted output array, which means the search for duplicates in the output array can be a binary search, which is O(logN).
Your logic is just wrong, so the code is wrong too. Work out your logic by hand before coding it.
I suggest an O(NlogN) way using a modification of heapsort.
With heapsort, we scan from a[i] to a[n], find the minimum, and swap it into a[i], right?
So here is the modification: if the minimum is the same as a[i-1], swap the minimum with a[n] and reduce the array's item count by 1.
That should do the trick in O(NlogN).
Your code will only work in particular cases. You're checking adjacent values, but duplicate values can occur anywhere in the array, so for unsorted input it's simply wrong.
This is an algorithm for the following question: rotate an array of n elements left by i positions. For instance, with n = 8 and i = 3, the array abcdefgh is rotated to defghabc.
/* Alg 1: Rotate by reversal */
extern int x[]; /* the array being rotated */

void reverse(int i, int j)
{ int t;
while (i < j) {
t = x[i]; x[i] = x[j]; x[j] = t;
i++;
j--;
}
}
void revrot(int rotdist, int n)
{ reverse(0, rotdist-1);
reverse(rotdist, n-1);
reverse(0, n-1);
}
What is the time complexity of this method? And is there any better solution to this problem?
Thanks indeed.
Should be roughly linear, O(n).
Each reverse(i, j) loop runs no more than (j-i+1)/2 times, which is O(j-i); summed over the three calls in revrot, that is O(n).
Big-O notation:
A loop over the input is O(n), as it has to go through a number of iterations proportional to n.
A fixed amount of work, such as an if statement, is O(1).
Agreed, it'd be O(n), since we're merely shifting.
As food for thought, another possible algorithm is to make a new array with the original appended to itself (i.e. abcd --> abcdabcd), then shift the start pointer right by i positions and read off n elements. Of course, you'll need two pointers, one for the beginning and one for the end. For strings, remember to cut off the end with '\0'.
Same run time, by the way, at the cost of O(n) extra space.
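For strings, a sketch of that append-and-shift idea (rotate_left is my name for it):

#include <iostream>
#include <string>

// Double the input, then read n characters starting at offset i.
// O(n) time like the reversal method, but O(n) extra space.
std::string rotate_left(const std::string& s, std::size_t i)
{
    std::string doubled = s + s;        // "abcd" -> "abcdabcd"
    return doubled.substr(i, s.size()); // start i characters in
}

int main()
{
    std::cout << rotate_left("abcdefgh", 3) << std::endl; // prints "defghabc"
    return 0;
}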