Given an array of N integers, sort the array, and find the 2 consecutive numbers in the sorted array with the maximum difference.
Example – on input [1,7,3,2] output 4 (the sorted array is [1,2,3,7], and the maximum difference is 7-3=4).
Algorithm A (sort, then scan adjacent pairs) runs in O(N log N) time.
I need an algorithm that computes the same result as algorithm A but runs in O(N) time.
Let the array be X and let n = length(X). Put each element x in bucket number floor((x - min(X)) * (n - 1) / (max(X) - min(X))). The width of each bucket is (max(X) - min(X))/(n - 1) and the maximum adjacent difference is at least that much, so the numbers in question wind up in different buckets. Now all we have to do is consider the pairs where one is the max in bucket i and the other is the min in bucket j where i < j and all buckets k in (i, j) are empty. This is linear time.
Proof that we really need floor: let the function be f(X). If we could compute f(X) in linear time, then surely we could decide in linear time whether
0 < f(X) ≤ (max(X) - min(X))/(length(X) - 1),
i.e., whether the elements of X are evenly spaced and not all identical. Let this predicate be P(X). The support of P has factorial(length(X)) connected components, so the usual Ω(n log n) lower bounds for algebraic models of computation apply.
Execute a Counting Sort and then scan the result for the largest difference.
Because of the consecutive number requirement, at first glance it seems like any solution will require sorting, and this means at best O(n log n) unless your number range is sufficiently constrained for a Counting Sort. But if it is, you win with O(n).
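A minimal sketch of the counting-sort idea, assuming the values are integers in a range small enough to allocate one flag per possible value. The function name `max_gap_counting` is made up for illustration. It runs in O(N + range) time, which is linear in N only when the range is bounded.

```python
def max_gap_counting(values):
    """Largest difference between neighbours in sorted order, via counting sort."""
    if len(values) < 2:
        return 0
    lo, hi = min(values), max(values)
    # present[v - lo] is True when value v occurs; uses O(hi - lo) extra memory.
    present = [False] * (hi - lo + 1)
    for v in values:
        present[v - lo] = True
    best = 0
    prev = lo
    # Walk the flags in increasing value order, i.e. in sorted order.
    for offset, seen in enumerate(present):
        if seen:
            best = max(best, lo + offset - prev)
            prev = lo + offset
    return best
```

For the example input, `max_gap_counting([1, 7, 3, 2])` gives 4.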
First, suppose you were already given the minimum value MIN and the maximum value MAX of the array of size N. Under what circumstances would the max gap be as small or as large as possible?
The maximum gap is largest when every element is either MIN or MAX, which makes maxgap = MAX - MIN.
The maximum gap is smallest when all the elements are equally spaced between MIN and MAX. Let's call the spacing between them gap.
So, they are arranged as
MIN, MIN + gap, MIN + 2*gap, MIN + 3*gap, ... MIN + (N-1)*gap
where
MIN + (N-1)*gap = MAX .
gap = (MAX - MIN) / (N - 1).
So, we know now that our answer will lie in the range [gap, MAX - MIN].
Now, since we know the answer is at least gap, we create buckets of size gap for the ranges
[MIN, MIN + gap), [MIN + gap, MIN + 2*gap) ... and so on.
There will only be (N-1) such buckets. We place the numbers in these buckets based on their value.
If you pick any 2 numbers from a single bucket, their difference will be less than gap, so they can never contribute to maxgap (remember maxgap >= gap). We only need to store the largest and the smallest number in each bucket, and we only compare numbers across buckets.
Now we just need to go through the buckets sequentially (they are already ordered by value) and take the difference between each bucket's min_value and the max_value of the previous bucket that holds at least one value. The answer is the maximum of all such differences.
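#include <algorithm>
#include <climits>
#include <vector>
using namespace std;

int maximumGap(const vector<int> &num) {
if (num.size() < 2) return 0;
int maxNum = *max_element(num.begin(), num.end());
int minNum = *min_element(num.begin(), num.end());
if (maxNum == minNum) return 0; //all elements are equal
//bucket width: ceil((maxNum - minNum) / (N - 1)), always at least 1
int gap = (maxNum - minNum - 1) / (int)(num.size() - 1) + 1;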
//number of buckets = num.size() - 1
vector<int> bucketsMin(num.size() - 1, INT_MAX);
vector<int> bucketsMax(num.size() - 1, INT_MIN);
//put into buckets
for (int i = 0; i < num.size(); i++)
{
if (num[i] != maxNum && num[i] != minNum)
{
int buckInd = (num[i] - minNum) / gap;
bucketsMin[buckInd] = min(bucketsMin[buckInd], num[i]);
bucketsMax[buckInd] = max(bucketsMax[buckInd], num[i]);
}
}
int maxGap = INT_MIN;
int previous = minNum;
for (int i = 0; i < num.size() - 1; i++)
{
if (bucketsMin[i] == INT_MAX && bucketsMax[i] == INT_MIN) continue; //empty
//candidate gap: min of this bucket minus max of the previous non-empty bucket
maxGap = max(maxGap, bucketsMin[i] - previous);
previous = bucketsMax[i];
}
maxGap = max(maxGap, maxNum - previous);
return maxGap;
}
1. Find the minimum and the maximum.
2. Pick a random number k from the array.
3. Partition the array by placing all the values smaller than k to the left and all the larger values to the right.
4. You know the minimum and the maximum of both groups; calculate the gap of the left group assuming its values are evenly spaced on a straight line, that is, (max - min) / (count - 1). Do the same for the right group.
5. Go to 2 with the group that has the bigger gap; you know the min and max of that group. Repeat until the selected group has no more than 4 values.
6. You now have a group with at most 4 elements; sort it and find the solution.
Here is an example of how this algorithm works:
Input: 9 5 3 4 12 9 31 17
Pick random number: k = 9
Sort by smaller and bigger values of k
5 3 4 9 9 12 31 17, k is in index 3
Left group gap = (9 - 3) / (4 - 1) = 2
Right group gap = (31 - 9) / (5 - 1) = 5.5
We pick the right group 9 9 12 31 17
Pick random number: k = 12
Sort by smaller and bigger values of k
9 9 12 31 17, k is in index 2
Left group gap = (12 - 9) / (3 - 1) = 1.5
Right group gap = (31 - 12) / (3 - 1) = 9.5
We pick the right group 12 31 17, which has no more than 4 values.
The maximum gap in 12 31 17 is 31 - 17 = 14
My algorithm is very similar to the selection algorithm for finding the k-th value of a sorted array in linear time.
I got a problem about finding the smallest N, where N! contains exactly k trailing zeros.
I've got an idea of finding it through binary search from here - Finding natural numbers having n Trailing Zeroes in Factorial .
Is it possible to calculate it without binary search, using any formula or some iterations?
You can use the formula that the number of times p divides n! is:
k = floor(n/p) + floor(n/p^2) + floor(n/p^3) + ...
Combine this with the fact that the number of trailing zeros of n! is exactly the number of times 5 divides it (each zero corresponds to a pair of a 2 and a 5, and since 2 is smaller than 5 there are always more 2s than 5s, so the 5s are the constraint).
With some algebra, and applying the formula for an infinite geometric series, we get a very (very!) close lower bound on n. We then just increment n by 5 as many times as necessary to get the actual result. In practice this ends up being 1-2 increments for k in the lower thousands, so it's fairly negligible.
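The algebra is not spelled out in the answer; one way to reconstruct it (an assumption about the intended step) is to drop the floors with p = 5, which bounds k from above by a geometric series and so bounds n from below:

```latex
k = \sum_{i \ge 1} \left\lfloor \frac{n}{5^i} \right\rfloor
  \le \sum_{i \ge 1} \frac{n}{5^i}
  = n \cdot \frac{1/5}{1 - 1/5}
  = \frac{n}{4}
\quad\Longrightarrow\quad n \ge 4k .
```

This matches the starting guess of 4*k used in the code.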
Full code below:
def count_div5(n):
    result = 0
    pow5 = 5
    while pow5 <= n:
        result += n // pow5
        pow5 *= 5
    return result

def n_from_fact_zeros(k):
    n = round(4*k)
    n += -n % 5
    while count_div5(n) < k:
        n += 5
    return n
For [1, 2, 3], all possible subsets are {1}, {2}, {3}, {1,2}, {1,3}, {2,3}, {1,2,3}
The sum of the ORs of these subsets is 1 + 2 + 3 + 3 + 3 + 3 + 3 = 18.
My approach is to generate all possible subsets, compute the OR of each and sum them, but the time complexity is O(2^n); I need a solution in O(n log n) or less.
Since you have 3 elements, 2^3 = 8 subsets are generated (including the empty one). You need to OR the elements of each subset and print the sum over all subsets. The following logic gives the required result:
public class AndOfSubSetsOfSet {
public static void main(String[] args) {
findSubsets(new int[]{1, 2,3});
}
private static void findSubsets(int array[]) {
int numOfSubsets = 1 << array.length;
int a = 0;
for (int i = 0; i < numOfSubsets; i++) {
int pos = array.length - 1;
int bitmask = i;
int temp = 0;
int count = 0;
while (bitmask > 0) {
if ((bitmask & 1) == 1) {
if (count == 0) {
temp = array[pos];
} else
temp = array[pos] | temp;
count++;
}
//shift the mask right by one so its lowest bit is removed
bitmask >>= 1;
pos--;
}
count = 0;
a += temp;
temp = 0;
}
System.out.println(a);
}
}
Another approach is to use 3 loops: the outer loop selects the number of elements in the group (2, 3, 4, ... up to n), and the two inner loops select the elements for that size. In the inner loop you can use bitwise OR to get the answer.
Here the time complexity is better than exponential.
If there is any problem, I can give you the code.
Please vote if you like it.
Let's find the solution by calculating bitwise values. Consider the following points first. We will formulate the algorithm based on these points
For N numbers, there can be 2^N-1 such subsets.
For N numbers, where the maximum number of bits is k, what is the maximum possible output? Obviously it is reached when every subset OR is all 1's (i.e., every combination has a 1 in each of the k bit positions). So calculate this MAX. In your example k = 2 and N = 3, so the MAX is when every subset OR is 11 (i.e., 3). So MAX = (2^N-1)*(2^k-1) = 21.
Note that a bit of a subset OR is 0 only when that bit is 0 in every element of the subset. So for every bit, first calculate how many subsets have a 0 in that bit. Then multiply that number by the corresponding value (2^bit_position) and deduct it from MAX. In your case, for the rightmost position (i.e., position 0), there is only one element with a 0 (namely 2). So in 2^1-1 = 1 subset the OR has a 0 at position 0, and we deduct 1*1 from MAX. Similarly, for position 1 there is only 1 subset whose OR has a 0 at position 1 ({1}), so deduct 1*2 from MAX. For every bit, calculate this value and keep deducting; the final MAX is the result (here 21 - 1 - 2 = 18). If you are working with 16-bit integers and don't know the max k, calculate using k = 16.
Let's consider another example with the array {1,4}. The subsets are {1},{4},{1,4}, and the result is 1+4+5 = 10.
Here k = 3 and N = 2, so MAX = (2^k-1)*(2^N-1) = 21.
For 0 bit, there is only single 0 (in 4). so deduct 1*1 from MAX. So new MAX = 21 -1 = 20.
For 1 bit, both 1 and 4 has 0. so deduct (2^2-1)*2 from MAX. So new MAX = 20 -6 = 14.
For 2 bit, there is only single 0 (in 1). so deduct 1*4 from MAX. So new MAX = 14 -4 = 10.
As we have calculated for every bit position, thus the final result is 10.
Time Complexity
First and second steps can be calculated in constant time
In the third step, the main work is counting the 0 bits at each position. For N numbers this takes O(k*N) in total; since k is a constant, the overall complexity is O(N).
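The three steps can be sketched as follows, assuming non-negative integers that fit in `bits` bits. The function name and the `bits` parameter are illustrative, not from the answer.

```python
def sum_of_subset_ors(nums, bits=16):
    """Sum of the bitwise OR over all non-empty subsets, in O(bits * N)."""
    n = len(nums)
    total_subsets = (1 << n) - 1
    # MAX: pretend every subset OR has all `bits` bits set.
    result = total_subsets * ((1 << bits) - 1)
    for b in range(bits):
        # Count the elements whose bit b is 0.
        zeros = sum(1 for x in nums if ((x >> b) & 1) == 0)
        # Non-empty subsets drawn only from those elements have bit b = 0,
        # so each of them loses the value 2^b.
        result -= ((1 << zeros) - 1) * (1 << b)
    return result
```

With the two examples from the text, `sum_of_subset_ors([1, 2, 3])` gives 18 and `sum_of_subset_ors([1, 4])` gives 10.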
In C, how do I get an array of n numbers (each 0x00-0xFF in my case), of which the sum is within a given range 0..k?
The almost duplicate C++ multiple random numbers adding up to equal a certain number targets a specific sum, but in my case the sum can be anything between 0..k.
You need to specify what is the desired distribution of the random numbers.
If there are no further requirements, I would suggest one of the following:
(1)
pick random number a[1] in interval 0 .. k
pick random number a[2] in interval 0 .. k-a[1]
pick random number a[3] in interval 0 .. k-a[1]-a[2]
...
pick random number a[n] in interval 0 .. k-a[1]-a[2]-...-a[n-1]
If you have an upper limit m on each random number, use min(k-a[1]-...-a[i-1], m) as the upper bound of the interval for a[i].
Disadvantages: you will get a lot of small numbers and just a few big ones.
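Option (1) can be sketched like this (in Python rather than C, for brevity). The function name and the `rng` parameter are made up for illustration, and the per-element limit `m` defaults to 0xFF as in the question.

```python
import random

def random_with_bounded_sum(n, k, m=0xFF, rng=random):
    """n integers in 0..m whose sum is at most k (shrinking-budget method)."""
    remaining = k          # budget still available for the later numbers
    out = []
    for _ in range(n):
        # randint is inclusive at both ends.
        value = rng.randint(0, min(remaining, m))
        out.append(value)
        remaining -= value
    return out
```

Because each draw shrinks the budget for the following ones, the early elements tend to be larger than the late ones, which is the disadvantage noted above.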
(2)
pick n random numbers a[1], .., a[n] in interval 0 .. m, m being the upper limit
s = a[1]+a[2]+...+a[n]
multiply each a[i] by k/s (if integers are required, round down)
Disadvantages: It is unlikely to get large numbers this way. If integers are required, there will likely be a gap between the sum of numbers and k due to rounding error.
I think you get "nicer" numbers with option (2) but as stated above, it depends on the requirements.
Assuming k is less than 255 * n, one solution is to assign k / n to every element of the array, then randomly subtract a value from each array element.
// for (int i = 0; i < n; i++) array[i] = k / n;
// for (int i = 0; i < n; i++) array[i] -= randbetween(0, array[i]);
for (int i = 0; i < n; i++) array[i] = randbetween(0, k / n);
This has an expected sum of k / 2. By tweaking the randbetween() function you can change the probability of the resulting array sum.
It is easy to create one number within range [0, 255].
It is easy to identify if k > 255*n or k < 0 there is no solution.
If 0 <= k <= 255*n, the solution exists. Here we only talk about n > 1 condition.
You have created n-1 random numbers, and sum of the n-1 numbers is s1, suppose the nth number is x. So s1 + x = k, and x should be [0, 255]. If the n-1 numbers are all within range [0, a], then (n-1)*a + 255 >= k, we get a >= (k-255)/(n-1).
If k > 255, just let a = (k-255)/(n-1). It means s1 is [0, k-255]. Then the nth number x can be any random number within [0, 255].
So the solution is to arbitrarily select n-1 numbers, each within [0, (k-255)/(n-1)] (you know (k-255)/(n-1) <= 255, so this satisfies your condition), and select one random number within [0, 255].
If k <= 255, arbitrarily select n numbers, each within [0, k/n] (you know k/n is within [0, 255]).
I am given two arrays that contain natural numbers, A and B, and I need to find the index k that minimizes the sum of A[i] * |B[i]-B[k]| for i=0 to n-1.
(Both arrays have the same length)
It's obviously easy to do in O(n^2): I just calculate the sum for every k between 0 and n-1, but I need a better run time complexity.
Any ideas? Thanks!
You can do this in time O(nlogn) by first sorting both arrays based on the values in B, and then performing a single scan.
Once the arrays are sorted, then B[i]>=B[k] if i>k and B[i]<=B[k] if i<= k, so the sum can be rewritten as:
sum A[i] * abs(B[i]-B[k]) = sum A[i]*(B[i]-B[k]) for i=k..n-1
+ sum A[i]*(B[k]-B[i]) for i=0..k-1
= sum A[i]*B[i] for i=k..n-1
- B[k] * sum A[i] for i=k..n-1
+ B[k] * sum A[i] for i = 0..k-1
- sum A[i]*B[i] for i = 0..k-1
You can precalculate all of these sums (as prefix sums) in O(n) time, which then lets you evaluate the target sum at each position in O(1), i.e. O(n) over all positions, and select the value of k which gives the best score.
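A minimal sketch of this prefix-sum approach, assuming non-empty arrays of equal length. The function name `best_k_prefix` and the helper arrays `pa` and `pab` are illustrative.

```python
def best_k_prefix(A, B):
    """Return (k, cost): the original index k minimising sum(A[i] * |B[i] - B[k]|)."""
    n = len(A)
    order = sorted(range(n), key=lambda i: B[i])   # original indices, sorted by B
    a = [A[i] for i in order]
    b = [B[i] for i in order]
    # pa[j]  = a[0] + ... + a[j-1]
    # pab[j] = a[0]*b[0] + ... + a[j-1]*b[j-1]
    pa = [0] * (n + 1)
    pab = [0] * (n + 1)
    for j in range(n):
        pa[j + 1] = pa[j] + a[j]
        pab[j + 1] = pab[j] + a[j] * b[j]
    best_cost, best_pos = None, None
    for k in range(n):
        left = b[k] * pa[k] - pab[k]                         # terms with i < k
        right = (pab[n] - pab[k]) - b[k] * (pa[n] - pa[k])   # terms with i >= k
        cost = left + right
        if best_cost is None or cost < best_cost:
            best_cost, best_pos = cost, k
    return order[best_pos], best_cost
```

The sort dominates at O(n log n); the scan over k is O(n).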
I believe I can do this in O(n log n).
First, sort the B array, applying the same permutation to the A array (and remembering the permutation). This is the O(n log n) part. Since we sum over all i, applying the same permutation to the A and B arrays does not change the minimum.
With a sorted B array, the rest of the algorithm is actually O(n).
For each k, define an array Ck[i] = |B[i] - B[k]|
(Note: We will not actually construct Ck... We will just use it as a concept for easier reasoning.)
Observe that the quantity we are trying to minimize (over k) is the sum of A[i] * Ck[i]. Let's go ahead and give that a name:
Define: Sk = Σ A[i] * Ck[i]
Now, for any particular k, what does Ck look like?
Well, Ck[k] = 0, obviously.
More interestingly, since the B array is sorted, we can get rid of the absolute value signs:
Ck[i] = B[k] - B[i], for 0 <= i < k
Ck[i] = 0, for i = k
Ck[i] = B[i] - B[k], for k < i < n
Let's define two more things.
Definition: Tk = Σ A[i] for 0 <= i < k
Definition: Uk = Σ A[i] for k < i < n
(That is, Tk is the sum of the first k elements of A. Uk is the sum of all but the first k+1 elements of A.)
The key observation: Given Sk, Tk, and Uk, we can compute Sk+1, Tk+1, and Uk+1 in constant time. How?
T and U are easy.
The question is, how do we get from Sk to Sk+1?
Consider what happens to Ck when we go to Ck+1. We simply add B[k+1]-B[k] to every element of C from 0 to k, and we subtract the same amount from every element of C from k+1 to n-1 (prove this). The elements 0 to k carry total weight Tk + A[k] = Tk+1 and the elements k+1 to n-1 carry total weight Uk, so we just need to add (Tk + A[k]) * (B[k+1] - B[k]) and subtract Uk * (B[k+1] - B[k]) to get from Sk to Sk+1.
Algebraically... The first k terms of Sk are just the sum from 0 to k-1 of A[i] * (B[k] - B[i]).
The first k terms of Sk+1 are the sum from 0 to k-1 of A[i] * (B[k+1] - B[i])
The difference between these is the sum, from 0 to k-1, of (A[i] * (B[k+1] - B[i]) - (A[i] * (B[k] - B[i])). Factor out the A[i] terms and cancel the B[i] terms to get the sum from 0 to k-1 of A[i] * (B[k+1] - B[k]), which is just Tk * (B[k+1] - B[k]).
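Similarly, the last n-k-1 terms change by -Uk * (B[k+1] - B[k]). The one remaining term, i = k, goes from 0 in Sk to A[k] * (B[k+1] - B[k]) in Sk+1.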
Since we can compute S0, T0, and U0 in linear time, and we can go from Sk to Sk+1 in constant time, we can calculate all of the Sk in linear time. So do that, remember the smallest, and you are done.
Use the inverse of the sort permutation to get the k for the original arrays.
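The incremental update can be sketched as below, assuming non-empty arrays of equal length. The names are illustrative. Note that the term for index k itself moves from distance 0 to B[k+1] - B[k], so A[k] is counted together with the left-hand weights in the update.

```python
def best_k_incremental(A, B):
    """Return (k, cost) for the original arrays, updating S, T, U in O(1) per step."""
    n = len(A)
    order = sorted(range(n), key=lambda i: B[i])   # remember the permutation
    a = [A[i] for i in order]
    b = [B[i] for i in order]
    s = sum(a[i] * (b[i] - b[0]) for i in range(n))   # S0
    t = 0                                             # T0: weights strictly left of k
    u = sum(a) - a[0]                                 # U0: weights strictly right of k
    best_cost, best_pos = s, 0
    for k in range(n - 1):
        d = b[k + 1] - b[k]
        s += (t + a[k]) * d - u * d   # S(k+1): left side grows by d, right side shrinks by d
        t += a[k]                     # T(k+1)
        u -= a[k + 1]                 # U(k+1)
        if s < best_cost:
            best_cost, best_pos = s, k + 1
    return order[best_pos], best_cost
```

Here `order[best_pos]` plays the role of the inverse permutation step: it maps the best position in the sorted arrays back to an index in the original ones.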
Here is an O(NlogN) solution.
Example
A 6 2 5 10 3 8 7
B 1 5 4 3 6 9 7
1) First sort the two arrays in increasing order of B; each element of A stays bound to its element of B.
After sorting, we get
A 6 10 5 2 3 7 8
B 1 3 4 5 6 7 9
Since B is now in order, we have
sum A[i]*|B[i]-B[k]| for i=0..n-1
= sum A[i]*(B[k]-B[i]) for i=0..k-1
+ sum A[i]*(B[i]-B[k]) for i=k+1..n-1
= B[k] * (sum A[i] for i=0..k-1 - sum A[i] for i=k+1..n-1)
- (sum A[i]*B[i] for i=0..k-1 - sum A[i]*B[i] for i=k+1..n-1)
2) We calculate the prefix sums of array A: sumA = 0 6 16 21 23 26 33 41
With sumA, the sum of A[i] for i=s..e can be calculated in O(1) time for any s and e, as sumA[e+1] - sumA[s].
For the same reason, we can calculate the prefix sums of A[i]*B[i].
So evaluating each k takes just O(1) time.
So the total time complexity is O(NlogN)+O(N).
I have an array of size 4, 9, 16 or 25 (depending on the input). The numbers in the array start at 0 and go up to one less than the size (if the array size is 9, the biggest element in the array is 8).
I would like an algorithm that generates some sort of checksum for the array, so that I can check whether 2 arrays are equal without looping through the whole array and comparing the elements one by one.
Where can I get this sort of information? I need something that is as simple as possible. Thank you.
edit: just to be clear on what I want:
-All the numbers in the array are distinct, so [0,1,1,2] is not valid because there is a repeated element (1)
-The position of the numbers matter, so [0,1,2,3] is not the same as [3,2,1,0]
-The array will contain the number 0, so this should also be taken into consideration.
EDIT:
Okay I tried to implement the Fletcher's algorithm here:
http://en.wikipedia.org/wiki/Fletcher%27s_checksum#Straightforward
int fletcher(int array[], int size){
int i;
int sum1=0;
int sum2=0;
for(i=0;i<size;i++){
sum1=(sum1+array[i])%255;
sum2=(sum2+sum1)%255;
}
return (sum2 << 8) | sum1;
}
To be honest I have no idea what the return line does, but unfortunately the algorithm does not work.
For arrays [2,1,3,0] and [1,3,2,0] I get the same checksum.
EDIT2:
okay here's another one, the Adler checksum
http://en.wikipedia.org/wiki/Adler-32#Example_implementation
#define MOD 65521
unsigned long adler(int array[], int size){
int i;
unsigned long a=1;
unsigned long b=0;
for(i=0;i<size;i++){
a=(a+array[i])%MOD;
b=(b+a)%MOD;
}
return (b <<16) | a;
}
This also does not work.
Arrays [2,0,3,1] and [1,3,0,2] generate same checksum.
I'm losing hope here, any ideas?
Let's take the case of your array of 25 integers. You explain that it can contain any permutation of the unique integers 0 to 24. There are 25! (25 factorial) possible permutations, that is 15511210043330985984000000, far more than a 32-bit integer can represent.
The conclusion is that you will have collisions, no matter how hard you try.
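The counting argument is easy to check numerically. This small sketch (variable names are illustrative) compares the number of permutations with the number of distinct 32-bit values:

```python
import math

perms = math.factorial(25)          # number of orderings of 0..24
bits_needed = perms.bit_length()    # bits required to give every ordering its own code
fits_32 = perms <= 2**32            # would a 32-bit checksum be enough?
```

Here `bits_needed` comes out as 84, so a collision-free checksum for the size-25 case does not even fit in a 64-bit integer.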
Now, here is a simple algorithm that account for position:
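unsigned int checksum(const int array[], int size) {
    unsigned int c = 0; /* unsigned, so the shifts below are well defined */
    for (int i = 0; i < size; i++) {
        c += (unsigned int)array[i];
        c = (c << 3) | (c >> (32 - 3)); /* rotate left by 3 (assumes 32-bit unsigned int) */
        c ^= 0xFFFFFFFFu; /* invert just for fun */
    }
    return c;
}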
I think what you want is in the answer of the following thread:
Fast permutation -> number -> permutation mapping algorithms
You just take the number your permutation is mapped to and use it as your checksum. As there is exactly one checksum per permutation, no smaller collision-free checksum can exist.
How about the checksum of weighted sum? Let's take an example for [0,1,2,3]. First pick a seed and limit, let's pick a seed as 7 and limit as 10000007.
a[4] = {0, 1, 2, 3}
limit = 10000007, seed = 7
result = 0
result = ((result + a[0]) * seed) % limit = ((0 + 0) * 7) % 10000007 = 0
result = ((result + a[1]) * seed) % limit = ((0 + 1) * 7) % 10000007 = 7
result = ((result + a[2]) * seed) % limit = ((7 + 2) * 7) % 10000007 = 63
result = ((result + a[3]) * seed) % limit = ((63 + 3) * 7) % 10000007 = 462
Your checksum is 462 for that [0, 1, 2, 3].
The reference is http://www.codeabbey.com/index/wiki/checksum
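The steps of the worked example translate directly into a loop. This is a small sketch with the same seed and limit as defaults; the function name is made up for illustration.

```python
def weighted_checksum(values, seed=7, limit=10000007):
    """Order-sensitive checksum: add each value, multiply by seed, reduce mod limit."""
    result = 0
    for v in values:
        result = ((result + v) * seed) % limit
    return result
```

`weighted_checksum([0, 1, 2, 3])` reproduces the 462 computed above, while the reversed array `[3, 2, 1, 0]` gives 7938. Collisions remain possible in general, since the result is reduced modulo `limit`.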
For an array of N unique integers from 1 to N, just adding up the elements will always give N*(N+1)/2. Therefore the only difference is in the ordering. If by "checksum" you imply that you tolerate some collisions, then one way is to sum the differences between consecutive numbers. So for example, the delta checksum for {1,2,3,4} is 1+1+1=3, but the delta checksum for {4,3,2,1} is (-1)+(-1)+(-1)=-3.
No requirements were given for collision rates or computational complexity, but if the above doesn't suit, then I recommend a position-dependent checksum.
From what I understand, your array contains a permutation of the numbers from 0 to N-1. One useful checksum is the rank of the array in lexicographic order. What does that mean? Given 0, 1, 2
you have the possible permutations
1: 0, 1, 2
2: 0, 2, 1
3: 1, 0, 2
4: 1, 2, 0
5: 2, 0, 1
6: 2, 1, 0
The checksum will be the first number (the rank), computed when you create the array. There are solutions proposed in
Find the index of a given permutation in the list of permutations in lexicographic order
which can be helpful, although it seems the best algorithm there was of quadratic complexity. To improve it to linear complexity you should cache the values of the factorials beforehand.
The advantage? ZERO collision.
EDIT: Computation
The value is like the evaluation of a polynomial where factorial is used for the monomial instead of power. So the function is
f(X0,...,Xn-1) = X0 * (0!) + X1 * (1!) + X2 * (2!) + ... + Xn-1 * ((n-1)!)
The idea is to use each values to get a sub-range of permutations, and with enough values you pinpoint an unique permutation.
Now for the implementation (like the one of a polynomial):
Precompute 0! ... (n-1)! at the beginning of the program.
Each time you set an array, use f(elements) to compute its checksum.
Compare in O(1) using this checksum.
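A sketch of the rank computation, with one deviation that should be stated plainly: plugging the raw values Xi into the factorial-weighted sum is not collision-free (for instance [0,1] and [1,0] both give 1, since 0! = 1! = 1). The version below therefore uses the usual digits of the factorial number system instead: for each position, the count of later elements that are smaller. The function name is illustrative, and this simple form is quadratic.

```python
from math import factorial

def permutation_rank(perm):
    """0-based lexicographic rank of a permutation of 0..n-1."""
    n = len(perm)
    rank = 0
    for i, x in enumerate(perm):
        # How many of the remaining elements would sort before x at this position.
        smaller_after = sum(1 for y in perm[i + 1:] if y < x)
        rank += smaller_after * factorial(n - 1 - i)
    return rank
```

The ranks are 0-based, so add 1 to match the numbering 1 to 6 in the table above: `permutation_rank([0, 1, 2])` is 0 and `permutation_rank([2, 1, 0])` is 5. For 25 elements the rank can need more than 64 bits, so a fixed-width language would need a big-integer type or two machine words.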