I was reading about static array queries and this is what I found:
Minimum Queries: There is an O(nlogn) time preprocessing method after which we can answer any minimum query in O(1) time.
The idea is to precalculate all values of min(a, b) where b - a + 1 (the length of the range) is a power of two. The number of precalculated values is O(nlogn), because there are O(logn) range lengths that are powers of two.
The values can be calculated efficiently using the recursive formula:
min(a,b) = min(min(a, a + w - 1), min(a + w, b))
where b-a+1 is a power of two and w = (b - a + 1) / 2
What is meant by the part quoted above? Why do we calculate the minimum only for certain lengths?
What is the idea and the intuition behind it? What does the logic do?
It's kind of a hunch that it must be related to something about a binary tree because we're considering just the powers of two for the lengths.
This structure is referred to as an RMQ, a Range Minimum Query. It achieves its O(1) queries by exploiting the associativity and commutativity of the min operation (that is, min(x,y) = min(y,x) and min(x,y,z) = min(x,min(y,z))). The other property that min has is idempotence, min(x,x) = x, and more importantly, min(x,y,z) = min(min(x,y),min(y,z)), so the two ranges we combine are allowed to overlap.
If you have the min of every subarray whose length is a power of 2 (hence the n log n memory), you can compute the range min over [l, r] by taking the min of the largest power-of-2 block starting at l that doesn't overshoot r, together with the min of the block of the same length ending at r (which doesn't undershoot l). The idea here is as follows:
arr=[a,b,c,d,e,f,g,h]
We calculate our RMQ to have mins as follows:
of length 1: [min(a), min(b), min(c), etc]
of length 2: [min(a,b), min(b,c), min(c,d), min(d,e), etc]
of length 4: [min(a,b,c,d), min(b,c,d,e), min(c,d,e,f), etc]
To take the min from 1 to 6, we want the range min of length 4 starting at 1 (since 8 would go past our right index) and take the min of that with the range min of length 4 ending at 6, which starts at 6 - 4 + 1 = 3. So we take these two entries from the array of length 4, and take the min of
min(of length 4[1], of length 4[3]) and that's our answer.
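As a rough illustration of the description above, here is a minimal sparse-table sketch in Python. The names `build_sparse_table` and `range_min` are made up for this example, and indices are 0-based and inclusive, which is an assumption about the convention.

```python
def build_sparse_table(arr):
    """table[k][i] holds min(arr[i : i + 2**k]); O(n log n) time and memory."""
    n = len(arr)
    table = [list(arr)]          # blocks of length 1
    w = 1
    while 2 * w <= n:
        prev = table[-1]
        # A block of length 2w is the min of two adjacent blocks of length w.
        table.append([min(prev[i], prev[i + w]) for i in range(n - 2 * w + 1)])
        w *= 2
    return table


def range_min(table, l, r):
    """Min of arr[l..r] (inclusive) using two possibly overlapping blocks."""
    k = (r - l + 1).bit_length() - 1      # largest k with 2**k <= range length
    return min(table[k][l], table[k][r - (1 << k) + 1])


arr = [5, 2, 4, 7, 1, 3, 6, 8]
table = build_sparse_table(arr)
print(range_min(table, 1, 6))  # 1
```

The overlap between the two blocks is harmless precisely because min(x,x) = x; the same trick would not work for sums.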
I have been trying this question on hackerearth practice which requires below work done.
PROBLEM
Given an integer n which signifies a sequence of n numbers from {0,1,2,3,4,5.......n-2,n-1}
We are provided m ranges in form of (L,R) such that (0<=L<=n-1)(0<=R<=n-1)
if(L <= R) (L,R) signifies numbers {L,L+1,L+2,L+3.......R-1,R} from above sequence
else (L,R) signifies numbers {L,L+1,L+2,.......n-2,n-1} & {0,1,2,3....R-1,R} ie numbers wrap around
example
n = 5 ie {0,1,2,3,4}
(0,3) signifies {0,1,2,3}
(3,0) signifies {3,4,0}
(3,2) signifies {3,4,0,1,2}
Now we have to select ONE (only one) number from each range without repeating any selection. We have to tell whether it is possible to select one number from each (and every) range without repetition.
Example test case
n = 5// numbers {0,1,2,3,4}
// ranges m in number //
0 0 ie {0}
1 2 ie {1,2}
2 3 ie {2,3}
4 4 ie {4}
4 0 ie {4,0}
Answer is "NO" it's not possible.
Because we cannot select any number from range 4 0: if we select 4 from it, we cannot select from range 4 4, and if we select 0 from it, we cannot select from range 0 0.
My approaches -
1) It can be done in O(N*M) using recursion, checking all possibilities of selection from each range while using a hash map to record our selections.
2) I was trying to do it in order n or m, i.e. in linear time. The problem lacks an editorial explanation; only code is given in the editorial, without comments or explanation. It is a linear solution by someone that passes all test cases and got accepted.
I am not able to understand the logic/algorithm used in the code, or why it works.
Please suggest ANY linear method and the logic behind it, because the problem has these constraints
1 <= N<= 10^9
1 <= M <= 10^5
0 <= L, R < N
which demands a linear or n log n solution, I guess?
The code in the editorial can also be seen here http://ideone.com/5Xb6xw
Warning: after looking at the code I found that it uses n and m interchangeably, so I would like to mention the input format for the problem.
INPUT FORMAT
The first line contains test cases, tc, followed by two integers N,M- the first one depicting the number of countries on the globe, the second one depicting the number of ranges his girlfriend has given him. After which, the next M lines will have two integers describing the range, X and Y. If (X <= Y), then range covers countries [X,X+1... Y] else range covers [X,X+1,.... N-1,0,1..., Y].
Output Format
Print "YES" if it is possible to do so, print "NO", if it is not.
There are two components to the editorial solution.
Linear-time reduction to a problem on ordinary intervals
Assume to avoid trivial cases that the number of input intervals is less than n.
The first is to reduce the problem to one where the intervals don't wrap around as follows. Given an interval [L, R], if L ≤ R, then emit two intervals [L, R] and [L + n, R + n]; if L > R, emit [L, R + n]. The easy direction of the reduction is showing that, if the original problem has a solution, then the reduced problem has a solution. For [L, R] with L ≤ R assigned a number k, assign k to [L, R] and k + n to [L + n, R + n]. For [L, R] with L > R, assign whichever of k, k + n belongs to [L, R + n]. Except for the dual assignment of k and k + n for intervals [L, R] and [L + n, R + n] respectively, each interval gets its own residue class mod n, so the assignments do not conflict.
Conversely, the hard direction of the reduction (if the original problem has no solution, then the reduced problem has no solution) is proved using Hall's marriage theorem. By Hall's criterion, an unsolvable original problem has, for some k, a set of k input intervals whose union has size less than k. We argue first that there exists such a set of input intervals whose union is a (circular) interval (which by assumption isn't all of 0..n-1). Decompose the union into the set of maximal (circular) intervals that comprise it. Each input interval is contained in exactly one of these intervals. By an averaging argument, some maximal (circular) interval contains more input intervals than its size. We finish by "lifting" this counterexample to the reduced problem. Given the maximal (circular) interval [L*, R*], we lift it to the ordinary interval [L*, R*] if L* ≤ R*, or [L*, R* + n] if L* > R*. Do likewise with the circular intervals contained in this interval. It is tedious but straightforward to show that this lifted counterexample satisfies Hall's criterion, which implies that the reduced problem has no solution.
O(m log m)-time solution for ordinary intervals
This is a sweep-line algorithm. Sort the intervals by lower endpoint and scan them in that order. We imagine that the sweep line moves from lower endpoint to lower endpoint. Maintain the set of intervals that intersect the sweep line and have not been assigned a number, sorted by upper endpoint. When the sweep line is about to move, assign the numbers between the old and new positions to the intervals in the set, preferentially to the ones whose upper endpoint is the lowest. The correctness of this strategy should be clear: the intervals that could be assigned a number but are passed over have at least as many options (in the sense of being a superset) as the intervals that are assigned, so we never make a choice that we have cause to regret.
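The two components can be sketched in Python roughly as follows. This is my own reading of the description, not the editorial code: `can_assign` is the sweep with a min-heap of upper endpoints, and `can_assign_circular` applies the reduction. Both names are made up, and the pigeonhole check for more ranges than numbers is an added assumption to cover the trivial case the answer sets aside.

```python
import heapq


def can_assign(intervals):
    """Greedy sweep: give each position to the open interval that ends first."""
    intervals = sorted(intervals)          # by lower endpoint
    heap = []                              # upper endpoints of unassigned open intervals
    i, m, pos = 0, len(intervals), 0
    while i < m or heap:
        if not heap:
            pos = intervals[i][0]          # jump over gaps (coordinates may be huge)
        while i < m and intervals[i][0] <= pos:
            heapq.heappush(heap, intervals[i][1])
            i += 1
        if heapq.heappop(heap) < pos:      # the most urgent interval already ended
            return False
        pos += 1                           # position `pos` is now used up
    return True


def can_assign_circular(n, ranges):
    """Reduction from circular ranges (L, R) on 0..n-1 to ordinary intervals."""
    if len(ranges) > n:                    # assumption: pigeonhole shortcut
        return False
    lifted = []
    for lo, hi in ranges:
        if lo <= hi:
            lifted.append((lo, hi))
            lifted.append((lo + n, hi + n))
        else:
            lifted.append((lo, hi + n))
    return can_assign(lifted)


# The test case from the question: expected answer is "NO".
print(can_assign_circular(5, [(0, 0), (1, 2), (2, 3), (4, 4), (4, 0)]))  # False
```

The heap does the "preferentially to the ones whose upper endpoint is the lowest" part; jumping `pos` to the next lower endpoint when the heap is empty keeps the running time at O(m log m) even when n is as large as 10^9.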
Given two sorted arrays A and B of length N. Each element is a natural number less than M. Determine all possible distances over all combinations of elements of A and B. In this case, if A[i] - B[j] < 0, then the distance is M + (A[i] - B[j]).
Example :
A = {0,2,3}
B = {1,2}
M = 5
Distances = {0,1,2,3,4}
Note: I know O(N^2) solution, but I need faster solution than O(N^2) and O(N x M).
Edit: Array A, B, and Distances contain distinct elements.
You can get an O(M log M) solution in the following way.
1. Prepare an array Ax of length M with Ax[i] = 1 if i belongs to A (and 0 otherwise)
2. Prepare an array Bx of length M with Bx[M-1-i] = 1 if i belongs to B (and 0 otherwise)
3. Use the Fast Fourier Transform to convolve these 2 sequences together
4. Inspect the output array; non-zero values correspond to possible distances. With a linear convolution, a non-zero value at index k means the difference k - (M-1) occurs, i.e. the distance (k - (M-1)) mod M.
Note that the FFT is normally done with floating point numbers, so in step 4 you probably want to test if the output is greater than 0.5 to avoid potential rounding noise issues.
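Here is a small pure-Python sketch of these steps, assuming a linear convolution padded to a power of two. The recursive `fft` helper and the name `distances_fft` are made up for illustration; in practice you would probably use a library FFT instead.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return a[:]
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def distances_fft(A, B, M):
    size = 1
    while size < 2 * M:                    # room for the full linear convolution
        size *= 2
    ax = [0] * size
    bx = [0] * size
    for a in A:
        ax[a] = 1
    for b in B:
        bx[M - 1 - b] = 1                  # reversed indicator of B
    prod = [x * y for x, y in zip(fft(ax), fft(bx))]
    conv = [v.real / size for v in fft(prod, invert=True)]
    # Index k collects pairs with a - b = k - (M - 1); 0.5 guards against rounding noise.
    return sorted({(k - (M - 1)) % M for k in range(2 * M - 1) if conv[k] > 0.5})


print(distances_fft([0, 2, 3], [1, 2], 5))  # [0, 1, 2, 3, 4]
```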
It can possibly be done with an optimized N*N approach.
Convert A to an array of 0s and 1s, with a 1 at each position that is present in A (in the range [0..M-1]).
After packing this array into bitmasks, the size of the A array decreases by a factor of 64.
This allows results to be inserted in blocks of 64.
The complexity will still be N*N, but the running time will be greatly decreased. The limit mentioned by the author is 50000 for the sizes of A and B and for M.
The expected operation count will be N*N/64 ~= 4*10^7. It will pass in 1 sec.
You can use bitvectors to accomplish this. Bitvector operations on large bitvectors are linear in the size of the bitvector, but are fast, easy to implement, and may work well given your 50k size limit.
Initialize two bitvectors of length M. Call these vectA and vectAnswer. Set the bits of vectA that correspond to the elements in A. Leave vectAnswer with all zeroes.
Define a method to rotate a bitvector by k elements (rotate down). I'll call this rotate(vect,k).
Then, for every element b of B, vectAnswer = vectAnswer | rotate(vectA,b).
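A minimal sketch of this, using a Python integer as the bitvector (an assumption made for brevity; a real implementation would likely use machine words). The names `rotate_down` and `distances_bitvector` are invented for the example.

```python
def rotate_down(vect, k, M):
    """Rotate an M-bit vector so that bit a moves to bit (a - k) mod M."""
    k %= M
    mask = (1 << M) - 1
    return ((vect >> k) | (vect << (M - k))) & mask


def distances_bitvector(A, B, M):
    vect_a = 0
    for a in A:
        vect_a |= 1 << a                   # set the bits corresponding to A
    vect_answer = 0
    for b in B:
        vect_answer |= rotate_down(vect_a, b, M)
    return [d for d in range(M) if (vect_answer >> d) & 1]


print(distances_bitvector([0, 2, 3], [1, 2], 5))  # [0, 1, 2, 3, 4]
```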
The setup
I am writing code for dealing with polynomials of degree n over a d-dimensional variable x and ran into a problem that others have likely faced in the past. Such a polynomial can be characterized by coefficients c(alpha) corresponding to x^alpha, where alpha is a length d multi-index specifying the powers the d variables must be raised to.
The dimension and order are completely general, but known at compile time, and could be easily as high as n = 30 and d = 10, though probably not at the same time. The coefficients are dense, in the sense that most coefficients are non-zero.
The number of coefficients required to specify such a polynomial is (n + d) choose n, which in high dimensions is much less than the n^d coefficients that could fill a cube of side length n. As a result, in my situation I have to store the coefficients rather compactly. This comes at a price, because retrieving a coefficient for a given multi-index alpha requires knowing its location.
The question
Is there a (straightforward) function mapping a d-dimensional multi-index alpha to a position in an array of length (n + d) choose n?
Ordering combinations
A well-known way to order combinations can be found on this Wikipedia page. Very briefly, you order the combinations lexically so you can easily count the number of lower combinations. An explanation can be found in the sections Ordering combinations and Place of a combination in the ordering.
Precomputing the binomial coefficients will speed up the index calculation.
Associating monomials with combinations
If we can now associate each monomial with a combination we can effectively order them with the method above. Since each coefficient corresponds with such a monomial this would provide the answer you're looking for. Luckily if
alpha = (a[1], a[2], ..., a[d])
then the combination you're looking for is
combination = (a[1] + 0, a[1] + a[2] + 1, ..., a[1] + a[2] + ... + a[d] + d - 1)
The index can then readily be calculated with the formula from the Wikipedia page.
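A short sketch of the resulting index computation, assuming the formula meant is the standard combinatorial number system: a strictly increasing combination (c_1, ..., c_d) is ranked as the sum of C(c_i, i). The function name `multi_index_to_position` is made up, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def multi_index_to_position(alpha):
    """Rank of the multi-index alpha among all multi-indices of the same length.

    The i-th combination element is (a[1] + ... + a[i]) + (i - 1), and the
    position is sum(C(c_i, i)) as in the combinatorial number system.
    """
    position = 0
    prefix = 0
    for i, a in enumerate(alpha, start=1):
        prefix += a
        position += comb(prefix + i - 1, i)
    return position


# d = 2, n = 2: the six multi-indices with a1 + a2 <= 2 map onto 0..5.
for alpha in [(0, 0), (0, 1), (1, 0), (0, 2), (1, 1), (2, 0)]:
    print(alpha, multi_index_to_position(alpha))
```

Because the rank depends only on alpha and not on n, all multi-indices of total degree at most n occupy exactly the positions 0 to C(n + d, n) - 1.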
A better, more object oriented solution, would be to create Monomial and Polynomial classes. The Polynomial class would encapsulate a collection of Monomials. That way you can easily model a pathological case like
y(x) = 1.0 + x^50
using just two terms rather than 51.
Another solution would be a map/dictionary where the key was the exponent and the value is the coefficient. That would only require two entries for my pathological case. You're in business if you have a C/C++ hash map.
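For the one-variable pathological case, the dictionary idea might look like the sketch below. The representation (exponent as key, coefficient as value) follows the paragraph above; the `evaluate` helper is a made-up name. For a d-dimensional polynomial the key could be a tuple of exponents instead, which is an assumption not spelled out here.

```python
def evaluate(poly, x):
    """Evaluate a sparse polynomial stored as {exponent: coefficient}."""
    return sum(coeff * x ** exp for exp, coeff in poly.items())


# y(x) = 1.0 + x^50 needs only two entries.
poly = {0: 1.0, 50: 1.0}
print(len(poly))            # 2
print(evaluate(poly, 1.0))  # 2.0
```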
Personally, I don't think doing it the naive way with arrays is so terrible, even with a polynomial containing 1000 terms. RAM is cheap; that array won't make or break you.
Are there any efficient techniques to do the following summation?
Given a finite set A containing n integers A={X1,X2,…,Xn}, where Xi is an integer. Now there are n subsets of A, denoted by A1, A2, ... , An. We want to calculate the summation for each subset. Are there some efficient techniques?
(Note that n is typically larger than the average size of all the subsets of A.)
For example, if A={1,2,3,4,5,6,7,9}, A1={1,3,4,5} , A2={2,3,4} , A3= ... . A naive way of computing the summation for A1 and A2 needs 5 Flops for additions:
Sum(A1)=1+3+4+5=13
Sum(A2)=2+3+4=9
...
Now, if we compute 3+4 first and record its result 7, we only need 4 Flops for additions (one for 3+4, two for A1 and one for A2):
Sum(A1)=1+7+5=13
Sum(A2)=2+7=9
...
What about the generalized case? Are there any efficient methods to speed up the calculation? Thanks!
For some choices of subsets there are ways to speed up the computation, if you don't mind doing some (potentially expensive) precomputation, but not for all. For instance, suppose your subsets are {1,2}, {2,3}, {3,4}, {4,5}, ..., {n-1,n}, {n,1}; then the naive approach uses one arithmetic operation per subset, and you obviously can't do better than that. On the other hand, if your subsets are {1}, {1,2}, {1,2,3}, {1,2,3,4}, ..., {1,2,...,n} then you can get by with n-1 arithmetic ops, whereas the naive approach is much worse.
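The nested case can be sketched as a running total, which is where the n-1 figure comes from. The function name `nested_sums` is invented for this illustration.

```python
def nested_sums(values):
    """Sums of {x1}, {x1,x2}, ..., {x1..xn} using n - 1 additions in total."""
    sums = []
    total = None
    for v in values:
        total = v if total is None else total + v   # one addition per new subset
        sums.append(total)
    return sums


print(nested_sums([1, 2, 3, 4]))  # [1, 3, 6, 10]
```

Summing each subset from scratch would instead cost 0 + 1 + ... + (n-1) = n(n-1)/2 additions.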
Here's one way to do the precomputation. It will not always find optimal results. For each ordered pair of subsets X, Y, define the transition cost from X to Y to be min(size of symmetric difference, size of Y - 1). (The symmetric difference of X and Y is the set of things that are in X or Y but not both.) So the transition cost is the number of arithmetic operations you need to do to compute the sum of Y's elements, given the sum of X's. Add the empty set to your list of subsets, and compute a minimum-cost directed spanning tree using Edmonds' algorithm (http://en.wikipedia.org/wiki/Edmonds%27_algorithm) or one of the faster but more complicated variations on that theme. Now make sure that when your spanning tree has an edge X -> Y you compute X before Y. (This is a "topological sort" and can be done efficiently.)
This will give distinctly suboptimal results when, e.g., you have {1,2}, {3,4}, {1,2,3,4}, {5,6}, {7,8}, {5,6,7,8}. After deciding your order of operations using the procedure above you could then do an optimization pass where you find cheaper ways to evaluate each set's sum given the sums already computed, and this will probably give fairly decent results in practice.
I suspect, but have made no attempt to prove, that finding an optimal procedure for a given set of subsets is NP-hard or worse. (It is certainly computable; the set of possible computations you might do is finite. But, on the face of it, it may be awfully expensive; potentially you might be keeping track of about 2^n partial sums, be adding any one of them to any other at each step, and have up to about n^2 steps, for a super-naive cost of (2^(2n))^(n^2) = 2^(2n^3) operations to try every possibility.)
Assuming that 'addition' isn't simply an ADD operation but instead some very intensive function involving two integer operands, then an obvious approach would be to cache the results.
You could achieve that via a suitable data structure, for example a key-value dictionary containing keys formed by the two operands and the answers as the value.
But as you specified C in the question, then the simplest approach would be an n by n array of integers, where the solution to x + y is stored at array[x][y].
You can then repeatedly iterate over the subsets, and for each pair of operands you check the appropriate position in the array. If no value is present then it must be calculated and placed in the array. The value then replaces the two operands in the subset and you iterate.
If the operation is commutative then the operands should be sorted prior to looking up the array (i.e. so that the first index is always the smallest of the two operands) as this will maximise "cache" hits.
A common optimization technique is to pre-compute intermediate results. In your case, you might pre-compute all sums with 2 summands from A and store them in a lookup table. This will result in |A|*(|A|+1)/2 table entries, where |A| is the cardinality of A.
In order to compute the element sum of Ai, you:
look up the sum of the first two elements of Ai and save it in tmp
while there is an element x left in Ai:
look up the sum of tmp and x
In order to compute the element sum of A1 = {1,3,4,5} from your example, you do the following:
lookup(1,3) = 4
lookup(4,4) = 8
lookup(8,5) = 13
Note that computing the sum of any given Ai doesn't require summation, since all the work has already been conducted while pre-computing the lookup table.
If you store the lookup table in a hash table, then lookup() is in O(1).
Possible optimizations to this approach:
construct the lookup table while computing the summation results; hence, you only compute those summations that you actually need. Your lookup table is now a cache.
if your addition operation is commutative, you can save half of your cache size by storing only those summations where the smaller summand comes first. Then modify lookup() such that lookup(a,b) = lookup(b,a) if a > b.
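A minimal sketch of the cache variant with both optimizations, assuming a commutative operation. The names `make_lookup` and `subset_sum` are made up, and plain integer addition stands in for whatever expensive operation is really being cached.

```python
def make_lookup(add):
    """Wrap a commutative `add` in a cache keyed by the sorted operand pair."""
    cache = {}

    def lookup(a, b):
        if a > b:                      # store only (smaller, larger)
            a, b = b, a
        if (a, b) not in cache:        # compute lazily, only when needed
            cache[(a, b)] = add(a, b)
        return cache[(a, b)]

    return lookup, cache


def subset_sum(subset, lookup):
    """Fold the subset left to right through lookup(), as in the steps above."""
    items = iter(subset)
    tmp = next(items)
    for x in items:
        tmp = lookup(tmp, x)
    return tmp


lookup, cache = make_lookup(lambda a, b: a + b)
print(subset_sum([1, 3, 4, 5], lookup))  # 13
print(subset_sum([2, 3, 4], lookup))     # 9
```

Note that a left-to-right fold only gets cache hits when subsets share a prefix, so on this example nothing is reused; it does not discover the shared 3+4 by itself.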
Assuming summation is a time-consuming operation, you can find the LCS of every pair of subsets (assuming they are sorted as mentioned in the comments; if they are not sorted, sort them). After that, calculate the sum of the LCS of maximum length (over all pairwise LCSs), replace its elements in the affected arrays with that sum, update the LCSs, and continue this way until there is no LCS with more than one number. Sure, this is not optimal, but it's better than the naive algorithm (fewer summations). However, you can do backtracking to find the best solution.
e.g For your sample input:
A1={1,3,4,5} , A2={2,3,4}
LCS (A_1,A_2) = {3,4} ==>7 ==>replace it:
A1={1,5,7}, A2={2,7} ==> LCS = {7}, maximum LCS length is `1`, so calculate sums.
You can still improve it by calculating the sum of two random numbers, then taking the LCS again, ...
NO. There is no efficient technique,
because it is an NP-complete problem, and there are no known efficient solutions for such problems.
Why is it NP-complete?
We could use an algorithm for this problem to solve the set cover problem, just by adding an extra set containing all elements.
Example:
We have sets of elements
A1={1,2}, A2={2,3}, A3 = {3,4}
We want to solve set cover problem.
We add to these sets a set containing all elements:
A4 = {1,2,3,4}
We use the algorithm that John Smith is asking for and check which sets the solution for A4 is represented with.
We solved NP-Complete problem.