matrix mul max value estimate


Given a matrix product C = A*B, is there an O(N^2) way to estimate the maximum value in C? Or rather, what is a good way to do so?

How about this:
For each row in A and each column in B, find the squared vector norm (i.e. the sum of squares). O(n^2)
For each combination of a row from A and a column from B, multiply the corresponding squared norms. O(n^2)
Find the maximum of these. O(n^2)
The square root of this will be an upper bound for max(abs(C)). Why? Because, from the Cauchy-Schwarz inequality, we know that |<x,y>|^2 <= <x,x>.<y,y>, where <> denotes the inner product. We have calculated the RHS of this relationship for each element of C; we therefore know that the square of the corresponding element of C (the LHS) cannot exceed it.
Disclaimer: There may well be a method to give a tighter bound; this was the first thing that came to mind.
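A minimal Python sketch of the Cauchy-Schwarz bound described above, using plain nested lists. The function names `cauchy_schwarz_bound` and `matmul` are illustrative, not from the original answer. Because all squared norms are non-negative, the largest product is simply the largest row norm times the largest column norm.

```python
import math

def cauchy_schwarz_bound(A, B):
    """Upper bound on max(abs(C)) for C = A*B, computed in O(n^2)."""
    row_sq = [sum(x * x for x in row) for row in A]        # squared norm of each row of A
    col_sq = [sum(x * x for x in col) for col in zip(*B)]  # squared norm of each column of B
    # All values are non-negative, so the maximum product is the product of the maxima.
    return math.sqrt(max(row_sq) * max(col_sq))

def matmul(A, B):
    """Naive O(n^3) product, used only to check the bound."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

A = [[1, -2], [3, 4]]
B = [[0, 5], [-1, 2]]
C = matmul(A, B)
bound = cauchy_schwarz_bound(A, B)
print(max(abs(x) for row in C for x in row), "<=", bound)
```

For this small example the true maximum is 23 and the bound is sqrt(725), roughly 26.9.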

Obviously,
N * max(abs(A)) * max(abs(B))
is an upper bound (since each element of C is the sum of N products of two values from A and B).

This is my take:
A,B,C
a(i) = max(abs(A(i,:)))
b(j) = max(abs(B(:,j)))
c(i,j) = N*a(i)*b(j), and the estimate is the maximum of c(i,j) over all i, j
What do you think? I'm going to try Oli's answer and see which gives me the best approximation/performance.
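A small Python sketch of this per-entry bound, under the assumption that a(i) is the largest absolute value in row i of A and b(j) is the largest absolute value in column j of B (the column reading is what makes N*a(i)*b(j) a valid bound for C(i,j)). The name `entry_bounds` is illustrative.

```python
def entry_bounds(A, B):
    """Return c with c[i][j] = N * a[i] * b[j], an upper bound on abs(C[i][j])."""
    n = len(B)                                         # inner dimension N
    a = [max(abs(x) for x in row) for row in A]        # row maxima of A
    b = [max(abs(x) for x in col) for col in zip(*B)]  # column maxima of B
    return [[n * ai * bj for bj in b] for ai in a]

A = [[1, -2], [3, 4]]
B = [[0, 5], [-1, 2]]
bounds = entry_bounds(A, B)
print(bounds)
```

Note that the largest entry of this table always equals N * max(abs(A)) * max(abs(B)), so as a single global estimate it is no tighter than the obvious bound; it only helps if per-entry bounds are wanted.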

Related

Is it possible to do 3-sum/4-sum...k-sum better than O(n^2) with these conditions? - Tech Interview

this is a classic problem, but I am curious if it is possible to do better with these conditions.
Problem: Suppose we have a sorted array of length 4*N, that is, each element is repeated 4 times. Note that N can be any natural number. Also, each element in the array is subject to the constraint 0 < A[i] < 190*N. Are there 4 elements in the array such that A[i] + A[j] + A[k] + A[m] = V, where V can be any positive integer; note we must use exactly 4 elements and they can be repeated. It is not necessarily a requirement to find the 4 elements that satisfy the condition, rather, just showing it can be done for a given array and V is enough.
Ex : A = [1,1,1,1,4,4,4,4,5,5,5,5,11,11,11,11]
V = 22
This is true because, 11 + 5 + 5 + 1 = 22.
My attempt:
Instead of "4sum" I first tried k-sum, but this proved pretty difficult so I instead went for this variation. The first solution I came to was rather naive O(n^2). However, given these constraints, I imagine that we can do better. I tried some dynamic programming methods and divide and conquer, but that didn't quite get me anywhere. To be specific, I am not sure how to cleverly approach this in a way where I can "eliminate" portions of the array without having to explicitly check values against all or almost all permutations.

Make a vector S0 of length 256N where S0[x]=1 if x appears in A.
Perform a convolution of S0 with itself to produce a new vector S1 of length 512N. S1[x] is nonzero iff x is the sum of 2 numbers in A.
Perform a convolution of S1 with itself to make a new vector S2. S2[x] is nonzero iff x is the sum of 4 numbers in A.
Check S2[V] to get your answer.
Convolution can be performed in O(N log N) time using FFT convolution (http://www.dspguide.com/ch18/2.htm) or similar techniques.
Since at most 4 such convolutions are performed, the total complexity is O(N log N)
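The steps above can be sketched in Python as follows. This is an illustration, not the answerer's code: the small recursive FFT, the helper names, and the choice to size S0 by max(array) + 1 rather than 256N are all assumptions made to keep the example self-contained. After each convolution the result is clamped back to 0/1 so the values stay small.

```python
import cmath

def fft(values, inverse=False):
    """Recursive radix-2 FFT; len(values) must be a power of two."""
    n = len(values)
    if n == 1:
        return list(values)
    even = fft(values[0::2], inverse)
    odd = fft(values[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def convolve_indicator(p, q):
    """Return s with s[x] = 1 iff p[i] and q[j] are nonzero for some i + j = x."""
    out_len = len(p) + len(q) - 1
    size = 1
    while size < out_len:
        size *= 2
    fp = fft([complex(v) for v in p] + [0j] * (size - len(p)))
    fq = fft([complex(v) for v in q] + [0j] * (size - len(q)))
    product = fft([x * y for x, y in zip(fp, fq)], inverse=True)
    # The unscaled inverse is `size` times the convolution; rounding removes float noise.
    return [1 if round(c.real / size) > 0 else 0 for c in product[:out_len]]

def has_four_sum(array, target):
    """True iff four (possibly repeated) values of array sum to target."""
    s0 = [0] * (max(array) + 1)
    for x in array:
        s0[x] = 1
    s1 = convolve_indicator(s0, s0)   # sums of two values
    s2 = convolve_indicator(s1, s1)   # sums of four values
    return 0 <= target < len(s2) and s2[target] == 1

A = [1, 1, 1, 1, 4, 4, 4, 4, 5, 5, 5, 5, 11, 11, 11, 11]
print(has_four_sum(A, 22))
```

Treating S0 as a 0/1 indicator relies on the problem's guarantee that every value appears four times, so any value may be reused up to four times.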

Maximal subset sum smaller than a given value

You are given an array of integers and a number k. The question is to find a subset such that the sum is maximal and smaller than given number k.
I feel like there is a dynamic programming approach to solve this but I am not sure how to solve this problem efficiently.

The simple dynamic programs for this class of problems all use the same basic trick: for each successive prefix of the array, compute the set of its subset sums, depending on the fact that, when the input elements are bounded, this set has many fewer than 2^n elements (for the problem in question, less than 10,000,000 instead of 2^1000). This particular problem, in Python:
def maxsubsetsumlt(array, k):
    sums = {0}
    for elem in array:
        sums.update({sum_ + elem for sum_ in sums if sum_ + elem < k})
    return max(sums)

This could be done using the Subset Sum algorithm with a small adaptation. Instead of returning the boolean value from the last (bottom-right) cell, scan the last row from right to left, starting just below k, for the first cell with a value of true; it indicates that there is a combination of elements that sums to this particular k_i < k. This k_i is your maximal sum smaller than k. This algorithm has a worst-case time and space complexity of O(nk).
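A minimal sketch of that table-based variant in Python, assuming non-negative integer inputs and k > 0. It keeps only one row of the table (a common space optimisation, so this version uses O(k) space rather than O(nk)); the function name is illustrative.

```python
def max_subset_sum_below(values, k):
    """Largest subset sum strictly less than k (non-negative ints, k > 0)."""
    reachable = [False] * k      # reachable[s]: some subset sums to exactly s
    reachable[0] = True          # the empty subset
    for v in values:
        # Go downwards so each element is used at most once.
        for s in range(k - 1, v - 1, -1):
            if reachable[s - v]:
                reachable[s] = True
    # Scan from just below k for the first reachable sum.
    for s in range(k - 1, -1, -1):
        if reachable[s]:
            return s

print(max_subset_sum_below([3, 5, 8], 10))
```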

Largest triangle in convex hull

The question has already been answered, but I am having trouble understanding one of the answers.
From
https://stackoverflow.com/a/1621913/2673063
How is the following algorithm O(n)?
It states:
By first sorting the points / computing the convex hull (in O(n log n) time) if necessary, we can assume we have the convex polygon/hull with the points cyclically sorted in the order they appear in the polygon. Call the points 1, 2, 3, … , n. Let (variable) points A, B, and C, start as 1, 2, and 3 respectively (in the cyclic order). We will move A, B, C until ABC is the maximum-area triangle. (The idea is similar to the rotating calipers method, as used when computing the diameter (farthest pair).)
With A and B fixed, advance C (e.g. initially, with A=1, B=2, C is advanced through C=3, C=4, …) as long as the area of the triangle increases, i.e., as long as Area(A,B,C) ≤ Area(A,B,C+1). This point C will be the one that maximizes Area(ABC) for those fixed A and B. (In other words, the function Area(ABC) is unimodal as a function of C.)
Next, advance B (without changing A and C) if that increases the area. If so, again advance C as above. Then advance B again if possible, etc. This will give the maximum area triangle with A as one of the vertices. (The part up to here should be easy to prove, and simply doing this separately for each A would give O(n^2). But read on.) Now advance A again, if it improves the area, etc.
Although this has three "nested" loops, note that B and C always advance "forward", and they advance at most 2n times in total (similarly A advances at most n times), so the whole thing runs in O(n) time.
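For reference, here is a Python sketch of the simpler O(n^2) variant mentioned in the quote (handle each A separately and let C only move forward as B advances). It is deliberately not the linear-time version under discussion, and it assumes a strictly convex polygon given in cyclic order; the function names are illustrative.

```python
def area2(p, q, r):
    """Twice the unsigned area of triangle pqr."""
    return abs((q[0] - p[0]) * (r[1] - p[1]) - (q[1] - p[1]) * (r[0] - p[0]))

def largest_triangle_area2(hull):
    """Twice the area of the largest triangle on the vertices of a convex polygon."""
    n = len(hull)
    best = 0
    for a in range(n):
        c = a + 2
        for b in range(a + 1, a + n - 1):
            if c <= b:
                c = b + 1
            # For fixed A and B the area is unimodal in C, so climb while it does not drop.
            while (c + 1 < a + n and
                   area2(hull[a], hull[b % n], hull[(c + 1) % n])
                   >= area2(hull[a], hull[b % n], hull[c % n])):
                c += 1
            best = max(best, area2(hull[a], hull[b % n], hull[c % n]))
    return best

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(largest_triangle_area2(square) / 2)
```

For each A, B and C together advance O(n) times, which is where the O(n^2) total comes from.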

As the author of the answer that is the subject of the question, I feel obliged to give a more detailed explanation of the O(n) runtime.
Firstly, just as an example, here is a figure from the paper, showing the first few steps of the algorithm, for a particular sample input (a 12-gon). First we start with A, B, C as three consecutive vertices (step 1 in the figure), advance C as long as area increases (steps 2 to 6), then advance B, and so on.
The triangles with asterisks above them are the "anchored local maxima", i.e., the ones that are best for a given A (i.e., advancing either C or B would decrease the area).
As far as the runtime being O(n): Let the "actual" value of B, in terms of the number of times it's been incremented and ignoring the wrap around, be nB, and similarly for C be nC. (In other words, B = nB % n and C = nC % n.) Now, note that,
("B is ahead of A") whatever the value of A, we have A ≤ nB < A + n
nB is always increasing
So, as A varies from 0 to n, we know that nB only varies between 0 and 2n: it can be incremented at most 2n times. Similarly nC. This shows that the running time of the algorithm, which is proportional to the total number of times A, B and C are incremented, is bounded by O(n) + O(2n) + O(2n), which is O(n).

Think about it like this: each of A, B, C is a pointer that, at any given moment, points to one of the elements of the convex hull. Due to the way the algorithm increments them, each one of them will point to each element of the convex hull at most once. Therefore, each one will iterate over a collection of O(n) elements. They are never reset: once one of them has passed an element, it will not pass that element ever again.
Since there are 3 pointers (A, B, C), we have time complexity 3 * O(n) = O(n).
Edit:
As the code is presented in the provided link, it sounds possible that it is not O(n), since B and C wrap around the array. However, according to the description, this wrapping around does not sound necessary: before seeing the code, I imagined the method stopping the advancement of B and C past n. In that case, it would definitely be O(n). As the code is presented however, I'm not sure.
It might still be that, for some mathematical reason, B and C still iterate only O(n) times in the entirety of the algorithm, but I can't prove that. Neither can I prove that it is correct to not wrap around (as long as you take care of index out of bounds errors).

fast algorithm of finding sums in array

I am looking for a fast algorithm:
I have an int array of size n; the goal is to find all patterns in the array such that
x1, x2, x3 are different elements in the array and x1+x2 = x3
For example, if the int array of size 3 is [1, 2, 3], then there is only one possibility: 1+2 = 3 (1+2 and 2+1 count as the same)
I am thinking about using pairs and hash maps to make the algorithm fast. (The fastest one I have now is still O(n^2).)
Please share your ideas for this problem, thank you.

Edit: The answer below applies to a version of this problem in which you only want one triplet that adds up like that. When you want all of them, since there are potentially at least O(n^2) possible outputs (as pointed out by ex0du5), and even O(n^3) in pathological cases of repeated elements, you're not going to beat the simple O(n^2) algorithm based on hashing (mapping from a value to the list of indices with that value).
This is basically the 3SUM problem. With potentially unboundedly large elements, the best known algorithms are approximately O(n^2), but we've only proved that it can't be faster than O(n lg n) for most models of computation.
If the integer elements lie in the range [u, v], you can do a slightly different version of this in O(n + (v-u) lg (v-u)) with an FFT. I'm going to describe a process to transform this problem into that one, solve it there, and then figure out the answer to your problem based on this transformation.
The problem that I know how to solve with FFT is to find a length-3 arithmetic sequence in an array: that is, a sequence a, b, c with c - b = b - a, or equivalently, a + c = 2b.
Unfortunately, the last step of the transformation back isn't as fast as I'd like, but I'll talk about that when we get there.
Let's call your original array X, which contains integers x_1, ..., x_n. We want to find indices i, j, k such that x_i + x_j = x_k.
Find the minimum u and maximum v of X in O(n) time. Let u' be min(u, u*2) and v' be max(v, v*2).
Construct a binary array (bitstring) Z of length v' - u' + 1; Z[i] will be true if either X or its double [x_1*2, ..., x_n*2] contains u' + i. This is O(n) to initialize; just walk over each element of X and set the two corresponding elements of Z.
As we're building this array, we can save the indices of any duplicates we find into an auxiliary list Y. Once Z is complete, we just check for 2 * x_i for each x_i in Y. If any are present, we're done; otherwise the duplicates are irrelevant, and we can forget about Y. (The only situation slightly more complicated is if 0 is repeated; then we need three distinct copies of it to get a solution.)
Now, a solution to your problem, i.e. x_i + x_j = x_k, will appear in Z as three evenly-spaced ones, since some simple algebraic manipulations give us 2*x_j - x_k = x_k - 2*x_i. Note that the elements on the ends are our special doubled entries (from 2X) and the one in the middle is a regular entry (from X).
Consider Z as a representation of a polynomial p, where the coefficient for the term of degree i is Z[i]. If X is [1, 2, 3, 5], then Z is 1111110001 (because we have 1, 2, 3, 4, 5, 6, and 10); p is then 1 + x + x^2 + x^3 + x^4 + x^5 + x^9.
Now, remember from high school algebra that the coefficient of x^c in the product of two polynomials is the sum over all a, b with a + b = c of the first polynomial's coefficient for x^a times the second's coefficient for x^b. So, if we consider q = p^2, the coefficient of x^(2j) (for a j with Z[j] = 1) will be the sum over all i of Z[i] * Z[2*j - i]. But since Z is binary, that's exactly the number of triplets i,j,k which are evenly-spaced ones in Z. Note that (j, j, j) is always such a triplet, so we only care about ones with values > 1.
We can then use a Fast Fourier Transform to find p^2 in O(|Z| log |Z|) time, where |Z| is v' - u' + 1. We get out another array of coefficients; call it W.
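To make Z and W concrete, here is a short Python sketch for the example X = [1, 2, 3, 5]. It squares the polynomial with a naive O(|Z|^2) loop purely for readability; that is a stand-in for the FFT step, and the helper names are illustrative.

```python
def build_z(xs):
    """Return (Z, u') where Z[i] = 1 iff u' + i is in X or in 2*X."""
    u, v = min(xs), max(xs)
    lo, hi = min(u, 2 * u), max(v, 2 * v)   # u' and v'
    z = [0] * (hi - lo + 1)
    for x in xs:
        z[x - lo] = 1
        z[2 * x - lo] = 1
    return z, lo

def square_coefficients(z):
    """Coefficients W of p^2, where Z holds the coefficients of p (naive, not FFT)."""
    w = [0] * (2 * len(z) - 1)
    for i, zi in enumerate(z):
        if zi:
            for j, zj in enumerate(z):
                w[i + j] += zi * zj
    return w

X = [1, 2, 3, 5]
Z, u_prime = build_z(X)
W = square_coefficients(Z)
for x_k in X:
    print(x_k, W[2 * (x_k - u_prime)])
```

Here W[2*(x_k - u')] is 1 only for x_k = 1; the other elements have counts above 1 and would need the follow-up check described next.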
Loop over each x_k in X. (Recall that our desired evenly-spaced ones are all centered on an element of X, not 2*X.) If the corresponding W for twice this element, i.e. W[2*(x_k - u')], is 1, we know it's not the center of any nontrivial progressions and we can skip it. (As argued before, it should only be a positive integer.)
Otherwise, it might be the center of a progression that we want (so we need to find i and j). But, unfortunately, it might also be the center of a progression that doesn't have our desired form. So we need to check. Loop over the other elements x_i of X, and check if there's a triple with 2*x_i, x_k, 2*x_j for some j (by checking Z[2*(x_k - x_i) - u']). If so, we have an answer; if we make it through all of X without a hit, then the FFT found only spurious answers, and we have to check another element of W.
This last step is therefore O(n * (1 + (number of x_k with W[2*(x_k - u')] > 1 that aren't actually solutions))), which may be O(n^2), which is obviously not okay. There should be a way to avoid generating these spurious answers in the output W; if we knew that any appropriate W coefficient definitely had an answer, this last step would be O(n) and all would be well.
I think it's possible to use a somewhat different polynomial to do this, but I haven't gotten it to actually work. I'll think about it some more....
Partially based on this answer.

It has to be at least O(n^2) as there are n(n-1)/2 different sums possible to check for other members. You have to compute all those, because any pair summed may be any other member (start with one example and permute all the elements to convince yourself that all must be checked). Or look at the Fibonacci sequence for something concrete.
So calculating that and looking up members in a hash table gives amortised O(n^2). Or use an ordered tree if you need best worst-case.
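A minimal Python sketch of the hash-based O(n^2) approach: map each value to the list of indices holding it, then look up every pairwise sum. The function name and the choice to report index triples (i, j, k) with i < j are assumptions for illustration.

```python
from collections import defaultdict

def sum_triples(xs):
    """All (i, j, k) with i < j, k different from both, and xs[i] + xs[j] == xs[k]."""
    positions = defaultdict(list)          # value -> indices with that value
    for idx, value in enumerate(xs):
        positions[value].append(idx)
    triples = []
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            # .get avoids inserting new keys into the defaultdict
            for k in positions.get(xs[i] + xs[j], ()):
                if k != i and k != j:
                    triples.append((i, j, k))
    return triples

print(sum_triples([1, 2, 3]))
```

With many repeated values the inner lookup can return long index lists, which is the pathological O(n^3) output size mentioned in the edit above.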

You essentially need to find all the different sums of value pairs so I don't think you're going to do any better than O(n^2). But you can optimize by sorting the list and reducing duplicate values, then only pairing a value with anything equal or greater, and stopping when the sum exceeds the maximum value in the list.

find if two arrays contain the same set of integers without extra space and faster than NlogN

I came across this post, which reports the following interview question:
Given two arrays of numbers, find if each of the two arrays have the
same set of integers ? Suggest an algo which can run faster than NlogN
without extra space?
The best that I can think of is the following:
(a) sort each array, and then (b) have two pointers moving along the two arrays and check if you find different values ... but step (a) already has NlogN complexity :(
(a) scan the shortest array and put its values into a map, and then (b) scan the second array and check if you find a value that is not in the map ... here we have linear complexity, but we use extra space
... so, I can't think of a solution for this question.
Ideas?
Thank you for all the answers. I feel many of them are right, but I decided to choose ruslik's one, because it gives an interesting option that I did not think about.

You can try a probabilistic approach by choosing a commutative function for accumulation (eg, addition or XOR) and a parametrized hash function.
unsigned addition(unsigned a, unsigned b);
unsigned hash(int n, int h_type);
unsigned hash_set(int* a, int num, int h_type){
    unsigned rez = 0;
    for (int i = 0; i < num; i++)
        rez = addition(rez, hash(a[i], h_type));
    return rez;
}
In this way, the number of tries needed before the probability of a false positive falls below a certain threshold does not depend on the number of elements, so the check is linear.
EDIT: In the general case the probability of the sets being the same is very small, so this O(n) check with several hash functions can be used for prefiltering: to decide as fast as possible whether they are surely different or whether they may be equivalent, in which case a slow deterministic method should be used. The final average complexity will be O(n), but the worst case will have the complexity of the deterministic method.
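A rough Python sketch of this prefilter. The mixing function `mix`, the seed values, and the 32-bit wrap-around are all hypothetical choices for illustration; any parametrized hash would do. Note that summing hashes compares the arrays as multisets, so duplicates count.

```python
MASK = 0xFFFFFFFF

def mix(value, seed):
    """Hypothetical parametrized 32-bit hash (multiply and xor-shift steps)."""
    h = (value ^ seed) & MASK
    h = (h * 0x9E3779B1) & MASK
    h ^= h >> 15
    h = (h * 0x85EBCA77) & MASK
    h ^= h >> 13
    return h

def fingerprint(values, seed):
    """Order-independent accumulation: sum of hashes modulo 2**32."""
    total = 0
    for v in values:
        total = (total + mix(v, seed)) & MASK
    return total

def maybe_equal(a, b, seeds=(0x12345678, 0x0BADF00D, 0x5EED5EED)):
    """False means surely different; True means 'possibly the same, verify if needed'."""
    return len(a) == len(b) and all(
        fingerprint(a, s) == fingerprint(b, s) for s in seeds)

print(maybe_equal([3, 1, 2], [2, 3, 1]))
```

A True result should still be confirmed by a deterministic method if certainty is required.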

You said "without extra space" in the question but I assume that you actually mean "with O(1) extra space".
Suppose that all the integers in the arrays are less than k. Then you can use in-place radix sort to sort each array in time O(n log k) with O(log k) extra space (for the stack, as pointed out by yi_H in comments), and compare the sorted arrays in time O(n log k). If k does not vary with n, then you're done.

I'll assume that the integers in question are of fixed size (eg. 32 bit).
Then, radix-quicksorting both arrays in place (aka "binary quicksort") is constant space and O(n).
In the case of unbounded integers, I believe (but cannot prove, even if it is probably doable) that you cannot break the O(n k) barrier, where k is the number of digits of the greatest integer in either array.
Whether this is better than O(n log n) depends on how k is assumed to scale with n, and therefore depends on what the interviewer expects of you.
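A Python sketch of the binary (radix) quicksort idea, assuming non-negative integers that fit in a fixed number of bits (32 by default). It partitions in place on one bit at a time; the recursion depth is bounded by the bit width. The function names are illustrative, and `same_set` sorts both inputs in place, which reuses the input space rather than allocating new arrays.

```python
def binary_radix_sort(arr, lo=0, hi=None, bit=31):
    """Sort arr[lo:hi] in place, partitioning on bits from `bit` down to 0."""
    if hi is None:
        hi = len(arr)
    if hi - lo < 2 or bit < 0:
        return
    mask = 1 << bit
    i, j = lo, hi - 1
    while i <= j:
        if arr[i] & mask:
            arr[i], arr[j] = arr[j], arr[i]   # bit set: move to the upper part
            j -= 1
        else:
            i += 1
    binary_radix_sort(arr, lo, i, bit - 1)    # elements with the bit clear
    binary_radix_sort(arr, i, hi, bit - 1)    # elements with the bit set

def same_set(a, b, bits=32):
    """True iff a and b contain the same set of values (duplicates ignored)."""
    binary_radix_sort(a, bit=bits - 1)
    binary_radix_sort(b, bit=bits - 1)
    i = j = 0
    while i < len(a) and j < len(b):
        if a[i] != b[j]:
            return False
        value = a[i]
        while i < len(a) and a[i] == value:   # skip duplicates in a
            i += 1
        while j < len(b) and b[j] == value:   # skip duplicates in b
            j += 1
    return i == len(a) and j == len(b)

data = [5, 3, 9, 3, 1]
binary_radix_sort(data)
print(data)
```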

A special, not harder case is when one array holds 1,2,..,n. This was discussed many times:
How to tell if an array is a permutation in O(n)?
Algorithm to determine if array contains n...n+m?
mathoverflow
and despite many tries no deterministic solutions using O(1) space and O(n) time were shown. Either you can cheat the requirements in some way (reuse input space, assume integers are bounded) or use probabilistic test.
Probably this is an open problem.

Here is a co-RP algorithm:
In linear time, iterate over the first array (A), building the polynomial
Pa = (A[0] - x)(A[1] - x)...(A[n-1] - x). Do the same for array B, naming this polynomial Pb.
We now want to answer the question "is Pa = Pb?" We can check this probabilistically as follows. Select a number r uniformly at random from the range [0...4n] and compute d = Pa(r) - Pb(r) in linear time. If d = 0, return true; otherwise return false.
Why is this valid? First of all, observe that if the two arrays contain the same elements, then Pa = Pb, so Pa(r) = Pb(r) for all r. With this in mind, we can easily see that this algorithm will never erroneously reject two identical arrays.
Now we must consider the case where the arrays are not identical. By the Schwartz-Zippel Lemma, P(Pa(r) - Pb(r) = 0 | Pa != Pb) < (n/4n). So the probability that we accept the two arrays as equivalent when they are not is < (1/4).
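A minimal Python sketch of this test, relying on Python's arbitrary-precision integers so the products are exact (a fixed-width implementation would have to work modulo something, which this sketch does not address). The optional parameter `r` is an addition for illustration so a specific evaluation point can be supplied; by default it is drawn uniformly from [0, 4n]. The test treats the arrays as multisets of equal length.

```python
import random

def probably_same(a, b, r=None):
    """One round of the polynomial identity test; never rejects identical multisets."""
    if len(a) != len(b):
        return False
    if r is None:
        r = random.randint(0, 4 * len(a))   # both endpoints inclusive
    pa = 1
    for x in a:
        pa *= (x - r)                       # Pa(r)
    pb = 1
    for x in b:
        pb *= (x - r)                       # Pb(r)
    return pa == pb

print(probably_same([1, 2, 3], [3, 1, 2]))
```

For a = [1, 2, 3] and b = [1, 2, 4], the points r = 1 and r = 2 are roots of Pa - Pb and give a false accept, while any other r rejects; repeating the test with fresh random r drives the error probability down.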

The usual assumption for these kinds of problems is Theta(log n)-bit words, because that's the minimum needed to index the input.
sshannin's polynomial-evaluation answer works fine over finite fields, which sidesteps the difficulties with limited-precision registers. All we need is a prime of the appropriate size (easy to find under the same assumptions that support a lot of public-key crypto) or an irreducible polynomial in (Z/2)[x] of the appropriate degree (the difficulty here is multiplying polynomials quickly, but I think the algorithm would be o(n log n)).
If we can modify the input with the restriction that it must maintain the same set, then it's not too hard to find space for radix sort. Select the (n/log n)th element from each array and partition both arrays. Sort the size-(n/log n) pieces and compare them. Now use radix sort on the size-(n - n/log n) pieces. From the previously processed elements, we can obtain n/log n bits, where bit i is on if a[2*i] > a[2*i + 1] and off if a[2*i] < a[2*i + 1]. This is sufficient to support a radix sort with n/(log n)^2 buckets.

In the algebraic decision tree model, there are known Omega(NlogN) lower bounds for computing set intersection (irrespective of the space limits).
For instance, see here: http://compgeom.cs.uiuc.edu/~jeffe/teaching/497/06-algebraic-tree.pdf
So unless you do clever bit manipulations/hashing type approaches, you cannot do better than NlogN.
For instance, if you used only comparisons, you cannot do better than NlogN.

You can break the O(n*log(n)) barrier if you have some restrictions on the range of numbers. But it's not possible to do this if you cannot use any extra memory (you need really silly restrictions to be able to do that).
I would also like to note that even O(n log(n)) with sorting is not trivial if you have an O(1) space limit, as merge sort uses O(n) space and quicksort (which is not even strictly O(n log(n))) needs O(log(n)) space for the stack. You have to use heapsort or smoothsort.
Some companies like to ask questions which cannot be solved and I think it is a good practice, as a programmer you have to know both what's possible and how to code it and also know what are the limits so you don't waste your time on something that's not doable.
Check this question for a couple of good techniques to use:
Algorithm to tell if two arrays have identical members

For each integer i check that the number of occurrences of i in the two arrays are either both zero or both nonzero, by iterating over the arrays.
Since the number of integers is constant the total runtime is O(n).
No, I wouldn't do this in practice.

Was just thinking if there was a way you could hash the cumulative of both arrays and compare them, assuming the hashing function doesn't produce collisions from two differing patterns.

Why not compute the sum, product and XOR of all the elements of one array and compare them with the corresponding values for the other array?
The XOR of the elements can be zero for both arrays even when they differ, for example:
2,2,3,3
1,1,2,2
But what if the XORs of the two arrays are merely equal?
Consider this:
10,3
12,5
Here the XOR of both arrays is the same: (10^3) = (12^5) = 9.
But their sums and products are different. I think two different sets of elements cannot have the same sum, product and XOR.
This can be analysed by simple bit-value examination.
Is there anything wrong with this approach?

I'm not sure that I understood the problem correctly, but if you are interested in the integers that are in both arrays:
If N >>>>> 2^SizeOf(int) (where SizeOf(int) is the number of bits in an integer: 16, 32, 64), there is one solution:
a = Array(N); //length(a) = N;
b = Array(M); //length(b) = M;
//x86-64. Integer consists of 64 bits.
for i := 0 to 2^64 / 64 - 1 do //very big, but CONST
    for k := 0 to M - 1 do
        if a[i] = b[k] then doSomething; //detected
for i := 2^64 / 64 to N - 1 do
    if not isSetBit(a[a[i] div 64], a[i] mod 64) then
        setBit(a[a[i] div 64], a[i] mod 64);
for i := 0 to M - 1 do
    if isSetBit(a[b[i] div 64], b[i] mod 64) then doSomething; //detected
O(N), without additional structures

All I know is that comparison-based sorting cannot possibly be faster than O(NlogN), so we can eliminate most of the "common" comparison-based sorts. I was thinking of doing a bucket sort. Perhaps if this question was asked in an interview, the best response would first be to clarify what sort of data those integers represent. For example, if they represent a person's age, then we know that the range of values is limited, and can use bucket sort at O(n). However, this will not be in place....

If the arrays have the same size, and there are guaranteed to be no duplicates, sum each of the arrays. If the sum of the values is different, then they contain different integers.
Edit: You can then sum the log of the entries in the arrays. If that is also the same, then you have the same entries in the array.
