Matlab: Looping a function on the elements of a 3-array vs Passing a 3-array to the same function

In Matlab I have an array v of length m, a matrix M of order n, and a function F that takes a single matrix as input and outputs a number. Starting from v, I would like to apply F to the whole family of matrices whose i-th element is the matrix M_i obtained by multiplying all the entries of M by v_i. The output would itself be an array of length m.
As far as I can see there are two ways of achieving this:
Looping over i=1:m, computing F on each M_i, and storing the corresponding values in an array
Defining a 3-array structure that contains all the matrices M_i and correspondingly extending the function F to act on 3-arrays instead of matrices. However, this entails overloading some matrix operators and functions (transpose, exponential, logarithm, square root, inverse, etc.) so that they formally handle a 3-array.
I have implemented the simpler option 1, but it takes a long time to execute. Option 2 promises to be faster. However, I am not sure that this is the case, and I am not familiar with overloading operators in Matlab. In particular: how do I extend a matrix operator to a 3-array in such a way that it performs the related function on each matrix it contains?

A for loop is probably no slower than vectorising this, especially for larger problems where memory starts to limit speed. Nevertheless, here are two ways of doing it:
M=rand(3,3,5) % I'm using a 3x3x5 array
v=1:5
F=@sum % This is the function
M2=bsxfun(@times,M,permute(v.',[2 3 1])) % Multiply the M(:,:,i) matrix by v(i)
R=arrayfun(@(t) F(M2(:,:,t)),(1:size(M,3)).','UniformOutput',false) % Apply the function F to each of the resulting matrices
cell2mat(R) % Convert from cell array to matrix, since my F function returns row vectors
R2=zeros(size(M,3),size(M,1)); % Initialise R2
for t=1:size(M,3)
    R2(t,:)=F(M(:,:,t)*v(t)); % Apply F to M(:,:,t)*v(t)
end
R2
You should do some testing to see which is more efficient for your actual problem. The vectorised version should be faster for small problems but uses more memory; the for loop is slower for small problems but uses less memory, and so could end up faster on larger ones.

Related

Pairwise comparisons of a large amount of sorted arrays

Suppose I have n sorted integer arrays (a_1, ..., a_n; there may be duplicate elements within a single array) and a threshold value T between 0 and 1. I would like to find all pairs of arrays whose similarity is larger than T. The similarity of array a_j with respect to array a_i is defined as follows:
sim(i, j) = intersection(i, j) / length(i)
where intersection(i, j) returns the number of elements shared in a_i and a_j, and length(i) returns the length of array a_i.
I can enumerate all pairs of arrays and compute the similarity values, but this takes too much time for a large n (say n = 10^5). Is there any data structure, pruning strategy, or other technique that can reduce the time cost of this procedure? I'm using Java, so the technique should be easily applicable in Java.
There are (n^2 - n)/2 pairs of arrays. If n=10^5, then you have to compute the similarity of 5 billion pairs of arrays. That's going to take some time.
One potential optimization is to short-circuit your evaluation of a pair of arrays once it becomes clear that you won't reach T. For example, if T is 0.5 and you've examined more than half of array a_i without finding any intersections, then it's clear that that pair of arrays won't meet the threshold. I don't expect this optimization to gain you much.
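For sorted arrays this shortcut falls out of a merge-style scan. A rough sketch in Python (the question asks for Java, but the logic ports directly; the function name and early-exit bound are just for illustration):
def sim_at_least(a_i, a_j, T):
    # True iff intersection(i, j) / length(i) >= T, with an early exit.
    # Assumes a_i and a_j are sorted lists; duplicates count pairwise.
    needed = T * len(a_i)
    inter = 0
    x = y = 0
    while x < len(a_i) and y < len(a_j):
        # Even if every remaining element of a_i matched, we would reach
        # at most inter + (len(a_i) - x); bail out once that falls short.
        if inter + (len(a_i) - x) < needed:
            return False
        if a_i[x] == a_j[y]:
            inter += 1
            x += 1
            y += 1
        elif a_i[x] < a_j[y]:
            x += 1
        else:
            y += 1
    return inter >= needed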
It might be possible to make some inferences based on prior results. That is, if sim(1,2) = X and sim(1,3) < T, there is probably a value of X (it would likely have to be very high) at which you can say definitively that sim(2,3) < T.

Binomial coefficients using dynamic programming and one dimensional array

Most implementations of binomial coefficient computation using dynamic programming make use of 2-dimensional arrays, as in these examples:
http://www.csl.mtu.edu/cs4321/www/Lectures/Lecture%2015%20-%20Dynamic%20Programming%20Binomial%20Coefficients.htm
http://www.geeksforgeeks.org/dynamic-programming-set-9-binomial-coefficient/
My question is, why not just compute it using a single dimensional array like this:
def C(n, r):
    memo = []
    if r > n // 2:
        r = n - r
    memo.append(1.0)
    for i in range(1, r + 1):
        now = ((n - i + 1) * memo[i - 1]) / i
        memo.append(now)
    return memo[r]
Basically using the recursive formula:
C(n,r) = ((n-r+1)/r) * C(n,r-1)
This has O(r) complexity, while the 2-dimensional logic has O(nr) complexity.
Am I missing something here?
If you want all of the values, then the 2D logic is certainly more efficient. The 2D logic may be more efficient for some parameters on some hardware that, e.g., lacks hardware multiply and divide. You have to be careful about integer overflow when multiplying before dividing, whereas the integer addition in the 2D recurrence is always fine. Other than that, no, the 1D recurrence is better.
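For what it's worth, the 1D recurrence can stay in exact integer arithmetic if you multiply before dividing, because C(n, i-1) * (n - i + 1) is always divisible by i. A sketch mirroring the code above; in Python the intermediate product can't overflow, which is precisely where the caveat above bites in fixed-width integer types:
def C_exact(n, r):
    # Same symmetry trick as above.
    if r > n - r:
        r = n - r
    c = 1
    for i in range(1, r + 1):
        # c * (n - i + 1) is always divisible by i, so // is exact;
        # in a fixed-width language this product is the overflow risk.
        c = c * (n - i + 1) // i
    return c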

Is sparse tensor multiplication implemented in TensorFlow?

Multiplication of sparse tensors with themselves or with dense tensors does not seem to work in TensorFlow. The following example
from __future__ import print_function
import tensorflow as tf

x = tf.constant([[1.0, 2.0],
                 [3.0, 4.0]])
y = tf.SparseTensor(indices=[[0, 0], [1, 1]], values=[1.0, 1.0], shape=[2, 2])
z = tf.matmul(x, y)
sess = tf.Session()
sess.run(tf.initialize_all_variables())
print(sess.run([x, y, z]))
fails with the error message
TypeError: Input 'b' of 'MatMul' Op has type string that does not match type
float32 of argument 'a'
Both tensors have values of type float32, as seen by evaluating them without the multiplication op. Multiplication of y with itself returns a similar error message. Multiplication of x with itself works fine.
General-purpose multiplication for tf.SparseTensor is not currently implemented in TensorFlow. However, there are four partial solutions, and the right one to choose will depend on the characteristics of your data:
If you have a tf.SparseTensor and a tf.Tensor, you can use tf.sparse_tensor_dense_matmul() to multiply them. This is more efficient than the next approach if one of the tensors is too large to fit in memory when densified: the documentation has more guidance about how to decide between these two methods. Note that it accepts a tf.SparseTensor as the first argument, so to solve your exact problem you will need to use the adjoint_a and adjoint_b arguments, and transpose the result.
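For illustration, that adjoint trick might look like this (a sketch against the same TF 0.x-era API, with the x and y from the question; for real values the adjoint is just the transpose):
# Computes z = x @ y for dense x and sparse y via the identity
# x @ y = transpose(transpose(y) @ transpose(x)); the sparse operand
# must come first in tf.sparse_tensor_dense_matmul().
z = tf.transpose(tf.sparse_tensor_dense_matmul(y, x,
                                               adjoint_a=True,
                                               adjoint_b=True))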
If you have two sparse tensors and need to multiply them, the simplest (if not the most performant) way is to convert them to dense and use tf.matmul:
a = tf.SparseTensor(...)
b = tf.SparseTensor(...)
c = tf.matmul(tf.sparse_tensor_to_dense(a, 0.0),
              tf.sparse_tensor_to_dense(b, 0.0),
              a_is_sparse=True, b_is_sparse=True)
Note that the optional a_is_sparse and b_is_sparse arguments mean "a (or b) has a dense representation, but a large number of its entries are zero", which triggers the use of a different multiplication algorithm.
For the special case of multiplying a sparse vector by a (potentially large and sharded) dense matrix, where the values in the vector are 0 or 1, the tf.nn.embedding_lookup operator may be more appropriate. This tutorial discusses when you might use embeddings and how to invoke the operator in more detail.
For the special case of multiplying a sparse matrix by a (potentially large and sharded) dense matrix, tf.nn.embedding_lookup_sparse() may be appropriate. This function accepts one or two tf.SparseTensor objects, with sp_ids representing the non-zero values and the optional sp_weights representing their values (which otherwise default to one).
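As a rough sketch of that last option (my own construction, so treat it as an assumption): a sparse s of shape [m, k] times a dense d of shape [k, n] becomes a weighted sum over rows of d, provided every row of s has at least one non-zero entry:
# sp_ids holds the column index of each non-zero of s; sp_weights holds
# its value; combiner="sum" then computes row i as sum_j s[i, j] * d[j, :].
sp_ids = tf.SparseTensor(indices=s.indices,
                         values=s.indices[:, 1],
                         dense_shape=s.dense_shape)
result = tf.nn.embedding_lookup_sparse(d, sp_ids, sp_weights=s,
                                       combiner="sum")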
Recently, tf.sparse_tensor_dense_matmul(...) was added, which allows multiplying a sparse matrix by a dense matrix.
https://www.tensorflow.org/versions/r0.9/api_docs/python/sparse_ops.html#sparse_tensor_dense_matmul
https://github.com/tensorflow/tensorflow/issues/1241
In TF 2.4.1 you can use the methods in tensorflow.python.ops.linalg.sparse.sparse_csr_matrix_ops to multiply two arbitrary SparseTensors (I think up to 3 dimensions).
Something like the following should work (in general, you turn the sparse tensors into a CSR representation first):
import tensorflow as tf
from tensorflow.python.ops.linalg.sparse import sparse_csr_matrix_ops

def tf_multiply(a: tf.SparseTensor, b: tf.SparseTensor):
    # Convert both operands to the CSR sparse-matrix representation.
    a_sm = sparse_csr_matrix_ops.sparse_tensor_to_csr_sparse_matrix(
        a.indices, a.values, a.dense_shape
    )
    b_sm = sparse_csr_matrix_ops.sparse_tensor_to_csr_sparse_matrix(
        b.indices, b.values, b.dense_shape
    )
    # Sparse-by-sparse matmul in CSR form.
    c_sm = sparse_csr_matrix_ops.sparse_matrix_sparse_mat_mul(
        a=a_sm, b=b_sm, type=tf.float32
    )
    # Convert the CSR result back to a tf.SparseTensor.
    c = sparse_csr_matrix_ops.csr_sparse_matrix_to_sparse_tensor(
        c_sm, tf.float32
    )
    return tf.SparseTensor(
        c.indices, c.values, dense_shape=c.dense_shape
    )
For a while I preferred scipy multiplication (via a py_function), because this multiplication in TF (2.3 and 2.4) was not performing as well as scipy. I tried again recently and, either I changed something in my code, or there was some fix in 2.4.1 that makes the TF sparse multiplication faster than using scipy, on both CPU and GPU.
It seems that
tf.sparse_matmul(
    a,
    b,
    transpose_a=None,
    transpose_b=None,
    a_is_sparse=None,
    b_is_sparse=None,
    name=None
)
is not for multiplying two SparseTensors.
a and b are Tensors, not SparseTensors. I have tried it, and it does not work with SparseTensors.
tf.sparse_matmul is for multiplying two dense tensors, not the sparse data structure. The function is just an optimized version of matrix multiplication for when one (or both) of the given matrices has many zero values. It does not accept the sparse tensor data type; it accepts dense tensors, and it may speed up your calculations if the values are mostly zero.
As far as I know, there is no implementation for multiplying two sparse tensors, just sparse-by-dense, which is tf.sparse_tensor_dense_matmul(x, y)!
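To illustrate the distinction (a made-up example): both operands below are ordinary dense Tensors that merely contain many zeros, and the *_is_sparse flags are only performance hints:
# Both operands are dense tf.Tensors; the flags only select a
# multiplication algorithm tuned for mostly-zero inputs.
a = tf.constant([[1.0, 0.0], [0.0, 2.0]])
b = tf.constant([[0.0, 3.0], [0.0, 0.0]])
c = tf.sparse_matmul(a, b, a_is_sparse=True, b_is_sparse=True)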
To make the answer more complete:
tf.sparse_matmul(
    a,
    b,
    transpose_a=None,
    transpose_b=None,
    a_is_sparse=None,
    b_is_sparse=None,
    name=None
)
exists as well:
https://www.tensorflow.org/api_docs/python/tf/sparse_matmul

GSL vs Numerical Recipes. Best way to handle matrices

In GSL a real n * m matrix M is represented internally as an array of size n*m. To access the (i,j) element of M, internally GSL has to access the (i-1) * m + (j-1) location of the array, which involves an integer multiplication and additions.
In Numerical Recipes in C, they recommend the alternative method of declaring an array of n pointers, each pointing to a row of m numbers. Then to access the (i,j) element, one writes M[i-1][j-1]. They claim that this is more efficient because it avoids the integer multiplication. The downside is that one has to initialize each pointer separately.
I am wondering, what are the advantages/disadvantages of each approach?
In C:
#define n 2
#define m 3
int M[n*m];
has the same memory layout as
int M[n][m];
In C, matrices are stored in row-major order:
http://en.wikipedia.org/wiki/Row-major_order
In C,
M[1][2]
is the same as
*((int*)M + 1*m + 2) // if M is defined as int M[n][m]
You could define M as an array of n pointers, but you still have to put the data somewhere and the best place is probably a 2D array. I would suggest:
int M[n][m];
int* Mrows[n] = {M[0], M[1]};
You can then do a direct offset into rows to get to the row you want. Then:
Mrows[1][2]
is the same as
*((*(Mrows + 1)) + 2)
It's more work for the programmer and probably only worth it if you want to go really fast. In that case you may want to look into further optimizations, such as specific machine instructions. Also, depending on your algorithm, you may be able to get away with only + operations (for example, if you are iterating over the matrix).

find if two arrays contain the same set of integers without extra space and faster than NlogN

I came across this post, which reports the following interview question:
Given two arrays of numbers, find if each of the two arrays have the same set of integers? Suggest an algo which can run faster than NlogN without extra space?
The best that I can think of is the following:
(a) sort each array, and then (b) have two pointers moving along the two arrays and check if you find different values... but step (a) already has NlogN complexity :(
(a) scan the shortest array and put its values into a map, and then (b) scan the second array and check if you find a value that is not in the map... here we have linear complexity, but I use extra space
... so, I can't think of a solution for this question.
Ideas?
Thank you for all the answers. I feel many of them are right, but I decided to choose ruslik's, because it gives an interesting option that I did not think about.
You can try a probabilistic approach by choosing a commutative function for accumulation (e.g., addition or XOR) and a parametrized hash function.
unsigned addition(unsigned a, unsigned b);
unsigned hash(int n, int h_type);

unsigned hash_set(int* a, int num, int h_type){
    unsigned rez = 0;
    for (int i = 0; i < num; i++)
        rez = addition(rez, hash(a[i], h_type));
    return rez;
}
In this way the number of tries before you decide that the probability of a false positive is below a certain threshold will not depend on the number of elements, so it will be linear.
EDIT: In the general case the probability of the sets being the same is very small, so this O(n) check with several hash functions can be used for prefiltering: deciding as fast as possible if the arrays are surely different or if there is a probability of them being equivalent and a slow deterministic method should be used. The final average complexity will be O(n), but the worst-case scenario will have the complexity of the deterministic method.
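A minimal Python sketch of this prefilter (the mixing constants are arbitrary choices, not part of the original answer):
def fingerprint(arr, h_type):
    # Commutative accumulation (addition mod 2^32) of a parametrized
    # hash, so element order cannot affect the result. Note this
    # fingerprints the multiset: sets with differing duplicate counts
    # will compare as different.
    rez = 0
    for x in arr:
        h = (x * 2654435761 + h_type * 40503) & 0xFFFFFFFF
        h ^= h >> 15
        rez = (rez + h) & 0xFFFFFFFF
    return rez

def probably_same(a, b, tries=8):
    # Any mismatch proves the arrays differ; if all fingerprints match,
    # fall back to a slow deterministic check (not shown).
    return all(fingerprint(a, t) == fingerprint(b, t) for t in range(tries))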
You said "without extra space" in the question but I assume that you actually mean "with O(1) extra space".
Suppose that all the integers in the arrays are less than k. Then you can use in-place radix sort to sort each array in time O(n log k) with O(log k) extra space (for the stack, as pointed out by yi_H in comments), and compare the sorted arrays in time O(n log k). If k does not vary with n, then you're done.
I'll assume that the integers in question are of fixed size (e.g. 32-bit).
Then radix-quicksorting both arrays in place (aka "binary quicksort") is constant space and O(n).
In the case of unbounded integers, I believe (but cannot prove, even though it is probably doable) that you cannot break the O(nk) barrier, where k is the number of digits of the greatest integer in either array.
Whether this is better than O(n log n) depends on how k is assumed to scale with n, and therefore depends on what the interviewer expects of you.
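A sketch of binary quicksort on non-negative 32-bit integers (illustrative only): sorting both arrays in place this way and comparing them element-wise gives the fixed-width O(n) check described above.
def binary_radix_sort(a, lo=0, hi=None, bit=31):
    # In-place MSD binary radix sort for non-negative 32-bit ints:
    # O(n * bits) time, recursion depth bounded by the number of bits.
    if hi is None:
        hi = len(a)
    if hi - lo <= 1 or bit < 0:
        return
    mask = 1 << bit
    i, j = lo, hi
    while i < j:
        if a[i] & mask:
            j -= 1
            a[i], a[j] = a[j], a[i]
        else:
            i += 1
    binary_radix_sort(a, lo, i, bit - 1)  # elements with this bit clear
    binary_radix_sort(a, i, hi, bit - 1)  # elements with this bit set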
A special, not harder, case is when one array holds 1,2,...,n. This has been discussed many times:
How to tell if an array is a permutation in O(n)?
Algorithm to determine if array contains n...n+m?
mathoverflow
and despite many tries no deterministic solutions using O(1) space and O(n) time have been shown. Either you can cheat the requirements in some way (reuse the input space, assume the integers are bounded) or use a probabilistic test.
Probably this is an open problem.
Here is a co-RP algorithm:
In linear time, iterate over the first array (A), building the polynomial Pa(x) = (A[0] - x)(A[1] - x)...(A[n-1] - x). Do the same for array B, naming this polynomial Pb(x).
We now want to answer the question "is Pa = Pb?" We can check this probabilistically as follows. Select a number r uniformly at random from the range [0...4n] and compute d = Pa(r) - Pb(r) in linear time. If d = 0, return true; otherwise return false.
Why is this valid? First of all, observe that if the two arrays contain the same elements, then Pa = Pb, so Pa(r) = Pb(r) for all r. With this in mind, we can easily see that this algorithm will never erroneously reject two identical arrays.
Now we must consider the case where the arrays are not identical. By the Schwartz-Zippel Lemma, P(Pa(r) - Pb(r) = 0 | Pa != Pb) < n/(4n). So the probability that we accept the two arrays as equivalent when they are not is < 1/4.
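A short Python sketch of this test (illustrative; I evaluate modulo a large prime, as the next answer suggests, to sidestep the enormous intermediate products):
import random

def probably_same_multiset(a, b, p=2**61 - 1):
    # Evaluate Pa(r) and Pb(r) modulo a large prime p at one random point.
    # Equal multisets always agree; unequal ones agree with probability
    # roughly at most n/p (Schwartz-Zippel), so repeat to taste.
    r = random.randrange(p)
    pa = pb = 1
    for x in a:
        pa = pa * ((x - r) % p) % p
    for y in b:
        pb = pb * ((y - r) % p) % p
    return pa == pb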
The usual assumption for these kinds of problems is Theta(log n)-bit words, because that's the minimum needed to index the input.
sshannin's polynomial-evaluation answer works fine over finite fields, which sidesteps the difficulties with limited-precision registers. All we need is a prime of the appropriate size (easy to find under the same assumptions that support a lot of public-key crypto) or an irreducible polynomial in (Z/2)[x] of the appropriate degree (the difficulty here is multiplying polynomials quickly, but I think the algorithm would be o(n log n)).
If we can modify the input with the restriction that it must maintain the same set, then it's not too hard to find space for radix sort. Select the (n/log n)th element from each array and partition both arrays. Sort the size-(n/log n) pieces and compare them. Now use radix sort on the size-(n - n/log n) pieces. From the previously processed elements, we can obtain n/log n bits, where bit i is on if a[2*i] > a[2*i + 1] and off if a[2*i] < a[2*i + 1]. This is sufficient to support a radix sort with n/(log n)^2 buckets.
In the algebraic decision tree model, there are known Omega(NlogN) lower bounds for computing set intersection (irrespective of the space limits).
For instance, see here: http://compgeom.cs.uiuc.edu/~jeffe/teaching/497/06-algebraic-tree.pdf
So unless you do clever bit manipulations/hashing type approaches, you cannot do better than NlogN.
For instance, if you used only comparisons, you cannot do better than NlogN.
You can break the O(n*log(n)) barrier if you have some restrictions on the range of numbers. But it's not possible to do this if you cannot use any extra memory (you need really silly restrictions to be able to do that).
I would also like to note that even O(n log n) with sorting is not trivial if you have an O(1) space limit, as merge sort uses O(n) space and quicksort (which is not even strictly O(n log n)) needs O(log n) space for the stack. You have to use heapsort or smoothsort.
Some companies like to ask questions which cannot be solved, and I think that is good practice: as a programmer you have to know both what's possible and how to code it, and also what the limits are, so you don't waste your time on something that's not doable.
Check this question for a couple of good techniques to use:
Algorithm to tell if two arrays have identical members
For each integer i, check that the number of occurrences of i in the two arrays is either zero in both or nonzero in both, by iterating over the arrays.
Since the number of possible integers is constant, the total runtime is O(n).
No, I wouldn't do this in practice.
I was just thinking whether there is a way to hash the cumulative contents of both arrays and compare the hashes, assuming the hashing function doesn't produce collisions from two differing patterns.
Why not find the sum, product, and XOR of all the elements of one array and compare them with the corresponding values for the other array?
The XOR of the elements of both arrays may give zero in a case like
2,2,3,3
1,1,2,2
but what if you compare the XORs of the two arrays for equality? Consider
10,3
12,5
Here the XOR of both arrays is the same: (10^3) = (12^5) = 9. But their sum and product are different. I think two different sets of elements cannot have the same sum, product, and XOR! This can be analysed by simple bit-value examination.
Is there anything wrong with this approach?
I'm not sure that I correctly understood the problem, but if you are interested in the integers that occur in both arrays:
If N >> 2^SizeOf(int) (the count of bits in an integer: 16, 32, 64), there is one solution:
a = Array(N); //length(a) = N;
b = Array(M); //length(b) = M;
//x86-64. Integers consist of 64 bits.
for i := 0 to 2^64 / 64 - 1 do //very big, but CONST
    for k := 0 to M - 1 do
        if a[i] = b[k] then doSomething; //detected
for i := 2^64 / 64 to N - 1 do
    if not isSetBit(a[a[i] div 64], a[i] mod 64) then
        setBit(a[a[i] div 64], a[i] mod 64);
for i := 0 to M - 1 do
    if isSetBit(a[b[i] div 64], b[i] mod 64) then doSomething; //detected
O(N), without additional structures.
All I know is that comparison-based sorting cannot possibly be faster than O(NlogN), so we can eliminate most of the "common" comparison-based sorts. I was thinking of doing a bucket sort. Perhaps if this question was asked in an interview, the best response would first be to clarify what sort of data those integers represent. For example, if they represent a person's age, then we know the range of values is limited and can use bucket sort at O(n). However, this will not be in place...
If the arrays have the same size, and there are guaranteed to be no duplicates, sum each of the arrays. If the sums of the values are different, then they contain different integers.
Edit: You can then sum the logs of the entries in the arrays. If that is also the same, then you have the same entries in the arrays.
