How to improve the execution time of this function? - arrays

Suppose that f(x,y) is a bivariate function as follows:
function [ f ] = f(x,y)
UN=(g)1.6*(1-acos(g)/pi)-0.8;
f= 1+UN(cos(0.5*pi*x+y));
end
How to improve execution time for function F(N) with the following code:
function [VAL] = F(N)
x=0:4/N:4;
y=0:2*pi/1000:2*pi;
VAL=zeros(N+1,3);
for i = 1:N+1
val = zeros(1,N+1);
for j = 1:N+1
val(j) = trapz(y,f(0,y).*f(x(i),y).*f(x(j),y))/2/pi;
end
val = fftshift(fft(val))/N;
l = (length(val)+1)/2;
VAL(i,:)= val(l-1:l+1);
end
VAL = fftshift(fft(VAL,[],1),1)/N;
L = (size(VAL,1)+1)/2;
VAL = VAL(L-1:L+1,:);
end
Note that N=2^p where p>10, so please consider the memory limitations while optimizing the code using ndgrid, arrayfun, etc.
FYI: The code intends to find the central 3-by-3 submatrix of the fftn of
fun=#(a,b) trapz(y,f(0,y).*f(a,y).*f(b,y))/2/pi;
where a,b are in [0,4]. The key idea is that we can save memory using the code above specially when N is very large. But the execution time is still an issue because of nested loops. See the figure below for N=2^2:

This is not a full answer, but some possibly helpful hints:
0) The trivial: Are you sure you need numerics? Can't you do the computation analytically?
1) Do not use function handles:
function [ f ] = f(x,y)
f= 1+1.6*(1-acos(cos(0.5*pi*x+y))/pi)-0.8
end
2) Simplify analytically: acos(cos(x)) is the same as abs(mod(x + pi, 2 * pi) - pi), which should compute slightly faster. Or, instead of sampling and then numerically integrating, first integrate analytically and sample the result.
3) The FFT is a very efficient algorithm to compute the full DFT, but you don't need the full DFT. Since you only want the central 3 x 3 coefficients, it might be more efficient to directly apply the DFT definition and evaluate the formula only for those coefficients that you want. That should be both fast and memory-efficient.
4) If you repeatedly do this computation, it might be helpful to precompute DFT coefficients. Here, dftmtx from the Signal Processing toolbox can assist.
5) To get rid of the loops, think about the problem not in the form of computation instructions, but a single matrix operation. If you consider your input N x N matrix as a vector with N² elements, and your output 3 x 3 matrix as a 9-element vector, then the whole operation you apply (numerical integration via trapz and DFT via fft) appears to be a simple linear transform, which it should be possible to express as an N² x 9 matrix.

Related

Vectorization in Matlab, incomprehensible syntax difference causing failure

I fail to understand why, in the below example, only x1 turns into a 1000 column array while y is a single number.
x = [0:1:999];
y = (7.5*(x))/(18000+(x));
x1 = exp(-((x)*8)/333);
Any clarification would be highly appreciated!
Why is x1 1x1000?
As given in the documentation,
exp(X) returns the exponential eˣ for each element in array X.
Since x is 1x1000, so -(x*8)/333 is 1x1000 and when exp() is applied on it, exponentials of all 1000 elements are computed and hence x1 is also 1x1000. As an example, exp([1 2 3]) is same as [exp(1) exp(2) exp(3)].
Why is y a single number?
As given in the documentation,
If A is a rectangular m-by-n matrix with m~= n, and B is a matrix
with n columns, then x = B/A returns a least-squares solution of the
system of equations x*A = B.
In your case,
A is 18000+x and size(18000+x) is 1x1000 i.e. m=1 and n=1000, and m~=n
and B is 7.5*x which has n=1000 columns.
⇒(7.5*x)/(18000+x) is returning you least-squares solution of equations x*(18000+x) = 7.5*x.
Final Remarks:
x = [0:1:999];
Brackets are unnecessary here and it should better be use like this: x=0:1:999 ;
It seems that you want to do element-wise division for computing x1 for which you should use ./ operator like this:
y=(7.5*x)./(18000+x); %Also removed unnecessary brackets
Also note that addition is always element-wise. .+ is not a valid MATLAB syntax (It works in Octave though). See valid arithmetic array and matrix operators in MATLAB here.
‍‍‍‍‍‍ ‍‍ ‍‍‍‍‍‍ ‍‍3. x1 also has some unnecessary brackets.
The question has already been answered by other people. I just want to point out a small thing. You do not need to write x = 0:1:999. It is better written as x = 0:999 as the default increment value used by MATLAB or Octave is 1.
Try explicitly specifying that you want to do element-wise operations rather than matrix operations:
y = (7.5.*(x))./(18000+(x));
In general, .* does elementwise multiplication, ./ does element-wise division, etc. So [1 2] .* [3 4] yields [3 8]. Omitting the dots will cause Matlab to use matrix operations whenever it can find a reasonable interpretation of your inputs as matrices.

a faster way to compute the error of a vector

For a given vector $(x_1,x_2,\ldots, x_n)$ I am trying to compute
I wrote the following code
for l = 1:n
for k = 1:n
error = error + norm(x(i)-x(j))
end
end
This code is not fast, especially when $n$ is large. I am aware that I am double counting actually... But how may I avoid it? How can I speed up my code?
Thank you!
You can do it with bsxfun, which is fast:
d = (abs(bsxfun(#minus, x, x.')));
result = sum(d(:));
Or alternatively use pdist with 'cityblock' distance (which for one-dimensional observations reduces to absolute difference). This computes each distance once, so you need to multiply the sum by 2:
result = 2*sum(pdist(x(:),'cityblock'));
How about a simple speed up?
for a=1:n
for b=a+1:n
error = error + 2*norm(x(a)-x(b))
end
end
For a scalar, norm just gives abs.
So,
error = sum(abs( bsxfun(#minus, error,error') ))
will do the same thing.
also check out pdist which will do this for vectors, using vector norms, in an even faster way.

Efficiently calculating weighted distance in MATLAB

Several posts exist about efficiently calculating pairwise distances in MATLAB. These posts tend to concern quickly calculating euclidean distance between large numbers of points.
I need to create a function which quickly calculates the pairwise differences between smaller numbers of points (typically less than 1000 pairs). Within the grander scheme of the program i am writing, this function will be executed many thousands of times, so even small gains in efficiency are important. The function needs to be flexible in two ways:
On any given call, the distance metric can be euclidean OR city-block.
The dimensions of the data are weighted.
As far as i can tell, no solution to this particular problem has been posted. The statstics toolbox offers pdist and pdist2, which accept many different distance functions, but not weighting. I have seen extensions of these functions that allow for weighting, but these extensions do not allow users to select different distance functions.
Ideally, i would like to avoid using functions from the statistics toolbox (i am not certain the user of the function will have access to those toolboxes).
I have written two functions to accomplish this task. The first uses tricky calls to repmat and permute, and the second simply uses for-loops.
function [D] = pairdist1(A, B, wts, distancemetric)
% get some information about the data
numA = size(A,1);
numB = size(B,1);
if strcmp(distancemetric,'cityblock')
r=1;
elseif strcmp(distancemetric,'euclidean')
r=2;
else error('Function only accepts "cityblock" and "euclidean" distance')
end
% format weights for multiplication
wts = repmat(wts,[numA,1,numB]);
% get featural differences between A and B pairs
A = repmat(A,[1 1 numB]);
B = repmat(permute(B,[3,2,1]),[numA,1,1]);
differences = abs(A-B).^r;
% weigh difference values before combining them
differences = differences.*wts;
differences = differences.^(1/r);
% combine features to get distance
D = permute(sum(differences,2),[1,3,2]);
end
AND:
function [D] = pairdist2(A, B, wts, distancemetric)
% get some information about the data
numA = size(A,1);
numB = size(B,1);
if strcmp(distancemetric,'cityblock')
r=1;
elseif strcmp(distancemetric,'euclidean')
r=2;
else error('Function only accepts "cityblock" and "euclidean" distance')
end
% use for-loops to generate differences
D = zeros(numA,numB);
for i=1:numA
for j=1:numB
differences = abs(A(i,:) - B(j,:)).^(1/r);
differences = differences.*wts;
differences = differences.^(1/r);
D(i,j) = sum(differences,2);
end
end
end
Here are the performance tests:
A = rand(10,3);
B = rand(80,3);
wts = [0.1 0.5 0.4];
distancemetric = 'cityblock';
tic
D1 = pairdist1(A,B,wts,distancemetric);
toc
tic
D2 = pairdist2(A,B,wts,distancemetric);
toc
Elapsed time is 0.000238 seconds.
Elapsed time is 0.005350 seconds.
Its clear that the repmat-and-permute version works much more quickly than the double-for-loop version, at least for smaller datasets. But i also know that calls to repmat often slow things down, however. So I am wondering if anyone in the SO community has any advice to offer to improve the efficiency of either function!
EDIT
#Luis Mendo offered a nice cleanup of the repmat-and-permute function using bsxfun. I compared his function with my original on datasets of varying size:
As the data become larger, the bsxfun version becomes the clear winner!
EDIT #2
I have finished writing the function and it is available on github [link]. I ended up finding a pretty good vectorized method for computing euclidean distance [link], so i use that method in the euclidean case, and i took #Divakar's advice for city-block. It is still not as fast as pdist2, but its must faster than either of the approaches i laid out earlier in this post, and easily accepts weightings.
You can replace repmat by bsxfun. Doing so avoids explicit repetition, therefore it's more memory-efficient, and probably faster:
function D = pairdist1(A, B, wts, distancemetric)
if strcmp(distancemetric,'cityblock')
r=1;
elseif strcmp(distancemetric,'euclidean')
r=2;
else
error('Function only accepts "cityblock" and "euclidean" distance')
end
differences = abs(bsxfun(#minus, A, permute(B, [3 2 1]))).^r;
differences = bsxfun(#times, differences, wts).^(1/r);
D = permute(sum(differences,2),[1,3,2]);
end
For r = 1 ("cityblock" case), you can use bsxfun to get elementwise subtractions and then use matrix-multiplication, which must speed up things. The implementation would look something like this -
%// Calculate absolute elementiwse subtractions
absm = abs(bsxfun(#minus,permute(A,[1 3 2]),permute(B,[3 1 2])));
%// Perform matrix multiplications with the given weights and reshape
D = reshape(reshape(absm,[],size(A,2))*wts(:),size(A,1),[]);

Optimize parameters of a pairwise distance function in Matlab

This question is related to matlab: find the index of common values at the same entry from two arrays.
Suppose that I have an 1000 by 10000 matrix that contains value 0,1,and 2. Each row are treated as a sample. I want to calculate the pairwise distance between those samples according to the formula d = 1-1/(2p)sum(a/c+b/d) where a,b,c,d can treated as as the row vector of length 10000 according to some definition and p=10000. c and d are probabilities such that c+d=1.
An example of how to find the values of a,b,c,d: suppose we want to find d between sample i and bj, then I look at row i and j.
If kth entry of row i and j has value 2 and 2, then a=2,b=0,c=1,d=0 (I guess I will assign 0/0=0 in this case).
If kth entry of row i and j has value 2 and 1 or vice versa, then a=1,b=0,c=3/4,d=1/4.
The similar assignment will give to the case for 2,0(a=0,b=0,c=1/2,d=1/2),1,1(a=1,b=1,c=1/2,d=1/2),1,0(a=0,b=1,c=1/4,d=3/4),0,0(a=0,b=2,c=0,d=1).
The matlab code I have so far is using for loops for i and j, then find the cases above by using find, then create two arrays for a/c and b/d. This is extremely slow, is there a way that I can improve the efficiency?
Edit: the distance d is the formula given in this paper on page 13.
Provided those coefficients are fixed, then I think I've successfully vectorised the distance function. Figuring out the formulae was fun. I flipped things around a bit to minimise division, and since I wasn't aware of pdist until #horchler's comment, you get it wrapped in loops with the constants factored out:
% m is the data
[n p] = size(m, 1);
distance = zeros(n);
for ii=1:n
for jj=ii+1:n
a = min(m(ii,:), m(jj,:));
b = 2 - max(m(ii,:), m(jj,:));
c = 4 ./ (m(ii,:) + m(jj,:));
c(c == Inf) = 0;
d = 1 - c;
distance(ii,jj) = sum(a.*c + b.*d);
% distance(jj,ii) = distance(ii,jj); % optional for the full matrix
end
end
distance = 1 - (1 / (2 * p)) * distance;

Labview: element-wise array multiplication operations

Does there exist a function similar to that of numpy's * operator for two arrays to multiply their elements in an element-wise manner, returning an array of the similar type?
For example:
#Lets define:
a = [0,1,2,3]
b = [1,2,3,4]
d = [[1,2] , [3,4], [5,6]]
e = [3,4,5]
#I want:
a * 2 == [2*0, 1*2, 2*2, 2*3]
a * b == [0*1, 1*2, 2*3, 3*4]
d * e == [[1*3, 2*3], [3*4, 4*4], [5*5, 6*5]]
d * d == [[1*1, 2*2], [3*3, 4*4], [5*5, 6*6]]
Note how * IS NOT regular matrix multiplication it is element-wise multiplication.
My current best solution is to write some c code, which does this, and import a compiled dll.
There must exist a better solution.
EDIT:
Using LabVIEW 2011 - Needs to be fast.
The first two multiplications can be done by using the 'multiply' primitive. Make sure the arrays in the second case are of the same length.
For the third multipllication you can use a for loop (with auto-indexing). This is needed because you need to instruct LabVIEW what the basic index is.
The last multiplication can (again) be done using the multiply primitive.
My result is different (opposite) from the previous posters. I generated a 4x1000 array of random numbers (magnitude 1000) which I multiplied by a 4x4 array of integers (1,2,3,4,...). I did this 100,000 times using the matrix multiplication VI and also using for loops to perform the operation on the arrays. I'm seeing times on the order of 0.328s for the matrix VIs and 0.051s for the for loops. Using a compiled DLL may be faster than Labview, but this does not seem to be true for the built-in functions.
This is certainly not what I expected, but it is consistent over many cycles. The VI is standard execution thread. All data types are set before the timed operations - no coercion takes place in the loops. The operations are performed separately, staged by a flat sequence structure, as is the time measurement. Parallelism is turned off.

Resources