a faster way to compute the error of a vector - arrays

For a given vector $(x_1,x_2,\ldots, x_n)$ I am trying to compute $\sum_{i=1}^{n}\sum_{j=1}^{n} |x_i - x_j|$.
I wrote the following code
error = 0;
for i = 1:n
    for j = 1:n
        error = error + norm(x(i)-x(j));
    end
end
This code is not fast, especially when $n$ is large. I am aware that I am double counting actually... But how may I avoid it? How can I speed up my code?
Thank you!

You can do it with bsxfun, which is fast:
d = abs(bsxfun(@minus, x, x.'));
result = sum(d(:));
Or alternatively use pdist with 'cityblock' distance (which for one-dimensional observations reduces to absolute difference). This computes each distance once, so you need to multiply the sum by 2:
result = 2*sum(pdist(x(:),'cityblock'));

How about a simple speed up?
error = 0;
for a=1:n
    for b=a+1:n
        error = error + 2*norm(x(a)-x(b));
    end
end
For a scalar, norm just gives abs, so
error = sum(sum(abs( bsxfun(@minus, x, x.') )))
will do the same thing.
Also check out pdist, which will do this for vectors, using vector norms, in an even faster way.
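For large n, the full n-by-n difference matrix can be avoided altogether. The following is a minimal sketch of a different technique from the ones in the answers above: after sorting, the element at position k of the ascending order contributes with weight (2k - n - 1) to the sum over unordered pairs. The data in x is made up for illustration.

```matlab
x = [4 1 2];                          % example data (made up)
n = numel(x);

% Reference: full pairwise matrix, every pair counted twice
d = abs(bsxfun(@minus, x(:), x(:).'));
ref = sum(d(:));

% Sort-based: no n-by-n matrix is built
xs = sort(x(:));                      % ascending order
w = 2*(1:n).' - n - 1;                % weight of the k-th smallest element
fast = 2 * sum(w .* xs);              % factor 2 matches the double counting

disp([ref fast])                      % both values are 12
```

The sort costs O(n log n) time and O(n) memory, whereas the bsxfun approach needs O(n^2) of both.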


Output arrayfun into third dimension of the matrix in MATLAB

Assume that I have a matrix A = rand(n,m). I want to compute matrix B with size n x n x m, where B(:,:,i) = A(:,i)*A(:,i)';
The code that can produce this is quite simple:
A = rand(n,m); B = zeros(n,n,m);
for i=1:m
    B(:,:,i) = A(:,i)*A(:,i)';
end
However, I am concerned about speed and would like to ask for your help in implementing this without loops. Most likely I need to use bsxfun, arrayfun or rowfun, but I am not sure.
All answers are appreciated.
I don't have MATLAB at hand right now, but I think this code should produce the same result as your loop:
A1 = reshape(A,n,1,m);
A2 = reshape(A,1,n,m);
B = bsxfun(@times,A1,A2);
If you have a newer version of MATLAB (R2016b or later, which supports implicit expansion), you don't need bsxfun any more; you can just write
B = A1 .* A2;
On older versions this last line will give an error message.
Whether any of this is faster than your loop also depends on the version of MATLAB. Newer MATLAB versions are no longer slow with loops. I think the loop is more readable, and readable code is worth keeping, or at least keep the loop in a comment to clarify what the vectorized code does.
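Since the code above was written without MATLAB at hand, a quick check against the loop may be useful. This is a small sketch with made-up sizes (n = 3, m = 4); the names B_loop, B_vec and maxdiff are introduced here for illustration only.

```matlab
n = 3; m = 4;                         % small sizes chosen for the check
A = rand(n, m);

% Loop version from the question
B_loop = zeros(n, n, m);
for i = 1:m
    B_loop(:,:,i) = A(:,i) * A(:,i)';
end

% Vectorized version: n-by-1-by-m times 1-by-n-by-m expands to n-by-n-by-m
A1 = reshape(A, n, 1, m);
A2 = reshape(A, 1, n, m);
B_vec = bsxfun(@times, A1, A2);

% Largest absolute difference between the two results
maxdiff = max(abs(B_loop(:) - B_vec(:)));
disp(maxdiff)
```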
arrayfun and bsxfun do not speed up the calculation in my attempt below:
clc; close all;
clear all;
% n and m must be set here; their values are not given
A = rand(n,m); B = zeros(n,n,m);
tic
for i=1:m
    B(:,:,i) = A(:,i)*A(:,i)';
end
t1 = toc;
tic
C = reshape(cell2mat(arrayfun(@(k) bsxfun(@times, A(:,k), A(:,k)'), ...
    1:m, 'UniformOutput',false)),n,n,m);
t2 = toc;
% t1 = 0.3079
% t2 = 0.5112

Using hist in Matlab to compute occurrences

I am using hist to compute the number of occurrences of values in a matrix in Matlab.
I think I am using it wrong because it gives me completely weird results. Could you help me to understand what is going on?
When I run this piece of code I get countsB as desired
rng default;
When I run this other piece of code I get wrong results for countsA
countsA=[zeros(1709,1); 524288; zeros(1708,1)];
What am I doing wrong?
To add to the other answers: you can replace hist by the explicit sum:
idxA = unique(A);
countsA = sum(bsxfun(@eq, A(:), idxA(:).'), 1);
idxA is a scalar, which in this context means the number of bins.
Setting idxA to a vector instead, e.g. [0,3418], will give you a histogram with bins centered at 0 and 3418, similar to what you got with idxB, which was also a vector.
I think it has to do with:
N = HIST(Y,M), where M is a scalar, uses M bins.
and I think you are assuming it would do:
N = HIST(Y,X), where X is a vector, returns the distribution of Y
among bins with centers specified by X.
In other words, in the first case MATLAB assumes that you are asking for 3418 bins.
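The scalar-versus-vector behaviour can be reproduced on a tiny example. This is a sketch with made-up data (a matrix whose entries all equal 7); the variable names countsBins, nBins and countsEq are introduced here for illustration.

```matlab
A = 7 * ones(2, 3);                   % every entry equals 7 (made-up data)
idxA = unique(A);                     % a scalar: 7

% Scalar second argument: hist reads it as "use 7 bins"
countsBins = hist(A(:), idxA);
nBins = numel(countsBins);            % 7 bins instead of a single count

% Explicit counting is unaffected by idxA being a scalar
countsEq = sum(bsxfun(@eq, A(:), idxA(:).'), 1);    % 6

% Vector second argument: entries are used as bin centers
B = [1 2 2 5];
idxB = unique(B);                     % [1 2 5]
countsB = hist(B(:), idxB);           % counts for centers 1, 2 and 5
```

When the set of unique values may collapse to a single element, the explicit bsxfun count is the safer choice.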

Matlab: average each element in 2D array based on neighbors [duplicate]

I've written code to smooth an image using a 3x3 averaging filter; however, the output is strange: it is almost all black. Here's my code.
function [filtered_img] = average_filter(noisy_img)
    [m,n] = size(noisy_img);
    filtered_img = zeros(m,n);
    for i = 1:m-2
        for j = 1:n-2
            sum = 0;
            for k = i:i+2
                for l = j:j+2
                    sum = sum+noisy_img(k,l);
                end
            end
            filtered_img(i+1,j+1) = sum/9.0;
        end
    end
end
I call the function as follows:
filtered = average_filter(img);
I can't see anything wrong in the code logic so far, I'd appreciate it if someone can spot the problem.
Assuming you're working with grayscale images, you should replace the inner two for loops with:
filtered_img(i+1,j+1) = mean2(noisy_img(i:i+2,j:j+2));
Does it change anything?
EDIT: don't forget to reconvert it to uint8!!
filtered_img = uint8(filtered_img);
Edit 2: the reason it's not working in your code is that sum becomes a uint8 as soon as a uint8 pixel is added to it, so it saturates at 255, the upper limit of uint8. mean seems to prevent that from happening.
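The saturation is easy to reproduce in isolation. A minimal sketch, using three made-up pixel values whose true sum is 390:

```matlab
px = uint8([200 100 90]);             % made-up pixel values

% As in the question: a double 0 plus a uint8 yields a uint8
acc = 0;
for k = 1:numel(px)
    acc = acc + px(k);                % clips at 255 once the sum exceeds it
end
clipped = acc / 3;                    % uint8(85), far below the true mean

% Converting each pixel to double keeps the true sum
accd = 0;
for k = 1:numel(px)
    accd = accd + double(px(k));
end
correct = accd / 3;                   % 130

disp(class(acc))                      % uint8
```

Converting the image once with double(noisy_img) before the loops has the same effect.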
another option:
f = @(x) mean(x(:));
filtered_img = nlfilter(noisy_img,[3 3],f);
img = imread('img.bmp');
filtered = imfilter(double(img), ones(3) / 9, 'replicate');
Implement neighborhood operation of sum of product operation between an image and a filter of size 3x3, the filter should be averaging filter.
Then use the same function/code to compute Laplacian(2nd order derivative, prewitt and sobel operation(first order derivatives).
Use a simple 10*10 matrix to perform these operations
need matlab code
Tangentially to the question:
Especially for 5x5 or larger window you can consider averaging first in one direction and then in the other and you save some operations. So, point at 3 would be (P1+P2+P3+P4+P5). Point at 4 would be (P2+P3+P4+P5+P6). Divided by 5 in the end. So, point at 4 could be calculated as P3new + P6 - P2. Etc for point 5 and so on. Repeat the same procedure in other direction.
Make sure to divide first, then sum.
I would need to time this, but I believe it could work a bit faster for larger windows. It is sequential per line which might not seem the best, but you have many lines where you can work in parallel, so it shouldn't be a problem.
This first divide, then sum also prevents saturation if you have integers, so you might use the approach even in 3x3 case, as it is less wrong (though slower) to divide twice by 3 than once by 9. But note that you will always underestimate final value with that, so you might as well add a bit of bias (say all values +1 between the steps).
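The two-pass running-sum idea can be sketched as follows. This is only an illustration under assumptions: it works in double precision (so it divides once at the end instead of dividing first), it keeps only the region where the window fits entirely inside the image, and the test image magic(6) and the names rowsum, boxsum and smoothed are made up here.

```matlab
img = magic(6);                       % made-up test image, class double
w = 3;                                % window size
[m, n] = size(img);

% Pass 1: running sum along each row
rowsum = zeros(m, n-w+1);
rowsum(:,1) = sum(img(:,1:w), 2);
for j = 2:n-w+1
    % previous window sum, plus the entering column, minus the leaving one
    rowsum(:,j) = rowsum(:,j-1) + img(:,j+w-1) - img(:,j-1);
end

% Pass 2: the same update down the columns of the row sums
boxsum = zeros(m-w+1, n-w+1);
boxsum(1,:) = sum(rowsum(1:w,:), 1);
for i = 2:m-w+1
    boxsum(i,:) = boxsum(i-1,:) + rowsum(i+w-1,:) - rowsum(i-1,:);
end
smoothed = boxsum / w^2;

% Compare with a direct w-by-w average
ref = conv2(img, ones(w) / w^2, 'valid');
maxdiff = max(abs(smoothed(:) - ref(:)));
disp(maxdiff)
```

Each output value costs two additions and two subtractions regardless of w, which is where the saving for larger windows comes from.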

Efficiently calculating weighted distance in MATLAB

Several posts exist about efficiently calculating pairwise distances in MATLAB. These posts tend to concern quickly calculating euclidean distance between large numbers of points.
I need to create a function which quickly calculates the pairwise differences between smaller numbers of points (typically less than 1000 pairs). Within the grander scheme of the program i am writing, this function will be executed many thousands of times, so even small gains in efficiency are important. The function needs to be flexible in two ways:
On any given call, the distance metric can be euclidean OR city-block.
The dimensions of the data are weighted.
As far as I can tell, no solution to this particular problem has been posted. The statistics toolbox offers pdist and pdist2, which accept many different distance functions, but not weighting. I have seen extensions of these functions that allow for weighting, but these extensions do not allow users to select different distance functions.
Ideally, I would like to avoid using functions from the statistics toolbox (I am not certain the user of the function will have access to those toolboxes).
I have written two functions to accomplish this task. The first uses tricky calls to repmat and permute, and the second simply uses for-loops.
function [D] = pairdist1(A, B, wts, distancemetric)
    % get some information about the data
    numA = size(A,1);
    numB = size(B,1);
    % choose the exponent for the requested metric
    if strcmp(distancemetric,'cityblock')
        r = 1;
    elseif strcmp(distancemetric,'euclidean')
        r = 2;
    else
        error('Function only accepts "cityblock" and "euclidean" distance')
    end
    % format weights for multiplication
    wts = repmat(wts,[numA,1,numB]);
    % get featural differences between A and B pairs
    A = repmat(A,[1 1 numB]);
    B = repmat(permute(B,[3,2,1]),[numA,1,1]);
    differences = abs(A-B).^r;
    % weigh difference values before combining them
    differences = differences.*wts;
    differences = differences.^(1/r);
    % combine features to get distance
    D = permute(sum(differences,2),[1,3,2]);
end
function [D] = pairdist2(A, B, wts, distancemetric)
    % get some information about the data
    numA = size(A,1);
    numB = size(B,1);
    % choose the exponent for the requested metric
    if strcmp(distancemetric,'cityblock')
        r = 1;
    elseif strcmp(distancemetric,'euclidean')
        r = 2;
    else
        error('Function only accepts "cityblock" and "euclidean" distance')
    end
    % use for-loops to generate differences
    D = zeros(numA,numB);
    for i=1:numA
        for j=1:numB
            % exponent r here, to match pairdist1
            differences = abs(A(i,:) - B(j,:)).^r;
            differences = differences.*wts;
            differences = differences.^(1/r);
            D(i,j) = sum(differences,2);
        end
    end
end
Here are the performance tests:
A = rand(10,3);
B = rand(80,3);
wts = [0.1 0.5 0.4];
distancemetric = 'cityblock';
tic; D1 = pairdist1(A,B,wts,distancemetric); toc
tic; D2 = pairdist2(A,B,wts,distancemetric); toc
Elapsed time is 0.000238 seconds.
Elapsed time is 0.005350 seconds.
It's clear that the repmat-and-permute version works much more quickly than the double-for-loop version, at least for smaller datasets. But I also know that calls to repmat often slow things down. So I am wondering if anyone in the SO community has any advice to offer to improve the efficiency of either function!
@Luis Mendo offered a nice cleanup of the repmat-and-permute function using bsxfun. I compared his function with my original on datasets of varying size:
As the data become larger, the bsxfun version becomes the clear winner!
I have finished writing the function and it is available on github [link]. I ended up finding a pretty good vectorized method for computing euclidean distance [link], so I use that method in the euclidean case, and I took @Divakar's advice for city-block. It is still not as fast as pdist2, but it's much faster than either of the approaches I laid out earlier in this post, and easily accepts weightings.
You can replace repmat by bsxfun. Doing so avoids explicit repetition, therefore it's more memory-efficient, and probably faster:
function D = pairdist1(A, B, wts, distancemetric)
    if strcmp(distancemetric,'cityblock')
        r = 1;
    elseif strcmp(distancemetric,'euclidean')
        r = 2;
    else
        error('Function only accepts "cityblock" and "euclidean" distance')
    end
    differences = abs(bsxfun(@minus, A, permute(B, [3 2 1]))).^r;
    differences = bsxfun(@times, differences, wts).^(1/r);
    D = permute(sum(differences,2),[1,3,2]);
end
For r = 1 ("cityblock" case), you can use bsxfun to get elementwise subtractions and then use matrix multiplication, which should speed things up. The implementation would look something like this -
%// Calculate absolute elementwise subtractions
absm = abs(bsxfun(@minus,permute(A,[1 3 2]),permute(B,[3 1 2])));
%// Perform matrix multiplications with the given weights and reshape
D = reshape(reshape(absm,[],size(A,2))*wts(:),size(A,1),[]);
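To see that the matrix-multiplication form gives the weighted city-block distances, it can be compared with a plain double loop on a small case. This is a sketch with made-up points; Dref and maxdiff are names introduced here for the check.

```matlab
A = [0 0 0; 1 2 3];                   % 2 made-up points in 3 dimensions
B = [1 1 1; 0 2 0; 3 3 3];            % 3 made-up points
wts = [0.1 0.5 0.4];                  % weights from the performance test

% Vectorized: numA-by-numB-by-dims differences, then one matrix product
absm = abs(bsxfun(@minus, permute(A,[1 3 2]), permute(B,[3 1 2])));
D = reshape(reshape(absm, [], size(A,2)) * wts(:), size(A,1), []);

% Reference: weighted city-block distance, one pair at a time
Dref = zeros(size(A,1), size(B,1));
for i = 1:size(A,1)
    for j = 1:size(B,1)
        Dref(i,j) = sum(wts .* abs(A(i,:) - B(j,:)));
    end
end

maxdiff = max(abs(D(:) - Dref(:)));
disp(maxdiff)
```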

How to improve the execution time of this function?

Suppose that f(x,y) is a bivariate function as follows:
function [ f ] = f(x,y)
f= 1+UN(cos(0.5*pi*x+y));
How to improve execution time for function F(N) with the following code:
function [VAL] = F(N)
    % x and y are used below, but their definitions are not shown in the post
    for i = 1:N+1
        val = zeros(1,N+1);
        for j = 1:N+1
            val(j) = trapz(y,f(0,y).*f(x(i),y).*f(x(j),y))/2/pi;
        end
        val = fftshift(fft(val))/N;
        l = (length(val)+1)/2;
        VAL(i,:)= val(l-1:l+1);
    end
    VAL = fftshift(fft(VAL,[],1),1)/N;
    L = (size(VAL,1)+1)/2;
    VAL = VAL(L-1:L+1,:);
end
Note that N=2^p where p>10, so please consider the memory limitations while optimizing the code using ndgrid, arrayfun, etc.
FYI: The code intends to find the central 3-by-3 submatrix of the fftn of
fun=@(a,b) trapz(y,f(0,y).*f(a,y).*f(b,y))/2/pi;
where a,b are in [0,4]. The key idea is that we can save memory using the code above, especially when N is very large. But the execution time is still an issue because of nested loops. See the figure below for N=2^2:
This is not a full answer, but some possibly helpful hints:
0) The trivial: Are you sure you need numerics? Can't you do the computation analytically?
1) Do not use function handles:
function [ f ] = f(x,y)
f= 1+1.6*(1-acos(cos(0.5*pi*x+y))/pi)-0.8
2) Simplify analytically: acos(cos(x)) is the same as abs(mod(x + pi, 2 * pi) - pi), which should compute slightly faster. Or, instead of sampling and then numerically integrating, first integrate analytically and sample the result.
3) The FFT is a very efficient algorithm to compute the full DFT, but you don't need the full DFT. Since you only want the central 3 x 3 coefficients, it might be more efficient to directly apply the DFT definition and evaluate the formula only for those coefficients that you want. That should be both fast and memory-efficient.
4) If you repeatedly do this computation, it might be helpful to precompute DFT coefficients. Here, dftmtx from the Signal Processing toolbox can assist.
5) To get rid of the loops, think about the problem not in the form of computation instructions, but a single matrix operation. If you consider your input N x N matrix as a vector with N² elements, and your output 3 x 3 matrix as a 9-element vector, then the whole operation you apply (numerical integration via trapz and DFT via fft) appears to be a simple linear transform, which it should be possible to express as an N² x 9 matrix.
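Hint 3 can be illustrated in one dimension. The sketch below assumes an odd-length vector (like the length N+1 vectors in the question when N is even) and uses random stand-in data; the names viaFFT, direct and W are introduced here. It evaluates the DFT definition only for the frequencies -1, 0 and 1 and compares the result with the centre of the shifted FFT.

```matlab
M = 9;                                % odd length, like N+1 in the question
v = rand(1, M);                       % stand-in data (made up)

% Full FFT, then keep the three centre bins
full = fftshift(fft(v));
c = (M + 1) / 2;                      % position of frequency 0 for odd M
viaFFT = full(c-1:c+1);

% Direct DFT definition for k = -1, 0, 1 only
k = (-1:1).';                         % wanted frequencies, 3-by-1
j = 0:M-1;                            % sample indices, 1-by-M
W = exp(-2i*pi*k*j/M);                % 3-by-M slice of the DFT matrix
direct = (W * v(:)).';

maxdiff = max(abs(viaFFT - direct));
disp(maxdiff)
```

W depends only on M, so it can be computed once and reused for every row, in the spirit of hint 4.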
