Efficiently vectorize an element-wise operation in matlab - arrays

I have an nx4 matrix A representing n spheres, and an mx3 matrix B representing m points. I need to test whether these m points are inside any of the spheres. I can do this using a for loop, but with large n and m this method is very inefficient. How can I vectorize this operation? My current method is
A = [0.8622 1.1594 0.7457 0.6925;
1.4325 0.2559 0.0520 0.4687;
1.8465 0.3979 0.2850 0.4259;
1.4387 0.8713 1.6585 0.4616;
0.2383 1.5208 0.5415 0.9417;
1.6812 0.2045 0.1290 0.1972];
B = [0.5689 0.9696 0.8196;
0.5211 0.4462 0.6254;
0.9000 0.4894 0.2202;
0.4192 0.9229 0.4639];
for i=1:size(B,1)
mask = vecnorm(A(:, 1:3) - B(i,:), 2, 2) < A(:, 4);
if sum(mask) > 0
C(i) = true;
else
C(i) = false;
end %if
end %for
I tested the method suggested by #LuisMendo, and it seems it only speeds up the calculation for quite small m and n, but for large m and n, say, around 10000 for my problem, the improvement is very limited. But #NickyMattsson gave me some hint. Because logical operation in matlab is faster than vecnorm, I first use a rough check to find the spheres near the point, and then do a fine check:
A = [0.8622 1.1594 0.7457 0.6925;
1.4325 0.2559 0.0520 0.4687;
1.8465 0.3979 0.2850 0.4259;
1.4387 0.8713 1.6585 0.4616;
0.2383 1.5208 0.5415 0.9417;
1.6812 0.2045 0.1290 0.1972];
B = [0.5689 0.9696 0.8196;
0.5211 0.4462 0.6254;
0.9000 0.4894 0.2202;
0.4192 0.9229 0.4639];
ids = 1:size(A, 1);
for i=1:size(B,1)
% first a rough check
xbound = abs(A(:, 1) - B(i, 1)) < A(:, 4);
ybound = abs(A(:, 2) - B(i, 2)) < A(:, 4);
zbound = abs(A(:, 3) - B(i, 3)) < A(:, 4);
nears = ids(xbound & ybound & zbound);
if isempty(nears)
C(i) = false;
else
% then a fine check
mask = vecnorm(A(nears, 1:3) - B(i,:), 2, 2) < A(nears, 4);
if sum(mask) > 0
C(i) = true;
else
C(i) = false;
end
end
end
This may reduce the time to 1/2 or 1/3, which is acceptable, and if I divide m and n into batches it may be even faster without too heavy memory burden. #CrisLuengo mentioned the R*-tree method, but it seems that the implementation is quite complicated XD

This uses implicit expansion to compute all distances between points and sphere centers, and then to compare those with the sphere radii:
C = any(vecnorm(permute(B, [1 3 2]) - permute(A(:,1:3), [3 1 2]), 2, 3) < A(:,4).', 2);
This is probably faster than the loop approach, but also more memory-intensive, because an intermediate m×n×3 array is computed.

Related

Summing partitions of an array in matlab

Suppose I have an array of length 15
x = randi([0 5], 1,15);
I want to sum every 3 elements of x together and put each sum in a new array called y, as in the following:
y = [y1 y2 y3 y4 y5];
Please help me in doing that in Matlab using for loops.
Here's a vectorized approach that automatically deals with a possible smaller last chunk:
x = randi([0 5], 1, 15); % example data
N = 3; % chunk size
y = accumarray(ceil((1:numel(x))/N).', x(:));
you can use reshape as follows:
y = sum(reshape(x,3,[]))
This reshapes your vector x to an array 3 by whatever is left, then sum along right dimension...
For the case the # of elements you want to sum doesnt add up to the total length of the vector, you can pad with zeros or NaN at the end to make it work. Here's I chose adding zeros:
x = randi([0 5], 1,15);
n = 4 ; % sum every n elements (which is the number of rows in the reshape)
try
y = sum(reshape(x, n, []));
catch
disp('added trailing zeros!')
x(numel(x) + (n - mod(numel(x), n))) = 0;
y = sum(reshape(x, n, []));
end
(you can do this with an if condition instead, I just like try catch more here)
Using for loops:
y = zeros(1,5);
for i = 1:5
idx = (i-1)*3 + 1:(i-1)*3 + 3;
y(i) = sum(x(idx));
end
Using a reference variable Target that is used to indicate the start position of each partition the loop below can be achieved. If you would only like to use only loops an alternative inner loop can be done. This method works almost on the same premise as windowing.
Method 1: Single For-Loop with Indexing
x = randi([0 5], 1,15);
y = zeros(1,length(x)/3);
Index = 1;
for Target = 1: +3: 15
Partition = x(Target:Target+2);
y(1,Index) = sum(Partition);
Index = Index + 1;
end
Method 2: Outer and Inner For-Loops
x = randi([0 5], 1,15);
y = zeros(1,length(x)/3);
Partition = zeros(1,3);
Index = 1;
for Target = 1: +3: 15
for Column = 1: +1: 3
Partition(1,Column) = x(1,Target+Column-1);
end
y(1,Index) = sum(Partition);
Index = Index + 1;
end

Vectorizing a code that requires to complement some elements of a binary array

I have a matrix A of dimension m-by-n composed of zeros and ones, and a matrix J of dimension m-by-1 reporting some integers from [1,...,n].
I want to construct a matrix B of dimension m-by-n such that for i = 1,...,m
B(i,j) = A(i,j) for j=1,...,n-1
B(i,n) = abs(A(i,n)-1)
If sum(B(i,:)) is odd then B(i,J(i)) = abs(B(i,J(i))-1)
This code does what I want:
m = 4;
n = 5;
A = [1 1 1 1 1; ...
0 0 1 0 0; ...
1 0 1 0 1; ...
0 1 0 0 1];
J = [1;2;1;4];
B = zeros(m,n);
for i = 1:m
B(i,n) = abs(A(i,n)-1);
for j = 1:n-1
B(i,j) = A(i,j);
end
if mod(sum(B(i,:)),2)~=0
B(i,J(i)) = abs(B(i,J(i))-1);
end
end
Can you suggest more efficient algorithms, that do not use the nested loop?
No for loops are required for your question. It just needs an effective use of the colon operator and logical-indexing as follows:
% First initialize B to all zeros
B = zeros(size(A));
% Assign all but last columns of A to B
B(:, 1:end-1) = A(:, 1:end-1);
% Assign the last column of B based on the last column of A
B(:, end) = abs(A(:, end) - 1);
% Set all cells to required value
% Original code which does not work: B(oddRow, J(oddRow)) = abs(B(oddRow, J(oddRow)) - 1);
% Correct code:
% Find all rows in B with an odd sum
oddRow = find(mod(sum(B, 2), 2) ~= 0);
for ii = 1:numel(oddRow)
B(oddRow(ii), J(oddRow(ii))) = abs(B(oddRow(ii), J(oddRow(ii))) - 1);
end
I guess for the last part it is best to use a for loop.
Edit: See the neat trick by EBH to do the last part without a for loop
Just to add to #ammportal good answer, also the last part can be done without a loop with the use of linear indices. For that, sub2ind is useful. So adopting the last part of the previous answer, this can be done:
% Find all rows in B with an odd sum
oddRow = find(mod(sum(B, 2), 2) ~= 0);
% convert the locations to linear indices
ind = sub2ind(size(B),oddRow,J(oddRow));
B(ind) = abs(B(ind)- 1);

R - avoid nested for loops

I have the following function which takes 4 vectors. The T vector has a given length and all 3 other vectors (pga, Sa5Hz and Sa1Hz) have a given (identical but not necessarily equal to T) lenght.
The output is a matrix with length(T) rows and length(pga) columns.
My code below seems like the perfect example of what NOT to do, however, I could not figure out a way to optimize it using an apply function. Can anyone help?
designSpectrum <- function (T, pga, Sa5Hz, Sa1Hz){
Ts <- Sa1Hz / Sa5Hz
#By convention, if Sa5Hz is null, set Ts as 0.
Ts[is.nan(Ts)] <- 0
res <- matrix(NA, nrow = length(T), ncol = length(pga))
for (i in 1:nrow(res))
{
for (j in 1:ncol(res))
{
res[i,j] <- if(T[i] <= 0) {pga[j]}
else if (T[i] <= 0.2 * Ts[j]) {pga[j] + T[i] * (Sa5Hz[j] - pga[j]) / (0.2 * Ts[j])}
else if (T[i] <= Ts[j]) {Sa5Hz[j]}
else Sa1Hz[j] / T[i]
}
}
return(res)
}
Instead of doing a double for loop and processing each i and j value separately, you could use the outer function to process all of them in one shot. Since you're now processing multiple i and j values simultaneously, you could switch to the vectorized ifelse statement instead of the non-vectorized if and else statements:
designSpectrum2 <- function (T, pga, Sa5Hz, Sa1Hz) {
Ts <- Sa1Hz / Sa5Hz
Ts[is.nan(Ts)] <- 0
outer(1:length(T), 1:length(pga), function(i, j) {
ifelse(T[i] <= 0, pga[j],
ifelse(T[i] <= 0.2 * Ts[j], pga[j] + T[i] * (Sa5Hz[j] - pga[j]) / (0.2 * Ts[j]),
ifelse(T[i] <= Ts[j], Sa5Hz[j], Sa1Hz[j] / T[i])))
})
}
identical(designSpectrum(T, pga, Sa5Hz, Sa1Hz), designSpectrum2(T, pga, Sa5Hz, Sa1Hz))
# [1] TRUE
Data:
T <- -1:3
pga <- 1:3
Sa5Hz <- 2:4
Sa1Hz <- 3:5
You can see the efficiency gains by testing on rather large vectors (here I'll use an output matrix with 1 million entries):
# Larger vectors
set.seed(144)
T2 <- runif(1000, -1, 3)
pga2 <- runif(1000, -1, 3)
Sa5Hz2 <- runif(1000, -1, 3)
Sa1Hz2 <- runif(1000, -1, 3)
# Runtime comparison
all.equal(designSpectrum(T2, pga2, Sa5Hz2, Sa1Hz2), designSpectrum2(T2, pga2, Sa5Hz2, Sa1Hz2))
# [1] TRUE
system.time(designSpectrum(T2, pga2, Sa5Hz2, Sa1Hz2))
# user system elapsed
# 4.038 1.011 5.042
system.time(designSpectrum2(T2, pga2, Sa5Hz2, Sa1Hz2))
# user system elapsed
# 0.517 0.138 0.652
The approach with outer is almost 10x faster.

How to vectorize the antenna arrayfactor expression in matlab

I have the antenna array factor expression here:
I have coded the array factor expression as given below:
lambda = 1;
M = 100;N = 200; %an M x N array
dx = 0.3*lambda; %inter-element spacing in x direction
m = 1:M;
xm = (m - 0.5*(M+1))*dx; %element positions in x direction
dy = 0.4*lambda;
n = 1:N;
yn = (n - 0.5*(N+1))*dy;
thetaCount = 360; % no of theta values
thetaRes = 2*pi/thetaCount; % theta resolution
thetas = 0:thetaRes:2*pi-thetaRes; % theta values
phiCount = 180;
phiRes = pi/phiCount;
phis = -pi/2:phiRes:pi/2-phiRes;
cmpWeights = rand(N,M); %complex Weights
AF = zeros(phiCount,thetaCount); %Array factor
tic
for i = 1:phiCount
for j = 1:thetaCount
for p = 1:M
for q = 1:N
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
end
end
end
end
How can I vectorize the code for calculating the Array Factor (AF).
I want the line:
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))));
to be written in vectorized form (by modifying the for loop).
Approach #1: Full-throttle
The innermost nested loop generates this every iteration - cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))), which are to summed up iteratively to give us the final output in AF.
Let's call the exp(.... part as B. Now, B basically has two parts, one is the scalar (2*pi*1j/lambda) and the other part
(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))) that is formed from the variables that are dependent on
the four iterators used in the original loopy versions - i,j,p,q. Let's call this other part as C for easy reference later on.
Let's put all that into perspective:
Loopy version had AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))), which is now equivalent to AF(i,j) = AF(i,j) + cmpWeights(q,p)*B, where B = exp((2*pi*1j/lambda)*(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i)))).
B could be simplified to B = exp((2*pi*1j/lambda)* C), where C = (xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))).
C would depend on the iterators - i,j,p,q.
So, after porting onto a vectorized way, it would end up as this -
%// 1) Define vectors corresponding to iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;
%// 2) Create vectorized version of C using all four vector iterators
mult1 = bsxfun(#times,sin(thetas(J)),cos(phis(I)).'); %//'
mult2 = bsxfun(#times,sin(thetas(J)),sin(phis(I)).'); %//'
mult1_xm = bsxfun(#times,mult1(:),permute(xm,[1 3 2]));
mult2_yn = bsxfun(#times,mult2(:),yn);
C_vect = bsxfun(#plus,mult1_xm,mult2_yn);
%// 3) Create vectorized version of B using vectorized C
B_vect = reshape(exp((2*pi*1j/lambda)*C_vect),phiCount*thetaCount,[]);
%// 4) Final output as matrix multiplication between vectorized versions of B and C
AF_vect = reshape(B_vect*cmpWeights(:),phiCount,thetaCount);
Approach #2: Less-memory intensive
This second approach would reduce the memory traffic and it uses the distributive property of exponential - exp(A+B) = exp(A)*exp(B).
Now, the original loopy version was this -
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp((2*pi*1j/lambda)*...
(xm(p)*sin(thetas(j))*cos(phis(i)) + yn(q)*sin(thetas(j))*sin(phis(i))))
So, after using the distributive property, we would endup with something like this -
K = (2*pi*1j/lambda)
part1 = K*xm(p)*sin(thetas(j))*cos(phis(i));
part2 = K*yn(q)*sin(thetas(j))*sin(phis(i));
AF(i,j) = AF(i,j) + cmpWeights(q,p)*exp(part1)*exp(part2);
Thus, the relevant vectorized approach would become something like this -
%// 1) Define vectors corresponding to iterators used in the loopy version
I = 1:phiCount;
J = 1:thetaCount;
P = 1:M;
Q = 1:N;
%// 2) Define the constant used at the start of EXP() call
K = (2*pi*1j/lambda);
%// 3) Perform the sine-cosine operations part1 & part2 in vectorized manners
mult1 = K*bsxfun(#times,sin(thetas(J)),cos(phis(I)).'); %//'
mult2 = K*bsxfun(#times,sin(thetas(J)),sin(phis(I)).'); %//'
%// Perform exp(part1) & exp(part2) in vectorized manners
part1_vect = exp(bsxfun(#times,mult1(:),xm));
part2_vect = exp(bsxfun(#times,mult2(:),yn));
%// Perform multiplications with cmpWeights for final output
AF = reshape(sum((part1_vect*cmpWeights.').*part2_vect,2),phiCount,[])
Quick Benchmarking
Here are the runtimes with the input data listed in the question for the original loopy approach and proposed approach #2 -
---------------------------- With Original Approach
Elapsed time is 358.081507 seconds.
---------------------------- With Proposed Approach #2
Elapsed time is 0.405038 seconds.
The runtimes suggests a crazy performance improvement with Approach #2!
The basic trick is to figure out what things are constant, and what things depend on the subscript term - and therefore are matrix terms.
Within the sum:
C(n,m) is a matrix
2π/λ is a constant
sin(θ)cos(φ) is a constant
x(m) and y(n) are vectors
So the two things I would do are:
Expand the xm and ym into matrices using meshgrid()
Take all the constant term stuff outside the loop.
Like this:
...
piFactor = 2 * pi * 1j / lambda;
[xgrid, ygrid] = meshgrid(xm, ym); % xgrid and ygrid will be size (N, M)
for i = 1:phiCount
for j = 1:thetaCount
xFactor = sin(thetas(j)) * cos(phis(i));
yFactor = sin(thetas(j)) * sin(phis(i));
expFactor = exp(piFactor * (xgrid * xFactor + ygrid * yFactor)); % expFactor is size (N, M)
elements = cmpWeights .* expFactor; % elements of sum, size (N, M)
AF(i, j) = AF(i, j) + sum(elements(:)); % sum and then integrate.
end
end
You could probably figure out how to vectorise the outer loop too, but hopefully that gives you a starting point.

generating distanced matrix for an n-dimensional hypercube

is there any algorithm or method of generating the adjacency matrix for a hypercube for any dimension? say your input is 5 it would create a 5-dimensional hypercube
all i can find are sources from
wiki
and
wolfram
If you want to generate the vertices of a N-D unit hypercube, you can basically make an N-value truthtable. Here's some code I use for that:
function output = ttable(values)
output = feval(#(y)feval(#(x)mod(ceil(repmat((1:x(1))', 1, numel(x) - 1) ./ repmat(x(2:end), x(1), 1)) - 1, repmat(fliplr(y), x(1), 1)) + 1, fliplr([1 cumprod(y)])), fliplr(values));
end
and to get the vertices of a 5-D hypercube you can call it like this:
vertices = ttable(ones(1, 5) * 2) - 1;
From here you can calculate the adjacency matrix by finding all vertices that differ by only one bit, i.e.:
adj_list = zeros(2^5, 5);
adj_mat = zeros(2^5, 2^5);
for v=1:2^5
L1_dists = sum(abs(vertices - repmat(vertices(v, :), 2^5, 1)), 2);
adj_list(v, :) = find(L1_dists == 1);
adj_mat(v, find(L1_dists == 1)) = 1;
end

Resources