Imagine for instance we have the following functions:
f = #(n) sin((0:1e-3:1) .* n * pi);
g = #(n, t) cos(n .^ 2 * pi ^2 / 2 .* t);
h = #(n) f(n) * g(n, 0);
Now, I would like to be able to enter an array of values for n into h and return a sum of the results for each value of n.
I am trying to be efficient, so I am avoiding the novice for-loop method of just filling out a pre-allocated matrix and summing down the columns. I also tried using arrayfun and converting the cell to a matrix then summing that, but it ended up being a slower process than the for-loop.
Does anyone know how I might do this?
The fact is the "novice" for-loop is going to be competitively as fast as any other vectorized solution thanks to improvements in JIT compilation in recent versions of MATLAB.
% array of values of n
len = 500;
n = rand(len,1);
% preallocate matrix
X = zeros(len,1001);
% fill rows
for i=1:len
X(i,:) = h(n(i)); % call function handle
end
out = sum(X,1);
The above is as fast as (maybe even faster):
XX = cell2mat(arrayfun(h, n, 'UniformOutput',false));
out = sum(XX,1);
EDIT:
Here it is computed directly without function handles in a single vectorized call:
n = rand(len,1);
t = 0; % or any other value
out = sum(bsxfun(#times, ...
sin(bsxfun(#times, n, (0:1e-3:1)*pi)), ...
cos(n.^2 * t * pi^2/2)), 1);
Related
I have an 8x8 matrix, e.g. A=rand(8,8). What I need to do is subset all 2x2 matrices along the diagonal. That means that I need to save matrices A(1:2,1:2), A(3:4,3:4), A(5:6,5:6), A(7:8,7:8). To better explain myself, the current version that I am using is the following:
AA = rand(8,8);
BB = zeros(8,2);
for i = 1:4
BB(2*i-1:2*i,:) = AA(2*i-1:2*i,2*i-1:2*i);
end
This works fine for small AA matrices and small AA submatrices, however as the size grows significantly (it can get up to even 50,000x50,000) using a for loop like the one above in not viable. Is there a way to achieve the above without the loop? I've thought of other approaches that could perhaps utilize upper and lower triangular matrices, however even these seem to need a loop at some point. Any help is appreciated!!
Here's a way:
AA = rand(8,8); % example matrix. Assumed square
n = 2; % submatrix size. Assumed to divide the size of A
mask = repelem(logical(eye(size(AA,1)/n)), n, n);
BB = reshape(permute(reshape(AA(mask), n, n, []), [1 3 2]), [], n);
This generates a logical mask that selects the required elements, and then rearranges them as desired using reshape and permute.
Here is an alternative that doesn't generate a full matrix to select the block diagonals, as in Luis Mendo' answer but instead directly generates the indices to these elements. It is likely that this will be faster for very large matrices, as creating the indexing matrix will be expensive in that case.
AA = rand(8,8); % example matrix. Assumed square
n = 2; % submatrix size. Assumed to divide the size of A
m=size(AA,1);
bi = (1:n)+(0:m:n*m-1).'; % indices for elements of one block
bi = bi(:); % turn into column vector
di = 1:n*(m+1):m*m; % indices for first element of each block
BB = AA(di+bi-1); % extract the relevant elements
BB = reshape(BB,n,[]).' % put these elements in the desired order
Benchmark
AA = rand(5000); % couldn't do 50000x50000 because that's too large!
n = 2;
BB1 = method1(AA,n);
BB2 = method2(AA,n);
BB3 = method3(AA,n);
assert(isequal(BB1,BB2))
assert(isequal(BB1,BB3))
timeit(#()method1(AA,n))
timeit(#()method2(AA,n))
timeit(#()method3(AA,n))
% OP's loop
function BB = method1(AA,n)
m = size(AA,1);
BB = zeros(m,n);
for i = 1:m/n
BB(n*(i-1)+1:n*i,:) = AA(n*(i-1)+1:n*i,n*(i-1)+1:n*i);
end
end
% Luis' mask matrix
function BB = method2(AA,n)
mask = repelem(logical(eye(size(AA,1)/n)), n, n);
BB = reshape(permute(reshape(AA(mask), n, n, []), [1 3 2]), [], n);
end
% Cris' indices
function BB = method3(AA,n)
m = size(AA,1);
bi = (1:n)+(0:m:n*m-1).';
bi = bi(:);
di = 0:n*(m+1):m*m-1;
BB = reshape(AA(di+bi),n,[]).';
end
On my computer, with MATLAB R2017a I get:
method1 (OP's loop): 0.0034 s
method2 (Luis' mask matrix): 0.0599 s
method3 (Cris' indices): 1.5617e-04 s
Note how, for a 5000x5000 array, the method in this answer is ~20x faster than a loop, whereas the loop is ~20 faster than Luis' solution.
For smaller matrices things are slightly different, Luis' method is almost twice as fast as the loop code for a 50x50 matrix (though this method still beats it by ~3x).
Here is the problem:
data = 1:0.5:(8E6+0.5);
An array of 16 million points, needs to be averaged every 10,000 elements.
Like this:
x = mean(data(1:10000))
But repeated N times, where N depends on the number of elements we average over
range = 10000;
N = ceil(numel(data)/range);
My current method is this:
data(1) = mean(data(1,1:range));
for i = 2:N
data(i) = mean(data(1,range*(i-1):range*i));
end
How can the speed be improved?
N.B: We need to overwrite the original array of data (essentially bin the data and average it)
data = 1:0.5:(8E6-0.5); % Your data, actually 16M-2 elements
N = 1e4; % Amount to average over
tmp = mod(numel(data),N); % find out whether it fits
data = [data nan(1,N-tmp)]; % add NaN if necessary
data2=reshape(data,N,[]); % reshape into a matrix
out = nanmean(data2,1); % get average over the rows, ignoring NaN
Visual confirmation that it works using plot(out)
Note that technically you can't do what you want if mod(numel(data),N) is not equal to 0, since then you'd have a remainder. I elected to average over everything in there, although ignoring the remainder is also an option.
If you're sure mod(numel(data),N) is zero every time, you can leave all that out and reshape directly. I'd not recommend using this though, because if your mod is not 0, this will error out on the reshape:
data = 1:0.5:(8E6+0.5); % 16M elements now
N = 1e4; % Amount to average over
out = sum(reshape(data,N,[]),1)./N; % alternative
This is a bit wasteful, but you can use movmean (which will handle the endpoints the way you want it to) and then subsample the output:
y = movmean(x, [0 9999]);
y = y(1:10000:end);
Even though this is wasteful (you're computing a lot of elements you don't need), it appears to outperform the nanmean approach (at least on my machine).
=====================
There's also the option to just compensate for the extra elements you added:
x = 1:0.5:(8E6-0.5);
K = 1e4;
Npad = ceil(length(x)/K)*K - length(x);
x((end+1):(end+Npad)) = 0;
y = mean(reshape(x, K, []));
y(end) = y(end) * K/(K - Npad);
reshape the data array into a 10000XN matrix, then compute the mean of each column using the mean function.
I would like to extend "univariate bootstrapping" to "multivariate bootsrapping", meaning in a first step I draw randomly with replacement out of a one-dimensional vector using this code:
s = RandStream.getGlobalStream();
reset(s)
n = 100000; % # of independent random trials
h = 52; % horizon
T = size(Resid_standard, 1);
Resid_bootstrapped = Resid_standard(unidrnd(T, h, n));
Now, the basic vector Resid_standard is not a uni-dimensional vector but a Tx2 matrix and I want to not only draw random numbers but random pairs.
How do I have to modify my code to achieve this?
The output in the univariate case is a 100000x50 matrix. The output for the two-dimensional case would be three-dimensional. How could I store my results?
One solution is to store the index vector, make use of linear indexing, and concatenate the results:
r_ind = unidrnd(T, h, n);
Resid_bootstrapped = cat(3, Resid_standard(r_ind), Resid_standard(r_ind + T));
Resid_bootstrapped will then be a h×n×2 matrix.
This can even be shortened into a one-liner:
Resid_bootstrapped = reshape(Resid_standard(unidrnd(T, h, n), [1,2]), h, n, 2);
So I'm trying to write a function to generate Hermite polynomials and it's doing something super crazy ... Why does it generate different elements for h when I start with a different n? So inputting Hpoly(2,1) gives
h = [ 1, 2*y, 4*y^2 - 2]
while for Hpoly(3,1) ,
h = [ 1, 2*y, 4*y^2 - 4, 2*y*(4*y^2 - 4) - 8*y]
( (4y^2 - 2) vs (4y^2 - 4) as a third element here )
also, I can't figure out how to actually evaluate the expression. I tried out = subs(h(np1),y,x) but that did nothing.
code:
function out = Hpoly(n, x)
clc;
syms y
np1 = n + 1;
h = [1, 2*y];
f(np1)
function f(np1)
if numel(h) < np1
f(np1 - 1)
h(np1) = 2*y*h(np1-1) - 2*(n-1)*h(np1-2);
end
end
h
y = x;
out = h(np1);
end
-------------------------- EDIT ----------------------------
So I got around that by using a while loop instead. I wonder why the other way didn't work ... (and still can't figure out how to evaluate the expression other than just plug in x from the very beginning ... I suppose that's not that important, but would still be nice to know...)
Sadly, my code isn't as fast as hermiteH :( I wonder why.
function out = Hpoly(n, x)
h = [1, 2*x];
np1 = n + 1;
while np1 > length(h)
h(end+1) = 2*x*h(end) - 2*(length(h)-1)*h(end-1);
end
out = h(end)
end
Why is your code slower? Recursion is not necessarily of Matlab's fortes so you may have improved it by using a recurrence relation. However, hermiteH is written in C and your loop won't be as fast as it could be because you're using a while instead of for and needlessly reallocating memory instead of preallocating it. hermiteH may even use a lookup table for the first coefficients or it might benefit from vectorization using the explicit expression. I might rewrite your function like this:
function h = Hpoly(n,x)
% n - Increasing sequence of integers starting at zero
% x - Point at which to evaluate polynomial, numeric or symbolic value
mx = max(n);
h = cast(zeros(1,mx),class(x)); % Use zeros(1,mx,'like',x) in newer versions of Matlab
h(1) = 1;
if mx > 0
h(2) = 2*x;
end
for i = 2:length(n)-1
h(i+1) = 2*x*h(i)-2*(i-1)*h(i-1);
end
You can then call it with
syms x;
deg = 3;
h = Hpoly(0:deg,x)
which returns [ 1, 2*x, 4*x^2 - 2, 2*x*(4*x^2 - 2) - 8*x] (use expand on the output if you want). Unfortunately, this won't be much faster if x is symbolic.
If you're only interested in numeric results of the the polynomial evaluated at particular values, then it's best to avoid symbolic math altogether. The function above valued for double precision x will be three to four orders of magnitude faster than for symbolic x. For example:
x = pi;
deg = 3;
h = Hpoly(0:deg,x)
yields
h =
1.0e+02 *
0.010000000000000 0.062831853071796 0.374784176043574 2.103511015993210
Note:
The hermiteH function is R2015a+, but assuming that you still have access to the Symbolic Math toolbox and the Matlab version is R2012b+, you can also try calling MuPAD's orthpoly::hermite. hermiteH used this function under the hood. See here for details on how to call MuPAD functions from Matlab. This function is a bit simpler in that it only returns a single term. Using a for loop:
syms x;
deg = 2;
h = sym(zeros(1,deg+1));
for i = 1:deg+1
h(i) = feval(symengine,'orthpoly::hermite',i-1,x);
end
Alternatively, you can use map to vectorize the above:
deg = 2;
h = feval(symengine,'map',0:deg,'n->orthpoly::hermite(n,x)');
Both return [ 1, 2*x, 4*x^2 - 2].
By "practically equivalent", I mean that their distances are of order epsilon apart (or 0.000001). Equality in MATLAB often doesn't really work for long floating numbers.
If I simply do abs(A1 - A2) < 0.000001, it's not going to work since size(A1) != size(A2)
You can get the answer by calculating distance between two vectors using MATLAB's pdist2 function.
dist=pdist2(A1,A2);
minDist=min(dist,[],2);
indices_A1=minDist<=0.000001;
desired_A1=A1(indices_A1);
Not tested, but should work.
Since you care about a distance from any element to any element, you can create a distance matrix from the vectorized matrices and probe that for the distance threshold. Example:
A = rand(10, 4); % (example) matrix A
B = rand(3, 5); % matrix B of different size
minDist = 0.005;
Solution: Repeat vectorized matrices, column- and row-wise to get same size matrices and apply matrix-based distance estimation:
Da = repmat(A(:), 1, length(B(:))); % size 40 x 15
Db = repmat(B(:)', length(A(:)), 1); % size 40 x 15
DD = Da - Db;
indA = any(abs(DD) < minDist, 2);
The use of any() will give you logical indices to any value of A that is close to any value of B). You can directly index/return the elements of A using indA.
The matrix DD (as #Shai also points out) can be equivalently estimated through bsxfun
DD = bsxfun(#minus, A(:), B(:)');
In addition: you can map from row-index (corresponding to A elements) back to matrix A indices using:
[iA, jA] = ind2sub(size(A), indA);
Assert/test that all the returned values are less than minDist,e.g. using:
for k = 1:length(iA);
d(k) = min(min(abs(A(iA(k), jA(k)) - B)));
end
all(d < minDist)
(tested in Octave 3.6.2)