Matlab: create matrix whose columns are identical vectors. Use repmat() or multiply by ones()?

I want to create a matrix from a vector by concatenating the vector onto itself n times. So if my vector is mx1, then my matrix will be mxn and each column of the matrix will be equal to the vector.
Which of the following is the best/correct way, or maybe there is a better way I do not know?
matrix = repmat(vector, 1, n);
matrix = vector * ones(1, n);
Thanks

Here is some benchmarking using timeit with different vector sizes and repetition factors. The results shown are for Matlab R2015b on Windows.
First define a function for each of the considered approaches:
%// repmat approach
function matrix = f_repmat(vector, n)
matrix = repmat(vector, 1, n);
%// multiply approach
function matrix = f_multiply(vector, n)
matrix = vector * ones(1, n);
%// indexing approach
function matrix = f_indexing(vector,n)
matrix = vector(:,ones(1,n));
Then generate vectors of different size, and use different repetition factors:
M = round(logspace(2,4,15)); %// vector sizes
N = round(logspace(2,3,15)); %// repetition factors
time_repmat = NaN(numel(M), numel(N)); %// preallocate results
time_multiply = NaN(numel(M), numel(N));
time_indexing = NaN(numel(M), numel(N));
for ind_m = 1:numel(M);
for ind_n = 1:numel(N);
vector = (1:M(ind_m)).';
n = N(ind_n);
time_repmat(ind_m, ind_n) = timeit(@() f_repmat(vector, n)); %// measure time
time_multiply(ind_m, ind_n) = timeit(@() f_multiply(vector, n));
time_indexing(ind_m, ind_n) = timeit(@() f_indexing(vector, n));
end
end
The results are plotted in the following two figures, using repmat as reference:
figure
imagesc(time_multiply./time_repmat)
set(gca, 'xtick',1:2:numel(N), 'xticklabels',N(1:2:end))
set(gca, 'ytick',1:2:numel(M), 'yticklabels',M(1:2:end))
title('Time of multiply / time of repmat')
axis image
colorbar
figure
imagesc(time_indexing./time_repmat)
set(gca, 'xtick',1:2:numel(N), 'xticklabels',N(1:2:end))
set(gca, 'ytick',1:2:numel(M), 'yticklabels',M(1:2:end))
title('Time of indexing / time of repmat')
axis image
colorbar
Perhaps a better comparison is to indicate, for each tested vector size and repetition factor, which of the three approaches is the fastest:
figure
times = cat(3, time_repmat, time_multiply, time_indexing);
[~, fastest] = min(times, [], 3);
imagesc(fastest)
set(gca, 'xtick',1:2:numel(N), 'xticklabels',N(1:2:end))
set(gca, 'ytick',1:2:numel(M), 'yticklabels',M(1:2:end))
title('1: repmat is fastest; 2: multiply is; 3: indexing is')
axis image
colorbar
Some conclusions can be drawn from the figures:
The multiply-based approach is always slower than repmat
The indexing-based approach is similar to repmat. It tends to be faster for large values of vector size or repetition factor, and slower for small values.

Either method is correct if they provide you with the desired output.
However, depending on how you declare your vector you may get incorrect results with repmat that will be spotted if you use ones. For instance take this example
>> n = 5;
>> v = 1:10;
>> m = v * ones(1, n)
Error using *
Inner matrix dimensions must agree.
>> m = repmat(v, 1, n)
m =
Columns 1 through 22
1 2 3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2
Columns 23 through 44
3 4 5 6 7 8 9 10 1 2 3 4 5 6 7 8 9 10 1 2 3 4
Columns 45 through 50
5 6 7 8 9 10
The multiplication by ones raises an error to let you know you aren't doing the right thing, but repmat doesn't. The following example, on the other hand, works correctly with both repmat and ones:
>> v = (1:10).';
>> m = v * ones(1, n)
m =
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
8 8 8 8 8
9 9 9 9 9
10 10 10 10 10
>> m = repmat(v, 1, n)
m =
1 1 1 1 1
2 2 2 2 2
3 3 3 3 3
4 4 4 4 4
5 5 5 5 5
6 6 6 6 6
7 7 7 7 7
8 8 8 8 8
9 9 9 9 9
10 10 10 10 10
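One way to sidestep the orientation pitfall, offered here only as a sketch and not as something the answer above proposes, is to force the input into a column with v(:) before replicating. The variable names simply follow the example above.

```matlab
n = 5;
v = 1:10;                   % row vector, as in the failing example above
m = repmat(v(:), 1, n);     % v(:) is always a column, whatever the shape of v
disp(size(m))               % displays the size, 10 rows by 5 columns
```

With this, a row or a column input yields the same 10-by-5 result, at the cost of no longer being warned about an unexpected orientation.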

You can also do this -
vector(:,ones(1,n))
But, if I have to choose, repmat would be the go-to approach for me, as it is made exactly for this purpose. Also, depending on how you are going to use this replicated array, you can avoid creating it altogether with bsxfun, which replicates its input arrays on the fly while applying some operation to them. A comparison of the two (Comparing BSXFUN and REPMAT) shows bsxfun to be better than repmat in most cases.
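As a minimal sketch of that idea, assuming the replicated matrix is only needed as an operand of an element-wise addition, the two lines below give the same result. The matrix named other is a hypothetical operand introduced purely for illustration.

```matlab
vector = (1:4).';                 % m-by-1 column
n = 3;
other = magic(4);                 % hypothetical operand for illustration
other = other(:, 1:n);            % keep an m-by-n block

% Explicit replication, then the element-wise operation
withRepmat = repmat(vector, 1, n) + other;

% Same operation without building the replicated matrix
withBsxfun = bsxfun(@plus, vector, other);

disp(isequal(withRepmat, withBsxfun))
```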
Benchmarking
For the sake of performance, let's test out these. Here's a benchmarking code to do so -
%// Inputs
vector = rand(1000,1);
n = 1000;
%// Warm up tic/toc.
for iter = 1:50000
tic(); elapsed = toc();
end
disp(' ------- With REPMAT -------')
tic,
for iter = 1:200
A = repmat(vector, 1, n);
end
toc, clear A
disp(' ------- With vector(:,ones(1,n)) -------')
tic,
for iter = 1:200
A = vector(:,ones(1,n));
end
toc, clear A
disp(' ------- With vector * ones(1, n) -------')
tic,
for iter = 1:200
A = vector * ones(1, n);
end
toc
Runtime results -
------- With REPMAT -------
Elapsed time is 1.241546 seconds.
------- With vector(:,ones(1,n)) -------
Elapsed time is 1.212566 seconds.
------- With vector * ones(1, n) -------
Elapsed time is 3.023552 seconds.

Both are correct, but repmat is a more general solution for multi-dimensional matrix copying and is thus bound to be slower than another solution. The specific 'homemade' solution of multiplying two vectors is possibly faster. It is probably even faster to do selecting instead of multiplying, i.e. vector(:,ones(n,1)) instead of vector*ones(1,n).
EDIT:
Type open repmat in your Command Window. As you can see, it is not a built-in function. You can see that it also makes use of ones (selecting) to copy matrices. However, since it is a more general solution (for scalars and multi-dimensional matrices and copies in multiple directions), you will find unnecessary if statements and other unnecessary code, effectively slowing things down.
EDIT:
Multiplying vectors with ones becomes slower for very large vectors. The unequivocal winner is using ones with selection, i.e. vector(:,ones(n,1)) (which should always be faster than repmat since it uses the same strategy).
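Whatever the timing on a given machine, the approaches discussed in this thread produce the same matrix. The following is a small check, with illustrative variable names, that also shows the index vector can be ones(1,n) or ones(n,1).

```matlab
vector = (1:6).';                     % m-by-1 column
n = 4;

A_repmat   = repmat(vector, 1, n);    % replication
A_multiply = vector * ones(1, n);     % outer product with a row of ones
A_index    = vector(:, ones(1, n));   % select column 1 n times
A_index2   = vector(:, ones(n, 1));   % orientation of the index does not matter

disp(isequal(A_repmat, A_multiply, A_index, A_index2))
```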

Related

get x elements from center of vector

How do I create a function (e.g. here, an anonymous one but I don't mind any) to get x elements from vec that are most centered (i.e. around the median)? In essence I want a function with same syntax as Matlab's randsample(n,k), but for non-random, with elements spanning around the center.
cntr=@(vec,x) vec(round(end*.5)+(-floor(x/2):floor(x/2))); %this function in question
cntr(1:10,3) % outputs 3 values around median 5.5 => [4 5 6];
cntr(1:11,5) % outputs => [4 5 6 7 8]
Note that vec is always sorted.
One part that I struggle with is not to output more than the limits of vec. For example, cntr(1:10, 10) should not throw an error.
edit: sorry to answer-ers for many updates of question
It's not a one-line anonymous function, but you can do this pretty simply with a couple calls to sort:
function vec = cntr(vec, x)
[~, index] = sort(abs(vec-median(vec)));
vec = vec(sort(index(1:min(x, end))));
end
The upside: it will still return the same set of values even if vec isn't sorted. Some examples:
>> cntr(1:10, 3)
ans =
4 5 6
>> cntr(1:11, 5)
ans =
4 5 6 7 8
>> cntr(1:10, 10) % No indexing errors
ans =
1 2 3 4 5 6 7 8 9 10
>> cntr([3 10 2 4 1 6 5 8 11 7 9], 5) % Unsorted version of example 2
ans =
4 6 5 8 7 % Same values, in their original order in vec
OLD ANSWER
NOTE: This applied to an earlier version of the question where a range of x values below and x values above the median were desired as output. Leaving it for posterity...
I broke it down into these steps (starting with a sorted vec):
Find the values in vec less than the median, get the last x indices of these, then take the first (smallest) of them. This is the starting index.
Find the values in vec greater than the median, get the first x indices of these, then take the last (largest) of them. This is the ending index.
Use the starting and ending indices to select the center portion of vec.
Here's the implementation of the above, using the functions find, min, and max:
cntr = @(vec, x) vec(min(find(vec < median(vec), x, 'last')):max(find(vec > median(vec), x)));
And a few tests:
>> cntr(1:10, 3) % 3 above and 3 below 5.5
ans =
3 4 5 6 7 8
>> cntr(1:11, 5) % 5 above and 5 below 6 (i.e. all of vec)
ans =
1 2 3 4 5 6 7 8 9 10 11
>> cntr(1:10, 10) % 10 above and 10 below 5.5 (i.e. all of vec, no indexing errors)
ans =
1 2 3 4 5 6 7 8 9 10
median requires sorting the array elements. Might as well sort manually, and pick out the middle block (edit: OP's comment indicates elements are already sorted, more justification for keeping it simple):
function data = cntr(data,x)
x = min(x,numel(data)); % don't pick more elements than exist
data = sort(data);
start = floor((numel(data)-x)/2) + 1;
data = data(start:start+x-1);
You could stick this into a single-line anonymous function with some tricks, but that just makes the code ugly. :)
Note that in the case of an uneven division (when we don't leave an even number of elements out), here we prioritize an element on the left. Here is what I mean:
0 0 0 0 0 0 0 0 0 0 0 => 11 elements, x=4
      \_____/
      picking these 4 values
This choice could be made more complex, for example shifting the interval left or right depending on which of those values is closest to the mean.
Given data (i.e. vec) is already sorted, the indexing operation can be kept to a single line:
cntr = @(data,x) data( floor((numel(data)-x)/2) + (1:x) );
The thing that is missing in that line is x = min(x,numel(data)), which we need to add twice because we can't change a variable in an anonymous function:
cntr = @(data,x) data( floor((numel(data)-min(x,numel(data)))/2) + (1:min(x,numel(data))) );
This we can simplify to:
cntr = @(data,x) data( floor(max(numel(data)-x,0)/2) + (1:min(x,numel(data))) );
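The clamped one-liner can be exercised on the cases from the question, plus a request for more elements than exist. This is just a usage sketch; the last call, with x = 15, is an extra case that does not appear in the question.

```matlab
cntr = @(data,x) data( floor(max(numel(data)-x,0)/2) + (1:min(x,numel(data))) );

r1 = cntr(1:10, 3);     % three elements around the center
r2 = cntr(1:11, 5);     % five elements around the center
r3 = cntr(1:10, 10);    % the whole vector, no indexing error
r4 = cntr(1:10, 15);    % x larger than the vector: clamped to the whole vector
r5 = cntr(1:11, 4);     % uneven split: the extra element is taken on the left

disp(r1), disp(r2), disp(r5)
```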

How to repeat every 3rd element of a vector?

I have a vector like this:
h = [1,2,3,4,5,6,7,8,9,10,11,12]
And I want to repeat every third element like so:
h_rep = [1,2,3,3,4,5,6,6,7,8,9,9,10,11,12,12]
How do I accomplish this elegantly in MATLAB? The actual arrays are huge, so ideally I don't want to write a for loop. Is there a vectorized way to do this?
One way to do this would be to use the recent repelem function that was released in version R2015b where you can repeat each element in a vector a certain amount of times. In this case, specify a vector where every third element is a 2 with the rest of the values being a 1 as the number of times to repeat the corresponding element, then use the function:
N = numel(h);
rep = ones(1, N);
rep(3:3:end) = 2;
h_rep = repelem(h, rep);
Using your example: h = 1 : 12, we thus get:
>> h_rep
h_rep =
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
If repelem is not available to you, then a clever use of cumsum may help. Basically, note that after every three elements, the next one is a copy of the previous element. If we had an indicator vector of [1 1 1 0], where 1 means we advance to the next element of h and 0 tells us to copy the last value, then applying a cumulative sum (cumsum) to repeated copies of this pattern gives exactly the indices we need to index into h. The output has one extra element for every three elements of h, so create a vector of ones of length numel(h) + floor(numel(h) / 3) to make space for the duplicate elements, then set every fourth element to 0 before applying cumsum:
N = numel(h);
rep = ones(1, N + floor(N / 3));
rep(4:4:end) = 0;
rep = cumsum(rep);
h_rep = h(rep);
Thus:
>> h_rep
h_rep =
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
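The cumsum idea can be sketched for a general step k and for lengths that are not a multiple of k. The step variable k and the cross-check via sort are assumptions added here for illustration; the output length is taken to be N + floor(N/k).

```matlab
h = 1:14;                           % length deliberately not a multiple of k
k = 3;                              % repeat every k-th element
N = numel(h);

rep = ones(1, N + floor(N / k));    % one extra slot per repeated element
rep(k+1:k+1:end) = 0;               % a 0 means "stay on the previous index"
idx = cumsum(rep);                  % 1 2 3 3 4 5 6 6 ...
h_rep = h(idx);

% Cross-check against the index-and-sort formulation
h_check = h(sort([1:N, k:k:N]));
disp(isequal(h_rep, h_check))
```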
One last suggestion (thanks to user @bremen_matt) would be to reshape your vector into a matrix so that it has 3 rows, duplicate the last row, then reshape the resulting duplicated matrix back to a single vector:
h_rep = reshape(h, 3, []);
h_rep = reshape([h_rep; h_rep(end,:)], 1, []);
We again get:
>> h_rep
h_rep =
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
Of course, the obvious caveat with the above code is that it requires the length of vector h to be evenly divisible by 3.
(Modified according to rayryeng's correct observations)...
Another solution is to play around with the reshape function. If you reshape the matrix to a 3xn matrix first...
B = reshape(h,3,[])
And then copy the last row
B = [B;B(end,:)]
And finally vectorize the solution...
B(:).'
You can use just indexing:
h = [1,2,3,4,5,6,7,8,9,10,11,12]; % initial data
n = 3; % step for repetition
h_rep = h(ceil(n/(n+1):n/(n+1):end));
An index-based approach (using sort):
h_rep = h(sort([1:numel(h) 3:3:numel(h)]));
Or a slightly shorter syntax...
h_rep = h(sort([1:end 3:3:end]));
I think this will do it:
h = [1,2,3,4,5,6,7,8,9,10,11,12];
h0=kron(h,[1 1])
h_rep=h0(mod(1:length(h0),2)==0 | mod(1:length(h0),3)==2)
Answer:
1 2 3 3 4 5 6 6 7 8 9 9 10 11 12 12
Explanation:
After duplicating every element, you select only those that you want. You can extend this idea to duplicate every second and third element, etc.

average operation in the first 2 of 3 dimensions of a matrix

Suppose A is a 3-D matrix as below (2 rows-2 columns-2 pages).
A(:,:,1)=[1,2;3,4];
A(:,:,2)=[5,6;7,8];
I want to have a vector, say "a", whose entries are the averages of the diagonal elements of the matrix on each page. So in this simple case, a=[(1+4)/2;(5+8)/2].
But I am having difficulty doing so in MATLAB. I tried the code below, but it failed.
mean(A(1,1,:),A(2,2,:))
You can use "partially linear indexing" in the two dimensions that define the diagonal, as follows:
Since partially linear indexing can only be applied on trailing dimensions, you first need to apply permute to rearrange dimensions, so that the first and second dimensions become second and third.
Now you leave the first dimension untouched, linearly index the diagonals in the second and third dimensions (which effectively reduces those two dimensions to one), and apply mean along the (combined) second dimension.
Code:
B = permute(A, [3 1 2]); %// step 1: permute
result = mean(B(:,1:size(A,1)+1:size(A,1)*size(A,2)), 2); %// step 2: index and mean
In your example,
A(:,:,1)=[1,2;3,4];
A(:,:,2)=[5,6;7,8];
this gives
result =
2.5000
6.5000
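For reference, a plain loop over the pages using diag gives the same numbers and can serve as a readable cross-check of the vectorized version. This is only a sketch; it is not claimed to be faster.

```matlab
A = cat(3, [1 2; 3 4], [5 6; 7 8]);   % the 2x2x2 example from the question

r = size(A, 3);
a = zeros(r, 1);                      % one average per page
for k = 1:r
    a(k) = mean(diag(A(:, :, k)));    % diagonal of page k, then its mean
end

% Vectorized version from the answer above, for comparison
B = permute(A, [3 1 2]);
result = mean(B(:, 1:size(A,1)+1:size(A,1)*size(A,2)), 2);
disp(isequal(a, result))
```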
You can use bsxfun for a generic solution -
[m,n,r] = size(A)
mean(A(bsxfun(@plus,[1:n+1:n^2]',[0:r-1]*m*n)),1)
Sample run -
>> A
A(:,:,1) =
8 4 1
7 6 3
1 5 8
A(:,:,2) =
1 7 6
8 5 2
1 2 7
A(:,:,3) =
6 2 8
1 1 6
1 4 5
A(:,:,4) =
8 1 6
1 5 1
9 2 7
>> [m,n,r] = size(A);
>> sum(A(bsxfun(@plus,[1:n+1:n^2]',[0:r-1]*m*n)),1)
ans =
22 13 12 20
>> mean(A(bsxfun(@plus,[1:n+1:n^2]',[0:r-1]*m*n)),1)
ans =
7.3333 4.3333 4 6.6667

Matlab: Reshaping grid points from ndgrid into N x m matrix [duplicate]

This question pops up quite often in one form or another (see for example here or here). So I thought I'd present it in a general form, and provide an answer which might serve for future reference.
Given an arbitrary number n of vectors of possibly different sizes, generate an n-column matrix whose rows describe all combinations of elements taken from those vectors (Cartesian product) .
For example,
vectors = { [1 2], [3 6 9], [10 20] }
should give
combs = [ 1 3 10
1 3 20
1 6 10
1 6 20
1 9 10
1 9 20
2 3 10
2 3 20
2 6 10
2 6 20
2 9 10
2 9 20 ]
The ndgrid function almost gives the answer, but has one caveat: n output variables must be explicitly defined to call it. Since n is arbitrary, the best way is to use a comma-separated list (generated from a cell array with n cells) to serve as output. The resulting n matrices are then concatenated into the desired n-column matrix:
vectors = { [1 2], [3 6 9], [10 20] }; %// input data: cell array of vectors
n = numel(vectors); %// number of vectors
combs = cell(1,n); %// pre-define to generate comma-separated list
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order
combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1
combs = reshape(combs,[],n); %// reshape to obtain desired matrix
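A few sanity checks can make the effect of the reversed comma-separated lists concrete. The sketch below repeats the steps on the example input and compares the row count and row order with what the question asks for; the name expectedRows is introduced here only for illustration.

```matlab
vectors = { [1 2], [3 6 9], [10 20] };
n = numel(vectors);

combs = cell(1, n);
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1});   % last vector varies fastest
combs = reshape(cat(n+1, combs{:}), [], n);

expectedRows = prod(cellfun(@numel, vectors));   % 2*3*2 combinations
disp(size(combs, 1) == expectedRows)
disp(combs(1, :))                                % first combination
disp(combs(end, :))                              % last combination
```

Because each input vector in this example is increasing, the rows come out already sorted, so sortrows(combs) leaves the matrix unchanged.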
A little bit simpler ... if you have the Neural Network toolbox you can simply use combvec:
vectors = {[1 2], [3 6 9], [10 20]};
combs = combvec(vectors{:}).' % Use cells as arguments
which returns a matrix in a slightly different order:
combs =
1 3 10
2 3 10
1 6 10
2 6 10
1 9 10
2 9 10
1 3 20
2 3 20
1 6 20
2 6 20
1 9 20
2 9 20
If you want the matrix that is in the question, you can use sortrows:
combs = sortrows(combvec(vectors{:}).')
% Or equivalently as per @LuisMendo in the comments:
% combs = fliplr(combvec(vectors{end:-1:1}).')
which gives
combs =
1 3 10
1 3 20
1 6 10
1 6 20
1 9 10
1 9 20
2 3 10
2 3 20
2 6 10
2 6 20
2 9 10
2 9 20
If you look at the internals of combvec (type edit combvec in the command window), you'll see that it uses different code than @LuisMendo's answer. I can't say which is more efficient overall.
If you happen to have a matrix whose rows are akin to the earlier cell array you can use:
vectors = [1 2;3 6;10 20];
vectors = num2cell(vectors,2);
combs = sortrows(combvec(vectors{:}).')
I've done some benchmarking on the two proposed solutions. The benchmarking code is based on the timeit function, and is included at the end of this post.
I consider two cases: three vectors of size n, and three vectors of sizes n/10, n and n*10 respectively (both cases give the same number of combinations). n is varied up to a maximum of 240 (I chose this value to avoid the use of virtual memory on my laptop computer).
The results are given in the following figure. The ndgrid-based solution is seen to consistently take less time than combvec. It's also interesting to note that the time taken by combvec varies a little less regularly in the different-size case.
Benchmarking code
Function for ndgrid-based solution:
function combs = f1(vectors)
n = numel(vectors); %// number of vectors
combs = cell(1,n); %// pre-define to generate comma-separated list
[combs{end:-1:1}] = ndgrid(vectors{end:-1:1}); %// the reverse order in these two
%// comma-separated lists is needed to produce the rows of the result matrix in
%// lexicographical order
combs = cat(n+1, combs{:}); %// concat the n n-dim arrays along dimension n+1
combs = reshape(combs,[],n);
Function for combvec solution:
function combs = f2(vectors)
combs = combvec(vectors{:}).';
Script to measure time by calling timeit on these functions:
nn = 20:20:240;
t1 = [];
t2 = [];
for n = nn;
%//vectors = {1:n, 1:n, 1:n};
vectors = {1:n/10, 1:n, 1:n*10};
t = timeit(@() f1(vectors));
t1 = [t1; t];
t = timeit(@() f2(vectors));
t2 = [t2; t];
end
Here's a do-it-yourself method that made me giggle with delight, using nchoosek, although it's not better than @Luis Mendo's accepted solution.
For the example given, after 1,000 runs this solution took my machine on average 0.00065935 s, versus the accepted solution 0.00012877 s. For larger vectors, following @Luis Mendo's benchmarking post, this solution is consistently slower than the accepted answer. Nevertheless, I decided to post it in hopes that maybe you'll find something useful about it:
Code:
tic;
v = {[1 2], [3 6 9], [10 20]};
L = [0 cumsum(cellfun(@length,v))];
V = cell2mat(v);
J = nchoosek(1:L(end),length(v));
J(any(J>repmat(L(2:end),[size(J,1) 1]),2) | ...
any(J<=repmat(L(1:end-1),[size(J,1) 1]),2),:) = [];
V(J)
toc
gives
ans =
1 3 10
1 3 20
1 6 10
1 6 20
1 9 10
1 9 20
2 3 10
2 3 20
2 6 10
2 6 20
2 9 10
2 9 20
Elapsed time is 0.018434 seconds.
Explanation:
L gets the lengths of each vector using cellfun. Although cellfun is basically a loop, it's efficient here considering your number of vectors will have to be relatively low for this problem to even be practical.
V concatenates all the vectors for easy access later (this assumes you entered all your vectors as rows. v' would work for column vectors.)
nchoosek gets all the ways to pick n=length(v) elements from the total number of elements L(end). There will be more combinations here than what we need.
J =
1 2 3
1 2 4
1 2 5
1 2 6
1 2 7
1 3 4
1 3 5
1 3 6
1 3 7
1 4 5
1 4 6
1 4 7
1 5 6
1 5 7
1 6 7
2 3 4
2 3 5
2 3 6
2 3 7
2 4 5
2 4 6
2 4 7
2 5 6
2 5 7
2 6 7
3 4 5
3 4 6
3 4 7
3 5 6
3 5 7
3 6 7
4 5 6
4 5 7
4 6 7
5 6 7
Since there are only two elements in v(1), we need to throw out any rows where J(:,1)>2. Similarly, where J(:,2)<3, J(:,2)>5, etc... Using L and repmat we can determine whether each element of J is in its appropriate range, and then use any to discard rows that have any bad element.
Finally, these aren't the actual values from v, just indices. V(J) will return the desired matrix.
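The filtering step described above can be traced on the example with the range bounds given names. The variables lowerBound and upperBound are introduced here only for readability; the logic is otherwise the same as in the code of this answer.

```matlab
v = {[1 2], [3 6 9], [10 20]};
L = [0 cumsum(cellfun(@length, v))];        % [0 2 5 7]: cumulative lengths
V = cell2mat(v);                            % all values in one row
J = nchoosek(1:L(end), length(v));          % all index triples, 35 rows

% Column k of J must lie in L(k)+1 ... L(k+1)
lowerBound = repmat(L(1:end-1), [size(J,1) 1]);
upperBound = repmat(L(2:end),   [size(J,1) 1]);
bad = any(J > upperBound, 2) | any(J <= lowerBound, 2);
J(bad, :) = [];                             % keep one index per input vector

combs = V(J);                               % map indices back to values
disp(size(combs))
```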

periodic structure in matlab

I'm trying to create a script to solve my problem, but I got stuck in one place.
So I have imported a .txt file containing a 4x1 matrix (simplified for this example; in my case it might be a 1209x1 matrix) which holds some X coordinates. It looks like this:
0
1
2
3
Those are the coordinates for one period, and I need to get a single column covering N periods. Each period is the same and has length L.
So you can do it manually by this script, for example for N=3 periods:
X=[X; X+L; X+2*L];
So, for example, if L=3
then I will get
0
1
2
3
3
4
5
6
6
7
8
9
It works well, but it's not practical if I need to work with a large number of periods, say N=1000, or if I need to change their number quickly. Is there a way to parameterize this operation so I can just enter a value for N and get X for N periods?
Thanks and Regards
I don't have MATLAB on this machine so I can't test, but the most straightforward implementation would be something like:
n = 1000;
L = 3;
nvalues = length(X); % Assuming X is your initial vector
newx = zeros(n*nvalues, 1); % Preallocate new array
for ii = 0:(n-1)
startidx = (nvalues*ii) + 1;
endidx = nvalues*(ii+1);
newx(startidx:endidx) = X + ii*L; % semicolon suppresses output on every iteration
end
You can use bsxfun to create X, X+L, X+2*L, ... and then reshape it to a vector
>> F=bsxfun(@plus, X, (0:(N-1))*L)
F =
0 3 6
1 4 7
2 5 8
3 6 9
>> X=F(:)
X =
0
1
2
3
3
4
5
6
6
7
8
9
or in a more concise form:
>> X=reshape(bsxfun(@plus, X, (0:(N-1))*L), [], 1)
X =
0
1
2
3
3
4
5
6
6
7
8
9
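The same result can also be built by stacking N copies of X and adding a matching column of offsets. This is an alternative sketch using repmat and kron, not part of the answers above; the names offsets, Xall and Xalt are illustrative.

```matlab
X = (0:3).';                 % one period, as in the question
L = 3;                       % period length
N = 3;                       % number of periods

offsets = (0:N-1) * L;       % 0, L, 2L, ...

% bsxfun version from the answer above
Xall = reshape(bsxfun(@plus, X, offsets), [], 1);

% Alternative: N stacked copies of X plus each offset repeated numel(X) times
Xalt = repmat(X, N, 1) + kron(offsets.', ones(numel(X), 1));

disp(isequal(Xall, Xalt))
```

Changing N to 1000 requires no other modification to either version.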
