I am new to matlab. I want to do the following:
Generate an array of a thousand replications of a random draw among three alternatives A, B and C, where at every draw each alternative has the same probability of being picked.
So eventually I need something like P = [ A A B C B B B C A C A C C ... ] where each element in the array was chosen randomly among the three possible outcomes.
I came up with a solution which gives me exactly what I want, namely
% Generating a random pick among doors 1,2,3, where 1 stands for A, 2 for B,
% 3 for C.
I = rand(1);
if I < 1/3
PP = 1;
elseif 1/3 <= I & I < 2/3
PP = 2;
else
PP = 3;
end
% Generating a thousand random picks among doors A,B,C
I = rand(999);
for i=1:999
if I(i) < 1/3
P = 1;
elseif 1/3 <= I(i) & I(i) < 2/3
P = 2;
else
P = 3;
end
PP = [PP P]
end
As I said, it works, but when I run the procedure, it takes a while for what appears to me to be a simple task. At the same time, I don't know how long such a task is "supposed" to take in matlab. So I have three questions:
Is this really a slow procedure to generate the desired outcome?
If it is, why is this procedure particularly slow?
What would be a more effective way to produce the desired outcome?
This can be done much more easily with randi:
>> PP = randi(3,1,10)
PP =
2 1 3 3 2 2 2 3 2 1
If you actually want to choose between 3 alternatives, you can use the output of randi directly to index into another array:
>> options = [13,22,77]
options =
13 22 77
>> options(randi(3,1,10))
ans =
22 13 77 13 77 13 22 22 77 13
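Applied to the original question (a thousand draws among A, B and C), the same idea might look like the sketch below. The names nDraws, labels, idx and counts are illustrative and not from the question; the labels are assumed to be single characters so that a char array can hold them.

```matlab
nDraws = 1000;                                % number of replications asked for in the question
labels = 'ABC';                               % assumed labels: index 1 -> 'A', 2 -> 'B', 3 -> 'C'
idx    = randi(numel(labels), 1, nDraws);     % uniform random integers in 1..3
P      = labels(idx);                         % 1-by-1000 char array of 'A', 'B' and 'C'
counts = accumarray(idx(:), 1, [numel(labels) 1]).';  % how often each label was drawn
```

Each entry of counts should come out near 1000/3 if the draws are uniform, which makes for a quick sanity check.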
As to the reason why your solution is slow, you do something similar to this:
x = [];
for i=1:10
x = [x i^2]; %size of x grows on every iteration
end
This is not very good, since on every iteration, Matlab needs to allocate space for a larger vector x. In old versions of Matlab, this led to quadratic behavior (if you double the size of the problem, it takes 4 times longer). In newer versions, Matlab is smart enough to avoid this problem. It is however still considered good practice to preallocate space for your array if you know beforehand how big it will be:
x = zeros(1,10); % space for x is preallocated. can also use nan() or ones()
for i = 1:length(x)
x(i) = i^2;
end
But in many cases, it is even faster to use vectorized code that does not use any for-loops like so:
x = (1:10).^2;
All 3 solutions give the same result:
x = 1 4 9 16 25 36 49 64 81 100
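The same vectorization idea can be applied to the thresholds in the question. This is only a sketch: nDraws is an illustrative name. It also relies on the fact that rand with a single argument n returns an n-by-n matrix, so a row vector has to be requested explicitly with two arguments.

```matlab
nDraws = 1000;
I  = rand(1, nDraws);                 % 1-by-1000 row vector of uniform numbers in (0,1)
PP = 1 + (I >= 1/3) + (I >= 2/3);     % 1 if I < 1/3, 2 if 1/3 <= I < 2/3, otherwise 3
```

Each comparison yields a logical array, and adding the two to 1 maps every draw to 1, 2 or 3 without a loop or an if/elseif chain.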
cnt=10;
option={'a','b','c'}
x=option([randi(numel(option),cnt,1)])
I have a vector v of size (m,1) whose elements are integers picked from 1:n. I want to create a matrix M of size (m,n) whose elements M(i,j) are 1 if v(i) = j, and are 0 otherwise. I do not want to use loops, and would like to implement this as a simple vector-matrix manipulation only.
So I thought first, to create a matrix with repeated elements
M = v * ones(1,n) % this is a (m,n) matrix of repeated v
For example v=[1,1,3,2]'
m = 4 and n = 3
M =
1 1 1
1 1 1
3 3 3
2 2 2
then I need to create a comparison vector c of size (1,n)
c = 1:n
1 2 3
Then I need to perform a series of logical comparisons
M(1,:)==c % this results in [1,0,0]
.
M(4,:)==c % this results in [0,1,0]
However, I thought it should be possible to perform the last steps of going through each single row in compact matrix notation, but I'm stumped and not knowledgeable enough about indexing.
The end result should be
M =
1 0 0
1 0 0
0 0 1
0 1 0
A very simple call to bsxfun will do the trick:
>> n = 3;
>> v = [1,1,3,2].';
>> M = bsxfun(@eq, v, 1:n)
M =
1 0 0
1 0 0
0 0 1
0 1 0
How the code works is actually quite simple. bsxfun stands for Binary Singleton EXpansion function. You provide two arrays / matrices of any size, as long as they are broadcastable: every dimension must either match or have size 1 in one of the inputs, so that both can be expanded to the same size. In this case, v is your vector of interest and is the first parameter - note that it's transposed into a column. The second parameter is a row vector from 1 up to n. The column vector v gets replicated across as many columns as there are values in 1:n, and the row vector 1:n gets replicated for as many rows as there are in v. We then apply the eq / equals operator between these two expanded arrays. The expanded second array in effect has all 1s in the first column, all 2s in the second column, and so on up to n. By doing an eq between these two matrices, you are in effect determining which values in v are equal to the respective column index.
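To make the expansion concrete, here is a small sketch that builds the two replicated matrices explicitly with repmat and compares the result with the bsxfun call. The names V, J, M_explicit and M_bsxfun are illustrative only.

```matlab
n = 3;
v = [1,1,3,2].';                      % m-by-1 column vector from the question
m = numel(v);
V = repmat(v, 1, n);                  % v copied across n columns (m-by-n)
J = repmat(1:n, m, 1);                % 1:n copied down m rows    (m-by-n)
M_explicit = (V == J);                % element-wise comparison of the two copies
M_bsxfun   = bsxfun(@eq, v, 1:n);     % same comparison without building the copies
same = isequal(M_explicit, M_bsxfun); % true
```

bsxfun gives the same answer without storing V and J. From R2016b onwards MATLAB also expands singleton dimensions implicitly, so v == 1:n should work there too.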
Here is a detailed time test and breakdown of each function. I placed each implementation into a separate function and I also let n=max(v) so that Luis's first code will work. I used timeit to time each function:
function timing_binary
n = 10000;
v = randi(1000,n,1);
m = numel(v);
function luis_func()
M1 = full(sparse(1:m,v,1));
end
function luis_func2()
%m = numel(v);
%n = 3; %// or compute n automatically as n = max(v);
M2 = zeros(m, n);
M2((1:m).' + (v-1)*m) = 1;
end
function ray_func()
M3 = bsxfun(@eq, v, 1:n);
end
function op_func()
M4= ones(1,m)'*[1:n] == v * ones(1,n);
end
t1 = timeit(@luis_func);
t2 = timeit(@luis_func2);
t3 = timeit(@ray_func);
t4 = timeit(@op_func);
fprintf('Luis Mendo - Sparse: %f\n', t1);
fprintf('Luis Mendo - Indexing: %f\n', t2);
fprintf('rayryeng - bsxfun: %f\n', t3);
fprintf('OP: %f\n', t4);
end
This test assumes n = 10000 and the vector v is a 10000 x 1 vector of randomly distributed integers from 1 up to 1000. BTW, I had to modify Luis's second function so that the indexing will work as the addition requires vectors of compatible dimensions.
Running this code, we get:
>> timing_binary
Luis Mendo - Sparse: 0.015086
Luis Mendo - Indexing: 0.327993
rayryeng - bsxfun: 0.040672
OP: 0.841827
Luis Mendo's sparse code wins (as I expected), followed by bsxfun, followed by indexing and followed by your proposed approach using matrix operations. The timings are in seconds.
Assuming n equals max(v), you can use sparse:
v = [1,1,3,2];
M = full(sparse(1:numel(v),v,1));
What sparse does is build a sparse matrix using the first argument as row indices, the second as column indices, and the third as matrix values. This is then converted into a full matrix with full.
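If max(v) can be smaller than n, one option is to pass the output size to sparse explicitly as its fourth and fifth arguments. A minimal sketch, with a made-up v in which the value 3 never occurs:

```matlab
v = [1,1,2,2];                        % hypothetical input: the value 3 never occurs
n = 3;                                % desired number of columns
m = numel(v);
M = full(sparse(1:m, v, 1, m, n));    % size m-by-n is forced, so column 3 is all zeros
```

Without the explicit size, the result here would have only two columns.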
Another approach is to define the matrix containing initially zeros and then use linear indexing to fill in the ones:
v = [1,1,3,2];
m = numel(v);
n = 3; %// or compute n automatically as n = max(v);
M = zeros(m, n);
M((1:m) + (v-1)*m) = 1;
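The linear index above is what sub2ind computes for row subscripts 1:m and column subscripts v, so an equivalent and perhaps more readable sketch is:

```matlab
v = [1,1,3,2];
m = numel(v);
n = 3;
M = zeros(m, n);
idx = sub2ind([m n], 1:m, v);         % linear index of entry (i, v(i)) for every row i
M(idx) = 1;
```

Here idx equals (1:m) + (v-1)*m, because MATLAB stores matrices column by column.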
I think I've also found a way to do it, and it would be nice if somebody could tell me which of the methods shown is faster for very large vectors and matrices. The additional method I thought of is the following
M= ones(1,m)'*[1:n] == v * ones(1,n)
I have a matlab problem to solve. I have two vectors that limit my space, x_low and x_high. The matrix pos needs to have values within this space, and each column of the matrix has different bounds given by the two vectors. My problem is that randi gives values between two integers, but I need to change the bounds for each column. Is there another way to use randi, or a different matlab function, to do this?
I know there are better ways to code this, but I'm just starting to use matlab and this is the way I know how to do it. Any help is welcome.
x_low = [Io_low, Iirr_low, Rp_low, Rs_low, n_low]; % vector of constant values
x_high = [Io_high, Iirr_high, Rp_high, Rs_high, n_high]; % vector of constant values
pos = rand(particles, var);
var = length(x_high);
for i = 1: particles % rows
for k = 1: var %columns
if pos(i, k) < x_low(k) || pos(i, k) > x_high(k) % if the position is out of bounds
pos(i, k) = randi(x_low(k), x_high(k), 1); % fill it with a particle within the boundaries
end
end
end
If I understand correctly, you need to generate a matrix with integer values such that each column has different lower and upper limits; and those lower and upper limits are inclusive.
This can be done very simply with
rand (to generate random numbers between 0 and 1 ),
bsxfun (to take care of the lower and upper limits on a column basis), and
floor (so that the results are integer values).
Let the input data be defined as
x_low = [1 6 11]; %// lower limits
x_high = [3 10 100]; %// upper limits
n_rows = 7; %// number of rows
Then:
r = rand(n_rows, numel(x_low)); %// random numbers between 0 and 1
r = floor(bsxfun(@times, r, x_high-x_low+1)); %// adjust span and round down to integers
r = bsxfun(@plus, r, x_low); %// adjust lower limit
gives something like
r =
2 7 83
3 6 93
2 6 22
3 10 85
3 7 96
1 10 90
2 8 57
If you need to fill in values only at specific entries of matrix pos, you can use something like
ind = bsxfun(@lt, pos, x_low) | bsxfun(@gt, pos, x_high); %// index of values to replace
pos(ind) = r(ind);
This is a little overkill, because the whole matrix r is generated only to use some of its entries. To generate only the needed values, the best way is probably to use loops.
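A loop over the columns that draws only the values it needs could look like the following sketch. The matrix pos here is made up for illustration, and bad is an illustrative name. The bounds are assumed to be integers, since randi([lo hi], rows, cols) only produces integers.

```matlab
x_low  = [1 6 11];                    % lower limits, one per column
x_high = [3 10 100];                  % upper limits, one per column
pos = [2 20 50; 0 7 500; 3 6 11];     % hypothetical matrix with some out-of-range entries
for k = 1:numel(x_low)
    bad = pos(:,k) < x_low(k) | pos(:,k) > x_high(k);   % rows to redraw in column k
    if any(bad)
        pos(bad,k) = randi([x_low(k) x_high(k)], nnz(bad), 1);  % redraw only those rows
    end
end
```

Entries that were already within bounds are left untouched, and only one loop over the columns is needed.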
You can use cellfun for this. Something like:
x_low = [Io_low, Iirr_low, Rp_low, Rs_low, n_low];
x_high = [Io_high, Iirr_high, Rp_high, Rs_high, n_high];
pos = cell2mat(cellfun(@randi, mat2cell([x_low' x_high'], ones(numel(x_low),1), 2), repmat({[particles 1]}, [numel(x_low) 1]), 'UniformOutput', false)'); % one [low high] pair per column of pos
I am using MATLAB to write a code that multiplies polynomials. Most parts of my code work however there is one part where I have two row vectors a and b. I want to remove repeated elements of a and then add the corresponding elements of b. This is what I have written
c=length(a);
d=length(b);
remove=[];
for i=1:c
for j=i+1:c
if (a(i)==a(j))
remove=[remove,i];
b(j)=b(i)+b(j);
end
end
end
a(remove)=[];
b(remove)=[];
The problem with this is if there is an element in a that appears more than twice, it doesn't work properly.
For example if a=[5,6,8,9,6,7,9,10,8,9,11,12] and b=[1,7,1,-1,3,21,3,-3,-4,-28,-4,4]
then once this code is run a becomes [5,6,7,10,8,9,11,12] which is correct but b becomes [1,10,21,-3,-3,-27,-4,4] which is correct except the -27 should be a -26.
I know why this happens because the 9 in a(1,4) gets compared with the 9 in a(1,7) so b(1,7) becomes b(1,7)+b(1,4) and then a(1,4) gets compared with the 9 in a(1,10). and then later the a(1,7) compares with a(1,10) and so the new b(1,7) adds to the b(1,10) however the b(1,4) adds to the b(1,10) too. I somehow need to stop this once one repeated element has been found because here b(1,4) has been added twice when it should only be added once.
I am not supposed to use any built in functions, is there a way of resolving this easily?
I would prefer using built in functions, but assuming you have to stick to your own approach, you can try this:
a=[5,6,8,9,6,7,9,10,8,9,11,12];
b=[1,7,1,-1,3,21,3,-3,-4,-28,-4,4];
n = numel(a);
remove = zeros(1,n);
temp = a;
for ii = 1:n
for jj = ii+1:n
if temp(ii) == temp(jj)
temp(ii) = NaN;
remove(ii) = ii;
b(jj) = b(jj) + b(ii);
end
end
end
a(remove(remove>0)) = []
b(remove(remove>0)) = []
a =
5 6 7 10 8 9 11 12
b =
1 10 21 -3 -3 -26 -4 4
It's not much different from your approach, except for changing the iith value of a if it is found later. To avoid overwriting the values in a with NaN, I'm using a temporary variable for this.
Also, as you can see, I'm avoiding remove = [remove i], because this will create a growing vector, which is very slow.
It can be solved in a vectorized manner with the following indexing nightmare (perhaps someone will come up with a simpler approach):
a = [5,6,8,9,6,7,9,10,8,9,11,12];
b = [1,7,1,-1,3,21,3,-3,-4,-28,-4,4];
[sa, ind1] = sort(a);
[~, ii, jj] = unique(sa);
[ind2, ind3] = sort(ind1(ii));
a = a(ind2);
b = accumarray(jj(:),b(ind1)).';
b = b(ind3);
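If built-in functions are allowed and the order of the result is not important, a shorter sketch is possible with the 'stable' option of unique, assuming a release that supports it. Note that this keeps each value of a at its first occurrence rather than its last, so the order differs from the output shown earlier. The names a_u and b_u are illustrative.

```matlab
a = [5,6,8,9,6,7,9,10,8,9,11,12];
b = [1,7,1,-1,3,21,3,-3,-4,-28,-4,4];
[a_u, ~, jj] = unique(a, 'stable');   % distinct values of a in order of first appearance
b_u = accumarray(jj(:), b(:)).';      % sum the entries of b that share the same value of a
% a_u is [5 6 8 9 7 10 11 12] and b_u is [1 10 -3 -26 21 -3 -4 4]
```

The pairs (a_u, b_u) contain the same information as before, for example the value 9 still ends up with -26.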
Anyway, to multiply polynomials you could use conv:
>> p1 = [1 3 0 2]; %// x^3 + 3x^2 + 2
>> p2 = [2 -1 4]; %// 2x^2 - x + 4
>> conv(p1,p2)
ans =
2 5 1 16 -2 8 %// 2x^5 + 5x^4 + x^3 + 16x^2 - 2x + 8
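Since the question rules out built-in functions, here is a sketch of the same product computed with two plain loops. It uses the usual coefficient-vector convention (highest power first); the name c is illustrative.

```matlab
p1 = [1 3 0 2];                           % x^3 + 3x^2 + 2
p2 = [2 -1 4];                            % 2x^2 - x + 4
c  = zeros(1, numel(p1) + numel(p2) - 1); % the product has degree 3 + 2 = 5, so 6 coefficients
for i = 1:numel(p1)
    for j = 1:numel(p2)
        % the product of term i of p1 and term j of p2 lands at position i+j-1
        c(i+j-1) = c(i+j-1) + p1(i)*p2(j);
    end
end
% c is [2 5 1 16 -2 8], the same as conv(p1,p2)
```

Because terms with equal powers accumulate in the same slot of c, no separate step for merging repeated exponents is needed.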
I am trying to compute with the equation (q + N - 1)! / (q! (N - 1)!)
and I would like to store each value into a row vector. Here is my attempt:
multiA = [1];
multiB = [];
NA = 6;
NB = 4;
q = [0,1,2,3,4,5,6];
for i=2:7
multiA = [multiA(i-1), (factorial(q(i) + NA - 1))/(factorial(q(i))*factorial(NA-1))];
%multiA = [multiA, multiA(i)];
end
multiA
But this does not work. I get the error message
Attempted to access multiA(3); index out of bounds because numel(multiA)=2.
multiA = [multiA(i-1), (factorial(q(i) + NA - 1))/(factorial(q(i))*factorial(NA-1))];
Is my code even remotely close to what I want to achieve? What can I do to fix it?
You don't need any loop, just use the vector directly.
NA = 6;
q = [0,1,2,3,4,5,6];
multiA = factorial(q + NA - 1)./(factorial(q).*factorial(NA-1))
gives
multiA =
1 6 21 56 126 252 462
For multiple N a loop isn't necessary either:
N = [6,8,10];
q = [0,1,2,3,4,5,6];
[N,q] = meshgrid(N,q)
multiA = factorial(q + N - 1)./(factorial(q).*factorial(N-1))
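To see how the result is laid out, the sketch below repeats the computation with separate names for the grids (Ngrid, qgrid and multi are illustrative), so that the original N and q are not overwritten.

```matlab
N = [6,8,10];
q = 0:6;
[Ngrid, qgrid] = meshgrid(N, q);      % both 7-by-3: rows follow q, columns follow N
multi = factorial(qgrid + Ngrid - 1) ./ (factorial(qgrid) .* factorial(Ngrid - 1));
% multi(:,1) holds the values for N = 6: 1 6 21 56 126 252 462
% multi(3,3) is the value for q = 2, N = 10, namely 55
```

Each column of multi corresponds to one value of N and each row to one value of q.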
Also consider the following remarks from the documentation regarding the loss of accuracy for n > 21 in:
f = factorial(n)
Limitations
The result is only accurate for double-precision values of n that are less than or equal to 21. A larger value of n produces a result that has the correct order of magnitude and is accurate for the first 15 digits. This is because double-precision numbers are only accurate up to 15 digits.
For single-precision input, the result is only accurate for values of n that are less than or equal to 13. A larger value of n produces a result that has the correct order of magnitude and is accurate for the first 8 digits. This is because single-precision numbers are only accurate up to 8 digits.
Factorials of moderately large numbers can cause overflow. Two possible approaches to prevent that:
Avoid computing terms that will cancel. This approach is especially suited to the case when q is of the form 1,2,... as in your example. It also has the advantage that, for each value of q, the result for the previous value is reused, thus minimizing the number of operations:
>> q = 1:6;
>> multiA = cumprod((q+NA-1)./q)
multiA =
6 21 56 126 252 462
Note that 0 is not allowed in q. But the result for 0 is just 1, so the final result would be just [1 multiA].
For q arbitrary (not necessarily of the form 1,2,...), you can use the gammaln function, which gives the logarithms of the factorials:
>> q = [0 1 2 6 3];
>> multiA = exp(gammaln(q+NA)-gammaln(q+1)-gammaln(NA))
multiA =
1.0000 6.0000 21.0000 462.0000 56.0000
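Because exp and gammaln work in floating point, the results may differ from exact integers by a tiny amount. A quick sketch comparing against nchoosek and rounding, for the same small example (viaGammaln, viaNchoosek and maxDiff are illustrative names):

```matlab
NA = 6;
q  = [0 1 2 6 3];
viaGammaln  = exp(gammaln(q+NA) - gammaln(q+1) - gammaln(NA));
viaNchoosek = arrayfun(@(k) nchoosek(k+NA-1, k), q);   % exact for these small inputs
maxDiff = max(abs(viaGammaln - viaNchoosek));          % expected to be at rounding-error level
multiA  = round(viaGammaln);                           % [1 6 21 462 56]
```

Rounding is only safe while the true values are small enough to be represented exactly in double precision.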
You want to append a new element to the end of 'multiA':
for i=2:7
multiA = [multiA, (factorial(q(i) + NA - 1))/(factorial(q(i))*factorial(NA-1))];
end
A function handle makes it much simpler:
%define:
omega=@(q,N)(factorial(q + N - 1))./(factorial(q).*factorial(N-1))
%use:
omega(0:6,4) %q=0..6, N=4
It might be better to use nchoosek as opposed to factorial. The latter can overflow quite easily, I'd imagine.
multiA=nan(1,7);
for i=1:7
multiA(i)=nchoosek(q(i)+NA-1, q(i)); % NA and q as defined in the question
end
I have two matrices, A and B. (B is a run of consecutive integers, like 1:n.)
I need to find all the occurrences of each individual row of B in A, and store those row indices accordingly in cell array C. See below for an example.
A = [3,4,5;1,3,5;1,4,3;4,2,1]
B = [1;2;3;4;5]
Thus,
C = {[2,3,4];[4];[1,2,3];[1,3,4];[1,2]}
Note C does not need to be in a cell array for my application. I only suggest it because the row vectors of C are of unequal length. If you can suggest a work-around, this is fine too.
I've tried using a loop running ismember for each row of B, but this is too slow when the matrices A and B are huge, with around a million entries. Vectorized code is appreciated.
(To give you context, the purpose of this is to identify, in a mesh, those faces that are attached to a single vertex. Note I cannot use the function edgeattachments because my data are not of the form "TR" in triangulation representation. All I have is a list of faces and list of vertices.)
Well, the best answer for this would require knowledge of how A is filled. If A is sparse, that is, if it has few columns and B is quite large, then I think the best way to save memory may be to use a sparse matrix instead of a cell.
% No fancy stuff, just fast and furious
bMax = numel(B);
nRows = size(A,1);
cLogical = sparse(nRows,bMax);
for curRow = 1:nRows
curIdx = A(curRow,:);
cLogical(curRow,curIdx) = 1;
end
Answer:
cLogical =
(2,1) 1
(3,1) 1
(4,1) 1
(4,2) 1
(1,3) 1
(2,3) 1
(3,3) 1
(1,4) 1
(3,4) 1
(4,4) 1
(1,5) 1
(2,5) 1
How to read the answer: for each column, the listed rows are the rows of A in which that column index appears. That is, 1 appears in rows [2 3 4], 2 appears in row [4], 3 in rows [1 2 3], 4 in rows [1 3 4], and 5 in rows [1 2].
Then you can use cLogical instead of a cell as an indexing matrix in the future for your needs.
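The same sparse matrix can probably be built without the loop by handing all row and column subscripts to sparse at once. This sketch assumes that the entries of A are positive integers no larger than numel(B) and that no vertex is repeated within a row (sparse adds up duplicate subscripts). The names rowIdx and rowsOfVertex3 are illustrative.

```matlab
A = [3,4,5;1,3,5;1,4,3;4,2,1];
B = [1;2;3;4;5];
[nRows, nCols] = size(A);
rowIdx   = repmat((1:nRows).', 1, nCols);                 % row number of every entry of A
cLogical = sparse(rowIdx(:), A(:), 1, nRows, numel(B));   % entry (r,k) is 1 if row r contains vertex k
rowsOfVertex3 = find(cLogical(:,3)).';                    % rows of A containing vertex 3: [1 2 3]
```

Looking up the faces attached to vertex k is then a matter of calling find on column k.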
Another way would be to allocate C with the expected value for how many times an index should appear in C.
% Fancier solution using some assumed knowledge of A
bMax = numel(B);
nRows = size(A,1);
nColumns = size(A,2);
% Pre-allocating with the expected value, an attempt to reduce re-allocations.
% tic; for rep=1:10000; C = mat2cell(zeros(bMax,nColumns),ones(1,bMax),nColumns); end; toc
% Elapsed time is 1.364558 seconds.
% tic; for rep=1:10000; C = repmat({zeros(1,nColumns)},bMax,1); end; toc
% Elapsed time is 0.606266 seconds.
% So we keep the not fancy repmat solution
C = repmat({zeros(1,nColumns)},bMax,1);
for curRow = 1:nRows
curIdxMsk = A(curRow,:);
for curCol = 1:nColumns
curIdx = curIdxMsk(curCol);
fillIdx = ~C{curIdx};
if any(fillIdx)
fillIdx = find(fillIdx,1);
else
fillIdx = numel(fillIdx)+1;
end
C{curIdx}(fillIdx) = curRow;
end
end
% Squeeze empty indexes:
for curRow = 1:bMax
C{curRow}(~C{curRow}) = [];
end
Answer:
>> C{:}
ans =
2 3 4
ans =
4
ans =
1 2 3
ans =
1 3 4
ans =
1 2
Which solution will perform best? You should run a performance test in your own code, because it depends on how big A and bMax are, the memory size of your computer and so on. Still, I'm curious about the solutions other people can come up with for this x). I liked chappjc's solution, although it has the cons that he has pointed out.
For the given example (10k times):
Solution 1: Elapsed time is 0.516647 seconds.
Solution 2: Elapsed time is 4.201409 seconds (seems that solution 2 is a bad idea hahaha, but since it was created to the specific issue of A having many rows it has to be tested in those conditions).
chappjc's solution: Elapsed time is 2.405341 seconds.
We can do it without making any assumptions about B. Try this use of bsxfun and mat2cell:
M = squeeze(any(bsxfun(@eq,A,permute(B,[3 2 1])),2)); % 4x3x1 @eq 1x1x5 => 4x3x5
R = sum(M); % 4x5 -> 1x5
[ii,jj] = find(M);
C = mat2cell(ii,R)
The cells in C above will be column vectors rather than rows as in your example. To make the cells contain row vectors, use C = mat2cell(ii',1,R)' instead.
My only concern is that mat2cell could be slow for millions of values of R, but if you want your output in a cell, I'm not sure how much better you can do. EDIT: If you can deal with a sparse matrix like in Werner's first solution with the loop, replace the last line of the above with the following:
>> Cs = sparse(ii,jj,1)
Cs =
(2,1) 1
(3,1) 1
(4,1) 1
(4,2) 1
(1,3) 1
(2,3) 1
(3,3) 1
(1,4) 1
(3,4) 1
(4,4) 1
(1,5) 1
(2,5) 1
Unfortunately, bsxfun will probably run out of memory if both size(A,1) and numel(B) are large! You may have to loop over the elements of A or B if memory becomes an issue. Here's one way to do it by looping over your vertices in B:
for i=1:numel(B), C{i} = find(any(A==B(i),2)); end
Yup, that easy. Cell array growing is extremely fast in MATLAB, as it is similar to a sequence container that stores contiguous references to the data, rather than keeping the data itself contiguous. Perhaps ismember was the bottleneck in your test.
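If a cell array is wanted without an explicit loop, accumarray with a cell-returning function is another option worth timing. This is a sketch only, assuming (as in the question) that B is 1:n and that the entries of A are positive integers in that range. rowIdx is an illustrative name, and the sort is there because accumarray does not promise any order within a group.

```matlab
A = [3,4,5;1,3,5;1,4,3;4,2,1];
B = [1;2;3;4;5];
[nRows, nCols] = size(A);
rowIdx = repmat((1:nRows).', 1, nCols);          % row number of every entry of A
% group the row numbers by the vertex stored at the same position in A
C = accumarray(A(:), rowIdx(:), [numel(B) 1], @(r) {sort(r).'});
% C{1} is [2 3 4], C{2} is 4, C{3} is [1 2 3], C{4} is [1 3 4], C{5} is [1 2]
```

Whether this beats the simple loop for millions of entries would need to be measured on the real data.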