Matlab Convert Vector to Binary Matrix [duplicate] - arrays

I have a vector v of size (m,1) whose elements are integers picked from 1:n. I want to create a matrix M of size (m,n) whose elements M(i,j) are 1 if v(i) = j, and are 0 otherwise. I do not want to use loops, and would like to implement this as a simple vector-matrix manipulation only.
So I thought first, to create a matrix with repeated elements
M = v * ones(1,n) % this is a (m,n) matrix of repeated v
For example v=[1,1,3,2]'
m = 4 and n = 3
M =
1 1 1
1 1 1
3 3 3
2 2 2
then I need to create a comparison vector c of size (1,n)
c = 1:n
1 2 3
Then I need to perform a series of logical comparisons
M(1,:)==c % this results in [1,0,0]
.
M(4,:)==c % this results in [0,1,0]
However, I thought it should be possible to perform the last steps of going through each single row in compact matrix notation, but I'm stumped and not knowledgeable enough about indexing.
The end result should be
M =
1 0 0
1 0 0
0 0 1
0 1 0

A very simple call to bsxfun will do the trick:
>> n = 3;
>> v = [1,1,3,2].';
>> M = bsxfun(@eq, v, 1:n)
M =
1 0 0
1 0 0
0 0 1
0 1 0
How the code works is actually quite simple. bsxfun stands for Binary Singleton EXpansion function. You provide two arrays of any size, as long as they are broadcastable: every dimension in which their sizes differ must be a singleton (size 1) in one of them, so that both can be expanded to a common size. In this case, v is your vector of interest and is the first parameter; note that it's transposed so that it is a column vector. The second parameter is the row vector from 1 up to n. The column vector v gets replicated across n columns, and the row vector 1:n gets replicated down as many rows as there are in v. We then apply the eq / equals operator element-wise between these two expanded arrays. The expanded second array in effect has all 1s in the first column, all 2s in the second column, and so on up to n. By doing an eq between these two matrices, you are in effect determining which values in v are equal to the respective column index.
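To make the expansion concrete, here is a small sketch that builds the two expanded matrices explicitly with repmat and compares them, which should give the same result as the bsxfun call. The names Vexp and Cexp are introduced only for illustration. On R2016b and later, implicit expansion should also let you write v == (1:n) directly.

```matlab
n = 3;
v = [1; 1; 3; 2];
m = numel(v);

Vexp = repmat(v, 1, n);      % m-by-n: v copied across n columns
Cexp = repmat(1:n, m, 1);    % m-by-n: 1:n copied down m rows

M_explicit = (Vexp == Cexp);         % element-wise comparison
M_bsxfun   = bsxfun(@eq, v, 1:n);    % same comparison without the copies
```

The bsxfun version avoids materialising Vexp and Cexp, which matters once m and n get large.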
Here is a detailed time test and breakdown of each function. I placed each implementation into a separate function and I also let n=max(v) so that Luis's first code will work. I used timeit to time each function:
function timing_binary
    n = 10000;
    v = randi(1000,n,1);
    m = numel(v);
    function luis_func()
        M1 = full(sparse(1:m,v,1));
    end
    function luis_func2()
        %m = numel(v);
        %n = 3; %// or compute n automatically as n = max(v);
        M2 = zeros(m, n);
        M2((1:m).' + (v-1)*m) = 1;
    end
    function ray_func()
        M3 = bsxfun(@eq, v, 1:n);
    end
    function op_func()
        M4= ones(1,m)'*[1:n] == v * ones(1,n);
    end
    t1 = timeit(@luis_func);
    t2 = timeit(@luis_func2);
    t3 = timeit(@ray_func);
    t4 = timeit(@op_func);
    fprintf('Luis Mendo - Sparse: %f\n', t1);
    fprintf('Luis Mendo - Indexing: %f\n', t2);
    fprintf('rayryeng - bsxfun: %f\n', t3);
    fprintf('OP: %f\n', t4);
end
This test assumes n = 10000 and the vector v is a 10000 x 1 vector of randomly distributed integers from 1 up to 1000. BTW, I had to modify Luis's second function so that the indexing will work as the addition requires vectors of compatible dimensions.
Running this code, we get:
>> timing_binary
Luis Mendo - Sparse: 0.015086
Luis Mendo - Indexing: 0.327993
rayryeng - bsxfun: 0.040672
OP: 0.841827
Luis Mendo's sparse code wins (as I expected), followed by bsxfun, followed by indexing and followed by your proposed approach using matrix operations. The timings are in seconds.

Assuming n equals max(v), you can use sparse:
v = [1,1,3,2];
M = full(sparse(1:numel(v),v,1));
What sparse does is build a sparse matrix using the first argument as row indices, the second as column indices, and the third as matrix values. This is then converted into a full matrix with full.
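If n is larger than max(v), a minimal sketch is to pass the matrix size to sparse explicitly through its fourth and fifth arguments. The value n = 5 below is an assumption chosen only to show the extra all-zero columns.

```matlab
v = [1,1,3,2];
m = numel(v);
n = 5;                               % assumed: more columns than max(v)
M = full(sparse(1:m, v, 1, m, n));   % m-by-n; columns 4 and 5 stay zero
```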
Another approach is to define the matrix containing initially zeros and then use linear indexing to fill in the ones:
v = [1,1,3,2];
m = numel(v);
n = 3; %// or compute n automatically as n = max(v);
M = zeros(m, n);
M((1:m) + (v-1)*m) = 1;
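The linear index used above is just row + (column-1)*m. As a sketch of the same idea, sub2ind can compute it from the (row, column) pairs, which some readers may find easier to follow; idx is a name introduced here for illustration.

```matlab
v = [1,1,3,2];
m = numel(v);
n = 3;
M = zeros(m, n);
idx = sub2ind([m n], 1:m, v);   % same values as (1:m) + (v-1)*m
M(idx) = 1;
```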

I think I've also found a way to do it, and it would be nice if somebody could tell me which of the methods shown is faster for very large vectors and matrices. The additional method I thought of is the following
M= ones(1,m)'*[1:n] == v * ones(1,n)

Related

MATLAB Vectorised Pairwise Distance

I'm struggling to vectorise a function which performs a somewhat pairwise difference between two matrices x (2xN) and v (2xM), for some arbitrary N, M. I have this working when N = 1, but I would like to vectorise the function so that it applies to inputs with arbitrary N.
Indeed, what I want this function to do is for each column of x find the normed difference between x(:,column) (a 2x1) and v (a 2xM).
A similar post is this, although I haven't been able to generalise it.
Current implementation
function mat = vecDiff(x,v)
diffVec = bsxfun(@minus, x, v);
mat = diffVec ./ vecnorm(diffVec);
Example
x =
1
1
v =
1 3 5
2 4 6
----
vecDiff(x,v) =
0 -0.5547 -0.6247
-1.0000 -0.8321 -0.7809
Your approach can be adapted as follows to suit your needs:
Permute the dimensions of either x or v so that its number of columns becomes the third dimension. I'm choosing v in the code below.
This lets you exploit implicit expansion (or equivalently bsxfun) to compute a 2×N×M array of differences, where N and M are the numbers of columns of x and v, as in the question.
Compute the vector-wise (2-)norm along the first dimension and use implicit expansion again to normalize this array:
x = [1 4 2 -1; 1 5 3 -2];
v = [1 3 5; 2 4 6];
diffVec = x - permute(v, [1 3 2]);
diffVec = diffVec./vecnorm(diffVec, 2, 1);
You may need to apply permute differently if you want the dimensions of the output in another order.
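As a quick sanity check (a sketch, assuming implicit expansion and vecnorm are available in your release), the slice of the result belonging to the first column of x should reproduce the N = 1 example from the question. The variable first is introduced only for this check.

```matlab
x = [1 4 2 -1; 1 5 3 -2];
v = [1 3 5; 2 4 6];

diffVec = x - permute(v, [1 3 2]);            % 2 x 4 x 3 array of differences
diffVec = diffVec ./ vecnorm(diffVec, 2, 1);  % normalise along dimension 1

% Differences between x(:,1) and every column of v, as a 2 x 3 matrix
first = squeeze(diffVec(:, 1, :));
```

Here first should be approximately [0 -0.5547 -0.6247; -1.0000 -0.8321 -0.7809], matching vecDiff(x,v) in the question.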
Suppose your two input matrices are A (a 2 x N matrix) and B (a 2 x M matrix), where each column represents a different observation (note that this is not the traditional way to represent data).
Note that the output will be of the size N x M x 2.
N = size(A, 2); M = size(B, 2);
out = zeros(N, M, 2);
We can find the distance between them using the builtin function pdist2.
dists = pdist2(A.', B.'); % transposed because pdist2 expects one observation per row
To get the individual x and y distances, the easiest way I can think of is using repmat:
xdists = repmat(A(1,:).', 1, M) - repmat(B(1,:), N, 1);
ydists = repmat(A(2,:).', 1, M) - repmat(B(2,:), N, 1);
And we can then normalise this by the distances found earlier:
out(:,:,1) = xdists./dists;
out(:,:,2) = ydists./dists;
This returns a matrix out where the elements at position (i, j, :) are the components of the normed distance between A(:,i) and B(:,j).

Matlab - Sort into deciles each column

Suppose I have a matrix A [m x 1], where m is not necessarily even. I want to create a matrix B, also [m x 1], which tells me the decile of the elements in A (i.e. matrix B has numbers from 1 to 10).
I know I can use the function sort(A) to get the position of the elements in A and from there I can manually get deciles. Is there another way of doing it?
I think one possibility would be B = ceil(10 * tiedrank(A) / length(A)). What do you think? Are there any issues with this?
Also, more generally, if I have a matrix A [m x n] and I want to create a matrix B also [m x n], in which each column of B should have the decile of the corresponding column in A , is there a way of doing it without a for loop through the columns?
Hope the problem at hand is clear. So far I have been doing it using the sort function and then manually assigning the deciles, but it is very inefficient.
This is how I would do it:
N = 10;
B = ceil(sum(bsxfun(@le, A(:), A(:).'))*N/numel(A));
This counts, for each element, how many elements are less than or equal to it, and then scales and rounds the counts so that they fall into N groups (N = 10 for deciles).
Depending on how you define deciles, you may want to change @le to @lt, or ceil to floor. When numel(A) is a multiple of N (and there are no ties), the above definition gives exactly numel(A)/N values in each of the N quantiles. For example,
>> A = rand(1,8)
A =
0.4387 0.3816 0.7655 0.7952 0.1869 0.4898 0.4456 0.6463
>> N = 4;
>> B = ceil(sum(bsxfun(@le, A(:), A(:).'))*N/numel(A))
B =
2 1 4 4 1 3 2 3
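For the [m x n] case in the question, one possible extension (a sketch, not something the answer above states) is to move the rows of A into a third dimension so that the same "count the elements less than or equal" idea runs on every column at once. The small matrix and N = 2 are made up for illustration. Note that the intermediate array is m-by-n-by-m, so this only suits moderate m.

```matlab
A = [0.4 0.9; 0.1 0.3; 0.7 0.5; 0.2 0.8];   % m-by-n, each column handled separately
N = 2;                                      % number of quantile groups (10 for deciles)
m = size(A, 1);

% cmp(i,j,k) is true when A(k,j) <= A(i,j)
cmp = bsxfun(@le, permute(A, [3 2 1]), A);  % m-by-n-by-m logical array
counts = sum(cmp, 3);                       % per-column count of elements <= A(i,j)
B = ceil(counts * N / m);                   % group number of each element within its column
```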

MATLAB function to replace randi to generate a matrix

I have a MATLAB problem to solve. I have two vectors that limit my space, x_low and x_high. The matrix pos needs to have values within this space, and each column of the matrix has different bounds given by the two vectors. My problem is that randi gives values between two integers, but I need to change the bounds for each column. Is there another way to use randi, or a different MATLAB function, to do this?
I know there is better code for this, but I'm just starting to use MATLAB and this is the way I know how to do it. Any help is welcome.
x_low = [Io_low, Iirr_low, Rp_low, Rs_low, n_low]; % vector of constant values
x_high = [Io_high, Iirr_high, Rp_high, Rs_high, n_high]; % vector of constant values
pos = rand(particles, var);
var = length(x_high);
for i = 1: particles % rows
    for k = 1: var % columns
        if pos(i, k) < x_low(k) || pos(i, k) > x_high(k) % if the position is out of bounds
            pos(i, k) = randi(x_low(k), x_high(k), 1); % fill it with a particle within the boundaries
        end
    end
end
If I understand correctly, you need to generate a matrix with integer values such that each column has different lower and upper limits; and those lower and upper limits are inclusive.
This can be done very simply with
rand (to generate random numbers between 0 and 1 ),
bsxfun (to take care of the lower and upper limits on a column basis), and
floor (so that the results are integer values).
Let the input data be defined as
x_low = [1 6 11]; %// lower limits
x_high = [3 10 100]; %// upper limits
n_rows = 7; %// number of rows
Then:
r = rand(n_rows, numel(x_low)); %// random numbers between 0 and 1
r = floor(bsxfun(@times, r, x_high-x_low+1)); %// adjust span and round down to integers
r = bsxfun(@plus, r, x_low); %// adjust lower limit
gives something like
r =
2 7 83
3 6 93
2 6 22
3 10 85
3 7 96
1 10 90
2 8 57
If you need to fill in values only at specific entries of matrix pos, you can use something like
ind = bsxfun(@lt, pos, x_low) | bsxfun(@gt, pos, x_high); %// index of values to replace
pos(ind) = r(ind);
This is a little overkill, because the whole matrix r is generated only to use some of its entries. To generate only the needed values, the best way is probably to use loops.
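If you would rather avoid generating the whole matrix r, one loop-free possibility (a sketch under the same inclusive-integer-bounds assumption, with a made-up pos matrix) is to look up the column of each out-of-range entry and draw exactly that many random values. The names col, lo and hi are introduced here for illustration.

```matlab
x_low  = [1 6 11];                     % lower limits per column
x_high = [3 10 100];                   % upper limits per column
pos = [2 0 50; 7 8 200; 1 10 11];      % example data with three out-of-range entries

ind = bsxfun(@lt, pos, x_low) | bsxfun(@gt, pos, x_high);  % entries to replace
[~, col] = find(ind);                  % column of each flagged entry, in linear-index order
lo = x_low(col);                       % bounds that apply to each flagged entry
hi = x_high(col);

% Draw one integer in [lo, hi] per flagged entry
pos(ind) = lo(:) + floor(rand(nnz(ind), 1) .* (hi(:) - lo(:) + 1));
```

Entries that were already within bounds are left untouched.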
You can use cellfun for this. Something like:
x_low = [Io_low, Iirr_low, Rp_low, Rs_low, n_low];
x_high = [Io_high, Iirr_high, Rp_high, Rs_high, n_high];
pos = cell2mat(cellfun(@randi, mat2cell([x_low' x_high'], ones(numel(x_low),1), 2), repmat({[1 particles]}, [numel(x_low) 1]), 'UniformOutput', false))';

indices of occurrence of each row in MATLAB

I have two matrices, A and B. (B is continuous like 1:n)
I need to find all the occurrences of each individual row of B in A, and store those row indices accordingly in cell array C. See below for an example.
A = [3,4,5;1,3,5;1,4,3;4,2,1]
B = [1;2;3;4;5]
Thus,
C = {[2,3,4];[4];[1,2,3];[1,3,4];[1,2]}
Note C does not need to be in a cell array for my application. I only suggest it because the row vectors of C are of unequal length. If you can suggest a work-around, this is fine too.
I've tried using a loop running ismember for each row of B, but this is too slow when the matrices A and B are huge, with around a million entries. Vectorized code is appreciated.
(To give you context, the purpose of this is to identify, in a mesh, those faces that are attached to a single vertex. Note I cannot use the function edgeattachments because my data are not of the form "TR" in triangulation representation. All I have is a list of faces and list of vertices.)
Well, the best answer for this would require knowledge of how A is filled. If A is sparse, that is, if it has few columns and B is quite large, then I think the best way to save memory may be to use a sparse matrix instead of a cell.
% No fancy stuff, just fast and furious
bMax = numel(B);
nRows = size(A,1);
cLogical = sparse(nRows,bMax);
for curRow = 1:nRows
curIdx = A(curRow,:);
cLogical(curRow,curIdx) = 1;
end
Answer:
cLogical =
(2,1) 1
(3,1) 1
(4,1) 1
(4,2) 1
(1,3) 1
(2,3) 1
(3,3) 1
(1,4) 1
(3,4) 1
(4,4) 1
(1,5) 1
(2,5) 1
How to read the answer: for each column, the listed rows are the rows of A in which that column index appears. That is, 1 appears in rows [2 3 4], 2 appears in row [4], 3 in rows [1 2 3], 4 in rows [1 3 4], and 5 in rows [1 2].
Then you can use cLogical instead of a cell as an indexing matrix in the future for your needs.
Another way would be to allocate C with the expected value for how many times an index should appear in C.
% Fancier solution using some assumed knowledge of A
bMax = numel(B);
nRows = size(A,1);
nColumns = size(A,2);
% Pre-allocating with the expected value, an attempt to reduce re-allocations.
% tic; for rep=1:10000; C = mat2cell(zeros(bMax,nColumns),ones(1,bMax),nColumns); end; toc
% Elapsed time is 1.364558 seconds.
% tic; for rep=1:10000; C = repmat({zeros(1,nColumns)},bMax,1); end; toc
% Elapsed time is 0.606266 seconds.
% So we keep the not fancy repmat solution
C = repmat({zeros(1,nColumns)},bMax,1);
for curRow = 1:nRows
curIdxMsk = A(curRow,:);
for curCol = 1:nColumns
curIdx = curIdxMsk(curCol);
fillIdx = ~C{curIdx};
if any(fillIdx)
fillIdx = find(fillIdx,1);
else
fillIdx = numel(fillIdx)+1;
end
C{curIdx}(fillIdx) = curRow;
end
end
% Squeeze empty indexes:
for curRow = 1:bMax
C{curRow}(~C{curRow}) = [];
end
Answer:
>> C{:}
ans =
2 3 4
ans =
4
ans =
1 2 3
ans =
1 3 4
ans =
1 2
Which solution will perform best? You should run a performance test with your own data, because it depends on how big A and bMax are, the memory size of your computer, and so on. Still, I'm curious about what other solutions people can come up with for this x). I liked chappjc's solution, although it has the cons that he has pointed out.
For the given example (10k times):
Solution 1: Elapsed time is 0.516647 seconds.
Solution 2: Elapsed time is 4.201409 seconds (it seems that solution 2 is a bad idea hahaha, but since it was created for the specific case of A having many rows, it has to be tested under those conditions).
chappjc's solution: Elapsed time is 2.405341 seconds.
We can do it without making any assumptions about B. Try this use of bsxfun and mat2cell:
M = squeeze(any(bsxfun(@eq,A,permute(B,[3 2 1])),2)); % 4x3x1 @eq 1x1x5 => 4x3x5
R = sum(M); % 4x5 -> 1x5
[ii,jj] = find(M);
C = mat2cell(ii,R)
The cells in C above will be column vectors rather than rows as in your example. To make the cells contain row vectors, use C = mat2cell(ii',1,R)' instead.
My only concern is that mat2cell could be slow for millions of values of R, but if you want your output in a cell, I'm not sure how much better you can do. EDIT: If you can deal with a sparse matrix like in Werner's first solution with the loop, replace the last line of the above with the following:
>> Cs = sparse(ii,jj,1)
Cs =
(2,1) 1
(3,1) 1
(4,1) 1
(4,2) 1
(1,3) 1
(2,3) 1
(3,3) 1
(1,4) 1
(3,4) 1
(4,4) 1
(1,5) 1
(2,5) 1
Unfortunately, bsxfun will probably run out of memory if both size(A,1) and numel(B) are large! You may have to loop over the elements of A or B if memory becomes an issue. Here's one way to do it by looping over your vertexes in B:
for i=1:numel(B), C{i} = find(any(A==B(i),2)); end
Yup, that easy. Cell array growing is extremely fast in MATLAB, as it is similar to a sequence container that stores contiguous references to the data, rather than keeping the data itself contiguous. Perhaps ismember was the bottleneck in your test.
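Another option that avoids both the explicit loop and the large 3-D intermediate is accumarray, grouping row numbers by the value found in A. This is only a sketch: it assumes, as the question states, that B is 1:n so the entries of A can be used directly as group indices, and that no value repeats within a row of A. The name rowIdx is introduced here for illustration.

```matlab
A = [3,4,5; 1,3,5; 1,4,3; 4,2,1];
B = [1;2;3;4;5];

nRows = size(A, 1);
nCols = size(A, 2);

% Row number of every entry of A, in the same column-major order as A(:)
rowIdx = repmat((1:nRows).', nCols, 1);

% Collect the row numbers belonging to each value 1..numel(B) into a cell
C = accumarray(A(:), rowIdx, [numel(B) 1], @(r) {sort(r).'});
```

For this example C should equal {[2,3,4];[4];[1,2,3];[1,3,4];[1,2]}, the result requested in the question.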

Creating Indicator Matrix

For a vector V of size n x 1, I would like to create binary indicator matrix M of the size n x Max(V) such that the row entries of M have 1 in the corresponding columns index, 0 otherwise.
For eg: If V is
V = [ 3
2
1
4]
The indicator matrix should be
M= [ 0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1]
The thing about an indicator matrix like this, is it is better if you make it sparse. You will almost always be doing a matrix multiply with it anyway, so make that multiply an efficient one.
n = 4;
V = [3;2;1;4];
M = sparse(1:n,V,1,n,max(V)); % row i gets its 1 in column V(i)
M =
(3,1) 1
(2,2) 1
(1,3) 1
(4,4) 1
If you insist on M being a full matrix, then making it so is simple after the fact, by use of full.
full(M)
ans =
0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1
Learn how to use sparse matrices. You will gain greatly from doing so. Admittedly, for a 4x4 matrix, sparse will not gain by much. But the example cases are never your true problem. Suppose that n was really 2000?
n = 2000;
V = randperm(n);
M = sparse(V,1:n,1,n,n);
FM = full(M);
whos FM M
Name Size Bytes Class Attributes
FM 2000x2000 32000000 double
M 2000x2000 48008 double sparse
Sparse matrices do not gain only in terms of memory used. Compare the time required for a single matrix multiply.
A = magic(2000);
tic,B = A*M;toc
Elapsed time is 0.012803 seconds.
tic,B = A*FM;toc
Elapsed time is 0.560671 seconds.
A quick way to do this, if you do not require a sparse matrix, is to create an identity matrix of size at least max(V), and then to create your indicator matrix by using V to select rows from it:
m = max(V);
I = eye(m);
M = I(V, :);
You would like to create the Index matrix to be sparse for memory sake. It is as easy as:
vSize = size(V);
Index = sparse(vSize(1),max(V));
for i = 1:vSize(1)
Index(i, V(i)) = 1;
end
I've used this myself, enjoy :)
You can simply combine the column index in V with a row index to create a linear index, then use that to fill M (initialized to zeroes):
M = zeros(numel(V), max(V));
M((1:numel(V))+(V.'-1).*numel(V)) = 1;
Here's another approach, similar to sparse but with accumarray:
V = [3; 2; 1; 4];
M = accumarray([(1:numel(V)).' V], 1);
M=sparse(V,1:size(V,1),1)';
will produce a sparse matrix that you can use in calculations as a full version.
You could use full(M) to "inflate" M to actually store zeros.
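Because the example V = [3;2;1;4] is a symmetric permutation, it cannot reveal a swapped row/column argument. A small cross-check (a sketch with a made-up V that has a repeated value) confirms that the approaches on this page agree on the n x max(V) layout asked for. The names M1 to M4 are introduced only for the comparison.

```matlab
V = [2; 3; 1; 3];          % not a permutation, so a transposed result would show up
n = numel(V);

M1 = full(sparse(1:n, V, 1, n, max(V)));              % sparse, rows 1:n and columns V
I  = eye(max(V));
M2 = I(V, :);                                         % rows of the identity matrix
M3 = accumarray([(1:n).' V], 1);                      % accumarray on (row, column) pairs
M4 = zeros(n, max(V));
M4((1:n) + (V.'-1).*n) = 1;                           % linear indexing

same = isequal(M1, M2, M3, M4);                       % true when all four agree
```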

Resources