Forming a 'partial' identity-matrix according to a partially filled vector - arrays

I'm currently forming a matrix from a vector in MATLAB following the scheme described below:
Given is a vector x containing ones and zeros in an arbitrary order, e.g.
x = [0 1 1 0 1];
From this, I would like to form a matrix Y that is described as follows:
Y has m rows, where m is the number of ones in x (here: 3).
Each row of Y is filled with a one at the k-th entry, where k is the position of a one in vector x (here: k = 2,3,5)
For the example x from above, this would result in:
Y = [0 1 0 0 0;
0 0 1 0 0;
0 0 0 0 1]
This is identical to an identity matrix, that has its (x=0)th rows eliminated.
I'm currently achieving this via the following code:
x = [0,1,1,0,1]; %example from above
m = sum(x==1);
Y = zeros(m,numel(x));
p = 1;
for n = 1:numel(x)
if x(n) == 1
Y(p,n) = 1;
p = p+1;
end
end
It works but I'm kind of unhappy with it as it seems rather inefficient and inelegant. Any ideas for a smoother implementation, maybe using some matrix multiplications or so are welcome.

Here are a few one-line alternatives:
Using sparse:
Y = full(sparse(1:nnz(x), find(x), 1));
Similar but with accumarray:
Y = accumarray([(1:nnz(x)).' find(x(:))], 1);
Using eye and indexing. This assumes Y is previously undefined:
Y(:,logical(x)) = eye(nnz(x));

Use find to obtain the indices of ones in x which are also the column subscripts of ones in Y. Find the number of rows of Y by adding all the elements of the vector x. Use these to initialise Y as a zero matrix. Now find the linear indices to place 1s using sub2ind. Use these indices to change the elements of Y to 1.
cols = find(x);
noofones = sum(x);
Y = zeros(noofones, size(x,2));
Y(sub2ind(size(Y), 1:noofones, cols)) = 1;

Here's an alternative using matrix multiplications:
x = [0,1,1,0,1];
I = eye(numel(x));
% construct identity matrix with zero rows
Y = I .* x; % uses implicit expansion from 2016b or later
Y = Y(logical(x), :); % take only non-zero rows of Y
Result:
Y =
0 1 0 0 0
0 0 1 0 0
0 0 0 0 1
Thanks to #SardarUsama's comment for simplifying the code a bit.

Thanks everybody for the nice alternatives! I tried out all your solutions and averaged execution times over 1e4 executions for random (1000-entry) x-vectors. Here are the results:
(7.3e-4 sec) full(sparse(1:nnz(x), find(x), 1));
(7.5e-4 sec) cols = find(x);
noofones = sum(x);
Y = zeros(noofones, size(x,2));
Y(sub2ind(size(Y), 1:noofones, cols)) = 1;
(7.7e-4 sec) Y = accumarray([(1:nnz(x)).' find(x(:))], 1);
(1.7e-3 sec) I = speye(numel(x));
Y = I .* x;
Y = full(Y(logical(x), :));
(3.1e-3 sec) Y(:,logical(x)) = eye(nnz(x));

From your comment "This is identical to an identity matrix, that has its (x=0)th rows eliminated.", well, you can also explicitly generate it as such:
Y = eye(length(x));
Y(x==0, :) = [];
Very slow option for long x, but it works slightly faster than full(sparse(... for x with 10 elements on my computer.

Related

Difference of elements of 2 sorted arrays within given interval

Let us assume that we have 2 sorted arrays A and B of integers and a given interval [L,M] . If x is an element of A and y an element of B ,our task is to find all pairs of (x,y) that hold the following property: L<=y-x<=M.
Which is the most suitable algorithm for that purpose?
So far ,I have considered the following solution:
Brute force. Check the difference of all possible pairs of elements with a double loop .Complexity O(n^2).
A slightly different version of the previous solution is to make use of the fact that arrays are sorted by not checking the elements of A ,once difference gets out of interval .Complexity would still be O(n^2) but hopefully our program would run faster at an average case.
However ,I believe that O(n^2) is not optimal .Is there an algorithm with better complexity?
Here is a solution.
Have a pointer at the beginning of each array say i for array A and j for array B.
Calculate the difference between B[j] and A[i].
If it is less than L, increment the pointer in array B[], i.e increment j by 1
If it is more than M, increment i, i.e pointer of A.
If the difference is in between, then do the following:
search for the position of an element whose value is B[j]-A[i]-L or the nearest
element whose value is lesser than (B[j]-A[i])-L in array A. This
takes O(logN) time. Say the position is p. Increment the count of
(x,y) pairs by p-i+1
Increment only pointer j
My solution only counts the number of possible (x,y) pairs in O(NlogN) time
For A=[1,2,3] and B=[10,12,15] and L=12 and M=14, answer is 3.
Hope this helps. I leave it up to you, to implement the solution
Edit: Enumerating all the possible (x,y) pairs would take O(N^2) worst case time. We will be able to return the count of such pairs (x,y) in O(NlogN) time. Sorry for not clarifying it earlier.
Edit2: I am attaching a sample implementation of my proposed method below:
def binsearch(a, x):
low = 0
high = len(a)-1
while(low<=high):
mid = (low+high)/2
if a[mid] == x:
return mid
elif a[mid]<x:
k = mid
low = low + mid
else:
high = high - mid
return k
a = [1, 2, 3]
b = [10, 12, 15, 16]
i = 0
j = 0
lenA = len(a)
lenB = len(b)
L = 12
M = 14
count = 0
result = []
while i<lenA and j<lenB:
if b[j] - a[i] < L:
j = j + 1
elif b[j] - a[i] > M:
i = i + 1
else:
p = binsearch(a,b[j]-L)
count = count + p - i + 1
j = j + 1
print "number of (x,y) pairs: ", count
Because it's possible for every combination to be in the specified range, the worst-case is O([A][B]), which is basically O(n^2)
However, if you want the best simple algorithm, this is what I've come up with. It starts similarly to user-targaryen's algorithm, but handles overlaps in a simplistic fashion
Create three variables: x,y,Z and s (set all to 0)
Create a boolean 'success' and set to false
Calculate Z = B[y] - A[x]
if Z < L
increment y
if Z >= L and <= M
if success is false
set s = y
set success = true
increment y
store x,y
if Z > M
set y = s //this may seem inefficient with my current example
//but you'll see the necessity if you have a sorted list with duplicate values)
//you can just change the A from my example to {1,1,2,2,3} to see what I mean
set success = false
an example:
A = {1,2,3,4,5}
B = {3,4,5,6,7}
L = 2, M = 3
In this example, the first pair is x,y. The second number is s. The third pair is the values A[x] and B[y]. The fourth number is Z, the difference between A[x] and B[y]. The final value is X for not a match and O for a match
0,0 - 0 - 1,3 = 2 O
increment y
0,1 - 0 - 1,4 = 3 O
increment y
0,2 - 0 - 1,5 = 4 X
//this is the tricky part. Look closely at the changes this makes
set y to s
increment x
1,0 - 0 - 2,3 = 1 X
increment y
1,1 - 0 - 2,4 = 2 O
set s = y, set success = true
increment y
1,2 - 1 - 2,5 = 3 O
increment y
1,3 - 1 - 2,6 = 4 X
set y to s
increment x
2,1 - 1 - 3,4 = 1 X
increment y
2,2 - 1 - 3,5 = 2 O
set s = y, set success = true
increment y
2,3 - 2 - 3,6 = 3 O
... and so on

Upsample vector by including zeros in-between elements

I am trying to make a Upsampling code, so i want to insert a zeros vector inside a vector, like this:
z=[0 0]%Zeros Vector
x=[1 2 3 4]%Vector to Upsample
y=[1 0 0 2 0 0 3 0 0 4 0 0]%Vector Upsampled
I have this code:
fs=20;
N=50;
T=1/fs;
n=0:1:N-1;
L=3;
M=2;
x = exp(-0.5*n*T).*sin(2*pi*n*T);
A=zeros(1,L);
disp(A);
for i = 1:M:length(x)
x(:,i)=A;
end
disp(x);
But I am gettinf this error:
A(I,J,...) = X: dimensions mismatch
Any Idea of how can I do that?
Forget the loop. Use the following solution:
out = zeros(1,length(x)*L);
out(:,1:L:end) = x
Here's a solution using bsxfun and reshape:
y = reshape(bsxfun(#times,[1;z.'],x),1,[]);
I initially thought of repelem, but decided it was too much work. However, if you just want to make your vector longer using a "zero order approximation" - this is just the function for you.
You can use repmat and reshape to get the upsampled vector as:
y = reshape([x' repmat(z,size(x,2),1)]',1,[])
y =
1 0 0 2 0 0 3 0 0 4 0 0
Keep in mind that z and x are row vectors, you may need to play with the statement a little bit if they are column vectors.
A short alternative using the kronecker tensor product:
y = kron(x,[1, z]) %// x(:).' and z(:).' for independent vector orientations
And another fast alternative:
y = [1; z(:)]*x; y = y(:).' %// x(:).' for independent vector orientations
which is basically equivalent to:
y = reshape( [1; z(:)]*x, 1, []) %// x(:).' for independent vector orientations
Use upsample
fs=20;
N=50;
T=1/fs;
n=0:1:N-1;
L=3;
x = exp(-0.5*n*T).*sin(2*pi*n*T);
x = upsample(x, L)

Matlab Convert Vector to Binary Matrix [duplicate]

This question already has answers here:
Create a zero-filled 2D array with ones at positions indexed by a vector
(4 answers)
Closed 6 years ago.
I have a vector v of size (m,1) whose elements are integers picked from 1:n. I want to create a matrix M of size (m,n) whose elements M(i,j) are 1 if v(i) = j, and are 0 otherwise. I do not want to use loops, and would like to implement this as a simple vector-matrix manipulation only.
So I thought first, to create a matrix with repeated elements
M = v * ones(1,n) % this is a (m,n) matrix of repeated v
For example v=[1,1,3,2]'
m = 4 and n = 3
M =
1 1 1
1 1 1
3 3 3
2 2 2
then I need to create a comparison vector c of size (1,n)
c = 1:n
1 2 3
Then I need to perform a series of logical comparisons
M(1,:)==c % this results in [1,0,0]
.
M(4,:)==c % this results in [0,1,0]
However, I thought it should be possible to perform the last steps of going through each single row in compact matrix notation, but I'm stumped and not knowledgeable enough about indexing.
The end result should be
M =
1 0 0
1 0 0
0 0 1
0 1 0
A very simple call to bsxfun will do the trick:
>> n = 3;
>> v = [1,1,3,2].';
>> M = bsxfun(#eq, v, 1:n)
M =
1 0 0
1 0 0
0 0 1
0 1 0
How the code works is actually quite simple. bsxfun is what is known as the Binary Singleton EXpansion function. What this does is that you provide two arrays / matrices of any size, as long as they are broadcastable. This means that they need to be able to expand in size so that both of them equal in size. In this case, v is your vector of interest and is the first parameter - note that it's transposed. The second parameter is a vector from 1 up to n. What will happen now is the column vector v gets replicated / expands for as many values as there are n and the second vector gets replicated for as many rows as there are in v. We then do an eq / equals operator between these two arrays. This expanded matrix in effect has all 1s in the first column, all 2s in the second column, up until n. By doing an eq between these two matrices, you are in effect determining which values in v are equal to the respective column index.
Here is a detailed time test and breakdown of each function. I placed each implementation into a separate function and I also let n=max(v) so that Luis's first code will work. I used timeit to time each function:
function timing_binary
n = 10000;
v = randi(1000,n,1);
m = numel(v);
function luis_func()
M1 = full(sparse(1:m,v,1));
end
function luis_func2()
%m = numel(v);
%n = 3; %// or compute n automatically as n = max(v);
M2 = zeros(m, n);
M2((1:m).' + (v-1)*m) = 1;
end
function ray_func()
M3 = bsxfun(#eq, v, 1:n);
end
function op_func()
M4= ones(1,m)'*[1:n] == v * ones(1,n);
end
t1 = timeit(#luis_func);
t2 = timeit(#luis_func2);
t3 = timeit(#ray_func);
t4 = timeit(#op_func);
fprintf('Luis Mendo - Sparse: %f\n', t1);
fprintf('Luis Mendo - Indexing: %f\n', t2);
fprintf('rayryeng - bsxfun: %f\n', t3);
fprintf('OP: %f\n', t4);
end
This test assumes n = 10000 and the vector v is a 10000 x 1 vector of randomly distributed integers from 1 up to 1000. BTW, I had to modify Luis's second function so that the indexing will work as the addition requires vectors of compatible dimensions.
Running this code, we get:
>> timing_binary
Luis Mendo - Sparse: 0.015086
Luis Mendo - Indexing: 0.327993
rayryeng - bsxfun: 0.040672
OP: 0.841827
Luis Mendo's sparse code wins (as I expected), followed by bsxfun, followed by indexing and followed by your proposed approach using matrix operations. The timings are in seconds.
Assuming n equals max(v), you can use sparse:
v = [1,1,3,2];
M = full(sparse(1:numel(v),v,1));
What sparse does is build a sparse matrix using the first argument as row indices, the second as column indices, and the third as matrix values. This is then converted into a full matrix with full.
Another approach is to define the matrix containing initially zeros and then use linear indexing to fill in the ones:
v = [1,1,3,2];
m = numel(v);
n = 3; %// or compute n automatically as n = max(v);
M = zeros(m, n);
M((1:m) + (v-1)*m) = 1;
I think I've also found a way to do it, and it would be nice if somebody could tell me which of the methods shown is faster for very large vectors and matrices. The additional method I thought of is the following
M= ones(1,m)'*[1:n] == v * ones(1,n)

Carrying out of loop so as to do the following operation

consider an area with size m*n. Here the size of m and n is unknown. Now I am extracting data from each point in the area. I am scanning the area first going in the x direction till m point and the again returning to m=0 and n=1, i.e the second row. Again I scan along the x direction till the end of m. An example of the data has been shown below. Here I get value for different x,y coordinates during the scan. I can carry out operation between the first two points in x direction by
p1 = A{1}; %%reading the data from the text file
p2 = A{2};
LA=[p1 p2];
for m=1:length(y)
p= LA(m,1);
t= LA(m,2);
%%and
q=LA(m+1,1)
r=LA(m+1,2)
I want to do the same for y axis. That is I want to operate between first point in x=0 and y=1 then between x=2 and y=1 and so on. Hope you have got it.
g x y
2 0 0
3 1 0
2 2 0
4 3 0
1 4 0
2 m 0
3 0 1
2 1 1
4 2 1
5 3 1
.
.
.
.
2 m 1
now I was thinking of a logic where I will first find the size of n by counting the number of zeros
NUMX = 0;
while y((NUMX+1),:) == 0
NUMX = NUMX + 1;
end
NU= NUMX;
And then I was thinking of applying the following loop
for m=1:NU:n-1
%%and
p= LA(m,1);
t= LA(m,2);
%%and
q=LA(m+1,1)
r=LA(m+1,2)
But its showing error. Please help!!
??? Attempted to access del2(99794,:); index out of bounds because
size(del2)=[99793,1].
Here NUMX=198
Comment: The nomenclature in your question is inconsistent, making it difficult to understand what you are doing. The variable del2 you mention in the error message is nowhere to be seen.
1.) Let's start off by creating a minimal working example that illustrates the data structure and provides knowledge of the dimensions we want to retrieve later. You matrix is not m x n but m*n x 3.
The following example will set up a matrix with data similar to what you have shown in your question:
M = zeros(8,3);
for J=1:4
for I=1:2
M((J-1)*2+I,1) = rand(1);
M((J-1)*2+I,2) = I;
M((J-1)*2+I,3) = J-1;
end
end
M =
0.469 1 0
0.012 2 0
0.337 1 1
0.162 2 1
0.794 1 2
0.311 2 2
0.529 1 3
0.166 2 3
2.) Next, let's determine the number of x and y, to use the nomenclature of your question:
NUMX = 0;
while M(NUMX+1,3) == 0
NUMX = NUMX + 1;
end
NUMY = size(M,1)/NUMX;
NUMX =
2
NUMY =
4
3.) The data processing you want to do still is unclear, but here are two approaches that can be used for different means:
(a)
COUNT = 1;
for K=1:NUMX:size(M,1)
A(COUNT,1) = M(K,1);
COUNT = COUNT + 1;
end
In this case, you step through the first column of M with a step-size corresponding to NUMX. This will result in all the values for x=1:
A =
0.469
0.337
0.794
0.529
(b) You can also use NUMX and NUMY to reorder M:
for J=1:NUMY
for I=1:NUMX
NEW_M(I,J) = M((J-1)*NUMX+I,1);
end
end
NEW_M =
0.469 0.337 0.794 0.529
0.012 0.162 0.311 0.166
The matrix NEW_M now is of size m x n, with the values of constant y in the columns and the values of constant x in the rows.
Concluding remark: It is unclear how you define m and n in your code, so your specific error message cannot be resolved here.

Creating Indicator Matrix

For a vector V of size n x 1, I would like to create binary indicator matrix M of the size n x Max(V) such that the row entries of M have 1 in the corresponding columns index, 0 otherwise.
For eg: If V is
V = [ 3
2
1
4]
The indicator matrix should be
M= [ 0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1]
The thing about an indicator matrix like this, is it is better if you make it sparse. You will almost always be doing a matrix multiply with it anyway, so make that multiply an efficient one.
n = 4;
V = [3;2;1;4];
M = sparse(V,1:n,1,n,n);
M =
(3,1) 1
(2,2) 1
(1,3) 1
(4,4) 1
If you insist on M being a full matrix, then making it so is simple after the fact, by use of full.
full(M)
ans =
0 0 1 0
0 1 0 0
1 0 0 0
0 0 0 1
Learn how to use sparse matrices. You will gain greatly from doing so. Admittedly, for a 4x4 matrix, sparse will not gain by much. But the example cases are never your true problem. Suppose that n was really 2000?
n = 2000;
V = randperm(n);
M = sparse(V,1:n,1,n,n);
FM = full(M);
whos FM M
Name Size Bytes Class Attributes
FM 2000x2000 32000000 double
M 2000x2000 48008 double sparse
Sparse matrices do not gain only in terms of memory used. Compare the time required for a single matrix multiply.
A = magic(2000);
tic,B = A*M;toc
Elapsed time is 0.012803 seconds.
tic,B = A*FM;toc
Elapsed time is 0.560671 seconds.
a quick way to do this - if you do not require sparse matrix - is to create an identity matrix, of size at least the max(v), then to create your indicator matrix by extracting indexes from v:
m = max(V);
I = eye(m);
V = I(V, :);
You would like to create the Index matrix to be sparse for memory sake. It is as easy as:
vSize = size(V);
Index = sparse(vSize(1),max(V));
for i = 1:vSize(1)
Index(i, v(i)) = 1;
end
I've used this myself, enjoy :)
You can simply combine the column index in V with a row index to create a linear index, then use that to fill M (initialized to zeroes):
M = zeros(numel(V), max(V));
M((1:numel(V))+(V.'-1).*numel(V)) = 1;
Here's another approach, similar to sparse but with accumarray:
V = [3; 2; 1; 4];
M = accumarray([(1:numel(V)).' V], 1);
M=sparse(V,1:size(V,1),1)';
will produce a sparse matrix that you can use in calculations as a full version.
You could use full(M) to "inflate" M to actually store zeros.

Resources