Modifying a numpy array efficiently

I have a NumPy array A of size 10 with values in the range 0-4. I want to create a new 2-D array B from it whose ith column is the indicator vector corresponding to the ith element of A.
For example, the value 1 as the first element of A would correspond to B having the column vector [0,1,0,0,0] as its first column. A having 4 as its third element would correspond to B having [0,0,0,0,1] as its third column.
I have the following code:
import numpy as np
A = np.random.randint(0,5,10)
B = np.ones((5,10))
iden = np.identity(5, dtype=np.float64)
for i in range(0,10):
    a = A[i]
    # copy column a of the identity matrix into column i of B
    B[:,i:i+1] = iden[:,a:a+1]
print(A)
print(B)
The code is doing what it's supposed to be doing but I am sure there are more efficient ways of doing this. Can anyone please suggest some?

That could be solved by initializing an array of zeros and then integer-indexing into it with indices from A and assigning 1s, like so -
M,N = 5,10
A = np.random.randint(0,M,N)
B = np.zeros((M,N))
B[A,np.arange(len(A))] = 1
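
As a quick check, here is a minimal self-contained sketch that compares the indexing approach above with the loop from the question. The fixed seed and the names `B_loop` and `B_eye` are assumptions added for illustration; the last line shows a closely related one-liner that selects columns of the identity matrix.

```python
import numpy as np

M, N = 5, 10                       # number of distinct values, length of A
rng = np.random.default_rng(0)     # fixed seed (assumption, for reproducibility)
A = rng.integers(0, M, N)          # values in 0..M-1

# Vectorized: set row A[i] of column i to 1.
B = np.zeros((M, N))
B[A, np.arange(N)] = 1

# Reference: the column-by-column loop from the question.
iden = np.identity(M)
B_loop = np.ones((M, N))
for i in range(N):
    B_loop[:, i] = iden[:, A[i]]

# Related one-liner: pick columns of the identity matrix by value.
B_eye = np.eye(M)[:, A]
```

All three arrays should be identical, with exactly one 1 per column.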

Related

Creating a numpy array from a vector by removing one item at a time

I have a list b with n elements.
I want to create a 2D numpy array of size (n,n-1) from this list such that the i-th row contains the elements of b without the i-th term.
For example, if
b = [1,2,3,4]
The numpy array will be,
A = np.array([[2,3,4],
              [1,3,4],
              [1,2,4],
              [1,2,3]])

Approach #1 : One approach with masking -
n = len(b)
b2D = np.broadcast_to(b, (n,n)) # or np.repeat(b[None],n,axis=0)
out = b2D[~np.eye(n, dtype=bool)].reshape(n,-1)
Approach #2 : With focus on performance and memory efficiency, another with NumPy strides -
strided = np.lib.stride_tricks.as_strided
n = len(b)
b_ext = np.r_[b[1:],b[:-1]]
s = b_ext.strides[0]
out = strided(b_ext, shape=(n-1,n), strides=(s,s)).reshape(n,-1)
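
A small runnable sketch of both approaches on the example list, checked against a plain `np.delete` loop. The names `out_mask`, `out_strided` and `out_ref` are assumptions added for illustration.

```python
import numpy as np

b = [1, 2, 3, 4]
n = len(b)

# Approach #1: tile b into an n x n grid, then drop the diagonal.
b2D = np.broadcast_to(b, (n, n))
out_mask = b2D[~np.eye(n, dtype=bool)].reshape(n, -1)

# Approach #2: sliding windows over [b[1:], b[:-1]], then reshape.
strided = np.lib.stride_tricks.as_strided
b_ext = np.r_[b[1:], b[:-1]]
s = b_ext.strides[0]
out_strided = strided(b_ext, shape=(n - 1, n), strides=(s, s)).reshape(n, -1)

# Reference: remove the i-th element explicitly for each row.
out_ref = np.array([np.delete(b, i) for i in range(n)])
```

The strided windows stay inside `b_ext` (the last window ends at its final element), and the final `reshape` copies the data, so the result does not alias the window view.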

How to extract different values/elements of matrix or array without repeating?

I have a vector (or it could be an array):
A = [1,2,3,4,5,1,2,3,4,5,1,2,3]
I want to extract the distinct values from this vector, without repeats:
1,2,3,4,5
B = [1,2,3,4,5]
How can I extract them?
I would appreciate any help.

Try this,
A = [1,2,3,4,5,1,2,3,4,5,1,2,3]
B = unique(A)
B = unique(A) returns the same values as in A but with no repetitions. The resulting vector is sorted in ascending order. A can be a cell array of strings.
B = unique(A,'stable') does the same as above, but without sorting.
B = unique(A,'rows') returns the unique rows of A.
[B,i,j] = unique(...) also returns index vectors i and j such that B = A(i) and A = B(j) (or B = A(i,:) and A = B(j,:)).
Reference: http://cens.ioc.ee/local/man/matlab/techdoc/ref/unique.html
Documentation: https://uk.mathworks.com/help/matlab/ref/unique.html

The other answers are correct, but if you do not want the data sorted, you can use unique with the 'stable' option:
A = [1,2,3,4,5,1,2,3,4,5,1,2,3]
B = unique(A,'stable')
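
For comparison with the NumPy questions in this thread, here is a rough Python sketch of the same two behaviours: sorted unique values and unique values in order of first appearance. The second list `A2` is a made-up example, chosen so that the two results differ.

```python
A = [1, 2, 3, 4, 5, 1, 2, 3, 4, 5, 1, 2, 3]

# Sorted unique values, analogous to unique(A).
B_sorted = sorted(set(A))

# Unique values in order of first appearance, analogous to unique(A, 'stable').
# dict keys preserve insertion order in Python 3.7+.
B_stable = list(dict.fromkeys(A))

# Hypothetical second input where the two orderings differ.
A2 = [3, 1, 2, 3, 1]
A2_sorted = sorted(set(A2))
A2_stable = list(dict.fromkeys(A2))
```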

How do I convert a cell array w/ different data formats to a matrix in Matlab?

So my main objective is to take a matrix of form
matrix = [a, 1; b, 2; c, 3]
and a list of identifiers in matrix[:,1]
list = [a; c]
and generate a new matrix
new_matrix = [a, 1;c, 3]
My problem is I need to import the data that would be used in 'matrix' from a tab-delimited text file. To get this data into Matlab I use the code:
matrix_open = fopen(fn_matrix, 'r');
matrix = textscan(matrix_open, '%c %d', 'Delimiter', '\t');
which outputs a cell array of two 3x1 arrays. I want to get this into one 3x2 matrix where the first column is a character, and the second column an integer (these data formats will be different in my implementation).
So far I've tried the code:
matrix_1 = cell2mat(matrix(1,1));
matrix_2 = cell2mat(matrix(1,2));
matrix = horzcat(matrix_1, matrix_2)
but this is returning a 3x2 matrix where the second column is empty.
If I just use
cell2mat(matrix)
it says it can't do it because of the different data formats.
Thanks!

This is MATLAB's help text for the cell2mat function:
cell2mat Convert the contents of a cell array into a single matrix.
M = cell2mat(C) converts a multidimensional cell array with contents of
the same data type into a single matrix. The contents of C must be able
to concatenate into a hyperrectangle. Moreover, for each pair of
neighboring cells, the dimensions of the cell's contents must match,
excluding the dimension in which the cells are neighbors. This constraint
must hold true for neighboring cells along all of the cell array's
dimensions.
From what I understand, the contents you want to put in a matrix must all be of the same type; otherwise, why do you want a matrix? You could simply create a new cell array.

It's not possible to have a normal matrix with characters and numbers. That's why cell2mat won't work here. But you can store different datatypes in a cell-array. Use cellstr for the strings/characters and num2cell for the integers to convert the contents of matrix. If you have other datatypes, use an appropriate function for this step. Then assign them to the columns of an empty cell-array.
Here is the code:
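fn_matrix = 'data.txt';
matrix_open = fopen(fn_matrix, 'r');
matrix = textscan(matrix_open, '%c %d', 'Delimiter', '\t');
fclose(matrix_open);             % release the file handle
X = cell(size(matrix{1},1),2);   % empty n-by-2 cell array
X(:,1) = cellstr(matrix{1});     % characters -> cell array of strings
X(:,2) = num2cell(matrix{2});    % integers -> one cell per value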
The result:
X =
'a' [1]
'b' [2]
'c' [3]
Now we can do the second part of the question: extracting the entries whose letter matches one of the entries in the list. For this you can use ismember and logical indexing, like this:
list = ['a'; 'c'];
sel = ismember(X(:,1),list);
Y(:,1) = X(sel,1);
Y(:,2) = X(sel,2);
The result here:
Y =
'a' [1]
'c' [3]
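
For readers coming from the NumPy questions in this thread, the same select-rows-by-identifier step can be sketched in Python. This is only an illustration under assumed data: `rows` stands in for the parsed file contents and `wanted` for the identifier list; neither name comes from the original post.

```python
# Hypothetical parsed data: one (identifier, value) pair per line of the file.
rows = [("a", 1), ("b", 2), ("c", 3)]

# Identifiers to keep; a set gives fast membership tests.
wanted = {"a", "c"}

# Keep only the rows whose identifier is in the wanted set, preserving order.
new_rows = [row for row in rows if row[0] in wanted]
```

As with the cell array, each row can hold mixed types, which a plain numeric matrix cannot.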

MATLAB: How to subset a multidimensional matrix using 1-D vector indices without for loops?

I am currently looking for an efficient way to slice multidimensional matrices in MATLAB. As an example, say I have a multidimensional matrix such as
A = rand(10,10,10)
I would like to obtain a subset of this matrix (let's call it B) at certain indices along each dimension. To do this, I have access to the index vectors along each dimension:
ind_1 = [1,4,5]
ind_2 = [1,2]
ind_3 = [1,2]
Right now, I am doing this rather inefficiently as follows:
N1 = length(ind_1)
N2 = length(ind_2)
N3 = length(ind_3)
B = NaN(N1,N2,N3)
for i = 1:N1
    for j = 1:N2
        for k = 1:N3
            B(i,j,k) = A(ind_1(i),ind_2(j),ind_3(k))
        end
    end
end
I suspect there is a smarter way to do this. Ideally, I'm looking for a solution that does not use for loops and could be used for an arbitrary N dimensional matrix.

Actually it's very simple:
B = A(ind_1, ind_2, ind_3);
As you see, Matlab indices can be vectors, and then the result is the Cartesian product of those vector indices. More information can be found in the Matlab documentation on matrix indexing.
If the number of dimensions is unknown at programming time, you can define the indices in a cell array and then expand it into a comma-separated list:
ind = {[1 4 5], [1 2], [1 2]};
B = A(ind{:});
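
For comparison, NumPy does not form the Cartesian product by default: indexing with several index lists pairs them elementwise. A sketch of the equivalent operation uses `np.ix_`, which works for any number of dimensions. The index values below are the 0-based counterparts of the MATLAB vectors above; the seed and the name `B_loop` are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)     # fixed seed (assumption, for reproducibility)
A = rng.random((10, 10, 10))

# 0-based counterparts of the MATLAB index vectors [1,4,5], [1,2], [1,2].
ind = ([0, 3, 4], [0, 1], [0, 1])

# np.ix_ builds an open mesh, so the result covers the Cartesian product.
B = A[np.ix_(*ind)]

# Reference: the explicit triple loop from the question.
B_loop = np.empty((len(ind[0]), len(ind[1]), len(ind[2])))
for i, a in enumerate(ind[0]):
    for j, c in enumerate(ind[1]):
        for k, d in enumerate(ind[2]):
            B_loop[i, j, k] = A[a, c, d]
```

Unpacking the tuple with `*ind` plays the same role as `ind{:}` in the MATLAB version.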

You can reference data in matrices by simply specifying the indices, like in the following example:
B = A(start:stop, :, 2);
In the example:
start:stop gets a range of data between two points
: gets all entries
2 gets only one entry
In your case, since all your indices are 1D, you could just simply use:
C = A(x_index, y_index, z_index);

How to find the occurrences of the elements of first cell array in second cell array in MATLAB?

I have two cell arrays:
A={'abc','pai','abd','pa/n/v/d'}
B={'pai-pro','abc','pai','abd/','abd','pa/n/v/d','abd-','pa/n/v/d','pai-pro'}
I need code to count the occurrences of the elements of A in B, such that the output would be:
'abc' = 1
'pai' = 3
'abd' = 3
'pa/n/v/d' = 2

This will do it:
for i = 1:length(A)
    % count the elements of B that contain A{i} as a substring
    sum(~cellfun(@isempty, strfind(B, A{i})))
end

For each element of A, you can do the following to check whether it occurs in B:
[isPresent, index] = ismember(A{i}, B)
If A{i} is present in B (as indicated by isPresent), index will contain its location in B. Note that ismember looks for exact matches only and returns a single location rather than a count.
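
To make the difference between substring counting and exact matching concrete, here is a short Python sketch on the same data. The names `counts` and `exact_counts` are assumptions for illustration. The expected output in the question (for example 'pai' = 3) corresponds to the substring count.

```python
from collections import Counter

A = ['abc', 'pai', 'abd', 'pa/n/v/d']
B = ['pai-pro', 'abc', 'pai', 'abd/', 'abd', 'pa/n/v/d', 'abd-', 'pa/n/v/d', 'pai-pro']

# Number of elements of B that contain each element of A as a substring.
counts = {a: sum(a in s for s in B) for a in A}

# Number of elements of B that are exactly equal to each element of A.
tally = Counter(B)
exact_counts = {a: tally[a] for a in A}
```

Here 'pai' is counted three times as a substring ('pai-pro' twice and 'pai' once) but only once as an exact match.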