I have two large arrays which I will illustrate using the following examples.
The first array A is:
[ 1 21;
3 4;
4 12;
5 65 ];
The second array B is:
[ 3 56;
5 121];
I want to obtain the final array C as follows:
[ 1 21;
3 56;
4 12;
5 121 ];
i.e. replace second column of A with elements of B when available.
I am using Matlab 2007.
MATLAB Solution
With ismember -
C = A;
[is_present,pos] = ismember(A(:,1),B(:,1))
C(is_present,2) = B(pos(is_present),2)
Or use bsxfun to replace ismember -
[is_present,pos] = max(bsxfun(@eq,A(:,1),B(:,1).'),[],2);
Sample run -
>> A,B
A =
1 21
3 4
4 12
5 65
B =
3 56
5 121
4 66
>> C = A;
[is_present,pos] = ismember(A(:,1),B(:,1));
C(is_present,2) = B(pos(is_present),2);
>> C
C =
1 21
3 56
4 66
5 121
Bonus: NUMPY/PYTHON Solution
You can use boolean indexing with np.in1d (recent NumPy versions recommend np.isin instead) -
import numpy as np
mask = np.in1d(A[:,0],B[:,0])
C = A.copy()
C[mask] = B
Sample run -
In [34]: A
Out[34]:
array([[ 1, 21],
[ 3, 4],
[ 4, 12],
[ 5, 65]])
In [35]: B
Out[35]:
array([[ 3, 56],
[ 5, 121]])
In [36]: mask = np.in1d(A[:,0],B[:,0])
...: C = A.copy()
...: C[mask] = B
...:
In [37]: C
Out[37]:
array([[ 1, 21],
[ 3, 56],
[ 4, 12],
[ 5, 121]])
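Note that the C[mask] = B assignment relies on every key in B appearing in A, with B's rows in the same key order as A. When that may not hold, a possible variant is to look up each key of A in a sorted copy of B. This is only a sketch: the sample B below (unordered, with an extra key 7) is made up for illustration.

```python
import numpy as np

A = np.array([[1, 21], [3, 4], [4, 12], [5, 65]])
# Hypothetical B: rows out of order, and key 7 does not occur in A.
B = np.array([[5, 121], [3, 56], [7, 99]])

# Sort B by its key column so np.searchsorted can locate each key of A.
B_sorted = B[np.argsort(B[:, 0])]
pos = np.searchsorted(B_sorted[:, 0], A[:, 0])
pos = np.clip(pos, 0, len(B_sorted) - 1)   # keep indices in range

# A key of A is present in B only if the located entry matches exactly.
mask = B_sorted[pos, 0] == A[:, 0]

C = A.copy()
C[mask, 1] = B_sorted[pos[mask], 1]
print(C)
```

Only the second column is overwritten, so rows of A without a match in B are left untouched.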
Related
I have a 2D array. For example:
ary = np.arange(24).reshape(6,4)
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]
[16 17 18 19]
[20 21 22 23]]
I want to break this into smaller 2D arrays, each 2x2, and compute the square root of the sum of each. I actually want to use arbitrary-sized sub-arrays and compute arbitrary functions of them, but I think this question is easier to ask with concrete operations and concrete array sizes. In this example, starting with a 6x4 array and computing the square root of the sums of 2x2 sub-arrays, the final result would be a 3x2 array, as follows:
[[3.16, 4.24] # math.sqrt(0+1+4+5) , math.sqrt(2+3+6+7)
[6.48, 7.07] # math.sqrt(8+9+12+13) , math.sqrt(10+11+14+15)
[8.60, 9.05]] # math.sqrt(16+17+20+21), math.sqrt(18+19+22+23)
How can I slice, or split, or do some operation to perform some computation on 2D sub-arrays?
Here is a working, inefficient example of what I'm trying to do:
import numpy as np
a_height = 6
a_width = 4
a_area = a_height * a_width
a = np.arange(a_area).reshape(a_height, a_width)
window_height = 2
window_width = 2
b_height = a_height // window_height
b_width = a_width // window_width
b_area = b_height * b_width
b = np.zeros(b_area).reshape(b_height, b_width)
for i in range(b_height):
for j in range(b_width):
b[i, j] = a[i * window_height:(i + 1) * window_height, j * window_width:(j + 1) * window_width].sum()
b = np.sqrt(b)
print(b)
# [[3.16227766 4.24264069]
# [6.4807407 7.07106781]
# [8.60232527 9.05538514]]
In [2]: ary = np.arange(24).reshape(6,4)
In [3]: ary
Out[3]:
array([[ 0, 1, 2, 3],
[ 4, 5, 6, 7],
[ 8, 9, 10, 11],
[12, 13, 14, 15],
[16, 17, 18, 19],
[20, 21, 22, 23]])
While I recommended moving-windows based on as_strided, we can also divide the array into 'blocks' with reshape and transpose:
In [4]: ary.reshape(3,2,2,2).transpose(0,2,1,3)
Out[4]:
array([[[[ 0, 1],
[ 4, 5]],
[[ 2, 3],
[ 6, 7]]],
[[[ 8, 9],
[12, 13]],
[[10, 11],
[14, 15]]],
[[[16, 17],
[20, 21]],
[[18, 19],
[22, 23]]]])
In [5]: np.sqrt(_.sum(axis=(2,3)))
Out[5]:
array([[3.16227766, 4.24264069],
[6.4807407 , 7.07106781],
[8.60232527, 9.05538514]])
While the transpose makes it easier to visualize the blocks that need to be summed, it isn't necessary:
In [7]: np.sqrt(ary.reshape(3,2,2,2).sum(axis=(1,3)))
Out[7]:
array([[3.16227766, 4.24264069],
[6.4807407 , 7.07106781],
[8.60232527, 9.05538514]])
np.lib.stride_tricks.sliding_window_view doesn't give us as much direct control as I thought, but
np.lib.stride_tricks.sliding_window_view(ary,(2,2))[::2,::2]
gives the same result as Out[4].
In [13]: np.sqrt(np.lib.stride_tricks.sliding_window_view(ary,(2,2))[::2,::2].sum(axis=(2,3)))
Out[13]:
array([[3.16227766, 4.24264069],
[6.4807407 , 7.07106781],
[8.60232527, 9.05538514]])
The reshape-only version in [7] is faster.
In general, it can be done like this:
a_height = 15
a_width = 16
a_area = a_height * a_width
a = np.arange(a_area).reshape(a_height, a_width)
window_height = 3 # must evenly divide a_height
window_width = 4 # must evenly divide a_width
b_height = a_height // window_height
b_width = a_width // window_width
b = a.reshape(b_height, window_height, b_width, window_width).transpose(0,2,1,3)
# or, assuming you want sum or another function that takes `axis` argument
b = a.reshape(b_height, window_height, b_width, window_width).sum(axis=(1,3))
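The reshape approach can be wrapped in a small helper so that the window size and the reducing function are both parameters. This is a minimal sketch: the name apply_over_blocks is made up here, and it assumes the function accepts a tuple for its axis argument (as np.sum, np.max and np.mean do).

```python
import numpy as np

def apply_over_blocks(a, window_height, window_width, func=np.sum):
    """Reduce each non-overlapping window of a 2D array with func."""
    height, width = a.shape
    if height % window_height or width % window_width:
        raise ValueError("window size must evenly divide the array shape")
    # Axes 1 and 3 index the positions inside each window.
    blocks = a.reshape(height // window_height, window_height,
                       width // window_width, window_width)
    return func(blocks, axis=(1, 3))

ary = np.arange(24).reshape(6, 4)
result = np.sqrt(apply_over_blocks(ary, 2, 2))           # sqrt of block sums
block_max = apply_over_blocks(ary, 2, 2, func=np.max)    # another reduction
print(result)
print(block_max)
```

For a function that does not take an axis argument, the transposed form from In [4] can be reshaped to (rows, cols, -1) and iterated over instead.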
I wonder if there is a way of looping through a number of arrays of different sizes and trimming data from the beginning of each array in order to achieve the same amount of elements in each array?
For instance, if I have:
A = [4 3 9 8 13]
B = [15 2 6 11 1 12 8 9 10 13 4]
C = [2 3 11 12 10 9 15 4 14]
and I want B and C to lose some elements at the beginning, such that they end up being 5 elements in length, just like A, to achieve:
A = [4 3 9 8 13]
B = [8 9 10 13 4]
C = [10 9 15 4 14]
How would I do that?
EDIT/UPDATE:
I have accepted the answer proposed by @excaza, who wrote a nice function called "naivetrim". I saved that function as a .m file and then used it: first I defined my three arrays and then, as @excaza suggests, called the function:
[A, B, C] = naivetrim(A, B, C);
Another variation that worked for me is based on @Sardar_Usama's answer below, put into a loop. I liked this as well because it is a bit more straightforward (at my level, I can follow what is happening in the code):
A = [4 3 9 8 13]
B = [15 2 6 11 1 12 8 9 10 13 4]
C = [2 3 11 12 10 9 15 4 14]
arrays = {A,B,C}
temp = min([numel(A),numel(B), numel(C)]); %finding the minimum number of elements
% Storing only required elements
for i = 1:size(arrays,2)
currentarray = arrays{i}
arrays(i) = {currentarray(end-temp+1:end)}
end
A naive looped solution:
function testcode()
% Sample data arrays
A = [4, 3, 9, 8, 13];
B = [15, 2, 6, 11, 1, 12, 8, 9, 10, 13, 4];
C = [2, 3, 11, 12, 10, 9, 15, 4, 14];
[A, B, C] = naivetrim(A, B, C);
end
function varargout = naivetrim(varargin)
% Assumes all inputs are vectors
% Find minimum length
lengths = zeros(1, length(varargin), 'uint32'); % Preallocate
for ii = 1:length(varargin)
lengths(ii) = length(varargin{ii});
end
% Loop through input arrays and trim any that are longer than the shortest
% input vector
minlength = min(lengths);
varargout = cell(size(varargin)); % Preallocate
for ii = 1:length(varargout)
if length(varargin{ii}) >= minlength
varargout{ii} = varargin{ii}(end-minlength+1:end);
end
end
end
Which returns:
A =
4 3 9 8 13
B =
8 9 10 13 4
C =
10 9 15 4 14
If you have a large number of arrays you may be better off with alternative intermediate storage data types, like cells or structures, which would be "simpler" to assign and iterate through.
Timing code for a few different similar approaches can be found in this Gist.
Performance Profile, MATLAB (R2016b)
Number of Elements in A: 999999
Number of Elements in B: 424242
Number of Elements in C: 101325
Trimming, deletion: 0.012537 s
Trimming, copying: 0.000430 s
Trimming, cellfun copying: 0.000493 s
If there are not many matrices then it can be done as:
temp = min([numel(A),numel(B), numel(C)]); %finding the minimum number of elements
% Storing only required elements
A = A(end-temp+1:end);
B = B(end-temp+1:end);
C = C(end-temp+1:end);
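For comparison, the same keep-the-tail idea can be sketched in NumPy. This is not part of the MATLAB answers above; the helper name trim_to_shortest is made up for illustration, and it assumes all inputs are 1-D.

```python
import numpy as np

def trim_to_shortest(*arrays):
    """Drop leading elements so every array has the length of the shortest."""
    n = min(len(a) for a in arrays)
    # Slice from len(a) - n rather than -n, so that n == 0 yields empty arrays.
    return [np.asarray(a)[len(a) - n:] for a in arrays]

A = [4, 3, 9, 8, 13]
B = [15, 2, 6, 11, 1, 12, 8, 9, 10, 13, 4]
C = [2, 3, 11, 12, 10, 9, 15, 4, 14]

A_t, B_t, C_t = trim_to_shortest(A, B, C)
print(A_t, B_t, C_t)
```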
I have two giant arrays which look like:
A = [11, 11, 12, 3, 3, 4, 4, 4 ];
B = [ 12, 4; 3, 11; 11, 1; 4, 13 ];
I want to create an array which takes values from B and column 1 from A to look like:
C = [ 11, 1; 11, 1; 12, 4; 3, 11; 3, 11; 4, 13; 4, 13; 4, 13 ];
To keep the process fast, I don't want to use a for loop or any other kind of loop.
Sorry for being terse.
I will search each element from column 1 of A in column 1 of B and pick the corresponding column 2 elements from B and create a new array with column 1 elements of A and discovered column 2 elements from B.
What you are doing in this problem is using A and searching the first column of B to see if there's a match. Once there's a match, extract out the row that corresponds to this matched location in B. Repeat this for the rest of the values in A.
Assuming that all values of A can be found in B and that the first column of B contains no duplicates, you can use a unique call and a sortrows call. The unique call is on A so that each value in A is assigned a unique label in sorted order. You would then use these labels to index into the sorted version of B to get your desired result:
[~,~,id] = unique(A);
Bs = sortrows(B);
C = Bs(id,:);
We get for C:
C =
11 1
11 1
12 4
3 11
3 11
4 13
4 13
4 13
Thanks to @rayryeng for clarifying the question to me.
Assuming each element from A is present in column 1 of B:
[~, ind] = max(bsxfun(@eq, A(:).', B(:,1)), [], 1);
C = B(ind,:);
If that assumption doesn't necessarily hold:
[val, ind] = max(bsxfun(@eq, A(:).', B(:,1)), [], 1);
C = B(ind(val),:);
So for example A = [11, 20, 12, 3, 3, 4, 4, 4 ]; would produce
C =
11 1
12 4
3 11
3 11
4 13
4 13
4 13
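For readers working in NumPy, the bsxfun comparison corresponds to a broadcast equality test. The following is only a rough analogue of the second variant, assuming the first column of B has no duplicates; keys of A that are missing from B (20 here) are dropped.

```python
import numpy as np

A = np.array([11, 20, 12, 3, 3, 4, 4, 4])
B = np.array([[12, 4], [3, 11], [11, 1], [4, 13]])

# eq[i, j] is True where B[i, 0] == A[j]; shape is (rows of B, len(A)).
eq = B[:, :1] == A[None, :]

found = eq.any(axis=0)     # which elements of A occur in B's first column
ind = eq.argmax(axis=0)    # row of B holding the first match (0 if none)

C = B[ind[found]]
print(C)
```

The comparison matrix has len(A) times len(B) entries, so for very large inputs a sorted lookup may use far less memory.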
MATLAB:
I have two m-by-n matrices, A and B. I want to make a set of n m-by-2 matrices such that, in the ith matrix of the set, the first column is the ith column of A and the second column is the ith column of B.
How do I extract and concatenate the ith columns from both matrices?
How can I store these n matrices? Using loops? (Memory?)
Example:
Input:
A = [ 1, 2, 3; 4, 5 ,6; 7, 8, 9] (3x3 matrix)
B = [ 11, 22, 33; 44, 55 ,66; 77, 88, 99] (3x3 matrix)
Output:
For i=1:3
C1 = [1, 11; 4, 44; 7, 77]
C2 = [2, 22; 5, 55; 8, 88]
C3 = [3, 33; 6, 66; 9, 99]
The first thing I'm going to do is change your variable names. Mainly this is just to make referring to the variables easier, especially as m and n change. Instead of writing
C1(:,:)
C2(:,:)
...
Cn(:,:)
I'm going to write
C(:,:,1)
C(:,:,2)
...
C(:,:,n)
All I've done is moved the index from the variable name to the index of the 3rd dimension.
Now, to create the C array:
A = [ 1, 2, 3; 4, 5 ,6; 7, 8, 9]
B = [ 11, 22, 33; 44, 55 ,66; 77, 88, 99]
[m,n]=size(A)
C = reshape([A',B']', m, 2, n)
The output of this is:
A =
1 2 3
4 5 6
7 8 9
B =
11 22 33
44 55 66
77 88 99
m = 3
n = 3
C =
ans(:,:,1) =
1 11
4 44
7 77
ans(:,:,2) =
2 22
5 55
8 88
ans(:,:,3) =
3 33
6 66
9 99
As you can see, C(:,:,1) is equal to C1 in your example, C(:,:,2) = C2 and so on. And this extends without change as the sizes of A and B change. You never have to come up with new variable names. And all you have to do to know how many m-by-2 matrices you've got is
numVars = size(C,3);
Note: This uses the same technique found in the answer here: matlab - how to merge/interlace 2 matrices?
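The same layout can be sketched in NumPy, where stacking along a new middle axis produces an m-by-2-by-n array directly. This is an analogue for comparison, not a translation of the reshape call above; note that NumPy indices start at 0.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
B = np.array([[11, 22, 33], [44, 55, 66], [77, 88, 99]])

# New axis 1 has length 2: index 0 comes from A, index 1 from B.
C = np.stack((A, B), axis=1)

print(C.shape)       # (m, 2, n)
print(C[:, :, 0])    # counterpart of C(:,:,1) in the MATLAB answer
```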
I have this numpy array:
a = np.array([[[1,2,3],[-1,-2,-3]],[[4,5,6],[-4,-5,-6]]])
I want b to be a transpose of a, like this:
b = np.array([[[1,-1],[2,-2],[3,-3]],[[4,-4],[5,-5],[6,-6]]])
Is it possible to do it in one line?
EDIT:
And if I have this instead:
a = np.empty(2,dtype = object)
a[0] = np.array([[1,2,3],[-1,-2,-3]])
a[1] = np.array([[4,5,6],[-4,-5,-6]])
How can I get b?
You can do it using np.transpose(a,(0,2,1)):
In [26]: a = np.array([[[1,2,3],[-1,-2,-3]],[[4,5,6],[-4,-5,-6]]])
In [27]: b = np.transpose(a,(0,2,1))
In [28]: print(a)
[[[ 1 2 3]
[-1 -2 -3]]
[[ 4 5 6]
[-4 -5 -6]]]
In [29]: print(b)
[[[ 1 -1]
[ 2 -2]
[ 3 -3]]
[[ 4 -4]
[ 5 -5]
[ 6 -6]]]
For your edited question with an array of dtype=object -- there is no direct way to compute the transpose, because numpy doesn't know how to transpose a generic object. However, you can use list comprehension and transpose each object separately:
In [90]: a = np.empty(2,dtype = object)
In [91]: a[0] = np.array([[1,2,3],[-1,-2,-3]])
In [92]: a[1] = np.array([[4,5,6],[-4,-5,-6]])
In [93]: print(a)
[[[ 1 2 3]
[-1 -2 -3]] [[ 4 5 6]
[-4 -5 -6]]]
In [94]: b = np.array([np.transpose(o) for o in a],dtype=object)
In [95]: print(b)
[[[ 1 -1]
[ 2 -2]
[ 3 -3]]
[[ 4 -4]
[ 5 -5]
[ 6 -6]]]
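One caveat: when all elements have the same shape, np.array(..., dtype=object) builds a multi-dimensional object array rather than a 1-D array of separate arrays. If the result should stay a 1-D object array, for example because the elements may differ in shape, one option is to fill a preallocated object array. In this sketch the second element has a made-up shape to illustrate that case.

```python
import numpy as np

a = np.empty(2, dtype=object)
a[0] = np.array([[1, 2, 3], [-1, -2, -3]])
a[1] = np.array([[4, 5], [-4, -5]])   # hypothetical element with another shape

# Preallocate, then transpose each stored array individually.
b = np.empty(len(a), dtype=object)
for i, element in enumerate(a):
    b[i] = element.T

print(b.shape)
print(b[0])
print(b[1])
```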