I have a 2D NumPy array and I want the maximum value in each 2D rolling window, where the window slides from left to right and top to bottom, moving one row or column at a time. The most naive method is to iterate over all rolling windows and take the maximum of the values each one encloses. I wrote this method down below:
import numpy as np
shape=(1050,300)
window_size=(120,60)
a = np.arange(shape[1]*shape[0]).reshape(shape[1],shape[0])
max_Map=np.full((shape[1]-window_size[1]+1,shape[0]-window_size[0]+1),0,dtype='uint32')
for i in range(shape[1]-window_size[1]+1):
    for j in range(shape[0]-window_size[0]+1):
        window_max=np.max(a[i:i+window_size[1],j:j+window_size[0]])
        max_Map[i][j]=window_max
But this is terribly inefficient: only two rows (or two columns) change between consecutive windows, yet my code does not exploit the overlap between them. One improvement I could think of (assuming the window rolls horizontally) is to compute, for each window, the maximum of the leftmost column and the maximum of the remaining columns, and take the larger of the two as the window maximum. For the next window, the maximum would then be the larger of the newly added column's maximum and the previous remaining-columns maximum... But I still don't think this is optimal...
I would really appreciate it if someone could point me in the right direction. I feel like this should be a well-studied problem, but I couldn't find solutions anywhere...
Thanks in advance!
Approach #1 Using Scipy's 2D max filter -
from scipy.ndimage import maximum_filter as maxf2D
# Store shapes of inputs
N,M = window_size
P,Q = a.shape
# Use 2D max filter and slice out elements not affected by boundary conditions
maxs = maxf2D(a, size=(M,N))
max_Map_Out = maxs[M//2:(M//2)+P-M+1, N//2:(N//2)+Q-N+1]
Approach #2 Using scikit-image's 2D sliding window views -
from skimage.util import view_as_windows
N,M = window_size
max_Map_Out = view_as_windows(a, (M,N)).max(axis=(-2,-1))
Note on window size and its use: The original approach applies the window sizes in a flipped manner, i.e. the first element of window_size slides along the second axis, while the second element governs how the window slides along the first axis. This might not be the case for other problems that do sliding max filtering, where we usually use the first element for the first axis of the 2D array and the second element for the second axis. To handle those cases, simply use M,N = window_size and keep the rest of the code as it is.
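If scikit-image is not available, the same windowed view can be sketched with NumPy alone. This assumes NumPy 1.20 or newer, where sliding_window_view was added. The helper name rolling_max_2d and the small test array are made up for illustration, and window_size follows the question's flipped (width, height) convention. Like Approach #2, it reduces over every window explicitly, so it should not be expected to compete with the max filter on large windows.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def rolling_max_2d(a, window_size):
    # window_size is (width, height), i.e. flipped, as in the question
    N, M = window_size
    # windows has shape (P-M+1, Q-N+1, M, N); it is a view, no data is copied
    windows = sliding_window_view(a, (M, N))
    # Reduce over the two trailing window axes
    return windows.max(axis=(-2, -1))

a = np.arange(30).reshape(5, 6)
result = rolling_max_2d(a, (3, 2))  # windows of 2 rows by 3 columns
print(result.shape)  # (4, 4)
```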
Runtime test
Approaches -
def org_app(a, window_size):
    shape = a.shape[1], a.shape[0]
    max_Map=np.full((shape[1]-window_size[1]+1,
                     shape[0]-window_size[0]+1),0,dtype=a.dtype)
    for i in range(shape[1]-window_size[1]+1):
        for j in range(shape[0]-window_size[0]+1):
            window_max=np.max(a[i:i+window_size[1],j:j+window_size[0]])
            max_Map[i][j]=window_max
    return max_Map

def maxf2D_app(a, window_size):
    N,M = window_size
    P,Q = a.shape
    maxs = maxf2D(a, size=(M,N))
    return maxs[M//2:(M//2)+P-M+1, N//2:(N//2)+Q-N+1]

def view_window_app(a, window_size):
    N,M = window_size
    return view_as_windows(a, (M,N)).max(axis=(-2,-1))
Timings and verification -
In [573]: # Setup inputs
...: shape=(1050,300)
...: window_size=(120,60)
...: a = np.arange(shape[1]*shape[0]).reshape(shape[1],shape[0])
...:
In [574]: np.allclose(org_app(a, window_size), maxf2D_app(a, window_size))
Out[574]: True
In [575]: np.allclose(org_app(a, window_size), view_window_app(a, window_size))
Out[575]: True
In [576]: %timeit org_app(a, window_size)
1 loops, best of 3: 2.11 s per loop
In [577]: %timeit view_window_app(a, window_size)
1 loops, best of 3: 1.14 s per loop
In [578]: %timeit maxf2D_app(a, window_size)
100 loops, best of 3: 3.09 ms per loop
In [579]: 2110/3.09 # Speedup using Scipy's 2D max filter over original approach
Out[579]: 682.8478964401295
I've just started using for loops in MATLAB in programming class and the basic stuff is going fine. However, I've been asked to "use loops to create a 3 x 5 matrix in which the value of each element is its row number to the power of its column number divided by the sum of its row number and column number". For example, the value of element (2,3) is 2^3 / (2+3) = 1.6.
So what sort of looping do I need to use to enable me to start new lines to form a matrix?
Since you need to know the row and column numbers (and only because you have to use loops), for-loops are a natural choice. This is because a for-loop will automatically keep track of your row and column number for you if you set it up right. More specifically, you want a nested for loop, i.e. one for loop within another. The outer loop might loop through the rows and the inner loop through the columns for example.
As for starting new lines in a matrix, growing a matrix inside a loop is extremely bad practice. You should pre-allocate your matrix instead, which avoids repeated reallocation and can have a major performance impact on your code. Pre-allocation is most commonly done using the zeros function.
e.g.
num_rows = 3;
num_cols = 5;
M = zeros(num_rows,num_cols); %// Preallocation of memory so you don't grow your matrix in your loop
for row = 1:num_rows
    for col = 1:num_cols
        M(row,col) = (row^col)/(row+col);
    end
end
But the most efficient way to do it is probably not to use loops at all but do it in one shot using ndgrid:
[R, C] = ndgrid(1:num_rows, 1:num_cols);
M = (R.^C)./(R+C);
The command bsxfun is very helpful for such problems. It will do all the looping and preallocation for you.
eg:
bsxfun(@(x,y) x.^y./(x+y), (1:3)', 1:5)
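For comparison, the same matrix can be built in NumPy, where broadcasting plays the role that bsxfun plays above. This is only a sketch in a different language from the answers in this thread, and the variable names are illustrative.

```python
import numpy as np

rows = np.arange(1, 4).reshape(-1, 1)  # column vector 1..3
cols = np.arange(1, 6)                 # row vector 1..5

# Broadcasting expands both operands to 3x5 before the elementwise operations
M = rows**cols / (rows + cols)

# Element (2,3) in 1-based terms is M[1, 2] here: 2^3 / (2+3)
print(M[1, 2])
```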
I have an image stored as an array. I want to get the value that comes right after the minimum value. I hope you understand me, because I don't speak English very well.
The minimum pixel value in this image is -3.40282e+38. I want to know the next value after -3.40282e+38.
It should be something like 0.3 or 0.4.
I tried image.min(), but it prints -3.40282e+38. I need the next value after that.
I also tried:
minimo = img.min()
for i in range(rows):
    for j in range(cols):
        for k in img[i,j]:
            if k> minimo:
                print k.min()
but I got this error
TypeError: 'numpy.float32' object is not iterable
You can do it like this:
import numpy as np
sorted_vec = np.unique(img.reshape(-1))
second_smallest = sorted_vec[1]
For large arrays, using np.partition will be much faster than sorting the whole array as in @dslack's answer:
import numpy as np
img = np.random.rand(1000, 1000)
# Compute via a full sort
np.unique(img.ravel())[1]
# 3.25658401967e-06
# Compute via a partition
np.partition(img.ravel(), 1)[1]
# 3.25658401967e-06
The two methods give the same results, and we can see that the partition approach is significantly faster:
%timeit np.unique(img.ravel())[1]
# 10 loops, best of 3: 86.8 ms per loop
%timeit np.partition(img.ravel(), 1)[1]
# 100 loops, best of 3: 4.99 ms per loop
The reason for the speed is that partition does not sort the full array, but simply swaps values until all smaller values are to the left of the given index, and all larger values are to the right.
Note that the results will differ if the minimum value is not unique – but it is not clear from your question which output you desire in this case.
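If the minimum occurs more than once (which seems likely when it is a fill value such as -3.40282e+38), one way to get the smallest value strictly greater than the minimum, without sorting, is to mask the minimum out first. A minimal sketch; the small img array below is made-up data, and the approach raises a ValueError if every pixel equals the minimum.

```python
import numpy as np

# Hypothetical image in which the fill value appears several times
fill = np.finfo(np.float32).min
img = np.array([[fill, 0.4, fill],
                [0.3, fill, 0.9]], dtype=np.float32)

# Keep only the pixels strictly above the minimum, then take their minimum
second_smallest = img[img > img.min()].min()
print(second_smallest)
```

With this data, np.partition(img.ravel(), 1)[1] would return the fill value again, because the minimum is repeated.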
For my project, I wish to quickly generate random permutations of a binary array of fixed length and a given number of 1s and 0s. Given these random permutations, I wish to add them elementwise.
I am currently using numpy's ndarray object, which is convenient for adding elementwise. My current code is as follows:
# n is the length of the array. I want to run this across a range of
# n=100 to n=1000.
row = np.zeros(n)
# m_list is a given list of integers. I am iterating over many possible
# combinations of possible values for m in m_list. For example, m_list
# could equal [5, 100, 201], for n = 500.
for m in m_list:
    row += np.random.permutation(np.concatenate([np.ones(m), np.zeros(n - m)]))
My question is, is there any faster way to do this? According to timeit, 1000000 calls of "np.random.permutation(np.concatenate([np.ones(m), np.zeros(n - m)]))" takes 49.6 seconds. For my program's purposes, I'd like to decrease this by an order of magnitude. Can anyone suggest a faster way to do this?
Thank you!
For me, a version that allocates the arrays outside the loop was faster, but not by much: 8% or so, measured with cProfile.
row = np.zeros(n, dtype=np.float64)
wrk = np.zeros(n, dtype=np.float64)
for m in m_list:
    wrk[0:m] = 1.0
    wrk[m:n] = 0.0
    row += np.random.permutation(wrk)
You might try shuffling wrk in place with np.random.shuffle(wrk) instead of getting a new array back from permutation, but for me the difference was negligible.
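Another option, not taken from the answer above, is to skip building the 0/1 array entirely: choosing m distinct positions at random and incrementing them is equivalent to adding a random permutation of m ones and n - m zeros. This is a sketch assuming NumPy 1.17 or newer for default_rng; the values of n and m_list come from the question's example, and whether it is actually faster for your sizes would need to be timed.

```python
import numpy as np

rng = np.random.default_rng()
n = 500
m_list = [5, 100, 201]

row = np.zeros(n)
for m in m_list:
    # Pick m distinct positions uniformly at random; these are where the 1s land
    idx = rng.choice(n, size=m, replace=False)
    # The indices are distinct, so fancy-indexed += increments each exactly once
    row[idx] += 1.0
```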
I have a 3 dimensional array (10x3x3) in Matlab and I want to change any value greater than 999 to Inf. However, I only want this to apply to (:,:,2:3) of this array.
All the help I have found online seems to only apply to the whole array, or 1 column of a 2D array. I can't work out how to apply this to a 3D array.
I have tried the following code, but it becomes a 69x3x3 array after I run it, and I don't really get why. I tried to copy the code from someone using a 2D array, so I just think I don't really understand what the code is doing.
A(A(:,:,2)>999,2)=Inf;
A(A(:,:,3)>999,3)=Inf;
One approach with logical indexing -
mask = A>999; %// get the 3D mask
mask(:,:,1) = 0; %// set all elements in the first 3D slice to zeros,
%// to neglect their effect when we mask the entire input array with it
A(mask) = Inf %// finally mask and set them to Infs
Another with linear indexing -
idx = find(A>999); %// Find linear indices that match the criteria
valid_idx = idx(idx>size(A,1)*size(A,2)) %// Select indices from 2nd 3D slice onwards
A(valid_idx)=Inf %// Set to Infs
Or yet another with linear indexing, almost same as the previous one with the valid index being calculated in one step and thus enabling us a one-liner -
A(find(A(:,:,2:3)>999) + size(A,1)*size(A,2))=Inf
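For readers doing the same thing in NumPy rather than MATLAB, the masking idea carries over to a sliced view. A minimal sketch with made-up data; note that NumPy indices are 0-based, so MATLAB's slices 2:3 become 1: here.

```python
import numpy as np

A = np.full((10, 3, 3), 1000.0)  # made-up data: every value exceeds 999

sub = A[:, :, 1:]        # basic slicing returns a view of slices 2 and 3
sub[sub > 999] = np.inf  # boolean-mask assignment writes through to A
```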
I am new to Matlab.
Let's say I have an array a = [1:1:1000].
I have to divide this into 50 parts 1-20; 21-40 .... 981-1000.
I am trying to do it this way.
E=1000X
a=[1:E]
n=50
d=E/n
b=[]
for i=0:n
b(i)=a[i:d]
end
But I am unable to get the result.
And the second part I am working on is, depending on another result, say if my answer is 3, the first split array should have a counter and that should be +1, if the answer is 45 the 3rd split array's counter should be +1 and so on and in the end I have to make a histogram of all the counters.
You can do all of this with one function: histc. In your situation:
X = (1:1:1000)';
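Edges = (1:20:1001)'; %// 51 edges; the extra edge at 1001 closes the last bin (981-1000)
Count = histc(X, Edges); %// Count(1:50) are the bin counts; Count(51) only counts values equal to 1001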
Essentially, Count contains the number of elements in X that fall into the categories defined in Edges, where Edges is a monotonically increasing vector whose elements define the boundaries of sequential categories. A more common example might be to construct X using a probability density, say, the uniform distribution, eg:
X = 1000 * rand(1000, 1);
Play around with specifications for X and Edges and you should get the idea. If you want the actual histogram plot, look into the hist function.
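The same binning can be sketched in NumPy with np.histogram, for readers working in Python instead. The answers array below is hypothetical and only shows how individual results would be tallied into the 50 ranges; note that np.histogram treats the last bin as closed on both ends.

```python
import numpy as np

# 51 edges define 50 bins: [1, 21), [21, 41), ..., [981, 1001]
edges = np.arange(1, 1002, 20)

# Binning the full range 1..1000 puts 20 values in every bin
counts, _ = np.histogram(np.arange(1, 1001), bins=edges)

# Hypothetical results: 3 falls in the first bin, 45 and 46 in the third
answers = np.array([3, 45, 46])
answer_counts, _ = np.histogram(answers, bins=edges)
```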
As for the second part of your question, I'm not really sure what you're asking.