Get the next value after the minimum value in an array

I have an image stored as an array, and I want to get the value that comes just after the minimum value (I hope you understand me, I don't speak English very well).
The minimum pixel value in this image is -3.40282e+38, and I want to know which value comes right after -3.40282e+38.
It should be something like 0.3, 0.4, ...
I tried image.min(), but it prints -3.40282e+38; I need the next value after that.
I also tried:
minimo = img.min()
for i in range(rows):
    for j in range(cols):
        for k in img[i,j]:
            if k > minimo:
                print k.min()
but I got this error
TypeError: 'numpy.float32' object is not iterable

You can do it like this:
import numpy as np
sorted_vec = np.unique(img.reshape(-1))
second_smallest = sorted_vec[1]

For large arrays, using np.partition will be much faster than sorting the array, as in @dslack's answer:
import numpy as np
img = np.random.rand(1000, 1000)
# Compute via a full sort
np.unique(img.ravel())[1]
# 3.25658401967e-06
# Compute via a partition
np.partition(img.ravel(), 1)[1]
# 3.25658401967e-06
The two methods give the same results, and we can see that the partition approach is significantly faster:
%timeit np.unique(img.ravel())[1]
# 10 loops, best of 3: 86.8 ms per loop
%timeit np.partition(img.ravel(), 1)[1]
# 100 loops, best of 3: 4.99 ms per loop
The reason for the speed is that partition does not sort the full array, but simply swaps values until all smaller values are to the left of the given index, and all larger values are to the right.
Note that the results will differ if the minimum value is not unique – but it is not clear from your question which output you desire in this case.
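In your case the minimum is probably not unique: -3.40282e+38 is the lowest representable float32 value and is often used as a no-data marker, so it will typically fill many pixels, and np.partition(...)[1] would then just return that marker again. A minimal sketch of masking it out instead (assuming img is a float32 array in which that value marks invalid pixels):
import numpy as np
# Hypothetical example image; the float32 minimum marks no-data pixels
fill = np.finfo(np.float32).min              # -3.4028235e+38
img = np.array([[fill, 0.4], [0.3, fill]], dtype=np.float32)
# Smallest value strictly greater than the global minimum (ignores every fill pixel)
next_value = img[img > img.min()].min()
print(next_value)                            # 0.3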

How to count for 2 different arrays how many times the elements are repeated, in MATLAB?

I have arrays A (44x1) and B (41x1), and I want to count, for both arrays, how many times each element is repeated. If a repeated value is present in both arrays, I want to divide its counts (for instance: the value 0.5 appears 500 times in A and 350 times in B, so divide 500 by 350).
I have to do this for bigger arrays as well, so I was thinking about using a loop (but I have no idea how to do it in MATLAB).
I got what I want in Python:
import pandas as pd
data1 = pd.read_excel('C:/Users/Desktop/Python/data1.xlsx')
data2 = pd.read_excel('C:/Users/Desktop/Python/data2.xlsx')
for i in data1['Mag'].value_counts() & data2['Mag'].value_counts():
    a = data1['Mag'].value_counts()/data2['Mag'].value_counts()
    print(a)
    break
Any idea of how to do the same on MATLAB? Thanks!
Since you can enumerate all valid earthquake magnitude values, you could use:
% Make up some data
A=randi([2 58],[100 1])/10;
B=randi([2 58],[20 1])/10;
% Round data to nearest tenth
%A=round(A,1); %uncomment if necessary
%B=round(B,1); %same
% Divide frequencies
validmags=0.2:0.1:5.8;
% Relies on implicit expansion: A must be a column vector, validmags a row vector.
% The dimension argument to sum() is only a reminder; double() is not strictly needed.
Afreqs=sum(double( abs(A-validmags)<1e-6 ),1);
Bfreqs=sum(double( abs(B-validmags)<1e-6 ),1); % same for B
Bfreqs./Afreqs
% For a fancier version:
% [{'Magnitude'} num2cell(validmags) ; {'Freq(B)/Freq(A)'} num2cell(Bfreqs./Afreqs)].'
The last line will produce NaN for 0/0, +Inf for n/0, and 0 for 0/n (with n nonzero).
You could also use uniquetol, align the unique values of each vector, and divide the respective absolute frequencies. But I think the above approach is cleaner and easier to understand.
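As an aside (not part of the original answer): in Python/pandas, the frequency division that the question's snippet was aiming for can be written without a loop, because dividing two value_counts Series aligns them on their index automatically. A rough sketch, assuming the 'Mag' columns from the question:
import pandas as pd
# Hypothetical data standing in for the question's Excel columns
data1 = pd.DataFrame({'Mag': [0.5, 0.5, 1.2, 0.5]})
data2 = pd.DataFrame({'Mag': [0.5, 1.2, 1.2]})
# Divide the per-magnitude counts; magnitudes missing from either array come out as NaN
ratio = data1['Mag'].value_counts() / data2['Mag'].value_counts()
print(ratio)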

Python: Finding the row index of a value in 2D array when a condition is met

I have a 2D array PointAndTangent of dimension 8500 x 5. The data is row-wise, with 8500 data rows and 5 data values per row. I need to extract the row index of an element in the 4th column when this condition is met, for any s:
abs(PointAndTangent[:,3] - s) <= 0.005
I just need the row index of the first match for the above condition. I tried using the following:
index = np.all([[abs(s - PointAndTangent[:, 3])<= 0.005], [abs(s - PointAndTangent[:, 3]) <= 0.005]], axis=0)
i = int(np.where(np.squeeze(index))[0])
which doesn't work. I get the following error:
i = int(np.where(np.squeeze(index))[0])
TypeError: only size-1 arrays can be converted to Python scalars
I am not so proficient with NumPy in Python. Any suggestions would be great. I am trying to avoid a for loop, as this is a small part of a huge simulation I am running.
Thanks!
Possible Solution
I used the following
idx = (np.abs(PointAndTangent[:,3] - s)).argmin()
It seems to work. It returns the row index of the nearest value to s in the 4th column.
You were almost there. np.where is one of the most abused functions in numpy. Half the time, you really want np.nonzero, and the other half, you want to use the boolean mask directly. In your case, you want np.flatnonzero or np.argmax:
mask = abs(PointAndTangent[:,3] - s) <= 0.005
mask is a 1D boolean array, True where the condition is met and False elsewhere. You can get the indices of all the True entries with flatnonzero and select the first one:
index = np.flatnonzero(mask)[0]
Alternatively, you can select the first one directly with argmax:
index = np.argmax(mask)
The solutions behave differently when no row meets your condition. The former does indexing, so it will raise an IndexError. The latter will return zero, which can also be a legitimate result.
Both can be written as a one-liner by replacing mask with the expression that was assigned to it.
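A minimal end-to-end sketch of the argmax variant, with a guard for the no-match case (the array and s below are made-up stand-ins for the question's data):
import numpy as np
# Hypothetical stand-ins for the question's 8500 x 5 array and the value s
PointAndTangent = np.random.rand(8500, 5)
s = 0.5
mask = np.abs(PointAndTangent[:, 3] - s) <= 0.005
if mask.any():
    index = np.argmax(mask)   # index of the first row satisfying the condition
else:
    index = None              # no row satisfies the condition
print(index)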

Getting the maximum in each rolling window of a 2D numpy array

I have a 2D numpy array and I want to get the maximum value contained in each 2D rolling window, sliding from left to right and top to bottom, moving one row or one column at a time. The most naive method would be to iterate through all rolling windows and take the maximum of all values enclosed in each one. I wrote down this method below:
import numpy as np
shape=(1050,300)
window_size=(120,60)
a = np.arange(shape[1]*shape[0]).reshape(shape[1],shape[0])
max_Map=np.full((shape[1]-window_size[1]+1,shape[0]-window_size[0]+1),0,dtype='uint32')
for i in range(shape[1]-window_size[1]+1):
    for j in range(shape[0]-window_size[0]+1):
        window_max=np.max(a[i:i+window_size[1],j:j+window_size[0]])
        max_Map[i][j]=window_max
But this is terribly inefficient: only two rows (or two columns) change between consecutive windows, yet my code doesn't exploit any overlap between them. One improvement I could think of: for each window (assuming it rolls horizontally), compute the maximum of the leftmost column and the maximum of the remaining columns, and take the larger of the two as the current window maximum; for the next window, the maximum is the max of the newly added column and the previously computed remainder... But I still don't think this is optimal...
I would really appreciate it if someone could point me in the right direction. I feel like this should be a well-studied problem, but I couldn't find solutions anywhere...
Thanks in advance!
Approach #1 Using Scipy's 2D max filter -
from scipy.ndimage.filters import maximum_filter as maxf2D
# Store shapes of inputs
N,M = window_size
P,Q = a.shape
# Use 2D max filter and slice out elements not affected by boundary conditions
maxs = maxf2D(a, size=(M,N))
max_Map_Out = maxs[M//2:(M//2)+P-M+1, N//2:(N//2)+Q-N+1]
Approach #2 Using Scikit's 2D sliding window views -
from skimage.util.shape import view_as_windows
N,M = window_size
max_Map_Out = view_as_windows(a, (M,N)).max(axis=(-2,-1))
Note on window size and its use : The original approach has the window sizes aligned in a flipped manner, i.e. the first shape parameter of window_size slides along the second axis, while the second shape parameter decides how the window slides along the first axis. This might not be the case for other problems that do sliding max filtering, where we usually use the first shape parameter for the first axis of the 2D array and similarly for the second shape parameter. So, to solve for those cases, simply use : M,N = window_size and use the rest of the codes as they are.
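As an aside (not part of the original answer): on NumPy 1.20 or newer, the same windowed view is available without scikit-image through np.lib.stride_tricks.sliding_window_view. A sketch, assuming the same a and window_size as above:
import numpy as np
# Equivalent of Approach #2 using NumPy's built-in sliding windows (NumPy >= 1.20)
N, M = window_size
max_Map_Out = np.lib.stride_tricks.sliding_window_view(a, (M, N)).max(axis=(-2, -1))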
Runtime test
Approaches -
def org_app(a, window_size):
    shape = a.shape[1], a.shape[0]
    max_Map = np.full((shape[1]-window_size[1]+1,
                       shape[0]-window_size[0]+1), 0, dtype=a.dtype)
    for i in range(shape[1]-window_size[1]+1):
        for j in range(shape[0]-window_size[0]+1):
            window_max = np.max(a[i:i+window_size[1], j:j+window_size[0]])
            max_Map[i][j] = window_max
    return max_Map

def maxf2D_app(a, window_size):
    N,M = window_size
    P,Q = a.shape
    maxs = maxf2D(a, size=(M,N))
    return maxs[M//2:(M//2)+P-M+1, N//2:(N//2)+Q-N+1]

def view_window_app(a, window_size):
    N,M = window_size
    return view_as_windows(a, (M,N)).max(axis=(-2,-1))
Timings and verification -
In [573]: # Setup inputs
...: shape=(1050,300)
...: window_size=(120,60)
...: a = np.arange(shape[1]*shape[0]).reshape(shape[1],shape[0])
...:
In [574]: np.allclose(org_app(a, window_size), maxf2D_app(a, window_size))
Out[574]: True
In [575]: np.allclose(org_app(a, window_size), view_window_app(a, window_size))
Out[575]: True
In [576]: %timeit org_app(a, window_size)
1 loops, best of 3: 2.11 s per loop
In [577]: %timeit view_window_app(a, window_size)
1 loops, best of 3: 1.14 s per loop
In [578]: %timeit maxf2D_app(a, window_size)
100 loops, best of 3: 3.09 ms per loop
In [579]: 2110/3.09 # Speedup using Scipy's 2D max filter over original approach
Out[579]: 682.8478964401295

Fast Random Permutation of Binary Array

For my project, I wish to quickly generate random permutations of a binary array of fixed length and a given number of 1s and 0s. Given these random permutations, I wish to add them elementwise.
I am currently using numpy's ndarray object, which is convenient for adding elementwise. My current code is as follows:
# n is the length of the array. I want to run this across a range of
# n=100 to n=1000.
row = np.zeros(n)
# m_list is a given list of integers. I am iterating over many possible
# combinations of possible values for m in m_list. For example, m_list
# could equal [5, 100, 201], for n = 500.
for m in m_list:
    row += np.random.permutation(np.concatenate([np.ones(m), np.zeros(n - m)]))
My question is, is there any faster way to do this? According to timeit, 1,000,000 calls of np.random.permutation(np.concatenate([np.ones(m), np.zeros(n - m)])) take 49.6 seconds. For my program's purposes, I'd like to decrease this by an order of magnitude. Can anyone suggest a faster way to do this?
Thank you!
For me, the version with the array allocated outside the loop was faster, but not by much: about 8%, measured with cProfile:
row = np.zeros(n, dtype=np.float64)
wrk = np.zeros(n, dtype=np.float64)
for m in m_list:
    wrk[0:m] = 1.0
    wrk[m:n] = 0.0
    row += np.random.permutation(wrk)
You might try shuffling wrk in place with np.random.shuffle(wrk) instead of having permutation return another array, but for me the difference was negligible.
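A minimal sketch of that in-place variant, using the example values n = 500 and m_list = [5, 100, 201] from the question:
import numpy as np
n = 500
m_list = [5, 100, 201]
row = np.zeros(n)
wrk = np.empty(n)
for m in m_list:
    wrk[:m] = 1.0           # m ones ...
    wrk[m:] = 0.0           # ... followed by n - m zeros
    np.random.shuffle(wrk)  # shuffles wrk in place; nothing is returned
    row += wrk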

NumPy change elements of an array with a given probability

I have an array of random numbers. I want to change only some of the elements based on a probability of say 0.07. Currently I am doing this using a for loop to iterate over every element. Is there a better way of doing this?
If the array in question is called a, you can select an average proportion of 0.07 of its values by
a[numpy.random.rand(*a.shape) < 0.07]
I don't know how you want to change these values. To multiply them by two, just do
a[numpy.random.rand(*a.shape) < 0.07] *= 2.0
Sven's answer is elegant. However, it is much faster to pick the elements you want to change with
n = numpy.random.binomial(len(a), 0.07)
a[numpy.random.randint(0, len(a), size=n)] *= 2.0
The first expression determines how many elements you want to sample (n is an integer between 0 and len(a), on average 0.07 * len(a)), and the second generates exactly that many indices. (Note, however, that you might get the same index several times.)
The difference to
a[numpy.random.rand(len(a)) < p]
becomes small as p approaches 1, but for small p, it might be a factor of 10 or more.
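If drawing the same index more than once is a problem, a variant (a sketch, not from the answers above) picks distinct indices with numpy.random.choice:
import numpy as np
a = np.random.rand(1000)            # hypothetical example array
p = 0.07
n = np.random.binomial(len(a), p)   # how many elements to change, on average p * len(a)
idx = np.random.choice(len(a), size=n, replace=False)  # n distinct indices
a[idx] *= 2.0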
