BatchNormalization backpropagation

While doing the cs231n assignment, I got stuck on batchnorm_backward_alt (an efficient version of the BatchNormalization backward pass). I've tried many ways, but failed. Please help me with this; I would be grateful for an explanation.
My code:
def batchnorm_backward_alt(dout, cache):
    dx, dgamma, dbeta = None, None, None
    N = dout.shape[0]
    xmu, var, xhat, gamma, eps, mu = cache
    sigma = np.sqrt(var + eps)
    dbeta = np.sum(dout, axis=0)
    dgamma = np.sum(dout * xhat, axis=0)
    dy = dout * gamma
    dsig = -xmu / (sigma**2)
    dvar = np.sum(dy * dsig / (2 * sigma), axis=0)
    dmu = np.sum(-dy / sigma + dvar * xmu * 2 / N, axis=0)
    dx = dvar * 2 * xmu / N + dmu / N
    return dx, dgamma, dbeta
I wrote this code based on the chain rule that I calculated by hand.
The answer code that I found:
dx, dgamma, dbeta = None, None, None
N, D = dout.shape
xhat, gamma, xmu, ivar, sqrtvar, var, eps = cache
dxhat = dout * gamma
dx = 1.0/N * ivar * (N*dxhat - np.sum(dxhat, axis=0) - xhat*np.sum(dxhat*xhat, axis=0))
dbeta = np.sum(dout, axis=0)
dgamma = np.sum(xhat*dout, axis=0)
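For intuition (this is my own summary, not from the original post): with dxhat = dout * gamma, the closed-form gradient is dx_i = (1/(N*sigma)) * (N*dxhat_i - sum_j dxhat_j - xhat_i * sum_j dxhat_j*xhat_j), which is exactly the answer's one-liner. As far as I can tell, the asker's dvar and dmu are consistent with the chain rule, but the final dx line drops the direct-path term dy/sigma; adding it gives dx = dy/sigma + dvar*2*xmu/N + dmu/N, which is algebraically the same as the answer's formula. A minimal numeric gradient check (a sketch; batchnorm_forward_ref and its cache layout are my assumptions matching the answer code, not the assignment's exact API):
import numpy as np

def batchnorm_forward_ref(x, gamma, beta, eps=1e-5):
    # reference forward pass using the cache layout of the answer code
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    sqrtvar = np.sqrt(var + eps)
    ivar = 1.0 / sqrtvar
    xmu = x - mu
    xhat = xmu * ivar
    out = gamma * xhat + beta
    return out, (xhat, gamma, xmu, ivar, sqrtvar, var, eps)

np.random.seed(0)
N, D = 4, 5
x = np.random.randn(N, D)
gamma, beta = np.random.randn(D), np.random.randn(D)
dout = np.random.randn(N, D)

out, cache = batchnorm_forward_ref(x, gamma, beta)
xhat, _, xmu, ivar, sqrtvar, var, eps = cache
dxhat = dout * gamma
dx = 1.0 / N * ivar * (N * dxhat - np.sum(dxhat, axis=0)
                       - xhat * np.sum(dxhat * xhat, axis=0))

# finite-difference approximation of the gradient of sum(out * dout) w.r.t. x
h = 1e-6
dx_num = np.zeros_like(x)
for i in range(N):
    for j in range(D):
        xp, xm = x.copy(), x.copy()
        xp[i, j] += h
        xm[i, j] -= h
        fp = np.sum(batchnorm_forward_ref(xp, gamma, beta)[0] * dout)
        fm = np.sum(batchnorm_forward_ref(xm, gamma, beta)[0] * dout)
        dx_num[i, j] = (fp - fm) / (2 * h)

print(np.max(np.abs(dx - dx_num)))  # should be on the order of 1e-8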

Related

Python code has a big bottleneck, but I am not experienced enough to see where it is

My code is supposed to model the average energy for alpha decay. It works, but it is very slow.
import numpy as np
from numpy import sin, cos, arccos, pi, arange, fromiter
import matplotlib.pyplot as plt
from random import choices

r_cell, d, r, R, N = 5.5, 15.8, 7.9, 20, arange(1, 10000, 50)

def total_decay(N):
    theta = 2*pi*np.random.rand(2, N)
    phi = arccos(2*np.random.rand(2, N) - 1)
    x = fromiter((r*sin(phi[0][i])*cos(theta[0][i]) for i in range(N)), float, count=-1)
    dx = fromiter((x[i] + R*sin(phi[1][i])*cos(theta[1][i]) for i in range(N)), float, count=-1)
    y = fromiter((r*sin(phi[0][i])*sin(theta[0][i]) for i in range(N)), float, count=-1)
    dy = fromiter((y[i] + R*sin(phi[1][i])*sin(theta[1][i]) for i in range(N)), float, count=-1)
    z = fromiter((r*cos(phi[0][i]) for i in range(N)), float, count=-1)
    dz = fromiter((z[i] + R*cos(phi[1][i]) for i in range(N)), float, count=-1)
    return x, y, z, dx, dy, dz

def inter(x, y, z, dx, dy, dz, N):
    intersections = 0
    for i in range(N):  # checks whether a line between two points intersects the target cell
        a = (dx[i] - x[i])*(dx[i] - x[i]) + (dy[i] - y[i])*(dy[i] - y[i]) + (dz[i] - z[i])*(dz[i] - z[i])
        b = 2*((dx[i] - x[i])*(x[i] - d) + (dy[i] - y[i])*(y[i]) + (dz[i] - z[i])*(z[i]))
        c = d*d + x[i]*x[i] + y[i]*y[i] + z[i]*z[i] - 2*(d*x[i]) - r_cell*r_cell
        if b*b - 4*a*c >= 0:
            intersections += 1
    return intersections

def hits(N):
    I = []
    for i in range(len(N)):
        decay = total_decay(N[i])
        I.append(inter(decay[0], decay[1], decay[2], decay[3], decay[4], decay[5], N[i]))
    return I

def AE(I, N):
    p1, p2 = 52.4 / (52.4 + 18.9), 18.9 / (52.4 + 18.9)
    # weights, not cum_weights: p1 and p2 are plain (non-cumulative) probabilities
    E = [choices([5829.6, 5793.1], weights=(p1, p2), k=1)[0] for _ in range(I)]
    return sum(E)/N

def list_AE(I, N):
    E = [AE(I[i], N[i]) for i in range(len(N))]
    return E

plt.plot(N, list_AE(hits(N), N))
plt.title('Average energy per dose with respect to number of decays')
plt.xlabel('Number of decays [N]')
plt.ylabel('Average energy [keV]')
plt.show()
Can anyone more experienced point out where the bottleneck is, explain why it happens, and suggest how to optimize it? Thanks in advance.
To find out where most of the time is spent in your code, examine it with a profiler. Wrap your main code like this:
import cProfile
import pstats
profiler = cProfile.Profile()
profiler.enable()
result = list_AE(hits(N), N)
profiler.disable()
stats = pstats.Stats(profiler).sort_stats('tottime')
stats.print_stats()
You will get the following overview (abbreviated):
6467670 function calls in 19.982 seconds
Ordered by: internal time
ncalls tottime percall cumtime percall filename:lineno(function)
200 4.766 0.024 4.766 0.024 ./alphadecay.py:24(inter)
995400 2.980 0.000 2.980 0.000 ./alphadecay.py:17(<genexpr>)
995400 2.925 0.000 2.925 0.000 ./alphadecay.py:15(<genexpr>)
995400 2.690 0.000 2.690 0.000 ./alphadecay.py:16(<genexpr>)
995400 2.683 0.000 2.683 0.000 ./alphadecay.py:14(<genexpr>)
995400 1.674 0.000 1.674 0.000 ./alphadecay.py:19(<genexpr>)
995400 1.404 0.000 1.404 0.000 ./alphadecay.py:18(<genexpr>)
1200 0.550 0.000 14.907 0.012 {built-in method numpy.fromiter}
Most of the time is spent in the inter function, since it runs a huge Python-level loop over N. To improve this, you could parallelize its execution across multiple processes using multiprocessing.Pool.
Another way to speed up your calculations is to make use of NumPy vectorization. That is, avoid iterating over N inside the total_decay() function:
def total_decay(N):
    theta = 2 * pi * np.random.rand(2, N)
    phi = arccos(2 * np.random.rand(2, N) - 1)
    x = r * sin(phi[0]) * cos(theta[0])
    y = r * sin(phi[0]) * sin(theta[0])
    z = r * cos(phi[0])
    dx = x + R * sin(phi[1]) * cos(theta[1])
    dy = y + R * sin(phi[1]) * sin(theta[1])
    dz = z + R * cos(phi[1])
    return x, y, z, dx, dy, dz
I've rearranged the code a bit to make it more readable. On that note, I strongly suggest you follow the Python formatting conventions and use descriptive variable names to make your code more understandable.
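The same idea applies to inter, which the profile shows is the single most expensive call. Here is a sketch of a vectorized version (the name inter_vec is my own; like the original, it reads the module-level constants d and r_cell):
def inter_vec(x, y, z, dx, dy, dz):
    # same quadratic discriminant test as the loop, evaluated on whole arrays
    a = (dx - x)**2 + (dy - y)**2 + (dz - z)**2
    b = 2 * ((dx - x) * (x - d) + (dy - y) * y + (dz - z) * z)
    c = d*d + x*x + y*y + z*z - 2*d*x - r_cell*r_cell
    return int(np.count_nonzero(b*b - 4*a*c >= 0))
With both functions vectorized, hits() can simply call inter_vec(*total_decay(N[i])), and the N argument is no longer needed.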
I won't tell you where the bottleneck is, but I can tell you how to find bottlenecks in complex programs. The keyword is profiling. A profiler is an application that will run alongside your code and measure the execution times of each statement. Search online for python profiler.
The poor person's version is to step through with a debugger and guesstimate the execution times of statements, or to sprinkle in print statements or a timing library. Using a proper profiler is an important skill, though, and not that difficult to learn.

Scipy Curve Fit: "Result from function call is not a proper array of floats."

I am trying to fit a 2D Gaussian with an offset to a 2D array. The code is based on this thread here (which was written for Python 2 while I am using Python 3, so some changes were necessary to get it running):
import numpy as np
import scipy.optimize as opt

n_pixels = 2400

def twoD_Gaussian(data_list, amplitude, xo, yo, sigma_x, sigma_y, offset):
    x = data_list[0]
    y = data_list[1]
    theta = 0  # don't care about theta for the moment but want to leave the option in
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp(-(a*((x-xo)**2) + 2*b*(x-xo)*(y-yo) + c*((y-yo)**2)))
    return g

x = np.linspace(1, n_pixels, n_pixels)  # starting with 1 because proper data is from a fits file
y = np.linspace(1, n_pixels, n_pixels)
x, y = np.meshgrid(x, y)

amp = -3
x0, y0 = n_pixels/2, n_pixels/2
sigma_x, sigma_y = 100, 100
offset = -1

initial_guess = np.asarray([amp, x0, y0, sigma_x, sigma_y, offset])
data_array = np.asarray([x, y])
testmap = twoD_Gaussian(data_array, initial_guess[0], initial_guess[1], initial_guess[2], initial_guess[3], initial_guess[4], initial_guess[5])
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array, testmap, p0=initial_guess)
However, I first get a value error:
ValueError: object too deep for desired array
Which the traceback then traces to:
error: Result from function call is not a proper array of floats.
From what I understood in other threads about this error, it has to do with some part of the argument not being properly defined as an array (e.g. being a symbolic object instead). I don't understand that here, since the output testmap (which works as expected) is actually a NumPy array, and all input to curve_fit is either a NumPy array or the function itself. What is the exact issue, and how can I solve it?
Edit: the full error if I run it from the console is:
ValueError: object too deep for desired array
Traceback (most recent call last):
File "fit-2dgauss.py", line 41, in <module>
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array, test, p0=initial_guess)
File "/users/drhiem/.local/lib/python3.6/site-packages/scipy/optimize/minpack.py", line 784, in curve_fit
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
File "/users/drhiem/.local/lib/python3.6/site-packages/scipy/optimize/minpack.py", line 423, in leastsq
gtol, maxfev, epsfcn, factor, diag)
minpack.error: Result from function call is not a proper array of floats.
I just noticed that instead of "error", it's now "minpack.error". I had run this in an IPython console beforehand for testing purposes, so maybe that difference is down to that; I'm not sure how much it matters.
data_array is (2, 2400, 2400) float64 (from added print)
testmap is (2400, 2400) float64 (again a diagnostic print)
The curve_fit docs talk about length-M or (k, M) arrays.
You are providing (2, N, N) and (N, N) shaped arrays.
Let's try flattening the N, N dimensions:
In the objective function:
def twoD_Gaussian(data_list, amplitude, xo, yo, sigma_x, sigma_y, offset):
    x = data_list[0]
    y = data_list[1]
    x = x.reshape(2400, 2400)
    y = y.reshape(2400, 2400)
    theta = 0  # don't care about theta for the moment but want to leave the option in
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp(-(a*((x-xo)**2) + 2*b*(x-xo)*(y-yo) + c*((y-yo)**2)))
    return g.ravel()
and in the calls:
testmap = twoD_Gaussian(data_array.reshape(2,-1), initial_guess[0], initial_guess[1], initial_guess[2], initial_guess[3], initial_guess[4], initial_guess[5])
# shape (5760000,) float64
print(type(testmap),testmap.shape, testmap.dtype)
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array.reshape(2,-1), testmap, p0=initial_guess)
And it runs:
1624:~/mypy$ python3 stack65587542.py
(2, 2400, 2400) float64
<class 'numpy.ndarray'> (5760000,) float64
popt and pcov:
[-3.0e+00 1.2e+03 1.2e+03 1.0e+02 1.0e+02 -1.0e+00]
[[ 0. -0. -0. 0. 0. -0.]
[-0. 0. -0. -0. -0. -0.]
[-0. -0. 0. -0. -0. -0.]
[ 0. -0. -0. 0. 0. 0.]
[ 0. -0. -0. 0. 0. 0.]
[-0. -0. -0. 0. 0. 0.]]
The popt values are the same as initial_guess as expected with the exact testmap.
So the basic issue is that you did not take the documented specifications seriously. That
ValueError: object too deep for desired array
error message is a bit obscure, though I vaguely recall seeing it before. Sometimes we get errors like this when inputs are ragged arrays and the resulting arrays have object dtype. But here it's simply a matter of shape.
A past SO question with a similar problem and fix:
Scipy curve_fit for Two Dimensions Not Working - Object Too Deep?
ValueError When Performing scipy.stats test on Pandas Column Selection by Row
Fitting a 2D Gaussian function using scipy.optimize.curve_fit - ValueError and minpack.error
This is just a subset of the SO questions with the same error message. Other scipy functions produce it, and often the problem is with shapes like (m, 1) instead of (N, N). I'd be tempted to close this as a duplicate, but my long answer with debugging details may be instructive.
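As a general pattern (my own sketch, not part of the original answer; the name gauss2d_flat is hypothetical): since the Gaussian formula is elementwise, the reshape inside the objective function isn't actually needed, and you can keep everything flat from the start (theta fixed to 0, so the cross term drops out):
import numpy as np
import scipy.optimize as opt

def gauss2d_flat(xy, amplitude, xo, yo, sigma_x, sigma_y, offset):
    # xy has shape (2, M); everything stays 1-D, so no reshape is needed
    x, y = xy
    return offset + amplitude * np.exp(-((x - xo)**2 / (2 * sigma_x**2)
                                         + (y - yo)**2 / (2 * sigma_y**2)))

n = 100
xg, yg = np.meshgrid(np.arange(n), np.arange(n))
xy = np.vstack((xg.ravel(), yg.ravel()))   # shape (2, M) with M = n*n
p_true = (-3, 50, 50, 10, 10, -1)
z = gauss2d_flat(xy, *p_true)              # shape (M,), what curve_fit expects for ydata

popt, pcov = opt.curve_fit(gauss2d_flat, xy, z, p0=p_true)
print(popt)  # recovers p_true on this noise-free data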

How do I fix an error in plotting due to indexing?

import numpy as np
import matplotlib.pyplot as plt
from pcw import dls

n = 5
alpha = np.logspace(-1, -5, n)
x = 0.5/alpha
uw = np.logspace(-9, 1)
t = 0.25/uw
for i in range(len(x)):
    s = 2*np.array(dls([x[i]], t))  # maybe the error is here, due to the index or something
    print(s)
    fontsize_labels = 12
    fontsize_tick_labels = 12
    fontsize_legend = 10
    fig = plt.figure()
    ax1 = fig.add_subplot(111)
    color_list = plt.rcParams['axes.prop_cycle'].by_key()['color']
    ax1.loglog(uw, s, linewidth=2, color=color_list[1], clip_on=True)
    ax1.set_xlabel(r'$u_{w}$', fontsize=fontsize_labels)
    ax1.set_ylabel(r'$s_{w}\ /\ (Q/4 \pi T)$', fontsize=fontsize_labels)
    ax1.tick_params(axis='both', which='major', labelsize=fontsize_tick_labels)
    ax1.yaxis.set_ticks_position('both')
    ax1.xaxis.set_ticks_position('both')
    ax1.set_ylim(1e-4, 1e2)
    ax1.set_xlim(1e-9, 1e1)
    ax1.legend(frameon=False, loc='best', fontsize=fontsize_legend)
    # plt.tight_layout()
    plt.show()
Hi all,
I am a complete beginner in Python. After running this code, I expect to get five curves in a single plot, but I am getting five different plots instead. Please suggest how to get around this; as far as I can tell, the problem is near the for loop. Your help will be highly appreciated.
@Diziet Asahi I want all five curves on the same axes.
You need to take the figure and axes creation outside of your loop:
fig = plt.figure()
ax1 = fig.add_subplot(111)
for i in range(len(x)):
    (...)
    ax1.loglog(uw, s, linewidth=2, color=color_list[1], clip_on=True)
    (...)
plt.show()
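A fuller sketch of the fixed loop (my own elaboration of the answer, reusing x, t, uw, alpha and dls from the question; the per-curve color index and legend label are my additions so the five curves are distinguishable):
import matplotlib.pyplot as plt

fig = plt.figure()
ax1 = fig.add_subplot(111)
color_list = plt.rcParams['axes.prop_cycle'].by_key()['color']
for i in range(len(x)):
    s = 2 * np.array(dls([x[i]], t))
    # one color and one label per curve, all drawn on the same axes
    ax1.loglog(uw, s, linewidth=2, color=color_list[i % len(color_list)],
               label=rf'$\alpha$ = {alpha[i]:.0e}', clip_on=True)
ax1.legend(frameon=False, loc='best')
plt.show()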

numpy binned mean, conserving extra axes

It seems I am stuck on the following problem with numpy.
I have an array X with shape: X.shape = (nexp, ntime, ndim, npart)
I need to compute binned statistics on this array along the npart dimension, according to the values in binvals (and some bins), while keeping all the other dimensions, because I have to use the binned statistic to remove some bias from the original array X. Binning values have shape binvals.shape = (nexp, ntime, npart).
A complete, minimal example to explain what I am trying to do. Note that, in reality, I am working on large arrays with several hundred bins (so this implementation takes forever):
import numpy as np
np.random.seed(12345)

X = np.random.randn(24).reshape(1, 2, 3, 4)
binvals = np.random.randn(8).reshape(1, 2, 4)
bins = [-np.inf, 0, np.inf]

nexp, ntime, ndim, npart = X.shape
cleanX = np.zeros_like(X)
for ne in range(nexp):
    for nt in range(ntime):
        indices = np.digitize(binvals[ne, nt, :], bins)
        for nd in range(ndim):
            for nb in range(1, len(bins)):
                inds = indices == nb
                cleanX[ne, nt, nd, inds] = X[ne, nt, nd, inds] - \
                    np.mean(X[ne, nt, nd, inds], axis=-1)
Looking at the results of this may make it clearer?
In [8]: X
Out[8]:
array([[[[-0.20470766, 0.47894334, -0.51943872, -0.5557303 ],
[ 1.96578057, 1.39340583, 0.09290788, 0.28174615],
[ 0.76902257, 1.24643474, 1.00718936, -1.29622111]],
[[ 0.27499163, 0.22891288, 1.35291684, 0.88642934],
[-2.00163731, -0.37184254, 1.66902531, -0.43856974],
[-0.53974145, 0.47698501, 3.24894392, -1.02122752]]]])
In [10]: cleanX
Out[10]:
array([[[[ 0. , 0.67768523, -0.32069682, -0.35698841],
[ 0. , 0.80405255, -0.49644541, -0.30760713],
[ 0. , 0.92730041, 0.68805503, -1.61535544]],
[[ 0.02303938, -0.02303938, 0.23324375, -0.23324375],
[-0.81489739, 0.81489739, 1.05379752, -1.05379752],
[-0.50836323, 0.50836323, 2.13508572, -2.13508572]]]])
In [12]: binvals
Out[12]:
array([[[ -5.77087303e-01, 1.24121276e-01, 3.02613562e-01,
5.23772068e-01],
[ 9.40277775e-04, 1.34380979e+00, -7.13543985e-01,
-8.31153539e-01]]])
Is there a vectorized solution? I thought of using scipy.stats.binned_statistic, but I seem to be unable to understand how to use it for this aim. Thanks!
import numpy as np
np.random.seed(100)

nexp = 3
ntime = 4
ndim = 5
npart = 100
nbins = 4
binvals = np.random.rand(nexp, ntime, npart)
X = np.random.rand(nexp, ntime, ndim, npart)
bins = np.linspace(0, 1, nbins + 1)

# bin index per particle, with a broadcast axis for ndim: (nexp, ntime, 1, npart)
d = np.digitize(binvals, bins)[:, :, np.newaxis, :]
# one bin label per leading axis: (nbins, 1, 1, 1, 1)
r = np.arange(1, len(bins)).reshape((-1, 1, 1, 1, 1))
# boolean membership mask per bin: (nbins, nexp, ntime, 1, npart)
m = d[np.newaxis, ...] == r
# particles per bin, clipped to avoid division by zero in empty bins
counts = np.sum(m, axis=-1, keepdims=True).clip(min=1)
# per-bin mean over the npart axis: (nbins, nexp, ntime, ndim, 1)
means = np.sum(X[np.newaxis, ...] * m, axis=-1, keepdims=True) / counts
# pick, for every particle, the mean of the bin it belongs to, and subtract it
cleanX = X - np.choose(d - 1, means)
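A quick sanity check (my addition) comparing this against the original quadruple loop on the same data:
# reference result from the original loop, for comparison
ref = np.zeros_like(X)
for ne in range(nexp):
    for nt in range(ntime):
        idx = np.digitize(binvals[ne, nt, :], bins)
        for nd in range(ndim):
            for nb in range(1, len(bins)):
                inds = idx == nb
                ref[ne, nt, nd, inds] = X[ne, nt, nd, inds] - X[ne, nt, nd, inds].mean()

print(np.allclose(cleanX, ref))  # True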
OK, I think I got it, mainly based on the answer by @jdehesa.
clean2 = np.zeros_like(X)
d = np.digitize(binvals, bins)
for i in range(1, len(bins)):
    m = d == i
    minds = np.where(m)
    # index with a tuple; indexing with a list of arrays is deprecated in newer NumPy
    sl = (*minds[:2], slice(None), minds[2])
    msum = m.sum(axis=-1)
    clean2[sl] = (X -
                  (np.sum(X * m[..., np.newaxis, :], axis=-1) /
                   msum[..., np.newaxis])[..., np.newaxis])[sl]
This gives the same results as my original code.
On the small arrays in this example, this solution is approximately three times as fast as the original code. I expect it to be much faster on larger arrays.
Update:
Indeed it's faster on larger arrays (I didn't do any formal test), but despite this it only just reaches an acceptable level of performance... any further suggestions for extra vectorizations would be very welcome.
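For completeness (my own sketch, since the question mentions scipy.stats.binned_statistic): binned_statistic accepts a (D, N) values array and returns one row of per-bin statistics per row, so it can at least replace the two inner loops. Using the synthetic arrays from the answer above:
from scipy.stats import binned_statistic

cleanX3 = np.zeros_like(X)
for ne in range(nexp):
    for nt in range(ntime):
        # per-bin mean of each of the ndim rows, computed in one call
        stat, edges, binnumber = binned_statistic(
            binvals[ne, nt], X[ne, nt], statistic='mean', bins=bins)
        # binnumber maps each particle to its (1-based) bin
        cleanX3[ne, nt] = X[ne, nt] - stat[:, binnumber - 1]

print(np.allclose(cleanX3, cleanX))  # True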

Mathematica - plotting an array with values and exponential function

I have a question about Mathematica. I have an array of values called tempDep:
{10.7072,11.5416,12.2065,12.774,13.2768,13.7328,14.1526,14.5436,14.9107,15.2577,15.5874,15.9022,16.2037,16.4934,16.7727,17.0425,17.3036,17.5569,17.803,18.0424,18.2756,18.503,18.725,18.9419,19.154,19.3615,19.5647,19.7637,19.9588,20.1501,20.3378,20.5219,20.7025,20.8799,21.0541,21.2252,21.3933,21.5584,21.7207,21.8801,22.0368,22.1908,22.3423,22.4911,22.6374,22.7813,22.9228,23.0619,23.1987,23.3332,23.4655,23.5955,23.7235,23.8493,23.973,24.0947,24.2143,24.332,24.4478,24.5616,24.6736,24.7837,24.892,24.9986,25.1034,25.2064,25.3078,25.4075,25.5055,25.602,25.6968,25.7901,25.8819,25.9722,26.061,26.1483,26.2342,26.3186,26.4017,26.4835,26.5638,26.6429,26.7207,26.7972,26.8724,26.9464,27.0192,27.0908,27.1612,27.2304,27.2986,27.3656,27.4315,27.4963,27.56,27.6227,27.6844,27.7451,27.8048,27.8635,27.9212,27.978,28.0338,28.0887,28.1428,28.1959,28.2482,28.2996,28.3502,28.3999,28.4488,28.497,28.5443,28.5908,28.6366,28.6817,28.726,28.7695,28.8124,28.8545,28.896,28.9368,28.9769,29.0163,29.0551,29.0933,29.1308,29.1678,29.2041,29.2398,29.2749,29.3095,29.3435,29.3769,29.4098,29.4421,29.474,29.5053,29.536,29.5663,29.5961,29.6254,29.6542,29.6825,29.7104,29.7378,29.7647,29.7913,29.8173,29.843,29.8682,29.893,29.9175,29.9415,29.9651,29.9883,30.0112,30.0336,30.0557,30.0775,30.0989,30.1199,30.1406,30.1609,30.1809,30.2006,30.22,30.239,30.2578,30.2762,30.2943,30.3121,30.3297,30.3469,30.3639,30.3806,30.397,30.4131,30.429,30.4446,30.4599,30.4751,30.4899,30.5045,30.5189,30.533,30.5469,30.5606,30.5741,30.5873,30.6003,30.6131,30.6257,30.6381,30.6503,30.6623,30.674,30.6856}
and I am plotting it using
ListPlot[tempDep]
What I want to do is display this plot together with an exponential (which should look pretty much the same as this ListPlot) in one graph. Can you help me out with this, please?
Perhaps something like this?
data = Table[Sin[x], {x, 0, 2 Pi, 0.3}];
Show[
ListPlot[data, PlotStyle -> PointSize[0.02]],
ListLinePlot[data,
InterpolationOrder -> 2,
PlotStyle -> Directive[Thick, Orange]]
]
You can use
f = Interpolation[tempDep]
(the built-in function is Interpolation, not Interpolate) and then plot the graph of the interpolating function with
Plot[f[x], {x, 1, 198}]
It seems to me that your data obey something else, but if you want an exponential fit:
model = a + b Exp[c + d x];
tempDep1 = Partition[Riffle[Range@Length@tempDep, tempDep], 2];
fit = FindFit[tempDep1, model, {a, b, c, d}, x, Method -> NMinimize];
modelf = Function[{x}, Evaluate[model /. fit]]
Plot[modelf[t], {t, 0, Length@tempDep}, Epilog -> Point@tempDep1]
