I'm using Python 2.7 and trying to get this code to work, but I keep receiving an error:
nsample = 50
sig = 0.25
x1 = np.linspace(0,20, nsample)
X = np.c_[x1, np.sin(x1), (x1-5)**2, np.ones(nsample)]
beta = masterAverageList
y_true = np.dot(X, beta)
y = y_true + sig * np.random.normal(size=nsample)
However, I keep getting an "objects are not aligned" error.
I think it has something to do with masterAverageList being a list?
I forgot to mention that masterAverageList has 196 items in it, if it matters. They are all floats.
How can I correct this?
Thanks for any suggestions.
You should read up on numpy broadcasting here and here. You are trying to take the dot product between two arrays which have incompatible shapes.
>>> import numpy as np
>>> x1 = np.linspace(0,20,50)
>>> X = np.c_[x1,np.sin(x1),(x1-5)**2,np.ones(50)]
>>> beta = np.ones(196)
>>> y_true = np.dot(X,beta)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
ValueError: matrices are not aligned
>>> X.shape
(50, 4)
>>> beta.shape
(196,)
I'm not sure what to recommend, since I don't know what you were expecting by taking the dot product between these arrays.
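For reference, np.dot between a 2-D array of shape (n, k) and a 1-D array only works when the 1-D array has length k. Here is a minimal sketch, assuming beta was meant to hold one coefficient per column of X. The four coefficient values are made up for illustration and are not taken from masterAverageList.

```python
import numpy as np

nsample = 50
x1 = np.linspace(0, 20, nsample)
X = np.c_[x1, np.sin(x1), (x1 - 5) ** 2, np.ones(nsample)]  # shape (50, 4)

# Hypothetical coefficients: one per column of X, so beta.shape == (4,)
beta = np.array([0.5, 0.5, -0.02, 5.0])

# (50, 4) dot (4,) -> (50,): the inner dimensions (4 and 4) match
y_true = np.dot(X, beta)
print(X.shape, beta.shape, y_true.shape)
```

With a 196-element beta the inner dimensions are 4 and 196, which is why the call fails.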
I am trying to build a bar chart with the bars shown in descending order.
In my code, the numpy array is the result of using SelectKBest() to select the best features in a machine learning problem depending on their variance.
import numpy as np
import matplotlib.pyplot as plt
flist = ['int_rate', 'installment', 'log_annual_inc','dti', 'fico', 'days_with_cr_line', 'revol_bal', 'revol_util', 'inq_last_6mths','pub_rec']
fimportance = np.array([250.14120228,23.95686725,10.71979245,13.38566487,219.41737141,
8.19261323,27.69341779,64.96469182,218.77495366,22.7037686 ]) # this is the numpy.ndarray after running SelectKBest()
# This gives 'int_rate', 'fico', 'revol_util', 'inq_last_6mths' as the 4 most important features,
# as their variance values are mapped to flist, e.g. 250 relates to 'int_rate' and 218 relates to 'inq_last_6mths'.
print(fimportance)
[250.14120228 23.95686725 10.71979245 13.38566487 219.41737141
8.19261323 27.69341779 64.96469182 218.77495366 22.7037686 ]
So I want to show these values on my bar chart in descending order, with int_rate on top.
fimportance_sorted = np.sort(fimportance)
array([250.14120228, 219.41737141, 218.77495366, 64.96469182,
27.69341779, 23.95686725, 22.7037686 , 13.38566487,
10.71979245, 8.19261323])
# this bar chart is not right because here the values and indices are messed up.
plt.barh(flist, fimportance_sorted)
Next I have tried this.
plt.barh([x for x in range(len(fimportance))], fimportance)
I understand I need to map these indices to the flist values somehow and then sort them, maybe by creating an array and then mapping my list labels instead of its index. Here I am stuck.
for i,v in enumerate(fimportance):
    arr = np.array([i,v])
Thank you for your help with this problem.
the values and indices are messed up
That's because you sorted fimportance (fimportance_sorted = np.sort(fimportance)), but the order of labels in flist remained unchanged, so now labels don't correspond to the values in fimportance_sorted.
You can use numpy.argsort to get the indices that would put fimportance into sorted order and then index both flist and fimportance with these indices:
>>> import numpy as np
>>> flist = ['int_rate', 'installment', 'log_annual_inc','dti', 'fico', 'days_with_cr_line', 'revol_bal', 'revol_util', 'inq_last_6mths','pub_rec']
>>> fimportance = np.array([250.14120228,23.95686725,10.71979245,13.38566487,219.41737141,
... 8.19261323,27.69341779,64.96469182,218.77495366,22.7037686 ])
>>> idx = np.argsort(fimportance)
>>> idx
array([5, 2, 3, 9, 1, 6, 7, 8, 4, 0])
>>> flist[idx]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: only integer scalar arrays can be converted to a scalar index
>>> np.array(flist)[idx]
array(['days_with_cr_line', 'log_annual_inc', 'dti', 'pub_rec',
'installment', 'revol_bal', 'revol_util', 'inq_last_6mths', 'fico',
'int_rate'], dtype='<U17')
>>> fimportance[idx]
array([ 8.19261323, 10.71979245, 13.38566487, 22.7037686 ,
23.95686725, 27.69341779, 64.96469182, 218.77495366,
219.41737141, 250.14120228])
idx is the order in which you need to put elements of fimportance to sort it. The order of flist must match the order of fimportance, so index both with idx.
As a result, elements of np.array(flist)[idx] correspond to elements of fimportance[idx].
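For the descending order asked about in the question, the same idea can be sketched by reversing the argsort result so the largest score comes first. Only NumPy is used here, and the names idx_desc, labels_desc and values_desc are mine, not from the question.

```python
import numpy as np

flist = ['int_rate', 'installment', 'log_annual_inc', 'dti', 'fico',
         'days_with_cr_line', 'revol_bal', 'revol_util', 'inq_last_6mths', 'pub_rec']
fimportance = np.array([250.14120228, 23.95686725, 10.71979245, 13.38566487, 219.41737141,
                        8.19261323, 27.69341779, 64.96469182, 218.77495366, 22.7037686])

# argsort is ascending; reversing it gives indices from largest to smallest score
idx_desc = np.argsort(fimportance)[::-1]

# Index labels and values with the same indices so they stay paired
labels_desc = np.array(flist)[idx_desc]
values_desc = fimportance[idx_desc]

for label, value in zip(labels_desc, values_desc):
    print(f"{label:>18s} {value:10.4f}")
```

One thing to keep in mind: plt.barh draws the first entry at the bottom of the axis, so passing the ascending pair np.array(flist)[idx] and fimportance[idx] is what puts int_rate on top of the chart.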
I am trying to fit a 2D Gaussian with an offset to a 2D array. The code is based on this thread here (which was written for Python2 while I am using Python3, therefore some changes were necessary to make it run somewhat):
import numpy as np
import scipy.optimize as opt
n_pixels = 2400
def twoD_Gaussian(data_list, amplitude, xo, yo, sigma_x, sigma_y, offset):
    x = data_list[0]
    y = data_list[1]
    theta = 0 # don't care about theta for the moment but want to leave the option in
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp( - (a*((x-xo)**2) + 2*b*(x-xo)*(y-yo) + c*((y-yo)**2)))
    return g
x = np.linspace(1, n_pixels, n_pixels) #starting with 1 because proper data is from a fits file
y = np.linspace(1, n_pixels, n_pixels)
x, y = np.meshgrid(x,y)
amp = -3
x0, y0 = n_pixels/2, n_pixels/2
sigma_x, sigma_y = 100, 100
offset = -1
initial_guess = np.asarray([amp, x0, y0, sigma_x, sigma_y, offset])
data_array = np.asarray([x, y])
testmap = twoD_Gaussian(data_array, initial_guess[0], initial_guess[1], initial_guess[2], initial_guess[3], initial_guess[4], initial_guess[5])
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array, testmap, p0=initial_guess)
However, I first get a value error:
ValueError: object too deep for desired array
Which the traceback then traces to:
error: Result from function call is not a proper array of floats.
From what I understood in other threads about this error, it has to do with some part of the argument not being properly defined as an array, but e.g. as a symbolic object. I do not understand this, since the output testmap (which is working as expected) is actually a numpy array, and all input into curve_fit is also either a numpy array or the function itself. What is the exact issue and how can I solve it?
edit: the full error if I try to run it from console is:
ValueError: object too deep for desired array
Traceback (most recent call last):
File "fit-2dgauss.py", line 41, in <module>
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array, test, p0=initial_guess)
File "/users/drhiem/.local/lib/python3.6/site-packages/scipy/optimize/minpack.py", line 784, in curve_fit
res = leastsq(func, p0, Dfun=jac, full_output=1, **kwargs)
File "/users/drhiem/.local/lib/python3.6/site-packages/scipy/optimize/minpack.py", line 423, in leastsq
gtol, maxfev, epsfcn, factor, diag)
minpack.error: Result from function call is not a proper array of floats.
I just noticed that instead of "error", it's now "minpack.error". I ran this in an ipython console environment beforehand for testing purposes, so maybe that difference is down to that, not sure how much this difference matters.
data_array is (2, 2400, 2400) float64 (from added print)
testmap is (2400, 2400) float64 (again a diagnostic print)
curve_fit docs talk about M length or (k,M) arrays.
You are providing (2,N,N) and (N,N) shape arrays.
Let's try flattening the N,N dimensions:
In the objective function:
def twoD_Gaussian(data_list, amplitude, xo, yo, sigma_x, sigma_y, offset):
    x = data_list[0]
    y = data_list[1]
    x = x.reshape(2400,2400)
    y = y.reshape(2400,2400)
    theta = 0 # don't care about theta for the moment but want to leave the option in
    a = (np.cos(theta)**2)/(2*sigma_x**2) + (np.sin(theta)**2)/(2*sigma_y**2)
    b = -(np.sin(2*theta))/(4*sigma_x**2) + (np.sin(2*theta))/(4*sigma_y**2)
    c = (np.sin(theta)**2)/(2*sigma_x**2) + (np.cos(theta)**2)/(2*sigma_y**2)
    g = offset + amplitude*np.exp( - (a*((x-xo)**2) + 2*b*(x-xo)*(y-yo) + c*((y-yo)**2)))
    return g.ravel()
and in the calls:
testmap = twoD_Gaussian(data_array.reshape(2,-1), initial_guess[0], initial_guess[1], initial_guess[2], initial_guess[3], initial_guess[4], initial_guess[5])
# shape (5760000,) float64
print(type(testmap),testmap.shape, testmap.dtype)
popt, pcov = opt.curve_fit(twoD_Gaussian, data_array.reshape(2,-1), testmap, p0=initial_guess)
And it runs:
1624:~/mypy$ python3 stack65587542.py
(2, 2400, 2400) float64
<class 'numpy.ndarray'> (5760000,) float64
popt and pcov:
[-3.0e+00 1.2e+03 1.2e+03 1.0e+02 1.0e+02 -1.0e+00]
[[ 0. -0. -0. 0. 0. -0.]
[-0. 0. -0. -0. -0. -0.]
[-0. -0. 0. -0. -0. -0.]
[ 0. -0. -0. 0. 0. 0.]
[ 0. -0. -0. 0. 0. 0.]
[-0. -0. -0. 0. 0. 0.]]
The popt values are the same as initial_guess as expected with the exact testmap.
So the basic issue is that you did not take the documented specifications seriously. That
ValueError: object too deep for desired array
error message is a bit obscure, though I vaguely recall seeing it before. Sometimes we get errors like this when inputs are ragged arrays and the result array is object dtype. But here it's simply a matter of shape.
Past SO questions with a similar problem and fix:
Scipy curve_fit for Two Dimensions Not Working - Object Too Deep?
ValueError When Performing scipy.stats test on Pandas Column Selection by Row
Fitting a 2D Gaussian function using scipy.optimize.curve_fit - ValueError and minpack.error
This is just a subset of SO with the same error message. Other scipy functions produce it. And often the problem is with shapes like (m,1) instead of (N,N). I'd be tempted to close this as a duplicate, but my long answer with debugging details may be instructive.
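The flattening pattern can also be sketched on a much smaller grid, which runs quickly and does not hard-code the image size inside the model function. This is only an illustration under my own assumptions: the name gaussian_2d, the 50x50 grid, the "true" parameters and the starting guess are all invented here, and theta is dropped since it was fixed at 0 in the question.

```python
import numpy as np
from scipy.optimize import curve_fit

def gaussian_2d(xy, amplitude, xo, yo, sigma_x, sigma_y, offset):
    # xy has shape (2, M): row 0 holds x coordinates, row 1 holds y coordinates
    x, y = xy
    exponent = (x - xo) ** 2 / (2 * sigma_x ** 2) + (y - yo) ** 2 / (2 * sigma_y ** 2)
    return offset + amplitude * np.exp(-exponent)  # 1-D result of length M

n = 50  # small hypothetical grid instead of 2400x2400
coords = np.arange(1, n + 1, dtype=float)
x, y = np.meshgrid(coords, coords)

# Flatten both coordinate grids into the (k, M) layout that curve_fit documents
xy = np.vstack([x.ravel(), y.ravel()])  # shape (2, 2500)

true_params = (-3.0, 25.0, 25.0, 6.0, 8.0, -1.0)  # made-up values
image = gaussian_2d(xy, *true_params).reshape(n, n)  # the 2-D "measurement"

guess = (-2.0, 23.0, 27.0, 5.0, 5.0, 0.0)
# ydata must be 1-D with length M as well, hence image.ravel()
popt, pcov = curve_fit(gaussian_2d, xy, image.ravel(), p0=guess)
print(popt)
```

Because the model works directly on the flattened coordinates, no reshape is needed inside the function; the fitted surface can be turned back into an image afterwards with gaussian_2d(xy, *popt).reshape(n, n).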
Good afternoon.
I've been struggling with this for a while now, and although I can find similar problems online, nothing I found could really help me resolve it.
Starting with a standard data file (.csv or .txt, I tried both) containing three columns (x, y and the error of y), I want to read in the data and generate a line plot including error bars.
I can plot the x and y values without a problem, but if I want to add errorbars using the matplotlib.pyplot errorbar utility, I get the following error message:
ValueError: yerr must be a scalar, the same dimensions as y, or 2xN.
The code below works if I use some arbitrary arrays (numpy or plain python), but not for data read from the file. I've tried converting the tuples which I obtain from my input code to numpy arrays using asarray, but to no avail.
import numpy as np
import matplotlib.pyplot as plt
row = []
with open("data.csv") as data:
for line in data:
column = zip(*row)
x = column[0]
y = column[1]
yer = column[2]
plt.errorbar(x,y,yerr = yer)
fig = plt.gcf()
fig.set_size_inches(18.5, 10.5)
fig.savefig('example.png', dpi=300)
It must be that I am overlooking something. I would be very grateful for any thoughts on the matter.
yerr should be the error added to and subtracted from the y value. In your case the added and subtracted errors are equal, each being half of the third column.
import numpy as np
import matplotlib.pyplot as plt
data = np.loadtxt('data.csv', delimiter=',')
yerr_ = np.tile(data[:, 2]/2, (2, 1))
plt.errorbar(data[:, 0], data[:, 1], yerr=yerr_)
plt.xlim([-1, 3])
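The error message lists the shapes that yerr may have, and both array forms can be checked without plotting anything. The sketch below is an assumption-laden illustration: the CSV text is invented to stand in for data.csv, and whether the third column should be halved depends on what it actually represents, so it is used unchanged here.

```python
import io
import numpy as np

# Hypothetical file contents standing in for data.csv: x, y, error of y
csv_text = "0.0,1.0,0.2\n1.0,2.5,0.4\n2.0,2.0,0.1\n"

# loadtxt parses the text into a float array, so no string-to-number conversion is left to do
data = np.loadtxt(io.StringIO(csv_text), delimiter=",")  # shape (3, 3)
x, y, err = data.T  # three 1-D float arrays of length N

yerr_same_shape = err             # shape (N,): same dimensions as y, symmetric bars
yerr_2xN = np.vstack([err, err])  # shape (2, N): row 0 = lower errors, row 1 = upper errors

print(y.shape, yerr_same_shape.shape, yerr_2xN.shape)
```

Either array can then be passed as yerr to plt.errorbar(x, y, yerr=...).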
I am aware that there is a similar question for Z3 C++ API, but I couldn't find corresponding information for Z3Py. I'm trying to retrieve arrays from models found by Z3, so that I can access the array's values using indexes. For instance, if I had
>>> b = Array('b', IntSort(), BitVecSort(8))
>>> s = Solver()
>>> s.add(b[0] == 0)
>>> s.check()
then I'd like to do something like
>>> s.model()[b][0]
but I currently get :
>>> s.model()[b][0]
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'FuncInterp' object does not support indexing
Judging from the C++ answer, it seems like I'd have to declare a new function using some values I got from the model, but I don't understand it well enough to adapt it to Z3Py myself.
You can ask the model to evaluate (eval(...)) the array at a particular point by constructing a call to the associated array model function. Here's an example:
b = Array('b', IntSort(), BitVecSort(8))
s = Solver()
s.add(b[0] == 21)
s.add(b[1] == 47)
s.check() # a model is only available after a successful check()
m = s.model()
which produces
[1 -> 47, 0 -> 21, else -> 47]
I was just solving a problem using Python, and my code is:
from math import sin,pi
import numpy
import numpy as np
import pylab
x = np.linspace(0,1, N)
def v(x):
    return 100*sin(pi*x)
#set up initial condition
u0 = [0.0] # Boundary conditions at t= 0
for i in range(1,N):
    u0[i] = v(x[i])
I want to plot the results afterwards by updating v(x) over range(0, N). It looks simple, but perhaps you could help, since it gives me an error like this:
Traceback (most recent call last):
File "/home/universe/Desktop/Python/sample.py", line 13, in <module>
u0[i] = v(x[i])
IndexError: list assignment index out of range
You could change u0[i] = v(x[i]) to u0.append(v(x[i])). But it is more elegant to write
u0 = [v(xi) for xi in x]
Indices i are bug magnets.
Since you are using numpy, I'd suggest using np.vectorize. That way you can pass the array x directly to the function and the function will return an array of the same size with the function applied on each element of the input array.
from math import sin,pi
import numpy
import numpy as np
import pylab
N = 20 # assumed value; it matches the 20-element output below
x = np.linspace(0,1, N)
def v(x):
    return 100*sin(pi*x)
vectorized_v = np.vectorize(v) #so that the function takes an array of x's and returns an array again
u0 = vectorized_v(x)
array([ 0.00000000e+00, 1.64594590e+01, 3.24699469e+01,
4.75947393e+01, 6.14212713e+01, 7.35723911e+01,
8.37166478e+01, 9.15773327e+01, 9.69400266e+01,
9.96584493e+01, 9.96584493e+01, 9.69400266e+01,
9.15773327e+01, 8.37166478e+01, 7.35723911e+01,
6.14212713e+01, 4.75947393e+01, 3.24699469e+01,
1.64594590e+01, 1.22464680e-14])
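An alternative worth mentioning is to skip np.vectorize and use NumPy's own elementwise sine, which accepts the whole array at once. This is a small sketch, assuming N = 20 (which matches the 20-element output above); it swaps math.sin and math.pi for np.sin and np.pi.

```python
import numpy as np

N = 20  # assumed grid size
x = np.linspace(0, 1, N)

def v(x):
    # np.sin works elementwise, so x can be a scalar or a whole array
    return 100 * np.sin(np.pi * x)

# No loop and no index bookkeeping: one call fills the whole initial condition
u0 = v(x)
print(u0.shape)
```

The result is a NumPy array rather than a list, so u0[i] can still be read or overwritten by index later.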
u0 is a list with one element, so you can't assign values to indices that don't exist. Instead, make u0 a dictionary:
u0 = {}
u0[0] = 0.0
for i in range(1,N):
    u0[i] = v(x[i])