Why does my model in Keras not take in my input/output data?
The input data is a list of numpy.ndarrays, each of shape (15,1,3), and the output is a list of NumPy arrays with a single number in each entry.
Here is where I create my model and pass the data in:
model = Sequential()
print "Data-train-in: " + str(data_train_input[0].shape)
print "Data-train-out: " + str(data_train_output[0].shape)
print "Data-test-in: " + str(data_test_input[0].shape)
#sys.exit()
print "Model Definition"
print "Row: " + str(row)
model.add(Convolution2D(64,3,3,input_shape=(3,row,1)))
print model.output_shape
model.add(Convolution2D(32,1,3))
print model.output_shape
model.add(MaxPooling2D((1,1)))
print model.output_shape
model.add(Flatten())
print model.output_shape
model.add(Dense(1,activation='relu'))
print model.output_shape
model.compile(loss='mean_squared_error', optimizer="sgd")
reduce_lr=ReduceLROnPlateau(monitor='val_loss', factor=0.01, patience=3, verbose=1, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0.000000000000000001)
stop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1, mode='auto')
log=csv_logger = CSVLogger('training_'+str(i)+'.csv')
print "Model Train"
hist_current = model.fit(data_train_input,
data_train_output,
shuffle=False,
validation_data=(data_test_input,data_test_output),
validation_split=0.1,
nb_epoch=150,
verbose=1,
callbacks=[reduce_lr,log,stop])
Which outputs:
Data-train-in: (15, 1, 3)
Data-train-out: ()
Data-test-in: (15, 1, 3)
Model Definition
Row: 15
(None, 1, 13, 64)
(None, 1, 11, 32)
(None, 1, 11, 32)
(None, 352)
(None, 1)
Model Train
Traceback (most recent call last):
File "keras_convolutional_feature_extraction.py", line 502, in <module>
model(0,train_input_data,output_data_train,test_input_data,output_data_test)
File "keras_convolutional_feature_extraction.py", line 496, in model
callbacks=[reduce_lr,log,stop])
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 652, in fit
sample_weight=sample_weight)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1038, in fit
batch_size=batch_size)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 963, in _standardize_user_data
exception_prefix='model input')
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 54, in standardize_input_data
'...')
Exception: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of 260182 arrays: [array([[[ 67, 255, 180]],
[[ 68, 255, 178]],
[[ 68, 255, 178]],
[[ 67, 255, 180]],
[[ 43, 254, 204]],
[[ 19, 253, 228]],
[[ 9, 205, 241]],
[[ ...
I am not sure how to interpret the error message. What is wrong here?
Your data doesn't match your input layer. In your model you used input_shape=(3,row,1), which is input_shape=(3,15,1) in this context.
But your print output shows that your training examples have a different shape, (15, 1, 3).
Try changing your input definition to input_shape=(row,1,3).
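Another way to solve the problem is to reshape your data to match the input layer shape.
import numpy as np
data_train_input = np.array(data_train_input)
This seems to work. Converting the list to a single NumPy array gives Keras one array instead of a list of 260182 arrays, which is what the error message complains about.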
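A minimal sketch of that conversion, using small made-up arrays in place of the real data (the names samples and batch are hypothetical). It shows that a list of (15, 1, 3) arrays becomes one array with a leading sample axis:

```python
import numpy as np

# Hypothetical stand-in for data_train_input: a list of 4 samples,
# each of shape (15, 1, 3) like the ones printed in the question.
samples = [np.zeros((15, 1, 3)) for _ in range(4)]

# np.array stacks equally shaped arrays along a new leading axis.
batch = np.array(samples)
print(batch.shape)  # (4, 15, 1, 3)
```

The trailing dimensions (15, 1, 3) still have to match the input_shape given to the first layer.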
I have an image with a list of numbers which I have scanned using PyTesseract to construct a string. Concretely, here is the code:
from PIL import Image
import pytesseract
from scipy import stats
import numpy as np
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
str1=pytesseract.image_to_string(Image.open('D:/Image.png'))
Here's the image I am scanning:
The problem is that PyTesseract is scanning the image as individual characters instead of integers.
I would like to understand why this is happening and what I can do to get the desired result.
In short, PyTesseract is not scanning integers in a list of numbers, instead scanning them as individual characters. How do I tell it to scan for integers and put them in an array?
If you only want a list, re.split and strip can solve it (the extra filtering is needed because tesseract's result contains some errors).
You can try this:
import pytesseract
import re
data = pytesseract.image_to_string('OCR.png')
dataList = re.split(r',|\.| ',data) # split the string
resultList = [int(i.strip()) for i in dataList if i != ''] # remove the '' str and convert str to int.
print(resultList)
# result: [71, 194, 38, 1701, 89, 76, 11, 83, 1629, 48, 94, 63, 132, 16, 111, 95, 84, 341, 975, 14, 40, 64, .......
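As an alternative sketch, not taken from the answer above, re.findall can pull out every run of digits directly, which also copes with newlines and stray punctuation in the OCR text. The sample string here is made up for illustration and is not real tesseract output:

```python
import re

# Hypothetical OCR output; real tesseract text will differ.
ocr_text = "71, 194, 38. 1701\n89, 76,, 11"

# \d+ matches each maximal run of digits, so separators are ignored.
numbers = [int(token) for token in re.findall(r"\d+", ocr_text)]
print(numbers)  # [71, 194, 38, 1701, 89, 76, 11]
```

Note that this treats every separator the same way, so a number misread as 17.01 would come back as two separate integers.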
I am not able to figure out how to fix this error when I run my Python code. This is the entire error:
Loading all_data
type of alldata <class 'dict'>
Sorting these keys dict_keys([0, 1, 2, 3, 4, 5])
Traceback (most recent call last):
File "test.py", line 48, in <module>
keys_sorted = np.sort(all_data.keys())
File "/home/MAHEUNIX/anaconda3/lib/python3.6/site-packages/numpy/core/fromnumeric.py", line 934, in sort
a.sort(axis=axis, kind=kind, order=order)
numpy.AxisError: axis -1 is out of bounds for array of dimension 0
MAHEUNIX#WGSHA-LAB-005:/
This is the corresponding code:
print("Loading all_data")
all_data = load_dataset()
print("type of alldata",type(all_data),"\n")
print ("Sorting these keys", all_data.keys(),"\n\n")
keys_sorted = np.sort(all_data.keys())
print("keys sorted successfully\n")
train_idx, valid_idx = train_test_split(all_data.keys(), train_size = 0.9)
print (train_idx)
What is happening?
I have written some code to read a data file using pandas and process the data with numpy. This results in some NaNs in the numpy array. I mask those out so that I can apply a linear regression fit with scipy.stats:
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
from scipy import stats
def makeArray(band):
    """
    Takes as argument a string as the name of a wavelength band.
    Converts the list of magnitudes in that band into a numpy array,
    replacing invalid values (where invalid == -999) with NaNs.
    Returns the array.
    """
    array_name = band + '_mag'
    array = np.array(df[array_name])
    array[array==-999]=np.nan
    return array
# Read data file
fields = ['no', 'NED', 'z', 'obj_type','S_21', 'power', 'SI_flag',
'U_mag', 'B_mag', 'V_mag', 'R_mag', 'K_mag', 'W1_mag',
'W2_mag', 'W3_mag', 'W4_mag', 'L_UV', 'Q', 'flag_uv']
magnitudes = ['U_mag', 'B_mag', 'V_mag', 'R_mag', 'K_mag', 'W1_mag',
'W2_mag', 'W3_mag', 'W4_mag']
df = pd.read_csv('todo.dat', sep = ' ',
names = fields, index_col = False)
# Define axes for processing
redshifts = np.array(df['z'])
y = np.log(makeArray('K'))
mask = np.isnan(y)
plt.scatter(redshifts, y, label = ('K'), s = 2, color = 'r')
slope, intercept, r_value, p_value, std_err = stats.linregress(redshifts, y[mask])
fit = slope*redshifts + intercept
plt.legend()
plt.show()
but the lines where I calculate the stats parameters and the fit line (third- and fourth-to-last lines) give me the following error:
Traceback (most recent call last):
File "<ipython-input-77-ec9f43cdfa9b>", line 1, in <module>
runfile('C:/Users/Jeremy/Dropbox/Notes/Postgrad/Masters Research/VUW/QSOs/read_csv.py', wdir='C:/Users/Jeremy/Dropbox/Notes/Postgrad/Masters Research/VUW/QSOs')
File "C:\Users\Jeremy\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 880, in runfile
execfile(filename, namespace)
File "C:\Users\Jeremy\Anaconda3\lib\site-packages\spyder\utils\site\sitecustomize.py", line 102, in execfile
exec(compile(f.read(), filename, 'exec'), namespace)
File "C:/Users/Jeremy/Dropbox/Notes/Postgrad/Masters Research/VUW/QSOs/read_csv.py", line 35, in <module>
slope, intercept, r_value, p_value, std_err = stats.linregress(redshifts, y[mask])
File "C:\Users\Jeremy\Anaconda3\lib\site-packages\scipy\stats\_stats_mstats_common.py", line 92, in linregress
ssxm, ssxym, ssyxm, ssym = np.cov(x, y, bias=1).flat
File "C:\Users\Jeremy\Anaconda3\lib\site-packages\numpy\lib\function_base.py", line 2865, in cov
X = np.vstack((X, y))
File "C:\Users\Jeremy\Anaconda3\lib\site-packages\numpy\core\shape_base.py", line 234, in vstack
return _nx.concatenate([atleast_2d(_m) for _m in tup], 0)
ValueError: all the input array dimensions except for the concatenation axis must match exactly
The variables are shaped like:
so I'm not sure what the error means, or how to fix it. Is there a way around this? Or perhaps another module I can use instead of scipy.stats that will allow me to fit a linear regression?
The problem is that y[mask] has a different length from redshifts: the mask selects only the NaN entries of y.
Below is a simple example that shows the issue.
import numpy as np
na = np.array
y = na([np.nan, 4, 5, 6, 7, 8, np.nan, 9, 10, np.nan])
mask = np.isnan(y)
print(len(y), len(y[mask]))
You will have to substitute values for the NaN values in y, with something like:
print('old y: ', y)
for idx, m in enumerate(mask):
    if m:
        y[idx] = 1000 # or whatever value you decide on
print('new y: ', y)
Full example code...
import numpy as np
na = np.array
y = na([np.nan, 4, 5, 6, 7, 8, np.nan, 9, 10, np.nan])
mask = np.isnan(y)
print(len(y), len(y[mask]))
print('old y: ', y)
for idx, m in enumerate(mask):
    if m:
        y[idx] = 1000 # or whatever value you decide on
print('new y: ', y)
print(len(y))
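An alternative sketch, not part of the answer above: instead of substituting a placeholder value, drop the NaN entries from both arrays with the inverted mask, so that the two inputs keep the same length. The data is made up, and np.polyfit stands in for stats.linregress only to keep the example NumPy-only:

```python
import numpy as np

# Made-up data standing in for redshifts and y from the question.
x = np.arange(10, dtype=float)
y = np.array([np.nan, 4, 5, 6, 7, 8, np.nan, 10, 11, np.nan])

# Keep only the positions where y is a valid number, in both arrays.
valid = ~np.isnan(y)
x_valid, y_valid = x[valid], y[valid]
print(len(x_valid), len(y_valid))  # 7 7

# Degree-1 least-squares fit; returns the slope first, then the intercept.
slope, intercept = np.polyfit(x_valid, y_valid, 1)
```

The same valid mask can be applied before calling stats.linregress, as in stats.linregress(x[valid], y[valid]).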
I have three equally dimensioned numpy arrays.
I would like to store the data from all three in an array of the same dimensions and size.
To do this, I would like to store three bytes of information per item in the array. I assume this would be a list.
e.g.
>>>red = np.array([[150,25],[37,214]])
>>>green = np.array([[190,27],[123,231]])
>>>blue = np.array([[10,112],[123,119]])
insert combination magic to make a combined array called RGB
>>>RGB
array([(150,190,10),(25,27,112)],[(37,123,123),(214,231,119)])
For a start, each array is 2x2. Combining them in a list passed to np.array, the same construction used to make red, produces a 3x2x2 array.
In [344]: red = np.array([[150,25],[37,214]])
In [345]: green = np.array([[190,27],[123,231]])
In [346]: blue = np.array([[10,112],[123,119]])
In [347]: np.array([red,green,blue])
Out[347]:
array([[[150, 25],
[ 37, 214]],
[[190, 27],
[123, 231]],
[[ 10, 112],
[123, 119]]])
In [348]: _.shape
Out[348]: (3, 2, 2)
That's not the order you want, but we can easily reshape, and if needed transpose.
The target, with an added set of []
In [350]: np.array([[(150,190,10),(25,27,112)],[(37,123,123),(214,231,119)]])
Out[350]:
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
In [351]: _.shape
Out[351]: (2, 2, 3)
so try moving the 3 shape to the end with transpose:
In [352]: np.array([red,green,blue]).transpose(1,2,0)
Out[352]:
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
===========================
I should have suggested stack. This is a newish version of concatenate that lets us join arrays on a new dimension. With axis=0 it behaves like np.array. But to put the rgb dimension last, use:
In [467]: np.stack((red,green,blue),axis=-1)
Out[467]:
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
In [468]: _.shape
Out[468]: (2, 2, 3)
Note that this expression does not assume anything about the shape of red, etc, except that they are equal. So it will work with 3d arrays as well.
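A small sketch of the 3d case, using made-up constant arrays (the names r, g, b and rgb are hypothetical) so the result is easy to check by eye:

```python
import numpy as np

# Three equally shaped 3d arrays filled with 0, 1 and 2 respectively.
r = np.full((4, 2, 2), 0)
g = np.full((4, 2, 2), 1)
b = np.full((4, 2, 2), 2)

# axis=-1 appends the new colour axis after the existing ones.
rgb = np.stack((r, g, b), axis=-1)
print(rgb.shape)     # (4, 2, 2, 3)
print(rgb[0, 0, 0])  # [0 1 2]
```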
I want to read many strings, each on a separate line, and store all of them for later use. For example, I want to read the following input (the last line ends with a "."):
My name is ABCD
My name is BCDS
My name is fdada.
How can I implement this? I also want to use all of these strings afterwards. In Java or any other language I would have made a string array and used it to access all three strings.
But the moment I enter the first line, it gives me false.
You can use a failure-driven loop, like
:- dynamic a_line/1.
read_lines :-
    retractall(a_line(_)),
    repeat,
    read_line_to_codes(user_input, L),
    assertz(a_line(L)),
    ( last(L, 0'.) ; fail ).
and then
?- read_lines.
|: My name is ABCD
|: My name is BCDS
|: My name is fdada.
true .
The result gets stored in a_line/1, so
?- a_line(L),atom_codes(A,L).
L = [77, 121, 32, 110, 97, 109, 101, 32, 105|...],
A = 'My name is ABCD'
L = [32, 77, 121, 32, 110, 97, 109, 101, 32|...],
A = ' My name is BCDS'.
...