I have an image with a list of numbers which I have scanned using PyTesseract to construct a string. Concretely, here is the code:
from PIL import Image
import pytesseract
from scipy import stats
import numpy as np
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
Here's the image I am scanning:
The problem is that PyTesseract is scanning the image as individual characters instead of integers.
I would like to understand why this is happening and what I can do to get the desired result.
In short, PyTesseract is not scanning integers in a list of numbers, instead scanning them as individual characters. How do I tell it to scan for integers and put them in an array?
Well, if you only want a list, re.split and strip can solve it (because tesseract's result contains some errors).
You can try this:
import pytesseract
import re
data = pytesseract.image_to_string('OCR.png')
dataList = re.split(r',|\.| ',data) # split the string
resultList = [int(i.strip()) for i in dataList if i.strip()] # drop empty or whitespace-only pieces, then convert str to int
# result: [71, 194, 38, 1701, 89, 76, 11, 83, 1629, 48, 94, 63, 132, 16, 111, 95, 84, 341, 975, 14, 40, 64, .......
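As an alternative to splitting on separators, here is a minimal sketch that pulls every run of digits out of the OCR text with re.findall. The helper name parse_integers and the sample string are made up for illustration; in practice the text would come from pytesseract.image_to_string.

```python
import re

def parse_integers(ocr_text):
    """Return every run of consecutive digits in ocr_text as an int."""
    return [int(token) for token in re.findall(r'\d+', ocr_text)]

# Made-up sample standing in for the string returned by the OCR call.
sample = "71, 194, 38, 1701,\n89. 76, 11"
numbers = parse_integers(sample)
print(numbers)  # [71, 194, 38, 1701, 89, 76, 11]
```

Because only digits are matched, stray commas, periods and newlines are ignored. A digit that the OCR misreads as a letter would still be lost, so the result is only as good as the scan.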
How could I write a function that adds each element of the array to the running total of the elements before it, without using a for loop? For the second value in the expected output, it would be the previous value (2) plus the current value (5), giving 7. How would I be able to do such a thing?
import numpy as np
A = np.array([2,5,44,-12,3,-5])
Expected output:
Here you go:
array([ 2, 7, 51, 39, 42, 37], dtype=int32)
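The output above is a running total of A, which NumPy provides as np.cumsum. A minimal sketch, assuming that is the operation intended (the variable name running_total is illustrative):

```python
import numpy as np

A = np.array([2, 5, 44, -12, 3, -5])

# cumsum returns an array where element i is the sum of A[0..i], with no explicit loop.
running_total = np.cumsum(A)
print(running_total.tolist())  # [2, 7, 51, 39, 42, 37]
```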
I referred to C# code for byte array to string conversion in Swift.
I tried to convert a byte array to a string using the following code:
let utf8 : [UInt8] = [231, 13, 38, 246, 234, 144, 148, 111, 174, 136, 15, 61, 200, 186, 215, 113,0]
let str = NSString(bytes: utf8, length: utf8.count, encoding: String.Encoding.utf8.rawValue)
print("str : \(str)")
result : getting nil value
let datastring = NSString(bytes: chars, length: count, encoding: String.Encoding.utf8.rawValue)
print("string byte data\(chars.map{"\($0)"}.reduce(""){$0+$1})")
result : 23113382462341441481111741361561200186215113 (I don't think this is the ideal way)
I have been searching for the last two days and have tried several other approaches, but I am missing something or making a mistake.
Please help me resolve this issue.
Why does my model in Keras not take in my input/output data?
The input data is a list of numpy.ndarrays of shape (15,1,3), and the output is a list of numpy arrays with only one number in each entry.
Here is where I create my model and pass the data in:
model = Sequential()
print "Data-train-in: " + str(data_train_input[0].shape)
print "Data-train-out: " + str(data_train_output[0].shape)
print "Data-test-in: " + str(data_test_input[0].shape)
print "Model Definition"
print "Row: " + str(row)
print model.output_shape
print model.output_shape
print model.output_shape
print model.output_shape
print model.output_shape
model.compile(loss='mean_squared_error', optimizer="sgd")
reduce_lr=ReduceLROnPlateau(monitor='val_loss', factor=0.01, patience=3, verbose=1, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0.000000000000000001)
stop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1, mode='auto')
log=csv_logger = CSVLogger('training_'+str(i)+'.csv')
print "Model Train"
hist_current = model.fit(data_train_input,
Which outputs:
Data-train-in: (15, 1, 3)
Data-train-out: ()
Data-test-in: (15, 1, 3)
Model Definition
Row: 15
(None, 1, 13, 64)
(None, 1, 11, 32)
(None, 1, 11, 32)
(None, 352)
(None, 1)
Model Train
Traceback (most recent call last):
File "keras_convolutional_feature_extraction.py", line 502, in <module>
File "keras_convolutional_feature_extraction.py", line 496, in model
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 652, in fit
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1038, in fit
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 963, in _standardize_user_data
exception_prefix='model input')
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 54, in standardize_input_data
Exception: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of 260182 arrays: [array([[[ 67, 255, 180]],
[[ 68, 255, 178]],
[[ 68, 255, 178]],
[[ 67, 255, 180]],
[[ 43, 254, 204]],
[[ 19, 253, 228]],
[[ 9, 205, 241]],
[[ ...
I am not sure how to interpret the error message. What is wrong here?
Your data doesn't match your input layer. In your model you used input_shape=(3,row,1), which equals input_shape=(3,15,1) in this context.
But your prints show that your training examples have a different shape, (15, 1, 3).
Try changing your input definition to input_shape=(row,1,3).
Another way to solve the problem is to reshape your data to match the input layer shape.
import numpy as np
data_train_input = np.array(data_train_input)
This seems to work.
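The error message says the model received a Python list of many separate arrays where it expected one array. Here is a minimal sketch, using made-up zero-filled samples in place of data_train_input, of what np.array does to such a list and how the axes could be reordered if the model keeps input_shape=(3,row,1). Whether the reordering is appropriate depends on what each axis means in your data, so treat it as an assumption.

```python
import numpy as np

# Hypothetical stand-in for data_train_input: a list of samples, each of shape (15, 1, 3).
samples = [np.zeros((15, 1, 3), dtype=np.uint8) for _ in range(4)]

# Converting the list to a single array adds a leading batch axis.
batch = np.array(samples)
print(batch.shape)  # (4, 15, 1, 3)

# To match input_shape=(3, 15, 1), move the last axis forward with transpose.
# A plain reshape to the same shape would reinterpret the memory and mix up values.
batch_channels_first = batch.transpose(0, 3, 1, 2)
print(batch_channels_first.shape)  # (4, 3, 15, 1)
```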
I have three equally dimensioned numpy arrays.
I would like to store the data from all three in an array of the same dimensions and size.
To do this, I would like to store three bytes of information per item in the array. I assume this would be a list.
>>> red = np.array([[150,25],[37,214]])
>>> green = np.array([[190,27],[123,231]])
>>> blue = np.array([[10,112],[123,119]])
insert combination magic to make a combined array called RGB
For a start, each is 2x2. Combining them in a list passed to np.array, the same construction used to make red, produces a 3x2x2 array.
In [344]: red = np.array([[150,25],[37,214]])
In [345]: green = np.array([[190,27],[123,231]])
In [346]: blue = np.array([[10,112],[123,119]])
In [347]: np.array([red,green,blue])
array([[[150, 25],
[ 37, 214]],
[[190, 27],
[123, 231]],
[[ 10, 112],
[123, 119]]])
In [348]: _.shape
Out[348]: (3, 2, 2)
That's not the order you want, but we can easily reshape, and if needed transpose.
The target, with an added set of []
In [350]: np.array([[(150,190,10),(25,27,112)],[(37,123,123),(214,231,119)]])
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
In [351]: _.shape
Out[351]: (2, 2, 3)
So try moving the size-3 dimension to the end with transpose:
In [352]: np.array([red,green,blue]).transpose(1,2,0)
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
I should have suggested stack. This is a newish version of concatenate that lets us join arrays on a new dimension. With axis=0 it behaves like np.array. But to join on the last axis, putting the rgb dimension last, use:
In [467]: np.stack((red,green,blue),axis=-1)
array([[[150, 190, 10],
[ 25, 27, 112]],
[[ 37, 123, 123],
[214, 231, 119]]])
In [468]: _.shape
Out[468]: (2, 2, 3)
Note that this expression does not assume anything about the shape of red, etc, except that they are equal. So it will work with 3d arrays as well.
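Since the question asks for three bytes per item, here is a minimal sketch that builds the channels as uint8 before stacking. The dtype choice is an assumption on my part; it only works if every value fits in 0..255.

```python
import numpy as np

# Same values as in the question, stored as single bytes (assumes all values are 0..255).
red = np.array([[150, 25], [37, 214]], dtype=np.uint8)
green = np.array([[190, 27], [123, 231]], dtype=np.uint8)
blue = np.array([[10, 112], [123, 119]], dtype=np.uint8)

# Join on a new last axis so each [row, col] position holds an (r, g, b) triple.
rgb = np.stack((red, green, blue), axis=-1)
print(rgb.shape)           # (2, 2, 3)
print(rgb[0, 0].tolist())  # [150, 190, 10]
print(rgb.nbytes)          # 12, i.e. three bytes for each of the four positions

# Indexing the last axis recovers the original channels.
red_again = rgb[..., 0]
```

Indexing rgb[i, j] gives the three values for one position, and rgb[..., k] gives back one whole channel.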
I am trying to read many strings on separate lines and store all of them for later use. For example, I want to take the following input (the last line ends with a "."):
My name is ABCD
My name is BCDS
My name is fdada.
How can I implement this? I also want to use all of these strings. In Java or any other language I would have made a string array and used it to access all three strings.
But the moment I enter 1st line it gives me false.
You can use a failure-driven loop, like this:
:- dynamic a_line/1.
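read_lines :-
repeat, % the loop re-enters here whenever a later goal fails
read_line_to_codes(user_input, L),
assertz(a_line(L)), % store the line just read in a_line/1
( last(L, 0'.) ; fail ). % stop once a line ends with '.', otherwise fail back to repeat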
and then
?- read_lines.
|: My name is ABCD
|: My name is BCDS
|: My name is fdada.
true .
The results are stored in a_line/1, so
?- a_line(L),atom_codes(A,L).
L = [77, 121, 32, 110, 97, 109, 101, 32, 105|...],
A = 'My name is ABCD'
L = [32, 77, 121, 32, 110, 97, 109, 101, 32|...],
A = ' My name is BCDS'.