Converting image identified by PyTesseract to an array

I have an image with a list of numbers which I have scanned using PyTesseract to construct a string. Concretely, here is the code:
from PIL import Image
import pytesseract
from scipy import stats
import numpy as np
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
str1=pytesseract.image_to_string(Image.open('D:/Image.png'))
Here's the image I am scanning:
The problem is that PyTesseract returns the image as a string of individual characters rather than as integers.
I would like to understand why this happens and what I can do to get the desired result.
In short: how do I tell it to read the numbers as integers and put them in an array?

If you only need a list, re.split combined with strip will do it (some cleanup is needed because Tesseract's output contains errors).
You can try this:
import pytesseract
import re
data = pytesseract.image_to_string('OCR.png')
dataList = re.split(r',|\.| ', data)  # split on commas, periods and spaces
resultList = [int(i.strip()) for i in dataList if i.strip()]  # drop empty or whitespace-only tokens, then convert to int
print(resultList)
# result: [71, 194, 38, 1701, 89, 76, 11, 83, 1629, 48, 94, 63, 132, 16, 111, 95, 84, 341, 975, 14, 40, 64, .......
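If the OCR output mixes separators (commas, periods, spaces, newlines), an alternative is to pull out the digit runs directly instead of splitting. This is a minimal sketch, assuming the numbers are non-negative integers; the helper name extract_integers and the sample string are made up for illustration, and in practice the text would come from pytesseract.image_to_string:

```python
import re

def extract_integers(text):
    """Return every run of digits in `text` as an int, in order of appearance."""
    # \d+ matches one or more consecutive digits, so whatever separates
    # the numbers (comma, period, space, newline) is skipped.
    return [int(token) for token in re.findall(r"\d+", text)]

# Hypothetical OCR output standing in for the real image_to_string result.
sample = "71, 194, 38. 1701,\n89 76"
numbers = extract_integers(sample)
print(numbers)  # [71, 194, 38, 1701, 89, 76]
```

If a NumPy array is needed, the resulting list can be passed to np.array.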

Related

Adding previous and current index in Numpy Arrays Python

How could I write a function that adds each element to the running total of the elements before it, without using a for loop? For the second value in the expected output, that is the previous value (2) plus the current value (5), giving 7. How can I do this?
import numpy as np
A = np.array([2,5,44,-12,3,-5])
Expected output:
[2,7,51,39,42,37]

Here you go:
np.cumsum(A)
Prints:
array([ 2, 7, 51, 39, 42, 37], dtype=int32)
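To make explicit what a cumulative sum computes, here is a small sketch using only the standard library. It is not a replacement for np.cumsum, just the same running-total idea written with itertools.accumulate on the values from the question:

```python
from itertools import accumulate

values = [2, 5, 44, -12, 3, -5]

# accumulate() yields running totals: each output element is the sum of
# all input values up to and including that position.
running_totals = list(accumulate(values))
print(running_totals)  # [2, 7, 51, 39, 42, 37]
```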

How to convert byte array to string in swift 3

I referred to C# code for byte array to string conversion and want to do the same in Swift:
System.Text.Encoding.UTF8.GetString(encryptedpassword)
I tried to convert the byte array to a string using the following snippets:
(A)
let utf8 : [UInt8] = [231, 13, 38, 246, 234, 144, 148, 111, 174, 136, 15, 61, 200, 186, 215, 113,0]
let str = NSString(bytes: utf8, length: utf8.count, encoding: String.Encoding.utf8.rawValue)
print("str : \(str)")
Result: nil
(B)
let datastring = NSString(bytes: chars, length: count, encoding: String.Encoding.utf8.rawValue)
print("string byte data\(chars.map{"\($0)"}.reduce(""){$0+$1})")
Result: 23113382462341441481111741361561200186215113 (I don't think this is the right approach)
I have searched for the last two days and tried several other approaches, but I am missing something or making a mistake.
Please help me resolve this issue.
Referred links:
how-to-convert-uint8-byte-array-to-string-in-swift
how-to-convert-uint8-byte-array-to-string-in-swift 2
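The nil in attempt (A) is consistent with these bytes not being valid UTF-8: 231 (0xE7) opens a three-byte sequence, but the next byte, 13, is not a continuation byte. The sketch below uses Python rather than Swift, purely to illustrate that behaviour. Treating the bytes as opaque binary data and rendering them as hex or Base64 is an assumption about what is wanted, not something the question states:

```python
import base64

raw = bytes([231, 13, 38, 246, 234, 144, 148, 111, 174, 136, 15, 61,
             200, 186, 215, 113, 0])

# Strict UTF-8 decoding rejects this sequence, mirroring the nil result.
try:
    raw.decode("utf-8")
    is_valid_utf8 = True
except UnicodeDecodeError:
    is_valid_utf8 = False
print(is_valid_utf8)  # False

# Two lossless text renderings of arbitrary bytes.
hex_text = raw.hex()
b64_text = base64.b64encode(raw).decode("ascii")
print(hex_text)
print(b64_text)

# Both renderings convert back to the original bytes.
assert bytes.fromhex(hex_text) == raw
assert base64.b64decode(b64_text) == raw
```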

Why won't Keras take my input?

Why does my Keras model not accept my input/output data?
The input data is a list of numpy.ndarrays of shape (15,1,3), and the output is a list of NumPy arrays with a single number in each entry.
Here is where I create my model and pass the data in:
model = Sequential()
print "Data-train-in: " + str(data_train_input[0].shape)
print "Data-train-out: " + str(data_train_output[0].shape)
print "Data-test-in: " + str(data_test_input[0].shape)
#sys.exit()
print "Model Definition"
print "Row: " + str(row)
model.add(Convolution2D(64,3,3,input_shape=(3,row,1)))
print model.output_shape
model.add(Convolution2D(32,1,3))
print model.output_shape
model.add(MaxPooling2D((1,1)))
print model.output_shape
model.add(Flatten())
print model.output_shape
model.add(Dense(1,activation='relu'))
print model.output_shape
model.compile(loss='mean_squared_error', optimizer="sgd")
reduce_lr=ReduceLROnPlateau(monitor='val_loss', factor=0.01, patience=3, verbose=1, mode='auto', epsilon=0.0001, cooldown=0, min_lr=0.000000000000000001)
stop = EarlyStopping(monitor='val_loss', min_delta=0, patience=5, verbose=1, mode='auto')
log=csv_logger = CSVLogger('training_'+str(i)+'.csv')
print "Model Train"
hist_current = model.fit(data_train_input,
data_train_output,
shuffle=False,
validation_data=(data_test_input,data_test_output),
validation_split=0.1,
nb_epoch=150,
verbose=1,
callbacks=[reduce_lr,log,stop])
Which outputs:
Data-train-in: (15, 1, 3)
Data-train-out: ()
Data-test-in: (15, 1, 3)
Model Definition
Row: 15
(None, 1, 13, 64)
(None, 1, 11, 32)
(None, 1, 11, 32)
(None, 352)
(None, 1)
Model Train
Traceback (most recent call last):
File "keras_convolutional_feature_extraction.py", line 502, in <module>
model(0,train_input_data,output_data_train,test_input_data,output_data_test)
File "keras_convolutional_feature_extraction.py", line 496, in model
callbacks=[reduce_lr,log,stop])
File "/usr/local/lib/python2.7/dist-packages/keras/models.py", line 652, in fit
sample_weight=sample_weight)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 1038, in fit
batch_size=batch_size)
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 963, in _standardize_user_data
exception_prefix='model input')
File "/usr/local/lib/python2.7/dist-packages/keras/engine/training.py", line 54, in standardize_input_data
'...')
Exception: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 1 arrays but instead got the following list of 260182 arrays: [array([[[ 67, 255, 180]],
[[ 68, 255, 178]],
[[ 68, 255, 178]],
[[ 67, 255, 180]],
[[ 43, 254, 204]],
[[ 19, 253, 228]],
[[ 9, 205, 241]],
[[ ...
I am not sure how to interpret the error message. What is wrong here?

Your data doesn't match your input layer. In your model you used input_shape=(3,row,1), which is equivalent to input_shape=(3,15,1) in this context.
But your print output shows that your training examples have a different shape, (15, 1, 3).
Try changing your input definition to input_shape=(row,1,3).
Another way to solve the problem is reshaping your data to the input layer shape.
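One way to bring the data to the layer's shape is to permute the axes. This is a sketch under assumptions: the batch is a made-up array with the per-sample shape from the question, and whether the resulting axis order is the meaningful one for the data is not something the question settles:

```python
import numpy as np

# Hypothetical batch: 4 samples, each shaped (15, 1, 3) like the question's data.
batch = np.arange(4 * 15 * 1 * 3).reshape(4, 15, 1, 3)

# Move the last axis (size 3) to position 1 so each sample becomes (3, 15, 1).
# transpose only permutes axes; a bare reshape to (4, 3, 15, 1) would give
# the same shape but assign values to different positions.
reordered = batch.transpose(0, 3, 1, 2)
print(reordered.shape)  # (4, 3, 15, 1)

# The same element is reachable under the permuted index.
assert reordered[2, 1, 7, 0] == batch[2, 7, 0, 1]
```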

import numpy as np
data_train_input = np.array(data_train_input)
Converting the list to a single NumPy array like this seems to work.
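The error message ("Expected to see 1 arrays but instead got the following list of 260182 arrays") suggests the list was read as many separate inputs rather than many samples of one input. A small sketch of the difference, with a made-up list standing in for data_train_input:

```python
import numpy as np

# Hypothetical stand-in for data_train_input: a Python list of per-sample arrays.
samples = [np.zeros((15, 1, 3)) for _ in range(5)]

# np.array stacks equally shaped arrays along a new leading axis, giving
# one array whose first dimension counts the samples.
stacked = np.array(samples)
print(type(samples).__name__, len(samples))  # list 5
print(stacked.shape)                         # (5, 15, 1, 3)
```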

combine multiple numpy ndarrays as list

I have three equally dimensioned numpy arrays.
I would like to store the data from all three in an array of the same dimensions and size.
To do this, I would like to store three bytes of information per item in the array. I assume this would be a list.
e.g.
>>> red = np.array([[150,25],[37,214]])
>>> green = np.array([[190,27],[123,231]])
>>> blue = np.array([[10,112],[123,119]])
insert combination magic to make a combined array called RGB
>>>RGB
array([(150,190,10),(25,27,112)],[(37,123,123),(214,231,119)])

For a start, each array is 2x2. Combining them in a list and passing that to np.array, the same construction used to make red, produces a 3x2x2 array.
In [344]: red = np.array([[150,25],[37,214]])
In [345]: green = np.array([[190,27],[123,231]])
In [346]: blue = np.array([[10,112],[123,119]])
In [347]: np.array([red,green,blue])
Out[347]:
array([[[150,  25],
        [ 37, 214]],
       [[190,  27],
        [123, 231]],
       [[ 10, 112],
        [123, 119]]])
In [348]: _.shape
Out[348]: (3, 2, 2)
That's not the axis order you want, but we can rearrange it with transpose.
The target, with an added set of [] so that it parses:
In [350]: np.array([[(150,190,10),(25,27,112)],[(37,123,123),(214,231,119)]])
Out[350]:
array([[[150, 190,  10],
        [ 25,  27, 112]],
       [[ 37, 123, 123],
        [214, 231, 119]]])
In [351]: _.shape
Out[351]: (2, 2, 3)
So try moving the size-3 axis to the end with transpose:
In [352]: np.array([red,green,blue]).transpose(1,2,0)
Out[352]:
array([[[150, 190,  10],
        [ 25,  27, 112]],
       [[ 37, 123, 123],
        [214, 231, 119]]])
===========================
I should have suggested np.stack. It is a newer relative of concatenate that joins arrays along a new dimension. With axis=0 it behaves like np.array. To join on the last axis, putting the RGB dimension last, use:
In [467]: np.stack((red,green,blue),axis=-1)
Out[467]:
array([[[150, 190,  10],
        [ 25,  27, 112]],
       [[ 37, 123, 123],
        [214, 231, 119]]])
In [468]: _.shape
Out[468]: (2, 2, 3)
Note that this expression does not assume anything about the shape of red, etc, except that they are equal. So it will work with 3d arrays as well.
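A short check of the points above: for 2-D inputs np.stack with axis=-1 matches the transpose approach, the same call works on higher-dimensional inputs, and each channel can be read back by indexing the last axis. The 3-D arrays r3, g3, b3 are made up for the illustration:

```python
import numpy as np

red = np.array([[150, 25], [37, 214]])
green = np.array([[190, 27], [123, 231]])
blue = np.array([[10, 112], [123, 119]])

rgb = np.stack((red, green, blue), axis=-1)
print(rgb.shape)            # (2, 2, 3)
print(rgb[0, 0].tolist())   # [150, 190, 10]

# For 2-D inputs this matches the transpose approach.
assert np.array_equal(rgb, np.array([red, green, blue]).transpose(1, 2, 0))

# The same call on 3-D inputs (hypothetical shape) adds a trailing axis.
r3, g3, b3 = (np.zeros((4, 2, 2), dtype=int) + k for k in range(3))
stacked_3d = np.stack((r3, g3, b3), axis=-1)
print(stacked_3d.shape)     # (4, 2, 2, 3)

# Each channel can be recovered by indexing the last axis.
assert np.array_equal(rgb[..., 1], green)
```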

String Input in Java

I am trying to read many strings, each on a separate line, and store all of them for later use. For example, I want to read the following input (the last line ends with a "."):
My name is ABCD
My name is BCDS
My name is fdada.
How can I implement this? I also want to use all of these strings afterwards. In Java or any other language I would have created a string array and used it to access all three strings.
But the moment I enter the first line, it gives me false.

You can use a failure-driven loop, like
:- dynamic a_line/1.
read_lines :-
    retractall(a_line(_)),
    repeat,
    read_line_to_codes(user_input, L),
    assertz(a_line(L)),
    ( last(L, 0'.) ; fail ).
and then
?- read_lines.
|: My name is ABCD
|: My name is BCDS
|: My name is fdada.
true .
The results are stored in a_line/1, so
?- a_line(L),atom_codes(A,L).
L = [77, 121, 32, 110, 97, 109, 101, 32, 105|...],
A = 'My name is ABCD'
L = [32, 77, 121, 32, 110, 97, 109, 101, 32|...],
A = ' My name is BCDS'.
...
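For comparison with the array-based approach the question mentions, the same read-until-a-period loop might look like this in Python. This is only an illustrative sketch: the function name read_until_period is made up, and io.StringIO stands in for interactive input so the example runs on its own:

```python
import io

def read_until_period(stream):
    """Collect lines from `stream` up to and including the first one ending in '.'."""
    lines = []
    for raw_line in stream:
        line = raw_line.rstrip("\n")
        lines.append(line)        # store every line, like assertz(a_line(L))
        if line.endswith("."):    # stop condition, like last(L, 0'.)
            break
    return lines

# io.StringIO stands in for sys.stdin; the fourth line is never read into the result.
demo_input = io.StringIO(
    "My name is ABCD\nMy name is BCDS\nMy name is fdada.\nignored\n"
)
stored = read_until_period(demo_input)
print(stored)  # ['My name is ABCD', 'My name is BCDS', 'My name is fdada.']
```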