Python image_dataset_loader module: "Instance shapes are not consistent" error with arrays

I want to import an image dataset into NumPy arrays of images and labels. I am trying to use image_dataset_loader to do this and have written this so far:
import image_dataset_loader
(x_train, y_train), (x_test, y_test) = image_dataset_loader.load('./data', ['train', 'test'])
I also have my data directory structured as follows:
data
  - train
    - male
      - male_1.jpg
      - male_2.jpg
      - male_3.jpg
      - male_4.jpg
      - ...
    - female
      - female_1.jpg
      - female_2.jpg
      - female_3.jpg
      - female_4.jpg
      - ...
  - test
    - male
      - male_1.jpg
      - male_2.jpg
      - male_3.jpg
      - male_4.jpg
      - ...
    - female
      - female_1.jpg
      - female_2.jpg
      - female_3.jpg
      - female_4.jpg
      - ...
I have formatted all my images to 120x120 and named them exactly as shown above. I have about 56,000 files per category. When I run the script above, it throws the following error:
Traceback (most recent call last):
File "main.py", line 33, in <module>
(x_train, y_train), (x_test, y_test) = image_dataset_loader.load('./data', ['train', 'test'])
File "/home/user/anaconda3/envs/AIOS/lib/python3.8/site-packages/image_dataset_loader.py", line 44, in load
raise RuntimeError('Instance shapes are not consistent.')
RuntimeError: Instance shapes are not consistent.
Can someone please help me sort these images into NumPy arrays?

Check the color mode of your images. Chances are that some of your images are grayscale: a grayscale image loads as a (120, 120) array, while an RGB image loads as (120, 120, 3), so the loader cannot stack them into a single array.
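A minimal NumPy sketch of why mixed color modes trigger this error, and one way to patch a grayscale array up to three channels (the array shapes match your 120x120 images; the variable names are illustrative):

```python
import numpy as np

# An RGB image decodes to (height, width, 3); a grayscale image decodes
# to (height, width). NumPy cannot stack arrays of different shapes,
# which is the check the loader is failing on.
rgb = np.zeros((120, 120, 3), dtype=np.uint8)
gray = np.zeros((120, 120), dtype=np.uint8)

try:
    np.stack([rgb, gray])
except ValueError:
    print("shapes are not consistent")

# One fix: replicate the single channel so every image is (120, 120, 3).
gray_rgb = np.stack([gray] * 3, axis=-1)
batch = np.stack([rgb, gray_rgb])
print(batch.shape)  # (2, 120, 120, 3)
```

Alternatively, fix the files on disk before loading, e.g. re-save each one with Pillow's `Image.open(path).convert("RGB")` so every image has three channels.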

TLorentz vector in Uproot 4

I am trying to use TLorentzVector in Uproot 4.
But I found that the methods in the "uproot_methods" module do not work with Awkward high-level arrays.
Error message:
Traceback (most recent call last):
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward/array/base.py", line 389, in _util_toarray
return cls.numpy.frombuffer(value, dtype=getattr(value, "dtype", defaultdtype)).reshape(getattr(value, "shape", -1))
TypeError: a bytes-like object is required, not 'Array'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "anal.py", line 19, in
Electron_T2vec = TVector2Array.from_polar(Electron_pt,Electron_phi)
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward/util.py", line 112, in func_wrapper
wrap, arrays = unwrap_jagged(cls, awkcls, _normalize_arrays(cls, arrays))
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward/util.py", line 84, in _normalize_arrays
arrays[i] = cls.awkward.util.toarray(arrays[i], cls.awkward.numpy.float64)
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward/util.py", line 32, in toarray
return awkward.array.base.AwkwardArray._util_toarray(value, defaultdtype, passthrough=passthrough)
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward/array/base.py", line 394, in _util_toarray
return cls.numpy.array(value, copy=False)
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward1/highlevel.py", line 1310, in array
return awkward1._connect._numpy.convert_to_array(self._layout, args, kwargs)
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward1/_connect/_numpy.py", line 16, in convert_to_array
out = awkward1.operations.convert.to_numpy(layout, allow_missing=False)
File "/home/jwkim/anaconda3/lib/python3.8/site-packages/awkward1/operations/convert.py", line 313, in to_numpy
return to_numpy(array.toRegularArray(), allow_missing=allow_missing)
ValueError: in ListOffsetArray64, cannot convert to RegularArray because subarray lengths are not regular
It seems that "uproot_methods" only supports awkward.array.jagged.JaggedArray.
Is there another way to use TLorentzVector with Uproot 4 (Awkward high-level arrays)?
I'm trying to convert this Uproot 3/Awkward 0 based code to Uproot 4/Awkward 1 based code:
https://github.com/JW-corp/J.W_Analysis/blob/main/Uproot/anal.py
Thank you!
The uproot-methods package only works with Uproot 3.x and therefore Awkward Array 0.x. In the Uproot 4/Awkward 1 world, the methods for ROOT objects that uproot-methods once provided are now handled by Uproot itself (the "community contributed project" didn't work out, so this functionality is moving back into Uproot), except for Lorentz vectors.
The eventual home for Lorentz vector handling is the vector project, which I'll be contributing to in March. Here it is, almost March, and so that will be happening soon. (I have one last task before working on that.)
In the meantime, the excellent Coffea project has a module for Lorentz-vector handling: coffea.nanoevents.methods.vector. To get started with Lorentz vectors now, I recommend that.
(The standalone vector package would become Uproot's default vector handler, as in, if you read a TLorentzVector from a ROOT file, it will have vector methods automatically—when vector is finished, that is.)
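In the meantime, if all you need are the numeric quantities, the arithmetic behind TLorentzVector is plain array math. Here is a hedged NumPy sketch (function names are illustrative, not part of uproot or coffea) that builds Cartesian components from (pt, eta, phi, mass) and computes an invariant mass:

```python
import numpy as np

def to_cartesian(pt, eta, phi, mass):
    """Convert (pt, eta, phi, mass) to (px, py, pz, E) -- the same
    quantities a TLorentzVector carries, for flat NumPy arrays."""
    px = pt * np.cos(phi)
    py = pt * np.sin(phi)
    pz = pt * np.sinh(eta)
    energy = np.sqrt(px**2 + py**2 + pz**2 + mass**2)
    return px, py, pz, energy

def invariant_mass(p1, p2):
    """Invariant mass of the sum of two four-vectors (px, py, pz, E)."""
    px = p1[0] + p2[0]
    py = p1[1] + p2[1]
    pz = p1[2] + p2[2]
    e = p1[3] + p2[3]
    # clip tiny negative values caused by floating-point rounding
    return np.sqrt(np.maximum(e**2 - px**2 - py**2 - pz**2, 0.0))

# Example: two massless particles back to back in phi -> mass ~ 2 * pt
m = invariant_mass(to_cartesian(10.0, 0.0, 0.0, 0.0),
                   to_cartesian(10.0, 0.0, np.pi, 0.0))
```

With coffea, the equivalent is to `ak.zip` your pt/eta/phi/mass branches with `with_name="PtEtaPhiMLorentzVector"` and the `coffea.nanoevents.methods.vector` behavior, which attaches these methods to jagged arrays directly.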

Issue with pixel_array using pydicom on python 3.x

I'm using pydicom (installed with pip3, on Python 3.7, using IDLE) and I need to access pixel_array values.
I just copy-pasted the example provided in the documentation, and this leads to two errors.
The first is about the get_testdata_files operation, which is not working:
Python 3.7.3 (v3.7.3:ef4ec6ed12, Mar 25 2019, 22:22:05) [MSC v.1916 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license()" for more information.
>>>
==================== RESTART: D:\OneDrive\Desktop\test.py ====================
None
Traceback (most recent call last):
File "D:\OneDrive\Desktop\test.py", line 8, in <module>
filename = get_testdata_files("bmode.dcm")[0]
IndexError: list index out of range
I worked around this by not using that operation.
The second is about pixel_array, and I'm not really able to work out what is wrong, but it seems the pixel_array is not populated. However, I am able to access other fields in the dataset, and the file can be displayed (using ImageJ, for example).
==================== RESTART: D:\OneDrive\Desktop\test.py ====================
None
Filename.........: bmode.dcm
Storage type.....: 1.2.840.10008.5.1.4.1.1.3.1
Patient's name...: Femoral trombenarterectomy, Case Report:
Patient id.......: Case Report 1
Modality.........: US
Study Date.......: 20110824
Image size.......: 768 x 1024, 27472108 bytes
Slice location...: (missing)
Traceback (most recent call last):
File "D:\OneDrive\Desktop\test.py", line 38, in <module>
plt.imshow(dataset.pixel_array, cmap=plt.cm.bone)
File "C:\Users\marcl\AppData\Local\Programs\Python\Python37\lib\site-packages\pydicom\dataset.py", line 949, in pixel_array
self.convert_pixel_data()
File "C:\Users\marcl\AppData\Local\Programs\Python\Python37\lib\site-packages\pydicom\dataset.py", line 895, in convert_pixel_data
raise last_exception
File "C:\Users\marcl\AppData\Local\Programs\Python\Python37\lib\site-packages\pydicom\dataset.py", line 863, in convert_pixel_data
arr = handler.get_pixeldata(self)
File "C:\Users\marcl\AppData\Local\Programs\Python\Python37\lib\site-packages\pydicom\pixel_data_handlers\pillow_handler.py", line 188, in get_pixeldata
UncompressedPixelData.extend(decompressed_image.tobytes())
File "C:\Users\marcl\AppData\Local\Programs\Python\Python37\lib\site-packages\PIL\Image.py", line 746, in tobytes
self.load()
File "C:\Users\marcl\AppData\Local\Programs\Python\Python37\lib\site-packages\PIL\ImageFile.py", line 261, in load
raise_ioerror(err_code)
File "C:\Users\marcl\AppData\Local\Programs\Python\Python37\lib\site-packages\PIL\ImageFile.py", line 58, in raise_ioerror
raise IOError(message + " when reading image file")
OSError: broken data stream when reading image file
Here is my code:
import matplotlib.pyplot as plt
import sys
import pydicom
import numpy
from pydicom.data import get_testdata_files

print(__doc__)

# filename = get_testdata_files("bmode.dcm")[0]
filename = "bmode.dcm"
dataset = pydicom.dcmread(filename)

# Normal mode:
print()
print("Filename.........:", filename)
print("Storage type.....:", dataset.SOPClassUID)
print()

pat_name = dataset.PatientName
display_name = pat_name.family_name + ", " + pat_name.given_name
print("Patient's name...:", display_name)
print("Patient id.......:", dataset.PatientID)
print("Modality.........:", dataset.Modality)
print("Study Date.......:", dataset.StudyDate)

if 'PixelData' in dataset:
    rows = int(dataset.Rows)
    cols = int(dataset.Columns)
    print("Image size.......: {rows:d} x {cols:d}, {size:d} bytes".format(
        rows=rows, cols=cols, size=len(dataset.PixelData)))
    if 'PixelSpacing' in dataset:
        print("Pixel spacing....:", dataset.PixelSpacing)

# use .get() if not sure the item exists, and want a default value if missing
print("Slice location...:", dataset.get('SliceLocation', "(missing)"))

# plot the image using matplotlib
plt.imshow(dataset.pixel_array, cmap=plt.cm.bone)
plt.show()
Could you help me solve these two errors and access the pixel_array values?
Don't hesitate to give me advice/remarks.
Thanks!
Hi Marc, welcome to SO!
Your first error means that get_testdata_files returns an empty list, so your file is not found. Have a look at the pydicom source: it shows that the search is performed in [DATA_ROOT]/test_files. Is your file located in that path?
Your second error is related to PIL, and that can be quite difficult to debug and fix. First try to read pixel_array from a dataset created from one of the supplied test files. If that works, your problem is probably that PIL cannot handle the specific encoding of your image data. You could install and use GDCM instead of PIL to see whether that solves the problem. Another user had a similar issue, and GDCM solved it, although it can be a bit of a headache to get working. Or have a look at this page; it shows some other alternatives for viewing the image data.
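For the first error, a small guard avoids the IndexError and makes the real failure (no matching file) obvious. This is a hypothetical helper, not part of pydicom:

```python
# Hypothetical helper: fall back to a local path when a file search
# (e.g. pydicom's get_testdata_files) returns no matches, instead of
# crashing with "IndexError: list index out of range" on [0].
def first_match(matches, fallback):
    if matches:
        return matches[0]
    print("no bundled test file found; using local path instead")
    return fallback

# Usage:
# filename = first_match(get_testdata_files("bmode.dcm"), "bmode.dcm")
```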

Attribute error while exporting data from App Engine using Remote API

Using the example code from App Engine gives an attribute error. The stranger thing is:
when batch_size is 100, the first fetch gives the error; if it is set to 10, the second fetch gives the error; and when batch_size is 1, the 25th fetch gives the error. Is this due to a problem with the Remote API?
Python version: 2.7
App engine sdk version: 1.9.6
query = MyModel.all()
entities = query.fetch(100)
while entities:
    for entity in entities:
        pass  # Do something with entity
    query.with_cursor(query.cursor())
    entities = query.fetch(100)
error message:
Traceback (most recent call last):
File "migrate.py", line 77, in <module>
entities = query.fetch(batch_size)
File "/home/kamel/Library/google_appengine/google/appengine/ext/db/__init__.py", line 2157, in fetch
return list(self.run(limit=limit, offset=offset, **kwargs))
File "/home/kamel/Library/google_appengine/google/appengine/ext/db/__init__.py", line 2326, in next
return self.__model_class.from_entity(self.__iterator.next())
File "/home/kamel/Library/google_appengine/google/appengine/ext/db/__init__.py", line 1435, in from_entity
entity_values = cls._load_entity_values(entity)
File "/home/kamel/Library/google_appengine/google/appengine/ext/db/__init__.py", line 1413, in _load_entity_values
value = prop.make_value_from_datastore(value)
File "/home/kamel/labola/src/model/properties.py", line 295, in make_value_from_datastore
return pickle.loads(value)
File "/usr/lib/python2.7/pickle.py", line 1382, in loads
return Unpickler(file).load()
File "/usr/lib/python2.7/pickle.py", line 858, in load
dispatch[key](self)
File "/usr/lib/python2.7/pickle.py", line 1083, in load_newobj
obj = cls.__new__(cls, *args)
AttributeError: class Reference has no attribute '__new__'
I encountered the same issue when trying to unpickle Python 3 pickles under Python 2. The problem is linked to new-style classes becoming the default in Python 3 (source).
The solution for me was to replace class AClass: with class AClass(object):
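A short sketch of why the (object) base matters. Protocol-2 pickles record object construction with the NEWOBJ opcode, which calls cls.__new__(cls, *args) at load time; Python 2 old-style classes have no __new__, which is exactly the AttributeError in the traceback. This runs under Python 3, where every class is new-style, and the key value is purely illustrative:

```python
class Reference(object):  # in Python 2, "(object)" is what provides __new__
    def __init__(self, key):
        self.key = key

# Roughly what pickle's NEWOBJ opcode does when loading a protocol-2
# pickle: create the instance via __new__ (bypassing __init__), then
# restore its state. With a Python 2 old-style class ("class Reference:"),
# the __new__ lookup fails with the AttributeError from the traceback.
obj = Reference.__new__(Reference)
obj.__dict__.update({"key": "example-key"})  # illustrative state
print(obj.key)  # example-key
```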

Trigger.io ToolKit fails to launch properly

I have been experimenting with Trigger.io and was successfully using it last week.
However, I have been trying new stuff out today, and have been unable to open the Trigger Toolkit properly. I'm not sure why - maybe I deleted something by accident.
The error returned was as follows:
Error in remote call to app.list: Expecting property name: line 25 column 2 (char 482)
Traceback (most recent call last):
File "/Users/josholdham/Library/Trigger Toolkit/build-tools/forge/async.py", line 96, in run
result = self._target(*self._args, **self._kwargs)
File "/Users/josholdham/Library/Trigger Toolkit/trigger/api/app.py", line 24, in list
return forge_tool.singleton.list_local_apps()
File "/Users/josholdham/Library/Trigger Toolkit/trigger/forge_tool.py", line 175, in list_local_apps
app_config = self._app_config_for_path(path)
File "/Users/josholdham/Library/Trigger Toolkit/trigger/forge_tool.py", line 147, in _app_config_for_path
app_config = json.load(app_config_file)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 278, in load
**kw)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/__init__.py", line 326, in loads
return _default_decoder.decode(s)
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 366, in decode
def raw_decode(self, s, idx=0):
File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/json/decoder.py", line 382, in raw_decode
ValueError: Expecting property name: line 25 column 2 (char 482)
I have tried uninstalling and reinstalling the toolkit a few times, but no joy.
Any help/ advice much appreciated.
Thanks
Josh
So I figured this out, I was being a bit stupid:
I had to access hidden files on my Mac to find the Users/josholdham/Library/Trigger Toolkit folder, which I then deleted.
Launching Toolkit again then worked fine.
This is how to show hidden files on the Mac, in case it isn't clear:
http://www.mikesel.info/show-hidden-files-mac-os-x-10-7-lion/
Hope that helps some people - thanks to all previous helpers for the guidance.
That looks like src/config.json is malformed in one of your apps.
I'd recommend looking at src/config.json in your apps' directories and checking the JSON syntax around line 25.
In the future, you can avoid these problems by using the App Config tab in the Toolkit to change the config, rather than editing src/config.json directly.
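To find which app's src/config.json is malformed, the standard-library json module reports the same line/column the Toolkit does. A minimal sketch (the path in the usage comment is illustrative):

```python
import json

def find_json_error(path):
    """Return None if the file parses as JSON, otherwise the parser's
    error message (which includes line and column, as in the Toolkit's
    "Expecting property name: line 25 column 2" error)."""
    try:
        with open(path) as f:
            json.load(f)
        return None
    except ValueError as err:  # json.JSONDecodeError subclasses ValueError
        return str(err)

# Usage (path is illustrative):
# print(find_json_error("MyApp/src/config.json"))
```

A trailing comma after the last property is a common cause of exactly this "Expecting property name" message.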

Getting error "ImportError: Could not find 'input_readers' on path 'mapreduce'" trying to start a MapReduce job

I'm getting the error "ImportError: Could not find 'input_readers' on path 'mapreduce'" when trying to run my MapReduce job via the http://localhost:8080/mapreduce launcher page.
It looks like my problem is similar to this post, AppEngine mapper API import error. Unfortunately, no definitive answers were given.
I've simplified it down to this tiny testmapreduce.py:
from google.appengine.ext import db

class TestEntity(db.Model):
    value = db.StringProperty()

def mapperhandler(test):
    print test.value
    return
And my mapreduce.yaml:
mapreduce:
- name: Simplest MapReduce
  mapper:
    handler: testmapreduce.mapperhandler
    input_reader: mapreduce.input_readers.DatastoreInputReader
    params:
    - name: entity_kind
      default: testmapreduce.TestEntity
One possible clue is that the presence of __init__.py has no effect (whether in the project root, the mapreduce directory, or both). I'm sure I'm making a beginner mistake, but over the last couple of days I have read every bit of documentation and all the examples I can find. Thanks.
UPDATE:
I get the same error trying to invoke it via...
control.start_map(
    "Give awards",
    "testmapreduce.mapperhandler",
    "mapreduce.input_readers.DatastoreInputReader",
    {"entity_kind": "testmapreduce.TestEntity"},
    shard_count=10)
UPDATE:
As requested, the stack trace -- let me know what else would be helpful...
ERROR 2011-10-16 17:09:27,216 _webapp25.py:464] Could not find 'input_readers' on path 'mapreduce'
Traceback (most recent call last):
File "/Develop/GoogleAppEngineLauncher.app/Contents/Resources/GoogleAppEngine-default.bundle/Contents/Resources/google_appengine/google/appengine/ext/webapp/_webapp25.py", line 701, in __call__
handler.get(*groups)
File "/Users/lc/PycharmProjects/mrtest/testmapreduce.py", line 22, in get
shard_count=10) # handle web form test interface
File "/Users/lc/PycharmProjects/mrtest/mapreduce/control.py", line 94, in start_map
transactional=transactional)
File "/Users/lc/PycharmProjects/mrtest/mapreduce/handlers.py", line 811, in _start_map
mapper_input_reader_class = mapper_spec.input_reader_class()
File "/Users/lc/PycharmProjects/mrtest/mapreduce/model.py", line 393, in input_reader_class
return util.for_name(self.input_reader_spec)
File "/Users/lc/PycharmProjects/mrtest/mapreduce/util.py", line 94, in for_name
module = for_name(module_name, recursive=True)
File "/Users/lc/PycharmProjects/mrtest/mapreduce/util.py", line 102, in for_name
short_name, module_name))
ImportError: Could not find 'input_readers' on path 'mapreduce'
INFO 2011-10-16 22:09:27,253 dev_appserver.py:4247] "GET /giveawards HTTP/1.1" 500 -
This problem turned out to be that I was using version 2.7 of the Python interpreter in my local environment. When I switched to 2.5, it worked fine.
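For context, the call that fails, mapreduce.util.for_name, just resolves a dotted path like "mapreduce.input_readers.DatastoreInputReader" to an object by importing the module and looking up the attribute. A rough modern-Python equivalent (for illustration only, not the App Engine code itself):

```python
import importlib

def for_name(dotted_path):
    """Resolve 'package.module.attr' to the attribute it names --
    roughly what mapreduce.util.for_name does internally. The
    ImportError above means this import/lookup chain broke under the
    asker's interpreter setup."""
    module_name, _, attr = dotted_path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr)

print(for_name("json.loads")('[1, 2, 3]'))  # [1, 2, 3]
```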
