numpy array has different shape when I pass it as input to a keras layer

I have a keras encoder (part of an autoencoder) built this way:
from keras.layers import Input, Dense
from keras.models import Model

input_vec = Input(shape=(200,))
encoded = Dense(20, activation='relu')(input_vec)
encoder = Model(input_vec, encoded)
I want to generate a dummy input using numpy.
>>> np.random.rand(200).shape
(200,)
But if I try to pass it as input to the encoder, I get a ValueError:
>>> encoder.predict(np.random.rand(200))
Traceback (most recent call last):
  File "<console>", line 1, in <module>
  File "/home/francesco/PycharmProjects/W2VAutoencoded/venv/lib/python3.6/site-packages/keras/engine/training.py", line 1817, in predict
    check_batch_axis=False)
  File "/home/francesco/PycharmProjects/W2VAutoencoded/venv/lib/python3.6/site-packages/keras/engine/training.py", line 123, in _standardize_input_data
    str(data_shape))
ValueError: Error when checking : expected input_1 to have shape (200,) but got array with shape (1,)
What am I missing?

While Keras Layers (Input, Dense, etc.) take as parameters the shape(s) for a single sample, Model.predict() takes as input batched data (i.e. samples stacked over the 1st dimension).
Right now your model believes you are passing it a batch of 200 samples of shape (1,).
This would work:
batch_size = 1
encoder.predict(np.random.rand(batch_size, 200))
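If the data starts out as a single 1-D sample, a batch axis can be added to it instead of regenerating the array. A minimal sketch using only NumPy (the variable names are illustrative and not from the original post):

```python
import numpy as np

# One sample with 200 features, as in the question.
sample = np.random.rand(200)

# Three equivalent ways to add a leading batch axis of size 1.
batch_a = sample[np.newaxis, :]
batch_b = np.expand_dims(sample, axis=0)
batch_c = sample.reshape(1, -1)

print(sample.shape)   # (200,)
print(batch_a.shape)  # (1, 200)
```

Any of the three batched arrays has the (batch, features) layout that predict() expects for this encoder.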

Related

django Model one-to-many save fail

I am starting to develop an app that uses the Django model ORM. I came upon a strange behavior when creating and saving objects related by a one-to-many relation. I have made two model creation test cases of a simple one-to-many relationship; one works and the other fails, but I don't understand why. Here are my models:
class Document(models.Model):
    pass

class Section(models.Model):
    document = models.ForeignKey('Document', on_delete=models.CASCADE)
Here is the working creation test case (in manage.py shell):
>>> doc = models.Document()
>>> doc.save()
>>> section = models.Section(document=doc)
>>> section.document
<Document: Document object (5)>
>>> section.save()
>>>
And here is the test case that fails:
>>> doc = models.Document()
>>> section = models.Section(document=doc)
>>> section.document
<Document: Document object (None)>
>>> doc.save()
>>> section.document
<Document: Document object (6)>
>>> section.save()
Traceback (most recent call last):
File "<input>", line 1, in <module>
section.save()
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 741, in save
force_update=force_update, update_fields=update_fields)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 779, in save_base
force_update, using, update_fields,
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 870, in _save_table
result = self._do_insert(cls._base_manager, using, fields, update_pk, raw)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/base.py", line 908, in _do_insert
using=using, raw=raw)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/manager.py", line 82, in manager_method
return getattr(self.get_queryset(), name)(*args, **kwargs)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/query.py", line 1186, in _insert
return query.get_compiler(using=using).execute_sql(return_id)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/sql/compiler.py", line 1335, in execute_sql
cursor.execute(sql, params)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 99, in execute
return super().execute(sql, params)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 67, in execute
return self._execute_with_wrappers(sql, params, many=False, executor=self._execute)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 76, in _execute_with_wrappers
return executor(sql, params, many, context)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/utils.py", line 84, in _execute
return self.cursor.execute(sql, params)
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/backends/mysql/base.py", line 76, in execute
raise utils.IntegrityError(*tuple(e.args))
django.db.utils.IntegrityError: (1048, "Column 'document_id' cannot be null")
>>> section.document
Traceback (most recent call last):
File "<input>", line 1, in <module>
section.document
File "/home/philippe/.local/lib/python3.6/site-packages/django/db/models/fields/related_descriptors.py", line 189, in __get__
"%s has no %s." % (self.field.model.__name__, self.field.name)
diagnosisrank.models.common.Section.document.RelatedObjectDoesNotExist: Section has no document.
>>>
The only difference from the working test case is that the related Document is not saved before the Section is instantiated. However, after saving the Document, section.document points to the saved Document (its id is set). But when I try to save the Section, the related id is not set and the related model is lost. Why is that? Why must the related Document instance be assigned to the Section only after it has been saved?
To clarify, my goal is to collect all the information and instantiate all my models in a first step, then save everything to the database in a second step. I can still do it this way: say D has many S; create D and S, save D, assign D to S, save S. But I would prefer to create D, create S with related D, save D, save S. Why can't I?
Thanks for any help or insight!
Phil
Well, it was a bug, and it has been fixed: https://github.com/django/django/commit/519016e5f25d7c0a040015724f9920581551cab0
However, the fix is not in the latest stable release, 2.2.4, that I get from pip3 ...
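For context, the failure is consistent with the foreign key value being copied from the parent at assignment time, so that a parent saved afterwards leaves the copy as None. The following is a toy model in plain Python, not Django's actual code: the classes Parent and Child and the attribute parent_id are hypothetical stand-ins for Document, Section and document_id. It also shows the workaround already mentioned in the question, re-assigning the parent after saving it.

```python
import itertools

_next_pk = itertools.count(1)


class Parent:
    """Hypothetical stand-in for Document."""

    def __init__(self):
        self.pk = None  # no primary key until saved

    def save(self):
        if self.pk is None:
            self.pk = next(_next_pk)


class Child:
    """Hypothetical stand-in for Section."""

    def __init__(self, parent):
        self.parent = parent  # goes through the setter below

    @property
    def parent(self):
        return self._parent

    @parent.setter
    def parent(self, obj):
        # The key is copied once, at assignment time.
        self._parent = obj
        self.parent_id = obj.pk

    def save(self):
        if self.parent_id is None:
            raise ValueError("parent_id cannot be null")


doc = Parent()
section = Child(parent=doc)  # doc.pk is None, so parent_id is None
doc.save()                   # doc.pk is now set, parent_id is still None

try:
    section.save()
    failed = False
except ValueError:
    failed = True

section.parent = doc         # workaround: re-assign after saving the parent
section.save()               # succeeds, parent_id now matches doc.pk
```

On a Django version without the fix, the equivalent step is to run section.document = doc again after doc.save() and before section.save().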

The list of Numpy arrays that are passed to model is not the size the model expected

I am trying to visualize the layers of convolutional and capsule networks. The code for visualization is as follows:
layer_outputs = [layer.get_output_at(0) for layer in model.layers[:12]]  # Extracts the outputs of the first 12 layers
activation_model = models.Model(inputs=model.input, outputs=layer_outputs)  # Creates a model that will return these outputs, given the model input
activations = activation_model.predict(img_tensor)
Here, img_tensor is an array of shape (1, 28, 28, 1): an image from the MNIST dataset. Running the code throws the following error:
ValueError: Error when checking model input: the list of Numpy arrays that you are passing to your model is not the size the model expected. Expected to see 2 array(s), but instead got the following list of 1 arrays: [array([[[[0.0000000e+00],
[0.0000000e+00],
[0.0000000e+00],
[0.0000000e+00],
[0.0000000e+00],
[0.0000000e+00],
[0.0000000e+00],
[0.00000...
The error is raised by the line activations = activation_model.predict(img_tensor).
Does anyone know why this happens?
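The error message says the model expects 2 input arrays, but predict() received only 1, so your input does not match the model definition.
Quick debug: compare the number and shapes of the arrays you pass with the input layers listed in the model summary.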
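That comparison can be sketched as a small helper. This is a minimal illustration in plain Python: the function name describe_input_mismatch is hypothetical, and the expected shapes below (an image input plus a second, 10-wide input) are an assumed example; the real ones have to be read off your own model, for instance from model.summary().

```python
def describe_input_mismatch(expected_shapes, actual_shapes):
    """Compare per-input shapes; None in an expected shape matches any size."""
    problems = []
    if len(expected_shapes) != len(actual_shapes):
        problems.append(
            "expected %d input array(s), got %d"
            % (len(expected_shapes), len(actual_shapes))
        )
    for i, (exp, act) in enumerate(zip(expected_shapes, actual_shapes)):
        same_rank = len(exp) == len(act)
        dims_ok = same_rank and all(e is None or e == a for e, a in zip(exp, act))
        if not dims_ok:
            problems.append("input %d: expected %s, got %s" % (i, exp, act))
    return problems


# Assumed example: a two-input model, but only the image is passed.
expected = [(None, 28, 28, 1), (None, 10)]
actual = [(1, 28, 28, 1)]
print(describe_input_mismatch(expected, actual))
```

If the model really has two inputs, predict() needs a list with one array per input, in the order the inputs were declared.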

PyTorch 2d Convolution with sparse filters

I am trying to perform a spatial convolution (e.g. on an image) in pytorch on dense input using a sparse filter matrix.
Sparse Tensors are implemented in PyTorch. I tried to use a sparse Tensor, but it ends up with a segmentation fault.
import torch
from torch.autograd import Variable
from torch.nn import functional as F
# build sparse filter matrix
i = torch.LongTensor([[0, 1, 1],[2, 0, 2]])
v = torch.FloatTensor([3, 4, 5])
filter = Variable(torch.sparse.FloatTensor(i, v, torch.Size([3,3])))
inputs = Variable(torch.randn(1,1,6,6))
F.conv2d(inputs, filter)
Can anyone just give me a hint how to do that?
Thanks in advance!
dymat
I know this question is outdated but I also know that there are still people looking for an answer (like myself) so here goes...
On sparse filters
If you'd like sparse convolution without the freedom to specify the sparsity pattern yourself, take a look at dilated conv (also called atrous conv). This is implemented in PyTorch and you can control the degree of sparsity by adjusting the dilation param in Conv2d.
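To see why dilation amounts to a fixed sparsity pattern, the following NumPy sketch expands a small kernel into the larger, mostly-zero kernel that a dilated convolution effectively applies. The helper name dilate_kernel is hypothetical and this is not PyTorch code; in PyTorch the same effect comes from the dilation argument of Conv2d.

```python
import numpy as np


def dilate_kernel(kernel, dilation):
    """Insert (dilation - 1) zeros between the taps of a square 2-D kernel."""
    k = kernel.shape[0]
    size = dilation * (k - 1) + 1
    expanded = np.zeros((size, size), dtype=kernel.dtype)
    expanded[::dilation, ::dilation] = kernel
    return expanded


kernel = np.arange(1, 10, dtype=np.float32).reshape(3, 3)
sparse_equivalent = dilate_kernel(kernel, dilation=2)
print(sparse_equivalent.shape)  # (5, 5)
print(sparse_equivalent)
```

Only 9 of the 25 entries are non-zero, and their positions are fixed by the dilation value, which is why this approach does not let you choose the pattern freely.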
If you'd like to specify the sparsity pattern yourself, to the best of my knowledge, this feature is not currently available in PyTorch. But you may want to check this out if you are ok with using Tensorflow. There is also a blog post providing more details on this repo.
On sparse input
A list of existing and TODO sparse tensor operations is available here.
This talks about the current state of sparse tensors in PyTorch.
This lets you propose your own sparse tensor use case to the PyTorch contributors.
But at the time of this writing, I did not see conv on sparse tensors being an implemented feature or on the TODO list. nn.Linear on sparse input, however, is supported.
And if you build a sparse tensor and apply a conv layer to it, PyTorch (1.1.0) throws an exception:
>>> a = torch.zeros((1, 3, 2, 2), layout=torch.sparse_coo)
>>> net = torch.nn.Conv2d(1, 1, 1)
>>> b = net(a)
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/module.py", line 493, in __call__
result = self.forward(*input, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/torch/nn/modules/conv.py", line 338, in forward
self.padding, self.dilation, self.groups)
RuntimeError: sparse tensors do not have is_contiguous
>>> torch.__version__
'1.1.0'
Changing to a linear layer works:
>>> c = torch.zeros((1, 2), layout=torch.sparse_coo)
>>> another_net = torch.nn.Linear(2, 1)
>>> d = another_net(c)
>>> d
tensor([[0.1944]], grad_fn=<AddmmBackward>)
>>> d.backward()
>>> another_net.weight.grad
tensor([[0., 0.]])
>>> another_net.bias.grad
tensor([1.])
These guys did something like a sparse conv2d: https://github.com/numenta/nupic.torch/

TensorFlow learn.Estimator : is it naive to call fit() many times? Because I get ResourceExhaustedError

I am learning machine learning using TensorFlow. I have been through a couple of tutorials, but I still have a hard time finding good ways of training a model. Recently I implemented a CNN model I found in the literature. The model must take a crop of a certain size centered on a given pixel and predict the label of this pixel. It does that for each pixel of the image. I used:
classifier = tf.learn.Estimator(model_fn=cnn_model_fn, model_dir="./cnn")
with cnn_model_fn being a function I implemented.
For each training image, we take 3000 crops randomly, so I can't load all these images and their crops into memory. The approach I found is to load one image at a time, extract the 3000 crops, and then call classifier.fit() to train on those 3000 crops, looping over each image in my dataset.
for i in range(len(filenames)):
    ...
    image = misc.imread(filenames[i])
    labels = misc.imread(groundTruth[i])  # labels for each pixel
    input_classifier = preprocess(image,...)  # crops 3000 images from image and does other things
    input_labels = preprocess_labels(labels, ...)  # takes the corresponding 3000 labels
    classifier.fit(x=input_classifier,
                   y=input_labels,
                   batch_size=30,
                   steps=100)
It worked fine for 100 images, but if I try it on the whole dataset (2000 images), it always stops and gives a ResourceExhaustedError.
...
[everything goes well]
...
iteration :227/2000
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating
TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus
id: 0000:01:00.0)
INFO:tensorflow:Create CheckpointSaverHook.
I tensorflow/core/common_runtime/gpu/gpu_device.cc:975] Creating
TensorFlow device (/gpu:0) -> (device: 0, name: Tesla K40c, pci bus
id: 0000:01:00.0)
Traceback (most recent call last):
File "train-cnn.py", line 78, in <module>
classifier.fit(x= input_classifier, y=input_labels,batch_size=30, steps=100)
File "/usr/local/lib/python3.5/dist-packages/tensorflow/python/util/deprecation.py", line 280, in new_func
...
...
...
tensorflow.python.framework.errors_impl.ResourceExhaustedError: cnn/graph.pbtxt.tmp32bcc6311c164c29b91177d17d05d669
I don't see why it gets OOM... I have suspicions that it is because of the way I call fit() in loop. After each fit(), a ckpt is saved and it must be restored right after to train on the next image. So is it a bad way to train a model?
Running estimator.fit in a loop with a small number of steps is not a good idea. I would put all the input logic into an input_fn, then run estimator.fit only once with more steps.
An example of reading data from different files can be found here: tf.contrib.learn.read_batch_examples
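One way to picture "all input logic in one place" is a generator that walks over the images and yields batches of crops, so that only one image is held in memory at a time and training is started once. This is a plain-Python sketch, not the Estimator API: batches_from_files, load_image and extract_crops are hypothetical names, and how such a generator is wired into an actual input_fn depends on your TensorFlow version.

```python
def batches_from_files(filenames, load_image, extract_crops, batch_size):
    """Yield (crops, labels) batches, loading one image at a time."""
    for name in filenames:
        image = load_image(name)
        crops, labels = extract_crops(image)
        for start in range(0, len(crops), batch_size):
            end = start + batch_size
            yield crops[start:end], labels[start:end]


# Stand-in loaders so the sketch runs without any image files.
def fake_load(name):
    return name


def fake_crops(image):
    crops = ["%s-crop%d" % (image, i) for i in range(5)]
    labels = list(range(5))
    return crops, labels


batches = list(
    batches_from_files(["a.png", "b.png"], fake_load, fake_crops, batch_size=2)
)
print(len(batches))  # 6
```

Because the generator is lazy, the crops of the next image are only produced when the training loop asks for them.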

Problems saving arrays as greyscale 'L' images using matplotlib?

I'm trying to save an array as an image using plt.imsave(). The original image is a 16-bit greyscale 'L' tiff. But I keep on getting the error:
Attribute error: 'str' object has no attribute 'shape'
figsize = [x / float(dpi) for x in (arr.shape[1], arr.shape[0])]
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import numpy as np
from PIL import Image
im2=plt.imread(r'C:\Documents\Image\pic.tif')
plt.imsave(im2, '*.tif')
The image is 2048x2048, the array is 2048Lx2048L. Nothing I've tried works: shape=[2048,2048], im2.shape(2048,2048). Can anybody tell me how to add shape as a keyword argument? Or is there an easier way to do this, preferably avoiding PIL, since it seems to have issues with 16-bit greyscale tiffs and I absolutely have to use that format?
I think you've got the arguments backwards. From help(plt.imsave):
Help on function imsave in module matplotlib.pyplot:
imsave(*args, **kwargs)
    Saves a 2D :class:`numpy.array` as an image with one pixel per element.
    The output formats available depend on the backend being used.
    Arguments:
      *fname*:
        A string containing a path to a filename, or a Python file-like object.
        If *format* is *None* and *fname* is a string, the output
        format is deduced from the extension of the filename.
      *arr*:
        A 2D array.
i.e.:
>>> im2.shape
(256, 256)
>>> plt.imsave(im2, "pic.tif")
Traceback (most recent call last):
File "<ipython-input-36-a7bbfaeb1a4c>", line 1, in <module>
plt.imsave(im2, "pic.tif")
File "/usr/lib/pymodules/python2.7/matplotlib/pyplot.py", line 1753, in imsave
return _imsave(*args, **kwargs)
File "/usr/lib/pymodules/python2.7/matplotlib/image.py", line 1230, in imsave
figsize = [x / float(dpi) for x in arr.shape[::-1]]
AttributeError: 'str' object has no attribute 'shape'
>>> plt.imsave("pic.tif", im2)
>>>
