I recently switched from using regular old tests to using WebTest and this "No Database Test Runner"
from django.test.simple import DjangoTestSuiteRunner

class NoTestDbDatabaseTestRunner(DjangoTestSuiteRunner):
    def setup_databases(self, **kwargs):
        pass

    def teardown_databases(self, old_config, **kwargs):
        pass
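For context, a runner like this is typically wired up through the TEST_RUNNER setting; a minimal sketch, assuming the class lives in a module such as myapp/testrunner.py (the module path here is hypothetical):

# settings.py
TEST_RUNNER = 'myapp.testrunner.NoTestDbDatabaseTestRunner'  # hypothetical module path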
Here's an example test which HAS to be hitting the database somehow...
What is happening? Are my tests hitting the database but rolling back to some old state? From test to test I can see that each created listing gets an incremented id.
def test_image_upload(self):
    form_data = self.listing_form_defaults.copy()
    form_data['images-TOTAL_FORMS'] = '3'
    upload_files = [
        ('images-0-image', 'testdata/1.png'),
        ('images-1-image', 'testdata/2.png'),
        ('images-2-image', 'testdata/3.png'),
    ]
    form_resp = self.app.post(
        reverse('listing_create'),
        form_data,
        upload_files=upload_files,
        user='kmike'
    ).follow()
    assert len(form_resp.context['listing'].images.all()) == 3
form_resp.context['listing'].images.all() HAS to be hitting the database; I printed it and it contained records from my database.
I'm just confused--my tests run blazing fast and don't seem to actually change my database, how is this working/happening?!
Tests that require a database (namely, model tests) will not use your “real” (production) database. Separate, blank databases are created for the tests.
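For what it's worth, a standard django.test.TestCase wraps each test in a transaction that is rolled back when the test finishes, which is why rows created in one test are not visible in the next even though database sequences keep incrementing. A minimal sketch, assuming a hypothetical Listing model:

from django.test import TestCase
from myapp.models import Listing  # hypothetical app and model

class RollbackDemo(TestCase):
    def test_rows_disappear_after_each_test(self):
        Listing.objects.create(title='temp')           # hypothetical field
        self.assertEqual(Listing.objects.count(), 1)   # rolled back once the test ends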
Background
I am using file system storage with the Shrine::Attachment module in a model setting (my_model), with ActiveRecord (Rails). I am also using it in a direct upload scenario, so I need the response from the file upload (save to cache).
my_model.rb
class MyModel < ApplicationRecord
  include ImageUploader::Attachment(:image) # adds an `image` virtual attribute

  # omitted relations & code...
end
my_controller.rb
def create
  @my_model = MyModel.new(my_model_params)

  # currently creating derivatives & persisting all in one go
  @my_model.image_derivatives! if @my_model.image

  if @my_model.save
    render json: { success: "MyModel created successfully!" }
  else
    @errors = @my_model.errors.messages
    render 'errors', status: :unprocessable_entity
  end
end
Goal
Ideally I want to clear only the cached file(s) I currently have hold of in my create controller, as soon as they (the original file and the derivatives) have been persisted to permanent storage.
What is the best way to do this for scenario A (synchronous) and scenario B (asynchronous)?
What I have considered/tried
After reading through the docs I have noticed three possible ways of clearing cached images:
1. Run a rake task to clear cached images.
I really don't like this, as I believe the cached files should be cleaned up once the file has been persisted, and not left as an admin task (cron job) that can't be tested with an image persistence spec.
# FileSystem storage
file_system = Shrine.storages[:cache]
file_system.clear! { |path| path.mtime < Time.now - 7*24*60*60 } # delete files older than 1 week
2. Run Shrine.storages[:cache] in an after block
Is this only for background jobs?
attacher.atomic_persist do |reloaded_attacher|
  # run code after attachment change check but before persistence
end
3. Move the cache file to permanent storage
I don't think I can use this, as my direct upload occurs in two distinct parts: (1) immediately uploading the attached file to the cache store, then (2) saving it to the newly created record.
plugin :upload_options, cache: { move: true }, store: { move: true }
Are there better ways of clearing promoted images from cache for my needs?
Synchronous solution for the single image upload case:
def create
  @my_model = MyModel.new(my_model_params)
  image_attacher = @my_model.image_attacher
  image_attacher.create_derivatives               # create the different sized images
  image_cache_id = image_attacher.file.id         # save the cache file id, as it will be lost in the next step
  image_attacher.record.save(validate: true)      # promote the original file to permanent storage
  Shrine.storages[:cache].delete(image_cache_id)  # only clear the cached image that was used to create the derivatives
                                                  # (other images that are still cached/processing are left alone)
end
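For scenario B (asynchronous), here is a rough sketch using Shrine's backgrounding plugin with a Sidekiq worker; the PromoteJob name is my own assumption, and the idea is to delete the cached upload only after a successful atomic promote:

# config/initializers/shrine.rb
Shrine.plugin :backgrounding
Shrine::Attacher.promote_block do
  PromoteJob.perform_async(self.class.name, record.class.name, record.id, name, file_data)
end

# app/jobs/promote_job.rb (hypothetical job class)
class PromoteJob
  include Sidekiq::Worker

  def perform(attacher_class, record_class, record_id, name, file_data)
    attacher_class = Object.const_get(attacher_class)
    record = Object.const_get(record_class).find(record_id)

    attacher = attacher_class.retrieve(model: record, name: name, file: file_data)
    cached_file = attacher.file  # hold a reference to the cached upload
    attacher.create_derivatives
    attacher.atomic_promote      # move original + derivatives to :store
    cached_file.delete           # clear only this record's cached copy
  rescue Shrine::AttachmentChanged, ActiveRecord::RecordNotFound
    # attachment changed or the record was deleted; nothing to promote or clean up
  end
end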
I am trying to run a Spring Boot test against an H2 in-memory DB with MyBatis.
So far I have:
configured the H2 DB in application-test.properties
added the annotations
@SpringBootTest, @TestPropertySource(locations = "TEST_APPLICATION_PROPERTIES_LOCATION")
autowired the DAO and ServiceImpl beans
added seed.sql and purge.sql to the test class with the following (a full assembled sketch is shown below)
@SqlGroup({
    @Sql(executionPhase = Sql.ExecutionPhase.BEFORE_TEST_METHOD, scripts = "classpath:/database/seed.sql"),
    @Sql(executionPhase = Sql.ExecutionPhase.AFTER_TEST_METHOD, scripts = "classpath:/database/purge.sql") })
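Assembled, a hedged sketch of how these pieces might fit together in one test class (the class name, DAO bean, and assertion are assumptions on my part):

import static org.junit.jupiter.api.Assertions.assertNotNull;

import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.test.context.TestPropertySource;
import org.springframework.test.context.jdbc.Sql;
import org.springframework.test.context.jdbc.SqlGroup;

@SpringBootTest
@TestPropertySource(locations = "classpath:application-test.properties")
@SqlGroup({
    @Sql(executionPhase = Sql.ExecutionPhase.BEFORE_TEST_METHOD, scripts = "classpath:/database/seed.sql"),
    @Sql(executionPhase = Sql.ExecutionPhase.AFTER_TEST_METHOD, scripts = "classpath:/database/purge.sql") })
class UserDaoTest {

    @Autowired
    private UserDao userDao; // hypothetical MyBatis DAO

    @Test
    void findsSeededUser() {
        assertNotNull(userDao.findById("admin")); // should find the user inserted by seed.sql
    }
}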
Despite the above measures, I still have two problems:
I can't retrieve the user that I inserted with seed.sql. I created a user with id="admin", pw="admin", and tried to retrieve it with findById("admin"), but it always returns null.
I can't open the H2 console while debugging with @Test. I simply can't access H2 at localhost:8080/h2-console (the path is written explicitly in application-test.properties).
Is there any extra measure I should take to test Spring Boot with H2?
Add spring.h2.console.enabled=true in your properties file.
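For reference, a hedged sketch of what application-test.properties might look like for an in-memory H2 setup with the console enabled (the URL and credentials are assumptions). Also note that the console is only reachable from a browser when a real servlet container is running, e.g. with @SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.DEFINED_PORT), not with the default mock web environment:

# application-test.properties (sketch; values are assumptions)
spring.datasource.url=jdbc:h2:mem:testdb;DB_CLOSE_DELAY=-1
spring.datasource.driver-class-name=org.h2.Driver
spring.datasource.username=sa
spring.datasource.password=
spring.h2.console.enabled=true
spring.h2.console.path=/h2-console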
Here is the snippet I'm using for my end-to-end tests with Selenium (I'm totally new to Selenium/Django testing):
import time

from django.contrib.auth.models import User
from django.contrib.staticfiles.testing import StaticLiveServerTestCase
from selenium.webdriver.chrome.webdriver import WebDriver

class MyTest(StaticLiveServerTestCase):
    @classmethod
    def setUpClass(cls):
        super(MyTest, cls).setUpClass()
        cls.selenium = WebDriver()
        cls.user = User.objects.create_superuser(username=...,
                                                 password=...,
                                                 email=...)
        time.sleep(1)
        cls._login()

    @classmethod
    def _login(cls):
        cls.selenium.get(
            '%s%s' % (cls.live_server_url, '/admin/login/?next=/'))
        ...

    def test_login(self):
        self.selenium.implicitly_wait(10)
        self.assertIn(self.username,
                      self.selenium.find_element_by_class_name("fixtop").text)

    def test_go_to_dashboard(self):
        query_json, saved_entry = self._create_entry()
        self.selenium.get(
            '%s%s' % (
                self.live_server_url, '/dashboard/%d/' % saved_entry.id))
        # assert on displayed values

    def _create_entry(self):
        # create an entry using the form and return it
        ...

    def test_create(self):
        self.maxDiff = None
        query_json, saved_entry = self._create_entry()
        # ... assert on displayed values
I noticed that the login does not persist between tests. I could call _login in setUp, but that makes my tests slower.
So how can I keep a persistent login between tests? What are the best practices for this kind of test (Django Selenium tests)?
Through-the-browser tests with Selenium are slow, period. They are, however, very valuable as they're the best shot you have at automating the true user experience.
You shouldn't try to write true unit tests with Selenium. Instead, use it to write one or two large functional tests. Try to capture an entire user interaction from start to finish. Then structure your test suite so that you can run your fast, non-Selenium unit tests separately, and only have to run the slow functional tests on occasion.
Your code looks fine, but in this scenario you'd combine test_go_to_dashboard and test_create into one method.
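As a side note, one way to keep the slow browser tests separate from the fast unit tests is Django's test tagging (available since Django 1.10); a small sketch, with an arbitrary tag name:

from django.contrib.staticfiles.testing import StaticLiveServerTestCase
from django.test import tag

@tag('selenium')  # arbitrary tag name
class DashboardFunctionalTest(StaticLiveServerTestCase):
    ...

# run only the fast tests:    ./manage.py test --exclude-tag=selenium
# run only the browser tests: ./manage.py test --tag=selenium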
kevinharvey pointed me to the solution! I finally found a way to reduce the testing time while keeping track of all the tests:
I renamed all the methods starting with test_ to _test_ and added a main method that calls each _test_ method:
def test_main(self):
    for attr in dir(self):
        # call each subtest and avoid a recursive call
        if attr.startswith('_test_') and attr != self.test_main.__name__:
            with self.subTest("subtest %s " % attr):
                self.selenium.get(self.live_server_url)
                getattr(self, attr)()
This way, I can still test (debug) each method individually :)
I am running Grails 1.3.7 and using the Grails database migration plugin, version database-migration-1.0.
The problem is that I have a migration changeset that pulls blobs out of a table and writes them to disk. When running through this migration, though, I run out of heap space. I was thinking I would need to flush and clear the session to free up some space; however, I am having difficulty getting access to the session from within the migration. BTW, the reason this is in a migration is that we are moving away from storing files in Oracle and putting them on disk instead.
I have tried
SessionFactoryUtils.getSession(sessionFactory, true)
I have also tried
SecurityRequestHolder.request.getSession(false) // request is null -> not surprising
changeSet(author: "userone", id: "saveFilesToDisk-1") {
grailsChange{
change{
def fileIds = sql.rows("""SELECT id FROM erp_file""")
for (row in fileIds) {
def erpFile = ErpFile.get(row.id)
erpFile.writeToDisk()
session.flush()
session.clear()
propertyInstanceMap.get().clear()
}
ConfigurationHolder.config.erp.ErpFile.persistenceMode = previousMode
}
}
}
Any help would be greatly appreciated.
The application context will be automatically available in your migration as ctx. You can get the session like this:
def session = ctx.sessionFactory.currentSession
To access the session, you can use the withSession closure like this:
Book.withSession { session ->
    session.clear()
}
But this may not be the reason why your app runs out of heap space. If the data volume is large, then
def fileIds = sql.rows("""SELECT id FROM erp_file""")
for (row in fileIds) {
    ..........
}
will use up your heap. Try to process the data with pagination; don't load all the data at once.
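A rough sketch of what that could look like inside the change block, assuming groovy.sql.Sql's paginated rows(sql, offset, maxRows) (1-based offset) and the ctx trick above; the batch size is arbitrary:

def batchSize = 100
def offset = 1
while (true) {
    // fetch one page of ids at a time instead of the whole table
    def rows = sql.rows("SELECT id FROM erp_file ORDER BY id", offset, batchSize)
    if (!rows) break
    rows.each { row ->
        ErpFile.get(row.id)?.writeToDisk()
    }
    // release the blobs held by the Hibernate session before the next page
    def session = ctx.sessionFactory.currentSession
    session.flush()
    session.clear()
    offset += batchSize
}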
I've been working on creating a subclass of db.Model that is automatically cached, i.e.:
instance.put would store the entity in memcache before persisting it to the datastore
class.get_by_key_name would first check the cache, and if missed, would go to the datastore to retrieve it and cache it after retrieval
I developed the approach below (which appears to work for me), but I have a few questions:
I had read Nick Johnson's article on efficient model memcaching which suggests implementing the serialization for memcache through protocol buffers. Looking at the memcache API source code in the SDK, it looks like Google has already implemented protobuf serialization by default. Is my interpretation correct?
Am I missing some important details (which could get me in the future) in the way I am subclassing db.Model or overriding the two methods?
Is there a more efficient way of implementing what I've done below?
Are there guidelines, benchmarks or best practices for when such entity caching would make sense from a performance perspective? Or would it always make sense to cache entities? On a related note, should I be reading anything into the fact that Google hasn't provided a cached model in the modeling API? Are there too many special cases to be thinking about?
Below is my current implementation. I would really appreciate any and all guidance/suggestions on caching entities (even if your response is not a direct answer to one of the 4 questions above, but relevant to the topic overall).
from google.appengine.ext import db
from google.appengine.api import memcache
import os
import logging

class CachedModel(db.Model):
    '''Subclass of db.Model that automatically caches entities for put and
    attempts to load from cache for get_by_key_name
    '''
    @classmethod
    def get_by_key_name(cls, key_names, parent=None, **kwargs):
        cache = memcache.Client()
        # Ensure that every new deployment of the application results in a cache miss
        # by including the application version ID in the namespace of the cache entry
        namespace = os.environ['CURRENT_VERSION_ID'] + '_' + cls.__name__
        if not isinstance(key_names, list):
            key_names = [key_names]
        entities = cache.get_multi(key_names, namespace=namespace)
        if entities:
            logging.info('%s (namespace=%s) retrieved from memcache' % (str(entities.keys()), namespace))
        missing_key_names = list(set(key_names) - set(entities.keys()))
        # For keys missed in memcache, attempt to retrieve entities from datastore
        if missing_key_names:
            missing_entities = super(CachedModel, cls).get_by_key_name(missing_key_names, parent, **kwargs)
            missing_mapping = zip(missing_key_names, missing_entities)
            # Determine entities that exist in datastore and store them to memcache
            entities_to_cache = dict()
            for key_name, entity in missing_mapping:
                if entity:
                    entities_to_cache[key_name] = entity
            if entities_to_cache:
                logging.info('%s (namespace=%s) cached by get_by_key_name' % (str(entities_to_cache.keys()), namespace))
                cache.set_multi(entities_to_cache, namespace=namespace)
            non_existent = set(missing_key_names) - set(entities_to_cache.keys())
            if non_existent:
                logging.info('%s (namespace=%s) missing from cache and datastore' % (str(non_existent), namespace))
            # Combine entities retrieved from cache and entities retrieved from datastore
            entities.update(missing_mapping)
        if len(key_names) == 1:
            return entities[key_names[0]]
        else:
            return [entities[key_name] for key_name in key_names]

    def put(self, **kwargs):
        cache = memcache.Client()
        namespace = os.environ['CURRENT_VERSION_ID'] + '_' + self.__class__.__name__
        cache.set(self.key().name(), self, namespace=namespace)
        logging.info('%s (namespace=%s) cached by put' % (self.key().name(), namespace))
        return super(CachedModel, self).put(**kwargs)
Rather than reinventing the wheel, why not switch to NDB, which already implements memcaching of model instances?
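For reference, a minimal sketch of the NDB equivalent; ndb.Model transparently caches gets in an in-process context cache and in memcache, so no subclassing is needed (the model, key name, and property here are hypothetical):

from google.appengine.ext import ndb

class CachedThing(ndb.Model):
    name = ndb.StringProperty()

thing = CachedThing(id='example', name='demo')  # hypothetical key name and value
thing.put()                                     # written to the datastore and cached automatically
same = CachedThing.get_by_id('example')         # may be served from the context cache or memcache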
You might check out Nick Johnson's article on adding pre and post hooks for data model classes as an alternative to overriding get_by_key_name. That way your hook could work even when using db.get and db.put.
That said, I've found in my app that I've had more dramatic performance improvements from caching things at a higher level, like all the content I need to render an entire page, or the page's HTML itself if possible.
You also might check out the asynctools library which can help you run datastore queries in parallel and cache the results.
A lot of the good tips from Nick Johnson that you want to implement are already implemented in the appengine-mp module, like serialization via protocol buffers or prefetching entities.
Regarding your get_by_key_name method, you can check that module's code. If you want to create your own db.Model layer, maybe it can help you, but you could also contribute to improving the existing model. ;)