Performance considerations with db.BlobProperty() - google-app-engine

In my app, I'd like to have an entity defined like this:
class MyModel(db.Model):
    title = db.StringProperty()
    myBlob = db.BlobProperty()
Say this blob holds around 1 megabyte. Will this slow down any queries I make on the MyModel type? Is it fetching the entire 1 MB per entity, or just references until I actually try to access the blob?

As soon as you retrieve the entity, the blob is loaded from the datastore, unless you use a projection query.
You have a few options to avoid loading the BlobProperty until you need it:
Do a projection query, and fetch the full entity only when you need it.
Put the BlobProperty in a child entity (with the top-level entity as its ancestor) and fetch it with a get only when you need it.
Don't use a BlobProperty at all; store the data in GCS (Google Cloud Storage) and serve it from there.
The last option has the added benefit that, if you do no processing on the blob, your App Engine instance doesn't need to be involved in serving it (depending on your requirements, of course).
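The child-entity option can be illustrated without any App Engine imports. In this minimal sketch, `FakeDatastore` is a hypothetical in-memory stand-in (not the real datastore API), and the kind name `MyModelBlob` is made up for the example. It only shows the shape of the idea: listing the parent records never touches the large payload, which sits in a separate record fetched by key on demand.

```python
# Hypothetical in-memory stand-in for a datastore; NOT the App Engine API.
class FakeDatastore(object):
    def __init__(self):
        self.records = {}    # key tuple -> dict of properties
        self.bytes_read = 0  # rough count of property bytes returned

    def put(self, key, props):
        self.records[key] = dict(props)

    def get(self, key):
        props = self.records[key]
        self.bytes_read += sum(len(v) for v in props.values())
        return props

    def query(self, kind):
        # Loads every property of every record of the given kind.
        keys = sorted(k for k in self.records if k[0] == kind)
        return [self.get(k) for k in keys]


store = FakeDatastore()
for i in range(3):
    parent_key = ("MyModel", i)
    # Small parent record: only the lightweight properties.
    store.put(parent_key, {"title": "item %d" % i})
    # Child record keyed under the parent holds the large payload.
    store.put(("MyModelBlob", parent_key), {"data": b"x" * 1024 * 1024})

# Listing the parents reads only the titles.
titles = [m["title"] for m in store.query("MyModel")]
bytes_for_listing = store.bytes_read

# The blob is fetched by key only when it is actually needed.
blob = store.get(("MyModelBlob", ("MyModel", 1)))["data"]
```

With the real API, the same split would mean two model classes, with the blob-holding one created with the other as its parent.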

Related

Does deferred save in Objectify apply per Java entity or per Google Cloud Datastore entity?

Our app logic has multiple layers. Each time a save is invoked, the entity on the domain layer is mapped to a database entity.
For example:
class Sample {} // the domain entity
@Entity(name = "Sample")
@Cache
class DatabaseSample {} // the database entity
Let's assume the domain entity is modified and save is invoked, which will map all properties to a new database entity, which is then saved deferred.
Let's assume the same domain entity is modified again and saved again, which will again map all properties to a new database entity and invoke deferred save.
Will the two separate Google Cloud Datastore entities compete with each other, i.e. will the newer save overwrite the older save completely, or will Objectify collect the modified key-value pairs during the request and save one consolidated entity at the end of the request?
I don't quite understand the question, but I'll try to give you some context that might help.
If you defer save of an entity POJO, it just marks it as "save this thing" in the session. When the session closes (end of transaction) anything marked for deferred save (or delete) gets saved.
Only at the very moment of saving to the datastore does the POJO get mapped to a low-level Entity. You can defer save the same object a zillion times with no practical effect. In fact, this is the core use case: you might have one method that changes some data and wants a save, and another method that changes some other data and wants a save. By using deferred saves, you avoid making separate real saves to the datastore.
If you're asking about doing something really crazy like changing the @Id after deferring a save... don't. I really have no idea what that will do, but you probably won't like it :-)
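The behavior described in this answer can be modeled with a toy session. This is a sketch in Python, not Objectify's implementation: `DeferredSession`, `defer_save`, and `close` are hypothetical names, and tracking objects by identity is an assumption made for the example. It shows only the two points the answer makes: repeated deferred saves of the same object collapse into one write, and the object is mapped to its stored form at flush time, so the latest field values win.

```python
class DeferredSession(object):
    """Toy session; NOT Objectify, only a model of the described behavior."""

    def __init__(self, datastore):
        self.datastore = datastore  # plain dict standing in for the datastore
        self.pending = {}           # id(obj) -> obj marked for saving
        self.writes = 0             # number of real writes performed

    def defer_save(self, obj):
        # Marking the same object again has no further effect.
        self.pending[id(obj)] = obj

    def close(self):
        # Objects are mapped to their stored form only now, so whatever
        # values the fields hold at this point are what get written.
        for obj in self.pending.values():
            self.datastore[obj.key] = dict(vars(obj))
            self.writes += 1
        self.pending.clear()


class Sample(object):
    def __init__(self, key, name):
        self.key = key
        self.name = name


datastore = {}
session = DeferredSession(datastore)
sample = Sample("sample-1", "first")
session.defer_save(sample)
sample.name = "second"
session.defer_save(sample)  # same object, still a single pending save
session.close()
```

Note that this toy says nothing about two distinct objects that share a key, which is the case the question describes; the answer above does not settle that either.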

Auto creation of entities in blobuploadsession datastore when using upload to url functionality in blob datastore

I'm wondering whether deleting these entities in blobuploadsession would affect my app's functionality or performance in any way. The reason for deleting them is that when a new form is created and no files are uploaded through it, unnecessary entities are left behind.
(edit: additional info from comment)
I use blobstore (part of NDB) to store images asynchronously via upload URL functionality. When I run the app on localhost, there is an auto-creation of a datastore called "BlobUploadSession". This is the entity where all the URLs for the images to be uploaded are stored as entities. When I upload a photo to the URL, it goes into the "BlobInfo" datastore. Now, I don't have a need of the URLs since the photo has already been uploaded. So, I'm wondering if I can delete the BlobUploadSession entities? Btw, BlobUploadSession and BlobInfo are default datastores automatically created.
The __BlobUploadSession__ and __BlobInfo__ entities are created by and only internally used by the development server while emulating the blobstore functionality.
There are other, similarly named __SomeEntityName__ entities that emulate other pieces of functionality. For example, a pile of them is created when you request datastore stats (that function doesn't exist per se in production).
These entities aren't created on GAE, so no need to worry about them in production.
See also the related question "How to remove built-in kinds' names in google datastore using kind queries".

Deleted Datastore entries reappear

I'd like to re-open "Deleted Datastore entries reappear" as a registered user. Can the old question be deleted?
I'll try to be more specific this time. I'm experiencing the following problem:
Initially I put N entities of the same kind into the Datastore like that:
datastore_entity = MyModel(model_property=property_value)
datastore_entity.put()
Afterwards I delete them. I have used the Datastore Admin interface as well as a self-defined handler for the mapreduce library to do so. The deleted entities appear neither in the Datastore viewer nor in the Datastore Admin view.
When I put even a single new entity of this kind into the Datastore, the old Datastore entities reappear in the Datastore Admin view while the new entity does not (judging by the number of entities). In contrast, the Datastore viewer correctly reflects the Datastore state. A query also returns only the newly created entity.
There are no tasks at the time the new entity is being put into the Datastore.
I'm also not encountering this problem on my local machine where I'm using the --clean_datastore option when starting the server.
The Datastore Admin and Datastore Statistics are not "live". The Datastore viewer offers a live view.
Check "Entity statistics last updated..." and you will notice the difference.
If the old entities are not visible in the Datastore viewer - no need to worry. Eventually the statistics will be updated.

In the python version of Google App Engine, how can you override the db.Model class to save to a temp datastore rather than big table?

In Singapore, we are teaching students Python using Singpath (singpath.appspot.com). In addition to letting students practice writing software in Python, we would like to familiarize students with the google.appengine.ext.db API used to access Bigtable.
What is the easiest way to modify db.Model settings in an App Engine app so that any puts or gets access a local, temporary datastore rather than writing to Bigtable? I'm trying to do something similar to how gaeunit creates a new temporary datastore each time unit tests are run.
from google.appengine.ext import db
import logging
class MyModel(db.Model):
    name = db.StringProperty()
#Configure a temp datastore to be updated instead of bigtable.
m = MyModel()
m.put() #shouldn't update bigtable
result = MyModel.all() #should fetch from temp datastore
logging.info("There were %s models saved", result.count())
You can certainly do this in the dev server by creating a new stub datastore when you want, like gaeunit. I don't think the concept really transfers to the production environment, though. A temporary datastore would have to have some kind of backing store, either the real datastore or memcache. AFAIK there's no built-in support for either.
An alternative would be to use the real datastore with some sandboxing.
You could override db.Model.kind to prefix a session ID:
@classmethod
def kind(cls):
    return "%s_%s" % (SESSION_ID, cls.__name__)
This would give you basic namespacing for user-created entities.
If you have a session entity, you could populate it as the parent entity on any query that does not already specify one. This would force all of your user's entities into one entity group.
In either case, at the start of the session you could schedule a task to run later that would clean up entities that the user created.
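The kind-prefix idea above can be tried in isolation. In this sketch, `Model` is a minimal stand-in for db.Model so that the example runs without the SDK, and `SESSION_ID`, `SandboxedModel`, and the value "abc123" are hypothetical. In a real app the session ID would have to come from the current request rather than from a module-level constant.

```python
SESSION_ID = "abc123"  # hypothetical; a real app would derive this per session


class Model(object):
    """Minimal stand-in for db.Model, only here to make the sketch runnable."""

    @classmethod
    def kind(cls):
        # Default behavior: the kind is the class name.
        return cls.__name__


class SandboxedModel(Model):
    @classmethod
    def kind(cls):
        # Prefix the kind with the session ID so each session gets its own kinds.
        return "%s_%s" % (SESSION_ID, cls.__name__)


class MyModel(SandboxedModel):
    pass


default_kind = Model.kind()
sandboxed_kind = MyModel.kind()
```

Putting the override in a shared base class means every student-defined model that inherits from it is sandboxed without further changes.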

google app engine: How do I add fields to an existent entity

I have a Google App Engine app where I would like to extend one of my entity definitions. How would I ensure that existing entity objects get the new fields properly initialized? Would the existing objects, the next time I query them, simply have default values? I'd like to add a StringListProperty.
If you add a new property to your model, existing entities will have the default value for it when you load them, if you supplied a default. They won't show up in queries for that value until you fetch them and store them again, though.
You will have to add the property to all of your existing entities, one by one.
You don't mention which language or API you are using. The exact details of the procedure will vary with your situation.
In general, the safest way to do this is to load each entity with an API that doesn't validate your entities. In Python, you can use Expando models. In Java, you can use the low-level datastore API (trying this with JDO or JPA may not work). You then need to iterate through all existing entities (try the new Mapper API to do this with relatively little fuss). For each entity, load it, add your new property, then put/save it back to the datastore. After that you can safely go back to a framework that validates your entities, like JDO or non-Expando models.
This method applies to modifying the type of a property or deleting a property as well.
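The load, add, save loop described above has roughly this shape. This is a sketch only: plain dicts stand in for entities loaded through a non-validating API, `backfill` is a hypothetical helper, and the point where a real app would write the entity back is marked in a comment.

```python
def backfill(entities, prop_name, make_default):
    """Add prop_name to each entity lacking it; return those needing a re-save."""
    changed = []
    for entity in entities:
        if prop_name not in entity:
            entity[prop_name] = make_default()  # fresh default per entity
            changed.append(entity)              # real code would put/save here
    return changed


# Dicts stand in for schema-less entities loaded without validation.
entities = [
    {"title": "old"},                   # stored before the property existed
    {"title": "newer", "tags": ["a"]},  # already has the property
]
changed = backfill(entities, "tags", list)
```

Passing a factory such as `list` rather than a shared `[]` gives each entity its own list, which matters for a list-valued property like the one in the question.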

Resources