Datastore query without model class - google-app-engine

I recently encountered a situation where one might want to run a datastore query which includes a kind, but the class of the corresponding model is not available (e.g. if it's defined in a module that hasn't been imported yet).
I couldn't find any out-of-the-box way to do this using the google.appengine.ext.db package, so I ended up using the google.appengine.api.datastore.Query class from the low-level datastore API.
This worked fine for my needs (my query only needed to count the number of results, without returning any model instances), but I was wondering if anyone knows of a better solution.
Another approach I've tried (which also worked) was subclassing db.GqlQuery to bypass its constructor. This might not be the cleanest solution, but if anyone is interested, here is the code:
import logging
from google.appengine.ext import db, gql
class ClasslessGqlQuery(db.GqlQuery):
"""
This subclass of :class:`db.GqlQuery` uses a modified version of ``db.GqlQuery``'s constructor to suppress any
:class:`db.KindError` that might be raised by ``db.class_for_kind(kindName)``.
This allows using the functionality :class:`db.GqlQuery` without requiring that a Model class for the query's kind
be available in the local environment, which could happen if a module defining that class hasn't been imported yet.
In that case, no validation of the Model's properties will be performed (will not check whether they're not indexed),
but otherwise, this class should work the same as :class:`db.GqlQuery`.
"""
def __init__(self, query_string, *args, **kwds):
"""
**NOTE**: this is a modified version of :class:`db.GqlQuery`'s constructor, suppressing any :class:`db.KindError`s
that might be raised by ``db.class_for_kind(kindName)``.
In that case, no validation of the Model's properties will be performed (will not check whether they're not indexed),
but otherwise, this class should work the same as :class:`db.GqlQuery`.
Args:
query_string: Properly formatted GQL query string.
*args: Positional arguments used to bind numeric references in the query.
**kwds: Dictionary-based arguments for named references.
Raises:
PropertyError if the query filters or sorts on a property that's not indexed.
"""
from google.appengine.ext import gql
app = kwds.pop('_app', None)
namespace = None
if isinstance(app, tuple):
if len(app) != 2:
raise db.BadArgumentError('_app must have 2 values if type is tuple.')
app, namespace = app
self._proto_query = gql.GQL(query_string, _app=app, namespace=namespace)
kind = self._proto_query._kind
model_class = None
try:
if kind is not None:
model_class = db.class_for_kind(kind)
except db.KindError, e:
logging.warning("%s on %s without a model class", self.__class__.__name__, kind, exc_info=True)
super(db.GqlQuery, self).__init__(model_class)
if model_class is not None:
for property, unused in (self._proto_query.filters().keys() +
self._proto_query.orderings()):
if property in model_class._unindexed_properties:
raise db.PropertyError('Property \'%s\' is not indexed' % property)
self.bind(*args, **kwds)
(also available as a gist)

You could create a temporary class just to do the query. If you use an Expando model, the properties of the class don't need to match what is actually in the datastore.
class KindName(ndb.Expando):
pass
You could then do:
KindName.query()
If you need to filter on specific properties, then I suspect you'll have to add them to the temporary class.

Related

ApiTransformer for parametrized, unavailable type

I'm using Objectify and wish to have its Key<> type passed around in my API. I've created an ApiTransformer, but my questions is where to declare it, since the serialized Key<> class is not available, hence I cannot declare its transformer as a class annotation. I tried declaring it in the #Api annotation, but it doesn't work, I still get the error:
There was a problem generating the API metadata for your Cloud Endpoints classes: java.lang.IllegalArgumentException: Parameterized type com.googlecode.objectify.Key<[my package].User> not supported.
The ApiTransformer looks like:
public class KeyTransformer implements Transformer<Key<?>, String> {
public String transformTo(Key<?> in) {
return in.getString();
}
public Key<?> transformFrom(String in) {
return Key.valueOf(in);
}
}
And in my #Api I have:
#Api(name = "users", version = "v1",transformers = {KeyTransformer.class})
Unfortunately you can't. As you said you need to declare it on the Key class, your only chances to make this work are either.
1) Recompile the Key class for objectify with the #transformer annotation.
2) Extend the Key class with your own implementation and define the transformer there.
I don't really like any of those options so the way i usually resolve this is to hide the key object getter (by using #ApiResourceProperty(ignored=AnnotationBoolean.TRUE)) and only expose the id from that key.
That way you get a Endpoints frendly object, the only downside is you'll have to reconstitute the key using Key.create(YourClass.class, longId) manually whenever you need it.
You can add transforms to 3rd party classes by listing the transform in #Api annotation. I'm not dead sure it'll work parameterized class, but I don't see why not.
https://cloud.google.com/appengine/docs/java/endpoints/javadoc/com/google/api/server/spi/config/Api#transformers()

Require an entity parent in google app engine

I have a model that is meaningless without a parent. Is there a way to force an entity to have a parent? I would like an exception to be raised if the child entity is ever instantiated without a parent, similar to a required property.
class Parent(db.Model):
eye_color = db.StringProperty(required=True)
class Child(db.Model):
pass
Does not raise an exception:
mom = Parent(eye_color='purple')
jimmy = Child(parent=mom)
Raises an exception:
mom = Parent(eye_color='purple')
jimmy = Child()
I haven't tried this personally, but you should be able to override __init__ for the Child class and check to make sure the parent is not None. Like so:
class Child(db.Model):
pass
def __init__(self,
parent=None,
key_name=None,
_app=None,
_from_entity=False,
**kwds):
if not parent:
raise ValueError('parent is required.')
super(Did, self).__init__(parent=parent, key_name=key_name, app=_app,
_from_entity=_from_entity, **kwds)
With ndb, you can use a pre put hook method to check if that instance has a parent and raise an exception if it doesnt. I see you are using the older db module, I believe it doesn't have the same hook methods. You should consider moving to the much better and improved ndb datastore API, youll get other benefits like automatic caching and many more.
NDB: https://developers.google.com/appengine/docs/python/ndb/overview
NDB model hooks: https://developers.google.com/appengine/docs/python/ndb/entities#hooks
EDIT: I just reminded that you could do something similar to the ndb model hook with the db API. Explained in this, great as usual, post from Nick Johnson.

How do I handle objects that are part of a Model object’s state, but don’t need separate db-level support?

In my Google App Engine app I have model objects that need to be stored. These objects are parameterized by various policy objects. For example, my Event class has a Privacy policy object which determines who can see, update, etc. There are various subclasses of PrivacyPolicy that behave differently. The Event consults its PrivacyPolicy object at various points.
class PrivacyPolicy(db.Model):
def can_see(self, event, user):
pass
class OwnerOnlyPolicy(PrivacyPolicy):
def can_see(self, event, user):
return user == event.owner
class GroupOnlyPolicy(PrivacyPolicy):
def can_see(self, event, user):
for grp in event.owner.groups()
if grp.is_member(user):
return True
return False
class OnlyCertainUsersPolicy(PrivacyPolicy):
def __init__(self, others):
self.others = others
def can_see(self, event, user):
return user in others
I could make my Event class use a ReferenceProperty to the PrivacyPolicy:
class Event(db.Model):
privacy: db.ReferenceProperty(PrivacyPolicy)
#…
The reason I don’t like this is that the one-to-one relationship means that nobody every queries for the policy object, there is no need to maintain the back-reference from the policy to its Event object, and in no other way is PrivacyPolicy an independent db-level object. It is functionally equivalent to an IntegerProperty, in that it is part of the Event object’s state, it’s just an object instead of a number — specifically it’s an object that can have zero state or lots of state, unknown to the Event type.
I can’t find anyone talking about how to approach such a situation. Is there a tool/approach I don’t know about? Do I just suck it up and use a reference property and the hell with the overhead?
If the only other way to handle this is a custom Property type, any advice about how to approach it would be welcome. My first thought is to use a TextProperty to store the string rep of the policy object (policy), decode it when needed, caching the result, and having any change to the policy object invalidate the cache and update the string rep.
You're overcomplicating by trying to store this in the datastore. This belongs in code rather than in the datastore.
The least complicated way would be:
class Event(db.Model):
privacy = db.IntegerProperty()
def can_see(self, user):
if self.privacy == PRIVACY_OWNER_ONLY:
return user == event.owner
else if self.privacy == PRIVACY_GROUP:
for grp in self.owner.groups()
if grp.is_member(user):
return True
return False
Sometimes all it takes is to think of the right approach. The solution is to introduce a new kind of property that uses pickle to store and retrieve values, such as that described in https://groups.google.com/forum/?fromgroups#!topic/google-appengine/bwMD0ZfRnJg
I wanted something slightly more sophisticated, because pickle isn’t always the answer, and anyway documentation is nice, so here is my ObjectReference type:
import pickle
from google.appengine.ext import db
class ObjectProperty(db.Property):
def __init__(self, object_type=None, verbose_name=None, to_store=pickle.dumps, from_store=pickle.loads, **kwds):
"""Initializes this Property with all the given options
All args are passed to the superclass. The ones used specifically by this class are described here. For
all other args, see base class method documentation for details.
Args:
object_type: If not None, all values assigned to the property must be either instances of this type or None
to_store: A function to use to convert a property value to a storable str representation. The default is
to use pickle.dumps()
from_store: A function to use to convert a storable str representation to a property value. The default is
to use pickle.loads()
"""
if object_type and not isinstance(object_type, type):
raise TypeError('object_type should be a type object')
kwds['indexed'] = False # It never makes sense to index pickled data
super(ObjectProperty, self).__init__(verbose_name, **kwds)
self.to_store = to_store
self.from_store = from_store
self.object_type = object_type
def get_value_for_datastore(self, model_instance):
"""Get value from property to send to datastore.
We retrieve the value of the attribute and return the result of invoking the to_store function on it
See base class method documentation for details.
"""
value = getattr(model_instance, self.name, None)
return self.to_store(value)
def make_value_from_datastore(self, rep):
"""Get value from datastore to assign to the property.
We take the value passed, convert it to str() and return the result of invoking the from_store function
on it. The Property class assigns this returned value to the property.
See base class method documentation for details.
"""
# It passes us a unicode, even though I returned a str, so this is required
rep = str(rep)
return self.from_store(rep)
def validate(self, value):
"""Validate reference.
Returns:
A valid value.
Raises:
BadValueError for the following reasons:
- Object not of correct type.
"""
value = super(ObjectProperty, self).validate(value)
if value is not None and not isinstance(value, self.object_type):
raise db.KindError('Property %s must be of type %s' % (self.name, self.object_type))
return value

App Engine multiple namespaces

Recently there's been some data structure changes in our app, and we decided to use namespaces to separate different versions of of the data, and a mapreduce task that converts old entities to the new format.
Now that's all fine, but we don't want to always isolate the entire data set we have. The biggest part of our data is stored in a kind that's pretty simple and doesn't need to change often. So we decided to use per-kind namespaces.
Something like:
class Author(ndb.model.Model):
ns = '2'
class Book(ndb.model.Model):
ns = '1'
So, when migrating to version 2, we don't need to convert all our data (and copy all 'Book' kinds to the other namespace), only entities of the 'Author' kind. Then, instead of defining the appengine_config.namespace_manager_default_namespace_for_request, we just the 'namespace' keyword arguments to our queries:
Author.query(namespace=Author.ns).get()
Question: how to store (i.e. put()) the different kinds using these different namespaces? Something like:
# Not an API
Author().put(namespace=Author.ns)
Of course, the above doesn't work. (Yes, I could ask the datastore for an avaliable key in that namespace, and then use that key to store the instance with, but it's an extra API call that I'd like to avoid.)
To solve a problem like this I wrote a decorator as follows:
MY_NS = 'abc'
def in_my_namespace(fn):
"""Decorator: Run the given function in the MY_NS namespace"""
from google.appengine.api import namespace_manager
#functools.wraps(fn)
def wrapper(*args, **kwargs):
orig_ns = namespace_manager.get_namespace()
namespace_manager.set_namespace(MY_NS)
try:
res = fn(*args, **kwargs)
finally: # always drop out of the NS on the way up.
namespace_manager.set_namespace(orig_ns)
return res
return wrapper
So I can simply write, for functions that ought to occur in a separate namespace:
#in_my_namespace
def foo():
Author().put() # put into `my` namespace
Of course, applying this to a system to get the results you desire is a bit beyond the scope of this, but I thought it might be helpful.
EDIT: Using a with context
Here's how to accomplish the above using a with context:
class namespace_of(object):
def __init__(self, namespace):
self.ns = namespace
def __enter__(self):
self.orig_ns = namespace_manager.get_namespace()
namespace_manager.set_namespace(self.ns)
def __exit__(self, type, value, traceback):
namespace_manager.set_namespace(self.orig_ns)
Then elsewhere:
with namespace_of("Hello World"):
Author().put() # put into the `Hello World` namespace
A Model instance will use the namespace you set with the namespace_manager[1] as you can see here: python/google/appengine/ext/db/init.py
What you could do is create a child class of Model which expects a class-level 'ns' attribute to be defined. This sub class then overrides put() and sets the namespace before calling original put and resets the namespace afterwards. Something like this:
'''
class MyModel(db.Model):
ns = None
def put(*args, **kwargs):
if self.ns == None:
raise ValueError('"ns" is not defined for this class.')
original_namespace = namespace_manager.get_namespace()
try:
super(MyModelClass, self).put(*args, **kwargs)
finally:
namespace_manager.set_namespace(original_namespace)
'''
[1] http://code.google.com/appengine/docs/python/multitenancy/multitenancy.html
I don't think that it is possible to avoid the extra API call. Namespaces are encoded into the entity's Key, so in order to change the namespace within which a entity is stored, you need to create a new entity (that has a Key with the new namespace) and copy the old entity's data into it.

How to introduce required property in GAE

I've changed my object to have a new required property in v2. When I attempt to fetch a v1 object from the datastore, I get BadValueError because v1 doesn't have the required property. What's the best way to introduce new required properties on existing data
I would resolve this problem using the mapreduce library.
First, register the mapper in mapreduce.yaml:
mapreduce:
- name: fixing required property
mapper:
input_reader: mapreduce.input_readers.DatastoreInputReader
handler: your handler
params:
- name: entity_kind
default: main.ModelV2
then define a process function to modify the entities:
from mapreduce import operation as op
def process(entity):
if not entity.newproperty :
entity.newproperty = None
yield op.db.Put(entity)
If you are dealing with a relative small number of entities, you could avoid mapreduce modifying directly your entities with something like this:
entities = ModelV2.all()
for entity in entities :
if not entity.newproperty :
entity.newproperty = None
entity.put()
You'll need to add it as an optional property to your model, fetch each existing entity, add the property to it (generating a reasonable value somehow), then put() the entity. Once all of your existing entities have been "upgraded", you can make the property required.
The AppEngine mapreduce API should make this fairly easy.

Resources