Ndb Model validation with interdependent properties? - google-app-engine

I want to validate an NDB model. My model looks something like this:
class Sample(ndb.Model):
    x = ndb.IntegerProperty()
    y = ndb.IntegerProperty()
I want to ensure that x < y at all times.
Approaches I have tried:
Write a validator function and call it from an overridden constructor. But the field values may change later, and the entity should be validated every time it is saved.
Add a _pre_put_hook, but that seems to be overkill. Also, it won't raise an error until the entity is actually saved to the datastore.
Ideally, whenever x or y is changed, a function should be triggered that checks whether the entity is valid and raises an error if it is not.
Note: Currently I am using _pre_put_hook.

This should be done outside of the ndb.Model logic.
Just create a factory classmethod on the model, like the following:
class Sample(ndb.Model):
    x = ndb.IntegerProperty()
    y = ndb.IntegerProperty()
    @classmethod
    def new(cls, x, y):
        if x >= y:
            raise Exception
        return cls(x=x, y=y)
Then replace the generic Exception I have used with a custom one and catch it with a try block.
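As a rough illustration of validating on every change rather than only at construction, here is a minimal plain-Python sketch. It is not an ndb.Model and does not touch the datastore; the names SampleValues and InvalidSampleError are made up for this example, and wiring the same check into ndb is left as an assumption.

```python
class InvalidSampleError(ValueError):
    """Raised when the x < y invariant would be violated (hypothetical name)."""


class SampleValues:
    """Plain-Python stand-in for the Sample entity; not an ndb.Model."""

    def __init__(self, x, y):
        self._check(x, y)
        self._x = x
        self._y = y

    @staticmethod
    def _check(x, y):
        if x >= y:
            raise InvalidSampleError(
                "x must be less than y (got x=%r, y=%r)" % (x, y))

    @property
    def x(self):
        return self._x

    @x.setter
    def x(self, value):
        self._check(value, self._y)  # validate before mutating
        self._x = value

    @property
    def y(self):
        return self._y

    @y.setter
    def y(self, value):
        self._check(self._x, value)  # validate before mutating
        self._y = value


sample = SampleValues(1, 5)
sample.x = 3          # still valid: 3 < 5
try:
    sample.y = 2      # would make x >= y, so it is rejected
    rejected = False
except InvalidSampleError:
    rejected = True
```

One consequence of checking each assignment separately is that the order of updates matters: moving from (1, 5) to (10, 20) only works if y is assigned first.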

Related

State handling on KeyedCoProcessFunction serving ML models

I am working on a KeyedCoProcessFunction that looks like this:
class MyOperator extends KeyedCoProcessFunction[String, ModelDef, Data, Prediction]
    with CheckpointedFunction {
  // To hold loaded models
  @transient private var models: HashMap[(String, String), Model] = _
  // For serialization purposes
  @transient private var modelsBytes: MapState[(String, String), Array[Byte]] = _
  ...
  override def snapshotState(context: FunctionSnapshotContext): Unit = {
    modelsBytes.clear() // This raises an exception when there is no active key set
    for ((k, model) <- models) {
      modelsBytes.put(k, model.toBytes(v))
    }
  }
  override def initializeState(context: FunctionInitializationContext): Unit = {
    modelsBytes = context.getKeyedStateStore.getMapState[String, String](
      new MapStateDescriptor("modelsBytes", classOf[String], classOf[String])
    )
    if (context.isRestored) {
      // restore models from modelsBytes
    }
  }
}
The state consists of a collection of ML models built using a third party library. Before checkpoints, I need to dump the loaded models into byte arrays in snapshotState.
My question is, within snapshotState, modelsBytes.clear() raises an exception when there is no active key. This happens when I start the application from scratch without any data on the input streams. So, when the time for a checkpoint comes, I get this error:
java.lang.NullPointerException: No key set. This method should not be called outside of a keyed context.
However, when the input stream contains data, checkpoints work just fine. I am a bit confused about this because snapshotState does not provide a keyed context (contrary to processElement1 and processElement2, where the current key is accessible by doing ctx.getCurrentKey) so it seems to me that the calls to clear and put within snapshotState should fail always since they're supposed to work only within a keyed context. Can anyone clarify if this is the expected behaviour actually?
A keyed state can only be used on a keyed stream as written in the documentation.
* <p>The state is only accessible by functions applied on a {@code KeyedStream}. The key is
* automatically supplied by the system, so the function always sees the value mapped to the
* key of the current element. That way, the system can handle stream and state partitioning
* consistently together.
If you call clear(), you will not clear the whole map, but just reset the state of the current key. The key is always known in processElementX.
/**
* Removes the value mapped under the current key.
*/
void clear();
You should actually receive a better exception when you try to call clear in a function other than processElementX. In the end, you are using the keyed state incorrectly.
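The per-key behaviour described above can be pictured with a small toy model. This is a hypothetical Python illustration only, not Flink's API: KeyedMapState, current_key, and entries_for are invented names, and the "runtime" setting the current key is simulated by hand.

```python
class KeyedMapState:
    """Toy model of keyed map state: one independent map per key.

    Hypothetical illustration only; this is not Flink's API.
    """

    def __init__(self):
        self._maps = {}          # key -> that key's private map
        self.current_key = None  # set by the "runtime" before each element

    def _current_map(self):
        if self.current_key is None:
            raise RuntimeError(
                "No key set; state is only usable in a keyed context")
        return self._maps.setdefault(self.current_key, {})

    def put(self, k, v):
        self._current_map()[k] = v

    def clear(self):
        # Clears only the entries that belong to the current key.
        self._current_map().clear()

    def entries_for(self, key):
        return dict(self._maps.get(key, {}))


state = KeyedMapState()

# Simulate processing one element with key "a", then one with key "b".
state.current_key = "a"
state.put("model", b"bytes-for-a")
state.current_key = "b"
state.put("model", b"bytes-for-b")

# clear() while "b" is the current key leaves the state of key "a" untouched.
state.clear()

# Outside element processing there is no current key, so access fails.
state.current_key = None
try:
    state.clear()
    failed = False
except RuntimeError:
    failed = True
```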
Now for your actual problem. I'm assuming you are using a KeyedCoProcessFunction because the models are updated through a separate input. If they are static, you could just load them in open() from a static source (for example, included in the jar). Furthermore, there is often only one model that is applied to all values regardless of key; in that case you could use broadcast state. So I'm assuming you have different models for different types of data, separated by key.
If they are coming in from input2, then you can serialize them right away when processElement2 is invoked.
override def processElement2(model: Model, ctx: Context, collector: Collector[Prediction]): Unit = {
  models.put(ctx.getCurrentKey, model)
  modelsBytes.put(ctx.getCurrentKey, model.toBytes(v))
}
Then you would not need to override snapshotState, as the state is already up to date. initializeState would deserialize the models eagerly, or you could materialize them lazily in processElement1.

Backbone relational fetching a model that already exists

I would like to know if there is a way to override an existing relational model after a fetch.
Example:
I have a method on an API that returns a random model. So I created a new instance of the model client side and performed a fetch:
var x = new MyModel();
x.url = 'random';
x.fetch();
// If it exists it will throw "Uncaught Error: Cannot instantiate more than one Backbone.RelationalModel with the same id per type! "
This example works fine unless I already have an instance of that model client side. Is there a way for me to determine whether that model already exists client side after a fetch, and update that model instead?
backbone-relational has a built-in method for this, findModel, which returns the model if it is found (see the backbone-relational docs).
You should be able to add a conditional statement to handle the case where the model already exists:
var x = MyModel.findModel({id: id});
if (!x) {
  x = new MyModel();
}

Is it possible to only have a ComputedProperty for certain entities?

In my application I have a model like so:
class MyModel(ndb.Model):
    entity_key_list = ndb.KeyProperty('k', repeated=True, indexed=False)
    entity_key_num = ndb.ComputedProperty('n', lambda self: len(self.entity_key_list))
    verified = ndb.BooleanProperty('v')
Is it possible to have the entity_key_num property only when verified is false?
You can return None if not verified like this:
entity_key_num = ndb.ComputedProperty('n', lambda self: len(self.entity_key_list) if not self.verified else None)
If you don't want the value None at all and would rather create or delete this property dynamically, you will have to use the ndb.Expando class, which lets you do all of this. Note that you won't be able to delete the ComputedProperty, so you will have to keep track of that value on your own.
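To see what the conditional expression evaluates to, here is a minimal plain-Python sketch. MyModelSketch is a hypothetical stand-in that uses an ordinary property instead of ndb.ComputedProperty, so it only demonstrates the expression itself, not how ndb stores the result.

```python
class MyModelSketch:
    """Plain-Python stand-in for MyModel; not an ndb.Model."""

    def __init__(self, entity_key_list, verified):
        self.entity_key_list = list(entity_key_list)
        self.verified = verified

    @property
    def entity_key_num(self):
        # Same expression as the ComputedProperty lambda in the answer.
        return len(self.entity_key_list) if not self.verified else None


unverified = MyModelSketch(["k1", "k2"], verified=False)
verified = MyModelSketch(["k1", "k2"], verified=True)
print(unverified.entity_key_num)  # 2
print(verified.entity_key_num)    # None
```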

How do I handle objects that are part of a Model object’s state, but don’t need separate db-level support?

In my Google App Engine app I have model objects that need to be stored. These objects are parameterized by various policy objects. For example, my Event class has a Privacy policy object which determines who can see, update, etc. There are various subclasses of PrivacyPolicy that behave differently. The Event consults its PrivacyPolicy object at various points.
class PrivacyPolicy(db.Model):
    def can_see(self, event, user):
        pass
class OwnerOnlyPolicy(PrivacyPolicy):
    def can_see(self, event, user):
        return user == event.owner
class GroupOnlyPolicy(PrivacyPolicy):
    def can_see(self, event, user):
        for grp in event.owner.groups():
            if grp.is_member(user):
                return True
        return False
class OnlyCertainUsersPolicy(PrivacyPolicy):
    def __init__(self, others):
        self.others = others
    def can_see(self, event, user):
        return user in self.others
I could make my Event class use a ReferenceProperty to the PrivacyPolicy:
class Event(db.Model):
    privacy = db.ReferenceProperty(PrivacyPolicy)
#…
The reason I don’t like this is that, because the relationship is one-to-one, nobody ever queries for the policy object, there is no need to maintain the back-reference from the policy to its Event object, and in no other way is PrivacyPolicy an independent db-level object. It is functionally equivalent to an IntegerProperty, in that it is part of the Event object’s state; it’s just an object instead of a number. Specifically, it’s an object that can have zero state or lots of state, unknown to the Event type.
I can’t find anyone talking about how to approach such a situation. Is there a tool/approach I don’t know about? Do I just suck it up and use a reference property and the hell with the overhead?
If the only other way to handle this is a custom Property type, any advice about how to approach it would be welcome. My first thought is to use a TextProperty to store the string rep of the policy object (policy), decode it when needed, caching the result, and having any change to the policy object invalidate the cache and update the string rep.
You're overcomplicating by trying to store this in the datastore. This belongs in code rather than in the datastore.
The least complicated way would be:
class Event(db.Model):
    privacy = db.IntegerProperty()
    def can_see(self, user):
        if self.privacy == PRIVACY_OWNER_ONLY:
            return user == self.owner
        elif self.privacy == PRIVACY_GROUP:
            for grp in self.owner.groups():
                if grp.is_member(user):
                    return True
            return False
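If the if/elif chain grows, one possible variation on this answer is to map the stored integer to a policy function. The sketch below is plain Python, not a db.Model; the constant values, EventSketch, and the owner_groups representation are all assumptions made for illustration.

```python
PRIVACY_OWNER_ONLY = 0  # hypothetical values; the answer does not define them
PRIVACY_GROUP = 1


def _owner_only(event, user):
    return user == event.owner


def _group_only(event, user):
    return any(user in members for members in event.owner_groups)


# Maps the stored integer to the function that implements the policy.
_POLICIES = {
    PRIVACY_OWNER_ONLY: _owner_only,
    PRIVACY_GROUP: _group_only,
}


class EventSketch:
    """Plain-Python stand-in for Event; not a db.Model."""

    def __init__(self, owner, owner_groups, privacy):
        self.owner = owner
        self.owner_groups = owner_groups  # list of sets of user names
        self.privacy = privacy            # the only policy state that is stored

    def can_see(self, user):
        return _POLICIES[self.privacy](self, user)


event = EventSketch("alice", [{"alice", "bob"}], PRIVACY_GROUP)
print(event.can_see("bob"))    # True
print(event.can_see("carol"))  # False
```

This keeps the datastore representation a single integer, as the answer suggests, while each policy's behaviour lives in its own function.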
Sometimes all it takes is to think of the right approach. The solution is to introduce a new kind of property that uses pickle to store and retrieve values, such as that described in https://groups.google.com/forum/?fromgroups#!topic/google-appengine/bwMD0ZfRnJg
I wanted something slightly more sophisticated, because pickle isn’t always the answer, and anyway documentation is nice, so here is my ObjectProperty type:
import pickle
from google.appengine.ext import db
class ObjectProperty(db.Property):
    def __init__(self, object_type=None, verbose_name=None, to_store=pickle.dumps, from_store=pickle.loads, **kwds):
        """Initializes this Property with all the given options
        All args are passed to the superclass. The ones used specifically by this class are described here. For
        all other args, see base class method documentation for details.
        Args:
            object_type: If not None, all values assigned to the property must be either instances of this type or None
            to_store: A function to use to convert a property value to a storable str representation. The default is
                to use pickle.dumps()
            from_store: A function to use to convert a storable str representation to a property value. The default is
                to use pickle.loads()
        """
        if object_type and not isinstance(object_type, type):
            raise TypeError('object_type should be a type object')
        kwds['indexed'] = False # It never makes sense to index pickled data
        super(ObjectProperty, self).__init__(verbose_name, **kwds)
        self.to_store = to_store
        self.from_store = from_store
        self.object_type = object_type
    def get_value_for_datastore(self, model_instance):
        """Get value from property to send to datastore.
        We retrieve the value of the attribute and return the result of invoking the to_store function on it
        See base class method documentation for details.
        """
        value = getattr(model_instance, self.name, None)
        return self.to_store(value)
    def make_value_from_datastore(self, rep):
        """Get value from datastore to assign to the property.
        We take the value passed, convert it to str() and return the result of invoking the from_store function
        on it. The Property class assigns this returned value to the property.
        See base class method documentation for details.
        """
        # It passes us a unicode, even though I returned a str, so this is required
        rep = str(rep)
        return self.from_store(rep)
    def validate(self, value):
        """Validate reference.
        Returns:
            A valid value.
        Raises:
            BadValueError if the base class validation fails.
            KindError if the object is not of the correct type.
        """
        value = super(ObjectProperty, self).validate(value)
        # Only check the type when an object_type was given; isinstance() fails on None
        if value is not None and self.object_type is not None and not isinstance(value, self.object_type):
            raise db.KindError('Property %s must be of type %s' % (self.name, self.object_type))
        return value
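Since the property accepts custom to_store and from_store functions, a non-pickle pair might look like the following sketch. It uses JSON and a simplified, plain-Python OnlyCertainUsersPolicy; the function names policy_to_store and policy_from_store are hypothetical, and the code runs without App Engine, so it only shows the round trip itself.

```python
import json


class OnlyCertainUsersPolicy:
    """Plain policy object with some state; deliberately not a db.Model."""

    def __init__(self, others):
        self.others = list(others)

    def can_see(self, event, user):
        return user in self.others


def policy_to_store(policy):
    # Hypothetical to_store function: policy object -> JSON string.
    if policy is None:
        return json.dumps(None)
    return json.dumps({"others": policy.others})


def policy_from_store(rep):
    # Hypothetical from_store function: JSON string -> policy object.
    data = json.loads(rep)
    if data is None:
        return None
    return OnlyCertainUsersPolicy(data["others"])


stored = policy_to_store(OnlyCertainUsersPolicy(["bob", "carol"]))
restored = policy_from_store(stored)
print(restored.can_see(None, "bob"))   # True
print(restored.can_see(None, "dave"))  # False
```

Under these assumptions, the pair would be passed as ObjectProperty(to_store=policy_to_store, from_store=policy_from_store). A version that supports several policy classes would also need to store which class to rebuild.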

db.expando + App Engine + Change an integer property value to float value

I initially set up the following model:
class Obj(db.Model):
    name = db.StringProperty(required=True)
    rating = db.IntegerProperty(default=0, required=False)
Some entities have already been created with this model, such as:
name="test1", rating="3"
Now I need to change the type of rating to float. I was trying to achieve this with db.Expando:
class Obj(db.Expando):
    name = db.StringProperty(required=True)
    rating = db.FloatProperty(default=0, required=False)
Before I can even retrieve an Obj instance to update it to a float value, I get the following error:
Property rating must be a float
At first I got this error because I wasn't using db.Expando. After switching to db.Expando, I assumed the error would no longer occur, since from what I have read it can dynamically change values, types, and so on.
Being new to db.Expando, I need help. Does anyone have a clue as to what happened?
EDIT
a = []
for o in Obj.all():
    if o.rating:
        o.rating = float(str(o.rating))
    else:
        o.rating = float(0)
    a.append(o)
db.put(a)
With the above code, the same error pops up:
Property rating must be a float
SOLUTION
I temporarily removed rating from the model definition, updated the existing values to float, and then added the new rating definition back with db.FloatProperty.
A db.Expando model allows you to add properties that aren't defined in the class itself; however, any properties that are explicitly defined in the class do need to be the correct type.
Simply removing rating from the model definition may work for you.
