I need to store a lot of ModelA entities, and because App Engine pricing is based on the number of entities written and read, I bundle 100 of them together and store them as one ModelB.
class ModelA(ndb.Expando):
    a1 ... a20 = ndb.IntegerProperty()
class ModelB(ndb.Model):
    data = ndb.StructuredProperty(ModelA, repeated=True)
I have only 80 such ModelB entities in my datastore, which should use around 1-2 MB of memory, yet ModelB.query().fetch() takes 5 seconds. Is there any way to make this faster? Would using LocalStructuredProperty instead of StructuredProperty be better?
If you don't need to index the values (for queries), you might want to store them as an opaque object:
class ModelB(ndb.Model):
    data = ndb.JsonProperty(indexed=False)
    def add_data(self, model_a):
        if not self.data:
            self.data = []
        # JSON can only hold plain values, so store the entity as a dict
        self.data.append(model_a.to_dict())
This will avoid the overhead of handling structured properties and validation.
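To illustrate the idea outside App Engine, here is a minimal sketch that uses only the standard library json module. It assumes each ModelA record can be represented as a plain dict of integers; make_record, pack and unpack are hypothetical helper names, not part of NDB.

```python
import json

def make_record(seed):
    # Hypothetical stand-in for one ModelA: twenty integer fields a1..a20.
    return {"a%d" % i: seed + i for i in range(1, 21)}

def pack(records):
    # Serialize the whole bundle into one opaque string.
    return json.dumps(records, separators=(",", ":"))

def unpack(blob):
    # Restore the list of dicts from the opaque string.
    return json.loads(blob)

records = [make_record(n) for n in range(100)]
blob = pack(records)
restored = unpack(blob)
```

The bundle round-trips as a single value, so nothing has to be validated or indexed per field. The trade-off is that the individual a1..a20 values can no longer be used in queries.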
I want to read the datastore entity that corresponds to a model instance. Which API should I use?
For example:
class DeleteMe(db.Model):
    x = db.FloatProperty()
DeleteMe(key_name = '1').put()
How do I read the raw entity from the datastore for key_name = '1'?
To get the corresponding model that you just put, use get_by_key_name. (https://developers.google.com/appengine/docs/python/datastore/modelclass#Model_get_by_key_name)
DeleteMe.get_by_key_name('1')
However, I noticed you're using the db package and not ndb. I would encourage you to use ndb as it has many optimizations and a more powerful API to the datastore.
https://developers.google.com/appengine/docs/python/ndb/
Corresponding code for NDB might look like:
from google.appengine.ext import ndb
class DeleteMe(ndb.Model):
    x = ndb.FloatProperty()
DeleteMe(id='1').put()
DeleteMe.get_by_id('1')
A list of StructuredProperties in NDB is modelled by a number of repeated properties, one repeated property per property of the StructuredProperty's model class. (See here: https://developers.google.com/appengine/docs/python/ndb/properties#structured.)
The closest equivalent I have found with JPA on Google App Engine is an @Embedded list of @Embeddables. The storage format, however, is different. The GAE/J plugin for JPA generates one column per embedded entity per property of the embedded entity (see here: http://datanucleus-appengine.googlecode.com/svn-history/r979/trunk/src/com/google/appengine/datanucleus/StoreFieldManager.java). For long lists, for example, this generates rows with a very large number of columns.
Is there an easy built-in way to copy NDB's way to cope with repeated composite values using JPA or JDO?
ADDED:
For those more comfortable with Java than Python, I have found that the Java-based Objectify framework stores collections of embedded classes in the same way as NDB on Python (http://code.google.com/p/objectify-appengine/wiki/Entities#Embedding), which is what I want to achieve with JPA (or JDO).
For convenience, I am copying their example here:
import java.util.ArrayList;
import java.util.List;
import com.googlecode.objectify.annotation.Embed;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;
@Embed
class LevelTwo {
    String bar;
}
@Embed
class LevelOne {
    String foo;
    LevelTwo two;
}
@Entity
class EntityWithEmbeddedCollection {
    @Id Long id;
    List<LevelOne> ones = new ArrayList<LevelOne>();
}
Using this setup, run the following code:
EntityWithEmbeddedCollection ent = new EntityWithEmbeddedCollection();
for (int i=1; i<=4; i++) {
    LevelOne one = new LevelOne();
    one.foo = "foo" + i;
    one.two = new LevelTwo();
    one.two.bar = "bar" + i;
    ent.ones.add(one);
}
ofy().save().entity(ent);
The resulting row layout (using repeated properties on the Datastore level) is then:
ones.foo: ["foo1", "foo2", "foo3", "foo4"]
ones.two.bar: ["bar1", "bar2", "bar3", "bar4"]
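The same layout can be reproduced with a few lines of plain Python, which may make the mapping easier to see. This is only a sketch of the flattening idea, not NDB's or Objectify's actual code; flatten and _collect are hypothetical names, and nested dicts stand in for the embedded objects.

```python
def _collect(value, name, columns):
    # Recurse into nested dicts, building a dotted property name on the way.
    if isinstance(value, dict):
        for key in sorted(value):
            _collect(value[key], name + "." + key, columns)
    else:
        # Leaf value: append it to the repeated property for this dotted name.
        columns.setdefault(name, []).append(value)

def flatten(items, prefix):
    """Turn a list of nested dicts into {dotted_name: [one value per item]}."""
    columns = {}
    for item in items:
        _collect(item, prefix, columns)
    return columns

# Same data as the Java loop above: four LevelOne objects, each with a LevelTwo.
ones = [{"foo": "foo%d" % i, "two": {"bar": "bar%d" % i}} for i in range(1, 5)]
layout = flatten(ones, "ones")
```

Each leaf property becomes one repeated property whose values are ordered by list position. The number of stored properties therefore depends on the shape of the embedded class and not on the length of the list.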
Google's JDO/JPA plugin definition of embedded collections was specified in
https://code.google.com/p/datanucleus-appengine/issues/detail?id=258&can=1&q=embedded&colspec=ID%20Stars%20Type%20Status%20Priority%20FoundIn%20TargetRelease%20Owner%20Summary
If you want some other definition of how that is stored (and there are many ways in which it could be stored), then raise an issue on Google's issue tracker (and if you want the feature sooner rather than later, provide some resource to implement it).
I have a large number of objects (about 20,000) that I want to retrieve based on user IDs contained in AlertSubscription. This takes 20+ seconds on GAE, and I wonder whether it can be made faster. I have already retrieved 20,000 AlertSubscription objects in 5 seconds using a standard Query, so it seems strange that the DeviceInfo objects should take 20+ seconds.
My current logic is as follows:
Get list of alert subscriptions
Get all the DeviceInfo tied to the subscriptions
Start one task for each OS (Android, iOS, WP7/8)
Therefore I need all the devices in one go, so using a Cursor wouldn't help much. Is there a way to add an index to the Key of the object to make this go faster? I have no indexes set up in the datastore-indexes.xml file.
List<com.googlecode.objectify.Key<DeviceInfo>> dKeys = new ArrayList<com.googlecode.objectify.Key<DeviceInfo>>();
for (AlertSubscription s : filteredSubscriptions.values())
{
    dKeys.add(new com.googlecode.objectify.Key<DeviceInfo>(DeviceInfo.class, s.user));
}
log.info("Getting keys " + filteredSubscriptions.size());
Map<com.googlecode.objectify.Key<DeviceInfo>, DeviceInfo> fetched = ofy.get(dKeys);
log.info("Got keys, looping now");
...
@PersistenceCapable(identityType = IdentityType.APPLICATION)
public class DeviceInfo {
    @PrimaryKey
    @Persistent
    @Indexed
    private Key key;
This resolved it for me:
ObjectifyOpts opts = new ObjectifyOpts();
opts.setConsistency(Consistency.EVENTUAL);
Objectify ofy = ObjectifyService.begin(opts);
https://groups.google.com/forum/?fromgroups=#!topic/objectify-appengine/1dIi-JOK_k4
I'm building a threaded messaging system that will be hosted on Google App Engine.
I've modeled it after the technique described by Brett Slatkin in "Building Scalable, Complex Apps on App Engine".
class Message(db.Model):
    sender = db.StringProperty()
    body = db.TextProperty()
class MessageIndex(db.Model):
    receivers = db.StringListProperty()
The issue I'm having is determining the most efficient way to track the message state for a user.
For example, whether a message is read, archived, or deleted for a particular user.
Here is the solution I have come up with so far.
I'm using Datastore+'s StructuredProperty to add a state to the MessageIndex:
class Message(model.Model):
    sender = model.StringProperty()
    body = model.TextProperty()
class _DisplayState(model.Model):
    user_key = model.KeyProperty()
    state = model.IntegerProperty(default=0) # 0-unread, 1-read, 2-archived
class MessageIndex(model.Model):
    receivers = model.StructuredProperty(_DisplayState, repeated=True)
This solution, while simple, negates the benefit of the MessageIndex. Additionally, since the MessageIndex is in the same entity group as the Message, datastore writes will be limited.
What would be the most efficient way to accomplish this? Would adding an additional entity group be a better solution?
class MessageState(model.Model):
    user_key = model.KeyProperty()
    message_key = model.KeyProperty()
    message_state = model.IntegerProperty(default=0) # 0-unread, 1-read, 2-archived
For the easiest querying, split your 'receivers' list into four different lists ('unread', 'read', 'archived', 'deleted') and move each receiver's record between the lists as its state changes.
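A rough sketch of that bookkeeping in plain Python follows. A dict of lists stands in for the four list properties on the index entity, and new_index and set_state are hypothetical helper names. In the datastore each list would presumably be a repeated string property, so a query such as "unread contains this user" becomes an ordinary equality filter.

```python
STATES = ("unread", "read", "archived", "deleted")

def new_index(receivers):
    # Every receiver starts out in the 'unread' list.
    index = {state: [] for state in STATES}
    index["unread"] = list(receivers)
    return index

def set_state(index, user, new_state):
    # Move a receiver from whichever list holds it into the target list.
    if new_state not in STATES:
        raise ValueError("unknown state: %r" % (new_state,))
    found = False
    for state in STATES:
        if user in index[state]:
            index[state].remove(user)
            found = True
    if not found:
        raise KeyError("%r is not a receiver of this message" % (user,))
    index[new_state].append(user)

index = new_index(["alice", "bob"])
set_state(index, "alice", "read")
```

Because a receiver appears in exactly one list at a time, a state change is a single write to the index entity.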
When you change data models on App Engine to add new properties, entities without a given property are listed with the value <missing> in the online data viewer.
What I'm wondering is how can I write a query to find those entries?
There is no direct way to query for older entities with a missing attribute, but you can design the data model up front to support this. Add a version attribute to each model class. The version should have a default value, which is increased every time the model class is changed and deployed. This way you will be able to query entities by version number.
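As a rough illustration of the versioning idea, the sketch below uses plain dicts in place of datastore entities. CURRENT_VERSION, new_entity and older_than are hypothetical names. In the datastore the filter would be an inequality query on an indexed version property; here a list comprehension stands in for it.

```python
CURRENT_VERSION = 2  # hypothetical: bumped each time the model class changes

def new_entity(**props):
    # New entities pick up the current schema version as their default.
    entity = dict(props)
    entity.setdefault("version", CURRENT_VERSION)
    return entity

def older_than(entities, version):
    # Stand-in for a datastore query of the form "version < N".
    return [e for e in entities if e["version"] < version]

stored = [
    {"name": "a", "version": 1},  # written before 'email' was added
    new_entity(name="b", email="b@example.com"),
]
stale = older_than(stored, CURRENT_VERSION)
```

Entities returned by the filter are the ones written under an older schema, which are the ones that lack the newer properties.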
There's no way to query the datastore for entities that don't have a given property. You need to iterate over all the entities and check each one - possibly by using the mapreduce API.
Or you could create a script to stick null in there for all current items that don't have that property using the low level datastore API, so then you can query on null.
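I had this issue and that's how I solved it. The rough code would look like:
import java.util.List;
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.Entity;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;

DatastoreService datastore = DatastoreServiceFactory.getDatastoreService();
Query query = new Query("JDOObjectType");
List<Entity> results = datastore.prepare(query).asList(FetchOptions.Builder.withLimit(9999));
for (Entity lObject : results) {
    // getProperty returns null when the property is missing
    Object lProperty = lObject.getProperty("YOUR_PROPERTY");
    if (lProperty == null) {
        lObject.setProperty("YOUR_PROPERTY", null);
        // Save the entity itself, not the property value
        datastore.put(lObject);
    }
}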