Can I compare two properties of single kind in ndb of GAE?
class GameLog(ndb.Model):
duration = ndb.IntegerProperty(default=0)
time = ndb.IntegerProperty(default=0)
I need compare these two properties. How can I do this?
GameLog.duration > GameLog.time
To accomplish this kind of thing you need to save the (precomputed) result, so it's indexed and you can query for it.
To make this easier ndb gives you Computed Properties:
class GameLog(ndb.Model):
duration = ndb.IntegerProperty(default=0)
time = ndb.IntegerProperty(default=0)
timeout = ndb.ComputedProperty(lambda self: self.duration > self.time)
You don't need to maintain this property yourself, every time you put() the entity the value will get calculated and saved. Now you can do the query:
GameLog.query(GameLog.timeout == True)
Related
I am using the google cloud datastore python client to write an entity into the datastore which contains an embedded entity. An example entity might look like:
data_type: 1
raw_bytes: <unindexed blob>
values: <indexed embedded entity>
I checked the data from the console and the data is getting saved correctly and the values are present.
Next, I need to run a query from a python app engine application. I have represented the above as the following entity in my app engine code:
class DataValues(ndb.Model):
param1 = ndb.BooleanProperty()
param2 = ndb.IntegerProperty()
param3 = ndb.IntegerProperty()
class MyEntity(ndb.Expando):
data_type = ndb.IntegerProperty(required=True)
raw_bytes = ndb.BlobProperty()
values = ndb.StructuredProperty(DataValues)
One of the filters in the query depends on a property in values. Sample query code is as below:
MyEntity.query().filter(MyEntity.data_type == 1).filter(MyEntity.values.param1 == True).get()
I have created the corresponding composite index in my index.yaml
The query runs successfully but the resulting entity contains the embedded entity values as None. All other property values are present.
What can be the issue here ?
Add properties of DataValues entity as properties of the MyEntity.
This is a bit of a guess, but since datastore attributes are kind of keyed by both their name (in this case values) and the name of the "field type/class" (i.e. StructuredProperty), this might fix your problem:
class EmbeddedProperty(ndb.StructuredProperty):
pass
class MyEntity(ndb.Expando):
data_type = ndb.IntegerProperty(required=True)
raw_bytes = ndb.BlobProperty()
values = EmbeddedProperty(DataValues)
Give it a shot and let me know if values starts coming back non-null.
I struggled with the same problem, wanting to convert the embedded entity into a Python dictionary. One possible solution, although not a very elegant one, is to use a GenericProperty:
class MyEntity(ndb.Model):
data_type = ndb.IntegerProperty(required=True)
raw_bytes = ndb.BlobProperty()
values = ndb.GenericProperty()
values will then be read as an "Expando" object: Expando(param1=False,...). You can access the individual values with values.param1, values.param2 etc. I would prefer having a custom model class, but this should do the job.
I have NDB model class like below:
class Contest(ndb.Model):
end_date = ndb.DateProperty(required=True, indexed=True)
ended = ndb.BooleanProperty(required=False, indexed=True)
...
I will have a daily cron job to mark contests with passed end_date with ended equal to True. I've written the following code to do it:
contests = Contest.query()
current_datetime = datetime.datetime.utcnow()
today = datetime.date(current_datetime.year, current_datetime.month, current_datetime.day)
contests = contests.filter(Contest.end_date < today)
contests = contests.filter(Contest.ended == False)
contests = contests.fetch(limit=10)
for contest in contests:
contest.ended = True
ndb.put_multi(contests)
But I don't like it, since I have to read all entities just to update one value. Is there any way to modify it to read keys_only?
The object data overwrites the existing entity. The entire object is sent to Datastore
https://cloud.google.com/datastore/docs/concepts/entities#Datastore_Updating_an_entity
So you cannot send only one field of an entity, it will "remove" all existing fields. To be more accurate - replace an entity with all fields with new version of entity that have only one field.
You have to load all entities you want to update, with all properties, not just keys, set new value of a property, and put back into database.
I think a Python property is a good solution here:
class Contest(ndb.Model):
end_date = ndb.DateProperty(required=True, indexed=True)
#property
def ended(self):
return self.end_date < date.today()
This way you don't ever need to update your entities. The value is automatically computed whenever you need it.
I have a two part question.
Let's say I have a entity with a blob property...
# create entity
Entity(ndb.Model):
blob = ndb.BlobProperty(indexed=False)
e = Entity()
e.blob = 'abcd'
e_key = e.put()
# update entity
e = e_key.get()
e.blob += 'efg'
e.put()
So questions are:
The first time I put() that entity, the cost is 2 Write Ops; how many Ops does it cost to update the entity, as in the above example?
When I added 'efg' to the property, the old property had to be read into memory first, does app engine provide a way to append the old value without reading it first?
There are no partial updates. Every time you overwrite the whole entity. Numbers of indexes will also have an impact on cost. You might like to have a look at https://developers.google.com/appengine/articles/life_of_write for a detailed breakdown of what happens.
I'm wondering if I should have a kind only for counting entities.
For example
There is a model like the following.
class Message(db.Model):
title = db.StringProperty()
message = db.StringProperty()
created_on = db.DateTimeProperty()
created_by = db.ReferenceProperty(User)
category = db.StringProperty()
And there are 100000000 entities made of this model.
I want to count entities which category equals 'book'.
In this case, should I create the following mode for counting them?
class Category(db.Model):
category = db.StringProperty()
look_message = db.ReferenceProperty(Message)
Does this small model make it faster to count?
And does it erase smaller memory?
I'm thinking to count them like the following by the way
q = db.Query(Message).filter('category =', 'book')
count = q.count(10000)
Counting 100000000 entities is a very expensive operation on a NoSQL database as the App Engine datastore. You'll probably want to count as you update, or run a map-reduce operation to count after the fact.
App Engine also offers a simple way to query how many entities of each type you have:
https://developers.google.com/appengine/docs/python/datastore/stats
For example, to count all Messages:
from google.appengine.ext.db import stats
kind_stats = stats.KindStat().all().filter("kind_name =", "Message").get()
count = kind_stats.count
Note that stats are updated asynchronously, so they'll lag the actual count.
I think that you have to create another entity like this.
This entity will just count the number of messages by category.
Just change your category to this:
class Category(db.model):
category = db.StringProperty()
totalOfMessages = db.IntegerProperty(default=0)
In the message class you change to reference the category class, just change the category property to:
category = db.ReferenceProperty(Category)
When you create a new Message object, you have to update the counter, increment when you create a new message or decrement if you delete.
The best way to work with counters on GAE is using Sharding Counters
Count is implemented as an index scan that discards all data except the number of records seen . It never looks up the entity, so the size of the entity does not matter.
That being said, counting like this does not scale and is quite costly in a system without a fixed schema. It would likely be better to use another method like a Sharded Counter, MapReduce or Materialized View/Fork Join. If you really want it to scale, this talk is pretty informative: http://www.google.com/events/io/2010/sessions/high-throughput-data-pipelines-appengine.html
How to implement this query counter into existing class? The main purpose is i have datastore model of members with more than 3000 records. I just want to count its total record, and found this on app engine cookbook:
def query_counter (q, cursor=None, limit=500):
if cursor:
q.with_cursor (cursor)
count = q.count (limit=limit)
if count == limit:
return count + query_counter (q, q.cursor (), limit=limit)
return count
My existing model is:
class Members(search.SearchableModel):
group = db.ListProperty(db.Key,default=[])
email = db.EmailProperty()
name = db.TextProperty()
gender = db.StringProperty()
Further i want to count members that join certain group with list reference. It might content more than 1000 records also.
Anyone have experience with query.cursor for this purpose?
To find all members you would use it like this:
num_members = query_counter(Members.all())
However, you may find that this runs slowly, because it's making a lot of datastore calls.
A faster way would be to have a separate model class (eg MembersCount), and maintain the count in there (ie add 1 when you create a member, subtract 1 when you delete a member).
If you are creating and deleting members frequently you may need to create a sharded counter in order to get good performance - see here for details:
http://code.google.com/appengine/articles/sharding_counters.html
To count members in a certain group, you could do something like this:
group = ...
num_members = query_counter(Members.all().filter('group =', group.key()))
If you expect to have large numbers of members in a group, you could also do that more efficiently by using a counter model which is sharded by the group.