This is ridiculously trivial, but I've spent half an hour trying to solve it.
class SocialPost(model.Model):
    total_comments = model.IntegerProperty(default=0)

    def create_reply_comment(self, content, author):
        ...
        logging.info(self)
        self.total_comments = self.total_comments + 1
        self.put()
In the logfile, I can see that total_comments is 0, but in the admin console it is 1. The other fields are correct, except for this one.
Probably there's something wrong with that "default=0", but I can't find what it is.
Edit: full code of my function:
def create_reply_comment(self, content, author):
    floodControl = memcache.get("FloodControl-" + str(author.key))
    if floodControl:
        raise base.FloodControlException
    new_comment = SocialComment(parent=self.key)
    new_comment.author = author.key
    new_comment.content = content
    new_comment.put()
    logging.info(self)
    self.latest_comment_date = new_comment.creation_date
    self.latest_comment = new_comment.key
    self.total_comments = self.total_comments + 1
    self.put()
    memcache.add("FloodControl-" + str(author.key), datetime.now(), time=SOCIAL_FLOOD_TIME)
Where I call the function:
if cmd == "create_reply_post":
    post = memcache.get("SocialPost-" + str(self.request.get('post')))
    if post is None:
        post = model.Key(urlsafe=self.request.get('post')).get()
        memcache.add("SocialPost-" + str(self.request.get('post')), post)
    node = node.get()
    if not node.get_subscription(user).can_reply:
        self.success()
        return
    post.create_reply_comment(feedparser._sanitizeHTML(self.request.get("content"), "UTF-8"), user)
You're calling memcache.add before you make your change to total_comments, so on subsequent calls you read an out-of-date copy of the post back from the cache. Your create_reply_comment needs to either delete or overwrite the "SocialPost-" + str(self.request.get('post')) cache key after it calls put().
[edit] Though your post title says you're using NDB (model.Model, though? Hmm.), so you could just skip the memcache bits entirely and let NDB do its thing? NDB already caches entities for you.
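For example, a minimal sketch of that invalidation (assuming the cache key is rebuilt from the post's urlsafe key, the same way the handler above builds it):

def create_reply_comment(self, content, author):
    # ... flood control, comment creation, counter increment ...
    self.put()
    # Drop the stale cached copy so the next request refetches
    # the updated post instead of reading the old total_comments.
    memcache.delete("SocialPost-" + self.key.urlsafe())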
class UnassignedThread(models.Manager):
    def get_queryset(self):
        return super(UnassignedThread, self).get_queryset().filter(
            _irc_name__isnull=True)
Would results = ThreadVault.unassigned_threads.all() be cached? I am not certain whether the __isnull=True filter counts as evaluating the queryset (since evaluation is what populates the cache).
Also, if I have a model called ThreadVault, and I want to look up whether threads #777 and #888 exist in the database, which way makes the best use of the cache for the lookup?
ThreadVault.objects.get(thread_id="777")
ThreadVault.objects.get(thread_id="888")
or
results = ThreadVault.objects.all()
for ticket in results:
if ticket.thread_id == "777" or ticket.thread_id == "888":
do something
No, querysets are lazy until they are sliced or iterated. filter simply adds conditions to the query, but does not evaluate it.
For your second question, neither of these is great, although the first is vastly preferable to the second (which involves loading and iterating through every object in the table). Instead, you should use exists() in conjunction with an __in filter:
ThreadVault.objects.filter(thread_id__in=["777", "888"]).exists()
Neither of these questions has anything to do with caching.
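To make the laziness concrete, a small sketch (using the manager and field from the question; the loop is what forces evaluation):

qs = ThreadVault.unassigned_threads.all()      # no database query yet
qs = qs.filter(thread_id__in=["777", "888"])   # still no query; just more conditions

for thread in qs:                              # the SQL runs here, on iteration,
    print(thread.thread_id)                    # and the results are cached on qs

list(qs)                                       # served from qs's cache; no second query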
th_ids = ["777","888"]
ThreadVault.objects.filter(thread_id__in=th_ids).exists()
For caching your view:
from django.views.decorators.cache import cache_page

@cache_page(60 * 15)
def my_view(request):
    ...
class Dummy(TestCase):
    def setUp(self):
        thing = Thing.objects.create(name="Thing")

    def test_a(self):
        self.assertTrue(Thing.objects.get(pk=1))

    def test_b(self):
        self.assertTrue(Thing.objects.get(pk=1))
In this example I expect setUp to be run prior to every test case, but it appears to run only prior to the first, after which the changes are rolled back. This causes test_a to pass, but the equivalent test_b to fail. Is this the expected behavior? What do I need to do to make sure the database is in the same state prior to every test case?
Figured it out. setUp is being run each time; it's just that the database keeps incrementing the primary key, so the Thing with pk=1 no longer exists by the second test. This works just fine:
class Dummy_YepThatsMe(TestCase):
    def setUp(self):
        thing = Thing.objects.create(name="Thing")

    def test_a(self):
        self.assertTrue(Thing.objects.get(name="Thing"))

    def test_b(self):
        self.assertTrue(Thing.objects.get(name="Thing"))
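If a test really does need to look something up by pk, a variation on the class above (hypothetical name Dummy_StorePk) is to keep a reference to the created object instead of hard-coding pk=1:

class Dummy_StorePk(TestCase):
    def setUp(self):
        # The auto-increment sequence is not rolled back with the data,
        # so remember the actual pk instead of assuming it is 1.
        self.thing = Thing.objects.create(name="Thing")

    def test_a(self):
        self.assertTrue(Thing.objects.get(pk=self.thing.pk))

    def test_b(self):
        self.assertTrue(Thing.objects.get(pk=self.thing.pk))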
I was wondering if somebody could help. I'm using the blobcache module outlined in this post here.
This works fine, but I'm looking to speed up retrieval from memcache by using the get_multi() function. However, my current code cannot find the keys when using get_multi.
My current get function looks like this:
def get(key):
    chunk_keys = memcache.get(key)
    if chunk_keys is None:
        return None
    chunk_keys = ",".join(chunk_keys)
    str(chunk_keys)
    chunk = memcache.get_multi(chunk_keys)
    if chunk is None:
        return None
    try:
        return chunk
    except Exception:
        return None
My understanding, per the documentation, was that you only need to pass get_multi a string of keys.
However, this is not returning anything at the moment.
Can someone point out what I'm doing wrong here?
Pass it a list of strings (keys) instead of a single string with commas in it:
get_multi(keys, key_prefix='', namespace=None, for_cas=False)

    keys: List of keys to look up. A key can be a string or a tuple of
    (hash_value, string), where the hash_value, normally used for sharding
    onto a memcache instance, is instead ignored, as Google App Engine
    deals with the sharding transparently.
Multi Get Documentation
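Putting that together, a corrected sketch of the get function above (assuming, per the blobcache pattern, that the value stored under key is the list of chunk keys and that the chunks concatenate back into the original blob):

def get(key):
    chunk_keys = memcache.get(key)
    if chunk_keys is None:
        return None
    # get_multi takes a list of key strings and returns a dict
    # mapping each key it found to its value.
    chunks = memcache.get_multi(chunk_keys)
    if len(chunks) != len(chunk_keys):
        return None  # some chunks were evicted; treat it as a miss
    # Reassemble in the original chunk order.
    return "".join(chunks[k] for k in chunk_keys)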
In my app, for one of the handlers, I need to get a bunch of entities and execute a function on each one of them.
I have the keys of all the entities I need. After fetching them, I need to execute one or two instance methods on each one, and this slows my app down quite a bit: doing this for 100 entities takes around 10 seconds, which is way too slow.
I'm trying to find a way to get the entities and execute those functions in parallel to save time, but I'm not really sure which way is best.
I tried the _post_get_hook, but there I have a Future object and need to call get_result() and execute the function in the hook. That works kind of OK in the SDK, but I get a lot of 'maximum recursion depth exceeded while calling a Python object' errors, and I can't really understand why; the error message is not very elaborate.
Is the Pipeline API or ndb.Tasklets what I'm searching for?
At the moment I'm going by trial and error, but I would be happy if someone could point me in the right direction.
EDIT
My code is something similar to a filesystem: every folder contains other folders and files. The path of a Collection is set on another entity, so to serialize a Collection entity I need to get the referenced entity and read its path. On a Collection, the serialized_assets() function gets slower the more entities it contains. If I could execute a serialize function for each contained asset side by side, it would speed things up quite a bit.
class Index(ndb.Model):
    path = ndb.StringProperty()

class Folder(ndb.Model):
    label = ndb.StringProperty()
    index = ndb.KeyProperty()
    # contents is a list of keys of contained Folders and Files
    contents = ndb.KeyProperty(repeated=True)

    def serialized_assets(self):
        assets = ndb.get_multi(self.contents)
        serialized_assets = []
        for a in assets:
            kind = a._get_kind()
            assetdict = a.to_dict()
            if kind == 'Collection':
                assetdict['path'] = a.path
                # other operations ...
            elif kind == 'File':
                assetdict['another_prop'] = a.another_property
                # ...
            serialized_assets.append(assetdict)
        return serialized_assets

    @property
    def path(self):
        return self.index.get().path

class File(ndb.Model):
    filename = ndb.StringProperty()
    # other properties....

    @property
    def another_property(self):
        # compute something here
        return computed_property
EDIT2:
@ndb.tasklet
def serialized_assets(self, keys=None):
    assets = yield ndb.get_multi_async(keys)
    raise ndb.Return([asset.serialized for asset in assets])
Is this tasklet code OK?
Since most of the execution time of your functions is spent waiting for RPCs, NDB's async and tasklet support is your best bet. That's described in some detail here. The simplest usage for your requirements is probably the query's map() method, like this (adapted from the docs):
@ndb.tasklet
def callback(msg):
    acct = yield msg.author.get_async()
    raise ndb.Return('On %s, %s wrote:\n%s' % (msg.when, acct.nick, msg.body))

qry = Message.query().order(-Message.when)
outputs = qry.map(callback, limit=20)
for output in outputs:
    print output
The callback function is called for each entity returned by the query, and it can do whatever operations it needs (using _async methods and yield to do them asynchronously), returning the result when it's done. Because the callback is a tasklet, and uses yield to make the asynchronous calls, NDB can run multiple instances of it in parallel, and even batch up some operations.
The Pipeline API is overkill for what you want to do. Is there any reason why you couldn't just use a task queue?
Use the initial request to get all of the entity keys, and then enqueue a task for each key, having that task execute the two functions for its entity. The concurrency will then be determined by the number of concurrent requests configured for that task queue.
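A rough sketch of that fan-out (hypothetical URL, handler, and variable names):

from google.appengine.api import taskqueue
from google.appengine.ext import ndb
import webapp2

# Initial request: enqueue one task per entity key.
for key in asset_keys:
    taskqueue.add(url='/tasks/serialize_asset',
                  params={'key': key.urlsafe()})

# Worker: fetch one entity and run its per-entity methods.
class SerializeAssetHandler(webapp2.RequestHandler):
    def post(self):
        asset = ndb.Key(urlsafe=self.request.get('key')).get()
        asset.serialized_assets()  # or whatever the one or two methods are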
In Google App Engine, I make lists of referenced properties much like this:
class Referenced(BaseModel):
    name = db.StringProperty()

class Thing(BaseModel):
    foo_keys = db.ListProperty(db.Key)

    def __getattr__(self, attrname):
        if attrname == 'foos':
            return Referenced.get(self.foo_keys)
        else:
            return BaseModel.__getattr__(self, attrname)
This way, someone can have a Thing, say thing.foos, and get something legitimate out of it. The problem comes when somebody says thing.foos.append(x): this will not save the added reference, because the underlying list of keys remains unchanged. So I quickly wrote this solution to make it easy to append keys to the list:
class KeyBackedList(list):
    def __init__(self, key_class, key_list):
        list.__init__(self, key_class.get(key_list))
        self.key_class = key_class
        self.key_list = key_list

    def append(self, value):
        self.key_list.append(value.key())
        list.append(self, value)

class Thing(BaseModel):
    foo_keys = db.ListProperty(db.Key)

    def __getattr__(self, attrname):
        if attrname == 'foos':
            return KeyBackedList(Referenced, self.foo_keys)
        else:
            return BaseModel.__getattr__(self, attrname)
This is great for proof-of-concept, in that it works exactly as expected when calling append. However, I would never give this to other people, since they might mutate the list in other ways (thing[1:9] = whatevs, or thing.sort()). Sure, I could go and define __setslice__ and all the rest, but that seems to leave me open to obnoxious bugs. Still, it is the best solution I can come up with.
Is there a better way to do what I am trying to do (something in the Python library, perhaps)? Or am I going about this the wrong way, trying to make things too smooth?
If you want to modify things like this, you shouldn't be changing __getattr__ on the model; instead, you should write a custom Property class.
As you've observed, though, creating a workable 'ReferenceListProperty' is difficult and involved, and there are many subtle edge cases. I would recommend sticking with the list of keys, and fetching the referenced entities in your code when needed.
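In code, that recommendation looks something like this (a sketch using the models from the question; thing_key and new_foo are placeholders):

thing = Thing.get(thing_key)
foos = Referenced.get(thing.foo_keys)   # one batch fetch for all references

# Mutate the key list directly, then save; no wrapper class needed.
thing.foo_keys.append(new_foo.key())
thing.put()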