As you can see from the attached screenshot, the datastore asks memcache to delete an entry inside a put(). What's that?
The ndb datastore's caching layers include memcache:
The pattern you observed could be explained in this section:
Memcache does not support transactions. Thus, an update meant to be
applied to both the Datastore and memcache might be made to only one
of the two. To maintain consistency in such cases (possibly at the
expense of performance), the updated entity is deleted from memcache
and then written to the Datastore.
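The delete-then-write ordering described in that passage can be sketched generically. In the sketch below, `cache` stands in for memcache and `store` for the Datastore; `put` and `get` are illustrative names, not App Engine APIs.

```python
# Sketch of the delete-then-write pattern: evict the cache entry
# before writing, so a failed write leaves no stale cached copy.
# `cache` and `store` are plain dicts standing in for memcache
# and the Datastore.
cache = {}
store = {}

def put(key, entity):
    # 1. Evict the possibly-stale cached copy first. If the write
    #    below fails, readers fall through to the store instead of
    #    seeing an outdated cache entry.
    cache.pop(key, None)
    # 2. Only then write the new value to the durable store.
    store[key] = entity

def get(key):
    # Cache-aside read: fill the cache on a miss.
    if key not in cache:
        cache[key] = store[key]
    return cache[key]

put("k1", {"name": "old"})
get("k1")                     # populates the cache
put("k1", {"name": "new"})    # evicts the cache entry, then writes
```

A read after the second put returns the new value, because the stale cached copy was evicted before the Datastore write.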
I have a limited amount of seed data in an entity kind, for which I want to fetch all keys (unique strings) frequently and the whole entities only occasionally.
If I fetch the keys using Query.fetchKeys, does Objectify cache the results in memcache, or does it hit the datastore every time?
Query.fetchKeys() is a method from a very old version of Objectify.
But in answer to your question, all 'queries' (that is, anything besides get-by-key) must pass through to the datastore. Only the datastore knows what satisfies a query.
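The point in this answer can be illustrated with a toy store. Here `cached_get` and `run_query` are made-up names, and both the cache and the datastore are plain dicts: only a get-by-key can be answered from a cache, while a query's result set depends on every entity, so it must go to the store.

```python
# Toy illustration: get-by-key can be served from a cache, but a
# query must always hit the (stand-in) datastore, because only the
# datastore knows which entities satisfy the query.
store = {"a": {"size": 1}, "b": {"size": 5}}
cache = {}
datastore_hits = 0

def cached_get(key):
    global datastore_hits
    if key in cache:
        return cache[key]          # served without touching the store
    datastore_hits += 1
    cache[key] = store[key]
    return cache[key]

def run_query(predicate):
    global datastore_hits
    datastore_hits += 1            # a query always hits the store
    return [k for k, e in store.items() if predicate(e)]

cached_get("a")                    # datastore hit, fills the cache
cached_get("a")                    # cache hit, no datastore access
run_query(lambda e: e["size"] > 2) # always a datastore hit
```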
Is there any effective way (in terms of number of read/write operations) to:
delete all NDB datastore records of particular kind;
delete everything in the datastore?
from google.appengine.ext import ndb

ndb.delete_multi(
    MyModel.query().fetch(keys_only=True)
)
You need to do this for each model separately.
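Looping over each model kind could look like the sketch below. Since the real ndb API needs the App Engine runtime, the datastore, the keys-only query, and `delete_multi` are all replaced here by stand-ins; only the per-kind batching pattern is taken from the answer above.

```python
# Per-kind deletion pattern, simulated: `fake_datastore` stands in
# for the datastore, keys are (kind, id) tuples, and delete_multi
# mimics one batched delete RPC. None of this is the real ndb API.
fake_datastore = {
    ("Greeting", 1): {}, ("Greeting", 2): {}, ("Account", 1): {},
}

def query_keys(kind):
    # keys-only fetch: return keys, not full entities (cheaper reads)
    return [k for k in fake_datastore if k[0] == kind]

def delete_multi(keys):
    # one batched call instead of one delete per entity
    for k in keys:
        fake_datastore.pop(k, None)

# "You need to do this for each model separately":
for kind in ["Greeting", "Account"]:
    delete_multi(query_keys(kind))
```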
--OR--
If you have Datastore Admin enabled in your developer console, you can do this directly for all entities of any or all Kinds.
The remote API is great for this sort of operation. See the article below, it even includes an example for deleting all entities of a given kind.
https://developers.google.com/appengine/articles/remote_api
Two questions about updating my domain diagram:
1) I am new to GAE and have just deployed my first application, based on Objectify, only to discover that soon after my first users came in I had already gone through the datastore read quota limit:
I had not put much thought into server-side caching until now. I thought Objectify's session cache would do the job for me, but now I realise I need to use the global memcache.
According to Objectify's documentation, I have to use Objectify's @Cache annotation on every entity that is accessed by key (and not by query).
However I am concerned about the side effects this will have on data that I have already stored in datastore.
2) I also realize now that I am using @Parent too much. There are a couple of entities where using @Parent has no benefit (and it has some drawbacks, since the datastore limits the write rate on entities belonging to the same root).
If I go ahead and remove the @Parent annotation from the entities of my domain where it is no longer needed, will it have side effects on the already persisted entities?
Thanks!
For Objectify: the global cache is enabled by default, but you must still annotate your entity classes with @Cache.
@Parent is important if you need consistent results and want to avoid eventual consistency. Removing the ancestor will have a side effect on the already stored data, as the key will change. You will need a migration plan.
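Why removing @Parent changes the key can be seen by modeling a datastore key as its full ancestor path. The tuples and `make_key` helper below are an illustration, not the real Key class.

```python
# A datastore key encodes the full ancestor path, so the same kind
# and id yield a different key with and without a parent. Tuples
# stand in for the real Key class here.
def make_key(kind, ident, parent=None):
    # parent is itself a (kind, id) path prefix, if present
    return parent + (kind, ident) if parent else (kind, ident)

with_parent = make_key("Order", 42, parent=("Customer", 7))
without_parent = make_key("Order", 42)

# ('Customer', 7, 'Order', 42) vs ('Order', 42): different keys,
# so entities saved under the old key are not found by the new one.
```

That mismatch is exactly why a migration plan (re-writing each entity under its new key) is needed after dropping the ancestor.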
But most of all, the free quotas are quite reasonable, so if you already run into quota errors with your first users, I would suggest installing Appstats and actually measuring the real underlying cause, i.e. which action(s) are responsible for the bulk of the operations, and working on those. That is much better than a general approach.
From what I have heard, it is better to move to NDB from Datastore. I would be doing that eventually since I hope my website will be performance intensive. The question is when. My project is in its early stages.
Is it better to start with NDB itself? Does NDB take care of Memcache as well, so that I don't need an explicit Memcache layer?
NDB provides an automated caching mechanism. See Caching:
NDB automatically caches data that it writes or reads (unless an
application configures it not to). Reading from cache is faster than
reading from the Datastore.
Probably the automatic caching does what you want. The rest of this
page provides more detailed information in case you want to know more
or to control some parts of the caching behavior.
As the documentation says, the default behavior probably does what you want, but you can tweak it if that's not the case. Adding your own memcache layer for the datastore shouldn't be required if you're using NDB.
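NDB's layered read path (in-context cache, then memcache, then the datastore) can be sketched with plain dicts. The names below, including `ndb_style_get`, are illustrative stand-ins; only the lookup order mirrors what the documentation describes.

```python
# Sketch of NDB's read path: in-context cache -> memcache -> datastore.
# All three layers are plain dicts here; only the lookup order and
# cache-filling behavior are meant to mirror NDB's description.
context_cache = {}   # per-request cache, cleared between requests
memcache = {}        # shared cache across instances
datastore = {"k": "value"}

def ndb_style_get(key):
    if key in context_cache:
        return context_cache[key]        # fastest layer
    if key in memcache:
        context_cache[key] = memcache[key]
        return context_cache[key]
    value = datastore[key]               # slowest layer, read last
    memcache[key] = value                # fill both caches on the way out
    context_cache[key] = value
    return value

ndb_style_get("k")   # misses both caches, fills them from the datastore
ndb_style_get("k")   # now served from the in-context cache
```

This is why an extra hand-rolled memcache layer is usually redundant with NDB: reads already fill and consult memcache automatically.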
As for when to migrate, sooner is probably better. The longer you wait the more code you have to rewrite to take advantage of the freebies you get with NDB. For new projects, I would recommend starting with NDB.
To add to Dan's correct answer, remember that ndb and the older db are just APIs, so you can seamlessly begin switching to ndb without worrying about schema changes etc. Your question asks about switching from the datastore to NDB, but you're not switching away from the datastore, since NDB still uses the datastore. Make sense?
I am using djangoappengine and I think I have run into some problems with the way it handles eventual consistency on the High Replication Datastore.
First, entity groups are not even implemented in djangoappengine.
Second, I think that when you do a djangoappengine get, the underlying system performs an App Engine query, and queries are only eventually consistent. Therefore, you cannot even assume consistency when fetching by key.
Assuming those two statements are true (and I think they are), how does one build an app of any complexity using djangoappengine on the high replication datastore? Every time you save a value and then try to get the same value, there is no guarantee that it will be the same.
Take a look in djangoappengine/db/compiler.py:get_matching_pk()
If you do a djangomodel.get() by the pk, it'll translate to a Google App Engine Get().
Otherwise it'll translate to a query. There's room for improvement here. Submit a fix?
I don't really know about djangoappengine, but an App Engine query that includes only the key is considered a key-only query, and you will always get consistent results.
No matter what the system you put on top of the AppEngine models, it's still true that when you save it to the datastore you get a key. When you look up an entity via its key in the HR datastore, you are guaranteed to get the most recent results.
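The guarantee in this answer can be sketched by modeling the HR datastore as a primary record plus a lagging index. Everything below (`primary`, `index`, `apply_index_lag`) is a simplified stand-in, not real App Engine behavior or API.

```python
# Simplified HR datastore model: get-by-key reads the primary record
# (strongly consistent), while a query reads an index that may lag
# behind writes (eventually consistent). All names are stand-ins.
primary = {}
index = []            # list of keys, updated asynchronously

def put(key, entity):
    primary[key] = entity          # visible to get() immediately
    # the index update is deliberately NOT done here: it lags

def apply_index_lag():
    index[:] = list(primary)       # the "eventual" catch-up step

def get(key):
    return primary.get(key)        # always sees the latest write

def query_all_keys():
    return list(index)             # may miss very recent writes

put("e1", {"x": 1})
get("e1")             # found right away: strong consistency by key
query_all_keys()      # may still be empty until the index catches up
apply_index_lag()
query_all_keys()      # now includes "e1"
```

This is the practical rule the thread converges on: save, keep the returned key, and look the entity up by that key when you need a read-your-writes guarantee; rely on queries only where eventual consistency is acceptable.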