I'm looking at the GAE example for datastoring here, and among other things this confused me a bit.
def guestbook_key(guestbook_name=DEFAULT_GUESTBOOK_NAME):
    """Constructs a Datastore key for a Guestbook entity with guestbook_name."""
    return ndb.Key('Guestbook', guestbook_name)
I understand why we need the key, but why is 'Guestbook' necessary? Is it so you can query for all 'Guestbook' objects in the datastore? But if you need to search a datastore for a type of object, why isn't there a query(type(Greeting))? Considering that that is the ndb.Model you are putting in?
Additionally, if you are feeling generous: why do you have to set parent when creating the object you are storing?
greeting = Greeting(parent=guestbook_key(guestbook_name))
First: GAE Datastore is one big distributed database used by all GAE apps concurrently. To distinguish entities GAE uses system-wide keys. A key is composed of:
Your application name (implicitly set, not visible via API)
Namespace, set via Namespace API (if not set in code, then an empty namespace is used).
Kind of entity. This is just a string and has nothing to do with types at database level. Datastore is schema-less so there are no types. However, language based APIs (Java JDO/JPA/objectify, Python NDB) map this to classes/objects.
Parent keys (afaik, serialised inside key). This is used to establish entity groups (defining scope of transactions).
A particular entity identifier: name (string) or ID (long). They are unique within namespace and kind (and parent key if defined) - see this for more info on ID uniqueness.
See Key methods (java) to see what data is actually stored within the key.
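To make the list above concrete, here is a minimal sketch of a key as a plain data structure. ToyKey, make_key, kind and parent are hypothetical names invented for illustration, and "my-app" is a made-up application name; this is a toy model, not the real ndb.Key class.

```python
from collections import namedtuple

# Toy stand-in for a Datastore key (NOT the real ndb.Key).
# path is a tuple of (kind, identifier) pairs, root ancestor first.
ToyKey = namedtuple("ToyKey", ["app", "namespace", "path"])


def make_key(*flat_path, app="my-app", namespace=""):
    """Build a toy key from alternating kind/identifier values."""
    if not flat_path or len(flat_path) % 2 != 0:
        raise ValueError("path must be kind/identifier pairs")
    pairs = tuple(zip(flat_path[0::2], flat_path[1::2]))
    return ToyKey(app, namespace, pairs)


def kind(key):
    """The kind of an entity is the kind in the last pair of its path."""
    return key.path[-1][0]


def parent(key):
    """The parent key is the same path without the last pair."""
    if len(key.path) < 2:
        return None
    return ToyKey(key.app, key.namespace, key.path[:-1])


guestbook = make_key("Guestbook", "default_guestbook")
greeting = make_key("Guestbook", "default_guestbook", "Greeting", 42)
```

In this model the kind is just a string stored inside the key, which is why 'Guestbook' has to be spelled out when the key is built by hand.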
Second: It seems that GAE Python API does not allow you to query Datastore without defining classes that map to entity kind (I don't use GAE Python, so I might be wrong). Java does have a low-level API that you can use without mapping to classes.
Third: You are not required to define a parent to an entity. Defining a parent is a way to define entity groups, which are important when using transactions. See ancestor paths and transactions.
That's what a key is: a path consisting of pairs of kind and ID. The key is what identifies what kind it is.
I don't understand your second question. You don't have to set a parent, but if you want to set one, you can only do it when creating the entity.
By default indexing is enabled for all the fields in the ndb based model class.
What if I change the indexing definition for a field and redeploy the app; will it drop the indexing or recreate it, for that field, based on the changes in the model class?
Or is it like entity relationships, which can't be changed once defined? I am asking because I am not sure at this point how many fields will need to be indexed in the final application.
You can change the definition of an entity object at any time; the important thing is whether the property is set to be indexed when you put(). Say I have inserted a bunch of objects with a "name" property, un-indexed. Later I enable indexing for future put()s on those entities. All my entities will still be in the datastore, but only the ones that were put with indexing enabled will be returned by queries on that property. The same logic applies when I remove indexing from a language-local model property (a Java @Entity class with Objectify, for example) and then do put().
This is what it means to have a schemaless datastore. They can have all different combinations of properties and indexing on/off for each of them. The only thing that truly binds these entities together is their "kind", which is set to the classname by the framework you're using, or set by hand if you're using the truly low-level API.
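The put()-time behaviour described above can be sketched with a toy in-memory store. ToyDatastore and its methods are hypothetical and only mimic the idea that index rows are written when an entity is put; this is not how the real Datastore is implemented.

```python
class ToyDatastore:
    """Toy model: index rows are written at put() time, per entity."""

    def __init__(self):
        self.entities = {}  # key -> dict of properties
        self.index = {}     # (property, value) -> set of keys

    def put(self, key, props, indexed=()):
        # Drop any index rows left over from a previous put of this key.
        for keys in self.index.values():
            keys.discard(key)
        self.entities[key] = dict(props)
        # Write index rows only for properties indexed in *this* put.
        for prop in indexed:
            if prop in props:
                self.index.setdefault((prop, props[prop]), set()).add(key)

    def query(self, prop, value):
        # Queries read the index only, never the entities themselves.
        return sorted(self.index.get((prop, value), set()))


store = ToyDatastore()
store.put("e1", {"name": "alice"})                     # stored unindexed
store.put("e2", {"name": "alice"}, indexed=("name",))  # stored indexed
before = store.query("name", "alice")

# Re-put the old entity now that the model says "indexed".
store.put("e1", store.entities["e1"], indexed=("name",))
after = store.query("name", "alice")
```

Both entities exist the whole time; the old one only becomes visible to the query after it has been written again.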
Read more here to understand better how indexing works in the schemaless datastore. This answers your question completely if you read the section linked.
I am using Google App Engine with Python. The server gets the entity key from the user, and then I use this code to fetch the entity:
key.get()
But I also want to get the entity only if it belongs to a particular model. How could I do that?
I know that I could do that by this code:
MyModel.get_by_id('my_key')
But this works only with the key name or the ID, and in my case I have the key itself.
After getting user-provided key as urlsafe string from server, construct the NDB key, e.g.:
key = ndb.Key(urlsafe=string)
I'm not sure though, why you don't simply use key.get() after importing MyModel :-)
However, this is how you get an instance using its ID (whether string or integer):
MyModel.get_by_id(key.id(), parent=key.parent(), app=key.app(), namespace=key.namespace())
The keyword arguments are optional unless you use multiple namespaces or application IDs, or the MyModel entity is a child in an entity group.
Alternatively, use key.string_id() or key.integer_id()
Security warning: Since your app accepts user-provided keys, be aware that even the cryptic-looking URL-safe keys can easily be decoded and re-encoded.
For details see Reference for NDB Key
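To illustrate the security warning: as far as I know, a URL-safe key is just URL-safe base64 with the padding stripped, wrapped around the serialized key. The sketch below uses a made-up text payload as a stand-in for the real binary serialization; to_urlsafe and from_urlsafe are hypothetical helper names, not NDB functions.

```python
import base64


def to_urlsafe(raw):
    """Encode bytes as URL-safe base64 without '=' padding."""
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode("ascii")


def from_urlsafe(token):
    """Restore the stripped padding and decode back to bytes."""
    padding = "=" * (-len(token) % 4)
    return base64.urlsafe_b64decode(token + padding)


# Hypothetical stand-in for a serialized key; the real payload is binary.
payload = b"Guestbook/default_guestbook/Greeting/42"
token = to_urlsafe(payload)
decoded = from_urlsafe(token)
```

Anyone who sees the token can run the decode step, so the encoding hides nothing and a client can forge a token for any kind or ID it likes.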
My app should contain several users, each of them having a list of objects (each object is owned by only one user).
My question is: would it be better to have a User entity that references the IDs of its objects, or should I make the user the ancestor of the objects? Please be kind, I am just beginning with NoSQL and Datastore!
Which approach you take depends heavily on your access patterns: what makes retrieval easy, how frequent writes are, and so on. Start the design by building a basic entity-relationship model, then work out what information you need to get to, how often you need it, and what security restrictions apply. Then adjust the model to reflect those access use cases, taking performance, ease of use, and security requirements into account.
Which approach you should choose depends mainly on the consistency model (strong vs eventual) you require for your entities. In Google Cloud Datastore, an entity group (an entity and its descendants) is a unit with strong consistency, transactionality, and locality.
You can read more on the topic here and here.
There is one more important thing to take into account. If you model a parent-child relationship between a user and an object, the parent becomes part of the object's key. If you later change the object's owner, you end up with a different object as far as its key is concerned.
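A small sketch of that last point, using plain tuples as stand-ins for keys. The "User" and "Item" kinds and both helper functions are made up for illustration.

```python
def child_key(owner_id, item_id):
    """Ancestor approach: the owner is part of the item's key path."""
    return (("User", owner_id), ("Item", item_id))


def root_item(owner_id, item_id):
    """Reference approach: the key is owner-free, the owner is a property."""
    return {"key": (("Item", item_id),), "owner": ("User", owner_id)}


# Ancestor approach: changing the owner yields a different key.
key_before = child_key("alice", 1)
key_after = child_key("bob", 1)

# Reference approach: changing the owner only updates a property.
item = root_item("alice", 1)
item_key_before = item["key"]
item["owner"] = ("User", "bob")
item_key_after = item["key"]
```

With the ancestor approach, a change of owner therefore means creating a new entity under the new parent and deleting the old one, rather than updating in place.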
How can I view the simple index definitions in Google's App Engine Datastore? Is it possible at all?
There is a "Datastore Indexes" view which only displays the composite indexes as it seems (the ones you define in datastore_indexes.xml).
What do you mean by "does not work"? For non-custom (built-in) indexes, you have to put the old entities again for them to be included in the index.
From the doc https://developers.google.com/appengine/docs/python/datastore/indexes
"Note, however, that changing a property from unindexed to indexed does not affect any existing entities that may have been created before the change. Queries filtering on the property will not return such existing entities, because the entities weren't written to the query's index when they were created. To make the entities accessible by future queries, you must rewrite them to the Datastore so that they will be entered in the appropriate indexes. That is, you must do the following for each such existing entity:"
It's not possible (yet) to view the simple index definitions on your datastore model.
The actual index in the datastore can vary between entity instances (if the definition was changed at a time when data was already stored). Changing simple indexes thus requires a manual migration (read and put all data so it is stored and indexed again with the new definition). Thanks @marcadian for the pointer.
I'm developing an application with Google App Engine and stumbled across the following scenario, which can perhaps be described as "MVP-lite".
When modeling many-to-many relationships, the standard property to use is the ListProperty. Most likely, your list is comprised of the foreign keys of another model.
However, in most practical applications, you'll usually want at least one more detail when you get a list of keys - the object's name - so you can construct a nice hyperlink to that object. This requires looping through your list of keys and grabbing each object to use its "name" property.
Is this the best approach? Because "reads are cheap", is it okay to get each object even if I'm only using one property for now? Or should I use a special property like tipfy's JsonProperty to save a (key, name) "tuple" to avoid the extra gets?
Though datastore reads are cheaper than datastore writes, they can still add significant time to a request handler. Including the objects' names as well as their foreign keys sounds like a good use of denormalization (e.g., use two list properties to simulate a tuple: one contains the foreign keys and the other contains the corresponding names).
If you decide against this denormalization, then I suggest you batch fetch the entities which the foreign keys refer to (rather than getting them one by one) so that you can at least minimize the number of round trips you make to the datastore.
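The round-trip argument can be sketched with a toy store that counts calls. CountingStore is hypothetical; on App Engine the batch call would be something like db.get() with a list of keys, or ndb.get_multi() in NDB.

```python
class CountingStore:
    """Toy store that counts how many calls (round trips) are made."""

    def __init__(self, data):
        self.data = data
        self.round_trips = 0

    def get(self, key):
        # One call per key.
        self.round_trips += 1
        return self.data.get(key)

    def get_multi(self, keys):
        # One call for the whole batch, results in the order of keys.
        self.round_trips += 1
        return [self.data.get(k) for k in keys]


data = {"k1": {"name": "Ann"}, "k2": {"name": "Bob"}, "k3": {"name": "Cy"}}
keys = ["k1", "k2", "k3"]

one_by_one = CountingStore(data)
names_slow = [one_by_one.get(k)["name"] for k in keys]

batched = CountingStore(data)
names_fast = [entity["name"] for entity in batched.get_multi(keys)]
```

Both variants produce the same names, but the batched one makes a single call regardless of how long the list of keys is.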
When modeling one-to-many (or in some cases, many-to-many) relationships, the standard property to use is the ListProperty.
No, when modeling one-to-many relationships, the standard property to use is a ReferenceProperty, on the 'many' side. Then, you can use a query to retrieve all matching entities.
Returning to your original question: If you need more data, denormalize. Store a list of titles alongside the list of keys.
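A minimal sketch of that denormalization, assuming two parallel lists kept in the same order on the entity. The list names, the URL pattern and build_links are all made up for illustration.

```python
def build_links(keys, titles):
    """Pair each stored key with its stored title; no extra reads needed."""
    if len(keys) != len(titles):
        raise ValueError("keys and titles must stay in sync")
    return ['<a href="/items/%s">%s</a>' % (k, t) for k, t in zip(keys, titles)]


# Two parallel list properties as they might be stored on one entity.
item_keys = ["key-1", "key-2"]
item_titles = ["First item", "Second item"]

links = build_links(item_keys, item_titles)
```

The price is that both lists have to be updated together, and a renamed item has to be updated everywhere its title was copied.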