Selecting Entity based on auto generated ID in google datastore - google-app-engine

I created an entity with a few attributes but without specifying a key, so Datastore generated an ID for it automatically.
Entity en=new Entity("Job");
Now, when I fetch such entities and store them in Java objects, how can I get the auto-generated ID (which I need in order to perform an UPDATE later)?
I have tried the following, but none of them returns the identifier:
en.getProperty("__key__");
en.getProperty("ID/Name");
en.getProperty("Key");

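You may be looking for:
en.getProperty(Entity.KEY_RESERVED_PROPERTY)
mentioned in Key Filters (not an obvious place to find it). Note, however, that this constant is the string "__key__", which you have already tried; it is meant for use in query filters rather than with getProperty().
The more direct approach is:
en.getKey().getId()
mentioned in the Entity JavaDoc and Key JavaDoc.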
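The distinction behind both suggestions is that the identifier belongs to the entity's key rather than to its property map. The sketch below is a plain-Python stand-in, not the App Engine API: `Key`, `Entity` and `FakeDatastore` are hypothetical names that only mimic the shape of the Java classes, and the counter-based ID allocation is an assumption made for illustration (real Datastore IDs are not sequential).

```python
import itertools


class Key:
    """Hypothetical stand-in for a Datastore key: a kind plus a numeric ID."""

    def __init__(self, kind, id=None):
        self.kind = kind
        self.id = id  # None until the entity is stored


class Entity:
    """Hypothetical stand-in for an entity: a key plus a property map."""

    def __init__(self, kind):
        self.key = Key(kind)
        self.properties = {}

    def get_property(self, name):
        # Only user-set properties live here; the ID is not one of them.
        return self.properties.get(name)


class FakeDatastore:
    """In-memory store that fills in an ID when an entity is put."""

    def __init__(self):
        self._ids = itertools.count(1)  # assumption: real IDs are not sequential
        self._rows = {}

    def put(self, entity):
        if entity.key.id is None:
            entity.key.id = next(self._ids)
        self._rows[(entity.key.kind, entity.key.id)] = entity
        return entity.key

    def get(self, kind, id):
        return self._rows.get((kind, id))


store = FakeDatastore()
job = Entity("Job")
job.properties["title"] = "nightly export"
store.put(job)

looked_up_as_property = job.get_property("__key__")  # not a property
job_id = job.key.id  # the ID is read from the key
same_job = store.get("Job", job_id)  # and can be used to fetch for an update
```

In this model the property lookup yields nothing, while the key carries the ID that a later update would use to fetch the same entity.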

Related

How to ensure isolation with non-ancestor query

I want to create user using ndb such as below:
def create_user(self, google_id, ....):
    user_keys = UserInformation.query(UserInformation.google_id == google_id).fetch(keys_only=True)
    if user_keys:  # check whether the user exists
        # already created
        ...(SNIP)...
    else:
        # create new user entity.
        UserInformation(
            # primary key is an incomplete key
            google_id = google_id,
            facebook_id = None,
            twitter_id = None,
            name =
            ...(SNIP)...
        ).put()
If this function is called twice at the same time, two users are created (isolation is not ensured between the query and the put()).
So I added @ndb.transactional to the above function.
But the following error occurred:
BadRequestError: Only ancestor queries are allowed inside transactions.
How can I ensure isolation with a non-ancestor query?
The ndb library doesn't allow non-ancestor queries inside transactions. So if you make create_user() transactional you get the above error because you call UserInformation.query() inside it (without an ancestor).
If you really want to do that you'd have to place all your UserInformation entities inside the same entity group by specifying a common ancestor and make your query an ancestor one. But that has performance implications, see Ancestor relation in datastore.
Otherwise, you could split the function in two: a non-transactional part that makes the query, followed by a transactional part that just creates the user. That avoids the error, but you would still face the datastore's eventual consistency, which is the root cause of your problem: a query may not immediately return a recently added entity, because it takes some time for the index behind the query to be updated. That leaves room for creating duplicate entities for the same user. See Balancing Strong and Eventual Consistency with Google Cloud Datastore.
One possible approach would be to check later, or periodically, whether there are duplicates and remove them (possibly merging their information into a single entity). And/or mark the user creation as "in progress", record the newly created entity's key, and keep querying until the key appears in the query result, at which point you mark the entity creation as "done" (you might not have time to do that inside the same request).
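The periodic clean-up could look roughly like the sketch below. It works on plain dictionaries rather than ndb entities; the `created` field, the merge rule (keep the oldest record and fill in fields it lacks from the newer copies) and the function name `merge_duplicates` are all assumptions made for illustration. In a real job the losers would be deleted and the survivor saved again.

```python
from collections import defaultdict


def merge_duplicates(users):
    """Group user records by google_id and collapse each group to one record.

    Returns (survivors, losers). The oldest record in each group survives
    and picks up any field it lacks from the newer duplicates.
    """
    groups = defaultdict(list)
    for user in users:
        groups[user["google_id"]].append(user)

    survivors, losers = [], []
    for group in groups.values():
        group.sort(key=lambda u: u["created"])
        keeper = dict(group[0])  # copy so the input is left untouched
        for dup in group[1:]:
            for field, value in dup.items():
                if keeper.get(field) is None and value is not None:
                    keeper[field] = value
            losers.append(dup)
        survivors.append(keeper)
    return survivors, losers


users = [
    {"google_id": "g1", "created": 1, "name": "Ann", "twitter_id": None},
    {"google_id": "g1", "created": 2, "name": "Ann", "twitter_id": "@ann"},
    {"google_id": "g2", "created": 3, "name": "Bob", "twitter_id": None},
]
survivors, losers = merge_duplicates(users)
```

Because the query that finds the duplicates is itself eventually consistent, such a job would need to run more than once to catch late arrivals.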
Another approach would be, if possible, to derive a unique key from the user information and simply check whether an entity with that key exists, instead of making a query. Key lookups are strongly consistent and can be done inside transactions, so that would solve your duplicates problem. For example, you could use the google_id as the key ID. That is just an example and not ideal either: you may have users without a google_id, users may want to change their google_id without losing other info, etc. You could also track the in-progress user creation in the session info to prevent repeated attempts to create the same user in the same session (but that won't help with attempts from different sessions).
For your use case, perhaps you could use ndb models' get_or_insert method, which according to the API docs:
Transactionally retrieves an existing entity or creates a new one.
So you can do:
user = UserInformation.get_or_insert(*args, **kwargs)
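without risking the creation of a duplicate user. Note that get_or_insert() looks the entity up by key name, so the first argument must be a name that is unique per user.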
The complete docs:
classmethod get_or_insert(*args, **kwds)
Transactionally retrieves an existing entity or creates a new one.
Positional Args:
name – Key name to retrieve or create.
Keyword Arguments:
namespace – Optional namespace.
app – Optional app ID.
parent – Parent entity key, if any.
context_options – ContextOptions object (not keyword args!) or None.
**kwds – Keyword arguments to pass to the constructor of the model class if an instance for the specified key name does not already exist. If an instance with the supplied key_name and parent already exists, these arguments will be discarded.
Returns: Existing instance of Model class with the specified key name and parent or a new one that has just been created.
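To make the semantics concrete, here is a minimal in-memory model of what the docs describe. It is not ndb code: `FakeModelStore` is a hypothetical class, and a `threading.Lock` stands in for the datastore transaction. The point it illustrates is that lookup and creation are keyed by name and happen as one atomic step, so concurrent callers with the same key name end up with one entity.

```python
import threading


class FakeModelStore:
    """Hypothetical in-memory model of get_or_insert semantics."""

    def __init__(self):
        self._rows = {}
        self._lock = threading.Lock()  # stands in for the datastore transaction

    def get_or_insert(self, key_name, **kwds):
        # Lookup and insert happen as one atomic step, keyed by name.
        with self._lock:
            if key_name not in self._rows:
                self._rows[key_name] = dict(kwds)
            # kwds are discarded when the row already exists
            return self._rows[key_name]

    def count(self):
        return len(self._rows)


store = FakeModelStore()
results = []


def create_user(google_id, name):
    results.append(store.get_or_insert(google_id, google_id=google_id, name=name))


# Eight concurrent attempts to create the same user.
threads = [
    threading.Thread(target=create_user, args=("g-123", "attempt %d" % i))
    for i in range(8)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

With ndb itself, the equivalent call would presumably take the form UserInformation.get_or_insert(google_id, google_id=google_id, ...), which only works if google_id (or some other stable value) is acceptable as the key name.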

How to delete a property/value from an entity using google cloud datastore

I am learning google.cloud.datastore and would like to know how to delete a property, along with its value, from an entity. Also, is it possible to delete a specific property, or a list of properties, from all entities of a certain kind?
Is my understanding correct that datastore stores and manipulates data row-wise (as entities)?
cheers
Your understanding is correct: all datastore write operations happen at the entity level. So, to modify one property or a subset of properties, you retrieve the entity, modify its property set (or delete the property, if that is what you want) and save the entity.
The exact details depend on the language and library used. From Updating an entity:
To update an existing entity, modify the properties of the entity
and store it using the key:
PYTHON
with client.transaction():
    key = client.key('Task', 'sample_task')
    task = client.get(key)
    task['done'] = True
    client.put(task)
The object data overwrites the existing entity. The entire object is
sent to Cloud Datastore. If the entity does not exist, the update will
fail. If you want to update-or-create an entity, use upsert as
described previously.
Note: To delete a property, remove the property from the entity, then save the entity.
In the above snippet, for example, deleting the done property of the task entity, if existing, would be done like this:
with client.transaction():
    key = client.key('Task', 'sample_task')
    task = client.get(key)
    if 'done' in task:
        del task['done']
        client.put(task)
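For the second part of the question (removing properties from all entities of a kind), the same idea can be applied in a loop: query the kind, drop the properties from each entity, and save the changed entities in batches. The sketch below assumes a client that behaves like google.cloud.datastore.Client (query(kind=...).fetch() and put_multi()); `drop_properties`, the batch size of 500 and the `FakeClient` used to exercise it are assumptions made for illustration, not part of the library.

```python
def drop_properties(client, kind, names, batch_size=500):
    """Remove the given property names from every entity of `kind`.

    `client` is expected to behave like google.cloud.datastore.Client:
    client.query(kind=...).fetch() yields dict-like entities and
    client.put_multi(entities) saves them.
    """
    changed = 0
    batch = []
    for entity in client.query(kind=kind).fetch():
        present = [n for n in names if n in entity]
        if not present:
            continue  # nothing to delete, skip the write
        for n in present:
            del entity[n]
        batch.append(entity)
        if len(batch) >= batch_size:
            client.put_multi(batch)
            changed += len(batch)
            batch = []
    if batch:
        client.put_multi(batch)
        changed += len(batch)
    return changed


class _FakeQuery:
    def __init__(self, rows):
        self._rows = rows

    def fetch(self):
        return iter(self._rows)


class FakeClient:
    """Hypothetical in-memory stand-in used only to exercise the function."""

    def __init__(self, rows_by_kind):
        self.rows_by_kind = rows_by_kind
        self.put_calls = 0

    def query(self, kind):
        return _FakeQuery(self.rows_by_kind.get(kind, []))

    def put_multi(self, entities):
        self.put_calls += 1  # entities were already mutated in place


fake = FakeClient({"Task": [
    {"done": True, "description": "a"},
    {"description": "b"},
    {"done": False, "priority": 4, "description": "c"},
]})
changed = drop_properties(fake, "Task", ["done", "priority"])
```

Unlike the single-entity snippet, this loop is not transactional, so an entity updated by another writer between the fetch and the put could have that update overwritten.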

Generating a unique id GAE datastore

In MySQL I used auto-increment to generate an id for every user. I would like to create a similar user table in Google Datastore where the id for a user will be unique. According to these docs: https://cloud.google.com/appengine/docs/java/datastore/entities
System-allocated ID values are guaranteed unique to the entity group.
But according to this post: Ever see duplicate IDs when using Google App Engine and ndb? the ids are not unique. I need this id to be unique. It is confusing because the docs say the id is unique, while the post says that it is the key, not the id, that is unique. My objective is for no two users to have the same id. How can I guarantee this? I would prefer the database to take care of this for me, as opposed to having to create large ids manually using things such as UUIDs.
As Igor correctly observed, IDs are always unique as long as the entity has no parent.
I can't think of any reason to make user entities children of some other entities, so you are safe.
Note that IDs will not be sequential, as it helps to spread the load equally across the entire dataset - it's a by-product of how the Datastore is designed.

Why does Google AppEngine have two primary keys, 'key' and 'id/name'?

If you leave one or the other empty, or don't specify in your Entity, it creates a key/id for that entity anyways, as seen in the admin console datastore viewer.
Bonus question: Why can't you get the ID for an Entity object after you put() it? entity.getProperty("id") returns null. Key objects cannot be serialized so cannot be used by GWT.
Reference:
https://developers.google.com/appengine/docs/java/datastore/entities
https://developers.google.com/appengine/docs/java/datastore/jdo/creatinggettinganddeletingdata#Keys
Entities have a Key, and Keys (of persisted entities) have either auto-assigned ids, or programmer-supplied names. The name/id is a property of the Key, not a property of the Entity.
Instead of entity.getProperty("id") in Java you write entity.getKey().getId() (or .getName() if you gave the key a name).
The lower-level details are in:
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/datastore/Entity
https://developers.google.com/appengine/docs/java/javadoc/com/google/appengine/api/datastore/Key

Issues understanding Google App Engine key

I'm looking at the GAE example for datastoring here, and among other things this confused me a bit.
def guestbook_key(guestbook_name=DEFAULT_GUESTBOOK_NAME):
    """Constructs a Datastore key for a Guestbook entity with guestbook_name."""
    return ndb.Key('Guestbook', guestbook_name)
I understand why we need the key, but why is 'Guestbook' necessary? Is it so you can query for all 'Guestbook' objects in the datastore? But if you need to search a datastore for a type of object, why isn't there a query(type(Greeting)), considering that that is the ndb.Model you are putting in?
Additionally, if you are feeling generous: why do you have to set parent when creating the object you are storing?
greeting = Greeting(parent=guestbook_key(guestbook_name))
First: GAE Datastore is one big distributed database used by all GAE apps concurrently. To distinguish entities GAE uses system-wide keys. A key is composed of:
Your application name (implicitly set, not visible via API)
Namespace, set via Namespace API (if not set in code, then an empty namespace is used).
Kind of entity. This is just a string and has nothing to do with types at database level. Datastore is schema-less so there are no types. However, language based APIs (Java JDO/JPA/objectify, Python NDB) map this to classes/objects.
Parent keys (afaik, serialised inside key). This is used to establish entity groups (defining scope of transactions).
A particular entity identifier: name (string) or ID (long). They are unique within namespace and kind (and parent key if defined) - see this for more info on ID uniqueness.
See Key methods (java) to see what data is actually stored within the key.
Second: It seems that GAE Python API does not allow you to query Datastore without defining classes that map to entity kind (I don't use GAE Python, so I might be wrong). Java does have a low-level API that you can use without mapping to classes.
Third: You are not required to define a parent to an entity. Defining a parent is a way to define entity groups, which are important when using transactions. See ancestor paths and transactions.
That's what a key is: a path consisting of pairs of kind and ID. The key is what identifies what kind it is.
I don't understand your second question. You don't have to set a parent, but if you want to set one, you can only do it when creating the entity.
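The "path of kind and ID pairs" description can be sketched in a few lines. This is a hypothetical model, not the ndb class: it only mirrors the flat ndb.Key('Kind', id, 'Kind', id, ...) calling convention to show how the kind, the parent and the entity group root all fall out of the path.

```python
class Key:
    """Hypothetical model of a key as a flat path of (kind, id_or_name) pairs."""

    def __init__(self, *flat):
        if not flat or len(flat) % 2:
            raise ValueError("expected kind/id pairs")
        self.pairs = tuple(zip(flat[0::2], flat[1::2]))

    def kind(self):
        return self.pairs[-1][0]  # kind of the entity itself

    def id(self):
        return self.pairs[-1][1]

    def parent(self):
        if len(self.pairs) == 1:
            return None  # a root entity has no parent
        flat = [part for pair in self.pairs[:-1] for part in pair]
        return Key(*flat)

    def root(self):
        return Key(*self.pairs[0])  # identifies the entity group

    def __eq__(self, other):
        return isinstance(other, Key) and self.pairs == other.pairs

    def __hash__(self):
        return hash(self.pairs)


guestbook = Key("Guestbook", "default_guestbook")
greeting_a = Key("Guestbook", "default_guestbook", "Greeting", 1)
greeting_b = Key("Guestbook", "other_guestbook", "Greeting", 1)
```

Here greeting_a and greeting_b share the kind and ID of their last pair but are different keys because their parents differ, which is why an ID is only unique in combination with its parent path.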
