Indexes in Google Datastore - google-app-engine

By default, indexing is enabled for all fields in an ndb-based model class.
If I change the indexing definition for a field and redeploy the app, will Datastore drop or recreate the index for that field based on the changes in the model class?
Or is it like entity relationships, which can't be changed once defined? I am asking because I am not sure at this point how many fields I will need indexed in the final application.

You can change the definition of an entity object at any time; what matters is whether the property is set to be indexed when you put(). Say I have inserted a bunch of objects with an unindexed "name" property. Later I enable indexing for future put()s on those entities. All my entities will still be in the datastore, but only the ones written with the property indexed are queryable by it. The same logic applies when I remove indexing from a language-local model property (a Java @Entity class with Objectify, for example) and then do put().
This is what it means to have a schemaless datastore. Entities can have any combination of properties, with indexing on or off for each of them. The only thing that truly binds these entities together is their "kind", which is set to the class name by the framework you're using, or set by hand if you're using the low-level API.
Read more here to understand better how indexing works in the schemaless datastore. This answers your question completely if you read the section linked.
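To make the behaviour above concrete, here is a minimal sketch in plain Python. It is a toy in-memory model, not the Datastore API: the ToyDatastore class and its put/query methods are made up for illustration, and the only thing it tries to show is that index entries are decided per entity at write time.

```python
# Toy in-memory model, NOT the Datastore API: each put() records index
# entries only for the properties flagged as indexed at write time.

class ToyDatastore:
    def __init__(self):
        self.entities = {}   # key -> dict of property values
        self.index = {}      # (property, value) -> set of keys

    def put(self, key, props, indexed):
        # Drop index entries left by a previous version of this entity.
        for keys in self.index.values():
            keys.discard(key)
        self.entities[key] = dict(props)
        for name in indexed:
            self.index.setdefault((name, props[name]), set()).add(key)

    def query(self, name, value):
        # Queries consult the index only; unindexed entities are invisible.
        return sorted(self.index.get((name, value), set()))


store = ToyDatastore()
store.put("a", {"name": "x"}, indexed=[])          # written while unindexed
store.put("b", {"name": "x"}, indexed=["name"])    # written after the change
found_before = store.query("name", "x")            # entity "a" is not found
store.put("a", store.entities["a"], indexed=["name"])  # re-put "a"
found_after = store.query("name", "x")             # both entities are found
```

Both entities are stored the whole time; the query result only changes once "a" is written again with the property indexed.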

Related

Updating an objectify entity without changing indexed properties

Say I have an Objectify entity with 1 unindexed and 5 indexed fields. If I update the entity by modifying only the unindexed property, will that cause the indexes for the five indexed fields to be rewritten as well? Essentially, I am worried about the write cost here.
Google charges per-entity write, irrespective of the number of indexes.
See https://cloud.google.com/appengine/pricing#costs-for-datastore-calls
Yes, every update of an entity causes updates of all indexed properties. In other words, the write costs are the same whether only one property is updated or all of them.
This is not specific to Objectify - it's how the Datastore works.

NDB Query StringProperty + StructuredProperty [duplicate]

There are many properties in my model that I currently don't need indexed but can imagine I might want indexed at some unknown point in the future. If I explicitly set indexed=False for a property now but change my mind down the road, will Datastore rebuild the entire indices automatically at that point, including for previously written data? Are there any other repercussions for taking this approach?
No, changing indexed=True to indexed=False (and vice-versa) will only affect entities written after that point to the datastore. Here is the documentation that talks about it and the relevant paragraph:
Similarly, changing a property from indexed to unindexed only affects entities subsequently written to the Datastore. The index entries for any existing entities with that property will continue to exist until the entities are updated or deleted. To avoid unwanted results, you must purge your code of all queries that filter or sort by the (now unindexed) property.
If you decide later that you want to start indexing properties, you'll have to go through your entities and re-put them into the datastore.
Note, however, that changing a property from unindexed to indexed does not affect any existing entities that may have been created before the change. Queries filtering on the property will not return such existing entities, because the entities weren't written to the query's index when they were created. To make the entities accessible by future queries, you must rewrite them to the Datastore so that they will be entered in the appropriate indexes. That is, you must do the following for each such existing entity:
Retrieve (get) the entity from the Datastore.
Write (put) the entity back to the Datastore.
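The get-then-put steps above could be sketched as a paged loop. This is only an outline under assumptions: reput_all is a hypothetical helper, and the fetch_page and put_multi callables are supplied by the caller so that the loop itself does not depend on any particular client library.

```python
def reput_all(fetch_page, put_multi, page_size=200):
    """Re-write every entity so it is indexed under the current model.

    fetch_page(page_size, cursor) must return (entities, next_cursor, more);
    put_multi(entities) must write them back. Both are supplied by the caller.
    """
    cursor, more, total = None, True, 0
    while more:
        entities, cursor, more = fetch_page(page_size, cursor)
        if entities:
            put_multi(entities)   # the put is what creates the index entries
            total += len(entities)
    return total
```

With NDB, one way to call it might be reput_all(lambda n, c: MyModel.query().fetch_page(n, start_cursor=c), ndb.put_multi), where MyModel stands for your own model class. For a large number of entities this would normally run in a task queue or a map/reduce job rather than in a single request.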
To index properties of existing entities (as per the documentation):
Retrieve (get) the entity from the Datastore.
Write (put) the entity back to the Datastore.
This didn't work for me. I used the appengine-mapreduce library and wrote a MapOnlyMapper<Entity, Void> that uses DatastoreMutationPool to index all the existing entities in the Datastore.
Let's assume the property "name" was unindexed and I want to index it in all the existing entities. What I had to do was:
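@Override
public void map(Entity value) {
    String property = "name";
    // Read the current (unindexed) value of the property.
    Object existingValue = value.getProperty(property);
    // Store it back explicitly as an indexed property.
    value.setIndexedProperty(property, existingValue);
    // Queue the entity for a batched write.
    datastoreMutationPool.put(value);
}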
Essentially, you will have to set the property as indexed property using setIndexedProperty(prop, value) and then save (put) the entity.
I know I am very late in posting an answer. I thought I could help someone who might be struggling with this problem.

view simple indexes on AppEngine Datastore

How can I view the simple index definitions on Google's App Engine Datastore? Is it possible at all?
There is a "Datastore Indexes" view, but it seems to display only the composite indexes (the ones you define in datastore_indexes.xml).
What do you mean by "does not work"? For non-custom indexes, you need to put the old objects again so that they are included in the index.
From the doc https://developers.google.com/appengine/docs/python/datastore/indexes
"Note, however, that changing a property from unindexed to indexed does not affect any existing entities that may have been created before the change. Queries filtering on the property will not return such existing entities, because the entities weren't written to the query's index when they were created. To make the entities accessible by future queries, you must rewrite them to the Datastore so that they will be entered in the appropriate indexes. That is, you must do the following for each such existing entity:"
It's not possible (yet) to view the simple index definitions on your datastore model.
The actual index in the datastore can vary between entity instances (if the definition was changed when data was already stored). Changing simple indexes thus requires a manual migration (read and put all data so it is stored and indexed again with the new definition). Thanks @marcadian for the pointer.

Removing field indexes safely

I'm moving the data model of an App Engine app to Objectify, and I've noticed that Objectify specifies all properties of its entities as unindexed by default, which makes sense to me, as writes would be quicker and less space would be used.
But the GAE default (at least when I wrote the app) is to create field indexes on all fields, so all my fields are indexed. And there are hundreds of thousands of rows.
I only need a small fraction of these fields indexed, and I would like to set the rest as unindexed. I want to mark these fields as @Unindexed in Objectify, but how can I remove the index data already in the datastore?
To add or remove single-property indexes, change the metadata (add/remove @Index and @Unindex) and then load+save the entities. You may wish to use map/reduce for this.

Use a ListProperty or custom tuple property in App Engine?

I'm developing an application with Google App Engine and stumbled across the following scenario, which can perhaps be described as "MVP-lite".
When modeling many-to-many relationships, the standard property to use is the ListProperty. Most likely, your list is comprised of the foreign keys of another model.
However, in most practical applications, you'll usually want at least one more detail when you get a list of keys - the object's name - so you can construct a nice hyperlink to that object. This requires looping through your list of keys and grabbing each object to use its "name" property.
Is this the best approach? Because "reads are cheap", is it okay to get each object even if I'm only using one property for now? Or should I use a special property like tipfy's JsonProperty to save a (key, name) "tuple" to avoid the extra gets?
Though datastore reads are cheaper than datastore writes, they can still add significant time to a request handler. Including the objects' names as well as their foreign keys sounds like a good use of denormalization (e.g., use two list properties to simulate a tuple: one contains the foreign keys and the other contains the corresponding names).
If you decide against this denormalization, then I suggest you batch fetch the entities which the foreign keys refer to (rather than getting them one by one) so that you can at least minimize the number of round trips you make to the datastore.
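As a rough illustration of the two-list denormalization suggested above, here is a small sketch in plain Python. The helper names add_link and link_pairs are made up for this example; in a real model the two lists would be two list properties on the same entity.

```python
def add_link(keys, names, key, name):
    """Append a (key, name) pair, keeping both parallel lists in sync."""
    keys.append(key)
    names.append(name)


def link_pairs(keys, names):
    """Return (key, name) tuples for rendering, without fetching any entity."""
    if len(keys) != len(names):
        raise ValueError("parallel lists are out of sync")
    return list(zip(keys, names))
```

The trade-off is the usual one for denormalized data: if a referenced object is renamed, the stored copy of its name has to be updated as well.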
"When modeling one-to-many (or in some cases, many-to-many) relationships, the standard property to use is the ListProperty."
No, when modeling one-to-many relationships, the standard property to use is a ReferenceProperty, on the 'many' side. Then, you can use a query to retrieve all matching entities.
Returning to your original question: If you need more data, denormalize. Store a list of titles alongside the list of keys.

Resources