Incremental IDs on Objectify - google-app-engine

Since upgrading to GAE 1.8, I'm getting scattered ids when annotating with @Id in Objectify:
@Id
private Long id;
Even though I understand the need for scattered ids in terms of avoiding hotspots on the cloud platform, is there a way in Objectify to get the old incremental ids back? Having to display a hexatridecimal (base-36) value like 1DZENH6BSOW in the UI, just to avoid showing the massive generated 64-bit id, doesn't cut it.
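(For context, a value like 1DZENH6BSOW is simply the 64-bit id rendered in base 36. A minimal sketch of that conversion, using an arbitrary example value rather than one from the question:)
public class Base36Demo {
    public static void main(String[] args) {
        long id = 4615061883904L;  // arbitrary example value
        // Radix-36 rendering produces the compact "hexatridecimal" form
        String display = Long.toString(id, 36).toUpperCase();
        // ...and it parses back to the same long
        long roundTrip = Long.parseLong(display.toLowerCase(), 36);
        System.out.println(display + " -> " + roundTrip);
    }
}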
I'd be happy with a secondary annotation @IdLegacy working in conjunction with @Id: @Id would still generate the long id, and I could use the legacy id for display purposes.
SOLUTION:
Inside my constructor, I have a simple piece of code that allocates an id if one doesn't exist (allocated ids appear to come from the datastore's sequential block rather than the scattered one, which is why this works):
if (getId() == null) {
    // Ask the datastore's id allocator for the next id for this kind.
    // (ObjectifyService.factory() would also work here, avoiding the
    // construction of a fresh factory on every call.)
    ObjectifyFactory f = new ObjectifyFactory();
    Key<MyEntity> key = f.allocateId(MyEntity.class);
    setId(key.getId());
}
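In context, the allocation sits in the entity's constructor. A minimal self-contained sketch (the MyEntity name and accessors are assumptions):
import com.googlecode.objectify.Key;
import com.googlecode.objectify.ObjectifyFactory;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;

@Entity
public class MyEntity {
    @Id
    private Long id;

    public MyEntity() {
        // Allocate an id eagerly so the datastore never assigns a scattered one
        if (id == null) {
            ObjectifyFactory f = new ObjectifyFactory();
            Key<MyEntity> key = f.allocateId(MyEntity.class);
            id = key.getId();
        }
    }

    public Long getId() {
        return id;
    }
}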

As far as I know, Objectify passes along the App Engine Datastore's scattered id behavior.
A quick check of the Objectify issue tracker doesn't show any existing request for incremental ids, so consider submitting one to the Objectify devs: http://code.google.com/p/objectify-appengine/issues/list

Related

Entities missing from Google Datastore query results -- composite index stale or missing?

My application runs on Google App Engine and uses Google Cloud Datastore.
I was alerted by one of my users that some entries they are associated with, and had previously been seeing, are no longer appearing for them.
Indeed, when I query the Datastore for these entities (with a single property filter), they are not returned. I was able to find them via querying on a different property and after writing them to the datastore, the index is updated and they are returned in the query.
Perhaps technically my query is not guaranteed to return the entities, as it is weakly consistent, but none of the entities were changed recently, and usually any inconsistent results are resolved quite quickly (it has been several days now).
So it seems like the index entries for this property on these entities were lost or damaged somehow. What to do? Wait and hope the index will be regenerated? I can write entities for this user to the datastore to regenerate the index...but doing it for all my users is not really an option.
The only similar case I can see on SO is this question: Seems the indexes are missing for new entities created since some time late June 1st, 2015, which resulted from this incident: https://status.cloud.google.com/incident/appengine/15015. But no similar incident has occurred recently, according to the status dashboard.
Have you changed which properties are indexed?
If you have, previously existing entities will only be updated in the indexes the next time they are put.
That could cause older entities to not show up in a query.
An App Engine engineer kindly pointed me towards the root of the issue - my query filter property was on a UserProperty type.
From the docs at https://cloud.google.com/appengine/docs/legacy/standard/python/users/userobjects:
"A User value with an email address that does not represent a Google account at the time it is created will never match a User value that represents a real user."
My user's email was not a Google Account but he likely recently created one, causing the user_id() to go from None to an integer id.
This means that from now on, when I do a query like this:
u = User('name@non_google_domain.com')
Entity.all().filter('user_property =', u)
Internally, u is now actually looked up by the combination of name@non_google_domain.com and the integer id, instead of the combination of name@non_google_domain.com and None, causing my entities not to be returned.
Indeed, examining e._entity['user_property'] of a returned entity, the new ones are of the form User('name@non_google_domain.com', _user_id='12345678912345678900') and the old ones are User('name@non_google_domain.com').
UserProperty is no longer recommended for use, because of issues like this.

Google Cloud Datastore indexes

We are using Google App Engine for our new app. We want to use Google's Datastore and are trying to understand how Datastore indexes work.
We understand that there are a couple of limits on indexes; we are especially focused on the per-entity index limitations.
We have an embedded property in one of our models:
Main class:
public class Contact {
    @Indexed
    private String name;

    @Embedded
    @Indexed
    private CStatus cstatus;
}
Embedded class:
public class CStatus {
    private Long start_time = 0L;

    public enum Status {
        ACTIVE, PAUSE, DELETED
    }

    private String status = null;
}
Assume that I saved an instance of Contact,
1. How many predefined indexes will be created for the Contact kind in total?
2. How many index entries will be created in total?
3. Is there a developers' playground available for the Datastore? We have checked Datastore statistics, but it takes 24-48 hours for the index entries list to update.
According to your code, two single-property indexes will be created: one for name and another for status.
You should note that indexes will also be created if somewhere else in the code you run a query that requires other indexes.
Another thing to take note of is that the 200-index limit does not apply to indexes on a single property. It applies to composite indexes spanning multiple properties.
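As an illustration, here is a hedged sketch of a query that would force such a composite index, using the low-level Java datastore API (the property names follow the Contact model above; the embedded-field path cstatus.status is an assumption):
import com.google.appengine.api.datastore.DatastoreService;
import com.google.appengine.api.datastore.DatastoreServiceFactory;
import com.google.appengine.api.datastore.FetchOptions;
import com.google.appengine.api.datastore.Query;
import com.google.appengine.api.datastore.Query.FilterOperator;
import com.google.appengine.api.datastore.Query.FilterPredicate;
import com.google.appengine.api.datastore.Query.SortDirection;

public class CompositeIndexDemo {
    public static void main(String[] args) {
        DatastoreService ds = DatastoreServiceFactory.getDatastoreService();

        // An equality filter on one property combined with a sort on another
        // cannot be served by the built-in single-property indexes, so this
        // query needs a composite index (the kind that counts toward the
        // 200-index limit).
        Query q = new Query("Contact")
                .setFilter(new FilterPredicate("cstatus.status", FilterOperator.EQUAL, "ACTIVE"))
                .addSort("name", SortDirection.ASCENDING);

        // Running this in the dev server records the required composite
        // index in datastore-indexes-auto.xml.
        ds.prepare(q).asList(FetchOptions.Builder.withLimit(10));
    }
}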
As of yet there is no playground that I know of, unless you want to create a dummy project and test your code there. Otherwise you just have to experiment in your development environment until Google addresses that issue.

Storing Relationships as Objectify Keys vs. Long IDs

I am developing a RESTful web service with GAE. My technology stack is centered on Jersey, Spring, and Objectify.
If you don't know Objectify:
“Objectify is a Java data access API specifically designed for the Google App Engine datastore. It occupies a "middle ground"; easier to use and more transparent than JDO or JPA, but significantly more convenient than the Low-Level API. Objectify is designed to make novices immediately productive yet also expose the full power of the GAE datastore.”
https://code.google.com/p/objectify-appengine/
As of now I have used Objectify Keys to store relationships in my models. Like this ...
public class MyModel {
    @Id private Long id;
    private Key<MyOtherModel> myOtherModel;
    ...
}
Objectify keys provide additional power compared to Long IDs, but they can be created from a Long ID and MyOtherModel.class with the static method Key.create(...):
Key.create(MyOtherModel.class, id)
so I don't strictly have to store relationships as Objectify keys at the model level; I just thought it would be more consistent.
The problem is that I need to write a lot of additional code for XML adapters that convert the Objectify keys to Long IDs when I serialize my model objects to JSON, and back again when I deserialize them.
I was thinking about using Long IDs instead and creating an Objectify Key in the DAO when I need it (see the sketch below). This would also keep Objectify-specific code out of everything that isn't a DAO.
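A minimal sketch of that DAO idea (hypothetical names; assumes Objectify 4's ofy() API and that MyOtherModel is a registered root entity):
import static com.googlecode.objectify.ObjectifyService.ofy;

import com.googlecode.objectify.Key;

public class MyOtherModelDao {
    // The model stores only the Long id; the Objectify Key is rebuilt
    // here, so Objectify types never leak out of the DAO layer.
    public MyOtherModel load(Long otherId) {
        Key<MyOtherModel> key = Key.create(MyOtherModel.class, otherId);
        return ofy().load().key(key).now();
    }
}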
I would like some perspective from a more experienced programmer. I have never built software of this size before (several thousand lines of code).
Thanks a lot everyone.
I am an inexperienced datastore/Objectify developer too, so I'm just musing here.
I see your point that replacing the Key<> type in MyModel with a Long id would simplify things for you. I would note, though, that a Key<> object can contain a path (as well as a kind and an id). So if your data model becomes more complex and MyOtherModel is no longer a root kind, your ability to generate a Key<> from a Long id breaks down.
If you know that won't happen, or don't mind changing MyModel later, then I guess that isn't a problem.
For your serialized format I would suggest using a String to hold your key or id. Your Long id can be converted to a string (and would have to be anyway for JSON, so there is no loss in efficiency), but that same string could later be used to hold the full Key too.
You can also store them as long (or Long or String) and have a getMyOtherModelKey() method that returns a key by calling the static Key.create(...) method, plus a getMyOtherModelId() that just returns the ID. This really works both ways, since you can have both methods whether you store a Key or just the ID; see the sketch below.
The trick comes in if you use parents in any of your models. If you do, the ID alone is not enough to get at the other model; you need the ID plus the IDs of all the parents (and grandparents if needed). This is where Keys are nice.
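A minimal sketch of the accessor pattern described above (names are illustrative; it assumes MyOtherModel is a root kind, per the parent caveat):
import com.googlecode.objectify.Key;
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;

@Entity
public class MyModel {
    @Id private Long id;
    private Long myOtherModelId;  // plain id: trivial to serialize to JSON

    public Long getMyOtherModelId() {
        return myOtherModelId;
    }

    // Rebuild the Key on demand; valid only while MyOtherModel has no @Parent
    public Key<MyOtherModel> getMyOtherModelKey() {
        return Key.create(MyOtherModel.class, myOtherModelId);
    }
}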

Can't get generated keys to increase by 1

I'm using Java and JPA for ORM.
Initially I was defining entity keys like this:
@Id
@GeneratedValue(strategy = GenerationType.IDENTITY)
private Key key;
but that resulted in ids that were growing pretty fast and in unpredictable ways (...19, 20, 22, 1003...1007, 1014, 1015, 2004...),
which seems to contradict the docs, which state that "The simplest key field is a long integer value that is automatically populated by JPA with a value unique across all other instances of the class when the object is saved to the datastore for the first time. Long integer keys use an @Id annotation, and a @GeneratedValue(strategy = GenerationType.IDENTITY) annotation".
So I found this unit test and I switched to the way it was done there:
@Id
@GeneratedValue(strategy = GenerationType.SEQUENCE)
private Long id;
which migrated fine after updating some GQL statements, but I'm still seeing keys increasing by 1000 every time.
Should I be using GenerationType.TABLE? Or should I have been using IDENTITY on a Long rather than a Key field?
I'm hoping to get some definitive answers before I keep changing this in my live (beta) app. Unfortunately, every scheme I've tried so far in the dev environment results in contiguous keys, so there's really no way to test new approaches except by deploying.
Thanks in advance.
It's really hard to do contiguous keys on App Engine. The docs never stated that the auto-generated keys are contiguous - only that they will be unique.
The simplest solution on app engine is to design your keys so that you don't need them to be contiguous. Given the way BigTable is designed, if you did have contiguously incrementing keys, you'd likely have some perf bottlenecks whenever a tablet needs to be split under the hood.
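For example, one common way to make contiguous keys unnecessary (a hedged sketch; the Ticket entity and its created field are hypothetical, not from the question) is to keep the generated id purely as an identifier and order records by a separate property:
import java.util.Date;
import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;

@Entity
public class Ticket {
    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;  // unique, but gaps are expected and harmless

    // Hypothetical field: sort and display by creation time instead of by id
    private Date created = new Date();
}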

how to delete an entire db.model class in google app engine easily?

Hi, I read a few posts on this topic. Say I want to delete every object of a given db.Model class, such as LinkRating2. Is there a way to delete them on startup with a simple command? I thought I remembered seeing something like --clear_datastore somewhere. Otherwise, I have been trying various methods in the SDK console, but they all seem to run out of memory; the page takes forever just to load. So what would be the best way to delete an entire entity kind?
class LinkRating2(db.Model):
    user = db.StringProperty()
    link = db.StringProperty()
    rating2 = db.FloatProperty()
I tried this, but it is super slow:
results2 = LinkRating2.all()
results = results2.fetch(500)
while results:
    db.delete(results)
    results = results2.fetch(500)
"Is there a way to delete them on startup with a simple command? I thought I remembered seeing something like --clear_datastore somewhere."
If you're talking about the development server, yes. dev_appserver.py --clear_datastore myapp will start you up with a fresh datastore. This is the best option when you're working locally.
If you do want to wipe out every entity of a given type in production, this is a good excuse to use remote_api instead of writing a web-based handler, both to circumvent the 30 second deadline and to reduce the chance of your entitypocalypse code being run again by accident.
One last thing: if you want to delete some entities but you don't want to bother loading the model definition, you can use the low level datastore API:
from google.appengine.api import datastore

kind = 'LinkRating2'
batch_size = 1000

# A keys-only query through the low-level API; no model class required.
query = datastore.Query(kind=kind, keys_only=True)
results = query.Get(batch_size)
while results:
    print "Deleting %d %s entities" % (len(results), kind)
    datastore.Delete(results)
    results = query.Get(batch_size)
You are much better off working simply with keys than with entire entities. Querying just for entity keys is much faster than fetching entire entities, and you only need the keys for delete.
results = db.Query(LinkRating2, keys_only=True).fetch(1000)
if len(results) > 0:
    db.delete(results)
I didn't bother putting it in a while loop, as trying to delete all of the entities in one go probably stands a good chance of exceeding the 30-second deadline; you'll just need to rerun this until all of the entities are gone. If you do want to loop:
results = db.Query(Final, keys_only=True).fetch(1000)
while len(results) > 0:
    db.delete(results)
    results = db.Query(Final, keys_only=True).fetch(1000)
The --clear_datastore flag will clear all the data from the datastore, not just the entities of a specific kind as you seem to want.
This question has already been asked: Delete all data for a kind in Google App Engine
I think this answer will provide you with what you want. It deletes items by their keys, which should be much faster:
Delete all data for a kind in Google App Engine
