Is this the right way to define a composite key for
a class:
@PersistenceCapable
class Item {
    @PrimaryKey
    long id;
    @PrimaryKey
    String sellerID;
    // ... other fields follow
}
because I want the pair (id, sellerID) to be unique, not just id on its own.
Thus in the app engine datastore I need an entity which incorporates both
fields somehow into a key (for instance separating them with a dash and
concatenating them) but I am not sure about how to go about instructing
app engine to do so via JDO or even via the low-level API.
The easiest way here is to use KeyFactory and to use a single Key that you generate each time:
http://code.google.com/appengine/docs/java/javadoc/com/google/appengine/api/datastore/KeyFactory.Builder.html
Create a String Key and concatenate the two fields. Creating two @PrimaryKey annotations will not work; treat App Engine as close to a key-value store as possible. I really like Jeff Schnitzer's explanation here about how to think of the datastore as a HashMap/Dictionary:
http://code.google.com/p/objectify-appengine/wiki/Concepts
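As a rough illustration of the concatenation idea, here is a minimal sketch in plain Python (no App Engine imports). The helper names composite_key_name and split_key_name and the dash separator are assumptions made for this example; the resulting string would be used as the key name of the single Key.

```python
def composite_key_name(item_id, seller_id, sep="-"):
    """Build one key name from the (id, sellerID) pair (hypothetical helper)."""
    if item_id < 0:
        raise ValueError("item_id must be non-negative")
    # The numeric id goes first: its digits never contain the separator,
    # so the string can be split back on the first separator even if
    # seller_id itself contains dashes.
    return "%d%s%s" % (item_id, sep, seller_id)


def split_key_name(key_name, sep="-"):
    """Recover (id, sellerID) from a key name built by composite_key_name."""
    id_part, seller_id = key_name.split(sep, 1)
    return int(id_part), seller_id
```

Because the pair maps to exactly one key name, storing a second Item with the same (id, sellerID) overwrites the first instead of creating a duplicate, which is what gives the uniqueness.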
I am using Google App Engine with Python. The server receives an entity key from the user, and I use this code to fetch the entity:
key.get()
But I want to fetch the entity only if it belongs to a particular model. How can I do that?
I know I could do it with this code:
MyModel.get_by_id('my_key')
but this works only with the key name or the ID, and in my case I have the key.
After receiving the user-provided key as a URL-safe string, construct the NDB key, e.g.:
key = ndb.Key(urlsafe=string)
I'm not sure, though, why you don't simply use key.get() after importing MyModel :-)
However, this is how you get an instance using its ID (whether string or integer):
MyModel.get_by_id(key.id(), parent=key.parent(), app=key.app(), namespace=key.namespace())
The keyword arguments are optional unless you use multiple namespaces or application IDs, or MyModel entities have a parent in an entity group.
Alternatively, use key.string_id() or key.integer_id().
Security warning: Since your app accepts user-provided keys, be aware that even the cryptic-looking URL-safe keys can be easily encoded and decoded.
For details see Reference for NDB Key
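To make the security warning concrete, here is a small standard-library sketch. It assumes the URL-safe form is essentially URL-safe base64 of a serialized key. The payload below is a stand-in byte string, not a real datastore key, and the helper names are made up for this example.

```python
import base64


def to_urlsafe(raw_bytes):
    """Encode bytes as URL-safe base64 with the '=' padding stripped."""
    return base64.urlsafe_b64encode(raw_bytes).rstrip(b"=").decode("ascii")


def from_urlsafe(text):
    """Reverse to_urlsafe: restore the padding, then decode."""
    padded = text + "=" * (-len(text) % 4)
    return base64.urlsafe_b64decode(padded.encode("ascii"))
```

Anyone can run the same round trip, so the encoded string is not a secret. Checking key.kind() against the expected kind before calling key.get() is one way to reject keys that point at a different model.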
I'm looking at the GAE example for datastoring here, and among other things this confused me a bit.
def guestbook_key(guestbook_name=DEFAULT_GUESTBOOK_NAME):
    """Constructs a Datastore key for a Guestbook entity with guestbook_name."""
    return ndb.Key('Guestbook', guestbook_name)
I understand why we need the key, but why is 'Guestbook' necessary? Is it so you can query for all 'Guestbook' objects in the datastore? But if you need to search a datastore for a type of object, why isn't there a query(type(Greeting))? Considering that that is the ndb.Model that you are putting in?
Additionally, if you are feeling generous, why in creating the object you are storing, do you have to set parent?
greeting = Greeting(parent=guestbook_key(guestbook_name))
First: GAE Datastore is one big distributed database used by all GAE apps concurrently. To distinguish entities GAE uses system-wide keys. A key is composed of:
Your application name (implicitly set, not visible via API)
Namespace, set via Namespace API (if not set in code, then an empty namespace is used).
Kind of entity. This is just a string and has nothing to do with types at database level. Datastore is schema-less so there are no types. However, language based APIs (Java JDO/JPA/objectify, Python NDB) map this to classes/objects.
Parent keys (afaik, serialised inside key). This is used to establish entity groups (defining scope of transactions).
A particular entity identifier: name (string) or ID (long). They are unique within namespace and kind (and parent key if defined) - see this for more info on ID uniqueness.
See Key methods (java) to see what data is actually stored within the key.
Second: It seems that GAE Python API does not allow you to query Datastore without defining classes that map to entity kind (I don't use GAE Python, so I might be wrong). Java does have a low-level API that you can use without mapping to classes.
Third: You are not required to define a parent to an entity. Defining a parent is a way to define entity groups, which are important when using transactions. See ancestor paths and transactions.
That's what a key is: a path consisting of pairs of kind and ID. The key is what identifies what kind it is.
I don't understand your second question. You don't have to set a parent, but if you want to set one, you can only do it when creating the entity.
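The "path of (kind, ID) pairs" view can be sketched in plain Python with tuples. This models the idea only; it is not how NDB represents keys internally. The helper names and the sample identifiers are assumptions for illustration.

```python
def kind_of(path):
    """The entity's kind is the kind in the last (kind, id) pair."""
    return path[-1][0]


def parent_of(path):
    """The parent key is the path without its last pair, or None for a root."""
    return path[:-1] or None


# A root Guestbook key and a Greeting key that has it as parent.
guestbook = (("Guestbook", "default_guestbook"),)
greeting = guestbook + (("Greeting", 1001),)
```

Here kind_of(greeting) is "Greeting" and parent_of(greeting) is the Guestbook path. This is why the kind string has to be part of the key, and why the parent can only be chosen when the entity (and therefore its key) is created.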
I'm just starting to get my head around non-relational databases, so I'd like to ask some help with converting these traditional SQL/django models into Google App Engine model(s).
The example is for event listings, where each event has a category, belongs to a venue, and a venue has a number of photos attached to it.
In django, I would model the data like this:
class Event(models.Model):
    title = models.CharField()
    start = models.DateTimeField()
    category = models.ForeignKey(Category)
    venue = models.ForeignKey(Venue)

class Category(models.Model):
    name = models.CharField()

class Venue(models.Model):
    name = models.CharField()
    address = models.CharField()

class Photo(models.Model):
    venue = models.ForeignKey(Venue)
    source = models.CharField()
How would I accomplish the equivalent with App Engine models?
There's nothing here that must be de-normalized to work with App Engine. You can change ForeignKey to ReferenceProperty, CharField to StringProperty and DateTimeField to DateTimeProperty and be done. It might be more efficient to store category as a string rather than a reference, but this depends on usage context.
Denormalization becomes important when you start designing queries. Unlike traditional SQL, you can't write ad-hoc queries that have access to every row of every table. Anything you want to query for must be satisfied by an index. If you're running queries today that depend on table scans and complex joins, you'll have to make sure that the query parameters are indexed at write-time instead of calculating them on the fly.
As an example, if you wanted to do a case-insensitive search by event title, you'd have to store a lower-case copy of the title on every entity at write time. Without guessing your query requirements, I can't really offer more specific advice.
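The case-insensitive search example can be sketched in plain Python. The EventStore class and the title_lower field name are hypothetical, and a dict stands in for the datastore index: the lower-case copy is computed once at write time, and the query only does an exact match against it.

```python
from collections import defaultdict


class EventStore:
    """In-memory stand-in for entities plus an index on title_lower."""

    def __init__(self):
        self._by_title_lower = defaultdict(list)

    def put(self, title):
        # Compute the derived field at write time, not at query time.
        event = {"title": title, "title_lower": title.lower()}
        self._by_title_lower[event["title_lower"]].append(event)
        return event

    def find_by_title(self, query):
        # Exact-match lookup on the precomputed lower-case copy.
        return list(self._by_title_lower.get(query.lower(), []))
```

On App Engine the equivalent would be an extra indexed string property that is filled in whenever the entity is saved.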
It's possible to run Django on App Engine.
You need a trio of apps from here:
http://www.allbuttonspressed.com/projects
Django-nonrel
djangoappengine
djangotoolbox
Additionally, this module makes it possible to do the joins across Foreign Key relationships which are not directly supported by datastore methods:
django-dbindexer
It denormalises the fields you want to join against, but it has some limitations: it doesn't update the denormalised values automatically, so it is only really suitable for static values.
Django signals provide a useful starting point for automatic denormalisation.
My background is in relational DBs and I am doing some experimenting with Google AppEngine, primarily for learning. I want to build an "election" app where a user belongs to a state (CA, NY, TX, etc.), picks a party (Republican, Democratic, etc.) and casts a vote for a particular year (2012 for now, but the app could be reused in 2016).
I want a user to be able to see their voting history and maybe change it once for the current election. Also, I am going to require that users specify their zip code and think it would be nice to run some reports by state and/or zip code.
Using a relational DB, it seems you would create some tables like this:
Users(userid, username, city, state, zip)
UserVote(userid, year, vote)
And then use SQL to run reports. With the AppEngine datastore it seems that running aggregate reports is somewhat of a challenge.
My initial take would be to shard by User where each user can contain a list of Votes and then maybe double-save the aggregates elsewhere.
Any suggestions?
P.S. I have seen the AppEngine-MapReduce project, but am not sure if that would be overkill.
I don't remember exactly where I read this, but list properties in GAE become slow after they reach about 200 items. I would recommend against storing votes as a list on the User, in favor of the foreign key approach for Users and Votes.
Aggregates are a challenge since there are none of the common helper functions such as MAX, SUM, COUNT and so on. The best approach would be to store aggregates and counts in a separate datatype which you can query easily and update that every time a user makes a vote.
It's easier in AppEngine to spend the time when you do the write so you can have faster queries later.
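A minimal plain-Python sketch of keeping aggregates up to date at write time follows. The VoteStore class and its field layout are assumptions for illustration; in the datastore the vote and the totals would be separate entities, and the two updates would need a transaction or some tolerance for briefly stale totals.

```python
from collections import Counter


class VoteStore:
    """In-memory stand-in: votes plus precomputed per-state totals."""

    def __init__(self):
        self.votes = {}          # (user_id, year) -> (state, party)
        self.totals = Counter()  # (year, state, party) -> vote count

    def cast(self, user_id, year, state, party):
        key = (user_id, year)
        previous = self.votes.get(key)
        if previous is not None:
            # The user is changing an earlier vote: undo its contribution.
            self.totals[(year,) + previous] -= 1
        self.votes[key] = (state, party)
        self.totals[(year, state, party)] += 1
```

A report by state then reads the stored totals directly instead of counting Vote entities at query time.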
Here's an example of the objects in Java:
@PersistenceCapable
public class User {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;
    ...
}

@PersistenceCapable
public class Vote {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;
    @Persistent
    private Key userKey; // References a User
    ...
}

@PersistenceCapable
public class UserStats {
    @PrimaryKey
    @Persistent(valueStrategy = IdGeneratorStrategy.IDENTITY)
    private Key key;
    @Persistent
    private Key userKey; // References a User
    ...
}
Also, traditional sharding doesn't make much sense in AppEngine since the underlying datastore is designed to handle queries on massive data sets with ease. The exception is if you have a specific counter that can be changed frequently and has a potential for multiple users changing it at the same time. This is a different type of sharding than you're used to in MySQL. Here is Google's article on sharding counters: http://code.google.com/appengine/articles/sharding_counters.html
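The counter-sharding idea can be sketched in plain Python as follows. This is an in-memory illustration under stated assumptions, not the code from Google's article: the ShardedCounter class and the default of 20 shards are made up, and in the datastore each shard would be its own entity updated in its own transaction.

```python
import random


class ShardedCounter:
    """Spread increments over several shards; sum them to read the total."""

    def __init__(self, num_shards=20, rng=None):
        self._shards = [0] * num_shards
        self._rng = rng or random.Random()

    def increment(self, amount=1):
        # Pick a random shard so concurrent writers rarely hit the same one.
        index = self._rng.randrange(len(self._shards))
        self._shards[index] += amount

    def total(self):
        # Reading is more expensive than writing: every shard is summed.
        return sum(self._shards)
```

The trade-off is cheaper, less contended writes in exchange for a read that touches every shard.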
I'm using JDO with Google App Engine for storage and I'm wondering what is the difference between the Key object and Long for id?
I find the long ID more practical, am I missing anything?
Thanks.
A Key is a globally unique identifier which uniquely identifies an entity across all of App Engine. It consists of two pieces:
A path describing what app the entity belongs to, any ancestor keys, and the entity's kind.
An ID (a long) or a key name (a string).
Regardless of whether you choose to use a long or a string as the second piece, a Key object is associated with every entity stored in the datastore.