One-to-Many relationship in ndb - google-app-engine

I am reading up on Google App Engine and building a sample app to understand it better.
In a nutshell, the user can record an entry for every day of the month, like a calendar.
The user views the entries on a monthly basis, so no more than 30 or so at a time.
Initially I used db, where the one-to-many relationship was straightforward.
But once I came across ndb, I realized there are two ways of modelling a one-to-many relationship.
1) The structured property seems to act like a repeated property on the User model. Does that mean that if I retrieve one user, I automatically retrieve all the records she has entered (e.g. the entire year)? That isn't very efficient, is it? I guess the advantage is that you get all related data in one read operation.
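from google.appengine.ext import ndb

# Record must be defined before User, because User refers to the class directly.
class Record(ndb.Model):
    notes = ndb.TextProperty()

# UserMixin comes from the asker's own imports (not shown).
class User(UserMixin, ndb.Model):
    email = ndb.StringProperty(required=True)
    password_hash = ndb.TextProperty(required=True)
    record = ndb.StructuredProperty(Record, repeated=True)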
2) Alternatively I could use perhaps the more classic way:
class User(UserMixin, ndb.Model):
    email = ndb.StringProperty(required=True)
    password_hash = ndb.TextProperty(required=True)

class Record(ndb.Model):
    user = ndb.KeyProperty(kind=User)
    notes = ndb.TextProperty()
Which way is the better way in my use case?

The downside of using StructuredProperty instead of KeyProperty is that with StructuredProperty the limit on total entity size (1MB) applies to the sum of the User and all Records it contains (because the structured properties are serialized as part of the User entity). With KeyProperty, each Record has a 1MB limit by itself.
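To make the size trade-off concrete, here is a rough back-of-the-envelope sketch in plain Python (no App Engine SDK needed). It assumes a limit of about 1 MiB per entity and ignores property names and other serialization overhead, so real capacity is somewhat lower. `ENTITY_LIMIT_BYTES`, `fits_embedded` and `fits_separate` are hypothetical names used only for illustration.

```python
# Rough estimate only: real entities carry extra serialization overhead.
ENTITY_LIMIT_BYTES = 1024 * 1024  # assumed per-entity limit of about 1 MiB


def fits_embedded(user_bytes, record_sizes):
    """StructuredProperty case: the User and all its Records share one entity."""
    return user_bytes + sum(record_sizes) <= ENTITY_LIMIT_BYTES


def fits_separate(user_bytes, record_sizes):
    """KeyProperty case: every entity is checked against the limit on its own."""
    return user_bytes <= ENTITY_LIMIT_BYTES and all(
        size <= ENTITY_LIMIT_BYTES for size in record_sizes
    )


# One year of daily notes at about 4 KiB each, plus a small User.
year_of_notes = [4 * 1024] * 365
print(fits_embedded(500, year_of_notes))  # the notes alone add up to about 1.4 MiB
print(fits_separate(500, year_of_notes))
```

Under these assumptions a year of modest daily notes already overflows a single embedded entity, while the same notes stored as separate Record entities stay far below the limit. For the calendar use case, separate records can also be fetched one month at a time.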

Related

GAE: NDB query on list

Google App Engine NDB data model like so:
Users
    Username
    FirstName
    LastName
Posts
    PostID
    PosterUsername
SubscribedPosts
    PostID
    SubscriberUsername
For a specific user, I want to return all the Posts which the user is subscribed to and display them on the page.
Since the wonderful NDB doesn't support JOINs, we do two queries:
postIDList = SubscribedPosts.query(SubscribedPosts.SubscriberUsername == 'johndoe').fetch()
This gives us a list of SubscribedPosts. So how do I take my postIDList list and use it as a filter criteria for a Posts query?
Something like:
results = Posts.query(Post.PostID IN postIDList.PostID)
In a normal relational database, this would be a simple query using table joins. How is this done in Google's ndb?
You are going to run into lots of bottlenecks if you try to design your datastore models the same way you would design tables in a relational database as you have in this example.
Your comment points in one workable direction, although there are other solutions. Going that route, I would drop the "SubscribedPosts" model altogether and use a repeated KeyProperty in the User model to store subscribed posts.
See this related post: One-To-Many Example in NDB
It seems you are looking to model a many-to-many relationship, not one-to-many. Read Modelling Entity Relationships (although it is written for the older db rather than the newer ndb, it still gives the idea).
One of the two entities should maintain a list of keys (repeated=True) of the related other entities. Which entity should hold the list? Preferably the list should be on the side that usually has fewer relationships so that the list of keys is smaller. Another consideration is which side likely has less contention for updates.
In your specific case, let's say that on average users subscribe to 10 posts and each post has 100 users subscribed to it. In this case, we would want to put the list of keys on the Users side of the relation.
class Users(ndb.Model):
    user_name = ndb.StringProperty()
    first_name = ndb.StringProperty()
    last_name = ndb.StringProperty()
    # kind is given as a string because Posts is defined below
    posts = ndb.KeyProperty(kind='Posts', repeated=True)

class Posts(ndb.Model):
    post_id = ndb.StringProperty()
    poster_user_name = ndb.StringProperty()
Establish the relationship by adding to the list in the Users instance:
current_user.posts.append(current_post.key)
current_user.put()  # persist the updated list of keys
For a given Users instance, getting all subscribed Posts is easy since the list of keys of the subscribed Posts is already within the given Users:
ndb.get_multi(given_user.posts)
For a given Posts instance, get all subscribing Users by ...
query = Users.query(Users.posts == given_post.key)
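The two lookup directions can be sketched without the App Engine SDK. The snippet below is only an in-memory stand-in, not the ndb API: the dictionaries `posts_by_key` and `users_by_name` and both helper functions are hypothetical, and the sample data is made up. It illustrates that the forward direction is a set of direct key lookups, while the reverse direction is an equality filter that matches when any element of the repeated property equals the given key.

```python
# In-memory stand-in for the datastore. This is NOT the ndb API; the
# dictionaries and helper names below are hypothetical.
posts_by_key = {
    "post-1": {"poster_user_name": "alice"},
    "post-2": {"poster_user_name": "bob"},
    "post-3": {"poster_user_name": "carol"},
}
users_by_name = {
    "johndoe": {"posts": ["post-1", "post-3"]},
    "janedoe": {"posts": ["post-3"]},
}


def subscribed_posts(user_name):
    """Like ndb.get_multi(given_user.posts): one lookup per stored key."""
    return [posts_by_key[key] for key in users_by_name[user_name]["posts"]]


def subscribers(post_key):
    """Like Users.query(Users.posts == key): match if any list element equals key."""
    return sorted(
        name for name, user in users_by_name.items() if post_key in user["posts"]
    )


print(subscribed_posts("johndoe"))
print(subscribers("post-3"))
```

In the real datastore the reverse lookup is served by an index on the repeated property rather than by scanning every user, which is why the `posts` property has to stay indexed.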

How to implement a one-to-many relation? datastore google app engine

My application has a business entity, and each business belongs to one or more categories.
How should I implement this relation in my database?
I have two options.
First option:
(store all the categories that a specific business belongs to on the business entity)
class business(ndb.Model):
    name = ndb.StringProperty()
    # kind is given as a string because category is defined below
    categories = ndb.KeyProperty(kind='category', repeated=True)

class category(ndb.Model):
    name = ndb.StringProperty()
Second option:
(store all the businesses that belong to a specific category on the category entity)
class business(ndb.Model):
    name = ndb.StringProperty()

class category(ndb.Model):
    name = ndb.StringProperty()
    businesses = ndb.KeyProperty(kind=business, repeated=True)
Which option should I implement?
Another problem:
Every business can have one or more images.
Should I store the image URLs in a list inside the business entity:
class business(ndb.Model):
    name = ndb.StringProperty()
    imagesUrl = ndb.StringProperty(repeated=True)
or create a new entity for each image:
class image(ndb.Model):
    businessKey = ndb.KeyProperty(repeated=True)
    imageUrl = ndb.StringProperty()
I know that the entity size is limited to 1 MB, right?
Decisions like this one usually depend on your usage patterns. In your case, however, it looks like option 1 is the logical choice, because there are many more business entities than category entities.
For example, if you need to know which categories a business belongs to when you retrieve a single business entity, you will have to run an extra query if you choose option 2. With option 1, this extra query is unnecessary.
Another consideration is the frequency of updates. If you go with option 2, you will have to update your category entity every time you add a new business entity (thus, two entities have to be updated, which impacts performance and costs). With option 1 you only need to update one entity.

Google AppEngine: making complex queries

I have a model like this:
class Person(db.Model):
    ...
    zipcode = db.StringProperty()

class Book(db.Model):
    owner = db.ReferenceProperty(Person)
    # the db keyword is auto_now_add, not auto_add
    date_created = db.DateTimeProperty(auto_now_add=True)
How can I get a list of all the books within certain zipcode that are sorted by date_created in descending order?
Taking Greg's answer one step further, the right thing to do on App Engine's datastore would be to denormalize:
class Person(db.Model):
    ...
    zipcode = db.StringProperty()

class Book(db.Model):
    owner = db.ReferenceProperty(Person)
    date_created = db.DateTimeProperty(auto_now_add=True)
    zipcode = db.StringProperty()  # denormalized copy of owner.zipcode
You'd have to be careful to set the Book's zipcode when you set the owner. After that it's pretty straightforward to do the query you want as a single query. Greg's solution is more costly since you'll need to do multiple queries and fetch many entities just to do the merge. With large datasets, it's also slow, and you risk various fetch operations timing out, as well as the merge operation taking a long time and a lot of memory.
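As a minimal sketch of that discipline, the plain-Python stand-in below (not the db API) routes every owner change through one hypothetical `set_owner` method, so the copied zipcode is set whenever the owner is. The dataclasses, the `books_in_zipcode` helper and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class Person:
    name: str
    zipcode: str


@dataclass
class Book:
    title: str
    date_created: datetime
    owner: Optional[Person] = None
    zipcode: Optional[str] = None  # denormalized copy of owner.zipcode

    def set_owner(self, person):
        # Set both fields together so the copy is never forgotten.
        self.owner = person
        self.zipcode = person.zipcode


def books_in_zipcode(books, zipcode):
    """Filter on the copied zipcode, then sort newest first."""
    matching = [book for book in books if book.zipcode == zipcode]
    return sorted(matching, key=lambda book: book.date_created, reverse=True)


ann = Person("Ann", "94105")
bob = Person("Bob", "10001")
older = Book("Older", datetime(2011, 1, 1))
newer = Book("Newer", datetime(2012, 1, 1))
other = Book("Other", datetime(2013, 1, 1))
older.set_owner(ann)
newer.set_owner(ann)
other.set_owner(bob)

print([b.title for b in books_in_zipcode([older, newer, other], "94105")])
```

With the real db models the equivalent would be a single filtered and ordered query on Book, along the lines of `Book.all().filter('zipcode =', z).order('-date_created')`. Note that the copy also has to be refreshed on every Book if a Person's zipcode ever changes.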
You would have to do a query for all Person entities with the zipcode you're looking for, and then lookup each of their books, merge the lists and then sort the result.
Generally, normalisation like you'd do with a relational database won't be the best way to model things with datastore.

Google App Engine Entity Ownership

I am writing an app for GAE in Python which stores recipes for different users. I have an entity called User in the datastore and an entity called Recipe. I want to be able to set the owner of each Recipe to the User who created it. Also, I want each User entity to contain a list of all Recipes belonging to that User as well as being able to query the Recipe database to find all Recipes owned by a particular User.
What is the best way to go about creating this parent/child type relationship?
Thanks
There are two main ways. (I am going to assume you're using Python, which is what the examples are written in.)
Option 1. Make the User the ancestor of all of their recipes:
recipe = Recipe(parent=user.key)
Option 2. Use a key property:
class Recipe(ndb.Model):
    owner = ndb.KeyProperty()
recipe = Recipe(owner=user.key)
All recipes for a user with option 1:
recipes = Recipe.query(ancestor=user.key)
All recipes for a user with option 2:
recipes = Recipe.query().filter(Recipe.owner == user.key)
Which one you use really depends a lot on what you plan to do with the data after creation, transaction patterns etc.... You should elaborate on your use cases. Both will work.
Also, you should read up on transactions and entity groups and understand them to really determine whether you want to use ancestors: https://developers.google.com/appengine/docs/java/datastore/transactions?hl=en .
If you use db.Model to model a one-to-many relationship, you can use the ReferenceProperty constructor and specify a collection_name. For example, one book may have many reviews.
class Book(db.Model):
    title = db.StringProperty()
    author = db.StringProperty()

class BookReview(db.Model):
    book = db.ReferenceProperty(Book, collection_name='reviews')

b = Book()
b.put()
br = BookReview()
br.book = b  # sets br's 'book' property to b's key
br.put()  # save the review so that it can show up in b.reviews
for review in b.reviews:  # use collection_name to retrieve all reviews for a book
    ...
see https://developers.google.com/appengine/docs/python/datastore/datamodeling#references
Alternatively, you can use ndb's KeyProperty as in Tim's answer.
Also see
db.ReferenceProperty() vs ndb.KeyProperty in App Engine

de-normalizing data model: django/sql -> app engine

I'm just starting to get my head around non-relational databases, so I'd like to ask some help with converting these traditional SQL/django models into Google App Engine model(s).
The example is for event listings, where each event has a category, belongs to a venue, and a venue has a number of photos attached to it.
In django, I would model the data like this:
class Event(models.Model):
    title = models.CharField()
    start = models.DateTimeField()
    # string references, because Category and Venue are defined below
    category = models.ForeignKey('Category')
    venue = models.ForeignKey('Venue')

class Category(models.Model):
    name = models.CharField()

class Venue(models.Model):
    name = models.CharField()
    address = models.CharField()

class Photo(models.Model):
    venue = models.ForeignKey(Venue)
    source = models.CharField()
How would I accomplish the equivalent with App Engine models?
There's nothing here that must be de-normalized to work with App Engine. You can change ForeignKey to ReferenceProperty, CharField to StringProperty and DateTimeField to DateTimeProperty and be done. It might be more efficient to store category as a string rather than a reference, but this depends on usage context.
Denormalization becomes important when you start designing queries. Unlike traditional SQL, you can't write ad-hoc queries that have access to every row of every table. Anything you want to query for must be satisfied by an index. If you're running queries today that depend on table scans and complex joins, you'll have to make sure that the query parameters are indexed at write-time instead of calculating them on the fly.
As an example, if you wanted to do a case-insensitive search by event title, you'd have to store a lower-case copy of the title on every entity at write time. Without guessing your query requirements, I can't really offer more specific advice.
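The lower-case copy can be sketched in plain Python. This is an in-memory stand-in rather than App Engine code: the `events` list plays the role of stored entities, and `put_event` and `find_by_title` are hypothetical helpers with made-up sample data. The point is that the normalization happens once, at write time, and the lookup compares against the stored copy only.

```python
events = []  # in-memory stand-in for stored entities


def put_event(title):
    """Store the title together with a lower-case copy computed at write time."""
    entity = {"title": title, "title_lower": title.lower()}
    events.append(entity)
    return entity


def find_by_title(text):
    """Case-insensitive match: normalize the search text, compare to the stored copy."""
    needle = text.lower()
    return [e["title"] for e in events if e["title_lower"] == needle]


put_event("Jazz Night")
put_event("JAZZ NIGHT")
put_event("Rock Show")

print(find_by_title("jazz night"))
```

On the datastore the comparison would be an indexed equality filter on the lower-case property instead of a scan. If you use ndb rather than db, a ComputedProperty is one way to keep such a derived copy up to date.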
It's possible to run Django on App Engine
You need a trio of apps from here:
http://www.allbuttonspressed.com/projects
Django-nonrel
djangoappengine
djangotoolbox
Additionally, this module makes it possible to do the joins across Foreign Key relationships which are not directly supported by datastore methods:
django-dbindexer
It denormalises the fields you want to join against, but it has some limitations: it doesn't update the denormalised values automatically, so it is only really suitable for static values.
Django signals provide a useful starting point for automatic denormalisation.
