Twitter-ish DB structure in Google App Engine - google-app-engine

I'm trying to create a site which is quite similar to Twitter. Users will be able to post messages. And users will be able to 'follow' each other. On the homepage, they see the messages from the users they follow, sorted by time.
How do I go about creating the appengine models for this?
In a traditional relational DB, I guess it would be something like this:
Table 'user':
    id
    username
Table 'follows':
    user_id
    follow_id
Table 'messages':
    user_id
    message
And the query will be something like:
SELECT * FROM messages m, follows f WHERE m.user_id = f.follow_id AND f.user_id = current_user_id
I hope the example above makes it clear. How do I replicate this in Google App Engine?

Brett Slatkin gave a useful presentation at Google I/O 2009 that describes building a scalable Twitter-like microblog app and deals with this very question at length: http://www.google.com/events/io/2009/sessions/BuildingScalableComplexApps.html

REVISED:
class AppUser(db.Model):
    user_id = db.UserProperty()
    username = db.StringProperty()
    following = db.ListProperty(db.Key)  # list of AppUser keys

class Message(db.Model):
    sender = db.ReferenceProperty(AppUser)
    body = db.TextProperty()
You would then query the results in two steps:
message_list = []
for followed_user in current_user.following:
    subresult = db.GqlQuery("SELECT __key__ FROM Message WHERE sender = :1", followed_user)
    message_list.extend(subresult)
results = Message.get(message_list)
(with 'current_user' being the 'AppUser' entity corresponding with your active user)
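The loop above gathers messages one followed user at a time, so the combined list is not yet in time order, which the homepage needs. A minimal plain-Python sketch of that final step follows. It assumes each message carries a `created` timestamp (a hypothetical field, not part of the models above) and uses dicts in place of datastore entities:

```python
from datetime import datetime

def build_timeline(following, messages_by_sender, limit=20):
    """Gather messages from followed users and return the newest first.

    `messages_by_sender` stands in for one datastore query per followed user;
    `created` is a hypothetical timestamp field, not part of the models above.
    """
    timeline = []
    for sender in following:
        timeline.extend(messages_by_sender.get(sender, []))
    # Sort the combined list so the most recent message comes first.
    timeline.sort(key=lambda m: m["created"], reverse=True)
    return timeline[:limit]

# Made-up sample data for illustration only.
messages_by_sender = {
    "bob": [{"sender": "bob", "body": "hello", "created": datetime(2010, 1, 1, 9, 0)}],
    "carol": [
        {"sender": "carol", "body": "first", "created": datetime(2010, 1, 1, 8, 0)},
        {"sender": "carol", "body": "latest", "created": datetime(2010, 1, 1, 10, 0)},
    ],
    "dave": [{"sender": "dave", "body": "not followed", "created": datetime(2010, 1, 1, 11, 0)}],
}

timeline = build_timeline(["bob", "carol"], messages_by_sender)
bodies = [m["body"] for m in timeline]
```

Sorting in memory like this is only reasonable for small follow lists; the presentation linked above covers approaches meant for larger fan-out.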

Related

How to flatten a 'friendship' model within User model in GAE?

I recently came across a number of articles recommending that data be flattened for NoSQL databases. Coming from traditional SQL databases, I realized I was replicating SQL DB behaviour in GAE, so I started to refactor code where possible.
We have e.g. a social media site where users can become friends with each other.
class Friendship(ndb.Model):
    from_friend = ndb.KeyProperty(kind=User)
    to_friend = ndb.KeyProperty(kind=User)
Effectively, the app creates a Friendship instance in each direction between the two users.
friendshipA = Friendship(from_friend=userA, to_friend=userB)
friendshipB = Friendship(from_friend=userB, to_friend=userA)
How could I now move this into the actual User model to flatten it? I thought maybe I could use a StructuredProperty. I know it is limited to 5000 entries, but that should be enough for friends.
class User(UserMixin, ndb.Model):
    name = ndb.StringProperty()
    friends = ndb.StructuredProperty(User, repeated=True)
So I came up with this. However, it seems User can't point to itself, because I get a NameError: name 'User' is not defined
Any idea how I could flatten it so that a single User instance would contain all its friends, with all their properties?
You can't create a StructuredProperty that references itself. Also, using a StructuredProperty to store a copy of User has the additional problem that you need to perform a manual cascade update whenever a user modifies a property that is stored in the copy.
However, as KeyProperty accepts a string as kind, you can easily store the list of Users using KeyProperty, as suggested by @dragonx. You can further optimise reads by using ndb.get_multi to avoid multiple round-trip RPC calls when retrieving friends.
Here is some sample code:
class User(ndb.Model):
    name = ndb.StringProperty()
    friends = ndb.KeyProperty(kind="User", repeated=True)

userB = User(name="User B")
userB_key = userB.put()
userC = User(name="User C")
userC_key = userC.put()
userA = User(name="User A", friends=[userB_key, userC_key])
userA_key = userA.put()

# To retrieve all friends
for user in ndb.get_multi(userA.friends):
    print "user: %s" % user.name
Use a KeyProperty that stores the key for the User instance.
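The cascade-update concern with embedded copies can be illustrated without the datastore. This is only a sketch: a plain dict stands in for the datastore, and `get_multi` here is a hypothetical helper playing the role of ndb.get_multi, not the real API:

```python
# In-memory stand-in for the datastore: key -> entity dict.
store = {
    "user_b": {"name": "User B"},
    "user_c": {"name": "User C"},
}

# Option 1: embed copies of the friends (roughly what a StructuredProperty stores).
user_a_embedded = {
    "name": "User A",
    "friends": [dict(store["user_b"]), dict(store["user_c"])],
}

# Option 2: store only the keys (roughly what a repeated KeyProperty stores).
user_a_keyed = {"name": "User A", "friends": ["user_b", "user_c"]}

def get_multi(keys):
    """Hypothetical batch lookup, playing the role of ndb.get_multi."""
    return [store[k] for k in keys]

# User B changes their name after the friendship was recorded.
store["user_b"]["name"] = "Bobby"

# The embedded copy is now stale; the keyed version sees the new name.
embedded_names = [f["name"] for f in user_a_embedded["friends"]]
keyed_names = [f["name"] for f in get_multi(user_a_keyed["friends"])]
```

With embedded copies, every User holding a copy of User B would need to be rewritten after the rename; with keys, nothing else changes.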

GAE Datastore ID

I've created two different Entities, one a User and one a Message they can create. I assign each user an ID and then want to assign this ID to each message which that user creates. How can I go about this? Do I have to do it in a query?
Thanks
Assuming that you are using Python NDB, you can have something like the following:
class User(ndb.Model):
    # put your fields here
    pass

class Message(ndb.Model):
    owner = ndb.KeyProperty()
    # other fields
Create and save a User:
user = User(field1=value1, ....)
user.put()
Create and save a Message:
message = Message(owner=user.key, ...)
message.put()
Query a message based on user:
messages = Message.query().filter(Message.owner==user.key).fetch() # returns a list of messages that have this owner
For more information about NDB, take a look at Python NDB API.
Also, you should take a look at Python Datastore in order to get a better understanding of data modeling in App Engine.

How to create a query for matching keys?

I use the key of another User, the sponsor, to indicate who sponsors a given User. This creates a link in the datastore for those Users that have a sponsor. A User can have at most one sponsor, but a sponsor can sponsor many Users, as with ID 2002, who sponsored three other users.
In this case this query does what I want: SELECT * FROM User where sponsor =KEY('agtzfmJuYW5vLXd3d3ILCxIEVXNlchjSDww') but I don't know how to express that in Python; I can only run it directly against the datastore. How can I query by key to find the set of users that all have the same user's key in the same field? A user in my model can have at most one sponsor, and I just want to know whom a particular person sponsored, which could be a list of users, who may in turn have sponsored other users that I also want to query for.
The field sponsor is a key and it links to the sponsor in the datastore. I set the key with user2.sponsor = user1.key, and now I want to find everyone that user1 sponsored with a query that should look something like
User.All().filter('sponsor = ', user1.key)
but sponsor is a field of type key, so I don't know how to match it to get, for example, a list of people the active user is a sponsor for, or how it becomes a tree when the second generation also has links. How do I select the list of users this user is a sponsor for, and then the second generation? I modelled the relation simply as user2.sponsor = user1.key. Thanks for any hint.
The following workaround is bad practice but is my last and only resort:
def get(self):
    auser = self.auth.get_user_by_session()
    realuser = auth_models.User.get_by_id(long(auser['user_id']))
    q = auth_models.User.query()
    people = []
    for p in q:
        try:
            if p.sponsor == realuser.key:
                people.append(p)
        except Exception:
            pass
    if auser:
        self.render_jinja('my_organization.html', people=people, user=realuser,)
Update
The issues are that the KeyProperty is not required, and that Guido van Rossum has reported this as a bug in ndb, whereas I think it's a bug in my code. Here's what I'm using now, which is a very acceptable solution, since every real user in the organization, except possibly programmers, testers and admins, is going to be required to have a sponsor ID, which is a user ID.
from ndb import query

class Myorg(NewBaseHandler):
    @user_required
    def get(self):
        user = auth_models.User.get_by_id(long(self.auth.get_user_by_session()['user_id']))
        people = auth_models.User.query(auth_models.User.sponsor == user.key).fetch()
        self.render_jinja('my_organization.html', people=people,
                          user=user)

class User(model.Expando):
    """Stores user authentication credentials or authorization ids."""
    #: The model used to ensure uniqueness.
    unique_model = Unique
    #: The model used to store tokens.
    token_model = UserToken
    sponsor = model.KeyProperty()
    created = model.DateTimeProperty(auto_now_add=True)
    updated = model.DateTimeProperty(auto_now=True)
    # ID for third party authentication, e.g. 'google:username'. UNIQUE.
    auth_ids = model.StringProperty(repeated=True)
    # Hashed password. Not required because third party authentication
    # doesn't use password.
    password = model.StringProperty()
    ...
The User model is an NDB Expando which is a little bit tricky to query.
From the docs
Another useful trick is querying an Expando kind for a dynamic
property. You won't be able to use class.query(class.propname ==
value) as the class doesn't have a property object. Instead, you can
use the ndb.query.FilterNode class to construct a filter expression,
as follows:
from ndb import model, query

class X(model.Expando):
    @classmethod
    def query_for(cls, name, value):
        return cls.query(query.FilterNode(name, '=', value))

print X.query_for('blah', 42).fetch()
So try:
from ndb import query

def get(self):
    auser = self.auth.get_user_by_session()
    realuser = auth_models.User.get_by_id(long(auser['user_id']))
    people = auth_models.User.query(query.FilterNode('sponsor', '=', realuser.key)).fetch()
    if auser:
        self.render_jinja('my_organization.html', people=people, user=realuser,)
Option #2
This option is a little bit cleaner. You subclass the model and pass its location to webapp2. This allows you to add custom attributes and custom queries to the class.
# custom_models.py
from webapp2_extras.appengine.auth.models import User
from google.appengine.ext.ndb import model

class CustomUser(User):
    sponsor = model.KeyProperty()

    @classmethod
    def get_by_sponsor_key(cls, sponsor):
        # How you handle this is up to you. You can return a query
        # object as shown, or you could return the results.
        return cls.query(cls.sponsor == sponsor)

# handlers.py
import custom_models

def get(self):
    auser = self.auth.get_user_by_session()
    realuser = custom_models.CustomUser.get_by_id(long(auser['user_id']))
    people = custom_models.CustomUser.get_by_sponsor_key(realuser.key).fetch()
    if auser:
        self.render_jinja('my_organization.html', people=people, user=realuser,)

# main.py
import webapp2

config = {
    # ...
    'webapp2_extras.auth': {
        # Tell webapp2 where it can find your CustomUser
        'user_model': 'custom_models.CustomUser',
    },
}
application = webapp2.WSGIApplication(routes, config=config)
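The question also asks about the second generation of sponsored users. One way to think about it is a level-by-level walk that repeats the "who did this user sponsor?" lookup for each generation. The sketch below is plain Python and makes several assumptions: `sponsored_by` is a hypothetical callable standing in for a datastore query such as the ones above, and apart from ID 2002 having three sponsored users, the sample IDs are made up:

```python
def sponsored_generations(root, sponsored_by, max_depth=2):
    """Return a list of generations: [direct sponsees, their sponsees, ...].

    `sponsored_by(user_id)` is a hypothetical callable standing in for a
    datastore query that returns the users sponsored by `user_id`.
    """
    generations = []
    current = [root]
    seen = {root}
    for _ in range(max_depth):
        next_level = []
        for user_id in current:
            for child in sponsored_by(user_id):
                if child not in seen:  # guard against accidental cycles
                    seen.add(child)
                    next_level.append(child)
        if not next_level:
            break
        generations.append(next_level)
        current = next_level
    return generations

# Sample data: user id -> sponsor id (None means no sponsor).
sponsor_of = {2002: None, 3001: 2002, 3002: 2002, 3003: 2002, 4001: 3001, 4002: 3003}

def sponsored_by(user_id):
    # In-memory replacement for the per-sponsor query.
    return sorted(u for u, s in sponsor_of.items() if s == user_id)

generations = sponsored_generations(2002, sponsored_by)
```

Each generation costs one lookup per user in the previous generation, so `max_depth` is worth keeping small.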

How To Use Entity Groups And Ancestors with DjangoForms

I would like to rewrite the example from the GAE djangoforms article so that it shows the most up-to-date data after a form is submitted (e.g. when updating or adding an entry) on Google App Engine using the High Replication Datastore.
The main recurring query in this article is:
query = db.GqlQuery("SELECT * FROM Item ORDER BY name")
which we will translate to:
query = Item.all().order('name')  # datastore request
I would like this query to return the latest data from the High Replication Datastore after the form is submitted. I only need this on those occasions: I assume I can redirect to a specific URL after submission that runs the query for the latest data, and skip this in all other cases.
Validating the form and storing the results happens like this:
data = ItemForm(data=self.request.POST)
if data.is_valid():
    # Save the data, and redirect to the view page
    entity = data.save(commit=False)
    entity.added_by = users.get_current_user()
    entity.put()  # datastore request
and getting the latest entry from the datastore for populating a form (for editing) happens like:
id = int(self.request.get('id'))
item = Item.get(db.Key.from_path('Item', id))  # datastore request
data = ItemForm(data=self.request.POST, instance=item)
So how do I add entity groups/ancestor keys to these datastore queries to reflect the latest data after form submission? Please note, I don't want all queries to have the latest data, only those run when populating a form (for editing) and after submitting a form.
Who can help me with practical code examples?
If it is in the same block, you have a reference to the current instance.
Then, once you put() it, you can get its id:
if data.is_valid():
    entity = data.save(commit=False)
    entity.added_by = users.get_current_user()
    entity.put()
    id = entity.key().id()  # this gives you the id of the inserted entity

How to delete data from Google App Engine?

I created one table in Google App Engine. I stored and retrieved data from Google App Engine.
However, I don't know how to delete data from Google App Engine Datastore.
An application can delete an entity from the datastore using a model instance or a Key. The model instance's delete() method deletes the corresponding entity from the datastore. The delete() function takes a Key or list of Keys and deletes the entity (or entities) from the datastore:
q = db.GqlQuery("SELECT * FROM Message WHERE msg_date < :1", earliest_date)
results = q.fetch(10)
for result in results:
    result.delete()
# or...
q = db.GqlQuery("SELECT __key__ FROM Message WHERE msg_date < :1", earliest_date)
results = q.fetch(10)
db.delete(results)
Source and further reading:
Google App Engine: Creating, Getting and Deleting Data
If you want to delete all the data in your datastore, you may want to check the following Stack Overflow post:
How to delete all datastore in Google App Engine?
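Both snippets in this answer remove at most 10 entities per run because of fetch(10). To remove everything a query matches, the fetch-and-delete step can be repeated until nothing is returned. The sketch below shows only that loop in plain Python; `fetch_keys` and `delete` are hypothetical callables standing in for a keys-only q.fetch(limit) and db.delete(keys), and the set of integers is made-up sample data:

```python
def delete_in_batches(fetch_keys, delete, batch_size=10):
    """Repeatedly fetch up to `batch_size` keys and delete them until none remain.

    `fetch_keys(limit)` and `delete(keys)` are hypothetical callables standing
    in for q.fetch(limit) on a keys-only query and db.delete(keys).
    Returns the total number of keys deleted.
    """
    total = 0
    while True:
        keys = fetch_keys(batch_size)
        if not keys:
            return total
        delete(keys)
        total += len(keys)

# In-memory stand-in for the old messages matched by the query.
old_messages = set(range(25))

def fetch_keys(limit):
    return sorted(old_messages)[:limit]

def delete(keys):
    old_messages.difference_update(keys)

deleted = delete_in_batches(fetch_keys, delete)
```

The loop relies on each delete being visible to the next fetch, which holds for this in-memory stand-in but is an assumption for a real datastore query.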
You need to find the entity and then delete it.
So in Python it would be
q = db.GqlQuery("SELECT __key__ FROM Message WHERE create_date < :1", earliest_date)
results = q.get()
db.delete(results)
or in Java it would be
pm.deletePersistent(results);
The relevant App Engine documentation URLs are:
http://code.google.com/appengine/docs/java/datastore/creatinggettinganddeletingdata.html#Deleting_an_Object
http://code.google.com/appengine/docs/python/datastore/creatinggettinganddeletingdata.html#Deleting_an_Entity
In Java
I am assuming that you have an endpoint:
Somethingendpoint endpoint = CloudEndpointUtils.updateBuilder(endpointBuilder).build();
And then:
endpoint.remove<ModelName>(long ID);

Resources