Model datastore application - google-app-engine

I am looking for a way to create an efficient model that satisfies the requirements below. I have tried using gcloud-node but have noticed it has limitations with read consistency, references, etc. I would prefer to write this in Node.js, but would be open to writing it in Java or Python as long as that would improve my model. I am building around the new pricing model, which takes effect July 1st.
My application consists of a closed email system. In essence, users register on the site. These users can make friends. Then they can send emails to each other.
Components of the app:
Users - Unlimited amount of users can join.
Friends - A User can have 200 confirmed friends and 100 pending friend requests. When a friendlist is retrieved it should show the name of the friend. (I will also need to receive the id of the friends so I can use it on my client side to create emails).
Emails - Users can send emails to their friends and they can receive emails from their friends. The user can then view all their sent emails independently(sentbox) and all their received emails independently(inbox).
They can also view the emails sent between themselves and a friend, ordered by newest. The emails should show the senders' and receivers' names. Once an email is read it needs to be marked as read.
My model looks something like this, but as you can see there are inefficiencies.
Datastore Kinds:
USER
-email (id) //The email doesn't need to be the id, but I need to be able to retrieve users by their email
-hash_password
-name
-account_status
-created_date
FRIEND
-id (auto-generated)
-friend1
-friend2
-status
EMAIL
-id (auto-generated)
-from
-to
-mutual_id
-message
-created_date
-has_seen
Procedures of the application:
Register - Get operation to see if a user with this email exists. If it does not, insert the key.
Login - Get operation to get user based on email. If exists retrieve the hash_password from the entity and compare to user's input.
Send friend request - Friend data will be written twice for every relationship. Then using the index on friend1 and index on status I will query all the friends for a user and filter only those which are 'pending'. I will then count these friends and see if they are over X. Again I will do this for the other user. If they are both not over the pending limit, I will insert the friend request. This needs to run in a transaction.
Accept a friend request - Friend data will be written twice for every relationship. Then using the index on friend1 and index on status I will query all the friends for a user and filter only those which are pending. I will then count these friends and see if they are over X. Again I will do this for the other user. If they are both not over the pending limit, I will change both entities' status to accepted as a transaction.
Show confirmed friends - Friend data will be written twice for every relationship. Then using the index on friend1 and index on status I will query all the friends for a user and filter only those which are accepted. Not sure how I will show the friend's names (e.g what happens if a user changed their name this needs to be reflected in all friend relationships and emails!).
Show pending friends - Friend data will be written twice for every relationship. Then using the index on friend1 and index on status I will query all the friends for a user and filter only those which are pending. Not sure how I will show the friend's names (e.g what happens if a user changed their name this needs to be reflected in all friend relationships and emails!).
View sent emails - Using the index on the from property I would query to get all the sent emails from a user 5 at a time ordered by created_date (newest first). (e.g what happens if a user changed their name this needs to be reflected in all friend relationships and emails!).
View received emails - Using the index on the to property I would query to get all the emails received by a user, 5 at a time, ordered by created_date (newest first). When an email is seen, that entity's has_seen property will be updated to true. (e.g. what happens if a user changed their name; this needs to be reflected in all friend relationships and emails!).
View emails between 2 users - Using the index on mutual_id which is based on [lower_lexicographic_email]:[higher_lexicographic_email] to query the mutual emails. Ordered by newest, 5 at a time. (e.g what happens if a user changed their name this needs to be reflected in all friend relationships and emails!).
Create email - Using the friend1 and status index I will confirm the users are friends. If they are friends, I will insert an email.
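The mutual_id described under "View emails between 2 users" could be built with a small helper like the one below. This is only a sketch: the function name is hypothetical, and lower-casing the addresses before sorting is an assumption that is not stated in the question.

```python
def mutual_id(email_a: str, email_b: str) -> str:
    """Build the shared conversation key for two users.

    The two addresses are sorted so that both directions of a
    conversation (A -> B and B -> A) map to the same value.
    """
    # Assumption: addresses are compared case-insensitively, so
    # "Bob@x.com" and "bob@x.com" end up in the same conversation.
    low, high = sorted((email_a.strip().lower(), email_b.strip().lower()))
    return f"{low}:{high}"
```

Because the key does not depend on who is the sender, a single equality filter on mutual_id plus a sort on created_date should cover the "emails between 2 users" view.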

Related

Creating calendar appointments (ics) - GUID or database primary key as UID

I am working on an application in which you can sign up for workshops, among other things.
When you do that, you receive an invitation by mail in the form of an ICS file.
When you cancel your registration or the workshop is canceled, I want you to receive another mail with which you can remove the appointment from the calendar.
From what I have read, the cancellation must carry the same UID as the original appointment for this to work.
The workshops are stored in a database.
My question is whether I should create GUIDs for the workshops, which I would then have to store in the database, or can I use the primary key as the UID?
In the second case I am afraid that other events would be removed from the user's calendar if they happen to have the same UID.
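One way to address the clash concern might be to derive the UID from the primary key but qualify it with a prefix and a domain you control, so it is very unlikely to collide with UIDs from other sources. The sketch below assumes this approach. The domain, the PRODID, and the function names are hypothetical, and the calendar body is deliberately stripped down (a real invitation would normally also carry ORGANIZER, ATTENDEE, and an end time).

```python
from datetime import datetime, timezone


def workshop_uid(workshop_pk: int, domain: str = "workshops.example.com") -> str:
    # Hypothetical scheme: primary key + a domain owned by the application.
    return f"workshop-{workshop_pk}@{domain}"


def build_ics(workshop_pk: int, start: datetime, summary: str,
              cancel: bool = False) -> str:
    """Return a minimal calendar text for an invitation or its cancellation."""
    fmt = "%Y%m%dT%H%M%SZ"
    lines = [
        "BEGIN:VCALENDAR",
        "VERSION:2.0",
        "PRODID:-//Example//Workshops//EN",
        f"METHOD:{'CANCEL' if cancel else 'REQUEST'}",
        "BEGIN:VEVENT",
        # The same UID is emitted for the invitation and the cancellation.
        f"UID:{workshop_uid(workshop_pk)}",
        f"DTSTAMP:{datetime.now(timezone.utc).strftime(fmt)}",
        f"DTSTART:{start.astimezone(timezone.utc).strftime(fmt)}",
        # The cancellation uses a higher sequence number than the invitation.
        f"SEQUENCE:{1 if cancel else 0}",
        f"SUMMARY:{summary}",
    ]
    if cancel:
        lines.append("STATUS:CANCELLED")
    lines += ["END:VEVENT", "END:VCALENDAR"]
    # iCalendar content lines are separated by CRLF.
    return "\r\n".join(lines) + "\r\n"
```

With this scheme nothing extra has to be stored in the database, since the UID can be recomputed from the primary key whenever a cancellation is sent.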

Social Network Activity Feed + Groups

I'm working on a social network app and want to create an activity feed so people can keep up to date with all of their connections (the classic Facebook stream). I have a DB table called activity set up for this like so:
activity_id (int)
user_id (int) //who posted it
group_id (int) //the group of connections that have permission to view
type (enum) //the type of activity performed
time (datetime) //the time the activity was performed
I would then do a select * from activity where user_id in (connections) to get the latest news.
Here's the catch. Users' activities are not always visible to the complete set of connections. Users can create groups of user ids to form smaller sets within their superset of connections. It's like how Facebook allows you to specify who sees a particular post instead of allowing all friends to see it.
I have a separate groups table setup with the following schema:
group_id (int)
connection_id (user_id, int)
user_id (group creator)
I have a group_id in my activity table. The group_id is the link to the subset of connections that have permission to see the post.
My question is, what is the best way to do this type of feed, and is there an optimal single select statement that will get me the output desired (a list of my connections activities that I have been granted permission to see)?
Thanks.
If you're open to offloading your activity stream functionality to a service via an API, Collabinate (http://www.collabinate.com) may be useful to you.
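For the single select asked about in the question, a join between activity and the groups table may be enough. The sketch below runs against an in-memory SQLite database purely for illustration; the sample rows are made up, and it assumes the groups table holds one row per (group_id, connection_id) pair so that the join does not produce duplicates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE activity (
    activity_id INTEGER PRIMARY KEY,
    user_id     INTEGER,   -- who posted it
    group_id    INTEGER,   -- the group allowed to view it
    type        TEXT,
    time        TEXT
);
CREATE TABLE "groups" (
    group_id      INTEGER,
    connection_id INTEGER, -- member of the group
    user_id       INTEGER  -- group creator
);
-- Made-up sample data: user 1 owns group 10 (members 2, 3) and group 11 (member 3).
INSERT INTO "groups" VALUES (10, 2, 1), (10, 3, 1), (11, 3, 1);
INSERT INTO activity VALUES
    (100, 1, 10, 'post',  '2013-01-01 10:00:00'),
    (101, 1, 11, 'photo', '2013-01-02 10:00:00');
""")


def feed_for(viewer_id, limit=20):
    """Return the newest activities whose group contains the viewer."""
    return conn.execute(
        """
        SELECT a.activity_id, a.user_id, a.type, a.time
        FROM activity AS a
        JOIN "groups" AS g ON g.group_id = a.group_id
        WHERE g.connection_id = ?
        ORDER BY a.time DESC
        LIMIT ?
        """,
        (viewer_id, limit),
    ).fetchall()
```

If a group should only grant visibility to its creator's own posts, adding `AND g.user_id = a.user_id` to the join condition would enforce that. An index on groups(connection_id, group_id) would probably help once the tables grow.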

CakePHP2 Get Associated Model from DataSource

I have a User model which uses a standard MySQL database and users table, and a Movie model which is a datasource from Rotten Tomatoes. They have a hasAndBelongsToMany relationship and I'm successfully able to write to the join table users_movies, which holds the user_id and movie_id (the movie_id is the Rotten Tomatoes id). Works great.
The trouble is retrieving a User's movies. The standard find:
$movies = $this->User->find('all', array('conditions' => array('id' => $user_id)));
only returns the User, not the associated Movie(s). I put a die statement in the read method of my DataSource and execution never even reaches it. How can I go about retrieving a User's movies?
So Rotten Tomatoes is holding your movies? In that case, of course Rotten Tomatoes wouldn't allow direct SQL access to their database - you'd be accessing it via an API. So Cake definitely won't just be able to join Users to Movies the way it normally would with two tables in the same database.
What you'll have to do is 1) get a list of the user's movie_ids from Cake, and then 2) call the Rotten Tomatoes API to get a list of movies whose ID is in your list of movie_ids. (That's assuming Rotten Tomatoes allows such an API call.)
Having a quick look at the API, it looks like their 'movies search' (http://developer.rottentomatoes.com/docs/json/v10/Movies_Search) only allows you to specify plain text as the search criteria (i.e., you can't search based on movie ids). And their 'movie info' method (http://developer.rottentomatoes.com/docs/json/v10/Movie_Info), which does allow you to retrieve a movie by id, only allows you to retrieve one movie at a time.
You could of course loop through your list of movie id's for a given user, and make a separate API call to rotten tomatoes for each one - though I'd imagine that would get VERY slow.
Someone has put in a feature request for retrieving based on a list of multiple id's (http://developer.rottentomatoes.com/forum/read/123940) but until that request gets implemented, you will probably be having a tough time getting anything decent working.
I got it working by making a Model out of my join table between users and movies and getting a user's movies that way. Then I did as you suggested and looped through the user's movie ids, making a call to the API for each one. Not sure how elegant it is, but it is working.

Search / Filter problems

I have a database that stores a range of email addresses, and I also have a table consisting of all of the approved addresses. The search I am looking to complete is one where, when I run the approved member query, it returns all of the members that have an email address already approved by the database. I have set up a provider column, which I thought would be the best way to link these. For example:
if I have hotmail.com and aol.com authorised,
when I run the query it will return all emails ending with these providers. A basic filter could work, but I have hundreds of approved emails. So I'm asking: is there any way to search through the entire column of the approved list?
How about:
SELECT Email FROM Table
WHERE Email IN (SELECT Email FROM ApprovedTable)
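The query above matches whole addresses. Since the question talks about approving providers such as hotmail.com, a variation that compares only the part after the @ might be closer to what is wanted. The sketch below uses an in-memory SQLite database for illustration; the table names, column names, and sample rows are assumptions, and other databases spell the string functions differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Members (Email TEXT);
CREATE TABLE ApprovedProviders (Provider TEXT);
-- Made-up sample data.
INSERT INTO Members VALUES ('ann@hotmail.com'), ('bob@aol.com'), ('cy@example.org');
INSERT INTO ApprovedProviders VALUES ('hotmail.com'), ('aol.com');
""")

rows = conn.execute("""
    SELECT Email
    FROM Members
    -- Take everything after the '@' and look it up in the approved list.
    WHERE lower(substr(Email, instr(Email, '@') + 1)) IN (
        SELECT lower(Provider) FROM ApprovedProviders
    )
    ORDER BY Email
""").fetchall()

approved = [row[0] for row in rows]
```

Because the approved providers live in their own table, the list can grow to hundreds of entries without the query itself changing.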

appengine data structure - child, parent or both?

I'm trying my hand at Google App Engine and using the datastore with PHP and Quercus.
I'm not familiar with Java or Python, so lots of learning going on here. I've got pages rendering, and I'm able to get data in and out of the datastore.
The app I am building has users, groups, topics and comments.
A group has users, and users can belong to multiple groups.
When a user logs in, I display the groups they are members of, and the topics of those groups.
I've got this currently built in MySQL, and am now figuring out how to get it into App Engine.
The way I see it, a group is a parent which has topics and users as children. Topics have comments as children.
However, I have to get the groups that a user belongs to when the user logs in. Therefore, I was thinking of a separate parent entity which stores the user, contact and login info, and that user would have children containing the group id which each user belongs to, so that I know what groups to fetch.
The users are children of the group so that I can display all the users of a group, but maybe there is a more efficient way to do it.
Like this
Groups(EntityGroup) - GroupName, Owner
  ↳ Topics - TopicName, Content, Owner
    ↳ Comments - Comment, Owner
  ↳ Users - userid
Users(EntityGroup) - userName, email, password
  ↳ userGroup - groupid
Then, when a user logs in, the logic looks like this
SELECT groupid FROM Users WHERE password = hashofpassword + uniqueusername
foreach (groupid as group) {
    SELECT users FROM group;
    SELECT topics FROM group;
    foreach (topicid as topic) {
        SELECT comments;
    }
}
The reason I'm looking at it like this is because when a user logs in, I can't very well go looking through each group for the user, and I only would want to store the login info in one place.
Please don't recommend me to the code.google.com documentation, as I've gone through that many times already, but am not completely understanding what's going on with appengine.
Also, is the way I've outlined above the proper way to visualize the datastore? I think visualizing the data has been a struggle, which might be causing some of the challenges.
It looks to me like there is a many-to-many relationship between Users and Groups, yes? A User can belong to many Groups, and a Group can have many Users who are subscribed to it. The most logical way to represent this in App Engine is to give the User entity a ListProperty that holds the Key of each of the groups to which the user belongs. In Python, it would look like this:
from google.appengine.ext import db

class User(db.Model):
    userName = db.StringProperty()
    email = db.EmailProperty()
    password = db.StringProperty()
    # Keys of every Group this user belongs to
    groups = db.ListProperty(db.Key)
Whenever the User subscribes to the group, you add the Group's key to the groups list.
Likewise, the Group entity will have a ListProperty that contains the Keys of each User who belongs to it.
You wouldn't want to make the Users children of the Group, as that would make it very difficult or impossible for a User to belong to more than one Group.
The difficulty that you will have is that when a User joins a group, you will need to update the Group in a Transaction -- you can only have one User being added to a Group at a time; otherwise, you have the possibility that one write will overwrite another. Presumably, the User can be updated outside of a transaction, as he or she should only be joining one group at a time.
