Cheap way of modeling friend relationships - google-app-engine

I have an app engine project (java) which has a User class. I'd like to model a cheap friend relationship system. Each user can have max 50 friends. I am thinking of doing something zany like:
class User {
    String username;
    Text friends; // "joe,mary,frank,pete"
}
Where "friends" is a comma delimited list of usernames that the user is friends with. Here are the operations I want to support and how I'd do them with the above:
Fetch my full friend list
Just look up the User object, return back the comma delimited list of friends.
Add a friend
Fetch my user object, check if the target name exists in the string, if not, append to the end. Persist modified User object back to data store.
Delete a friend
Fetch my user object, check if the target name exists in the string, if it does, delete it from the string. Persist modified User object back to data store.
Are two users mutual friends
Fetch both user objects, check that the usernames appear on one another's user object.
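To make the four operations concrete, here is a minimal sketch of the string handling only, written in Python for brevity (the same logic carries over to Java). The function names and the default limit parameter are my own assumptions, and nothing here touches the datastore. Note that it splits on commas rather than doing a substring search, since a plain substring check would find "joe" inside "joey".

```python
from typing import List


def parse_friends(friends: str) -> List[str]:
    """Split the stored comma-delimited string into usernames."""
    return [name for name in friends.split(",") if name]


def add_friend(friends: str, name: str, limit: int = 50) -> str:
    """Return the stored string with name appended, unless already present."""
    names = parse_friends(friends)
    if name in names:
        return friends
    if len(names) >= limit:
        raise ValueError("friend limit reached")
    return ",".join(names + [name])


def remove_friend(friends: str, name: str) -> str:
    """Return the stored string without name."""
    return ",".join(n for n in parse_friends(friends) if n != name)


def are_mutual(user_a: str, friends_a: str, user_b: str, friends_b: str) -> bool:
    """True when each username appears in the other's friend string."""
    return user_b in parse_friends(friends_a) and user_a in parse_friends(friends_b)
```

Each add or delete is still one read and one write of the User entity; the helpers only cover what happens to the string in between.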
Getting a full list of friends is pretty important for my application, and storing each relationship as a separate entity seems nightmarish to me (fetching each entity from the datastore when a user needs to see their friends list would probably bankrupt me). I am hoping that a simple read from the Text attribute would be much more lightweight.
Checking for the mutual friend scenario seems like the biggest drawback here, but won't happen that often. I don't know if fetching the two User objects from the datastore and doing the string comparisons would be disastrous performance-wise. Might be ok? I think I may have also read that creating and deleting objects from the data store costs more than just modifying an existing object. So the add/delete friend operations might be better this way too.
Would be happy to hear any thoughts on this or more optimal ways of going about it.
Thank you
-------------------------- Update ---------------------
As per Adrian's comment, I could also do the following:
class User {
    String username;
    List<String> friends;
    // or //
    Set<String> friends;
}
So I think if I use a List, those entities will get indexed by default. I'm not sure if I could execute a GQL query knowing that the lists are indexed to get a match without actually fetching any entities. Something like:
SELECT COUNT FROM User WHERE
(username = "me" && friends = "bob") &&
(username = "bob" && friends = "me")
Storing as a Set would help do the search faster if I loaded both User objects, but I think for both the List and Set, extra time has to be taken to deserialize them when fetched from the datastore, so not sure if their benefits are negated. Maybe it would hurt more than it'd help?

I would actually suggest you store the data in two forms. First, a list of the usernames, and secondly, a matching list of datastore keys for those users' own entities.
This will allow you both to display a user's friends quickly and to look up one particular friend to check for a mutual relationship. In particular, it will almost certainly be more efficient to check the friend's list of friend keys for the original user's key than to match on a string.
The only drawback will be keeping the two lists in sync, but given your list of operations that doesn't sound too hard.
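A rough sketch of what keeping the two lists in sync could look like, in plain Python rather than against the datastore API. The class, its field names, and the use of strings as stand-ins for datastore keys are all assumptions made for illustration.

```python
class User:
    """In-memory stand-in for a User entity with two parallel friend lists."""

    def __init__(self, username, key):
        self.username = username
        self.key = key              # stand-in for the entity's datastore key
        self.friend_names = []      # for displaying the friend list
        self.friend_keys = []       # for mutual-friend checks

    def add_friend(self, other):
        # Append to both lists together so index i refers to the same friend.
        if other.key not in self.friend_keys:
            self.friend_names.append(other.username)
            self.friend_keys.append(other.key)

    def remove_friend(self, other):
        # Remove the same position from both lists.
        if other.key in self.friend_keys:
            i = self.friend_keys.index(other.key)
            del self.friend_names[i]
            del self.friend_keys[i]


def are_mutual(a, b):
    """True when each user's key is in the other's key list."""
    return b.key in a.friend_keys and a.key in b.friend_keys
```

Because both lists are only ever modified in the same method, they cannot drift apart as long as all changes go through add_friend and remove_friend.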

List<String> friends; is a good solution and one that I've seen in professional use. If your friends are identified by user IDs from your app or by Google accounts, you can store a list of keys of that type instead.

Related

How to organize user's data in FireStore

I would like to create a database on Firestore that would look like this :
Collection A (Users)
    Document A (User 1)
        Data 1
        Data 2
        ThisListOfData
            ThisThing1
                ThisSubData1
                ThisSubData2
            ThisThing2
            ...
    Document B (User 2)
Each user has their own data (username, ...). But one piece of that data (ThisListOfData) is a list of things (a list of activities, like yoga, sport, ...), and each activity (ThisThing1, ...) has its own sub-data. I want to structure it this way because the user can subscribe to more activities, delete some, and so on.
It's my first time using Firestore, so I have no idea how to do that. Should "ThisListOfData" be a collection, or a field such as a list of strings? Could someone help me with this, and with how to get all the "Things" inside "ThisListOfData" to check which activities the user has subscribed to?
Thanks a lot
With Firestore or the Realtime Database, how you will "get" the data is the most important thing to consider when you work out "where" to put it and how to "structure" it.
When deciding on this, always think "where will I use this data?" and "how will I query it?".
The structure you have shown us is quite OK. What I would recommend is that, if some parts of the data will also be accessible to other users, you store those in a collection separate from Users and put a lot of effort into making the Users collection as secure as possible. To reduce complexity in the security rules, I always prefer and recommend storing such data (the data that should be available to others) in separate collections.
Also, whether to save something as a field of a document (like a map or array) or as a subcollection depends mostly on how you want to get the data later and whether you need to query on it. If you plan to run queries on it and sort such data, I would recommend saving it as a collection.
Saving data in arrays has some downsides, such as updates: changing an existing element generally means rewriting the whole array.
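To illustrate the two shapes being compared, here is a small sketch using plain Python data instead of the Firestore client. The field names ("activities", "level") and the helper functions are made up for the example; only the alternating collection/document path layout reflects how Firestore addresses subcollections.

```python
# Option A: activities kept as a map field inside the user document.
user_doc = {
    "username": "alice",
    "activities": {
        "yoga": {"level": "beginner"},
        "running": {"level": "advanced"},
    },
}


def subscribed_activities(doc):
    """List the activity names stored in the map field of one user document."""
    return sorted(doc.get("activities", {}))


# Option B: activities kept as a subcollection, one document per activity.
def activity_path(user_id, activity_id):
    """Build the document path of one activity under a user."""
    return "users/{}/activities/{}".format(user_id, activity_id)
```

With option A one read of the user document returns every activity; with option B each activity can be queried, sorted, and updated on its own.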

Adding additional data fields to account information in Substrate

Very new to Substrate and Rust. My understanding of the ChainState is that it acts sort of like a database which holds account numbers (in this case public keys) and their associated balances. When making a transaction, Substrate basically checks to see that you have a sufficient balance, and if so, the transaction succeeds. (This is different from the UTXO method used in Bitcoin.)
First of all, if I am wrong on the above, please correct me.
If I am correct (or at least close) I would like to find a method for associating other data with each account. I've noticed that in the demos, accounts are also associated with names, like Alice, Bob, etc. Is this kept in the ChainState, or is this something which would only be stored on one's own node?
I am trying to determine a way to associate additional data with accounts in the ChainState. For example, how could I store a name (like Alice, Bob, etc.) in the ChainState (assuming that they are only stored locally) or even other information, such as the birthday of the account owner, or their favorite author, or whatever arbitrary information?
The Chain State is just the state of everything, not necessarily connected to account IDs. It does store balances and such, among other things, but it also holds many other things that the chain stores one way or another.
To add custom data, you would create a new structure (map) and then map account IDs to whatever data you want. As an example:
decl_storage! {
    trait Store for Module<T: Trait> as TemplateModule {
        /// The storage item for our proofs.
        /// It maps a proof to the user who made the claim and when they made it.
        Proofs: map hasher(blake2_128_concat) Vec<u8> => (T::AccountId, T::BlockNumber);
    }
}
The above declares a storage map which will associate a hash with a tuple of Account and Block Number. That way, querying the hash will return those two values. You could also do the reverse and associate an AccountID with some other value, like a string (Vec<u8>).
I recommend going through this tutorial from which I took the above snippet: it will show you exactly how to add custom information into a chain.
The answer given by @Swader was very good, as it was general in scope. I will be looking into this answer more, as I try to associate more types of information. (I voted it up, but my vote isn't visible because I am relatively new to StackOverflow, at least on this account.)
After a bit more searching I also found this tutorial: Add a Pallet to Your Runtime.
This pallet happens to specifically add the ability to associate a nickname with the account ID, which was the example I gave in my question. @Swader's answer, however, was more general, and therefore both more useful and also more closely answered my question.
By the way, the nicknames are saved as hex encoded, and are returned as hex encoded as well. An easy way to check that the hex encoding is actually equivalent to the nickname which was set is to visit https://convertstring.com/EncodeDecode/HexDecode and paste in the hex string, without the initial 0x.
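If you would rather check the value locally than paste it into a website, the same decoding takes a few lines. This is a small sketch that assumes the nickname bytes are UTF-8 text; the function name is my own.

```python
def decode_hex_nickname(hex_string):
    """Decode a hex-encoded nickname, with or without the leading 0x."""
    if hex_string.startswith(("0x", "0X")):
        hex_string = hex_string[2:]
    # Assumption: the stored bytes are UTF-8 encoded text.
    return bytes.fromhex(hex_string).decode("utf-8")


print(decode_hex_nickname("0x416c696365"))  # prints Alice
```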

GAE Transaction in root entity

I'm new to GAE and I have some questions about transaction with the DataStore.
For example, I have a user entity, which is created when the user adds my app on Facebook. I get some properties with the Facebook API, but I want to add a username for the user, and it needs to be unique. So in the transaction scope I call this method:
def ExistsUsernameToDiferentUser(self, user, username):
    query = User.all()
    query.filter("username", username)
    query.filter("idFacebook != ", user.idFacebook)
    userReturned = query.get()
    return True if userReturned else False
But GAE gives me this error:
BadRequestError: queries inside transactions must have ancestors
Ok, I understand, but the user doesn't have any ancestor, it's a root entity. What do I have to do?
I see what you're trying to do now.
By forcing the use of ancestors, the datastore forces you to lock down a portion of the datastore (everything under the given ancestor) so you can guarantee consistency on that portion. However, to do what you want, you essentially need to lock down all User entities to query whether a certain one exists, and then create a new one, and then unlock them.
You CAN do this: just create an entity (it can be an empty entity), make sure it has a unique key (like "user-ancestor"), save it, and make it the ancestor of every User entity.
THIS IS PROBABLY A BAD IDEA, since it limits your performance on User entities, particularly on writes. Every time a new user is created, all User entities are prevented from being updated.
I'm trying to illustrate how you need to think about transactions a bit differently in the HRD world. It's up to you to structure your data (using ancestors) so that you get good performance characteristics for your particular application. In fact, you might disagree with me and say that User entities will be updated so infrequently that it's ok to lock them all.
For illustrative purposes, another short-sighted possibility is to create multiple ancestors based on the username, i.e., one for each letter of the alphabet. Then when you need to create a new User, you can search based on the appropriate ancestor. While this is an improvement over having a single ancestor (at best 26 times better, if usernames are spread evenly across letters), it still limits your future performance up front. This may be OK if you know right now the total number of users you will eventually have, but I suspect you want hundreds of millions of users.
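As a sketch of the per-letter idea only, this is how a username could be mapped to the name of its ancestor entity. The "user-ancestor-" prefix extends the example key name above, and the "other" bucket for names that do not start with a-z is my own assumption.

```python
import string


def ancestor_name_for(username):
    """Pick the ancestor key name for a username by its first letter."""
    first = username[:1].lower()
    # Names that are empty or do not start with a-z share one extra bucket.
    bucket = first if first and first in string.ascii_lowercase else "other"
    return "user-ancestor-" + bucket
```

The uniqueness query and the creation of the new User would then both run under that one ancestor, so only users in the same bucket contend with each other.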
The best way is to go back to the other suggestion and make the username the key. This gives you the best scalability, since getting/setting the User entity by key can be transactional without locking down other entities, which is what would limit your scalability.
You'll need to find a way to work your application around this. For example, whatever information you get before the username can be stored in another entity that holds a reference (for example, a db.ReferenceProperty) to the User, which is created later. Or you can copy that data into the User entity after the User entity is created by key, then remove the original entity.
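If usernames are unique, why don't you make the username the key?
from google.appengine.ext import db

class User(db.Model):
    # The key name doubles as the username, so uniqueness comes for free.
    @property
    def username(self):
        return self.key().name()
    ....
User.get_or_insert(username,field1=value1,....)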
Note: You will not need an explicit transaction if you use get_or_insert, because the get and the possible insert already run inside one.
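One thing to watch for: get_or_insert hands back the existing entity when the key is already taken, so the caller has to compare what comes back with what it tried to store. The sketch below models that behaviour with an in-memory class; UserStore and the id_facebook field are stand-ins for illustration, not the App Engine API.

```python
import threading


class UserStore:
    """In-memory stand-in for a datastore keyed by username."""

    def __init__(self):
        self._lock = threading.Lock()
        self._users = {}

    def get_or_insert(self, username, **fields):
        # The lock plays the role of the transaction: check and create are atomic.
        with self._lock:
            if username not in self._users:
                self._users[username] = dict(fields)
            return self._users[username]


store = UserStore()
first = store.get_or_insert("alice", id_facebook="1001")
second = store.get_or_insert("alice", id_facebook="2002")
# The second caller gets the first caller's entity back, unchanged.
name_taken = second["id_facebook"] != "2002"
```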

How to implement user feed like in Twitter or Facebook on redis

I'm going to write a simple news site on Redis with support for followers.
I can't work out how to organize a user's timeline the way Twitter does. I read about Retwis ( http://redis.io/topics/twitter-clone ), but its way of building feeds seems stupid. What if I want to remove entries? I would have to remove every reference to the entry from the followers' feeds. What if I no longer follow some users?
There are several ways to attack what you describe using a bit of imagination, here are some examples that address your questions:
What if I want to remove entries?
One could maintain a set such as post:$postid:users for each post, holding all the user IDs that may have the post in their feeds. When the post is to be deleted, extract all members from this set and iterate through the IDs to remove the post from each uid:$userid:posts set. Speaking of which, you would probably want to turn that last one into a set, instead of a list as the original article suggests, so that individual items can be removed cheaply; that is trivial, and the logic is much the same.
What if I no longer follow some users?
When the feed is generated for an individual user, you necessarily have to iterate and read each post:$postid key, which gives you the author's user ID. So before showing the post, read this ID and look it up in the uid:$userid:following set: if it's there, show the post; if it's not, delete it from uid:$userid:posts and don't show it.
In a nutshell, this is what you have to keep in mind in order to build this kind of logic in redis:
You'll need many commands, but that's ok, Redis is supposed to be fast enough to handle it well.
Data will repeat, but that is also ok; it may look insane for someone with a relational DBMS background to store a set of users for each post if each user already has a set with their posts, but this is the only way around building relationships in a non-relational data store like redis.
Generally speaking think of sets and sorted sets when designing something relational in Redis.
With redis you get to do everything yourself, but once you get your head around it it's actually pretty powerful.
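Here is a minimal sketch of the bookkeeping described above, using Python sets and dicts as stand-ins for the Redis keys so the logic can be read without a server. The function names are my own; against a real server each step would map onto set commands such as SADD, SREM, SMEMBERS and SISMEMBER.

```python
from collections import defaultdict

feeds = defaultdict(set)         # uid:$userid:posts     -> post ids in that feed
post_readers = defaultdict(set)  # post:$postid:users    -> user ids holding the post
following = defaultdict(set)     # uid:$userid:following -> followed user ids
post_author = {}                 # post:$postid          -> author user id


def publish(post_id, author_id, follower_ids):
    """Fan a new post out to each follower's feed and record who holds it."""
    post_author[post_id] = author_id
    for uid in follower_ids:
        feeds[uid].add(post_id)
        post_readers[post_id].add(uid)


def delete_post(post_id):
    """Remove a post from every feed that may contain it."""
    for uid in post_readers.pop(post_id, set()):
        feeds[uid].discard(post_id)
    post_author.pop(post_id, None)


def read_feed(user_id):
    """Return visible posts, dropping those from authors no longer followed."""
    visible = []
    for post_id in sorted(feeds[user_id]):
        if post_author.get(post_id) in following[user_id]:
            visible.append(post_id)
        else:
            feeds[user_id].discard(post_id)
            post_readers[post_id].discard(user_id)
    return visible
```

The duplication is visible here: every fan-out writes to two sets, which is the price of being able to delete in both directions.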

Creating a Notifications type feed in GAE Objectify

I'm working on a notification feed for my mobile app and am looking for some help on an issue.
The app is a Twitter/Facebook like app where users can post statuses and other users can like, comment, or subscribe to them.
One thing I want to have in my app is a notifications feed where users can see who liked or commented on their post or subscribed to them.
The first part of this system I have figured out: when a user likes/comments/subscribes, a Notification entity is written to the datastore with details about the event. To show a user's Notifications, all I have to do is query for all Notifications for that user, sort by date created desc, and we have a nice little feed of actions other users took on a specific user's account.
The issue I have is what to do when someone unlikes a post, unsubscribes or deletes a comment. Currently, if I were to query for that specific notification, it is possible that nothing would return from the datastore because of eventual consistency. We could imagine someone liking, then immediately unliking a post (b/c who hasn't done that? =P). The query to find that Notification might return null and nothing would get deleted when calling ofy().delete().entity(notification).now(); And now the user has a notification in their feed saying Sally liked their post when in reality she liked then quickly unliked it!
A wrench in this whole system is that I cannot delete by Key<Notification>, because I don't really have a way to know the id of the Notification when trying to delete it.
A potential solution I am experimenting with is to not delete any Notifications. Instead I would always write Notifications and simply indicate whether each one was positive or negative. Then in my query to display notifications to a specific user, I could somehow display only the sum-positive Notifications. This would save some money on datastore too because deleting entities is expensive.
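To make the "sum-positive" idea concrete, here is a rough sketch in Python of the filtering step, done in code after the notifications are loaded. The tuple layout (actor, action, target, delta) is an assumption for illustration, not my actual entity.

```python
from collections import Counter


def visible_notifications(events):
    """Keep only (actor, action, target) triples whose deltas sum above zero.

    events is an iterable of (actor, action, target, delta) tuples, where
    delta is +1 for like/comment/subscribe and -1 for the matching undo.
    """
    totals = Counter()
    for actor, action, target, delta in events:
        totals[(actor, action, target)] += delta
    return sorted(key for key, total in totals.items() if total > 0)
```

With this, a like followed by an unlike nets to zero and simply never shows up, regardless of the order in which the two writes become visible.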
There are three main ways I've solved this problem before:
1. Deterministic key
For example:
{user-id}-{post-id}-{liked-by} for likes
{user-id}-{post-id}-{comment-by}-{comment-index} for comments
This will work for most basic use cases of the problem you defined, but you'll have some hairy edge cases to figure out (like managing the indexes of comments as they get edited and deleted). It allows get and delete by key.
2. Parallel data structures
The idea here is to create more than one entity at a time in a transaction, but to make sure they have related keys. For example, when someone comments on a feed item, create a Comment entity, then create a CommentedOn entity which has the same ID, but make it have a parent key of the commenter user.
Then, you can make a strongly consistent query for the CommentedOn, and use the same ID to do a get by key on the Comment. You can also just store a key, rather than having matching IDs, if that's too hard. Having matching IDs was easier in practice each time I did this.
The main limitation of this approach is that you're effectively creating an index yourself out of entities, and while this can give you strongly consistent queries where you need them, the throughput limitations of transactional writes can become harder to understand. You also need to manage state changes (like deletes) carefully.
3. State flags on entities
Assuming the Notification object just shows the user that something happened but links to another entity for the actual data, you could store a state flag (deleted, hidden, private etc.) on that entity. Then listing your notifications would be a matter of loading the entities server side and filtering in code (or possibly subsequent filtered queries).
At the end of the day, the complexity of the solution should mirror the complexity of the problem. I would start with approach 3 then migrate to approach 2 when the fuller set of requirements is understood. It is a more robust and flexible approach, but complexity of XG transaction limitations will rear its head - but ultimately a distributed feed like this is a hard problem.
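For the deterministic key approach, the point is that the unlike path can rebuild the key from data it already has, with no query involved. A minimal sketch, assuming string IDs that do not themselves contain "-" and using a dict as a stand-in for the datastore:

```python
def like_notification_id(user_id, post_id, liked_by):
    """Build the {user-id}-{post-id}-{liked-by} key name for a like."""
    # Assumption: none of the three parts contains "-", else keys could collide.
    return "{}-{}-{}".format(user_id, post_id, liked_by)


# Stand-in for the datastore: key name -> notification payload.
notifications = {}

# Like: write the notification under its deterministic key.
notifications[like_notification_id("u1", "p9", "sally")] = {"type": "like"}

# Unlike: rebuild the same key and delete by key, without querying.
notifications.pop(like_notification_id("u1", "p9", "sally"), None)
```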
What I ended up doing, and what worked for my specific model, was that before creating a Notification entity I would first allocate an ID for it:
// Allocate an ID for a Notification
final Key<Notification> notificationKey = factory().allocateId(Notification.class);
final Long notificationId = notificationKey.getId();
Then when creating my Like or Follow Entity, I would set the property Like.notificationId = notificationId; or Follow.notificationId = notificationId;
Then I would save both Entities.
Later, when I want to delete the Like or Follow I can do so and at the same time get the Id of the Notification, load the Notification by key (which is strongly consistent to do so), and delete it too.
Just another approach that may help someone =D
