I would like to create a notification document on Firebase (Cloud Firestore) which includes a "sender" display name (e.g. "Anonymous128 sent you a message"). This name is prone to changing.
What is the best practice to dynamically update the name if it does change? Should I just store userId, and pull the name up every time I'm querying notifications from the database? Or would it be better to update all notifications belonging to a user if they change their display name?
Thanks!
If reading notifications is much more frequent than a user updating their name, then I'd recommend storing the sender's name in the notification documents, as that will save you the many read operations you would otherwise spend fetching the user's name every time.
This does mean that you'll have to update many documents when a user changes their name. Usually there's some rate limit on changing a user name, so this operation should not be very frequent. Also, the term "notification" suggests that you'll be deleting the document after the receiver has read the message. If so, the update cost should drop too.
Alternatively, you can store just the userId in the notification documents. When you fetch all the notifications of the current user, collect the unique userIds from them and then query the senders' documents. This ensures you fetch each user's document only once, not once for every notification they have sent. Additionally, you can cache these usernames locally as { uid: "name" } and clear that cache periodically.
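A minimal sketch of that second approach, in plain Python so it is independent of any Firebase SDK. The field names senderId and senderName and the fetch_users callback (a stand-in for whatever reads the user documents) are assumptions for illustration, not part of any Firestore API.

```python
from typing import Callable, Dict, Iterable, List


def resolve_sender_names(
    notifications: Iterable[dict],
    fetch_users: Callable[[List[str]], Dict[str, str]],
    cache: Dict[str, str],
) -> List[dict]:
    """Attach a senderName to each notification, reading each sender at most once."""
    notifications = list(notifications)
    # Unique sender IDs, kept in first-seen order.
    sender_ids = list(dict.fromkeys(n["senderId"] for n in notifications))
    # Only look up the senders that are not in the local cache yet.
    missing = [uid for uid in sender_ids if uid not in cache]
    if missing:
        # fetch_users is hypothetical: it returns {uid: display_name}.
        cache.update(fetch_users(missing))
    return [
        {**n, "senderName": cache.get(n["senderId"], "Unknown")}
        for n in notifications
    ]
```

With this shape, ten notifications from the same sender cost one user read, and a second call with a warm cache costs none. Clearing the cache from time to time is what lets a changed name show up.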
I wanted to build a chat app using DynamoDB, but having a hard time designing an architecture.
So, I do not need a complicated chat app like Telegram; it's rather simple. These are the queries that I need:
List chats for user (each chat also has lastMessageTimestamp, unreadCount and lastMessage)
List chat messages for chat
List users of chat (this is optional)
So far, I have come up with this design
And queries to get data
The problem is that, to have data about lastMessage and unreadCount, I need to update 2 rows when creating the message. A transaction should be used for that, but I do not think that DynamoDB is good for transaction-heavy apps. Is there a better way to do this (maybe using a different technology)?
P.S. I know I should use an RDB until I hit a bottleneck, but I have already done this with an RDB and now want to try it with NoSQL. I also had a look at MongoDB, but it does not support transactions if I have different schemas for chats and messages and want to update them in sync. I could also use streams in DynamoDB to update the values, but that's not going to be real time.
Update
I could also embed messages in the chat document in MongoDB, but is this scalable? I can push messages as a stack, so it would be easy to query the latest messages, but what about pagination or infinite scroll? Is there a way to make these queries fast? Also, what if the embedded messages exceed the document size limit? How would I scale then?
DynamoDB is definitely a suitable database for your needs here, but I think the design you've proposed isn't the right approach.
Your requirements are (I've split some up from your original post):
List chats for user
List chat messages for chat
List users of chat (this is optional)
Get last message for a chat
Get unread count for a chat+user
If you have a DDB table with:
PK: chat_id
SK: timestamp:message_id
And a GSI with:
PK: user_id
SK: timestamp:message_id
You can do a query for chat_id to complete requirement #2, getting all messages in a chat, sorted by time posted.
You can have a second table that handles permissions like:
PK: user_id
SK: chat_id
With a GSI:
PK: chat_id
SK: user_id
You can do a query on the permissions table to get all chat_ids for a user_id, and a query on the permissions GSI to get all user_ids in a chat, satisfying requirements 1 & 3.
Requirement 4 is pretty easy: do a query on the messages table for the chat_id, in descending sort-key order with a max count of 1, which will get you the last message and the time it was posted.
Requirement 5 is a little more tricky, but if you can keep track of the last time a specific user has viewed a chat, you can do a query with a range expression on the sort key such as timePosted >= timeLastSeen, and the number of messages you get is the unread count. It makes sense to me to store the time last viewed on client side, but if you want this to be stored server side you could make a third table.
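As a rough sketch, the two queries for requirements 4 and 5 could be expressed as parameters for the DynamoDB Query operation (for example, passed to a low-level client's query call). The table name "messages" and the sort-key attribute name "sk" are assumptions; the sketch also assumes the timestamp part of the sort key is stored in a format that sorts chronologically as a string, such as ISO 8601 in UTC.

```python
from typing import Any, Dict


def last_message_query(chat_id: str) -> Dict[str, Any]:
    """Query parameters that return only the newest message of a chat."""
    return {
        "TableName": "messages",  # assumed table name
        "KeyConditionExpression": "chat_id = :chat",
        "ExpressionAttributeValues": {":chat": {"S": chat_id}},
        "ScanIndexForward": False,  # descending sort-key order, newest first
        "Limit": 1,
    }


def unread_count_query(chat_id: str, last_seen: str) -> Dict[str, Any]:
    """Query parameters that count messages posted at or after last_seen."""
    return {
        "TableName": "messages",  # assumed table name
        # "sk" is the assumed name of the timestamp:message_id sort key.
        "KeyConditionExpression": "chat_id = :chat AND sk >= :seen",
        "ExpressionAttributeValues": {
            ":chat": {"S": chat_id},
            ":seen": {"S": last_seen},
        },
        "Select": "COUNT",  # return only the number of matching items
    }
```

With Select set to COUNT the response carries the number in its Count field instead of the items. A very large result is paginated, so in that case the counts of the individual pages would have to be summed.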
All the operations above are highly scalable and you won't run into any concurrency issues, even with 100 users in the same chat.
I would like to write as little data to Firestore as possible and make the update as efficient as possible. The user can edit their profile, which consists of three sections. The way I have it at the moment, the Firestore update lives in a method that updates all the sections even if only one section was edited.
I would like it so that if the user edits only one section, then only that section is updated in Firestore.
My code
await firestoreInstance
    .collection("users")
    .document(firebaseUser.uid)
    .updateData({
  "images": _imagesUrl,
  "name": firstNameController.text,
  "biography": biographyController.text,
});
The problem here is that we have no way to know the current value of any of these fields.
If you know the current value, you can compare the current and new values and only send it to the database if they are not the same.
If you don't know the current value, then loading the current value is probably more costly than simply sending all three fields to the database.
The reason for that is that Firebase charges for (in this scenario):
document writes - each document you modify incurs a cost, but that cost does not depend on the amount of data you update in that document
document reads
bandwidth of data read by the client
So Firestore doesn't charge for the bandwidth of data you send to the database. While sending only the modified fields may save bandwidth, it won't reduce your Firebase cost on that front, whereas having to read the document to determine which fields were modified will definitely cost more.
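If the app does already hold the current values (for example, the ones it loaded to fill in the profile form), the comparison can be done locally without any extra read. A small sketch in plain Python; the function name and the idea of keeping a local snapshot are assumptions for illustration:

```python
from typing import Any, Dict


def changed_fields(current: Dict[str, Any], edited: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries of `edited` whose value differs from `current`."""
    return {
        key: value
        for key, value in edited.items()
        if key not in current or current[key] != value
    }


# Snapshot loaded earlier vs. what the form holds now.
current = {"images": ["a.png"], "name": "Ann", "biography": "Hi"}
edited = {"images": ["a.png"], "name": "Anna", "biography": "Hi"}
update = changed_fields(current, edited)  # {"name": "Anna"}
```

Since the write is billed per document rather than per field, the real saving comes from the case where the result is empty and the update call can be skipped altogether.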
I am working on a social media app and I want users to have the option to be notified when a page creates a new post (as on Facebook). First, I created a Notification table that contains:
Id(PK)
UserId
PageId
PostId
ReadBit
Date
But this doesn't make sense. If a page has 1000 or even 500 interested fans, it doesn't seem logical to create 1000 or 500 records, one for every interested fan. Is there another way to do that?
There are two types of information you are talking about: Pages and posts.
When a user opts in for a page, this should be persistent, so you'll need a database entry representing this. The standard way to go is to have one record per user and per page she subscribed to:
Subscription(id, user_id, page_id)
Depending on the exact requirements, there might exist simpler solutions. E.g., if the pages are about topics, and the user doesn't subscribe to a page but to a topic (like there are 50 pages about cars, and 70 about computers), it would be sufficient to store a subscription per user and topic. But your text doesn't indicate this.
The second question is how to track the notification process when a post has been made to a page. Strictly speaking, you don't need a database record for each such notification. When a page has changed, look up all subscribers using the Subscription table, and in a loop on them generate the notifications.
Only if you need some additional persistent information about each such notification will you need a record per notification, i.e.
Notification(id, subscription_id, ...)
This might be the case if you need to store the timestamp when the notification was sent, or if you have some status information for a notification process, e.g. whether the user has reacted to the notification.
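A minimal sketch of that flow, using Python's built-in sqlite3 so it runs on its own. The Subscription columns follow the layout above; the post_id and is_read columns on Notification and the send callback are hypothetical and only there for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE subscription (
        id INTEGER PRIMARY KEY,
        user_id INTEGER NOT NULL,
        page_id INTEGER NOT NULL,
        UNIQUE (user_id, page_id)
    );
    -- Only needed if per-notification state has to be kept.
    -- post_id and is_read are assumed columns, not from the original design.
    CREATE TABLE notification (
        id INTEGER PRIMARY KEY,
        subscription_id INTEGER NOT NULL REFERENCES subscription(id),
        post_id INTEGER NOT NULL,
        is_read INTEGER NOT NULL DEFAULT 0
    );
    """
)


def notify_subscribers(conn, page_id, post_id, send):
    """Look up a page's subscribers and notify each one about a new post."""
    rows = conn.execute(
        "SELECT id, user_id FROM subscription WHERE page_id = ?", (page_id,)
    ).fetchall()
    for subscription_id, user_id in rows:
        send(user_id, post_id)  # hypothetical delivery hook (push, e-mail, ...)
        # Optional: persist a row so read/sent status can be tracked later.
        conn.execute(
            "INSERT INTO notification (subscription_id, post_id) VALUES (?, ?)",
            (subscription_id, post_id),
        )
    conn.commit()
    return len(rows)
```

If no status needs to be tracked, the INSERT and the notification table can be dropped entirely and only the Subscription lookup remains.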
I'm developing a webchat and I have to decide what strategy to use. I can use either a text file or a database to store the data that come from and go to the users.
Yes, I've seen the other posts where people are talking about it, but I have a few more questions than just "which one is faster" (that's also one of my questions, but not the only one). And I will point out the problem more carefully.
First of all many users will share the same room and chat with all the others at the same time. Many rooms will be open at the same time (the same user can be at two rooms at the same time).
In this system, messages are not the only things that come and go. The status (away, online, typing etc), updates ("don't talk to me, i had the worst day of my life" or "i'm new here"), who logged out and in, replies to other messages, and the messages themselves are the kind of traffic that will be there. Somewhere.
I want the Javascript to receive the data in JSON format.
And I want this chat to be used by many, many people at the same time! Each one of them will send a request every 5 or 10 seconds.
Some data will be stored in the Database for sure (the user sessions, its id, the user's id, information about the room, when it was created etc), but how it will deal with dynamic data flow is still to be decided.
In other words, even if it's not the most complex chat ever developed, it's not the simplest one.
Example of files distribution if strategy chosen is text file:
Path: /root/room/1 (to room 1)
Files: 1000001, 100002, 100003 (users in this room)
Path: /root/room/2
Files: 1000002, 100005, 100010
...
Messages and any other kind of update (access, status, personal update etc) sent in room 1 FROM user 1000001 will be saved in paths /root/room/1/100002 (user 2) and /root/room/1/100003 (user 3) and so on.
Each user will read his own file for that room and erase it after reading, so only new data will be in those files every time they read them.
The data is written in JSON format so no further encoding is needed.
The questions:
1- Considering the huge volume of data being written to and read from these text files, what is the best option: saving the data in the database and then making queries (SELECT FROM messages (and from update, and from access, and from anything else) WHERE id > #number) every 5 or 10 seconds, for each user in each room, or using the text files?
2- If the database is the best option, should I write it in JSON format? I would use a field called JSON, because I wouldn't want to lose the ability to make queries in that database. Or maybe I would create a new table, connecting the user, the room and the JSON and write AGAIN in other tables to keep the tuple format.
3- If the file is the best option, would you consider replicating the data in the database (just in case you want to query it or whatever) every time it's saved in the files?
I'm working on a notification feed for my mobile app and am looking for some help on an issue.
The app is a Twitter/Facebook like app where users can post statuses and other users can like, comment, or subscribe to them.
One thing I want to have in my app is to have a notifications feed where users can see who liked/comment on their post or subscribed to them.
The first part of this system I have figured out: when a user likes/comments/subscribes, a Notification entity will be written to the datastore with details about the event. To show a user's Notifications, all I have to do is query for all Notifications for that user, sort by date created desc, and we have a nice little feed of the actions other users took on a specific user's account.
The issue I have is what to do when someone unlikes a post, unsubscribes or deletes a comment. Currently, if I were to query for that specific notification, it is possible that nothing would return from the datastore because of eventual consistency. We could imagine someone liking, then immediately unliking a post (b/c who hasn't done that? =P). The query to find that Notification might return null and nothing would get deleted when calling ofy().delete().entity(notification).now(); And now the user has a notification in their feed saying Sally liked his post when in reality she liked then quickly unliked it!
A wrench in this whole system is that I cannot delete by Key<Notification>, because I don't really have a way to know the id of the Notification when trying to delete it.
A potential solution I am experimenting with is to not delete any Notifications. Instead I would always write Notifications and simply indicate whether the notification was positive or negative. Then in my query to display notifications to a specific user, I could somehow display only the sum-positive Notifications. This would save some money on the datastore too, because deleting entities is expensive.
There are three main ways I've solved this problem before:
1. Deterministic key
for example
{user-Id}-{post-id}-{liked-by} for likes
{user-id}-{post-id}-{comment-by}-{comment-index} for comments
This will work for most basic use cases for the problem you defined, but you'll have some hairy edge cases to figure out (like managing indexes of comments as they get edited and deleted). It allows get and delete by key, so no query is needed.
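To illustrate the deterministic-key idea, here is a small sketch with a plain dict standing in for the datastore. The helper name and the dict are assumptions; the key layout follows the pattern for likes given above.

```python
def like_notification_key(user_id: str, post_id: str, liked_by: str) -> str:
    """Build the notification key from values known at like and unlike time."""
    # Note: this assumes the IDs themselves do not contain "-".
    return f"{user_id}-{post_id}-{liked_by}"


store = {}  # stand-in for the datastore: key -> notification

# Like: write the notification under the computed key.
key = like_notification_key("owner1", "post7", "sally")
store[key] = {"text": "sally liked your post"}

# Unlike: recompute the same key and delete by key, with no query involved.
store.pop(like_notification_key("owner1", "post7", "sally"), None)
```

Because the unlike path never has to search for the notification, it does not depend on an eventually consistent query seeing the earlier write.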
2. Parallel data structures
The idea here is to create more than one entity at a time in a transaction, but to make sure they have related keys. For example, when someone comments on a feed item, create a Comment entity, then create a CommentedOn entity which has the same ID, but make it have a parent key of the commenter user.
Then, you can make a strongly consistent query for the CommentedOn, and use the same id to do a get by key on the Comment. You can also just store a key, rather than having matching IDs if that's too hard. Having matching IDs in practice was easier each time I did this.
The main limitation of this approach is that you're effectively creating an index yourself out of entities, and while this can give you strongly consistent queries where you need them, the throughput limitations of transactional writes can become harder to understand. You also need to manage state changes (like deletes) carefully.
3. State flags on entities
Assuming the Notification object just shows the user that something happened but links to another entity for the actual data, you could store a state flag (deleted, hidden, private etc) on that entity. Then listing your notifications would be a matter of loading the entities server side and filtering in code (or possibly subsequent filtered queries).
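A rough sketch of the filtering step for the state-flag approach. The load_entity callback, the target_key and state field names, and the set of hidden states are all assumptions for illustration:

```python
HIDDEN_STATES = {"deleted", "hidden", "private"}  # assumed flag values


def visible_notifications(notifications, load_entity):
    """Keep only notifications whose linked entity still exists and is visible."""
    visible = []
    for notification in notifications:
        # load_entity is hypothetical: a get-by-key that returns None if missing.
        target = load_entity(notification["target_key"])
        if target is not None and target.get("state") not in HIDDEN_STATES:
            visible.append(notification)
    return visible
```

An unlike then only has to flip the state on the Like entity, and the stale notification drops out of the feed at read time.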
At the end of the day, the complexity of the solution should mirror the complexity of the problem. I would start with approach 3, then migrate to approach 2 when the fuller set of requirements is understood. Approach 2 is more robust and flexible, but the complexity of XG transaction limitations will rear its head. Ultimately, a distributed feed like this is a hard problem.
What I ended up doing, and what worked for my specific model, was that before creating a Notification entity I would first allocate an ID for it:
// Allocate an ID for a Notification
final Key<Notification> notificationKey = factory().allocateId(Notification.class);
final Long notificationId = notificationKey.getId();
Then when creating my Like or Follow Entity, I would set the property Like.notificationId = notificationId; or Follow.notificationId = notificationId;
Then I would save both Entities.
Later, when I want to delete the Like or Follow, I can do so and at the same time get the ID of the Notification, load the Notification by key (a get by key is strongly consistent), and delete it too.
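For anyone who wants to see the whole flow in one place, here is a language-neutral sketch in Python with dicts standing in for the datastore. The counter stands in for allocating an ID up front, and all names besides notificationId are made up for illustration:

```python
import itertools

_next_id = itertools.count(1)  # stand-in for allocating an ID up front
notifications = {}  # notification_id -> notification
likes = {}  # (post_id, liked_by) -> like


def like(post_id, liked_by, owner_id):
    """Create a Like and its Notification, linking them by the allocated ID."""
    notification_id = next(_next_id)
    notifications[notification_id] = {
        "to": owner_id,
        "text": f"{liked_by} liked your post",
    }
    likes[(post_id, liked_by)] = {"notificationId": notification_id}
    return notification_id


def unlike(post_id, liked_by):
    """Delete the Like and, via its stored ID, the Notification, both by key."""
    removed = likes.pop((post_id, liked_by), None)
    if removed is not None:
        notifications.pop(removed["notificationId"], None)
```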
Just another approach that may help someone =D