I would like to write as little data to Firestore as possible and keep updates efficient. The user can edit their profile, which consists of three sections. At the moment the Firestore update lives in a single method that writes all three sections even if only one of them was changed.
I would like it so that if the user edits only one section, only that section is updated in Firestore.
My code
await firestoreInstance
    .collection("users")
    .document(firebaseUser.uid)
    .updateData({
  "images": _imagesUrl,
  "name": firstNameController.text,
  "biography": biographyController.text,
});
The problem here is that we have no way to know the current value of any of these fields.
If you know the current values, you can compare each one with its new value and send a field to the database only if the two differ.
If you don't know the current value, then loading the current value is probably more costly than simply sending all three fields to the database.
The reason for that is that Firebase charges for (in this scenario):
document writes - each document you modify incurs a cost, but that cost does not depend on the amount of data you update in that document
document reads
bandwidth of data read by the client
Firestore doesn't charge for the bandwidth of data you send to the database. So while sending only the modified fields may save bandwidth, it won't reduce your Firebase cost on that front, whereas reading the document to determine which fields changed will definitely cost more.
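If the app already holds the values it loaded for the profile screen, the comparison described above could look roughly like this. This is a minimal, language-agnostic sketch in Python rather than Dart; the helper name changed_fields and the sample values are assumptions for illustration, not part of any Firebase API.

```python
from typing import Any, Dict


def changed_fields(current: Dict[str, Any], new: Dict[str, Any]) -> Dict[str, Any]:
    """Return only the entries of `new` whose value differs from `current`."""
    return {
        key: value
        for key, value in new.items()
        if key not in current or current[key] != value
    }


# Values the app already has in memory from loading the profile (hypothetical).
current = {"images": ["a.png"], "name": "Ann", "biography": "Hello"}
# Values read from the form when the user saves.
edited = {"images": ["a.png"], "name": "Ann", "biography": "Hello, world"}

payload = changed_fields(current, edited)
print(payload)  # {'biography': 'Hello, world'}
```

The resulting dictionary is what you would pass to the update call. Since a document write costs the same however many fields it touches, the real saving comes from skipping the write entirely when the dictionary is empty.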
I would like to create a notification document on Firebase (Cloud Firestore) which includes a "sender" display name (eg. Anonymous128 sent you a message). This name is prone to changing.
What is the best practice to dynamically update the name if it does change? Should I just store userId, and pull the name up every time I'm querying notifications from the database? Or would it be better to update all notifications belonging to a user if they change their display name?
Thanks!
If reading notifications is much more frequent than a user updating their name, then I'd recommend storing the sender's name in the notification documents, as that will save you the many read operations you would otherwise spend fetching the user's name every time.
This does mean that you'll have to update plenty of documents when a user changes their name. Usually there's some rate limit on changing a user name, so this operation should not be very frequent. Also, the term notification suggests that you'll be deleting the document after the receiver has read the message. If so, the update costs should be lower too.
Alternatively, you can store only the userId in the notification documents. When you fetch all the notifications of the current user, build an array of unique userIds from them and then query the senders' documents. This ensures you fetch each user's document only once and not once for every notification they have sent. Additionally, you can cache these usernames locally as { uid: "name" } and clear the cache periodically.
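The second approach (unique userIds plus a local cache) can be sketched as follows. This is an illustration in plain Python under stated assumptions: the field name senderId, the helper names, and fetch_names (a stand-in for whatever query loads the senders' documents) are all hypothetical.

```python
from typing import Callable, Dict, Iterable, List


def unique_sender_ids(notifications: Iterable[dict]) -> List[str]:
    """Collect each sender id once, keeping first-seen order."""
    seen = set()
    ordered = []
    for notification in notifications:
        uid = notification["senderId"]
        if uid not in seen:
            seen.add(uid)
            ordered.append(uid)
    return ordered


def attach_sender_names(
    notifications: List[dict],
    fetch_names: Callable[[List[str]], Dict[str, str]],
    cache: Dict[str, str],
) -> List[dict]:
    """Look up only the senders missing from the cache, then label each notification."""
    missing = [uid for uid in unique_sender_ids(notifications) if uid not in cache]
    if missing:
        cache.update(fetch_names(missing))
    return [
        {**n, "senderName": cache.get(n["senderId"], "Unknown")}
        for n in notifications
    ]


# Stand-in for the database: records how often it is queried.
directory = {"u1": "Anonymous128", "u2": "Bob"}
calls = []


def fake_fetch(uids: List[str]) -> Dict[str, str]:
    calls.append(list(uids))
    return {uid: directory[uid] for uid in uids}


name_cache: Dict[str, str] = {}
inbox = [{"senderId": "u1"}, {"senderId": "u2"}, {"senderId": "u1"}]
labelled = attach_sender_names(inbox, fake_fetch, name_cache)
print(labelled[0]["senderName"])  # Anonymous128
```

Three notifications from two senders trigger a single lookup for two ids, and a second call with the same cache triggers none.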
I have a grid of data whose endpoints are displayed from data stored in my firestore database. So for instance an outline could be as follows:
| Spent total: $150 |
| Item 1: $80 |
| Item 2: $70 |
So the values for all of these costs (70, 80 and 150) are stored in my Firestore database, with the sub-items in a separate collection from my total spent. Now, I want to be able to update the price of Item 2 to, say, $90, which will update Item 2's value in Firestore, but I want this to then run a check against the table so that the "spent total" is also updated to "$170". What would be the best way to accomplish something like this?
Especially if I were to add multiple rows and columns that all depend on one another, what is the best way to update one part of my grid so that afterwards all of the data endpoints on the grid are updated correctly? Should I be using Cloud Functions somehow?
Additionally, I am creating a ReactJS app and previously in the app I just had my grid endpoints stored in my Redux store state so that I could run complex methods that checked each row and column and did some math to update each endpoint correctly, but what is the best way to do this now that I have migrated my data to firestore?
Edit: here are some pictures of how I am currently trying to set up my Firestore layout:
You might want to back up a little and get a better understanding of the type of database that Firestore is. It's NoSQL, so things like rows and columns and tables don't exist.
Try this video: https://youtu.be/v_hR4K4auoQ
and this one: https://youtu.be/haMOUb3KVSo
But yes, you could use a cloud function to update a value for you, or you could make the new Spent total calculation within your app logic and when you write the new value for Item 2, also write the new value for Spent total.
But mostly, you need to understand how firestore stores your data and how it charges you to retrieve it. You are mostly charged for each read/write request, with much less concern for the actual amount of data you have stored overall. So it will probably be better to NOT keep these values in separate collections if you are always going to be utilizing them at the same time.
For example:
Collection(transactions) => Document(transaction133453) {item1: $80, item2: $70, spentTotal: $150}
and then if you needed to update that transaction, you would just update the values for that document all at once and it would only count as 1 write operation. You could store the transactions collection as a subcollection of a customer document, or simply as its own collection. But the bottom line is most of the best practices you would rely on for a SQL database with tables, columns, and rows are 100% irrelevant for a Firestore (NoSQL) database, so you must have a full understanding of what that means before you start to plan the structure of your database.
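As a rough illustration of computing the new total in app logic and writing it together with the item, here is a small Python sketch. The helper name price_update is made up for this example; the field names follow the example document above.

```python
from typing import Dict


def price_update(doc: Dict[str, float], item: str, new_price: float,
                 total_field: str = "spentTotal") -> Dict[str, float]:
    """Build one update payload that changes an item and keeps the total in sync."""
    delta = new_price - doc[item]
    return {item: new_price, total_field: doc[total_field] + delta}


transaction = {"item1": 80, "item2": 70, "spentTotal": 150}
payload = price_update(transaction, "item2", 90)
print(payload)  # {'item2': 90, 'spentTotal': 170}
```

Because both fields live in the same document, sending this payload in a single update counts as one write.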
I hope this helps!! Happy YouTubing...
Edit in response to comment:
The way I like to think about it is how am I going to use the data as opposed to what is the most logical way to organize the data. I'm not sure I understand the context of your example data, but if I were maybe tracking budgets for projects or something, I might use something like the screenshots I pasted below.
Since I am likely going to have a pretty limited number of team members for each budget, that can be stored in an array within the document, along with ALL of the fields specific to that budget - basically anything that I might like to show in a screen that displays budget details, for instance. Because when you make a query to populate the data for that screen, if everything you need is all in one document, then you only have to make one request! But if you kept your "headers" in one doc and then your "data" in another doc, now you have to make 2 requests just to populate 1 screen.
Then maybe on that screen, I have a link to "View Related Transactions", if the user clicks on that, you would then call a query to your collection of transactions. Something like transactions is best stored in a collection, because you probably don't know if you are going to have 5 transactions or 500. If you wanted to show how many total transactions you had on your budget details page, you might consider adding a field in your budget doc for "totalTransactions: (number)". Then each time a user added a transaction, you would write the transaction details to the appropriate transactions collection, and also increase the totalTransactions field by 1 - this would be 2 writes to your db. Firestore is built around the concept that users are likely reading data way more frequently than writing data. So make two writes when you update your transactions, but only have to read one doc every time you look at your budget and want to know how many transactions have taken place.
Same for something like chats. But you would only make chats a subcollection of the budget document if you wanted to only ever show chats for one budget at a time. If you wanted all your chats to be taking place in one screen to talk about all budgets, you would likely want to make your chats collection at the root level.
As for getting your data from the document, it's basically a JSON object so (may vary slightly depending on what kind of app you are working in),
a nested array is referred to by:
documentName.arrayName[index]
budget12345.teamMembers[1]
a nested object:
documentName.objectName.fieldName
budget12345.projectManager.firstName
And then a subcollection is
collection(budgets).document(budget12345).subcollection(transactions)
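To make that notation concrete, here is what the same lookups look like once a document has been loaded into a plain Python dictionary. The field values are invented for illustration; only the names budget12345, teamMembers, projectManager and firstName come from the examples above.

```python
# A budget document as it might look after being read into memory (sample values).
budget12345 = {
    "name": "Website redesign",
    "teamMembers": ["Ana", "Ben", "Chloe"],
    "projectManager": {"firstName": "Dana", "lastName": "Reyes"},
}

# Nested array: documentName.arrayName[index]
second_member = budget12345["teamMembers"][1]

# Nested object: documentName.objectName.fieldName
manager_first_name = budget12345["projectManager"]["firstName"]

# A subcollection is not part of the document data; it is addressed by a path
# that alternates collection and document ids.
transactions_path = "/".join(["budgets", "budget12345", "transactions"])

print(second_member, manager_first_name, transactions_path)
# Ben Dana budgets/budget12345/transactions
```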
[Screenshot: example budget doc]
[Screenshot: remainder of example budget doc]
[Screenshot: example team chats collection]
[Screenshot: example transactions collection]
I was wondering what the best way to store data in firestore on a user and global level is so that all of the information is easily retrievable and can be calculated.
For instance, let's say you are making a race car app where your average time around the track is recorded to a specific user but you also want to use this user's average time to compare against the global average in that user's location.
A setup I am thinking of is like this:
Global Firestore Data
{
  Location: "France",
  globalAverageTime: "6 minutes",
  numberOfParticipants: 2
}
User Firestore Data
[
  {
    username: "firstuser",
    location: "France",
    time: "8 minutes"
  },
  {
    username: "seconduser",
    location: "France",
    time: "4 minutes"
  }
]
So essentially, what is the best way to house all of this type of data? So that every time a user updates their personal average time around the track a new global average is calculated based on that user's location and a new time. So for instance in the above example, there are two "number of participants" in France one with 8 minutes and one with 4 minutes so the average time is 6 minutes around the track for that region.
To run a system like this would the best thing to do be use cloud functions that run a calculation after every update to user data and then update the global data? Also is this the best way for setting up the system so that each user can then later compare themselves to the average?
Let me know any improvements that can be made or how I should change how the data is set up in firestore.
Interesting question! I think subcollections might be the answer here.
You could attack the problems a few different ways. But the first one that comes to mind is this:
Consider the following top level collections:
Users
Tracks
The 'users' collection contains the tracks driven by the user like this:
-- Users
  -- user_1
    -- tracks (object)
      -- track_1: true (could also be a float instead of boolean, representing the completion time for the user)
      -- track_2: true
      -- track_5: true
Then in the 'tracks' collection you have the following:
-- Tracks
  -- track_1
    -- location
    -- globalAverageTime
    -- numberOfParticipants
    -> 'users' (subcollection)
      -- user_1 (document)
        -- completionTime: 6.43
That way, you can have your function fire each time a new document is written to the tracks -> users subcollection and update the track via a transaction.
EDIT
To answer your question regarding having many users in the tracks subcollection: if you fetch the /tracks/track_1 document, the users subcollection is not fetched with it. You have to request that subcollection explicitly, which means you can retrieve one track, or several, without also fetching all the users.
You can read about transactions here.
From the docs:
Using the Cloud Firestore client libraries, you can group multiple operations into a single transaction. Transactions are useful when you want to update a field's value based on its current value, or the value of some other field. You could increment a counter by creating a transaction that reads the current value of the counter, increments it, and writes the new value to Cloud Firestore.
So you could have a function watching /tracks/{track_id}/users/{user_id}, where each time a user completes the track and sets a time, you update the track with the new average.
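The arithmetic such a function would perform can be sketched without reading every user's time again, because the stored average and participant count are enough. The following is a plain Python illustration; the function names are hypothetical, and in a real function this read-modify-write would run inside the transaction mentioned above.

```python
from typing import Tuple


def add_time(average: float, count: int, new_time: float) -> Tuple[float, int]:
    """Fold a first-time participant's time into the stored average."""
    new_count = count + 1
    return (average * count + new_time) / new_count, new_count


def replace_time(average: float, count: int,
                 old_time: float, new_time: float) -> Tuple[float, int]:
    """Adjust the average when an existing participant posts a new time."""
    if count == 0:
        raise ValueError("cannot replace a time when there are no participants")
    return (average * count - old_time + new_time) / count, count


# Reproduce the France example from the question: 8 and 4 minutes average to 6.
average, participants = add_time(0.0, 0, 8.0)
average, participants = add_time(average, participants, 4.0)
print(average, participants)  # 6.0 2

# The first user improves from 8 to 6 minutes.
average, participants = replace_time(average, participants, 8.0, 6.0)
print(average, participants)  # 5.0 2
```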
IMHO, you can keep your database structure as it is. And to answer your questions:
To run a system like this would the best thing to do be use cloud functions that run a calculation after every update to user data and then update the global data?
The best solution is to use Cloud Functions, because you can calculate those numbers server side. They let you automatically run backend code in response to events triggered by Firebase features and HTTPS requests. In your particular case, you should change the globalAverageTime property only if the time property of one of those two user objects changes.
Also is this the best way for setting up the system so that each user can then later compare themselves to the average?
In my opinion, yes because you'll be able to compare client side user's best time with the global time.
I'm creating an application that stores user's data in multiple database tables - info, payments and booking (this is a booking system).
In 'info' table I store the user info such as email, name, phone, etc...,
In 'payments' table I store his payments details and in 'booking' I store his booking history.
My question is: what is the best way of representing this data in Flux architecture? Do I need 3 different stores (one for each table) or a single store (let's say 'UserStore') that holds all of the user's data?
Basically, I have a dashboard component that should show all user's data.
In case I should go with the 3 different stores solution, is it possible to know when all of them finished loading their data (since each store loads the data asynchronously from the DB)?...
Thanks!
Generally, fewer stores is better (easier to maintain).
Rule of thumb I use: if a store gets bigger than 300 lines of code, consider splitting it up into different stores.
In both solutions (one store or 3 stores) you will need code to manage dependencies between the tables. The code just lives in different files/stores.
About the user dashboard you mention: avoid storing redundant data in your store(s). So if you have e.g. total number of bookings for a user, make a getter function that does live calculation.
Better not let stores load data from the DB or server-side directly, but create a WebAPIUtil that fires an update-action when new data comes in. See the 'official' flux diagram.
The stores can have a waitFor function that wait until another store is updated.
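To illustrate the single-store option together with the advice about not storing redundant data, here is a small sketch. It is written in Python only to show the shape of the idea, not as Flux code; the class, method and property names are hypothetical.

```python
from typing import Any, Dict, List, Optional


class UserStore:
    """One store holding all three sections of user data."""

    SECTIONS = ("info", "payments", "booking")

    def __init__(self) -> None:
        # None means "not loaded yet" for that section.
        self._data: Dict[str, Optional[Any]] = {name: None for name in self.SECTIONS}

    def receive(self, section: str, payload: Any) -> None:
        """Called by an update action when new data for a section arrives."""
        if section not in self._data:
            raise KeyError(section)
        self._data[section] = payload

    @property
    def is_ready(self) -> bool:
        """True once every section has been received at least once."""
        return all(value is not None for value in self._data.values())

    @property
    def total_bookings(self) -> int:
        """Derived on demand instead of being stored as a separate field."""
        bookings: List[Any] = self._data["booking"] or []
        return len(bookings)


store = UserStore()
store.receive("info", {"name": "Ann", "email": "ann@example.com"})
store.receive("booking", [{"id": 1}, {"id": 2}])
print(store.is_ready, store.total_bookings)  # False 2
store.receive("payments", [])
print(store.is_ready)  # True
```

With a single store, the dashboard only has to check one flag to know whether all sections have arrived, and the booking count can never drift out of sync with the booking list.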
I'm trying to make a general purpose data structure. Essentially, it will be an append-only list of updates that clients can subscribe to. Clients can also send updates.
I'm looking for suggestions on how to implement this. I could have an ndb.Model, 'Update', that contains the data and an index, or I could use a StructuredProperty with repeated=True on the main entity. I could also just store a list of keys somehow and keep the actual update data in a not-strongly-linked structure.
I'm not sure how repeated properties work: does appending to the list (via the Python API) have to rewrite them all?
I'm also worried about consistency. Since multiple clients might be sending updates, I don't want them to overwrite each other and lose an update, or somehow end up with two updates with the same index.
The problem is that there is a maximum total size for each entity (model instance) in the datastore.
So any single entity that accumulates updates (storing the data directly or by collecting keys) will eventually run out of space (I'm not sure how the limit applies to structured properties, however).
Why not have a model "update", as you say, and a simple version would be to have each provided update create and save a new model. If you track the save date as a field in the model you can sort them by time when you query for them (presumably there is an upper limit anyway at some level).
Also that way you don't have to worry about simultaneous client updates overwriting each other, the data-store will worry about that for you. And you don't need to worry about what "index" they've been assigned, it's done automatically.
As that might be costly in datastore reads, I'm sure you could implement a version that uses repeated properties in a single entity and moves to a new entity after N keys are stored, but then you'd have to wrap it in a transaction to be sure multiple updates don't clash, and so on.
You can also cache the query generating the results and invalidate it only when a new update is saved. Look at NDB also as it provides some automatic caching (not for a query however).
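The "one record per update, sorted by save time" idea can be sketched independently of the datastore. The following is an in-memory stand-in in plain Python, not NDB code: the Update and UpdateLog names are hypothetical, and the counter only imitates the automatic id assignment the datastore would do for you.

```python
import itertools
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Any, List, Optional


@dataclass(frozen=True)
class Update:
    id: int            # stands in for the automatically assigned key
    created: datetime  # save time, used for ordering
    data: Any


class UpdateLog:
    """Append-only log: every update becomes its own record."""

    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._updates: List[Update] = []

    def append(self, data: Any, created: Optional[datetime] = None) -> Update:
        stamp = created or datetime.now(timezone.utc)
        update = Update(next(self._ids), stamp, data)
        self._updates.append(update)
        return update

    def since(self, moment: datetime) -> List[Update]:
        """Updates saved after `moment`, oldest first (id breaks ties)."""
        newer = [u for u in self._updates if u.created > moment]
        return sorted(newer, key=lambda u: (u.created, u.id))


start = datetime(2020, 1, 1, tzinfo=timezone.utc)
log = UpdateLog()
first = log.append("hello", created=start + timedelta(seconds=1))
second = log.append("world", created=start + timedelta(seconds=2))

print([u.data for u in log.since(start)])          # ['hello', 'world']
print([u.data for u in log.since(first.created)])  # ['world']
```

A subscribing client would remember the save time of the last update it saw and ask only for what came after it, which is the query you would cache and invalidate when a new update is saved.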