Removing data from Cloud Firestore using React - reactjs

I'm trying to make a simple React app that uses Cloud Firestore for user auth and storing data; something that I could serve using Heroku or a similar host.
I'm running into trouble with enabling a user to delete their account (and associated data), as Firestore tells me that it's a bad idea to delete collections from the client side. Here's what they say:
Deleting a collection requires coordinating an unbounded number of individual delete requests. If you need to delete entire collections, do so only from a trusted server environment. While it is possible to delete a collection from a mobile/web client, doing so has negative security and performance implications.
https://firebase.google.com/docs/firestore/manage-data/delete-data
While I might be able to delete the document connected with the user's account, this suggests that I can't really delete the sub-collections under that document.
So what would be a good way of automatically removing both the user document and the user's sub-collections? Can I achieve this through my React code? If not, is there a relatively easy way to do it without building a fancy back end?

Well, the documentation never says that this is not possible, just not recommended.
This makes sense: to delete everything, including the documents in subcollections, you would have to write logic that deletes them one by one. That is a very read-intensive process, and doing all of that processing in your web app is not good practice; it might slow the app down or even block functionality while the deletion is occurring.
I would follow Firestore's own recommendation and create a Callable Function that performs these deletions for you. All the processing happens in that function, so it will not degrade your app's performance. You can find more details in the Firebase documentation.
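The batching such a function would need can be sketched roughly as follows. This is a minimal sketch, not the Firebase API: `store`, `listIds`, `deleteMany`, `deleteDoc`, and `makeFakeStore` are hypothetical names standing in for whatever server-side client the function would use, and the `users/<uid>/<subcollection>` layout is an assumption about the data model.

```javascript
// Hypothetical sketch: `store` is a stand-in for a server-side database client.
// It is NOT the Firestore API; real Admin SDK calls would replace its methods.

// Delete every document under `path`, at most `batchSize` documents per round.
async function deleteCollectionInBatches(store, path, batchSize = 100) {
  let deleted = 0;
  for (;;) {
    // Read a bounded page of IDs so memory use stays flat for large collections.
    const ids = await store.listIds(path, batchSize);
    if (ids.length === 0) return deleted;
    await store.deleteMany(path, ids);
    deleted += ids.length;
  }
}

// Delete the user's subcollections first, then the user document itself.
// Returns the number of subcollection documents that were removed.
async function deleteUserData(store, uid, subcollections) {
  let total = 0;
  for (const name of subcollections) {
    total += await deleteCollectionInBatches(store, `users/${uid}/${name}`);
  }
  await store.deleteDoc(`users/${uid}`);
  return total;
}

// In-memory stand-in used only to exercise the sketch.
// `collections` maps a path to a Set of document IDs; `docs` is a Set of paths.
function makeFakeStore(collections, docs) {
  return {
    async listIds(path, limit) {
      return [...(collections.get(path) || [])].slice(0, limit);
    },
    async deleteMany(path, ids) {
      const set = collections.get(path);
      if (set) ids.forEach((id) => set.delete(id));
    },
    async deleteDoc(path) {
      docs.delete(path);
    },
  };
}
```

The client would then only call the function with nothing but its own auth token, and the function would derive `uid` from the authenticated context rather than trusting a value sent by the browser.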

Related

Protecting Firestore without requiring authentication

So currently in the project we have a collection of documents that don't require authentication to be read. They are write/update protected, but everyone can read.
What we are trying to prevent is someone looking at the Firebase endpoints and somehow managing to scrape the entire collection in JSON format (if this is even possible). The data is public, but I want it to be accessible only from our website.
One of the solutions we could think of was SSR (we are already using Next.js), but implementing SSR just for this reason doesn't seem very enticing.
Any suggestions would be appreciated.
EDIT:
Let me rephrase a little bit.
From what you see in the network tab, is it possible to forge/create a request to Firestore and get the entire collection instead of just the 1 document that was intended?
The best solution in your case is SSR. I know it may not sound enticing, but consider when SSR is worth using. Your use case has an important requirement, security, and I think that alone is a strong enough reason to justify SSR.
Also, creating a dedicated service account for the Next.js app, and setting the security rules so that clients cannot read the data directly, would further improve the overall security level. Note that server SDKs authenticated with a service account bypass security rules, so the rules only need to deny client reads.
Last: reading the data server side should make your site a little faster, although the difference is a matter of milliseconds and would be hard to notice. As it is now, your page has to load before the request to Firebase can be sent, which adds a small delay. If the data is loaded server side, that delay disappears.
is it possible to forge/create a request to Firestore and get the entire collection instead of just the 1 document that was intended?
If you want to limit what people can request from a collection, you're looking for security rules. The most common model there is some form of ownership-based access control or role-based access control, but both of those require some way of identifying the user. This could be anonymously (so without them entering credentials), but it'd still be a form of auth.
If you don't want to do that, you can still control how much data can be retrieved through the API in one go. For example, if the security rules allow get but not list, the user can only request a document once they know its ID. Even if you allow list, you can control in the rules which queries are allowed.
I think one approach could be writing a Cloud Function that retrieves this public data using the admin SDK. Then, you could set a rule that nobody can read those documents. This means that only your Cloud Function with the admin SDK will have access to those documents.
Finally, you could set up App Check for that specific Cloud Function. This helps ensure that requests come only from your client app.
https://firebase.google.com/docs/app-check

Given an array of chat rooms, and an array of chat room metadata, how can I securely update the corresponding chat room in both?

To practice using Firebase for large-scale applications, I'm making a Firebase app that has multiple chat rooms, similar to Firechat.
Theoretically there could be millions of chat rooms, each of which has millions of messages.
I would like to be able to display a list of chat room names, including some metadata about each chat room such as its number of participants.
The simplest way from a coding perspective would be to load all chat rooms, but that could be a tremendous amount of data.
My solution is to have an array of Chats and an array of ChatInfos. The former will contain all the chat messages, and the latter the metadata about each chat.
My question is: How can I update an item in one array and be sure the corresponding item in the other array is also updated without relying so much on the client?
My current solution is this:
var chatData = {room_name: "Test Room"}
var chat = Chat.ref.push(chatData);
var chatMeta = ChatMeta.ref.push(chatData);
chat.update({chat_meta_id: chatMeta.key});
chatMeta.update({chat_id: chat.key});
This works. However, it feels to me like it relies too much on the client.
For instance: What if the client's Firebase connection dies after the Chat has been created but before the ChatMeta has been created? Of course Firebase is near-instantly fast, but relying on front-end code to sync things up on the back-end feels like bad practice.
One solution would be to have all "write" requests go through a back-end server I control, and then go to Firebase. But the whole "no back end!" thing is such a big selling point for Firebase that I wonder whether that can be avoided.
To my mind, there should either be some way of doing a "shallow" query of all the different Chat items so that keeping a running list of ChatMeta data is unnecessary, or some kind of rule set up on Firebase's end that updates ChatMeta whenever Chat is updated. However, neither of those seems to exist.
Thank you!
You can avoid both relying on the client too much and having a controlled back-end server by using Cloud Functions.
In this case, you'd create an HTTP-triggered function and send your room_name to it from the client. The Cloud Function would create both the Chat and the ChatMeta records and return the ChatMeta object.
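One way the body of such a function could keep the two records in step is a single multi-location update, which the Realtime Database applies atomically (either every path is written or none is). The sketch below only builds the update object; the node names `chats` and `chatMeta`, the `participants` field, and the choice to reuse one key for both records (instead of the two separate push keys in the question) are assumptions made for illustration.

```javascript
// Build one update object that writes the chat and its metadata together.
// "chats", "chatMeta", and "participants" are assumed names for this sketch.
function buildChatRoomUpdate(roomId, roomName) {
  return {
    // Full chat record; messages would later be added under this node.
    [`chats/${roomId}`]: { room_name: roomName, chat_meta_id: roomId },
    // Lightweight metadata record used to render the room list.
    [`chatMeta/${roomId}`]: { room_name: roomName, chat_id: roomId, participants: 0 },
  };
}

// Inside the function, with a Realtime Database root reference `rootRef`
// (assumed to come from the Admin SDK), the write would look like:
//   const roomId = rootRef.child("chats").push().key; // generates a key, writes nothing yet
//   await rootRef.update(buildChatRoomUpdate(roomId, "Test Room"));
```

Because both paths go into one `update` call, a dropped connection cannot leave a Chat without its ChatMeta, which is the failure the question is worried about.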

Run code in GAE according to changes in Firebase

Since Parse is shutting down, we are currently using Firebase to support basic data storage and real-time messaging. However, in order to implement a key feature in our app, we need to run some code on a server. The following is what we are trying to accomplish:
We allow users to upload key words to Firebase, then we want to send notifications to them if any new posts that contain these key words were uploaded by other users. For example, userA wants to know if anyone posted information related to chemistry, so userA enters key words "chemistry" and "science" in our app which get stored in Firebase, userB posted an article called "chemistry rocks!" which contains the key word "chemistry", userA will then receive a notification immediately about this post.
We have a couple of solutions in mind, but we are not sure which way to go and how to properly implement these solutions.
1 - Build a server that listens to Firebase changes and also supports sending notifications to individual users. However, to host and maintain a server just to run a search algorithm is just too much work for this simple task.
2 - Store the key words in another database that somehow can send notifications according to the search result. This would be faster because we wouldn't have to connect Firebase server to our own server, but again we would still have to host and maintain a separate server.
I have looked into Google App Engine, their push/pull queue feature sounds like something we want, but does GAE support notifications? And also how can we hook it up with Firebase? We also came across Firebase+Batch to send notifications, but I don't think Batch supports cloud computation.
Has anyone run into this problem? Any solutions?

app engine update all sessions for user

In my GAE application, a user can perform an action (buy something). I need that information stored persistently and available immediately on all requests from all of this user's sessions across multiple devices/browsers. I'm using webapp2_extras sessions.
The way I'm thinking of doing this is either:
1) adding the action_happened field to the User model and making it available in the session by adding it to the list in the webapp2_extras.auth['user_attributes'] config. But this doesn't work unless the user is logged out of all sessions.
or 2) creating a memcache entry (backed by the datastore) for each user, like user_id_action_happened, and checking whether it is true or false on each request. This is my preferred method.
Is there any other way to do this?
I think storing it in the database and querying it on each request is the most natural option.
I don't know your full requirements, but for keeping the sessions synchronized a solution like Firebase makes a lot of sense, though it might be overkill in your case.

GAE datastore -- proper ways to implement search/data retrieval in response to a user request?

I am writing a web app and I am trying to improve the performance of search/displaying results. I am relatively new to programming this sort of thing, so I apologize in advance if these are simple questions/concepts.
Right now I have a database of ~20,000 sites, each with properties, and I have a search form that (for now) just asks the database to pull all sites within a set distance (for this example, say 50km). I have put the data into an index and use the Search API to find sites.
I am noticing that the database search takes ~2-3 seconds to:
1) Search the index
2) Get a list of key names (this is stored in the search index)
3) Using key names, pull from datastore (in a loop) and extract data properties to be displayed to the user
4) Transmit data to the user via jinja template variables
This is also only getting 20 results (the default maximum for a Search API query.. I haven't implemented cursors here yet, although I will have to).
For whatever reason, it feels quite slow.. I am wondering what websites do to make the process seem faster. Do they implement some kind of "asynchronous" search, where a page loads while in the background the search/data pulls are processed, and then subsequently shown to the user...?
Are there "standard" ways of performing searches here where the processing/loading feels seamless to the user?
Thanks.
edit
Would something like passing just a "query ID" via the page, and then using AJAX to get the data from the datastore as JSON, work? Like... can App Engine redirect the user to the final page, pass in only a "query ID", run the search in the meantime, and then, once the data is ready, pass the information to the user via JSON?
Make sure you are getting entities from the datastore in parallel. Since you already have the key names, you just have to pass your list of keys to the appropriate method.
For db:
MyModel.get_by_key_name(key_names)
For ndb:
ndb.get_multi([ndb.Key('MyModel', key_name) for key_name in key_names])
If you needed to do datastore queries, you could enable parallel fetches with the query.run (db) and query.fetch_async (ndb) methods.
