firestore data model using multiple sub collections

I have a current structure in my firestore.
Data model:
- users (collection)
- 9d5UCo4RtyTosiFAIvnPK4zFCUk1 ( each user is a separate document )
- trips ( collection )
- New York (Document)
- fields: date, description, photos:[url1, url2, ....] ( fields in the New York document), status:public
- Los Angeles
....
Now I want to add notes, likes and metadata for each photo in the New York, LA, etc photos array.
I have a few ideas but not sure which is scalable and works well for what I want to do with notes, likes and metadata.
The notes don't have to load at the same time the photos do. The user would click to open notes, which could be a new query to the DB. But each photo does need to know how many notes are associated with it when the photos are being displayed.
Solution 1:
photos array would be an object instead.
const photos = {
isdffd: {
url: "path",
likes: 100,
metadata: {
...
},
},
xxydfsd: {
url: `path`,
likes: 10,
metadata: {
...
}
},
};
Then I would have a collection of notes at the same level as trips and use the photo ID as the ID for the note. But that limits me to one note per photo, and I want to be able to add multiple notes.
I also want to be able to build a search page which would search through all notes throughout all trips and display them grouped by trip in the results.
What would be a good approach for this kind of setup, given that what I have right now uses sub-collections for most of my structure?
Solution 2:
add another sub-collection inside each trip document called photos
each photo would be its own document with likes, metadata and notes. But then notes needs to be a sub-collection of each photo document. This would make it hard to search all trips and all notes for each photo.

The way your original data model has been structured:
- users (collection)
- 9d5UCo4RtyTosiFAIvnPK4zFCUk1 ( each user is a separate document )
- trips ( collection )
- New York (Document)
- fields: date, description, photos:[url1, url2, ....] ( fields in the New York document), status:public
- Los Angeles
....
If you implement the first solution, you will limit yourself: adding another field to the photos object in the future, or adding more photos to it, means changing that one object by hand.
Adding a photos sub-collection inside each trip is your best option: the collection can grow, and you avoid the limitation of the first solution (having to make every change manually). If your notes are only text, you can keep them inside each photo document as an array and avoid another sub-collection. Keeping the photos as a sub-collection, with the URL, likes, metadata and notes (if they are only text) inside each photo document, should not add much difficulty when querying.
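A minimal sketch of the document shape this suggests, using plain objects rather than the Firestore SDK. The helper names (photoPath, makePhotoDoc, addNote) and the noteCount field are assumptions for illustration; noteCount is there because the question asks for each photo to know how many notes it has.

```javascript
// Hypothetical helpers; plain objects stand in for Firestore documents.

// Path of a photo document inside a trip's "photos" sub-collection.
function photoPath(userId, tripId, photoId) {
  return `users/${userId}/trips/${tripId}/photos/${photoId}`;
}

// One document per photo. Notes are kept as a text array, as suggested above.
// noteCount is stored alongside so a list view can show the count directly;
// it would keep working if notes later moved to their own sub-collection.
function makePhotoDoc(url) {
  return { url, likes: 0, metadata: {}, notes: [], noteCount: 0 };
}

// Returns a new photo document with the note appended and the counter updated.
function addNote(photo, text) {
  const notes = [...photo.notes, text];
  return { ...photo, notes, noteCount: notes.length };
}

const path = photoPath('9d5UCo4RtyTosiFAIvnPK4zFCUk1', 'New York', 'photo1');
let photo = makePhotoDoc('url1');
photo = addNote(photo, 'Taken from the bridge');
photo = addNote(photo, 'Early morning');
console.log(path, photo.noteCount);
```

With real Firestore documents, the same object would be what you write at that path; the array-of-notes choice only makes sense while the notes stay small.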

Related

Get Firestore collection and sub-collection document data together

I have the following Firestore database structure in my Ionic 5 app.
Book(collection)
{bookID}(document with book fields)
Like (sub-collection)
{userID} (document name as user ID with fields)
The Book collection has documents, and each document has a Like sub-collection. The document names in the Like sub-collection are the IDs of the users who liked the book.
I am trying to do a query to get the latest books and at the same time trying to get the document from Like sub-collection to check if I have liked it.
async getBook(coll) {
  // Fetch the ten most recently created books.
  const snap = await this.afs.collection('Book').ref
    .orderBy('createdDate', 'desc')
    .limit(10).get();
  snap.docs.forEach(x => {
    coll.push({
      key: x.id,
      data: x.data(),
      // Stored as a promise; resolved later by the async pipe in the template.
      like: this.getMyReaction(x.id)
    });
  });
}
async getMyReaction(key) {
  // 'myUserID' is a placeholder for the signed-in user's ID.
  const res = await this.afs.doc(`Book/${key}/Like/myUserID`).ref.get();
  if (res.exists) {
    return res.data();
  } else {
    return 'notFound';
  }
}
What I am doing here is calling the method getMyReaction() with each book ID and storing the promise in the like field. Later, I am reading the like value with async pipe in the HTML. This code is working perfectly but there is a little delay to get the like value as the promise is taking time to get resolved. Is there a solution to get sub-collection value at the same time I am getting the collection value?

Is there a solution to get sub-collection value at the same time I am getting the collection value?
Not without restructuring your data. Firestore queries can only consider documents in a single collection. The only exception to that is collection group queries, which let you consider documents among all collections with the exact same name. What you're doing right now to "join" these two collections is probably about as effective as you'll get.
The only way you can turn this into a single query is by having another collection with the data pre-merged from the other two collections. This is actually kind of common on nosql databases, and is referred to as denormalization. But it's entirely up to you to decide if that's right for your use case.
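A rough sketch of what such a pre-merged (denormalized) record could look like, using plain arrays and objects in place of Firestore documents. The names buildFeed, likedByUser and the likedBy field are hypothetical, and embedding every user ID is only reasonable while the number of likes per book stays small.

```javascript
// Hypothetical pre-merge step: plain arrays/objects stand in for documents.

// books: [{ id, ...fields }]; likes: [{ bookId, userId }]
// Returns one merged record per book with the liking user IDs embedded.
function buildFeed(books, likes) {
  const likedBy = new Map();
  for (const { bookId, userId } of likes) {
    if (!likedBy.has(bookId)) likedBy.set(bookId, []);
    likedBy.get(bookId).push(userId);
  }
  return books.map(book => ({ ...book, likedBy: likedBy.get(book.id) || [] }));
}

// With the merged record, "did I like this?" needs no second read.
function likedByUser(feedEntry, userId) {
  return feedEntry.likedBy.includes(userId);
}

const feed = buildFeed(
  [{ id: 'b1', title: 'First' }, { id: 'b2', title: 'Second' }],
  [{ bookId: 'b1', userId: 'alice' }, { bookId: 'b1', userId: 'bob' }]
);
console.log(feed.map(entry => [entry.id, likedByUser(entry, 'alice')]));
```

The cost of this design is that every like or unlike has to update the merged record as well as the original Like document.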

MongoDB architecture: how to store a large amount of arrays or sub documents in a scalable way

I am currently working on a blogging app, in which users can create their own blogs and each blog has blogposts within that. I'm ideating about architecting a database that is scalable when each blog has a lot of blogposts.
So is it better to structure my database as this:
blog1 : {
blogname : 'blog1',
blogposts: [array of blogposts]
},
blog2 : {
blogname : 'blog2',
blogposts: [array of blogposts]
}
Or should I create a separate collection with all the blogposts, something like this:
blogpost1: {
id: 'blogpost1',
content: {blogpost content in json format}
},
blogpost2: {
id: 'blogpost2',
content: {blogpost content in json format}
}
and reference them in the blog collection.
I want to know which choice would be superior when there are a lot of blogposts. Because I remember reading somewhere in MongoDB docs that it's not recommended to have arrays within document that can grow beyond bounds, so approach #1 is not ideal, right?

When creating databases, I find it useful to think about the requests I would be making.
A blogging app user would want to search all blogs or find a blogger by some criteria.
In this case separate collections for bloggers and blogs would work best. Then structure your documents so that the bloggers link to their blogs and vice versa.
This can be done with Mongoose Schemas (https://mongoosejs.com/docs/index.html).
// models/blogger.js
const mongoose = require('mongoose')
const bloggerSchema = mongoose.Schema({
blogs: [
{
type: mongoose.Schema.Types.ObjectId,
ref: 'Blog'
}
],
name: String
})
bloggerSchema.set('toJSON', {
transform: (document, returnedObject) => {
const blogger = returnedObject
blogger.id = blogger._id.toString()
delete blogger._id
delete blogger.__v
}
})
module.exports = mongoose.model('Blogger', bloggerSchema)
Then use populate with your request:
// controllers/bloggers.js
const bloggersRouter = require('express').Router()
const Blogger = require('../models/blogger')
bloggersRouter.get('/', async (request, response) => {
const bloggers = await Blogger.find({}).populate(
'blogs', {
title: 1
}
)
response.json(bloggers.map(blogger => blogger.toJSON()))
})
module.exports = bloggersRouter
This way you don't have to add the blogs in their entirety to the blogger document; you can include just the title or anything else that you need on the bloggers' initial view.
You could also think about limiting the length of a blog, so you can have more control over the data and then think about the options Joe suggested.

Why does it have to be one or the other?
Storing the blog posts in the same document as the blog is great as long as the individual posts are not very large, and there aren't very many of them.
Storing the posts in a separate collection is good for bigger posts and busy blogs but adds an additional query or lookup to retrieve.
I would think it is expected that your users' output will run the gamut from sparse to prolific, and individual posts will range from a few dozen bytes to many megabytes.
For small posts on not very active blogs, store the posts in the blog document for efficient retrieval.
For busy blogs, store them in an archive collection. Perhaps store the most recent couple of posts, or the most popular posts, in the blog document so you don't have to refer to the other collection every time.
You will also need to figure out how to split a post between documents. MongoDB has a 16MB limit on a single document, so if any of your users make huge posts, you'll need to be able to store them somewhere.
Your question as written seems to be asking whether it is better to follow a relation model or a strict document model. I think in reality neither is a perfect fit for this and a hybridized and flexible approach would work out better.
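One way to picture the hybrid layout described above, as a sketch on plain objects rather than MongoDB calls. The addPost helper, the recentPosts field and the MAX_EMBEDDED_POSTS threshold are all assumptions; a real cutoff would depend on post sizes and the 16MB document limit.

```javascript
// Hypothetical hybrid layout: recent posts embedded, older ones archived.
const MAX_EMBEDDED_POSTS = 3; // assumed threshold, tune for real data

// blog: { blogname, recentPosts: [] }; archive: array standing in for a collection.
// Adds the post to the blog document and moves any overflow to the archive.
function addPost(blog, archive, post) {
  const recentPosts = [post, ...blog.recentPosts];
  // splice removes everything past the threshold and returns the removed posts.
  const overflow = recentPosts.splice(MAX_EMBEDDED_POSTS);
  const archived = overflow.map(p => ({ ...p, blogname: blog.blogname }));
  return { blog: { ...blog, recentPosts }, archive: [...archive, ...archived] };
}

let state = { blog: { blogname: 'blog1', recentPosts: [] }, archive: [] };
for (const id of ['p1', 'p2', 'p3', 'p4', 'p5']) {
  state = addPost(state.blog, state.archive, { id, content: `content of ${id}` });
}
console.log(state.blog.recentPosts.map(p => p.id), state.archive.map(p => p.id));
```

Reading a blog's front page then touches only the blog document, and the archive is queried only when someone pages further back.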

How do you fetch all documents (including non-existent ancestor documents) in a Firebase collection?

I am trying to pull all the documents in the collection 'users', but it only pulls 'fred' and 'lisa', and ignores all the italicized documents:
For this data:
Trying to get all documents:
Will yield:
info: length 2
info: fred => { gender: 'male', contacts: [ '' ] }
lisa => { contacts: [ '' ] }
According to the Firebase documentation (Firebase: Add and Manage Data):
Warning: Even though non-existent ancestor documents appear in the console, they do not appear in queries and snapshots. You must create the document to include it in query results.
Note: The non-existent ancestor users seem to be auto-created when the user hits the sign-up button that triggers a firebase.auth() function (fred and lisa were created manually).
How would I print the contacts of each user if some of the users do not show up in my queries? Would I need to periodically run a script that manually re-adds all the users or is there a more elegant solution?

As you have mentioned, these "documents" are displayed with an italic font in the Firebase console: this is because these documents are only present (in the console) as "container" of one or more sub-collection but they are not "genuine" documents.
As a matter of fact, if you create a document directly under a col1 collection with the full path doc1/subCol1/subDoc1, no intermediate documents will be created (i.e. no doc1 document).
The Firebase console shows this kind of "container" (or "placeholder") in italics in order to "materialize" the hierarchy and allow you to navigate to the subDoc1 document but doc1 document doesn't exist in the Firestore database.
Let's take an example: Imagine a doc1 document under the col1 collection
col1/doc1/
and another one subDoc1 under the subCol1 (sub-)collection
col1/doc1/subCol1/subDoc1
Actually, from a technical perspective, they are not related to each other at all. They just share a part of their path but nothing else. One side effect of this is that if you delete a document, its sub-collection(s) still exist.
So, if you want to be able to query for these parent documents, you will have to create them yourself, as jackz314 mentioned in the comments.
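The behaviour can be modelled with a small in-memory sketch in which documents are keyed by their full path. This is not the Firestore SDK; listCollection and missingAncestors are hypothetical helpers, and 'amy' is a made-up user that has a sub-collection but no document of its own.

```javascript
// Hypothetical in-memory model: keys are full document paths.
const store = new Map([
  ['users/fred', { gender: 'male' }],
  ['users/lisa', {}],
  ['users/amy/contacts/c1', { name: 'Bob' }], // no users/amy document exists
]);

// Documents that actually exist directly under a collection path.
function listCollection(store, collectionPath) {
  const depth = collectionPath.split('/').length + 1;
  return [...store.keys()].filter(
    p => p.startsWith(collectionPath + '/') && p.split('/').length === depth
  );
}

// Ancestor document paths implied by deeper paths but never created.
function missingAncestors(store) {
  const missing = new Set();
  for (const path of store.keys()) {
    const parts = path.split('/');
    // Document paths have an even number of segments; walk up two at a time.
    for (let n = parts.length - 2; n >= 2; n -= 2) {
      const ancestor = parts.slice(0, n).join('/');
      if (!store.has(ancestor)) missing.add(ancestor);
    }
  }
  return [...missing];
}

console.log(listCollection(store, 'users'));
console.log(missingAncestors(store));
```

The paths returned by missingAncestors are the ones that would have to be created as real documents before a query on the parent collection could return them.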

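If you're trying to list all your registered users from Firebase Auth, you can use the Firebase Admin SDK:
// Assumes firebase-admin is installed and admin.initializeApp() has been called.
const admin = require('firebase-admin');
function listAllUsers(nextPageToken) {
  // List a batch of users, 1000 at a time.
  admin.auth().listUsers(1000, nextPageToken)
    .then(function(listUsersResult) {
      listUsersResult.users.forEach(function(userRecord) {
        console.log('user', userRecord.toJSON());
      });
      if (listUsersResult.pageToken) {
        // List next batch of users.
        listAllUsers(listUsersResult.pageToken);
      }
    })
    .catch(function(err) {
      console.log('Error listing users:', err);
    });
}
// Start listing users from the beginning.
listAllUsers();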
via http://firebase.google.com/docs/auth/admin/manage-users#list_all_users

AngularFire - How do I query denormalised data?

OK, I'm starting out fresh with Firebase. I've read this: https://www.firebase.com/docs/data-structure.html and I've read this: https://www.firebase.com/blog/2013-04-12-denormalizing-is-normal.html
So I'm suitably confused as one seems to contradict the other. You can structure your data hierarchically, but if you want it to be scalable then don't. However that's not the actual problem.
I have the following structure (please correct me if this is wrong) for a blog engine:
"authors" : {
"-JHvwkE8jHuhevZYrj3O" : {
"userUid" : "simplelogin:7",
"email" : "myemail#domain.com"
}
},
"posts" : {
"-JHvwkJ3ZOZAnTenIQFy" : {
"state" : "draft",
"body" : "This is my first post",
"title" : "My first blog",
"authorId" : "-JHvwkE8jHuhevZYrj3O"
}
}
A list of authors and a list of posts. First of all I want to get the Author where the userUid equals my current user's uid. Then I want to get the posts where the authorId is the one provided to the query.
But I have no idea how to do this. Any help would be appreciated! I'm using AngularFire if that makes a difference.

Firebase is a NoSQL data store. It's a JSON hierarchy and does not have SQL queries in the traditional sense (these aren't really compatible with lightning-fast real-time ops; they tend to be slow and expensive). There are plans for some map reduce style functionality (merged views and tools to assist with this) but your primary weapon at present is proper data structure.
First of all, let's tackle the tree hierarchy vs denormalized data. Here are a few things you should denormalize:
- lists you want to be able to iterate quickly (a list of user names without having to download every message that user ever wrote or all the other meta info about a user)
- large data sets that you view portions of, such as a list of rooms/groups a user belongs to (you should be able to fetch the list of rooms for a given user without downloading all groups/rooms in the system, so put the index one place, the master room data somewhere else)
- anything with more than 1,000 records (keep it lean for speed)
- children under a path that contain 1..n (i.e. possibly infinite) records (for example, chat messages kept apart from the chat room metadata, so that you can fetch info about the chat room without grabbing all messages)
Here are a few things it may not make sense to denormalize:
- data you always fetch in toto and never iterate (if you always use .child(...).on('value', ...) to fetch some record and you display everything in that record, never referring to the parent list, there's no reason to optimize for iterability)
- lists shorter than a hundred or so records that you always fetch as a whole (e.g. the list of groups a user belongs to might always be fetched with that user and would average 5-10 items; probably no reason to keep it split apart)
Fetching the author is as simple as just adding the id to the URL:
var userId = 123;
new Firebase('https://INSTANCE.firebaseio.com/users/'+userId);
To fetch a list of posts belonging to a certain user, either maintain an index of that user's posts:
/posts/$post_id/...
/my_posts/$user_id/$post_id/true
var fb = new Firebase('https://INSTANCE.firebaseio.com');
fb.child('/my_posts/'+userId).on('child_added', function(indexSnap) {
fb.child('posts/'+indexSnap.name()).once('value', function(dataSnap) {
console.log('fetched post', indexSnap.name(), dataSnap.val());
});
});
A tool like Firebase.util can assist with normalizing data that has been split for storage until Firebase's views and advanced querying utils are released:
/posts/$post_id/...
/my_posts/$user_id/$post_id/true
var fb = new Firebase('https://INSTANCE.firebaseio.com');
var ref = Firebase.util.intersection( fb.child('my_posts/'+userId), fb.child('posts') );
ref.on('child_added', function(snap) {
console.log('fetched post', snap.name(), snap.val());
});
Or simply store the posts by user id (depending on your use case for how that data is fetched later):
/posts/$user_id/$post_id/...
new Firebase('https://INSTANCE.firebaseio.com/posts/'+userId).on('child_added', function(snap) {
console.log('fetched post', snap.name(), snap.val());
});
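The index approach can be tried out on a plain JSON tree without a Firebase connection. The sketch below is only an illustration: the IDs (a1, p1, ...) and the helpers findAuthorId and postsForAuthor are made up, and the second author and extra posts are invented sample data.

```javascript
// Hypothetical data, shaped like the JSON tree in the question plus an index.
const db = {
  authors: {
    a1: { userUid: 'simplelogin:7' },
    a2: { userUid: 'simplelogin:8' },
  },
  posts: {
    p1: { title: 'My first blog', authorId: 'a1' },
    p2: { title: 'Another post', authorId: 'a2' },
    p3: { title: 'Second post', authorId: 'a1' },
  },
  // Index in the /my_posts/$user_id/$post_id/true form described above.
  my_posts: { a1: { p1: true, p3: true }, a2: { p2: true } },
};

// Without a uid-keyed path, finding the author means scanning the list.
function findAuthorId(db, userUid) {
  return Object.keys(db.authors).find(id => db.authors[id].userUid === userUid);
}

// With the index, only the listed posts are looked up by key.
function postsForAuthor(db, authorId) {
  return Object.keys(db.my_posts[authorId] || {}).map(postId => ({
    id: postId,
    ...db.posts[postId],
  }));
}

const authorId = findAuthorId(db, 'simplelogin:7');
console.log(authorId, postsForAuthor(db, authorId).map(p => p.title));
```

Keying authors by userUid in the first place would remove the scan in findAuthorId, which is the same "add the id to the URL" idea used for fetching the author.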

Favoriting system on Appengine

I have the following model structure
from google.appengine.ext import db
class User(db.Model):
    nickname = db.StringProperty(required=True)
    fullname = db.StringProperty(required=True)
class Article(db.Model):
    title = db.StringProperty(required=True)
    body = db.StringProperty(required=True)
    author = db.ReferenceProperty(User, required=True)
class Favorite(db.Model):
    who = db.ReferenceProperty(User, required=True)
    what = db.ReferenceProperty(Article, required=True)
I'd like to display 10 last articles according to this pattern: article.title, article.body, article.author(nickname), info if this article has been already favorited by the signed in user.
I have added a function which I use to get the authors of these articles using only one query (it is described here)
But I don't know what to do with the favorites (I'd like to know which of the displayed articles have been favorited by me using less than 10 queries (I want to display 10 articles)). Is it possible?

You can actually do this with an amortized cost of 0 queries if you denormalize your data more! Add a favorites property to Authors which stores a list of keys of articles which the user has favorited. Then you can determine if the article is the user's favorite by simply checking this list.
If you retrieve this list of favorites when the user first logs in and just store it in your user's session data (and update it when the user adds/removes a favorite), then you won't have to query the datastore to check to see if an item is a favorite.
Suggested update to the Authors model:
class Authors(db.Model):  # I think this would be better named "User"
    # same properties you already had ...
    favorites = db.ListProperty(db.Key, required=True, default=[])
When the user logs in, just cache their list of favorites in your session data:
session['favs'] = user.favorites
Then when you show the latest articles, you can check if they are a favorite just by seeing if each article's key is in the favorites list you cached already (or you could dynamically query the favorites list but there is really no need to).
favs = session['favs']
articles = get_ten_latest_articles()
for article in articles:
    if article.key() in favs:
        # ...

I think there is one more solution.
Let's add 'auto increment' fields to the User and Article class.
Then, when we add an entry to the Favorite class, we also give it a key name built from the auto-increment values of the signed-in user and the article, like this: 'UserId'+id_of_the_user+'ArticleId'+id_of_an_article.
Then, when it comes to display, we can easily predict the key names of the favorites and use Favorite.get_by_key_name(key_names).
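The key-name scheme itself is language-independent. A minimal JavaScript sketch of building the predicted key names follows; favoriteKeyName and favoriteKeyNames are hypothetical helpers, and the numeric IDs are made-up sample values.

```javascript
// Hypothetical helper mirroring the 'UserId'+id+'ArticleId'+id scheme above.
function favoriteKeyName(userId, articleId) {
  return 'UserId' + userId + 'ArticleId' + articleId;
}

// Key names to look up for one user and the articles about to be displayed.
function favoriteKeyNames(userId, articleIds) {
  return articleIds.map(articleId => favoriteKeyName(userId, articleId));
}

const keyNames = favoriteKeyNames(7, [101, 102, 103]);
console.log(keyNames);
```

The resulting list is what would be passed to a single batch lookup, so ten displayed articles cost one fetch rather than ten queries.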

An alternative solution to dound's is to store the publication date of the favorited article on the Favorite entry. Then, simply sort by that when querying.
