Optimizing Firestore queries for our content app - database

We are building a content app using Firestore.
The basic requirement is that there is one master collection, let's say 'content'. The number of documents could run into 1000s.
content1, content2, content3 ... content9999
We want to serve our users with content from this collection, making sure they don't see the same content twice, and every time they are in the app there's new content for them.
At the same time, we don't want the same sequence of our content to be served to each user. Some randomisation would be good.
user1: content9, content123, content17, content33, content902 .. and so on
user2: content854, content79, content190, content567 ... and so on
I have been racking my brain over how to achieve this without duplicating the master collection. Duplicating it for each user would do the job, but it would be very expensive.
Also, how can we write cost-effective, performance-optimised queries while keeping the sequence of these content pieces randomised?

Here is my suggestion. Please view it as pseudo-code as I did not run it.
If the content document ids are not predictable
You have to store and maintain which user has seen which content, for example in a collection such as /seen/uid_contentId, or (as in the code below) in one document per user at /userSeen/{uid}.
There is a clever way to get a random document from a collection: start a query at a random position and take the first result. For this you need to store the size of the collection, perhaps as a document in another collection. So here is how you could do it:
// Assumes `firestore`, `uid`, `arrayUnion` and `contentCollectionSize` are in scope, and that
// each content document stores a numeric `index` field (0 .. contentCollectionSize - 1) to order by.
const seenSnapshot = await firestore.doc(`/userSeen/${uid}`).get(); // do it only once
const alreadySeen = seenSnapshot.exists ? seenSnapshot.data().contents : [];
async function getContent(uid) {
  for (let trials = 0; trials < 10; trials++) { // limit the cost
    const startAt = Math.floor(Math.random() * contentCollectionSize);
    const snapshot = await firestore.collection("contents")
      .orderBy("index").startAt(startAt).limit(1).get(); // startAt needs an orderBy
    const document = snapshot.empty ? null : snapshot.docs[0]; // a random content
    if (document && !alreadySeen.includes(document.id)) {
      alreadySeen.push(document.id);
      // merge: true keeps the ids stored earlier instead of overwriting the document
      await firestore.doc(`/userSeen/${uid}`).set({contents: arrayUnion(document.id)}, {merge: true});
      return document;
    }
  }
  return null;
}
Here you may have to make several queries to Firestore (capped at 10 to limit the cost), because you are not able to compute the content document ids on the client side.
If the content document ids follow a simple pattern: 1, 2, 3, ...
To reduce cost and improve performance, you should store all the seen contents for each user in a single document (the limit is 1MB, which is more than 250,000 integers!). Then you download this document once per user, and check on the client side whether a random content was already seen.
// Assumes `firestore`, `uid`, `arrayUnion` and `contentCollectionSize` are in scope.
const seenSnapshot = await firestore.doc(`/userSeen/${uid}`).get(); // do it only once
const alreadySeen = seenSnapshot.exists ? seenSnapshot.data().contents : [];
async function getContent(uid) {
  let idx = Math.floor(Math.random() * contentCollectionSize); // integer start position
  for (let trials = 0; trials < contentCollectionSize; trials++) {
    idx = idx + 1 < contentCollectionSize ? idx + 1 : 0;
    if (alreadySeen.includes(idx)) continue; // this shortcut reduces the number of Firestore queries
    const document = await firestore.doc(`/contents/${idx}`).get();
    if (document.exists) {
      alreadySeen.push(idx);
      // merge: true keeps the ids stored earlier instead of overwriting the document
      await firestore.doc(`/userSeen/${uid}`).set({contents: arrayUnion(idx)}, {merge: true}); // mark it as seen
      return document;
    }
  }
  return null;
}
As you can see, this is much cheaper if you use predictable document ids for your content. But perhaps someone will have a better idea.
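The client-side part of the second approach (choosing an index the user has not seen yet, before touching Firestore) can be sketched on its own. This is only an illustration: `pickUnseenIndex` and its injectable `rand` parameter are hypothetical names, not part of any Firebase API.

```javascript
// Hypothetical helper: pick an index in [0, size) that is not in `seen`.
// `rand` defaults to Math.random and is injectable so the function can be tested.
function pickUnseenIndex(size, seen, rand = Math.random) {
  const seenSet = new Set(seen); // O(1) lookups instead of Array.includes
  const start = Math.floor(rand() * size); // random integer start position
  for (let offset = 0; offset < size; offset++) {
    const idx = (start + offset) % size; // scan forward, wrapping around
    if (!seenSet.has(idx)) return idx;
  }
  return null; // every index has been seen
}
```

Scanning forward from a random start is cheap but not perfectly uniform: an unseen index that follows a long run of seen ones is picked more often. For a content feed this bias is probably acceptable.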

I have another idea. You could generate "scalars" of content :D
Create another collection: scalars.
Give each document in it a field of type array.
Write a function which walks through the content collection and generates sets of content items, either randomly or taking into account other attributes like popularity, demographics, or user behaviour.
Generate, say, 1000 sets of content items in the scalars collection, and do this once a month, for example.
You can even measure the effectiveness of each scalar at bringing users back, and promote those that are more attractive.
Once you have a collection of scalars containing sets of content items, you can assign each user to a scalar and present content items accordingly.
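A rough sketch of the idea above, kept free of Firestore calls so it can run anywhere. The function names, the Fisher-Yates shuffle and the simple string hash used to assign a user to a set are all assumptions made for illustration; the answer does not prescribe any of them.

```javascript
// Return a shuffled copy of `items` (Fisher-Yates). `rand` is injectable for testing.
function shuffle(items, rand = Math.random) {
  const result = items.slice();
  for (let i = result.length - 1; i > 0; i--) {
    const j = Math.floor(rand() * (i + 1));
    [result[i], result[j]] = [result[j], result[i]];
  }
  return result;
}

// Build `setCount` independently shuffled orderings of the content ids.
// Each ordering would be stored as the array field of one "scalar" document.
function generateSets(contentIds, setCount, rand = Math.random) {
  return Array.from({ length: setCount }, () => shuffle(contentIds, rand));
}

// Deterministically map a user id to one of the sets with a simple string hash.
function setIndexForUser(uid, setCount) {
  let hash = 0;
  for (const ch of uid) {
    hash = (hash * 31 + ch.codePointAt(0)) % 2147483647;
  }
  return hash % setCount;
}
```

With this layout a user only needs their assigned set and a position within it, so serving the next unseen item is a single document read plus a counter.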

Related

firebase firestore how to read from a subcollection [duplicate]

I thought I read that you can query subcollections with the new Firebase Firestore, but I don't see any examples. For example I have my Firestore setup in the following way:
Dances [collection]
danceName
Songs [collection]
songName
How would I be able to query "Find all dances where songName == 'X'"
Update 2019-05-07
Today we released collection group queries, and these allow you to query across subcollections.
So, for example in the web SDK:
db.collectionGroup('Songs')
.where('songName', '==', 'X')
.get()
This would match documents in any collection where the last part of the collection path is 'Songs'.
Your original question was about finding dances where songName == 'X', and this still isn't possible directly, however, for each Song that matched you can load its parent.
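To make the "load its parent" step concrete: in the web SDK each matched song snapshot exposes `doc.ref.parent.parent`, which is the reference of the containing dance document. The sketch below shows the same walk on plain path strings so that it runs without Firebase; the helper names are hypothetical.

```javascript
// Given a document path such as "Dances/d1/Songs/s1", return the path of the
// document that owns its subcollection ("Dances/d1"), or null for top-level docs.
function parentDocumentPath(documentPath) {
  const segments = documentPath.split("/");
  // Document paths alternate collection/doc, so they have an even segment count.
  if (segments.length < 4 || segments.length % 2 !== 0) return null;
  return segments.slice(0, -2).join("/");
}

// Several songs can belong to the same dance, so de-duplicate the parents.
function uniqueParentPaths(songPaths) {
  return [...new Set(songPaths.map(parentDocumentPath).filter(Boolean))];
}
```

Each resulting parent path still costs one document read, so this works best when the number of matching songs is small.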
Original answer
This is a feature which does not yet exist. It's called a "collection group query" and would allow you to query all songs regardless of which dance contains them. This is something we intend to support, but we don't have a concrete timeline for when it's coming.
The alternative structure at this point is to make songs a top-level collection and make which dance the song is a part of a property of the song.
UPDATE
Now Firestore supports array-contains
Having these documents
{danceName: 'Danca name 1', songName: ['Title1','Title2']}
{danceName: 'Danca name 2', songName: ['Title3']}
do it this way
collection("Dances")
.where("songName", "array-contains", "Title1")
.get()...
@Nelson.b.austin Since Firestore does not have that yet, I suggest using a flat structure, meaning:
Dances = {
danceName: 'Dance name 1',
songName_Title1: true,
songName_Title2: true,
songName_Title3: false
}
Having it in that way, you can get it done:
var songTitle = 'Title1';
var dances = db.collection("Dances");
var query = dances.where("songName_"+songTitle, "==", true);
I hope this helps.
UPDATE 2019
Firestore has released Collection Group Queries. See Gil's answer above or the official Collection Group Query documentation.
Previous Answer
As stated by Gil Gilbert, it seems that collection group queries are currently in the works. In the meantime it is probably better to use root-level collections and just link between these collections using the document UIDs.
For those who don't already know, Jeff Delaney has some incredible guides and resources for anyone working with Firebase (and Angular) on AngularFirebase.
Firestore NoSQL Relational Data Modeling - Here he breaks down the basics of NoSQL and Firestore DB structuring
Advanced Data Modeling With Firestore by Example - These are more advanced techniques to keep in the back of your mind. A great read for those wanting to take their Firestore skills to the next level
What if you store songs as an object instead of as a collection? Each dance is a document, with songs as a field of type Object (not a collection):
{
danceName: "My Dance",
songs: {
"aNameOfASong": true,
"aNameOfAnotherSong": true,
}
}
then you could query for all dances with aNameOfASong:
db.collection('Dances')
.where('songs.aNameOfASong', '==', true)
.get()
.then(function(querySnapshot) {
querySnapshot.forEach(function(doc) {
console.log(doc.id, " => ", doc.data());
});
})
.catch(function(error) {
console.log("Error getting documents: ", error);
});
NEW UPDATE July 8, 2019:
db.collectionGroup('Songs')
.where('songName', isEqualTo:'X')
.get()
I have found a solution.
Please check this.
var museums = Firestore.instance.collectionGroup('Songs').where('songName', isEqualTo: "X");
museums.getDocuments().then((querySnapshot) {
setState(() {
songCounts= querySnapshot.documents.length.toString();
});
});
Then open your Cloud Firestore database at console.firebase.google.com, where you can see the Data, Rules, Indexes and Usage tabs.
Finally, you should create an index in the Indexes tab.
Fill in the collection ID and the field to query there.
Then select the collection group option.
Enjoy it. Thanks
You can always search like this:
this.key$ = new BehaviorSubject(null);
return this.key$.switchMap(key =>
this.angFirestore
.collection("dances").doc("danceName").collection("songs", ref =>
ref
.where("songName", "==", key)
)
.snapshotChanges()
.map(actions => {
if (actions.toString()) {
return actions.map(a => {
const data = a.payload.doc.data() as Dance;
const id = a.payload.doc.id;
return { id, ...data };
});
} else {
return false;
}
})
);
Query limitations
Cloud Firestore does not support the following types of queries:
Queries with range filters on different fields.
Single queries across multiple collections or subcollections. Each query runs against a single collection of documents. For more information about how your data structure affects your queries, see Choose a Data Structure.
Logical OR queries. In this case, you should create a separate query for each OR condition and merge the query results in your app.
Queries with a != clause. In this case, you should split the query into a greater-than query and a less-than query. For example, although the query clause where("age", "!=", "30") is not supported, you can get the same result set by combining two queries, one with the clause where("age", "<", "30") and one with the clause where("age", ">", 30).
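The quoted advice to "merge the query results in your app" for OR and != conditions could look roughly like this. It is a minimal sketch that assumes each result has already been reduced to a plain object with an `id` property; `mergeById` is a hypothetical helper, not an SDK function.

```javascript
// Merge the results of several queries into one list, keeping the first
// occurrence of each document id so that overlapping results are not duplicated.
function mergeById(...resultSets) {
  const byId = new Map();
  for (const docs of resultSets) {
    for (const doc of docs) {
      if (!byId.has(doc.id)) byId.set(doc.id, doc);
    }
  }
  return [...byId.values()];
}
```

For the != example in the quote, the two inputs would be the results of the less-than query and the greater-than query; they cannot overlap, but the same helper also covers OR conditions where they can.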
I'm working with Observables here and the AngularFire wrapper but here's how I managed to do that.
It's kind of crazy, I'm still learning about observables and I possibly overdid it. But it was a nice exercise.
Some explanation (not an RxJS expert):
songId$ is an observable that will emit ids
dance$ is an observable that reads that id and then gets only the first value.
it then queries the collectionGroup of all songs to find all instances of it.
Based on the instances it traverses to the parent Dances and get their ids.
Now that we have all the Dance ids we need to query them to get their data. But I wanted it to perform well, so instead of querying one by one I batch them in buckets of 10 (the maximum Firestore would accept for an in query).
We end up with N buckets and need to do N queries on firestore to get their values.
once we do the queries on firestore we still need to actually parse the data from that.
and finally we can merge all the query results to get a single array with all the Dances in it.
type Song = {id: string, name: string};
type Dance = {id: string, name: string, songs: Song[]};
const songId$: Observable<Song> = new Observable();
const dance$ = songId$.pipe(
take(1), // Only take 1 song name
switchMap( v =>
// Query across collectionGroup to get all instances.
this.db.collectionGroup('songs', ref =>
ref.where('id', '==', v.id)).get()
),
switchMap( v => {
// map the Song to the parent Dance, return the Dance ids
const obs: string[] = [];
v.docs.forEach(docRef => {
// We invoke parent twice to go from doc->collection->doc
obs.push(docRef.ref.parent.parent.id);
});
// Because we return an array here this one emit becomes N
return obs;
}),
// Firebase IN support up to 10 values so we partition the data to query the Dances
bufferCount(10),
mergeMap( v => { // query every partition in parallel
return this.db.collection('dances', ref => {
return ref.where( firebase.firestore.FieldPath.documentId(), 'in', v);
}).get();
}),
switchMap( v => {
// Almost there now just need to extract the data from the QuerySnapshots
const obs: Dance[] = [];
v.docs.forEach(docRef => {
obs.push({
...docRef.data(),
id: docRef.id
} as Dance);
});
return of(obs);
}),
// And finally we reduce the docs fetched into a single array.
reduce((acc, value) => acc.concat(value), []),
);
const parentDances = await dance$.toPromise();
I copy pasted my code and changed the variable names to yours, not sure if there are any errors, but it worked fine for me. Let me know if you find any errors or can suggest a better way to test it with maybe some mock firestore.
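The bucketing step that `bufferCount(10)` performs in the pipeline above can also be written without RxJS. A minimal sketch, where `chunk` is a hypothetical helper and the default of 10 simply mirrors the limit mentioned in this answer:

```javascript
// Split `values` into consecutive buckets of at most `size` elements.
function chunk(values, size = 10) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError("size must be a positive integer");
  }
  const buckets = [];
  for (let i = 0; i < values.length; i += size) {
    buckets.push(values.slice(i, i + size));
  }
  return buckets;
}
```

Each bucket would then feed one `where(documentId(), 'in', bucket)` query, and the query results are concatenated afterwards.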
var songs = []
db.collection('Dances')
.where('songs.aNameOfASong', '==', true)
.get()
.then(function(querySnapshot) {
var songLength = querySnapshot.size
var i=0;
querySnapshot.forEach(function(doc) {
songs.push(doc.data())
i ++;
if(songLength===i){
console.log(songs);
}
console.log(doc.id, " => ", doc.data());
});
})
.catch(function(error) {
console.log("Error getting documents: ", error);
});
It could be better to use a flat data structure.
The docs specify the pros and cons of different data structures on this page.
Specifically about the limitations of structures with sub-collections:
You can't easily delete subcollections, or perform compound queries across subcollections.
Contrasted with the purported advantages of a flat data structure:
Root-level collections offer the most flexibility and scalability, along with powerful querying within each collection.

Get Firestore collection and sub-collection document data together

I have the following Firestore database structure in my Ionic 5 app.
Book(collection)
{bookID}(document with book fields)
Like (sub-collection)
{userID} (document name as user ID with fields)
The Book collection has documents, and each document has a Like sub-collection. The document names in the Like collection are the IDs of the users who liked the book.
I am trying to do a query to get the latest books and at the same time trying to get the document from Like sub-collection to check if I have liked it.
async getBook(coll) {
  const snap = await this.afs.collection('Book').ref
    .orderBy('createdDate', "desc")
    .limit(10).get();
  snap.docs.forEach(x => {
    coll.push({
      key: x.id,
      data: x.data(),
      like: this.getMyReaction(x.id) // a promise, resolved later by the async pipe
    });
  });
}
async getMyReaction(key) {
  // Read my document in the Like sub-collection of the given book
  const res = await this.afs.doc(`Book/${key}/Like/myUserID`).ref.get();
  if (res.exists) {
    return res.data();
  } else {
    return 'notFound';
  }
}
What I am doing here is calling the method getMyReaction() with each book ID and storing the promise in the like field. Later, I am reading the like value with async pipe in the HTML. This code is working perfectly but there is a little delay to get the like value as the promise is taking time to get resolved. Is there a solution to get sub-collection value at the same time I am getting the collection value?
Is there a solution to get sub-collection value at the same time I am getting the collection value?
Not without restructuring your data. Firestore queries can only consider documents in a single collection. The only exception to that is collection group queries, which let you consider documents among all collections with the exact same name. What you're doing right now to "join" these two collections is probably about as effective as you'll get.
The only way you can turn this into a single query is by having another collection with the data pre-merged from the other two collections. This is actually fairly common in NoSQL databases, and is referred to as denormalization. But it's entirely up to you to decide if that's right for your use case.
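To illustrate what "pre-merged" data means here, the sketch below builds, from plain objects, the kind of combined records a denormalized collection would hold for one user. The function name, the input shapes and the 'notFound' fallback (borrowed from the question) are assumptions; in practice these records would be written by whatever code creates books and likes.

```javascript
// Combine book records with one user's likes into single, ready-to-render entries.
// books: [{ id, data }]            likesByBookId: { [bookId]: { [userId]: likeData } }
function buildBookFeed(books, likesByBookId, userId) {
  return books.map(book => {
    const likes = likesByBookId[book.id] || {};
    return {
      key: book.id,
      data: book.data,
      like: likes[userId] || "notFound", // same fallback as in the question
    };
  });
}
```

The trade-off is the usual one for denormalization: reads become a single query, while every like or unlike has to update the merged copy as well.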

Firebase cloud function not updating record

Imagine this scenario:
You have a list with, let's say, 100 items, and a user favorites 12 items from the list.
The user now wants these 12 items displayed in a new list with up-to-date data (if the data changes in the original list/node).
What would be the best way to accomplish this without having to denormalize all of the data for each item?
Is there a way to do orderByChild().equalTo(multiple items)
(the multiple items being the favorited items)?
Currently, I am successfully displaying the favorites in a new list, but I am pushing all the data to the user's node with the favorites. The problem is that when I change the data, the copy in their favorites won't change.
Note: these changes are made manually in the db and cannot be made by the user (metadata that can potentially change).
UPDATE
I'm trying to achieve this now with cloud functions. I am trying to update the item but I can't seem to get it working.
Here is my code:
exports.itemUpdate = functions.database
.ref('/items/{itemId}')
.onUpdate((change, context) => {
const before = change.before.val(); // value before the change
const after = change.after.val(); // value after the change
console.log(after);
if (before.effects === after.effects) {
console.log('effects didnt change')
return null;
}
const ref = admin.database().ref('users')
.orderByChild('likedItems')
.equalTo(before.title);
return ref.update(after);
});
I'm not too sure what I am doing wrong here.
Cheers!
There isn't a way to match multiple items with orderByChild. Denormalising in NoSQL is fine; it's just about keeping the copies in sync, as you mention. I would recommend using a Cloud Function to help you keep things in sync.
Another option is to use Firestore instead of the Realtime Database, as it has better querying capabilities: you could store a user's id in the document and, using an array-contains filter, get all of the user's posts.
Below is the start of a Cloud Function with a Realtime Database trigger for an update of an item.
exports.itemUpdate = functions.database.ref('/items/{itemId}')
.onUpdate((snap, context) => {
// Query your users node for matching items and update them
});
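One way to fill in the body of that trigger is to compute a multi-path update and apply it in a single call, for example with `admin.database().ref().update(updates)`. The sketch below covers only the pure part that builds the update object. It assumes, hypothetically, that each user's favorites live under `users/{uid}/likedItems/{itemId}`; the question does not show the actual layout.

```javascript
// Build a multi-path update that copies the new item data into every user's
// favorites that contain the item. `users` is the value of the /users node.
function buildFanOutUpdates(users, itemId, after) {
  const updates = {};
  for (const [uid, user] of Object.entries(users || {})) {
    const liked = user && user.likedItems;
    if (liked && Object.prototype.hasOwnProperty.call(liked, itemId)) {
      updates[`users/${uid}/likedItems/${itemId}`] = after;
    }
  }
  return updates;
}
```

Reading the whole users node inside the trigger gets expensive as the user base grows; keeping a reverse index of which users liked which item would avoid that, at the cost of one more structure to maintain.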

How to Fetch a set of Specific Keys in Firebase?

Say I'd like to fetch only items that contains keys: "-Ju2-oZ8sJIES8_shkTv", "-Ju2-zGVMuX9tMGfySko", and "-Ju202XUwybotkDPloeo".
var items = new Firebase("https://hello-cambodia.firebaseio.com/items");
items.orderByKey().equalTo("-Ju2-gVQbXNgxMlojo-T").once('value', function(snap1){
items.orderByKey().equalTo("-Ju2-zGVMuX9tMGfySko").once('value', function(snap2){
items.orderByKey().equalTo("-Ju202XUwybotkDPloeo").once('value', function(snap3){
console.log(snap1.val());
console.log(snap2.val());
console.log(snap3.val());
})
})
});
I don't feel that this is the right way to fetch the items, especially when I have over 1000 keys to fetch.
If possible, I am really hoping for something where I can pass an array of keys,
like
var itemKeys = ["-Ju2-gVQbXNgxMlojo-T","-Ju2-zGVMuX9tMGfySko", "-Ju202XUwybotkDPloeo"];
var items = new Firebase("https://hello-cambodia.firebaseio.com/items");
items.orderByKey().equalTo(itemKeys).once('value', function(snap){
console.log(snap.val());
});
Any suggestions would be appreciated.
Thanks
Doing this:
items.orderByKey().equalTo("-Ju2-gVQbXNgxMlojo-T")
Gives exactly the same result as:
items.child("-Ju2-gVQbXNgxMlojo-T")
But the latter is not only more readable, it will also prevent the need for scanning indexes.
But what you have to answer is why you want to select these three items. Is it because they all have the same status? Because they fall into a specific date range? Because the user selected them in a list? As soon as you can identify the reason for selecting these three items, you can look to convert the selection into a query. E.g.
var recentItems = ref.orderByChild("createdTimestamp")
.startAt(Date.now() - 24*60*60*1000)
.endAt(Date.now());
recentItems.on('child_added'...
This query would give you the items of the past day, if you had a field with the timestamp.
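If the keys really are arbitrary (for example, picked by the user), the per-key lookup shown above can be issued for all keys in parallel and the results collected once every read has finished. A minimal sketch, in which `fetchItemsByKeys` and its `fetchOne` callback are hypothetical; with an SDK version whose `once()` returns a promise, `fetchOne` could be `key => items.child(key).once('value').then(snap => snap.val())`.

```javascript
// Fetch several items by key in parallel. `fetchOne(key)` must return a promise
// for the item's value, or null/undefined when the key does not exist.
function fetchItemsByKeys(fetchOne, keys) {
  return Promise.all(keys.map(key => fetchOne(key))).then(values => {
    const result = {};
    keys.forEach((key, i) => {
      if (values[i] != null) result[key] = values[i]; // skip missing keys
    });
    return result;
  });
}
```

This replaces the nested callbacks from the question with one flat call, and the reads no longer wait for each other.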
You can use Firebase child. For example,
var currFirebaseRoom = new Firebase(yourFirebaseURL)
var userRef = currFirebaseRoom.child('users');
Now you can access this child with
userRef.on('value', function(userSnapshot) {
//your code
});
You generally should not access things using the Firebase keys. Create a child called data and put all your values there, and then you can access them through that child reference.

Knockoutjs how to data-bind observable array members based on IDs

I'm not sure if the title explains what I need to achieve, but I can change it later if someone has a better suggestion.
I'm using KO to manage a whole bunch of data on the client side.
Here's the basic.
I have a list of training sessions
Each has a list of training session parts
Each training session parts are referencing items kept in other lists. For example, I have a list of activities (ex: biking, running, swimming, etc.)
Each activity is identified by an ID which is used in the training session parts to identify which activity was used for a particular session.
Now, all these list are stored as observable arrays, and each member of the lists are observables (I use KO.Mapping to map the JSON coming from the server)
When I display a training session in my UI, I want to display various information coming from various lists
Duration: 1h30
Activity: Biking
Process: Intervals
The only information I have in order to link the training session to its component is an ID which is fine. What I'm not sure is how to data-bind the name (text) of my activity to a <p> or <div> so that the name will change if I edit the activity (by using some functionality of the application).
The training session only has the ID to identify the activity, so I don’t know how to bind the name of the activity based on its ID.
Hopefully this makes sense and someone can help me figure it out. I found lots of info on how to bind to an observable array, but nothing addressing IDs and linked information.
The easiest way would probably be to make your own constructors and link the data by hand. You can use mapping if you really want to, but you'll basically have to do the same manual linking, only in a more verbose format.
This is the fiddle with the example implementation: http://jsfiddle.net/aKpS9/3/
The most important part of the code is the linking, you have to take care to create the activity objects only once, and use the same objects everywhere, as opposed to creating new activity objects for the parts.
var TrainingSession = function(rawData, actualActivities){
var self = this;
self.name = ko.observable(rawData.name);
self.parts = ko.observableArray(ko.utils.arrayMap(rawData.parts, function(rawPart){
return ko.utils.arrayFirst(actualActivities(), function(ac){
return ac.ID() == rawPart.ID;
})
}));
}
var Activity = function(rawData){
var self = this;
self.ID = ko.observable(rawData.ID);
self.name = ko.observable(rawData.name);
}
var MainVM = function(rawData){
var self = this;
//first create an array of all activities
self.activities = ko.observableArray(ko.utils.arrayMap(rawData.activities, function(rawAc){
return new Activity(rawAc);
}));
self.trainingSessions = ko.observableArray(ko.utils.arrayMap(rawData.trainingSessions, function(session){
return new TrainingSession(session, self.activities);
}));
}
