Parse Server, MongoDB - get "liked" state of an object

I am using Parse Server, which runs on MongoDB.
Let's say I have collections User and Comment and a join table of user and comment.
A user can like a comment, which creates a new record in the join table.
Specifically, in Parse Server a join table can be defined using a 'relation' field in the collection.
Now when I want to retrieve all comments, I also need to know whether each of them is liked by the current user. How can I do this without doing additional queries?
You might say I could create an array field likers in the Comment table and use $elemMatch, but that doesn't seem like a good idea, because a comment can potentially have thousands of likes.
My idea, though I hope there is a better solution:
I could create an array field someLikers, a relation (join table) field allLikers, and a number field likesCount in the Comment table. Then I would put the first 100 likers in both someLikers and allLikers, and any additional likers only in allLikers. I would always increment likesCount.
Then when querying a list of comments, I would use $elemMatch to tell me whether the current user is inside someLikers. When I got the comments back, I would check whether any of them have likesCount > 100 AND $elemMatch returned null. If so, I would have to run another query on the join table, looking for those comments and checking (querying by) whether they are liked by the current user.
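For illustration, a minimal sketch of that idea in the mongo shell (someLikers and likesCount are the fields described above; the post field is an assumption):
// First pass: the $elemMatch projection returns someLikers only when it
// contains the current user.
var comments = db.Comment.find(
    { post : postId },
    { likesCount : 1, someLikers : { $elemMatch : { $eq : userId } } }
).toArray();
// Comments with more than 100 likers where the projection came back empty
// may still be liked via allLikers, so only they need the follow-up query.
var unresolved = comments.filter(function (c) {
    return c.likesCount > 100 && !c.someLikers;
});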
Is there a better option?
Thanks!

I'd advise against directly accessing MongoDB unless you absolutely have to; after all, the way collections and relations are built is an implementation detail of Parse and in theory could change in the future, breaking your code.
Even though you want to avoid multiple queries, I suggest doing just that (depending on your platform you might even be able to run the two Parse queries in parallel):
The first one is the query on Comment for getting all comments you want to display; assuming you have some kind of Post for which comments can be written, the query would find all comments referencing the current post.
The second query is again on Comment, but this time
constrained to the comments retrieved in the first query, e.g. containedIn("objectId", arrayOfCommentIDs),
and constrained to the comments having the current user in their likers relation, e.g. equalTo("likers", currentUser).
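With the Parse JavaScript SDK, the two queries might look like this (a sketch; the post pointer field is an assumption, likers is the relation field from above):
const commentQuery = new Parse.Query("Comment");
commentQuery.equalTo("post", currentPost);
const comments = await commentQuery.find();

const likedQuery = new Parse.Query("Comment");
likedQuery.containedIn("objectId", comments.map(c => c.id));
likedQuery.equalTo("likers", Parse.User.current());
const liked = await likedQuery.find();

// A comment is liked by the current user iff its id is in likedIds.
const likedIds = new Set(liked.map(c => c.id));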

Well, a join collection is not really a NoSQL way of thinking ;-)
I don't know Parse Server, so the below is based on pure MongoDB.
What I would do is, in the Comment document, use an array of ObjectIds, one for each user who likes the comment.
Sample document layout:
{
    "_id" : ObjectId(""),
    "name" : "Comment X",
    "liked" : [
        ObjectId(""),
        ....
    ]
}
Then use an aggregation to get the data. I assume you have the _id of the comment and you know the _id of the user.
The following aggregation returns the comment with a like count and a boolean which indicates whether the user liked the comment.
db.Comment.aggregate([
    {
        $match : {
            _id : ObjectId("your commentId")
        }
    },
    {
        $project : {
            _id : 1,
            name : 1,
            number_of_likes : { $size : "$liked" },
            user_liked : {
                $gt : [
                    {
                        $size : {
                            $filter : {
                                input : "$liked",
                                as : "like",
                                cond : { $eq : ["$$like", ObjectId("your userId")] }
                            }
                        }
                    },
                    0
                ]
            }
        }
    }
]);
This returns:
{
    "_id" : ObjectId(""),
    "name" : "Comment X",
    "number_of_likes" : NumberInt(7),
    "user_liked" : true
}
Hope this is what you're after.

Related

Performance issue when querying time-based objects

I'm currently working on a MongoDB collection containing documents that look like the following:
{ startTime : Date, endTime: Date, source: String, metaData: {}}
And my use case is to retrieve all documents that fall within a queried time frame, so my query looks like this:
db.myCollection.find(
    {
        $and : [
            { "source" : aSource },
            { "startTime" : { $lte : timeFrame.end } },
            { "endTime" : { $gte : timeFrame.start } }
        ]
    }
).sort({ "startTime" : 1 })
With an index defined as follows:
db.myCollection.createIndex( { "source" : 1, "startTime": 1, "endTime": 1 } );
The problem is that queries are very slow (several hundred ms on a local database) as soon as the number of documents per source increases.
Using mongo explain shows me that I'm using this index efficiently (only found documents are scanned, otherwise only index access is made), so the slowness seems to come from the index scan itself, as this query needs to go over a large portion of the index.
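For reference, that check looks like this; "executionStats" reports how many index keys and documents were examined:
db.myCollection.find(
    {
        "source" : aSource,
        "startTime" : { $lte : timeFrame.end },
        "endTime" : { $gte : timeFrame.start }
    }
).sort({ "startTime" : 1 }).explain("executionStats")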
In addition to that, such an index gets huge pretty quickly and therefore seems inefficient.
Is there anything I'm missing that could help make those queries faster, or am I condemned to retrieving all the documents belonging to a given source as the best way to go? I see that Mongo now provides some time-series features; could those bring any help with regard to my problem?

Why is my MongoDB aggregation query so slow

I have several IDs (usually 2 or 3) of users whom I need to fetch from the database. Thing is, I also need to know the distance from a certain point. Problem is, my collection has 1,000,000 documents (users) in it, and it takes upwards of 30 seconds to fetch the users.
Why is this happening? When I just use the $in operator for the _id it works fine and returns everything in under 200ms, and when I just use the $geoNear operator it also works fine, but when I use the two together everything slows down insanely. What do I do? Again, all I need is a few users with the IDs from the userIds array and their distance from a certain point (user.location).
EDIT: Also wanted to mention that when I use $nin instead of $in the query performs perfectly as well. Only $in causes the problem when combined with $geoNear.
const user = await User.findById('logged in users id');
const userIds = ['id1', 'id2', 'id3'];
[
    {
        $geoNear: {
            near: user.location,
            distanceField: 'distance',
            query: {
                _id: { $in: userIds }
            }
        }
    }
]
I found a workaround: I just query by the ID field, and later I use a library to determine the distance of the returned docs from the central point.
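For illustration, a minimal sketch of that workaround without any particular library (assumes GeoJSON [lng, lat] coordinates; the helper name is mine):
const users = await User.find({ _id: { $in: userIds } });

// Haversine distance in meters between two [lng, lat] points
function distanceMeters([lng1, lat1], [lng2, lat2]) {
    const R = 6371000;                      // Earth radius in meters
    const rad = (d) => (d * Math.PI) / 180;
    const dLat = rad(lat2 - lat1);
    const dLng = rad(lng2 - lng1);
    const a = Math.sin(dLat / 2) ** 2 +
              Math.cos(rad(lat1)) * Math.cos(rad(lat2)) * Math.sin(dLng / 2) ** 2;
    return 2 * R * Math.asin(Math.sqrt(a));
}

const withDistance = users.map((u) => ({
    ...u.toObject(),
    distance: distanceMeters(user.location.coordinates, u.location.coordinates)
}));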
Indexing your data could be a solution to your problem. Without indexing, MongoDB has to scan through all documents.

How to store translations in nosql DB with minimal duplication?

I got this schema in DynamoDB
{
    "timestamp" : "",
    "fruit" : {
        "name" : "orange",
        "translations" : [
            { "en-GB" : "orange" },
            { "sv-SE" : "apelsin" },
            ....
        ]
    }
}
I need to store translations for objects in a DynamoDB database and be able to query them efficiently. E.g. my query has to be something like "give me all objects where the translations array contains "
The problem is, is this a really dumb idea? There are 6,500 languages out there, and this means I would be forcing all entries to each contain an array with thousands of properties, 99% of them empty string values. What's a better approach?
Thanks,
Unless you're willing to let DynamoDB do a table scan to get your results, I think you're using the wrong tool. Consider streaming your transactions to AWS Elasticsearch via something like Firehose. Firehose will give you a lot of nice-to-haves and can help you rotate transaction indexes. Elasticsearch should be able to store that structure and run your query.
If you don't go that route, then at least consider dropping the language code from your structure if you're not actually using it. Just make an array of the unique spellings of your fruit. This is the kind of query I might try to do with multiple queries instead of a single one: go from the spelling of the fruit name to a fruit UUID which you can then query against.
I would rather save it as:
{
    "primaryKey" : "orange",
    "SecondaryKey" : "en-GB",
    "timestamp" : "",
    "Metadata" : {
        "name" : "orange"
    }
}
And create a secondary index with SecondaryKey as PK and primaryKey as SK.
By doing this you can query:
Get me orange in en-GB.
Which keys exist in en-GB.
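With the AWS SDK for JavaScript (v2), those two lookups might look like this (a sketch; the table name and index name are assumptions):
const AWS = require("aws-sdk");
const ddb = new AWS.DynamoDB.DocumentClient();

// "Get me orange in en-GB" straight from the base table
const one = await ddb.get({
    TableName : "Fruits",
    Key : { primaryKey : "orange", SecondaryKey : "en-GB" }
}).promise();

// "Which keys exist in en-GB" via the secondary index
const all = await ddb.query({
    TableName : "Fruits",
    IndexName : "SecondaryKey-primaryKey-index",
    KeyConditionExpression : "SecondaryKey = :lang",
    ExpressionAttributeValues : { ":lang" : "en-GB" }
}).promise();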
If you are updating multiple items at once, you can create one object like this:
{
    "KeyName" : "orange",
    "SecondaryKey" : "master",
    "timestamp" : "",
    "fruit" : {
        "name" : "orange",
        "translations" : [
            { "en-GB" : "orange" },
            { "sv-SE" : "apelsin" },
            ....
        ]
    }
}
And create a Lambda function that denormalises the above object and creates the individual entries in DynamoDB. But you will also have to take care of deleting entries whose language is no longer present in the new object.
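A hypothetical sketch of such a Lambda, triggered by DynamoDB Streams (the table name and stream wiring are assumptions; the deletion case noted above is left out):
const AWS = require("aws-sdk");
const ddb = new AWS.DynamoDB.DocumentClient();

exports.handler = async (event) => {
    for (const record of event.Records) {
        if (record.eventName === "REMOVE") continue;
        const item = AWS.DynamoDB.Converter.unmarshall(record.dynamodb.NewImage);
        if (item.SecondaryKey !== "master") continue;
        // Fan the master object out into one item per language
        for (const translation of item.fruit.translations) {
            const [lang, text] = Object.entries(translation)[0];
            await ddb.put({
                TableName : "Fruits",
                Item : {
                    primaryKey : item.KeyName,
                    SecondaryKey : lang,
                    timestamp : item.timestamp,
                    Metadata : { name : text }
                }
            }).promise();
        }
    }
};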

Working with nested single queries in Firestore

Recently I moved my data model from Firebase to Firestore. All my code is working, but I'm having some ugly trouble with the nested queries I use to retrieve some data. Here is the point:
Right now my data model for this part looks like this (yes, another followers/feed example):
{
    "Users" : { // Collection
        "UserId1" : { // Document
            "Feed" : { // Subcollection of IDs of posts from users this user follows
                "PostId1" : { // Document
                    "timeStamp" : "SomeDate"
                },
                "PostId2" : {
                    "timeStamp" : "SomeDate"
                },
                "PostId3" : {
                    "timeStamp" : "SomeDate"
                }
            }
            // Some data
        }
    },
    "Posts" : { // Collection
        "PostId1" : { // Document
            "Comments" : { // Subcollection
                "commentId" : { // Document
                    "authorId" : "UserId1"
                    // commentsData
                }
            },
            "Likes" : { // Subcollection
                "UserId1" : { // Document
                    "liked" : true
                }
            }
        }
    }
}
My problem is that to retrieve the posts of a user's feed I have to query in the following way:
Get the last X documents ordered by timeStamp from my Feed:
feedCol(userId).orderBy(CREATION_DATE, Query.Direction.DESCENDING).limit(limit)
After that I have to do a single query for each post retrieved in the list: workoutPostCol.document(postId)
Now I have the data of each post, but I want to show the username, picture, points, etc. of the author, which is in a different document, so again I have to do another single query for each authorId retrieved in the list of posts: userSocial(userId).document(toId)
Finally, and not less important, I need to know whether my current user has already liked each post, so I need to do a single query per post (again) and check whether my userId is inside posts/likes/{userId}.
Right now everything is working, but considering that the price of Firestore depends on the number of database calls, and that this doesn't make my queries any simpler, I don't know whether my data model is just not good for this kind of database and I should move to plain SQL, or simply back to Firebase again.
Note: I know that EVERYTHING would be a lot easier if I moved these subcollections of likes, feed, etc. to array lists inside my user or post documents, but a document is limited to 1MB, and if these grow too much, it will break in the future. On the other hand, Firestore doesn't allow subcollection queries (yet) or an OR clause using multiple whereEqualTo.
I have read a lot of posts from users who have problems finding a simple way to store this kind of ID relationship to make joins and queries over their collections; using array lists would be awesome, but the 1MB limit restricts them too much.
Hope that someone will be able to clarify this, or at least teach me something new; maybe my model is just crap and there is a simpler and easier way to do this? Or maybe my model is just not possible in a non-SQL database.
Not 100% sure if this solves the problem entirely, since there may be edge cases in your usage. But after five minutes of quick thinking, I feel like the following could solve your problem:
You can consider using a model similar to Instagram's. If my memory serves me well, what they use is an events-based collection. By events in this specific context I mean all actions the user takes. So a comment is an event, a like is an event, etc.
This would make it so that you'll need three main collections in total.
users
-- userID1
---- userdata (profile pic, bio etc.)
---- postsByUser : [postID1, postID2]
---- followedBy : [userID2, ... ]
---- following : [userID2, ... ]
-- userID2
---- userdata (profile pic, bio etc.)
posts
-- postID1 (timestamp, so it's sortable)
---- contents
---- author : userID1
---- authorPic : authorPicUrl
---- authorPoints : 12345
---- taggedUsers : []
---- comments
------ comment1 : { copy of comment event }
---- likes : [userID1, userID2]
-- postID2 (timestamp)
---- contents
...
events
-- eventID1
---- type : comment
---- timestamp
---- byWhom : userID
---- toWhichPost : postID
---- contents : comment-text
-- eventID2
---- type : like
---- timestamp
---- byWhom : userID
---- toWhichPost : postID
For your user-bio page, you would query users.
For the news feed, you would query posts for all posts by userIDs your user is following in the last day (or any given timespan).
For the activity feed page (comments/likes etc.), you would query events that are relevant to your userID, limited to the last day (or any given timespan).
Finally, query the following days for posts/events as the user scrolls (or if there is no new activity in those days). Two of these queries are sketched below.
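A sketch with the Firestore web SDK (field names follow the model above; note that "in" queries accept at most 10 values and these combinations need composite indexes):
const db = firebase.firestore();

// News feed: recent posts by followed users
const feedSnap = await db.collection("posts")
    .where("author", "in", followingIds)
    .orderBy("timestamp", "desc")
    .limit(20)
    .get();

// Activity feed: events on the user's posts in the last day
const since = new Date(Date.now() - 24 * 60 * 60 * 1000);
const eventsSnap = await db.collection("events")
    .where("toWhichPost", "in", myPostIds)
    .where("timestamp", ">", since)
    .get();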
Again, this is merely a quick thought; I know the elders of SOF have a habit of crucifying these, so forgive me, fellow members of SOF, if this answer has flaws :)
Hope it helps, Francisco.
Good luck!

Check first value in array, insert new conditionally

I have an array of "states" in my documents:
{
    "_id" : ObjectId("53026de61e30e2525d000004"),
    "states" : [
        {
            "name" : "complete",
            "userId" : ObjectId("52f4576126cd0cbe2f000005"),
            "_id" : ObjectId("53026e16c054fc575d000004")
        },
        {
            "name" : "active",
            "userId" : ObjectId("52f4576126cd0cbe2f000005"),
            "_id" : ObjectId("53026de61e30e2525d000004")
        }
    ]
}
I just insert a new state onto the front of the array when there is a new state. The current workaround until Mongo 2.6 is released is described here: Can you have mongo $push prepend instead of append?
However, I do not want users to be able to save the same state twice in a row, i.e. if it's already complete you should not be able to add another 'complete' state. Is there a way that I can check the first element in the array and only insert the new state if it's not the same, in one query/update command to Mongo?
I say one query/update because Mongo does not support transactions, so I don't want to query for the first element in the array and then send another update statement, as that could cause problems if another state got inserted between my query and my update.
You can qualify your update statement with a query, for example:
db.mydb.states.update(
    { "states.name" : { $nin : ["newstate"] } },
    { $addToSet : { "states" : { "name" : "newstate" } } }
)
This will prevent updates from a user if the query part of the update matches no document. You can additionally add more fields to filter on in the query part.
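Note that $nin rejects the new state if it appears anywhere in the states array, not only in the first position. To compare against only the most recent state, a positional query on the first element expresses that; a sketch (docId is an assumption, and the $each/$position prepend requires MongoDB 2.6):
// Only prepend "newstate" if the FIRST element is not already "newstate".
db.states.update(
    { "_id" : docId, "states.0.name" : { $ne : "newstate" } },
    { $push : { "states" : { $each : [{ "name" : "newstate" }], $position : 0 } } }
)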
