One-to-many scalability and efficiency of getting the many - parse.com - database

I have a question about the scalability of getting the many from a one-to-many relationship in parse.com. Below is a diagram of what I am trying to do.
I have a Like object that has userWhoLiked and messageLiked attributes as pointers. My question is in regard to checking whether a User has already liked a message when loading a feed of Message objects. I was thinking that I could write some cloud code that would return both the Message itself and information about whether the User has already liked that object. However, I feel like this would be very inefficient: I would in essence have one query for all the Message objects (which will be n objects long), and then another query to find whether the User has already liked each Message object, going through all the Like objects n times and checking userWhoLiked and messageLiked against the logged-in user and the Message I am checking. I am going to use pointers to build the one-to-many relationship because the number of Like objects will be arbitrarily large.

Is the method I have described (using cloud code and then checking the Like objects) for determining whether a user has liked an object okay and scalable? Is there a better way, or any suggestions? I appreciate your time. Thanks.

Why not just do one query on Like objects where the userWhoLiked key is equal to the current user? This will return all of the objects which the current user has liked, and you can infer that any object not included has not been liked.
In case you haven't checked it out yet, I'd highly recommend the Parse Anypic tutorial, which has a very similar structure.
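For illustration, here's a minimal sketch of that single query in Go against the Parse REST API (the app ID, REST key, and user objectId are placeholders; in cloud code you'd express the same constraint with an equalTo on a Parse.Query):

package main

import (
    "fmt"
    "io"
    "net/http"
    "net/url"
)

func main() {
    // Pointer constraint: Likes whose userWhoLiked points at the current user.
    // The __type/className/objectId encoding is how Parse represents pointers
    // in REST queries; the objectId here is a placeholder.
    where := `{"userWhoLiked":{"__type":"Pointer","className":"_User","objectId":"xWMyZ4YEGZ"}}`

    req, _ := http.NewRequest("GET", "https://api.parse.com/1/classes/Like?where="+url.QueryEscape(where), nil)
    req.Header.Set("X-Parse-Application-Id", "YOUR_APP_ID") // placeholder
    req.Header.Set("X-Parse-REST-API-Key", "YOUR_REST_KEY") // placeholder

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    // One round trip returns every Like for this user; build a set of
    // messageLiked ids from the JSON and test each Message against it.
    body, _ := io.ReadAll(resp.Body)
    fmt.Println(string(body))
}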

Related

Watson Natural Language Understanding

I have customized a model, but when I make the connection as in the screenshot, I just get nothing back in the body; it only returns the language. I really want to get the relationship between the entities.
NLU by default will not find any entity/relation in the string used as a test. If your use case is going to deal with strings like that and you need to understand how the information is related, I would suggest also working with WKS (Watson Knowledge Studio), creating a model specific to your domain (legal, as it seems), and then using NLU with your new model from WKS to get the entities and relations.
Another suggestion: if you only want to test whether your code is right, you can use the NLU demo to see if something would be extracted from the input.
Also, never post your API Key.
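If it helps, here is a rough sketch in Go of what the analyze call looks like against the NLU v1 REST endpoint when you point both the entities and relations features at a custom WKS model (the service URL, version date, model ID, and sample text are all placeholders/assumptions; check them against your service credentials):

package main

import (
    "bytes"
    "fmt"
    "io"
    "net/http"
)

func main() {
    // Ask for entities and relations, each backed by the custom WKS model.
    payload := []byte(`{
        "text": "IBM acquired Red Hat in 2019.",
        "features": {
            "entities":  {"model": "my-wks-model-id"},
            "relations": {"model": "my-wks-model-id"}
        }
    }`)

    url := "https://gateway.watsonplatform.net/natural-language-understanding/api/v1/analyze?version=2018-11-16"
    req, _ := http.NewRequest("POST", url, bytes.NewReader(payload))
    req.Header.Set("Content-Type", "application/json")
    req.SetBasicAuth("apikey", "YOUR_API_KEY") // keep this key out of posts and repos

    resp, err := http.DefaultClient.Do(req)
    if err != nil {
        panic(err)
    }
    defer resp.Body.Close()

    body, _ := io.ReadAll(resp.Body)
    fmt.Println(string(body)) // entities and relations come back in the JSON body
}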

Q: Is the password hash in $user not replaced by '*'?

In my old CakePHP 2.x applications, the password hash was hidden by '*' when I retrieved the data from the User model. I am not a hundred percent sure, but I think this was done automatically by Cake.
Now, testing CakePHP 3.0, I am surprised to find the complete hash when retrieving data from the User model.
I have a few questions concerning this password-hash hiding:
Am I right that this was a feature in CakePHP 2?
Does anyone know why this feature was not implemented in CakePHP 3?
If I am wrong in assuming this was included in Cake, where is the place to implement this functionality in Cake 2 and Cake 3?
Thank you very much for your help.
Am I right that this was a feature in CakePHP 2?
Yes, in Cake 2.x this is part of the Debugger. The data itself, however, is not being touched; just some of the content is masked when outputting the data.
Does anyone know why this feature was not implemented in CakePHP 3?
It is still implemented, but it has been moved. The whole point of this masking was to avoid accidental exposure of datasource credentials (mainly in error messages/pages); it never really had anything to do with user model data. That is just a side effect for data that happens to use keys like password.
So in 3.x this functionality has been moved to \Cake\Database\Connection::__debugInfo():
https://github.com/cakephp/cakephp/pull/4542
This ensures that you'll still end up with masked credentials when, for example, debugging connection objects, be it explicitly or implicitly on error pages, while no longer obstructing the debugging of other data.
[...], where is the place to implement this functionality in [...] cake3?
This highly depends on your use case. If, for example, you wanted to have it masked in debug output, then you could implement it in an overridden __debugInfo() method in your user entity class, similar to how the Connection class does it:
https://github.com/cakephp/cakephp/blob/3.0.11/src/Database/Connection.php#L702
Of course this would only work for entities, not for non-hydrated data (array data).

Objects not saving using Objectify and GAE

I'm trying to save an object and verify that it is saved right after, and it doesn't seem to be working.
Here is my object:
import com.googlecode.objectify.annotation.Entity;
import com.googlecode.objectify.annotation.Id;
import java.util.ArrayList;

@Entity
public class PlayerGroup {
    @Id public String n;        // group name, e.g. "sharks"
    public ArrayList<String> m; // member ids, e.g. [39393, 23932932, 3223]
}
Here is the code for saving then trying to load right after.
playerGroup = new PlayerGroup();
playerGroup.n = reqPlayerGroup.n;
playerGroup.m = reqPlayerGroup.m;
ofy().save().entity(playerGroup).now();
response.i = playerGroup;
PlayerGroup newOne = ofy().load().type(PlayerGroup.class).id(reqPlayerGroup.n).get();
But the "newOne" object is null, even though I just got done saving it. What am I doing wrong?
--Update--
If I try later (like minutes later) sometimes I do see the object, but not right after saving. Does this have to do with the high replication storage?
I had the same behavior some time ago and asked a question on the Objectify Google Group.
Here is the answer I got:
You are seeing the eventual consistency of the High-Replication Datastore. There has been a lot of discussion of this exact subject on the Objectify list in Google Groups, including several links to the Google documentation on the subject.
Basically, any kind of query which does not include an ancestor() may return results from a stale view of the datastore.
Jeff
I also got another good answer on how to deal with the behavior:
For deletes, query for keys and then batch-get the entities. Make sure your gets are set to strong consistency (though I believe this is the default). The batch-get should return null for the deleted entities.
When adding, it gets a little trickier. Index updates can take a few seconds. AFAIK, there are three ways out of this:
1. Use precomputed results (avoiding the query entirely). If your next view is the user's recently created entities, keep a list of those keys in the user entity, and update that list when a new entity is created. That list will always be fresh, no query required. Besides avoiding stale indexes, this also speeds up your app. The more result sets you can reliably manage, the more queries you can avoid.
2. Hide the latency by "enhancing" the query results with the recently added entities. Depending on the rate at which you're adding entities, either inject only the most recent key, or combine this with the solution in 1.
3. Hide the latency by taking the user through some unaffected views before landing on your query-based view. This strategy definitely has a smell over it. You need to make sure those extra steps are relevant to the user, or you'll give a poor experience.
Butterflies, Joakim
You can read it all here:
How come If I dont use async api after I'm deleting an object i still get it in a query that is being done right after the delete or not getting it right after I add one
Another good answer to a similar question: Objectify doesn't store synchronously, even with now
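The advice above is runtime-agnostic, so purely for illustration, here is the stale-query vs. strong-get distinction sketched with Go's App Engine datastore package (the PlayerGroup type mirrors the question; in Objectify a key-based load plays the same role as datastore.Get):

package playergroups

import (
    "context"

    "google.golang.org/appengine/datastore"
)

type PlayerGroup struct {
    Name    string   // mirrors "n" in the question
    Members []string // mirrors "m"
}

// Eventually consistent: a query with no ancestor may not yet see an
// entity that was saved a moment ago.
func listGroups(ctx context.Context) ([]PlayerGroup, error) {
    var groups []PlayerGroup
    _, err := datastore.NewQuery("PlayerGroup").GetAll(ctx, &groups)
    return groups, err
}

// Strongly consistent: a get by key always sees the latest write, so
// reading back an entity right after saving it should go through its key.
func getGroup(ctx context.Context, name string) (*PlayerGroup, error) {
    g := new(PlayerGroup)
    key := datastore.NewKey(ctx, "PlayerGroup", name, 0, nil)
    if err := datastore.Get(ctx, key, g); err != nil {
        return nil, err
    }
    return g, nil
}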

Nested structs on GAE datastore using Go

I'm trying to figure out how to get nested structs to work with GAE datastore using Go. I know the datastore doesn't specifically support nested structs. I need to find a simple way of getting user information to go with a post when it is sent out to a user as JSON.
One thing I thought of was to put two fields for the user: one for the ID/key referencing the user, and another for the user-type struct, which would be populated when the post is loaded from the datastore. Extra fields seem silly, so I'm hoping there is a better solution for this.
There are two entity types or structs: POST and USER
Posts need to contain information about the user who made the post.
The structure for the JSON I'm going to output for users is as follows:
POST
  field1
  field2
  USER
    user_field1
    user_field2
Go's App Engine datastore API provides the PropertyLoadSaver interface for this sort of thing: https://developers.google.com/appengine/docs/go/datastore/reference#PropertyLoadSaver
You structure your struct however you want and then implement the Load and Save methods of that interface to populate it correctly. It means you write the serialization code yourself but it gives you full freedom in how you structure your data.
This will still allow you to filter over the fields while having a nested struct.
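A minimal sketch of that approach (the Post/User fields are invented; note the linked reference describes the older channel-based interface, while this sketch uses the newer slice-based Load/Save signatures, so match whichever your SDK version expects):

package posts

import (
    "google.golang.org/appengine/datastore"
)

type User struct {
    Name  string
    Email string
}

type Post struct {
    Title string
    Body  string
    User  User `datastore:"-"` // nested struct, persisted manually below
}

// Save flattens the nested User into prefixed properties so the
// datastore can still index and filter on them.
func (p *Post) Save() ([]datastore.Property, error) {
    return []datastore.Property{
        {Name: "Title", Value: p.Title},
        {Name: "Body", Value: p.Body, NoIndex: true},
        {Name: "User.Name", Value: p.User.Name},
        {Name: "User.Email", Value: p.User.Email},
    }, nil
}

// Load rebuilds the nested struct from the flattened properties.
func (p *Post) Load(props []datastore.Property) error {
    for _, prop := range props {
        switch prop.Name {
        case "Title":
            p.Title, _ = prop.Value.(string)
        case "Body":
            p.Body, _ = prop.Value.(string)
        case "User.Name":
            p.User.Name, _ = prop.Value.(string)
        case "User.Email":
            p.User.Email, _ = prop.Value.(string)
        }
    }
    return nil
}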
The Python runtime has the ndb library, which supports nested structures like this. Go does not, so I can think of two solutions:
In the POST kind, have a user field that is a key, referencing a USER kind with the necessary fields. Requires two fetches and roundtrips.
Make a user field in the POST kind that is a blob. The blob is a string that is [de]serialized in Go. This means you can't search or filter on any of the user data, but it also allows you to store everything in one entity.
You should use these based on the needs of your app. If you need users to be a real thing, use 1. If users aren't objects you need to work with (i.e., just data to display), you can use 2.
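Both options, sketched side by side (type and field names are invented for illustration):

package posts

import (
    "encoding/json"

    "google.golang.org/appengine/datastore"
)

type UserInfo struct {
    Name  string
    Email string
}

// Option 1: keep a key referencing a separate USER entity.
// Resolving the user costs a second fetch and round trip.
type PostWithRef struct {
    Title   string
    UserKey *datastore.Key
}

// Option 2: serialize the user into an unindexed blob on the post.
// One entity and one fetch, but no filtering on user fields.
type PostWithBlob struct {
    Title    string
    UserBlob []byte `datastore:",noindex"`
}

func (p *PostWithBlob) SetUser(u UserInfo) error {
    b, err := json.Marshal(u)
    if err != nil {
        return err
    }
    p.UserBlob = b
    return nil
}

func (p *PostWithBlob) GetUser() (UserInfo, error) {
    var u UserInfo
    err := json.Unmarshal(p.UserBlob, &u)
    return u, err
}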

Self Tracking Entities Traffic Optimization

I'm working on a personal project using WPF with Entity Framework and Self-Tracking Entities. I have a WCF web service which exposes some methods for the CRUD operations. Today I decided to do some tests to see what actually travels over this service, and even though I expected something like this, I got really disappointed. The problem is that for a simple update (or delete) operation on just one object, let's say a Category, I send to the server the whole object graph, including all of its parent categories, their items, child categories and their items, etc. In my case it was a 170 KB XML file on a really small database (2 main categories, about 20 categories in total, and about 60 items). I can't imagine what will happen if I have a really big database.
I tried to google for some articles concerning traffic optimization with STE, but with no success, so I decided to ask here if somebody has done something similar, knows some good practices, etc.
One of the possible ways I came up with is to get the data I need per object type with more service calls:
return context.Categories.ToList();//only the categories
...
return context.Items.ToList();//only the items
Instead of:
return context.Categories.Include("Items").ToList();
This way the categories and the items will be separated, and when making changes or deleting objects, less data will be sent over the wire.
Have any of you faced a similar problem, and if so, how did you solve it?
We've encountered similar challenges. The first thing, as you already mentioned, is to keep the entities as small as possible (as dictated by the desired client functionality). Second, when sending entities back over the wire to be persisted: strip all navigation properties (nested objects) when they haven't changed. This sounds very simple but is not at all trivial. What we do is recursively dig into the entities present in the trackable collections of, say, the "topmost" entity (and their trackable collections, and theirs, and so on) and remove them when their change-tracking state is "Unchanged". But be careful with this, because in some cases you still need those entities: they may have been removed from or added to the trackable collections of their parent entity (so then you shouldn't remove them).
This, what we call "StripEntity", is also mentioned (though without any code sample) in Julie Lerman's Programming Entity Framework.
And although it might not be as efficient as a more purist approach, the use of STEs saves a lot of code for queries against the database. We are not in need of optimal performance in a high-traffic situation, so STEs suit our needs and take away a lot of the code needed to communicate with the database. You have to decide for your situation what the "best" solution is. Good luck!
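The recursion described above, reduced to a bare sketch (written in Go purely for illustration; Entity, State, and the state names are hypothetical stand-ins for the STE change-tracking metadata, and the real .NET version walks typed trackable collections rather than a single Children slice):

package ste

// State stands in for the STE ChangeTracker state.
type State int

const (
    Unchanged State = iota
    Added
    Modified
    Deleted
)

// Entity is a hypothetical self-tracking entity with one trackable
// collection of children.
type Entity struct {
    State    State
    Children []*Entity
}

// StripUnchanged drops children whose entire subtree is Unchanged and
// reports whether anything in e's subtree still carries a change.
// Added and Deleted children are always kept, per the caveat above.
func StripUnchanged(e *Entity) bool {
    kept := e.Children[:0]
    changed := e.State != Unchanged
    for _, c := range e.Children {
        if StripUnchanged(c) {
            kept = append(kept, c)
            changed = true
        }
    }
    e.Children = kept
    return changed
}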
You can find an Entity Framework project item at http://selftrackingentity.codeplex.com/. With version 0.9.8, I added a method called GetObjectGraphChanges() that returns an optimized entity object graph with only the objects that have changes.
Also, there are two helper methods: EstimateObjectGraphSize() and EstimateObjectGraphChangeSize(). The first method returns the estimated size of the whole entity object along with its object graph; the latter returns the estimated size of the optimized entity object graph with only the objects that have changes. With these two helper methods, you can decide whether it makes sense to call GetObjectGraphChanges() or not.
