Does anyone know the benefit of using an embedded class in Objectify? So far, my biggest problem with an embedded class is that I can't access it outside of the entity or outside of the App Engine Endpoint class. Just wondering.
Using embedded entities saves you from making additional calls to the Datastore.
For example, you may store users' phone numbers as embedded entities in a User entity. This is a good option if you always need phone numbers when you retrieve users. This way if your query returns 100 user entities, you don't have to make 100+ calls to the Datastore to retrieve their phone numbers.
However, if you need to access these numbers separately, or be able to search by phone number, a better option is to keep them as separate entities.
I'm building a server for customers where each customer needs their own access to a database for serving their clients.
My plan was to assign each customer a dedicated bucket, but I have just found out that a single Couchbase cluster is recommended to serve a maximum of 10 buckets. Now I don't know whether sharing a single bucket across my customers, using each customer's ID combined with the name of the document collection they create as a namespace in the document type, will hurt performance for all customers when each customer's clients run heavy operations on that one bucket.
I would also appreciate suggestions for any database platform that can handle this kind of project at scale, so that the performance of one customer will not affect the others.
If you expect the system to be heavily loaded, the activities of one user will affect the activities of another user whether they are sharing a single bucket or operating in separate buckets. There are only so many cycles to go around, so if one user is placing a heavy load on the system, the other users will definitely feel it. If you absolutely want the users completely isolated, you need to set up separate clusters for each of them.
If you are ok with the load from one user affecting the load from another, your plan for having users sharing a bucket by adding user ids to each document sounds workable. Just make sure you are using a separator that can not be part of the user id, so you can unambiguously separate the user id from the document id.
Also be aware that while Couchbase supports multiple buckets, it tends to run best with just one. Buckets are distinctly heavyweight structures.
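The shared-bucket key scheme described above can be sketched in plain Java. This is only a minimal illustration: the class name `TenantKeys`, the `|` separator, and the sample ids are assumptions, not anything Couchbase prescribes. A single-character separator is used because a multi-character one can become ambiguous when a user id ends with part of it.

```java
final class TenantKeys {
    // Assumed separator: any single character works, as long as user ids can never contain it.
    static final char SEPARATOR = '|';

    // Builds the document key stored in the shared bucket.
    static String compose(String userId, String documentId) {
        if (userId.isEmpty() || userId.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException(
                "user id must be non-empty and must not contain '" + SEPARATOR + "'");
        }
        return userId + SEPARATOR + documentId;
    }

    // Splits at the FIRST separator, so the document id itself may contain the separator.
    static String[] split(String key) {
        int at = key.indexOf(SEPARATOR);
        if (at < 0) {
            throw new IllegalArgumentException("not a tenant-scoped key: " + key);
        }
        return new String[] { key.substring(0, at), key.substring(at + 1) };
    }

    public static void main(String[] args) {
        String key = compose("cust42", "orders|2024");
        String[] parts = split(key);
        System.out.println(key + " -> user=" + parts[0] + ", doc=" + parts[1]);
    }
}
```

Rejecting the separator when the user id is created, rather than when a key is built, keeps the check in one place.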
I have a question concerning microservices and databases. I am developing an application: a user sees a list of countries and can click through to see a list of attractions of that country. I created a country-service, an auth-service (contains users for OAuth2) and an attraction-service. Each service has its own database. I mapped the association between an attraction and its country by the ISO code (for example: BE = Belgium): /api/attraction/be.
The approach above seems to work but I am a bit stuck with the following: a user must be able to add an attraction to his/her list of favorites, but I do not see how that's possible since I have so many different databases.
Do I create a favorite-service, do I pass IDs (I don't think I should do this), what kind of business key can I create, how do I associate the data in a correct way...?
Thanks in advance!
From the information you have provided, using a standalone favourite service sounds like the right option.
A second, simpler and quicker option might be to handle this in your user service, which looks after the persistence of your users' data, since favourites are exclusive to a user entity.
As for IDs, I haven't seen many reasons why this might be a bad idea. Your individual services are going to need to store some identifying value for related data, and the main issue here, I feel, is just keeping this ID field consistent across your different services. What you choose just needs to be reliable and predictable to keep things easy and simple as your system grows.
If you are using RESTful HTTP, you already have a persistent, bookmarkable identification of resources, URLs (URIs, IRIs if you want to be pedantic). Those are the IDs that you can use to refer to some entity in another microservice.
There is no need to introduce another layer of IDs, be it country codes, or database ids. Those things are internal to your microservice anyway and should be transparent for all clients, including other microservices.
To be clear, I'm saying, you can store the URI to the country in the attractions service. That URI should not change anyway (although you might want to prepare to change it if you receive permanent redirects), and you have to recall that URI anyway, to be able to include it in the attraction representation.
You don't really need any "business key" for favorites either, other than the URI of the attraction. You can bookmark that URI, just as you would in a browser.
I would imagine if there is an auth-service, there are URIs also for identifying individual users. So in a "favorites" service, you could simply link the User URI with Attraction URIs.
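As a rough sketch of that idea, a favorites service only needs to map a user URI to a set of attraction URIs. The class name `FavoritesStore`, the in-memory map standing in for the service's own database, and the example.com URIs are all hypothetical.

```java
import java.net.URI;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

final class FavoritesStore {
    // In-memory stand-in for the favorites service's own database.
    private final Map<URI, Set<URI>> favoritesByUser = new HashMap<>();

    // Links a user URI to an attraction URI; adding the same pair twice has no effect.
    void addFavorite(URI user, URI attraction) {
        favoritesByUser.computeIfAbsent(user, u -> new LinkedHashSet<>()).add(attraction);
    }

    // Returns the user's favorites in insertion order, or an empty set for unknown users.
    Set<URI> favoritesOf(URI user) {
        Set<URI> favorites = favoritesByUser.get(user);
        return favorites == null
            ? Collections.<URI>emptySet()
            : Collections.unmodifiableSet(favorites);
    }

    public static void main(String[] args) {
        FavoritesStore store = new FavoritesStore();
        URI user = URI.create("https://auth.example.com/users/17");          // hypothetical
        store.addFavorite(user, URI.create("https://attractions.example.com/api/attraction/be/1")); // hypothetical
        System.out.println(store.favoritesOf(user));
    }
}
```

The favorites service never needs to understand country codes or database ids from the other services; it stores and returns links only.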
Suppose I have a million users registered with my app. Now there's a new user, and I want to show him which of his contacts have this app installed. A user can have many contacts, let's say 500. If I fetch an entity for each contact from the Datastore, that is very time- and money-consuming. Memcache is a good option, but I have to keep it in sync for that Kind. I can get dedicated Memcache for such a large data set, but how do I sync it? My logic would be: if it's not in Memcache, assume that the contact is not registered with this app. A backend module with manual scaling can be used to keep both in sync. But I don't know how good this design is. Any help will be appreciated.
This is not how memcache is designed to be used. You should never rely on memcache. Keys can drop at any time. Therefore, in your case, you can never be sure if a contact exists or not.
I don't know what your problem with datastore is? Datastore is designed to read data very fast - take advantage of it.
When new users install your app, create a lookup entity with the phone number as the key. You don't necessarily need any other properties. Something like this:
Entity contactLookup = new Entity("ContactLookup", "somePhoneNumber");
datastore.put(contactLookup);
That will keep a log of who's got the app installed.
Then, to check which of your user's contacts are already using your app, you can create a set of keys out of the phone numbers from the user's address book (with their permission of course!), and perform a batch get. Something like this:
Set<Key> keys = new HashSet<Key>();
for (String phoneNumber : phoneNumbers) {
    keys.add(KeyFactory.createKey("ContactLookup", phoneNumber));
}
Map<Key, Entity> entities = datastore.get(keys);
Now, entities will contain only those contacts that have your app installed.
You may need to batch the keys to reduce load. The Python API does this for you, but I'm not sure about the Java APIs. Even if your user has 500 contacts, that's only 5 batch gets (assuming batches of 100).
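If the batching has to be done by hand, it could look roughly like the sketch below. It uses plain strings instead of Datastore keys so that it runs without the App Engine SDK; the class name `Batches` and the batch size of 100 are assumptions taken from the example above, not a documented limit.

```java
import java.util.ArrayList;
import java.util.List;

final class Batches {
    // Splits items into consecutive batches of at most batchSize elements.
    static <T> List<List<T>> partition(List<T> items, int batchSize) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        List<List<T>> batches = new ArrayList<>();
        for (int start = 0; start < items.size(); start += batchSize) {
            int end = Math.min(start + batchSize, items.size());
            // Copy the view so each batch is independent of the source list.
            batches.add(new ArrayList<>(items.subList(start, end)));
        }
        return batches;
    }

    public static void main(String[] args) {
        List<String> phoneNumbers = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            phoneNumbers.add("+100000" + i); // made-up numbers
        }
        // Each batch would become one batch get against the Datastore.
        System.out.println(partition(phoneNumbers, 100).size() + " batches");
    }
}
```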
Side note: you may want to consider hashing phone numbers for storage.
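One possible way to do that hashing with the standard library is shown below. The class name `PhoneHash` and the choice of SHA-256 are assumptions; the sketch also assumes numbers are normalized to one format before hashing, otherwise the same number written two ways produces two different keys.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PhoneHash {
    // Returns the lowercase hex SHA-256 digest of an already-normalized phone number.
    static String sha256Hex(String normalizedPhoneNumber) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            byte[] hash = digest.digest(normalizedPhoneNumber.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is one of the algorithms every Java platform must provide.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // The digest, not the raw number, would be used as the lookup entity's key name.
        System.out.println(sha256Hex("+15555550100")); // made-up number
    }
}
```

Because the space of valid phone numbers is small, an unsalted hash can be reversed by brute force, so this hides numbers from casual inspection more than it protects them.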
Memcache is a good option to reduce costs and improve performance, but you should not assume that it is always available. Even a dedicated Memcache may fail or an individual record can be evicted. Besides, all this synchronization logic will be very complicated and error-prone.
You can use Memcache to indicate if a contact is registered with the app, in which case you do not have to check the datastore for that contact. But I would recommend checking all contacts not found in Memcache in the Datastore.
Verifying whether a record is present in the Datastore is fast and inexpensive. You can use the get(java.lang.Iterable<Key> keys) method to retrieve the entire list with a single Datastore call.
You can further improve performance by creating an entity with no properties for registered users. This way there will be no overhead in retrieving these entities.
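The lookup order recommended above (trust a Memcache hit, verify every miss against the Datastore) can be sketched with two in-memory sets standing in for the real services. The class name `ContactLookup` and both stand-ins are assumptions made so the example runs without the App Engine SDK.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ContactLookup {
    private final Set<String> cache;      // stand-in for Memcache: entries may vanish at any time
    private final Set<String> datastore;  // stand-in for the Datastore: the source of truth

    ContactLookup(Set<String> cache, Set<String> datastore) {
        this.cache = cache;
        this.datastore = datastore;
    }

    // Returns the subset of phoneNumbers that belong to registered users.
    Set<String> registered(Collection<String> phoneNumbers) {
        Set<String> found = new HashSet<>();
        List<String> misses = new ArrayList<>();
        for (String number : phoneNumbers) {
            if (cache.contains(number)) {
                found.add(number);   // a cache hit is trusted
            } else {
                misses.add(number);  // a cache miss proves nothing
            }
        }
        // In a real app this loop would be a single batch get for all misses.
        for (String number : misses) {
            if (datastore.contains(number)) {
                found.add(number);
                cache.add(number);   // refill the cache for the next lookup
            }
        }
        return found;
    }
}
```

With this order no synchronization job is needed: a lost cache entry only costs one extra Datastore read.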
Since you don't use Python and therefore don't have access to NDB, the suggestion would be, when you add a user, to add him to Memcache and create an async query (or a task queue job) to push the same data to your Datastore. That way Memcache is written first, and the Datastore eventually follows. They'll always be in sync.
Then all you need to do is query Memcache first when you do "gets" (because Memcache is always in sync since you push there first), and if Memcache returns empty (being volatile and whatnot), query the actual Datastore to refill Memcache.
Is it possible to create nested namespaces in a Google App Engine app?
Let's say I am modeling a multi-tenant application similar to Google Docs. I obviously need one namespace for each organization to avoid data leaks from one org to another. However, I probably also want a namespace per user, so that when I am searching for a document I don't have to search through all the documents of all the users in that organization, and again to avoid data leaks.
What's the best way to model this?
With namespaces just being strings (limited to [0-9A-Za-z._-]{0,100}), you could just use "_" or something similar as the separator for your sub-namespaces, so you'd have namespaces like "%organisation_%user".
But your argument
So I don't have to search through all the documents
doesn't seem like a strong enough reason to go down this route. Your code will become more complex (and as a result more likely to leak data) if you are constantly having to switch between namespaces to get at data that is organisation-wide vs user-wide. Performance will not be improved any more than if you were filtering your list of documents by a user-id field.
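If you do go with composite namespace names, a small helper can at least keep them unambiguous and within the allowed pattern. This is a sketch under assumptions: the class name `Namespaces` is made up, the pattern is the one quoted above, and the organisation part is forbidden from containing the separator so the two parts can always be told apart.

```java
import java.util.regex.Pattern;

final class Namespaces {
    // Pattern quoted in the answer above: at most 100 characters from [0-9A-Za-z._-].
    private static final Pattern VALID = Pattern.compile("[0-9A-Za-z._-]{0,100}");
    static final char SEPARATOR = '_';

    // Builds "organisation_user" and rejects anything ambiguous or out of range.
    static String forUser(String organisation, String user) {
        if (organisation.isEmpty() || organisation.indexOf(SEPARATOR) >= 0) {
            throw new IllegalArgumentException(
                "organisation must be non-empty and must not contain '" + SEPARATOR + "'");
        }
        String namespace = organisation + SEPARATOR + user;
        if (!VALID.matcher(namespace).matches()) {
            throw new IllegalArgumentException("invalid namespace: " + namespace);
        }
        return namespace;
    }

    public static void main(String[] args) {
        System.out.println(forUser("acme", "alice"));
    }
}
```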
I'm trying to create a Twitter-like follower system (users can follow one another). I'm confused about a good way to store the follower relationships. I'm using JDO (on Google App Engine).
The first thing that comes to mind is to keep a Set for followers, and one for the people you are following. Something like:
class User {
    private String mUsername;
    private Set<String> mFollowers;
    private Set<String> mFollowees;
}
I'm worried about what happens when those sets grow to have like 10,000+ entries in them. Viewing a user's page is going to be a common operation, and I'd hate to have to load the entire Sets every time my API needs to generate user info. I'm only going to be showing 50 followers at a time anyway, so it makes no sense to load the entire Set.
An alternative could be to use an intermediate class to store relationships, so that they are not bound to the User object. Paging should then also be easy (I think). For example, whenever I want to follow a user, I'd create an instance of this object:
class RelationshipInfo {
    private String mMyUsername;
    private String mUsernameYouAreFollowing;
}
So when I view a user's page, I could query for the first 50 such records for the given user's id. Does that make any performance sense? I'm not sure if this is better than the first option above. This way would require more trips to the datastore.
Any thoughts would be great,
Thanks
Brett Slatkin's Building Scalable, Complex Apps on App Engine talk from last year's Google I/O actually uses a Twitter-like application as its example. It's a great talk, and I would highly recommend it even if it didn't relate specifically to what you're asking.
Also, you may want to check out Jaiku, an open-source Twitter-like application built on App Engine.
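For reference, the intermediate-record approach from the question can be sketched in plain Java. An in-memory list stands in for the datastore and an offset stands in for a query cursor, so this shows only the shape of the paging, not JDO code; the names `FollowGraph` and `Relationship` are made up for the example.

```java
import java.util.ArrayList;
import java.util.List;

final class FollowGraph {
    // One record per "follower follows followee" relationship.
    static final class Relationship {
        final String follower;
        final String followee;

        Relationship(String follower, String followee) {
            this.follower = follower;
            this.followee = followee;
        }
    }

    // In-memory stand-in for the stored relationship records.
    private final List<Relationship> relationships = new ArrayList<>();

    void follow(String follower, String followee) {
        relationships.add(new Relationship(follower, followee));
    }

    // Returns at most limit followers of user, skipping the first offset matches.
    List<String> followersOf(String user, int offset, int limit) {
        List<String> page = new ArrayList<>();
        int skipped = 0;
        for (Relationship r : relationships) {
            if (!r.followee.equals(user)) {
                continue;
            }
            if (skipped < offset) {
                skipped++;
                continue;
            }
            if (page.size() == limit) {
                break;
            }
            page.add(r.follower);
        }
        return page;
    }
}
```

Viewing a user's page then touches at most 50 relationship records, however many followers the user has, while the User object itself stays small.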