I'm trying to develop a gaming site. Users can add other users as friends, and a user earns points as he completes the various levels of a game. I need to show, on each game's page, the average points of all the user's friends who have already played that game (for example, when a user plays game A, the average of the points his friends earned in game A is displayed on the game A page; likewise, game B's page shows his friends' average points for game B).
My approach:
Store the user's friend list (max 1000) as a multi-valued property in the datastore and load it into GAE memcache when the user logs into the site.
Use a resident backend to cache all users' game data (points earned for each specific game) in memory. A cron job updates the backend cache every hour.
When a user requests a game page (e.g. game A) for the first time, the request handler asks the backend, via the URL Fetch service, to compute the average of his friends' points.
The backend gets the user's friend list (max 1000) from memcache, fetches the friends' game A points from its in-memory cache, and returns the computed average.
After receiving the average, the request handler persists it in the datastore and also stores it in memcache, so that subsequent requests to the game A page fetch the value from memcache/datastore without the computation overhead on the backend. The average is valid for one hour and is recomputed after that on the next request to the game A page.
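A minimal sketch of the backend side of this flow, assuming GAE Python 2.7 with webapp2; the handler name, cache layout, and key formats below are illustrative, not part of any API:

    import webapp2
    from google.appengine.api import memcache

    # In-memory cache on the resident backend: {"userid/gamename": points}.
    # The hourly cron job (not shown) rebuilds this dict and swaps it in.
    points_cache = {}

    class FriendsAverageHandler(webapp2.RequestHandler):
        def get(self):
            user_id = self.request.get('user_id')
            game = self.request.get('game')
            # The friend list (up to 1000 ids) was put in memcache at login.
            friends = memcache.get('friends:%s' % user_id) or []
            points = [points_cache.get('%s/%s' % (f, game)) for f in friends]
            points = [p for p in points if p is not None]  # friends who played
            avg = float(sum(points)) / len(points) if points else 0.0
            # The frontend handler persists and memcaches this value for 1 hour.
            self.response.write(str(avg))

    app = webapp2.WSGIApplication([('/avg', FriendsAverageHandler)])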
My questions:
Is the above approach the right way to solve this problem?
How do I implement an in-memory cache efficiently and reliably on a backend instance (Python 2.7)?
How do I estimate the memory and CPU the backend needs for this job alone? (Assume 100,000 key-value pairs have to be stored, with "userid/gamename" as the key and the user's points as the value, and a maximum friend-list size of 1000.)
If I have to use multiple backend instances as the load increases, how do I load-balance them?
Thanks in advance.
Have a look at this blog post from Nick Johnson about counters: http://blog.notdot.net/2010/04/High-concurrency-counters-without-sharding
Use the NDB datastore for:
- automatic caching, instead of managing your own memcache
- new property types such as JsonProperty with compression, and repeated properties, which act like Python lists (see the sketch below)
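For illustration, a hedged sketch of an NDB model using those property types (the model and field names are assumptions):

    from google.appengine.ext import ndb

    class UserGameStats(ndb.Model):
        # Repeated property: behaves like a Python list (friend ids, max 1000).
        friends = ndb.StringProperty(repeated=True)
        # JSON property stored with compression: {"gamename": points}.
        points = ndb.JsonProperty(compressed=True)

    # NDB caches get() results automatically (in-context cache plus memcache).
    stats = UserGameStats(id='user123',
                          friends=['u1', 'u2'],
                          points={'gameA': 120, 'gameB': 75})
    stats.put()
    fetched = UserGameStats.get_by_id('user123')  # served from cache when warm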
And have a look at mapreduce for efficient updating.
I need help understanding the best approach to structuring my Firestore data.
I come from a traditional SQL background and have a little NoSQL MongoDB background as well. I am building a small football prediction app, and here is the user flow:
User:
User registers/signs in
They can pick a contest to join
Enter their predictions every week
Admin:
Create contests and add/edit games in a contest every week (an API will fetch all the data, like fixtures and results)
Set a deadline for when users can enter their last prediction for the game week
Other:
Leaderboard
Now, I did create a diagram of how I would traditionally structure this data, but it would be nice if someone could explain to me the simplest approach to structuring such an app in Firestore.
It looks reasonable, but I would be concerned about storing the passwords in the Firestore DB; Firestore should ideally not be concerned with authentication. Check Firebase Authentication for the different auth options with Firebase. You'll probably end up only having to store the user ID, as the other information is in the User object.
Also check out the supported data types. You probably want to change varchar(x) to the (UTF-8) string or bytes types. Moreover, there is a reference type, so you can reference the actual user document from the other tables.
One main design decision will be whether to use nested collections (Hierarchical Data). You might be able to nest scores under competitions.
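For example, a minimal sketch of one possible layout using the Python client (google-cloud-firestore); all collection and field names are assumptions:

    from google.cloud import firestore

    db = firestore.Client()

    # Users: store only the Firebase Auth uid plus profile data, no passwords.
    db.collection('users').document('uid123').set({'displayName': 'Alice'})

    # Contests, with the week's games nested as a subcollection.
    contest = db.collection('contests').document('premier-league-2024')
    contest.set({'name': 'Premier League 2024',
                 'deadline': firestore.SERVER_TIMESTAMP})
    contest.collection('games').document('week1-game1').set(
        {'home': 'Arsenal', 'away': 'Chelsea'})

    # Predictions point back at the user document via the reference type.
    contest.collection('predictions').add({
        'user': db.document('users/uid123'),
        'game': 'week1-game1',
        'homeScore': 2,
        'awayScore': 1,
    })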
I have developed a standard Google App Engine backend application for my Android client. There is search functionality in the app, and for one request I plan to return 20 results, but I search for more in advance (say 100) so that on the next hit I can just search within those records and return from there. So I need a mechanism to save these 80 extra records so that the same user can get them quickly.
I searched for this and found that sessions can be enabled in appengine-web.xml, but all the session access happens in doPost() and doGet(), while my code is entirely Google Cloud Endpoints (similar to Spring).
Another thing is that I would like to persist the data both in the Datastore and in some cache (like Memcache).
My end goal is to store this data across search sessions. Is there any mechanism that will allow me to do this?
The usual approach here is to provide a code value in the response which the user can send in the next request to "continue" viewing the same results. This is called a "cursor".
For example, you might store the 80 records under some random key in your cache, and then send that random key to the user as part of the response. Then, when the user makes a new request that includes the key, you just look up the records and return them.
Cookie-based sessions don't usually work well with APIs; they introduce unnecessary statefulness.
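A minimal sketch of that pattern, in Python for illustration (the question's stack is Java Cloud Endpoints, so treat this as pseudocode for the idea; the key format and cache TTL are assumptions):

    import uuid
    from google.appengine.api import memcache

    def first_page(all_results):
        # Return the first 20 results plus a cursor for the remaining 80.
        cursor = uuid.uuid4().hex  # random, unguessable key
        memcache.set('search:' + cursor, all_results[20:], time=600)
        return all_results[:20], cursor

    def next_page(cursor):
        # Look up the records cached under a cursor the client sent back.
        remaining = memcache.get('search:' + cursor)
        if remaining is None:
            return None  # evicted from cache: re-run the search
        return remaining[:20]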
I am looking for a way to get a page of data from the datastore into memcache. Basically a comment system like Facebook's, where you load a set of 10 comments at a time. The datastore persists each comment as an object.
I would load 10 comments into an array and cache that, with a page-ID suffix in the key by convention.
Now the problem: the datastore doesn't seem to promise that auto-generated IDs increment by 1. I checked this on SO: Autoincrement ID in App Engine datastore.
Upon eviction, how can I load a particular range of these comments, say page #5 or #6, from the datastore into the cache when I can't access datastore objects by an incremented key?
Any suggestions are welcome; even if you feel the whole approach is flawed, let me know.
I did explore Google Cloud SQL as an alternative to the datastore, since it takes care of my paging and ID-increment problems, but felt it's not the best option, as I expect this comments table to eventually grow into a very large dataset.
Thanks!
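For reference, a rough sketch of the caching I have in mind, assuming (to work around the non-contiguous IDs) that comments are ordered by a created timestamp rather than by key; the names are illustrative:

    from google.appengine.api import memcache
    from google.appengine.ext import ndb

    PAGE_SIZE = 10

    class Comment(ndb.Model):
        text = ndb.StringProperty()
        created = ndb.DateTimeProperty(auto_now_add=True)

    def get_page(page_num):
        cache_key = 'comments:page%d' % page_num
        comments = memcache.get(cache_key)
        if comments is None:
            # On eviction, rebuild the page from the datastore by order,
            # not by key range. (Offsets get costly for deep pages; query
            # cursors would scale better.)
            comments = Comment.query().order(-Comment.created).fetch(
                PAGE_SIZE, offset=(page_num - 1) * PAGE_SIZE)
            memcache.set(cache_key, comments, time=3600)
        return comments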
I have just developed a mobile app that basically lets users upload and download photos; add, update, search, and delete records; refresh transactions; and query reports. Every action submits a request to the App Engine server.
I am using Cloud Endpoints, OAuth 2.0, and Objectify to implement this App Engine backend. When I'm testing alone, 40% of the instance-hours quota is already used up. How much should I expect to be billed for instances if 100 people use this app? How are instance hours calculated: by the requests submitted, or by the time instances spend working on requests?
Is it worth it?
If my target is more than 100 users of my app, is it worth it? Could you please share what exactly I have misunderstood about instances?
Thanks
As others have commented, the question is very hard to answer. The easiest answer I can think of is looking at the response header X-AppEngine-Estimated-CPM-US-Dollars. You have to be a member of the Cloud Platform project (see the Permissions page in the Cloud Platform developers console) to see this header (you can check it in your browser).
The header tells you the cost of the request in US dollars, multiplied by 1,000 (i.e. the estimated cost per 1,000 such requests).
But think of it as an indication. If your request spawns other processes such as tasks, those costs are not included in the number you see in that header.
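A minimal sketch of reading that header; the URL is a placeholder, and the header is only sent on requests authenticated as a project member:

    import requests

    resp = requests.get('https://your-app.appspot.com/some-handler')
    cpm = resp.headers.get('X-AppEngine-Estimated-CPM-US-Dollars')
    if cpm:
        # The value is the estimated cost of 1,000 such requests; strip a
        # possible '$' prefix and divide by 1,000 for a per-request figure.
        print('~$%f per request' % (float(cpm.lstrip('$')) / 1000))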
The relationship between frontend instance hours and the number of requests is not linear either. For one, you are charged for a block of minutes (I'm not sure if it's 15 minutes) whenever an instance spins up, and there are other performance settings that determine how this works.
Your best bet is to run the app for a while against real users and find out what the costs were in a given month or so.
I read somewhere that the Salesforce API has a 10-request limit. If we write code to integrate with Salesforce:
1. What is the risk of hitting this limit?
2. How can we write code to mitigate this risk?
My real concern is that I don't want to build our customer this great standalone website that integrates with Salesforce, only to have users 11 and 12 kicked out to wait until requests 1-10 are complete.
Edit:
Some more details on the specifics of the limitation can be found at http://www.salesforce.com/us/developer/docs/api/Content/implementation_considerations.htm. Look at the section titled Limits.
"Limits
There is a limit on the number of queries that a user can execute concurrently. A user can have up to 10 query cursors open at a time. If 10 QueryLocator cursors are open when a client application, logged in as the same user, attempts to open a new one, then the oldest of the 10 cursors is released. This results in an error in the client application.
Multiple client applications can log in using the same username argument. However, this increases your risk of getting errors due to query limits.
If multiple client applications are logged in using the same user, they all share the same session. If one of the client applications calls logout(), it invalidates the session for all the client applications. Using a different user for each client application makes it easier to avoid these limits."
Not sure which limit you're referring to, but the governor limits are all listed in the Apex documentation. These limits apply to code running in a given Apex transaction (i.e. in response to a trigger/web service call etc), so adding more users won't hurt you - each transaction gets its own allocation of resources.
There are also limits on the number of long-running concurrent API requests and total API calls in a day. Most of these are per-license, so, again, as the number of users rises, so do the limits.
A few comments on:
I don't want to build our customer this great standalone website that integrates with Salesforce, only to have users 11 and 12 kicked out to wait until requests 1-10 are complete.
There are two major things you need to consider when planning real-time Sfdc integration, besides the API call limits mentioned in metadaddy's answer (and if you make a lot of queries, it's easy to hit those limits):
Sfdc has routine maintenance outage periods.
Querying Sfdc will always be significantly slower than querying a local data source.
You may want to consider a local mirror where you replicate your Sfdc data.
Cheers,
Tymek
All API usage limits are calculated over a 24-hour period.
Limits apply to the whole organization, so if you have several users connecting through the API, all of them count against the same limit.
You get 1,000 API requests per Salesforce user license. Even Unlimited Edition is actually limited to 5,000.
If you want to check your current API usage status, go to Your Name | Setup | Company Profile | Company Information.
You can purchase additional API calls.
You can read more in the Salesforce API Limits documentation.
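If you want to check usage programmatically instead, a hedged sketch: REST API responses carry an Sforce-Limit-Info header of the form api-usage=used/total. The instance URL and session id below are placeholders.

    import requests

    session_id = 'YOUR_SESSION_ID'  # placeholder: obtain via OAuth or login()
    resp = requests.get(
        'https://yourInstance.salesforce.com/services/data/v52.0/limits',
        headers={'Authorization': 'Bearer ' + session_id})
    print(resp.headers.get('Sforce-Limit-Info'))  # e.g. 'api-usage=18/15000'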