How to change datastore options in Google App Engine? - google-app-engine

I created my application using "High Replication" option. Now I want to switch to "Master/Slave" option because I'm hitting my daily CPU quota.
It turns out High Replication uses "approximately three times the storage and CPU cost of Master/Slave"
Is there anyway I can do this without recreating my app? It's not in the Application Settings page.

You can't - once you've chosen a particular type of datastore, that application is bound to it. The only way to change it is exactly the way you suggested - you'd have to create a new app with the Master / Slave datastore and port your data to it.
You may want to profile your app and optimize it to use less CPU, although in the general case that may be easier said than done.

Take a look at the first answer to this question: Have you expirienced DataStore downtime in AppEngine? What are the odds?
As #mihai said:I would recommend you to use HRD as Google said they will make M/S more expensive than HRD until the end of the year and even remove the M/S option as they are looking to "force" the businesses & developers to take advantage of all HRD goodies. The real reason is that maintaining a single type of infrastructure is cheaper than to maintain the both HRD and M/S so Google picks HRD. Source : Google I/O 2011

Related

Caching after GAE standard migration to Go 1.11/1.12

I've almost completed migrating based on google's instructions.
It's very nice to not have to call into the app-engine libraries whatsoever.
However, now I must replace my calls to app-engine-standard memcached.
Here's what the guide says: "To use a memcache service on App Engine, use Redis Labs Memcached Cloud instead of App Engine Memcache."
So is this my only option; a third party? They don't even list pricing on their page if GCE is selected.
I also see in the standard environment how-to guides there is a guide on Connecting to internal resources in a VPC network.
From that link it mentions Cloud Memorystore. I can't find any examples if this is advisable or possible to do on GAE standard. Of course it wasn't previously possible but now that GAE standard has become much more "standard", I think it should be possible?
Thanks for any advice on the best way forward.
Memorystore appears to be Google's replacement:
https://cloud.google.com/memorystore/
You connect to it using this guide:
https://cloud.google.com/appengine/docs/standard/go/using-memorystore
Alas it costs about $1.20/GB per day with no free quota.
Thus, if your data doesn't change, and requires less than 100MB of cache at a time, the first answer might be better (free). Also, your data won't explode the instance as you can control the max size of the cache.
However, if your data changes or you need more cache, MemoryStore is a more direct replacement to MemCache - just costs money.
I've been thinking about this. 2nd gen instances have twice the ram, so if global cache isn't required (as in items don't change once created - (name items using their sha256)), you can run your own local threadsafe memcache (such as https://github.com/dgraph-io/ristretto) and allocate some of the extra ram to it. It'll be faster than Memcache was, so requests can be serviced even faster, keeping the number of instances low.
You could make it global for data that does change, by using pub/sub between instances, but I think that's significantly more work.
To ease the migration to 1.12, I have been thinking of using this solution:
create a dedicated app using the 1.11 runtime.
setup twirp endpoints to act as a proxy for all the deprecated app engine services (memcache, mail, search...)

How to configure App Engine for minimal cost?

I'm doing a prototype backend and in the near future I expect little traffic but while testing I consumed all my 300$ free trail.
How can I configure my app to consume the least possible resources? I need things like limiting the number of instances to 1, using a cheap machine, sleep whenever possible, I've read something about Client vs Backend intances.
With time I'll learn the config that best suits me, but now I need the CHEAPEST config to get going.
BTW: I am using managed-vms with Dart.
EDIT
I've been recommended to configure my app.yaml file, what options would you recommend to confront this issue?
There are two train of thought for your issue.
1) Optimization of code: This is very difficult for us as we are not privy to your App's usage and client-base and architecture. In general, it depends on what Google App Engine product you use the most, for example: Datastore API call (fetch, write, delete... etc...), BigQuery and Cloud SQL. Even after optimization, you can still incur a lot of cost depending on traffic.
2) Enforcing cheap operation: This is easier and I think this is what you want. You can manually enforce a daily budget (in your billing setup page) so the App never cost more than a certain amount per day. You can also artificially lower the maximum amount of idling instances to 0 and use the smallest instance possible (F1 for frontend).
For pricing details see this article - https://cloud.google.com/appengine/pricing#Billable_Resource_Unit_Costs
If you use managed VM -- you'll be billed for Compute Engine Instance prices, not for App Engine Instances, and, as I know, the minimum possible instance to use as Managed VM is "g1-small" which costs you $0.023 per hour full sustained usage (if it will be turned on all month), so you minimum bill will be 0.023 * 24 * 30 = $16.56 only for instance hours. Excluding disk and traffic. With minimum amount of datastore operations you may stay on free quota.
Every application consumes resources differently. To minimize your cost, you need to know what resources used the majority of your expenses and go from there.
If it is spent on extra instances that were just sitting there - then trim the number of instances to the minimum required and use a lower class instance. If you are seeing a lot of expense on datastore calls - then look at optimizing your entities and take advantage of memcache.
Lowest Cost for a simple app:
Use App Engine Standard. It scales to zero instances, so will not cost anything if there is no traffic. With App Engine Flex you will pay for the instance hours and the Flex (GCE) instances are bigger.
Use autoscaling with max instances, F1 instance class:
With autoscaling you do not need to guess how many instances you need. F1 are the smallest instances. Set the max instances in case you get DoS'd or more traffic than you can afford.
Stop Instances:
You can stop the App Engine versions when you do not expect the app to be used. The will be no charge for instance hours for either Standard or Flex. For Flex there will be disk charges. The app will be ready to go when you need it again.
App Engine Version Cleanup:
Versions are easy to create and harder to remove. Here is a post on project cleanup. See this post on App Engine cleanup
https://medium.com/google-cloud/app-engine-project-cleanup-9647296e796a

Will using a Cloud PaaS automatically solve scalability issues?

I'm currently looking for a Cloud PaaS that will allow me to scale an application to handle anything between 1 user and 10 Million+ users ... I've never worked on anything this big and the big question that I can't seem to get a clear answer for is that if you develop, let's say a standard application with a relational database and soap-webservices, will this application scale automatically when deployed on a Paas solution or do you still need to build the application with fall-over, redundancy and all those things in mind?
Let's say I deploy a Spring Hibernate application to Amazon EC2 and I create single instance of Ubuntu Server with Tomcat installed, will this application just scale indefinitely or do I need more Ubuntu instances? If more than one Ubuntu instance is needed, does Amazon take care of running the application over both instances or is this the developer's responsibility? What about database storage, can I install a database on EC2 that will scale as the database grow or do I need to use one of their APIs instead if I want it to scale indefinitely?
CloudFoundry allows you to build locally and just deploy straight to their PaaS, but since it's in beta, there's a limit on the amount of resources you can use and databases are limited to 128MB if I remember correctly, so this a no-go for now. Some have suggested installing CloudFoundry on Amazon EC2, how does it scale and how is the database layer handled then?
GAE (Google App Engine), will this allow me to just deploy an app and not have to worry about how it scales and implements redundancy? There appears to be some limitations one what you can and can't run on GAE and their price increase recently upset quite a large number of developers, is it really that expensive compared to other providers?
So basically, will it scale and what needs to be done to make it scale?
That's a lot of questions for one post. Anyway:
Amazon EC2 does not scale automatically with load. EC2 is basically just a virtual machine. You can achieve scaling of EC2 instances with Auto Scaling and Elastic Load Balancing.
SQL databases scale poorly. That's why people started using NoSQL databases in the first place. It's best to see which database your cloud provider offers as a managed service: Datastore on GAE and DynamoDB on Amazon.
Installing your own database on EC2 instances is very impractical as EC2 has ephemeral storage (it looses all data on "disk" when it reboots).
GAE Datastore is actually a one big database for all applications running on it. So it's pretty scalable - your million of users should not be a problem for it.
http://highscalability.com/blog/2011/1/11/google-megastore-3-billion-writes-and-20-billion-read-transa.html
Yes App Engine scales automatically, both frontend instances and database. There is nothing special you need to do to make it scale, just use their API.
There are limitations what you can do with AppEngine:
A. No local storage (filesystem) - you need to use Datastore or Blobstore.
B. Comet is only supported via their proprietary Channels API
C. Datastore is a NoSQL database: no JOINs, limited queries, limited transactions.
Cost of GAE is not bad. We do 1M requests a day for about 5 dollars a day. The biggest saving comes from the fact that you do not need a system admin on GAE ( but you do need one for EC2). Compared to the cost of manpower GAE is incredibly cheap.
Some hints to save money (an speed up) GAE:
A. Use get instead of query in Datastore (requires carefully crafting natiral keys).
B. Use memcache to cache data you got form datastore. This can be done automatically with objectify and it's #Cached annotation.
C. Denormalize data. Meaning you write data redundantly in various places in order to get to it in as few operations as possible.
D. If you have a lot of REST requests from devices, where you do not use cookies, then switch off session support ( or roll your own as we did). Sessions use datastore under the hood and for every request it does get and put.
E. Read about adjusting app settings. Try different settings (depending how tolerant your app is to request delay and your traffic patterns/spikes). We were able to cut down frontend instances by 70%.

How to estimate hosting services cost on GAE?

I'm building a system which I plan to deploy on Google App Engine. Current pricing is described here:
Google App Engine - Pricing and Features
I need an estimate of cost per client managed by the webapp. The cost won't be very accurate until I have completed the development. GAE uses such fine grained price calculation such as READs and WRITEs that it becomes a very daunting task to estimate operation cost per user.
I have an agile dev. process which leaves me even more clueless in determining my cost. I've been exploiting my users stories to create a cost baseline per user story. Then I roughly estimate how will the user execute each story workflow to finally compute a simplistic estimation.
As I see it, computing estimates for Datastore API is overly complex for a startup project. The other costs are a bit easier to grasp. Unfortunately, I need to give an approximate cost to my manager!
Has anyone undergone such a task? Any pointers would be great, regarding tools, examples, or any other related information.
Thank you.
Yes, it is possible to do cost estimate analysis for app engine applications. Based on my experience, the three major areas of cost that I encountered while doing my analysis are the instance hour cost, the datastore read/write cost, and the datastore stored data cost.
YMMV based on the type of app that you are developing, of course. If it is an intense OLTP application that handle simple-but-frequent CRUD to your data records, most of the cost would be on the datastore read/write operations, so I would suggest to start your estimate on this resource.
For datastore read/write, the cost for writing is generally much more expensive than the cost for reading the data. This is because write cost take into account not only the cost to write the entity, but also to write all the indexes associated with the entity. I would suggest you to read an article by Google about the life of a datastore write, especially the part about Apply Phase, to understand how to calculate the number of write per entity based on your data model.
To do an estimate of instance hours that you would need, the simplest approach (but not always feasible) would be to deploy a simple app to test how long would a particular request took. If this approach is undesirable, you might also base your estimate on the Google App Engine System Status page (e.g. what would be the latency for a datastore write for a particularly sized entity) to get a (very) rough picture on how long would it take to process your request.
The third major area of cost, in my opinion, is the datastore stored data cost. This would vary based on your data model, of course, but any estimate you made need to also take into account the storage that would be taken by the entity indexes. Taking a quick glance on the datastore statistic page, I think the indexes could increase the storage size between 40% to 400%, depending on how many index you have for the particular entity.
Remember that most costs are an estimation of real costs. The definite source of truth is here: https://cloud.google.com/pricing/.
A good tool to estimate your cost for Appengine is this awesome Chrome Extension: "App Engine Offline Statistics Estimator".
You can also check out the AppStats package (to infer costs from within the app via API).
Recap:
Official Appengine Pricing
AppStats for Python
AppStats for Java
Online Estimator (OSE) Chrome Extension
You can use the pricing calculator
https://cloud.google.com/products/calculator/

Have you experienced DataStore downtime in AppEngine? What are the odds?

Google start to use The High Replication datastore (HRD) as the default for new applications.
HR from the docs:
The HRD is a highly available, highly
reliable storage solution. It remains
available for reads and writes during
planned downtime and is extremely
resilient in the face of catastrophic
failure—but it costs more than the
master/slave option.
M/S from the docs:
your data may be temporarily
unavailable during data center issues
or planned downtime
Now, have you ever expirienced downtime? If this "downtime disclaimer" is just something theorical and doesn't happen frecuently I would use the M/S becouse it's cheaper.
What are the numbers that Google handle to say "downtime"? maybe their downtime is just a few seconds in a year, something totaly acceptable for some kind of apps.
Would love answers from experienced AppEngine developers.
I would recommend you to use HRD as Google said they will make M/S more expensive than HRD until the end of the year and even remove the M/S option as they are looking to "force" the businesses & developers to take advantage of all HRD goodies. The real reason is that maintaining a single type of infrastructure is cheaper than to maintain the both HRD and M/S so Google picks HRD.
Source : Google I/O 2011
Downtime isn't theoretical - it happens in any distributed system. There are two types, roughly speaking: localized and global. Localized issues occur when a particular machine has trouble and can't serve requests; global downtime happens when something happens to the service as a whole.
Both can occur on App Engine: the former due to localized hardware failure, and the latter generally only due to planned maintenance that requires setting the master-slave datastore read-only for a brief period. The HR datastore handles both more robustly than the MS datastore, and doesn't require a read-only period during maintenance windows.
Once the new pricing scheme comes into effect, both datastores will be charged at the same rate.
For these and many other reasons, you should always use the HR datastore in new apps.

Categories

Resources