Latency accessing Google App Engine overseas - google-app-engine

I am about to begin development of a web app in New Zealand, for the NZ market, for which scalability is a key requirement. I am contemplating using Google App Engine, which I have used in the past for smaller projects where latency was not a big issue because half of the apps are client-side JavaScript.
However, the new project requires fast AJAX response times. The local web-app companies charge about $175/month (much more than in the US, I would imagine) for a dedicated server.
Is there likely to be a significant difference in latency for AJAX requests if I use Google App Engine (hosted in the US, I presume) versus the local hosting company, which hosts here in New Zealand? If so, how big?

A service that may interest you in this context is CloudSleuth, which measures page load times from multiple locations. Select Asia/Oceania for Location, then drill down into GAE to see the page load time from various locations. Unfortunately, the closest location is Sydney, where the page load time for GAE is currently almost 20s.

From your explanation, you would like to use App Engine as your backend. There should not be any latency problems other than the time your app takes to load and serve a request. But as they say, there is no better test than the one you run yourself, so go ahead, play with App Engine and see for yourself!
Happy coding!
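One way to run that test yourself is to time a handful of requests from a machine in New Zealand. This is a minimal sketch using only the Python standard library. The helper names and the URL in the usage comment are placeholders, not anything from App Engine itself.

```python
import statistics
import time
import urllib.request


def time_request(url, timeout=10.0):
    """Return wall-clock milliseconds for one GET, including reading the body."""
    start = time.perf_counter()
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        resp.read()
    return (time.perf_counter() - start) * 1000.0


def summarize(samples_ms):
    """Return (median, worst) for a list of latency samples in milliseconds."""
    return statistics.median(samples_ms), max(samples_ms)


# Usage (hypothetical URL, replace with your own deployed endpoint):
#   samples = [time_request("https://your-app-id.appspot.com/ping") for _ in range(20)]
#   print("median %.0f ms, worst %.0f ms" % summarize(samples))
```

Running the same script against the local hosting company's server gives a like-for-like comparison. The median is usually more informative than a single sample, because the first request may include instance start-up time.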

It's unavoidably the case that the latency for a request within New Zealand is going to be lower than the latency for a request to the US and back, all else being equal. There are several mitigating factors to consider, though:
The speed-of-light delay may not be significant for your application. The round-trip time to the US and back is on the order of 100 milliseconds; the latency generated by your app serving the request may be large enough that this is not a significant factor in end-user latency.
Although your app is only in a single location at any one time, Google has caching frontends all around the world. Requests typically get routed to the closest one, and if your app generates cacheable responses, the frontend may be able to return a response from its cache immediately, without having to ever send the request to your app.
Some ISPs, particularly in places like NZ where international bandwidth is expensive, run transparent proxies. Likewise, so do organisations, and your browser itself has a cache. Any of these can satisfy the request in less time than a roundtrip, if the response is cacheable.
In the end, the question is whether or not the extra 100 milliseconds or so is acceptable. More often than not, the answer is yes, and it's worth the tradeoff of not having to handle machine provisioning, maintenance and so on yourself.
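For any of the caching layers mentioned above to help, the app has to mark its responses as cacheable. This is a minimal sketch as a plain WSGI app, assuming a 5-minute lifetime is acceptable for the data. The handler name, body and max-age value are illustrative only, and whether a given frontend or proxy actually caches the response depends on its own rules.

```python
def app(environ, start_response):
    """Serve a small JSON response that shared caches are allowed to store."""
    body = b'{"status": "ok"}'
    headers = [
        ("Content-Type", "application/json"),
        ("Content-Length", str(len(body))),
        # Assumption: the data may be up to 300 seconds stale.
        # "public" allows shared caches (proxies, frontends) to store it.
        ("Cache-Control", "public, max-age=300"),
    ]
    start_response("200 OK", headers)
    return [body]
```

Responses that are specific to one user should not be marked public. For those, the round trip to the app is unavoidable.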

App Engine is not globally distributed.
The whole application is hosted in North America by default.
If you pay for the service you may request hosting within Europe instead, but there is no option to select any other region (from https://developers.google.com/appengine/docs/python/gettingstartedpython27/uploading).

Related

Caching cloud storage files on app engine

I am using App Engine to serve a bunch of sklearn models. These models are around 100 MB in size, and there are around 25 of them.
Downloading them can take up to 15s at times, despite their being in the designated App Engine bucket, and the download often dominates request times.
I currently use a FIFO cache layer wrapped around the GCS storage client, but the hit rate isn't great because the different models are used in an interleaved fashion and App Engine memory is limited.
Memcache seems too small for this, and /tmp is also stored in RAM.
Is there a better solution for caching such files?
There are several ways you could solve this.
You can embed your models in your deployment. That way, the models are already present alongside the service. When a new model version is released, you deploy a new App Engine service revision.
The problem with that solution is deployment frequency: when one of the models is updated, you need to repackage and redeploy your App Engine service. The answer is microservices. You can have one model per App Engine service and therefore redeploy only the one that has been updated. If you want a single entry point, you can add a 26th App Engine service that acts as the entry point and routes each request to the correct model service.
You can also do the same thing with Cloud Run, where you manage the container packaging yourself and can customise the details if you need anything special. You also have more flexibility in the number of CPUs and the memory size.
Finally, after solving the download issue, you could still have a cold start issue: the time your server takes to start and load your model into memory (on the first request, when the instance starts). Cloud Run offers a minimum instances feature that keeps a certain number of instances warm and therefore eliminates the cold start issue.
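As an illustration of the first option, this is a minimal sketch of lazily loading models that were bundled with the deployment, keeping only the most recently used few in memory. It assumes the models were saved with pickle into a directory shipped with the app. The directory name, the `.pkl` suffix and the cache size of 4 are assumptions for the example, not anything the question states.

```python
import functools
import os
import pickle

# Hypothetical location of the bundled model files inside the deployment.
MODEL_DIR = os.environ.get("MODEL_DIR", "models")


@functools.lru_cache(maxsize=4)  # assumption: about 4 models fit in instance memory
def load_model(name, model_dir=MODEL_DIR):
    """Load a bundled model from local disk; repeat calls are served from memory."""
    path = os.path.join(model_dir, name + ".pkl")
    with open(path, "rb") as fh:
        # Only unpickle files you built yourself: pickle can execute arbitrary code.
        return pickle.load(fh)
```

Unlike a FIFO cache, an LRU cache evicts the model that has gone unused the longest, which may suit interleaved access better. A cache miss here costs a local disk read instead of a download from the bucket.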

Understanding Cost Estimate for Google Cloud Platform MicroServices Architecture Design

I'm redesigning a monolith application into a MicroServices architecture and am hoping to use Google Cloud Platform (GCP) to host the entire solution. I'm having a very hard time understanding their costing breakdown, and am concerned that my costs will be uncontrollable after I build it. This is a personal project, but I'm hoping it will have many users after I launch, so I want to get the underlying architecture right and at the same time have reasonable costs initially when I launch.
Here is my architecture:
MicroServices 1 - 4 (Total 4 API Services):
Runs on App Engine
Exposes a REST API and saves data to DataStore
Initially each API should get hit around 200 times a day
MicroService 5 (Events triggered API Service):
Runs on App Engine
Listens for PubSub events and saves to DataStore (basically I have a sensor that pushes data to this Service for storage)
Initially the PubSub should receive events around 200 times a day
MicroService 6-7 (Total 2 UI Services):
Runs on App Engine
These are UIs so people can log in and use the systems. The UIs are lightweight front-end apps that use the REST Services above to present user data in a nice way.
Each UI Service should be used around 3 hours a day
So in Total I have 7 MicroServices with each running as AppEngine "Services" in a single GCP "Project". A DataStore is shared between these APIs within this Project.
As I have 7 App Engine instances running, and they only need to be operational for a short period of time per day, how does the pricing work?
I want to use App Engine because it's completely Managed, which is one of my design requirements. But I'm hoping AppEngine has some kind of Sleep Mode, so that when there is no usage it does not bill?
Any help in understanding what my monthly costs would be would be appreciated.
Thanks very much.
Update 8/2/2017
I've decided to stay out of GCP for now. As I hope to have 7 App Engine services running in Flex (as they are node.js), I don't seem to get access to a free tier or the ability to scale idle services to 0 instances.
This means I'll be paying full price for these services (i.e. 7 x the full App Engine VM cost per month :O).
This is an expense I can't justify just for a POC of a proper MicroService design. Instead I'm going to continue with my MicroService design but use a $10 DigitalOcean box and Dokku to containerise my services. If this works well and I have a need, I will migrate this design to GCP (or AWS).
The full outline of App Engine instance handling is available at https://cloud.google.com/appengine/docs/python/how-instances-are-managed .
In short, your best bet is to enable automatic scaling and set
max_idle_instances = 0
in your app.yaml.
That means that your app will autoscale to handle traffic as needed and shut down the instances afterwards. Also, as the documentation notes:
"When settling back to normal levels after a load spike, the number of idle instances can temporarily exceed your specified maximum. However, you will not be charged for more instances than the maximum number you've specified."
Later, when load time becomes more important, you can set min_idle_instances to a more suitable number, which keeps the app responsive.
"am concerned that my costs will be uncontrollable after I build it"
You should be aware that automatically scalable GAE apps always have cost components dependent on the external user request patterns which are not controllable.
For example, in the standard GAE env, the way those 200 requests/day are distributed matters significantly:
if they are evenly distributed they will arrive less than 15 minutes apart (15 minutes being the minimum billed time per instance lifetime), so the respective service will be billed for a minimum of 24 instance-hours per day (very close to the 28 free instance-hours/day for billed apps; only a single-service app using the smallest instance class can fit in it).
if they are all received within a 15-minute interval the service will be billed for 0.5 instance-hours daily (which can easily fit in the free daily quota even with multiple services and/or more powerful instance classes).
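The two cases above can be checked with a little arithmetic. This is a minimal sketch under a simplified billing model that is my assumption, not Google's published formula: a single instance is billed from its first request until 15 minutes after its last one, request processing time is ignored, and the total is capped at 24 hours. The function name and parameters are hypothetical.

```python
def billed_instance_hours(request_minutes, idle_tail_min=15.0, day_min=1440.0):
    """Estimate one day's billed instance-hours for a single instance.

    request_minutes: arrival times in minutes since midnight.
    Assumption: each burst of traffic is billed until idle_tail_min after
    its last request; a longer gap shuts the instance down.
    """
    billed = 0.0
    span_start = span_end = None
    for t in sorted(request_minutes):
        if span_start is None:
            span_start, span_end = t, t + idle_tail_min
        elif t <= span_end:
            # Instance is still alive: extend its billed lifetime.
            span_end = t + idle_tail_min
        else:
            # Instance had already shut down: close the span, start a new one.
            billed += span_end - span_start
            span_start, span_end = t, t + idle_tail_min
    if span_start is not None:
        billed += span_end - span_start
    return min(billed, day_min) / 60.0


evenly_spaced = [i * 1440.0 / 200 for i in range(200)]  # one request every 7.2 min
single_burst = [i * 15.0 / 200 for i in range(200)]     # all within 15 minutes

print(billed_instance_hours(evenly_spaced))
print(billed_instance_hours(single_burst))
```

Under these assumptions the evenly spaced pattern comes out at the full 24 instance-hours and the burst at just under 0.5, matching the figures above.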
The actual scalability configuration of each service can matter as well. See, for example,
The only way to keep costs under strict control is via the daily budget configuration (but hitting that limit means your app's functionality will be temporarily crippled).
With all other usage-based costs being equal (they are determined by the functionality being performed), you have some (potentially significant) control over costs via:
the GAE environment type selected for each service:
the standard env is billed by instance hours and includes a free daily quota
the flex env has no free daily quota.
the number of services: you could start with fewer services by combining their functionalities (you can still keep them modularized for later split). The expected initial load you describe can easily fit within the free daily budget with just a single standard env service.
Once app usage picks up and the free daily quota becomes a negligible percentage of the total costs, you can gradually split the app into multiple services as needed. In general this can be a relatively simple task if the app is properly modularized.

Google App Engine for long running but low CPU tasks, or long-polling?

App Engine has been great for requests that process quickly with no external API calls to databases, caches or third-party resources. But we've found that once we introduce any sort of "longer running" component or external latency, the "instance hours" compound and drive costs up considerably. An example is an HTTP POST operation that runs asynchronously in the background and might take a second or two to process a few more intense database queries: it is invisible and fine from a UX perspective on the client side because it's asynchronous, but it is expensive in App Engine billing because it's long running.
These expense-inducing situations, where a request is literally just waiting for a response from an external resource and needs almost zero CPU while idling, seem avoidable, but I'm not sure they are avoidable with App Engine.
It's almost like a "long poll" where the response might be left open but doing nothing.
Is there a way to do this on App Engine without just paying an insane amount for instance hours, or would we be better off moving to Compute Engine or EC2? Does it scale automatically based on CPU load, or is it based solely on the total count of open (and perhaps inactive) requests? (threadsafe is indeed enabled.)
There are really two ways to go about this one (top of mind).
Use Task Queues!
If the work doesn't need to happen at exactly the same time as the request, this is exactly what task queues in App Engine are for. They allow you to put a job on a queue and have another module pick up the work. They're great because you can scale your front-end and back-end processes separately.
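The shape of that pattern is: the request handler enqueues a job and returns immediately, and a separate worker does the slow part. This sketch uses Python's standard `queue` and `threading` modules purely as a stand-in so that it runs anywhere. It is not the App Engine Task Queue API, and the job fields and handler name are made up for illustration.

```python
import queue
import threading

jobs = queue.Queue()
results = []  # stand-in for wherever the worker stores its output


def worker():
    """Pull jobs off the queue and do the slow work outside the request path."""
    while True:
        job = jobs.get()
        try:
            # Stand-in for the slow database queries or external calls.
            results.append(job["user_id"] * 2)
        finally:
            jobs.task_done()


def handle_request(user_id):
    """Enqueue the work and return at once instead of waiting for it."""
    jobs.put({"user_id": user_id})
    return "202 Accepted"


threading.Thread(target=worker, daemon=True).start()
```

On App Engine the worker would be a separate module or service fed by a task queue rather than a thread in the same process, which is what lets the front end and back end scale independently.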
If that doesn't work....
Use App Engine Flexible
Under the hood App Engine Flexible is just running GCE instances. The cost structure is entirely different, since you persistently have a VM running in the background serving your requests.
Hope this helps!
What you're really worried about here is how App Engine scales your instances. Because many of your requests require few resources, your app might be able to handle many more concurrent requests on a single instance than normal. You can look into the parameters that shape scaling. Of particular interest:
max_concurrent_requests: the number of concurrent requests an automatic scaling instance can accept before the scheduler spawns a new instance (Default: 8, Maximum: 80).
There is a danger here, where an instance may fill up with non-long-polling requests and become overburdened. To prevent that, you could isolate your long-polling requests into their own service and set its scaling parameters separately from the rest of your app.
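To get a feel for how much that setting matters, here is a rough back-of-the-envelope sketch. It assumes the number of instances is driven only by the count of open requests, which is a simplification, since the real scheduler also weighs latency and other settings. The function name and the figure of 160 open requests are hypothetical.

```python
import math


def instances_needed(open_requests, max_concurrent_requests=8):
    """Rough lower bound on instances if only the concurrency limit mattered."""
    if not 1 <= max_concurrent_requests <= 80:
        raise ValueError("max_concurrent_requests must be between 1 and 80")
    return math.ceil(open_requests / max_concurrent_requests)


# Hypothetical load: 160 mostly idle requests open at once.
print(instances_needed(160))      # with the default of 8
print(instances_needed(160, 80))  # with the maximum of 80
```

With these numbers the default needs 20 instances and the maximum needs 2, which is why raising the limit for a service that mostly waits on external resources can cut instance hours substantially.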

google cloud storage performance characteristics (latency / request response time)

I'm considering building an app on App Engine, and I'm trying to decide if I should store data in the datastore or on google cloud storage.
Each object is going to be typically no more than around a kilobyte, perhaps a few kilobytes at most (and often less). It won't change too often.
I could have the client access the data directly, and although I could live without it, there would be some benefit to having the App Engine app access the data and use it as part of serving a response.
What are the performance characteristics of google cloud storage? How quickly do requests come back? I was able to find a status dashboard for the datastore which indicates that they are usually reasonably quick at handling requests but I've had trouble getting guidance on how fast GCS is.
Under the most recent price reductions, it seems like the datastore might actually be cheaper for my use case of relatively small chunks of data ($0.06/100,000 requests vs $0.01/10,000 Class B operations). Am I interpreting that correctly?
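The per-operation arithmetic behind that question can be checked directly. This sketch uses only the two prices quoted above, which may have changed since, and ignores storage and bandwidth charges. The helper name is made up for the example.

```python
from fractions import Fraction


def price_per_million(price_dollars, per_operations):
    """Convert 'price per N operations' into dollars per million operations."""
    return Fraction(price_dollars) * 1_000_000 / per_operations


datastore = price_per_million("0.06", 100_000)  # price quoted in the question
gcs_class_b = price_per_million("0.01", 10_000)  # price quoted in the question

print(float(datastore), float(gcs_class_b))
```

On those figures the datastore works out to $0.60 per million operations and GCS Class B to $1.00 per million, so the datastore is the cheaper of the two per operation.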
This thread might give you some insights. Back in September 2013 I measured about 200-250ms on average for "blank" sequential inserts. You can get a great speedup if you combine your requests: you can insert up to 500 entities in a single request, which takes roughly 500ms-900ms if I remember correctly.
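Combining requests mostly comes down to chunking the entities before writing them. This is a minimal sketch of the chunking step only. The helper name is hypothetical, the batch size of 500 is the limit mentioned above, and the actual write call is left out because it depends on the client library you use.

```python
def batches(items, size=500):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


entities = list(range(1200))  # stand-in for 1200 entities to insert
sizes = [len(batch) for batch in batches(entities)]
print(sizes)
```

Here 1200 entities become three write requests instead of 1200. At the timings quoted above that is a few seconds of total write time rather than several minutes.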

Relative advantages of storage using Amazon Web Services S3 vs Google Application Engine

What do you see as the advantages and disadvantages of Amazon Web Services S3 compared with Google Application Engine? The cost per gigabyte for the two is, at the time I ask, roughly similar; I have not seen any widespread complaints about the quality of service; so I think the decision of which one to use may depend on the API (of all things).
Google's API breaks your content into what they call static content, such as your CSS files, favicons, images, etc and non-static dynamically-generated HTTP responses. Requests for static stuff will be served to whoever requests it until your bandwidth limit is reached; non-static requests will be fulfilled until your bandwidth or CPU limit is reached. With respect to your non-static requests, you can provide any logic you are able to express in Python, so you can be choosy about who you serve.
Amazon's API treats all your content as blobs in a bucket, and provides an access protocol that lets you distinguish between a variety of fulfillable requests ranging from world-readable to owner-only. If you want to do something that's not in the kit, though, I don't know what you do beyond being thoughtful about distributing your URIs.
What differences do you see between the two? Are there other cloud storage services you like? Zetta had a press release today, but they're looking for a minimum of ten terabytes on the beta application, and none of my clients are there (yet); and Joyent will probably do something in the near future.
The way I see it, Google App Engine basically provides a sandbox for you to deploy your app, as long as it is written to their requirements (Python etc). Amazon gives you a virtual machine with a lot more flexibility in what can be done, but probably with more work needed on your side. Microsoft's new Azure seems to be going down the GAE route, but with .NET in place of Python.
GAE has a limit of 10MB each on static files uploaded through appcfg.py (look right at the bottom of http://code.google.com/appengine/docs/python/tools/uploadinganapp.html). Obviously you can write code to slice large files into bits and reassemble at download time, but it suggests to me that Google doesn't expect App Engine to be used just as a simple CDN, and that if you want to use it as one you'll have to do some work. S3 does the job out of the box, all you have to do is grab a third-party interface app.
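The slice-and-reassemble work mentioned above is simple in principle. This is a minimal sketch that operates on bytes in memory, assuming the limit is treated as 10 * 1024 * 1024 bytes. The function names are hypothetical, and the naming and serving of the individual pieces is left out.

```python
# Assumption: treat the 10MB limit as 10 MiB. Lower this to leave headroom.
CHUNK_SIZE = 10 * 1024 * 1024


def split_bytes(data, chunk_size=CHUNK_SIZE):
    """Slice data into pieces of at most chunk_size bytes each."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def join_bytes(parts):
    """Reassemble the pieces, in order, into the original data."""
    return b"".join(parts)
```

The pieces must be reassembled in their original order, so in practice each one needs an index in its file name or in a small manifest.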
If you want to do something non-standard with file access on S3, then probably Amazon expects you to spring for a server instance on EC2. Once this is done, you have much more flexibility than GAE's environment, but you pay more (in cash and probably in maintenance).
The plus point for GAE is that it has "cheap" on its side for small apps (up to 1GB storage, 1GB bandwidth and 1.3 million hits a day are free: http://code.google.com/appengine/docs/quotas.html). Depending on your use, this might be significant, or it might be irrelevant on the scale of your total bandwidth costs.
Coincidentally, I have just this last couple of days looked at GAE for the first time. I took an old Perl CGI script and turned it into a GAE app, which is up and running. About 10 hours total, including reading the GAE introductory docs and remembering how Python is supposed to work enough to write a couple of hundred lines. I'd speculate that's more effort than loading a bunch of files onto S3, but less effort than maintaining EC2 server(s). However, I haven't used Amazon.
[Edited to add: this sounds like the advantages are all with Amazon for commercial purposes. This may well be true, but then GAE is not yet mature and presumably will get better from here fairly rapidly. They only let people start paying in December or so; before that it was free-quota-only except by special arrangement with Google. While Google sometimes takes flak for its claims of "perpetual beta", I think GAE genuinely is still starting up. If your app is a good fit for the BigTable data paradigm, then it might scale better on GAE than EC2. For storage I assume that S3 is already good enough for all reasonable purposes, and Google's clever architecture gives GAE no advantages to compensate when all you're doing is serving files.]
* Except that Google has just offered me a preview of GAE's Java support.
** Just noticed that you can set up cron jobs, but they're limited by the same rules as any other request (30 second runtime, can't modify files, etc).
