Google App Engine internal network - google-app-engine

Is it possible to route HTTP traffic between google app engine applications without going through the public internet?
For example, if I'm running a Web Service API on one application and want to build a second application on top of it without traffic going through the internet - for performance reasons.

Between separate apps running on different domains? I suspect not.
But you can use backends to do different work behind the scenes:
Backends are special App Engine instances that have no request deadlines, higher memory and CPU limits, and persistent state across requests. They are started automatically by App Engine and can run continously for long periods. Each backend instance has a unique URL to use for requests, and you can load-balance requests across multiple instances.
When I look at the logs between the backend and the front end instances I see IPs like
0.1.0.3
So yes, those communication paths are internal. I'd hazard a guess that as so much of the internet is google you could say requests between different apps might not travel on the public internet.
Logs indicate low latency communication between front and back ends, not under any particular load however. Your milage may vary.
Backends in Python

Related

Best configuration for Automatic Scaling in Google App Engine to always have an instance available?

What is the best way to set up Google App Engine to always have at least one instance ready and available to handle requests when using automatic scaling? This is for a low traffic application.
There are settings here that allow you to control them but I am not sure what the best combination is and some of them sound confusing. For example, min-instances and min-idle-instances sound similar. I tried setting min-instances to 1 but I still experienced lag.
What is a configuration that from the end user's point of view is always on and handles requests without lag (for a low traffic application)? I
In the App Engine Standard environment, when your application load or handling a requests this may cause the users to experience more latency, however warmup requests might help you reduce this latency. Before any live requests get to that instance, warmup requests load the app's code onto a new one. If this is enabled, App Engine will detect if your application needs a new instance and initiate a warmup request to initialize a new instance. You can check this link for Configuring Warmup Requests to Improve Performance
Regarding min-instances and min-idle-instances these will only apply when warmup request is enabled. As you can see in this post the difference of these two elements: min-instances used to process the incoming request immediately while min-idle-instances used to process high load traffic.
However, you mentioned that you don't need a warmup so we suggest you to select App Engine Flexible and based on this documentation it must have at least one instance running and can scale up in response to traffic. Please take note that using this environment costs you a higher price. You can refer to this link for reference regarding the pricing of two environments in App Engine.

Google Cloud architecture to reduce latency time with App Engine and a VM instance working together

Being new to GCP, I have a question about which architecture to use in a particular case.
Suppose I have a Django website running on the App engine (flexible environment?). Users upload images to the website. I would like to first use Google Vision API to perform some label detection on the images and then feed the labels and images to a VM with GPU attached (all running on Google cloud), for additional computationally costly job on the images. After the job is completed by the VM, the resulting images are then available for the user to download or sent to the user email.
Because of the relatively large time spent on the VM+GPU side, and because the website will be accessed by users globally, I would like to reduce the overall latency time and pick the most efficient architecture for the job.
My first thought was to:
upload images to Google Cloud Storage;
use GC functions to perform some quick transformations and then call Google Vision API;
pull the resulting labels and transformed images to the VM and make computations on the VM side;
upload finalized images to Google Cloud Storage.
Now, that's a lot of bouncing back and forth between a storage bucket and APP engine plus VM on either side. I was wondering if there is a 1) quicker and 2) more efficient resources-wise way to achieve the same goal.
If your website is accessed globally, your App Engine choice is the wrong one: App Engine can be deployed in only one region, not globally.
For the frontend, I recommend to use Cloud Run instead (or VM, but I don't like VM) and to put a HTTPS load balancer in front of. Like that, the physical latency is reduced.
And, the files must be also store in the closest region, so in Cloud Storage in different region.
And finally, to duplicate the VM/GPU infrastructure in each region (it could be costly, but it's the best way to reduce latency.
Your process is the right one. I recommend you to expose an API on your VM to notify it when a file is ready. You can use the PubSub notification on Cloud Storage to sink the event in PubSub, and then create a push subscription to invoke your VM directly (instead of a cloud functions).
Like that, you remove a component and you perform all your processing on the VM side.

With AngularJS based single page apps hosted on premise, how to connect to AWS cloud servers

Maybe this is a really basic question, but how do you architect your system such that your single page application is hosted on premise with some hostname, say mydogs.com but you want to host your application services code in the cloud (as well as database). For example, let's say you spin up an Amazon EC2 Container Service using docker and it is running NodeJS server. The hostnames will all have ec2_some_id.amazon.com. What system sits in from of the Amazon EC2 instance where my angularjs app connects to? What architecture facilitate this type of app? Especially AWS based services.
One of the important aspects setting up the web application and the backend is to server it using a single domain avoiding cross origin requests (CORS). To do this, you can use AWS CloudFront as a proxy, where the routing happens based on URL paths.
For example, you can point the root domain to index.html while /api/* requests to the backend endpoint running in EC2. Sample diagram of the architecture is shown below.
Also its important for your angular application to have full url paths. One of the challenges having these are, for routes such as /home /about and etc., it will reload a page from the backend for that particular path. Since its a single page application you won't be having server pages for /home and /about & etc. This is where you can setup error pages in CloudFront so that, all the not found routes also can be forwarded to the index.html (Which serves the AngularJS app).
The only thing you need to care about is the CORS on whatever server you use to host your backend in AWS.
More Doc on CORS:
https://developer.mozilla.org/en-US/docs/Web/HTTP/CORS
Hope it helps.
A good approach is to have two separated instances. It is, an instance to serve your API (Application Program Interface) and another one to serve your SPA (Single Page Application).
For the API server you may want more robust service because it's the one that will suffer the most receiving tons of requests from all client instances, so this one needs to have more performance, band, etc. In addition, you probably want your API server to be scalable when needed (depends on the load over it); maybe not, but is something to keep in mind if your application is supposed to grow fast. So you may invest a little bit more on this one.
The SPA server in the other hand, is the one that will only serve static resources (if you're not using server side rendering), so this one is supposed to be cheaper (if not free). Furthermore, all it does is to serve the application resources once and the application actually runs on client and most files will end up being cached by the browser. So you don't need to invest much on this one.
Anyhow, you question about which service will fit better for this type of application can't be answered because it doesn't define much about that you may find the one that sits for the requisites you want in terms of how your application will be consumed by the clients like: how many requests, downloads, or storage your app needs.
Amazon EC2 instance types

GAE shutdown or restart all the active instances of a service/app

In my app (Google App Engine Standard Python 2.7) I have some flags in global variables that are initialized (read values from memcache/Datastore) when the instance start (at the first request). That variables values doesn't change often, only once a month or in case of emergencies (i.e. when google app engine Taskqueue or Memcache service are not working well, that happened not more than twice a year as reported in GC Status but affected seriously my app and my customers: https://status.cloud.google.com/incident/appengine/15024 https://status.cloud.google.com/incident/appengine/17003).
I don't want to store these flags in memcache nor Datastore for efficiency and costs.
I'm looking for a way to send a message to all instances (see my previous post GAE send requests to all active instances ):
As stated in https://cloud.google.com/appengine/docs/standard/python/how-requests-are-routed
Note: Targeting an instance is not supported in services that are configured for auto scaling or basic scaling. The instance ID must be an integer in the range from 0, up to the total number of instances running. Regardless of your scaling type or instance class, it is not possible to send a request to a specific instance without targeting a service or version within that instance.
but another solution could be:
1) Send a shutdown message/command to all instances of my app or a service
2) Send a restart message/command to all instances of my app or service
I use only automatic scaling, so I'cant send a request targeted to a specific instance (I can get the list of active instances using GAE admin API).
it's there any way to do this programmatically in Python GAE? Manually in the GCP console it's easy when having a few instances, but for 50+ instances it's a pain...
One possible solution (actually more of a workaround), inspired by your comment on the related post, is to obtain a restart of all instances by re-deployment of the same version of the app code.
Automated deployments are also possible using the Google App Engine Admin API, see Deploying Your Apps with the Admin API:
To deploy a version of your app with the Admin API:
Upload your app's resources to Google Cloud Storage.
Create a configuration file that defines your deployment.
Create and send the HTTP request for deploying your app.
It should be noted that (re)deploying an app version which handles 100% of the traffic can cause errors and traffic loss due to:
overwriting the app files actually being in use (see note in Deploying an app)
not giving GAE enough time to spin up sufficient instances fast enough to handle high income traffic rates (more details here)
Using different app versions for the deployments and gradually migrating traffic to the newly deployed apps can completely eliminate such loss. This might not be relevant in your particular case, since the old app version is already impaired.
Automating traffic migration is also possible, see Migrating and Splitting Traffic with the Admin API.
It's possible to use the Google Cloud API to stop all the instances. They would then be automatically scaled back up to the required level. My first attempt at this would be a process where:
The config item was changed
The current list of instances was enumerated from the API
The instances were shutdown over a time period that allows new instances to be spun up and replace them, and how time sensitive the config change is. Perhaps close on instance per 60s.
In terms of using the API you can use the gcloud tool (https://cloud.google.com/sdk/gcloud/reference/app/instances/):
gcloud app instances list
Then delete the instances with:
gcloud app instances delete instanceid --service=s1 --version=v1
There is also a REST API (https://cloud.google.com/appengine/docs/admin-api/reference/rest/v1/apps.services.versions.instances/list):
GET https://appengine.googleapis.com/v1/{parent=apps/*/services/*/versions/*}/instances
DELETE https://appengine.googleapis.com/v1/{name=apps/*/services/*/versions/*/instances/*}

My google app instances does not seem to be on correct region

I have just created one google app engine application and one 2nd Generation MySQL instance in eu-west2 region. In GCP Console they both seems to be in eu-west2 region.
However when I try to gelocate my ip's they seem to be in somewhere in US.
What should I do to use GCP in eu-west2 region?
my GCP instances:
their locations:
Google has an extensive world wide network. What you are seeing is us routing you to Google's closest Point of Presence (POP), which from that point on you're on a software defined network (SDN). What this means is we get your traffic on to our fast network as quickly as possible and abstract away the details of getting you to the machine in question.
Check latency from you to these hosts, then spin up a VM in Europe and check latency from that VM to these hosts - you'll find the numbers will confirm they really are in eu-west2.
I faced the same issue, you can find more about it here: Outgoing HTTP Request Location on Google App Engine
Is about the Google Network usage, the outgoing traffic come from the "Point of Presence" instead of the location and it can be dynamic. In my case, I have no solution, since it's mandatory to my API to make the requests from Brazil =\

Resources