I have a Google App Engine flexible service with automatic scaling, which scales from 1 to x instances based on a CPU utilization threshold.
Every instance sends metrics to a global Graphite server.
I would like to know if there is a way to set or get a consistent instance name for every newly deployed instance.
At the moment, every instance has its own unique ID, which changes on every deployment. I would like to define x names that are always attached to the instances of the App Engine service, without using another service to manage that.
Is anyone familiar with a Google service or API for that purpose?
Related
Being new to GCP, I have a question about which architecture to use in a particular case.
Suppose I have a Django website running on App Engine (flexible environment?). Users upload images to the website. I would like to first use the Google Vision API to perform label detection on the images and then feed the labels and images to a VM with a GPU attached (all running on Google Cloud) for an additional, computationally costly job on the images. After the VM completes the job, the resulting images are made available for the user to download or are sent to the user's email.
Because of the relatively long time spent on the VM+GPU side, and because the website will be accessed by users globally, I would like to reduce the overall latency and pick the most efficient architecture for the job.
My first thought was to:
upload images to Google Cloud Storage;
use Cloud Functions to perform some quick transformations and then call the Google Vision API;
pull the resulting labels and transformed images to the VM and make computations on the VM side;
upload finalized images to Google Cloud Storage.
Now, that's a lot of bouncing back and forth between a storage bucket and App Engine plus the VM on either side. I was wondering if there is a 1) quicker and 2) more resource-efficient way to achieve the same goal.
If your website is accessed globally, App Engine is the wrong choice: an App Engine app can be deployed in only one region, not globally.
For the frontend, I recommend using Cloud Run instead (or VMs, but I don't like VMs) and putting an HTTPS load balancer in front of it. That way, the physical latency is reduced.
The files must also be stored in the closest region, so use Cloud Storage buckets in different regions.
Finally, duplicate the VM/GPU infrastructure in each region (it could be costly, but it's the best way to reduce latency).
Your process is the right one. I recommend exposing an API on your VM to notify it when a file is ready. You can use Pub/Sub notifications for Cloud Storage to publish the event to Pub/Sub, and then create a push subscription that invokes your VM directly (instead of a Cloud Function).
That way, you remove a component and perform all your processing on the VM side.
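As a rough illustration of the VM side of that push subscription, here is a minimal sketch of how the endpoint could unpack the request body. It assumes the bucket notification uses the JSON payload format, so that the message attributes carry `bucketId`, `objectId` and `eventType`, and the base64-encoded `data` field carries the object resource. The function name `parse_gcs_push` and the sample values are made up for the example.

```python
import base64
import json


def parse_gcs_push(body: bytes) -> dict:
    """Extract bucket, object name and event type from a Pub/Sub push body.

    Hypothetical helper: it only decodes the envelope and does no
    authentication or validation of the sender.
    """
    envelope = json.loads(body)
    message = envelope["message"]
    attributes = message.get("attributes", {})
    # "data" is base64-encoded; with the JSON payload format it holds
    # the Cloud Storage object resource. Fall back to {} when it is empty.
    payload = json.loads(base64.b64decode(message.get("data", "")) or b"{}")
    return {
        "bucket": attributes.get("bucketId") or payload.get("bucket"),
        "name": attributes.get("objectId") or payload.get("name"),
        "event": attributes.get("eventType"),
    }


if __name__ == "__main__":
    # Hand-built sample envelope, not captured from a real subscription.
    resource = json.dumps({"bucket": "uploads", "name": "cat.jpg"}).encode()
    sample = json.dumps({
        "message": {
            "attributes": {"eventType": "OBJECT_FINALIZE"},
            "data": base64.b64encode(resource).decode(),
            "messageId": "1",
        },
        "subscription": "projects/my-project/subscriptions/my-sub",
    }).encode()
    print(parse_gcs_push(sample))
```

The web framework that receives the POST is left out on purpose; whatever serves the endpoint on the VM would pass the raw request body to this function and then start the GPU job for the returned bucket and object name.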
I have my code running on App Engine Flex with 5 instances.
Is there a way to graphically visualize the distribution of requests per instance?
The current dashboard doesn't go down to the instance level, and my client is interested in an instance-level view.
At this time it is not possible to do that. However, I created a feature request for this. I suggest you star the feature request so that you are notified every time there is an update.
I have a Google App Engine Standard app running in region1 and want to deploy the same app as a backup in region2 in case region1 is down. I'm looking for a way to make that failover happen seamlessly for my users (both human users on browsers and third-party services that call back to my app's service).
Currently I have my custom domain name (both the naked name and the www name) mapped to the app in region1 (done in [Google Cloud Console][App Engine][Settings][Custom Domains]).
In the event region1 is down, I would like to go to that settings area of app1 (region1), remove those mappings, and then go to the same settings area of app2 (region2) and add them there, so that from that point on, requests to myappdomainname.com and www.myappdomainname.com go to app2 in region2.
Question: is that plan feasible? In particular, if region1 is down, will I still be able to access app1's settings area to remove those mappings, so that I can add them to app2?
Downtime of about an hour while switching is okay for my app, as long as my users can continue to use the same URL they were using when region1 was still running.
Google App Engine is a regional service, meaning that an app cannot span more than one region. However, it is replicated across all zones of the region to reduce potential downtime.
The kind of implementation you want runs counter to the purpose of GAE. One of GAE's principal features is that you don't have to configure and manage the instances running in the background yourself.
The preferred way of getting this to work on Google Cloud Platform would be using Compute Engine. GCE gives you the option to create the instances in any region you want and configure a Load Balancer to serve the traffic and scale your instances as you want. Here is some documentation about serving applications using GCE:
Running a Go app on Compute Engine (part of a GCP quickstart)
Building Scalable and Resilient Web Applications on Google Cloud Platform
Designing Robust Systems
Also, here's a Google Groups post about this issue that goes into a little more detail.
In my app (Google App Engine Standard, Python 2.7) I have some flags in global variables that are initialized (with values read from memcache/Datastore) when the instance starts (at the first request). Those values don't change often: only once a month or in emergencies (i.e. when the App Engine Task Queue or Memcache service is not working well, which has happened no more than twice a year according to GC Status but seriously affected my app and my customers: https://status.cloud.google.com/incident/appengine/15024 https://status.cloud.google.com/incident/appengine/17003).
I don't want to store these flags in memcache or Datastore, for efficiency and cost reasons.
I'm looking for a way to send a message to all instances (see my previous post, GAE send requests to all active instances):
As stated in https://cloud.google.com/appengine/docs/standard/python/how-requests-are-routed
Note: Targeting an instance is not supported in services that are configured for auto scaling or basic scaling. The instance ID must be an integer in the range from 0, up to the total number of instances running. Regardless of your scaling type or instance class, it is not possible to send a request to a specific instance without targeting a service or version within that instance.
Another solution could be:
1) Send a shutdown message/command to all instances of my app or a service
2) Send a restart message/command to all instances of my app or service
I use only automatic scaling, so I can't send a request targeted at a specific instance (I can get the list of active instances using the GAE Admin API).
Is there any way to do this programmatically in Python on GAE? Doing it manually in the GCP console is easy with a few instances, but for 50+ instances it's a pain...
One possible solution (actually more of a workaround), inspired by your comment on the related post, is to force a restart of all instances by redeploying the same version of the app code.
Automated deployments are also possible using the Google App Engine Admin API, see Deploying Your Apps with the Admin API:
To deploy a version of your app with the Admin API:
Upload your app's resources to Google Cloud Storage.
Create a configuration file that defines your deployment.
Create and send the HTTP request for deploying your app.
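To make steps 2 and 3 a bit more concrete, here is a minimal sketch that builds the URL and JSON body for creating a version through the Admin API. The field names (`id`, `runtime`, `threadsafe`, `handlers`, `deployment.files.*.sourceUrl`) reflect my reading of the Admin API version resource and should be checked against the reference. The project, bucket, file names and the `main.app` script path are placeholders, and sending the request (with an OAuth access token) is left out.

```python
import json


def build_deploy_request(project_id, service, version_id, bucket, files):
    """Return (url, json_body) for an Admin API version-creation request.

    Hypothetical helper: it assumes every path in `files` was already
    uploaded to `bucket` under the same relative name (step 1).
    """
    url = "https://appengine.googleapis.com/v1/apps/{}/services/{}/versions".format(
        project_id, service)
    body = {
        "id": version_id,
        "runtime": "python27",
        "threadsafe": True,
        # Placeholder handler: route everything to a WSGI app in main.py.
        "handlers": [{"urlRegex": "/.*", "script": {"scriptPath": "main.app"}}],
        "deployment": {
            "files": {
                path: {
                    "sourceUrl": "https://storage.googleapis.com/{}/{}".format(bucket, path)
                }
                for path in files
            }
        },
    }
    return url, json.dumps(body, indent=2, sort_keys=True)


if __name__ == "__main__":
    url, body = build_deploy_request("my-project", "default", "v1", "my-bucket", ["main.py"])
    print("POST", url)
    print(body)
```

The printed body is what would be sent as the deployment configuration in an authenticated POST to the printed URL.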
It should be noted that (re)deploying an app version which handles 100% of the traffic can cause errors and traffic loss due to:
overwriting the app files currently in use (see the note in Deploying an app)
not giving GAE enough time to spin up enough instances to handle high incoming traffic rates (more details here)
Deploying under a different app version and gradually migrating traffic to the newly deployed version can eliminate such loss entirely. This might not be relevant in your particular case, since the old app version is already impaired.
Automating traffic migration is also possible, see Migrating and Splitting Traffic with the Admin API.
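For illustration, a traffic migration request could be assembled roughly as below. This is a sketch under the assumption that the service is updated with a PATCH using `updateMask=split`, and that gradual migration is requested with `migrateTraffic=true` plus a `shardBy` value. The function name and the project, service and version IDs are placeholders, and authentication is not shown.

```python
def build_traffic_migration(project_id, service, new_version, gradual=True):
    """Return (method, url, body) that moves all traffic to `new_version`.

    Hypothetical helper: it only builds the request, it does not send it.
    """
    url = "https://appengine.googleapis.com/v1/apps/{}/services/{}?updateMask=split".format(
        project_id, service)
    # Send 100% of the traffic to the new version.
    split = {"allocations": {new_version: 1.0}}
    if gradual:
        # Gradual migration needs a sharding method on the split.
        url += "&migrateTraffic=true"
        split["shardBy"] = "IP"
    return "PATCH", url, {"split": split}


if __name__ == "__main__":
    method, url, body = build_traffic_migration("my-project", "default", "v2")
    print(method, url)
    print(body)
```

With `gradual=False` the same request shifts traffic immediately, which may be acceptable when the old version is already impaired.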
It's possible to use the Google Cloud API to stop all the instances. They would then be automatically scaled back up to the required level. My first attempt at this would be a process where:
The config item is changed
The current list of instances is enumerated from the API
The instances are shut down over a period chosen to give new instances time to spin up and replace them, balanced against how time-sensitive the config change is. Perhaps shut down one instance every 60s.
In terms of using the API you can use the gcloud tool (https://cloud.google.com/sdk/gcloud/reference/app/instances/):
gcloud app instances list
Then delete the instances with:
gcloud app instances delete instanceid --service=s1 --version=v1
There is also a REST API (https://cloud.google.com/appengine/docs/admin-api/reference/rest/v1/apps.services.versions.instances/list):
GET https://appengine.googleapis.com/v1/{parent=apps/*/services/*/versions/*}/instances
DELETE https://appengine.googleapis.com/v1/{name=apps/*/services/*/versions/*/instances/*}
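The staggered shutdown described above could be scripted along these lines. This is only a sketch: the instance list is assumed to have been collected beforehand (for example from `gcloud app instances list`) into dicts with `id`, `service` and `version` keys, which is my own convention here, and the command runner is injected so the loop can be tried without touching a real project. `--quiet` is gcloud's global flag for suppressing confirmation prompts.

```python
import time


def delete_command(instance_id, service, version):
    """Build the argv for deleting one App Engine instance with gcloud."""
    return ["gcloud", "app", "instances", "delete", instance_id,
            "--service=" + service, "--version=" + version, "--quiet"]


def rolling_restart(instances, run, pause_seconds=60, sleep=time.sleep):
    """Delete instances one at a time, pausing between deletions.

    `instances` is a list of dicts with "id", "service" and "version" keys
    (an assumption of this sketch). `run` is any callable that executes an
    argv list, e.g. lambda cmd: subprocess.run(cmd, check=True).
    """
    for index, instance in enumerate(instances):
        if index:
            # Give the autoscaler time to replace the previous instance.
            sleep(pause_seconds)
        run(delete_command(instance["id"], instance["service"], instance["version"]))


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    demo = [{"id": "instance-a", "service": "s1", "version": "v1"},
            {"id": "instance-b", "service": "s1", "version": "v1"}]
    rolling_restart(demo, run=lambda cmd: print(" ".join(cmd)), pause_seconds=0)
```

The 60 second default mirrors the pace suggested above; a production script would probably also check that replacement instances are serving before deleting the next one.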
I am trying to set up an application on Google Compute Engine, and I want a scaling script that launches VM instances based on some criteria. Google provides an autoscaler for this, but is it possible to do it without the autoscaler, through the Google APIs?
I would also like to know the procedure for creating an image on Google Compute Engine. I have created an instance group with an instance template that launched one VM instance, but when I try to create an image using the new image option, it doesn't list that instance's disk.
For the first question, you can write your own autoscaler. Compute Engine instances can be managed through the Compute Engine API: https://cloud.google.com/compute/docs/reference/latest/
You can host your own autoscaler on App Engine, with a cron job that checks machine health and CPU every minute, for example.
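The decision step of such a home-made autoscaler might look like the sketch below. It is not how Google's autoscaler works: the policy (size the group so that average CPU lands near a target), the 0.6 target and the instance limits are all assumptions picked for the example, and the calls that read the CPU metrics and create or delete VMs through the Compute Engine API are deliberately left out.

```python
import math


def desired_instance_count(cpu_utilizations, target=0.6,
                           min_instances=1, max_instances=10):
    """Return how many instances would bring average CPU close to `target`.

    `cpu_utilizations` holds one value between 0.0 and 1.0 per running
    instance. All thresholds are illustrative defaults.
    """
    if not cpu_utilizations:
        return min_instances
    # Total load expressed in "fully busy instance" units.
    total_load = sum(cpu_utilizations)
    wanted = math.ceil(total_load / target)
    # Clamp to the configured bounds.
    return max(min_instances, min(max_instances, wanted))


if __name__ == "__main__":
    current = [0.9, 0.9, 0.9]
    print("running:", len(current), "wanted:", desired_instance_count(current))
```

The cron handler would call this once per run with the latest CPU readings and then add or remove VMs to close the gap between the running count and the returned value.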
Please write a new SO question for the second question.