Whenever I deploy a new version in Google App Engine, and I transfer traffic to it, the previous version still consumes messages from our message broker. How can I make sure only the newly deployed version will consume messages without shutting down the old instances?
If you have multiple versions deployed while traffic is being migrated, you can check the current version using the Modules API and compare that with the default version.
Your check might look something like this:
from google.appengine.api import modules
def default_version = modules.get_default_version()
def instance_version = modules.get_current_version_name()
# you may additionally want to query the instances of the default version
# to make sure they've booted up and are actively serving traffic.
if default_version != instance_version:
# don't consume messages
In the code example above, the default version is the version traffic is being migrated to, and the current version is the version of the instance.
See also Using the Modules API.
Note: Services were formerly known as modules and the API methods still reflect that naming.
During the deployment of your service, you can use --promote and --stop-previous-version options.
https://cloud.google.com/sdk/gcloud/reference/app/deploy
However, generally it's best to gradually migrate the traffic. This is the case for both users and backend services. Since you can't deploy two services at the exact same time, imagine your GAE deployment is delayed by a few seconds. Do you expect the messages to be consumed by your running service that is getting replaced? So, it shouldn't matter if a few messages still got routed to the old instance while the traffic is migrating. That would be the right design.
Related
I'm running a Google App Engine production server, using basic_scaling as the scale type. Whenever I update the code and deploy it - using gcloud app deploy - the old version of the code is shutdown.
According to the documentation, that's expected:
The shutdown process might be triggered by a variety of planned and unplanned events, such as:
You manually stop an instance.
You deploy an updated version to the service.
...
I understand that it's easier for most developers that way. But in my case, I'd like to keep the old versions running until the idle_timeout limit is reached. Does anyone know if there's a way to avoid the automatic shutdown and let the old versions to shutdown by themselves?
Per Google's documentation, when you deploy your code, the default flag of --stop-previous-version is used. This forces the previous version to be stopped. If you do not want that, you should explicitly use the --no-stop-previous-version in your deploy command (we also have this as a feature on our App, a GUI for GAE; you check or uncheck a checkbox).
Unfortunately, Google does not provide a way for the service to automatically shut down later. You'll have to manually shut it down and start the other version later.
It appears that AppEngine standard has a warmup feature to warm up an app after a deployment but I don't see the same feature available for Flex. The readiness & liveness probes also don't work for this since setting the path setting to a custom path inside the application doesn't seem to make the probes actually hit the internal endpoint.
Is there some solution I'm missing other than doing things like manually hitting the endpoints myself after the deployment which won't be very reliable since the calls don't necessarily always round robin to each instance?
In App Engine Standard, warmup requests essentially load your app's code into a new instance before any live requests reach that instance. This can happen in the following situations:
When you redeploy a version of your app.
When new instances are created due to the load from requests
exceeding the capacity of the current set of running instances.
When maintenance and repairs of the underlying infrastructure or
physical hardware occur
In App Engine Flexible, you can achieve the same result by using the initial_delay_sec setting for liveness checks in your app.yaml file. If you set up its value to give enough time for your code to initialize, the first request coming to that instance will be processed quickly by your already-initialized code.
I have had a Python-based Google App Engine app working great using Cloud Endpoints 1.0 for several years without incident. I have had nothing but trouble migrating to Cloud Endpoints 2.0.
Currently I'm in the following state after already clearing many previous hurdles described in other similar questions:
I have one version of my service called gce1 which uses Endpoints 1.0 and is set as the default service receiving 100% of my traffic. I can point API clients and the APIs Explorer to both gce1-dot-myservice.appspot.com and the default myservice.appspot.com and everything works fine. I can verify in the logs that anything that goes through here is using GCE 1.0.
I have a second version of my service called gce2 which is not receiving any traffic by default, but if I point an API client or the APIs Explorer to gce2-dot-myservice.appspot.com it works just fine, and I can verify in the logs that anything that goes through here is using GCE 2.0.
Great, right? So it would seem that all I need to do is migrate all my traffic to gce2 and I'm done.
But... when I do that everything breaks! The default myservice.appspot.com serves up 405 POST Method Not Allowed responses to my existing clients, and if I look at the APIs Explorer, suddenly it now shows a bunch of obsolete methods that I think are from years ago and are no longer used in my current API. I can't tell where those are coming from (they are nowhere in my code, and haven't been for years), and I can't get the default service to serve the GCE 2.0 API no matter what I do.
The biggest problem is that I have thousands of users in the wild that all point to the default API URL, so it isn't so easy to just have them start pointing to gce2-dot-myservice, and besides, it doesn't make sense that I can't make the new default the new default. I've been working on this migration to GCE 2.0 for months, the deadline for getting off GCE 1.0 is getting closer by the day, and Google Support has not responded since late last year on this topic.
I should also mention I have tried:
Pushing a new service with the GCE2.0 code directly to default
Pushing a new service with no API at all (to maybe clear a cache or something)
Pushing services with all different sorts of version names
None of these have worked, although I haven't done any of them allowing a long delay since I'm working on a live service with real users.
This issue is now resolved, so for most people it should no longer occur. However, in my specific case, I had a legacy API that was getting in the way and had to be deleted, which did require specific attention from a Google engineer.
If you have similar issues, visit issuetracker.google.com/issues/76031966 and comment there.
Thanks to #saiyr for help tracking this down.
In my app (Google App Engine Standard Python 2.7) I have some flags in global variables that are initialized (read values from memcache/Datastore) when the instance start (at the first request). That variables values doesn't change often, only once a month or in case of emergencies (i.e. when google app engine Taskqueue or Memcache service are not working well, that happened not more than twice a year as reported in GC Status but affected seriously my app and my customers: https://status.cloud.google.com/incident/appengine/15024 https://status.cloud.google.com/incident/appengine/17003).
I don't want to store these flags in memcache nor Datastore for efficiency and costs.
I'm looking for a way to send a message to all instances (see my previous post GAE send requests to all active instances ):
As stated in https://cloud.google.com/appengine/docs/standard/python/how-requests-are-routed
Note: Targeting an instance is not supported in services that are configured for auto scaling or basic scaling. The instance ID must be an integer in the range from 0, up to the total number of instances running. Regardless of your scaling type or instance class, it is not possible to send a request to a specific instance without targeting a service or version within that instance.
but another solution could be:
1) Send a shutdown message/command to all instances of my app or a service
2) Send a restart message/command to all instances of my app or service
I use only automatic scaling, so I'cant send a request targeted to a specific instance (I can get the list of active instances using GAE admin API).
it's there any way to do this programmatically in Python GAE? Manually in the GCP console it's easy when having a few instances, but for 50+ instances it's a pain...
One possible solution (actually more of a workaround), inspired by your comment on the related post, is to obtain a restart of all instances by re-deployment of the same version of the app code.
Automated deployments are also possible using the Google App Engine Admin API, see Deploying Your Apps with the Admin API:
To deploy a version of your app with the Admin API:
Upload your app's resources to Google Cloud Storage.
Create a configuration file that defines your deployment.
Create and send the HTTP request for deploying your app.
It should be noted that (re)deploying an app version which handles 100% of the traffic can cause errors and traffic loss due to:
overwriting the app files actually being in use (see note in Deploying an app)
not giving GAE enough time to spin up sufficient instances fast enough to handle high income traffic rates (more details here)
Using different app versions for the deployments and gradually migrating traffic to the newly deployed apps can completely eliminate such loss. This might not be relevant in your particular case, since the old app version is already impaired.
Automating traffic migration is also possible, see Migrating and Splitting Traffic with the Admin API.
It's possible to use the Google Cloud API to stop all the instances. They would then be automatically scaled back up to the required level. My first attempt at this would be a process where:
The config item was changed
The current list of instances was enumerated from the API
The instances were shutdown over a time period that allows new instances to be spun up and replace them, and how time sensitive the config change is. Perhaps close on instance per 60s.
In terms of using the API you can use the gcloud tool (https://cloud.google.com/sdk/gcloud/reference/app/instances/):
gcloud app instances list
Then delete the instances with:
gcloud app instances delete instanceid --service=s1 --version=v1
There is also a REST API (https://cloud.google.com/appengine/docs/admin-api/reference/rest/v1/apps.services.versions.instances/list):
GET https://appengine.googleapis.com/v1/{parent=apps/*/services/*/versions/*}/instances
DELETE https://appengine.googleapis.com/v1/{name=apps/*/services/*/versions/*/instances/*}
We have Java servlets up and running on GAE, using blobstore, datastore and other cloud services.
Currently, we're starting a migration process to cloud Endpoints and we've hit an issue: if we use a different GAE project, we would not be able to query regarding current datastore entities (to the best of my knowledge, Google doesn't want you to do this - see
this question
and the GAE terms of service - section 3.3d), so we need to use the same project for both.
I looked up whether it's possible to have one GAE instance running Java servlets and one instance running Endpoints, but I found no conclusive answer anywhere.
If we try to implement and something goes wrong, we're looking at a potentially major issue for our users, so we need to be sure beforehand.
Has anyone tried something similar, and can assure us that this works?
You have 2 options to run the old and the new code inside the same app (thus with no issues sharing access to the datastore) but as separate engine instances, so they can be developed/deployed/managed independently:
as different versions of the same app/module(s):
the old version remains the default, the new one can be accessed at a different URL during development (possibly via URL routing)
you can use traffic splitting to do live A/B testing on the new code and for gradual final migration until you make the new version the default
as different modules of the same app:
both can run (fully functional) side by side indefinitely, but you need to be more careful during development
traffic is routed to the modules in several possible ways
final migration is done by publishing the new URLs, eventually re-directing the old URLs and finally bringing down the old module code
The 2 approaches can even be combined, if needed, as the final solution described by the OP's in this somehow similar question (for the python environment, but java equivalents exist): Google App Engine upgrading part by part