Error deploying java google app engine flexible application - Timed out waiting for the app infrastructure to become healthy - google-app-engine

Writing this issue as I have no idea how to investigate it.
We're having problems in deploying an app engine flexible application.
The problem is, that the only error we get is the following:
GCLOUD: ERROR: (gcloud.app.deploy) Error Response: [4] Timed out waiting for the app infrastructure to become healthy.
I already tried the following:
Try a simple helloWorld app, to make sure it's not an application issue
Check quota settings -> All green
Check activity stream for warnings or errors
Check logs for warnings or errors
Grant owner role to service account which is deploying the app
App.yaml:
service: test-service   # ID of the service
env: flex               # Flexible environment
runtime: java           # Java runtime
runtime_config:
  jdk: openjdk8         # use OpenJDK 8
resources:
  cpu: 1
  memory_gb: 2.8
gcloud version:
Google Cloud SDK 214.0.0
alpha 2018.08.24
app-engine-java 1.9.64
app-engine-python 1.9.74
beta 2018.08.24
bq 2.0.34
cloud-datastore-emulator 2.0.2
core 2018.08.24
gsutil 4.33
kubectl 2018.08.24
pubsub-emulator 2018.08.24

After contacting Google technical support, we found out that the default App Engine service account didn't have the Editor role. After assigning the Editor role, the deployment worked again.
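For reference, a minimal sketch of granting that role from the command line; the project ID is a placeholder, and the default App Engine service account follows the PROJECT_ID@appspot.gserviceaccount.com pattern:
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:my-project@appspot.gserviceaccount.com" \
    --role="roles/editor"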

This error is often reported when your application has reached the quota limit for "In-use IP addresses". A similar error was reported in this Google Cloud Platform issue. The default value for in-use addresses is 8, and this quota can be increased by clicking the 'Edit' button in the Cloud Console; make sure you are editing the value for In-use IP addresses.
The Google engineer confirmed that there is a planned improvement to the quota error details, to be implemented in one of the next versions of the Cloud SDK. You can track Cloud SDK updates in this Google Group.
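If you want to inspect that quota from the command line, a rough sketch (the region is an assumption; look for the IN_USE_ADDRESSES metric, its limit, and its current usage in the output):
gcloud compute regions describe us-central1 | grep -B1 -A1 IN_USE_ADDRESSES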

Related

Gcloud cloud build local component failing with error "Error loading config file: unknown field "availableSecrets" in cloudbuild.Build"

Greetings stackoverflow community! First time asker, long time user.
I am testing out my cloudbuild.yaml file locally using the Cloud Build Local component and Secret Manager, and it is failing on "availableSecrets".
Error message: Error loading config file: unknown field "availableSecrets" in cloudbuild.Build
OS Platform: Windows 10/WSL2/Ubuntu 18.04
cloud-build-local: v0.5.2
Docker engine: v20.10.2
Nodejs version: v14.15.3
NPM version: 6.14.9
gcloud version: 326.0.0
Installed components: [BigQuery Command Line Tool, Cloud Datastore Emulator, Cloud SDK Core Libraries, Cloud Storage Command Line Tool, Google Cloud Build Local Builder, gcloud Beta Commands]
Documentation on Cloud Build build file: https://cloud.google.com/cloud-build/docs/build-config
Documentation to configure secrets with cloud build: https://cloud.google.com/cloud-build/docs/securing-builds/use-secrets
Documentation for cloud build local: https://cloud.google.com/cloud-build/docs/build-debug-locally
Steps performed:
Added secrets to Secret Manager
Enabled the API between Cloud Build and Secret Manager
Added the Cloud Build service account as a member of each secret
Added the IAM permission Secret Manager Secrets Accessor to the Cloud Build user (a sketch of this project-level binding is shown below). I don't know where I got this from; it is residual at this point from other attempts to use Secret Manager with Cloud Build. I am not sure of the difference between applying access here versus applying it to the individual Secret Manager secret.
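For reference, a minimal sketch of that project-level binding with gcloud (the project ID and project number are placeholders; the Cloud Build service account follows the PROJECT_NUMBER@cloudbuild.gserviceaccount.com pattern):
gcloud projects add-iam-policy-binding my-project \
    --member="serviceAccount:123456789@cloudbuild.gserviceaccount.com" \
    --role="roles/secretmanager.secretAccessor"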
Command: cloud-build-local --config=cloudbuild.staging.yaml --dryrun=false .
cloudbuild.staging.yaml:
steps:
- name: gcr.io/cloud-builders/npm
  entrypoint: 'npm'
  args: [ 'install' ]
- name: 'gcr.io/cloud-builders/gcloud'
  args: ["app", "deploy"]
  env:
  - 'DAO_FACTORY=datastore'
  - 'POLL_INTERVAL=15'
  - 'PROMPT=staging>'
  - 'ENVIRONMENT=staging'
  - 'NAMESPACE=staging'
  - 'RESET_DATASTORE=false'
  secretEnv: ['ADMIN_USER', 'SUPER_ADMINS', 'BOT_TOKEN']
availableSecrets:
  secretManager:
  - versionName: projects/{project token}/secrets/SYSTEM_USER/versions/1
    env: 'ADMIN_USER'
  - versionName: projects/{project token}/secrets/SUPER_ADMINS/versions/1
    env: 'SUPER_ADMINS'
  - versionName: projects/{project token}/secrets/BOT_TOKEN/versions/2
    env: 'BOT_TOKEN'
Tag: cloud-build-local. I guess without reputation a meaningful tag cannot be created. Maybe an esteemed community member will create this as this may be specific to cloud-build-local only.
Support for Google Secret Manager in the Google Cloud Build descriptor file is apparently very new and does not appear to be supported by the cloud-build-local component at this time; see the comment from Guillaume about the feature being only a week old. When the Cloud Build descriptor is run in Cloud Build itself, it works fine.
I fixed a similar issue by upgrading the gcloud tool.
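If you want to try the upgrade route, something along these lines should refresh both the SDK and the local builder, assuming both were installed as gcloud components:
gcloud components update
gcloud components install cloud-build-local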

gcloud app deploy does not terminate even when service is running

I am deploying a node.js server to Google App Engine from Bitbucket pipeline environment and the last command in the script is: gcloud -q app deploy app.yaml --no-promote --verbosity=debug
The logs show that the service is deployed successfully, but the script does not terminate; this is the last part of the log:
DEBUG: Reading GCS logfile: 206 (read 10 bytes)
PUSH DONE
DEBUG: Operation [...] complete. Result: {...}
DEBUG: Reading GCS logfile: 416 (no new content; keep polling)
--------------------------------------------------------------------------------
DEBUG: Converted YAML to JSON: "{...}"
DEBUG: Operation [...] not complete. Waiting to retry.
Updating service [default] (this may take several minutes)...
.DEBUG: Operation [...] not complete. Waiting to retry.
......DEBUG: Operation [...] not complete. Waiting to retry.
.......DEBUG: Operation [...] not complete. Waiting to retry.
......DEBUG: Operation [...] not complete. Waiting to retry.
.......DEBUG: Operation [...] not complete. Waiting to retry.
.......DEBUG: Operation [...] not complete. Waiting to retry.
I tried to add readiness_check and liveness_check to app.yml but it didn't change the behaviour.
readiness_check:
  path: "/api/public/logout"
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300
liveness_check:
  path: "/api/public/logout"
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
The main unknown here is: what criteria does gcloud app deploy use to determine its termination condition?
Also, is there any workaround for this problem?
Update
The problem also happens when running the gcloud app deploy command from my local environment (my laptop).
The problem does NOT happen when removing the --no-promote flag.
The gcloud app deploy command expects a well-formed and valid app.yml file; this is what determines its termination condition.
As you confirmed the deployment worked without the --no-promote flag, it could mean that something in the configuration expects the application to be already deployed and running, thus preventing the script from completing.
Another possible cause would be that the Google Cloud SDK version specified in bitbucket-pipelines.yml is an older one. Make sure you work with the latest. This consideration applies extensively to all dependencies in package.json, which might be conflicting with one another, especially when using older versions of Node.js.
This guide can help with building a sound configuration for Bitbucket-based deployments; although the example given is in Python, it can serve as a template for a Node.js pipeline.
N.B.: in this solution the Google Cloud SDK version is an older one (127.0.0), which will make this deployment fail, so it should be replaced with the latest (228.0.0 or higher). The guide also omits another required API activation: the Cloud Build API. I've notified the team to amend the solution.
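As a rough illustration only, a bitbucket-pipelines.yml along the lines of that guide might look like the sketch below; the google/cloud-sdk image tag and the $GCLOUD_API_KEYFILE / $GCLOUD_PROJECT repository variables are assumptions, not taken from the question:
image: google/cloud-sdk:latest            # pin a recent SDK instead of 127.0.0
pipelines:
  branches:
    master:
      - step:
          script:
            # decode the base64-encoded service account key stored as a secured repository variable
            - echo $GCLOUD_API_KEYFILE | base64 -d > /tmp/gcloud-api-key.json
            - gcloud auth activate-service-account --key-file /tmp/gcloud-api-key.json
            - gcloud config set project $GCLOUD_PROJECT
            - gcloud -q app deploy app.yaml --no-promote --verbosity=debug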
I've tested several scenarios with a simple Node.js server, and could not reproduce the issue. Check my Github repository for the code.
For further help on this topic, please provide more hints, such as the content of the app.yml, bitbucket-pipelines.yml, and package.json files, as well as a description of the state of App Engine (services, versions).
In order to deploy the test repository to App Engine from Bitbucket, make sure the following is done on the project:
Enable APIs:
App Engine Admin
Cloud Build
Create a Service Account with the following permissions, and generate an API key (a sketch of the equivalent gcloud commands follows this list):
App Engine: Admin
Cloud Build: Editor
Storage: Object Admin
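A sketch of the equivalent gcloud commands; the project ID and service account name are placeholders, and the roles correspond to the permissions listed above:
gcloud services enable appengine.googleapis.com cloudbuild.googleapis.com
gcloud iam service-accounts create bitbucket-deployer
for role in roles/appengine.appAdmin roles/cloudbuild.builds.editor roles/storage.objectAdmin; do
  gcloud projects add-iam-policy-binding my-project \
      --member="serviceAccount:bitbucket-deployer@my-project.iam.gserviceaccount.com" \
      --role="$role"
done
gcloud iam service-accounts keys create key.json \
    --iam-account=bitbucket-deployer@my-project.iam.gserviceaccount.com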

Elasticsearch deployment on google app engine flex

Is it possible to deploy Elasticsearch to the App Engine flexible environment using a Docker image?
I have tried the following:
My files on the local machine
Folder : elasticsearch
app.yaml
Dockerfile
docker-entrypoint.sh
config folder (containing the elasticsearch.yml file)
Contents of app.yaml
runtime: custom
env: flex
Dockerfile and docker-entrypoint.sh copied from https://github.com/GoogleCloudPlatform/elasticsearch-docker/tree/master/5/5.2.0
Modifications to the Dockerfile
replaced EXPOSE 9200 9300 with EXPOSE 8080
Modification to the elasticsearch.yml
cluster.name: "beaconinside-docker-cluster"
path.data: /usr/share/elasticsearch/data
http.host: 0.0.0.0
http.port: 8080
discovery.zen.minimum_master_nodes: 1
I built the image from the Dockerfile on my local machine:
docker build -t elasticdemo .
Then I ran the container:
docker run -p 8080:8080 elasticdemo
I am able to access elasticsearch on 0.0.0.0:8080
Problem:
I am trying to deploy Elasticsearch as an app to the Google App Engine flexible environment:
gcloud app deploy app.yaml --version elasticdocker --project myproject
The deployment fails with the following error
Updating service [default]...failed.
ERROR: (gcloud.app.deploy) Error Response: [9]
I expected Elasticsearch to deploy as an app and be available at the deployed URL.
Could you please provide pointers/help/suggestions with this approach?
While you can deploy ES to the App Engine flexible environment, it's not particularly useful. The VMs hosting GAE flexible containers are restarted regularly as part of maintenance, and whatever data is stored on the local disk will be lost on restart. If you want to use the local disk for long-term storage, I'd suggest deploying to GCE VMs (or alternatively using a solution from the GCP Marketplace), or deploying to GKE, which supports persistent disks.
As for the actual question: you probably don't have a health check handler and therefore App Engine Flexible environment doesn't consider your app healthy after deploying it. The error message is useless, I agree.
From the GAE Flexible docs for building custom images:
"A health check is an HTTP request to the URL /_ah/health. A healthy application should respond with status code 200."
Alternatively you can turn off health checks by adding into app.yaml
enable_health_check: False
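A quick local sanity check of the health check idea, assuming the container from the question is still running on port 8080: hit the health endpoint yourself. Elasticsearch has no handler for that path, so it will most likely not answer with the 200 the flexible environment health checker expects.
curl -i http://localhost:8080/_ah/health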

How to deploy a GAE project in flexible environment without billing?

I've been developing a REST service using Flask and other third-party libraries and I want to deploy it to GAE in the flexible environment. I usually deploy to the GAE standard environment, but I wanted to try the new flexible environment. At the moment I wish to deploy to the flexible environment without enabling billing, and Google support assured me that this was possible.
Running my code locally works fine, and I have the following yaml file:
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT whereismybus230.starter:app
runtime_config:
  python_version: 3
So I created a new project through the Google Cloud Console web page (as usual), and created a new gcloud profile on my local machine so I can deploy to this new project.
Then I run:
gcloud app deploy --verbosity=info
I see that a Docker image is being built and at some point it will be pushed to Compute Engine, but it fails after a few minutes here:
Successfully built sophiabus230 aniso8601 future docopt itsdangerous MarkupSafe
Installing collected packages: Werkzeug, click, MarkupSafe, Jinja2, itsdangerous, Flask, jsonschema, pytz, six, python-dateutil, aniso8601, flask-restplus, beautifulsoup4, future, sophiabus230, coverage, requests, docopt, coveralls
Successfully installed Flask-0.12 Jinja2-2.9.4 MarkupSafe-0.23 Werkzeug-0.11.15 aniso8601-1.2.0 beautifulsoup4-4.5.3 click-6.7 coverage-4.3.4 coveralls-1.1 docopt-0.6.2 flask-restplus-0.9.2 future-0.16.0 itsdangerous-0.24 jsonschema-2.5.1 python-dateutil-2.6.0 pytz-2016.10 requests-2.12.5 six-1.10.0 sophiabus230-0.4
---> 3e3438680079
Removing intermediate container bd9f8ccb6f4a
Step 8 : ADD . /app/
---> bde0915f6720
Removing intermediate container e3193eb4ef70
Step 9 : CMD gunicorn -b :$PORT whereismybus230.starter:app
---> Running in 022d38d769f8
---> 36893d0a549a
Removing intermediate container 022d38d769f8
Successfully built 36893d0a549a
PUSH
The push refers to a repository [us.gcr.io/whereismy230/appengine/default.20170120t131841]
e5f488ee94c5: Preparing
8d27ce27f03c: Preparing
3d5800d45c36: Preparing
06ba8a2a8ec3: Preparing
c0fb81dae3c6: Preparing
2e4eabdbeed3: Preparing
b5d474284f52: Preparing
c307273999be: Preparing
d73750730c30: Preparing
63bbaf04cf0b: Preparing
badb9b2d625b: Preparing
40c928fd4dcc: Preparing
dfcf8dbe47e1: Preparing
6d820e13990c: Preparing
2e4eabdbeed3: Waiting
b5d474284f52: Waiting
c307273999be: Waiting
d73750730c30: Waiting
63bbaf04cf0b: Waiting
badb9b2d625b: Waiting
40c928fd4dcc: Waiting
dfcf8dbe47e1: Waiting
6d820e13990c: Waiting
denied: Unable to create the repository, please check that you have access to do so.
The push refers to a repository [us.gcr.io/whereismy230/appengine/default.20170120t131841]
...
ERROR: (gcloud.app.deploy) Error Response: [2] Build failed; check build logs for details
Using the IAM service, I made sure my account was the owner of the project, and even checked all permissions.
Since the flexible environment relies on the Compute Engines (VMs), I tried to check from the web page and it's telling me that I need to enable billing to be able to use this functionality.
Am I doing something wrong ?
Thanks !
From App Engine Pricing:
Instances within the standard environment have access to a daily limit of resource usage that is provided at no charge, defined by a set of quotas. Beyond that level, applications will incur charges as outlined below. To control your application costs, you can set a spending limit. To estimate costs for the standard environment, use the pricing calculator.
For instances within the flexible environment, services and APIs are priced as described below.
And from Flexible environment instances:
Applications running in the App Engine flexible environment are deployed to virtual machine types that you specify. This table summarizes the hourly billing rates of the various computing resources (US):
Resource          Unit                Unit cost
vCPU              per core hour       $0.0526
Memory            per GB hour         $0.0071
Persistent disk   per GB per month    $0.0400
Unlike the standard environment, the flexible environment has no free quota, which is in line with your observation that the developer console requires billing to be enabled to run GAE flex instances.
Without billing enabled you might still be able to deploy your app (but without actually launching a GAE instance for it, so its usefulness is questionable since you want to try the environment) by using the --no-promote option:
--promote
    Promote the deployed version to receive all traffic.
    True by default. To change the default behavior for your current environment, run:
        $ gcloud config set app/promote_by_default false
    Overrides the default promote_by_default property value for this command invocation. Use --no-promote to disable.
Side note: when you encounter problems you may also want to use --verbosity=debug to potentially get more relevant info about the failures.
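Putting both suggestions together, the deploy invocation would look roughly like this (app.yaml is whatever descriptor your project uses):
gcloud app deploy app.yaml --no-promote --verbosity=debug
or, to make skipping promotion the default for the current gcloud configuration:
gcloud config set app/promote_by_default false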

Google app engine request_logs command line --include_all option is not available

I saw this link - App engine CPU times when downloading logs - which refers to a question about how to get the performance information when downloading logs from Google App Engine with the --include_all option.
I've tried it with the Java command-line tool and read the documentation, and the option is not mentioned there at all!
How can I get the performance information, such as CPU time, when I download the logs from App Engine?
The command I'm currently using (it works) is:
appcfg.cmd --num_days=3 --severity=0 request_logs . logs.txt
In the admin dashboard you can see this information:
"<my_app_name>.appspot.com" ms=13 cpu_ms=0 api_cpu_ms=0 cpm_usd=0.000102
I want to be able to get this information in the logs as well.
thanks,
Li
The Java version of appcfg doesn't currently support that flag. You can create an app.yaml and use the Python version of the SDK to download the logs, though.
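If you go the Python SDK route, the request_logs invocation would look roughly like this; --include_all is the flag the linked question refers to, and the command assumes an app.yaml sits in the target directory:
appcfg.py --num_days=3 --severity=0 --include_all request_logs . logs.txt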
