Why can't I override the timeout on my Google Cloud Build? - google-app-engine

I am attempting to set up a CI pipeline using Google Cloud Build.
I am attempting to deploy a MeteorJS app which has a lengthy build time. The default build timeout for GCB is 10 minutes, and it was recommended here that I increase the timeout.
I have set up my cloudbuild.yaml file with the timeout option increased to 20 minutes:
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['app', 'deploy']
timeout: 1200s
I have a Trigger set up in GCB connected to a Bitbucket repo. When I push a change and the Trigger fires, I get 2 new builds: one coming from Bitbucket and one whose source is Google Cloud Storage.
Once 10 minutes of build time has elapsed, the build from Cloud Storage times out, which causes the Bitbucket build to fail as well with Error Response: [4] DEADLINE_EXCEEDED
Occasionally, for whatever reason, the Cloud Storage build finishes in under 10 minutes, which allows the Bitbucket build to finish successfully and deploy.
If I attempt to cancel/stop the Cloud Storage build, it also stops the Bitbucket build.
The screenshot below shows 2 attempts of the exact same build with differing results.
I do not understand where this second Cloud Storage Build is coming from, but it does not seem to be affected by the settings in my yaml file or my global GCP settings.
I have attempted to run the following commands from the gcloud CLI:
gcloud config set app/cloud_build_timeout 1200
gcloud config set builds/timeout 1200
gcloud config set container/build-timeout 1200
I have also attempted to use a high CPU build machine to speed up the process but it did not seem to have any effect.
Any insight would be greatly appreciated - I feel that I have exhausted every possible combination of Google Search keywords I can think up!

This timeout error comes from the App Engine deployment itself, which has a 10-minute timeout by default.
You will need to update the app/cloud_build_timeout property inside the container itself, like this:
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  entrypoint: 'bash'
  args: ['-c', 'gcloud config set app/cloud_build_timeout 1200 && gcloud app deploy']
timeout: 1200s
Update
Actually, a simpler solution is to set the timeout on the deploy step as well as on the build as a whole:
steps:
- name: 'gcr.io/cloud-builders/gcloud'
  args: ['app', 'deploy']
  timeout: 1200s
timeout: 1200s
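The difference between a step-level and a build-level timeout can be checked before pushing a config. The following is only a rough sketch: it assumes the build config has already been loaded into a plain dictionary, that durations use the whole-seconds form shown above (such as 1200s), and that the build-level limit falls back to the 10-minute default mentioned in the question. The helper names parse_duration and check_timeouts are hypothetical and not part of any Google tooling.

```python
DEFAULT_BUILD_TIMEOUT_S = 600  # the 10-minute default mentioned in the question


def parse_duration(value):
    """Convert a duration such as '1200s' into whole seconds."""
    text = str(value).strip()
    if not text.endswith("s"):
        raise ValueError(f"expected a value like '1200s', got {value!r}")
    return int(text[:-1])


def check_timeouts(build):
    """Return warnings for steps whose timeout exceeds the build-level timeout."""
    if "timeout" in build:
        build_limit = parse_duration(build["timeout"])
    else:
        build_limit = DEFAULT_BUILD_TIMEOUT_S
    warnings = []
    for index, step in enumerate(build.get("steps", [])):
        if "timeout" not in step:
            continue  # steps without their own timeout are not checked here
        step_limit = parse_duration(step["timeout"])
        if step_limit > build_limit:
            warnings.append(
                f"step {index} allows {step_limit}s "
                f"but the build stops after {build_limit}s"
            )
    return warnings


if __name__ == "__main__":
    deploy_step = {
        "name": "gcr.io/cloud-builders/gcloud",
        "args": ["app", "deploy"],
        "timeout": "1200s",
    }
    # Step timeout only: the build-level default still applies.
    print(check_timeouts({"steps": [deploy_step]}))
    # Step timeout plus build timeout, as in the simpler solution.
    print(check_timeouts({"steps": [deploy_step], "timeout": "1200s"}))
```

A config that raises only the step timeout produces one warning, while a config that sets both fields produces none.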

Related

"gcloud app deploy" hangs on "Building and pushing image for service"

I suddenly can't deploy using gcloud app deploy.
It hangs on "Building and pushing image for service [default]". At that time, the Python process takes 99% CPU, and continues until the deploy times out. I've tried upgrading Python to no avail.
It occurs regardless of the Google App Engine project. I have tried installing different versions of the gcloud CLI, to no avail.
My teammates can deploy successfully using the same commands. Any ideas?
EDIT:
The app.yaml file:
runtime: nodejs
env: flex
manual_scaling:
  instances: 1
resources:
  cpu: 1
  memory_gb: 6
  disk_size_gb: 10
env_variables:
  ...
Output of my gcloud app deploy --verbosity=debug: https://gist.github.com/bbarton/05b8ae514849a5731101afb5ae639357

Adding Timeout field causes GCloud Build to Fail, what is the correct method to increase timeout

My gcloud build times out if left at the default timeout of 10 minutes, so I have tried to increase the timeout to 20 minutes.
This is my cloudbuild.yaml.
# cloudbuild.yaml
steps:
  - name: node:14.17.1
    entrypoint: npm
    args: ["install"]
  - name: node:14.17.1
    entrypoint: npm
    args: ["run", "build"]
  - name: "gcr.io/cloud-builders/gcloud"
    args: ["app", "deploy"]
timeout: 1200s
It processes Step 0 and Step 1 and fails at Step 2, which is gcloud app deploy.
The execution log reports the following error:
ERROR: gcloud crashed (InvalidBuildError): Field [timeout] was provided, but should not have been. You may be using an improper Cloud Build pipeline.
All the documentation I've read says that this is how you increase the timeout. Some sources say that the timeout needs to be wrapped in single quotes, but this doesn't appear to be true: when I review the Execution details, the build correctly identifies that the timeout is 20 minutes. Trying with single quotes makes no difference to the outcome.
I've also tried setting a timeout on the app deploy step, but it produces the same error and would be ineffectual anyway, as it is the entire build process that exceeds the execution time if left at the default.
The timeout setting can be used if the App Engine environment is "Standard" and not "Flexible".
The environment is set by the "env" setting in app.yaml. If this value is not provided, the App Engine environment defaults to Standard. Simply ensure that env: flex is removed from app.yaml.
It is unclear if this is by design or a bug.
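To see which of the two cases an app.yaml falls into before adding a timeout, a quick check along these lines may help. It is a minimal sketch, not a real YAML parser: it scans top-level lines only, treats a missing env key as Standard (as described above), and the function name app_engine_environment is hypothetical.

```python
def app_engine_environment(app_yaml_text):
    """Return 'flexible' if app.yaml sets a top-level env: flex, else 'standard'."""
    for raw_line in app_yaml_text.splitlines():
        # Drop trailing comments and whitespace.
        line = raw_line.split("#", 1)[0].rstrip()
        # Only top-level keys matter, so skip blank and indented lines.
        if not line or line[0].isspace():
            continue
        key, sep, value = line.partition(":")
        if sep and key.strip() == "env":
            value = value.strip().strip("'\"")
            return "flexible" if value == "flex" else "standard"
    # No env key at all: the environment defaults to Standard.
    return "standard"


if __name__ == "__main__":
    print(app_engine_environment("runtime: nodejs\nenv: flex\n"))
    print(app_engine_environment("runtime: nodejs14\n"))
```

Keys such as env_variables, or an env entry nested under another key, are deliberately ignored because only the top-level env setting selects the environment.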

Gcloud cloud build local component failing with error "Error loading config file: unknown field "availableSecrets" in cloudbuild.Build"

Greetings stackoverflow community! First time asker, long time user.
I am testing out my cloudbuild.yaml file locally using Cloud Build Local component and Secret Manager and it is failing on "availableSecrets".
Error message: Error loading config file: unknown field "availableSecrets" in cloudbuild.Build
OS Platform: Windows 10/WSL2/Ubuntu 18.04
cloud-build-local: v0.5.2
Docker engine: v20.10.2
Nodejs version: v14.15.3
NPM version: 6.14.9
gcloud version: 326.0.0
Installed components: [BigQuery Command Line Tool, Cloud Datastore Emulator, Cloud SDK Core Libraries, Cloud Storage Command Line Tool, Google Cloud Build Local Builder, gcloud Beta Commands]
Documentation on Cloud Build build file: https://cloud.google.com/cloud-build/docs/build-config
Documentation to configure secrets with cloud build: https://cloud.google.com/cloud-build/docs/securing-builds/use-secrets
Documentation for cloud build local: https://cloud.google.com/cloud-build/docs/build-debug-locally
Steps performed:
Added secrets to Secret Manager
Enabled the API between Cloud Build and Secret Manager
Added cloudbuild service account as member of each secret password.
Added IAM permission Secret Manager Secrets Accessor to cloudbuild user. I don't know where I got this info from but it is residual at this point from other attempts to use Secret Manager with cloudbuild. I am not sure of the difference between applying access here vs applying to the Secret Manager secret.
Command: cloud-build-local --config=cloudbuild.staging.yaml --dryrun=false .
cloudbuild.staging.yaml:
steps:
- name: gcr.io/cloud-builders/npm
  entrypoint: 'npm'
  args: [ 'install' ]
- name: 'gcr.io/cloud-builders/gcloud'
  args: ["app", "deploy"]
  env:
  - 'DAO_FACTORY=datastore'
  - 'POLL_INTERVAL=15'
  - 'PROMPT=staging>'
  - 'ENVIRONMENT=staging'
  - 'NAMESPACE=staging'
  - 'RESET_DATASTORE=false'
  secretEnv: ['ADMIN_USER', 'SUPER_ADMINS', 'BOT_TOKEN']
availableSecrets:
  secretManager:
  - versionName: projects/{project token}/secrets/SYSTEM_USER/versions/1
    env: 'ADMIN_USER'
  - versionName: projects/{project token}/secrets/SUPER_ADMINS/versions/1
    env: 'SUPER_ADMINS'
  - versionName: projects/{project token}/secrets/BOT_TOKEN/versions/2
    env: 'BOT_TOKEN'
Tag: cloud-build-local. I guess without reputation a meaningful tag cannot be created. Maybe an esteemed community member will create this as this may be specific to cloud-build-local only.
Support for Google Secret Manager in the Google Cloud Build descriptor file is apparently very new and does not appear to be supported by the cloud-build-local component at this time; please see the comment from Guillaume about the feature being a week old. When the Cloud Build descriptor is run in Cloud Build, it works fine.
I fixed a similar issue by upgrading the gcloud tool.
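While the local builder rejects the field, the secret wiring itself can still be sanity-checked offline. The sketch below assumes the build config has been loaded into a plain dictionary mirroring the YAML above, and only looks at Secret Manager entries under availableSecrets. It never contacts Secret Manager, so it cannot confirm that a secret version exists or that IAM access is correct. The function name undeclared_secret_env is hypothetical.

```python
def undeclared_secret_env(build):
    """Return secretEnv names used by steps but not declared in availableSecrets."""
    available = build.get("availableSecrets", {}).get("secretManager", [])
    declared = {entry["env"] for entry in available}
    missing = []
    for step in build.get("steps", []):
        for name in step.get("secretEnv", []):
            # Report each undeclared name once, in order of first use.
            if name not in declared and name not in missing:
                missing.append(name)
    return missing


if __name__ == "__main__":
    build = {
        "steps": [
            {"name": "gcr.io/cloud-builders/gcloud",
             "args": ["app", "deploy"],
             "secretEnv": ["ADMIN_USER", "SUPER_ADMINS", "BOT_TOKEN"]},
        ],
        "availableSecrets": {
            "secretManager": [
                # Placeholder version names, for illustration only.
                {"versionName": "projects/example/secrets/SYSTEM_USER/versions/1",
                 "env": "ADMIN_USER"},
                {"versionName": "projects/example/secrets/SUPER_ADMINS/versions/1",
                 "env": "SUPER_ADMINS"},
            ],
        },
    }
    print(undeclared_secret_env(build))
```

An empty result means every secretEnv name has a matching env entry; it says nothing about whether the builder in use understands availableSecrets.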

gcloud app deploy does not terminate even when service is running

I am deploying a node.js server to Google App Engine from Bitbucket pipeline environment and the last command in the script is: gcloud -q app deploy app.yaml --no-promote --verbosity=debug
The logs show that the service is deployed successfully but the script is not terminating, this is the last part of the log:
> DEBUG: Reading GCS logfile: 206 (read 10 bytes)
> PUSH
> DONE
> DEBUG: Operation [...] complete. Result: {...}
> DEBUG: Reading GCS logfile: 416 (no new content; keep polling)
> --------------------------------------------------------------------------------
> DEBUG: Converted YAML to JSON: "{...}"
> DEBUG: Operation [...] not complete. Waiting to retry.
> Updating service [default] (this may take several minutes)...
> .DEBUG: Operation [...] not complete. Waiting to retry.
> ......DEBUG: Operation [...] not complete. Waiting to retry.
> .......DEBUG: Operation [...] not complete. Waiting to retry.
> ......DEBUG: Operation [...] not complete. Waiting to retry.
> .......DEBUG: Operation [...] not complete. Waiting to retry.
> .......DEBUG: Operation [...] not complete. Waiting to retry.
I tried to add readiness_check and liveness_check to app.yml but it didn't change the behaviour.
readiness_check:
  path: "/api/public/logout"
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300
liveness_check:
  path: "/api/public/logout"
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
The main unknown here is: what criteria does gcloud app deploy use to determine its termination condition?
Also, is there any bypass to this problem?
Update
The problem happens also when running the gcloud app deploy command from local environment (my laptop).
The problem does NOT happen when removing the --no-promote flag.
The gcloud app deploy command expects a well-formed and valid app.yml file; this is what determines its termination condition.
As you confirmed that the deployment worked without the --no-promote flag, something in the configuration may expect the application to be already deployed and running, which prevents the script from completing.
Another possible cause is that the Google Cloud SDK version specified in bitbucket-pipelines.yml is an older one. Make sure you work with the latest. The same consideration applies to all dependencies in package.json, which might conflict with one another, especially when using older versions of Node.js.
This guide can help with building a sound configuration for Bitbucket-based deployments; although the example given is in Python, it can also be used as a template for a Node.js pipeline.
Nb. in this solution, the Google Cloud SDK version is an older one (127.0.0), which will make this deployment fail, so it should be replaced with the latest (228.0.0 or higher). The guide also omits another required API activation: the Cloud Build API. I've notified the team to amend the solution.
I've tested several scenarios with a simple Node.js server, and could not reproduce the issue. Check my Github repository for the code.
For further help on this topic, please provide more hints, such as the content of the app.yml, bitbucket-pipelines.yml, and package.json files, as well as a description of the state of App Engine (services, versions).
In order to deploy the test repository to App Engine from Bitbucket, make sure the following is done on the project:
Enable APIs:
  App Engine Admin
  Cloud Build
Create a Service Account with the following permissions, and generate an API Key:
  App Engine: Admin
  Cloud Build: Editor
  Storage: Object Admin

Time out error when trying to create Google managed vm

I'm trying to create a managed vm for my node 4 application using google custom runtime.
I created the following Dockerfile:
FROM node:4.2.1
ENV PORT 8080
ADD package.json package.json
RUN npm install
ADD . .
CMD [ "npm", "start" ]
Along with this app.yaml:
# [START runtime]
runtime: custom
vm: true
api_version: 1
# [END runtime]
health_check:
  enable_health_check: false
skip_files:
- ^(.*/)?#.*#$
- ^(.*/)?.*~$
- ^(.*/)?.*\.py[co]$
- ^(.*/)?.*/RCS/.*$
- ^(.*/)?\..*$
- ^(.*/)?.*/node_modules/.*$
- ^(.*/)?.*\.log$
I deploy the app using gcloud preview command:
gcloud preview app deploy app.yaml --promote
It seems like the Docker image is being built correctly, but at the end of the process I get this message:
Copying files to Google Cloud Storage...
Synchronizing files to [gs://staging.my-project-id.appspot.com/].
Updating module [default]...\Deleted [https://www.googleapis.com/compute/v1/projects/my-project-id/zones/us-central1-f/instances/gae-builder-vm-20151030t142257].
Updating module [default]...failed.
ERROR: (gcloud.preview.app.deploy) Error Response: [4] Timed out creating VMs.
I have my deployment working now. I have had to troubleshoot the same problem before, for another project, but I didn't have the code on hand, so I had to work through the problems again.
The deployment ran smoothly up until the last steps, where updating the module would time out. This made me think it was something to do with the application starting up on the VM and not responding appropriately, so the final hook would time out.
You'll find a lot of information here - https://cloud.google.com/appengine/docs/managed-vms/config . I checked the following things:
Logging: ensure that you are writing to the correct log file. See https://cloud.google.com/appengine/docs/managed-vms/custom-runtimes#logging
Ensure you have a .dockerignore file and are skipping files in app.yaml, so you are not asking the process to copy across unneeded node_modules or log files.
Turn off health checking if you are not using it, or ensure you have the correct express.js routes configured for it.
Check that your environment variables are set and match what GAE can use. This was my final step: GAE lets you bind to port 8080 on the VM. I had to pass through a NODE_ENV flag in my app.yaml which told the app to use 8080 and not 3000.
Raise the resources of the GAE instance in app.yaml. I specified two logical CPUs and set the RAM to 2 GB.
Good luck.

Resources