App Engine flexible restarting from 503 /readiness_check failure - google-app-engine

My app.yaml configures a custom readiness-check URL, /my_readiness_check, instead of the default /readiness_check.
I see requests to the custom URL that succeed. Then I see my instance restarted with a SIGKILL to the app, immediately preceded by a couple of 503 failures on the default /readiness_check path.
runtime: python
env: flex
readiness_check:
  path: "/my_readiness_check"
  check_interval_sec: 10
  timeout_sec: 4
  failure_threshold: 1
  success_threshold: 2
  app_start_timeout_sec: 300
liveness_check:
  path: "/my_liveness_check"
  check_interval_sec: 30
  timeout_sec: 30
  failure_threshold: 3
  initial_delay_sec: 60
...

Related

gcloud app engine instances being inactive after a while

This is the app.yaml I define:
runtime: nodejs16
env: flex
service: default
env_variables:
  MONGO_USER: 'xxxxxx'
  MONGO_PASS: 'xxxxxxxxxx'
automatic_scaling:
  min_num_instances: 3
This is the final configuration after deploying, which ignores my config:
runtime: nodejs
api_version: '1.0'
env: flexible
threadsafe: true
env_variables:
  MONGO_PASS: xxxxxx
  MONGO_USER: xxxxxx
automatic_scaling:
  cool_down_period: 120s
  min_num_instances: 2
  max_num_instances: 20
  cpu_utilization:
    target_utilization: 0.5
liveness_check:
  initial_delay_sec: '300'
  check_interval_sec: '30'
  timeout_sec: '4'
  failure_threshold: 4
  success_threshold: 2
readiness_check:
  check_interval_sec: '5'
  timeout_sec: '4'
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: '300'
I am trying to host a simple API server (Node.js, Express, TypeScript), but after a while the endpoint becomes inactive when I ping it from Postman. Once I open the App Engine dashboard, click on the instance, and make the API call again, it works.
I tried this with both the standard and flexible environments.
I tried adding min_idle_instances: 1, yet it doesn't appear in the final config (maybe it's only for the standard environment).
The docs say I should handle warmup requests (for the standard environment), but I couldn't find boilerplate showing how to handle a warmup request.
To enable warmup requests, add the warmup element under the inbound_services directive in your app.yaml file as mentioned in this document.
To handle the warmup request, have the handler for /_ah/warmup return a 200 response, as described in this document. You can also respond to warmup requests by using one of the following methods (a Node.js sketch follows this list):
Using a servlet
The easiest way to provide warmup logic is to mark your own servlets as load-on-startup in the web.xml configuration file.
Using a ServletContextListener
Allows you to run custom logic before any of your servlets is first invoked, either through a warmup request or a loading request.
Using a custom warmup servlet
Using a custom warmup servlet invokes the servlet's service method only during a warmup request rather than during loading requests.
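For a Node.js/Express service like the one in the question (the servlet options above are Java-specific), a minimal warmup handler might look like the sketch below. This is only an illustration: the /_ah/warmup path and the 200 response are what App Engine expects, while the preload() helper and its contents are assumptions.
import express from "express";

const app = express();

// Hypothetical warm-up work: open DB connections, prime caches, etc.
// The helper name and its body are assumptions made for illustration.
async function preload(): Promise<void> {
  // e.g. connect to MongoDB, load config, warm caches
}

// App Engine standard sends GET /_ah/warmup once app.yaml lists
// "warmup" under inbound_services.
app.get("/_ah/warmup", async (_req, res) => {
  try {
    await preload();
    res.status(200).send("ok"); // any 200 marks the instance as warmed up
  } catch (err) {
    console.error("warmup failed", err);
    res.status(500).send("warmup failed");
  }
});

app.listen(Number(process.env.PORT) || 8080);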
You may also check this similar thread

App Engine flexible with way more readiness and liveness requests than expected

I deployed a simple Node.js server on Google App Engine flex.
When it has 1 instance running, it receives three times as many liveness and readiness checks as it should, given the configuration in my app.yml file.
The documentation says:
If you examine the nginx.health_check logs for your application, you might see health check polling happening more frequently than you have configured, due to the redundant health checkers that are also following your settings. These redundant health checkers are created automatically and you cannot configure them.
Still, this looks like aggressive behaviour. Is this normal?
My app.yml config:
runtime: nodejs
env: flex
service: web
resources:
  cpu: 1
  memory_gb: 3
  disk_size_gb: 10
automatic_scaling:
  min_num_instances: 1
  cpu_utilization:
    target_utilization: 0.6
readiness_check:
  path: "/readiness_check"
  timeout_sec: 4
  check_interval_sec: 5
  failure_threshold: 2
  success_threshold: 1
  app_start_timeout_sec: 300
liveness_check:
  path: "/liveness_check"
  timeout_sec: 4
  check_interval_sec: 30
  failure_threshold: 2
  success_threshold: 1
Yes, this is normal. Three different locations are checking the health of your service, and you have configured the health check to run every five seconds. If you want less health-check traffic, change check_interval_sec: 5 to a larger number.
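For completeness, the two endpoints referenced in the config above only need to return a 200 quickly; since they are polled this often, it helps to keep them trivial. A minimal sketch assuming an Express app (the handler bodies are assumptions, not taken from the question):
import express from "express";

const app = express();

// Readiness: App Engine routes traffic to the instance only while this returns 200.
app.get("/readiness_check", (_req, res) => {
  res.status(200).send("ready");
});

// Liveness: App Engine restarts the instance if this stops returning 200.
app.get("/liveness_check", (_req, res) => {
  res.status(200).send("alive");
});

app.listen(Number(process.env.PORT) || 8080);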

"network: session_affinity:true " property of app.yaml file is not reflecting in google app engine

I am using an app.yaml file to configure my App Engine app. Below is the file:
runtime: java
env: flex
resources:
  memory_gb: 6.5
  cpu: 5
  disk_size_gb: 20
automatic_scaling:
  min_num_instances: 6
  max_num_instances: 8
  cpu_utilization:
    target_utilization: 0.6
handlers:
- url: /.*
  script: this field is required, but ignored
network:
  session_affinity: true
Now when I click the "view" link in the version list in the Cloud Console, I can see the config below:
runtime: java
api_version: '1.0'
env: flexible
threadsafe: true
handlers:
- url: /.*
  script: 'this field is required, but ignored'
automatic_scaling:
  cool_down_period: 120s
  min_num_instances: 6
  max_num_instances: 8
  cpu_utilization:
    target_utilization: 0.6
network: {}
resources:
  cpu: 5
  memory_gb: 6.5
  disk_size_gb: 20
liveness_check:
  initial_delay_sec: 300
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 4
  success_threshold: 2
readiness_check:
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300
So as you can see, the network property is still blank. If I change other parameters like cpu or min_num_instances, all the other properties are reflected except the one below; I am not sure why.
network:
  session_affinity: true
Actually, this is a known issue for App Engine; the status can be tracked at this link.
You can use gcloud beta app deploy as a workaround to get session affinity working until the issue is resolved.
You may need to add an instance_tag and a name. The others are optional:
network:
  instance_tag: TAG_NAME
  name: NETWORK_NAME
  session_affinity: true (optional)
  subnetwork_name: SUBNETWORK_NAME (optional)
  forwarded_ports: (optional)
    - PORT
    - HOST_PORT:CONTAINER_PORT
    - PORT/tcp
    - HOST_PORT:CONTAINER_PORT/udp

GAE Flex Custom Runtime random shutdown

I'm running a very simple Node app in a GAE Flex custom runtime instance.
All of a sudden, seemingly out of nowhere, it shuts down causing a short period of 503s, before eventually coming back up:
I'm absolutely certain nobody did this manually.
What's going on? Are GAE apps expected to randomly shut down and restart?
Here's my config:
runtime: custom
api_version: '1.0'
env: flexible
threadsafe: true
automatic_scaling:
  cool_down_period: 120s
  min_num_instances: 1
  max_num_instances: 15
  cpu_utilization:
    target_utilization: 0.5
network: {}
liveness_check:
  initial_delay_sec: 300
  path: /
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 4
  success_threshold: 2
readiness_check:
  path: /
  check_interval_sec: 5
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300
It turns out that GAE flex instances automatically reboot once a week. So when you have only one instance running, this behaviour is to be expected.
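If the brief 503 window during these weekly restarts matters, one option that follows from the answer is to raise min_num_instances above 1 so another instance keeps serving. Separately, and as an assumption rather than anything stated above, the app can drain in-flight requests when it is asked to stop. A Node/TypeScript sketch, assuming the custom runtime forwards SIGTERM to the process before killing it:
import express from "express";

const app = express();
app.get("/", (_req, res) => {
  res.status(200).send("ok");
});

const server = app.listen(Number(process.env.PORT) || 8080);

// Assumption: the container receives SIGTERM before it is killed.
// Stop accepting new connections, let in-flight requests finish, then exit.
process.on("SIGTERM", () => {
  console.log("SIGTERM received, draining connections");
  server.close(() => process.exit(0));
  // Safety net in case open connections never close.
  setTimeout(() => process.exit(1), 10_000).unref();
});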

GAE: Upgrading vm: true to env: flex drives auto-scaler to create infinite amount of instances

I've noticed that when I upgrade the vm tag to the new env one, the auto-scaler creates an unbounded number of instances during the deployment process. This only happens when I'm using the new env flag. This is my config file:
runtime: custom
vm: true
service: default
threadsafe: true
health_check:
  enable_health_check: True
  check_interval_sec: 5
  timeout_sec: 4
  unhealthy_threshold: 2
  healthy_threshold: 2
  restart_threshold: 60
automatic_scaling:
  min_num_instances: 1
  cool_down_period_sec: 60
  cpu_utilization:
    target_utilization: 0.9
It would be great if anyone could help, because I'm unable to migrate my VMs due to this problem.
Cheers
