I deployed a Node.js server to App Engine flex.
I am using WebSockets and expect around 10k concurrent connections at the moment.
At around 2000 WebSocket connections I get this error:
[alert] 33#33: 4096 worker_connections are not enough
There is no permanent way to edit the nginx configuration in a nodejs runtime.
Isn't 2k connections on one instance quite low?
My yaml config file :
runtime: nodejs
env: flex
service: web
network:
  session_affinity: true
resources:
  cpu: 1
  memory_gb: 3
  disk_size_gb: 10
automatic_scaling:
  min_num_instances: 1
  cpu_utilization:
    target_utilization: 0.6
As per these public issues (public issue 1, public issue 2), increasing the number of instances might increase the total number of worker_connections available. Reducing the memory of each instance also lets the service scale up at a lower threshold, which may help keep the number of open connections per instance below 4096, a limit that should be constant across instance sizes. You can also change the worker_connections value on a single instance by SSHing into the VM.
Nginx config is located in /tmp/nginx/nginx.conf and you can manually change it as follows:
sudo su
vi /tmp/nginx/nginx.conf #Make your changes
docker exec nginx_proxy nginx -s reload
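The manual vi edit above could also be scripted. The following is a minimal sketch, assuming the directive appears in the usual `worker_connections N;` form; the function name `set_worker_connections` and the sample config text are hypothetical, and the change would still apply only to the one instance where it is run.

```python
import re


def set_worker_connections(conf_text: str, value: int) -> str:
    """Return conf_text with every `worker_connections N;` set to `value`."""
    pattern = re.compile(r"(worker_connections\s+)\d+(\s*;)")
    # A callable replacement avoids ambiguous backreferences such as "\116384".
    new_text, count = pattern.subn(
        lambda m: f"{m.group(1)}{value}{m.group(2)}", conf_text
    )
    if count == 0:
        raise ValueError("no worker_connections directive found")
    return new_text


# Hypothetical sample; on the VM you would read /tmp/nginx/nginx.conf instead.
sample = "events {\n    worker_connections 4096;\n}\n"
updated = set_worker_connections(sample, 16384)
print(updated)
```

After writing the result back to the file, the reload command shown above is still needed for nginx to pick up the new value.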
Apart from the above workaround, there is another public issue tracker (PIT) entry requesting a feature that would allow users to change nginx.conf settings. Feel free to post there if you have any other queries.
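As a rough check on the numbers in this thread, the sketch below estimates how many instances 10k WebSockets would need. It assumes each proxied WebSocket uses two nginx connections (one to the client, one to the upstream app), which is consistent with the error appearing at around 2000 connections, and that a single nginx worker is running. Both are assumptions, not confirmed values for this runtime.

```python
import math

NGINX_WORKER_CONNECTIONS = 4096  # limit quoted in the error message
CONNECTIONS_PER_WEBSOCKET = 2    # assumption: client side + upstream side


def max_websockets_per_instance(
    worker_connections: int = NGINX_WORKER_CONNECTIONS,
    per_socket: int = CONNECTIONS_PER_WEBSOCKET,
) -> int:
    """Upper bound on WebSockets one instance can proxy under the assumptions."""
    return worker_connections // per_socket


def instances_needed(target_websockets: int) -> int:
    """Smallest instance count that keeps every instance under the limit."""
    return math.ceil(target_websockets / max_websockets_per_instance())


print(max_websockets_per_instance())  # 2048
print(instances_needed(10_000))       # 5
```

In practice you would want headroom on top of this figure, since connections are unlikely to be spread perfectly evenly across instances.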
Related
I set a Cache-Control lifetime of 1 year on my server.
How can I tell App Engine to clear the cache so that a new version is fetched from the server?
The configuration is Flex custom environment
runtime: custom
env: flex
env_variables:
  writecontrolEnv: 'prod'
handlers:
- url: /.*
  script: this field is required, but ignored
service: gateway-prod
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 2
resources:
  cpu: 1
  memory_gb: 2
  disk_size_gb: 10
skip_files:
- node_modules/
network:
  instance_tag: gateway
Assuming that your app is the one serving the static files, the cache parameters sent by the server are controlled by your application code. This means that once you deploy a new version with updated parameters, the server will send the updated values.
But the problem is that caching is actually performed by the client (or some intermediate network device), so the end user's requests will not reach the server until the (very long in your case) cache expiration time has passed, and the user won't see the update until then.
You can try clearing your browser cache, hoping that the browser was the one doing the caching.
To prevent such occurrences in the future you may want to choose a shorter cache expiration time or use some cache busting technique like this one.
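One common form of cache busting (not necessarily the one linked above) is to embed a hash of the file content in the file name, so that a changed file gets a new URL that no client has cached yet. A minimal sketch, where `fingerprinted_name` and the sample paths are hypothetical:

```python
import hashlib
from pathlib import PurePosixPath


def fingerprinted_name(path: str, content: bytes, digest_len: int = 8) -> str:
    """Insert a short content hash before the file extension."""
    digest = hashlib.sha256(content).hexdigest()[:digest_len]
    p = PurePosixPath(path)
    return str(p.with_name(f"{p.stem}.{digest}{p.suffix}"))


# Different content yields a different URL, so a long cache lifetime is safe.
old = fingerprinted_name("static/app.js", b"console.log('v1');")
new = fingerprinted_name("static/app.js", b"console.log('v2');")
print(old)
print(new)
```

With this scheme the HTML page that references the asset should itself have a short cache lifetime, otherwise clients keep loading the page that points at the old file name.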
Over the past week I've been seeing the number of instances on my GAE Flexible Environment fall to 0, with no new instance spinning up. My understanding of the Flexible environment is that this shouldn't be possible... (https://cloud.google.com/appengine/docs/the-appengine-environments)
I was wondering if anyone else has been seeing these issues, or has solved the problem on their end before. My one hypothesis is that this might be an issue with my health monitoring endpoints, but I haven't seen anything that jumps out as a problem when reviewing the code.
This hasn't been a problem for me until last week, and now it seems like I have to redeploy my environment (with no changes) every couple of days just to "reset" the instances. It's worth noting that I have two services under this same App Engine project, both running flexible versions. But I only seem to have this issue with one of the services (what I call the worker service).
Screenshot from App Engine UI:
Screenshot from Logs UI that shows the SIGTERM being sent:
PS - Could this have anything to do with the recent Google Compute issues that have been coming up? https://news.ycombinator.com/item?id=18436187
Edit: Adding the yaml file for "worker" service. Note that I'm using Honcho to add an endpoint to monitor health of the worker service via Flask. I added those code examples as well.
yaml File
service: worker
runtime: python
threadsafe: yes
env: flex
entrypoint: honcho start -f /app/procfile worker monitor
runtime_config:
  python_version: 3
resources:
  cpu: 1
  memory_gb: 4
  disk_size_gb: 10
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 20
  cool_down_period_sec: 120
  cpu_utilization:
    target_utilization: 0.7
Procfile for Honcho
default: gunicorn -b :$PORT main:app
worker: python tasks.py
monitor: python monitor.py /tmp/psq.pid
monitor.py
import os
import sys

from flask import Flask

# The app checks this file for the PID of the process to monitor.
PID_FILE = None

# Create app to handle health checks and monitor the queue worker. This will
# run alongside the worker, see procfile.
monitor_app = Flask(__name__)


@monitor_app.route('/_ah/health')
def health():
    """
    The health check reads the PID file created by tasks.py main and checks the proc
    filesystem to see if the worker is running.
    """
    if not os.path.exists(PID_FILE):
        return 'Worker pid not found', 503

    with open(PID_FILE, 'r') as pidfile:
        pid = pidfile.read()

    if not os.path.exists('/proc/{}'.format(pid)):
        return 'Worker not running', 503

    return 'healthy', 200


@monitor_app.route('/')
def index():
    return health()


if __name__ == '__main__':
    PID_FILE = sys.argv[1]
    monitor_app.run('0.0.0.0', 8080)
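To check the health-endpoint hypothesis without deploying, the PID-file logic can be pulled into a plain function and exercised locally. This is a sketch under assumptions: `worker_status` and the `proc_root` parameter are hypothetical names added for testability, and the `strip()` call is an addition that guards against a trailing newline in the PID file, which would otherwise make the `/proc/<pid>` lookup fail.

```python
import os
import tempfile


def worker_status(pid_file, proc_root="/proc"):
    """Mirror the health check: return (message, HTTP status)."""
    if not os.path.exists(pid_file):
        return "Worker pid not found", 503
    with open(pid_file) as f:
        pid = f.read().strip()  # tolerate "1234\n"
    if not pid or not os.path.exists(os.path.join(proc_root, pid)):
        return "Worker not running", 503
    return "healthy", 200


# Exercise the three outcomes against a fake /proc directory.
with tempfile.TemporaryDirectory() as tmp:
    fake_proc = os.path.join(tmp, "proc")
    os.makedirs(os.path.join(fake_proc, "1234"))
    pid_file = os.path.join(tmp, "psq.pid")

    missing = worker_status(pid_file, fake_proc)   # no PID file yet
    with open(pid_file, "w") as f:
        f.write("1234\n")
    running = worker_status(pid_file, fake_proc)   # PID exists in fake /proc
    with open(pid_file, "w") as f:
        f.write("9999\n")
    stale = worker_status(pid_file, fake_proc)     # PID file is stale

print(missing, running, stale)
```

If the stale case matches what the logs show before the SIGTERM, the worker process is probably exiting while the monitor keeps reporting 503 until the instance is taken down.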
We are deploying an ASP.NET Core application on App Engine flex, and the Instances summary on the Dashboard page shows a strange 1.9.54 App Engine release alongside the Flex release. What might that be?
Our app.yaml:
env: flex
runtime: aspnetcore
resources:
  cpu: 8
  memory_gb: 14.4
automatic_scaling:
  min_num_instances: 8
  max_num_instances: 20
  cool_down_period_sec: 180
  cpu_utilization:
    target_utilization: 0.5
Your app's dashboard summary indicates your app has both:
standard env GAE instances (1.9.54 being the version of the GAE sandbox/infra running those instances), possibly from older service version(s) not yet deleted
flexible env GAE instances
You can play with the Service and/or Version selection boxes above the summary to identify which service/version those instances correspond to.
When I try to deploy my App Engine app using the flexible environment, I get this error:
ERROR: (gcloud.preview.app.deploy) INVALID_ARGUMENT:
The beta setting machine_type cannot be set in an App Engine Flexible Environment deployment.
My app.yaml is given below
runtime: nodejs
#vm: true
env: flex
# [END runtime]
network:
  instance_tag: app-tag
  name: network-tag
instance_class: F1
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 2
  cool_down_period_sec: 60
beta_settings:
  machine_type: f1-micro
handlers:
- url: /.*
  script: IGNORED
  secure: always
# Temporary setting to keep gcloud from uploading node_modules
skip_files:
- ^node_modules$
Also, can anyone tell me what the difference is between vm: true and env: flex, since both set the App Engine environment to flexible?
When changing from vm: true to env: flex you're actually switching to the latest infra version, see Upgrading to the Latest App Engine Flexible Environment Beta Release.
The machine type is no longer configured that way. Instead you'd configure a custom instance shape via its resources:
Resource settings
These settings control the computing resources. App Engine assigns a
machine type based on the amount of CPU and memory you've
specified. The machine is guaranteed to have at least the level of
resources you've specified, it might have more.
You can specify up to eight volumes of tmpfs in the resource settings.
You can then enable workloads that require shared memory via tmpfs and
can improve file system I/O.
For example:
resources:
  cpu: 2
  memory_gb: 1.3
  disk_size_gb: 10
  volumes:
  - name: ramdisk1
    volume_type: tmpfs
    size_gb: 0.5
I created a new Flexible Environment app and deployed it successfully with the latest gcloud version (currently 133), but I get a "503 Server Error" without any error logs.
Source code: https://github.com/AIMMOTH/scala-stack-angular/tree/503-error
App link: https://scala-stack-angular-us.appspot.com
Error page:
Error: Server Error
The service you requested is not available yet.
Please try again in 30 seconds.
Version info:
Version          Status   Traffic Allocation  Instances  Runtime  Environment  Size  Deployed  Diagnose
20161108t190158  Serving  100 %               2          custom   Flexible
I had a filter responding to /_ah/*, which broke Google App Engine.
For me, it was because of wrong settings in app.yaml:
vm: true # the flexible environment
runtime: java # Java 8 / Jetty 9.3 Runtime
service: default
threadsafe: true # handle multiple requests simultaneously
resources:
  cpu: .5 # number of cores
  memory_gb: 1.3
  disk_size_gb: 10 # minimum is 10GB and maximum is 10240GB
health_check:
  enable_health_check: true
  check_interval_sec: 5 # time interval between checks (in seconds)
  timeout_sec: 4 # health check timeout interval (in seconds)
  unhealthy_threshold: 2 # an instance is unhealthy after failing this number of consecutive checks
  healthy_threshold: 2 # an unhealthy instance becomes healthy again after successfully responding to this number of consecutive checks
  restart_threshold: 60 # the number of consecutive check failures that will trigger a VM restart
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 1
  cool_down_period_sec: 120 # time interval between auto scaling checks. It must be greater than or equal to 60 seconds.
                            # The default is 120 seconds
  cpu_utilization:
    target_utilization: 0.5 # CPU use is averaged across all running instances and is used to decide when to reduce or
                            # increase the number of instances (default 0.5)
handlers:
- url: /.* # regex
  script: ignored # required, but ignored
  secure: always # https
beta_settings:
  java_quickstart: true # process Servlet 3.1 annotations
  use_endpoints_api_management: true # enable Google Cloud Endpoints API management
I removed use_endpoints_api_management: true and everything works fine.
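For reference, the health_check values in that app.yaml imply how quickly a misbehaving instance gets flagged. The sketch below is only an approximation that multiplies thresholds by the check interval and ignores timeouts and scheduling jitter; the helper name `seconds_until` is hypothetical.

```python
def seconds_until(threshold: int, check_interval_sec: int) -> int:
    """Approximate time for `threshold` consecutive checks to elapse."""
    return threshold * check_interval_sec


# Values copied from the health_check block above.
check_interval_sec = 5
unhealthy_after = seconds_until(2, check_interval_sec)   # unhealthy_threshold
restart_after = seconds_until(60, check_interval_sec)    # restart_threshold

print(unhealthy_after)  # 10
print(restart_after)    # 300
```

Under these assumptions, anything that intercepts the health check requests (such as a filter on /_ah/*) would mark an instance unhealthy within roughly ten seconds, which fits a 503 that appears without application error logs.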