Over the past week I've been seeing the number of instances on my GAE Flexible Environment fall to 0, with no new instance spinning up. My understanding of the Flexible environment is that this shouldn't be possible... (https://cloud.google.com/appengine/docs/the-appengine-environments)
I was wondering if anyone else has been seeing these issues, or if they've solved the problem on their end before. My one hypothesis is that this might be an issue with my health monitoring endpoints, but I haven't seen anything that jumps out as a problem when I review the code.
This wasn't a problem for me until last week, and now it seems like I have to redeploy my environment (with no changes) every couple of days just to "reset" the instances. It's worth noting that I have two services under this same App Engine project, both running flexible versions, but I only seem to have this issue with one of them (what I call the worker service).
Screenshot from App Engine UI:
Screenshot from Logs UI that shows the SIGTERM being sent:
PS - Could this have anything to do with the recent Google Compute issues that have been coming up... https://news.ycombinator.com/item?id=18436187
Edit: Adding the yaml file for the "worker" service. Note that I'm using Honcho to add an endpoint that monitors the health of the worker service via Flask. I added those code examples as well.
yaml File
service: worker
runtime: python
threadsafe: yes
env: flex
entrypoint: honcho start -f /app/procfile worker monitor
runtime_config:
  python_version: 3

resources:
  cpu: 1
  memory_gb: 4
  disk_size_gb: 10

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 20
  cool_down_period_sec: 120
  cpu_utilization:
    target_utilization: 0.7
Procfile for Honcho
default: gunicorn -b :$PORT main:app
worker: python tasks.py
monitor: python monitor.py /tmp/psq.pid
monitor.py
import os
import sys

from flask import Flask

# The app checks this file for the PID of the process to monitor.
PID_FILE = None

# Create app to handle health checks and monitor the queue worker. This will
# run alongside the worker, see procfile.
monitor_app = Flask(__name__)


@monitor_app.route('/_ah/health')
def health():
    """
    The health check reads the PID file created by tasks.py main and checks the
    proc filesystem to see if the worker is running.
    """
    if not os.path.exists(PID_FILE):
        return 'Worker pid not found', 503
    with open(PID_FILE, 'r') as pidfile:
        pid = pidfile.read()
    if not os.path.exists('/proc/{}'.format(pid)):
        return 'Worker not running', 503
    return 'healthy', 200


@monitor_app.route('/')
def index():
    return health()


if __name__ == '__main__':
    PID_FILE = sys.argv[1]
    monitor_app.run('0.0.0.0', 8080)
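tasks.py is omitted here for brevity; roughly, the worker writes its own PID to /tmp/psq.pid on startup so the health check above has something to look at. A simplified sketch of that side (not my exact code) looks like:

# Simplified sketch of the PID-writing side of tasks.py (not the exact code):
# the worker records its own PID so monitor.py's health check can verify that
# /proc/<pid> still exists.
import os


def write_pid_file(pid_file):
    with open(pid_file, 'w') as f:
        f.write(str(os.getpid()))


def main():
    write_pid_file('/tmp/psq.pid')
    # ... start the actual queue worker loop here ...


if __name__ == '__main__':
    main()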
Related
I want to run a single Python Flask hello world app. I deploy it to App Engine, but it says the port is in use, and it looks like it's running on multiple instances/threads/clones concurrently.
This is my main.py
from flask import Flask

app = Flask(__name__)

@app.route('/hello')
def helloIndex():
    print("Hello world log console")
    return 'Hello World from Python Flask!'

app.run(host='0.0.0.0', port=4444)
This is my app.yaml
runtime: python38
env: standard
instance_class: B2
handlers:
- url: /
  script: auto
- url: .*
  script: auto

manual_scaling:
  instances: 1
This is my requirements.txt
gunicorn==20.1.0
flask==2.2.2
And these are the logs that I got:
* Serving Flask app 'main'
* Debug mode: off
Address already in use
Port 4444 is in use by another program. Either identify and stop that program, or start the server with a different port.
[2022-08-10 15:57:28 +0000] [1058] [INFO] Worker exiting (pid: 1058)
[2022-08-10 15:57:29 +0000] [1059] [INFO] Booting worker with pid: 1059
[2022-08-10 15:57:29 +0000] [1060] [INFO] Booting worker with pid: 1060
[2022-08-10 15:57:29 +0000] [1061] [INFO] Booting worker with pid: 1061
It says that port 4444 is in use. Initially I tried 5000 (Flask's default port), but it said that was in use too. I also tried removing port=4444, but then it says port 5000 is in use by another program; I guess Flask assigns port=5000 by default. I suspect this error is caused by GAE running multiple instances. If not, please help me solve this issue.
App Engine apps should listen on port 8080, not on any other port. So you may need to set it like this:
app.run(host='0.0.0.0', port=8080)
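More generally, since gunicorn is in your requirements.txt and is what actually serves the app on App Engine, the module-level app.run() isn't needed there at all; a minimal sketch of main.py under that assumption (keeping app.run() only for local runs) could look like:

# Minimal sketch of main.py, assuming gunicorn serves `main:app` on App Engine;
# the module-level app.run() from the question is moved under a __main__ guard.
from flask import Flask

app = Flask(__name__)


@app.route('/hello')
def hello_index():
    return 'Hello World from Python Flask!'


if __name__ == '__main__':
    # Only used for local development (`python main.py`); gunicorn never runs this.
    app.run(host='0.0.0.0', port=8080)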
Close the editor and then reopen it. The next time you stop the process, use Ctrl+C if you're in the terminal.
I figured it out. Delete the old file from your terminal and/or the folder where you created the web app. In the terminal this is done with:
rm file_name
Then try again with a fresh file and it should be okay.
I deployed a nodejs server to app engine flex.
I am using websockets and expect around 10k concurrent connections at the moment.
At around 2000 websocket connections I get this error:
[alert] 33#33: 4096 worker_connections are not enough
There is no permanent way to edit the nginx configuration in a nodejs runtime.
Isn't 2k connections on one instance quite low?
My yaml config file:
runtime: nodejs
env: flex
service: web
network:
  session_affinity: true

resources:
  cpu: 1
  memory_gb: 3
  disk_size_gb: 10

automatic_scaling:
  min_num_instances: 1
  cpu_utilization:
    target_utilization: 0.6
As per public issue 1 and public issue 2, increasing the number of instances might increase the available worker_connections. Also, reducing the memory of each instance allows the instances to scale up at a lower threshold. This may help keep the number of open connections below 4096, which should be constant across any size of instance. You can change the worker_connections value for one instance by SSHing into the VM.
Nginx config is located in /tmp/nginx/nginx.conf and you can manually change it as follows:
sudo su
vi /tmp/nginx/nginx.conf #Make your changes
docker exec nginx_proxy nginx -s reload
Apart from the above workaround, there is another public issue tracker entry for implementing a feature that lets users change nginx.conf settings; feel free to post there if you have any other queries.
I have been exploring the App Engine settings for a small data science web application for 2 weeks. Since it is a personal project that bills my own wallet, I tried a few different parameters in app.yaml to reduce the "frontend instances" cost. Several changes in, I got an unexpected ~10x cost surge!!! It was painful!!! In order not to waste it, I decided to learn something here and understand the behaviour :)... Don't worry, I have temporarily shut down my app ;)
Version 1 app.yaml:
service: my-app
runtime: python37
instance_class: F4
env: standard
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 1

default_expiration: "1m"

inbound_services:
- warmup

entrypoint: gunicorn -b 0.0.0.0:8080 main:server
Version 1, billing result (usage.amount_in_pricing_units exported from billing account): ~100hr/day, the same as Front end Instance Hours shown from App Engine billing status.
This is understandable, because I had an F4 instance constantly running idle, which translates into 24*4=96 frontend instance hours. Adding the instance usage from actual requests (from me only), ~100hr/day seems reasonable.
Version 2, where I intended to lower the instance class and the number of instances, lengthened default_expiration hoping it would help the app start quicker, and changed some other stuff that I thought wouldn't affect much...
service: my-app
runtime: python37
instance_class: F2
env: standard
automatic_scaling:
  min_instances: 1
  max_instances: 1
  target_cpu_utilization: 0.85
  max_concurrent_requests: 80
  max_pending_latency: 6s

default_expiration: "3h"

inbound_services:
- warmup

entrypoint: gunicorn -b 0.0.0.0:8080 main:server
Version 2, billing result (usage.amount_in_pricing_units exported from billing account): ~800hr+/day, ouch!!! In contrast, the Front end Instance Hours from App Engine dashboard billing status is less than 60hr/day as expected. This is where I got lost:
Why is the usage from billing so much larger than what the App Engine Dashboard shows, and where does that usage come from?
Where can I find and track indicators of that unaccounted usage in the App Engine Dashboard etc.?
2020-01-16 Solution for issue #1.
While I was waiting for Google Billing Support to come back to me, I found this:
Pricing of Google App Engine Flexible env, a $500 lesson
Namely, previously deployed versions of the app were also eating frontend instance hours, which needed real-world confirmation. (To my surprise, this has nothing to do with the app.yaml file!!) So I deleted all the past versions of the app and let it run for two days, observing instance hours and billing records with the following app.yaml file.
service: my-app
runtime: python37
instance_class: F2
env: standard
automatic_scaling:
  min_instances: 1
  max_instances: 2
  max_idle_instances: 1
  target_cpu_utilization: 0.85
  max_concurrent_requests: 80
  max_pending_latency: 6s

default_expiration: "1m"

inbound_services:
- warmup

entrypoint: gunicorn -b 0.0.0.0:8080 main:server
This should always keep one F2 instance running and go up to a maximum of 2 instances. This time both the App Engine dashboard and the exported billing usage agreed on ~50 frontend instance hours. Yes!!! The daily cost was cut down to roughly 1/16 (from ~800 billed hours/day to ~50).
This solves cost question #1, but #2 remains to be answered. It is very problematic that the App Engine dashboard does not show all the billed usage of frontend instances. Yesterday I heard back from the Google Billing Support Team; the answer was not helpful (mainly talking about instance numbers in app.yaml, which doesn't help). They seem oblivious to this issue, so I will have to let them know.
2020-01-31 Followup on issue #2.
The Google Billing Support Team responded swiftly, acknowledged the discrepancy between the App Engine Dashboard and the Billing Export, and agreed to adjust the billing for me. Effectively, the bills for the spiky days were refunded as a result. Kudos to them!
I've been trying to get a webapp that uses Server-Sent Events to work. The app works on my local machine when using Flask's app.run() method, but I've not been able to make it work when I run it on GAE.
The webapp uses SSE to publish a message with the current time every so often. The client simply adds it to the HTML of a div.
Flask app
import random
from datetime import datetime
from flask import render_template, Response
from time import sleep

from message_server import app


def event_stream():
    while True:
        time_now = datetime.now()
        message = "New message at time: {0}".format(time_now.strftime("%H:%M:%S"))
        yield "event: {0}\ndata: {1}\n\n".format("listen", message)
        sleep(random.randint(1, 5))


@app.route('/')
def hello():
    return render_template('home.html')


@app.route('/stream')
def stream():
    return Response(event_stream(), mimetype="text/event-stream")
Javascript in home.html
var source = new EventSource("/stream");

source.onmessage = function(event) {
    document.getElementById("messages").innerHTML += event.data + "<br>";
};

source.addEventListener("listen", function(event) {
    document.getElementById("messages").innerHTML += event.data + "<br>";
}, false);
GAE app.yaml
runtime: python
env: flex
entrypoint: gunicorn -b :$PORT --worker-class gevent --threads 10 message_server:app
runtime_config:
  python_version: 3

manual_scaling:
  instances: 1

resources:
  cpu: 1
  memory_gb: 0.5
  disk_size_gb: 10
My directory structure is as follows:
app.yaml
/message_server
__init__.py
sse.py
/templates
home.html
message_server is the package that contains the flask app object.
I am using Firefox 67 to test my app.
In the networking tab of the Firefox developer console, I see a GET request made to /stream, but no response is received even after a minute.
In the GAE logs, I am seeing "GET /stream" 499.
How do I figure out what's wrong?
I found the answer while browsing Google App Engine's documentation - on this page: https://cloud.google.com/appengine/docs/flexible/python/how-requests-are-handled
Essentially, you want the following header in the HTTP response for SSE to work:
X-Accel-Buffering: no
This disables the buffering that is enabled by default. I tested it, and SSE is now working as expected for me.
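For reference, a minimal sketch of how that header can be attached to the /stream response in the Flask app from the question (only the header line is new):

@app.route('/stream')
def stream():
    # Disable nginx response buffering so SSE events are flushed to the client
    # immediately instead of being held by the proxy.
    response = Response(event_stream(), mimetype="text/event-stream")
    response.headers['X-Accel-Buffering'] = 'no'
    return response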
I created a new Flexible Environment app and made a successful deployment with the latest gcloud version (currently 133), but I get a "503 Server Error" without any error logs.
Source code: https://github.com/AIMMOTH/scala-stack-angular/tree/503-error
App link: https://scala-stack-angular-us.appspot.com
Error page:
Error: Server Error
The service you requested is not available yet.
Please try again in 30 seconds.
Version info:
Version: 20161108t190158, Status: Serving, Traffic Allocation: 100%, Instances: 2, Runtime: custom, Environment: Flexible
I had a filter responding to /_ah/* and it broke Google App Engine.
For me, it was because of wrong settings in app.yaml:
vm: true # the flexible environment
runtime: java # Java 8 / Jetty 9.3 Runtime
service: default
threadsafe: true # handle multiple requests simultaneously
resources:
  cpu: .5            # number of cores
  memory_gb: 1.3
  disk_size_gb: 10   # minimum is 10GB and maximum is 10240GB

health_check:
  enable_health_check: true
  check_interval_sec: 5    # time interval between checks (in seconds)
  timeout_sec: 4           # health check timeout interval (in seconds)
  unhealthy_threshold: 2   # an instance is unhealthy after failing this number of consecutive checks
  healthy_threshold: 2     # an unhealthy instance becomes healthy again after successfully responding to this number of consecutive checks
  restart_threshold: 60    # the number of consecutive check failures that will trigger a VM restart

automatic_scaling:
  min_num_instances: 1
  max_num_instances: 1
  cool_down_period_sec: 120  # time interval between auto scaling checks; must be greater than or equal to 60 seconds (default is 120 seconds)
  cpu_utilization:
    target_utilization: 0.5  # CPU use is averaged across all running instances and is used to decide when to reduce or increase the number of instances (default 0.5)

handlers:
- url: /.*          # regex
  script: ignored   # required, but ignored
  secure: always    # https

beta_settings:
  java_quickstart: true                 # process Servlet 3.1 annotations
  use_endpoints_api_management: true    # enable Google Cloud Endpoints API management
I removed use_endpoints_api_management: true and everything works fine.