Uptime checks keeping some AppEngine instances alive and others not? - google-app-engine

We've noticed that using GCP's Monitoring Uptime checks on one AppEngine service appears to keep it 'alive' unnecessarily - removing the monitoring configuration reduced the running instances to 0. However, we have two other AppEngine services that would happily reduce to 0 instances, even with the monitoring in place.
We've been unable to find any difference in configuration. The one other visible difference we spotted was in the 'Traffic' graphs: for the instances that still shut down, the graph included 'Sent (cached)' and 'Received (cached)' as series (alongside Sent and Received):
Whereas the 'problem' service only has Sent and Received:
There is no cloud load balancing in place - we're just using AppEngine to map the endpoints.
The config for both looks like this, with different handlers configured:
runtime: python310
env: standard
instance_class: F1
handlers:
- *snip*
automatic_scaling:
  min_idle_instances: automatic
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic
service_account: XXXX@appspot.gserviceaccount.com
Can anyone clarify what might be different between these? Thank you.

Related

GAE: Specify min_instances only for default service version

We have a service running on Google App Engine.
If that service does not receive traffic for some time, all instances are killed and the next call takes a few additional seconds while the application starts.
We are thinking about specifying a min_instances option in app.yaml to always keep at least one instance alive.
We deploy new versions of that service quite frequently and keep old versions around for some time. Those old versions are not serving traffic and are kept just in case.
What we would like to do is to always keep at least one instance of default service version alive and leave all other non-default versions with default behavior – we want them to be scaled automatically to 0 instances if they do not receive any traffic.
I didn't find such an option in the documentation (https://cloud.google.com/appengine/docs/standard/python3/config/appref#scaling_elements) and couldn't come up with any workarounds.
I am thinking about creating a cron job (https://cloud.google.com/appengine/docs/flexible/python/scheduling-jobs-with-cron-yaml) which will periodically "ping" only the default version of my application so that it never goes to sleep. But I am not sure if it is a good solution.
Are there any better solutions to such case?
Thanks!
The min_idle_instances config option seems to solve my problem.
Note the following in the documentation: "This setting only applies to the version that receives most of the traffic", which is almost exactly my case:
automatic_scaling:
  min_idle_instances: 1
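For comparison, the cron-based ping considered in the question could be sketched roughly as below. This is only an illustration: the /ping URL and the five-minute interval are assumptions rather than anything from the original setup, and the app would need a handler for that URL. With no target specified, the request should be routed to the version of the default service that is configured for traffic, but check the cron.yaml reference before relying on that.

```yaml
# cron.yaml - hypothetical keep-alive job, not a tested config
cron:
- description: "keep-alive ping for the serving version"
  url: /ping                  # hypothetical endpoint; must be handled by the app
  schedule: every 5 minutes   # arbitrary interval chosen for illustration
```

Compared with this, min_idle_instances needs no extra handler or scheduled job.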

High latency on webapp2 endpoint under appengine

I asked about this in the appengine user group here and wasn't able to resolve the issue.
The issue I'm having is that, for seemingly a very light endpoint and others like it, latency seems to be an issue. Here's an example request as shown in GCP's Trace tool:
I'm aware that a request that spawned a new instance, or high memory usage, would explain high latency. Neither is the case here.
It seems that some endpoints intermittently take a good second or two to respond on top of however long the endpoint itself takes to do its job, and the bulk of that is "untraced time" under GCP Stackdriver's Trace tool. I put a log entry as early as I possibly could, in the initialization of webapp2's RequestHandler object. You can see that in the screenshot as "webapp2 request init".
I'm not familiar enough with webapp2's inner workings to know where else I could put a log entry that might help explain this, if anywhere.
These are the scaling settings for that appengine service defined in my yaml file:
instance_class: F4
automatic_scaling:
  max_idle_instances: 1
  max_pending_latency: 1s
  max_concurrent_requests: 50
Not sure what other information would be useful here.

App Engine Standard Nodejs8 ignore memory_gb in resources

I'm trying to deploy a memory-intensive Nodejs8 app on Google App Engine Standard.
This is my app.yaml:
runtime: nodejs8
resources:
  cpu: 1
  memory_gb: 6
  disk_size_gb: 10
This is my deploy command:
gcloud app deploy --project=my-project --version=0-0-12
This is the error I get when I try to access the relevant endpoint of the app:
Exceeded soft memory limit of 128 MB with 182 MB after servicing 0 requests total. Consider setting a larger instance class in app.yaml.
How come the memory_gb param is ignored? What do I need to do in order to enlarge the memory of the instances?
You're attempting to use flexible environment Resource settings in a standard environment app.yaml file, which won't work. Note that in most cases the invalid settings will be silently ignored, so you need to be careful.
For the standard environment you can't explicitly pick individual resources; you can only use the instance_class option in Runtime and app elements:
instance_class
Optional. The instance class for this service.
The following values are available depending on your service's scaling:
Automatic scaling
  F1, F2, F4, F4_1G
  Default: F1 is assigned if you do not specify an instance class along with the automatic_scaling element.
Basic and manual scaling
  B1, B2, B4, B4_1G, B8
  Default: B2 is assigned if you do not specify an instance class along with the basic_scaling element or the manual_scaling element.
Note: If instance_class is set to F2 or higher, you can optimize your instances by setting max_concurrent_requests to a value higher than 10, which is the default. To find the optimal value, gradually increase it and monitor the performance of your application.
The maximum amount of memory available in the currently supported standard environment instance classes is 1G; if you actually need 6G, you'll have to migrate to the flexible environment.
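As a rough sketch of the standard-environment route, the app.yaml from the question could drop the resources block and name an instance class instead. F4_1G is taken from the list quoted above; whether it gives this particular app enough memory is an assumption you would have to verify.

```yaml
# app.yaml (standard environment) - sketch, not a tested config
runtime: nodejs8
# Replaces the flexible-only resources block; this is the largest
# automatic-scaling class in the list quoted above.
instance_class: F4_1G
```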
Side note: potentially useful: How to tell if a Google App Engine documentation page applies to the standard or the flexible environment

Dynamic Instances: why so many? can these be limited?

I'm maintaining a production app on GAE. The number of 'dynamic' instances we have seems very high. According to our app.yaml, these should be 'F4' instances. Some of these instances serve lots of requests - and others very few (see pic below). I have no idea why.
My questions:
Why are some instances very busy and others not?
Can I limit the number of dynamic instances?
EDIT: I'm adding some more details here:
The pic above shows a few dynamic instances. There are a lot more, as many as 24 in total.
In the app.yaml we configured 1 min/max idle instances. This (presumably) shows up as the single 'resident' instance I have.
Below is the relevant portion of my app.yaml.
I understand that the idle machine kicks in when other machines can't handle the load. Perhaps this is triggered by requests that take too long to complete (and perhaps this is why I have so many dynamic machines?)
runtime: python27
api_version: 1
threadsafe: yes
instance_class: F4
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 1
....
libraries:
- name: webapp2
  version: "2.5.1"
- name: ssl
  version: latest
inbound_services:
- warmup
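On the second question, one setting that may help is max_instances under automatic_scaling, which puts an upper bound on how many instances App Engine starts for the version. A minimal sketch, assuming the setting is available for this runtime and tooling; the value 10 is arbitrary and not taken from the original app.yaml:

```yaml
instance_class: F4
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 1
  # Hypothetical cap on the total number of instances; pick a value
  # that fits the actual traffic.
  max_instances: 10
```

With a cap in place, requests may queue instead of triggering new instances once the limit is reached, so latency is worth watching after the change.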

Target micro instance in Flexible AppEngine

Is it possible to target a custom runtime to use a micro instance?
I tried with:
resources:
  cpu: 0.5
  memory_gb: 0.6
  disk_size_gb: 10
But a small instance is started.
Add the following to your app.yaml:
beta_settings:
  machine_type: f1-micro
In App Engine, the instance type cannot be specified for the flex environment (unlike the standard environment in Python, Java and Go).
In the flex environment, the instance types are derived from the resource settings specified in your app.yaml file.
App Engine Resource Settings
It appears that, for some reason, the pricing calculator lets you enter small values so that the instance appears similar to an f1-micro (I was looking for around $5/month to run it). Alas, when it comes to specifying it in app.yaml, it defaults to something that costs a whole lot more than an f1-micro, and you can't set the values to anything lower than what OP shows.
example very small flex instance in pricing calculator
