Dynamic Instances: why so many? can these be limited? - google-app-engine

I'm maintaining a production app on GAE. The number of 'dynamic' instances we have seems very high. According to our app.yaml, these should be 'F4' instances. Some of these instances serve lots of requests - and others very few (see pic below). I have no idea why.
My questions:
1. Why are some instances very busy and others not?
2. Can I limit the number of dynamic instances? (See the sketch below the config.)
EDIT: I'm adding some more details here:
The pic above shows a few dynamic instances; there are a lot more, as many as 24 in total.
In the app.yaml we configured min/max idle instances of 1. This (presumably) shows up as the single 'resident' instance I have.
Below is the relevant portion of my app.yaml.
I understand that the idle instance kicks in when the other instances can't handle the load. Perhaps this is triggered by requests that take too long to complete (and perhaps this is why I have so many dynamic instances?).
runtime: python27
api_version: 1
threadsafe: yes
instance_class: F4
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 1
  # ...
libraries:
- name: webapp2
  version: "2.5.1"
- name: ssl
  version: latest
inbound_services:
- warmup
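To the second question: recent SDKs accept an explicit cap in the standard environment's automatic_scaling block, which bounds how many dynamic instances the scheduler may create. A minimal sketch, assuming your SDK supports max_instances on python27 (the value 10 is purely illustrative):
automatic_scaling:
  min_idle_instances: 1
  max_idle_instances: 1
  max_instances: 10  # hard cap on dynamic instances (illustrative value)
Keep in mind that capping instances makes requests queue longer on each one, so watch pending latency after setting it.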

Related

Uptime checks keeping some AppEngine instances alive and others not?

We've noticed that using GCP's Monitoring Uptime checks on one AppEngine service appears to keep it 'alive' unnecessarily - removing the monitoring configuration reduced the running instances to 0. However, we have two other AppEngine services that would happily reduce to 0 instances, even with the monitoring in place.
We've been unable to find any difference in configuration. The one other visible difference we spotted was in the 'Traffic' graphs: the instances that still shut down included 'Sent (cached)' and 'Received (cached)' as series on the graph (alongside Sent and Received):
Whereas the 'problem' service only has Sent and Received:
There is no cloud load balancing in place - we're just using AppEngine to map the endpoints.
Config for both looks like this, with different handlers configured:
runtime: python310
env: standard
instance_class: F1
handlers:
- *snip*
automatic_scaling:
  min_idle_instances: automatic
  max_idle_instances: automatic
  min_pending_latency: automatic
  max_pending_latency: automatic
service_account: XXXX@appspot.gserviceaccount.com
Can anyone clarify what might be different between these? Thank you.
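For what it's worth, the python310 runtime also accepts explicit instance bounds instead of 'automatic'. A sketch, assuming you want to rule the defaults out as the cause; values are illustrative:
automatic_scaling:
  min_instances: 0  # explicitly allow scaling to zero when idle
  max_instances: 2  # illustrative cap
Note that an uptime check is ordinary HTTP traffic, so as long as it keeps hitting a service, at least one instance will generally stay up regardless of these settings.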

App Engine standard vs. flexible min instances

Running an app in the standard environment with instance_class: F2 (512 MB/1.2 GHz), min_instances: 1, max_instances: 3 it almost always runs with 2 or 3 active/billed instances, rarely dropping to 1.
Attempting to match the config above in the flexible environment, if I run the app with cpu: 1, memory_gb: 0.5, max_num_instances: 3 it runs most of the time with only a single instance active, halving the cost.
Without having to resort to max_instances: 1, is there a way to configure the standard environment to bring it more in line with the flexible one? I've tried using target_cpu_utilization and max_concurrent_requests, e.g. target_cpu_utilization: 0.8 and max_concurrent_requests: 80, but this makes no difference.
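For reference, a sketch of the standard-environment settings that push the scheduler toward fewer instances, combining the options mentioned above (all values illustrative, not a recommendation):
automatic_scaling:
  min_instances: 1
  max_instances: 3
  max_idle_instances: 1        # shed idle instances aggressively
  target_cpu_utilization: 0.8  # let instances run hotter before scaling out
  max_concurrent_requests: 80  # pack more requests onto each instance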

High latency on webapp2 endpoint under appengine

I asked about this in the appengine user group here and wasn't able to resolve the issue.
The issue I'm having is that, for a seemingly very light endpoint and others like it, latency seems to be an issue. Here's an example request as shown in GCP's Trace tool:
I'm aware that if it's a request that spawned a new instance or if memory usage is high, that would explain high latency. Neither of which are the case here.
It seems that intermittently some endpoints simply take a good second or two to respond on top of however long the endpoint itself takes to do its job, and the bulk of that is "untraced time" under GCP Stackdriver's Trace tool. I put a log entry as early as I possibly could, in webapp2's RequestHandler object on initialization. You can see that in the screenshot as "webapp2 request init".
I'm not familiar enough with webapp2's inner working to know where I could put a log elsewhere that could help explain this, if anywhere.
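For concreteness, the early log line described above looks roughly like this; the handler class and route here are mine, and initialize() is the earliest per-request hook webapp2 exposes:
import logging
import time

import webapp2

class TimedHandler(webapp2.RequestHandler):
    def initialize(self, request, response):
        # Earliest point webapp2 hands us the request; any latency before
        # this line shows up as "untraced time" in the Trace tool.
        logging.info('webapp2 request init at %f', time.time())
        super(TimedHandler, self).initialize(request, response)

    def get(self):
        self.response.write('ok')

app = webapp2.WSGIApplication([('/example', TimedHandler)])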
These are the scaling settings for that appengine service defined in my yaml file:
instance_class: F4
automatic_scaling:
  max_idle_instances: 1
  max_pending_latency: 1s
  max_concurrent_requests: 50
Not sure what other information would be useful here.

Target micro instance in Flexible AppEngine

Is it possible to target a custom runtime to use a micro instance?
I tried with:
resources:
  cpu: 0.5
  memory_gb: 0.6
  disk_size_gb: 10
But a small instance is started.
Add the following to your app.yaml:
beta_settings:
  machine_type: f1-micro
In App Engine, the instance type cannot be specified for the flex environment (unlike the standard environment in Python, Java and Go).
In the flex environment, the instance types are derived from the resource settings specified in your app.yaml file.
App Engine Resource Settings
It appears that, for some reason, the pricing calculator lets you enter small values so the instance appears similar to an f1-micro (I was looking for around $5/month to run it), but when it comes to specifying it in app.yaml it defaults to something that costs a whole lot more than an f1-micro, and you can't set the values any lower than what the OP shows.
example very small flex instance in pricing calculator

Is it possible to specify a machine type (e.g. small/micro) when deploying to Managed VM?

I'm migrating some simple web apps (Node based static pages with some questionnaires and a very small amount of back end processing) to App Engine. I have them working well. Impressed with how easy it was!
However, I have a couple of questions that baffle me.
1) Why does GCE always deploy 2 machines? Is there a way of specifying to only run 1? I really don't need loads of redundancy, and our traffic is expected to be light.
2) I have tried to specify the machine type in app.yaml to be 'micro'. Call me cheap, but we really don't need much capacity. I have tried various parameters, e.g.
resources:
  cpu: .5
  memory_gb: .2
  disk_size_gb: 10
but it always seems to deploy 'small' machines. Is there a log somewhere that would tell me that the command was valid, but it chose to ignore it?
Thanks in advance.
Ah ha! Sorry, with a bit more googling around I found an answer to Q2:
Setting f1-micro resource limits in app.yaml for google cloud compute node.js app without vm_settings
As Jeff and Greg both replied, "Google adds a little overhead on the VM before picking a machine type. This is around 400MB of RAM. So they told me if you want an f1-micro try requesting .2 or lower as Greg mentioned."
I had to drop to .18 to get it to deploy as f1-micro, but the general idea that Google is adding overhead is solid.
Dropping down the memory_gb to 0.18 did the trick.
Simply adding
resources:
  cpu: .5
  memory_gb: .18
  disk_size_gb: 10
and deploying with the command
gcloud preview app deploy --stop-previous-version --force --promote
to make damn sure it was made #1 seemed to work - no loss in performance so far.
You can also specify the machine type, not just the required resources, by adding this to app.yaml:
beta_settings:
  machine_type: f1-micro
Also, if you want to always use 1 instance, add this:
manual_scaling:
  instances: 1
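Putting those pieces together, a minimal Managed VM app.yaml for a single cheap Node instance might look like this; a sketch assembled from the answers above, not a tested config:
runtime: nodejs
vm: true
beta_settings:
  machine_type: f1-micro  # pin the machine type directly
manual_scaling:
  instances: 1            # exactly one VM, no redundancy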
