What does the CPU Limit mean for App Engine instances? - google-app-engine

I created a site on App Engine and chose the smallest F1 instance class which according to the docs has a CPU Limit of 600 MHz.
I limited the app to 1 instance only as a test and let it run several days then checked the CPU utilization on the dashboard. Here's part of the chart:
As you can see, the utilization, which is given in Megacycles/sec (which I assume is equivalent to MHz), ranges roughly between 700 and 1500.
The app uses only one F1 instance, runs without problems, and there are no quota errors, so what does the 600 MHz CPU limit mean if the utilization is usually above it?

Megacycles/sec is not MHz in this graph. As explained in Interface QuotaService:
Measures the duration that the current request has spent so far processing the request within the App Engine sandbox. Note that time spent in API calls will not be added to this value. The unit the duration is measured in is Megacycles. If all instructions were to be executed sequentially on a standard 1.2 GHz 64-bit x86 CPU, 1200 megacycles would equate to one second physical time elapsed.
In App Engine Flex you get entire CPU cores on the machine you rent, but App Engine Standard reports Megacycles because it runs your code in a sandbox.
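To make the unit concrete, here is a small sketch based on the reference CPU in the QuotaService docs quoted above (the function name is my own, for illustration):

```python
# Convert App Engine megacycles into equivalent physical time on the
# reference CPU described in the QuotaService docs: a standard
# 1.2 GHz 64-bit x86 CPU executes 1200 megacycles per second.
REFERENCE_MEGACYCLES_PER_SECOND = 1200

def megacycles_to_seconds(megacycles):
    """Seconds of physical time the given work would take if executed
    sequentially on the 1.2 GHz reference CPU."""
    return megacycles / REFERENCE_MEGACYCLES_PER_SECOND

# 600 megacycles of work equates to half a second of reference-CPU time:
print(megacycles_to_seconds(600))  # 0.5
```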
Note that there is a feature request in the issue tracker to add a CPU% metric under gae_app for App Engine Standard, and I have relayed your concern about it to the Cloud App Engine product team. However, there is no guarantee of implementation or an ETA at this time. I recommend starring the ticket so that you receive updates about it.

Related

detecting long latency in .NET Core PaaS for GCloud

I am currently experiencing really long latency issues in my .NET Core 2.2 applications.
The setup consists of .NET APIs on App Engine (2 GB memory, 1 CPU, 2 resting instances) which talk to Spanner tables with indexes. Whenever our system comes under load we tend to get a spike where our instance count jumps and latency rises considerably.
On average an API request takes 30 ms, but this then jumps to 208 s, even on instances that do not change. The Spanner requests are quite short, averaging around 0.072502. The trace just shows a blue bar spanning the whole of the request time. We checked for row locks, but these are simply GET requests and show nothing.
Is there anything else I can look at?

App Engine Megacycles/sec per F1 instance

Automatic instance scaling depends on various factors such as the number of concurrent requests, CPU utilization, etc. I would like to be able to look at the App Engine dashboard and see which factor caused the number of instances to increase.
For CPU utilization, it is not clear what the comparison should be. The dashboard presents CPU utilization in Megacycles per second, but the autoscaling CPU utilization parameter is just a number between 0.5 and 0.95.
From here, an F1 instance apparently has a CPU limit of 600 MHz. But this is a frequency, not a CPU limit. Should I interpret it instead as: a fully utilized F1 instance can hit 600 Megacycles per second?
And therefore, if I set target_cpu_utilization = 0.5, can I expect autoscaling to increase the number of instances when the dashboard shows CPU usage above 300 Megacycles/sec × the number of instances?
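In other words, my interpretation would amount to this calculation (assuming, which is not confirmed by the docs, that the 600 MHz figure really is a per-instance rate of 600 megacycles/sec):

```python
# Back-of-the-envelope check of the interpretation above: assume an
# F1 instance can deliver at most 600 megacycles/sec, and that
# target_cpu_utilization scales that per-instance limit.
F1_MEGACYCLES_PER_SECOND = 600  # assumed, from the F1 "600 MHz" figure

def scale_up_threshold(target_cpu_utilization, num_instances):
    """Total dashboard megacycles/sec above which, under this
    interpretation, autoscaling would add instances."""
    return target_cpu_utilization * F1_MEGACYCLES_PER_SECOND * num_instances

# With target_cpu_utilization = 0.5, a single instance would scale up
# past 300 megacycles/sec:
print(scale_up_threshold(0.5, 1))  # 300.0
```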
Indeed, there are many factors that affect scaling on App Engine. There are three types of scaling that you can configure for your application, each of which changes how it is scaled: Automatic scaling, Basic scaling, and Manual scaling.
I recommend taking a look at the documentation How Instances are Managed, which provides more insight into how scaling occurs on App Engine.
Besides that, the following articles provide more information on how to configure the factors that control scaling, which I believe should help you as well.
Designing for scale on App Engine standard environment
app.yaml Configuration File
Let me know if the information helped you!

Why are we experiencing huge latency on one autoscaled Google App Engine instance when several others are available?

Our autoscaling parameters in app.yaml are as follows:
automatic_scaling:
  min_idle_instances: 3
  max_idle_instances: automatic
  max_pending_latency: 30ms
  max_concurrent_requests: 20
The result is 3 resident instances and typically 2-6 dynamic instances (depending on traffic), but the load distribution among the instances seems inefficient. In the screenshot below we see one instance handling the vast majority of requests, with a massive 21 s latency (in the last minute).
To me this indicates there must be something wrong with our setup to explain these high latencies.
Has anyone experienced issues like this with GCP or App Engine?
Idle instances aren't used to balance current load; they bridge the gap while new dynamic instances are spinning up. In your setup it might be worth trying just one or two idle instances and fiddling with min and max pending latency.
Pending latency measures how long a request waits in the queue before it is handled by an instance. The latency you see in your screenshot is the time between request and response, so if any single request takes 21 seconds it would look like this; the pending latency could still be below 30 ms.
You should check your logs to see which requests take so long, and probably break them up into smaller chunks of work. Many small jobs scale much better than a few huge ones. Pending latency will also rise with lots of small jobs, which will cause your app to scale properly.
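Following that advice, a reduced-idle-instance configuration might look like this (the specific values are illustrative assumptions to tune against your own traffic, not recommendations):

```yaml
automatic_scaling:
  # Fewer resident instances: idle instances only bridge the gap
  # while new dynamic instances spin up, so 3 may be more than needed.
  min_idle_instances: 1
  max_idle_instances: 2
  # Let requests wait a little longer in the queue before a new
  # instance is forced, and cap the wait explicitly.
  min_pending_latency: 100ms
  max_pending_latency: 250ms
  max_concurrent_requests: 20
```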

High CPU utilization using f1-micro instance

I am running a site on App Engine (managed VM), currently on f1-micro instances.
The Cloud Platform Console reports that CPU utilization is ~40%. I became a little suspicious because the site is receiving practically zero traffic. Is this normal for an idle Go app on an f1-micro instance?
I logged onto the actual instance and top reports CPU utilization of ~2%.
What gives? Why is top saying something different than the Console?
top gives a momentary measure (I believe it samples every second?), while the Console's data may be averaged over a longer period during which the site had higher activity. With a micro instance, it seems plausible that relatively normal amounts of traffic could take up a relatively high percentage of the CPU, leading to such a metric.
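A toy illustration of that discrepancy (the sample values here are made up purely to show the averaging effect):

```python
# Toy illustration of why a Console average can disagree with an
# instantaneous `top` reading: the Console's window includes earlier
# busy samples, while `top` sees only the current idle moment.
busy_period = [80.0] * 30   # CPU% samples during earlier traffic
idle_period = [2.0] * 30    # CPU% samples while you run `top`

window = busy_period + idle_period
console_average = sum(window) / len(window)
top_reading = idle_period[-1]

print(console_average)  # 41.0  -- looks like "~40% utilization"
print(top_reading)      # 2.0   -- looks nearly idle
```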

gae Runtime MCycles

GAE dashboard shows stats for different URIs of your app. It includes Req/Min, Requests, Runtime MCycles, and Avg Latency.
The help provided seems to be outdated; here is what it states:
The current load table provides two data points for CPU usage, "Avg CPU (API)" and "% CPU". The "Avg CPU (API)" displays the average amount of CPU a request to that URI has consumed over the past hour, measured in megacycles. The "% CPU" column shows the percentage of CPU that URI has consumed since midnight PST with respect to the other URIs in your application.
So I assume Runtime MCycles is what the help calls "Avg CPU (API)"?
How do I map this number to the request stats in the logs?
For example, one request has these log fields: ms=583 cpu_ms=519 api_cpu_ms=402.
Do I understand correctly that ms includes cpu_ms, and cpu_ms includes api_cpu_ms?
So then cpu_ms is the Runtime MCycles shown as an average for the given URI on the dashboard?
I have an F1 instance with 600 MHz and concurrency enabled for my app. Does that mean this instance's throughput is 600 MCycles per second? So if an average request takes 100 MCycles, it should handle 5-6 requests per second on average?
I am digging into this to try to predict the costs for my app under load.
This blog post by Nick Johnson is a useful summary of what the request log fields mean: http://blog.notdot.net/2011/06/Demystifying-the-App-Engine-request-logs
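As a rough sketch of the throughput arithmetic from the question (assuming, as the question does, that the F1 class's 600 MHz translates to a budget of 600 megacycles per second):

```python
# Rough throughput estimate under the question's assumptions: the
# F1 class's "600 MHz" is read as a budget of 600 megacycles/sec.
F1_MEGACYCLES_PER_SECOND = 600  # assumed per-instance budget

def requests_per_second(avg_request_megacycles):
    """Requests/sec a single F1 instance could serve if it were fully
    CPU-bound at the assumed 600 megacycles/sec."""
    return F1_MEGACYCLES_PER_SECOND / avg_request_megacycles

# An average request costing 100 megacycles would allow roughly
# 6 requests per second:
print(requests_per_second(100))  # 6.0
```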
