GAE dashboard shows stats for different URIs of your app. It includes Req/Min, Requests, Runtime MCycles, and Avg Latency.
The help provided seems to be outdated; here is what it states:
The current load table provides two data points for CPU usage, "Avg CPU (API)" and "% CPU". The "Avg CPU (API)" displays the average amount of CPU a request to that URI has consumed over the past hour, measured in megacycles. The "% CPU" column shows the percentage of CPU that URI has consumed since midnight PST with respect to the other URIs in your application.
So I assume Runtime MCycles is what the help calls Avg CPU (API)?
How do I map this number to the request stats in the logs?
For example one of the requests has this kind of logs: ms=583 cpu_ms=519 api_cpu_ms=402.
Do I understand correctly that ms includes cpu_ms and cpu_ms includes api_cpu_ms?
So then cpu_ms is the Runtime MCycles which is shown as average for the given URI on dashboard?
I have an F1 instance with 600 MHz and concurrency enabled for my app. Does that mean this instance's throughput is 600 megacycles per second? So if an average request takes 100 megacycles, it should handle 5-6 requests per second on average?
I am digging into this to try to predict the costs for my app under load.
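To sanity-check the arithmetic in the question above, here is a back-of-the-envelope sketch. It assumes the 600 MHz F1 limit can be read as a megacycle-per-second budget, and the 100-megacycle request cost is the hypothetical figure from the question, not a measured value:

```python
# Rough single-instance throughput estimate, treating the F1 CPU limit
# as a budget of megacycles available per second of wall time.
F1_BUDGET_MCYCLES_PER_SEC = 600  # documented F1 CPU limit (600 MHz)
AVG_REQUEST_MCYCLES = 100        # hypothetical average cost per request

requests_per_sec = F1_BUDGET_MCYCLES_PER_SEC / AVG_REQUEST_MCYCLES
print(requests_per_sec)  # 6.0 -> roughly the 5-6 requests/sec estimated
                         # above, ignoring time blocked on API calls
```

Note this only bounds CPU-limited throughput; with concurrency enabled, requests that spend most of their time waiting on API calls can overlap, so real throughput may be higher.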
This blog post (by Nick Johnson) is a useful summary of what the request log fields mean: http://blog.notdot.net/2011/06/Demystifying-the-App-Engine-request-logs
Related
I created a site on App Engine and chose the smallest F1 instance class which according to the docs has a CPU Limit of 600 MHz.
I limited the app to 1 instance only as a test and let it run several days then checked the CPU utilization on the dashboard. Here's part of the chart:
As you can see, the utilization, which is given in megacycles/sec (which I assumed equals MHz), is usually between roughly 700 and 1500.
The app uses one F1 instance only and runs without problems, and there are no quota errors, but then what does the 600 MHz CPU limit mean if the utilization is usually above it?
Megacycles/sec is not MHz in this graph. As explained in Interface QuotaService:
Measures the duration that the current request has spent so far
processing the request within the App Engine sandbox. Note that time
spent in API calls will not be added to this value. The unit the
duration is measured is Megacycles. If all instructions were to be
executed sequentially on a standard 1.2 GHz 64-bit x86 CPU, 1200
megacycles would equate to one second physical time elapsed.
In App Engine Flex, you get an entire CPU core from the machine you are renting out, but in App Engine Standard, it shows the Megacycles since it uses a sandbox.
Note that there is a feature request in the issue tracker about adding a CPU% metric under gae_app for App Engine standard, and I have relayed your concern about it to the Cloud App Engine product team. However, there is no guarantee of implementation or an ETA at this time. I recommend starring the ticket so that you receive updates about it.
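The conversion described in the quoted QuotaService documentation can be expressed directly. A minimal sketch, using only the documented reference rate (1200 megacycles = 1 second on a 1.2 GHz CPU):

```python
# Reference rate from the QuotaService docs: a standard 1.2 GHz CPU
# executes 1200 megacycles per second of physical time.
REFERENCE_MCYCLES_PER_SEC = 1200.0

def megacycles_to_seconds(megacycles):
    """Convert a megacycle count to equivalent seconds of work
    on the 1.2 GHz reference CPU."""
    return megacycles / REFERENCE_MCYCLES_PER_SEC

# A graph reading of 1500 megacycles/sec therefore corresponds to
# 1.25 reference-CPU-seconds of work per wall-clock second, which is
# why the plotted number can sit above the nominal 600 MHz limit.
print(megacycles_to_seconds(1500))  # 1.25
```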
I have attached two Google Stackdriver Profilers for our backend server.
The backend API simply tries to read from Memcache first; if the entry doesn't exist or the read times out, it then retrieves the data from Bigtable.
Based on the wall time and CPU time profiles, I'd like to know
Why does the libjvm_so process consume that much (45.8%) of the CPU time? Is it because the server allocates lots of memory, causing garbage collection to use a lot of CPU?
The wall time profile shows 97% of thread time is spent waiting for a resource. I guess it's waiting for the Memcache/Bigtable server to return data; is that true? Does this mean the server doesn't need that much CPU? (Currently 16 CPUs, with an average load of 35%.)
Any other insights? What can be improved, etc.?
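For reference, the read path described in the question is a classic cache-aside pattern. A minimal, backend-agnostic sketch of it (the `cache` object and `fetch_from_bigtable` callable are hypothetical stand-ins, not the actual server's API):

```python
def cached_read(key, cache, fetch_from_bigtable, ttl=60):
    """Try the cache first; on a miss or cache error, fall back to
    the authoritative store and repopulate the cache."""
    try:
        value = cache.get(key)
    except Exception:        # e.g. a Memcache timeout
        value = None
    if value is not None:
        return value
    value = fetch_from_bigtable(key)   # authoritative source
    try:
        cache.set(key, value, ttl)
    except Exception:
        pass                 # a failed cache write is non-fatal
    return value

class DictCache:
    """In-memory stand-in for Memcache, for the demo below."""
    def __init__(self):
        self._store = {}
    def get(self, key):
        return self._store.get(key)
    def set(self, key, value, ttl):
        self._store[key] = value

cache = DictCache()
print(cached_read("row1", cache, lambda k: "value-for-" + k))   # fetched, then cached
print(cached_read("row1", cache, lambda k: "SHOULD-NOT-RUN"))   # served from cache
```

With a pattern like this, the 97% wait time in the wall profile is expected: each request thread mostly blocks on the cache or Bigtable RPC rather than burning CPU.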
I have an app engine application with some services being based on webapp2 framework and some service being based on endpoints-v2 framework.
The issue I am facing is that sometimes the OPTIONS request sent from the front end takes a huge amount of time to get a response back, varying from 10 to 15 seconds, which adds latency to my entire application. On digging deeper into the issue, I found that it is due to instance startup time.
So my questions are:
Does starting up an instance take this much time?
If not, how can I reduce the startup time for my instances?
How do instances start, so that I can optimise those situations in my code?
Java instances take a long time to spin up. You can hide the latency by configuring a warmup request and min-idle-instances (see here) in your appengine-web.xml.
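For example, a sketch of the relevant appengine-web.xml settings (the min-idle-instances value of 1 is illustrative; tune it against the cost of keeping instances resident):

```xml
<appengine-web-app xmlns="http://appengine.google.com/ns/1.0">
  <!-- Enable warmup requests: App Engine calls /_ah/warmup on a new
       instance before routing live traffic to it -->
  <inbound-services>
    <service>warmup</service>
  </inbound-services>
  <!-- Keep an idle instance ready so requests rarely hit a cold start -->
  <automatic-scaling>
    <min-idle-instances>1</min-idle-instances>
  </automatic-scaling>
</appengine-web-app>
```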
No matter what I set rate and bucket_size to, I only ever see one or two tasks running at the same time. Does anyone know what the reason might be?
My queue configuration is:
name: MyQueue
rate: 30/s
bucket_size: 50
Note that the documentation mentions:
To ensure that the taskqueue system does not overwhelm your
application, it may throttle the rate at which requests are sent. This
throttled rate is known as the enforced rate. The enforced rate may be
decreased when your application returns a 503 HTTP response code, or
if there are no instances able to execute a request for an extended
period of time.
In such situations, the admin console will show an Enforced Rate lower than the 30/s you requested.
Take these stats from a post from the App Engine blog as an example:
real = 107ms
cpu = 141ms
api = 388ms
overhead = 1ms
RPC Total: 63ms (388ms api)
Grand Total: 107ms (530ms cpu + api)
I think I understand overhead: it gives the amount of time taken to write the logs, excluding the time it took to store the logs in memcache.
I am confused by the other numbers:
What exactly do real, cpu and api mean?
How is api different from RPC total?
What is the "Grand Total"?
This is my understanding:
real is the time as measured by a clock. This is time elapsed.
api usage is the time spent on RPCs, such as accessing the datastore. This is not truly a time, but an amount of computing resources measured in time.
cpu usage is the time spent executing code. Again, it's not really a time but resource usage as measured in time.
api is different from RPC Total only in that RPC Total shows the amount of clock time that elapsed during the api work. It's possible to do 388ms of computation in 63ms because of parallelism. So RPC Total shows both the clock time spent and the resource usage.
Grand Total is the total wall time (the same as real), shown alongside the sum of cpu, api, and overhead. In this case, 530ms of quota is used in 107ms.
overhead is, of course, time "wasted" waiting for "real" work to be done. This mostly includes the resources taken by Appstats itself.
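Plugging the example's numbers into this reading gives a quick sanity check (a sketch; the figures are the ones quoted from the blog post above):

```python
# Figures from the Appstats example above, all in milliseconds.
real, cpu, api, overhead = 107, 141, 388, 1
rpc_total_wall = 63  # wall-clock time spent inside RPCs

# Resource usage ("quota time") adds up across categories...
assert cpu + api + overhead == 530  # the "530ms cpu + api" in Grand Total

# ...while wall time does not: 388ms of api work fit into 63ms of
# clock time because the RPCs ran in parallel.
print(api / rpc_total_wall)  # roughly 6x overlap during RPCs
```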
See the document Appstats: RPC Instrumentation for Google App Engine by Guido van Rossum for details.
Guido van Rossum gave a talk at Google I/O 2010 called Appstats - Instrumentation for App Engine where he discusses this briefly. It's a great talk to learn about App Engine, and optimization and instrumentation in general. It's about an hour long.