I am running a site on App Engine (managed VM). It is currently running on f1-micro instances.
The Cloud Platform Console reports that CPU utilization is ~40%. I became a little suspicious because the site is receiving practically zero traffic. Is this normal for an idle golang app on an f1-micro instance?
I logged onto the actual instance and "top" reports CPU utilization ~2%.
What gives? Why is "top" saying something different than the Console?
top gives a near-momentary measure (it refreshes roughly every second by default), while the Console's graph is averaged over a much longer window, during which the site may have had higher activity. On a micro instance it seems plausible that fairly modest traffic could consume a relatively high percentage of the CPU, producing such a metric.
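One way to compare like with like is to average CPU use over a longer window yourself, from /proc/stat, instead of reading top's ~1-second snapshot. This is just a sketch; the 5-second window is an arbitrary choice, and the Console's actual aggregation window is longer still.

```shell
# avg_cpu N: average CPU utilization (%) over N seconds, computed from
# the aggregate "cpu" line of /proc/stat (fields: user nice system idle ...).
avg_cpu() {
    read -r _ u1 n1 s1 i1 _ < /proc/stat
    sleep "$1"
    read -r _ u2 n2 s2 i2 _ < /proc/stat
    busy=$(( (u2 + n2 + s2) - (u1 + n1 + s1) ))
    idle=$(( i2 - i1 ))
    echo $(( 100 * busy / (busy + idle) ))
}

avg_cpu 5   # prints a whole-number utilization percentage
```

Running this repeatedly while watching the Console graph should show whether the gap really is a sampling-window artifact or something else.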
Related
I'm using an App Engine task handler to process a workload (importing files into a database).
Looking at my Cloud SQL monitoring, I see that after some minutes the write rate declines (see picture) and my task runs much slower. Does Google throttle the instance's CPU, or might there be other reasons?
According to the CPU platforms documentation (https://cloud.google.com/compute/docs/cpu-platforms), all cores have a turbo frequency, although it is not guaranteed:
All-core turbo frequency: The frequency at which each CPU typically
runs when all cores in the socket are not idle at the same time.
This post explains how you can monitor your CPU speed: https://askubuntu.com/questions/218567/any-way-to-check-the-clock-speed-of-my-processor
You can SSH into the machine and monitor it in real time.
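Concretely, once you have SSHed in, something like the following shows the live clock speed per core (this assumes a Linux x86 guest where /proc/cpuinfo exposes a "cpu MHz" field):

```shell
# Print each core's current clock speed; re-run while the workload
# executes to see whether turbo is actually being sustained.
grep 'MHz' /proc/cpuinfo
# For a continuously refreshing view (Ctrl-C to stop):
# watch -n1 "grep 'MHz' /proc/cpuinfo"
```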
Most services, including Cloud SQL, enforce an IOPS quota based on disk size and other factors.
Your screenshot indicates that you have exceeded the read IOPS quota for Cloud SQL. The result is throttling of disk I/O.
When you created the Cloud SQL instance, you selected a very small storage disk. I recommend resizing that disk larger so that normal operations do not exceed the disk IOPS quota for either reads or writes.
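As a back-of-envelope check of what a given disk size buys you: the 30 IOPS-per-GB rate below matches Google's published per-GB figure for SSD persistent disks, but treat both numbers here as assumptions and verify against the current quota docs for your storage type.

```shell
# Rough sustained-IOPS ceiling for a small Cloud SQL disk.
disk_gb=10        # hypothetical disk size
iops_per_gb=30    # assumed SSD persistent-disk per-GB rate
echo "approx sustained IOPS cap: $(( disk_gb * iops_per_gb ))"   # prints 300
```

So a 10 GB disk tops out around a few hundred IOPS, which a bulk import can saturate easily; doubling the disk doubles the ceiling.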
I am currently experiencing really long latency issues with my .NET Core 2.2 applications.
The setup consists of .NET APIs on App Engine (2 GB memory, 1 CPU, 2 resident instances) that talk to Spanner tables with indexes. Whenever our system comes under load we tend to get a spike where our instance count jumps and latency rises considerably.
On average an API request takes 30 ms, but this jumps to 208 s, even on instances that do not change. The Spanner requests themselves are quite short, averaging around 0.072502. The trace just shows a blue bar spanning the whole of the request time. I checked for row locks, but these are simply GET requests and the check shows nothing.
Is there anything else I can look at?
I created a site on App Engine and chose the smallest F1 instance class which according to the docs has a CPU Limit of 600 MHz.
I limited the app to 1 instance only as a test and let it run several days then checked the CPU utilization on the dashboard. Here's part of the chart:
As you can see, the utilization, which is given in megacycles/sec (which I assumed equals MHz), is between roughly 700 and 1500.
The app uses one F1 instance only and runs without problems; there are no quota errors. But then what does the 600 MHz CPU limit mean if the utilization is usually above it?
Megacycles/sec is not MHz in this graph. As explained in the Interface QuotaService documentation:
Measures the duration that the current request has spent so far
processing the request within the App Engine sandbox. Note that time
spent in API calls will not be added to this value. The unit the
duration is measured is Megacycles. If all instructions were to be
executed sequentially on a standard 1.2 GHz 64-bit x86 CPU, 1200
megacycles would equate to one second physical time elapsed.
In App Engine Flex you get an entire CPU core from the machine you are renting, but App Engine Standard shows megacycles because it uses a sandbox.
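Using the reference point from the quoted docs (1200 megacycles equals one second on a 1.2 GHz CPU), you can translate the dashboard number into equivalent reference-CPU seconds per wall-clock second; the 1500 figure below is just the top of the range from the chart:

```shell
# 1500 Mcycles/sec against the 1200 Mcycles-per-second reference CPU:
awk 'BEGIN { printf "%.2f reference-CPU sec per wall sec\n", 1500 / 1200 }'   # prints 1.25 ...
```

A value above 1.0 simply means the sandboxed work would take more than one second on the reference CPU, not that a physical 600 MHz cap is being violated.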
Note that there is a feature request in the issue tracker for adding a CPU% metric under gae_app for App Engine Standard, and I have relayed your concern about it to the Cloud App Engine product team. However, there is no guarantee of implementation or an ETA at this time. I recommend starring the ticket so that you receive updates about it.
I have attached two Google Stackdriver Profilers for our backend server.
The backend API simply tries to read from Memcache first; if the entry doesn't exist or the call times out, it retrieves the data from Bigtable.
Based on the wall time and CPU time profiles, I'd like to know
Why does the libjvm.so process consume that much (45.8%) CPU time? Is it because the server allocates lots of memory, causing garbage collection to use a lot of CPU?
The wall time profile shows 97% of thread time is spent waiting for a resource; I guess it's waiting for the Memcache/Bigtable server to return data, is that right? Does this mean the server doesn't need that much CPU? (Currently 16 CPUs, with an average load of 35%.)
Any other insights? What could be improved, etc.?
I have a simple app running on App Engine but I'm having odd problems with latency. It's a Python 2.7 app, and a loading request takes between 1.5 and 10 seconds (I guess depending on how GAE is feeling). This is a low-traffic site right now, so previously GAE was sitting with no idle instances and most requests were loading requests, resulting in a long wait time on the first page view.
I've tried configuring the minimum number of idle instances to "1" so that these infrequent page views can immediately hit a warm instance.
However, I've seen several cases now where even with one instance sitting unused, GAE will route an incoming request to a loading instance, leaving the warm instance untouched:
gae dashboard showing odd scheduling
How can I prevent this from happening? I feel I must be understanding something wrong, because I certainly don't expect this behavior.
Update: Also, what makes this even less comprehensible is that the app has threadsafe enabled, so I really don't understand why GAE would get flustered and spin up an instance for a single, lone request.
Actually, I believe this is normal behavior. Idle instances are meant to guarantee that a minimum number of instances is always available (for spiky load).
So when requests start coming in, they are initially served by the idle instances, but at the same time the App Engine scheduler starts launching new instances to keep the same number of idle instances available even during a sudden load increase; that is, to "cover" for the idle instances that became busy serving requests.
It is described in detail on the Adjusting Application Performance page.
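If your app is deployed as a module, the relevant knobs live under automatic_scaling in app.yaml; the values below are an illustrative sketch, not a recommendation, and raising min_pending_latency is the usual lever for keeping single stray requests on the warm instance instead of triggering a spin-up:

```yaml
automatic_scaling:
  min_idle_instances: 1       # keep one warm instance around
  max_idle_instances: 1       # but don't let the scheduler hoard more
  min_pending_latency: 3000ms # let a request wait up to 3 s for a busy/warm
                              # instance before a new instance may be started
```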
Arrrgh! I suffer from this myself. This topic area has come up in several threads (GAE groups & SO). If someone can dial in the settings for a low-traffic site (billing on/off), that would be a real benefit. IIRC, someone with what I think is deep GAE experience noted in one thread that the scheduler does not do well with very-low-volume apps. I have also seen wildly different startup times within a relatively short period: painful to see a spinup take 700 ms and then 7000 ms just a few minutes later. Overall the issue is not so much the cost to me but the waste of infrastructure resources. In testing I've had two instances running despite pinging the app with an RPC only once every few minutes. If 50k other developers are testing similarly, that could accumulate into a significant waste.