I've deployed an app on Google Cloud App Engine instances.
I've enabled an autoscaling group that should increase the number of instances when their CPU utilization exceeds 60% (as Google suggests).
In the GCP dashboard I now see this screen:
As you can see there are 3 instances (I've set the minimum number of instances to 3).
Hovering over the blue line of the graph, I see that the CPU percentage changes over time, but I can't tell whether this percentage is the total for the 3 instances or something else.
In the image above, is 5.38% the CPU percentage of a specific instance, or is it the sum of the percentages of the 3 instances?
I would like to know how to retrieve the CPU utilization for each of the 3 instances.
Related
I created a site on App Engine and chose the smallest F1 instance class which according to the docs has a CPU Limit of 600 MHz.
As a test, I limited the app to 1 instance, let it run for several days, and then checked the CPU utilization on the dashboard. Here's part of the chart:
As you can see, the utilization, which is given in Megacycles/sec (which I assume equals MHz), is between roughly 700 and 1500.
The app uses only one F1 instance, runs without problems, and there are no quota errors. So what does the 600 MHz CPU limit mean if the utilization is usually above it?
Megacycles/sec is not MHz in this graph. As explained in Interface QuotaService:
Measures the duration that the current request has spent so far
processing the request within the App Engine sandbox. Note that time
spent in API calls will not be added to this value. The unit the
duration is measured is Megacycles. If all instructions were to be
executed sequentially on a standard 1.2 GHz 64-bit x86 CPU, 1200
megacycles would equate to one second physical time elapsed.
In App Engine Flex, you get an entire CPU core from the machine you are renting out, but in App Engine Standard, it shows the Megacycles since it uses a sandbox.
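As a rough illustration of the quoted definition, the sketch below converts megacycles into seconds on the 1.2 GHz reference CPU. Applying the same reference to the dashboard's Megacycles/sec readings is an assumption made here for illustration, and the function names are hypothetical.

```python
# From the quoted docs: 1200 megacycles equal one second on the
# standard 1.2 GHz reference CPU.
REFERENCE_MEGACYCLES_PER_SECOND = 1200.0


def megacycles_to_seconds(megacycles: float) -> float:
    """Seconds the reference CPU would need to execute `megacycles`."""
    return megacycles / REFERENCE_MEGACYCLES_PER_SECOND


def reference_cpu_fraction(megacycles_per_sec: float) -> float:
    """Share of one reference CPU implied by a Megacycles/sec reading.

    Assumption: the dashboard uses the same 1.2 GHz reference as the
    quoted QuotaService definition.
    """
    return megacycles_per_sec / REFERENCE_MEGACYCLES_PER_SECOND


if __name__ == "__main__":
    # The readings mentioned in the question (700 and 1500 Megacycles/sec).
    for reading in (700.0, 1500.0):
        print(reading, "Megacycles/sec ->", reference_cpu_fraction(reading))
```

Under that assumption, a reading is a rate of reference-CPU work rather than a clock frequency, so it is not directly comparable to the 600 MHz figure.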
Note that there is a feature request in the issue tracker for adding a CPU% metric under gae_app for App Engine standard, and I have relayed your concern to the Cloud App Engine product team. However, there is no guarantee of implementation and no ETA at this time. I recommend starring the ticket so that you receive updates about it.
Automatic instance scaling depends on various factors, such as the number of concurrent requests, CPU utilization, etc. I would like to be able to look at the App Engine dashboard and see which factor caused the number of instances to increase.
For CPU utilization, it is not clear what the comparison should be. The dashboard presents CPU utilization in Megacycles per second, but the autoscaling CPU utilization parameter is just a number between 0.5 and 0.95.
From here, an F1 instance apparently has a CPU limit of 600 MHz. This is a frequency, not a CPU limit. Should I instead interpret this as meaning that a fully utilized F1 instance can reach 600 Megacycles per second?
And therefore, if I set target_cpu_utilization = 0.5, can I expect autoscaling to increase the number of instances when the dashboard shows a CPU usage of more than 300 Megacycles/sec * # instances?
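The arithmetic behind that question can be sketched as follows. This encodes only the interpretation proposed in the question (600 Megacycles/sec as the ceiling of a fully utilized F1 instance), which is unconfirmed; the function name and the `per_instance_limit` parameter are hypothetical.

```python
def hypothetical_scale_up_threshold(
    target_cpu_utilization: float,
    instances: int,
    per_instance_limit: float = 600.0,  # assumed F1 ceiling, Megacycles/sec
) -> float:
    """Dashboard reading above which scaling would start, IF the
    question's interpretation were correct (it is not confirmed)."""
    # The question states that the parameter ranges from 0.5 to 0.95.
    if not 0.5 <= target_cpu_utilization <= 0.95:
        raise ValueError("target_cpu_utilization must be between 0.5 and 0.95")
    if instances < 1:
        raise ValueError("instances must be at least 1")
    return target_cpu_utilization * per_instance_limit * instances


if __name__ == "__main__":
    # 0.5 * 600 * 1 = 300 Megacycles/sec for a single instance.
    print(hypothetical_scale_up_threshold(0.5, 1))
```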
Indeed, many factors affect scaling on App Engine. There are three types of scaling that you can configure for your application, and the type you choose determines how it is scaled: automatic scaling, basic scaling and manual scaling.
I recommend taking a look at the documentation How Instances are Managed, which provides more insight into how scaling occurs on App Engine.
In addition, the following articles explain how to configure the factors that control scaling, such as when the application is scaled up, which I believe should help you as well.
Designing for scale on App Engine standard environment
app.yaml Configuration File
Let me know if the information helped you!
I'm hosting my back-end project on Google Cloud (App Engine Flex instance). For now I have only 10 users, but I am being charged $250 per month because I use several cores, which added up to 2400 hours of accumulated instance time. That is insane for only 10 users and not much traffic!
Can I reduce or limit the number of cores used by my back end?
As you can see here, the price for App Engine Flexible is computed per vCPU core hour of usage. In other words, you are billed for the resources your deployment runs, not for the number of users. The user count matters only when enough traffic reaches your App Engine Flexible deployment that more resources are needed to serve it, which increases the price.
Yes, you can reduce the number of cores used by the back end through the resource settings of your app.yaml configuration file. You might also want to check the service scaling settings to control how App Engine Flexible assigns more resources based on your service's demands.
I would like to set up a server for an 'image processing' activity on my website. What is the comparable power in GAE if I use an 'n1-standard-1' instance in GCE? Did I miscalculate, or is the price difference between the two substantial for the same power?
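To put the numbers from the question in perspective, here is a minimal sketch that turns accumulated instance hours into an average number of always-on instances and a blended hourly rate. It assumes a 30-day month and that the whole bill is attributable to those hours; the function names are hypothetical, and no official price is used.

```python
def average_running_instances(instance_hours: float, days_in_month: int = 30) -> float:
    """Average number of instances (or cores) running around the clock.

    Assumption: a 30-day month unless stated otherwise.
    """
    return instance_hours / (days_in_month * 24)


def implied_hourly_rate(monthly_bill: float, instance_hours: float) -> float:
    """Blended cost per accumulated hour.

    Assumption: the entire bill comes from instance time; a real bill
    may include other charges.
    """
    return monthly_bill / instance_hours


if __name__ == "__main__":
    hours = 2400.0  # accumulated instance time reported in the question
    bill = 250.0    # monthly charge reported in the question, in dollars
    print(round(average_running_instances(hours), 2), "instances on average")
    print(round(implied_hourly_rate(bill, hours), 4), "dollars per hour")
```

With these figures, 2400 hours corresponds to a little over three instances (or cores) running continuously, which suggests the cost is driven by always-on resources rather than by the 10 users.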
App Engine instances are more expensive than Compute Engine instances on a per hour basis. If you have a constant load, it's cheaper to keep a GCE instance running.
App Engine has the advantage that it can scale down to zero instances after 15 minutes of inactivity. If you have no load during long stretches of time (nights, weekends), or only a few requests that arrive less frequently than about once per hour, an App Engine instance may be the more efficient solution.
Also, take a look at Google Cloud Functions. It is a Node.js runtime that is billed in 100-millisecond increments. There are no instances to start or shut down and no ongoing costs at all: you pay only when you process a request, and you get a free daily quota. The only limitation is that an individual request must complete within 9 minutes.
Performing load tests on my app, I noticed that the Instances dashboard graph shows a pretty big difference between the number of active and billed instances:
What do active and total mean?
Also, after spending the day running load tests, here's what I see:
In the first peak, the number of billed instances pretty much matches the number of total instances. Then, on subsequent loads, the number of billed instances sits in between total and active.
Update 2013-02-21: I did another batch of load tests today, and I'm still seeing variance in where the billed instances stand relative to total and active:
How are these numbers calculated? How should I interpret them, considering that I'm trying to forecast our operational costs based on these numbers?
It seems (I believe) that if you select the F2 instance class in the application settings, each active F2 instance is counted as 2 billed instances. If you select F4 instances, each one is counted as 4 billed instances, and so forth.
The total includes instances that have been started but are not billed, a kind of "gift" from Google. If more requests arrive that need more instances, GAE does not have to start a new instance and can use one of those "non-active" ones instead. When the load rises, GAE starts new instances; when the load drops, GAE keeps the instances around for a while but does not charge you for them. They are eventually shut down if the load does not rise again.
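The F2/F4 rule suggested in this answer can be sketched as a simple multiplier lookup. The mapping reflects the answer's belief rather than confirmed billing behavior, the F1 entry is an assumption added for completeness, and the names are hypothetical.

```python
# Multipliers as described in the answer (F2 -> 2, F4 -> 4).
# F1 -> 1 is an assumption added here for completeness.
BILLING_MULTIPLIER = {"F1": 1, "F2": 2, "F4": 4}


def billed_instances(active_instances: float, instance_class: str) -> float:
    """Billed instance count implied by the answer's rule."""
    try:
        multiplier = BILLING_MULTIPLIER[instance_class]
    except KeyError:
        raise ValueError(f"unknown instance class: {instance_class!r}") from None
    return active_instances * multiplier


if __name__ == "__main__":
    # Five active F2 instances would show up as ten billed instances.
    print(billed_instances(5, "F2"))
```

For cost forecasting under this rule, the billed figure rather than the active or total count would be the one to multiply by the hourly price.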