Google App Engine flexible price - google-app-engine

I have a Rails application currently running on App Engine Flexible.
It cost close to $15 from March 1st to March 10th, which feels somewhat high for infrastructure.
The application receives very little traffic, and I have minimized the resources in app.yaml.
Is there anything I should keep in mind?
Or is this simply what App Engine Flex costs?
These are the settings as confirmed in the GCP console:
runtime: ruby
api_version: '1.0'
env: flexible
threadsafe: true
env_variables:
  RAILS_MASTER_KEY: dd89c19c2ee45246d68b8b3765625ce7
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 2
  cpu_utilization:
    target_utilization: 0.5
resources:
  memory_gb: 0.6

You can use Google Cloud Platform Pricing Calculator for that or calculate it yourself using the prices in App Engine Pricing:
vCPU per core hour $0.0526
Memory per GB hour $0.0071
So in your case:
1 vCPU * 0.0526 $ per vCPU = 0.0526 $ per hour
(0.6 GB + 0.4 GB) * 0.0071 $ per GB = 0.0071 $ per hour
Then 0.0597 $ per hour for 240 hours is 14.328 $ for one instance over 10 days, which is close to what you paid.

Related

How can I reduce App Engine billing cost?

My App Engine app.yaml file looks something like this:
service: servicename
runtime: php74
automatic_scaling:
  min_idle_instances: 2
  max_pending_latency: 1s
env_variables:
  CLOUD_SQL_CONNECTION_NAME: <MY-PROJECT>:<INSTANCE-REGION>:<MY-DATABASE>
  DB_USER: my-db-user
  DB_PASS: my-db-pass
  DB_NAME: my-db
---
Does automatic scaling cause higher cost? What is the cheapest configuration I can set? Auto scaling is not mandatory at the current stage of my application.
I think your cheapest configuration is to set max_instances: 1 and comment out the other scaling options.
When you have traffic, you will have at most 1 instance. When there is no traffic, your instance shuts down (effectively 0 instances).
The downside of this approach (not having min_idle_instances as you currently do) is that the first requests after an idle period will be slow, because an instance has to start before it can serve them.
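A minimal sketch of what that suggestion could look like, based on the app.yaml from the question. The service name and runtime are copied from the question, and the env_variables block is assumed to stay unchanged, so it is left out here:

```yaml
service: servicename
runtime: php74

automatic_scaling:
  # Cap the service at a single instance. With no min_idle_instances set,
  # the service can scale down to zero instances when there is no traffic.
  max_instances: 1
  # Previous settings, commented out as the answer suggests:
  # min_idle_instances: 2
  # max_pending_latency: 1s
```

The trade-off is the cold start described above: the first request after an idle period waits for an instance to boot.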

Why are idle instances not being shut down when there is no traffic?

Some weeks ago my app on App Engine started to increase the number of idle instances to an unreasonably high amount, even when there is close to zero traffic. This of course impacts my bill, which is skyrocketing.
My app is a simple Node.js application serving a GraphQL API that connects to my Cloud SQL database.
Why are all these idle instances being started?
My app.yaml:
runtime: nodejs12
service: default
handlers:
- url: /.*
  script: auto
  secure: always
  redirect_http_response_code: 301
automatic_scaling:
  max_idle_instances: 1
Screenshot of monitoring: [image not included]
This is very strange behavior. According to the documentation, the number of idle instances should only temporarily exceed max_idle_instances:
Note: When settling back to normal levels after a load spike, the
number of idle instances can temporarily exceed your specified
maximum. However, you will not be charged for more instances than the
maximum number you've specified.
Some possible solutions:
1. Confirm that the configuration shown in the App Engine console matches your actual app.yaml.
2. Set min_idle_instances to 1 and max_idle_instances to 2 (temporarily) and redeploy the application. It could be that something is simply wrong on the scaling side, and redeploying the application could solve this.
3. Check your logs (filter on App Engine) for any problem in shutting down the idle instances.
4. Finally, you could tweak settings like max_pending_latency. I have seen people build applications that take 2-3 seconds to start up, while the default is 30ms before another instance is spun up.
   This post suggests setting the following, which you could try:
   instance_class: F1
   automatic_scaling:
     max_idle_instances: 1 # default value
     min_pending_latency: automatic # default value
     max_pending_latency: 30ms
5. Switch to basic_scaling and let Google determine the best scaling algorithm (last resort option). This would look something like this:
   basic_scaling:
     max_instances: 5
     idle_timeout: 15m
The solution could of course also be a combination of 2 and 4.
Update after 24 hours:
I followed @Nebulastic's suggestions, number 2 and 4, but it did not make any difference. So in frustration I disabled the entire Google App Engine (App Engine > Settings > Disable application), left it off for 10 minutes, and confirmed in the monitoring dashboard that everything was dead (sorry, users!).
After 10 minutes I enabled App Engine again and it booted only 1 instance. I've been monitoring it closely since, and it seems (finally) to be good now. After the restart it also adheres to the "min" and "max" idle instances configuration, the suggestion from @Nebulastic. Thanks!
Screenshots: [images not included]
Have you checked to make sure you don't have a bunch of old versions still running? https://console.cloud.google.com/appengine/versions
Check each service in the services dropdown.

Rolling restarts are causing our App Engine app to go offline. Is there a way to change the config to prevent that from happening?

About once a week our flexible app engine node app goes offline and the following line appears in the logs: Restarting batch of VMs for version 20181008t134234 as part of rolling restart. We have our app set to automatic scaling with the following settings:
runtime: nodejs
env: flex
beta_settings:
  cloud_sql_instances: tuzag-v2:us-east4:tuzag-db
automatic_scaling:
  min_num_instances: 1
  max_num_instances: 3
liveness_check:
  path: "/"
  check_interval_sec: 30
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
readiness_check:
  path: "/"
  check_interval_sec: 15
  timeout_sec: 4
  failure_threshold: 2
  success_threshold: 2
  app_start_timeout_sec: 300
resources:
  cpu: 1
  memory_gb: 1
  disk_size_gb: 10
I understand the rolling restarts of GCP/GAE, but am confused as to why Google isn't spinning up another VM before taking our primary one offline. Do we have to run with a minimum of 2 instances to prevent this from happening? Is there a way I can configure my app.yaml to make sure another instance is spun up before it reboots the only running instance? After the reboot finishes, everything comes back online fine, but there's still 10 minutes of downtime, which isn't acceptable, especially considering we can't control when it reboots.
We know that it is expected behaviour that Flexible instances are restarted on a weekly basis. Provided that health checks are properly configured and are not the issue, the recommendation is, indeed, to set up a minimum of two instances.
There is no alternative functionality in App Engine Flex that I am aware of that raises a new instance to avoid downtime as a result of a weekly restart. You could try running directly on Google Compute Engine instead of App Engine and managing updates and maintenance yourself; perhaps that would suit your purpose better.
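As a rough sketch of the two-instance recommendation, only the automatic_scaling block from the question's app.yaml would need to change. The max_num_instances value is kept from the question; the rest of the file (health checks, resources, beta_settings) is assumed to stay as it is:

```yaml
runtime: nodejs
env: flex

automatic_scaling:
  # Assumption following the answer above: with at least two instances,
  # one can keep serving while the other is being restarted.
  min_num_instances: 2
  max_num_instances: 3
```

Note that this roughly doubles the baseline cost, since two VMs are running at all times.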
Are you just guessing this based on that num instances graph in the app engine dashboard? Or is your app engine project actually unresponsive during that time?
You could use cron to hit it every 5 minutes to see if it's responsive.
Does this issue persist if you change cool_down_period_sec & target_utilization back to their defaults?
If your service is truly down during that time, maybe you should implement a request handler for liveness checks:
https://cloud.google.com/appengine/docs/flexible/python/reference/app-yaml#updated_health_checks
With their default polling config, GAE would launch a new instance within a couple of minutes.
Another thing worth double checking is how long it takes your instance to start up.

AppEngine NodeJS flexible spawns 2 instances after deployment

I have a pretty basic app.yaml file with the following:
runtime: nodejs
env: flex
service: front
Every time I deploy the application, the deployment takes a very long time at this step:
Updating service [front] (this may take several minutes)...
When I check in the console, I can see that it goes up from 1 instance to 2 even though I didn't specify anything about the number of instances. Why is Google doing this? And how can we set the starting number of instances without disabling the autoscaling feature? Thanks in advance!
On App Engine Flexible applications, the minimum number of instances given to your service defaults to 2 to reduce latency. This is documented here.
You can configure these settings differently by adding them in your app.yaml file like this:
runtime: nodejs
env: flex
service: front
automatic_scaling:
  min_num_instances: 1 # Default is 2. Must be 1 or greater
  max_num_instances: 10 # Default is 20.

App Engine: Using f1-micro instance but getting billed for g1-small instance

We have a flexible environment (node.js) running with one f1-micro (1 vCPU, 0.6 GB memory) instance. When I look at the Billing History, I can see that we get billed for "Compute Engine Small instance with 1 VCPU" with the price of a g1-small instance.
We're still in the 60 days free trial period, so we are still using the credit.
But I'm wondering why we get billed for the g1-small instance if we are using f1-micro?
To answer my own question: We actually got billed for "Compute Engine Micro instance with burstable CPU" (f1-micro) correctly by the end of month. It seems that there is a delay between what is really consumed and what is shown under "Costs in this month".
app.yaml:
runtime: nodejs
env: flex
manual_scaling:
  instances: 1
resources:
  cpu: .5
  memory_gb: 0.90
  disk_size_gb: 10
skip_files:
- ^(.*/)?.*/node_modules/.*$
env_variables:
  ...

Resources