I am currently developing an App Engine application in Python, and while optimizing my app for cost and performance I am surprised by the instance hours I am charged.
Right now I am testing one specific task queue (nothing else is running during the test; before I start, no instance is up).
The queue is configured with a rate of 100/s and a bucket size of 100.
There is no configured limit for max_concurrent_requests.
900 tasks get pushed into this queue.
10-11 instances spin up at that moment to deal with them.
Everything takes far less than 30 seconds and every task is executed.
I check my instance hours quota before and after the test, and I consume about 0.25-0.40 instance hours.
Why is that?
Shouldn't it be much less? Is there an initial cost, or a minimum amount that is charged whenever an instance starts up?
When an instance is started, you are billed for a minimum of 15 minutes. Your 10-11 instances should therefore cost you a total of around 2.5 hours.
If you don't need such fast processing, you should limit how much of the queue is processed in parallel using max_concurrent_requests, for example:
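A minimal queue.yaml sketch (the queue name is an assumption for illustration):

    queue:
    - name: my-task-queue
      rate: 100/s
      bucket_size: 100
      # Cap how many tasks may run simultaneously, so fewer
      # instances are spun up to absorb a burst of tasks.
      max_concurrent_requests: 10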
I am pretty sure that the Scheduler will increase the instance count when there is a backlog of tasks on a high-rate queue. 100/100 is a very high rate; you are telling the Scheduler to execute these tasks very quickly, which means it fires up instances to do so.
Unless you need to process these tasks very quickly, you should use a much lower rate. This will result in fewer instances and a longer queue of tasks. Depending on your processing requirements, you might be able to use a pull queue. Doing so allows you to lease and process hundreds of tasks at a time and take advantage of batch put()s etc. It really depends on what you are doing.
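For illustration, a hedged sketch of the pull-queue approach (the queue name and the process() handler are assumptions, and the queue would need mode: pull in queue.yaml):

    from google.appengine.api import taskqueue

    # Lease up to 100 tasks at once, holding the lease for 60 seconds.
    queue = taskqueue.Queue('my-pull-queue')
    tasks = queue.lease_tasks(lease_seconds=60, max_tasks=100)

    # Process the whole batch (e.g. with batch put()s), then delete
    # the leased tasks so they are not re-leased after the timeout.
    for task in tasks:
        process(task.payload)  # process() is a hypothetical handler
    queue.delete_tasks(tasks)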
Related
I use Google Cloud Tasks with AppEngine to process tasks, but the tasks wait about 2-3 minutes in the queue before being sent to my App Engine endpoint.
There is no "delay" set on the tasks, and I expect them to be sent right away.
So the question is: Is Cloud Tasks slow?
As you can see in the following screenshot, Cloud Tasks shows an ETA of about 3 minutes:
The official word from Google is that this is the best you can expect from their task queues.
In my experience, how you configure tasks seems to influence how quickly they get executed.
It seems that:
If you don't change the default behavior of your task queues (e.g., maximum concurrent requests, etc.) and you don't specify an execution time for a task (e.g., an eta), then your tasks will execute very soon after submission.
If you mess with either of these two things, then Google takes longer to execute your tasks. My guess is that this is the extra overhead of controlling task rate and execution.
I see from your screenshot that you have a task with an ETA of 2 min 49 sec, which is the time until your task will be run. You have high bucket size and concurrency numbers, so I think your issue has more to do with the parameters you are using when queueing your tasks, especially the schedule_time attribute. Check your code to see whether you are adding a delay to your tasks, and if so, tune it down.
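As a hedged illustration with the google-cloud-tasks Python client (project, region, queue, and handler URI are placeholders), leaving out schedule_time keeps the task eligible to run immediately:

    import datetime
    from google.cloud import tasks_v2
    from google.protobuf import timestamp_pb2

    client = tasks_v2.CloudTasksClient()
    parent = client.queue_path('my-project', 'us-central1', 'my-queue')

    task = {
        'app_engine_http_request': {
            'http_method': tasks_v2.HttpMethod.POST,
            'relative_uri': '/worker',
        }
    }

    # Omitting schedule_time means "run as soon as possible".
    # Setting it, even accidentally, delays execution:
    # ts = timestamp_pb2.Timestamp()
    # ts.FromDatetime(datetime.datetime.utcnow() + datetime.timedelta(minutes=3))
    # task['schedule_time'] = ts

    client.create_task(request={'parent': parent, 'task': task})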
Just adding here that, as of February 2023, I can queue tasks and then consume them VERY fast using the Python 3.7 libraries.
It takes me about 13.5 seconds to queue up 1000 tasks.
It takes about 1 minute to process those 1000 tasks using a Cloud Run deployed Python/Flask app (no other processing done, just receive and reply with 200).
So, super fast!
BTW, Pub/Sub was much slower in my tests... about 40 ms to queue each message.
Situation:
My project consists mostly of automated tasks.
My GAE (standard environment) app has 40 cron jobs like this, all running on the default module (frontend):
    cron:
    - description: My cron job Nth  # note: n is the nth cron job
      url: /mycronjob_n/
      schedule: every 1 minutes
Each of the cron jobs looks like this:
    from google.appengine.ext import deferred
    from google.appengine.api.taskqueue import TaskRetryOptions

    @app.route('/mycronjob_n/')
    def mycronjob_n():
        for i in range(100):
            pram = prams[i]
            options = TaskRetryOptions(task_retry_limit=0, task_age_limit=0)
            deferred.defer(mytask, pram, _retry_options=options)
Where mytask is:

    def mytask(pram):
        # Do some loops, read and write the datastore, call APIs;
        # I guess this takes less than 30 seconds.
        return 'Task finished'
Problem:
As the title of the question says, I am running out of RAM. Frontend instance hours are increasing to 100 hours.
My possibly wrong assumptions:
A deferred task runs in the background because it is not a request a user sends when visiting the website. Therefore, I assumed it would not be counted as a request.
I broke each mycronjob_n into small separate tasks because I thought this would reduce the running time of each mycronjob_n and therefore REDUCE the instances' RAM consumption.
My questions (purpose: keep the frontend/backend instance hours as low as possible; I can accept latency):
Is a deferred task counted as a request?
How many requests do I have in 1 minute?
40 requests of mycronjob_n
or
40 requests of mycronjob_n x 100 mytask = 4000?
If 3-4 instances cannot handle 4000 requests, why doesn't GAE add 10 to 20 more F1 instances and then shut them down when idle? I set autoscaling in app.yaml, but I don't see GAE's autoscaling working here as advertised.
What is the best way to optimize my app?
If a deferred task is counted as a request, it is meaningless to split mycronjob_n into small separate tasks, right? I mean, my current method is the same as:
    @app.route('/mycronjob_n/')
    def mycronjob_n():
        for i in range(100):
            pram = prams[i]
            mytask(pram)  # call mytask synchronously
Here, will my app have 40 requests per minute, each request running for 100 x 30 s = 3000 s? So will this approach also run out of memory?
Should I create a backend service running on a B1 instance and put all the cron jobs on that backend service? I heard that a request there can run for up to 24 hours.
If I change the default service's instance class from F1 to F2 or F3, will I still get 28 free hours? I heard the free tier applies to F1 only. And will my backend service get 9 free hours if it runs on B2 instead of B1?
My regret:
- I quite regret choosing GAE for this project. I chose it because it has a free tier, but I have realized that the free tier is just for hobby/testing purposes. If I run a real app, the cost increases so fast that it makes me think GAE is expensive. Datastore reads/writes are very expensive, even though I tried my best to optimize them. The frontend hours are also always high. I am paying 40 USD per month for GAE. With 40 USD per month, maybe I could get a better server on Heroku or DigitalOcean? Do you think so?
Yes, task queue requests (deferred ones included) are also requests; they are just allowed to run longer than user requests. And they need instances to serve them, which counts as instance hours. Since you have at least one cron job running every minute, you won't have any 15-minute idle interval allowing your instances to shut down, so you'll need at least one instance running at all times. If you use any instance class other than F1/B1, you'll exceed the free instance-hours quota. See Standard environment instances billing.
You seem to be under the impression that the number of requests is what's driving your costs up. It's not, at least not directly. The culprit is most likely the number of instances running.
If 3-4 instances cannot handle 4000 requests, why doesn't GAE add 10 to 20 more F1 instances and then shut them down when idle?
Most likely GAE does exactly that - spawns several instances. But you keep pumping in requests every minute, so the instances never stay idle long enough to shut down, which drives your instance hours up.
There are 2 things you can do about it:
stagger your deferred tasks so they don't all need to be handled at the same time; fewer instances (maybe even a single one?) may then be necessary to handle them - see the sketch after this list, and see Combine cron jobs to reduce number of instances and Preventing Google App Engine Cron jobs from creating multiple instances (and thus burning through all my instance hours)
tune your app's scaling configuration (the range is limited though). See Scaling elements.
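A minimal sketch of the staggering idea using deferred.defer's _countdown argument, reusing the handler and prams list from the question:

    from google.appengine.ext import deferred

    @app.route('/mycronjob_n/')
    def mycronjob_n():
        for i in range(100):
            # Spread the 100 tasks over ~100 seconds instead of
            # making them all runnable at the same instant.
            deferred.defer(mytask, prams[i], _countdown=i)
        return 'OK'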
You should also carefully read How Instances are Managed.
Yes, you only pay for what exceeds the free quota, regardless of the instance class. Billing is in F1/B1 units anyway - from the above billing link:
Important: When you are billed for instance hours, you will not see any instance classes in your billing line items. Instead, you will see the appropriate multiple of instance hours. For example, if you use an F4 instance for one hour, you do not see "F4" listed, but you see billing for four instance hours at the F1 rate.
About the RAM usage: splitting the cron job into multiple tasks doesn't necessarily help, see App Engine Deferred: Tracking Down Memory Leaks.
Finally, cost-comparing GAE with Heroku or DigitalOcean isn't an apples-to-apples comparison: GAE is a PaaS, not an IaaS, so it's IMHO expected to be more expensive. Choosing one or the other is really up to you.
I'd like to make a Google App Engine app that sends a Facebook message to a user a fixed time (e.g. one day) after they click a button in the app. It's not scalable to use cron or the task queue for potentially millions of tiny jobs. I've also considered implementing my own queue using a background thread, but that's only available using the Backends API as far as I know, which is designed for much larger usage and is not free.
Is there a scalable way for a free Google App Engine app to execute a large number of small tasks after a fixed period of time?
For starters, if you're looking to do millions of tiny jobs, you're going to blow past the free quota very quickly, any way you look at it. The free quota's meant for testing.
It depends on the granularity of your tasks. If you're executing a lot of tasks once per day, cron hooked up to a mapreduce operation (which essentially sends out a bunch of tasks on task queues) works fine. You'll basically issue a datastore query to find the tasks that need to be run, and send them out on the mapreduce.
If you execute this task thousands of times a day (every minute), it may start getting expensive because you're issuing many queries. Note that if most of those queries return nothing, the cost is still minimal.
The other option is to store your tasks in memory rather than in the datastore, that's where you'd want to start using backends. But backends are expensive to maintain. Look into using Google Compute Engine, which gives much cheaper VMs.
EDIT:
If you go the cron/datastore route, you'd store a new entity whenever a user wants to send a deferred message. Most importantly, it'd have a queryable timestamp for when the message should be sent, probably rounded to the nearest minute or the nearest 5 minutes, whatever you decide your granularity should be.
You would then have a cron job that runs at the set interval, say every minute. On each run it would build a query for all the messages it needs to send for the given minute.
If you really do have hundreds of thousands of messages to send each minute, you're not going to want to do it from the cron task. You'd want the cron task to spawn a mapreduce job that will fan out the query and spawn tasks to send your messages.
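A minimal sketch of that cron/datastore pattern, with an ndb model and function names invented for illustration:

    import datetime
    from google.appengine.ext import ndb

    class DeferredMessage(ndb.Model):
        user_id = ndb.StringProperty()
        payload = ndb.TextProperty()
        # Rounded to the chosen granularity (e.g. the nearest minute)
        # so the cron query can pick up everything that is due.
        send_at = ndb.DateTimeProperty()

    def enqueue_message(user_id, payload, delay):
        send_at = datetime.datetime.utcnow() + delay
        DeferredMessage(user_id=user_id, payload=payload,
                        send_at=send_at.replace(second=0, microsecond=0)).put()

    # Cron handler, wired to a "schedule: every 1 minutes" entry in cron.yaml.
    def send_due_messages():
        now = datetime.datetime.utcnow()
        due = DeferredMessage.query(DeferredMessage.send_at <= now).fetch(1000)
        for msg in due:
            send_facebook_message(msg.user_id, msg.payload)  # hypothetical sender
        ndb.delete_multi([m.key for m in due])

At very high volumes this handler would fan the query out to a mapreduce or task queue instead of sending directly, as described above.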
App Engine kills my JVM instance really quickly. It does not live longer than 30 seconds when idle. A subsequent request will create a new instance, but this round trip takes 8-10 seconds. Is there a potential problem (bug) in my application? There is no record in the logs/admin logs that would indicate any problem or a reason for the shutdown. The development server works normally. Is there any way to find out why the instance is shut down so quickly?
App Engine, especially on a free account, can kill an idle instance at any time. An idle instance may live a couple of seconds or a couple of minutes; no one knows how long. Should you need a resident instance (to prevent long instance start-up), switch to the paid tier. Even if it is paid, you can still run it for free if you keep your instance hours within the free quota. This means that a single F1 instance running all day consumes 24 instance hours per day. As per today's quotas, you have 28 free instance hours per day, so you have 4 instance hours spare for app redeployments (every redeployment costs 1/4 of an instance hour, i.e. 15 minutes). An F2 instance will consume the quota 2x faster, i.e. it will take 14 hours to consume your 28-hour free quota. The next 10 hours will be billed to your credit card as per the price list.
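A quick back-of-the-envelope check of that math (the numbers are the quota values quoted above):

    FREE_HOURS = 28                  # free F1-equivalent instance hours per day

    # One resident F1 instance running a full day:
    print(FREE_HOURS - 1 * 24)       # 4 spare hours for redeployments

    # An F2 instance burns the quota at twice the F1 rate:
    print(FREE_HOURS / 2.0)          # quota exhausted after 14 wall-clock hours
    print(24 - FREE_HOURS / 2.0)     # the remaining 10 hours of the day are billed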
Today AppEngine went down for a while:
http://code.google.com/status/appengine/detail/serving/2012/10/26#ae-trust-detail-helloworld-get-latency
The result was that all requests were kept pending, some for as long as 24 minutes. Here is an excerpt from my server log; these requests are generally handled in less than 200 ms.
https://www.evernote.com/shard/s8/sh/ad3b58bf-9338-4cf7-aa35-a255d96aebbc/4b90815ba1c8cd2080b157a54d714ae0
My quota ($8 per day) was blown through in a matter of minutes, when it had previously sat at around $2 per day.
How can I prevent pending_ms from eating all my quota, even though my actual requests still respond very fast? I had changed the pending latency from 300 ms to Automatic. Does limiting the maximum to 10 seconds prevent this type of outbreak?
blackjack75,
You're right, raising the pending latency to something like 10 seconds will help reduce the number of instances started.
It looks like the long running requests tied up your instances. When this happens, app engine spins up new instances to handle the new requests, and of course instances cost money.
Lowering your min and max idle instances to smaller numbers should also help.
On your dashboard, you can look at your instance graph to see how long the burst of instances was left idle after the request load finished.
You can look at your typical usage to help estimate a safe max.
Lowering them can cause slowness when legitimate traffic needs to spin up a new instance, especially with bursty traffic, so you would want to adjust this to match your budget. For comparison, on a non-production appspot, having the min and max set to 1 works fine.
Besides that, general techniques for reducing App Engine resource usage will help. It sounds like you've gone through that already, since your typical request time is low. Enabling concurrent requests could help here if your code handles threads correctly (no globals, etc.) and your instances have enough free memory to handle multiple requests.
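For illustration, a hedged app.yaml sketch of the knobs discussed above (the values are placeholders to tune against your budget):

    # Python 2.7 standard environment app.yaml excerpt (values are examples).
    threadsafe: true      # required for concurrent requests

    automatic_scaling:
      # Let a request wait up to 10s before a new instance is started.
      max_pending_latency: 10s
      # Keep at most one idle instance around after a burst.
      min_idle_instances: 1
      max_idle_instances: 1
      # Allow one instance to serve several requests at once
      # (only safe if the code is thread-safe: no mutable globals, etc.).
      max_concurrent_requests: 8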