Say, I have two web apps:
The first one just waits for 10 seconds and exits (like, time.sleep(10)).
The second one is checking the time in the loop, working extensively and when it sees that 10 second have passed it quits.
Will both my apps be billed for the same amount of CPU time or the second will be much more expensive?
In other words - does "CPU time" in GAE means the actual amount of work of the instance during request or it represents the total time of instance being in memory from launch to exit?
Note that App Engine is moving away from CPU-hour billing towards instance-hour billing. 10 seconds of sleep and 10 seconds of activity incur the same cost in instance hours.
If your second app is "working extensively", you're probably using APIs and consuming additional forms of quota, making the second request more expensive.
If you're using the 2.7 runtime, you can take advantage of threading. time.sleep releases the GIL, so your instance can serve other threads while your first thread is sleeping.
Related
Situation:
My project are mostly automated tasks.
My GAE (standard environment) app has 40 crons job like this, all run on default module (frontend):
- description: My cron job Nth
url: /mycronjob_n/ ###### Please note n is the nth cron job.
schedule: every 1 minutes
Each of cron jobs
#app.route('/mycronjob_n/')
def mycronjob_n():
for i in (0,100):
pram = prams[i]
options = TaskRetryOptions(task_retry_limit=0,task_age_limit=0)
deferred.defer(mytask,pram)
Where mytask is
def mytask(pram):
#Do some loops, read and write datastore, call api, which I guesss taking less than 30 seconds.
return 'Task finish'
Problem:
As title of the question, i am running out of RAM. Frontend instance hours are increasing to 100 hours.
My wrong thought?
defer task runs on background because it is not something that user sends request when visit the website. Therefore, they will not be considered as a request.
I break my cronjobs_n into small different tasks because i think it can help to reduce the running time each cronjobs_n so that REDUCE instance's ram consumption.
My question: (purpose: keep the frontend/backend instance hours as low as possible, and I accept latency)
Is defer task counted as request?
How many request do I have in 1 mintues?
40 request of mycronjob_n
or
40 requests of mycronjob_n x 100 mytask = 4000
If 3-4 instances can not handle 4000 requests, why doesnt GAE add 10 to 20 F1 instances more and then shut down if idle? I set autoscale in app.yaml. I dont see the meaning of autoscale of GAE here as advertised.
What is the best way to optimize my app?
If defer task is counted as request, it is meaningless to slit mycronjob_n into different small tasks, right? I mean, my current method is as same as:
#app.route('/mycronjob_n/')
def mycronjob_n():
for i in (0,100):
pram = prams[i]
options = TaskRetryOptions(task_retry_limit=0,task_age_limit=0)
mytask(pram) #Call function mytask
Here, will my app has 40 requests per minute, each request runs for 100 x 30s = 3000s? So will this approach also return out of memory?
Should I create a backend service running on F1 instance and put all cron jobs on that backend service? I heard that a request can run for 24 hours.
If I change default service instance from F1 to F2,F3, will I still get 28 hours free? I heard free tier apply to F1 only. And will my backend service get 9 hours free if it runs on B2 instead of B1?
My regret:
- I am quite regret that I choose GAE for this project. I choosed it because it has free tier. But I realized that free tier is just for hobby/testing purpose. If I run a real app, the cost will increase very fast that it make me think GAE is expensive. The datastore reading/writing are so expensive even though I tried my best to optimize them. The frontend hours are also always high. I am paying 40 usd per month for GAE. With 40 usd per month, maybe I can get better server if I choose Heroku, Digital Ocean? Do you think so?
Yes, task queue requests (deferred included) are also requests, they just can run longer than user requests. And they need instances to serve them, which count as instance hours. Since you have at least one cron job running every minute - you won't have any 15 minute idle interval allowing your instances to shut down - so you'll need at least one instance running at all times. If you use any instance class other than F1/B1 - you'll exceed the free instance hours quota. See Standard environment instances billing.
You seem to be under the impression that the number of requests is what's driving your costs up. It's not, at least not directly. The culprit is most likely the number of instances running.
If 3-4 instances can not handle 4000 requests, why doesnt GAE add 10
to 20 F1 instances more and then shut down if idle?
Most likely GAE does exactly that - spawns several instances. But you keep pumping requests every minute, they don't reach an idle state long enough, so they don't shut down. Which drives your instance hours up.
There are 2 things you can do about it:
stagger your deferred tasks so they don't hit need to be handled at the same time. Fewer instance (maybe even a single one?) may be necessary to handle them in such case. See Combine cron jobs to reduce number of instances and Preventing Google App Engine Cron jobs from creating multiple instances (and thus burning through all my instance hours)
tune your app's scaling configuration (the range is limited though). See Scaling elements.
You should also carefully read How Instances are Managed.
Yes, you only pay for exceeds the free quota, regardless of the instance class. Billing is in F1/B1 units anyways - from the above billing link:
Important: When you are billed for instance hours, you will not see any instance classes in your billing line items. Instead, you will
see the appropriate multiple of instance hours. For example, if you
use an F4 instance for one hour, you do not see "F4" listed, but you
see billing for four instance hours at the F1 rate.
About the RAM usage, splitting the cron job in multiple tasks isn't necessarily helping, see App Engine Deferred: Tracking Down Memory Leaks
Finally, cost comparing GAE with Heroku, Digital Ocean isn't an apples-to-apples comparison: GAE is PaaS, not IaaS, it's IMHO expected to be more expensive. Choosing one or the other is really up to you.
We have an App Engine app that handles an average .5 requests per second, and seemingly all those requests can be handled by the same instance running a Go app as the main version.
However, sometimes App Engine kicks off a second instance (and sometimes even a third one), that doesn't seem to do anything past handling one or two requests. Here's an example.
Shutting down that instance manually doesn't seem to cause any harm, so my question is, why does App Engine not kill the instance after it did not get any requests for a while? (The above example had four requests in the past hour, often the requests/age ratio gets even lower).
Update:
A similar situation is when an instance is started on a different version. App Engine only seems to kill the instance after hours of not getting any requests.
Under Application Settings → Performance,
Idle Instances is set to Automatic – 20
Pending Latency is set to 150ms – 250ms
I wish I knew what controls if/when it kills idle instances, but I can't see any documentation of it.
To avoid excess instances starting, I think the main thing you can do here is increase the pending latency:
The Pending Latency slider controls how long requests spend in the pending queue before being served by an Instance of the default version of your application. If the minimum pending latency is high App Engine will allow requests to wait rather than start new Instances to process them. This can reduce the number of instance hours your application uses, but can result in more user-visible latency.
Even if you only average 4 requests/hour, if you happen to get two closely spaced I suppose it's possible it would start a new instance.
You can also see some small amount of information in the logs about why it started a new instance.
The "How Applications Scale" section of the Google App Engine documentation states:
Scaling in Instances
Each instance has its own queue for incoming requests. App Engine monitors the number of requests waiting in each instance's queue. If App Engine detects that queues for an application are getting too long due to increased load, it automatically creates a new instance of the application to handle that load.
App Engine also scales instances in reverse when request volumes decrease. This scaling helps ensure that all of your application's current instances are being used to optimal efficiency and cost effectiveness.
It also states you can "specify a minimum number of idle instances", and to "optimize for high performance or low cost" in the administration console.
Try setting the "Idle instances" field to something like 3 - 5, and "optimize for low cost" and see if that affects the instance kill time.
I develop an appengine application right now in python and I am surprised by the instance hours quotas I get, while I try to optimize my app for costs and performance.
I am testing right now one specific task_queue. (nothing else is running during that - before I start no instance is up)
the queue is configured with a rate/s of 100 with 100 buckets.
no configured limit for max_concurrent_requests
900 tasks will get pushed in this queue.
10-11 instances pop-up in this moment to deal with it.
everything takes far less than 30 seconds and every task is executed.
I check my instance hours quotas before and after that and I consume about 0.25 - 0.40 instance hours.
why is that?
shouldn't it be much less? is there an inital cost or a minimum amount which will be charged if one instance opens?
When an instance is opened it will cost you at least 15 minutes. Your 10-11 instances should cost you a total of around 2.5 hours.
If you don't need such a fast processing you should limit the amount of parallel processing of the queue using max_concurrent_requests.
I am pretty sure that the Scheduler will increase instance count when there is a backlog of tasks on a high-rate queue. 100/100 is a very high rate. You are telling the Scheduler to do these very quickly which means it fires up instances to do so.
Unless you need to process these tasks very quickly, you should use a much lower rate. This will result in fewer instances, and a longer queue of tasks. Depending on your processing requirements, you might be able to use a pull queue. Doing so allows you to lease and process hundreds of tasks at a time and take advantage of batch put()s etc. Really depends on what you are doing.
AppEngine kills my JVM instance really quickly. It does not live longer than 30 seconds idle. Subsequent request will create new instance but this roundtrip takes 8-10 seconds. Is there potential problem (bug) in my application ? There is no record in logs/admin logs which would indicate any problem or reason of shutdown. Development server works normaly. Is there any chance to find out why the instance is shutdown so quickly ?
AppEngine especially on free account can kill the instance anytime. Instance can live couple of seconds if idle or couple of minutes, noone knows how long. Should you need resident instance (to prevent long instance start-up), switch to paid version. Even if it is paid, you can still run it for free if you keep your instance hours within free quota. This means if you have single F1 instance, it will consume 24instance-hours per day. As per today quotas, you have free 28instance-hours per day so you have 4 instance-hours spare for app redeployments (every redeployment costs 1/4 of instance-hour or 15 minutes). Having F2 instance will consume quota 2x faster e.g. it will take 14 hours to consume your 28hour free quota. Next 10 hours will be billed to your credit card as per price list.
I'm currently in the process of rewriting my java code to run it on Google App Engine. Since I cannot use Timer for programming timeouts (no thread creation allowed), I need to rely on system clock to mark the time of the timeout start so that I could compare it later in order to find out if the timeout has occurred.
Now, several people (even on Google payroll) have advised developers not to rely on system time due to the distributed nature of Google app servers and not being able to keep their clocks in sync. Some say the deviance of system clocks can be up to 10s or even more.
1s deviance would be very good for my app, 2 seconds can be tolerable, anything higher than that would cause a lot of grief for me and my app users, but 10 second difference would turn my app effectively unusable.
I don't know if anything has changed for the better since then (I hope yes), but if not, then what are my options other than shooting up a new separate request so that its handler would sleep the duration of the timeout (which cannot exceed 30 seconds due to request timeout limitation) in order to keep the timeout duration consistent.
Thanks!
More Specifically:
I'm trying to develop a poker game server, but for those who are not familiar how online poker works: I have a set of players attached to 1 game instance. Evey player has a certain amount of time to act before the timeout will occur so the next player can act. There is a countdown on each actor and every client has to see it. Only one player can act at a time. The timeout durations I need are 10s and 20s for now.
You should never be making your request handlers sleep or wait. App Engine will only automatically scale your app if request handlers complete in an average of 1000ms or less; deliberately waiting will ruin that. There's invariably a better option than sleeping/waiting - let us know what you're doing, and perhaps we can suggest one.