App engine. Forced to single instance - google-app-engine

I make server for simple realtime multiplayer game, on google app engine, Python SDK.
Requests very simple, and process in 1ms maximum.
I hold all game data in static variables of instane.
I set 'Min Pending Latency' to 15 sec. for prevent spawn of second instance.
But some time second instance has created any way.
How i can disable, or kill second instance, if it has ben spawn, And process all requests in single instance only?

If you're fighting against the system, that's an indication that you're doing something wrong.
You should not try to manage all your requests inside a single instance. That defeats the whole purpose of using GAE. The problem of course is that you shouldn't be storing your data as static variables inside your instance. Even apart from the issues with other instances being started, every instance is stopped and restarted every so often: so your data would be lost.
You should keep your data in the places meant for that: in memcache, and in the datastore.

beside only 8 instance-hours is free,
you can do this with Module + Manual Scaling
https://developers.google.com/appengine/docs/java/modules/
https://developers.google.com/appengine/docs/python/modules/

Related

What is the difference between min-instances and min-idle-instances in google app engine?

I want to understand the difference between min-instances & min-idle-instances?
I saw documentation on https://cloud.google.com/appengine/docs/standard/java/config/appref#scaling_elements but I am not able to differentiate between the two.
My use case:
I want at least 1 instance always up, as otherwise in most of the cases GAE would take time in creating instance causing my requests to time out (in case of basic scaling).
It should stay up, no matter if there is traffic or not, and if a request comes it should immediately serve it. If request volume grows then it should scale.
Which one I should use?
The min-idle-instances make reference to the instances that are ready to support your application in case you receive high traffic or CPU intensive tasks, unlike the min_instances which are the instances used to process the incoming request immediately. I suggest you to take a look on this link to have a deeper explanation of idle instances.
Based on this, since your use-case is focused on serve the incoming requests immediately, I think you should rather go with the min_instances functionality and use the min-idle-instances only in case you want to be ready for sudden load spikes.
The min-instances configuration applies to dynamic instances while min-idle-instances applies to idle/resident instances.
See also:
Introduction to instances for a description of the 2 instance types
Why do more requests go to new (dynamic) instances than to resident instance? for a bit more details
min_instances: the minimum number of instances running at any time, traffic or no traffic, rain or shine.
min_idle_instances: the minimum of idle (or "unused") instances running over the currently used instances. Example: you automatically scaled to 5 app engine instances that are receiving requests, by setting min_idle_instances to 2, you will be running 7 instances in total, the 2 "extra" instances are idle and waiting in case you receive more load. The goal is that when load raises, your users don't have to wait the load time it takes to start up an instance.
IMPORTANT: you need to configure warmup requests for that to work
IMPORTANT2: you'll be billed for any instance running, idle or not. App engine is not cheap so be careful.
min_instances applies to the number of instances that you want to have running, from 0 (useful if you want to scale down when you don't receive traffic) to 1000. You are charged for the number of instances you have running, so, this is important to save costs.
For your case set this value to 1, as it's the most straightforward option.

Running a cronjob on each instance?

I would like to try out dropwizard-metrics + graphite.
For order to this to work out i need to run a job regular (e.g. each 5:th second) that sends metrics from the instance to the graphite server.
Is this even possible?
The java documentation and python documentation for Cron in App Engine says that the minimal interval in App Engine is configurable in 'minutes'. Thus the simple answer would be: No you cannot schedule a job every 5 seconds.
However...
Knowing that tasks queue tasks can run up to 10 Minutes (see deadlines) you could manually schedule a task every (let's say) 5 minutes and handle the 5 second interval yourself in your servlet (or whatever it is called in python).
I'm just saying it is possible to use suche short intervals. You should really avoid a crutch like that. This kind of behaviour will eat through your quota and make your app expensive really fast.
Edit:
Since the title of the question asks for running jobs in specific instances:
As Dmitry pointed out and as documented here it is possible to address specific instances when using manual or basic scaling with modules. Instances are anonymous when using automatic scaling and thus cannot be addressed. It seems this feature is only documented and available for app engine modules.

What is a Google App Engine instance?

I am trying to estimate the monthly costs for having GAE for in-app store and I do not really understand what is an instance and what can I do within one instance.
Can I just have one instance with multiple threads to deal with multiple clients? And as I have 28 hours of free instance per app per day (http://cloud.google.com/pricing/), does it mean that I would not pay for my server app running all the time?
An instance is an instance of a virtual server, running your code, that is able to serve requests to clients. This is usually done in parallel (Goroutines, Java threads, Python threads with 2.7) for most efficient usage of available resources.
Response times depends on what you're doing in your code, and it's usually IO dependent. If you have a waterfall of serial database lookups, it takes longer than if you only have a single multiget and perhaps an async write.
Part of the deal with GAE is that Google handles the elasticity for you. If there are a lot of connections waiting, new instances will start as needed (until your quota is exhausted). That means it can be difficult to estimate cost upfront, because you don't know exactly how efficient your code is and how much resources you'll need. I recommend a scheme where more usage means more income, and income per request is higher than cost per request. :)
You can tweak settings, saying you want requests to wait in queue, or always have a couple of spare instances ready to serve new requests, which will affect cost for you and response times for users.
In an IaaS scenario you could say that you will use five instances and that's the cost, but in reality you might need only 1 at night local time, and 25 the rest of the day, which means your users would most likely see dropped connections or otherwise have a negative user experience.
A free instance is normally able to handle test traffic during development without exhausting the quota.
Well AppEngine may decide you need to have more than one instance running to handle the requests and so will start another one. You won't be able to limit it to one running instance. In fact, it's sometimes unclear why AE starts another instance when it seems like the requests are low, but it will if it decides it needs another warm instance to be ready to handle requests if the serving instance(s) are too near their limit.

GAE Go - "This request caused a new process to be started for your application..."

I've encountered this problem for a second time now, and I'm wondering if there is any solution to this. I'm running an application on Google App Engine that relies on frequent communication with a website through HTTP JSON RPC. It appears that GAE has a tendency to randomly display a message like this in the logs:
"This request caused a new process to be started for your application,
and thus caused your application code to be loaded for the first time.
This request may thus take longer and use more CPU than a typical
request for your application."
And reset all variables stored in RAM without warning. The same process happens over and over no matter how many times I set the variables again or upload newer code to GAE, although incrementing the app version number seems to solve the problem.
How can I get more information on this behaviour, how to avoid it and prevent data loss of my Golang applications on Google App Engine?
EDIT:
The variables stored in RAM are small classes of strings, bytes, bools and pointers. Nothing too complicated or big.
Google App Engine seems to "start a new process" in matter of seconds of heavier use, which shouldn't be long enough time for the application to be shut down for not being used. The timespan between application being uploaded to GAE, having its variable set and a new process being created is less than a minute.
Do you realize that GAE is a cloud hosting solution that automatically manages instances based on the load? This is it's main feature and reason people are using it.
When load increases, GAE creates a new instance, which , of course, has all RAM variables empty.
The solution is not to expect variables to be available or store them to permanent storage at the end of request (session, memcache, datastore) and load them if not present at the beginnig of request.
You can read about GAE instances in their documentation here, check out the performance section:
http://code.google.com/appengine/kb/java.html
In your case of having small data available, if its static then you can load it into memory on startup of a new instance. If it's dynamic data, you should be saving it to the database using their api.
My recommendation for keeping a GAE instance alive, either pay for the Always-On service or follow my recommendations for using a cron here:
http://rwyland.blogspot.com/2012/02/keeping-google-app-engine-gae-instances.html
I use what I call a "prime schedule" of a 3, 7, 11 minute cron job.
You should consider using Backends if you want long running instances with resident memory.

how many users in a GAE instance?

I'm using the Python 2.5 runtime on Google App Engine. Needless to say I'm a bit worried about the new costs so I want to get a better idea of what kind of traffic volume I will experience.
If 10 users simultaneously access my application at myapplication.appspot.com, will that spawn 10 instances?
If no, how many users in an instance? Is it even measured that way?
I've already looked at http://code.google.com/appengine/docs/adminconsole/instances.html but I just wanted to make sure that my interpretation is correct.
"Users" is a fairly meaningless term from an HTTP point of view. What's important is how many requests you can serve in a given time interval. This depends primarily on how long your app takes to serve a given request. Obviously, if it takes 200 milliseconds for you to serve a request, then one instance can serve at most 5 requests per second.
When a request is handled by App Engine, it is added to a queue. Any time an instance is available to do work, it takes the oldest item from the queue and serves that request. If the time that a request has been waiting in the queue ('pending latency') is more than the threshold you set in your admin console, the scheduler will start up another instance and start sending requests to it.
This is grossly simplified, obviously, but gives you a broad idea how the scheduler works.
First, no.
An instance per user is unreasonable and doesn't happen.
So you're asking how does my app scale to more instances? Depends on the load.
If you have much much requests per second then you'll get (automatically) another instance so the load is distributed.
That's the core idea behind App Engine.

Resources