Turn off AppEngine (Java) sessions for certain requests - google-app-engine

We're using sessions in our GAE/J application. Over the weekend, we had a large spike in our datastore writes that appears to have been caused by a large number of _ah_SESSION entities being created (about 100-200 every minute). Near as we can tell, there was a rogue task queue creating them because they stopped when we purged the queue. The task was part of a mapper process we run hourly.
We don't need sessions in that hourly mapper (or indeed in any of our task queues or cron jobs or many other requests). Is there a way to disable creating a session for selected URLs?

Unfortunately that can not be done.
This is particularly nasty when you have a non-browser clients (devices via REST or mapreduce jobs) where every request generates a new _ah_SESSION entity in the database.
The only way to avoid this is to write your own session handler: e.g. a servlet filter that sets/checks cookies and set it so that it ignores certain paths.
EDIT:
I just realized that there could be another way: make sure your client (mapreduce job) sets a dummy cookie with a proper name. GAE uses cookies named ACSID in production and dev_appserver_login on dev server. Just use always the same cookie value, so all requests will be treated as one user/session.
There will still be overhead of looking-up/saving session objects, but at least it will not create countless _ah_SESSION entities.

Related

Programatically listing and sending requests to dynamic App Engine instances

I want to send a particular HTTP request (or otherwise communicate a message) to every (dynamic/autoscaled) instance which is currently running for a particular App Engine application.
My goal is to trigger each instance to discard some locally cached data (because I have just modified the underlying data and want them to reload it).
One possible solution is to store a value in Memcache, and have instances check this each time they handle a request to see if they should flush their cache. But this adds latency to every request.
Another possible solution would be to somehow stop all running instances. No fixed overhead, but some impact while instances are restarted.
An even less desirable solution would be to redeploy the application code in order to cause all instances to be stopped. This now adds additional delay on my end as a deployment takes some time.
You could use the management API to list instances for a given version, but I'd suggest that you'd probably want to use something like the PubSub API to create a subscription on each of your App Engine instances. Since each instance has its own subscription, any messages sent to the monitored queue will be received by all instances.
You can create the subscription at startup (the /_ah/start endpoint may be useful), and then delete it at shutdown (using the /_ah/stop endpoint).

Using 1 intance of google-app-engine to monitor external service

I planning to create a NodeJS program, that work 24/7, that ping and make requests to an external server (outside of google cloud) every minute. Just to see that it the external services are are live.
If there is any error it will notify me by SMS & Email.
I don't need any front-end for this app, and no one needs to connect to it. Just simple NodeJS program.
The monitoring and configuration will be by texts files.
Now the questions:
It looks like it will cost me just $1.64. It sounds very cheap. Am I missing something?
It needs to work around the clock, I will request it to start it once, and it need to continue working, (by using setInterval). Is it will be aborted?
What it is exactly mean buy 1 instance. What an instance can do? Only respond to one request or what?
I tried to search in Google: appengine timeout, but didn't found anything that helps.
Free Quota
If you write your application in Python, PHP, Go or Java it can fit in free usage quota:
https://cloud.google.com/appengine/docs/quotas
So there will be absolutely no costs to run it on Google App Engine platform.
There are limit of 657,000 UrlFetch API Calls per day (more than 450 calls per minute in 24/7 mode) for free apps. 4GB traffic may also be sufficient for this kind of work.
Keep in mind there is no SMS sending services provided by Google App Engine and you will need to spend additional UrlFetch API calls to use external SMS services.
Email sending is also limited to 100 Emails per day (or 5000 Emails to admin address), so try not so send repeated notifications about same monitored server every minute, or you'll deplete your Email quote in 1.5 hours.
Scheduled Tasks
There is no way to run single process indefinitely without interruption on App Engine. But you don't have to!
You'll need to encapsulate all the work you're planning to execute in every iteration into single task and then schedule it to run every minute with Cron. See this documentation for Python: https://cloud.google.com/appengine/docs/python/config/cron
It is recommended to have some configuration page where you can set some internal configuration or see monitoring statistics, at least manage flag to temporarily pause tasks execution without redeploying your app.

How to fan out URL Fetch requests in a timely fashion?

Every minute or so my app creates some data and needs to send it out to more than 1000 remote servers via URL Fetch callbacks. The callback URL for each server is stored on separate entities. The time lag between creating the data and sending it to the remote servers should be roughly less than 5 seconds.
My initial thought is to use the Pipeline API to fan out URL Fetch requests to different task queues.
Unfortunately task queues are not guaranteed to be executed in a timely fashion. Therefore from requesting a task queue start to it actually executing could take minutes to hours. From previous experience this gap is regularly over a minute so is not necessarily appropriate.
Is there any way from within App Engine to achieve what I want? Maybe you know of an outside service that can do the fan out in a timely fashion?
Well, there's probably no good solution for the gae here.
You could keep a backend running; hammering the datastore/memcache
every second for new data to send out, and then spawn dozens of async url-fetches.
But thats really inefficient...
If you want a 3rd party service, pubnub.com is capable of doing fan-out, however i don't know if it could fit in your setup.
How about using the async API? You could then do a large number of simultaneous URL calls, all from a single location.
If the performance is particularly sensitive, you could do them from a backend and use a B8 instance.

Google App Engine - Cron or Task Queue?

I'm building a simple "play against a random opponent" back-end using Goole App Engine. So far I'm adding each user that wants to play into a "table" in the Datastore. As soon as there are more than 1 player in the Datastore, I can start to match them.
The Schedule Tasks with Cron looked promising for this work until I saw that the lowest resolution seems to be minutely. If there are plenty of players signing up I want them to be matched quickly and not have to wait a whole minute (worst case).
I thought about having the servlet that recives the "play against random opponent" request POST to a Task Queue that would do the match making, but I think this will lead to a lot of contention when reading from the Datastore and deleting the enteties from the "random" table after they have been matched?
Basically I want a single worker that will do the matching, and I want to signal this worker from time to time that now is a good time to try to match opponents.
Any suggestions on what would be the right course of action here?
You can guarantee exclusive access via transactions:
Receive a request to play via REST. Check (within a transaction) if there is any request in database.
If there is, notify both users to start the play and delete request (transactionaly) from database.
If there isn't, add it to the database and wait for the next request.
Update:
Alternativelly you can achieve what you need via pull queue. Same scenario as above, just instead of datastore you'd check if there is a task in the pull queue, retrieve if there is or create a new one if there isn't one.

How do dynamic backends start in Google App Engine

Can we start a dynamic backend programatically? mean while when a backend is starting how can i handle the request by falling back on the application(i mean app.appspot.com).
When i stop a backend manually in admin console, and send a request to it, its not starting "dynamically"
Dynamic backends come into existence when they receive a request, and
are turned down when idle; they are ideal for work that is
intermittent or driven by user activity.
Resident backends run continuously, allowing you to rely on the state
of their memory over time and perform complex initialization.
http://code.google.com/appengine/docs/python/backends/overview.html
I recently started executing a long running task on a dynamic backend and noticed a dramatic increase in the performance of the frontends. I assume this was because the long running task was competing for resources with normal user requests.
Backends are documented quite thoroughly here. Backends have to be started and stopped with appcfg or the admin console, as documented here. A stopped backend will not handle requests - if you want this, you should probably be using the Task Queue instead.
It appears that a dynamic backend need not be explicitly stopped. The overvicew (http://code.google.com/appengine/docs/python/backends/overview.html) states that the billing for a dynamic backend stops 15 minutes after the last request is processed. So, if your app has a cron job, for example, that requires 5 minutes to complete, and needs to run every hour, then you could configure a backend to do this. The cost you'll incur is 15+5 minutes every hour, or 8 hours for the whole day. I suppose the free quota allows you 9 backend hours. So, this type of scenario would be free for you. The backend will start when you send your first request to it through a queue, and will stop 15 minutes after the last request you send is processed completely.

Resources